[gpfsug-discuss] A GPFS newbie

Jonathan Buzzard j.buzzard at dundee.ac.uk
Tue Aug 7 12:56:14 BST 2012


On 07/08/12 12:09, Robert Esnouf wrote:
>
> Dear GPFS users,
>
> Please excuse what is possibly a naive question from a not-yet
> GPFS admin. We are seriously considering GPFS to provide
> storage for our compute clusters. We are probably looking at
> about 600-900TB served into 2000+ Linux cores over InfiniBand.
> DDN SFA10K and SFA12K seem like good fits. Our domain-specific
> need is high I/O rates from multiple readers (100-1000) all
> accessing parts of the same set of 1000-5000 large files
> (typically 30GB BAM files, for those in the know). We could
> easily sustain read rates of 5-10GB/s or more if the system
> would cope.
>
> My question is how should we go about configuring the number
> and specifications of the NSDs? Are there any good rules of
> thumb? And are there any folk out there using GPFS for high
> I/O rates like this in a similar setup who would be happy to
> have their brains/experiences picked?
>

I would guess the biggest question is: how sequential is the workload?

Also, how many cores per box, i.e. how many cores per storage interface card?

The next question would be how much of your data is "old cruft", that is,
files which have not been used in a long time but are not going to be
deleted because they might be useful one day. If this is a reasonably high
proportion then tiering/ILM is a worthwhile strategy to follow.
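For what it is worth, a minimal sketch of what such a policy might look
like, assuming two storage pools which I have called 'fast' and 'nearline'
(both names invented for illustration), with the 180-day cut-off and the
threshold values obviously needing tuning to the real workload:

    /* new files land in the fast pool */
    RULE 'placement' SET POOL 'fast'

    /* move files not accessed for six months to the nearline pool,
       kicking in when the fast pool is 80% full and stopping at 70% */
    RULE 'old_cruft' MIGRATE FROM POOL 'fast'
         THRESHOLD(80,70)
         TO POOL 'nearline'
         WHERE (CURRENT_TIMESTAMP - ACCESS_TIME) > INTERVAL '180' DAYS

You would install the placement rule with mmchpolicy and run the
migration periodically (e.g. from cron) with mmapplypolicy.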

Of course, if you can afford to buy all your data disks as 600GB 3.5"
15k RPM drives then that is the way to go.

Using SSDs for your metadata disks is, I would say, a must. How much SSD
capacity you need depends on how many files you have.
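As a rough illustration (assuming the GPFS 3.5 NSD stanza format, and
with device, server and NSD names made up for the example), the SSD LUNs
just get flagged as metadataOnly in the system pool when you create the
NSDs:

    %nsd:
      device=/dev/mapper/ssd_lun01
      nsd=meta_nsd01
      servers=nsdserver01,nsdserver02
      usage=metadataOnly
      failureGroup=10
      pool=system

Feed that to mmcrnsd -F and then mmcrfs; the spinning-disk NSDs get
usage=dataOnly and go into their own pool(s).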

More detailed answers would require more information.


JAB.

--
Jonathan A. Buzzard             Tel: +441382-386998
Storage Administrator, College of Life Sciences
University of Dundee, DD1 5EH

The University of Dundee is a registered Scottish Charity, No: SC015096
