[gpfsug-discuss] SSDs for data - DWPD?

Nathan Harper nathan.harper at cfms.org.uk
Tue Mar 19 13:36:26 GMT 2019


It has been interesting to watch the evolution of the same discussion over
on the Ceph Users mailing list over the last few years.   Obviously GPFS
and Ceph are used differently, so the comparison isn't direct, but the
attitudes have generally shifted from recommending only high DWPD drives to
the lower (or sometimes even lowest) tiers.

The reasoning tends to be that you will often write less data than you
think, and that drives often last longer than their rating.   We have an
all-SSD (Samsung SM863a) Ceph cluster backing an OpenStack system that has
been in production for ~12 months, and the drives are all reporting 97%+
endurance remaining.  It's not the busiest of storage backends, but Ceph is
a notorious write amplifier, and I'm more than happy with the endurance
we're seeing so far.
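
For anyone who likes numbers, a quick linear extrapolation makes the point.
This is only a rough sketch: the 3% wear is just the flip side of the 97%+
remaining figure above, and wear is rarely perfectly linear:

# Rough endurance projection from reported wear -- a sketch, not a tool.
# Assumes ~3% of rated endurance used after 12 months (i.e. the 97%+
# remaining above); take the real value from the drive's SMART wear data.

def projected_lifetime_years(wear_used_pct: float, months_in_service: float) -> float:
    """Linear extrapolation: years until 100% of rated endurance is consumed."""
    if wear_used_pct <= 0:
        return float("inf")
    wear_per_month = wear_used_pct / months_in_service
    return (100.0 / wear_per_month) / 12.0

# 3% used after 12 months -> ~33 years to exhaust the rated endurance.
print(f"{projected_lifetime_years(3.0, 12.0):.0f} years at the current write rate")

Even if that workload doubled we would still be looking at well over a
decade of rated endurance on paper.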

On Tue, 19 Mar 2019 at 12:10, Jonathan Buzzard <
jonathan.buzzard at strath.ac.uk> wrote:

> On Mon, 2019-03-18 at 19:09 +0000, Buterbaugh, Kevin L wrote:
>
> [SNIP]
>
> >
> > 12 * 7 = 84 TB.  So if you had somewhere between 125 - 150 TB of SSDs
> > ... 1 DWPD SSDs … then in theory you should easily be able to handle
> > your anticipated workload without coming close to exceeding the 1
> > DWPD rating of the SSDs.
> >
> > However, as the saying goes, while in theory there’s no difference
> > between theory and practice, in practice there is ... so am I
> > overlooking anything here from a GPFS perspective???
> >
> > If anybody still wants to respond on the DWPD rating of the SSDs they
> > use for data, I’m still listening.
>
> I would be wary of write amplification in RAID coming back to bite you in
> the ass. Just because you write 1TB of data to the file system does not
> mean the drives write 1TB of data; it could be 2TB.
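
To put rough numbers on that, here is a minimal sketch using only the
figures quoted above (12 TB/day of writes against 125-150 TB of 1 DWPD
SSDs), treating the 2x purely as an illustrative write-amplification
factor:

# Required DWPD for a pool, including a RAID/filesystem write-amplification
# factor. The 12 TB/day and 125-150 TB figures are the ones quoted above;
# the 2x amplification is only the illustrative worst case from this thread.

def required_dwpd(host_writes_tb_per_day: float, pool_capacity_tb: float,
                  write_amplification: float = 1.0) -> float:
    """Full-drive writes per day needed across the whole pool."""
    return host_writes_tb_per_day * write_amplification / pool_capacity_tb

for capacity in (125, 150):
    for waf in (1.0, 2.0):
        print(f"{capacity} TB pool, WAF {waf:.0f}x: "
              f"{required_dwpd(12, capacity, waf):.2f} DWPD needed")

Even at a pessimistic 2x amplification that comes out well under the 1 DWPD
rating for the capacities quoted, though the margin is obviously
workload-dependent.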
>
> If you can, I would look at the data actually written to the drives:
> using smartctl if you are on a DSS or ESS, or something similar if they
> are behind a conventional storage array.
>
> So, for example, picking a random data drive on my DSS-G (an 8TB NL-SAS
> drive, for the record), smartctl -a shows the following:
>
> Error counter log:
>            Errors Corrected by           Total   Correction     Gigabytes    Total
>                ECC          rereads/    errors   algorithm      processed    uncorrected
>            fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
> read:   682040270        0         0  682040270          0     116208.442           0
> write:          0        0         0          0          0      34680.694           0
>
>
> Looking at the gigabytes processed shows that roughly 35TB has been
> written to the drive. These are lifetime figures for the drive, so there
> is no under-reporting or estimation going on.
>
> If you can get these figures back, you can calculate what drive-write
> rating you actually need, because they already include the RAID write
> amplification.
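
To make that concrete, a minimal sketch: the 34,680 GB written and the 8TB
capacity come from the smartctl output above, but the three years in
service is purely a hypothetical placeholder -- substitute the drive's real
power-on time or install date:

# Turn smartctl's lifetime "Gigabytes processed (write)" into an observed
# DWPD figure. 34680.694 GB and 8 TB capacity come from the output above;
# the 3 years in service is a hypothetical placeholder.

def observed_dwpd(gb_written: float, capacity_tb: float, days_in_service: float) -> float:
    """Average full-drive writes per day over the drive's service life."""
    tb_written = gb_written * 1e9 / 1e12      # 10^9-byte units -> TB
    return tb_written / capacity_tb / days_in_service

print(f"{observed_dwpd(34680.694, 8.0, 3 * 365):.4f} DWPD")  # ~0.004 in this example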
>
> JAB.
>
> --
> Jonathan A. Buzzard                         Tel: +44141-5483420
> HPC System Administrator, ARCHIE-WeSt.
> University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
>
>
>