[gpfsug-discuss] RAID config for SSD's - potential pitfalls

Kumaran Rajaram kums at us.ibm.com
Wed Apr 19 23:03:33 BST 2017


Hi,

>> As I've mentioned before, RAID choices for GPFS are not so simple. Here
>> are a couple points to consider, I'm sure there's more. And if I'm
>> wrong, someone will please correct me - but I believe the two biggest
>> pitfalls are:

>> Some RAID configurations (classically 5 and 6) work best with large,
>> full block writes. When the file system does a partial block write, RAID
>> may have to read a full "stripe" from several devices, compute the
>> differences and then write back the modified data to several devices.
>> This is certainly true with RAID that is configured over several storage
>> devices, with error correcting codes. SO, you do NOT want to put GPFS
>> metadata (system pool!) on RAID configured with large stripes and error
>> correction. This is the Read-Modify-Write RAID pitfall.

As you pointed out, RAID choices for GPFS are not simple. We need to
consider the storage subsystem's configuration and capabilities, for
example whether all drives are homogeneous or there is a mix of drive
types. If all the drives are homogeneous, create dataAndMetadata NSDs
across RAID-6. If the storage controller supports write cache with
write-cache mirroring (WC + WM), enable it: WC + WM can alleviate
read-modify-write for small writes, which are typical of metadata. If
there is a mix of SSDs and HDDs (e.g. 15K RPM), compare the aggregate
IOPS of the RAID-1 SSD volumes against that of the RAID-6 HDD volumes
before separating data and metadata onto different media. For example, if
the storage subsystem has 2 x SSDs and ~300 x 15K RPM or NL-SAS HDDs, then
the aggregate IOPS of the RAID-6 HDD volumes will most likely be higher
than that of the RAID-1 SSD volumes. It is also advisable to assess I/O
performance on the different configurations (dataAndMetadata vs.
dataOnly/metadataOnly NSDs) with some application workloads and production
scenarios before deploying the final solution.
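
The aggregate-IOPS comparison can be roughed out on paper before any
benchmarking. The sketch below is only a back-of-the-envelope model: it uses
the textbook small-write penalties (2 back-end I/Os for RAID-1, 4 for RAID-5,
6 for RAID-6), ignores controller write cache entirely, and the per-drive
IOPS figures and the function name are placeholders I made up for
illustration, not measurements of any real device.

```python
# Textbook back-end I/Os generated by one small (sub-stripe) host write:
#   RAID-1: write both mirrors
#   RAID-5: read old data + parity, write new data + parity
#   RAID-6: read old data + P + Q, write new data + P + Q
WRITE_PENALTY = {"raid1": 2, "raid5": 4, "raid6": 6}


def effective_iops(drives, iops_per_drive, raid_level, write_fraction):
    """Estimate host-visible small random IOPS for a group of drives.

    Each read costs one back-end I/O and each write costs WRITE_PENALTY
    back-end I/Os. Controller write cache is NOT modelled.
    """
    raw = drives * iops_per_drive
    read_fraction = 1.0 - write_fraction
    cost_per_host_io = read_fraction + write_fraction * WRITE_PENALTY[raid_level]
    return raw / cost_per_host_io


# Placeholder per-drive figures (assumptions): substitute vendor or measured data.
for write_fraction in (0.0, 0.5, 1.0):
    ssd = effective_iops(2, 20_000, "raid1", write_fraction)
    hdd = effective_iops(300, 200, "raid6", write_fraction)
    print(f"writes={write_fraction:.0%}  RAID-1 SSD={ssd:.0f}  RAID-6 HDD={hdd:.0f}")
```

Which side comes out ahead depends entirely on the per-drive figures and the
read/write mix you plug in, so treat this as a way to frame the question and
still measure with a real workload.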

>> GPFS has built-in replication features - consider using those instead
>> of RAID replication (classically RAID-1). GPFS replication can work with
>> storage devices that are in different racks, separated by significant
>> physical space, and from different manufacturers. This can be more
>> robust than RAID in a single box or single rack. Consider a fire
>> scenario, or exploding power supply or similar physical disaster. Consider
>> that storage devices and controllers from the same manufacturer may have
>> the same bugs, defects, failures.

For high resiliency (e.g. for metadataOnly NSDs), if there are multiple
storage systems across different failure domains (different racks, rooms,
data centers, etc.), it is a good idea to enable BOTH hardware RAID-1 and
GPFS metadata replication (at minimum, -m 2).
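
GPFS places the replicas of a block in different failure groups, so -m 2
only helps if the metadata NSDs actually span at least two failure groups.
A minimal sanity check of a planned layout might look like the sketch below;
the NSD names, failure-group numbers and the function name are hypothetical,
chosen only for illustration.

```python
def failure_groups_ok(nsd_failure_groups, replicas):
    """Return True if the NSDs span at least `replicas` distinct failure groups.

    nsd_failure_groups maps an NSD name to its failure-group number.
    """
    return len(set(nsd_failure_groups.values())) >= replicas


# Hypothetical layout: two metadata NSDs in rack A, one in rack B.
layout = {"md_rackA_1": 1, "md_rackA_2": 1, "md_rackB_1": 2}

print(failure_groups_ok(layout, replicas=2))  # two distinct groups: 1 and 2
print(failure_groups_ok(layout, replicas=3))  # only two groups available
```

This checks only the count of failure groups; whether each group maps to a
genuinely independent rack, room or data center is something only the
administrator can confirm.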

If there is a single shared storage system for the GPFS file system and
metadata is separated from data, then RAID-1 would minimize administrative
overhead compared to GPFS replication in the event of a drive failure,
since GPFS replication across single SSDs would require
mmdeldisk/mmdelnsd/mmcrnsd/mmadddisk every time a disk goes faulty and
needs to be replaced.

Best,
-Kums






From:   Marc A Kaplan/Watson/IBM at IBMUS
To:     gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Date:   04/19/2017 04:50 PM
Subject:        Re: [gpfsug-discuss] RAID config for SSD's - potential 
pitfalls
Sent by:        gpfsug-discuss-bounces at spectrumscale.org



As I've mentioned before, RAID choices for GPFS are not so simple.    Here 
are  a couple points to consider, I'm sure there's more.  And if I'm 
wrong, someone will please correct me - but I believe the two biggest 
pitfalls are:
Some RAID configurations (classically 5 and 6) work best with large, full
block writes. When the file system does a partial block write, RAID may
have to read a full "stripe" from several devices, compute the differences
and then write back the modified data to several devices. This is
certainly true with RAID that is configured over several storage devices,
with error correcting codes. SO, you do NOT want to put GPFS metadata
(system pool!) on RAID configured with large stripes and error correction.
This is the Read-Modify-Write RAID pitfall.

GPFS has built-in replication features - consider using those instead of
RAID replication (classically RAID-1). GPFS replication can work with
storage devices that are in different racks, separated by significant
physical space, and from different manufacturers. This can be more robust
than RAID in a single box or single rack. Consider a fire scenario, or
exploding power supply or similar physical disaster. Consider that
storage devices and controllers from the same manufacturer may have the
same bugs, defects, failures.



