[gpfsug-discuss] GPFS and Flash/SSD Storage tiered storage

IBM Spectrum Scale scale at us.ibm.com
Thu Feb 22 21:52:01 GMT 2018


My apologies for not being clearer about the flash storage pool.  I meant 
that this would simply be another GPFS storage pool in the same cluster, so 
no separate AFM cache cluster.  You would then use the file heat feature 
to ensure that the more frequently accessed files are migrated to that 
all-flash storage pool.
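
For reference, a minimal sketch of what that could look like, assuming pools 
named 'nlsas' and 'flash' (the pool names, thresholds, and heat settings below 
are placeholders, not recommendations).  File heat is off by default and is 
enabled with:

   mmchconfig fileHeatPeriodMinutes=1440,fileHeatLossPercent=10

A policy file (for example /tmp/heat.pol) along these lines would then promote 
hot files and demote cold ones:

   RULE 'hotToFlash' MIGRATE FROM POOL 'nlsas' WEIGHT(FILE_HEAT) TO POOL 'flash' LIMIT(90)
   RULE 'coldToNlsas' MIGRATE FROM POOL 'flash' THRESHOLD(90,70) WEIGHT(CURRENT_TIMESTAMP - ACCESS_TIME) TO POOL 'nlsas'

and would be applied with mmapplypolicy, typically on a schedule (use -I test 
for a dry run before -I yes):

   mmapplypolicy <filesystem> -P /tmp/heat.pol -I yes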

As for LROC, could you please clarify what you mean by a few headers/stubs 
of the file?  Reading the LROC documentation and the LROC variables 
available in the mmchconfig command, I think you might want to take a look 
at the lrocDataStubFileSize variable, since it seems to apply to your 
situation.
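
As a rough illustration only (the node class and values below are placeholders, 
this assumes an SSD/NVMe NSD with usage=localCache is already defined on each 
client node, and the mmchconfig documentation should be checked for the exact 
semantics and units), the relevant settings would be along these lines:

   mmchconfig lrocData=yes,lrocDataMaxFileSize=32768,lrocDataStubFileSize=4096 -N <clientNodes>
   mmchconfig lrocInodes=yes,lrocDirectories=yes -N <clientNodes>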

Regards, The Spectrum Scale (GPFS) team

------------------------------------------------------------------------------------------------------------------
If you feel that your question can benefit other users of Spectrum Scale 
(GPFS), then please post it to the public IBM developerWorks Forum at 
https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479
. 

If your query concerns a potential software error in Spectrum Scale (GPFS) 
and you have an IBM software maintenance contract please contact 
1-800-237-5511 in the United States or your local IBM Service Center in 
other countries. 

The forum is informally monitored as time permits and should not be used 
for priority messages to the Spectrum Scale (GPFS) team.



From:   valleru at cbio.mskcc.org
To:     gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Cc:     gpfsug-discuss-bounces at spectrumscale.org
Date:   02/22/2018 04:21 PM
Subject:        Re: [gpfsug-discuss] GPFS and Flash/SSD Storage tiered 
storage
Sent by:        gpfsug-discuss-bounces at spectrumscale.org



Thank you. 

I am sorry if I was not clear, but the metadata pool is all on SSDs in the 
GPFS clusters that we use. It is just the data pool that is on near-line 
rotating disks.
I understand that AFM might not be able to solve the issue, and I will try 
and see if file heat works for migrating the files to the flash tier.
You mentioned an all-flash storage pool for heavily used files - do you 
mean a different GPFS cluster with only flash storage, and manually 
copying the files to flash storage whenever needed?
The IO performance I am talking about is predominantly for reads. Are you 
saying that LROC can work the way I want it to, i.e. prefetch entire files 
into the LROC cache after only a few headers/stubs of data are read from 
those files?
I thought LROC only keeps the blocks of data that are fetched from the 
disk, and will not prefetch the whole file if only a stub of data is read.
Please do let me know if I understood it wrong.
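
One way I can check this on my side is to watch the per-node LROC statistics 
before and after reading a few headers, on a client node that has an LROC 
device:

   mmdiag --lroc
   mmlsconfig | grep -i lroc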

On Feb 22, 2018, 4:08 PM -0500, IBM Spectrum Scale <scale at us.ibm.com>, 
wrote:
I do not think AFM is intended to solve the problem you are trying to 
solve.  If I understand your scenario correctly, you state that you are 
placing metadata on NL-SAS storage.  If that is true, that would not be 
wise, especially if you are going to do many metadata operations.  I 
suspect your performance issues are partially due to the fact that 
metadata is being stored on NL-SAS storage.  You stated that you did not 
think the file heat feature would do what you intended, but have you tried 
it to see if it could solve your problem?  I would think having 
metadata on SSD/flash storage combined with an all-flash storage pool for 
your heavily used files would perform well.  If you expect IO usage to 
be such that there will be far more reads than writes, then LROC should be 
beneficial to your overall performance.
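
If you want to double-check where metadata currently lives, the following 
read-only commands (with <filesystem> as a placeholder) should show it:

   mmlsdisk <filesystem> -L     shows, per NSD, whether it holds metadata and/or data, and its storage pool
   mmdf <filesystem> -m         shows capacity and usage for the disks that can hold metadata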

Regards, The Spectrum Scale (GPFS) team

------------------------------------------------------------------------------------------------------------------
If you feel that your question can benefit other users of Spectrum Scale 
(GPFS), then please post it to the public IBM developerWorks Forum at 
https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479
.

If your query concerns a potential software error in Spectrum Scale (GPFS) 
and you have an IBM software maintenance contract please contact 
1-800-237-5511 in the United States or your local IBM Service Center in 
other countries.

The forum is informally monitored as time permits and should not be used 
for priority messages to the Spectrum Scale (GPFS) team.



From:        valleru at cbio.mskcc.org
To:        gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Date:        02/22/2018 03:11 PM
Subject:        [gpfsug-discuss] GPFS and Flash/SSD Storage tiered storage
Sent by:        gpfsug-discuss-bounces at spectrumscale.org



Hi All,

I am trying to figure out a GPFS tiering architecture with flash storage 
at the front end and near-line storage as the backend, for supercomputing.

The backend storage will be a GPFS file system on near-line disks of about 
8-10PB. The backend storage will/can be tuned to give large streaming 
bandwidth, with enough metadata disks to make a stat of all these files 
fast enough.

I was wondering if it would be possible to use a GPFS flash cluster or GPFS 
SSD cluster at the front end that uses AFM and acts as a cache cluster in 
front of the backend GPFS cluster.
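
The rough shape of what I have in mind is a read-only AFM fileset on the flash 
cluster, backed by the near-line cluster, something like the following 
(assuming the near-line file system is remote-mounted on the flash cluster; 
the file system, fileset and target names are placeholders):

   mmcrfileset flashfs hotcache --inode-space new -p afmTarget=gpfs:///gpfs/nlsasfs,afmMode=ro
   mmlinkfileset flashfs hotcache -J /gpfs/flashfs/hotcache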

At the end of this, the workflow that I am targeting is the following:


“
If the compute nodes read the headers of thousands of large files ranging from 
100MB to 1GB, the AFM cluster should be able to bring up enough threads to 
bring all of those files from the backend to the faster SSD/flash GPFS 
cluster.
The working set might be about 100T at a time, which I want to be on the 
faster/low-latency tier, with the rest of the files staying on the slower tier 
until they are read by the compute nodes.
“


The reason I do not want to use GPFS policies to achieve the above is that I 
am not sure whether policies can be written in such a way that files are moved 
from the slower tier to the faster tier depending on how the jobs interact 
with the files.
I know that policies can be written based on heat and size/format, but I 
don’t think these policies work in the way described above.

I did try the above architecture, where an SSD GPFS cluster acts as an AFM 
cache cluster in front of the near-line storage. However, the AFM cluster was 
really slow; it took about a few hours to copy the files from the near-line 
storage to the AFM cache cluster.
I am not sure if AFM is simply not designed to work this way, or if AFM is 
just not tuned to work as fast as it should.
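
If it is a tuning issue, I assume it would involve the AFM read-side 
parameters on the cache cluster, something along these lines (the values here 
are placeholders only):

   mmchconfig afmNumReadThreads=8
   mmchconfig afmParallelReadThreshold=1024
   mmchconfig afmParallelReadChunkSize=134217728

but I have not found guidance on what values make sense for this kind of 
workload.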

I have tried LROC too, but it does not behave the same way as I guess AFM 
works.

Has anyone tried, or does anyone know, if GPFS supports an architecture where 
the fast tier can bring up thousands of threads and copy the files almost 
instantly/asynchronously from the slow tier, whenever the jobs on the 
compute nodes read a few blocks from those files?
I understand that, with respect to hardware, the AFM cluster should be 
really fast, as well as the network between the AFM cluster and the 
backend cluster.
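
The closest thing I can see to this is handing AFM an explicit list of files 
to pre-stage before a job starts, roughly like the following (file system, 
fileset and path names are placeholders):

   find /gpfs/flashfs/hotcache/jobdata -type f > /tmp/job.files
   mmafmctl flashfs prefetch -j hotcache --list-file /tmp/job.files
   mmafmctl flashfs getstate -j hotcache

but what I am really after is having this happen automatically when a job 
reads the first few blocks of a file.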

Please do also let me know if the above workflow can be done using GPFS 
policies and be as fast as it needs to be.

Regards,
Lohit

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss