[gpfsug-discuss] system.log pool on client nodes for HAWC

Sven Oehme oehmes at gmail.com
Mon Sep 3 16:32:11 BST 2018


Hi Ken,

What the documentation is saying (or trying to) is that the behavior of
data-in-inode and metadata operations is not changed when HAWC is enabled.
That means if the data fits into the inode, it is placed there directly
instead of being written as a data record into the recovery log (which is
what HAWC uses) and destaged later to wherever the data blocks of a given
file will eventually be written. It also means that if all your application
does is create small files that fit into the inode, HAWC will not be able to
improve performance.
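
To make that concrete, here is a minimal, purely illustrative Python sketch
of the decision described above. This is not GPFS code; the inode size,
inode overhead and write-cache threshold below are assumptions (check the
real inode size of your filesystem with mmlsfs, and the HAWC threshold you
configured for it):

INODE_SIZE = 4096                   # assumption: check with 'mmlsfs <fs> -i'
INODE_OVERHEAD = 128                # assumption: bytes reserved for inode metadata
WRITE_CACHE_THRESHOLD = 64 * 1024   # assumption: example HAWC threshold

def write_path(write_size: int, file_size: int) -> str:
    """Which path a small synchronous write would take (illustrative only)."""
    if file_size <= INODE_SIZE - INODE_OVERHEAD:
        # Data-in-inode: stored with the inode itself, HAWC never sees it,
        # so HAWC cannot speed this case up.
        return "data-in-inode (HAWC bypassed)"
    if write_size <= WRITE_CACHE_THRESHOLD:
        # Small write hardened in the recovery log first, destaged later.
        return "recovery log record, destaged later (HAWC path)"
    # Large writes go straight to the data blocks.
    return "direct write to data blocks"

for size in (1_000, 16_000, 1_000_000):
    print(size, "->", write_path(size, size))

The exact numbers are placeholders; the point is only that the first branch
never reaches HAWC at all.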
It's unfortunately not so simple to say whether HAWC will help or not, but
here are a couple of thoughts on where it will and will not help.
Where it won't help:
1. your storage devices have a very large, or even better a log-structured,
write cache
2. the majority of your files are very small
3. your files are almost always accessed sequentially
4. your storage is primarily flash based
Where it most likely will help:
1. the majority of your storage is direct-attached HDD (e.g. FPO) with a small
SSD pool for metadata and HAWC
2. your ratio of clients to storage devices is very high (think hundreds of
clients and only one storage array)
3. your workload is primarily virtual machines or databases

As always there are lots of exceptions and corner cases, but this is the best
list I could come up with.
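
If it helps, those rules of thumb can also be written down as a tiny
checklist (purely illustrative Python; nothing here queries GPFS, all inputs
are things you have to answer about your own environment):

def hawc_likely_to_help(
    mostly_hdd: bool,            # mostly direct-attached HDD (e.g. FPO)?
    large_write_cache: bool,     # array has a large / log-structured write cache?
    mostly_tiny_files: bool,     # do most files fit into the inode?
    mostly_sequential: bool,     # is access mostly sequential?
    clients_per_array: int,      # rough ratio of clients to storage arrays
    vm_or_db_workload: bool,     # VMs / databases doing small synchronous writes?
) -> bool:
    """Rough reading of the rules of thumb above -- not a guarantee."""
    if large_write_cache or mostly_tiny_files or mostly_sequential or not mostly_hdd:
        return False
    return clients_per_array > 100 or vm_or_db_workload

# Example: hundreds of clients hammering one HDD array with database I/O.
print(hawc_likely_to_help(True, False, False, False, 300, True))   # True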

On how to find out if HAWC could help, there are two ways of doing this.
First, look at mmfsadm dump iocounters; it shows you the average size of the
I/Os, so you can check whether a lot of small write operations are being done.
A more involved but more accurate way would be to take a trace with trace
level trace=io. That generates a very lightweight trace of only the most
relevant I/O layers of GPFS, which you can then post-process to look at the
operations and their performance. The data is not the simplest to understand
for somebody with little filesystem background, but if you stare at it for a
while it might make some sense to you.
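
For the post-processing part, here is a rough sketch of what I mean, assuming
you have already extracted the individual write sizes from the iocounters
dump or the trace=io report (the exact output format differs between
releases, so the parsing itself is not shown; the 64 KiB threshold is just an
example):

from statistics import mean

def summarize_writes(write_sizes, small_threshold=64 * 1024):
    """Summarize write sizes pulled from 'mmfsadm dump iocounters' or a
    trace=io report (extraction not shown -- output format varies)."""
    if not write_sizes:
        return {}
    small = [s for s in write_sizes if s <= small_threshold]
    return {
        "writes": len(write_sizes),
        "avg_write_bytes": mean(write_sizes),
        "pct_small_writes": 100.0 * len(small) / len(write_sizes),
    }

# Hypothetical sample: mostly 4-16 KiB synchronous writes -> HAWC candidate.
sample = [4096] * 800 + [16384] * 150 + [1048576] * 50
print(summarize_writes(sample))

A workload where most writes land well under that threshold, on spinning
disk, is the kind of pattern HAWC is aimed at.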

Sven



On Mon, Sep 3, 2018 at 4:06 PM Kenneth Waegeman <kenneth.waegeman at ugent.be>
wrote:

> Thank you Vasily and Simon for the clarification!
>
> I was looking further into it, and I got stuck with more questions :)
>
>
> - In
> https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adv_hawc_tuning.htm
> I read:
>     HAWC does not change the following behaviors:
>         write behavior of small files when the data is placed in the inode
> itself
>         write behavior of directory blocks or other metadata
> I wondered why? Is the metadata not logged in the (same) recovery logs?
> (Reading
> https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.0/com.ibm.spectrum.scale.v4r2.ins.doc/bl1ins_logfile.htm
> it seems it is.)
>
>
> - Would there be a way to estimate how many of the write requests on a
> running cluster would benefit from enabling HAWC?
>
>
> Thanks again!
>
>
> Kenneth
>
>
> On 31/08/18 19:49, Vasily Tarasov wrote:
>
> That is correct. The blocks of each recovery log are striped across the
> devices in the system.log pool (if it is defined). As a result, even when
> all clients have a local device in the system.log pool, many writes to the
> recovery log will go to remote devices. For a client that lacks a local
> device in the system.log pool, log writes will always be remote.
>
> Note that typically in such a setup you would enable log replication
> for HA. Otherwise, if a single client fails (and its recovery log is lost),
> the whole cluster fails, as there is no log to recover the file system to a
> consistent state. Therefore, at least one remote write is essential.
>
> HTH,
> --
> Vasily Tarasov,
> Research Staff Member,
> Storage Systems Research,
> IBM Research - Almaden
>
>
>
> ----- Original message -----
> From: Kenneth Waegeman <kenneth.waegeman at ugent.be>
> Sent by: gpfsug-discuss-bounces at spectrumscale.org
> To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
> Cc:
> Subject: [gpfsug-discuss] system.log pool on client nodes for HAWC
> Date: Tue, Aug 28, 2018 5:31 AM
>
> Hi all,
>
> I was looking into HAWC, using the 'distributed fast storage in client
> nodes' method (
>
> https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adv_hawc_using.htm
>
> )
>
> This is achieved by putting a local device on the clients in the
> system.log pool. Reading another article
> (
> https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adv_syslogpool.htm
> ), it seems this pool would then be used for ALL file system recovery logs.
>
> Does this mean that if you have a (small) subset of clients with fast
> local devices added in the system.log pool, all other clients will use
> these too instead of the central system pool?
>
> Thank you!
>
> Kenneth
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss

