[gpfsug-discuss] system.log pool on client nodes for HAWC

Stijn De Weirdt stijn.deweirdt at ugent.be
Tue Sep 4 20:23:35 BST 2018


hi vasily, sven,

and is there any advantage in moving the system.log pool to faster
storage (like nvdimm), or in increasing its default size, when HAWC is
not used (i.e. write-cache-threshold kept at 0)? (i remember the (very
creative) logtip placement on the gss boxes ;)

thanks a lot for the detailed answer

stijn

On 09/04/2018 05:57 PM, Vasily Tarasov wrote:
> Let me add just one more item to Sven's detailed reply: HAWC is especially 
> helpful for decreasing the latency of small synchronous I/Os that come in 
> *bursts*. If your workload contains a sustained high rate of writes, the 
> recovery log will fill up very quickly, and HAWC won't help much (it can even 
> decrease performance). Making the recovery log larger allows it to absorb 
> longer I/O bursts. The specific amount of improvement depends on the workload 
> (e.g. how long and how intense the bursts are) and on the hardware.
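> 
> For instance (just a rough sketch; fs0 is a placeholder file system name, and 
> the exact mmlsfs/mmchfs options accepted depend on your release, so verify 
> them against the documentation first):
> 
>     # show the current recovery log size and HAWC write cache threshold
>     mmlsfs fs0 -L --write-cache-threshold
> 
>     # enlarge the recovery log so it can absorb longer bursts (example size),
>     # then enable HAWC for synchronous writes up to 64 KiB (0 disables HAWC);
>     # the new log size may only take effect after the file system is remounted
>     mmchfs fs0 -L 128M
>     mmchfs fs0 --write-cache-threshold 64K
> 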
> Best,
> Vasily
> --
> Vasily Tarasov,
> Research Staff Member,
> Storage Systems Research,
> IBM Research - Almaden
> 
>     ----- Original message -----
>     From: Sven Oehme <oehmes at gmail.com>
>     To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
>     Cc: Vasily Tarasov <vtarasov at us.ibm.com>
>     Subject: Re: [gpfsug-discuss] system.log pool on client nodes for HAWC
>     Date: Mon, Sep 3, 2018 8:32 AM
>     Hi Ken,
>     what the document is saying (or trying to say) is that the behavior of
>     data-in-inode or metadata operations is not changed when HAWC is enabled.
>     that means if the data fits into the inode, it is placed there directly
>     instead of the data i/o being written into a data recovery log record
>     (which is what HAWC uses) and later destaged to wherever the data blocks
>     of the file will eventually be written. it also means that if all your
>     application does is create small files that fit into the inode, HAWC will
>     not be able to improve performance.
>     it's unfortunately not so simple to say whether HAWC will help or not,
>     but here are a couple of thoughts on where it will and won't help:
>     where it won't help:
>     1. your storage devices have a very large, or even better a
>     log-structured, write cache
>     2. the majority of your files are very small
>     3. your files are almost always accessed sequentially
>     4. your storage is primarily flash based
>     where it most likely will help:
>     1. the majority of your storage is direct-attached HDD (e.g. FPO) with a
>     small SSD pool for metadata and HAWC
>     2. your ratio of clients to storage devices is very high (think hundreds
>     of clients and only 1 storage array)
>     3. your workload is primarily virtual machines or databases
>     as always there are lots of exceptions and corner cases, but this is the
>     best list i could come up with.
>     on how to find out whether HAWC could help, there are 2 ways of doing
>     this. first, look at mmfsadm dump iocounters; it shows the average size
>     of i/os, so you can check whether a lot of small write operations are
>     being done. a more involved but more accurate way is to take a trace with
>     trace level trace=io. that generates a very lightweight trace of only the
>     most relevant io layers of GPFS, which you can then post-process to look
>     at operation performance. the data is not the simplest to understand for
>     somebody with little filesystem background, but if you stare at it for a
>     while it might make some sense to you.
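> 
>     roughly, something along these lines (the node name is a placeholder and
>     the mmtracectl syntax should be double checked for your level):
> 
>         # 1) quick look at i/o counters / average i/o sizes on a client
>         mmfsadm dump iocounters
> 
>         # 2) short, low-overhead io-level trace while the workload runs,
>         #    then post-process the generated trace report
>         mmtracectl --set --trace=io
>         mmtracectl --start -N client01
>         # ... let the workload run for a few minutes ...
>         mmtracectl --stop -N client01
> 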
>     Sven
>     On Mon, Sep 3, 2018 at 4:06 PM Kenneth Waegeman
>     <kenneth.waegeman at ugent.be> wrote:
> 
>         Thank you Vasily and Simon for the clarification!
> 
>         I was looking further into it, and I got stuck with more questions :)
> 
> 
>         - In
>         https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adv_hawc_tuning.htm
>         I read:
>              HAWC does not change the following behaviors:
>                  write behavior of small files when the data is placed in the
>         inode itself
>                  write behavior of directory blocks or other metadata
> 
>         I wondered why. Isn't the metadata logged in the (same) recovery logs?
>         (Reading
>         https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.0/com.ibm.spectrum.scale.v4r2.ins.doc/bl1ins_logfile.htm
>         it seems it is.)
> 
> 
>         - Would there be a way to estimate how many of the write requests on a
>         running cluster would benefit from enabling HAWC?
> 
> 
>         Thanks again!
> 
> 
>         Kenneth
>         On 31/08/18 19:49, Vasily Tarasov wrote:
>>         That is correct. The blocks of each recovery log are striped across
>>         the devices in the system.log pool (if it is defined). As a result,
>>         even when all clients have a local device in the system.log pool, many
>>         writes to the recovery log will go to remote devices. For a client
>>         that lacks a local device in the system.log pool, log writes will
>>         always be remote.
>>         Notice that typically in such a setup you would enable log
>>         replication for HA. Otherwise, if a single client fails (and its
>>         recovery log is lost), the whole cluster fails, as there is no log to
>>         recover the FS to a consistent state. Therefore, at least one remote
>>         write is essential.
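>>
>>         As an illustration (the --log-replicas option is described in the HAWC
>>         material, but please verify it for your release; fs0 is a placeholder):
>>
>>             # keep two copies of each recovery log placed in the system.log pool
>>             mmchfs fs0 --log-replicas 2
>>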
>>         HTH,
>>         --
>>         Vasily Tarasov,
>>         Research Staff Member,
>>         Storage Systems Research,
>>         IBM Research - Almaden
>>
>>             ----- Original message -----
>>             From: Kenneth Waegeman <kenneth.waegeman at ugent.be>
>>             Sent by: gpfsug-discuss-bounces at spectrumscale.org
>>             To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
>>             Cc:
>>             Subject: [gpfsug-discuss] system.log pool on client nodes for HAWC
>>             Date: Tue, Aug 28, 2018 5:31 AM
>>             Hi all,
>>
>>             I was looking into HAWC, using the 'distributed fast storage in
>>             client nodes' method
>>             (https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adv_hawc_using.htm).
>>
>>             This is achieved by putting a local device on the clients in the
>>             system.log pool. Reading another article
>>             (https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adv_syslogpool.htm),
>>             this pool would now be used for ALL file system recovery logs.
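>>
>>             As far as I understand it, a sketch of such a setup would be an NSD
>>             stanza per client plus mmcrnsd/mmadddisk, something like the following
>>             (the device, NSD name, file system name and usage value are my
>>             guesses, not taken from the docs):
>>
>>                 %nsd:
>>                   device=/dev/nvme0n1
>>                   nsd=client01_hawc
>>                   servers=client01
>>                   usage=metadataOnly
>>                   pool=system.log
>>
>>                 mmcrnsd -F hawc_stanzas.txt
>>                 mmadddisk fs0 -F hawc_stanzas.txt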
>>
>>             Does this mean that if you have a (small) subset of clients with fast
>>             local devices added in the system.log pool, all other clients will use
>>             these too instead of the central system pool?
>>
>>             Thank you!
>>
>>             Kenneth
>>
> 
> 
> 
> 
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
> 


