[gpfsug-discuss] Write performances and filesystem size

Ivano Talamo Ivano.Talamo at psi.ch
Thu Nov 16 08:44:06 GMT 2017


Hello Olaf,

yes, I confirm that it is the Lenovo version of the ESS GL2, so 2
enclosures / 4 drawers / 166 disks in total.

Each recovery group has one declustered array containing all of its
disks, so every vdisk uses all the physical disks, even a vdisk that is
1/4 of the total size.
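
For anyone who wants to cross-check a similar setup: the layout above
can be verified with mmlsrecoverygroup (the recovery group names here
are placeholders, not our real ones):

[snip]
# list declustered arrays, pdisk counts and vdisks per recovery group
mmlsrecoverygroup rg_gssio1 -L
mmlsrecoverygroup rg_gssio2 -L
[/snip]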

Regarding the allocation layout, we used scatter.

The tests were done on the freshly created filesystem, so there is no
close-to-full effect. We ran gpfsperf write seq.
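
In case it helps to reproduce: the runs were of the form below. The
path, transfer size, record size and thread count are illustrative
only, not our exact parameters:

[snip]
# sequential write test with gpfsperf (built from the GPFS samples)
/usr/lpp/mmfs/samples/perf/gpfsperf write seq /gpfs/testfs/perf.dat \
  -n 200g -r 16m -th 16
[/snip]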

Thanks,
Ivano


On 16/11/17 04:42, Olaf Weiser wrote:
> Sure... as long as we assume that really all physical disks are used. The
> fact that you were told 1/2 or 1/4 might turn out to mean that one or two
> complete enclosures are eliminated... that's why I was asking for more
> details.
>
> I don't see this degradation in my environments. As long as the vdisks are
> big enough to span all pdisks (which should be the case for capacities in
> the TB range), the performance stays the same.
>
> Sent from IBM Verse
>
> Jan-Frode Myklebust --- Re: [gpfsug-discuss] Write performances and
> filesystem size ---
>
> From:	"Jan-Frode Myklebust" <janfrode at tanso.net>
> To:	"gpfsug main discussion list" <gpfsug-discuss at spectrumscale.org>
> Date:	Wed, 15.11.2017 21:35
> Subject:	Re: [gpfsug-discuss] Write performances and filesystem size
>
> ------------------------------------------------------------------------
>
> Olaf, this looks like a Lenovo «ESS GLxS» version. It should be using the
> same number of spindles for any filesystem size, so I would also expect
> them to perform the same.
>
>
>
> -jf
>
>
> On Wed, 15 Nov 2017 at 11:26, Olaf Weiser <olaf.weiser at de.ibm.com
> <mailto:olaf.weiser at de.ibm.com>> wrote:
>
>     To add a comment... very simply: it depends on how you allocate
>     the physical block storage. If you simply use fewer physical
>     resources when reducing the capacity (in the same ratio), you get
>     what you see.
>
>     So you need to tell us how you allocate your block storage. (Are
>     you using RAID controllers? Where are your LUNs coming from? Are
>     fewer RAID groups involved when you reduce the capacity?)
>
>     GPFS can be configured to give you pretty much what the hardware
>     can deliver: if you reduce resources you get less, if you enhance
>     your hardware you get more, almost regardless of the total
>     capacity in #blocks.
>
>
>
>
>
>
>     From:        "Kumaran Rajaram" <kums at us.ibm.com
>     <mailto:kums at us.ibm.com>>
>     To:        gpfsug main discussion list
>     <gpfsug-discuss at spectrumscale.org
>     <mailto:gpfsug-discuss at spectrumscale.org>>
>     Date:        11/15/2017 11:56 AM
>     Subject:        Re: [gpfsug-discuss] Write performances and
>     filesystem size
>     Sent by:        gpfsug-discuss-bounces at spectrumscale.org
>     <mailto:gpfsug-discuss-bounces at spectrumscale.org>
>     ------------------------------------------------------------------------
>
>
>
>     Hi,
>
>     >>Am I missing something? Is this an expected behaviour, and does
>     someone have an explanation for it?
>
>     Based on your scenario, write degradation as the file system is
>     populated is possible if you had formatted the file system with
>     "-j cluster".
>
>     For consistent file-system performance, we recommend mmcrfs with
>     "-j scatter" for the layout map. Also, we need to ensure that the
>     mmcrfs "-n" value is set properly.
>
>     [snip of mmlsfs output]
>     # mmlsfs <fs> | egrep 'Block allocation| Estimated number'
>      -j                 scatter                  Block allocation type
>      -n                 128                      Estimated number of nodes that will mount file system
>     [/snip]
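>
>     As a hypothetical example (device and stanza-file names are
>     placeholders), the recommended settings would be applied at
>     file-system creation like this:
>
>     [snip]
>     # create the file system with scatter allocation and an explicit
>     # estimate of 128 mounting nodes
>     mmcrfs fs1 -F nsd.stanza -j scatter -n 128
>     [/snip]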
>
>
>     [snip from man mmcrfs]
>     layoutMap={scatter|cluster}
>                      Specifies the block allocation map type. When
>                      allocating blocks for a given file, GPFS first
>                      uses a round-robin algorithm to spread the data
>                      across all disks in the storage pool. After a
>                      disk is selected, the location of the data
>                      block on the disk is determined by the block
>                      allocation map type. If cluster is
>                      specified, GPFS attempts to allocate blocks in
>                      clusters. Blocks that belong to a particular
>                      file are kept adjacent to each other within
>                      each cluster. If scatter is specified,
>                      the location of the block is chosen randomly.
>
>                      The cluster allocation method may provide
>                      better disk performance for some disk
>                      subsystems in relatively small installations.
>                      The benefits of clustered block allocation
>                      diminish when the number of nodes in the
>                      cluster or the number of disks in a file system
>                      increases, or when the file system's free space
>                      becomes fragmented. The cluster
>                      allocation method is the default for GPFS
>                      clusters with eight or fewer nodes and for file
>                      systems with eight or fewer disks.
>
>                      The scatter allocation method provides
>                      more consistent file system performance by
>                      averaging out performance variations due to
>                      block location (for many disk subsystems, the
>                      location of the data relative to the disk edge
>                      has a substantial effect on performance). This
>                      allocation method is appropriate in most cases
>                      and is the default for GPFS clusters with more
>                      than eight nodes or file systems with more than
>                      eight disks.
>
>                      The block allocation map type cannot be changed
>                      after the storage pool has been created.
>
>     -n NumNodes
>             The estimated number of nodes that will mount the file
>             system in the local cluster and all remote clusters.
>             This is used as a best guess for the initial size of
>             some file system data structures. The default is 32.
>             This value can be changed after the file system has been
>             created, but it does not change the existing data
>             structures. Only newly created data structures are
>             affected by the new value (for example, a new storage
>             pool).
>
>             When you create a GPFS file system, you might want to
>             overestimate the number of nodes that will mount the
>             file system. GPFS uses this information for creating
>             data structures that are essential for achieving maximum
>             parallelism in file system operations (for more
>             information, see GPFS architecture in IBM Spectrum
>             Scale: Concepts, Planning, and Installation Guide). If
>             you are sure there will never be more than 64 nodes,
>             allow the default value to be applied. If you are
>             planning to add nodes to your system, you should specify
>             a number larger than the default.
>
>     [/snip from man mmcrfs]
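>
>     Since the man page states the value can be changed after creation
>     (with the caveat that existing data structures keep the old
>     value), a sketch for checking and raising it later is below.
>     Treat the exact options as an assumption and verify them against
>     the man pages of your Scale level:
>
>     [snip]
>     # show the current estimate for this file system
>     mmlsfs fs1 -n
>     # raise the estimate; only newly created structures pick it up
>     mmchfs fs1 -n 256
>     [/snip]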
>
>     Regards,
>     -Kums
>
>
>
>
>
>     From:        Ivano Talamo <Ivano.Talamo at psi.ch
>     <mailto:Ivano.Talamo at psi.ch>>
>     To:        <gpfsug-discuss at spectrumscale.org
>     <mailto:gpfsug-discuss at spectrumscale.org>>
>     Date:        11/15/2017 11:25 AM
>     Subject:        [gpfsug-discuss] Write performances and filesystem size
>     Sent by:        gpfsug-discuss-bounces at spectrumscale.org
>     <mailto:gpfsug-discuss-bounces at spectrumscale.org>
>     ------------------------------------------------------------------------
>
>
>
>     Hello everybody,
>
>     Together with my colleagues I am currently running some tests on
>     a new DSS G220 system, and we are seeing some unexpected
>     behaviour.
>
>     What we see is that write performance (we have not tested reads
>     yet) decreases as the filesystem size decreases.
>
>     I will not go into the details of the tests, but here are some numbers:
>
>     - with a filesystem using the full 1.2 PB space we get 14 GB/s as the
>     sum of the disk activity on the two IO servers;
>     - with a filesystem using half of the space we get 10 GB/s;
>     - with a filesystem using 1/4 of the space we get 5 GB/s.
>
>     We also saw that performance is not affected by the vdisk layout,
>     i.e. taking the full space with one big vdisk or with two
>     half-size vdisks per RG gives the same performance.
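>
>     (For illustration only: the vdisks were defined with mmcrvdisk
>     stanzas of the following form; the names, size and RAID code are
>     placeholders, not our exact values.)
>
>     [snip]
>     # one full-size data vdisk per recovery group
>     %vdisk: vdiskName=fs1_data_rg1
>       rg=rg1 da=DA1
>       blocksize=8m size=600t
>       raidCode=8+2p diskUsage=dataOnly pool=data
>     [/snip]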
>
>     To our understanding the IO should be spread evenly across all
>     the pdisks in the declustered array, and looking at iostat all
>     disks seem to be accessed. So there must be some other element
>     that affects performance.
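>
>     For reference, we were watching per-disk activity with plain
>     iostat on both IO servers while the tests ran, along these lines:
>
>     [snip]
>     # extended per-device statistics in MB, refreshed every 5 seconds
>     iostat -xm 5
>     [/snip]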
>
>     Am I missing something? Is this an expected behaviour, and does
>     someone have an explanation for it?
>
>     Thank you,
>     Ivano
>
>     _______________________________________________
>     gpfsug-discuss mailing list
>     gpfsug-discuss at spectrumscale.org <http://spectrumscale.org>
>     http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
>
>
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>


