[gpfsug-discuss] Write performances and filesystem size
Olaf Weiser
olaf.weiser at de.ibm.com
Thu Nov 16 03:42:05 GMT 2017
Sure... as long as we assume that really all physical disks are used. The fact that 1/2 or 1/4 of the capacity was used might mean that one or two complete enclosures were eliminated... ? That's why I was asking for more details.
I don't see this degradation in my environments. As long as the vdisks are big enough to span over all pdisks (which should be the case for capacity in the range of TB), the performance stays the same.
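A quick way to sanity-check that (a hedged sketch only; the recovery group name <rg> is a placeholder) is to list the recovery groups and confirm that the vdisks really span every pdisk in the declustered arrays:
[snip -- hypothetical check]
# mmlsrecoverygroup                 (list the recovery groups on the building block)
# mmlsrecoverygroup <rg> -L         (declustered arrays, pdisk counts and vdisks per RG)
# mmlsvdisk                         (all vdisks and the declustered array / RG they belong to)
[/snip]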
Sent from IBM Verse
Jan-Frode Myklebust --- Re: [gpfsug-discuss] Write performances and filesystem size ---
Von:"Jan-Frode Myklebust" <janfrode at tanso.net>An:"gpfsug main discussion list" <gpfsug-discuss at spectrumscale.org>Datum:Mi. 15.11.2017 21:35Betreff:Re: [gpfsug-discuss] Write performances and filesystem size
Olaf, this looks like a Lenovo «ESS GLxS» version. It should be using the same number of spindles for any filesystem size, so I would also expect them to perform the same.
-jf
On Wed, 15 Nov 2017 at 11:26, Olaf Weiser <olaf.weiser at de.ibm.com> wrote:
To add a comment... very simply: it depends on how you allocate the physical block storage. If you simply use fewer physical resources when reducing the capacity (in the same ratio), you get what you are seeing.
So you need to tell us how you allocate your block storage. (Are you using RAID controllers? Where are your LUNs coming from? Are fewer RAID groups involved when you reduce the capacity?)
GPFS can be configured to give you pretty much what the hardware can deliver. If you reduce resources, you'll get less; if you enhance your hardware, you get more, almost regardless of the total capacity in #blocks.
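(If it helps, a hedged example of how that mapping can be shown; the file system name <fs> is a placeholder:)
[snip -- hypothetical check]
# mmlsnsd -f <fs>        (which NSDs / LUNs back the file system, and which servers serve them)
# mmlsdisk <fs> -L       (disk and storage pool layout of the file system)
[/snip]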
From: "Kumaran Rajaram" <kums at us.ibm.com>
To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Date: 11/15/2017 11:56 AM
Subject: Re: [gpfsug-discuss] Write performances and filesystem size
Sent by: gpfsug-discuss-bounces at spectrumscale.org
Hi,
>>Am I missing something? Is this an expected behaviour and someone has an explanation for this?
Based on your scenario, write degradation as the file system fills up is possible if you formatted the file system with "-j cluster".
For consistent file-system performance, we recommend the mmcrfs "-j scatter" layoutMap. We also need to ensure the mmcrfs "-n" (estimated number of nodes that will mount the file system) is set properly.
[snip from mmcrfs]
# mmlsfs <fs> | egrep 'Block allocation| Estimated number'
-j scatter Block allocation type
-n 128 Estimated number of nodes that will mount file system
[/snip]
[snip from man mmcrfs]
layoutMap={scatter | cluster}
Specifies the block allocation map type. When
allocating blocks for a given file, GPFS first
uses a round‐robin algorithm to spread the data
across all disks in the storage pool. After a
disk is selected, the location of the data
block on the disk is determined by the block
allocation map type. If cluster is
specified, GPFS attempts to allocate blocks in
clusters. Blocks that belong to a particular
file are kept adjacent to each other within
each cluster. If scatter is specified,
the location of the block is chosen randomly.
The cluster allocation method may provide
better disk performance for some disk
subsystems in relatively small installations.
The benefits of clustered block allocation
diminish when the number of nodes in the
cluster or the number of disks in a file system
increases, or when the file system’s free space
becomes fragmented. The cluster
allocation method is the default for GPFS
clusters with eight or fewer nodes and for file
systems with eight or fewer disks.
The scatter allocation method provides
more consistent file system performance by
averaging out performance variations due to
block location (for many disk subsystems, the
location of the data relative to the disk edge
has a substantial effect on performance). This
allocation method is appropriate in most cases
and is the default for GPFS clusters with more
than eight nodes or file systems with more than
eight disks.
The block allocation map type cannot be changed
after the storage pool has been created.
-n NumNodes
The estimated number of nodes that will mount the file
system in the local cluster and all remote clusters.
This is used as a best guess for the initial size of
some file system data structures. The default is 32.
This value can be changed after the file system has been
created but it does not change the existing data
structures. Only the newly created data structure is
affected by the new value. For example, new storage
pool.
When you create a GPFS file system, you might want to
overestimate the number of nodes that will mount the
file system. GPFS uses this information for creating
data structures that are essential for achieving maximum
parallelism in file system operations (For more
information, see GPFS architecture in IBM Spectrum
Scale: Concepts, Planning, and Installation Guide ). If
you are sure there will never be more than 64 nodes,
allow the default value to be applied. If you are
planning to add nodes to your system, you should specify
a number larger than the default.
[/snip from man mmcrfs]
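Putting this together, a minimal, purely illustrative mmcrfs invocation would be (the device name, stanza file and "-n" value below are placeholders, not a recommendation for your specific hardware):
[snip -- hypothetical example]
# mmcrfs fs1 -F nsd_stanzas.txt -j scatter -n 128
# mmlsfs fs1 | egrep 'Block allocation| Estimated number'     (verify the settings afterwards)
[/snip]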
Regards,
-Kums
From: Ivano Talamo <Ivano.Talamo at psi.ch>
To: <gpfsug-discuss at spectrumscale.org>
Date: 11/15/2017 11:25 AM
Subject: [gpfsug-discuss] Write performances and filesystem size
Sent by: gpfsug-discuss-bounces at spectrumscale.org
Hello everybody,
together with my colleagues I am currently running some tests on a new
DSS G220 system, and we see some unexpected behaviour.
What we see is that write performance (we have not tested reads
yet) decreases as the filesystem size decreases.
I will not go into the details of the tests, but here are some numbers:
- with a filesystem using the full 1.2 PB space we get 14 GB/s as the
sum of the disk activity on the two IO servers;
- with a filesystem using half of the space we get 10 GB/s;
- with a filesystem using 1/4 of the space we get 5 GB/s.
We also saw that performance is not affected by the vdisk layout, i.e.
taking the full space with one big vdisk or with two half-size vdisks per RG
gives the same performance.
To our understanding the IO should be spread evenly across all the
pdisks in the declustered array, and looking at iostat all disks do seem to
be accessed. So there must be some other element that affects
performance.
Am I missing something? Is this expected behaviour, and does someone have an
explanation for it?
Thank you,
Ivano
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss