[gpfsug-discuss] Write performances and filesystem size
Olaf Weiser
olaf.weiser at de.ibm.com
Thu Nov 16 12:03:16 GMT 2017
Thanks, that makes it a bit clearer. Since your vdisks are big enough to span all pdisks, each of your tests (1/1, 1/2 or 1/4 of the capacity) should deliver the same performance.
You mentioned something about the vdisk layout:
So in your test, for the full-capacity case, you used just one vdisk per recovery group (RG), i.e. 2 in total for data, right?
What about metadata? Did you create separate vdisks for MD, and if so, what size?
Sent from IBM Verse
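On an ESS/DSS building block, questions like these can usually be answered from the GNR command set. A minimal sketch, assuming the stock commands and a purely illustrative recovery group name:

   mmlsvdisk                       # lists every vdisk with its RAID code, block size and declustered array
   mmlsrecoverygroup BB1RGL -L     # shows the declustered arrays, pdisk counts and vdisk sizes of one RG

The recovery group name BB1RGL is a placeholder; the real names can be taken from mmlsrecoverygroup run without arguments.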
Ivano Talamo --- Re: [gpfsug-discuss] Write performances and filesystem size ---
Von:"Ivano Talamo" <Ivano.Talamo at psi.ch>An:"gpfsug main discussion list" <gpfsug-discuss at spectrumscale.org>Datum:Do. 16.11.2017 03:49Betreff:Re: [gpfsug-discuss] Write performances and filesystem size
Hello Olaf,

yes, I confirm that this is the Lenovo version of the ESS GL2, so 2 enclosures / 4 drawers / 166 disks in total.
Each recovery group has one declustered array with all disks inside, so the vdisks use all the physical disks, even in the case of a vdisk that is 1/4 of the total size.
Regarding the allocation layout we used scatter.
The tests were done on the freshly created filesystem, so there is no close-to-full effect. And we ran gpfsperf write seq.

Thanks,
Ivano
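gpfsperf ships as a sample program with Spectrum Scale. A minimal sketch of a sequential-write run of this kind, with a purely illustrative target path, file size, record size and thread count (not the parameters used in the tests above):

   cd /usr/lpp/mmfs/samples/perf && make                           # build the sample once
   ./gpfsperf create seq /gpfs/testfs/perf/f01 -n 16g -r 8m -th 8  # create and fill the test file
   ./gpfsperf write  seq /gpfs/testfs/perf/f01 -n 16g -r 8m -th 8  # sequential re-write of the same file

In practice such a run is typically started from several client nodes in parallel so that both IO servers are saturated.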
On 16/11/17 04:42, Olaf Weiser wrote:
> Sure... as long as we assume that really all physical disks are used. What was described as 1/2 or 1/4
> might turn out to mean that one or two complete enclosures are eliminated; that is why I was asking
> for more details.
>
> I don't see this degradation in my environments. As long as the vdisks are big enough to span over
> all pdisks (which should be the case for capacities in the TB range), the performance stays the same.
>
> Sent from IBM Verse
>
> Jan-Frode Myklebust --- Re: [gpfsug-discuss] Write performances and filesystem size ---
>
> From: "Jan-Frode Myklebust" <janfrode at tanso.net>
> To: "gpfsug main discussion list" <gpfsug-discuss at spectrumscale.org>
> Date: Wed 15.11.2017 21:35
> Subject: Re: [gpfsug-discuss] Write performances and filesystem size
>
> ------------------------------------------------------------------------
>
> Olaf, this looks like a Lenovo «ESS GLxS» version. It should be using the same number of spindles for
> any size filesystem, so I would also expect them to perform the same.
>
> -jf
>
> On Wed, 15 Nov 2017 at 11:26, Olaf Weiser <olaf.weiser at de.ibm.com> wrote:
>
> To add a comment... very simply, it depends on how you allocate the physical block storage. If you
> simply use fewer physical resources when reducing the capacity (in the same ratio), you get what you
> see.
>
> So you need to tell us how you allocate your block storage (do you use RAID controllers, where are
> your LUNs coming from, are fewer RAID groups involved when reducing the capacity?).
>
> GPFS can be configured to give you pretty much what the hardware can deliver: if you reduce resources
> you get less, if you enhance your hardware you get more, almost regardless of the total capacity in
> #blocks.
>
> From: "Kumaran Rajaram" <kums at us.ibm.com>
> To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
> Date: 11/15/2017 11:56 AM
> Subject: Re: [gpfsug-discuss] Write performances and filesystem size
> Sent by: gpfsug-discuss-bounces at spectrumscale.org
> ------------------------------------------------------------------------
>
> Hi,
>
> >> Am I missing something? Is this an expected behaviour and someone has an explanation for this?
>
> Based on your scenario, write degradation as the file system is populated is possible if you had
> formatted the file system with "-j cluster".
>
> For consistent file system performance, we recommend mmcrfs "-j scatter" (layoutMap). Also, we need
> to ensure the mmcrfs "-n" is set properly.
>
> [snip from mmcrfs]
> # mmlsfs <fs> | egrep 'Block allocation|Estimated number'
>  -j                 scatter                  Block allocation type
>  -n                 128                      Estimated number of nodes that will mount file system
> [/snip]
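For illustration, a file system formatted with the scatter allocation map and an explicit node estimate could be created roughly as follows; the device name, NSD stanza file, block size and -n value are placeholders, not taken from this thread:

   mmcrfs testfs -F /tmp/vdisk-nsd.stanza -j scatter -n 128 -B 8M
   mmlsfs testfs | egrep 'Block allocation|Estimated number'

Since the block allocation map type cannot be changed after the storage pool is created, -j has to be chosen at format time.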
> [snip from man mmcrfs]
> layoutMap={scatter|cluster}
>   Specifies the block allocation map type. When allocating blocks for a given file, GPFS first uses a
>   round-robin algorithm to spread the data across all disks in the storage pool. After a disk is
>   selected, the location of the data block on the disk is determined by the block allocation map type.
>   If cluster is specified, GPFS attempts to allocate blocks in clusters. Blocks that belong to a
>   particular file are kept adjacent to each other within each cluster. If scatter is specified, the
>   location of the block is chosen randomly.
>
>   The cluster allocation method may provide better disk performance for some disk subsystems in
>   relatively small installations. The benefits of clustered block allocation diminish when the number
>   of nodes in the cluster or the number of disks in a file system increases, or when the file system's
>   free space becomes fragmented. The cluster allocation method is the default for GPFS clusters with
>   eight or fewer nodes and for file systems with eight or fewer disks.
>
>   The scatter allocation method provides more consistent file system performance by averaging out
>   performance variations due to block location (for many disk subsystems, the location of the data
>   relative to the disk edge has a substantial effect on performance). This allocation method is
>   appropriate in most cases and is the default for GPFS clusters with more than eight nodes or file
>   systems with more than eight disks.
>
>   The block allocation map type cannot be changed after the storage pool has been created.
>
> -n NumNodes
>   The estimated number of nodes that will mount the file system in the local cluster and all remote
>   clusters. This is used as a best guess for the initial size of some file system data structures.
>   The default is 32. This value can be changed after the file system has been created, but it does
>   not change the existing data structures. Only the newly created data structure is affected by the
>   new value, for example, a new storage pool.
>
>   When you create a GPFS file system, you might want to overestimate the number of nodes that will
>   mount the file system. GPFS uses this information for creating data structures that are essential
>   for achieving maximum parallelism in file system operations (for more information, see GPFS
>   architecture in IBM Spectrum Scale: Concepts, Planning, and Installation Guide). If you are sure
>   there will never be more than 64 nodes, allow the default value to be applied. If you are planning
>   to add nodes to your system, you should specify a number larger than the default.
> [/snip from man mmcrfs]
>
> Regards,
> -Kums
>
>
> From: Ivano Talamo <Ivano.Talamo at psi.ch>
> To: <gpfsug-discuss at spectrumscale.org>
> Date: 11/15/2017 11:25 AM
> Subject: [gpfsug-discuss] Write performances and filesystem size
> Sent by: gpfsug-discuss-bounces at spectrumscale.org
> ------------------------------------------------------------------------
>
> Hello everybody,
>
> together with my colleagues we are currently running some tests on a new DSS G220 system and we see
> some unexpected behaviour.
>
> What we see is that write performance (we did not test read yet) decreases as the filesystem size
> decreases.
>
> I will not go into the details of the tests, but here are some numbers:
>
> - with a filesystem using the full 1.2 PB space we get 14 GB/s as the sum of the disk activity on the
>   two IO servers;
> - with a filesystem using half of the space we get 10 GB/s;
> - with a filesystem using 1/4 of the space we get 5 GB/s.
>
> We also saw that performance is not affected by the vdisk layout, i.e. taking the full space with one
> big vdisk or two half-size vdisks per RG gives the same performance.
>
> To our understanding the IO should be spread evenly across all the pdisks in the declustered array,
> and looking at iostat all disks seem to be accessed. But then there must be some other element that
> affects performance.
>
> Am I missing something? Is this an expected behaviour and someone has an explanation for this?
>
> Thank you,
> Ivano
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
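Regarding the iostat observation above: a rough way to cross-check that all pdisks really take part during such a run is to watch per-device throughput on both IO servers and the pdisk states on the GNR side. The interval below is only an example:

   iostat -xm 5            # watch the wMB/s column across the sd* devices while the test runs
   mmlspdisk all --not-ok  # a missing or draining pdisk would skew the I/O distribution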