[gpfsug-discuss] Blocksize and space and performance for Metadata, release 4.2.x

Marc A Kaplan makaplan at us.ibm.com
Thu Sep 22 21:25:10 BST 2016


There have been a few changes over the years that may invalidate some of 
the old advice about metadata and the disk allocations for it.
These changes have been phased in over the last few years; here I am 
describing the present situation as of release 4.2.x.

1) Inode size.  Used to be 512 bytes.  Now you can set the inode size at 
mmcrfs time; it defaults to 4096.  (See the mmcrfs example at the end of 
this note.)

2) Data in inode.  If the file's data fits, then the inode itself holds 
the data.  Since a 512-byte inode still works (that is, all the fixed 
inode metadata fits in 512 bytes), you can have more than 3.5KB of data 
in a 4KB inode.  (See the stat example at the end of this note.)

3) Extended Attributes in Inode.  Again, if they fit...  Extended 
attributes used to be stored in a separate metadata file, so extended 
attribute performance is way better than in the old days.  (Example at 
the end of this note.)

4) (small) Directories in Inode.  If they fit, the inode of a directory 
can hold the directory entries.  That gives you about 2x performance on 
directory reads for smallish directories.

5) Big directory blocks.  Directory blocks used to be at most 32KB, 
potentially wasting a lot of space and yielding poor performance for 
large directories.  Now the directory block size is the lesser of the 
metadata block size and 256KB.

6) Big directories are shrinkable.  Directories used to grow in 32KB 
chunks but never shrink.  Yup, even an almost "empty" directory would 
remain at its lifetime-maximum size, which meant that just a few 
remaining entries could be "sprinkled" over many directory blocks.  (See 
also 5.)
But now directories autoshrink to avoid wasteful sparsity.  Last I 
looked, the implementation just stopped short of "pushing" tiny 
directories back into the inode, but a huge directory can be shrunk down 
to a single (meta)data block.  (See --compact in the docs, and the 
example at the end of this note.)
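
A few illustrative examples, since several of the items above mention 
commands.  For item 1: the inode size is fixed at mmcrfs time with the 
-i option, and mmlsfs reports it afterwards.  The device and stanza file 
names below are just placeholders.

   # create a file system with 4KB inodes (the 4.2.x default)
   mmcrfs gpfs1 -F nsd.stanza -i 4096

   # check the inode size (and block size) of an existing file system
   mmlsfs gpfs1 -i -B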
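
For items 2 and 4: a rough way to see data-in-inode at work.  This 
assumes (as I understand it) that a file or small directory whose 
contents fit in the inode allocates no separate data block, so stat 
reports zero allocated blocks.

   echo "a few bytes of data" > /gpfs/gpfs1/tinyfile
   stat -c 'size=%s bytes, allocated blocks=%b' /gpfs/gpfs1/tinyfile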
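
For item 3: small extended attributes can be set and read with the 
standard Linux xattr tools; the attribute name and path here are only 
examples.

   setfattr -n user.project -v "abc123" /gpfs/gpfs1/tinyfile
   getfattr -n user.project /gpfs/gpfs1/tinyfile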
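
For item 6: if memory serves, --compact hangs off mmchattr, but please 
check the documentation for the exact command and syntax.

   # compact a directory that was once huge but is now mostly empty
   mmchattr --compact /gpfs/gpfs1/formerly-huge-dir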

--marc of GPFS
