[gpfsug-discuss] Optimal range on inode count for a single folder

Marc A Kaplan makaplan at us.ibm.com
Tue Sep 11 15:03:24 BST 2018


There is no single "optimal" number of files per directory.

GPFS can handle millions of files in a directory, rather efficiently.  It 
uses fairly modern extensible hashing and caching techniques that make 
lookups, insertions and deletions go fast.  But of course, reading or 
"listing" all directory entries is going to require reading all the disk 
sectors that contain the directory...
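To illustrate the difference between the two access patterns above, here is 
a minimal Python sketch. It is not GPFS-specific: it uses only standard 
library calls, and the function names has_entry and count_entries are made 
up for this example. The first resolves a single name, the second has to 
walk every entry in the directory.

```python
import os

def has_entry(directory, name):
    """Single-name lookup: the file system resolves just this one entry."""
    return os.path.lexists(os.path.join(directory, name))

def count_entries(directory):
    """Full listing: every entry in the directory has to be read."""
    with os.scandir(directory) as entries:
        return sum(1 for _ in entries)
```

On a directory with millions of entries one would expect has_entry to stay 
quick while count_entries grows with the size of the directory.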

"during system remount... restarting the system"  -- NO!  There is no 
relation between directory sizes and mount or startup times... 
If you are experiencing long mount times, something else is happening. If 
the restart follows a crash of some kind, then it is possible GPFS may need 
to process many log entries -- but that would be proportional to the 
number of directory updates "in flight" at the time of the crash...

Having said that, there are some changeover conditions in the way 
directories are stored as one adds more and more entries.  Since 
directory entries are of variable size, varying with the length of the file 
names, the exact numbers depend on file name length, inode size and 
(meta)data block size:

A) All directory entries fit in the directory inode.   Best performance! 
But I do not recommend deliberately changing apps to avoid spilling to ...

B) All directory entries fit in one metadata block. 

C) Directory entries are spread over several blocks.

You can determine how much storage a directory is using with a `stat /path` 
command or equivalent.
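As one possible "equivalent", here is a small Python sketch that reports 
what stat returns for the directory itself. The function name 
directory_storage is made up for this example, and the 512-byte unit for 
st_blocks is the usual Linux convention, assumed here. The script does not 
know the inode size or metadata block size of your file system, so deciding 
which of the cases A, B or C applies is left to the reader.

```python
import os

def directory_storage(path):
    """Return the size figures that stat reports for the directory itself."""
    st = os.stat(path)
    return {
        "size_bytes": st.st_size,               # logical size of the directory
        "allocated_bytes": st.st_blocks * 512,  # assumes 512-byte st_blocks units
        "inode": st.st_ino,
    }

if __name__ == "__main__":
    print(directory_storage("."))
```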






From:   "Michael Dutchak" <Michael.Dutchak at ibm.com>
To:     gpfsug-discuss at spectrumscale.org
Date:   09/11/2018 09:21 AM
Subject:        [gpfsug-discuss] Optimal range on inode count for a single 
folder
Sent by:        gpfsug-discuss-bounces at spectrumscale.org



I would like to find out what the limitation, or optimal range, on inode 
count for a single folder is in GPFS.  We have several users that have 
caused issues with our current file system by adding up to a million 
small files (1 ~ 40k) to a single directory.  This causes issues during 
system remount where restarting the system can take excessive amounts of 
time.
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss




