[gpfsug-discuss] Question about changing inode capacity safely
Jared David Baker
Jared.Baker at uwyo.edu
Fri Jan 2 18:37:19 GMT 2015
Hello GPFS admins! I hope everybody has had a great start to the new year.
Lately, I've had a few of my users get an error similar to:
error creating file: no space left on device.
This happens when they try to create even simple files (using the Linux `touch` command). However, if they try again a second or two later, the file is created without a problem and they go about their work. I can never tell when they are likely to hit the 'no space left on device' error. The file system creates many files in parallel, depending on system usage and the movement of files from other sites.
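To at least capture when it happens, something like the following probe loop (the directory is a placeholder for a path on the affected file system) would log each transient failure:
--
# placeholder directory on the affected project file system
DIR=/project/example
while sleep 1; do
    if touch "$DIR/.enospc_probe.$$" 2>/dev/null; then
        rm -f "$DIR/.enospc_probe.$$"
    else
        date '+%F %T  touch failed (likely ENOSPC)' >> /tmp/enospc_probe.log
    fi
done
--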
However, let me first describe our environment a little better. We have three GPFS file systems (home, project, gscratch) on a RHEL 6.3 InfiniBand HPC cluster. The version of GPFS is 3.5.0-11. We use fileset quotas (block limits only, no file limits) on each file system. Each user has a home fileset for storing basic configuration files, notes, and other small files. Each user belongs to at least one project, and the quota is shared among the users of that project. The gscratch file system is similar to the project file system, except that files are deleted after ~9 days.
The partially good news (perhaps) is that the error mentioned above occurs only on the project file system; we have not observed it on home or gscratch. Here is my initial investigation so far:
1.) Checked the fileset quota on one of the affected filesets:
--
# mmlsquota -j ModMast project
                          Block Limits                                  |     File Limits
Filesystem type         KB    quota       limit   in_doubt  grace |    files  quota  limit in_doubt  grace  Remarks
project    FILESET 953382016      0 16106127360          0   none |  8666828      0      0        0   none
--
From this output, the project is well under its block quota, and no file limits are set, so quotas do not appear to be the problem.
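To rule out the other filesets in one pass, I believe mmrepquota can report usage for every fileset at once:
--
# assumption: -j reports per-fileset quota usage for the given file system
mmrepquota -j project
--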
2.) Then I checked the overall file system to see whether its capacity or inodes are nearly exhausted:
--
# mmdf project
disk disk size failure holds holds free KB free KB
name in KB group metadata data in full blocks in fragments
--------------- ------------- -------- -------- ----- -------------------- -------------------
Disks in storage pool: system (Maximum disk size allowed is 397 TB)
U01_L0 15623913472 -1 Yes Yes 7404335104 ( 47%) 667820032 ( 4%)
U01_L1 15623913472 -1 Yes Yes 7498215424 ( 48%) 642773120 ( 4%)
U01_L2 15623913472 -1 Yes Yes 7497969664 ( 48%) 642664576 ( 4%)
U01_L3 15623913472 -1 Yes Yes 7496232960 ( 48%) 644327936 ( 4%)
U01_L4 15623913472 -1 Yes Yes 7499296768 ( 48%) 640117376 ( 4%)
U01_L5 15623913472 -1 Yes Yes 7494881280 ( 48%) 644168320 ( 4%)
U01_L6 15623913472 -1 Yes Yes 7494164480 ( 48%) 643673216 ( 4%)
U01_L7 15623913472 -1 Yes Yes 7497433088 ( 48%) 639918976 ( 4%)
U01_L8 15623913472 -1 Yes Yes 7494139904 ( 48%) 645130240 ( 4%)
U01_L9 15623913472 -1 Yes Yes 7498375168 ( 48%) 639979520 ( 4%)
U01_L10 15623913472 -1 Yes Yes 7496028160 ( 48%) 641909632 ( 4%)
U01_L11 15623913472 -1 Yes Yes 7496093696 ( 48%) 643749504 ( 4%)
U01_L12 15623913472 -1 Yes Yes 7496425472 ( 48%) 641556992 ( 4%)
U01_L13 15623913472 -1 Yes Yes 7495516160 ( 48%) 643395840 ( 4%)
U01_L14 15623913472 -1 Yes Yes 7496908800 ( 48%) 642418816 ( 4%)
U01_L15 15623913472 -1 Yes Yes 7495823360 ( 48%) 643580416 ( 4%)
U01_L16 15623913472 -1 Yes Yes 7499939840 ( 48%) 641538688 ( 4%)
U01_L17 15623913472 -1 Yes Yes 7497355264 ( 48%) 642184704 ( 4%)
U13_L0 2339553280 -1 Yes No 2322395136 ( 99%) 8190848 ( 0%)
U13_L1 2339553280 -1 Yes No 2322411520 ( 99%) 8189312 ( 0%)
U13_L12 15623921664 -1 Yes Yes 7799422976 ( 50%) 335150208 ( 2%)
U13_L13 15623921664 -1 Yes Yes 8002662400 ( 51%) 126059264 ( 1%)
U13_L14 15623921664 -1 Yes Yes 8001093632 ( 51%) 126107648 ( 1%)
U13_L15 15623921664 -1 Yes Yes 8001732608 ( 51%) 126167168 ( 1%)
U13_L16 15623921664 -1 Yes Yes 8000077824 ( 51%) 126240768 ( 1%)
U13_L17 15623921664 -1 Yes Yes 8001458176 ( 51%) 126068480 ( 1%)
U13_L18 15623921664 -1 Yes Yes 7998636032 ( 51%) 127111680 ( 1%)
U13_L19 15623921664 -1 Yes Yes 8001892352 ( 51%) 125148928 ( 1%)
U13_L20 15623921664 -1 Yes Yes 8001916928 ( 51%) 126187904 ( 1%)
U13_L21 15623921664 -1 Yes Yes 8002568192 ( 51%) 126591616 ( 1%)
------------- -------------------- -------------------
(pool total) 442148765696 219305402368 ( 50%) 13078121728 ( 3%)
============= ==================== ===================
(data) 437469659136 214660595712 ( 49%) 13061741568 ( 3%)
(metadata) 442148765696 219305402368 ( 50%) 13078121728 ( 3%)
============= ==================== ===================
(total) 442148765696 219305402368 ( 50%) 13078121728 ( 3%)
Inode Information
-----------------
Number of used inodes: 133031523
Number of free inodes: 1186205
Number of allocated inodes: 134217728
Maximum number of inodes: 134217728
--
Eureka! From here it seems that the inode capacity is teetering on its limit. At this point I think it would be best to educate our users not to write millions of small text files, since I don't believe the GPFS block size can be lowered after file system creation (ours is currently 4 MB). The system was originally targeted at large reads/writes from traditional HPC users, but we have since diversified our user base to include computing areas outside traditional HPC. The documentation states that for parallel file creation a minimum of 5% of the inodes should be free, otherwise performance will suffer. From the output above we have less than 1% free, which I think is the root of our problem.
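Side note: if I'm reading the mmdf man page correctly, the -F flag restricts the output to just this inode summary, which would make for a quicker recurring check:
--
# assumption: -F limits mmdf output to the inode information section
mmdf project -F
--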
Therefore: is there a method to safely increase the maximum inode count, and can it be done while the file system is in operation, or should it be unmounted first? From the man pages and some searching online, the command below looks like the right tool, but I'm curious about its safety during operation:
mmchfs project --inode-limit <new_max_inode_count>
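If I'm reading the syntax right, the limit optionally takes a MaxNumInodes[:NumInodesToPreallocate] form, so doubling our current maximum (without preallocating everything up front) would look like:
--
# assumption: doubling the current 134217728 maximum; the optional
# :NumInodesToPreallocate field is omitted so inodes are allocated as needed
mmchfs project --inode-limit 268435456
--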
The man page gives the limit as:
max_files = total_filesystem_space / (inode_size + subblock_size)
and IBM's website defines the subblock size as 1/32 of the block size (4 MB in our case). Plugging in our numbers, the maximum number of inodes we could potentially have is:
3440846425
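Showing my work, assuming the default 512-byte inode size and using the pool total from the mmdf output above:
--
total_space   = 442148765696 KB = 452760336072704 bytes
subblock_size = 4 MB / 32       = 131072 bytes
inode_size    = 512 bytes         (assumed default)
max_files     = 452760336072704 / (512 + 131072) ~= 3440846425
--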
That is approximately 25x the current maximum, so I think there is room to increase the inode count without too much worry. Are there any caveats to my logic here? I'm not going to raise it to the theoretical maximum right away, because the allocated inode space would take away from the usable capacity of the system.
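To put a number on that last point (again assuming 512-byte inodes), allocating the full theoretical maximum would consume roughly:
--
3440846425 inodes x 512 bytes/inode ~= 1.76 TB of metadata space
--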
Thanks for any comments and recommendations. I have a sizable maintenance period coming up due to datacenter power upgrades: I'll be given ~2 weeks of downtime, and I'm trying to get all my ducks in a row. If I need to do something time-consuming with the file systems, I'd like to know ahead of time so I can do it during that window, as I probably won't get another one for many months afterwards.
Again, thank you all!
Jared Baker
ARCC