[gpfsug-discuss] Capacity pool filling

Buterbaugh, Kevin L Kevin.Buterbaugh at Vanderbilt.Edu
Thu Jun 7 15:16:43 BST 2018


Hi All,

First off, I’m on day 8 of dealing with two different mini-catastrophes at work and am therefore very sleep deprived and possibly missing something obvious … with that disclaimer out of the way…

We have a filesystem with 3 pools:  1) system (metadata only), 2) gpfs23data (the default pool if I run mmlspolicy), and 3) gpfs23capacity (where files with an atime - yes, atime - of more than 90 days get migrated by a script that runs out of cron each weekend).

However … this morning the free space in the gpfs23capacity pool is dropping … I’m down to 0.5 TB free in a 582 TB pool … and I cannot figure out why.  The migration script is NOT running … in fact, it’s currently disabled.  So I can only think of two possible explanations for this:

1.  There are one or more files already in the gpfs23capacity pool that someone has started updating.  Is there a way to check for that … i.e. a way to run something like “find /gpfs23 -mtime -7 -ls” but restricted to only files in the gpfs23capacity pool?  Marc Kaplan - can mmfind do that??  ;-)

2.  We are doing a large volume of restores right now because one of the mini-catastrophes I’m dealing with is one NSD (gpfs23data pool) down due to an issue with the storage array.  We’re working with the vendor to try to resolve that but are not optimistic, so we have started doing restores in case they come back and tell us it’s not recoverable.  We did run “mmfileid” to identify the files that have one or more blocks on the down NSD, but there are so many that what we’re doing is actually restoring all the files to an alternate path (easier for our tape system), then replacing the corrupted files, then deleting any restores we don’t need.  But shouldn’t all of that be going to the gpfs23data pool?  I.e. even if we’re restoring files that are in the gpfs23capacity pool, shouldn’t the fact that we’re restoring to an alternate path (i.e. not overwriting files with the tape restores) and the default pool is the gpfs23data pool mean that nothing is being restored to the gpfs23capacity pool???
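
For explanation 1, one possible way to phrase the "find -mtime -7, but only in one pool" query is as a policy-engine LIST rule instead of a find. The Python below is only a rough sketch under assumptions: it builds the policy text and the mmapplypolicy argument list without running anything, the helper names, the list name, and the file paths are hypothetical, and the rule syntax and flags (-P, -I defer, -f) should be checked against the documentation for the installed release before use.

```python
# Sketch only: generates a GPFS policy file body and a command line.
# Nothing here talks to the filesystem; all names and paths are illustrative.

def build_list_policy(pool, days, list_name="recent_in_pool"):
    """Return policy text listing files in `pool` modified within `days` days."""
    if days <= 0:
        raise ValueError("days must be positive")
    return (
        # EXEC '' means no external script is called; with -I defer the
        # candidate list is simply written to a file.
        f"RULE EXTERNAL LIST '{list_name}' EXEC ''\n"
        f"RULE 'find_recent' LIST '{list_name}'\n"
        f"  FROM POOL '{pool}'\n"
        f"  WHERE (DAYS(CURRENT_TIMESTAMP) - DAYS(MODIFICATION_TIME)) < {days}\n"
    )


def build_command(fs_path, policy_file, out_prefix):
    """Return the argument list for a deferred (list-only) policy scan."""
    return ["mmapplypolicy", fs_path, "-P", policy_file,
            "-I", "defer", "-f", out_prefix]


if __name__ == "__main__":
    # Pool and mount point taken from the message above; paths are made up.
    print(build_list_policy("gpfs23capacity", 7))
    print(" ".join(build_command("/gpfs23", "/tmp/recent.pol", "/tmp/recent")))
```

If the rule behaves as assumed, the resulting list file would show which files in gpfs23capacity have a modification time in the last week, which is the check asked about in explanation 1.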

Is there a third explanation I’m not thinking of?

Thanks...

—
Kevin Buterbaugh - Senior System Administrator
Vanderbilt University - Advanced Computing Center for Research and Education
Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633


