[gpfsug-discuss] Capacity pool filling
Jaime Pinto
pinto at scinet.utoronto.ca
Thu Jun 7 15:53:16 BST 2018
I think the restore is bringing back a lot of material with atime >
90 days, so it is bypassing gpfs23data and going directly into
gpfs23capacity.
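
To answer your question 1 below: a policy LIST rule should give you
what you're after. A rough sketch, untested, assuming the filesystem
device is gpfs23 and a 7-day window (adjust to taste):

   /* /tmp/recent_cap.pol -- files in gpfs23capacity touched in the last 7 days */
   RULE EXTERNAL LIST 'recent_cap' EXEC ''
   RULE 'find_recent' LIST 'recent_cap'
        FROM POOL 'gpfs23capacity'
        WHERE (DAYS(CURRENT_TIMESTAMP) - DAYS(MODIFICATION_TIME)) <= 7

and then run it read-only:

   mmapplypolicy gpfs23 -P /tmp/recent_cap.pol -I defer -f /tmp/recent_cap

With -I defer nothing is migrated or executed; the matches should end
up in a file named something like /tmp/recent_cap.list.recent_cap.
You can also spot-check any single restored file with
"mmlsattr -L /path/to/file", which reports the storage pool it
occupies.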
I also think you may not have stopped the crontab script as you
believe you did.
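
It should only take a minute to rule that out on the node that owns
the cron entry -- something like the following (assuming the weekend
script ultimately calls mmapplypolicy):

   crontab -l
   ps -ef | grep -v grep | grep mmapplypolicy

If the entry really is gone and nothing shows up in the process list,
the migration is out of the picture and the restores are the
remaining suspect.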
Jaime
Quoting "Buterbaugh, Kevin L" <Kevin.Buterbaugh at Vanderbilt.Edu>:
> Hi All,
>
> First off, I'm on day 8 of dealing with two different
> mini-catastrophes at work and am therefore very sleep deprived and
> possibly missing something obvious -- with that disclaimer out of the
> way...
>
> We have a filesystem with 3 pools: 1) system (metadata only), 2)
> gpfs23data (the default pool if I run mmlspolicy), and 3)
> gpfs23capacity (where files with an atime - yes, atime - of more than
> 90 days get migrated to by a script that runs out of cron each
> weekend).
>
> However -- this morning the free space in the gpfs23capacity pool is
> dropping -- I'm down to 0.5 TB free in a 582 TB pool -- and I cannot
> figure out why. The migration script is NOT running -- in fact, it's
> currently disabled. So I can only think of two possible
> explanations for this:
>
> 1. There are one or more files already in the gpfs23capacity pool
> that someone has started updating. Is there a way to check for that
> -- i.e. a way to run something like "find /gpfs23 -mtime -7 -ls" but
> restricted to only files in the gpfs23capacity pool? Marc Kaplan -
> can mmfind do that? ;-)
>
> 2. We are doing a large volume of restores right now because one of
> the mini-catastrophes I'm dealing with is one NSD (gpfs23data pool)
> down due to an issue with the storage array. We're working with the
> vendor to try to resolve that but are not optimistic, so we have
> started doing restores in case they come back and tell us it's not
> recoverable. We did run "mmfileid" to identify the files that have
> one or more blocks on the down NSD, but there are so many that what
> we're doing is actually restoring all the files to an alternate path
> (easier for our tape system), then replacing the corrupted files,
> then deleting any restores we don't need. But shouldn't all of that
> be going to the gpfs23data pool? I.e. even if we're restoring
> files that are in the gpfs23capacity pool, shouldn't the fact that
> we're restoring to an alternate path (i.e. not overwriting files
> with the tape restores) and the default pool is the gpfs23data pool
> mean that nothing is being restored to the gpfs23capacity pool?
>
> Is there a third explanation I'm not thinking of?
>
> Thanks...
>
> --
> Kevin Buterbaugh - Senior System Administrator
> Vanderbilt University - Advanced Computing Center for Research and Education
> Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633
>
>
>
>
************************************
TELL US ABOUT YOUR SUCCESS STORIES
http://www.scinethpc.ca/testimonials
************************************
---
Jaime Pinto - Storage Analyst
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477
----------------------------------------------------------------
This message was sent using IMP at SciNet Consortium, University of Toronto.