[gpfsug-discuss] Couple of questions related to storage pools and mmapplypolicy

Buterbaugh, Kevin L Kevin.Buterbaugh at Vanderbilt.Edu
Mon Dec 17 22:01:41 GMT 2018


Hi All,

As those of you who suffered thru my talk at SC18 already know, we’re really short on space on one of our GPFS filesystems as the output of mmdf piped to grep pool shows:

Disks in storage pool: system (Maximum disk size allowed is 24 TB)
(pool total)           4.318T                                1.078T ( 25%)        79.47G ( 2%)
Disks in storage pool: data (Maximum disk size allowed is 262 TB)
(pool total)           494.7T                                38.15T (  8%)        4.136T ( 1%)
Disks in storage pool: capacity (Maximum disk size allowed is 519 TB)
(pool total)           640.2T                                14.56T (  2%)        716.4G ( 0%)

The system pool is metadata only.  The data pool is the default pool.  The capacity pool is where files with an atime (yes, atime) > 90 days get migrated.  The capacity pool is comprised of NSDs that are 8+2P RAID 6 LUNs of 8 TB drives, so roughly 58.2 TB usable space per NSD.

We have the new storage we purchased, but that’s still being tested and held in reserve for after the first of the year when we create a new GPFS 5 formatted filesystem and start migrating everything to the new filesystem.

In the meantime, we have also purchased a 60-bay JBOD and 30 x 12 TB drives and will be hooking it up to one of our existing storage arrays on Wednesday.  My plan is to create another three 8+2P RAID 6 LUNs and present those to GPFS as NSDs.  They will be about 88 TB usable space each (because … beginning rant … a 12 TB drive is < 11 TB in size … and don’t get me started on so-called “4K” TVs … end rant).
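
For anyone who wants to check the usable-space figures, here is a quick Python sketch of the arithmetic. It assumes drive vendors quote decimal TB (10^12 bytes) while the usable figures are in binary units (2^40 bytes), and it ignores any array or filesystem overhead; the helper name usable_tib is made up for illustration.

```python
# Usable capacity of an 8+2P RAID 6 LUN: only the 8 data drives hold data.
# Assumption: vendor "TB" = 10**12 bytes, reported "TB" = 2**40 bytes (TiB).
TB = 10**12
TIB = 2**40


def usable_tib(drive_tb, data_drives=8):
    """Data capacity of one LUN in TiB, ignoring array/filesystem overhead."""
    return drive_tb * TB * data_drives / TIB


for drive_tb in (8, 12):
    print(f"{drive_tb} TB drive = {drive_tb * TB / TIB:.2f} TiB, "
          f"8+2P LUN = {usable_tib(drive_tb):.1f} TiB")
```

Under those assumptions the 8 TB drives come out at roughly 58.2 per LUN, matching the capacity pool figure, and the 12 TB drives at a little over 87 per LUN, so the rounded 88 is slightly generous.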

A very wise man who used to work at IBM but now hangs out with people in red polos (<grin>) once told me that it’s OK to mix NSDs of slightly different sizes in the same pool, but you don’t want to put NSDs of vastly different sizes in the same pool because the smaller ones will fill first and then the larger ones will have to take all the I/O.  I consider 58 TB and 88 TB to be pretty significantly different and am therefore planning on creating yet another pool called “oc” (over capacity if a user asks, old crap internally!) and migrating files with an atime greater than, say, 1 year to that pool.  But since ALL of the files in the capacity pool haven’t even been looked at in at least 90 days already, does it really matter?  I.e. should I just add the NSDs to the capacity pool and be done with it?

If it’s a good idea to create another pool, then I have a question about mmapplypolicy and migrations.  I believe I understand how things work, but after spending over an hour looking at the documentation I cannot find anything that explicitly confirms my understanding … so if I have another pool called oc that’s ~264 TB in size and I write a policy file that looks like:

define(access_age,(DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)))

RULE 'ReallyOldStuff'
  MIGRATE FROM POOL 'capacity'
  TO POOL 'oc'
  LIMIT(98)
  SIZE(KB_ALLOCATED/NLINK)
  WHERE ((access_age > 365) AND (KB_ALLOCATED > 3584))

RULE 'OldStuff'
  MIGRATE FROM POOL 'data'
  TO POOL 'capacity'
  LIMIT(98)
  SIZE(KB_ALLOCATED/NLINK)
  WHERE ((access_age > 90) AND (KB_ALLOCATED > 3584))

Keeping in mind that my capacity pool is already 98% full, is mmapplypolicy smart enough to calculate how much space it’s going to free up in the capacity pool by the “ReallyOldStuff” rule and therefore be able to potentially also move a ton of stuff from the data pool to the capacity pool via the 2nd rule with just one invocation of mmapplypolicy?  That’s what I expect that it will do.  I’m hoping I don’t have to run mmapplypolicy twice … the first time to move stuff from capacity to oc and then a second time for it to realize, oh, I’ve got a bunch of space free in the capacity pool now.
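
To put numbers on why this matters, here is a rough Python sketch using the mmdf pool totals above. It is plain arithmetic, not a model of how mmapplypolicy actually selects files: it treats LIMIT(98) simply as a 98% occupancy cap on the target pool, uses only the free-space column for full blocks, and the helper name headroom is made up for illustration.

```python
def headroom(total, free, limit_pct):
    """Space (in mmdf units, T) that fits before the pool reaches limit_pct occupancy."""
    occupied = total - free
    return total * limit_pct / 100 - occupied


# Pool total and free space for 'capacity', copied from the mmdf output above.
capacity_total, capacity_free = 640.2, 14.56

occupancy = 100 * (capacity_total - capacity_free) / capacity_total
print(f"capacity occupancy now: {occupancy:.1f}%")
print(f"capacity headroom under LIMIT(98): "
      f"{headroom(capacity_total, capacity_free, 98):.2f}T")

# Assumption: an empty 'oc' pool of about 264T, as estimated above.
print(f"oc headroom under LIMIT(98): {headroom(264, 264, 98):.2f}T")
```

So unless the space freed by “ReallyOldStuff” gets counted, “OldStuff” would have under 2T to work with in the capacity pool.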

Thanks in advance...

Kevin

P.S.  In case you’re scratching your head over the fact that we have files that people haven’t even looked at for months and months (more than a year in some cases) sitting out there … we sell quota in 1 TB increments … once they’ve bought the quota, it’s theirs.  As long as they’re paying us the monthly fee, if they want to keep files relating to research they did during the George Bush Presidency out there … and I mean Bush 41, not Bush 43 … then that’s their choice.  We do not purge files.

—
Kevin Buterbaugh - Senior System Administrator
Vanderbilt University - Advanced Computing Center for Research and Education
Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633


