[gpfsug-discuss] mmapplypolicy didn't migrate everything it should have - why not? hard links! A workaround

Marc A Kaplan makaplan at us.ibm.com
Tue Apr 18 17:56:11 BST 2017


Kevin, Wow.  Never underestimate the power of ...

Anyhow try this as a fix. 

Add the clause  SIZE(KB_ALLOCATED/NLINK) to your MIGRATE rules.

This spreads the total actual size over each hardlink...
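
To see why dividing by NLINK helps, here is a minimal Python sketch of the arithmetic. It is not mmapplypolicy's own code, and the sizes and link counts are made up for illustration. Each path to a hard-linked file reports the file's full KB_ALLOCATED, so summing per path overcounts. Summing KB_ALLOCATED/NLINK per path recovers the real total, assuming every link to the file is among the chosen paths.

```python
# Hypothetical candidate list: one entry per path, as a policy scan would see it.
# Each tuple is (kb_allocated, nlink); a file with nlink=50 shows up 50 times.
candidates = [(1_048_576, 50)] * 50 + [(2_048, 1)] * 10

# Naive reckoning: every path is charged the file's full allocation.
naive_kb = sum(kb for kb, _ in candidates)

# With SIZE(KB_ALLOCATED/NLINK): each path is charged only its share.
shared_kb = sum(kb / nlink for kb, nlink in candidates)

# What actually has to move: one 1 GiB file plus ten 2 MiB files.
actual_kb = 1_048_576 + 10 * 2_048

print(f"naive:  {naive_kb} KB")
print(f"shared: {shared_kb:.0f} KB")
print(f"actual: {actual_kb} KB")
```

If only some of a file's links are selected by the rule, the per-link shares will undercount that file, so this is an approximation rather than an exact accounting.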





From:   "Buterbaugh, Kevin L" <Kevin.Buterbaugh at Vanderbilt.Edu>
To:     gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Date:   04/18/2017 12:33 PM
Subject:        Re: [gpfsug-discuss] mmapplypolicy didn't migrate 
everything it should    have - why not?
Sent by:        gpfsug-discuss-bounces at spectrumscale.org



Hi Marc, 

Two things:

1.  I have a PMR open now.

2.  You *may* have identified the problem … I’m still checking … but files 
with hard links may be our problem.  I wrote a simple Perl script to 
iterate over the log file I had mmapplypolicy create.  Here’s the code 
(don’t laugh, I’m a SysAdmin, not a programmer, and I whipped this out in 
< 5 minutes … and yes, I realize the fact that I used Perl instead of 
Python shows my age as well <grin>):

#!/usr/bin/perl
#
use strict;
use warnings;
my $InputFile = "/tmp/mmapplypolicy.gpfs23.log";
my $TotalFiles = 0;
my $TotalLinks = 0;
my $TotalSize = 0;
open(my $Input, '<', $InputFile) or die "Couldn't open $InputFile for read:  $!\n";
while (<$Input>) {
  next unless /MIGRATED/;
  $TotalFiles++;
  # the file name is the fourth space-separated field of the log line
  my $FileName = (split / /)[3];
  # some files may have been deleted since mmapplypolicy ran
  if ( defined $FileName && -f $FileName ) {
    # stat fields 3 and 7 are the hard link count and the size in bytes
    my ($NumLinks, $FileSize) = (stat($FileName))[3,7];
    $TotalLinks += $NumLinks;
    # note: a hard-linked file is counted once per logged path
    $TotalSize += $FileSize;
  }
}
close $Input;
print "Number of files / links = $TotalFiles / $TotalLinks, Total size = $TotalSize\n";
exit 0;

And here’s what it kicked out:

Number of files / links = 1620263 / 80818483, Total size = 53966202814094

1.6 million files but 80 million hard links!!!

I’m doing some checking right now, but it appears that it is one 
particular group - and therefore one particular fileset - that is 
responsible for this … they’ve got thousands of files with 50 or more hard 
links each … and they’re not inconsequential in size.

IIRC (and keep in mind I’m far from a GPFS policy guru), there is a way to 
say something to the effect of “and the path does not contain 
/gpfs23/fileset/path” … may need a little help getting that right.

I’ll post this information to the ticket as well but wanted to update the 
list.  This wouldn’t be the first time we were an “edge case” for 
something in GPFS… ;-)

Thanks...

Kevin


On Apr 18, 2017, at 10:11 AM, Marc A Kaplan <makaplan at us.ibm.com> wrote:

ANYONE else reading this saga?  Who uses mmapplypolicy to migrate files 
within multi-TB file systems?  Problems? Or all working as expected?

------

Well, again mmapplypolicy "thinks" it has "chosen" 1.6 million files whose 
total size is 61 Terabytes and migrating those will bring the occupancy of 
gpfs23capacity pool to 98% and then we're done.

So now I'm wondering where this is going wrong.  Is there some bug in the 
reckoning inside of mmapplypolicy or somewhere else in GPFS?

Sure you can put in a PMR, and probably should.  I'm guessing whoever 
picks up the PMR will end up calling or emailing me ... but maybe she can 
do some of the clerical work for us...  

While we're waiting for that... Here's what I suggest next.

Add  a clause ...

SHOW(varchar(KB_ALLOCATED) || ' n=' || varchar(NLINK))

before the WHERE clause to each of your rules.

Re-run the command with options  '-I test -L 2'  and collect the output.  

We're not actually going to move any data, but we're going to look at the 
files and file sizes that are "chosen"...

You should see 1.6 million lines that look kind of like this:

/yy/dat/bigC     RULE 'msx' MIGRATE FROM POOL 'system' TO POOL 'xtra' 
WEIGHT(inf) SHOW( 1024 n=1)

Run a script over the output to add up all the SHOW() values in the lines 
that contain TO POOL 'gpfs23capacity' and verify that they do indeed
add up to 61TB...  (The SHOW values are in KB, so they should add up to 
roughly 61 billion.)
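
A minimal Python sketch of such a script follows. It assumes each chosen file appears on a single output line shaped like the sample above (path, then the rule text, then SHOW( <kb> n=<nlink>)); the real '-I test -L 2' output may differ in detail, so the regular expression is an assumption to adjust. The function name sum_show_kb is made up for this example.

```python
import re

# Assumed line shape, taken from the sample line in this thread:
#   <path> RULE '...' MIGRATE FROM POOL '...' TO POOL '<pool>' WEIGHT(...) SHOW( <kb> n=<nlink>)
SHOW_RE = re.compile(r"TO POOL '([^']+)'.*SHOW\(\s*(\d+)\s+n=(\d+)\)")


def sum_show_kb(lines, pool):
    """Return (line count, total KB, total KB/NLINK) for lines targeting `pool`."""
    count = 0
    total_kb = 0
    shared_kb = 0.0
    for line in lines:
        m = SHOW_RE.search(line)
        if not m or m.group(1) != pool:
            continue  # not a chosen-file line, or a different target pool
        kb, nlink = int(m.group(2)), int(m.group(3))
        count += 1
        total_kb += kb
        # Spread the allocation over the links so shared storage is not multiplied.
        shared_kb += kb / max(nlink, 1)
    return count, total_kb, shared_kb
```

Called as sum_show_kb(open(path_to_output), 'gpfs23capacity'), the second value is the figure to compare against 61 billion KB. A large gap between the second and third values would point at hard links.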

That sanity checks the policy arithmetic.  Let's assume that's okay. 

Then the next question is whether the individual numbers are correct... 
Zach Giles made a suggestion... which I'll interpret as 
find some of the biggest of those files and check that they really are 
that big....
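
One way to do that spot check, sketched in Python under the assumption that you already have (path, reported KB) pairs parsed from the SHOW output. The helper name check_largest is hypothetical. It compares the reported figure with what stat says is allocated, using st_blocks, which POSIX defines in 512-byte units.

```python
import os


def check_largest(records, top_n=10):
    """records: iterable of (path, reported_kb).

    Returns (path, reported_kb, on_disk_kb) for the top_n largest reported
    sizes; on_disk_kb is None if the file can no longer be stat'ed.
    """
    rows = []
    biggest = sorted(records, key=lambda r: r[1], reverse=True)[:top_n]
    for path, reported_kb in biggest:
        try:
            st = os.stat(path)
        except OSError:
            rows.append((path, reported_kb, None))  # deleted since the scan
            continue
        # st_blocks counts 512-byte units, independent of the file system block size.
        rows.append((path, reported_kb, st.st_blocks * 512 // 1024))
    return rows
```

Rows where the two KB figures differ widely are the ones worth a closer look.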

At this point, I really don't know, but I'm guessing there are some 
discrepancies in the reported KB_ALLOCATED numbers for many of the files...
and/or they are "illplaced"  - the data blocks aren't all in the FROM POOL 
pool ...

HMMMM....  I just thought about this some more and added the NLINK 
statistic.  It would be unusual for this to be a big problem, but files 
that are hard linked are not recognized by mmapplypolicy as sharing 
storage... 
This has not come to my attention as a significant problem -- does the 
file system in question have significant GBs of hard linked files?
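
A rough way to answer that from the outside is to tally a list of paths twice: once per path and once per unique inode. This is a Python sketch using plain POSIX stat, not anything GPFS-specific; the function name tally is made up for this example. Hard links to the same file share a device and inode number, so the difference between the two totals is the storage counted more than once.

```python
import os


def tally(paths):
    """Return (KB summed per path, KB summed per unique inode, unique inodes)."""
    per_path_kb = 0
    per_inode_kb = 0
    seen = set()
    for path in paths:
        try:
            st = os.lstat(path)
        except OSError:
            continue  # the file may have been deleted since the scan
        kb = st.st_blocks * 512 // 1024  # st_blocks is in 512-byte units
        per_path_kb += kb
        key = (st.st_dev, st.st_ino)  # identical for every hard link to one file
        if key not in seen:
            seen.add(key)
            per_inode_kb += kb
    return per_path_kb, per_inode_kb, len(seen)
```

If the per-path total is many times the per-inode total, hard links are inflating the reckoning.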

The truth is that you're the first customer/user/admin in a long time to 
question/examine how mmapplypolicy does its space reckoning ... 
Optimistically that means it works fine for most customers...  

So sorry, something unusual about your installation or usage...




_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss



