[gpfsug-discuss] Initial file placement - first storage pool is now used for data storage (doh!)

Fri Jun 17 16:56:59 BST 2016

Thanks for asking!

Prior to release 4.2, the system pool was the default pool for storing file 
data, and you had to write at least one SET POOL policy rule to make use 
of any NSDs that you assigned to some other pool.

Now, if a file system is formatted at or upgraded to level 4.2 or higher, 
there is a data pool defined, and there are no SET POOL policy rules, then 
the "first" such data pool becomes the default storage pool for file data. 
Metadata is always stored in the system pool.
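
To make the rule above concrete, here is a minimal Python sketch of the decision as described in this message. It is only a model for reasoning, not GPFS code: the function name, its parameters, and the representation of the file system level as a tuple are all hypothetical, and it assumes "data pool" means a pool other than system, listed in the order the pools were added.

```python
def default_data_pool(fs_level, data_pools, has_set_pool_rule):
    """Model of where file data lands when no placement rule decides.

    fs_level          -- file system format level as a tuple, e.g. (4, 2)
    data_pools        -- names of non-system data pools, in the order added
                         (assumption: "first" means first added)
    has_set_pool_rule -- True if the installed policy has any SET POOL rule

    Returns the pool name, or None when SET POOL rules make the decision.
    """
    if has_set_pool_rule:
        # Placement rules exist, so the default does not apply.
        return None
    if fs_level >= (4, 2) and data_pools:
        # 4.2 behavior: the "first" data pool becomes the default.
        return data_pools[0]
    # Pre-4.2 behavior, or no separate data pool defined.
    return "system"


print(default_data_pool((4, 1), ["capacity"], False))  # system
print(default_data_pool((4, 2), ["capacity"], False))  # capacity
```

The second call corresponds to a file system upgraded to level 4.2 with one extra data pool and no SET POOL rules.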

So how is the "first" data pool determined? It's usually the first data 
pool added to the file system.

For many customers and installations there is only one such pool, so this 
is no problem, and it is exactly what they wanted all along (since we 
introduced "pools" in release 3.1). We received complaints over the years 
along the lines of: "Heh! Why the heck do you think I made a data pool?  
I don't want to know about silly SET POOL rules and yet another mm command 
(mmchpolicy)!  Just do it!"  Well, in 4.2 we did.

If you look in the /var/adm/ras/mmfs.log.* file(s), you will see what 4.2 
reports during mmmount:

Fri Jun 17 09:31:26.637 2016: [I] Command: mount yy
Fri Jun 17 09:31:27.625 2016: [I] Loaded policy 'for file system yy': Parsed 4 policy rules.
Fri Jun 17 09:31:27.626 2016: Policy has no storage pool placement rules.
Fri Jun 17 09:31:27.627 2016: [I] Data will be stored in pool 'xtra'.
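
If you want to check this without reading the log by eye, something like the following Python sketch could pull the pool name out of those messages. The function name and the regular expression are my own assumptions, based only on the message text shown in the excerpt above; other releases may word the message differently.

```python
import re

# Pattern taken from the single log line in the excerpt above (assumption:
# the wording is stable enough to match on).
POOL_MESSAGE = re.compile(r"\[I\] Data will be stored in pool '([^']+)'\.")


def default_pool_from_log(lines):
    """Return the pool named in the last matching log line, or None."""
    pool = None
    for line in lines:
        match = POOL_MESSAGE.search(line)
        if match:
            pool = match.group(1)
    return pool


sample = [
    "Fri Jun 17 09:31:26.637 2016: [I] Command: mount yy",
    "Fri Jun 17 09:31:27.626 2016: Policy has no storage pool placement rules.",
    "Fri Jun 17 09:31:27.627 2016: [I] Data will be stored in pool 'xtra'.",
]
print(default_pool_from_log(sample))  # prints: xtra
```

To run it against a real log, pass an open file object for one of the mmfs.log.* files instead of the sample list.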

Notice we DID NOT change the behavior for file systems at level 4.1 or 
prior, even when you upgrade the software. 
But when you upgrade the file system to 4.2 (for example to use QOS) ...

From:   "Buterbaugh, Kevin L" <Kevin.Buterbaugh at Vanderbilt.Edu>
To:     gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Date:   06/17/2016 11:19 AM
Subject:        [gpfsug-discuss] Initial file placement
Sent by:        gpfsug-discuss-bounces at spectrumscale.org

Hi yet again all, 

Well, this has turned out to be an enlightening and surprising morning in 
GPFS land… 

What prompted my question below is this … I am looking to use the new QoS 
features in GPFS 4.2.  I have QoS enabled and am trying to get a baseline 
of IOPs so that I can determine how much I want to assign to the 
maintenance class (currently both maintenance and other are set to 
unlimited).  To do this, I fired off a bonnie++ test from each of my NSD 
servers.

The filesystem in question has two storage pools, the system pool and the 
capacity pool.  The system pool consists of a couple of metadata-only 
disks (SSD-based RAID 1 mirrors) and several data-only disks (spinning 
HD-based RAID 6), while the capacity pool consists exclusively of 
data-only disks (RAID 6).

When the bonnie++ runs were creating, reading, and rewriting the big file 
they create, I was quite surprised to see mmlsqos show higher IOPS on the 
capacity pool than the system pool by a factor of 10!  As I was expecting 
those files to be written to the system pool, this was quite 
surprising to me.  Once I found the mmlsattr command, I ran it on one of 
the files being created and saw that it was indeed assigned to the 
capacity pool.  The bonnie++’s finished before I could check the other 
files.

I don’t have any file placement policies in effect for this filesystem, 
only file migration policies (each weekend any files in the system pool 
with an atime > 60 days get moved to the capacity pool and any files in 
the capacity pool with an atime < 60 days get moved to the system pool).
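
For clarity, the weekend migration logic described above can be sketched in Python as follows. This is not GPFS policy syntax and not the actual rules in use; the function name and parameters are hypothetical, and "atime > 60 days" is read as "last accessed more than 60 days ago". Note that it only moves existing files and says nothing about where a new file is first placed.

```python
from datetime import datetime, timedelta


def migration_target(current_pool, last_access, now, threshold_days=60):
    """Return the pool a file should live in after the weekend migration run."""
    age = now - last_access
    threshold = timedelta(days=threshold_days)
    if current_pool == "system" and age > threshold:
        # Cold file in the system pool: move it to capacity.
        return "capacity"
    if current_pool == "capacity" and age < threshold:
        # Recently accessed file in the capacity pool: move it back.
        return "system"
    return current_pool


now = datetime(2016, 6, 17)
print(migration_target("system", now - timedelta(days=90), now))    # capacity
print(migration_target("capacity", now - timedelta(days=5), now))   # system
```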

In the GPFS 4.2 Advanced Administration Guide, it states, “If a GPFS file 
system does not have a placement policy installed, all the data is stored 
in the first data storage pool.”

This filesystem was initially created in 2010 and at that time consisted 
only of the system pool.  The capacity pool was not created until some 
years (2014?  2015?  don’t remember for sure) later.  I was under the 
obviously mistaken impression that the “first” data storage pool was the 
system pool, but that is clearly not correct.

So my first question is, what is the definition of “the first storage 
pool?” and my second question is, can the documentation be updated with 
the answer to my first question since it’s clearly ambiguous as written 
now?  Thanks…

Kevin

On Jun 17, 2016, at 9:29 AM, Buterbaugh, Kevin L <
Kevin.Buterbaugh at Vanderbilt.Edu> wrote:

Hi All, 

I am aware that with the mmfileid command I can determine which files have 
blocks on a given NSD.  But is there a way to query a particular file to 
see which NSD(s) it has blocks on?  Thanks in advance…

Kevin

—
Kevin Buterbaugh - Senior System Administrator
Vanderbilt University - Advanced Computing Center for Research and 
Education
Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
