[gpfsug-discuss] GPFS v5: Blocksizes and subblocks

Tomer Perry TOMP at il.ibm.com
Wed Mar 27 16:19:40 GMT 2019


Hi,

Not sure how well this will work over the mailing list...
Since it's a popular question, I've prepared a slide explaining all of this
( pasted/attached below, but I'll try to explain it in text as well...).

On the right we can see the various "layers":
- OS disk ( whatever looks to the OS/GPFS like a physical disk) - its 
properties are size, media, device name etc. ( we actually won't know what 
the media really is, but we don't really care).
- NSD: the disk as introduced to GPFS, so that later on we can use it for 
"something". Two interesting properties at this stage: its name and the 
servers through which we can get to it.
- FS disk: when an NSD is added to a filesystem, we start caring about 
stuff like type ( data, metadata, data+metadata, descOnly etc.), which 
pool we add the disk to, which failure group, and so on. A rough 
command-line sketch of these layers is below.
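
A minimal sketch of those layers on the command line ( the device, server 
and filesystem names are made up, and the stanza attributes are written 
from memory, so treat this as an approximation rather than a recipe):

   # NSD layer: describe the raw disk and its servers in a stanza file
   %nsd:
     device=/dev/sdx
     nsd=nsd_example01
     servers=nsdserver01,nsdserver02
     usage=dataAndMetadata
     failureGroup=1
     pool=system

   mmcrnsd -F nsd.stanza

   # FS-disk layer: usage, failureGroup and pool only take effect once
   # the NSD is added to a particular filesystem
   mmcrfs fsA -F nsd.stanza -B 4M        # new filesystem
   mmadddisk fsA -F nsd.stanza           # ...or an existing one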

All of that is true on a per-filesystem basis, with the exception that the 
NSD name must be unique across the cluster. Everything else lives in a 
filesystem context. So:
- Each filesystem has its own "system pool" which stores that filesystem's 
metadata ( it can also store data - which of course belongs to that 
filesystem, not to others and not to the cluster).
- A pool exists just because several filesystem disks were told that they 
belong to that pool ( and hopefully there is some policy that brings data 
to that pool). And since filesystem disks exist only in the context of 
their filesystem, a pool exists inside a single filesystem only ( other 
filesystems might have their own pools, of course). See the small policy 
sketch below.
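
As a small, hypothetical illustration ( the rule and pool names are 
invented), the only place a pool "does" anything is inside its own 
filesystem's placement policy:

   /* placement rules for one filesystem, installed with mmchpolicy */
   RULE 'logs_to_capacity' SET POOL 'capacity' WHERE UPPER(NAME) LIKE '%.LOG'
   RULE 'default'          SET POOL 'system'

   mmchpolicy fsA policy.rules     # applies to filesystem fsA only
   mmlspool fsA all                # pools are likewise listed per filesystem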




Regards,

Tomer Perry
Scalable I/O Development (Spectrum Scale)
email: tomp at il.ibm.com
1 Azrieli Center, Tel Aviv 67021, Israel
Global Tel:    +1 720 3422758
Israel Tel:      +972 3 9188625
Mobile:         +972 52 2554625




From:   Stephen Ulmer <ulmer at ulmer.org>
To:     gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Date:   27/03/2019 17:53
Subject:        Re: [gpfsug-discuss] GPFS v5: Blocksizes and subblocks
Sent by:        gpfsug-discuss-bounces at spectrumscale.org



Hmmm... I was going to ask what structures are actually shared by "two" 
pools that are in different file systems, and you provided the answer 
before I asked.

So all disks which are labelled with a particular storage pool name share 
some metadata: pool id, the name, possibly other items. I was confused 
because the NSD is labelled with the pool when it's added to the file 
system, not when it's created. So I thought that the pool was a property 
of a disk+fs, not of the NSD itself.

The more I talk this out, the more I think that pools aren't real, but are 
just another label that happens to be orthogonal to all of the other labels:

Only disks have pools - NSDs do not, because there is no way to give them 
one at creation time.
Disks are NSDs that are in file systems ( see the mmlsnsd sketch below).
A disk is in exactly one file system.
All disks that have the same "pool name" will have the same "pool id", and 
possibly other pool-related metadata.
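
For what it's worth, mmlsnsd shows that split directly: an NSD either 
belongs to exactly one file system or shows up as a "(free disk)", and 
nothing pool-related appears until it becomes a filesystem disk ( output 
sketched from memory, with invented names):

 File system   Disk name    NSD servers
--------------------------------------------------------------------------
 home          nsd001       server01,server02
 home          nsd002       server01,server02
 (free disk)   nsd003       server01,server02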

It appears that the disks in a pool have absolutely nothing in common 
other than that they have been labelled as being in the same pool when 
added to a file system, right? I mean, literally everything but the pool 
name/id could be different - or is there more there?

Do we do anything to pools outside of the context of a file system? Even 
when we list them we have to provide a file system. Does GPFS keep 
statistics about pools that aren't related to file systems?
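
Even a per-file pool lookup seems to go through a mounted filesystem path - 
something like the following, with a made-up file and abbreviated output, 
assuming I'm remembering the mmlsattr -L format correctly:

   mmlsattr -L /gpfs/home/some/file
   file name:            /gpfs/home/some/file
   .
   .
   storage pool name:    fc_8T
   fileset name:         root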

(I love learning things, even when I look like an idiot...)

-- 
Stephen



On Mar 27, 2019, at 11:20 AM, J. Eric Wonderley <eric.wonderley at vt.edu> 
wrote:

mmlspool might suggest there's only 1 system pool per cluster.  We have 2 
clusters and the system pool has id=0 on both.

One of our clusters has 2 filesystems that have the same id for two 
different data-only pools:
[root at cl001 ~]# mmlspool home all
Name            Id
system           0 
fc_8T            65537 
fc_ssd400G       65538 
[root at cl001 ~]# mmlspool work all
Name            Id
system           0 
sas_6T           65537 

I know metadata lives in the system pool, and if you do encryption you can 
forget about putting data into your inodes for small files.



On Wed, Mar 27, 2019 at 10:57 AM Stephen Ulmer <ulmer at ulmer.org> wrote:
This presentation contains lots of good information about file system 
structure in general and about GPFS specifically, and I appreciate that 
and enjoyed reading it.

However, it states outright (both graphically and in text) that storage 
pools are a feature of the cluster, not of a file system - which I believe 
to be completely incorrect. For example, it states that there is "only one 
system pool per cluster", rather than one per file system.

Given that this was written by IBMers and presented at an actual users' 
group, can someone please weigh in on this? I'm asking because it 
represents a fundamental misunderstanding of a very basic GPFS concept, 
which makes me wonder how authoritative the rest of it is...

-- 
Stephen



On Mar 26, 2019, at 12:27 PM, Dorigo Alvise (PSI) <alvise.dorigo at psi.ch> 
wrote:

Hi Marc,
"Indirect block size" is well explained in this presentation: 

http://files.gpfsug.org/presentations/2016/south-bank/D2_P2_A_spectrum_scale_metadata_dark_V2a.pdf

pages 37-41

Cheers,

   Alvise


From: gpfsug-discuss-bounces at spectrumscale.org [
gpfsug-discuss-bounces at spectrumscale.org] on behalf of Caubet Serrabou 
Marc (PSI) [marc.caubet at psi.ch]
Sent: Tuesday, March 26, 2019 4:39 PM
To: gpfsug main discussion list
Subject: [gpfsug-discuss] GPFS v5: Blocksizes and subblocks

Hi all,

According to several GPFS presentations, as well as the man pages:

         Table 1. Block sizes and subblock sizes

+-------------------------------+-------------------------------+
| Block size                    | Subblock size                 |
+-------------------------------+-------------------------------+
| 64 KiB                        | 2 KiB                         |
+-------------------------------+-------------------------------+
| 128 KiB                       | 4 KiB                         |
+-------------------------------+-------------------------------+
| 256 KiB, 512 KiB, 1 MiB, 2    | 8 KiB                         |
| MiB, 4 MiB                    |                               |
+-------------------------------+-------------------------------+
| 8 MiB, 16 MiB                 | 16 KiB                        |
+-------------------------------+-------------------------------+

A block size of 8 MiB or 16 MiB should therefore use subblocks of 16 KiB.

However, when creating a new filesystem with a 16 MiB block size, it looks 
like it is using 128 KiB subblocks:

[root at merlindssio01 ~]# mmlsfs merlin
flag                value                    description
------------------- ------------------------ -----------------------------------
 -f                 8192                     Minimum fragment (subblock) size in bytes (system pool)
                    131072                   Minimum fragment (subblock) size in bytes (other pools)
 -i                 4096                     Inode size in bytes
 -I                 32768                    Indirect block size in bytes
.
.
.
 -n                 128                      Estimated number of nodes that will mount file system
 -B                 1048576                  Block size (system pool)
                    16777216                 Block size (other pools)
.
.
.
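
Working the numbers from the output above, just to spell out the mismatch:

   16777216 / 131072 = 128  subblocks per full block (data pools)
    1048576 /   8192 = 128  subblocks per full block (system pool)
   16777216 /  16384 = 1024 subblocks per full block (what Table 1 suggests for 16 MiB)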

What am I missing? According to the documentation, I expected this to be a 
fixed value - or isn't it?

On the other hand, I don't really understand the concept of 'Indirect 
block size in bytes'. Can somebody clarify or provide some details about 
this setting?

Thanks a lot and best regards,
Marc 
_________________________________________
Paul Scherrer Institut 
High Performance Computing
Marc Caubet Serrabou
Building/Room: WHGA/019A
Forschungsstrasse, 111
5232 Villigen PSI
Switzerland

Telephone: +41 56 310 46 67
E-Mail: marc.caubet at psi.ch
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss





