[gpfsug-discuss] GPFS v5: Blocksizes and subblocks
Tomer Perry
TOMP at il.ibm.com
Wed Mar 27 16:19:40 GMT 2019
Hi,
Not sure how well this will work over the mailing list...
Since it's a popular question, I've prepared a slide explaining all of that
( pasted/attached below, but I'll try to explain in text as well...).
On the right we can see the various "layers":
- OS disks ( what looks to the OS/GPFS like a physical disk) - their
properties are size, media, device name etc. ( we actually won't know what
media means, but we don't really care)
- NSD: an OS disk introduced to GPFS, so that later on we can use it for
"something". Two interesting properties at this stage: its name, and
through which servers we can get to it...
- FS disk: when an NSD is added to a filesystem, we start caring
about stuff like type ( data, metadata, data+metadata, descOnly etc.),
which pool we add the disk to, which failure group, etc.
That's all true on a per-filesystem basis, with the exception that the NSD
name must be unique across the cluster. All the rest is in a filesystem
context. So:
- Each filesystem has its own "system pool" which stores that
filesystem's metadata ( it can also store data - which of course belongs
to that filesystem, not to others, and not to the cluster).
- A pool exists just because several filesystem disks were told that they
belong to that pool ( and hopefully there is some policy that brings data
to that pool). And since filesystem disks exist only in the context of
their filesystem, a pool exists inside a single filesystem only ( other
filesystems might have their own pools, of course).
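The layering above can be sketched as a toy data model. To be clear, none of these classes exist in GPFS and every name here is illustrative only; this just mirrors the text:

```python
# Illustrative-only model of the GPFS object layers described above.
# None of these classes or fields are a real GPFS API.

class OSDisk:
    """What the OS (and GPFS) sees as a physical disk."""
    def __init__(self, device, size_gib):
        self.device = device          # e.g. /dev/sdb
        self.size_gib = size_gib

class NSD:
    """An OS disk introduced to GPFS: gets a cluster-unique name
    and a list of servers through which it can be reached."""
    def __init__(self, name, disk, servers):
        self.name = name              # must be unique across the cluster
        self.disk = disk
        self.servers = servers

class FSDisk:
    """An NSD added to a filesystem: now type, pool and failure
    group matter - all in the context of that one filesystem."""
    def __init__(self, nsd, fs_name, usage, pool, failure_group):
        self.nsd = nsd
        self.fs_name = fs_name
        self.usage = usage            # data / metadata / data+metadata / descOnly
        self.pool = pool              # meaningful only inside fs_name
        self.failure_group = failure_group

# Two filesystems can each have their own "system" pool: same pool
# name, but two entirely distinct pools in two distinct contexts.
d1 = FSDisk(NSD("nsd1", OSDisk("/dev/sdb", 100), ["srvA"]), "fs1",
            "metadata", "system", 1)
d2 = FSDisk(NSD("nsd2", OSDisk("/dev/sdc", 100), ["srvB"]), "fs2",
            "metadata", "system", 1)
assert (d1.fs_name, d1.pool) != (d2.fs_name, d2.pool)
```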
Regards,
Tomer Perry
Scalable I/O Development (Spectrum Scale)
email: tomp at il.ibm.com
1 Azrieli Center, Tel Aviv 67021, Israel
Global Tel: +1 720 3422758
Israel Tel: +972 3 9188625
Mobile: +972 52 2554625
From: Stephen Ulmer <ulmer at ulmer.org>
To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Date: 27/03/2019 17:53
Subject: Re: [gpfsug-discuss] GPFS v5: Blocksizes and subblocks
Sent by: gpfsug-discuss-bounces at spectrumscale.org
Hmmm... I was going to ask what structures are actually shared by "two"
pools that are in different file systems, and you provided the answer
before I asked.
So all disks which are labelled with a particular storage pool name share
some metadata: pool id, the name, possibly other items. I was confused
because the NSD is labelled with the pool when it's added to the file
system, not when it's created. So I thought that the pool was a property
of a disk+fs, not of the NSD itself.
The more I talk this out, the more I think that pools aren't real, but just
another label that happens to be orthogonal to all of the other labels:
Only disks have pools; NSDs do not, because there is no way to give them
one at creation time.
Disks are NSDs that are in file systems.
A disk is in exactly one file system.
All disks that have the same "pool name" will have the same "pool id", and
possibly other pool-related metadata.
It appears that the disks in a pool have absolutely nothing in common
other than that they have been labelled as being in the same pool when
added to a file system, right? I mean, literally everything but the pool
name/id could be different, or is there more there?
Do we do anything to pools outside of the context of a file system? Even
when we list them we have to provide a file system. Does GPFS keep
statistics about pools that aren't related to file systems?
(I love learning things, even when I look like an idiot...)
--
Stephen
On Mar 27, 2019, at 11:20 AM, J. Eric Wonderley <eric.wonderley at vt.edu>
wrote:
mmlspool might suggest there's only 1 system pool per cluster. We have 2
clusters and it has id=0 on both.
One of our clusters has 2 filesystems that have same id for two different
dataonly pools:
[root at cl001 ~]# mmlspool home all
Name Id
system 0
fc_8T 65537
fc_ssd400G 65538
[root at cl001 ~]# mmlspool work all
Name Id
system 0
sas_6T 65537
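The two listings above read as per-filesystem namespaces; a minimal sketch (pool names and ids copied from the mmlspool output, the dict layout is just for illustration):

```python
# Pool tables as reported by mmlspool for the two filesystems above.
pools = {
    "home": {"system": 0, "fc_8T": 65537, "fc_ssd400G": 65538},
    "work": {"system": 0, "sas_6T": 65537},
}

# Both filesystems have a pool with id 0 ("system") and a pool with
# id 65537, yet these are entirely distinct pools: the id is only
# meaningful within its own filesystem's namespace.
assert pools["home"]["system"] == pools["work"]["system"] == 0
assert pools["home"]["fc_8T"] == pools["work"]["sas_6T"] == 65537
```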
I know metadata lives in the system pool, and if you do encryption you can
forget about putting data into your inodes for small files.
On Wed, Mar 27, 2019 at 10:57 AM Stephen Ulmer <ulmer at ulmer.org> wrote:
This presentation contains lots of good information about file system
structure in general, and GPFS in specific, and I appreciate that and
enjoyed reading it.
However, it states outright (both graphically and in text) that storage
pools are a feature of the cluster, not of a file system, which I believe
to be completely incorrect. For example, it states that there is "only one
system pool per cluster", rather than one per file system.
Given that this was written by IBMers and presented at an actual users'
group, can someone please weigh in on this? I'm asking because it
represents a fundamental misunderstanding of a very basic GPFS concept,
which makes me wonder how authoritative the rest of it is...
--
Stephen
On Mar 26, 2019, at 12:27 PM, Dorigo Alvise (PSI) <alvise.dorigo at psi.ch>
wrote:
Hi Marc,
"Indirect block size" is well explained in this presentation:
http://files.gpfsug.org/presentations/2016/south-bank/D2_P2_A_spectrum_scale_metadata_dark_V2a.pdf
pages 37-41
Cheers,
Alvise
From: gpfsug-discuss-bounces at spectrumscale.org [
gpfsug-discuss-bounces at spectrumscale.org] on behalf of Caubet Serrabou
Marc (PSI) [marc.caubet at psi.ch]
Sent: Tuesday, March 26, 2019 4:39 PM
To: gpfsug main discussion list
Subject: [gpfsug-discuss] GPFS v5: Blocksizes and subblocks
Hi all,
according to several GPFS presentations as well as according to the man
pages:
Table 1. Block sizes and subblock sizes
+-------------------------------+-------------------------------+
| Block size | Subblock size |
+-------------------------------+-------------------------------+
| 64 KiB | 2 KiB |
+-------------------------------+-------------------------------+
| 128 KiB | 4 KiB |
+-------------------------------+-------------------------------+
| 256 KiB, 512 KiB, 1 MiB, 2 | 8 KiB |
| MiB, 4 MiB | |
+-------------------------------+-------------------------------+
| 8 MiB, 16 MiB | 16 KiB |
+-------------------------------+-------------------------------+
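The table's mapping can be restated as a small lookup function (this just encodes the table above, it is not a GPFS API):

```python
KIB, MIB = 1024, 1024 * 1024

def subblock_size(block_size):
    """Default subblock size for a given block size, per the table above."""
    if block_size == 64 * KIB:
        return 2 * KIB
    if block_size == 128 * KIB:
        return 4 * KIB
    if block_size in (256 * KIB, 512 * KIB, 1 * MIB, 2 * MIB, 4 * MIB):
        return 8 * KIB
    if block_size in (8 * MIB, 16 * MIB):
        return 16 * KIB
    raise ValueError("block size not in the table")

# Per the table, a 16 MiB block size should yield 16 KiB subblocks.
assert subblock_size(16 * MIB) == 16 * KIB
```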
A block size of 8 MiB or 16 MiB should therefore have 16 KiB subblocks.
However, when creating a new filesystem with a 16 MiB block size, it looks
like it is using 128 KiB subblocks:
[root at merlindssio01 ~]# mmlsfs merlin
flag                value                    description
------------------- ------------------------ -----------------------------------
 -f                 8192                     Minimum fragment (subblock) size in bytes (system pool)
                    131072                   Minimum fragment (subblock) size in bytes (other pools)
 -i                 4096                     Inode size in bytes
 -I                 32768                    Indirect block size in bytes
 .
 .
 .
 -n                 128                      Estimated number of nodes that will mount file system
 -B                 1048576                  Block size (system pool)
                    16777216                 Block size (other pools)
 .
 .
 .
What am I missing? According to the documentation I expected this to be a
fixed value - or is it not fixed after all?
On the other hand, I don't really understand the concept of 'Indirect
block size in bytes'; can somebody clarify or provide some details about
this setting?
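One hedged reading of the mmlsfs output, offered as an assumption rather than something the man page excerpt confirms: the subblocks-per-block ratio may be fixed per filesystem by its smallest block size, so the 16 MiB pool would inherit the ratio of the 1 MiB system pool. The arithmetic at least matches the reported values:

```python
KIB, MIB = 1024, 1024 * 1024

# Smallest block size in the filesystem (system pool) and its subblock
# size, as reported by mmlsfs above.
system_block, system_subblock = 1 * MIB, 8 * KIB

# Assumption: the subblocks-per-block count is fixed filesystem-wide.
subblocks_per_block = system_block // system_subblock   # 128

# The 16 MiB data pool would then get 16 MiB / 128 = 128 KiB subblocks,
# matching the reported 131072 bytes rather than the table's 16 KiB.
data_subblock = (16 * MIB) // subblocks_per_block
assert data_subblock == 131072
```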
Thanks a lot and best regards,
Marc
_________________________________________
Paul Scherrer Institut
High Performance Computing
Marc Caubet Serrabou
Building/Room: WHGA/019A
Forschungsstrasse, 111
5232 Villigen PSI
Switzerland
Telephone: +41 56 310 46 67
E-Mail: marc.caubet at psi.ch
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
(Attachment: the slide referenced above, image/gif, 30588 bytes:
<http://gpfsug.org/pipermail/gpfsug-discuss_gpfsug.org/attachments/20190327/15518508/attachment.gif>)