[gpfsug-discuss] GPFS best practises : end user standpoint

Marc A Kaplan makaplan at us.ibm.com
Wed Jan 17 20:10:45 GMT 2018


Yes, "special" characters in pathnames can lead to trouble... But just for 
the record...

GPFS supports the same liberal file name policy as standard POSIX.  Namely, 
any byte string is valid, except:

/  delimits directory names

The \0 (zero, or null character) byte value marks the end of the pathname.

There are limits on the length of an individual file or directory name.

AND there is an OS-imposed limit on the total length of a pathname you can 
pass through the file system APIs.
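
If you want to sanity-check names before they ever reach the file system, a 
rough sketch along these lines covers the rules above (the fallback limits 
and the sample path are only illustrative; ask your own file system via 
os.pathconf):

    import os

    def fs_limits(mount='/'):
        # Ask the target file system for its real limits; fall back to common Linux defaults.
        try:
            name_max = os.pathconf(mount, 'PC_NAME_MAX')
            path_max = os.pathconf(mount, 'PC_PATH_MAX')
        except (OSError, ValueError):
            name_max, path_max = 255, 4096
        return name_max, path_max

    def check_pathname(path, mount='/'):
        # Return the reasons a pathname would be rejected or cause grief.
        name_max, path_max = fs_limits(mount)
        raw = path.encode('utf-8', 'surrogateescape')
        problems = []
        if b'\x00' in raw:
            problems.append('contains a NUL byte, which terminates the pathname')
        if len(raw) >= path_max:
            problems.append('total length %d exceeds PATH_MAX (%d)' % (len(raw), path_max))
        for component in raw.split(b'/'):
            if len(component) > name_max:
                problems.append('component %r... exceeds NAME_MAX (%d)' % (component[:20], name_max))
        return problems

    print(check_pathname('/gpfs/project/' + 'x' * 300 + '/data.csv'))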





From:   "Buterbaugh, Kevin L" <Kevin.Buterbaugh at Vanderbilt.Edu>
To:     gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Date:   01/16/2018 01:58 PM
Subject:        Re: [gpfsug-discuss] GPFS best practises : end user 
standpoint
Sent by:        gpfsug-discuss-bounces at spectrumscale.org



Hi Jonathan,

Comments / questions inline.  Thanks!

Kevin



> On Jan 16, 2018, at 10:08 AM, Jonathan Buzzard <jonathan.buzzard at strath.ac.uk> wrote:
> 
> On Tue, 2018-01-16 at 15:47 +0000, Carl Zetie wrote:
>> Maybe this would make for a good session at a future user group
>> meeting -- perhaps as an interactive session? IBM could potentially
>> provide a facilitator from our Design practice.
> 
> Most of it in my view is standard best practice regardless of the file
> system in use.
> 
> So in our mandatory training for the HPC, we tell our users don't use
> whacked-out characters in your file names and directories. Specifically
> no backticks, no asterisks, no question marks, no newlines (yes
> really), no slashes (either forward or backward), and for Mac users
> don't start the name with a space (it forces sorting to the top). We
> recommend sticking to plain ASCII, so no accented characters either
> (harder if your native language is not English I guess, but we are UK
> based so...). We don't enforce that, but if it causes the user problems
> then they are on their own.



We’re in Tennessee, so not only do we not speak English, we barely speak 
American … y’all will just have to understand, bless your hearts!  ;-). 



But seriously, like most Universities, we have a ton of users for whom 
English is not their “primary” language, so dealing with “interesting” 
filenames is pretty hard to avoid.  And users’ problems are our problems 
whether or not they’re our problem.
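
For what it's worth, a quick-and-dirty sweep for the names that tend to trip 
scripts up looks something like the sketch below (the character class and the 
starting directory are only examples, not a recommendation):

    import os
    import re
    import sys

    # Characters that tend to break shell scripts and pipelines; adjust to taste.
    SUSPECT = re.compile(r"[`*?\\\n]|^\s")

    def find_awkward_names(top):
        # Walk a tree and report names with shell-hostile characters or leading whitespace.
        for dirpath, dirnames, filenames in os.walk(top):
            for name in dirnames + filenames:
                # Drop the isascii() test if, like us, you can't insist on ASCII-only names.
                if SUSPECT.search(name) or not name.isascii():
                    print(os.path.join(dirpath, name))

    if __name__ == '__main__':
        find_awkward_names(sys.argv[1] if len(sys.argv) > 1 else '.')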



> We also strongly recommend using ISO 8601 date formats in file names to
> get date sorting from a directory listing too. Surprisingly not widely
> known about, but a great "life hack".
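
A tiny illustration of why that works: with ISO 8601 names the lexical sort 
is also the chronological sort, which day-first dates do not give you (the 
file names below are made up):

    iso = ['2018-01-12_run.log', '2018-03-05_run.log', '2018-11-20_run.log']
    print(sorted(iso) == iso)   # True: lexical order matches chronological order

    dmy = ['12-01-2018_run.log', '05-03-2018_run.log', '20-11-2018_run.log']
    print(sorted(dmy))          # puts 5 March before 12 January, i.e. the wrong order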

> Then it boils down to don't create zillions of files. I would love to
> be able to somehow do per-directory file number quotas where one could
> say set a default of a few thousand. Users would then have to justify
> needing a larger quota. Sure you can set a file number quota, but that
> does not stop them putting them all in one directory.



If you’ve got (bio)medical users using your cluster I don’t see how you 
avoid this … they’re using commercial apps that do this kind of stupid 
stuff (tens of thousands of files in a directory, and the full path to each 
file is longer than the contents of the files themselves!).
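
A crude way to at least spot those directories before they hurt is a 
walk-and-count like the sketch below (on a big file system you would drive 
this from a policy scan with mmapplypolicy rather than crawling the tree, 
and the 10,000 threshold is arbitrary):

    import os
    import sys
    from collections import Counter

    def directory_file_counts(top, threshold=10000):
        # Report directories whose immediate entry count exceeds the threshold.
        counts = Counter()
        for dirpath, dirnames, filenames in os.walk(top):
            counts[dirpath] = len(dirnames) + len(filenames)
        for path, n in counts.most_common():
            if n < threshold:
                break
            print('%10d  %s' % (n, path))

    if __name__ == '__main__':
        directory_file_counts(sys.argv[1] if len(sys.argv) > 1 else '.')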



This reminds me of way back in 2005 when we moved from an NFS server to 
GPFS … I was moving users over by tarring up their home directories on the 
NFS server, copying the tarball over to GPFS and untarring it there … 
worked great for 699 out of 700 users.  But there was one user for whom 
the untar would fail every time I tried … turned out that back in early 
versions of GPFS 2.3 IBM hadn’t considered that someone would put 6 
million files in one directory!  :-O



> If users really need to have zillions of files then charge them more so
> you can afford to beef up your metadata disks to SSD.



OK, so here’s my main question … you’re right that SSDs are the answer … 
but how do you charge them more?  SSDs are more expensive than hard disks, 
and enterprise SSDs are stupid expensive … and users barely want to pay 
hard drive prices for their storage.  If you’ve got the magic answer to 
how to charge them enough to pay for SSDs I’m sure I’m not the only one 
who’d love to hear how you do it?!?!



> JAB.
> 
> -- 
> Jonathan A. Buzzard                         Tel: +44141-5483420
> HPC System Administrator, ARCHIE-WeSt.
> University of Strathclyde, John Anderson Building, Glasgow. G4 0NG





_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss







