[gpfsug-discuss] Mass UID migration suggestions

Jonathan Buzzard jonathan at buzzard.me.uk
Sat Jul 1 10:20:18 BST 2017


On 30/06/17 16:20, hpc-luke at uconn.edu wrote:
> Hello,
>
> 	We're trying to change most of our users' UIDs; is there a clean way to
> migrate all of one user's files with, say, `mmapplypolicy`? We have to change the
> owner of around 273539588 files, and my estimates for runtime are around 6 days.
>
> 	What we've been doing is indexing all of the files and splitting them up by
> owner which takes around an hour, and then we were locking the user out while we
> chown their files. I made it multi threaded as it weirdly gave a 10% speedup
> despite my expectation that multi threading access from a single node would not
> give any speedup.
>
> 	Generally I'm looking for advice on how to make the chowning faster. Would
> spreading the chowning processes over multiple nodes improve performance? Should
> I not stat the files before running lchown on them, since lchown checks the file
> before changing it? I saw mention of inodescan(), in an old gpfsug email, which
> speeds up disk read access, by not guaranteeing that the data is up to date. We
> have a maintenance day coming up where all users will be locked out, so the file
> handles(?) from GPFS's perspective will not be able to go stale. Is there a
> function with similar constraints to inodescan that I can use to speed up this
> process?

My suggestion is to do some development work in C and write a custom
program to do it for you. That way you can hook into the GPFS API and
leverage its fast file system scanning interface. Take a look at the
tsbackup.C file in the samples directory. Obviously this is going to
require someone with appropriate coding skills. On the other hand,
given that it is a one-off and the input is strictly controlled, error
checking can be kept minimal, so it should be a couple of hundred lines
of C at most.

My tip for this would be to load the new UIDs into a sparse array, so
that you can use the current UID to index directly into the array and
get the new UID, which speeds things up. It burns RAM, but these days
RAM is cheap and plentiful, and speed is the major consideration here.
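
As a rough sketch of that lookup-table idea (not tested against GPFS;
the names uid_map, uid_map_set, remap_owner and so on are made up for
illustration), something along these lines would do. It assumes the
old UIDs fit below a size you choose up front, and it uses (uid_t)-1
as the "leave alone" marker, since that is also the value lchown()
treats as "do not change".

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Marker for "this owner is not being migrated" (hypothetical name). */
#define UID_UNCHANGED ((uid_t)-1)

/* Direct-indexed table: new_uid[old_uid] is the replacement UID. */
typedef struct {
    uid_t  *new_uid;
    size_t  size;      /* one past the largest old UID covered */
} uid_map;

/* Allocate the table and mark every slot as unchanged. */
static int uid_map_init(uid_map *m, size_t size)
{
    assert(size > 0);
    m->new_uid = malloc(size * sizeof *m->new_uid);
    if (m->new_uid == NULL)
        return -1;
    for (size_t i = 0; i < size; i++)
        m->new_uid[i] = UID_UNCHANGED;
    m->size = size;
    return 0;
}

/* Record one old -> new mapping; fails if old_uid is outside the table. */
static int uid_map_set(uid_map *m, uid_t old_uid, uid_t new_uid)
{
    if ((size_t)old_uid >= m->size)
        return -1;
    m->new_uid[old_uid] = new_uid;
    return 0;
}

/* Constant-time lookup; UIDs outside the table are left unchanged. */
static uid_t uid_map_lookup(const uid_map *m, uid_t old_uid)
{
    if ((size_t)old_uid >= m->size)
        return UID_UNCHANGED;
    return m->new_uid[old_uid];
}

static void uid_map_free(uid_map *m)
{
    free(m->new_uid);
    m->new_uid = NULL;
    m->size = 0;
}

/*
 * Change the owner of one path given its current UID (as reported by
 * whatever scan produced the file list). Returns 1 if changed, 0 if
 * the owner is not in the map, -1 if lchown() failed.
 */
static int remap_owner(const uid_map *m, const char *path, uid_t current)
{
    uid_t target = uid_map_lookup(m, current);
    if (target == UID_UNCHANGED)
        return 0;                       /* nothing to do, no syscall made */
    /* (gid_t)-1 leaves the group as it is */
    return lchown(path, target, (gid_t)-1) == 0 ? 1 : -1;
}
```

You would fill the table once from your old/new UID list, then call
remap_owner() for each file the scan hands you. Files whose owner is
not in the table cost one array read and no system call.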

In theory, this technique should get the job done in a few hours.

One thing to bear in mind is that once the UID change is complete you
will have to back up the entire file system again.


JAB.

-- 
Jonathan A. Buzzard                 Email: jonathan (at) buzzard.me.uk
Fife, United Kingdom.


