[gpfsug-discuss] suggestions for copying one GPFS file system into another

Ratliff, John jdratlif at iu.edu
Tue Mar 5 16:21:18 GMT 2019


We use a GPFS file system for our computing clusters and we're working on
moving to a new SAN.

 

We originally tried AFM, but it didn't seem to work very well. We tried to
do a prefetch on a test policy scan of 100 million files, and after 24 hours
it hadn't pre-fetched anything. It wasn't clear what was happening. Some
smaller tests succeeded, but the NFSv4 ACLs did not seem to be transferred.

 

Since then we started using rsync with the GPFS attrs patch. We have over
600 million files and 700 TB. I split up the rsync tasks with lists of files
generated by the policy engine and we transferred the original data in about
2 weeks. Now we're working on the final synchronization. I'd like to use one
of rsync's --delete options to remove files on the destination that were
synced earlier and have since been deleted from the source. That can't be
combined with the --files-from option, so it's harder to break up the rsync
tasks. Some of the directories I'm running this against have 30-150 million
files each, which can take quite some time with a single rsync process.
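
One way the deletion step might be separated from rsync is to compare a
source file list against a destination file list and treat anything found
only on the destination as stale. The Python sketch below is only an
illustration under stated assumptions, not something from the original
setup: it assumes both lists have already been reduced to bare relative
paths, one per entry, and sorted in the same plain string order. The
function name stale_paths and the sample paths are hypothetical. It streams
through both lists, so memory use stays flat even for very long lists.

```python
from typing import Iterable, Iterator


def stale_paths(source_sorted: Iterable[str],
                dest_sorted: Iterable[str]) -> Iterator[str]:
    """Yield paths that appear in dest_sorted but not in source_sorted.

    Assumption: both inputs are sorted in the same plain string order.
    """
    src = iter(source_sorted)
    current = next(src, None)
    for path in dest_sorted:
        # Advance the source cursor until it reaches or passes this path.
        while current is not None and current < path:
            current = next(src, None)
        # No matching source entry means the path exists only on the destination.
        if current != path:
            yield path


if __name__ == "__main__":
    # Hypothetical sample data standing in for two sorted file lists.
    source = ["a/1", "a/2", "b/3"]
    dest = ["a/1", "a/2", "a/old", "b/3", "c/gone"]
    for path in stale_paths(source, dest):
        print(path)
```

The resulting list could then be reviewed and split across several workers
for removal, which keeps the transfer jobs free to go on using file lists.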

 

I'm also wondering if any of my rsync options are unnecessary. I was using
-avHAXS and --numeric-ids. I'm thinking the A (ACLs) and X (xattrs) might be
unnecessary with GPFS->GPFS. We're only using NFSv4 GPFS ACLs. I don't know
if GPFS uses any xattrs that rsync would sync or not. Removing those two
options removed several system calls, which should make it much faster, but
I want to make sure I'm syncing correctly. Also, it seems there is a problem
with the GPFS patch on rsync where it will always give an error trying to
get GPFS attributes on a symlink, which means it doesn't sync any symlinks
when using that option. So you can rsync symlinks or GPFS attrs, but not
both at the same time. This has led to me running two rsyncs, one to get
all files and one to get all attributes.
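
The two-pass arrangement described above could be written down as two
argument lists that differ only in the patch-specific option. This is a
rough Python sketch rather than the commands actually run: the helper name
rsync_command is hypothetical, and the GPFS patch's option is deliberately
left as a caller-supplied string because its exact spelling is not given in
this message. The -a, -v, -H, -A, -X, -S and --numeric-ids options are the
ones quoted above.

```python
from typing import List, Optional


def rsync_command(src: str, dst: str,
                  extra_flag: Optional[str] = None,
                  acls_and_xattrs: bool = True) -> List[str]:
    """Build an rsync argument list from the options quoted in the message."""
    # Drop -A and -X when testing whether they are needed for GPFS to GPFS.
    short_opts = "-avHAXS" if acls_and_xattrs else "-avHS"
    cmd = ["rsync", short_opts, "--numeric-ids"]
    if extra_flag:
        # Assumption: the caller supplies the patch-specific option verbatim.
        cmd.append(extra_flag)
    cmd += [src, dst]
    return cmd
```

A first pass would call rsync_command(src, dst) so that symlinks are copied,
and a second pass would call it again with extra_flag set to whatever option
the patched rsync uses for GPFS attributes.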

 

Thanks for any ideas or suggestions.

 

John Ratliff | Pervasive Technology Institute | UITS | Research Storage -
Indiana University | http://pti.iu.edu

 
