[gpfsug-discuss] Is TSM/HSM 7.1 compatible with GPFS 3.5.0.12 ?

Sabuj Pattanayek sabujp at gmail.com
Fri Mar 21 22:16:10 GMT 2014


> the very first challenge is to find what data has changed. the way TSM
> does this is by crawling through your filesystem, looking at the mtime on
> each file to find out which files have changed. think of an ls -Rl on your
> filesystem root. this


The mmbackup mmapplypolicy phase is not a problem. It's going to take as
long as it's going to take. We're using 5 RAID1 SAS SSD NSDs for metadata
and it takes ~1 hour to traverse 157 million files.
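For reference, that scan boils down to something like the LIST rule below
(GPFS policy language; the rules mmbackup actually generates are more
involved, and the filesystem and node names here are made up):

    /* list every file modified within the last day */
    RULE 'changed' LIST 'candidates'
         WHERE (DAYS(CURRENT_TIMESTAMP) - DAYS(MODIFICATION_TIME)) <= 1

run with something along the lines of:

    mmapplypolicy /gpfs/fs0 -P changed.pol -I defer -N nsdnode1,nsdnode2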


>
> the 2nd challenge is if you have to back up a very large number (millions)
> of very small (<32k) files.
> the main issue here is that for each file TSM issues a random i/o to GPFS,
> one at a time, so your throughput directly correlates with the size of the
> files and the latency of a single file read operation. if you are not on
> 3.5 TL3 and/or your files don't fit into the inode, it's actually even 2
> random i/os that are issued, as you need to read the metadata followed by
> the data block for the file.
> in this scenario you can only do 2 things:


The problem here is: why is a single rsync or tar | tar pipeline orders of
magnitude faster than a single TSM client at pulling data off of GPFS onto
the same backup system's disk (e.g. the disk pool)? It's not a problem with
GPFS; it's a problem with TSM itself. We tried various things (a sketch of
where these options live follows the list), e.g.:

1) changed commmethod to sharedmem
2) increased txnbytelimit to 10G
3) increased movesizethresh to the same as txnbytelimit (10G)
4) increased diskbufsize to 1023 KB
5) increased txngroupmax to 65000
6) increased movesizethresh to 10240
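
Roughly where those knobs live, if anyone wants to reproduce the test -- a
sketch only, assuming the usual split between client options in the dsm.sys
server stanza and server options set in dsmserv.opt or via SETOPT; spellings
per the option reference (the list above uses shorthand), values as above:

    * dsm.sys (client stanza):
        COMMMETHOD       sharedmem
        TXNBYTELIMIT     10G
        DISKBUFFSIZE     1023
    * dsmserv.opt / SETOPT (server):
        TXNGROUPMAX      65000
        MOVESIZETHRESH   10240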

The next problem is that one would expect backups to tape to do straight
sequential I/O to tape. In the case of putting the files into the disk pool
before moving them to tape, it did the same random I/O to tape even with
8 GB disk pool chunks. We haven't tried the file pool option yet, but we've
been told that it'll do the same thing. If tar'ing or dd'ing large files to
tape is the most efficient approach, why doesn't TSM do something similar?
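
(For what it's worth, the "file pool option" above means a sequential-access
pool on a FILE device class -- a rough sketch only, untested here, with
made-up names, paths, and sizes:)

    define devclass filedc devtype=file maxcapacity=8G mountlimit=32 directory=/tsmstg/filepool
    define stgpool filepool filedc maxscratch=500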


> 1. parallelism - mmbackup again starts multiple processes in parallel to
> speed up this phase of the backup


...use multiple clients. This would help, but again I'm trying to get a
single TSM client to be on par with a single "cp" process.
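
(For completeness, the fan-out being described looks something like the
command below -- node names and thread count are made up, and the exact
flags vary by GPFS level, so check the mmbackup man page:)

    mmbackup /gpfs/fs0 -t incremental -N nsdnode1,nsdnode2 -m 8

That just spreads the same per-file cost across more sessions; it doesn't
make any single session faster, which is the point here.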

Thanks,
Sabuj