[gpfsug-discuss] Is TSM/HSM 7.1 compatible with GPFS 3.5.0.12 ?

Sven Oehme oehmes at us.ibm.com
Fri Mar 21 22:48:50 GMT 2014


Hi,

> the very first challenge is to find what data has changed. the way
> TSM does this is by crawling through your filesystem, looking at the
> mtime on each file to find out which files have changed. think about
> an ls -Rl on your filesystem root. this
> 
> The mmbackup mmapplypolicy phase is not a problem. It's going to
> take as long as it's going to take. We're using 5 RAID1 SAS SSD
> NSDs for metadata and it takes ~1 hour to do the traversal through
> 157 million files.
>  

That sounds too long. If you want, I could take a look at your GPFS
config and make suggestions on how to improve it further.
If you want me to do that, please send me an email with the output of
mmlscluster, mmlsconfig, mmlsnsd and mmlsfs all to oehmes at us.ibm.com
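
If it helps, something along these lines should capture everything in
one file (the output path is just an example):

    # collect the requested GPFS configuration output in one file
    { mmlscluster; mmlsconfig; mmlsnsd; mmlsfs all; } > /tmp/gpfs-config.txt 2>&1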

> 
> the 2nd challenge is if you have to back up a very large number
> (millions) of very small (<32k) files.
> the main issue here is that for each file TSM issues a random i/o to
> GPFS, one at a time, so your throughput directly correlates with the
> size of the files and the latency of a single file read operation. if
> you are not on 3.5 TL3 and/or your files don't fit into the inode,
> it's actually even 2 random i/os that are issued, as you need to read
> the metadata followed by the data block for the file.
> in this scenario you can only do 2 things:
> 
> The problem here is: why is a single rsync or tar | tar process
> orders of magnitude faster than a single TSM client at pulling data
> off of GPFS onto the same backup system's disk (e.g. a disk pool)?
> It's not a problem with GPFS, it's a problem with TSM itself.
> We tried various things, e.g.:
> 
> 1) changed commmethod to sharedmem
> 2) increased txnbytelimit to 10G
> 3) increased movesizethresh to the same as txnbytelimit (10G)
> 4) increased diskbufsize to 1023 KB
> 5) increased txngroupmax to 65000
> 6) increased movesizethresh to 10240
> 
> the next problem is that one would expect backups to tape to do
> straight sequential I/O to tape. in the case of putting the files
> into the disk pool before moving them to tape, it did the same random
> I/O to tape even with 8 GB disk pool chunks. We haven't tried the
> file pool option yet, but we've been told that it'll do the same
> thing.

I assume you are referring to using raw disks behind TSM? I have never
used them; in fact, the fastest TSM storage you can build is a GPFS
filesystem: put the TSM volumes into that filesystem and create a pool
of sequential FILE volumes on top of it. If you match the GPFS
blocksize to the RAID blocksize you get very high throughput to the
TSM pool; I have customers who see >600 MB/sec backup throughput to a
single TSM server in production.
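
A rough sketch of that setup, just as an illustration (the filesystem
name, stanza file, sizes and paths are made up, adjust them to your
environment):

    # GPFS filesystem to hold the TSM volumes; pick -B to match the RAID blocksize
    mmcrfs tsmfs -F tsmfs.stanza -B 2M -T /tsmpool

    # on the TSM server (dsmadmc): a FILE device class backed by that
    # filesystem, and a sequential storage pool on top of it
    define devclass gpfsfile devtype=file directory=/tsmpool/volumes maxcapacity=50g mountlimit=32
    define stgpool gpfsfilepool gpfsfile maxscratch=500
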
> If I'm tar'ing or dd'ing large files to tape, that's the most
> efficient approach; why doesn't TSM do something similar?

I can't tell you much about the tape part of TSM, but I can help you
speed up the ingest into a TSM server by leveraging a filesystem pool
if you want.

> 
> 
> 1. parallelism - mmbackup again starts multiple processes in 
> parallel to speed up this phase of the backup 
> 
> ...use multiple clients. This would help, but again I'm trying to get
> a single TSM client to be on par with a single "cp" process.

cp uses buffered write operations and therefore does not sync the data
to disk, which is why it's faster. TSM guarantees that each write
operation is committed to stable storage before it returns to the
client; this is most likely the difference you see.
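
You can see the same effect outside of TSM, for example with dd (the
paths below are just placeholders):

    # buffered copy: dd returns as soon as the data sits in the page cache
    dd if=/gpfs/fs0/bigfile of=/backup/bigfile bs=1M

    # data forced to stable storage before dd returns, which is closer
    # to the guarantee TSM gives you
    dd if=/gpfs/fs0/bigfile of=/backup/bigfile bs=1M conv=fsync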

Don't get me wrong, I'm not trying to defend the way TSM works; I just
know that there are several ways to make it fast, and I am happy to
help with that :-)

If you need a single TSM client to run faster, then option 2 (the
helper process) would fix that. If you want to try something out, send
me an email and I can help you with that.
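
For the parallelism route (option 1), a rough example of what that
could look like; the node names and option values are only
placeholders:

    # let mmbackup drive the incremental backup from several nodes at once
    mmbackup /gpfs/fs0 -t incremental -N node1,node2,node3

    # and in dsm.sys, allow each client to open more parallel sessions
    # to the TSM server
    RESOURCEUTILIZATION 10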

> 
> Thanks,
> Sabuj