[gpfsug-discuss] Is TSM/HSM 7.1 compatible with GPFS 3.5.0.12 ?

Sabuj Pattanayek sabujp at gmail.com
Fri Mar 21 22:19:38 GMT 2014


It's also not a problem with the TSM database, which sits on a RAID10 of 4
SSDs. Watching dstat, the server does a large ~500MB/s write to that LUN
every 30 minutes to an hour and then just sits idle waiting for more data.
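
For reference, this is roughly the kind of dstat invocation used to watch
that LUN; a minimal sketch, where sdX stands in for whatever device actually
backs the RAID10:

    # sample disk throughput every 30 seconds: totals plus the DB LUN only
    # (sdX is a placeholder device name)
    dstat -td -D total,sdX 30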


On Fri, Mar 21, 2014 at 5:16 PM, Sabuj Pattanayek <sabujp at gmail.com> wrote:

>
>> the very first challenge is to find what data has changed. the way TSM
>> does this is by crawling through your filesystem, looking at the mtime on
>> each file to find out which files have changed. think about an ls -Rl on
>> your filesystem root. this [...]
>
>
> The mmbackup mmapplypolicy phase is not a problem. It's going to take as
> long as it's going to take. We're using 5 RAID1 SAS SSD NSDs for metadata,
> and it takes ~1 hour to traverse the 157 million files.
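>
> For illustration, a rough sketch of what that scan looks like if driven by
> hand with mmapplypolicy (the policy file, filesystem path, and time window
> here are made up; mmbackup builds its own rules and shadow database
> internally):
>
>     # hypothetical policy: list files modified in the last day
>     cat > /tmp/list-changed.pol <<'EOF'
>     RULE EXTERNAL LIST 'changed' EXEC ''
>     RULE 'find-changed' LIST 'changed'
>       WHERE (CURRENT_TIMESTAMP - MODIFICATION_TIME) < INTERVAL '1' DAYS
>     EOF
>     # defer execution and just write the candidate list under /tmp/changed.*
>     mmapplypolicy /gpfs/fs0 -P /tmp/list-changed.pol -I defer -f /tmp/changed -L 1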
>
>
>>
>> the 2nd challenge is if you have to back up a very large number (millions)
>> of very small (<32k) files.
>> the main issue here is that for each file TSM issues a random I/O to
>> GPFS, one at a time, so your throughput directly correlates with the size
>> of the files and the latency of a single file read operation. if you are
>> not on 3.5 TL3 and/or your files don't fit into the inode, it's actually
>> even 2 random I/Os that are issued, as you need to read the metadata
>> followed by the data block for the file.
>> in this scenario you can only do 2 things:
>
>
> The problem here is: why is a single rsync or tar | tar process orders of
> magnitude faster than a single TSM client at pulling data off of GPFS onto
> the same backup system's disk (e.g. the disk pool)? It's not a problem with
> GPFS; it's a problem with TSM itself. We tried various things (a sketch of
> where these settings live follows the list), e.g.:
>
> 1) changed commmethod to sharedmem
> 2) increased txnbytelimit to 10G
> 3) increased movesizethresh to the same as txnbytelimit (10G)
> 4) increased diskbufsize to 1023kb
> 5) increased txngroupmax to 65000
> 6) increased movesizethresh to 10240
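>
> Roughly where those settings live, as a sketch only (option names and
> values are copied from the list above; the client/server split is from
> memory, so check exact spellings, units, and limits against the TSM docs
> for your level):
>
>     # client side, dsm.sys server stanza (or dsm.opt):
>     commmethod      sharedmem
>     txnbytelimit    10G
>     diskbufsize     1023
>
>     # server side, dsmserv.opt (or setopt from dsmadmc where supported):
>     txngroupmax     65000
>     movesizethresh  10240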
>
> The next problem is that one would expect backups to tape to be straight
> sequential I/O. In the case of staging the files to the disk pool before
> moving them to tape, it did the same random I/O to tape even with 8GB disk
> pool chunks. We haven't tried the file pool option yet, but we've been told
> that it'll do the same thing. Tar'ing or dd'ing large files to tape is the
> most efficient approach, so why doesn't TSM do something similar?
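>
> For comparison, the kind of straight streaming write meant here, as a rough
> sketch (the directory and tape device are placeholders):
>
>     # stream a directory to tape as one big sequential write
>     # (/dev/nst0 = non-rewinding tape device; the path is made up)
>     tar -cf - /gpfs/fs0/somedir | dd of=/dev/nst0 bs=1M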
>
>
>> 1. parallelism - mmbackup again starts multiple processes in parallel to
>> speed up this phase of the backup
>
>
> ...use multiple clients. This would help, but again I'm trying to get a
> single TSM client to be on par with a single "cp" process.
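>
> The comparison I mean is roughly the following (the paths are placeholders,
> and the dsmc call is just the stock incremental client with no special
> options):
>
>     # same file set, single stream each way
>     time cp -a /gpfs/fs0/testdir /backupdisk/testdir
>     time dsmc incremental /gpfs/fs0/testdir/ -subdir=yes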
>
> Thanks,
> Sabuj
>