[gpfsug-discuss] mmbackup questions

Thu Oct 17 15:26:03 BST 2019

On Thu, Oct 17, 2019 at 10:26:45AM +0000, Jonathan Buzzard wrote:
> I have been looking to give mmbackup another go (a very long history
> with it being a pile of steaming dinosaur droppings last time I tried,
> but that was seven years ago).
> 
> Anyway having done a backup last night I am curious about something
> that does not appear to be explained in the documentation.
> 
> Basically the output has a line like the following
> 
>         Total number of objects inspected:      474630
> 
> What is this number? Is it the number of files that have changed since
> the last backup or something else as it is not the number of files on
> the file system by any stretch of the imagination. One would hope that
> it inspected everything on the file system...

I believe this is the number of paths that matched some include rule (or
didn't match any exclude rule) for mmbackup. I would expect it to differ
from the "total number of objects backed up" line if there were
include/exclude rules that mmbackup couldn't process itself, leaving it to
dsmc to decide whether to back those paths up.
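
If you want to compare the two counters after a run, something like the
following untested sketch would pull them out of saved output. The function
name summary_count is made up for illustration, and the "backed up" value in
the sample is invented; only the "inspected" line comes from the output quoted
above.

```shell
#!/bin/sh
# Hypothetical helper: read backup summary output on stdin and print the
# number from the line "Total number of objects <label>:".
summary_count() {
    awk -F: -v label="Total number of objects $1" '
        {
            key = $1
            gsub(/^[ \t]+|[ \t]+$/, "", key)   # trim surrounding whitespace
            if (key == label) {
                val = $2
                gsub(/[^0-9]/, "", val)        # drop padding and separators
                print val
                exit
            }
        }'
}

# Sample input: the first line is from the post, the second value is invented.
sample='        Total number of objects inspected:      474630
        Total number of objects backed up:       12345'

inspected=$(printf '%s\n' "$sample" | summary_count inspected)
backed_up=$(printf '%s\n' "$sample" | summary_count "backed up")
echo "inspected=$inspected backed_up=$backed_up difference=$((inspected - backed_up))"
```

A large difference between the two would only tell you that many inspected
paths were not sent; it would not by itself confirm the include/exclude
explanation.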

> Also it appears that the shadow database is held on the GPFS file system
> that is being backed up. Is there any way to change the location of that?
> I am only using one node for backup (because I am cheap and don't like
> paying for more PVU's than I need to) and would like to hold it on the
> node doing the backup where I can put it on SSD. Which does two things
> firstly hopefully goes a lot faster, and secondly reduces the impact on
> the file system of the backup.

I haven't tried it, but there is an MMBACKUP_RECORD_ROOT environment
variable noted in the mmbackup man page:

                  Specifies an alternative directory name for
                  storing all temporary and permanent records for
                  the backup. The directory name specified must
                  be an existing directory and it cannot contain
                  special characters (for example, a colon,
                  semicolon, blank, tab, or comma).

That looks like it might provide a mechanism to store the shadow database
elsewhere. For us, though, we provide storage via a cost center, so we
would want our customers to eat the full cost of their excessive file counts.
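
An untested sketch of how one might set it, checking the constraints from the
man page text first. The helper name valid_record_root, the directory
/var/local/mmbackup, and the file system path in the commented-out mmbackup
line are all placeholders I made up; the man page says "for example", so the
character list checked here may not be exhaustive.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the directory exists and its name has
# none of the special characters the man page lists (colon, semicolon, blank,
# tab, comma).
valid_record_root() {
    dir=$1
    tab=$(printf '\t')
    [ -d "$dir" ] || return 1
    case $dir in
        *:*|*\;*|*" "*|*,*|*"$tab"*) return 1 ;;
    esac
    return 0
}

record_root=/var/local/mmbackup    # placeholder for a directory on local SSD

if valid_record_root "$record_root"; then
    export MMBACKUP_RECORD_ROOT="$record_root"
    # Then run the backup as usual from this shell, for example:
    # mmbackup /gpfs/fs0 -t incremental    (placeholder path, not run here)
else
    echo "not usable as MMBACKUP_RECORD_ROOT: $record_root" >&2
fi
```

Whether an existing shadow database has to be moved or rebuilt after changing
the location is something I have not checked.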

> Anyway a significant speed up (assuming it worked) was achieved but I
> note even the ancient Xeon E3113 (dual core 3GHz) was never taxed (load
> average never went above one) and we didn't touch the swap despite only
> having 24GB of RAM. Though the 10GbE networking did get busy during the
> transfer of data to the TSM server bit of the backup but during the
> "assembly stage" it was all a bit quiet, and the DSS-G server nodes were
> not busy either. What options are there for tuning things because I feel
> it should be able to go a lot faster.

We have some TSM nodes (corresponding to GPFS filesets) that stress out our
mmbackup cluster at the sort step of mmbackup. UNIX sort is not
RAM-friendly, as it happens.
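
For what it's worth, when running GNU sort by hand you can cap its in-memory
buffer with -S and point its spill files at a chosen directory with -T. The
sketch below shows only that generic usage; the function name and the
SORT_BUFFER and SORT_TMPDIR variables are made up for the example, and I am
not claiming mmbackup passes any of this through to its own sort step.

```shell
#!/bin/sh
# Hypothetical wrapper around GNU sort: limit the main memory buffer and put
# temporary spill files in a chosen directory (ideally fast local disk).
bounded_sort() {
    # -S sets the buffer size, -T sets the temporary directory.
    # LC_ALL=C gives plain byte-order comparison.
    LC_ALL=C sort -S "${SORT_BUFFER:-64M}" -T "${SORT_TMPDIR:-${TMPDIR:-/tmp}}" "$1"
}

# Small demonstration on a throwaway file.
demo=$(mktemp)
printf '%s\n' /gpfs/c /gpfs/a /gpfs/b > "$demo"
bounded_sort "$demo"
rm -f "$demo"
```

A smaller buffer trades RAM for more temporary-file I/O, so where -T points
matters once the input no longer fits in the buffer.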

-- 
-- Skylar Thompson (skylar2 at u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine