[gpfsug-discuss] mmchdisk hung / proceeding at a glacial pace?

Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP] aaron.s.knister at nasa.gov
Sun Jul 15 18:34:45 BST 2018


Hmm...have you dumped waiters across the entire cluster or just on the NSD servers/fs managers? Maybe there’s a slow node out there participating in the suspend effort? Might be worth running some quick tracing on the FS manager to see what it’s up to.
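
A minimal sketch of a cluster-wide waiter dump, as opposed to checking only the NSD servers. It assumes the GPFS binaries live in /usr/lpp/mmfs/bin and that mmdsh can reach every node as root; the helper names (run, dump_waiters) are made up for illustration. It defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch only. Assumptions: default install path, working mmdsh, fs name gpfs22.
MMFS_BIN=${MMFS_BIN:-/usr/lpp/mmfs/bin}
FS=${FS:-gpfs22}
DRY_RUN=${DRY_RUN:-1}   # set DRY_RUN=0 to actually execute the commands

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: find the file system manager, then dump waiters everywhere.
dump_waiters() {
    # Show which node is currently the file system manager for $FS.
    run "$MMFS_BIN/mmlsmgr" "$FS"
    # Collect waiters from all nodes in the cluster, not just the NSD servers.
    run "$MMFS_BIN/mmdsh" -N all "$MMFS_BIN/mmdiag" --waiters
}

dump_waiters
```

With DRY_RUN=0 the output can be long on a large cluster, so redirecting it to a file and sorting by wait time is probably more practical than reading it live.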





On July 15, 2018 at 13:27:54 EDT, Buterbaugh, Kevin L <Kevin.Buterbaugh at Vanderbilt.Edu> wrote:
Hi All,

We are in a partial cluster downtime today to do firmware upgrades on our storage arrays.  It is a partial downtime because we have two GPFS filesystems:

1.  gpfs23 - 900+ TB, which corresponds to /scratch and /data, and which I’ve unmounted across the cluster because it has data replication set to 1.

2.  gpfs22 - 42 TB, which corresponds to /home.  It has data replication set to 2, so what we’re doing is “mmchdisk gpfs22 suspend -d <the gpfs22 NSD>”, then doing the firmware upgrade, and once the array is back we’re doing a “mmchdisk gpfs22 resume -d <NSD>”, followed by “mmchdisk gpfs22 start -d <NSD>”.
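
The sequence in item 2 could be scripted roughly as follows. This is a sketch under assumptions, not the procedure actually used: the NSD name example_nsd is a placeholder (the real name is not given above), the install path is the default one, and the helper names are hypothetical. The mmlsdisk check at the end is an addition for verification. It defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch only. example_nsd is a placeholder, NOT a real NSD name from this cluster.
MMFS_BIN=${MMFS_BIN:-/usr/lpp/mmfs/bin}
FS=${FS:-gpfs22}
NSD=${NSD:-example_nsd}
DRY_RUN=${DRY_RUN:-1}   # set DRY_RUN=0 to actually execute the commands

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 1: suspend the disk before the array goes down for firmware.
before_firmware() {
    run "$MMFS_BIN/mmchdisk" "$FS" suspend -d "$NSD"
}

# Steps 2 and 3: once the array is back, resume and then start the disk.
after_firmware() {
    run "$MMFS_BIN/mmchdisk" "$FS" resume -d "$NSD" &&
    run "$MMFS_BIN/mmchdisk" "$FS" start -d "$NSD" &&
    # Added check: list any disks that are not both "up" and "ready".
    run "$MMFS_BIN/mmlsdisk" "$FS" -e
}

before_firmware
# ... firmware upgrade happens here ...
after_firmware
```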

On the 1st storage array this went very smoothly … the mmchdisk took about 5 minutes, which is what I would expect.

But on the 2nd storage array the mmchdisk appears to either be hung or proceeding at a glacial pace.  For more than an hour it’s been stuck at:

mmchdisk: Processing continues ...
Scanning file system metadata, phase 1 …

There are no waiters of any significance and “mmdiag --iohist” doesn’t show any issues either.

Any ideas, anyone?  Unless I can figure this out I’m hosed for this downtime, as I’ve got 7 more arrays to do after this one!

Thanks!

—
Kevin Buterbaugh - Senior System Administrator
Vanderbilt University - Advanced Computing Center for Research and Education
Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633


