[gpfsug-discuss] mmsdrestore with CCR enabled (was Re: 4.1.1 protocol support)

Simon Thompson (Research Computing - IT Services) S.J.Thompson at bham.ac.uk
Fri Jul 3 12:50:31 BST 2015


Actually, no, just ignore me; it does appear to be fixed in 4.1.1.


  *   I cleaned up the node by removing the 4.1.1 packages and clearing out /var/mmfs, but when the config tool reinstalled GPFS it put 4.1.0 back on and didn’t apply the updates to 4.1.1, so the failure must have come from an older version of mmsdrrestore.
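
A quick guard against this kind of version mix-up could look something like the sketch below. It only compares version strings. The names version_ge, check_restore_level and MIN_CCR_RESTORE are made up for illustration, the 4.1.1 threshold is simply what this thread reports, and how you obtain the installed GPFS level on a node is left to you. It assumes GNU sort with -V support.

```shell
#!/bin/sh
# Sketch only: all names here are hypothetical, not part of GPFS.
# Threshold taken from this thread: mmsdrrestore with CCR reportedly works from 4.1.1.
MIN_CCR_RESTORE="4.1.1"

# Succeed if version $1 is greater than or equal to version $2.
# Relies on GNU `sort -V` (version sort): the smaller version sorts first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Report whether the given GPFS level is new enough; returns 1 if it is not.
check_restore_level() {
    if version_ge "$1" "$MIN_CCR_RESTORE"; then
        echo "ok: $1"
    else
        echo "too old: $1 (need >= $MIN_CCR_RESTORE)"
        return 1
    fi
}
```

Calling check_restore_level with whatever level the node actually has installed, before running mmsdrrestore, would have flagged the 4.1.0 reinstall described above.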

Simon

From: Simon Thompson <S.J.Thompson at bham.ac.uk>
Reply-To: gpfsug main discussion list <gpfsug-discuss at gpfsug.org>
Date: Friday, 3 July 2015 12:22
To: gpfsug main discussion list <gpfsug-discuss at gpfsug.org>
Subject: [gpfsug-discuss] mmsdrestore with CCR enabled (was Re: 4.1.1 protocol support)


Bob, (anyone?)

Have you tried mmsdrrestore to see if it's working in 4.1.1?

# mmsdrrestore  -p PRIMARY -R /usr/bin/scp

Fri  3 Jul 11:56:05 BST 2015: mmsdrrestore: Processing node PRIMARY

ccrio initialization failed (err 811)

mmsdrrestore: Unable to retrieve GPFS cluster files from CCR.

mmsdrrestore: Unexpected error from updateMmfsEnvironment.  Return code: 1

mmsdrrestore: Command failed. Examine previous error messages to determine cause.

It seems to copy the mmsdrfs file to /var/mmfs/gen/mmsdrfs on the local node, but the restore then fails at the CCR step.
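
If you want a provisioning script to spot this particular failure, one option is to match on the message text shown above. This is a rough sketch: classify_restore_output is a hypothetical helper name, the strings are copied from the output in this message, and I am assuming the wording is stable across releases, which it may not be.

```shell
#!/bin/sh
# Sketch only: classify_restore_output is a hypothetical helper, not a GPFS tool.
# Reads captured mmsdrrestore output on stdin and prints a short label.
classify_restore_output() {
    output=$(cat)
    case "$output" in
        # Message seen in this thread when the CCR step fails.
        *"Unable to retrieve GPFS cluster files from CCR"*)
            echo "ccr-retrieval-failed" ;;
        # Generic failure line printed at the end of a failed run.
        *"Command failed"*)
            echo "failed-other" ;;
        # No known error text found; this does not prove the restore succeeded.
        *)
            echo "no-known-error" ;;
    esac
}
```

You would feed it the captured output of the restore command and branch on the label it prints.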

Simon


From: Bob Oesterlin <oester at gmail.com>
Reply-To: gpfsug main discussion list <gpfsug-discuss at gpfsug.org>
Date: Thursday, 2 July 2015 20:03
To: gpfsug main discussion list <gpfsug-discuss at gpfsug.org>
Subject: Re: [gpfsug-discuss] 4.1.1 protocol support


On Thu, Jul 2, 2015 at 1:52 PM, Simon Thompson (Research Computing - IT Services) <S.J.Thompson at bham.ac.uk> wrote:
I do note that it needs CCR enabled, which we currently don’t have. I think this was because we saw issues with mmsdrrestore when adding a node that had been reinstalled back into the cluster. I need to check if that is still the case (we work on being able to pull clients, NSDs etc. from the cluster, using xcat to reprovision and then a config tool to do the relevant bits to rejoin the cluster … makes it easier for us to stage kernel, GPFS, OFED updates as we just blat on a new image).

Yes, and this is why we couldn't use CCR - our compute nodes are netboot, so they go thru a mmsdrrestore every time they reboot. Now, they have fixed this in 4.1.1, which means if you can get (the cluster) to 4.1.1 and turn on CCR, mmsdrrestore should work. Note to self: Test this out in your sandbox cluster. :-)
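
For anyone wanting to try the same thing, a boot-time hook along these lines might be a starting point. It is a sketch under assumptions: restore_if_needed and the three variables are hypothetical names, PRIMARY is the placeholder node name used elsewhere in this thread, and the check for an existing mmsdrfs file is an added guard rather than something GPFS requires. The -p and -R options are the ones shown in the command earlier in the thread.

```shell
#!/bin/sh
# Sketch only: names and defaults are illustrative assumptions.
MMSDRFS="${MMSDRFS:-/var/mmfs/gen/mmsdrfs}"   # local cluster config file
RESTORE_CMD="${RESTORE_CMD:-mmsdrrestore}"    # overridable for dry runs
PRIMARY_NODE="${PRIMARY_NODE:-PRIMARY}"       # placeholder node name

# Run the restore only when the local config file is missing or empty,
# as it would be on a freshly netbooted node.
restore_if_needed() {
    if [ -s "$MMSDRFS" ]; then
        echo "mmsdrfs present, skipping restore"
        return 0
    fi
    "$RESTORE_CMD" -p "$PRIMARY_NODE" -R /usr/bin/scp
}
```

Setting RESTORE_CMD to a stub lets you check the hook in a sandbox without touching a real cluster.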


Bob Oesterlin
