[gpfsug-discuss] CCR cluster down for the count?

Loic Tortay tortay at cc.in2p3.fr
Wed Sep 20 09:03:54 BST 2017


On 19/09/2017 23:02, Buterbaugh, Kevin L wrote:
> Hi All,
> 
> We have a small test cluster that is CCR enabled.  It only had/has 3 NSD servers (testnsd1, 2, and 3) and maybe 3-6 clients.  testnsd3 died a while back.  I did nothing about it at the time because it was due to be life-cycled as soon as I finished a couple of higher priority projects.
> 
> Yesterday, testnsd1 also died, which took the whole cluster down.  So now resolving this has become higher priority… ;-)
> 
> I took two other boxes and set them up as testnsd1 and 3, respectively.  I’ve done a “mmsdrrestore -p testnsd2 -R /usr/bin/scp” on both of them.  I’ve also done a “mmccr setup -F” and copied the ccr.disks and ccr.nodes files from testnsd2 to them.  And I’ve copied /var/mmfs/gen/mmsdrfs from testnsd2 to testnsd1 and 3.  In case it’s not obvious from the above, networking is fine … ssh without a password between those 3 boxes is fine.
> 
> However, when I try to start up GPFS … or run any GPFS command I get:
> 
> /root
> root@testnsd2# mmstartup -a
> get file failed: Not enough CCR quorum nodes available (err 809)
> gpfsClusterInit: Unexpected error from ccr fget mmsdrfs.  Return code: 158
> mmstartup: Command failed. Examine previous error messages to determine cause.
> /root
> root@testnsd2#
> 
> I’ve got to run to a meeting right now, so I hope I’m not leaving out any crucial details here … does anyone have an idea what I need to do?  Thanks…
> 
Hello,
I have had the same issue multiple times.

The "trick" is to execute "/usr/lpp/mmfs/bin/mmcommon startCcrMonitor"
on a majority of quorum nodes (once they have the correct configuration
files) to be able to start the cluster.
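
As a rough sketch of the above (untested against a real cluster): the
helper names "ccr_majority" and "start_ccr_monitor" and the DRY_RUN
variable are made up for illustration, and the node names are the ones
from the quoted message. Only the mmcommon path comes from this thread.
By default it just prints the commands it would run.

```shell
#!/bin/sh
# Sketch only. Helper names and DRY_RUN are hypothetical, not part of GPFS.

# Smallest number of nodes that is a strict majority of N quorum nodes.
ccr_majority() {
    echo $(( $1 / 2 + 1 ))
}

# For each node given, print the command (DRY_RUN=1, the default)
# or run it over ssh (DRY_RUN=0). Assumes passwordless root ssh.
start_ccr_monitor() {
    for node in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "ssh $node /usr/lpp/mmfs/bin/mmcommon startCcrMonitor"
        else
            ssh "$node" /usr/lpp/mmfs/bin/mmcommon startCcrMonitor
        fi
    done
}
```

With 3 quorum nodes, "ccr_majority 3" gives 2, so something like
"DRY_RUN=0 start_ccr_monitor testnsd1 testnsd2" followed by
"mmstartup -a" would be the sequence to try.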

I noticed a call to the above command in the "gpfs.gplbin" spec file in
the "%postun" section (when doing RPM upgrades, if I'm not mistaken).

<Insert here rant about CCR design & testing>.


Loïc.
-- 
|   Loïc Tortay <tortay at cc.in2p3.fr>  -     IN2P3 Computing Centre     |


