[gpfsug-discuss] ESS bring up the GPFS in recovery group without takeover

Damir Krstic damir.krstic at gmail.com
Fri Dec 22 17:44:50 GMT 2017


It's been a very frustrating couple of months with our 2 ESS systems. IBM
tells us we hit the blueflame bug, and they came on site and updated our ESS
to the latest version back in the middle of November. On Wednesday night, one
of the NSD servers in one of our ESS building blocks kernel panicked. We have
no idea why, and none of the logs are insightful. We have a PMR open with
IBM. I am not very confident we will get to the bottom of what's causing the
kernel panics on our IO servers. The system has gone down more than 4 times
now in 2 months.

When we tried bringing it back up, it rejoined the recovery group and IO on
the entire cluster locked up until we were able to find a couple of compute
nodes showing a pending state in mmfsadm dump tscomm. Killing gpfs on those
nodes resolved the filesystem lockup.
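
For anyone wanting to automate that search, here is a minimal sketch, not a
tested tool. It assumes the output of mmfsadm dump tscomm has already been
collected from each node into a string, and it assumes the dump contains a
"Pending messages:" header followed by indented entries, with "(none)" when
there is nothing pending. That layout is an assumption and may differ between
releases, so check it against a real dump first. The function names
pending_entries and nodes_with_pending are hypothetical.

```python
def pending_entries(dump_text, header="Pending messages:"):
    """Return the indented entries listed under the pending-messages header.

    ASSUMPTION: the dump has a header line equal to `header`, followed by
    indented lines, and shows "(none)" when nothing is pending.
    """
    entries = []
    in_section = False
    for line in dump_text.splitlines():
        if line.strip() == header:
            in_section = True
            continue
        if in_section:
            # A blank or non-indented line ends the section.
            if not line.strip() or not line[0].isspace():
                in_section = False
                continue
            if line.strip() != "(none)":
                entries.append(line.strip())
    return entries


def nodes_with_pending(dumps):
    """dumps maps node name -> captured dump text; return nodes with entries."""
    return sorted(node for node, text in dumps.items() if pending_entries(text))
```

Feeding it a dict of node name to dump text gives back the candidate nodes to
look at before deciding whether to restart gpfs on them.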

So far we have never been successful in bringing back an IO server without
the filesystem locking up until we find a node with a pending state in
tscomm. Anyway, the system was stable for a few minutes until the same IO
server that went down on Wednesday night went into an arbitrating state. It
never recovered. We stopped gpfs on that server and IO recovered again. We
have left gpfs down on it and the cluster seems to be OK.

My question is: is there a way of bringing the IO server back into the mix
without the recovery group takeover happening? Could I just start gpfs and
have it rejoin as a backup server for the recovery group, and if so, how do
you do that? Right now that server is designated as the primary server for
the recovery group. I would like to have both IO servers in the mix for
redundancy purposes.

This ESS situation is beyond frustrating and I don't see an end in sight.

Any help is appreciated.