[gpfsug-discuss] GPFS Question: Will stopping all tie-breaker disks break quorum semantics?

Luke Raimbach luke.raimbach at oerc.ox.ac.uk
Thu May 24 11:55:42 BST 2012


Dear GPFS,

I have a relatively simple GPFS set-up:

Two manager-quorum nodes (the primary and secondary configuration nodes) run the cluster with tie-breaker disk quorum semantics. The two manager nodes are SAN attached to 6 x 20TB SATA NSDs (marked as dataOnly), split into two failure groups so that we could create a file system that supports replication. Three of these NSDs are marked as tie-breaker disks.
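For reference, the tie-breaker assignment was done along these lines (the NSD names here are made-up placeholders, not our real disk names):

    # show which disks currently act as tie-breakers
    mmlsconfig tiebreakerDisks

    # how they were assigned in the first place (on 3.4 I believe
    # GPFS has to be down cluster-wide before changing this)
    mmchconfig tiebreakerDisks="nsd_sata_1;nsd_sata_2;nsd_sata_3"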

The metadata is stored on SAS disks located in both manager-quorum nodes (marked as metadataOnly) and replicated between them.
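The replication settings and disk layout can be checked with mmlsfs and mmlsdisk (the file system name 'gpfs0' is a placeholder throughout):

    # default and maximum metadata replication factors
    mmlsfs gpfs0 -m -M

    # usage type, failure group and status of every NSD
    mmlsdisk gpfs0 -L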

The disk controller subsystem behind the SATA NSDs requires a reboot, but I do not want to shut down GPFS because some critical services depend on a small (~12TB) portion of the data.

I have added two additional NSD servers to the cluster using some old equipment. These are SAN attached to 10 x 2TB LUNs, which is enough capacity to hold the critical data. I am removing one of the 20TB SATA LUNs from the file system's 'system' storage pool on the manager nodes and adding it to another storage pool, 'evac-pool', which contains the new 10 x 2TB NSDs.
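As I understand it, moving a disk between pools is a delete followed by an add with a new disk descriptor, roughly:

    # drain the 20TB NSD and remove it from the file system
    mmdeldisk gpfs0 nsd_sata_6

    # re-add it into the new pool; the descriptor fields are
    # DiskName:::DiskUsage:FailureGroup:DesiredName:StoragePool
    mmadddisk gpfs0 "nsd_sata_6:::dataOnly:1::evac-pool"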

Using the policy engine, I want to migrate the fileset that contains the critical data to this new storage pool and enable replication of the fileset (with the single 20TB NSD in failure group 1 and the 10 x 2TB NSDs in failure group 2).
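The rule I have in mind is something like the following (the fileset name 'critical' is a placeholder), run first with -I test as a dry run:

    /* evac.policy - move the critical fileset and replicate it */
    RULE 'evac' MIGRATE FROM POOL 'system'
         TO POOL 'evac-pool'
         REPLICATE(2)
         FOR FILESET('critical')

    mmapplypolicy gpfs0 -P /tmp/evac.policy -I test   # then -I yes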

I am then expecting to be able to suspend and subsequently stop the 20TB NSD while maintaining access to the critical data. This plan is progressing nicely, but I'm not yet at the stage where I can stop the 20TB NSD (I'm waiting for a restripe to finish for something else).
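By 'suspend then stop' I mean the mmchdisk operations:

    # suspend: no new block allocations go to the disk
    mmchdisk gpfs0 suspend -d nsd_sata_6

    # stop: the disk becomes unavailable and reads should be
    # satisfied from the replicas on the 2TB NSDs
    mmchdisk gpfs0 stop -d nsd_sata_6

    # list any disks that are not up/ready afterwards
    mmlsdisk gpfs0 -e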

Does this plan sound plausible so far? I've read the relevant documentation and will run an experiment with stopping the single 20TB NSD first. However, I have thought of a potential problem: the quorum semantics in operation.

When I switch off all six 20TB NSDs, the manager-quorum nodes to which they are attached will remain online (to serve the metadata NSDs for the surviving data disks), but all three tie-breaker disks sit on the six 20TB NSDs. My question is: will removing access to the tie-breaker disks affect GPFS quorum, or are the tie-breaker disks only consulted when node quorum is lost?
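During the experiment I will watch the quorum state from one of the surviving nodes, along the lines of:

    # daemon state per node, with quorum counts in the summary
    mmgetstate -a -L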

I'm running GPFS 3.4.7.

Thanks,
Luke.

--

Luke Raimbach
IT Manager
Oxford e-Research Centre
7 Keble Road,
Oxford,
OX1 3QG

+44(0)1865 610639




