[gpfsug-discuss] [Newsletter] Re: Problem with mmlscluster and callback scripts

Matthias Knigge Matthias.Knigge at rohde-schwarz.com
Mon Sep 10 12:19:14 BST 2018


Hi Aaron,

In my setup I have no way to define a tiebreaker disk. So if one node goes down, I would change the role of this node:

mmchnode --nonquorum -N nodename --force

After that I can start the filesystem and mount it.
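
The sequence above could be wrapped in a small script. The following is only a sketch under assumptions: the helper names (run, recover_without) and the DRY_RUN switch are invented for illustration, and it assumes that "start the filesystem and mount it" means mmstartup followed by mmmount all. By default it only prints what it would do.

```shell
#!/bin/sh
# Sketch only: demote a failed quorum node, then start GPFS locally and mount.
# DRY_RUN=1 (the default here) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Hypothetical helper: $1 is the name of the node that went down.
recover_without() {
    failed_node="$1"
    # Taken from the message above: drop the quorum role of the failed node.
    run mmchnode --nonquorum -N "$failed_node" --force
    # Assumption: starting and mounting is done with these two commands.
    run mmstartup
    run mmmount all
}

# gpfs-tier2 is just the example node name from this thread.
recover_without "${1:-gpfs-tier2}"
```

Run it once with the default DRY_RUN=1 to check the printed commands before setting DRY_RUN=0 on a real node.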

Thanks,
Matthias


Best Regards

Matthias Knigge
R&D File Based Media Solutions

Rohde & Schwarz 
GmbH & Co. KG
Hanomaghof 1
30449 Hannover
Telefon +49 511 67 80 7 213
Fax +49 511 37 19 74
Internet: Matthias.Knigge at rohde-schwarz.com
------------------------------------------------------------
Geschäftsführung / Executive Board: Christian Leicher (Vorsitzender / Chairman), Peter Riedel, Sitz der Gesellschaft / Company's Place of Business: München, Registereintrag / Commercial Register No.: HRA 16 270, Persönlich haftender Gesellschafter / Personally Liable Partner: RUSEG Verwaltungs-GmbH, Sitz der Gesellschaft / Company's Place of Business: München, Registereintrag / Commercial Register No.: HRB 7 534, Umsatzsteuer-Identifikationsnummer (USt-IdNr.) / VAT Identification No.: DE 130 256 683, Elektro-Altgeräte Register (EAR) / WEEE Register No.: DE 240 437 86

-----Original Message-----
From: gpfsug-discuss-bounces at spectrumscale.org <gpfsug-discuss-bounces at spectrumscale.org> On Behalf Of Aaron Knister
Sent: Friday, September 07, 2018 3:35 PM
To: gpfsug-discuss at spectrumscale.org
Subject: *EXT* [Newsletter] Re: [gpfsug-discuss] Problem with mmlscluster and callback scripts

Hi Matthias,

Looks like you lost quorum in the cluster. With node-based quorum you need at least (n/2+1) quorum nodes up, so in a two-node cluster both nodes have to be up. Do you have a tiebreaker disk defined? (You can check with e.g. mmlsconfig tiebreakerDisks.)
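
As a rough illustration of that formula (a sketch only; the helper name quorum_needed is invented for this example), integer division in the shell gives the minimum number of quorum nodes that must be up:

```shell
#!/bin/sh
# Minimum number of quorum nodes that must be up under node-based quorum:
# floor(n/2) + 1, where n is the number of defined quorum nodes.
# quorum_needed is a hypothetical helper, not a GPFS command.
quorum_needed() {
    echo $(( $1 / 2 + 1 ))
}

# Print the threshold for a few cluster sizes.
for n in 2 3 5; do
    echo "quorum nodes: $n, must be up: $(quorum_needed "$n")"
done
```

For the two-node cluster in this thread the result is 2, which is why losing either node costs quorum unless a tiebreaker disk is in use.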

-Aaron

On 9/7/18 7:51 AM, Matthias Knigge wrote:
> Hello all,
> 
> I am using GPFS version 5.0.2.0 and have problems with the mmlscluster
> command and with callback scripts. It is a small cluster of only two
> nodes. If I shut down one of the nodes, mmlscluster sometimes fails
> with the following output:
> 
> [root@gpfs-tier1 gpfs5.2]# mmgetstate
> 
> Node number  Node name        GPFS state
> -------------------------------------------
>         1      gpfs-tier1       arbitrating
> 
> [root@gpfs-tier1 gpfs5.2]# mmlscluster
> ssh: connect to host gpfs-tier2 port 22: No route to host
> mmlscluster: Unable to retrieve GPFS cluster files from node gpfs-tier2
> mmlscluster: Command failed. Examine previous error messages to determine cause.
> 
> Normally the output is like this:
> 
> [root@gpfs-tier1 gpfs5.2]# mmlscluster
> 
> GPFS cluster information
> ========================
>    GPFS cluster name:         TIERCLUSTER.gpfs-tier1
>    GPFS cluster id:           12458173498278694815
>    GPFS UID domain:           TIERCLUSTER.gpfs-tier1
>    Remote shell command:      /usr/bin/ssh
>    Remote file copy command:  /usr/bin/scp
>    Repository type:           server-based
> 
> GPFS cluster configuration servers:
> -----------------------------------
>    Primary server:    gpfs-tier2
>    Secondary server:  gpfs-tier1
> 
> Node  Daemon node name  IP address      Admin node name  Designation
> ----------------------------------------------------------------------
>    1  gpfs-tier1        192.168.178.10  gpfs-tier1       quorum-manager
>    2  gpfs-tier2        192.168.178.11  gpfs-tier2       quorum-manager
> 
> [root@gpfs-tier1 gpfs5.2]# mmlscallback
> NodeDownCallback
>          command       = /var/mmfs/rs/nodedown.ksh
>          priority      = 1
>          event         = quorumNodeLeave
>          parms         = %eventNode %quorumNodes
> 
> NodeUpCallback
>          command       = /var/mmfs/rs/nodeup.ksh
>          priority      = 1
>          event         = quorumNodeJoin
>          parms         = %eventNode %quorumNodes
> 
> If I shut down the filesystem via mmshutdown, the callback script runs,
> but if I shut down the whole node, the scripts do not run.
> 
> The latest entries in mmfs.log.latest show only this information:
> 
> 2018-09-07_13:12:36.724+0200: [I] Cluster Manager connection broke. Probing cluster TIERCLUSTER.gpfs-tier1
> 2018-09-07_13:12:37.226+0200: [E] Unable to contact enough other quorum nodes during cluster probe.
> 2018-09-07_13:12:37.226+0200: [E] Lost membership in cluster TIERCLUSTER.gpfs-tier1. Unmounting file systems.
> 2018-09-07_13:12:38.448+0200: [N] Connecting to 192.168.178.11 gpfs-tier2 <c0p1>
> 
> Could anybody help me with this? I want to run a script whenever a
> node goes down or comes back up, so that it can change the node roles
> and the filesystem can be started. The callback events nodeLeave and
> nodeJoin do not fire either.
> 
> Any more information required? If yes, please let me know!
> 
> Many thanks in advance and a nice weekend!
> 
> Matthias
> 
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
> 

--
Aaron Knister
NASA Center for Climate Simulation (Code 606.2) Goddard Space Flight Center
(301) 286-2776

