[gpfsug-discuss] gpfsug-discuss Digest, Vol 75, Issue 34

Jan-Frode Myklebust janfrode at tanso.net
Mon Apr 23 06:00:26 BST 2018


It started for me after the upgrade from v4.2.x.x to 5.0.0.1 with RHEL 7.4.
Strangely, not immediately, but two days after the upgrade (Wednesday evening
CET). I also doubt that mount -o tcp will help, since TCP should
already be the default transport. I have asked whether we can instead block
UDP server-side using iptables.
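
Before blocking anything, it may help to confirm which RPC services the CES
node still advertises over UDP. The following is a minimal sketch, assuming
the usual column layout printed by `rpcinfo -p` (program, vers, proto, port,
service). The function names and the sample table are illustrative and not
taken from this thread; the mountd port in particular is made up.

```python
from typing import List, NamedTuple


class RpcEntry(NamedTuple):
    program: int
    version: int
    proto: str
    port: int
    service: str


def parse_rpcinfo(text: str) -> List[RpcEntry]:
    """Parse the table printed by `rpcinfo -p`, skipping the header line."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows start with a numeric program number; the header does not.
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        # The service name column can be empty for unregistered names.
        service = fields[4] if len(fields) > 4 else ""
        entries.append(
            RpcEntry(int(fields[0]), int(fields[1]), fields[2], int(fields[3]), service)
        )
    return entries


def udp_services(text: str) -> List[RpcEntry]:
    """Return only the entries registered with the UDP transport."""
    return [e for e in parse_rpcinfo(text) if e.proto == "udp"]


# Hypothetical sample output; real port numbers depend on the server config.
SAMPLE = """\
   program vers proto   port  service
    100000    4   tcp    111  portmapper
    100000    2   udp    111  portmapper
    100003    3   tcp   2049  nfs
    100005    3   udp  20048  mountd
    100005    3   tcp  20048  mountd
"""

if __name__ == "__main__":
    for e in udp_services(SAMPLE):
        print(f"{e.service} v{e.version} is registered on udp/{e.port}")
```

In practice the text would come from running `rpcinfo -p <server>` against a
CES address rather than from the sample string. Any service still listed with
proto udp is a candidate for a firewall rule.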

But I expect we should get a fix soon, and we'll stick with Ganesha v2.3.2
until then.


-jf
On Mon, 23 Apr 2018 at 01:23, Ray Coetzee <coetzee.ray at gmail.com> wrote:

> Hi Jan-Frode
> We've been told the same regarding mounts using UDP.
> Our exports are already explicitly configured for TCP and the clients'
> fstabs are set to use TCP.
> It would be infuriating if the clients are trying UDP first irrespective
> of the mount options configured.
>
> Why the problem started specifically last week for both of us is
> interesting.
>
> Kind regards
>
> Ray Coetzee
> Mob: +44 759 704 7060
>
> Skype: ray.coetzee
>
> Email: coetzee.ray at gmail.com
>
>
> On Mon, Apr 23, 2018 at 12:02 AM, Jan-Frode Myklebust <janfrode at tanso.net>
> wrote:
>
>>
>> Yes, I've been struggling with something similar this week. Ganesha
>> dying with SIGABRT -- nothing else logged. After catching a few coredumps,
>> it has been identified as a problem with some UDP communication during
>> mounts from Solaris clients. Disabling UDP as a transport on the shares
>> server-side didn't help. It was suggested to use "mount -o tcp" or whatever
>> the Solaris version of this is -- but we haven't tested this. So far the
>> downgrade to v2.3.2 has been our workaround.
>>
>> PMR:  48669,080,678
>>
>>
>>   -jf
>>
>>
>> On Mon, Apr 23, 2018 at 12:38 AM, Ray Coetzee <coetzee.ray at gmail.com>
>> wrote:
>>
>>> Good evening all
>>>
>>> I'm working with IBM on a PMR where ganesha is segfaulting or causing
>>> kernel panics on one group of CES nodes.
>>>
>>> We have 12 identical CES nodes split into two groups of 6 nodes each &
>>> have been running with RHEL 7.3 & GPFS 5.0.0-1 since 5.0.0-1 was
>>> released.
>>>
>>> Only one group started having issues Monday morning where ganesha would
>>> segfault and the mounts would move over to the remaining nodes.
>>> The remaining nodes then start to fall over like dominoes within minutes
>>> or hours to the point that all CES nodes are "failed" according to
>>> "mmces node list" and the VIPs are unassigned.
>>>
>>> Recovering the nodes is extremely finicky and works for a few minutes
>>> or hours before segfaulting again.
>>> Most times a complete stop of Ganesha on all nodes & then only starting
>>> it on two random nodes allows mounts to recover for a while.
>>>
>>> None of the following has helped:
>>> A reboot of all nodes.
>>> Refresh CCR config file with mmsdrrestore
>>> Remove/add CES from nodes.
>>> Reinstall GPFS & protocol rpms
>>> Update to 5.0.0-2
>>> Fresh reinstall of a node
>>> Network checks out with no dropped packets on either data or export
>>> networks.
>>>
>>> The only temporary fix so far has been to downrev ganesha to 2.3.2 from
>>> 2.5.3 on the affected nodes.
>>>
>>> While waiting for IBM development, has anyone seen anything similar?
>>>
>>> Kind regards
>>>
>>> Ray Coetzee
>>>
>>>
>>>
>>> On Sat, Apr 21, 2018 at 12:00 PM, <
>>> gpfsug-discuss-request at spectrumscale.org> wrote:
>>>
>>>> Today's Topics:
>>>>
>>>>    1. Re: UK Meeting - tooling Spectrum Scale (Grunenberg, Renar)
>>>>    2. Re: UK Meeting - tooling Spectrum Scale
>>>>       (Simon Thompson (IT Research Support))
>>>>
>>>>
>>>> ----------------------------------------------------------------------
>>>>
>>>> Message: 1
>>>> Date: Fri, 20 Apr 2018 14:01:55 +0000
>>>> From: "Grunenberg, Renar" <Renar.Grunenberg at huk-coburg.de>
>>>> To: "'gpfsug-discuss at spectrumscale.org'"
>>>>         <gpfsug-discuss at spectrumscale.org>
>>>> Subject: Re: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale
>>>> Message-ID: <fb4c0ca7ece5462d96948e562803e77e at SMXRF105.msg.hukrf.de>
>>>> Content-Type: text/plain; charset="utf-8"
>>>>
>>>> Hello Simon,
>>>> is there any reason why the presentation from Yong ZY Zheng
>>>> (Cognitive, ML, Hortonworks) is not linked?
>>>>
>>>> Renar Grunenberg
>>>> IT Department - Operations
>>>>
>>>> HUK-COBURG
>>>> Bahnhofsplatz
>>>> 96444 Coburg
>>>> Phone:          09561 96-44110
>>>> Fax:            09561 96-44104
>>>> Email:  Renar.Grunenberg at huk-coburg.de
>>>> Internet:       www.huk.de
>>>>
>>>> ------------------------------
>>>>
>>>> Message: 2
>>>> Date: Fri, 20 Apr 2018 14:12:11 +0000
>>>> From: "Simon Thompson (IT Research Support)" <S.J.Thompson at bham.ac.uk>
>>>> To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
>>>> Subject: Re: [gpfsug-discuss] UK Meeting - tooling Spectrum Scale
>>>> Message-ID: <14C2312C-1B54-45E9-B867-3D9E479A52B6 at bham.ac.uk>
>>>> Content-Type: text/plain; charset="utf-8"
>>>>
>>>> Sorry, it was a typo on my side.
>>>>
>>>> For the talks that are missing, we are chasing copies of the slides that
>>>> we can release.
>>>>
>>>> Simon
>>>>
>>>>
>>>> ------------------------------
>>>>
>>>> _______________________________________________
>>>> gpfsug-discuss mailing list
>>>> gpfsug-discuss at spectrumscale.org
>>>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>>>>
>>>>
>>>> End of gpfsug-discuss Digest, Vol 75, Issue 34
>>>> **********************************************
>>>>
>>>
>>>
>>>
>>
>