[gpfsug-discuss] Bad performance with GPFS system monitoring (mmsysmon) in GPFS 4.2.1.1

Simon Thompson (Research Computing - IT Services) S.J.Thompson at bham.ac.uk
Thu Jan 19 18:21:18 GMT 2017


On some of our nodes we were regularly seeing hung-process timeouts in dmesg from a Python process, which I vaguely thought was related to the monitoring process (though we have other Python bits from OpenStack running on these boxes). These nodes are all running 4.2.2.0 code.

Simon
________________________________________
From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Mathias Dietz [MDIETZ at de.ibm.com]
Sent: 19 January 2017 18:07
To: FC; gpfsug main discussion list
Subject: Re: [gpfsug-discuss] Bad performance with GPFS system monitoring (mmsysmon) in GPFS 4.2.1.1

Hi Farid,

There is no official way to disable the system health monitoring because other components rely on it (e.g. GUI, CES, Install Toolkit).
If you are fine with the consequences, you can just delete mmsysmonitor.conf, which will prevent the monitor from starting.

During our testing we did not see a significant performance impact caused by the monitoring.
In 4.2.2 some component monitors (e.g. disk) have been further improved to reduce polling and use notifications instead.

Nevertheless, I would like to better understand what the issue is.
What kind of workload do you run?
Do you see spikes in CPU usage every 30 seconds?
Is it the same on all cluster nodes or just on some of them?
Could you send us the output of "mmhealth node show -v" so we can see which monitors are active?
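
One way to check for CPU spikes every 30 seconds is to sample the aggregate "cpu" line of /proc/stat once per second and look at the busy fraction over time. The following is only a minimal sketch and is not part of Spectrum Scale; the function names (parse_cpu_line, busy_fraction, sample) and the 120-second window are assumptions chosen for illustration.

```python
import time


def parse_cpu_line(line):
    """Return (busy, total) jiffies from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in fields[1:]]
    # Field order: user nice system idle iowait irq softirq steal guest guest_nice.
    # guest and guest_nice are already counted in user and nice, so only the
    # first eight fields are summed.
    total = sum(values[:8])
    idle = values[3] + (values[4] if len(values) > 4 else 0)  # idle + iowait
    return total - idle, total


def busy_fraction(prev, curr):
    """Fraction of CPU time spent busy between two (busy, total) snapshots."""
    busy = curr[0] - prev[0]
    total = curr[1] - prev[1]
    return busy / total if total > 0 else 0.0


def sample(seconds=120, step=1.0, path="/proc/stat"):
    """Print one timestamped busy fraction per step for the given duration."""
    def read():
        with open(path) as f:
            return parse_cpu_line(f.readline())

    prev = read()
    end = time.time() + seconds
    while time.time() < end:
        time.sleep(step)
        curr = read()
        print(time.strftime("%H:%M:%S"), f"{busy_fraction(prev, curr):.1%}")
        prev = curr
```

Calling sample() on an affected node and on an unaffected one should make it visible whether the busy fraction jumps at a regular 30-second spacing.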

It might make sense to open a PMR to get this issue fixed.

Thanks.


Mit freundlichen Grüßen / Kind regards

Mathias Dietz

Spectrum Scale - Release Lead Architect (4.2.X Release)
System Health and Problem Determination Architect
IBM Certified Software Engineer

----------------------------------------------------------------------------------------------------------
IBM Deutschland
Hechtsheimer Str. 2
55131 Mainz
Mobile: +49-15152801035
E-Mail: mdietz at de.ibm.com
----------------------------------------------------------------------------------------------------------
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martina Koederitz, Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294





From:        FC <farid.chabane at ymail.com>
To:        "gpfsug-discuss at spectrumscale.org" <gpfsug-discuss at spectrumscale.org>
Date:        01/19/2017 07:06 AM
Subject:        [gpfsug-discuss] Bad performance with GPFS system monitoring (mmsysmon) in GPFS 4.2.1.1
Sent by:        gpfsug-discuss-bounces at spectrumscale.org
________________________________



Hi all,

We are facing performance issues with some of our applications due to the GPFS system monitoring (mmsysmon) on CentOS 7.2.

Performance degradation (an increase in iteration time) is seen every 30s, exactly matching the frequency at which mmsysmon runs; the default monitor interval is set to 30s in /var/mmfs/mmsysmon/mmsysmonitor.conf.
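
To make the 30s pattern explicit, the spacing between slow iterations can be computed from the application's own timing log. This is a minimal sketch under stated assumptions: the helper name slow_iteration_gaps, the (start_time, duration) input format, and the 1.5x-median threshold are all hypothetical and would need adapting to the real application.

```python
from statistics import median


def slow_iteration_gaps(samples, factor=1.5):
    """Return the time gaps between consecutive slow iterations.

    samples: list of (start_time_seconds, duration_seconds) tuples.
    An iteration counts as slow when its duration exceeds factor * median.
    """
    if not samples:
        return []
    baseline = median(d for _, d in samples)
    slow_starts = [t for t, d in samples if d > factor * baseline]
    # Pair each slow start with the next one and take the difference.
    return [b - a for a, b in zip(slow_starts, slow_starts[1:])]


# Synthetic example: one iteration per second, slowed at t = 10, 40 and 70.
timings = [(t, 2.0 if t in (10, 40, 70) else 1.0) for t in range(90)]
print(slow_iteration_gaps(timings))
```

Gaps clustering around 30 would be consistent with the monitor interval. A spike that spans two adjacent iterations shows up as an extra short gap, so the threshold may need tuning.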

Shutting down GPFS with mmshutdown doesn't stop this process. We stopped it with the mmsysmoncontrol command and now get a stable iteration time.

What are the impacts of disabling this process, other than losing access to the mmhealth commands?
Do you know of a proper way to disable it for good, without doing it in rc.local or increasing the monitoring interval in the configuration file?

Thanks,
Farid
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss





