[gpfsug-discuss] Bad performance with GPFS system monitoring (mmsysmon) in GPFS 4.2.1.1

Jonathon A Anderson jonathon.anderson at colorado.edu
Tue Feb 21 21:39:48 GMT 2017


This thread predates my joining gpfsug-discuss, but be advised that we also saw severe (1.5x-3x) performance degradation in user applications while mmsysmon was running. For context, we’re running a Haswell+OPA system.

The issue appears to occur only when the user application is simultaneously using all available cores *and* communicating over the network. Synthetic CPU tests with HPL did not expose the issue, nor did the OSU micro-benchmarks, which are designed to maximize network load without necessarily using all CPUs.

I’ve stopped mmsysmon by hand[^1] for now, but I haven’t yet gone so far as to remove the config file to prevent it from starting in the future.
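
Since the monitor can come back (for example after a GPFS restart), it may be worth confirming that it is really gone on a node before benchmarking. The following is a minimal sketch, not an IBM-provided tool: it assumes a Linux node with the procps `ps` command, and it assumes the monitor shows up with the string `mmsysmon` somewhere in its command line. The function names are hypothetical.

```python
import os
import subprocess


def find_monitor_processes(ps_output, needle="mmsysmon", skip_pid=None):
    """Return (pid, command) pairs whose command line contains `needle`.

    `ps_output` is the text printed by `ps -eo pid=,args=`.
    The default needle is an assumption about how the monitor appears in ps.
    """
    matches = []
    for line in ps_output.splitlines():
        parts = line.strip().split(None, 1)
        # Skip blank or malformed lines.
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        pid, command = int(parts[0]), parts[1]
        if pid == skip_pid:
            continue
        if needle in command:
            matches.append((pid, command))
    return matches


def list_running_monitors():
    """Run ps on the local node and return any matching processes."""
    result = subprocess.run(
        ["ps", "-eo", "pid=,args="],
        capture_output=True,
        text=True,
        check=True,
    )
    # Exclude this script in case its own path contains the needle.
    return find_monitor_processes(result.stdout, skip_pid=os.getpid())


if __name__ == "__main__":
    running = list_running_monitors()
    if running:
        for pid, command in running:
            print(pid, command)
    else:
        print("no matching process found")
```

Run on each compute node (for example through whatever parallel shell you already use), this only reports matches; it does not stop anything.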

We intend to run further tests, but I wanted to share our experience so far, since this took us far longer to diagnose than I would have liked.

~jonathon



