[gpfsug-discuss] spontaneous tracing?

Aaron Knister aaron.s.knister at nasa.gov
Sat Mar 10 21:44:39 GMT 2018


I found myself with a little treat this morning to the tune of tracing 
running on the entire cluster of 3500 nodes. There were no logs I could 
find to indicate *why* the tracing had started but it was clear it was 
initiated by the cluster manager.

Some sleuthing (thanks, collectl!) allowed me to figure out that the 
tracing was started by the command:

/usr/lpp/mmfs/bin/mmksh /usr/lpp/mmfs/bin/mmcommon notifyOverload _asmgr

I thought that running "mmchconfig deadlockOverloadThreshold=0 -i" would 
stop this from happening again but lo and behold tracing kicked off 
*again* (with the same caller) some time later even after setting that 
parameter.
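
For anyone wanting to rule out the obvious, here is a minimal sketch of 
how one might confirm that the new value is what the cluster configuration 
reports. It assumes the standard /usr/lpp/mmfs/bin install path and that 
mmlsconfig accepts the attribute name as an argument; the function name 
show_overload_threshold is made up for illustration and is not part of 
GPFS.

```shell
#!/bin/sh
# Hypothetical helper, not part of GPFS: print the configured value of
# deadlockOverloadThreshold so it can be compared with what was set.
# Assumption: GPFS commands live under /usr/lpp/mmfs/bin.
MMFS_BIN=${MMFS_BIN:-/usr/lpp/mmfs/bin}

show_overload_threshold() {
    if [ -x "$MMFS_BIN/mmlsconfig" ]; then
        # Ask mmlsconfig for just this one attribute.
        "$MMFS_BIN/mmlsconfig" deadlockOverloadThreshold
    else
        # Not a GPFS node, or a non-default install path.
        echo "mmlsconfig not found under $MMFS_BIN" >&2
        return 1
    fi
}

show_overload_threshold || true
```

This only shows the configured value; it says nothing about why 
notifyOverload was invoked in the first place.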

What's odd is there are no log events to indicate an overload occurred.

Has anyone seen similar behavior?

We're on 4.2.3.6 efix17.

-Aaron

-- 
Aaron Knister
NASA Center for Climate Simulation (Code 606.2)
Goddard Space Flight Center
(301) 286-2776


