[gpfsug-discuss] Potential problems - leaving trace enabled in over-write mode?

Tue Mar 7 21:51:23 GMT 2017

Hi Bob,

My impression is that the biggest impact is on metadata-type operations 
rather than throughput, but don't quote me on that because I have very 
little data to back it up. While testing the upgrade from GPFS 
3.5 to 4.1 we ran fio on roughly 1000 nodes against an FS in our test 
environment, which sustained about 60-80k iops on the filesystem's 
metadata LUNs. At one point I couldn't understand why I was struggling 
to get about 13k iops, and then realized tracing was turned on on some 
subset of NSD servers (which are also manager nodes). After turning it 
off, the iops immediately shot back up to where I was expecting them to be.

Also during testing we were tracking down a bug for which I needed to 
run tracing *everywhere* and then turn it off when one of the manager 
nodes saw a particular error. To help with this I used a script IBM had 
sent me a while back, with some tweaks of my own. I've attached it in 
case it's helpful. In a nutshell the process looks like:

- start tracing everywhere (/usr/lpp/mmfs/bin/mmdsh -N all 
/usr/lpp/mmfs/bin/mmtrace start). Doing it this way avoids the need to 
change the mmsdrfs file, which may or may not be a benefit depending on 
your cluster size.
- run a command that watches for the event in question and, when 
triggered, runs /usr/lpp/mmfs/bin/mmdsh -N all /usr/lpp/mmfs/bin/mmtrace stop
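
For the "watch for the event" step, here is a minimal sketch of the idea, 
separate from the attached script. The function names count_matches and 
watch_and_run are made up for illustration, and the pattern, log path and 
polling interval are whatever fits your site. It counts matching lines in a 
log, polls until that count rises above its starting value, and then runs 
whatever command you hand it.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not part of GPFS.

# Print how many lines of a log file match a pattern (0 if unreadable).
count_matches() {
    pattern=$1
    logfile=$2
    n=$(grep -c -- "$pattern" "$logfile" 2>/dev/null)
    echo "${n:-0}"
}

# Poll until the match count rises above its starting value,
# then run the remaining arguments as a command.
watch_and_run() {
    pattern=$1
    logfile=$2
    interval=$3
    shift 3
    base=$(count_matches "$pattern" "$logfile")
    # -le is a numeric comparison, so 10 is correctly greater than 9.
    while [ "$(count_matches "$pattern" "$logfile")" -le "$base" ]; do
        sleep "$interval"
    done
    "$@"
}
```

It could then be called along the lines of: 
watch_and_run "return code 301" /var/log/messages 2 /usr/lpp/mmfs/bin/mmdsh -N all /usr/lpp/mmfs/bin/mmtrace stop 
Note that this simple count-based check does not account for log rotation.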

If the condition could present itself on multiple nodes in quick 
succession (as was the case for me) you could wrap the mmdsh for 
stopping tracing in an flock, using an arbitrary node that stores the 
lock locally:

ssh $stopHost flock -xn /tmp/mmfsTraceStopLock -c 
"'/usr/lpp/mmfs/bin/mmdsh -N all /usr/lpp/mmfs/bin/mmtrace stop'"

Wrapping it in an flock avoids multiple trace format attempts.
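
To illustrate why this works, here is a small local sketch (the wrapper 
name run_once is made up; flock is the util-linux command). With -x -n, 
flock takes an exclusive lock without waiting, so while one caller holds 
the lock any other caller fails immediately instead of running the command 
a second time.

```shell
#!/bin/sh
# Hypothetical wrapper around util-linux flock.
# Runs the given command only if nobody else holds the lock file.
# Returns flock's status: non-zero (1 by default) when the lock is busy,
# otherwise the exit status of the command itself.
run_once() {
    lockfile=$1
    shift
    flock -xn "$lockfile" "$@"
}
```

For example, if "run_once /tmp/demoLock sleep 30" is running in one shell, 
a second "run_once /tmp/demoLock echo hello" in another shell returns 
non-zero straight away and prints nothing. The lock is released when the 
first command exits, so this only de-duplicates callers that overlap in 
time.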

-Aaron

On 3/7/17 3:32 PM, Oesterlin, Robert wrote:
> I’m considering enabling trace on all nodes all the time, doing
> something like this:
>
>
>
> mmtracectl --set --trace=def --trace-recycle=global
> --tracedev-write-mode=overwrite --tracedev-overwrite-buffer-size=256M
> mmtracectl --start
>
>
>
> My questions are:
>
>
>
> - What is the performance penalty of leaving this on 100% of the time on
> a node?
>
> - Does anyone have any suggestions on automation on stopping trace when
> a particular event occurs?
>
> - What other issues, if any?
>
>
>
>
>
> Bob Oesterlin
> Sr Principal Storage Engineer, Nuance
> 507-269-0413

-- 
Aaron Knister
NASA Center for Climate Simulation (Code 606.2)
Goddard Space Flight Center
(301) 286-2776
-------------- next part --------------
#!/usr/bin/ksh
stopHost=loremds20
mmtrace=/usr/lpp/mmfs/bin/mmtrace
mmtracectl=/usr/lpp/mmfs/bin/mmtracectl
# No automatic start of mmtrace.
# Seconds to sleep between checks.
secondsToSleep=2

# Flag to know when tripped or stopped
tripped=0

# mmfs log file to monitor
logToGrep=/var/log/messages

# Path to mmfs bin directory
MMFSbin=/usr/lpp/mmfs/bin

# Trip file.  Will exist if trap is sprung
trapHasSprung=/tmp/mmfsTrapHasSprung

rm -f "$trapHasSprung"

# Start tracing on this node
#${mmtrace} start

# Initial count of forced-unmount (return code 301) messages in the log
baseCount=$(grep -c "unmounted by the system with return code 301 reason code" $logToGrep)

# do this loop while the trip file does not exist

while [[ ! -f $trapHasSprung ]]
do
  sleep $secondsToSleep

  # Get the current count of messages to check against the initial count.
  currentCount=$(grep -c "unmounted by the system with return code 301 reason code" $logToGrep)

  # Compare arithmetically; inside [[ ]], > is a string comparison.
  if (( currentCount > baseCount ))
  then
   tripped=1
   /usr/lpp/mmfs/bin/mmdsh -N managernodes,quorumnodes touch $trapHasSprung
   # cluster manager?
   #stopHost=$(/usr/lpp/mmfs/bin/tslsmgr | grep '^Cluster manager' | awk '{ print $NF }' | sed -e 's/[()]//g')
   ssh $stopHost flock -xn /tmp/mmfsTraceStopLock -c "'/usr/lpp/mmfs/bin/mmdsh -N all -f128 /usr/lpp/mmfs/bin/mmtrace stop noformat'"
  fi

done