[gpfsug-discuss] Token manager - how to monitor performance?

Tomer Perry TOMP at il.ibm.com
Thu Jan 31 15:11:24 GMT 2019


Hi,

I agree that we should potentially add more metrics, but for a start, I 
would look into mmdiag --memory and mmdiag --tokenmgr (the latter shows 
different output on a token server).
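
As a rough sketch of how the output of those two commands could be fed 
into influxdb or similar: the Python below runs mmdiag, keeps only lines 
that look like "label: number" or "label = number", and prints them in 
InfluxDB line protocol. The line shape is an assumption, not the 
documented mmdiag format, so the regular expression will almost certainly 
need adapting to what your release prints. The install path, the 
measurement names (gpfs_tokenmgr, gpfs_memory) and the host tag are 
likewise only illustrative.

```python
import re
import socket
import subprocess
import time

# Assumption: default Spectrum Scale install path; adjust if yours differs.
MMDIAG = "/usr/lpp/mmfs/bin/mmdiag"

# Assumption: only lines shaped like "label: 123" or "label = 123" are kept.
# Real mmdiag output is not guaranteed to follow this shape.
_FIELD = re.compile(r"^\s*([A-Za-z][\w .-]*?)\s*[:=]\s*(\d+(?:\.\d+)?)\s*$")


def parse_numeric_fields(text):
    """Return {field_name: numeric_string} for every 'label: number' line."""
    fields = {}
    for line in text.splitlines():
        m = _FIELD.match(line)
        if m:
            # Turn the label into a safe field key, e.g. "token count" -> "token_count".
            key = re.sub(r"\W+", "_", m.group(1).strip().lower())
            fields[key] = m.group(2)
    return fields


def to_line_protocol(measurement, host, fields, timestamp_ns):
    """Format one InfluxDB line-protocol record (values are written as floats)."""
    if not fields:
        raise ValueError("line protocol needs at least one field")
    body = ",".join(f"{k}={v}" for k, v in fields.items())
    return f"{measurement},host={host} {body} {timestamp_ns}"


def collect(option):
    """Run 'mmdiag <option>' and parse whatever numeric lines it prints."""
    result = subprocess.run(
        [MMDIAG, option], capture_output=True, text=True, check=True
    )
    return parse_numeric_fields(result.stdout)


def main():
    # Hypothetical measurement names; pick whatever suits your schema.
    host = socket.gethostname()
    for option, measurement in (
        ("--tokenmgr", "gpfs_tokenmgr"),
        ("--memory", "gpfs_memory"),
    ):
        fields = collect(option)
        if fields:
            print(to_line_protocol(measurement, host, fields, time.time_ns()))
```

Calling main() from cron or a collector's exec-style input on each token 
server and client would give a time series per node. Which of the 
resulting fields are actually worth alerting on still has to be decided 
by looking at the real output.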


Regards,

Tomer Perry
Scalable I/O Development (Spectrum Scale)
email: tomp at il.ibm.com
1 Azrieli Center, Tel Aviv 67021, Israel
Global Tel:    +1 720 3422758
Israel Tel:      +972 3 9188625
Mobile:         +972 52 2554625




From:   "Billich Heinrich Rainer (PSI)" <heiner.billich at psi.ch>
To:     gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Date:   31/01/2019 16:56
Subject:        [gpfsug-discuss] Token manager - how to monitor 
performance?
Sent by:        gpfsug-discuss-bounces at spectrumscale.org



Hello,
Sorry for coming up with this never-ending story. I know that token 
management is mainly autoconfigured and even the placement of token 
manager nodes is no longer under user control in all cases. Still, I would 
like to monitor this component to see if we are close to some limit like 
memory or RPC rate, especially as we'll make some major changes to our 
setup soon.
I would like to monitor the performance of our token manager nodes to get 
warned _before_ we get performance issues. Any advice is welcome. 
Ideally I would like to collect some numbers and pass them on to influxdb 
or similar. I didn't find anything in perfmon/zimon that seemed to match. I 
could imagine that numbers like "number of active tokens" and "number of 
token operations" per manager would be helpful. Or "# of rpc calls per 
second". And maybe "number of open files", "number of token operations", 
"number of tokens" for clients. And maybe some percentage of used token 
memory, and cache hit ratio?
This would also help with tuning: if a client does very many token 
operations or RPC calls, maybe I should increase maxFilesToCache. 
The above is just to illustrate; as token management is complicated, the 
really valuable metrics may be different.
Or am I too anxious and should wait and see instead?
cheers,
Heiner
 
Heiner Billich
--
Paul Scherrer Institut
Heiner Billich 
System Engineer Scientific Computing
Science IT / High Performance Computing 
WHGA/106 
Forschungsstrasse 111
5232 Villigen PSI
Switzerland
 
Phone +41 56 310 36 02
heiner.billich at psi.ch 
https://www.psi.ch
 
 
 