[gpfsug-discuss] Token manager - how to monitor performance?

Billich Heinrich Rainer (PSI) heiner.billich at psi.ch
Thu Jan 31 14:56:21 GMT 2019


Hello,
Sorry for bringing up this never-ending story again. I know that token management is mostly autoconfigured and that even the placement of token manager nodes is no longer under user control in all cases. Still, I would like to monitor this component to see whether we are close to some limit such as memory or RPC rate, especially as we’ll make some major changes to our setup soon.
I would like to monitor the performance of our token manager nodes so that we are warned _before_ we run into performance issues. Any advice is welcome.
Ideally I would like to collect some numbers and pass them on to influxdb or similar. I didn’t find anything in perfmon/zimon that seemed to match. I could imagine that numbers like “number of active tokens” and “number of token operations” per manager would be helpful, or “# of rpc calls per second”. And maybe “number of open files”, “number of token operations”, “number of tokens” for clients. And maybe some percentage of used token memory … and cache hit ratio …
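To make the "pass them on to influxdb" part concrete, here is a minimal sketch in Python of how such numbers could be rendered as InfluxDB line protocol once they are available. It does not show how to obtain them from GPFS, since that is exactly the open question. The measurement name gpfs_token, the tags node and role, and the field names are hypothetical placeholders, not real GPFS or zimon metric names.

```python
import time


def _escape(value, chars):
    """Backslash-escape the given special characters for line protocol."""
    text = str(value)
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def to_line_protocol(measurement, tags, fields, timestamp_ns=None):
    """Render one sample as an InfluxDB line protocol string.

    All names passed in here are illustrative; nothing in this function
    knows about GPFS. Integers get the 'i' suffix, floats are written as-is.
    """
    if not fields:
        raise ValueError("line protocol needs at least one field")

    # Tag keys and values must escape commas, equals signs and spaces.
    tag_part = "".join(
        f",{_escape(k, ',= ')}={_escape(v, ',= ')}"
        for k, v in sorted(tags.items())
    )

    rendered = []
    for key, value in sorted(fields.items()):
        if isinstance(value, bool):
            text = "true" if value else "false"
        elif isinstance(value, int):
            text = f"{value}i"
        elif isinstance(value, float):
            text = repr(value)
        else:
            raise TypeError(f"unsupported field type for {key!r}")
        rendered.append(f"{_escape(key, ',= ')}={text}")

    if timestamp_ns is None:
        timestamp_ns = time.time_ns()  # nanoseconds, the default precision

    # Measurement names only need commas and spaces escaped.
    return f"{_escape(measurement, ', ')}{tag_part} {','.join(rendered)} {timestamp_ns}"


# Hypothetical sample for one token manager node.
line = to_line_protocol(
    "gpfs_token",
    {"node": "tm01", "role": "manager"},
    {"active_tokens": 1200, "token_mem_used_pct": 42.5},
    timestamp_ns=1548946581000000000,
)
print(line)
```

A line like this could then be sent to the InfluxDB write endpoint or handed to an existing collector; which one to use is left open here.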
This would also help with tuning: if a client performs a very large number of token operations or RPC calls, maybe I should increase maxFilesToCache.
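As a rough illustration of that tuning idea, the sketch below turns two samples of a monotonically increasing per-client counter into a rate per second and lists the clients above a threshold. It assumes such a counter (for example, total token operations per client) can be sampled at all, which I have not confirmed. The node names and the threshold are made up.

```python
def rate_per_second(prev, curr):
    """Rate between two (timestamp_seconds, counter_value) samples.

    Returns None when the counter went backwards, which is treated here
    as a reset (for example after a daemon restart).
    """
    (t0, c0), (t1, c1) = prev, curr
    dt = t1 - t0
    if dt <= 0:
        raise ValueError("samples must be in increasing time order")
    if c1 < c0:
        return None
    return (c1 - c0) / dt


def busy_clients(prev_samples, curr_samples, threshold):
    """Map node name to rate for clients whose rate exceeds the threshold.

    Both arguments are dicts of node name -> (timestamp, counter).
    Nodes without an earlier sample, or with a counter reset, are skipped.
    """
    busy = {}
    for node, curr in curr_samples.items():
        prev = prev_samples.get(node)
        if prev is None:
            continue
        rate = rate_per_second(prev, curr)
        if rate is not None and rate > threshold:
            busy[node] = rate
    return busy


# Hypothetical samples taken 10 seconds apart.
prev = {"c1": (0.0, 1000), "c2": (0.0, 500), "c3": (0.0, 900)}
curr = {"c1": (10.0, 6000), "c2": (10.0, 600), "c3": (10.0, 100)}
flagged = busy_clients(prev, curr, threshold=100.0)
print(flagged)
```

Clients that show up here repeatedly would be the candidates for a closer look at maxFilesToCache; what threshold is sensible is something I cannot say yet.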
The above is just an illustration; as token management is complicated, the really valuable metrics may be different.
Or am I too anxious and should wait and see instead?
cheers,
Heiner

--
Paul Scherrer Institut
Heiner Billich
System Engineer Scientific Computing
Science IT / High Performance Computing
WHGA/106
Forschungsstrasse 111
5232 Villigen PSI
Switzerland

Phone +41 56 310 36 02
heiner.billich at psi.ch
https://www.psi.ch



