[gpfsug-discuss] SS Metrics (Zimon) and SS GUI, Federation not working

Kristy Kallback-Rose kkr at lbl.gov
Thu May 25 22:51:32 BST 2017


Hi Michael, Norbert,

  Thanks for your replies. We did do all the setup as Michael described,
and stopped and restarted services more than once ;-). I believe the issue is
resolved with the PTF. I am still checking, but it seems to be working with
symmetric peering between those two nodes. I will test further, expand
to other nodes, and make sure it continues to work. I will report back if I
run into any other issues.

Cheers,
Kristy

On Thu, May 25, 2017 at 6:46 AM, Michael L Taylor <taylorm at us.ibm.com>
wrote:

> Hi Kristy,
> At first glance your config looks ok. Here are a few things to check.
>
> Is 4.2.3 the first time you have installed and configured performance
> monitoring? Or have you configured it at some version < 4.2.3 and then
> upgraded to 4.2.3?
>
> Did you restart pmcollector after changing the configuration?
> https://www.ibm.com/support/knowledgecenter/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/bl1adv_guienableperfmon.htm
> "Configure peer configuration for the collectors. The collector
> configuration is stored in the /opt/IBM/zimon/ZIMonCollector.cfg file.
> This file defines collector peer configuration and the aggregation rules.
> If you are using only a single collector, you can skip this step. Restart
> the pmcollector service after making changes to the configuration file. The
> GUI must have access to all data from each GUI node."
>
> Are the firewall ports open for performance monitoring and the management GUI?
> https://www.ibm.com/support/knowledgecenter/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/bl1adv_firewallforgui.htm?cp=STXKQY
> https://www.ibm.com/support/knowledgecenter/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/bl1adv_firewallforPMT.htm
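
A minimal sketch for checking port reachability from each node, assuming bash and the coreutils `timeout` command are available. The function name `check_port` is made up for illustration. The ports in the usage comment (4739 for sensors, 9085 for collector peers) are the ones that appear in the configs quoted in this thread; the firewall pages linked above list the full set.

```shell
#!/usr/bin/env bash
# check_port HOST PORT
# Tries to open a TCP connection through bash's /dev/tcp pseudo-device and
# reports whether it succeeded. "unreachable" covers refused, filtered, and
# timed-out connections alike; this sketch does not tell them apart.
check_port() {
    local host=$1 port=$2
    if timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
        echo "${host}:${port} open"
    else
        echo "${host}:${port} unreachable"
    fi
}

# Example, using the placeholder host names from this thread:
#   for h in hostA hostB; do
#       for p in 4739 9085; do check_port "$h" "$p"; done
#   done
```

Running the loop from both nodes matters, since a peer link needs the port open in each direction.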
>
> Did you set up the collectors with:
> prompt# mmperfmon config generate --collectors collector1.domain.com,collector2.domain.com,…
>
> Once the configuration file has been stored within IBM Spectrum Scale, it
> can be activated as follows:
> prompt# mmchnode --perfmon -N nodeclass1,nodeclass2,…
>
> Perhaps, once you have made sure that federated mode is set up between hostA
> and hostB as you like, run 'systemctl restart pmcollector' and then
> 'systemctl restart gpfsgui' on both nodes?
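
The restart order suggested above could be sketched as a dry run that only prints the commands. The helper name `restart_plan` is hypothetical, reaching the nodes over ssh is an assumption about the environment, and restarting every collector before any GUI is one reading of the suggestion rather than a documented requirement.

```shell
#!/usr/bin/env bash
# restart_plan NODE...
# Prints (does not run) the restart commands: pmcollector on every node
# first, then gpfsgui on every node. Review the output before piping it to
# a shell. Nodes without a GUI can simply be dropped from the second pass.
restart_plan() {
    local node
    for node in "$@"; do
        printf 'ssh %s systemctl restart pmcollector\n' "$node"
    done
    for node in "$@"; do
        printf 'ssh %s systemctl restart gpfsgui\n' "$node"
    done
}

restart_plan hostA hostB
```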
>
>
>
> From: gpfsug-discuss-request at spectrumscale.org
> To: gpfsug-discuss at spectrumscale.org
> Date: 05/24/2017 12:58 PM
> Subject: gpfsug-discuss Digest, Vol 64, Issue 61
> Sent by: gpfsug-discuss-bounces at spectrumscale.org
> ------------------------------
>
>
>
> Today's Topics:
>
>   1. SS Metrics (Zimon) and SS GUI, Federation not working
>      (Kristy Kallback-Rose)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Wed, 24 May 2017 12:57:49 -0700
> From: Kristy Kallback-Rose <kkr at lbl.gov>
> To: gpfsug-discuss at spectrumscale.org
> Subject: [gpfsug-discuss] SS Metrics (Zimon) and SS GUI, Federation
> not working
> Message-ID:
> <CAA9oNus2BRyJcQEHXa7j1Vmz_Z6swTwRDatMw93P0_sD8X76vg at mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
>
> Hello,
>
>  We have been experimenting with Zimon and the SS GUI on our dev cluster
> under 4.2.3. Things work well with one collector, but I'm running into
> issues when trying to use symmetric collector peers, i.e. federation.
>
>  hostA and hostB are each set up as both collector and sensor, and each is
> a collector peer for the other. When this is done I can use mmperfmon to
> query hostA from hostA or hostB and vice versa. However, with this
> federation setup, the GUI fails to show data. The GUI is running on hostB.
> From the collector candidate pool, hostA has been selected (automatically,
> not manually), as can be seen in the sensor configuration file. The GUI is
> unable to load data (it just shows "Loading" on the graph) *unless* I change
> the setting of the ZIMonAddress variable in
> /usr/lpp/mmfs/gui/conf/gpfsgui.properties
> from localhost to hostA explicitly. It does not work if I change it to
> hostB explicitly. The GUI also works fine if I remove the peer entries
> altogether and just have one collector.
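
To see which collector the GUI is currently pointed at, something like the following sketch may help. The function name `zimon_address` is invented here, and the code assumes the properties file uses plain `key=value` (or `key = value`) lines; the default path and the `ZIMonAddress` key are the ones named above.

```shell
#!/usr/bin/env bash
# zimon_address [FILE]
# Prints the value assigned to ZIMonAddress in a properties-style file.
# Assumption: one "ZIMonAddress=value" line, optionally padded with spaces.
zimon_address() {
    local file=${1:-/usr/lpp/mmfs/gui/conf/gpfsgui.properties}
    sed -n 's/^[[:space:]]*ZIMonAddress[[:space:]]*=[[:space:]]*//p' "$file"
}

# Example:
#   zimon_address            # reads the default path
#   zimon_address /tmp/copy-of-gpfsgui.properties
```

Comparing this value on the GUI node before and after a change makes it easier to confirm which setting was actually in effect when the graphs loaded.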
>
>  I thought that federation meant that no matter which collector was
> queried the data would be returned. This appears to work for mmperfmon, but
> not the GUI. Can anyone advise? I also don't like the idea of having a pool
> of collector candidates and hard-coding one into the GUI configuration. I
> am including some output below to show the configs and query results.
>
> Thanks,
>
> Kristy
>
>
>  The peers are added into the ZIMonCollector.cfg using the default port
> 9085:
>
> peers = {
>        host = "hostA"
>        port = "9085"
> },
> {
>        host = "hostB"
>        port = "9085"
> }
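
When comparing the peer list across collectors, a small awk filter can pull the peer host names out of each node's config. This is a sketch only: `peer_hosts` is a made-up name, and it assumes the file follows the layout quoted above, where the peers section ends at the first closing brace that is not followed by a comma.

```shell
#!/usr/bin/env bash
# peer_hosts FILE
# Prints one peer host per line from a ZIMonCollector.cfg-style file.
# Starts collecting at the "peers =" line and stops at the first line that
# consists of a lone "}" (peer blocks in between end with "},").
peer_hosts() {
    awk '
        /^[[:space:]]*peers[[:space:]]*=/ { in_peers = 1 }
        in_peers && /host[[:space:]]*=/ {
            # Print the text between the first pair of double quotes.
            if (match($0, /"[^"]*"/))
                print substr($0, RSTART + 1, RLENGTH - 2)
        }
        in_peers && /^[[:space:]]*[}][[:space:]]*$/ { in_peers = 0 }
    ' "$1"
}

# Example: peer_hosts /opt/IBM/zimon/ZIMonCollector.cfg
```

With symmetric peering, the output should be the same on every collector.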
>
>
> And the nodes are added as collector candidates. On hostA and hostB,
> looking at the config file /opt/IBM/zimon/ZIMonSensors.cfg directly, you see:
>
> colCandidates = "hostA.nersc.gov", "hostB.nersc.gov"
> colRedundancy = 1
> collectors = {
> host = "hostA.nersc.gov"
> port = "4739"
> }
>
>
> Showing the config with mmperfmon config show:
>
> colCandidates = "hostA.nersc.gov", "hostB.nersc.gov"
> colRedundancy = 1
> collectors = {
> host = ""
>
>
> Using mmperfmon I can query either host.
>
>
> [root@hostA ~]# mmperfmon query cpu -N hostB
>
> Legend:
> 1: hostB.nersc.gov|CPU|cpu_system
> 2: hostB.nersc.gov|CPU|cpu_user
> 3: hostB.nersc.gov|CPU|cpu_contexts
>
> Row           Timestamp cpu_system cpu_user cpu_contexts
>  1 2017-05-23-17:03:54       0.54     3.67         4961
>  2 2017-05-23-17:03:55       0.63     3.55         6199
>  3 2017-05-23-17:03:56       1.59     3.76         7914
>  4 2017-05-23-17:03:57       1.38     5.34         5393
>  5 2017-05-23-17:03:58       0.54     2.21         2435
>  6 2017-05-23-17:03:59       0.13     0.29         2519
>  7 2017-05-23-17:04:00       0.13     0.25         2197
>  8 2017-05-23-17:04:01       0.13     0.29         2473
>  9 2017-05-23-17:04:02       0.08     0.21         2336
> 10 2017-05-23-17:04:03       0.13     0.21         2312
>
>
> [root@hostB ~]# mmperfmon query cpu -N hostB
>
> Legend:
> 1: hostB.nersc.gov|CPU|cpu_system
> 2: hostB.nersc.gov|CPU|cpu_user
> 3: hostB.nersc.gov|CPU|cpu_contexts
>
> Row           Timestamp cpu_system cpu_user cpu_contexts
>  1 2017-05-23-17:04:07       0.13     0.21         2010
>  2 2017-05-23-17:04:08       0.04     0.21         2571
>  3 2017-05-23-17:04:09       0.08     0.25         2766
>  4 2017-05-23-17:04:10       0.13     0.29         3147
>  5 2017-05-23-17:04:11       0.83     0.83         2596
>  6 2017-05-23-17:04:12       0.33     0.54         2530
>  7 2017-05-23-17:04:13       0.08     0.33         2428
>  8 2017-05-23-17:04:14       0.13     0.25         2326
>  9 2017-05-23-17:04:15       0.13     0.29         4190
> 10 2017-05-23-17:04:16       0.58     1.92         5882
>
>
> [root@hostB ~]# mmperfmon query cpu -N hostA
>
> Legend:
> 1: hostA.nersc.gov|CPU|cpu_system
> 2: hostA.nersc.gov|CPU|cpu_user
> 3: hostA.nersc.gov|CPU|cpu_contexts
>
> Row           Timestamp cpu_system cpu_user cpu_contexts
>  1 2017-05-23-17:05:45       0.33     0.46         7460
>  2 2017-05-23-17:05:46       0.33     0.42         8993
>  3 2017-05-23-17:05:47       0.42     0.54         8709
>  4 2017-05-23-17:05:48       0.38      0.5         5923
>  5 2017-05-23-17:05:49       0.54     1.46         7381
>  6 2017-05-23-17:05:50       0.58     3.51        10381
>  7 2017-05-23-17:05:51       1.05     1.13        10995
>  8 2017-05-23-17:05:52       0.88     0.92        10855
>  9 2017-05-23-17:05:53        0.5     0.63        10958
> 10 2017-05-23-17:05:54        0.5     0.59        10285
>
>
> [root@hostA ~]# mmperfmon query cpu -N hostA
>
> Legend:
> 1: hostA.nersc.gov|CPU|cpu_system
> 2: hostA.nersc.gov|CPU|cpu_user
> 3: hostA.nersc.gov|CPU|cpu_contexts
>
> Row           Timestamp cpu_system cpu_user cpu_contexts
>  1 2017-05-23-17:05:50       0.58     3.51        10381
>  2 2017-05-23-17:05:51       1.05     1.13        10995
>  3 2017-05-23-17:05:52       0.88     0.92        10855
>  4 2017-05-23-17:05:53        0.5     0.63        10958
>  5 2017-05-23-17:05:54        0.5     0.59        10285
>  6 2017-05-23-17:05:55       0.46     0.63        11621
>  7 2017-05-23-17:05:56       0.84     0.92        11477
>  8 2017-05-23-17:05:57       1.47     1.88        11084
>  9 2017-05-23-17:05:58       0.46     1.76         9125
> 10 2017-05-23-17:05:59       0.42     0.63        11745
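
For comparing what different collectors return, the query output above can be reduced to a single number per column. The sketch below is illustrative only: `avg_column` is a made-up helper, and it assumes data rows look like the listings above (a row number, a `YYYY-MM-DD-HH:MM:SS` timestamp, then the metric columns).

```shell
#!/usr/bin/env bash
# avg_column [N]
# Reads `mmperfmon query` output on stdin and prints the mean of field N
# (default 3). With the cpu query shown above, field 3 is cpu_system,
# field 4 is cpu_user, and field 5 is cpu_contexts.
# Legend and header lines are skipped because their first field is not a
# bare row number.
avg_column() {
    awk -v col="${1:-3}" '
        $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+-[0-9]+-[0-9]+-[0-9:]+$/ {
            sum += $col
            n++
        }
        END { if (n) printf "%.2f\n", sum / n }
    '
}

# Example: mmperfmon query cpu -N hostB | avg_column 4
```

Consecutive queries cover different time windows, so averages from two collectors will only match approximately, even when federation is working.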
>
> ------------------------------
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
>
>
>