[gpfsug-discuss] SS Metrics (Zimon) and SS GUI, Federation not working

Michael L Taylor taylorm at us.ibm.com
Thu May 25 14:46:06 BST 2017


Hi Kristy,
At first glance your config looks ok.  Here are a few things to check.

Is 4.2.3 the first time you have installed and configured performance
monitoring?  Or have you configured it at some version < 4.2.3 and then
upgraded to 4.2.3?

Did you restart pmcollector after changing the configuration?
https://www.ibm.com/support/knowledgecenter/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/bl1adv_guienableperfmon.htm
"Configure peer configuration for the collectors. The collector
configuration is stored in the /opt/IBM/zimon/ZIMonCollector.cfg file. This
file defines collector peer configuration and the aggregation rules. If you
are using only a single collector, you can skip this step. Restart the
pmcollector service after making changes to the configuration file. The GUI
must have access to all data from each GUI node. "
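
For example (a minimal sketch; pmcollector runs as a systemd service on
current distributions), on each collector node after editing the file:

prompt# systemctl restart pmcollector
prompt# systemctl status pmcollector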

Are the firewall ports open for performance monitoring and the management GUI?
https://www.ibm.com/support/knowledgecenter/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/bl1adv_firewallforgui.htm?cp=STXKQY
https://www.ibm.com/support/knowledgecenter/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/bl1adv_firewallforPMT.htm
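
For example, with firewalld, something along these lines on each collector
node (a sketch: 4739 and 9085 are the defaults shown in your config below;
9084 is, as far as I recall, the default query port the GUI uses; see the
links above for the authoritative list):

prompt# firewall-cmd --permanent --add-port=4739/tcp
prompt# firewall-cmd --permanent --add-port=9084/tcp
prompt# firewall-cmd --permanent --add-port=9085/tcp
prompt# firewall-cmd --reload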

Did you set up the collectors with:
prompt# mmperfmon config generate --collectors
collector1.domain.com,collector2.domain.com,…

Once the configuration file has been stored within IBM Spectrum Scale, it
can be activated as follows.
prompt# mmchnode --perfmon -N nodeclass1,nodeclass2,…
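
You can verify what was stored with:

prompt# mmperfmon config show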

Perhaps once you have confirmed that federation is set up between hostA and
hostB the way you want it, run 'systemctl restart pmcollector' and then
'systemctl restart gpfsgui' on both nodes?
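
That is, on both hostA and hostB, collector first:

prompt# systemctl restart pmcollector
prompt# systemctl restart gpfsgui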





From:	gpfsug-discuss-request at spectrumscale.org
To:	gpfsug-discuss at spectrumscale.org
Date:	05/24/2017 12:58 PM
Subject:	gpfsug-discuss Digest, Vol 64, Issue 61
Sent by:	gpfsug-discuss-bounces at spectrumscale.org




Today's Topics:

   1. SS Metrics (Zimon) and SS GUI, Federation not working
      (Kristy Kallback-Rose)


----------------------------------------------------------------------

Message: 1
Date: Wed, 24 May 2017 12:57:49 -0700
From: Kristy Kallback-Rose <kkr at lbl.gov>
To: gpfsug-discuss at spectrumscale.org
Subject: [gpfsug-discuss] SS Metrics (Zimon) and SS GUI, Federation not working

Hello,

  We have been experimenting with Zimon and the SS GUI on our dev cluster
under 4.2.3. Things work well with one collector, but I'm running into
issues when trying to use symmetric collector peers, i.e. federation.

  hostA and hostB are set up as both collectors and sensors, each with the
other as a collector peer. With this in place I can use mmperfmon to query
hostA from hostA or hostB, and vice versa. However, with this federation
setup, the GUI fails to show data. The GUI is running on hostB. From the
collector candidate pool, hostA has been selected (automatically, not
manually), as can be seen in the sensor configuration file. The GUI is
unable to load data (it just shows "Loading" on the graph) *unless* I change
the ZIMonAddress variable in /usr/lpp/mmfs/gui/conf/gpfsgui.properties
from localhost to hostA explicitly; it does not work if I change it to
hostB explicitly. The GUI also works fine if I remove the peer entries
altogether and just have one collector.
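
  For reference, the workaround is a change along these lines in
gpfsgui.properties (shown as the property name appears in my copy of the
file, with hostA pinned explicitly), followed by a restart of the gpfsgui
service:

ZIMonAddress = hostA.nersc.gov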

  I thought that federation meant that no matter which collector was
queried, the data would be returned. This appears to work for mmperfmon,
but not for the GUI. Can anyone advise? I also don't like the idea of
having a pool of collector candidates and hard-coding one into the GUI
configuration. I am including some output below to show the configs and
query results.

Thanks,

Kristy


  The peers are added into the ZIMonCollector.cfg using the default port
9085:

 peers = {
     host = "hostA"
     port = "9085"
 },
 {
     host = "hostB"
     port = "9085"
 }


And the nodes are added as collector candidates; looking directly at the
config file /opt/IBM/zimon/ZIMonSensors.cfg on hostA and hostB, you see:

colCandidates = "hostA.nersc.gov", "hostB.nersc.gov"
colRedundancy = 1
collectors = {
    host = "hostA.nersc.gov"
    port = "4739"
}


Showing the config with mmperfmon config show:

colCandidates = "hostA.nersc.gov", "hostB.nersc.gov"
colRedundancy = 1
collectors = {
    host = ""


Using mmperfmon I can query either host.


[root@hostA ~]# mmperfmon query cpu -N hostB

Legend:
 1: hostB.nersc.gov|CPU|cpu_system
 2: hostB.nersc.gov|CPU|cpu_user
 3: hostB.nersc.gov|CPU|cpu_contexts

Row           Timestamp cpu_system cpu_user cpu_contexts
  1 2017-05-23-17:03:54       0.54     3.67         4961
  2 2017-05-23-17:03:55       0.63     3.55         6199
  3 2017-05-23-17:03:56       1.59     3.76         7914
  4 2017-05-23-17:03:57       1.38     5.34         5393
  5 2017-05-23-17:03:58       0.54     2.21         2435
  6 2017-05-23-17:03:59       0.13     0.29         2519
  7 2017-05-23-17:04:00       0.13     0.25         2197
  8 2017-05-23-17:04:01       0.13     0.29         2473
  9 2017-05-23-17:04:02       0.08     0.21         2336
 10 2017-05-23-17:04:03       0.13     0.21         2312


[root@hostB ~]# mmperfmon query cpu -N hostB

Legend:
 1: hostB.nersc.gov|CPU|cpu_system
 2: hostB.nersc.gov|CPU|cpu_user
 3: hostB.nersc.gov|CPU|cpu_contexts

Row           Timestamp cpu_system cpu_user cpu_contexts
  1 2017-05-23-17:04:07       0.13     0.21         2010
  2 2017-05-23-17:04:08       0.04     0.21         2571
  3 2017-05-23-17:04:09       0.08     0.25         2766
  4 2017-05-23-17:04:10       0.13     0.29         3147
  5 2017-05-23-17:04:11       0.83     0.83         2596
  6 2017-05-23-17:04:12       0.33     0.54         2530
  7 2017-05-23-17:04:13       0.08     0.33         2428
  8 2017-05-23-17:04:14       0.13     0.25         2326
  9 2017-05-23-17:04:15       0.13     0.29         4190
 10 2017-05-23-17:04:16       0.58     1.92         5882


[root@hostB ~]# mmperfmon query cpu -N hostA

Legend:
 1: hostA.nersc.gov|CPU|cpu_system
 2: hostA.nersc.gov|CPU|cpu_user
 3: hostA.nersc.gov|CPU|cpu_contexts

Row           Timestamp cpu_system cpu_user cpu_contexts
  1 2017-05-23-17:05:45       0.33     0.46         7460
  2 2017-05-23-17:05:46       0.33     0.42         8993
  3 2017-05-23-17:05:47       0.42     0.54         8709
  4 2017-05-23-17:05:48       0.38      0.5         5923
  5 2017-05-23-17:05:49       0.54     1.46         7381
  6 2017-05-23-17:05:50       0.58     3.51        10381
  7 2017-05-23-17:05:51       1.05     1.13        10995
  8 2017-05-23-17:05:52       0.88     0.92        10855
  9 2017-05-23-17:05:53        0.5     0.63        10958
 10 2017-05-23-17:05:54        0.5     0.59        10285


[root@hostA ~]# mmperfmon query cpu -N hostA

Legend:
 1: hostA.nersc.gov|CPU|cpu_system
 2: hostA.nersc.gov|CPU|cpu_user
 3: hostA.nersc.gov|CPU|cpu_contexts

Row           Timestamp cpu_system cpu_user cpu_contexts
  1 2017-05-23-17:05:50       0.58     3.51        10381
  2 2017-05-23-17:05:51       1.05     1.13        10995
  3 2017-05-23-17:05:52       0.88     0.92        10855
  4 2017-05-23-17:05:53        0.5     0.63        10958
  5 2017-05-23-17:05:54        0.5     0.59        10285
  6 2017-05-23-17:05:55       0.46     0.63        11621
  7 2017-05-23-17:05:56       0.84     0.92        11477
  8 2017-05-23-17:05:57       1.47     1.88        11084
  9 2017-05-23-17:05:58       0.46     1.76         9125
 10 2017-05-23-17:05:59       0.42     0.63        11745