[gpfsug-discuss] GUI reports erroneous NIC errors

Luke Raimbach luke.raimbach at googlemail.com
Mon Feb 19 12:16:43 GMT 2018


Hi GUI whizzes,

I have a couple of AFM nodes in my cluster with dual-port MLX cards for
RDMA.

Only the first port on the card is connected to the fabric and the cluster
configuration seems correct to me:

# mmlsconfig

---8<---
[nsdNodes]
verbsPorts mlx5_1/1
[afm]
verbsPorts mlx4_1/1
[afm,nsdNodes]
verbsRdma enable
--->8---

The cluster is working fine, and mmfs.log shows what I expect, i.e.
RDMA connections being made over the correct interfaces.

Nevertheless, the GUI tells me such lies as "Node Degraded" and
"ib_rdma_nic_unrecognized" for the second port on the card (which is not
connected and is not listed in verbsPorts). Event details are:

Event name: ib_rdma_nic_unrecognized
Component: Network
Entity type: Node
Entity name: afm01
Event time: 19/02/18 12:53:39
Message: IB RDMA NIC mlx4_1/2 was not recognized
Description: The specified IB RDMA NIC was not correctly recognized for usage by Spectrum Scale
Cause: The specified IB RDMA NIC is not reported in 'mmfsadm dump verbs'
User action: N/A
Reporting node: afm01
Event type: Active health state of an entity which is monitored by the system.
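
To make the mismatch concrete, here is a rough sketch of the check I would
expect the monitor to be doing: compare the port named in the event against
the verbsPorts configured for the node's class. It assumes the real
mmlsconfig output follows the same stanza layout as the excerpt above, and
the function names (parse_verbs_ports, port_is_configured) are made up for
illustration; they are not part of Spectrum Scale.

```python
def parse_verbs_ports(config_text):
    """Map each node-class stanza to the verbsPorts listed under it.

    Assumes the stanza layout from the mmlsconfig excerpt in this post:
    a "[class1,class2]" header followed by "attribute value ..." lines.
    """
    ports = {}
    current = []  # node classes named by the most recent [...] header
    for raw in config_text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            current = [c.strip() for c in line[1:-1].split(",")]
        elif line.startswith("verbsPorts"):
            values = line.split()[1:]
            for cls in current:
                ports.setdefault(cls, []).extend(values)
    return ports


def port_is_configured(ports, node_class, port):
    """True if the device/port pair is listed for the given node class."""
    return port in ports.get(node_class, [])


# Excerpt copied from the mmlsconfig output quoted above.
EXCERPT = """\
[nsdNodes]
verbsPorts mlx5_1/1
[afm]
verbsPorts mlx4_1/1
[afm,nsdNodes]
verbsRdma enable
"""

ports = parse_verbs_ports(EXCERPT)
print(port_is_configured(ports, "afm", "mlx4_1/1"))  # the connected port
print(port_is_configured(ports, "afm", "mlx4_1/2"))  # the port in the event
```

By this logic mlx4_1/2 is simply not a configured port on the afm nodes, so
I would not expect it to be health-checked at all.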

Naturally, the GUI is for those who like to see reports, and this incorrect
entry is likely to generate a lot of unwanted questions from them. How can I
bring the GUI reporting back in line with reality?

Thanks,
Luke.