[gpfsug-discuss] Thousands of CLOSE_WAIT connections

Simon Thompson (IT Research Support) S.J.Thompson at bham.ac.uk
Fri Jun 15 17:17:48 BST 2018


This:
“2018-06-14 08:04:06.341564 CET ib_rdma_nic_unrecognized ERROR IB RDMA NIC mlx5_0/1 was not recognized”

Looks like you are telling GPFS to use an MLX card that doesn’t exist on the node. This is set with verbsPorts. It’s probably not your issue here, but you are better off using nodeclasses and assigning the config option only to those nodeclasses that have the correct card installed. I’d also encourage you to use a fabric number; we do this even if there is only 1 fabric currently in the cluster, as we’ve added other fabrics over time or over multiple locations.
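
As a rough illustration of the check behind that error, here is a minimal Python sketch that compares a verbsPorts-style value against the InfiniBand devices the Linux kernel exposes under /sys/class/infiniband. It is not part of GPFS. The function names are made up for this example, and it assumes the value is a space-separated list of device/port/fabric entries in which port and fabric may be omitted.

```python
import os


def parse_verbs_ports(value):
    """Split a verbsPorts-style string into (device, port, fabric) tuples.

    Assumption: entries are separated by whitespace and look like
    "device", "device/port" or "device/port/fabric".
    """
    entries = []
    for item in value.split():
        parts = item.split("/")
        device = parts[0]
        port = parts[1] if len(parts) > 1 else None
        fabric = parts[2] if len(parts) > 2 else None
        entries.append((device, port, fabric))
    return entries


def unrecognized_ports(value, sysfs_root="/sys/class/infiniband"):
    """Return the configured device[/port] entries that are absent on this node."""
    missing = []
    for device, port, _fabric in parse_verbs_ports(value):
        path = os.path.join(sysfs_root, device)
        if port is not None:
            # Each port of a device appears as <device>/ports/<port> in sysfs.
            path = os.path.join(path, "ports", port)
        if not os.path.isdir(path):
            missing.append(f"{device}/{port}" if port is not None else device)
    return missing


if __name__ == "__main__":
    # Example value only; substitute whatever verbsPorts is set to on your cluster.
    print(unrecognized_ports("mlx5_0/1/1"))
```

Run on each node with that node's configured value, an empty list means every listed device and port is present; anything returned is a candidate for the "was not recognized" message and a hint that the setting should be scoped to a nodeclass.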

Have you tried using mmnetverify at all? It’s been getting better in the newer releases and will give you a good indication if you have a comms issue due to something like name resolution in addition to testing between nodes…

Simon

From: <gpfsug-discuss-bounces at spectrumscale.org> on behalf of "cabrillo at ifca.unican.es" <cabrillo at ifca.unican.es>
Reply-To: "gpfsug-discuss at spectrumscale.org" <gpfsug-discuss at spectrumscale.org>
Date: Friday, 15 June 2018 at 16:16
To: "gpfsug-discuss at spectrumscale.org" <gpfsug-discuss at spectrumscale.org>
Subject: Re: [gpfsug-discuss] Thousands of CLOSE_WAIT connections

2018-06-14 08:04:06.341564 CET ib_rdma_nic_unrecognized ERROR IB RDMA NIC mlx5_0/1 was not recognized

