[gpfsug-discuss] Infiniband: device mlx4_0 not found

Josh Catana jcatana at gmail.com
Sun Jun 18 16:30:55 BST 2017


Are any of the cards VPI cards that can do both Ethernet and IB? I remember
reading in the documentation that there is a bus order to having mixed media
with Mellanox cards. There is a module setting during init where you can set
eth, ib, or auto-detect. If the card is on auto it might be coming up eth and
making the driver flake out because it's in the wrong order. I'm responding
from my phone so I can't really look up what the proper order is right now,
but maybe this is some help with troubleshooting.
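
If it helps, on mlx4 (ConnectX VPI) cards the port type can usually be
checked and forced through sysfs; the PCI address below is only a placeholder
for whatever your card enumerates as, so treat this as a rough sketch from
memory:

# show the current port type: "ib", "eth" or "auto"
cat /sys/bus/pci/devices/0000:0b:00.0/mlx4_port1
# force the port to InfiniBand for this boot
echo ib > /sys/bus/pci/devices/0000:0b:00.0/mlx4_port1

On RHEL/CentOS the rdma service can make the choice persistent through
/etc/rdma/mlx4.conf, if I remember correctly.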

On Jun 18, 2017 12:58 AM, "Frank Tower" <frank.tower at outlook.com> wrote:

> Hi,
>
>
> You were right: ibv_devinfo -v doesn't return anything if both cards are
> connected. I hadn't checked the ibv_* tools; I assumed that once the IP
> stack and ibstat were OK, the rest would work. I'm stupid 😊
>
>
> Anyway, once I disconnect one card, ibv_devinfo shows output, but with
> both cards connected I get nothing except "device not found".
>
> And what is weird here is that it works only when one card is connected,
> no matter which card (both are the same model, firmware, revision and
> vendor)... Really strange, I will dig into the issue some more.
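>
> When I dig, I will probably start by comparing the two devices one by one,
> something along these lines (device names taken from ibstat):
>
> ibv_devices
> ibv_devinfo -d mlx4_0 -v
> ibv_devinfo -d mlx4_1 -v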
>
>
> Stupid and bad workaround: connect a dual-port InfiniBand card instead.
> But the production system doesn't wait...
>
>
> Thanks for your help,
> Frank
>
> ------------------------------
> *From:* Aaron Knister <aaron.knister at gmail.com>
> *Sent:* Saturday, June 10, 2017 2:05 PM
> *To:* gpfsug main discussion list
> *Subject:* Re: [gpfsug-discuss] Infiniband: device mlx4_0 not found
>
> Out of curiosity, could you send us the output of "ibv_devinfo -v"?
>
> -Aaron
>
> Sent from my iPhone
>
> On Jun 10, 2017, at 06:55, Frank Tower <frank.tower at outlook.com> wrote:
>
> Hi everybody,
>
>
> I don't get why one of our compute nodes cannot start GPFS over IB.
>
>
> I have the following error:
>
>
> [I] VERBS RDMA starting with verbsRdmaCm=no verbsRdmaSend=no
> verbsRdmaUseMultiCqThreads=yes verbsRdmaUseCompVectors=yes
>
> [I] VERBS RDMA library libibverbs.so (version >= 1.1) loaded and
> initialized.
>
> [I] VERBS RDMA verbsRdmasPerNode reduced from 1000 to 514 to match
> (nsdMaxWorkerThreads 512 + (nspdThreadsPerQueue 2 * nspdQueues 1)).
>
> [I] VERBS RDMA parse verbsPorts mlx4_0/1
>
> [W] VERBS RDMA parse error   verbsPort mlx4_0/1   ignored due to device
> mlx4_0 not found
>
> [I] VERBS RDMA library libibverbs.so unloaded.
>
> [E] VERBS RDMA failed to start, no valid verbsPorts defined.
>
>
>
> I'm using CentOS 7.3, kernel 3.10.0-514.21.1.el7.x86_64.
>
>
> I have 2 InfiniBand cards; both have an IP and are working well.
>
>
> [root@rdx110 ~]# ibstat -l
>
> mlx4_0
>
> mlx4_1
>
> [root@rdx110 ~]#
>
>
> I tried the configuration with both cards, and neither works with GPFS.
>
>
> I also tried with mlx4_0/1, but I get the same problem.
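>
> For what it's worth, the verbsPorts setting was applied roughly like this
> (node name as in the output above) and the daemon restarted on that node
> afterwards, so treat the exact value as a sketch:
>
> mmchconfig verbsPorts="mlx4_0/1 mlx4_1/1" -N rdx110
> mmlsconfig verbsPorts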
>
>
> Has anyone already had this issue?
>
>
> Kind Regards,
>
> Frank
>
>
>
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>

