[gpfsug-discuss] Path to NSD lost when host_sas_address changed on port

J. Eric Wonderley eric.wonderley at vt.edu
Fri Jan 20 16:14:09 GMT 2017


Maybe multipath is not seeing all of the wwns?

Does the output of the following look OK?

    multipath -v3 | grep ^51855

For some unknown reason, multipath does not see our SanDisk array, so we
have to add its WWIDs to the end of the /etc/multipath/wwids file.
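
A minimal sketch of how one might check which WWIDs are still missing from
/etc/multipath/wwids before appending them by hand. It assumes the usual
file layout (one entry per line, wrapped in slashes, with '#' starting a
comment line); the function names are hypothetical and not part of any
multipath tooling.

```python
def parse_wwids(text):
    """Return the set of WWIDs found in wwids-file content.

    Assumption: one entry per line in the form /<wwid>/, and lines
    starting with '#' are comments.
    """
    wwids = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if len(line) > 2 and line.startswith("/") and line.endswith("/"):
            wwids.add(line[1:-1])  # drop the surrounding slashes
    return wwids


def missing_wwids(expected, text):
    """Return, sorted, the expected WWIDs that the file content lacks."""
    return sorted(set(expected) - parse_wwids(text))
```

To use it, read /etc/multipath/wwids into a string and pass it in together
with the WWIDs the array should be presenting; anything returned by
missing_wwids() is a candidate for adding to the file.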


On Fri, Jan 20, 2017 at 10:32 AM, David D. Johnson <david_johnson at brown.edu>
wrote:

> We have most of our GPFS NSD storage set up as pairs of RAID boxes served
> by failover pairs of servers.
> Most of it is Fibre Channel, but the newest four boxes and servers use
> dual-port SAS controllers.
> Just this week, one server lost one of the paths to one of the RAID
> boxes. It took a while to work out what had happened, but apparently the
> port 2 ID changed from 51866da05cf7b001 to 51866da05cf7b002 on the fly,
> without a reboot. Port 1 is still 51866da05cf7b000, which is the card ID
> (host_add).
>
> We’re running GPFS 4.2.2.1 on RHEL 7.2 on these hosts.
>
> Has anyone else seen this kind of behavior?
> We first noticed these messages 3 hours 13 minutes after boot:
> Jan 10 13:15:53 storage043 kernel: megasas: Err returned from build_and_issue_cmd
> Jan 10 13:15:53 storage043 kernel: megasas: Err returned from build_and_issue_cmd
> Jan 10 13:15:53 storage043 kernel: megasas: Err returned from build_and_issue_cmd
> Jan 10 13:15:53 storage043 kernel: megasas: Err returned from build_and_issue_cmd
> Jan 10 13:15:53 storage043 kernel: megasas: Err returned from build_and_issue_cmd
> Jan 10 13:15:53 storage043 kernel: megasas: Err returned from build_and_issue_cmd
> Jan 10 13:15:53 storage043 kernel: megasas: Err returned from build_and_issue_cmd
>
> The multipath daemon was sending lots of log messages like:
> Jan 10 13:49:22 storage043 multipathd: mpathw: load table [0 4642340864 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:64 1]
> Jan 10 13:49:22 storage043 multipathd: mpathaa: load table [0 4642340864 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:96 1]
> Jan 10 13:49:22 storage043 multipathd: mpathx: load table [0 4642340864 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:128 1]
>
> For now we have worked around the problem by including the 00, 01 and 02
> IDs for all 8 SAS cards when mapping LUN/volume to host groups.
>
> Thanks,
>  — ddj
> Dave Johnson
> Brown University CCV
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
>
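
The workaround quoted above (listing the 00, 01 and 02 IDs for each SAS
card) could be generated rather than typed by hand. This is only a sketch:
it assumes that a card's port IDs are the card ID plus small consecutive
offsets, which matches the three addresses reported in the post but is not
something confirmed for the controller in general. The function name is
hypothetical.

```python
def candidate_port_ids(card_id, count=3):
    """Return `count` consecutive 16-hex-digit IDs starting at card_id.

    Assumption: port IDs are card_id + 0, + 1, + 2, ..., as seen in the
    post (…b000, …b001, …b002). This is not a documented rule.
    """
    base = int(card_id, 16)  # parse the hexadecimal card ID
    return [format(base + offset, "016x") for offset in range(count)]


# The card ID reported in the post yields the three IDs from the workaround.
for port_id in candidate_port_ids("51866da05cf7b000"):
    print(port_id)
```

Running this once per card gives the full list of IDs to include when
mapping LUNs/volumes to host groups on the array side.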