[gpfsug-discuss] Path to NSD lost when host_sas_address changed on port

David D. Johnson david_johnson at brown.edu
Fri Jan 20 15:32:17 GMT 2017


We have most of our GPFS NSD storage set up as pairs of RAID boxes served by failover pairs of servers.
Most of it is Fibre Channel, but the newest four boxes and servers use dual-port SAS controllers.
Just this week, one server lost one of its paths to one of the RAID boxes. It took a while
to realize what had happened, but apparently the port2 ID changed from 51866da05cf7b001 to
51866da05cf7b002 on the fly, without a reboot. Port1 is still 51866da05cf7b000, which is the card ID (host_add).

We’re running GPFS 4.2.2.1 on RHEL 7.2 on these hosts.

Has anyone else seen this kind of behavior?
We first noticed these messages 3 hours 13 minutes after boot:
Jan 10 13:15:53 storage043 kernel: megasas: Err returned from build_and_issue_cmd
Jan 10 13:15:53 storage043 kernel: megasas: Err returned from build_and_issue_cmd
Jan 10 13:15:53 storage043 kernel: megasas: Err returned from build_and_issue_cmd
Jan 10 13:15:53 storage043 kernel: megasas: Err returned from build_and_issue_cmd
Jan 10 13:15:53 storage043 kernel: megasas: Err returned from build_and_issue_cmd
Jan 10 13:15:53 storage043 kernel: megasas: Err returned from build_and_issue_cmd
Jan 10 13:15:53 storage043 kernel: megasas: Err returned from build_and_issue_cmd

The multipath daemon was sending lots of log messages like:
Jan 10 13:49:22 storage043 multipathd: mpathw: load table [0 4642340864 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:64 1]
Jan 10 13:49:22 storage043 multipathd: mpathaa: load table [0 4642340864 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:96 1]
Jan 10 13:49:22 storage043 multipathd: mpathx: load table [0 4642340864 multipath 3 pg_init_retries 50 queue_if_no_path 1 rdac 1 1 round-robin 0 1 1 8:128 1]
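For anyone reading these "load table" lines, here is a rough sketch of how they can be decoded to count the paths in each map. It assumes the usual device-mapper multipath table layout (feature count and features, hardware handler count and handler, number of path groups, then selector and paths per group). The function name parse_multipath_table is made up for illustration and is not part of multipath-tools.

```python
def parse_multipath_table(table):
    """Decode a dm-multipath table string into its main fields.

    Assumed layout: start length "multipath"
      <n_features> <features...> <n_hw_args> <hw_args...>
      <n_groups> <next_group>
      then per group: <selector> <n_selector_args> <selector_args...>
                      <n_paths> <n_path_args> (<device> <path_args...>)*
    """
    tok = table.split()
    if len(tok) < 3 or tok[2] != "multipath":
        raise ValueError("not a multipath table")
    i = 3
    n_features = int(tok[i]); i += 1
    features = tok[i:i + n_features]; i += n_features
    n_hw_args = int(tok[i]); i += 1
    hw_handler = tok[i:i + n_hw_args]; i += n_hw_args
    n_groups = int(tok[i]); i += 1
    i += 1  # skip the index of the path group to try next
    groups = []
    for _ in range(n_groups):
        selector = tok[i]; i += 1
        n_selector_args = int(tok[i]); i += 1
        i += n_selector_args
        n_paths = int(tok[i]); i += 1
        n_path_args = int(tok[i]); i += 1
        paths = []
        for _ in range(n_paths):
            paths.append(tok[i])      # major:minor of the path device
            i += 1 + n_path_args      # skip the per-path arguments
        groups.append({"selector": selector, "paths": paths})
    return {
        "length_sectors": int(tok[1]),
        "features": features,
        "hw_handler": hw_handler,
        "path_groups": groups,
    }


# The mpathw line from the log above
line = ("0 4642340864 multipath 3 pg_init_retries 50 queue_if_no_path "
        "1 rdac 1 1 round-robin 0 1 1 8:64 1")
info = parse_multipath_table(line)
print(info["hw_handler"], info["path_groups"])
```

Read this way, each of the three maps above has a single path group containing a single path device (8:64, 8:96 and 8:128), which fits with one of the two paths having gone away.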

For now we have worked around the problem by including the port addresses ending in 00, 01, and 02 for all 8 SAS cards when mapping LUNs/volumes to host groups.
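A small sketch of how the address list for that workaround could be generated. It assumes the port addresses are simply the card ID plus a small offset, which is only what we observed on this one card and not a documented rule. The helper name candidate_port_addresses is hypothetical.

```python
def candidate_port_addresses(card_address, count=3):
    """Return `count` consecutive SAS addresses starting at the card ID.

    Assumption: port addresses are card ID + 0, + 1, + 2, as seen here.
    """
    base = int(card_address, 16)
    return [format(base + offset, "016x") for offset in range(count)]


# Card ID from the affected server
for address in candidate_port_addresses("51866da05cf7b000"):
    print(address)
# prints:
# 51866da05cf7b000
# 51866da05cf7b001
# 51866da05cf7b002
```

The same call would be repeated with the card ID of each of the 8 SAS cards to build the full host-group mapping list.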

Thanks,
 — ddj
Dave Johnson
Brown University CCV