[sles-beta] SLES11-SP3 Beta4 multipath problem: SAN paths are not reinstated after a SAN path goes down and comes back up

urs.frey at post.ch urs.frey at post.ch
Thu Apr 25 06:13:43 MDT 2013


Hi,
I am testing SLES11-SP3 x86_64 on an HP Blade BL465c G7, FCoE SAN-attached to an EMC VMAX (Symmetrix).

The problem I observe is that when I fail a SAN path and then bring it back up, multipath does not rediscover the SAN LUNs and reinstate the paths. SAN paths once detected as failed stay failed, even when the SAN ports come up again and the SAN LUNs are available again.

Here is my multipath.conf file:
h04wwl:~ # cat /etc/multipath.conf
#------------------------------------------------------------------
# IT-Post multipath.conf
# SLES11-SP1 / SP2
# please do not edit. This file is handled by pst-san-mgmt package
#
#------------------------------------------------------------------
# 13Jan2012:IT222-FRU: V01-00: after cons by Pjotr Knurek (EMC)
#------------------------------------------------------------------
devices {
       device {
               vendor                  "EMC"
               product                 "SYMMETRIX"
               ##getuid_callout          "/lib/udev/scsi_id --whitelisted --page=pre-spc3-83 --device=/dev/%n"
               features                "0"
               hardware_handler        "0"
               path_selector           "round-robin 0"
               path_grouping_policy    multibus
               rr_weight               uniform
               rr_min_io               1
               path_checker            directio
               prio                    const
               prio_args               ""
       }
       device {
               vendor                  "DGC"
               product                 ".*"
               product_blacklist       "LUNZ"
               ##getuid_callout          "/lib/udev/scsi_id --whitelisted --device=/dev/%n"
               features                "1 queue_if_no_path"
               hardware_handler        "1 emc"
               path_selector           "round-robin 0"
               path_grouping_policy    group_by_prio
               failback                immediate
               rr_weight               uniform
               no_path_retry           60
               rr_min_io               1000
               path_checker            emc_clariion
               prio                    emc
               prio_args               ""
       }
}
h04wwl:~ #

This is what I see when issuing multipath -ll:
h04wwl:~ # multipath -ll
360000970000292602543533030354146 dm-4 EMC     ,SYMMETRIX
size=34G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 1:0:0:12 sdm        8:192  active ready running
  `- 1:0:1:12 sdz        65:144 active ready running
360000970000292602543533030354145 dm-3 EMC     ,SYMMETRIX
size=34G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
  |- 1:0:0:11 sdl        8:176  active ready running
  `- 1:0:1:11 sdy        65:128 active ready running
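With many LUNs the multipath -ll listing gets long, so a per-state count of paths can be easier to compare before and after a path failure. The following is only a minimal sketch: the function name mp_path_summary is made up for illustration, and it assumes that each path line carries an H:C:T:L tuple followed by the device name, the major:minor number and three state words, as in the output above.

```shell
# Summarise path states from `multipath -ll` output read on stdin.
# mp_path_summary is a hypothetical helper name, not part of multipath-tools.
mp_path_summary() {
    awk '
        {
            # A path line contains an H:C:T:L tuple such as 1:0:0:12.
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^[0-9]+:[0-9]+:[0-9]+:[0-9]+$/) {
                    # Skip device name and major:minor, keep the three state words.
                    state = $(i + 3) " " $(i + 4) " " $(i + 5)
                    count[state]++
                    break
                }
            }
        }
        END {
            # Output order is unspecified in awk; one line per distinct state.
            for (s in count) printf "%d %s\n", count[s], s
        }
    '
}
```

It would be used as: multipath -ll | mp_path_summary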

What I can see is that /sys/class/fc_host/host2/port_state reports "Bypassed" instead of "Online":

h04wwl:~ # for prt in `find /sys/class/fc_host/host*/ -name port_state`;do cat $prt;done
Online
Bypassed
h04wwl:~ # for prt in `find /sys/class/fc_remote_ports/rport-*/ -name port_state`;do cat $prt;done
Online
Online
Online
Online
Not Present
Not Present
h04wwl:~ #
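The loops above print only the state, so it is not visible which host or remote port each line belongs to. A minimal sketch that prints the name next to the state could look like this. The function name fc_port_states and its optional sysfs root argument are assumptions added for illustration (the argument only exists so the function can be tried against a copy of the tree); the sysfs paths are the same ones used above.

```shell
# Print "<name> <state>" for every FC host and FC remote port.
# fc_port_states is a hypothetical helper; $1 optionally overrides /sys.
fc_port_states() {
    root="${1:-/sys}"
    for f in "$root"/class/fc_host/host*/port_state \
             "$root"/class/fc_remote_ports/rport-*/port_state; do
        [ -r "$f" ] || continue                 # skip globs that matched nothing
        name=$(basename "$(dirname "$f")")      # e.g. host2 or rport-2:0-0
        printf '%s %s\n' "$name" "$(cat "$f")"
    done
}
```

Called without arguments it reads the live /sys tree: fc_port_states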

When I issue a LIP on both CNAs, the SAN paths become ready and accessible again:
h04wwl:~ # /usr/local/scripts/lpfc-dynamic-tgt-lun-disc.sh -al
Issuing LIP on host1
Scanning HOST: host1
Issuing LIP on host2
Scanning HOST: host2
Found
 2:0:0:0
 2:0:0:1
 2:0:0:10
 2:0:0:11
 2:0:0:12
 2:0:0:2
 2:0:0:3
 2:0:0:4
 2:0:0:5
 2:0:0:6
 2:0:0:7
 2:0:0:8
 2:0:0:9
 2:0:1:0
 2:0:1:1
 2:0:1:10
 2:0:1:11
 2:0:1:12
 2:0:1:2
 2:0:1:3
 2:0:1:4
 2:0:1:5
 2:0:1:6
 2:0:1:7
 2:0:1:8
 2:0:1:9

h04wwl:~ # for prt in `find /sys/class/fc_host/host*/ -name port_state`;do cat $prt;done
Online
Online
h04wwl:~ # for prt in `find /sys/class/fc_remote_ports/rport-*/ -name port_state`;do cat $prt;done
Online
Online
Online
Online
Online
Online
Online
Online
h04wwl:~ #
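For readers without the lpfc-dynamic-tgt-lun-disc.sh script, the LIP-and-rescan workaround can be approximated with the generic sysfs attributes issue_lip (FC transport class) and scan (SCSI host). This is only a rough sketch and not a copy of what that script does internally: the function name fc_lip_rescan and the optional sysfs root argument are assumptions, and the messages merely imitate the output shown above. It must be run as root, and a LIP disturbs I/O on the affected host.

```shell
# Issue a LIP and a wildcard SCSI rescan on every FC host.
# fc_lip_rescan is a hypothetical helper; $1 optionally overrides /sys.
fc_lip_rescan() {
    root="${1:-/sys}"
    for h in "$root"/class/fc_host/host*; do
        [ -d "$h" ] || continue                 # no FC hosts present
        name=$(basename "$h")
        echo "Issuing LIP on $name"
        echo 1 > "$h/issue_lip"                 # request a loop initialization
        echo "Scanning HOST: $name"
        # "- - -" means: all channels, all targets, all LUNs
        echo "- - -" > "$root/class/scsi_host/$name/scan"
    done
}
```

A real run would probably need a short pause between the LIP and the scan so that the remote ports have time to log in again; the sketch leaves that out.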

I expect multipathing to reinstate a path automatically shortly after the SAN path is detected as up again.
This works under SLES11-SP2 x86_64 on the very same hardware.

So I consider this a bug in SLES11-SP3 Beta4.

Thank you very much for clarifying this.

best regards

Urs Frey
Die Schweizerische Post
Services
Informationstechnologie
Webergutstrasse 12
3030 Bern (Zollikofen)
Telefon : ++41 (0)58 338 58 70
FAX     : ++41 (0)58 667 30 07
E-Mail:   urs.frey at post.ch