[sles-beta] SLES11-SP3 x86_64 RC1 Problem Storage SAN lpfc module FCF discovery

Hannes Reinecke hare at suse.com
Tue May 7 04:55:40 MDT 2013


On 05/07/2013 12:06 PM, urs.frey at post.ch wrote:
> Hi
> I am testing SLES11-SP3 RC1 x86_64 on an HP ProLiant BL465c G7 blade (AMD Opteron), attached to an EMC VMAX SAN over FCoE through an NC551i Emulex OneConnect 10Gb CNA.
> 
> Suddenly, during normal operation, I lose a SAN path on my test server.
> Instead of the path being reinstated once it comes back up, I see thousands of messages like these in my /var/log/messages:
> 
> May  6 21:43:01 h04wwl /usr/sbin/cron[1686]: (root) CMD (/usr/local/scripts/cpqhealth_mon > /dev/null 2>&1)
> May  6 21:45:01 h04wwl /usr/sbin/cron[1862]: (root) CMD (/usr/local/scripts/check_multipath.sh > /dev/null 2>&1)
> May  6 21:45:01 h04wwl /usr/sbin/cron[1863]: (oracle) CMD (/appl/ora/oraenv/bck/dynarcbck.sh > /dev/null 2>&1)
> May  6 21:49:01 h04wwl /usr/sbin/cron[2224]: (root) CMD (/usr/local/scripts/query_patchnix.sh > /dev/null 2>&1)
> May  6 21:50:01 h04wwl /usr/sbin/cron[2412]: (root) CMD (/usr/local/scripts/check_multipath.sh > /dev/null 2>&1)
> May  6 21:50:01 h04wwl /usr/sbin/cron[2413]: (oracle) CMD (/appl/ora/oraenv/bck/dynarcbck.sh > /dev/null 2>&1)
> May  6 21:55:01 h04wwl /usr/sbin/cron[2860]: (root) CMD (/usr/local/scripts/check_multipath.sh > /dev/null 2>&1)
> May  6 21:55:01 h04wwl /usr/sbin/cron[2861]: (oracle) CMD (/appl/ora/oraenv/bck/dynarcbck.sh > /dev/null 2>&1)
> May  6 21:58:22 h04wwl kernel: [621025.614280] lpfc 0000:04:00.2: 0:3300 In-use FCF (0) modified, perform FCF rediscovery
> May  6 21:58:22 h04wwl kernel: [621025.680799] lpfc 0000:04:00.2: 0:2546 New FCF event, evt_tag:x3, index:x0
> May  6 21:58:22 h04wwl kernel: [621025.680820] lpfc 0000:04:00.2: 0:3300 In-use FCF (0) modified, perform FCF rediscovery
> May  6 21:58:22 h04wwl kernel: [621025.726467] lpfc 0000:04:00.2: 0:2546 New FCF event, evt_tag:x4, index:x0
> May  6 21:58:22 h04wwl kernel: [621025.726485] lpfc 0000:04:00.2: 0:3300 In-use FCF (0) modified, perform FCF rediscovery
> May  6 21:58:22 h04wwl kernel: [621025.772060] lpfc 0000:04:00.2: 0:2546 New FCF event, evt_tag:x5, index:x0
> May  6 21:58:22 h04wwl kernel: [621025.772078] lpfc 0000:04:00.2: 0:3300 In-use FCF (0) modified, perform FCF rediscovery
> May  6 21:58:22 h04wwl kernel: [621025.825965] lpfc 0000:04:00.2: 0:2546 New FCF event, evt_tag:x6, index:x0
> May  6 21:58:22 h04wwl kernel: [621025.825983] lpfc 0000:04:00.2: 0:3300 In-use FCF (0) modified, perform FCF rediscovery
> May  6 21:58:22 h04wwl kernel: [621025.871796] lpfc 0000:04:00.2: 0:2546 New FCF event, evt_tag:x7, index:x0
> 
> When I manually issue a LIP on the CNA, the SAN path is re-detected and reinstated.
> I observed quite similar behavior on SLES11-SP3 Beta4.
> Why is the SAN path not reinstated automatically?
> What does "In-use FCF (0) modified, perform FCF rediscovery" mean?
> I would expect this to be resolved fully automatically, with no manual intervention necessary.
> 
> This works fine on SLES11-SP2, and also worked fine until SLES11-SP3 Beta4.
> I assume that we have a problem either in the lpfc.ko kernel module or in the SLES11-SP3 RC1 kernel itself.
> 
> I can reproduce it by simply taking down one CNA switch path and bringing it up again.
> BUT when I disable one SAN path on the VMAX itself, multipath works fine and reinstates the path without a problem once the SAN path is enabled again.
> 
> So there must be a different event causing the problem.
> 
There is already a Bugzilla entry for this: 816065
We already have an lpfc update from Emulex, which might fix this issue.
Can you try with the latest KOTD?

Cheers,

Hannes
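
For anyone comparing behaviour before and after a kernel update, a small script can count how often each lpfc message ID shows up in the syslog. This is only a sketch and not part of the original exchange: the regular expression is inferred from the log lines quoted above, the function name `summarize_lpfc` is hypothetical, and other lpfc messages may use a layout it does not match.

```python
import re
from collections import Counter

# Pattern inferred from the quoted syslog lines, for example:
# "... kernel: [621025.614280] lpfc 0000:04:00.2: 0:3300 In-use FCF (0) modified, ..."
# Assumption: every lpfc line of interest follows this layout.
LPFC_LINE = re.compile(
    r"kernel: \[\s*(?P<uptime>\d+\.\d+)\] "
    r"lpfc (?P<pci>[0-9a-f:.]+): \d+:(?P<msg_id>\d+) (?P<text>.*)"
)


def summarize_lpfc(lines):
    """Count lpfc kernel messages per (PCI address, message ID)."""
    counts = Counter()
    for line in lines:
        match = LPFC_LINE.search(line)
        if match:
            counts[(match.group("pci"), match.group("msg_id"))] += 1
    return counts


if __name__ == "__main__":
    # Three lines taken from the log excerpt quoted in this thread.
    sample = [
        "May  6 21:55:01 h04wwl /usr/sbin/cron[2860]: (root) CMD "
        "(/usr/local/scripts/check_multipath.sh > /dev/null 2>&1)",
        "May  6 21:58:22 h04wwl kernel: [621025.614280] lpfc 0000:04:00.2: "
        "0:3300 In-use FCF (0) modified, perform FCF rediscovery",
        "May  6 21:58:22 h04wwl kernel: [621025.680799] lpfc 0000:04:00.2: "
        "0:2546 New FCF event, evt_tag:x3, index:x0",
    ]
    for (pci, msg_id), count in sorted(summarize_lpfc(sample).items()):
        print(pci, msg_id, count)
```

To run it against a real log, pass an open file instead of the sample list, for instance `summarize_lpfc(open("/var/log/messages"))`. A steadily growing count for message IDs 3300 and 2546 on one PCI function would match the rediscovery loop described in the report.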

