From: Alexei_Roudnev (Alexei_Roudnev_at_exigengroup.com)
Date: Fri Dec 21 2007 - 20:55:51 CET
Message-ID: <091201c8440b$7d13ae70$7031a8c0@exigengroup.com>
Subject: Re: [suse-sles-e] Upgrade to SLES9 SP4 results in non-bootable systems when using SATA
One (very serious) problem with SLES is that it does not keep the PREVIOUS
kernel during upgrades, so if an upgrade fails, you have to go through a
manual process (boot from CD, copy the kernel and initrd files from a healthy
server, edit the grub menu, reboot, and then install a correct kernel). That
is a bad thing, because it is VERY EASY to maintain a previous kernel on
Linux.
In my experience, there are some problems with the SLES9 SP4 upgrade
(especially if you have seen the whole beta history: there was a lot of mess
at the very end of beta testing, and it shows that the upgrade was not
properly beta-tested before the release). I have not completed all my testing
(sometimes it works pretty well), but I have had a few cases where I could
not upgrade at all.
I maintain an old kernel manually, by adding the 'default' kernel (through
rpm) and then switching back to the smp kernel; this gives me a last-resort
boot option. SLES is the only Unix I know of which does not maintain the
PREVIOUS kernel on upgrades. All the others I have (FreeBSD, Solaris, some
other Linuxes) always keep one previous kernel, so this is never an issue
there (if an upgrade breaks kernel compatibility, you can boot from the
previous kernel and downgrade).
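A minimal sketch of checking that such a last-resort entry is really there
before rebooting, assuming a GRUB legacy style menu.lst with "title" and
"kernel" lines. The sample menu text, the kernel file names and both function
names are made up for illustration; they are not taken from a real SLES
system.

```python
def kernel_entries(menu_text):
    """Return (title, kernel_image) pairs found in GRUB legacy menu text."""
    entries = []
    title = None
    for raw in menu_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        keyword = parts[0].lower()
        arg = parts[1] if len(parts) > 1 else ""
        if keyword == "title":
            title = arg
        elif keyword == "kernel" and title is not None and arg:
            # The first token after "kernel" is the image path;
            # the rest are boot parameters such as root=/dev/sda6.
            entries.append((title, arg.split()[0]))
    return entries


def has_fallback_kernel(menu_text):
    """True if the menu offers at least two different kernel images."""
    images = {image for _, image in kernel_entries(menu_text)}
    return len(images) >= 2


# Hypothetical menu.lst content, for illustration only.
SAMPLE_MENU = """\
default 0
timeout 8

title Linux (smp)
    root (hd0,0)
    kernel /boot/vmlinuz-smp root=/dev/sda6
    initrd /boot/initrd-smp

title Linux (default, last resort)
    root (hd0,0)
    kernel /boot/vmlinuz-default root=/dev/sda6
    initrd /boot/initrd-default
"""

if __name__ == "__main__":
    for title, image in kernel_entries(SAMPLE_MENU):
        print(title, "->", image)
    print("fallback available:", has_fallback_kernel(SAMPLE_MENU))
```

The check only compares kernel image paths, so two menu entries that boot the
same image with different parameters do not count as a fallback.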
----- Original Message -----
From: "Robert Scheck" <scheck@etes.de>
To: "Novell SLES" <suse-sles-e@suse.com>
Sent: Friday, December 21, 2007 9:29 AM
Subject: [suse-sles-e] Upgrade to SLES9 SP4 results in non-bootable systems when using SATA
> Hello all,
>
> today we upgraded a couple of servers having SuSE Linux Enterprise Server 9
> Service Pack 3 to the recently released Service Pack 4. After rebooting the
> first of them (because of the new Linux kernel), the machine didn't come up
> by itself anymore.
>
> When looking into it, I saw that the system wasn't able to find its root
> partition /dev/sda6 as configured. So during the initrd when the devices
> are created and where the "waiting for /dev/XXX" is printed, the stuff
> simply bailed out, displayed all available device files and I got a shell
> prompt. There were no /dev/sda5 (swap) or /dev/sda6 (root) device files
> available any longer as before (when stepping from SLES9 up to SP3).
>
> When trying to get the system booting again, I just noticed that the device
> files /dev/hdc5 and /dev/hdc6 were created, so I changed the grub config on
> runtime before system start and retried - even with success. Of course, I
> had to modify the /etc/fstab then as well for the swap partition. To ensure
> that everything is really working, I did another reboot and the system came
> up as expected, now.
>
> Luckily it wasn't a mission critical system, but what the hell (and this is
> really friendly said now) did you Novell respectively SuSE guys smoke or
> drink before you prepared the Service Pack 4 - or asked otherwise: Why did
> this slip even through quality assurance? Can you please do Enterprise and
> not only write it in big letters to the boxes and media, you're shipping?!
>
> Changing drivers for a new Service Pack is okay (at least from my point of
> view) as long as it contains at least no regressions (!), but this case
> definitely was a regression which normally never should happen, right? Can
> you please make sure, that you never change a driver within an Enterprise
> release, that a SCSI-emulated device (SATA shown as /dev/sdX) just gets
> finally displayed as an IDE device (/dev/hdX) from one Linux kernel update
> to another? You can do such stuff from SLES 10 -> 11 or similar, but not
> within a stable series; we're not on openSUSE Factory where everything can
> break from one update to another!
>
> As this machine only has an update-only subscription, I can't open a bug to
> report (not without paying money that SuSE/Novell fixes its self-introduced
> bugs) this stuff there. If maybe somebody of the technical guys of SuSE or
> Novell cares about this problem, the relevant part of lspci output is just
> below.
>
> $ lspci -vvv
> [...]
> 0000:00:1f.2 IDE interface: Intel Corporation 82801EB (ICH5) SATA Controller (rev 02) (prog-if 8f [Master SecP SecO PriP PriO])
> Subsystem: Dell PowerEdge SC1425
> Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
> Status: Cap- 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
> Latency: 0
> Interrupt: pin A routed to IRQ 18
> Region 0: I/O ports at ccb8 [size=8]
> Region 1: I/O ports at ccb0 [size=4]
> Region 2: I/O ports at cca0 [size=8]
> Region 3: I/O ports at cc98 [size=4]
> Region 4: I/O ports at cc80 [size=16]
> [...]
> $
>
> Maybe somebody can announce when /dev/hdc switches back to /dev/sda again,
> so that I can do the relevant updates on-site to avoid further unnecessary
> downtimes? It would be at least a very small christmas present after this
> horrible non-Enterprise behaviour today.
>
>
> Merry christmas
>
> Robert Scheck
>
> --
> Robert Scheck
> Web: http://www.etes.de E-Mail: scheck@etes.de
> ETES GmbH Libanonstrasse 58 A D-70184 Stuttgart
> Fon: +49 (7 11) 48 90 83 - 12 Fax: +49 (7 11) 48 90 83 - 50
>
> Registergericht: Amtsgericht Stuttgart HRB 721182
> Geschäftsführende Gesellschafter: Markus Espenhain und Jan Theofel
> Sitz der Gesellschaft: Stuttgart
> USt.-Id.Nr.: DE814767446
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: suse-sles-e-unsubscribe@suse.com
> For additional commands, e-mail: suse-sles-e-help@suse.com
>
>