From bugzilla_noreply at suse.com Thu Dec 8 00:15:10 2022
From: bugzilla_noreply at suse.com (bugzilla_noreply at suse.com)
Date: Thu, 08 Dec 2022 00:15:10 +0000
Subject: [Bug 1206194] New: Distro migration name resolution failure
Message-ID:
https://bugzilla.suse.com/show_bug.cgi?id=1206194
Bug ID: 1206194
Summary: Distro migration name resolution failure
Classification: SUSE Linux Enterprise Server
Product: SUSE Linux Enterprise Server 12 SP5 in Public Clouds
Version: BYOS
Hardware: x86-64
OS: SLES 12
Status: NEW
Severity: Normal
Priority: P5 - None
Component: DMS
Assignee: dms-bugs at lists.suse.com
Reporter: kpenrose at suse.com
QA Contact: public-cloud-maintainers at suse.de
Group: novellonly, SUSE Public Cloud
Found By: SUSE Technical Services
Services Priority: 500
Blocker: ---
Tested with multiple images provided by SUSE. Both standard 12sp5 byos and
12sp5 for SAP byos. All instances were registered to SCC. All attempts to
migrate to 15sp1 failed.
To reproduce the problem:
Create a VM instance using one of our images, for example sles-sap-12-sp5-byos.
In my case I set the "Authentication Type" to a user and password instead of
the key.
Install all updates
zypper up
Reboot after updates installed.
Install migration packages
zypper in suse-migration-sle15-activation SLES15-Migration
Reboot to kick off migration.
Failure reported:
_____________________________________________________
Calling: ['cat', '/etc/resolv.conf']
DNS Resolver
### /etc/resolv.conf is a symlink to /var/run/netconfig/resolv.conf
### autogenerated by netconfig!
#
# Before you change this file manually, consider to define the
# static DNS configuration using the following variables in the
# /etc/sysconfig/network/config file:
# NETCONFIG_DNS_STATIC_SEARCHLIST
# NETCONFIG_DNS_STATIC_SERVERS
# NETCONFIG_DNS_FORWARDER
# or disable DNS configuration updates via netconfig by setting:
# NETCONFIG_DNS_POLICY=''
#
# See also the netconfig(8) manual page and other documentation.
#
### Call "netconfig update -f" to force adjusting of /etc/resolv.conf.
search zuqmzox35guulhwgs1nkmozwke.bx.internal.cloudapp.net
nameserver 168.63.129.16
Running prepare service
Bind mounting /etc/zypp
Calling: ['mount', '--bind', '/system-root/etc/zypp', '/etc/zypp']
Bind mounting /usr/lib/zypp/plugins
Calling: ['mount', '--bind', '/system-root/usr/lib/zypp/plugins/services',
'/usr/lib/zypp/plugins/services']
Calling: ['chroot', '/system-root', 'SUSEConnect', '--list-extensions']
SUSEConnect error: SocketError: getaddrinfo: Temporary failure in name
resolution
System not registered. Aborting migration.
Preparation of zypper metadata failed with System not registered. Aborting
migration.
Running grub setup service
Calling: ['chroot', '/system-root', 'zypper', '--non-interactive',
'--no-gpg-checks', 'remove', 'SLE*-Migration', 'suse-migration-*-activation']
Uninstalling migration:
_____________________________________________________
To fix the problem edit the resolv.conf:
Default /etc/resolv.conf
kpenrose12sp5-migrate:~ # cat /etc/resolv.conf
### /etc/resolv.conf file autogenerated by netconfig!
#
# Before you change this file manually, consider to define the
# static DNS configuration using the following variables in the
# /etc/sysconfig/network/config file:
# NETCONFIG_DNS_STATIC_SEARCHLIST
# NETCONFIG_DNS_STATIC_SERVERS
# NETCONFIG_DNS_FORWARDER
# or disable DNS configuration updates via netconfig by setting:
# NETCONFIG_DNS_POLICY=''
#
# See also the netconfig(8) manual page and other documentation.
#
# Note: Manual change of this file disables netconfig too, but
# may get lost when this file contains comments or empty lines
# only, the netconfig settings are same with settings in this
# file and in case of a "netconfig update -f" call.
#
### Please remove (at least) this line when you modify the file!
search zuqmzox35guulhwgs1nkmozwke.bx.internal.cloudapp.net
nameserver 168.63.129.16
Removed this line:
### Please remove (at least) this line when you modify the file!
Install migration packages
zypper in suse-migration-sle15-activation SLES15-Migration
Reboot to kick off migration.
Successful migration!
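
The manual edit above amounts to deleting netconfig's marker comment, which, per the note in the file itself, is what marks the file as modified by hand. A minimal Python sketch of that edit follows. The helper name strip_netconfig_marker is hypothetical, the search domain in the sample is a placeholder, and the function works on text only, so nothing is written to /etc/resolv.conf.

```python
# Marker line copied from the default /etc/resolv.conf shown above.
MARKER = "### Please remove (at least) this line when you modify the file!"


def strip_netconfig_marker(text: str) -> str:
    """Return resolv.conf text without netconfig's marker line (hypothetical helper)."""
    kept = [line for line in text.splitlines() if line.strip() != MARKER]
    return "\n".join(kept) + "\n"


# Placeholder content; the search domain is made up for illustration.
sample = (
    "### /etc/resolv.conf file autogenerated by netconfig!\n"
    "### Please remove (at least) this line when you modify the file!\n"
    "search example.internal.cloudapp.net\n"
    "nameserver 168.63.129.16\n"
)

print(strip_netconfig_marker(sample), end="")
```

The search and nameserver entries are left untouched; only the marker comment is dropped.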
--
You are receiving this mail because:
You are the assignee for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From bugzilla_noreply at suse.com Thu Dec 8 13:02:03 2022
From: bugzilla_noreply at suse.com (bugzilla_noreply at suse.com)
Date: Thu, 08 Dec 2022 13:02:03 +0000
Subject: [Bug 1206194] Distro migration name resolution failure
In-Reply-To:
References:
Message-ID:
https://bugzilla.suse.com/show_bug.cgi?id=1206194
https://bugzilla.suse.com/show_bug.cgi?id=1206194#c1
Robert Schweikert changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |kpenrose at suse.com,
| |rjschwei at suse.com
Flags| |needinfo?(kpenrose at suse.com)
--- Comment #1 from Robert Schweikert ---
But DMS is not responsible for name resolution. This is clearly an issue with
how the network is configured.
We can certainly add more checks and produce a better error, but we can
certainly not force the customer to change their network configuration or
forcefully do it ourselves.
Is the goal of this bug to achieve a better user experience?
From bugzilla_noreply at suse.com Thu Dec 8 14:42:15 2022
From: bugzilla_noreply at suse.com (bugzilla_noreply at suse.com)
Date: Thu, 08 Dec 2022 14:42:15 +0000
Subject: [Bug 1206194] Distro migration name resolution failure
https://bugzilla.suse.com/show_bug.cgi?id=1206194
https://bugzilla.suse.com/show_bug.cgi?id=1206194#c2
Keith Berger changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |kberger at suse.com
--- Comment #2 from Keith Berger ---
I can reproduce this with the latest SLES-12-SP5-BYOS image + latest updates
(zypper ref/zypper up) on both AWS
(suse-sles-12-sp5-byos-v20221105-hvm-ssd-x86_64) and Azure
(suse:sles-12-sp5-byos:gen2:2022.10.15).
Doing more testing now.
From bugzilla_noreply at suse.com Thu Dec 8 14:43:04 2022
From: bugzilla_noreply at suse.com (bugzilla_noreply at suse.com)
Date: Thu, 08 Dec 2022 14:43:04 +0000
Subject: [Bug 1206194] Distro migration name resolution failure
https://bugzilla.suse.com/show_bug.cgi?id=1206194
https://bugzilla.suse.com/show_bug.cgi?id=1206194#c3
Keith Berger changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |jiri.belka at suse.com
--- Comment #3 from Keith Berger ---
probably related to https://bugzilla.suse.com/show_bug.cgi?id=1204702 but need
to figure out what changed. testing older images now
From bugzilla_noreply at suse.com Thu Dec 8 14:44:20 2022
From: bugzilla_noreply at suse.com (bugzilla_noreply at suse.com)
Date: Thu, 08 Dec 2022 14:44:20 +0000
Subject: [Bug 1206194] Distro migration name resolution failure
https://bugzilla.suse.com/show_bug.cgi?id=1206194
Keith Berger changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |dms-bugs at lists.suse.com
--
You are receiving this mail because:
You are on the CC list for the bug.
From bugzilla_noreply at suse.com Thu Dec 8 16:16:45 2022
From: bugzilla_noreply at suse.com (bugzilla_noreply at suse.com)
Date: Thu, 08 Dec 2022 16:16:45 +0000
Subject: [Bug 1206194] Distro migration name resolution failure
https://bugzilla.suse.com/show_bug.cgi?id=1206194
https://bugzilla.suse.com/show_bug.cgi?id=1206194#c4
Keith Berger changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |IN_PROGRESS
--- Comment #4 from Keith Berger ---
I confirmed that Kirk's workaround does resolve this. The root cause seems to
be that during the reboot, the resolv.conf from the 12-SP5 system, which
originally has the correct entries, i.e.
search iflnnm1lauhu3lf5mva5abcjsb.bx.internal.cloudapp.net
nameserver 168.63.129.16
gets reset to a "blank" file without these entries. Once this happens and
suse-migration-services runs, it fails to copy over the file:
https://github.com/SUSE/suse-migration-services/blob/master/suse_migration_services/units/setup_host_network.py#L54
https://github.com/SUSE/suse-migration-services/blob/master/suse_migration_services/units/setup_host_network.py#L151
Not copying is not the root cause, as the migration image still gets the correct
settings:
migration at localhost:~> ls -al /etc/resolv.conf
lrwxrwxrwx 1 root root 26 Dec 8 14:48 /etc/resolv.conf -> /run/netconfig/resolv.conf
migration at localhost:~> cat /run/netconfig/resolv.conf
### /etc/resolv.conf is a symlink to /var/run/netconfig/resolv.conf
### autogenerated by netconfig!
#
# Before you change this file manually, consider to define the
# static DNS configuration using the following variables in the
# /etc/sysconfig/network/config file:
# NETCONFIG_DNS_STATIC_SEARCHLIST
# NETCONFIG_DNS_STATIC_SERVERS
# NETCONFIG_DNS_FORWARDER
# or disable DNS configuration updates via netconfig by setting:
# NETCONFIG_DNS_POLICY=''
#
# See also the netconfig(8) manual page and other documentation.
#
### Call "netconfig update -f" to force adjusting of /etc/resolv.conf.
search iflnnm1lauhu3lf5mva5abcjsb.bx.internal.cloudapp.net
nameserver 168.63.129.16
However, when we check whether the system is registered, we run
https://github.com/SUSE/suse-migration-services/blob/master/suse_migration_services/suse_connect.py#L38
chroot /system-root SUSEConnect --list-extensions
but since the resolv.conf in /system-root is empty, we get the name resolution error
Calling: ['chroot', '/system-root', 'SUSEConnect', '--list-extensions']
SUSEConnect error: SocketError: getaddrinfo: Temporary failure in name
resolution
A simple fix is to call SUSEConnect directly instead of through chroot:
SUSEConnect --list-extensions --root /system-root
or in terms of the code
extensions_cmd_result = Command.run(
    ['SUSEConnect', '--list-extensions', '--root', root_path],
    raise_on_error=False
)
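
For readers without the project's Command helper at hand, the same idea can be sketched with the standard library. This is an illustration only: list_extensions_cmd and list_extensions are hypothetical names, /system-root is the mount point seen in the log above, and the change that actually lands in suse-migration-services may look different.

```python
import subprocess
from typing import List


def list_extensions_cmd(root_path: str) -> List[str]:
    """Build a SUSEConnect call that targets root_path via --root, without chroot."""
    return ['SUSEConnect', '--list-extensions', '--root', root_path]


def list_extensions(root_path: str = '/system-root') -> subprocess.CompletedProcess:
    """Run the command from the live system (hypothetical wrapper).

    check=False means a non-zero exit status does not raise; this is assumed
    to roughly correspond to raise_on_error=False in the snippet above.
    A missing SUSEConnect binary would still raise FileNotFoundError.
    """
    return subprocess.run(
        list_extensions_cmd(root_path),
        capture_output=True,
        text=True,
        check=False,
    )


# Only print the command here; running it requires a registered SUSE system.
print(list_extensions_cmd('/system-root'))
```

The point of the design is that SUSEConnect then runs in the migration live system, whose resolv.conf still has the nameserver entry, rather than inside the chroot.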
The big question is what changed that results in the resolv.conf being reset
and why are we not seeing this in the QA tests.
From bugzilla_noreply at suse.com Thu Dec 8 18:10:20 2022
From: bugzilla_noreply at suse.com (bugzilla_noreply at suse.com)
Date: Thu, 08 Dec 2022 18:10:20 +0000
Subject: [Bug 1206194] Distro migration name resolution failure
https://bugzilla.suse.com/show_bug.cgi?id=1206194
https://bugzilla.suse.com/show_bug.cgi?id=1206194#c5
--- Comment #5 from Keith Berger ---
This may have been broken for a while. If you start the migration via the kexec
method by running `sudo /usr/sbin/run_migration` it works. If you install
suse-migration-sle15-activation and do the reboot method it fails.
Kirk are you and your customer able to use 'run_migration' as a workaround for
now?
From bugzilla_noreply at suse.com Thu Dec 8 18:29:08 2022
From: bugzilla_noreply at suse.com (bugzilla_noreply at suse.com)
Date: Thu, 08 Dec 2022 18:29:08 +0000
Subject: [Bug 1206194] Distro migration name resolution failure
https://bugzilla.suse.com/show_bug.cgi?id=1206194
https://bugzilla.suse.com/show_bug.cgi?id=1206194#c6
--- Comment #6 from Keith Berger ---
The reason it works on PAYG images is they add an entry in the hosts file
# Added by SMT registration do not remove, retain comment as well
52.188.224.179 smt-azure.susecloud.net smt-azure
which avoids the error with the empty resolv.conf file
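
To illustrate why a static hosts entry sidesteps the empty resolv.conf: assuming the usual files-before-dns lookup order, a name listed in /etc/hosts is answered without contacting a nameserver. The sketch below is a simplified lookup over hosts-file text, not the resolver's real implementation; static_host_entry is a hypothetical helper name.

```python
from typing import Optional


def static_host_entry(hosts_text: str, hostname: str) -> Optional[str]:
    """Return the address mapped to hostname in hosts-file text, or None."""
    for raw in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw.split('#', 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        # fields[0] is the address; the rest are the canonical name and aliases.
        if hostname in fields[1:]:
            return fields[0]
    return None


sample = (
    "127.0.0.1 localhost\n"
    "# Added by SMT registration do not remove, retain comment as well\n"
    "52.188.224.179 smt-azure.susecloud.net smt-azure\n"
)

print(static_host_entry(sample, "smt-azure.susecloud.net"))
print(static_host_entry(sample, "scc.suse.com"))
```

A BYOS instance registered directly against SCC has no such entry, so the second lookup has nothing to fall back on except DNS.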
From bugzilla_noreply at suse.com Thu Dec 8 18:56:58 2022
From: bugzilla_noreply at suse.com (bugzilla_noreply at suse.com)
Date: Thu, 08 Dec 2022 18:56:58 +0000
Subject: [Bug 1206194] Distro migration name resolution failure
https://bugzilla.suse.com/show_bug.cgi?id=1206194
https://bugzilla.suse.com/show_bug.cgi?id=1206194#c7
Robert Schweikert changed:
What |Removed |Added
----------------------------------------------------------------------------
Flags|needinfo?(kpenrose at suse.com) |
--- Comment #7 from Robert Schweikert ---
The entry in /etc/hosts gets created by cloud-regionsrv-client. This also means
it works for BYOS instances that use the update infrastructure as a proxy.
The root cause is that for instances that use DHCP only, i.e. more or less
the trivial default setup, /etc/resolv.conf gets cleaned out during shutdown.
From an implementation perspective we missed that use case; or rather, at some
point the behavior of netconfig/wicked changed such that resolv.conf gets
cleaned out upon shutdown.
We can compensate for this.
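
One conceivable way to compensate, sketched under assumptions and not necessarily what the eventual fix does: detect that the migrated system's resolv.conf has no nameserver entry and fall back to the one the live migration system obtained. Both function names are hypothetical and the search domain in the sample is a placeholder.

```python
def has_nameserver(resolv_text: str) -> bool:
    """True if the resolv.conf text contains at least one nameserver entry."""
    for raw in resolv_text.splitlines():
        # Ignore comments; a valid entry is 'nameserver <address>'.
        fields = raw.split('#', 1)[0].split()
        if len(fields) >= 2 and fields[0] == 'nameserver':
            return True
    return False


def pick_resolv_conf(system_root_text: str, live_text: str) -> str:
    """Prefer the migrated system's file; fall back to the live system's
    file when the former was cleaned out (hypothetical policy)."""
    return system_root_text if has_nameserver(system_root_text) else live_text


# A file cleaned out at shutdown: comments only, no entries.
cleaned = "### autogenerated by netconfig!\n#\n"
# What the live migration system obtained (placeholder search domain).
live = (
    "search example.internal.cloudapp.net\n"
    "nameserver 168.63.129.16\n"
)

print(has_nameserver(cleaned))
print(pick_resolv_conf(cleaned, live), end="")
```

A file that still carries its own nameserver entries is left alone, so statically configured systems would not be affected by such a fallback.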
From bugzilla_noreply at suse.com Thu Dec 8 19:04:45 2022
From: bugzilla_noreply at suse.com (bugzilla_noreply at suse.com)
Date: Thu, 08 Dec 2022 19:04:45 +0000
Subject: [Bug 1206194] Distro migration name resolution failure
https://bugzilla.suse.com/show_bug.cgi?id=1206194
https://bugzilla.suse.com/show_bug.cgi?id=1206194#c8
Kirk Penrose changed:
What |Removed |Added
----------------------------------------------------------------------------
Flags| |needinfo?
--- Comment #8 from Kirk Penrose ---
(In reply to Keith Berger from comment #5)
> This may have been broken for a while. If you start the migration via the
> kexec method by running `sudo /usr/sbin/run_migration` it works. If you
> install suse-migration-sle15-activation and do the reboot method it fails.
>
> Kirk are you and your customer able to use 'run_migration' as a workaround
> for now?
The customer I have has a different problem. At least I don't believe it is
related. That's why I could not L3 this bug.
The problem they are having is:
Failed locating regionsrv-client cache files: No cloud-regionsrv-client cache
files found in /var/cache/cloudregister or /var/lib/cloudregister
This is only supposed to happen when the hosts file has an entry for
susecloud.net. They did at first, but they were removed and the problem still
happens. I may have to open another bug for that one.
From bugzilla_noreply at suse.com Tue Dec 13 15:30:44 2022
From: bugzilla_noreply at suse.com (bugzilla_noreply at suse.com)
Date: Tue, 13 Dec 2022 15:30:44 +0000
Subject: [Bug 1206194] Distro migration name resolution failure
https://bugzilla.suse.com/show_bug.cgi?id=1206194
Karen Van der Veer changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |kvanderveer at suse.com
Whiteboard| |public-cloud-meeting-2022-12-13
From bugzilla_noreply at suse.com Tue Dec 13 15:35:14 2022
From: bugzilla_noreply at suse.com (bugzilla_noreply at suse.com)
Date: Tue, 13 Dec 2022 15:35:14 +0000
Subject: [Bug 1206194] Distro migration name resolution failure
https://bugzilla.suse.com/show_bug.cgi?id=1206194
https://bugzilla.suse.com/show_bug.cgi?id=1206194#c9
--- Comment #9 from Keith Berger ---
PR for empty resolv.conf
https://github.com/SUSE/suse-migration-services/pull/254
From bugzilla_noreply at suse.com Thu Dec 15 16:46:53 2022
From: bugzilla_noreply at suse.com (bugzilla_noreply at suse.com)
Date: Thu, 15 Dec 2022 16:46:53 +0000
Subject: [Bug 1206194] Distro migration name resolution failure
https://bugzilla.suse.com/show_bug.cgi?id=1206194
https://bugzilla.suse.com/show_bug.cgi?id=1206194#c10
Yogalakshmi Arunachalam changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |yarunachalam at suse.com
--- Comment #10 from Yogalakshmi Arunachalam ---
validated the PR
https://jira.suse.com/browse/CSE-420
From bugzilla_noreply at suse.com Thu Dec 15 22:18:36 2022
From: bugzilla_noreply at suse.com (bugzilla_noreply at suse.com)
Date: Thu, 15 Dec 2022 22:18:36 +0000
Subject: [Bug 1206194] Distro migration name resolution failure
https://bugzilla.suse.com/show_bug.cgi?id=1206194
https://bugzilla.suse.com/show_bug.cgi?id=1206194#c12
Kirk Penrose changed:
What |Removed |Added
----------------------------------------------------------------------------
Flags|needinfo? |
--- Comment #12 from Kirk Penrose ---
(In reply to Keith Berger from comment #5)
> This may have been broken for a while. If you start the migration via the
> kexec method by running `sudo /usr/sbin/run_migration` it works. If you
> install suse-migration-sle15-activation and do the reboot method it fails.
>
> Kirk are you and your customer able to use 'run_migration' as a workaround
> for now?
This works for my testing.