From bugzilla_noreply at suse.com Thu Dec 8 00:15:10 2022
From: bugzilla_noreply at suse.com (bugzilla_noreply at suse.com)
Date: Thu, 08 Dec 2022 00:15:10 +0000
Subject: [Bug 1206194] New: Distro migration name resolution failure
Message-ID:

https://bugzilla.suse.com/show_bug.cgi?id=1206194

            Bug ID: 1206194
           Summary: Distro migration name resolution failure
    Classification: SUSE Linux Enterprise Server
           Product: SUSE Linux Enterprise Server 12 SP5 in Public Clouds
           Version: BYOS
          Hardware: x86-64
                OS: SLES 12
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: DMS
          Assignee: dms-bugs at lists.suse.com
          Reporter: kpenrose at suse.com
        QA Contact: public-cloud-maintainers at suse.de
             Group: novellonly, SUSE Public Cloud
          Found By: SUSE Technical Services
 Services Priority: 500
           Blocker: ---

Tested with multiple images provided by SUSE, both standard 12sp5 byos and
12sp5 for SAP byos. All instances were registered to SCC. All attempts to
migrate to 15sp1 failed.

To reproduce the problem:

Create a vm instance using one of our images, for example
sles-sap-12-sp5-byos. In my case I set the "Authentication Type" to a user
and password instead of the key.

Install all updates
zypper up

Reboot after the updates are installed.

Install migration packages
zypper in suse-migration-sle15-activation SLES15-Migration

Reboot to kick off migration.

Failure reported:
_____________________________________________________
Calling: ['cat', '/etc/resolv.conf']
DNS Resolver
### /etc/resolv.conf is a symlink to /var/run/netconfig/resolv.conf
### autogenerated by netconfig!
#
# Before you change this file manually, consider to define the
# static DNS configuration using the following variables in the
# /etc/sysconfig/network/config file:
# NETCONFIG_DNS_STATIC_SEARCHLIST
# NETCONFIG_DNS_STATIC_SERVERS
# NETCONFIG_DNS_FORWARDER
# or disable DNS configuration updates via netconfig by setting:
# NETCONFIG_DNS_POLICY=''
#
# See also the netconfig(8) manual page and other documentation.
#
### Call "netconfig update -f" to force adjusting of /etc/resolv.conf.
search zuqmzox35guulhwgs1nkmozwke.bx.internal.cloudapp.net
nameserver 168.63.129.16
Running prepare service
Bind mounting /etc/zypp
Calling: ['mount', '--bind', '/system-root/etc/zypp', '/etc/zypp']
Bind mounting /usr/lib/zypp/plugins
Calling: ['mount', '--bind', '/system-root/usr/lib/zypp/plugins/services', '/usr/lib/zypp/plugins/services']
Calling: ['chroot', '/system-root', 'SUSEConnect', '--list-extensions']
SUSEConnect error: SocketError: getaddrinfo: Temporary failure in name resolution
System not registered. Aborting migration.
Preparation of zypper metadata failed with System not registered. Aborting migration.
Running grub setup service
Calling: ['chroot', '/system-root', 'zypper', '--non-interactive', '--no-gpg-checks', 'remove', 'SLE*-Migration', 'suse-migration-*-activation']
Uninstalling migration:
_____________________________________________________

To fix the problem, edit the resolv.conf:

Default /etc/resolv.conf

kpenrose12sp5-migrate:~ # cat /etc/resolv.conf
### /etc/resolv.conf file autogenerated by netconfig!
#
# Before you change this file manually, consider to define the
# static DNS configuration using the following variables in the
# /etc/sysconfig/network/config file:
# NETCONFIG_DNS_STATIC_SEARCHLIST
# NETCONFIG_DNS_STATIC_SERVERS
# NETCONFIG_DNS_FORWARDER
# or disable DNS configuration updates via netconfig by setting:
# NETCONFIG_DNS_POLICY=''
#
# See also the netconfig(8) manual page and other documentation.
#
# Note: Manual change of this file disables netconfig too, but
# may get lost when this file contains comments or empty lines
# only, the netconfig settings are same with settings in this
# file and in case of a "netconfig update -f" call.
#
### Please remove (at least) this line when you modify the file!
search zuqmzox35guulhwgs1nkmozwke.bx.internal.cloudapp.net
nameserver 168.63.129.16

Removed this line:
### Please remove (at least) this line when you modify the file!

Install migration packages
zypper in suse-migration-sle15-activation SLES15-Migration

Reboot to kick off migration.

Successful migration!

-- 
You are receiving this mail because:
You are the assignee for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From bugzilla_noreply at suse.com Thu Dec 8 13:02:03 2022
From: bugzilla_noreply at suse.com (bugzilla_noreply at suse.com)
Date: Thu, 08 Dec 2022 13:02:03 +0000
Subject: [Bug 1206194] Distro migration name resolution failure
In-Reply-To:
References:
Message-ID:

https://bugzilla.suse.com/show_bug.cgi?id=1206194
https://bugzilla.suse.com/show_bug.cgi?id=1206194#c1

Robert Schweikert changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kpenrose at suse.com,
                   |                            |rjschwei at suse.com
              Flags|                            |needinfo?(kpenrose at suse.com)

--- Comment #1 from Robert Schweikert ---
But DMS is not responsible for name resolution. This is clearly an issue
with how the network is configured. We can certainly add more checks and
produce a better error, but we certainly cannot force the customer to
change their network configuration or forcefully do it ourselves.

Is the goal of this bug to achieve a better user experience?
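
Comment #1 mentions adding more checks to produce a better error. A minimal
sketch of what such a pre-flight check could look like follows. It is not
code from suse-migration-services; the function name, the default host
scc.suse.com, and the wording of the message are assumptions made for
illustration only.

```python
import socket
from typing import Callable, Optional

# Assumption: the registration host to probe. scc.suse.com is only an
# illustrative default; a proxy or SMT/RMT server may apply instead.
DEFAULT_HOST = "scc.suse.com"


def check_name_resolution(
    host: str = DEFAULT_HOST,
    resolver: Callable[..., object] = socket.getaddrinfo,
) -> Optional[str]:
    """Return None if `host` resolves, otherwise a readable error message.

    `resolver` is injectable so the check can be exercised without a
    network; by default the standard library resolver is used.
    """
    try:
        resolver(host, 443)
    except socket.gaierror as issue:
        return (
            f"Cannot resolve {host}: {issue}. "
            "Check /etc/resolv.conf on the system to be migrated."
        )
    return None
```

Run before the registration check, a helper like this would let the log
point at DNS directly instead of ending with "System not registered".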

From bugzilla_noreply at suse.com Thu Dec 8 14:42:15 2022
From: bugzilla_noreply at suse.com (bugzilla_noreply at suse.com)
Date: Thu, 08 Dec 2022 14:42:15 +0000
Subject: [Bug 1206194] Distro migration name resolution failure
In-Reply-To:
References:
Message-ID:

https://bugzilla.suse.com/show_bug.cgi?id=1206194
https://bugzilla.suse.com/show_bug.cgi?id=1206194#c2

Keith Berger changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kberger at suse.com

--- Comment #2 from Keith Berger ---
I can reproduce this with the latest SLES-12-SP5-BYOS image + latest updates
(zypper ref/zypper up) on both AWS
(suse-sles-12-sp5-byos-v20221105-hvm-ssd-x86_64) and Azure
(suse:sles-12-sp5-byos:gen2:2022.10.15).

Doing more testing now.

From bugzilla_noreply at suse.com Thu Dec 8 14:43:04 2022
From: bugzilla_noreply at suse.com (bugzilla_noreply at suse.com)
Date: Thu, 08 Dec 2022 14:43:04 +0000
Subject: [Bug 1206194] Distro migration name resolution failure
In-Reply-To:
References:
Message-ID:

https://bugzilla.suse.com/show_bug.cgi?id=1206194
https://bugzilla.suse.com/show_bug.cgi?id=1206194#c3

Keith Berger changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jiri.belka at suse.com

--- Comment #3 from Keith Berger ---
Probably related to https://bugzilla.suse.com/show_bug.cgi?id=1204702, but
we need to figure out what changed.

Testing older images now.

From bugzilla_noreply at suse.com Thu Dec 8 14:44:20 2022
From: bugzilla_noreply at suse.com (bugzilla_noreply at suse.com)
Date: Thu, 08 Dec 2022 14:44:20 +0000
Subject: [Bug 1206194] Distro migration name resolution failure
In-Reply-To:
References:
Message-ID:

https://bugzilla.suse.com/show_bug.cgi?id=1206194

Keith Berger changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dms-bugs at lists.suse.com

From bugzilla_noreply at suse.com Thu Dec 8 16:16:45 2022
From: bugzilla_noreply at suse.com (bugzilla_noreply at suse.com)
Date: Thu, 08 Dec 2022 16:16:45 +0000
Subject: [Bug 1206194] Distro migration name resolution failure
In-Reply-To:
References:
Message-ID:

https://bugzilla.suse.com/show_bug.cgi?id=1206194
https://bugzilla.suse.com/show_bug.cgi?id=1206194#c4

Keith Berger changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |IN_PROGRESS

--- Comment #4 from Keith Berger ---
I confirmed that the workaround by Kirk does resolve this.

The root cause seems to be that during the reboot, the prior resolv.conf
from the 12-sp5 system, which originally has the correct entries, i.e.

search iflnnm1lauhu3lf5mva5abcjsb.bx.internal.cloudapp.net
nameserver 168.63.129.16

gets reset to a "blank" file without these entries.

Once this happens and suse-migration-services is run, it fails to copy over
the file:

https://github.com/SUSE/suse-migration-services/blob/master/suse_migration_services/units/setup_host_network.py#L54
https://github.com/SUSE/suse-migration-services/blob/master/suse_migration_services/units/setup_host_network.py#L151

Not copying is not the root cause, as the migration image still gets the
correct settings:

migration@localhost:~> ls -al /etc/resolv.conf
lrwxrwxrwx 1 root root 26 Dec 8 14:48 /etc/resolv.conf -> /run/netconfig/resolv.conf
migration@localhost:~> cat /run/netconfig/resolv.conf
### /etc/resolv.conf is a symlink to /var/run/netconfig/resolv.conf
### autogenerated by netconfig!
#
# Before you change this file manually, consider to define the
# static DNS configuration using the following variables in the
# /etc/sysconfig/network/config file:
# NETCONFIG_DNS_STATIC_SEARCHLIST
# NETCONFIG_DNS_STATIC_SERVERS
# NETCONFIG_DNS_FORWARDER
# or disable DNS configuration updates via netconfig by setting:
# NETCONFIG_DNS_POLICY=''
#
# See also the netconfig(8) manual page and other documentation.
#
### Call "netconfig update -f" to force adjusting of /etc/resolv.conf.
search iflnnm1lauhu3lf5mva5abcjsb.bx.internal.cloudapp.net
nameserver 168.63.129.16

HOWEVER, when we check whether the system is registered, we run

https://github.com/SUSE/suse-migration-services/blob/master/suse_migration_services/suse_connect.py#L38

chroot /system-root SUSEConnect --list-extensions

but since the resolv.conf under /system-root is blank, we get the name
resolution error:

Calling: ['chroot', '/system-root', 'SUSEConnect', '--list-extensions']
SUSEConnect error: SocketError: getaddrinfo: Temporary failure in name resolution

A simple fix is to drop the chroot and just use

SUSEConnect --list-extensions --root /system-root

or in terms of the code

extensions_cmd_result = Command.run(
    ['SUSEConnect', '--list-extensions', '--root', root_path],
    raise_on_error=False
)

The big question is what changed that results in the resolv.conf being
reset, and why we are not seeing this in the QA tests.

From bugzilla_noreply at suse.com Thu Dec 8 18:10:20 2022
From: bugzilla_noreply at suse.com (bugzilla_noreply at suse.com)
Date: Thu, 08 Dec 2022 18:10:20 +0000
Subject: [Bug 1206194] Distro migration name resolution failure
In-Reply-To:
References:
Message-ID:

https://bugzilla.suse.com/show_bug.cgi?id=1206194
https://bugzilla.suse.com/show_bug.cgi?id=1206194#c5

--- Comment #5 from Keith Berger ---
This may have been broken for a while. If you start the migration via the
kexec method by running `sudo /usr/sbin/run_migration`, it works. If you
install suse-migration-sle15-activation and use the reboot method, it fails.

Kirk, are you and your customer able to use 'run_migration' as a workaround
for now?
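
The difference between the two invocations discussed in comment #4 can be
sketched as follows. This is only an illustration that uses the standard
library instead of the project's own Command.run wrapper; the helper names
are hypothetical, and /system-root is the mount point shown in the log
above.

```python
import subprocess
from typing import List

# Mount point of the system to be migrated, as it appears in the log.
SYSTEM_ROOT = "/system-root"


def list_extensions_cmd(root_path: str, use_chroot: bool) -> List[str]:
    """Build the SUSEConnect call in either of the two discussed forms.

    With chroot, the process resolves names through the resolv.conf
    below root_path, which is the blank one. With --root, SUSEConnect
    runs in the migration image, whose resolv.conf has a nameserver.
    """
    if use_chroot:
        return ["chroot", root_path, "SUSEConnect", "--list-extensions"]
    return ["SUSEConnect", "--list-extensions", "--root", root_path]


def list_extensions(root_path: str = SYSTEM_ROOT) -> subprocess.CompletedProcess:
    """Run the --root variant without raising on a non-zero exit code."""
    return subprocess.run(
        list_extensions_cmd(root_path, use_chroot=False),
        capture_output=True,
        text=True,
        check=False,
    )
```

Whether the --root form behaves identically in every other respect
(credentials, zypper plugins) is not established in this thread.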

From bugzilla_noreply at suse.com Thu Dec 8 18:29:08 2022
From: bugzilla_noreply at suse.com (bugzilla_noreply at suse.com)
Date: Thu, 08 Dec 2022 18:29:08 +0000
Subject: [Bug 1206194] Distro migration name resolution failure
In-Reply-To:
References:
Message-ID:

https://bugzilla.suse.com/show_bug.cgi?id=1206194
https://bugzilla.suse.com/show_bug.cgi?id=1206194#c6

--- Comment #6 from Keith Berger ---
The reason it works on PAYG images is that they add an entry in the hosts
file

# Added by SMT registration do not remove, retain comment as well
52.188.224.179 smt-azure.susecloud.net smt-azure

which avoids the error with the empty resolv.conf file.

From bugzilla_noreply at suse.com Thu Dec 8 18:56:58 2022
From: bugzilla_noreply at suse.com (bugzilla_noreply at suse.com)
Date: Thu, 08 Dec 2022 18:56:58 +0000
Subject: [Bug 1206194] Distro migration name resolution failure
In-Reply-To:
References:
Message-ID:

https://bugzilla.suse.com/show_bug.cgi?id=1206194
https://bugzilla.suse.com/show_bug.cgi?id=1206194#c7

Robert Schweikert changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|needinfo?(kpenrose at suse.com)|

--- Comment #7 from Robert Schweikert ---
The entry in /etc/hosts gets created by cloud-regionsrv-client. This also
means it works for BYOS instances that use the update infrastructure as a
proxy.

The root cause is that for instances that use DHCP only, i.e. more or less
the trivial default setup, /etc/resolv.conf gets cleaned out during
shutdown.

From an implementation perspective we missed that use case. Well, at some
point the behavior of netconfig/wicked changed such that resolv.conf gets
cleaned out upon shutdown. We can compensate for this.
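
To make the "cleaned out" condition from comment #7 concrete, here is a
small sketch of how a blank resolv.conf could be detected. It is not the
project's actual fix; the function name and the rule "no nameserver line
means blank" are assumptions chosen for illustration.

```python
def has_nameserver(resolv_conf_text: str) -> bool:
    """Return True if the text contains at least one nameserver entry.

    Comment lines and empty lines are ignored, so a file that holds only
    the netconfig header counts as blank.
    """
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        # A usable entry looks like: nameserver 168.63.129.16
        if len(fields) >= 2 and fields[0] == "nameserver":
            return True
    return False
```

A caller could read the resolv.conf below /system-root, pass its content to
this function, and fall back to the migration image's resolver
configuration when the result is False.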

From bugzilla_noreply at suse.com Thu Dec 8 19:04:45 2022
From: bugzilla_noreply at suse.com (bugzilla_noreply at suse.com)
Date: Thu, 08 Dec 2022 19:04:45 +0000
Subject: [Bug 1206194] Distro migration name resolution failure
In-Reply-To:
References:
Message-ID:

https://bugzilla.suse.com/show_bug.cgi?id=1206194
https://bugzilla.suse.com/show_bug.cgi?id=1206194#c8

Kirk Penrose changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|                            |needinfo?

--- Comment #8 from Kirk Penrose ---
(In reply to Keith Berger from comment #5)
> This may have been broken for a while. If you start the migration via the
> kexec method by running `sudo /usr/sbin/run_migration` it works. If you
> install suse-migration-sle15-activation and do the reboot method it fails.
>
> Kirk are you and your customer able to use 'run_migration' as a workaround
> for now?

The customer I have has a different problem. At least I don't believe it is
related. That's why I could not L3 this bug. The problem they are having is:

Failed locating regionsrv-client cache files: No cloud-regionsrv-client
cache files found in /var/cache/cloudregister or /var/lib/cloudregister

This is only supposed to happen when the hosts file has an entry for
susecloud.net. They did at first, but they were removed and the problem
still happens. I may have to open another bug for that one.

From bugzilla_noreply at suse.com Tue Dec 13 15:30:44 2022
From: bugzilla_noreply at suse.com (bugzilla_noreply at suse.com)
Date: Tue, 13 Dec 2022 15:30:44 +0000
Subject: [Bug 1206194] Distro migration name resolution failure
In-Reply-To:
References:
Message-ID:

https://bugzilla.suse.com/show_bug.cgi?id=1206194

Karen Van der Veer changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kvanderveer at suse.com
         Whiteboard|                            |public-cloud-meeting-2022-12-13

From bugzilla_noreply at suse.com Tue Dec 13 15:35:14 2022
From: bugzilla_noreply at suse.com (bugzilla_noreply at suse.com)
Date: Tue, 13 Dec 2022 15:35:14 +0000
Subject: [Bug 1206194] Distro migration name resolution failure
In-Reply-To:
References:
Message-ID:

https://bugzilla.suse.com/show_bug.cgi?id=1206194
https://bugzilla.suse.com/show_bug.cgi?id=1206194#c9

--- Comment #9 from Keith Berger ---
PR for empty resolv.conf
https://github.com/SUSE/suse-migration-services/pull/254

From bugzilla_noreply at suse.com Thu Dec 15 16:46:53 2022
From: bugzilla_noreply at suse.com (bugzilla_noreply at suse.com)
Date: Thu, 15 Dec 2022 16:46:53 +0000
Subject: [Bug 1206194] Distro migration name resolution failure
In-Reply-To:
References:
Message-ID:

https://bugzilla.suse.com/show_bug.cgi?id=1206194
https://bugzilla.suse.com/show_bug.cgi?id=1206194#c10

Yogalakshmi Arunachalam changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |yarunachalam at suse.com

--- Comment #10 from Yogalakshmi Arunachalam ---
validated the PR
https://jira.suse.com/browse/CSE-420

From bugzilla_noreply at suse.com Thu Dec 15 22:18:36 2022
From: bugzilla_noreply at suse.com (bugzilla_noreply at suse.com)
Date: Thu, 15 Dec 2022 22:18:36 +0000
Subject: [Bug 1206194] Distro migration name resolution failure
In-Reply-To:
References:
Message-ID:

https://bugzilla.suse.com/show_bug.cgi?id=1206194
https://bugzilla.suse.com/show_bug.cgi?id=1206194#c12

Kirk Penrose changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|needinfo?                   |

--- Comment #12 from Kirk Penrose ---
(In reply to Keith Berger from comment #5)
> This may have been broken for a while. If you start the migration via the
> kexec method by running `sudo /usr/sbin/run_migration` it works. If you
> install suse-migration-sle15-activation and do the reboot method it fails.
>
> Kirk are you and your customer able to use 'run_migration' as a workaround
> for now?

This works for my testing.