Change location of DHCP leases in CloudStack provider as it doesn't work for RHEL8

Bug #1843334 reported by Marat Salakhutdinov
This bug affects 2 people
Affects: cloud-init
Status: Fix Released
Importance: Medium
Assigned to: Ryan Harper

Bug Description

Cloud-init fails to get the Virtual Router IP of the CloudStack provider on RHEL8 because the default lease path used in this part of the code:
https://github.com/number5/cloud-init/blob/07b17236be5665bb552c7460102bcd07bf8f2be8/cloudinit/net/dhcp.py#L24

has changed to /var/lib/NetworkManager in RHEL8.

To fix the problem, it is probably worth changing the code in the file above to something like this section of the code: https://github.com/number5/cloud-init/blob/acc25d8d7d603313059ac35b4253b504efc560a9/cloudinit/sources/DataSourceCloudStack.py#L172-L180
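
The idea in the description, looking in more than one lease directory instead of relying on a single hard-coded default, might be sketched as follows. This is only an illustration and not cloud-init's actual code: `find_lease_dir`, the candidate list, and the `.lease`/`.leases` suffix check are all assumptions made for the example.

```python
import os

# Hypothetical candidate list; which directories actually hold leases
# depends on the distribution and on the DHCP client in use.
CANDIDATE_LEASE_DIRS = (
    "/var/lib/NetworkManager",
    "/var/lib/dhclient",
    "/var/lib/dhcp",
)


def find_lease_dir(candidates=CANDIDATE_LEASE_DIRS):
    """Return the first candidate directory containing a lease file, else None."""
    for path in candidates:
        if not os.path.isdir(path):
            continue
        # Assumption: lease files end in ".lease" or ".leases".
        if any(name.endswith((".lease", ".leases")) for name in os.listdir(path)):
            return path
    return None
```

A caller could pass the returned directory to whatever lease parser it uses, and fall back to another discovery method when the result is `None`.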

Revision history for this message
Marat Salakhutdinov (maratsal) wrote :

In my setup, as a workaround, I just replace the default path with /var/lib/NetworkManager in this part of the code https://github.com/number5/cloud-init/blob/07b17236be5665bb552c7460102bcd07bf8f2be8/cloudinit/net/dhcp.py#L24 during the template creation process.

description: updated
Revision history for this message
Dan Watkins (oddbloke) wrote :

Hi Marat,

Thanks for filing this bug! Could you run `cloud-init collect-logs` on a failing instance and attach the produced tarball to this bug, please? This will give us enough information to effectively triage the issue. Once you've done so, please move the status back to New.

Thanks!

Dan

Changed in cloud-init:
status: New → Incomplete
Revision history for this message
Marat Salakhutdinov (maratsal) wrote :

attaching cloud-init logs

Dan Watkins (oddbloke)
Changed in cloud-init:
status: Incomplete → New
Revision history for this message
Marat Salakhutdinov (maratsal) wrote : Re: [Bug 1843334] Re: Change location of DHCP leases in CloudStack provider as it doesn't work for RHEL8

Hi Dan,

Forgot to set it to new after uploading logs but you have already done it
just recently :)

Please let me know if you need any additional details.

Thanks.
Marat

Revision history for this message
Ryan Harper (raharper) wrote :

Thanks for the logs. What I see is:

1) cloud-init generates a fallback networking config 'dhcp on eth0' which is rendered as sysconfig:

2019-09-10 20:00:38,606 - stages.py[INFO]: Applying network configuration from fallback bringup=False: {'config': [{'type': 'physical', 'name': 'eth0', 'mac_address': '02:00:52:bc:00:0c', 'subnets': [{'type': 'dhcp'}]}], 'version': 1}
2019-09-10 20:00:38,607 - __init__.py[DEBUG]: Selected renderer 'sysconfig' from priority list: None
2019-09-10 20:00:38,610 - util.py[DEBUG]: Writing to /etc/sysconfig/network-scripts/ifcfg-eth0 - wb: [644] 159 bytes
2019-09-10 20:00:38,613 - util.py[DEBUG]: Restoring selinux mode for /etc/sysconfig/network-scripts/ifcfg-eth0 (recursive=False)

2) When cloud-init init runs after networking is supposed to be up, there is no DHCP lease in the NetworkManager path:

2019-09-10 20:00:40,692 - handlers.py[DEBUG]: start: init-network/search-CloudStack: searching for network data from DataSourceCloudStack
2019-09-10 20:00:40,692 - __init__.py[DEBUG]: Seeing if we can get any data from <class 'cloudinit.sources.DataSourceCloudStack.DataSourceCloudStack'>
2019-09-10 20:00:40,693 - DataSourceCloudStack.py[DEBUG]: Using /var/lib/NetworkManager lease directory
2019-09-10 20:00:40,693 - DataSourceCloudStack.py[DEBUG]: No DHCP found, using default gateway
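
For context on the "using default gateway" fallback in the last log line: one common way to find the default gateway on Linux is to read the kernel routing table in /proc/net/route. The sketch below is an assumption about how such a lookup can be done, not a copy of cloud-init's implementation; `default_gateway_from_route_table` is a hypothetical name, and the hex decoding assumes the little-endian layout seen on x86 hosts.

```python
import socket
import struct

RTF_GATEWAY = 0x2  # route flag: the entry points at a gateway


def default_gateway_from_route_table(text):
    """Return the default gateway from /proc/net/route style text, or None."""
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        destination, gateway, flags = fields[1], fields[2], int(fields[3], 16)
        if destination == "00000000" and flags & RTF_GATEWAY:
            # Assumption: the address is stored as little-endian hex.
            return socket.inet_ntoa(struct.pack("<L", int(gateway, 16)))
    return None
```

On a running system the input would be the contents of /proc/net/route.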

Looking at the journal, I can see that NetworkManager is loading the ifcfg-rh plugin, which is supposed to support reading sysconfig files:

Sep 10 16:00:39.274968 centos-host NetworkManager[687]: <info> [1568145639.2746] settings: Loaded settings plugin: SettingsPluginIfcfg ("/usr/lib64/NetworkManager/1.14.0-14.el8/libnm-settings-plugin-ifcfg-rh.so")

And NetworkManager brings up eth0 with DHCP:

Sep 10 16:00:39.277538 centos-host NetworkManager[687]: <info> [1568145639.2775] settings: Loaded settings plugin: NMSIbftPlugin ("/usr/lib64/NetworkManager/1.14.0-14.el8/libnm-settings-plugin-ibft.so")
Sep 10 16:00:39.277571 centos-host NetworkManager[687]: <info> [1568145639.2775] settings: Loaded settings plugin: NMSKeyfilePlugin (internal)
Sep 10 16:00:39.281007 centos-host NetworkManager[687]: <info> [1568145639.2809] ifcfg-rh: new connection /etc/sysconfig/network-scripts/ifcfg-eth0 (5fb06bd0-0bb0-7ffb-45f1-d6edd65f3e03,"System eth0")
Sep 10 16:00:39.291088 centos-host NetworkManager[687]: <info> [1568145639.2910] manager: rfkill: WiFi enabled by radio killswitch; enabled by state file
Sep 10 16:00:39.291721 centos-host NetworkManager[687]: <info> [1568145639.2917] manager: rfkill: WWAN enabled by radio killswitch; enabled by state file
Sep 10 16:00:39.292362 centos-host NetworkManager[687]: <info> [1568145639.2923] manager: Networking is enabled by state file
Sep 10 16:00:39.293807 centos-host NetworkManager[687]: <info> [1568145639.2937] dhcp-init: Using DHCP client 'internal'
Sep 10 16:00:39.294565 centos-host nm-dispatcher[698]: req:1 'hostname': new request (3 scripts)
Sep 10 16:00:39.294577 centos-host nm-dispatcher[698]: req:1 'hostname': start running ordered scripts...
Sep 10 16:00:39.303417 centos-host NetworkManager[687]: <info> [1568145639.3033] Loaded device plugin: NMTeamFactory (/usr/lib64/NetworkManager/1.14.0-14.el...

Changed in cloud-init:
status: New → Incomplete
Revision history for this message
Marat Salakhutdinov (maratsal) wrote :

I thought the problem was in this line: https://github.com/cloud-init/cloud-init/blob/fa47d527a03a00319936323f0a857fbecafceaf7/cloudinit/sources/DataSourceCloudStack.py#L222

The problem is with dhcp.networkd_get_option_from_leases('SERVER_ADDRESS'): we don't specify the location of the leases, so it defaults to the one specified here https://github.com/cloud-init/cloud-init/blob/fa47d527a03a00319936323f0a857fbecafceaf7/cloudinit/net/dhcp.py#L24, and because of that the function cannot return the VR IP correctly.

It then falls through to the second part of the function https://github.com/cloud-init/cloud-init/blob/fa47d527a03a00319936323f0a857fbecafceaf7/cloudinit/sources/DataSourceCloudStack.py#L234-L246, which fails to get the DHCP server address because it looks for a "dhcp-server-identifier" match and not "SERVER_ADDRESS". As a result, it writes this message (from the code linked above) to the logs:
2019-09-10 20:00:40,693 - DataSourceCloudStack.py[DEBUG]: No DHCP found, using default gateway
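
The mismatch described above, one code path matching "dhcp-server-identifier" and the lease file carrying "SERVER_ADDRESS", can be illustrated with a parser that accepts either key. This is a rough sketch rather than the fix that was committed: `server_address_from_lease` is a hypothetical helper, and the two lease layouts it expects (dhclient-style `option ...;` lines and `KEY=value` lines) are assumptions for the example.

```python
import re

# Key names come from the discussion above; the surrounding file layouts
# are assumptions for illustration.
_DHCLIENT_RE = re.compile(r"option\s+dhcp-server-identifier\s+([0-9.]+)\s*;")
_KEYVAL_RE = re.compile(r"^SERVER_ADDRESS=([0-9.]+)\s*$", re.MULTILINE)


def server_address_from_lease(text):
    """Return the DHCP server address found in lease text, or None."""
    matches = _DHCLIENT_RE.findall(text)
    if matches:
        # dhclient-style files append leases, so the last entry is the newest.
        return matches[-1]
    match = _KEYVAL_RE.search(text)
    return match.group(1) if match else None
```

Trying both key names in one place keeps the result independent of which DHCP client wrote the lease file.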

However, after using your suggestion it works. The name of the service is NetworkManager.service:

[root@centos-host ~]# systemctl status NetworkManager
● NetworkManager.service - Network Manager
   Loaded: loaded (/usr/lib/systemd/system/NetworkManager.service; enabled; vendor preset: enabled)
   Active: active (running) since Tue 2019-09-10 16:00:38 EDT; 19h ago
     Docs: man:NetworkManager(8)
 Main PID: 687 (NetworkManager)
    Tasks: 3 (limit: 5033)
   Memory: 9.2M
   CGroup: /system.slice/NetworkManager.service
           └─687 /usr/sbin/NetworkManager --no-daemon
.....
[root@centos-host ~]#

And after trying your suggestion to start cloud-init after NetworkManager, it worked fine.

[root@centos-host cca-user]# cat /etc/systemd/system/cloud-init.service.d/10-nm.conf
[Unit]
After=NetworkManager.service
[root@centos-host cca-user]#

Thanks for finding a solution!

Changed in cloud-init:
status: Incomplete → New
Ryan Harper (raharper)
Changed in cloud-init:
assignee: nobody → Ryan Harper (raharper)
importance: Undecided → Medium
status: New → In Progress
Revision history for this message
Server Team CI bot (server-team-bot) wrote :

This bug is fixed with commit 8888ca1a to cloud-init on branch master.
To view that commit see the following URL:
https://git.launchpad.net/cloud-init/commit/?id=8888ca1a

Changed in cloud-init:
status: In Progress → Fix Committed
Revision history for this message
Chad Smith (chad.smith) wrote : Fixed in cloud-init version 19.2-73.

This bug is believed to be fixed in cloud-init in version 19.2-73. If this is still a problem for you, please make a comment and set the state back to New.

Thank you.

Changed in cloud-init:
status: Fix Committed → Fix Released