Quickly changing netplan configuration sometimes causes systemd-networkd to fail

Bug #1850244 reported by Lee Trager
This bug affects 1 person
Affects   Status         Importance   Assigned to   Milestone
MAAS      Fix Released   High         Lee Trager
Netplan   New            Undecided    Unassigned

Bug Description

MAAS 2.7 adds support for testing network configuration. The custom network configuration is applied only while a connectivity test is running; otherwise the default network configuration (DHCP) is used. If multiple connectivity tests are run, the configuration is reapplied each time.

For example, if the smartctl-validate, internet-connectivity, gateway-connectivity, and rack-controller-connectivity tests are selected, the following happens:

1. Machine boots into the MAAS ephemeral environment
2. Networking is automatically configured to DHCP on the boot device as the default
3. smartctl-validate test runs
4. netplan apply custom networking
5. internet-connectivity test runs
6. netplan apply default networking
7. netplan apply custom networking
8. gateway-connectivity test runs
9. netplan apply custom networking
10. netplan apply custom networking
11. rack-controller-connectivity test runs
12. netplan apply custom networking

When one of the connectivity tests runs quickly enough, it causes systemd-networkd to fail with:

systemd-networkd.service: Start request repeated too quickly.
systemd-networkd.service: Failed with result 'start-limit-hit'.
systemd-networkd.service: Failed to start Network Service.

This leaves the system without networking, which causes all remaining tests to time out and puts the machine into a failed testing state. If I access the machine through the console and run systemctl restart systemd-networkd, networking comes back up.
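
To illustrate why a fast test run can trip the limit, here is a rough sketch of systemd's start rate limiting. It assumes systemd-networkd uses systemd's documented defaults (StartLimitIntervalSec=10s, StartLimitBurst=5) and that each netplan apply causes one start of systemd-networkd; neither is confirmed in this report. The function name hits_start_limit and the timestamps are hypothetical, and the model is simplified.

```shell
#!/bin/sh
# Simplified model of a start rate limit (assumption, not systemd's code):
# the first start opens a counting window of INTERVAL seconds, and at most
# BURST starts are accepted before that window expires.
# Usage: hits_start_limit INTERVAL BURST T1 T2 ...  (start times in seconds, ascending)
hits_start_limit() {
    interval=$1
    burst=$2
    shift 2
    begin=""    # time at which the current counting window opened
    num=0       # starts accepted in the current window
    for t in "$@"; do
        # Open a new window on the first start or once the old one has expired.
        if [ -z "$begin" ] || [ $((t - begin)) -gt "$interval" ]; then
            begin=$t
            num=0
        fi
        if [ "$num" -ge "$burst" ]; then
            echo "start at t=${t}s refused (start-limit-hit)"
            return 0
        fi
        num=$((num + 1))
    done
    echo "no start refused"
    return 1
}
```

Under these assumptions, hits_start_limit 10 5 0 1 2 3 4 5 reports the sixth start as refused, which is roughly what six netplan apply calls within a few seconds would look like. Spacing the same six starts three seconds apart stays under the limit.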

Ryan Harper (raharper) wrote :

You may want to add a drop-in systemd-networkd.service config file that disables the rate-limit checking during your testing.

% mkdir -p /etc/systemd/system/systemd-networkd.service.d
% echo -e "[Service]\nStartLimitInterval=0" > /etc/systemd/system/systemd-networkd.service.d/10-extra.conf
% systemctl daemon-reload

Lee Trager (ltrager) wrote :

The attached branch does something similar. maas-run-remote-scripts now connects to D-Bus so that, whenever a network configuration is applied, it checks the status of systemd-networkd. If systemd-networkd has failed, is inactive, or is deactivating, maas-run-remote-scripts requests that systemd-networkd be restarted. It then spins for up to 5s until systemd-networkd is active.

netplan may want to do something similar, as I didn't expect to have to check the status of systemd-networkd after running netplan apply.
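
For anyone who needs the same check outside MAAS, a minimal shell sketch of the idea follows. This is not the code from the MAAS branch, which talks to D-Bus from Python; it uses systemctl instead. The function name ensure_networkd_active and the SYSTEMCTL override variable are hypothetical, and the reset-failed call is an addition of mine to clear the start rate limit counter before restarting.

```shell
#!/bin/sh
# Hypothetical helper, not the MAAS implementation.
# SYSTEMCTL can be overridden (e.g. with a stub) for testing.
SYSTEMCTL=${SYSTEMCTL:-systemctl}

# Usage: ensure_networkd_active [UNIT] [TIMEOUT_SECONDS]
# Returns 0 once UNIT is active, 1 if it is not active within the timeout.
ensure_networkd_active() {
    unit=${1:-systemd-networkd}
    timeout=${2:-5}

    # is-active prints the unit state and exits non-zero unless it is "active".
    state=$("$SYSTEMCTL" is-active "$unit" 2>/dev/null) || true

    case "$state" in
        failed|inactive|deactivating)
            # Clear the failed state (and rate limit counter), then restart.
            "$SYSTEMCTL" reset-failed "$unit" 2>/dev/null || true
            "$SYSTEMCTL" restart "$unit" || return 1
            ;;
    esac

    # Poll once per second until the unit is active or the timeout expires.
    waited=0
    while [ "$waited" -lt "$timeout" ]; do
        if [ "$("$SYSTEMCTL" is-active "$unit" 2>/dev/null)" = "active" ]; then
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 1
}
```

One way to use it would be to call ensure_networkd_active right after each netplan apply and treat a non-zero return as a failed network setup.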

Changed in maas:
status: In Progress → Fix Committed
Changed in maas:
status: Fix Committed → Fix Released