ifup-wait-all-auto does not wait for interfaces to be fully up

Bug #1442828 reported by Jay Vosburgh
This bug affects 1 person
Affects            Status         Importance   Assigned to   Milestone
ifupdown (Ubuntu)  Fix Released   High         Martin Pitt
  Vivid            Fix Released   High         Martin Pitt

Bug Description

The change to ifup@.service done as part of LP 1425376 appears to break the ordering of units marked as "After=network-online.target". In my specific case, a new service script with "After=network-online.target" is erroneously run concurrently with dhclient. As the new script depends on networking configuration being complete, it fails as the IP addresses and routes from DHCP are not configured. This functioned correctly on vivid daily images from a few days ago, and appears to break starting with the vivid daily from approximately 0409.

Infinity suggested this change as a likely suspect:

diff -Nru systemd-219/debian/extra/units/ifup@.service systemd-219/debian/extra/units/ifup@.service
--- systemd-219/debian/extra/units/ifup@.service 2015-04-02 08:08:56.000000000 +0000
+++ systemd-219/debian/extra/units/ifup@.service 2015-04-07 14:38:38.000000000 +0000
@@ -6,10 +6,8 @@
 DefaultDependencies=no

 [Service]
-Type=oneshot
-ExecStart=/sbin/ifup --allow=hotplug %I
-ExecStartPost=/sbin/ifup --allow=auto %I
 # only fail if ifupdown knows about the iface AND it's not up
-ExecStartPost=/bin/sh -c 'if ifquery %I >/dev/null; then ifquery --state %I >/dev/null; fi'
+ExecStart=/bin/sh -ec 'ifup --allow=hotplug %I; ifup --allow=auto %I; \
+ if ifquery %I >/dev/null; then ifquery --state %I >/dev/null; fi'
 ExecStop=/sbin/ifdown %I
 RemainAfterExit=true

and, indeed, reverting this (copying ifup@.service from a few-days old vivid image to a current image) resolves the problem.

The affected version is ubuntu-vivid-daily-amd64-server-20150409.2 (installed via AWS).

Tags: systemd-boot
Jay Vosburgh (jvosburgh)
affects: linux (Ubuntu) → systemd (Ubuntu)
Adam Conrad (adconrad) wrote :

Martin, this looks like it renders the network-online target entirely useless. When we're in Austin, I think you, me, and stgraber should sit down and argue a bit about what "correct" behaviour is, but I'd guess it'll look something like "block on a one-shot attempt to bring up the network, and if that fails miserably, background a second attempt so the boot can continue". Ish.

Changed in systemd (Ubuntu):
assignee: nobody → Martin Pitt (pitti)
Martin Pitt (pitti) wrote :

I suppose that merely reintroducing the Type=oneshot will make things work again?

However, I don't really understand this yet. ifup@.service should have nothing to do with how network-online.target behaves. For the record, the design of this is as follows:

  * ifup@.service should not be depended on by anything, and it should be possible to run this asynchronously. This job will bring up hotplugged network interfaces during runtime, and the Before=network.target will ensure that any consumer of network related functionality will stop before any ifup@.service during shutdown. The "network.target" is relatively uninteresting during bootup (there its main function is to sort firewalls etc. before bringing up any network interface via network-pre.target).

 * Coldplugged interfaces during boot, and virtual ones like bridges, bonds, etc. will be brought up by /etc/init.d/networking (aka networking.service). However, this is Type=forking, not oneshot, so will (and should) run in parallel with everything else during boot.

 * With ifupdown (i. e. /etc/network/interfaces), the ifup-wait-all-auto.service unit is supposed to wait for all interfaces to be "up", and this implements network-online.target. This calls "ifquery --state <interface>" to ask ifupdown whether an interface is up. So from your bug report I conclude that this isn't actually working, and ifupdown considers an interface as "up" even before dhclient finished?
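For illustration, the state-polling approach described in the last bullet might look roughly like the sketch below. It is not the shipped unit: `wait_state_up`, the `IFQUERY` override hook, and the retry limit are hypothetical names and choices introduced here so the loop can be exercised without ifupdown installed. It also assumes that `ifquery --state` exits non-zero while any listed interface is not yet recorded as up.

```shell
#!/bin/sh
# Sketch only: poll "ifquery --state <iface>..." until ifupdown reports
# every listed interface as up.
# IFQUERY is a hypothetical override hook; it defaults to the real ifquery.
IFQUERY=${IFQUERY:-ifquery}

# wait_state_up MAX_TRIES IFACE...
# Returns 0 once the state query succeeds, 1 after MAX_TRIES failed polls.
wait_state_up() {
    max_tries=$1; shift
    tries=0
    while ! $IFQUERY --state "$@" >/dev/null 2>&1; do
        tries=$((tries + 1))
        [ "$tries" -lt "$max_tries" ] || return 1
        sleep 1
    done
    return 0
}
```

A caller would use something like `wait_state_up 30 eth0`. A loop of this kind only reflects what ifupdown has recorded in its state file, so it cannot by itself tell whether a DHCP client started by ifup has finished.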

It was, and would be, wrong to make network.target wait for any ifup@.service to finish again. This unnecessarily delays the boot and causes overly long hangs when any network interface is unavailable, as in bug 1425376.

So if I understood this right, then asking ifquery --state in ifup-wait-all-auto.service isn't sufficient, and this check needs to become stronger?

To confirm this, would you mind attaching "journalctl -b" to this bug, once for a boot with the current ifup@.service and once with reverting back to Type=oneshot? I'd like to compare the startup order and DHCP interaction there. Thanks!

Changed in systemd (Ubuntu):
importance: Undecided → High
affects: systemd (Ubuntu) → ifupdown (Ubuntu)
Martin Pitt (pitti)
tags: added: systemd-boot
summary: - change for LP 1425376 breaks systemd After=network-online.target
+ ifup-wait-all-auto does not wait for interfaces to be fully up
Martin Pitt (pitti) wrote :

In the future we need to clean up the ifupdown integration, and use something like https://people.debian.org/~biebl/ifupdown-wait-online.tar.gz to wait properly without this active shell waiting loop. (This is similar to /etc/network/if-up.d/upstart).

For vivid, an unintrusive fix is to extend the waiting loop to wait until all /run/network/ifup-*.pid files are gone, i. e. to wait until ifup is done, not just wait until all of them have started.
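As a rough sketch of the idea in the paragraph above (waiting for the ifup pid files to disappear), the loop could look like this. It is not the uploaded fix itself: `wait_ifup_done`, the `RUN_DIR` override, and the retry limit are hypothetical, and the per-interface file name `ifup-<iface>.pid` is an assumption based on the `/run/network/ifup-*.pid` pattern mentioned here.

```shell
#!/bin/sh
# Sketch only: block until no ifup pid file remains for the given interfaces.
# RUN_DIR is a hypothetical override; the default follows the
# /run/network/ifup-*.pid pattern named in this report.
RUN_DIR=${RUN_DIR:-/run/network}

# wait_ifup_done MAX_TRIES IFACE...
# Returns 0 when every pid file is gone, 1 if one survives MAX_TRIES polls.
wait_ifup_done() {
    max_tries=$1; shift
    for iface in "$@"; do
        tries=0
        # Assumed naming: one pid file per interface while its ifup runs.
        while [ -e "$RUN_DIR/ifup-$iface.pid" ]; do
            tries=$((tries + 1))
            [ "$tries" -lt "$max_tries" ] || return 1
            sleep 1
        done
    done
    return 0
}
```

The retry limit is only there so the sketch cannot hang forever; whether a real unit should give up, and after how long, is a separate design decision.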

Changed in ifupdown (Ubuntu):
status: New → In Progress
Martin Pitt (pitti) wrote :

I'm not exactly proud of that entire shell hack, but so close to vivid's release I believe this is safer than entirely re-engineering the ifupdown integration. Fixed package uploaded to the vivid review queue.

This is on the list of cleanups to do for Debian jessie+1, which will drop the shell waiting entirely.

Changed in ifupdown (Ubuntu Vivid):
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package ifupdown - 0.7.48.1ubuntu9

---------------
ifupdown (0.7.48.1ubuntu9) vivid; urgency=medium

  * debian/ifupdown.ifup-wait-all-auto.service: Wait until ifup's are actually
    done (their pid files go away), not just until ifstate claims that they
    are up (which happens at the beginning of ifup, not at the end). Fixes
    network-online.target to wait until "auto" interfaces are really up.
    (LP: #1442828)
 -- Martin Pitt <email address hidden> Sun, 12 Apr 2015 06:49:10 -0500

Changed in ifupdown (Ubuntu Vivid):
status: Fix Committed → Fix Released
Jay Vosburgh (jvosburgh) wrote :

ifupdown 0.7.48.1ubuntu9 resolves the original problem for me on a fresh vivid install with the daily build for today.

Thanks.
