cloud-init's netplan rendering does not do anything that starts networkd

Bug #1737630 reported by Michael Hudson-Doyle
This bug affects 2 people
Affects     Status   Importance  Assigned to  Milestone
cloud-init  Invalid  Undecided   Unassigned

Bug Description

Currently, if an instance ends up using cloud-init's netplan support with the networkd backend, networkd is never started and so networking doesn't come up. The fix is probably to call "netplan apply" rather than "netplan generate".

Revision history for this message
Scott Moser (smoser) wrote :

The goal is that cloud-init should not have to call netplan apply.
Rather, cloud-init runs Before=network-pre.target. Then, the normal
processes that would bring up networking will run After network-pre.target,
and they'll see and consume the files that cloud-init (and netplan generate)
wrote.

The goal is basically that this is the same as if cloud-init wrote
those files and then did a /sbin/reboot.

Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote :

That's a fine goal but we do need the network to come up :-)

I guess it could invoke the generator as systemd would, after removing netplan.stamp?

Or maybe just run systemctl enable systemd-networkd if glob('/run/systemd/network/*netplan-*') is truthy?
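
A minimal sketch of that second idea, assuming the check runs somewhere with access to /run/systemd/network. The function names are hypothetical and this is not cloud-init code; it only builds the command rather than running it. Note that "systemctl enable" on its own does not start the unit in the current boot.

```python
import glob
import os


def netplan_rendered_networkd(run_dir="/run/systemd/network"):
    """Return True if any netplan-rendered networkd file exists in run_dir."""
    return bool(glob.glob(os.path.join(run_dir, "*netplan-*")))


def enable_command(run_dir="/run/systemd/network"):
    """Return the systemctl command to run, or None if nothing was rendered."""
    if netplan_rendered_networkd(run_dir):
        return ["systemctl", "enable", "systemd-networkd"]
    return None
```

The returned list could be passed to subprocess.run(); keeping the decision separate from the execution makes the check easy to exercise against a temporary directory.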

Revision history for this message
Steve Langasek (vorlon) wrote : Re: [Bug 1737630] Re: cloud-init's netplan rendering does not do anything that starts networkd

On Tue, Dec 12, 2017 at 08:00:07AM -0000, Michael Hudson-Doyle wrote:
> That's a fine goal but we do need the network to come up :-)

> I guess it could invoke the generator as systemd would, after removing
> netplan.stamp?

The correct way to re-trigger generators is with 'systemctl daemon-reload'.

Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote :

>
>
> > I guess it could invoke the generator as systemd would, after removing
> > netplan.stamp?
>
> The correct way to re-trigger generators is with 'systemctl
> daemon-reload'.

That might help, but when does this bit of cloud-init execute? All the
generator does in this situation is add systemd-networkd to
multi-user.target.wants so if it's after that target has been reached it
won't help. I guess it is probably before that.

Is there an ETA on this bug?

Revision history for this message
Scott Moser (smoser) wrote :

@Steve,
But why should we/cloud-init need to do that? Why do we *not* need to do that now in cloud-images?

@Michael,
"Is there an ETA on this bug?"

I can't reproduce the bug, nor do I understand it.

Other than bug 1728181, in all my experience it is working.

Can you show how to recreate failure?

Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote :

Why this doesn't happen in cloud-images is a good question...

Ah, this is why: http://launchpadlibrarian.net/337315344/livecd-rootfs_2.456_2.457.diff.gz

I guess I can do something similar for live-server...

Revision history for this message
Scott Moser (smoser) wrote :

Michael, so should we un-mark cloud-init? What should be marked as affected instead?

Revision history for this message
Mike Gerdts (mgerdts) wrote :

I am currently working on bhyve support in SmartOS and have encountered this same thing with the ubuntu-certified-17.10 (e0ef873d-6423-4002-90db-1ecd1d99dff2) image using changeset f7deaf15acf382d62554e2b1d70daa9a9109d542 plus a couple other changes unrelated to this issue. The other changes can be seen at

https://github.com/mgerdts/cloud-init/commits/smartos-bhyve

I'd be happy to work with you on reproducing this.

Revision history for this message
Mike Gerdts (mgerdts) wrote :

Forget my previous comment. After reverting the change to the datasources definition, I no longer see this problem.

Scott Moser (smoser)
Changed in cloud-init:
status: New → Invalid

Revision history for this message
Konstantin Khlebnikov (koct9i) wrote :

The bug is still not fixed. I see a race between netplan generation and network startup.
If I remove most services from the image, networking starts before the netplan config is generated and nothing works.
Adding runcmd: [netplan apply] to the cloud-config helps.

Revision history for this message
Konstantin Khlebnikov (koct9i) wrote :

False alarm. It seems the systemd-networkd service is not enabled by default for some reason.
"netplan apply" restarts it and thus mitigates the original issue.

Revision history for this message
Dimitri John Ledkov (xnox) wrote :

I think the cloud-init generator should enable systemd-networkd at generator time if clearly nothing else is going to supply networking info.

Then cloud-init generating netplan pre-networking, and asking netplan to generate networkd units, will make things work.

Revision history for this message
Dan Watkins (oddbloke) wrote :

cloud-init supports rendering configuration for multiple different networking backends (netplan, ifupdown, sysconfig). Note that networkd is _not_ one of those backends (because it configures networkd via netplan), so cloud-init feels like the wrong place to be poking at networkd.

This feels like an image mastering issue to me. If you create an Ubuntu image that isn't configured to bring up networking, then it's not going to bring up networking. Perhaps I'm missing something here.

Revision history for this message
Dimitri John Ledkov (xnox) wrote :

Sure, and the image is configured to use cloud-init with the netplan backend, which itself uses the networkd backend.

Now, the issue is that of a systemd boot transaction and the inability to affect it whilst it's in flight.

Thus netplan ships a generator, which on boot at generator stage (prior to calculating and executing the initial transaction) reads netplan.yaml configs and generates appropriate dependencies for a particular backend to be invoked.

Similarly cloud-init ships a generator, to hookup all the cloud-init units & targets into the initial transaction.

The problem is one of integration between the two generators.

cloud-init, at generator stage, potentially knows that it will spit out netplan configs and will generate netplan units *just in time* for the networking.target to have all the right files on disk.

But netplan, at generator stage, sees no configs whatsoever, doesn't know which backend to use and hook up for the boot target, and thus does nothing. When cloud-init later calls netplan to generate just-in-time units, netplan does generate dependencies to start the right networking backend, but it's too late: the initial boot transaction is already in progress and cannot be changed to slot in systemd-networkd and systemd-networkd-wait-online to bring up networking.

It feels like there should be some integration between cloud-init & netplan generators to indicate that "yes we will have networking config" and "please hookup <default>, networkd, NetworkManager backends for the networking.target"

Revision history for this message
Dimitri John Ledkov (xnox) wrote :

Note that as this was rediscovered many times over, we have cargo-culted shipping things like:

0 ./live-build/ubuntu-cpc/includes.chroot/etc/systemd/system/network-online.target.wants/systemd-networkd-wait-online.service
0 ./live-build/ubuntu-cpc/includes.chroot/etc/systemd/system/multi-user.target.wants/systemd-networkd.service
0 ./live-build/ubuntu-cpc/includes.chroot/etc/systemd/system/sockets.target.wants/systemd-networkd.socket

In the cloud images, then in the ubuntu-server live images, and probably many other places too.
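
As a rough sketch of what that workaround amounts to at image-build time, the following creates the same three entries under a chroot. It assumes they are ordinary .wants symlinks pointing at units in /lib/systemd/system; the listing above does not show link targets, so UNIT_DIR and the function name are assumptions, not livecd-rootfs code.

```python
import os

# Assumption: link targets live here; the listing does not show them.
UNIT_DIR = "/lib/systemd/system"

# (target, unit) pairs taken from the three paths listed above.
WANTS = [
    ("network-online.target", "systemd-networkd-wait-online.service"),
    ("multi-user.target", "systemd-networkd.service"),
    ("sockets.target", "systemd-networkd.socket"),
]


def enable_networkd_in_image(chroot):
    """Create the three .wants symlinks under <chroot>/etc/systemd/system."""
    links = []
    for target, unit in WANTS:
        wants_dir = os.path.join(chroot, "etc/systemd/system", target + ".wants")
        os.makedirs(wants_dir, exist_ok=True)
        link = os.path.join(wants_dir, unit)
        # lexists() also catches dangling symlinks, which are expected when
        # the chroot is inspected from outside.
        if not os.path.lexists(link):
            os.symlink(os.path.join(UNIT_DIR, unit), link)
        links.append(link)
    return links
```

Because the links are present before systemd computes the initial boot transaction, networkd is pulled in regardless of what either generator sees at generator time, which is why this works around the problem without fixing it.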

I'm not sure if it's cloud-init generator that should be poking netplan to "activate backend" or if netplan generator should "notice" that "we are cloud-initing, generate backend dependencies only" or some such.

Revision history for this message
Dan Watkins (oddbloke) wrote :

On Wed, Oct 09, 2019 at 01:57:00PM -0000, Dimitri John Ledkov wrote:
> Sure, and the image is configured to use cloud-init with netplan backend,
> which itself uses networkd backend.

The cloud-init generator is shipped by upstream, so it's intended to be
used on distros other than just Ubuntu. I'm only bringing this up
because there is a non-zero possibility that, in the future, someone
could use netplan to configure the NetworkManager backend in a
configuration that we need to support upstream.

(The generator is templated, so we can easily make distro-specific
changes; this is more of a reminder that we need to be careful about
what is generally useful upstream and what is useful to Ubuntu only.)

> Now, the issue is that of a systemd boot transaction and inability to
> affect it whilst it's in flight.
>
> Thus netplan ships a generator, which on boot at generator stage (prior
> to calculating and executing the initial transaction) reads netplan.yaml
> configs and generates appropriate dependencies for a particular backend
> to be invoked.
>
> Similarly cloud-init ships a generator, to hookup all the cloud-init
> units & targets into the initial transaction.
>
> The problem is the one of the integration of the two generators.
>
> cloud-init at generator stage, potentially knows, that it will spit out
> netplan configs & will generate netplan units *just in time* for the
> networking.target to have all the files on disk to have the right
> things.

So, currently, the cloud-init generator only knows about how to detect
the data source that should be used for the platform (as well as
handling kernel cmdline options to modify its behaviour, including
"always disable cloud-init", for example). The generator is implemented
in shell, as are its dependencies, so that we don't have to pay Python
startup costs to determine that we didn't need to run cloud-init at all.

cloud-init's network backend selection is dynamic. Once we've
determined the network configuration that should be applied, we detect
what network backends are available on the system and then select the
highest priority one (eni over sysconfig over netplan). The
availability checks for ENI and netplan are pretty simple: are the
appropriate binaries available and, for ENI, does
/etc/network/interfaces exist. (Naturally, all this logic is only
implemented in Python at the moment, so we would need to reimplement it
in shell for use in the generator.)
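
The selection logic described above can be sketched roughly as follows. This is an illustration only, not cloud-init's implementation: the function names and the binary names used in the checks ("ifup", "netplan") are assumptions, and the sysconfig check is left out because it is not described here.

```python
import os
import shutil

# Priority order as stated above: eni over sysconfig over netplan.
DEFAULT_PRIORITY = ("eni", "sysconfig", "netplan")


def default_checks():
    """Availability checks; binary names are illustrative guesses."""
    return {
        "eni": lambda: bool(shutil.which("ifup"))
        and os.path.exists("/etc/network/interfaces"),
        "netplan": lambda: bool(shutil.which("netplan")),
    }


def select_backend(checks, priority=DEFAULT_PRIORITY):
    """Return the first backend in priority order whose check passes."""
    for name in priority:
        check = checks.get(name)
        # A backend with no registered check is treated as unavailable.
        if check is not None and check():
            return name
    return None
```

Calling select_backend(default_checks()) on a given system returns whichever backend wins there. Doing the same in the generator would mean reimplementing these checks in shell, as noted above.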

One area of complexity is that the application of network configuration
can be entirely disabled. I don't, off-hand, know the ways in which we
support doing this, but I think that the cloud-init generator shouldn't
generate any network backend dependencies in those cases. (Other than
explicitly being disabled, cloud-init expects to generate network
configuration on every first boot.)

> But netplan at generator stage, sees no configs whatsoever, doesn't
> know which backend to use & hook up for the boot target, and thus does
> nothing. When cloud-init calls netplan to generate just in time units,
> netplan does generate dependencies to start the right networking
> backend, but it's too late as the initial boot transaction is already i...


Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote :

>
> > It feels like there should be some integration between cloud-init &
> > netplan generators to indicate that "yes we will have networking config"
> > and "please hookup <default>, networkd, NetworkManager backends for the
> > networking.target"
>
> Yes, I agree that some integration is warranted. Are generators
> ordered?

From systemd.generator(7): "All generators are executed in parallel. That
means all executables are started at the very same time and need to be able
to cope with this parallelism."

(and in general, it would be very un-systemd for there to be any kind of
implicit ordering of this sort of thing)

Revision history for this message
James Falcon (falcojr) wrote :