Comment 17 for bug 1718568

Dan Streetman (ddstreet) wrote :

> Is there no other more reliable way to reproduce the case that's being fixed here

Sure, yes; here's what I did. First, setup:

-create VMs for the releases (T/X/A), managed by virsh (e.g. with uvt-kvm or whatever)
-add a second interface to each of them, e.g.:

$ virsh attach-interface lp1718568-artful network default --model virtio --persistent

-set up another server/VM/container/whatever, connected to the same network as the test VMs' second interfaces, and install and configure isc-dhcp-server (or dnsmasq, or any other DHCPv6 server) on it to serve out DHCPv6

now on each release VM:

-check the virsh interface you'll be using, e.g.:

$ virsh domiflist lp1718568-artful
Interface  Type     Source   Model   MAC
-------------------------------------------------------
vnet3      network  default  virtio  52:54:00:3f:b9:ad
vnet0      network  default  virtio  52:54:00:bf:16:8a

Confirm which one matches the test interface in the VM by comparing MAC addresses; in my case it's vnet0. Now bring its link state down using virsh:

$ virsh domif-setlink lp1718568-artful vnet0 down
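
The MAC comparison above can also be scripted. This is only a sketch: vnet_for_mac is a hypothetical helper name, and it assumes the MAC is the fifth column of the domiflist output, as in the listing above.

```shell
# Hypothetical helper: read `virsh domiflist` output on stdin and print the
# host-side interface name whose MAC (assumed to be column 5) matches $1.
# The comparison is case-insensitive; header and separator lines never match.
vnet_for_mac() {
    awk -v mac="$1" 'NF >= 5 && tolower($5) == tolower(mac) { print $1 }'
}
```

For example: virsh domiflist lp1718568-artful | vnet_for_mac 52:54:00:bf:16:8a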

Then ssh into the test VM (over its first, still-working default interface, or use virt-viewer or whatever) and test dhclient -6, making sure to first verify that the test interface is down (i.e. that it doesn't already have a link-local address):

ubuntu@lp1718568-artful:~$ sudo ip a show ens7
3: ens7: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000
    link/ether 52:54:00:bf:16:8a brd ff:ff:ff:ff:ff:ff

ubuntu@lp1718568-artful:~$ sudo dhclient -v -6 ens7
Internet Systems Consortium DHCP Client 4.3.5
Copyright 2004-2016 Internet Systems Consortium.
All rights reserved.
For info, please visit https://www.isc.org/software/dhcp/

no link-local IPv6 address for ens7

If you think you have received this message due to a bug rather
than a configuration issue please read the section on submitting
bugs on either our web page at www.isc.org or in the README file
before submitting a bug. These pages explain the proper
process and the information we find helpful for debugging..

exiting.

As expected (for this bug), that fails immediately. Now upgrade to the test version in my PPA:
https://launchpad.net/~ddstreet/+archive/ubuntu/lp1718568

ubuntu@lp1718568-artful:~$ dpkg -l | grep isc-dhcp
ii isc-dhcp-client 4.3.5-3ubuntu2.2+hf1718568v20180302b1 amd64 DHCP client for automatically obtaining an IP address
ii isc-dhcp-common 4.3.5-3ubuntu2.2+hf1718568v20180302b1 amd64 common manpages relevant to all of the isc-dhcp packages

again, make sure the interface is down (no link-local addr) and try the new dhclient:

ubuntu@lp1718568-artful:~$ sudo ip l set down dev ens7
ubuntu@lp1718568-artful:~$ sudo ip a show ens7
3: ens7: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000
    link/ether 52:54:00:bf:16:8a brd ff:ff:ff:ff:ff:ff

ubuntu@lp1718568-artful:~$ sudo dhclient -v -6 ens7
Internet Systems Consortium DHCP Client 4.3.5
Copyright 2004-2016 Internet Systems Consortium.
All rights reserved.
For info, please visit https://www.isc.org/software/dhcp/

Now, instead of immediately exiting with an error, it waits. This is where it's waiting for the interface to get a 'tentative' link-local address and then complete duplicate address detection (DAD), so that the address becomes a normal link-local address and DHCPv6 can begin. After only a few seconds (dhclient-script.linux defaults to 60 attempts with 0.1-second delays between them, so ~6 seconds of waiting) it gives up and exits as before:

no link-local IPv6 address for ens7

If you think you have received this message due to a bug rather
than a configuration issue please read the section on submitting
bugs on either our web page at www.isc.org or in the README file
before submitting a bug. These pages explain the proper
process and the information we find helpful for debugging..

exiting.
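
For illustration, the wait described above (60 attempts, 0.1 seconds apart) can be sketched roughly as below. This is not the code from dhclient-script.linux; the function names are hypothetical, and it assumes that `ip -6 -o addr show` marks an address still undergoing DAD with the word "tentative".

```shell
# Hypothetical helper: succeed if the `ip -6 -o addr show` output on stdin
# contains a link-local address that is neither tentative nor dadfailed.
has_usable_link_local() {
    awk '/scope link/ && !/tentative/ && !/dadfailed/ { found = 1 }
         END { exit !found }'
}

# Poll the interface $1 up to $2 times (default 60), sleeping $3 seconds
# (default 0.1) between attempts; with the defaults that is ~6 seconds.
# Returns 0 as soon as a usable link-local address appears, 1 on timeout.
wait_for_link_local() {
    iface=$1 attempts=${2:-60} delay=${3:-0.1}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if ip -6 -o addr show dev "$iface" scope link | has_usable_link_local; then
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    return 1
}
```

Something like "wait_for_link_local ens7 || echo 'no link-local IPv6 address for ens7'" then mirrors the behavior seen above: a short wait, followed by the same failure if the link never comes up.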

OK, we verified that the patch does force dhclient to wait for the link-local address (even without any tentative address). Now bring the interface back down and re-test, but this time immediately switch back to the host and use virsh to bring the link state back up (before dhclient times out):

ubuntu@lp1718568-artful:~$ sudo ip l set down dev ens7
ubuntu@lp1718568-artful:~$ sudo ip a show ens7
3: ens7: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast state DOWN group default qlen 1000
    link/ether 52:54:00:bf:16:8a brd ff:ff:ff:ff:ff:ff

ubuntu@lp1718568-artful:~$ sudo dhclient -v -6 ens7
Internet Systems Consortium DHCP Client 4.3.5
Copyright 2004-2016 Internet Systems Consortium.
All rights reserved.
For info, please visit https://www.isc.org/software/dhcp/

now quickly in the host:

$ virsh domif-setlink lp1718568-artful vnet0 up
Device updated successfully

and back in the test VM:

Listening on Socket/ens7
Sending on Socket/ens7
PRC: Soliciting for leases (INIT).
XMT: Forming Solicit, 0 ms elapsed.
XMT: X-- IA_NA 00:bf:16:8a
XMT: | X-- Request renew in +3600
XMT: | X-- Request rebind in +5400
XMT: | X-- Request address 2001:db8::99.
XMT: | | X-- Request preferred in +7200
XMT: | | X-- Request valid in +10800
XMT: Solicit on ens7, interval 1080ms.
RCV: Advertise message on ens7 from fe80::5054:ff:fec9:897c.
RCV: X-- IA_NA 00:bf:16:8a
RCV: | X-- starts 1520020524
RCV: | X-- t1 - renew +3600
RCV: | X-- t2 - rebind +7200
RCV: | X-- [Options]
RCV: | | X-- IAADDR 2001:db8::99
RCV: | | | X-- Preferred lifetime 604800.
RCV: | | | X-- Max lifetime 2592000.
RCV: X-- Server ID: 00:01:00:01:22:2c:61:12:52:54:00:c9:89:7c
RCV: Advertisement recorded.
PRC: Selecting best advertised lease.
PRC: Considering best lease.
PRC: X-- Initial candidate 00:01:00:01:22:2c:61:12:52:54:00:c9:89:7c (s: 10105, p: 0).
XMT: Forming Request, 0 ms elapsed.
XMT: X-- IA_NA 00:bf:16:8a
XMT: | X-- Requested renew +3600
XMT: | X-- Requested rebind +5400
XMT: | | X-- IAADDR 2001:db8::99
XMT: | | | X-- Preferred lifetime +7200
XMT: | | | X-- Max lifetime +7500
XMT: V IA_NA appended.
XMT: Request on ens7, interval 950ms.
RCV: Reply message on ens7 from fe80::5054:ff:fec9:897c.
RCV: X-- IA_NA 00:bf:16:8a
RCV: | X-- starts 1520020525
RCV: | X-- t1 - renew +3600
RCV: | X-- t2 - rebind +7200
RCV: | X-- [Options]
RCV: | | X-- IAADDR 2001:db8::99
RCV: | | | X-- Preferred lifetime 604800.
RCV: | | | X-- Max lifetime 2592000.
RCV: X-- Server ID: 00:01:00:01:22:2c:61:12:52:54:00:c9:89:7c
PRC: Bound to lease 00:01:00:01:22:2c:61:12:52:54:00:c9:89:7c.

patch works!
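
The manual switch to the host in the last test is timing-sensitive. One way to make it less so is to schedule the link-up on the host before starting dhclient in the VM. A minimal sketch, assuming a hypothetical helper name and a 2-second default delay (neither is part of the original test):

```shell
# Hypothetical helper, run on the host: wait $3 seconds (default 2), then
# bring the link of interface $2 in domain $1 back up via virsh.
delayed_link_up() {
    domain=$1 iface=$2 delay=${3:-2}
    sleep "$delay"
    virsh domif-setlink "$domain" "$iface" up
}
```

Run it in the background on the host (e.g. "delayed_link_up lp1718568-artful vnet0 5 &") and then start dhclient in the VM; the link still has to come up within the ~6 seconds that dhclient waits.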

Of course, if the NIC can't get its link up within the ~6 second timeout, dhclient will still fail. But for interface hardware that slow, I think it's not unreasonable to expect some additional ifupdown/netplan/networkd/NetworkManager configuration to delay dhclient after bringing up the interface. I can't imagine what hardware takes more than 6 seconds to bring up link state.

I should point out the regression potential for this as well: any NIC that's configured for DHCPv6 but has no link state previously failed immediately, while with this patch it won't fail for ~6 seconds. That may delay boot by those 6 seconds if systemd/upstart is waiting for the interface to get its DHCPv6 address. However, consider that for an interface that *does* have link state, but has no DHCPv6 server on its network, dhclient will wait much, much longer for a DHCPv6 response. I think the additional 6 seconds for a broken configuration is not unreasonable in exchange for getting some slow-to-come-up NICs working.