fix(dhcpcd): Make lease parsing more robust (#5129)
3d4213f... by Chris Patterson <email address hidden>
net/dhcp: raise InvalidDHCPLeaseFileError on error parsing dhcpcd lease (#5128)
Seeing a fairly large number of lease parsing failures on Azure similar
to:
```
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/cloudinit/sources/DataSourceAzure.py", line 851, in _get_data
    crawled_data = util.log_time(
                   ^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 2828, in log_time
    ret = func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/cloudinit/sources/helpers/azure.py", line 45, in impl
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/cloudinit/sources/DataSourceAzure.py", line 660, in crawl_metadata
    self._wait_for_pps_savable_reuse()
  File "/usr/lib/python3/dist-packages/cloudinit/sources/helpers/azure.py", line 45, in impl
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/cloudinit/sources/DataSourceAzure.py", line 1236, in _wait_for_pps_savable_reuse
    self._wait_for_hot_attached_primary_nic(nl_sock)
  File "/usr/lib/python3/dist-packages/cloudinit/sources/helpers/azure.py", line 45, in impl
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/cloudinit/sources/DataSourceAzure.py", line 1142, in _wait_for_hot_attached_primary_nic
    primary_nic_found = self._setup_ephemeral_networking(
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/cloudinit/sources/helpers/azure.py", line 45, in impl
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/cloudinit/sources/DataSourceAzure.py", line 440, in _setup_ephemeral_networking
    lease = self._ephemeral_dhcp_ctx.obtain_lease()
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/cloudinit/net/ephemeral.py", line 293, in obtain_lease
    self.lease = maybe_perform_dhcp_discovery(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/cloudinit/net/dhcp.py", line 103, in maybe_perform_dhcp_discovery
    return distro.dhcp_client.dhcp_discovery(interface, dhcp_log_func, distro)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/cloudinit/net/dhcp.py", line 656, in dhcp_discovery
    lease = self.get_newest_lease(interface)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/cloudinit/net/dhcp.py", line 829, in get_newest_lease
    return self.parse_dhcpcd_lease(
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/cloudinit/net/dhcp.py", line 787, in parse_dhcpcd_lease
    lease = dict(
            ^^^^^
ValueError: dictionary update sequence element #0 has length 1; 2 is required
```
Catch this error in parse_dhcpcd_lease() and raise
InvalidDHCPLeaseFileError after logging an error.
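
A minimal sketch of the failure and the fix, assuming the lease dump is a
series of `key=value` lines handed to dict(). The helper name
parse_lease_text() is hypothetical, and the exception class here is a
simplified stand-in named after the one in this commit, not the actual
cloud-init code:

```python
import logging

LOG = logging.getLogger(__name__)


class InvalidDHCPLeaseFileError(Exception):
    """Stand-in for the cloud-init exception of the same name."""


def parse_lease_text(lease_dump: str) -> dict:
    """Parse `key=value` lines into a dict (hypothetical, simplified helper)."""
    try:
        # A line without "=" splits into a one-element list, which makes
        # dict() raise the ValueError seen in the traceback above.
        return dict(
            line.split("=", maxsplit=1)
            for line in lease_dump.strip().splitlines()
        )
    except ValueError as error:
        # Log first, then surface a lease-specific error to the caller.
        LOG.error("Error parsing dhcpcd lease: %r", error)
        raise InvalidDHCPLeaseFileError(
            "error parsing dhcpcd lease: %r" % error
        ) from error
```

Callers that already handle InvalidDHCPLeaseFileError can then treat a
malformed lease like any other unusable lease instead of crashing on a bare
ValueError.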
Signed-off-by: Chris Patterson <email address hidden>
fix: Fix runtime file locations for cloud-init (#4820)
Update various hard-coded filepaths. Also make sure we
bootstrap our Paths() config correctly so that we read
from the configured rundir.
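
As a rough illustration of the idea, assuming runtime files are resolved
through a paths object rather than hard-coded strings. The class below is a
pared-down, hypothetical stand-in; its constructor argument, default value,
and method name are assumptions for this sketch and not the real Paths() API:

```python
import os


class Paths:
    """Hypothetical, pared-down stand-in for a paths config object."""

    def __init__(self, path_cfgs: dict):
        # Fall back to a default only when no run directory is configured.
        # The default value here is an assumption for illustration.
        self.run_dir = path_cfgs.get("run_dir", "/run/cloud-init")

    def get_runpath(self, name: str) -> str:
        """Build a runtime file path under the configured run directory."""
        return os.path.join(self.run_dir, name)


# Usage: the same lookup follows whatever run directory was configured.
paths = Paths({"run_dir": "/var/run/cloud-init"})
status_file = paths.get_runpath("status.json")
```

Routing every lookup through one object means a platform with a different
run directory only needs to change the configuration, not each call site.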
Co-authored-by: Mina Galić <email address hidden>
Sponsored by: The FreeBSD Foundation
Fixes GH-4766
de92c01...
by
Chris Patterson <email address hidden>
net/dhcp: bump dhcpcd timeout to 300s (#5127)
On most distros, including Ubuntu, the default timeout for dhclient is 300s.
There is no cloud-init controlled duration for the dhclient process as
it doesn't fork until after it receives an IP address and there is no timeout
value passed to subp().
I have seen some distros configure dhclient with a timeout of 60s, but
that is far less common.
Given that a cloud VM is not very useful without DHCP, err on the generous
side and allow up to 300 seconds for dhcpcd to get an address.
Note that there is still an issue with dhcpcd retries which will be
addressed later in a separate PR.
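
A minimal sketch of bounding a client process with a timeout, assuming the
limit is enforced by the caller of the subprocess. It uses the standard
library's subprocess.run() rather than cloud-init's subp(), and the helper
name and constant are hypothetical:

```python
import subprocess
import sys

DHCPCD_TIMEOUT_SECONDS = 300  # generous upper bound described above


def run_with_timeout(command: list, timeout: float = DHCPCD_TIMEOUT_SECONDS) -> bool:
    """Return True if the command exits with status 0 within the timeout."""
    try:
        subprocess.run(command, check=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before re-raising this error.
        return False
    except subprocess.CalledProcessError:
        # The command finished in time but reported a failure.
        return False
    return True


# Usage: a trivial command standing in for the DHCP client invocation.
ok = run_with_timeout([sys.executable, "-c", "pass"], timeout=30)
```

With the default of 300 seconds, a slow DHCP server gets the same allowance
that dhclient typically has, while a hung client still terminates eventually.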
Signed-off-by: Chris Patterson <email address hidden>
These are the failures I could find when running tests in random order.
Changes that aren't self-explanatory should include a comment in the
test itself.
test_vultr.py includes a refactor to remove global state and remove
a test that didn't work and wasn't needed after the changes.