Merge ~jocha/cloud-init:jocha-reboot-bugfix into cloud-init:master

Proposed by Joshua Chan
Status: Superseded
Proposed branch: ~jocha/cloud-init:jocha-reboot-bugfix
Merge into: cloud-init:master
Diff against target: 170 lines (+86/-8)
2 files modified
cloudinit/sources/DataSourceAzure.py (+14/-6)
tests/unittests/test_datasource/test_azure.py (+72/-2)
Reviewer Review Type Date Requested Status
Douglas Jordan Pending
Chad Smith Pending
Review via email: mp+344191@code.launchpad.net

This proposal supersedes a proposal from 2018-04-18.

This proposal has been superseded by a proposal from 2018-04-24.

Commit message

Add reported ready marker file.

This change is for Azure VM Preprovisioning. A bug was found when after azure VMs report ready the first time, during the time when VM is polling indefinitely for the new ovf-env.xml from Instance Metadata Service (IMDS), if a reboot happens, we send another report ready signal to the fabric, which deletes the reprovisioning data on the node.

This marker file is used to fix this issue so that we will only send a report ready signal to the fabric when no marker file is present. Then, create a marker file so that when a reboot does occur, we check if a marker file has been created and decide whether we would like to send the repot ready signal.

LP: #1765214

To post a comment you must log in.
Revision history for this message
Douglas Jordan (dojordan) : Posted in a previous version of this proposal
Revision history for this message
Chad Smith (chad.smith) : Posted in a previous version of this proposal
Revision history for this message
Chad Smith (chad.smith) : Posted in a previous version of this proposal
review: Needs Fixing

Unmerged commits

23002d0... by Joshua Chan

fix string formatting and split up unit tests

eb99235... by Mike Gerdts

DataSourceSmartOS: add locking of serial device.

cloud-init and mdata-get each have their own implementation of the
SmartOS metadata protocol. If cloud-init and other services that call
mdata-get are run concurrently, crosstalk on the serial port can cause
them both to become confused.

This change makes it so that cloud-init uses the same cooperative
locking scheme that's used by mdata-get, thus preventing cross-talk
between mdata-get and cloud-init.

For testing, a VM running on a SmartOS host and pyserial are required.
If the tests are run on a platform other than SmartOS, those that use a
real serial port are skipped. pyserial remains commented in
requirements.txt because most testers will not be running atop SmartOS.

LP: #1746605

3352633... by Mike Gerdts

DataSourceSmartOS: sdc:hostname is ignored

There are three potential sources of the hostname, one of which is
documented SmartOS's vmadm(1M) via the hostname property. That
property's value is retrieved via the sdc:hostname key. The other
two sources for the hostname are a hostname key in customer_metadata
and the VM's uuid (sdc:uuid). Of these three, the sdc:hostname value
is not used in a meaningful way by DataSourceSmartOS.

This fix changes the fallback mechanism when hostname is not
specified in customer_metadata. The order of precedence for setting
the hostname is now 1) hostname in customer_metadata,
2) sdc:hostname, then 3) sdc:uuid.

LP: #1765085

fe82866... by Mike Gerdts

DataSourceSmartOS: list() should always return a list

If customer_metadata has no keys, the KEYS request returns an empty
string. Callers of the list() method expect a list to be returned and
will give a stack trace if this expectation is not met.

LP: #1763480

22d5d58... by Scott Moser

schema: in validation, raise ImportError if strict but no jsonschema.

validate_cloudconfig_schema with strict=True would not actually validate
if there was no jsonschema available. That seems kind of strange.
The change here is to make it raise an exception if strict was passed in.
And then to fix the one test that needed a skipIfJsonSchema wrapper.

5114828... by Scott Moser

set_passwords: Add newline to end of sshd config, only restart if updated.

This admittedly does a fairly extensive re-factor to simply add a newline
to the end of sshd_config.

It makes the ssh_config updating portion of set_passwords more testable
and adds tests for that.

The new function is in 'update_ssh_config_lines' which allows you
to update a config with multiple changes even though only a single one
is currently used.

We also only restart the ssh daemon now if a change was made to the
config file. Before it was always restarted if the user specified
a value for ssh_pwauth other than 'unchanged'.

Thanks to Lorens Kockum for initial diagnosis and patch.

LP: #1677205

222cabe... by Scott Moser

pylint: pay attention to unused variable warnings.

This enables warnings produced by pylint for unused variables (W0612),
and fixes the existing errors.

f7ac3e6... by Junjie.Wang

doc: Add documentation for AliYun datasource.

Just add some documentation to readthedocs for AliYun.

5670b53... by Scott Moser

Schema: do not warn on duplicate items in commands.

runcmd, bootcmd, snap/commands, ubuntu-advantage/commands would
log warning (and fail if strict) on duplicate values in the commands.
But those should be allowed. Example, it is perfectly valid to do:
   runcmd: ['sleep 1', 'sleep 1']

LP: #1764264

3bd8fd3... by Chad Smith

net: Depend on iproute2's ip instead of net-tools ifconfig or route

The net-tools package is deprecated and will eventually be dropped. Use
"ip route", "link" or "address" instead of "ifconfig" or "route" calls.
Cloud-init can now run in an environment that no longer has net-tools.
This affects the network and route printing emitted to
cloud-config-output.log as well as the cc_disable_ec2_metadata module.

Additional changes:
 - separate readResource and resourceLocation into standalone test
   functions
 - Fix ipv4 address rows to report scopes represented by ip addr show
 - Formatted route/address ouput now handles multiple ipv4 and ipv6
   addresses on a single interface

Co-authored-by: James Hogarth <email address hidden>
Co-authored-by: Robert Schweikert <email address hidden>

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
diff --git a/cloudinit/sources/DataSourceAzure.py b/cloudinit/sources/DataSourceAzure.py
index a71197a..b47a6af 100644
--- a/cloudinit/sources/DataSourceAzure.py
+++ b/cloudinit/sources/DataSourceAzure.py
@@ -48,6 +48,7 @@ DEFAULT_FS = 'ext4'
48# DMI chassis-asset-tag is set static for all azure instances48# DMI chassis-asset-tag is set static for all azure instances
49AZURE_CHASSIS_ASSET_TAG = '7783-7084-3265-9085-8269-3286-77'49AZURE_CHASSIS_ASSET_TAG = '7783-7084-3265-9085-8269-3286-77'
50REPROVISION_MARKER_FILE = "/var/lib/cloud/data/poll_imds"50REPROVISION_MARKER_FILE = "/var/lib/cloud/data/poll_imds"
51REPORTED_READY_MARKER_FILE = "/var/lib/cloud/data/reported_ready"
51IMDS_URL = "http://169.254.169.254/metadata/reprovisiondata"52IMDS_URL = "http://169.254.169.254/metadata/reprovisiondata"
5253
5354
@@ -450,13 +451,16 @@ class DataSourceAzure(sources.DataSource):
450 # call DHCP and setup the ephemeral network to acquire the new IP.451 # call DHCP and setup the ephemeral network to acquire the new IP.
451 return False452 return False
452453
453 need_report = report_ready
454 while True:454 while True:
455 try:455 try:
456 with EphemeralDHCPv4() as lease:456 with EphemeralDHCPv4() as lease:
457 if need_report:457 if report_ready:
458 self._report_ready(lease=lease)458 self._report_ready(lease=lease)
459 need_report = False459 path = REPORTED_READY_MARKER_FILE
460 LOG.info("Create flag file, reported ready: %s", path)
461 util.write_file(path, "{pid}: {time}\n".format(
462 pid=os.getpid(), time=time()))
463 report_ready = False
460 return readurl(url, timeout=1, headers=headers,464 return readurl(url, timeout=1, headers=headers,
461 exception_cb=exc_cb, infinite=True).contents465 exception_cb=exc_cb, infinite=True).contents
462 except UrlError:466 except UrlError:
@@ -490,14 +494,17 @@ class DataSourceAzure(sources.DataSource):
490 if (cfg.get('PreprovisionedVm') is True or494 if (cfg.get('PreprovisionedVm') is True or
491 os.path.isfile(path)):495 os.path.isfile(path)):
492 if not os.path.isfile(path):496 if not os.path.isfile(path):
493 LOG.info("Creating a marker file to poll imds")497 LOG.info("Creating a marker file to poll imds: %s",
494 util.write_file(path, "%s: %s\n" % (os.getpid(), time()))498 path)
499 util.write_file(path, "{pid}: {time}\n".format(
500 pid=os.getpid(), time=time()))
495 return True501 return True
496 return False502 return False
497503
498 def _reprovision(self):504 def _reprovision(self):
499 """Initiate the reprovisioning workflow."""505 """Initiate the reprovisioning workflow."""
500 contents = self._poll_imds()506 report_ready = bool(not os.path.isfile(REPORTED_READY_MARKER_FILE))
507 contents = self._poll_imds(report_ready)
501 md, ud, cfg = read_azure_ovf(contents)508 md, ud, cfg = read_azure_ovf(contents)
502 return (md, ud, cfg, {'ovf-env.xml': contents})509 return (md, ud, cfg, {'ovf-env.xml': contents})
503510
@@ -526,6 +533,7 @@ class DataSourceAzure(sources.DataSource):
526 "Error communicating with Azure fabric; You may experience."533 "Error communicating with Azure fabric; You may experience."
527 "connectivity issues.", exc_info=True)534 "connectivity issues.", exc_info=True)
528 return False535 return False
536 util.del_file(REPORTED_READY_MARKER_FILE)
529 util.del_file(REPROVISION_MARKER_FILE)537 util.del_file(REPROVISION_MARKER_FILE)
530 return fabric_data538 return fabric_data
531539
diff --git a/tests/unittests/test_datasource/test_azure.py b/tests/unittests/test_datasource/test_azure.py
index 88fe76c..4ff2a66 100644
--- a/tests/unittests/test_datasource/test_azure.py
+++ b/tests/unittests/test_datasource/test_azure.py
@@ -1163,12 +1163,13 @@ class TestAzureDataSourcePreprovisioning(CiTestCase):
1163 cfg = ret[2]1163 cfg = ret[2]
1164 self.assertFalse(cfg['PreprovisionedVm'])1164 self.assertFalse(cfg['PreprovisionedVm'])
11651165
1166 @mock.patch('cloudinit.sources.DataSourceAzure.util.write_file')
1166 @mock.patch('cloudinit.sources.DataSourceAzure.util.is_FreeBSD')1167 @mock.patch('cloudinit.sources.DataSourceAzure.util.is_FreeBSD')
1167 @mock.patch('cloudinit.net.dhcp.EphemeralIPv4Network')1168 @mock.patch('cloudinit.net.dhcp.EphemeralIPv4Network')
1168 @mock.patch('cloudinit.net.dhcp.maybe_perform_dhcp_discovery')1169 @mock.patch('cloudinit.net.dhcp.maybe_perform_dhcp_discovery')
1169 @mock.patch('requests.Session.request')1170 @mock.patch('requests.Session.request')
1170 def test_poll_imds_returns_ovf_env(self, fake_resp, m_dhcp, m_net,1171 def test_poll_imds_returns_ovf_env(self, fake_resp, m_dhcp, m_net,
1171 m_is_bsd, *args):1172 m_is_bsd, write_f, *args):
1172 """The _poll_imds method should return the ovf_env.xml."""1173 """The _poll_imds method should return the ovf_env.xml."""
1173 m_is_bsd.return_value = False1174 m_is_bsd.return_value = False
1174 m_dhcp.return_value = [{1175 m_dhcp.return_value = [{
@@ -1194,12 +1195,13 @@ class TestAzureDataSourcePreprovisioning(CiTestCase):
1194 prefix_or_mask='255.255.255.0', router='192.168.2.1')1195 prefix_or_mask='255.255.255.0', router='192.168.2.1')
1195 self.assertEqual(m_net.call_count, 1)1196 self.assertEqual(m_net.call_count, 1)
11961197
1198 @mock.patch('cloudinit.sources.DataSourceAzure.util.write_file')
1197 @mock.patch('cloudinit.sources.DataSourceAzure.util.is_FreeBSD')1199 @mock.patch('cloudinit.sources.DataSourceAzure.util.is_FreeBSD')
1198 @mock.patch('cloudinit.net.dhcp.EphemeralIPv4Network')1200 @mock.patch('cloudinit.net.dhcp.EphemeralIPv4Network')
1199 @mock.patch('cloudinit.net.dhcp.maybe_perform_dhcp_discovery')1201 @mock.patch('cloudinit.net.dhcp.maybe_perform_dhcp_discovery')
1200 @mock.patch('requests.Session.request')1202 @mock.patch('requests.Session.request')
1201 def test__reprovision_calls__poll_imds(self, fake_resp, m_dhcp, m_net,1203 def test__reprovision_calls__poll_imds(self, fake_resp, m_dhcp, m_net,
1202 m_is_bsd, *args):1204 m_is_bsd, write_f, *args):
1203 """The _reprovision method should call poll IMDS."""1205 """The _reprovision method should call poll IMDS."""
1204 m_is_bsd.return_value = False1206 m_is_bsd.return_value = False
1205 m_dhcp.return_value = [{1207 m_dhcp.return_value = [{
@@ -1231,6 +1233,74 @@ class TestAzureDataSourcePreprovisioning(CiTestCase):
1231 prefix_or_mask='255.255.255.0', router='192.168.2.1')1233 prefix_or_mask='255.255.255.0', router='192.168.2.1')
1232 self.assertEqual(m_net.call_count, 1)1234 self.assertEqual(m_net.call_count, 1)
12331235
1236 @mock.patch('cloudinit.sources.DataSourceAzure.DataSourceAzure._poll_imds')
1237 @mock.patch('os.path.isfile')
1238 def test_reprovision_report_ready(self, isfile, _poll_imds,
1239 *args):
1240 """_reprovision should call _poll_IMDS(True) (with report ready)
1241 if flag file is not present"""
1242 isfile.return_value = False
1243 hostname = "myhost"
1244 username = "myuser"
1245 odata = {'HostName': hostname, 'UserName': username}
1246 _poll_imds.return_value = construct_valid_ovf_env(data=odata)
1247 dsa = dsaz.DataSourceAzure({}, distro=None, paths=self.paths)
1248 md, ud, cfg, d = dsa._reprovision()
1249 self.assertEqual(_poll_imds.call_args[0], tuple([True]))
1250
1251 @mock.patch('cloudinit.sources.DataSourceAzure.DataSourceAzure._poll_imds')
1252 @mock.patch('os.path.isfile')
1253 def test_reprovision_no_report_ready(self, isfile, _poll_imds,
1254 *args):
1255 """_reprovision should call _poll_IMDS(False) (without report ready)
1256 if flag file is present"""
1257 isfile.return_value = True
1258 hostname = "myhost"
1259 username = "myuser"
1260 odata = {'HostName': hostname, 'UserName': username}
1261 _poll_imds.return_value = construct_valid_ovf_env(data=odata)
1262 dsa = dsaz.DataSourceAzure({}, distro=None, paths=self.paths)
1263 md, ud, cfg, d = dsa._reprovision()
1264 self.assertEqual(_poll_imds.call_args[0], tuple([False]))
1265
1266 @mock.patch('cloudinit.sources.DataSourceAzure.util.write_file')
1267 @mock.patch('cloudinit.net.dhcp.EphemeralIPv4Network')
1268 @mock.patch('cloudinit.net.dhcp.maybe_perform_dhcp_discovery')
1269 @mock.patch('requests.Session.request')
1270 @mock.patch(
1271 'cloudinit.sources.DataSourceAzure.DataSourceAzure._report_ready')
1272 def test_poll_imds_report_ready_true(self, report_ready_func,
1273 fake_resp, m_dhcp, m_net,
1274 write_f, *args):
1275 """The poll_imds should call report_ready when flag is true"""
1276 m_dhcp.return_value = [{
1277 'interface': 'eth9', 'fixed-address': '192.168.2.9',
1278 'routers': '192.168.2.1', 'subnet-mask': '255.255.255.0',
1279 'unknown-245': '624c3620'}]
1280 dsa = dsaz.DataSourceAzure({}, distro=None, paths=self.paths)
1281 report_ready_flag = True
1282 dsa._poll_imds(report_ready_flag)
1283 self.assertEqual(report_ready_func.call_count, 1)
1284
1285 @mock.patch('cloudinit.net.dhcp.EphemeralIPv4Network')
1286 @mock.patch('cloudinit.net.dhcp.maybe_perform_dhcp_discovery')
1287 @mock.patch('requests.Session.request')
1288 @mock.patch(
1289 'cloudinit.sources.DataSourceAzure.DataSourceAzure._report_ready')
1290 def test_poll_imds_report_ready_false(self, report_ready_func,
1291 fake_resp, m_dhcp, m_net,
1292 *args):
1293 """The poll_imds should not call reporting ready
1294 when flag is false"""
1295 m_dhcp.return_value = [{
1296 'interface': 'eth9', 'fixed-address': '192.168.2.9',
1297 'routers': '192.168.2.1', 'subnet-mask': '255.255.255.0',
1298 'unknown-245': '624c3620'}]
1299 dsa = dsaz.DataSourceAzure({}, distro=None, paths=self.paths)
1300 report_ready_flag = False
1301 dsa._poll_imds(report_ready_flag)
1302 self.assertEqual(report_ready_func.call_count, 0)
1303
1234 @mock.patch('cloudinit.sources.DataSourceAzure.util.write_file')1304 @mock.patch('cloudinit.sources.DataSourceAzure.util.write_file')
1235 @mock.patch('os.path.isfile')1305 @mock.patch('os.path.isfile')
1236 def test__should_reprovision_with_true_cfg(self, isfile, write_f, *args):1306 def test__should_reprovision_with_true_cfg(self, isfile, write_f, *args):

Subscribers

People subscribed via source and target branches