Merge ~dojordan/cloud-init:azure-preprovisioning into cloud-init:master

Proposed by Douglas Jordan on 2017-11-28
Status: Needs review
Proposed branch: ~dojordan/cloud-init:azure-preprovisioning
Merge into: cloud-init:master
Diff against target: 459 lines (+258/-22)
4 files modified
.gitignore (+1/-0)
cloudinit/sources/DataSourceAzure.py (+128/-9)
cloudinit/url_helper.py (+12/-7)
tests/unittests/test_datasource/test_azure.py (+117/-6)
Reviewer                  Review Type             Date Requested   Status
Server Team CI bot        continuous-integration                   Approve 11 hours ago
Scott Moser                                                        Needs Fixing on 2018-01-03
Chad Smith                                                         Approve on 2017-12-14
Paul Meyer (community)                            2017-11-28       Approve on 2017-12-01
Review via email: mp+334341@code.launchpad.net

Commit Message

Azure VM Preprovisioning support.

This change enables Azure VMs to report that provisioning has completed twice: first to tell the fabric that it has completed, and then a second time to enable customer settings. The datasource for the second provisioning is the Instance Metadata Service (IMDS), and the VM polls IMDS indefinitely for the new ovf-env.xml.

LP: #1734991

Paul Meyer (paul-meyer) wrote :

Looks good to me (after you fix the tests and flake8's)

Scott Moser (smoser) wrote :

Fix your commit message (press 'Set commit message' above).

 Summary
 <blank>
 More info

FAILED: Continuous integration, rev:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/553/
Executed test runs:
    FAILED: Checkout

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/553/rebuild

review: Needs Fixing (continuous-integration)
Douglas Jordan (dojordan) wrote :

> FAILED: Continuous integration, rev:
> https://jenkins.ubuntu.com/server/job/cloud-init-ci/553/
> Executed test runs:
> FAILED: Checkout
>
> Click here to trigger a rebuild:
> https://jenkins.ubuntu.com/server/job/cloud-init-ci/553/rebuild

Looks like the git checkout failed. Thoughts?
stderr: remote: Authorisation required.
fatal: Authentication failed for 'https://git.launchpad.net/~dojordan/cloud-init:azure-preprovisioning/'

FAILED: Continuous integration, rev:72e423b4a0827dd954170749812b13aba761f399
https://jenkins.ubuntu.com/server/job/cloud-init-ci/562/
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    SUCCESS: Ubuntu LTS: Integration
    FAILED: MAAS Compatability Testing

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/562/rebuild

review: Needs Fixing (continuous-integration)

FAILED: Continuous integration, rev:384b7731e99338903d91da641267ed84a7470669
https://jenkins.ubuntu.com/server/job/cloud-init-ci/567/
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    SUCCESS: Ubuntu LTS: Integration
    FAILED: MAAS Compatability Testing

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/567/rebuild

review: Needs Fixing (continuous-integration)

FAILED: Continuous integration, rev:d00ec2ec82271c3c35830a04d2f216cd7bef8ba7
https://jenkins.ubuntu.com/server/job/cloud-init-ci/575/
Executed test runs:
    SUCCESS: Checkout
    FAILED: Unit & Style Tests

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/575/rebuild

review: Needs Fixing (continuous-integration)

PASSED: Continuous integration, rev:6021da2b61a55102ec7637d6fea54e02db1c1c92
https://jenkins.ubuntu.com/server/job/cloud-init-ci/576/
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    SUCCESS: Ubuntu LTS: Integration
    SUCCESS: MAAS Compatability Testing
    IN_PROGRESS: Declarative: Post Actions

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/576/rebuild

review: Approve (continuous-integration)
Paul Meyer (paul-meyer) :
review: Approve
Douglas Jordan (dojordan) wrote :

Thanks for the feedback. I've resolved your comments.

PASSED: Continuous integration, rev:ff0b7b0b5fdd1512c7ec2ccbee0f92afde5c733d
https://jenkins.ubuntu.com/server/job/cloud-init-ci/579/
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    SUCCESS: Ubuntu LTS: Integration
    SUCCESS: MAAS Compatability Testing
    IN_PROGRESS: Declarative: Post Actions

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/579/rebuild

review: Approve (continuous-integration)
Chad Smith (chad.smith) wrote :

Douglas, thanks for working on this. I'm going to mark it as work in progress while you address or respond to comments. When you'd like another review pass, please just mark it back to "Needs review".

FAILED: Continuous integration, rev:10c6b219ba7168c5fb3751d28a383b130a1348ae
https://jenkins.ubuntu.com/server/job/cloud-init-ci/590/
Executed test runs:
    SUCCESS: Checkout
    FAILED: Unit & Style Tests

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/590/rebuild

review: Needs Fixing (continuous-integration)

PASSED: Continuous integration, rev:e06a762741953a5c55d843b0dcfd454111655cea
https://jenkins.ubuntu.com/server/job/cloud-init-ci/595/
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    SUCCESS: Ubuntu LTS: Integration
    SUCCESS: MAAS Compatability Testing
    IN_PROGRESS: Declarative: Post Actions

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/595/rebuild

review: Approve (continuous-integration)
Chad Smith (chad.smith) wrote :

Thanks for the rework and comments. One more pass on this. I'll test against stock azure xenial today and report here.

PASSED: Continuous integration, rev:c500a0184b8bc7c7ba03b1a884896702a761959f
https://jenkins.ubuntu.com/server/job/cloud-init-ci/612/
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    SUCCESS: Ubuntu LTS: Integration
    SUCCESS: MAAS Compatability Testing
    IN_PROGRESS: Declarative: Post Actions

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/612/rebuild

review: Approve (continuous-integration)
Chad Smith (chad.smith) wrote :

Ok one last nit from me and it looks good. I need a bit of testing still on this but content looks good.

Chad Smith (chad.smith) wrote :

one last try on the nit

PASSED: Continuous integration, rev:3c9509671f05d8dfc990b5ce35a615c9cd7b442c
https://jenkins.ubuntu.com/server/job/cloud-init-ci/618/
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    SUCCESS: Ubuntu LTS: Integration
    SUCCESS: MAAS Compatability Testing
    IN_PROGRESS: Declarative: Post Actions

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/618/rebuild

review: Approve (continuous-integration)
Chad Smith (chad.smith) :
review: Approve
Scott Moser (smoser) wrote :

There are some comments in line.
I'm not sure I fully understand all of this.

I could be wrong here, but I think you're using
bounce_network_with_azure_hostname in order to interact with IMDS_URL.

We now have 2 ways of bringing up ephemeral networking for this purpose:
 a.) Chad has recently added code in the EC2 metadata service using cloudinit/net/dhcp.py .
 b.) the digital ocean datasource has code to negotiate an IPv4 link-local address.

If 'b' is sufficient, I'd prefer that, but I'd prefer either one to the bounce_network_ approach, which I think may actually not work for you if you rebased to trunk.

As mentioned in IRC, I'm still concerned about systemd giving up and deciding that boot has failed after some amount of time polling on a metadata service. As Douglas pointed out, cloud-init has timeouts set to 0 and is a 'oneshot', so *its* timeout is not an issue, but I think that things that it runs 'Before' or (pre-networking or other) might end up timing out.

Scott Moser (smoser) wrote :

some things there need fixing, definitely need rebase to trunk (I suspect you'll have conflicts), but if not, some thought is needed.

review: Needs Fixing
fd33d0a... by Douglas Jordan on 2018-01-04

Merge branch 'master' into azure-preprovisioning

8b89d2d... by Douglas Jordan on 2018-01-04

Small tweaks.

c207d44... by Douglas Jordan on 2018-01-05

Addressing CR comments

FAILED: Continuous integration, rev:c207d443ca360f192509b33dacd404cd0a4d3bc5
https://jenkins.ubuntu.com/server/job/cloud-init-ci/662/
Executed test runs:
    SUCCESS: Checkout
    FAILED: Unit & Style Tests

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/662/rebuild

review: Needs Fixing (continuous-integration)
ba329ca... by Douglas Jordan on 2018-01-05

Using warning instead of warn

FAILED: Continuous integration, rev:ba329ca5cde7f64b1f04ff3a600f3424e8c04515
https://jenkins.ubuntu.com/server/job/cloud-init-ci/669/
Executed test runs:
    SUCCESS: Checkout
    FAILED: Unit & Style Tests

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/669/rebuild

review: Needs Fixing (continuous-integration)
93d88e2... by Douglas Jordan on 2018-01-05

Fixing flake8

PASSED: Continuous integration, rev:93d88e25a1db5220cf087bba80a1ff15dd7f7fd8
https://jenkins.ubuntu.com/server/job/cloud-init-ci/670/
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    SUCCESS: Ubuntu LTS: Integration
    SUCCESS: MAAS Compatability Testing
    IN_PROGRESS: Declarative: Post Actions

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/670/rebuild

review: Approve (continuous-integration)
Douglas Jordan (dojordan) wrote :

Thanks for the comments Scott-

The reason for the bounce_network_with_azure_hostname call is actually to get the new IP address. We can't use the digital ocean IPv4 link-local approach because our IMDS server identifies which VM is talking to it by the MAC address and the IP address. When the control plane updates the DHCP server with the new IP, the VM has to trigger DHCP and acquire the new address. In our polling loop, we call bounce_network_with_azure_hostname to trigger DHCP when we get an exception, because IMDS will not even handle our request if the MAC/IP pair does not match what was expected. On Windows, we get around this by just disconnecting and reconnecting the NIC; however, on Ubuntu 16.04 we observed that the link had to be disconnected for a very long time (>10s) for this behavior to occur. Hopefully this issue will be fixed with systemd-networkd. For now, however, we are targeting 16.04 for preprovisioning, and we will continue to test artful/bionic with the link state disconnect/reconnect approach.
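
A minimal sketch of the loop described above, under stated assumptions: `poll_for_data`, `NotReadyError`, `fetch`, and `rebind_network` are hypothetical names used only for illustration and are not taken from this branch. It shows the shape of the idea: a "not ready yet" answer simply waits, while any other failure retries DHCP before the next attempt.

```python
import time


class NotReadyError(Exception):
    """Hypothetical stand-in for a 404 from the metadata service."""


def poll_for_data(fetch, rebind_network, sleep_time=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds; return (data, attempts).

    fetch and rebind_network are caller-supplied callables (assumptions
    for this sketch, not cloud-init APIs).
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            return fetch(), attempts
        except NotReadyError:
            # The service answered, but the data is not published yet.
            pass
        except Exception:
            # The request was not handled at all (for example because the
            # MAC/IP pair no longer matches), so try to get a new address.
            rebind_network()
        sleep(sleep_time)
```

Passing `sleep` as a parameter keeps the sketch testable without real delays. There is deliberately no upper bound on attempts, mirroring the "poll indefinitely" behaviour in the commit message.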

> There are some comments in line.
> I'm not sure I fully understand all of this.
>
> I could be wrong here, but I think you're using
> bounce_network_with_azure_hostname in order to interact with IMDS_URL.
>
> We now have 2 ways of bringing up ephemeral networking for this purpose:
> a.) Chad has recently added code in the EC2 metadata service using
> cloudinit/net/dhcp.py .
> b.) the digital ocean datasource has code to negotiate an IPv4 link-local
> address.
>
> If 'b' is sufficient, I'd prefer that, but I'd prefer either one to the
> bounce_network_ approach, which I think may actually not work for you if you
> rebased to trunk.
>
> As mentioned in IRC, I'm still concerned about systemd giving up and deciding
> that boot has failed after some amount of time polling on a metadata service.
> As Douglas pointed out, cloud-init has timeouts set to 0 and is a 'oneshot',
> so *its* timeout is not an issue, but I think that things that it runs
> 'Before' or (pre-networking or other) might end up timing out.

sushant (sushantsharma) wrote :

<not sure why the inline comment does not show up>

Hi Douglas, Can you please add a comment before line 83 (where you catch exception and re-DHCP) saying that this is temporary. We plan to add a networking module specific to azure in cloud-init and will address re-DHCP need in that module. The plan is to submit that module for review in a week or so. Thanks!

fc01540... by Douglas Jordan on 2018-01-08

Adding comments

PASSED: Continuous integration, rev:fc0154011d4a3f0418fa71cfa5330bd9ac837dfd
https://jenkins.ubuntu.com/server/job/cloud-init-ci/679/
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    SUCCESS: Ubuntu LTS: Integration
    SUCCESS: MAAS Compatability Testing
    IN_PROGRESS: Declarative: Post Actions

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/679/rebuild

review: Approve (continuous-integration)
Scott Moser (smoser) wrote :

As it is right now, 'bounce_network_with_azure_hostname' is a no-op
on Ubuntu 18.04 (bionic) or any system without 'ifup'.
So while you want this fixed in 16.04, the general Ubuntu development
path insists that we make things work on the development release first
and then SRU them back to 16.04.

In summary: I can't let this in as broken on bionic. We'll need to find
a way to make that work on 18.04.

Your commit message says:
 | This change will enable azure vms to report provisioning has completed
 | twice, first to tell the fabric it has completed then a second time to
 | enable customer settings.

Is that true, will it *always* report reprovisioning has completed twice?

There are also some small inline things.

Those are the big things though. Thanks for your work.

Douglas Jordan (dojordan) wrote :

Regarding the DHCP stuff: We are exploring an alternate solution to bounce the NIC from Hyper-V, but in the meantime we would like to get this checked in. So an alternate solution for bionic would be to simply change the hostname. This way, systemd-networkd will keep retriggering DHCP. Once we get the final ovf_env.xml from IMDS, we will apply the real, customer-provided hostname.
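
A rough sketch of that fallback, assuming hypothetical callables `bounce_interface` and `set_hostname` supplied by the caller (neither is a cloud-init function here). The only real library calls are `shutil.which` and `uuid.uuid4`.

```python
import shutil
from uuid import uuid4


def retrigger_dhcp(bounce_interface, set_hostname, which=shutil.which):
    """Return the name of the strategy used to retrigger DHCP."""
    if which("ifup"):
        # ifupdown is present (for example on 16.04): bounce the NIC.
        bounce_interface()
        return "bounce"
    # No ifup available: switch to a throwaway hostname instead. The real
    # hostname is applied later, once the final settings arrive.
    set_hostname(str(uuid4()))
    return "hostname"
```

The `which` parameter exists only so the choice can be exercised on any machine; by default it checks the local PATH.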

Regarding "Is that true, will it *always* report reprovisioning has completed twice?":
tl;dr: yes. Technically, it will report that "provisioning" has completed twice; we are calling the second incarnation "reprovisioning".
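
To make the "twice" concrete, here is a small sketch of the ordering as I read it from this thread. `provision`, `report_ready`, and `poll_for_final_cfg` are hypothetical names, and the assumption that a VM without the PreprovisionedVm flag reports only once is mine, not something stated above.

```python
def provision(initial_cfg, report_ready, poll_for_final_cfg):
    """Return (config to apply, number of ready reports sent)."""
    reports = 0
    cfg = initial_cfg
    if cfg.get("PreprovisionedVm"):
        # First report: tell the fabric the pooled VM has finished booting.
        report_ready()
        reports += 1
        # Block until the customer's settings are published.
        cfg = poll_for_final_cfg()
    # Final report: the settings in cfg are the ones being applied.
    report_ready()
    reports += 1
    return cfg, reports
```

With `{"PreprovisionedVm": True}` as the initial config this sends two reports and returns whatever `poll_for_final_cfg` produced; with any other config it sends one.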

7f23e5c... by Douglas Jordan 12 hours ago

Changing hostname post xenial, and other PR comments

Author: Douglas Jordan <email address hidden>
Committer: Douglas Jordan <email address hidden>

PASSED: Continuous integration, rev:7f23e5c4808a9c647cd4d5277625a723a58b132b
https://jenkins.ubuntu.com/server/job/cloud-init-ci/685/
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    SUCCESS: Ubuntu LTS: Integration
    SUCCESS: MAAS Compatability Testing
    IN_PROGRESS: Declarative: Post Actions

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/685/rebuild

review: Approve (continuous-integration)

Unmerged commits

7f23e5c... by Douglas Jordan 12 hours ago

Changing hostname post xenial, and other PR comments

Author: Douglas Jordan <email address hidden>
Committer: Douglas Jordan <email address hidden>

fc01540... by Douglas Jordan on 2018-01-08

Adding comments

93d88e2... by Douglas Jordan on 2018-01-05

Fixing flake8

ba329ca... by Douglas Jordan on 2018-01-05

Using warning instead of warn

c207d44... by Douglas Jordan on 2018-01-05

Addressing CR comments

8b89d2d... by Douglas Jordan on 2018-01-04

Small tweaks.

fd33d0a... by Douglas Jordan on 2018-01-04

Merge branch 'master' into azure-preprovisioning

3c95096... by Douglas Jordan on 2017-12-11

nit fixes

37f9ff8... by Douglas Jordan on 2017-12-11

call extract method directly.

c500a01... by Douglas Jordan on 2017-12-11

Flake8 fixes

Preview Diff

diff --git a/.gitignore b/.gitignore
index b0500a6..75565ed 100644
--- a/.gitignore
+++ b/.gitignore
@@ -10,3 +10,4 @@ parts
 prime
 stage
 *.snap
+*.cover
diff --git a/cloudinit/sources/DataSourceAzure.py b/cloudinit/sources/DataSourceAzure.py
index d1d0975..0bcbc99 100644
--- a/cloudinit/sources/DataSourceAzure.py
+++ b/cloudinit/sources/DataSourceAzure.py
@@ -11,6 +11,8 @@ from functools import partial
 import os
 import os.path
 import re
+from time import time
+from uuid import uuid4
 from xml.dom import minidom
 import xml.etree.ElementTree as ET
 
@@ -18,6 +20,7 @@ from cloudinit import log as logging
 from cloudinit import net
 from cloudinit import sources
 from cloudinit.sources.helpers.azure import get_metadata_from_fabric
+from cloudinit.url_helper import readurl, wait_for_url, UrlError
 from cloudinit import util
 
 LOG = logging.getLogger(__name__)
@@ -44,6 +47,8 @@ LEASE_FILE = '/var/lib/dhcp/dhclient.eth0.leases'
 DEFAULT_FS = 'ext4'
 # DMI chassis-asset-tag is set static for all azure instances
 AZURE_CHASSIS_ASSET_TAG = '7783-7084-3265-9085-8269-3286-77'
+REPROVISION_MARKER_FILE = "/var/lib/cloud/data/poll_imds"
+IMDS_URL = "http://169.254.169.254/metadata/reprovisiondata"
 
 
 def find_storvscid_from_sysctl_pnpinfo(sysctl_out, deviceid):
@@ -276,19 +281,20 @@ class DataSourceAzure(sources.DataSource):
 
         with temporary_hostname(azure_hostname, self.ds_cfg,
                                 hostname_command=hostname_command) \
-                as previous_hostname:
-            if (previous_hostname is not None and
+                as previous_hn:
+            if (previous_hn is not None and
                util.is_true(self.ds_cfg.get('set_hostname'))):
                 cfg = self.ds_cfg['hostname_bounce']
 
                 # "Bouncing" the network
                 try:
-                    perform_hostname_bounce(hostname=azure_hostname,
-                                            cfg=cfg,
-                                            prev_hostname=previous_hostname)
+                    return perform_hostname_bounce(hostname=azure_hostname,
+                                                   cfg=cfg,
+                                                   prev_hostname=previous_hn)
                 except Exception as e:
                     LOG.warning("Failed publishing hostname: %s", e)
                     util.logexc(LOG, "handling set_hostname failed")
+        return False
 
     def get_metadata_from_agent(self):
         temp_hostname = self.metadata.get('local-hostname')
@@ -345,15 +351,20 @@ class DataSourceAzure(sources.DataSource):
         ddir = self.ds_cfg['data_dir']
 
         candidates = [self.seed_dir]
+        if os.path.isfile(REPROVISION_MARKER_FILE):
+            candidates.insert(0, "IMDS")
         candidates.extend(list_possible_azure_ds_devs())
         if ddir:
             candidates.append(ddir)
 
         found = None
-
+        reprovision = False
         for cdev in candidates:
             try:
-                if cdev.startswith("/dev/"):
+                if cdev == "IMDS":
+                    ret = None
+                    reprovision = True
+                elif cdev.startswith("/dev/"):
                     if util.is_FreeBSD():
                         ret = util.mount_cb(cdev, load_azure_ds_dir,
                                             mtype="udf", sync=False)
@@ -370,6 +381,8 @@ class DataSourceAzure(sources.DataSource):
                 LOG.warning("%s was not mountable", cdev)
                 continue
 
+            if reprovision or self._should_reprovision(ret):
+                ret = self._reprovision()
             (md, self.userdata_raw, cfg, files) = ret
             self.seed = cdev
             self.metadata = util.mergemanydict([md, DEFAULT_METADATA])
@@ -428,6 +441,83 @@ class DataSourceAzure(sources.DataSource):
             LOG.debug("negotiating already done for %s",
                       self.get_instance_id())
 
+    def _poll_imds(self):
+        """Poll IMDS for the new provisioning data until we get a valid
+        response. Then returned the returned JSON object."""
+        url = IMDS_URL + "?api-version=2017-04-02"
+        headers = {"Metadata": "true"}
+        LOG.debug("Start polling IMDS")
+
+        def exception_cb(msg, exception):
+            if isinstance(exception, UrlError) and exception.code == 404:
+                return
+            LOG.warning("Exception during polling. Will try DHCP.",
+                        exc_info=True)
+            # On certain versions of the networking stack, the link
+            # state disconnect is not long enough to retrigger DHCP.
+            # If we get an exception while trying to call IMDS, we
+            # bounce the network to force DHCP to acquire the new IP.
+            if not self.bounce_network_with_azure_hostname():
+                # On ubuntu releases > xenial, we rely on systemd-networkd to
+                # retrigger DHCP when the hostname changes as the ifup/down
+                # commands are missing. In that case, we will simply change
+                # the hostname. We can do this as the real hostname will still
+                # be applied after receiving the ovf_env.xml.
+                hn_cmd = self.ds_cfg['hostname_bounce']['hostname_command']
+                set_hostname(str(uuid4()), hn_cmd)
+
+        wait_for_url([url], max_wait=None, timeout=60, status_cb=LOG.info,
+                     headers_cb=lambda url: headers, sleep_time=1,
+                     exception_cb=exception_cb)
+        return str(readurl(url, headers=headers))
+
+    def _report_ready(self):
+        """Tells the fabric provisioning has completed
+        before we go into our polling loop."""
+        try:
+            get_metadata_from_fabric(self.dhclient_lease_file)
+        except Exception as exc:
+            LOG.warning(
+                "Error communicating with Azure fabric; You may experience."
+                "connectivity issues.", exc_info=True)
+
+    def _should_reprovision(self, ret):
+        """Whether or not we should poll IMDS for reprovisioning data.
+        Also sets a marker file to poll IMDS.
+
+        The marker file is used for the following scenario: the VM boots into
+        this polling loop, which we expect to be proceeding infinitely until
+        the VM is picked. If for whatever reason the platform moves us to a
+        new host (for instance a hardware issue), we need to keep polling.
+        However, since the VM reports ready to the Fabric, we will not attach
+        the ISO, thus cloud-init needs to have a way of knowing that it should
+        jump back into the polling loop in order to retrieve the ovf_env."""
+        if not ret:
+            return False
+        (md, self.userdata_raw, cfg, files) = ret
+        path = REPROVISION_MARKER_FILE
+        if (cfg.get('PreprovisionedVm') is True or
+                os.path.isfile(path)):
+            if not os.path.isfile(path):
+                LOG.info("Creating a marker file to poll imds")
+                util.write_file(path, "%s: %s\n" % (os.getpid(), time()))
+            return True
+        return False
+
+    def _reprovision(self):
+        """Initiate the reprovisioning workflow."""
+        LOG.info("bouncing network to enable IMDS polling")
+        if self.metadata is None:
+            self.metadata = {}
+        self.metadata['local-hostname'] = 'azurevm'
+        # This is needed in order to report our temp hostname to the platform
+        # and trigger DHCP in order to get an IP address.
+        self.bounce_network_with_azure_hostname()
+        self._report_ready()
+        contents = self._poll_imds()
+        md, ud, cfg = read_azure_ovf(contents)
+        return (md, ud, cfg, {'ovf-env.xml': contents})
+
     def _negotiate(self):
         """Negotiate with fabric and return data from it.
 
@@ -453,7 +543,7 @@ class DataSourceAzure(sources.DataSource):
                 "Error communicating with Azure fabric; You may experience."
                 "connectivity issues.", exc_info=True)
             return False
-
+        util.del_file(REPROVISION_MARKER_FILE)
         return fabric_data
 
     def activate(self, cfg, is_new_instance):
459 def activate(self, cfg, is_new_instance):549 def activate(self, cfg, is_new_instance):
@@ -595,6 +685,7 @@ def address_ephemeral_resize(devpath=RESOURCE_DISK_PATH, maxwait=120,
 def perform_hostname_bounce(hostname, cfg, prev_hostname):
     # set the hostname to 'hostname' if it is not already set to that.
     # then, if policy is not off, bounce the interface using command
+    # Returns True if the network was bounced, False otherwise.
     command = cfg['command']
     interface = cfg['interface']
     policy = cfg['policy']
@@ -614,7 +705,8 @@ def perform_hostname_bounce(hostname, cfg, prev_hostname):
         else:
             LOG.debug(
                 "Skipping network bounce: ifupdown utils aren't present.")
-            return  # Don't bounce as networkd handles hostname DDNS updates
+            # Don't bounce as networkd handles hostname DDNS updates
+            return False
     LOG.debug("pubhname: publishing hostname [%s]", msg)
     shell = not isinstance(command, (list, tuple))
     # capture=False, see comments in bug 1202758 and bug 1206164.
@@ -622,6 +714,7 @@ def perform_hostname_bounce(hostname, cfg, prev_hostname):
                   get_uptime=True, func=util.subp,
                   kwargs={'args': command, 'shell': shell, 'capture': False,
                           'env': env})
+    return True
 
 
 def crtfile_to_pubkey(fname, data=None):
@@ -838,9 +931,35 @@ def read_azure_ovf(contents):
     if 'ssh_pwauth' not in cfg and password:
         cfg['ssh_pwauth'] = True
 
+    cfg['PreprovisionedVm'] = _extract_preprovisioned_vm_setting(dom)
+
     return (md, ud, cfg)
 
 
+def _extract_preprovisioned_vm_setting(dom):
+    """Read the preprovision flag from the ovf. It should not
+    exist unless true."""
+    platform_settings_section = find_child(
+        dom.documentElement,
+        lambda n: n.localName == "PlatformSettingsSection")
+    if not platform_settings_section or len(platform_settings_section) == 0:
+        LOG.debug("PlatformSettingsSection not found")
+        return False
+    platform_settings = find_child(
+        platform_settings_section[0],
+        lambda n: n.localName == "PlatformSettings")
+    if not platform_settings or len(platform_settings) == 0:
+        LOG.debug("PlatformSettings not found")
+        return False
+    preprovisionedVm = find_child(
+        platform_settings[0],
+        lambda n: n.localName == "PreprovisionedVm")
+    if not preprovisionedVm or len(preprovisionedVm) == 0:
+        LOG.debug("PreprovisionedVm not found")
+        return False
+    return bool(preprovisionedVm[0])
+
+
 def encrypt_pass(password, salt_id="$6$"):
     return crypt.crypt(password, salt_id + util.rand_str(strlen=16))
 
diff --git a/cloudinit/url_helper.py b/cloudinit/url_helper.py
index 0e0f5b4..281c87c 100644
--- a/cloudinit/url_helper.py
+++ b/cloudinit/url_helper.py
@@ -301,6 +301,8 @@ def wait_for_url(urls, max_wait=None, timeout=None,
     service but is not going to find one. It is possible that the instance
     data host (169.254.169.254) may be firewalled off Entirely for a sytem,
     meaning that the connection will block forever unless a timeout is set.
+
+    A value of None for max_wait will retry indefinitely.
     """
     start_time = time.time()
 
305 start_time = time.time()307 start_time = time.time()
306308
@@ -311,8 +313,9 @@ def wait_for_url(urls, max_wait=None, timeout=None,
         status_cb = log_status_cb
 
     def timeup(max_wait, start_time):
-        return ((max_wait <= 0 or max_wait is None) or
-                (time.time() - start_time > max_wait))
+        if (max_wait is None):
+            return False
+        return ((max_wait <= 0) or (time.time() - start_time > max_wait))
 
     loop_n = 0
     while True:
@@ -322,7 +325,8 @@ def wait_for_url(urls, max_wait=None, timeout=None,
             if loop_n != 0:
                 if timeup(max_wait, start_time):
                     break
-                if timeout and (now + timeout > (start_time + max_wait)):
+                if (max_wait is not None and
+                        timeout and (now + timeout > (start_time + max_wait))):
                     # shorten timeout to not run way over max_time
                     timeout = int((start_time + max_wait) - now)
 
@@ -354,10 +358,11 @@ def wait_for_url(urls, max_wait=None, timeout=None,
                 url_exc = e
 
             time_taken = int(time.time() - start_time)
-            status_msg = "Calling '%s' failed [%s/%ss]: %s" % (url,
-                                                               time_taken,
-                                                               max_wait,
-                                                               reason)
+            max_wait_str = "%ss" % max_wait if max_wait else "unlimited"
+            status_msg = "Calling '%s' failed [%s/%s]: %s" % (url,
+                                                              time_taken,
+                                                              max_wait_str,
+                                                              reason)
             status_cb(status_msg)
             if exception_cb:
                 # This can be used to alter the headers that will be sent
diff --git a/tests/unittests/test_datasource/test_azure.py b/tests/unittests/test_datasource/test_azure.py
index 6341e1e..9680ac6 100644
--- a/tests/unittests/test_datasource/test_azure.py
+++ b/tests/unittests/test_datasource/test_azure.py
@@ -5,18 +5,20 @@ from cloudinit.util import b64e, decode_binary, load_file, write_file
 from cloudinit.sources import DataSourceAzure as dsaz
 from cloudinit.util import find_freebsd_part
 from cloudinit.util import get_path_dev_freebsd
-
+from cloudinit.version import version_string as vs
 from cloudinit.tests.helpers import (CiTestCase, TestCase, populate_dir, mock,
                                      ExitStack, PY26, SkipTest)
 
 import crypt
 import os
 import stat
+import tempfile
 import xml.etree.ElementTree as ET
 import yaml
 
 
-def construct_valid_ovf_env(data=None, pubkeys=None, userdata=None):
+def construct_valid_ovf_env(data=None, pubkeys=None,
+                            userdata=None, platform_settings=None):
     if data is None:
         data = {'HostName': 'FOOHOST'}
     if pubkeys is None:
@@ -66,10 +68,12 @@ def construct_valid_ovf_env(data=None, pubkeys=None, userdata=None):
66 xmlns:i="http://www.w3.org/2001/XMLSchema-instance">68 xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
67 <KmsServerHostname>kms.core.windows.net</KmsServerHostname>69 <KmsServerHostname>kms.core.windows.net</KmsServerHostname>
68 <ProvisionGuestAgent>false</ProvisionGuestAgent>70 <ProvisionGuestAgent>false</ProvisionGuestAgent>
69 <GuestAgentPackageName i:nil="true" />71 <GuestAgentPackageName i:nil="true" />"""
70 </PlatformSettings></wa:PlatformSettingsSection>72 if platform_settings:
71</Environment>73 for k, v in platform_settings.items():
72 """74 content += "<%s>%s</%s>\n" % (k, v, k)
75 content += """</PlatformSettings></wa:PlatformSettingsSection>
76</Environment>"""
7377
74 return content78 return content
7579
@@ -1107,4 +1111,111 @@ class TestAzureNetExists(CiTestCase):
1107 self.assertTrue(hasattr(dsaz, "DataSourceAzureNet"))1111 self.assertTrue(hasattr(dsaz, "DataSourceAzureNet"))
11081112
11091113
1114@mock.patch('cloudinit.sources.DataSourceAzure.util.subp')
1115@mock.patch.object(dsaz, 'get_hostname')
1116@mock.patch.object(dsaz, 'set_hostname')
1117class TestAzureDataSourcePreprovisioning(CiTestCase):
1118
1119 def setUp(self):
1120 super(TestAzureDataSourcePreprovisioning, self).setUp()
1121 tmp = self.tmp_dir()
1122 self.waagent_d = self.tmp_path('/var/lib/waagent', tmp)
1123 self.paths = helpers.Paths({'cloud_dir': tmp})
1124 dsaz.BUILTIN_DS_CONFIG['data_dir'] = self.waagent_d
1125
1126 def test_read_azure_ovf_with_flag(self, *args):
1127 """The read_azure_ovf method should set the PreprovisionedVm
1128 cfg flag if the proper setting is present."""
1129 content = construct_valid_ovf_env(
1130 platform_settings={"PreprovisionedVm": "True"})
1131 ret = dsaz.read_azure_ovf(content)
1132 cfg = ret[2]
1133 self.assertTrue(cfg['PreprovisionedVm'])
1134
1135 def test_read_azure_ovf_without_flag(self, *args):
1136 """The read_azure_ovf method should not set the
1137 PreprovisionedVm cfg flag."""
1138 content = construct_valid_ovf_env()
1139 ret = dsaz.read_azure_ovf(content)
1140 cfg = ret[2]
1141 self.assertFalse(cfg['PreprovisionedVm'])
1142
1143 @mock.patch('requests.Session.request')
1144 def test_poll_imds_returns_ovf_env(self, fake_resp, *args):
1145 """The _poll_imds method should return the ovf_env.xml."""
1146 url = 'http://{0}/metadata/reprovisiondata?api-version=2017-04-02'
1147 host = "169.254.169.254"
1148 full_url = url.format(host)
1149 fake_resp.return_value = mock.MagicMock(status_code=200, text="ovf")
1150 dsa = dsaz.DataSourceAzure({}, distro=None, paths=self.paths)
1151 self.assertTrue(len(dsa._poll_imds()) > 0)
1152 print(fake_resp.call_args_list)
1153 self.assertEqual(fake_resp.call_args_list,
1154 [mock.call(allow_redirects=True,
1155 headers={'Metadata': 'true',
1156 'User-Agent':
1157 'Cloud-Init/%s' % vs()
1158 }, method='GET', timeout=60.0,
1159 url=full_url),
1160 mock.call(allow_redirects=True,
1161 headers={'Metadata': 'true',
1162 'User-Agent':
1163 'Cloud-Init/%s' % vs()
1164 }, method='GET', url=full_url)])
1165
1166 @mock.patch('requests.Session.request')
1167 def test__reprovision_calls__poll_imds(self, fake_resp, *args):
1168 """The _reprovision method should call poll IMDS."""
1169 url = 'http://{0}/metadata/reprovisiondata?api-version=2017-04-02'
1170 host = "169.254.169.254"
1171 full_url = url.format(host)
1172 hostname = "myhost"
1173 username = "myuser"
1174 odata = {'HostName': hostname, 'UserName': username}
1175 content = construct_valid_ovf_env(data=odata)
1176 fake_resp.return_value = mock.MagicMock(status_code=200, text=content)
1177 dsa = dsaz.DataSourceAzure({}, distro=None, paths=self.paths)
1178 md, ud, cfg, d = dsa._reprovision()
1179 self.assertEqual(md['local-hostname'], hostname)
1180 self.assertEqual(cfg['system_info']['default_user']['name'], username)
1181 self.assertEqual(fake_resp.call_args_list,
1182 [mock.call(allow_redirects=True,
1183 headers={'Metadata': 'true',
1184 'User-Agent':
1185 'Cloud-Init/%s' % vs()},
1186 method='GET', timeout=60.0, url=full_url),
1187 mock.call(allow_redirects=True,
1188 headers={'Metadata': 'true',
1189 'User-Agent':
1190 'Cloud-Init/%s' % vs()},
1191 method='GET', url=full_url)])
1192
1193 @mock.patch('os.path.isfile')
1194 @mock.patch.object(dsaz, 'open', create=True)
1195 def test__should_reprovision_with_true_cfg(self, myopen, isfile, *args):
1196 """The _should_reprovision method should return true with config
1197 flag present."""
1198 isfile.return_value = False
1199 myopen.return_value = tempfile.TemporaryFile()
1200 dsa = dsaz.DataSourceAzure({}, distro=None, paths=self.paths)
1201 self.assertTrue(dsa._should_reprovision(
1202 (None, None, {'PreprovisionedVm': True}, None)))
1203
1204 @mock.patch('os.path.isfile')
1205 def test__should_reprovision_with_file_existing(self, isfile, *args):
1206 """The _should_reprovision method should return True if the sentinel
1207 exists."""
1208 isfile.return_value = True
1209 dsa = dsaz.DataSourceAzure({}, distro=None, paths=self.paths)
1210 self.assertTrue(dsa._should_reprovision(
1211 (None, None, {'preprovisionedvm': False}, None)))
1212
1213 @mock.patch('os.path.isfile')
1214 def test__should_reprovision_returns_false(self, isfile, *args):
1215 """The _should_reprovision method should return False
1216 if config and sentinel are not present."""
1217 isfile.return_value = False
1218 dsa = dsaz.DataSourceAzure({}, distro=None, paths=self.paths)
1219 self.assertFalse(dsa._should_reprovision((None, None, {}, None)))
1220
1110# vi: ts=4 expandtab1221# vi: ts=4 expandtab
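
Note on the stacked mock.patch decorators used in the tests above: decorators are applied bottom-up, so the mock created by the decorator nearest the function is passed as the first positional argument after self, and the outermost one is passed last. The following is a minimal standalone sketch of that ordering, not code from this branch. It assumes Python 3's unittest.mock rather than the mock re-exported by cloudinit.tests.helpers, and check_order is a hypothetical name chosen for illustration.

```python
import os.path
from unittest import mock


@mock.patch('os.path.isfile')   # outermost decorator: its mock is passed last
@mock.patch('os.path.isdir')    # innermost decorator: its mock is passed first
def check_order(m_isdir, m_isfile):
    """Return what the patched os.path functions report (hypothetical helper)."""
    m_isdir.return_value = True
    m_isfile.return_value = False
    # Each call below hits the mock configured above, not the filesystem.
    return os.path.isdir('/no/such/dir'), os.path.isfile('/no/such/file')


result = check_order()
```

If the parameter names were swapped, the return values would be set on the wrong mocks and a test could still pass for unrelated reasons, so it is worth matching parameter order to decorator order in each test.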
