Merge ~dojordan/cloud-init:azure-preprovisioning into cloud-init:master

Proposed by Douglas Jordan on 2017-11-28
Status: Needs review
Proposed branch: ~dojordan/cloud-init:azure-preprovisioning
Merge into: cloud-init:master
Diff against target: 459 lines (+258/-22)
4 files modified
.gitignore (+1/-0)
cloudinit/sources/DataSourceAzure.py (+128/-9)
cloudinit/url_helper.py (+12/-7)
tests/unittests/test_datasource/test_azure.py (+117/-6)
Reviewer            Review Type             Date Requested  Status
Server Team CI bot  continuous-integration                  Approve 1 hour ago
Scott Moser                                                 Needs Fixing on 2018-01-03
Chad Smith                                                  Approve on 2017-12-14
Paul Meyer          community               2017-11-28      Approve on 2017-12-01
Review via email: mp+334341@code.launchpad.net

Commit Message

Azure VM Preprovisioning support.

This change enables Azure VMs to report that provisioning has completed twice: first to tell the fabric that provisioning has completed, and then a second time to enable customer settings. The datasource for the second provisioning is the Instance Metadata Service (IMDS), and the VM will poll indefinitely for the new ovf-env.xml from IMDS.

LP: #1734991

Paul Meyer (paul-meyer) wrote :

Looks good to me (after you fix the tests and flake8's)

Scott Moser (smoser) wrote :

Fix your commit message (press 'Set commit message' above).

 Summary
 <blank>
 More info

FAILED: Continuous integration, rev:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/553/
Executed test runs:
    FAILED: Checkout

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/553/rebuild

review: Needs Fixing (continuous-integration)
Douglas Jordan (dojordan) wrote :

> FAILED: Continuous integration, rev:
> https://jenkins.ubuntu.com/server/job/cloud-init-ci/553/
> Executed test runs:
> FAILED: Checkout
>
> Click here to trigger a rebuild:
> https://jenkins.ubuntu.com/server/job/cloud-init-ci/553/rebuild

Looks like the git checkout failed. Thoughts?
stderr: remote: Authorisation required.
fatal: Authentication failed for 'https://git.launchpad.net/~dojordan/cloud-init:azure-preprovisioning/'

FAILED: Continuous integration, rev:72e423b4a0827dd954170749812b13aba761f399
https://jenkins.ubuntu.com/server/job/cloud-init-ci/562/
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    SUCCESS: Ubuntu LTS: Integration
    FAILED: MAAS Compatability Testing

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/562/rebuild

review: Needs Fixing (continuous-integration)

FAILED: Continuous integration, rev:384b7731e99338903d91da641267ed84a7470669
https://jenkins.ubuntu.com/server/job/cloud-init-ci/567/
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    SUCCESS: Ubuntu LTS: Integration
    FAILED: MAAS Compatability Testing

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/567/rebuild

review: Needs Fixing (continuous-integration)

FAILED: Continuous integration, rev:d00ec2ec82271c3c35830a04d2f216cd7bef8ba7
https://jenkins.ubuntu.com/server/job/cloud-init-ci/575/
Executed test runs:
    SUCCESS: Checkout
    FAILED: Unit & Style Tests

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/575/rebuild

review: Needs Fixing (continuous-integration)

PASSED: Continuous integration, rev:6021da2b61a55102ec7637d6fea54e02db1c1c92
https://jenkins.ubuntu.com/server/job/cloud-init-ci/576/
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    SUCCESS: Ubuntu LTS: Integration
    SUCCESS: MAAS Compatability Testing
    IN_PROGRESS: Declarative: Post Actions

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/576/rebuild

review: Approve (continuous-integration)
Paul Meyer (paul-meyer) :
review: Approve
Douglas Jordan (dojordan) wrote :

Thanks for the feedback. I've resolved your comments.

PASSED: Continuous integration, rev:ff0b7b0b5fdd1512c7ec2ccbee0f92afde5c733d
https://jenkins.ubuntu.com/server/job/cloud-init-ci/579/
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    SUCCESS: Ubuntu LTS: Integration
    SUCCESS: MAAS Compatability Testing
    IN_PROGRESS: Declarative: Post Actions

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/579/rebuild

review: Approve (continuous-integration)
Chad Smith (chad.smith) wrote :

Douglas, thanks for working on this. I'm going to mark it work in progress as you address or respond to comments. When you'd like another review pass, please just mark it back to "Needs review".

FAILED: Continuous integration, rev:10c6b219ba7168c5fb3751d28a383b130a1348ae
https://jenkins.ubuntu.com/server/job/cloud-init-ci/590/
Executed test runs:
    SUCCESS: Checkout
    FAILED: Unit & Style Tests

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/590/rebuild

review: Needs Fixing (continuous-integration)

PASSED: Continuous integration, rev:e06a762741953a5c55d843b0dcfd454111655cea
https://jenkins.ubuntu.com/server/job/cloud-init-ci/595/
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    SUCCESS: Ubuntu LTS: Integration
    SUCCESS: MAAS Compatability Testing
    IN_PROGRESS: Declarative: Post Actions

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/595/rebuild

review: Approve (continuous-integration)
Chad Smith (chad.smith) wrote :

Thanks for the rework and comments. One more pass on this. I'll test against stock azure xenial today and report here.

PASSED: Continuous integration, rev:c500a0184b8bc7c7ba03b1a884896702a761959f
https://jenkins.ubuntu.com/server/job/cloud-init-ci/612/
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    SUCCESS: Ubuntu LTS: Integration
    SUCCESS: MAAS Compatability Testing
    IN_PROGRESS: Declarative: Post Actions

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/612/rebuild

review: Approve (continuous-integration)
Chad Smith (chad.smith) wrote :

Ok one last nit from me and it looks good. I need a bit of testing still on this but content looks good.

Chad Smith (chad.smith) wrote :

one last try on the nit

PASSED: Continuous integration, rev:3c9509671f05d8dfc990b5ce35a615c9cd7b442c
https://jenkins.ubuntu.com/server/job/cloud-init-ci/618/
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    SUCCESS: Ubuntu LTS: Integration
    SUCCESS: MAAS Compatability Testing
    IN_PROGRESS: Declarative: Post Actions

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/618/rebuild

review: Approve (continuous-integration)
Chad Smith (chad.smith) :
review: Approve
Scott Moser (smoser) wrote :

There are some comments in line.
I'm not sure I fully understand all of this.

I could be wrong here, but I think you're using
bounce_network_with_azure_hostname in order to interact with IMDS_URL.

We now have 2 ways of bringing up ephemeral networking for this purpose:
 a.) Chad has recently added code in the EC2 metadata service using cloudinit/net/dhcp.py .
 b.) the DigitalOcean datasource has code to negotiate an IPv4 link-local address.

If 'b' is sufficient, I'd prefer that, but I'd prefer either one to the bounce_network_, which I think may not actually work for you if you rebased to trunk.

As mentioned in IRC, I'm still concerned about systemd giving up and deciding that boot has failed after some amount of time polling on a metadata service. As Douglas pointed out, cloud-init has timeouts set to 0 and is a 'oneshot', so *its* timeout is not an issue, but I think that things that it runs 'Before' or (pre-networking or other) might end up timing out.

Scott Moser (smoser) wrote :

some things there need fixing, definitely need rebase to trunk (I suspect you'll have conflicts), but if not, some thought is needed.

review: Needs Fixing
fd33d0a... by Douglas Jordan on 2018-01-04

Merge branch 'master' into azure-preprovisioning

8b89d2d... by Douglas Jordan on 2018-01-04

Small tweaks.

c207d44... by Douglas Jordan on 2018-01-05

Addressing CR comments

FAILED: Continuous integration, rev:c207d443ca360f192509b33dacd404cd0a4d3bc5
https://jenkins.ubuntu.com/server/job/cloud-init-ci/662/
Executed test runs:
    SUCCESS: Checkout
    FAILED: Unit & Style Tests

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/662/rebuild

review: Needs Fixing (continuous-integration)
ba329ca... by Douglas Jordan on 2018-01-05

Using warning instead of warn

FAILED: Continuous integration, rev:ba329ca5cde7f64b1f04ff3a600f3424e8c04515
https://jenkins.ubuntu.com/server/job/cloud-init-ci/669/
Executed test runs:
    SUCCESS: Checkout
    FAILED: Unit & Style Tests

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/669/rebuild

review: Needs Fixing (continuous-integration)
93d88e2... by Douglas Jordan on 2018-01-05

Fixing flake8

PASSED: Continuous integration, rev:93d88e25a1db5220cf087bba80a1ff15dd7f7fd8
https://jenkins.ubuntu.com/server/job/cloud-init-ci/670/
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    SUCCESS: Ubuntu LTS: Integration
    SUCCESS: MAAS Compatability Testing
    IN_PROGRESS: Declarative: Post Actions

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/670/rebuild

review: Approve (continuous-integration)
Douglas Jordan (dojordan) wrote :

Thanks for the comments Scott-

The reason for the bounce_network_with_azure_hostname call is actually to get the new IP address. We can't use the DigitalOcean IPv4 link-local approach because our IMDS server identifies which VM is talking to it by the MAC address and the IP address. When the control plane updates the DHCP server with the new IP, the VM has to trigger DHCP and acquire the new address. In our polling loop, we call bounce_network_with_azure_hostname to trigger DHCP when we get an exception, because IMDS will not even handle our request if the MAC/IP does not match what was expected. On Windows, we get around this by just disconnecting and reconnecting the NIC; however, on Ubuntu 16.04 we observed that the link had to be disconnected for a very long time (>10s) for this behavior to occur. Hopefully, with systemd-networkd, this issue will be fixed. For now, however, we are targeting 16.04 for pre-provisioning, and we will continue to test artful/bionic with the link state disconnect/reconnect approach.
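
The polling behaviour described above can be sketched roughly as follows. This is only an illustration, not the code in the branch: `poll_until_ready`, `NotReadyYet`, `fetch`, and `retrigger_dhcp` are hypothetical names, with `NotReadyYet` standing in for the HTTP 404 case and `retrigger_dhcp` standing in for whatever mechanism reacquires the address.

```python
import time


class NotReadyYet(Exception):
    """Hypothetical stand-in for an HTTP 404 from the metadata service."""


def poll_until_ready(fetch, retrigger_dhcp, sleep_time=1.0, sleep=time.sleep):
    """Call fetch() until it returns data, with no upper bound on retries.

    NotReadyYet means the service handled the request but has no data yet,
    so we simply retry. Any other exception is treated as a sign that the
    request never reached the service with the expected MAC/IP pair, so we
    ask the caller-supplied hook to retrigger DHCP before retrying.
    """
    while True:
        try:
            return fetch()
        except NotReadyYet:
            pass  # reachable but not ready: keep polling
        except Exception:
            retrigger_dhcp()  # likely a stale address: reacquire it
        sleep(sleep_time)
```

Passing `sleep` in as a parameter is a design choice made here so the loop can be exercised without real delays; the branch itself relies on `wait_for_url` for the retry loop.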

> There are some comments in line.
> I'm not sure I fully understand all of this.
>
> I could be wrong here, but I think you're using
> bounce_network_with_azure_hostname in order to interact with IMDS_URL.
>
> We now have 2 ways of bringing up ephemeral networking for this purpose:
> a.) Chad has recently added code in the EC2 metadata service using
> cloudinit/net/dhcp.py .
> b.) the DigitalOcean datasource has code to negotiate an IPv4 link-local
> address.
>
> If 'b' is sufficient, I'd prefer that, but I'd prefer either one to the
> bounce_network_, which I think may not actually work for you if you rebased
> to trunk.
>
> As mentioned in IRC, I'm still concerned about systemd giving up and deciding
> that boot has failed after some amount of time polling on a metadata service.
> As Douglas pointed out, cloud-init has timeouts set to 0 and is a 'oneshot',
> so *its* timeout is not an issue, but I think that things that it runs
> 'Before' or (pre-networking or other) might end up timing out.

sushant (sushantsharma) wrote :

<not sure why the inline comment does not show up>

Hi Douglas, can you please add a comment before line 83 (where you catch the exception and re-DHCP) saying that this is temporary? We plan to add a networking module specific to Azure in cloud-init and will address the re-DHCP need in that module. The plan is to submit that module for review in a week or so. Thanks!

fc01540... by Douglas Jordan on 2018-01-08

Adding comments

PASSED: Continuous integration, rev:fc0154011d4a3f0418fa71cfa5330bd9ac837dfd
https://jenkins.ubuntu.com/server/job/cloud-init-ci/679/
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    SUCCESS: Ubuntu LTS: Integration
    SUCCESS: MAAS Compatability Testing
    IN_PROGRESS: Declarative: Post Actions

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/679/rebuild

review: Approve (continuous-integration)
Scott Moser (smoser) wrote :

As it is right now, 'bounce_network_with_azure_hostname' is a no-op
on Ubuntu 18.04 (bionic) or any system without 'ifup'.
So while you want this fixed in 16.04, the general Ubuntu development
path insists that we make things work on the development release first
and then SRU them back to 16.04.

In summary: I can't let this in as broken on bionic. We'll need to find
a way to make that work on 18.04.

Your commit message says:
 | This change will enable azure vms to report provisioning has completed
 | twice, first to tell the fabric it has completed then a second time to
 | enable customer settings.

Is that true, will it *always* report reprovisioning has completed twice?

There are some small inline things also.

Those are the big things though. Thanks for your work.

Douglas Jordan (dojordan) wrote :

Regarding the DHCP stuff: we are exploring an alternate solution to bounce the NIC from Hyper-V, but in the meantime we would like to get this checked in. So an alternate solution for bionic would be to simply change the hostname. This way, systemd-networkd will keep retriggering DHCP. Once we get the final ovf_env.xml from IMDS, we will apply the real, customer-provided hostname.
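
A minimal sketch of that fallback, assuming the bounce helper returns a truthy value only when it actually bounced the interface. The names `retrigger_dhcp`, `bounce_network`, and `set_hostname` are placeholders for this illustration and are passed in as callables rather than taken from cloud-init.

```python
import uuid


def retrigger_dhcp(bounce_network, set_hostname):
    """Try the interface bounce first; fall back to a throwaway hostname.

    Returns None when the bounce happened, otherwise the temporary hostname
    that was applied. The temporary name is a random UUID, on the assumption
    that the real hostname is applied later from the final provisioning data.
    """
    if bounce_network():
        return None
    temp_hostname = str(uuid.uuid4())
    set_hostname(temp_hostname)
    return temp_hostname
```

Returning the temporary hostname is only a convenience for logging or testing; whether a hostname change is enough to make systemd-networkd request a new lease is the assumption under discussion here, not something this sketch verifies.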

Regarding "Is that true, will it *always* report reprovisioning has completed twice?"-
tl;dr: yes. Technically, it will report that "provisioning" has completed twice; we are calling the second incarnation "reprovisioning".

7f23e5c... by Douglas Jordan 1 hour ago

Changing hostname post xenial, and other PR comments

Author: Douglas Jordan <email address hidden>
Committer: Douglas Jordan <email address hidden>

PASSED: Continuous integration, rev:7f23e5c4808a9c647cd4d5277625a723a58b132b
https://jenkins.ubuntu.com/server/job/cloud-init-ci/685/
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    SUCCESS: Ubuntu LTS: Integration
    SUCCESS: MAAS Compatability Testing
    IN_PROGRESS: Declarative: Post Actions

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/685/rebuild

review: Approve (continuous-integration)

Unmerged commits

7f23e5c... by Douglas Jordan 1 hour ago

Changing hostname post xenial, and other PR comments

Author: Douglas Jordan <email address hidden>
Committer: Douglas Jordan <email address hidden>

fc01540... by Douglas Jordan on 2018-01-08

Adding comments

93d88e2... by Douglas Jordan on 2018-01-05

Fixing flake8

ba329ca... by Douglas Jordan on 2018-01-05

Using warning instead of warn

c207d44... by Douglas Jordan on 2018-01-05

Addressing CR comments

8b89d2d... by Douglas Jordan on 2018-01-04

Small tweaks.

fd33d0a... by Douglas Jordan on 2018-01-04

Merge branch 'master' into azure-preprovisioning

3c95096... by Douglas Jordan on 2017-12-11

nit fixes

37f9ff8... by Douglas Jordan on 2017-12-11

call extract method directly.

c500a01... by Douglas Jordan on 2017-12-11

Flake8 fixes

Preview Diff

1diff --git a/.gitignore b/.gitignore
2index b0500a6..75565ed 100644
3--- a/.gitignore
4+++ b/.gitignore
5@@ -10,3 +10,4 @@ parts
6 prime
7 stage
8 *.snap
9+*.cover
10diff --git a/cloudinit/sources/DataSourceAzure.py b/cloudinit/sources/DataSourceAzure.py
11index d1d0975..0bcbc99 100644
12--- a/cloudinit/sources/DataSourceAzure.py
13+++ b/cloudinit/sources/DataSourceAzure.py
14@@ -11,6 +11,8 @@ from functools import partial
15 import os
16 import os.path
17 import re
18+from time import time
19+from uuid import uuid4
20 from xml.dom import minidom
21 import xml.etree.ElementTree as ET
22
23@@ -18,6 +20,7 @@ from cloudinit import log as logging
24 from cloudinit import net
25 from cloudinit import sources
26 from cloudinit.sources.helpers.azure import get_metadata_from_fabric
27+from cloudinit.url_helper import readurl, wait_for_url, UrlError
28 from cloudinit import util
29
30 LOG = logging.getLogger(__name__)
31@@ -44,6 +47,8 @@ LEASE_FILE = '/var/lib/dhcp/dhclient.eth0.leases'
32 DEFAULT_FS = 'ext4'
33 # DMI chassis-asset-tag is set static for all azure instances
34 AZURE_CHASSIS_ASSET_TAG = '7783-7084-3265-9085-8269-3286-77'
35+REPROVISION_MARKER_FILE = "/var/lib/cloud/data/poll_imds"
36+IMDS_URL = "http://169.254.169.254/metadata/reprovisiondata"
37
38
39 def find_storvscid_from_sysctl_pnpinfo(sysctl_out, deviceid):
40@@ -276,19 +281,20 @@ class DataSourceAzure(sources.DataSource):
41
42 with temporary_hostname(azure_hostname, self.ds_cfg,
43 hostname_command=hostname_command) \
44- as previous_hostname:
45- if (previous_hostname is not None and
46+ as previous_hn:
47+ if (previous_hn is not None and
48 util.is_true(self.ds_cfg.get('set_hostname'))):
49 cfg = self.ds_cfg['hostname_bounce']
50
51 # "Bouncing" the network
52 try:
53- perform_hostname_bounce(hostname=azure_hostname,
54- cfg=cfg,
55- prev_hostname=previous_hostname)
56+ return perform_hostname_bounce(hostname=azure_hostname,
57+ cfg=cfg,
58+ prev_hostname=previous_hn)
59 except Exception as e:
60 LOG.warning("Failed publishing hostname: %s", e)
61 util.logexc(LOG, "handling set_hostname failed")
62+ return False
63
64 def get_metadata_from_agent(self):
65 temp_hostname = self.metadata.get('local-hostname')
66@@ -345,15 +351,20 @@ class DataSourceAzure(sources.DataSource):
67 ddir = self.ds_cfg['data_dir']
68
69 candidates = [self.seed_dir]
70+ if os.path.isfile(REPROVISION_MARKER_FILE):
71+ candidates.insert(0, "IMDS")
72 candidates.extend(list_possible_azure_ds_devs())
73 if ddir:
74 candidates.append(ddir)
75
76 found = None
77-
78+ reprovision = False
79 for cdev in candidates:
80 try:
81- if cdev.startswith("/dev/"):
82+ if cdev == "IMDS":
83+ ret = None
84+ reprovision = True
85+ elif cdev.startswith("/dev/"):
86 if util.is_FreeBSD():
87 ret = util.mount_cb(cdev, load_azure_ds_dir,
88 mtype="udf", sync=False)
89@@ -370,6 +381,8 @@ class DataSourceAzure(sources.DataSource):
90 LOG.warning("%s was not mountable", cdev)
91 continue
92
93+ if reprovision or self._should_reprovision(ret):
94+ ret = self._reprovision()
95 (md, self.userdata_raw, cfg, files) = ret
96 self.seed = cdev
97 self.metadata = util.mergemanydict([md, DEFAULT_METADATA])
98@@ -428,6 +441,83 @@ class DataSourceAzure(sources.DataSource):
99 LOG.debug("negotiating already done for %s",
100 self.get_instance_id())
101
102+ def _poll_imds(self):
103+ """Poll IMDS for the new provisioning data until we get a valid
104+ response. Then returned the returned JSON object."""
105+ url = IMDS_URL + "?api-version=2017-04-02"
106+ headers = {"Metadata": "true"}
107+ LOG.debug("Start polling IMDS")
108+
109+ def exception_cb(msg, exception):
110+ if isinstance(exception, UrlError) and exception.code == 404:
111+ return
112+ LOG.warning("Exception during polling. Will try DHCP.",
113+ exc_info=True)
114+ # On certain versions of the networking stack, the link
115+ # state disconnect is not long enough to retrigger DHCP.
116+ # If we get an exception while trying to call IMDS, we
117+ # bounce the network to force DHCP to acquire the new IP.
118+ if not self.bounce_network_with_azure_hostname():
119+ # On ubuntu releases > xenial, we rely on systemd-networkd to
120+ # retrigger DHCP when the hostname changes as the ifup/down
121+ # commands are missing. In that case, we will simply change
122+ # the hostname. We can do this as the real hostname will still
123+ # be applied after receiving the ovf_env.xml.
124+ hn_cmd = self.ds_cfg['hostname_bounce']['hostname_command']
125+ set_hostname(str(uuid4()), hn_cmd)
126+
127+ wait_for_url([url], max_wait=None, timeout=60, status_cb=LOG.info,
128+ headers_cb=lambda url: headers, sleep_time=1,
129+ exception_cb=exception_cb)
130+ return str(readurl(url, headers=headers))
131+
132+ def _report_ready(self):
133+ """Tells the fabric provisioning has completed
134+ before we go into our polling loop."""
135+ try:
136+ get_metadata_from_fabric(self.dhclient_lease_file)
137+ except Exception as exc:
138+ LOG.warning(
139+ "Error communicating with Azure fabric; You may experience."
140+ "connectivity issues.", exc_info=True)
141+
142+ def _should_reprovision(self, ret):
143+ """Whether or not we should poll IMDS for reprovisioning data.
144+ Also sets a marker file to poll IMDS.
145+
146+ The marker file is used for the following scenario: the VM boots into
147+ this polling loop, which we expect to be proceeding infinitely until
148+ the VM is picked. If for whatever reason the platform moves us to a
149+ new host (for instance a hardware issue), we need to keep polling.
150+ However, since the VM reports ready to the Fabric, we will not attach
151+ the ISO, thus cloud-init needs to have a way of knowing that it should
152+ jump back into the polling loop in order to retrieve the ovf_env."""
153+ if not ret:
154+ return False
155+ (md, self.userdata_raw, cfg, files) = ret
156+ path = REPROVISION_MARKER_FILE
157+ if (cfg.get('PreprovisionedVm') is True or
158+ os.path.isfile(path)):
159+ if not os.path.isfile(path):
160+ LOG.info("Creating a marker file to poll imds")
161+ util.write_file(path, "%s: %s\n" % (os.getpid(), time()))
162+ return True
163+ return False
164+
165+ def _reprovision(self):
166+ """Initiate the reprovisioning workflow."""
167+ LOG.info("bouncing network to enable IMDS polling")
168+ if self.metadata is None:
169+ self.metadata = {}
170+ self.metadata['local-hostname'] = 'azurevm'
171+ # This is needed in order to report our temp hostname to the platform
172+ # and trigger DHCP in order to get an IP address.
173+ self.bounce_network_with_azure_hostname()
174+ self._report_ready()
175+ contents = self._poll_imds()
176+ md, ud, cfg = read_azure_ovf(contents)
177+ return (md, ud, cfg, {'ovf-env.xml': contents})
178+
179 def _negotiate(self):
180 """Negotiate with fabric and return data from it.
181
182@@ -453,7 +543,7 @@ class DataSourceAzure(sources.DataSource):
183 "Error communicating with Azure fabric; You may experience."
184 "connectivity issues.", exc_info=True)
185 return False
186-
187+ util.del_file(REPROVISION_MARKER_FILE)
188 return fabric_data
189
190 def activate(self, cfg, is_new_instance):
191@@ -595,6 +685,7 @@ def address_ephemeral_resize(devpath=RESOURCE_DISK_PATH, maxwait=120,
192 def perform_hostname_bounce(hostname, cfg, prev_hostname):
193 # set the hostname to 'hostname' if it is not already set to that.
194 # then, if policy is not off, bounce the interface using command
195+ # Returns True if the network was bounced, False otherwise.
196 command = cfg['command']
197 interface = cfg['interface']
198 policy = cfg['policy']
199@@ -614,7 +705,8 @@ def perform_hostname_bounce(hostname, cfg, prev_hostname):
200 else:
201 LOG.debug(
202 "Skipping network bounce: ifupdown utils aren't present.")
203- return # Don't bounce as networkd handles hostname DDNS updates
204+ # Don't bounce as networkd handles hostname DDNS updates
205+ return False
206 LOG.debug("pubhname: publishing hostname [%s]", msg)
207 shell = not isinstance(command, (list, tuple))
208 # capture=False, see comments in bug 1202758 and bug 1206164.
209@@ -622,6 +714,7 @@ def perform_hostname_bounce(hostname, cfg, prev_hostname):
210 get_uptime=True, func=util.subp,
211 kwargs={'args': command, 'shell': shell, 'capture': False,
212 'env': env})
213+ return True
214
215
216 def crtfile_to_pubkey(fname, data=None):
217@@ -838,9 +931,35 @@ def read_azure_ovf(contents):
218 if 'ssh_pwauth' not in cfg and password:
219 cfg['ssh_pwauth'] = True
220
221+ cfg['PreprovisionedVm'] = _extract_preprovisioned_vm_setting(dom)
222+
223 return (md, ud, cfg)
224
225
226+def _extract_preprovisioned_vm_setting(dom):
227+ """Read the preprovision flag from the ovf. It should not
228+ exist unless true."""
229+ platform_settings_section = find_child(
230+ dom.documentElement,
231+ lambda n: n.localName == "PlatformSettingsSection")
232+ if not platform_settings_section or len(platform_settings_section) == 0:
233+ LOG.debug("PlatformSettingsSection not found")
234+ return False
235+ platform_settings = find_child(
236+ platform_settings_section[0],
237+ lambda n: n.localName == "PlatformSettings")
238+ if not platform_settings or len(platform_settings) == 0:
239+ LOG.debug("PlatformSettings not found")
240+ return False
241+ preprovisionedVm = find_child(
242+ platform_settings[0],
243+ lambda n: n.localName == "PreprovisionedVm")
244+ if not preprovisionedVm or len(preprovisionedVm) == 0:
245+ LOG.debug("PreprovisionedVm not found")
246+ return False
247+ return bool(preprovisionedVm[0])
248+
249+
250 def encrypt_pass(password, salt_id="$6$"):
251 return crypt.crypt(password, salt_id + util.rand_str(strlen=16))
252
253diff --git a/cloudinit/url_helper.py b/cloudinit/url_helper.py
254index 0e0f5b4..281c87c 100644
255--- a/cloudinit/url_helper.py
256+++ b/cloudinit/url_helper.py
257@@ -301,6 +301,8 @@ def wait_for_url(urls, max_wait=None, timeout=None,
258 service but is not going to find one. It is possible that the instance
259 data host (169.254.169.254) may be firewalled off Entirely for a sytem,
260 meaning that the connection will block forever unless a timeout is set.
261+
262+ A value of None for max_wait will retry indefinitely.
263 """
264 start_time = time.time()
265
266@@ -311,8 +313,9 @@ def wait_for_url(urls, max_wait=None, timeout=None,
267 status_cb = log_status_cb
268
269 def timeup(max_wait, start_time):
270- return ((max_wait <= 0 or max_wait is None) or
271- (time.time() - start_time > max_wait))
272+ if (max_wait is None):
273+ return False
274+ return ((max_wait <= 0) or (time.time() - start_time > max_wait))
275
276 loop_n = 0
277 while True:
278@@ -322,7 +325,8 @@ def wait_for_url(urls, max_wait=None, timeout=None,
279 if loop_n != 0:
280 if timeup(max_wait, start_time):
281 break
282- if timeout and (now + timeout > (start_time + max_wait)):
283+ if (max_wait is not None and
284+ timeout and (now + timeout > (start_time + max_wait))):
285 # shorten timeout to not run way over max_time
286 timeout = int((start_time + max_wait) - now)
287
288@@ -354,10 +358,11 @@ def wait_for_url(urls, max_wait=None, timeout=None,
289 url_exc = e
290
291 time_taken = int(time.time() - start_time)
292- status_msg = "Calling '%s' failed [%s/%ss]: %s" % (url,
293- time_taken,
294- max_wait,
295- reason)
296+ max_wait_str = "%ss" % max_wait if max_wait else "unlimited"
297+ status_msg = "Calling '%s' failed [%s/%s]: %s" % (url,
298+ time_taken,
299+ max_wait_str,
300+ reason)
301 status_cb(status_msg)
302 if exception_cb:
303 # This can be used to alter the headers that will be sent
304diff --git a/tests/unittests/test_datasource/test_azure.py b/tests/unittests/test_datasource/test_azure.py
305index 6341e1e..9680ac6 100644
306--- a/tests/unittests/test_datasource/test_azure.py
307+++ b/tests/unittests/test_datasource/test_azure.py
308@@ -5,18 +5,20 @@ from cloudinit.util import b64e, decode_binary, load_file, write_file
309 from cloudinit.sources import DataSourceAzure as dsaz
310 from cloudinit.util import find_freebsd_part
311 from cloudinit.util import get_path_dev_freebsd
312-
313+from cloudinit.version import version_string as vs
314 from cloudinit.tests.helpers import (CiTestCase, TestCase, populate_dir, mock,
315 ExitStack, PY26, SkipTest)
316
317 import crypt
318 import os
319 import stat
320+import tempfile
321 import xml.etree.ElementTree as ET
322 import yaml
323
324
325-def construct_valid_ovf_env(data=None, pubkeys=None, userdata=None):
326+def construct_valid_ovf_env(data=None, pubkeys=None,
327+ userdata=None, platform_settings=None):
328 if data is None:
329 data = {'HostName': 'FOOHOST'}
330 if pubkeys is None:
331@@ -66,10 +68,12 @@ def construct_valid_ovf_env(data=None, pubkeys=None, userdata=None):
332 xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
333 <KmsServerHostname>kms.core.windows.net</KmsServerHostname>
334 <ProvisionGuestAgent>false</ProvisionGuestAgent>
335- <GuestAgentPackageName i:nil="true" />
336- </PlatformSettings></wa:PlatformSettingsSection>
337-</Environment>
338- """
339+ <GuestAgentPackageName i:nil="true" />"""
340+ if platform_settings:
341+ for k, v in platform_settings.items():
342+ content += "<%s>%s</%s>\n" % (k, v, k)
343+ content += """</PlatformSettings></wa:PlatformSettingsSection>
344+</Environment>"""
345
346 return content
347
348@@ -1107,4 +1111,111 @@ class TestAzureNetExists(CiTestCase):
349 self.assertTrue(hasattr(dsaz, "DataSourceAzureNet"))
350
351
352+@mock.patch('cloudinit.sources.DataSourceAzure.util.subp')
353+@mock.patch.object(dsaz, 'get_hostname')
354+@mock.patch.object(dsaz, 'set_hostname')
355+class TestAzureDataSourcePreprovisioning(CiTestCase):
356+
357+ def setUp(self):
358+ super(TestAzureDataSourcePreprovisioning, self).setUp()
359+ tmp = self.tmp_dir()
360+ self.waagent_d = self.tmp_path('/var/lib/waagent', tmp)
361+ self.paths = helpers.Paths({'cloud_dir': tmp})
362+ dsaz.BUILTIN_DS_CONFIG['data_dir'] = self.waagent_d
363+
364+ def test_read_azure_ovf_with_flag(self, *args):
365+ """The read_azure_ovf method should set the PreprovisionedVM
366+ cfg flag if the propper setting is present."""
367+ content = construct_valid_ovf_env(
368+ platform_settings={"PreprovisionedVm": "True"})
369+ ret = dsaz.read_azure_ovf(content)
370+ cfg = ret[2]
371+ self.assertTrue(cfg['PreprovisionedVm'])
372+
373+ def test_read_azure_ovf_without_flag(self, *args):
374+ """The read_azure_ovf method should not set the
375+ PreprovisionedVM cfg flag."""
376+ content = construct_valid_ovf_env()
377+ ret = dsaz.read_azure_ovf(content)
378+ cfg = ret[2]
379+ self.assertFalse(cfg['PreprovisionedVm'])
380+
381+ @mock.patch('requests.Session.request')
382+ def test_poll_imds_returns_ovf_env(self, fake_resp, *args):
383+ """The _poll_imds method should return the ovf_env.xml."""
384+ url = 'http://{0}/metadata/reprovisiondata?api-version=2017-04-02'
385+ host = "169.254.169.254"
386+ full_url = url.format(host)
387+ fake_resp.return_value = mock.MagicMock(status_code=200, text="ovf")
388+ dsa = dsaz.DataSourceAzure({}, distro=None, paths=self.paths)
389+ self.assertTrue(len(dsa._poll_imds()) > 0)
390+ print(fake_resp.call_args_list)
391+ self.assertEqual(fake_resp.call_args_list,
392+ [mock.call(allow_redirects=True,
393+ headers={'Metadata': 'true',
394+ 'User-Agent':
395+ 'Cloud-Init/%s' % vs()
396+ }, method='GET', timeout=60.0,
397+ url=full_url),
398+ mock.call(allow_redirects=True,
399+ headers={'Metadata': 'true',
400+ 'User-Agent':
401+ 'Cloud-Init/%s' % vs()
402+ }, method='GET', url=full_url)])
403+
404+ @mock.patch('requests.Session.request')
405+ def test__reprovision_calls__poll_imds(self, fake_resp, *args):
406+ """The _reprovision method should call poll IMDS."""
407+ url = 'http://{0}/metadata/reprovisiondata?api-version=2017-04-02'
408+ host = "169.254.169.254"
409+ full_url = url.format(host)
410+ hostname = "myhost"
411+ username = "myuser"
412+ odata = {'HostName': hostname, 'UserName': username}
413+ content = construct_valid_ovf_env(data=odata)
414+ fake_resp.return_value = mock.MagicMock(status_code=200, text=content)
415+ dsa = dsaz.DataSourceAzure({}, distro=None, paths=self.paths)
416+ md, ud, cfg, d = dsa._reprovision()
417+ self.assertEqual(md['local-hostname'], hostname)
418+ self.assertEqual(cfg['system_info']['default_user']['name'], username)
419+ self.assertEqual(fake_resp.call_args_list,
420+ [mock.call(allow_redirects=True,
421+ headers={'Metadata': 'true',
422+ 'User-Agent':
423+ 'Cloud-Init/%s' % vs()},
424+ method='GET', timeout=60.0, url=full_url),
425+ mock.call(allow_redirects=True,
426+ headers={'Metadata': 'true',
427+ 'User-Agent':
428+ 'Cloud-Init/%s' % vs()},
429+ method='GET', url=full_url)])
430+
431+ @mock.patch('os.path.isfile')
432+ @mock.patch.object(dsaz, 'open', create=True)
433+ def test__should_reprovision_with_true_cfg(self, myopen, isfile, *args):
434+ """The _should_reprovision method should return true with config
435+ flag present."""
436+ isfile.return_value = False
437+ myopen.return_value = tempfile.TemporaryFile()
438+ dsa = dsaz.DataSourceAzure({}, distro=None, paths=self.paths)
439+ self.assertTrue(dsa._should_reprovision(
440+ (None, None, {'PreprovisionedVm': True}, None)))
441+
442+ @mock.patch('os.path.isfile')
443+ def test__should_reprovision_with_file_existing(self, isfile, *args):
444+ """The _should_reprovision method should return True if the sentinal
445+ exists."""
446+ isfile.return_value = True
447+ dsa = dsaz.DataSourceAzure({}, distro=None, paths=self.paths)
448+ self.assertTrue(dsa._should_reprovision(
449+ (None, None, {'preprovisionedvm': False}, None)))
450+
451+ @mock.patch('os.path.isfile')
452+ def test__should_reprovision_returns_false(self, isfile, *args):
453+ """The _should_reprovision method should return False
454+ if config and sentinal are not present."""
455+ isfile.return_value = False
456+ dsa = dsaz.DataSourceAzure({}, distro=None, paths=self.paths)
457+ self.assertFalse(dsa._should_reprovision((None, None, {}, None)))
458+
459 # vi: ts=4 expandtab
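
The `timeup` change in the `cloudinit/url_helper.py` hunk of the diff can be illustrated in isolation. The sketch below is a standalone restatement under one assumption: the `now` parameter is added here purely so the behaviour can be checked without a real clock, and it is not part of the proposed patch. Checking `max_wait is None` before any comparison matters because `None <= 0` raises `TypeError` on Python 3.

```python
import time


def timeup(max_wait, start_time, now=None):
    """Return True once the polling budget is spent.

    max_wait=None means "retry indefinitely", so it must be tested before
    any numeric comparison is attempted.
    """
    if max_wait is None:
        return False
    if now is None:
        now = time.time()
    return max_wait <= 0 or (now - start_time) > max_wait
```

With this ordering, an unlimited wait never reports time up, while zero or negative budgets still expire immediately.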
