Merge ~chad.smith/cloud-init:feature/cli-cloudinit-query into cloud-init:master

Proposed by Chad Smith
Status: Merged
Approved by: Chad Smith
Approved revision: 5916cbb4f60cbcf517ec1edbd45c964e73b441af
Merge reported by: Server Team CI bot
Merged at revision: not available
Proposed branch: ~chad.smith/cloud-init:feature/cli-cloudinit-query
Merge into: cloud-init:master
Diff against target: 1645 lines (+952/-233)
14 files modified
bash_completion/cloud-init (+3/-1)
cloudinit/cmd/devel/render.py (+1/-6)
cloudinit/cmd/main.py (+10/-0)
cloudinit/cmd/query.py (+155/-0)
cloudinit/cmd/tests/test_query.py (+193/-0)
cloudinit/helpers.py (+4/-0)
cloudinit/sources/__init__.py (+62/-14)
cloudinit/sources/tests/test_init.py (+109/-21)
doc/rtd/index.rst (+1/-0)
doc/rtd/topics/capabilities.rst (+84/-21)
doc/rtd/topics/datasources.rst (+6/-142)
doc/rtd/topics/instancedata.rst (+297/-0)
integration-requirements.txt (+2/-1)
tests/cloud_tests/testcases/base.py (+25/-27)
Reviewer               Review Type              Date Requested   Status
Server Team CI bot     continuous-integration                    Approve
cloud-init Commiters                                             Pending
Review via email: mp+354891@code.launchpad.net

Commit message

cli: add cloud-init query subcommand to query instance metadata

Cloud-init caches any cloud metadata crawled during boot in the file
/run/cloud-init/instance-data.json. Cloud-init also standardizes some of
that metadata across all clouds. The command 'cloud-init query' surfaces a
simple CLI to query or format any cached instance metadata so that scripts
or end-users do not have to write tools to crawl metadata themselves.
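As a rough illustration of what such a query involves, the sketch below walks a dot-delimited variable name through an already-parsed instance-data dictionary. The `lookup` helper and the sample data are hypothetical and are not part of cloud-init; the real command reads the cached JSON file named above.

```python
import json

# Hypothetical sample standing in for the cached instance-data.json content.
SAMPLE = json.loads(
    '{"v1": {"cloud_name": "nocloud", "local_hostname": "host1"}}')


def lookup(instance_data, varname):
    """Return the value at a dot-delimited path such as 'v1.cloud_name'."""
    value = instance_data
    for part in varname.split('.'):
        value = value[part]  # a dict raises KeyError for an undefined key
    return value


print(lookup(SAMPLE, 'v1.cloud_name'))
```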

Since 'cloud-init query' is runnable by non-root users, redact any
sensitive data from instance-data.json and provide a root-readable
unredacted instance-data-sensitive.json. Datasources can now define a
sensitive_metadata_keys tuple which will redact any matching keys
which could contain passwords or credentials from instance-data.json.
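The redaction step can be pictured with a small sketch. It assumes sensitive keys are recorded as '/'-delimited paths; the `redact` helper and the sample metadata are hypothetical and only illustrate the idea of writing a redacted copy next to an unredacted original.

```python
import copy

# Placeholder text; the proposed diff defines a similar constant.
REDACTED = 'redacted for non-root user'


def redact(metadata, sensitive_paths, redact_value=REDACTED):
    """Return a copy of metadata with each '/'-delimited path replaced."""
    md_copy = copy.deepcopy(metadata)
    for key_path in sensitive_paths:
        *parents, leaf = key_path.split('/')
        obj = md_copy
        for name in parents:
            obj = obj[name]  # descend to the dict that holds the leaf
        obj[leaf] = redact_value
    return md_copy


md = {'ds': {'security-credentials': {'token': 'abc'}}, 'region': 'us-east-1'}
public = redact(md, ['ds/security-credentials'])
print(public)
```

The original dictionary is left untouched, so the same data can back both the world-readable and the root-readable file.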

Also add the following standardized 'v1' instance-data.json keys:
  - user_data: The base64-encoded user-data provided at instance launch
  - vendor_data: Any vendor_data provided to the instance at launch
  - underscore_delimited versions of existing hyphenated keys:
    instance_id, local_hostname, availability_zone, cloud_name

Server Team CI bot (server-team-bot) wrote :

FAILED: Continuous integration, rev:44a4844725ac3a75b53f901e43697f7036b0bc42
https://jenkins.ubuntu.com/server/job/cloud-init-ci/315/
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    FAILED: Ubuntu LTS: Integration

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/315/rebuild

review: Needs Fixing (continuous-integration)
f719689... by Chad Smith

user_data and vendor_data are now also under v1 keys. Fix tests

Server Team CI bot (server-team-bot) wrote :

FAILED: Continuous integration, rev:692ce186d00a8e7c7472827923d7765895697972
https://jenkins.ubuntu.com/server/job/cloud-init-ci/316/
Executed test runs:
    SUCCESS: Checkout
    FAILED: Unit & Style Tests

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/316/rebuild

review: Needs Fixing (continuous-integration)
32f09b2... by Chad Smith

shift integration tests to unittest2 instead of unittest

593cc6d... by Chad Smith

rst format alignment on list items

Server Team CI bot (server-team-bot) wrote :

FAILED: Continuous integration, rev:757f864259853272d353c04ea222a671e3c0b9d6
https://jenkins.ubuntu.com/server/job/cloud-init-ci/317/
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    FAILED: Ubuntu LTS: Integration

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/317/rebuild

review: Needs Fixing (continuous-integration)
Scott Moser (smoser) wrote :

It occurred to me yesterday that user-data can be big. EC2 is limited to 16K, but many other platforms have higher limits. That means we potentially have a very big blob of base64-encoded text in an otherwise reasonably small file.

That, coupled with the fact that user-data has long been available in /var/lib/cloud/instance/user-data.txt (badly named...), suggests to me that we should simply not put user-data in the JSON.

We could still have cloud-init query make it available.
The same seems true of vendor-data.

This also needs an update to bash_completion.
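A minimal sketch of the suggestion above, assuming a hypothetical `attach_userdata` helper: the JSON file stays small, and the query step reads user-data from its own file only for root, reporting a redacted placeholder otherwise. The placeholder text is illustrative.

```python
import os


def attach_userdata(instance_data, user_data_fn, uid=None):
    """Return instance data plus a 'userdata' entry resolved at query time."""
    uid = os.getuid() if uid is None else uid
    merged = dict(instance_data)
    if uid == 0:
        # Only root reads the potentially large, potentially sensitive file.
        with open(user_data_fn) as stream:
            merged['userdata'] = stream.read()
    else:
        merged['userdata'] = (
            '<redacted for non-root user> file:%s' % user_data_fn)
    return merged


print(attach_userdata({'v1': {}}, '/tmp/example-user-data.txt', uid=1000))
```

The same pattern would apply to vendor-data.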

1c90226... by Chad Smith

add top-level instancedata doc topic. pull it out of datasource

ab686e1... by Chad Smith

add top-level instancedata doc, couple doc fixups

Server Team CI bot (server-team-bot) wrote :

FAILED: Continuous integration, rev:ab686e1890289ae210fb433d90a3fea18d1eb51b
https://jenkins.ubuntu.com/server/job/cloud-init-ci/323/
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    FAILED: Ubuntu LTS: Integration

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/323/rebuild

review: Needs Fixing (continuous-integration)
0bf450f... by Chad Smith

add user-data and vendor-data optional arguments. redact ud and vd for non-root

e868742... by Chad Smith

unit test fixes for separation of user-data/vendor-data from instance-data.json

49e7818... by Chad Smith

test fixes for metadata

1b46398... by Chad Smith

update docs to drop userdata/vendordata from instance-data.json

Server Team CI bot (server-team-bot) wrote :

FAILED: Continuous integration, rev:1b463986ec287d090d104e34214432b53bc09de5
https://jenkins.ubuntu.com/server/job/cloud-init-ci/328/
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    FAILED: Ubuntu LTS: Integration

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/328/rebuild

review: Needs Fixing (continuous-integration)
ada811c... by Chad Smith

add integration test dep on unittest2 for assertItemsEqual

Server Team CI bot (server-team-bot) wrote :

FAILED: Continuous integration, rev:ada811cea7a5e8fd1cb0e0c72d0a1a735f4a95db
https://jenkins.ubuntu.com/server/job/cloud-init-ci/332/
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    FAILED: Ubuntu LTS: Integration

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/332/rebuild

review: Needs Fixing (continuous-integration)
a74ca3e... by Chad Smith

no userdata/vendordata in integration test validation

Server Team CI bot (server-team-bot) wrote :

PASSED: Continuous integration, rev:a74ca3ee6594ffd2d63f3c46bc00e7c68401c81a
https://jenkins.ubuntu.com/server/job/cloud-init-ci/334/
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    SUCCESS: Ubuntu LTS: Integration
    IN_PROGRESS: Declarative: Post Actions

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/334/rebuild

review: Approve (continuous-integration)
8165e18... by Chad Smith

add redact_sensitive_keys function

a6349b4... by Chad Smith

doc lints

Server Team CI bot (server-team-bot) wrote :

PASSED: Continuous integration, rev:8165e18402e0a3e402db1adecebf3b94a86e0a24
https://jenkins.ubuntu.com/server/job/cloud-init-ci/338/
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    SUCCESS: Ubuntu LTS: Integration
    IN_PROGRESS: Declarative: Post Actions

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/338/rebuild

review: Approve (continuous-integration)
951944f... by Chad Smith

doc updates

e3a0084... by Chad Smith

fix instance data rst docs

Scott Moser (smoser) wrote :

Minor comments inline.
My only other thought is that we should document the names of things and their expected values ('cloudname' and such).

We need to have that somewhere, and I didn't read closely enough to see whether it is there.

Server Team CI bot (server-team-bot) wrote :

PASSED: Continuous integration, rev:e3a008419762313d25cc8c39cf64c716b5616482
https://jenkins.ubuntu.com/server/job/cloud-init-ci/340/
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    SUCCESS: Ubuntu LTS: Integration
    IN_PROGRESS: Declarative: Post Actions

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/340/rebuild

review: Approve (continuous-integration)
d5aa5f6... by Chad Smith

call paths if missing any instance, user or vendor data files

2bc2a4d... by Chad Smith

update docs

Server Team CI bot (server-team-bot) wrote :

PASSED: Continuous integration, rev:2bc2a4dd41d8e72500aa3292b7e8f42d6816c784
https://jenkins.ubuntu.com/server/job/cloud-init-ci/341/
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    SUCCESS: Ubuntu LTS: Integration
    IN_PROGRESS: Declarative: Post Actions

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/341/rebuild

review: Approve (continuous-integration)
Ryan Harper (raharper) wrote :

This looks good. A couple of in-line nits but the biggest question is whether this query command is meant to be top-level (cloud-init query) or if it was going to stay behind devel (cloud-init devel query)?

9401cbd... by Chad Smith

bash completion fixes

Server Team CI bot (server-team-bot) wrote :

PASSED: Continuous integration, rev:9401cbd7cbfdaeef75d98d7f7a1d4a0ca59d8ccc
https://jenkins.ubuntu.com/server/job/cloud-init-ci/342/
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    SUCCESS: Ubuntu LTS: Integration
    IN_PROGRESS: Declarative: Post Actions

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/342/rebuild

review: Approve (continuous-integration)
79befd9... by Chad Smith

address raharper's review comments

- bash_completion fixups for top-level query command
- cache os.getuid calls to avoid duplicate calls
- use paths.instance_link instead of paths.get_ipath()
- comment on underscore delimited standardized keys
- rtd doc fixups

7d38ce3... by Chad Smith

more doc fixes

87bf954... by Chad Smith

add note about security_sensitive keys

Chad Smith (chad.smith) :
Server Team CI bot (server-team-bot) wrote :

PASSED: Continuous integration, rev:87bf954e24ee393a898069edf334bd15a70b7893
https://jenkins.ubuntu.com/server/job/cloud-init-ci/343/
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    SUCCESS: Ubuntu LTS: Integration
    IN_PROGRESS: Declarative: Post Actions

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/343/rebuild

review: Approve (continuous-integration)
Ryan Harper (raharper) wrote :

Should the commit message mention the two separate files for instance and instance-sensitive data?

Ryan Harper (raharper) :
c455ea0... by Chad Smith

log as debug any config files skipped when a non-root user raises a PermissionError
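A rough sketch of the pattern this commit describes, using hypothetical `read_configs` and `demo_reader` names: unreadable files are skipped with a debug message while other failures are still logged as errors. Note that `PermissionError` is a Python 3 builtin.

```python
import logging

LOG = logging.getLogger('sketch')


def read_configs(paths, reader):
    """Collect parsed configs, skipping files the current user cannot read."""
    configs = []
    for path in paths:
        try:
            configs.append(reader(path))
        except PermissionError:
            # Expected for non-root callers; not worth more than a debug line.
            LOG.debug('Skipped loading config from %s due to non-root.', path)
        except Exception:
            LOG.exception('Failed loading config from %s', path)
    return configs


def demo_reader(path):
    """Stand-in reader that refuses one path, as a root-only file would."""
    if path == 'root-only.cfg':
        raise PermissionError(path)
    return {'source': path}


print(read_configs(['public.cfg', 'root-only.cfg'], demo_reader))
```

The more specific `except PermissionError` clause has to come before the generic `except Exception` clause, otherwise it would never be reached.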

5916cbb... by Chad Smith

add cloud_name key to instance data

Chad Smith (chad.smith) :
Server Team CI bot (server-team-bot) wrote :

FAILED: Continuous integration, rev:a15e6c37da955f5497f83772209a8084b2e9b748
https://jenkins.ubuntu.com/server/job/cloud-init-ci/344/
Executed test runs:
    SUCCESS: Checkout
    FAILED: Unit & Style Tests

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/344/rebuild

review: Needs Fixing (continuous-integration)
Server Team CI bot (server-team-bot) wrote :

PASSED: Continuous integration, rev:5916cbb4f60cbcf517ec1edbd45c964e73b441af
https://jenkins.ubuntu.com/server/job/cloud-init-ci/345/
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    SUCCESS: Ubuntu LTS: Integration
    IN_PROGRESS: Declarative: Post Actions

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/345/rebuild

review: Approve (continuous-integration)

Preview Diff

1diff --git a/bash_completion/cloud-init b/bash_completion/cloud-init
2index 6d01bf3..8c25032 100644
3--- a/bash_completion/cloud-init
4+++ b/bash_completion/cloud-init
5@@ -10,7 +10,7 @@ _cloudinit_complete()
6 cur_word="${COMP_WORDS[COMP_CWORD]}"
7 prev_word="${COMP_WORDS[COMP_CWORD-1]}"
8
9- subcmds="analyze clean collect-logs devel dhclient-hook features init modules single status"
10+ subcmds="analyze clean collect-logs devel dhclient-hook features init modules query single status"
11 base_params="--help --file --version --debug --force"
12 case ${COMP_CWORD} in
13 1)
14@@ -40,6 +40,8 @@ _cloudinit_complete()
15 COMPREPLY=($(compgen -W "--help --mode" -- $cur_word))
16 ;;
17
18+ query)
19+ COMPREPLY=($(compgen -W "--all --help --instance-data --list-keys --user-data --vendor-data --debug" -- $cur_word));;
20 single)
21 COMPREPLY=($(compgen -W "--help --name --frequency --report" -- $cur_word))
22 ;;
23diff --git a/cloudinit/cmd/devel/render.py b/cloudinit/cmd/devel/render.py
24index e85933d..2ba6b68 100755
25--- a/cloudinit/cmd/devel/render.py
26+++ b/cloudinit/cmd/devel/render.py
27@@ -9,7 +9,6 @@ import sys
28 from cloudinit.handlers.jinja_template import render_jinja_payload_from_file
29 from cloudinit import log
30 from cloudinit.sources import INSTANCE_JSON_FILE
31-from cloudinit import util
32 from . import addLogHandlerCLI, read_cfg_paths
33
34 NAME = 'render'
35@@ -54,11 +53,7 @@ def handle_args(name, args):
36 paths.run_dir, INSTANCE_JSON_FILE)
37 else:
38 instance_data_fn = args.instance_data
39- try:
40- with open(instance_data_fn) as stream:
41- instance_data = stream.read()
42- instance_data = util.load_json(instance_data)
43- except IOError:
44+ if not os.path.exists(instance_data_fn):
45 LOG.error('Missing instance-data.json file: %s', instance_data_fn)
46 return 1
47 try:
48diff --git a/cloudinit/cmd/main.py b/cloudinit/cmd/main.py
49index 0eee583..5a43702 100644
50--- a/cloudinit/cmd/main.py
51+++ b/cloudinit/cmd/main.py
52@@ -791,6 +791,10 @@ def main(sysv_args=None):
53 ' pass to this module'))
54 parser_single.set_defaults(action=('single', main_single))
55
56+ parser_query = subparsers.add_parser(
57+ 'query',
58+ help='Query standardized instance metadata from the command line.')
59+
60 parser_dhclient = subparsers.add_parser('dhclient-hook',
61 help=('run the dhclient hook'
62 'to record network info'))
63@@ -842,6 +846,12 @@ def main(sysv_args=None):
64 clean_parser(parser_clean)
65 parser_clean.set_defaults(
66 action=('clean', handle_clean_args))
67+ elif sysv_args[0] == 'query':
68+ from cloudinit.cmd.query import (
69+ get_parser as query_parser, handle_args as handle_query_args)
70+ query_parser(parser_query)
71+ parser_query.set_defaults(
72+ action=('query', handle_query_args))
73 elif sysv_args[0] == 'status':
74 from cloudinit.cmd.status import (
75 get_parser as status_parser, handle_status_args)
76diff --git a/cloudinit/cmd/query.py b/cloudinit/cmd/query.py
77new file mode 100644
78index 0000000..7d2d4fe
79--- /dev/null
80+++ b/cloudinit/cmd/query.py
81@@ -0,0 +1,155 @@
82+# This file is part of cloud-init. See LICENSE file for license information.
83+
84+"""Query standardized instance metadata from the command line."""
85+
86+import argparse
87+import os
88+import six
89+import sys
90+
91+from cloudinit.handlers.jinja_template import (
92+ convert_jinja_instance_data, render_jinja_payload)
93+from cloudinit.cmd.devel import addLogHandlerCLI, read_cfg_paths
94+from cloudinit import log
95+from cloudinit.sources import (
96+ INSTANCE_JSON_FILE, INSTANCE_JSON_SENSITIVE_FILE, REDACT_SENSITIVE_VALUE)
97+from cloudinit import util
98+
99+NAME = 'query'
100+LOG = log.getLogger(NAME)
101+
102+
103+def get_parser(parser=None):
104+ """Build or extend an arg parser for query utility.
105+
106+ @param parser: Optional existing ArgumentParser instance representing the
107+ query subcommand which will be extended to support the args of
108+ this utility.
109+
110+ @returns: ArgumentParser with proper argument configuration.
111+ """
112+ if not parser:
113+ parser = argparse.ArgumentParser(
114+ prog=NAME, description='Query cloud-init instance data')
115+ parser.add_argument(
116+ '-d', '--debug', action='store_true', default=False,
117+ help='Add verbose messages during template render')
118+ parser.add_argument(
119+ '-i', '--instance-data', type=str,
120+ help=('Path to instance-data.json file. Default is /run/cloud-init/%s'
121+ % INSTANCE_JSON_FILE))
122+ parser.add_argument(
123+ '-l', '--list-keys', action='store_true', default=False,
124+ help=('List query keys available at the provided instance-data'
125+ ' <varname>.'))
126+ parser.add_argument(
127+ '-u', '--user-data', type=str,
128+ help=('Path to user-data file. Default is'
129+ ' /var/lib/cloud/instance/user-data.txt'))
130+ parser.add_argument(
131+ '-v', '--vendor-data', type=str,
132+ help=('Path to vendor-data file. Default is'
133+ ' /var/lib/cloud/instance/vendor-data.txt'))
134+ parser.add_argument(
135+ 'varname', type=str, nargs='?',
136+ help=('A dot-delimited instance data variable to query from'
137+ ' instance-data. For example: v1.local_hostname'))
138+ parser.add_argument(
139+ '-a', '--all', action='store_true', default=False, dest='dump_all',
140+ help='Dump all available instance-data')
141+ parser.add_argument(
142+ '-f', '--format', type=str, dest='format',
143+ help=('Optionally specify a custom output format string. Any'
144+ ' instance-data variable can be specified between double-curly'
145+ ' braces. For example -f "{{ v1.cloud_name }}"'))
146+ return parser
147+
148+
149+def handle_args(name, args):
150+ """Handle calls to 'cloud-init query' as a subcommand."""
151+ paths = None
152+ addLogHandlerCLI(LOG, log.DEBUG if args.debug else log.WARNING)
153+ if not any([args.list_keys, args.varname, args.format, args.dump_all]):
154+ LOG.error(
155+ 'Expected one of the options: --all, --format,'
156+ ' --list-keys or varname')
157+ get_parser().print_help()
158+ return 1
159+
160+ uid = os.getuid()
161+ if not all([args.instance_data, args.user_data, args.vendor_data]):
162+ paths = read_cfg_paths()
163+ if not args.instance_data:
164+ if uid == 0:
165+ default_json_fn = INSTANCE_JSON_SENSITIVE_FILE
166+ else:
167+ default_json_fn = INSTANCE_JSON_FILE # World readable
168+ instance_data_fn = os.path.join(paths.run_dir, default_json_fn)
169+ else:
170+ instance_data_fn = args.instance_data
171+ if not args.user_data:
172+ user_data_fn = os.path.join(paths.instance_link, 'user-data.txt')
173+ else:
174+ user_data_fn = args.user_data
175+ if not args.vendor_data:
176+ vendor_data_fn = os.path.join(paths.instance_link, 'vendor-data.txt')
177+ else:
178+ vendor_data_fn = args.vendor_data
179+
180+ try:
181+ instance_json = util.load_file(instance_data_fn)
182+ except IOError:
183+ LOG.error('Missing instance-data.json file: %s', instance_data_fn)
184+ return 1
185+
186+ instance_data = util.load_json(instance_json)
187+ if uid != 0:
188+ instance_data['userdata'] = (
189+ '<%s> file:%s' % (REDACT_SENSITIVE_VALUE, user_data_fn))
190+ instance_data['vendordata'] = (
191+ '<%s> file:%s' % (REDACT_SENSITIVE_VALUE, vendor_data_fn))
192+ else:
193+ instance_data['userdata'] = util.load_file(user_data_fn)
194+ instance_data['vendordata'] = util.load_file(vendor_data_fn)
195+ if args.format:
196+ payload = '## template: jinja\n{fmt}'.format(fmt=args.format)
197+ rendered_payload = render_jinja_payload(
198+ payload=payload, payload_fn='query commandline',
199+ instance_data=instance_data,
200+ debug=True if args.debug else False)
201+ if rendered_payload:
202+ print(rendered_payload)
203+ return 0
204+ return 1
205+
206+ response = convert_jinja_instance_data(instance_data)
207+ if args.varname:
208+ try:
209+ for var in args.varname.split('.'):
210+ response = response[var]
211+ except KeyError:
212+ LOG.error('Undefined instance-data key %s', args.varname)
213+ return 1
214+ if args.list_keys:
215+ if not isinstance(response, dict):
216+ LOG.error("--list-keys provided but '%s' is not a dict", var)
217+ return 1
218+ response = '\n'.join(sorted(response.keys()))
219+ elif args.list_keys:
220+ response = '\n'.join(sorted(response.keys()))
221+ if not isinstance(response, six.string_types):
222+ response = util.json_dumps(response)
223+ print(response)
224+ return 0
225+
226+
227+def main():
228+ """Tool to query specific instance-data values."""
229+ parser = get_parser()
230+ sys.exit(handle_args(NAME, parser.parse_args()))
231+
232+
233+if __name__ == '__main__':
234+ main()
235+
236+# vi: ts=4 expandtab
237diff --git a/cloudinit/cmd/tests/test_query.py b/cloudinit/cmd/tests/test_query.py
238new file mode 100644
239index 0000000..fb87c6a
240--- /dev/null
241+++ b/cloudinit/cmd/tests/test_query.py
242@@ -0,0 +1,193 @@
243+# This file is part of cloud-init. See LICENSE file for license information.
244+
245+from six import StringIO
246+from textwrap import dedent
247+import os
248+
249+from collections import namedtuple
250+from cloudinit.cmd import query
251+from cloudinit.helpers import Paths
252+from cloudinit.sources import REDACT_SENSITIVE_VALUE, INSTANCE_JSON_FILE
253+from cloudinit.tests.helpers import CiTestCase, mock
254+from cloudinit.util import ensure_dir, write_file
255+
256+
257+class TestQuery(CiTestCase):
258+
259+ with_logs = True
260+
261+ args = namedtuple(
262+ 'queryargs',
263+ ('debug dump_all format instance_data list_keys user_data vendor_data'
264+ ' varname'))
265+
266+ def setUp(self):
267+ super(TestQuery, self).setUp()
268+ self.tmp = self.tmp_dir()
269+ self.instance_data = self.tmp_path('instance-data', dir=self.tmp)
270+
271+ def test_handle_args_error_on_missing_param(self):
272+ """Error when missing required parameters and print usage."""
273+ args = self.args(
274+ debug=False, dump_all=False, format=None, instance_data=None,
275+ list_keys=False, user_data=None, vendor_data=None, varname=None)
276+ with mock.patch('sys.stderr', new_callable=StringIO) as m_stderr:
277+ with mock.patch('sys.stdout', new_callable=StringIO) as m_stdout:
278+ self.assertEqual(1, query.handle_args('anyname', args))
279+ expected_error = (
280+ 'ERROR: Expected one of the options: --all, --format, --list-keys'
281+ ' or varname\n')
282+ self.assertIn(expected_error, self.logs.getvalue())
283+ self.assertIn('usage: query', m_stdout.getvalue())
284+ self.assertIn(expected_error, m_stderr.getvalue())
285+
286+ def test_handle_args_error_on_missing_instance_data(self):
287+ """When instance_data file path does not exist, log an error."""
288+ absent_fn = self.tmp_path('absent', dir=self.tmp)
289+ args = self.args(
290+ debug=False, dump_all=True, format=None, instance_data=absent_fn,
291+ list_keys=False, user_data='ud', vendor_data='vd', varname=None)
292+ with mock.patch('sys.stderr', new_callable=StringIO) as m_stderr:
293+ self.assertEqual(1, query.handle_args('anyname', args))
294+ self.assertIn(
295+ 'ERROR: Missing instance-data.json file: %s' % absent_fn,
296+ self.logs.getvalue())
297+ self.assertIn(
298+ 'ERROR: Missing instance-data.json file: %s' % absent_fn,
299+ m_stderr.getvalue())
300+
301+ def test_handle_args_defaults_instance_data(self):
302+ """When no instance_data argument, default to configured run_dir."""
303+ args = self.args(
304+ debug=False, dump_all=True, format=None, instance_data=None,
305+ list_keys=False, user_data=None, vendor_data=None, varname=None)
306+ run_dir = self.tmp_path('run_dir', dir=self.tmp)
307+ ensure_dir(run_dir)
308+ paths = Paths({'run_dir': run_dir})
309+ self.add_patch('cloudinit.cmd.query.read_cfg_paths', 'm_paths')
310+ self.m_paths.return_value = paths
311+ with mock.patch('sys.stderr', new_callable=StringIO) as m_stderr:
312+ self.assertEqual(1, query.handle_args('anyname', args))
313+ json_file = os.path.join(run_dir, INSTANCE_JSON_FILE)
314+ self.assertIn(
315+ 'ERROR: Missing instance-data.json file: %s' % json_file,
316+ self.logs.getvalue())
317+ self.assertIn(
318+ 'ERROR: Missing instance-data.json file: %s' % json_file,
319+ m_stderr.getvalue())
320+
321+ def test_handle_args_dumps_all_instance_data(self):
322+ """When --all is specified query will dump all instance data vars."""
323+ write_file(self.instance_data, '{"my-var": "it worked"}')
324+ args = self.args(
325+ debug=False, dump_all=True, format=None,
326+ instance_data=self.instance_data, list_keys=False,
327+ user_data='ud', vendor_data='vd', varname=None)
328+ with mock.patch('sys.stdout', new_callable=StringIO) as m_stdout:
329+ self.assertEqual(0, query.handle_args('anyname', args))
330+ self.assertEqual(
331+ '{\n "my_var": "it worked",\n "userdata": "<%s> file:ud",\n'
332+ ' "vendordata": "<%s> file:vd"\n}\n' % (
333+ REDACT_SENSITIVE_VALUE, REDACT_SENSITIVE_VALUE),
334+ m_stdout.getvalue())
335+
336+ def test_handle_args_returns_top_level_varname(self):
337+ """When the argument varname is passed, report its value."""
338+ write_file(self.instance_data, '{"my-var": "it worked"}')
339+ args = self.args(
340+ debug=False, dump_all=True, format=None,
341+ instance_data=self.instance_data, list_keys=False,
342+ user_data='ud', vendor_data='vd', varname='my_var')
343+ with mock.patch('sys.stdout', new_callable=StringIO) as m_stdout:
344+ self.assertEqual(0, query.handle_args('anyname', args))
345+ self.assertEqual('it worked\n', m_stdout.getvalue())
346+
347+ def test_handle_args_returns_nested_varname(self):
348+ """When varname is a dot-delimited path, report the nested value."""
349+ write_file(self.instance_data,
350+ '{"v1": {"key-2": "value-2"}, "my-var": "it worked"}')
351+ args = self.args(
352+ debug=False, dump_all=False, format=None,
353+ instance_data=self.instance_data, user_data='ud', vendor_data='vd',
354+ list_keys=False, varname='v1.key_2')
355+ with mock.patch('sys.stdout', new_callable=StringIO) as m_stdout:
356+ self.assertEqual(0, query.handle_args('anyname', args))
357+ self.assertEqual('value-2\n', m_stdout.getvalue())
358+
359+ def test_handle_args_returns_standardized_vars_to_top_level_aliases(self):
360+ """Any standardized vars under v# are promoted as top-level aliases."""
361+ write_file(
362+ self.instance_data,
363+ '{"v1": {"v1_1": "val1.1"}, "v2": {"v2_2": "val2.2"},'
364+ ' "top": "gun"}')
365+ expected = dedent("""\
366+ {
367+ "top": "gun",
368+ "userdata": "<redacted for non-root user> file:ud",
369+ "v1": {
370+ "v1_1": "val1.1"
371+ },
372+ "v1_1": "val1.1",
373+ "v2": {
374+ "v2_2": "val2.2"
375+ },
376+ "v2_2": "val2.2",
377+ "vendordata": "<redacted for non-root user> file:vd"
378+ }
379+ """)
380+ args = self.args(
381+ debug=False, dump_all=True, format=None,
382+ instance_data=self.instance_data, user_data='ud', vendor_data='vd',
383+ list_keys=False, varname=None)
384+ with mock.patch('sys.stdout', new_callable=StringIO) as m_stdout:
385+ self.assertEqual(0, query.handle_args('anyname', args))
386+ self.assertEqual(expected, m_stdout.getvalue())
387+
388+ def test_handle_args_list_keys_sorts_top_level_keys_when_no_varname(self):
389+ """Sort all top-level keys when only --list-keys provided."""
390+ write_file(
391+ self.instance_data,
392+ '{"v1": {"v1_1": "val1.1"}, "v2": {"v2_2": "val2.2"},'
393+ ' "top": "gun"}')
394+ expected = 'top\nuserdata\nv1\nv1_1\nv2\nv2_2\nvendordata\n'
395+ args = self.args(
396+ debug=False, dump_all=False, format=None,
397+ instance_data=self.instance_data, list_keys=True, user_data='ud',
398+ vendor_data='vd', varname=None)
399+ with mock.patch('sys.stdout', new_callable=StringIO) as m_stdout:
400+ self.assertEqual(0, query.handle_args('anyname', args))
401+ self.assertEqual(expected, m_stdout.getvalue())
402+
403+ def test_handle_args_list_keys_sorts_nested_keys_when_varname(self):
404+ """Sort all nested keys of varname object when --list-keys provided."""
405+ write_file(
406+ self.instance_data,
407+ '{"v1": {"v1_1": "val1.1", "v1_2": "val1.2"}, "v2":' +
408+ ' {"v2_2": "val2.2"}, "top": "gun"}')
409+ expected = 'v1_1\nv1_2\n'
410+ args = self.args(
411+ debug=False, dump_all=False, format=None,
412+ instance_data=self.instance_data, list_keys=True,
413+ user_data='ud', vendor_data='vd', varname='v1')
414+ with mock.patch('sys.stdout', new_callable=StringIO) as m_stdout:
415+ self.assertEqual(0, query.handle_args('anyname', args))
416+ self.assertEqual(expected, m_stdout.getvalue())
417+
418+ def test_handle_args_list_keys_errors_when_varname_is_not_a_dict(self):
419+ """Raise an error when --list-keys and varname specify a non-dict."""
420+ write_file(
421+ self.instance_data,
422+ '{"v1": {"v1_1": "val1.1", "v1_2": "val1.2"}, "v2": ' +
423+ '{"v2_2": "val2.2"}, "top": "gun"}')
424+ expected_error = "ERROR: --list-keys provided but 'top' is not a dict"
425+ args = self.args(
426+ debug=False, dump_all=False, format=None,
427+ instance_data=self.instance_data, list_keys=True, user_data='ud',
428+ vendor_data='vd', varname='top')
429+ with mock.patch('sys.stderr', new_callable=StringIO) as m_stderr:
430+ with mock.patch('sys.stdout', new_callable=StringIO) as m_stdout:
431+ self.assertEqual(1, query.handle_args('anyname', args))
432+ self.assertEqual('', m_stdout.getvalue())
433+ self.assertIn(expected_error, m_stderr.getvalue())
434+
435+# vi: ts=4 expandtab
436diff --git a/cloudinit/helpers.py b/cloudinit/helpers.py
437index 3cc1fb1..dcd2645 100644
438--- a/cloudinit/helpers.py
439+++ b/cloudinit/helpers.py
440@@ -239,6 +239,10 @@ class ConfigMerger(object):
441 if cc_fn and os.path.isfile(cc_fn):
442 try:
443 i_cfgs.append(util.read_conf(cc_fn))
444+ except PermissionError:
445+ LOG.debug(
446+ 'Skipped loading cloud-config from %s due to'
447+ ' non-root.', cc_fn)
448 except Exception:
449 util.logexc(LOG, 'Failed loading of cloud-config from %s',
450 cc_fn)
451diff --git a/cloudinit/sources/__init__.py b/cloudinit/sources/__init__.py
452index a775f1a..730e817 100644
453--- a/cloudinit/sources/__init__.py
454+++ b/cloudinit/sources/__init__.py
455@@ -38,8 +38,12 @@ DEP_FILESYSTEM = "FILESYSTEM"
456 DEP_NETWORK = "NETWORK"
457 DS_PREFIX = 'DataSource'
458
459-# File in which instance meta-data, user-data and vendor-data is written
460+# File in which publicly available instance meta-data is written
461+# security-sensitive key values are redacted from this world-readable file
462 INSTANCE_JSON_FILE = 'instance-data.json'
463+# security-sensitive key values are present in this root-readable file
464+INSTANCE_JSON_SENSITIVE_FILE = 'instance-data-sensitive.json'
465+REDACT_SENSITIVE_VALUE = 'redacted for non-root user'
466
467 # Key which can provide a cloud's official product name to cloud-init
468 METADATA_CLOUD_NAME_KEY = 'cloud-name'
469@@ -58,7 +62,7 @@ class InvalidMetaDataException(Exception):
470 pass
471
472
473-def process_instance_metadata(metadata, key_path=''):
474+def process_instance_metadata(metadata, key_path='', sensitive_keys=()):
475 """Process all instance metadata cleaning it up for persisting as json.
476
477 Strip ci-b64 prefix and catalog any 'base64_encoded_keys' as a list
478@@ -67,22 +71,46 @@ def process_instance_metadata(metadata, key_path=''):
479 """
480 md_copy = copy.deepcopy(metadata)
481 md_copy['base64_encoded_keys'] = []
482+ md_copy['sensitive_keys'] = []
483 for key, val in metadata.items():
484 if key_path:
485 sub_key_path = key_path + '/' + key
486 else:
487 sub_key_path = key
488+ if key in sensitive_keys or sub_key_path in sensitive_keys:
489+ md_copy['sensitive_keys'].append(sub_key_path)
490 if isinstance(val, str) and val.startswith('ci-b64:'):
491 md_copy['base64_encoded_keys'].append(sub_key_path)
492 md_copy[key] = val.replace('ci-b64:', '')
493 if isinstance(val, dict):
494- return_val = process_instance_metadata(val, sub_key_path)
495+ return_val = process_instance_metadata(
496+ val, sub_key_path, sensitive_keys)
497 md_copy['base64_encoded_keys'].extend(
498 return_val.pop('base64_encoded_keys'))
499+ md_copy['sensitive_keys'].extend(
500+ return_val.pop('sensitive_keys'))
501 md_copy[key] = return_val
502 return md_copy
503
504
505+def redact_sensitive_keys(metadata, redact_value=REDACT_SENSITIVE_VALUE):
506+ """Redact any sensitive keys from the provided metadata dictionary.
507+
508+ Replace the value of any key listed in 'sensitive_keys' with redact_value.
509+ """
510+ if not metadata.get('sensitive_keys', []):
511+ return metadata
512+ md_copy = copy.deepcopy(metadata)
513+ for key_path in metadata.get('sensitive_keys'):
514+ path_parts = key_path.split('/')
515+ obj = md_copy
516+ for path in path_parts:
517+ if isinstance(obj[path], dict) and path != path_parts[-1]:
518+ obj = obj[path]
519+ obj[path] = redact_value
520+ return md_copy
521+
522+
523 URLParams = namedtuple(
524 'URLParms', ['max_wait_seconds', 'timeout_seconds', 'num_retries'])
525
526@@ -127,6 +155,10 @@ class DataSource(object):
527
528 _dirty_cache = False
529
530+ # N-tuple of keypaths or keynames to redact from instance-data.json for
531+ # non-root users
532+ sensitive_metadata_keys = ('security-credentials',)
533+
534 def __init__(self, sys_cfg, distro, paths, ud_proc=None):
535 self.sys_cfg = sys_cfg
536 self.distro = distro
537@@ -152,12 +184,24 @@ class DataSource(object):
538
539 def _get_standardized_metadata(self):
540 """Return a dictionary of standardized metadata keys."""
541- return {'v1': {
542- 'local-hostname': self.get_hostname(),
543- 'instance-id': self.get_instance_id(),
544- 'cloud-name': self.cloud_name,
545- 'region': self.region,
546- 'availability-zone': self.availability_zone}}
547+ local_hostname = self.get_hostname()
548+ instance_id = self.get_instance_id()
549+ availability_zone = self.availability_zone
550+ cloud_name = self.cloud_name
551+ # When adding new standard keys prefer underscore-delimited instead
552+ # of hyphen-delimited to support simple variable references in jinja
553+ # templates.
554+ return {
555+ 'v1': {
556+ 'availability-zone': availability_zone,
557+ 'availability_zone': availability_zone,
558+ 'cloud-name': cloud_name,
559+ 'cloud_name': cloud_name,
560+ 'instance-id': instance_id,
561+ 'instance_id': instance_id,
562+ 'local-hostname': local_hostname,
563+ 'local_hostname': local_hostname,
564+ 'region': self.region}}
565
566 def clear_cached_attrs(self, attr_defaults=()):
567 """Reset any cached metadata attributes to datasource defaults.
568@@ -200,9 +244,7 @@ class DataSource(object):
569 """
570 instance_data = {
571 'ds': {
572- 'meta_data': self.metadata,
573- 'user_data': self.get_userdata_raw(),
574- 'vendor_data': self.get_vendordata_raw()}}
575+ 'meta_data': self.metadata}}
576 if hasattr(self, 'network_json'):
577 network_json = getattr(self, 'network_json')
578 if network_json != UNSET:
579@@ -217,7 +259,9 @@ class DataSource(object):
580 # Process content base64encoding unserializable values
581 content = util.json_dumps(instance_data)
582 # Strip base64: prefix and set base64_encoded_keys list.
583- processed_data = process_instance_metadata(json.loads(content))
584+ processed_data = process_instance_metadata(
585+ json.loads(content),
586+ sensitive_keys=self.sensitive_metadata_keys)
587 except TypeError as e:
588 LOG.warning('Error persisting instance-data.json: %s', str(e))
589 return False
590@@ -225,7 +269,11 @@ class DataSource(object):
591 LOG.warning('Error persisting instance-data.json: %s', str(e))
592 return False
593 json_file = os.path.join(self.paths.run_dir, INSTANCE_JSON_FILE)
594- write_json(json_file, processed_data, mode=0o600)
595+ write_json(json_file, processed_data) # World readable
596+ json_sensitive_file = os.path.join(self.paths.run_dir,
597+ INSTANCE_JSON_SENSITIVE_FILE)
598+ write_json(json_sensitive_file,
599+ redact_sensitive_keys(processed_data), mode=0o600)
600 return True
601
602 def _get_data(self):
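
To make the cloudinit/sources/__init__.py hunks above concrete: `process_instance_metadata` records the slash-delimited path of every key that matches `sensitive_keys`, and `redact_sensitive_keys` later walks those paths to overwrite the values. The sketch below is a simplified standalone rendition for illustration, not the merged code. `find_sensitive_paths` and `redact` are hypothetical names, base64 handling is left out, and key names containing '/' are assumed not to occur.

```python
import copy

REDACT_VALUE = 'redacted for non-root user'


def find_sensitive_paths(metadata, sensitive_keys, key_path=''):
    """Return slash-delimited paths whose key name or full path is sensitive."""
    paths = []
    for key, val in metadata.items():
        sub_path = key_path + '/' + key if key_path else key
        if key in sensitive_keys or sub_path in sensitive_keys:
            paths.append(sub_path)
        if isinstance(val, dict):
            # Recurse so nested dictionaries are checked as well.
            paths.extend(find_sensitive_paths(val, sensitive_keys, sub_path))
    return paths


def redact(metadata, sensitive_paths, redact_value=REDACT_VALUE):
    """Return a deep copy of metadata with each listed path overwritten."""
    md_copy = copy.deepcopy(metadata)
    for key_path in sensitive_paths:
        *parents, leaf = key_path.split('/')
        obj = md_copy
        for part in parents:
            obj = obj[part]
        obj[leaf] = redact_value
    return md_copy


metadata = {'ds': {'meta_data': {
    'region': 'myregion',
    'some': {'security-credentials': {'cred1': 'sekret'}}}}}
paths = find_sensitive_paths(metadata, ('security-credentials',))
print(paths)
print(redact(metadata, paths))
```

Per the commit message, the redacted copy is meant for the world-readable instance-data.json and the unredacted data for the root-only instance-data-sensitive.json. As written, the persist hunk above appears to do the reverse (the 0o600 file receives `redact_sensitive_keys(processed_data)`), which may be worth a second look.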
603diff --git a/cloudinit/sources/tests/test_init.py b/cloudinit/sources/tests/test_init.py
604index 8299af2..6b96575 100644
605--- a/cloudinit/sources/tests/test_init.py
606+++ b/cloudinit/sources/tests/test_init.py
607@@ -1,5 +1,6 @@
608 # This file is part of cloud-init. See LICENSE file for license information.
609
610+import copy
611 import inspect
612 import os
613 import six
614@@ -9,7 +10,8 @@ from cloudinit.event import EventType
615 from cloudinit.helpers import Paths
616 from cloudinit import importer
617 from cloudinit.sources import (
618- INSTANCE_JSON_FILE, DataSource, UNSET)
619+ INSTANCE_JSON_FILE, INSTANCE_JSON_SENSITIVE_FILE, REDACT_SENSITIVE_VALUE,
620+ UNSET, DataSource, redact_sensitive_keys)
621 from cloudinit.tests.helpers import CiTestCase, skipIf, mock
622 from cloudinit.user_data import UserDataProcessor
623 from cloudinit import util
624@@ -20,20 +22,24 @@ class DataSourceTestSubclassNet(DataSource):
625 dsname = 'MyTestSubclass'
626 url_max_wait = 55
627
628- def __init__(self, sys_cfg, distro, paths, custom_userdata=None,
629- get_data_retval=True):
630+ def __init__(self, sys_cfg, distro, paths, custom_metadata=None,
631+ custom_userdata=None, get_data_retval=True):
632 super(DataSourceTestSubclassNet, self).__init__(
633 sys_cfg, distro, paths)
634 self._custom_userdata = custom_userdata
635+ self._custom_metadata = custom_metadata
636 self._get_data_retval = get_data_retval
637
638 def _get_cloud_name(self):
639 return 'SubclassCloudName'
640
641 def _get_data(self):
642- self.metadata = {'availability_zone': 'myaz',
643- 'local-hostname': 'test-subclass-hostname',
644- 'region': 'myregion'}
645+ if self._custom_metadata:
646+ self.metadata = self._custom_metadata
647+ else:
648+ self.metadata = {'availability_zone': 'myaz',
649+ 'local-hostname': 'test-subclass-hostname',
650+ 'region': 'myregion'}
651 if self._custom_userdata:
652 self.userdata_raw = self._custom_userdata
653 else:
654@@ -278,7 +284,7 @@ class TestDataSource(CiTestCase):
655 os.path.exists(json_file), 'Found unexpected file %s' % json_file)
656
657 def test_get_data_writes_json_instance_data_on_success(self):
658- """get_data writes INSTANCE_JSON_FILE to run_dir as readonly root."""
659+ """get_data writes INSTANCE_JSON_FILE to run_dir as world readable."""
660 tmp = self.tmp_dir()
661 datasource = DataSourceTestSubclassNet(
662 self.sys_cfg, self.distro, Paths({'run_dir': tmp}))
663@@ -287,40 +293,90 @@ class TestDataSource(CiTestCase):
664 content = util.load_file(json_file)
665 expected = {
666 'base64_encoded_keys': [],
667+ 'sensitive_keys': [],
668 'v1': {
669 'availability-zone': 'myaz',
670+ 'availability_zone': 'myaz',
671 'cloud-name': 'subclasscloudname',
672+ 'cloud_name': 'subclasscloudname',
673 'instance-id': 'iid-datasource',
674+ 'instance_id': 'iid-datasource',
675 'local-hostname': 'test-subclass-hostname',
676+ 'local_hostname': 'test-subclass-hostname',
677 'region': 'myregion'},
678 'ds': {
679 'meta_data': {'availability_zone': 'myaz',
680 'local-hostname': 'test-subclass-hostname',
681- 'region': 'myregion'},
682- 'user_data': 'userdata_raw',
683- 'vendor_data': 'vendordata_raw'}}
684- self.maxDiff = None
685+ 'region': 'myregion'}}}
686 self.assertEqual(expected, util.load_json(content))
687 file_stat = os.stat(json_file)
688+ self.assertEqual(0o644, stat.S_IMODE(file_stat.st_mode))
689+ self.assertEqual(expected, util.load_json(content))
690+
691+ def test_get_data_writes_json_instance_data_sensitive(self):
692+ """get_data writes INSTANCE_JSON_SENSITIVE_FILE as readonly root."""
693+ tmp = self.tmp_dir()
694+ datasource = DataSourceTestSubclassNet(
695+ self.sys_cfg, self.distro, Paths({'run_dir': tmp}),
696+ custom_metadata={
697+ 'availability_zone': 'myaz',
698+ 'local-hostname': 'test-subclass-hostname',
699+ 'region': 'myregion',
700+ 'some': {'security-credentials': {
701+ 'cred1': 'sekret', 'cred2': 'othersekret'}}})
702+ self.assertEqual(
703+ ('security-credentials',), datasource.sensitive_metadata_keys)
704+ datasource.get_data()
705+ json_file = self.tmp_path(INSTANCE_JSON_FILE, tmp)
706+ sensitive_json_file = self.tmp_path(INSTANCE_JSON_SENSITIVE_FILE, tmp)
707+ redacted = util.load_json(util.load_file(json_file))
708+ self.assertEqual(
709+ {'cred1': 'sekret', 'cred2': 'othersekret'},
710+ redacted['ds']['meta_data']['some']['security-credentials'])
711+ content = util.load_file(sensitive_json_file)
712+ expected = {
713+ 'base64_encoded_keys': [],
714+ 'sensitive_keys': ['ds/meta_data/some/security-credentials'],
715+ 'v1': {
716+ 'availability-zone': 'myaz',
717+ 'availability_zone': 'myaz',
718+ 'cloud-name': 'subclasscloudname',
719+ 'cloud_name': 'subclasscloudname',
720+ 'instance-id': 'iid-datasource',
721+ 'instance_id': 'iid-datasource',
722+ 'local-hostname': 'test-subclass-hostname',
723+ 'local_hostname': 'test-subclass-hostname',
724+ 'region': 'myregion'},
725+ 'ds': {
726+ 'meta_data': {
727+ 'availability_zone': 'myaz',
728+ 'local-hostname': 'test-subclass-hostname',
729+ 'region': 'myregion',
730+ 'some': {'security-credentials': REDACT_SENSITIVE_VALUE}}}
731+ }
732+ self.maxDiff = None
733+ self.assertEqual(expected, util.load_json(content))
734+ file_stat = os.stat(sensitive_json_file)
735 self.assertEqual(0o600, stat.S_IMODE(file_stat.st_mode))
736+ self.assertEqual(expected, util.load_json(content))
737
738 def test_get_data_handles_redacted_unserializable_content(self):
739 """get_data warns unserializable content in INSTANCE_JSON_FILE."""
740 tmp = self.tmp_dir()
741 datasource = DataSourceTestSubclassNet(
742 self.sys_cfg, self.distro, Paths({'run_dir': tmp}),
743- custom_userdata={'key1': 'val1', 'key2': {'key2.1': self.paths}})
744+ custom_metadata={'key1': 'val1', 'key2': {'key2.1': self.paths}})
745 datasource.get_data()
746 json_file = self.tmp_path(INSTANCE_JSON_FILE, tmp)
747 content = util.load_file(json_file)
748- expected_userdata = {
749+ expected_metadata = {
750 'key1': 'val1',
751 'key2': {
752 'key2.1': "Warning: redacted unserializable type <class"
753 " 'cloudinit.helpers.Paths'>"}}
754 instance_json = util.load_json(content)
755 self.assertEqual(
756- expected_userdata, instance_json['ds']['user_data'])
757+ expected_metadata, instance_json['ds']['meta_data'])
758
759 def test_persist_instance_data_writes_ec2_metadata_when_set(self):
760 """When ec2_metadata class attribute is set, persist to json."""
761@@ -361,17 +417,17 @@ class TestDataSource(CiTestCase):
762 tmp = self.tmp_dir()
763 datasource = DataSourceTestSubclassNet(
764 self.sys_cfg, self.distro, Paths({'run_dir': tmp}),
765- custom_userdata={'key1': 'val1', 'key2': {'key2.1': b'\x123'}})
766+ custom_metadata={'key1': 'val1', 'key2': {'key2.1': b'\x123'}})
767 self.assertTrue(datasource.get_data())
768 json_file = self.tmp_path(INSTANCE_JSON_FILE, tmp)
769 content = util.load_file(json_file)
770 instance_json = util.load_json(content)
771- self.assertEqual(
772- ['ds/user_data/key2/key2.1'],
773+ self.assertItemsEqual(
774+ ['ds/meta_data/key2/key2.1'],
775 instance_json['base64_encoded_keys'])
776 self.assertEqual(
777 {'key1': 'val1', 'key2': {'key2.1': 'EjM='}},
778- instance_json['ds']['user_data'])
779+ instance_json['ds']['meta_data'])
780
781 @skipIf(not six.PY2, "json serialization on <= py2.7 handles bytes")
782 def test_get_data_handles_bytes_values(self):
783@@ -379,7 +435,7 @@ class TestDataSource(CiTestCase):
784 tmp = self.tmp_dir()
785 datasource = DataSourceTestSubclassNet(
786 self.sys_cfg, self.distro, Paths({'run_dir': tmp}),
787- custom_userdata={'key1': 'val1', 'key2': {'key2.1': b'\x123'}})
788+ custom_metadata={'key1': 'val1', 'key2': {'key2.1': b'\x123'}})
789 self.assertTrue(datasource.get_data())
790 json_file = self.tmp_path(INSTANCE_JSON_FILE, tmp)
791 content = util.load_file(json_file)
792@@ -387,7 +443,7 @@ class TestDataSource(CiTestCase):
793 self.assertEqual([], instance_json['base64_encoded_keys'])
794 self.assertEqual(
795 {'key1': 'val1', 'key2': {'key2.1': '\x123'}},
796- instance_json['ds']['user_data'])
797+ instance_json['ds']['meta_data'])
798
799 @skipIf(not six.PY2, "Only python2 hits UnicodeDecodeErrors on non-utf8")
800 def test_non_utf8_encoding_logs_warning(self):
801@@ -395,7 +451,7 @@ class TestDataSource(CiTestCase):
802 tmp = self.tmp_dir()
803 datasource = DataSourceTestSubclassNet(
804 self.sys_cfg, self.distro, Paths({'run_dir': tmp}),
805- custom_userdata={'key1': 'val1', 'key2': {'key2.1': b'ab\xaadef'}})
806+ custom_metadata={'key1': 'val1', 'key2': {'key2.1': b'ab\xaadef'}})
807 self.assertTrue(datasource.get_data())
808 json_file = self.tmp_path(INSTANCE_JSON_FILE, tmp)
809 self.assertFalse(os.path.exists(json_file))
810@@ -509,4 +565,36 @@ class TestDataSource(CiTestCase):
811 self.logs.getvalue())
812
813
814+class TestRedactSensitiveData(CiTestCase):
815+
816+ def test_redact_sensitive_data_noop_when_no_sensitive_keys_present(self):
817+ """When sensitive_keys is absent or empty from metadata do nothing."""
818+ md = {'my': 'data'}
819+ self.assertEqual(
820+ md, redact_sensitive_keys(md, redact_value='redacted'))
821+ md['sensitive_keys'] = []
822+ self.assertEqual(
823+ md, redact_sensitive_keys(md, redact_value='redacted'))
824+
825+ def test_redact_sensitive_data_redacts_exact_match_name(self):
826+ """Only exact matched sensitive_keys are redacted from metadata."""
827+ md = {'sensitive_keys': ['md/secure'],
828+ 'md': {'secure': 's3kr1t', 'insecure': 'publik'}}
829+ secure_md = copy.deepcopy(md)
830+ secure_md['md']['secure'] = 'redacted'
831+ self.assertEqual(
832+ secure_md,
833+ redact_sensitive_keys(md, redact_value='redacted'))
834+
835+ def test_redact_sensitive_data_does_redacts_with_default_string(self):
836+ """When redact_value is absent, REDACT_SENSITIVE_VALUE is used."""
837+ md = {'sensitive_keys': ['md/secure'],
838+ 'md': {'secure': 's3kr1t', 'insecure': 'publik'}}
839+ secure_md = copy.deepcopy(md)
840+ secure_md['md']['secure'] = 'redacted for non-root user'
841+ self.assertEqual(
842+ secure_md,
843+ redact_sensitive_keys(md))
844+
845+
846 # vi: ts=4 expandtab
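
The base64 test cases above (value 'EjM=' recorded under 'ds/meta_data/key2/key2.1') suggest how a consumer of instance-data.json could reverse the encoding. The following is a sketch under that assumption. `decode_base64_keys` is a hypothetical helper, not something this branch adds, and it assumes key names never contain '/'.

```python
import base64
import copy


def decode_base64_keys(instance_data):
    """Return a copy with values at base64_encoded_keys paths decoded to bytes."""
    decoded = copy.deepcopy(instance_data)
    for key_path in instance_data.get('base64_encoded_keys', []):
        *parents, leaf = key_path.split('/')
        obj = decoded
        for part in parents:
            obj = obj[part]
        obj[leaf] = base64.b64decode(obj[leaf])
    return decoded


# Sample mirrors the expected JSON in the py3 bytes test above.
instance_json = {
    'base64_encoded_keys': ['ds/meta_data/key2/key2.1'],
    'ds': {'meta_data': {'key1': 'val1', 'key2': {'key2.1': 'EjM='}}}}
print(decode_base64_keys(instance_json)['ds']['meta_data']['key2'])
```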
847diff --git a/doc/rtd/index.rst b/doc/rtd/index.rst
848index de67f36..20a99a3 100644
849--- a/doc/rtd/index.rst
850+++ b/doc/rtd/index.rst
851@@ -31,6 +31,7 @@ initialization of a cloud instance.
852 topics/capabilities.rst
853 topics/availability.rst
854 topics/format.rst
855+ topics/instancedata.rst
856 topics/dir_layout.rst
857 topics/examples.rst
858 topics/boot.rst
859diff --git a/doc/rtd/topics/capabilities.rst b/doc/rtd/topics/capabilities.rst
860index 2d8e253..0d8b894 100644
861--- a/doc/rtd/topics/capabilities.rst
862+++ b/doc/rtd/topics/capabilities.rst
863@@ -18,7 +18,7 @@ User configurability
864
865 User-data can be given by the user at instance launch time. See
866 :ref:`user_data_formats` for acceptable user-data content.
867-
868+
869
870 This is done via the ``--user-data`` or ``--user-data-file`` argument to
871 ec2-run-instances for example.
872@@ -53,10 +53,9 @@ system:
873
874 % cloud-init --help
875 usage: cloud-init [-h] [--version] [--file FILES]
876-
877 [--debug] [--force]
878- {init,modules,single,dhclient-hook,features,analyze,devel,collect-logs,clean,status}
879- ...
880+ {init,modules,single,query,dhclient-hook,features,analyze,devel,collect-logs,clean,status}
881+ ...
882
883 optional arguments:
884 -h, --help show this help message and exit
885@@ -68,17 +67,19 @@ system:
886 your own risk)
887
888 Subcommands:
889- {init,modules,single,dhclient-hook,features,analyze,devel,collect-logs,clean,status}
890+ {init,modules,single,query,dhclient-hook,features,analyze,devel,collect-logs,clean,status}
891 init initializes cloud-init and performs initial modules
892 modules activates modules using a given configuration key
893 single run a single module
894+ query Query instance metadata from the command line
895 dhclient-hook run the dhclient hookto record network info
896 features list defined features
897 analyze Devel tool: Analyze cloud-init logs and data
898 devel Run development tools
899 collect-logs Collect and tar all cloud-init debug info
900- clean Remove logs and artifacts so cloud-init can re-run.
901- status Report cloud-init status or wait on completion.
902+ clean Remove logs and artifacts so cloud-init can re-run
903+ status Report cloud-init status or wait on completion
904+
905
906 CLI Subcommand details
907 ======================
908@@ -104,8 +105,8 @@ cloud-init status
909 Report whether cloud-init is running, done, disabled or errored. Exits
910 non-zero if an error is detected in cloud-init.
911
912- * **--long**: Detailed status information.
913- * **--wait**: Block until cloud-init completes.
914+* **--long**: Detailed status information.
915+* **--wait**: Block until cloud-init completes.
916
917 .. code-block:: shell-session
918
919@@ -143,6 +144,68 @@ Logs collected are:
920 * journalctl output
921 * /var/lib/cloud/instance/user-data.txt
922
923+.. _cli_query:
924+
925+cloud-init query
926+------------------
927+Query standardized cloud instance metadata crawled by cloud-init and stored
928+in ``/run/cloud-init/instance-data.json``. This is a convenience command-line
929+interface to reference any cached configuration metadata that cloud-init
930+crawls when booting the instance. See :ref:`instance_metadata` for more info.
931+
932+* **--all**: Dump all available instance data as json which can be queried.
933+* **--instance-data**: Optional path to a different instance-data.json file to
934+ source for queries.
935+* **--list-keys**: List available query keys from cached instance data.
936+
937+.. code-block:: shell-session
938+
939+ # List all top-level query keys available (includes standardized aliases)
940+ % cloud-init query --list-keys
941+ availability_zone
942+ base64_encoded_keys
943+ cloud_name
944+ ds
945+ instance_id
946+ local_hostname
947+ region
948+ v1
949+
950+* **<varname>**: A dot-delimited variable path into the instance-data.json
951+ object.
952+
953+.. code-block:: shell-session
954+
955+ # Query cloud-init standardized metadata on any cloud
956+ % cloud-init query v1.cloud_name
957+ aws # or openstack, azure, gce etc.
958+
959+ # Any standardized instance-data under a <v#> key is aliased as a top-level
960+ # key for convenience.
961+ % cloud-init query cloud_name
962+ aws # or openstack, azure, gce etc.
963+
964+ # Query datasource-specific metadata on EC2
965+ % cloud-init query ds.meta_data.public_ipv4
966+
967+* **--format**: A string using jinja-template syntax that is rendered with
968+ any referenced instance-data variables replaced by their values.
969+
970+.. code-block:: shell-session
971+
972+ # Generate a custom hostname fqdn based on instance-id, cloud and region
973+ % cloud-init query --format 'custom-{{instance_id}}.{{region}}.{{v1.cloud_name}}.com'
974+ custom-i-0e91f69987f37ec74.us-east-2.aws.com
975+
976+
977+.. note::
978+ The standardized instance data keys under **v#** are guaranteed not to change
979+ behavior or format. If using top-level convenience aliases for any
980+ standardized instance data keys, the most recent value (highest **v#**) of
981+ that key name is what is reported as the top-level value. So these aliases
982+ act as a 'latest'.
983+
984+
985 .. _cli_analyze:
986
987 cloud-init analyze
988@@ -150,10 +213,10 @@ cloud-init analyze
989 Get detailed reports of where cloud-init spends most of its time. See
990 :ref:`boot_time_analysis` for more info.
991
992- * **blame** Report ordered by most costly operations.
993- * **dump** Machine-readable JSON dump of all cloud-init tracked events.
994- * **show** show time-ordered report of the cost of operations during each
995- boot stage.
996+* **blame** Report ordered by most costly operations.
997+* **dump** Machine-readable JSON dump of all cloud-init tracked events.
998+* **show** show time-ordered report of the cost of operations during each
999+ boot stage.
1000
1001 .. _cli_devel:
1002
1003@@ -182,8 +245,8 @@ cloud-init clean
1004 Remove cloud-init artifacts from /var/lib/cloud and optionally reboot the
1005 machine to so cloud-init re-runs all stages as it did on first boot.
1006
1007- * **--logs**: Optionally remove /var/log/cloud-init*log files.
1008- * **--reboot**: Reboot the system after removing artifacts.
1009+* **--logs**: Optionally remove /var/log/cloud-init*log files.
1010+* **--reboot**: Reboot the system after removing artifacts.
1011
1012 .. _cli_init:
1013
1014@@ -195,7 +258,7 @@ Can be run on the commandline, but is generally gated to run only once
1015 due to semaphores in **/var/lib/cloud/instance/sem/** and
1016 **/var/lib/cloud/sem**.
1017
1018- * **--local**: Run *init-local* stage instead of *init*.
1019+* **--local**: Run *init-local* stage instead of *init*.
1020
1021 .. _cli_modules:
1022
1023@@ -210,8 +273,8 @@ declared to run in various boot stages in the file
1024 commandline, but each module is gated to run only once due to semaphores
1025 in ``/var/lib/cloud/``.
1026
1027- * **--mode (init|config|final)**: Run *modules:init*, *modules:config* or
1028- *modules:final* cloud-init stages. See :ref:`boot_stages` for more info.
1029+* **--mode (init|config|final)**: Run *modules:init*, *modules:config* or
1030+ *modules:final* cloud-init stages. See :ref:`boot_stages` for more info.
1031
1032 .. _cli_single:
1033
1034@@ -221,9 +284,9 @@ Attempt to run a single named cloud config module. The following example
1035 re-runs the cc_set_hostname module ignoring the module default frequency
1036 of once-per-instance:
1037
1038- * **--name**: The cloud-config module name to run
1039- * **--frequency**: Optionally override the declared module frequency
1040- with one of (always|once-per-instance|once)
1041+* **--name**: The cloud-config module name to run
1042+* **--frequency**: Optionally override the declared module frequency
1043+ with one of (always|once-per-instance|once)
1044
1045 .. code-block:: shell-session
1046
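
The `<varname>` lookup documented in the capabilities.rst hunk above can be pictured as a walk over the parsed JSON. The sketch below is an illustration under stated assumptions, not the code in cloudinit/cmd/query.py: `lookup` and `with_aliases` are hypothetical helpers, the sample values (including the IP address) are made up, and the top-level aliasing of `v1` keys is imitated with a plain dictionary merge.

```python
import json

# Made-up sample shaped like instance-data.json.
SAMPLE = json.loads('''
{
 "base64_encoded_keys": [],
 "ds": {"meta_data": {"public_ipv4": "10.0.0.5"}},
 "v1": {"cloud_name": "aws", "region": "us-east-2"}
}
''')


def with_aliases(instance_data, version='v1'):
    """Copy instance data, exposing standardized keys at the top level too."""
    aliased = dict(instance_data)
    for key, val in instance_data.get(version, {}).items():
        # Never let an alias shadow a real top-level key.
        aliased.setdefault(key, val)
    return aliased


def lookup(instance_data, varname):
    """Resolve a dot-delimited path such as 'v1.cloud_name'."""
    obj = instance_data
    for part in varname.split('.'):
        if not isinstance(obj, dict) or part not in obj:
            raise KeyError('Undefined instance-data key %s' % varname)
        obj = obj[part]
    return obj


data = with_aliases(SAMPLE)
print(lookup(data, 'v1.cloud_name'))
print(lookup(data, 'cloud_name'))
print(lookup(data, 'ds.meta_data.public_ipv4'))
print(sorted(data))
```

One limitation of a naive split on '.': key names that themselves contain a dot (such as the 'key2.1' used in the tests) cannot be addressed this way.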
1047diff --git a/doc/rtd/topics/datasources.rst b/doc/rtd/topics/datasources.rst
1048index 14432e6..e34f145 100644
1049--- a/doc/rtd/topics/datasources.rst
1050+++ b/doc/rtd/topics/datasources.rst
1051@@ -17,146 +17,10 @@ own way) internally a datasource abstract class was created to allow for a
1052 single way to access the different cloud systems methods to provide this data
1053 through the typical usage of subclasses.
1054
1055-
1056-.. _instance_metadata:
1057-
1058-instance-data
1059--------------
1060-For reference, cloud-init stores all the metadata, vendordata and userdata
1061-provided by a cloud in a json blob at ``/run/cloud-init/instance-data.json``.
1062-While the json contains datasource-specific keys and names, cloud-init will
1063-maintain a minimal set of standardized keys that will remain stable on any
1064-cloud. Standardized instance-data keys will be present under a "v1" key.
1065-Any datasource metadata cloud-init consumes will all be present under the
1066-"ds" key.
1067-
1068-Below is an instance-data.json example from an OpenStack instance:
1069-
1070-.. sourcecode:: json
1071-
1072- {
1073- "base64-encoded-keys": [
1074- "ds/meta-data/random_seed",
1075- "ds/user-data"
1076- ],
1077- "ds": {
1078- "ec2_metadata": {
1079- "ami-id": "ami-0000032f",
1080- "ami-launch-index": "0",
1081- "ami-manifest-path": "FIXME",
1082- "block-device-mapping": {
1083- "ami": "vda",
1084- "ephemeral0": "/dev/vdb",
1085- "root": "/dev/vda"
1086- },
1087- "hostname": "xenial-test.novalocal",
1088- "instance-action": "none",
1089- "instance-id": "i-0006e030",
1090- "instance-type": "m1.small",
1091- "local-hostname": "xenial-test.novalocal",
1092- "local-ipv4": "10.5.0.6",
1093- "placement": {
1094- "availability-zone": "None"
1095- },
1096- "public-hostname": "xenial-test.novalocal",
1097- "public-ipv4": "10.245.162.145",
1098- "reservation-id": "r-fxm623oa",
1099- "security-groups": "default"
1100- },
1101- "meta-data": {
1102- "availability_zone": null,
1103- "devices": [],
1104- "hostname": "xenial-test.novalocal",
1105- "instance-id": "3e39d278-0644-4728-9479-678f9212d8f0",
1106- "launch_index": 0,
1107- "local-hostname": "xenial-test.novalocal",
1108- "name": "xenial-test",
1109- "project_id": "e0eb2d2538814...",
1110- "random_seed": "A6yPN...",
1111- "uuid": "3e39d278-0644-4728-9479-678f92..."
1112- },
1113- "network_json": {
1114- "links": [
1115- {
1116- "ethernet_mac_address": "fa:16:3e:7d:74:9b",
1117- "id": "tap9ca524d5-6e",
1118- "mtu": 8958,
1119- "type": "ovs",
1120- "vif_id": "9ca524d5-6e5a-4809-936a-6901..."
1121- }
1122- ],
1123- "networks": [
1124- {
1125- "id": "network0",
1126- "link": "tap9ca524d5-6e",
1127- "network_id": "c6adfc18-9753-42eb-b3ea-18b57e6b837f",
1128- "type": "ipv4_dhcp"
1129- }
1130- ],
1131- "services": [
1132- {
1133- "address": "10.10.160.2",
1134- "type": "dns"
1135- }
1136- ]
1137- },
1138- "user-data": "I2Nsb3VkLWNvbmZpZ...",
1139- "vendor-data": null
1140- },
1141- "v1": {
1142- "availability-zone": null,
1143- "cloud-name": "openstack",
1144- "instance-id": "3e39d278-0644-4728-9479-678f9212d8f0",
1145- "local-hostname": "xenial-test",
1146- "region": null
1147- }
1148- }
1149-
1150-
1151-As of cloud-init v. 18.4, any values present in
1152-``/run/cloud-init/instance-data.json`` can be used in cloud-init user data
1153-scripts or cloud config data. This allows consumers to use cloud-init's
1154-vendor-neutral, standardized metadata keys as well as datasource-specific
1155-content for any scripts or cloud-config modules they are using.
1156-
1157-To use instance-data.json values in scripts and **#config-config** files the
1158-user-data will need to contain the following header as the first line **## template: jinja**. Cloud-init will source all variables defined in
1159-``/run/cloud-init/instance-data.json`` and allow scripts or cloud-config files
1160-to reference those paths. Below are two examples::
1161-
1162- * Cloud config calling home with the ec2 public hostname and avaliability-zone
1163- ```
1164- ## template: jinja
1165- #cloud-config
1166- runcmd:
1167- - echo 'EC2 public hostname allocated to instance: {{ ds.meta_data.public_hostname }}' > /tmp/instance_metadata
1168- - echo 'EC2 avaiability zone: {{ v1.availability_zone }}' >> /tmp/instance_metadata
1169- - curl -X POST -d '{"hostname": "{{ds.meta_data.public_hostname }}", "availability-zone": "{{ v1.availability_zone }}"}' https://example.com.com
1170- ```
1171-
1172- * Custom user script performing different operations based on region
1173- ```
1174- ## template: jinja
1175- #!/bin/bash
1176- {% if v1.region == 'us-east-2' -%}
1177- echo 'Installing custom proxies for {{ v1.region }}
1178- sudo apt-get install my-xtra-fast-stack
1179- {%- endif %}
1180- ...
1181-
1182- ```
1183-
1184-.. note::
1185- Trying to reference jinja variables that don't exist in
1186- instance-data.json will result in warnings in ``/var/log/cloud-init.log``
1187- and the following string in your rendered user-data:
1188- ``CI_MISSING_JINJA_VAR/<your_varname>``.
1189-
1190-.. note::
1191- To save time designing your user-data for a specific cloud's
1192- instance-data.json, use the 'render' cloud-init command on an
1193- instance booted on your favorite cloud. See :ref:`cli_devel` for more
1194- information.
1195+Any metadata processed by cloud-init's datasources is persisted as
1196+``/run/cloud-init/instance-data.json``. Cloud-init provides tooling
1197+to quickly introspect some of that data. See :ref:`instance_metadata` for
1198+more information.
1199
1200
1201 Datasource API
1202@@ -196,14 +60,14 @@ The current interface that a datasource object must provide is the following:
1203 # or does not exist)
1204 def device_name_to_device(self, name)
1205
1206- # gets the locale string this instance should be applying
1207+ # gets the locale string this instance should be applying
1208 # which typically used to adjust the instances locale settings files
1209 def get_locale(self)
1210
1211 @property
1212 def availability_zone(self)
1213
1214- # gets the instance id that was assigned to this instance by the
1215+ # gets the instance id that was assigned to this instance by the
1216 # cloud provider or when said instance id does not exist in the backing
1217 # metadata this will return 'iid-datasource'
1218 def get_instance_id(self)
1219diff --git a/doc/rtd/topics/instancedata.rst b/doc/rtd/topics/instancedata.rst
1220new file mode 100644
1221index 0000000..634e180
1222--- /dev/null
1223+++ b/doc/rtd/topics/instancedata.rst
1224@@ -0,0 +1,297 @@
1225+.. _instance_metadata:
1226+
1227+*****************
1228+Instance Metadata
1229+*****************
1230+
1231+What is instance data?
1232+========================
1233+
1234+Instance data is the collection of all configuration data that cloud-init
1235+processes to configure the instance. This configuration typically
1236+comes from any number of sources:
1237+
1238+* cloud-provided metadata services (aka metadata)
1239+* custom config-drive attached to the instance
1240+* cloud-config seed files in the booted cloud image or distribution
1241+* vendordata provided from files or cloud metadata services
1242+* userdata provided at instance creation
1243+
1244+Each cloud provider presents unique configuration metadata in different
1245+formats to the instance. Cloud-init provides a cache of any crawled metadata
1246+as well as a versioned set of standardized instance data keys which it makes
1247+available on all platforms.
1248+
1249+Cloud-init produces a simple json object in
1250+``/run/cloud-init/instance-data.json`` which is a standardized and
1251+versioned representation of the metadata it consumes during initial boot. The
1252+intent is to provide the following benefits to users or scripts on any system
1253+deployed with cloud-init:
1254+
1255+* simple static object to query to obtain an instance's metadata
1256+* speed: avoid costly network transactions for metadata that is already cached
1257+ on the filesystem
1258+* reduce need to recrawl metadata services for static metadata that is already
1259+ cached
1260+* leverage cloud-init's best practices for crawling cloud-metadata services
1261+* avoid rolling unique metadata crawlers on each cloud platform to get
1262+ metadata configuration values
1263+
1264+Cloud-init stores any instance data processed in the following files:
1265+
1266+* ``/run/cloud-init/instance-data.json``: world-readable json containing
1267+ standardized keys, sensitive keys redacted
1268+* ``/run/cloud-init/instance-data-sensitive.json``: root-readable unredacted
1269+ json blob
1270+* ``/var/lib/cloud/instance/user-data.txt``: root-readable sensitive raw
1271+ userdata
1272+* ``/var/lib/cloud/instance/vendor-data.txt``: root-readable sensitive raw
1273+ vendordata
1274+
1275+Cloud-init redacts any security-sensitive content from instance-data.json and
1276+stores ``/run/cloud-init/instance-data.json`` as a world-readable json file.
1277+Because user-data and vendor-data can contain passwords, both of those files
1278+are readable only by *root*. The *root* user can also read
1279+``/run/cloud-init/instance-data-sensitive.json``, which contains all instance
1280+data from instance-data.json as well as the unredacted sensitive content.
1281+
1282+
1283+Format of instance-data.json
1284+============================
1285+
1286+The instance-data.json and instance-data-sensitive.json files are well-formed
1287+JSON and record the set of keys and values for any metadata processed by
1288+cloud-init. Cloud-init standardizes the format for this content so that it
1289+can be generalized across different cloud platforms.
1290+
1291+There are four basic top-level keys:
1292+
1293+* **base64_encoded_keys**: A list of forward-slash delimited key paths into
1294+ the instance-data.json object whose value is base64encoded for json
1295+ compatibility. Values at these paths should be decoded to get the original
1296+ value.
1297+
1298+* **sensitive_keys**: A list of forward-slash delimited key paths into
1299+ the instance-data.json object whose value is considered by the datasource as
1300+ 'security sensitive'. Only the keys listed here will be redacted from
1301+ instance-data.json for non-root users.
1302+
1303+* **ds**: Datasource-specific metadata crawled for the specific cloud
1304+ platform. It should closely represent the structure of the cloud metadata
1305+ crawled. The structure of content and details provided are entirely
1306+ cloud-dependent. Mileage will vary depending on what the cloud exposes.
1307+ The content exposed under the 'ds' key is currently **experimental** and
1308+ expected to change slightly in the upcoming cloud-init release.
1309+
1310+* **v1**: Standardized cloud-init metadata keys. These keys are guaranteed to
1311+ exist on all cloud platforms. They will also retain their current behavior
1312+ and format and will be carried forward even if cloud-init introduces a new
1313+ version of standardized keys with **v2**.
1314+
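As a rough illustration of how a consumer might use ``base64_encoded_keys``, the sketch below decodes every listed path in a loaded instance-data dictionary. The helper name ``decode_base64_keys`` and the sample data are hypothetical, and the sketch assumes the decoded values are UTF-8 text.

```python
import base64
import copy


def decode_base64_keys(instance_data):
    """Return a copy with each path in 'base64_encoded_keys' decoded.

    Raises KeyError if a listed path is missing from the data.
    """
    decoded = copy.deepcopy(instance_data)
    for key_path in instance_data.get("base64_encoded_keys", []):
        # "ds/user_data" -> parents ["ds"], leaf "user_data"
        *parents, leaf = key_path.split("/")
        node = decoded
        for part in parents:
            node = node[part]
        raw = base64.b64decode(node[leaf])
        # Assumption: the original value was text; undecodable bytes are replaced.
        node[leaf] = raw.decode("utf-8", errors="replace")
    return decoded


# Illustrative data only, shaped like the top-level keys described above.
sample = {
    "base64_encoded_keys": ["ds/user_data"],
    "ds": {"user_data": base64.b64encode(b"#!/bin/sh\necho hi\n").decode("ascii")},
}
plain = decode_base64_keys(sample)
```

On a real instance the dictionary would come from ``json.load`` on ``/run/cloud-init/instance-data.json`` rather than from the inline sample.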
1315+The standardized keys present:
1316+
1317++----------------------+-----------------------------------------------+---------------------------+
1318+| Key path             | Description                                   | Examples                  |
1319++======================+===============================================+===========================+
1320+| v1.cloud_name        | The name of the cloud provided by metadata    | aws, openstack, azure,    |
1321+|                      | key 'cloud-name' or the cloud-init datasource | configdrive, nocloud,     |
1322+|                      | name which was discovered.                    | ovf, etc.                 |
1323++----------------------+-----------------------------------------------+---------------------------+
1324+| v1.instance_id       | Unique instance_id allocated by the cloud     | i-<somehash>              |
1325++----------------------+-----------------------------------------------+---------------------------+
1326+| v1.local_hostname    | The internal or local hostname of the system  | ip-10-41-41-70,           |
1327+|                      |                                               | <user-provided-hostname>  |
1328++----------------------+-----------------------------------------------+---------------------------+
1329+| v1.region            | The physical region/datacenter in which the   | us-east-2                 |
1330+|                      | instance is deployed                          |                           |
1331++----------------------+-----------------------------------------------+---------------------------+
1332+| v1.availability_zone | The physical availability zone in which the   | us-east-2b, nova, null    |
1333+|                      | instance is deployed                          |                           |
1334++----------------------+-----------------------------------------------+---------------------------+
1335+
1336+
1337+Below is an example of ``/run/cloud-init/instance-data.json`` on an EC2
1338+instance:
1339+
1340+.. sourcecode:: json
1341+
1342+ {
1343+ "base64_encoded_keys": [],
1344+ "sensitive_keys": [],
1345+ "ds": {
1346+ "meta_data": {
1347+ "ami-id": "ami-014e1416b628b0cbf",
1348+ "ami-launch-index": "0",
1349+ "ami-manifest-path": "(unknown)",
1350+ "block-device-mapping": {
1351+ "ami": "/dev/sda1",
1352+ "ephemeral0": "sdb",
1353+ "ephemeral1": "sdc",
1354+ "root": "/dev/sda1"
1355+ },
1356+ "hostname": "ip-10-41-41-70.us-east-2.compute.internal",
1357+ "instance-action": "none",
1358+ "instance-id": "i-04fa31cfc55aa7976",
1359+ "instance-type": "t2.micro",
1360+ "local-hostname": "ip-10-41-41-70.us-east-2.compute.internal",
1361+ "local-ipv4": "10.41.41.70",
1362+ "mac": "06:b6:92:dd:9d:24",
1363+ "metrics": {
1364+ "vhostmd": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
1365+ },
1366+ "network": {
1367+ "interfaces": {
1368+ "macs": {
1369+ "06:b6:92:dd:9d:24": {
1370+ "device-number": "0",
1371+ "interface-id": "eni-08c0c9fdb99b6e6f4",
1372+ "ipv4-associations": {
1373+ "18.224.22.43": "10.41.41.70"
1374+ },
1375+ "local-hostname": "ip-10-41-41-70.us-east-2.compute.internal",
1376+ "local-ipv4s": "10.41.41.70",
1377+ "mac": "06:b6:92:dd:9d:24",
1378+ "owner-id": "437526006925",
1379+ "public-hostname": "ec2-18-224-22-43.us-east-2.compute.amazonaws.com",
1380+ "public-ipv4s": "18.224.22.43",
1381+ "security-group-ids": "sg-828247e9",
1382+ "security-groups": "Cloud-init integration test secgroup",
1383+ "subnet-id": "subnet-282f3053",
1384+ "subnet-ipv4-cidr-block": "10.41.41.0/24",
1385+ "subnet-ipv6-cidr-blocks": "2600:1f16:b80:ad00::/64",
1386+ "vpc-id": "vpc-252ef24d",
1387+ "vpc-ipv4-cidr-block": "10.41.0.0/16",
1388+ "vpc-ipv4-cidr-blocks": "10.41.0.0/16",
1389+ "vpc-ipv6-cidr-blocks": "2600:1f16:b80:ad00::/56"
1390+ }
1391+ }
1392+ }
1393+ },
1394+ "placement": {
1395+ "availability-zone": "us-east-2b"
1396+ },
1397+ "profile": "default-hvm",
1398+ "public-hostname": "ec2-18-224-22-43.us-east-2.compute.amazonaws.com",
1399+ "public-ipv4": "18.224.22.43",
1400+ "public-keys": {
1401+ "cloud-init-integration": [
1402+ "ssh-rsa
1403+ AAAAB3NzaC1yc2EAAAADAQABAAABAQDSL7uWGj8cgWyIOaspgKdVy0cKJ+UTjfv7jBOjG2H/GN8bJVXy72XAvnhM0dUM+CCs8FOf0YlPX+Frvz2hKInrmRhZVwRSL129PasD12MlI3l44u6IwS1o/W86Q+tkQYEljtqDOo0a+cOsaZkvUNzUyEXUwz/lmYa6G4hMKZH4NBj7nbAAF96wsMCoyNwbWryBnDYUr6wMbjRR1J9Pw7Xh7WRC73wy4Va2YuOgbD3V/5ZrFPLbWZW/7TFXVrql04QVbyei4aiFR5n//GvoqwQDNe58LmbzX/xvxyKJYdny2zXmdAhMxbrpFQsfpkJ9E/H5w0yOdSvnWbUoG5xNGoOB
1404+ cloud-init-integration"
1405+ ]
1406+ },
1407+ "reservation-id": "r-06ab75e9346f54333",
1408+ "security-groups": "Cloud-init integration test secgroup",
1409+ "services": {
1410+ "domain": "amazonaws.com",
1411+ "partition": "aws"
1412+ }
1413+ }
1414+ },
1415+ "v1": {
1416+ "availability-zone": "us-east-2b",
1417+ "availability_zone": "us-east-2b",
1418+ "cloud-name": "aws",
1419+ "cloud_name": "aws",
1420+ "instance-id": "i-04fa31cfc55aa7976",
1421+ "instance_id": "i-04fa31cfc55aa7976",
1422+ "local-hostname": "ip-10-41-41-70",
1423+ "local_hostname": "ip-10-41-41-70",
1424+ "region": "us-east-2"
1425+ }
1426+ }
1427+
1428+
1429+Using instance-data
1430+===================
1431+
1432+As of cloud-init v. 18.4, any variables present in
1433+``/run/cloud-init/instance-data.json`` can be used in:
1434+
1435+* User-data scripts
1436+* Cloud config data
1437+* Command line interface via **cloud-init query** or
1438+ **cloud-init devel render**
1439+
1440+Many clouds allow users to provide user-data to an instance at
1441+the time the instance is launched. Cloud-init supports a number of
1442+:ref:`user_data_formats`.
1443+
1444+Both user-data scripts and **#cloud-config** data support jinja template
1445+rendering.
1446+When the first line of the provided user-data begins with
1447+**## template: jinja**, cloud-init will use jinja to render that file.
1448+Any instance-data-sensitive.json variables are surfaced as dot-delimited
1449+jinja template variables because cloud-config modules are run as the 'root'
1450+user.
1451+
1452+
1453+Below are some examples of providing these types of user-data:
1454+
1455+* Cloud config calling home with the ec2 public hostname and availability-zone
1456+
1457+.. code-block:: shell-session
1458+
1459+ ## template: jinja
1460+ #cloud-config
1461+ runcmd:
1462+ - echo 'EC2 public hostname allocated to instance: {{
1463+ ds.meta_data.public_hostname }}' > /tmp/instance_metadata
1464+ - echo 'EC2 availability zone: {{ v1.availability_zone }}' >>
1465+ /tmp/instance_metadata
1466+ - curl -X POST -d '{"hostname": "{{ds.meta_data.public_hostname }}",
1467+ "availability-zone": "{{ v1.availability_zone }}"}'
1468+ https://example.com
1469+
1470+* Custom user-data script performing different operations based on region
1471+
1472+.. code-block:: shell-session
1473+
1474+ ## template: jinja
1475+ #!/bin/bash
1476+ {% if v1.region == 'us-east-2' -%}
1477+ echo 'Installing custom proxies for {{ v1.region }}'
1478+ sudo apt-get install my-xtra-fast-stack
1479+ {%- endif %}
1480+ ...
1481+
1482+.. note::
1483+ Trying to reference jinja variables that don't exist in
1484+ instance-data.json will result in warnings in ``/var/log/cloud-init.log``
1485+ and the following string in your rendered user-data:
1486+ ``CI_MISSING_JINJA_VAR/<your_varname>``.
1487+
1488+Cloud-init also surfaces a commandline tool **cloud-init query** which can
1489+assist developers or scripts with obtaining instance metadata easily. See
1490+:ref:`cli_query` for more information.
1491+
1492+To cut down on keystrokes on the command line, cloud-init also provides
1493+top-level key aliases for any standardized ``v#`` keys present. The preceding
1494+``v1`` is not required of ``v1.var_name``. These aliases will represent the
1495+value of the highest versioned standard key. For example, ``cloud_name``
1496+value will be ``v2.cloud_name`` if both ``v1`` and ``v2`` keys are present in
1497+instance-data.json.
1498+The **query** command also publishes ``userdata`` and ``vendordata`` keys to
1499+the root user which will contain the decoded user and vendor data provided to
1500+this instance. Non-root users referencing userdata or vendordata keys will
1501+see only redacted values.
1502+
1503+.. code-block:: shell-session
1504+
1505+ # List all top-level instance-data keys available
1506+ % cloud-init query --list-keys
1507+
1508+ # Find your EC2 ami-id
1509+ % cloud-init query ds.meta_data.ami_id
1510+
1511+ # Format your cloud_name and region using jinja template syntax
1512+ % cloud-init query --format 'cloud: {{ v1.cloud_name }} myregion: {{
1513+ v1.region }}'
1514+
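The top-level alias behaviour described above (the highest ``v#`` version wins) can be sketched in a few lines of Python. This is an illustrative sketch, not the code in ``cloudinit/cmd/query.py``: the function name ``add_version_aliases`` and the ``v2`` sample data are hypothetical.

```python
import re


def add_version_aliases(instance_data):
    """Return a copy with keys from v1, v2, ... promoted to the top level.

    When the same key exists under several versions, the highest wins.
    Existing top-level keys such as 'ds' are never overwritten.
    """
    aliased = dict(instance_data)
    versions = []
    for key, value in instance_data.items():
        match = re.fullmatch(r"v(\d+)", key)
        if match and isinstance(value, dict):
            versions.append((int(match.group(1)), key))
    # Visit the highest version first so setdefault keeps its values.
    for _, version_key in sorted(versions, reverse=True):
        for name, value in instance_data[version_key].items():
            aliased.setdefault(name, value)
    return aliased


# Illustrative data only; 'v2' does not exist in this release.
sample = {
    "ds": {},
    "v1": {"cloud_name": "aws", "region": "us-east-2"},
    "v2": {"cloud_name": "aws-v2"},
}
aliases = add_version_aliases(sample)
```

With the sample data, ``cloud_name`` resolves to the ``v2`` value while ``region`` falls back to ``v1``, because ``v2`` does not define it.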
1515+.. note::
1516+ To save time designing a user-data template for a specific cloud's
1517+ instance-data.json, use the **cloud-init devel render** command on an
1518+ instance booted on your favorite cloud. See :ref:`cli_devel` for more
1519+ information.
1520+
1521+.. vi: textwidth=78
1522diff --git a/integration-requirements.txt b/integration-requirements.txt
1523index f80cb94..880d988 100644
1524--- a/integration-requirements.txt
1525+++ b/integration-requirements.txt
1526@@ -5,16 +5,17 @@
1527 # the packages/pkg-deps.json file as well.
1528 #
1529
1530+unittest2
1531 # ec2 backend
1532 boto3==1.5.9
1533
1534 # ssh communication
1535 paramiko==2.4.1
1536
1537+
1538 # lxd backend
1539 # 04/03/2018: enables use of lxd 3.0
1540 git+https://github.com/lxc/pylxd.git@4b8ab1802f9aee4eb29cf7b119dae0aa47150779
1541
1542-
1543 # finds latest image information
1544 git+https://git.launchpad.net/simplestreams
1545diff --git a/tests/cloud_tests/testcases/base.py b/tests/cloud_tests/testcases/base.py
1546index 2745827..c545796 100644
1547--- a/tests/cloud_tests/testcases/base.py
1548+++ b/tests/cloud_tests/testcases/base.py
1549@@ -5,15 +5,15 @@
1550 import crypt
1551 import json
1552 import re
1553-import unittest
1554+import unittest2
1555
1556
1557 from cloudinit import util as c_util
1558
1559-SkipTest = unittest.SkipTest
1560+SkipTest = unittest2.SkipTest
1561
1562
1563-class CloudTestCase(unittest.TestCase):
1564+class CloudTestCase(unittest2.TestCase):
1565 """Base test class for verifiers."""
1566
1567 # data gets populated in get_suite.setUpClass
1568@@ -167,8 +167,9 @@ class CloudTestCase(unittest.TestCase):
1569 'Skipping instance-data.json test.'
1570 ' OS: %s not bionic or newer' % self.os_name)
1571 instance_data = json.loads(out)
1572- self.assertEqual(
1573- ['ds/user_data'], instance_data['base64_encoded_keys'])
1574+ self.assertItemsEqual(
1575+ [],
1576+ instance_data['base64_encoded_keys'])
1577 ds = instance_data.get('ds', {})
1578 v1_data = instance_data.get('v1', {})
1579 metadata = ds.get('meta-data', {})
1580@@ -187,10 +188,10 @@ class CloudTestCase(unittest.TestCase):
1581 metadata.get('placement', {}).get('availability-zone'),
1582 'Could not determine EC2 Availability zone placement')
1583 self.assertIsNotNone(
1584- v1_data['availability-zone'], 'expected ec2 availability-zone')
1585- self.assertEqual('aws', v1_data['cloud-name'])
1586- self.assertIn('i-', v1_data['instance-id'])
1587- self.assertIn('ip-', v1_data['local-hostname'])
1588+ v1_data['availability_zone'], 'expected ec2 availability_zone')
1589+ self.assertEqual('aws', v1_data['cloud_name'])
1590+ self.assertIn('i-', v1_data['instance_id'])
1591+ self.assertIn('ip-', v1_data['local_hostname'])
1592 self.assertIsNotNone(v1_data['region'], 'expected ec2 region')
1593
1594 def test_instance_data_json_lxd(self):
1595@@ -213,16 +214,14 @@ class CloudTestCase(unittest.TestCase):
1596 ' OS: %s not bionic or newer' % self.os_name)
1597 instance_data = json.loads(out)
1598 v1_data = instance_data.get('v1', {})
1599- self.assertEqual(
1600- ['ds/user_data', 'ds/vendor_data'],
1601- sorted(instance_data['base64_encoded_keys']))
1602- self.assertEqual('nocloud', v1_data['cloud-name'])
1603+ self.assertItemsEqual([], sorted(instance_data['base64_encoded_keys']))
1604+ self.assertEqual('nocloud', v1_data['cloud_name'])
1605 self.assertIsNone(
1606- v1_data['availability-zone'],
1607- 'found unexpected lxd availability-zone %s' %
1608- v1_data['availability-zone'])
1609- self.assertIn('cloud-test', v1_data['instance-id'])
1610- self.assertIn('cloud-test', v1_data['local-hostname'])
1611+ v1_data['availability_zone'],
1612+ 'found unexpected lxd availability_zone %s' %
1613+ v1_data['availability_zone'])
1614+ self.assertIn('cloud-test', v1_data['instance_id'])
1615+ self.assertIn('cloud-test', v1_data['local_hostname'])
1616 self.assertIsNone(
1617 v1_data['region'],
1618 'found unexpected lxd region %s' % v1_data['region'])
1619@@ -248,18 +247,17 @@ class CloudTestCase(unittest.TestCase):
1620 ' OS: %s not bionic or newer' % self.os_name)
1621 instance_data = json.loads(out)
1622 v1_data = instance_data.get('v1', {})
1623- self.assertEqual(
1624- ['ds/user_data'], instance_data['base64_encoded_keys'])
1625- self.assertEqual('nocloud', v1_data['cloud-name'])
1626+ self.assertItemsEqual([], instance_data['base64_encoded_keys'])
1627+ self.assertEqual('nocloud', v1_data['cloud_name'])
1628 self.assertIsNone(
1629- v1_data['availability-zone'],
1630- 'found unexpected kvm availability-zone %s' %
1631- v1_data['availability-zone'])
1632+ v1_data['availability_zone'],
1633+ 'found unexpected kvm availability_zone %s' %
1634+ v1_data['availability_zone'])
1635 self.assertIsNotNone(
1636 re.match(r'[\da-f]{8}(-[\da-f]{4}){3}-[\da-f]{12}',
1637- v1_data['instance-id']),
1638- 'kvm instance-id is not a UUID: %s' % v1_data['instance-id'])
1639- self.assertIn('ubuntu', v1_data['local-hostname'])
1640+ v1_data['instance_id']),
1641+ 'kvm instance_id is not a UUID: %s' % v1_data['instance_id'])
1642+ self.assertIn('ubuntu', v1_data['local_hostname'])
1643 self.assertIsNone(
1644 v1_data['region'],
1645 'found unexpected lxd region %s' % v1_data['region'])
