Merge ~chad.smith/cloud-init:feature/cli-cloudinit-query into cloud-init:master
Status: | Merged |
---|---|
Approved by: | Chad Smith |
Approved revision: | 5916cbb4f60cbcf517ec1edbd45c964e73b441af |
Merge reported by: | Server Team CI bot |
Merged at revision: | not available |
Proposed branch: | ~chad.smith/cloud-init:feature/cli-cloudinit-query |
Merge into: | cloud-init:master |
Diff against target: | 1645 lines (+952/-233), 14 files modified |
- bash_completion/cloud-init (+3/-1)
- cloudinit/cmd/devel/render.py (+1/-6)
- cloudinit/cmd/main.py (+10/-0)
- cloudinit/cmd/query.py (+155/-0)
- cloudinit/cmd/tests/test_query.py (+193/-0)
- cloudinit/helpers.py (+4/-0)
- cloudinit/sources/__init__.py (+62/-14)
- cloudinit/sources/tests/test_init.py (+109/-21)
- doc/rtd/index.rst (+1/-0)
- doc/rtd/topics/capabilities.rst (+84/-21)
- doc/rtd/topics/datasources.rst (+6/-142)
- doc/rtd/topics/instancedata.rst (+297/-0)
- integration-requirements.txt (+2/-1)
- tests/cloud_tests/testcases/base.py (+25/-27)
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Server Team CI bot | continuous-integration | | Approve |
cloud-init Commiters | | | Pending |
Review via email: mp+354891@code.launchpad.net |
Commit message
cli: add cloud-init query subcommand to query instance metadata
Cloud-init caches any cloud metadata crawled during boot in the file
/run/cloud-
that metadata across all clouds. The command 'cloud-init query' surfaces a
simple CLI to query or format any cached instance metadata so that scripts
or end-users do not have to write tools to crawl metadata themselves.
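As a rough illustration of the kind of lookup described above, the sketch below resolves a dot-delimited variable name against a nested dictionary. The names `lookup` and `render` and the sample data are hypothetical and not part of cloud-init; the actual command is implemented in cloudinit/cmd/query.py in this proposal's diff.

```python
import json


def lookup(instance_data, varname):
    """Resolve a dot-delimited name such as 'v1.local_hostname'.

    Raises KeyError when any path segment is missing.
    """
    value = instance_data
    for part in varname.split("."):
        if not isinstance(value, dict):
            raise KeyError(varname)
        value = value[part]
    return value


def render(value):
    """Return strings as-is; serialize anything else as sorted JSON."""
    if isinstance(value, str):
        return value
    return json.dumps(value, indent=1, sort_keys=True)


# Hypothetical sample data, shaped loosely like instance-data.json.
sample = {"v1": {"cloud_name": "examplecloud", "local_hostname": "myhost"}}

print(render(lookup(sample, "v1.local_hostname")))
print(render(lookup(sample, "v1")))
```

A missing segment surfaces as a KeyError, which a caller could turn into an error message and a non-zero exit code.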
Since 'cloud-init query' is runnable by non-root users, redact any
sensitive data from instance-data.json and provide a root-readable
unredacted instance-
sensitive_
which could contain passwords or credentials from instance-data.json.
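The redaction idea can be sketched as follows, assuming sensitive values are identified by '/'-delimited key paths. This is a minimal illustration only: the function name `redact` and the sample metadata are hypothetical, and the placeholder string mirrors the value of REDACT_SENSITIVE_VALUE in this proposal's diff.

```python
import copy

# Placeholder text; mirrors REDACT_SENSITIVE_VALUE from the proposal's diff.
REDACTED = "redacted for non-root user"


def redact(metadata, sensitive_paths, placeholder=REDACTED):
    """Return a deep copy of metadata with each listed path replaced.

    Paths are '/'-delimited; paths that do not exist are ignored.
    """
    redacted = copy.deepcopy(metadata)
    for key_path in sensitive_paths:
        *parents, leaf = key_path.split("/")
        node = redacted
        for part in parents:
            node = node.get(part) if isinstance(node, dict) else None
            if node is None:
                break
        if isinstance(node, dict) and leaf in node:
            node[leaf] = placeholder
    return redacted


# Hypothetical sample metadata with one sensitive subtree.
sample = {
    "ds": {"meta-data": {"security-credentials": {"token": "s3cr3t"}}},
    "v1": {"local_hostname": "myhost"},
}
public = redact(sample, ["ds/meta-data/security-credentials"])
print(public["ds"]["meta-data"]["security-credentials"])
```

Working on a deep copy keeps the unredacted dictionary intact, so the same in-memory data could be written once to a root-readable file and once, redacted, to a world-readable one.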
Also add the following standardized 'v1' instance-data.json keys:
- user_data: The base64-encoded user-data provided at instance launch
- vendor_data: Any vendor_data provided to the instance at launch
- underscore_
instance_id, local_hostname, availability_zone, cloud_name
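One motivation for underscore-delimited keys is that Jinja reads `v1.local-hostname` as a subtraction, while `v1.local_hostname` is a plain attribute reference. The sketch below shows one way hyphenated keys could be given underscore aliases while keeping the originals. The helper `add_underscore_aliases` and the sample values are hypothetical, not cloud-init code.

```python
def add_underscore_aliases(standard_keys):
    """Return a copy with an underscore alias for each hyphenated key.

    Existing keys are never overwritten by an alias.
    """
    aliased = dict(standard_keys)
    for key, value in standard_keys.items():
        if "-" in key:
            aliased.setdefault(key.replace("-", "_"), value)
    return aliased


# Hypothetical standardized keys; values are made up for illustration.
v1 = add_underscore_aliases({
    "instance-id": "i-abc123",
    "local-hostname": "myhost",
    "region": "us-east-1",
})
print(sorted(v1))
```

Keeping both spellings lets existing consumers of the hyphenated keys continue to work while templates use the underscore form.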
Description of the change
Server Team CI bot (server-team-bot) wrote : | # |
- f719689... by Chad Smith
-
user_data and vendor_data are now also under v1 keys. Fix tests
Server Team CI bot (server-team-bot) wrote : | # |
FAILED: Continuous integration, rev:692ce186d00
https:/
Executed test runs:
SUCCESS: Checkout
FAILED: Unit & Style Tests
Click here to trigger a rebuild:
https:/
- 32f09b2... by Chad Smith
-
shift integration tests to unittest2 instead of unittest
- 593cc6d... by Chad Smith
-
rst format alignment on list items
Server Team CI bot (server-team-bot) wrote : | # |
FAILED: Continuous integration, rev:757f8642598
https:/
Executed test runs:
SUCCESS: Checkout
SUCCESS: Unit & Style Tests
SUCCESS: Ubuntu LTS: Build
FAILED: Ubuntu LTS: Integration
Click here to trigger a rebuild:
https:/
Scott Moser (smoser) wrote : | # |
It occurred to me yesterday that user-data can be big. EC2 is limited to 16K, but many other platforms have higher limits. That means we potentially have a very big blob of base64-encoded text in an otherwise reasonably small file.
That, coupled with the fact that for quite some time user-data has been available in /var/lib/
We could still have cloud-init query make it available.
The same seems true of vendor-data.
And then, bash_completion needs to be updated as well.
- 1c90226... by Chad Smith
-
add top-level instancedata doc topic. pull it out of datasource
- ab686e1... by Chad Smith
-
add top-level instancedata doc, couple doc fixups
Server Team CI bot (server-team-bot) wrote : | # |
FAILED: Continuous integration, rev:ab686e18902
https:/
Executed test runs:
SUCCESS: Checkout
SUCCESS: Unit & Style Tests
SUCCESS: Ubuntu LTS: Build
FAILED: Ubuntu LTS: Integration
Click here to trigger a rebuild:
https:/
- 0bf450f... by Chad Smith
-
add user-data and vendor-data optional arguments. redact ud and vd for non-root
- e868742... by Chad Smith
-
unit test fixes for separation of user-data/vendor-data from instance-data.json
- 49e7818... by Chad Smith
-
test fixes for metadata
- 1b46398... by Chad Smith
-
update docs to drop userdata/vendordata from instance-data.json
Server Team CI bot (server-team-bot) wrote : | # |
FAILED: Continuous integration, rev:1b463986ec2
https:/
Executed test runs:
SUCCESS: Checkout
SUCCESS: Unit & Style Tests
SUCCESS: Ubuntu LTS: Build
FAILED: Ubuntu LTS: Integration
Click here to trigger a rebuild:
https:/
- ada811c... by Chad Smith
-
add integration test dep on unittest2 for assertItemsEqual
Server Team CI bot (server-team-bot) wrote : | # |
FAILED: Continuous integration, rev:ada811cea7a
https:/
Executed test runs:
SUCCESS: Checkout
SUCCESS: Unit & Style Tests
SUCCESS: Ubuntu LTS: Build
FAILED: Ubuntu LTS: Integration
Click here to trigger a rebuild:
https:/
- a74ca3e... by Chad Smith
-
no userdata/vendordata in integration test validation
Server Team CI bot (server-team-bot) wrote : | # |
PASSED: Continuous integration, rev:a74ca3ee659
https:/
Executed test runs:
SUCCESS: Checkout
SUCCESS: Unit & Style Tests
SUCCESS: Ubuntu LTS: Build
SUCCESS: Ubuntu LTS: Integration
IN_PROGRESS: Declarative: Post Actions
Click here to trigger a rebuild:
https:/
- 8165e18... by Chad Smith
-
add redact_sensitive_keys function
- a6349b4... by Chad Smith
-
doc lints
Server Team CI bot (server-team-bot) wrote : | # |
PASSED: Continuous integration, rev:8165e18402e
https:/
Executed test runs:
SUCCESS: Checkout
SUCCESS: Unit & Style Tests
SUCCESS: Ubuntu LTS: Build
SUCCESS: Ubuntu LTS: Integration
IN_PROGRESS: Declarative: Post Actions
Click here to trigger a rebuild:
https:/
- 951944f... by Chad Smith
-
doc updates
- e3a0084... by Chad Smith
-
fix instance data rst docs
Scott Moser (smoser) wrote : | # |
minor comments inline.
only thought is to talk about names of things and expected values ('cloudname') and such.
we need to have that somewhere, and i didn't read close enough to see it.
Server Team CI bot (server-team-bot) wrote : | # |
PASSED: Continuous integration, rev:e3a00841976
https:/
Executed test runs:
SUCCESS: Checkout
SUCCESS: Unit & Style Tests
SUCCESS: Ubuntu LTS: Build
SUCCESS: Ubuntu LTS: Integration
IN_PROGRESS: Declarative: Post Actions
Click here to trigger a rebuild:
https:/
- d5aa5f6... by Chad Smith
-
call paths if missing any instance, user or vendor data files
- 2bc2a4d... by Chad Smith
-
update docs
Server Team CI bot (server-team-bot) wrote : | # |
PASSED: Continuous integration, rev:2bc2a4dd41d
https:/
Executed test runs:
SUCCESS: Checkout
SUCCESS: Unit & Style Tests
SUCCESS: Ubuntu LTS: Build
SUCCESS: Ubuntu LTS: Integration
IN_PROGRESS: Declarative: Post Actions
Click here to trigger a rebuild:
https:/
Ryan Harper (raharper) wrote : | # |
This looks good. A couple of in-line nits but the biggest question is whether this query command is meant to be top-level (cloud-init query) or if it was going to stay behind devel (cloud-init devel query)?
- 9401cbd... by Chad Smith
-
bash completion fixes
Server Team CI bot (server-team-bot) wrote : | # |
PASSED: Continuous integration, rev:9401cbd7cbf
https:/
Executed test runs:
SUCCESS: Checkout
SUCCESS: Unit & Style Tests
SUCCESS: Ubuntu LTS: Build
SUCCESS: Ubuntu LTS: Integration
IN_PROGRESS: Declarative: Post Actions
Click here to trigger a rebuild:
https:/
- 79befd9... by Chad Smith
-
address raharper's review comments
- bash_completion fixups for top-level query command
- cache os.getuid calls to avoid duplicate calls
- use paths.instance_link instead of paths.get_ipath()
- comment on underscore delimited standardized keys
- rtd doc fixups
- 7d38ce3... by Chad Smith
-
more doc fixes
- 87bf954... by Chad Smith
-
add note about security_sensitive keys
Chad Smith (chad.smith) : | # |
Server Team CI bot (server-team-bot) wrote : | # |
PASSED: Continuous integration, rev:87bf954e24e
https:/
Executed test runs:
SUCCESS: Checkout
SUCCESS: Unit & Style Tests
SUCCESS: Ubuntu LTS: Build
SUCCESS: Ubuntu LTS: Integration
IN_PROGRESS: Declarative: Post Actions
Click here to trigger a rebuild:
https:/
Ryan Harper (raharper) wrote : | # |
Should the commit message mention the two separate files for instance and instance-sensitive data?
Ryan Harper (raharper) : | # |
- c455ea0... by Chad Smith
-
log as debug any skipped config files when a non-root user raises a PermissionError
- 5916cbb... by Chad Smith
-
add cloud_name key to instance data
Chad Smith (chad.smith) : | # |
Server Team CI bot (server-team-bot) wrote : | # |
FAILED: Continuous integration, rev:a15e6c37da9
https:/
Executed test runs:
SUCCESS: Checkout
FAILED: Unit & Style Tests
Click here to trigger a rebuild:
https:/
Server Team CI bot (server-team-bot) wrote : | # |
PASSED: Continuous integration, rev:5916cbb4f60
https:/
Executed test runs:
SUCCESS: Checkout
SUCCESS: Unit & Style Tests
SUCCESS: Ubuntu LTS: Build
SUCCESS: Ubuntu LTS: Integration
IN_PROGRESS: Declarative: Post Actions
Click here to trigger a rebuild:
https:/
Preview Diff
1 | diff --git a/bash_completion/cloud-init b/bash_completion/cloud-init |
2 | index 6d01bf3..8c25032 100644 |
3 | --- a/bash_completion/cloud-init |
4 | +++ b/bash_completion/cloud-init |
5 | @@ -10,7 +10,7 @@ _cloudinit_complete() |
6 | cur_word="${COMP_WORDS[COMP_CWORD]}" |
7 | prev_word="${COMP_WORDS[COMP_CWORD-1]}" |
8 | |
9 | - subcmds="analyze clean collect-logs devel dhclient-hook features init modules single status" |
10 | + subcmds="analyze clean collect-logs devel dhclient-hook features init modules query single status" |
11 | base_params="--help --file --version --debug --force" |
12 | case ${COMP_CWORD} in |
13 | 1) |
14 | @@ -40,6 +40,8 @@ _cloudinit_complete() |
15 | COMPREPLY=($(compgen -W "--help --mode" -- $cur_word)) |
16 | ;; |
17 | |
18 | + query) |
19 | + COMPREPLY=($(compgen -W "--all --help --instance-data --list-keys --user-data --vendor-data --debug" -- $cur_word));; |
20 | single) |
21 | COMPREPLY=($(compgen -W "--help --name --frequency --report" -- $cur_word)) |
22 | ;; |
23 | diff --git a/cloudinit/cmd/devel/render.py b/cloudinit/cmd/devel/render.py |
24 | index e85933d..2ba6b68 100755 |
25 | --- a/cloudinit/cmd/devel/render.py |
26 | +++ b/cloudinit/cmd/devel/render.py |
27 | @@ -9,7 +9,6 @@ import sys |
28 | from cloudinit.handlers.jinja_template import render_jinja_payload_from_file |
29 | from cloudinit import log |
30 | from cloudinit.sources import INSTANCE_JSON_FILE |
31 | -from cloudinit import util |
32 | from . import addLogHandlerCLI, read_cfg_paths |
33 | |
34 | NAME = 'render' |
35 | @@ -54,11 +53,7 @@ def handle_args(name, args): |
36 | paths.run_dir, INSTANCE_JSON_FILE) |
37 | else: |
38 | instance_data_fn = args.instance_data |
39 | - try: |
40 | - with open(instance_data_fn) as stream: |
41 | - instance_data = stream.read() |
42 | - instance_data = util.load_json(instance_data) |
43 | - except IOError: |
44 | + if not os.path.exists(instance_data_fn): |
45 | LOG.error('Missing instance-data.json file: %s', instance_data_fn) |
46 | return 1 |
47 | try: |
48 | diff --git a/cloudinit/cmd/main.py b/cloudinit/cmd/main.py |
49 | index 0eee583..5a43702 100644 |
50 | --- a/cloudinit/cmd/main.py |
51 | +++ b/cloudinit/cmd/main.py |
52 | @@ -791,6 +791,10 @@ def main(sysv_args=None): |
53 | ' pass to this module')) |
54 | parser_single.set_defaults(action=('single', main_single)) |
55 | |
56 | + parser_query = subparsers.add_parser( |
57 | + 'query', |
58 | + help='Query standardized instance metadata from the command line.') |
59 | + |
60 | parser_dhclient = subparsers.add_parser('dhclient-hook', |
61 | help=('run the dhclient hook' |
62 | 'to record network info')) |
63 | @@ -842,6 +846,12 @@ def main(sysv_args=None): |
64 | clean_parser(parser_clean) |
65 | parser_clean.set_defaults( |
66 | action=('clean', handle_clean_args)) |
67 | + elif sysv_args[0] == 'query': |
68 | + from cloudinit.cmd.query import ( |
69 | + get_parser as query_parser, handle_args as handle_query_args) |
70 | + query_parser(parser_query) |
71 | + parser_query.set_defaults( |
72 | + action=('render', handle_query_args)) |
73 | elif sysv_args[0] == 'status': |
74 | from cloudinit.cmd.status import ( |
75 | get_parser as status_parser, handle_status_args) |
76 | diff --git a/cloudinit/cmd/query.py b/cloudinit/cmd/query.py |
77 | new file mode 100644 |
78 | index 0000000..7d2d4fe |
79 | --- /dev/null |
80 | +++ b/cloudinit/cmd/query.py |
81 | @@ -0,0 +1,155 @@ |
82 | +# This file is part of cloud-init. See LICENSE file for license information. |
83 | + |
84 | +"""Query standardized instance metadata from the command line.""" |
85 | + |
86 | +import argparse |
87 | +import os |
88 | +import six |
89 | +import sys |
90 | + |
91 | +from cloudinit.handlers.jinja_template import ( |
92 | + convert_jinja_instance_data, render_jinja_payload) |
93 | +from cloudinit.cmd.devel import addLogHandlerCLI, read_cfg_paths |
94 | +from cloudinit import log |
95 | +from cloudinit.sources import ( |
96 | + INSTANCE_JSON_FILE, INSTANCE_JSON_SENSITIVE_FILE, REDACT_SENSITIVE_VALUE) |
97 | +from cloudinit import util |
98 | + |
99 | +NAME = 'query' |
100 | +LOG = log.getLogger(NAME) |
101 | + |
102 | + |
103 | +def get_parser(parser=None): |
104 | + """Build or extend an arg parser for query utility. |
105 | + |
106 | + @param parser: Optional existing ArgumentParser instance representing the |
107 | + query subcommand which will be extended to support the args of |
108 | + this utility. |
109 | + |
110 | + @returns: ArgumentParser with proper argument configuration. |
111 | + """ |
112 | + if not parser: |
113 | + parser = argparse.ArgumentParser( |
114 | + prog=NAME, description='Query cloud-init instance data') |
115 | + parser.add_argument( |
116 | + '-d', '--debug', action='store_true', default=False, |
117 | + help='Add verbose messages during template render') |
118 | + parser.add_argument( |
119 | + '-i', '--instance-data', type=str, |
120 | + help=('Path to instance-data.json file. Default is /run/cloud-init/%s' |
121 | + % INSTANCE_JSON_FILE)) |
122 | + parser.add_argument( |
123 | + '-l', '--list-keys', action='store_true', default=False, |
124 | + help=('List query keys available at the provided instance-data' |
125 | + ' <varname>.')) |
126 | + parser.add_argument( |
127 | + '-u', '--user-data', type=str, |
128 | + help=('Path to user-data file. Default is' |
129 | + ' /var/lib/cloud/instance/user-data.txt')) |
130 | + parser.add_argument( |
131 | + '-v', '--vendor-data', type=str, |
132 | + help=('Path to vendor-data file. Default is' |
133 | + ' /var/lib/cloud/instance/vendor-data.txt')) |
134 | + parser.add_argument( |
135 | + 'varname', type=str, nargs='?', |
136 | + help=('A dot-delimited instance data variable to query from' |
137 | + ' instance-data query. For example: v2.local_hostname')) |
138 | + parser.add_argument( |
139 | + '-a', '--all', action='store_true', default=False, dest='dump_all', |
140 | + help='Dump all available instance-data') |
141 | + parser.add_argument( |
142 | + '-f', '--format', type=str, dest='format', |
143 | + help=('Optionally specify a custom output format string. Any' |
144 | + ' instance-data variable can be specified between double-curly' |
145 | + ' braces. For example -f "{{ v2.cloud_name }}"')) |
146 | + return parser |
147 | + |
148 | + |
149 | +def handle_args(name, args): |
150 | + """Handle calls to 'cloud-init query' as a subcommand.""" |
151 | + paths = None |
152 | + addLogHandlerCLI(LOG, log.DEBUG if args.debug else log.WARNING) |
153 | + if not any([args.list_keys, args.varname, args.format, args.dump_all]): |
154 | + LOG.error( |
155 | + 'Expected one of the options: --all, --format,' |
156 | + ' --list-keys or varname') |
157 | + get_parser().print_help() |
158 | + return 1 |
159 | + |
160 | + uid = os.getuid() |
161 | + if not all([args.instance_data, args.user_data, args.vendor_data]): |
162 | + paths = read_cfg_paths() |
163 | + if not args.instance_data: |
164 | + if uid == 0: |
165 | + default_json_fn = INSTANCE_JSON_SENSITIVE_FILE |
166 | + else: |
167 | + default_json_fn = INSTANCE_JSON_FILE # World readable |
168 | + instance_data_fn = os.path.join(paths.run_dir, default_json_fn) |
169 | + else: |
170 | + instance_data_fn = args.instance_data |
171 | + if not args.user_data: |
172 | + user_data_fn = os.path.join(paths.instance_link, 'user-data.txt') |
173 | + else: |
174 | + user_data_fn = args.user_data |
175 | + if not args.vendor_data: |
176 | + vendor_data_fn = os.path.join(paths.instance_link, 'vendor-data.txt') |
177 | + else: |
178 | + vendor_data_fn = args.vendor_data |
179 | + |
180 | + try: |
181 | + instance_json = util.load_file(instance_data_fn) |
182 | + except IOError: |
183 | + LOG.error('Missing instance-data.json file: %s', instance_data_fn) |
184 | + return 1 |
185 | + |
186 | + instance_data = util.load_json(instance_json) |
187 | + if uid != 0: |
188 | + instance_data['userdata'] = ( |
189 | + '<%s> file:%s' % (REDACT_SENSITIVE_VALUE, user_data_fn)) |
190 | + instance_data['vendordata'] = ( |
191 | + '<%s> file:%s' % (REDACT_SENSITIVE_VALUE, vendor_data_fn)) |
192 | + else: |
193 | + instance_data['userdata'] = util.load_file(user_data_fn) |
194 | + instance_data['vendordata'] = util.load_file(vendor_data_fn) |
195 | + if args.format: |
196 | + payload = '## template: jinja\n{fmt}'.format(fmt=args.format) |
197 | + rendered_payload = render_jinja_payload( |
198 | + payload=payload, payload_fn='query commandline', |
199 | + instance_data=instance_data, |
200 | + debug=True if args.debug else False) |
201 | + if rendered_payload: |
202 | + print(rendered_payload) |
203 | + return 0 |
204 | + return 1 |
205 | + |
206 | + response = convert_jinja_instance_data(instance_data) |
207 | + if args.varname: |
208 | + try: |
209 | + for var in args.varname.split('.'): |
210 | + response = response[var] |
211 | + except KeyError: |
212 | + LOG.error('Undefined instance-data key %s', args.varname) |
213 | + return 1 |
214 | + if args.list_keys: |
215 | + if not isinstance(response, dict): |
216 | + LOG.error("--list-keys provided but '%s' is not a dict", var) |
217 | + return 1 |
218 | + response = '\n'.join(sorted(response.keys())) |
219 | + elif args.list_keys: |
220 | + response = '\n'.join(sorted(response.keys())) |
221 | + if not isinstance(response, six.string_types): |
222 | + response = util.json_dumps(response) |
223 | + print(response) |
224 | + return 0 |
225 | + |
226 | + |
227 | +def main(): |
228 | + """Tool to query specific instance-data values.""" |
229 | + parser = get_parser() |
230 | + sys.exit(handle_args(NAME, parser.parse_args())) |
231 | + |
232 | + |
233 | +if __name__ == '__main__': |
234 | + main() |
235 | + |
236 | +# vi: ts=4 expandtab |
237 | diff --git a/cloudinit/cmd/tests/test_query.py b/cloudinit/cmd/tests/test_query.py |
238 | new file mode 100644 |
239 | index 0000000..fb87c6a |
240 | --- /dev/null |
241 | +++ b/cloudinit/cmd/tests/test_query.py |
242 | @@ -0,0 +1,193 @@ |
243 | +# This file is part of cloud-init. See LICENSE file for license information. |
244 | + |
245 | +from six import StringIO |
246 | +from textwrap import dedent |
247 | +import os |
248 | + |
249 | +from collections import namedtuple |
250 | +from cloudinit.cmd import query |
251 | +from cloudinit.helpers import Paths |
252 | +from cloudinit.sources import REDACT_SENSITIVE_VALUE, INSTANCE_JSON_FILE |
253 | +from cloudinit.tests.helpers import CiTestCase, mock |
254 | +from cloudinit.util import ensure_dir, write_file |
255 | + |
256 | + |
257 | +class TestQuery(CiTestCase): |
258 | + |
259 | + with_logs = True |
260 | + |
261 | + args = namedtuple( |
262 | + 'queryargs', |
263 | + ('debug dump_all format instance_data list_keys user_data vendor_data' |
264 | + ' varname')) |
265 | + |
266 | + def setUp(self): |
267 | + super(TestQuery, self).setUp() |
268 | + self.tmp = self.tmp_dir() |
269 | + self.instance_data = self.tmp_path('instance-data', dir=self.tmp) |
270 | + |
271 | + def test_handle_args_error_on_missing_param(self): |
272 | + """Error when missing required parameters and print usage.""" |
273 | + args = self.args( |
274 | + debug=False, dump_all=False, format=None, instance_data=None, |
275 | + list_keys=False, user_data=None, vendor_data=None, varname=None) |
276 | + with mock.patch('sys.stderr', new_callable=StringIO) as m_stderr: |
277 | + with mock.patch('sys.stdout', new_callable=StringIO) as m_stdout: |
278 | + self.assertEqual(1, query.handle_args('anyname', args)) |
279 | + expected_error = ( |
280 | + 'ERROR: Expected one of the options: --all, --format, --list-keys' |
281 | + ' or varname\n') |
282 | + self.assertIn(expected_error, self.logs.getvalue()) |
283 | + self.assertIn('usage: query', m_stdout.getvalue()) |
284 | + self.assertIn(expected_error, m_stderr.getvalue()) |
285 | + |
286 | + def test_handle_args_error_on_missing_instance_data(self): |
287 | + """When instance_data file path does not exist, log an error.""" |
288 | + absent_fn = self.tmp_path('absent', dir=self.tmp) |
289 | + args = self.args( |
290 | + debug=False, dump_all=True, format=None, instance_data=absent_fn, |
291 | + list_keys=False, user_data='ud', vendor_data='vd', varname=None) |
292 | + with mock.patch('sys.stderr', new_callable=StringIO) as m_stderr: |
293 | + self.assertEqual(1, query.handle_args('anyname', args)) |
294 | + self.assertIn( |
295 | + 'ERROR: Missing instance-data.json file: %s' % absent_fn, |
296 | + self.logs.getvalue()) |
297 | + self.assertIn( |
298 | + 'ERROR: Missing instance-data.json file: %s' % absent_fn, |
299 | + m_stderr.getvalue()) |
300 | + |
301 | + def test_handle_args_defaults_instance_data(self): |
302 | + """When no instance_data argument, default to configured run_dir.""" |
303 | + args = self.args( |
304 | + debug=False, dump_all=True, format=None, instance_data=None, |
305 | + list_keys=False, user_data=None, vendor_data=None, varname=None) |
306 | + run_dir = self.tmp_path('run_dir', dir=self.tmp) |
307 | + ensure_dir(run_dir) |
308 | + paths = Paths({'run_dir': run_dir}) |
309 | + self.add_patch('cloudinit.cmd.query.read_cfg_paths', 'm_paths') |
310 | + self.m_paths.return_value = paths |
311 | + with mock.patch('sys.stderr', new_callable=StringIO) as m_stderr: |
312 | + self.assertEqual(1, query.handle_args('anyname', args)) |
313 | + json_file = os.path.join(run_dir, INSTANCE_JSON_FILE) |
314 | + self.assertIn( |
315 | + 'ERROR: Missing instance-data.json file: %s' % json_file, |
316 | + self.logs.getvalue()) |
317 | + self.assertIn( |
318 | + 'ERROR: Missing instance-data.json file: %s' % json_file, |
319 | + m_stderr.getvalue()) |
320 | + |
321 | + def test_handle_args_dumps_all_instance_data(self): |
322 | + """When --all is specified query will dump all instance data vars.""" |
323 | + write_file(self.instance_data, '{"my-var": "it worked"}') |
324 | + args = self.args( |
325 | + debug=False, dump_all=True, format=None, |
326 | + instance_data=self.instance_data, list_keys=False, |
327 | + user_data='ud', vendor_data='vd', varname=None) |
328 | + with mock.patch('sys.stdout', new_callable=StringIO) as m_stdout: |
329 | + self.assertEqual(0, query.handle_args('anyname', args)) |
330 | + self.assertEqual( |
331 | + '{\n "my_var": "it worked",\n "userdata": "<%s> file:ud",\n' |
332 | + ' "vendordata": "<%s> file:vd"\n}\n' % ( |
333 | + REDACT_SENSITIVE_VALUE, REDACT_SENSITIVE_VALUE), |
334 | + m_stdout.getvalue()) |
335 | + |
336 | + def test_handle_args_returns_top_level_varname(self): |
337 | + """When the argument varname is passed, report its value.""" |
338 | + write_file(self.instance_data, '{"my-var": "it worked"}') |
339 | + args = self.args( |
340 | + debug=False, dump_all=True, format=None, |
341 | + instance_data=self.instance_data, list_keys=False, |
342 | + user_data='ud', vendor_data='vd', varname='my_var') |
343 | + with mock.patch('sys.stdout', new_callable=StringIO) as m_stdout: |
344 | + self.assertEqual(0, query.handle_args('anyname', args)) |
345 | + self.assertEqual('it worked\n', m_stdout.getvalue()) |
346 | + |
347 | + def test_handle_args_returns_nested_varname(self): |
348 | + """If user_data file is a jinja template render instance-data vars.""" |
349 | + write_file(self.instance_data, |
350 | + '{"v1": {"key-2": "value-2"}, "my-var": "it worked"}') |
351 | + args = self.args( |
352 | + debug=False, dump_all=False, format=None, |
353 | + instance_data=self.instance_data, user_data='ud', vendor_data='vd', |
354 | + list_keys=False, varname='v1.key_2') |
355 | + with mock.patch('sys.stdout', new_callable=StringIO) as m_stdout: |
356 | + self.assertEqual(0, query.handle_args('anyname', args)) |
357 | + self.assertEqual('value-2\n', m_stdout.getvalue()) |
358 | + |
359 | + def test_handle_args_returns_standardized_vars_to_top_level_aliases(self): |
360 | + """Any standardized vars under v# are promoted as top-level aliases.""" |
361 | + write_file( |
362 | + self.instance_data, |
363 | + '{"v1": {"v1_1": "val1.1"}, "v2": {"v2_2": "val2.2"},' |
364 | + ' "top": "gun"}') |
365 | + expected = dedent("""\ |
366 | + { |
367 | + "top": "gun", |
368 | + "userdata": "<redacted for non-root user> file:ud", |
369 | + "v1": { |
370 | + "v1_1": "val1.1" |
371 | + }, |
372 | + "v1_1": "val1.1", |
373 | + "v2": { |
374 | + "v2_2": "val2.2" |
375 | + }, |
376 | + "v2_2": "val2.2", |
377 | + "vendordata": "<redacted for non-root user> file:vd" |
378 | + } |
379 | + """) |
380 | + args = self.args( |
381 | + debug=False, dump_all=True, format=None, |
382 | + instance_data=self.instance_data, user_data='ud', vendor_data='vd', |
383 | + list_keys=False, varname=None) |
384 | + with mock.patch('sys.stdout', new_callable=StringIO) as m_stdout: |
385 | + self.assertEqual(0, query.handle_args('anyname', args)) |
386 | + self.assertEqual(expected, m_stdout.getvalue()) |
387 | + |
388 | + def test_handle_args_list_keys_sorts_top_level_keys_when_no_varname(self): |
389 | + """Sort all top-level keys when only --list-keys provided.""" |
390 | + write_file( |
391 | + self.instance_data, |
392 | + '{"v1": {"v1_1": "val1.1"}, "v2": {"v2_2": "val2.2"},' |
393 | + ' "top": "gun"}') |
394 | + expected = 'top\nuserdata\nv1\nv1_1\nv2\nv2_2\nvendordata\n' |
395 | + args = self.args( |
396 | + debug=False, dump_all=False, format=None, |
397 | + instance_data=self.instance_data, list_keys=True, user_data='ud', |
398 | + vendor_data='vd', varname=None) |
399 | + with mock.patch('sys.stdout', new_callable=StringIO) as m_stdout: |
400 | + self.assertEqual(0, query.handle_args('anyname', args)) |
401 | + self.assertEqual(expected, m_stdout.getvalue()) |
402 | + |
403 | + def test_handle_args_list_keys_sorts_nested_keys_when_varname(self): |
404 | + """Sort all nested keys of varname object when --list-keys provided.""" |
405 | + write_file( |
406 | + self.instance_data, |
407 | + '{"v1": {"v1_1": "val1.1", "v1_2": "val1.2"}, "v2":' + |
408 | + ' {"v2_2": "val2.2"}, "top": "gun"}') |
409 | + expected = 'v1_1\nv1_2\n' |
410 | + args = self.args( |
411 | + debug=False, dump_all=False, format=None, |
412 | + instance_data=self.instance_data, list_keys=True, |
413 | + user_data='ud', vendor_data='vd', varname='v1') |
414 | + with mock.patch('sys.stdout', new_callable=StringIO) as m_stdout: |
415 | + self.assertEqual(0, query.handle_args('anyname', args)) |
416 | + self.assertEqual(expected, m_stdout.getvalue()) |
417 | + |
418 | + def test_handle_args_list_keys_errors_when_varname_is_not_a_dict(self): |
419 | + """Raise an error when --list-keys and varname specify a non-list.""" |
420 | + write_file( |
421 | + self.instance_data, |
422 | + '{"v1": {"v1_1": "val1.1", "v1_2": "val1.2"}, "v2": ' + |
423 | + '{"v2_2": "val2.2"}, "top": "gun"}') |
424 | + expected_error = "ERROR: --list-keys provided but 'top' is not a dict" |
425 | + args = self.args( |
426 | + debug=False, dump_all=False, format=None, |
427 | + instance_data=self.instance_data, list_keys=True, user_data='ud', |
428 | + vendor_data='vd', varname='top') |
429 | + with mock.patch('sys.stderr', new_callable=StringIO) as m_stderr: |
430 | + with mock.patch('sys.stdout', new_callable=StringIO) as m_stdout: |
431 | + self.assertEqual(1, query.handle_args('anyname', args)) |
432 | + self.assertEqual('', m_stdout.getvalue()) |
433 | + self.assertIn(expected_error, m_stderr.getvalue()) |
434 | + |
435 | +# vi: ts=4 expandtab |
436 | diff --git a/cloudinit/helpers.py b/cloudinit/helpers.py |
437 | index 3cc1fb1..dcd2645 100644 |
438 | --- a/cloudinit/helpers.py |
439 | +++ b/cloudinit/helpers.py |
440 | @@ -239,6 +239,10 @@ class ConfigMerger(object): |
441 | if cc_fn and os.path.isfile(cc_fn): |
442 | try: |
443 | i_cfgs.append(util.read_conf(cc_fn)) |
444 | + except PermissionError: |
445 | + LOG.debug( |
446 | + 'Skipped loading cloud-config from %s due to' |
447 | + ' non-root.', cc_fn) |
448 | except Exception: |
449 | util.logexc(LOG, 'Failed loading of cloud-config from %s', |
450 | cc_fn) |
451 | diff --git a/cloudinit/sources/__init__.py b/cloudinit/sources/__init__.py |
452 | index a775f1a..730e817 100644 |
453 | --- a/cloudinit/sources/__init__.py |
454 | +++ b/cloudinit/sources/__init__.py |
455 | @@ -38,8 +38,12 @@ DEP_FILESYSTEM = "FILESYSTEM" |
456 | DEP_NETWORK = "NETWORK" |
457 | DS_PREFIX = 'DataSource' |
458 | |
459 | -# File in which instance meta-data, user-data and vendor-data is written |
460 | +# File in which publicly available instance meta-data is written |
461 | +# security-sensitive key values are redacted from this world-readable file |
462 | INSTANCE_JSON_FILE = 'instance-data.json' |
463 | +# security-sensitive key values are present in this root-readable file |
464 | +INSTANCE_JSON_SENSITIVE_FILE = 'instance-data-sensitive.json' |
465 | +REDACT_SENSITIVE_VALUE = 'redacted for non-root user' |
466 | |
467 | # Key which can be provide a cloud's official product name to cloud-init |
468 | METADATA_CLOUD_NAME_KEY = 'cloud-name' |
469 | @@ -58,7 +62,7 @@ class InvalidMetaDataException(Exception): |
470 | pass |
471 | |
472 | |
473 | -def process_instance_metadata(metadata, key_path=''): |
474 | +def process_instance_metadata(metadata, key_path='', sensitive_keys=()): |
475 | """Process all instance metadata cleaning it up for persisting as json. |
476 | |
477 | Strip ci-b64 prefix and catalog any 'base64_encoded_keys' as a list |
478 | @@ -67,22 +71,46 @@ def process_instance_metadata(metadata, key_path=''): |
479 | """ |
480 | md_copy = copy.deepcopy(metadata) |
481 | md_copy['base64_encoded_keys'] = [] |
482 | + md_copy['sensitive_keys'] = [] |
483 | for key, val in metadata.items(): |
484 | if key_path: |
485 | sub_key_path = key_path + '/' + key |
486 | else: |
487 | sub_key_path = key |
488 | + if key in sensitive_keys or sub_key_path in sensitive_keys: |
489 | + md_copy['sensitive_keys'].append(sub_key_path) |
490 | if isinstance(val, str) and val.startswith('ci-b64:'): |
491 | md_copy['base64_encoded_keys'].append(sub_key_path) |
492 | md_copy[key] = val.replace('ci-b64:', '') |
493 | if isinstance(val, dict): |
494 | - return_val = process_instance_metadata(val, sub_key_path) |
495 | + return_val = process_instance_metadata( |
496 | + val, sub_key_path, sensitive_keys) |
497 | md_copy['base64_encoded_keys'].extend( |
498 | return_val.pop('base64_encoded_keys')) |
499 | + md_copy['sensitive_keys'].extend( |
500 | + return_val.pop('sensitive_keys')) |
501 | md_copy[key] = return_val |
502 | return md_copy |
503 | |
504 | |
505 | +def redact_sensitive_keys(metadata, redact_value=REDACT_SENSITIVE_VALUE): |
506 | + """Redact any sensitive keys from the provided metadata dictionary. |
507 | + |
508 | + Replace any key values listed in 'sensitive_keys' with redact_value. |
509 | + """ |
510 | + if not metadata.get('sensitive_keys', []): |
511 | + return metadata |
512 | + md_copy = copy.deepcopy(metadata) |
513 | + for key_path in metadata.get('sensitive_keys'): |
514 | + path_parts = key_path.split('/') |
515 | + obj = md_copy |
516 | + for path in path_parts: |
517 | + if isinstance(obj[path], dict) and path != path_parts[-1]: |
518 | + obj = obj[path] |
519 | + obj[path] = redact_value |
520 | + return md_copy |
521 | + |
522 | + |
523 | URLParams = namedtuple( |
524 | 'URLParms', ['max_wait_seconds', 'timeout_seconds', 'num_retries']) |
525 | |
526 | @@ -127,6 +155,10 @@ class DataSource(object): |
527 | |
528 | _dirty_cache = False |
529 | |
530 | + # N-tuple of keypaths or keynames to redact from instance-data.json for |
531 | + # non-root users |
532 | + sensitive_metadata_keys = ('security-credentials',) |
533 | + |
534 | def __init__(self, sys_cfg, distro, paths, ud_proc=None): |
535 | self.sys_cfg = sys_cfg |
536 | self.distro = distro |
537 | @@ -152,12 +184,24 @@ class DataSource(object): |
538 | |
539 | def _get_standardized_metadata(self): |
540 | """Return a dictionary of standardized metadata keys.""" |
541 | - return {'v1': { |
542 | - 'local-hostname': self.get_hostname(), |
543 | - 'instance-id': self.get_instance_id(), |
544 | - 'cloud-name': self.cloud_name, |
545 | - 'region': self.region, |
546 | - 'availability-zone': self.availability_zone}} |
547 | + local_hostname = self.get_hostname() |
548 | + instance_id = self.get_instance_id() |
549 | + availability_zone = self.availability_zone |
550 | + cloud_name = self.cloud_name |
551 | + # When adding new standard keys prefer underscore-delimited instead |
552 | + # of hyphen-delimited to support simple variable references in jinja |
553 | + # templates. |
554 | + return { |
555 | + 'v1': { |
556 | + 'availability-zone': availability_zone, |
557 | + 'availability_zone': availability_zone, |
558 | + 'cloud-name': cloud_name, |
559 | + 'cloud_name': cloud_name, |
560 | + 'instance-id': instance_id, |
561 | + 'instance_id': instance_id, |
562 | + 'local-hostname': local_hostname, |
563 | + 'local_hostname': local_hostname, |
564 | + 'region': self.region}} |
565 | |
566 | def clear_cached_attrs(self, attr_defaults=()): |
567 | """Reset any cached metadata attributes to datasource defaults. |
568 | @@ -200,9 +244,7 @@ class DataSource(object): |
569 | """ |
570 | instance_data = { |
571 | 'ds': { |
572 | - 'meta_data': self.metadata, |
573 | - 'user_data': self.get_userdata_raw(), |
574 | - 'vendor_data': self.get_vendordata_raw()}} |
575 | + 'meta_data': self.metadata}} |
576 | if hasattr(self, 'network_json'): |
577 | network_json = getattr(self, 'network_json') |
578 | if network_json != UNSET: |
579 | @@ -217,7 +259,9 @@ class DataSource(object): |
580 | # Process content base64encoding unserializable values |
581 | content = util.json_dumps(instance_data) |
582 | # Strip ci-b64: prefix and set base64_encoded_keys list. |
583 | - processed_data = process_instance_metadata(json.loads(content)) |
584 | + processed_data = process_instance_metadata( |
585 | + json.loads(content), |
586 | + sensitive_keys=self.sensitive_metadata_keys) |
587 | except TypeError as e: |
588 | LOG.warning('Error persisting instance-data.json: %s', str(e)) |
589 | return False |
590 | @@ -225,7 +269,11 @@ class DataSource(object): |
591 | LOG.warning('Error persisting instance-data.json: %s', str(e)) |
592 | return False |
593 | json_file = os.path.join(self.paths.run_dir, INSTANCE_JSON_FILE) |
594 | - write_json(json_file, processed_data, mode=0o600) |
595 | + # World readable: sensitive values are redacted |
596 | + write_json(json_file, redact_sensitive_keys(processed_data)) |
597 | + json_sensitive_file = os.path.join(self.paths.run_dir, |
598 | + INSTANCE_JSON_SENSITIVE_FILE) |
599 | + write_json(json_sensitive_file, processed_data, mode=0o600) |
600 | return True |
601 | |
602 | def _get_data(self): |
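The key-path bookkeeping and redaction introduced in the hunks above can be illustrated with a minimal standalone sketch. This is not the cloud-init implementation: `tag_sensitive` and `redact` are hypothetical helper names, the placeholder string is an assumption taken from the test expectations in this proposal, and only nested dictionaries are handled. Unlike the proposed code, the sketch deliberately stops descending once a subtree has been marked sensitive.

```python
import copy

# Assumed placeholder text; the real constant lives in cloudinit.sources.
REDACTED = 'redacted for non-root user'


def tag_sensitive(metadata, sensitive_keys, key_path=''):
    """Return '/'-delimited paths whose key name or full path is sensitive."""
    found = []
    for key, val in metadata.items():
        sub_path = key_path + '/' + key if key_path else key
        if key in sensitive_keys or sub_path in sensitive_keys:
            found.append(sub_path)
        elif isinstance(val, dict):
            # Only descend into subtrees that are not already sensitive.
            found.extend(tag_sensitive(val, sensitive_keys, sub_path))
    return found


def redact(metadata, sensitive_paths, redact_value=REDACTED):
    """Return a deep copy with each listed path replaced by redact_value."""
    md_copy = copy.deepcopy(metadata)
    for key_path in sensitive_paths:
        *parents, leaf = key_path.split('/')
        obj = md_copy
        for part in parents:
            obj = obj[part]
        obj[leaf] = redact_value
    return md_copy


md = {'ds': {'meta_data': {
    'region': 'myregion',
    'some': {'security-credentials': {'cred1': 'sekret'}}}}}
paths = tag_sensitive(md, ('security-credentials',))
public_md = redact(md, paths)
```

Working on a deep copy keeps the unredacted dictionary intact, so the same processed data can back both the root-only file and the world-readable one.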
603 | diff --git a/cloudinit/sources/tests/test_init.py b/cloudinit/sources/tests/test_init.py |
604 | index 8299af2..6b96575 100644 |
605 | --- a/cloudinit/sources/tests/test_init.py |
606 | +++ b/cloudinit/sources/tests/test_init.py |
607 | @@ -1,5 +1,6 @@ |
608 | # This file is part of cloud-init. See LICENSE file for license information. |
609 | |
610 | +import copy |
611 | import inspect |
612 | import os |
613 | import six |
614 | @@ -9,7 +10,8 @@ from cloudinit.event import EventType |
615 | from cloudinit.helpers import Paths |
616 | from cloudinit import importer |
617 | from cloudinit.sources import ( |
618 | - INSTANCE_JSON_FILE, DataSource, UNSET) |
619 | + INSTANCE_JSON_FILE, INSTANCE_JSON_SENSITIVE_FILE, REDACT_SENSITIVE_VALUE, |
620 | + UNSET, DataSource, redact_sensitive_keys) |
621 | from cloudinit.tests.helpers import CiTestCase, skipIf, mock |
622 | from cloudinit.user_data import UserDataProcessor |
623 | from cloudinit import util |
624 | @@ -20,20 +22,24 @@ class DataSourceTestSubclassNet(DataSource): |
625 | dsname = 'MyTestSubclass' |
626 | url_max_wait = 55 |
627 | |
628 | - def __init__(self, sys_cfg, distro, paths, custom_userdata=None, |
629 | - get_data_retval=True): |
630 | + def __init__(self, sys_cfg, distro, paths, custom_metadata=None, |
631 | + custom_userdata=None, get_data_retval=True): |
632 | super(DataSourceTestSubclassNet, self).__init__( |
633 | sys_cfg, distro, paths) |
634 | self._custom_userdata = custom_userdata |
635 | + self._custom_metadata = custom_metadata |
636 | self._get_data_retval = get_data_retval |
637 | |
638 | def _get_cloud_name(self): |
639 | return 'SubclassCloudName' |
640 | |
641 | def _get_data(self): |
642 | - self.metadata = {'availability_zone': 'myaz', |
643 | - 'local-hostname': 'test-subclass-hostname', |
644 | - 'region': 'myregion'} |
645 | + if self._custom_metadata: |
646 | + self.metadata = self._custom_metadata |
647 | + else: |
648 | + self.metadata = {'availability_zone': 'myaz', |
649 | + 'local-hostname': 'test-subclass-hostname', |
650 | + 'region': 'myregion'} |
651 | if self._custom_userdata: |
652 | self.userdata_raw = self._custom_userdata |
653 | else: |
654 | @@ -278,7 +284,7 @@ class TestDataSource(CiTestCase): |
655 | os.path.exists(json_file), 'Found unexpected file %s' % json_file) |
656 | |
657 | def test_get_data_writes_json_instance_data_on_success(self): |
658 | - """get_data writes INSTANCE_JSON_FILE to run_dir as readonly root.""" |
659 | + """get_data writes INSTANCE_JSON_FILE to run_dir as world readable.""" |
660 | tmp = self.tmp_dir() |
661 | datasource = DataSourceTestSubclassNet( |
662 | self.sys_cfg, self.distro, Paths({'run_dir': tmp})) |
663 | @@ -287,40 +293,90 @@ class TestDataSource(CiTestCase): |
664 | content = util.load_file(json_file) |
665 | expected = { |
666 | 'base64_encoded_keys': [], |
667 | + 'sensitive_keys': [], |
668 | 'v1': { |
669 | 'availability-zone': 'myaz', |
670 | + 'availability_zone': 'myaz', |
671 | 'cloud-name': 'subclasscloudname', |
672 | + 'cloud_name': 'subclasscloudname', |
673 | 'instance-id': 'iid-datasource', |
674 | + 'instance_id': 'iid-datasource', |
675 | 'local-hostname': 'test-subclass-hostname', |
676 | + 'local_hostname': 'test-subclass-hostname', |
677 | 'region': 'myregion'}, |
678 | 'ds': { |
679 | 'meta_data': {'availability_zone': 'myaz', |
680 | 'local-hostname': 'test-subclass-hostname', |
681 | - 'region': 'myregion'}, |
682 | - 'user_data': 'userdata_raw', |
683 | - 'vendor_data': 'vendordata_raw'}} |
684 | - self.maxDiff = None |
685 | + 'region': 'myregion'}}} |
686 | self.assertEqual(expected, util.load_json(content)) |
687 | file_stat = os.stat(json_file) |
688 | + self.assertEqual(0o644, stat.S_IMODE(file_stat.st_mode)) |
689 | + self.assertEqual(expected, util.load_json(content)) |
690 | + |
691 | + def test_get_data_writes_json_instance_data_sensitive(self): |
692 | + """get_data writes INSTANCE_JSON_SENSITIVE_FILE as readonly root.""" |
693 | + tmp = self.tmp_dir() |
694 | + datasource = DataSourceTestSubclassNet( |
695 | + self.sys_cfg, self.distro, Paths({'run_dir': tmp}), |
696 | + custom_metadata={ |
697 | + 'availability_zone': 'myaz', |
698 | + 'local-hostname': 'test-subclass-hostname', |
699 | + 'region': 'myregion', |
700 | + 'some': {'security-credentials': { |
701 | + 'cred1': 'sekret', 'cred2': 'othersekret'}}}) |
702 | + self.assertEqual( |
703 | + ('security-credentials',), datasource.sensitive_metadata_keys) |
704 | + datasource.get_data() |
705 | + json_file = self.tmp_path(INSTANCE_JSON_FILE, tmp) |
706 | + sensitive_json_file = self.tmp_path(INSTANCE_JSON_SENSITIVE_FILE, tmp) |
707 | + redacted = util.load_json(util.load_file(json_file)) |
708 | + self.assertEqual( |
709 | + REDACT_SENSITIVE_VALUE, |
710 | + redacted['ds']['meta_data']['some']['security-credentials']) |
711 | + content = util.load_file(sensitive_json_file) |
712 | + expected = { |
713 | + 'base64_encoded_keys': [], |
714 | + 'sensitive_keys': ['ds/meta_data/some/security-credentials'], |
715 | + 'v1': { |
716 | + 'availability-zone': 'myaz', |
717 | + 'availability_zone': 'myaz', |
718 | + 'cloud-name': 'subclasscloudname', |
719 | + 'cloud_name': 'subclasscloudname', |
720 | + 'instance-id': 'iid-datasource', |
721 | + 'instance_id': 'iid-datasource', |
722 | + 'local-hostname': 'test-subclass-hostname', |
723 | + 'local_hostname': 'test-subclass-hostname', |
724 | + 'region': 'myregion'}, |
725 | + 'ds': { |
726 | + 'meta_data': { |
727 | + 'availability_zone': 'myaz', |
728 | + 'local-hostname': 'test-subclass-hostname', |
729 | + 'region': 'myregion', |
730 | + 'some': {'security-credentials': { |
731 | + 'cred1': 'sekret', 'cred2': 'othersekret'}}}}} |
732 | + self.maxDiff = None |
733 | + self.assertEqual(expected, util.load_json(content)) |
734 | + file_stat = os.stat(sensitive_json_file) |
735 | self.assertEqual(0o600, stat.S_IMODE(file_stat.st_mode)) |
736 | + self.assertEqual(expected, util.load_json(content)) |
737 | |
738 | def test_get_data_handles_redacted_unserializable_content(self): |
739 | """get_data warns unserializable content in INSTANCE_JSON_FILE.""" |
740 | tmp = self.tmp_dir() |
741 | datasource = DataSourceTestSubclassNet( |
742 | self.sys_cfg, self.distro, Paths({'run_dir': tmp}), |
743 | - custom_userdata={'key1': 'val1', 'key2': {'key2.1': self.paths}}) |
744 | + custom_metadata={'key1': 'val1', 'key2': {'key2.1': self.paths}}) |
745 | datasource.get_data() |
746 | json_file = self.tmp_path(INSTANCE_JSON_FILE, tmp) |
747 | content = util.load_file(json_file) |
748 | - expected_userdata = { |
749 | + expected_metadata = { |
750 | 'key1': 'val1', |
751 | 'key2': { |
752 | 'key2.1': "Warning: redacted unserializable type <class" |
753 | " 'cloudinit.helpers.Paths'>"}} |
754 | instance_json = util.load_json(content) |
755 | self.assertEqual( |
756 | - expected_userdata, instance_json['ds']['user_data']) |
757 | + expected_metadata, instance_json['ds']['meta_data']) |
758 | |
759 | def test_persist_instance_data_writes_ec2_metadata_when_set(self): |
760 | """When ec2_metadata class attribute is set, persist to json.""" |
761 | @@ -361,17 +417,17 @@ class TestDataSource(CiTestCase): |
762 | tmp = self.tmp_dir() |
763 | datasource = DataSourceTestSubclassNet( |
764 | self.sys_cfg, self.distro, Paths({'run_dir': tmp}), |
765 | - custom_userdata={'key1': 'val1', 'key2': {'key2.1': b'\x123'}}) |
766 | + custom_metadata={'key1': 'val1', 'key2': {'key2.1': b'\x123'}}) |
767 | self.assertTrue(datasource.get_data()) |
768 | json_file = self.tmp_path(INSTANCE_JSON_FILE, tmp) |
769 | content = util.load_file(json_file) |
770 | instance_json = util.load_json(content) |
771 | - self.assertEqual( |
772 | - ['ds/user_data/key2/key2.1'], |
773 | + self.assertItemsEqual( |
774 | + ['ds/meta_data/key2/key2.1'], |
775 | instance_json['base64_encoded_keys']) |
776 | self.assertEqual( |
777 | {'key1': 'val1', 'key2': {'key2.1': 'EjM='}}, |
778 | - instance_json['ds']['user_data']) |
779 | + instance_json['ds']['meta_data']) |
780 | |
781 | @skipIf(not six.PY2, "json serialization on <= py2.7 handles bytes") |
782 | def test_get_data_handles_bytes_values(self): |
783 | @@ -379,7 +435,7 @@ class TestDataSource(CiTestCase): |
784 | tmp = self.tmp_dir() |
785 | datasource = DataSourceTestSubclassNet( |
786 | self.sys_cfg, self.distro, Paths({'run_dir': tmp}), |
787 | - custom_userdata={'key1': 'val1', 'key2': {'key2.1': b'\x123'}}) |
788 | + custom_metadata={'key1': 'val1', 'key2': {'key2.1': b'\x123'}}) |
789 | self.assertTrue(datasource.get_data()) |
790 | json_file = self.tmp_path(INSTANCE_JSON_FILE, tmp) |
791 | content = util.load_file(json_file) |
792 | @@ -387,7 +443,7 @@ class TestDataSource(CiTestCase): |
793 | self.assertEqual([], instance_json['base64_encoded_keys']) |
794 | self.assertEqual( |
795 | {'key1': 'val1', 'key2': {'key2.1': '\x123'}}, |
796 | - instance_json['ds']['user_data']) |
797 | + instance_json['ds']['meta_data']) |
798 | |
799 | @skipIf(not six.PY2, "Only python2 hits UnicodeDecodeErrors on non-utf8") |
800 | def test_non_utf8_encoding_logs_warning(self): |
801 | @@ -395,7 +451,7 @@ class TestDataSource(CiTestCase): |
802 | tmp = self.tmp_dir() |
803 | datasource = DataSourceTestSubclassNet( |
804 | self.sys_cfg, self.distro, Paths({'run_dir': tmp}), |
805 | - custom_userdata={'key1': 'val1', 'key2': {'key2.1': b'ab\xaadef'}}) |
806 | + custom_metadata={'key1': 'val1', 'key2': {'key2.1': b'ab\xaadef'}}) |
807 | self.assertTrue(datasource.get_data()) |
808 | json_file = self.tmp_path(INSTANCE_JSON_FILE, tmp) |
809 | self.assertFalse(os.path.exists(json_file)) |
810 | @@ -509,4 +565,36 @@ class TestDataSource(CiTestCase): |
811 | self.logs.getvalue()) |
812 | |
813 | |
814 | +class TestRedactSensitiveData(CiTestCase): |
815 | + |
816 | + def test_redact_sensitive_data_noop_when_no_sensitive_keys_present(self): |
817 | + """When sensitive_keys is absent or empty from metadata do nothing.""" |
818 | + md = {'my': 'data'} |
819 | + self.assertEqual( |
820 | + md, redact_sensitive_keys(md, redact_value='redacted')) |
821 | + md['sensitive_keys'] = [] |
822 | + self.assertEqual( |
823 | + md, redact_sensitive_keys(md, redact_value='redacted')) |
824 | + |
825 | + def test_redact_sensitive_data_redacts_exact_match_name(self): |
826 | + """Only exact matched sensitive_keys are redacted from metadata.""" |
827 | + md = {'sensitive_keys': ['md/secure'], |
828 | + 'md': {'secure': 's3kr1t', 'insecure': 'publik'}} |
829 | + secure_md = copy.deepcopy(md) |
830 | + secure_md['md']['secure'] = 'redacted' |
831 | + self.assertEqual( |
832 | + secure_md, |
833 | + redact_sensitive_keys(md, redact_value='redacted')) |
834 | + |
835 | + def test_redact_sensitive_data_does_redacts_with_default_string(self): |
836 | + """When redact_value is absent, REDACT_SENSITIVE_VALUE is used.""" |
837 | + md = {'sensitive_keys': ['md/secure'], |
838 | + 'md': {'secure': 's3kr1t', 'insecure': 'publik'}} |
839 | + secure_md = copy.deepcopy(md) |
840 | + secure_md['md']['secure'] = 'redacted for non-root user' |
841 | + self.assertEqual( |
842 | + secure_md, |
843 | + redact_sensitive_keys(md)) |
844 | + |
845 | + |
846 | # vi: ts=4 expandtab |
847 | diff --git a/doc/rtd/index.rst b/doc/rtd/index.rst |
848 | index de67f36..20a99a3 100644 |
849 | --- a/doc/rtd/index.rst |
850 | +++ b/doc/rtd/index.rst |
851 | @@ -31,6 +31,7 @@ initialization of a cloud instance. |
852 | topics/capabilities.rst |
853 | topics/availability.rst |
854 | topics/format.rst |
855 | + topics/instancedata.rst |
856 | topics/dir_layout.rst |
857 | topics/examples.rst |
858 | topics/boot.rst |
859 | diff --git a/doc/rtd/topics/capabilities.rst b/doc/rtd/topics/capabilities.rst |
860 | index 2d8e253..0d8b894 100644 |
861 | --- a/doc/rtd/topics/capabilities.rst |
862 | +++ b/doc/rtd/topics/capabilities.rst |
863 | @@ -18,7 +18,7 @@ User configurability |
864 | |
865 | User-data can be given by the user at instance launch time. See |
866 | :ref:`user_data_formats` for acceptable user-data content. |
867 | - |
868 | + |
869 | |
870 | This is done via the ``--user-data`` or ``--user-data-file`` argument to |
871 | ec2-run-instances for example. |
872 | @@ -53,10 +53,9 @@ system: |
873 | |
874 | % cloud-init --help |
875 | usage: cloud-init [-h] [--version] [--file FILES] |
876 | - |
877 | [--debug] [--force] |
878 | - {init,modules,single,dhclient-hook,features,analyze,devel,collect-logs,clean,status} |
879 | - ... |
880 | + {init,modules,single,query,dhclient-hook,features,analyze,devel,collect-logs,clean,status} |
881 | + ... |
882 | |
883 | optional arguments: |
884 | -h, --help show this help message and exit |
885 | @@ -68,17 +67,19 @@ system: |
886 | your own risk) |
887 | |
888 | Subcommands: |
889 | - {init,modules,single,dhclient-hook,features,analyze,devel,collect-logs,clean,status} |
890 | + {init,modules,single,query,dhclient-hook,features,analyze,devel,collect-logs,clean,status} |
891 | init initializes cloud-init and performs initial modules |
892 | modules activates modules using a given configuration key |
893 | single run a single module |
894 | + query Query instance metadata from the command line |
895 | dhclient-hook run the dhclient hookto record network info |
896 | features list defined features |
897 | analyze Devel tool: Analyze cloud-init logs and data |
898 | devel Run development tools |
899 | collect-logs Collect and tar all cloud-init debug info |
900 | - clean Remove logs and artifacts so cloud-init can re-run. |
901 | - status Report cloud-init status or wait on completion. |
902 | + clean Remove logs and artifacts so cloud-init can re-run |
903 | + status Report cloud-init status or wait on completion |
904 | + |
905 | |
906 | CLI Subcommand details |
907 | ====================== |
908 | @@ -104,8 +105,8 @@ cloud-init status |
909 | Report whether cloud-init is running, done, disabled or errored. Exits |
910 | non-zero if an error is detected in cloud-init. |
911 | |
912 | - * **--long**: Detailed status information. |
913 | - * **--wait**: Block until cloud-init completes. |
914 | +* **--long**: Detailed status information. |
915 | +* **--wait**: Block until cloud-init completes. |
916 | |
917 | .. code-block:: shell-session |
918 | |
919 | @@ -143,6 +144,68 @@ Logs collected are: |
920 | * journalctl output |
921 | * /var/lib/cloud/instance/user-data.txt |
922 | |
923 | +.. _cli_query: |
924 | + |
925 | +cloud-init query |
926 | +------------------ |
927 | +Query standardized cloud instance metadata crawled by cloud-init and stored |
928 | +in ``/run/cloud-init/instance-data.json``. This is a convenience command-line |
929 | +interface to reference any cached configuration metadata that cloud-init |
930 | +crawls when booting the instance. See :ref:`instance_metadata` for more info. |
931 | + |
932 | +* **--all**: Dump all available instance data as json which can be queried. |
933 | +* **--instance-data**: Optional path to a different instance-data.json file to |
934 | + source for queries. |
935 | +* **--list-keys**: List available query keys from cached instance data. |
936 | + |
937 | +.. code-block:: shell-session |
938 | + |
939 | + # List all top-level query keys available (includes standardized aliases) |
940 | + % cloud-init query --list-keys |
941 | + availability_zone |
942 | + base64_encoded_keys |
943 | + cloud_name |
944 | + ds |
945 | + instance_id |
946 | + local_hostname |
947 | + region |
948 | + v1 |
949 | + |
950 | +* **<varname>**: A dot-delimited variable path into the instance-data.json |
951 | + object. |
952 | + |
953 | +.. code-block:: shell-session |
954 | + |
955 | + # Query cloud-init standardized metadata on any cloud |
956 | + % cloud-init query v1.cloud_name |
957 | + aws # or openstack, azure, gce etc. |
958 | + |
959 | + # Any standardized instance-data under a <v#> key is aliased as a top-level |
960 | + # key for convenience. |
961 | + % cloud-init query cloud_name |
962 | + aws # or openstack, azure, gce etc. |
963 | + |
964 | + # Query datasource-specific metadata on EC2 |
965 | + % cloud-init query ds.meta_data.public_ipv4 |
966 | + |
967 | +* **--format**: A string that will use jinja-template syntax to render a string |
968 | + replacing any referenced instance-data variables with their values. |
969 | + |
970 | +.. code-block:: shell-session |
971 | + |
972 | + # Generate a custom hostname fqdn based on instance-id, cloud and region |
973 | + % cloud-init query --format 'custom-{{instance_id}}.{{region}}.{{v1.cloud_name}}.com' |
974 | + custom-i-0e91f69987f37ec74.us-east-2.aws.com |
975 | + |
976 | + |
977 | +.. note:: |
978 | + The standardized instance data keys under **v#** are guaranteed not to change |
979 | + behavior or format. If using top-level convenience aliases for any |
980 | + standardized instance data keys, the most recent value (highest **v#**) of |
981 | + that key name is what is reported as the top-level value. So these aliases |
982 | + act as a 'latest'. |
983 | + |
984 | + |
985 | .. _cli_analyze: |
986 | |
987 | cloud-init analyze |
988 | @@ -150,10 +213,10 @@ cloud-init analyze |
989 | Get detailed reports of where cloud-init spends most of its time. See |
990 | :ref:`boot_time_analysis` for more info. |
991 | |
992 | - * **blame** Report ordered by most costly operations. |
993 | - * **dump** Machine-readable JSON dump of all cloud-init tracked events. |
994 | - * **show** show time-ordered report of the cost of operations during each |
995 | - boot stage. |
996 | +* **blame** Report ordered by most costly operations. |
997 | +* **dump** Machine-readable JSON dump of all cloud-init tracked events. |
998 | +* **show** show time-ordered report of the cost of operations during each |
999 | + boot stage. |
1000 | |
1001 | .. _cli_devel: |
1002 | |
1003 | @@ -182,8 +245,8 @@ cloud-init clean |
1004 | Remove cloud-init artifacts from /var/lib/cloud and optionally reboot the |
1005 | machine so cloud-init re-runs all stages as it did on first boot. |
1006 | |
1007 | - * **--logs**: Optionally remove /var/log/cloud-init*log files. |
1008 | - * **--reboot**: Reboot the system after removing artifacts. |
1009 | +* **--logs**: Optionally remove /var/log/cloud-init*log files. |
1010 | +* **--reboot**: Reboot the system after removing artifacts. |
1011 | |
1012 | .. _cli_init: |
1013 | |
1014 | @@ -195,7 +258,7 @@ Can be run on the commandline, but is generally gated to run only once |
1015 | due to semaphores in **/var/lib/cloud/instance/sem/** and |
1016 | **/var/lib/cloud/sem**. |
1017 | |
1018 | - * **--local**: Run *init-local* stage instead of *init*. |
1019 | +* **--local**: Run *init-local* stage instead of *init*. |
1020 | |
1021 | .. _cli_modules: |
1022 | |
1023 | @@ -210,8 +273,8 @@ declared to run in various boot stages in the file |
1024 | commandline, but each module is gated to run only once due to semaphores |
1025 | in ``/var/lib/cloud/``. |
1026 | |
1027 | - * **--mode (init|config|final)**: Run *modules:init*, *modules:config* or |
1028 | - *modules:final* cloud-init stages. See :ref:`boot_stages` for more info. |
1029 | +* **--mode (init|config|final)**: Run *modules:init*, *modules:config* or |
1030 | + *modules:final* cloud-init stages. See :ref:`boot_stages` for more info. |
1031 | |
1032 | .. _cli_single: |
1033 | |
1034 | @@ -221,9 +284,9 @@ Attempt to run a single named cloud config module. The following example |
1035 | re-runs the cc_set_hostname module ignoring the module default frequency |
1036 | of once-per-instance: |
1037 | |
1038 | - * **--name**: The cloud-config module name to run |
1039 | - * **--frequency**: Optionally override the declared module frequency |
1040 | - with one of (always|once-per-instance|once) |
1041 | +* **--name**: The cloud-config module name to run |
1042 | +* **--frequency**: Optionally override the declared module frequency |
1043 | + with one of (always|once-per-instance|once) |
1044 | |
1045 | .. code-block:: shell-session |
1046 | |
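The dot-delimited lookup and top-level aliasing described for ``cloud-init query`` in the documentation hunk above can be sketched as follows. This is an illustration only, not the code in ``cloudinit/cmd/query.py``: the function name `query` and the sample data are hypothetical, and only the ``v1`` key is aliased, whereas the documentation describes aliasing the highest available **v#**.

```python
import json


def query(instance_data, varname):
    """Resolve a dot-delimited path such as 'v1.cloud_name'."""
    # Alias standardized keys under 'v1' to the top level, without letting
    # an alias shadow a real top-level key of the same name.
    data = dict(instance_data.get('v1', {}))
    data.update(instance_data)
    value = data
    for part in varname.split('.'):
        value = value[part]  # raises KeyError for an unknown key
    return value


# Hypothetical, trimmed-down instance-data.json content.
sample = json.loads('''
{"base64_encoded_keys": [],
 "ds": {"meta_data": {"public_ipv4": "203.0.113.10"}},
 "v1": {"cloud_name": "aws", "region": "us-east-2"}}
''')

cloud_via_v1 = query(sample, 'v1.cloud_name')
cloud_via_alias = query(sample, 'cloud_name')
public_ip = query(sample, 'ds.meta_data.public_ipv4')
```

Splitting on dots is the simplest possible rule; it would not cope with key names that themselves contain a dot.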
1047 | diff --git a/doc/rtd/topics/datasources.rst b/doc/rtd/topics/datasources.rst |
1048 | index 14432e6..e34f145 100644 |
1049 | --- a/doc/rtd/topics/datasources.rst |
1050 | +++ b/doc/rtd/topics/datasources.rst |
1051 | @@ -17,146 +17,10 @@ own way) internally a datasource abstract class was created to allow for a |
1052 | single way to access the different cloud systems methods to provide this data |
1053 | through the typical usage of subclasses. |
1054 | |
1055 | - |
1056 | -.. _instance_metadata: |
1057 | - |
1058 | -instance-data |
1059 | -------------- |
1060 | -For reference, cloud-init stores all the metadata, vendordata and userdata |
1061 | -provided by a cloud in a json blob at ``/run/cloud-init/instance-data.json``. |
1062 | -While the json contains datasource-specific keys and names, cloud-init will |
1063 | -maintain a minimal set of standardized keys that will remain stable on any |
1064 | -cloud. Standardized instance-data keys will be present under a "v1" key. |
1065 | -Any datasource metadata cloud-init consumes will all be present under the |
1066 | -"ds" key. |
1067 | - |
1068 | -Below is an instance-data.json example from an OpenStack instance: |
1069 | - |
1070 | -.. sourcecode:: json |
1071 | - |
1072 | - { |
1073 | - "base64-encoded-keys": [ |
1074 | - "ds/meta-data/random_seed", |
1075 | - "ds/user-data" |
1076 | - ], |
1077 | - "ds": { |
1078 | - "ec2_metadata": { |
1079 | - "ami-id": "ami-0000032f", |
1080 | - "ami-launch-index": "0", |
1081 | - "ami-manifest-path": "FIXME", |
1082 | - "block-device-mapping": { |
1083 | - "ami": "vda", |
1084 | - "ephemeral0": "/dev/vdb", |
1085 | - "root": "/dev/vda" |
1086 | - }, |
1087 | - "hostname": "xenial-test.novalocal", |
1088 | - "instance-action": "none", |
1089 | - "instance-id": "i-0006e030", |
1090 | - "instance-type": "m1.small", |
1091 | - "local-hostname": "xenial-test.novalocal", |
1092 | - "local-ipv4": "10.5.0.6", |
1093 | - "placement": { |
1094 | - "availability-zone": "None" |
1095 | - }, |
1096 | - "public-hostname": "xenial-test.novalocal", |
1097 | - "public-ipv4": "10.245.162.145", |
1098 | - "reservation-id": "r-fxm623oa", |
1099 | - "security-groups": "default" |
1100 | - }, |
1101 | - "meta-data": { |
1102 | - "availability_zone": null, |
1103 | - "devices": [], |
1104 | - "hostname": "xenial-test.novalocal", |
1105 | - "instance-id": "3e39d278-0644-4728-9479-678f9212d8f0", |
1106 | - "launch_index": 0, |
1107 | - "local-hostname": "xenial-test.novalocal", |
1108 | - "name": "xenial-test", |
1109 | - "project_id": "e0eb2d2538814...", |
1110 | - "random_seed": "A6yPN...", |
1111 | - "uuid": "3e39d278-0644-4728-9479-678f92..." |
1112 | - }, |
1113 | - "network_json": { |
1114 | - "links": [ |
1115 | - { |
1116 | - "ethernet_mac_address": "fa:16:3e:7d:74:9b", |
1117 | - "id": "tap9ca524d5-6e", |
1118 | - "mtu": 8958, |
1119 | - "type": "ovs", |
1120 | - "vif_id": "9ca524d5-6e5a-4809-936a-6901..." |
1121 | - } |
1122 | - ], |
1123 | - "networks": [ |
1124 | - { |
1125 | - "id": "network0", |
1126 | - "link": "tap9ca524d5-6e", |
1127 | - "network_id": "c6adfc18-9753-42eb-b3ea-18b57e6b837f", |
1128 | - "type": "ipv4_dhcp" |
1129 | - } |
1130 | - ], |
1131 | - "services": [ |
1132 | - { |
1133 | - "address": "10.10.160.2", |
1134 | - "type": "dns" |
1135 | - } |
1136 | - ] |
1137 | - }, |
1138 | - "user-data": "I2Nsb3VkLWNvbmZpZ...", |
1139 | - "vendor-data": null |
1140 | - }, |
1141 | - "v1": { |
1142 | - "availability-zone": null, |
1143 | - "cloud-name": "openstack", |
1144 | - "instance-id": "3e39d278-0644-4728-9479-678f9212d8f0", |
1145 | - "local-hostname": "xenial-test", |
1146 | - "region": null |
1147 | - } |
1148 | - } |
1149 | - |
1150 | - |
1151 | -As of cloud-init v. 18.4, any values present in |
1152 | -``/run/cloud-init/instance-data.json`` can be used in cloud-init user data |
1153 | -scripts or cloud config data. This allows consumers to use cloud-init's |
1154 | -vendor-neutral, standardized metadata keys as well as datasource-specific |
1155 | -content for any scripts or cloud-config modules they are using. |
1156 | - |
1157 | -To use instance-data.json values in scripts and **#config-config** files the |
1158 | -user-data will need to contain the following header as the first line **## template: jinja**. Cloud-init will source all variables defined in |
1159 | -``/run/cloud-init/instance-data.json`` and allow scripts or cloud-config files |
1160 | -to reference those paths. Below are two examples:: |
1161 | - |
1162 | - * Cloud config calling home with the ec2 public hostname and avaliability-zone |
1163 | - ``` |
1164 | - ## template: jinja |
1165 | - #cloud-config |
1166 | - runcmd: |
1167 | - - echo 'EC2 public hostname allocated to instance: {{ ds.meta_data.public_hostname }}' > /tmp/instance_metadata |
1168 | - - echo 'EC2 avaiability zone: {{ v1.availability_zone }}' >> /tmp/instance_metadata |
1169 | - - curl -X POST -d '{"hostname": "{{ds.meta_data.public_hostname }}", "availability-zone": "{{ v1.availability_zone }}"}' https://example.com.com |
1170 | - ``` |
1171 | - |
1172 | - * Custom user script performing different operations based on region |
1173 | - ``` |
1174 | - ## template: jinja |
1175 | - #!/bin/bash |
1176 | - {% if v1.region == 'us-east-2' -%} |
1177 | - echo 'Installing custom proxies for {{ v1.region }} |
1178 | - sudo apt-get install my-xtra-fast-stack |
1179 | - {%- endif %} |
1180 | - ... |
1181 | - |
1182 | - ``` |
1183 | - |
1184 | -.. note:: |
1185 | - Trying to reference jinja variables that don't exist in |
1186 | - instance-data.json will result in warnings in ``/var/log/cloud-init.log`` |
1187 | - and the following string in your rendered user-data: |
1188 | - ``CI_MISSING_JINJA_VAR/<your_varname>``. |
1189 | - |
1190 | -.. note:: |
1191 | - To save time designing your user-data for a specific cloud's |
1192 | - instance-data.json, use the 'render' cloud-init command on an |
1193 | - instance booted on your favorite cloud. See :ref:`cli_devel` for more |
1194 | - information. |
1195 | +Any metadata processed by cloud-init's datasources is persisted as |
1196 | +``/run/cloud-init/instance-data.json``. Cloud-init provides tooling |
1197 | +to quickly introspect some of that data. See :ref:`instance_metadata` for |
1198 | +more information. |
1199 | |
1200 | |
1201 | Datasource API |
1202 | @@ -196,14 +60,14 @@ The current interface that a datasource object must provide is the following: |
1203 | # or does not exist) |
1204 | def device_name_to_device(self, name) |
1205 | |
1206 | - # gets the locale string this instance should be applying |
1207 | + # gets the locale string this instance should be applying |
1208 | # which is typically used to adjust the instance's locale settings files |
1209 | def get_locale(self) |
1210 | |
1211 | @property |
1212 | def availability_zone(self) |
1213 | |
1214 | - # gets the instance id that was assigned to this instance by the |
1215 | + # gets the instance id that was assigned to this instance by the |
1216 | # cloud provider or when said instance id does not exist in the backing |
1217 | # metadata this will return 'iid-datasource' |
1218 | def get_instance_id(self) |
1219 | diff --git a/doc/rtd/topics/instancedata.rst b/doc/rtd/topics/instancedata.rst |
1220 | new file mode 100644 |
1221 | index 0000000..634e180 |
1222 | --- /dev/null |
1223 | +++ b/doc/rtd/topics/instancedata.rst |
1224 | @@ -0,0 +1,297 @@ |
1225 | +.. _instance_metadata: |
1226 | + |
1227 | +***************** |
1228 | +Instance Metadata |
1229 | +***************** |
1230 | + |
1231 | +What is instance data? |
1232 | +====================== |
1233 | + |
1234 | +Instance data is the collection of all configuration data that cloud-init |
1235 | +processes to configure the instance. This configuration typically |
1236 | +comes from any number of sources: |
1237 | + |
1238 | +* cloud-provided metadata services (aka metadata) |
1239 | +* custom config-drive attached to the instance |
1240 | +* cloud-config seed files in the booted cloud image or distribution |
1241 | +* vendordata provided from files or cloud metadata services |
1242 | +* userdata provided at instance creation |
1243 | + |
1244 | +Each cloud provider presents unique configuration metadata in different |
1245 | +formats to the instance. Cloud-init provides a cache of any crawled metadata |
1246 | +as well as a versioned set of standardized instance data keys which it makes |
1247 | +available on all platforms. |
1248 | + |
1249 | +Cloud-init produces a simple json object in |
1250 | +``/run/cloud-init/instance-data.json`` which is a standardized and |
1251 | +versioned representation of the metadata it consumes during initial boot. The |
1252 | +intent is to provide the following benefits to users or scripts on any system |
1253 | +deployed with cloud-init: |
1254 | + |
1255 | +* simple static object to query to obtain an instance's metadata |
1256 | +* speed: avoid costly network transactions for metadata that is already cached |
1257 | + on the filesystem |
1258 | +* reduce need to recrawl metadata services for static metadata that is already |
1259 | + cached |
1260 | +* leverage cloud-init's best practices for crawling cloud-metadata services |
1261 | +* avoid rolling unique metadata crawlers on each cloud platform to get |
1262 | + metadata configuration values |
1263 | + |
1264 | +Cloud-init stores any instance data processed in the following files: |
1265 | + |
1266 | +* ``/run/cloud-init/instance-data.json``: world-readable json containing |
1267 | + standardized keys, sensitive keys redacted |
1268 | +* ``/run/cloud-init/instance-data-sensitive.json``: root-readable unredacted |
1269 | + json blob |
1270 | +* ``/var/lib/cloud/instance/user-data.txt``: root-readable sensitive raw |
1271 | + userdata |
1272 | +* ``/var/lib/cloud/instance/vendor-data.txt``: root-readable sensitive raw |
1273 | + vendordata |
1274 | + |
1275 | +Cloud-init redacts any security sensitive content from instance-data.json and |
1276 | +stores ``/run/cloud-init/instance-data.json`` as a world-readable json file. |
1277 | +Because user-data and vendor-data can contain passwords, both of these files |
1278 | +are readable only by *root*. The *root* user can also read |
1279 | +``/run/cloud-init/instance-data-sensitive.json`` which contains all instance |
1280 | +data from instance-data.json as well as unredacted sensitive content. |
1281 | + |
1282 | + |
1283 | +Format of instance-data.json |
1284 | +============================ |
1285 | + |
1286 | +The instance-data.json and instance-data-sensitive.json files are well-formed |
1287 | +JSON and record the set of keys and values for any metadata processed by |
1288 | +cloud-init. Cloud-init standardizes the format for this content so that it |
1289 | +can be generalized across different cloud platforms. |
1290 | + |
1291 | +There are four basic top-level keys: |
1292 | + |
1293 | +* **base64_encoded_keys**: A list of forward-slash delimited key paths into |
1294 | + the instance-data.json object whose value is base64encoded for json |
1295 | + compatibility. Values at these paths should be decoded to get the original |
1296 | + value. |
1297 | + |
1298 | +* **sensitive_keys**: A list of forward-slash delimited key paths into |
1299 | + the instance-data.json object whose value is considered by the datasource as |
1300 | + 'security sensitive'. Only the keys listed here will be redacted from |
1301 | + instance-data.json for non-root users. |
1302 | + |
1303 | +* **ds**: Datasource-specific metadata crawled for the specific cloud |
1304 | + platform. It should closely represent the structure of the cloud metadata |
1305 | + crawled. The structure of content and details provided are entirely |
1306 | + cloud-dependent. Mileage will vary depending on what the cloud exposes. |
1307 | + The content exposed under the 'ds' key is currently **experimental** and |
1308 | + expected to change slightly in the upcoming cloud-init release. |
1309 | + |
1310 | +* **v1**: Standardized cloud-init metadata keys. These keys are guaranteed to |
1311 | + exist on all cloud platforms. They will also retain their current behavior |
1312 | + and format and will be carried forward even if cloud-init introduces a new |
1313 | + version of standardized keys with **v2**. |
1314 | + |
1315 | +The standardized keys present are: |
1316 | + |
1317 | ++----------------------+-----------------------------------------------+---------------------------+ |
1318 | +| Key path | Description | Examples | |
1319 | ++======================+===============================================+===========================+ |
1320 | +| v1.cloud_name | The name of the cloud provided by metadata | aws, openstack, azure, | |
1321 | +| | key 'cloud-name' or the cloud-init datasource | configdrive, nocloud, | |
1322 | +| | name which was discovered. | ovf, etc. | |
1323 | ++----------------------+-----------------------------------------------+---------------------------+ |
1324 | +| v1.instance_id | Unique instance_id allocated by the cloud | i-<somehash> | |
1325 | ++----------------------+-----------------------------------------------+---------------------------+ |
1326 | +| v1.local_hostname | The internal or local hostname of the system | ip-10-41-41-70, | |
1327 | +| | | <user-provided-hostname> | |
1328 | ++----------------------+-----------------------------------------------+---------------------------+ |
1329 | +| v1.region | The physical region/datacenter in which the | us-east-2 | |
1330 | +| | instance is deployed | | |
1331 | ++----------------------+-----------------------------------------------+---------------------------+ |
1332 | +| v1.availability_zone | The physical availability zone in which the | us-east-2b, nova, null | |
1333 | +| | instance is deployed | | |
1334 | ++----------------------+-----------------------------------------------+---------------------------+ |
1335 | + |
1336 | + |
1337 | +Below is an example of ``/run/cloud-init/instance-data.json`` on an EC2 |
1338 | +instance: |
1339 | + |
1340 | +.. sourcecode:: json |
1341 | + |
1342 | + { |
1343 | + "base64_encoded_keys": [], |
1344 | + "sensitive_keys": [], |
1345 | + "ds": { |
1346 | + "meta_data": { |
1347 | + "ami-id": "ami-014e1416b628b0cbf", |
1348 | + "ami-launch-index": "0", |
1349 | + "ami-manifest-path": "(unknown)", |
1350 | + "block-device-mapping": { |
1351 | + "ami": "/dev/sda1", |
1352 | + "ephemeral0": "sdb", |
1353 | + "ephemeral1": "sdc", |
1354 | + "root": "/dev/sda1" |
1355 | + }, |
1356 | + "hostname": "ip-10-41-41-70.us-east-2.compute.internal", |
1357 | + "instance-action": "none", |
1358 | + "instance-id": "i-04fa31cfc55aa7976", |
1359 | + "instance-type": "t2.micro", |
1360 | + "local-hostname": "ip-10-41-41-70.us-east-2.compute.internal", |
1361 | + "local-ipv4": "10.41.41.70", |
1362 | + "mac": "06:b6:92:dd:9d:24", |
1363 | + "metrics": { |
1364 | + "vhostmd": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>" |
1365 | + }, |
1366 | + "network": { |
1367 | + "interfaces": { |
1368 | + "macs": { |
1369 | + "06:b6:92:dd:9d:24": { |
1370 | + "device-number": "0", |
1371 | + "interface-id": "eni-08c0c9fdb99b6e6f4", |
1372 | + "ipv4-associations": { |
1373 | + "18.224.22.43": "10.41.41.70" |
1374 | + }, |
1375 | + "local-hostname": "ip-10-41-41-70.us-east-2.compute.internal", |
1376 | + "local-ipv4s": "10.41.41.70", |
1377 | + "mac": "06:b6:92:dd:9d:24", |
1378 | + "owner-id": "437526006925", |
1379 | + "public-hostname": "ec2-18-224-22-43.us-east-2.compute.amazonaws.com", |
1380 | + "public-ipv4s": "18.224.22.43", |
1381 | + "security-group-ids": "sg-828247e9", |
1382 | + "security-groups": "Cloud-init integration test secgroup", |
1383 | + "subnet-id": "subnet-282f3053", |
1384 | + "subnet-ipv4-cidr-block": "10.41.41.0/24", |
1385 | + "subnet-ipv6-cidr-blocks": "2600:1f16:b80:ad00::/64", |
1386 | + "vpc-id": "vpc-252ef24d", |
1387 | + "vpc-ipv4-cidr-block": "10.41.0.0/16", |
1388 | + "vpc-ipv4-cidr-blocks": "10.41.0.0/16", |
1389 | + "vpc-ipv6-cidr-blocks": "2600:1f16:b80:ad00::/56" |
1390 | + } |
1391 | + } |
1392 | + } |
1393 | + }, |
1394 | + "placement": { |
1395 | + "availability-zone": "us-east-2b" |
1396 | + }, |
1397 | + "profile": "default-hvm", |
1398 | + "public-hostname": "ec2-18-224-22-43.us-east-2.compute.amazonaws.com", |
1399 | + "public-ipv4": "18.224.22.43", |
1400 | + "public-keys": { |
1401 | + "cloud-init-integration": [ |
1402 | + "ssh-rsa |
1403 | + AAAAB3NzaC1yc2EAAAADAQABAAABAQDSL7uWGj8cgWyIOaspgKdVy0cKJ+UTjfv7jBOjG2H/GN8bJVXy72XAvnhM0dUM+CCs8FOf0YlPX+Frvz2hKInrmRhZVwRSL129PasD12MlI3l44u6IwS1o/W86Q+tkQYEljtqDOo0a+cOsaZkvUNzUyEXUwz/lmYa6G4hMKZH4NBj7nbAAF96wsMCoyNwbWryBnDYUr6wMbjRR1J9Pw7Xh7WRC73wy4Va2YuOgbD3V/5ZrFPLbWZW/7TFXVrql04QVbyei4aiFR5n//GvoqwQDNe58LmbzX/xvxyKJYdny2zXmdAhMxbrpFQsfpkJ9E/H5w0yOdSvnWbUoG5xNGoOB |
1404 | + cloud-init-integration" |
1405 | + ] |
1406 | + }, |
1407 | + "reservation-id": "r-06ab75e9346f54333", |
1408 | + "security-groups": "Cloud-init integration test secgroup", |
1409 | + "services": { |
1410 | + "domain": "amazonaws.com", |
1411 | + "partition": "aws" |
1412 | + } |
1413 | + } |
1414 | + }, |
1415 | + "v1": { |
1416 | + "availability-zone": "us-east-2b", |
1417 | + "availability_zone": "us-east-2b", |
1418 | + "cloud-name": "aws", |
1419 | + "cloud_name": "aws", |
1420 | + "instance-id": "i-04fa31cfc55aa7976", |
1421 | + "instance_id": "i-04fa31cfc55aa7976", |
1422 | + "local-hostname": "ip-10-41-41-70", |
1423 | + "local_hostname": "ip-10-41-41-70", |
1424 | + "region": "us-east-2" |
1425 | + } |
1426 | + } |
1427 | + |
1428 | + |
1429 | +Using instance-data |
1430 | +=================== |
1431 | + |
1432 | +As of cloud-init v. 18.4, any variables present in |
1433 | +``/run/cloud-init/instance-data.json`` can be used in: |
1434 | + |
1435 | +* User-data scripts |
1436 | +* Cloud config data |
1437 | +* Command line interface via **cloud-init query** or |
1438 | + **cloud-init devel render** |
1439 | + |
1440 | +Many clouds allow users to provide user-data to an instance at |
1441 | +the time the instance is launched. Cloud-init supports a number of |
1442 | +:ref:`user_data_formats`. |
1443 | + |
1444 | +Both user-data scripts and **#cloud-config** data support jinja template |
1445 | +rendering. |
1446 | +When the first line of the provided user-data begins with |
1447 | +**## template: jinja**, cloud-init will use jinja to render that file. |
1448 | +Any instance-data-sensitive.json variables are surfaced as dot-delimited |
1449 | +jinja template variables because cloud-config modules are run as 'root' |
1450 | +user. |
1451 | + |
1452 | + |
1453 | +Below are some examples of providing these types of user-data: |
1454 | + |
1455 | +* Cloud config calling home with the ec2 public hostname and availability-zone |
1456 | + |
1457 | +.. code-block:: shell-session |
1458 | + |
1459 | + ## template: jinja |
1460 | + #cloud-config |
1461 | + runcmd: |
1462 | + - echo 'EC2 public hostname allocated to instance: {{ |
1463 | + ds.meta_data.public_hostname }}' > /tmp/instance_metadata |
1464 | + - echo 'EC2 availability zone: {{ v1.availability_zone }}' >> |
1465 | + /tmp/instance_metadata |
1466 | + - curl -X POST -d '{"hostname": "{{ds.meta_data.public_hostname }}", |
1467 | + "availability-zone": "{{ v1.availability_zone }}"}' |
1468 | + https://example.com |
1469 | + |
1470 | +* Custom user-data script performing different operations based on region |
1471 | + |
1472 | +.. code-block:: shell-session |
1473 | + |
1474 | + ## template: jinja |
1475 | + #!/bin/bash |
1476 | + {% if v1.region == 'us-east-2' -%} |
1477 | + echo 'Installing custom proxies for {{ v1.region }}' |
1478 | + sudo apt-get install my-xtra-fast-stack |
1479 | + {%- endif %} |
1480 | + ... |
1481 | + |
1482 | +.. note:: |
1483 | + Trying to reference jinja variables that don't exist in |
1484 | + instance-data.json will result in warnings in ``/var/log/cloud-init.log`` |
1485 | + and the following string in your rendered user-data: |
1486 | + ``CI_MISSING_JINJA_VAR/<your_varname>``. |
1487 | + |
1488 | +Cloud-init also surfaces a commandline tool **cloud-init query** which can |
1489 | +assist developers or scripts with obtaining instance metadata easily. See |
1490 | +:ref:`cli_query` for more information. |
1491 | + |
1492 | +To cut down on keystrokes on the command line, cloud-init also provides |
1493 | +top-level key aliases for any standardized ``v#`` keys present. The preceding |
1494 | +``v1`` is not required of ``v1.var_name``. These aliases will represent the |
1495 | +value of the highest versioned standard key. For example, ``cloud_name`` |
1496 | +value will be ``v2.cloud_name`` if both ``v1`` and ``v2`` keys are present in |
1497 | +instance-data.json. |
1498 | +The **query** command also publishes ``userdata`` and ``vendordata`` keys to |
1499 | +the root user which will contain the decoded user and vendor data provided to |
1500 | +this instance. Non-root users referencing userdata or vendordata keys will |
1501 | +see only redacted values. |
1502 | + |
1503 | +.. code-block:: shell-session |
1504 | + |
1505 | + # List all top-level instance-data keys available |
1506 | + % cloud-init query --list-keys |
1507 | + |
1508 | + # Find your EC2 ami-id |
1509 | + % cloud-init query ds.meta_data.ami_id |
1510 | + |
1511 | + # Format your cloud_name and region using jinja template syntax |
1512 | + % cloud-init query --format 'cloud: {{ v1.cloud_name }} myregion: {{ |
1513 | + v1.region }}' |
1514 | + |
1515 | +.. note:: |
1516 | + To save time designing a user-data template for a specific cloud's |
1517 | + instance-data.json, use the 'render' cloud-init command on an |
1518 | + instance booted on your favorite cloud. See :ref:`cli_devel` for more |
1519 | + information. |
1520 | + |
1521 | +.. vi: textwidth=78 |
1522 | diff --git a/integration-requirements.txt b/integration-requirements.txt |
1523 | index f80cb94..880d988 100644 |
1524 | --- a/integration-requirements.txt |
1525 | +++ b/integration-requirements.txt |
1526 | @@ -5,16 +5,17 @@ |
1527 | # the packages/pkg-deps.json file as well. |
1528 | # |
1529 | |
1530 | +unittest2 |
1531 | # ec2 backend |
1532 | boto3==1.5.9 |
1533 | |
1534 | # ssh communication |
1535 | paramiko==2.4.1 |
1536 | |
1537 | + |
1538 | # lxd backend |
1539 | # 04/03/2018: enables use of lxd 3.0 |
1540 | git+https://github.com/lxc/pylxd.git@4b8ab1802f9aee4eb29cf7b119dae0aa47150779 |
1541 | |
1542 | - |
1543 | # finds latest image information |
1544 | git+https://git.launchpad.net/simplestreams |
1545 | diff --git a/tests/cloud_tests/testcases/base.py b/tests/cloud_tests/testcases/base.py |
1546 | index 2745827..c545796 100644 |
1547 | --- a/tests/cloud_tests/testcases/base.py |
1548 | +++ b/tests/cloud_tests/testcases/base.py |
1549 | @@ -5,15 +5,15 @@ |
1550 | import crypt |
1551 | import json |
1552 | import re |
1553 | -import unittest |
1554 | +import unittest2 |
1555 | |
1556 | |
1557 | from cloudinit import util as c_util |
1558 | |
1559 | -SkipTest = unittest.SkipTest |
1560 | +SkipTest = unittest2.SkipTest |
1561 | |
1562 | |
1563 | -class CloudTestCase(unittest.TestCase): |
1564 | +class CloudTestCase(unittest2.TestCase): |
1565 | """Base test class for verifiers.""" |
1566 | |
1567 | # data gets populated in get_suite.setUpClass |
1568 | @@ -167,8 +167,9 @@ class CloudTestCase(unittest.TestCase): |
1569 | 'Skipping instance-data.json test.' |
1570 | ' OS: %s not bionic or newer' % self.os_name) |
1571 | instance_data = json.loads(out) |
1572 | - self.assertEqual( |
1573 | - ['ds/user_data'], instance_data['base64_encoded_keys']) |
1574 | + self.assertItemsEqual( |
1575 | + [], |
1576 | + instance_data['base64_encoded_keys']) |
1577 | ds = instance_data.get('ds', {}) |
1578 | v1_data = instance_data.get('v1', {}) |
1579 | metadata = ds.get('meta-data', {}) |
1580 | @@ -187,10 +188,10 @@ class CloudTestCase(unittest.TestCase): |
1581 | metadata.get('placement', {}).get('availability-zone'), |
1582 | 'Could not determine EC2 Availability zone placement') |
1583 | self.assertIsNotNone( |
1584 | - v1_data['availability-zone'], 'expected ec2 availability-zone') |
1585 | - self.assertEqual('aws', v1_data['cloud-name']) |
1586 | - self.assertIn('i-', v1_data['instance-id']) |
1587 | - self.assertIn('ip-', v1_data['local-hostname']) |
1588 | + v1_data['availability_zone'], 'expected ec2 availability_zone') |
1589 | + self.assertEqual('aws', v1_data['cloud_name']) |
1590 | + self.assertIn('i-', v1_data['instance_id']) |
1591 | + self.assertIn('ip-', v1_data['local_hostname']) |
1592 | self.assertIsNotNone(v1_data['region'], 'expected ec2 region') |
1593 | |
1594 | def test_instance_data_json_lxd(self): |
1595 | @@ -213,16 +214,14 @@ class CloudTestCase(unittest.TestCase): |
1596 | ' OS: %s not bionic or newer' % self.os_name) |
1597 | instance_data = json.loads(out) |
1598 | v1_data = instance_data.get('v1', {}) |
1599 | - self.assertEqual( |
1600 | - ['ds/user_data', 'ds/vendor_data'], |
1601 | - sorted(instance_data['base64_encoded_keys'])) |
1602 | - self.assertEqual('nocloud', v1_data['cloud-name']) |
1603 | + self.assertItemsEqual([], sorted(instance_data['base64_encoded_keys'])) |
1604 | + self.assertEqual('nocloud', v1_data['cloud_name']) |
1605 | self.assertIsNone( |
1606 | - v1_data['availability-zone'], |
1607 | - 'found unexpected lxd availability-zone %s' % |
1608 | - v1_data['availability-zone']) |
1609 | - self.assertIn('cloud-test', v1_data['instance-id']) |
1610 | - self.assertIn('cloud-test', v1_data['local-hostname']) |
1611 | + v1_data['availability_zone'], |
1612 | + 'found unexpected lxd availability_zone %s' % |
1613 | + v1_data['availability_zone']) |
1614 | + self.assertIn('cloud-test', v1_data['instance_id']) |
1615 | + self.assertIn('cloud-test', v1_data['local_hostname']) |
1616 | self.assertIsNone( |
1617 | v1_data['region'], |
1618 | 'found unexpected lxd region %s' % v1_data['region']) |
1619 | @@ -248,18 +247,17 @@ class CloudTestCase(unittest.TestCase): |
1620 | ' OS: %s not bionic or newer' % self.os_name) |
1621 | instance_data = json.loads(out) |
1622 | v1_data = instance_data.get('v1', {}) |
1623 | - self.assertEqual( |
1624 | - ['ds/user_data'], instance_data['base64_encoded_keys']) |
1625 | - self.assertEqual('nocloud', v1_data['cloud-name']) |
1626 | + self.assertItemsEqual([], instance_data['base64_encoded_keys']) |
1627 | + self.assertEqual('nocloud', v1_data['cloud_name']) |
1628 | self.assertIsNone( |
1629 | - v1_data['availability-zone'], |
1630 | - 'found unexpected kvm availability-zone %s' % |
1631 | - v1_data['availability-zone']) |
1632 | + v1_data['availability_zone'], |
1633 | + 'found unexpected kvm availability_zone %s' % |
1634 | + v1_data['availability_zone']) |
1635 | self.assertIsNotNone( |
1636 | re.match(r'[\da-f]{8}(-[\da-f]{4}){3}-[\da-f]{12}', |
1637 | - v1_data['instance-id']), |
1638 | - 'kvm instance-id is not a UUID: %s' % v1_data['instance-id']) |
1639 | - self.assertIn('ubuntu', v1_data['local-hostname']) |
1640 | + v1_data['instance_id']), |
1641 | + 'kvm instance_id is not a UUID: %s' % v1_data['instance_id']) |
1642 | + self.assertIn('ubuntu', v1_data['local_hostname']) |
1643 | self.assertIsNone( |
1644 | v1_data['region'], |
1645 | 'found unexpected lxd region %s' % v1_data['region']) |
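The instancedata.rst text in the diff above says that ``base64_encoded_keys`` lists forward-slash delimited key paths whose values must be base64-decoded to recover the original value. A minimal Python sketch of how a script might do that is shown below. It is an illustration only, not code from this branch: the helper name ``decode_base64_keys`` and the sample data are hypothetical, and it assumes the encoded values are plain base64 strings holding UTF-8 text.

```python
import base64
import json


def decode_base64_keys(instance_data):
    """Return a copy of instance_data with base64-encoded values decoded.

    Hypothetical helper: walks each forward-slash delimited path listed
    under 'base64_encoded_keys' and replaces the value found there.
    """
    # Round-trip through json for a simple deep copy of JSON-safe data.
    decoded = json.loads(json.dumps(instance_data))
    for key_path in decoded.get('base64_encoded_keys', []):
        *parents, leaf = key_path.split('/')
        node = decoded
        for part in parents:
            node = node[part]
        # Assumption: the original value was UTF-8 text; undecodable
        # bytes are replaced rather than raising an error.
        raw = base64.b64decode(node[leaf])
        node[leaf] = raw.decode('utf-8', errors='replace')
    return decoded


# Made-up sample shaped like the documented top-level keys.
sample = {
    'base64_encoded_keys': ['ds/user_data'],
    'sensitive_keys': [],
    'ds': {'user_data': base64.b64encode(b'#cloud-config\n').decode('ascii')},
    'v1': {'cloud_name': 'nocloud', 'region': None},
}
result = decode_base64_keys(sample)
```

On a real instance the input would come from ``json.load`` on ``/run/cloud-init/instance-data.json`` instead of the sample dict. The helper returns a copy so the loaded data stays untouched.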
FAILED: Continuous integration, rev:44a4844725ac3a75b53f901e43697f7036b0bc42
https://jenkins.ubuntu.com/server/job/cloud-init-ci/315/
Executed test runs:
SUCCESS: Checkout
SUCCESS: Unit & Style Tests
SUCCESS: Ubuntu LTS: Build
FAILED: Ubuntu LTS: Integration
Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/315/rebuild