Merge ~samgilson/cloud-init:cloud-init-analyze-boot into cloud-init:master
- Git
- lp:~samgilson/cloud-init
- cloud-init-analyze-boot
- Merge into master
Status: | Merged |
---|---|
Approved by: | Dan Watkins |
Approved revision: | 54c7df160a48df264447727a85f684f18e2428ef |
Merge reported by: | Server Team CI bot |
Merged at revision: | not available |
Proposed branch: | ~samgilson/cloud-init:cloud-init-analyze-boot |
Merge into: | cloud-init:master |
Diff against target: |
661 lines (+546/-12) 6 files modified
cloudinit/analyze/__main__.py (+86/-2) cloudinit/analyze/show.py (+192/-10) cloudinit/analyze/tests/test_boot.py (+170/-0) doc/rtd/topics/analyze.rst (+84/-0) doc/rtd/topics/capabilities.rst (+1/-0) doc/rtd/topics/debugging.rst (+13/-0) |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Server Team CI bot | continuous-integration | Approve | |
Dan Watkins | Approve | ||
Sam Gilson (community) | Approve | ||
Chad Smith | Approve | ||
Review via email: mp+368943@code.launchpad.net |
Commit message
Cloud-init analyze module: Added ability to analyze boot events.
This branch introduces a new command line feature for cloud-init.
Currently, the cloud-init module has the capability to analyze events in
cloud-init.log in three ways: 'show', 'blame', 'dump'.
These changes add a fourth capability, called 'boot'.
Running the command 'cloud-init analyze boot' will provide the user three
timestamps.
1) Timestamp for when the kernel starts initializing.
2) Timestamp for when the kernel finishes its initialization.
3) Timestamp for when systemd activates cloud-init.
This feature enables cloud-init users to analyze different boot phases.
This would aid in debugging performance issues related
to cloud-init startup or tracking regression.
Description of the change
Chad Smith (chad.smith) wrote : | # |
Chad Smith (chad.smith) : | # |
Chad Smith (chad.smith) wrote : | # |
Marking needs information until we have an idea of how best to align cloud-init start time to something that aligns more with the timestamp in cloud-init logs.
Currently ActiveEnterTime
# cloud-init.log
2019-06-24 20:24:25,463 - util.py[DEBUG]: Cloud-init v. 19.1-11-
# systemctl properties
root@x1:~# systemctl show -p ActiveEnterTime
ActiveEnterTime
root@x1:~# systemctl show -p InactiveExitTim
InactiveExitTim
So I think that maybe the timestamp arithmetic in the branch may be suspect
Sam Gilson (samgilson) wrote : | # |
This is very interesting. I created an Ubuntu 18.04:latest VM on Azure and did the same checks as you, and here is what I got.
cat /var/log/
2019-06-24 18:59:34,735 - util.py[DEBUG]: Cloud-init v. 19.1-1-
systemctl show -p UserspaceTimestamp
UserspaceTimest
systemctl show -p ExecMainStartTi
ExecMainStartTi
systemctl show -p ActiveEnterTime
ActiveEnterTime
systemctl show -p InactiveExitTim
InactiveExitTim
cloud-init analyze boot
-- Most Recent Boot Record --
Kernel Started at: 2019-06-24 18:59:11.739511
Kernel ended boot at: 2019-06-24 18:59:22.125180
Kernel time to boot (seconds): 10.385669
Cloud-init start: 2019-06-24 18:59:30.426426
Time between Kernel boot and Cloud-init start (seconds): 8.301246
In my case, the InactiveExitTim
Sam Gilson (samgilson) wrote : | # |
Keep in mind I had created this VM earlier in the day, hence why all the timestamps are so far off the current time.
> This is very interesting. I created an Ubuntu 18.04:latest VM on Azure and did
> the same checks as you, and here is what I got.
> cat /var/log/
> 2019-06-24 18:59:34,735 - util.py[DEBUG]: Cloud-init v.
> 19.1-1-
> 18:59:34 +0000. Up 22.97 seconds.
>
> systemctl show -p UserspaceTimestamp
> UserspaceTimest
>
> systemctl show -p ExecMainStartTi
> ExecMainStartTi
>
> systemctl show -p ActiveEnterTime
> ActiveEnterTime
>
> systemctl show -p InactiveExitTim
> InactiveExitTim
>
> cloud-init analyze boot
> -- Most Recent Boot Record --
> Kernel Started at: 2019-06-24 18:59:11.739511
> Kernel ended boot at: 2019-06-24 18:59:22.125180
> Kernel time to boot (seconds): 10.385669
> Cloud-init start: 2019-06-24 18:59:30.426426
> Time between Kernel boot and Cloud-init start (seconds): 8.301246
>
> In my case, the InactiveExitTim
> init.log are the closest together (still not correct), with the ActiveEnter
> and ExecMainStart timestamps being very far off the mark. I am not really
> certain as to why the ActiveEnter and ExecMainStart are so far off in this
> case.
Sam Gilson (samgilson) wrote : | # |
Hey Chad, in your first comment, what did you mean by "on my system"? I have been testing this on Ubuntu 18.04:latest and have found that on all the VMs I booted and tried, the cloud-init boot analyze is always consistently 4 seconds behind the cloud-init.log reported time. However, when checking the timestamps, it seems to be that ActiveEnter and ExecMainStart are consistently never correct, often off by several factors of 10. However in your test, it seems that InactiveExit and ActiveEnter are the same.
Sam Gilson (samgilson) wrote : | # |
It seems that on Azure VMs that have been pre-provisioned, the ActiveEnter and ExecMainStart timestamps are incorrect due to the waiting that the pre-provisioning must do during the cloud-init-local phase of cloud-init in order to get the necessary data from Azure. Due to this, the more consistent time stamp is the InactiveExit timestamp, as it is written before any pre-provisioning occurs. However, InactiveExit is consistently 4 seconds off of the cloud-init.log timestamp.
However, when on a VM that has not been pre-provisioned, using ActiveEnter instead of InactiveExit always nets you the correct answer.
Sam Gilson (samgilson) wrote : | # |
Hey Chad, thanks for all the great help in finding these bugs. I went back through and fixed all of the ones you found, so this MP is now ready for a re-review.
Chad Smith (chad.smith) wrote : | # |
looks much better Sam, one suggestion inline to drop/adapt get_ci_
Chad Smith (chad.smith) wrote : | # |
one more on some of the formatting success/failure too
Sam Gilson (samgilson) wrote : | # |
Hey Chad, thank you for the reviews. I made all the changes you specified, specifically being dropping get_ci_
Server Team CI bot (server-team-bot) wrote : | # |
FAILED: Continuous integration, rev:8e6d3c9f176
https:/
Executed test runs:
SUCCESS: Checkout
SUCCESS: Unit & Style Tests
FAILED: Ubuntu LTS: Build
Click here to trigger a rebuild:
https:/
Server Team CI bot (server-team-bot) wrote : | # |
PASSED: Continuous integration, rev:8e6d3c9f176
https:/
Executed test runs:
SUCCESS: Checkout
SUCCESS: Unit & Style Tests
SUCCESS: Ubuntu LTS: Build
SUCCESS: Ubuntu LTS: Integration
IN_PROGRESS: Declarative: Post Actions
Click here to trigger a rebuild:
https:/
Chad Smith (chad.smith) wrote : | # |
One final round of review commments from me on this.
here's a pastebin of the changes if you think it'll work:
1. I think we need to document that we are currently working around an lxc issue in this code
with a comment to pointing to https:/
2. we probably shouldn't subshell with subp to stat a file, but use python's os.stat
here's a pastebin of the changes I suggest:
https:/
Chad Smith (chad.smith) wrote : | # |
This looks good to me. Will talk with other upstream devs on Monday to see if we have any concerns with including this in 19.2.
Dan Watkins (oddbloke) wrote : | # |
I haven't done a full review (because I'm on a plane and about to be interrupted), but I did find some stylistic issues as well as perhaps some functional ones.
Sam Gilson (samgilson) : | # |
Server Team CI bot (server-team-bot) wrote : | # |
Commit message lints:
- Line #0 has 26 too many characters. Line starts with: "Cloud-init analyze module:"... - Line #2 has 264 too many characters. Line starts with: "This branch introduces"... - Line #6 has 139 too many characters. Line starts with: "This feature enables"...
Server Team CI bot (server-team-bot) : | # |
Server Team CI bot (server-team-bot) wrote : | # |
Commit message lints:
- Line #12 has 26 too many characters. Line starts with: "This would aid in debugging"...
Server Team CI bot (server-team-bot) : | # |
Preview Diff
1 | diff --git a/cloudinit/analyze/__main__.py b/cloudinit/analyze/__main__.py | |||
2 | index f861365..99e5c20 100644 | |||
3 | --- a/cloudinit/analyze/__main__.py | |||
4 | +++ b/cloudinit/analyze/__main__.py | |||
5 | @@ -7,7 +7,7 @@ import re | |||
6 | 7 | import sys | 7 | import sys |
7 | 8 | 8 | ||
8 | 9 | from cloudinit.util import json_dumps | 9 | from cloudinit.util import json_dumps |
10 | 10 | 10 | from datetime import datetime | |
11 | 11 | from . import dump | 11 | from . import dump |
12 | 12 | from . import show | 12 | from . import show |
13 | 13 | 13 | ||
14 | @@ -52,9 +52,93 @@ def get_parser(parser=None): | |||
15 | 52 | dest='outfile', default='-', | 52 | dest='outfile', default='-', |
16 | 53 | help='specify where to write output. ') | 53 | help='specify where to write output. ') |
17 | 54 | parser_dump.set_defaults(action=('dump', analyze_dump)) | 54 | parser_dump.set_defaults(action=('dump', analyze_dump)) |
18 | 55 | parser_boot = subparsers.add_parser( | ||
19 | 56 | 'boot', help='Print list of boot times for kernel and cloud-init') | ||
20 | 57 | parser_boot.add_argument('-i', '--infile', action='store', | ||
21 | 58 | dest='infile', default='/var/log/cloud-init.log', | ||
22 | 59 | help='specify where to read input. ') | ||
23 | 60 | parser_boot.add_argument('-o', '--outfile', action='store', | ||
24 | 61 | dest='outfile', default='-', | ||
25 | 62 | help='specify where to write output.') | ||
26 | 63 | parser_boot.set_defaults(action=('boot', analyze_boot)) | ||
27 | 55 | return parser | 64 | return parser |
28 | 56 | 65 | ||
29 | 57 | 66 | ||
30 | 67 | def analyze_boot(name, args): | ||
31 | 68 | """Report a list of how long different boot operations took. | ||
32 | 69 | |||
33 | 70 | For Example: | ||
34 | 71 | -- Most Recent Boot Record -- | ||
35 | 72 | Kernel Started at: <time> | ||
36 | 73 | Kernel ended boot at: <time> | ||
37 | 74 | Kernel time to boot (seconds): <time> | ||
38 | 75 | Cloud-init activated by systemd at: <time> | ||
39 | 76 | Time between Kernel end boot and Cloud-init activation (seconds):<time> | ||
40 | 77 | Cloud-init start: <time> | ||
41 | 78 | """ | ||
42 | 79 | infh, outfh = configure_io(args) | ||
43 | 80 | kernel_info = show.dist_check_timestamp() | ||
44 | 81 | status_code, kernel_start, kernel_end, ci_sysd_start = \ | ||
45 | 82 | kernel_info | ||
46 | 83 | kernel_start_timestamp = datetime.utcfromtimestamp(kernel_start) | ||
47 | 84 | kernel_end_timestamp = datetime.utcfromtimestamp(kernel_end) | ||
48 | 85 | ci_sysd_start_timestamp = datetime.utcfromtimestamp(ci_sysd_start) | ||
49 | 86 | try: | ||
50 | 87 | last_init_local = \ | ||
51 | 88 | [e for e in _get_events(infh) if e['name'] == 'init-local' and | ||
52 | 89 | 'starting search' in e['description']][-1] | ||
53 | 90 | ci_start = datetime.utcfromtimestamp(last_init_local['timestamp']) | ||
54 | 91 | except IndexError: | ||
55 | 92 | ci_start = 'Could not find init-local log-line in cloud-init.log' | ||
56 | 93 | status_code = show.FAIL_CODE | ||
57 | 94 | |||
58 | 95 | FAILURE_MSG = 'Your Linux distro or container does not support this ' \ | ||
59 | 96 | 'functionality.\n' \ | ||
60 | 97 | 'You must be running a Kernel Telemetry supported ' \ | ||
61 | 98 | 'distro.\nPlease check ' \ | ||
62 | 99 | 'https://cloudinit.readthedocs.io/en/latest' \ | ||
63 | 100 | '/topics/analyze.html for more ' \ | ||
64 | 101 | 'information on supported distros.\n' | ||
65 | 102 | |||
66 | 103 | SUCCESS_MSG = '-- Most Recent Boot Record --\n' \ | ||
67 | 104 | ' Kernel Started at: {k_s_t}\n' \ | ||
68 | 105 | ' Kernel ended boot at: {k_e_t}\n' \ | ||
69 | 106 | ' Kernel time to boot (seconds): {k_r}\n' \ | ||
70 | 107 | ' Cloud-init activated by systemd at: {ci_sysd_t}\n' \ | ||
71 | 108 | ' Time between Kernel end boot and Cloud-init ' \ | ||
72 | 109 | 'activation (seconds): {bt_r}\n' \ | ||
73 | 110 | ' Cloud-init start: {ci_start}\n' | ||
74 | 111 | |||
75 | 112 | CONTAINER_MSG = '-- Most Recent Container Boot Record --\n' \ | ||
76 | 113 | ' Container started at: {k_s_t}\n' \ | ||
77 | 114 | ' Cloud-init activated by systemd at: {ci_sysd_t}\n' \ | ||
78 | 115 | ' Cloud-init start: {ci_start}\n' \ | ||
79 | 116 | |||
80 | 117 | status_map = { | ||
81 | 118 | show.FAIL_CODE: FAILURE_MSG, | ||
82 | 119 | show.CONTAINER_CODE: CONTAINER_MSG, | ||
83 | 120 | show.SUCCESS_CODE: SUCCESS_MSG | ||
84 | 121 | } | ||
85 | 122 | |||
86 | 123 | kernel_runtime = kernel_end - kernel_start | ||
87 | 124 | between_process_runtime = ci_sysd_start - kernel_end | ||
88 | 125 | |||
89 | 126 | kwargs = { | ||
90 | 127 | 'k_s_t': kernel_start_timestamp, | ||
91 | 128 | 'k_e_t': kernel_end_timestamp, | ||
92 | 129 | 'k_r': kernel_runtime, | ||
93 | 130 | 'bt_r': between_process_runtime, | ||
94 | 131 | 'k_e': kernel_end, | ||
95 | 132 | 'k_s': kernel_start, | ||
96 | 133 | 'ci_sysd': ci_sysd_start, | ||
97 | 134 | 'ci_sysd_t': ci_sysd_start_timestamp, | ||
98 | 135 | 'ci_start': ci_start | ||
99 | 136 | } | ||
100 | 137 | |||
101 | 138 | outfh.write(status_map[status_code].format(**kwargs)) | ||
102 | 139 | return status_code | ||
103 | 140 | |||
104 | 141 | |||
105 | 58 | def analyze_blame(name, args): | 142 | def analyze_blame(name, args): |
106 | 59 | """Report a list of records sorted by largest time delta. | 143 | """Report a list of records sorted by largest time delta. |
107 | 60 | 144 | ||
108 | @@ -119,7 +203,7 @@ def analyze_dump(name, args): | |||
109 | 119 | 203 | ||
110 | 120 | def _get_events(infile): | 204 | def _get_events(infile): |
111 | 121 | rawdata = None | 205 | rawdata = None |
113 | 122 | events, rawdata = show.load_events(infile, None) | 206 | events, rawdata = show.load_events_infile(infile) |
114 | 123 | if not events: | 207 | if not events: |
115 | 124 | events, _ = dump.dump_events(rawdata=rawdata) | 208 | events, _ = dump.dump_events(rawdata=rawdata) |
116 | 125 | return events | 209 | return events |
117 | diff --git a/cloudinit/analyze/show.py b/cloudinit/analyze/show.py | |||
118 | index 3e778b8..511b808 100644 | |||
119 | --- a/cloudinit/analyze/show.py | |||
120 | +++ b/cloudinit/analyze/show.py | |||
121 | @@ -8,8 +8,11 @@ import base64 | |||
122 | 8 | import datetime | 8 | import datetime |
123 | 9 | import json | 9 | import json |
124 | 10 | import os | 10 | import os |
125 | 11 | import time | ||
126 | 12 | import sys | ||
127 | 11 | 13 | ||
128 | 12 | from cloudinit import util | 14 | from cloudinit import util |
129 | 15 | from cloudinit.distros import uses_systemd | ||
130 | 13 | 16 | ||
131 | 14 | # An event: | 17 | # An event: |
132 | 15 | ''' | 18 | ''' |
133 | @@ -49,6 +52,10 @@ format_key = { | |||
134 | 49 | 52 | ||
135 | 50 | formatting_help = " ".join(["{0}: {1}".format(k.replace('%', '%%'), v) | 53 | formatting_help = " ".join(["{0}: {1}".format(k.replace('%', '%%'), v) |
136 | 51 | for k, v in format_key.items()]) | 54 | for k, v in format_key.items()]) |
137 | 55 | SUCCESS_CODE = 'successful' | ||
138 | 56 | FAIL_CODE = 'failure' | ||
139 | 57 | CONTAINER_CODE = 'container' | ||
140 | 58 | TIMESTAMP_UNKNOWN = (FAIL_CODE, -1, -1, -1) | ||
141 | 52 | 59 | ||
142 | 53 | 60 | ||
143 | 54 | def format_record(msg, event): | 61 | def format_record(msg, event): |
144 | @@ -125,9 +132,175 @@ def total_time_record(total_time): | |||
145 | 125 | return 'Total Time: %3.5f seconds\n' % total_time | 132 | return 'Total Time: %3.5f seconds\n' % total_time |
146 | 126 | 133 | ||
147 | 127 | 134 | ||
148 | 135 | class SystemctlReader(object): | ||
149 | 136 | ''' | ||
150 | 137 | Class for dealing with all systemctl subp calls in a consistent manner. | ||
151 | 138 | ''' | ||
152 | 139 | def __init__(self, property, parameter=None): | ||
153 | 140 | self.epoch = None | ||
154 | 141 | self.args = ['/bin/systemctl', 'show'] | ||
155 | 142 | if parameter: | ||
156 | 143 | self.args.append(parameter) | ||
157 | 144 | self.args.extend(['-p', property]) | ||
158 | 145 | # Don't want the init of our object to break. Instead of throwing | ||
159 | 146 | # an exception, set an error code that gets checked when data is | ||
160 | 147 | # requested from the object | ||
161 | 148 | self.failure = self.subp() | ||
162 | 149 | |||
163 | 150 | def subp(self): | ||
164 | 151 | ''' | ||
165 | 152 | Make a subp call based on set args and handle errors by setting | ||
166 | 153 | failure code | ||
167 | 154 | |||
168 | 155 | :return: whether the subp call failed or not | ||
169 | 156 | ''' | ||
170 | 157 | try: | ||
171 | 158 | value, err = util.subp(self.args, capture=True) | ||
172 | 159 | if err: | ||
173 | 160 | return err | ||
174 | 161 | self.epoch = value | ||
175 | 162 | return None | ||
176 | 163 | except Exception as systemctl_fail: | ||
177 | 164 | return systemctl_fail | ||
178 | 165 | |||
179 | 166 | def parse_epoch_as_float(self): | ||
180 | 167 | ''' | ||
181 | 168 | If subp call succeeded, return the timestamp from subp as a float. | ||
182 | 169 | |||
183 | 170 | :return: timestamp as a float | ||
184 | 171 | ''' | ||
185 | 172 | # subp has 2 ways to fail: it either fails and throws an exception, | ||
186 | 173 | # or returns an error code. Raise an exception here in order to make | ||
187 | 174 | # sure both scenarios throw exceptions | ||
188 | 175 | if self.failure: | ||
189 | 176 | raise RuntimeError('Subprocess call to systemctl has failed, ' | ||
190 | 177 | 'returning error code ({})' | ||
191 | 178 | .format(self.failure)) | ||
192 | 179 | # Output from systemctl show has the format Property=Value. | ||
193 | 180 | # For example, UserspaceMonotonic=1929304 | ||
194 | 181 | timestamp = self.epoch.split('=')[1] | ||
195 | 182 | # Timestamps reported by systemctl are in microseconds, converting | ||
196 | 183 | return float(timestamp) / 1000000 | ||
197 | 184 | |||
198 | 185 | |||
199 | 186 | def dist_check_timestamp(): | ||
200 | 187 | ''' | ||
201 | 188 | Determine which init system a particular linux distro is using. | ||
202 | 189 | Each init system (systemd, upstart, etc) has a different way of | ||
203 | 190 | providing timestamps. | ||
204 | 191 | |||
205 | 192 | :return: timestamps of kernelboot, kernelendboot, and cloud-initstart | ||
206 | 193 | or TIMESTAMP_UNKNOWN if the timestamps cannot be retrieved. | ||
207 | 194 | ''' | ||
208 | 195 | |||
209 | 196 | if uses_systemd(): | ||
210 | 197 | return gather_timestamps_using_systemd() | ||
211 | 198 | |||
212 | 199 | # Use dmesg to get timestamps if the distro does not have systemd | ||
213 | 200 | if util.is_FreeBSD() or 'gentoo' in \ | ||
214 | 201 | util.system_info()['system'].lower(): | ||
215 | 202 | return gather_timestamps_using_dmesg() | ||
216 | 203 | |||
217 | 204 | # this distro doesn't fit anything that is supported by cloud-init. just | ||
218 | 205 | # return error codes | ||
219 | 206 | return TIMESTAMP_UNKNOWN | ||
220 | 207 | |||
221 | 208 | |||
222 | 209 | def gather_timestamps_using_dmesg(): | ||
223 | 210 | ''' | ||
224 | 211 | Gather timestamps that corresponds to kernel begin initialization, | ||
225 | 212 | kernel finish initialization using dmesg as opposed to systemctl | ||
226 | 213 | |||
227 | 214 | :return: the two timestamps plus a dummy timestamp to keep consistency | ||
228 | 215 | with gather_timestamps_using_systemd | ||
229 | 216 | ''' | ||
230 | 217 | try: | ||
231 | 218 | data, _ = util.subp(['dmesg'], capture=True) | ||
232 | 219 | split_entries = data[0].splitlines() | ||
233 | 220 | for i in split_entries: | ||
234 | 221 | if i.decode('UTF-8').find('user') != -1: | ||
235 | 222 | splitup = i.decode('UTF-8').split() | ||
236 | 223 | stripped = splitup[1].strip(']') | ||
237 | 224 | |||
238 | 225 | # kernel timestamp from dmesg is equal to 0, | ||
239 | 226 | # with the userspace timestamp relative to it. | ||
240 | 227 | user_space_timestamp = float(stripped) | ||
241 | 228 | kernel_start = float(time.time()) - float(util.uptime()) | ||
242 | 229 | kernel_end = kernel_start + user_space_timestamp | ||
243 | 230 | |||
244 | 231 | # systemd wont start cloud-init in this case, | ||
245 | 232 | # so we cannot get that timestamp | ||
246 | 233 | return SUCCESS_CODE, kernel_start, kernel_end, \ | ||
247 | 234 | kernel_end | ||
248 | 235 | |||
249 | 236 | except Exception: | ||
250 | 237 | pass | ||
251 | 238 | return TIMESTAMP_UNKNOWN | ||
252 | 239 | |||
253 | 240 | |||
254 | 241 | def gather_timestamps_using_systemd(): | ||
255 | 242 | ''' | ||
256 | 243 | Gather timestamps that corresponds to kernel begin initialization, | ||
257 | 244 | kernel finish initialization. and cloud-init systemd unit activation | ||
258 | 245 | |||
259 | 246 | :return: the three timestamps | ||
260 | 247 | ''' | ||
261 | 248 | kernel_start = float(time.time()) - float(util.uptime()) | ||
262 | 249 | try: | ||
263 | 250 | delta_k_end = SystemctlReader('UserspaceTimestampMonotonic')\ | ||
264 | 251 | .parse_epoch_as_float() | ||
265 | 252 | delta_ci_s = SystemctlReader('InactiveExitTimestampMonotonic', | ||
266 | 253 | 'cloud-init-local').parse_epoch_as_float() | ||
267 | 254 | base_time = kernel_start | ||
268 | 255 | status = SUCCESS_CODE | ||
269 | 256 | # lxc based containers do not set their monotonic zero point to be when | ||
270 | 257 | # the container starts, instead keep using host boot as zero point | ||
271 | 258 | # time.CLOCK_MONOTONIC_RAW is only available in python 3.3 | ||
272 | 259 | if util.is_container(): | ||
273 | 260 | # clock.monotonic also uses host boot as zero point | ||
274 | 261 | if sys.version_info >= (3, 3): | ||
275 | 262 | base_time = float(time.time()) - float(time.monotonic()) | ||
276 | 263 | # TODO: lxcfs automatically truncates /proc/uptime to seconds | ||
277 | 264 | # in containers when https://github.com/lxc/lxcfs/issues/292 | ||
278 | 265 | # is fixed, util.uptime() should be used instead of stat on | ||
279 | 266 | try: | ||
280 | 267 | file_stat = os.stat('/proc/1/cmdline') | ||
281 | 268 | kernel_start = file_stat.st_atime | ||
282 | 269 | except OSError as err: | ||
283 | 270 | raise RuntimeError('Could not determine container boot ' | ||
284 | 271 | 'time from /proc/1/cmdline. ({})' | ||
285 | 272 | .format(err)) | ||
286 | 273 | status = CONTAINER_CODE | ||
287 | 274 | else: | ||
288 | 275 | status = FAIL_CODE | ||
289 | 276 | kernel_end = base_time + delta_k_end | ||
290 | 277 | cloudinit_sysd = base_time + delta_ci_s | ||
291 | 278 | |||
292 | 279 | except Exception as e: | ||
293 | 280 | # Except ALL exceptions as Systemctl reader can throw many different | ||
294 | 281 | # errors, but any failure in systemctl means that timestamps cannot be | ||
295 | 282 | # obtained | ||
296 | 283 | print(e) | ||
297 | 284 | return TIMESTAMP_UNKNOWN | ||
298 | 285 | return status, kernel_start, kernel_end, cloudinit_sysd | ||
299 | 286 | |||
300 | 287 | |||
301 | 128 | def generate_records(events, blame_sort=False, | 288 | def generate_records(events, blame_sort=False, |
302 | 129 | print_format="(%n) %d seconds in %I%D", | 289 | print_format="(%n) %d seconds in %I%D", |
303 | 130 | dump_files=False, log_datafiles=False): | 290 | dump_files=False, log_datafiles=False): |
304 | 291 | ''' | ||
305 | 292 | Take in raw events and create parent-child dependencies between events | ||
306 | 293 | in order to order events in chronological order. | ||
307 | 294 | |||
308 | 295 | :param events: JSONs from dump that represents events taken from logs | ||
309 | 296 | :param blame_sort: whether to sort by timestamp or by time taken. | ||
310 | 297 | :param print_format: formatting to represent event, time stamp, | ||
311 | 298 | and time taken by the event in one line | ||
312 | 299 | :param dump_files: whether to dump files into JSONs | ||
313 | 300 | :param log_datafiles: whether or not to log events generated | ||
314 | 301 | |||
315 | 302 | :return: boot records ordered chronologically | ||
316 | 303 | ''' | ||
317 | 131 | 304 | ||
318 | 132 | sorted_events = sorted(events, key=lambda x: x['timestamp']) | 305 | sorted_events = sorted(events, key=lambda x: x['timestamp']) |
319 | 133 | records = [] | 306 | records = [] |
320 | @@ -189,19 +362,28 @@ def generate_records(events, blame_sort=False, | |||
321 | 189 | 362 | ||
322 | 190 | 363 | ||
323 | 191 | def show_events(events, print_format): | 364 | def show_events(events, print_format): |
324 | 365 | ''' | ||
325 | 366 | A passthrough method that makes it easier to call generate_records() | ||
326 | 367 | |||
327 | 368 | :param events: JSONs from dump that represents events taken from logs | ||
328 | 369 | :param print_format: formatting to represent event, time stamp, | ||
329 | 370 | and time taken by the event in one line | ||
330 | 371 | |||
331 | 372 | :return: boot records ordered chronologically | ||
332 | 373 | ''' | ||
333 | 192 | return generate_records(events, print_format=print_format) | 374 | return generate_records(events, print_format=print_format) |
334 | 193 | 375 | ||
335 | 194 | 376 | ||
341 | 195 | def load_events(infile, rawdata=None): | 377 | def load_events_infile(infile): |
342 | 196 | if rawdata: | 378 | ''' |
343 | 197 | data = rawdata.read() | 379 | Takes in a log file, read it, and convert to json. |
344 | 198 | else: | 380 | |
345 | 199 | data = infile.read() | 381 | :param infile: The Log file to be read |
346 | 200 | 382 | ||
348 | 201 | j = None | 383 | :return: json version of logfile, raw file |
349 | 384 | ''' | ||
350 | 385 | data = infile.read() | ||
351 | 202 | try: | 386 | try: |
353 | 203 | j = json.loads(data) | 387 | return json.loads(data), data |
354 | 204 | except ValueError: | 388 | except ValueError: |
358 | 205 | pass | 389 | return None, data |
356 | 206 | |||
357 | 207 | return j, data | ||
359 | diff --git a/cloudinit/analyze/tests/test_boot.py b/cloudinit/analyze/tests/test_boot.py | |||
360 | 208 | new file mode 100644 | 390 | new file mode 100644 |
361 | index 0000000..706e2cc | |||
362 | --- /dev/null | |||
363 | +++ b/cloudinit/analyze/tests/test_boot.py | |||
364 | @@ -0,0 +1,170 @@ | |||
365 | 1 | import os | ||
366 | 2 | from cloudinit.analyze.__main__ import (analyze_boot, get_parser) | ||
367 | 3 | from cloudinit.tests.helpers import CiTestCase, mock | ||
368 | 4 | from cloudinit.analyze.show import dist_check_timestamp, SystemctlReader, \ | ||
369 | 5 | FAIL_CODE, CONTAINER_CODE | ||
370 | 6 | |||
371 | 7 | err_code = (FAIL_CODE, -1, -1, -1) | ||
372 | 8 | |||
373 | 9 | |||
374 | 10 | class TestDistroChecker(CiTestCase): | ||
375 | 11 | |||
376 | 12 | @mock.patch('cloudinit.util.system_info', return_value={'dist': ('', '', | ||
377 | 13 | ''), | ||
378 | 14 | 'system': ''}) | ||
379 | 15 | @mock.patch('platform.linux_distribution', return_value=('', '', '')) | ||
380 | 16 | @mock.patch('cloudinit.util.is_FreeBSD', return_value=False) | ||
381 | 17 | def test_blank_distro(self, m_sys_info, m_linux_distribution, m_free_bsd): | ||
382 | 18 | self.assertEqual(err_code, dist_check_timestamp()) | ||
383 | 19 | |||
384 | 20 | @mock.patch('cloudinit.util.system_info', return_value={'dist': ('', '', | ||
385 | 21 | '')}) | ||
386 | 22 | @mock.patch('platform.linux_distribution', return_value=('', '', '')) | ||
387 | 23 | @mock.patch('cloudinit.util.is_FreeBSD', return_value=True) | ||
388 | 24 | def test_freebsd_gentoo_cant_find(self, m_sys_info, | ||
389 | 25 | m_linux_distribution, m_is_FreeBSD): | ||
390 | 26 | self.assertEqual(err_code, dist_check_timestamp()) | ||
391 | 27 | |||
392 | 28 | @mock.patch('cloudinit.util.subp', return_value=(0, 1)) | ||
393 | 29 | def test_subp_fails(self, m_subp): | ||
394 | 30 | self.assertEqual(err_code, dist_check_timestamp()) | ||
395 | 31 | |||
396 | 32 | |||
397 | 33 | class TestSystemCtlReader(CiTestCase): | ||
398 | 34 | |||
399 | 35 | def test_systemctl_invalid_property(self): | ||
400 | 36 | reader = SystemctlReader('dummyProperty') | ||
401 | 37 | with self.assertRaises(RuntimeError): | ||
402 | 38 | reader.parse_epoch_as_float() | ||
403 | 39 | |||
404 | 40 | def test_systemctl_invalid_parameter(self): | ||
405 | 41 | reader = SystemctlReader('dummyProperty', 'dummyParameter') | ||
406 | 42 | with self.assertRaises(RuntimeError): | ||
407 | 43 | reader.parse_epoch_as_float() | ||
408 | 44 | |||
409 | 45 | @mock.patch('cloudinit.util.subp', return_value=('U=1000000', None)) | ||
410 | 46 | def test_systemctl_works_correctly_threshold(self, m_subp): | ||
411 | 47 | reader = SystemctlReader('dummyProperty', 'dummyParameter') | ||
412 | 48 | self.assertEqual(1.0, reader.parse_epoch_as_float()) | ||
413 | 49 | thresh = 1.0 - reader.parse_epoch_as_float() | ||
414 | 50 | self.assertTrue(thresh < 1e-6) | ||
415 | 51 | self.assertTrue(thresh > (-1 * 1e-6)) | ||
416 | 52 | |||
417 | 53 | @mock.patch('cloudinit.util.subp', return_value=('U=0', None)) | ||
418 | 54 | def test_systemctl_succeed_zero(self, m_subp): | ||
419 | 55 | reader = SystemctlReader('dummyProperty', 'dummyParameter') | ||
420 | 56 | self.assertEqual(0.0, reader.parse_epoch_as_float()) | ||
421 | 57 | |||
422 | 58 | @mock.patch('cloudinit.util.subp', return_value=('U=1', None)) | ||
423 | 59 | def test_systemctl_succeed_distinct(self, m_subp): | ||
424 | 60 | reader = SystemctlReader('dummyProperty', 'dummyParameter') | ||
425 | 61 | val1 = reader.parse_epoch_as_float() | ||
426 | 62 | m_subp.return_value = ('U=2', None) | ||
427 | 63 | reader2 = SystemctlReader('dummyProperty', 'dummyParameter') | ||
428 | 64 | val2 = reader2.parse_epoch_as_float() | ||
429 | 65 | self.assertNotEqual(val1, val2) | ||
430 | 66 | |||
431 | 67 | @mock.patch('cloudinit.util.subp', return_value=('100', None)) | ||
432 | 68 | def test_systemctl_epoch_not_splittable(self, m_subp): | ||
433 | 69 | reader = SystemctlReader('dummyProperty', 'dummyParameter') | ||
434 | 70 | with self.assertRaises(IndexError): | ||
435 | 71 | reader.parse_epoch_as_float() | ||
436 | 72 | |||
437 | 73 | @mock.patch('cloudinit.util.subp', return_value=('U=foobar', None)) | ||
438 | 74 | def test_systemctl_cannot_convert_epoch_to_float(self, m_subp): | ||
439 | 75 | reader = SystemctlReader('dummyProperty', 'dummyParameter') | ||
440 | 76 | with self.assertRaises(ValueError): | ||
441 | 77 | reader.parse_epoch_as_float() | ||
442 | 78 | |||
443 | 79 | |||
444 | 80 | class TestAnalyzeBoot(CiTestCase): | ||
445 | 81 | |||
446 | 82 | def set_up_dummy_file_ci(self, path, log_path): | ||
447 | 83 | infh = open(path, 'w+') | ||
448 | 84 | infh.write('2019-07-08 17:40:49,601 - util.py[DEBUG]: Cloud-init v. ' | ||
449 | 85 | '19.1-1-gbaa47854-0ubuntu1~18.04.1 running \'init-local\' ' | ||
450 | 86 | 'at Mon, 08 Jul 2019 17:40:49 +0000. Up 18.84 seconds.') | ||
451 | 87 | infh.close() | ||
452 | 88 | outfh = open(log_path, 'w+') | ||
453 | 89 | outfh.close() | ||
454 | 90 | |||
455 | 91 | def set_up_dummy_file(self, path, log_path): | ||
456 | 92 | infh = open(path, 'w+') | ||
457 | 93 | infh.write('dummy data') | ||
458 | 94 | infh.close() | ||
459 | 95 | outfh = open(log_path, 'w+') | ||
460 | 96 | outfh.close() | ||
461 | 97 | |||
462 | 98 | def remove_dummy_file(self, path, log_path): | ||
463 | 99 | if os.path.isfile(path): | ||
464 | 100 | os.remove(path) | ||
465 | 101 | if os.path.isfile(log_path): | ||
466 | 102 | os.remove(log_path) | ||
467 | 103 | |||
468 | 104 | @mock.patch('cloudinit.analyze.show.dist_check_timestamp', | ||
469 | 105 | return_value=err_code) | ||
470 | 106 | def test_boot_invalid_distro(self, m_dist_check_timestamp): | ||
471 | 107 | |||
472 | 108 | path = os.path.dirname(os.path.abspath(__file__)) | ||
473 | 109 | log_path = path + '/boot-test.log' | ||
474 | 110 | path += '/dummy.log' | ||
475 | 111 | self.set_up_dummy_file(path, log_path) | ||
476 | 112 | |||
477 | 113 | parser = get_parser() | ||
478 | 114 | args = parser.parse_args(args=['boot', '-i', path, '-o', | ||
479 | 115 | log_path]) | ||
480 | 116 | name_default = '' | ||
481 | 117 | analyze_boot(name_default, args) | ||
482 | 118 | # now args have been tested, go into outfile and make sure error | ||
483 | 119 | # message is in the outfile | ||
484 | 120 | outfh = open(args.outfile, 'r') | ||
485 | 121 | data = outfh.read() | ||
486 | 122 | err_string = 'Your Linux distro or container does not support this ' \ | ||
487 | 123 | 'functionality.\nYou must be running a Kernel ' \ | ||
488 | 124 | 'Telemetry supported distro.\nPlease check ' \ | ||
489 | 125 | 'https://cloudinit.readthedocs.io/en/latest/topics' \ | ||
490 | 126 | '/analyze.html for more information on supported ' \ | ||
491 | 127 | 'distros.\n' | ||
492 | 128 | |||
493 | 129 | self.remove_dummy_file(path, log_path) | ||
494 | 130 | self.assertEqual(err_string, data) | ||
495 | 131 | |||
496 | 132 | @mock.patch("cloudinit.util.is_container", return_value=True) | ||
497 | 133 | @mock.patch('cloudinit.util.subp', return_value=('U=1000000', None)) | ||
498 | 134 | def test_container_no_ci_log_line(self, m_is_container, m_subp): | ||
499 | 135 | path = os.path.dirname(os.path.abspath(__file__)) | ||
500 | 136 | log_path = path + '/boot-test.log' | ||
501 | 137 | path += '/dummy.log' | ||
502 | 138 | self.set_up_dummy_file(path, log_path) | ||
503 | 139 | |||
504 | 140 | parser = get_parser() | ||
505 | 141 | args = parser.parse_args(args=['boot', '-i', path, '-o', | ||
506 | 142 | log_path]) | ||
507 | 143 | name_default = '' | ||
508 | 144 | |||
509 | 145 | finish_code = analyze_boot(name_default, args) | ||
510 | 146 | |||
511 | 147 | self.remove_dummy_file(path, log_path) | ||
512 | 148 | self.assertEqual(FAIL_CODE, finish_code) | ||
513 | 149 | |||
514 | 150 | @mock.patch("cloudinit.util.is_container", return_value=True) | ||
515 | 151 | @mock.patch('cloudinit.util.subp', return_value=('U=1000000', None)) | ||
516 | 152 | @mock.patch('cloudinit.analyze.__main__._get_events', return_value=[{ | ||
517 | 153 | 'name': 'init-local', 'description': 'starting search', 'timestamp': | ||
518 | 154 | 100000}]) | ||
519 | 155 | @mock.patch('cloudinit.analyze.show.dist_check_timestamp', | ||
520 | 156 | return_value=(CONTAINER_CODE, 1, 1, 1)) | ||
521 | 157 | def test_container_ci_log_line(self, m_is_container, m_subp, m_get, m_g): | ||
522 | 158 | path = os.path.dirname(os.path.abspath(__file__)) | ||
523 | 159 | log_path = path + '/boot-test.log' | ||
524 | 160 | path += '/dummy.log' | ||
525 | 161 | self.set_up_dummy_file_ci(path, log_path) | ||
526 | 162 | |||
527 | 163 | parser = get_parser() | ||
528 | 164 | args = parser.parse_args(args=['boot', '-i', path, '-o', | ||
529 | 165 | log_path]) | ||
530 | 166 | name_default = '' | ||
531 | 167 | finish_code = analyze_boot(name_default, args) | ||
532 | 168 | |||
533 | 169 | self.remove_dummy_file(path, log_path) | ||
534 | 170 | self.assertEqual(CONTAINER_CODE, finish_code) | ||
535 | diff --git a/doc/rtd/topics/analyze.rst b/doc/rtd/topics/analyze.rst | |||
536 | 0 | new file mode 100644 | 171 | new file mode 100644 |
537 | index 0000000..5cf38bd | |||
538 | --- /dev/null | |||
539 | +++ b/doc/rtd/topics/analyze.rst | |||
540 | @@ -0,0 +1,84 @@ | |||
541 | 1 | ************************* | ||
542 | 2 | Cloud-init Analyze Module | ||
543 | 3 | ************************* | ||
544 | 4 | |||
545 | 5 | Overview | ||
546 | 6 | ======== | ||
547 | 7 | The analyze module was added to cloud-init in order to help analyze cloud-init boot time | ||
548 | 8 | performance. It is loosely based on systemd-analyze where there are 4 main actions: | ||
549 | 9 | show, blame, dump, and boot. | ||
550 | 10 | |||
551 | 11 | The 'show' action is similar to 'systemd-analyze critical-chain' which prints a list of units, the | ||
552 | 12 | time they started and how long they took. For cloud-init, we have four stages, and within each stage | ||
553 | 13 | a number of modules may run depending on configuration. ‘cloudinit-analyze show’ will, for each | ||
554 | 14 | boot, print this information and a summary total time, per boot. | ||
555 | 15 | |||
556 | 16 | The 'blame' action matches 'systemd-analyze blame' where it prints, in descending order, | ||
557 | 17 | the units that took the longest to run. This output is highly useful for examining where cloud-init | ||
558 | 18 | is spending its time during execution. | ||
559 | 19 | |||
560 | 20 | The 'dump' action simply dumps the cloud-init logs that the analyze module is performing | ||
561 | 21 | the analysis on and returns a list of dictionaries that can be consumed for other reporting needs. | ||
562 | 22 | |||
563 | 23 | The 'boot' action prints out kernel related timestamps that are not included in any of the | ||
564 | 24 | cloud-init logs. There are three different timestamps that are presented to the user: | ||
565 | 25 | kernel start, kernel finish boot, and cloud-init start. This was added for additional | ||
566 | 26 | clarity into the boot process that cloud-init does not have control over, to aid in debugging of | ||
567 | 27 | performance issues related to cloudinit startup or tracking regression. | ||
568 | 28 | |||
569 | 29 | Usage | ||
570 | 30 | ===== | ||
571 | 31 | Using each of the printing formats is as easy as running one of the following bash commands: | ||
572 | 32 | |||
573 | 33 | .. code-block:: shell-session | ||
574 | 34 | |||
575 | 35 | cloud-init analyze show | ||
576 | 36 | cloud-init analyze blame | ||
577 | 37 | cloud-init analyze dump | ||
578 | 38 | cloud-init analyze boot | ||
579 | 39 | |||
580 | 40 | Cloud-init analyze boot Timestamp Gathering | ||
581 | 41 | =========================================== | ||
582 | 42 | The following boot related timestamps are gathered on demand when cloud-init analyze boot runs: | ||
583 | 43 | - Kernel Startup, which is inferred from system uptime | ||
584 | 44 | - Kernel Finishes Initialization, which is inferred from systemd UserSpaceMonotonicTimestamp property | ||
585 | 45 | - Cloud-init activation, which is inferred from the property InactiveExitTimestamp of the cloud-init | ||
586 | 46 | local systemd unit. | ||
587 | 47 | |||
588 | 48 | In order to gather the necessary timestamps using systemd, running the commands | ||
589 | 49 | |||
590 | 50 | .. code-block:: shell-session | ||
591 | 51 | |||
592 | 52 | systemctl show -p UserspaceTimestampMonotonic | ||
593 | 53 | systemctl show cloud-init-local -p InactiveExitTimestampMonotonic | ||
594 | 54 | |||
595 | 55 | will gather the UserspaceTimestamp and InactiveExitTimestamp. | ||
596 | 56 | The UserspaceTimestamp tracks when the init system starts, which is used as an indicator of kernel | ||
597 | 57 | finishing initialization. The InactiveExitTimestamp tracks when a particular systemd unit transitions | ||
598 | 58 | from the Inactive to Active state, which can be used to mark the beginning of systemd's activation | ||
599 | 59 | of cloud-init. | ||
600 | 60 | |||
601 | 61 | Currently this only works for distros that use systemd as the init process. We will be expanding | ||
602 | 62 | support for other distros in the future and this document will be updated accordingly. | ||
603 | 63 | |||
604 | 64 | If systemd is not present on the system, dmesg is used to attempt to find an event that logs the | ||
605 | 65 | beginning of the init system. However, with this method only the first two timestamps are able to be found; | ||
606 | 66 | dmesg does not monitor userspace processes, so no cloud-init start timestamps are emitted like when | ||
607 | 67 | using systemd. | ||
608 | 68 | |||
609 | 69 | List of Cloud-init analyze boot supported distros | ||
610 | 70 | ================================================= | ||
611 | 71 | - Arch | ||
612 | 72 | - CentOS | ||
613 | 73 | - Debian | ||
614 | 74 | - Fedora | ||
615 | 75 | - OpenSuSE | ||
616 | 76 | - Red Hat Enterprise Linux | ||
617 | 77 | - Ubuntu | ||
618 | 78 | - SUSE Linux Enterprise Server | ||
619 | 79 | - CoreOS | ||
620 | 80 | |||
621 | 81 | List of Cloud-init analyze boot unsupported distros | ||
622 | 82 | =================================================== | ||
623 | 83 | - FreeBSD | ||
624 | 84 | - Gentoo | ||
625 | 0 | \ No newline at end of file | 85 | \ No newline at end of file |
626 | diff --git a/doc/rtd/topics/capabilities.rst b/doc/rtd/topics/capabilities.rst | |||
627 | index 0d8b894..6d85a99 100644 | |||
628 | --- a/doc/rtd/topics/capabilities.rst | |||
629 | +++ b/doc/rtd/topics/capabilities.rst | |||
630 | @@ -217,6 +217,7 @@ Get detailed reports of where cloud-init spends most of its time. See | |||
631 | 217 | * **dump** Machine-readable JSON dump of all cloud-init tracked events. | 217 | * **dump** Machine-readable JSON dump of all cloud-init tracked events. |
632 | 218 | * **show** show time-ordered report of the cost of operations during each | 218 | * **show** show time-ordered report of the cost of operations during each |
633 | 219 | boot stage. | 219 | boot stage. |
634 | 220 | * **boot** show timestamps from kernel initialization, kernel finish initialization, and cloud-init start. | ||
635 | 220 | 221 | ||
636 | 221 | .. _cli_devel: | 222 | .. _cli_devel: |
637 | 222 | 223 | ||
638 | diff --git a/doc/rtd/topics/debugging.rst b/doc/rtd/topics/debugging.rst | |||
639 | index 51363ea..e13d915 100644 | |||
640 | --- a/doc/rtd/topics/debugging.rst | |||
641 | +++ b/doc/rtd/topics/debugging.rst | |||
642 | @@ -68,6 +68,19 @@ subcommands default to reading /var/log/cloud-init.log. | |||
643 | 68 | 00.00100s (modules-final/config-rightscale_userdata) | 68 | 00.00100s (modules-final/config-rightscale_userdata) |
644 | 69 | ... | 69 | ... |
645 | 70 | 70 | ||
646 | 71 | * ``analyze boot`` Make subprocess calls to the kernel in order to get relevant | ||
647 | 72 | pre-cloud-init timestamps, such as the kernel start, kernel finish boot, and cloud-init start. | ||
648 | 73 | |||
649 | 74 | .. code-block:: shell-session | ||
650 | 75 | |||
651 | 76 | $ cloud-init analyze boot | ||
652 | 77 | -- Most Recent Boot Record -- | ||
653 | 78 | Kernel Started at: 2019-06-13 15:59:55.809385 | ||
654 | 79 | Kernel ended boot at: 2019-06-13 16:00:00.944740 | ||
655 | 80 | Kernel time to boot (seconds): 5.135355 | ||
656 | 81 | Cloud-init start: 2019-06-13 16:00:05.738396 | ||
657 | 82 | Time between Kernel boot and Cloud-init start (seconds): 4.793656 | ||
658 | 83 | |||
659 | 71 | 84 | ||
660 | 72 | Analyze quickstart - LXC | 85 | Analyze quickstart - LXC |
661 | 73 | --------------------------- | 86 | --------------------------- |
Thank you for this branch Sam and for making cloud-init better!
A couple of notes: analyze/ tests/test_ boot.py topics/ analyze. rst
1. ^M characters present throughout:
- cloudinit/
- doc/rtd/
2. Timestamps don't seem to be reported properly in an lxc container for me: (note that the kernel ended boot time abd cloud-init start time is in the future on my system:
root@x1:~# cloud-init analyze boot
-- Most Recent Boot Record --
Kernel Started at: 2019-06-24 20:24:25.527606
Kernel ended boot at: 2019-06-27 18:53:04.667329
Kernel time to boot (seconds): 253719.139722
Cloud-init start: 2019-06-27 18:53:05.277550
Time between Kernel boot and Cloud-init start (seconds): 0.610222
root@x1:~# date
Mon Jun 24 20:39:25 UTC 2019
2. timestamp reported by cloud-init boot doesn't match timestamp in cloud-init logs
on my system. This may lead to confusing errors because you
cloud-init analyze boot
Cloud-init start: 2019-06-21 21:56:24.461153
YET cloud-init does startup @ 21.56:08
grep "running 'init-local" /var/log/ cloud-init. log | tail -n 1 gb05832a5- 1~bddeb running 'init-local' at Fri, 21 Jun 2019 21:56:08 +0000. Up 5.00 seconds.
2019-06-21 21:56:09,067 - util.py[DEBUG]: Cloud-init v. 18.5-59-
Maybe we should be using ActiveEnterTime stampMonotonic or ExecMainStartTi mestampMonotoni c to represesent on cloud-init-local starts it's datasource discovery?
I'll continue to try to triage what might give us a better handle on cloud-init start time