Merge lp:~doanac/utah/rsyslog-config into lp:utah

Proposed by Andy Doan
Status: Merged
Merged at revision: 940
Proposed branch: lp:~doanac/utah/rsyslog-config
Merge into: lp:utah
Diff against target: 744 lines (+165/-214)
7 files modified
tests/test_rsyslog.py (+65/-58)
utah/config.py (+1/-95)
utah/provisioning/baremetal/bamboofeeder.py (+2/-2)
utah/provisioning/baremetal/cobbler.py (+2/-3)
utah/provisioning/rsyslog.py (+92/-53)
utah/provisioning/vm.py (+2/-2)
utah/run.py (+1/-1)
To merge this branch: bzr merge lp:~doanac/utah/rsyslog-config
Reviewer Review Type Date Requested Status
Max Brustkern (community) Approve
Andy Doan (community) Needs Resubmitting
Review via email: mp+169254@code.launchpad.net

Description of the change

This helps fix our current configuration issues we are having and should help get us ready for the next release

To post a comment you must log in.
Revision history for this message
Max Brustkern (nuclearbob) wrote :

How does this line (303 in the merge) work:
self.assertTrue(abs(time.time() - start - 5), 2)
? In the documentation on assertTrue, it looks like 2 should be picked up as the mesg argument?

Also, I'm a little worried about the regression potential of 432-435. Are we sure we don't have any test cases that currently take longer than 4 hours?

On line 531 of the merge, I get removing the steps. Was the previous reason for the seealso because the other method had steps, or because it was similar? Maybe the seealso still makes sense?

Looks like that also happened on 589, so maybe there's a consistent reason for it.

It looks like the timeout in wait_for_steps is now the timeout for the whole process rather than each step. That seems reasonable, and I'm assuming that since the previous default values were whole-process timeouts, we don't need to worry about updating them.

The rest of it looks reasonable to me, assuming it all tests out okay.

Revision history for this message
Max Brustkern (nuclearbob) wrote :

For 432-435, if we run this branch on a machine that still has None in the config file, the whole config file will be ignored. If the config files is all defaults anyway, it'll be fine, but that seems like it could have the potential to cause problems. Maybe if we do want to move to int for that, we can introduce that as a default now and then remove null from the schema in another version or two. I'm still concerned about the 4 hour timeout, since part of the job of the utah client is to ensure that things don't run forever.

Revision history for this message
Max Brustkern (nuclearbob) wrote :

I got good tests of this on virtual and physical machines. I think once we determine how to address the configuration changes around timeouts, we should be good to go.

Revision history for this message
Andy Doan (doanac) wrote :

On 06/13/2013 07:14 PM, Max Brustkern wrote:
> How does this line (303 in the merge) work:
> self.assertTrue(abs(time.time() - start - 5), 2) ? In the
> documentation on assertTrue, it looks like 2 should be picked up as
> the mesg argument?

typo, was supposed to assertLessThan

> Also, I'm a little worried about the regression potential of 432-435.
> Are we sure we don't have any test cases that currently take longer
> than 4 hours?

I was just making an educated guess. You also said in your next thread:

  | I'm still concerned about the 4 hour timeout, since part of the job
  | of the utah client is to ensure that things don't run forever.

I think we should backout this change to config.py. The wait_for_run
already checks if the value is None and will default to a 2 day timeout.

> On line 531 of the merge, I get removing the steps. Was the previous
> reason for the seealso because the other method had steps, or because
> it was similar? Maybe the seealso still makes sense?
>
> Looks like that also happened on 589, so maybe there's a consistent
> reason for it.

I think it was just because of the steps and how they were documented

> It looks like the timeout in wait_for_steps is now the timeout for
> the whole process rather than each step. That seems reasonable, and
> I'm assuming that since the previous default values were
> whole-process timeouts, we don't need to worry about updating them.

that was my hope :)

> The rest of it looks reasonable to me, assuming it all tests out
> okay.

Sounds like I have two things to fix. I'll probably do that tomorrow and
re-submit.

lp:~doanac/utah/rsyslog-config updated
943. By Andy Doan

typo found by max

944. By Andy Doan

backout runtimeout change in config.py

removing the option for None was going to cause regressions. And
4 hours might not be a good guess either.

Revision history for this message
Andy Doan (doanac) wrote :

screw it. i couldn't sleep knowing it should be easy to fix your concerns.

review: Needs Resubmitting
Revision history for this message
Max Brustkern (nuclearbob) wrote :

This seems good to me. If nobody objects, I'm in favor of merging it today.

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'tests/test_rsyslog.py'
2--- tests/test_rsyslog.py 2013-06-04 20:46:41 +0000
3+++ tests/test_rsyslog.py 2013-06-14 03:44:27 +0000
4@@ -19,6 +19,8 @@
5 import itertools
6 import logging
7 import mock
8+import os
9+import shutil
10 import socket
11 import time
12 import tempfile
13@@ -67,14 +69,21 @@
14 time.sleep(1)
15 truncate = False
16
17+ def setUp(self):
18+ """Create a temporary directory for logging."""
19+ self.logdir = tempfile.mkdtemp()
20+ self.logfile = os.path.join(self.logdir, 'tmp.log')
21+
22+ def tearDown(self):
23+ """Cleanup artifacts from setUp."""
24+ shutil.rmtree(self.logdir)
25+
26 def test_easy(self):
27 """Make sure we can match a typical first message."""
28 steps = [
29 {
30 "message": "test_easy",
31 "pattern": ".*log source = /proc/kmsg started",
32- "timeout": 120,
33- "booted": False,
34 },
35 ]
36
37@@ -82,9 +91,9 @@
38 'test 1',
39 '<13>Feb 20 13:38:55 host user: log source = /proc/kmsg started',
40 ]
41- r = RSyslog("utah-test", '/tmp')
42+ r = RSyslog("utah-test", self.logdir)
43 threading.Thread(target=self.producer, args=(r.port, messages)).start()
44- r.wait_for_install(steps)
45+ r._wait_for_steps(steps, 5, self.logfile, {})
46
47 def test_fail_pattern(self):
48 """Exception is raised on fail_pattern match."""
49@@ -93,7 +102,6 @@
50 "message": "test_late_command_failure",
51 "pattern": "finished",
52 "fail_pattern": "failure",
53- "timeout": 120,
54 },
55 ]
56
57@@ -103,9 +111,10 @@
58 'finished',
59 ]
60
61- r = RSyslog('utah-test', '/tmp')
62+ r = RSyslog("utah-test", self.logdir)
63 threading.Thread(target=self.producer, args=(r.port, messages)).start()
64- self.assertRaises(UTAHException, r.wait_for_install, steps)
65+ with self.assertRaises(UTAHException):
66+ r._wait_for_steps(steps, 5, self.logfile, {})
67
68 def test_future(self):
69 """Test handling a missing message."""
70@@ -115,18 +124,15 @@
71 "pattern": ".*log source = /proc/kmsg started",
72 # set a really small timeout to prove we properly calculate
73 # future timeout logic
74- "timeout": 0.01,
75 "booted": True,
76 },
77 {
78 "message": "hit skipped",
79 "pattern": "hit skipped",
80- "timeout": 120,
81 },
82 {
83 "message": "message 3",
84 "pattern": "message 3",
85- "timeout": 120,
86 },
87 ]
88
89@@ -136,7 +142,7 @@
90 'message 3',
91 ]
92
93- r = RSyslog("utah-test", '/tmp')
94+ r = RSyslog("utah-test", self.logdir)
95 threading.Thread(target=self.producer, args=(r.port, messages)).start()
96
97 def booted_cb(match):
98@@ -145,7 +151,7 @@
99
100 self.test_future_booted = False
101 with patch.object(r.logger, 'warn') as mock_method:
102- r.wait_for_install(steps, booted_cb)
103+ r._wait_for_steps(steps, 5, self.logfile, {'booted': booted_cb})
104 mock_method.assert_called_with(
105 'Expected pattern missed, matched future pattern: %s',
106 'hit skipped')
107@@ -158,18 +164,15 @@
108 {
109 "message": "test callbacks",
110 "pattern": ".*log source = /proc/kmsg started",
111- "timeout": 120,
112 "booted": True,
113 },
114 {
115 "message": "hit skipped",
116 "pattern": "hit skipped",
117- "timeout": 120,
118 },
119 {
120 "message": "message 3",
121 "pattern": "message 3",
122- "timeout": 120,
123 "blah": True,
124 },
125 ]
126@@ -180,7 +183,7 @@
127 'message 3',
128 ]
129
130- r = RSyslog("utah-test", '/tmp')
131+ r = RSyslog("utah-test", self.logdir)
132 threading.Thread(target=self.producer, args=(r.port, messages)).start()
133
134 def booted_cb(match):
135@@ -196,61 +199,66 @@
136 'booted': booted_cb,
137 'blah': blah_cb,
138 }
139- r._wait_for_steps(steps, "/dev/null", callbacks)
140+ r._wait_for_steps(steps, 5, "/dev/null", callbacks)
141 self.assertTrue(self.test_callbacks_booted, 'booted callback not made')
142 self.assertTrue(self.test_callbacks_blah, 'blah callback not made')
143
144+ def test_install(self):
145+ """Make sure wait_for_install works."""
146+ messages = [
147+ 'blah',
148+ 'ubiquity[11]: Ubiquity 1',
149+ 'finish-install: umount',
150+ ]
151+
152+ def cb(match):
153+ self.test_install_cb = True
154+
155+ self.test_install_cb = False
156+ r = RSyslog("utah-test", self.logdir)
157+ r.logger = mock.Mock()
158+ threading.Thread(target=self.producer, args=(r.port, messages)).start()
159+ r.wait_for_install(cb)
160+ self.assertFalse(r.logger.error.called)
161+ self.assertTrue(self.test_install_cb)
162+
163+ r = RSyslog("utah-test", self.logdir)
164+ messages[-1] = 'utah: Late command failure detected'
165+ threading.Thread(target=self.producer, args=(r.port, messages)).start()
166+ with self.assertRaises(UTAHException):
167+ r.wait_for_install()
168+
169 def test_booted(self):
170 """Make sure wait_for_booted works."""
171 tst_uuid = str(uuid.uuid4())
172- steps = [
173- {
174- "message": "test booted",
175- "pattern": ".*log source = /proc/kmsg started",
176- "timeout": 120,
177- },
178- {
179- "message": "test uuid",
180- "uuid": True,
181- "pattern": '.*Machine UUID:\s+(.*)',
182- "timeout": 2,
183- },
184- ]
185-
186 messages = [
187 'test 1',
188 '<13>Feb 20 13:38:55 host user: log source = /proc/kmsg started',
189 '<13>Feb 21 13:40:55 host user: Machine UUID: {}'.format(tst_uuid),
190 ]
191
192- r = RSyslog("utah-test", '/tmp')
193+ r = RSyslog("utah-test", self.logdir)
194 r.logger = mock.Mock()
195 threading.Thread(target=self.producer, args=(r.port, messages)).start()
196- r.wait_for_booted(steps, tst_uuid)
197+ r.wait_for_booted(tst_uuid)
198 self.assertFalse(r.logger.error.called)
199
200 threading.Thread(target=self.producer, args=(r.port, messages)).start()
201 with self.assertRaises(UTAHProvisioningException):
202- r.wait_for_booted(steps, 'bad')
203+ r.wait_for_booted('bad')
204
205 def test_booted_workaround(self):
206 """Test that wait_for_booted doesn't raise an exception on timeout."""
207- steps = [
208- {
209- "message": "test booted",
210- "pattern": "Wait forever for a line that never comes",
211- "timeout": 5,
212- },
213- ]
214-
215- r = RSyslog("utah-test", '/tmp')
216-
217- r.wait_for_booted(steps, None)
218+ orig = config.boottimeout
219+ try:
220+ config.boottimeout = 0.5
221+ r = RSyslog("utah-test", self.logdir)
222+ r.wait_for_booted(None)
223+ finally:
224+ config.boottimeout = orig
225
226 def test_wait_for_run(self):
227 """Test that wait_for_utah_run works."""
228- steps = config.autorun_steps
229-
230 messages = [
231 'test 1',
232 'utah: completed /etc/utah/autorun/01_run-utah rc=7',
233@@ -261,7 +269,7 @@
234
235 r = RSyslog("utah-test", '/tmp')
236 threading.Thread(target=self.producer, args=(r.port, messages)).start()
237- r.wait_for_utah_run(steps, cb)
238+ r.wait_for_utah_run(cb)
239 self.assertEqual(7, self.utah_run_rc)
240
241 def usefile(self, truncate):
242@@ -270,7 +278,6 @@
243 {
244 "message": "truncate test",
245 "pattern": ".*log source = /proc/kmsg started",
246- "timeout": 60,
247 },
248 ]
249
250@@ -282,12 +289,12 @@
251 ]
252
253 with tempfile.NamedTemporaryFile() as f:
254- r = RSyslog("utah-test", '/tmp', f.name)
255+ r = RSyslog("utah-test", self.logdir, f.name)
256 t = threading.Thread(
257 target=self.file_producer,
258 args=(f, messages, truncate))
259 t.start()
260- r.wait_for_booted(steps, None)
261+ r._wait_for_steps(steps, 5, self.logfile, {})
262 t.join()
263
264 def test_usefile(self):
265@@ -304,31 +311,31 @@
266 {
267 "message": "message_1",
268 "pattern": "finished",
269- "timeout": 1,
270 },
271 {
272 "message": "message_2",
273 "pattern": "pattern",
274- "timeout": 1,
275 },
276 {
277 "message": "message_3",
278 "pattern": "pattern",
279- "timeout": 1,
280 },
281 ]
282
283- messages = itertools.repeat('log example')
284+ messages = ['message_1', 'message_2', 'message_4']
285 pattern = (r'Timeout \(\d+\) occurred for {} message.\n'
286 r'Remaining messages: {}'
287 .format(steps[0]['message'],
288 ', '.join([step['message']
289 for step in steps[1:]])))
290
291- r = RSyslog('utah-test', '/tmp')
292+ r = RSyslog('utah-test', self.logdir)
293 thread = threading.Thread(target=self.producer,
294 args=(r.port, messages))
295 thread.daemon = True
296 thread.start()
297- self.assertRaisesRegexp(UTAHTimeout, pattern,
298- r.wait_for_install, steps)
299+ start = time.time()
300+ with self.assertRaisesRegexp(UTAHTimeout, pattern):
301+ r._wait_for_steps(steps, 5, self.logfile, {})
302+ # assert the timeout was within 2 seconds of what we expect
303+ self.assertLess(abs(time.time() - start - 5), 2)
304
305=== modified file 'utah/config.py'
306--- utah/config.py 2013-06-05 15:56:26 +0000
307+++ utah/config.py 2013-06-14 03:44:27 +0000
308@@ -46,67 +46,17 @@
309 "$schema": "http://json-schema.org/draft-04/schema#",
310 'title': 'UTAH configuration',
311 'type': 'object',
312- 'additionalProperties': False,
313+ 'additionalProperties': True,
314 'properties': {
315 'arch': {
316 'description': 'Default architecture',
317 'type': 'string',
318 },
319- 'autorun_steps': {
320- 'description': 'rsyslog patterns matched during autorun',
321- 'type': 'array',
322- 'items': {'type': 'object'},
323- 'default': [
324- {
325- 'message': 'UTAH setup',
326- 'pattern': ('.*utah: completed '
327- '/etc/utah/autorun/\d+_utah-setup'),
328- 'timeout': 180,
329- },
330- {
331- 'message': 'UTAH client install',
332- 'pattern': ('.*utah: completed '
333- '/etc/utah/autorun/\d+_install-client'),
334- 'timeout': 180,
335- },
336- {
337- 'message': 'UTAH runlist',
338- 'pattern': ('.*utah: completed '
339- '/etc/utah/autorun/\d+_run-utah rc=(\d+)'),
340- 'completed': True,
341- 'timeout': 7200, # 2 hours
342- },
343- ],
344- },
345 'boot': {
346 'description': 'Default kernel parameters',
347 'type': ['string', 'null'],
348 'default': None,
349 },
350- 'boot_steps': {
351- 'description': 'rsyslog patterns matched during boot',
352- 'type': 'array',
353- 'items': {'type': 'object'},
354- 'default': [
355- {
356- 'message': 'system booted',
357- 'pattern': '.*Linux version \\d',
358- 'timeout': 600,
359- },
360- {
361- 'message': 'disk mounted',
362- 'pattern': '.*mounted filesystem with ordered data',
363- 'timeout': 600,
364- },
365- {
366- 'message': 'UTAH uuid received',
367- # binds this to the callback for wait_for_booted
368- 'uuid': True,
369- 'pattern': '.*Machine UUID:\s+(.*)',
370- 'timeout': 600,
371- },
372- ],
373- },
374 'boottimeout': {
375 'description': 'Time to wait for installed system to boot',
376 'type': 'integer',
377@@ -223,50 +173,6 @@
378 'format': 'path',
379 'default': None,
380 },
381- 'install_steps': {
382- 'description': 'rsyslog patterns matched during installation',
383- 'type': 'array',
384- 'items': {'type': 'object'},
385- 'default': [
386- {
387- 'message': 'system started',
388- 'pattern': '.*',
389- 'booted': True,
390- 'timeout': 360,
391- },
392- {
393- 'message': 'install started',
394- 'pattern': [
395- '.*ubiquity\\[\\d+\\]: Ubiquity \\d',
396- # needed for non-ubiquity installs
397- '.*anna-install',
398- ],
399- 'timeout': 120,
400- },
401- {
402- 'message': 'install complete',
403- 'pattern': [
404- '.*ubiquity reboot',
405- '.*log-output -t ubiquity umount /target$',
406- # for server bootspeed
407- '.*finish-install.d/94save-logs',
408- '.*finish-install: umount',
409- # a catch-all
410- '.*rsyslogd:.*exiting on signal 15.',
411- ],
412- 'fail_pattern': [
413- # ubiquity/failure_command
414- '.*utah: Installation failure detected',
415- # ubiquity/success_command
416- # and d-i/late_command failures
417- '.*utah: Late command failure detected',
418- # d-i installation failure
419- '.*exited with status [^0]',
420- ],
421- 'timeout': 3600,
422- },
423- ],
424- },
425 'installpackages': {
426 'description': 'Packages to install on system',
427 'type': 'array',
428
429=== modified file 'utah/provisioning/baremetal/bamboofeeder.py'
430--- utah/provisioning/baremetal/bamboofeeder.py 2013-06-10 05:48:52 +0000
431+++ utah/provisioning/baremetal/bamboofeeder.py 2013-06-14 03:44:27 +0000
432@@ -271,8 +271,8 @@
433 self.restart()
434 self.logger.info('System installing')
435
436- self.rsyslog.wait_for_install(config.install_steps)
437- self.rsyslog.wait_for_booted(config.boot_steps, self.uuid)
438+ self.rsyslog.wait_for_install()
439+ self.rsyslog.wait_for_booted(self.uuid)
440 retry(self.sshcheck, logmethod=self.logger.info,
441 retry_timeout=config.checktimeout)
442
443
444=== modified file 'utah/provisioning/baremetal/cobbler.py'
445--- utah/provisioning/baremetal/cobbler.py 2013-06-10 05:57:49 +0000
446+++ utah/provisioning/baremetal/cobbler.py 2013-06-14 03:44:27 +0000
447@@ -202,8 +202,7 @@
448 for arg in self.machineinfo])
449
450 self.restart()
451- self.rsyslog.wait_for_install(config.install_steps,
452- self._disable_netboot)
453+ self.rsyslog.wait_for_install(self._disable_netboot)
454
455 # TODO: wait_for_booted isn't working on d-i installs and
456 # autopilot jobs like:
457@@ -211,7 +210,7 @@
458 # ps-unity-autopilot-release-testing/140/label=autopilot-nvidia/
459 # For now, it seems to be a problem on all installs, so we're going to
460 # handle it in rsyslog.py
461- self.rsyslog.wait_for_booted(config.boot_steps, self.uuid)
462+ self.rsyslog.wait_for_booted(self.uuid)
463
464 retry(self.sshcheck, logmethod=self.logger.info,
465 retry_timeout=config.checktimeout)
466
467=== modified file 'utah/provisioning/rsyslog.py'
468--- utah/provisioning/rsyslog.py 2013-05-22 16:56:32 +0000
469+++ utah/provisioning/rsyslog.py 2013-06-14 03:44:27 +0000
470@@ -23,6 +23,7 @@
471 import sys
472 import time
473
474+from utah.config import config
475 from utah.exceptions import UTAHException
476 from utah.provisioning.exceptions import UTAHProvisioningException
477 from utah.timeout import UTAHTimeout
478@@ -70,65 +71,85 @@
479 """Return UDP port number used to listen for syslog messages."""
480 return self._port
481
482- def wait_for_install(self, steps, booted_callback=None):
483+ def wait_for_install(self, booted_callback=None):
484 """Monitor rsyslog messages during the installation.
485
486 Works through each step in the steps array to find messages in the
487 syslog indicating what part of the install we are in.
488
489- :param steps: Set of steps that the installation will go through.
490- :type steps: list(dict)
491 :param booted_callback:
492 function to be called once the system has been booted
493 :type booted_callback: callable | None
494
495- An example of a valid steps argument would be as follows::
496-
497- {
498- "message": "system started",
499- "pattern": ".*log source = /proc/kmsg started",
500- "fail_pattern": ".*exited with status [^0]",
501- "timeout": 1000,
502- "booted": false
503- },
504- {
505- "message": "system started",
506- "pattern": [
507- ".*log source = /proc/kmsg started"
508- ],
509- "timeout": 1000,
510- "booted": false
511- }
512-
513- where:
514-
515- - ``pattern`` can an array or string.
516- - ``booted`` is optional, but can be used to indicate that the
517- system is fully operational. This can be used in conjunction with
518- the ``booted_callback`` function.
519-
520- .. seealso:: :meth:`wait_for_booted`
521-
522 """
523+ steps = [
524+ {
525+ 'message': 'system started',
526+ 'pattern': ['.*'],
527+ 'booted': True,
528+ },
529+ {
530+ 'message': 'install started',
531+ 'pattern': [
532+ '.*ubiquity\\[\\d+\\]: Ubiquity \\d',
533+ # needed for non-ubiquity installs
534+ '.*anna-install',
535+ ],
536+ },
537+ {
538+ 'message': 'install complete',
539+ 'pattern': [
540+ '.*ubiquity reboot',
541+ '.*log-output -t ubiquity umount /target$',
542+ # for server bootspeed
543+ '.*finish-install.d/94save-logs',
544+ '.*finish-install: umount',
545+ # a catch-all
546+ '.*rsyslogd:.*exiting on signal 15.',
547+ ],
548+ 'fail_pattern': [
549+ # ubiquity/failure_command
550+ '.*utah: Installation failure detected',
551+ # ubiquity/success_command
552+ # and d-i/late_command failures
553+ '.*utah: Late command failure detected',
554+ # d-i installation failure
555+ '.*exited with status [^0]',
556+ ],
557+ },
558+ ]
559+
560 logfile = '{}/{}-install.log'.format(self._logpath, self._host)
561 callbacks = {'booted': booted_callback}
562- self._wait_for_steps(steps, logfile, callbacks)
563+ self._wait_for_steps(steps, config.installtimeout, logfile, callbacks)
564
565- def wait_for_booted(self, steps, uuid):
566+ def wait_for_booted(self, uuid):
567 """Monitor rsyslog during boot up.
568
569 Works the same as the :meth:`wait_for_install` method but takes in
570 steps that determine a system has booted after the install has
571 completed.
572
573- :param steps: Set of steps that the boot up will go through.
574- :type steps: list(dict)
575 :param uuid: The UUID generated by provisioning code.
576 :type uuid: string
577
578- .. seealso:: :meth:`wait_for_install`
579-
580 """
581+ steps = [
582+ {
583+ 'message': 'system booted',
584+ 'pattern': '.*Linux version \\d',
585+ },
586+ {
587+ 'message': 'disk mounted',
588+ 'pattern': '.*mounted filesystem with ordered data',
589+ },
590+ {
591+ 'message': 'UTAH uuid received',
592+ # binds this to the callback for wait_for_booted
593+ 'uuid': True,
594+ 'pattern': '.*Machine UUID:\s+(.*)',
595+ },
596+ ]
597 logfile = '{}/{}-boot.log'.format(self._logpath, self._host)
598
599 def _cb(match):
600@@ -140,7 +161,7 @@
601 # TODO: Resolve the issues around this and get it back to normal
602 # i.e. LP#1100386
603 try:
604- self._wait_for_steps(steps, logfile, callbacks)
605+ self._wait_for_steps(steps, config.boottimeout, logfile, callbacks)
606 if not _cb.uuid:
607 self.logger.error('UUID not sent from target')
608 elif uuid and uuid != _cb.uuid:
609@@ -149,39 +170,60 @@
610 except UTAHTimeout:
611 self.logger.warning('Timed out waiting for boot, but continuing')
612
613- def wait_for_utah_run(self, steps, completed_cb=None):
614+ def wait_for_utah_run(self, completed_cb=None):
615 """Monitor rsyslog while waiting for utah-client to finish autorun.
616
617- :param steps: Set of steps that the boot up will go through.
618- :type steps: list(dict)
619 :param completed_cb: A callback function that will include the utah
620 exit code.
621 :type completed_cb: function
622
623 """
624+ steps = [
625+ {
626+ 'message': 'UTAH setup',
627+ 'pattern': ('.*utah: completed '
628+ '/etc/utah/autorun/\d+_utah-setup'),
629+ },
630+ {
631+ 'message': 'UTAH client install',
632+ 'pattern': ('.*utah: completed '
633+ '/etc/utah/autorun/\d+_install-client'),
634+ },
635+ {
636+ 'message': 'UTAH runlist',
637+ 'pattern': ('.*utah: completed '
638+ '/etc/utah/autorun/\d+_run-utah rc=(\d+)'),
639+ 'completed': True,
640+ },
641+ ]
642 logfile = '{}/{}-utah-run.log'.format(self._logpath, self._host)
643 callbacks = {'completed': completed_cb}
644- self._wait_for_steps(steps, logfile, callbacks)
645+ timeout = config.runtimeout
646+ if not timeout:
647+ # this means forever, but lets give something sensible to help
648+ # prevent jobs from hanging forever
649+ timeout = 172800 # 2 days
650+ self._wait_for_steps(steps, timeout, logfile, callbacks)
651
652- def _wait_for_steps(self, steps, logfile, callbacks):
653+ def _wait_for_steps(self, steps, timeout, logfile, callbacks):
654 with open(logfile, 'w') as f:
655 x = 0
656+ deadline = time.time() + timeout
657 while x < len(steps):
658 message = steps[x]['message']
659 pattern = steps[x]['pattern']
660 fail_pattern = steps[x].get('fail_pattern', [])
661- timeout = steps[x]['timeout']
662
663 if not isinstance(pattern, list):
664 pattern = [pattern]
665 if not isinstance(fail_pattern, list):
666 fail_pattern = [fail_pattern]
667 pattern.extend(fail_pattern)
668- future_pats, future_timeout = self._future_patterns(steps, x)
669+ future_pats = self._future_patterns(steps, x)
670 pattern.extend(future_pats)
671- self.logger.info('Waiting %ds for: %s', timeout, message)
672- timeout += future_timeout
673- pattern, match = self._wait_for(f, pattern, message, timeout)
674+ self.logger.info(
675+ 'Waiting %ds for: %s', deadline - time.time(), message)
676+ pattern, match = self._wait_for(f, pattern, message, deadline)
677 if match is None:
678 remaining_messages = [step['message']
679 for step in steps[x + 1:]]
680@@ -217,16 +259,14 @@
681 @staticmethod
682 def _future_patterns(steps, index):
683 patterns = []
684- timeout = 0
685 index += 1
686 for step in steps[index:]:
687 pattern = step['pattern']
688- timeout += step['timeout']
689 if not isinstance(pattern, list):
690 patterns.append(pattern)
691 else:
692 patterns.extend(pattern)
693- return patterns, timeout
694+ return patterns
695
696 @staticmethod
697 def _fast_forward(steps, pattern, match, callbacks):
698@@ -252,11 +292,10 @@
699
700 x += 1
701
702- def _wait_for(self, writer, patterns, message, timeout):
703+ def _wait_for(self, writer, patterns, message, deadline):
704 pats = [re.compile(x) for x in patterns]
705
706- endtime = time.time() + timeout
707- while(time.time() < endtime):
708+ while(time.time() < deadline):
709 data = self._read_line()
710 if data:
711 if self.logger.level <= logging.DEBUG:
712
713=== modified file 'utah/provisioning/vm.py'
714--- utah/provisioning/vm.py 2013-06-11 04:23:34 +0000
715+++ utah/provisioning/vm.py 2013-06-14 03:44:27 +0000
716@@ -415,7 +415,7 @@
717 '{}.syslog.log'.format(self.name))
718 self.logger.info('Logs will be written to %s', log_filename)
719
720- self.rsyslog.wait_for_install(config.install_steps)
721+ self.rsyslog.wait_for_install()
722 while vm.isActive() is not 0:
723 pass
724
725@@ -550,7 +550,7 @@
726 else:
727 raise UTAHVMProvisioningException('Failed to provision VM')
728
729- self.rsyslog.wait_for_booted(config.boot_steps, self.uuid)
730+ self.rsyslog.wait_for_booted(self.uuid)
731 try:
732 self.logger.info(
733 'Waiting %d seconds for ping response', config.boottimeout)
734
735=== modified file 'utah/run.py'
736--- utah/run.py 2013-06-10 04:54:50 +0000
737+++ utah/run.py 2013-06-14 03:44:27 +0000
738@@ -234,7 +234,7 @@
739 raise UTAHException('Failed to get return code from UTAH runlist')
740 _provision_and_run.rc = int(match.group(1))
741
742- machine.rsyslog.wait_for_utah_run(config.autorun_steps, _complete_cb)
743+ machine.rsyslog.wait_for_utah_run(_complete_cb)
744 return _provision_and_run.rc, tgt_report
745
746

Subscribers

People subscribed via source and target branches

to all changes: