Merge lp:~vila/ubuntu-ci-services-itself/debug into lp:ubuntu-ci-services-itself

Proposed by Vincent Ladeuil
Status: Merged
Approved by: Vincent Ladeuil
Approved revision: 405
Merged at revision: 411
Proposed branch: lp:~vila/ubuntu-ci-services-itself/debug
Merge into: lp:ubuntu-ci-services-itself
Diff against target: 245 lines (+83/-54)
3 files modified
test_runner/run_test.py (+7/-3)
test_runner/tstrun/run_worker.py (+64/-42)
test_runner/tstrun/testbed.py (+12/-9)
To merge this branch: bzr merge lp:~vila/ubuntu-ci-services-itself/debug
Reviewer Review Type Date Requested Status
Andy Doan (community) Approve
PS Jenkins bot (community) continuous-integration Approve
Review via email: mp+211384@code.launchpad.net

Commit message

Rename summary and log produced on the test bed and add them to the artifacts

Description of the change

This provides more logs for the test runner to help debug.

Paul encountered a case where the subunit stream content was broken. This
requires the summary.log created by adt-run.

The log of the test run itself wasn't saved as an artifact, it is now,
package by package.

Finally, the cloud-init log was only available on errors, it's now available
in all cases as it may contains useful info for diagnosis (like which
package *versions* were installed).

I've tested locally and will run a ticket through the engine (as soon as I get an updated deployment...)

To post a comment you must log in.
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :

FAILED: Continuous integration, rev:403
No commit message was specified in the merge proposal. Click on the following link and set the commit message (if you want a jenkins rebuild you need to trigger it yourself):
https://code.launchpad.net/~vila/ubuntu-ci-services-itself/debug/+merge/211384/+edit-commit-message

http://s-jenkins.ubuntu-ci:8080/job/uci-engine-ci/449/
Executed test runs:

Click here to trigger a rebuild:
http://s-jenkins.ubuntu-ci:8080/job/uci-engine-ci/449/rebuild

review: Needs Fixing (continuous-integration)
404. By Vincent Ladeuil

Fix pep8 issues.

Revision history for this message
Andy Doan (doanac) wrote :

100 + results.setdefault('artifacts', []).append({
101 + 'name': '{}.{}'.format(self.logger_name, name),
102 + 'reference': url,
103 + 'type': kind,
104 + })

there's a method in the base class that does this now:

 http://bazaar.launchpad.net/~canonical-ci-engineering/ubuntu-ci-services-itself/trunk/view/head:/ci-utils/ci_utils/amqp_worker.py#L135

7 def run_test(package):
114 + def save_testbed_artifacts(self, logger, results, test_bed, package):

I wish there was a better way to keep the artifacts in sync between these two modules. ie just pull everything from a directory and make an assumption on the file-type if its LOGS/RESULTS type. But I don't see anything obvious or easy right now. Just something to think about in the future.

Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :

PASSED: Continuous integration, rev:404
http://s-jenkins.ubuntu-ci:8080/job/uci-engine-ci/451/
Executed test runs:

Click here to trigger a rebuild:
http://s-jenkins.ubuntu-ci:8080/job/uci-engine-ci/451/rebuild

review: Approve (continuous-integration)
405. By Vincent Ladeuil

Fix review comments.

Revision history for this message
Vincent Ladeuil (vila) wrote :

> 100 + results.setdefault('artifacts', []).append({
> 101 + 'name': '{}.{}'.format(self.logger_name, name),
> 102 + 'reference': url,
> 103 + 'type': kind,
> 104 + })
>
> there's a method in the base class that does this now:
>
> http://bazaar.launchpad.net/~canonical-ci-engineering/ubuntu-ci-services-
> itself/trunk/view/head:/ci-utils/ci_utils/amqp_worker.py#L135

Called.

>
>
> 7 def run_test(package):
> 114 + def save_testbed_artifacts(self, logger, results, test_bed,
> package):
>
> I wish there was a better way to keep the artifacts in sync between these two
> modules. ie just pull everything from a directory and make an assumption on
> the file-type if its LOGS/RESULTS type. But I don't see anything obvious or
> easy right now. Just something to think about in the future.

Agreed, it's not super pretty, especially when you consider that they are downloaded from the testbed to be uploaded to swift. I didn't want to refactor too aggressively either without tests.

Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :

PASSED: Continuous integration, rev:405
http://s-jenkins.ubuntu-ci:8080/job/uci-engine-ci/453/
Executed test runs:

Click here to trigger a rebuild:
http://s-jenkins.ubuntu-ci:8080/job/uci-engine-ci/453/rebuild

review: Approve (continuous-integration)
Revision history for this message
Andy Doan (doanac) wrote :

On 03/17/2014 07:52 PM, Vincent Ladeuil wrote:
> Agreed, it's not super pretty, especially when you consider that they are downloaded from the testbed to be uploaded to swift. I didn't want to refactor too aggressively either without tests.
agreed also.

Revision history for this message
Andy Doan (doanac) :
review: Approve
Revision history for this message
Vincent Ladeuil (vila) wrote :

Finally ! After an epic fight against swift uploads on hp (ticket filed with support), it appears that I can get some success as long as my uploads stay below some mysterious limit... with enough retries.

So, I've tested this: http://15.125.78.170/ticket.html?ticket_id=1

All logs are there, properly named, even in the left column under 'Image Testing' for easy access, pfew, what a journey.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'test_runner/run_test.py'
--- test_runner/run_test.py 2014-03-14 10:37:28 +0000
+++ test_runner/run_test.py 2014-03-18 00:50:25 +0000
@@ -137,7 +137,8 @@
137137
138138
139def run_test(package):139def run_test(package):
140 output = open('result.subunit', 'wb')140 subunit_path = '{}.subunit'.format(package)
141 output = open(subunit_path, 'wb')
141 result = subunit.TestProtocolClient(output)142 result = subunit.TestProtocolClient(output)
142 result.startTestRun()143 result.startTestRun()
143 os.mkdir(package)144 os.mkdir(package)
@@ -147,15 +148,18 @@
147 # MISSINGTEST: dsc file not found148 # MISSINGTEST: dsc file not found
148 dsc_path = get_dsc_path(package, cwd=package)149 dsc_path = get_dsc_path(package, cwd=package)
149 try:150 try:
151 summary_path = '{}-summary.log'.format(package)
152 log_path = '{}.log'.format(package)
150 cmd = ['sudo', 'adt-run', dsc_path,153 cmd = ['sudo', 'adt-run', dsc_path,
151 '--summary', 'summary.log',154 '--summary', summary_path,
155 '--log-file', log_path,
152 # Required to get files produced in 'results'156 # Required to get files produced in 'results'
153 '--paths-testbed',157 '--paths-testbed',
154 '--output-dir', 'results',158 '--output-dir', 'results',
155 '---', 'adt-virt-null']159 '---', 'adt-virt-null']
156 proc, out, err = run(*cmd, out=None, err=None, check_rc=False)160 proc, out, err = run(*cmd, out=None, err=None, check_rc=False)
157 rc = proc.returncode161 rc = proc.returncode
158 for name, status in parse_summary('summary.log'):162 for name, status in parse_summary(summary_path):
159 process_test_result(package, result, name, status)163 process_test_result(package, result, name, status)
160 finally:164 finally:
161 result.stopTestRun()165 result.stopTestRun()
162166
=== modified file 'test_runner/tstrun/run_worker.py'
--- test_runner/tstrun/run_worker.py 2014-03-14 21:54:37 +0000
+++ test_runner/tstrun/run_worker.py 2014-03-18 00:50:25 +0000
@@ -28,39 +28,60 @@
28 super(TestRunnerWorker, self).__init__('image_test')28 super(TestRunnerWorker, self).__init__('image_test')
29 self.data_store = None29 self.data_store = None
3030
31 def save_subunit(self, log, package, stream, results):31 def save_artifact(self, logger, results, name, value, description=None,
32 # make this exception-safe since we can already report pass/fail32 kind='LOGS'):
33 # with or without this file33 """Save an artifact catching and reporting exceptions.
34 try:34
35 log.info('Saving subunit results for {}'.format(package))35 This should only be used to upload artifacts after the pass/fail status
36 name = 'test-runner.{}-subunit-stream'.format(package)36 has been established to it's safe to fail the upload.
37 url = self.data_store.put_file(name, stream, 'text/plain')37
38 results.setdefault('artifacts', []).append({38 :param logger: To report execution.
39 'name': name,39
40 'reference': url,40 :param results: The dict holding the 'artifacts' attribute.
41 'type': 'RESULTS',41
42 })42 :param name: The artifact name.
43 except:43
44 log.exception(44 :param value: The artifact value as a string.
45 'Unable to upload subunit result for {}'.format(package))45
4646 :param description: For reporting purposes.
47 def save_testbed_console(self, log, test_bed, results):47
48 # make this exception-safe since we can already report pass/fail48 :param kind: The kind of artifact (defaults to 'LOGS').
49 # with or without this file49 """
50 try:50 if description is None:
51 log.info('Saving testbed console')51 description = name
52 name = 'test-runner.testbed-cloud-init.log'52 try:
53 console = test_bed.get_cloud_init_console()53 logger.info('Saving {}'.format(description))
54 url = self.data_store.put_file(name, console, 'text/plain')54 url = self.data_store.put_file(name, value, 'text/plain')
55 results.setdefault('artifacts', []).append({55 self._create_artifact('{}.{}'.format(self.logger_name, name),
56 'name': name,56 url, results, kind)
57 'reference': url,57 except:
58 'type': 'LOGS',58 logger.exception('Unable to upload {}'.format(description))
59 })59
60 except:60 def save_testbed_console(self, logger, results, test_bed):
61 log.exception('unable to upload testbed console')61 self.save_artifact(logger, results,
6262 'testbed-cloud-init.log',
63 def handle_request(self, log, params):63 test_bed.get_cloud_init_console(),
64 'testbed console')
65
66 def save_testbed_artifacts(self, logger, results, test_bed, package):
67 subunit_path = '{}.subunit'.format(package)
68 self.save_artifact(
69 logger, results,
70 subunit_path, test_bed.get_remote_content(subunit_path),
71 'subunit results for {}'.format(package),
72 'RESULTS')
73 package_log_path = '{}.log'.format(package)
74 self.save_artifact(
75 logger, results,
76 package_log_path, test_bed.get_remote_content(package_log_path),
77 'adt-run log for {}'.format(package))
78 summary_log_path = '{}-summary.log'.format(package)
79 self.save_artifact(
80 logger, results,
81 summary_log_path, test_bed.get_remote_content(summary_log_path),
82 'adt-run summary for {}'.format(package))
83
84 def handle_request(self, logger, params):
64 ticket_id = params['ticket_id']85 ticket_id = params['ticket_id']
65 progress_queue = params['progress_trigger']86 progress_queue = params['progress_trigger']
66 image_id = params['image_id']87 image_id = params['image_id']
@@ -69,7 +90,7 @@
69 results = {}90 results = {}
7091
71 def status_cb(msg):92 def status_cb(msg):
72 log.info(msg)93 logger.info(msg)
73 amqp_utils.progress_update(progress_queue, {'message': msg})94 amqp_utils.progress_update(progress_queue, {'message': msg})
7495
75 self.data_store = self._create_data_store(ticket_id)96 self.data_store = self._create_data_store(ticket_id)
@@ -79,41 +100,42 @@
79 test_bed = testbed.TestBed('testbed-{}'.format(progress_queue),100 test_bed = testbed.TestBed('testbed-{}'.format(progress_queue),
80 flavors, image_id, status_cb)101 flavors, image_id, status_cb)
81 except:102 except:
82 log.exception(103 logger.exception(
83 'The testbed creation for {} failed'.format(ticket_id))104 'The testbed creation for {} failed'.format(ticket_id))
84 return amqp_utils.progress_failed, results105 return amqp_utils.progress_failed, results
85106
86 try:107 try:
87 test_bed.setup()108 test_bed.setup()
88 except:109 except:
89 log.exception(110 logger.exception(
90 'The testbed setup for {} failed'.format(ticket_id))111 'The testbed setup for {} failed'.format(ticket_id))
91 if test_bed.instance is not None:112 if test_bed.instance is not None:
92 self.save_testbed_console(log, test_bed, results)113 self.save_testbed_console(logger, results, test_bed)
93 test_bed.teardown()114 test_bed.teardown()
94 return amqp_utils.progress_failed, results115 return amqp_utils.progress_failed, results
95116
117 self.save_testbed_console(logger, results, test_bed)
96 status_cb('The test bed is ready')118 status_cb('The test bed is ready')
97 # The tests will succeed unless they fail ;)119 # The tests will succeed unless they fail ;)
98 notify = amqp_utils.progress_completed120 notify = amqp_utils.progress_completed
99 try:121 try:
100 for package in package_list:122 for package in package_list:
101 if params.get('cancelled'):123 if params.get('cancelled'):
102 log.error('The request for {} has been cancelled,'124 logger.error('The request for {} has been cancelled,'
103 ' exiting'.format(ticket_id))125 ' exiting'.format(ticket_id))
104 return amqp_utils.progress_failed, results126 return amqp_utils.progress_failed, results
105 status_cb('Testing {}'.format(package))127 status_cb('Testing {}'.format(package))
106 # uci-vms shell adt-run ... --- adt-virt-null128 # uci-vms shell adt-run ... --- adt-virt-null
107 return_code, subunit_stream = test_bed.run_test(package)129 return_code = test_bed.run_test(package)
108 # 0 is success, 8 is skipped and considered a success130 # 0 is success, 8 is skipped and considered a success
109 if return_code not in (0, 8):131 if return_code not in (0, 8):
110 # At least one test failed132 # At least one test failed
111 notify = amqp_utils.progress_failed133 notify = amqp_utils.progress_failed
112 self.save_subunit(log, package, subunit_stream, results)134 self.save_testbed_artifacts(logger, results, test_bed, package)
113 status_cb('Test completed for ticket {}'.format(ticket_id))135 status_cb('Test completed for ticket {}'.format(ticket_id))
114 return notify, results136 return notify, results
115 except:137 except:
116 log.exception(138 logger.exception(
117 'Exception while handling ticket {}'.format(ticket_id))139 'Exception while handling ticket {}'.format(ticket_id))
118 raise140 raise
119 finally:141 finally:
120142
=== modified file 'test_runner/tstrun/testbed.py'
--- test_runner/tstrun/testbed.py 2014-03-14 17:57:46 +0000
+++ test_runner/tstrun/testbed.py 2014-03-18 00:50:25 +0000
@@ -245,7 +245,7 @@
245 def wait_for_cloud_init(self):245 def wait_for_cloud_init(self):
246 # FIXME: cloud_init_timeout should be a config option (related to246 # FIXME: cloud_init_timeout should be a config option (related to
247 # get_ip_timeout and probably the two can be merged) -- vila 2014-01-30247 # get_ip_timeout and probably the two can be merged) -- vila 2014-01-30
248 cloud_init_timeout = 300 # in seconds so 5 minutes248 cloud_init_timeout = 600 # in seconds so 10 minutes
249 timeout_limit = time.time() + cloud_init_timeout249 timeout_limit = time.time() + cloud_init_timeout
250 while time.time() < timeout_limit:250 while time.time() < timeout_limit:
251 # A relatively cheap way to catch cloud-init completion is to watch251 # A relatively cheap way to catch cloud-init completion is to watch
@@ -262,8 +262,6 @@
262 # We're good to go262 # We're good to go
263 log.info(263 log.info(
264 'cloud-init completed for {}'.format(self.instance.id))264 'cloud-init completed for {}'.format(self.instance.id))
265 # FIXME: Right place to report how long it
266 # took for cloud-init to finish. -- vila 2014-01-30
267 return265 return
268 time.sleep(5)266 time.sleep(5)
269 raise NovaClientException('Instance never completed cloud-init')267 raise NovaClientException('Instance never completed cloud-init')
@@ -295,12 +293,13 @@
295 out, err = proc.communicate()293 out, err = proc.communicate()
296 return proc, out, err294 return proc, out, err
297295
296 def get_remote_content(self, path):
297 _, content, _ = self.ssh('cat', path, out=subprocess.PIPE)
298 return content
299
298 def run_test(self, package):300 def run_test(self, package):
299 proc, _, _ = self.ssh('/tmp/run_test.py', package)301 proc, _, _ = self.ssh('/tmp/run_test.py', package)
300 # FIXME: Does that look like a pipe ? -- vila 2014-02-06302 return proc.returncode
301 _, subunit_stream, _ = self.ssh('cat', 'result.subunit',
302 out=subprocess.PIPE)
303 return proc.returncode, subunit_stream
304303
305304
306# The following are helpers for local manual tests, they should disappear once305# The following are helpers for local manual tests, they should disappear once
@@ -345,6 +344,10 @@
345 from ci_utils import dump_stack344 from ci_utils import dump_stack
346 dump_stack.install_stack_dump_signal()345 dump_stack.install_stack_dump_signal()
347 test_bed = test_print_ip()346 test_bed = test_print_ip()
348 print test_bed.run_test('libpng') # Known to run no tests347 # libpng is known to run no tests
349 print test_bed.run_test('juju-core')348 for package in ('libpng',):
349 print test_bed.run_test(package)
350# print test_bed.get_remote_content('{}.log'.format(package))
351 print test_bed.get_remote_content('{}-summary.log'.format(package))
352 print test_bed.get_remote_content('{}.subunit'.format(package))
350 test_bed.teardown()353 test_bed.teardown()

Subscribers

People subscribed via source and target branches