Merge lp:~thomir-deactivatedaccount/core-result-checker/trunk-detect-adt-status-and-republish-maybe into lp:core-result-checker

Proposed by Thomi Richards
Status: Merged
Approved by: Thomi Richards
Approved revision: 12
Merged at revision: 12
Proposed branch: lp:~thomir-deactivatedaccount/core-result-checker/trunk-detect-adt-status-and-republish-maybe
Merge into: lp:core-result-checker
Diff against target: 95 lines (+49/-3)
1 file modified
core_result_checker/__init__.py (+49/-3)
To merge this branch: bzr merge lp:~thomir-deactivatedaccount/core-result-checker/trunk-detect-adt-status-and-republish-maybe
Reviewer                      Review Type   Date Requested   Status
Francis Ginther                                              Needs Information
Celso Providelo (community)                                  Approve
Review via email: mp+255316@code.launchpad.net

Commit message

Requeue tests if infrastructure failed.

Description of the change

Re-queue test payloads into core.tests.v1 if the adt exit_code indicates an infrastructure error.
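
A minimal sketch of the behaviour described above, assuming the exit codes and the test_run_retry_count key used in the diff. The in-memory queues mapping and the requeue_if_infrastructure_failure helper are hypothetical stand-ins for the AMQP connection and are not part of this branch:

```python
from collections import defaultdict, deque

# Exit codes the diff treats as infrastructure failures.
INFRASTRUCTURE_EXIT_CODES = (16, 20, 100)

# Hypothetical stand-in for the AMQP server: queue name -> messages.
queues = defaultdict(deque)


def requeue_if_infrastructure_failure(payload, max_retries=3):
    """Return the queue name the payload was sent to, or None."""
    if int(payload['exit_code']) not in INFRASTRUCTURE_EXIT_CODES:
        return None  # A genuine test result; nothing to requeue.
    retry_count = int(payload.get('test_run_retry_count', '0'))
    if retry_count < max_retries:
        # Still within budget: bump the counter and run the test again.
        payload['test_run_retry_count'] = retry_count + 1
        target = 'core.tests.v1'
    else:
        # Retries exhausted: park the payload for manual inspection.
        target = 'core.deadletters.v1'
    queues[target].append(payload)
    return target
```

The retry counter travels inside the payload itself, so no state has to be kept in the checker between runs.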

Celso Providelo (cprov) :
review: Approve
Francis Ginther (fginther) wrote :

This looks wrong to me:

56 +            q = self.connection.SimpleQueue(
57 +                "core.tests.{}".format(constants.API_VERSION)
58 +            )

Shouldn't the core.tests queue match the version of the tester that previously ran this payload? Using the core-result-checker's API_VERSION would cause us to send messages to the wrong queue should the core-image-tester's API_VERSION get out of sync.

Do we actually need to have the name of the core-image-tester queue added to the payload so that the core-result-checker knows where to resend it?

review: Needs Information
Celso Providelo (cprov) wrote :

Francis,

result-checker only binds the core.results.v1 queue, i.e. the v1 contract is common to all processed messages, so re-posting to core.tests.v1 makes sense.

Versioned queues only allow us to deploy new, isolated solution pipelines (watcher->publisher->tester->checker) on the same AMQP server, not necessarily to mix them.
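
For reference, the alternative raised in the review could look roughly like the sketch below. The origin_queue key and the retry_queue_for helper are hypothetical; nothing in this branch adds them. The idea is that the tester would stamp its own queue name on the payload, and the checker would fall back to its own API_VERSION only when that key is absent:

```python
API_VERSION = 'v1'  # Stand-in for constants.API_VERSION.


def retry_queue_for(payload):
    """Pick the queue a failed payload should be resent to."""
    default = 'core.tests.{}'.format(API_VERSION)
    # 'origin_queue' is a hypothetical key set by the tester.
    return payload.get('origin_queue', default)
```

Per the reply above, this extra key is not needed as long as each versioned pipeline is deployed in isolation.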

Preview Diff

1=== modified file 'core_result_checker/__init__.py'
2--- core_result_checker/__init__.py 2015-04-06 23:23:22 +0000
3+++ core_result_checker/__init__.py 2015-04-07 00:08:11 +0000
4@@ -37,9 +37,10 @@
5
6 class Worker(object):
7
8-    def __init__(self, swift_publisher):
9+    def __init__(self, swift_publisher, retry_publisher):
10         get_logger(__name__).info("Service Started.")
11         self.swift_publisher = swift_publisher
12+        self.retry_publisher = retry_publisher
13
14     def __call__(self, message):
15         logger = get_logger(__name__, message)
16@@ -48,10 +49,18 @@
17             channel = message['channel']
18             device = message['device']
19             image_name = message['image_name']
20+            exit_code = message['exit_code']
21         except KeyError as e:
22             logger.error("Unable to unpack incoming message: %s", str(e))
23             return MessageActions.Retry
24
25+        if int(exit_code) in (16, 20, 100):
26+            logger.info(
27+                "Test run infrastructure failed (exit code %s), retrying.",
28+                exit_code
29+            )
30+            self.retry_publisher.retry_test_run(message)
31+            return MessageActions.Acknowledge
32         container_name = "core-{}-{}-{}".format(
33             channel,
34             device,
35@@ -123,6 +132,42 @@
36 )
37
38
39+class RetryPublisher(object):
40+
41+    """A class that knows how to retry a test payload."""
42+
43+    def __init__(self, connection, max_retries):
44+        self.connection = connection
45+        self.max_retries = max_retries
46+
47+    def retry_test_run(self, payload):
48+        """Maybe retry a test payload.
49+
50+        This will look at it's retry count, and either requeue it in the
51+        test queue, or insert it into the dead letter queue.
52+
53+        """
54+        retry_count = int(payload.get('test_run_retry_count', '0'))
55+        if retry_count < self.max_retries:
56+            q = self.connection.SimpleQueue(
57+                "core.tests.{}".format(constants.API_VERSION)
58+            )
59+            payload['test_run_retry_count'] = retry_count + 1
60+            q.put(payload)
61+            q.close()
62+        else:
63+            logger = get_logger(__name__, payload)
64+            logger.error(
65+                "Test retry count exceeds maximum. Inserting into dead "
66+                "letter queue"
67+            )
68+            q = self.connection.SimpleQueue(
69+                "core.deadletters.{}".format(constants.API_VERSION)
70+            )
71+            q.put(payload)
72+            q.close()
73+
74+
75 def main():
76     config = read_config()
77     log_path = os.path.abspath(
78@@ -133,14 +178,15 @@
79 )
80
81     amqp_uris = config.get('amqp', 'uris').split()
82-    swift_publisher = SwiftPublisher(config['nova'])
83-    worker = Worker(swift_publisher)
84     retry_policy = DefaultRetryPolicy(
85         max_retries=3,
86         dead_queue='core.deadletters.{}'.format(constants.API_VERSION)
87     )
88     try:
89         with kombu.Connection(amqp_uris) as connection:
90+            swift_publisher = SwiftPublisher(config['nova'])
91+            retry_publisher = RetryPublisher(connection, 3)
92+            worker = Worker(swift_publisher, retry_publisher)
93             queue_monitor = SimpleRabbitQueueWorker(
94                 connection,
95                 "core.result.{}".format(constants.API_VERSION),
