Merge ~tsimonq2/autopkgtest-cloud/+git/bug-1654761:disallow-duplicate-tests into autopkgtest-cloud:master

Proposed by Simon Quigley
Status: Rejected
Rejected by: Brian Murray
Proposed branch: ~tsimonq2/autopkgtest-cloud/+git/bug-1654761:disallow-duplicate-tests
Merge into: autopkgtest-cloud:master
Diff against target: 100 lines (+70/-0)
2 files modified
webcontrol/request/inprogress.py (+62/-0)
webcontrol/request/submit.py (+8/-0)
Reviewer Review Type Date Requested Status
Brian Murray Needs Fixing
Łukasz Zemczak Needs Fixing
Steve Langasek Needs Information
Iain Lane Pending
Review via email: mp+363643@code.launchpad.net

Description of the change

This is an IN PROGRESS merge request. I would like a review before I proceed further to make sure I'm on the right track.

This change fixes bug 1654761 and supersedes https://code.launchpad.net/~tsimonq2/autopkgtest-cloud/+git/bug-1654761/+merge/348972 - see the most recent review comment for why this code is structured how it is (including the file I used to test this code).

Left to do:
 - Write tests.
 - Implement support for reading the AMQP queue.

From the research that I have done, reading from the AMQP queue will have to be done by:
 - Declaring the queue with no autoacks.
 - Read the contents of the queue.
 - Disconnect without ACKing.

However, I am not entirely sure this is the correct line of action here. From reading through the code, there seems to be a queues.json file I could use. If that is the case, I would need an example file when it has some content to base my code off of. Any advice is appreciated here.

I tested this code by running the following (using the pastebin Iain linked in the last MP):
from inprogress import InProgress
inprogress = InProgress()
inprogress.is_test_running("/tmp/running.json", "linux", "cosmic", "amd64", ["pciutils/1:3.5.2-1ubuntu2"], ["ci-train-ppa-service/stable-phone-overlay", "ci-train-ppa-service/3343"])

To post a comment you must log in.
Revision history for this message
Steve Langasek (vorlon) wrote :

does this take care to exclude requester name from the data being compared?

review: Needs Information
Revision history for this message
Simon Quigley (tsimonq2) wrote :

Yes.

In the code comments for is_test_running, the only requester information that is actually iterated on is REQUESTINFO, so the semicolon-separated string with job information in it. It seemed to be much easier to treat that simply as a unique identifier than as a string we actually parse.

When the subdictionaries are being recursed through, the requester key is not even considered.

Let me know if you have any further questions.

Revision history for this message
Łukasz Zemczak (sil2100) wrote :

Ok, this still needs some work IMO. Also, I don't think I'd feel safe having this merged without any tests - we need to make sure it all works as expected. Especially when writing code without a good way of testing it manually.

Even though I also don't feel strongly about us parsing the running.json file, I guess that's an acceptable approach. The file is short-lived, yes, but it's quite atomic, so there shouldn't be any corruption while reading it. And I don't see any easier way of getting the running pieces. Well, we might also try to hook up into the AMQP queues like amqp-status-collector does, but that seems more complex and also possibly wasteful? That being said, I'm not an expert in the AMQP API so maybe there's an easier way.

Still, this seems like a good way forward, but it needs more work (and refactoring). Please see my inline comments about things that need fixing.

review: Needs Fixing
ffe754d... by Simon Quigley

Stylistic cleanups.

Revision history for this message
Simon Quigley (tsimonq2) wrote :

Wow, it's been too long since I've looked at this. I promise, my Python isn't as bad these days as it was a year and a half (or so) go.

The changes sil2100 requested have been made. I'm looking into tests now.

Revision history for this message
Brian Murray (brian-murray) wrote :

It looks to me like this only checks the running.json file and not the queues.json file in which case the code doesn't do everything it purports to. I'm currently looking at queues.json (which is quite large for reasons), which has a shocking number of duplicate requests and think that queues.json is the real pain point here not running.json. Is this MP something you want to finish Simon?

review: Needs Fixing
Revision history for this message
Brian Murray (brian-murray) wrote :

Also I'm less concerned about unit testing here as functional testing should reveal an issue pretty quickly and we can build a good escape hatch.

Revision history for this message
Brian Murray (brian-murray) wrote :

We ended up implementing this, including checking queues.json, a bit ago and have deployed it to production.

Unmerged commits

ffe754d... by Simon Quigley

Stylistic cleanups.

72b99f5... by Simon Quigley

Initial InProgress functionality.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1diff --git a/webcontrol/request/inprogress.py b/webcontrol/request/inprogress.py
2new file mode 100644
3index 0000000..6f7c964
4--- /dev/null
5+++ b/webcontrol/request/inprogress.py
6@@ -0,0 +1,62 @@
7+"""Check if a specified test is already queued or running
8+
9+Copyright: Simon Quigley <tsimonq2@ubuntu.com>
10+This file is licensed under the same license as the rest of this program.
11+"""
12+
13+import os
14+import json
15+
16+
17+def extract_running_data(rdata):
18+ """Try to extract the data about currently-running tests
19+
20+ If this returns None, that means there is no running data, otherwise
21+ the actual running data is returned
22+ """
23+ try:
24+ with open(rdata) as data:
25+ return json.load(data)
26+ except FileNotFoundError:
27+ return None
28+
29+def test_is_running(runninguri, package, release, arch, triggers, ppas=[]):
30+ """Determine if a test is currently running
31+
32+ This function searches the queue for the specified package. If it is
33+ found, True is returned, with False being returned if it is not found.
34+
35+ The json file passed to this function is organized in this format:
36+ {"PACKAGENAME": {"REQUESTINFO": {QUEUE INFO AND LOG TAIL}}}
37+
38+ PACKAGENAME is defined as the name of the package being tested.
39+ REQUESTINFO is identifying information for the request. It uses
40+ information like any triggers, PPA names, environment variables, etc.
41+ QUEUE INFO AND LOG TAIL is several dictionaries nested within each
42+ other containing the series name, architecture, and then build-specific
43+ information within that.
44+ """
45+
46+ # Load the JSON file for the currently-running tests
47+ running_data = extract_running_data(runninguri)
48+ if not running_data or not package in running_data:
49+ return False
50+
51+ # If any bit of the data passed to us in the arguments doesn't match
52+ # the data in the JSON file, return False
53+ for request in running_data[package]:
54+ try:
55+ metadata = running_data[package][request][release][arch][0]
56+
57+ # Convert any of the lists to sets to ensure that any out
58+ # of order list elements are not treated as different
59+ if set(metadata["triggers"]) != set(triggers):
60+ return False
61+ elif ppas and "ppas" in metadata and \
62+ set(metadata["ppas"]) != set(ppas):
63+ return False
64+ except KeyError:
65+ return False
66+
67+ # Everything has passed, return True
68+ return True
69diff --git a/webcontrol/request/submit.py b/webcontrol/request/submit.py
70index 8504127..10b8c32 100644
71--- a/webcontrol/request/submit.py
72+++ b/webcontrol/request/submit.py
73@@ -14,6 +14,7 @@ import urllib.request
74 import urllib.parse
75 from urllib.error import HTTPError
76 from datetime import datetime
77+from inprogress import test_is_running
78
79 import amqplib.client_0_8 as amqp
80
81@@ -30,6 +31,8 @@ ALLOWED_TEAMS = ['canonical-kernel-distro-team']
82 # not teams
83 ALLOWED_USERS_PERPACKAGE = {'snapcraft': ['snappy-m-o']}
84
85+RUNNING = "/tmp/running.json"
86+
87
88 class Submit:
89 def __init__(self):
90@@ -138,6 +141,11 @@ class Submit:
91 'Ubuntu, thus you are not allowed to use this '
92 'service.' % (package, trigsrc))
93
94+ # Verify that the test is not already in progress
95+ if test_is_running(RUNNING, package, release, arch, triggers, ppas):
96+ raise ValueError('Test is already in progress, please wait until '
97+ 'it is complete before queuing it again.')
98+
99 def validate_git_request(self, release, arch, package, ppas=[], env=[], **kwargs):
100 """Validate parameters for an upstream git test request
101

Subscribers

People subscribed via source and target branches