Make sure duplicate tests cannot be queued

Bug #1654761 reported by Timo Jyrinki
16
This bug affects 3 people
Affects Status Importance Assigned to Milestone
Auto Package Testing
Fix Released
High
Simon Quigley
britney
Opinion
Undecided
Simon Quigley

Bug Description

It is a bit too easy to request a retry of a regressing test on excuses page even if one would be queued already. One option to cure this would be to change "Regression" to "Queued" once a retry is queued, instead of keeping it as "Regression" with the retry link until it is actually running.

Or more simply, remove the retry link when one is already queued.

Also, consider making request.cgi aware of the queue so that we can avoid triggering multiple test runs with the same parameters at the same time.

Tags: adt-298

Related branches

Simon Quigley (tsimonq2)
Changed in auto-package-testing:
status: New → Confirmed
assignee: nobody → Simon Quigley (tsimonq2)
Revision history for this message
Simon Quigley (tsimonq2) wrote :

This actually isn't an issue with the autopkgtest infra on first glance; the bulk of work will need to be done in britney2's autopkgtest policy.

Changed in britney:
status: New → Confirmed
assignee: nobody → Simon Quigley (tsimonq2)
status: Confirmed → In Progress
Changed in auto-package-testing:
status: Confirmed → In Progress
Revision history for this message
Simon Quigley (tsimonq2) wrote :

So here's some general notes and some questions that I have gathered regarding the implementation of this. Please note that this is my first time working with the Britney2 or autopkgtest-cloud code, so I might be somewhat off on some specifics, so please do correct me if I'm inaccurate anywhere here.

It seems to me that the way this is implemented is that when Britney triggers a test initially, it logs that test in pending.json. It then on each run determines whether the packages in pending.json have results on autopkgtest.u.c (by checking the Swift results) and if it does, get the status from there. If it doesn't, then it keeps it marked as RUNNING. This is pretty black and white; it doesn't scan the *queue* at all to see if there's tests already there. It seems to do something similar for the packages requested manually.

Since the AMQP queue can already be sent to in the autopkgtest policy, what I can try to do is if something in the queue exactly matches a test marked as REGRESSION (in the case of it being triggered manually) or a test marked as RUNNING (in the case of Britney triggering the test), I can change the result to be QUEUED. Additionally, if a test is marked as RUNNING-ALWAYSFAIL, and it meets the conditions previously stated, I can mark it as QUEUED-ALWAYSFAIL.

One thing I am unclear on after doing that is how tests that are already running would be detected. In the case that Britney triggers the test, it's clear; that's what we have implemented already. But I'm not sure about the logic should a test be manually triggered and not in the queue but indeed running; both the previous state (very likely "REGRESSION") and QUEUED would be inaccurate. Is there a queue specifically for running tests that can be scanned? What might be the best way to do this?

Thanks, and I look forward to feedback (and corrections, if necessary) on this.

Revision history for this message
Iain Lane (laney) wrote :

proposed-migration can't check directly if things are in the queue. AMQP allows you only to insert something into the queue and remove it again.

Except for the hack which we have in webcontrol.

Also, I think that you want to refuse manually constructed URLs too. So I think I would do this in the request script. Look at get_queue_info() and running() - you should make these callable from request.cgi, and then refuse the request if you find a matching one in either of those. Probably fix this in validate_distro_request() and validate_git_request()?

This will leave links that give you an error available on excuses.html. I don't think that's so bad, but if you wanted to fix that then maybe http://autopkgtest.ubuntu.com/queues.json and a similar for running tests could be exposed to proposed-migration somehow. Not exactly sure how right now - it wouldn't be ideal for proposed-migration to be hitting autopkgtest's website.

Thanks!

Revision history for this message
Iain Lane (laney) wrote :

proposed-migration can't check directly if things are in the queue. AMQP allows you only to insert something into the queue and remove it again.

Except for the hack which we have in webcontrol.

Also, IMO we would want to refuse manually constructed URLs too. So I think I would do this in the request script. Look at get_queue_info() and running() - you should make these callable from request.cgi, and then refuse the request if you find a matching one in either of those. Probably fix this in validate_distro_request() and validate_git_request()?

This will leave links that give you an error available on excuses.html. I don't think that's so bad, but if you wanted to fix that then maybe http://autopkgtest.ubuntu.com/queues.json and a similar for running tests could be exposed to proposed-migration somehow. Not exactly sure how right now - it wouldn't be ideal for proposed-migration to be hitting autopkgtest's website. It's probably okay to leave those - the real problem is the dupe requests.

Thanks!

Revision history for this message
Simon Quigley (tsimonq2) wrote :

That seems to solve part of the problem but not exactly all of it. So after implementing the fix for bug 1686929 (two birds with one stone here), when someone manually triggers a test, request.cgi won't let duplicate tests be queued. That should be relatively simple to implement from how you described it.

However, what also needs to be figured out is a stable way to access the queue items (read only is all that's needed) from Britney. This is because Britney sends the test directly to AMQP rather than going through request.cgi. So once duplicate tests can't be triggered (i.e. the above bug is fixed), it would certainly be confusing if these things show up in excuses but triggering them doesn't work (or even worse, they show as passed). You mentioned grabbing the data from queues.json, what about this isn't ideal, and what can be improved? (Because that's likely the way to go here.)

Thanks.

Simon Quigley (tsimonq2)
summary: - Use another status for retried but queued items
+ Use another status for retried but queued items and make sure duplicate
+ tests cannot be queued
description: updated
Revision history for this message
Simon Quigley (tsimonq2) wrote : Re: Use another status for retried but queued items and make sure duplicate tests cannot be queued

I've gone ahead and made bug 1686929 a duplicate of this one so we can track it here. I've also edited the description.

My goal is to get some prototype code for at minimum the test duplication (in autopkgtest-cloud) by the end of this weekend. I think that once that code is sorted out and is known to work, it should be merged at the same time as the Britney code, which is going to come next. The reason for this would be to make sure that people aren't confused on why tests are still visible to be retried (as mentioned in the last comment) but error out on an already queued test.

Revision history for this message
Iain Lane (laney) wrote : Re: [Bug 1654761] Re: Use another status for retried but queued items

On Thu, Mar 08, 2018 at 06:16:27PM -0000, Simon Quigley wrote:
> However, what also needs to be figured out is a stable way to access the
> queue items (read only is all that's needed) from Britney. This is
> because Britney sends the test directly to AMQP rather than going
> through request.cgi. So once duplicate tests can't be triggered (i.e.
> the above bug is fixed), it would certainly be confusing if these things
> show up in excuses but triggering them doesn't work (or even worse, they
> show as passed). You mentioned grabbing the data from queues.json, what
> about this isn't ideal, and what can be improved? (Because that's likely
> the way to go here.)

I think you overstate how important that is. It's not that annoying to
see an error. You want to queue a test, you click, you find out it's
already queued. Woohoo.

update_excuses is not a live page. If you click the link, its state will
not be reflected in the page until the next refresh which in the
non-busy case is probably going to be after the test has run anyway. And
then --- what if the test in progress when the page was generated
finishes and fails before the next refresh? In this proposed design
there's no button.

In my opinion we should leave update_excuses.html alone. It'd have to be
done with JS to be done properly, but then we need to figure out how to
stream the queue/running to clients and implement a whole pile of stuff
in the page to hide and show the buttons.

Cheers,

--
Iain Lane [ <email address hidden> ]
Debian Developer [ <email address hidden> ]
Ubuntu Developer [ <email address hidden> ]

Revision history for this message
Simon Quigley (tsimonq2) wrote : Re: Use another status for retried but queued items and make sure duplicate tests cannot be queued

Alright, so I guess we can count Britney as not affected by this then. :)

Changed in britney:
status: In Progress → Opinion
Simon Quigley (tsimonq2)
summary: - Use another status for retried but queued items and make sure duplicate
- tests cannot be queued
+ Make sure duplicate tests cannot be queued
tags: added: adt-298
Changed in auto-package-testing:
importance: Undecided → High
Changed in auto-package-testing:
status: In Progress → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.