Merge lp:~vila/u1-test-utils/1182875-mail-hangs into lp:u1-test-utils

Proposed by Vincent Ladeuil
Status: Merged
Approved by: Leo Arias
Approved revision: 70
Merged at revision: 70
Proposed branch: lp:~vila/u1-test-utils/1182875-mail-hangs
Merge into: lp:u1-test-utils
Diff against target: 29 lines (+11/-0)
1 file modified
u1testutils/selftests/unit/test_mail.py (+11/-0)
To merge this branch: bzr merge lp:~vila/u1-test-utils/1182875-mail-hangs
Reviewer Review Type Date Requested Status
Leo Arias (community) Approve
Review via email: mp+165122@code.launchpad.net

Commit message

Workaround race condition in test_mail.py

Description of the change

This is a stop-gap fix for bug #1182875 which is blocking landings.

First, it declares the thread as a daemonic one so that when the setup fails the process can be stopped. The tearDownClass appears to not be run when an error occurs in setUpClass so the thread is left running.

The second part makes an explicit time.sleep() to work around the lack of explicit synchronization mechanism with localmail.run().

A better fix would probably be to run the mail server in a different process and ensures they give back the port they're running on (addressing the issue encountered here).

To post a comment you must log in.
Revision history for this message
Leo Arias (elopio) wrote :

<elopio> vila: I didn't now about the daemon attribute. Looks good.
<elopio> but the sleep doesn't. Or maybe I misunderstood the docs.
<elopio> vila: my idea with this was that settimeout will fail after 10 seconds of trying to connect.
<vila> elopio: yeah, you know how much I love sleep :-/
<elopio> so the sleep, as I understand, will not help with the problem.
<vila> elopio: except it does ;)
<vila> elopio: but the timeout doesn't
<elopio> vila: could you reproduce it?
<vila> elopio: I did, that's how I found the sleep() call was working around the issue
<cgoldberg> i had a cool retry decorator a while back.. with a timeout... might be of use
<vila> elopio: and I suspect it happens more on jenkins because of the load
<vila> there
<elopio> vila: reading at the documentation of settimeout again, I think we might be missing a call to setblocking.
<elopio> ah, no, blocking mode is the default.
<vila> cgoldberg: that may be helpful ! but in this specific case I would hesitate to retry a setUpClass method
<elopio> then I don't understand. But if the sleep solves your problem, I'll approve it.
<vila> elopio: well, my problem in this case is that landing on u1-t-u are blocked ;)
<vila> elopio: the root issue is that the client is trying to connect before the server has started
<vila> elopio: which is what you wanted to test right ?
<elopio> vila: yes. Yesterday I tried to reproduce the problem, asked on ops, and still didn't understand what was goig on.
<vila> elopio: so I don't think this test is reliable, I've proposed the workaround to unblock landing while we discuss a better fix
<elopio> vila: right. What I don't understand is why settimeout + connect doesn't work. According to the docs it should. So what I thought was that we were waiting 10 seconds, and the server hasn't started.
<vila> elopio: well, sleep(0) was failing, sleep(0.1) passed, so it's far from requiring 10 seconds to start
<elopio> if that's the case, waiting 10.1 seconds would be the same. But, as I said, if you could reproduce the scenario and this works
<elopio> I'll approve now and understand later.

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'u1testutils/selftests/unit/test_mail.py'
2--- u1testutils/selftests/unit/test_mail.py 2013-01-21 16:19:08 +0000
3+++ u1testutils/selftests/unit/test_mail.py 2013-05-22 13:28:51 +0000
4@@ -16,6 +16,7 @@
5
6 import socket
7 import threading
8+import time
9 import unittest
10
11 import localmail
12@@ -33,7 +34,17 @@
13 def setUpClass(class_):
14 class_.localmail_thread = threading.Thread(
15 target=localmail.run, args=(class_.SMTP_PORT, settings.IMAP_PORT))
16+ # Avoid hanging the main process when something goes wrong
17+ class_.localmail_thread.daemon = True
18 class_.localmail_thread.start()
19+ # There is a race here as there is no (easy, that I know of) way to
20+ # guarantee that the server has reached the point that it is listening
21+ # to the socket. This leads to conection errors in these cases. So the
22+ # workaround is to... sleep :-/ The proper way to handle this would be
23+ # to add an Event() in the server that can be waited on in the client
24+ # (or any other sync mechanism including making sure run() blocks until
25+ # the listen calls have been really executed) -- vila 2013-05-22
26+ time.sleep(0.1)
27 # Try the port to make sure the server started.
28 socket_ = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
29 socket_.settimeout(10)

Subscribers

People subscribed via source and target branches

to all changes: