Merge lp:~vila/bzr/405745-http-hangs into lp:bzr

Proposed by Vincent Ladeuil on 2009-10-08
Status: Merged
Merged at revision: 5396
Proposed branch: lp:~vila/bzr/405745-http-hangs
Merge into: lp:bzr
Diff against target: 254 lines (+99/-68)
3 files modified
NEWS (+3/-0)
bzrlib/tests/http_server.py (+93/-66)
bzrlib/tests/test_http.py (+3/-2)
To merge this branch: bzr merge lp:~vila/bzr/405745-http-hangs
Reviewer Review Type Date Requested Status
Andrew Bennetts 2009-10-08 Needs Fixing on 2009-10-08
Review via email: mp+13050@code.launchpad.net

This proposal supersedes a proposal from 2009-10-07.

To post a comment you must log in.
Vincent Ladeuil (vila) wrote : Posted in a previous version of this proposal

Thanks to Cris Boylan that keep nagging me on the problem, this patch
addresses the http tests hanging on AIX.

The crux of the issue is that shutdown() can be implemented differently on various systems
and we were trying to handle the socket from two different threads, which is frowned upon on AIX.

This patch stops the server more "manually" but more importantly more cleanly by
connecting to the server (allowing it to wake up from the listen() call and terminate
cleanly).

The server thread can now be joined without trouble reducing the leaking tests from ~300 to ~200
(when running bzr selftest -s bt.test_http).

Note that I didn't add any test here.
I tried to modify test_http.TestHTTPServer.test_server_start_and_stop to check
that no threads were leaked, but that just pointed out that the threading functions
active_count() and enumerate() are very happy to disagree about the number of active threads
even when called one after the other (I can't demonstrate it but I experimented it in ~5-10%
of my test runs).

Other submissions will follow shortly to address more leaks (first http/1.1, then ftp).

Andrew Bennetts (spiv) wrote : Posted in a previous version of this proposal

Looks good to me. Nice to see large scary comments get removed!

review: Approve
Vincent Ladeuil (vila) wrote :
Download full text (8.1 KiB)

I had to merge some refactoring done for bug #392127 so that the 2.4 band-aids stay smaller.

I changed the server.tearDown() into server.shutdown() as that better matches the SocketServer.py methods names.

I'm not particularly happy with having to override server_bind() twice, but that will be solved
with #392127 where the inheritance scheme is a bit different.

I may need to make connect_socket() a bit more reusable but I also plan to make more
test_server bits reusable so that could wait too.

Here is the differential diff (but it's ~200 lines while the full one is ~250):
=== modified file 'bzrlib/tests/http_server.py'
--- bzrlib/tests/http_server.py 2009-10-07 10:24:04 +0000
+++ bzrlib/tests/http_server.py 2009-10-08 08:25:53 +0000
@@ -317,27 +317,70 @@
         # the tests cases.
         self.test_case_server = test_case_server
         self._home_dir = test_case_server._home_dir
-
- def tearDown(self):
- """Called to clean-up the server.
-
- Since the server may be (surely is, even) in a blocking listen, we
- shutdown its socket before closing it.
- """
- # The server is listening for a last connection, let's give it:
- try:
- fake_conn = socket.create_connection(self.server_address)
- fake_conn.close()
- except socket.error, e:
- # But ignore connection errors as the point is to unblock the
- # server thread, it may happen that it's not blocked or even not
- # started (when something went wrong during test case setup
- # leading to self.setUp() *not* being called but self.tearDown()
- # still being called)
- pass
-
-
-class TestingHTTPServer(SocketServer.TCPServer, TestingHTTPServerMixin):
+ self.serving = False
+ self.is_shut_down = threading.Event()
+
+ def serve(self):
+ self.serving = True
+ self.is_shut_down.clear()
+ while self.serving:
+ try:
+ # Really a connection but the python framework is generic and
+ # call them requests
+ self.handle_request()
+ except socket.timeout:
+ pass
+ except (socket.error, select.error), e:
+ if e[0] == errno.EBADF:
+ # Starting with python-2.6, handle_request may raise socket
+ # or select exceptions when the server is shut down as we
+ # do.
+ pass
+ else:
+ raise
+ # Let's close the listening socket
+ self.server_close()
+ self.is_shut_down.set()
+
+ def connect_socket(self):
+ msg = "getaddrinfo returns an empty list"
+ for res in socket.getaddrinfo(*self.server_address):
+ af, socktype, proto, canonname, sa = res
+ sock = None
+ try:
+ sock = socket.socket(af, socktype, proto)
+ sock.connect(sa)
+ return sock
+
+ except socket.error, msg:
+ if sock is not None:
+ sock.close()
+ raise socket.error, msg
+
+ def shutdown(self):...

Read more...

Andrew Bennetts (spiv) wrote :

+ def connect_socket(self):
+ msg = "getaddrinfo returns an empty list"
+ for res in socket.getaddrinfo(*self.server_address):
[...]
+ except socket.error, msg:
+ if sock is not None:
+ sock.close()
+ raise socket.error, msg
+

"except socket.error, msg" is bogus, especially when "msg" was a string earlier. It's also a bit arbitrary to that if no addresses work that this will raise an error from the last address, rather than the first, or some sort of aggregation.

I'd probably write this as:
    err = socket.error("getaddrinfo returned an empty list")
    for ...
        except socket.error, err:
            # 'err' is now the most recent error
            if sock is not None:
                sock.close()
    raise err

Otherwise this seems ok. bb:tweak.

review: Needs Fixing
lp:~vila/bzr/405745-http-hangs updated on 2009-10-13
4736. By Vincent Ladeuil on 2009-10-12

Fixed as per Andrew's review.

* bzrlib/tests/http_server.py:
(TestingHTTPServerMixin.connect_socket): Be less dirty to handle
the socket errors.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'NEWS'
2--- NEWS 2009-11-30 04:49:31 +0000
3+++ NEWS 2009-11-30 09:20:31 +0000
4@@ -552,6 +552,9 @@
5 Testing
6 *******
7
8+* HTTP test servers will leak less threads (and sockets) and will not hang on
9+ AIX anymore. (Vincent Ladeuil, #405745)
10+
11 * Passing ``--lsprof-tests -v`` to bzr selftest will cause lsprof output to
12 be output for every test. Note that this is very verbose! (Robert Collins)
13
14
15=== modified file 'bzrlib/tests/http_server.py'
16--- bzrlib/tests/http_server.py 2009-11-11 07:23:59 +0000
17+++ bzrlib/tests/http_server.py 2009-11-30 09:20:31 +0000
18@@ -317,44 +317,71 @@
19 # the tests cases.
20 self.test_case_server = test_case_server
21 self._home_dir = test_case_server._home_dir
22-
23- def tearDown(self):
24- """Called to clean-up the server.
25-
26- Since the server may be (surely is, even) in a blocking listen, we
27- shutdown its socket before closing it.
28- """
29- # Note that is this executed as part of the implicit tear down in the
30- # main thread while the server runs in its own thread. The clean way
31- # to tear down the server is to instruct him to stop accepting
32- # connections and wait for the current connection(s) to end
33- # naturally. To end the connection naturally, the http transports
34- # should close their socket when they do not need to talk to the
35- # server anymore. This happens naturally during the garbage collection
36- # phase of the test transport objetcs (the server clients), so we
37- # don't have to worry about them. So, for the server, we must tear
38- # down here, from the main thread, when the test have ended. Note
39- # that since the server is in a blocking operation and since python
40- # use select internally, shutting down the socket is reliable and
41- # relatively clean.
42- try:
43- self.socket.shutdown(socket.SHUT_RDWR)
44- except socket.error, e:
45- # WSAENOTCONN (10057) 'Socket is not connected' is harmless on
46- # windows (occurs before the first connection attempt
47- # vila--20071230)
48-
49- # 'Socket is not connected' can also occur on OSX, with a
50- # "regular" ENOTCONN (when something went wrong during test case
51- # setup leading to self.setUp() *not* being called but
52- # self.tearDown() still being called -- vila20081106
53- if not len(e.args) or e.args[0] not in (errno.ENOTCONN, 10057):
54- raise
55- # Let the server properly close the socket
56- self.server_close()
57-
58-
59-class TestingHTTPServer(SocketServer.TCPServer, TestingHTTPServerMixin):
60+ self.serving = False
61+ self.is_shut_down = threading.Event()
62+
63+ def serve(self):
64+ self.serving = True
65+ self.is_shut_down.clear()
66+ while self.serving:
67+ try:
68+ # Really a connection but the python framework is generic and
69+ # call them requests
70+ self.handle_request()
71+ except socket.timeout:
72+ pass
73+ except (socket.error, select.error), e:
74+ if e[0] == errno.EBADF:
75+ # Starting with python-2.6, handle_request may raise socket
76+ # or select exceptions when the server is shut down as we
77+ # do.
78+ pass
79+ else:
80+ raise
81+ # Let's close the listening socket
82+ self.server_close()
83+ self.is_shut_down.set()
84+
85+ def connect_socket(self):
86+ err = socket.error('getaddrinfo returns an empty list')
87+ for res in socket.getaddrinfo(*self.server_address):
88+ af, socktype, proto, canonname, sa = res
89+ sock = None
90+ try:
91+ sock = socket.socket(af, socktype, proto)
92+ sock.connect(sa)
93+ return sock
94+
95+ except socket.error, err:
96+ # 'err' is now the most recent error
97+ if sock is not None:
98+ sock.close()
99+ raise err
100+
101+ def shutdown(self):
102+ """Stops the serve() loop.
103+
104+ Blocks until the loop has finished. This must be called while serve()
105+ is running in another thread, or it will deadlock.
106+ """
107+ if not self.serving:
108+ return
109+ self.serving = False
110+ # The server is listening for a last connection, let's give it:
111+ try:
112+ fake_conn = self.connect_socket()
113+ fake_conn.close()
114+ except socket.error, e:
115+ # But ignore connection errors as the point is to unblock the
116+ # server thread, it may happen that it's not blocked or even not
117+ # started (when something went wrong during test case setup
118+ # leading to self.setUp() *not* being called but self.tearDown()
119+ # still being called)
120+ pass
121+ self.is_shut_down.wait()
122+
123+
124+class TestingHTTPServer(TestingHTTPServerMixin, SocketServer.TCPServer):
125
126 def __init__(self, server_address, request_handler_class,
127 test_case_server):
128@@ -362,9 +389,15 @@
129 SocketServer.TCPServer.__init__(self, server_address,
130 request_handler_class)
131
132-
133-class TestingThreadingHTTPServer(SocketServer.ThreadingTCPServer,
134- TestingHTTPServerMixin):
135+ def server_bind(self):
136+ SocketServer.TCPServer.server_bind(self)
137+ if sys.version < (2, 5):
138+ self.server_address = self.socket.getsockname()
139+
140+
141+class TestingThreadingHTTPServer(TestingHTTPServerMixin,
142+ SocketServer.ThreadingTCPServer,
143+ ):
144 """A threading HTTP test server for HTTP 1.1.
145
146 Since tests can initiate several concurrent connections to the same http
147@@ -399,6 +432,11 @@
148 else:
149 raise
150
151+ def server_bind(self):
152+ SocketServer.ThreadingTCPServer.server_bind(self)
153+ if sys.version < (2, 5):
154+ self.server_address = self.socket.getsockname()
155+
156
157 class HttpServer(transport.Server):
158 """A test server for http transports.
159@@ -462,18 +500,22 @@
160 raise httplib.UnknownProtocol(proto_vers)
161 else:
162 self._httpd = self.create_httpd(serv_cls, rhandler)
163+<<<<<<< TREE
164 self.host, self.port = self._httpd.socket.getsockname()
165+=======
166+ # Ensure we get the right port
167+ host, self.port = self._httpd.server_address
168+>>>>>>> MERGE-SOURCE
169 return self._httpd
170
171 def _http_start(self):
172 """Server thread main entry point. """
173- self._http_running = False
174+ server = None
175 try:
176 try:
177- httpd = self._get_httpd()
178+ server = self._get_httpd()
179 self._http_base_url = '%s://%s:%s/' % (self._url_protocol,
180 self.host, self.port)
181- self._http_running = True
182 except:
183 # Whatever goes wrong, we save the exception for the main
184 # thread. Note that since we are running in a thread, no signal
185@@ -486,21 +528,8 @@
186
187 # From now on, exceptions are taken care of by the
188 # SocketServer.BaseServer or the request handler.
189- while self._http_running:
190- try:
191- # Really an HTTP connection but the python framework is generic
192- # and call them requests
193- httpd.handle_request()
194- except socket.timeout:
195- pass
196- except (socket.error, select.error), e:
197- if e[0] == errno.EBADF:
198- # Starting with python-2.6, handle_request may raise socket
199- # or select exceptions when the server is shut down (as we
200- # do).
201- pass
202- else:
203- raise
204+ if server is not None:
205+ server.serve()
206
207 def _get_remote_url(self, path):
208 path_parts = path.split(os.path.sep)
209@@ -527,10 +556,10 @@
210 """
211 # XXX: TODO: make the server back onto vfs_server rather than local
212 # disk.
213- if not (backing_transport_server is None or \
214- isinstance(backing_transport_server, local.LocalURLServer)):
215+ if not (backing_transport_server is None
216+ or isinstance(backing_transport_server, local.LocalURLServer)):
217 raise AssertionError(
218- "HTTPServer currently assumes local transport, got %s" % \
219+ "HTTPServer currently assumes local transport, got %s" %
220 backing_transport_server)
221 self._home_dir = os.getcwdu()
222 self._local_path_parts = self._home_dir.split(os.path.sep)
223@@ -556,11 +585,9 @@
224
225 def tearDown(self):
226 """See bzrlib.transport.Server.tearDown."""
227- self._httpd.tearDown()
228 self._http_running = False
229- # We don't need to 'self._http_thread.join()' here since the thread is
230- # a daemonic one and will be garbage collected anyway. Joining just
231- # slows us down for no added benefit.
232+ self._httpd.shutdown()
233+ self._http_thread.join()
234
235 def get_url(self):
236 """See bzrlib.transport.Server.get_url."""
237
238=== modified file 'bzrlib/tests/test_http.py'
239--- bzrlib/tests/test_http.py 2009-11-18 16:02:53 +0000
240+++ bzrlib/tests/test_http.py 2009-11-30 09:20:31 +0000
241@@ -325,10 +325,11 @@
242 server = http_server.HttpServer()
243 server.setUp()
244 try:
245- self.assertTrue(server._http_running)
246+ self.assertTrue(server._httpd is not None)
247+ self.assertTrue(server._httpd.serving)
248 finally:
249 server.tearDown()
250- self.assertFalse(server._http_running)
251+ self.assertFalse(server._httpd.serving)
252
253 def test_create_http_server_one_zero(self):
254 class RequestHandlerOneZero(http_server.TestingHTTPRequestHandler):