Merge lp:~vila/bzr/383920-pycurl-hangs into lp:~bzr/bzr/trunk-old
- 383920-pycurl-hangs
- Merge into trunk-old
Status: | Rejected |
---|---|
Rejected by: | Vincent Ladeuil |
Proposed branch: | lp:~vila/bzr/383920-pycurl-hangs |
Merge into: | lp:~bzr/bzr/trunk-old |
Diff against target: | 196 lines |
To merge this branch: | bzr merge lp:~vila/bzr/383920-pycurl-hangs |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Martin Pool | Approve | ||
Review via email: mp+8295@code.launchpad.net |
Commit message
Description of the change
Vincent Ladeuil (vila) wrote : | # |
Robert Collins (lifeless) wrote : | # |
I think your analysis of the problem is slightly off.
HTTP says (broadly):
- 1.1: connections are persistent unless otherwise noted/prevented
- 1.0: connections are non-persistent unless otherwise requested and not
prevented
- all messages are delimited
- by Content-Length or
- TE or
- dropping the connection
So, if:
- we're sending 1.1 as the version
- we were not sending Content-Length, nor doing TE
- and not dropping the connection
then the server was fundamentally broken, not that the client was being
overly strict.
Is the diff accurate? it seems to have bundle stuff in it...
-Rob
Vincent Ladeuil (vila) wrote : | # |
>>>>> "robert" == Robert Collins <email address hidden> writes:
robert> I think your analysis of the problem is slightly off.
robert> HTTP says (broadly):
robert> - 1.1: connections are persistent unless otherwise noted/prevented
robert> - 1.0: connections are non-persistent unless otherwise requested and not
robert> prevented
robert> - all messages are delimited
robert> - by Content-Length or
robert> - TE or
robert> - dropping the connection
robert> So, if:
robert> - we're sending 1.1 as the version
robert> - we were not sending Content-Length, nor doing TE
robert> - and not dropping the connection
robert> then the server was fundamentally broken, not that
robert> the client was being overly strict.
Right. So the test server was:
- presenting itself as 1.1,
- not sending the Content-Length,
- sending 'Connection: close',
- closing the connection
and the client hanged, so following your arguments, it means the
*client* is broken, right ? Or is there still some grey area
because the server close on error while announcing 1.1 ?
Would you prefer KnownFailures for pycurl+https instead ?
robert> Is the diff accurate? it seems to have bundle stuff in it...
Yes to both points, the commit messages says... d-a-m-n--i-t,
commit messages don't appear in mails, <suffering/> right, I
should have think about that and explain in the cover letter :-/
A test was added there (long ago, which breaks under the same conditions
than the others, to avoid further regressions there, I refactor
the three existing tests, since there was a bunch of duplication,
that makes some noise.
Vincent
Robert Collins (lifeless) wrote : | # |
On Tue, 2009-07-07 at 13:15 +0000, Vincent Ladeuil wrote:
> Right. So the test server was:
> - presenting itself as 1.1,
> - not sending the Content-Length,
> - sending 'Connection: close',
> - closing the connection
This is valid. How were we closing it? shutdown(both), or halfshutdown,
then close, or ...? Perhaps this question doesn't matter, if doing
exactly the same thing but adding the header was enough to stop the
symptoms.
> and the client hanged, so following your arguments, it means the
> *client* is broken, right ? Or is there still some grey area
> because the server close on error while announcing 1.1 ?
Is it using https, or just http? I can imagine that there is some https
teardown stuff that perhaps we aren't doing right.
> Would you prefer KnownFailures for pycurl+https instead ?
I'd prefer to know the real cause. If valid http behaviour makes pycurl
hang, then real servers on the internet will be able to make bzr hang,
and thats bad. As it stands, I feel that landing this is a bit of an
anti-fix, as it is a workaround we can't reasonably ask live internet
servers to make.
-Rob
Martin Pool (mbp) wrote : | # |
31 + def send_error(self, code, message=None):
32 + """Send and log an error reply.
33 +
34 + Replace the bogus BaseHTTPServer.py with one that add the
35 + Content-Length header to address strict http clients.
36 + """
Please add a comment pointing to the bug so we know what particular problem this causes.
+ self.send_
I'd be a bit concerned about this being wrong if the message happens to be unicode - this will get the length in characters and then it'll be written as bytes. But maybe that never happens.
57 + if self.command != 'HEAD' and code >= 200 and code not in (204, 304):
58 + self.wfile.
This should possibly handle OPTIONS too, though I doubt we ever send it and it may not even reach this function, so it's academic.
Martin Pool (mbp) wrote : | # |
> On Tue, 2009-07-07 at 13:15 +0000, Vincent Ladeuil wrote:
>
> > Right. So the test server was:
> > - presenting itself as 1.1,
> > - not sending the Content-Length,
> > - sending 'Connection: close',
> > - closing the connection
>
> This is valid. How were we closing it? shutdown(both), or halfshutdown,
> then close, or ...? Perhaps this question doesn't matter, if doing
> exactly the same thing but adding the header was enough to stop the
> symptoms.
I can imagine some deadlocks occurring because they're both in the same process - perhaps the socket is relying on gc collect and it's not running before the deadlock occurs.
> > Would you prefer KnownFailures for pycurl+https instead ?
>
> I'd prefer to know the real cause. If valid http behaviour makes pycurl
> hang, then real servers on the internet will be able to make bzr hang,
> and thats bad. As it stands, I feel that landing this is a bit of an
> anti-fix, as it is a workaround we can't reasonably ask live internet
> servers to make.
I don't think we should hold this patch hostage to that. It's making the test server more correct and it's fixing a real problem that the test suite hangs. If we get reports that bzr hangs against real http/1.1 servers, which I don't remember seeing so far, then we can look at how to test that.
Robert Collins (lifeless) wrote : | # |
On Wed, 2009-07-08 at 00:09 +0000, Martin Pool wrote:
> I can imagine some deadlocks occurring because they're both in the
> same process - perhaps the socket is relying on gc collect and it's
> not running before the deadlock occurs.
Thats another possibility yes. Good catch.
> > > Would you prefer KnownFailures for pycurl+https instead ?
> >
> > I'd prefer to know the real cause. If valid http behaviour makes
> pycurl
> > hang, then real servers on the internet will be able to make bzr
> hang,
> > and thats bad. As it stands, I feel that landing this is a bit of an
> > anti-fix, as it is a workaround we can't reasonably ask live
> internet
> > servers to make.
>
> I don't think we should hold this patch hostage to that. It's making
> the test server more correct and it's fixing a real problem that the
> test suite hangs. If we get reports that bzr hangs against real
> http/1.1 servers, which I don't remember seeing so far, then we can
> look at how to test that.
The patch *does not* make the server more correct. It does prevent
hangs, but still, I think we should not stop looking at this till we
know the actual cause. Otherwise we've only half completed the root
cause analysis.
One reason we probably don't see reports of hangs, is that error cases
are relatively rare for bzr - we generally request resources that exist.
-Rob
Vincent Ladeuil (vila) wrote : | # |
>>>>> "robert" == Robert Collins <email address hidden> writes:
robert> On Tue, 2009-07-07 at 13:15 +0000, Vincent Ladeuil wrote:
>> Right. So the test server was:
>> - presenting itself as 1.1,
>> - not sending the Content-Length,
>> - sending 'Connection: close',
>> - closing the connection
robert> This is valid. How were we closing it? shutdown(both),
No
robert> or halfshutdown,
Not sure what you mean.
robert> then close, or ...?
Close the read and write parts of the socket. Simple closes.
robert> Perhaps this question doesn't matter, if doing
robert> exactly the same thing but adding the header was
robert> enough to stop the symptoms.
I was about to either:
- look at the C code,
- test against real servers (via my local_test_server) plugins,
- keep thinking...
And then I thought about the send_error method that I didn't
fully trust in the past and thought, well, let's just try
that.. and it worked.
>> and the client hanged, so following your arguments, it
>> means the *client* is broken, right ? Or is there still
>> some grey area because the server close on error while
>> announcing 1.1 ?
robert> Is it using https, or just http?
https *only*, http is fine...
robert> I can imagine that there is some https teardown stuff
robert> that perhaps we aren't doing right.
Whatever we do, a network connection loss will not, the client is
not supposed to hang either in that case no ?
>> Would you prefer KnownFailures for pycurl+https instead ?
robert> I'd prefer to know the real cause.
Cost/benefit analysis stopped me there, whatever the bug is, it
came from a change in either python or pycurl as I can guarantee
that at one point I had a python2.
the full test suite. It's a pity I didn't take note of the
precise versions involved.
robert> If valid http behaviour makes pycurl hang, then real
robert> servers on the internet will be able to make bzr
robert> hang, and thats bad.
I totally share that view.
robert> As it stands, I feel that landing this is a bit of an
robert> anti-fix, as it is a workaround we can't reasonably
robert> ask live internet servers to make.
Iff there are servers in the wild acting as described above (1.1,
close on error without Content-Length)...
My plan for that is to keep adding real servers to the
local_test_server plugin.
Overall, I think my time is better spent that way than fixing
bugs in pycurl instead of working on certificate validation for
urllib...
Vincent
Preview Diff
1 | === modified file 'NEWS' |
2 | --- NEWS 2009-07-06 06:46:30 +0000 |
3 | +++ NEWS 2009-07-08 06:35:20 +0000 |
4 | @@ -75,6 +75,10 @@ |
5 | of a remote repository. |
6 | (Jelmer Vernooij, #332194) |
7 | |
8 | +* Fix bogus ``send_error`` in python's BaseHTTPServer by inserting the missing |
9 | + 'Content-Length header. https pycurl clients can hang otherwise. |
10 | + (Vincent Ladeuil, #383920). |
11 | + |
12 | * Force deletion of readonly files during merge, update and other tree |
13 | transforms. |
14 | (Craig Hewetson, Martin Pool, #218206) |
15 | |
16 | === modified file 'bzrlib/tests/http_server.py' |
17 | --- bzrlib/tests/http_server.py 2009-03-23 14:59:43 +0000 |
18 | +++ bzrlib/tests/http_server.py 2009-07-08 06:35:20 +0000 |
19 | @@ -14,6 +14,7 @@ |
20 | # along with this program; if not, write to the Free Software |
21 | # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA |
22 | |
23 | +import BaseHTTPServer |
24 | import errno |
25 | import httplib |
26 | import os |
27 | @@ -144,6 +145,35 @@ |
28 | |
29 | return SimpleHTTPServer.SimpleHTTPRequestHandler.send_head(self) |
30 | |
31 | + def send_error(self, code, message=None): |
32 | + """Send and log an error reply. |
33 | + |
34 | + Replace the bogus BaseHTTPServer.py with one that add the |
35 | + Content-Length header to address strict http clients. |
36 | + """ |
37 | + |
38 | + try: |
39 | + short, long = self.responses[code] |
40 | + except KeyError: |
41 | + short, long = '???', '???' |
42 | + if message is None: |
43 | + message = short |
44 | + explain = long |
45 | + self.log_error("code %d, message %s", code, message) |
46 | + # using _quote_html to prevent Cross Site Scripting attacks (see bug |
47 | + # #1100201) |
48 | + content = (self.error_message_format % |
49 | + {'code': code, |
50 | + 'message': BaseHTTPServer._quote_html(message), |
51 | + 'explain': explain}) |
52 | + self.send_response(code, message) |
53 | + self.send_header("Content-Length", "%d" % len(content)) |
54 | + self.send_header("Content-Type", self.error_content_type) |
55 | + self.send_header('Connection', 'close') |
56 | + self.end_headers() |
57 | + if self.command != 'HEAD' and code >= 200 and code not in (204, 304): |
58 | + self.wfile.write(content) |
59 | + |
60 | def send_range_content(self, file, start, length): |
61 | file.seek(start) |
62 | self.wfile.write(file.read(length)) |
63 | |
64 | === modified file 'bzrlib/tests/test_read_bundle.py' |
65 | --- bzrlib/tests/test_read_bundle.py 2009-03-23 14:59:43 +0000 |
66 | +++ bzrlib/tests/test_read_bundle.py 2009-07-08 06:35:20 +0000 |
67 | @@ -84,70 +84,54 @@ |
68 | class TestReadBundleFromURL(TestTransportImplementation): |
69 | """Test that read_bundle works properly across multiple transports""" |
70 | |
71 | + def setUp(self): |
72 | + super(TestReadBundleFromURL, self).setUp() |
73 | + self.bundle_name = 'test_bundle' |
74 | + # read_mergeable_from_url will invoke get_transport which may *not* |
75 | + # respect self._transport (i.e. returns a transport that is different |
76 | + # from the one we want to test, so we must inject a correct transport |
77 | + # into possible_transports first. |
78 | + self.possible_transports = [self.get_transport(self.bundle_name)] |
79 | + self._captureVar('BZR_NO_SMART_VFS', None) |
80 | + wt = self.create_test_bundle() |
81 | + |
82 | + def read_mergeable_from_url(self, url): |
83 | + return bzrlib.bundle.read_mergeable_from_url( |
84 | + url, possible_transports=self.possible_transports) |
85 | + |
86 | def get_url(self, relpath=''): |
87 | return bzrlib.urlutils.join(self._server.get_url(), relpath) |
88 | |
89 | def create_test_bundle(self): |
90 | out, wt = create_bundle_file(self) |
91 | if self.get_transport().is_readonly(): |
92 | - f = open('test_bundle', 'wb') |
93 | - try: |
94 | - f.write(out.getvalue()) |
95 | - finally: |
96 | - f.close() |
97 | + self.build_tree_contents([(self.bundle_name, out.getvalue())]) |
98 | else: |
99 | - self.get_transport().put_file('test_bundle', out) |
100 | - self.log('Put to: %s', self.get_url('test_bundle')) |
101 | + self.get_transport().put_file(self.bundle_name, out) |
102 | + self.log('Put to: %s', self.get_url(self.bundle_name)) |
103 | return wt |
104 | |
105 | def test_read_mergeable_from_url(self): |
106 | - self._captureVar('BZR_NO_SMART_VFS', None) |
107 | - wt = self.create_test_bundle() |
108 | - if wt is None: |
109 | - return |
110 | - # read_mergeable_from_url will invoke get_transport which may *not* |
111 | - # respect self._transport (i.e. returns a transport that is different |
112 | - # from the one we want to test, so we must inject a correct transport |
113 | - # into possible_transports first. |
114 | - t = self.get_transport('test_bundle') |
115 | - possible_transports = [t] |
116 | - info = bzrlib.bundle.read_mergeable_from_url( |
117 | - unicode(self.get_url('test_bundle')), |
118 | - possible_transports=possible_transports) |
119 | + info = self.read_mergeable_from_url( |
120 | + unicode(self.get_url(self.bundle_name))) |
121 | revision = info.real_revisions[-1] |
122 | self.assertEqual('commit-1', revision.revision_id) |
123 | |
124 | def test_read_fail(self): |
125 | # Trying to read from a directory, or non-bundle file |
126 | # should fail with NotABundle |
127 | - self._captureVar('BZR_NO_SMART_VFS', None) |
128 | - wt = self.create_test_bundle() |
129 | - if wt is None: |
130 | - return |
131 | - |
132 | - self.assertRaises(errors.NotABundle, |
133 | - bzrlib.bundle.read_mergeable_from_url, |
134 | - self.get_url('tree')) |
135 | - self.assertRaises(errors.NotABundle, |
136 | - bzrlib.bundle.read_mergeable_from_url, |
137 | - self.get_url('tree/a')) |
138 | + self.assertRaises(errors.NotABundle, |
139 | + self.read_mergeable_from_url, self.get_url('tree')) |
140 | + self.assertRaises(errors.NotABundle, |
141 | + self.read_mergeable_from_url, self.get_url('tree/a')) |
142 | |
143 | def test_read_mergeable_respects_possible_transports(self): |
144 | - t = self.get_transport('test_bundle') |
145 | - if not isinstance(t, bzrlib.transport.ConnectedTransport): |
146 | + if not isinstance(self.get_transport(self.bundle_name), |
147 | + bzrlib.transport.ConnectedTransport): |
148 | # There is no point testing transport reuse for not connected |
149 | # transports (the test will fail even). |
150 | - return |
151 | - self._captureVar('BZR_NO_SMART_VFS', None) |
152 | - wt = self.create_test_bundle() |
153 | - if wt is None: |
154 | - return |
155 | - # read_mergeable_from_url will invoke get_transport which may *not* |
156 | - # respect self._transport (i.e. returns a transport that is different |
157 | - # from the one we want to test, so we must inject a correct transport |
158 | - # into possible_transports first. |
159 | - possible_transports = [t] |
160 | - url = unicode(self.get_url('test_bundle')) |
161 | - info = bzrlib.bundle.read_mergeable_from_url(url, |
162 | - possible_transports=possible_transports) |
163 | - self.assertEqual(1, len(possible_transports)) |
164 | + raise tests.TestSkipped( |
165 | + 'Need a ConnectedTransport to test transport reuse') |
166 | + url = unicode(self.get_url(self.bundle_name)) |
167 | + info = self.read_mergeable_from_url(url) |
168 | + self.assertEqual(1, len(self.possible_transports)) |
169 | |
170 | === modified file 'bzrlib/tests/test_transport_implementations.py' |
171 | --- bzrlib/tests/test_transport_implementations.py 2009-04-20 04:19:45 +0000 |
172 | +++ bzrlib/tests/test_transport_implementations.py 2009-07-08 06:35:21 +0000 |
173 | @@ -203,6 +203,13 @@ |
174 | for content, f in itertools.izip(contents, content_f): |
175 | self.assertEqual(content, f.read()) |
176 | |
177 | + def test_get_unknown_file(self): |
178 | + t = self.get_transport() |
179 | + files = ['a', 'b'] |
180 | + contents = ['contents of a\n', |
181 | + 'contents of b\n', |
182 | + ] |
183 | + self.build_tree(files, transport=t, line_endings='binary') |
184 | self.assertRaises(NoSuchFile, t.get, 'c') |
185 | self.assertListRaises(NoSuchFile, t.get_multi, ['a', 'b', 'c']) |
186 | self.assertListRaises(NoSuchFile, t.get_multi, iter(['a', 'b', 'c'])) |
187 | @@ -242,6 +249,9 @@ |
188 | for content, fname in zip(contents, files): |
189 | self.assertEqual(content, t.get_bytes(fname)) |
190 | |
191 | + def test_get_bytes_unknown_file(self): |
192 | + t = self.get_transport() |
193 | + |
194 | self.assertRaises(NoSuchFile, t.get_bytes, 'c') |
195 | |
196 | def test_get_with_open_write_stream_sees_all_content(self): |
Nasty bug here: pycurl was hanging when connecting in https mode to our test server
due to a missing Content-Length header in the python's BaseHttpServer implementation.
We didn't realize it sooner because it's triggered only when we test pycurl against
our https test server which requires both python2.6 and pycurl, which in turn seems
to be rather unsual.
A big thank you to Pavel, Russel and Martitza for bringing that up !