Merge lp:~jameinel/bzr/2.5-unending-sendall-1047309 into lp:bzr/2.5

Proposed by John A Meinel
Status: Merged
Approved by: John A Meinel
Approved revision: 6510
Merged at revision: 6510
Proposed branch: lp:~jameinel/bzr/2.5-unending-sendall-1047309
Merge into: lp:bzr/2.5
Diff against target: 68 lines (+34/-1)
3 files modified
bzrlib/osutils.py (+5/-1)
bzrlib/tests/test_osutils.py (+22/-0)
doc/en/release-notes/bzr-2.5.txt (+7/-0)
To merge this branch: bzr merge lp:~jameinel/bzr/2.5-unending-sendall-1047309
Reviewer Review Type Date Requested Status
Richard Wilbur Needs Fixing
Martin Packman (community) Approve
Review via email: mp+123268@code.launchpad.net

Commit message

Detect if we are not progressing during osutils.send_all() and fail rather than hang. (bug #1047309)

Description of the change

See the associated bug #1047309.

Basically, with the new ConnectionReset code coming into bzr-2.5, it seems that if Paramiko gets disconnected, it is unable to send any content, but doesn't raise an error. (it returns 0 bytes sent, but no exception is raised.)

As a conservative fix, this allows us to get a series of up to 5 0-bytes-sent, but any more than that in a row, and we just treat it as a ECONNRESET. Looking at the man pages, it seems send in non-blocking socket shouldn't ever give 0 bytes sent, but I preferred to be a little cautious.

If this causes fallout, we could tweak the retries, or even do sleep-based backoff.

To post a comment you must log in.
Revision history for this message
Martin Packman (gz) wrote :

Did you find any answer on if it's valid for send() to return 0? As far as I'm aware it's not, it's either EINTR or a partial write of more than one byte, though I can't find anything explicit either.

For implementations like paramiko that just emulate the interface, coping with this is I guess required, though it would be nice to fix upstream to do something more sane. I'd prefer raising an error straight off, but understand the motivation behind retying several times, as that would previously have worked and regressions are bad. I'd much prefer not to use this heuristic if at possible though, instead quirking on the socket type or something that makes it explict this is a paramiko issue only.

review: Needs Information
Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 9/7/2012 7:54 PM, Martin Packman wrote:
> Review: Needs Information
>
> Did you find any answer on if it's valid for send() to return 0? As
> far as I'm aware it's not, it's either EINTR or a partial write of
> more than one byte, though I can't find anything explicit either.
>
> For implementations like paramiko that just emulate the interface,
> coping with this is I guess required, though it would be nice to
> fix upstream to do something more sane. I'd prefer raising an error
> straight off, but understand the motivation behind retying several
> times, as that would previously have worked and regressions are
> bad. I'd much prefer not to use this heuristic if at possible
> though, instead quirking on the socket type or something that makes
> it explict this is a paramiko issue only.
>

Well, we could go with immediately erroring, and see how bad the
fallout is...

I think this is a reasonable compromise which means we won't have to
think about it again.

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://www.enigmail.net/

iEYEARECAAYFAlBLLYMACgkQJdeBCYSNAAPMXwCdHdkRL+i8RzI7PckS7yFce62M
1NIAnjiQ3YrIyh7xgkifk5JIScAN05YS
=zHgy
-----END PGP SIGNATURE-----

Revision history for this message
Martin Packman (gz) wrote :

Won't have to think about, apart from when someone writing code wants to know if they should use socket.sendall or osutils.sendall and what the hell is going on... So, I don't really like the code but am somewhat sold on it being the least risky option for fixing this bug.

Suggested tweaks before landing:
* Remove note about making _max_no_content_sends configurable, internal hacks like this shouldn't be exposed
* Add comment/docstring to osutils.sendall about why it's checking for 0 from send
* Maybe add mutter if we get 0 returned from send which is probably the only way we'll find out if it gets hit in the wild
* Make test_send_minimal_progress clear that this is not behaviour we've actually observed

review: Approve
Revision history for this message
Richard Wilbur (richard-wilbur) wrote :

The documentation for paramiko.Channel.send: "Returns the number of bytes sent, or 0 if the channel stream is closed."[1]

Python 2.7 socket library documentation for socket.send: "Returns the number of bytes sent. Applications are responsible for checking that all data has been sent; if only some of the data was transmitted, the application needs to attempt delivery of the remaining data. For further information on this concept, consult the Socket Programming HOWTO."[2]

Interestingly the Socket Programming HOWTO (referenced above) has an example which suggests you might get a 0 return value[3], but only if the connection is broken:
            sent = self.sock.send(msg[totalsent:])
            if sent == 0:
                raise RuntimeError("socket connection broken")

Linux documentation for socket send[4]:
RETURN VALUE
       On success, these calls return the number of characters sent. On
       error, -1 is returned, and errno is set appropriately.

The consensus seems to be that a socket send should never return 0 unless there is an error. Paramiko looks like it thinks that is a normal way to signal a closed connection. The python socket programming HOWTO implies that the only reason for a 0 return value is also a closed connection.

@John, I agree with your proposition that we should throw an exception immediately when send returns 0.
@Martin, this would alleviate the need for _max_no_content_sends. Hence,
  1. I am in favour of removing _max_no_content_sends
  2. Comment/docstring for osutils.sendall would be fine if it isn't obvious from the exception we throw
  3.,4. Agree

References:
[1] http://www.lag.net/paramiko/docs/paramiko.Channel-class.html#send
[2] http://docs.python.org/library/socket.html
[3] http://docs.python.org/howto/sockets.html#socket-howto
[4] info send

review: Needs Fixing
6511. By John A Meinel

simplify the fix. Sending 0 bytes seems to always indicate that we have a closed connection.

Revision history for this message
John A Meinel (jameinel) wrote :

Code has been simplified to just treat 0 bytes as closed connection.

Revision history for this message
John A Meinel (jameinel) wrote :

sent to pqm by email

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'bzrlib/osutils.py'
2--- bzrlib/osutils.py 2012-02-28 04:58:14 +0000
3+++ bzrlib/osutils.py 2012-09-11 07:29:18 +0000
4@@ -2140,8 +2140,12 @@
5 if e.args[0] != errno.EINTR:
6 raise
7 else:
8+ if sent == 0:
9+ raise errors.ConnectionReset('Sending to %s returned 0 bytes'
10+ % (sock,))
11 sent_total += sent
12- report_activity(sent, 'write')
13+ if report_activity is not None:
14+ report_activity(sent, 'write')
15
16
17 def connect_socket(address):
18
19=== modified file 'bzrlib/tests/test_osutils.py'
20--- bzrlib/tests/test_osutils.py 2012-02-28 04:58:14 +0000
21+++ bzrlib/tests/test_osutils.py 2012-09-11 07:29:18 +0000
22@@ -819,6 +819,28 @@
23 self.assertEqual(None, osutils.safe_file_id(None))
24
25
26+
27+class TestSendAll(tests.TestCase):
28+
29+ def test_send_with_no_progress(self):
30+ # See https://bugs.launchpad.net/bzr/+bug/1047309
31+ # It seems that paramiko can get into a state where it doesn't error,
32+ # but it returns 0 bytes sent for requests over and over again.
33+ class NoSendingSocket(object):
34+ def __init__(self):
35+ self.call_count = 0
36+ def send(self, bytes):
37+ self.call_count += 1
38+ if self.call_count > 100:
39+ # Prevent the test suite from hanging
40+ raise RuntimeError('too many calls')
41+ return 0
42+ sock = NoSendingSocket()
43+ self.assertRaises(errors.ConnectionReset,
44+ osutils.send_all, sock, 'content')
45+ self.assertEqual(1, sock.call_count)
46+
47+
48 class TestPosixFuncs(tests.TestCase):
49 """Test that the posix version of normpath returns an appropriate path
50 when used with 2 leading slashes."""
51
52=== modified file 'doc/en/release-notes/bzr-2.5.txt'
53--- doc/en/release-notes/bzr-2.5.txt 2012-09-06 12:05:27 +0000
54+++ doc/en/release-notes/bzr-2.5.txt 2012-09-11 07:29:18 +0000
55@@ -39,6 +39,13 @@
56 extracted texts from the repository. (Just an ordering constraint on how
57 they consumed the stream.) (John Arbash Meinel, #1046284)
58
59+* ``osutils.send_all`` now detects if we get a series of zero bytes sent,
60+ and fails with a ECONNRESET. It seems if paramiko gets disconnected, it
61+ will get into a state where it returns 0 bytes sent, but doesn't raise
62+ an error. This change allows us to get a couple hiccups of no content
63+ sent, but if it is consistent, we will consider it to be a failure.
64+ (John Arbash Meinel, #1047309)
65+
66 * Revert use of --no-tty when gpg signing commits. (Jelmer Vernooij, #1014570)
67
68 * Some small bug fixes wrt lightweight checkouts and remote repositories.

Subscribers

People subscribed via source and target branches