Merge lp:~jameinel/bzr/2.1-client-stream-started-819604 into lp:bzr/2.1

Proposed by John A Meinel
Status: Work in progress
Proposed branch: lp:~jameinel/bzr/2.1-client-stream-started-819604
Merge into: lp:bzr/2.1
Prerequisite: lp:~jameinel/bzr/2.1-client-reconnect-819604
Diff against target: 175 lines (+90/-9)
4 files modified
bzrlib/smart/client.py (+5/-1)
bzrlib/smart/medium.py (+1/-2)
bzrlib/smart/protocol.py (+5/-0)
bzrlib/tests/test_smart_transport.py (+79/-6)
To merge this branch: bzr merge lp:~jameinel/bzr/2.1-client-stream-started-819604
Reviewer: bzr-core (status: Pending)
Review via email: mp+78631@code.launchpad.net

Commit message

Bug #819604, allow reconnecting with a body stream as long as we haven't touched the stream yet.

Description of the change

This layers on top of my 'client reconnect' patch for the case where we have a body stream.

1) It adds a flush just before we process the first entry of the body stream. This adds an extra write call to 'bzr push'. However, we already write after every entry in the stream, and the first entry of the stream is actually very small. (In bzrlib.repository._stream_to_byte_stream(), the first object that gets flushed is 'pack_writer.begin()', which is just the pack header.)
I don't imagine writing out the ProtocolThree headers will add much real-world overhead.
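As a minimal sketch of that ordering (hypothetical class and method names, not bzrlib's actual encoder API): flushing the buffered headers before touching the stream means a dead connection surfaces as a connection-reset error while a retry is still safe.

```python
# Hypothetical sketch, NOT bzrlib's real encoder: flush buffered request
# headers before consuming the body stream, so a dead connection fails
# while the stream is still untouched and a retry remains possible.
class StreamEncoder:
    def __init__(self, write_fn):
        self._write = write_fn          # e.g. a socket/pipe write
        self._buffer = []
        self.body_stream_started = False

    def _push(self, data):
        self._buffer.append(data)

    def flush(self):
        if self._buffer:
            # This is where a ConnectionReset would actually surface.
            self._write(''.join(self._buffer))
            self._buffer = []

    def call_with_body_stream(self, args, stream):
        self._push('args:%r\n' % (args,))
        # Provoke any connection failure *before* consuming the stream.
        self.flush()
        self.body_stream_started = True
        for part in stream:
            self._push('chunk:%s\n' % part)
            self.flush()  # we already write after every stream entry
```

The point is only the ordering: flush(), then set body_stream_started, then iterate.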

2) It tracks once we start consuming body_stream, so that we don't do the auto-retry after that point.

I felt it was clear and clean enough to do it this way. We could arguably do a little better higher up the stack, but not much, as I described in the bug. Specifically, RemoteRepository.insert_stream is the only caller that passes a body_stream, but it, in turn, only receives a stream of records to pass along. So if we've started passing records, we can't retry anyway; the retry logic would have to go all the way up to the Fetch code.
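The retry decision described above can be sketched roughly like this (simplified, hypothetical names; the real code lives in bzrlib/smart/client.py and checks 'noretry' in the debug flags plus the encoder's body_stream_started attribute):

```python
# Illustrative sketch of the retry guard, not the real bzrlib API.
def send_with_retry(send_once, reset_medium, body_stream=None,
                    encoder_state=None, noretry=False):
    """Send a request, retrying once on connection reset.

    encoder_state.body_stream_started records whether any stream entry
    was consumed before the connection died.
    """
    try:
        return send_once()
    except ConnectionResetError:
        # Connection is dead, so close our end of it.
        reset_medium()
        if noretry or (body_stream is not None
                       and encoder_state.body_stream_started):
            # A partially consumed stream cannot be restarted; give up.
            raise
        return send_once()  # one retry on a fresh connection
```

A stream that has not been touched yet is indistinguishable from a plain request for retry purposes, which is what makes this safe.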

Revision history for this message
Andrew Bennetts (spiv) wrote :

[Just a comment, not a full review]

> 1) It adds a flush just before we process the first entry of the body
> stream. This adds an extra write call to 'bzr push'. However, we write
> after every entry in the stream, and the first entry of the stream is
> actually very small. (bzrlib.repository._stream_to_byte_stream() the
> first object that gets flushed is 'pack_writer.begin()', which is just
> the pack header.)
> I don't imagine writing out the ProtocolThree headers will be adding
> tons of real world overhead.

You may be right, but please check. Use the netem stuff under linux to
add latency to loopback (I like 500ms), it's documented on the wiki, and
try some pushes with and without.

> 2) It tracks once we start consuming body_stream, so that we don't do
> the auto-retry after that point.
>
> I felt it was clear and clean enough to do it this way. We could

The layering here sounds reasonable to me.

-Andrew.

Revision history for this message
John A Meinel (jameinel) wrote :


On 10/10/2011 01:37 AM, Andrew Bennetts wrote:
> [Just a comment, not a full review]
>
>> 1) It adds a flush just before we process the first entry of the body
>> stream. This adds an extra write call to 'bzr push'. However, we write
>> after every entry in the stream, and the first entry of the stream is
>> actually very small. (bzrlib.repository._stream_to_byte_stream() the
>> first object that gets flushed is 'pack_writer.begin()', which is just
>> the pack header.)
>> I don't imagine writing out the ProtocolThree headers will be adding
>> tons of real world overhead.
>
> You may be right, but please check. Use the netem stuff under linux to
> add latency to loopback (I like 500ms), it's documented on the wiki, and
> try some pushes with and without.

Short version: this can cause us to send an extra packet, but not always.

I plugged in netem, adding a 500ms delay on the loopback.

I then created a shared repository, branched loggerhead -r -10 into it, and
then pushed to it from the client. I added debugging code that prints
'flushing' whenever we hit that '.flush()'[1] line.

bzr.dev 21.627 21.655 21.645
flush() 22.587 22.575 21.656 21.640

It looks like there are 2 tiers of times possible, one with the extra
round trip, and one without, but the extra round trip is not guaranteed.
(Maybe it depends on whether we get a context switch between the
.flush() call and consuming the first hunk from the iterator.)

I ran 3-4 times before I started this list, and all of them were at the
21.6 tier, but the next 6 runs were split about evenly between the 21.6
and 22.5 tiers.

If I change it so that we just have an empty shared repository, and we
push all of loggerhead's history, I get:

bzr.dev 54.682 54.713 54.742 54.683
flush() 55.052 55.083 55.403 55.043

What is strange is that it is consistently slower for a 'write
everything' call, but it isn't even 500ms slower.

Note that without netem slowing things down, the full push times are:

bzr.dev 3.923 3.975 3.945
flush() 3.974 3.954 3.967

So certainly well within any noise margin.

Note that 'flush()' really just means 'actually write this to the
socket/pipe/etc'; it isn't actually requesting a round trip. Though I
think we might have NODELAY set; if you look at the code, it is going
to write the actual header/content bytes to the socket within a
millisecond or so.

Arguably, while we are sending the bulk data stream we should be
turning NODELAY off.
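For reference, toggling Nagle's algorithm on a TCP socket looks like this in Python (a generic illustration; bzrlib's media wrap their sockets differently):

```python
# Generic illustration of toggling TCP_NODELAY (Nagle's algorithm),
# not bzrlib code.
import socket

def set_nodelay(sock, enabled):
    # TCP_NODELAY=1 disables Nagle, so small writes go out immediately;
    # 0 re-enables coalescing, which suits bulk data streaming.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY,
                    1 if enabled else 0)
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
```

With NODELAY set, the flushed protocol headers would go on the wire in their own small packet, which matches the "extra packet" observation above.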

>
>> 2) It tracks once we start consuming body_stream, so that we don't do
>> the auto-retry after that point.
>>
>> I felt it was clear and clean enough to do it this way. We could
>
> The layering here sounds reasonable to me.
>
> -Andrew.
>
>

We could leave in the check for body_stream_started without adding the
flush(), though the chances of triggering it before the stream starts
are virtually nil, because we buffer up to 1MB of data before
auto-flushing.
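That buffering behaviour can be sketched as follows (the class name and the way the 1MB threshold is applied are illustrative of the description above, not bzrlib's actual writer):

```python
# Illustrative sketch of a write buffer with a ~1MB auto-flush
# threshold; without an explicit flush(), small protocol headers
# would just sit in the buffer until the stream data piles up.
class BufferingWriter:
    def __init__(self, write_fn, max_buffered=1024 * 1024):
        self._write = write_fn
        self._max = max_buffered
        self._chunks = []
        self._buffered = 0

    def write(self, data):
        self._chunks.append(data)
        self._buffered += len(data)
        if self._buffered >= self._max:
            self.flush()  # auto-flush once the threshold is reached

    def flush(self):
        if self._chunks:
            self._write(''.join(self._chunks))
            self._chunks = []
            self._buffered = 0
```

This is why the explicit flush() is what actually provokes an early ConnectionReset: the headers alone are far below the auto-flush threshold.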

[1] I actually see 'flushing' displayed 2 times. Probably the other one
is the 'probe' insert? (The empty one used to determine whether the
remote side supports the API.)

John
=:->


Preview Diff

1=== modified file 'bzrlib/smart/client.py'
2--- bzrlib/smart/client.py 2011-10-10 12:59:12 +0000
3+++ bzrlib/smart/client.py 2011-10-10 12:59:12 +0000
4@@ -239,9 +239,13 @@
5 # Connection is dead, so close our end of it.
6 self.client._medium.reset()
7 if (('noretry' in debug.debug_flags)
8- or self.body_stream is not None):
9+ or (self.body_stream is not None
10+ and encoder.body_stream_started)):
11 # We can't restart a body_stream that has been partially
12 # consumed, so we don't retry.
13+ # Note: We don't have to worry about
14+ # SmartClientRequestProtocolOne or Two, because they don't
15+ # support client-side body streams.
16 raise
17 trace.warning('ConnectionReset calling %r, retrying'
18 % (self.method,))
19
20=== modified file 'bzrlib/smart/medium.py'
21--- bzrlib/smart/medium.py 2011-10-10 12:59:12 +0000
22+++ bzrlib/smart/medium.py 2011-10-10 12:59:12 +0000
23@@ -739,8 +739,7 @@
24 except IOError, e:
25 if e.errno in (errno.EINVAL, errno.EPIPE):
26 raise errors.ConnectionReset(
27- "Error trying to write to subprocess:\n%s"
28- % (e,))
29+ "Error trying to write to subprocess:\n%s" % (e,))
30 raise
31 self._report_activity(len(bytes), 'write')
32
33
34=== modified file 'bzrlib/smart/protocol.py'
35--- bzrlib/smart/protocol.py 2011-10-10 12:59:12 +0000
36+++ bzrlib/smart/protocol.py 2011-10-10 12:59:12 +0000
37@@ -1282,6 +1282,7 @@
38 _ProtocolThreeEncoder.__init__(self, medium_request.accept_bytes)
39 self._medium_request = medium_request
40 self._headers = {}
41+ self.body_stream_started = None
42
43 def set_headers(self, headers):
44 self._headers = headers.copy()
45@@ -1347,6 +1348,7 @@
46 if path is not None:
47 mutter(' (to %s)', path)
48 self._request_start_time = osutils.timer_func()
49+ self.body_stream_started = False
50 self._write_protocol_version()
51 self._write_headers(self._headers)
52 self._write_structure(args)
53@@ -1354,6 +1356,9 @@
54 # have finished sending the stream. We would notice at the end
55 # anyway, but if the medium can deliver it early then it's good
56 # to short-circuit the whole request...
57+ # Provoke any ConnectionReset failures before we start the body stream.
58+ self.flush()
59+ self.body_stream_started = True
60 for exc_info, part in _iter_with_errors(stream):
61 if exc_info is not None:
62 # Iterating the stream failed. Cleanly abort the request.
63
64=== modified file 'bzrlib/tests/test_smart_transport.py'
65--- bzrlib/tests/test_smart_transport.py 2011-10-10 12:59:12 +0000
66+++ bzrlib/tests/test_smart_transport.py 2011-10-10 12:59:12 +0000
67@@ -2944,6 +2944,33 @@
68 'e', # end
69 output.getvalue())
70
71+ def test_records_start_of_body_stream(self):
72+ requester, output = self.make_client_encoder_and_output()
73+ requester.set_headers({})
74+ in_stream = [False]
75+ def stream_checker():
76+ self.assertTrue(requester.body_stream_started)
77+ in_stream[0] = True
78+ yield 'content'
79+ flush_called = []
80+ orig_flush = requester.flush
81+ def tracked_flush():
82+ flush_called.append(in_stream[0])
83+ if in_stream[0]:
84+ self.assertTrue(requester.body_stream_started)
85+ else:
86+ self.assertFalse(requester.body_stream_started)
87+ return orig_flush()
88+ requester.flush = tracked_flush
89+ requester.call_with_body_stream(('one arg',), stream_checker())
90+ self.assertEqual(
91+ 'bzr message 3 (bzr 1.6)\n' # protocol version
92+ '\x00\x00\x00\x02de' # headers
93+ 's\x00\x00\x00\x0bl7:one arge' # args
94+ 'b\x00\x00\x00\x07content' # body
95+ 'e', output.getvalue())
96+ self.assertEqual([False, True, True], flush_called)
97+
98
99 class StubMediumRequest(object):
100 """A stub medium request that tracks the number of times accept_bytes is
101@@ -3458,22 +3485,68 @@
102 vendor.calls)
103 self.assertRaises(errors.ConnectionReset, handler.read_response_tuple)
104
105- def test__send_doesnt_retry_body_stream(self):
106- # We don't know how much of body_stream would get iterated as part of
107- # _send before it failed to actually send the request, so we
108- # just always fail in this condition.
109+ def test__send_request_retries_body_stream_if_not_started(self):
110 output, vendor, smart_client = self.make_client_with_failing_medium()
111 smart_request = client._SmartClientRequest(smart_client, 'hello', (),
112 body_stream=['a', 'b'])
113+ response_handler = smart_request._send(3)
114+ # We connect, get disconnected, and notice before consuming the stream,
115+ # so we try again one time and succeed.
116+ self.assertEqual(
117+ [('connect_ssh', 'a user', 'a pass', 'a host', 'a port',
118+ ['bzr', 'serve', '--inet', '--directory=/', '--allow-writes']),
119+ ('close',),
120+ ('connect_ssh', 'a user', 'a pass', 'a host', 'a port',
121+ ['bzr', 'serve', '--inet', '--directory=/', '--allow-writes']),
122+ ],
123+ vendor.calls)
124+ self.assertEqual('bzr message 3 (bzr 1.6)\n' # protocol
125+ '\x00\x00\x00\x02de' # empty headers
126+ 's\x00\x00\x00\tl5:helloe'
127+ 'b\x00\x00\x00\x01a'
128+ 'b\x00\x00\x00\x01b'
129+ 'e',
130+ output.getvalue())
131+
132+ def test__send_request_stops_if_body_started(self):
133+ # We intentionally use the python StringIO so that we can subclass it.
134+ from StringIO import StringIO
135+ response = StringIO()
136+
137+ class FailAfterFirstWrite(StringIO):
138+ """Allow one 'write' call to pass, fail the rest"""
139+ def __init__(self):
140+ StringIO.__init__(self)
141+ self._first = True
142+
143+ def write(self, s):
144+ if self._first:
145+ self._first = False
146+ return StringIO.write(self, s)
147+ raise IOError(errno.EINVAL, 'invalid file handle')
148+ output = FailAfterFirstWrite()
149+
150+ vendor = FirstRejectedStringIOSSHVendor(response, output,
151+ fail_at_write=False)
152+ client_medium = medium.SmartSSHClientMedium(
153+ 'a host', 'a port', 'a user', 'a pass', 'base', vendor,
154+ 'bzr')
155+ smart_client = client._SmartClient(client_medium, headers={})
156+ smart_request = client._SmartClientRequest(smart_client, 'hello', (),
157+ body_stream=['a', 'b'])
158 self.assertRaises(errors.ConnectionReset, smart_request._send, 3)
159- # We got one connect, but it fails, so we disconnect, but we don't
160- # retry it
161+ # We connect, and manage to get to the point that we start consuming
162+ # the body stream. The next write fails, so we just stop.
163 self.assertEqual(
164 [('connect_ssh', 'a user', 'a pass', 'a host', 'a port',
165 ['bzr', 'serve', '--inet', '--directory=/', '--allow-writes']),
166 ('close',),
167 ],
168 vendor.calls)
169+ self.assertEqual('bzr message 3 (bzr 1.6)\n' # protocol
170+ '\x00\x00\x00\x02de' # empty headers
171+ 's\x00\x00\x00\tl5:helloe',
172+ output.getvalue())
173
174 def test__send_disabled_retry(self):
175 debug.debug_flags.add('noretry')
