selftest --parallel=subprocess ignores child stderr pipes

Bug #625551 reported by Martin Packman
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Bazaar
Confirmed
Low
Unassigned

Bug Description

When running selftest with --parallel=subprocess (and actually having more than one CPU or BZR_CONCURRENCY set to > 1) Bazaar creates the child processes for running the tests with a stderr pipe, but then never does anything with with the process.stderr argument.

            # stderr=subprocess.STDOUT would be ideal, but until we prevent
            # noise on stderr it can interrupt the subunit protocol.
            process = subprocess.Popen(argv, stdin=subprocess.PIPE,
                                      stdout=subprocess.PIPE,
                                      stderr=subprocess.PIPE,
                                      bufsize=1)

This is particularly bad as only stdout is ever read from before calling wait on the process, so if a child writes more than the OS pipe buffer size to stderr the process will hang.

I don't quite understand the comment on the code above, as while subunit is used, it will always be impossible to write arbitrary stuff to stdout without breaking things.

It appears that --parallel=fork leaves stderr on the same fd as the parent, so similarly removing the stderr argument above may be sifficent for now. It at least makes problems visible (though possibly intermingled) and prevents wedging.

Martin Packman (gz)
Changed in bzr:
importance: Undecided → Low
status: New → Confirmed
Revision history for this message
John A Meinel (jameinel) wrote : Re: [Bug 625551] [NEW] selftest --parallel=subprocess ignores child stderr pipes

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 8/27/2010 4:14 PM, Martin [gz] wrote:
> Public bug reported:
>
> When running selftest with --parallel=subprocess (and actually having
> more than one CPU or BZR_CONCURRENCY set to > 1) Bazaar creates the
> child processes for running the tests with a stderr pipe, but then never
> does anything with with the process.stderr argument.
>
> # stderr=subprocess.STDOUT would be ideal, but until we prevent
> # noise on stderr it can interrupt the subunit protocol.
> process = subprocess.Popen(argv, stdin=subprocess.PIPE,
> stdout=subprocess.PIPE,
> stderr=subprocess.PIPE,
> bufsize=1)
>
> This is particularly bad as only stdout is ever read from before calling
> wait on the process, so if a child writes more than the OS pipe buffer
> size to stderr the process will hang.
>
> I don't quite understand the comment on the code above, as while subunit
> is used, it will always be impossible to write arbitrary stuff to stdout
> without breaking things.
>
> It appears that --parallel=fork leaves stderr on the same fd as the
> parent, so similarly removing the stderr argument above may be sifficent
> for now. It at least makes problems visible (though possibly
> intermingled) and prevents wedging.
>
> ** Affects: bzr
> Importance: Low
> Status: Confirmed
>
>

Seems reasonable to me. The other option is to use '.communicate'
instead of .wait(). Or possibly child.stderr.close() child.wait() since
that will avoid blocking at least.

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkx6mNoACgkQJdeBCYSNAAMHsACeMzJTdF2SyX8pU6nRd1S68M6S
+1MAoJjxg8VrbL84C2f+HlHblJP8wm3/
=o9jV
-----END PGP SIGNATURE-----

Revision history for this message
Andrew Bennetts (spiv) wrote :

Replacing the.wait() with .communicate() isn't sufficient, as subunit.ProtocolTestCase's run() repeatedly calls readline() on process.stdout without ever reading from stderr. So if the child attempts to write too much to stderr we will deadlock.

So I think the best solution for now is to simply reuse our own stderr, as mgz suggests, by not passing stderr= (or explicitly doing stderr=None). It's hopefully safe to assume that the terminal or whatever is draining both our stdout and stderr reliably! I'd be happy to make a patch that does this if folks agree.

Revision history for this message
Robert Collins (lifeless) wrote : Re: [Bug 625551] Re: selftest --parallel=subprocess ignores child stderr pipes

Or you could improve subunit!

Revision history for this message
Andrew Bennetts (spiv) wrote :

Robert Collins wrote:
> Or you could improve subunit!

Unfortunately the only easy way to improve the situation would involve
adding threads to subunit, and I'm not sure I hate myself enough to be
able to write that code :P

Revision history for this message
Robert Collins (lifeless) wrote :

On Fri, May 27, 2011 at 7:06 PM, Andrew Bennetts
<email address hidden> wrote:
> Robert Collins wrote:
>> Or you could improve subunit!
>
> Unfortunately the only easy way to improve the situation would involve
> adding threads to subunit, and I'm not sure I hate myself enough to be
> able to write that code :P

Why? the parser and the get-data functions are already separated -
subunit was written with twisted use in mind.

Revision history for this message
Andrew Bennetts (spiv) wrote :

Robert Collins wrote:
> On Fri, May 27, 2011 at 7:06 PM, Andrew Bennetts
> <email address hidden> wrote:
> > Robert Collins wrote:
> >> Or you could improve subunit!
> >
> > Unfortunately the only easy way to improve the situation would involve
> > adding threads to subunit, and I'm not sure I hate myself enough to be
> > able to write that code :P
>
> Why? the parser and the get-data functions are already separated -
> subunit was written with twisted use in mind.

Because subprocess doesn't expose a way to read incrementally from both
stdout and stderr. The best you can do with it is use .communicate()
to feed a canned stdin and get back a fully read stdout and stderr once
the process exits.

So if you want to drain stdout and stderr simultaneously (and we do) you
either need to reimplement that part of subprocess (the easiest way to
do that portably is with threads) or use something else entirely, like
Twisted, which would be pretty invasive for bzr selftest.

In short: I don't see a nice way to implement a get-data function that
incidentally drains stderr with the tools the stdlib provides, except
threads.

Revision history for this message
Robert Collins (lifeless) wrote :

On Fri, May 27, 2011 at 10:27 PM, Andrew Bennetts
<email address hidden> wrote:
>> > Unfortunately the only easy way to improve the situation would involve
>> > adding threads to subunit, and I'm not sure I hate myself enough to be
>> > able to write that code :P
>>
>> Why? the parser and the get-data functions are already separated -
>> subunit was written with twisted use in mind.
>
> Because subprocess doesn't expose a way to read incrementally from both
> stdout and stderr.   The best you can do with it is use .communicate()
> to feed a canned stdin and get back a fully read stdout and stderr once
> the process exits.

Sure it does - select. The file objects can be selected on and
incrementally iterated. Or asyncore.

-Rob

Revision history for this message
Andrew Bennetts (spiv) wrote :

> > Because subprocess doesn't expose a way to read incrementally from both
[…]
> Sure it does - select. The file objects can be selected on and
> incrementally iterated. Or asyncore.

Yes, select is how the subprocess module implements communicate() — on
posix. But that doesn't work on Windows. asyncore also doesn't help
for the same reason, it relies on APIs that only work on sockets on
Windows.

-Andrew.

Revision history for this message
Robert Collins (lifeless) wrote :

On Mon, May 30, 2011 at 6:27 PM, Andrew Bennetts
<email address hidden> wrote:
>> > Because subprocess doesn't expose a way to read incrementally from both
> […]
>> Sure it does - select. The file objects can be selected on and
>> incrementally iterated. Or asyncore.
>
> Yes, select is how the subprocess module implements communicate() — on
> posix.  But that doesn't work on Windows.  asyncore also doesn't help
> for the same reason, it relies on APIs that only work on sockets on
> Windows.

Ah. If only such a thing existed.

Revision history for this message
Martin Packman (gz) wrote :

>> Yes, select is how the subprocess module implements communicate() — on
>> posix. But that doesn't work on Windows. asyncore also doesn't help
>> for the same reason, it relies on APIs that only work on sockets on
>> Windows.
>
> Ah. If only such a thing existed.

It's perfectly possible to use asynchronous ipc on windows. The subprocess module just doesn't use the right pipe functions, and as with its other limitations can't really be extended to do so. Even if the underlying issues were fixed, it's not clear what the Python interface for not screwing this stuff up by accident would be. Twisted?

Revision history for this message
Andrew Bennetts (spiv) wrote :

Martin [gz] wrote:
> It's perfectly possible to use asynchronous ipc on windows. The
> subprocess module just doesn't use the right pipe functions, and as with
> its other limitations can't really be extended to do so. Even if the
> underlying issues were fixed, it's not clear what the Python interface
> for not screwing this stuff up by accident would be. Twisted?

Twisted certainly does cope just fine with this. Unfortunately it wants
to be the framework of your program rather than a library you can call
into when you occasionally need it, so it's not obviously a good idea
for bzr selftest. Oh, and Twisted officially doesn't support being
fork()ed either, which is a shame because I use --parallel=fork a lot.

Revision history for this message
Martin Pool (mbp) wrote :

Is there actually any down side to just leaving stderr connected to
the original? It seems actually quite useful to have any extraneous
messages come through there, and much simpler to implement.

Revision history for this message
Robert Collins (lifeless) wrote :

On Tue, May 31, 2011 at 6:50 PM, Martin Pool <email address hidden> wrote:
> Is there actually any down side to just leaving stderr connected to
> the original?  It seems actually quite useful to have any extraneous
> messages come through there, and much simpler to implement.

None from my perspective. Its designed to work this way in fact :)

Revision history for this message
Martin Pool (mbp) wrote :

On 31 May 2011 17:18, Robert Collins <email address hidden> wrote:
> On Tue, May 31, 2011 at 6:50 PM, Martin Pool <email address hidden> wrote:
>> Is there actually any down side to just leaving stderr connected to
>> the original?  It seems actually quite useful to have any extraneous
>> messages come through there, and much simpler to implement.
>
> None from my perspective. Its designed to work this way in fact :)

I don't understand your comment "or you could improve subunit" then...

Revision history for this message
Robert Collins (lifeless) wrote :

On Tue, May 31, 2011 at 7:30 PM, Martin Pool <email address hidden> wrote:
> I don't understand your comment "or you could improve subunit" then...

A thinko. It would be nice to have subunit be able to drain them both,
but yeah having stderr not handled by the processors is the intent.

Jelmer Vernooij (jelmer)
tags: added: check-for-breezy
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.