Bazaar

Merge lp:~gz/bzr/trivial_fork_error_block into lp:bzr

Proposed by Martin Packman on 2011-11-07

Status:	Rejected
Rejected by:	Martin Packman on 2011-11-08
Proposed branch:	lp:~gz/bzr/trivial_fork_error_block
Merge into:	lp:bzr
Diff against target:	42 lines (+14/-5), 1 file modified: bzrlib/tests/__init__.py (+14/-5)
To merge this branch:	bzr merge lp:~gz/bzr/trivial_fork_error_block

Reviewer	Review Type	Date Requested	Status
Vincent Ladeuil		2011-11-07	Needs Information on 2011-11-08
Review via email: mp+81479@code.launchpad.net

Description of the change

Second attempt at keeping last-ditch error logging from forked selftest children separated from each other. For the background, see (merged) proposal from earlier:

<https://code.launchpad.net/~gz/bzr/trivial_fork_error_block/+merge/81372>

I'm not crazy about this change: it adds a lot of complexity that's not comprehensively testable. However, even if unlikely, a child process getting a signal that caused the tail of the traceback not to be displayed at all would be annoying.

Vincent Ladeuil (vila) wrote on 2011-11-08:

I'm not sure I understand the rationale, did you encounter a real-life case where this is required ? babune killing the master process ?

Also, I'm not sure it applies here, but mixing buffered and unbuffered accesses is generally error-prone and may cause data loss, no ? Should you add a .flush() before trying to report the exception ?

That being said, if my fears are without basis and you have a real use case here, this should be landed, but a comment explaining it would help.

review: Needs Information

Vincent Ladeuil (vila) wrote on 2011-11-08:

/me blinks

My review above was made directly from Launchpad, so I missed your comments on your previous attempt, which I just read in my backlogged mail.

So, bufsize=1.

But then, we shouldn't see *lines* mixed, but *chars* mixed. What is going on here ?

review: Needs Information

Martin Packman (gz) wrote on 2011-11-08:

> So, bufsize=1.
>
> But then, we shouldn't see *lines* mixed, but *chars* mixed. What is going on
> here ?

In Python, 1 is a magic value used to imply _IOLBF, i.e. line buffering (on the grounds no one wants to do per-character buffering).
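
A minimal sketch of that buffering behaviour, written for present-day Python 3 rather than the Python 2 this branch targets (an assumption made for illustration). Python 3 only supports line buffering on text streams, so the sketch opens the pipe in text mode instead of 'wb'; the helper name open_line_buffered_pipe is hypothetical.

```python
import os


def open_line_buffered_pipe():
    """Return (read_fd, stream); stream is a line-buffered text writer."""
    read_fd, write_fd = os.pipe()
    # A buffering value of 1 requests line buffering, not a 1-byte buffer.
    stream = os.fdopen(write_fd, "w", 1)
    return read_fd, stream


read_fd, stream = open_line_buffered_pipe()
is_line_buffered = stream.line_buffering
stream.write("partial")   # no newline yet, so this stays in the buffer
stream.write(" line\n")   # the newline flushes the whole line to the pipe
first_read = os.read(read_fd, 1024)
stream.close()
os.close(read_fd)
```

Under these assumptions the reader sees whole lines, which is why lines rather than characters from different children can end up mixed.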

> I'm not sure I understand the rationale, did you encounter a real-life case
> where this is required ? babune killing the master process ?

The problem is the same as before, miraculously the same random failure happened on the natty buildbot for the revision intended to fix this, showing right away that it hadn't worked:

<http://babune.ladeuil.net:24842/job/selftest-chroot-natty/285/testReport/junit/bzrlib.tests.test_selftest/TestParallelFork/test_error_in_child_during_fork/>

> Also, I'm not sure it applies here, but mixing buffered and unbuffered
> accesses is generally error-prone and may cause data loss, no ? Should you add
> a .flush() before trying to report the exception ?

We don't know if the stream has been created or used at this point, and any remaining content in buffers will just be discarded on exit. For last ditch logging like this, that doesn't matter too much as we'd already be losing that information, this way we just make sure the traceback does get out.
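
To illustrate the point about buffers being discarded on exit, here is a small sketch in present-day Python 3 (an assumption; the branch itself is Python 2). It is not the bzrlib code: the function name and the messages are hypothetical. A forked child puts one message in a buffered stream without flushing, writes another directly with os.write, then leaves through os._exit, which skips the usual cleanup.

```python
import os


def child_output_after_hard_exit():
    """Fork a child that exits via os._exit; return (pipe bytes, exit code)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            os.close(read_fd)
            stream = os.fdopen(write_fd, "wb")      # fully buffered writer
            stream.write(b"buffered and lost\n")    # sits in the buffer, never flushed
            os.write(write_fd, b"written directly\n")  # bypasses the buffer
        finally:
            os._exit(1)  # no flushing and no cleanup handlers
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:  # EOF once the child has exited
            break
        chunks.append(chunk)
    os.close(read_fd)
    _, status = os.waitpid(pid, 0)
    return b"".join(chunks), os.WEXITSTATUS(status)


output, exit_code = child_output_after_hard_exit()
```

Only the directly written message reaches the parent, which matches the trade-off described above: anything still buffered is lost either way, but the traceback gets out.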

Again, this mostly just helps development of the selftest framework itself, problems in tests will be caught and reported at a much higher level than this.

Martin Packman (gz) wrote on 2011-11-08:

Looking at it again, I think this extra change isn't helpful either, see the original merge proposal again for more rationale.

Vincent Ladeuil (vila) wrote on 2011-11-08:

> > But then, we shouldn't see *lines* mixed, but *chars* mixed. What is going on
> > here ?
>
> In Python, 1 is a magic value used to imply _IOLBF

Ha, bah, I knew that one day...

> (on the grounds no one wants to do per-character buffering).

Hey, what if I want to output half chars ? ;)

> The problem is the same as before, miraculously the same random failure
> happened on the natty buildbot for the revision intended to fix this, showing
> right away that it hadn't worked:

This bug is really nice with you :)

>
> <http://babune.ladeuil.net:24842/job/selftest-chroot-natty/285/testReport/junit/bzrlib.tests.test_selftest/TestParallelFork/test_error_in_child_during_fork/>

> We don't know if the stream has been created or used at this point, and any
> remaining content in buffers will just be discarded on exit. For last ditch
> logging like this, that doesn't matter too much as we'd already be losing that
> information, this way we just make sure the traceback does get out.

But why EINTR needs handling was really the intriguing part for me.

>
> Again, this mostly just helps development of the selftest framework itself,
> problems in tests will be caught and reported at a much higher level than
> this.

Right. But having discussed this test failure with you back in Orlando, I wonder if you're not digging too much on this front.

If we can't get a reliable way to address the issue here, it may be better to just relax the test itself and keep this part of the code as simple as possible while documenting the potential issue no ?

review: Needs Information

Preview Diff

=== modified file 'bzrlib/tests/__init__.py'
--- bzrlib/tests/__init__.py	2011-11-07 10:14:38 +0000
+++ bzrlib/tests/__init__.py	2011-11-07 18:10:38 +0000
@@ -3503,24 +3503,33 @@
         pid = os.fork()
         if pid == 0:
             try:
-                stream = os.fdopen(c2pwrite, 'wb', 1)
-                workaround_zealous_crypto_random()
                 os.close(c2pread)
                 # Leave stderr and stdout open so we can see test noise
                 # Close stdin so that the child goes away if it decides to
                 # read from stdin (otherwise its a roulette to see what
                 # child actually gets keystrokes for pdb etc).
                 sys.stdin.close()
+                workaround_zealous_crypto_random()
+                stream = os.fdopen(c2pwrite, 'wb', 1)
                 subunit_result = AutoTimingTestResultDecorator(
                     SubUnitBzrProtocolClient(stream))
                 process_suite.run(subunit_result)
             except:
                 # Try and report traceback on stream, but exit with error even
                 # if stream couldn't be created or something else goes wrong.
-                # The traceback is formatted to a string and written in one go
-                # to avoid interleaving lines from multiple failing children.
+                # The traceback is formatted to a string and written with the
+                # low-level api to avoid line buffering interleaving output
+                # from multiple failing children, unless a signal forces that.
                 try:
-                    stream.write(traceback.format_exc())
+                    tb_text = traceback.format_exc()
+                    while tb_text:
+                        try:
+                            bytes_written = os.write(c2pwrite, tb_text)
+                        except EnvironmentError, e:
+                            if e.errno != errno.EINTR:
+                                raise
+                        else:
+                            tb_text = tb_text[bytes_written:]
                 finally:
                     os._exit(1)
             os._exit(0)
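
The retry loop added by this diff can be sketched in present-day Python 3 roughly as follows. This is an illustration under stated assumptions, not the code from the branch: the helper name write_all is hypothetical, and since Python 3.5 os.write already retries after EINTR on its own (PEP 475), so the InterruptedError branch is mostly a safeguard and the loop chiefly handles short writes.

```python
import os


def write_all(fd, data):
    """Write every byte of data to fd, retrying after EINTR or a short write."""
    view = memoryview(data)
    while view:
        try:
            written = os.write(fd, view)
        except InterruptedError:
            # Interrupted by a signal before anything was written: try again.
            continue
        # os.write may accept fewer bytes than offered; keep the remainder.
        view = view[written:]


# Usage: send a pre-formatted traceback string down a pipe in raw form.
read_fd, write_fd = os.pipe()
write_all(write_fd, b"Traceback (most recent call last):\n")
os.close(write_fd)
received = os.read(read_fd, 1024)
os.close(read_fd)
```

Writing through the file descriptor avoids the line-buffered stream object entirely, which is the design choice the updated comment in the diff describes.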
