Merge lp:~stub/launchpad/branch-rewrite into lp:launchpad

Status:

Superseded

Proposed branch:

lp:~stub/launchpad/branch-rewrite

Merge into:

lp:launchpad

Prerequisite:

lp:~stub/launchpad/pgbouncer-fixture

Diff against target:

355 lines (+198/-12)

5 files modified

lib/lp/codehosting/tests/test_rewrite.py (+88/-6)
lib/lp/testing/__init__.py (+23/-0)
lib/lp/testing/fixture.py (+16/-3)
lib/lp/testing/tests/test_fixture.py (+58/-2)
scripts/branch-rewrite.py (+13/-1)

To merge this branch:

bzr merge lp:~stub/launchpad/branch-rewrite

Critical

Fix Released

Reviewer	Review Type	Date Requested	Status
Jeroen T. Vermeulen (community)		2011-08-29	Approve on 2011-08-31
Review via email: mp+73284@code.launchpad.net

This proposal has been superseded by a proposal from 2011-08-31.

Description of the change

= Summary =

branch-rewrite.py does not reconnect after a database outage.

== Proposed fix ==

Make it reconnect after a database outage.

Revision history for this message

Jeroen T. Vermeulen (jtv) wrote on 2011-08-31:

Hi Stuart,

Generally good, but I have a few things that look worth fixing.

=== modified file 'lib/lp/codehosting/tests/test_rewrite.py'
--- lib/lp/codehosting/tests/test_rewrite.py	2011-08-12 11:37:08 +0000
+++ lib/lp/codehosting/tests/test_rewrite.py	2011-08-29 20:15:59 +0000

@@ -177,7 +181,8 @@
         transaction.commit()
         rewriter.rewriteLine('/' + branch.unique_name + '/.bzr/README')
         rewriter.rewriteLine('/' + branch.unique_name + '/.bzr/README')
-        logging_output_lines = self.getLoggerOutput(rewriter).strip().split('\n')
+        logging_output_lines = self.getLoggerOutput(
+            rewriter).strip().split('\n')

Cleaning out lint, I see.  Thanks.

@@ -274,3 +280,62 @@
         # The script produces logging output, but not to stderr.
         self.assertEqual('', err)
         self.assertEqual(expected_lines, output_lines)
+
+
+class TestBranchRewriterScriptHandlesDisconnects(TestCaseWithFactory):
+    """Ensure branch-rewrite.py survives fastdowntime deploys."""
+    layer = LaunchpadScriptLayer
+
+    def setUp(self):
+        super(TestBranchRewriterScriptHandlesDisconnects, self).setUp()
+        self.pgbouncer = PGBouncerFixture()
+        self.addCleanup(self.pgbouncer.cleanUp)
+        self.pgbouncer.setUp()

Couldn't those last three lines be replaced with a simple

self.pgbouncer = self.useFixture(PGBouncerFixture())

?

In fact I'm not even sure it's worth a setUp with the super() dance, and carrying self.pgbouncer from setUp to the tests.

+    def spawn(self):
+        script_file = os.path.join(
+            config.root, 'scripts', 'branch-rewrite.py')
+
+        self.rewriter_proc = subprocess.Popen(
+            [script_file], stdin=subprocess.PIPE, stdout=subprocess.PIPE,
+            stderr=subprocess.PIPE, bufsize=0)

Again, is it worth keeping self.rewriter_proc on the test class?  It just hides state.  Why not make it a return value?

Moreover, is this process guaranteed to clean itself up?  If not, then please ensure that it is.  For instance, you could add something like "self.addCleanup(kill, rewriter_proc)" to spawn().

+    def request(self, query):
+        self.rewriter_proc.stdin.write(query + '\n')
+        return self.rewriter_proc.stdout.readline().rstrip('\n')

Do we know that the readline() will never hang under reasonable circumstances?

+    def test_disconnect(self):
+        self.spawn()

Sure "disconnect" is something that happens in this test, but is that one word really a good description of what you're testing here?  Would it help to say something with "reconnects"?

+        # Everything should be working, and we get valid output.
+        out = self.request('foo')
+        assert out.endswith('/foo'), out

Why use assert here?  Normally you'd say self.assertEndsWith(out, '/foo').

+        self.pgbouncer.stop()
+
+        # Now with pgbouncer down, we should get NULL messages and
+        # stderr spam, and this keeps happening. We test more than
+        # once to ensure that we will keep trying to reconnect even
+        # after several failures.
+        for count in range(5):
+            out = self.request('foo')

Is this race-free?  I don't know if pgbouncer is a separate process or not; if it is, does stop() block until it's taken effect?

+            assert out == 'NULL', out

Here too: why assert instead of self.assertEqual?

+        self.pgbouncer.start()

Here too: is this race-free?

+        # Everything should be working, and we get valid output.
+        out = self.request('foo')
+        assert out.endswith('/foo'), out

Here too: why assert instead of self.assertEndsWith?

+    def test_starts_with_db_down(self):
+        self.pgbouncer.stop()
+        self.spawn()
+
+        for count in range(5):
+            out = self.request('foo')
+            assert out == 'NULL', out

Another assert.

=== modified file 'scripts/branch-rewrite.py'
--- scripts/branch-rewrite.py	2010-11-06 12:50:22 +0000
+++ scripts/branch-rewrite.py	2011-08-29 20:15:59 +0000

@@ -60,9 +62,18 @@
                     return
             except KeyboardInterrupt:
                 sys.exit()
-            except:
+            except Exception:

Sensible.  This means you no longer try to do useful things when things go really bad.
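The difference between the two except clauses can be sketched in a few lines. This is an illustration only, not code from the branch; the classify helper is a made-up name. It shows that KeyboardInterrupt and SystemExit derive from BaseException rather than Exception, so "except Exception" lets them propagate.

```python
def classify(exc_to_raise):
    """Raise exc_to_raise and report which handler caught it."""
    try:
        raise exc_to_raise
    except Exception:
        # Ordinary errors land here: log them and carry on.
        return "handled"
    except BaseException:
        # KeyboardInterrupt, SystemExit and friends land here. A bare
        # "except:" would have treated them like ordinary errors.
        return "not handled"


print(classify(ValueError("bad input")))
print(classify(KeyboardInterrupt()))
print(classify(SystemExit(1)))
```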

                 self.logger.exception('Exception occurred:')
                 print "NULL"
+                # The exception might have been a DisconnectionError or
+                # similar. Cleanup such as database reconnection will
+                # not happen until the transaction is rolled back. We
+                # are explicitly rolling back the store here instead of
+                # using transaction.abort() due to Bug #819282.

This looks like it should properly be an XXX.  Could you format it as one?  XXX StuartBishop 2011-08-31 bug=819282: etc.

+                try:
+                    ISlaveStore(Branch).rollback()
+                except Exception:
+                    self.logger.exception('Exception occurred in rollback:')

Just to make sure: is the exception swallowed deliberately?  Or was it supposed to be re-raised, or sys.exit() called with an error code or whatever?

Jeroen

review: Approve

Revision history for this message

Stuart Bishop (stub) wrote on 2011-08-31:

On Wed, Aug 31, 2011 at 10:26 AM, Jeroen T. Vermeulen <jtv@canonical.com> wrote:
> === modified file 'lib/lp/codehosting/tests/test_rewrite.py'
> +class TestBranchRewriterScriptHandlesDisconnects(TestCaseWithFactory):
> +    """Ensure branch-rewrite.py survives fastdowntime deploys."""
> +    layer = LaunchpadScriptLayer
> +
> +    def setUp(self):
> +        super(TestBranchRewriterScriptHandlesDisconnects, self).setUp()
> +        self.pgbouncer = PGBouncerFixture()
> +        self.addCleanup(self.pgbouncer.cleanUp)
> +        self.pgbouncer.setUp()
>
> Couldn't those last three lines be replaced with a simple
>
>    self.pgbouncer = self.useFixture(PGBouncerFixture())
>
> ?
>
> In fact I'm not even sure it's worth a setUp with the super() dance, and carrying self.pgbouncer from setUp to the tests.

Yes, useFixture is nicer here. I've removed setUp() entirely and am
using useFixture in the tests.
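For readers unfamiliar with useFixture, the following is a rough sketch of what it does on behalf of a test, written against plain unittest so that it runs without testtools. FakeBouncer and use_fixture are hypothetical stand-ins, not Launchpad or testtools code, and the real useFixture also collects fixture details.

```python
import unittest


class FakeBouncer:
    """Hypothetical stand-in for PGBouncerFixture; it only records calls."""

    def __init__(self):
        self.events = []

    def setUp(self):
        self.events.append("setUp")

    def cleanUp(self):
        self.events.append("cleanUp")


class UseFixtureSketch(unittest.TestCase):

    def use_fixture(self, fixture):
        # Approximately what useFixture does: set the fixture up,
        # schedule its cleanup, and hand it back to the caller.
        fixture.setUp()
        self.addCleanup(fixture.cleanUp)
        return fixture

    def test_fixture_is_local_to_the_test(self):
        # No setUp() override and no attribute on self is needed.
        bouncer = self.use_fixture(FakeBouncer())
        self.assertEqual(["setUp"], bouncer.events)
```

Because the fixture is created inside the test, each test decides for itself whether it needs pgbouncer at all.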

> +    def spawn(self):
> +        script_file = os.path.join(
> +            config.root, 'scripts', 'branch-rewrite.py')
> +
> +        self.rewriter_proc = subprocess.Popen(
> +            [script_file], stdin=subprocess.PIPE, stdout=subprocess.PIPE,
> +            stderr=subprocess.PIPE, bufsize=0)
>
> Again, is it worth keeping self.rewriter_proc on the test class?  It just hides state.  Why not make it a return value?

If I don't hide state, I have to maintain state and pass it to the
request() helper :-)

> Moreover, is this process guaranteed to clean itself up?  If not, then please ensure that it is.  For instance, you could add something like "self.addCleanup(kill, rewriter_proc)" to spawn().

subprocess module handles this for us as soon as things go out of
scope. But it doesn't hurt to be explicit so I've added the cleanup.
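A sketch of the explicit cleanup follows, under the assumption that the child should be terminated and reaped when the test ends. It uses plain unittest and a stand-in child process instead of branch-rewrite.py. As far as the subprocess documentation goes, dropping the last reference to a Popen object does not by itself stop the child, so the explicit cleanup is the safer route.

```python
import subprocess
import sys
import unittest


class SpawnSketch(unittest.TestCase):

    def spawn(self):
        # Stand-in child that blocks on stdin; the real test launches
        # scripts/branch-rewrite.py instead.
        proc = subprocess.Popen(
            [sys.executable, "-c", "import sys; sys.stdin.read()"],
            stdin=subprocess.PIPE, stdout=subprocess.PIPE,
            stderr=subprocess.PIPE)
        # Cleanups run last-in, first-out: terminate the child, reap
        # it, then close the pipes.
        for pipe in (proc.stdin, proc.stdout, proc.stderr):
            self.addCleanup(pipe.close)
        self.addCleanup(proc.wait)
        self.addCleanup(proc.terminate)
        return proc

    def test_child_is_running(self):
        proc = self.spawn()
        self.assertIsNone(proc.poll())
```

Returning the process from spawn() rather than storing it on self is optional; the cleanup works either way.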

> +    def request(self, query):
> +        self.rewriter_proc.stdin.write(query + '\n')
> +        return self.rewriter_proc.stdout.readline().rstrip('\n')
>
> Do we know that the readline() will never hang under reasonable circumstances?

> +    def test_disconnect(self):
> +        self.spawn()
>
> Sure "disconnect" is something that happens in this test, but is that one word really a good description of what you're testing here?  Would it help to say something with "reconnects"?

I've changed the test name to test_reconnects_when_disconnected.

> +        # Everything should be working, and we get valid output.
> +        out = self.request('foo')
> +        assert out.endswith('/foo'), out
>
> Why use assert here?  Normally you'd say self.assertEndsWith(out, '/foo').

Because I'm always forgetting which assert methods are approved,
standard, testutils extensions, LP extensions :-)

Fixed, along with the other occurrences.
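To illustrate the point about assert methods: a bare assert statement is skipped entirely when Python runs with -O and reports little beyond the failing expression, whereas an assertion method always runs and can describe both values. The sketch below uses plain unittest; assert_ends_with is a hypothetical helper written for this example, similar in spirit to the assertEndsWith method mentioned above, and the sample output string is made up.

```python
import unittest


class AssertStyleSketch(unittest.TestCase):

    def assert_ends_with(self, text, suffix):
        # Hypothetical helper; fails with a message naming both values.
        if not text.endswith(suffix):
            self.fail("%r does not end with %r" % (text, suffix))

    def test_valid_output(self):
        out = "file:///mirrors/foo"  # made-up rewriter output
        self.assert_ends_with(out, "/foo")
        self.assertEqual("NULL", "NULL")
```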

> +        self.pgbouncer.stop()
> +
> +        # Now with pgbouncer down, we should get NULL messages and
> +        # stderr spam, and this keeps happening. We test more than
> +        # once to ensure that we will keep trying to reconnect even
> +        # after several failures.
> +        for count in range(5):
> +            out = self.request('foo')
>
> Is this race-free?  I don't know if pgbouncer is a separate process or not; if it is, does stop() block until it's taken effect?

It is race-free. The fixture provided by the pgbouncer package
lifeless put together blocks until the process has started up (and if
it doesn't, it is better to fix that than work around).

> +            assert out == 'NULL', out
>
> Here too: why assert instead of self.assertEqual?

Fixed.

> === modified file 'scripts/branch-rewrite.py'
>                 self.logger.exception('Exception occurred:')
>                 print "NULL"
> +                # The exception might have been a DisconnectionError or
> +                # similar. Cleanup such as database reconnection will
> +                # not happen until the transaction is rolled back. We
> +                # are explicitly rolling back the store here instead of
> +                # using transaction.abort() due to Bug #819282.
>
> This looks like it should properly be an XXX.  Could you format it as one?  XXX StuartBishop 2011-08-31 bug=819282: etc.

Yer, fixed.

> +                try:
> +                    ISlaveStore(Branch).rollback()
> +                except Exception:
> +                    self.logger.exception('Exception occurred in rollback:')
>
> Just to make sure: is the exception swallowed deliberately?  Or was it supposed to be re-raised, or sys.exit() called with an error code or whatever?

I don't know what the best behavior here is. Should we swallow the
failure in rollback, assuming it is a transient database issue? Or
should we die with an exception and hope Apache restarts the process?

-- 
Stuart Bishop <stuart@stuartbishop.net>
http://www.stuartbishop.net/

Revision history for this message

Stuart Bishop (stub) wrote on 2011-08-31:

readline() blocking shouldn't be a problem, but I've implemented a nonblocking readline helper and made use of it.
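A readline with a timeout can be sketched as below. This is an independent Python 3 illustration of the technique, not the helper from the branch; readline_with_timeout is a made-up name, and it assumes a POSIX system, where select() works on pipe file descriptors.

```python
import os
import select
import time


def readline_with_timeout(fd, timeout):
    """Read one line from file descriptor fd, waiting at most timeout seconds.

    Returns the bytes read so far. The result ends with a newline only
    if a complete line arrived before the deadline.
    """
    chunks = []
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        # select() returns three lists; only the readable one matters.
        readable, _, _ = select.select([fd], [], [], remaining)
        if not readable:
            break  # timed out with nothing to read
        # One byte at a time is slow but avoids any put-back logic.
        char = os.read(fd, 1)
        if char == b"":
            break  # EOF
        chunks.append(char)
        if char == b"\n":
            break
    return b"".join(chunks)


read_end, write_end = os.pipe()
os.write(write_end, b"NULL\nextra")
print(readline_with_timeout(read_end, 1.0))
```

A caller can tell a timeout or EOF from success by checking whether the result ends with a newline, which is how the request() helper in the diff decides to fail the test.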

Revision history for this message

Jeroen T. Vermeulen (jtv) wrote on 2011-09-01:

I don't know enough to be of much help with the raise-or-swallow issue. What the best choice is there may depend on what goes on higher up in the call tree. If any kind of transactional integrity is expected across the failure, then the exception probably needs to be re-raised.
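The two options under discussion can be put side by side in a small sketch. Nothing here is taken from the branch: FlakyStore stands in for a Storm store whose rollback may fail, and recover with its reraise flag is a hypothetical way of expressing the choice.

```python
import logging

logger = logging.getLogger("rewrite-sketch")


class FlakyStore:
    """Hypothetical stand-in for a store whose rollback can fail."""

    def __init__(self, rollback_fails):
        self.rollback_fails = rollback_fails

    def rollback(self):
        if self.rollback_fails:
            raise RuntimeError("connection still down")


def recover(store, reraise):
    """Roll back after a failed request; return True if rollback worked."""
    try:
        store.rollback()
    except Exception:
        logger.exception("Exception occurred in rollback:")
        if reraise:
            # Die and rely on whatever supervises the script to restart it.
            raise
        # Swallow: treat the failure as transient and try the next line.
        return False
    return True
```

Swallowing keeps the script answering NULL until the database returns, which is what the new tests exercise; re-raising trades that for a clean restart.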

Preview Diff

 === modified file 'lib/lp/codehosting/tests/test_rewrite.py'
 --- lib/lp/codehosting/tests/test_rewrite.py	2011-08-12 11:37:08 +0000
 +++ lib/lp/codehosting/tests/test_rewrite.py	2011-08-31 16:53:36 +0000
@@ -14,15 +14,21 @@
  from zope.security.proxy import removeSecurityProxy
  from canonical.config import config
--from canonical.testing.layers import DatabaseFunctionalLayer
++from canonical.testing.layers import (
++    DatabaseFunctionalLayer,
++    DatabaseLayer,
++    )
  from lp.code.interfaces.codehosting import branch_id_alias
  from lp.codehosting.rewrite import BranchRewriter
  from lp.codehosting.vfs import branch_id_to_path
  from lp.services.log.logger import BufferLogger
  from lp.testing import (
      FakeTime,
++    nonblocking_readline,
++    TestCase,
      TestCaseWithFactory,
      )
++from lp.testing.fixture import PGBouncerFixture
  class TestBranchRewriter(TestCaseWithFactory):
@@ -177,7 +183,8 @@
          transaction.commit()
          rewriter.rewriteLine('/' + branch.unique_name + '/.bzr/README')
          rewriter.rewriteLine('/' + branch.unique_name + '/.bzr/README')
--        logging_output_lines = self.getLoggerOutput(rewriter).strip().split('\n')
++        logging_output_lines = self.getLoggerOutput(
++            rewriter).strip().split('\n')
          self.assertEqual(2, len(logging_output_lines))
          self.assertIsNot(
              None,
@@ -194,7 +201,8 @@
          self.fake_time.advance(
              config.codehosting.branch_rewrite_cache_lifetime + 1)
          rewriter.rewriteLine('/' + branch.unique_name + '/.bzr/README')
--        logging_output_lines = self.getLoggerOutput(rewriter).strip().split('\n')
++        logging_output_lines = self.getLoggerOutput(
++            rewriter).strip().split('\n')
          self.assertEqual(2, len(logging_output_lines))
          self.assertIsNot(
              None,
@@ -246,7 +254,8 @@
          # buffering, write a complete line of output.
          for input_line in input_lines:
              proc.stdin.write(input_line + '\n')
--            output_lines.append(proc.stdout.readline().rstrip('\n'))
++            output_lines.append(
++                nonblocking_readline(proc.stdout, 60).rstrip('\n'))
          # If we create a new branch after the branch-rewrite.py script has
          # connected to the database, or edit a branch name that has already
          # been rewritten, both are rewritten successfully.
@@ -260,17 +269,90 @@
              'file:///var/tmp/bazaar.launchpad.dev/mirrors/%s/.bzr/README'
              % branch_id_to_path(new_branch.id))
          proc.stdin.write(new_branch_input + '\n')
--        output_lines.append(proc.stdout.readline().rstrip('\n'))
++        output_lines.append(
++            nonblocking_readline(proc.stdout, 60).rstrip('\n'))
          edited_branch_input = '/%s/.bzr/README' % edited_branch.unique_name
          expected_lines.append(
              'file:///var/tmp/bazaar.launchpad.dev/mirrors/%s/.bzr/README'
              % branch_id_to_path(edited_branch.id))
          proc.stdin.write(edited_branch_input + '\n')
--        output_lines.append(proc.stdout.readline().rstrip('\n'))
++        output_lines.append(
++            nonblocking_readline(proc.stdout, 60).rstrip('\n'))
          os.kill(proc.pid, signal.SIGINT)
          err = proc.stderr.read()
          # The script produces logging output, but not to stderr.
          self.assertEqual('', err)
          self.assertEqual(expected_lines, output_lines)
++
++
++class TestBranchRewriterScriptHandlesDisconnects(TestCase):
++    """Ensure branch-rewrite.py survives fastdowntime deploys."""
++    layer = DatabaseLayer
++
++    def spawn(self):
++        script_file = os.path.join(
++            config.root, 'scripts', 'branch-rewrite.py')
++
++        self.rewriter_proc = subprocess.Popen(
++            [script_file], stdin=subprocess.PIPE, stdout=subprocess.PIPE,
++            stderr=subprocess.PIPE, bufsize=0)
++
++        self.addCleanup(self.rewriter_proc.terminate)
++
++    def request(self, query):
++        self.rewriter_proc.stdin.write(query + '\n')
++        self.rewriter_proc.stdin.flush()
++
++        # 60 second timeout as we might need to wait for the script to
++        # finish starting up.
++        result = nonblocking_readline(self.rewriter_proc.stdout, 60)
++
++        if result.endswith('\n'):
++            return result[:-1]
++        self.fail(
++            "Incomplete line or no result retrieved from subprocess: %s"
++            % repr(result))
++
++    def test_reconnects_when_disconnected(self):
++        pgbouncer = self.useFixture(PGBouncerFixture())
++
++        self.spawn()
++
++        # Everything should be working, and we get valid output.
++        out = self.request('foo')
++        self.assertEndsWith(out, '/foo')
++
++        pgbouncer.stop()
++
++        # Now with pgbouncer down, we should get NULL messages and
++        # stderr spam, and this keeps happening. We test more than
++        # once to ensure that we will keep trying to reconnect even
++        # after several failures.
++        for count in range(5):
++            out = self.request('foo')
++            self.assertEqual(out, 'NULL')
++
++        pgbouncer.start()
++
++        # Everything should be working, and we get valid output.
++        out = self.request('foo')
++        self.assertEndsWith(out, '/foo')
++
++    def test_starts_with_db_down(self):
++        pgbouncer = self.useFixture(PGBouncerFixture())
++
++        # Start with the database down.
++        pgbouncer.stop()
++
++        self.spawn()
++
++        for count in range(5):
++            out = self.request('foo')
++            self.assertEqual(out, 'NULL')
++
++        pgbouncer.start()
++
++        out = self.request('foo')
++        self.assertEndsWith(out, '/foo')
 === modified file 'lib/lp/testing/__init__.py'
 --- lib/lp/testing/__init__.py	2011-08-19 18:20:58 +0000
 +++ lib/lp/testing/__init__.py	2011-08-31 16:53:36 +0000
@@ -29,6 +29,7 @@
      'logout',
      'map_branch_contents',
      'normalize_whitespace',
++    'nonblocking_readline',
      'oauth_access_token_for',
      'person_logged_in',
      'quote_jquery_expression',
@@ -69,6 +70,7 @@
  import os
  from pprint import pformat
  import re
++from select import select
  import shutil
  import subprocess
  import sys
@@ -1325,3 +1327,24 @@
  def extract_lp_cache(text):
      match = re.search(r'<script>LP.cache = (\{.*\});</script>', text)
      return simplejson.loads(match.group(1))
++
++
++def nonblocking_readline(instream, timeout):
++    """Non-blocking readline.
++
++    Files must provide a valid fileno() method. This is a test helper
++    as it is inefficient and unlikely useful for production code.
++    """
++    result = StringIO()
++    start = now = time.time()
++    while (now < start + timeout and not result.getvalue().endswith('\n')):
++        rlist, _, _ = select([instream], [], [], timeout - (now - start))
++        if rlist:
++            # Reading 1 character at a time is inefficient, but means
++            # we don't need to implement put-back.
++            next_char = os.read(instream.fileno(), 1)
++            if next_char == "":
++                break  # EOF
++            result.write(next_char)
++        now = time.time()
++    return result.getvalue()
 === modified file 'lib/lp/testing/fixture.py'
 --- lib/lp/testing/fixture.py	2011-08-31 16:53:35 +0000
 +++ lib/lp/testing/fixture.py	2011-08-31 16:53:36 +0000
@@ -91,8 +91,7 @@
          # reconnect_store cleanup added first so it is run last, after
          # the environment variables have been reset.
--        from canonical.testing.layers import reconnect_stores
--        self.addCleanup(reconnect_stores)
++        self.addCleanup(self._maybe_reconnect_stores)
          # Abuse the PGPORT environment variable to get things connecting
          # via pgbouncer. Otherwise, we would need to temporarily
@@ -100,7 +99,21 @@
          self.useFixture(EnvironmentVariableFixture('PGPORT', str(self.port)))
          # Reset database connections so they go through pgbouncer.
--        reconnect_stores()
++        self._maybe_reconnect_stores()
++
++    def _maybe_reconnect_stores(self):
++        """Force Storm Stores to reconnect if they are registered.
++
++        This is a noop if the Component Architecture is not loaded,
++        as we are using a test layer that doesn't provide database
++        connections.
++        """
++        from canonical.testing.layers import (
++            reconnect_stores,
++            is_ca_available,
++            )
++        if is_ca_available():
++            reconnect_stores()
  class ZopeAdapterFixture(Fixture):
 === modified file 'lib/lp/testing/tests/test_fixture.py'
 --- lib/lp/testing/tests/test_fixture.py	2011-08-31 16:53:35 +0000
 +++ lib/lp/testing/tests/test_fixture.py	2011-08-31 16:53:36 +0000
@@ -8,6 +8,7 @@
  from textwrap import dedent
  from fixtures import EnvironmentVariableFixture
++import psycopg2
  from storm.exceptions import DisconnectionError
  from zope.component import (
      adapts,
@@ -18,8 +19,13 @@
      Interface,
      )
++from canonical.config import dbconfig
  from canonical.launchpad.interfaces.lpstorm import IMasterStore
--from canonical.testing.layers import BaseLayer, LaunchpadZopelessLayer
++from canonical.testing.layers import (
++    BaseLayer,
++    DatabaseLayer,
++    LaunchpadZopelessLayer,
++    )
  from lp.registry.model.person import Person
  from lp.testing import TestCase
  from lp.testing.fixture import (
@@ -95,7 +101,11 @@
          self.assertIs(None, queryAdapter(context, IBar))
--class TestPGBouncerFixture(TestCase):
++class TestPGBouncerFixtureWithCA(TestCase):
++    """PGBouncerFixture reconnect tests for Component Architecture layers.
++
++    Registered Storm Stores should be reconnected through pgbouncer.
++    """
      layer = LaunchpadZopelessLayer
      def is_connected(self):
@@ -149,3 +159,49 @@
          # Database is working again.
          assert self.is_connected()
++
++
++class TestPGBouncerFixtureWithoutCA(TestCase):
++    """PGBouncerFixture tests for non-Component Architecture layers."""
++    layer = DatabaseLayer
++
++    def is_db_available(self):
++        # Direct connection to the DB.
++        con_str = dbconfig.rw_main_master + ' user=launchpad_main'
++        try:
++            con = psycopg2.connect(con_str)
++            cur = con.cursor()
++            cur.execute("SELECT id FROM Person LIMIT 1")
++            con.close()
++            return True
++        except psycopg2.OperationalError:
++            return False
++
++    def test_install_fixture(self):
++        self.assert_(self.is_db_available())
++
++        with PGBouncerFixture() as pgbouncer:
++            self.assert_(self.is_db_available())
++
++            pgbouncer.stop()
++            self.assert_(not self.is_db_available())
++
++        # This confirms that we are again connecting directly to the
++        # database, as the pgbouncer process was shutdown.
++        self.assert_(self.is_db_available())
++
++    def test_install_fixture_with_restart(self):
++        self.assert_(self.is_db_available())
++
++        with PGBouncerFixture() as pgbouncer:
++            self.assert_(self.is_db_available())
++
++            pgbouncer.stop()
++            self.assert_(not self.is_db_available())
++
++            pgbouncer.start()
++            self.assert_(self.is_db_available())
++
++        # Note that because pgbouncer was left running, we can't confirm
++        # that we are now connecting directly to the database.
++        self.assert_(self.is_db_available())
 === modified file 'scripts/branch-rewrite.py'
 --- scripts/branch-rewrite.py	2010-11-06 12:50:22 +0000
 +++ scripts/branch-rewrite.py	2011-08-31 16:53:36 +0000
@@ -19,6 +19,8 @@
  from canonical.database.sqlbase import ISOLATION_LEVEL_AUTOCOMMIT
  from canonical.config import config
++from canonical.launchpad.interfaces.lpstorm import ISlaveStore
++from lp.code.model.branch import Branch
  from lp.codehosting.rewrite import BranchRewriter
  from lp.services.log.loglevels import INFO, WARNING
  from lp.services.scripts.base import LaunchpadScript
@@ -60,9 +62,19 @@
                      return
              except KeyboardInterrupt:
                  sys.exit()
--            except:
++            except Exception:
                  self.logger.exception('Exception occurred:')
                  print "NULL"
++                # The exception might have been a DisconnectionError or
++                # similar. Cleanup such as database reconnection will
++                # not happen until the transaction is rolled back.
++                # XXX StuartBishop 2011-08-31 bug=819282: We are
++                # explicitly rolling back the store here as a workaround
++                # instead of using transaction.abort()
++                try:
++                    ISlaveStore(Branch).rollback()
++                except Exception:
++                    self.logger.exception('Exception occurred in rollback:')
  if __name__ == '__main__':