Merge into trunk : reliable-shutdown : Code : Python PGBouncer

Reviewer	Review Type	Date Requested	Status
Jeroen T. Vermeulen (community)		2011-10-25	Approve on 2011-10-26
Review via email: mp+80382@code.launchpad.net

lp:~allenap/python-pgbouncer/reliable-shutdown updated on 2011-10-25

15. By Gavin Panella on 2011-10-25: Fix test_unix_sockets to actually connect via a Unix socket.

Jeroen T. Vermeulen (jtv) wrote on 2011-10-26:

I'm not done reviewing this yet, but a few questions first:

* I guess if the test dies, pgbouncer will now die as well so you don't need a pidfile to set things right during a later run?

* Why treat the files you open as binary? Take a pidfile, for instance. I don't suppose it'll ever matter, but isn't that a text file? Adding the "b" to the open modes seems to confuse more than help.

Gavin Panella (allenap) wrote on 2011-10-26:

> I'm not done reviewing this yet, but a few questions first:

Thanks for looking at it :)

> * I guess if the test dies, pgbouncer will now die as well so you don't need
> a pidfile to set things right during a later run?

Well, I don't know if there are any greater guarantees than
before. Even if we/pgbouncer arranges for SIGHUP to be sent on parent
death it will only cause pgbouncer to reload its config.

I suppose I could add an atexit handler to kill child processes.

> * Why treat the files you open as binary? Take a pidfile, for instance. I
> don't suppose it'll ever matter, but isn't that a text file? Adding the "b"
> to the open modes seems to confuse more than help.

Just a habit. The Python docs state that using "b" improves
portability. Not that it's needed, but it does no harm, and the
"features" of text mode are not needed anyway.

Jeroen T. Vermeulen (jtv) wrote on 2011-10-26:

Thanks for fixing this, by the way.  I see some really good stuff in here, both functionally and in terms of code quality.

=== modified file 'pgbouncer/fixture.py'
--- pgbouncer/fixture.py	2011-09-12 23:38:28 +0000
+++ pgbouncer/fixture.py	2011-10-25 20:23:23 +0000

@@ -111,46 +111,56 @@
                 authfile.write('"%s" "%s"\n' % user_creds)
 
     def stop(self):
-        if self.process_pid is None:
-            return
-        os.kill(self.process_pid, signal.SIGTERM)
-        # Wait for the shutdown to occur
+        if self.process is None:
+            # pgbouncer has not been started.
+            return
+        if self.process.poll() is not None:
+            # pgbouncer has exited already.
+            return
+        # Terminate and wait.
+        self.process.terminate()
         start = time.time()
         stop = start + 5.0
         while time.time() < stop:
-            if not os.path.isfile(self.pidpath):
-                self.process_pid = None
-                return
-        # If its not going away, we might want to raise an error, but for now
-        # it seems reliable.
+            if self.process.poll() is None:
+                time.sleep(0.1)
+            else:
+                break
+        else:
+            raise Exception(
+                'timeout waiting for pgbouncer to exit')

Some say that the pattern of “comment announcing block of code, block of code” is an indicator that you're probably better off extracting separate methods.  I think in this case the terminate-and-wait loop is a bit mechanical and potentially worth extracting.  It also gives you a bit more freedom to do things like a direct “return” on success, and to unit test the loop's corner cases.  The while / if / sleep / else / break / else / raise loop, while small, may be just a bit too easy to break in maintenance.
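
A minimal sketch of what such an extraction might look like, written for present-day Python 3 rather than the Python 2 of the branch. The helper name `terminate_and_wait` and its `timeout` and `interval` parameters are hypothetical; nothing here is taken from python-pgbouncer itself.

```python
import time


def terminate_and_wait(process, timeout=5.0, interval=0.1):
    """Terminate `process` and wait up to `timeout` seconds for it to exit.

    Hypothetical helper, not part of python-pgbouncer. `process` is
    expected to be a subprocess.Popen object. Returns the exit code on
    success and raises an exception on time-out.
    """
    if process.poll() is not None:
        # The process has exited already; nothing to terminate.
        return process.returncode
    process.terminate()
    deadline = time.time() + timeout
    while time.time() < deadline:
        if process.poll() is not None:
            # Direct return on success, so no break/else is needed.
            return process.returncode
        time.sleep(interval)
    raise Exception("Time-out waiting for process to exit.")
```

With a helper of this shape, stop() would shrink to the "has it been started?" check plus one call, and the time-out path could be unit tested against a stub object that only provides poll() and terminate().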

Oh, and while you're at it, could you capitalize and punctuate that error message?  It helps confirm to the hurried reader that the message is an independent sentence, rather than a continuation of an earlier statement (or of a traceback line) that the eye should skip over.

     def start(self):
         self.addCleanup(self.stop)
 
         # Add /usr/sbin if necessary to the PATH for magic just-works
         # behavior with Ubuntu.
-        env = None
+        env = os.environ.copy()
         if not self.pgbouncer.startswith('/'):
-            path = os.environ['PATH'].split(':')
+            path = env['PATH'].split(os.pathsep)

Better, because simpler.
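
For illustration, the same PATH handling pulled out into a standalone function. This is only a sketch: the function name `environment_with_path` is made up, and dropping empty PATH entries is a simplification that the branch does not make.

```python
import os


def environment_with_path(directory, environ=None):
    """Return a copy of `environ` with `directory` appended to PATH if absent.

    Hypothetical helper for illustration only.
    """
    env = dict(os.environ if environ is None else environ)
    # os.pathsep is ":" on POSIX and ";" on Windows, so nothing is
    # hard-coded here. Empty entries are dropped (a simplification).
    path = [entry for entry in env.get("PATH", "").split(os.pathsep) if entry]
    if directory not in path:
        path.append(directory)
    env["PATH"] = os.pathsep.join(path)
    return env
```

Because the function always returns a full copy of the environment, the caller can pass the result straight to subprocess.Popen(env=...) without a separate "env is None" case.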

             if '/usr/sbin' not in path:
                 path.append('/usr/sbin')
-                env = os.environ.copy()
-                env['PATH'] = ':'.join(path)
-
-        outputfile = open(self.outputpath, 'wt')
-        self.process = subprocess.Popen(
-            [self.pgbouncer, '-d', self.inipath], env=env,
-            stdout=outputfile.fileno(), stderr=outputfile.fileno())
-        self.process.communicate()
-        # Wait up to 5 seconds for the pid file to exist
+                env['PATH'] = os.pathsep.join(path)
+
+        with open(self.outputpath, "wb") as outputfile:
+            with open(os.devnull, "rb") as devnull:
+                self.process = subprocess.Popen(
+                    [self.pgbouncer, self.inipath], env=env, stdin=devnull,
+                    stdout=outputfile, stderr=outputfile)
+
+        self.addDetail(
+            os.path.basename(self.outputpath),
+            content_from_file(self.outputpath))
+
+        # Wait up to 5 seconds for the pid file to exist.
         start = time.time()
         stop = start + 5.0
         while time.time() < stop:
             if os.path.isfile(self.pidpath):
-                try:
-                    self.process_pid = int(file(self.pidpath, 'rt').read())
-                except ValueError:
-                    # Empty pid files -> ValueError.
-                    continue
-                return
-        raise Exception('timeout waiting for pgbouncer to create pid file')
+                with open(self.pidpath, "rb") as pidfile:
+                    if pidfile.read().strip().isdigit():
+                        break
+            time.sleep(0.1)
+        else:
+            raise Exception(
+                'timeout waiting for pgbouncer to create pid file')

Very similar to what you were doing in stop().  Once again I think this is easier to deal with if you extract it.  There's probably a reusable thingy somewhere for this basic structure, but I don't suppose it's worth digging up.

=== modified file 'pgbouncer/tests.py'
--- pgbouncer/tests.py	2011-09-05 15:01:52 +0000
+++ pgbouncer/tests.py	2011-10-25 20:23:23 +0000
@@ -15,7 +15,7 @@
 # along with this program.  If not, see <http://www.gnu.org/licenses/>.
 
 import os
-from unittest import main, TestLoader
+import unittest
 
 import fixtures
 import psycopg2
@@ -28,10 +28,12 @@
 # A 'just-works' workaround for Ubuntu not exposing initdb to the main PATH.
 os.environ['PATH'] = os.environ['PATH'] + ':/usr/lib/postgresql/8.4/bin'
 
+
+test_loader = testresources.TestLoader()
+
+
 def test_suite():
-    result = testresources.OptimisingTestSuite()
-    result.addTest(TestLoader().loadTestsFromName(__name__))
-    return result
+    return test_loader.loadTestsFromName(__name__)

Not that I have any idea what I'm talking about, but is there any risk that this global initialization might cause trouble for “make lint” checks (which I believe import the test as a module)?
 
 
@@ -49,15 +51,22 @@
 
     resources = [('db', DatabaseManager(initialize_sql=setup_user))]
 
+    def setUp(self):
+        super(TestFixture, self).setUp()
+        self.bouncer = PGBouncerFixture()
+        self.bouncer.databases[self.db.database] = 'host=' + self.db.host
+        self.bouncer.users['user1'] = ''

You're doing this test a world of good.  Still, it's a shame to jump through the “super” hoop *and* create implicit state *and* make an explicit call whenever you use the fixture.

Why not fold all of it into a single explicit call?  What I mean is:

    def makeBouncerFixture(self, unix_socket_dir=None):
        bouncer = PGBouncerFixture()
        bouncer.databases[self.db.database] = 'host=' + self.db.host
        bouncer.users['user1'] = ''
        if unix_socket_dir is not None:
            bouncer.unix_socket_dir = unix_socket_dir
        return self.useFixture(bouncer)

Then replace every “self.useFixture(self.bouncer)” with “self.makeBouncerFixture()” and done.  It's shorter, even.  And if you need the bouncer object in a test, just capture the method's return value.

In “connect” of course you'd need to pass the bouncer as an extra argument.  Not sure if that negates the winnings.

@@ -65,36 +74,20 @@
         # potentially be used by a different process, so this isn't perfect,
         # but its pretty reliable as a test helper, and manual port allocation
         # outside the dynamic range should be fine.
-        bouncer = PGBouncerFixture()
-        db = self.db
-        bouncer.databases[db.database] = 'host=%s' % (db.host,)
-        bouncer.users['user1'] = ''
-        def check_connect():
-            conn = psycopg2.connect(host=bouncer.host, port=bouncer.port,
-                user='user1', database=db.database)
-            conn.close()
-        with bouncer:
-            current_port = bouncer.port
-            bouncer.stop()
-            self.assertRaises(psycopg2.OperationalError, check_connect)
-            bouncer.start()
-            check_connect()
+        self.useFixture(self.bouncer)
+        self.bouncer.stop()
+        self.assertRaises(psycopg2.OperationalError, self.connect)
+        self.bouncer.start()
+        self.connect().close()

World of good, definitely.  What I'm suggesting above would change it a tiny bit further:

        bouncer = self.makeBouncerFixture()
        bouncer.stop()
        self.assertRaises(psycopg2.OperationalError, self.connect, bouncer)
        bouncer.start()
        self.connect(bouncer).close()

Slightly less monotonous, too.  :)  I'll leave it to you to decide if it makes things better or not.

     def test_unix_sockets(self):
-        db = self.db
         unix_socket_dir = self.useFixture(fixtures.TempDir()).path
-        bouncer = PGBouncerFixture()
-        bouncer.databases[db.database] = 'host=%s' % (db.host,)
-        bouncer.users['user1'] = ''
-        bouncer.unix_socket_dir = unix_socket_dir
-        with bouncer:
-            # Connect to pgbouncer via a Unix domain socket. We don't
-            # care how pgbouncer connects to PostgreSQL.
-            conn = psycopg2.connect(
-                host=unix_socket_dir, user='user1',
-                database=db.database, port=bouncer.port)
-            conn.close()
+        self.bouncer.unix_socket_dir = unix_socket_dir
+        self.useFixture(self.bouncer)
+        # Connect to pgbouncer via a Unix domain socket. We don't
+        # care how pgbouncer connects to PostgreSQL.
+        self.connect(host=unix_socket_dir).close()

I like Unix domain sockets.  Why mess with global system state when you don't have to?

 if __name__ == "__main__":
-    main(testLoader=TestLoader())
+    unittest.main(testLoader=test_loader)

I'll be honest: I have no idea how this boilerplate differs from the usual test_suite boilerplate.

Now to read your answers to the questions I posted earlier.  But code-wise, this looks good.

review: Approve

Jeroen T. Vermeulen (jtv) wrote on 2011-10-26:

> > * I guess if the test dies, pgbouncer will now die as well so you don't
> need
> > a pidfile to set things right during a later run?
>
> Well, I don't know if there are any greater guarantees than
> before. Even if we/pgbouncer arranges for SIGHUP to be sent on parent
> death it will only cause pgbouncer to reload its config.
>
> I suppose I could add an atexit handler to kill child processes.

If pgbouncer is a child of the test process, it should die along with the process. Don't mess with atexit if you can avoid it. It'd be good to give this a manual try though, just in case. You can probably just insert a long sleep and issue a manual “kill -segv <pid>” to see what happens if the test process dies without any python exit handling whatsoever.

> > * Why treat the files you open as binary? Take a pidfile, for instance. I
> > don't suppose it'll ever matter, but isn't that a text file? Adding the "b"
> > to the open modes seems to confuse more than help.
>
> Just a habit. The Python docs state that using "b" improves
> portability. Not that it's needed, but it does no harm, and the
> "features" of text mode are not needed anyway.

AFAIK it only improves portability for files that aren't pure text!

Gavin Panella (allenap) wrote on 2011-10-26:

[...]
> Some say that the pattern of “comment announcing block of code, block of code”
> is an indicator that you're probably better off extracting separate methods.
> I think in this case the terminate-and-wait loop is a bit mechanical and
> potentially worth extracting.  It also gives you a bit more freedom to do
> things like a direct “return” on success, and to unit test the loop's corner
> cases.  The while / if / sleep / else / break / else / raise loop, while
> small, may be just a bit too easy to break in maintenance.

I've factored some of it out into a new countdown() function. I've
stuck with break instead of return, just because. I don't have strong
feelings about early returns, but here I marginally prefer the break.
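
As a rough illustration of how a countdown iterator combines with for/else and break, here is a sketch in present-day Python 3. The generator follows the countdown() that appears in the branch's preview diff; the caller `wait_for_file` is hypothetical and exists only for this example.

```python
import itertools
import os
import time


def countdown(duration=5.0, sleep=0.1):
    """Yield iteration counts from 1, sleeping between them, until time is up."""
    stop = time.time() + duration
    for iteration in itertools.count(1):
        if time.time() < stop:
            yield iteration
            # Runs only if the caller asks for another iteration.
            time.sleep(sleep)
        else:
            break


def wait_for_file(path, duration=5.0):
    """Hypothetical caller: wait for `path` to appear, or raise on time-out."""
    for iteration in countdown(duration):
        if os.path.isfile(path):
            break
    else:
        # Reached only when the countdown ran out without a break.
        raise Exception("Time-out waiting for %s to appear." % path)
    return iteration
```

The sleep lives inside the generator, so the calling loop holds only the condition and the break; the else clause carries the time-out.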

> Oh, and while you're at it, could you capitalize and punctuate that error
> message?  It helps confirm to the hurried reader that the message is an
> independent sentence, rather than a continuation of an earlier statement (or
> of a traceback line) that the eye should skip over.

Done.

[...]
> Very similar to what you were doing in stop().  Once again I think this is
> easier to deal with if you extract it.  There's probably a reusable thingy
> somewhere for this basic structure, but I don't suppose it's worth digging up.

Yep, changed as for stop().

> === modified file 'pgbouncer/tests.py'
> --- pgbouncer/tests.py  2011-09-05 15:01:52 +0000
> +++ pgbouncer/tests.py  2011-10-25 20:23:23 +0000
> @@ -15,7 +15,7 @@
>  # along with this program.  If not, see <http://www.gnu.org/licenses/>.
> 
>  import os
> -from unittest import main, TestLoader
> +import unittest
> 
>  import fixtures
>  import psycopg2
> @@ -28,10 +28,12 @@
>  # A 'just-works' workaround for Ubuntu not exposing initdb to the main PATH.
>  os.environ['PATH'] = os.environ['PATH'] + ':/usr/lib/postgresql/8.4/bin'
> 
> +
> +test_loader = testresources.TestLoader()
> +
> +
>  def test_suite():
> -    result = testresources.OptimisingTestSuite()
> -    result.addTest(TestLoader().loadTestsFromName(__name__))
> -    return result
> +    return test_loader.loadTestsFromName(__name__)
> 
> 
> Not that I have any idea what I'm talking about, but is there any risk that
> this global initialization might cause trouble for “make lint” checks (which I
> believe import the test as a module)?

TestLoader.__init__() is not overridden from object() and
loadTestsFromName() does not mutate anything, so this is safe.
However, test_loader.discover() does mutate self so I've changed this
to create a new TestLoader as needed, Just In Case.
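
A sketch of the "create a loader as needed" shape. The revised code is not shown in this part of the thread, so this is an assumption about its general form; it uses the standard library's unittest.TestLoader as a stand-in for testresources.TestLoader, and `ExampleTest` is a placeholder.

```python
import unittest


class ExampleTest(unittest.TestCase):
    """Placeholder test case standing in for the real tests."""

    def test_nothing(self):
        self.assertTrue(True)


def test_suite():
    # A fresh loader per call: no loader object is created as a side
    # effect of importing the module, and none is shared between callers.
    loader = unittest.TestLoader()
    return loader.loadTestsFromTestCase(ExampleTest)
```

Importing a module written this way only defines names, which is what tools that import test modules without running them tend to expect.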

> 
> 
> @@ -49,15 +51,22 @@
> 
>      resources = [('db', DatabaseManager(initialize_sql=setup_user))]
> 
> +    def setUp(self):
> +        super(TestFixture, self).setUp()
> +        self.bouncer = PGBouncerFixture()
> +        self.bouncer.databases[self.db.database] = 'host=' + self.db.host
> +        self.bouncer.users['user1'] = ''
> 
> You're doing this test a world of good.  Still, it's a shame to jump through
> the “super” hoop *and* create implicit state *and* make an explicit call
> whenever you use the fixture.
> 
> Why not fold all of it into a single explicit call?  What I mean is:
> 
>     def makeBouncerFixture(self, unix_socket_dir=None):
>         bouncer = PGBouncerFixture()
>         bouncer.databases[self.db.database] = 'host=' + self.db.host
>         bouncer.users['user1'] = ''
>         if unix_socket_dir is not None:
>             bouncer.unix_socket_dir = unix_socket_dir
>         return self.useFixture(bouncer)
> 
> Then replace every “self.useFixture(self.bouncer)” with
> “self.makeBouncerFixture()” and done.  It's shorter, even.  And if you need
> the bouncer object in a test, just capture the method's return value.
> 
> In “connect” of course you'd need to pass the bouncer as an extra argument.
> Not sure if that negates the winnings.

It gets a little ugly because now the connect() method can no longer
assume that it should connect as "user1". Well, it _can_ assume that,
but it's less concrete than having it defined every time by
setUp(). There are ways around that... but I'm inclined to stick with
what's there. I think setUp() can easily be overused, but I think it's
fine here because every test uses self.bouncer and self.connect.

I have simplified the connect() method.

[...]
> I like Unix domain sockets.  Why mess with global system state when you don't
> have to?

Indeed :)

>  if __name__ == "__main__":
> -    main(testLoader=TestLoader())
> +    unittest.main(testLoader=test_loader)
> 
> 
> I'll be honest: I have no idea how this boilerplate differs from the usual
> test_suite boilerplate.

This probably isn't needed, but I'll leave it in for now. It does no
harm, I hope.

> Now to read your answers to the questions I posted earlier.  But code-wise,
> this looks good.

Now to respond to your response :)

Thanks for the review.

lp:~allenap/python-pgbouncer/reliable-shutdown updated on 2011-10-26

16. By Gavin Panella on 2011-10-26: Factor out the countdown code.
17. By Gavin Panella on 2011-10-26: Simplify the countdown code.
18. By Gavin Panella on 2011-10-26: Improve docstrings and exception messages.
19. By Gavin Panella on 2011-10-26: Create TestLoader as needed.
20. By Gavin Panella on 2011-10-26: Simplify connect().

Gavin Panella (allenap) wrote on 2011-10-26:

[...]
> If pgbouncer is a child of the test process, it should die along
> with the process. Don't mess with atexit if you can avoid it. It'd
> be good to give this a manual try though, just in case. You can
> probably just insert a long sleep and issue a manual “kill -segv
> <pid>” to see what happens if the test process dies without any
> python exit handling whatsoever.

SIGSEGV and SIGTERM leave pgbouncer running, but SIGINT allows
clean-ups to run (because it's a KeyboardInterrupt) so that pgbouncer
gets terminated. I think this is acceptable.
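
The difference between the signals can be reproduced without pgbouncer. The sketch below assumes a POSIX system and Python 3; the child program and the function name `cleanup_runs_after` are invented for the example. It starts a small Python child that sleeps inside try/finally, sends it a signal, and reports whether the finally block ran.

```python
import signal
import subprocess
import sys

# Child program: restore the default handlers explicitly (in case the
# parent ignores these signals), then sleep inside try/finally.
CHILD_SOURCE = """
import signal, time
signal.signal(signal.SIGINT, signal.default_int_handler)
signal.signal(signal.SIGTERM, signal.SIG_DFL)
try:
    print("ready", flush=True)
    time.sleep(30)
finally:
    print("cleanup ran", flush=True)
"""


def cleanup_runs_after(signum):
    """Return True if the child's `finally` block ran after `signum`."""
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdout=subprocess.PIPE, stderr=subprocess.DEVNULL, text=True)
    child.stdout.readline()  # Wait until the child is inside the try block.
    child.send_signal(signum)
    output = child.stdout.read()
    child.wait()
    return "cleanup ran" in output
```

Python turns SIGINT into a KeyboardInterrupt exception, so the stack unwinds and clean-ups get a chance to run. The default action for SIGTERM ends the process without unwinding anything, which matches the leftover pgbouncer described above.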

Rob has previously warned (w.r.t. rabbitfixture) not to be too clever
with cleaning stuff up from earlier. He advised something along the
lines of: it's better to let these things go awry and thus draw our
attention to leaks rather than patch things over.

> > > * Why treat the files you open as binary? Take a pidfile, for
> > > instance. I don't suppose it'll ever matter, but isn't that a
> > > text file? Adding the "b" to the open modes seems to confuse
> > > more than help.
> >
> > Just a habit. The Python docs state that using "b" improves
> > portability. Not that it's needed, but it does no harm, and the
> > "features" of text mode are not needed anyway.
>
> AFAIK it only improves portability for files that aren't pure text!

Though there is no difference on Linux (afaik), treating every file as
binary reduces surprises on other platforms. Having a distinction
between text and binary reminds me of FTP ASCII mode and what a
horror-show that is.

When writing to a file I rarely use print, for example; I prefer to
use .write(), which means I must think about and insert line-endings
myself. I guess this _might_ tempt me to use text mode for writing a
file, but more often than not I'd simply use Unix line-endings. Only
notepad.exe is likely to have any problem opening files like that.

If I want to read a file line by line that could have any line-ending,
I will either slurp it in and call splitlines(), or use universal
newlines.

In the fixture there are three places where I use binary mode: opening
the output file for pgbouncer, opening the input file (/dev/null) for
pgbouncer, and for reading the PID file.

The first two mimic shell redirections, which must use binary mode (or
else piping tar to gzip would break on Windows).

The PID file is slurped in and immediately stripped (and is expected
to be no more than one line anyway) so line-endings are irrelevant.
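
For reference, the redirection pattern on its own, as a sketch in present-day Python 3. The function name `run_redirected` is hypothetical; the nested open() calls mirror the ones in the fixture's start() method.

```python
import os
import subprocess


def run_redirected(argv, output_path):
    """Run `argv` with stdin from the null device and output sent to a file.

    Roughly `argv < /dev/null > output_path 2>&1` in shell terms.
    Hypothetical helper for illustration only.
    """
    with open(output_path, "wb") as outputfile:
        with open(os.devnull, "rb") as devnull:
            # The child receives its own copies of the descriptors, so the
            # parent can close its handles as soon as Popen returns.
            return subprocess.Popen(
                argv, stdin=devnull, stdout=outputfile, stderr=outputfile)
```

The file objects exist only to hand descriptors to the child; the parent never reads or writes through them, so whatever the child writes reaches the file byte for byte.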

Merge lp:~allenap/python-pgbouncer/reliable-shutdown into lp:python-pgbouncer

Preview Diff

 === modified file '.bzrignore'
 --- .bzrignore	2011-07-18 03:31:27 +0000
 +++ .bzrignore	2011-10-26 14:08:31 +0000
@@ -6,3 +6,5 @@
  ./parts
  ./eggs
  ./download-cache
++TAGS
++tags
 === modified file 'README'
 --- README	2011-09-05 14:35:29 +0000
 +++ README	2011-10-26 14:08:31 +0000
@@ -30,12 +30,18 @@
  * python-fixtures (https://launchpad.net/python-fixtures or
    http://pypi.python.org/pypi/fixtures)
++* testtools (http://pypi.python.org/pypi/testtools)
++
  Testing Dependencies
  ====================
++In addition to the above, the tests also depend on:
++
++* psycopg2 (http://pypi.python.org/pypi/psycopg2)
++
  * subunit (http://pypi.python.org/pypi/python-subunit) (optional)
--* testtools (http://pypi.python.org/pypi/testtools)
++* testresources (http://pypi.python.org/pypi/testresources)
  * van.pg (http://pypi.python.org/pypi/van.pg)
@@ -75,4 +81,10 @@
  immediately available, you can use ./bootstrap.py to create bin/buildout, then
  bin/py to get a python interpreter with the dependencies available.
--To run the tests, run 'bin/py pgbouncer/tests.py'.
++To run the tests, run either:
++
++ $ bin/py pgbouncer/tests.py
++
++or:
++
++ $ bin/py -m testtools.run pgbouncer.tests.test_suite
 === modified file 'pgbouncer/fixture.py'
 --- pgbouncer/fixture.py	2011-09-12 23:38:28 +0000
 +++ pgbouncer/fixture.py	2011-10-26 14:08:31 +0000
@@ -18,13 +18,31 @@
      'PGBouncerFixture',
+     ]
++import itertools
  import os.path
  import socket
--import signal
  import subprocess
  import time
  from fixtures import Fixture, TempDir
++from testtools.content import content_from_file
++
++
++def countdown(duration=5.0, sleep=0.1):
++    """Provide a countdown iterator that sleeps between iterations.
++
++    Yields the current iteration count, starting from 1.
++    """
++    start = time.time()
++    stop = start + duration
++    for iteration in itertools.count(1):
++        now = time.time()
++        if now < stop:
++            yield iteration
++            time.sleep(sleep)
++        else:
++            break
++
  def _allocate_ports(n=1):
      """Allocate `n` unused ports.
@@ -46,12 +64,16 @@
  class PGBouncerFixture(Fixture):
      """Programmatically configure and run pgbouncer.
--    >>> Minimal usage:
--    >>> bouncer = PGBouncerFixture()
--    >>> bouncer.databases['mydb'] = 'host=hostname dbname=foo'
--    >>> bouncer.users['user1'] = 'credentials'
--    >>> with bouncer:
--    ...     # Can now connect to bouncer.host port=bouncer.port user=user1
++    Minimal usage:
++
++      >>> bouncer = PGBouncerFixture()
++      >>> bouncer.databases['mydb'] = 'host=hostname dbname=foo'
++      >>> bouncer.users['user1'] = 'credentials'
++      >>> with bouncer:
++      ...     connection = psycopg2.connect(
++      ...         database="mydb", host=bouncer.host, port=bouncer.port,
++      ...         user="user1", password="credentials")
++
      """
      def __init__(self):
@@ -76,7 +98,6 @@
          self.port = _allocate_ports()[0]
          self.configdir = self.useFixture(TempDir())
          self.auth_type = 'trust'
--        self.process_pid = None
          self.setUpConf()
          self.start()
@@ -111,46 +132,48 @@
                  authfile.write('"%s" "%s"\n' % user_creds)
      def stop(self):
--        if self.process_pid is None:
--            return
--        os.kill(self.process_pid, signal.SIGTERM)
--        # Wait for the shutdown to occur
--        start = time.time()
--        stop = start + 5.0
--        while time.time() < stop:
--            if not os.path.isfile(self.pidpath):
--                self.process_pid = None
--                return
--        # If its not going away, we might want to raise an error, but for now
--        # it seems reliable.
++        if self.process is None:
++            # pgbouncer has not been started.
++            return
++        if self.process.poll() is not None:
++            # pgbouncer has exited already.
++            return
++        self.process.terminate()
++        for iteration in countdown():
++            if self.process.poll() is not None:
++                break
++        else:
++            raise Exception(
++                'Time-out waiting for pgbouncer to exit.')
      def start(self):
          self.addCleanup(self.stop)
          # Add /usr/sbin if necessary to the PATH for magic just-works
          # behavior with Ubuntu.
--        env = None
++        env = os.environ.copy()
          if not self.pgbouncer.startswith('/'):
--            path = os.environ['PATH'].split(':')
++            path = env['PATH'].split(os.pathsep)
              if '/usr/sbin' not in path:
                  path.append('/usr/sbin')
--                env = os.environ.copy()
--                env['PATH'] = ':'.join(path)
--
--        outputfile = open(self.outputpath, 'wt')
--        self.process = subprocess.Popen(
--            [self.pgbouncer, '-d', self.inipath], env=env,
--            stdout=outputfile.fileno(), stderr=outputfile.fileno())
--        self.process.communicate()
--        # Wait up to 5 seconds for the pid file to exist
--        start = time.time()
--        stop = start + 5.0
--        while time.time() < stop:
++                env['PATH'] = os.pathsep.join(path)
++
++        with open(self.outputpath, "wb") as outputfile:
++            with open(os.devnull, "rb") as devnull:
++                self.process = subprocess.Popen(
++                    [self.pgbouncer, self.inipath], env=env, stdin=devnull,
++                    stdout=outputfile, stderr=outputfile)
++
++        self.addDetail(
++            os.path.basename(self.outputpath),
++            content_from_file(self.outputpath))
++
++        # Wait for the PID file to appear.
++        for iteration in countdown():
              if os.path.isfile(self.pidpath):
--                try:
--                    self.process_pid = int(file(self.pidpath, 'rt').read())
--                except ValueError:
--                    # Empty pid files -> ValueError.
--                    continue
--                return
--        raise Exception('timeout waiting for pgbouncer to create pid file')
++                with open(self.pidpath, "rb") as pidfile:
++                    if pidfile.read().strip().isdigit():
++                        break
++        else:
++            raise Exception(
++                'Time-out waiting for pgbouncer to create PID file.')
 === modified file 'pgbouncer/tests.py'
 --- pgbouncer/tests.py	2011-09-05 15:01:52 +0000
 +++ pgbouncer/tests.py	2011-10-26 14:08:31 +0000
@@ -15,7 +15,7 @@
  # along with this program.  If not, see <http://www.gnu.org/licenses/>.
  import os
--from unittest import main, TestLoader
++import unittest
  import fixtures
  import psycopg2
@@ -28,10 +28,10 @@
  # A 'just-works' workaround for Ubuntu not exposing initdb to the main PATH.
  os.environ['PATH'] = os.environ['PATH'] + ':/usr/lib/postgresql/8.4/bin'
++
  def test_suite():
--    result = testresources.OptimisingTestSuite()
--    result.addTest(TestLoader().loadTestsFromName(__name__))
--    return result
++    loader = testresources.TestLoader()
++    return loader.loadTestsFromName(__name__)
  class ResourcedTestCase(testtools.TestCase, testresources.ResourcedTestCase):
@@ -49,15 +49,21 @@
      resources = [('db', DatabaseManager(initialize_sql=setup_user))]
++    def setUp(self):
++        super(TestFixture, self).setUp()
++        self.bouncer = PGBouncerFixture()
++        self.bouncer.databases[self.db.database] = 'host=' + self.db.host
++        self.bouncer.users['user1'] = ''
++
++    def connect(self, host=None):
++        return psycopg2.connect(
++            host=(self.bouncer.host if host is None else host),
++            port=self.bouncer.port, database=self.db.database,
++            user='user1')
++
      def test_dynamic_port_allocation(self):
--        bouncer = PGBouncerFixture()
--        db = self.db
--        bouncer.databases[db.database] = 'host=%s' % (db.host,)
--        bouncer.users['user1'] = ''
--        with bouncer:
--            conn = psycopg2.connect(host=bouncer.host, port=bouncer.port,
--                user='user1', database=db.database)
--            conn.close()
++        self.useFixture(self.bouncer)
++        self.connect().close()
      def test_stop_start_facility(self):
          # Once setup the fixture can be stopped, and started again, retaining
@@ -65,36 +71,21 @@
          # potentially be used by a different process, so this isn't perfect,
          # but its pretty reliable as a test helper, and manual port allocation
          # outside the dynamic range should be fine.
--        bouncer = PGBouncerFixture()
--        db = self.db
--        bouncer.databases[db.database] = 'host=%s' % (db.host,)
--        bouncer.users['user1'] = ''
--        def check_connect():
--            conn = psycopg2.connect(host=bouncer.host, port=bouncer.port,
--                user='user1', database=db.database)
--            conn.close()
--        with bouncer:
--            current_port = bouncer.port
--            bouncer.stop()
--            self.assertRaises(psycopg2.OperationalError, check_connect)
--            bouncer.start()
--            check_connect()
++        self.useFixture(self.bouncer)
++        self.bouncer.stop()
++        self.assertRaises(psycopg2.OperationalError, self.connect)
++        self.bouncer.start()
++        self.connect().close()
      def test_unix_sockets(self):
--        db = self.db
          unix_socket_dir = self.useFixture(fixtures.TempDir()).path
--        bouncer = PGBouncerFixture()
--        bouncer.databases[db.database] = 'host=%s' % (db.host,)
--        bouncer.users['user1'] = ''
--        bouncer.unix_socket_dir = unix_socket_dir
--        with bouncer:
--            # Connect to pgbouncer via a Unix domain socket. We don't
--            # care how pgbouncer connects to PostgreSQL.
--            conn = psycopg2.connect(
--                host=unix_socket_dir, user='user1',
--                database=db.database, port=bouncer.port)
--            conn.close()
++        self.bouncer.unix_socket_dir = unix_socket_dir
++        self.useFixture(self.bouncer)
++        # Connect to pgbouncer via a Unix domain socket. We don't
++        # care how pgbouncer connects to PostgreSQL.
++        self.connect(host=unix_socket_dir).close()
  if __name__ == "__main__":
--    main(testLoader=TestLoader())
++    loader = testresources.TestLoader()
++    unittest.main(testLoader=loader)
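
For readers skimming the diff, the shutdown pattern it introduces (terminate the child, then poll inside a bounded countdown, raising from the for/else branch on time-out) can be sketched in isolation. This is only an illustration, not code from the branch: `stop_process` is a hypothetical helper name, the child here is a sleeping Python interpreter rather than pgbouncer, and `RuntimeError` stands in for the plain `Exception` used in the diff.

```python
import itertools
import subprocess
import sys
import time


def countdown(duration=5.0, sleep=0.1):
    """Yield 1, 2, 3, ... until `duration` seconds have passed.

    Sleeps for `sleep` seconds after each yielded iteration, mirroring
    the helper added to pgbouncer/fixture.py in the diff.
    """
    stop = time.time() + duration
    for iteration in itertools.count(1):
        if time.time() >= stop:
            break
        yield iteration
        time.sleep(sleep)


def stop_process(process, duration=5.0):
    """Terminate `process` and wait up to `duration` seconds for it to exit.

    Hypothetical helper for illustration only; the branch does this
    inline in PGBouncerFixture.stop().
    """
    if process.poll() is not None:
        # The process has exited already; nothing to do.
        return process.returncode
    process.terminate()
    for _ in countdown(duration):
        if process.poll() is not None:
            break
    else:
        # The loop ran out of time without ever seeing the process exit.
        raise RuntimeError('Time-out waiting for process to exit.')
    return process.returncode


if __name__ == "__main__":
    # Stand-in child process: a Python interpreter that just sleeps.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"])
    stop_process(child)
```

Because the child stays attached to the `Popen` object instead of daemonising, `poll()` reports its exit directly, so shutdown no longer has to be inferred from the PID file disappearing.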