Python PGBouncer

Code review comment for lp:~allenap/python-pgbouncer/reliable-shutdown

reliable-shutdown
Merge into trunk

Revision history for this message

Jeroen T. Vermeulen (jtv) wrote on 2011-10-26:

Thanks for fixing this, by the way. I see some really good stuff in here, both functionally and in terms of code quality.

=== modified file 'pgbouncer/fixture.py'
--- pgbouncer/fixture.py 2011-09-12 23:38:28 +0000
+++ pgbouncer/fixture.py 2011-10-25 20:23:23 +0000

@@ -111,46 +111,56 @@
authfile.write('"%s" "%s"\n' % user_creds)

     def stop(self):
- if self.process_pid is None:
- return
- os.kill(self.process_pid, signal.SIGTERM)
- # Wait for the shutdown to occur
+ if self.process is None:
+ # pgbouncer has not been started.
+ return
+ if self.process.poll() is not None:
+ # pgbouncer has exited already.
+ return
+ # Terminate and wait.
+ self.process.terminate()
         start = time.time()
         stop = start + 5.0
         while time.time() < stop:
- if not os.path.isfile(self.pidpath):
- self.process_pid = None
- return
- # If its not going away, we might want to raise an error, but for now
- # it seems reliable.
+ if self.process.poll() is None:
+ time.sleep(0.1)
+ else:
+ break
+ else:
+ raise Exception(
+ 'timeout waiting for pgbouncer to exit')

Some say that the pattern of “comment announcing block of code, block of code” is an indicator that you're probably better off extracting separate methods. I think in this case the terminate-and-wait loop is a bit mechanical and potentially worth extracting. It also gives you a bit more freedom to do things like a direct “return” on success, and to unit test the loop's corner cases. The while / if / sleep / else / break / else / raise loop, while small, may be just a bit too easy to break in maintenance.

Oh, and while you're at it, could you capitalize and punctuate that error message? It helps confirm to the hurried reader that the message is an independent sentence, rather than a continuation of an earlier statement (or of a traceback line) that the eye should skip over.

def start(self):
self.addCleanup(self.stop)

         # Add /usr/sbin if necessary to the PATH for magic just-works
         # behavior with Ubuntu.
- env = None
+ env = os.environ.copy()
         if not self.pgbouncer.startswith('/'):
- path = os.environ['PATH'].split(':')
+ path = env['PATH'].split(os.pathsep)

Better, because simpler.

             if '/usr/sbin' not in path:
                 path.append('/usr/sbin')
- env = os.environ.copy()
- env['PATH'] = ':'.join(path)
-
- outputfile = open(self.outputpath, 'wt')
- self.process = subprocess.Popen(
- [self.pgbouncer, '-d', self.inipath], env=env,
- stdout=outputfile.fileno(), stderr=outputfile.fileno())
- self.process.communicate()
- # Wait up to 5 seconds for the pid file to exist
+ env['PATH'] = os.pathsep.join(path)
+
+ with open(self.outputpath, "wb") as outputfile:
+ with open(os.devnull, "rb") as devnull:
+ self.process = subprocess.Popen(
+ [self.pgbouncer, self.inipath], env=env, stdin=devnull,
+ stdout=outputfile, stderr=outputfile)
+
+ self.addDetail(
+ os.path.basename(self.outputpath),
+ content_from_file(self.outputpath))
+
+ # Wait up to 5 seconds for the pid file to exist.
         start = time.time()
         stop = start + 5.0
         while time.time() < stop:
             if os.path.isfile(self.pidpath):
- try:
- self.process_pid = int(file(self.pidpath, 'rt').read())
- except ValueError:
- # Empty pid files -> ValueError.
- continue
- return
- raise Exception('timeout waiting for pgbouncer to create pid file')
+ with open(self.pidpath, "rb") as pidfile:
+ if pidfile.read().strip().isdigit():
+ break
+ time.sleep(0.1)
+ else:
+ raise Exception(
+ 'timeout waiting for pgbouncer to create pid file')

Very similar to what you were doing in stop(). Once again I think this is easier to deal with if you extract it. There's probably a reusable thingy somewhere for this basic structure, but I don't suppose it's worth digging up.

import os
-from unittest import main, TestLoader
+import unittest

import fixtures
import psycopg2
@@ -28,10 +28,12 @@
# A 'just-works' workaround for Ubuntu not exposing initdb to the main PATH.
os.environ['PATH'] = os.environ['PATH'] + ':/usr/lib/postgresql/8.4/bin'

+
+test_loader = testresources.TestLoader()
+
+
def test_suite():
- result = testresources.OptimisingTestSuite()
- result.addTest(TestLoader().loadTestsFromName(__name__))
- return result
+ return test_loader.loadTestsFromName(__name__)

Not that I have any idea what I'm talking about, but is there any risk that this global initialization might cause trouble for “make lint” checks (which I believe import the test as a module)?

@@ -49,15 +51,22 @@

resources = [('db', DatabaseManager(initialize_sql=setup_user))]

+ def setUp(self):
+ super(TestFixture, self).setUp()
+ self.bouncer = PGBouncerFixture()
+ self.bouncer.databases[self.db.database] = 'host=' + self.db.host
+ self.bouncer.users['user1'] = ''

You're doing this test a world of good. Still, it's a shame to jump through the “super” hoop *and* create implicit state *and* make an explicit call whenever you use the fixture.

Why not fold all of it into a single explicit call? What I mean is:

    def makeBouncerFixture(self, unix_socket_dir=None):
        bouncer = PGBouncerFixture()
        bouncer.databases[self.db.database] = 'hosts=' + self.db.host
        bouncer.users['user1'] = ''
        if unix_socket_dir is not None:
            bouncer.unix_socket_dir = unix_socket_dir
        return self.useFixture(bouncer)

Then replace every “self.useFixture(self.bouncer)” with “self.makeBouncerFixture()” and done. It's shorter, even. And if you need the bouncer object in a test, just capture the method's return value.

In “connect” of course you'd need to pass the bouncer as an extra argument. Not sure if that negates the winnings.

@@ -65,36 +74,20 @@
         # potentially be used by a different process, so this isn't perfect,
         # but its pretty reliable as a test helper, and manual port allocation
         # outside the dynamic range should be fine.
- bouncer = PGBouncerFixture()
- db = self.db
- bouncer.databases[db.database] = 'host=%s' % (db.host,)
- bouncer.users['user1'] = ''
- def check_connect():
- conn = psycopg2.connect(host=bouncer.host, port=bouncer.port,
- user='user1', database=db.database)
- conn.close()
- with bouncer:
- current_port = bouncer.port
- bouncer.stop()
- self.assertRaises(psycopg2.OperationalError, check_connect)
- bouncer.start()
- check_connect()
+ self.useFixture(self.bouncer)
+ self.bouncer.stop()
+ self.assertRaises(psycopg2.OperationalError, self.connect)
+ self.bouncer.start()
+ self.connect().close()

World of good, definitely. What I'm suggesting above would change it a tiny bit further:

        bouncer = self.makeBouncerFixture()
        bouncer.stop()
        self.assertRaises(psycopg2.OperationalError, self.connect, bouncer)
        bouncer.start()
        self.connect(bouncer).close()

Slightly less monotonous, too. :) I'll leave it to you to decide if it makes things better or not.

def test_unix_sockets(self):
- db = self.db
unix_socket_dir = self.useFixture(fixtures.TempDir()).path
- bouncer = PGBouncerFixture()
- bouncer.databases[db.database] = 'host=%s' % (db.host,)
- bouncer.users['user1'] = ''
- bouncer.unix_socket_dir = unix_socket_dir
- with bouncer:
- # Connect to pgbouncer via a Unix domain socket. We don't
- # care how pgbouncer connects to PostgreSQL.
- conn = psycopg2.connect(
- host=unix_socket_dir, user='user1',
- database=db.database, port=bouncer.port)
- conn.close()
+ self.bouncer.unix_socket_dir = unix_socket_dir
+ self.useFixture(self.bouncer)
+ # Connect to pgbouncer via a Unix domain socket. We don't
+ # care how pgbouncer connects to PostgreSQL.
+ self.connect(host=unix_socket_dir).close()

I like Unix domain sockets. Why mess with global system state when you don't have to?

if __name__ == "__main__":
- main(testLoader=TestLoader())
+ unittest.main(testLoader=test_loader)

I'll be honest: I have no idea how this boilerplate differs from the usual test_suite boilerplate.

Now to read your answers to the questions I posted earlier. But code-wise, this looks good.

Thanks for fixing this, by the way.  I see some really good stuff in here, both functionally and in terms of code quality.

=== modified file 'pgbouncer/fixture.py'
--- pgbouncer/fixture.py	2011-09-12 23:38:28 +0000
+++ pgbouncer/fixture.py	2011-10-25 20:23:23 +0000

@@ -111,46 +111,56 @@
                 authfile.write('"%s" "%s"\n' % user_creds)
 
     def stop(self):
-        if self.process_pid is None:
-            return
-        os.kill(self.process_pid, signal.SIGTERM)
-        # Wait for the shutdown to occur
+        if self.process is None:
+            # pgbouncer has not been started.
+            return
+        if self.process.poll() is not None:
+            # pgbouncer has exited already.
+            return
+        # Terminate and wait.
+        self.process.terminate()
         start = time.time()
         stop = start + 5.0
         while time.time() < stop:
-            if not os.path.isfile(self.pidpath):
-                self.process_pid = None
-                return
-        # If its not going away, we might want to raise an error, but for now
-        # it seems reliable.
+            if self.process.poll() is None:
+                time.sleep(0.1)
+            else:
+                break
+        else:
+            raise Exception(
+                'timeout waiting for pgbouncer to exit')

Some say that the pattern of “comment announcing block of code, block of code” is an indicator that you're probably better off extracting separate methods.  I think in this case the terminate-and-wait loop is a bit mechanical and potentially worth extracting.  It also gives you a bit more freedom to do things like a direct “return” on success, and to unit test the loop's corner cases.  The while / if / sleep / else / break / else / raise loop, while small, may be just a bit too easy to break in maintenance.

Oh, and while you're at it, could you capitalize and punctuate that error message?  It helps confirm to the hurried reader that the message is an independent sentence, rather than a continuation of an earlier statement (or of a traceback line) that the eye should skip over.

def start(self):
         self.addCleanup(self.stop)
 
         # Add /usr/sbin if necessary to the PATH for magic just-works
         # behavior with Ubuntu.
-        env = None
+        env = os.environ.copy()
         if not self.pgbouncer.startswith('/'):
-            path = os.environ['PATH'].split(':')
+            path = env['PATH'].split(os.pathsep)

Better, because simpler.

if '/usr/sbin' not in path:
                 path.append('/usr/sbin')
-                env = os.environ.copy()
-                env['PATH'] = ':'.join(path)
-
-        outputfile = open(self.outputpath, 'wt')
-        self.process = subprocess.Popen(
-            [self.pgbouncer, '-d', self.inipath], env=env,
-            stdout=outputfile.fileno(), stderr=outputfile.fileno())
-        self.process.communicate()
-        # Wait up to 5 seconds for the pid file to exist
+                env['PATH'] = os.pathsep.join(path)
+
+        with open(self.outputpath, "wb") as outputfile:
+            with open(os.devnull, "rb") as devnull:
+                self.process = subprocess.Popen(
+                    [self.pgbouncer, self.inipath], env=env, stdin=devnull,
+                    stdout=outputfile, stderr=outputfile)
+
+        self.addDetail(
+            os.path.basename(self.outputpath),
+            content_from_file(self.outputpath))
+
+        # Wait up to 5 seconds for the pid file to exist.
         start = time.time()
         stop = start + 5.0
         while time.time() < stop:
             if os.path.isfile(self.pidpath):
-                try:
-                    self.process_pid = int(file(self.pidpath, 'rt').read())
-                except ValueError:
-                    # Empty pid files -> ValueError.
-                    continue
-                return
-        raise Exception('timeout waiting for pgbouncer to create pid file')
+                with open(self.pidpath, "rb") as pidfile:
+                    if pidfile.read().strip().isdigit():
+                        break
+            time.sleep(0.1)
+        else:
+            raise Exception(
+                'timeout waiting for pgbouncer to create pid file')

Very similar to what you were doing in stop().  Once again I think this is easier to deal with if you extract it.  There's probably a reusable thingy somewhere for this basic structure, but I don't suppose it's worth digging up.

=== modified file 'pgbouncer/tests.py'
--- pgbouncer/tests.py	2011-09-05 15:01:52 +0000
+++ pgbouncer/tests.py	2011-10-25 20:23:23 +0000
@@ -15,7 +15,7 @@
 # along with this program.  If not, see <http://www.gnu.org/licenses/>.
 
 import os
-from unittest import main, TestLoader
+import unittest
 
 import fixtures
 import psycopg2
@@ -28,10 +28,12 @@
 # A 'just-works' workaround for Ubuntu not exposing initdb to the main PATH.
 os.environ['PATH'] = os.environ['PATH'] + ':/usr/lib/postgresql/8.4/bin'
 
+
+test_loader = testresources.TestLoader()
+
+
 def test_suite():
-    result = testresources.OptimisingTestSuite()
-    result.addTest(TestLoader().loadTestsFromName(__name__))
-    return result
+    return test_loader.loadTestsFromName(__name__)

Not that I have any idea what I'm talking about, but is there any risk that this global initialization might cause trouble for “make lint” checks (which I believe import the test as a module)?
 
 
@@ -49,15 +51,22 @@
 
     resources = [('db', DatabaseManager(initialize_sql=setup_user))]
 
+    def setUp(self):
+        super(TestFixture, self).setUp()
+        self.bouncer = PGBouncerFixture()
+        self.bouncer.databases[self.db.database] = 'host=' + self.db.host
+        self.bouncer.users['user1'] = ''

You're doing this test a world of good.  Still, it's a shame to jump through the “super” hoop *and* create implicit state *and* make an explicit call whenever you use the fixture.

Why not fold all of it into a single explicit call?  What I mean is:

Then replace every “self.useFixture(self.bouncer)” with “self.makeBouncerFixture()” and done.  It's shorter, even.  And if you need the bouncer object in a test, just capture the method's return value.

In “connect” of course you'd need to pass the bouncer as an extra argument.  Not sure if that negates the winnings.

@@ -65,36 +74,20 @@
         # potentially be used by a different process, so this isn't perfect,
         # but its pretty reliable as a test helper, and manual port allocation
         # outside the dynamic range should be fine.
-        bouncer = PGBouncerFixture()
-        db = self.db
-        bouncer.databases[db.database] = 'host=%s' % (db.host,)
-        bouncer.users['user1'] = ''
-        def check_connect():
-            conn = psycopg2.connect(host=bouncer.host, port=bouncer.port,
-                user='user1', database=db.database)
-            conn.close()
-        with bouncer:
-            current_port = bouncer.port
-            bouncer.stop()
-            self.assertRaises(psycopg2.OperationalError, check_connect)
-            bouncer.start()
-            check_connect()
+        self.useFixture(self.bouncer)
+        self.bouncer.stop()
+        self.assertRaises(psycopg2.OperationalError, self.connect)
+        self.bouncer.start()
+        self.connect().close()

World of good, definitely.  What I'm suggesting above would change it a tiny bit further:

bouncer = self.makeBouncerFixture()
        bouncer.stop()
        self.assertRaises(psycopg2.OperationalError, self.connect, bouncer)
        bouncer.start()
        self.connect(bouncer).close()

Slightly less monotonous, too.  :)  I'll leave it to you to decide if it makes things better or not.

def test_unix_sockets(self):
-        db = self.db
         unix_socket_dir = self.useFixture(fixtures.TempDir()).path
-        bouncer = PGBouncerFixture()
-        bouncer.databases[db.database] = 'host=%s' % (db.host,)
-        bouncer.users['user1'] = ''
-        bouncer.unix_socket_dir = unix_socket_dir
-        with bouncer:
-            # Connect to pgbouncer via a Unix domain socket. We don't
-            # care how pgbouncer connects to PostgreSQL.
-            conn = psycopg2.connect(
-                host=unix_socket_dir, user='user1',
-                database=db.database, port=bouncer.port)
-            conn.close()
+        self.bouncer.unix_socket_dir = unix_socket_dir
+        self.useFixture(self.bouncer)
+        # Connect to pgbouncer via a Unix domain socket. We don't
+        # care how pgbouncer connects to PostgreSQL.
+        self.connect(host=unix_socket_dir).close()

I like Unix domain sockets.  Why mess with global system state when you don't have to?

if __name__ == "__main__":
-    main(testLoader=TestLoader())
+    unittest.main(testLoader=test_loader)

I'll be honest: I have no idea how this boilerplate differs from the usual test_suite boilerplate.

Now to read your answers to the questions I posted earlier.  But code-wise, this looks good.

review: Approve

« Back to merge proposal