Ubuntu Distributed Development

Merge lp:~james-w/udd/storm-unicode-fixes into lp:udd

storm-unicode-fixes
Merge into import-scripts

Proposed by James Westby on 2012-07-02

Status:	Merged
Merged at revision:	601
Proposed branch:	lp:~james-w/udd/storm-unicode-fixes
Merge into:	lp:udd
Prerequisite:	lp:~james-w/udd/storm
Diff against target:	505 lines (+208/-29), 5 files modified:
	udd/icommon.py (+186/-21)
	udd/scripts/mass_import.py (+6/-2)
	udd/tests/test_commit_database.py (+1/-1)
	udd/tests/test_revid_database.py (+1/-1)
	udd/tests/test_status_database.py (+14/-4)
To merge this branch:	bzr merge lp:~james-w/udd/storm-unicode-fixes

Reviewer	Review Type	Date Requested	Status
Martin Packman			Approve on 2012-07-02
Vincent Ladeuil			Needs Information on 2012-07-02
John A Meinel			Approve on 2012-07-02
James Westby			Pending
Review via email: mp+113004@code.launchpad.net

This proposal supersedes a proposal from 2012-07-02.

Description of the change

Hi,

This fixes the issues that Max found where we weren't using unicode for
TEXT content, which meant the info was stored as blobs and problems
happened.

Thanks,

James
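
The storage problem described above can be reproduced with plain sqlite3. The following is a minimal sketch, not the UDD code: the table is a cut-down stand-in for the failures table, and it is written for Python 3, where `str` plays the role of Python 2's `unicode` and `bytes` the role of Python 2's `str`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The column is declared TEXT, as in a simplified failures table.
conn.execute("create table failures (package text, reason text)")

# A byte string is stored as a BLOB even though the column says TEXT.
conn.execute("insert into failures values (?, ?)", ("pkg-bytes", b"boom"))
# A text string is stored as TEXT.
conn.execute("insert into failures values (?, ?)", ("pkg-text", "boom"))

# Ask sqlite which storage class each value actually ended up with.
storage = dict(conn.execute("select package, typeof(reason) from failures"))

# Looking up by text only finds the row that was stored as text.
matches = [row[0] for row in conn.execute(
    "select package from failures where reason = ?", ("boom",))]
```

Here `storage` maps `pkg-bytes` to `blob` and `pkg-text` to `text`, and `matches` contains only `pkg-text`, which is the kind of silent mismatch that passing unicode for TEXT content avoids.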

John A Meinel (jameinel) wrote on 2012-07-02:

On 7/2/2012 12:03 PM, James Westby wrote:
> James Westby has proposed merging
> lp:~james-w/udd/storm-unicode-fixes into lp:udd with
> lp:~james-w/udd/storm as a prerequisite.
>
> Requested reviews: James Westby (james-w)
>
> For more details, see:
> https://code.launchpad.net/~james-w/udd/storm-unicode-fixes/+merge/113004
>
> Hi,
>
> This fixes the issues that Max found where we weren't using unicode
> for TEXT content, which meant the info was stored as blobs and
> problems happened.
>
> Thanks,
>
> James
>

This looks pretty good to me.

review: approve
John
=:->

review: Approve

Martin Packman (gz) wrote on 2012-07-02:

For background for this change see UDD list thread:
<https://lists.ubuntu.com/archives/ubuntu-distributed-devel/2012-April/001085.html>

What I'd like to see here is documentation of why SHAs and testaments are being stored as text, and whether it's just to avoid rewriting existing data. In my experience, it doesn't actually matter much whether hashes are stored as hex digest text or in their raw form.
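
As a small illustration of the hex-versus-raw point, the sketch below hashes a placeholder byte string (not real bzr testament data) and shows that the two forms carry the same information. It uses only the standard `hashlib` module and is written for Python 3.

```python
import hashlib

testament = b"example testament text"  # placeholder input, not real bzr data

raw = hashlib.sha1(testament).digest()          # 20 raw bytes, suits a BLOB
hex_text = hashlib.sha1(testament).hexdigest()  # 40 hex characters, suits TEXT

# The hex form converts back to the raw form without loss.
roundtrip = bytes.fromhex(hex_text)
```

The hex form costs twice the space but is plain ASCII, so it can live in a TEXT column without any encoding concerns.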

My main complaint is this branch and its prereq make it harder to tell whether a variable should be bytes or unicode, tending to just add casts in various places without documenting which methods expect what or sticking to one clear rule.

     def check(self, version, revid, suite, branch):
+        revid = unicode(revid)
         sha = self.get_testament_sha1(branch.repository, revid)

This is the kind of thing I mean. This change casts the argument to unicode before passing it back to the bzr layer, which expects bytes, so it relies on bzrlib casting it back automatically. Switching to unicode only at db insert and retrieve would be less confusing.
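
One way to picture "only switching at db insert and retrieve" is a pair of small conversion helpers used exclusively at the database boundary. This is a hedged sketch in Python 3: the helper names `to_db` and `from_db`, the table, and the revid value are all made up for illustration and are not part of UDD or bzrlib.

```python
import sqlite3

def to_db(value):
    """Convert a byte string to text just before it is written to the db."""
    return value.decode("utf-8") if isinstance(value, bytes) else value

def from_db(value):
    """Convert text read from the db back to the byte string callers use."""
    return value.encode("utf-8") if isinstance(value, str) else value

conn = sqlite3.connect(":memory:")
conn.execute("create table revids (version text, revid text)")

# The application keeps the revid as bytes everywhere outside the db layer.
revid = b"james@example.com-20120702-abc"  # made-up revid
conn.execute("insert into revids values (?, ?)", ("1.0-1", to_db(revid)))

stored_type, stored = conn.execute(
    "select typeof(revid), revid from revids").fetchone()
restored = from_db(stored)
```

With this arrangement the value is text only while it is inside the database, and code that talks to the version control layer never sees the converted form.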

=== modified file 'udd/tests/test_count_outstanding_jobs.py'

Test fix not related to unicode usage?

-import sqlite3
+from pysqlite2.dbapi2 import OperationalError

So UDD now depends on having external pysqlite2 package?

=== modified file 'udd/tests/test_mass_import.py'

Test fix not related to unicode usage?

review: Needs Information

Vincent Ladeuil (vila) wrote on 2012-07-02:

FAILURES_TABLE_CREATE = '''create table if not exists %s
(package text constraint nonull NOT NULL,
- reason blob,
+ reason text,

OLD_FAILURES_TABLE_CREATE = '''create table if not exists %s
(package text constraint nonull NOT NULL,
- reason blob,
+ reason text,

RETRY_TABLE_CREATE = '''create table if not exists %s
- (signature blob constraint nonull NOT NULL,
+ (signature text constraint nonull NOT NULL,

Don't the above changes require some db migration?

If not, can you explain why?

Same remark as gz regarding pysqlite2: is that a new dependency?

Also, why is this the only place you made this change when sqlite3 is used in
other places?

review: Needs Information

James Westby (james-w) wrote on 2012-07-02:

On Mon, 02 Jul 2012 11:23:18 -0000, Martin Packman <email address hidden> wrote:
> I'd like to see here is documentation for why SHAs and testaments are
> being stored as text, and whether it's just to avoid rewriting
> existing data. From experience it doesn't actually matter much storing
> hashes as hex digest text rather than their raw form.

Yeah, it's because the db has them as TEXT. I agree that BLOB would make
more sense, but wanted to avoid a data transition.

> My main complaint is this branch and its prereq make it harder to tell
> whether a variable should be bytes or unicode, tending to just add
> casts in various places without documenting which methods expect what
> or sticking to one clear rule.

I agree that the docs could be better. I did have a rule (which
admittedly may not be all that clear):

If it's text, require unicode in the APIs; if not, then do the
conversion in the db layer.

> def check(self, version, revid, suite, branch):
> + revid = unicode(revid)
> sha = self.get_testament_sha1(branch.repository, revid)
>
> This is the kind of thing I mean. This change casts the arg to
> unicode, but before passing back to the bzr layer which expects bytes,
> so relies on bzrlib autocasting back. Only switching to unicode at db
> insert and retrieve would be less confusing.

Yeah, I should do the cast later when it hits the db, rather than before
bzrlib gets it.

> === modified file 'udd/tests/test_count_outstanding_jobs.py'
>
> Test fix not related to unicode usage?
>
> -import sqlite3
> +from pysqlite2.dbapi2 import OperationalError
>
> So UDD now depends on having external pysqlite2 package?

That's the error that storm raises. I mistakenly assumed it was the
built in version.

I'll change it to import sqlite from storm, so that it uses whichever
one storm has found.

> === modified file 'udd/tests/test_mass_import.py'
>
> Test fix not related to unicode usage?

Yep, I assume the test was added after the storm branch was reverted,
but didn't cause a test conflict. I can move it in to the earlier branch
if you like?

Thanks,

James

James Westby (james-w) wrote on 2012-07-02:

On Mon, 02 Jul 2012 16:12:20 -0000, Vincent Ladeuil <email address hidden> wrote:
> Review: Needs Information
>
> FAILURES_TABLE_CREATE = '''create table if not exists %s
> (package text constraint nonull NOT NULL,
> - reason blob,
> + reason text,
>
> OLD_FAILURES_TABLE_CREATE = '''create table if not exists %s
> (package text constraint nonull NOT NULL,
> - reason blob,
> + reason text,
>
> RETRY_TABLE_CREATE = '''create table if not exists %s
> - (signature blob constraint nonull NOT NULL,
> + (signature text constraint nonull NOT NULL,
>
>
> Don't the above changes require some db migration?
>
> If not, can you explain why?

Because as I understand it they are only there to use up characters in
sqlite. Defining the columns with those types didn't stop the data being
stored in another type, and didn't stop me from assuming that they
were correct and not fixing the things in this branch first time around.

They are purely for documentation (hence changing them to match what is
stored) but as such a db migration isn't really needed.

It's probably the case that the dbs on production will keep the wrong
schema, but I don't care enough about that when the point is to delete
them.
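
The claim that the declared column type does not constrain what is stored can be checked directly. The sketch below is an illustration in Python 3 with the standard `sqlite3` module, using a cut-down stand-in for the should_retry table with the old `blob` declaration; it assumes an ordinary (non-STRICT) table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Old-style declaration: the column claims to hold a blob.
conn.execute("create table should_retry (signature blob)")

# Neither insert is rejected or converted to match the declared type.
conn.execute("insert into should_retry values (?)", ("ImportError:main",))
conn.execute("insert into should_retry values (?)", (b"raw bytes",))

# typeof() reports the storage class of each stored value, in insert order.
kinds = [row[0] for row in conn.execute(
    "select typeof(signature) from should_retry order by rowid")]
```

The first row comes back as `text` and the second as `blob`, so what matters is the Python type handed to the driver, not the word in the `create table` statement.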

> Also, why is it the only place you made this change when sqlite3 is used in
> other places ?

It's the only one where the test failed, because it's storm that is
generating the error.

Thanks,

James

lp:~james-w/udd/storm-unicode-fixes updated on 2012-07-02

601. By James Westby on 2012-07-02: Import sqlite from storm so we are using the same one, thanks Max.

Martin Packman (gz) wrote on 2012-07-02:

> Yeah, it's because the db has them as TEXT. I agree that BLOB would make
> more sense, but wanted to avoid a data transition.

Great, a comment to that effect by the table creation should do.

> I agree that the docs could be better, I did have a rule (which
> admittedly may not be all that clear):
>
> If it's text require unicode in the apis, if not then do it in the db
> layer.

That rule makes sense, but there are currently some exceptions to it.

> Yeah, I should do the cast later when it hits the db, rather than before
> bzrlib gets it.

Right. Having outstanding_marks deliberately use the db form is a source of some of the pain here.

> That's the error that storm raises. I mistakenly assumed it was the
> built in version.
>
> I'll change it to import sqlite from storm, so that it uses whichever
> one storm has found.

Yup, we just want the import fallback logic in one place, and then to use the bound name from there. As storm already has it, using that everywhere sounds good.
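
The general pattern being discussed looks roughly like the sketch below. It is an illustration only and does not reproduce storm's module: the fallback lives in one place, binds a single name (`sqlite` here), and every caller catches exceptions through that name. The helper `table_is_missing` and the `jobs` table are hypothetical.

```python
# One module owns the fallback; everything else uses the bound name.
try:
    from pysqlite2 import dbapi2 as sqlite  # external package, if installed
except ImportError:
    from sqlite3 import dbapi2 as sqlite    # standard library fallback

def table_is_missing(conn, table):
    """Return True if selecting from `table` raises OperationalError."""
    try:
        conn.execute("select 1 from %s" % table)
    except sqlite.OperationalError:
        # Caught via the shared name, so it matches whichever
        # implementation was actually imported above.
        return True
    return False

conn = sqlite.connect(":memory:")
missing_before = table_is_missing(conn, "jobs")
conn.execute("create table jobs (id integer)")
missing_after = table_is_missing(conn, "jobs")
```

Because the exception class is always looked up on the shared name, a test never ends up expecting `sqlite3.OperationalError` while the code under test raises the pysqlite2 one.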

> Yep, I assume the test was added after the storm branch was reverted,
> but didn't cause a test conflict. I can move it in to the earlier branch
> if you like?

No need, just checking that was the case.

review: Approve

James Westby (james-w) wrote on 2012-07-02:

On Mon, 02 Jul 2012 17:07:23 -0000, Martin Packman <email address hidden> wrote:
> Great, a comment to that effect by the table creation should do.

Done.

> That rule makes sense, but there are currently some exceptions to it.

If you can point them out I'll see about fixing them.

> > Yeah, I should do the cast later when it hits the db, rather than before
> > bzrlib gets it.
>
> Right. Having outstanding_marks deliberately use the db form is a source of some of the pain here.

Yeah, it does make some things a bit odd. Running on postgres will allow
us to delete that code though, because it was a way to avoid sqlite
locking.

> Yup, we just want the import fallback logic in one place then to use the bound name from there. As storm has it, using that everywhere sounds good.

Done.

> > Yep, I assume the test was added after the storm branch was reverted,
> > but didn't cause a test conflict. I can move it in to the earlier branch
> > if you like?
>
> No need, just checking that was the case.

Ok.

Thanks,

James

lp:~james-w/udd/storm-unicode-fixes updated on 2012-07-03

602. By James Westby on 2012-07-02: Document some of the db methods.
603. By James Westby on 2012-07-02: Add a comment as to why revid and testament are TEXT. Thanks Martin.
604. By James Westby on 2012-07-03: Merged storm into storm-unicode-fixes.
605. By James Westby on 2012-07-03: Merged storm into storm-unicode-fixes.

Preview Diff


 === modified file 'udd/icommon.py'
 --- udd/icommon.py	2012-07-03 17:10:24 +0000
 +++ udd/icommon.py	2012-07-03 17:10:24 +0000
@@ -42,7 +42,7 @@
  # below. Some way to share the config is needed -- vila 2011-12-07
  conf = iconfig.ImporterStack()
--running_sentinel = "Apparently the supervisor died\n"
++running_sentinel = u"Apparently the supervisor died\n"
  no_lock_returncode = 139
@@ -453,7 +453,7 @@
      FAILURES_TABLE = "failures"
      FAILURES_TABLE_CREATE = '''create table if not exists %s
                       (package text constraint nonull NOT NULL,
--                      reason blob,
++                      reason text,
                        when_failed timestamp,
                        emailed integer default 0,
                        constraint isprimary PRIMARY KEY
@@ -464,7 +464,7 @@
      OLD_FAILURES_TABLE = "old_failures"
      OLD_FAILURES_TABLE_CREATE = '''create table if not exists %s
                       (package text constraint nonull NOT NULL,
--                      reason blob,
++                      reason text,
                        when_failed timestamp,
                        last_failed timestamp,
                        failure_count integer,
@@ -499,7 +499,7 @@
      RETRY_TABLE = "should_retry"
      RETRY_TABLE_CREATE = '''create table if not exists %s
--                    (signature blob constraint nonull NOT NULL,
++                    (signature text constraint nonull NOT NULL,
                       constraint isprimary PRIMARY KEY
                              (signature))''' % RETRY_TABLE
      RETRY_TABLE_SELECT = '''select * from %s where signature=?''' % RETRY_TABLE
@@ -545,6 +545,7 @@
                  self._add_job(c, row[0], self.JOB_TYPE_PRIORITY)
      def add_jobs_for_interrupted(self):
++        """Add all interrupted jobs to the queue."""
          with Cursor(self.conn) as c:
              rows = c.execute('select * from %s where reason=?'
                      % self.FAILURES_TABLE, (running_sentinel,))
@@ -578,10 +579,31 @@
          return ret
      def start_package(self, package):
++        """Note that the import of `package` is being started.
++
++        :param package: the package name that is being imported
++        :type package: unicode
++        :return: the id assigned to the job.
++        :rtype: int
++        """
          with Cursor(self.conn) as c:
              return self._start_package(c, package)
      def finish_job(self, package, job_id, success, output, when=None):
++        """Note that the import of `package` has finished
++
++        :param package: the package name that was imported
++        :type package: unicode
++        :param job_id: the id of the job
++        :type job_id: int
++        :param success: whether the job was successful
++        :type success: boolean
++        :param output: the output from the job
++        :type output: unicode
++        :param when: the datetime at which the job failed. The default
++            is None which means that the current time should be used.
++        :type when: datetime
++        """
          if when is None:
              when = datetime.utcnow()
          with Cursor(self.conn) as c:
@@ -606,11 +628,25 @@
                      datetime.utcnow(), None, None))
      def add_jobs(self, packages, job_type):
++        """Add a set of packages to the queue.
++
++        :param packages: the package names to be added
++        :type packages: list(unicode)
++        :param job_type: the type of the job, one of JOB_TYPE_NEW,
++            JOB_TYPE_RETRY, JOB_TYPE_PRIORITY.
++        :type job_type: int
++        """
          with Cursor(self.conn) as c:
              for package in packages:
                  self._add_job(c, package, job_type)
      def last_import_time(self):
++        """Get the time that the last Launchpad import was done.
++
++        :return: the datetime at which the Launchpad import last
++            ran successfully.
++        :rtype: datetime
++        """
          with Cursor(self.conn) as c:
              row = c.execute('select * from %s order by import desc'
                      % self.IMPORT_TABLE).get_one()
@@ -619,6 +655,17 @@
              return row
      def add_import_jobs(self, packages, newest_published):
++        """Add a set of jobs from the Launchpad import.
++
++        The time reported as `newest_published` will also be stored
++        as the last time the import successfully ran.
++
++        :param packages: the list of packages to import
++        :type packages: list(unicode)
++        :param newest_published: the last publishing time seen for
++            any of those packages.
++        :type newest_published: datetime
++        """
          with Cursor(self.conn) as c:
              for package in packages:
                  self._add_job(c, package, self.JOB_TYPE_NEW)
@@ -626,6 +673,12 @@
                      (newest_published,))
      def next_job(self):
++        """Get the next job to run.
++
++        :return: the next job to run, as a tuple of (job_id, package_name).
++            If either is None then there is no job to run currently.
++        :rtype: (Maybe int, Maybe unicode)
++        """
          with Cursor(self.conn) as c:
              job_id = 0
              package = None
@@ -648,6 +701,13 @@
              return (job_id, package)
      def failure_reason(self, package):
++        """Get the reason (signature) why a package failed.
++
++        :param package: the package to get the reason for
++        :type package: unicode
++        :return: the failure reason, or None if the package hasn't failed.
++        :rtype: unicode
++        """
          with Cursor(self.conn) as c:
              row = c.execute(self.FAILURES_TABLE_FIND, (package,)).get_one()
              if row is None:
@@ -655,6 +715,13 @@
              return row[1]
      def failure_signature(self, raw_reason):
++        """Construct a failure signature from the output of a run.
++
++        :param raw_reason: the output of the job that failed
++        :type raw_reason: str
++        :return: the failure signature
++        :rtype: unicode
++        """
          trace = raw_reason.splitlines()
          if len(trace) == 1:
              if trace[0] == running_sentinel[:-1]: # Get rid of the final '\n'
@@ -662,24 +729,24 @@
              # sometimes, Python exceptions do not have file references
              m = re.match('(\w+): ', trace[0])
              if m:
--                return m.group(1)
++                return unicode(m.group(1))
              else:
--                return trace[0].strip()
++                return unicode(trace[0].strip())
          elif len(trace) < 3:
--            return " ".join(trace).strip()
++            return u" ".join(trace).strip()
          # If the failure reason is a traceback (which should always be the
          # case, the running_sentinel check above taking care of the still
          # running imports), we build the traceback signature and capture the
          # exception type
--        sig = ''
--        exc_type = ''
++        sig = u''
++        exc_type = u''
          exc_type_coming = False
          file_line_seen = False
          for l in trace:
              if l.startswith('  File'):
                  # Keep the method/function name
--                sig += ':' + l.split()[-1]
++                sig += u':' + unicode(l.split()[-1])
                  file_line_seen = True
              elif file_line_seen:
                  # We've seen the 'File...' line, so we are now seeing the code
@@ -704,11 +771,52 @@
          return c.execute(self.RETRY_TABLE_SELECT, (sig,)).get_one()
      def known_auto_retry(self, sig):
--        with Cursor(self.conn) as c:
--            return self._known_auto_retry(c, sig)
++        """Is the signature marked for auto-retry?
++
++        :param sig: the signature to check
++        :type sig: unicode
++        :return: whether the failure should be auto-retried.
++        :rtype: boolean
++        """
++        with Cursor(self.conn) as c:
++            return self._known_auto_retry(c, sig) is not None
++
++    def _add_known_auto_retry(self, c, sig):
++        row = self._known_auto_retry(c, sig)
++        if not row:
++            c.execute(self.RETRY_TABLE_INSERT, (sig,))
++
++    def add_known_auto_retry(self, sig):
++        """Mark a signature as being one that should be automatically retried.
++
++        :param sig: the signature to check
++        :type sig: unicode
++        """
++        with Cursor(self.conn) as c:
++            return self._add_known_auto_retry(c, sig)
      def retry(self, package, force=False, priority=False, auto=False,
              all=False):
++        """Retry a package.
++
++        :param package: the package to retry
++        :type package: unicode
++        :param force: whether to retry even if the package isn't marked as
++            failed (default is False).
++        :type force: boolean
++        :param priority: whether the job should be retried with high
++            priority (default is False).
++        :type priority: boolean
++        :param auto: whether failures with that signature should
++            be retried automatically in future (default is False).
++        :type auto: boolean
++        :param all: whether to retry all current failures with that
++            signature (default is False).
++        :type all: boolean
++        :return: whether the job was actually retried (see force for
++            why it might not be)
++        :rtype: boolean
++        """
          with Cursor(self.conn) as c:
              row = c.execute(self.FAILURES_TABLE_FIND, (package,)).get_one()
              if row is None:
@@ -724,9 +832,7 @@
                  sig = self.failure_signature(raw_reason)
                  self._retry(c, package, sig, row[2], priority=priority)
                  if auto and sig != None:
--                    row = self._known_auto_retry(c, sig)
--                    if row is None:
--                        c.execute(self.RETRY_TABLE_INSERT, (sig,))
++                    self._add_known_auto_retry(c, sig)
                  if all and sig != None:
                      rows = c.execute('select * from %s'
                              % self.FAILURES_TABLE)
@@ -765,7 +871,7 @@
      def _attempt_retry(self, c, info):
          row = self._known_auto_retry(c, info.signature)
--        if row is None:
++        if not row:
              # A failure that is not auto-retried
              return False
          info.auto_retry = True
@@ -954,6 +1060,9 @@
  class RevidDatabase(object):
++    # revid and testament are stored as TEXT when BLOB might make more sense
++    # for these bytestrings, because that's what the original code
++    # stored them as, and keeping that avoids a data migration.
      REVID_TABLE = "revids"
      REVID_TABLE_CREATE = '''create table if not exists %s
                       (package text constraint nonull NOT NULL,
@@ -1005,6 +1114,15 @@
                      % self.REVID_TABLE, (self.package,)))
      def is_marked(self, version, suite):
++        """Whether that version is marked in the suite.
++
++        :param version: the version to check
++        :type version: changelog.Version
++        :param suite: suite to check
++        :type suite: unicode
++        :return: Whether a mark is recorded for that pair.
++        :rtype: boolean
++        """
          with Cursor(self.conn) as c:
              rows = list(c.execute(self.REVID_TABLE_FIND % self.REVID_TABLE,
                      (unicode(version), self.package, suite)))
@@ -1018,6 +1136,14 @@
              return False
      def last_marked_version(self, suite):
++        """Get the last version to be marked.
++
++        :param suite: suite to check
++        :type suite: unicode
++        :return: the last version to be marked, or None if there are no marked
++            versions in the suite.
++        :rtype: Maybe str
++        """
          with Cursor(self.conn) as c:
              versions = [a[0] for a in c.execute(self.REVID_TABLE_FIND_SUITE
                      % self.REVID_WORKING_TABLE,
@@ -1033,6 +1159,19 @@
              return None
      def check(self, version, revid, suite, branch):
++        """Check whether the (version, revid, suite) for a branch matches the db.
++
++        :param version: the version to check
++        :type version: changelog.Version
++        :param revid: the revid to check
++        :type revid: str
++        :param suite: suite to check
++        :type suite: unicode
++        :param branch: the branch to check
++        :type branch: bzrlib.branch.Branch
++        :return: whether the info matches the db
++        :rtype: boolean
++        """
          sha = self.get_testament_sha1(branch.repository, revid)
          with Cursor(self.conn) as c:
              rows = list(c.execute(self.REVID_TABLE_FIND % self.REVID_TABLE,
@@ -1040,19 +1179,28 @@
              rows += list(c.execute(self.REVID_TABLE_FIND % self.REVID_WORKING_TABLE,
                      (unicode(version), self.package, suite)))
              if (version, suite) in self.outstanding_marks:
--                revid, tment = self.outstanding_marks[(unicode(version), suite)]
--                rows += (None, None, None, revid, tment)
++                rev, tment = self.outstanding_marks[(unicode(version), suite)]
++                rows += (None, None, None, rev, tment)
              if len(rows) < 1:
                  return False
              assert len(rows) < 2, "Multiple versions for a package/suite?"
              row = rows[0]
--            if (revid, sha) != (row[3], row[4]):
++            if (unicode(revid), unicode(sha)) != (row[3], row[4]):
                  assert False, ("%s != %s for %s %s in %s, something has changed"
                       % (unicode((revid, sha)), unicode((row[3], row[4])),
                           self.package, unicode(version), suite))
              return True
      def get_testament_sha1(self, repo, revid):
++        """Get the sha1 of the testament for revid in repo.
++
++        :param repo: the repository to consult
++        :type repo: bzrlib.repository.Repository
++        :param revid: the revid to get the testament for
++        :type revid: str
++        :return: the sha1 of the testament
++        :rtype: unicode
++        """
          op = cleanup.OperationWithCleanups(self._get_testament_sha1)
          return op.run(repo, revid)
@@ -1078,6 +1226,12 @@
              self._commit(c)
      def cleanup_last_run(self, cleanup_cb, push_cb):
++        """Cleanup any leftovers from the last run.
++
++        :param cleanup_cb: a callable to call when the cleanup is done.
++        :param push_cb: a callable to call if the db indicates that there
++            was work-in-progress when the last run terminated.
++        """
          with Cursor(self.conn) as c:
              rows = list(c.execute(self.REVID_TABLE_FIND_PACKAGE
                      % self.REVID_WORKING_TABLE, (self.package,)))
@@ -1094,8 +1248,19 @@
              c.execute(self.DELETE_WORKING, (self.package,))
      def mark(self, version, revid, suite, branch):
++        """Mark a (version, revision, suite) for a branch.
++
++        :param version: the version to mark.
++        :type version: changelog.Version
++        :param revid: the revision id
++        :type revid: str
++        :param suite: the suite to mark
++        :type suite: unicode
++        :param branch: the branch to mark
++        :type branch: bzrlib.branch.Branch
++        """
          sha = self.get_testament_sha1(branch.repository, revid)
--        self.outstanding_marks[(unicode(version), suite)] = (revid, sha)
++        self.outstanding_marks[(unicode(version), suite)] = (unicode(revid), unicode(sha))
      def commit_outstanding(self):
          with Cursor(self.conn) as c:
@@ -1116,7 +1281,7 @@
              rows = list(c.execute("select * from %s where package=?"
                      % self.SUFFIX_TABLE, (self.package,)))
              if len(rows) < 1:
--                return ""
++                return u""
              return rows[0][1]
 === modified file 'udd/scripts/mass_import.py'
 --- udd/scripts/mass_import.py	2012-07-03 17:10:24 +0000
 +++ udd/scripts/mass_import.py	2012-07-03 17:10:24 +0000
@@ -161,6 +161,9 @@
              # We mostly care about self.err which contains the traceback here
              output = self.out + self.err
              unicode_output = output.decode("utf-8", "replace")
++            # Encode to ascii with 'replace' as that's what the failure
++            # signature code expects to get, even though we now deal
++            # with the value as a unicode string throughout.
              ascii_output = unicode_output.encode("ascii", "replace")
              self.success = self.retcode == 0
              if self.success:
@@ -172,7 +175,8 @@
                      "Importing %s failed:\n%s" % (
                          self.package_name, ascii_output))
              self.status_db.finish_job(
--                self.package_name, self.job_id, self.success, ascii_output)
++                self.package_name, self.job_id, self.success,
++                unicode(ascii_output))
              if not self.success:
                  reason = self.status_db.failure_reason(self.package_name)
                  self.failure_sig = self.status_db.failure_signature(reason)
@@ -334,7 +338,7 @@
              # We can't blame launchpad downtimes for all failures, we rely on
              # the ones declared in RETRY_TABLE (created when
              # 'requeue_package.py --auto' is used)
--            if self.status_db.known_auto_retry(imp.failure_sig) is not None:
++            if self.status_db.known_auto_retry(imp.failure_sig):
                  self.see_failure()
                  # We want to retry asap ('priority')
                  self.status_db.retry(package_name, priority=True)
 === modified file 'udd/tests/test_commit_database.py'
 --- udd/tests/test_commit_database.py	2012-07-03 17:10:24 +0000
 +++ udd/tests/test_commit_database.py	2012-07-03 17:10:24 +0000
@@ -8,7 +8,7 @@
      def setUp(self):
          super(TestCommitDb, self).setUp()
          db_conn, db_type = idb.get_memory_db_connection()
--        self.db = icommon.CommitDatabase(db_conn, db_type, 'foo')
++        self.db = icommon.CommitDatabase(db_conn, db_type, u'foo')
      def test_has_commit(self):
          self.assertEquals(False, self.db.has_commit_started())
 === modified file 'udd/tests/test_revid_database.py'
 --- udd/tests/test_revid_database.py	2012-07-03 17:10:24 +0000
 +++ udd/tests/test_revid_database.py	2012-07-03 17:10:24 +0000
@@ -28,7 +28,7 @@
      def get_db(self):
          db_conn, db_type = idb.get_memory_db_connection()
--        return icommon.RevidDatabase(db_conn, db_type, "pkg")
++        return icommon.RevidDatabase(db_conn, db_type, u"pkg")
      def check_rows(self, memory, working, saved):
          self.assertEqual(memory, len(self.db._memory_rows()))
 === modified file 'udd/tests/test_status_database.py'
 --- udd/tests/test_status_database.py	2012-07-03 17:10:24 +0000
 +++ udd/tests/test_status_database.py	2012-07-03 17:10:24 +0000
@@ -112,13 +112,20 @@
  class TestRetry(TestStatusDb):
      def test_known_auto_retry_empty(self):
--        self.assertEquals(None, self.db.known_auto_retry('whatever'))
++        self.assertEquals(False, self.db.known_auto_retry('whatever'))
++
++    def test_known_auto_retry_known(self):
++        sig = u'whatever'
++        self.db.add_known_auto_retry(sig)
++        self.assertEquals(True, self.db.known_auto_retry(sig))
++
++    def test_known_auto_retry_unknown(self):
++        self.db.add_known_auto_retry(u'something')
++        self.assertEquals(False, self.db.known_auto_retry(u'something else'))
  # FIXME: Need more tests for the following but they are too hard to test for
  # now -- vila 20111020
--# _known_auto_retry(self, c, sig):
--# known_auto_retry(self, sig):
  # retry(self, package, force=False, priority=False, auto=False,
  # _retry(self, c, package, signature, timestamp, priority=False):
  # _attempt_retry(self, c, info):
@@ -127,7 +134,10 @@
  class TestSignatures(TestStatusDb):
      def assertSignature(self, expected, raw):
--        self.assertEquals(expected, self.db.failure_signature(raw))
++        failure_signature = self.db.failure_signature(raw)
++        self.assertEquals(expected, failure_signature)
++        if failure_signature is not None:
++            self.assertIsInstance(failure_signature, unicode)
      def test_running_signature(self):
          # Running imports use a special failure signature
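
The decode-then-encode step in the mass_import.py hunk can be tried in isolation. The sketch below is a Python 3 rendering of the same idea, with a hypothetical helper name `ascii_text`: job output arrives as bytes, is decoded as UTF-8 with `replace`, squashed to ASCII with `replace`, and handed on as text.

```python
def ascii_text(raw_output):
    """Decode job output as UTF-8, then reduce it to ASCII-only text."""
    # Invalid UTF-8 sequences become U+FFFD instead of raising.
    text = raw_output.decode("utf-8", "replace")
    # Anything outside ASCII (including U+FFFD) becomes "?".
    ascii_bytes = text.encode("ascii", "replace")
    # The result is pure ASCII, so this final decode cannot fail.
    return ascii_bytes.decode("ascii")

clean = ascii_text(b"ImportError: no module\n")
accented = ascii_text("caf\u00e9".encode("utf-8"))
broken = ascii_text(b"bad byte \xff here")
```

Plain ASCII output passes through unchanged, the accented input becomes `caf?`, and the invalid byte becomes a single `?`, so the value stored in the database is always text and never raises on odd subprocess output.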
