git ubuntu import: use a dbm cache to store importer progress
With the recent changes to the importer algorithm, we can no longer
identify a Launchpad publishing event by the commit information in the
repository -- a given SourcePackageRelease (source package name and
version) is only imported once (presuming all future publishes of the
same name and version match exactly). We rely on that identification in
launchpad_versions_published_after, which iterates the Launchpad data
backwards until we either match the commit data for a branch head, or
see Launchpad data from before one of the branch heads.
Change the importer code to take a --db-cache argument naming a
directory that contains one DBM cache each for debian and ubuntu
(separate caches are needed because DBM databases are single-level,
string-indexed string stores). Look up the source package name in the
relevant cache before we look up Launchpad data, to obtain the last
SPPHR used. Store the latest processed SPPHR after iterating the
Launchpad data.
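
The lookup-then-store flow described above can be sketched with Python's
standard dbm module. This is a minimal illustration, not the importer's
actual code: the function names, the one-file-per-distribution layout
inside the cache directory, and the use of a plain string to identify an
SPPHR are all assumptions.

```python
import dbm
import os


def open_cache(cache_dir, distribution):
    """Open (creating if needed) the DBM file for one distribution.

    Hypothetical layout: one DBM file per distribution, because a DBM maps
    a single string key to a single string value and so has no room for a
    second level of keys.
    """
    os.makedirs(cache_dir, exist_ok=True)
    return dbm.open(os.path.join(cache_dir, distribution), 'c')


def last_processed_spphr(cache_dir, distribution, source_package):
    """Return the cached SPPHR identifier, or None if none was recorded."""
    key = source_package.encode('utf-8')
    with open_cache(cache_dir, distribution) as db:
        value = db[key] if key in db else None
    return None if value is None else value.decode('utf-8')


def record_processed_spphr(cache_dir, distribution, source_package, spphr):
    """Remember the latest SPPHR processed for this source package."""
    with open_cache(cache_dir, distribution) as db:
        # DBM keys and values are stored as bytes.
        db[source_package.encode('utf-8')] = spphr.encode('utf-8')
```

A caller would read the cached value before querying Launchpad and write
the new value only after the Launchpad data has been fully iterated, so
an interrupted run does not record progress it did not make.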
Also update the scripts to support passing a persistent, consistent
cache location to the import.
Move the core functionality into _devel_branch_updates, making
update_devel_branches a thin wrapper to it. _devel_branch_updates now
has no dependencies so should be easier to test.
Rename namespace and applied_prefix to ref_prefix. _devel_branch_updates
can be simplified by collapsing namespace and applied_prefix into a
single concept, ref_prefix.
Move printing to the wrapper function. The inner function should do
computation only and leave it to the caller to report warnings etc. This
prevents noise when under test.
_devel_branch_updates may return None in place of a commit hash, in
which case the intention is that the caller will skip setting that ref
but will still be able to note which refs are not being set. This is
useful to maintain reporting.
The debug and warning messages are now slightly changed since less
information is available one level up the stack. It should still be
sufficient for debugging or warning purposes.
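
The split between a pure inner function and a reporting wrapper can be
sketched as follows. The signatures, the branch_heads mapping and the
set_ref callback are hypothetical and chosen only to illustrate the
shape; the real functions may take different arguments.

```python
import logging


def devel_branch_updates(ref_prefix, branch_heads):
    """Compute the desired refs without printing or touching a repository.

    branch_heads is an assumed mapping of branch name to a commit hash,
    or to None when no suitable commit exists. Returns a sorted list of
    (full_ref_name, commit_hash_or_None) pairs.
    """
    return [
        (ref_prefix + name, commit)
        for name, commit in sorted(branch_heads.items())
    ]


def update_devel_branches(ref_prefix, branch_heads, set_ref):
    """Thin wrapper: apply the computed updates and do all reporting.

    set_ref is an assumed callback taking (ref_name, commit_hash).
    Returns the refs that were not set, so the caller can report them.
    """
    skipped = []
    for ref, commit in devel_branch_updates(ref_prefix, branch_heads):
        if commit is None:
            # A None hash means "do not set this ref", but still report it.
            logging.warning('Not setting %s: no commit available', ref)
            skipped.append(ref)
            continue
        logging.debug('Setting %s to %s', ref, commit)
        set_ref(ref, commit)
    return skipped
```

Because the inner function takes only plain data and returns plain data,
a test can assert on its return value directly without a repository or
captured log output.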
After discussion with Robie on IRC, we decided that 0f3c943054ab
("import: drop publishing parent functionality") and 5aa33fa08078 ("Also
reset devel heads") introduced a regression in the semantics of the
devel pointers.
Before those changes, the devel pointers were merged up so as to be
fast-forwarding, as we imported publication entries, if the publication
entry was newer than the current devel pointer. In other words, the
devel pointers were part of the commit graph itself.
After those changes, the devel pointers are more like symbolic
references, describing meta-state about the commit graph, rather than
integral to the graph itself:
- A given series devel branch, after a successful import, points to the
  latest publication record in a given series.
- The ubuntu/devel branch, after a successful import, points to the
  latest series devel branch.
Given these 'rules', we can stop updating the devel pointers in the main
import loop and just do so after we are done importing. All series
branch pointers are updated every time, which is not strictly necessary
but will generally be a no-op. This means the method does not need to
know whether we are reimporting.
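
As an illustration of the two rules, the pointers can be derived in one
pass over the publication data once importing has finished. This is a
sketch under assumptions: the tuple layout, the series_order argument,
the ref names, and the reading of "latest series" as the newest series
that has at least one publication are not taken from the importer.

```python
def devel_pointers(publications, series_order):
    """Compute devel pointers from already-imported publication records.

    publications: iterable of (series, date_published, commit) tuples,
        where date_published is any comparable value (assumed layout).
    series_order: series names, oldest first (assumed input).
    Returns a dict mapping branch name to commit hash.
    """
    latest = {}
    for series, published, commit in publications:
        # Rule 1: a series devel branch points to the latest publication
        # record in that series.
        if series not in latest or published > latest[series][0]:
            latest[series] = (published, commit)

    pointers = {
        'ubuntu/%s-devel' % series: commit
        for series, (_, commit) in latest.items()
    }

    # Rule 2: ubuntu/devel points to the latest series devel branch,
    # taken here to be the newest series with any publication.
    for series in reversed(series_order):
        if series in latest:
            pointers['ubuntu/devel'] = latest[series][1]
            break
    return pointers
```

Since every pointer is recomputed from scratch, running this after a
reimport or after an incremental import gives the same result, and
unchanged pointers are simply set to the value they already have.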