~nacc/git-ubuntu:lp1730734-cache-importer-progress

Last commit made on 2017-11-16
Get this branch:
git clone -b lp1730734-cache-importer-progress https://git.launchpad.net/~nacc/git-ubuntu

Branch information

Name:
lp1730734-cache-importer-progress
Repository:
lp:~nacc/git-ubuntu

Recent commits

a9960c8... by Nish Aravamudan

git ubuntu import: use a dbm cache to store importer progress

With the recent changes to the importer algorithm, we can no longer
identify a Launchpad publishing event by the commit information in the
repository -- a given SourcePackageRelease (source package name and
version) is only imported once (presuming all future publishes of the
same name and version match exactly). We rely on this in
launchpad_versions_published_after, which iterates the Launchpad data
backwards until we either match the commit data for a branch head, or
see Launchpad data from before one of the branch heads.

Change the importer code to take a --db-cache argument naming a
directory that contains one DBM cache each for debian and ubuntu
(separate caches are needed because DBM databases are single-level,
string-indexed string stores). Look up the source package name in the
relevant cache before we look up Launchpad data, to obtain the last
SPPHR used. Store the latest processed SPPHR after iterating the
Launchpad data.
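The lookup-then-store flow described above could be sketched roughly as follows, using Python's standard dbm module. This is not the importer's actual code: the function names, the one-file-per-distribution layout inside the cache directory, and the use of a plain string identifier for an SPPHR are all assumptions made for illustration.

```python
import dbm
import os


def _open_cache(cache_dir, dist_name):
    """Open (creating if needed) the DBM file for one distribution.

    Hypothetical layout: <cache_dir>/<dist_name>, e.g. cache/ubuntu.
    """
    os.makedirs(cache_dir, exist_ok=True)
    return dbm.open(os.path.join(cache_dir, dist_name), 'c')


def read_last_spphr(cache_dir, dist_name, pkgname):
    """Return the last processed SPPHR identifier for pkgname, or None."""
    with _open_cache(cache_dir, dist_name) as cache:
        # DBM stores flat bytes -> bytes mappings, so encode the key.
        value = cache.get(pkgname.encode())
    return value.decode() if value is not None else None


def write_last_spphr(cache_dir, dist_name, pkgname, spphr_id):
    """Record spphr_id as the latest processed SPPHR for pkgname."""
    with _open_cache(cache_dir, dist_name) as cache:
        cache[pkgname.encode()] = spphr_id.encode()


def unprocessed_records(records_newest_first, last_seen):
    """Return the records published after last_seen, oldest first.

    Walks the newest-first sequence backwards in time and stops at the
    cached identifier. If last_seen is None or never found, every
    record is returned.
    """
    pending = []
    for record_id in records_newest_first:
        if record_id == last_seen:
            break
        pending.append(record_id)
    pending.reverse()
    return pending
```

In this sketch the caller would read the cached identifier, process whatever unprocessed_records returns, and then write the newest identifier back. Keeping one DBM file per distribution avoids having to encode the distribution into the key.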

Also update the scripts to support passing a persistent/consistent value
to the import for the cache.

LP: #1730734

b1551b1... by Robie Basak

Initial tests for _devel_branch_updates

8a605fe... by Robie Basak

Adjust head_versions structure

Let's simplify this a little. Instead of containing a pygit2.Reference,
resolve it to a commit hash string first to simplify testing.

While I'm there, as we only need two elements, just use a two-tuple for
the dictionary values.
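As an illustration of the simplified structure, the sketch below assumes each head_versions value is a (version, commit hash) two-tuple keyed by branch name. The commit message does not say which two elements are kept, so the field choice, the branch names and the helper name are hypothetical.

```python
def make_head_versions(heads):
    """Build a head_versions-style dict from (branch, version, hash) triples.

    Hypothetical helper: the (version, commit_hash) value layout is an
    assumption, not taken from the importer source.
    """
    return {
        branch: (version, commit_hash)
        for branch, version, commit_hash in heads
    }


# Because the values are plain strings, a test can compare against a
# literal without constructing any pygit2 objects.
example = make_head_versions([
    ('ubuntu/xenial', '1.0-1ubuntu1', 'a' * 40),
    ('ubuntu/zesty', '1.2-1', 'b' * 40),
])
version, commit_hash = example['ubuntu/zesty']
```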

f060cc4... by Robie Basak

Factor out and rework _devel_branch_updates

Move the core functionality into _devel_branch_updates, making
update_devel_branches a thin wrapper to it. _devel_branch_updates now
has no dependencies so should be easier to test.

Rename namespace, applied_prefix to ref_prefix. _devel_branch_updates
can be simplified by collapsing namespace and applied_prefix into a
single concept ref_prefix.

Move printing to wrapper function. Really the inner function should do
computation only and leave it to the caller to report warnings etc. This
will prevent noise when under test.

_devel_branch_updates returns None for commit hashes, in which case the
intention is that the caller will suppress setting those refs but will
be able to note which refs are not being set. This is useful to maintain
reporting.

The debug and warning messages are now slightly changed since less
information is available one level up the stack. It should still be
sufficient for debugging or warning purposes.
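The split between a computation-only inner function and a reporting wrapper might look something like the sketch below. The function names, the argument shapes and the "<series>-devel" ref naming are assumptions for illustration; only the idea of returning None for a commit hash and leaving the reporting to the caller comes from the commit message.

```python
import logging


def compute_devel_updates(ref_prefix, latest_by_series):
    """Pure computation: map ref names to commit hashes.

    latest_by_series is a hypothetical dict of series name -> commit
    hash string, or None when there is nothing to point the ref at.
    """
    return {
        '%s%s-devel' % (ref_prefix, series): commit_hash
        for series, commit_hash in latest_by_series.items()
    }


def apply_devel_updates(updates, set_ref):
    """Thin wrapper: perform the side effects and the reporting.

    set_ref is any callable taking (ref_name, commit_hash). Refs whose
    hash is None are not set, but are still reported and returned.
    """
    skipped = []
    for ref, commit_hash in sorted(updates.items()):
        if commit_hash is None:
            logging.warning('Not setting %s: no commit available', ref)
            skipped.append(ref)
            continue
        logging.debug('Setting %s to %s', ref, commit_hash)
        set_ref(ref, commit_hash)
    return skipped
```

A test can call compute_devel_updates directly and compare the returned dict with a literal, with no repository and no log output involved.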

8af442a... by Nish Aravamudan

importer: rework and move devel pointer moving

After discussion with Robie on IRC, we decided that 0f3c943054ab
("import: drop publishing parent functionality") and 5aa33fa08078 ("Also
reset devel heads") introduced a regression in the semantics of the
devel pointers.

Before those changes, the devel pointers were merged up so as to be
fast-forwarding, as we imported publication entries, if the publication
entry was newer than the current devel pointer. In other words, the
devel pointers were part of the commit graph itself.

After those changes, the devel pointers are more like symbolic
references, describing meta-state about the commit graph, rather than
integral to the graph itself:

A given series devel branch, after a successful import, points to the
latest publication record in a given series.

The ubuntu/devel branch, after a successful import, points to the latest
series devel branch.

Given these 'rules', we can stop updating the devel pointers in the main
import loop and just do so after we are done importing. All series
branch pointers are updated, which is unnecessary but will generally be
a no-op. This means the method need not know whether we are reimporting.
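The two rules above can be expressed as a single pass run after the import has finished. The sketch below is only an illustration: the input shapes, the integer publish ordering and the "ubuntu/<series>-devel" names are assumptions, and "latest series" is taken to mean the newest series that has any publication.

```python
def devel_pointers(publications, series_newest_first):
    """Compute devel pointers from publication records.

    publications: iterable of (series, publish_order, commit_hash),
        where a larger publish_order means a later publication.
    series_newest_first: series names, newest release first.
    Both argument shapes are hypothetical.
    """
    # Rule 1: each series devel branch points at the latest
    # publication record in that series.
    latest = {}
    for series, order, commit_hash in publications:
        if series not in latest or order > latest[series][0]:
            latest[series] = (order, commit_hash)
    pointers = {
        'ubuntu/%s-devel' % series: commit_hash
        for series, (_, commit_hash) in latest.items()
    }
    # Rule 2: ubuntu/devel points at the latest series devel branch.
    for series in series_newest_first:
        if series in latest:
            pointers['ubuntu/devel'] = latest[series][1]
            break
    return pointers
```

Since every pointer is recomputed from the full set of records, running this after a reimport or after an incremental import gives the same answer, which matches the observation that updating all series pointers is usually a no-op.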

LP: #1730655

Fixes: 0f3c943054ab ("import: drop publishing parent functionality")
Fixes: 5aa33fa08078 ("Also reset devel heads")

ab12071... by Nish Aravamudan

source_information: add API to obtain all series objects/names

f3ff76d... by Nish Aravamudan

source_information: cleanup formatting per style

2b8c190... by Nish Aravamudan

source_information: fix incorrect API comments

af30fd0... by Nish Aravamudan

source-package-walker.py: add script to linearly walk all source packages

It uses the same blacklist, whitelist and phasing that the other scripts
do.

8b3bbfb... by Nish Aravamudan

update-repository-alias: add script to update the default repository for a srcpkg

LP: #1661600