Basically, our current main importer loop looks like:

    for applied in unapplied, applied:
        for dist in debian, ubuntu:
            for new publishes in $dist relative to $applied $dist branches:
                import $applied publish
                    (this updates the branch pointer for the pocket + series)
                if dist == 'ubuntu':
                    (this also updates the devel pointers)

That is a fair amount of branch manipulation that will be discarded (the
series might have multiple publishes), especially on the first import or
a reimport (worst case).
So instead:

    for dist in debian, ubuntu:
        for applied in unapplied, applied:
            for new publishes in $dist relative to $applied $dist branches:
                if publish has already been imported: continue
                import $applied publish
    update all affected branch pointers

where "affected" is defined by keeping a running track of the refs we
*would* have updated in the original algorithm, along with the 'last'
commit each of them would have pointed to. This consolidates the branch
and devel updates into one place as well.

This also makes the importer algorithm match the spec on some level
-- update the commit graph by importing only new publishes (which
would create new import tags and new applied tags, while verifying
any repeated publish data matches exactly) and then forcibly moving
branch pointers to where they are "now" in Launchpad. The branch
pointers are not part of the commit graph, so they change in a
distinct step.
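
The restructured loop can be sketched roughly as follows. This is only
an illustration, not the importer code: Publish, import_one and the ref
naming scheme are hypothetical stand-ins. It also assumes that an
already-imported publish skips the import but still contributes to the
ref tracking, which may differ from what the real code does.

```python
from collections import namedtuple

# Hypothetical stand-in for a Launchpad publishing record.
Publish = namedtuple("Publish", "dist series pocket version")


def import_publishes(publishes, import_one):
    """Import each new publish once and return the deferred ref updates.

    publishes: iterable of Publish, oldest first.
    import_one: hypothetical callable(applied, publish) -> commit hash.
    """
    imported = {}      # (applied, dist, version) -> commit hash
    pending_refs = {}  # ref name -> last commit it would have pointed to
    for dist in ("debian", "ubuntu"):
        for applied in (False, True):
            prefix = "applied/" if applied else ""
            for pub in publishes:
                if pub.dist != dist:
                    continue
                key = (applied, dist, pub.version)
                if key not in imported:
                    imported[key] = import_one(applied, pub)
                # Only record the ref here; later publishes overwrite
                # earlier ones, so each ref is moved once at the end.
                ref = f"{prefix}{dist}/{pub.series}-{pub.pocket}"
                pending_refs[ref] = imported[key]
    return pending_refs
```

The caller would then move every ref in the returned mapping in one
pass, after the commit graph has been fully updated.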
In my testing, this drops the reimport time for ipsec-tools from ~65
minutes to ~42 minutes consistently. We could probably speed it up more
starting from this base, if need be -- e.g., I'm not sure it makes sense
to do the import action itself in the loop. Perhaps it's better to
accumulate the set of unique publishes in a set and then iterate that
set in a second loop.
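
A minimal sketch of that two-loop idea, assuming publishes can be
represented as hashable values and that import order still matters (so
first-seen order is preserved). unique_publishes, import_all and
import_one are hypothetical names, not existing importer functions.

```python
def unique_publishes(publishes):
    """Return publishes with duplicates dropped, keeping first-seen order."""
    seen = set()
    ordered = []
    for pub in publishes:
        if pub not in seen:
            seen.add(pub)
            ordered.append(pub)
    return ordered


def import_all(publishes, import_one):
    """First pass collects unique publishes; second pass imports each once.

    import_one: hypothetical callable(publish) -> commit hash.
    """
    return {pub: import_one(pub) for pub in unique_publishes(publishes)}
```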
Given that we are going to be doing a lot of reimporting soon, speeding
it up is a net win. This also tries to clean up code for readability,
and add comments as we go.
Note this does not resolve the fundamental issue I am sure exists:
orphan tags. It simply removes support for them.
Move the core functionality into _devel_branch_updates, making
update_devel_branches a thin wrapper around it. _devel_branch_updates
now has no dependencies, so it should be easier to test.
Rename namespace, applied_prefix to ref_prefix. _devel_branch_updates
can be simplified by collapsing namespace and applied_prefix into a
single concept ref_prefix.
Move printing to wrapper function. Really the inner function should do
computation only and leave it to the caller to report warnings etc. This
will prevent noise when under test.
_devel_branch_updates returns None for commit hashes, in which case the
intention is that the caller will suppress setting those refs but will
be able to note which refs are not being set. This is useful to maintain
reporting.
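
The split between a pure inner function and a reporting wrapper can be
illustrated like this. The names compute_devel_updates and
apply_devel_updates, their signatures, and the ref naming are
hypothetical; they only mirror the shape of _devel_branch_updates and
update_devel_branches, not their actual code.

```python
import logging


def compute_devel_updates(ref_prefix, series_heads):
    """Pure computation: map ref name -> commit hash, or None if unknown.

    series_heads: hypothetical dict of series name -> commit hash or None.
    """
    return {
        f"{ref_prefix}ubuntu/{series}-devel": commit
        for series, commit in series_heads.items()
    }


def apply_devel_updates(ref_prefix, series_heads, set_ref):
    """Thin wrapper: does the reporting and performs the side effects.

    set_ref: hypothetical callable(ref_name, commit_hash).
    """
    for ref, commit in compute_devel_updates(ref_prefix, series_heads).items():
        if commit is None:
            # The inner function stays silent; the caller reports.
            logging.warning("Not setting %s: no commit found", ref)
            continue
        set_ref(ref, commit)
```

Because the inner function neither prints nor touches the repository,
a test can assert on its return value directly.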
The debug and warning messages are now slightly changed since less
information is available one level up the stack. It should still be
sufficient for debugging or warning purposes.
After discussion with Robie on IRC, we decided that 0f3c943054ab
("import: drop publishing parent functionality") and 5aa33fa08078 ("Also
reset devel heads") introduced a regression in the semantics of the
devel pointers.
Before those changes, the devel pointers were merged up so as to be
fast-forwarding, as we imported publication entries, if the publication
entry was newer than the current devel pointer. In other words, the
devel pointers were part of the commit graph itself.
After those changes, the devel pointers are more like symbolic
references, describing meta-state about the commit graph, rather than
integral to the graph itself:
A given series devel branch, after a successful import, points to the
latest publication record in a given series.
The ubuntu/devel branch, after a successful import, points to the latest
series devel branch.
Given these 'rules', we can stop updating the devel pointers in the main
import loop and just do so after we are done importing. All series
branch pointers are updated, which is unnecessary but will generally be
a no-op. This means the method does not need to know whether we are
reimporting.
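
As an illustration of the two rules, the devel pointers could be
computed after the import along these lines. This is a sketch under
assumptions: devel_pointers, the record tuples and the series ordering
are hypothetical, and "latest series devel branch" is taken to mean the
newest series that has any publication record.

```python
def devel_pointers(records, series_order):
    """Compute devel refs from publication records.

    records: iterable of (series, date_published, commit); dates must be
        mutually comparable.
    series_order: series names, oldest first.
    """
    latest = {}  # series -> (date, commit) of its newest publication
    for series, date, commit in records:
        if series not in latest or date > latest[series][0]:
            latest[series] = (date, commit)

    # Rule 1: each series devel branch points at its latest publication.
    refs = {f"ubuntu/{s}-devel": c for s, (_, c) in latest.items()}

    # Rule 2: ubuntu/devel points at the latest series devel branch.
    for series in reversed(series_order):
        if series in latest:
            refs["ubuntu/devel"] = latest[series][1]
            break
    return refs
```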