This allows reimports to be requested by injecting a reimport request
directly into the state database. There is deliberately no API for
requesting a reimport as this will never be done during normal
operation, and a sysadmin can use the sqlite3 CLI nearly as easily as a
dedicated CLI just for this operation, so I've opted not to implement a
dedicated CLI.
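As a rough sketch of what injecting a request could look like from Python rather than the sqlite3 CLI: the table name `request`, its columns `srcpkg` and `reimport`, and the function name below are all hypothetical, since the real state database schema is not described here.

```python
import sqlite3

# HYPOTHETICAL schema, for illustration only. The real state database
# layout is not described in this message and will differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS request (
    srcpkg TEXT PRIMARY KEY,
    reimport INTEGER NOT NULL DEFAULT 0
)
"""


def inject_reimport_request(conn, srcpkg):
    """Mark srcpkg as needing a reimport (hypothetical helper)."""
    # Using the connection as a context manager commits on success and
    # rolls back if the statement raises.
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO request (srcpkg, reimport) VALUES (?, 1)",
            (srcpkg,),
        )


# An in-memory database stands in for the real state database file.
conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
inject_reimport_request(conn, "hello")
rows = conn.execute("SELECT srcpkg, reimport FROM request").fetchall()
```

The equivalent single INSERT statement could be run by hand from the sqlite3 CLI against the real schema.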
Previously "requests" were just a list of source package names as
strings, so weren't really explicitly labelled as such. Now that we have
to actually track these, I've used a tuple(str, bool) as the simplest
way of identifying regular vs. reimport requests. This structure is now
used extensively in the importer service handling code and I've used the
general term "request" to refer to it.
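A minimal sketch of the request structure, assuming the bool is True for a reimport and False for a regular import. The alias name `Request` and the helper `split_requests` are hypothetical and only illustrate how such tuples can be told apart.

```python
from typing import List, Tuple

# Hypothetical alias: (source package name, is_reimport).
Request = Tuple[str, bool]


def split_requests(requests: List[Request]) -> Tuple[List[str], List[str]]:
    """Separate regular import requests from reimport requests."""
    regular = [name for name, reimport in requests if not reimport]
    reimports = [name for name, reimport in requests if reimport]
    return regular, reimports


requests = [("hello", False), ("glibc", True), ("bash", False)]
regular, reimports = split_requests(requests)
```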
Changes to importer_service.py are covered by a complete test suite
in importer_service_test.py, continuing 100% test coverage with
additional tests for the reimport cases against all changes. Coverage
for mailer.py is similarly maintained for these changes.
scriptutils.py and import-source-packages.py are however not currently
tested, and adding tests for these is therefore not currently practical.
In mitigation:
* The changes are small and can be verified by hand without too much
difficulty. All of the changes are in docstrings, comments and
variable renames to match the new "request" semantics, and there are
no algorithmic changes: everything is just "passed through".
* Since the importer itself runs idempotently and no change to it is
required here, we can have confidence that even if these changes are
buggy, the imported repositories won't really be affected.
* Worst case scenarios: reimports happen when they shouldn't; regular
imports happen when they should have been reimports; and similar. All
of these can be detected by the "logging" performed via importer
notes which report which version and at what time an individual
import took place.
It turns out that "git format-patch --stdout ... | git am" does not
correctly round-trip in every case. If a commit message contains
something that looks like a patch, "git am" incorrectly tries to apply
it.
Instead, we can note the commits that need to be rebased at export time
with "git rev-list --reverse", and apply them at import time with "git
cherry-pick". This keeps everything "inside git" instead of
round-tripping through an mbox, working around the problem.
This does mean that the repository to which the import happens must have
the commits available, which commonly would mean that the export
destination repository must be the same as the import source repository.
As it happens, this is always the case for us anyway during a
--reimport, so this will work just fine.
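The approach described above might be sketched roughly as follows. The helper names (`git`, `commits_to_replay`, `replay`) and the `base`/`tip` parameters are assumptions made for illustration and are not the importer's actual code. Only the `git rev-list --reverse` and `git cherry-pick` invocations come from the description.

```python
import subprocess


def git(repo, *args):
    """Run git inside repo and return its stdout as text."""
    result = subprocess.run(
        ["git", "-C", repo, *args],
        check=True,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.stdout


def commits_to_replay(repo, base, tip, run_git=git):
    """At export time: list the commits in base..tip, oldest first."""
    out = run_git(repo, "rev-list", "--reverse", "{}..{}".format(base, tip))
    return out.split()


def replay(repo, commits, run_git=git):
    """At import time: apply each noted commit in order.

    The commits must still be present in repo, which is why export and
    import need to use the same repository.
    """
    for commit in commits:
        run_git(repo, "cherry-pick", commit)
```

Because commit messages are never parsed as patches here, a message containing diff-like text is carried across unchanged.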
Test rich history preservation unescaped patch bug
If a rich history commit _message_ contains something that looks like a
patch, the current implementation will attempt to apply it even though
it shouldn't. This test checks this case, but fails for now as the bug
exists, so we mark it xfail to verify that the test works.
We will shortly be changing rich history preservation to use
cherry-picks to fix a bug. This will require the same repository to be
used for rich history import as the commits being imported will need to
be present in the repository still. Tests will therefore need to use the
same repository for import as they did for export.
This does rely on the test cleanly and correctly removing all previous
relevant tags in order to properly test the import, making it a little
more error prone, but we have no choice given the new requirement.
Since this was previously the only use for our repo_factory fixture, we
can drop it from the code base, together with its documentation. If it
is needed again later, we can always pull it out of VCS again.