We had a case where it seemed that the poller wasn't picking up on the
complete list of packages to import. To help debug this if it happens
again, log the number of packages marked importable once this list has
been determined at startup.
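A minimal sketch of this kind of startup logging follows. The helper
name and the is_importable predicate are hypothetical and are not taken
from the importer code; only the standard logging module is assumed.

```python
import logging


def log_importable_count(package_names, is_importable):
    """Return the importable packages and log how many were selected.

    Both arguments are illustrative: package_names is any list of names
    and is_importable is a predicate supplied by the caller.
    """
    importable = [name for name in package_names if is_importable(name)]
    # Logging both counts makes a truncated list easy to spot later.
    logging.info(
        "%d of %d packages marked importable",
        len(importable),
        len(package_names),
    )
    return importable
```

Comparing the logged count between runs would show whether the poller
started with a shorter list than expected.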
Changelog date parsing uses strptime, which depends on the locale
setting, so add an assertion that the expected C locale is in effect and
set the locale correctly at the main importer entry point.
This does mean that the importer will not be localizable, but that seems
unlikely to be an issue in practice as the importer service operators
can all happily operate in the C locale. If it does become an issue then
we'd need to carefully consider what behaviour might be locale-dependent
to maintain import reproducibility. It may not just be the changelog
date parsing.
Since we don't actually enable localisation except in setting it to C in
the importer entry point, users of other locales should not be affected
when running git-ubuntu normally or when running git-ubuntu.self-test.
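The approach can be sketched roughly as below. The function names and
the date format string are assumptions made for illustration; the real
importer may structure this differently. Only the standard locale and
datetime modules are used.

```python
import datetime
import locale


def assert_c_time_locale():
    # With a single argument, setlocale() queries the current setting
    # without changing it.
    current = locale.setlocale(locale.LC_TIME)
    assert current in ("C", "POSIX"), "LC_TIME is %r, expected C" % current


def parse_changelog_date(text):
    # %a and %b match locale-specific day and month names, so the result
    # is only reproducible when LC_TIME is pinned.
    assert_c_time_locale()
    return datetime.datetime.strptime(text, "%a, %d %b %Y %H:%M:%S %z")


def importer_main():
    # Hypothetical entry point: pin the locale once, before any parsing.
    locale.setlocale(locale.LC_ALL, "C")
    return parse_changelog_date("Mon, 12 May 2014 10:00:00 +0100")
```

Because only the importer entry point calls setlocale(), other commands
never change the locale and so behave as before.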
There are various packages in the archive that fail to import because
they have oddly formatted timestamps in their changelog entries, even
though those timestamps are still unambiguous. Add support for parsing
these, together with test cases.
I've not followed the usual pattern of adding xfail test cases first
because it would necessitate noisy reformatting of the test input data.
In this case it should be straightforward to confirm that the test cases
do actually test what is intended (though I did verify this manually).
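One way to tolerate such timestamps is to try the standard format first
and then fall back to a list of known variants. The sketch below
assumes the C locale, and the variant formats listed are hypothetical
examples, not the set actually found in the archive.

```python
import datetime

# The usual changelog date format, followed by illustrative variants.
# The variants are assumptions for this sketch only.
STANDARD_FORMAT = "%a, %d %b %Y %H:%M:%S %z"
FALLBACK_FORMATS = [
    "%a %d %b %Y %H:%M:%S %z",   # missing comma after the weekday
    "%d %b %Y %H:%M:%S %z",      # missing weekday
    "%a, %d %B %Y %H:%M:%S %z",  # full month name
]


def parse_with_fallbacks(text):
    """Parse text using the first format that matches."""
    for fmt in [STANDARD_FORMAT] + FALLBACK_FORMATS:
        try:
            return datetime.datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
    raise ValueError("unrecognised changelog date: %r" % text)
```

Trying the standard format first means well-formed entries are parsed
exactly as before, and anything that matches no format still raises.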
Some changelog entries in the archive contain non-UTF-8 bytes and cannot
be decoded without error. This is a test for the upcoming solution.
The new test reuses gitubuntu/changelog_tests/test_utf8_error from some
other tests. I think this overloading is acceptable because any trap
created by this would immediately cause the test suite to fail.
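The failure mode can be reproduced with a short sketch. The sample
bytes below are made up for illustration and are not the contents of
test_utf8_error. The tolerant decode shown is one common option and is
an assumption, not necessarily the solution that was adopted.

```python
# Illustrative changelog fragment containing a Latin-1 encoded e-acute
# (0xe9), which is not valid UTF-8 in this position.
RAW_CHANGELOG = b"pkg (1.0-1) unstable; urgency=low\n\n  * Fix caf\xe9 menu\n"


def decodes_as_utf8(data):
    """Return True if data is valid UTF-8 under strict decoding."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def decode_tolerantly(data):
    # Assumption: substitute U+FFFD for undecodable bytes instead of
    # failing. Other strategies (for example a Latin-1 fallback) exist.
    return data.decode("utf-8", errors="replace")
```

A test built on this would assert that strict decoding fails for the
sample while the tolerant path still returns text.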
I'm going to reimport these in a separate "slow" importer service
instance with much longer timeouts, so blacklist these from the main
importer instance for now.