~racb/git-ubuntu:changelog-date-parsing

Last commit made on 2020-07-23
Get this branch:
git clone -b changelog-date-parsing https://git.launchpad.net/~racb/git-ubuntu

Branch information

Name:
changelog-date-parsing
Repository:
lp:~racb/git-ubuntu

Recent commits

6ff151d... by Robie Basak

Ensure the locale is set consistently

Changelog date parsing uses strptime, which depends on the locale
setting, so add an assertion that the expected locale is in effect and
set the locale correctly at the main importer entry point.

This does mean that the importer will not be localizable, but that seems
unlikely to be an issue in practice as the importer service operators
can all happily operate in the C locale. If it does become an issue then
we'd need to carefully consider what behaviour might be locale-dependent
to maintain import reproducibility. It may not just be the changelog
date parsing.

Since we don't actually enable localisation except in setting it to C in
the importer entry point, users of other locales should not be affected
when running git-ubuntu normally or when running git-ubuntu.self-test.
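
The locale dependency described above can be illustrated with a minimal
sketch. It is not the importer's actual code: the helper names
set_importer_locale and parse_changelog_date are hypothetical, and the
format string assumes the standard Debian changelog date layout.

```python
import locale
from datetime import datetime


def set_importer_locale():
    """Pin the process locale to C (hypothetical entry-point helper)."""
    locale.setlocale(locale.LC_ALL, "C")


def parse_changelog_date(text):
    """Parse a date in the standard Debian changelog layout (assumed)."""
    # %a and %b match locale-specific day and month names, so fail loudly
    # if the time locale is not the one the parser was written for.
    assert locale.setlocale(locale.LC_TIME) in ("C", "POSIX")
    return datetime.strptime(text, "%a, %d %b %Y %H:%M:%S %z")


set_importer_locale()
parsed = parse_changelog_date("Thu, 23 Jul 2020 12:00:00 +0100")
```

Calling locale.setlocale with only a category queries the current
setting without changing it, which is what makes the assertion cheap to
run on every parse.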

b0d19b8... by Robie Basak

Add parsing support for edge case dates

Various packages in the archive fail to import because their changelog
entries contain oddly formatted timestamps that are nevertheless
unambiguous. Add parsing support for these, together with test cases.

I've not followed the usual pattern of adding xfail test cases first
because it would necessitate noisy reformatting of the test input data.
In this case it should be straightforward to confirm that the test cases
do actually test what is intended (though I did verify this manually).

45293f8... by Robie Basak

Factor out changelog date parsing

This allows us to easily test individual date parsing cases, which we
will need to do more of shortly to add more support for edge cases.

The test is parametrized so that we can add more edge cases shortly.
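
As a rough sketch of what a factored-out parser with table-driven tests
might look like: the format list, the "seconds omitted" edge case and
the names below are all illustrative assumptions, since the commit log
does not show which formats the importer accepts. With pytest, the CASES
list would typically be passed to pytest.mark.parametrize instead of
being looped over.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical format list, tried in order. The formats the real importer
# accepts are not shown in the commit log.
_FORMATS = (
    "%a, %d %b %Y %H:%M:%S %z",  # standard Debian changelog layout
    "%a, %d %b %Y %H:%M %z",     # illustrative edge case: seconds omitted
)


def parse_changelog_date(text):
    """Return an aware datetime for the first format that matches."""
    for fmt in _FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
    raise ValueError(f"unparseable changelog date: {text!r}")


# Each entry is (input string, expected result); adding an edge case
# means adding one row here.
CASES = [
    (
        "Thu, 23 Jul 2020 12:00:00 +0000",
        datetime(2020, 7, 23, 12, 0, 0, tzinfo=timezone.utc),
    ),
    (
        "Thu, 23 Jul 2020 12:00 +0100",
        datetime(2020, 7, 23, 12, 0, tzinfo=timezone(timedelta(hours=1))),
    ),
]


def check_cases():
    for text, expected in CASES:
        assert parse_changelog_date(text) == expected
```

Keeping the parser as a standalone function means each new odd
timestamp needs only one extra row of test data rather than a full
changelog fixture.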

a3cfea7... by Robie Basak

Handle non-UTF8 characters in changelog notes

Handle non-UTF8 characters in changelog entries by replacing them when
creating changelog notes.
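
The commit log does not show how the replacement is done. One common
approach, sketched here as an assumption, is to decode with the
"replace" error handler so that each invalid sequence becomes U+FFFD.
The function name decode_changelog_bytes is hypothetical.

```python
def decode_changelog_bytes(raw):
    """Decode bytes as UTF-8, substituting U+FFFD for invalid sequences."""
    return raw.decode("utf-8", errors="replace")


# Latin-1 encoded e-acute (0xE9) is not valid UTF-8 on its own.
sample = b"Fix caf\xe9 typo"
text = decode_changelog_bytes(sample)
```

Replacement is lossy, but it keeps the rest of the entry readable and
avoids aborting on a single bad byte.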

d0d2a0a... by Robie Basak

Test handling for non-UTF8 in changelog notes

Some changelog entries in the archive contain non-UTF8 bytes and cannot
be decoded without error. This is a test for the upcoming solution.

The new test reuses gitubuntu/changelog_tests/test_utf8_error from some
other tests. I think this overloading is acceptable because any trap
created by this would immediately cause the test suite to fail.

0ba5be0... by Robie Basak

Temporarily blacklist "slow to reimport" packages

I'm going to reimport these in a separate "slow" importer service
instance with much longer timeouts, so blacklist these from the main
importer instance for now.

11071fe... by Rafael David Tinoco

Import uftrace (and dependency) for rafaeldtinoco

3a0a1ae... by Robie Basak

Add additional kernel packages to blacklist

e266ba4... by Robie Basak

Add additional kernel packages to blacklist

24b85db... by Andreas Hasenack

Import sshuttle