~racb/git-ubuntu:import-everything

Last commit made on 2023-01-18
Get this branch:
git clone -b import-everything https://git.launchpad.net/~racb/git-ubuntu

Branch information

Name:
import-everything
Repository:
lp:~racb/git-ubuntu

Recent commits

41b3c36... by Robie Basak

poller: remove allowlists and phasing

Now that we are importing all* packages, allowlists (including
"allowlist_team") and phasing are no longer required, so we can remove
this functionality and remove significant sections of code.

Previously we were augmenting the "user-specified" allowlist with an
allowlist generated by reading the apt repository for source packages,
placing them in main and universe buckets, and then applying "phasing"
to them to allow us to include proportional subsets of main and
universe. This made sense at the time, but is no longer necessary.

The downside of the previous approach is that it locked the set of
source packages imported to those present at the time the poller was
last restarted. This would miss new source packages until after the
poller was restarted and a further upload was made to them after the
restart.

Now that we've imported the majority of packages currently published in
any pocket of any release that is not EOL, it's simpler to import any
package that isn't otherwise denylisted. We can maintain the denylist
with the set of edge case imports that still need attention, to avoid
wasting importer resources on them.

This requires just a denylist and nothing else.
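
As a minimal sketch of the denylist-only selection described above (not
the poller's actual code): the function names, the one-name-per-line
file format with "#" comments, and the package names below are all
assumptions made for illustration.

```python
def parse_denylist(text):
    """Return the set of package names listed in a denylist.

    Assumed format (hypothetical): one source package name per line,
    with '#' starting a comment and blank lines ignored.
    """
    names = set()
    for line in text.splitlines():
        entry = line.split("#", 1)[0].strip()
        if entry:
            names.add(entry)
    return names


def packages_to_import(candidates, denylist):
    """Keep every candidate package that is not denylisted.

    There is no allowlist: a source package that appears for the first
    time is accepted as soon as it is seen, without a restart.
    """
    return [name for name in candidates if name not in denylist]


# Hypothetical denylist contents and package names.
denylist = parse_denylist("# known failures\nbroken-pkg  # see bug\n\n")
selected = packages_to_import(["hello", "broken-pkg", "new-pkg"], denylist)
```

Because membership is checked per package at the time it is seen, no
snapshot of the archive's source package list is needed.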

Since the apt repository parsing was being done solely to determine the
list of source packages in the archive for the purposes of the poller,
the apt repository reading and validation code can also be removed,
together with corresponding tests.

66cd050... by Robie Basak

More edge case changelog date parsing

According to the import specification, it's fine to parse Thurs as
Thu, Tues as Tue, and Sept as Sep, as these are all unambiguous. This
fixes imports of the following:

apachetop 0.12.5-7 Thurs, 12 Jan 2006 12:09:58 +0000
easychem 0.6-0ubuntu1 Tues, 20 Dec 2005 16:57:16 -0500
gnome-shell-extension-tilix-shortcut 1.0.1-1 Tue, 19 Sept 2017 14:23:17 +0200
libapp-cache-perl 0.35-1 Wed, 24 Sept 2008 19:32:23 +0200
libcrypt-hcesha-perl 0.70-2 Wed, 24 Sept 2008 19:44:01 +0200

The application of multiple regular expression replacements is factored
out into _apply_re_substitutions(). This function isn't explicitly
tested; the comprehensive test coverage of its only caller,
Changelog._parse_changelog_date(), is sufficient.
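
The normalisation described above could look roughly like the following
sketch. It is not the code from this commit: apply_substitutions(),
parse_date() and SUBSTITUTIONS are hypothetical names, and using
email.utils.parsedate_to_datetime() for the final parse is an
assumption.

```python
import re
from email.utils import parsedate_to_datetime

# Hypothetical substitution table: each unambiguous long form is mapped
# to the standard three-letter abbreviation.
SUBSTITUTIONS = [
    (re.compile(r"\bThurs\b"), "Thu"),
    (re.compile(r"\bTues\b"), "Tue"),
    (re.compile(r"\bSept\b"), "Sep"),
]


def apply_substitutions(substitutions, text):
    """Apply each (compiled pattern, replacement) pair to text in order."""
    for pattern, replacement in substitutions:
        text = pattern.sub(replacement, text)
    return text


def parse_date(raw):
    """Normalise a changelog date string, then parse it as an RFC 2822 date."""
    return parsedate_to_datetime(apply_substitutions(SUBSTITUTIONS, raw))


# Date strings taken from the list above.
apachetop = parse_date("Thurs, 12 Jan 2006 12:09:58 +0000")
libapp_cache_perl = parse_date("Wed, 24 Sept 2008 19:32:23 +0200")
```

The word boundaries keep the patterns from touching strings that
already use the standard forms, such as "Thu" or "Sep".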

5c99643... by Robie Basak

Add further changelog date overrides (2)

I've checked that these all fail to parse due to invalid date strings,
and added them to the exception list in the spec.

a2df474... by Robie Basak

Add further changelog date overrides

I've checked that these all fail to parse due to invalid date strings,
and added them to the exception list in the spec.
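
For dates that cannot be parsed at all, an override lookup along these
lines is one way to picture the mechanism. This is only a sketch: the
DATE_OVERRIDES table, its (package, version) key, the changelog_date()
function and the example entry are all hypothetical and do not come
from the spec or from this repository.

```python
from email.utils import parsedate_to_datetime

# Hypothetical override table: (source package, version) mapped to a
# corrected date string. The entry below is invented for illustration.
DATE_OVERRIDES = {
    ("example-package", "1.0-1"): "Mon, 02 Jan 2006 15:04:05 +0000",
}


def changelog_date(package, version, raw_date):
    """Parse a changelog date, preferring an override when one exists."""
    corrected = DATE_OVERRIDES.get((package, version), raw_date)
    return parsedate_to_datetime(corrected)


# The invalid string is never parsed, because the override takes priority.
overridden = changelog_date("example-package", "1.0-1", "not a valid date")
```

Keying the table on both package and version keeps each override
limited to the single upload whose date string is known to be invalid.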

81bd897... by Robie Basak

denylist: add known failures and refer to bugs

6ce5a58... by Robie Basak

denylist: remove successful imports

These repositories already existed and have been updated successfully,
so they can continue as normal now.

5f07169... by Robie Basak

Add further mass import failures to denylist

In importing source packages that exist in non-EOL releases but have
since been deleted, these further twelve packages fail to import and
have never been imported, so add these to the denylist for now.

bef5ef5... by Robie Basak

Add mass import failures to denylist

These packages are not yet imported. They will need individual
investigation to fix the edge case issues that cause them to fail. In
the meantime, we won't attempt to import them automatically to avoid
blocking up worker slots.

843964b... by Robie Basak

Add production systemd service definitions

These are installed and managed manually. They match the ones currently
in use in production, except that in
git-ubuntu-importer-service-worker@.service we currently specify the IP
to connect to manually, so that value has been substituted out.

ab5d7ad... by Robie Basak

Move import_srcpkg() to importer_service_worker.py

This function is only used from here, so there's no need for it to be in
a different module.