Merge ~racb/git-ubuntu:fix-publication-list-truncation into git-ubuntu:main

Proposed by Robie Basak
Status: Merged
Approved by: Robie Basak
Approved revision: 21338706797920b251bce5da6ff131e15dab60f3
Merged at revision: 21338706797920b251bce5da6ff131e15dab60f3
Proposed branch: ~racb/git-ubuntu:fix-publication-list-truncation
Merge into: git-ubuntu:main
Diff against target: 895 lines (+425/-150)
6 files modified
gitubuntu/git_repository.py (+39/-16)
gitubuntu/git_repository_test.py (+29/-0)
gitubuntu/importer.py (+39/-45)
gitubuntu/importer_test.py (+26/-18)
gitubuntu/source_information.py (+66/-69)
gitubuntu/source_information_test.py (+226/-2)
Reviewer Review Type Date Requested Status
Bryce Harrington Approve
Server Team CI bot continuous-integration Approve
Canonical Server Reporter Pending
Canonical Server Reporter Pending
Canonical Server Reporter Pending
Canonical Server Pending
Review via email: mp+424877@code.launchpad.net

Description of the change

This is to fix the current performance issue where the importer seems to begin importing from the beginning of time instead of resuming from only recent publications.

To post a comment you must log in.
Revision history for this message
Server Team CI bot (server-team-bot) wrote :

PASSED: Continuous integration, rev:24bf8b20cdd2b589c2ff7b32ba92fcc27a21e321
https://jenkins.canonical.com/server-team/job/git-ubuntu-ci/5/
Executed test runs:
    SUCCESS: VM Setup
    SUCCESS: Build
    SUCCESS: VM Reset
    SUCCESS: Unit Tests
    IN_PROGRESS: Declarative: Post Actions

Click here to trigger a rebuild:
https://jenkins.canonical.com/server-team/job/git-ubuntu-ci/5//rebuild

review: Approve (continuous-integration)
Revision history for this message
Bryce Harrington (bryce) wrote :
Download full text (3.8 KiB)

commit 002e452c9dd413b9a22b7ff8d0d3c277a3ae0a3a
  - Verified list exists at https://lists.ubuntu.com/mailman/listinfo/ubuntu-distributed-devel
  - List adopted for git-ubuntu use on https://lists.ubuntu.com/archives/ubuntu-distributed-devel/2018-May/001222.html
  - LGTM

commit 2f0e8556c1a0c36ffefe80cfc3c0f5e3adb49b28
  - Is the name "USD Import Team" still valid as the name for the new list? If not, then the from line in the CHANGELOG_TEMPLATE needs updated.
  - `grep -sir usd-import-team .` shows additional references to the old team in setup.py and a few docs. The urls no longer resolve, so guessing this commit would be a good point to update all those too.
  - There are also other mentions of 'usd-import' in the codebase, such as some man page references to bugs.launchpad.net/usd-importer and git.launchpad.net/usd-importer, etc. which no longer resolve, and to some other email addresses that may no longer be valid. Maybe you want to cleanup those in a different commit in this branch, or in this commit?
  - Needs Fixing

commit 5aad1110b2b849e4704c37751cf656ae14e3b3bb
  - No other references to 'usd-importer-bot' in the codebase
  - There is still the 'usd-importer' service, but afaict that'll keep the name, and in any case is unrelated to the bot.
  - LGTM

commit b312ec9a2de2c0835fb19bae6527f59fb8ea6b97
  - Verified no other references to get_heads_and_versions()
  - Are you using namedtuple instead of an ordinary class for performance reasons? Or for test parameterization convenience? You've used namedtuples elsewhere in the codebase so wonder if there's a reason or just a style thing. Just curious, not suggesting it be changed.
  - In the :returns: for get_head_info() it documents the parameters in the namedtuple; would be better to just put those parameter docs with the HeadInfoItem namedtuple definition.
  - Needs Fixing

commit 8bace066b8c96d214d8f2bb04f93c791ee82e402
  - Since this is a bug fix (and the principle change for this branch), there should be a bug report filed for the issue for tracking.
  - The irclog is useful for reference, however I would be concerned that irc logs might be ephemeral, so please include a synopsis/summary of the discussion in the bug report.
  - I notice this drops checking if the date is defined before using it; presumably you're counting on date_created always being defined, is that true?
  - I don't know if 100 chars is considered too long a line length, but storing spi.head_name(namespace) to a local var could help shorten it.
  - This commit might be a good point to add code docs for _head_version_is_equal(). (Not sure if we bother with code docs for internal methods though.)
  - Needs Fixing

commit 2be6d7d8bef25b03a850b519402eef7cfa2bac5a
  - Better code doc for the test might be "get_head_info() extracts expected commit and version info from the head commit."
  - Otherwise, looks good to me.

commit 172c33290f7faa2d8d6ff903213df3ae140c9f51
  - LGTM

commit 793e8e3dcbf57df0e01f3c6c967e96a32b981761
  - LGTM

commit 5fd2d23b89b13240cc711f80a4d438632632981d
  - LGTM

commit 24bf8b20cdd2b589c2ff7b32ba92fcc27a21e321
  - The parameterization here is very clean, well documented, and easy to f...

Read more...

review: Needs Fixing
Revision history for this message
Robie Basak (racb) wrote :
Download full text (5.5 KiB)

On Tue, Jun 21, 2022 at 05:08:56PM -0000, Bryce Harrington wrote:
> commit 002e452c9dd413b9a22b7ff8d0d3c277a3ae0a3a
> - Verified list exists at https://lists.ubuntu.com/mailman/listinfo/ubuntu-distributed-devel
> - List adopted for git-ubuntu use on https://lists.ubuntu.com/archives/ubuntu-distributed-devel/2018-May/001222.html
> - LGTM
>
> commit 2f0e8556c1a0c36ffefe80cfc3c0f5e3adb49b28
> - Is the name "USD Import Team" still valid as the name for the new list? If not, then the from line in the CHANGELOG_TEMPLATE needs updated.
> - `grep -sir usd-import-team .` shows additional references to the old team in setup.py and a few docs. The urls no longer resolve, so guessing this commit would be a good point to update all those too.
> - There are also other mentions of 'usd-import' in the codebase, such as some man page references to bugs.launchpad.net/usd-importer and git.launchpad.net/usd-importer, etc. which no longer resolve, and to some other email addresses that may no longer be valid. Maybe you want to cleanup those in a different commit in this branch, or in this commit?
> - Needs Fixing
>
> commit 5aad1110b2b849e4704c37751cf656ae14e3b3bb
> - No other references to 'usd-importer-bot' in the codebase
> - There is still the 'usd-importer' service, but afaict that'll keep the name, and in any case is unrelated to the bot.
> - LGTM

These three commits aren't actually part of this MP. I'm not sure how they
ended up in your review. Your points sound valid but I'll address them
separately from this MP.

> commit b312ec9a2de2c0835fb19bae6527f59fb8ea6b97
> - Verified no other references to get_heads_and_versions()
> - Are you using namedtuple instead of an ordinary class for performance reasons? Or for test parameterization convenience? You've used namedtuples elsewhere in the codebase so wonder if there's a reason or just a style thing. Just curious, not suggesting it be changed.

I find myself naturally graduating from a single value to a 2-tuple to a
namedtuple.

I don't go all the way to a full class because I think that constraining
the structure to a namedtuple conveys to the reader (and my future self)
that it's purely data that's being passed around, with no attached
special behaviour. The immutability helps avoid surprises too. There
can't be any accidental modifications made by a called function, which
is a class of bug that is painful to detect and pin down were it to
happen.

If you're curious, dataclasses (new since Python 3.7) are interesting as
a normal powerful version of namedtuples that are more easily extended.
But I don't usually need that power, so I've been sticking to
namedtuples most of the time.

> - In the :returns: for get_head_info() it documents the parameters in the namedtuple; would be better to just put those parameter docs with the HeadInfoItem namedtuple definition.
> - Needs Fixing

Fixed - I squashed this into the original commit (now f5dc43c).

> commit 8bace066b8c96d214d8f2bb04f93c791ee82e402
> - Since this is a bug fix (and the principle change for this branch), there should be a bug report filed for the issue for tracking.

Done: LP: #1979650.

> - The irclog is usef...

Read more...

Revision history for this message
Robie Basak (racb) wrote :

Here's the range-diff of what I just pushed: https://paste.ubuntu.com/p/jcxbVgVCPT/

Revision history for this message
Server Team CI bot (server-team-bot) wrote :

PASSED: Continuous integration, rev:21338706797920b251bce5da6ff131e15dab60f3
https://jenkins.canonical.com/server-team/job/git-ubuntu-ci/6/
Executed test runs:
    SUCCESS: VM Setup
    SUCCESS: Build
    SUCCESS: VM Reset
    SUCCESS: Unit Tests
    IN_PROGRESS: Declarative: Post Actions

Click here to trigger a rebuild:
https://jenkins.canonical.com/server-team/job/git-ubuntu-ci/6//rebuild

review: Approve (continuous-integration)
Revision history for this message
Bryce Harrington (bryce) wrote :

Looks good, +1

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1diff --git a/gitubuntu/git_repository.py b/gitubuntu/git_repository.py
2index d73972b..1ef6548 100644
3--- a/gitubuntu/git_repository.py
4+++ b/gitubuntu/git_repository.py
5@@ -1026,6 +1026,25 @@ def datetime_to_signature_spec(datetime):
6 return int(datetime.timestamp()), offset_mins
7
8
9+class HeadInfoItem(collections.namedtuple(
10+ 'HeadInfoItem',
11+ [
12+ 'version',
13+ 'commit_time',
14+ 'commit_id',
15+ ],
16+)):
17+ """Information associated with a single branch head
18+
19+ :ivar str version: the package version found in debian/changelog at the
20+ branch head.
21+ :ivar int commit_time: the timestamp of the commit at the branch head,
22+ expressed as seconds since the Unix epoch.
23+ :ivar pygit2.Oid commit_id: the hash of the commit at the branch head.
24+ """
25+ pass
26+
27+
28 class GitUbuntuRepository:
29 """A class for interacting with an importer git repository
30
31@@ -1888,33 +1907,37 @@ class GitUbuntuRepository:
32 'Cannot get changelog source package name'
33 )
34
35- def get_heads_and_versions(self, head_prefix, namespace):
36- """Extract the last version in debian/changelog of all
37- '<namespace>/<head_prefix>/debian/*' and
38- '<namespace>/<head_prefix>/ubuntu/*' branches.
39+ def get_head_info(self, head_prefix, namespace):
40+ """Extract package versions at branch heads
41
42- :rtype: dict
43- :returns: a dictionary keyed by branch name (without a 'refs/heads/'
44- prefix). The value is a dictionary with two keys: 'head' and
45- 'version'. The 'head' value is a pygit2.Branch object. The
46- 'version' value is the package version string found at that branch
47- head.
48+ Extract the version from debian/changelog of all
49+ f'{namespace}/{head_prefix>/*' branches, excluding any branch that
50+ contains 'ubuntu/devel'.
51+
52+ :param str namespace: the namespace under which git refs are found
53+ :param str head_prefix: the prefix to look for
54+ :rtype: dict(str, HeadInfoItem)
55+ :returns: a dictionary keyed by the namespaced branch name (ie. without
56+ a 'refs/heads/' prefix but with the namespace prefix, eg.
57+ 'importer/ubuntu/focal-devel').
58 """
59- versions = dict()
60- errors = False
61+ head_info = dict()
62 for head in self.local_branches:
63 prefix = '%s/%s' % (namespace, head_prefix)
64 if not head.branch_name.startswith(prefix):
65 continue
66 if 'ubuntu/devel' in head.branch_name:
67 continue
68- versions[head.branch_name] = dict()
69- versions[head.branch_name]['head'] = head
70- versions[head.branch_name]['version'], _ = (
71+ version, _ = (
72 self.get_changelog_versions_from_treeish(str(head.peel().id))
73 )
74+ head_info[head.branch_name] = HeadInfoItem(
75+ version=version,
76+ commit_time=head.peel().commit_time,
77+ commit_id=head.peel().id,
78+ )
79
80- return versions
81+ return head_info
82
83 def treeishs_identical(self, treeish_string1, treeish_string2):
84 if treeish_string1 is None or treeish_string2 is None:
85diff --git a/gitubuntu/git_repository_test.py b/gitubuntu/git_repository_test.py
86index c822ff1..838c31e 100644
87--- a/gitubuntu/git_repository_test.py
88+++ b/gitubuntu/git_repository_test.py
89@@ -12,6 +12,7 @@ import pygit2
90 import pytest
91
92 import gitubuntu.git_repository as target
93+from gitubuntu.git_repository import HeadInfoItem
94 from gitubuntu.repo_builder import (
95 Blob,
96 Commit,
97@@ -1157,3 +1158,31 @@ def test_get_all_reimport_tags(repo):
98 'refs/tags/importer/reimport/import/1/0',
99 'refs/tags/importer/reimport/import/1/1',
100 }
101+
102+
103+def test_get_head_info(repo):
104+ """get_head_info() extracts expected commit and version info from the head
105+ commit
106+ """
107+ Repo(
108+ commits=[
109+ Commit(name='foo', tree=SourceTree(source=Source())),
110+ ],
111+ branches={
112+ 'importer/ubuntu/foo': Placeholder('foo'),
113+ 'importer/debian/bar': Placeholder('foo'),
114+ 'other/ubuntu/baz': Placeholder('foo'),
115+ },
116+ ).write(repo.raw_repo)
117+ foo_commit_id = (
118+ repo.raw_repo
119+ .lookup_reference('refs/heads/importer/ubuntu/foo')
120+ .peel(pygit2.Commit).id
121+ )
122+ assert repo.get_head_info('ubuntu', 'importer') == {
123+ 'importer/ubuntu/foo': HeadInfoItem(
124+ version='1-1',
125+ commit_time=0,
126+ commit_id=foo_commit_id,
127+ ),
128+ }
129diff --git a/gitubuntu/importer.py b/gitubuntu/importer.py
130index 8850233..0bbc718 100644
131--- a/gitubuntu/importer.py
132+++ b/gitubuntu/importer.py
133@@ -251,53 +251,53 @@ def _main_with_repo(
134 retry_backoffs,
135 )
136
137- debian_head_versions = (
138- repo.get_heads_and_versions('debian', namespace)
139+ debian_head_info = (
140+ repo.get_head_info('debian', namespace)
141 )
142- for head_name in sorted(debian_head_versions):
143+ for head_name in sorted(debian_head_info):
144 _, _, pretty_head_name = head_name.partition('%s/' % namespace)
145 logging.debug('Last imported %s version is %s',
146 pretty_head_name,
147- debian_head_versions[head_name]['version']
148+ debian_head_info[head_name].version
149 )
150
151- ubuntu_head_versions = (
152- repo.get_heads_and_versions('ubuntu', namespace)
153+ ubuntu_head_info = (
154+ repo.get_head_info('ubuntu', namespace)
155 )
156- for head_name in sorted(ubuntu_head_versions):
157+ for head_name in sorted(ubuntu_head_info):
158 _, _, pretty_head_name = head_name.partition('%s/' % namespace)
159 logging.debug('Last imported %s version is %s',
160 pretty_head_name,
161- ubuntu_head_versions[head_name]['version']
162+ ubuntu_head_info[head_name].version
163 )
164
165 if not skip_applied:
166- applied_debian_head_versions = (
167- repo.get_heads_and_versions(
168+ applied_debian_head_info = (
169+ repo.get_head_info(
170 'applied/debian', namespace
171 )
172 )
173- for head_name in sorted(applied_debian_head_versions):
174+ for head_name in sorted(applied_debian_head_info):
175 _, _, pretty_head_name = head_name.partition(
176 '%s/' % namespace
177 )
178 logging.debug('Last applied %s version is %s',
179 pretty_head_name,
180- applied_debian_head_versions[head_name]['version']
181+ applied_debian_head_info[head_name].version
182 )
183
184- applied_ubuntu_head_versions = (
185- repo.get_heads_and_versions(
186+ applied_ubuntu_head_info = (
187+ repo.get_head_info(
188 'applied/ubuntu', namespace
189 )
190 )
191- for head_name in sorted(applied_ubuntu_head_versions):
192+ for head_name in sorted(applied_ubuntu_head_info):
193 _, _, pretty_head_name = head_name.partition(
194 '%s/' % namespace
195 )
196 logging.debug('Last applied %s version is %s',
197 pretty_head_name,
198- applied_ubuntu_head_versions[head_name]['version']
199+ applied_ubuntu_head_info[head_name].version
200 )
201
202 oldcwd = os.getcwd()
203@@ -321,8 +321,8 @@ def _main_with_repo(
204 pkgname=pkgname,
205 namespace=namespace,
206 patches_applied=False,
207- debian_head_versions=debian_head_versions,
208- ubuntu_head_versions=ubuntu_head_versions,
209+ debian_head_info=debian_head_info,
210+ ubuntu_head_info=ubuntu_head_info,
211 debian_sinfo=debian_sinfo,
212 ubuntu_sinfo=ubuntu_sinfo,
213 active_series_only=active_series_only,
214@@ -345,8 +345,8 @@ def _main_with_repo(
215 pkgname=pkgname,
216 namespace=namespace,
217 patches_applied=True,
218- debian_head_versions=applied_debian_head_versions,
219- ubuntu_head_versions=applied_ubuntu_head_versions,
220+ debian_head_info=applied_debian_head_info,
221+ ubuntu_head_info=applied_ubuntu_head_info,
222 debian_sinfo=debian_sinfo,
223 ubuntu_sinfo=ubuntu_sinfo,
224 active_series_only=active_series_only,
225@@ -1102,7 +1102,7 @@ def override_parents(parent_overrides, repo, version, namespace):
226
227 def _devel_branch_updates(
228 ref_prefix,
229- head_versions,
230+ head_info,
231 series_name_list,
232 ):
233 """Calculate devel branch pointer commits.
234@@ -1113,11 +1113,10 @@ def _devel_branch_updates(
235 Also yield a further generic 'ubuntu/devel' ref_name with its target.
236
237 :param: ref_prefix: what to stick on the front of all refs yielded and to
238- look for in head_versions
239+ look for in head_info
240
241- :param: head_versions: a dictionary
242- {ref_name: (hash_string, version_string)} describing which package
243- version each ref is currently on.
244+ :param: head_info: a dictionary as returned by
245+ GitUbuntuRepository.get_head_info()
246
247 :param: series_name_list: a list of strings of series names to consider
248 with most recent series first.
249@@ -1135,12 +1134,11 @@ def _devel_branch_updates(
250 # order of pockets
251 for suff in ['-proposed', '-updates', '-security', '']:
252 name = "%subuntu/%s%s" % (ref_prefix, series_name, suff)
253- if name in head_versions:
254- head_commit, head_version = head_versions[name]
255- if version_compare(head_version, series_devel_version) > 0:
256+ if name in head_info:
257+ if version_compare(head_info[name].version, series_devel_version) > 0:
258 series_devel_name = name
259- series_devel_version = head_version
260- series_devel_hash = head_commit
261+ series_devel_version = head_info[name].version
262+ series_devel_hash = str(head_info[name].commit_id)
263 if series_devel_hash:
264 series_devel_branch_name = '%subuntu/%s-devel' % (
265 ref_prefix,
266@@ -1187,16 +1185,12 @@ def update_devel_branches(
267 archive
268 """
269 for applied_prefix in ['', 'applied/']:
270- head_versions = {
271- ref: (str(target['head'].peel().id), target['version'])
272- for ref, target in repo.get_heads_and_versions(
273- '%subuntu' % applied_prefix,
274- namespace=namespace,
275- ).items()
276- }
277 for ref, commit in _devel_branch_updates(
278 ref_prefix=('%s/%s' % (namespace, applied_prefix)),
279- head_versions=head_versions,
280+ head_info=repo.get_head_info(
281+ head_prefix='%subuntu' % applied_prefix,
282+ namespace=namespace,
283+ ),
284 series_name_list=ubuntu_sinfo.all_series_name_list,
285 ):
286 if commit is None:
287@@ -2521,8 +2515,8 @@ def import_publishes(
288 pkgname,
289 namespace,
290 patches_applied,
291- debian_head_versions,
292- ubuntu_head_versions,
293+ debian_head_info,
294+ ubuntu_head_info,
295 debian_sinfo,
296 ubuntu_sinfo,
297 active_series_only,
298@@ -2559,8 +2553,8 @@ def import_publishes(
299 changelog_date_overrides=changelog_date_overrides,
300 )
301 # Always look in Ubuntu for new publications
302- gusi_head_versions_tuple_list = [
303- (ubuntu_sinfo, ubuntu_head_versions),
304+ gusi_head_info_tuple_list = [
305+ (ubuntu_sinfo, ubuntu_head_info),
306 ]
307 # Unless --active-series-only was specified, also look in Debian for new
308 # publications.
309@@ -2568,16 +2562,16 @@ def import_publishes(
310 # For hash stability, we must process new Debian publications with the
311 # same date_created before new Ubuntu publications, so we insert the
312 # Debian entry before the Ubuntu entry.
313- gusi_head_versions_tuple_list.insert(
314+ gusi_head_info_tuple_list.insert(
315 0,
316- (debian_sinfo, debian_head_versions),
317+ (debian_sinfo, debian_head_info),
318 )
319 try:
320 # We must process new publications in order of date_created interleaved
321 # across both distributions (Debian first) in order to maintain hash
322 # stability according to the spec.
323 for srcpkg_information in GitUbuntuSourceInformation.interleave_launchpad_versions_published_after(
324- gusi_head_versions_tuple_list,
325+ gusi_head_info_tuple_list,
326 namespace=namespace,
327 workdir=workdir,
328 active_series_only=active_series_only,
329diff --git a/gitubuntu/importer_test.py b/gitubuntu/importer_test.py
330index 05cafb0..433450f 100644
331--- a/gitubuntu/importer_test.py
332+++ b/gitubuntu/importer_test.py
333@@ -17,6 +17,7 @@ import pkg_resources
334 import pygit2
335 import tenacity
336
337+from gitubuntu.git_repository import HeadInfoItem
338 from gitubuntu.patch_state import PatchState
339 import gitubuntu.repo_builder as repo_builder
340 from gitubuntu.repo_builder import Placeholder
341@@ -33,43 +34,50 @@ from gitubuntu.test_fixtures import pygit2_repo, repo
342 @pytest.mark.parametrize('head_versions,expected', [
343 (
344 {
345- 'prefix/ubuntu/trusty': (sentinel.a, '1.0-1'),
346+ 'prefix/ubuntu/trusty': ('a', '1.0-1'),
347 },
348 {
349- 'prefix/ubuntu/trusty-devel': sentinel.a,
350- 'prefix/ubuntu/devel': sentinel.a,
351+ 'prefix/ubuntu/trusty-devel': 'a',
352+ 'prefix/ubuntu/devel': 'a',
353 }
354 ),
355 (
356 {
357- 'prefix/ubuntu/trusty': (sentinel.t, '1.0-1'),
358- 'prefix/ubuntu/xenial': (sentinel.x, '1.0-2'),
359- 'prefix/ubuntu/yakkety': (sentinel.y, '1.0-3'),
360+ 'prefix/ubuntu/trusty': ('t', '1.0-1'),
361+ 'prefix/ubuntu/xenial': ('x', '1.0-2'),
362+ 'prefix/ubuntu/yakkety': ('y', '1.0-3'),
363 },
364 {
365- 'prefix/ubuntu/trusty-devel': sentinel.t,
366- 'prefix/ubuntu/xenial-devel': sentinel.x,
367- 'prefix/ubuntu/yakkety-devel': sentinel.y,
368- 'prefix/ubuntu/devel': sentinel.y,
369+ 'prefix/ubuntu/trusty-devel': 't',
370+ 'prefix/ubuntu/xenial-devel': 'x',
371+ 'prefix/ubuntu/yakkety-devel': 'y',
372+ 'prefix/ubuntu/devel': 'y',
373 }
374 ),
375 (
376 {
377- 'prefix/ubuntu/trusty': (sentinel.t, '1.0-1'),
378- 'prefix/ubuntu/trusty-security': (sentinel.ts, '1.0-1.2'),
379- 'prefix/ubuntu/trusty-proposed': (sentinel.tp, '1.0-1.1'),
380+ 'prefix/ubuntu/trusty': ('t', '1.0-1'),
381+ 'prefix/ubuntu/trusty-security': ('ts', '1.0-1.2'),
382+ 'prefix/ubuntu/trusty-proposed': ('tp', '1.0-1.1'),
383 },
384 {
385- 'prefix/ubuntu/trusty-devel': sentinel.ts,
386- 'prefix/ubuntu/devel': sentinel.ts,
387+ 'prefix/ubuntu/trusty-devel': 'ts',
388+ 'prefix/ubuntu/devel': 'ts',
389 }
390 ),
391 ])
392 def test_devel_branch_updates(head_versions, expected):
393 result = target._devel_branch_updates(
394- 'prefix/',
395- head_versions,
396- ['yakkety', 'xenial', 'trusty'],
397+ ref_prefix='prefix/',
398+ head_info={
399+ head: HeadInfoItem(
400+ commit_id=commit_id,
401+ version=version,
402+ commit_time=None
403+ )
404+ for head, (commit_id, version) in head_versions.items()
405+ },
406+ series_name_list=['yakkety', 'xenial', 'trusty'],
407 )
408 assert set(result) == set(expected.items())
409
410diff --git a/gitubuntu/source_information.py b/gitubuntu/source_information.py
411index fbe2ee5..da79c5c 100644
412--- a/gitubuntu/source_information.py
413+++ b/gitubuntu/source_information.py
414@@ -394,9 +394,10 @@ class GitUbuntuSourceInformation(object):
415 def __init__(self, dist_name, pkgname=None,
416 pull_overrides_filename='/dev/null',
417 retries=0,
418- retry_backoffs=[]
419+ retry_backoffs=[],
420+ lp=None,
421 ):
422- self.launchpad = launchpad_login()
423+ self.launchpad = lp or launchpad_login()
424 self.dist_name = dist_name
425 if self.dist_name.startswith('ppa:'):
426 _, ppa = dist_name.split(':')
427@@ -421,16 +422,36 @@ class GitUbuntuSourceInformation(object):
428 self.retry_backoffs = retry_backoffs
429
430 @staticmethod
431- def _head_version_is_equal(head_versions, namespace, spi):
432+ def _spi_found_in_relevant_head(head_info, namespace, spi):
433+ """Determine if a branch head matches the given source publication
434+
435+ Consider the branch head that is expected to be updated were the source
436+ publication to be imported. If the source publication appears to
437+ already be imported there, then this method returns True.
438+
439+ This is assumed if the commit timestamp at the branch head matches the
440+ date_created timestamp of the source publication, and the package
441+ version strings also match.
442+
443+ :param dict(str, HeadInfoItem) head_info: the state of the relevant
444+ branch heads, as returned by GitUbuntuRepository.get_head_info().
445+ :param str namespace: the namespace prefix used in the git repository.
446+ :param GitUbuntuSourcePackageInformation spi: the Launchpad publication
447+ to consider.
448+ :rtype: bool
449+ :returns: True if the spi matches the relevant branch head
450+ """
451 try:
452- if (head_versions[spi.head_name(namespace)]['version'] == spi.version and
453- (spi.date_published is None or
454- int(spi.date_published.timestamp()) == head_versions[spi.head_name(namespace)]['head'].peel().commit_time)
455- ):
456- return True
457- return False
458+ head_info_item = head_info[spi.head_name(namespace)]
459 except KeyError:
460+ # We don't have information for the relevant branch. Probably
461+ # because it doesn't exist. So our answer is no: we don't appear to
462+ # have this source publication imported.
463 return False
464+ return (
465+ head_info_item.version == spi.version
466+ and int(spi.date_created.timestamp()) == head_info_item.commit_time
467+ )
468
469 @property
470 def current_series(self):
471@@ -552,7 +573,7 @@ class GitUbuntuSourceInformation(object):
472
473 def launchpad_versions_published_after(
474 self,
475- head_versions,
476+ head_info,
477 namespace,
478 workdir=None,
479 active_series_only=False,
480@@ -561,13 +582,13 @@ class GitUbuntuSourceInformation(object):
481
482 Return a sequence of GitUbuntuSourcePackageInformation instances
483 representing Launchpad publications created in a Launchpad distribution
484- after a particular point, as determined by the head_versions parameter.
485+ after a particular point, as determined by the head_info parameter.
486
487- :param dict head_versions: as returned by
488- GitUbuntuRepository.get_heads_and_versions(). This may be empty, in
489+ :param dict head_info: as returned by
490+ GitUbuntuRepository.get_head_info(). This may be empty, in
491 which case all publications are returned. Otherwise, publications
492- are skipped that match a version already found in head_versions, or
493- with a date_created older than what is found in head_versions, is
494+ are skipped that match a version already found in head_info, or
495+ with a date_created older than what is found in head_info, is
496 skipped, as well as any publications that have a prior date_created
497 date.
498 :param str namespace: the namespace prefix used in the git repository.
499@@ -577,11 +598,6 @@ class GitUbuntuSourceInformation(object):
500 an active series, as determined by self.active_series_name_list.
501 :rtype: sequence(GitUbuntuSourcePackageInformation)
502 """
503- args = {
504- 'exact_match': True,
505- 'source_name': self.pkgname,
506- 'order_by_date': True,
507- }
508
509 # we have the date of the commit too, so we can double-check
510 # that it matches
511@@ -598,7 +614,11 @@ class GitUbuntuSourceInformation(object):
512 # versions, as there are cases (clamav 0.91.2-3ubuntu2.2~feisty1
513 # in feisty-backports of a 'next' version being prior in the
514 # ordering)
515- spph = self.archive.getPublishedSources(**args)
516+ spph = self.archive.getPublishedSources(
517+ exact_match=True,
518+ source_name=self.pkgname,
519+ order_by_date=True,
520+ )
521 # Coherence check that the passed in srcpkg name has a publication
522 # history
523 if len(spph) == 0:
524@@ -606,47 +626,24 @@ class GitUbuntuSourceInformation(object):
525 self.pkgname, self.dist_name
526 )
527 return
528- if len(head_versions) > 0:
529- _spph = list()
530- for spphr in spph:
531- spi = GitUbuntuSourcePackageInformation(spphr, self.dist_name,
532- workdir=workdir)
533- if self._head_version_is_equal(head_versions, namespace, spi):
534- break
535- _spph.append(spphr)
536- spph = _spph
537- spph_iter = reversed(spph)
538-
539- seen_versions = list()
540- for srcpkg in spph_iter:
541- # skip carried forward versions between series/pocket
542- if (
543- srcpkg.source_package_version,
544- # Access through cache to avoid terrible performance
545- _get_cached_lp_link(srcpkg, 'distro_series'),
546- srcpkg.pocket,
547- srcpkg.date_published
548- ) \
549- in seen_versions:
550- continue
551- spi = self.get_corrected_spi(srcpkg, workdir)
552+ truncated_spph = list()
553+ for spphr in spph:
554+ spi = GitUbuntuSourcePackageInformation(
555+ spphr, self.dist_name, workdir=workdir
556+ )
557+ if self._spi_found_in_relevant_head(head_info, namespace, spi):
558+ break
559+ truncated_spph.append(spphr)
560+
561+ for spphr in reversed(truncated_spph):
562+ spi = self.get_corrected_spi(spphr, workdir)
563 if active_series_only and spi.series.name.lower() not in self.active_series_name_list:
564 continue
565- seen_versions.append(
566- (
567- srcpkg.source_package_version,
568- # Access through cache to avoid terrible performance
569- _get_cached_lp_link(srcpkg, 'distro_series'),
570- srcpkg.pocket,
571- srcpkg.date_published
572- )
573- )
574 yield spi
575- return
576
577 @staticmethod
578 def interleave_launchpad_versions_published_after(
579- gusi_head_versions_tuple_list,
580+ gusi_head_info_tuple_list,
581 namespace,
582 workdir=None,
583 active_series_only=False,
584@@ -662,16 +659,16 @@ class GitUbuntuSourceInformation(object):
585 simultaneously and interleaves the results so that the caller sees a
586 single combined sequence in ascending order of date_created. If
587 date_created is the same, results are provided in the order that the
588- distributions are specified in gusi_head_versions_tuple_list. In other
589+ distributions are specified in gusi_head_info_tuple_list. In other
590 words, the distribution ordering is the secondary sort key.
591
592 :param list(tuple(GitUbuntuSourceInformation, dict))
593- gusi_head_versions_tuple_list: a list of the parameters to use for
594- the underlying launchpad_versions_published_after() method call.
595- The GitUbuntuSourceInformation instance is the object to call the
596- method against. The second element of the tuple is the
597- head_versions parameter to pass to that method. The other
598- parameters are passed through to the underlying call as-is.
599+ gusi_head_info_tuple_list: a list of the parameters to use for the
600+ underlying launchpad_versions_published_after() method call. The
601+ GitUbuntuSourceInformation instance is the object to call the
602+ method against. The second element of the tuple is the head_info
603+ parameter to pass to that method. The other parameters are passed
604+ through to the underlying call as-is.
605 :param namespace: passed through to
606 launchpad_versions_published_after().
607 :param workdir: passed through to launchpad_versions_published_after().
608@@ -687,11 +684,11 @@ class GitUbuntuSourceInformation(object):
609 # be able to sort according to priority as a secondary key to ensure
610 # that if timestamps of a publication are the same across
611 # distributions, the distribution mentioned first in
612- # gusi_head_versions_tuple_list will appear in the results first.
613+ # gusi_head_info_tuple_list will appear in the results first.
614 dist_priority = {
615 gusi.dist.self_link: i
616- for i, (gusi, head_version)
617- in enumerate(gusi_head_versions_tuple_list)
618+ for i, (gusi, head_info)
619+ in enumerate(gusi_head_info_tuple_list)
620 }
621
622 # Now that we have a mapping of distribution_link to priority, the sort
623@@ -706,15 +703,15 @@ class GitUbuntuSourceInformation(object):
624 distribution_link = distro_series.distribution_link
625 return spi._spphr.date_created, dist_priority[distribution_link]
626
627- # Create one generator per gusi_head_versions_tuple_list entry.
628+ # Create one generator per gusi_head_info_tuple_list entry.
629 spi_generators_to_interleave = [
630 gusi.launchpad_versions_published_after(
631- head_versions=head_versions,
632+ head_info=head_info,
633 namespace=namespace,
634 workdir=workdir,
635 active_series_only=active_series_only,
636 )
637- for gusi, head_versions in gusi_head_versions_tuple_list
638+ for gusi, head_info in gusi_head_info_tuple_list
639 ]
640
641 # heapq.merge() will now do the interleaving. This relies on the
642diff --git a/gitubuntu/source_information_test.py b/gitubuntu/source_information_test.py
643index 7e88d77..e88b072 100644
644--- a/gitubuntu/source_information_test.py
645+++ b/gitubuntu/source_information_test.py
646@@ -4,6 +4,7 @@ from unittest.mock import Mock, sentinel
647
648 import keyring
649 import pytest
650+from gitubuntu.git_repository import HeadInfoItem
651 import gitubuntu.source_information as target
652 from gitubuntu.source_information import derive_codename_from_series
653 from gitubuntu import source_information
654@@ -255,12 +256,12 @@ def test_interleave_launchpad_versions_published_after(
655 ])
656 )
657
658- gusi_head_versions_tuple_list = [
659+ gusi_head_info_tuple_list = [
660 (debian, None),
661 (ubuntu, None),
662 ]
663 result = target.GitUbuntuSourceInformation.interleave_launchpad_versions_published_after(
664- gusi_head_versions_tuple_list=gusi_head_versions_tuple_list,
665+ gusi_head_info_tuple_list=gusi_head_info_tuple_list,
666 namespace=None,
667 )
668 expanded_result = list(result)
669@@ -274,3 +275,226 @@ def test_interleave_launchpad_versions_published_after(
670 )
671 for date_created, dist in expected
672 ]
673+
674+
675+@pytest.mark.parametrize(['dist_name', 'input_head_info', 'input_spph'], [
676+ # Empty case
677+ (
678+ 'debian', {}, [],
679+ ),
680+ # Single publication, nothing imported yet
681+ (
682+ 'debian',
683+ {},
684+ [
685+ ('sid', '1', 1, True),
686+ ],
687+ ),
688+ # Single publication, already imported
689+ (
690+ 'debian',
691+ {'sid': ('1', 1)},
692+ [
693+ ('sid', '1', 1, False),
694+ ],
695+ ),
696+ # Two identical publications, nothing imported yet
697+ (
698+ 'debian',
699+ {},
700+ [
701+ ('sid', '1', 2, True),
702+ ('bullseye', '1', 1, True),
703+ ],
704+ ),
705+ # Two identical publications, already imported
706+ (
707+ 'debian',
708+ {
709+ 'sid': ('1', 1),
710+ 'bullseye': ('1', 1),
711+ },
712+ [
713+ ('sid', '1', 2, True),
714+ ('bullseye', '1', 1, False),
715+ ],
716+ ),
717+ # Two versions in the normal proposed migration way, one imported
718+ (
719+ 'debian',
720+ {
721+ 'sid': ('1', 1),
722+ 'bullseye': ('1', 1),
723+ },
724+ [
725+ ('bullseye', '2', 4, True),
726+ ('sid', '2', 3, True),
727+ ('bullseye', '1', 2, True),
728+ ('sid', '1', 1, False),
729+ ],
730+ ),
731+ # Two versions in the normal proposed migration way, both imported
732+ (
733+ 'debian',
734+ {
735+ 'sid': ('2', 3),
736+ 'bullseye': ('2', 3),
737+ },
738+ [
739+ ('bullseye', '2', 4, True),
740+ ('sid', '2', 3, False),
741+ ('bullseye', '1', 2, False),
742+ ('sid', '1', 1, False),
743+ ],
744+ ),
745+ # Empty case
746+ (
747+ 'ubuntu', {}, [],
748+ ),
749+ # Single publication, nothing imported yet
750+ (
751+ 'ubuntu',
752+ {},
753+ [
754+ ('kinetic-proposed', '1', 1, True),
755+ ],
756+ ),
757+ # Single publication, already imported
758+ (
759+ 'ubuntu',
760+ {'kinetic-proposed': ('1', 1)},
761+ [
762+ ('kinetic-proposed', '1', 1, False),
763+ ],
764+ ),
765+ # Two identical publications, nothing imported yet
766+ (
767+ 'ubuntu',
768+ {},
769+ [
770+ ('kinetic', '1', 2, True),
771+ ('kinetic-proposed', '1', 1, True),
772+ ],
773+ ),
774+ # Two identical publications, already imported
775+ (
776+ 'ubuntu',
777+ {
778+ 'kinetic': ('1', 1),
779+ 'kinetic-proposed': ('1', 1),
780+ },
781+ [
782+ ('kinetic', '1', 2, True),
783+ ('kinetic-proposed', '1', 1, False),
784+ ],
785+ ),
786+ # Two versions in the normal proposed migration way, one imported
787+ (
788+ 'ubuntu',
789+ {
790+ 'kinetic-proposed': ('1', 1),
791+ 'kinetic': ('1', 1),
792+ },
793+ [
794+ ('kinetic', '2', 4, True),
795+ ('kinetic-proposed', '2', 3, True),
796+ ('kinetic', '1', 2, True),
797+ ('kinetic-proposed', '1', 1, False),
798+ ],
799+ ),
800+ # Two versions in the normal proposed migration way, both imported
801+ (
802+ 'ubuntu',
803+ {
804+ 'kinetic-proposed': ('2', 3),
805+ 'kinetic': ('2', 3),
806+ },
807+ [
808+ ('kinetic', '2', 4, True),
809+ ('kinetic-proposed', '2', 3, False),
810+ ('kinetic', '1', 2, False),
811+ ('kinetic-proposed', '1', 1, False),
812+ ],
813+ ),
814+])
815+def test_launchpad_versions_published_after(
816+ dist_name,
817+ input_head_info,
818+ input_spph,
819+):
820+ """If launchpad_versions_published_after() is shown some particular
821+ head_info and some set of publications in Launchpad, then it should present
822+ the expected subset of those publications for import.
823+
824+ :param str dist_name: either 'debian' or 'ubuntu'
825+ :param dict input_head_info: a dictionary. Keyed by the branch name without
826+ any namespace or prefix. The value is a tuple(str, int) of
827+ (package_version, timestamp).
828+ :param list input_spph: a list of mock publications from Launchpad in
829+ reverse chronological order by date_created. Each entry is a 4-tuple
830+ with the following fields:
831+ str: the name of the apt suite the publication was made in
832+ str: the package version string
833+ int: date_created
834+ bool: True if this entry should be included in the expected result
835+ """
836+ namespaced_input_head_info = {
837+ f'importer/{dist_name}/{suite}':
838+ HeadInfoItem(
839+ version=version,
840+ commit_time=commit_time,
841+ commit_id=None,
842+ )
843+ for suite, (version, commit_time) in input_head_info.items()
844+ }
845+
846+ dist = Mock()
847+ lp = Mock(distributions={dist_name: dist})
848+ getPublishedSources = dist.main_archive.getPublishedSources
849+
850+ getPublishedSources_return_value = []
851+ for i, (suite, version, date_created, _) in enumerate(input_spph):
852+ # Create a minimal fake record that would be returned by Launchpad's
853+ # getPublishedSources method on the Launchpad archive object. We have
854+ # to do this imperatively in order to set the name attributes
855+ # (https://docs.python.org/3/library/unittest.mock.html#mock-names-and-the-name-attribute).
856+ distribution_mock = Mock()
857+ distribution_mock.name = dist_name
858+ distro_series_mock = Mock(
859+ distribution=distribution_mock,
860+ distribution_link=dist_name,
861+ )
862+ distro_series_mock.name = suite.split('-')[0]
863+ spphr_mock = Mock(
864+ source_package_version=version,
865+ date_created=Mock(timestamp=Mock(return_value=date_created)),
866+ pocket=suite.split('-')[1] if '-' in suite else 'release',
867+ distro_series=distro_series_mock,
868+ distro_series_link=suite.split('-')[0],
869+ test_record_id=i,
870+ )
871+ getPublishedSources_return_value.append(spphr_mock)
872+ getPublishedSources.return_value = getPublishedSources_return_value
873+
874+ gusi = target.GitUbuntuSourceInformation(
875+ dist_name=dist_name,
876+ lp=lp,
877+ pkgname=sentinel.pkgname,
878+ )
879+
880+ actual_result = list(gusi.launchpad_versions_published_after(
881+ head_info=namespaced_input_head_info,
882+ namespace='importer',
883+ ))
884+
885+ getPublishedSources.assert_called_once_with(
886+ exact_match=True,
887+ source_name=sentinel.pkgname,
888+ order_by_date=True,
889+ )
890+
891+ assert list(reversed([
892+ i
893+ for i, (_, _, _, expected_in_result) in enumerate(input_spph)
894+ if expected_in_result
895+ ])) == [m._spphr.test_record_id for m in actual_result]

Subscribers

People subscribed via source and target branches