Merge lp:~benji/launchpad/bug-781600 into lp:launchpad

Proposed by Benji York
Status: Merged
Approved by: Benji York
Approved revision: no longer in the source branch.
Merged at revision: 13439
Proposed branch: lp:~benji/launchpad/bug-781600
Merge into: lp:launchpad
Diff against target: 219 lines (+147/-5)
3 files modified
lib/lp/registry/interfaces/distribution.py (+24/-0)
lib/lp/registry/model/distribution.py (+33/-2)
lib/lp/registry/tests/test_distro_webservice.py (+90/-3)
To merge this branch: bzr merge lp:~benji/launchpad/bug-781600
Reviewer Review Type Date Requested Status
Francis J. Lacoste (community) Approve
Review via email: mp+67704@code.launchpad.net

Commit message

[r=flacoste][bug=781600] Provide a web service API to retrieve the branch tip revisions (optionally since a given date) and official series information for a given distribution.

Description of the change

Bug 781600 is about providing a web service API to retrieve the branch
tip revisions (optionally since a given date) and official series
information for a given distribution. This branch adds a getBranchTips
named operation to distributions to provide that functionality.

Preimplemenation discussion was had with Francis.

The implementation is a relatively straight-forward SQL query who's
results are lightly post-processed to provide a slightly nicer (and
requested) structure for the return value.

Tests were added to lib/lp/registry/tests/test_distro_webservice.py.
Specifically the TestGetBranchTips class.

Lint: some pre-existing lint fixed, now lint-free.

To post a comment you must log in.
Revision history for this message
Francis J. Lacoste (flacoste) wrote :
Download full text (5.9 KiB)

> === modified file 'lib/lp/registry/interfaces/distribution.py'

> @operation_parameters(
> + since=Datetime(
> + title=_("Time of last change"),
> + description=_(
> + "Return branches that have new tips since this timestamp."),
> + required=False))
> + @export_operation_as(name="getBranchTips")
> + @export_read_operation()
> + @operation_for_version('devel')
> + def getBranchTips(since):
> + """Return a collection of branches which have new tips since a date.
> + """

I'd suggest improving the docstring. Something along the lines of:

    Return a list of branches information which have new tips since a date.

    Each branch information is a tuple of (branch_unique_name, tip_revision,
    (official_series*)).

    So for each branch in the distribution, you'll get the branch unique name,
    the revision id of tip, and if the branch is official for some series, the
    list of series name.

    :param since: If specified, only returns branch modified since that date
        time.

You probably need to make since=None since it's not a required parameter.

> === modified file 'lib/lp/registry/model/distribution.py'

> """See `IBugTarget`."""
> return get_bug_tags("BugTask.distribution = %s" % sqlvalues(self))
>
> + def getBranchTips(self, since=None):
> + """See `IDistribution`."""
> + query = """
> + SELECT unique_name, last_scanned_id,
> + SeriesSourcePackageBranch.distroseries FROM Branch
> + JOIN DistroSeries ON Branch.distroseries = DistroSeries.id
> + JOIN Distribution ON DistroSeries.distribution = Distribution.id
> + JOIN SeriesSourcePackageBranch ON
> + Branch.id = SeriesSourcePackageBranch.branch
> + WHERE Distribution.name = %s""" % sqlvalues(self.name)

Not all branches will be official, so you want to use a LEFT OUTER JOIN on
both SeriesSourcePackageBranch and DistroSeries.

Distro series id is pretty meangingless, we should return their name.
DistroSeries.name instead of SeriesSourcePackageBranch.distroseries

> +
> + if since is not None:
> + query += (
> + ' AND branch.last_scanned > %s' % sqlvalues(since))
> +
> + query += ' ORDER BY unique_name, last_scanned_id;'
> +
> + data = list(Store.of(self).execute(query))
> +
> + # Group on location (unique_name) and revision (last_scanned_id).
> + results = []
> + for key, group in itertools.groupby(data, itemgetter(0, 1)):
> + results.append(list(key))
> + # Pull out all the official series IDs and append them as a list
> + # to the end of the current record.
> + results[-1].append(map(itemgetter(-1), group))
> + return results
> +

That last grouping will need to be modified to handle the NULL case.

It's kind of a shame that we have to materialize the whole results set here,
since for huge size, it will be batched anyway by the web service code. Would
it make sense to use an generator here? Maybe not, since we'd still retrieve
all results for later batche...

Read more...

review: Needs Fixing
Revision history for this message
Benji York (benji) wrote :
Download full text (7.7 KiB)

On Tue, Jul 12, 2011 at 3:34 PM, Francis J. Lacoste
<email address hidden> wrote:
> Review: Needs Fixing

Thanks very much for the review. I'm certainly not knowledgeable of
this area of LP yet and can use the help.

>> === modified file 'lib/lp/registry/interfaces/distribution.py'
>
>>      @operation_parameters(
>> +        since=Datetime(
>> +            title=_("Time of last change"),
>> +            description=_(
>> +                "Return branches that have new tips since this timestamp."),
>> +            required=False))
>> +    @export_operation_as(name="getBranchTips")
>> +    @export_read_operation()
>> +    @operation_for_version('devel')
>> +    def getBranchTips(since):
>> +        """Return a collection of branches which have new tips since a date.
>> +        """
>
> I'd suggest improving the docstring. Something along the lines of:

Looks good. Done.

> You probably need to make since=None since it's not a required parameter.

It seems odd to me to put the default in the interface in addition to in
the implementation. It has no effect and I worry that the DRY violation
could result in a loss of synchronization with the real default and
result in confusion. Also, the "required=False" bit is what indicates
that "since" is optional, the interface reader shouldn't care what
internal value we use to signify that it wasn't provided.

That being said, I'm only -0 so I'll add it given a bit of a push.

>> === modified file 'lib/lp/registry/model/distribution.py'
>
>>          """See `IBugTarget`."""
>>          return get_bug_tags("BugTask.distribution = %s" % sqlvalues(self))
>>
>> +    def getBranchTips(self, since=None):
>> +        """See `IDistribution`."""
>> +        query = """
>> +            SELECT unique_name, last_scanned_id,
>> +                SeriesSourcePackageBranch.distroseries FROM Branch
>> +            JOIN DistroSeries ON Branch.distroseries = DistroSeries.id
>> +            JOIN Distribution ON DistroSeries.distribution = Distribution.id
>> +            JOIN SeriesSourcePackageBranch ON
>> +                Branch.id = SeriesSourcePackageBranch.branch
>> +            WHERE Distribution.name = %s""" % sqlvalues(self.name)
>
> Not all branches will be official, so you want to use a LEFT OUTER JOIN on
> both SeriesSourcePackageBranch and DistroSeries.

I fixed that and added a test as mentioned below.

> Distro series id is pretty meangingless, we should return their name.
> DistroSeries.name instead of SeriesSourcePackageBranch.distroseries

Done. I had to tweak the query more than just substituting
DistroSeries.name but not too much.

>> +
>> +        if since is not None:
>> +            query += (
>> +                ' AND branch.last_scanned > %s' % sqlvalues(since))
>> +
>> +        query += ' ORDER BY unique_name, last_scanned_id;'
>> +
>> +        data = list(Store.of(self).execute(query))
>> +
>> +        # Group on location (unique_name) and revision (last_scanned_id).
>> +        results = []
>> +        for key, group in itertools.groupby(data, itemgetter(0, 1)):
>> +            results.append(list(key))
>> +            # Pull out all the official series IDs and append them as a list
>> +  ...

Read more...

Revision history for this message
Robert Collins (lifeless) wrote :

On Thu, Jul 14, 2011 at 8:48 AM, Benji York <email address hidden> wrote:
>
>> It's kind of a shame that we have to materialize the whole results set here,
>> since for huge size, it will be batched anyway by the web service code. Would
>> it make sense to use an generator here? Maybe not, since we'd still retrieve
>> all results for later batches.
>
> Given that we use ORDER BY, I suspect the only savings would be not
> transmitting the non-batched results from the DB server to the app
> server.  However, making the method a generator might still be a win
> because even though they will likely come back to get the rest, that
> request may well hit a different app server.  (I don't think we do any
> sort of request affinity but would like to know if we do.)
>
> Later... Nope, that didn't work.  lazr.restful exploded because it
> couldn't adapt the generator to IFiniteSequence.  I would have expected
> returning a generator to work, but I can't find any other instances in
> the code base.  I've left it as a non-generator for the time being but
> would appreciate any knowledge that can be shared on the topic.

So, order by does not on its own mean that the whole result set is
materialised; in fact, if there is an index that can satisfy the
ordering, and its limited, pg seems to prioritise it very highly.

It looks like the grouping could be done in the DB ?
If so then returning the result set may make sense.

Another alternative is to use the new batchnavigator 1.2.5 facilities
to do batch memos rather than slicing. However that may require some
glue into the lazr stuff.. OTOH you're pretty well placed for that
sort of surgery :)

-Rob

Revision history for this message
Francis J. Lacoste (flacoste) wrote :
Download full text (6.2 KiB)

On 11-07-13 04:48 PM, Benji York wrote:
> On Tue, Jul 12, 2011 at 3:34 PM, Francis J. Lacoste

>
>> You probably need to make since=None since it's not a required parameter.
>
> It seems odd to me to put the default in the interface in addition to in
> the implementation. It has no effect and I worry that the DRY violation
> could result in a loss of synchronization with the real default and
> result in confusion. Also, the "required=False" bit is what indicates
> that "since" is optional, the interface reader shouldn't care what
> internal value we use to signify that it wasn't provided.
>
> That being said, I'm only -0 so I'll add it given a bit of a push.

Well, interfaces and implementation are usually not on the side of DRY
:-) To me, if the interface doesn't have a default parameter, it usually
means that client should expect to have to provide a parameter. Again,
the fact that this is mainly for webservice consumption makes this a
little less important. Anyway, Do as you will!

>
>>> === modified file 'lib/lp/registry/model/distribution.py'
>>
>>> """See `IBugTarget`."""
>>> return get_bug_tags("BugTask.distribution = %s" % sqlvalues(self))
>>>
>>> + def getBranchTips(self, since=None):
>>> + """See `IDistribution`."""
>>> + query = """
>>> + SELECT unique_name, last_scanned_id,
>>> + SeriesSourcePackageBranch.distroseries FROM Branch
>>> + JOIN DistroSeries ON Branch.distroseries = DistroSeries.id
>>> + JOIN Distribution ON DistroSeries.distribution = Distribution.id
>>> + JOIN SeriesSourcePackageBranch ON
>>> + Branch.id = SeriesSourcePackageBranch.branch
>>> + WHERE Distribution.name = %s""" % sqlvalues(self.name)
>>
>> Not all branches will be official, so you want to use a LEFT OUTER JOIN on
>> both SeriesSourcePackageBranch and DistroSeries.
>
> I fixed that and added a test as mentioned below.
>
>> Distro series id is pretty meangingless, we should return their name.
>> DistroSeries.name instead of SeriesSourcePackageBranch.distroseries
>
> Done. I had to tweak the query more than just substituting
> DistroSeries.name but not too much.

Some comment on your new query:

> + query = """
> + SELECT unique_name, last_scanned_id, SPBDS.name FROM Branch
> + LEFT OUTER JOIN DistroSeries
> + ON Branch.distroseries = DistroSeries.id

You probably want a normal join here as you don't expect any branch here
without a distroseries. (And this join is just to get at the
distribution right?)

> + LEFT OUTER JOIN SeriesSourcePackageBranch
> + ON Branch.id = SeriesSourcePackageBranch.branch
> + JOIN Distribution
> + ON DistroSeries.distribution = Distribution.id
> + LEFT OUTER JOIN DistroSeries SPBDS -- (SourcePackageBranchDistroSeries)
> + ON SeriesSourcePackageBranch.distroseries = SPBDS.id
> + WHERE Distribution.name = %s""" % sqlvalues(self.name)

You don't have to join the distribution here.

You could use

DistroSeries.distribution = self.id

The SPBDDS left outer join seems right....

Read more...

review: Approve
Revision history for this message
Benji York (benji) wrote :
Download full text (4.5 KiB)

On Wed, Jul 13, 2011 at 6:13 PM, Francis J. Lacoste
<email address hidden> wrote:
> Review: Approve
> On 11-07-13 04:48 PM, Benji York wrote:
>> On Tue, Jul 12, 2011 at 3:34 PM, Francis J. Lacoste

> Some comment on your new query:
>
>> +        query = """
>> +            SELECT unique_name, last_scanned_id, SPBDS.name FROM Branch
>> +            LEFT OUTER JOIN DistroSeries
>> +                ON Branch.distroseries = DistroSeries.id
>
> You probably want a normal join here as you don't expect any branch here
> without a distroseries. (And this join is just to get at the
> distribution right?)

Good catch. Fixed.

>> +            LEFT OUTER JOIN SeriesSourcePackageBranch
>> +                ON Branch.id = SeriesSourcePackageBranch.branch
>> +            JOIN Distribution
>> +                ON DistroSeries.distribution = Distribution.id
>> +            LEFT OUTER JOIN DistroSeries SPBDS -- (SourcePackageBranchDistroSeries)
>> +                ON SeriesSourcePackageBranch.distroseries = SPBDS.id
>> +            WHERE Distribution.name = %s""" % sqlvalues(self.name)
>
> You don't have to join the distribution here.
>
> You could use
>
> DistroSeries.distribution = self.id

Changed.

>>> It's kind of a shame that we have to materialize the whole results set here,
>>> since for huge size, it will be batched anyway by the web service code. Would
>>> it make sense to use an generator here? Maybe not, since we'd still retrieve
>>> all results for later batches.
>>
>> Given that we use ORDER BY, I suspect the only savings would be not
>> transmitting the non-batched results from the DB server to the app
>> server.  However, making the method a generator might still be a win
>> because even though they will likely come back to get the rest, that
>> request may well hit a different app server.  (I don't think we do any
>> sort of request affinity but would like to know if we do.)
>
> We don't and wouldn't really matter since each API request results in a
> different DB transaction.

Indeed. My history with ZODB is showing here. Where's the object cache
when you need it. ;)

>> Later... Nope, that didn't work.  lazr.restful exploded because it
>> couldn't adapt the generator to IFiniteSequence.  I would have expected
>> returning a generator to work, but I can't find any other instances in
>> the code base.  I've left it as a non-generator for the time being but
>> would appreciate any knowledge that can be shared on the topic.
>
> Yeah, that's fine. The direction Robert suggested might work, but not
> sure how easy/hard it would be.

Given the number of WIP items I have at the moment I really need to move
on so I'm not going to investigate further.

>>>> === modified file 'lib/lp/registry/tests/test_distro_webservice.py'
>>>
>
>>>
>>> You can use makeRelatedBranches(reference_branch=self.branch) to make it an
>>> official source package. Or use
>>> makeRelatedBranchesForSourcePackage(source_package) to create the branch and
>>> make it official in one call.
>>
>> I tried to use makeRelatedBranchesForSourcePackage and
>> makeRelatedBranches but the need to have two different distro series
>> releases (series_1 and series_2 in the original cod...

Read more...

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'lib/lp/registry/interfaces/distribution.py'
2--- lib/lp/registry/interfaces/distribution.py 2011-06-16 21:17:37 +0000
3+++ lib/lp/registry/interfaces/distribution.py 2011-07-14 14:11:23 +0000
4@@ -28,6 +28,7 @@
5 export_operation_as,
6 export_read_operation,
7 exported,
8+ operation_for_version,
9 operation_parameters,
10 operation_returns_collection_of,
11 operation_returns_entry,
12@@ -392,6 +393,29 @@
13 """
14
15 @operation_parameters(
16+ since=Datetime(
17+ title=_("Time of last change"),
18+ description=_(
19+ "Return branches that have new tips since this timestamp."),
20+ required=False))
21+ @export_operation_as(name="getBranchTips")
22+ @export_read_operation()
23+ @operation_for_version('devel')
24+ def getBranchTips(since):
25+ """Return a list of branches which have new tips since a date.
26+
27+ Each branch information is a tuple of (branch_unique_name,
28+ tip_revision, (official_series*)).
29+
30+ So for each branch in the distribution, you'll get the branch unique
31+ name, the revision id of tip, and if the branch is official for some
32+ series, the list of series name.
33+
34+ :param since: If specified, limits results to branches modified since
35+ that date and time.
36+ """
37+
38+ @operation_parameters(
39 name=TextLine(title=_("Name"), required=True))
40 @operation_returns_entry(IDistributionMirror)
41 @export_read_operation()
42
43=== modified file 'lib/lp/registry/model/distribution.py'
44--- lib/lp/registry/model/distribution.py 2011-06-24 05:26:37 +0000
45+++ lib/lp/registry/model/distribution.py 2011-07-14 14:11:23 +0000
46@@ -10,7 +10,8 @@
47 'DistributionSet',
48 ]
49
50-from operator import attrgetter
51+from operator import attrgetter, itemgetter
52+import itertools
53
54 from sqlobject import (
55 BoolCol,
56@@ -654,6 +655,36 @@
57 """See `IBugTarget`."""
58 return get_bug_tags("BugTask.distribution = %s" % sqlvalues(self))
59
60+ def getBranchTips(self, since=None):
61+ """See `IDistribution`."""
62+ query = """
63+ SELECT unique_name, last_scanned_id, SPBDS.name FROM Branch
64+ JOIN DistroSeries
65+ ON Branch.distroseries = DistroSeries.id
66+ LEFT OUTER JOIN SeriesSourcePackageBranch
67+ ON Branch.id = SeriesSourcePackageBranch.branch
68+ LEFT OUTER JOIN DistroSeries SPBDS
69+ -- (SPDBS stands for Source Package Branch Distro Series)
70+ ON SeriesSourcePackageBranch.distroseries = SPBDS.id
71+ WHERE DistroSeries.distribution = %s""" % sqlvalues(self.id)
72+
73+ if since is not None:
74+ query += (
75+ ' AND branch.last_scanned > %s' % sqlvalues(since))
76+
77+ query += ' ORDER BY unique_name, last_scanned_id;'
78+
79+ data = Store.of(self).execute(query)
80+
81+ result = []
82+ # Group on location (unique_name) and revision (last_scanned_id).
83+ for key, group in itertools.groupby(data, itemgetter(0, 1)):
84+ result.append(list(key))
85+ # Pull out all the official series names and append them as a list
86+ # to the end of the current record, removing Nones from the list.
87+ result[-1].append(filter(None, map(itemgetter(-1), group)))
88+ return result
89+
90 def getMirrorByName(self, name):
91 """See `IDistribution`."""
92 return Store.of(self).find(
93@@ -1422,7 +1453,7 @@
94 # the sourcepackagename from that.
95 bpph = IStore(BinaryPackagePublishingHistory).find(
96 BinaryPackagePublishingHistory,
97- # See comment above for rationale for using an extra query
98+ # See comment above for rationale for using an extra query
99 # instead of an inner join. (Bottom line, it would time out
100 # otherwise.)
101 BinaryPackagePublishingHistory.archiveID.is_in(
102
103=== modified file 'lib/lp/registry/tests/test_distro_webservice.py'
104--- lib/lp/registry/tests/test_distro_webservice.py 2011-03-23 16:28:51 +0000
105+++ lib/lp/registry/tests/test_distro_webservice.py 2011-07-14 14:11:23 +0000
106@@ -3,12 +3,21 @@
107
108 __metaclass__ = type
109
110+from datetime import datetime
111+
112+import pytz
113 from launchpadlib.errors import Unauthorized
114
115-from zope.security.management import endInteraction
116-from zope.security.proxy import removeSecurityProxy
117+from zope.security.management import (
118+ endInteraction,
119+ newInteraction,
120+ )
121
122 from canonical.testing.layers import DatabaseFunctionalLayer
123+from lp.code.model.seriessourcepackagebranch import (
124+ SeriesSourcePackageBranchSet,
125+ )
126+from lp.registry.interfaces.pocket import PackagePublishingPocket
127 from lp.testing import (
128 api_url,
129 launchpadlib_for,
130@@ -21,10 +30,88 @@
131
132 layer = DatabaseFunctionalLayer
133
134- def test_attempt_to_write_data_without_permission_gives_Unauthorized(self):
135+ def test_write_without_permission_gives_Unauthorized(self):
136 distro = self.factory.makeDistribution()
137 endInteraction()
138 lp = launchpadlib_for("anonymous-access")
139 lp_distro = lp.load(api_url(distro))
140 lp_distro.active = False
141 self.assertRaises(Unauthorized, lp_distro.lp_save)
142+
143+
144+class TestGetBranchTips(TestCaseWithFactory):
145+ """Test the getBranchTips method and it's exposure to the web service."""
146+
147+ layer = DatabaseFunctionalLayer
148+
149+ def setUp(self):
150+ super(TestGetBranchTips, self).setUp()
151+ self.distro = self.factory.makeDistribution()
152+ series_1 = self.series_1 = self.factory.makeDistroRelease(self.distro)
153+ series_2 = self.series_2 = self.factory.makeDistroRelease(self.distro)
154+ source_package = self.factory.makeSourcePackage(distroseries=series_1)
155+ branch = self.factory.makeBranch(sourcepackage=source_package)
156+ unofficial_branch = self.factory.makeBranch(sourcepackage=source_package)
157+ registrant = self.factory.makePerson()
158+ now = datetime.now(pytz.UTC)
159+ sourcepackagename = self.factory.makeSourcePackageName()
160+ SeriesSourcePackageBranchSet.new(
161+ series_1, PackagePublishingPocket.RELEASE, sourcepackagename,
162+ branch, registrant, now)
163+ SeriesSourcePackageBranchSet.new(
164+ series_2, PackagePublishingPocket.RELEASE, sourcepackagename,
165+ branch, registrant, now)
166+ self.factory.makeRevisionsForBranch(branch)
167+ self.branch_name = branch.unique_name
168+ self.unofficial_branch_name = unofficial_branch.unique_name
169+ self.branch_last_scanned_id = branch.last_scanned_id
170+ endInteraction()
171+ self.lp = launchpadlib_for("anonymous-access")
172+ self.lp_distro = self.lp.distributions[self.distro.name]
173+
174+ def test_structure(self):
175+ """The structure of the results is what we expect."""
176+ # The results should be structured as a list of
177+ # (location, tip revision ID, [official series, official series, ...])
178+ item = self.lp_distro.getBranchTips()[0]
179+ self.assertEqual(item[0], self.branch_name)
180+ self.assertTrue(item[1], self.branch_last_scanned_id)
181+ self.assertEqual(
182+ sorted(item[2]),
183+ [self.series_1.name, self.series_2.name])
184+
185+ def test_same_results(self):
186+ """Calling getBranchTips directly matches calling it via the API."""
187+ # The web service transmutes tuples into lists, so we have to do the
188+ # same to the results of directly calling getBranchTips.
189+ listified = [list(x) for x in self.distro.getBranchTips()]
190+ self.assertEqual(listified, self.lp_distro.getBranchTips())
191+
192+ def test_revisions(self):
193+ """If a branch has revisions then the most recent one is returned."""
194+ revision = self.lp_distro.getBranchTips()[0][1]
195+ self.assertNotEqual(None, revision)
196+
197+ def test_since(self):
198+ """If "since" is given, return branches with new tips since then."""
199+ # There is at least one branch with a tip since the year 2000.
200+ self.assertNotEqual(0, len(self.lp_distro.getBranchTips(
201+ since=datetime(2000, 1, 1))))
202+ # There are no branches with a tip since the year 3000.
203+ self.assertEqual(0, len(self.lp_distro.getBranchTips(
204+ since=datetime(3000, 1, 1))))
205+
206+ def test_series(self):
207+ """The official series are included in the data."""
208+ actual_series_names = sorted([self.series_1.name, self.series_2.name])
209+ returned_series_names = sorted(self.lp_distro.getBranchTips()[0][-1])
210+ self.assertEqual(actual_series_names, returned_series_names)
211+
212+ def test_unofficial_branch(self):
213+ """Not all branches are official."""
214+ # If a branch isn't official, the last skanned ID will be None and the
215+ # official distro series list will be empty.
216+ tips = self.lp_distro.getBranchTips()[1]
217+ self.assertEqual(tips[0], self.unofficial_branch_name)
218+ self.assertEqual(tips[1], None)
219+ self.assertEqual(tips[2], [])