Merge lp:~cprov/launchpad/bug-1295318 into lp:launchpad

Proposed by Celso Providelo
Status: Merged
Merged at revision: 16987
Proposed branch: lp:~cprov/launchpad/bug-1295318
Merge into: lp:launchpad
Diff against target: 239 lines (+123/-10)
5 files modified
lib/lp/_schema_circular_imports.py (+7/-7)
lib/lp/soyuz/interfaces/archive.py (+9/-1)
lib/lp/soyuz/model/archive.py (+47/-1)
lib/lp/soyuz/tests/test_archive.py (+57/-0)
lib/lp/testing/factory.py (+3/-1)
To merge this branch: bzr merge lp:~cprov/launchpad/bug-1295318
Reviewer Review Type Date Requested Status
William Grant code Approve
Review via email: mp+216731@code.launchpad.net

Description of the change

Create a wrapper for IArchive.getPublishedSources() API calls. It is used for pre-caching related objects and simplifying security checks, resulting in less queries issued per call.

To post a comment you must log in.
Revision history for this message
William Grant (wgrant) wrote :

52 + def _getPublishedSources(name=None, version=None, status=None,
53 + distroseries=None, pocket=None,
54 + exact_match=False, created_since_date=None,
55 + component_name=None):
56 + """Wrapper for getPublishedSources.

This should be something like api_getPublishedSources to indicate its purpose.

92 + # Pre-cache related `PackageUpload`s, needed for security checks.

The PackageUploads are needed for the SPPH.packageupload field. We pre-populate the sources attribute of each PackageUpload to avoid the security check.

93 + pu_ids = set(map(attrgetter('packageuploadID'), rows))
94 + list(store.find(PackageUpload, PackageUpload.id.is_in(pu_ids)))
95 + # Pre-cache related `SourcePackageName`s, needed for the published
96 + # record title (materialized in the API).
97 + spn_ids = set(map(
98 + attrgetter('sourcepackagerelease.sourcepackagenameID'), rows))
99 + list(store.find(SourcePackageName,
100 + SourcePackageName.id.is_in(spn_ids)))

These should be able to be simplified using lp.services.database.bulk.load_related.

116 + get_property_cache(spr).published_archives = [self]
117 + # Fill `PackageUpload.sources`.
118 + if pub_source.packageupload is not None:
119 + upload = pub_source.packageupload
120 + get_property_cache(upload).sources = [
121 + pu_sources_dict.get(upload.id)]

Setting published_archives and sources to just the elements that we know are sufficient here is fine for security purposes. But the attributes aren't limited to security checks, so this may introduce confusing correctness bugs -- they'll magically be erroneously non-exhaustive when they happen to have been touched by this API method.

I'd consider renaming the attributes to make it clear that they're only good for security checks. This might mean creating a new cachedproperty around PU.sources, as that attribute is used for more than just security.

170 + def create_testing_ppa(self):

This isn't a legal method name.

review: Needs Fixing (code)
Revision history for this message
Celso Providelo (cprov) wrote :

Thanks for the comments, William.

Comments inline ...

On Wed, Apr 23, 2014 at 8:00 AM, William Grant <email address hidden> wrote:
>
> Review: Needs Fixing code
>
> 52 + def _getPublishedSources(name=None, version=None, status=None,
> 53 + distroseries=None, pocket=None,
> 54 + exact_match=False, created_since_date=None,
> 55 + component_name=None):
> 56 + """Wrapper for getPublishedSources.
>
> This should be something like api_getPublishedSources to indicate its purpose

Good point.

> 92 + # Pre-cache related `PackageUpload`s, needed for security checks.
>
> The PackageUploads are needed for the SPPH.packageupload field. We pre-populate the sources attribute of each PackageUpload to avoid the security check

Comment adjusted.

>
> 93 + pu_ids = set(map(attrgetter('packageuploadID'), rows))
> 94 + list(store.find(PackageUpload, PackageUpload.id.is_in(pu_ids)))
> 95 + # Pre-cache related `SourcePackageName`s, needed for the published
> 96 + # record title (materialized in the API).
> 97 + spn_ids = set(map(
> 98 + attrgetter('sourcepackagerelease.sourcepackagenameID'), rows))
> 99 + list(store.find(SourcePackageName,
> 100 + SourcePackageName.id.is_in(spn_ids)))
>
> These should be able to be simplified using lp.services.database.bulk.load_related.
>

Right, load_related() and load_referencing() helped with PU and PUS.
Unfortunately, they do not support nested attributes and could not be
used for SPN.

https://bugs.launchpad.net/launchpad/+bug/1311690

> 116 + get_property_cache(spr).published_archives = [self]
> 117 + # Fill `PackageUpload.sources`.
> 118 + if pub_source.packageupload is not None:
> 119 + upload = pub_source.packageupload
> 120 + get_property_cache(upload).sources = [
> 121 + pu_sources_dict.get(upload.id)]
>
> Setting published_archives and sources to just the elements that we know are sufficient here is fine for security purposes. But the attributes aren't limited to security checks, so this may introduce confusing correctness bugs -- they'll magically be erroneously non-exhaustive when they happen to have been touched by this API method.
>
> I'd consider renaming the attributes to make it clear that they're only good for security checks. This might mean creating a new cachedproperty around PU.sources, as that attribute is used for more than just security.
>

We discussed this further on IRC and caching PU.sources is fine and
correct because of coincidence of constraints, we only accept
single-source uploads and once created PU contents never change. So,
in this scenario, caching PU.sources outside the upload-processing
period will be always "correct".

That said, there are plans to redesign the package upload workflow
(and queue) entirely and this particular aspect might change and those
callsites will have to be revisited.

>
> 170 + def create_testing_ppa(self):
>
> This isn't a legal method name.

Right, fixed.

[]

--
Celso Providelo
<email address hidden>

Revision history for this message
William Grant (wgrant) wrote :

56 + """API wrapper for getPublishedSources.
57 +
58 + It loads additional related objects only needed in the API call
59 + context (i.e. security checks and entries marshalling).
60 + """

This will appear as the API documentation for getPublishedSources. You probably want to mention the APIness in a comment, not the docstring.

review: Approve (code)

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'lib/lp/_schema_circular_imports.py'
2--- lib/lp/_schema_circular_imports.py 2014-03-11 10:29:43 +0000
3+++ lib/lp/_schema_circular_imports.py 2014-04-24 02:14:17 +0000
4@@ -438,13 +438,13 @@
5 IArchive, 'getArchiveDependency', 'dependency', IArchive)
6 patch_entry_return_type(IArchive, 'getArchiveDependency', IArchiveDependency)
7 patch_plain_parameter_type(
8- IArchive, 'getPublishedSources', 'distroseries', IDistroSeries)
9+ IArchive, 'api_getPublishedSources', 'distroseries', IDistroSeries)
10 patch_collection_return_type(
11- IArchive, 'getPublishedSources', ISourcePackagePublishingHistory)
12-patch_choice_parameter_type(
13- IArchive, 'getPublishedSources', 'status', PackagePublishingStatus)
14-patch_choice_parameter_type(
15- IArchive, 'getPublishedSources', 'pocket', PackagePublishingPocket)
16+ IArchive, 'api_getPublishedSources', ISourcePackagePublishingHistory)
17+patch_choice_parameter_type(
18+ IArchive, 'api_getPublishedSources', 'status', PackagePublishingStatus)
19+patch_choice_parameter_type(
20+ IArchive, 'api_getPublishedSources', 'pocket', PackagePublishingPocket)
21 patch_plain_parameter_type(
22 IArchive, 'getAllPublishedBinaries', 'distroarchseries',
23 IDistroArchSeries)
24@@ -761,7 +761,7 @@
25 "getBuildSummariesForSourceIds", "getComponentsForQueueAdmin",
26 "getPackagesetsForSource", "getPackagesetsForSourceUploader",
27 "getPackagesetsForUploader", "getPermissionsForPerson",
28- "getPublishedSources", "getQueueAdminsForComponent",
29+ "api_getPublishedSources", "getQueueAdminsForComponent",
30 "getUploadersForComponent", "getUploadersForPackage",
31 "getUploadersForPackageset", "isSourceUploadAllowed",
32 "newComponentUploader", "newPackageUploader", "newPackagesetUploader",
33
34=== modified file 'lib/lp/soyuz/interfaces/archive.py'
35--- lib/lp/soyuz/interfaces/archive.py 2014-03-18 23:16:22 +0000
36+++ lib/lp/soyuz/interfaces/archive.py 2014-04-24 02:14:17 +0000
37@@ -405,6 +405,7 @@
38 :return: A IArchiveAuthToken, or None if the user has none.
39 """
40
41+ @export_operation_as('getPublishedSources')
42 @rename_parameters_as(name="source_name", distroseries="distro_series")
43 @operation_parameters(
44 name=TextLine(title=_("Source package name"), required=False),
45@@ -439,9 +440,16 @@
46 )
47 # Really returns ISourcePackagePublishingHistory, see below for
48 # patch to avoid circular import.
49- @call_with(eager_load=True)
50 @operation_returns_collection_of(Interface)
51 @export_read_operation()
52+ def api_getPublishedSources(name=None, version=None, status=None,
53+ distroseries=None, pocket=None,
54+ exact_match=False, created_since_date=None,
55+ component_name=None):
56+ """All `ISourcePackagePublishingHistory` target to this archive."""
57+ # It loads additional related objects only needed in the API call
58+ # context (i.e. security checks and entries marshalling).
59+
60 def getPublishedSources(name=None, version=None, status=None,
61 distroseries=None, pocket=None,
62 exact_match=False, created_since_date=None,
63
64=== modified file 'lib/lp/soyuz/model/archive.py'
65--- lib/lp/soyuz/model/archive.py 2013-11-25 07:36:12 +0000
66+++ lib/lp/soyuz/model/archive.py 2014-04-24 02:14:17 +0000
67@@ -77,7 +77,10 @@
68 from lp.registry.model.sourcepackagename import SourcePackageName
69 from lp.registry.model.teammembership import TeamParticipation
70 from lp.services.config import config
71-from lp.services.database.bulk import load_related
72+from lp.services.database.bulk import (
73+ load_referencing,
74+ load_related,
75+ )
76 from lp.services.database.constants import UTC_NOW
77 from lp.services.database.datetimecol import UtcDateTimeCol
78 from lp.services.database.decoratedresultset import DecoratedResultSet
79@@ -490,6 +493,49 @@
80 return getUtility(IBuildFarmJobSet).getBuildsForArchive(
81 self, status=build_state)
82
83+ def api_getPublishedSources(self, name=None, version=None, status=None,
84+ distroseries=None, pocket=None,
85+ exact_match=False, created_since_date=None,
86+ component_name=None):
87+ """See `IArchive`."""
88+ # 'eager_load' and 'include_removed' arguments are always True
89+ # for API calls.
90+ published_sources = self.getPublishedSources(
91+ name, version, status, distroseries, pocket, exact_match,
92+ created_since_date, True, component_name, True)
93+
94+ def load_api_extra_objects(rows):
95+ """Load extra related-objects needed by API calls."""
96+ # Pre-cache related `PackageUpload`s and `PackageUploadSource`s
97+ # which are immediatelly used in the API context for checking
98+ # permissions on the returned entries.
99+ uploads = load_related(PackageUpload, rows, ['packageuploadID'])
100+ pu_sources = load_referencing(
101+ PackageUploadSource, uploads, ['packageuploadID'])
102+ for pu_source in pu_sources:
103+ upload = pu_source.packageupload
104+ get_property_cache(upload).sources = [pu_source]
105+ # Pre-cache `SourcePackageName`s which are immediatelly used
106+ # in the API context for materializing the returned entries.
107+ # XXX cprov 2014-04-23: load_related() does not support
108+ # nested attributes as foreign keys (uses getattr instead of
109+ # attrgetter).
110+ spn_ids = set(map(
111+ attrgetter('sourcepackagerelease.sourcepackagenameID'), rows))
112+ list(Store.of(self).find(
113+ SourcePackageName, SourcePackageName.id.is_in(spn_ids)))
114+ # Bypass PackageUpload security-check query by caching
115+ # `SourcePackageRelease.published_archives` results.
116+ # All published sources are visible to the user, since they
117+ # are published in the context archive. No need to check
118+ # additional archives the source is published.
119+ for pub_source in rows:
120+ spr = pub_source.sourcepackagerelease
121+ get_property_cache(spr).published_archives = [self]
122+
123+ return DecoratedResultSet(published_sources,
124+ pre_iter_hook=load_api_extra_objects)
125+
126 def getPublishedSources(self, name=None, version=None, status=None,
127 distroseries=None, pocket=None,
128 exact_match=False, created_since_date=None,
129
130=== modified file 'lib/lp/soyuz/tests/test_archive.py'
131--- lib/lp/soyuz/tests/test_archive.py 2013-11-21 03:37:58 +0000
132+++ lib/lp/soyuz/tests/test_archive.py 2014-04-24 02:14:17 +0000
133@@ -13,6 +13,7 @@
134 from pytz import UTC
135 from testtools.matchers import (
136 DocTestMatches,
137+ LessThan,
138 MatchesRegex,
139 MatchesStructure,
140 )
141@@ -41,6 +42,7 @@
142 from lp.services.database.sqlbase import sqlvalues
143 from lp.services.job.interfaces.job import JobStatus
144 from lp.services.propertycache import clear_property_cache
145+from lp.services.webapp.interfaces import OAuthPermission
146 from lp.services.worlddata.interfaces.country import ICountrySet
147 from lp.soyuz.adapters.archivedependencies import (
148 get_sources_list_for_building,
149@@ -102,6 +104,9 @@
150 LaunchpadFunctionalLayer,
151 LaunchpadZopelessLayer,
152 )
153+from lp.testing.matchers import HasQueryCount
154+from lp.testing.pages import webservice_for_person
155+from lp.testing._webservice import QueryCollector
156
157
158 class TestGetPublicationsInArchive(TestCaseWithFactory):
159@@ -2126,6 +2131,58 @@
160 self.assertEqual('universe', filtered.component.name)
161
162
163+class GetPublishedSourcesWebServiceTests(TestCaseWithFactory):
164+
165+ layer = LaunchpadFunctionalLayer
166+
167+ def setUp(self):
168+ super(GetPublishedSourcesWebServiceTests, self).setUp()
169+
170+ def createTestingPPA(self):
171+ """Creates and populates a PPA for API performance tests.
172+
173+ Creates a public PPA and populates it with 5 distinct source
174+ publications with corresponding `PackageUpload` records.
175+ """
176+ ppa = self.factory.makeArchive(
177+ name='ppa', purpose=ArchivePurpose.PPA)
178+ distroseries = self.factory.makeDistroSeries()
179+ # XXX cprov 2014-04-22: currently the target archive owner cannot
180+ # 'addSource' to a `PackageUpload` ('launchpad.Edit'). It seems
181+ # too restrive to me.
182+ from zope.security.proxy import removeSecurityProxy
183+ with person_logged_in(ppa.owner):
184+ for i in range(5):
185+ upload = self.factory.makePackageUpload(
186+ distroseries=distroseries, archive=ppa)
187+ pub = self.factory.makeSourcePackagePublishingHistory(
188+ archive=ppa, distroseries=distroseries,
189+ creator=ppa.owner, spr_creator=ppa.owner,
190+ maintainer=ppa.owner, packageupload=upload)
191+ naked_upload = removeSecurityProxy(upload)
192+ naked_upload.addSource(pub.sourcepackagerelease)
193+ return ppa
194+
195+ def test_query_count(self):
196+ # IArchive.getPublishedSources() webservice is exposed
197+ # via a wrapper to improving performance (by reducing the
198+ # number of queries issued)
199+ ppa = self.createTestingPPA()
200+ ppa_url = '/~{}/+archive/ppa'.format(ppa.owner.name)
201+ webservice = webservice_for_person(
202+ ppa.owner, permission=OAuthPermission.READ_PRIVATE)
203+
204+ collector = QueryCollector()
205+ collector.register()
206+ self.addCleanup(collector.unregister)
207+
208+ response = webservice.named_get(ppa_url, 'getPublishedSources')
209+
210+ self.assertEqual(200, response.status)
211+ self.assertEqual(5, response.jsonBody()['total_size'])
212+ self.assertThat(collector, HasQueryCount(LessThan(28)))
213+
214+
215 class TestCopyPackage(TestCaseWithFactory):
216
217 layer = DatabaseFunctionalLayer
218
219=== modified file 'lib/lp/testing/factory.py'
220--- lib/lp/testing/factory.py 2014-04-23 10:50:06 +0000
221+++ lib/lp/testing/factory.py 2014-04-24 02:14:17 +0000
222@@ -3626,6 +3626,7 @@
223 scheduleddeletiondate=None,
224 ancestor=None,
225 creator=None,
226+ packageupload=None,
227 spr_creator=None,
228 **kwargs):
229 """Make a `SourcePackagePublishingHistory`.
230@@ -3684,7 +3685,8 @@
231 spph = getUtility(IPublishingSet).newSourcePublication(
232 archive, sourcepackagerelease, distroseries,
233 sourcepackagerelease.component, sourcepackagerelease.section,
234- pocket, ancestor=ancestor, creator=creator)
235+ pocket, ancestor=ancestor, creator=creator,
236+ packageupload=packageupload)
237
238 naked_spph = removeSecurityProxy(spph)
239 naked_spph.status = status