Merge lp:~adeuring/launchpad/bug-739065 into lp:launchpad

Proposed by Abel Deuring
Status: Merged
Approved by: Abel Deuring
Approved revision: no longer in the source branch.
Merged at revision: 13098
Proposed branch: lp:~adeuring/launchpad/bug-739065
Merge into: lp:launchpad
Diff against target: 360 lines (+251/-36)
4 files modified
lib/lp/soyuz/model/distributionsourcepackagerelease.py (+65/-31)
lib/lp/soyuz/model/distroseriesbinarypackage.py (+6/-4)
lib/lp/soyuz/tests/test_distributionsourcepackagerelease.py (+152/-0)
lib/lp/soyuz/tests/test_distroseriesbinarypackage.py (+28/-1)
To merge this branch: bzr merge lp:~adeuring/launchpad/bug-739065
Reviewer Review Type Date Requested Status
Gavin Panella (community) Approve
Review via email: mp+61555@code.launchpad.net

Description of the change

This branch is a first step to fix bug 739065: SourcePackage:+index timeouts

The OOPS reports from that bug show that two SQL queries which return only one row each are repeated quite often; this branch lets the query in DistributionSourcePackageRelease.sample_binary_packages prejoin DistroSeriesPackageCache records.

Since it can happen that no DistroSeriesPackageCache exists for a given DistroSeriesBinaryPackage, I also changed its constructor so that passing the optional parameter cache=None means "no cache available", instead of "parameter not specified or no cache available".

A test run of the new query in DistributionSourcePackageRelease.sample_binary_packages indicates that it is slightly faster than the old query; avoiding the retrieval DistroSeriesPackageCache records in separate queries would save more than 3 seconds in OOPS 1904E1683.

DistributionSourcePackageRelease.sample_binary_packages was not tested at all, so I added a few general tests.

tests:

./bin/test soyuz -vvt test_distributionsourcepackagerelease
./bin/test soyuz -vvt test_distroseriesbinarypackage

no lint

To post a comment you must log in.
Revision history for this message
Gavin Panella (allenap) wrote :

Looks good. Some trivial comments, that's all.

[1]

+ all_published = DecoratedResultSet(
+ all_published, lambda row: row[1:3])

Just for interest, operator.itemgetter() can return values at multiple
indices:

        all_published = DecoratedResultSet(
            all_published, itemgetter(1, 2))

Or even:

        all_published = DecoratedResultSet(
            all_published, itemgetter(slice(1, 3)))

[2]

+ for publishing, package_cache in all_published:
+ samples.append(
+ DistroSeriesBinaryPackage(
+ publishing.distroarchseries.distroseries,
+ publishing.binarypackagerelease.binarypackagename,

You seem to be selecting BinaryPackageName in all_published; perhaps,
but adjusting the slicing requested of the DecoratedResultSet you can
get it straight back from the query?

[3]

+ with StormStatementRecorder() as recorder:
+ binary_package.cache
+ self.assertEqual(0, len(recorder.statements))

I think you should do here:

        self.assertThat(recorder, HasQueryCount(Equals(0)))

This ensures you get a statement log in the event that the match
fails.

[4]

+ with StormStatementRecorder() as recorder:
+ self.distroseries_binary_package.cache
+ self.assertTrue(len(recorder.statements) > 0)

Likewise, I think you should do:

        self.assertThat(recorder, HasQueryCount(NotEquals(0)))

review: Approve
Revision history for this message
Abel Deuring (adeuring) wrote :

Hi Gavin,

thanks for your review!

All sugestions implemented.

On 19.05.2011 15:38, Gavin Panella wrote:
> Review: Approve
> Looks good. Some trivial comments, that's all.
>
>
> [1]
>
> + all_published = DecoratedResultSet(
> + all_published, lambda row: row[1:3])
>
> Just for interest, operator.itemgetter() can return values at multiple
> indices:
>
> all_published = DecoratedResultSet(
> all_published, itemgetter(1, 2))
>
> Or even:
>
> all_published = DecoratedResultSet(
> all_published, itemgetter(slice(1, 3)))
>
>
> [2]
>
> + for publishing, package_cache in all_published:
> + samples.append(
> + DistroSeriesBinaryPackage(
> + publishing.distroarchseries.distroseries,
> + publishing.binarypackagerelease.binarypackagename,
>
> You seem to be selecting BinaryPackageName in all_published; perhaps,
> but adjusting the slicing requested of the DecoratedResultSet you can
> get it straight back from the query?
>
>
> [3]
>
> + with StormStatementRecorder() as recorder:
> + binary_package.cache
> + self.assertEqual(0, len(recorder.statements))
>
> I think you should do here:
>
> self.assertThat(recorder, HasQueryCount(Equals(0)))
>
> This ensures you get a statement log in the event that the match
> fails.
>
>
> [4]
>
> + with StormStatementRecorder() as recorder:
> + self.distroseries_binary_package.cache
> + self.assertTrue(len(recorder.statements) > 0)
>
> Likewise, I think you should do:
>
> self.assertThat(recorder, HasQueryCount(NotEquals(0)))
>

Revision history for this message
Robert Collins (lifeless) wrote :

Only testing on production data will tell, but I suspect this will still be a problem: this is a wider query, which means more work for the planner - and more likelyhood the planner will get it wrong.

using load_data (or load_related) in the pre_iter_hook of DecoratedResultSet is probably faster.

No need to change this - I may be wrong - but if in QA it times out on qastaging, I expect query analysis to show this up.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'lib/lp/soyuz/model/distributionsourcepackagerelease.py'
2--- lib/lp/soyuz/model/distributionsourcepackagerelease.py 2010-08-20 20:31:18 +0000
3+++ lib/lp/soyuz/model/distributionsourcepackagerelease.py 2011-05-19 14:19:45 +0000
4@@ -12,11 +12,20 @@
5 ]
6
7 from lazr.delegates import delegates
8-from storm.expr import Desc
9+from storm.expr import (
10+ And,
11+ Desc,
12+ Join,
13+ LeftJoin,
14+ SQL,
15+ )
16 from zope.component import getUtility
17 from zope.interface import implements
18
19 from canonical.database.sqlbase import sqlvalues
20+from canonical.launchpad.components.decoratedresultset import (
21+ DecoratedResultSet,
22+ )
23 from canonical.launchpad.webapp.interfaces import (
24 DEFAULT_FLAVOR,
25 IStoreSelector,
26@@ -34,6 +43,7 @@
27 from lp.soyuz.model.binarypackagename import BinaryPackageName
28 from lp.soyuz.model.binarypackagerelease import BinaryPackageRelease
29 from lp.soyuz.model.distroseriesbinarypackage import DistroSeriesBinaryPackage
30+from lp.soyuz.model.distroseriespackagecache import DistroSeriesPackageCache
31 from lp.soyuz.model.publishing import (
32 BinaryPackagePublishingHistory,
33 SourcePackagePublishingHistory,
34@@ -152,33 +162,57 @@
35 @property
36 def sample_binary_packages(self):
37 """See IDistributionSourcePackageRelease."""
38- all_published = BinaryPackagePublishingHistory.select("""
39- BinaryPackagePublishingHistory.distroarchseries =
40- DistroArchSeries.id AND
41- DistroArchSeries.distroseries = DistroSeries.id AND
42- DistroSeries.distribution = %s AND
43- BinaryPackagePublishingHistory.archive IN %s AND
44- BinaryPackagePublishingHistory.binarypackagerelease =
45- BinaryPackageRelease.id AND
46- BinaryPackageRelease.binarypackagename = BinaryPackageName.id AND
47- BinaryPackageRelease.build = BinaryPackageBuild.id AND
48- BinaryPackageBuild.source_package_release = %s
49- """ % sqlvalues(self.distribution,
50- self.distribution.all_distro_archive_ids,
51- self.sourcepackagerelease),
52- distinct=True,
53- orderBy=['BinaryPackageName.name'],
54- clauseTables=['DistroArchSeries', 'DistroSeries',
55- 'BinaryPackageRelease', 'BinaryPackageName',
56- 'BinaryPackageBuild'],
57- prejoinClauseTables=['BinaryPackageRelease', 'BinaryPackageName'])
58- samples = []
59- names = set()
60- for publishing in all_published:
61- if publishing.binarypackagerelease.binarypackagename not in names:
62- names.add(publishing.binarypackagerelease.binarypackagename)
63- samples.append(
64- DistroSeriesBinaryPackage(
65- publishing.distroarchseries.distroseries,
66- publishing.binarypackagerelease.binarypackagename))
67- return samples
68+ #avoid circular imports.
69+ from lp.registry.model.distroseries import DistroSeries
70+ from lp.soyuz.model.distroarchseries import DistroArchSeries
71+ store = getUtility(IStoreSelector).get(MAIN_STORE, DEFAULT_FLAVOR)
72+ archive_ids = list(self.distribution.all_distro_archive_ids)
73+ result_row = (
74+ SQL('DISTINCT ON(BinaryPackageName.name) 0 AS ignore'),
75+ BinaryPackagePublishingHistory, DistroSeriesPackageCache,
76+ BinaryPackageRelease, BinaryPackageName)
77+ tables = (
78+ BinaryPackagePublishingHistory,
79+ Join(
80+ DistroArchSeries,
81+ DistroArchSeries.id ==
82+ BinaryPackagePublishingHistory.distroarchseriesID),
83+ Join(
84+ DistroSeries,
85+ DistroArchSeries.distroseriesID == DistroSeries.id),
86+ Join(
87+ BinaryPackageRelease,
88+ BinaryPackageRelease.id ==
89+ BinaryPackagePublishingHistory.binarypackagereleaseID),
90+ Join(
91+ BinaryPackageName,
92+ BinaryPackageName.id ==
93+ BinaryPackageRelease.binarypackagenameID),
94+ Join(
95+ BinaryPackageBuild,
96+ BinaryPackageBuild.id == BinaryPackageRelease.buildID),
97+ LeftJoin(
98+ DistroSeriesPackageCache,
99+ And(
100+ DistroSeriesPackageCache.distroseries == DistroSeries.id,
101+ DistroSeriesPackageCache.archiveID.is_in(archive_ids),
102+ DistroSeriesPackageCache.binarypackagename ==
103+ BinaryPackageName.id)))
104+
105+ all_published = store.using(*tables).find(
106+ result_row,
107+ DistroSeries.distribution == self.distribution,
108+ BinaryPackagePublishingHistory.archiveID.is_in(archive_ids),
109+ BinaryPackageBuild.source_package_release ==
110+ self.sourcepackagerelease)
111+ all_published = all_published.order_by(
112+ BinaryPackageName.name)
113+
114+ def make_dsb_package(row):
115+ publishing = row[1]
116+ package_cache = row[2]
117+ return DistroSeriesBinaryPackage(
118+ publishing.distroarchseries.distroseries,
119+ publishing.binarypackagerelease.binarypackagename,
120+ package_cache)
121+ return DecoratedResultSet(all_published, make_dsb_package)
122
123=== modified file 'lib/lp/soyuz/model/distroseriesbinarypackage.py'
124--- lib/lp/soyuz/model/distroseriesbinarypackage.py 2010-10-24 12:46:23 +0000
125+++ lib/lp/soyuz/model/distroseriesbinarypackage.py 2011-05-19 14:19:45 +0000
126@@ -39,10 +39,12 @@
127
128 implements(IDistroSeriesBinaryPackage)
129
130- def __init__(self, distroseries, binarypackagename, cache=None):
131+ default = object()
132+
133+ def __init__(self, distroseries, binarypackagename, cache=default):
134 self.distroseries = distroseries
135 self.binarypackagename = binarypackagename
136- if cache is not None:
137+ if cache is not self.default:
138 get_property_cache(self).cache = cache
139
140 @property
141@@ -69,9 +71,9 @@
142 self.distroseries.distribution.all_distro_archive_ids)
143 result = store.find(
144 DistroSeriesPackageCache,
145- DistroSeriesPackageCache.distroseries==self.distroseries,
146+ DistroSeriesPackageCache.distroseries == self.distroseries,
147 DistroSeriesPackageCache.archiveID.is_in(archive_ids),
148- (DistroSeriesPackageCache.binarypackagename==
149+ (DistroSeriesPackageCache.binarypackagename ==
150 self.binarypackagename))
151 return result.any()
152
153
154=== added file 'lib/lp/soyuz/tests/test_distributionsourcepackagerelease.py'
155--- lib/lp/soyuz/tests/test_distributionsourcepackagerelease.py 1970-01-01 00:00:00 +0000
156+++ lib/lp/soyuz/tests/test_distributionsourcepackagerelease.py 2011-05-19 14:19:45 +0000
157@@ -0,0 +1,152 @@
158+# Copyright 2011 Canonical Ltd. This software is licensed under the
159+# GNU Affero General Public License version 3 (see the file LICENSE).
160+
161+"""Tests of DistributionSourcePackageRelease."""
162+
163+from testtools.matchers import LessThan
164+from zope.component import getUtility
165+
166+from canonical.testing.layers import DatabaseFunctionalLayer
167+from lp.soyuz.enums import PackagePublishingStatus
168+from lp.soyuz.interfaces.distroarchseries import IDistroArchSeriesSet
169+from lp.soyuz.model.distributionsourcepackagerelease import (
170+ DistributionSourcePackageRelease,
171+ )
172+from lp.soyuz.tests.test_publishing import SoyuzTestPublisher
173+from lp.testing import (
174+ StormStatementRecorder,
175+ TestCaseWithFactory,
176+ )
177+from lp.testing.matchers import HasQueryCount
178+
179+
180+class TestDistributionSourcePackageRelease(TestCaseWithFactory):
181+ """Tests for DistributionSourcePackageRelease."""
182+
183+ layer = DatabaseFunctionalLayer
184+
185+ def setUp(self):
186+ super(TestDistributionSourcePackageRelease, self).setUp()
187+ self.sourcepackagerelease = self.factory.makeSourcePackageRelease()
188+ self.distroarchseries = self.factory.makeDistroArchSeries(
189+ distroseries=self.sourcepackagerelease.sourcepackage.distroseries)
190+ distribution = self.distroarchseries.distroseries.distribution
191+ self.dsp_release = DistributionSourcePackageRelease(
192+ distribution, self.sourcepackagerelease)
193+
194+ def makeBinaryPackageRelease(self, name=None):
195+ if name is None:
196+ name = self.factory.makeBinaryPackageName()
197+ bp_build = self.factory.makeBinaryPackageBuild(
198+ source_package_release=self.sourcepackagerelease,
199+ distroarchseries=self.distroarchseries)
200+ bp_release = self.factory.makeBinaryPackageRelease(
201+ build=bp_build, binarypackagename=name)
202+ sourcepackagename = self.sourcepackagerelease.sourcepackagename
203+ self.factory.makeSourcePackagePublishingHistory(
204+ sourcepackagename,
205+ sourcepackagerelease=self.sourcepackagerelease,
206+ distroseries=self.sourcepackagerelease.sourcepackage.distroseries,
207+ status=PackagePublishingStatus.PUBLISHED)
208+ self.factory.makeBinaryPackagePublishingHistory(
209+ binarypackagerelease=bp_release,
210+ distroarchseries=self.distroarchseries)
211+
212+ def test_sample_binary_packages__no_releases(self):
213+ # If no binary releases exist,
214+ # DistributionSourcePackageRelease.sample_binary_packages is empty.
215+ self.assertEqual(0, self.dsp_release.sample_binary_packages.count())
216+
217+ def test_sample_binary_packages__one_release(self):
218+ # If a binary release exists,
219+ # DistributionSourcePackageRelease.sample_binary_packages
220+ # returns it.
221+ self.makeBinaryPackageRelease(
222+ self.factory.makeBinaryPackageName(name='binary-package'))
223+ self.assertEqual(
224+ ['binary-package'],
225+ [release.name
226+ for release in self.dsp_release.sample_binary_packages])
227+
228+ def test_sample_binary_packages__two_releases_one_binary_package(self):
229+ # If two binary releases with the same name exist,
230+ # DistributionSourcePackageRelease.sample_binary_packages
231+ # returns only one.
232+ name = self.factory.makeBinaryPackageName(name='binary-package')
233+ self.makeBinaryPackageRelease(name)
234+ self.makeBinaryPackageRelease(name)
235+ self.assertEqual(
236+ ['binary-package'],
237+ [release.name
238+ for release in self.dsp_release.sample_binary_packages])
239+
240+ def test_sample_binary_packages__two_release_two_binary_packages(self):
241+ # If a two binary releases with different names exist,
242+ # DistributionSourcePackageRelease.sample_binary_packages
243+ # returns both.
244+ self.makeBinaryPackageRelease(
245+ self.factory.makeBinaryPackageName(name='binary-package'))
246+ self.makeBinaryPackageRelease(
247+ self.factory.makeBinaryPackageName(name='binary-package-2'))
248+ self.assertEqual(
249+ ['binary-package', 'binary-package-2'],
250+ [release.name
251+ for release in self.dsp_release.sample_binary_packages])
252+
253+ def updateDistroSeriesPackageCache(self):
254+ # Create DistroSeriesPackageCache records for new binary
255+ # packages.
256+ #
257+ # SoyuzTestPublisher.updateDistroSeriesPackageCache() creates
258+ # a DistroSeriesPackageCache record for the new binary package.
259+ # The method closes the current DB connection, making references
260+ # to DB objects in other DB objects unusable. Starting with
261+ # the distroarchseries, we can create new, valid, instances of
262+ # objects required later in the test again.
263+ # of the objects we need later.
264+ sourcepackagename = self.sourcepackagerelease.sourcepackagename
265+ publisher = SoyuzTestPublisher()
266+ publisher.updateDistroSeriesPackageCache(
267+ self.distroarchseries.distroseries)
268+ self.distroarchseries = getUtility(IDistroArchSeriesSet).get(
269+ self.distroarchseries.id)
270+ distribution = self.distroarchseries.distroseries.distribution
271+ releases = distribution.getCurrentSourceReleases([sourcepackagename])
272+ [(distribution_sourcepackage, dsp_release)] = releases.items()
273+ self.dsp_release = dsp_release
274+ self.sourcepackagerelease = dsp_release.sourcepackagerelease
275+
276+ def test_sample_binary_packages__constant_number_sql_queries(self):
277+ # Retrieving
278+ # DistributionSourcePackageRelease.sample_binary_packages and
279+ # accessing the property "summary" of its items requires a
280+ # constant number of SQL queries, regardless of the number
281+ # of existing binary package releases.
282+ self.makeBinaryPackageRelease()
283+ self.updateDistroSeriesPackageCache()
284+ with StormStatementRecorder() as recorder:
285+ for ds_package in self.dsp_release.sample_binary_packages:
286+ ds_package.summary
287+ self.assertThat(recorder, HasQueryCount(LessThan(5)))
288+ self.assertEqual(1, self.dsp_release.sample_binary_packages.count())
289+
290+ for iteration in range(5):
291+ self.makeBinaryPackageRelease()
292+ self.updateDistroSeriesPackageCache()
293+ with StormStatementRecorder() as recorder:
294+ for ds_package in self.dsp_release.sample_binary_packages:
295+ ds_package.summary
296+ self.assertThat(recorder, HasQueryCount(LessThan(5)))
297+ self.assertEqual(6, self.dsp_release.sample_binary_packages.count())
298+
299+ # Even if the cache is not updated for binary packages,
300+ # DistributionSourcePackageRelease objects do not try to
301+ # retrieve DistroSeriesPackageCache records if they know
302+ # that such records do not exist.
303+ for iteration in range(5):
304+ self.makeBinaryPackageRelease()
305+ with StormStatementRecorder() as recorder:
306+ for ds_package in self.dsp_release.sample_binary_packages:
307+ ds_package.summary
308+ self.assertThat(recorder, HasQueryCount(LessThan(5)))
309+ self.assertEqual(11, self.dsp_release.sample_binary_packages.count())
310
311=== modified file 'lib/lp/soyuz/tests/test_distroseriesbinarypackage.py'
312--- lib/lp/soyuz/tests/test_distroseriesbinarypackage.py 2010-12-20 03:21:03 +0000
313+++ lib/lp/soyuz/tests/test_distroseriesbinarypackage.py 2011-05-19 14:19:45 +0000
314@@ -9,6 +9,10 @@
315 'test_suite',
316 ]
317
318+from testtools.matchers import (
319+ Equals,
320+ NotEquals,
321+ )
322 import transaction
323
324 from canonical.config import config
325@@ -16,7 +20,11 @@
326 from lp.services.log.logger import BufferLogger
327 from lp.soyuz.model.distroseriesbinarypackage import DistroSeriesBinaryPackage
328 from lp.soyuz.tests.test_publishing import SoyuzTestPublisher
329-from lp.testing import TestCaseWithFactory
330+from lp.testing import (
331+ StormStatementRecorder,
332+ TestCaseWithFactory,
333+ )
334+from lp.testing.matchers import HasQueryCount
335
336
337 class TestDistroSeriesBinaryPackage(TestCaseWithFactory):
338@@ -61,3 +69,22 @@
339
340 self.failUnlessEqual(
341 'Foo is the best', self.distroseries_binary_package.summary)
342+
343+ def test_none_cache_passed_at_init_counts_as_cached(self):
344+ # If the value None is passed as the constructor parameter
345+ # "cache", it is considered as a valid value.
346+ # Accesing the property DistroSeriesBinaryPackage.cache
347+ # later does not lead to the execution of an SQL query to
348+ # retrieve a DistroSeriesPackageCache record.
349+ binary_package = DistroSeriesBinaryPackage(
350+ self.distroseries, self.binary_package_name, cache=None)
351+ with StormStatementRecorder() as recorder:
352+ binary_package.cache
353+ self.assertThat(recorder, HasQueryCount(Equals(0)))
354+
355+ # If the parameter "cache" was not passed, accessing
356+ # DistroSeriesBinaryPackage.cache for the first time requires
357+ # at least one SQL query.
358+ with StormStatementRecorder() as recorder:
359+ self.distroseries_binary_package.cache
360+ self.assertThat(recorder, HasQueryCount(NotEquals(0)))