Merge lp:~cjwatson/launchpad/package-cache-drop-changelog into lp:launchpad

Proposed by Colin Watson on 2016-05-17
Status: Merged
Merged at revision: 18040
Proposed branch: lp:~cjwatson/launchpad/package-cache-drop-changelog
Merge into: lp:launchpad
Diff against target: 130 lines (+13/-50)
4 files modified
lib/lp/soyuz/doc/package-cache-script.txt (+8/-0)
lib/lp/soyuz/interfaces/distributionsourcepackagecache.py (+2/-2)
lib/lp/soyuz/model/distributionsourcepackagecache.py (+3/-7)
lib/lp/soyuz/stories/distribution/xx-distribution-packages.txt (+0/-41)
To merge this branch: bzr merge lp:~cjwatson/launchpad/package-cache-drop-changelog
Reviewer        Review Type    Date Requested    Status
William Grant   code           2016-05-17        Approve on 2016-05-17
Review via email: mp+294894@code.launchpad.net

Commit Message

Stop populating DistributionSourcePackageCache.changelog, and make update-pkgcache depopulate it.

Description of the Change

The changelog was first added here in https://bugs.launchpad.net/launchpad/+bug/48735, and at least it is only considered as a bottom-priority item. But it doesn't make all that much sense, at least nowadays: we now have a pretty decent batched +changelog view; matching on changelog terms will often not be what you expect, except in special-purpose circumstances; and general fishing expeditions are probably better handled by a general-purpose search engine. The changelog makes up about half of the table size (166 of 341 MB on dogfood), so I'd like to drop it, which should make the package cache quicker to update and easier to use for source package vocabulary pickers.

The next step will be to drop the column from the database, but we need this on production first to rebuild the fti column.
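
To illustrate the intended effect, here is a minimal standalone sketch, not the Launchpad code itself. It assumes a simplified in-memory row and a naive word-set stand-in for the fti column; CacheRow, update and search are hypothetical names, and the changelog strings are made-up sample text. It mirrors the doctest in this branch: a changelog-only term matches two source packages before the cache update and none afterwards.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CacheRow:
    """Hypothetical stand-in for a DistributionSourcePackageCache row."""
    name: str
    binpkgnames: str = ""
    changelog: Optional[str] = None

    def fti(self):
        # Naive stand-in for the full-text index: the set of lowercased
        # words across all text fields. None fields contribute nothing.
        fields = (self.name, self.binpkgnames, self.changelog)
        return {word.lower() for f in fields if f for word in f.split()}


def update(row, binary_names):
    """Refresh a row as the new code does: changelog is left empty."""
    row.binpkgnames = " ".join(sorted(binary_names))
    row.changelog = None  # Column due for deletion.


def search(rows, term):
    """Return the names of rows whose index contains the term."""
    return [row.name for row in rows if term.lower() in row.fti()]


# Made-up sample data: "placeholder" appears only in the changelog field.
rows = [
    CacheRow("pmount", "pmount", "placeholder changelog entry"),
    CacheRow("libstdc++", "libstdc++", "placeholder upload"),
]

before = search(rows, "placeholder")
for row in rows:
    update(row, set(row.binpkgnames.split()))
after = search(rows, "placeholder")

print(len(before), len(after))
```

In the real system the index is maintained by the database rather than recomputed on every search, which is why the cache rows need to be rewritten on production before the column can be dropped.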

William Grant (wgrant):
review: Approve (code)

Preview Diff

=== modified file 'lib/lp/soyuz/doc/package-cache-script.txt'
--- lib/lp/soyuz/doc/package-cache-script.txt 2012-06-13 07:35:50 +0000
+++ lib/lp/soyuz/doc/package-cache-script.txt 2016-05-17 13:35:55 +0000
@@ -31,6 +31,11 @@
     >>> warty.searchPackages(u'foobar').count()
     1

+The obsolete changelog field is still present in the sampledata.
+
+    >>> ubuntu.searchSourcePackages(u'placeholder').count()
+    2
+
 Normal operation produces INFO level information about the
 distribution and respective distroseriess considered in stderr.

@@ -72,6 +77,9 @@
     >>> warty.searchPackages(u'foobar').count()
     0

+    >>> ubuntu.searchSourcePackages(u'placeholder').count()
+    0
+
 Explicitly mark the database as dirty so that it is cleaned (see bug 994158).

     >>> from lp.testing.layers import DatabaseLayer

=== modified file 'lib/lp/soyuz/interfaces/distributionsourcepackagecache.py'
--- lib/lp/soyuz/interfaces/distributionsourcepackagecache.py 2015-10-21 09:37:08 +0000
+++ lib/lp/soyuz/interfaces/distributionsourcepackagecache.py 2016-05-17 13:35:55 +0000
@@ -1,4 +1,4 @@
-# Copyright 2009 Canonical Ltd. This software is licensed under the
+# Copyright 2009-2016 Canonical Ltd. This software is licensed under the
 # GNU Affero General Public License version 3 (see the file LICENSE).

 """Source package in Distribution Cache interfaces."""
@@ -31,7 +31,7 @@
         "distro.")
     changelog = Attribute("A concatenation of the source package release "
         "changelog entries for this source package, where the status is "
-        "not REMOVED.")
+        "not REMOVED. (Deprecated; due to be removed.)")

     distributionsourcepackage = Attribute("The DistributionSourcePackage "
         "for which this is a cache.")

=== modified file 'lib/lp/soyuz/model/distributionsourcepackagecache.py'
--- lib/lp/soyuz/model/distributionsourcepackagecache.py 2015-07-08 16:05:11 +0000
+++ lib/lp/soyuz/model/distributionsourcepackagecache.py 2016-05-17 13:35:55 +0000
@@ -1,4 +1,4 @@
-# Copyright 2009-2011 Canonical Ltd. This software is licensed under the
+# Copyright 2009-2016 Canonical Ltd. This software is licensed under the
 # GNU Affero General Public License version 3 (see the file LICENSE).

 __metaclass__ = type
@@ -167,13 +167,8 @@
         binpkgnames = set()
         binpkgsummaries = set()
         binpkgdescriptions = set()
-        sprchangelog = set()
         for spr in sprs:
             log.debug("Considering source version %s" % spr.version)
-            # changelog may be empty, in which case we don't want to add it
-            # to the set as the join would fail below.
-            if spr.changelog_entry is not None:
-                sprchangelog.add(spr.changelog_entry)
             binpkgs = IStore(BinaryPackageRelease).find(
                 (BinaryPackageName.name, BinaryPackageRelease.summary,
                  BinaryPackageRelease.description),
@@ -190,7 +185,8 @@
         cache.binpkgnames = ' '.join(sorted(binpkgnames))
         cache.binpkgsummaries = ' '.join(sorted(binpkgsummaries))
         cache.binpkgdescriptions = ' '.join(sorted(binpkgdescriptions))
-        cache.changelog = ' '.join(sorted(sprchangelog))
+        # Column due for deletion.
+        cache.changelog = None

     @classmethod
     def updateAll(cls, distro, archive, log, ztm, commit_chunk=500):

=== modified file 'lib/lp/soyuz/stories/distribution/xx-distribution-packages.txt'
--- lib/lp/soyuz/stories/distribution/xx-distribution-packages.txt 2015-09-04 12:36:43 +0000
+++ lib/lp/soyuz/stories/distribution/xx-distribution-packages.txt 2016-05-17 13:35:55 +0000
@@ -61,47 +61,6 @@
     The Mozilla Firefox web browser
     (Matching binaries: mozilla-firefox, mozilla-firefox-data.)

-Now try searching for text that we know to be in a change log entry, to
-prove that FTI works on change logs. The text we're looking for is
-"placeholder" which is mentioned in the change log entry for pmount and
-libstdc++, so we are looking for two results here as the "placeholder"
-text is not mentioned in anything else that is indexed.
-
-    >>> browser.open("http://localhost/ubuntu/+search")
-    >>> field = browser.getControl(name="text")
-    >>> field.value = 'placeholder'
-    >>> browser.getControl('Search', index=0).click()
-
-Note, by default we only search on binary package names (as the fti is
-not currently so useful), so the initial result is empty, but contains
-a link to the fti/source package search:
-
-    >>> print extract_text(find_tag_by_id(browser.contents, 'no-results'))
-    Your search for “placeholder” did not return any results.
-    ...
-
-Clicking on the provided link to retry the search against source packages
-finds the fti results:
-
-    >>> browser.getLink(id='source-search').click()
-    >>> for tag in find_tags_by_class(
-    ...     browser.contents, 'batch-navigation-index'):
-    ...     print extract_text(tag)
-    All packages with sources matching your query “placeholder”
-    1...2 of 2 results
-
-    >>> soup = find_main_content(browser.contents)
-    >>> results = soup.findAll(attrs={'class': 'pagematch'})
-    >>> len(results)
-    2
-
-    >>> texts = [extract_text(html) for html in results]
-    >>> texts.sort()
-    >>> for text in texts:
-    ...     print text.encode('ascii', 'backslashreplace')
-    libstdc++
-    pmount
-

 Distribution package change summary
 -----------------------------------