Merge ~pappacena/launchpad:gitrepo-pending-merges-tuning into launchpad:master

Proposed by Thiago F. Pappacena
Status: Needs review
Proposed branch: ~pappacena/launchpad:gitrepo-pending-merges-tuning
Merge into: launchpad:master
Diff against target: 363 lines (+191/-28)
4 files modified
lib/lp/code/browser/branchmergeproposal.py (+64/-18)
lib/lp/code/browser/gitrepository.py (+64/-5)
lib/lp/code/browser/tests/test_gitrepository.py (+62/-4)
lib/lp/code/templates/branchmergeproposal-summary-fragment.pt (+1/-1)
Reviewer Review Type Date Requested Status
Launchpad code reviewers Pending
Review via email: mp+377393@code.launchpad.net

Commit message

Small improvements in performance and allowing to limit the amount of MPs displayed on repository-pending-merges portlet (on git repository page)

Description of the change

This is not exactly ready to be merged, but I would like some opinions.
I've checked with @profilehooks.profile some methods, and took a look on what could be improved for the repository-pending-merges portlet.

There are two places spending quite a lot of processing time:
1- Fetching and counting MPs to show the /+activereivews link
This part fetches all MPs (and they can be a lot) just to count. It cannot be done on the database directly because we need to use Zope to check for permissions. I'm proposing to limit this count to a certain number, and after that just show "more than <limit> MPs" to the user.

2- List the summary of the most recent MPs for each branch
Again, this cannot be totally filtered on the database because of Zope's permission check. But, apart from limiting, I'm proposing some small improvements on the method getting the latest MP per branch.

On the test side, while I was coding the changes, I've wrote a small test to check the processing time. I'm including a slightly different version of what I've writen before, but I'm not sure if we should include this type of check based on the processing time. Specially because this is dependent on the machine running and it might eventually break under unexpected situations (think of a busy buildbot, for example).

Anyway, I'll probably need to polish a bit this MP, but I would like to have some early feedback on it.

To post a comment you must log in.
Revision history for this message
Colin Watson (cjwatson) wrote :

I think it should be possible to express an equivalent of the launchpad.View permission check on merge proposals directly in a database query. We have a good chunk of the machinery for that already (BranchCollection and GitCollection), and I think we just need to use it more effectively here.

As for the remaining limit stuff, it looks to me as though this needs to be reworked using the batch navigation system; that's normally how we handle rendering large collections.

We normally do avoid wallclock speed tests for the sorts of reasons you suggest. In most cases query count tests (especially tests for how query counts scale with the number of objects involved) are good enough proxies, and they have the great virtue of behaving deterministically. I think that would be the case here.

There's a lot of stuff to get your teeth into here! Let's discuss it at the sprint, since that's so close.

e03cd15... by Thiago F. Pappacena on 2020-01-10

removing wallclock-dependent test

63450a2... by Thiago F. Pappacena on 2020-01-10

fetching less results to check the latests

d6456e3... by Thiago F. Pappacena on 2020-01-10

avoiding mixing MPs not included in the resultset (other status, for example)

af51100... by Thiago F. Pappacena on 2020-01-10

explanation about latest_proposals_for_each_branch

Unmerged commits

af51100... by Thiago F. Pappacena on 2020-01-10

explanation about latest_proposals_for_each_branch

d6456e3... by Thiago F. Pappacena on 2020-01-10

avoiding mixing MPs not included in the resultset (other status, for example)

63450a2... by Thiago F. Pappacena on 2020-01-10

fetching less results to check the latests

e03cd15... by Thiago F. Pappacena on 2020-01-10

removing wallclock-dependent test

ec5b051... by Thiago F. Pappacena on 2020-01-09

limiting data processed to display gitrepository-pending-merges.pt

b422150... by Thiago F. Pappacena on 2020-01-09

improving performance and adding optional limit to latest_proposals_for_each_branch

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1diff --git a/lib/lp/code/browser/branchmergeproposal.py b/lib/lp/code/browser/branchmergeproposal.py
2index a1b41fb..498c637 100644
3--- a/lib/lp/code/browser/branchmergeproposal.py
4+++ b/lib/lp/code/browser/branchmergeproposal.py
5@@ -1,4 +1,4 @@
6-# Copyright 2009-2019 Canonical Ltd. This software is licensed under the
7+# Copyright 2009-2020 Canonical Ltd. This software is licensed under the
8 # GNU Affero General Public License version 3 (see the file LICENSE).
9
10 """Views, navigation and actions for BranchMergeProposals."""
11@@ -25,6 +25,7 @@ __all__ = [
12 'latest_proposals_for_each_branch',
13 ]
14
15+from collections import OrderedDict
16 from functools import wraps
17 import operator
18
19@@ -35,6 +36,8 @@ from lazr.restful.interfaces import (
20 IWebServiceClientRequest,
21 )
22 import simplejson
23+from storm.expr import Desc, Max, Select
24+from storm.store import ResultSet
25 from zope.component import (
26 adapter,
27 getMultiAdapter,
28@@ -57,7 +60,10 @@ from zope.schema.vocabulary import (
29 SimpleTerm,
30 SimpleVocabulary,
31 )
32-
33+from zope.security.proxy import (
34+ isinstance as zope_isinstance,
35+ removeSecurityProxy
36+ )
37 from lp import _
38 from lp.app.browser.launchpadform import (
39 action,
40@@ -93,12 +99,14 @@ from lp.code.interfaces.codereviewinlinecomment import (
41 )
42 from lp.code.interfaces.codereviewvote import ICodeReviewVoteReference
43 from lp.code.interfaces.diff import IPreviewDiff
44+from lp.code.model.branchmergeproposal import BranchMergeProposal
45 from lp.registry.interfaces.person import IPersonSet
46 from lp.services.comments.interfaces.conversation import (
47 IComment,
48 IConversation,
49 )
50 from lp.services.config import config
51+from lp.services.database.decoratedresultset import DecoratedResultSet
52 from lp.services.features import getFeatureFlag
53 from lp.services.job.interfaces.job import JobStatus
54 from lp.services.librarian.interfaces.client import LibrarianServerError
55@@ -127,26 +135,64 @@ from lp.services.webapp.interfaces import ILaunchBag
56 from lp.services.webapp.menu import NavigationMenu
57
58
59-def latest_proposals_for_each_branch(proposals):
60+def latest_proposals_for_each_branch(proposals, limit=None):
61 """Returns the most recent merge proposals for any particular branch.
62
63 Also filters out proposals that the logged in user can't see.
64+
65+ This method uses 2 different approaches depending on the "proposals":
66+ - If it's a ResultSet object, it writes a query and fetches the most
67+ recent MPs from resultset directly from the database. So, it's safe
68+ to have a reasonably big resultset of BranchMergeProposals
69+
70+ - If it's a iterable - but not a ResultSet -, this method will sort
71+ the MPs [O(log n)], iterate over the sorted list [O(n)] and keep only
72+ the most recent ones by date_created.
73+
74+ In summary, prefer to send ResultSets as parameter
75 """
76- targets = {}
77- for proposal in proposals:
78- # Don't show the proposal if the user can't see it.
79- if not check_permission('launchpad.View', proposal):
80- continue
81- # Only show the most recent proposal for any given target.
82- date_created = proposal.date_created
83- target = proposal.merge_target
84-
85- if target not in targets or date_created > targets[target][1]:
86- targets[target] = (proposal, date_created)
87-
88- return sorted(
89- [proposal for proposal, date_created in targets.itervalues()],
90- key=operator.attrgetter('date_created'), reverse=True)
91+ if isinstance(proposals, DecoratedResultSet):
92+ proposals = removeSecurityProxy(proposals).get_plain_result_set()
93+ if zope_isinstance(proposals, ResultSet):
94+ # From the proposal IDs, get the biggest one for each branch
95+ proposals_ids = proposals.get_select_expr(BranchMergeProposal.id)
96+ most_recents = Select(
97+ Max(BranchMergeProposal.id),
98+ where=BranchMergeProposal.id.is_in(proposals_ids),
99+ group_by=[
100+ BranchMergeProposal.target_branchID,
101+ BranchMergeProposal.target_git_path,
102+ BranchMergeProposal.target_git_commit_sha1]
103+ )
104+ proposals = proposals.find(
105+ BranchMergeProposal.id.is_in(most_recents))
106+ proposals = proposals.order_by(Desc(BranchMergeProposal.date_created))
107+ # TODO: move this permission check to database
108+ return [i for i in proposals if check_permission('launchpad.View', i)]
109+ else:
110+ # Sort by the most recent first
111+ proposals = sorted(
112+ [i for i in proposals if check_permission('launchpad.View', i)],
113+ key=operator.attrgetter('date_created'),
114+ reverse=True)
115+ # OrderedDict to keep insertion order (that means: most recent
116+ # proposals first)
117+ targets = OrderedDict()
118+ for proposal in proposals:
119+ target = proposal.merge_target
120+
121+ # If we already inserted some target in the dict, it was for sure
122+ # the most recent (since they are sorted). Skip here and avoid
123+ # check_permission (since it's a bit more expensive than a hash
124+ # lookup), limit check and any other processing.
125+ if target in targets:
126+ continue
127+
128+ if limit is not None and len(targets) >= limit:
129+ break
130+ targets[target] = proposal
131+
132+ return targets.values()
133
134
135 class BranchMergeProposalBreadcrumb(Breadcrumb):
136diff --git a/lib/lp/code/browser/gitrepository.py b/lib/lp/code/browser/gitrepository.py
137index c05dcc1..97c5da8 100644
138--- a/lib/lp/code/browser/gitrepository.py
139+++ b/lib/lp/code/browser/gitrepository.py
140@@ -1,4 +1,4 @@
141-# Copyright 2015-2019 Canonical Ltd. This software is licensed under the
142+# Copyright 2015-2020 Canonical Ltd. This software is licensed under the
143 # GNU Affero General Public License version 3 (see the file LICENSE).
144
145 """Git repository views."""
146@@ -364,6 +364,14 @@ class GitRepositoryView(InformationTypePortletMixin, LaunchpadView,
147 label = page_title
148 show_merge_links = True
149
150+ # see `initialize_limits` for feature flags controlling these numbers
151+ # Maximum branches to count. None for unlimited
152+ landing_candidates_limit = None
153+
154+ # Maximum MPs to be displayed. None for unlimited
155+ landing_targets_limit = None
156+
157+
158 def initialize(self):
159 super(GitRepositoryView, self).initialize()
160 # Cache permission so that the private team owner can be rendered. The
161@@ -374,6 +382,30 @@ class GitRepositoryView(InformationTypePortletMixin, LaunchpadView,
162 precache_permission_for_objects(
163 self.request, "launchpad.LimitedView", authorised_people)
164
165+ # None means "not calculated", -1 if above landing_candidates_limit
166+ # Any other number is the actual number of MPs
167+ self._landing_candidates_count = None
168+ self.initialize_limits()
169+
170+ def get_limit_flag_value(self, flag_name):
171+ """Gets the valid limit value from the given feature flag name"""
172+ try:
173+ flag_candidate_limit = int(getFeatureFlag(flag_name))
174+ if flag_candidate_limit <= 0:
175+ return None
176+ return flag_candidate_limit
177+ # int('') or int(None) raises those exceptions
178+ except (ValueError, TypeError):
179+ return None
180+
181+ def initialize_limits(self):
182+ """Initialize landing_candidates_limit and landing_targets_limit
183+ from feature flags (unlimited by default)"""
184+ self.landing_candidates_limit = self.get_limit_flag_value(
185+ 'code.git.show_repository_mps.landing_candidates_limit')
186+ self.landing_targets_limit = self.get_limit_flag_value(
187+ 'code.git.show_repository_mps.landing_targets_limit')
188+
189 @property
190 def git_ssh_url(self):
191 """The git+ssh:// URL for this repository, adjusted for this user."""
192@@ -418,9 +450,31 @@ class GitRepositoryView(InformationTypePortletMixin, LaunchpadView,
193
194 @cachedproperty
195 def landing_candidates(self):
196+ """Returns up to {self.landing_candidates_limit} merge proposals,
197+ and sets the landing_candidates_count_text accordingly
198+
199+ The way it limits is still not the most efficient (it fetches the
200+ MPs, and filter here for check_permission), but it's ok for most
201+ real-world cases.
202+ Ideally, we should filter this directly on the database"""
203+ limit = self.landing_candidates_limit
204 candidates = self.context.getPrecachedLandingCandidates(self.user)
205- return [proposal for proposal in candidates
206- if check_permission("launchpad.View", proposal)]
207+ all_proposals = []
208+ self._landing_candidates_count = None
209+ i = 0
210+ for proposal in candidates:
211+ if not check_permission("launchpad.View", proposal):
212+ continue
213+ i += 1
214+ all_proposals.append(proposal)
215+ if limit is not None and i >= limit:
216+ # If we don't know the total (and only know that it's above
217+ # the limit), let's mark it as -1
218+ self._landing_candidates_count = -1
219+ break
220+ if self._landing_candidates_count is None:
221+ self._landing_candidates_count = len(all_proposals)
222+ return all_proposals
223
224 def _getBranchCountText(self, count):
225 """Help to show user friendly text."""
226@@ -428,19 +482,24 @@ class GitRepositoryView(InformationTypePortletMixin, LaunchpadView,
227 return 'No branches'
228 elif count == 1:
229 return '1 branch'
230+ elif count == -1:
231+ return '%s branches or more' % self.landing_candidates_limit
232 else:
233 return '%s branches' % count
234
235 @cachedproperty
236 def landing_candidate_count_text(self):
237- return self._getBranchCountText(len(self.landing_candidates))
238+ if self._landing_candidates_count is None:
239+ self.landing_candidates # call the property to calculate count
240+ return self._getBranchCountText(self._landing_candidates_count)
241
242 @cachedproperty
243 def landing_targets(self):
244 """Return a filtered list of active landing targets."""
245 targets = self.context.getPrecachedLandingTargets(
246 self.user, only_active=True)
247- return latest_proposals_for_each_branch(targets)
248+ return latest_proposals_for_each_branch(
249+ targets, limit=self.landing_targets_limit)
250
251 @property
252 def show_rescan_link(self):
253diff --git a/lib/lp/code/browser/tests/test_gitrepository.py b/lib/lp/code/browser/tests/test_gitrepository.py
254index a38120c..d100088 100644
255--- a/lib/lp/code/browser/tests/test_gitrepository.py
256+++ b/lib/lp/code/browser/tests/test_gitrepository.py
257@@ -1,4 +1,4 @@
258-# Copyright 2015-2019 Canonical Ltd. This software is licensed under the
259+# Copyright 2015-2020 Canonical Ltd. This software is licensed under the
260 # GNU Affero General Public License version 3 (see the file LICENSE).
261
262 """Unit tests for GitRepositoryView."""
263@@ -8,7 +8,10 @@ from __future__ import absolute_import, print_function, unicode_literals
264 __metaclass__ = type
265
266 import base64
267-from datetime import datetime
268+from datetime import (
269+ datetime,
270+ timedelta,
271+ )
272 import doctest
273 from itertools import chain
274 from operator import attrgetter
275@@ -24,6 +27,7 @@ from testtools.matchers import (
276 DocTestMatches,
277 Equals,
278 Is,
279+ LessThan,
280 MatchesDict,
281 MatchesListwise,
282 MatchesSetwise,
283@@ -349,11 +353,65 @@ class TestGitRepositoryView(BrowserTestCase):
284 set_state=BranchMergeProposalStatus.NEEDS_REVIEW)
285 with FeatureFixture({"code.git.show_repository_mps": "on"}):
286 with person_logged_in(target_repository.owner):
287- browser = self.getViewBrowser(
288- source_repository, user=source_repository.owner)
289+ with StormStatementRecorder() as recorder:
290+ browser = self.getViewBrowser(
291+ source_repository, user=source_repository.owner)
292+ self.assertThat(recorder, HasQueryCount(LessThan(47)))
293 self.assertIsNotNone(
294 find_tag_by_id(browser.contents, 'landing-targets'))
295
296+ def test_view_with_landing_targets_multiple_mps_performance(self):
297+ """This is a performance test to make sure the Git repo page with
298+ MPs has load time close enough to the load time when we disable the
299+ MPs.
300+
301+ To reduce the likelyhood of this test breaking because of machine
302+ circunstances or at the buildbot, the marging is 10%, which should
303+ be good enough for this test scenario"""
304+ product = self.factory.makeProduct(name="foo", vcs=VCSType.GIT)
305+ target_repository = self.factory.makeGitRepository(target=product)
306+ source_repository = self.factory.makeGitRepository(target=product)
307+ # creates 200 branches in each repo, and 100 MP
308+ for i in range(100):
309+ [source_master, source_devel] = self.factory.makeGitRefs(
310+ source_repository,
311+ paths=["refs/heads/master%s" % i, "refs/heads/devel%s" % i])
312+ [target_master, target_devel] = self.factory.makeGitRefs(
313+ target_repository,
314+ paths=["refs/heads/master%s" % i, "refs/heads/develop%s" % i])
315+
316+ self.factory.makeBranchMergeProposalForGit(
317+ target_ref=source_master,
318+ source_ref=source_devel,
319+ set_state=BranchMergeProposalStatus.NEEDS_REVIEW)
320+
321+ feature_flags = {
322+ "code.git.show_repository_mps": "on",
323+ "code.git.show_repository_mps.landing_candidates_limit": "10",
324+ "code.git.show_repository_mps.landing_targets_limit": "10"}
325+ with FeatureFixture(feature_flags):
326+ with person_logged_in(target_repository.owner):
327+ with StormStatementRecorder() as recorder:
328+ start_time = datetime.now()
329+ browser = self.getViewBrowser(
330+ source_repository, user=source_repository.owner)
331+ rendering_time_with_mp = datetime.now() - start_time
332+
333+ self.assertThat(recorder, HasQueryCount(Equals(56)))
334+ self.assertIsNotNone(
335+ find_tag_by_id(browser.contents, 'landing-targets'))
336+
337+ summary_fragments = find_tags_by_class(
338+ browser.contents, "branchmergepropostal-summary-fragment")
339+ self.assertEqual(10, len(summary_fragments))
340+
341+ link_match = soupmatchers.HTMLContains(
342+ soupmatchers.Tag(
343+ "link to branches", "a",
344+ text=re.compile("10 branches or more")),
345+ )
346+ self.assertThat(browser.contents, link_match)
347+
348 def test_landing_targets_query_count(self):
349 product = self.factory.makeProduct(name="foo", vcs=VCSType.GIT)
350 target_repository = self.factory.makeGitRepository(target=product)
351diff --git a/lib/lp/code/templates/branchmergeproposal-summary-fragment.pt b/lib/lp/code/templates/branchmergeproposal-summary-fragment.pt
352index 355c0c0..1f4c0fc 100644
353--- a/lib/lp/code/templates/branchmergeproposal-summary-fragment.pt
354+++ b/lib/lp/code/templates/branchmergeproposal-summary-fragment.pt
355@@ -3,7 +3,7 @@
356 xmlns:metal="http://xml.zope.org/namespaces/metal"
357 xmlns:i18n="http://xml.zope.org/namespaces/i18n">
358
359- <div>
360+ <div class="branchmergepropostal-summary-fragment">
361 <tal:merge-fragment tal:replace="structure context/@@+link-summary" />
362 </div>
363 <tal:merge-fragment tal:replace="structure context/@@+vote-summary" />