Merge lp:~abentley/launchpad/bulk-insert into lp:launchpad
- bulk-insert
- Merge into devel
Status: | Merged |
---|---|
Approved by: | j.c.sackett |
Approved revision: | no longer in the source branch. |
Merged at revision: | 14865 |
Proposed branch: | lp:~abentley/launchpad/bulk-insert |
Merge into: | lp:launchpad |
Diff against target: |
638 lines (+201/-134) 13 files modified
lib/lp/code/configure.zcml (+2/-2) lib/lp/code/interfaces/revision.py (+13/-2) lib/lp/code/model/branch.py (+6/-8) lib/lp/code/model/branchjob.py (+2/-3) lib/lp/code/model/revision.py (+83/-60) lib/lp/code/model/tests/test_branchjob.py (+3/-4) lib/lp/code/model/tests/test_revision.py (+41/-1) lib/lp/codehosting/scanner/buglinks.py (+4/-5) lib/lp/codehosting/scanner/bzrsync.py (+15/-21) lib/lp/codehosting/scanner/events.py (+13/-20) lib/lp/codehosting/scanner/tests/test_buglinks.py (+3/-5) lib/lp/codehosting/scanner/tests/test_bzrsync.py (+3/-3) lib/lp/testing/factory.py (+13/-0) |
To merge this branch: | bzr merge lp:~abentley/launchpad/bulk-insert |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
j.c.sackett (community) | Approve | ||
Richard Harding (community) | code* | Approve | |
Review via email: mp+94271@code.launchpad.net |
Commit message
Optimize branch scanner.
Description of the change
= Summary =
Partially address bug #808930: Timeout running branch scanner job
== Proposed fix ==
Optimize DB access in the branch scanner by doing bulk row insertions.
This takes a full scan of lp:~irar/gcc-linaro/slp-for-any-nops-4.6/ down to ~2 minutes locally.
== Pre-implementation notes ==
Discussed with gmb
== Implementation details ==
Change INewRevisionEvent to INewMainlineRev
Similarly, change got_new_revision to got_new_
Change createBranchRev
Change RevisionSet.
Change RevisionSet.
Simplify RevisionSet.
syncOneRevision becomes syncRevisions.
Scanner loops use chunks of 10k, as this shows some performance advantage over 1k loops. (100k does not.)
== Tests ==
bin/test -vt test_branchjob -t test_revision -t test_buglinks -t test_bzrsync
== Demo and Q/A ==
None
= Launchpad lint =
Checking for conflicts and issues in changed files.
Linting changed files:
lib/lp/
lib/lp/
lib/lp/
lib/lp/
lib/lp/
lib/lp/
lib/lp/
lib/lp/
lib/lp/
lib/lp/
lib/lp/
lib/lp/
lib/lp/
Aaron Bentley (abentley) wrote : | # |
Richard Harding (rharding) wrote : | # |
Overall this looks good. As someone not familiar with the code in question a few comments would have helped. It took me a lot of reads through newFromBazaarRe
Is there a general frowning on *args? A lot of places needed to be updated to change their calling arguments to be wrapped in a list:
RevisionSet
I can't help but wonder if allowing these to come in as
def newFromBazaarRe
would end up being a bit cleaner. I've not seen much of that practice so far though.
Aaron Bentley (abentley) wrote : | # |
> Is there a general frowning on *args?
I don't know if there's a general one, but I tend to avoid them for this kind of use, because it makes it harder to change the signature later. *args and optional arguments don't play well together. When "foo(*branches)" becomes "foo(dry_run=False, *branches)", lots of code breaks, and the method becomes more awkward to call. [And foo(*args, **kwargs) isn't explicit that dry_run is an acceptable kwarg]
I feel like the gain from using *args that way is negligible, and there are real problems, so I avoid it.
I'm comfortable using *args and **kwargs when I'm wrapping things.
Aaron
j.c.sackett (jcsackett) wrote : | # |
Aaron--
This looks good; I agree with Rick's request for a bit more comments in newFromBazaarRe
Rick--
You're right that the *args format would be cleaner in some cases, but here it looks like the more common case is that the param passed in is a list already, in which case the function would have to do something like `revisions = args[0]`. While I agree it may be a good idea to start raising the use of the *args pattern more, I don't think it's a net gain here.
Preview Diff
1 | === modified file 'lib/lp/code/configure.zcml' |
2 | --- lib/lp/code/configure.zcml 2011-12-24 17:49:30 +0000 |
3 | +++ lib/lp/code/configure.zcml 2012-02-24 02:20:24 +0000 |
4 | @@ -514,8 +514,8 @@ |
5 | for="lp.codehosting.scanner.events.IRevisionsRemoved" |
6 | handler="lp.codehosting.scanner.email.send_removed_revision_emails"/> |
7 | <subscriber |
8 | - for="lp.codehosting.scanner.events.INewRevision" |
9 | - handler="lp.codehosting.scanner.buglinks.got_new_revision"/> |
10 | + for="lp.codehosting.scanner.events.INewMainlineRevisions" |
11 | + handler="lp.codehosting.scanner.buglinks.got_new_mainline_revisions"/> |
12 | <subscriber |
13 | for="lp.codehosting.scanner.events.IScanCompleted" |
14 | handler="lp.codehosting.scanner.mergedetection.auto_merge_proposals"/> |
15 | |
16 | === modified file 'lib/lp/code/interfaces/revision.py' |
17 | --- lib/lp/code/interfaces/revision.py 2011-12-24 16:54:44 +0000 |
18 | +++ lib/lp/code/interfaces/revision.py 2012-02-24 02:20:24 +0000 |
19 | @@ -134,8 +134,19 @@ |
20 | parent_ids, properties): |
21 | """Create a new Revision with the given revision ID.""" |
22 | |
23 | - def newFromBazaarRevision(bzr_revision): |
24 | - """Create a new Revision from the given Bazaar Revision object.""" |
25 | + def newFromBazaarRevisions(revision_batch): |
26 | + """Create new Revisions from the given Bazaar Revisions. |
27 | + |
28 | + This method allows us to insert Revisions in bulk, which is |
29 | + important for larger branches. |
30 | + """ |
31 | + |
32 | + def acquireRevisionAuthors(names): |
33 | + """Return a dict of RevisionAuthors with the specified names. |
34 | + |
35 | + If a RevisionAuthor is not present in the database, it is created |
36 | + first. |
37 | + """ |
38 | |
39 | def getTipRevisionsForBranches(branches): |
40 | """Get the tip branch revisions for the specified branches. |
41 | |
42 | === modified file 'lib/lp/code/model/branch.py' |
43 | --- lib/lp/code/model/branch.py 2012-02-20 02:07:55 +0000 |
44 | +++ lib/lp/code/model/branch.py 2012-02-24 02:20:24 +0000 |
45 | @@ -28,6 +28,7 @@ |
46 | And, |
47 | Count, |
48 | Desc, |
49 | + Insert, |
50 | NamedFunc, |
51 | Not, |
52 | Or, |
53 | @@ -132,7 +133,9 @@ |
54 | validate_public_person, |
55 | ) |
56 | from lp.services.config import config |
57 | -from lp.services.database.bulk import load_related |
58 | +from lp.services.database.bulk import ( |
59 | + load_related, |
60 | + ) |
61 | from lp.services.database.constants import ( |
62 | DEFAULT, |
63 | UTC_NOW, |
64 | @@ -883,13 +886,8 @@ |
65 | CREATE TEMPORARY TABLE RevidSequence |
66 | (revision_id text, sequence integer) |
67 | """) |
68 | - data = [] |
69 | - for revid, sequence in revision_id_sequence_pairs: |
70 | - data.append('(%s, %s)' % sqlvalues(revid, sequence)) |
71 | - data = ', '.join(data) |
72 | - store.execute( |
73 | - "INSERT INTO RevidSequence (revision_id, sequence) VALUES %s" |
74 | - % data) |
75 | + store.execute(Insert(('revision_id', 'sequence'), |
76 | + table=['RevidSequence'], expr=revision_id_sequence_pairs)) |
77 | store.execute( |
78 | """ |
79 | INSERT INTO BranchRevision (branch, revision, sequence) |
80 | |
81 | === modified file 'lib/lp/code/model/branchjob.py' |
82 | --- lib/lp/code/model/branchjob.py 2012-02-15 22:01:27 +0000 |
83 | +++ lib/lp/code/model/branchjob.py 2012-02-24 02:20:24 +0000 |
84 | @@ -657,11 +657,10 @@ |
85 | merged_revisions = self.getMergedRevisionIDs(revision_id, graph) |
86 | authors = self.getAuthors(merged_revisions, graph) |
87 | revision_set = RevisionSet() |
88 | - rev_authors = set(revision_set.acquireRevisionAuthor(author) for |
89 | - author in authors) |
90 | + rev_authors = revision_set.acquireRevisionAuthors(authors) |
91 | outf = StringIO() |
92 | pretty_authors = [] |
93 | - for rev_author in rev_authors: |
94 | + for rev_author in rev_authors.values(): |
95 | if rev_author.person is None: |
96 | displayname = rev_author.name |
97 | else: |
98 | |
99 | === modified file 'lib/lp/code/model/revision.py' |
100 | --- lib/lp/code/model/revision.py 2011-12-30 06:14:56 +0000 |
101 | +++ lib/lp/code/model/revision.py 2012-02-24 02:20:24 +0000 |
102 | @@ -25,7 +25,6 @@ |
103 | ForeignKey, |
104 | IntCol, |
105 | SQLMultipleJoin, |
106 | - SQLObjectNotFound, |
107 | StringCol, |
108 | ) |
109 | from storm.expr import ( |
110 | @@ -61,6 +60,7 @@ |
111 | from lp.registry.interfaces.product import IProduct |
112 | from lp.registry.interfaces.projectgroup import IProjectGroup |
113 | from lp.registry.model.person import ValidPersonCache |
114 | +from lp.services.database.bulk import create |
115 | from lp.services.database.constants import ( |
116 | DEFAULT, |
117 | UTC_NOW, |
118 | @@ -73,18 +73,12 @@ |
119 | from lp.services.database.sqlbase import ( |
120 | quote, |
121 | SQLBase, |
122 | - sqlvalues, |
123 | ) |
124 | from lp.services.helpers import shortlist |
125 | from lp.services.identity.interfaces.emailaddress import ( |
126 | EmailAddressStatus, |
127 | IEmailAddressSet, |
128 | ) |
129 | -from lp.services.webapp.interfaces import ( |
130 | - DEFAULT_FLAVOR, |
131 | - IStoreSelector, |
132 | - MAIN_STORE, |
133 | - ) |
134 | |
135 | |
136 | class Revision(SQLBase): |
137 | @@ -277,13 +271,13 @@ |
138 | properties = {} |
139 | if _date_created is None: |
140 | _date_created = UTC_NOW |
141 | - author = self.acquireRevisionAuthor(revision_author) |
142 | + authors = self.acquireRevisionAuthors([revision_author]) |
143 | |
144 | revision = Revision( |
145 | revision_id=revision_id, |
146 | log_body=log_body, |
147 | revision_date=revision_date, |
148 | - revision_author=author, |
149 | + revision_author=authors[revision_author], |
150 | date_created=_date_created) |
151 | # Don't create future revisions. |
152 | if revision.revision_date > revision.date_created: |
153 | @@ -303,22 +297,29 @@ |
154 | |
155 | return revision |
156 | |
157 | - def acquireRevisionAuthor(self, name): |
158 | - """Find or create the RevisionAuthor with the specified name. |
159 | + def acquireRevisionAuthors(self, author_names): |
160 | + """Find or create the RevisionAuthors with the specified names. |
161 | |
162 | - Name may be any arbitrary string, but if it is an email-id, and |
163 | + A name may be any arbitrary string, but if it is an email-id, and |
164 | its email address is a verified email address, it will be |
165 | automatically linked to the corresponding Person. |
166 | |
167 | Email-ids come in two major forms: |
168 | "Foo Bar" <foo@bar.com> |
169 | foo@bar.com (Foo Bar) |
170 | + :return: a dict of name -> RevisionAuthor |
171 | """ |
172 | - # create a RevisionAuthor if necessary: |
173 | - try: |
174 | - return RevisionAuthor.byName(name) |
175 | - except SQLObjectNotFound: |
176 | - return self._createRevisionAuthor(name) |
177 | + store = IMasterStore(Revision) |
178 | + author_names = set(author_names) |
179 | + authors = {} |
180 | + for author in store.find(RevisionAuthor, |
181 | + RevisionAuthor.name.is_in(author_names)): |
182 | + authors[author.name] = author |
183 | + missing = author_names - set(authors.keys()) |
184 | + # create missing RevisionAuthors |
185 | + for name in missing: |
186 | + authors[name] = self._createRevisionAuthor(name) |
187 | + return authors |
188 | |
189 | def _timestampToDatetime(self, timestamp): |
190 | """Convert the given timestamp to a datetime object. |
191 | @@ -339,54 +340,76 @@ |
192 | revision_date += timedelta(seconds=timestamp - int_timestamp) |
193 | return revision_date |
194 | |
195 | - def newFromBazaarRevision(self, bzr_revision): |
196 | + def newFromBazaarRevisions(self, revisions): |
197 | """See `IRevisionSet`.""" |
198 | - revision_id = bzr_revision.revision_id |
199 | - revision_date = self._timestampToDatetime(bzr_revision.timestamp) |
200 | - authors = bzr_revision.get_apparent_authors() |
201 | - # XXX: JonathanLange 2009-05-01 bug=362686: We can only have one |
202 | - # author per revision, so we use the first on the assumption that |
203 | - # this is the primary author. |
204 | - try: |
205 | - author = authors[0] |
206 | - except IndexError: |
207 | - author = None |
208 | - return self.new( |
209 | - revision_id=revision_id, |
210 | - log_body=bzr_revision.message, |
211 | - revision_date=revision_date, |
212 | - revision_author=author, |
213 | - parent_ids=bzr_revision.parent_ids, |
214 | - properties=bzr_revision.properties) |
215 | + |
216 | + # Find all author names for these revisions. |
217 | + author_names = [] |
218 | + for bzr_revision in revisions: |
219 | + authors = bzr_revision.get_apparent_authors() |
220 | + try: |
221 | + author = authors[0] |
222 | + except IndexError: |
223 | + author = None |
224 | + author_names.append(author) |
225 | + # Get or make every RevisionAuthor for these revisions. |
226 | + revision_authors = dict( |
227 | + (name, author.id) for name, author in |
228 | + self.acquireRevisionAuthors(author_names).items()) |
229 | + |
230 | + # Collect all data for making Revision objects. |
231 | + data = [] |
232 | + for bzr_revision, author_name in zip(revisions, author_names): |
233 | + revision_id = bzr_revision.revision_id |
234 | + revision_date = self._timestampToDatetime(bzr_revision.timestamp) |
235 | + revision_author = revision_authors[author_name] |
236 | + |
237 | + data.append( |
238 | + (revision_id, bzr_revision.message, revision_date, |
239 | + revision_author)) |
240 | + # Create all Revision objects. |
241 | + db_revisions = create(( |
242 | + Revision.revision_id, Revision.log_body, Revision.revision_date, |
243 | + Revision.revision_author_id), data) |
244 | + |
245 | + # Map revision_id to Revision database ID. |
246 | + revision_db_id = dict( |
247 | + (rev.revision_id, rev.id) for rev in db_revisions) |
248 | + |
249 | + # Collect all data for making RevisionParent and RevisionProperty |
250 | + # objects. |
251 | + parent_data = [] |
252 | + property_data = [] |
253 | + for bzr_revision in revisions: |
254 | + db_id = revision_db_id[bzr_revision.revision_id] |
255 | + # Property data: revision DB id, name, value. |
256 | + for name, value in bzr_revision.properties.iteritems(): |
257 | + property_data.append((db_id, name, value)) |
258 | + parent_ids = bzr_revision.parent_ids |
259 | + # Parent data: revision DB id, sequence, revision_id |
260 | + seen_parents = set() |
261 | + for sequence, parent_id in enumerate(parent_ids): |
262 | + if parent_id in seen_parents: |
263 | + continue |
264 | + seen_parents.add(parent_id) |
265 | + parent_data.append((db_id, sequence, parent_id)) |
266 | + # Create all RevisionParent objects. |
267 | + create(( |
268 | + RevisionParent.revisionID, RevisionParent.sequence, |
269 | + RevisionParent.parent_id), parent_data, return_created=False) |
270 | + |
271 | + # Create all RevisionProperty objects. |
272 | + create(( |
273 | + RevisionProperty.revisionID, RevisionProperty.name, |
274 | + RevisionProperty.value), property_data, return_created=False) |
275 | |
276 | @staticmethod |
277 | def onlyPresent(revids): |
278 | """See `IRevisionSet`.""" |
279 | - if not revids: |
280 | - return set() |
281 | - store = getUtility(IStoreSelector).get(MAIN_STORE, DEFAULT_FLAVOR) |
282 | - store.execute( |
283 | - """ |
284 | - CREATE TEMPORARY TABLE Revids |
285 | - (revision_id text) |
286 | - """) |
287 | - data = [] |
288 | - for revid in revids: |
289 | - data.append('(%s)' % sqlvalues(revid)) |
290 | - data = ', '.join(data) |
291 | - store.execute( |
292 | - "INSERT INTO Revids (revision_id) VALUES %s" % data) |
293 | - result = store.execute( |
294 | - """ |
295 | - SELECT Revids.revision_id |
296 | - FROM Revids, Revision |
297 | - WHERE Revids.revision_id = Revision.revision_id |
298 | - """) |
299 | - present = set() |
300 | - for row in result.get_all(): |
301 | - present.add(row[0]) |
302 | - store.execute("DROP TABLE Revids") |
303 | - return present |
304 | + store = IStore(Revision) |
305 | + clause = Revision.revision_id.is_in(revids) |
306 | + present = store.find(Revision.revision_id, clause) |
307 | + return set(present) |
308 | |
309 | def checkNewVerifiedEmail(self, email): |
310 | """See `IRevisionSet`.""" |
311 | |
312 | === modified file 'lib/lp/code/model/tests/test_branchjob.py' |
313 | --- lib/lp/code/model/tests/test_branchjob.py 2012-02-15 22:01:27 +0000 |
314 | +++ lib/lp/code/model/tests/test_branchjob.py 2012-02-24 02:20:24 +0000 |
315 | @@ -392,10 +392,9 @@ |
316 | existing = branch.getBranchRevision( |
317 | revision_id=bzr_revision.revision_id) |
318 | if existing is None: |
319 | - revision = RevisionSet().newFromBazaarRevision(bzr_revision) |
320 | - else: |
321 | - revision = RevisionSet().getByRevisionId( |
322 | - bzr_revision.revision_id) |
323 | + RevisionSet().newFromBazaarRevisions([bzr_revision]) |
324 | + revision = RevisionSet().getByRevisionId( |
325 | + bzr_revision.revision_id) |
326 | try: |
327 | revno = bzr_branch.revision_id_to_revno(revision.revision_id) |
328 | except bzr_errors.NoSuchRevision: |
329 | |
330 | === modified file 'lib/lp/code/model/tests/test_revision.py' |
331 | --- lib/lp/code/model/tests/test_revision.py 2011-12-30 06:14:56 +0000 |
332 | +++ lib/lp/code/model/tests/test_revision.py 2012-02-24 02:20:24 +0000 |
333 | @@ -9,6 +9,7 @@ |
334 | datetime, |
335 | timedelta, |
336 | ) |
337 | +from testtools.matchers import Equals |
338 | import time |
339 | from unittest import TestCase |
340 | |
341 | @@ -27,7 +28,9 @@ |
342 | ) |
343 | from lp.registry.model.karma import Karma |
344 | from lp.scripts.garbo import RevisionAuthorEmailLinker |
345 | -from lp.services.database.lpstorm import IMasterObject |
346 | +from lp.services.database.lpstorm import ( |
347 | + IMasterObject, |
348 | + ) |
349 | from lp.services.database.sqlbase import cursor |
350 | from lp.services.identity.interfaces.account import AccountStatus |
351 | from lp.services.log.logger import DevNullLogger |
352 | @@ -39,11 +42,13 @@ |
353 | from lp.testing import ( |
354 | login, |
355 | logout, |
356 | + StormStatementRecorder, |
357 | TestCaseWithFactory, |
358 | time_counter, |
359 | ) |
360 | from lp.testing.factory import LaunchpadObjectFactory |
361 | from lp.testing.layers import DatabaseFunctionalLayer |
362 | +from lp.testing.matchers import HasQueryCount |
363 | |
364 | |
365 | class TestRevisionCreationDate(TestCaseWithFactory): |
366 | @@ -257,6 +262,41 @@ |
367 | found = self.revision_set.getByRevisionId('nonexistent') |
368 | self.assertIs(None, found) |
369 | |
370 | + def test_newFromBazaarRevisions(self): |
371 | + # newFromBazaarRevisions behaves as expected. |
372 | + # only branchscanner can SELECT revisionproperties. |
373 | + self.becomeDbUser('branchscanner') |
374 | + bzr_revisions = [ |
375 | + self.factory.makeBzrRevision('rev-1', prop1="foo"), |
376 | + self.factory.makeBzrRevision('rev-2', parent_ids=['rev-1']) |
377 | + ] |
378 | + with StormStatementRecorder() as recorder: |
379 | + self.revision_set.newFromBazaarRevisions(bzr_revisions) |
380 | + rev_1 = self.revision_set.getByRevisionId('rev-1') |
381 | + self.assertEqual( |
382 | + bzr_revisions[0].committer, rev_1.revision_author.name) |
383 | + self.assertEqual( |
384 | + bzr_revisions[0].message, rev_1.log_body) |
385 | + self.assertEqual( |
386 | + datetime(1970, 1, 1, 0, 0, tzinfo=pytz.UTC), rev_1.revision_date) |
387 | + self.assertEqual([], rev_1.parents) |
388 | + self.assertEqual({'prop1': 'foo'}, rev_1.getProperties()) |
389 | + rev_2 = self.revision_set.getByRevisionId('rev-2') |
390 | + self.assertEqual(['rev-1'], rev_2.parent_ids) |
391 | + # Really, less than 9 is great, but if the count improves, we should |
392 | + # tighten this restriction. |
393 | + self.assertThat(recorder, HasQueryCount(Equals(8))) |
394 | + |
395 | + def test_acquireRevisionAuthors(self): |
396 | + # AcquireRevisionAuthors creates new authors only if none exists with |
397 | + # that name. |
398 | + author1 = self.revision_set.acquireRevisionAuthors(['name1'])['name1'] |
399 | + self.assertEqual(author1.name, 'name1') |
400 | + Store.of(author1).flush() |
401 | + author2 = self.revision_set.acquireRevisionAuthors(['name1'])['name1'] |
402 | + self.assertEqual( |
403 | + removeSecurityProxy(author1).id, removeSecurityProxy(author2).id) |
404 | + |
405 | |
406 | class TestRevisionGetBranch(TestCaseWithFactory): |
407 | """Test the `getBranch` method of the revision.""" |
408 | |
409 | === modified file 'lib/lp/codehosting/scanner/buglinks.py' |
410 | --- lib/lp/codehosting/scanner/buglinks.py 2011-05-27 21:12:25 +0000 |
411 | +++ lib/lp/codehosting/scanner/buglinks.py 2012-02-24 02:20:24 +0000 |
412 | @@ -89,8 +89,7 @@ |
413 | registrant=getUtility(ILaunchpadCelebrities).janitor) |
414 | |
415 | |
416 | -def got_new_revision(new_revision): |
417 | - if new_revision.isMainline(): |
418 | - linker = BugBranchLinker(new_revision.db_branch) |
419 | - linker.createBugBranchLinksForRevision(new_revision.bzr_revision) |
420 | - |
421 | +def got_new_mainline_revisions(new_mainline_revisions): |
422 | + linker = BugBranchLinker(new_mainline_revisions.db_branch) |
423 | + for bzr_revision in new_mainline_revisions.bzr_revisions: |
424 | + linker.createBugBranchLinksForRevision(bzr_revision) |
425 | |
426 | === modified file 'lib/lp/codehosting/scanner/bzrsync.py' |
427 | --- lib/lp/codehosting/scanner/bzrsync.py 2012-02-15 17:29:54 +0000 |
428 | +++ lib/lp/codehosting/scanner/bzrsync.py 2012-02-24 02:20:24 +0000 |
429 | @@ -48,6 +48,7 @@ |
430 | if logger is None: |
431 | logger = logging.getLogger(self.__class__.__name__) |
432 | self.logger = logger |
433 | + self.revision_set = getUtility(IRevisionSet) |
434 | |
435 | def syncBranchAndClose(self, bzr_branch=None): |
436 | """Synchronize the database with a Bazaar branch, handling locking. |
437 | @@ -97,15 +98,9 @@ |
438 | new_db_revs = ( |
439 | new_ancestry - getUtility(IRevisionSet).onlyPresent(new_ancestry)) |
440 | self.logger.info("Adding %s new revisions.", len(new_db_revs)) |
441 | - for revids in iter_list_chunks(list(new_db_revs), 1000): |
442 | + for revids in iter_list_chunks(list(new_db_revs), 10000): |
443 | revisions = self.getBazaarRevisions(bzr_branch, revids) |
444 | - for revision in revisions: |
445 | - # This would probably go much faster if we found some way to |
446 | - # bulk-load multiple revisions at once, but as this is only |
447 | - # executed for revisions new to Launchpad, it doesn't seem |
448 | - # worth it at this stage. |
449 | - self.syncOneRevision( |
450 | - bzr_branch, revision, revids_to_insert) |
451 | + self.syncRevisions(bzr_branch, revisions, revids_to_insert) |
452 | self.deleteBranchRevisions(branchrevisions_to_delete) |
453 | self.insertBranchRevisions(bzr_branch, revids_to_insert) |
454 | transaction.commit() |
455 | @@ -239,24 +234,23 @@ |
456 | revisions = bzr_branch.repository.get_parent_map(revisions) |
457 | return bzr_branch.repository.get_revisions(revisions.keys()) |
458 | |
459 | - def syncOneRevision(self, bzr_branch, bzr_revision, revids_to_insert): |
460 | - """Import the revision with the given revision_id. |
461 | + def syncRevisions(self, bzr_branch, bzr_revisions, revids_to_insert): |
462 | + """Import the supplied revisions. |
463 | |
464 | :param bzr_branch: The Bazaar branch that's being scanned. |
465 | - :param bzr_revision: the revision to import |
466 | + :param bzr_revisions: the revisions to import |
467 | :type bzr_revision: bzrlib.revision.Revision |
468 | :param revids_to_insert: a dict of revision ids to integer |
469 | revno. Non-mainline revisions will be mapped to None. |
470 | """ |
471 | - revision_id = bzr_revision.revision_id |
472 | - revision_set = getUtility(IRevisionSet) |
473 | - # Revision not yet in the database. Load it. |
474 | - self.logger.debug("Inserting revision: %s", revision_id) |
475 | - db_revision = revision_set.newFromBazaarRevision(bzr_revision) |
476 | - notify( |
477 | - events.NewRevision( |
478 | - self.db_branch, bzr_branch, db_revision, bzr_revision, |
479 | - revids_to_insert[revision_id])) |
480 | + self.revision_set.newFromBazaarRevisions(bzr_revisions) |
481 | + mainline_revisions = [] |
482 | + for bzr_revision in bzr_revisions: |
483 | + if revids_to_insert[bzr_revision.revision_id] is None: |
484 | + continue |
485 | + mainline_revisions.append(bzr_revision) |
486 | + notify(events.NewMainlineRevisions( |
487 | + self.db_branch, bzr_branch, mainline_revisions)) |
488 | |
489 | @staticmethod |
490 | def revisionsToInsert(added_history, last_revno, added_ancestry): |
491 | @@ -293,7 +287,7 @@ |
492 | self.logger.info("Inserting %d branchrevision records.", |
493 | len(revids_to_insert)) |
494 | revid_seq_pairs = revids_to_insert.items() |
495 | - for revid_seq_pair_chunk in iter_list_chunks(revid_seq_pairs, 1000): |
496 | + for revid_seq_pair_chunk in iter_list_chunks(revid_seq_pairs, 10000): |
497 | self.db_branch.createBranchRevisionFromIDs(revid_seq_pair_chunk) |
498 | |
499 | def updateBranchStatus(self, bzr_history): |
500 | |
501 | === modified file 'lib/lp/codehosting/scanner/events.py' |
502 | --- lib/lp/codehosting/scanner/events.py 2010-10-15 18:33:07 +0000 |
503 | +++ lib/lp/codehosting/scanner/events.py 2012-02-24 02:20:24 +0000 |
504 | @@ -5,7 +5,7 @@ |
505 | |
506 | __metaclass__ = type |
507 | __all__ = [ |
508 | - 'NewRevision', |
509 | + 'NewMainlineRevisions', |
510 | 'RevisionsRemoved', |
511 | 'TipChanged', |
512 | ] |
513 | @@ -32,18 +32,17 @@ |
514 | self.bzr_branch = bzr_branch |
515 | |
516 | |
517 | -class INewRevision(IObjectEvent): |
518 | - """A new revision has been found in the branch.""" |
519 | - |
520 | - |
521 | -class NewRevision(ScannerEvent): |
522 | - """A new revision has been found in the branch.""" |
523 | - |
524 | - implements(INewRevision) |
525 | - |
526 | - def __init__(self, db_branch, bzr_branch, db_revision, bzr_revision, |
527 | - revno): |
528 | - """Construct a `NewRevision` event. |
529 | +class INewMainlineRevisions(IObjectEvent): |
530 | + """A new revision has been found in the branch.""" |
531 | + |
532 | + |
533 | +class NewMainlineRevisions(ScannerEvent): |
534 | + """A new revision has been found in the branch.""" |
535 | + |
536 | + implements(INewMainlineRevisions) |
537 | + |
538 | + def __init__(self, db_branch, bzr_branch, bzr_revisions): |
539 | + """Construct a `NewRevisions` event. |
540 | |
541 | :param db_branch: The database branch. |
542 | :param bzr_branch: The Bazaar branch. |
543 | @@ -53,13 +52,7 @@ |
544 | mainline. |
545 | """ |
546 | ScannerEvent.__init__(self, db_branch, bzr_branch) |
547 | - self.db_revision = db_revision |
548 | - self.bzr_revision = bzr_revision |
549 | - self.revno = revno |
550 | - |
551 | - def isMainline(self): |
552 | - """Is the new revision a mainline one?""" |
553 | - return self.revno is not None |
554 | + self.bzr_revisions = bzr_revisions |
555 | |
556 | |
557 | class ITipChanged(IObjectEvent): |
558 | |
559 | === modified file 'lib/lp/codehosting/scanner/tests/test_buglinks.py' |
560 | --- lib/lp/codehosting/scanner/tests/test_buglinks.py 2012-02-07 06:48:22 +0000 |
561 | +++ lib/lp/codehosting/scanner/tests/test_buglinks.py 2012-02-24 02:20:24 +0000 |
562 | @@ -8,7 +8,6 @@ |
563 | from bzrlib.revision import Revision |
564 | from zope.component import getUtility |
565 | from zope.event import notify |
566 | -from zope.security.proxy import removeSecurityProxy |
567 | |
568 | from lp.app.errors import NotFoundError |
569 | from lp.bugs.interfaces.bug import IBugSet |
570 | @@ -270,10 +269,9 @@ |
571 | revprops={ |
572 | 'bugs': 'https://launchpad.net/bugs/%d fixed' % bug.id}) |
573 | bzr_revision = tree.branch.repository.get_revision(revision_id) |
574 | - revno = 1 |
575 | revision_set = getUtility(IRevisionSet) |
576 | - db_revision = revision_set.newFromBazaarRevision(bzr_revision) |
577 | - notify(events.NewRevision( |
578 | - db_branch, tree.branch, db_revision, bzr_revision, revno)) |
579 | + revision_set.newFromBazaarRevisions([bzr_revision]) |
580 | + notify(events.NewMainlineRevisions( |
581 | + db_branch, tree.branch, [bzr_revision])) |
582 | bug_branch = getUtility(IBugBranchSet).getBugBranch(bug, db_branch) |
583 | self.assertIsNot(None, bug_branch) |
584 | |
585 | === modified file 'lib/lp/codehosting/scanner/tests/test_bzrsync.py' |
586 | --- lib/lp/codehosting/scanner/tests/test_bzrsync.py 2012-02-21 22:46:28 +0000 |
587 | +++ lib/lp/codehosting/scanner/tests/test_bzrsync.py 2012-02-24 02:20:24 +0000 |
588 | @@ -582,8 +582,8 @@ |
589 | self.assertIn(merge_id, branchrevisions_to_delete) |
590 | |
591 | |
592 | -class TestBzrSyncOneRevision(BzrSyncTestCase): |
593 | - """Tests for `BzrSync.syncOneRevision`.""" |
594 | +class TestBzrSyncRevisions(BzrSyncTestCase): |
595 | + """Tests for `BzrSync.syncRevisions`.""" |
596 | |
597 | def setUp(self): |
598 | BzrSyncTestCase.setUp(self) |
599 | @@ -606,7 +606,7 @@ |
600 | |
601 | # Sync the revision. The second parameter is a dict of revision ids |
602 | # to revnos, and will error if the revision id is not in the dict. |
603 | - self.bzrsync.syncOneRevision(None, fake_rev, {'rev42': None}) |
604 | + self.bzrsync.syncRevisions(None, [fake_rev], {'rev42': None}) |
605 | |
606 | # Find the revision we just synced and check that it has the correct |
607 | # date. |
608 | |
609 | === modified file 'lib/lp/testing/factory.py' |
610 | --- lib/lp/testing/factory.py 2012-02-20 02:07:55 +0000 |
611 | +++ lib/lp/testing/factory.py 2012-02-24 02:20:24 +0000 |
612 | @@ -48,6 +48,7 @@ |
613 | |
614 | from bzrlib.merge_directive import MergeDirective2 |
615 | from bzrlib.plugins.builder.recipe import BaseRecipeBranch |
616 | +from bzrlib.revision import Revision as BzrRevision |
617 | import pytz |
618 | from pytz import UTC |
619 | import simplejson |
620 | @@ -1545,6 +1546,18 @@ |
621 | return merge_proposal.generateIncrementalDiff( |
622 | old_revision, new_revision, diff) |
623 | |
624 | + def makeBzrRevision(self, revision_id=None, parent_ids=None, **kwargs): |
625 | + if revision_id is None: |
626 | + revision_id = self.getUniqueString('revision-id') |
627 | + if parent_ids is None: |
628 | + parent_ids = [] |
629 | + return BzrRevision( |
630 | + message=self.getUniqueString('message'), |
631 | + revision_id=revision_id, |
632 | + committer=self.getUniqueString('committer'), |
633 | + parent_ids=parent_ids, |
634 | + timestamp=0, timezone=0, properties=kwargs) |
635 | + |
636 | def makeRevision(self, author=None, revision_date=None, parent_ids=None, |
637 | rev_id=None, log_body=None, date_created=None): |
638 | """Create a single `Revision`.""" |
lifeless has approved LOC increase: "I agree that your branch should land because of decreased costs elsewhere paying for the code..."