Merge lp:~garyvdm/bzr/Bug427773 into lp:bzr/2.0
Proposed by: Alexander Belchenko
Status: Rejected
Rejected by: Gary van der Merwe
Proposed branch: lp:~garyvdm/bzr/Bug427773
Merge into: lp:bzr/2.0
Diff against target: None lines
To merge this branch: bzr merge lp:~garyvdm/bzr/Bug427773
Related bugs: (none listed)

Reviewer | Review Type | Date Requested | Status
---|---|---|---
bzr-core | | | Pending

Review via email: mp+11645@code.launchpad.net
Commit message
Description of the change
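The headline fix in this proposal (bug #427773, per the NEWS entry in the diff below) is that creating a TreeTransform during ``bzr merge`` could fail, for example when a stale ``limbo`` or ``pending-deletions`` directory was left behind by an earlier interrupted run, and the working tree would then be left locked. A minimal sketch of the acquire/try/unlock pattern involved; ``tree`` and ``make_transform`` are hypothetical stand-ins, not the actual bzrlib merge code:

```python
def transform_with_unlock(tree, make_transform):
    """Create a transform under a write lock, releasing the lock on failure.

    Sketch only: `tree` stands in for a bzrlib working tree and
    `make_transform` for the TreeTransform constructor.
    """
    tree.lock_tree_write()
    try:
        transform = make_transform(tree)
    except BaseException:
        # The fix described by the NEWS entry: if transform creation fails
        # (e.g. a limbo directory already exists), unlock the tree instead
        # of leaving it locked, then let the error propagate.
        tree.unlock()
        raise
    return transform
```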
Preview Diff
1 | === modified file 'Makefile' |
2 | --- Makefile 2009-08-03 20:38:39 +0000 |
3 | +++ Makefile 2009-08-27 00:53:27 +0000 |
4 | @@ -1,4 +1,4 @@ |
5 | -# Copyright (C) 2005, 2006, 2007, 2008 Canonical Ltd |
6 | +# Copyright (C) 2005, 2006, 2007, 2008, 2009 Canonical Ltd |
7 | # |
8 | # This program is free software; you can redistribute it and/or modify |
9 | # it under the terms of the GNU General Public License as published by |
10 | @@ -40,8 +40,6 @@ |
11 | |
12 | check-nodocs: extensions |
13 | $(PYTHON) -Werror -O ./bzr selftest -1v $(tests) |
14 | - @echo "Running all tests with no locale." |
15 | - LC_CTYPE= LANG=C LC_ALL= ./bzr selftest -1v $(tests) 2>&1 | sed -e 's/^/[ascii] /' |
16 | |
17 | # Run Python style checker (apt-get install pyflakes) |
18 | # |
19 | |
20 | === modified file 'NEWS' |
21 | --- NEWS 2009-08-25 05:32:07 +0000 |
22 | +++ NEWS 2009-09-11 14:23:16 +0000 |
23 | @@ -6,6 +6,309 @@ |
24 | .. contents:: List of Releases |
25 | :depth: 1 |
26 | |
27 | +In Development |
28 | +############## |
29 | + |
30 | +Compatibility Breaks |
31 | +******************** |
32 | + |
33 | +New Features |
34 | +************ |
35 | + |
36 | +* Give more control on BZR_PLUGIN_PATH by providing a way to refer to or |
37 | + disable the user, site and core plugin directories. |
38 | + (Vincent Ladeuil, #412930, #316192, #145612) |
39 | + |
40 | +Bug Fixes |
41 | +********* |
42 | + |
43 | +* Bazaar's native protocol code now correctly handles EINTR, which most |
44 | + noticeably occurs if you break in to the debugger while connected to a |
45 | + bzr+ssh server. You can now continue from the debugger (by typing |
46 | + 'c') and the process continues. However, note that pressing C-\ in the |
47 | + shell may still kill the SSH process, which is bug 162509, so you must |
48 | + send a signal to the bzr process specifically, for example by typing |
49 | + ``kill -QUIT PID`` in another shell. (Martin Pool, #341535) |
50 | + |
51 | +* ``bzr check`` in pack-0.92, 1.6 and 1.9 format repositories will no |
52 | + longer report incorrect errors about ``Missing inventory ('TREE_ROOT', ...)`` |
53 | + (Robert Collins, #416732) |
54 | + |
55 | +* ``bzr info -v`` on a 2a format still claimed that it was a "Development |
56 | + format" (John Arbash Meinel, #424392) |
57 | + |
58 | +* ``bzr merge`` and ``bzr remove-tree`` now require --force if pending |
59 | + merges are present in the working tree. |
60 | + (Vincent Ladeuil, #426344) |
61 | + |
62 | +* Clearer message when Bazaar runs out of memory, instead of a ``MemoryError`` |
63 | + traceback. (Martin Pool, #109115) |
64 | + |
65 | +* Conversion to 2a will create a single pack for all the new revisions (as |
66 | + long as it ran without interruption). This improves both ``bzr upgrade`` |
67 | + and ``bzr pull`` or ``bzr merge`` from local branches in older formats. |
68 | + The autopack logic that occurs every 100 revisions during local |
69 | + conversions was not returning that pack's identifier, which resulted in |
70 | + the partial packs created during the conversion not being consolidated |
71 | + at the end of the conversion process. (Robert Collins, #423818) |
72 | + |
73 | +* Don't restrict the command name used to run the test suite. |
74 | + (Vincent Ladeuil, #419950) |
75 | + |
76 | +* Fetches from 2a to 2a are now again requested in 'groupcompress' order. |
77 | + Groups that are seen as 'underutilized' will be repacked on-the-fly. |
78 | + This means that when the source is fully packed, there is minimal |
79 | + overhead during the fetch, but if the source is poorly packed the result |
80 | + is a fairly well packed repository (not as good as 'bzr pack' but |
81 | + good-enough.) (Robert Collins, John Arbash Meinel, #402652) |
82 | + |
83 | +* Network streams now decode adjacent records of the same type into a |
84 | + single stream, reducing layering churn. (Robert Collins) |
85 | + |
86 | +* Prevent some kinds of incomplete data from being committed to a 2a |
87 | + repository, such as revisions without inventories or inventories without |
88 | + chk_bytes root records. |
89 | + (Andrew Bennetts, #423506) |
90 | + |
91 | +* Make sure that we unlock the tree if we fail to create a TreeTransform |
92 | + object when doing a merge, and there is a limbo or pending-deletions |
93 | + directory. (Gary van der Merwe, #427773) |
94 | + |
95 | +Improvements |
96 | +************ |
97 | + |
98 | +Documentation |
99 | +************* |
100 | + |
101 | +* Help on hooks no longer says 'Not deprecated' for hooks that are |
102 | + currently supported. (Ian Clatworthy, #422415) |
103 | + |
104 | +API Changes |
105 | +*********** |
106 | + |
107 | +* ``bzrlib.tests`` now uses ``stopTestRun`` for its ``TestResult`` |
108 | + subclasses - the same as python's unittest module. (Robert Collins) |
109 | + |
110 | +Internals |
111 | +********* |
112 | + |
113 | +* ``BTreeLeafParser.extract_key`` has been tweaked slightly to reduce |
114 | + mallocs while parsing the index (approx 3=>1 mallocs per key read). |
115 | + This results in a 10% speedup while reading an index. |
116 | + (John Arbash Meinel) |
117 | + |
118 | +* The ``bzrlib.lsprof`` module has a new class ``BzrProfiler`` which makes |
119 | + profiling in some situations like callbacks and generators easier. |
120 | + (Robert Collins) |
121 | + |
122 | +Testing |
123 | +******* |
124 | + |
125 | +* Passing ``--lsprof-tests -v`` to bzr selftest will cause lsprof output to |
126 | + be output for every test. Note that this is very verbose! (Robert Collins) |
127 | + |
128 | +* Test parameterisation now does a shallow copy, not a deep copy of the test |
129 | + to be parameterised. This is not expected to break external use of test |
130 | + parameterisation, and is substantially faster. (Robert Collins) |
131 | + |
132 | + |
133 | +bzr 2.0rc2 (not released yet) |
134 | +############################# |
135 | + |
136 | +New Features |
137 | +************ |
138 | + |
139 | +* Added post_commit hook for mutable trees. This allows the keywords |
140 | + plugin to expand keywords on files changed by the commit. |
141 | + (Ian Clatworthy, #408841) |
142 | + |
143 | +Bug Fixes |
144 | +********* |
145 | + |
146 | +* Bazaar's native protocol code now correctly handles EINTR, which most |
147 | + noticeably occurs if you break in to the debugger while connected to a |
148 | + bzr+ssh server. You can now continue from the debugger (by typing |
149 | + 'c') and the process continues. However, note that pressing C-\ in the |
150 | + shell may still kill the SSH process, which is bug 162509, so you must |
151 | + send a signal to the bzr process specifically, for example by typing |
152 | + ``kill -QUIT PID`` in another shell. (Martin Pool, #341535) |
153 | + |
154 | +* ``bzr check`` in pack-0.92, 1.6 and 1.9 format repositories will no |
155 | + longer report incorrect errors about ``Missing inventory ('TREE_ROOT', ...)`` |
156 | + (Robert Collins, #416732) |
157 | + |
158 | +* ``bzr info -v`` on a 2a format still claimed that it was a "Development |
159 | + format" (John Arbash Meinel, #424392) |
160 | + |
161 | +* ``bzr log stacked-branch`` shows the full log including |
162 | + revisions that are in the fallback repository. (Regressed in 2.0rc1). |
163 | + (John Arbash Meinel, #419241) |
164 | + |
165 | +* Clearer message when Bazaar runs out of memory, instead of a ``MemoryError`` |
166 | + traceback. (Martin Pool, #109115) |
167 | + |
168 | +* Conversion to 2a will create a single pack for all the new revisions (as |
169 | + long as it ran without interruption). This improves both ``bzr upgrade`` |
170 | + and ``bzr pull`` or ``bzr merge`` from local branches in older formats. |
171 | + The autopack logic that occurs every 100 revisions during local |
172 | + conversions was not returning that pack's identifier, which resulted in |
173 | + the partial packs created during the conversion not being consolidated |
174 | + at the end of the conversion process. (Robert Collins, #423818) |
175 | + |
176 | +* Fetches from 2a to 2a are now again requested in 'groupcompress' order. |
177 | + Groups that are seen as 'underutilized' will be repacked on-the-fly. |
178 | + This means that when the source is fully packed, there is minimal |
179 | + overhead during the fetch, but if the source is poorly packed the result |
180 | + is a fairly well packed repository (not as good as 'bzr pack' but |
181 | + good-enough.) (Robert Collins, John Arbash Meinel, #402652) |
182 | + |
183 | +* Fix a potential segmentation fault when doing 'log' of a branch that had |
184 | + ghosts in its mainline. (Evaluating None as a tuple is bad.) |
185 | + (John Arbash Meinel, #419241) |
186 | + |
187 | +* ``groupcompress`` sort order is now more stable, rather than relying on |
188 | + ``topo_sort`` ordering. The implementation is now |
189 | + ``KnownGraph.gc_sort``. (John Arbash Meinel) |
190 | + |
191 | +* Local data conversion will generate correct deltas. This is a critical |
192 | + bugfix vs 2.0rc1, and all 2.0rc1 users should upgrade to 2.0rc2 before |
193 | + converting repositories. (Robert Collins, #422849) |
194 | + |
195 | +* Network streams now decode adjacent records of the same type into a |
196 | + single stream, reducing layering churn. (Robert Collins) |
197 | + |
198 | +* Prevent some kinds of incomplete data from being committed to a 2a |
199 | + repository, such as revisions without inventories, a missing chk_bytes |
200 | + record for an inventory, or a missing text referenced by an inventory. |
201 | + (Andrew Bennetts, #423506, #406687) |
202 | + |
203 | +Documentation |
204 | +************* |
205 | + |
206 | +* Fix assertion error about "_remember_remote_is_before" when pushing to |
207 | + older smart servers. |
208 | + (Andrew Bennetts, #418931) |
209 | + |
210 | +* Help on hooks no longer says 'Not deprecated' for hooks that are |
211 | + currently supported. (Ian Clatworthy, #422415) |
212 | + |
213 | +* The main table of contents now provides links to the new Migration Docs |
214 | + and Plugins Guide. (Ian Clatworthy) |
215 | + |
216 | + |
217 | +bzr 2.0rc1 |
218 | +########## |
219 | + |
220 | + |
221 | +:Codename: no worries |
222 | +:2.0rc1: 2009-08-26 |
223 | + |
224 | +This release of Bazaar makes 2a 'brisbane-core' format the |
225 | +default. Most of the work in this release now focuses on bug |
226 | +fixes and stabilization, covering both 2a and previous formats. |
227 | + |
228 | +The Bazaar team decided that 2.0 will be a long-term supported |
229 | +release, with bugfix-only releases based on it continuing for at |
230 | +least six months or until the following stable release (we said |
231 | +that previously, but that's worth repeating). |
232 | + |
233 | +Compatibility Breaks |
234 | +******************** |
235 | + |
236 | +* The default format for bzr is now ``2a``. This format brings many |
237 | + significant performance and size improvements. bzr can pull from |
238 | + any existing repository into a ``2a`` one, but can only transfer |
239 | + into ``rich-root`` repositories from ``2a``. The Upgrade guide |
240 | + has more information about this change. (Robert Collins) |
241 | + |
242 | +* On Windows auto-detection of Putty's plink.exe is disabled. |
243 | + The default SSH client on Windows is paramiko. Users can still force |
244 | + the use of plink by explicitly setting BZR_SSH=plink in the environment. |
245 | + (#414743, Alexander Belchenko) |
246 | + |
247 | +New Features |
248 | +************ |
249 | + |
250 | +* ``bzr branch --switch`` can now switch the checkout in the current directory |
251 | + to the newly created branch. (Lukáš Lalinský) |
252 | + |
253 | +Bug Fixes |
254 | +********* |
255 | + |
256 | +* Further tweaks to handling of ``bzr add`` messages about ignored files. |
257 | + (Jason Spashett, #76616) |
258 | + |
259 | +* Fetches were being requested in 'groupcompress' order, but weren't |
260 | + recombining the groups. Thus they would 'fragment' to get the correct |
261 | + order, but not 'recombine' to actually benefit from it. Until we get |
262 | + recombining to work, switching to 'unordered' fetches avoids the |
263 | + fragmentation. (John Arbash Meinel, #402645) |
264 | + |
265 | +* Fix a pycurl related test failure on karmic by recognizing an error |
266 | + raised by newer versions of pycurl. |
267 | + (Vincent Ladeuil, #306264) |
268 | + |
269 | +* Fix a test failure on karmic by making a locale test more robust. |
270 | + (Vincent Ladeuil, #413514) |
271 | + |
272 | +* Fix IndexError printing CannotBindAddress errors. |
273 | + (Martin Pool, #286871) |
274 | + |
275 | +* Fix "Revision ... not present" errors when upgrading stacked branches, |
276 | + or when doing fetches from a stacked source to a stacked target. |
277 | + (Andrew Bennetts, #399140) |
278 | + |
279 | +* ``bzr branch`` of 2a repositories over HTTP is much faster. bzr now |
280 | + batches together small fetches from 2a repositories, rather than |
281 | + fetching only a few hundred bytes at a time. |
282 | + (Andrew Bennetts, #402657) |
283 | + |
284 | +Improvements |
285 | +************ |
286 | + |
287 | +* A better description of the platform is shown in crash tracebacks, ``bzr |
288 | + --version`` and ``bzr selftest``. |
289 | + (Martin Pool, #409137) |
290 | + |
291 | +* bzr can now (again) capture crash data through the apport library, |
292 | + so that a single human-readable file can be attached to bug reports. |
293 | + This can be disabled by using ``-Dno_apport`` on the command line, or by |
294 | + putting ``no_apport`` into the ``debug_flags`` section of |
295 | + ``bazaar.conf``. |
296 | + (Martin Pool, Robert Collins, #389328) |
297 | + |
298 | +* ``bzr push`` locally on windows will no longer give a locking error with |
299 | + dirstate based formats. (Robert Collins) |
300 | + |
301 | +* ``bzr shelve`` and ``bzr unshelve`` now work on windows. |
302 | + (Robert Collins, #305006) |
303 | + |
304 | +* Commit of specific files no longer prevents using the iter_changes |
305 | + codepath. On 2a repositories, commit of specific files should now be as |
306 | + fast, or slightly faster, than a full commit. (Robert Collins) |
307 | + |
308 | +* The internal core code that handles specific file operations like |
309 | + ``bzr st FILENAME`` or ``bzr commit FILENAME`` has been changed to |
310 | + include the parent directories if they have altered, and when a |
311 | + directory stops being a directory its children are always included. This |
312 | + fixes a number of causes for ``InconsistentDelta`` errors, and permits |
313 | + faster commit of specific paths. (Robert Collins, #347649) |
314 | + |
315 | +Documentation |
316 | +************* |
317 | + |
318 | +* New developer documentation for content filtering. |
319 | + (Martin Pool) |
320 | + |
321 | +API Changes |
322 | +*********** |
323 | + |
324 | +* ``bzrlib.shelf_ui`` has had the ``from_args`` convenience methods of its |
325 | + classes changed to manage lock lifetime of the trees they open in a way |
326 | + consistent with reader-exclusive locks. (Robert Collins, #305006) |
327 | + |
328 | +Testing |
329 | +******* |
330 | |
331 | bzr 1.18.1 NOT RELEASED YET |
332 | ########################### |
333 | @@ -17,17 +320,144 @@ |
334 | conversion will commit too many copies of a file. |
335 | (Martin Pool, #415508) |
336 | |
337 | +Improvements |
338 | +************ |
339 | + |
340 | +* ``bzr push`` locally on windows will no longer give a locking error with |
341 | + dirstate based formats. (Robert Collins) |
342 | + |
343 | +* ``bzr shelve`` and ``bzr unshelve`` now work on windows. |
344 | + (Robert Collins, #305006) |
345 | + |
346 | API Changes |
347 | *********** |
348 | |
349 | +* ``bzrlib.shelf_ui`` has had the ``from_args`` convenience methods of its |
350 | + classes changed to manage lock lifetime of the trees they open in a way |
351 | + consistent with reader-exclusive locks. (Robert Collins, #305006) |
352 | + |
353 | * ``Tree.path_content_summary`` may return a size of None, when called on |
354 | a tree with content filtering where the size of the canonical form |
355 | cannot be cheaply determined. (Martin Pool) |
356 | |
357 | +* When manually creating transport servers in test cases, a new helper |
358 | + ``TestCase.start_server`` that registers a cleanup and starts the server |
359 | + should be used. (Robert Collins) |
360 | |
361 | bzr 1.18 |
362 | ######## |
363 | |
364 | +Compatibility Breaks |
365 | +******************** |
366 | + |
367 | +* Committing directly to a stacked branch from a lightweight checkout will |
368 | + no longer work. In previous versions this would appear to work but would |
369 | + generate repositories with insufficient data to create deltas, leading |
370 | + to later errors when branching or reading from the repository. |
371 | + (Robert Collins, bug #375013) |
372 | + |
373 | +New Features |
374 | +************ |
375 | + |
376 | +Bug Fixes |
377 | +********* |
378 | + |
379 | +* Fetching from 2a branches from a version-2 bzr protocol would fail to |
380 | + copy the internal inventory pages from the CHK store. This cannot happen |
381 | + in normal use as all 2a compatible clients and servers support the |
382 | + version-3 protocol, but it does cause test suite failures when testing |
383 | + downlevel protocol behaviour. (Robert Collins) |
384 | + |
385 | +* Fix a test failure on karmic by making a locale test more robust. |
386 | + (Vincent Ladeuil, #413514) |
387 | + |
388 | +* Fixed "Pack ... already exists" error when running ``bzr pack`` on a |
389 | + fully packed 2a repository. (Andrew Bennetts, #382463) |
390 | + |
391 | +* Further tweaks to handling of ``bzr add`` messages about ignored files. |
392 | + (Jason Spashett, #76616) |
393 | + |
394 | +* Properly handle fetching into a stacked branch while converting the |
395 | + data, especially when there are also ghosts. The code was filling in |
396 | + parent inventories incorrectly, and also not handling when one of the |
397 | + parents was a ghost. (John Arbash Meinel, #402778, #412198) |
398 | + |
399 | +* ``RemoteStreamSource.get_stream_for_missing_keys`` will fetch CHK |
400 | + inventory pages when appropriate (by falling back to the vfs stream |
401 | + source). (Andrew Bennetts, #406686) |
402 | + |
403 | +* StreamSource generates rich roots from non-rich root sources correctly |
404 | + now. (Andrew Bennetts, #368921) |
405 | + |
406 | +* When deciding whether a repository was compatible for upgrading or |
407 | + fetching, we previously incorrectly checked the default repository |
408 | + format for the bzrdir format, rather than the format that was actually |
409 | + present on disk. (Martin Pool, #408824) |
410 | + |
411 | +Improvements |
412 | +************ |
413 | + |
414 | +* A better description of the platform is shown in crash tracebacks, ``bzr |
415 | + --version`` and ``bzr selftest``. |
416 | + (Martin Pool, #409137) |
417 | + |
418 | +* Cross-format fetches (such as between 1.9-rich-root and 2a) via the |
419 | + smart server are more efficient now. They send inventory deltas rather |
420 | + than full inventories. The smart server has two new requests, |
421 | + ``Repository.get_stream_1.19`` and ``Repository.insert_stream_1.19`` to |
422 | + support this. (Andrew Bennetts, #374738, #385826) |
423 | + |
424 | +* Extracting the full ancestry and computing the ``merge_sort`` is now |
425 | + significantly faster. This affects things like ``bzr log -n0``. (For |
426 | + example, ``bzr log -r -10..-1 -n0 bzr.dev`` drops from 2.5s to 1.0s.) |
427 | + (John Arbash Meinel) |
428 | + |
429 | +Documentation |
430 | +************* |
431 | + |
432 | +API Changes |
433 | +*********** |
434 | + |
435 | +Internals |
436 | +********* |
437 | + |
438 | +* ``-Dstrict_locks`` can now be used to check that read and write locks |
439 | + are treated properly w.r.t. exclusivity. (We don't try to take an OS |
440 | + read lock on a file that we already have an OS write lock on.) This is |
441 | + now set by default for all tests, if you have a test which cannot be |
442 | + fixed, you can use ``self.thisFailsStrictLockCheck()`` as a |
443 | + compatibility knob. (John Arbash Meinel) |
444 | + |
445 | +* InterDifferingSerializer is now only used locally. Other fetches that |
446 | + would have used InterDifferingSerializer now use the more network |
447 | + friendly StreamSource, which now automatically does the same |
448 | + transformations as InterDifferingSerializer. (Andrew Bennetts) |
449 | + |
450 | +* ``KnownGraph`` now has a ``.topo_sort`` and ``.merge_sort`` member which |
451 | + are implemented in pyrex and significantly faster. This is exposed along |
452 | + with ``CombinedGraphIndex.find_ancestry()`` as |
453 | + ``VersionedFiles.get_known_graph_ancestry(keys)``. |
454 | + (John Arbash Meinel) |
455 | + |
456 | +* RemoteBranch.open now honours ignore_fallbacks correctly on bzr-v2 |
457 | + protocols. (Robert Collins) |
458 | + |
459 | +* The index code now has some specialized routines to extract the full |
460 | + ancestry of a key in a more efficient manner. |
461 | + ``CombinedGraphIndex.find_ancestry()``. (Time to get ancestry for |
462 | + bzr.dev drops from 1.5s down to 300ms. For OOo from 33s => 10.5s) (John |
463 | + Arbash Meinel) |
464 | + |
465 | +Testing |
466 | +******* |
467 | + |
468 | +* Install the test ssl certificate and key so that installed bzr |
469 | + can run the https tests. (Denys Duchier, #392401) |
470 | + |
471 | + |
472 | +bzr 1.18rc1 |
473 | +########### |
474 | + |
475 | :Codename: little traveller |
476 | :1.18: 2009-08-20 |
477 | :1.18rc1: 2009-08-10 |
478 | @@ -113,6 +543,9 @@ |
479 | running send and similar commands on 2a formats. |
480 | (Martin Pool, #408201) |
481 | |
482 | +* Fix crash in some invocations of ``bzr status`` in format 2a. |
483 | + (Martin Pool, #403523) |
484 | + |
485 | * Fixed export to an existing directory: if the directory is empty the |
486 | export will succeed, otherwise it fails with an error. |
487 | (Alexander Belchenko, #406174) |
488 | @@ -133,7 +566,9 @@ |
489 | * Requests for unknown methods no longer cause the smart server to log |
490 | lots of backtraces about ``UnknownSmartMethod``, ``do_chunk`` or |
491 | ``do_end``. (Andrew Bennetts, #338561) |
492 | - |
493 | + |
494 | +* Shelve will not shelve the initial add of the tree root. (Aaron Bentley) |
495 | + |
496 | * Streaming from bzr servers where there is a chain of stacked branches |
497 | (A stacked on B stacked on C) will now work. (Robert Collins, #406597) |
498 | |
499 | @@ -245,6 +680,17 @@ |
500 | ``countTestsCases``. (Robert Collins) |
501 | |
502 | |
503 | +bzr 1.17.1 (unreleased) |
504 | +####################### |
505 | + |
506 | +Bug Fixes |
507 | +********* |
508 | + |
509 | +* The optional ``_knit_load_data_pyx`` C extension was never being |
510 | + imported. This caused significant slowdowns when reading data from |
511 | + knit format repositories. (Andrew Bennetts, #405653) |
512 | + |
513 | + |
514 | bzr 1.17 "So late it's brunch" 2009-07-20 |
515 | ######################################### |
516 | :Codename: so-late-its-brunch |
517 | @@ -743,6 +1189,9 @@ |
518 | Testing |
519 | ******* |
520 | |
521 | +* ``make check`` no longer repeats the test run in ``LANG=C``. |
522 | + (Martin Pool, #386180) |
523 | + |
524 | * The number of cores is now correctly detected on OSX. (John Szakmeister) |
525 | |
526 | * The number of cores is also detected on Solaris and win32. (Vincent Ladeuil) |
527 | @@ -4723,7 +5172,7 @@ |
528 | checkouts. (Aaron Bentley, #182040) |
529 | |
530 | * Stop polluting /tmp when running selftest. |
531 | - (Vincent Ladeuil, #123623) |
532 | + (Vincent Ladeuil, #123363) |
533 | |
534 | * Switch from NFKC => NFC for normalization checks. NFC allows a few |
535 | more characters which should be considered valid. |
536 | |
537 | === modified file 'bzr' |
538 | --- bzr 2009-07-30 23:54:26 +0000 |
539 | +++ bzr 2009-08-28 05:11:10 +0000 |
540 | @@ -23,7 +23,7 @@ |
541 | import warnings |
542 | |
543 | # update this on each release |
544 | -_script_version = (1, 18, 0) |
545 | +_script_version = (2, 1, 0) |
546 | |
547 | if __doc__ is None: |
548 | print "bzr does not support python -OO." |
549 | |
550 | === modified file 'bzrlib/__init__.py' |
551 | --- bzrlib/__init__.py 2009-08-20 08:38:09 +0000 |
552 | +++ bzrlib/__init__.py 2009-08-30 21:34:42 +0000 |
553 | @@ -50,7 +50,7 @@ |
554 | # Python version 2.0 is (2, 0, 0, 'final', 0)." Additionally we use a |
555 | # releaselevel of 'dev' for unreleased under-development code. |
556 | |
557 | -version_info = (1, 18, 0, 'final', 0) |
558 | +version_info = (2, 1, 0, 'dev', 0) |
559 | |
560 | # API compatibility version: bzrlib is currently API compatible with 1.15. |
561 | api_minimum_version = (1, 17, 0) |
562 | @@ -70,6 +70,8 @@ |
563 | 1.2dev |
564 | >>> print _format_version_tuple((1, 1, 1, 'candidate', 2)) |
565 | 1.1.1rc2 |
566 | + >>> print bzrlib._format_version_tuple((2, 1, 0, 'beta', 1)) |
567 | + 2.1b1 |
568 | >>> print _format_version_tuple((1, 4, 0)) |
569 | 1.4 |
570 | >>> print _format_version_tuple((1, 4)) |
571 | |
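As an aside, the doctest added above pins how version tuples render as strings. A simplified re-implementation consistent with those doctest examples (the real ``_format_version_tuple`` in bzrlib handles more cases; this sketch is illustrative only):

```python
def format_version_tuple(version_info):
    # Matches the doctest examples: (2, 1, 0, 'beta', 1) -> '2.1b1',
    # (1, 1, 1, 'candidate', 2) -> '1.1.1rc2',
    # (1, 2, 0, 'dev', 0) -> '1.2dev', and (1, 4, 0) or (1, 4) -> '1.4'.
    main = '.'.join(str(p) for p in version_info[:3])
    if len(version_info) >= 3 and version_info[2] == 0:
        # Drop a trailing .0 micro version, as the doctest output shows.
        main = '.'.join(str(p) for p in version_info[:2])
    if len(version_info) <= 3:
        return main
    release_level, sub = version_info[3], version_info[4]
    suffix = {'dev': 'dev%d' % sub, 'beta': 'b%d' % sub,
              'candidate': 'rc%d' % sub, 'final': ''}[release_level]
    if release_level == 'dev' and sub == 0:
        suffix = 'dev'
    return main + suffix
```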
572 | === modified file 'bzrlib/_btree_serializer_pyx.pyx' |
573 | --- bzrlib/_btree_serializer_pyx.pyx 2009-06-22 12:52:39 +0000 |
574 | +++ bzrlib/_btree_serializer_pyx.pyx 2009-09-04 21:16:14 +0000 |
575 | @@ -1,4 +1,4 @@ |
576 | -# Copyright (C) 2008 Canonical Ltd |
577 | +# Copyright (C) 2008, 2009 Canonical Ltd |
578 | # |
579 | # This program is free software; you can redistribute it and/or modify |
580 | # it under the terms of the GNU General Public License as published by |
581 | @@ -41,8 +41,11 @@ |
582 | int PyString_AsStringAndSize_ptr(PyObject *, char **buf, Py_ssize_t *len) |
583 | void PyString_InternInPlace(PyObject **) |
584 | int PyTuple_CheckExact(object t) |
585 | + object PyTuple_New(Py_ssize_t n_entries) |
586 | + void PyTuple_SET_ITEM(object, Py_ssize_t offset, object) # steals the ref |
587 | Py_ssize_t PyTuple_GET_SIZE(object t) |
588 | PyObject *PyTuple_GET_ITEM_ptr_object "PyTuple_GET_ITEM" (object tpl, int index) |
589 | + void Py_INCREF(object) |
590 | void Py_DECREF_ptr "Py_DECREF" (PyObject *) |
591 | |
592 | cdef extern from "string.h": |
593 | @@ -140,14 +143,12 @@ |
594 | cdef char *temp_ptr |
595 | cdef int loop_counter |
596 | # keys are tuples |
597 | - loop_counter = 0 |
598 | - key_segments = [] |
599 | - while loop_counter < self.key_length: |
600 | - loop_counter = loop_counter + 1 |
601 | + key = PyTuple_New(self.key_length) |
602 | + for loop_counter from 0 <= loop_counter < self.key_length: |
603 | # grab a key segment |
604 | temp_ptr = <char*>memchr(self._start, c'\0', last - self._start) |
605 | if temp_ptr == NULL: |
606 | - if loop_counter == self.key_length: |
607 | + if loop_counter + 1 == self.key_length: |
608 | # capture to last |
609 | temp_ptr = last |
610 | else: |
611 | @@ -164,8 +165,9 @@ |
612 | temp_ptr - self._start) |
613 | # advance our pointer |
614 | self._start = temp_ptr + 1 |
615 | - PyList_Append(key_segments, key_element) |
616 | - return tuple(key_segments) |
617 | + Py_INCREF(key_element) |
618 | + PyTuple_SET_ITEM(key, loop_counter, key_element) |
619 | + return key |
620 | |
621 | cdef int process_line(self) except -1: |
622 | """Process a line in the bytes.""" |
623 | |
624 | === modified file 'bzrlib/_dirstate_helpers_pyx.pyx' |
625 | --- bzrlib/_dirstate_helpers_pyx.pyx 2009-07-27 05:44:19 +0000 |
626 | +++ bzrlib/_dirstate_helpers_pyx.pyx 2009-08-28 05:00:33 +0000 |
627 | @@ -28,7 +28,7 @@ |
628 | |
629 | from bzrlib import cache_utf8, errors, osutils |
630 | from bzrlib.dirstate import DirState |
631 | -from bzrlib.osutils import pathjoin, splitpath |
632 | +from bzrlib.osutils import parent_directories, pathjoin, splitpath |
633 | |
634 | |
635 | # This is the Windows equivalent of ENOTDIR |
636 | @@ -963,15 +963,21 @@ |
637 | |
638 | cdef class ProcessEntryC: |
639 | |
640 | + cdef int doing_consistency_expansion |
641 | cdef object old_dirname_to_file_id # dict |
642 | cdef object new_dirname_to_file_id # dict |
643 | cdef object last_source_parent |
644 | cdef object last_target_parent |
645 | - cdef object include_unchanged |
646 | + cdef int include_unchanged |
647 | + cdef int partial |
648 | cdef object use_filesystem_for_exec |
649 | cdef object utf8_decode |
650 | cdef readonly object searched_specific_files |
651 | + cdef readonly object searched_exact_paths |
652 | cdef object search_specific_files |
653 | + # The parents up to the root of the paths we are searching. |
654 | + # After all normal paths are returned, these specific items are returned. |
655 | + cdef object search_specific_file_parents |
656 | cdef object state |
657 | # Current iteration variables: |
658 | cdef object current_root |
659 | @@ -989,31 +995,48 @@ |
660 | cdef object current_block_list |
661 | cdef object current_dir_info |
662 | cdef object current_dir_list |
663 | + cdef object _pending_consistent_entries # list |
664 | cdef int path_index |
665 | cdef object root_dir_info |
666 | cdef object bisect_left |
667 | cdef object pathjoin |
668 | cdef object fstat |
669 | + # A set of the ids we've output when doing partial output. |
670 | + cdef object seen_ids |
671 | cdef object sha_file |
672 | |
673 | def __init__(self, include_unchanged, use_filesystem_for_exec, |
674 | search_specific_files, state, source_index, target_index, |
675 | want_unversioned, tree): |
676 | + self.doing_consistency_expansion = 0 |
677 | self.old_dirname_to_file_id = {} |
678 | self.new_dirname_to_file_id = {} |
679 | + # Are we doing a partial iter_changes? |
680 | + self.partial = set(['']).__ne__(search_specific_files) |
681 | # Using a list so that we can access the values and change them in |
682 | # nested scope. Each one is [path, file_id, entry] |
683 | self.last_source_parent = [None, None] |
684 | self.last_target_parent = [None, None] |
685 | - self.include_unchanged = include_unchanged |
686 | + if include_unchanged is None: |
687 | + self.include_unchanged = False |
688 | + else: |
689 | + self.include_unchanged = int(include_unchanged) |
690 | self.use_filesystem_for_exec = use_filesystem_for_exec |
691 | self.utf8_decode = cache_utf8._utf8_decode |
692 | # for all search_indexs in each path at or under each element of |
693 | - # search_specific_files, if the detail is relocated: add the id, and add the |
694 | - # relocated path as one to search if its not searched already. If the |
695 | - # detail is not relocated, add the id. |
696 | + # search_specific_files, if the detail is relocated: add the id, and |
697 | + # add the relocated path as one to search if its not searched already. |
698 | + # If the detail is not relocated, add the id. |
699 | self.searched_specific_files = set() |
700 | + # When we search exact paths without expanding downwards, we record |
701 | + # that here. |
702 | + self.searched_exact_paths = set() |
703 | self.search_specific_files = search_specific_files |
704 | + # The parents up to the root of the paths we are searching. |
705 | + # After all normal paths are returned, these specific items are returned. |
706 | + self.search_specific_file_parents = set() |
707 | + # The ids we've sent out in the delta. |
708 | + self.seen_ids = set() |
709 | self.state = state |
710 | self.current_root = None |
711 | self.current_root_unicode = None |
712 | @@ -1035,26 +1058,30 @@ |
713 | self.current_block_pos = -1 |
714 | self.current_dir_info = None |
715 | self.current_dir_list = None |
716 | + self._pending_consistent_entries = [] |
717 | self.path_index = 0 |
718 | self.root_dir_info = None |
719 | self.bisect_left = bisect.bisect_left |
720 | self.pathjoin = osutils.pathjoin |
721 | self.fstat = os.fstat |
722 | self.sha_file = osutils.sha_file |
723 | + if target_index != 0: |
724 | + # A lot of code in here depends on target_index == 0 |
725 | + raise errors.BzrError('unsupported target index') |
726 | |
727 | cdef _process_entry(self, entry, path_info): |
728 | """Compare an entry and real disk to generate delta information. |
729 | |
730 | :param path_info: top_relpath, basename, kind, lstat, abspath for |
731 | - the path of entry. If None, then the path is considered absent. |
732 | - (Perhaps we should pass in a concrete entry for this ?) |
733 | + the path of entry. If None, then the path is considered absent in |
734 | + the target (Perhaps we should pass in a concrete entry for this ?) |
735 | Basename is returned as a utf8 string because we expect this |
736 | tuple will be ignored, and don't want to take the time to |
737 | decode. |
738 | :return: (iter_changes_result, changed). If the entry has not been |
739 | handled then changed is None. Otherwise it is False if no content |
740 | - or metadata changes have occured, and None if any content or |
741 | - metadata change has occured. If self.include_unchanged is True then |
743 | + or metadata changes have occurred, and True if any content or |
743 | + metadata change has occurred. If self.include_unchanged is True then |
744 | if changed is not None, iter_changes_result will always be a result |
745 | tuple. Otherwise, iter_changes_result is None unless changed is |
746 | True. |
747 | @@ -1099,9 +1126,12 @@ |
748 | else: |
749 | # add the source to the search path to find any children it |
750 | # has. TODO ? : only add if it is a container ? |
751 | - if not osutils.is_inside_any(self.searched_specific_files, |
752 | - source_details[1]): |
753 | + if (not self.doing_consistency_expansion and |
754 | + not osutils.is_inside_any(self.searched_specific_files, |
755 | + source_details[1])): |
756 | self.search_specific_files.add(source_details[1]) |
757 | + # expanding from a user requested path, parent expansion |
758 | + # for delta consistency happens later. |
759 | # generate the old path; this is needed for stating later |
760 | # as well. |
761 | old_path = source_details[1] |
762 | @@ -1180,7 +1210,8 @@ |
763 | file_id = entry[0][2] |
764 | self.old_dirname_to_file_id[old_path] = file_id |
765 | # parent id is the entry for the path in the target tree |
766 | - if old_dirname == self.last_source_parent[0]: |
767 | + if old_basename and old_dirname == self.last_source_parent[0]: |
768 | + # use a cached hit for non-root source entries. |
769 | source_parent_id = self.last_source_parent[1] |
770 | else: |
771 | try: |
772 | @@ -1196,7 +1227,8 @@ |
773 | self.last_source_parent[0] = old_dirname |
774 | self.last_source_parent[1] = source_parent_id |
775 | new_dirname = entry[0][0] |
776 | - if new_dirname == self.last_target_parent[0]: |
777 | + if entry[0][1] and new_dirname == self.last_target_parent[0]: |
778 | + # use a cached hit for non-root target entries. |
779 | target_parent_id = self.last_target_parent[1] |
780 | else: |
781 | try: |
782 | @@ -1313,8 +1345,13 @@ |
783 | # a renamed parent. TODO: handle this efficiently. Its not |
784 | # common case to rename dirs though, so a correct but slow |
785 | # implementation will do. |
786 | - if not osutils.is_inside_any(self.searched_specific_files, target_details[1]): |
787 | + if (not self.doing_consistency_expansion and |
788 | + not osutils.is_inside_any(self.searched_specific_files, |
789 | + target_details[1])): |
790 | self.search_specific_files.add(target_details[1]) |
791 | + # We don't expand the specific files parents list here as |
792 | + # the path is absent in target and won't create a delta with |
793 | + # missing parent. |
794 | elif ((source_minikind == c'r' or source_minikind == c'a') and |
795 | (target_minikind == c'r' or target_minikind == c'a')): |
796 | # neither of the selected trees contain this path, |
797 | @@ -1334,6 +1371,25 @@ |
798 | def iter_changes(self): |
799 | return self |
800 | |
801 | + cdef void _gather_result_for_consistency(self, result): |
802 | + """Check a result we will yield to make sure we are consistent later. |
803 | + |
804 | + This gathers result's parents into a set to output later. |
805 | + |
806 | + :param result: A result tuple. |
807 | + """ |
808 | + if not self.partial or not result[0]: |
809 | + return |
810 | + self.seen_ids.add(result[0]) |
811 | + new_path = result[1][1] |
812 | + if new_path: |
813 | + # Not the root and not a delete: queue up the parents of the path. |
814 | + self.search_specific_file_parents.update( |
815 | + osutils.parent_directories(new_path.encode('utf8'))) |
816 | + # Add the root directory which parent_directories does not |
817 | + # provide. |
818 | + self.search_specific_file_parents.add('') |
819 | + |
820 | cdef void _update_current_block(self): |
821 | if (self.block_index < len(self.state._dirblocks) and |
822 | osutils.is_inside(self.current_root, self.state._dirblocks[self.block_index][0])): |
823 | @@ -1406,8 +1462,11 @@ |
824 | entry = self.root_entries[self.root_entries_pos] |
825 | self.root_entries_pos = self.root_entries_pos + 1 |
826 | result, changed = self._process_entry(entry, self.root_dir_info) |
827 | - if changed is not None and changed or self.include_unchanged: |
828 | - return result |
829 | + if changed is not None: |
830 | + if changed: |
831 | + self._gather_result_for_consistency(result) |
832 | + if changed or self.include_unchanged: |
833 | + return result |
834 | # Have we finished the prior root, or never started one ? |
835 | if self.current_root is None: |
836 | # TODO: the pending list should be lexically sorted? the |
837 | @@ -1416,12 +1475,12 @@ |
838 | self.current_root = self.search_specific_files.pop() |
839 | except KeyError: |
840 | raise StopIteration() |
841 | - self.current_root_unicode = self.current_root.decode('utf8') |
842 | self.searched_specific_files.add(self.current_root) |
843 | # process the entries for this containing directory: the rest will be |
844 | # found by their parents recursively. |
845 | self.root_entries = self.state._entries_for_path(self.current_root) |
846 | self.root_entries_len = len(self.root_entries) |
847 | + self.current_root_unicode = self.current_root.decode('utf8') |
848 | self.root_abspath = self.tree.abspath(self.current_root_unicode) |
849 | try: |
850 | root_stat = os.lstat(self.root_abspath) |
851 | @@ -1458,6 +1517,8 @@ |
852 | result, changed = self._process_entry(entry, self.root_dir_info) |
853 | if changed is not None: |
854 | path_handled = -1 |
855 | + if changed: |
856 | + self._gather_result_for_consistency(result) |
857 | if changed or self.include_unchanged: |
858 | return result |
859 | # handle unversioned specified paths: |
860 | @@ -1476,7 +1537,8 @@ |
861 | ) |
862 | # If we reach here, the outer flow continues, which enters into the |
863 | # per-root setup logic. |
864 | - if self.current_dir_info is None and self.current_block is None: |
865 | + if (self.current_dir_info is None and self.current_block is None and not |
866 | + self.doing_consistency_expansion): |
867 | # setup iteration of this root: |
868 | self.current_dir_list = None |
869 | if self.root_dir_info and self.root_dir_info[2] == 'tree-reference': |
870 | @@ -1606,6 +1668,8 @@ |
871 | # advance the entry only, after processing. |
872 | result, changed = self._process_entry(current_entry, None) |
873 | if changed is not None: |
874 | + if changed: |
875 | + self._gather_result_for_consistency(result) |
876 | if changed or self.include_unchanged: |
877 | return result |
878 | self.block_index = self.block_index + 1 |
879 | @@ -1618,6 +1682,15 @@ |
880 | # More supplied paths to process |
881 | self.current_root = None |
882 | return self._iter_next() |
883 | + # Start expanding more conservatively, adding paths the user may not |
884 | + # have intended but required for consistent deltas. |
885 | + self.doing_consistency_expansion = 1 |
886 | + if not self._pending_consistent_entries: |
887 | + self._pending_consistent_entries = self._next_consistent_entries() |
888 | + while self._pending_consistent_entries: |
889 | + result, changed = self._pending_consistent_entries.pop() |
890 | + if changed is not None: |
891 | + return result |
892 | raise StopIteration() |
893 | |
894 | cdef object _maybe_tree_ref(self, current_path_info): |
895 | @@ -1705,6 +1778,8 @@ |
896 | current_path_info) |
897 | if changed is not None: |
898 | path_handled = -1 |
899 | + if not changed and not self.include_unchanged: |
900 | + changed = None |
901 | # >- loop control starts here: |
902 | # >- entry |
903 | if advance_entry and current_entry is not None: |
904 | @@ -1726,7 +1801,7 @@ |
905 | except UnicodeDecodeError: |
906 | raise errors.BadFilenameEncoding( |
907 | current_path_info[0], osutils._fs_enc) |
908 | - if result is not None: |
909 | + if changed is not None: |
910 | raise AssertionError( |
911 | "result is not None: %r" % result) |
912 | result = (None, |
913 | @@ -1737,6 +1812,7 @@ |
914 | (None, self.utf8_decode(current_path_info[1])[0]), |
915 | (None, current_path_info[2]), |
916 | (None, new_executable)) |
917 | + changed = True |
918 | # dont descend into this unversioned path if it is |
919 | # a dir |
920 | if current_path_info[2] in ('directory'): |
921 | @@ -1755,9 +1831,12 @@ |
922 | current_path_info) |
923 | else: |
924 | current_path_info = None |
925 | - if result is not None: |
926 | + if changed is not None: |
927 | # Found a result on this pass, yield it |
928 | - return result |
929 | + if changed: |
930 | + self._gather_result_for_consistency(result) |
931 | + if changed or self.include_unchanged: |
932 | + return result |
933 | if self.current_block is not None: |
934 | self.block_index = self.block_index + 1 |
935 | self._update_current_block() |
936 | @@ -1769,3 +1848,123 @@ |
937 | self.current_dir_list = self.current_dir_info[1] |
938 | except StopIteration: |
939 | self.current_dir_info = None |
940 | + |
941 | + cdef object _next_consistent_entries(self): |
942 | + """Grabs the next specific file parent case to consider. |
943 | + |
944 | + :return: A list of the results, each of which is as for _process_entry. |
945 | + """ |
946 | + results = [] |
947 | + while self.search_specific_file_parents: |
948 | + # Process the parent directories for the paths we were iterating. |
949 | + # Even in extremely large trees this should be modest, so currently |
950 | + # no attempt is made to optimise. |
951 | + path_utf8 = self.search_specific_file_parents.pop() |
952 | + if path_utf8 in self.searched_exact_paths: |
953 | + # We've examined this path. |
954 | + continue |
955 | + if osutils.is_inside_any(self.searched_specific_files, path_utf8): |
956 | + # We've examined this path. |
957 | + continue |
958 | + path_entries = self.state._entries_for_path(path_utf8) |
959 | + # We need either one or two entries. If the path in |
960 | + # self.target_index has moved (so the entry in source_index is in |
961 | + # 'ar') then we need to also look for the entry for this path in |
962 | + # self.source_index, to output the appropriate delete-or-rename. |
963 | + selected_entries = [] |
964 | + found_item = False |
965 | + for candidate_entry in path_entries: |
966 | + # Find entries present in target at this path: |
967 | + if candidate_entry[1][self.target_index][0] not in 'ar': |
968 | + found_item = True |
969 | + selected_entries.append(candidate_entry) |
970 | + # Find entries present in source at this path: |
971 | + elif (self.source_index is not None and |
972 | + candidate_entry[1][self.source_index][0] not in 'ar'): |
973 | + found_item = True |
974 | + if candidate_entry[1][self.target_index][0] == 'a': |
975 | + # Deleted, emit it here. |
976 | + selected_entries.append(candidate_entry) |
977 | + else: |
978 | + # renamed, emit it when we process the directory it |
979 | + # ended up at. |
980 | + self.search_specific_file_parents.add( |
981 | + candidate_entry[1][self.target_index][1]) |
982 | + if not found_item: |
983 | + raise AssertionError( |
984 | + "Missing entry for specific path parent %r, %r" % ( |
985 | + path_utf8, path_entries)) |
986 | + path_info = self._path_info(path_utf8, path_utf8.decode('utf8')) |
987 | + for entry in selected_entries: |
988 | + if entry[0][2] in self.seen_ids: |
989 | + continue |
990 | + result, changed = self._process_entry(entry, path_info) |
991 | + if changed is None: |
992 | + raise AssertionError( |
993 | + "Got entry<->path mismatch for specific path " |
994 | + "%r entry %r path_info %r " % ( |
995 | + path_utf8, entry, path_info)) |
996 | + # Only include changes - we're outside the users requested |
997 | + # expansion. |
998 | + if changed: |
999 | + self._gather_result_for_consistency(result) |
1000 | + if (result[6][0] == 'directory' and |
1001 | + result[6][1] != 'directory'): |
1002 | + # This stopped being a directory, the old children have |
1003 | + # to be included. |
1004 | + if entry[1][self.source_index][0] == 'r': |
1005 | + # renamed, take the source path |
1006 | + entry_path_utf8 = entry[1][self.source_index][1] |
1007 | + else: |
1008 | + entry_path_utf8 = path_utf8 |
1009 | + initial_key = (entry_path_utf8, '', '') |
1010 | + block_index, _ = self.state._find_block_index_from_key( |
1011 | + initial_key) |
1012 | + if block_index == 0: |
1013 | + # The children of the root are in block index 1. |
1014 | + block_index = block_index + 1 |
1015 | + current_block = None |
1016 | + if block_index < len(self.state._dirblocks): |
1017 | + current_block = self.state._dirblocks[block_index] |
1018 | + if not osutils.is_inside( |
1019 | + entry_path_utf8, current_block[0]): |
1020 | + # No entries for this directory at all. |
1021 | + current_block = None |
1022 | + if current_block is not None: |
1023 | + for entry in current_block[1]: |
1024 | + if entry[1][self.source_index][0] in 'ar': |
1025 | + # Not in the source tree, so doesn't have to be |
1026 | + # included. |
1027 | + continue |
1028 | + # Path of the entry itself. |
1029 | + self.search_specific_file_parents.add( |
1030 | + self.pathjoin(*entry[0][:2])) |
1031 | + if changed or self.include_unchanged: |
1032 | + results.append((result, changed)) |
1033 | + self.searched_exact_paths.add(path_utf8) |
1034 | + return results |
1035 | + |
1036 | + cdef object _path_info(self, utf8_path, unicode_path): |
1037 | + """Generate path_info for unicode_path. |
1038 | + |
1039 | + :return: None if unicode_path does not exist, or a path_info tuple. |
1040 | + """ |
1041 | + abspath = self.tree.abspath(unicode_path) |
1042 | + try: |
1043 | + stat = os.lstat(abspath) |
1044 | + except OSError, e: |
1045 | + if e.errno == errno.ENOENT: |
1046 | + # the path does not exist. |
1047 | + return None |
1048 | + else: |
1049 | + raise |
1050 | + utf8_basename = utf8_path.rsplit('/', 1)[-1] |
1051 | + dir_info = (utf8_path, utf8_basename, |
1052 | + osutils.file_kind_from_stat_mode(stat.st_mode), stat, |
1053 | + abspath) |
1054 | + if dir_info[2] == 'directory': |
1055 | + if self.tree._directory_is_tree_reference( |
1056 | + unicode_path): |
1057 | + self.root_dir_info = self.root_dir_info[:2] + \ |
1058 | + ('tree-reference',) + self.root_dir_info[3:] |
1059 | + return dir_info |
1060 | |
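The ``ProcessEntryC`` changes above teach the C path of ``iter_changes`` to handle partial (specific-file) requests while still emitting the parent and child entries needed for a consistent delta. For orientation, here is a sketch of consuming iter_changes-style result tuples, using the field positions visible in the code above (``result[0]`` is the file id, ``result[1]`` the old/new path pair, ``result[6]`` the old/new kind pair); the calling convention is illustrative, not a bzrlib API reference:

```python
def summarize_changes(results):
    # `results`: iterable of iter_changes-style tuples, as produced by the
    # ProcessEntry implementations in this diff.
    for result in results:
        file_id = result[0]              # id of the changed file (None if unversioned)
        old_path, new_path = result[1]   # None on add / delete respectively
        old_kind, new_kind = result[6]   # e.g. ('directory', 'file')
        if old_path is None:
            print 'added %s (%s)' % (new_path, file_id)
        elif new_path is None:
            print 'deleted %s (%s)' % (old_path, file_id)
        elif old_kind != new_kind:
            print 'kind change %s: %s -> %s' % (new_path, old_kind, new_kind)
        else:
            print 'modified %s' % new_path
```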
1061 | === modified file 'bzrlib/_known_graph_py.py' |
1062 | --- bzrlib/_known_graph_py.py 2009-07-08 20:58:10 +0000 |
1063 | +++ bzrlib/_known_graph_py.py 2009-09-07 14:19:05 +0000 |
1064 | @@ -18,6 +18,7 @@ |
1065 | """ |
1066 | |
1067 | from bzrlib import ( |
1068 | + errors, |
1069 | revision, |
1070 | ) |
1071 | |
1072 | @@ -40,6 +41,18 @@ |
1073 | self.parent_keys, self.child_keys) |
1074 | |
1075 | |
1076 | +class _MergeSortNode(object): |
1077 | + """Information about a specific node in the merge graph.""" |
1078 | + |
1079 | + __slots__ = ('key', 'merge_depth', 'revno', 'end_of_merge') |
1080 | + |
1081 | + def __init__(self, key, merge_depth, revno, end_of_merge): |
1082 | + self.key = key |
1083 | + self.merge_depth = merge_depth |
1084 | + self.revno = revno |
1085 | + self.end_of_merge = end_of_merge |
1086 | + |
1087 | + |
1088 | class KnownGraph(object): |
1089 | """This is a class which assumes we already know the full graph.""" |
1090 | |
1091 | @@ -84,6 +97,10 @@ |
1092 | return [node for node in self._nodes.itervalues() |
1093 | if not node.parent_keys] |
1094 | |
1095 | + def _find_tips(self): |
1096 | + return [node for node in self._nodes.itervalues() |
1097 | + if not node.child_keys] |
1098 | + |
1099 | def _find_gdfo(self): |
1100 | nodes = self._nodes |
1101 | known_parent_gdfos = {} |
1102 | @@ -171,3 +188,119 @@ |
1103 | self._known_heads[heads_key] = heads |
1104 | return heads |
1105 | |
1106 | + def topo_sort(self): |
1107 | + """Return the nodes in topological order. |
1108 | + |
1109 | + All parents must occur before all children. |
1110 | + """ |
1111 | + for node in self._nodes.itervalues(): |
1112 | + if node.gdfo is None: |
1113 | + raise errors.GraphCycleError(self._nodes) |
1114 | + pending = self._find_tails() |
1115 | + pending_pop = pending.pop |
1116 | + pending_append = pending.append |
1117 | + |
1118 | + topo_order = [] |
1119 | + topo_order_append = topo_order.append |
1120 | + |
1121 | + num_seen_parents = dict.fromkeys(self._nodes, 0) |
1122 | + while pending: |
1123 | + node = pending_pop() |
1124 | + if node.parent_keys is not None: |
1125 | + # We don't include ghost parents |
1126 | + topo_order_append(node.key) |
1127 | + for child_key in node.child_keys: |
1128 | + child_node = self._nodes[child_key] |
1129 | + seen_parents = num_seen_parents[child_key] + 1 |
1130 | + if seen_parents == len(child_node.parent_keys): |
1131 | + # All parents have been processed, enqueue this child |
1132 | + pending_append(child_node) |
1133 | + # This has been queued up, stop tracking it |
1134 | + del num_seen_parents[child_key] |
1135 | + else: |
1136 | + num_seen_parents[child_key] = seen_parents |
1137 | + # We started from the parents, so we don't need to do anymore work |
1138 | + return topo_order |
1139 | + |
1140 | + def gc_sort(self): |
1141 | + """Return a reverse topological ordering which is 'stable'. |
1142 | + |
1143 | + There are a few constraints: |
1144 | + 1) Reverse topological (all children before all parents) |
1145 | + 2) Grouped by prefix |
1146 | + 3) 'stable' sorting, so that we get the same result, independent of |
1147 | + machine, or extra data. |
1148 | + To do this, we use the same basic algorithm as topo_sort, but when we |
1149 | + aren't sure what node to access next, we sort them lexicographically. |
1150 | + """ |
1151 | + tips = self._find_tips() |
1152 | + # Split the tips based on prefix |
1153 | + prefix_tips = {} |
1154 | + for node in tips: |
1155 | + if node.key.__class__ is str or len(node.key) == 1: |
1156 | + prefix = '' |
1157 | + else: |
1158 | + prefix = node.key[0] |
1159 | + prefix_tips.setdefault(prefix, []).append(node) |
1160 | + |
1161 | + num_seen_children = dict.fromkeys(self._nodes, 0) |
1162 | + |
1163 | + result = [] |
1164 | + for prefix in sorted(prefix_tips): |
1165 | + pending = sorted(prefix_tips[prefix], key=lambda n:n.key, |
1166 | + reverse=True) |
1167 | + while pending: |
1168 | + node = pending.pop() |
1169 | + if node.parent_keys is None: |
1170 | + # Ghost node, skip it |
1171 | + continue |
1172 | + result.append(node.key) |
1173 | + for parent_key in sorted(node.parent_keys, reverse=True): |
1174 | + parent_node = self._nodes[parent_key] |
1175 | + seen_children = num_seen_children[parent_key] + 1 |
1176 | + if seen_children == len(parent_node.child_keys): |
1177 | + # All children have been processed, enqueue this parent |
1178 | + pending.append(parent_node) |
1179 | + # This has been queued up, stop tracking it |
1180 | + del num_seen_children[parent_key] |
1181 | + else: |
1182 | + num_seen_children[parent_key] = seen_children |
1183 | + return result |
1184 | + |
1185 | + def merge_sort(self, tip_key): |
1186 | + """Compute the merge sorted graph output.""" |
1187 | + from bzrlib import tsort |
1188 | + as_parent_map = dict((node.key, node.parent_keys) |
1189 | + for node in self._nodes.itervalues() |
1190 | + if node.parent_keys is not None) |
1191 | + # We intentionally always generate revnos and never force the |
1192 | + # mainline_revisions |
1193 | + # Strip the sequence_number that merge_sort generates |
1194 | + return [_MergeSortNode(key, merge_depth, revno, end_of_merge) |
1195 | + for _, key, merge_depth, revno, end_of_merge |
1196 | + in tsort.merge_sort(as_parent_map, tip_key, |
1197 | + mainline_revisions=None, |
1198 | + generate_revno=True)] |
1199 | + |
1200 | + def get_parent_keys(self, key): |
1201 | + """Get the parents for a key |
1202 | + |
1203 | + Returns a list containing the parent keys. If the key is a ghost, |
1204 | + None is returned. A KeyError will be raised if the key is not in |
1205 | + the graph. |
1206 | + |
1207 | + :param key: Key to check (eg revision_id) |
1208 | + :return: A list of parents |
1209 | + """ |
1210 | + return self._nodes[key].parent_keys |
1211 | + |
1212 | + def get_child_keys(self, key): |
1213 | + """Get the children for a key |
1214 | + |
1215 | + Returns a list containing the child keys. A KeyError will be raised |
1216 | + if the key is not in the graph. |
1217 | + |
1218 | + :param keys: Key to check (eg revision_id) |
1219 | + :return: A list of children |
1220 | + """ |
1221 | + return self._nodes[key].child_keys |
1222 | |
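The new ``topo_sort``, ``gc_sort``, ``merge_sort``, ``get_parent_keys`` and ``get_child_keys`` members can be exercised directly against a parent map. A small usage sketch against the pure-Python implementation above (assuming the constructor accepts a ``{key: parent_keys}`` dict, as the ``_initialize_nodes(parent_map)`` call in the Pyrex version suggests):

```python
from bzrlib._known_graph_py import KnownGraph

# A tiny history: B and C both branch from A, D merges them back together.
parent_map = {
    'A': [],
    'B': ['A'],
    'C': ['A'],
    'D': ['B', 'C'],
}
kg = KnownGraph(parent_map)
print kg.topo_sort()        # parents before children, e.g. ['A', 'C', 'B', 'D']
print kg.gc_sort()          # stable reverse-topological order
for node in kg.merge_sort('D'):
    # merge_depth/revno/end_of_merge as computed by tsort.merge_sort
    print node.key, node.merge_depth, node.revno, node.end_of_merge
print kg.get_parent_keys('D')   # ['B', 'C']
```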
1223 | === modified file 'bzrlib/_known_graph_pyx.pyx' |
1224 | --- bzrlib/_known_graph_pyx.pyx 2009-07-14 16:10:32 +0000 |
1225 | +++ bzrlib/_known_graph_pyx.pyx 2009-09-07 14:19:05 +0000 |
1226 | @@ -25,11 +25,18 @@ |
1227 | ctypedef struct PyObject: |
1228 | pass |
1229 | |
1230 | + int PyString_CheckExact(object) |
1231 | + |
1232 | + int PyObject_RichCompareBool(object, object, int) |
1233 | + int Py_LT |
1234 | + |
1235 | + int PyTuple_CheckExact(object) |
1236 | object PyTuple_New(Py_ssize_t n) |
1237 | Py_ssize_t PyTuple_GET_SIZE(object t) |
1238 | PyObject * PyTuple_GET_ITEM(object t, Py_ssize_t o) |
1239 | void PyTuple_SET_ITEM(object t, Py_ssize_t o, object v) |
1240 | |
1241 | + int PyList_CheckExact(object) |
1242 | Py_ssize_t PyList_GET_SIZE(object l) |
1243 | PyObject * PyList_GET_ITEM(object l, Py_ssize_t o) |
1244 | int PyList_SetItem(object l, Py_ssize_t o, object l) except -1 |
1245 | @@ -44,8 +51,9 @@ |
1246 | |
1247 | void Py_INCREF(object) |
1248 | |
1249 | +import gc |
1250 | |
1251 | -from bzrlib import revision |
1252 | +from bzrlib import errors, revision |
1253 | |
1254 | cdef object NULL_REVISION |
1255 | NULL_REVISION = revision.NULL_REVISION |
1256 | @@ -59,10 +67,9 @@ |
1257 | cdef object children |
1258 | cdef public long gdfo |
1259 | cdef int seen |
1260 | + cdef object extra |
1261 | |
1262 | def __init__(self, key): |
1263 | - cdef int i |
1264 | - |
1265 | self.key = key |
1266 | self.parents = None |
1267 | |
1268 | @@ -70,6 +77,7 @@ |
1269 | # Greatest distance from origin |
1270 | self.gdfo = -1 |
1271 | self.seen = 0 |
1272 | + self.extra = None |
1273 | |
1274 | property child_keys: |
1275 | def __get__(self): |
1276 | @@ -80,6 +88,18 @@ |
1277 | PyList_Append(keys, child.key) |
1278 | return keys |
1279 | |
1280 | + property parent_keys: |
1281 | + def __get__(self): |
1282 | + if self.parents is None: |
1283 | + return None |
1284 | + |
1285 | + cdef _KnownGraphNode parent |
1286 | + |
1287 | + keys = [] |
1288 | + for parent in self.parents: |
1289 | + PyList_Append(keys, parent.key) |
1290 | + return keys |
1291 | + |
1292 | cdef clear_references(self): |
1293 | self.parents = None |
1294 | self.children = None |
1295 | @@ -107,17 +127,66 @@ |
1296 | return <_KnownGraphNode>temp_node |
1297 | |
1298 | |
1299 | -cdef _KnownGraphNode _get_parent(parents, Py_ssize_t pos): |
1300 | +cdef _KnownGraphNode _get_tuple_node(tpl, Py_ssize_t pos): |
1301 | cdef PyObject *temp_node |
1302 | - cdef _KnownGraphNode node |
1303 | |
1304 | - temp_node = PyTuple_GET_ITEM(parents, pos) |
1305 | + temp_node = PyTuple_GET_ITEM(tpl, pos) |
1306 | return <_KnownGraphNode>temp_node |
1307 | |
1308 | |
1309 | -# TODO: slab allocate all _KnownGraphNode objects. |
1310 | -# We already know how many we are going to need, except for a couple of |
1311 | -# ghosts that could be allocated on demand. |
1312 | +def get_key(node): |
1313 | + cdef _KnownGraphNode real_node |
1314 | + real_node = node |
1315 | + return real_node.key |
1316 | + |
1317 | + |
1318 | +cdef object _sort_list_nodes(object lst_or_tpl, int reverse): |
1319 | + """Sort a list of _KnownGraphNode objects. |
1320 | + |
1321 | + If lst_or_tpl is a list, it is allowed to mutate in place. It may also |
1322	 | +    If lst_or_tpl is a list, it may be mutated in place. The input may
1323	 | +    also be returned unchanged if everything is already sorted.
1324 | + cdef _KnownGraphNode node1, node2 |
1325 | + cdef int do_swap, is_tuple |
1326 | + cdef Py_ssize_t length |
1327 | + |
1328 | + is_tuple = PyTuple_CheckExact(lst_or_tpl) |
1329 | + if not (is_tuple or PyList_CheckExact(lst_or_tpl)): |
1330 | + raise TypeError('lst_or_tpl must be a list or tuple.') |
1331 | + length = len(lst_or_tpl) |
1332 | + if length == 0 or length == 1: |
1333 | + return lst_or_tpl |
1334 | + if length == 2: |
1335 | + if is_tuple: |
1336 | + node1 = _get_tuple_node(lst_or_tpl, 0) |
1337 | + node2 = _get_tuple_node(lst_or_tpl, 1) |
1338 | + else: |
1339 | + node1 = _get_list_node(lst_or_tpl, 0) |
1340 | + node2 = _get_list_node(lst_or_tpl, 1) |
1341 | + if reverse: |
1342 | + do_swap = PyObject_RichCompareBool(node1.key, node2.key, Py_LT) |
1343 | + else: |
1344 | + do_swap = PyObject_RichCompareBool(node2.key, node1.key, Py_LT) |
1345 | + if not do_swap: |
1346 | + return lst_or_tpl |
1347 | + if is_tuple: |
1348 | + return (node2, node1) |
1349 | + else: |
1350 | + # Swap 'in-place', since lists are mutable |
1351 | + Py_INCREF(node1) |
1352 | + PyList_SetItem(lst_or_tpl, 1, node1) |
1353 | + Py_INCREF(node2) |
1354 | + PyList_SetItem(lst_or_tpl, 0, node2) |
1355 | + return lst_or_tpl |
1356 | + # For all other sizes, we just use 'sorted()' |
1357 | + if is_tuple: |
1358 | + # Note that sorted() is just list(iterable).sort() |
1359 | + lst_or_tpl = list(lst_or_tpl) |
1360 | + lst_or_tpl.sort(key=get_key, reverse=reverse) |
1361 | + return lst_or_tpl |
1362 | + |
1363 | + |
1364 | +cdef class _MergeSorter |
1365 | |
1366 | cdef class KnownGraph: |
1367 | """This is a class which assumes we already know the full graph.""" |
1368 | @@ -136,6 +205,9 @@ |
1369 | # Maps {sorted(revision_id, revision_id): heads} |
1370 | self._known_heads = {} |
1371 | self.do_cache = int(do_cache) |
1372 | + # TODO: consider disabling gc since we are allocating a lot of nodes |
1373	 | +        #       that won't be collectable anyway. Real-world testing has not
1374	 | +        #       shown a specific impact yet.
1375 | self._initialize_nodes(parent_map) |
1376 | self._find_gdfo() |
1377 | |
1378 | @@ -183,11 +255,16 @@ |
1379 | parent_keys = <object>temp_parent_keys |
1380 | num_parent_keys = len(parent_keys) |
1381 | node = self._get_or_create_node(key) |
1382 | - # We know how many parents, so we could pre allocate an exact sized |
1383 | - # tuple here |
1384	 | +            # We know how many parents there are, so we preallocate the tuple
1385 | parent_nodes = PyTuple_New(num_parent_keys) |
1386 | - # We use iter here, because parent_keys maybe be a list or tuple |
1387 | for pos2 from 0 <= pos2 < num_parent_keys: |
1388	 | +                # Note: it costs us 10ms out of 40ms to look up all of these
1389 | + # parents, it doesn't seem to be an allocation overhead, |
1390 | + # but rather a lookup overhead. There doesn't seem to be |
1391 | + # a way around it, and that is one reason why |
1392 | + # KnownGraphNode maintains a direct pointer to the parent |
1393 | + # node. |
1394 | + # We use [] because parent_keys may be a tuple or list |
1395 | parent_node = self._get_or_create_node(parent_keys[pos2]) |
1396 | # PyTuple_SET_ITEM will steal a reference, so INCREF first |
1397 | Py_INCREF(parent_node) |
1398 | @@ -209,6 +286,19 @@ |
1399 | PyList_Append(tails, node) |
1400 | return tails |
1401 | |
1402 | + def _find_tips(self): |
1403 | + cdef PyObject *temp_node |
1404 | + cdef _KnownGraphNode node |
1405 | + cdef Py_ssize_t pos |
1406 | + |
1407 | + tips = [] |
1408 | + pos = 0 |
1409 | + while PyDict_Next(self._nodes, &pos, NULL, &temp_node): |
1410 | + node = <_KnownGraphNode>temp_node |
1411 | + if PyList_GET_SIZE(node.children) == 0: |
1412 | + PyList_Append(tips, node) |
1413 | + return tips |
1414 | + |
1415 | def _find_gdfo(self): |
1416 | cdef _KnownGraphNode node |
1417 | cdef _KnownGraphNode child |
1418 | @@ -315,7 +405,7 @@ |
1419 | continue |
1420 | if node.parents is not None and PyTuple_GET_SIZE(node.parents) > 0: |
1421 | for pos from 0 <= pos < PyTuple_GET_SIZE(node.parents): |
1422 | - parent_node = _get_parent(node.parents, pos) |
1423 | + parent_node = _get_tuple_node(node.parents, pos) |
1424 | last_item = last_item + 1 |
1425 | if last_item < PyList_GET_SIZE(pending): |
1426 | Py_INCREF(parent_node) # SetItem steals a ref |
1427 | @@ -335,3 +425,454 @@ |
1428 | if self.do_cache: |
1429 | PyDict_SetItem(self._known_heads, heads_key, heads) |
1430 | return heads |
1431 | + |
1432 | + def topo_sort(self): |
1433 | + """Return the nodes in topological order. |
1434 | + |
1435 | + All parents must occur before all children. |
1436 | + """ |
1437 | + # This is, for the most part, the same iteration order that we used for |
1438	 | +        #       _find_gdfo; consider finding a way to remove the duplication.
1439 | + # In general, we find the 'tails' (nodes with no parents), and then |
1440 | + # walk to the children. For children that have all of their parents |
1441 | + # yielded, we queue up the child to be yielded as well. |
1442 | + cdef _KnownGraphNode node |
1443 | + cdef _KnownGraphNode child |
1444 | + cdef PyObject *temp |
1445 | + cdef Py_ssize_t pos |
1446 | + cdef int replace |
1447 | + cdef Py_ssize_t last_item |
1448 | + |
1449 | + pending = self._find_tails() |
1450 | + if PyList_GET_SIZE(pending) == 0 and len(self._nodes) > 0: |
1451 | + raise errors.GraphCycleError(self._nodes) |
1452 | + |
1453 | + topo_order = [] |
1454 | + |
1455 | + last_item = PyList_GET_SIZE(pending) - 1 |
1456 | + while last_item >= 0: |
1457 | + # Avoid pop followed by push, instead, peek, and replace |
1458 | + # timing shows this is 930ms => 770ms for OOo |
1459 | + node = _get_list_node(pending, last_item) |
1460 | + last_item = last_item - 1 |
1461 | + if node.parents is not None: |
1462 | + # We don't include ghost parents |
1463 | + PyList_Append(topo_order, node.key) |
1464 | + for pos from 0 <= pos < PyList_GET_SIZE(node.children): |
1465 | + child = _get_list_node(node.children, pos) |
1466 | + if child.gdfo == -1: |
1467 | + # We know we have a graph cycle because a node has a parent |
1468 | + # which we couldn't find |
1469 | + raise errors.GraphCycleError(self._nodes) |
1470 | + child.seen = child.seen + 1 |
1471 | + if child.seen == PyTuple_GET_SIZE(child.parents): |
1472 | + # All parents of this child have been yielded, queue this |
1473 | + # one to be yielded as well |
1474 | + last_item = last_item + 1 |
1475 | + if last_item < PyList_GET_SIZE(pending): |
1476 | + Py_INCREF(child) # SetItem steals a ref |
1477 | + PyList_SetItem(pending, last_item, child) |
1478 | + else: |
1479 | + PyList_Append(pending, child) |
1480 | + # We have queued this node, we don't need to track it |
1481 | + # anymore |
1482 | + child.seen = 0 |
1483	 | +        # We started from the parents, so we don't need to do any more work
1484 | + return topo_order |
1485 | + |
1486 | + def gc_sort(self): |
1487 | + """Return a reverse topological ordering which is 'stable'. |
1488 | + |
1489 | + There are a few constraints: |
1490 | + 1) Reverse topological (all children before all parents) |
1491 | + 2) Grouped by prefix |
1492	 | +        3) 'stable' sorting, so that we get the same result independent of
1493	 | +           machine or extra data.
1494	 | +        To do this, we use the same basic algorithm as topo_sort, but when we
1495	 | +        aren't sure which node to visit next, we sort the candidates lexicographically.
1496 | + """ |
1497 | + cdef PyObject *temp |
1498 | + cdef Py_ssize_t pos, last_item |
1499 | + cdef _KnownGraphNode node, node2, parent_node |
1500 | + |
1501 | + tips = self._find_tips() |
1502 | + # Split the tips based on prefix |
1503 | + prefix_tips = {} |
1504 | + for pos from 0 <= pos < PyList_GET_SIZE(tips): |
1505 | + node = _get_list_node(tips, pos) |
1506 | + if PyString_CheckExact(node.key) or len(node.key) == 1: |
1507 | + prefix = '' |
1508 | + else: |
1509 | + prefix = node.key[0] |
1510 | + temp = PyDict_GetItem(prefix_tips, prefix) |
1511 | + if temp == NULL: |
1512 | + prefix_tips[prefix] = [node] |
1513 | + else: |
1514 | + tip_nodes = <object>temp |
1515 | + PyList_Append(tip_nodes, node) |
1516 | + |
1517 | + result = [] |
1518 | + for prefix in sorted(prefix_tips): |
1519 | + temp = PyDict_GetItem(prefix_tips, prefix) |
1520 | + assert temp != NULL |
1521 | + tip_nodes = <object>temp |
1522 | + pending = _sort_list_nodes(tip_nodes, 1) |
1523 | + last_item = PyList_GET_SIZE(pending) - 1 |
1524 | + while last_item >= 0: |
1525 | + node = _get_list_node(pending, last_item) |
1526 | + last_item = last_item - 1 |
1527 | + if node.parents is None: |
1528 | + # Ghost |
1529 | + continue |
1530 | + PyList_Append(result, node.key) |
1531 | + # Sorting the parent keys isn't strictly necessary for stable |
1532 | + # sorting of a given graph. But it does help minimize the |
1533	 | +                # differences between graphs.
1534 | + # For bzr.dev ancestry: |
1535 | + # 4.73ms no sort |
1536 | + # 7.73ms RichCompareBool sort |
1537 | + parents = _sort_list_nodes(node.parents, 1) |
1538 | + for pos from 0 <= pos < len(parents): |
1539 | + if PyTuple_CheckExact(parents): |
1540 | + parent_node = _get_tuple_node(parents, pos) |
1541 | + else: |
1542 | + parent_node = _get_list_node(parents, pos) |
1543 | + # TODO: GraphCycle detection |
1544 | + parent_node.seen = parent_node.seen + 1 |
1545 | + if (parent_node.seen |
1546 | + == PyList_GET_SIZE(parent_node.children)): |
1547 | + # All children have been processed, queue up this |
1548 | + # parent |
1549 | + last_item = last_item + 1 |
1550 | + if last_item < PyList_GET_SIZE(pending): |
1551 | + Py_INCREF(parent_node) # SetItem steals a ref |
1552 | + PyList_SetItem(pending, last_item, parent_node) |
1553 | + else: |
1554 | + PyList_Append(pending, parent_node) |
1555 | + parent_node.seen = 0 |
1556 | + return result |
1557 | + |
1558 | + def merge_sort(self, tip_key): |
1559 | + """Compute the merge sorted graph output.""" |
1560 | + cdef _MergeSorter sorter |
1561 | + |
1562 | + # TODO: consider disabling gc since we are allocating a lot of nodes |
1563	 | +        #       that won't be collectable anyway. Real-world testing has not
1564	 | +        #       shown a specific impact yet.
1565 | + sorter = _MergeSorter(self, tip_key) |
1566 | + return sorter.topo_order() |
1567 | + |
1568 | + def get_parent_keys(self, key): |
1569 | + """Get the parents for a key |
1570 | + |
1571	 | +        Returns a list containing the parent keys. If the key is a ghost,
1572 | + None is returned. A KeyError will be raised if the key is not in |
1573 | + the graph. |
1574 | + |
1575	 | +        :param key: The key to check (e.g. a revision_id)
1576	 | +        :return: A list of parent keys
1577 | + """ |
1578 | + return self._nodes[key].parent_keys |
1579 | + |
1580 | + def get_child_keys(self, key): |
1581 | + """Get the children for a key |
1582 | + |
1583	 | +        Returns a list containing the child keys. A KeyError will be raised
1584 | + if the key is not in the graph. |
1585 | + |
1586	 | +        :param key: The key to check (e.g. a revision_id)
1587	 | +        :return: A list of child keys
1588 | + """ |
1589 | + return self._nodes[key].child_keys |
1590 | + |
1591 | + |
1592 | +cdef class _MergeSortNode: |
1593 | + """Tracks information about a node during the merge_sort operation.""" |
1594 | + |
1595 | + # Public api |
1596 | + cdef public object key |
1597 | + cdef public long merge_depth |
1598	 | +    cdef public object end_of_merge # True/False: is this the end of the current merge?
1599 | + |
1600 | + # Private api, used while computing the information |
1601 | + cdef _KnownGraphNode left_parent |
1602 | + cdef _KnownGraphNode left_pending_parent |
1603 | + cdef object pending_parents # list of _KnownGraphNode for non-left parents |
1604 | + cdef long _revno_first |
1605 | + cdef long _revno_second |
1606 | + cdef long _revno_last |
1607 | + # TODO: turn these into flag/bit fields rather than individual members |
1608 | + cdef int is_first_child # Is this the first child? |
1609 | + cdef int seen_by_child # A child node has seen this parent |
1610	 | +    cdef int completed # Fully processed
1611 | + |
1612 | + def __init__(self, key): |
1613 | + self.key = key |
1614 | + self.merge_depth = -1 |
1615 | + self.left_parent = None |
1616 | + self.left_pending_parent = None |
1617 | + self.pending_parents = None |
1618 | + self._revno_first = -1 |
1619 | + self._revno_second = -1 |
1620 | + self._revno_last = -1 |
1621 | + self.is_first_child = 0 |
1622 | + self.seen_by_child = 0 |
1623 | + self.completed = 0 |
1624 | + |
1625 | + def __repr__(self): |
1626 | + return '%s(%s depth:%s rev:%s,%s,%s first:%s seen:%s)' % ( |
1627 | + self.__class__.__name__, self.key, |
1628 | + self.merge_depth, |
1629 | + self._revno_first, self._revno_second, self._revno_last, |
1630 | + self.is_first_child, self.seen_by_child) |
1631 | + |
1632 | + cdef int has_pending_parents(self): |
1633 | + if self.left_pending_parent is not None or self.pending_parents: |
1634 | + return 1 |
1635 | + return 0 |
1636 | + |
1637 | + cdef object _revno(self): |
1638 | + if self._revno_first == -1: |
1639 | + if self._revno_second != -1: |
1640 | + raise RuntimeError('Something wrong with: %s' % (self,)) |
1641 | + return (self._revno_last,) |
1642 | + else: |
1643 | + return (self._revno_first, self._revno_second, self._revno_last) |
1644 | + |
1645 | + property revno: |
1646 | + def __get__(self): |
1647 | + return self._revno() |
1648 | + |
1649 | + |
1650 | +cdef class _MergeSorter: |
1651 | + """This class does the work of computing the merge_sort ordering. |
1652 | + |
1653 | + We have some small advantages, in that we get all the extra information |
1654	 | +    that KnownGraph knows, such as the child lists.
1655 | + """ |
1656 | + |
1657 | + # Current performance numbers for merge_sort(bzr_dev_parent_map): |
1658 | + # 302ms tsort.merge_sort() |
1659 | + # 91ms graph.KnownGraph().merge_sort() |
1660 | + # 40ms kg.merge_sort() |
1661 | + |
1662 | + cdef KnownGraph graph |
1663 | + cdef object _depth_first_stack # list |
1664 | + cdef Py_ssize_t _last_stack_item # offset to last item on stack |
1665 | + # cdef object _ms_nodes # dict of key => _MergeSortNode |
1666 | + cdef object _revno_to_branch_count # {revno => num child branches} |
1667 | + cdef object _scheduled_nodes # List of nodes ready to be yielded |
1668 | + |
1669 | + def __init__(self, known_graph, tip_key): |
1670 | + cdef _KnownGraphNode node |
1671 | + |
1672 | + self.graph = known_graph |
1673 | + # self._ms_nodes = {} |
1674 | + self._revno_to_branch_count = {} |
1675 | + self._depth_first_stack = [] |
1676 | + self._last_stack_item = -1 |
1677 | + self._scheduled_nodes = [] |
1678 | + if (tip_key is not None and tip_key != NULL_REVISION |
1679 | + and tip_key != (NULL_REVISION,)): |
1680 | + node = self.graph._nodes[tip_key] |
1681 | + self._push_node(node, 0) |
1682 | + |
1683 | + cdef _MergeSortNode _get_ms_node(self, _KnownGraphNode node): |
1684 | + cdef PyObject *temp_node |
1685 | + cdef _MergeSortNode ms_node |
1686 | + |
1687 | + if node.extra is None: |
1688 | + ms_node = _MergeSortNode(node.key) |
1689 | + node.extra = ms_node |
1690 | + else: |
1691 | + ms_node = <_MergeSortNode>node.extra |
1692 | + return ms_node |
1693 | + |
1694 | + cdef _push_node(self, _KnownGraphNode node, long merge_depth): |
1695 | + cdef _KnownGraphNode parent_node |
1696 | + cdef _MergeSortNode ms_node, ms_parent_node |
1697 | + cdef Py_ssize_t pos |
1698 | + |
1699 | + ms_node = self._get_ms_node(node) |
1700 | + ms_node.merge_depth = merge_depth |
1701 | + if node.parents is None: |
1702 | + raise RuntimeError('ghost nodes should not be pushed' |
1703 | + ' onto the stack: %s' % (node,)) |
1704 | + if PyTuple_GET_SIZE(node.parents) > 0: |
1705 | + parent_node = _get_tuple_node(node.parents, 0) |
1706 | + ms_node.left_parent = parent_node |
1707 | + if parent_node.parents is None: # left-hand ghost |
1708 | + ms_node.left_pending_parent = None |
1709 | + ms_node.left_parent = None |
1710 | + else: |
1711 | + ms_node.left_pending_parent = parent_node |
1712 | + if PyTuple_GET_SIZE(node.parents) > 1: |
1713 | + ms_node.pending_parents = [] |
1714 | + for pos from 1 <= pos < PyTuple_GET_SIZE(node.parents): |
1715 | + parent_node = _get_tuple_node(node.parents, pos) |
1716 | + if parent_node.parents is None: # ghost |
1717 | + continue |
1718 | + PyList_Append(ms_node.pending_parents, parent_node) |
1719 | + |
1720 | + ms_node.is_first_child = 1 |
1721 | + if ms_node.left_parent is not None: |
1722 | + ms_parent_node = self._get_ms_node(ms_node.left_parent) |
1723 | + if ms_parent_node.seen_by_child: |
1724 | + ms_node.is_first_child = 0 |
1725 | + ms_parent_node.seen_by_child = 1 |
1726 | + self._last_stack_item = self._last_stack_item + 1 |
1727 | + if self._last_stack_item < PyList_GET_SIZE(self._depth_first_stack): |
1728 | + Py_INCREF(node) # SetItem steals a ref |
1729 | + PyList_SetItem(self._depth_first_stack, self._last_stack_item, |
1730 | + node) |
1731 | + else: |
1732 | + PyList_Append(self._depth_first_stack, node) |
1733 | + |
1734 | + cdef _pop_node(self): |
1735 | + cdef PyObject *temp |
1736 | + cdef _MergeSortNode ms_node, ms_parent_node, ms_prev_node |
1737 | + cdef _KnownGraphNode node, parent_node, prev_node |
1738 | + |
1739 | + node = _get_list_node(self._depth_first_stack, self._last_stack_item) |
1740 | + ms_node = <_MergeSortNode>node.extra |
1741 | + self._last_stack_item = self._last_stack_item - 1 |
1742 | + if ms_node.left_parent is not None: |
1743 | + # Assign the revision number from the left-hand parent |
1744 | + ms_parent_node = <_MergeSortNode>ms_node.left_parent.extra |
1745 | + if ms_node.is_first_child: |
1746 | + # First child just increments the final digit |
1747 | + ms_node._revno_first = ms_parent_node._revno_first |
1748 | + ms_node._revno_second = ms_parent_node._revno_second |
1749 | + ms_node._revno_last = ms_parent_node._revno_last + 1 |
1750 | + else: |
1751 | + # Not the first child, make a new branch |
1752 | + # (mainline_revno, branch_count, 1) |
1753 | + if ms_parent_node._revno_first == -1: |
1754 | + # Mainline ancestor, the increment is on the last digit |
1755 | + base_revno = ms_parent_node._revno_last |
1756 | + else: |
1757 | + base_revno = ms_parent_node._revno_first |
1758 | + temp = PyDict_GetItem(self._revno_to_branch_count, |
1759 | + base_revno) |
1760 | + if temp == NULL: |
1761 | + branch_count = 1 |
1762 | + else: |
1763 | + branch_count = (<object>temp) + 1 |
1764 | + PyDict_SetItem(self._revno_to_branch_count, base_revno, |
1765 | + branch_count) |
1766 | + ms_node._revno_first = base_revno |
1767 | + ms_node._revno_second = branch_count |
1768 | + ms_node._revno_last = 1 |
1769 | + else: |
1770 | + temp = PyDict_GetItem(self._revno_to_branch_count, 0) |
1771 | + if temp == NULL: |
1772 | + # The first root node doesn't have a 3-digit revno |
1773 | + root_count = 0 |
1774 | + ms_node._revno_first = -1 |
1775 | + ms_node._revno_second = -1 |
1776 | + ms_node._revno_last = 1 |
1777 | + else: |
1778 | + root_count = (<object>temp) + 1 |
1779 | + ms_node._revno_first = 0 |
1780 | + ms_node._revno_second = root_count |
1781 | + ms_node._revno_last = 1 |
1782 | + PyDict_SetItem(self._revno_to_branch_count, 0, root_count) |
1783 | + ms_node.completed = 1 |
1784 | + if PyList_GET_SIZE(self._scheduled_nodes) == 0: |
1785 | + # The first scheduled node is always the end of merge |
1786 | + ms_node.end_of_merge = True |
1787 | + else: |
1788 | + prev_node = _get_list_node(self._scheduled_nodes, |
1789 | + PyList_GET_SIZE(self._scheduled_nodes) - 1) |
1790 | + ms_prev_node = <_MergeSortNode>prev_node.extra |
1791 | + if ms_prev_node.merge_depth < ms_node.merge_depth: |
1792 | + # The previously pushed node is to our left, so this is the end |
1793 | + # of this right-hand chain |
1794 | + ms_node.end_of_merge = True |
1795 | + elif (ms_prev_node.merge_depth == ms_node.merge_depth |
1796 | + and prev_node not in node.parents): |
1797 | + # The next node is not a direct parent of this node |
1798 | + ms_node.end_of_merge = True |
1799 | + else: |
1800 | + ms_node.end_of_merge = False |
1801 | + PyList_Append(self._scheduled_nodes, node) |
1802 | + |
1803 | + cdef _schedule_stack(self): |
1804 | + cdef _KnownGraphNode last_node, next_node |
1805 | + cdef _MergeSortNode ms_node, ms_last_node, ms_next_node |
1806 | + cdef long next_merge_depth |
1807 | + ordered = [] |
1808 | + while self._last_stack_item >= 0: |
1809 | + # Peek at the last item on the stack |
1810 | + last_node = _get_list_node(self._depth_first_stack, |
1811 | + self._last_stack_item) |
1812 | + if last_node.gdfo == -1: |
1813 | + # if _find_gdfo skipped a node, that means there is a graph |
1814 | + # cycle, error out now |
1815 | + raise errors.GraphCycleError(self.graph._nodes) |
1816 | + ms_last_node = <_MergeSortNode>last_node.extra |
1817 | + if not ms_last_node.has_pending_parents(): |
1818 | + # Processed all parents, pop this node |
1819 | + self._pop_node() |
1820 | + continue |
1821 | + while ms_last_node.has_pending_parents(): |
1822 | + if ms_last_node.left_pending_parent is not None: |
1823 | + # recurse depth first into the primary parent |
1824 | + next_node = ms_last_node.left_pending_parent |
1825 | + ms_last_node.left_pending_parent = None |
1826 | + else: |
1827 | + # place any merges in right-to-left order for scheduling |
1828 | + # which gives us left-to-right order after we reverse |
1829 | + # the scheduled queue. |
1830 | + # Note: This has the effect of allocating common-new |
1831 | + # revisions to the right-most subtree rather than the |
1832	 | +                #       leftmost, which will display nicely (you get
1833 | + # smaller trees at the top of the combined merge). |
1834 | + next_node = ms_last_node.pending_parents.pop() |
1835 | + ms_next_node = self._get_ms_node(next_node) |
1836 | + if ms_next_node.completed: |
1837 | + # this parent was completed by a child on the |
1838 | + # call stack. skip it. |
1839 | + continue |
1840 | + # otherwise transfer it from the source graph into the |
1841 | + # top of the current depth first search stack. |
1842 | + |
1843 | + if next_node is ms_last_node.left_parent: |
1844 | + next_merge_depth = ms_last_node.merge_depth |
1845 | + else: |
1846 | + next_merge_depth = ms_last_node.merge_depth + 1 |
1847 | + self._push_node(next_node, next_merge_depth) |
1848 | + # and do not continue processing parents until this 'call' |
1849 | + # has recursed. |
1850 | + break |
1851 | + |
1852 | + cdef topo_order(self): |
1853 | + cdef _MergeSortNode ms_node |
1854 | + cdef _KnownGraphNode node |
1855 | + cdef Py_ssize_t pos |
1856 | + cdef PyObject *temp_key, *temp_node |
1857 | + |
1858 | + # Note: allocating a _MergeSortNode and deallocating it for all nodes |
1859 | + # costs approx 8.52ms (21%) of the total runtime |
1860 | + # We might consider moving the attributes into the base |
1861 | + # KnownGraph object. |
1862 | + self._schedule_stack() |
1863 | + |
1864 | + # We've set up the basic schedule, now we can continue processing the |
1865 | + # output. |
1866 | + # Note: This final loop costs us 40.0ms => 28.8ms (11ms, 25%) on |
1867 | + # bzr.dev, to convert the internal Object representation into a |
1868 | + # Tuple representation... |
1869 | + # 2ms is walking the data and computing revno tuples |
1870 | + # 7ms is computing the return tuple |
1871 | + # 4ms is PyList_Append() |
1872 | + ordered = [] |
1873 | + # output the result in reverse order, and separate the generated info |
1874 | + for pos from PyList_GET_SIZE(self._scheduled_nodes) > pos >= 0: |
1875 | + node = _get_list_node(self._scheduled_nodes, pos) |
1876 | + ms_node = <_MergeSortNode>node.extra |
1877 | + PyList_Append(ordered, ms_node) |
1878 | + node.extra = None |
1879 | + # Clear out the scheduled nodes now that we're done |
1880 | + self._scheduled_nodes = [] |
1881 | + return ordered |
1882 | |
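The comments in topo_sort describe the scheme: find the tails (nodes with no parents), walk to the children, and queue a child once all of its parents have been yielded. A pure-Python sketch of that pending-stack iteration, assuming every parent is present in parent_map (no ghosts), could read:

    def topo_sort(parent_map):
        """Pending-stack topological sort, parents before children.

        A sketch of the approach described in the comments above; assumes
        every parent appears in parent_map (no ghosts).
        """
        children = dict((key, []) for key in parent_map)
        num_seen = dict((key, 0) for key in parent_map)
        for key, parents in parent_map.items():
            for parent in parents:
                children[parent].append(key)
        # Tails are the nodes with no parents; they can be emitted first.
        pending = [key for key, parents in parent_map.items() if not parents]
        if not pending and parent_map:
            raise ValueError('graph contains a cycle')
        order = []
        while pending:
            node = pending.pop()
            order.append(node)
            for child in children[node]:
                num_seen[child] += 1
                if num_seen[child] == len(parent_map[child]):
                    # All parents emitted; the child is now safe to emit.
                    pending.append(child)
        if len(order) != len(parent_map):
            raise ValueError('graph contains a cycle')
        return order

    print(topo_sort({'a': (), 'b': ('a',), 'c': ('a', 'b')}))   # ['a', 'b', 'c']

The real implementation avoids pop-then-push by peeking at the last stack slot and overwriting it in place, which is where its speedup over the naive version comes from.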
1883 | === modified file 'bzrlib/annotate.py' |
1884 | --- bzrlib/annotate.py 2009-07-08 17:09:03 +0000 |
1885 | +++ bzrlib/annotate.py 2009-08-17 18:52:01 +0000 |
1886 | @@ -188,6 +188,10 @@ |
1887 | # or something. |
1888 | last_revision = current_rev.revision_id |
1889 | # XXX: Partially Cloned from branch, uses the old_get_graph, eep. |
1890 | + # XXX: The main difficulty is that we need to inject a single new node |
1891 | + # (current_rev) into the graph before it gets numbered, etc. |
1892 | + # Once KnownGraph gets an 'add_node()' function, we can use |
1893 | + # VF.get_known_graph_ancestry(). |
1894 | graph = repository.get_graph() |
1895 | revision_graph = dict(((key, value) for key, value in |
1896 | graph.iter_ancestry(current_rev.parent_ids) if value is not None)) |
1897 | |
1898 | === modified file 'bzrlib/branch.py' |
1899 | --- bzrlib/branch.py 2009-08-04 04:36:34 +0000 |
1900 | +++ bzrlib/branch.py 2009-08-19 18:04:49 +0000 |
1901 | @@ -446,15 +446,11 @@ |
1902 | # start_revision_id. |
1903 | if self._merge_sorted_revisions_cache is None: |
1904 | last_revision = self.last_revision() |
1905 | - graph = self.repository.get_graph() |
1906 | - parent_map = dict(((key, value) for key, value in |
1907 | - graph.iter_ancestry([last_revision]) if value is not None)) |
1908 | - revision_graph = repository._strip_NULL_ghosts(parent_map) |
1909 | - revs = tsort.merge_sort(revision_graph, last_revision, None, |
1910 | - generate_revno=True) |
1911 | - # Drop the sequence # before caching |
1912 | - self._merge_sorted_revisions_cache = [r[1:] for r in revs] |
1913 | - |
1914 | + last_key = (last_revision,) |
1915 | + known_graph = self.repository.revisions.get_known_graph_ancestry( |
1916 | + [last_key]) |
1917 | + self._merge_sorted_revisions_cache = known_graph.merge_sort( |
1918 | + last_key) |
1919 | filtered = self._filter_merge_sorted_revisions( |
1920 | self._merge_sorted_revisions_cache, start_revision_id, |
1921 | stop_revision_id, stop_rule) |
1922 | @@ -470,27 +466,34 @@ |
1923 | """Iterate over an inclusive range of sorted revisions.""" |
1924 | rev_iter = iter(merge_sorted_revisions) |
1925 | if start_revision_id is not None: |
1926 | - for rev_id, depth, revno, end_of_merge in rev_iter: |
1927 | + for node in rev_iter: |
1928 | + rev_id = node.key[-1] |
1929 | if rev_id != start_revision_id: |
1930 | continue |
1931 | else: |
1932 | # The decision to include the start or not |
1933 | # depends on the stop_rule if a stop is provided |
1934 | - rev_iter = chain( |
1935 | - iter([(rev_id, depth, revno, end_of_merge)]), |
1936 | - rev_iter) |
1937 | + # so pop this node back into the iterator |
1938 | + rev_iter = chain(iter([node]), rev_iter) |
1939 | break |
1940 | if stop_revision_id is None: |
1941 | - for rev_id, depth, revno, end_of_merge in rev_iter: |
1942 | - yield rev_id, depth, revno, end_of_merge |
1943 | + # Yield everything |
1944 | + for node in rev_iter: |
1945 | + rev_id = node.key[-1] |
1946 | + yield (rev_id, node.merge_depth, node.revno, |
1947 | + node.end_of_merge) |
1948 | elif stop_rule == 'exclude': |
1949 | - for rev_id, depth, revno, end_of_merge in rev_iter: |
1950 | + for node in rev_iter: |
1951 | + rev_id = node.key[-1] |
1952 | if rev_id == stop_revision_id: |
1953 | return |
1954 | - yield rev_id, depth, revno, end_of_merge |
1955 | + yield (rev_id, node.merge_depth, node.revno, |
1956 | + node.end_of_merge) |
1957 | elif stop_rule == 'include': |
1958 | - for rev_id, depth, revno, end_of_merge in rev_iter: |
1959 | - yield rev_id, depth, revno, end_of_merge |
1960 | + for node in rev_iter: |
1961 | + rev_id = node.key[-1] |
1962 | + yield (rev_id, node.merge_depth, node.revno, |
1963 | + node.end_of_merge) |
1964 | if rev_id == stop_revision_id: |
1965 | return |
1966 | elif stop_rule == 'with-merges': |
1967 | @@ -499,10 +502,12 @@ |
1968 | left_parent = stop_rev.parent_ids[0] |
1969 | else: |
1970 | left_parent = _mod_revision.NULL_REVISION |
1971 | - for rev_id, depth, revno, end_of_merge in rev_iter: |
1972 | + for node in rev_iter: |
1973 | + rev_id = node.key[-1] |
1974 | if rev_id == left_parent: |
1975 | return |
1976 | - yield rev_id, depth, revno, end_of_merge |
1977 | + yield (rev_id, node.merge_depth, node.revno, |
1978 | + node.end_of_merge) |
1979 | else: |
1980 | raise ValueError('invalid stop_rule %r' % stop_rule) |
1981 | |
1982 | @@ -1147,6 +1152,9 @@ |
1983 | revision_id: if not None, the revision history in the new branch will |
1984 | be truncated to end with revision_id. |
1985 | """ |
1986 | + if (repository_policy is not None and |
1987 | + repository_policy.requires_stacking()): |
1988 | + to_bzrdir._format.require_stacking(_skip_repo=True) |
1989 | result = to_bzrdir.create_branch() |
1990 | result.lock_write() |
1991 | try: |
1992 | @@ -2064,7 +2072,7 @@ |
1993 | BranchFormat.register_format(__format6) |
1994 | BranchFormat.register_format(__format7) |
1995 | BranchFormat.register_format(__format8) |
1996 | -BranchFormat.set_default_format(__format6) |
1997 | +BranchFormat.set_default_format(__format7) |
1998 | _legacy_formats = [BzrBranchFormat4(), |
1999 | ] |
2000 | network_format_registry.register( |
2001 | |
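With merge_sort nodes flowing through _filter_merge_sorted_revisions, the stop_rule handling reduces to where the cut falls relative to the stop revision: 'exclude' stops before yielding it, 'include' yields it and then stops. A simplified generator sketch, ignoring merge depth, revnos and the 'with-merges' rule, shows the difference:

    def filter_until_stop(rev_ids, stop_revision_id, stop_rule):
        """Sketch of the 'exclude' vs 'include' stop rules (illustrative)."""
        for rev_id in rev_ids:
            if stop_rule == 'exclude':
                if rev_id == stop_revision_id:
                    return                # cut *before* the stop revision
                yield rev_id
            elif stop_rule == 'include':
                yield rev_id
                if rev_id == stop_revision_id:
                    return                # cut *after* the stop revision
            else:
                raise ValueError('invalid stop_rule %r' % (stop_rule,))

    merge_sorted = ['rev-3', 'rev-2', 'rev-1']    # newest first
    assert list(filter_until_stop(merge_sorted, 'rev-2', 'exclude')) == ['rev-3']
    assert list(filter_until_stop(merge_sorted, 'rev-2', 'include')) == ['rev-3', 'rev-2']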
2002 | === modified file 'bzrlib/btree_index.py' |
2003 | --- bzrlib/btree_index.py 2009-07-01 10:51:47 +0000 |
2004 | +++ bzrlib/btree_index.py 2009-08-17 22:11:06 +0000 |
2005 | @@ -586,13 +586,19 @@ |
2006 | class _LeafNode(object): |
2007 | """A leaf node for a serialised B+Tree index.""" |
2008 | |
2009 | - __slots__ = ('keys',) |
2010 | + __slots__ = ('keys', 'min_key', 'max_key') |
2011 | |
2012 | def __init__(self, bytes, key_length, ref_list_length): |
2013 | """Parse bytes to create a leaf node object.""" |
2014 | # splitlines mangles the \r delimiters.. don't use it. |
2015 | - self.keys = dict(_btree_serializer._parse_leaf_lines(bytes, |
2016 | - key_length, ref_list_length)) |
2017 | + key_list = _btree_serializer._parse_leaf_lines(bytes, |
2018 | + key_length, ref_list_length) |
2019 | + if key_list: |
2020 | + self.min_key = key_list[0][0] |
2021 | + self.max_key = key_list[-1][0] |
2022 | + else: |
2023 | + self.min_key = self.max_key = None |
2024 | + self.keys = dict(key_list) |
2025 | |
2026 | |
2027 | class _InternalNode(object): |
2028 | @@ -1039,6 +1045,39 @@ |
2029 | output.append(cur_out) |
2030 | return output |
2031 | |
2032 | + def _walk_through_internal_nodes(self, keys): |
2033 | + """Take the given set of keys, and find the corresponding LeafNodes. |
2034 | + |
2035 | + :param keys: An unsorted iterable of keys to search for |
2036	 | +        :return: (nodes, keys_at_index)
2037 | + nodes is a dict mapping {index: LeafNode} |
2038 | + keys_at_index is a list of tuples of [(index, [keys for Leaf])] |
2039 | + """ |
2040 | + # 6 seconds spent in miss_torture using the sorted() line. |
2041 | + # Even with out of order disk IO it seems faster not to sort it when |
2042 | + # large queries are being made. |
2043 | + keys_at_index = [(0, sorted(keys))] |
2044 | + |
2045 | + for row_pos, next_row_start in enumerate(self._row_offsets[1:-1]): |
2046 | + node_indexes = [idx for idx, s_keys in keys_at_index] |
2047 | + nodes = self._get_internal_nodes(node_indexes) |
2048 | + |
2049 | + next_nodes_and_keys = [] |
2050 | + for node_index, sub_keys in keys_at_index: |
2051 | + node = nodes[node_index] |
2052 | + positions = self._multi_bisect_right(sub_keys, node.keys) |
2053 | + node_offset = next_row_start + node.offset |
2054 | + next_nodes_and_keys.extend([(node_offset + pos, s_keys) |
2055 | + for pos, s_keys in positions]) |
2056 | + keys_at_index = next_nodes_and_keys |
2057 | + # We should now be at the _LeafNodes |
2058 | + node_indexes = [idx for idx, s_keys in keys_at_index] |
2059 | + |
2060 | + # TODO: We may *not* want to always read all the nodes in one |
2061 | + # big go. Consider setting a max size on this. |
2062 | + nodes = self._get_leaf_nodes(node_indexes) |
2063 | + return nodes, keys_at_index |
2064 | + |
2065 | def iter_entries(self, keys): |
2066 | """Iterate over keys within the index. |
2067 | |
2068 | @@ -1082,32 +1121,7 @@ |
2069 | needed_keys = keys |
2070 | if not needed_keys: |
2071 | return |
2072 | - # 6 seconds spent in miss_torture using the sorted() line. |
2073 | - # Even with out of order disk IO it seems faster not to sort it when |
2074 | - # large queries are being made. |
2075 | - needed_keys = sorted(needed_keys) |
2076 | - |
2077 | - nodes_and_keys = [(0, needed_keys)] |
2078 | - |
2079 | - for row_pos, next_row_start in enumerate(self._row_offsets[1:-1]): |
2080 | - node_indexes = [idx for idx, s_keys in nodes_and_keys] |
2081 | - nodes = self._get_internal_nodes(node_indexes) |
2082 | - |
2083 | - next_nodes_and_keys = [] |
2084 | - for node_index, sub_keys in nodes_and_keys: |
2085 | - node = nodes[node_index] |
2086 | - positions = self._multi_bisect_right(sub_keys, node.keys) |
2087 | - node_offset = next_row_start + node.offset |
2088 | - next_nodes_and_keys.extend([(node_offset + pos, s_keys) |
2089 | - for pos, s_keys in positions]) |
2090 | - nodes_and_keys = next_nodes_and_keys |
2091 | - # We should now be at the _LeafNodes |
2092 | - node_indexes = [idx for idx, s_keys in nodes_and_keys] |
2093 | - |
2094 | - # TODO: We may *not* want to always read all the nodes in one |
2095 | - # big go. Consider setting a max size on this. |
2096 | - |
2097 | - nodes = self._get_leaf_nodes(node_indexes) |
2098 | + nodes, nodes_and_keys = self._walk_through_internal_nodes(needed_keys) |
2099 | for node_index, sub_keys in nodes_and_keys: |
2100 | if not sub_keys: |
2101 | continue |
2102 | @@ -1120,6 +1134,133 @@ |
2103 | else: |
2104 | yield (self, next_sub_key, value) |
2105 | |
2106 | + def _find_ancestors(self, keys, ref_list_num, parent_map, missing_keys): |
2107 | + """Find the parent_map information for the set of keys. |
2108 | + |
2109 | + This populates the parent_map dict and missing_keys set based on the |
2110	 | +        queried keys. It can also fill out an arbitrary number of parents that
2111 | + it finds while searching for the supplied keys. |
2112 | + |
2113 | + It is unlikely that you want to call this directly. See |
2114 | + "CombinedGraphIndex.find_ancestry()" for a more appropriate API. |
2115 | + |
2116	 | +        :param keys: The keys whose ancestry we want to return.
2117 | + Every key will either end up in 'parent_map' or 'missing_keys'. |
2118 | + :param ref_list_num: This index in the ref_lists is the parents we |
2119 | + care about. |
2120 | + :param parent_map: {key: parent_keys} for keys that are present in this |
2121	 | +            index. This may contain more entries than were in 'keys'; the extras
2122	 | +            are reachable ancestors of the keys requested.
2123 | + :param missing_keys: keys which are known to be missing in this index. |
2124 | + This may include parents that were not directly requested, but we |
2125 | + were able to determine that they are not present in this index. |
2126	 | +        :return: search_keys, the parents that were found but not queried to
2127	 | +            know if they are missing or present. Callers can re-query this index
2128	 | +            for those keys, and they will be placed into parent_map or missing_keys.
2129 | + """ |
2130 | + if not self.key_count(): |
2131 | + # We use key_count() to trigger reading the root node and |
2132 | + # determining info about this BTreeGraphIndex |
2133 | + # If we don't have any keys, then everything is missing |
2134 | + missing_keys.update(keys) |
2135 | + return set() |
2136 | + if ref_list_num >= self.node_ref_lists: |
2137 | + raise ValueError('No ref list %d, index has %d ref lists' |
2138 | + % (ref_list_num, self.node_ref_lists)) |
2139 | + |
2140 | + # The main trick we are trying to accomplish is that when we find a |
2141 | + # key listing its parents, we expect that the parent key is also likely |
2142	 | +        # to sit on the same page, allowing us to expand parents quickly
2143 | + # without suffering the full stack of bisecting, etc. |
2144 | + nodes, nodes_and_keys = self._walk_through_internal_nodes(keys) |
2145 | + |
2146 | + # These are parent keys which could not be immediately resolved on the |
2147 | + # page where the child was present. Note that we may already be |
2148 | + # searching for that key, and it may actually be present [or known |
2149 | + # missing] on one of the other pages we are reading. |
2150 | + # TODO: |
2151 | + # We could try searching for them in the immediate previous or next |
2152 | + # page. If they occur "later" we could put them in a pending lookup |
2153 | + # set, and then for each node we read thereafter we could check to |
2154 | + # see if they are present. |
2155 | + # However, we don't know the impact of keeping this list of things |
2156	 | +        #   that we're going to search for every node we come across from here on
2157 | + # out. |
2158 | + # It doesn't handle the case when the parent key is missing on a |
2159 | + # page that we *don't* read. So we already have to handle being |
2160 | + # re-entrant for that. |
2161 | + # Since most keys contain a date string, they are more likely to be |
2162 | + # found earlier in the file than later, but we would know that right |
2163 | + # away (key < min_key), and wouldn't keep searching it on every other |
2164 | + # page that we read. |
2165 | + # Mostly, it is an idea, one which should be benchmarked. |
2166 | + parents_not_on_page = set() |
2167 | + |
2168 | + for node_index, sub_keys in nodes_and_keys: |
2169 | + if not sub_keys: |
2170 | + continue |
2171 | + # sub_keys is all of the keys we are looking for that should exist |
2172 | + # on this page, if they aren't here, then they won't be found |
2173 | + node = nodes[node_index] |
2174 | + node_keys = node.keys |
2175 | + parents_to_check = set() |
2176 | + for next_sub_key in sub_keys: |
2177 | + if next_sub_key not in node_keys: |
2178 | + # This one is just not present in the index at all |
2179 | + missing_keys.add(next_sub_key) |
2180 | + else: |
2181 | + value, refs = node_keys[next_sub_key] |
2182 | + parent_keys = refs[ref_list_num] |
2183 | + parent_map[next_sub_key] = parent_keys |
2184 | + parents_to_check.update(parent_keys) |
2185 | + # Don't look for things we've already found |
2186 | + parents_to_check = parents_to_check.difference(parent_map) |
2187 | + # this can be used to test the benefit of having the check loop |
2188 | + # inlined. |
2189 | + # parents_not_on_page.update(parents_to_check) |
2190 | + # continue |
2191 | + while parents_to_check: |
2192 | + next_parents_to_check = set() |
2193 | + for key in parents_to_check: |
2194 | + if key in node_keys: |
2195 | + value, refs = node_keys[key] |
2196 | + parent_keys = refs[ref_list_num] |
2197 | + parent_map[key] = parent_keys |
2198 | + next_parents_to_check.update(parent_keys) |
2199 | + else: |
2200 | + # This parent either is genuinely missing, or should be |
2201 | + # found on another page. Perf test whether it is better |
2202 | + # to check if this node should fit on this page or not. |
2203 | + # in the 'everything-in-one-pack' scenario, this *not* |
2204 | + # doing the check is 237ms vs 243ms. |
2205 | + # So slightly better, but I assume the standard 'lots |
2206 | + # of packs' is going to show a reasonable improvement |
2207 | + # from the check, because it avoids 'going around |
2208 | + # again' for everything that is in another index |
2209 | + # parents_not_on_page.add(key) |
2210 | + # Missing for some reason |
2211 | + if key < node.min_key: |
2212 | + # in the case of bzr.dev, 3.4k/5.3k misses are |
2213 | + # 'earlier' misses (65%) |
2214 | + parents_not_on_page.add(key) |
2215 | + elif key > node.max_key: |
2216 | + # This parent key would be present on a different |
2217 | + # LeafNode |
2218 | + parents_not_on_page.add(key) |
2219 | + else: |
2220 | + # assert key != node.min_key and key != node.max_key |
2221 | + # If it was going to be present, it would be on |
2222 | + # *this* page, so mark it missing. |
2223 | + missing_keys.add(key) |
2224 | + parents_to_check = next_parents_to_check.difference(parent_map) |
2225 | + # Might want to do another .difference() from missing_keys |
2226 | + # parents_not_on_page could have been found on a different page, or be |
2227 | + # known to be missing. So cull out everything that has already been |
2228 | + # found. |
2229 | + search_keys = parents_not_on_page.difference( |
2230 | + parent_map).difference(missing_keys) |
2231 | + return search_keys |
2232 | + |
2233 | def iter_entries_prefix(self, keys): |
2234 | """Iterate over keys within the index using prefix matching. |
2235 | |
2236 | |
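The key observation in _find_ancestors is that a leaf page's min_key and max_key bound everything the page could hold, so a parent key that sorts inside those bounds but is absent from the page is definitively missing from this index, while one outside the bounds may simply live on another page. A small sketch of that classification, with hypothetical names, is:

    def classify_parent(key, page_keys, page_min, page_max,
                        missing_keys, parents_not_on_page):
        """Sketch of the leaf-page bounds check used by _find_ancestors."""
        if key in page_keys:
            return 'present'
        if key < page_min or key > page_max:
            parents_not_on_page.add(key)   # may exist on another leaf
            return 'elsewhere'
        missing_keys.add(key)              # would have been on this page
        return 'missing'

    missing, elsewhere = set(), set()
    page = {('2009-b',): 'value'}
    print(classify_parent(('2009-a',), page, ('2009-b',), ('2009-x',),
                          missing, elsewhere))   # 'elsewhere': sorts before the page
    print(classify_parent(('2009-c',), page, ('2009-b',), ('2009-x',),
                          missing, elsewhere))   # 'missing': inside the page bounds

Only the 'elsewhere' keys need a further round trip, which is why the loop returns them as search_keys for the caller to re-query.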
2237 | === modified file 'bzrlib/builtins.py' |
2238 | --- bzrlib/builtins.py 2009-07-27 06:22:57 +0000 |
2239 | +++ bzrlib/builtins.py 2009-09-08 16:45:11 +0000 |
2240 | @@ -120,6 +120,15 @@ |
2241 | |
2242 | |
2243 | def _get_one_revision_tree(command_name, revisions, branch=None, tree=None): |
2244 | + """Get a revision tree. Not suitable for commands that change the tree. |
2245 | + |
2246 | + Specifically, the basis tree in dirstate trees is coupled to the dirstate |
2247 | + and doing a commit/uncommit/pull will at best fail due to changing the |
2248 | + basis revision data. |
2249 | + |
2250	 | +    If tree is passed in, it should already be locked, for lifetime management
2251	 | +    of the tree's internal cached state.
2252 | + """ |
2253 | if branch is None: |
2254 | branch = tree.branch |
2255 | if revisions is None: |
2256 | @@ -452,8 +461,8 @@ |
2257 | raise errors.BzrCommandError("You cannot remove the working tree" |
2258 | " of a remote path") |
2259 | if not force: |
2260 | - # XXX: What about pending merges ? -- vila 20090629 |
2261 | - if working.has_changes(working.basis_tree()): |
2262 | + if (working.has_changes(working.basis_tree()) |
2263 | + or len(working.get_parent_ids()) > 1): |
2264 | raise errors.UncommittedChanges(working) |
2265 | |
2266 | working_path = working.bzrdir.root_transport.base |
2267 | @@ -603,6 +612,9 @@ |
2268 | branches that will be merged later (without showing the two different |
2269 | adds as a conflict). It is also useful when merging another project |
2270 | into a subdirectory of this one. |
2271 | + |
2272 | + Any files matching patterns in the ignore list will not be added |
2273 | + unless they are explicitly mentioned. |
2274 | """ |
2275 | takes_args = ['file*'] |
2276 | takes_options = [ |
2277 | @@ -616,7 +628,7 @@ |
2278 | help='Lookup file ids from this tree.'), |
2279 | ] |
2280 | encoding_type = 'replace' |
2281 | - _see_also = ['remove'] |
2282 | + _see_also = ['remove', 'ignore'] |
2283 | |
2284 | def run(self, file_list, no_recurse=False, dry_run=False, verbose=False, |
2285 | file_ids_from=None): |
2286 | @@ -654,14 +666,6 @@ |
2287 | for path in ignored[glob]: |
2288 | self.outf.write("ignored %s matching \"%s\"\n" |
2289 | % (path, glob)) |
2290 | - else: |
2291 | - match_len = 0 |
2292 | - for glob, paths in ignored.items(): |
2293 | - match_len += len(paths) |
2294 | - self.outf.write("ignored %d file(s).\n" % match_len) |
2295 | - self.outf.write("If you wish to add ignored files, " |
2296 | - "please add them explicitly by name. " |
2297 | - "(\"bzr ignored\" gives a list)\n") |
2298 | |
2299 | |
2300 | class cmd_mkdir(Command): |
2301 | @@ -1172,6 +1176,9 @@ |
2302 | help='Hard-link working tree files where possible.'), |
2303 | Option('no-tree', |
2304 | help="Create a branch without a working-tree."), |
2305 | + Option('switch', |
2306 | + help="Switch the checkout in the current directory " |
2307 | + "to the new branch."), |
2308 | Option('stacked', |
2309 | help='Create a stacked branch referring to the source branch. ' |
2310 | 'The new branch will depend on the availability of the source ' |
2311 | @@ -1188,9 +1195,9 @@ |
2312 | |
2313 | def run(self, from_location, to_location=None, revision=None, |
2314 | hardlink=False, stacked=False, standalone=False, no_tree=False, |
2315 | - use_existing_dir=False): |
2316 | + use_existing_dir=False, switch=False): |
2317 | + from bzrlib import switch as _mod_switch |
2318 | from bzrlib.tag import _merge_tags_if_possible |
2319 | - |
2320 | accelerator_tree, br_from = bzrdir.BzrDir.open_tree_or_branch( |
2321 | from_location) |
2322 | if (accelerator_tree is not None and |
2323 | @@ -1250,6 +1257,12 @@ |
2324 | except (errors.NotStacked, errors.UnstackableBranchFormat, |
2325 | errors.UnstackableRepositoryFormat), e: |
2326 | note('Branched %d revision(s).' % branch.revno()) |
2327 | + if switch: |
2328 | + # Switch to the new branch |
2329 | + wt, _ = WorkingTree.open_containing('.') |
2330 | + _mod_switch.switch(wt.bzrdir, branch) |
2331 | + note('Switched to branch: %s', |
2332 | + urlutils.unescape_for_display(branch.base, 'utf-8')) |
2333 | finally: |
2334 | br_from.unlock() |
2335 | |
2336 | @@ -3025,6 +3038,10 @@ |
2337 | raise errors.BzrCommandError("empty commit message specified") |
2338 | return my_message |
2339 | |
2340 | + # The API permits a commit with a filter of [] to mean 'select nothing' |
2341 | + # but the command line should not do that. |
2342 | + if not selected_list: |
2343 | + selected_list = None |
2344 | try: |
2345 | tree.commit(message_callback=get_message, |
2346 | specific_files=selected_list, |
2347 | @@ -3365,6 +3382,8 @@ |
2348 | Option('lsprof-timed', |
2349 | help='Generate lsprof output for benchmarked' |
2350 | ' sections of code.'), |
2351 | + Option('lsprof-tests', |
2352 | + help='Generate lsprof output for each test.'), |
2353 | Option('cache-dir', type=str, |
2354 | help='Cache intermediate benchmark output in this ' |
2355 | 'directory.'), |
2356 | @@ -3411,7 +3430,7 @@ |
2357 | first=False, list_only=False, |
2358 | randomize=None, exclude=None, strict=False, |
2359 | load_list=None, debugflag=None, starting_with=None, subunit=False, |
2360 | - parallel=None): |
2361 | + parallel=None, lsprof_tests=False): |
2362 | from bzrlib.tests import selftest |
2363 | import bzrlib.benchmarks as benchmarks |
2364 | from bzrlib.benchmarks import tree_creator |
2365 | @@ -3451,6 +3470,7 @@ |
2366 | "transport": transport, |
2367 | "test_suite_factory": test_suite_factory, |
2368 | "lsprof_timed": lsprof_timed, |
2369 | + "lsprof_tests": lsprof_tests, |
2370 | "bench_history": benchfile, |
2371 | "matching_tests_first": first, |
2372 | "list_only": list_only, |
2373 | @@ -3633,13 +3653,14 @@ |
2374 | verified = 'inapplicable' |
2375 | tree = WorkingTree.open_containing(directory)[0] |
2376 | |
2377 | - # die as quickly as possible if there are uncommitted changes |
2378 | try: |
2379 | basis_tree = tree.revision_tree(tree.last_revision()) |
2380 | except errors.NoSuchRevision: |
2381 | basis_tree = tree.basis_tree() |
2382 | + |
2383 | + # die as quickly as possible if there are uncommitted changes |
2384 | if not force: |
2385 | - if tree.has_changes(basis_tree): |
2386 | + if tree.has_changes(basis_tree) or len(tree.get_parent_ids()) > 1: |
2387 | raise errors.UncommittedChanges(tree) |
2388 | |
2389 | view_info = _get_view_info_for_change_reporter(tree) |
2390 | @@ -5627,8 +5648,12 @@ |
2391 | if writer is None: |
2392 | writer = bzrlib.option.diff_writer_registry.get() |
2393 | try: |
2394 | - Shelver.from_args(writer(sys.stdout), revision, all, file_list, |
2395 | - message, destroy=destroy).run() |
2396 | + shelver = Shelver.from_args(writer(sys.stdout), revision, all, |
2397 | + file_list, message, destroy=destroy) |
2398 | + try: |
2399 | + shelver.run() |
2400 | + finally: |
2401 | + shelver.work_tree.unlock() |
2402 | except errors.UserAbort: |
2403 | return 0 |
2404 | |
2405 | @@ -5673,7 +5698,11 @@ |
2406 | |
2407 | def run(self, shelf_id=None, action='apply'): |
2408 | from bzrlib.shelf_ui import Unshelver |
2409 | - Unshelver.from_args(shelf_id, action).run() |
2410 | + unshelver = Unshelver.from_args(shelf_id, action) |
2411 | + try: |
2412 | + unshelver.run() |
2413 | + finally: |
2414 | + unshelver.tree.unlock() |
2415 | |
2416 | |
2417 | class cmd_clean_tree(Command): |
2418 | |
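Two of the builtins changes ('bzr remove-tree' and 'bzr merge') tighten the same safety check: refuse to proceed when the working tree has uncommitted changes or pending merges, that is more than one parent id, unless --force is given. A hedged sketch of the combined condition, with illustrative names rather than bzrlib's real API:

    def check_safe_to_proceed(has_changes, parent_ids, force):
        """Sketch: uncommitted changes or pending merges require --force."""
        if not force and (has_changes or len(parent_ids) > 1):
            raise RuntimeError('uncommitted changes or pending merges; '
                               'use --force to override')

    check_safe_to_proceed(False, ['parent-1'], force=False)       # clean tree: ok
    try:
        check_safe_to_proceed(False, ['p1', 'p2'], force=False)   # pending merge
    except RuntimeError as e:
        print(e)
    check_safe_to_proceed(True, ['p1', 'p2'], force=True)         # forced: ok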
2419 | === modified file 'bzrlib/bzrdir.py' |
2420 | --- bzrlib/bzrdir.py 2009-08-04 14:48:59 +0000 |
2421 | +++ bzrlib/bzrdir.py 2009-08-21 02:10:06 +0000 |
2422 | @@ -129,9 +129,16 @@ |
2423 | return True |
2424 | |
2425 | def check_conversion_target(self, target_format): |
2426 | + """Check that a bzrdir as a whole can be converted to a new format.""" |
2427 | + # The only current restriction is that the repository content can be |
2428 | + # fetched compatibly with the target. |
2429 | target_repo_format = target_format.repository_format |
2430 | - source_repo_format = self._format.repository_format |
2431 | - source_repo_format.check_conversion_target(target_repo_format) |
2432 | + try: |
2433 | + self.open_repository()._format.check_conversion_target( |
2434 | + target_repo_format) |
2435 | + except errors.NoRepositoryPresent: |
2436 | + # No repo, no problem. |
2437 | + pass |
2438 | |
2439 | @staticmethod |
2440 | def _check_supported(format, allow_unsupported, |
2441 | @@ -3039,7 +3046,8 @@ |
2442 | new is _mod_branch.BzrBranchFormat8): |
2443 | branch_converter = _mod_branch.Converter7to8() |
2444 | else: |
2445 | - raise errors.BadConversionTarget("No converter", new) |
2446 | + raise errors.BadConversionTarget("No converter", new, |
2447 | + branch._format) |
2448 | branch_converter.convert(branch) |
2449 | branch = self.bzrdir.open_branch() |
2450 | old = branch._format.__class__ |
2451 | @@ -3548,6 +3556,10 @@ |
2452 | if self._require_stacking: |
2453 | raise |
2454 | |
2455 | + def requires_stacking(self): |
2456 | + """Return True if this policy requires stacking.""" |
2457 | + return self._stack_on is not None and self._require_stacking |
2458 | + |
2459 | def _get_full_stack_on(self): |
2460 | """Get a fully-qualified URL for the stack_on location.""" |
2461 | if self._stack_on is None: |
2462 | @@ -3860,11 +3872,11 @@ |
2463 | # The following format should be an alias for the rich root equivalent |
2464 | # of the default format |
2465 | format_registry.register_metadir('default-rich-root', |
2466 | - 'bzrlib.repofmt.pack_repo.RepositoryFormatKnitPack4', |
2467 | - help='Default format, rich root variant. (needed for bzr-svn and bzr-git).', |
2468 | - branch_format='bzrlib.branch.BzrBranchFormat6', |
2469 | - tree_format='bzrlib.workingtree.WorkingTreeFormat4', |
2470 | + 'bzrlib.repofmt.groupcompress_repo.RepositoryFormat2a', |
2471 | + branch_format='bzrlib.branch.BzrBranchFormat7', |
2472 | + tree_format='bzrlib.workingtree.WorkingTreeFormat6', |
2473 | alias=True, |
2474 | - ) |
2475 | + help='Same as 2a.') |
2476 | + |
2477 | # The current format that is made on 'bzr init'. |
2478 | -format_registry.set_default('pack-0.92') |
2479 | +format_registry.set_default('2a') |
2480 | |
2481 | === modified file 'bzrlib/commands.py' |
2482 | --- bzrlib/commands.py 2009-06-19 09:06:56 +0000 |
2483 | +++ bzrlib/commands.py 2009-09-09 17:51:19 +0000 |
2484 | @@ -1028,13 +1028,13 @@ |
2485 | ret = apply_coveraged(opt_coverage_dir, run, *run_argv) |
2486 | else: |
2487 | ret = run(*run_argv) |
2488 | - if 'memory' in debug.debug_flags: |
2489 | - trace.debug_memory('Process status after command:', short=False) |
2490 | return ret or 0 |
2491 | finally: |
2492 | # reset, in case we may do other commands later within the same |
2493 | # process. Commands that want to execute sub-commands must propagate |
2494 | # --verbose in their own way. |
2495 | + if 'memory' in debug.debug_flags: |
2496 | + trace.debug_memory('Process status after command:', short=False) |
2497 | option._verbosity_level = saved_verbosity_level |
2498 | |
2499 | |
2500 | |
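The commands.py hunk moves the --Dmemory report from the success path into the finally block, so it runs even when the command raises. A minimal sketch of that control flow, with illustrative stand-ins rather than bzrlib's real API:

    def run_with_memory_report(run, debug_flags, report):
        """Sketch: the memory report fires even when run() raises."""
        try:
            return run() or 0
        finally:
            if 'memory' in debug_flags:
                report('Process status after command:')

    def echo(message):
        print(message)

    run_with_memory_report(lambda: 0, set(['memory']), echo)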
2501 | === modified file 'bzrlib/commit.py' |
2502 | --- bzrlib/commit.py 2009-08-25 05:09:42 +0000 |
2503 | +++ bzrlib/commit.py 2009-08-28 05:00:33 +0000 |
2504 | @@ -209,7 +209,8 @@ |
2505 | :param timestamp: if not None, seconds-since-epoch for a |
2506 | postdated/predated commit. |
2507 | |
2508 | - :param specific_files: If true, commit only those files. |
2509 | + :param specific_files: If not None, commit only those files. An empty |
2510 | + list means 'commit no files'. |
2511 | |
2512 | :param rev_id: If set, use this as the new revision id. |
2513 | Useful for test or import commands that need to tightly |
2514 | @@ -264,6 +265,8 @@ |
2515 | self.master_locked = False |
2516 | self.recursive = recursive |
2517 | self.rev_id = None |
2518 | + # self.specific_files is None to indicate no filter, or any iterable to |
2519 | + # indicate a filter - [] means no files at all, as per iter_changes. |
2520 | if specific_files is not None: |
2521 | self.specific_files = sorted( |
2522 | minimum_path_selection(specific_files)) |
2523 | @@ -285,7 +288,6 @@ |
2524 | # the command line parameters, and the repository has fast delta |
2525 | # generation. See bug 347649. |
2526 | self.use_record_iter_changes = ( |
2527 | - not self.specific_files and |
2528 | not self.exclude and |
2529 | not self.branch.repository._format.supports_tree_reference and |
2530 | (self.branch.repository._format.fast_deltas or |
2531 | @@ -333,7 +335,7 @@ |
2532 | self._gather_parents() |
2533 | # After a merge, a selected file commit is not supported. |
2534 | # See 'bzr help merge' for an explanation as to why. |
2535 | - if len(self.parents) > 1 and self.specific_files: |
2536 | + if len(self.parents) > 1 and self.specific_files is not None: |
2537 | raise errors.CannotCommitSelectedFileMerge(self.specific_files) |
2538 | # Excludes are a form of selected file commit. |
2539 | if len(self.parents) > 1 and self.exclude: |
2540 | @@ -619,12 +621,13 @@ |
2541 | """Update the commit builder with the data about what has changed. |
2542 | """ |
2543 | exclude = self.exclude |
2544 | - specific_files = self.specific_files or [] |
2545 | + specific_files = self.specific_files |
2546 | mutter("Selecting files for commit with filter %s", specific_files) |
2547 | |
2548 | self._check_strict() |
2549 | if self.use_record_iter_changes: |
2550 | - iter_changes = self.work_tree.iter_changes(self.basis_tree) |
2551 | + iter_changes = self.work_tree.iter_changes(self.basis_tree, |
2552 | + specific_files=specific_files) |
2553 | iter_changes = self._filter_iter_changes(iter_changes) |
2554 | for file_id, path, fs_hash in self.builder.record_iter_changes( |
2555 | self.work_tree, self.basis_revid, iter_changes): |
2556 | |
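The commit.py changes pin down the specific_files contract: None means 'no filter', while any iterable, including the empty list, selects exactly those paths, so [] commits nothing. A standalone sketch of filtering simplified (file_id, path) changes under that contract:

    def iter_changes_filtered(changes, specific_files):
        """Sketch of the specific_files contract: None = no filter,
        [] = select nothing. The (file_id, path) pairs are simplified
        stand-ins for real iter_changes tuples."""
        if specific_files is None:
            for change in changes:
                yield change
            return
        wanted = set(specific_files)
        for change in changes:
            file_id, path = change
            if path in wanted:
                yield change

    changes = [('id-a', 'a.txt'), ('id-b', 'b.txt')]
    assert len(list(iter_changes_filtered(changes, None))) == 2
    assert list(iter_changes_filtered(changes, [])) == []
    assert list(iter_changes_filtered(changes, ['b.txt'])) == [('id-b', 'b.txt')]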
2557 | === modified file 'bzrlib/config.py' |
2558 | --- bzrlib/config.py 2009-07-02 08:59:16 +0000 |
2559 | +++ bzrlib/config.py 2009-08-20 04:53:23 +0000 |
2560 | @@ -821,6 +821,29 @@ |
2561 | return osutils.pathjoin(config_dir(), 'ignore') |
2562 | |
2563 | |
2564 | +def crash_dir(): |
2565 | + """Return the directory name to store crash files. |
2566 | + |
2567 | + This doesn't implicitly create it. |
2568 | + |
2569 | + On Windows it's in the config directory; elsewhere in the XDG cache directory. |
2570 | + """ |
2571 | + if sys.platform == 'win32': |
2572 | + return osutils.pathjoin(config_dir(), 'Crash') |
2573 | + else: |
2574 | + return osutils.pathjoin(xdg_cache_dir(), 'crash') |
2575 | + |
2576 | + |
2577 | +def xdg_cache_dir(): |
2578 | + # See http://standards.freedesktop.org/basedir-spec/latest/ar01s03.html |
2579 | + # Possibly this should be different on Windows? |
2580	 | +    e = os.environ.get('XDG_CACHE_HOME', None)
2581 | + if e: |
2582 | + return e |
2583 | + else: |
2584 | + return os.path.expanduser('~/.cache') |
2585 | + |
2586 | + |
2587 | def _auto_user_id(): |
2588 | """Calculate automatic user identification. |
2589 | |
2590 | |
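The freedesktop basedir spec referenced in the comment names the override variable XDG_CACHE_HOME, defaulting to ~/.cache when unset. A self-contained sketch of the per-platform resolution, using a hypothetical stand-in for config_dir(), is:

    import os
    import sys

    def cache_dir_sketch():
        # Per the basedir spec, XDG_CACHE_HOME overrides ~/.cache.
        return (os.environ.get('XDG_CACHE_HOME')
                or os.path.expanduser('~/.cache'))

    def crash_dir_sketch(win32_config_dir='~/bazaar/2.0'):
        # win32_config_dir is a hypothetical stand-in for config_dir().
        if sys.platform == 'win32':
            return os.path.join(os.path.expanduser(win32_config_dir), 'Crash')
        return os.path.join(cache_dir_sketch(), 'crash')

    print(crash_dir_sketch())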
2591 | === added file 'bzrlib/crash.py' |
2592 | --- bzrlib/crash.py 1970-01-01 00:00:00 +0000 |
2593 | +++ bzrlib/crash.py 2009-08-20 05:47:53 +0000 |
2594 | @@ -0,0 +1,163 @@ |
2595 | +# Copyright (C) 2009 Canonical Ltd |
2596 | +# |
2597 | +# This program is free software; you can redistribute it and/or modify |
2598 | +# it under the terms of the GNU General Public License as published by |
2599 | +# the Free Software Foundation; either version 2 of the License, or |
2600 | +# (at your option) any later version. |
2601 | +# |
2602 | +# This program is distributed in the hope that it will be useful, |
2603 | +# but WITHOUT ANY WARRANTY; without even the implied warranty of |
2604 | +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the |
2605 | +# GNU General Public License for more details. |
2606 | +# |
2607 | +# You should have received a copy of the GNU General Public License |
2608 | +# along with this program; if not, write to the Free Software |
2609 | +# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA |
2610 | + |
2611 | + |
2612 | +"""Handling and reporting crashes. |
2613 | +""" |
2614 | + |
2615 | +# for interactive testing, try the 'bzr assert-fail' command |
2616 | +# or see http://code.launchpad.net/~mbp/bzr/bzr-fail |
2617 | + |
2618 | +import os |
2619 | +import platform |
2620 | +import pprint |
2621 | +import sys |
2622 | +import time |
2623 | +from StringIO import StringIO |
2624 | + |
2625 | +import bzrlib |
2626 | +from bzrlib import ( |
2627 | + config, |
2628 | + debug, |
2629 | + osutils, |
2630 | + plugin, |
2631 | + trace, |
2632 | + ) |
2633 | + |
2634 | + |
2635 | +def report_bug(exc_info, stderr): |
2636 | + if 'no_apport' not in debug.debug_flags: |
2637 | + try: |
2638 | + report_bug_to_apport(exc_info, stderr) |
2639 | + return |
2640 | + except ImportError, e: |
2641 | + trace.mutter("couldn't find apport bug-reporting library: %s" % e) |
2642 | + pass |
2643 | + except Exception, e: |
2644 | + # this should only happen if apport is installed but it didn't |
2645 | + # work, eg because of an io error writing the crash file |
2646 | + sys.stderr.write("bzr: failed to report crash using apport:\n " |
2647 | + " %r\n" % e) |
2648 | + pass |
2649 | + report_bug_legacy(exc_info, stderr) |
2650 | + |
2651 | + |
2652 | +def report_bug_legacy(exc_info, err_file): |
2653 | + """Report a bug by just printing a message to the user.""" |
2654 | + trace.print_exception(exc_info, err_file) |
2655 | + err_file.write('\n') |
2656 | + err_file.write('bzr %s on python %s (%s)\n' % \ |
2657 | + (bzrlib.__version__, |
2658 | + bzrlib._format_version_tuple(sys.version_info), |
2659 | + platform.platform(aliased=1))) |
2660 | + err_file.write('arguments: %r\n' % sys.argv) |
2661 | + err_file.write( |
2662 | + 'encoding: %r, fsenc: %r, lang: %r\n' % ( |
2663 | + osutils.get_user_encoding(), sys.getfilesystemencoding(), |
2664 | + os.environ.get('LANG'))) |
2665 | + err_file.write("plugins:\n") |
2666 | + err_file.write(_format_plugin_list()) |
2667 | + err_file.write( |
2668 | + "\n\n" |
2669 | + "*** Bazaar has encountered an internal error. This probably indicates a\n" |
2670 | + " bug in Bazaar. You can help us fix it by filing a bug report at\n" |
2671 | + " https://bugs.launchpad.net/bzr/+filebug\n" |
2672 | + " including this traceback and a description of the problem.\n" |
2673 | + ) |
2674 | + |
2675 | + |
2676 | +def report_bug_to_apport(exc_info, stderr): |
2677 | + """Report a bug to apport for optional automatic filing. |
2678 | + """ |
2679 | + # this is based on apport_package_hook.py, but omitting some of the |
2680 | + # Ubuntu-specific policy about what to report and when |
2681 | + |
2682 | + # if this fails its caught at a higher level; we don't want to open the |
2683 | + # crash file unless apport can be loaded. |
2684 | + import apport |
2685 | + |
2686 | + crash_file = _open_crash_file() |
2687 | + try: |
2688 | + _write_apport_report_to_file(exc_info, crash_file) |
2689 | + finally: |
2690 | + crash_file.close() |
2691 | + |
2692 | + stderr.write("bzr: ERROR: %s.%s: %s\n" |
2693 | + "\n" |
2694 | + "*** Bazaar has encountered an internal error. This probably indicates a\n" |
2695 | + " bug in Bazaar. You can help us fix it by filing a bug report at\n" |
2696 | + " https://bugs.launchpad.net/bzr/+filebug\n" |
2697 | + " attaching the crash file\n" |
2698 | + " %s\n" |
2699 | + " and including a description of the problem.\n" |
2700 | + "\n" |
2701 | + " The crash file is plain text and you can inspect or edit it to remove\n" |
2702 | + " private information.\n" |
2703 | + % (exc_info[0].__module__, exc_info[0].__name__, exc_info[1], |
2704 | + crash_file.name)) |
2705 | + |
2706 | + |
2707 | +def _write_apport_report_to_file(exc_info, crash_file): |
2708 | + import traceback |
2709 | + from apport.report import Report |
2710 | + |
2711 | + exc_type, exc_object, exc_tb = exc_info |
2712 | + |
2713 | + pr = Report() |
2714 | + # add_proc_info gives you the memory map of the process: this seems rarely |
2715 | + # useful for Bazaar and it does make the report harder to scan, though it |
2716 | + # does tell you what binary modules are loaded. |
2717 | + # pr.add_proc_info() |
2718 | + pr.add_user_info() |
2719 | + pr['CommandLine'] = pprint.pformat(sys.argv) |
2720 | + pr['BzrVersion'] = bzrlib.__version__ |
2721 | + pr['PythonVersion'] = bzrlib._format_version_tuple(sys.version_info) |
2722 | + pr['Platform'] = platform.platform(aliased=1) |
2723 | + pr['UserEncoding'] = osutils.get_user_encoding() |
2724 | + pr['FileSystemEncoding'] = sys.getfilesystemencoding() |
2725 | + pr['Locale'] = os.environ.get('LANG') |
2726 | + pr['BzrPlugins'] = _format_plugin_list() |
2727 | + pr['PythonLoadedModules'] = _format_module_list() |
2728 | + pr['BzrDebugFlags'] = pprint.pformat(debug.debug_flags) |
2729 | + |
2730 | + tb_file = StringIO() |
2731 | + traceback.print_exception(exc_type, exc_object, exc_tb, file=tb_file) |
2732 | + pr['Traceback'] = tb_file.getvalue() |
2733 | + |
2734 | + pr.write(crash_file) |
2735 | + |
2736 | + |
2737 | +def _open_crash_file(): |
2738 | + crash_dir = config.crash_dir() |
2739 | + # user-readable only, just in case the contents are sensitive. |
2740 | + if not osutils.isdir(crash_dir): |
2741 | + os.makedirs(crash_dir, mode=0700) |
2742 | + filename = 'bzr-%s-%s.crash' % ( |
2743 | + osutils.compact_date(time.time()), |
2744 | + os.getpid(),) |
2745 | + return open(osutils.pathjoin(crash_dir, filename), 'wt') |
2746 | + |
2747 | + |
2748 | +def _format_plugin_list(): |
2749 | + plugin_lines = [] |
2750 | + for name, a_plugin in sorted(plugin.plugins().items()): |
2751 | + plugin_lines.append(" %-20s %s [%s]" % |
2752 | + (name, a_plugin.path(), a_plugin.__version__)) |
2753 | + return '\n'.join(plugin_lines) |
2754 | + |
2755 | + |
2756 | +def _format_module_list(): |
2757 | + return pprint.pformat(sys.modules) |
2758 | |
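To see how the new module is driven, here is a hedged sketch of a top-level handler; the demo exception is artificial, and ``report_bug`` takes exactly the ``(exc_info, stderr)`` pair defined above. When apport is unavailable, or ``-Dno_apport`` is set, the legacy plain-text report is printed instead of writing a crash file:

    import sys
    from bzrlib import crash

    try:
        raise AssertionError("demo internal error")
    except Exception:
        # report_bug tries apport first, then falls back to the legacy
        # text report on ImportError or any apport failure.
        crash.report_bug(sys.exc_info(), sys.stderr)
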
2759 | === modified file 'bzrlib/dirstate.py' |
2760 | --- bzrlib/dirstate.py 2009-07-27 05:44:19 +0000 |
2761 | +++ bzrlib/dirstate.py 2009-08-28 05:00:33 +0000 |
2762 | @@ -3166,15 +3166,18 @@ |
2763 | |
2764 | __slots__ = ["old_dirname_to_file_id", "new_dirname_to_file_id", |
2765 | "last_source_parent", "last_target_parent", "include_unchanged", |
2766 | - "use_filesystem_for_exec", "utf8_decode", "searched_specific_files", |
2767 | - "search_specific_files", "state", "source_index", "target_index", |
2768 | - "want_unversioned", "tree"] |
2769 | + "partial", "use_filesystem_for_exec", "utf8_decode", |
2770 | + "searched_specific_files", "search_specific_files", |
2771 | + "searched_exact_paths", "search_specific_file_parents", "seen_ids", |
2772 | + "state", "source_index", "target_index", "want_unversioned", "tree"] |
2773 | |
2774 | def __init__(self, include_unchanged, use_filesystem_for_exec, |
2775 | search_specific_files, state, source_index, target_index, |
2776 | want_unversioned, tree): |
2777 | self.old_dirname_to_file_id = {} |
2778 | self.new_dirname_to_file_id = {} |
2779 | + # Are we doing a partial iter_changes? |
2780 | + self.partial = search_specific_files != set(['']) |
2781 | # Using a list so that we can access the values and change them in |
2782 | # nested scope. Each one is [path, file_id, entry] |
2783 | self.last_source_parent = [None, None] |
2784 | @@ -3183,14 +3186,25 @@ |
2785 | self.use_filesystem_for_exec = use_filesystem_for_exec |
2786 | self.utf8_decode = cache_utf8._utf8_decode |
2787 | # for all search_indexs in each path at or under each element of |
2788 | - # search_specific_files, if the detail is relocated: add the id, and add the |
2789 | - # relocated path as one to search if its not searched already. If the |
2790 | - # detail is not relocated, add the id. |
2791 | + # search_specific_files, if the detail is relocated: add the id, and |
2792 | + # add the relocated path as one to search if its not searched already. |
2793 | + # If the detail is not relocated, add the id. |
2794 | self.searched_specific_files = set() |
2795 | + # When we search exact paths without expanding downwards, we record |
2796 | + # that here. |
2797 | + self.searched_exact_paths = set() |
2798 | self.search_specific_files = search_specific_files |
2799 | + # The parents up to the root of the paths we are searching. |
2800 | + # After all normal paths are returned, these specific items are returned. |
2801 | + self.search_specific_file_parents = set() |
2802 | + # The ids we've sent out in the delta. |
2803 | + self.seen_ids = set() |
2804 | self.state = state |
2805 | self.source_index = source_index |
2806 | self.target_index = target_index |
2807 | + if target_index != 0: |
2808 | + # A lot of code in here depends on target_index == 0 |
2809 | + raise errors.BzrError('unsupported target index') |
2810 | self.want_unversioned = want_unversioned |
2811 | self.tree = tree |
2812 | |
2813 | @@ -3198,15 +3212,15 @@ |
2814 | """Compare an entry and real disk to generate delta information. |
2815 | |
2816 | :param path_info: top_relpath, basename, kind, lstat, abspath for |
2817 | - the path of entry. If None, then the path is considered absent. |
2818 | - (Perhaps we should pass in a concrete entry for this ?) |
2819 | + the path of entry. If None, then the path is considered absent in |
2820 | + the target (Perhaps we should pass in a concrete entry for this ?) |
2821 | Basename is returned as a utf8 string because we expect this |
2822 | tuple will be ignored, and don't want to take the time to |
2823 | decode. |
2824 | :return: (iter_changes_result, changed). If the entry has not been |
2825 | handled then changed is None. Otherwise it is False if no content |
2826 | - or metadata changes have occured, and None if any content or |
2827 | - metadata change has occured. If self.include_unchanged is True then |
2828 | + or metadata changes have occurred, and True if any content or |
2829 | + metadata change has occurred. If self.include_unchanged is True then |
2830 | if changed is not None, iter_changes_result will always be a result |
2831 | tuple. Otherwise, iter_changes_result is None unless changed is |
2832 | True. |
2833 | @@ -3463,6 +3477,25 @@ |
2834 | def __iter__(self): |
2835 | return self |
2836 | |
2837 | + def _gather_result_for_consistency(self, result): |
2838 | + """Check a result we will yield to make sure we are consistent later. |
2839 | + |
2840 | + This gathers result's parents into a set to output later. |
2841 | + |
2842 | + :param result: A result tuple. |
2843 | + """ |
2844 | + if not self.partial or not result[0]: |
2845 | + return |
2846 | + self.seen_ids.add(result[0]) |
2847 | + new_path = result[1][1] |
2848 | + if new_path: |
2849 | + # Not the root and not a delete: queue up the parents of the path. |
2850 | + self.search_specific_file_parents.update( |
2851 | + osutils.parent_directories(new_path.encode('utf8'))) |
2852 | + # Add the root directory which parent_directories does not |
2853 | + # provide. |
2854 | + self.search_specific_file_parents.add('') |
2855 | + |
2856 | def iter_changes(self): |
2857 | """Iterate over the changes.""" |
2858 | utf8_decode = cache_utf8._utf8_decode |
2859 | @@ -3546,6 +3579,8 @@ |
2860 | result, changed = _process_entry(entry, root_dir_info) |
2861 | if changed is not None: |
2862 | path_handled = True |
2863 | + if changed: |
2864 | + self._gather_result_for_consistency(result) |
2865 | if changed or self.include_unchanged: |
2866 | yield result |
2867 | if self.want_unversioned and not path_handled and root_dir_info: |
2868 | @@ -3664,6 +3699,8 @@ |
2869 | # advance the entry only, after processing. |
2870 | result, changed = _process_entry(current_entry, None) |
2871 | if changed is not None: |
2872 | + if changed: |
2873 | + self._gather_result_for_consistency(result) |
2874 | if changed or self.include_unchanged: |
2875 | yield result |
2876 | block_index +=1 |
2877 | @@ -3702,6 +3739,8 @@ |
2878 | # no path is fine: the per entry code will handle it. |
2879 | result, changed = _process_entry(current_entry, current_path_info) |
2880 | if changed is not None: |
2881 | + if changed: |
2882 | + self._gather_result_for_consistency(result) |
2883 | if changed or self.include_unchanged: |
2884 | yield result |
2885 | elif (current_entry[0][1] != current_path_info[1] |
2886 | @@ -3723,6 +3762,8 @@ |
2887 | # advance the entry only, after processing. |
2888 | result, changed = _process_entry(current_entry, None) |
2889 | if changed is not None: |
2890 | + if changed: |
2891 | + self._gather_result_for_consistency(result) |
2892 | if changed or self.include_unchanged: |
2893 | yield result |
2894 | advance_path = False |
2895 | @@ -3730,6 +3771,8 @@ |
2896 | result, changed = _process_entry(current_entry, current_path_info) |
2897 | if changed is not None: |
2898 | path_handled = True |
2899 | + if changed: |
2900 | + self._gather_result_for_consistency(result) |
2901 | if changed or self.include_unchanged: |
2902 | yield result |
2903 | if advance_entry and current_entry is not None: |
2904 | @@ -3795,6 +3838,124 @@ |
2905 | current_dir_info = dir_iterator.next() |
2906 | except StopIteration: |
2907 | current_dir_info = None |
2908 | + for result in self._iter_specific_file_parents(): |
2909 | + yield result |
2910 | + |
2911 | + def _iter_specific_file_parents(self): |
2912 | + """Iter over the specific file parents.""" |
2913 | + while self.search_specific_file_parents: |
2914 | + # Process the parent directories for the paths we were iterating. |
2915 | + # Even in extremely large trees this should be modest, so currently |
2916 | + # no attempt is made to optimise. |
2917 | + path_utf8 = self.search_specific_file_parents.pop() |
2918 | + if osutils.is_inside_any(self.searched_specific_files, path_utf8): |
2919 | + # We've examined this path. |
2920 | + continue |
2921 | + if path_utf8 in self.searched_exact_paths: |
2922 | + # We've examined this path. |
2923 | + continue |
2924 | + path_entries = self.state._entries_for_path(path_utf8) |
2925 | + # We need either one or two entries. If the path in |
2926 | + # self.target_index has moved (so the entry in source_index is in |
2927 | + # 'ar') then we need to also look for the entry for this path in |
2928 | + # self.source_index, to output the appropriate delete-or-rename. |
2929 | + selected_entries = [] |
2930 | + found_item = False |
2931 | + for candidate_entry in path_entries: |
2932 | + # Find entries present in target at this path: |
2933 | + if candidate_entry[1][self.target_index][0] not in 'ar': |
2934 | + found_item = True |
2935 | + selected_entries.append(candidate_entry) |
2936 | + # Find entries present in source at this path: |
2937 | + elif (self.source_index is not None and |
2938 | + candidate_entry[1][self.source_index][0] not in 'ar'): |
2939 | + found_item = True |
2940 | + if candidate_entry[1][self.target_index][0] == 'a': |
2941 | + # Deleted, emit it here. |
2942 | + selected_entries.append(candidate_entry) |
2943 | + else: |
2944 | + # renamed, emit it when we process the directory it |
2945 | + # ended up at. |
2946 | + self.search_specific_file_parents.add( |
2947 | + candidate_entry[1][self.target_index][1]) |
2948 | + if not found_item: |
2949 | + raise AssertionError( |
2950 | + "Missing entry for specific path parent %r, %r" % ( |
2951 | + path_utf8, path_entries)) |
2952 | + path_info = self._path_info(path_utf8, path_utf8.decode('utf8')) |
2953 | + for entry in selected_entries: |
2954 | + if entry[0][2] in self.seen_ids: |
2955 | + continue |
2956 | + result, changed = self._process_entry(entry, path_info) |
2957 | + if changed is None: |
2958 | + raise AssertionError( |
2959 | + "Got entry<->path mismatch for specific path " |
2960 | + "%r entry %r path_info %r " % ( |
2961 | + path_utf8, entry, path_info)) |
2962 | +                # Only include changes - we're outside the user's requested
2963 | + # expansion. |
2964 | + if changed: |
2965 | + self._gather_result_for_consistency(result) |
2966 | + if (result[6][0] == 'directory' and |
2967 | + result[6][1] != 'directory'): |
2968 | + # This stopped being a directory, the old children have |
2969 | + # to be included. |
2970 | + if entry[1][self.source_index][0] == 'r': |
2971 | + # renamed, take the source path |
2972 | + entry_path_utf8 = entry[1][self.source_index][1] |
2973 | + else: |
2974 | + entry_path_utf8 = path_utf8 |
2975 | + initial_key = (entry_path_utf8, '', '') |
2976 | + block_index, _ = self.state._find_block_index_from_key( |
2977 | + initial_key) |
2978 | + if block_index == 0: |
2979 | + # The children of the root are in block index 1. |
2980 | + block_index +=1 |
2981 | + current_block = None |
2982 | + if block_index < len(self.state._dirblocks): |
2983 | + current_block = self.state._dirblocks[block_index] |
2984 | + if not osutils.is_inside( |
2985 | + entry_path_utf8, current_block[0]): |
2986 | + # No entries for this directory at all. |
2987 | + current_block = None |
2988 | + if current_block is not None: |
2989 | + for entry in current_block[1]: |
2990 | + if entry[1][self.source_index][0] in 'ar': |
2991 | + # Not in the source tree, so doesn't have to be |
2992 | + # included. |
2993 | + continue |
2994 | + # Path of the entry itself. |
2995 | + |
2996 | + self.search_specific_file_parents.add( |
2997 | + osutils.pathjoin(*entry[0][:2])) |
2998 | + if changed or self.include_unchanged: |
2999 | + yield result |
3000 | + self.searched_exact_paths.add(path_utf8) |
3001 | + |
3002 | + def _path_info(self, utf8_path, unicode_path): |
3003 | + """Generate path_info for unicode_path. |
3004 | + |
3005 | + :return: None if unicode_path does not exist, or a path_info tuple. |
3006 | + """ |
3007 | + abspath = self.tree.abspath(unicode_path) |
3008 | + try: |
3009 | + stat = os.lstat(abspath) |
3010 | + except OSError, e: |
3011 | + if e.errno == errno.ENOENT: |
3012 | + # the path does not exist. |
3013 | + return None |
3014 | + else: |
3015 | + raise |
3016 | + utf8_basename = utf8_path.rsplit('/', 1)[-1] |
3017 | + dir_info = (utf8_path, utf8_basename, |
3018 | + osutils.file_kind_from_stat_mode(stat.st_mode), stat, |
3019 | + abspath) |
3020 | + if dir_info[2] == 'directory': |
3021 | + if self.tree._directory_is_tree_reference( |
3022 | + unicode_path): |
3023 | +                dir_info = dir_info[:2] + \
3024 | +                    ('tree-reference',) + dir_info[3:]
3025 | + return dir_info |
3026 | |
3027 | |
3028 | # Try to load the compiled form if possible |
3029 | |
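The dirstate hunks above add a ``partial`` mode: when the search is narrower than the whole tree, ``_gather_result_for_consistency`` records the parents of every changed path, and ``_iter_specific_file_parents`` emits those parents after the normal results so the delta stays self-consistent. A sketch of how a caller would observe this (the checkout path and selected path are hypothetical; ``change[0]`` is the file id, ``change[1]`` the (old, new) path pair and ``change[6]`` the (old, new) kind pair, matching the ``result`` indexing in the code above):

    from bzrlib import workingtree

    tree = workingtree.WorkingTree.open('.')  # assumes a bzr checkout
    basis = tree.basis_tree()
    tree.lock_read()
    basis.lock_read()
    try:
        for change in tree.iter_changes(basis,
                                        specific_files=['dir/sub']):
            # Changed entries under 'dir/sub' come first; changed
            # parent directories are yielded at the end.
            print change[0], change[1][1], change[6][1]
    finally:
        basis.unlock()
        tree.unlock()
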
3030 | === modified file 'bzrlib/errors.py' |
3031 | --- bzrlib/errors.py 2009-07-14 21:07:36 +0000 |
3032 | +++ bzrlib/errors.py 2009-08-30 21:34:42 +0000 |
3033 | @@ -793,6 +793,12 @@ |
3034 | |
3035 | |
3036 | class IncompatibleRepositories(BzrError): |
3037 | + """Report an error that two repositories are not compatible. |
3038 | + |
3039 | + Note that the source and target repositories are permitted to be strings: |
3040 | + this exception is thrown from the smart server and may refer to a |
3041 | + repository the client hasn't opened. |
3042 | + """ |
3043 | |
3044 | _fmt = "%(target)s\n" \ |
3045 | "is not compatible with\n" \ |
3046 | @@ -2006,12 +2012,14 @@ |
3047 | |
3048 | class BadConversionTarget(BzrError): |
3049 | |
3050 | - _fmt = "Cannot convert to format %(format)s. %(problem)s" |
3051 | + _fmt = "Cannot convert from format %(from_format)s to format %(format)s." \ |
3052 | + " %(problem)s" |
3053 | |
3054 | - def __init__(self, problem, format): |
3055 | + def __init__(self, problem, format, from_format=None): |
3056 | BzrError.__init__(self) |
3057 | self.problem = problem |
3058 | self.format = format |
3059 | + self.from_format = from_format or '(unspecified)' |
3060 | |
3061 | |
3062 | class NoDiffFound(BzrError): |
3063 | @@ -2918,8 +2926,9 @@ |
3064 | _fmt = 'Cannot bind address "%(host)s:%(port)i": %(orig_error)s.' |
3065 | |
3066 | def __init__(self, host, port, orig_error): |
3067 | + # nb: in python2.4 socket.error doesn't have a useful repr |
3068 | BzrError.__init__(self, host=host, port=port, |
3069 | - orig_error=orig_error[1]) |
3070 | + orig_error=repr(orig_error.args)) |
3071 | |
3072 | |
3073 | class UnknownRules(BzrError): |
3074 | |
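A small sketch of the widened ``BadConversionTarget`` signature; real callers pass format objects, plain strings are used here purely for illustration:

    from bzrlib import errors

    e = errors.BadConversionTarget('Does not support rich roots.',
                                   format='2a', from_format='pack-0.92')
    print e
    # Cannot convert from format pack-0.92 to format 2a. Does not
    # support rich roots.

When ``from_format`` is omitted it renders as ``(unspecified)``, so existing two-argument call sites keep working.
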
3075 | === modified file 'bzrlib/fetch.py' |
3076 | --- bzrlib/fetch.py 2009-07-09 08:59:51 +0000 |
3077 | +++ bzrlib/fetch.py 2009-08-07 04:29:36 +0000 |
3078 | @@ -25,16 +25,21 @@ |
3079 | |
3080 | import operator |
3081 | |
3082 | +from bzrlib.lazy_import import lazy_import |
3083 | +lazy_import(globals(), """ |
3084 | +from bzrlib import ( |
3085 | + tsort, |
3086 | + versionedfile, |
3087 | + ) |
3088 | +""") |
3089 | import bzrlib |
3090 | from bzrlib import ( |
3091 | errors, |
3092 | symbol_versioning, |
3093 | ) |
3094 | from bzrlib.revision import NULL_REVISION |
3095 | -from bzrlib.tsort import topo_sort |
3096 | from bzrlib.trace import mutter |
3097 | import bzrlib.ui |
3098 | -from bzrlib.versionedfile import FulltextContentFactory |
3099 | |
3100 | |
3101 | class RepoFetcher(object): |
3102 | @@ -213,11 +218,9 @@ |
3103 | |
3104 | def _find_root_ids(self, revs, parent_map, graph): |
3105 | revision_root = {} |
3106 | - planned_versions = {} |
3107 | for tree in self.iter_rev_trees(revs): |
3108 | revision_id = tree.inventory.root.revision |
3109 | root_id = tree.get_root_id() |
3110 | - planned_versions.setdefault(root_id, []).append(revision_id) |
3111 | revision_root[revision_id] = root_id |
3112 | # Find out which parents we don't already know root ids for |
3113 | parents = set() |
3114 | @@ -229,7 +232,7 @@ |
3115 | for tree in self.iter_rev_trees(parents): |
3116 | root_id = tree.get_root_id() |
3117 | revision_root[tree.get_revision_id()] = root_id |
3118 | - return revision_root, planned_versions |
3119 | + return revision_root |
3120 | |
3121 | def generate_root_texts(self, revs): |
3122 | """Generate VersionedFiles for all root ids. |
3123 | @@ -238,9 +241,8 @@ |
3124 | """ |
3125 | graph = self.source.get_graph() |
3126 | parent_map = graph.get_parent_map(revs) |
3127 | - rev_order = topo_sort(parent_map) |
3128 | - rev_id_to_root_id, root_id_to_rev_ids = self._find_root_ids( |
3129 | - revs, parent_map, graph) |
3130 | + rev_order = tsort.topo_sort(parent_map) |
3131 | + rev_id_to_root_id = self._find_root_ids(revs, parent_map, graph) |
3132 | root_id_order = [(rev_id_to_root_id[rev_id], rev_id) for rev_id in |
3133 | rev_order] |
3134 | # Guaranteed stable, this groups all the file id operations together |
3135 | @@ -249,20 +251,93 @@ |
3136 | # yet, and are unlikely to in non-rich-root environments anyway. |
3137 | root_id_order.sort(key=operator.itemgetter(0)) |
3138 | # Create a record stream containing the roots to create. |
3139 | - def yield_roots(): |
3140 | - for key in root_id_order: |
3141 | - root_id, rev_id = key |
3142 | - rev_parents = parent_map[rev_id] |
3143 | - # We drop revision parents with different file-ids, because |
3144 | - # that represents a rename of the root to a different location |
3145 | - # - its not actually a parent for us. (We could look for that |
3146 | - # file id in the revision tree at considerably more expense, |
3147 | - # but for now this is sufficient (and reconcile will catch and |
3148 | - # correct this anyway). |
3149 | - # When a parent revision is a ghost, we guess that its root id |
3150 | - # was unchanged (rather than trimming it from the parent list). |
3151 | - parent_keys = tuple((root_id, parent) for parent in rev_parents |
3152 | - if parent != NULL_REVISION and |
3153 | - rev_id_to_root_id.get(parent, root_id) == root_id) |
3154 | - yield FulltextContentFactory(key, parent_keys, None, '') |
3155 | - return [('texts', yield_roots())] |
3156 | + from bzrlib.graph import FrozenHeadsCache |
3157 | + graph = FrozenHeadsCache(graph) |
3158 | + new_roots_stream = _new_root_data_stream( |
3159 | + root_id_order, rev_id_to_root_id, parent_map, self.source, graph) |
3160 | + return [('texts', new_roots_stream)] |
3161 | + |
3162 | + |
3163 | +def _new_root_data_stream( |
3164 | + root_keys_to_create, rev_id_to_root_id_map, parent_map, repo, graph=None): |
3165 | + """Generate a texts substream of synthesised root entries. |
3166 | + |
3167 | + Used in fetches that do rich-root upgrades. |
3168 | + |
3169 | + :param root_keys_to_create: iterable of (root_id, rev_id) pairs describing |
3170 | + the root entries to create. |
3171 | + :param rev_id_to_root_id_map: dict of known rev_id -> root_id mappings for |
3172 | + calculating the parents. If a parent rev_id is not found here then it |
3173 | + will be recalculated. |
3174 | + :param parent_map: a parent map for all the revisions in |
3175 | + root_keys_to_create. |
3176 | + :param graph: a graph to use instead of repo.get_graph(). |
3177 | + """ |
3178 | + for root_key in root_keys_to_create: |
3179 | + root_id, rev_id = root_key |
3180 | + parent_keys = _parent_keys_for_root_version( |
3181 | + root_id, rev_id, rev_id_to_root_id_map, parent_map, repo, graph) |
3182 | + yield versionedfile.FulltextContentFactory( |
3183 | + root_key, parent_keys, None, '') |
3184 | + |
3185 | + |
3186 | +def _parent_keys_for_root_version( |
3187 | + root_id, rev_id, rev_id_to_root_id_map, parent_map, repo, graph=None): |
3188 | + """Get the parent keys for a given root id. |
3189 | + |
3190 | + A helper function for _new_root_data_stream. |
3191 | + """ |
3192 | + # Include direct parents of the revision, but only if they used the same |
3193 | + # root_id and are heads. |
3194 | + rev_parents = parent_map[rev_id] |
3195 | + parent_ids = [] |
3196 | + for parent_id in rev_parents: |
3197 | + if parent_id == NULL_REVISION: |
3198 | + continue |
3199 | + if parent_id not in rev_id_to_root_id_map: |
3200 | + # We probably didn't read this revision, go spend the extra effort |
3201 | + # to actually check |
3202 | + try: |
3203 | + tree = repo.revision_tree(parent_id) |
3204 | + except errors.NoSuchRevision: |
3205 | + # Ghost, fill out rev_id_to_root_id in case we encounter this |
3206 | + # again. |
3207 | + # But set parent_root_id to None since we don't really know |
3208 | + parent_root_id = None |
3209 | + else: |
3210 | + parent_root_id = tree.get_root_id() |
3211 | + rev_id_to_root_id_map[parent_id] = None |
3212 | + # XXX: why not: |
3213 | + # rev_id_to_root_id_map[parent_id] = parent_root_id |
3214 | + # memory consumption maybe? |
3215 | + else: |
3216 | + parent_root_id = rev_id_to_root_id_map[parent_id] |
3217 | + if root_id == parent_root_id: |
3218 | + # With stacking we _might_ want to refer to a non-local revision, |
3219 | + # but this code path only applies when we have the full content |
3220 | + # available, so ghosts really are ghosts, not just the edge of |
3221 | + # local data. |
3222 | + parent_ids.append(parent_id) |
3223 | + else: |
3224 | + # root_id may be in the parent anyway. |
3225 | + try: |
3226 | + tree = repo.revision_tree(parent_id) |
3227 | + except errors.NoSuchRevision: |
3228 | + # ghost, can't refer to it. |
3229 | + pass |
3230 | + else: |
3231 | + try: |
3232 | + parent_ids.append(tree.inventory[root_id].revision) |
3233 | + except errors.NoSuchId: |
3234 | + # not in the tree |
3235 | + pass |
3236 | + # Drop non-head parents |
3237 | + if graph is None: |
3238 | + graph = repo.get_graph() |
3239 | + heads = graph.heads(parent_ids) |
3240 | + selected_ids = [] |
3241 | + for parent_id in parent_ids: |
3242 | + if parent_id in heads and parent_id not in selected_ids: |
3243 | + selected_ids.append(parent_id) |
3244 | + parent_keys = [(root_id, parent_id) for parent_id in selected_ids] |
3245 | + return parent_keys |
3246 | |
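The fetch.py hunk moves ``tsort`` and ``versionedfile`` behind ``lazy_import`` so they are not imported at startup. A sketch of the pattern as used above (the ``order_revisions`` function is hypothetical):

    from bzrlib.lazy_import import lazy_import
    lazy_import(globals(), """
    from bzrlib import (
        tsort,
        versionedfile,
        )
    """)

    def order_revisions(parent_map):
        # The first attribute access replaces the placeholder binding
        # in globals() with the real bzrlib.tsort module.
        return tsort.topo_sort(parent_map)
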
3247 | === modified file 'bzrlib/graph.py' |
3248 | --- bzrlib/graph.py 2009-08-04 04:36:34 +0000 |
3249 | +++ bzrlib/graph.py 2009-08-17 18:36:14 +0000 |
3250 | @@ -21,7 +21,6 @@ |
3251 | errors, |
3252 | revision, |
3253 | trace, |
3254 | - tsort, |
3255 | ) |
3256 | from bzrlib.symbol_versioning import deprecated_function, deprecated_in |
3257 | |
3258 | @@ -926,6 +925,7 @@ |
3259 | An ancestor may sort after a descendant if the relationship is not |
3260 | visible in the supplied list of revisions. |
3261 | """ |
3262 | + from bzrlib import tsort |
3263 | sorter = tsort.TopoSorter(self.get_parent_map(revisions)) |
3264 | return sorter.iter_topo_order() |
3265 | |
3266 | |
3267 | === modified file 'bzrlib/groupcompress.py' |
3268 | --- bzrlib/groupcompress.py 2009-08-04 04:36:34 +0000 |
3269 | +++ bzrlib/groupcompress.py 2009-09-09 13:05:33 +0000 |
3270 | @@ -33,7 +33,6 @@ |
3271 | pack, |
3272 | trace, |
3273 | ) |
3274 | -from bzrlib.graph import Graph |
3275 | from bzrlib.btree_index import BTreeBuilder |
3276 | from bzrlib.lru_cache import LRUSizeCache |
3277 | from bzrlib.tsort import topo_sort |
3278 | @@ -45,12 +44,15 @@ |
3279 | VersionedFiles, |
3280 | ) |
3281 | |
3282 | +# Minimum number of uncompressed bytes to try to fetch at once when retrieving
3283 | +# groupcompress blocks. |
3284 | +BATCH_SIZE = 2**16 |
3285 | + |
3286 | _USE_LZMA = False and (pylzma is not None) |
3287 | |
3288 | # osutils.sha_string('') |
3289 | _null_sha1 = 'da39a3ee5e6b4b0d3255bfef95601890afd80709' |
3290 | |
3291 | - |
3292 | def sort_gc_optimal(parent_map): |
3293 | """Sort and group the keys in parent_map into groupcompress order. |
3294 | |
3295 | @@ -62,16 +64,15 @@ |
3296 | # groupcompress ordering is approximately reverse topological, |
3297 | # properly grouped by file-id. |
3298 | per_prefix_map = {} |
3299 | - for item in parent_map.iteritems(): |
3300 | - key = item[0] |
3301 | + for key, value in parent_map.iteritems(): |
3302 | if isinstance(key, str) or len(key) == 1: |
3303 | prefix = '' |
3304 | else: |
3305 | prefix = key[0] |
3306 | try: |
3307 | - per_prefix_map[prefix].append(item) |
3308 | + per_prefix_map[prefix][key] = value |
3309 | except KeyError: |
3310 | - per_prefix_map[prefix] = [item] |
3311 | + per_prefix_map[prefix] = {key: value} |
3312 | |
3313 | present_keys = [] |
3314 | for prefix in sorted(per_prefix_map): |
3315 | @@ -456,7 +457,6 @@ |
3316 | # There are code paths that first extract as fulltext, and then |
3317 | # extract as storage_kind (smart fetch). So we don't break the |
3318 | # refcycle here, but instead in manager.get_record_stream() |
3319 | - # self._manager = None |
3320 | if storage_kind == 'fulltext': |
3321 | return self._bytes |
3322 | else: |
3323 | @@ -468,6 +468,14 @@ |
3324 | class _LazyGroupContentManager(object): |
3325 | """This manages a group of _LazyGroupCompressFactory objects.""" |
3326 | |
3327 | + _max_cut_fraction = 0.75 # We allow a block to be trimmed to 75% of |
3328 | + # current size, and still be considered |
3329 | +                            # reusable
3330 | + _full_block_size = 4*1024*1024 |
3331 | + _full_mixed_block_size = 2*1024*1024 |
3332 | + _full_enough_block_size = 3*1024*1024 # size at which we won't repack |
3333 | + _full_enough_mixed_block_size = 2*768*1024 # 1.5MB |
3334 | + |
3335 | def __init__(self, block): |
3336 | self._block = block |
3337 | # We need to preserve the ordering |
3338 | @@ -545,22 +553,23 @@ |
3339 | # time (self._block._content) is a little expensive. |
3340 | self._block._ensure_content(self._last_byte) |
3341 | |
3342 | - def _check_rebuild_block(self): |
3343 | + def _check_rebuild_action(self): |
3344 | """Check to see if our block should be repacked.""" |
3345 | total_bytes_used = 0 |
3346 | last_byte_used = 0 |
3347 | for factory in self._factories: |
3348 | total_bytes_used += factory._end - factory._start |
3349 | - last_byte_used = max(last_byte_used, factory._end) |
3350 | - # If we are using most of the bytes from the block, we have nothing |
3351 | - # else to check (currently more that 1/2) |
3352 | + if last_byte_used < factory._end: |
3353 | + last_byte_used = factory._end |
3354 | + # If we are using more than half of the bytes from the block, we have |
3355 | + # nothing else to check |
3356 | if total_bytes_used * 2 >= self._block._content_length: |
3357 | - return |
3358 | - # Can we just strip off the trailing bytes? If we are going to be |
3359 | - # transmitting more than 50% of the front of the content, go ahead |
3360 | + return None, last_byte_used, total_bytes_used |
3361 | + # We are using less than 50% of the content. Is the content we are |
3362 | + # using at the beginning of the block? If so, we can just trim the |
3363 | + # tail, rather than rebuilding from scratch. |
3364 | if total_bytes_used * 2 > last_byte_used: |
3365 | - self._trim_block(last_byte_used) |
3366 | - return |
3367 | + return 'trim', last_byte_used, total_bytes_used |
3368 | |
3369 | # We are using a small amount of the data, and it isn't just packed |
3370 | # nicely at the front, so rebuild the content. |
3371 | @@ -573,7 +582,80 @@ |
3372 | # expanding many deltas into fulltexts, as well. |
3373 | # If we build a cheap enough 'strip', then we could try a strip, |
3374 | # if that expands the content, we then rebuild. |
3375 | - self._rebuild_block() |
3376 | + return 'rebuild', last_byte_used, total_bytes_used |
3377 | + |
3378 | + def check_is_well_utilized(self): |
3379 | + """Is the current block considered 'well utilized'? |
3380 | + |
3381 | + This heuristic asks if the current block considers itself to be a fully |
3382 | + developed group, rather than just a loose collection of data. |
3383 | + """ |
3384 | + if len(self._factories) == 1: |
3385 | + # A block of length 1 could be improved by combining with other |
3386 | + # groups - don't look deeper. Even larger than max size groups |
3387 | + # could compress well with adjacent versions of the same thing. |
3388 | + return False |
3389 | + action, last_byte_used, total_bytes_used = self._check_rebuild_action() |
3390 | + block_size = self._block._content_length |
3391 | + if total_bytes_used < block_size * self._max_cut_fraction: |
3392 | + # This block wants to trim itself small enough that we want to |
3393 | + # consider it under-utilized. |
3394 | + return False |
3395 | + # TODO: This code is meant to be the twin of _insert_record_stream's |
3396 | + # 'start_new_block' logic. It would probably be better to factor |
3397 | + # out that logic into a shared location, so that it stays |
3398 | + # together better |
3399 | + # We currently assume a block is properly utilized whenever it is >75% |
3400 | + # of the size of a 'full' block. In normal operation, a block is |
3401 | + # considered full when it hits 4MB of same-file content. So any block |
3402 | + # >3MB is 'full enough'. |
3403 | + # The only time this isn't true is when a given block has large-object |
3404 | + # content. (a single file >4MB, etc.) |
3405 | + # Under these circumstances, we allow a block to grow to |
3406 | + # 2 x largest_content. Which means that if a given block had a large |
3407 | + # object, it may actually be under-utilized. However, given that this |
3408 | + # is 'pack-on-the-fly' it is probably reasonable to not repack large |
3409 | + # content blobs on-the-fly. Note that because we return False for all |
3410 | + # 1-item blobs, we will repack them; we may wish to reevaluate our |
3411 | + # treatment of large object blobs in the future. |
3412 | + if block_size >= self._full_enough_block_size: |
3413 | + return True |
3414 | + # If a block is <3MB, it still may be considered 'full' if it contains |
3415 | + # mixed content. The current rule is 2MB of mixed content is considered |
3416 | + # full. So check to see if this block contains mixed content, and |
3417 | + # set the threshold appropriately. |
3418 | + common_prefix = None |
3419 | + for factory in self._factories: |
3420 | + prefix = factory.key[:-1] |
3421 | + if common_prefix is None: |
3422 | + common_prefix = prefix |
3423 | + elif prefix != common_prefix: |
3424 | + # Mixed content, check the size appropriately |
3425 | + if block_size >= self._full_enough_mixed_block_size: |
3426 | + return True |
3427 | + break |
3428 | + # The content failed both the mixed check and the single-content check |
3429 | + # so obviously it is not fully utilized |
3430 | + # TODO: there is one other constraint that isn't being checked |
3431 | + # namely, that the entries in the block are in the appropriate |
3432 | + # order. For example, you could insert the entries in exactly |
3433 | + # reverse groupcompress order, and we would think that is ok. |
3434 | + # (all the right objects are in one group, and it is fully |
3435 | + # utilized, etc.) For now, we assume that case is rare, |
3436 | + # especially since we should always fetch in 'groupcompress' |
3437 | + # order. |
3438 | + return False |
3439 | + |
3440 | + def _check_rebuild_block(self): |
3441 | + action, last_byte_used, total_bytes_used = self._check_rebuild_action() |
3442 | + if action is None: |
3443 | + return |
3444 | + if action == 'trim': |
3445 | + self._trim_block(last_byte_used) |
3446 | + elif action == 'rebuild': |
3447 | + self._rebuild_block() |
3448 | + else: |
3449 | + raise ValueError('unknown rebuild action: %r' % (action,)) |
3450 | |
3451 | def _wire_bytes(self): |
3452 | """Return a byte stream suitable for transmitting over the wire.""" |
3453 | @@ -975,23 +1057,139 @@ |
3454 | versioned_files.stream.close() |
3455 | |
3456 | |
3457 | +class _BatchingBlockFetcher(object): |
3458 | + """Fetch group compress blocks in batches. |
3459 | + |
3460 | + :ivar total_bytes: int of expected number of bytes needed to fetch the |
3461 | + currently pending batch. |
3462 | + """ |
3463 | + |
3464 | + def __init__(self, gcvf, locations): |
3465 | + self.gcvf = gcvf |
3466 | + self.locations = locations |
3467 | + self.keys = [] |
3468 | + self.batch_memos = {} |
3469 | + self.memos_to_get = [] |
3470 | + self.total_bytes = 0 |
3471 | + self.last_read_memo = None |
3472 | + self.manager = None |
3473 | + |
3474 | + def add_key(self, key): |
3475 | + """Add another to key to fetch. |
3476 | + |
3477 | + :return: The estimated number of bytes needed to fetch the batch so |
3478 | + far. |
3479 | + """ |
3480 | + self.keys.append(key) |
3481 | + index_memo, _, _, _ = self.locations[key] |
3482 | + read_memo = index_memo[0:3] |
3483 | + # Three possibilities for this read_memo: |
3484 | + # - it's already part of this batch; or |
3485 | + # - it's not yet part of this batch, but is already cached; or |
3486 | + # - it's not yet part of this batch and will need to be fetched. |
3487 | + if read_memo in self.batch_memos: |
3488 | + # This read memo is already in this batch. |
3489 | + return self.total_bytes |
3490 | + try: |
3491 | + cached_block = self.gcvf._group_cache[read_memo] |
3492 | + except KeyError: |
3493 | + # This read memo is new to this batch, and the data isn't cached |
3494 | + # either. |
3495 | + self.batch_memos[read_memo] = None |
3496 | + self.memos_to_get.append(read_memo) |
3497 | + byte_length = read_memo[2] |
3498 | + self.total_bytes += byte_length |
3499 | + else: |
3500 | + # This read memo is new to this batch, but cached. |
3501 | + # Keep a reference to the cached block in batch_memos because it's |
3502 | + # certain that we'll use it when this batch is processed, but |
3503 | + # there's a risk that it would fall out of _group_cache between now |
3504 | + # and then. |
3505 | + self.batch_memos[read_memo] = cached_block |
3506 | + return self.total_bytes |
3507 | + |
3508 | + def _flush_manager(self): |
3509 | + if self.manager is not None: |
3510 | + for factory in self.manager.get_record_stream(): |
3511 | + yield factory |
3512 | + self.manager = None |
3513 | + self.last_read_memo = None |
3514 | + |
3515 | + def yield_factories(self, full_flush=False): |
3516 | + """Yield factories for keys added since the last yield. They will be |
3517 | + returned in the order they were added via add_key. |
3518 | + |
3519 | +        :param full_flush: by default, some results may be held back
3520 | +            because they could be part of the next batch. If full_flush is
3521 | +            True, then all results are returned.
3522 | + """ |
3523 | + if self.manager is None and not self.keys: |
3524 | + return |
3525 | + # Fetch all memos in this batch. |
3526 | + blocks = self.gcvf._get_blocks(self.memos_to_get) |
3527 | + # Turn blocks into factories and yield them. |
3528 | + memos_to_get_stack = list(self.memos_to_get) |
3529 | + memos_to_get_stack.reverse() |
3530 | + for key in self.keys: |
3531 | + index_memo, _, parents, _ = self.locations[key] |
3532 | + read_memo = index_memo[:3] |
3533 | + if self.last_read_memo != read_memo: |
3534 | + # We are starting a new block. If we have a |
3535 | + # manager, we have found everything that fits for |
3536 | + # now, so yield records |
3537 | + for factory in self._flush_manager(): |
3538 | + yield factory |
3539 | + # Now start a new manager. |
3540 | + if memos_to_get_stack and memos_to_get_stack[-1] == read_memo: |
3541 | + # The next block from _get_blocks will be the block we |
3542 | + # need. |
3543 | + block_read_memo, block = blocks.next() |
3544 | + if block_read_memo != read_memo: |
3545 | + raise AssertionError( |
3546 | + "block_read_memo out of sync with read_memo" |
3547 | +                            " (%r != %r)" % (block_read_memo, read_memo))
3548 | + self.batch_memos[read_memo] = block |
3549 | + memos_to_get_stack.pop() |
3550 | + else: |
3551 | + block = self.batch_memos[read_memo] |
3552 | + self.manager = _LazyGroupContentManager(block) |
3553 | + self.last_read_memo = read_memo |
3554 | + start, end = index_memo[3:5] |
3555 | + self.manager.add_factory(key, parents, start, end) |
3556 | + if full_flush: |
3557 | + for factory in self._flush_manager(): |
3558 | + yield factory |
3559 | + del self.keys[:] |
3560 | + self.batch_memos.clear() |
3561 | + del self.memos_to_get[:] |
3562 | + self.total_bytes = 0 |
3563 | + |
3564 | + |
3565 | class GroupCompressVersionedFiles(VersionedFiles): |
3566 | """A group-compress based VersionedFiles implementation.""" |
3567 | |
3568 | - def __init__(self, index, access, delta=True): |
3569 | + def __init__(self, index, access, delta=True, _unadded_refs=None): |
3570 | """Create a GroupCompressVersionedFiles object. |
3571 | |
3572 | :param index: The index object storing access and graph data. |
3573 | :param access: The access object storing raw data. |
3574 | :param delta: Whether to delta compress or just entropy compress. |
3575 | + :param _unadded_refs: private parameter, don't use. |
3576 | """ |
3577 | self._index = index |
3578 | self._access = access |
3579 | self._delta = delta |
3580 | - self._unadded_refs = {} |
3581 | + if _unadded_refs is None: |
3582 | + _unadded_refs = {} |
3583 | + self._unadded_refs = _unadded_refs |
3584 | self._group_cache = LRUSizeCache(max_size=50*1024*1024) |
3585 | self._fallback_vfs = [] |
3586 | |
3587 | + def without_fallbacks(self): |
3588 | + """Return a clone of this object without any fallbacks configured.""" |
3589 | + return GroupCompressVersionedFiles(self._index, self._access, |
3590 | + self._delta, _unadded_refs=dict(self._unadded_refs)) |
3591 | + |
3592 | def add_lines(self, key, parents, lines, parent_texts=None, |
3593 | left_matching_blocks=None, nostore_sha=None, random_id=False, |
3594 | check_content=True): |
3595 | @@ -1099,6 +1297,22 @@ |
3596 | self._check_lines_not_unicode(lines) |
3597 | self._check_lines_are_lines(lines) |
3598 | |
3599 | + def get_known_graph_ancestry(self, keys): |
3600 | + """Get a KnownGraph instance with the ancestry of keys.""" |
3601 | + # Note that this is identical to |
3602 | + # KnitVersionedFiles.get_known_graph_ancestry, but they don't share |
3603 | + # ancestry. |
3604 | + parent_map, missing_keys = self._index.find_ancestry(keys) |
3605 | + for fallback in self._fallback_vfs: |
3606 | + if not missing_keys: |
3607 | + break |
3608 | + (f_parent_map, f_missing_keys) = fallback._index.find_ancestry( |
3609 | + missing_keys) |
3610 | + parent_map.update(f_parent_map) |
3611 | + missing_keys = f_missing_keys |
3612 | + kg = _mod_graph.KnownGraph(parent_map) |
3613 | + return kg |
3614 | + |
3615 | def get_parent_map(self, keys): |
3616 | """Get a map of the graph parents of keys. |
3617 | |
3618 | @@ -1131,26 +1345,42 @@ |
3619 | missing.difference_update(set(new_result)) |
3620 | return result, source_results |
3621 | |
3622 | - def _get_block(self, index_memo): |
3623 | - read_memo = index_memo[0:3] |
3624 | - # get the group: |
3625 | - try: |
3626 | - block = self._group_cache[read_memo] |
3627 | - except KeyError: |
3628 | - # read the group |
3629 | - zdata = self._access.get_raw_records([read_memo]).next() |
3630 | - # decompress - whole thing - this is not a bug, as it |
3631 | - # permits caching. We might want to store the partially |
3632 | - # decompresed group and decompress object, so that recent |
3633 | - # texts are not penalised by big groups. |
3634 | - block = GroupCompressBlock.from_bytes(zdata) |
3635 | - self._group_cache[read_memo] = block |
3636 | - # cheapo debugging: |
3637 | - # print len(zdata), len(plain) |
3638 | - # parse - requires split_lines, better to have byte offsets |
3639 | - # here (but not by much - we only split the region for the |
3640 | - # recipe, and we often want to end up with lines anyway. |
3641 | - return block |
3642 | + def _get_blocks(self, read_memos): |
3643 | + """Get GroupCompressBlocks for the given read_memos. |
3644 | + |
3645 | + :returns: a series of (read_memo, block) pairs, in the order they were |
3646 | + originally passed. |
3647 | + """ |
3648 | + cached = {} |
3649 | + for read_memo in read_memos: |
3650 | + try: |
3651 | + block = self._group_cache[read_memo] |
3652 | + except KeyError: |
3653 | + pass |
3654 | + else: |
3655 | + cached[read_memo] = block |
3656 | + not_cached = [] |
3657 | + not_cached_seen = set() |
3658 | + for read_memo in read_memos: |
3659 | + if read_memo in cached: |
3660 | + # Don't fetch what we already have |
3661 | + continue |
3662 | + if read_memo in not_cached_seen: |
3663 | + # Don't try to fetch the same data twice |
3664 | + continue |
3665 | + not_cached.append(read_memo) |
3666 | + not_cached_seen.add(read_memo) |
3667 | + raw_records = self._access.get_raw_records(not_cached) |
3668 | + for read_memo in read_memos: |
3669 | + try: |
3670 | + yield read_memo, cached[read_memo] |
3671 | + except KeyError: |
3672 | + # Read the block, and cache it. |
3673 | + zdata = raw_records.next() |
3674 | + block = GroupCompressBlock.from_bytes(zdata) |
3675 | + self._group_cache[read_memo] = block |
3676 | + cached[read_memo] = block |
3677 | + yield read_memo, block |
3678 | |
3679 | def get_missing_compression_parent_keys(self): |
3680 | """Return the keys of missing compression parents. |
3681 | @@ -1322,55 +1552,35 @@ |
3682 | unadded_keys, source_result) |
3683 | for key in missing: |
3684 | yield AbsentContentFactory(key) |
3685 | - manager = None |
3686 | - last_read_memo = None |
3687 | - # TODO: This works fairly well at batching up existing groups into a |
3688 | - # streamable format, and possibly allowing for taking one big |
3689 | - # group and splitting it when it isn't fully utilized. |
3690 | - # However, it doesn't allow us to find under-utilized groups and |
3691 | - # combine them into a bigger group on the fly. |
3692 | - # (Consider the issue with how chk_map inserts texts |
3693 | - # one-at-a-time.) This could be done at insert_record_stream() |
3694 | - # time, but it probably would decrease the number of |
3695 | - # bytes-on-the-wire for fetch. |
3696 | + # Batch up as many keys as we can until either: |
3697 | + # - we encounter an unadded ref, or |
3698 | + # - we run out of keys, or |
3699 | + # - the total bytes to retrieve for this batch > BATCH_SIZE |
3700 | + batcher = _BatchingBlockFetcher(self, locations) |
3701 | for source, keys in source_keys: |
3702 | if source is self: |
3703 | for key in keys: |
3704 | if key in self._unadded_refs: |
3705 | - if manager is not None: |
3706 | - for factory in manager.get_record_stream(): |
3707 | - yield factory |
3708 | - last_read_memo = manager = None |
3709 | + # Flush batch, then yield unadded ref from |
3710 | + # self._compressor. |
3711 | + for factory in batcher.yield_factories(full_flush=True): |
3712 | + yield factory |
3713 | bytes, sha1 = self._compressor.extract(key) |
3714 | parents = self._unadded_refs[key] |
3715 | yield FulltextContentFactory(key, parents, sha1, bytes) |
3716 | - else: |
3717 | - index_memo, _, parents, (method, _) = locations[key] |
3718 | - read_memo = index_memo[0:3] |
3719 | - if last_read_memo != read_memo: |
3720 | - # We are starting a new block. If we have a |
3721 | - # manager, we have found everything that fits for |
3722 | - # now, so yield records |
3723 | - if manager is not None: |
3724 | - for factory in manager.get_record_stream(): |
3725 | - yield factory |
3726 | - # Now start a new manager |
3727 | - block = self._get_block(index_memo) |
3728 | - manager = _LazyGroupContentManager(block) |
3729 | - last_read_memo = read_memo |
3730 | - start, end = index_memo[3:5] |
3731 | - manager.add_factory(key, parents, start, end) |
3732 | + continue |
3733 | + if batcher.add_key(key) > BATCH_SIZE: |
3734 | + # Ok, this batch is big enough. Yield some results. |
3735 | + for factory in batcher.yield_factories(): |
3736 | + yield factory |
3737 | else: |
3738 | - if manager is not None: |
3739 | - for factory in manager.get_record_stream(): |
3740 | - yield factory |
3741 | - last_read_memo = manager = None |
3742 | + for factory in batcher.yield_factories(full_flush=True): |
3743 | + yield factory |
3744 | for record in source.get_record_stream(keys, ordering, |
3745 | include_delta_closure): |
3746 | yield record |
3747 | - if manager is not None: |
3748 | - for factory in manager.get_record_stream(): |
3749 | - yield factory |
3750 | + for factory in batcher.yield_factories(full_flush=True): |
3751 | + yield factory |
3752 | |
3753 | def get_sha1s(self, keys): |
3754 | """See VersionedFiles.get_sha1s().""" |
3755 | @@ -1449,6 +1659,7 @@ |
3756 | block_length = None |
3757 | # XXX: TODO: remove this, it is just for safety checking for now |
3758 | inserted_keys = set() |
3759 | + reuse_this_block = reuse_blocks |
3760 | for record in stream: |
3761 | # Raise an error when a record is missing. |
3762 | if record.storage_kind == 'absent': |
3763 | @@ -1462,10 +1673,20 @@ |
3764 | if reuse_blocks: |
3765 | # If the reuse_blocks flag is set, check to see if we can just |
3766 | # copy a groupcompress block as-is. |
3767 | + # We only check on the first record (groupcompress-block) not |
3768 | + # on all of the (groupcompress-block-ref) entries. |
3769 | +                # The reuse_this_block flag is then kept until a new block starts.
3770 | + if record.storage_kind == 'groupcompress-block': |
3771 | + # Check to see if we really want to re-use this block |
3772 | + insert_manager = record._manager |
3773 | + reuse_this_block = insert_manager.check_is_well_utilized() |
3774 | + else: |
3775 | + reuse_this_block = False |
3776 | + if reuse_this_block: |
3777 | + # We still want to reuse this block |
3778 | if record.storage_kind == 'groupcompress-block': |
3779 | # Insert the raw block into the target repo |
3780 | insert_manager = record._manager |
3781 | - insert_manager._check_rebuild_block() |
3782 | bytes = record._manager._block.to_bytes() |
3783 | _, start, length = self._access.add_raw_records( |
3784 | [(None, len(bytes))], bytes)[0] |
3785 | @@ -1476,6 +1697,11 @@ |
3786 | 'groupcompress-block-ref'): |
3787 | if insert_manager is None: |
3788 | raise AssertionError('No insert_manager set') |
3789 | + if insert_manager is not record._manager: |
3790 | + raise AssertionError('insert_manager does not match' |
3791 | + ' the current record, we cannot be positive' |
3792 | + ' that the appropriate content was inserted.' |
3793 | + ) |
3794 | value = "%d %d %d %d" % (block_start, block_length, |
3795 | record._start, record._end) |
3796 | nodes = [(record.key, value, (record.parents,))] |
3797 | @@ -1593,7 +1819,7 @@ |
3798 | |
3799 | def __init__(self, graph_index, is_locked, parents=True, |
3800 | add_callback=None, track_external_parent_refs=False, |
3801 | - inconsistency_fatal=True): |
3802 | + inconsistency_fatal=True, track_new_keys=False): |
3803 | """Construct a _GCGraphIndex on a graph_index. |
3804 | |
3805 | :param graph_index: An implementation of bzrlib.index.GraphIndex. |
3806 | @@ -1619,7 +1845,8 @@ |
3807 | self._is_locked = is_locked |
3808 | self._inconsistency_fatal = inconsistency_fatal |
3809 | if track_external_parent_refs: |
3810 | - self._key_dependencies = knit._KeyRefs() |
3811 | + self._key_dependencies = knit._KeyRefs( |
3812 | + track_new_keys=track_new_keys) |
3813 | else: |
3814 | self._key_dependencies = None |
3815 | |
3816 | @@ -1679,10 +1906,14 @@ |
3817 | result.append((key, value)) |
3818 | records = result |
3819 | key_dependencies = self._key_dependencies |
3820 | - if key_dependencies is not None and self._parents: |
3821 | - for key, value, refs in records: |
3822 | - parents = refs[0] |
3823 | - key_dependencies.add_references(key, parents) |
3824 | + if key_dependencies is not None: |
3825 | + if self._parents: |
3826 | + for key, value, refs in records: |
3827 | + parents = refs[0] |
3828 | + key_dependencies.add_references(key, parents) |
3829 | + else: |
3830 | + for key, value, refs in records: |
3831 | +                    key_dependencies.add_key(key)
3832 | self._add_callback(records) |
3833 | |
3834 | def _check_read(self): |
3835 | @@ -1719,6 +1950,10 @@ |
3836 | if missing_keys: |
3837 | raise errors.RevisionNotPresent(missing_keys.pop(), self) |
3838 | |
3839 | + def find_ancestry(self, keys): |
3840 | + """See CombinedGraphIndex.find_ancestry""" |
3841 | + return self._graph_index.find_ancestry(keys, 0) |
3842 | + |
3843 | def get_parent_map(self, keys): |
3844 | """Get a map of the parents of keys. |
3845 | |
3846 | @@ -1741,7 +1976,7 @@ |
3847 | """Return the keys of missing parents.""" |
3848 | # Copied from _KnitGraphIndex.get_missing_parents |
3849 | # We may have false positives, so filter those out. |
3850 | - self._key_dependencies.add_keys( |
3851 | + self._key_dependencies.satisfy_refs_for_keys( |
3852 | self.get_parent_map(self._key_dependencies.get_unsatisfied_refs())) |
3853 | return frozenset(self._key_dependencies.get_unsatisfied_refs()) |
3854 | |
3855 | @@ -1801,17 +2036,17 @@ |
3856 | |
3857 | This allows this _GCGraphIndex to keep track of any missing |
3858 | compression parents we may want to have filled in to make those |
3859 | - indices valid. |
3860 | + indices valid. It also allows _GCGraphIndex to track any new keys. |
3861 | |
3862 | :param graph_index: A GraphIndex |
3863 | """ |
3864 | - if self._key_dependencies is not None: |
3865 | - # Add parent refs from graph_index (and discard parent refs that |
3866 | - # the graph_index has). |
3867 | - add_refs = self._key_dependencies.add_references |
3868 | - for node in graph_index.iter_all_entries(): |
3869 | - add_refs(node[1], node[3][0]) |
3870 | - |
3871 | + key_dependencies = self._key_dependencies |
3872 | + if key_dependencies is None: |
3873 | + return |
3874 | + for node in graph_index.iter_all_entries(): |
3875 | + # Add parent refs from graph_index (and discard parent refs |
3876 | + # that the graph_index has). |
3877 | + key_dependencies.add_references(node[1], node[3][0]) |
3878 | |
3879 | |
3880 | from bzrlib._groupcompress_py import ( |
3881 | |
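Consolidating the ``get_record_stream`` hunks above, the batching flow for locally stored keys takes roughly this shape (``stream_batched`` is a hypothetical name for the fragment, not a real method; ``keys`` and ``locations`` come from the surrounding method as shown in the diff):

    def stream_batched(self, keys, locations):
        batcher = _BatchingBlockFetcher(self, locations)
        for key in keys:
            # add_key returns the estimated bytes pending in this
            # batch; already-cached blocks are tracked in batch_memos
            # but add nothing to the total.
            if batcher.add_key(key) > BATCH_SIZE:  # BATCH_SIZE = 2**16
                for factory in batcher.yield_factories():
                    yield factory
        # Flush the remainder, including a partially filled manager.
        for factory in batcher.yield_factories(full_flush=True):
            yield factory
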
3882 | === modified file 'bzrlib/help_topics/en/configuration.txt' |
3883 | --- bzrlib/help_topics/en/configuration.txt 2009-06-26 18:13:41 +0000 |
3884 | +++ bzrlib/help_topics/en/configuration.txt 2009-08-21 09:19:11 +0000 |
3885 | @@ -63,6 +63,60 @@ |
3886 | ~~~~~~~~~~~~~~~ |
3887 | |
3888 | The path to the plugins directory that Bazaar should use. |
3889 | +If not set, Bazaar will search for plugins in: |
3890 | + |
3891 | +* the user specific plugin directory (containing the ``user`` plugins), |
3892 | + |
3893 | +* the bzrlib directory (containing the ``core`` plugins), |
3894 | + |
3895 | +* the site specific plugin directory if applicable (containing |
3896 | + the ``site`` plugins). |
3897 | + |
3898 | +If ``BZR_PLUGIN_PATH`` is set in any fashion, it will change the
3899 | +way the plugins are searched.
3900 | + |
3901 | +As with the ``PATH`` variable, if multiple directories are
3902 | +specified in ``BZR_PLUGIN_PATH`` they should be separated by the
3903 | +appropriate platform-specific character (':' on Unix/Linux/etc,
3904 | +';' on Windows).
3905 | + |
3906 | +By default, if ``BZR_PLUGIN_PATH`` is set, it replaces searching
3907 | +in ``user``. However, Bazaar will continue to search in ``core`` and
3908 | +``site`` unless they are explicitly removed.
3909 | + |
3910 | +If you need to change the order or remove one of these |
3911 | +directories, you should use special values: |
3912 | + |
3913 | +* ``-user``, ``-core``, ``-site`` will remove the corresponding |
3914 | + path from the default values, |
3915 | + |
3916 | +* ``+user``, ``+core``, ``+site`` will add the corresponding path |
3917 | + before the remaining default values (and also remove it from |
3918 | + the default values). |
3919 | + |
3920	| +Note that the special values 'user', 'core' and 'site' should be
3921	| +used literally; they will be substituted by the corresponding
3922	| +platform-specific values.
3923 | + |
3924 | +Examples: |
3925 | +^^^^^^^^^ |
3926 | + |
3927	| +The examples below use ':' as the separator; Windows users
3928	| +should use ';'.
3929 | + |
3930 | +Overriding the default user plugin directory: |
3931 | +``BZR_PLUGIN_PATH='/path/to/my/other/plugins'`` |
3932 | + |
3933 | +Disabling the site directory while retaining the user directory: |
3934 | +``BZR_PLUGIN_PATH='-site:+user'`` |
3935 | + |
3936 | +Disabling all plugins (better achieved with --no-plugins): |
3937 | +``BZR_PLUGIN_PATH='-user:-core:-site'`` |
3938 | + |
3939 | +Overriding the default site plugin directory: |
3940	| +``BZR_PLUGIN_PATH='/path/to/my/site/plugins:-site:+user'``
3941 | + |
3942 | + |
3943 | |
3944 | BZRPATH |
3945 | ~~~~~~~ |
3946 | |
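To make the token semantics above concrete, here is a hedged sketch of how a ``BZR_PLUGIN_PATH`` value could be expanded into an ordered search list. The directory constants and the helper name are illustrative assumptions, not bzrlib's actual implementation:

    # Illustrative sketch only: expand a BZR_PLUGIN_PATH-style value into
    # an ordered list of plugin directories.  DEFAULTS holds made-up paths.
    import os

    DEFAULTS = {'user': '/home/me/.bazaar/plugins',           # assumed
                'core': '/usr/lib/pymodules/bzrlib/plugins',  # assumed
                'site': '/usr/lib/pymodules/site-plugins'}    # assumed

    def expand_plugin_path(value):
        remaining = ['user', 'core', 'site']    # default search order
        explicit = []
        for token in value.split(os.pathsep):
            if token.startswith('-'):           # '-site': drop a default
                if token[1:] in remaining:
                    remaining.remove(token[1:])
            elif token.startswith('+'):         # '+user': pin it here
                if token[1:] in remaining:
                    remaining.remove(token[1:])
                    explicit.append(DEFAULTS[token[1:]])
            elif token:                         # a literal directory
                explicit.append(token)
        # an explicit path replaces 'user' unless 'user' was pinned
        if explicit and 'user' in remaining:
            remaining.remove('user')
        return explicit + [DEFAULTS[name] for name in remaining]

    print expand_plugin_path('-site:+user')
    # ['/home/me/.bazaar/plugins', '/usr/lib/pymodules/bzrlib/plugins']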
3947 | === modified file 'bzrlib/help_topics/en/debug-flags.txt' |
3948 | --- bzrlib/help_topics/en/debug-flags.txt 2009-08-04 04:36:34 +0000 |
3949 | +++ bzrlib/help_topics/en/debug-flags.txt 2009-08-20 05:02:45 +0000 |
3950 | @@ -12,18 +12,26 @@ |
3951 | operations. |
3952 | -Dfetch Trace history copying between repositories. |
3953 | -Dfilters Emit information for debugging content filtering. |
3954 | +-Dforceinvdeltas Force use of inventory deltas during generic streaming fetch. |
3955 | -Dgraph Trace graph traversal. |
3956 | -Dhashcache Log every time a working file is read to determine its hash. |
3957 | -Dhooks Trace hook execution. |
3958 | -Dhpss Trace smart protocol requests and responses. |
3959 | -Dhpssdetail More hpss details. |
3960 | -Dhpssvfs Traceback on vfs access to Remote objects. |
3961 | --Dhttp Trace http connections, requests and responses |
3962 | +-Dhttp Trace http connections, requests and responses. |
3963 | -Dindex Trace major index operations. |
3964 | -Dknit Trace knit operations. |
3965 | +-Dstrict_locks Trace when OS locks are potentially used in a non-portable |
3966 | + manner. |
3967 | -Dlock Trace when lockdir locks are taken or released. |
3968 | -Dprogress Trace progress bar operations. |
3969 | -Dmerge Emit information for debugging merges. |
3970 | +-Dno_apport Don't use apport to report crashes. |
3971 | -Dunlock Some errors during unlock are treated as warnings. |
3972 | -Dpack Emit information about pack operations. |
3973 | -Dsftp Trace SFTP internals. |
3974 | +-Dstream Trace fetch streams. |
3975 | +-DIDS_never Never use InterDifferingSerializer when fetching. |
3976 | +-DIDS_always Always use InterDifferingSerializer to fetch if appropriate |
3977 | + for the format, even for non-local fetches. |
3978 | |
3979 | === modified file 'bzrlib/hooks.py' |
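These flags are normally given on the command line (for example ``bzr -Dhpss st``), but they can also be toggled from Python. A small hedged sketch, assuming the usual ``bzrlib.debug.debug_flags`` set:

    # Sketch: debug_flags is a plain set of strings consulted by bzrlib.
    from bzrlib import debug

    debug.debug_flags.add('strict_locks')  # make mixed OS locks hard errors
    debug.debug_flags.add('hpss')          # trace smart protocol traffic
    if 'hpss' in debug.debug_flags:
        print 'smart protocol tracing enabled'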
3980 | --- bzrlib/hooks.py 2009-06-10 03:31:01 +0000 |
3981 | +++ bzrlib/hooks.py 2009-09-01 12:29:54 +0000 |
3982 | @@ -219,9 +219,7 @@ |
3983 | strings.append('Introduced in: %s' % introduced_string) |
3984 | if self.deprecated: |
3985 | deprecated_string = _format_version_tuple(self.deprecated) |
3986 | - else: |
3987 | - deprecated_string = 'Not deprecated' |
3988 | - strings.append('Deprecated in: %s' % deprecated_string) |
3989 | + strings.append('Deprecated in: %s' % deprecated_string) |
3990 | strings.append('') |
3991 | strings.extend(textwrap.wrap(self.__doc__, |
3992 | break_long_words=False)) |
3993 | |
3994 | === modified file 'bzrlib/index.py' |
3995 | --- bzrlib/index.py 2009-07-01 10:53:08 +0000 |
3996 | +++ bzrlib/index.py 2009-08-17 22:11:06 +0000 |
3997 | @@ -333,6 +333,22 @@ |
3998 | if combine_backing_indices is not None: |
3999 | self._combine_backing_indices = combine_backing_indices |
4000 | |
4001 | + def find_ancestry(self, keys, ref_list_num): |
4002 | + """See CombinedGraphIndex.find_ancestry()""" |
4003 | + pending = set(keys) |
4004 | + parent_map = {} |
4005 | + missing_keys = set() |
4006 | + while pending: |
4007 | + next_pending = set() |
4008 | + for _, key, value, ref_lists in self.iter_entries(pending): |
4009 | + parent_keys = ref_lists[ref_list_num] |
4010 | + parent_map[key] = parent_keys |
4011 | + next_pending.update([p for p in parent_keys if p not in |
4012 | + parent_map]) |
4013 | + missing_keys.update(pending.difference(parent_map)) |
4014 | + pending = next_pending |
4015 | + return parent_map, missing_keys |
4016 | + |
4017 | |
4018 | class GraphIndex(object): |
4019 | """An index for data with embedded graphs. |
4020 | @@ -702,6 +718,23 @@ |
4021 | # the last thing looked up was a terminal element |
4022 | yield (self, ) + key_dict |
4023 | |
4024 | + def _find_ancestors(self, keys, ref_list_num, parent_map, missing_keys): |
4025 | + """See BTreeIndex._find_ancestors.""" |
4026	| +        # The API can be implemented as a trivial overlay on top of
4027	| +        # iter_entries; it is not an efficient implementation, but it
4028	| +        # at least gets the job done.
4029 | + found_keys = set() |
4030 | + search_keys = set() |
4031 | + for index, key, value, refs in self.iter_entries(keys): |
4032 | + parent_keys = refs[ref_list_num] |
4033 | + found_keys.add(key) |
4034 | + parent_map[key] = parent_keys |
4035 | + search_keys.update(parent_keys) |
4036 | + # Figure out what, if anything, was missing |
4037 | + missing_keys.update(set(keys).difference(found_keys)) |
4038 | + search_keys = search_keys.difference(parent_map) |
4039 | + return search_keys |
4040 | + |
4041 | def key_count(self): |
4042 | """Return an estimate of the number of keys in this index. |
4043 | |
4044 | @@ -1297,6 +1330,69 @@ |
4045 | except errors.NoSuchFile: |
4046 | self._reload_or_raise() |
4047 | |
4048 | + def find_ancestry(self, keys, ref_list_num): |
4049 | + """Find the complete ancestry for the given set of keys. |
4050 | + |
4051 | + Note that this is a whole-ancestry request, so it should be used |
4052 | + sparingly. |
4053 | + |
4054 | + :param keys: An iterable of keys to look for |
4055 | + :param ref_list_num: The reference list which references the parents |
4056 | + we care about. |
4057 | + :return: (parent_map, missing_keys) |
4058 | + """ |
4059 | + missing_keys = set() |
4060 | + parent_map = {} |
4061 | + keys_to_lookup = set(keys) |
4062 | + generation = 0 |
4063 | + while keys_to_lookup: |
4064 | + # keys that *all* indexes claim are missing, stop searching them |
4065 | + generation += 1 |
4066 | + all_index_missing = None |
4067 | + # print 'gen\tidx\tsub\tn_keys\tn_pmap\tn_miss' |
4068 | + # print '%4d\t\t\t%4d\t%5d\t%5d' % (generation, len(keys_to_lookup), |
4069 | + # len(parent_map), |
4070 | + # len(missing_keys)) |
4071 | + for index_idx, index in enumerate(self._indices): |
4072 | + # TODO: we should probably be doing something with |
4073 | + # 'missing_keys' since we've already determined that |
4074 | + # those revisions have not been found anywhere |
4075 | + index_missing_keys = set() |
4076 | + # Find all of the ancestry we can from this index |
4077 | + # keep looking until the search_keys set is empty, which means |
4078 | + # things we didn't find should be in index_missing_keys |
4079 | + search_keys = keys_to_lookup |
4080 | + sub_generation = 0 |
4081 | + # print ' \t%2d\t\t%4d\t%5d\t%5d' % ( |
4082 | + # index_idx, len(search_keys), |
4083 | + # len(parent_map), len(index_missing_keys)) |
4084 | + while search_keys: |
4085 | + sub_generation += 1 |
4086 | + # TODO: ref_list_num should really be a parameter, since |
4087 | + # CombinedGraphIndex does not know what the ref lists |
4088 | + # mean. |
4089 | + search_keys = index._find_ancestors(search_keys, |
4090 | + ref_list_num, parent_map, index_missing_keys) |
4091 | + # print ' \t \t%2d\t%4d\t%5d\t%5d' % ( |
4092 | + # sub_generation, len(search_keys), |
4093 | + # len(parent_map), len(index_missing_keys)) |
4094 | + # Now set whatever was missing to be searched in the next index |
4095 | + keys_to_lookup = index_missing_keys |
4096 | + if all_index_missing is None: |
4097 | + all_index_missing = set(index_missing_keys) |
4098 | + else: |
4099 | + all_index_missing.intersection_update(index_missing_keys) |
4100 | + if not keys_to_lookup: |
4101 | + break |
4102 | + if all_index_missing is None: |
4103 | + # There were no indexes, so all search keys are 'missing' |
4104 | + missing_keys.update(keys_to_lookup) |
4105 | + keys_to_lookup = None |
4106 | + else: |
4107 | + missing_keys.update(all_index_missing) |
4108 | + keys_to_lookup.difference_update(all_index_missing) |
4109 | + return parent_map, missing_keys |
4110 | + |
4111 | def key_count(self): |
4112 | """Return an estimate of the number of keys in this index. |
4113 | |
4114 | |
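The generational loop in CombinedGraphIndex.find_ancestry above repeatedly asks each backing index for whatever ancestry it can supply, then carries the still-missing keys to the next index. The per-index step is the _find_ancestors contract; a standalone sketch of it, using a plain dictionary as a stand-in index:

    # Standalone sketch of the _find_ancestors contract: resolve what one
    # index knows, record the rest as missing, return new keys to search.
    def find_ancestors(index, keys, parent_map, missing_keys):
        search_keys = set()
        found = set()
        for key in keys:
            if key in index:              # index: dict key -> parent tuple
                found.add(key)
                parent_map[key] = index[key]
                search_keys.update(index[key])
        missing_keys.update(set(keys) - found)
        return search_keys - set(parent_map)

    index = {'C': ('B',), 'B': ('A',), 'A': ()}
    parent_map, missing = {}, set()
    pending = set(['C'])
    while pending:                        # mirrors the inner while loop above
        pending = find_ancestors(index, pending, parent_map, missing)
    print sorted(parent_map.items())  # [('A', ()), ('B', ('A',)), ('C', ('B',))]
    print missing                     # set([])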
4115 | === modified file 'bzrlib/inventory.py' |
4116 | --- bzrlib/inventory.py 2009-08-05 02:30:59 +0000 |
4117 | +++ bzrlib/inventory.py 2009-08-30 23:51:10 +0000 |
4118 | @@ -437,7 +437,13 @@ |
4119 | self.text_id is not None): |
4120 | checker._report_items.append('directory {%s} has text in revision {%s}' |
4121 | % (self.file_id, rev_id)) |
4122 | - # Directories are stored as ''. |
4123	| +        # In non-rich-root repositories we do not expect a file graph for the
4124 | + # root. |
4125 | + if self.name == '' and not checker.rich_roots: |
4126 | + return |
4127 | + # Directories are stored as an empty file, but the file should exist |
4128 | + # to provide a per-fileid log. The hash of every directory content is |
4129 | + # "da..." below (the sha1sum of ''). |
4130 | checker.add_pending_item(rev_id, |
4131 | ('texts', self.file_id, self.revision), 'text', |
4132 | 'da39a3ee5e6b4b0d3255bfef95601890afd80709') |
4133 | @@ -743,6 +749,9 @@ |
4134 | """ |
4135 | return self.has_id(file_id) |
4136 | |
4137 | + def has_filename(self, filename): |
4138 | + return bool(self.path2id(filename)) |
4139 | + |
4140 | def id2path(self, file_id): |
4141 | """Return as a string the path to file_id. |
4142 | |
4143 | @@ -751,6 +760,8 @@ |
4144 | >>> e = i.add(InventoryFile('foo-id', 'foo.c', parent_id='src-id')) |
4145 | >>> print i.id2path('foo-id') |
4146 | src/foo.c |
4147 | + |
4148 | + :raises NoSuchId: If file_id is not present in the inventory. |
4149 | """ |
4150 | # get all names, skipping root |
4151 | return '/'.join(reversed( |
4152 | @@ -1363,9 +1374,6 @@ |
4153 | yield ie |
4154 | file_id = ie.parent_id |
4155 | |
4156 | - def has_filename(self, filename): |
4157 | - return bool(self.path2id(filename)) |
4158 | - |
4159 | def has_id(self, file_id): |
4160 | return (file_id in self._byid) |
4161 | |
4162 | |
4163 | === modified file 'bzrlib/inventory_delta.py' |
4164 | --- bzrlib/inventory_delta.py 2009-04-02 05:53:12 +0000 |
4165 | +++ bzrlib/inventory_delta.py 2009-08-13 00:20:29 +0000 |
4166 | @@ -29,6 +29,25 @@ |
4167 | from bzrlib import inventory |
4168 | from bzrlib.revision import NULL_REVISION |
4169 | |
4170 | +FORMAT_1 = 'bzr inventory delta v1 (bzr 1.14)' |
4171 | + |
4172 | + |
4173 | +class InventoryDeltaError(errors.BzrError): |
4174 | + """An error when serializing or deserializing an inventory delta.""" |
4175 | + |
4176 | + # Most errors when serializing and deserializing are due to bugs, although |
4177 | + # damaged input (i.e. a bug in a different process) could cause |
4178 | + # deserialization errors too. |
4179 | + internal_error = True |
4180 | + |
4181 | + |
4182 | +class IncompatibleInventoryDelta(errors.BzrError): |
4183 | + """The delta could not be deserialised because its contents conflict with |
4184 | + the allow_versioned_root or allow_tree_references flags of the |
4185 | + deserializer. |
4186 | + """ |
4187 | + internal_error = False |
4188 | + |
4189 | |
4190 | def _directory_content(entry): |
4191 | """Serialize the content component of entry which is a directory. |
4192 | @@ -49,7 +68,7 @@ |
4193 | exec_bytes = '' |
4194 | size_exec_sha = (entry.text_size, exec_bytes, entry.text_sha1) |
4195 | if None in size_exec_sha: |
4196 | - raise errors.BzrError('Missing size or sha for %s' % entry.file_id) |
4197 | + raise InventoryDeltaError('Missing size or sha for %s' % entry.file_id) |
4198 | return "file\x00%d\x00%s\x00%s" % size_exec_sha |
4199 | |
4200 | |
4201 | @@ -60,7 +79,7 @@ |
4202 | """ |
4203 | target = entry.symlink_target |
4204 | if target is None: |
4205 | - raise errors.BzrError('Missing target for %s' % entry.file_id) |
4206 | + raise InventoryDeltaError('Missing target for %s' % entry.file_id) |
4207 | return "link\x00%s" % target.encode('utf8') |
4208 | |
4209 | |
4210 | @@ -71,7 +90,8 @@ |
4211 | """ |
4212 | tree_revision = entry.reference_revision |
4213 | if tree_revision is None: |
4214 | - raise errors.BzrError('Missing reference revision for %s' % entry.file_id) |
4215 | + raise InventoryDeltaError( |
4216 | + 'Missing reference revision for %s' % entry.file_id) |
4217 | return "tree\x00%s" % tree_revision |
4218 | |
4219 | |
4220 | @@ -115,11 +135,8 @@ |
4221 | return result |
4222 | |
4223 | |
4224 | - |
4225 | class InventoryDeltaSerializer(object): |
4226 | - """Serialize and deserialize inventory deltas.""" |
4227 | - |
4228 | - FORMAT_1 = 'bzr inventory delta v1 (bzr 1.14)' |
4229 | + """Serialize inventory deltas.""" |
4230 | |
4231 | def __init__(self, versioned_root, tree_references): |
4232 | """Create an InventoryDeltaSerializer. |
4233 | @@ -141,6 +158,9 @@ |
4234 | def delta_to_lines(self, old_name, new_name, delta_to_new): |
4235 | """Return a line sequence for delta_to_new. |
4236 | |
4237	| +        The versioned_root and tree_references flags are taken from the
4238	| +        values passed to the constructor.
4239 | + |
4240 | :param old_name: A UTF8 revision id for the old inventory. May be |
4241 | NULL_REVISION if there is no older inventory and delta_to_new |
4242 | includes the entire inventory contents. |
4243 | @@ -150,15 +170,20 @@ |
4244 | takes. |
4245 | :return: The serialized delta as lines. |
4246 | """ |
4247 | + if type(old_name) is not str: |
4248 | + raise TypeError('old_name should be str, got %r' % (old_name,)) |
4249 | + if type(new_name) is not str: |
4250 | + raise TypeError('new_name should be str, got %r' % (new_name,)) |
4251 | lines = ['', '', '', '', ''] |
4252 | to_line = self._delta_item_to_line |
4253 | for delta_item in delta_to_new: |
4254 | - lines.append(to_line(delta_item)) |
4255 | - if lines[-1].__class__ != str: |
4256 | - raise errors.BzrError( |
4257 | + line = to_line(delta_item, new_name) |
4258 | + if line.__class__ != str: |
4259 | + raise InventoryDeltaError( |
4260 | 'to_line generated non-str output %r' % lines[-1]) |
4261 | + lines.append(line) |
4262 | lines.sort() |
4263 | - lines[0] = "format: %s\n" % InventoryDeltaSerializer.FORMAT_1 |
4264 | + lines[0] = "format: %s\n" % FORMAT_1 |
4265 | lines[1] = "parent: %s\n" % old_name |
4266 | lines[2] = "version: %s\n" % new_name |
4267 | lines[3] = "versioned_root: %s\n" % self._serialize_bool( |
4268 | @@ -173,7 +198,7 @@ |
4269 | else: |
4270 | return "false" |
4271 | |
4272 | - def _delta_item_to_line(self, delta_item): |
4273 | + def _delta_item_to_line(self, delta_item, new_version): |
4274 | """Convert delta_item to a line.""" |
4275 | oldpath, newpath, file_id, entry = delta_item |
4276 | if newpath is None: |
4277 | @@ -188,6 +213,10 @@ |
4278 | oldpath_utf8 = 'None' |
4279 | else: |
4280 | oldpath_utf8 = '/' + oldpath.encode('utf8') |
4281 | + if newpath == '/': |
4282 | + raise AssertionError( |
4283 | + "Bad inventory delta: '/' is not a valid newpath " |
4284 | + "(should be '') in delta item %r" % (delta_item,)) |
4285 | # TODO: Test real-world utf8 cache hit rate. It may be a win. |
4286 | newpath_utf8 = '/' + newpath.encode('utf8') |
4287 | # Serialize None as '' |
4288 | @@ -196,58 +225,78 @@ |
4289 | last_modified = entry.revision |
4290 | # special cases for / |
4291 | if newpath_utf8 == '/' and not self._versioned_root: |
4292 | - if file_id != 'TREE_ROOT': |
4293 | - raise errors.BzrError( |
4294 | - 'file_id %s is not TREE_ROOT for /' % file_id) |
4295 | - if last_modified is not None: |
4296 | - raise errors.BzrError( |
4297 | - 'Version present for / in %s' % file_id) |
4298 | - last_modified = NULL_REVISION |
4299	| +            # This is an entry for the root, and this serializer does not
4300 | + # support versioned roots. So this must be an unversioned |
4301 | + # root, i.e. last_modified == new revision. Otherwise, this |
4302 | + # delta is invalid. |
4303 | + # Note: the non-rich-root repositories *can* have roots with |
4304 | + # file-ids other than TREE_ROOT, e.g. repo formats that use the |
4305 | + # xml5 serializer. |
4306 | + if last_modified != new_version: |
4307 | + raise InventoryDeltaError( |
4308 | + 'Version present for / in %s (%s != %s)' |
4309 | + % (file_id, last_modified, new_version)) |
4310 | if last_modified is None: |
4311 | - raise errors.BzrError("no version for fileid %s" % file_id) |
4312 | + raise InventoryDeltaError("no version for fileid %s" % file_id) |
4313 | content = self._entry_to_content[entry.kind](entry) |
4314 | return ("%s\x00%s\x00%s\x00%s\x00%s\x00%s\n" % |
4315 | (oldpath_utf8, newpath_utf8, file_id, parent_id, last_modified, |
4316 | content)) |
4317 | |
4318 | + |
4319 | +class InventoryDeltaDeserializer(object): |
4320 | + """Deserialize inventory deltas.""" |
4321 | + |
4322 | + def __init__(self, allow_versioned_root=True, allow_tree_references=True): |
4323 | + """Create an InventoryDeltaDeserializer. |
4324 | + |
4325	| +        :param allow_versioned_root: If True, deltas with a versioned
4326	| +            root entry may be deserialised (roots can have any fileid).
4327	| +        :param allow_tree_references: If True, support tree-reference entries.
4328 | + """ |
4329 | + self._allow_versioned_root = allow_versioned_root |
4330 | + self._allow_tree_references = allow_tree_references |
4331 | + |
4332 | def _deserialize_bool(self, value): |
4333 | if value == "true": |
4334 | return True |
4335 | elif value == "false": |
4336 | return False |
4337 | else: |
4338 | - raise errors.BzrError("value %r is not a bool" % (value,)) |
4339 | + raise InventoryDeltaError("value %r is not a bool" % (value,)) |
4340 | |
4341 | def parse_text_bytes(self, bytes): |
4342 | """Parse the text bytes of a serialized inventory delta. |
4343 | |
4344	| +        If the delta declares a versioned root or tree references that
4345	| +        this deserializer was configured to disallow,
4346	| +        IncompatibleInventoryDelta is raised.
4347 | + |
4348 | :param bytes: The bytes to parse. This can be obtained by calling |
4349 | delta_to_lines and then doing ''.join(delta_lines). |
4350 | - :return: (parent_id, new_id, inventory_delta) |
4351 | + :return: (parent_id, new_id, versioned_root, tree_references, |
4352 | + inventory_delta) |
4353 | """ |
4354 | + if bytes[-1:] != '\n': |
4355 | + last_line = bytes.rsplit('\n', 1)[-1] |
4356 | + raise InventoryDeltaError('last line not empty: %r' % (last_line,)) |
4357 | lines = bytes.split('\n')[:-1] # discard the last empty line |
4358 | - if not lines or lines[0] != 'format: %s' % InventoryDeltaSerializer.FORMAT_1: |
4359 | - raise errors.BzrError('unknown format %r' % lines[0:1]) |
4360 | + if not lines or lines[0] != 'format: %s' % FORMAT_1: |
4361 | + raise InventoryDeltaError('unknown format %r' % lines[0:1]) |
4362 | if len(lines) < 2 or not lines[1].startswith('parent: '): |
4363 | - raise errors.BzrError('missing parent: marker') |
4364 | + raise InventoryDeltaError('missing parent: marker') |
4365 | delta_parent_id = lines[1][8:] |
4366 | if len(lines) < 3 or not lines[2].startswith('version: '): |
4367 | - raise errors.BzrError('missing version: marker') |
4368 | + raise InventoryDeltaError('missing version: marker') |
4369 | delta_version_id = lines[2][9:] |
4370 | if len(lines) < 4 or not lines[3].startswith('versioned_root: '): |
4371 | - raise errors.BzrError('missing versioned_root: marker') |
4372 | + raise InventoryDeltaError('missing versioned_root: marker') |
4373 | delta_versioned_root = self._deserialize_bool(lines[3][16:]) |
4374 | if len(lines) < 5 or not lines[4].startswith('tree_references: '): |
4375 | - raise errors.BzrError('missing tree_references: marker') |
4376 | + raise InventoryDeltaError('missing tree_references: marker') |
4377 | delta_tree_references = self._deserialize_bool(lines[4][17:]) |
4378 | - if delta_versioned_root != self._versioned_root: |
4379 | - raise errors.BzrError( |
4380 | - "serialized versioned_root flag is wrong: %s" % |
4381 | - (delta_versioned_root,)) |
4382 | - if delta_tree_references != self._tree_references: |
4383 | - raise errors.BzrError( |
4384 | - "serialized tree_references flag is wrong: %s" % |
4385 | - (delta_tree_references,)) |
4386 | + if (not self._allow_versioned_root and delta_versioned_root): |
4387 | + raise IncompatibleInventoryDelta("versioned_root not allowed") |
4388 | result = [] |
4389 | seen_ids = set() |
4390 | line_iter = iter(lines) |
4391 | @@ -258,33 +307,58 @@ |
4392 | content) = line.split('\x00', 5) |
4393 | parent_id = parent_id or None |
4394 | if file_id in seen_ids: |
4395 | - raise errors.BzrError( |
4396 | + raise InventoryDeltaError( |
4397 | "duplicate file id in inventory delta %r" % lines) |
4398 | seen_ids.add(file_id) |
4399 | - if newpath_utf8 == '/' and not delta_versioned_root and ( |
4400 | - last_modified != 'null:' or file_id != 'TREE_ROOT'): |
4401 | - raise errors.BzrError("Versioned root found: %r" % line) |
4402 | - elif last_modified[-1] == ':': |
4403 | - raise errors.BzrError('special revisionid found: %r' % line) |
4404 | - if not delta_tree_references and content.startswith('tree\x00'): |
4405 | - raise errors.BzrError("Tree reference found: %r" % line) |
4406 | - content_tuple = tuple(content.split('\x00')) |
4407 | - entry = _parse_entry( |
4408 | - newpath_utf8, file_id, parent_id, last_modified, content_tuple) |
4409 | + if (newpath_utf8 == '/' and not delta_versioned_root and |
4410 | + last_modified != delta_version_id): |
4411	| +                # Delta claims not to have a versioned root, yet here's
4412 | + # a root entry with a non-default version. |
4413 | + raise InventoryDeltaError("Versioned root found: %r" % line) |
4414 | + elif newpath_utf8 != 'None' and last_modified[-1] == ':': |
4415 | + # Deletes have a last_modified of null:, but otherwise special |
4416 | + # revision ids should not occur. |
4417 | + raise InventoryDeltaError('special revisionid found: %r' % line) |
4418 | + if content.startswith('tree\x00'): |
4419 | + if delta_tree_references is False: |
4420 | + raise InventoryDeltaError( |
4421 | + "Tree reference found (but header said " |
4422 | + "tree_references: false): %r" % line) |
4423 | + elif not self._allow_tree_references: |
4424 | + raise IncompatibleInventoryDelta( |
4425 | + "Tree reference not allowed") |
4426 | if oldpath_utf8 == 'None': |
4427 | oldpath = None |
4428 | + elif oldpath_utf8[:1] != '/': |
4429 | + raise InventoryDeltaError( |
4430 | + "oldpath invalid (does not start with /): %r" |
4431 | + % (oldpath_utf8,)) |
4432 | else: |
4433 | + oldpath_utf8 = oldpath_utf8[1:] |
4434 | oldpath = oldpath_utf8.decode('utf8') |
4435 | if newpath_utf8 == 'None': |
4436 | newpath = None |
4437 | + elif newpath_utf8[:1] != '/': |
4438 | + raise InventoryDeltaError( |
4439 | + "newpath invalid (does not start with /): %r" |
4440 | + % (newpath_utf8,)) |
4441 | else: |
4442 | + # Trim leading slash |
4443 | + newpath_utf8 = newpath_utf8[1:] |
4444 | newpath = newpath_utf8.decode('utf8') |
4445 | + content_tuple = tuple(content.split('\x00')) |
4446 | + if content_tuple[0] == 'deleted': |
4447 | + entry = None |
4448 | + else: |
4449 | + entry = _parse_entry( |
4450 | + newpath, file_id, parent_id, last_modified, content_tuple) |
4451 | delta_item = (oldpath, newpath, file_id, entry) |
4452 | result.append(delta_item) |
4453 | - return delta_parent_id, delta_version_id, result |
4454 | - |
4455 | - |
4456 | -def _parse_entry(utf8_path, file_id, parent_id, last_modified, content): |
4457 | + return (delta_parent_id, delta_version_id, delta_versioned_root, |
4458 | + delta_tree_references, result) |
4459 | + |
4460 | + |
4461 | +def _parse_entry(path, file_id, parent_id, last_modified, content): |
4462 | entry_factory = { |
4463 | 'dir': _dir_to_entry, |
4464 | 'file': _file_to_entry, |
4465 | @@ -292,8 +366,10 @@ |
4466 | 'tree': _tree_to_entry, |
4467 | } |
4468 | kind = content[0] |
4469 | - path = utf8_path[1:].decode('utf8') |
4470 | + if path.startswith('/'): |
4471 | + raise AssertionError |
4472 | name = basename(path) |
4473 | return entry_factory[content[0]]( |
4474 | content, name, parent_id, file_id, last_modified) |
4475 | |
4476 | + |
4477 | |
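Taken together, these hunks split the old combined class into a serializer and a deserializer and make the wire-format checks stricter. The header those checks enforce is five labelled lines followed by sorted entry lines; a standalone sketch of just the header validation (mirroring, not reusing, parse_text_bytes, and omitting its per-line startswith checks for brevity):

    # Standalone sketch of the header checks performed by parse_text_bytes.
    FORMAT_1 = 'bzr inventory delta v1 (bzr 1.14)'

    def parse_header(bytes):
        if bytes[-1:] != '\n':
            raise ValueError('last line not empty')
        lines = bytes.split('\n')[:-1]
        if not lines or lines[0] != 'format: %s' % FORMAT_1:
            raise ValueError('unknown format %r' % lines[0:1])
        parent = lines[1][len('parent: '):]
        version = lines[2][len('version: '):]
        versioned_root = lines[3][len('versioned_root: '):] == 'true'
        tree_references = lines[4][len('tree_references: '):] == 'true'
        return parent, version, versioned_root, tree_references

    header = ('format: %s\n' % FORMAT_1 +
              'parent: null:\n'
              'version: rev-1\n'
              'versioned_root: false\n'
              'tree_references: false\n')
    print parse_header(header)   # ('null:', 'rev-1', False, False)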
4478 | === modified file 'bzrlib/knit.py' |
4479 | --- bzrlib/knit.py 2009-08-04 04:36:34 +0000 |
4480 | +++ bzrlib/knit.py 2009-09-09 13:05:33 +0000 |
4481 | @@ -1190,6 +1190,19 @@ |
4482 | generator = _VFContentMapGenerator(self, [key]) |
4483 | return generator._get_content(key) |
4484 | |
4485 | + def get_known_graph_ancestry(self, keys): |
4486 | + """Get a KnownGraph instance with the ancestry of keys.""" |
4487 | + parent_map, missing_keys = self._index.find_ancestry(keys) |
4488 | + for fallback in self._fallback_vfs: |
4489 | + if not missing_keys: |
4490 | + break |
4491 | + (f_parent_map, f_missing_keys) = fallback._index.find_ancestry( |
4492 | + missing_keys) |
4493 | + parent_map.update(f_parent_map) |
4494 | + missing_keys = f_missing_keys |
4495 | + kg = _mod_graph.KnownGraph(parent_map) |
4496 | + return kg |
4497 | + |
4498 | def get_parent_map(self, keys): |
4499 | """Get a map of the graph parents of keys. |
4500 | |
4501 | @@ -1505,10 +1518,10 @@ |
4502 | if source is parent_maps[0]: |
4503 | # this KnitVersionedFiles |
4504 | records = [(key, positions[key][1]) for key in keys] |
4505 | - for key, raw_data, sha1 in self._read_records_iter_raw(records): |
4506 | + for key, raw_data in self._read_records_iter_unchecked(records): |
4507 | (record_details, index_memo, _) = positions[key] |
4508 | yield KnitContentFactory(key, global_map[key], |
4509 | - record_details, sha1, raw_data, self._factory.annotated, None) |
4510 | + record_details, None, raw_data, self._factory.annotated, None) |
4511 | else: |
4512 | vf = self._fallback_vfs[parent_maps.index(source) - 1] |
4513 | for record in vf.get_record_stream(keys, ordering, |
4514 | @@ -1583,6 +1596,13 @@ |
4515 | # key = basis_parent, value = index entry to add |
4516 | buffered_index_entries = {} |
4517 | for record in stream: |
4518 | + kind = record.storage_kind |
4519 | + if kind.startswith('knit-') and kind.endswith('-gz'): |
4520 | + # Check that the ID in the header of the raw knit bytes matches |
4521 | + # the record metadata. |
4522 | + raw_data = record._raw_record |
4523 | + df, rec = self._parse_record_header(record.key, raw_data) |
4524 | + df.close() |
4525 | buffered = False |
4526 | parents = record.parents |
4527 | if record.storage_kind in delta_types: |
4528 | @@ -2560,6 +2580,33 @@ |
4529 | except KeyError: |
4530 | raise RevisionNotPresent(key, self) |
4531 | |
4532 | + def find_ancestry(self, keys): |
4533 | + """See CombinedGraphIndex.find_ancestry()""" |
4534 | + prefixes = set(key[:-1] for key in keys) |
4535 | + self._load_prefixes(prefixes) |
4536 | + result = {} |
4537 | + parent_map = {} |
4538 | + missing_keys = set() |
4539 | + pending_keys = list(keys) |
4540 | + # This assumes that keys will not reference parents in a different |
4541 | + # prefix, which is accurate so far. |
4542 | + while pending_keys: |
4543 | + key = pending_keys.pop() |
4544 | + if key in parent_map: |
4545 | + continue |
4546 | + prefix = key[:-1] |
4547 | + try: |
4548 | + suffix_parents = self._kndx_cache[prefix][0][key[-1]][4] |
4549 | + except KeyError: |
4550 | + missing_keys.add(key) |
4551 | + else: |
4552 | + parent_keys = tuple([prefix + (suffix,) |
4553 | + for suffix in suffix_parents]) |
4554 | + parent_map[key] = parent_keys |
4555 | + pending_keys.extend([p for p in parent_keys |
4556 | + if p not in parent_map]) |
4557 | + return parent_map, missing_keys |
4558 | + |
4559 | def get_parent_map(self, keys): |
4560 | """Get a map of the parents of keys. |
4561 | |
4562 | @@ -2737,9 +2784,20 @@ |
4563 | |
4564 | class _KeyRefs(object): |
4565 | |
4566 | - def __init__(self): |
4567 | + def __init__(self, track_new_keys=False): |
4568 | # dict mapping 'key' to 'set of keys referring to that key' |
4569 | self.refs = {} |
4570 | + if track_new_keys: |
4571 | + # set remembering all new keys |
4572 | + self.new_keys = set() |
4573 | + else: |
4574 | + self.new_keys = None |
4575 | + |
4576 | + def clear(self): |
4577 | + if self.refs: |
4578 | + self.refs.clear() |
4579 | + if self.new_keys: |
4580 | + self.new_keys.clear() |
4581 | |
4582 | def add_references(self, key, refs): |
4583 | # Record the new references |
4584 | @@ -2752,19 +2810,28 @@ |
4585 | # Discard references satisfied by the new key |
4586 | self.add_key(key) |
4587 | |
4588 | + def get_new_keys(self): |
4589 | + return self.new_keys |
4590 | + |
4591 | def get_unsatisfied_refs(self): |
4592 | return self.refs.iterkeys() |
4593 | |
4594 | - def add_key(self, key): |
4595 | + def _satisfy_refs_for_key(self, key): |
4596 | try: |
4597 | del self.refs[key] |
4598 | except KeyError: |
4599 | # No keys depended on this key. That's ok. |
4600 | pass |
4601 | |
4602 | - def add_keys(self, keys): |
4603 | + def add_key(self, key): |
4604 | + # satisfy refs for key, and remember that we've seen this key. |
4605 | + self._satisfy_refs_for_key(key) |
4606 | + if self.new_keys is not None: |
4607 | + self.new_keys.add(key) |
4608 | + |
4609 | + def satisfy_refs_for_keys(self, keys): |
4610 | for key in keys: |
4611 | - self.add_key(key) |
4612 | + self._satisfy_refs_for_key(key) |
4613 | |
4614 | def get_referrers(self): |
4615 | result = set() |
4616 | @@ -2932,7 +2999,7 @@ |
4617 | # If updating this, you should also update |
4618 | # groupcompress._GCGraphIndex.get_missing_parents |
4619 | # We may have false positives, so filter those out. |
4620 | - self._key_dependencies.add_keys( |
4621 | + self._key_dependencies.satisfy_refs_for_keys( |
4622 | self.get_parent_map(self._key_dependencies.get_unsatisfied_refs())) |
4623 | return frozenset(self._key_dependencies.get_unsatisfied_refs()) |
4624 | |
4625 | @@ -3049,6 +3116,10 @@ |
4626 | options.append('no-eol') |
4627 | return options |
4628 | |
4629 | + def find_ancestry(self, keys): |
4630 | + """See CombinedGraphIndex.find_ancestry()""" |
4631 | + return self._graph_index.find_ancestry(keys, 0) |
4632 | + |
4633 | def get_parent_map(self, keys): |
4634 | """Get a map of the parents of keys. |
4635 | |
4636 | |
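get_known_graph_ancestry above cascades through the fallback versioned files: whatever keys the primary index cannot resolve are retried against each fallback in turn, and only the final leftovers stay missing. A standalone sketch of that cascade with dictionaries standing in for indices:

    # Standalone sketch of the fallback cascade in get_known_graph_ancestry.
    def find_in_one(index, keys):
        found, missing, pending = {}, set(), set(keys)
        while pending:
            key = pending.pop()
            if key in index:              # index: dict key -> parent tuple
                found[key] = index[key]
                pending.update(p for p in index[key] if p not in found)
            else:
                missing.add(key)
        return found, missing

    def cascade_find_ancestry(indices, keys):
        parent_map, missing = {}, set(keys)
        for index in indices:             # primary first, then fallbacks
            if not missing:
                break
            found, missing = find_in_one(index, missing)
            parent_map.update(found)
        return parent_map, missing

    primary = {'C': ('B',)}
    fallback = {'B': ('A',), 'A': ()}
    result_map, still_missing = cascade_find_ancestry([primary, fallback], ['C'])
    print sorted(result_map.items())  # [('A', ()), ('B', ('A',)), ('C', ('B',))]
    print still_missing               # set([])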
4637 | === modified file 'bzrlib/lock.py' |
4638 | --- bzrlib/lock.py 2009-07-20 08:47:58 +0000 |
4639 | +++ bzrlib/lock.py 2009-07-31 16:51:48 +0000 |
4640 | @@ -190,6 +190,13 @@ |
4641 | if self.filename in _fcntl_WriteLock._open_locks: |
4642 | self._clear_f() |
4643 | raise errors.LockContention(self.filename) |
4644 | + if self.filename in _fcntl_ReadLock._open_locks: |
4645 | + if 'strict_locks' in debug.debug_flags: |
4646 | + self._clear_f() |
4647 | + raise errors.LockContention(self.filename) |
4648 | + else: |
4649 | + trace.mutter('Write lock taken w/ an open read lock on: %s' |
4650 | + % (self.filename,)) |
4651 | |
4652 | self._open(self.filename, 'rb+') |
4653 | # reserve a slot for this lock - even if the lockf call fails, |
4654 | @@ -220,6 +227,14 @@ |
4655 | def __init__(self, filename): |
4656 | super(_fcntl_ReadLock, self).__init__() |
4657 | self.filename = osutils.realpath(filename) |
4658 | + if self.filename in _fcntl_WriteLock._open_locks: |
4659 | + if 'strict_locks' in debug.debug_flags: |
4660 | + # We raise before calling _open so we don't need to |
4661 | + # _clear_f |
4662 | + raise errors.LockContention(self.filename) |
4663 | + else: |
4664 | + trace.mutter('Read lock taken w/ an open write lock on: %s' |
4665 | + % (self.filename,)) |
4666 | _fcntl_ReadLock._open_locks.setdefault(self.filename, 0) |
4667 | _fcntl_ReadLock._open_locks[self.filename] += 1 |
4668 | self._open(filename, 'rb') |
4669 | @@ -418,15 +433,15 @@ |
4670 | DWORD, # dwFlagsAndAttributes |
4671 | HANDLE # hTemplateFile |
4672 | )((_function_name, ctypes.windll.kernel32)) |
4673 | - |
4674 | + |
4675 | INVALID_HANDLE_VALUE = -1 |
4676 | - |
4677 | + |
4678 | GENERIC_READ = 0x80000000 |
4679 | GENERIC_WRITE = 0x40000000 |
4680 | FILE_SHARE_READ = 1 |
4681 | OPEN_ALWAYS = 4 |
4682 | FILE_ATTRIBUTE_NORMAL = 128 |
4683 | - |
4684 | + |
4685 | ERROR_ACCESS_DENIED = 5 |
4686 | ERROR_SHARING_VIOLATION = 32 |
4687 | |
4688 | |
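Both lock.py hunks gate the new check the same way: an incompatible read/write combination is fatal only when -Dstrict_locks is active, and is otherwise just noted in the trace file. A standalone sketch of that gating pattern (stand-in names, not the bzrlib lock classes):

    # Standalone sketch of the strict_locks gating used in the hunks above.
    debug_flags = set()      # stand-in for bzrlib.debug.debug_flags

    def check_lock_mix(filename, have_conflicting_lock):
        if not have_conflicting_lock:
            return
        if 'strict_locks' in debug_flags:
            raise RuntimeError('LockContention: %s' % filename)
        print 'mutter: non-portable lock mix on %s' % filename

    check_lock_mix('repo/lock', True)       # only logs
    debug_flags.add('strict_locks')
    try:
        check_lock_mix('repo/lock', True)   # now a hard error
    except RuntimeError, e:
        print e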
4689 | === modified file 'bzrlib/lsprof.py' |
4690 | --- bzrlib/lsprof.py 2009-03-08 06:18:06 +0000 |
4691 | +++ bzrlib/lsprof.py 2009-08-24 21:05:09 +0000 |
4692 | @@ -13,45 +13,74 @@ |
4693 | |
4694 | __all__ = ['profile', 'Stats'] |
4695 | |
4696 | -_g_threadmap = {} |
4697 | - |
4698 | - |
4699 | -def _thread_profile(f, *args, **kwds): |
4700 | - # we lose the first profile point for a new thread in order to trampoline |
4701 | - # a new Profile object into place |
4702 | - global _g_threadmap |
4703 | - thr = thread.get_ident() |
4704 | - _g_threadmap[thr] = p = Profiler() |
4705 | - # this overrides our sys.setprofile hook: |
4706 | - p.enable(subcalls=True, builtins=True) |
4707 | - |
4708 | - |
4709 | def profile(f, *args, **kwds): |
4710 | """Run a function profile. |
4711 | |
4712 | Exceptions are not caught: If you need stats even when exceptions are to be |
4713 | - raised, passing in a closure that will catch the exceptions and transform |
4714 | - them appropriately for your driver function. |
4715 | + raised, pass in a closure that will catch the exceptions and transform them |
4716 | + appropriately for your driver function. |
4717 | |
4718 | :return: The functions return value and a stats object. |
4719 | """ |
4720 | - global _g_threadmap |
4721 | - p = Profiler() |
4722 | - p.enable(subcalls=True) |
4723 | - threading.setprofile(_thread_profile) |
4724 | + profiler = BzrProfiler() |
4725 | + profiler.start() |
4726 | try: |
4727 | ret = f(*args, **kwds) |
4728 | finally: |
4729 | - p.disable() |
4730 | - for pp in _g_threadmap.values(): |
4731 | + stats = profiler.stop() |
4732 | + return ret, stats |
4733 | + |
4734 | + |
4735 | +class BzrProfiler(object): |
4736 | + """Bzr utility wrapper around Profiler. |
4737 | + |
4738 | + For most uses the module level 'profile()' function will be suitable. |
4739	| +    However, when a simple wrapped function isn't available, profiling
4740	| +    may be easier to accomplish using this class.
4741 | + |
4742 | + To use it, create a BzrProfiler and call start() on it. Some arbitrary |
4743 | + time later call stop() to stop profiling and retrieve the statistics |
4744 | + from the code executed in the interim. |
4745 | + """ |
4746 | + |
4747 | + def start(self): |
4748 | + """Start profiling. |
4749 | + |
4750 | + This hooks into threading and will record all calls made until |
4751 | + stop() is called. |
4752 | + """ |
4753 | + self._g_threadmap = {} |
4754 | + self.p = Profiler() |
4755 | + self.p.enable(subcalls=True) |
4756 | + threading.setprofile(self._thread_profile) |
4757 | + |
4758 | + def stop(self): |
4759 | + """Stop profiling. |
4760 | + |
4761 | + This unhooks from threading and cleans up the profiler, returning |
4762 | + the gathered Stats object. |
4763 | + |
4764 | + :return: A bzrlib.lsprof.Stats object. |
4765 | + """ |
4766 | + self.p.disable() |
4767 | + for pp in self._g_threadmap.values(): |
4768 | pp.disable() |
4769 | threading.setprofile(None) |
4770 | + p = self.p |
4771 | + self.p = None |
4772 | + threads = {} |
4773 | + for tid, pp in self._g_threadmap.items(): |
4774 | + threads[tid] = Stats(pp.getstats(), {}) |
4775 | + self._g_threadmap = None |
4776 | + return Stats(p.getstats(), threads) |
4777 | |
4778 | - threads = {} |
4779 | - for tid, pp in _g_threadmap.items(): |
4780 | - threads[tid] = Stats(pp.getstats(), {}) |
4781 | - _g_threadmap = {} |
4782 | - return ret, Stats(p.getstats(), threads) |
4783 | + def _thread_profile(self, f, *args, **kwds): |
4784 | + # we lose the first profile point for a new thread in order to |
4785 | + # trampoline a new Profile object into place |
4786 | + thr = thread.get_ident() |
4787 | + self._g_threadmap[thr] = p = Profiler() |
4788 | + # this overrides our sys.setprofile hook: |
4789 | + p.enable(subcalls=True, builtins=True) |
4790 | |
4791 | |
4792 | class Stats(object): |
4793 | |
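The new BzrProfiler docstring spells out the calling pattern; a hedged usage sketch (the Stats.sort call is assumed from the wider bzrlib.lsprof module, which this diff does not show):

    # Sketch of the start()/stop() pattern described above.
    from bzrlib import lsprof

    profiler = lsprof.BzrProfiler()
    profiler.start()
    try:
        total = sum(i * i for i in xrange(100000))  # code being profiled
    finally:
        stats = profiler.stop()    # a bzrlib.lsprof.Stats object
    stats.sort()                   # assumed Stats API; see bzrlib.lsprof

For the common case of profiling a single callable, the module-level profile() shown at the top of the hunk remains the simpler entry point: ret, stats = lsprof.profile(f, *args).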
4794 | === modified file 'bzrlib/mail_client.py' |
4795 | --- bzrlib/mail_client.py 2009-06-10 03:56:49 +0000 |
4796 | +++ bzrlib/mail_client.py 2009-09-02 08:26:27 +0000 |
4797 | @@ -424,6 +424,10 @@ |
4798 | |
4799 | _client_commands = ['emacsclient'] |
4800 | |
4801 | + def __init__(self, config): |
4802 | + super(EmacsMail, self).__init__(config) |
4803 | + self.elisp_tmp_file = None |
4804 | + |
4805 | def _prepare_send_function(self): |
4806 | """Write our wrapper function into a temporary file. |
4807 | |
4808 | @@ -500,6 +504,7 @@ |
4809 | if attach_path is not None: |
4810 | # Do not create a file if there is no attachment |
4811 | elisp = self._prepare_send_function() |
4812 | + self.elisp_tmp_file = elisp |
4813 | lmmform = '(load "%s")' % elisp |
4814 | mmform = '(bzr-add-mime-att "%s")' % \ |
4815 | self._encode_path(attach_path, 'attachment') |
4816 | |
4817 | === modified file 'bzrlib/merge.py' |
4818 | --- bzrlib/merge.py 2009-07-02 13:07:14 +0000 |
4819 | +++ bzrlib/merge.py 2009-09-11 13:32:55 +0000 |
4820 | @@ -64,8 +64,12 @@ |
4821 | |
4822 | |
4823 | def transform_tree(from_tree, to_tree, interesting_ids=None): |
4824 | - merge_inner(from_tree.branch, to_tree, from_tree, ignore_zero=True, |
4825 | - interesting_ids=interesting_ids, this_tree=from_tree) |
4826 | + from_tree.lock_tree_write() |
4827 | + try: |
4828 | + merge_inner(from_tree.branch, to_tree, from_tree, ignore_zero=True, |
4829 | + interesting_ids=interesting_ids, this_tree=from_tree) |
4830 | + finally: |
4831 | + from_tree.unlock() |
4832 | |
4833 | |
4834 | class Merger(object): |
4835 | @@ -102,6 +106,17 @@ |
4836 | self._is_criss_cross = None |
4837 | self._lca_trees = None |
4838 | |
4839 | + def cache_trees_with_revision_ids(self, trees): |
4840 | + """Cache any tree in trees if it has a revision_id.""" |
4841 | + for maybe_tree in trees: |
4842 | + if maybe_tree is None: |
4843 | + continue |
4844 | + try: |
4845 | + rev_id = maybe_tree.get_revision_id() |
4846 | + except AttributeError: |
4847 | + continue |
4848 | + self._cached_trees[rev_id] = maybe_tree |
4849 | + |
4850 | @property |
4851 | def revision_graph(self): |
4852 | if self._revision_graph is None: |
4853 | @@ -598,19 +613,21 @@ |
4854 | self.this_tree.lock_tree_write() |
4855 | self.base_tree.lock_read() |
4856 | self.other_tree.lock_read() |
4857 | - self.tt = TreeTransform(self.this_tree, self.pb) |
4858 | try: |
4859 | - self.pp.next_phase() |
4860 | - self._compute_transform() |
4861 | - self.pp.next_phase() |
4862 | - results = self.tt.apply(no_conflicts=True) |
4863 | - self.write_modified(results) |
4864 | + self.tt = TreeTransform(self.this_tree, self.pb) |
4865 | try: |
4866 | - self.this_tree.add_conflicts(self.cooked_conflicts) |
4867 | - except UnsupportedOperation: |
4868 | - pass |
4869 | + self.pp.next_phase() |
4870 | + self._compute_transform() |
4871 | + self.pp.next_phase() |
4872 | + results = self.tt.apply(no_conflicts=True) |
4873 | + self.write_modified(results) |
4874 | + try: |
4875 | + self.this_tree.add_conflicts(self.cooked_conflicts) |
4876 | + except UnsupportedOperation: |
4877 | + pass |
4878 | + finally: |
4879 | + self.tt.finalize() |
4880 | finally: |
4881 | - self.tt.finalize() |
4882 | self.other_tree.unlock() |
4883 | self.base_tree.unlock() |
4884 | self.this_tree.unlock() |
4885 | @@ -1516,6 +1533,7 @@ |
4886 | get_revision_id = getattr(base_tree, 'get_revision_id', None) |
4887 | if get_revision_id is None: |
4888 | get_revision_id = base_tree.last_revision |
4889 | + merger.cache_trees_with_revision_ids([other_tree, base_tree, this_tree]) |
4890 | merger.set_base_revision(get_revision_id(), this_branch) |
4891 | return merger.do_merge() |
4892 | |
4893 | |
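cache_trees_with_revision_ids relies on duck typing: only trees that expose get_revision_id (revision trees) are cached, while working trees and None placeholders are silently skipped. A standalone illustration of that filter:

    # Standalone illustration of the duck-typed caching in the hunk above.
    class RevisionTreeStandIn(object):
        def __init__(self, rev_id):
            self._rev_id = rev_id
        def get_revision_id(self):
            return self._rev_id

    class WorkingTreeStandIn(object):
        pass                      # no get_revision_id -> never cached

    cached_trees = {}
    for tree in [RevisionTreeStandIn('rev-1'), WorkingTreeStandIn(), None]:
        if tree is None:
            continue
        try:
            rev_id = tree.get_revision_id()
        except AttributeError:
            continue
        cached_trees[rev_id] = tree
    print cached_trees.keys()     # ['rev-1']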
4894 | === modified file 'bzrlib/missing.py' |
4895 | --- bzrlib/missing.py 2009-03-23 14:59:43 +0000 |
4896 | +++ bzrlib/missing.py 2009-08-17 18:52:01 +0000 |
4897 | @@ -138,31 +138,13 @@ |
4898 | if not ancestry: #Empty ancestry, no need to do any work |
4899 | return [] |
4900 | |
4901 | - mainline_revs, rev_nos, start_rev_id, end_rev_id = log._get_mainline_revs( |
4902 | - branch, None, tip_revno) |
4903 | - if not mainline_revs: |
4904 | - return [] |
4905 | - |
4906 | - # This asks for all mainline revisions, which is size-of-history and |
4907 | - # should be addressed (but currently the only way to get correct |
4908 | - # revnos). |
4909 | - |
4910 | - # mainline_revisions always includes an extra revision at the |
4911 | - # beginning, so don't request it. |
4912 | - parent_map = dict(((key, value) for key, value |
4913 | - in graph.iter_ancestry(mainline_revs[1:]) |
4914 | - if value is not None)) |
4915 | - # filter out ghosts; merge_sort errors on ghosts. |
4916 | - # XXX: is this needed here ? -- vila080910 |
4917 | - rev_graph = _mod_repository._strip_NULL_ghosts(parent_map) |
4918 | - # XXX: what if rev_graph is empty now ? -- vila080910 |
4919 | - merge_sorted_revisions = tsort.merge_sort(rev_graph, tip, |
4920 | - mainline_revs, |
4921 | - generate_revno=True) |
4922 | + merge_sorted_revisions = branch.iter_merge_sorted_revisions() |
4923 | # Now that we got the correct revnos, keep only the relevant |
4924 | # revisions. |
4925 | merge_sorted_revisions = [ |
4926 | - (s, revid, n, d, e) for s, revid, n, d, e in merge_sorted_revisions |
4927 | + # log.reverse_by_depth expects seq_num to be present, but it is |
4928 | + # stripped by iter_merge_sorted_revisions() |
4929 | + (0, revid, n, d, e) for revid, n, d, e in merge_sorted_revisions |
4930 | if revid in ancestry] |
4931 | if not backward: |
4932 | merge_sorted_revisions = log.reverse_by_depth(merge_sorted_revisions) |
4933 | |
4934 | === modified file 'bzrlib/mutabletree.py' |
4935 | --- bzrlib/mutabletree.py 2009-07-10 08:33:11 +0000 |
4936 | +++ bzrlib/mutabletree.py 2009-09-07 23:14:05 +0000 |
4937 | @@ -226,6 +226,9 @@ |
4938 | revprops=revprops, |
4939 | possible_master_transports=possible_master_transports, |
4940 | *args, **kwargs) |
4941 | + post_hook_params = PostCommitHookParams(self) |
4942 | + for hook in MutableTree.hooks['post_commit']: |
4943 | + hook(post_hook_params) |
4944 | return committed_id |
4945 | |
4946 | def _gather_kinds(self, files, kinds): |
4947 | @@ -581,14 +584,32 @@ |
4948 | self.create_hook(hooks.HookPoint('start_commit', |
4949 | "Called before a commit is performed on a tree. The start commit " |
4950 | "hook is able to change the tree before the commit takes place. " |
4951 | - "start_commit is called with the bzrlib.tree.MutableTree that the " |
4952 | - "commit is being performed on.", (1, 4), None)) |
4953 | + "start_commit is called with the bzrlib.mutabletree.MutableTree " |
4954 | + "that the commit is being performed on.", (1, 4), None)) |
4955 | + self.create_hook(hooks.HookPoint('post_commit', |
4956 | + "Called after a commit is performed on a tree. The hook is " |
4957 | + "called with a bzrlib.mutabletree.PostCommitHookParams object. " |
4958 | + "The mutable tree the commit was performed on is available via " |
4959 | + "the mutable_tree attribute of that object.", (2, 0), None)) |
4960 | |
4961 | |
4962 | # install the default hooks into the MutableTree class. |
4963 | MutableTree.hooks = MutableTreeHooks() |
4964 | |
4965 | |
4966 | +class PostCommitHookParams(object): |
4967 | + """Parameters for the post_commit hook. |
4968 | + |
4969 | + To access the parameters, use the following attributes: |
4970 | + |
4971 | + * mutable_tree - the MutableTree object |
4972 | + """ |
4973 | + |
4974 | + def __init__(self, mutable_tree): |
4975 | + """Create the parameters for the post_commit hook.""" |
4976 | + self.mutable_tree = mutable_tree |
4977 | + |
4978 | + |
4979 | class _FastPath(object): |
4980 | """A path object with fast accessors for things like basename.""" |
4981 | |
4982 | |
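The new post_commit hook point is registered like any other bzrlib hook; a hedged sketch (install_named_hook is the standard registration call, and the hook body treats the tree as a working tree with a basedir, which is an assumption):

    # Sketch: registering a post_commit hook on MutableTree.
    from bzrlib import mutabletree

    def report_commit(params):
        # params is a PostCommitHookParams; the tree is params.mutable_tree
        print 'post_commit fired for %s' % params.mutable_tree.basedir

    mutabletree.MutableTree.hooks.install_named_hook(
        'post_commit', report_commit, 'post_commit demo')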
4983 | === modified file 'bzrlib/plugin.py' |
4984 | --- bzrlib/plugin.py 2009-03-24 01:53:42 +0000 |
4985 | +++ bzrlib/plugin.py 2009-09-04 15:36:48 +0000 |
4986 | @@ -52,12 +52,16 @@ |
4987 | from bzrlib import plugins as _mod_plugins |
4988 | """) |
4989 | |
4990 | -from bzrlib.symbol_versioning import deprecated_function |
4991 | +from bzrlib.symbol_versioning import ( |
4992 | + deprecated_function, |
4993 | + deprecated_in, |
4994 | + ) |
4995 | |
4996 | |
4997 | DEFAULT_PLUGIN_PATH = None |
4998 | _loaded = False |
4999 | |
5000 | +@deprecated_function(deprecated_in((2, 0, 0))) |
The diff has been truncated for viewing.
I've specially prepared a cherry-picked patch for the 2.0 branch: https://code.launchpad.net/~garyvdm/bzr/427773-2.0-unlock-when-limbo/+merge/11631