Merge lp:~garyvdm/bzr/Bug427773 into lp:bzr/2.0

Proposed by Alexander Belchenko
Status: Rejected
Rejected by: Gary van der Merwe
Proposed branch: lp:~garyvdm/bzr/Bug427773
Merge into: lp:bzr/2.0
Diff against target: None lines
To merge this branch: bzr merge lp:~garyvdm/bzr/Bug427773
Reviewer Review Type Date Requested Status
bzr-core Pending
Review via email: mp+11645@code.launchpad.net
Revision history for this message
Gary van der Merwe (garyvdm) wrote :

I've prepared a patch cherry-picked specifically for the 2.0 branch: https://code.launchpad.net/~garyvdm/bzr/427773-2.0-unlock-when-limbo/+merge/11631
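
For context: the fix being cherry-picked here (bug #427773, see its NEWS entry in the diff below) makes merge release the working tree lock when creating a TreeTransform fails, for example because a stale limbo or pending-deletions directory was left behind by an earlier interrupted operation. Below is a minimal, hypothetical Python sketch of that lock-cleanup pattern; the helper names are illustrative and this is not bzrlib's actual merge code::

    def merge_with_cleanup(tree, make_transform):
        # Illustrative only: assumes the usual WorkingTree lock_tree_write()
        # / unlock() pairing and a TreeTransform-like object with finalize().
        tree.lock_tree_write()
        try:
            # Constructing the transform can raise if a limbo/ or
            # pending-deletions/ directory already exists.
            transform = make_transform(tree)
        except Exception:
            # Bug #427773: release the lock instead of leaving the tree
            # locked when the transform could not be created.
            tree.unlock()
            raise
        try:
            pass  # ... compute and apply the merge via `transform` ...
        finally:
            transform.finalize()
            tree.unlock()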

Preview Diff

1=== modified file 'Makefile'
2--- Makefile 2009-08-03 20:38:39 +0000
3+++ Makefile 2009-08-27 00:53:27 +0000
4@@ -1,4 +1,4 @@
5-# Copyright (C) 2005, 2006, 2007, 2008 Canonical Ltd
6+# Copyright (C) 2005, 2006, 2007, 2008, 2009 Canonical Ltd
7 #
8 # This program is free software; you can redistribute it and/or modify
9 # it under the terms of the GNU General Public License as published by
10@@ -40,8 +40,6 @@
11
12 check-nodocs: extensions
13 $(PYTHON) -Werror -O ./bzr selftest -1v $(tests)
14- @echo "Running all tests with no locale."
15- LC_CTYPE= LANG=C LC_ALL= ./bzr selftest -1v $(tests) 2>&1 | sed -e 's/^/[ascii] /'
16
17 # Run Python style checker (apt-get install pyflakes)
18 #
19
20=== modified file 'NEWS'
21--- NEWS 2009-08-25 05:32:07 +0000
22+++ NEWS 2009-09-11 14:23:16 +0000
23@@ -6,6 +6,309 @@
24 .. contents:: List of Releases
25 :depth: 1
26
27+In Development
28+##############
29+
30+Compatibility Breaks
31+********************
32+
33+New Features
34+************
35+
36+* Give more control over BZR_PLUGIN_PATH by providing a way to refer to or
37+ disable the user, site and core plugin directories.
38+ (Vincent Ladeuil, #412930, #316192, #145612)
39+
40+Bug Fixes
41+*********
42+
43+* Bazaar's native protocol code now correctly handles EINTR, which most
44+ noticeably occurs if you break into the debugger while connected to a
45+ bzr+ssh server. You can now continue from the debugger (by typing
46+ 'c') and the process continues. However, note that pressing C-\ in the
47+ shell may still kill the SSH process, which is bug 162509, so you must
48+ send a signal to the bzr process specifically, for example by typing
49+ ``kill -QUIT PID`` in another shell. (Martin Pool, #341535)
50+
51+* ``bzr check`` in pack-0.92, 1.6 and 1.9 format repositories will no
52+ longer report incorrect errors about ``Missing inventory ('TREE_ROOT', ...)``
53+ (Robert Collins, #416732)
54+
55+* ``bzr info -v`` on a 2a format still claimed that it was a "Development
56+ format" (John Arbash Meinel, #424392)
57+
58+* ``bzr merge`` and ``bzr remove-tree`` now require --force if pending
59+ merges are present in the working tree.
60+ (Vincent Ladeuil, #426344)
61+
62+* Clearer message when Bazaar runs out of memory, instead of a ``MemoryError``
63+ traceback. (Martin Pool, #109115)
64+
65+* Conversion to 2a will create a single pack for all the new revisions (as
66+ long as it ran without interruption). This improves both ``bzr upgrade``
67+ and ``bzr pull`` or ``bzr merge`` from local branches in older formats.
68+ The autopack logic that occurs every 100 revisions during local
69+ conversions was not returning that pack's identifier, which resulted in
70+ the partial packs created during the conversion not being consolidated
71+ at the end of the conversion process. (Robert Collins, #423818)
72+
73+* Don't restrict the command name used to run the test suite.
74+ (Vincent Ladeuil, #419950)
75+
76+* Fetches from 2a to 2a are now again requested in 'groupcompress' order.
77+ Groups that are seen as 'underutilized' will be repacked on-the-fly.
78+ This means that when the source is fully packed, there is minimal
79+ overhead during the fetch, but if the source is poorly packed the result
80+ is a fairly well packed repository (not as good as 'bzr pack' but
81+ good-enough.) (Robert Collins, John Arbash Meinel, #402652)
82+
83+* Network streams now decode adjacent records of the same type into a
84+ single stream, reducing layering churn. (Robert Collins)
85+
86+* Prevent some kinds of incomplete data from being committed to a 2a
87+ repository, such as revisions without inventories or inventories without
88+ chk_bytes root records.
89+ (Andrew Bennetts, #423506)
90+
91+* Make sure that we unlock the tree if we fail to create a TreeTransform
92+ object when doing a merge, and there is a limbo or pending-deletions
93+ directory. (Gary van der Merwe, #427773)
94+
95+Improvements
96+************
97+
98+Documentation
99+*************
100+
101+* Help on hooks no longer says 'Not deprecated' for hooks that are
102+ currently supported. (Ian Clatworthy, #422415)
103+
104+API Changes
105+***********
106+
107+* ``bzrlib.tests`` now uses ``stopTestRun`` for its ``TestResult``
108+ subclasses - the same as python's unittest module. (Robert Collins)
109+
110+Internals
111+*********
112+
113+* ``BTreeLeafParser.extract_key`` has been tweaked slightly to reduce
114+ mallocs while parsing the index (approx 3=>1 mallocs per key read).
115+ This results in a 10% speedup while reading an index.
116+ (John Arbash Meinel)
117+
118+* The ``bzrlib.lsprof`` module has a new class ``BzrProfiler`` which makes
119+ profiling in some situations like callbacks and generators easier.
120+ (Robert Collins)
121+
122+Testing
123+*******
124+
125+* Passing ``--lsprof-tests -v`` to bzr selftest will cause lsprof output to
126+ be output for every test. Note that this is very verbose! (Robert Collins)
127+
128+* Test parameterisation now does a shallow copy, not a deep copy of the test
129+ to be parameterised. This is not expected to break external use of test
130+ parameterisation, and is substantially faster. (Robert Collins)
131+
132+
133+bzr 2.0rc2 (not released yet)
134+#############################
135+
136+New Features
137+************
138+
139+* Added post_commit hook for mutable trees. This allows the keywords
140+ plugin to expand keywords on files changed by the commit.
141+ (Ian Clatworthy, #408841)
142+
143+Bug Fixes
144+*********
145+
146+* Bazaar's native protocol code now correctly handles EINTR, which most
147+ noticeably occurs if you break into the debugger while connected to a
148+ bzr+ssh server. You can now continue from the debugger (by typing
149+ 'c') and the process continues. However, note that pressing C-\ in the
150+ shell may still kill the SSH process, which is bug 162509, so you must
151+ send a signal to the bzr process specifically, for example by typing
152+ ``kill -QUIT PID`` in another shell. (Martin Pool, #341535)
153+
154+* ``bzr check`` in pack-0.92, 1.6 and 1.9 format repositories will no
155+ longer report incorrect errors about ``Missing inventory ('TREE_ROOT', ...)``
156+ (Robert Collins, #416732)
157+
158+* ``bzr info -v`` on a 2a format still claimed that it was a "Development
159+ format" (John Arbash Meinel, #424392)
160+
161+* ``bzr log stacked-branch`` shows the full log including
162+ revisions that are in the fallback repository. (Regressed in 2.0rc1).
163+ (John Arbash Meinel, #419241)
164+
165+* Clearer message when Bazaar runs out of memory, instead of a ``MemoryError``
166+ traceback. (Martin Pool, #109115)
167+
168+* Conversion to 2a will create a single pack for all the new revisions (as
169+ long as it ran without interruption). This improves both ``bzr upgrade``
170+ and ``bzr pull`` or ``bzr merge`` from local branches in older formats.
171+ The autopack logic that occurs every 100 revisions during local
172+ conversions was not returning that pack's identifier, which resulted in
173+ the partial packs created during the conversion not being consolidated
174+ at the end of the conversion process. (Robert Collins, #423818)
175+
176+* Fetches from 2a to 2a are now again requested in 'groupcompress' order.
177+ Groups that are seen as 'underutilized' will be repacked on-the-fly.
178+ This means that when the source is fully packed, there is minimal
179+ overhead during the fetch, but if the source is poorly packed the result
180+ is a fairly well packed repository (not as good as 'bzr pack' but
181+ good-enough.) (Robert Collins, John Arbash Meinel, #402652)
182+
183+* Fix a potential segmentation fault when doing 'log' of a branch that had
184+ ghosts in its mainline. (Evaluating None as a tuple is bad.)
185+ (John Arbash Meinel, #419241)
186+
187+* ``groupcompress`` sort order is now more stable, rather than relying on
188+ ``topo_sort`` ordering. The implementation is now
189+ ``KnownGraph.gc_sort``. (John Arbash Meinel)
190+
191+* Local data conversion will generate correct deltas. This is a critical
192+ bugfix vs 2.0rc1, and all 2.0rc1 users should upgrade to 2.0rc2 before
193+ converting repositories. (Robert Collins, #422849)
194+
195+* Network streams now decode adjacent records of the same type into a
196+ single stream, reducing layering churn. (Robert Collins)
197+
198+* Prevent some kinds of incomplete data from being committed to a 2a
199+ repository, such as revisions without inventories, a missing chk_bytes
200+ record for an inventory, or a missing text referenced by an inventory.
201+ (Andrew Bennetts, #423506, #406687)
202+
203+Documentation
204+*************
205+
206+* Fix assertion error about "_remember_remote_is_before" when pushing to
207+ older smart servers.
208+ (Andrew Bennetts, #418931)
209+
210+* Help on hooks no longer says 'Not deprecated' for hooks that are
211+ currently supported. (Ian Clatworthy, #422415)
212+
213+* The main table of contents now provides links to the new Migration Docs
214+ and Plugins Guide. (Ian Clatworthy)
215+
216+
217+bzr 2.0rc1
218+##########
219+
220+
221+:Codename: no worries
222+:2.0rc1: 2009-08-26
223+
224+This release of Bazaar makes 2a 'brisbane-core' format the
225+default. Most of the work in this release now focuses on bug
226+fixes and stabilization, covering both 2a and previous formats.
227+
228+The Bazaar team decided that 2.0 will be a long-term supported
229+release, with bugfix-only releases based on it continuing for at
230+least six months or until the following stable release (we said
231+that previously, but that's worth repeating).
232+
233+Compatibility Breaks
234+********************
235+
236+* The default format for bzr is now ``2a``. This format brings many
237+ significant performance and size improvements. bzr can pull from
238+ any existing repository into a ``2a`` one, but can only transfer
239+ into ``rich-root`` repositories from ``2a``. The Upgrade guide
240+ has more information about this change. (Robert Collins)
241+
242+* On Windows, auto-detection of Putty's plink.exe is disabled. The default
243+ SSH client for Windows is paramiko. Users can still force the use of plink
244+ by explicitly setting the environment variable BZR_SSH=plink.
245+ (#414743, Alexander Belchenko)
246+
247+New Features
248+************
249+
250+* ``bzr branch --switch`` can now switch the checkout in the current directory
251+ to the newly created branch. (Lukáš Lalinský)
252+
253+Bug Fixes
254+*********
255+
256+* Further tweaks to handling of ``bzr add`` messages about ignored files.
257+ (Jason Spashett, #76616)
258+
259+* Fetches were being requested in 'groupcompress' order, but weren't
260+ recombining the groups. Thus they would 'fragment' to get the correct
261+ order, but not 'recombine' to actually benefit from it. Until we get
262+ recombining to work, switching to 'unordered' fetches avoids the
263+ fragmentation. (John Arbash Meinel, #402645)
264+
265+* Fix a pycurl related test failure on karmic by recognizing an error
266+ raised by newer versions of pycurl.
267+ (Vincent Ladeuil, #306264)
268+
269+* Fix a test failure on karmic by making a locale test more robust.
270+ (Vincent Ladeuil, #413514)
271+
272+* Fix IndexError printing CannotBindAddress errors.
273+ (Martin Pool, #286871)
274+
275+* Fix "Revision ... not present" errors when upgrading stacked branches,
276+ or when doing fetches from a stacked source to a stacked target.
277+ (Andrew Bennetts, #399140)
278+
279+* ``bzr branch`` of 2a repositories over HTTP is much faster. bzr now
280+ batches together small fetches from 2a repositories, rather than
281+ fetching only a few hundred bytes at a time.
282+ (Andrew Bennetts, #402657)
283+
284+Improvements
285+************
286+
287+* A better description of the platform is shown in crash tracebacks, ``bzr
288+ --version`` and ``bzr selftest``.
289+ (Martin Pool, #409137)
290+
291+* bzr can now (again) capture crash data through the apport library,
292+ so that a single human-readable file can be attached to bug reports.
293+ This can be disabled by using ``-Dno_apport`` on the command line, or by
294+ putting ``no_apport`` into the ``debug_flags`` section of
295+ ``bazaar.conf``.
296+ (Martin Pool, Robert Collins, #389328)
297+
298+* ``bzr push`` locally on windows will no longer give a locking error with
299+ dirstate based formats. (Robert Collins)
300+
301+* ``bzr shelve`` and ``bzr unshelve`` now work on windows.
302+ (Robert Collins, #305006)
303+
304+* Commit of specific files no longer prevents using the iter_changes
305+ codepath. On 2a repositories, commit of specific files should now be as
306+ fast, or slightly faster, than a full commit. (Robert Collins)
307+
308+* The internal core code that handles specific file operations like
309+ ``bzr st FILENAME`` or ``bzr commit FILENAME`` has been changed to
310+ include the parent directories if they have altered, and when a
311+ directory stops being a directory its children are always included. This
312+ fixes a number of causes for ``InconsistentDelta`` errors, and permits
313+ faster commit of specific paths. (Robert Collins, #347649)
314+
315+Documentation
316+*************
317+
318+* New developer documentation for content filtering.
319+ (Martin Pool)
320+
321+API Changes
322+***********
323+
324+* ``bzrlib.shelf_ui`` has had the ``from_args`` convenience methods of its
325+ classes changed to manage lock lifetime of the trees they open in a way
326+ consistent with reader-exclusive locks. (Robert Collins, #305006)
327+
328+Testing
329+*******
330
331 bzr 1.18.1 NOT RELEASED YET
332 ###########################
333@@ -17,17 +320,144 @@
334 conversion will commit too many copies of a file.
335 (Martin Pool, #415508)
336
337+Improvements
338+************
339+
340+* ``bzr push`` locally on windows will no longer give a locking error with
341+ dirstate based formats. (Robert Collins)
342+
343+* ``bzr shelve`` and ``bzr unshelve`` now work on windows.
344+ (Robert Collins, #305006)
345+
346 API Changes
347 ***********
348
349+* ``bzrlib.shelf_ui`` has had the ``from_args`` convenience methods of its
350+ classes changed to manage lock lifetime of the trees they open in a way
351+ consistent with reader-exclusive locks. (Robert Collins, #305006)
352+
353 * ``Tree.path_content_summary`` may return a size of None, when called on
354 a tree with content filtering where the size of the canonical form
355 cannot be cheaply determined. (Martin Pool)
356
357+* When manually creating transport servers in test cases, a new helper
358+ ``TestCase.start_server`` that registers a cleanup and starts the server
359+ should be used. (Robert Collins)
360
361 bzr 1.18
362 ########
363
364+Compatibility Breaks
365+********************
366+
367+* Committing directly to a stacked branch from a lightweight checkout will
368+ no longer work. In previous versions this would appear to work but would
369+ generate repositories with insufficient data to create deltas, leading
370+ to later errors when branching or reading from the repository.
371+ (Robert Collins, bug #375013)
372+
373+New Features
374+************
375+
376+Bug Fixes
377+*********
378+
379+* Fetching from 2a branches over a version-2 bzr protocol would fail to
380+ copy the internal inventory pages from the CHK store. This cannot happen
381+ in normal use as all 2a compatible clients and servers support the
382+ version-3 protocol, but it does cause test suite failures when testing
383+ downlevel protocol behaviour. (Robert Collins)
384+
385+* Fix a test failure on karmic by making a locale test more robust.
386+ (Vincent Ladeuil, #413514)
387+
388+* Fixed "Pack ... already exists" error when running ``bzr pack`` on a
389+ fully packed 2a repository. (Andrew Bennetts, #382463)
390+
391+* Further tweaks to handling of ``bzr add`` messages about ignored files.
392+ (Jason Spashett, #76616)
393+
394+* Properly handle fetching into a stacked branch while converting the
395+ data, especially when there are also ghosts. The code was filling in
396+ parent inventories incorrectly, and also not handling when one of the
397+ parents was a ghost. (John Arbash Meinel, #402778, #412198)
398+
399+* ``RemoteStreamSource.get_stream_for_missing_keys`` will fetch CHK
400+ inventory pages when appropriate (by falling back to the vfs stream
401+ source). (Andrew Bennetts, #406686)
402+
403+* StreamSource generates rich roots from non-rich root sources correctly
404+ now. (Andrew Bennetts, #368921)
405+
406+* When deciding whether a repository was compatible for upgrading or
407+ fetching, we previously incorrectly checked the default repository
408+ format for the bzrdir format, rather than the format that was actually
409+ present on disk. (Martin Pool, #408824)
410+
411+Improvements
412+************
413+
414+* A better description of the platform is shown in crash tracebacks, ``bzr
415+ --version`` and ``bzr selftest``.
416+ (Martin Pool, #409137)
417+
418+* Cross-format fetches (such as between 1.9-rich-root and 2a) via the
419+ smart server are more efficient now. They send inventory deltas rather
420+ than full inventories. The smart server has two new requests,
421+ ``Repository.get_stream_1.19`` and ``Repository.insert_stream_1.19`` to
422+ support this. (Andrew Bennetts, #374738, #385826)
423+
424+* Extracting the full ancestry and computing the ``merge_sort`` is now
425+ significantly faster. This affects things like ``bzr log -n0``. (For
426+ example, ``bzr log -r -10..-1 -n0 bzr.dev`` is 2.5s down to 1.0s.)
427+ (John Arbash Meinel)
428+
429+Documentation
430+*************
431+
432+API Changes
433+***********
434+
435+Internals
436+*********
437+
438+* ``-Dstrict_locks`` can now be used to check that read and write locks
439+ are treated properly w.r.t. exclusivity. (We don't try to take an OS
440+ read lock on a file that we already have an OS write lock on.) This is
441+ now set by default for all tests. If you have a test which cannot be
442+ fixed, you can use ``self.thisFailsStrictLockCheck()`` as a
443+ compatibility knob. (John Arbash Meinel)
444+
445+* InterDifferingSerializer is now only used locally. Other fetches that
446+ would have used InterDifferingSerializer now use the more network
447+ friendly StreamSource, which now automatically does the same
448+ transformations as InterDifferingSerializer. (Andrew Bennetts)
449+
450+* ``KnownGraph`` now has a ``.topo_sort`` and ``.merge_sort`` member which
451+ are implemented in pyrex and significantly faster. This is exposed along
452+ with ``CombinedGraphIndex.find_ancestry()`` as
453+ ``VersionedFiles.get_known_graph_ancestry(keys)``.
454+ (John Arbash Meinel)
455+
456+* RemoteBranch.open now honours ignore_fallbacks correctly on bzr-v2
457+ protocols. (Robert Collins)
458+
459+* The index code now has some specialized routines to extract the full
460+ ancestry of a key in a more efficient manner.
461+ ``CombinedGraphIndex.find_ancestry()``. (Time to get ancestry for
462+ bzr.dev drops from 1.5s down to 300ms. For OOo from 33s => 10.5s) (John
463+ Arbash Meinel)
464+
465+Testing
466+*******
467+
468+* Install the test ssl certificate and key so that installed bzr
469+ can run the https tests. (Denys Duchier, #392401)
470+
471+
472+bzr 1.18rc1
473+###########
474+
475 :Codename: little traveller
476 :1.18: 2009-08-20
477 :1.18rc1: 2009-08-10
478@@ -113,6 +543,9 @@
479 running send and similar commands on 2a formats.
480 (Martin Pool, #408201)
481
482+* Fix crash in some invocations of ``bzr status`` in format 2a.
483+ (Martin Pool, #403523)
484+
485 * Fixed export to existing directory: if directory is empty then export
486 will succeed, otherwise it fails with error.
487 (Alexander Belchenko, #406174)
488@@ -133,7 +566,9 @@
489 * Requests for unknown methods no longer cause the smart server to log
490 lots of backtraces about ``UnknownSmartMethod``, ``do_chunk`` or
491 ``do_end``. (Andrew Bennetts, #338561)
492-
493+
494+* Shelve will not shelve the initial add of the tree root. (Aaron Bentley)
495+
496 * Streaming from bzr servers where there is a chain of stacked branches
497 (A stacked on B stacked on C) will now work. (Robert Collins, #406597)
498
499@@ -245,6 +680,17 @@
500 ``countTestsCases``. (Robert Collins)
501
502
503+bzr 1.17.1 (unreleased)
504+#######################
505+
506+Bug Fixes
507+*********
508+
509+* The optional ``_knit_load_data_pyx`` C extension was never being
510+ imported. This caused significant slowdowns when reading data from
511+ knit format repositories. (Andrew Bennetts, #405653)
512+
513+
514 bzr 1.17 "So late it's brunch" 2009-07-20
515 #########################################
516 :Codename: so-late-its-brunch
517@@ -743,6 +1189,9 @@
518 Testing
519 *******
520
521+* ``make check`` no longer repeats the test run in ``LANG=C``.
522+ (Martin Pool, #386180)
523+
524 * The number of cores is now correctly detected on OSX. (John Szakmeister)
525
526 * The number of cores is also detected on Solaris and win32. (Vincent Ladeuil)
527@@ -4723,7 +5172,7 @@
528 checkouts. (Aaron Bentley, #182040)
529
530 * Stop polluting /tmp when running selftest.
531- (Vincent Ladeuil, #123623)
532+ (Vincent Ladeuil, #123363)
533
534 * Switch from NFKC => NFC for normalization checks. NFC allows a few
535 more characters which should be considered valid.
536
537=== modified file 'bzr'
538--- bzr 2009-07-30 23:54:26 +0000
539+++ bzr 2009-08-28 05:11:10 +0000
540@@ -23,7 +23,7 @@
541 import warnings
542
543 # update this on each release
544-_script_version = (1, 18, 0)
545+_script_version = (2, 1, 0)
546
547 if __doc__ is None:
548 print "bzr does not support python -OO."
549
550=== modified file 'bzrlib/__init__.py'
551--- bzrlib/__init__.py 2009-08-20 08:38:09 +0000
552+++ bzrlib/__init__.py 2009-08-30 21:34:42 +0000
553@@ -50,7 +50,7 @@
554 # Python version 2.0 is (2, 0, 0, 'final', 0)." Additionally we use a
555 # releaselevel of 'dev' for unreleased under-development code.
556
557-version_info = (1, 18, 0, 'final', 0)
558+version_info = (2, 1, 0, 'dev', 0)
559
560 # API compatibility version: bzrlib is currently API compatible with 1.15.
561 api_minimum_version = (1, 17, 0)
562@@ -70,6 +70,8 @@
563 1.2dev
564 >>> print _format_version_tuple((1, 1, 1, 'candidate', 2))
565 1.1.1rc2
566+ >>> print bzrlib._format_version_tuple((2, 1, 0, 'beta', 1))
567+ 2.1b1
568 >>> print _format_version_tuple((1, 4, 0))
569 1.4
570 >>> print _format_version_tuple((1, 4))
571
572=== modified file 'bzrlib/_btree_serializer_pyx.pyx'
573--- bzrlib/_btree_serializer_pyx.pyx 2009-06-22 12:52:39 +0000
574+++ bzrlib/_btree_serializer_pyx.pyx 2009-09-04 21:16:14 +0000
575@@ -1,4 +1,4 @@
576-# Copyright (C) 2008 Canonical Ltd
577+# Copyright (C) 2008, 2009 Canonical Ltd
578 #
579 # This program is free software; you can redistribute it and/or modify
580 # it under the terms of the GNU General Public License as published by
581@@ -41,8 +41,11 @@
582 int PyString_AsStringAndSize_ptr(PyObject *, char **buf, Py_ssize_t *len)
583 void PyString_InternInPlace(PyObject **)
584 int PyTuple_CheckExact(object t)
585+ object PyTuple_New(Py_ssize_t n_entries)
586+ void PyTuple_SET_ITEM(object, Py_ssize_t offset, object) # steals the ref
587 Py_ssize_t PyTuple_GET_SIZE(object t)
588 PyObject *PyTuple_GET_ITEM_ptr_object "PyTuple_GET_ITEM" (object tpl, int index)
589+ void Py_INCREF(object)
590 void Py_DECREF_ptr "Py_DECREF" (PyObject *)
591
592 cdef extern from "string.h":
593@@ -140,14 +143,12 @@
594 cdef char *temp_ptr
595 cdef int loop_counter
596 # keys are tuples
597- loop_counter = 0
598- key_segments = []
599- while loop_counter < self.key_length:
600- loop_counter = loop_counter + 1
601+ key = PyTuple_New(self.key_length)
602+ for loop_counter from 0 <= loop_counter < self.key_length:
603 # grab a key segment
604 temp_ptr = <char*>memchr(self._start, c'\0', last - self._start)
605 if temp_ptr == NULL:
606- if loop_counter == self.key_length:
607+ if loop_counter + 1 == self.key_length:
608 # capture to last
609 temp_ptr = last
610 else:
611@@ -164,8 +165,9 @@
612 temp_ptr - self._start)
613 # advance our pointer
614 self._start = temp_ptr + 1
615- PyList_Append(key_segments, key_element)
616- return tuple(key_segments)
617+ Py_INCREF(key_element)
618+ PyTuple_SET_ITEM(key, loop_counter, key_element)
619+ return key
620
621 cdef int process_line(self) except -1:
622 """Process a line in the bytes."""
623
624=== modified file 'bzrlib/_dirstate_helpers_pyx.pyx'
625--- bzrlib/_dirstate_helpers_pyx.pyx 2009-07-27 05:44:19 +0000
626+++ bzrlib/_dirstate_helpers_pyx.pyx 2009-08-28 05:00:33 +0000
627@@ -28,7 +28,7 @@
628
629 from bzrlib import cache_utf8, errors, osutils
630 from bzrlib.dirstate import DirState
631-from bzrlib.osutils import pathjoin, splitpath
632+from bzrlib.osutils import parent_directories, pathjoin, splitpath
633
634
635 # This is the Windows equivalent of ENOTDIR
636@@ -963,15 +963,21 @@
637
638 cdef class ProcessEntryC:
639
640+ cdef int doing_consistency_expansion
641 cdef object old_dirname_to_file_id # dict
642 cdef object new_dirname_to_file_id # dict
643 cdef object last_source_parent
644 cdef object last_target_parent
645- cdef object include_unchanged
646+ cdef int include_unchanged
647+ cdef int partial
648 cdef object use_filesystem_for_exec
649 cdef object utf8_decode
650 cdef readonly object searched_specific_files
651+ cdef readonly object searched_exact_paths
652 cdef object search_specific_files
653+ # The parents up to the root of the paths we are searching.
654+ # After all normal paths are returned, these specific items are returned.
655+ cdef object search_specific_file_parents
656 cdef object state
657 # Current iteration variables:
658 cdef object current_root
659@@ -989,31 +995,48 @@
660 cdef object current_block_list
661 cdef object current_dir_info
662 cdef object current_dir_list
663+ cdef object _pending_consistent_entries # list
664 cdef int path_index
665 cdef object root_dir_info
666 cdef object bisect_left
667 cdef object pathjoin
668 cdef object fstat
669+ # A set of the ids we've output when doing partial output.
670+ cdef object seen_ids
671 cdef object sha_file
672
673 def __init__(self, include_unchanged, use_filesystem_for_exec,
674 search_specific_files, state, source_index, target_index,
675 want_unversioned, tree):
676+ self.doing_consistency_expansion = 0
677 self.old_dirname_to_file_id = {}
678 self.new_dirname_to_file_id = {}
679+ # Are we doing a partial iter_changes?
680+ self.partial = set(['']).__ne__(search_specific_files)
681 # Using a list so that we can access the values and change them in
682 # nested scope. Each one is [path, file_id, entry]
683 self.last_source_parent = [None, None]
684 self.last_target_parent = [None, None]
685- self.include_unchanged = include_unchanged
686+ if include_unchanged is None:
687+ self.include_unchanged = False
688+ else:
689+ self.include_unchanged = int(include_unchanged)
690 self.use_filesystem_for_exec = use_filesystem_for_exec
691 self.utf8_decode = cache_utf8._utf8_decode
692 # for all search_indexs in each path at or under each element of
693- # search_specific_files, if the detail is relocated: add the id, and add the
694- # relocated path as one to search if its not searched already. If the
695- # detail is not relocated, add the id.
696+ # search_specific_files, if the detail is relocated: add the id, and
697+ # add the relocated path as one to search if its not searched already.
698+ # If the detail is not relocated, add the id.
699 self.searched_specific_files = set()
700+ # When we search exact paths without expanding downwards, we record
701+ # that here.
702+ self.searched_exact_paths = set()
703 self.search_specific_files = search_specific_files
704+ # The parents up to the root of the paths we are searching.
705+ # After all normal paths are returned, these specific items are returned.
706+ self.search_specific_file_parents = set()
707+ # The ids we've sent out in the delta.
708+ self.seen_ids = set()
709 self.state = state
710 self.current_root = None
711 self.current_root_unicode = None
712@@ -1035,26 +1058,30 @@
713 self.current_block_pos = -1
714 self.current_dir_info = None
715 self.current_dir_list = None
716+ self._pending_consistent_entries = []
717 self.path_index = 0
718 self.root_dir_info = None
719 self.bisect_left = bisect.bisect_left
720 self.pathjoin = osutils.pathjoin
721 self.fstat = os.fstat
722 self.sha_file = osutils.sha_file
723+ if target_index != 0:
724+ # A lot of code in here depends on target_index == 0
725+ raise errors.BzrError('unsupported target index')
726
727 cdef _process_entry(self, entry, path_info):
728 """Compare an entry and real disk to generate delta information.
729
730 :param path_info: top_relpath, basename, kind, lstat, abspath for
731- the path of entry. If None, then the path is considered absent.
732- (Perhaps we should pass in a concrete entry for this ?)
733+ the path of entry. If None, then the path is considered absent in
734+ the target (Perhaps we should pass in a concrete entry for this ?)
735 Basename is returned as a utf8 string because we expect this
736 tuple will be ignored, and don't want to take the time to
737 decode.
738 :return: (iter_changes_result, changed). If the entry has not been
739 handled then changed is None. Otherwise it is False if no content
740- or metadata changes have occured, and None if any content or
741- metadata change has occured. If self.include_unchanged is True then
742+ or metadata changes have occurred, and True if any content or
743+ metadata change has occurred. If self.include_unchanged is True then
744 if changed is not None, iter_changes_result will always be a result
745 tuple. Otherwise, iter_changes_result is None unless changed is
746 True.
747@@ -1099,9 +1126,12 @@
748 else:
749 # add the source to the search path to find any children it
750 # has. TODO ? : only add if it is a container ?
751- if not osutils.is_inside_any(self.searched_specific_files,
752- source_details[1]):
753+ if (not self.doing_consistency_expansion and
754+ not osutils.is_inside_any(self.searched_specific_files,
755+ source_details[1])):
756 self.search_specific_files.add(source_details[1])
757+ # expanding from a user requested path, parent expansion
758+ # for delta consistency happens later.
759 # generate the old path; this is needed for stating later
760 # as well.
761 old_path = source_details[1]
762@@ -1180,7 +1210,8 @@
763 file_id = entry[0][2]
764 self.old_dirname_to_file_id[old_path] = file_id
765 # parent id is the entry for the path in the target tree
766- if old_dirname == self.last_source_parent[0]:
767+ if old_basename and old_dirname == self.last_source_parent[0]:
768+ # use a cached hit for non-root source entries.
769 source_parent_id = self.last_source_parent[1]
770 else:
771 try:
772@@ -1196,7 +1227,8 @@
773 self.last_source_parent[0] = old_dirname
774 self.last_source_parent[1] = source_parent_id
775 new_dirname = entry[0][0]
776- if new_dirname == self.last_target_parent[0]:
777+ if entry[0][1] and new_dirname == self.last_target_parent[0]:
778+ # use a cached hit for non-root target entries.
779 target_parent_id = self.last_target_parent[1]
780 else:
781 try:
782@@ -1313,8 +1345,13 @@
783 # a renamed parent. TODO: handle this efficiently. Its not
784 # common case to rename dirs though, so a correct but slow
785 # implementation will do.
786- if not osutils.is_inside_any(self.searched_specific_files, target_details[1]):
787+ if (not self.doing_consistency_expansion and
788+ not osutils.is_inside_any(self.searched_specific_files,
789+ target_details[1])):
790 self.search_specific_files.add(target_details[1])
791+ # We don't expand the specific files parents list here as
792+ # the path is absent in target and won't create a delta with
793+ # missing parent.
794 elif ((source_minikind == c'r' or source_minikind == c'a') and
795 (target_minikind == c'r' or target_minikind == c'a')):
796 # neither of the selected trees contain this path,
797@@ -1334,6 +1371,25 @@
798 def iter_changes(self):
799 return self
800
801+ cdef void _gather_result_for_consistency(self, result):
802+ """Check a result we will yield to make sure we are consistent later.
803+
804+ This gathers result's parents into a set to output later.
805+
806+ :param result: A result tuple.
807+ """
808+ if not self.partial or not result[0]:
809+ return
810+ self.seen_ids.add(result[0])
811+ new_path = result[1][1]
812+ if new_path:
813+ # Not the root and not a delete: queue up the parents of the path.
814+ self.search_specific_file_parents.update(
815+ osutils.parent_directories(new_path.encode('utf8')))
816+ # Add the root directory which parent_directories does not
817+ # provide.
818+ self.search_specific_file_parents.add('')
819+
820 cdef void _update_current_block(self):
821 if (self.block_index < len(self.state._dirblocks) and
822 osutils.is_inside(self.current_root, self.state._dirblocks[self.block_index][0])):
823@@ -1406,8 +1462,11 @@
824 entry = self.root_entries[self.root_entries_pos]
825 self.root_entries_pos = self.root_entries_pos + 1
826 result, changed = self._process_entry(entry, self.root_dir_info)
827- if changed is not None and changed or self.include_unchanged:
828- return result
829+ if changed is not None:
830+ if changed:
831+ self._gather_result_for_consistency(result)
832+ if changed or self.include_unchanged:
833+ return result
834 # Have we finished the prior root, or never started one ?
835 if self.current_root is None:
836 # TODO: the pending list should be lexically sorted? the
837@@ -1416,12 +1475,12 @@
838 self.current_root = self.search_specific_files.pop()
839 except KeyError:
840 raise StopIteration()
841- self.current_root_unicode = self.current_root.decode('utf8')
842 self.searched_specific_files.add(self.current_root)
843 # process the entries for this containing directory: the rest will be
844 # found by their parents recursively.
845 self.root_entries = self.state._entries_for_path(self.current_root)
846 self.root_entries_len = len(self.root_entries)
847+ self.current_root_unicode = self.current_root.decode('utf8')
848 self.root_abspath = self.tree.abspath(self.current_root_unicode)
849 try:
850 root_stat = os.lstat(self.root_abspath)
851@@ -1458,6 +1517,8 @@
852 result, changed = self._process_entry(entry, self.root_dir_info)
853 if changed is not None:
854 path_handled = -1
855+ if changed:
856+ self._gather_result_for_consistency(result)
857 if changed or self.include_unchanged:
858 return result
859 # handle unversioned specified paths:
860@@ -1476,7 +1537,8 @@
861 )
862 # If we reach here, the outer flow continues, which enters into the
863 # per-root setup logic.
864- if self.current_dir_info is None and self.current_block is None:
865+ if (self.current_dir_info is None and self.current_block is None and not
866+ self.doing_consistency_expansion):
867 # setup iteration of this root:
868 self.current_dir_list = None
869 if self.root_dir_info and self.root_dir_info[2] == 'tree-reference':
870@@ -1606,6 +1668,8 @@
871 # advance the entry only, after processing.
872 result, changed = self._process_entry(current_entry, None)
873 if changed is not None:
874+ if changed:
875+ self._gather_result_for_consistency(result)
876 if changed or self.include_unchanged:
877 return result
878 self.block_index = self.block_index + 1
879@@ -1618,6 +1682,15 @@
880 # More supplied paths to process
881 self.current_root = None
882 return self._iter_next()
883+ # Start expanding more conservatively, adding paths the user may not
884+ # have intended but required for consistent deltas.
885+ self.doing_consistency_expansion = 1
886+ if not self._pending_consistent_entries:
887+ self._pending_consistent_entries = self._next_consistent_entries()
888+ while self._pending_consistent_entries:
889+ result, changed = self._pending_consistent_entries.pop()
890+ if changed is not None:
891+ return result
892 raise StopIteration()
893
894 cdef object _maybe_tree_ref(self, current_path_info):
895@@ -1705,6 +1778,8 @@
896 current_path_info)
897 if changed is not None:
898 path_handled = -1
899+ if not changed and not self.include_unchanged:
900+ changed = None
901 # >- loop control starts here:
902 # >- entry
903 if advance_entry and current_entry is not None:
904@@ -1726,7 +1801,7 @@
905 except UnicodeDecodeError:
906 raise errors.BadFilenameEncoding(
907 current_path_info[0], osutils._fs_enc)
908- if result is not None:
909+ if changed is not None:
910 raise AssertionError(
911 "result is not None: %r" % result)
912 result = (None,
913@@ -1737,6 +1812,7 @@
914 (None, self.utf8_decode(current_path_info[1])[0]),
915 (None, current_path_info[2]),
916 (None, new_executable))
917+ changed = True
918 # dont descend into this unversioned path if it is
919 # a dir
920 if current_path_info[2] in ('directory'):
921@@ -1755,9 +1831,12 @@
922 current_path_info)
923 else:
924 current_path_info = None
925- if result is not None:
926+ if changed is not None:
927 # Found a result on this pass, yield it
928- return result
929+ if changed:
930+ self._gather_result_for_consistency(result)
931+ if changed or self.include_unchanged:
932+ return result
933 if self.current_block is not None:
934 self.block_index = self.block_index + 1
935 self._update_current_block()
936@@ -1769,3 +1848,123 @@
937 self.current_dir_list = self.current_dir_info[1]
938 except StopIteration:
939 self.current_dir_info = None
940+
941+ cdef object _next_consistent_entries(self):
942+ """Grabs the next specific file parent case to consider.
943+
944+ :return: A list of the results, each of which is as for _process_entry.
945+ """
946+ results = []
947+ while self.search_specific_file_parents:
948+ # Process the parent directories for the paths we were iterating.
949+ # Even in extremely large trees this should be modest, so currently
950+ # no attempt is made to optimise.
951+ path_utf8 = self.search_specific_file_parents.pop()
952+ if path_utf8 in self.searched_exact_paths:
953+ # We've examined this path.
954+ continue
955+ if osutils.is_inside_any(self.searched_specific_files, path_utf8):
956+ # We've examined this path.
957+ continue
958+ path_entries = self.state._entries_for_path(path_utf8)
959+ # We need either one or two entries. If the path in
960+ # self.target_index has moved (so the entry in source_index is in
961+ # 'ar') then we need to also look for the entry for this path in
962+ # self.source_index, to output the appropriate delete-or-rename.
963+ selected_entries = []
964+ found_item = False
965+ for candidate_entry in path_entries:
966+ # Find entries present in target at this path:
967+ if candidate_entry[1][self.target_index][0] not in 'ar':
968+ found_item = True
969+ selected_entries.append(candidate_entry)
970+ # Find entries present in source at this path:
971+ elif (self.source_index is not None and
972+ candidate_entry[1][self.source_index][0] not in 'ar'):
973+ found_item = True
974+ if candidate_entry[1][self.target_index][0] == 'a':
975+ # Deleted, emit it here.
976+ selected_entries.append(candidate_entry)
977+ else:
978+ # renamed, emit it when we process the directory it
979+ # ended up at.
980+ self.search_specific_file_parents.add(
981+ candidate_entry[1][self.target_index][1])
982+ if not found_item:
983+ raise AssertionError(
984+ "Missing entry for specific path parent %r, %r" % (
985+ path_utf8, path_entries))
986+ path_info = self._path_info(path_utf8, path_utf8.decode('utf8'))
987+ for entry in selected_entries:
988+ if entry[0][2] in self.seen_ids:
989+ continue
990+ result, changed = self._process_entry(entry, path_info)
991+ if changed is None:
992+ raise AssertionError(
993+ "Got entry<->path mismatch for specific path "
994+ "%r entry %r path_info %r " % (
995+ path_utf8, entry, path_info))
996+ # Only include changes - we're outside the users requested
997+ # expansion.
998+ if changed:
999+ self._gather_result_for_consistency(result)
1000+ if (result[6][0] == 'directory' and
1001+ result[6][1] != 'directory'):
1002+ # This stopped being a directory, the old children have
1003+ # to be included.
1004+ if entry[1][self.source_index][0] == 'r':
1005+ # renamed, take the source path
1006+ entry_path_utf8 = entry[1][self.source_index][1]
1007+ else:
1008+ entry_path_utf8 = path_utf8
1009+ initial_key = (entry_path_utf8, '', '')
1010+ block_index, _ = self.state._find_block_index_from_key(
1011+ initial_key)
1012+ if block_index == 0:
1013+ # The children of the root are in block index 1.
1014+ block_index = block_index + 1
1015+ current_block = None
1016+ if block_index < len(self.state._dirblocks):
1017+ current_block = self.state._dirblocks[block_index]
1018+ if not osutils.is_inside(
1019+ entry_path_utf8, current_block[0]):
1020+ # No entries for this directory at all.
1021+ current_block = None
1022+ if current_block is not None:
1023+ for entry in current_block[1]:
1024+ if entry[1][self.source_index][0] in 'ar':
1025+ # Not in the source tree, so doesn't have to be
1026+ # included.
1027+ continue
1028+ # Path of the entry itself.
1029+ self.search_specific_file_parents.add(
1030+ self.pathjoin(*entry[0][:2]))
1031+ if changed or self.include_unchanged:
1032+ results.append((result, changed))
1033+ self.searched_exact_paths.add(path_utf8)
1034+ return results
1035+
1036+ cdef object _path_info(self, utf8_path, unicode_path):
1037+ """Generate path_info for unicode_path.
1038+
1039+ :return: None if unicode_path does not exist, or a path_info tuple.
1040+ """
1041+ abspath = self.tree.abspath(unicode_path)
1042+ try:
1043+ stat = os.lstat(abspath)
1044+ except OSError, e:
1045+ if e.errno == errno.ENOENT:
1046+ # the path does not exist.
1047+ return None
1048+ else:
1049+ raise
1050+ utf8_basename = utf8_path.rsplit('/', 1)[-1]
1051+ dir_info = (utf8_path, utf8_basename,
1052+ osutils.file_kind_from_stat_mode(stat.st_mode), stat,
1053+ abspath)
1054+ if dir_info[2] == 'directory':
1055+ if self.tree._directory_is_tree_reference(
1056+ unicode_path):
1057+ self.root_dir_info = self.root_dir_info[:2] + \
1058+ ('tree-reference',) + self.root_dir_info[3:]
1059+ return dir_info
1060
1061=== modified file 'bzrlib/_known_graph_py.py'
1062--- bzrlib/_known_graph_py.py 2009-07-08 20:58:10 +0000
1063+++ bzrlib/_known_graph_py.py 2009-09-07 14:19:05 +0000
1064@@ -18,6 +18,7 @@
1065 """
1066
1067 from bzrlib import (
1068+ errors,
1069 revision,
1070 )
1071
1072@@ -40,6 +41,18 @@
1073 self.parent_keys, self.child_keys)
1074
1075
1076+class _MergeSortNode(object):
1077+ """Information about a specific node in the merge graph."""
1078+
1079+ __slots__ = ('key', 'merge_depth', 'revno', 'end_of_merge')
1080+
1081+ def __init__(self, key, merge_depth, revno, end_of_merge):
1082+ self.key = key
1083+ self.merge_depth = merge_depth
1084+ self.revno = revno
1085+ self.end_of_merge = end_of_merge
1086+
1087+
1088 class KnownGraph(object):
1089 """This is a class which assumes we already know the full graph."""
1090
1091@@ -84,6 +97,10 @@
1092 return [node for node in self._nodes.itervalues()
1093 if not node.parent_keys]
1094
1095+ def _find_tips(self):
1096+ return [node for node in self._nodes.itervalues()
1097+ if not node.child_keys]
1098+
1099 def _find_gdfo(self):
1100 nodes = self._nodes
1101 known_parent_gdfos = {}
1102@@ -171,3 +188,119 @@
1103 self._known_heads[heads_key] = heads
1104 return heads
1105
1106+ def topo_sort(self):
1107+ """Return the nodes in topological order.
1108+
1109+ All parents must occur before all children.
1110+ """
1111+ for node in self._nodes.itervalues():
1112+ if node.gdfo is None:
1113+ raise errors.GraphCycleError(self._nodes)
1114+ pending = self._find_tails()
1115+ pending_pop = pending.pop
1116+ pending_append = pending.append
1117+
1118+ topo_order = []
1119+ topo_order_append = topo_order.append
1120+
1121+ num_seen_parents = dict.fromkeys(self._nodes, 0)
1122+ while pending:
1123+ node = pending_pop()
1124+ if node.parent_keys is not None:
1125+ # We don't include ghost parents
1126+ topo_order_append(node.key)
1127+ for child_key in node.child_keys:
1128+ child_node = self._nodes[child_key]
1129+ seen_parents = num_seen_parents[child_key] + 1
1130+ if seen_parents == len(child_node.parent_keys):
1131+ # All parents have been processed, enqueue this child
1132+ pending_append(child_node)
1133+ # This has been queued up, stop tracking it
1134+ del num_seen_parents[child_key]
1135+ else:
1136+ num_seen_parents[child_key] = seen_parents
1137+ # We started from the parents, so we don't need to do anymore work
1138+ return topo_order
1139+
1140+ def gc_sort(self):
1141+ """Return a reverse topological ordering which is 'stable'.
1142+
1143+ There are a few constraints:
1144+ 1) Reverse topological (all children before all parents)
1145+ 2) Grouped by prefix
1146+ 3) 'stable' sorting, so that we get the same result, independent of
1147+ machine, or extra data.
1148+ To do this, we use the same basic algorithm as topo_sort, but when we
1149+ aren't sure what node to access next, we sort them lexicographically.
1150+ """
1151+ tips = self._find_tips()
1152+ # Split the tips based on prefix
1153+ prefix_tips = {}
1154+ for node in tips:
1155+ if node.key.__class__ is str or len(node.key) == 1:
1156+ prefix = ''
1157+ else:
1158+ prefix = node.key[0]
1159+ prefix_tips.setdefault(prefix, []).append(node)
1160+
1161+ num_seen_children = dict.fromkeys(self._nodes, 0)
1162+
1163+ result = []
1164+ for prefix in sorted(prefix_tips):
1165+ pending = sorted(prefix_tips[prefix], key=lambda n:n.key,
1166+ reverse=True)
1167+ while pending:
1168+ node = pending.pop()
1169+ if node.parent_keys is None:
1170+ # Ghost node, skip it
1171+ continue
1172+ result.append(node.key)
1173+ for parent_key in sorted(node.parent_keys, reverse=True):
1174+ parent_node = self._nodes[parent_key]
1175+ seen_children = num_seen_children[parent_key] + 1
1176+ if seen_children == len(parent_node.child_keys):
1177+ # All children have been processed, enqueue this parent
1178+ pending.append(parent_node)
1179+ # This has been queued up, stop tracking it
1180+ del num_seen_children[parent_key]
1181+ else:
1182+ num_seen_children[parent_key] = seen_children
1183+ return result
1184+
1185+ def merge_sort(self, tip_key):
1186+ """Compute the merge sorted graph output."""
1187+ from bzrlib import tsort
1188+ as_parent_map = dict((node.key, node.parent_keys)
1189+ for node in self._nodes.itervalues()
1190+ if node.parent_keys is not None)
1191+ # We intentionally always generate revnos and never force the
1192+ # mainline_revisions
1193+ # Strip the sequence_number that merge_sort generates
1194+ return [_MergeSortNode(key, merge_depth, revno, end_of_merge)
1195+ for _, key, merge_depth, revno, end_of_merge
1196+ in tsort.merge_sort(as_parent_map, tip_key,
1197+ mainline_revisions=None,
1198+ generate_revno=True)]
1199+
1200+ def get_parent_keys(self, key):
1201+ """Get the parents for a key
1202+
1203+ Returns a list containing the parent keys. If the key is a ghost,
1204+ None is returned. A KeyError will be raised if the key is not in
1205+ the graph.
1206+
1207+ :param key: Key to check (eg revision_id)
1208+ :return: A list of parents
1209+ """
1210+ return self._nodes[key].parent_keys
1211+
1212+ def get_child_keys(self, key):
1213+ """Get the children for a key
1214+
1215+ Returns a list containing the child keys. A KeyError will be raised
1216+ if the key is not in the graph.
1217+
1218+ :param key: Key to check (eg revision_id)
1219+ :return: A list of children
1220+ """
1221+ return self._nodes[key].child_keys
1222
1223=== modified file 'bzrlib/_known_graph_pyx.pyx'
1224--- bzrlib/_known_graph_pyx.pyx 2009-07-14 16:10:32 +0000
1225+++ bzrlib/_known_graph_pyx.pyx 2009-09-07 14:19:05 +0000
1226@@ -25,11 +25,18 @@
1227 ctypedef struct PyObject:
1228 pass
1229
1230+ int PyString_CheckExact(object)
1231+
1232+ int PyObject_RichCompareBool(object, object, int)
1233+ int Py_LT
1234+
1235+ int PyTuple_CheckExact(object)
1236 object PyTuple_New(Py_ssize_t n)
1237 Py_ssize_t PyTuple_GET_SIZE(object t)
1238 PyObject * PyTuple_GET_ITEM(object t, Py_ssize_t o)
1239 void PyTuple_SET_ITEM(object t, Py_ssize_t o, object v)
1240
1241+ int PyList_CheckExact(object)
1242 Py_ssize_t PyList_GET_SIZE(object l)
1243 PyObject * PyList_GET_ITEM(object l, Py_ssize_t o)
1244 int PyList_SetItem(object l, Py_ssize_t o, object l) except -1
1245@@ -44,8 +51,9 @@
1246
1247 void Py_INCREF(object)
1248
1249+import gc
1250
1251-from bzrlib import revision
1252+from bzrlib import errors, revision
1253
1254 cdef object NULL_REVISION
1255 NULL_REVISION = revision.NULL_REVISION
1256@@ -59,10 +67,9 @@
1257 cdef object children
1258 cdef public long gdfo
1259 cdef int seen
1260+ cdef object extra
1261
1262 def __init__(self, key):
1263- cdef int i
1264-
1265 self.key = key
1266 self.parents = None
1267
1268@@ -70,6 +77,7 @@
1269 # Greatest distance from origin
1270 self.gdfo = -1
1271 self.seen = 0
1272+ self.extra = None
1273
1274 property child_keys:
1275 def __get__(self):
1276@@ -80,6 +88,18 @@
1277 PyList_Append(keys, child.key)
1278 return keys
1279
1280+ property parent_keys:
1281+ def __get__(self):
1282+ if self.parents is None:
1283+ return None
1284+
1285+ cdef _KnownGraphNode parent
1286+
1287+ keys = []
1288+ for parent in self.parents:
1289+ PyList_Append(keys, parent.key)
1290+ return keys
1291+
1292 cdef clear_references(self):
1293 self.parents = None
1294 self.children = None
1295@@ -107,17 +127,66 @@
1296 return <_KnownGraphNode>temp_node
1297
1298
1299-cdef _KnownGraphNode _get_parent(parents, Py_ssize_t pos):
1300+cdef _KnownGraphNode _get_tuple_node(tpl, Py_ssize_t pos):
1301 cdef PyObject *temp_node
1302- cdef _KnownGraphNode node
1303
1304- temp_node = PyTuple_GET_ITEM(parents, pos)
1305+ temp_node = PyTuple_GET_ITEM(tpl, pos)
1306 return <_KnownGraphNode>temp_node
1307
1308
1309-# TODO: slab allocate all _KnownGraphNode objects.
1310-# We already know how many we are going to need, except for a couple of
1311-# ghosts that could be allocated on demand.
1312+def get_key(node):
1313+ cdef _KnownGraphNode real_node
1314+ real_node = node
1315+ return real_node.key
1316+
1317+
1318+cdef object _sort_list_nodes(object lst_or_tpl, int reverse):
1319+ """Sort a list of _KnownGraphNode objects.
1320+
1321+ If lst_or_tpl is a list, it is allowed to mutate in place. It may also
1322+ just return the input list if everything is already sorted.
1323+ """
1324+ cdef _KnownGraphNode node1, node2
1325+ cdef int do_swap, is_tuple
1326+ cdef Py_ssize_t length
1327+
1328+ is_tuple = PyTuple_CheckExact(lst_or_tpl)
1329+ if not (is_tuple or PyList_CheckExact(lst_or_tpl)):
1330+ raise TypeError('lst_or_tpl must be a list or tuple.')
1331+ length = len(lst_or_tpl)
1332+ if length == 0 or length == 1:
1333+ return lst_or_tpl
1334+ if length == 2:
1335+ if is_tuple:
1336+ node1 = _get_tuple_node(lst_or_tpl, 0)
1337+ node2 = _get_tuple_node(lst_or_tpl, 1)
1338+ else:
1339+ node1 = _get_list_node(lst_or_tpl, 0)
1340+ node2 = _get_list_node(lst_or_tpl, 1)
1341+ if reverse:
1342+ do_swap = PyObject_RichCompareBool(node1.key, node2.key, Py_LT)
1343+ else:
1344+ do_swap = PyObject_RichCompareBool(node2.key, node1.key, Py_LT)
1345+ if not do_swap:
1346+ return lst_or_tpl
1347+ if is_tuple:
1348+ return (node2, node1)
1349+ else:
1350+ # Swap 'in-place', since lists are mutable
1351+ Py_INCREF(node1)
1352+ PyList_SetItem(lst_or_tpl, 1, node1)
1353+ Py_INCREF(node2)
1354+ PyList_SetItem(lst_or_tpl, 0, node2)
1355+ return lst_or_tpl
1356+ # For all other sizes, we just use 'sorted()'
1357+ if is_tuple:
1358+ # Note that sorted() is just list(iterable).sort()
1359+ lst_or_tpl = list(lst_or_tpl)
1360+ lst_or_tpl.sort(key=get_key, reverse=reverse)
1361+ return lst_or_tpl
1362+
1363+
1364+cdef class _MergeSorter
1365
1366 cdef class KnownGraph:
1367 """This is a class which assumes we already know the full graph."""
1368@@ -136,6 +205,9 @@
1369 # Maps {sorted(revision_id, revision_id): heads}
1370 self._known_heads = {}
1371 self.do_cache = int(do_cache)
1372+ # TODO: consider disabling gc since we are allocating a lot of nodes
1373+ # that won't be collectable anyway. real world testing has not
1374+ # shown a specific impact, yet.
1375 self._initialize_nodes(parent_map)
1376 self._find_gdfo()
1377
1378@@ -183,11 +255,16 @@
1379 parent_keys = <object>temp_parent_keys
1380 num_parent_keys = len(parent_keys)
1381 node = self._get_or_create_node(key)
1382- # We know how many parents, so we could pre allocate an exact sized
1383- # tuple here
1384+ # We know how many parents, so we pre allocate the tuple
1385 parent_nodes = PyTuple_New(num_parent_keys)
1386- # We use iter here, because parent_keys maybe be a list or tuple
1387 for pos2 from 0 <= pos2 < num_parent_keys:
1388+ # Note: it costs us 10ms out of 40ms to lookup all of these
1389+ # parents, it doesn't seem to be an allocation overhead,
1390+ # but rather a lookup overhead. There doesn't seem to be
1391+ # a way around it, and that is one reason why
1392+ # KnownGraphNode maintains a direct pointer to the parent
1393+ # node.
1394+ # We use [] because parent_keys may be a tuple or list
1395 parent_node = self._get_or_create_node(parent_keys[pos2])
1396 # PyTuple_SET_ITEM will steal a reference, so INCREF first
1397 Py_INCREF(parent_node)
1398@@ -209,6 +286,19 @@
1399 PyList_Append(tails, node)
1400 return tails
1401
1402+ def _find_tips(self):
1403+ cdef PyObject *temp_node
1404+ cdef _KnownGraphNode node
1405+ cdef Py_ssize_t pos
1406+
1407+ tips = []
1408+ pos = 0
1409+ while PyDict_Next(self._nodes, &pos, NULL, &temp_node):
1410+ node = <_KnownGraphNode>temp_node
1411+ if PyList_GET_SIZE(node.children) == 0:
1412+ PyList_Append(tips, node)
1413+ return tips
1414+
1415 def _find_gdfo(self):
1416 cdef _KnownGraphNode node
1417 cdef _KnownGraphNode child
1418@@ -315,7 +405,7 @@
1419 continue
1420 if node.parents is not None and PyTuple_GET_SIZE(node.parents) > 0:
1421 for pos from 0 <= pos < PyTuple_GET_SIZE(node.parents):
1422- parent_node = _get_parent(node.parents, pos)
1423+ parent_node = _get_tuple_node(node.parents, pos)
1424 last_item = last_item + 1
1425 if last_item < PyList_GET_SIZE(pending):
1426 Py_INCREF(parent_node) # SetItem steals a ref
1427@@ -335,3 +425,454 @@
1428 if self.do_cache:
1429 PyDict_SetItem(self._known_heads, heads_key, heads)
1430 return heads
1431+
1432+ def topo_sort(self):
1433+ """Return the nodes in topological order.
1434+
1435+ All parents must occur before all children.
1436+ """
1437+ # This is, for the most part, the same iteration order that we used for
1438+ # _find_gdfo, consider finding a way to remove the duplication
1439+ # In general, we find the 'tails' (nodes with no parents), and then
1440+ # walk to the children. For children that have all of their parents
1441+ # yielded, we queue up the child to be yielded as well.
1442+ cdef _KnownGraphNode node
1443+ cdef _KnownGraphNode child
1444+ cdef PyObject *temp
1445+ cdef Py_ssize_t pos
1446+ cdef int replace
1447+ cdef Py_ssize_t last_item
1448+
1449+ pending = self._find_tails()
1450+ if PyList_GET_SIZE(pending) == 0 and len(self._nodes) > 0:
1451+ raise errors.GraphCycleError(self._nodes)
1452+
1453+ topo_order = []
1454+
1455+ last_item = PyList_GET_SIZE(pending) - 1
1456+ while last_item >= 0:
1457+ # Avoid pop followed by push, instead, peek, and replace
1458+ # timing shows this is 930ms => 770ms for OOo
1459+ node = _get_list_node(pending, last_item)
1460+ last_item = last_item - 1
1461+ if node.parents is not None:
1462+ # We don't include ghost parents
1463+ PyList_Append(topo_order, node.key)
1464+ for pos from 0 <= pos < PyList_GET_SIZE(node.children):
1465+ child = _get_list_node(node.children, pos)
1466+ if child.gdfo == -1:
1467+ # We know we have a graph cycle because a node has a parent
1468+ # which we couldn't find
1469+ raise errors.GraphCycleError(self._nodes)
1470+ child.seen = child.seen + 1
1471+ if child.seen == PyTuple_GET_SIZE(child.parents):
1472+ # All parents of this child have been yielded, queue this
1473+ # one to be yielded as well
1474+ last_item = last_item + 1
1475+ if last_item < PyList_GET_SIZE(pending):
1476+ Py_INCREF(child) # SetItem steals a ref
1477+ PyList_SetItem(pending, last_item, child)
1478+ else:
1479+ PyList_Append(pending, child)
1480+ # We have queued this node, we don't need to track it
1481+ # anymore
1482+ child.seen = 0
1483+        # We started from the parents, so we don't need to do any more work
1484+ return topo_order
1485+
1486+ def gc_sort(self):
1487+ """Return a reverse topological ordering which is 'stable'.
1488+
1489+ There are a few constraints:
1490+ 1) Reverse topological (all children before all parents)
1491+ 2) Grouped by prefix
1492+ 3) 'stable' sorting, so that we get the same result, independent of
1493+ machine, or extra data.
1494+ To do this, we use the same basic algorithm as topo_sort, but when we
1495+ aren't sure what node to access next, we sort them lexicographically.
1496+ """
1497+ cdef PyObject *temp
1498+ cdef Py_ssize_t pos, last_item
1499+ cdef _KnownGraphNode node, node2, parent_node
1500+
1501+ tips = self._find_tips()
1502+ # Split the tips based on prefix
1503+ prefix_tips = {}
1504+ for pos from 0 <= pos < PyList_GET_SIZE(tips):
1505+ node = _get_list_node(tips, pos)
1506+ if PyString_CheckExact(node.key) or len(node.key) == 1:
1507+ prefix = ''
1508+ else:
1509+ prefix = node.key[0]
1510+ temp = PyDict_GetItem(prefix_tips, prefix)
1511+ if temp == NULL:
1512+ prefix_tips[prefix] = [node]
1513+ else:
1514+ tip_nodes = <object>temp
1515+ PyList_Append(tip_nodes, node)
1516+
1517+ result = []
1518+ for prefix in sorted(prefix_tips):
1519+ temp = PyDict_GetItem(prefix_tips, prefix)
1520+ assert temp != NULL
1521+ tip_nodes = <object>temp
1522+ pending = _sort_list_nodes(tip_nodes, 1)
1523+ last_item = PyList_GET_SIZE(pending) - 1
1524+ while last_item >= 0:
1525+ node = _get_list_node(pending, last_item)
1526+ last_item = last_item - 1
1527+ if node.parents is None:
1528+ # Ghost
1529+ continue
1530+ PyList_Append(result, node.key)
1531+ # Sorting the parent keys isn't strictly necessary for stable
1532+ # sorting of a given graph. But it does help minimize the
1533+ # differences between graphs
1534+ # For bzr.dev ancestry:
1535+ # 4.73ms no sort
1536+ # 7.73ms RichCompareBool sort
1537+ parents = _sort_list_nodes(node.parents, 1)
1538+ for pos from 0 <= pos < len(parents):
1539+ if PyTuple_CheckExact(parents):
1540+ parent_node = _get_tuple_node(parents, pos)
1541+ else:
1542+ parent_node = _get_list_node(parents, pos)
1543+ # TODO: GraphCycle detection
1544+ parent_node.seen = parent_node.seen + 1
1545+ if (parent_node.seen
1546+ == PyList_GET_SIZE(parent_node.children)):
1547+ # All children have been processed, queue up this
1548+ # parent
1549+ last_item = last_item + 1
1550+ if last_item < PyList_GET_SIZE(pending):
1551+ Py_INCREF(parent_node) # SetItem steals a ref
1552+ PyList_SetItem(pending, last_item, parent_node)
1553+ else:
1554+ PyList_Append(pending, parent_node)
1555+ parent_node.seen = 0
1556+ return result
1557+
1558+ def merge_sort(self, tip_key):
1559+ """Compute the merge sorted graph output."""
1560+ cdef _MergeSorter sorter
1561+
1562+ # TODO: consider disabling gc since we are allocating a lot of nodes
1563+ # that won't be collectable anyway. real world testing has not
1564+ # shown a specific impact, yet.
1565+ sorter = _MergeSorter(self, tip_key)
1566+ return sorter.topo_order()
1567+
1568+ def get_parent_keys(self, key):
1569+ """Get the parents for a key
1570+
1571+        Returns a list containing the parent keys. If the key is a ghost,
1572+ None is returned. A KeyError will be raised if the key is not in
1573+ the graph.
1574+
1575+        :param key: Key to check (e.g. revision_id)
1576+ :return: A list of parents
1577+ """
1578+ return self._nodes[key].parent_keys
1579+
1580+ def get_child_keys(self, key):
1581+ """Get the children for a key
1582+
1583+        Returns a list containing the child keys. A KeyError will be raised
1584+ if the key is not in the graph.
1585+
1586+        :param key: Key to check (e.g. revision_id)
1587+ :return: A list of children
1588+ """
1589+ return self._nodes[key].child_keys
1590+
1591+
1592+cdef class _MergeSortNode:
1593+ """Tracks information about a node during the merge_sort operation."""
1594+
1595+ # Public api
1596+ cdef public object key
1597+ cdef public long merge_depth
1598+ cdef public object end_of_merge # True/False Is this the end of the current merge
1599+
1600+ # Private api, used while computing the information
1601+ cdef _KnownGraphNode left_parent
1602+ cdef _KnownGraphNode left_pending_parent
1603+ cdef object pending_parents # list of _KnownGraphNode for non-left parents
1604+ cdef long _revno_first
1605+ cdef long _revno_second
1606+ cdef long _revno_last
1607+ # TODO: turn these into flag/bit fields rather than individual members
1608+ cdef int is_first_child # Is this the first child?
1609+ cdef int seen_by_child # A child node has seen this parent
1610+ cdef int completed # Fully Processed
1611+
1612+ def __init__(self, key):
1613+ self.key = key
1614+ self.merge_depth = -1
1615+ self.left_parent = None
1616+ self.left_pending_parent = None
1617+ self.pending_parents = None
1618+ self._revno_first = -1
1619+ self._revno_second = -1
1620+ self._revno_last = -1
1621+ self.is_first_child = 0
1622+ self.seen_by_child = 0
1623+ self.completed = 0
1624+
1625+ def __repr__(self):
1626+ return '%s(%s depth:%s rev:%s,%s,%s first:%s seen:%s)' % (
1627+ self.__class__.__name__, self.key,
1628+ self.merge_depth,
1629+ self._revno_first, self._revno_second, self._revno_last,
1630+ self.is_first_child, self.seen_by_child)
1631+
1632+ cdef int has_pending_parents(self):
1633+ if self.left_pending_parent is not None or self.pending_parents:
1634+ return 1
1635+ return 0
1636+
1637+ cdef object _revno(self):
1638+ if self._revno_first == -1:
1639+ if self._revno_second != -1:
1640+ raise RuntimeError('Something wrong with: %s' % (self,))
1641+ return (self._revno_last,)
1642+ else:
1643+ return (self._revno_first, self._revno_second, self._revno_last)
1644+
1645+ property revno:
1646+ def __get__(self):
1647+ return self._revno()
1648+
1649+
1650+cdef class _MergeSorter:
1651+ """This class does the work of computing the merge_sort ordering.
1652+
1653+ We have some small advantages, in that we get all the extra information
1654+    that KnownGraph knows, such as the child lists, etc.
1655+ """
1656+
1657+ # Current performance numbers for merge_sort(bzr_dev_parent_map):
1658+ # 302ms tsort.merge_sort()
1659+ # 91ms graph.KnownGraph().merge_sort()
1660+ # 40ms kg.merge_sort()
1661+
1662+ cdef KnownGraph graph
1663+ cdef object _depth_first_stack # list
1664+ cdef Py_ssize_t _last_stack_item # offset to last item on stack
1665+ # cdef object _ms_nodes # dict of key => _MergeSortNode
1666+ cdef object _revno_to_branch_count # {revno => num child branches}
1667+ cdef object _scheduled_nodes # List of nodes ready to be yielded
1668+
1669+ def __init__(self, known_graph, tip_key):
1670+ cdef _KnownGraphNode node
1671+
1672+ self.graph = known_graph
1673+ # self._ms_nodes = {}
1674+ self._revno_to_branch_count = {}
1675+ self._depth_first_stack = []
1676+ self._last_stack_item = -1
1677+ self._scheduled_nodes = []
1678+ if (tip_key is not None and tip_key != NULL_REVISION
1679+ and tip_key != (NULL_REVISION,)):
1680+ node = self.graph._nodes[tip_key]
1681+ self._push_node(node, 0)
1682+
1683+ cdef _MergeSortNode _get_ms_node(self, _KnownGraphNode node):
1684+ cdef PyObject *temp_node
1685+ cdef _MergeSortNode ms_node
1686+
1687+ if node.extra is None:
1688+ ms_node = _MergeSortNode(node.key)
1689+ node.extra = ms_node
1690+ else:
1691+ ms_node = <_MergeSortNode>node.extra
1692+ return ms_node
1693+
1694+ cdef _push_node(self, _KnownGraphNode node, long merge_depth):
1695+ cdef _KnownGraphNode parent_node
1696+ cdef _MergeSortNode ms_node, ms_parent_node
1697+ cdef Py_ssize_t pos
1698+
1699+ ms_node = self._get_ms_node(node)
1700+ ms_node.merge_depth = merge_depth
1701+ if node.parents is None:
1702+ raise RuntimeError('ghost nodes should not be pushed'
1703+ ' onto the stack: %s' % (node,))
1704+ if PyTuple_GET_SIZE(node.parents) > 0:
1705+ parent_node = _get_tuple_node(node.parents, 0)
1706+ ms_node.left_parent = parent_node
1707+ if parent_node.parents is None: # left-hand ghost
1708+ ms_node.left_pending_parent = None
1709+ ms_node.left_parent = None
1710+ else:
1711+ ms_node.left_pending_parent = parent_node
1712+ if PyTuple_GET_SIZE(node.parents) > 1:
1713+ ms_node.pending_parents = []
1714+ for pos from 1 <= pos < PyTuple_GET_SIZE(node.parents):
1715+ parent_node = _get_tuple_node(node.parents, pos)
1716+ if parent_node.parents is None: # ghost
1717+ continue
1718+ PyList_Append(ms_node.pending_parents, parent_node)
1719+
1720+ ms_node.is_first_child = 1
1721+ if ms_node.left_parent is not None:
1722+ ms_parent_node = self._get_ms_node(ms_node.left_parent)
1723+ if ms_parent_node.seen_by_child:
1724+ ms_node.is_first_child = 0
1725+ ms_parent_node.seen_by_child = 1
1726+ self._last_stack_item = self._last_stack_item + 1
1727+ if self._last_stack_item < PyList_GET_SIZE(self._depth_first_stack):
1728+ Py_INCREF(node) # SetItem steals a ref
1729+ PyList_SetItem(self._depth_first_stack, self._last_stack_item,
1730+ node)
1731+ else:
1732+ PyList_Append(self._depth_first_stack, node)
1733+
1734+ cdef _pop_node(self):
1735+ cdef PyObject *temp
1736+ cdef _MergeSortNode ms_node, ms_parent_node, ms_prev_node
1737+ cdef _KnownGraphNode node, parent_node, prev_node
1738+
1739+ node = _get_list_node(self._depth_first_stack, self._last_stack_item)
1740+ ms_node = <_MergeSortNode>node.extra
1741+ self._last_stack_item = self._last_stack_item - 1
1742+ if ms_node.left_parent is not None:
1743+ # Assign the revision number from the left-hand parent
1744+ ms_parent_node = <_MergeSortNode>ms_node.left_parent.extra
1745+ if ms_node.is_first_child:
1746+ # First child just increments the final digit
1747+ ms_node._revno_first = ms_parent_node._revno_first
1748+ ms_node._revno_second = ms_parent_node._revno_second
1749+ ms_node._revno_last = ms_parent_node._revno_last + 1
1750+ else:
1751+ # Not the first child, make a new branch
1752+ # (mainline_revno, branch_count, 1)
1753+ if ms_parent_node._revno_first == -1:
1754+ # Mainline ancestor, the increment is on the last digit
1755+ base_revno = ms_parent_node._revno_last
1756+ else:
1757+ base_revno = ms_parent_node._revno_first
1758+ temp = PyDict_GetItem(self._revno_to_branch_count,
1759+ base_revno)
1760+ if temp == NULL:
1761+ branch_count = 1
1762+ else:
1763+ branch_count = (<object>temp) + 1
1764+ PyDict_SetItem(self._revno_to_branch_count, base_revno,
1765+ branch_count)
1766+ ms_node._revno_first = base_revno
1767+ ms_node._revno_second = branch_count
1768+ ms_node._revno_last = 1
1769+ else:
1770+ temp = PyDict_GetItem(self._revno_to_branch_count, 0)
1771+ if temp == NULL:
1772+ # The first root node doesn't have a 3-digit revno
1773+ root_count = 0
1774+ ms_node._revno_first = -1
1775+ ms_node._revno_second = -1
1776+ ms_node._revno_last = 1
1777+ else:
1778+ root_count = (<object>temp) + 1
1779+ ms_node._revno_first = 0
1780+ ms_node._revno_second = root_count
1781+ ms_node._revno_last = 1
1782+ PyDict_SetItem(self._revno_to_branch_count, 0, root_count)
1783+ ms_node.completed = 1
1784+ if PyList_GET_SIZE(self._scheduled_nodes) == 0:
1785+ # The first scheduled node is always the end of merge
1786+ ms_node.end_of_merge = True
1787+ else:
1788+ prev_node = _get_list_node(self._scheduled_nodes,
1789+ PyList_GET_SIZE(self._scheduled_nodes) - 1)
1790+ ms_prev_node = <_MergeSortNode>prev_node.extra
1791+ if ms_prev_node.merge_depth < ms_node.merge_depth:
1792+ # The previously pushed node is to our left, so this is the end
1793+ # of this right-hand chain
1794+ ms_node.end_of_merge = True
1795+ elif (ms_prev_node.merge_depth == ms_node.merge_depth
1796+ and prev_node not in node.parents):
1797+ # The next node is not a direct parent of this node
1798+ ms_node.end_of_merge = True
1799+ else:
1800+ ms_node.end_of_merge = False
1801+ PyList_Append(self._scheduled_nodes, node)
1802+
1803+ cdef _schedule_stack(self):
1804+ cdef _KnownGraphNode last_node, next_node
1805+ cdef _MergeSortNode ms_node, ms_last_node, ms_next_node
1806+ cdef long next_merge_depth
1807+ ordered = []
1808+ while self._last_stack_item >= 0:
1809+ # Peek at the last item on the stack
1810+ last_node = _get_list_node(self._depth_first_stack,
1811+ self._last_stack_item)
1812+ if last_node.gdfo == -1:
1813+ # if _find_gdfo skipped a node, that means there is a graph
1814+ # cycle, error out now
1815+ raise errors.GraphCycleError(self.graph._nodes)
1816+ ms_last_node = <_MergeSortNode>last_node.extra
1817+ if not ms_last_node.has_pending_parents():
1818+ # Processed all parents, pop this node
1819+ self._pop_node()
1820+ continue
1821+ while ms_last_node.has_pending_parents():
1822+ if ms_last_node.left_pending_parent is not None:
1823+ # recurse depth first into the primary parent
1824+ next_node = ms_last_node.left_pending_parent
1825+ ms_last_node.left_pending_parent = None
1826+ else:
1827+ # place any merges in right-to-left order for scheduling
1828+ # which gives us left-to-right order after we reverse
1829+ # the scheduled queue.
1830+ # Note: This has the effect of allocating common-new
1831+ # revisions to the right-most subtree rather than the
1832+ # left most, which will display nicely (you get
1833+ # smaller trees at the top of the combined merge).
1834+ next_node = ms_last_node.pending_parents.pop()
1835+ ms_next_node = self._get_ms_node(next_node)
1836+ if ms_next_node.completed:
1837+ # this parent was completed by a child on the
1838+ # call stack. skip it.
1839+ continue
1840+ # otherwise transfer it from the source graph into the
1841+ # top of the current depth first search stack.
1842+
1843+ if next_node is ms_last_node.left_parent:
1844+ next_merge_depth = ms_last_node.merge_depth
1845+ else:
1846+ next_merge_depth = ms_last_node.merge_depth + 1
1847+ self._push_node(next_node, next_merge_depth)
1848+ # and do not continue processing parents until this 'call'
1849+ # has recursed.
1850+ break
1851+
1852+ cdef topo_order(self):
1853+ cdef _MergeSortNode ms_node
1854+ cdef _KnownGraphNode node
1855+ cdef Py_ssize_t pos
1856+ cdef PyObject *temp_key, *temp_node
1857+
1858+ # Note: allocating a _MergeSortNode and deallocating it for all nodes
1859+ # costs approx 8.52ms (21%) of the total runtime
1860+ # We might consider moving the attributes into the base
1861+ # KnownGraph object.
1862+ self._schedule_stack()
1863+
1864+ # We've set up the basic schedule, now we can continue processing the
1865+ # output.
1866+ # Note: This final loop costs us 40.0ms => 28.8ms (11ms, 25%) on
1867+ # bzr.dev, to convert the internal Object representation into a
1868+ # Tuple representation...
1869+ # 2ms is walking the data and computing revno tuples
1870+ # 7ms is computing the return tuple
1871+ # 4ms is PyList_Append()
1872+ ordered = []
1873+ # output the result in reverse order, and separate the generated info
1874+ for pos from PyList_GET_SIZE(self._scheduled_nodes) > pos >= 0:
1875+ node = _get_list_node(self._scheduled_nodes, pos)
1876+ ms_node = <_MergeSortNode>node.extra
1877+ PyList_Append(ordered, ms_node)
1878+ node.extra = None
1879+ # Clear out the scheduled nodes now that we're done
1880+ self._scheduled_nodes = []
1881+ return ordered
1882
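Note on the new sort methods above: KnownGraph.topo_sort() is essentially Kahn's
algorithm run from the tails (nodes with no parents); a node is emitted once all
of its non-ghost parents have been emitted, and list slots are reused instead of
pop/append purely for speed. gc_sort() and merge_sort() build on the same walk
with extra ordering rules. A minimal pure-Python sketch of the topo_sort() idea,
for illustration only (it is not the accelerated implementation in the diff):

    def topo_sort(parent_map):
        # parent_map maps key -> tuple of parent keys; parents that are not
        # themselves keys of parent_map are treated as ghosts and ignored.
        children = dict((key, []) for key in parent_map)
        real_parents = {}
        for key, parents in parent_map.items():
            real_parents[key] = [p for p in parents if p in parent_map]
            for parent in real_parents[key]:
                children[parent].append(key)
        # The tails are the nodes with no (non-ghost) parents.
        pending = [key for key, parents in real_parents.items() if not parents]
        if not pending and parent_map:
            raise ValueError('graph contains a cycle')
        seen = dict.fromkeys(parent_map, 0)
        order = []
        while pending:
            key = pending.pop()
            order.append(key)
            for child in children[key]:
                seen[child] += 1
                if seen[child] == len(real_parents[child]):
                    # All parents of this child have been emitted.
                    pending.append(child)
        if len(order) != len(parent_map):
            raise ValueError('graph contains a cycle')
        return order
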
1883=== modified file 'bzrlib/annotate.py'
1884--- bzrlib/annotate.py 2009-07-08 17:09:03 +0000
1885+++ bzrlib/annotate.py 2009-08-17 18:52:01 +0000
1886@@ -188,6 +188,10 @@
1887 # or something.
1888 last_revision = current_rev.revision_id
1889 # XXX: Partially Cloned from branch, uses the old_get_graph, eep.
1890+ # XXX: The main difficulty is that we need to inject a single new node
1891+ # (current_rev) into the graph before it gets numbered, etc.
1892+ # Once KnownGraph gets an 'add_node()' function, we can use
1893+ # VF.get_known_graph_ancestry().
1894 graph = repository.get_graph()
1895 revision_graph = dict(((key, value) for key, value in
1896 graph.iter_ancestry(current_rev.parent_ids) if value is not None))
1897
1898=== modified file 'bzrlib/branch.py'
1899--- bzrlib/branch.py 2009-08-04 04:36:34 +0000
1900+++ bzrlib/branch.py 2009-08-19 18:04:49 +0000
1901@@ -446,15 +446,11 @@
1902 # start_revision_id.
1903 if self._merge_sorted_revisions_cache is None:
1904 last_revision = self.last_revision()
1905- graph = self.repository.get_graph()
1906- parent_map = dict(((key, value) for key, value in
1907- graph.iter_ancestry([last_revision]) if value is not None))
1908- revision_graph = repository._strip_NULL_ghosts(parent_map)
1909- revs = tsort.merge_sort(revision_graph, last_revision, None,
1910- generate_revno=True)
1911- # Drop the sequence # before caching
1912- self._merge_sorted_revisions_cache = [r[1:] for r in revs]
1913-
1914+ last_key = (last_revision,)
1915+ known_graph = self.repository.revisions.get_known_graph_ancestry(
1916+ [last_key])
1917+ self._merge_sorted_revisions_cache = known_graph.merge_sort(
1918+ last_key)
1919 filtered = self._filter_merge_sorted_revisions(
1920 self._merge_sorted_revisions_cache, start_revision_id,
1921 stop_revision_id, stop_rule)
1922@@ -470,27 +466,34 @@
1923 """Iterate over an inclusive range of sorted revisions."""
1924 rev_iter = iter(merge_sorted_revisions)
1925 if start_revision_id is not None:
1926- for rev_id, depth, revno, end_of_merge in rev_iter:
1927+ for node in rev_iter:
1928+ rev_id = node.key[-1]
1929 if rev_id != start_revision_id:
1930 continue
1931 else:
1932 # The decision to include the start or not
1933 # depends on the stop_rule if a stop is provided
1934- rev_iter = chain(
1935- iter([(rev_id, depth, revno, end_of_merge)]),
1936- rev_iter)
1937+ # so pop this node back into the iterator
1938+ rev_iter = chain(iter([node]), rev_iter)
1939 break
1940 if stop_revision_id is None:
1941- for rev_id, depth, revno, end_of_merge in rev_iter:
1942- yield rev_id, depth, revno, end_of_merge
1943+ # Yield everything
1944+ for node in rev_iter:
1945+ rev_id = node.key[-1]
1946+ yield (rev_id, node.merge_depth, node.revno,
1947+ node.end_of_merge)
1948 elif stop_rule == 'exclude':
1949- for rev_id, depth, revno, end_of_merge in rev_iter:
1950+ for node in rev_iter:
1951+ rev_id = node.key[-1]
1952 if rev_id == stop_revision_id:
1953 return
1954- yield rev_id, depth, revno, end_of_merge
1955+ yield (rev_id, node.merge_depth, node.revno,
1956+ node.end_of_merge)
1957 elif stop_rule == 'include':
1958- for rev_id, depth, revno, end_of_merge in rev_iter:
1959- yield rev_id, depth, revno, end_of_merge
1960+ for node in rev_iter:
1961+ rev_id = node.key[-1]
1962+ yield (rev_id, node.merge_depth, node.revno,
1963+ node.end_of_merge)
1964 if rev_id == stop_revision_id:
1965 return
1966 elif stop_rule == 'with-merges':
1967@@ -499,10 +502,12 @@
1968 left_parent = stop_rev.parent_ids[0]
1969 else:
1970 left_parent = _mod_revision.NULL_REVISION
1971- for rev_id, depth, revno, end_of_merge in rev_iter:
1972+ for node in rev_iter:
1973+ rev_id = node.key[-1]
1974 if rev_id == left_parent:
1975 return
1976- yield rev_id, depth, revno, end_of_merge
1977+ yield (rev_id, node.merge_depth, node.revno,
1978+ node.end_of_merge)
1979 else:
1980 raise ValueError('invalid stop_rule %r' % stop_rule)
1981
1982@@ -1147,6 +1152,9 @@
1983 revision_id: if not None, the revision history in the new branch will
1984 be truncated to end with revision_id.
1985 """
1986+ if (repository_policy is not None and
1987+ repository_policy.requires_stacking()):
1988+ to_bzrdir._format.require_stacking(_skip_repo=True)
1989 result = to_bzrdir.create_branch()
1990 result.lock_write()
1991 try:
1992@@ -2064,7 +2072,7 @@
1993 BranchFormat.register_format(__format6)
1994 BranchFormat.register_format(__format7)
1995 BranchFormat.register_format(__format8)
1996-BranchFormat.set_default_format(__format6)
1997+BranchFormat.set_default_format(__format7)
1998 _legacy_formats = [BzrBranchFormat4(),
1999 ]
2000 network_format_registry.register(
2001
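With this change Branch._merge_sorted_revisions_cache holds the node objects
returned by KnownGraph.merge_sort() rather than pre-built 4-tuples, and the
iteration helpers read .key, .merge_depth, .revno and .end_of_merge off each
node. A rough usage sketch, illustrative only (the attribute names come from the
diff; any existing branch is assumed):

    from bzrlib import branch as _mod_branch

    b = _mod_branch.Branch.open('.')
    b.lock_read()
    try:
        last_key = (b.last_revision(),)
        graph = b.repository.revisions.get_known_graph_ancestry([last_key])
        for node in graph.merge_sort(last_key):
            rev_id = node.key[-1]
            revno = '.'.join(str(n) for n in node.revno)
            # Indent merged revisions by their merge depth.
            print '%s%s %s' % ('  ' * node.merge_depth, revno, rev_id)
    finally:
        b.unlock()
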
2002=== modified file 'bzrlib/btree_index.py'
2003--- bzrlib/btree_index.py 2009-07-01 10:51:47 +0000
2004+++ bzrlib/btree_index.py 2009-08-17 22:11:06 +0000
2005@@ -586,13 +586,19 @@
2006 class _LeafNode(object):
2007 """A leaf node for a serialised B+Tree index."""
2008
2009- __slots__ = ('keys',)
2010+ __slots__ = ('keys', 'min_key', 'max_key')
2011
2012 def __init__(self, bytes, key_length, ref_list_length):
2013 """Parse bytes to create a leaf node object."""
2014 # splitlines mangles the \r delimiters.. don't use it.
2015- self.keys = dict(_btree_serializer._parse_leaf_lines(bytes,
2016- key_length, ref_list_length))
2017+ key_list = _btree_serializer._parse_leaf_lines(bytes,
2018+ key_length, ref_list_length)
2019+ if key_list:
2020+ self.min_key = key_list[0][0]
2021+ self.max_key = key_list[-1][0]
2022+ else:
2023+ self.min_key = self.max_key = None
2024+ self.keys = dict(key_list)
2025
2026
2027 class _InternalNode(object):
2028@@ -1039,6 +1045,39 @@
2029 output.append(cur_out)
2030 return output
2031
2032+ def _walk_through_internal_nodes(self, keys):
2033+ """Take the given set of keys, and find the corresponding LeafNodes.
2034+
2035+ :param keys: An unsorted iterable of keys to search for
2036+        :return: (nodes, keys_at_index)
2037+ nodes is a dict mapping {index: LeafNode}
2038+ keys_at_index is a list of tuples of [(index, [keys for Leaf])]
2039+ """
2040+ # 6 seconds spent in miss_torture using the sorted() line.
2041+ # Even with out of order disk IO it seems faster not to sort it when
2042+ # large queries are being made.
2043+ keys_at_index = [(0, sorted(keys))]
2044+
2045+ for row_pos, next_row_start in enumerate(self._row_offsets[1:-1]):
2046+ node_indexes = [idx for idx, s_keys in keys_at_index]
2047+ nodes = self._get_internal_nodes(node_indexes)
2048+
2049+ next_nodes_and_keys = []
2050+ for node_index, sub_keys in keys_at_index:
2051+ node = nodes[node_index]
2052+ positions = self._multi_bisect_right(sub_keys, node.keys)
2053+ node_offset = next_row_start + node.offset
2054+ next_nodes_and_keys.extend([(node_offset + pos, s_keys)
2055+ for pos, s_keys in positions])
2056+ keys_at_index = next_nodes_and_keys
2057+ # We should now be at the _LeafNodes
2058+ node_indexes = [idx for idx, s_keys in keys_at_index]
2059+
2060+ # TODO: We may *not* want to always read all the nodes in one
2061+ # big go. Consider setting a max size on this.
2062+ nodes = self._get_leaf_nodes(node_indexes)
2063+ return nodes, keys_at_index
2064+
2065 def iter_entries(self, keys):
2066 """Iterate over keys within the index.
2067
2068@@ -1082,32 +1121,7 @@
2069 needed_keys = keys
2070 if not needed_keys:
2071 return
2072- # 6 seconds spent in miss_torture using the sorted() line.
2073- # Even with out of order disk IO it seems faster not to sort it when
2074- # large queries are being made.
2075- needed_keys = sorted(needed_keys)
2076-
2077- nodes_and_keys = [(0, needed_keys)]
2078-
2079- for row_pos, next_row_start in enumerate(self._row_offsets[1:-1]):
2080- node_indexes = [idx for idx, s_keys in nodes_and_keys]
2081- nodes = self._get_internal_nodes(node_indexes)
2082-
2083- next_nodes_and_keys = []
2084- for node_index, sub_keys in nodes_and_keys:
2085- node = nodes[node_index]
2086- positions = self._multi_bisect_right(sub_keys, node.keys)
2087- node_offset = next_row_start + node.offset
2088- next_nodes_and_keys.extend([(node_offset + pos, s_keys)
2089- for pos, s_keys in positions])
2090- nodes_and_keys = next_nodes_and_keys
2091- # We should now be at the _LeafNodes
2092- node_indexes = [idx for idx, s_keys in nodes_and_keys]
2093-
2094- # TODO: We may *not* want to always read all the nodes in one
2095- # big go. Consider setting a max size on this.
2096-
2097- nodes = self._get_leaf_nodes(node_indexes)
2098+ nodes, nodes_and_keys = self._walk_through_internal_nodes(needed_keys)
2099 for node_index, sub_keys in nodes_and_keys:
2100 if not sub_keys:
2101 continue
2102@@ -1120,6 +1134,133 @@
2103 else:
2104 yield (self, next_sub_key, value)
2105
2106+ def _find_ancestors(self, keys, ref_list_num, parent_map, missing_keys):
2107+ """Find the parent_map information for the set of keys.
2108+
2109+ This populates the parent_map dict and missing_keys set based on the
2110+ queried keys. It also can fill out an arbitrary number of parents that
2111+ it finds while searching for the supplied keys.
2112+
2113+ It is unlikely that you want to call this directly. See
2114+ "CombinedGraphIndex.find_ancestry()" for a more appropriate API.
2115+
2116+        :param keys: The keys whose ancestry we want to return.
2117+ Every key will either end up in 'parent_map' or 'missing_keys'.
2118+ :param ref_list_num: This index in the ref_lists is the parents we
2119+ care about.
2120+ :param parent_map: {key: parent_keys} for keys that are present in this
2121+ index. This may contain more entries than were in 'keys', that are
2122+ reachable ancestors of the keys requested.
2123+ :param missing_keys: keys which are known to be missing in this index.
2124+ This may include parents that were not directly requested, but we
2125+ were able to determine that they are not present in this index.
2126+        :return: search_keys: parents that were found but not queried, so we do
2127+            not know whether they are missing or present. Callers can re-query
2128+            this index for those keys; they will go into parent_map or missing_keys.
2129+ """
2130+ if not self.key_count():
2131+ # We use key_count() to trigger reading the root node and
2132+ # determining info about this BTreeGraphIndex
2133+ # If we don't have any keys, then everything is missing
2134+ missing_keys.update(keys)
2135+ return set()
2136+ if ref_list_num >= self.node_ref_lists:
2137+ raise ValueError('No ref list %d, index has %d ref lists'
2138+ % (ref_list_num, self.node_ref_lists))
2139+
2140+ # The main trick we are trying to accomplish is that when we find a
2141+ # key listing its parents, we expect that the parent key is also likely
2142+        # to sit on the same page, allowing us to expand parents quickly
2143+ # without suffering the full stack of bisecting, etc.
2144+ nodes, nodes_and_keys = self._walk_through_internal_nodes(keys)
2145+
2146+ # These are parent keys which could not be immediately resolved on the
2147+ # page where the child was present. Note that we may already be
2148+ # searching for that key, and it may actually be present [or known
2149+ # missing] on one of the other pages we are reading.
2150+ # TODO:
2151+ # We could try searching for them in the immediate previous or next
2152+ # page. If they occur "later" we could put them in a pending lookup
2153+ # set, and then for each node we read thereafter we could check to
2154+ # see if they are present.
2155+ # However, we don't know the impact of keeping this list of things
2156+ # that I'm going to search for every node I come across from here on
2157+ # out.
2158+ # It doesn't handle the case when the parent key is missing on a
2159+ # page that we *don't* read. So we already have to handle being
2160+ # re-entrant for that.
2161+ # Since most keys contain a date string, they are more likely to be
2162+ # found earlier in the file than later, but we would know that right
2163+ # away (key < min_key), and wouldn't keep searching it on every other
2164+ # page that we read.
2165+ # Mostly, it is an idea, one which should be benchmarked.
2166+ parents_not_on_page = set()
2167+
2168+ for node_index, sub_keys in nodes_and_keys:
2169+ if not sub_keys:
2170+ continue
2171+ # sub_keys is all of the keys we are looking for that should exist
2172+ # on this page, if they aren't here, then they won't be found
2173+ node = nodes[node_index]
2174+ node_keys = node.keys
2175+ parents_to_check = set()
2176+ for next_sub_key in sub_keys:
2177+ if next_sub_key not in node_keys:
2178+ # This one is just not present in the index at all
2179+ missing_keys.add(next_sub_key)
2180+ else:
2181+ value, refs = node_keys[next_sub_key]
2182+ parent_keys = refs[ref_list_num]
2183+ parent_map[next_sub_key] = parent_keys
2184+ parents_to_check.update(parent_keys)
2185+ # Don't look for things we've already found
2186+ parents_to_check = parents_to_check.difference(parent_map)
2187+ # this can be used to test the benefit of having the check loop
2188+ # inlined.
2189+ # parents_not_on_page.update(parents_to_check)
2190+ # continue
2191+ while parents_to_check:
2192+ next_parents_to_check = set()
2193+ for key in parents_to_check:
2194+ if key in node_keys:
2195+ value, refs = node_keys[key]
2196+ parent_keys = refs[ref_list_num]
2197+ parent_map[key] = parent_keys
2198+ next_parents_to_check.update(parent_keys)
2199+ else:
2200+ # This parent either is genuinely missing, or should be
2201+ # found on another page. Perf test whether it is better
2202+ # to check if this node should fit on this page or not.
2203+ # in the 'everything-in-one-pack' scenario, this *not*
2204+ # doing the check is 237ms vs 243ms.
2205+ # So slightly better, but I assume the standard 'lots
2206+ # of packs' is going to show a reasonable improvement
2207+ # from the check, because it avoids 'going around
2208+ # again' for everything that is in another index
2209+ # parents_not_on_page.add(key)
2210+ # Missing for some reason
2211+ if key < node.min_key:
2212+ # in the case of bzr.dev, 3.4k/5.3k misses are
2213+ # 'earlier' misses (65%)
2214+ parents_not_on_page.add(key)
2215+ elif key > node.max_key:
2216+ # This parent key would be present on a different
2217+ # LeafNode
2218+ parents_not_on_page.add(key)
2219+ else:
2220+ # assert key != node.min_key and key != node.max_key
2221+ # If it was going to be present, it would be on
2222+ # *this* page, so mark it missing.
2223+ missing_keys.add(key)
2224+ parents_to_check = next_parents_to_check.difference(parent_map)
2225+ # Might want to do another .difference() from missing_keys
2226+ # parents_not_on_page could have been found on a different page, or be
2227+ # known to be missing. So cull out everything that has already been
2228+ # found.
2229+ search_keys = parents_not_on_page.difference(
2230+ parent_map).difference(missing_keys)
2231+ return search_keys
2232+
2233 def iter_entries_prefix(self, keys):
2234 """Iterate over keys within the index using prefix matching.
2235
2236
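The new BTreeGraphIndex._find_ancestors() walks the internal nodes once to find
the leaf pages for the requested keys, then keeps expanding parent keys while
they resolve on the same leaf. When a parent is absent from the leaf's key dict,
the new min_key/max_key attributes decide whether it might simply live on a
different page (hand it back to the caller as a search key) or must be missing
from this index. A sketch of just that classification rule, for illustration
only (the function name is invented here):

    def classify_absent_parent(key, leaf, parents_not_on_page, missing_keys):
        # 'leaf' is a _LeafNode with the min_key/max_key added in this diff.
        if key < leaf.min_key or key > leaf.max_key:
            # The key could be stored on another leaf page (or in another
            # index entirely), so ask the caller to keep searching for it.
            parents_not_on_page.add(key)
        else:
            # If the key existed in this index it would have to be on this
            # page, so it is known to be missing here.
            missing_keys.add(key)
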
2237=== modified file 'bzrlib/builtins.py'
2238--- bzrlib/builtins.py 2009-07-27 06:22:57 +0000
2239+++ bzrlib/builtins.py 2009-09-08 16:45:11 +0000
2240@@ -120,6 +120,15 @@
2241
2242
2243 def _get_one_revision_tree(command_name, revisions, branch=None, tree=None):
2244+ """Get a revision tree. Not suitable for commands that change the tree.
2245+
2246+ Specifically, the basis tree in dirstate trees is coupled to the dirstate
2247+ and doing a commit/uncommit/pull will at best fail due to changing the
2248+ basis revision data.
2249+
2250+    If tree is passed in, it should already be locked, for lifetime management
2251+    of the tree's internal cached state.
2252+ """
2253 if branch is None:
2254 branch = tree.branch
2255 if revisions is None:
2256@@ -452,8 +461,8 @@
2257 raise errors.BzrCommandError("You cannot remove the working tree"
2258 " of a remote path")
2259 if not force:
2260- # XXX: What about pending merges ? -- vila 20090629
2261- if working.has_changes(working.basis_tree()):
2262+ if (working.has_changes(working.basis_tree())
2263+ or len(working.get_parent_ids()) > 1):
2264 raise errors.UncommittedChanges(working)
2265
2266 working_path = working.bzrdir.root_transport.base
2267@@ -603,6 +612,9 @@
2268 branches that will be merged later (without showing the two different
2269 adds as a conflict). It is also useful when merging another project
2270 into a subdirectory of this one.
2271+
2272+ Any files matching patterns in the ignore list will not be added
2273+ unless they are explicitly mentioned.
2274 """
2275 takes_args = ['file*']
2276 takes_options = [
2277@@ -616,7 +628,7 @@
2278 help='Lookup file ids from this tree.'),
2279 ]
2280 encoding_type = 'replace'
2281- _see_also = ['remove']
2282+ _see_also = ['remove', 'ignore']
2283
2284 def run(self, file_list, no_recurse=False, dry_run=False, verbose=False,
2285 file_ids_from=None):
2286@@ -654,14 +666,6 @@
2287 for path in ignored[glob]:
2288 self.outf.write("ignored %s matching \"%s\"\n"
2289 % (path, glob))
2290- else:
2291- match_len = 0
2292- for glob, paths in ignored.items():
2293- match_len += len(paths)
2294- self.outf.write("ignored %d file(s).\n" % match_len)
2295- self.outf.write("If you wish to add ignored files, "
2296- "please add them explicitly by name. "
2297- "(\"bzr ignored\" gives a list)\n")
2298
2299
2300 class cmd_mkdir(Command):
2301@@ -1172,6 +1176,9 @@
2302 help='Hard-link working tree files where possible.'),
2303 Option('no-tree',
2304 help="Create a branch without a working-tree."),
2305+ Option('switch',
2306+ help="Switch the checkout in the current directory "
2307+ "to the new branch."),
2308 Option('stacked',
2309 help='Create a stacked branch referring to the source branch. '
2310 'The new branch will depend on the availability of the source '
2311@@ -1188,9 +1195,9 @@
2312
2313 def run(self, from_location, to_location=None, revision=None,
2314 hardlink=False, stacked=False, standalone=False, no_tree=False,
2315- use_existing_dir=False):
2316+ use_existing_dir=False, switch=False):
2317+ from bzrlib import switch as _mod_switch
2318 from bzrlib.tag import _merge_tags_if_possible
2319-
2320 accelerator_tree, br_from = bzrdir.BzrDir.open_tree_or_branch(
2321 from_location)
2322 if (accelerator_tree is not None and
2323@@ -1250,6 +1257,12 @@
2324 except (errors.NotStacked, errors.UnstackableBranchFormat,
2325 errors.UnstackableRepositoryFormat), e:
2326 note('Branched %d revision(s).' % branch.revno())
2327+ if switch:
2328+ # Switch to the new branch
2329+ wt, _ = WorkingTree.open_containing('.')
2330+ _mod_switch.switch(wt.bzrdir, branch)
2331+ note('Switched to branch: %s',
2332+ urlutils.unescape_for_display(branch.base, 'utf-8'))
2333 finally:
2334 br_from.unlock()
2335
2336@@ -3025,6 +3038,10 @@
2337 raise errors.BzrCommandError("empty commit message specified")
2338 return my_message
2339
2340+ # The API permits a commit with a filter of [] to mean 'select nothing'
2341+ # but the command line should not do that.
2342+ if not selected_list:
2343+ selected_list = None
2344 try:
2345 tree.commit(message_callback=get_message,
2346 specific_files=selected_list,
2347@@ -3365,6 +3382,8 @@
2348 Option('lsprof-timed',
2349 help='Generate lsprof output for benchmarked'
2350 ' sections of code.'),
2351+ Option('lsprof-tests',
2352+ help='Generate lsprof output for each test.'),
2353 Option('cache-dir', type=str,
2354 help='Cache intermediate benchmark output in this '
2355 'directory.'),
2356@@ -3411,7 +3430,7 @@
2357 first=False, list_only=False,
2358 randomize=None, exclude=None, strict=False,
2359 load_list=None, debugflag=None, starting_with=None, subunit=False,
2360- parallel=None):
2361+ parallel=None, lsprof_tests=False):
2362 from bzrlib.tests import selftest
2363 import bzrlib.benchmarks as benchmarks
2364 from bzrlib.benchmarks import tree_creator
2365@@ -3451,6 +3470,7 @@
2366 "transport": transport,
2367 "test_suite_factory": test_suite_factory,
2368 "lsprof_timed": lsprof_timed,
2369+ "lsprof_tests": lsprof_tests,
2370 "bench_history": benchfile,
2371 "matching_tests_first": first,
2372 "list_only": list_only,
2373@@ -3633,13 +3653,14 @@
2374 verified = 'inapplicable'
2375 tree = WorkingTree.open_containing(directory)[0]
2376
2377- # die as quickly as possible if there are uncommitted changes
2378 try:
2379 basis_tree = tree.revision_tree(tree.last_revision())
2380 except errors.NoSuchRevision:
2381 basis_tree = tree.basis_tree()
2382+
2383+ # die as quickly as possible if there are uncommitted changes
2384 if not force:
2385- if tree.has_changes(basis_tree):
2386+ if tree.has_changes(basis_tree) or len(tree.get_parent_ids()) > 1:
2387 raise errors.UncommittedChanges(tree)
2388
2389 view_info = _get_view_info_for_change_reporter(tree)
2390@@ -5627,8 +5648,12 @@
2391 if writer is None:
2392 writer = bzrlib.option.diff_writer_registry.get()
2393 try:
2394- Shelver.from_args(writer(sys.stdout), revision, all, file_list,
2395- message, destroy=destroy).run()
2396+ shelver = Shelver.from_args(writer(sys.stdout), revision, all,
2397+ file_list, message, destroy=destroy)
2398+ try:
2399+ shelver.run()
2400+ finally:
2401+ shelver.work_tree.unlock()
2402 except errors.UserAbort:
2403 return 0
2404
2405@@ -5673,7 +5698,11 @@
2406
2407 def run(self, shelf_id=None, action='apply'):
2408 from bzrlib.shelf_ui import Unshelver
2409- Unshelver.from_args(shelf_id, action).run()
2410+ unshelver = Unshelver.from_args(shelf_id, action)
2411+ try:
2412+ unshelver.run()
2413+ finally:
2414+ unshelver.tree.unlock()
2415
2416
2417 class cmd_clean_tree(Command):
2418
2419=== modified file 'bzrlib/bzrdir.py'
2420--- bzrlib/bzrdir.py 2009-08-04 14:48:59 +0000
2421+++ bzrlib/bzrdir.py 2009-08-21 02:10:06 +0000
2422@@ -129,9 +129,16 @@
2423 return True
2424
2425 def check_conversion_target(self, target_format):
2426+ """Check that a bzrdir as a whole can be converted to a new format."""
2427+ # The only current restriction is that the repository content can be
2428+ # fetched compatibly with the target.
2429 target_repo_format = target_format.repository_format
2430- source_repo_format = self._format.repository_format
2431- source_repo_format.check_conversion_target(target_repo_format)
2432+ try:
2433+ self.open_repository()._format.check_conversion_target(
2434+ target_repo_format)
2435+ except errors.NoRepositoryPresent:
2436+ # No repo, no problem.
2437+ pass
2438
2439 @staticmethod
2440 def _check_supported(format, allow_unsupported,
2441@@ -3039,7 +3046,8 @@
2442 new is _mod_branch.BzrBranchFormat8):
2443 branch_converter = _mod_branch.Converter7to8()
2444 else:
2445- raise errors.BadConversionTarget("No converter", new)
2446+ raise errors.BadConversionTarget("No converter", new,
2447+ branch._format)
2448 branch_converter.convert(branch)
2449 branch = self.bzrdir.open_branch()
2450 old = branch._format.__class__
2451@@ -3548,6 +3556,10 @@
2452 if self._require_stacking:
2453 raise
2454
2455+ def requires_stacking(self):
2456+ """Return True if this policy requires stacking."""
2457+ return self._stack_on is not None and self._require_stacking
2458+
2459 def _get_full_stack_on(self):
2460 """Get a fully-qualified URL for the stack_on location."""
2461 if self._stack_on is None:
2462@@ -3860,11 +3872,11 @@
2463 # The following format should be an alias for the rich root equivalent
2464 # of the default format
2465 format_registry.register_metadir('default-rich-root',
2466- 'bzrlib.repofmt.pack_repo.RepositoryFormatKnitPack4',
2467- help='Default format, rich root variant. (needed for bzr-svn and bzr-git).',
2468- branch_format='bzrlib.branch.BzrBranchFormat6',
2469- tree_format='bzrlib.workingtree.WorkingTreeFormat4',
2470+ 'bzrlib.repofmt.groupcompress_repo.RepositoryFormat2a',
2471+ branch_format='bzrlib.branch.BzrBranchFormat7',
2472+ tree_format='bzrlib.workingtree.WorkingTreeFormat6',
2473 alias=True,
2474- )
2475+ help='Same as 2a.')
2476+
2477 # The current format that is made on 'bzr init'.
2478-format_registry.set_default('pack-0.92')
2479+format_registry.set_default('2a')
2480
2481=== modified file 'bzrlib/commands.py'
2482--- bzrlib/commands.py 2009-06-19 09:06:56 +0000
2483+++ bzrlib/commands.py 2009-09-09 17:51:19 +0000
2484@@ -1028,13 +1028,13 @@
2485 ret = apply_coveraged(opt_coverage_dir, run, *run_argv)
2486 else:
2487 ret = run(*run_argv)
2488- if 'memory' in debug.debug_flags:
2489- trace.debug_memory('Process status after command:', short=False)
2490 return ret or 0
2491 finally:
2492 # reset, in case we may do other commands later within the same
2493 # process. Commands that want to execute sub-commands must propagate
2494 # --verbose in their own way.
2495+ if 'memory' in debug.debug_flags:
2496+ trace.debug_memory('Process status after command:', short=False)
2497 option._verbosity_level = saved_verbosity_level
2498
2499
2500
2501=== modified file 'bzrlib/commit.py'
2502--- bzrlib/commit.py 2009-08-25 05:09:42 +0000
2503+++ bzrlib/commit.py 2009-08-28 05:00:33 +0000
2504@@ -209,7 +209,8 @@
2505 :param timestamp: if not None, seconds-since-epoch for a
2506 postdated/predated commit.
2507
2508- :param specific_files: If true, commit only those files.
2509+ :param specific_files: If not None, commit only those files. An empty
2510+ list means 'commit no files'.
2511
2512 :param rev_id: If set, use this as the new revision id.
2513 Useful for test or import commands that need to tightly
2514@@ -264,6 +265,8 @@
2515 self.master_locked = False
2516 self.recursive = recursive
2517 self.rev_id = None
2518+ # self.specific_files is None to indicate no filter, or any iterable to
2519+ # indicate a filter - [] means no files at all, as per iter_changes.
2520 if specific_files is not None:
2521 self.specific_files = sorted(
2522 minimum_path_selection(specific_files))
2523@@ -285,7 +288,6 @@
2524 # the command line parameters, and the repository has fast delta
2525 # generation. See bug 347649.
2526 self.use_record_iter_changes = (
2527- not self.specific_files and
2528 not self.exclude and
2529 not self.branch.repository._format.supports_tree_reference and
2530 (self.branch.repository._format.fast_deltas or
2531@@ -333,7 +335,7 @@
2532 self._gather_parents()
2533 # After a merge, a selected file commit is not supported.
2534 # See 'bzr help merge' for an explanation as to why.
2535- if len(self.parents) > 1 and self.specific_files:
2536+ if len(self.parents) > 1 and self.specific_files is not None:
2537 raise errors.CannotCommitSelectedFileMerge(self.specific_files)
2538 # Excludes are a form of selected file commit.
2539 if len(self.parents) > 1 and self.exclude:
2540@@ -619,12 +621,13 @@
2541 """Update the commit builder with the data about what has changed.
2542 """
2543 exclude = self.exclude
2544- specific_files = self.specific_files or []
2545+ specific_files = self.specific_files
2546 mutter("Selecting files for commit with filter %s", specific_files)
2547
2548 self._check_strict()
2549 if self.use_record_iter_changes:
2550- iter_changes = self.work_tree.iter_changes(self.basis_tree)
2551+ iter_changes = self.work_tree.iter_changes(self.basis_tree,
2552+ specific_files=specific_files)
2553 iter_changes = self._filter_iter_changes(iter_changes)
2554 for file_id, path, fs_hash in self.builder.record_iter_changes(
2555 self.work_tree, self.basis_revid, iter_changes):
2556
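Together with the builtins.py change above, the selected-file handling now
distinguishes three cases for specific_files: None means no filter (commit
everything that changed), a non-empty list restricts the commit to those paths,
and an empty list means record no file changes at all (which the command line
deliberately never passes). A small illustrative sketch, assuming an existing
working tree wt with some modified files:

    wt.commit('commit everything')                        # specific_files=None
    wt.commit('commit only foo', specific_files=['foo'])  # restrict to 'foo'
    wt.commit('record no file changes', specific_files=[])
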
2557=== modified file 'bzrlib/config.py'
2558--- bzrlib/config.py 2009-07-02 08:59:16 +0000
2559+++ bzrlib/config.py 2009-08-20 04:53:23 +0000
2560@@ -821,6 +821,29 @@
2561 return osutils.pathjoin(config_dir(), 'ignore')
2562
2563
2564+def crash_dir():
2565+ """Return the directory name to store crash files.
2566+
2567+ This doesn't implicitly create it.
2568+
2569+ On Windows it's in the config directory; elsewhere in the XDG cache directory.
2570+ """
2571+ if sys.platform == 'win32':
2572+ return osutils.pathjoin(config_dir(), 'Crash')
2573+ else:
2574+ return osutils.pathjoin(xdg_cache_dir(), 'crash')
2575+
2576+
2577+def xdg_cache_dir():
2578+ # See http://standards.freedesktop.org/basedir-spec/latest/ar01s03.html
2579+ # Possibly this should be different on Windows?
2580+ e = os.environ.get('XDG_CACHE_DIR', None)
2581+ if e:
2582+ return e
2583+ else:
2584+ return os.path.expanduser('~/.cache')
2585+
2586+
2587 def _auto_user_id():
2588 """Calculate automatic user identification.
2589
2590
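crash_dir() deliberately does not create the directory; it only computes where
crash reports should go: <config_dir>/Crash on Windows, otherwise 'crash' under
the XDG cache directory (taken from the XDG_CACHE_DIR environment variable as
read by xdg_cache_dir(), falling back to ~/.cache). For illustration only:

    import os
    from bzrlib import config

    os.environ['XDG_CACHE_DIR'] = '/tmp/my-cache'
    print config.xdg_cache_dir()   # -> /tmp/my-cache
    print config.crash_dir()       # -> /tmp/my-cache/crash (non-Windows platforms)
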
2591=== added file 'bzrlib/crash.py'
2592--- bzrlib/crash.py 1970-01-01 00:00:00 +0000
2593+++ bzrlib/crash.py 2009-08-20 05:47:53 +0000
2594@@ -0,0 +1,163 @@
2595+# Copyright (C) 2009 Canonical Ltd
2596+#
2597+# This program is free software; you can redistribute it and/or modify
2598+# it under the terms of the GNU General Public License as published by
2599+# the Free Software Foundation; either version 2 of the License, or
2600+# (at your option) any later version.
2601+#
2602+# This program is distributed in the hope that it will be useful,
2603+# but WITHOUT ANY WARRANTY; without even the implied warranty of
2604+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
2605+# GNU General Public License for more details.
2606+#
2607+# You should have received a copy of the GNU General Public License
2608+# along with this program; if not, write to the Free Software
2609+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
2610+
2611+
2612+"""Handling and reporting crashes.
2613+"""
2614+
2615+# for interactive testing, try the 'bzr assert-fail' command
2616+# or see http://code.launchpad.net/~mbp/bzr/bzr-fail
2617+
2618+import os
2619+import platform
2620+import pprint
2621+import sys
2622+import time
2623+from StringIO import StringIO
2624+
2625+import bzrlib
2626+from bzrlib import (
2627+ config,
2628+ debug,
2629+ osutils,
2630+ plugin,
2631+ trace,
2632+ )
2633+
2634+
2635+def report_bug(exc_info, stderr):
2636+ if 'no_apport' not in debug.debug_flags:
2637+ try:
2638+ report_bug_to_apport(exc_info, stderr)
2639+ return
2640+ except ImportError, e:
2641+ trace.mutter("couldn't find apport bug-reporting library: %s" % e)
2642+ pass
2643+ except Exception, e:
2644+ # this should only happen if apport is installed but it didn't
2645+ # work, eg because of an io error writing the crash file
2646+ sys.stderr.write("bzr: failed to report crash using apport:\n "
2647+ " %r\n" % e)
2648+ pass
2649+ report_bug_legacy(exc_info, stderr)
2650+
2651+
2652+def report_bug_legacy(exc_info, err_file):
2653+ """Report a bug by just printing a message to the user."""
2654+ trace.print_exception(exc_info, err_file)
2655+ err_file.write('\n')
2656+ err_file.write('bzr %s on python %s (%s)\n' % \
2657+ (bzrlib.__version__,
2658+ bzrlib._format_version_tuple(sys.version_info),
2659+ platform.platform(aliased=1)))
2660+ err_file.write('arguments: %r\n' % sys.argv)
2661+ err_file.write(
2662+ 'encoding: %r, fsenc: %r, lang: %r\n' % (
2663+ osutils.get_user_encoding(), sys.getfilesystemencoding(),
2664+ os.environ.get('LANG')))
2665+ err_file.write("plugins:\n")
2666+ err_file.write(_format_plugin_list())
2667+ err_file.write(
2668+ "\n\n"
2669+ "*** Bazaar has encountered an internal error. This probably indicates a\n"
2670+ " bug in Bazaar. You can help us fix it by filing a bug report at\n"
2671+ " https://bugs.launchpad.net/bzr/+filebug\n"
2672+ " including this traceback and a description of the problem.\n"
2673+ )
2674+
2675+
2676+def report_bug_to_apport(exc_info, stderr):
2677+ """Report a bug to apport for optional automatic filing.
2678+ """
2679+ # this is based on apport_package_hook.py, but omitting some of the
2680+ # Ubuntu-specific policy about what to report and when
2681+
2682+    # if this fails it's caught at a higher level; we don't want to open the
2683+ # crash file unless apport can be loaded.
2684+ import apport
2685+
2686+ crash_file = _open_crash_file()
2687+ try:
2688+ _write_apport_report_to_file(exc_info, crash_file)
2689+ finally:
2690+ crash_file.close()
2691+
2692+ stderr.write("bzr: ERROR: %s.%s: %s\n"
2693+ "\n"
2694+ "*** Bazaar has encountered an internal error. This probably indicates a\n"
2695+ " bug in Bazaar. You can help us fix it by filing a bug report at\n"
2696+ " https://bugs.launchpad.net/bzr/+filebug\n"
2697+ " attaching the crash file\n"
2698+ " %s\n"
2699+ " and including a description of the problem.\n"
2700+ "\n"
2701+ " The crash file is plain text and you can inspect or edit it to remove\n"
2702+ " private information.\n"
2703+ % (exc_info[0].__module__, exc_info[0].__name__, exc_info[1],
2704+ crash_file.name))
2705+
2706+
2707+def _write_apport_report_to_file(exc_info, crash_file):
2708+ import traceback
2709+ from apport.report import Report
2710+
2711+ exc_type, exc_object, exc_tb = exc_info
2712+
2713+ pr = Report()
2714+ # add_proc_info gives you the memory map of the process: this seems rarely
2715+ # useful for Bazaar and it does make the report harder to scan, though it
2716+ # does tell you what binary modules are loaded.
2717+ # pr.add_proc_info()
2718+ pr.add_user_info()
2719+ pr['CommandLine'] = pprint.pformat(sys.argv)
2720+ pr['BzrVersion'] = bzrlib.__version__
2721+ pr['PythonVersion'] = bzrlib._format_version_tuple(sys.version_info)
2722+ pr['Platform'] = platform.platform(aliased=1)
2723+ pr['UserEncoding'] = osutils.get_user_encoding()
2724+ pr['FileSystemEncoding'] = sys.getfilesystemencoding()
2725+ pr['Locale'] = os.environ.get('LANG')
2726+ pr['BzrPlugins'] = _format_plugin_list()
2727+ pr['PythonLoadedModules'] = _format_module_list()
2728+ pr['BzrDebugFlags'] = pprint.pformat(debug.debug_flags)
2729+
2730+ tb_file = StringIO()
2731+ traceback.print_exception(exc_type, exc_object, exc_tb, file=tb_file)
2732+ pr['Traceback'] = tb_file.getvalue()
2733+
2734+ pr.write(crash_file)
2735+
2736+
2737+def _open_crash_file():
2738+ crash_dir = config.crash_dir()
2739+ # user-readable only, just in case the contents are sensitive.
2740+ if not osutils.isdir(crash_dir):
2741+ os.makedirs(crash_dir, mode=0700)
2742+ filename = 'bzr-%s-%s.crash' % (
2743+ osutils.compact_date(time.time()),
2744+ os.getpid(),)
2745+ return open(osutils.pathjoin(crash_dir, filename), 'wt')
2746+
2747+
2748+def _format_plugin_list():
2749+ plugin_lines = []
2750+ for name, a_plugin in sorted(plugin.plugins().items()):
2751+ plugin_lines.append(" %-20s %s [%s]" %
2752+ (name, a_plugin.path(), a_plugin.__version__))
2753+ return '\n'.join(plugin_lines)
2754+
2755+
2756+def _format_module_list():
2757+ return pprint.pformat(sys.modules)
2758
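report_bug() tries apport first, writing a plain-text .crash file under
config.crash_dir(), and falls back to the traditional traceback report if apport
is not importable or fails for any reason (it can also be skipped with
-Dno_apport). A minimal sketch of how a top-level handler could route an
unexpected exception through it (bzr's own main exception handler is the real
call site; this is illustrative only):

    import sys
    from bzrlib import crash

    try:
        raise AssertionError('simulated internal error')
    except AssertionError:
        crash.report_bug(sys.exc_info(), sys.stderr)
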
2759=== modified file 'bzrlib/dirstate.py'
2760--- bzrlib/dirstate.py 2009-07-27 05:44:19 +0000
2761+++ bzrlib/dirstate.py 2009-08-28 05:00:33 +0000
2762@@ -3166,15 +3166,18 @@
2763
2764 __slots__ = ["old_dirname_to_file_id", "new_dirname_to_file_id",
2765 "last_source_parent", "last_target_parent", "include_unchanged",
2766- "use_filesystem_for_exec", "utf8_decode", "searched_specific_files",
2767- "search_specific_files", "state", "source_index", "target_index",
2768- "want_unversioned", "tree"]
2769+ "partial", "use_filesystem_for_exec", "utf8_decode",
2770+ "searched_specific_files", "search_specific_files",
2771+ "searched_exact_paths", "search_specific_file_parents", "seen_ids",
2772+ "state", "source_index", "target_index", "want_unversioned", "tree"]
2773
2774 def __init__(self, include_unchanged, use_filesystem_for_exec,
2775 search_specific_files, state, source_index, target_index,
2776 want_unversioned, tree):
2777 self.old_dirname_to_file_id = {}
2778 self.new_dirname_to_file_id = {}
2779+ # Are we doing a partial iter_changes?
2780+ self.partial = search_specific_files != set([''])
2781 # Using a list so that we can access the values and change them in
2782 # nested scope. Each one is [path, file_id, entry]
2783 self.last_source_parent = [None, None]
2784@@ -3183,14 +3186,25 @@
2785 self.use_filesystem_for_exec = use_filesystem_for_exec
2786 self.utf8_decode = cache_utf8._utf8_decode
2787 # for all search_indexs in each path at or under each element of
2788- # search_specific_files, if the detail is relocated: add the id, and add the
2789- # relocated path as one to search if its not searched already. If the
2790- # detail is not relocated, add the id.
2791+ # search_specific_files, if the detail is relocated: add the id, and
2792+        # add the relocated path as one to search if it's not searched already.
2793+ # If the detail is not relocated, add the id.
2794 self.searched_specific_files = set()
2795+ # When we search exact paths without expanding downwards, we record
2796+ # that here.
2797+ self.searched_exact_paths = set()
2798 self.search_specific_files = search_specific_files
2799+ # The parents up to the root of the paths we are searching.
2800+ # After all normal paths are returned, these specific items are returned.
2801+ self.search_specific_file_parents = set()
2802+ # The ids we've sent out in the delta.
2803+ self.seen_ids = set()
2804 self.state = state
2805 self.source_index = source_index
2806 self.target_index = target_index
2807+ if target_index != 0:
2808+ # A lot of code in here depends on target_index == 0
2809+ raise errors.BzrError('unsupported target index')
2810 self.want_unversioned = want_unversioned
2811 self.tree = tree
2812
2813@@ -3198,15 +3212,15 @@
2814 """Compare an entry and real disk to generate delta information.
2815
2816 :param path_info: top_relpath, basename, kind, lstat, abspath for
2817- the path of entry. If None, then the path is considered absent.
2818- (Perhaps we should pass in a concrete entry for this ?)
2819+ the path of entry. If None, then the path is considered absent in
2820+ the target (Perhaps we should pass in a concrete entry for this ?)
2821 Basename is returned as a utf8 string because we expect this
2822 tuple will be ignored, and don't want to take the time to
2823 decode.
2824 :return: (iter_changes_result, changed). If the entry has not been
2825 handled then changed is None. Otherwise it is False if no content
2826- or metadata changes have occured, and None if any content or
2827- metadata change has occured. If self.include_unchanged is True then
2828+ or metadata changes have occurred, and True if any content or
2829+ metadata change has occurred. If self.include_unchanged is True then
2830 if changed is not None, iter_changes_result will always be a result
2831 tuple. Otherwise, iter_changes_result is None unless changed is
2832 True.
2833@@ -3463,6 +3477,25 @@
2834 def __iter__(self):
2835 return self
2836
2837+ def _gather_result_for_consistency(self, result):
2838+ """Check a result we will yield to make sure we are consistent later.
2839+
2840+ This gathers result's parents into a set to output later.
2841+
2842+ :param result: A result tuple.
2843+ """
2844+ if not self.partial or not result[0]:
2845+ return
2846+ self.seen_ids.add(result[0])
2847+ new_path = result[1][1]
2848+ if new_path:
2849+ # Not the root and not a delete: queue up the parents of the path.
2850+ self.search_specific_file_parents.update(
2851+ osutils.parent_directories(new_path.encode('utf8')))
2852+ # Add the root directory which parent_directories does not
2853+ # provide.
2854+ self.search_specific_file_parents.add('')
2855+
2856 def iter_changes(self):
2857 """Iterate over the changes."""
2858 utf8_decode = cache_utf8._utf8_decode
2859@@ -3546,6 +3579,8 @@
2860 result, changed = _process_entry(entry, root_dir_info)
2861 if changed is not None:
2862 path_handled = True
2863+ if changed:
2864+ self._gather_result_for_consistency(result)
2865 if changed or self.include_unchanged:
2866 yield result
2867 if self.want_unversioned and not path_handled and root_dir_info:
2868@@ -3664,6 +3699,8 @@
2869 # advance the entry only, after processing.
2870 result, changed = _process_entry(current_entry, None)
2871 if changed is not None:
2872+ if changed:
2873+ self._gather_result_for_consistency(result)
2874 if changed or self.include_unchanged:
2875 yield result
2876 block_index +=1
2877@@ -3702,6 +3739,8 @@
2878 # no path is fine: the per entry code will handle it.
2879 result, changed = _process_entry(current_entry, current_path_info)
2880 if changed is not None:
2881+ if changed:
2882+ self._gather_result_for_consistency(result)
2883 if changed or self.include_unchanged:
2884 yield result
2885 elif (current_entry[0][1] != current_path_info[1]
2886@@ -3723,6 +3762,8 @@
2887 # advance the entry only, after processing.
2888 result, changed = _process_entry(current_entry, None)
2889 if changed is not None:
2890+ if changed:
2891+ self._gather_result_for_consistency(result)
2892 if changed or self.include_unchanged:
2893 yield result
2894 advance_path = False
2895@@ -3730,6 +3771,8 @@
2896 result, changed = _process_entry(current_entry, current_path_info)
2897 if changed is not None:
2898 path_handled = True
2899+ if changed:
2900+ self._gather_result_for_consistency(result)
2901 if changed or self.include_unchanged:
2902 yield result
2903 if advance_entry and current_entry is not None:
2904@@ -3795,6 +3838,124 @@
2905 current_dir_info = dir_iterator.next()
2906 except StopIteration:
2907 current_dir_info = None
2908+ for result in self._iter_specific_file_parents():
2909+ yield result
2910+
2911+ def _iter_specific_file_parents(self):
2912+ """Iter over the specific file parents."""
2913+ while self.search_specific_file_parents:
2914+ # Process the parent directories for the paths we were iterating.
2915+ # Even in extremely large trees this should be modest, so currently
2916+ # no attempt is made to optimise.
2917+ path_utf8 = self.search_specific_file_parents.pop()
2918+ if osutils.is_inside_any(self.searched_specific_files, path_utf8):
2919+ # We've examined this path.
2920+ continue
2921+ if path_utf8 in self.searched_exact_paths:
2922+ # We've examined this path.
2923+ continue
2924+ path_entries = self.state._entries_for_path(path_utf8)
2925+ # We need either one or two entries. If the path in
2926+ # self.target_index has moved (so the entry in source_index is in
2927+ # 'ar') then we need to also look for the entry for this path in
2928+ # self.source_index, to output the appropriate delete-or-rename.
2929+ selected_entries = []
2930+ found_item = False
2931+ for candidate_entry in path_entries:
2932+ # Find entries present in target at this path:
2933+ if candidate_entry[1][self.target_index][0] not in 'ar':
2934+ found_item = True
2935+ selected_entries.append(candidate_entry)
2936+ # Find entries present in source at this path:
2937+ elif (self.source_index is not None and
2938+ candidate_entry[1][self.source_index][0] not in 'ar'):
2939+ found_item = True
2940+ if candidate_entry[1][self.target_index][0] == 'a':
2941+ # Deleted, emit it here.
2942+ selected_entries.append(candidate_entry)
2943+ else:
2944+ # renamed, emit it when we process the directory it
2945+ # ended up at.
2946+ self.search_specific_file_parents.add(
2947+ candidate_entry[1][self.target_index][1])
2948+ if not found_item:
2949+ raise AssertionError(
2950+ "Missing entry for specific path parent %r, %r" % (
2951+ path_utf8, path_entries))
2952+ path_info = self._path_info(path_utf8, path_utf8.decode('utf8'))
2953+ for entry in selected_entries:
2954+ if entry[0][2] in self.seen_ids:
2955+ continue
2956+ result, changed = self._process_entry(entry, path_info)
2957+ if changed is None:
2958+ raise AssertionError(
2959+ "Got entry<->path mismatch for specific path "
2960+ "%r entry %r path_info %r " % (
2961+ path_utf8, entry, path_info))
2962+                # Only include changes - we're outside the user's requested
2963+ # expansion.
2964+ if changed:
2965+ self._gather_result_for_consistency(result)
2966+ if (result[6][0] == 'directory' and
2967+ result[6][1] != 'directory'):
2968+ # This stopped being a directory, the old children have
2969+ # to be included.
2970+ if entry[1][self.source_index][0] == 'r':
2971+ # renamed, take the source path
2972+ entry_path_utf8 = entry[1][self.source_index][1]
2973+ else:
2974+ entry_path_utf8 = path_utf8
2975+ initial_key = (entry_path_utf8, '', '')
2976+ block_index, _ = self.state._find_block_index_from_key(
2977+ initial_key)
2978+ if block_index == 0:
2979+ # The children of the root are in block index 1.
2980+ block_index +=1
2981+ current_block = None
2982+ if block_index < len(self.state._dirblocks):
2983+ current_block = self.state._dirblocks[block_index]
2984+ if not osutils.is_inside(
2985+ entry_path_utf8, current_block[0]):
2986+ # No entries for this directory at all.
2987+ current_block = None
2988+ if current_block is not None:
2989+ for entry in current_block[1]:
2990+ if entry[1][self.source_index][0] in 'ar':
2991+ # Not in the source tree, so doesn't have to be
2992+ # included.
2993+ continue
2994+ # Path of the entry itself.
2995+
2996+ self.search_specific_file_parents.add(
2997+ osutils.pathjoin(*entry[0][:2]))
2998+ if changed or self.include_unchanged:
2999+ yield result
3000+ self.searched_exact_paths.add(path_utf8)
3001+
3002+ def _path_info(self, utf8_path, unicode_path):
3003+ """Generate path_info for unicode_path.
3004+
3005+ :return: None if unicode_path does not exist, or a path_info tuple.
3006+ """
3007+ abspath = self.tree.abspath(unicode_path)
3008+ try:
3009+ stat = os.lstat(abspath)
3010+ except OSError, e:
3011+ if e.errno == errno.ENOENT:
3012+ # the path does not exist.
3013+ return None
3014+ else:
3015+ raise
3016+ utf8_basename = utf8_path.rsplit('/', 1)[-1]
3017+ dir_info = (utf8_path, utf8_basename,
3018+ osutils.file_kind_from_stat_mode(stat.st_mode), stat,
3019+ abspath)
3020+ if dir_info[2] == 'directory':
3021+ if self.tree._directory_is_tree_reference(
3022+ unicode_path):
3023+ self.root_dir_info = self.root_dir_info[:2] + \
3024+ ('tree-reference',) + self.root_dir_info[3:]
3025+ return dir_info
3026
3027
3028 # Try to load the compiled form if possible
3029
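The new _gather_result_for_consistency/_iter_specific_file_parents pair above
works by queueing the parent directories of every changed path (plus the root)
so they can be emitted later, keeping a specific-files delta consistent. The
following standalone sketch is not bzrlib code; parent_directories() is a
simplified stand-in for bzrlib.osutils.parent_directories():

    import posixpath

    def parent_directories(path):
        # Simplified: yields 'dir/sub', 'dir' for 'dir/sub/file.txt'.
        parents = []
        while '/' in path:
            path = posixpath.dirname(path)
            parents.append(path)
        return parents

    search_specific_file_parents = set()
    for changed_path in ['dir/sub/file.txt', 'other/thing']:
        search_specific_file_parents.update(parent_directories(changed_path))
        search_specific_file_parents.add('')  # the root, added explicitly
    print(sorted(search_specific_file_parents))
    # ['', 'dir', 'dir/sub', 'other']
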
3030=== modified file 'bzrlib/errors.py'
3031--- bzrlib/errors.py 2009-07-14 21:07:36 +0000
3032+++ bzrlib/errors.py 2009-08-30 21:34:42 +0000
3033@@ -793,6 +793,12 @@
3034
3035
3036 class IncompatibleRepositories(BzrError):
3037+ """Report an error that two repositories are not compatible.
3038+
3039+ Note that the source and target repositories are permitted to be strings:
3040+ this exception is thrown from the smart server and may refer to a
3041+ repository the client hasn't opened.
3042+ """
3043
3044 _fmt = "%(target)s\n" \
3045 "is not compatible with\n" \
3046@@ -2006,12 +2012,14 @@
3047
3048 class BadConversionTarget(BzrError):
3049
3050- _fmt = "Cannot convert to format %(format)s. %(problem)s"
3051+ _fmt = "Cannot convert from format %(from_format)s to format %(format)s." \
3052+ " %(problem)s"
3053
3054- def __init__(self, problem, format):
3055+ def __init__(self, problem, format, from_format=None):
3056 BzrError.__init__(self)
3057 self.problem = problem
3058 self.format = format
3059+ self.from_format = from_format or '(unspecified)'
3060
3061
3062 class NoDiffFound(BzrError):
3063@@ -2918,8 +2926,9 @@
3064 _fmt = 'Cannot bind address "%(host)s:%(port)i": %(orig_error)s.'
3065
3066 def __init__(self, host, port, orig_error):
3067+ # nb: in python2.4 socket.error doesn't have a useful repr
3068 BzrError.__init__(self, host=host, port=port,
3069- orig_error=orig_error[1])
3070+ orig_error=repr(orig_error.args))
3071
3072
3073 class UnknownRules(BzrError):
3074
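A small, hypothetical usage sketch for the widened BadConversionTarget
signature; the format names below are made up and the printed text simply
follows the _fmt template above:

    from bzrlib import errors

    e = errors.BadConversionTarget('Does not support rich roots.', '2a',
                                   from_format='pack-0.92')
    print(str(e))
    # roughly: Cannot convert from format pack-0.92 to format 2a.
    #          Does not support rich roots.
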
3075=== modified file 'bzrlib/fetch.py'
3076--- bzrlib/fetch.py 2009-07-09 08:59:51 +0000
3077+++ bzrlib/fetch.py 2009-08-07 04:29:36 +0000
3078@@ -25,16 +25,21 @@
3079
3080 import operator
3081
3082+from bzrlib.lazy_import import lazy_import
3083+lazy_import(globals(), """
3084+from bzrlib import (
3085+ tsort,
3086+ versionedfile,
3087+ )
3088+""")
3089 import bzrlib
3090 from bzrlib import (
3091 errors,
3092 symbol_versioning,
3093 )
3094 from bzrlib.revision import NULL_REVISION
3095-from bzrlib.tsort import topo_sort
3096 from bzrlib.trace import mutter
3097 import bzrlib.ui
3098-from bzrlib.versionedfile import FulltextContentFactory
3099
3100
3101 class RepoFetcher(object):
3102@@ -213,11 +218,9 @@
3103
3104 def _find_root_ids(self, revs, parent_map, graph):
3105 revision_root = {}
3106- planned_versions = {}
3107 for tree in self.iter_rev_trees(revs):
3108 revision_id = tree.inventory.root.revision
3109 root_id = tree.get_root_id()
3110- planned_versions.setdefault(root_id, []).append(revision_id)
3111 revision_root[revision_id] = root_id
3112 # Find out which parents we don't already know root ids for
3113 parents = set()
3114@@ -229,7 +232,7 @@
3115 for tree in self.iter_rev_trees(parents):
3116 root_id = tree.get_root_id()
3117 revision_root[tree.get_revision_id()] = root_id
3118- return revision_root, planned_versions
3119+ return revision_root
3120
3121 def generate_root_texts(self, revs):
3122 """Generate VersionedFiles for all root ids.
3123@@ -238,9 +241,8 @@
3124 """
3125 graph = self.source.get_graph()
3126 parent_map = graph.get_parent_map(revs)
3127- rev_order = topo_sort(parent_map)
3128- rev_id_to_root_id, root_id_to_rev_ids = self._find_root_ids(
3129- revs, parent_map, graph)
3130+ rev_order = tsort.topo_sort(parent_map)
3131+ rev_id_to_root_id = self._find_root_ids(revs, parent_map, graph)
3132 root_id_order = [(rev_id_to_root_id[rev_id], rev_id) for rev_id in
3133 rev_order]
3134 # Guaranteed stable, this groups all the file id operations together
3135@@ -249,20 +251,93 @@
3136 # yet, and are unlikely to in non-rich-root environments anyway.
3137 root_id_order.sort(key=operator.itemgetter(0))
3138 # Create a record stream containing the roots to create.
3139- def yield_roots():
3140- for key in root_id_order:
3141- root_id, rev_id = key
3142- rev_parents = parent_map[rev_id]
3143- # We drop revision parents with different file-ids, because
3144- # that represents a rename of the root to a different location
3145- # - its not actually a parent for us. (We could look for that
3146- # file id in the revision tree at considerably more expense,
3147- # but for now this is sufficient (and reconcile will catch and
3148- # correct this anyway).
3149- # When a parent revision is a ghost, we guess that its root id
3150- # was unchanged (rather than trimming it from the parent list).
3151- parent_keys = tuple((root_id, parent) for parent in rev_parents
3152- if parent != NULL_REVISION and
3153- rev_id_to_root_id.get(parent, root_id) == root_id)
3154- yield FulltextContentFactory(key, parent_keys, None, '')
3155- return [('texts', yield_roots())]
3156+ from bzrlib.graph import FrozenHeadsCache
3157+ graph = FrozenHeadsCache(graph)
3158+ new_roots_stream = _new_root_data_stream(
3159+ root_id_order, rev_id_to_root_id, parent_map, self.source, graph)
3160+ return [('texts', new_roots_stream)]
3161+
3162+
3163+def _new_root_data_stream(
3164+ root_keys_to_create, rev_id_to_root_id_map, parent_map, repo, graph=None):
3165+ """Generate a texts substream of synthesised root entries.
3166+
3167+ Used in fetches that do rich-root upgrades.
3168+
3169+ :param root_keys_to_create: iterable of (root_id, rev_id) pairs describing
3170+ the root entries to create.
3171+ :param rev_id_to_root_id_map: dict of known rev_id -> root_id mappings for
3172+ calculating the parents. If a parent rev_id is not found here then it
3173+ will be recalculated.
3174+ :param parent_map: a parent map for all the revisions in
3175+ root_keys_to_create.
3176+ :param graph: a graph to use instead of repo.get_graph().
3177+ """
3178+ for root_key in root_keys_to_create:
3179+ root_id, rev_id = root_key
3180+ parent_keys = _parent_keys_for_root_version(
3181+ root_id, rev_id, rev_id_to_root_id_map, parent_map, repo, graph)
3182+ yield versionedfile.FulltextContentFactory(
3183+ root_key, parent_keys, None, '')
3184+
3185+
3186+def _parent_keys_for_root_version(
3187+ root_id, rev_id, rev_id_to_root_id_map, parent_map, repo, graph=None):
3188+ """Get the parent keys for a given root id.
3189+
3190+ A helper function for _new_root_data_stream.
3191+ """
3192+ # Include direct parents of the revision, but only if they used the same
3193+ # root_id and are heads.
3194+ rev_parents = parent_map[rev_id]
3195+ parent_ids = []
3196+ for parent_id in rev_parents:
3197+ if parent_id == NULL_REVISION:
3198+ continue
3199+ if parent_id not in rev_id_to_root_id_map:
3200+ # We probably didn't read this revision, go spend the extra effort
3201+ # to actually check
3202+ try:
3203+ tree = repo.revision_tree(parent_id)
3204+ except errors.NoSuchRevision:
3205+ # Ghost, fill out rev_id_to_root_id in case we encounter this
3206+ # again.
3207+ # But set parent_root_id to None since we don't really know
3208+ parent_root_id = None
3209+ else:
3210+ parent_root_id = tree.get_root_id()
3211+ rev_id_to_root_id_map[parent_id] = None
3212+ # XXX: why not:
3213+ # rev_id_to_root_id_map[parent_id] = parent_root_id
3214+ # memory consumption maybe?
3215+ else:
3216+ parent_root_id = rev_id_to_root_id_map[parent_id]
3217+ if root_id == parent_root_id:
3218+ # With stacking we _might_ want to refer to a non-local revision,
3219+ # but this code path only applies when we have the full content
3220+ # available, so ghosts really are ghosts, not just the edge of
3221+ # local data.
3222+ parent_ids.append(parent_id)
3223+ else:
3224+ # root_id may be in the parent anyway.
3225+ try:
3226+ tree = repo.revision_tree(parent_id)
3227+ except errors.NoSuchRevision:
3228+ # ghost, can't refer to it.
3229+ pass
3230+ else:
3231+ try:
3232+ parent_ids.append(tree.inventory[root_id].revision)
3233+ except errors.NoSuchId:
3234+ # not in the tree
3235+ pass
3236+ # Drop non-head parents
3237+ if graph is None:
3238+ graph = repo.get_graph()
3239+ heads = graph.heads(parent_ids)
3240+ selected_ids = []
3241+ for parent_id in parent_ids:
3242+ if parent_id in heads and parent_id not in selected_ids:
3243+ selected_ids.append(parent_id)
3244+ parent_keys = [(root_id, parent_id) for parent_id in selected_ids]
3245+ return parent_keys
3246
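The parent-selection rule in _parent_keys_for_root_version boils down to:
keep only the parents that used the same root id, then keep only the graph
heads among them. A standalone sketch (not bzrlib code, made-up revision ids,
and a graph so small that every survivor is already a head):

    rev_id_to_root_id = {'rev-1': 'root-id', 'rev-2': 'root-id',
                         'rev-3': 'other-root'}
    parent_map = {'rev-4': ('rev-2', 'rev-3')}

    def parent_keys_for_root(root_id, rev_id):
        # Drop parents whose root id differs; bzrlib additionally filters
        # the survivors through graph.heads().
        same_root = [p for p in parent_map[rev_id]
                     if rev_id_to_root_id.get(p) == root_id]
        return [(root_id, p) for p in same_root]

    print(parent_keys_for_root('root-id', 'rev-4'))
    # [('root-id', 'rev-2')] -- 'rev-3' is dropped, its root id differs
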
3247=== modified file 'bzrlib/graph.py'
3248--- bzrlib/graph.py 2009-08-04 04:36:34 +0000
3249+++ bzrlib/graph.py 2009-08-17 18:36:14 +0000
3250@@ -21,7 +21,6 @@
3251 errors,
3252 revision,
3253 trace,
3254- tsort,
3255 )
3256 from bzrlib.symbol_versioning import deprecated_function, deprecated_in
3257
3258@@ -926,6 +925,7 @@
3259 An ancestor may sort after a descendant if the relationship is not
3260 visible in the supplied list of revisions.
3261 """
3262+ from bzrlib import tsort
3263 sorter = tsort.TopoSorter(self.get_parent_map(revisions))
3264 return sorter.iter_topo_order()
3265
3266
3267=== modified file 'bzrlib/groupcompress.py'
3268--- bzrlib/groupcompress.py 2009-08-04 04:36:34 +0000
3269+++ bzrlib/groupcompress.py 2009-09-09 13:05:33 +0000
3270@@ -33,7 +33,6 @@
3271 pack,
3272 trace,
3273 )
3274-from bzrlib.graph import Graph
3275 from bzrlib.btree_index import BTreeBuilder
3276 from bzrlib.lru_cache import LRUSizeCache
3277 from bzrlib.tsort import topo_sort
3278@@ -45,12 +44,15 @@
3279 VersionedFiles,
3280 )
3281
3282+# Minimum number of uncompressed bytes to try to fetch at once when retrieving
3283+# groupcompress blocks.
3284+BATCH_SIZE = 2**16
3285+
3286 _USE_LZMA = False and (pylzma is not None)
3287
3288 # osutils.sha_string('')
3289 _null_sha1 = 'da39a3ee5e6b4b0d3255bfef95601890afd80709'
3290
3291-
3292 def sort_gc_optimal(parent_map):
3293 """Sort and group the keys in parent_map into groupcompress order.
3294
3295@@ -62,16 +64,15 @@
3296 # groupcompress ordering is approximately reverse topological,
3297 # properly grouped by file-id.
3298 per_prefix_map = {}
3299- for item in parent_map.iteritems():
3300- key = item[0]
3301+ for key, value in parent_map.iteritems():
3302 if isinstance(key, str) or len(key) == 1:
3303 prefix = ''
3304 else:
3305 prefix = key[0]
3306 try:
3307- per_prefix_map[prefix].append(item)
3308+ per_prefix_map[prefix][key] = value
3309 except KeyError:
3310- per_prefix_map[prefix] = [item]
3311+ per_prefix_map[prefix] = {key: value}
3312
3313 present_keys = []
3314 for prefix in sorted(per_prefix_map):
3315@@ -456,7 +457,6 @@
3316 # There are code paths that first extract as fulltext, and then
3317 # extract as storage_kind (smart fetch). So we don't break the
3318 # refcycle here, but instead in manager.get_record_stream()
3319- # self._manager = None
3320 if storage_kind == 'fulltext':
3321 return self._bytes
3322 else:
3323@@ -468,6 +468,14 @@
3324 class _LazyGroupContentManager(object):
3325 """This manages a group of _LazyGroupCompressFactory objects."""
3326
3327+ _max_cut_fraction = 0.75 # We allow a block to be trimmed to 75% of
3328+ # current size, and still be considered
3329+                             # reusable
3330+ _full_block_size = 4*1024*1024
3331+ _full_mixed_block_size = 2*1024*1024
3332+ _full_enough_block_size = 3*1024*1024 # size at which we won't repack
3333+ _full_enough_mixed_block_size = 2*768*1024 # 1.5MB
3334+
3335 def __init__(self, block):
3336 self._block = block
3337 # We need to preserve the ordering
3338@@ -545,22 +553,23 @@
3339 # time (self._block._content) is a little expensive.
3340 self._block._ensure_content(self._last_byte)
3341
3342- def _check_rebuild_block(self):
3343+ def _check_rebuild_action(self):
3344 """Check to see if our block should be repacked."""
3345 total_bytes_used = 0
3346 last_byte_used = 0
3347 for factory in self._factories:
3348 total_bytes_used += factory._end - factory._start
3349- last_byte_used = max(last_byte_used, factory._end)
3350- # If we are using most of the bytes from the block, we have nothing
3351- # else to check (currently more that 1/2)
3352+ if last_byte_used < factory._end:
3353+ last_byte_used = factory._end
3354+ # If we are using more than half of the bytes from the block, we have
3355+ # nothing else to check
3356 if total_bytes_used * 2 >= self._block._content_length:
3357- return
3358- # Can we just strip off the trailing bytes? If we are going to be
3359- # transmitting more than 50% of the front of the content, go ahead
3360+ return None, last_byte_used, total_bytes_used
3361+ # We are using less than 50% of the content. Is the content we are
3362+ # using at the beginning of the block? If so, we can just trim the
3363+ # tail, rather than rebuilding from scratch.
3364 if total_bytes_used * 2 > last_byte_used:
3365- self._trim_block(last_byte_used)
3366- return
3367+ return 'trim', last_byte_used, total_bytes_used
3368
3369 # We are using a small amount of the data, and it isn't just packed
3370 # nicely at the front, so rebuild the content.
3371@@ -573,7 +582,80 @@
3372 # expanding many deltas into fulltexts, as well.
3373 # If we build a cheap enough 'strip', then we could try a strip,
3374 # if that expands the content, we then rebuild.
3375- self._rebuild_block()
3376+ return 'rebuild', last_byte_used, total_bytes_used
3377+
3378+ def check_is_well_utilized(self):
3379+ """Is the current block considered 'well utilized'?
3380+
3381+ This heuristic asks if the current block considers itself to be a fully
3382+ developed group, rather than just a loose collection of data.
3383+ """
3384+ if len(self._factories) == 1:
3385+ # A block of length 1 could be improved by combining with other
3386+ # groups - don't look deeper. Even larger than max size groups
3387+ # could compress well with adjacent versions of the same thing.
3388+ return False
3389+ action, last_byte_used, total_bytes_used = self._check_rebuild_action()
3390+ block_size = self._block._content_length
3391+ if total_bytes_used < block_size * self._max_cut_fraction:
3392+ # This block wants to trim itself small enough that we want to
3393+ # consider it under-utilized.
3394+ return False
3395+ # TODO: This code is meant to be the twin of _insert_record_stream's
3396+ # 'start_new_block' logic. It would probably be better to factor
3397+ # out that logic into a shared location, so that it stays
3398+ # together better
3399+ # We currently assume a block is properly utilized whenever it is >75%
3400+ # of the size of a 'full' block. In normal operation, a block is
3401+ # considered full when it hits 4MB of same-file content. So any block
3402+ # >3MB is 'full enough'.
3403+ # The only time this isn't true is when a given block has large-object
3404+ # content. (a single file >4MB, etc.)
3405+ # Under these circumstances, we allow a block to grow to
3406+ # 2 x largest_content. Which means that if a given block had a large
3407+ # object, it may actually be under-utilized. However, given that this
3408+ # is 'pack-on-the-fly' it is probably reasonable to not repack large
3409+ # content blobs on-the-fly. Note that because we return False for all
3410+ # 1-item blobs, we will repack them; we may wish to reevaluate our
3411+ # treatment of large object blobs in the future.
3412+ if block_size >= self._full_enough_block_size:
3413+ return True
3414+ # If a block is <3MB, it still may be considered 'full' if it contains
3415+ # mixed content. The current rule is 2MB of mixed content is considered
3416+ # full. So check to see if this block contains mixed content, and
3417+ # set the threshold appropriately.
3418+ common_prefix = None
3419+ for factory in self._factories:
3420+ prefix = factory.key[:-1]
3421+ if common_prefix is None:
3422+ common_prefix = prefix
3423+ elif prefix != common_prefix:
3424+ # Mixed content, check the size appropriately
3425+ if block_size >= self._full_enough_mixed_block_size:
3426+ return True
3427+ break
3428+ # The content failed both the mixed check and the single-content check
3429+ # so obviously it is not fully utilized
3430+ # TODO: there is one other constraint that isn't being checked
3431+ # namely, that the entries in the block are in the appropriate
3432+ # order. For example, you could insert the entries in exactly
3433+ # reverse groupcompress order, and we would think that is ok.
3434+ # (all the right objects are in one group, and it is fully
3435+ # utilized, etc.) For now, we assume that case is rare,
3436+ # especially since we should always fetch in 'groupcompress'
3437+ # order.
3438+ return False
3439+
3440+ def _check_rebuild_block(self):
3441+ action, last_byte_used, total_bytes_used = self._check_rebuild_action()
3442+ if action is None:
3443+ return
3444+ if action == 'trim':
3445+ self._trim_block(last_byte_used)
3446+ elif action == 'rebuild':
3447+ self._rebuild_block()
3448+ else:
3449+ raise ValueError('unknown rebuild action: %r' % (action,))
3450
3451 def _wire_bytes(self):
3452 """Return a byte stream suitable for transmitting over the wire."""
3453@@ -975,23 +1057,139 @@
3454 versioned_files.stream.close()
3455
3456
3457+class _BatchingBlockFetcher(object):
3458+ """Fetch group compress blocks in batches.
3459+
3460+ :ivar total_bytes: int of expected number of bytes needed to fetch the
3461+ currently pending batch.
3462+ """
3463+
3464+ def __init__(self, gcvf, locations):
3465+ self.gcvf = gcvf
3466+ self.locations = locations
3467+ self.keys = []
3468+ self.batch_memos = {}
3469+ self.memos_to_get = []
3470+ self.total_bytes = 0
3471+ self.last_read_memo = None
3472+ self.manager = None
3473+
3474+ def add_key(self, key):
3475+        """Add another key to fetch.
3476+
3477+ :return: The estimated number of bytes needed to fetch the batch so
3478+ far.
3479+ """
3480+ self.keys.append(key)
3481+ index_memo, _, _, _ = self.locations[key]
3482+ read_memo = index_memo[0:3]
3483+ # Three possibilities for this read_memo:
3484+ # - it's already part of this batch; or
3485+ # - it's not yet part of this batch, but is already cached; or
3486+ # - it's not yet part of this batch and will need to be fetched.
3487+ if read_memo in self.batch_memos:
3488+ # This read memo is already in this batch.
3489+ return self.total_bytes
3490+ try:
3491+ cached_block = self.gcvf._group_cache[read_memo]
3492+ except KeyError:
3493+ # This read memo is new to this batch, and the data isn't cached
3494+ # either.
3495+ self.batch_memos[read_memo] = None
3496+ self.memos_to_get.append(read_memo)
3497+ byte_length = read_memo[2]
3498+ self.total_bytes += byte_length
3499+ else:
3500+ # This read memo is new to this batch, but cached.
3501+ # Keep a reference to the cached block in batch_memos because it's
3502+ # certain that we'll use it when this batch is processed, but
3503+ # there's a risk that it would fall out of _group_cache between now
3504+ # and then.
3505+ self.batch_memos[read_memo] = cached_block
3506+ return self.total_bytes
3507+
3508+ def _flush_manager(self):
3509+ if self.manager is not None:
3510+ for factory in self.manager.get_record_stream():
3511+ yield factory
3512+ self.manager = None
3513+ self.last_read_memo = None
3514+
3515+ def yield_factories(self, full_flush=False):
3516+ """Yield factories for keys added since the last yield. They will be
3517+ returned in the order they were added via add_key.
3518+
3519+ :param full_flush: by default, some results may not be returned in case
3520+ they can be part of the next batch. If full_flush is True, then
3521+ all results are returned.
3522+ """
3523+ if self.manager is None and not self.keys:
3524+ return
3525+ # Fetch all memos in this batch.
3526+ blocks = self.gcvf._get_blocks(self.memos_to_get)
3527+ # Turn blocks into factories and yield them.
3528+ memos_to_get_stack = list(self.memos_to_get)
3529+ memos_to_get_stack.reverse()
3530+ for key in self.keys:
3531+ index_memo, _, parents, _ = self.locations[key]
3532+ read_memo = index_memo[:3]
3533+ if self.last_read_memo != read_memo:
3534+ # We are starting a new block. If we have a
3535+ # manager, we have found everything that fits for
3536+ # now, so yield records
3537+ for factory in self._flush_manager():
3538+ yield factory
3539+ # Now start a new manager.
3540+ if memos_to_get_stack and memos_to_get_stack[-1] == read_memo:
3541+ # The next block from _get_blocks will be the block we
3542+ # need.
3543+ block_read_memo, block = blocks.next()
3544+ if block_read_memo != read_memo:
3545+ raise AssertionError(
3546+ "block_read_memo out of sync with read_memo"
3547+ "(%r != %r)" % (block_read_memo, read_memo))
3548+ self.batch_memos[read_memo] = block
3549+ memos_to_get_stack.pop()
3550+ else:
3551+ block = self.batch_memos[read_memo]
3552+ self.manager = _LazyGroupContentManager(block)
3553+ self.last_read_memo = read_memo
3554+ start, end = index_memo[3:5]
3555+ self.manager.add_factory(key, parents, start, end)
3556+ if full_flush:
3557+ for factory in self._flush_manager():
3558+ yield factory
3559+ del self.keys[:]
3560+ self.batch_memos.clear()
3561+ del self.memos_to_get[:]
3562+ self.total_bytes = 0
3563+
3564+
3565 class GroupCompressVersionedFiles(VersionedFiles):
3566 """A group-compress based VersionedFiles implementation."""
3567
3568- def __init__(self, index, access, delta=True):
3569+ def __init__(self, index, access, delta=True, _unadded_refs=None):
3570 """Create a GroupCompressVersionedFiles object.
3571
3572 :param index: The index object storing access and graph data.
3573 :param access: The access object storing raw data.
3574 :param delta: Whether to delta compress or just entropy compress.
3575+ :param _unadded_refs: private parameter, don't use.
3576 """
3577 self._index = index
3578 self._access = access
3579 self._delta = delta
3580- self._unadded_refs = {}
3581+ if _unadded_refs is None:
3582+ _unadded_refs = {}
3583+ self._unadded_refs = _unadded_refs
3584 self._group_cache = LRUSizeCache(max_size=50*1024*1024)
3585 self._fallback_vfs = []
3586
3587+ def without_fallbacks(self):
3588+ """Return a clone of this object without any fallbacks configured."""
3589+ return GroupCompressVersionedFiles(self._index, self._access,
3590+ self._delta, _unadded_refs=dict(self._unadded_refs))
3591+
3592 def add_lines(self, key, parents, lines, parent_texts=None,
3593 left_matching_blocks=None, nostore_sha=None, random_id=False,
3594 check_content=True):
3595@@ -1099,6 +1297,22 @@
3596 self._check_lines_not_unicode(lines)
3597 self._check_lines_are_lines(lines)
3598
3599+ def get_known_graph_ancestry(self, keys):
3600+ """Get a KnownGraph instance with the ancestry of keys."""
3601+ # Note that this is identical to
3602+ # KnitVersionedFiles.get_known_graph_ancestry, but they don't share
3603+ # ancestry.
3604+ parent_map, missing_keys = self._index.find_ancestry(keys)
3605+ for fallback in self._fallback_vfs:
3606+ if not missing_keys:
3607+ break
3608+ (f_parent_map, f_missing_keys) = fallback._index.find_ancestry(
3609+ missing_keys)
3610+ parent_map.update(f_parent_map)
3611+ missing_keys = f_missing_keys
3612+ kg = _mod_graph.KnownGraph(parent_map)
3613+ return kg
3614+
3615 def get_parent_map(self, keys):
3616 """Get a map of the graph parents of keys.
3617
3618@@ -1131,26 +1345,42 @@
3619 missing.difference_update(set(new_result))
3620 return result, source_results
3621
3622- def _get_block(self, index_memo):
3623- read_memo = index_memo[0:3]
3624- # get the group:
3625- try:
3626- block = self._group_cache[read_memo]
3627- except KeyError:
3628- # read the group
3629- zdata = self._access.get_raw_records([read_memo]).next()
3630- # decompress - whole thing - this is not a bug, as it
3631- # permits caching. We might want to store the partially
3632- # decompresed group and decompress object, so that recent
3633- # texts are not penalised by big groups.
3634- block = GroupCompressBlock.from_bytes(zdata)
3635- self._group_cache[read_memo] = block
3636- # cheapo debugging:
3637- # print len(zdata), len(plain)
3638- # parse - requires split_lines, better to have byte offsets
3639- # here (but not by much - we only split the region for the
3640- # recipe, and we often want to end up with lines anyway.
3641- return block
3642+ def _get_blocks(self, read_memos):
3643+ """Get GroupCompressBlocks for the given read_memos.
3644+
3645+ :returns: a series of (read_memo, block) pairs, in the order they were
3646+ originally passed.
3647+ """
3648+ cached = {}
3649+ for read_memo in read_memos:
3650+ try:
3651+ block = self._group_cache[read_memo]
3652+ except KeyError:
3653+ pass
3654+ else:
3655+ cached[read_memo] = block
3656+ not_cached = []
3657+ not_cached_seen = set()
3658+ for read_memo in read_memos:
3659+ if read_memo in cached:
3660+ # Don't fetch what we already have
3661+ continue
3662+ if read_memo in not_cached_seen:
3663+ # Don't try to fetch the same data twice
3664+ continue
3665+ not_cached.append(read_memo)
3666+ not_cached_seen.add(read_memo)
3667+ raw_records = self._access.get_raw_records(not_cached)
3668+ for read_memo in read_memos:
3669+ try:
3670+ yield read_memo, cached[read_memo]
3671+ except KeyError:
3672+ # Read the block, and cache it.
3673+ zdata = raw_records.next()
3674+ block = GroupCompressBlock.from_bytes(zdata)
3675+ self._group_cache[read_memo] = block
3676+ cached[read_memo] = block
3677+ yield read_memo, block
3678
3679 def get_missing_compression_parent_keys(self):
3680 """Return the keys of missing compression parents.
3681@@ -1322,55 +1552,35 @@
3682 unadded_keys, source_result)
3683 for key in missing:
3684 yield AbsentContentFactory(key)
3685- manager = None
3686- last_read_memo = None
3687- # TODO: This works fairly well at batching up existing groups into a
3688- # streamable format, and possibly allowing for taking one big
3689- # group and splitting it when it isn't fully utilized.
3690- # However, it doesn't allow us to find under-utilized groups and
3691- # combine them into a bigger group on the fly.
3692- # (Consider the issue with how chk_map inserts texts
3693- # one-at-a-time.) This could be done at insert_record_stream()
3694- # time, but it probably would decrease the number of
3695- # bytes-on-the-wire for fetch.
3696+ # Batch up as many keys as we can until either:
3697+ # - we encounter an unadded ref, or
3698+ # - we run out of keys, or
3699+ # - the total bytes to retrieve for this batch > BATCH_SIZE
3700+ batcher = _BatchingBlockFetcher(self, locations)
3701 for source, keys in source_keys:
3702 if source is self:
3703 for key in keys:
3704 if key in self._unadded_refs:
3705- if manager is not None:
3706- for factory in manager.get_record_stream():
3707- yield factory
3708- last_read_memo = manager = None
3709+ # Flush batch, then yield unadded ref from
3710+ # self._compressor.
3711+ for factory in batcher.yield_factories(full_flush=True):
3712+ yield factory
3713 bytes, sha1 = self._compressor.extract(key)
3714 parents = self._unadded_refs[key]
3715 yield FulltextContentFactory(key, parents, sha1, bytes)
3716- else:
3717- index_memo, _, parents, (method, _) = locations[key]
3718- read_memo = index_memo[0:3]
3719- if last_read_memo != read_memo:
3720- # We are starting a new block. If we have a
3721- # manager, we have found everything that fits for
3722- # now, so yield records
3723- if manager is not None:
3724- for factory in manager.get_record_stream():
3725- yield factory
3726- # Now start a new manager
3727- block = self._get_block(index_memo)
3728- manager = _LazyGroupContentManager(block)
3729- last_read_memo = read_memo
3730- start, end = index_memo[3:5]
3731- manager.add_factory(key, parents, start, end)
3732+ continue
3733+ if batcher.add_key(key) > BATCH_SIZE:
3734+ # Ok, this batch is big enough. Yield some results.
3735+ for factory in batcher.yield_factories():
3736+ yield factory
3737 else:
3738- if manager is not None:
3739- for factory in manager.get_record_stream():
3740- yield factory
3741- last_read_memo = manager = None
3742+ for factory in batcher.yield_factories(full_flush=True):
3743+ yield factory
3744 for record in source.get_record_stream(keys, ordering,
3745 include_delta_closure):
3746 yield record
3747- if manager is not None:
3748- for factory in manager.get_record_stream():
3749- yield factory
3750+ for factory in batcher.yield_factories(full_flush=True):
3751+ yield factory
3752
3753 def get_sha1s(self, keys):
3754 """See VersionedFiles.get_sha1s()."""
3755@@ -1449,6 +1659,7 @@
3756 block_length = None
3757 # XXX: TODO: remove this, it is just for safety checking for now
3758 inserted_keys = set()
3759+ reuse_this_block = reuse_blocks
3760 for record in stream:
3761 # Raise an error when a record is missing.
3762 if record.storage_kind == 'absent':
3763@@ -1462,10 +1673,20 @@
3764 if reuse_blocks:
3765 # If the reuse_blocks flag is set, check to see if we can just
3766 # copy a groupcompress block as-is.
3767+ # We only check on the first record (groupcompress-block) not
3768+ # on all of the (groupcompress-block-ref) entries.
3769+ # The reuse_this_block flag is then kept for as long as
3770+ if record.storage_kind == 'groupcompress-block':
3771+ # Check to see if we really want to re-use this block
3772+ insert_manager = record._manager
3773+ reuse_this_block = insert_manager.check_is_well_utilized()
3774+ else:
3775+ reuse_this_block = False
3776+ if reuse_this_block:
3777+ # We still want to reuse this block
3778 if record.storage_kind == 'groupcompress-block':
3779 # Insert the raw block into the target repo
3780 insert_manager = record._manager
3781- insert_manager._check_rebuild_block()
3782 bytes = record._manager._block.to_bytes()
3783 _, start, length = self._access.add_raw_records(
3784 [(None, len(bytes))], bytes)[0]
3785@@ -1476,6 +1697,11 @@
3786 'groupcompress-block-ref'):
3787 if insert_manager is None:
3788 raise AssertionError('No insert_manager set')
3789+ if insert_manager is not record._manager:
3790+ raise AssertionError('insert_manager does not match'
3791+ ' the current record, we cannot be positive'
3792+ ' that the appropriate content was inserted.'
3793+ )
3794 value = "%d %d %d %d" % (block_start, block_length,
3795 record._start, record._end)
3796 nodes = [(record.key, value, (record.parents,))]
3797@@ -1593,7 +1819,7 @@
3798
3799 def __init__(self, graph_index, is_locked, parents=True,
3800 add_callback=None, track_external_parent_refs=False,
3801- inconsistency_fatal=True):
3802+ inconsistency_fatal=True, track_new_keys=False):
3803 """Construct a _GCGraphIndex on a graph_index.
3804
3805 :param graph_index: An implementation of bzrlib.index.GraphIndex.
3806@@ -1619,7 +1845,8 @@
3807 self._is_locked = is_locked
3808 self._inconsistency_fatal = inconsistency_fatal
3809 if track_external_parent_refs:
3810- self._key_dependencies = knit._KeyRefs()
3811+ self._key_dependencies = knit._KeyRefs(
3812+ track_new_keys=track_new_keys)
3813 else:
3814 self._key_dependencies = None
3815
3816@@ -1679,10 +1906,14 @@
3817 result.append((key, value))
3818 records = result
3819 key_dependencies = self._key_dependencies
3820- if key_dependencies is not None and self._parents:
3821- for key, value, refs in records:
3822- parents = refs[0]
3823- key_dependencies.add_references(key, parents)
3824+ if key_dependencies is not None:
3825+ if self._parents:
3826+ for key, value, refs in records:
3827+ parents = refs[0]
3828+ key_dependencies.add_references(key, parents)
3829+ else:
3830+ for key, value, refs in records:
3831+ new_keys.add_key(key)
3832 self._add_callback(records)
3833
3834 def _check_read(self):
3835@@ -1719,6 +1950,10 @@
3836 if missing_keys:
3837 raise errors.RevisionNotPresent(missing_keys.pop(), self)
3838
3839+ def find_ancestry(self, keys):
3840+ """See CombinedGraphIndex.find_ancestry"""
3841+ return self._graph_index.find_ancestry(keys, 0)
3842+
3843 def get_parent_map(self, keys):
3844 """Get a map of the parents of keys.
3845
3846@@ -1741,7 +1976,7 @@
3847 """Return the keys of missing parents."""
3848 # Copied from _KnitGraphIndex.get_missing_parents
3849 # We may have false positives, so filter those out.
3850- self._key_dependencies.add_keys(
3851+ self._key_dependencies.satisfy_refs_for_keys(
3852 self.get_parent_map(self._key_dependencies.get_unsatisfied_refs()))
3853 return frozenset(self._key_dependencies.get_unsatisfied_refs())
3854
3855@@ -1801,17 +2036,17 @@
3856
3857 This allows this _GCGraphIndex to keep track of any missing
3858 compression parents we may want to have filled in to make those
3859- indices valid.
3860+ indices valid. It also allows _GCGraphIndex to track any new keys.
3861
3862 :param graph_index: A GraphIndex
3863 """
3864- if self._key_dependencies is not None:
3865- # Add parent refs from graph_index (and discard parent refs that
3866- # the graph_index has).
3867- add_refs = self._key_dependencies.add_references
3868- for node in graph_index.iter_all_entries():
3869- add_refs(node[1], node[3][0])
3870-
3871+ key_dependencies = self._key_dependencies
3872+ if key_dependencies is None:
3873+ return
3874+ for node in graph_index.iter_all_entries():
3875+ # Add parent refs from graph_index (and discard parent refs
3876+ # that the graph_index has).
3877+ key_dependencies.add_references(node[1], node[3][0])
3878
3879
3880 from bzrlib._groupcompress_py import (
3881
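The batching added to get_record_stream() amounts to: queue keys, track the
estimated bytes of the blocks they need, flush once the estimate passes
BATCH_SIZE, and do a final full flush at the end. A standalone sketch of that
pattern with made-up keys and sizes instead of real read memos:

    BATCH_SIZE = 2 ** 16  # same threshold the patch introduces

    def batched(keys, size_of_block_for):
        pending, pending_bytes = [], 0
        for key in keys:
            pending.append(key)
            pending_bytes += size_of_block_for(key)
            if pending_bytes > BATCH_SIZE:
                yield pending
                pending, pending_bytes = [], 0
        if pending:  # the final full flush
            yield pending

    sizes = {'a': 40000, 'b': 30000, 'c': 10000}
    for batch in batched(['a', 'b', 'c'], sizes.get):
        print(batch)
    # ['a', 'b'] is fetched together (70000 > 65536), then ['c'] at the end
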
3882=== modified file 'bzrlib/help_topics/en/configuration.txt'
3883--- bzrlib/help_topics/en/configuration.txt 2009-06-26 18:13:41 +0000
3884+++ bzrlib/help_topics/en/configuration.txt 2009-08-21 09:19:11 +0000
3885@@ -63,6 +63,60 @@
3886 ~~~~~~~~~~~~~~~
3887
3888 The path to the plugins directory that Bazaar should use.
3889+If not set, Bazaar will search for plugins in:
3890+
3891+* the user specific plugin directory (containing the ``user`` plugins),
3892+
3893+* the bzrlib directory (containing the ``core`` plugins),
3894+
3895+* the site specific plugin directory if applicable (containing
3896+ the ``site`` plugins).
3897+
3898+If ``BZR_PLUGIN_PATH`` is set in any fashion, it will change
3899+the way the plugins are searched.
3900+
3901+As with the ``PATH`` environment variable, if multiple directories
3902+are specified in ``BZR_PLUGIN_PATH`` they should be separated by
3903+the platform-specific path separator (':' on Unix/Linux/etc,
3904+';' on Windows).
3905+
3906+By default if ``BZR_PLUGIN_PATH`` is set, it replaces searching
3907+in ``user``. However it will continue to search in ``core`` and
3908+``site`` unless they are explicitly removed.
3909+
3910+If you need to change the order or remove one of these
3911+directories, you should use special values:
3912+
3913+* ``-user``, ``-core``, ``-site`` will remove the corresponding
3914+ path from the default values,
3915+
3916+* ``+user``, ``+core``, ``+site`` will add the corresponding path
3917+ before the remaining default values (and also remove it from
3918+ the default values).
3919+
3920+Note that the special values 'user', 'core' and 'site' should be
3921+used literally; they will be substituted by the corresponding,
3922+platform-specific values.
3923+
3924+Examples:
3925+^^^^^^^^^
3926+
3927+The examples below use ':' as the separator; Windows users
3928+should use ';'.
3929+
3930+Overriding the default user plugin directory:
3931+``BZR_PLUGIN_PATH='/path/to/my/other/plugins'``
3932+
3933+Disabling the site directory while retaining the user directory:
3934+``BZR_PLUGIN_PATH='-site:+user'``
3935+
3936+Disabling all plugins (better achieved with --no-plugins):
3937+``BZR_PLUGIN_PATH='-user:-core:-site'``
3938+
3939+Overriding the default site plugin directory:
3940+``BZR_PLUGIN_PATH='/path/to/my/site/plugins:-site:+user'``
3941+
3942+
3943
3944 BZRPATH
3945 ~~~~~~~
3946
3947=== modified file 'bzrlib/help_topics/en/debug-flags.txt'
3948--- bzrlib/help_topics/en/debug-flags.txt 2009-08-04 04:36:34 +0000
3949+++ bzrlib/help_topics/en/debug-flags.txt 2009-08-20 05:02:45 +0000
3950@@ -12,18 +12,26 @@
3951 operations.
3952 -Dfetch Trace history copying between repositories.
3953 -Dfilters Emit information for debugging content filtering.
3954+-Dforceinvdeltas Force use of inventory deltas during generic streaming fetch.
3955 -Dgraph Trace graph traversal.
3956 -Dhashcache Log every time a working file is read to determine its hash.
3957 -Dhooks Trace hook execution.
3958 -Dhpss Trace smart protocol requests and responses.
3959 -Dhpssdetail More hpss details.
3960 -Dhpssvfs Traceback on vfs access to Remote objects.
3961--Dhttp Trace http connections, requests and responses
3962+-Dhttp Trace http connections, requests and responses.
3963 -Dindex Trace major index operations.
3964 -Dknit Trace knit operations.
3965+-Dstrict_locks Trace when OS locks are potentially used in a non-portable
3966+ manner.
3967 -Dlock Trace when lockdir locks are taken or released.
3968 -Dprogress Trace progress bar operations.
3969 -Dmerge Emit information for debugging merges.
3970+-Dno_apport Don't use apport to report crashes.
3971 -Dunlock Some errors during unlock are treated as warnings.
3972 -Dpack Emit information about pack operations.
3973 -Dsftp Trace SFTP internals.
3974+-Dstream Trace fetch streams.
3975+-DIDS_never Never use InterDifferingSerializer when fetching.
3976+-DIDS_always Always use InterDifferingSerializer to fetch if appropriate
3977+ for the format, even for non-local fetches.
3978
3979=== modified file 'bzrlib/hooks.py'
3980--- bzrlib/hooks.py 2009-06-10 03:31:01 +0000
3981+++ bzrlib/hooks.py 2009-09-01 12:29:54 +0000
3982@@ -219,9 +219,7 @@
3983 strings.append('Introduced in: %s' % introduced_string)
3984 if self.deprecated:
3985 deprecated_string = _format_version_tuple(self.deprecated)
3986- else:
3987- deprecated_string = 'Not deprecated'
3988- strings.append('Deprecated in: %s' % deprecated_string)
3989+ strings.append('Deprecated in: %s' % deprecated_string)
3990 strings.append('')
3991 strings.extend(textwrap.wrap(self.__doc__,
3992 break_long_words=False))
3993
3994=== modified file 'bzrlib/index.py'
3995--- bzrlib/index.py 2009-07-01 10:53:08 +0000
3996+++ bzrlib/index.py 2009-08-17 22:11:06 +0000
3997@@ -333,6 +333,22 @@
3998 if combine_backing_indices is not None:
3999 self._combine_backing_indices = combine_backing_indices
4000
4001+ def find_ancestry(self, keys, ref_list_num):
4002+ """See CombinedGraphIndex.find_ancestry()"""
4003+ pending = set(keys)
4004+ parent_map = {}
4005+ missing_keys = set()
4006+ while pending:
4007+ next_pending = set()
4008+ for _, key, value, ref_lists in self.iter_entries(pending):
4009+ parent_keys = ref_lists[ref_list_num]
4010+ parent_map[key] = parent_keys
4011+ next_pending.update([p for p in parent_keys if p not in
4012+ parent_map])
4013+ missing_keys.update(pending.difference(parent_map))
4014+ pending = next_pending
4015+ return parent_map, missing_keys
4016+
4017
4018 class GraphIndex(object):
4019 """An index for data with embedded graphs.
4020@@ -702,6 +718,23 @@
4021 # the last thing looked up was a terminal element
4022 yield (self, ) + key_dict
4023
4024+ def _find_ancestors(self, keys, ref_list_num, parent_map, missing_keys):
4025+ """See BTreeIndex._find_ancestors."""
4026+ # The api can be implemented as a trivial overlay on top of
4027+ # iter_entries, it is not an efficient implementation, but it at least
4028+ # gets the job done.
4029+ found_keys = set()
4030+ search_keys = set()
4031+ for index, key, value, refs in self.iter_entries(keys):
4032+ parent_keys = refs[ref_list_num]
4033+ found_keys.add(key)
4034+ parent_map[key] = parent_keys
4035+ search_keys.update(parent_keys)
4036+ # Figure out what, if anything, was missing
4037+ missing_keys.update(set(keys).difference(found_keys))
4038+ search_keys = search_keys.difference(parent_map)
4039+ return search_keys
4040+
4041 def key_count(self):
4042 """Return an estimate of the number of keys in this index.
4043
4044@@ -1297,6 +1330,69 @@
4045 except errors.NoSuchFile:
4046 self._reload_or_raise()
4047
4048+ def find_ancestry(self, keys, ref_list_num):
4049+ """Find the complete ancestry for the given set of keys.
4050+
4051+ Note that this is a whole-ancestry request, so it should be used
4052+ sparingly.
4053+
4054+ :param keys: An iterable of keys to look for
4055+ :param ref_list_num: The reference list which references the parents
4056+ we care about.
4057+ :return: (parent_map, missing_keys)
4058+ """
4059+ missing_keys = set()
4060+ parent_map = {}
4061+ keys_to_lookup = set(keys)
4062+ generation = 0
4063+ while keys_to_lookup:
4064+ # keys that *all* indexes claim are missing, stop searching them
4065+ generation += 1
4066+ all_index_missing = None
4067+ # print 'gen\tidx\tsub\tn_keys\tn_pmap\tn_miss'
4068+ # print '%4d\t\t\t%4d\t%5d\t%5d' % (generation, len(keys_to_lookup),
4069+ # len(parent_map),
4070+ # len(missing_keys))
4071+ for index_idx, index in enumerate(self._indices):
4072+ # TODO: we should probably be doing something with
4073+ # 'missing_keys' since we've already determined that
4074+ # those revisions have not been found anywhere
4075+ index_missing_keys = set()
4076+ # Find all of the ancestry we can from this index
4077+ # keep looking until the search_keys set is empty, which means
4078+ # things we didn't find should be in index_missing_keys
4079+ search_keys = keys_to_lookup
4080+ sub_generation = 0
4081+ # print ' \t%2d\t\t%4d\t%5d\t%5d' % (
4082+ # index_idx, len(search_keys),
4083+ # len(parent_map), len(index_missing_keys))
4084+ while search_keys:
4085+ sub_generation += 1
4086+ # TODO: ref_list_num should really be a parameter, since
4087+ # CombinedGraphIndex does not know what the ref lists
4088+ # mean.
4089+ search_keys = index._find_ancestors(search_keys,
4090+ ref_list_num, parent_map, index_missing_keys)
4091+ # print ' \t \t%2d\t%4d\t%5d\t%5d' % (
4092+ # sub_generation, len(search_keys),
4093+ # len(parent_map), len(index_missing_keys))
4094+ # Now set whatever was missing to be searched in the next index
4095+ keys_to_lookup = index_missing_keys
4096+ if all_index_missing is None:
4097+ all_index_missing = set(index_missing_keys)
4098+ else:
4099+ all_index_missing.intersection_update(index_missing_keys)
4100+ if not keys_to_lookup:
4101+ break
4102+ if all_index_missing is None:
4103+ # There were no indexes, so all search keys are 'missing'
4104+ missing_keys.update(keys_to_lookup)
4105+ keys_to_lookup = None
4106+ else:
4107+ missing_keys.update(all_index_missing)
4108+ keys_to_lookup.difference_update(all_index_missing)
4109+ return parent_map, missing_keys
4110+
4111 def key_count(self):
4112 """Return an estimate of the number of keys in this index.
4113
4114
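All of the new find_ancestry()/_find_ancestors() methods implement the same
whole-ancestry walk: look up the pending keys, record their parents, queue any
parents not yet seen, and report keys that nothing could find as missing. A
standalone sketch of that loop over a made-up parent graph:

    graph = {'D': ('B', 'C'), 'C': ('A',), 'B': ('A',), 'A': ()}

    def find_ancestry(keys):
        parent_map, missing_keys, pending = {}, set(), set(keys)
        while pending:
            next_pending = set()
            for key in pending:
                if key in graph:
                    parent_map[key] = graph[key]
                    next_pending.update(
                        p for p in graph[key] if p not in parent_map)
                else:
                    missing_keys.add(key)
            pending = next_pending
        return parent_map, missing_keys

    parent_map, missing_keys = find_ancestry(['D', 'ghost'])
    print(sorted(parent_map), sorted(missing_keys))
    # ['A', 'B', 'C', 'D'] ['ghost']
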
4115=== modified file 'bzrlib/inventory.py'
4116--- bzrlib/inventory.py 2009-08-05 02:30:59 +0000
4117+++ bzrlib/inventory.py 2009-08-30 23:51:10 +0000
4118@@ -437,7 +437,13 @@
4119 self.text_id is not None):
4120 checker._report_items.append('directory {%s} has text in revision {%s}'
4121 % (self.file_id, rev_id))
4122- # Directories are stored as ''.
4123+ # In non rich root repositories we do not expect a file graph for the
4124+ # root.
4125+ if self.name == '' and not checker.rich_roots:
4126+ return
4127+ # Directories are stored as an empty file, but the file should exist
4128+ # to provide a per-fileid log. The hash of every directory content is
4129+ # "da..." below (the sha1sum of '').
4130 checker.add_pending_item(rev_id,
4131 ('texts', self.file_id, self.revision), 'text',
4132 'da39a3ee5e6b4b0d3255bfef95601890afd80709')
4133@@ -743,6 +749,9 @@
4134 """
4135 return self.has_id(file_id)
4136
4137+ def has_filename(self, filename):
4138+ return bool(self.path2id(filename))
4139+
4140 def id2path(self, file_id):
4141 """Return as a string the path to file_id.
4142
4143@@ -751,6 +760,8 @@
4144 >>> e = i.add(InventoryFile('foo-id', 'foo.c', parent_id='src-id'))
4145 >>> print i.id2path('foo-id')
4146 src/foo.c
4147+
4148+ :raises NoSuchId: If file_id is not present in the inventory.
4149 """
4150 # get all names, skipping root
4151 return '/'.join(reversed(
4152@@ -1363,9 +1374,6 @@
4153 yield ie
4154 file_id = ie.parent_id
4155
4156- def has_filename(self, filename):
4157- return bool(self.path2id(filename))
4158-
4159 def has_id(self, file_id):
4160 return (file_id in self._byid)
4161
4162
4163=== modified file 'bzrlib/inventory_delta.py'
4164--- bzrlib/inventory_delta.py 2009-04-02 05:53:12 +0000
4165+++ bzrlib/inventory_delta.py 2009-08-13 00:20:29 +0000
4166@@ -29,6 +29,25 @@
4167 from bzrlib import inventory
4168 from bzrlib.revision import NULL_REVISION
4169
4170+FORMAT_1 = 'bzr inventory delta v1 (bzr 1.14)'
4171+
4172+
4173+class InventoryDeltaError(errors.BzrError):
4174+ """An error when serializing or deserializing an inventory delta."""
4175+
4176+ # Most errors when serializing and deserializing are due to bugs, although
4177+ # damaged input (i.e. a bug in a different process) could cause
4178+ # deserialization errors too.
4179+ internal_error = True
4180+
4181+
4182+class IncompatibleInventoryDelta(errors.BzrError):
4183+ """The delta could not be deserialised because its contents conflict with
4184+ the allow_versioned_root or allow_tree_references flags of the
4185+ deserializer.
4186+ """
4187+ internal_error = False
4188+
4189
4190 def _directory_content(entry):
4191 """Serialize the content component of entry which is a directory.
4192@@ -49,7 +68,7 @@
4193 exec_bytes = ''
4194 size_exec_sha = (entry.text_size, exec_bytes, entry.text_sha1)
4195 if None in size_exec_sha:
4196- raise errors.BzrError('Missing size or sha for %s' % entry.file_id)
4197+ raise InventoryDeltaError('Missing size or sha for %s' % entry.file_id)
4198 return "file\x00%d\x00%s\x00%s" % size_exec_sha
4199
4200
4201@@ -60,7 +79,7 @@
4202 """
4203 target = entry.symlink_target
4204 if target is None:
4205- raise errors.BzrError('Missing target for %s' % entry.file_id)
4206+ raise InventoryDeltaError('Missing target for %s' % entry.file_id)
4207 return "link\x00%s" % target.encode('utf8')
4208
4209
4210@@ -71,7 +90,8 @@
4211 """
4212 tree_revision = entry.reference_revision
4213 if tree_revision is None:
4214- raise errors.BzrError('Missing reference revision for %s' % entry.file_id)
4215+ raise InventoryDeltaError(
4216+ 'Missing reference revision for %s' % entry.file_id)
4217 return "tree\x00%s" % tree_revision
4218
4219
4220@@ -115,11 +135,8 @@
4221 return result
4222
4223
4224-
4225 class InventoryDeltaSerializer(object):
4226- """Serialize and deserialize inventory deltas."""
4227-
4228- FORMAT_1 = 'bzr inventory delta v1 (bzr 1.14)'
4229+ """Serialize inventory deltas."""
4230
4231 def __init__(self, versioned_root, tree_references):
4232 """Create an InventoryDeltaSerializer.
4233@@ -141,6 +158,9 @@
4234 def delta_to_lines(self, old_name, new_name, delta_to_new):
4235 """Return a line sequence for delta_to_new.
4236
4237+        The versioned_root and tree_references flags are taken from the
4238+        values passed to the serializer's constructor.
4239+
4240 :param old_name: A UTF8 revision id for the old inventory. May be
4241 NULL_REVISION if there is no older inventory and delta_to_new
4242 includes the entire inventory contents.
4243@@ -150,15 +170,20 @@
4244 takes.
4245 :return: The serialized delta as lines.
4246 """
4247+ if type(old_name) is not str:
4248+ raise TypeError('old_name should be str, got %r' % (old_name,))
4249+ if type(new_name) is not str:
4250+ raise TypeError('new_name should be str, got %r' % (new_name,))
4251 lines = ['', '', '', '', '']
4252 to_line = self._delta_item_to_line
4253 for delta_item in delta_to_new:
4254- lines.append(to_line(delta_item))
4255- if lines[-1].__class__ != str:
4256- raise errors.BzrError(
4257+ line = to_line(delta_item, new_name)
4258+ if line.__class__ != str:
4259+ raise InventoryDeltaError(
4260 'to_line generated non-str output %r' % lines[-1])
4261+ lines.append(line)
4262 lines.sort()
4263- lines[0] = "format: %s\n" % InventoryDeltaSerializer.FORMAT_1
4264+ lines[0] = "format: %s\n" % FORMAT_1
4265 lines[1] = "parent: %s\n" % old_name
4266 lines[2] = "version: %s\n" % new_name
4267 lines[3] = "versioned_root: %s\n" % self._serialize_bool(
4268@@ -173,7 +198,7 @@
4269 else:
4270 return "false"
4271
4272- def _delta_item_to_line(self, delta_item):
4273+ def _delta_item_to_line(self, delta_item, new_version):
4274 """Convert delta_item to a line."""
4275 oldpath, newpath, file_id, entry = delta_item
4276 if newpath is None:
4277@@ -188,6 +213,10 @@
4278 oldpath_utf8 = 'None'
4279 else:
4280 oldpath_utf8 = '/' + oldpath.encode('utf8')
4281+ if newpath == '/':
4282+ raise AssertionError(
4283+ "Bad inventory delta: '/' is not a valid newpath "
4284+ "(should be '') in delta item %r" % (delta_item,))
4285 # TODO: Test real-world utf8 cache hit rate. It may be a win.
4286 newpath_utf8 = '/' + newpath.encode('utf8')
4287 # Serialize None as ''
4288@@ -196,58 +225,78 @@
4289 last_modified = entry.revision
4290 # special cases for /
4291 if newpath_utf8 == '/' and not self._versioned_root:
4292- if file_id != 'TREE_ROOT':
4293- raise errors.BzrError(
4294- 'file_id %s is not TREE_ROOT for /' % file_id)
4295- if last_modified is not None:
4296- raise errors.BzrError(
4297- 'Version present for / in %s' % file_id)
4298- last_modified = NULL_REVISION
4299+ # This is an entry for the root, this inventory does not
4300+ # support versioned roots. So this must be an unversioned
4301+ # root, i.e. last_modified == new revision. Otherwise, this
4302+ # delta is invalid.
4303+ # Note: the non-rich-root repositories *can* have roots with
4304+ # file-ids other than TREE_ROOT, e.g. repo formats that use the
4305+ # xml5 serializer.
4306+ if last_modified != new_version:
4307+ raise InventoryDeltaError(
4308+ 'Version present for / in %s (%s != %s)'
4309+ % (file_id, last_modified, new_version))
4310 if last_modified is None:
4311- raise errors.BzrError("no version for fileid %s" % file_id)
4312+ raise InventoryDeltaError("no version for fileid %s" % file_id)
4313 content = self._entry_to_content[entry.kind](entry)
4314 return ("%s\x00%s\x00%s\x00%s\x00%s\x00%s\n" %
4315 (oldpath_utf8, newpath_utf8, file_id, parent_id, last_modified,
4316 content))
4317
4318+
4319+class InventoryDeltaDeserializer(object):
4320+ """Deserialize inventory deltas."""
4321+
4322+ def __init__(self, allow_versioned_root=True, allow_tree_references=True):
4323+ """Create an InventoryDeltaDeserializer.
4324+
4325+        :param allow_versioned_root: If True, deltas may contain versioned
4326+            root entries, and root entries can have any fileid.
4327+        :param allow_tree_references: If True, tree-reference entries are allowed.
4328+ """
4329+ self._allow_versioned_root = allow_versioned_root
4330+ self._allow_tree_references = allow_tree_references
4331+
4332 def _deserialize_bool(self, value):
4333 if value == "true":
4334 return True
4335 elif value == "false":
4336 return False
4337 else:
4338- raise errors.BzrError("value %r is not a bool" % (value,))
4339+ raise InventoryDeltaError("value %r is not a bool" % (value,))
4340
4341 def parse_text_bytes(self, bytes):
4342 """Parse the text bytes of a serialized inventory delta.
4343
4344+        If the delta uses a versioned root or tree references that this
4345+        deserializer was constructed to disallow, IncompatibleInventoryDelta
4346+        is raised.
4347+
4348 :param bytes: The bytes to parse. This can be obtained by calling
4349 delta_to_lines and then doing ''.join(delta_lines).
4350- :return: (parent_id, new_id, inventory_delta)
4351+ :return: (parent_id, new_id, versioned_root, tree_references,
4352+ inventory_delta)
4353 """
4354+ if bytes[-1:] != '\n':
4355+ last_line = bytes.rsplit('\n', 1)[-1]
4356+ raise InventoryDeltaError('last line not empty: %r' % (last_line,))
4357 lines = bytes.split('\n')[:-1] # discard the last empty line
4358- if not lines or lines[0] != 'format: %s' % InventoryDeltaSerializer.FORMAT_1:
4359- raise errors.BzrError('unknown format %r' % lines[0:1])
4360+ if not lines or lines[0] != 'format: %s' % FORMAT_1:
4361+ raise InventoryDeltaError('unknown format %r' % lines[0:1])
4362 if len(lines) < 2 or not lines[1].startswith('parent: '):
4363- raise errors.BzrError('missing parent: marker')
4364+ raise InventoryDeltaError('missing parent: marker')
4365 delta_parent_id = lines[1][8:]
4366 if len(lines) < 3 or not lines[2].startswith('version: '):
4367- raise errors.BzrError('missing version: marker')
4368+ raise InventoryDeltaError('missing version: marker')
4369 delta_version_id = lines[2][9:]
4370 if len(lines) < 4 or not lines[3].startswith('versioned_root: '):
4371- raise errors.BzrError('missing versioned_root: marker')
4372+ raise InventoryDeltaError('missing versioned_root: marker')
4373 delta_versioned_root = self._deserialize_bool(lines[3][16:])
4374 if len(lines) < 5 or not lines[4].startswith('tree_references: '):
4375- raise errors.BzrError('missing tree_references: marker')
4376+ raise InventoryDeltaError('missing tree_references: marker')
4377 delta_tree_references = self._deserialize_bool(lines[4][17:])
4378- if delta_versioned_root != self._versioned_root:
4379- raise errors.BzrError(
4380- "serialized versioned_root flag is wrong: %s" %
4381- (delta_versioned_root,))
4382- if delta_tree_references != self._tree_references:
4383- raise errors.BzrError(
4384- "serialized tree_references flag is wrong: %s" %
4385- (delta_tree_references,))
4386+ if (not self._allow_versioned_root and delta_versioned_root):
4387+ raise IncompatibleInventoryDelta("versioned_root not allowed")
4388 result = []
4389 seen_ids = set()
4390 line_iter = iter(lines)
4391@@ -258,33 +307,58 @@
4392 content) = line.split('\x00', 5)
4393 parent_id = parent_id or None
4394 if file_id in seen_ids:
4395- raise errors.BzrError(
4396+ raise InventoryDeltaError(
4397 "duplicate file id in inventory delta %r" % lines)
4398 seen_ids.add(file_id)
4399- if newpath_utf8 == '/' and not delta_versioned_root and (
4400- last_modified != 'null:' or file_id != 'TREE_ROOT'):
4401- raise errors.BzrError("Versioned root found: %r" % line)
4402- elif last_modified[-1] == ':':
4403- raise errors.BzrError('special revisionid found: %r' % line)
4404- if not delta_tree_references and content.startswith('tree\x00'):
4405- raise errors.BzrError("Tree reference found: %r" % line)
4406- content_tuple = tuple(content.split('\x00'))
4407- entry = _parse_entry(
4408- newpath_utf8, file_id, parent_id, last_modified, content_tuple)
4409+ if (newpath_utf8 == '/' and not delta_versioned_root and
4410+ last_modified != delta_version_id):
4411+            # Delta claims not to have a versioned root, yet here's
4412+ # a root entry with a non-default version.
4413+ raise InventoryDeltaError("Versioned root found: %r" % line)
4414+ elif newpath_utf8 != 'None' and last_modified[-1] == ':':
4415+ # Deletes have a last_modified of null:, but otherwise special
4416+ # revision ids should not occur.
4417+ raise InventoryDeltaError('special revisionid found: %r' % line)
4418+ if content.startswith('tree\x00'):
4419+ if delta_tree_references is False:
4420+ raise InventoryDeltaError(
4421+ "Tree reference found (but header said "
4422+ "tree_references: false): %r" % line)
4423+ elif not self._allow_tree_references:
4424+ raise IncompatibleInventoryDelta(
4425+ "Tree reference not allowed")
4426 if oldpath_utf8 == 'None':
4427 oldpath = None
4428+ elif oldpath_utf8[:1] != '/':
4429+ raise InventoryDeltaError(
4430+ "oldpath invalid (does not start with /): %r"
4431+ % (oldpath_utf8,))
4432 else:
4433+ oldpath_utf8 = oldpath_utf8[1:]
4434 oldpath = oldpath_utf8.decode('utf8')
4435 if newpath_utf8 == 'None':
4436 newpath = None
4437+ elif newpath_utf8[:1] != '/':
4438+ raise InventoryDeltaError(
4439+ "newpath invalid (does not start with /): %r"
4440+ % (newpath_utf8,))
4441 else:
4442+ # Trim leading slash
4443+ newpath_utf8 = newpath_utf8[1:]
4444 newpath = newpath_utf8.decode('utf8')
4445+ content_tuple = tuple(content.split('\x00'))
4446+ if content_tuple[0] == 'deleted':
4447+ entry = None
4448+ else:
4449+ entry = _parse_entry(
4450+ newpath, file_id, parent_id, last_modified, content_tuple)
4451 delta_item = (oldpath, newpath, file_id, entry)
4452 result.append(delta_item)
4453- return delta_parent_id, delta_version_id, result
4454-
4455-
4456-def _parse_entry(utf8_path, file_id, parent_id, last_modified, content):
4457+ return (delta_parent_id, delta_version_id, delta_versioned_root,
4458+ delta_tree_references, result)
4459+
4460+
4461+def _parse_entry(path, file_id, parent_id, last_modified, content):
4462 entry_factory = {
4463 'dir': _dir_to_entry,
4464 'file': _file_to_entry,
4465@@ -292,8 +366,10 @@
4466 'tree': _tree_to_entry,
4467 }
4468 kind = content[0]
4469- path = utf8_path[1:].decode('utf8')
4470+ if path.startswith('/'):
4471+ raise AssertionError
4472 name = basename(path)
4473 return entry_factory[content[0]](
4474 content, name, parent_id, file_id, last_modified)
4475
4476+
4477
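For orientation, here is a minimal sketch of driving the new deserializer. It assumes FORMAT_1 and InventoryDeltaDeserializer are importable from bzrlib.inventory_delta as introduced in this diff, and the single 'dir' entry line is schematic rather than canonical serializer output:

    from bzrlib import inventory_delta

    # Header is five lines, then one NUL-separated line per delta item;
    # the whole text must end with a newline or parse_text_bytes raises.
    text = (
        'format: %s\n' % inventory_delta.FORMAT_1 +
        'parent: null:\n'
        'version: rev-1\n'
        'versioned_root: false\n'
        'tree_references: false\n'
        'None\x00/subdir\x00dir-id\x00TREE_ROOT\x00rev-1\x00dir\n')

    deserializer = inventory_delta.InventoryDeltaDeserializer()
    (parent_id, version_id, versioned_root, tree_refs,
     delta) = deserializer.parse_text_bytes(text)
    # delta is a list of (oldpath, newpath, file_id, entry) tuples.
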
4478=== modified file 'bzrlib/knit.py'
4479--- bzrlib/knit.py 2009-08-04 04:36:34 +0000
4480+++ bzrlib/knit.py 2009-09-09 13:05:33 +0000
4481@@ -1190,6 +1190,19 @@
4482 generator = _VFContentMapGenerator(self, [key])
4483 return generator._get_content(key)
4484
4485+ def get_known_graph_ancestry(self, keys):
4486+ """Get a KnownGraph instance with the ancestry of keys."""
4487+ parent_map, missing_keys = self._index.find_ancestry(keys)
4488+ for fallback in self._fallback_vfs:
4489+ if not missing_keys:
4490+ break
4491+ (f_parent_map, f_missing_keys) = fallback._index.find_ancestry(
4492+ missing_keys)
4493+ parent_map.update(f_parent_map)
4494+ missing_keys = f_missing_keys
4495+ kg = _mod_graph.KnownGraph(parent_map)
4496+ return kg
4497+
4498 def get_parent_map(self, keys):
4499 """Get a map of the graph parents of keys.
4500
4501@@ -1505,10 +1518,10 @@
4502 if source is parent_maps[0]:
4503 # this KnitVersionedFiles
4504 records = [(key, positions[key][1]) for key in keys]
4505- for key, raw_data, sha1 in self._read_records_iter_raw(records):
4506+ for key, raw_data in self._read_records_iter_unchecked(records):
4507 (record_details, index_memo, _) = positions[key]
4508 yield KnitContentFactory(key, global_map[key],
4509- record_details, sha1, raw_data, self._factory.annotated, None)
4510+ record_details, None, raw_data, self._factory.annotated, None)
4511 else:
4512 vf = self._fallback_vfs[parent_maps.index(source) - 1]
4513 for record in vf.get_record_stream(keys, ordering,
4514@@ -1583,6 +1596,13 @@
4515 # key = basis_parent, value = index entry to add
4516 buffered_index_entries = {}
4517 for record in stream:
4518+ kind = record.storage_kind
4519+ if kind.startswith('knit-') and kind.endswith('-gz'):
4520+ # Check that the ID in the header of the raw knit bytes matches
4521+ # the record metadata.
4522+ raw_data = record._raw_record
4523+ df, rec = self._parse_record_header(record.key, raw_data)
4524+ df.close()
4525 buffered = False
4526 parents = record.parents
4527 if record.storage_kind in delta_types:
4528@@ -2560,6 +2580,33 @@
4529 except KeyError:
4530 raise RevisionNotPresent(key, self)
4531
4532+ def find_ancestry(self, keys):
4533+ """See CombinedGraphIndex.find_ancestry()"""
4534+ prefixes = set(key[:-1] for key in keys)
4535+ self._load_prefixes(prefixes)
4536+ result = {}
4537+ parent_map = {}
4538+ missing_keys = set()
4539+ pending_keys = list(keys)
4540+ # This assumes that keys will not reference parents in a different
4541+ # prefix, which is accurate so far.
4542+ while pending_keys:
4543+ key = pending_keys.pop()
4544+ if key in parent_map:
4545+ continue
4546+ prefix = key[:-1]
4547+ try:
4548+ suffix_parents = self._kndx_cache[prefix][0][key[-1]][4]
4549+ except KeyError:
4550+ missing_keys.add(key)
4551+ else:
4552+ parent_keys = tuple([prefix + (suffix,)
4553+ for suffix in suffix_parents])
4554+ parent_map[key] = parent_keys
4555+ pending_keys.extend([p for p in parent_keys
4556+ if p not in parent_map])
4557+ return parent_map, missing_keys
4558+
4559 def get_parent_map(self, keys):
4560 """Get a map of the parents of keys.
4561
4562@@ -2737,9 +2784,20 @@
4563
4564 class _KeyRefs(object):
4565
4566- def __init__(self):
4567+ def __init__(self, track_new_keys=False):
4568 # dict mapping 'key' to 'set of keys referring to that key'
4569 self.refs = {}
4570+ if track_new_keys:
4571+ # set remembering all new keys
4572+ self.new_keys = set()
4573+ else:
4574+ self.new_keys = None
4575+
4576+ def clear(self):
4577+ if self.refs:
4578+ self.refs.clear()
4579+ if self.new_keys:
4580+ self.new_keys.clear()
4581
4582 def add_references(self, key, refs):
4583 # Record the new references
4584@@ -2752,19 +2810,28 @@
4585 # Discard references satisfied by the new key
4586 self.add_key(key)
4587
4588+ def get_new_keys(self):
4589+ return self.new_keys
4590+
4591 def get_unsatisfied_refs(self):
4592 return self.refs.iterkeys()
4593
4594- def add_key(self, key):
4595+ def _satisfy_refs_for_key(self, key):
4596 try:
4597 del self.refs[key]
4598 except KeyError:
4599 # No keys depended on this key. That's ok.
4600 pass
4601
4602- def add_keys(self, keys):
4603+ def add_key(self, key):
4604+ # satisfy refs for key, and remember that we've seen this key.
4605+ self._satisfy_refs_for_key(key)
4606+ if self.new_keys is not None:
4607+ self.new_keys.add(key)
4608+
4609+ def satisfy_refs_for_keys(self, keys):
4610 for key in keys:
4611- self.add_key(key)
4612+ self._satisfy_refs_for_key(key)
4613
4614 def get_referrers(self):
4615 result = set()
4616@@ -2932,7 +2999,7 @@
4617 # If updating this, you should also update
4618 # groupcompress._GCGraphIndex.get_missing_parents
4619 # We may have false positives, so filter those out.
4620- self._key_dependencies.add_keys(
4621+ self._key_dependencies.satisfy_refs_for_keys(
4622 self.get_parent_map(self._key_dependencies.get_unsatisfied_refs()))
4623 return frozenset(self._key_dependencies.get_unsatisfied_refs())
4624
4625@@ -3049,6 +3116,10 @@
4626 options.append('no-eol')
4627 return options
4628
4629+ def find_ancestry(self, keys):
4630+ """See CombinedGraphIndex.find_ancestry()"""
4631+ return self._graph_index.find_ancestry(keys, 0)
4632+
4633 def get_parent_map(self, keys):
4634 """Get a map of the parents of keys.
4635
4636
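Both new find_ancestry implementations above use the same walk: start from the requested keys, look up each key's parents, and continue until every reachable key is either in parent_map or recorded as missing. A self-contained sketch of that pattern (plain dicts stand in for the knit index; this is not bzrlib API):

    def walk_ancestry(index, keys):
        """Return (parent_map, missing_keys) for everything reachable from keys."""
        parent_map = {}
        missing_keys = set()
        pending = list(keys)
        while pending:
            key = pending.pop()
            if key in parent_map:
                continue
            try:
                parents = index[key]
            except KeyError:
                missing_keys.add(key)
            else:
                parent_map[key] = parents
                pending.extend([p for p in parents if p not in parent_map])
        return parent_map, missing_keys

    # ('c',) depends on ('b',) and on a ghost that is not in the index.
    index = {('a',): (), ('b',): (('a',),), ('c',): (('b',), ('ghost',))}
    parent_map, missing = walk_ancestry(index, [('c',)])
    # parent_map covers ('c',), ('b',), ('a',); missing == set([('ghost',)])
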
4637=== modified file 'bzrlib/lock.py'
4638--- bzrlib/lock.py 2009-07-20 08:47:58 +0000
4639+++ bzrlib/lock.py 2009-07-31 16:51:48 +0000
4640@@ -190,6 +190,13 @@
4641 if self.filename in _fcntl_WriteLock._open_locks:
4642 self._clear_f()
4643 raise errors.LockContention(self.filename)
4644+ if self.filename in _fcntl_ReadLock._open_locks:
4645+ if 'strict_locks' in debug.debug_flags:
4646+ self._clear_f()
4647+ raise errors.LockContention(self.filename)
4648+ else:
4649+ trace.mutter('Write lock taken w/ an open read lock on: %s'
4650+ % (self.filename,))
4651
4652 self._open(self.filename, 'rb+')
4653 # reserve a slot for this lock - even if the lockf call fails,
4654@@ -220,6 +227,14 @@
4655 def __init__(self, filename):
4656 super(_fcntl_ReadLock, self).__init__()
4657 self.filename = osutils.realpath(filename)
4658+ if self.filename in _fcntl_WriteLock._open_locks:
4659+ if 'strict_locks' in debug.debug_flags:
4660+ # We raise before calling _open so we don't need to
4661+ # _clear_f
4662+ raise errors.LockContention(self.filename)
4663+ else:
4664+ trace.mutter('Read lock taken w/ an open write lock on: %s'
4665+ % (self.filename,))
4666 _fcntl_ReadLock._open_locks.setdefault(self.filename, 0)
4667 _fcntl_ReadLock._open_locks[self.filename] += 1
4668 self._open(filename, 'rb')
4669@@ -418,15 +433,15 @@
4670 DWORD, # dwFlagsAndAttributes
4671 HANDLE # hTemplateFile
4672 )((_function_name, ctypes.windll.kernel32))
4673-
4674+
4675 INVALID_HANDLE_VALUE = -1
4676-
4677+
4678 GENERIC_READ = 0x80000000
4679 GENERIC_WRITE = 0x40000000
4680 FILE_SHARE_READ = 1
4681 OPEN_ALWAYS = 4
4682 FILE_ATTRIBUTE_NORMAL = 128
4683-
4684+
4685 ERROR_ACCESS_DENIED = 5
4686 ERROR_SHARING_VIOLATION = 32
4687
4688
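The lock.py hunks only turn the mixed read/write case into a LockContention error when the 'strict_locks' debug flag is set; otherwise they just mutter a warning. A sketch of opting in from Python (debug.debug_flags is the set the code above consults); on the command line the equivalent should be the -Dstrict_locks global option, assuming the usual bzr -D debug-flag handling:

    from bzrlib import debug

    # Make taking a read lock on a file we already hold a write lock on
    # (or vice versa) raise LockContention instead of only logging.
    debug.debug_flags.add('strict_locks')
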
4689=== modified file 'bzrlib/lsprof.py'
4690--- bzrlib/lsprof.py 2009-03-08 06:18:06 +0000
4691+++ bzrlib/lsprof.py 2009-08-24 21:05:09 +0000
4692@@ -13,45 +13,74 @@
4693
4694 __all__ = ['profile', 'Stats']
4695
4696-_g_threadmap = {}
4697-
4698-
4699-def _thread_profile(f, *args, **kwds):
4700- # we lose the first profile point for a new thread in order to trampoline
4701- # a new Profile object into place
4702- global _g_threadmap
4703- thr = thread.get_ident()
4704- _g_threadmap[thr] = p = Profiler()
4705- # this overrides our sys.setprofile hook:
4706- p.enable(subcalls=True, builtins=True)
4707-
4708-
4709 def profile(f, *args, **kwds):
4710 """Run a function profile.
4711
4712 Exceptions are not caught: If you need stats even when exceptions are to be
4713- raised, passing in a closure that will catch the exceptions and transform
4714- them appropriately for your driver function.
4715+ raised, pass in a closure that will catch the exceptions and transform them
4716+ appropriately for your driver function.
4717
4718 :return: The functions return value and a stats object.
4719 """
4720- global _g_threadmap
4721- p = Profiler()
4722- p.enable(subcalls=True)
4723- threading.setprofile(_thread_profile)
4724+ profiler = BzrProfiler()
4725+ profiler.start()
4726 try:
4727 ret = f(*args, **kwds)
4728 finally:
4729- p.disable()
4730- for pp in _g_threadmap.values():
4731+ stats = profiler.stop()
4732+ return ret, stats
4733+
4734+
4735+class BzrProfiler(object):
4736+ """Bzr utility wrapper around Profiler.
4737+
4738+ For most uses the module level 'profile()' function will be suitable.
4739+    However, profiling when a simple wrapped function isn't available may

4740+ be easier to accomplish using this class.
4741+
4742+ To use it, create a BzrProfiler and call start() on it. Some arbitrary
4743+ time later call stop() to stop profiling and retrieve the statistics
4744+ from the code executed in the interim.
4745+ """
4746+
4747+ def start(self):
4748+ """Start profiling.
4749+
4750+ This hooks into threading and will record all calls made until
4751+ stop() is called.
4752+ """
4753+ self._g_threadmap = {}
4754+ self.p = Profiler()
4755+ self.p.enable(subcalls=True)
4756+ threading.setprofile(self._thread_profile)
4757+
4758+ def stop(self):
4759+ """Stop profiling.
4760+
4761+ This unhooks from threading and cleans up the profiler, returning
4762+ the gathered Stats object.
4763+
4764+ :return: A bzrlib.lsprof.Stats object.
4765+ """
4766+ self.p.disable()
4767+ for pp in self._g_threadmap.values():
4768 pp.disable()
4769 threading.setprofile(None)
4770+ p = self.p
4771+ self.p = None
4772+ threads = {}
4773+ for tid, pp in self._g_threadmap.items():
4774+ threads[tid] = Stats(pp.getstats(), {})
4775+ self._g_threadmap = None
4776+ return Stats(p.getstats(), threads)
4777
4778- threads = {}
4779- for tid, pp in _g_threadmap.items():
4780- threads[tid] = Stats(pp.getstats(), {})
4781- _g_threadmap = {}
4782- return ret, Stats(p.getstats(), threads)
4783+ def _thread_profile(self, f, *args, **kwds):
4784+ # we lose the first profile point for a new thread in order to
4785+ # trampoline a new Profile object into place
4786+ thr = thread.get_ident()
4787+ self._g_threadmap[thr] = p = Profiler()
4788+ # this overrides our sys.setprofile hook:
4789+ p.enable(subcalls=True, builtins=True)
4790
4791
4792 class Stats(object):
4793
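Following the BzrProfiler docstring above, a minimal usage sketch; do_some_work() is a placeholder, and the sort()/pprint() calls assume the pre-existing helpers on the Stats object returned by stop():

    from bzrlib.lsprof import BzrProfiler

    profiler = BzrProfiler()
    profiler.start()
    try:
        do_some_work()        # placeholder for the code being measured
    finally:
        stats = profiler.stop()
    stats.sort('inlinetime')  # assumes the existing Stats helpers
    stats.pprint()
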
4794=== modified file 'bzrlib/mail_client.py'
4795--- bzrlib/mail_client.py 2009-06-10 03:56:49 +0000
4796+++ bzrlib/mail_client.py 2009-09-02 08:26:27 +0000
4797@@ -424,6 +424,10 @@
4798
4799 _client_commands = ['emacsclient']
4800
4801+ def __init__(self, config):
4802+ super(EmacsMail, self).__init__(config)
4803+ self.elisp_tmp_file = None
4804+
4805 def _prepare_send_function(self):
4806 """Write our wrapper function into a temporary file.
4807
4808@@ -500,6 +504,7 @@
4809 if attach_path is not None:
4810 # Do not create a file if there is no attachment
4811 elisp = self._prepare_send_function()
4812+ self.elisp_tmp_file = elisp
4813 lmmform = '(load "%s")' % elisp
4814 mmform = '(bzr-add-mime-att "%s")' % \
4815 self._encode_path(attach_path, 'attachment')
4816
4817=== modified file 'bzrlib/merge.py'
4818--- bzrlib/merge.py 2009-07-02 13:07:14 +0000
4819+++ bzrlib/merge.py 2009-09-11 13:32:55 +0000
4820@@ -64,8 +64,12 @@
4821
4822
4823 def transform_tree(from_tree, to_tree, interesting_ids=None):
4824- merge_inner(from_tree.branch, to_tree, from_tree, ignore_zero=True,
4825- interesting_ids=interesting_ids, this_tree=from_tree)
4826+ from_tree.lock_tree_write()
4827+ try:
4828+ merge_inner(from_tree.branch, to_tree, from_tree, ignore_zero=True,
4829+ interesting_ids=interesting_ids, this_tree=from_tree)
4830+ finally:
4831+ from_tree.unlock()
4832
4833
4834 class Merger(object):
4835@@ -102,6 +106,17 @@
4836 self._is_criss_cross = None
4837 self._lca_trees = None
4838
4839+ def cache_trees_with_revision_ids(self, trees):
4840+ """Cache any tree in trees if it has a revision_id."""
4841+ for maybe_tree in trees:
4842+ if maybe_tree is None:
4843+ continue
4844+ try:
4845+ rev_id = maybe_tree.get_revision_id()
4846+ except AttributeError:
4847+ continue
4848+ self._cached_trees[rev_id] = maybe_tree
4849+
4850 @property
4851 def revision_graph(self):
4852 if self._revision_graph is None:
4853@@ -598,19 +613,21 @@
4854 self.this_tree.lock_tree_write()
4855 self.base_tree.lock_read()
4856 self.other_tree.lock_read()
4857- self.tt = TreeTransform(self.this_tree, self.pb)
4858 try:
4859- self.pp.next_phase()
4860- self._compute_transform()
4861- self.pp.next_phase()
4862- results = self.tt.apply(no_conflicts=True)
4863- self.write_modified(results)
4864+ self.tt = TreeTransform(self.this_tree, self.pb)
4865 try:
4866- self.this_tree.add_conflicts(self.cooked_conflicts)
4867- except UnsupportedOperation:
4868- pass
4869+ self.pp.next_phase()
4870+ self._compute_transform()
4871+ self.pp.next_phase()
4872+ results = self.tt.apply(no_conflicts=True)
4873+ self.write_modified(results)
4874+ try:
4875+ self.this_tree.add_conflicts(self.cooked_conflicts)
4876+ except UnsupportedOperation:
4877+ pass
4878+ finally:
4879+ self.tt.finalize()
4880 finally:
4881- self.tt.finalize()
4882 self.other_tree.unlock()
4883 self.base_tree.unlock()
4884 self.this_tree.unlock()
4885@@ -1516,6 +1533,7 @@
4886 get_revision_id = getattr(base_tree, 'get_revision_id', None)
4887 if get_revision_id is None:
4888 get_revision_id = base_tree.last_revision
4889+ merger.cache_trees_with_revision_ids([other_tree, base_tree, this_tree])
4890 merger.set_base_revision(get_revision_id(), this_branch)
4891 return merger.do_merge()
4892
4893
4894=== modified file 'bzrlib/missing.py'
4895--- bzrlib/missing.py 2009-03-23 14:59:43 +0000
4896+++ bzrlib/missing.py 2009-08-17 18:52:01 +0000
4897@@ -138,31 +138,13 @@
4898 if not ancestry: #Empty ancestry, no need to do any work
4899 return []
4900
4901- mainline_revs, rev_nos, start_rev_id, end_rev_id = log._get_mainline_revs(
4902- branch, None, tip_revno)
4903- if not mainline_revs:
4904- return []
4905-
4906- # This asks for all mainline revisions, which is size-of-history and
4907- # should be addressed (but currently the only way to get correct
4908- # revnos).
4909-
4910- # mainline_revisions always includes an extra revision at the
4911- # beginning, so don't request it.
4912- parent_map = dict(((key, value) for key, value
4913- in graph.iter_ancestry(mainline_revs[1:])
4914- if value is not None))
4915- # filter out ghosts; merge_sort errors on ghosts.
4916- # XXX: is this needed here ? -- vila080910
4917- rev_graph = _mod_repository._strip_NULL_ghosts(parent_map)
4918- # XXX: what if rev_graph is empty now ? -- vila080910
4919- merge_sorted_revisions = tsort.merge_sort(rev_graph, tip,
4920- mainline_revs,
4921- generate_revno=True)
4922+ merge_sorted_revisions = branch.iter_merge_sorted_revisions()
4923 # Now that we got the correct revnos, keep only the relevant
4924 # revisions.
4925 merge_sorted_revisions = [
4926- (s, revid, n, d, e) for s, revid, n, d, e in merge_sorted_revisions
4927+ # log.reverse_by_depth expects seq_num to be present, but it is
4928+ # stripped by iter_merge_sorted_revisions()
4929+ (0, revid, n, d, e) for revid, n, d, e in merge_sorted_revisions
4930 if revid in ancestry]
4931 if not backward:
4932 merge_sorted_revisions = log.reverse_by_depth(merge_sorted_revisions)
4933
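In isolation, the reshaping done above pads the 4-tuples yielded by iter_merge_sorted_revisions() with a dummy seq_num of 0 so that log.reverse_by_depth(), which expects a leading seq_num, still works. With placeholder values (real tuples carry revno, depth and end-of-merge data):

    merge_sorted = [
        ('rev-a', 'n1', 'd1', 'e1'),
        ('rev-b', 'n2', 'd2', 'e2'),
    ]
    ancestry = set(['rev-a'])
    padded = [(0, revid, n, d, e)
              for revid, n, d, e in merge_sorted
              if revid in ancestry]
    # padded == [(0, 'rev-a', 'n1', 'd1', 'e1')]
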
4934=== modified file 'bzrlib/mutabletree.py'
4935--- bzrlib/mutabletree.py 2009-07-10 08:33:11 +0000
4936+++ bzrlib/mutabletree.py 2009-09-07 23:14:05 +0000
4937@@ -226,6 +226,9 @@
4938 revprops=revprops,
4939 possible_master_transports=possible_master_transports,
4940 *args, **kwargs)
4941+ post_hook_params = PostCommitHookParams(self)
4942+ for hook in MutableTree.hooks['post_commit']:
4943+ hook(post_hook_params)
4944 return committed_id
4945
4946 def _gather_kinds(self, files, kinds):
4947@@ -581,14 +584,32 @@
4948 self.create_hook(hooks.HookPoint('start_commit',
4949 "Called before a commit is performed on a tree. The start commit "
4950 "hook is able to change the tree before the commit takes place. "
4951- "start_commit is called with the bzrlib.tree.MutableTree that the "
4952- "commit is being performed on.", (1, 4), None))
4953+ "start_commit is called with the bzrlib.mutabletree.MutableTree "
4954+ "that the commit is being performed on.", (1, 4), None))
4955+ self.create_hook(hooks.HookPoint('post_commit',
4956+ "Called after a commit is performed on a tree. The hook is "
4957+ "called with a bzrlib.mutabletree.PostCommitHookParams object. "
4958+ "The mutable tree the commit was performed on is available via "
4959+ "the mutable_tree attribute of that object.", (2, 0), None))
4960
4961
4962 # install the default hooks into the MutableTree class.
4963 MutableTree.hooks = MutableTreeHooks()
4964
4965
4966+class PostCommitHookParams(object):
4967+ """Parameters for the post_commit hook.
4968+
4969+ To access the parameters, use the following attributes:
4970+
4971+ * mutable_tree - the MutableTree object
4972+ """
4973+
4974+ def __init__(self, mutable_tree):
4975+ """Create the parameters for the post_commit hook."""
4976+ self.mutable_tree = mutable_tree
4977+
4978+
4979 class _FastPath(object):
4980 """A path object with fast accessors for things like basename."""
4981
4982
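The new post_commit hook point receives a PostCommitHookParams whose mutable_tree attribute is the tree just committed. A sketch of installing a callback, assuming the standard install_named_hook() registration helper on bzrlib hook points; the callback itself is illustrative:

    from bzrlib.mutabletree import MutableTree

    def announce_commit(params):
        # params is a PostCommitHookParams instance.
        print 'post_commit fired for %r' % (params.mutable_tree,)

    MutableTree.hooks.install_named_hook(
        'post_commit', announce_commit, 'announce_commit example')
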
4983=== modified file 'bzrlib/plugin.py'
4984--- bzrlib/plugin.py 2009-03-24 01:53:42 +0000
4985+++ bzrlib/plugin.py 2009-09-04 15:36:48 +0000
4986@@ -52,12 +52,16 @@
4987 from bzrlib import plugins as _mod_plugins
4988 """)
4989
4990-from bzrlib.symbol_versioning import deprecated_function
4991+from bzrlib.symbol_versioning import (
4992+ deprecated_function,
4993+ deprecated_in,
4994+ )
4995
4996
4997 DEFAULT_PLUGIN_PATH = None
4998 _loaded = False
4999
5000+@deprecated_function(deprecated_in((2, 0, 0)))
The diff has been truncated for viewing.
