Merge lp:~ian-clatworthy/bzr/eol-update-bug into lp:bzr/2.0
- eol-update-bug
- Merge into 2.0
Status: | Merged |
---|---|
Merged at revision: | not available |
Proposed branch: | lp:~ian-clatworthy/bzr/eol-update-bug |
Merge into: | lp:bzr/2.0 |
Diff against target: | 325 lines |
To merge this branch: | bzr merge lp:~ian-clatworthy/bzr/eol-update-bug |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Robert Collins (community) | Needs Fixing | ||
Review via email: mp+10959@code.launchpad.net |
Commit message
Description of the change
Ian Clatworthy (ian-clatworthy) wrote : | # |
Robert Collins (lifeless) wrote : | # |
>> BTW, the bug report also mentions that after committing a tree with keywords expanded, the keywords aren't re-expanded after commit.
Commit does not write to the users tree : a post commit hook isn't appropriate [for starters, other post commit hooks could interfere/see convenience forms themselves etc]. I suggest looking closely for the cause/some possible confusion is going on here.
Robert Collins (lifeless) wrote : | # |
I'm trying to channel Aaron here, and I don't know the transform code *all that well*. However, what you've done looks wrong to me.
Specifically, you look up a path in one tree, and use that to select filters in a different tree (this is at the very top of your patch).
This has the problem that when a containing directory is renamed in this tree, it will be matching on a different path than actually will be used.
Secondly, as filters are defined as being path based, *every* change to a transform that can cause a change to the path of something that can be filtered needs to recalculate the filter stack and possibly reapply it, for all those somethings. E.g. renaming a directory foo/baz to bar/baz, and foo was filtered and bar isn't.
I think the change to the API to pass in a path that is able to be wrong reduces clarity and shouldn't be done. We can probably get by (with a critical bug to be fixed asap after the release) with out updating filters appropriately.
Ian Clatworthy (ian-clatworthy) wrote : | # |
Robert Collins wrote:
> Review: Needs Fixing
Thanks for the feedback. poolie is taking over this patch from here so I
can focus on the Windows installer. I'll let him look deeper but some
quick comments on your first point ...
> Specifically, you look up a path in one tree, and use that to select filters in a different tree (this is at the very top of your patch).
The patch mightn't be quite right. I can say however that this cross
tree stuff is tricky. Keep in mind that:
* in the case of reverting a removed file, its path doesn't exist
in the working tree so the only place to find that is the rev-tree
* rules aren't stored historically so, IIRC, revtrees don't know
what they are - only the WT knows that.
So looking up a path in the revtree and using the filters from the WT
may indeed be correct, in the case of revert at least.
Merging has similar challenges. A new path may only exist in the OTHER
tree say but the filters applied need to come from the working tree
being merged into.
Ian C.
Ian Clatworthy (ian-clatworthy) wrote : | # |
Robert Collins wrote:
>>> BTW, the bug report also mentions that after committing a tree with keywords expanded, the keywords aren't re-expanded after commit.
>
> Commit does not write to the users tree : a post commit hook isn't appropriate [for starters, other post commit hooks could interfere/see convenience forms themselves etc]. I suggest looking closely for the cause/some possible confusion is going on here.
Plugins like keywords do need to update the working tree, if any, after
a commit so that files just committed get keywords values updated. As
the keywords plugin is currently designed, the only files needing
potential update are the newly committed ones.
I agree a post commit hook isn't the answer. I've put a separate patch
up for a finish-commit hook on mutable trees instead.
Ian C.
Robert Collins (lifeless) wrote : | # |
On Thu, 2009-09-03 at 00:57 +0000, Ian Clatworthy wrote:
>
> > Commit does not write to the users tree : a post commit hook isn't
> appropriate [for starters, other post commit hooks could interfere/see
> convenience forms themselves etc]. I suggest looking closely for the
> cause/some possible confusion is going on here.
>
> Plugins like keywords do need to update the working tree, if any,
> after
> a commit so that files just committed get keywords values updated. As
> the keywords plugin is currently designed, the only files needing
> potential update are the newly committed ones.
>
> I agree a post commit hook isn't the answer. I've put a separate patch
> up for a finish-commit hook on mutable trees instead.
Oh, that use case wasn't clear - I didn't get what was needed.
Branch.post_commit definitely isn't the right thing as its on branch.
There may be room for MutableTree.
whether or not <activity> should occur before the basis revision id is
changed.
So, the question to ask is 'should keywords update the files *after* the
basis changes, or *before*. I think the answer is *after*, but you've
spent more time staring at this problem: its up to you.
I do suggest that the hook name try to reflect where in the process its
happening, and if its after commit has done its stuff, its really just a
post commit hook on a different object, so lets use the namespaces and
call it MT.post_commit
-Rob
Robert Collins (lifeless) wrote : | # |
On Thu, 2009-09-03 at 00:54 +0000, Ian Clatworthy wrote:
>
> The patch mightn't be quite right. I can say however that this cross
> tree stuff is tricky. Keep in mind that:
>
> * in the case of reverting a removed file, its path doesn't exist
> in the working tree so the only place to find that is the rev-tree
It has a path that it will end up with in the output tree; thats the one
that should be used, right? Paths in other trees are a poor proxy for
that.
> So looking up a path in the revtree and using the filters from the WT
> may indeed be correct, in the case of revert at least.
I argue that it would only be accurate in the most trivial of cases;
solving the primary problem - that a transform may need to reapply
filters some N times - will solve the other cases as well.
> Merging has similar challenges. A new path may only exist in the OTHER
> tree say but the filters applied need to come from the working tree
> being merged into.
Right, its exactly what I'm driving at.
-Rob
Martin Pool (mbp) wrote : | # |
I don't think this is safe for 2.0 but it might be ok for 2.1b1
280 +def create_
281 + filter_
282 + """Create new file contents according to tree contents.
283 +
284 + :param filter_tree_path: the tree path to use to lookup
285 + content filters to apply to the bytes output in the working tree.
286 + This only applies if the working tree supports content filtering.
287 + """
I don't understand how that new parameter works:
Is it a single file? How can that work with tt working across a whole tree?
Ian Clatworthy (ian-clatworthy) wrote : | # |
> I don't think this is safe for 2.0 but it might be ok for 2.1b1
>
> 280 +def create_
> 281 + filter_
> 282 + """Create new file contents according to tree contents.
> 283 +
> 284 + :param filter_tree_path: the tree path to use to lookup
> 285 + content filters to apply to the bytes output in the working tree.
> 286 + This only applies if the working tree supports content filtering.
> 287 + """
>
> I don't understand how that new parameter works:
>
> Is it a single file? How can that work with tt working across a whole tree?
This API is called per file-id, it's not called per tree.
Preview Diff
1 | === modified file 'NEWS' |
2 | --- NEWS 2009-08-31 00:25:33 +0000 |
3 | +++ NEWS 2009-09-01 06:35:12 +0000 |
4 | @@ -20,6 +20,9 @@ |
5 | revisions that are in the fallback repository. (Regressed in 2.0rc1). |
6 | (John Arbash Meinel, #419241) |
7 | |
8 | +* Content filters are now applied correctly after pull, merge, switch |
9 | + and revert. (Ian Clatworthy, #385879) |
10 | + |
11 | * Fix a segmentation fault when computing the ``merge_sort`` of a graph |
12 | that has a ghost in the mainline ancestry. |
13 | (John Arbash Meinel, #419241) |
14 | |
15 | === modified file 'bzrlib/merge.py' |
16 | --- bzrlib/merge.py 2009-08-26 03:20:32 +0000 |
17 | +++ bzrlib/merge.py 2009-09-01 06:35:12 +0000 |
18 | @@ -1160,8 +1160,15 @@ |
19 | self.tt.delete_contents(trans_id) |
20 | if file_id in self.other_tree: |
21 | # OTHER changed the file |
22 | + wt = self.this_tree |
23 | + if wt.supports_content_filtering(): |
24 | + filter_tree_path = self.other_tree.id2path(file_id) |
25 | + else: |
26 | + # Skip the id2path lookup for older formats |
27 | + filter_tree_path = None |
28 | create_from_tree(self.tt, trans_id, |
29 | - self.other_tree, file_id) |
30 | + self.other_tree, file_id, |
31 | + filter_tree_path=filter_tree_path) |
32 | if not file_in_this: |
33 | self.tt.version_file(file_id, trans_id) |
34 | return "modified" |
35 | @@ -1254,12 +1261,26 @@ |
36 | ('THIS', self.this_tree, this_lines)] |
37 | if not no_base: |
38 | data.append(('BASE', self.base_tree, base_lines)) |
39 | + |
40 | + # We need to use the actual path in the working tree of the file here, |
41 | + # ignoring the conflict suffixes |
42 | + wt = self.this_tree |
43 | + if wt.supports_content_filtering(): |
44 | + try: |
45 | + filter_tree_path = wt.id2path(file_id) |
46 | + except errors.NoSuchId: |
47 | + # file has been deleted |
48 | + filter_tree_path = None |
49 | + else: |
50 | + # Skip the id2path lookup for older formats |
51 | + filter_tree_path = None |
52 | + |
53 | versioned = False |
54 | file_group = [] |
55 | for suffix, tree, lines in data: |
56 | if file_id in tree: |
57 | trans_id = self._conflict_file(name, parent_id, tree, file_id, |
58 | - suffix, lines) |
59 | + suffix, lines, filter_tree_path) |
60 | file_group.append(trans_id) |
61 | if set_version and not versioned: |
62 | self.tt.version_file(file_id, trans_id) |
63 | @@ -1267,11 +1288,12 @@ |
64 | return file_group |
65 | |
66 | def _conflict_file(self, name, parent_id, tree, file_id, suffix, |
67 | - lines=None): |
68 | + lines=None, filter_tree_path=None): |
69 | """Emit a single conflict file.""" |
70 | name = name + '.' + suffix |
71 | trans_id = self.tt.create_path(name, parent_id) |
72 | - create_from_tree(self.tt, trans_id, tree, file_id, lines) |
73 | + create_from_tree(self.tt, trans_id, tree, file_id, lines, |
74 | + filter_tree_path) |
75 | return trans_id |
76 | |
77 | def merge_executable(self, file_id, file_status): |
78 | |
79 | === modified file 'bzrlib/tests/per_workingtree/test_content_filters.py' |
80 | --- bzrlib/tests/per_workingtree/test_content_filters.py 2009-08-25 04:43:21 +0000 |
81 | +++ bzrlib/tests/per_workingtree/test_content_filters.py 2009-09-01 06:35:12 +0000 |
82 | @@ -16,7 +16,11 @@ |
83 | |
84 | """Tests for content filtering conformance""" |
85 | |
86 | +import os |
87 | + |
88 | +from bzrlib.bzrdir import BzrDir |
89 | from bzrlib.filters import ContentFilter |
90 | +from bzrlib.switch import switch |
91 | from bzrlib.workingtree import WorkingTree |
92 | from bzrlib.tests.per_workingtree import TestCaseWithWorkingTree |
93 | |
94 | @@ -64,7 +68,8 @@ |
95 | |
96 | class TestWorkingTreeWithContentFilters(TestCaseWithWorkingTree): |
97 | |
98 | - def create_cf_tree(self, txt_reader, txt_writer, dir='.'): |
99 | + def create_cf_tree(self, txt_reader, txt_writer, dir='.', |
100 | + two_revisions=False): |
101 | tree = self.make_branch_and_tree(dir) |
102 | def _content_filter_stack(path=None, file_id=None): |
103 | if path.endswith('.txt'): |
104 | @@ -77,10 +82,37 @@ |
105 | (dir + '/file2.bin', 'Foo Bin')]) |
106 | tree.add(['file1.txt', 'file2.bin']) |
107 | tree.commit('commit raw content') |
108 | + # Commit another revision with changed text, if requested |
109 | + if two_revisions: |
110 | + self.build_tree_contents([(dir + '/file1.txt', 'Foo ROCKS!')]) |
111 | + tree.commit("changed file1.txt") |
112 | txt_fileid = tree.path2id('file1.txt') |
113 | bin_fileid = tree.path2id('file2.bin') |
114 | return tree, txt_fileid, bin_fileid |
115 | |
116 | + def patch_in_content_filter(self): |
117 | + # Patch in a custom, symmetric content filter stack |
118 | + self.real_content_filter_stack = WorkingTree._content_filter_stack |
119 | + def restore_real_content_filter_stack(): |
120 | + WorkingTree._content_filter_stack = self.real_content_filter_stack |
121 | + self.addCleanup(restore_real_content_filter_stack) |
122 | + def _content_filter_stack(tree, path=None, file_id=None): |
123 | + if path.endswith('.txt'): |
124 | + return [ContentFilter(_swapcase, _swapcase)] |
125 | + else: |
126 | + return [] |
127 | + WorkingTree._content_filter_stack = _content_filter_stack |
128 | + |
129 | + def assert_basis_content(self, expected_content, branch, file_id): |
130 | + # Note: We need to use try/finally here instead of addCleanup() |
131 | + # as the latter leaves the read lock in place too long |
132 | + basis = branch.basis_tree() |
133 | + basis.lock_read() |
134 | + try: |
135 | + self.assertEqual(expected_content, basis.get_file_text(file_id)) |
136 | + finally: |
137 | + basis.unlock() |
138 | + |
139 | def test_symmetric_content_filtering(self): |
140 | # test handling when read then write gives back the initial content |
141 | tree, txt_fileid, bin_fileid = self.create_cf_tree( |
142 | @@ -132,10 +164,7 @@ |
143 | if not source.supports_content_filtering(): |
144 | return |
145 | self.assertFileEqual("Foo Txt", 'source/file1.txt') |
146 | - basis = source.basis_tree() |
147 | - basis.lock_read() |
148 | - self.addCleanup(basis.unlock) |
149 | - self.assertEqual("FOO TXT", basis.get_file_text(txt_fileid)) |
150 | + self.assert_basis_content("FOO TXT", source, txt_fileid) |
151 | |
152 | # Now branch it |
153 | self.run_bzr('branch source target') |
154 | @@ -153,24 +182,10 @@ |
155 | if not source.supports_content_filtering(): |
156 | return |
157 | self.assertFileEqual("Foo Txt", 'source/file1.txt') |
158 | - basis = source.basis_tree() |
159 | - basis.lock_read() |
160 | - self.addCleanup(basis.unlock) |
161 | - self.assertEqual("Foo Txt", basis.get_file_text(txt_fileid)) |
162 | - |
163 | - # Patch in a custom, symmetric content filter stack |
164 | - self.real_content_filter_stack = WorkingTree._content_filter_stack |
165 | - def restore_real_content_filter_stack(): |
166 | - WorkingTree._content_filter_stack = self.real_content_filter_stack |
167 | - self.addCleanup(restore_real_content_filter_stack) |
168 | - def _content_filter_stack(tree, path=None, file_id=None): |
169 | - if path.endswith('.txt'): |
170 | - return [ContentFilter(_swapcase, _swapcase)] |
171 | - else: |
172 | - return [] |
173 | - WorkingTree._content_filter_stack = _content_filter_stack |
174 | - |
175 | - # Now branch it |
176 | + self.assert_basis_content("Foo Txt", source, txt_fileid) |
177 | + |
178 | + # Now patch in content filtering and branch the source |
179 | + self.patch_in_content_filter() |
180 | self.run_bzr('branch source target') |
181 | target = WorkingTree.open('target') |
182 | # Even though the content in source and target are different |
183 | @@ -209,3 +224,86 @@ |
184 | # we could give back the length of the canonical form, but in general |
185 | # that will be expensive to compute, so it's acceptable to just return |
186 | # None. |
187 | + |
188 | + def test_content_filtering_applied_on_pull(self): |
189 | + # Create a source branch with two revisions |
190 | + source, txt_fileid, bin_fileid = self.create_cf_tree( |
191 | + txt_reader=None, txt_writer=None, dir='source', two_revisions=True) |
192 | + if not source.supports_content_filtering(): |
193 | + return |
194 | + self.assertFileEqual("Foo ROCKS!", 'source/file1.txt') |
195 | + self.assert_basis_content("Foo ROCKS!", source, txt_fileid) |
196 | + |
197 | + # Now patch in content filtering and branch from revision 1 |
198 | + self.patch_in_content_filter() |
199 | + self.run_bzr('branch -r1 source target') |
200 | + target = WorkingTree.open('target') |
201 | + self.assertFileEqual("fOO tXT", 'target/file1.txt') |
202 | + self.assert_basis_content("Foo Txt", target, txt_fileid) |
203 | + |
204 | + # Pull the latter change and check the target tree is updated |
205 | + self.run_bzr('pull -d target') |
206 | + self.assertFileEqual("fOO rocks!", 'target/file1.txt') |
207 | + self.assert_basis_content("Foo ROCKS!", target, txt_fileid) |
208 | + |
209 | + def test_content_filtering_applied_on_merge(self): |
210 | + # Create a source branch with two revisions |
211 | + source, txt_fileid, bin_fileid = self.create_cf_tree( |
212 | + txt_reader=None, txt_writer=None, dir='source', two_revisions=True) |
213 | + if not source.supports_content_filtering(): |
214 | + return |
215 | + self.assertFileEqual("Foo ROCKS!", 'source/file1.txt') |
216 | + self.assert_basis_content("Foo ROCKS!", source, txt_fileid) |
217 | + |
218 | + # Now patch in content filtering and branch from revision 1 |
219 | + self.patch_in_content_filter() |
220 | + self.run_bzr('branch -r1 source target') |
221 | + target = WorkingTree.open('target') |
222 | + self.assertFileEqual("fOO tXT", 'target/file1.txt') |
223 | + self.assert_basis_content("Foo Txt", target, txt_fileid) |
224 | + |
225 | + # Merge the latter change and check the target tree is updated |
226 | + self.run_bzr('merge -d target source') |
227 | + self.assertFileEqual("fOO rocks!", 'target/file1.txt') |
228 | + |
229 | + # Commit the merge and check the right content is stored |
230 | + target.commit("merge file1.txt changes from source") |
231 | + self.assert_basis_content("Foo ROCKS!", target, txt_fileid) |
232 | + |
233 | + def test_content_filtering_applied_on_switch(self): |
234 | + # Create a source branch with two revisions |
235 | + source, txt_fileid, bin_fileid = self.create_cf_tree( |
236 | + txt_reader=None, txt_writer=None, dir='branch-a', two_revisions=True) |
237 | + if not source.supports_content_filtering(): |
238 | + return |
239 | + |
240 | + # Now patch in content filtering and branch from revision 1 |
241 | + self.patch_in_content_filter() |
242 | + self.run_bzr('branch -r1 branch-a branch-b') |
243 | + |
244 | + # Now create a lightweight checkout referring to branch-b |
245 | + self.run_bzr('checkout --lightweight branch-b checkout') |
246 | + self.assertFileEqual("fOO tXT", 'checkout/file1.txt') |
247 | + |
248 | + # Switch it to branch-b and check the tree is updated |
249 | + checkout_control_dir = BzrDir.open_containing('checkout')[0] |
250 | + switch(checkout_control_dir, source.branch) |
251 | + self.assertFileEqual("fOO rocks!", 'checkout/file1.txt') |
252 | + |
253 | + def test_content_filtering_applied_on_revert(self): |
254 | + # Create a source branch with content filtering |
255 | + source, txt_fileid, bin_fileid = self.create_cf_tree( |
256 | + txt_reader=_uppercase, txt_writer=_lowercase, dir='source') |
257 | + if not source.supports_content_filtering(): |
258 | + return |
259 | + self.assertFileEqual("Foo Txt", 'source/file1.txt') |
260 | + self.assert_basis_content("FOO TXT", source, txt_fileid) |
261 | + |
262 | + # Now delete the file, revert it and check the content |
263 | + os.unlink('source/file1.txt') |
264 | + self.assertFalse(os.path.exists('source/file1.txt')) |
265 | + source.revert(['file1.txt']) |
266 | + self.assertTrue(os.path.exists('source/file1.txt')) |
267 | + # Note: we don't get back exactly what was in the tree |
268 | + # previously because lower(upper(text)) is a lossy transformation |
269 | + self.assertFileEqual("foo txt", 'source/file1.txt') |
270 | |
271 | === modified file 'bzrlib/transform.py' |
272 | --- bzrlib/transform.py 2009-08-26 05:38:16 +0000 |
273 | +++ bzrlib/transform.py 2009-09-01 06:35:12 +0000 |
274 | @@ -2402,8 +2402,14 @@ |
275 | tt.create_directory(trans_id) |
276 | |
277 | |
278 | -def create_from_tree(tt, trans_id, tree, file_id, bytes=None): |
279 | - """Create new file contents according to tree contents.""" |
280 | +def create_from_tree(tt, trans_id, tree, file_id, bytes=None, |
281 | + filter_tree_path=None): |
282 | + """Create new file contents according to tree contents. |
283 | + |
284 | + :param filter_tree_path: the tree path to use to lookup |
285 | + content filters to apply to the bytes output in the working tree. |
286 | + This only applies if the working tree supports content filtering. |
287 | + """ |
288 | kind = tree.kind(file_id) |
289 | if kind == 'directory': |
290 | tt.create_directory(trans_id) |
291 | @@ -2414,6 +2420,11 @@ |
292 | bytes = tree_file.readlines() |
293 | finally: |
294 | tree_file.close() |
295 | + wt = tt._tree |
296 | + if wt.supports_content_filtering() and filter_tree_path is not None: |
297 | + filters = wt._content_filter_stack(filter_tree_path) |
298 | + bytes = filtered_output_bytes(bytes, filters, |
299 | + ContentFilterContext(filter_tree_path, tree)) |
300 | tt.create_file(bytes, trans_id) |
301 | elif kind == "symlink": |
302 | tt.create_symlink(tree.get_symlink_target(file_id), trans_id) |
303 | @@ -2610,9 +2621,19 @@ |
304 | tt.adjust_path(name[1], parent_trans, trans_id) |
305 | if executable[0] != executable[1] and kind[1] == "file": |
306 | tt.set_executability(executable[1], trans_id) |
307 | - for (trans_id, mode_id), bytes in target_tree.iter_files_bytes( |
308 | - deferred_files): |
309 | - tt.create_file(bytes, trans_id, mode_id) |
310 | + if working_tree.supports_content_filtering(): |
311 | + for index, ((trans_id, mode_id), bytes) in enumerate( |
312 | + target_tree.iter_files_bytes(deferred_files)): |
313 | + file_id = deferred_files[index][0] |
314 | + filter_tree_path = target_tree.id2path(file_id) |
315 | + filters = working_tree._content_filter_stack(filter_tree_path) |
316 | + bytes = filtered_output_bytes(bytes, filters, |
317 | + ContentFilterContext(filter_tree_path, working_tree)) |
318 | + tt.create_file(bytes, trans_id, mode_id) |
319 | + else: |
320 | + for (trans_id, mode_id), bytes in target_tree.iter_files_bytes( |
321 | + deferred_files): |
322 | + tt.create_file(bytes, trans_id, mode_id) |
323 | finally: |
324 | if basis_tree is not None: |
325 | basis_tree.unlock() |
This patch fixes content filtering so that it works after most operations, not just after the initial branch/checkout. There were 2 code paths in transform.py that needed enhancing to check for content filters: one used by 'merging' and one used by 'reverting'. The merging code path is used by lots of operations including merge, pull, update and switch.
I've added several 'blackbox-style' tests as part of this patch. It probably wouldn't hurt to add some more for other operations, e.g. shelve/unshelve, though I'm optimistic that the 2 code paths above will cover 99% of them.
In any case, I'm submitting this for review now because:
1. Without these code changes, content filtering is effectively broken for most users.
2. The 2.0 code freeze is really close and I want to maximise the time available for review of what's fixed/tested already.
3. Those tests could come in a latter patch (after there's agreement on whether blackbox-style tests are acceptable here or not).
BTW, the bug report also mentions that after committing a tree with keywords expanded, the keywords aren't re-expanded after commit. I'm going to work on that issue now, either by adding a post-commit hook to the keywords plugin or by changing commit. The latter approach implies always paying a performance cost whether it's required or not so I'll try the post-commit approach first. Either way, I feel it's best to keep that issue out of scope of this patch.