Bazaar

Merge lp:~vila/bzr/880701-resolve-duplicate into lp:bzr/2.4

Proposed by Vincent Ladeuil on 2011-10-26

Status: Merged
Approved by: Vincent Ladeuil on 2011-10-27
Approved revision: no longer in the source branch.
Merged at revision: 6059
Proposed branch: lp:~vila/bzr/880701-resolve-duplicate
Merge into: lp:bzr/2.4
Diff against target: 280 lines (+107/-35)

5 files modified

bzrlib/errors.py (+1/-1)
bzrlib/merge.py (+73/-32)
bzrlib/tests/test_conflicts.py (+20/-1)
bzrlib/transform.py (+1/-1)
doc/en/release-notes/bzr-2.4.txt (+12/-0)

To merge this branch:

bzr merge lp:~vila/bzr/880701-resolve-duplicate

High

Fix Released

Reviewer	Review Type	Date Requested	Status
Martin Packman (community)		2011-10-26	Approve on 2011-10-27
Review via email: mp+80489@code.launchpad.net

Commit message

Generate a 'duplicate' conflict instead of a 'content' conflict when the same path is involved in both branches.

Description of the change

tl;dr: When a file is replaced by one with a new file-id and a merge is
attempted from a branch where the original file has been modified (or
vice-versa), a content conflict was created instead of a duplicate one.

Hmm, even the short version is long, isn't it?

$ bzr init trunk ; cd trunk
$ echo 'foo' > file
$ bzr add ; bzr commit -m 'add file'
$ bzr rm file ; bzr commit -m 'delete file'
$ echo 'bar' > file
$ bzr add ; bzr commit -m 'add file again'

$ cd ..; bzr branch -r1 trunk feature; cd feature
$ echo 'baz' > file
$ bzr commit -m 'modify file'

$ cd ../trunk
$ bzr merge ../feature

At this point, two conflicts may be created:
- 'content conflict' because file has been modified/deleted
- 'duplicate conflict' because file got a new file-id

The 'duplicate' one is more appropriate as it better reflects the resulting
situation. There is little point bothering the user about the changes made
on the other side and deleted on her side: the file has been replaced anyway,
so either the user keeps the changes or takes the new file, which are the
choices provided after a 'duplicate' conflict.

The fix looks like black magic because the 'duplicate' conflicts are
generated later (in bzrlib.transform.conflict_pass).

I'm reasonably confident that this fixes one more 'malformed transform' bug
and gets us closer to addressing some 'parallel imports' use cases.

Vincent Ladeuil (vila) wrote on 2011-10-27:

My description was maybe as obscure as the fix itself, so let's try again.

The merge process is roughly:

1 - establish an iter_changes() between THIS and OTHER,
2 - do merge_names/merge_content for each entry and build the corresponding tree transform,
3 - do _compute_transform, itself calling _finish_computing_transform which itself does:
3.1 - transform_resolve_conflicts
3.2 - cook_conflicts

All these steps (except for 1) create and transform the conflict objects, but some conflicts are handled only by some of them.

The 'content' conflicts are created during 2 but the 'duplicate' ones are created during 3. So basically either we:
- create the content conflicts and later replace them by duplicate conflicts
- avoid creating the content conflicts, making sure the duplicate ones will be created later

I went with the latter to avoid undoing the work done when creating a content conflict (creating the .THIS/.OTHER/.BASE helpers, unversioning the THIS 'file', making 'file.OTHER' versioned, etc).
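
The choice described above (skip the 'content' conflict in step 2 and let step 3 report a 'duplicate') can be sketched in isolation. This is a simplified model and not bzrlib code: trees are plain dicts mapping file-id to path, and `path2id`, `merge_contents` and `conflict_pass` are hypothetical stand-ins for the real tree and transform machinery.

```python
def path2id(tree, path):
    """Return the file-id stored at `path` in `tree`, or None (hypothetical helper)."""
    for file_id, tree_path in tree.items():
        if tree_path == path:
            return file_id
    return None


def merge_contents(this_tree, other_tree, file_id):
    """Step 2 (simplified): decide whether to record a 'contents conflict'.

    Assumes no merge hook could handle `file_id` and that it exists in at
    least one tree. Returns the conflict name, or None when the conflict
    is deliberately left to step 3.
    """
    if file_id not in this_tree:
        # file_id only exists in OTHER: is its path used by another id in THIS?
        if path2id(this_tree, other_tree[file_id]) is not None:
            return None
    elif file_id not in other_tree:
        # file_id only exists in THIS: is its path used by another id in OTHER?
        if path2id(other_tree, this_tree[file_id]) is not None:
            return None
    return 'contents conflict'


def conflict_pass(merged_entries):
    """Step 3 (simplified): report paths claimed by more than one file-id."""
    ids_by_path = {}
    for file_id, path in merged_entries:
        ids_by_path.setdefault(path, []).append(file_id)
    return [('duplicate', path, ids)
            for path, ids in sorted(ids_by_path.items()) if len(ids) > 1]


# The scenario from the description: trunk (THIS) replaced 'file' with a new
# file-id, feature (OTHER) modified the original one.
this_tree = {'file-b-id': 'file'}
other_tree = {'file-a-id': 'file'}

step2 = merge_contents(this_tree, other_tree, 'file-a-id')
step3 = conflict_pass([('file-b-id', 'file'), ('file-a-id', 'file')])
print(step2)
print(step3)
```

In this model, step 2 stays silent for the replaced file and step 3 is the only place that reports a conflict, which mirrors the intent of the fix without reproducing its tree-transform details.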

This way what happens is:

- 'file' is left as is in THIS
- 'file' (with a different file-id) is merged from OTHER
- two 'file's want to be at the same path (i.e. a 'duplicate' conflict)
- the old 'file' becomes 'file.moved' and the one coming from OTHER keeps the 'file' name
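
The outcome listed above can be illustrated with a small sketch. It assumes a tree modelled as a dict mapping path to file-id; `resolve_duplicate` is a hypothetical helper, not the bzrlib implementation, and it does not handle the case where the '.moved' name is itself already taken.

```python
def resolve_duplicate(tree, path, incoming_id):
    """Move the existing entry at `path` aside and give `path` to the incoming one.

    `tree` maps path -> file-id. A new dict is returned; `tree` is left untouched.
    """
    resolved = dict(tree)
    # The entry already present keeps its file-id but gets the '.moved' suffix.
    resolved[path + '.moved'] = resolved.pop(path)
    # The entry coming from OTHER takes over the original name.
    resolved[path] = incoming_id
    return resolved


trunk = {'file': 'file-b-id'}
resolved = resolve_duplicate(trunk, 'file', 'file-a-id')
print(resolved)
```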

I hope this helps better understand the fix.

Martin Packman (gz) wrote on 2011-10-27:

The basic aim here sounds like the right thing: content conflicts for files that happen to share the same path after merging are never going to be useful. Some code style nits you can act on or not as you wish, but this needs fixing to improve the release notes a little.

+        keep_this = False

An alternative to this flag and the check at the end would be to `return None` early inside these two blocks. This also avoids the need for the `inhibit_content_conflict` flag.

The remainder of the logic added here looks sensible to me, but it's hard to comprehend the exact implications.

+    def _get_filter_tree_path(self, file_id):

Now you've split this to a separate function, may as well return early.

+                filter_tree_path = wt.id2path(file_id)

return wt.id2path(file_id)

+            except errors.NoSuchId:
+                filter_tree_path = self.other_tree.id2path(file_id)

except errors.NoSuchId:
    return self.other_tree.id2path(file_id)

+        else:
+            # Skip the id2path lookup for older formats
+            filter_tree_path = None
+        return filter_tree_path

return None
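
Put together, the early-return shape suggested for `_get_filter_tree_path` might look like the sketch below. `StubTree` and the local `NoSuchId` exception are hypothetical stand-ins so the example runs on its own; only the method names `supports_content_filtering` and `id2path` come from the code under review.

```python
class NoSuchId(Exception):
    """Stand-in for errors.NoSuchId (assumption, for illustration only)."""


class StubTree(object):
    """Minimal hypothetical tree: maps file-id to path."""

    def __init__(self, paths, filtering=True):
        self._paths = paths
        self._filtering = filtering

    def supports_content_filtering(self):
        return self._filtering

    def id2path(self, file_id):
        try:
            return self._paths[file_id]
        except KeyError:
            raise NoSuchId(file_id)


def get_filter_tree_path(this_tree, other_tree, file_id):
    if this_tree.supports_content_filtering():
        # Prefer the path in THIS; fall back to OTHER when OTHER adds the file.
        try:
            return this_tree.id2path(file_id)
        except NoSuchId:
            return other_tree.id2path(file_id)
    # Skip the id2path lookup for older formats
    return None


this_tree = StubTree({'a-id': 'dir/a'})
other_tree = StubTree({'a-id': 'dir/a', 'b-id': 'b'})
print(get_filter_tree_path(this_tree, other_tree, 'b-id'))
```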

+            # File created with different file-ids but deleted on one side
+            (dict(_base_actions='create_file_a'),

I like functional programming languages, but the way these test_conflict cases are spelt with scenarios like this is... not very easy to read. Per the prevailing style though, they look about right.

In the release notes:

+* Fixed an infinite loop when creating a repo at the root of the filesystem,

Moving this is right, but:

+* 'Duplicate' conflicts are now created instead of 'Content' ones when a
+  merge tries to create two entries for the same path.
+  (Vincent Ladeuil, #880701)

'D' should come before 'F'?

Also, saying something about what effect this fix has directly would be good, as this is a user-facing bug. So, mention the exception that will no longer be raised, and in which circumstances, or similar.

General review note, fixing code formatting nits as you go when working on something like this is a good habit, but pulling them out to a separate, prerequisite if needed, branch would make life easier for diff readers.

review: Needs Fixing

Vincent Ladeuil (vila) wrote on 2011-10-27:

> The basic aim to here sounds like the right thing, content conflicts for files
> that happen to share the same path after merging is never going to be useful.
> Some code style nits you can act on or not as you wish, but needs fixing for
> improving the release notes a little.
>
>
> + keep_this = False
>
> An alternative to this flag and the check at the end would be to `return None`

return result you mean, but I see your point.

... crap !... it really returns None. That's bad :-( Yet another point to
address ...

Anyway, I don't really agree with returning early as this method is already
very hard to read and I'm afraid adding early returns, while locally making
it more understandable, will make it harder to grasp globally.

It's 120 lines long and really needs a better refactoring that I don't want
to do now :-/

If you look at annotations, you'll see that postponing the
self.tt.delete_contents(trans_id) was introduced recently and it took me a
while to understand its true meaning and *this* proposal introducing an
exception for it makes me think that we'd better put it back in all relevant
parts instead of delaying it.

As it stands, I think it makes more sense to leave it this way so people
realize that its intent is to get rid of the content of the file *before*
the merge *because* it's obsoleted by the merge producing a different
content (I've clarified the associated comment).

Also, there is:

        if not self.this_tree.has_id(file_id) and result == "modified":
            self.tt.version_file(file_id, trans_id)

that is still needed (or maybe not, but the fact that I can't tell is
another indication we still have more work to do here).

> early inside these two blocks. This also avoids the need for the
> `inhibit_content_conflict` flag.
>
> The remainder of the logic added here looks sensible to me, but it's hard to
> comprehend the exact implications.

Yeah, I know :-/ The conflicts being handled in various parts under various
circumstances is part of the problem. I don't have a good answer for that
(yet) as it's strongly related to tree transform features.

>
>
> + def _get_filter_tree_path(self, file_id):
>
> Now you've split this to a separate function, may as well return early.

Fixed.

>
> + # File created with different file-ids but deleted on one side
> + (dict(_base_actions='create_file_a'),
>

> I like functional programming languages, but the way these test_conflict
> cases are spelt with scenarios like this is... not very easy to read. Per
> the prevailing style though, they look about right.

Yeah, this has been mentioned in the past.

These tests are... hard to describe.

An alternative would be to generate test scripts instead but I refrained
from doing so in the past as that turned them into blackbox tests which
couldn't be debugged easily. This is not true anymore.

Otherwise, their main interest is that they allow running 4 different
tests from a single description:
- merging from one branch into the other (and vice-versa)
- resolving in both directions (--take-this and --take-other)
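
How one description turns into four tests can be sketched as below. This is only an illustration of the combinatorics, assuming a hypothetical `expand_scenario` helper; the real test_conflicts scenarios are built differently, and only the `--take-this`/`--take-other` option names come from the discussion.

```python
from itertools import product


def expand_scenario(left, right):
    """Yield one test description per (merge direction, resolve option) pair."""
    directions = [(left, right), (right, left)]
    resolutions = ['--take-this', '--take-other']
    for (this, other), resolve in product(directions, resolutions):
        yield {'this': this, 'other': other, 'resolve': resolve}


tests = list(expand_scenario('filea_replaced', 'filea_modified'))
for test in tests:
    print(test)
```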

They also encompass running two bzr commands (merge and resolve) which is
not encountered elsewhere in the test code base (AFAIK).

> 
> 
> In the release notes:
> 
> +* Fixed an infinite loop when creating a repo at the root of the filesystem,
> 
> Moving this is right, but:
> 
> +* 'Duplicate' conflicts are now created instead of 'Content' ones when a
> +  merge tries to create two entries for the same path.
> +  (Vincent Ladeuil, #880701)
> 
> 'D' should come before 'F'?

Oh ? Nobody never tells me anything ! :)

> 
> Also, saying something about the what effect this fix has directly would be
> good, as this is a user-facing bug. So, mention the exception that will no
> longer be raised in which circumstances or similar.

Done.

> 
> General review note, fixing code formatting nits as you go when working on
> something like this is a good habit, but pulling them out to a separate,
> prerequisite if needed, branch would make life easier for diff readers.

Yeah, this has been discussed numerous times in the past. I generally revert
such nits to avoid this (especially on older stable releases). Sorry about
that.

Martin Packman (gz) wrote on 2011-10-27:

> > + keep_this = False
> >
> > An alternative to this flag and the check at the end would be to `return
> None`
>
> return result you mean, but I see your point.
>
> ... crap !... it really returns None. That's bad :-( Yet another point to
> address ...

Should it also be a string of some sort? Is None intended to indicate unhandled?

> Anyway, I don't really agree with returning early as this method is already
> very hard to read and I'm afraid adding early returns, while locally making
> it more understandable, will make it harder to grasp globally.
>
> It's 120 lines long and really need a better refactoring that I don't want
> to do now :-/

It's reasonable to avoid funny control flow in a function this large, the flags are at least quite obvious.

> Also, there is:
>
> if not self.this_tree.has_id(file_id) and result == "modified":
> self.tt.version_file(file_id, trans_id)
>
> that is still needed (or may be not but the fact that I can't tell is
> another indication we still have more work to do here).

That result being None thing for those two cases skips this currently, which is correct as versioning file_id already happens in your new code when it's required.

> > The remainder of the logic added here looks sensible to me, but it's hard to
> > comprehend the exact implications.
>
> Yeah, I know :-/ The conflicts being handled in various parts under various
> circumstances is part of the problem. I don't have a good answer for that
> (yet) as it's strongly related to tree transform features.

Ideally I think we want someone else to look at this too, but we are rather short of people currently.

> > Also, saying something about the what effect this fix has directly would be
> > good, as this is a user-facing bug. So, mention the exception that will no
> > longer be raised in which circumstances or similar.
>
> Done.

You confused me for a bit there by having two entries under bug fixes now, but the new wording is a definite improvement.

review: Approve

Vincent Ladeuil (vila) wrote on 2011-10-27:

> > ... crap !... it really returns None. That's bad :-( Yet another point to
> > address ...
>
> Should is also be a string of some sort?

Probably.

> Is None intended to indicate unhandled?

I think it's a hole in the implementation. It doesn't really matter for now
but it's weird that it's not more related to whatever the hooks return.

>
> > Anyway, I don't really agree with returning early as this method is already
> > very hard to read and I'm afraid adding early returns, while locally making
> > it more understandable, will make it harder to grasp globally.
> >
> > It's 120 lines long and really need a better refactoring that I don't want
> > to do now :-/
>
> It's reasonable to avoid funny control flow in a function this large, the
> flags are at least quite obvious.

Good, that was the main intent even if I still think this method should be
split into more understandable parts.

>
> > Also, there is:
> >
> > if not self.this_tree.has_id(file_id) and result == "modified":
> > self.tt.version_file(file_id, trans_id)
> >
> > that is still needed (or may be not but the fact that I can't tell is
> > another indication we still have more work to do here).
>
> That result being None thing for those two cases skips this currently, which
> is correct as versioning file_id already happens in your new code when it's
> required.

Damn, you're right ! More evidence this method is hard to grasp.

> Ideally I think we want someone else to look at this too, but we are rather
> short of people currently.

As mentioned on IRC, I'm confident this introduces a special case for
'content conflicts' *only* and those were leading to a 'malformed
transform'.

> You confused me for a bit there by having two entries under bug fixes now,
> but the new wording is a definite improvement.

Sorry. I should have put those two entries from the start since they address
the two issues mentioned in the bug report. Thanks for helping clarify them anyway.

I'm now waiting for 2.4.3 to open to land there and then in bzr.dev.

Vincent Ladeuil (vila) wrote on 2011-10-27:

sent to pqm by email

Preview Diff

=== modified file 'bzrlib/errors.py'
--- bzrlib/errors.py	2011-08-01 06:24:48 +0000
+++ bzrlib/errors.py	2011-10-27 15:07:26 +0000
@@ -1964,7 +1964,7 @@
         self.prefix = prefix
-class MalformedTransform(BzrError):
+class MalformedTransform(InternalBzrError):
     _fmt = "Tree transform is malformed %(conflicts)r"
=== modified file 'bzrlib/merge.py'
--- bzrlib/merge.py	2011-07-06 09:22:00 +0000
+++ bzrlib/merge.py	2011-10-27 15:07:26 +0000
@@ -591,11 +591,11 @@
                 else:
                     self.base_rev_id = self.revision_graph.find_unique_lca(
                                             *lcas)
-                sorted_lca_keys = self.revision_graph.find_merge_order(
+                sorted_lca_keys = self.revision_graph.find_merge_order(
                     revisions[0], lcas)
                 if self.base_rev_id == _mod_revision.NULL_REVISION:
                     self.base_rev_id = sorted_lca_keys[0]
-
+
             if self.base_rev_id == _mod_revision.NULL_REVISION:
                 raise errors.UnrelatedBranches()
             if self._is_criss_cross:
@@ -604,7 +604,8 @@
                 trace.mutter('Criss-cross lcas: %r' % lcas)
                 if self.base_rev_id in lcas:
                     trace.mutter('Unable to find unique lca. '
-                                 'Fallback %r as best option.' % self.base_rev_id)
+                                 'Fallback %r as best option.'
+                                 % self.base_rev_id)
                 interesting_revision_ids = set(lcas)
                 interesting_revision_ids.add(self.base_rev_id)
                 interesting_trees = dict((t.get_revision_id(), t)
@@ -689,7 +690,8 @@
                     continue
                 sub_merge = Merger(sub_tree.branch, this_tree=sub_tree)
                 sub_merge.merge_type = self.merge_type
-                other_branch = self.other_branch.reference_parent(file_id, relpath)
+                other_branch = self.other_branch.reference_parent(file_id,
+                                                                  relpath)
                 sub_merge.set_other_revision(other_revision, other_branch)
                 base_revision = self.base_tree.get_reference_revision(file_id)
                 sub_merge.base_tree = \
@@ -1387,18 +1389,54 @@
             if hook_status != 'not_applicable':
                 # Don't try any more hooks, this one applies.
                 break
+        # If the merge ends up replacing the content of the file, we get rid of
+        # it at the end of this method (this variable is used to track the
+        # exceptions to this rule).
+        keep_this = False
         result = "modified"
         if hook_status == 'not_applicable':
-            # This is a contents conflict, because none of the available
-            # functions could merge it.
+            # No merge hook was able to resolve the situation. Two cases exist:
+            # a content conflict or a duplicate one.
             result = None
             name = self.tt.final_name(trans_id)
             parent_id = self.tt.final_parent(trans_id)
-            if self.this_tree.has_id(file_id):
-                self.tt.unversion_file(trans_id)
-            file_group = self._dump_conflicts(name, parent_id, file_id,
-                                              set_version=True)
-            self._raw_conflicts.append(('contents conflict', file_group))
+            duplicate = False
+            inhibit_content_conflict = False
+            if params.this_kind is None: # file_id is not in THIS
+                # Is the name used for a different file_id ?
+                dupe_path = self.other_tree.id2path(file_id)
+                this_id = self.this_tree.path2id(dupe_path)
+                if this_id is not None:
+                    # Two entries for the same path
+                    keep_this = True
+                    # versioning the merged file will trigger a duplicate
+                    # conflict
+                    self.tt.version_file(file_id, trans_id)
+                    transform.create_from_tree(
+                        self.tt, trans_id, self.other_tree, file_id,
+                        filter_tree_path=self._get_filter_tree_path(file_id))
+                    inhibit_content_conflict = True
+            elif params.other_kind is None: # file_id is not in OTHER
+                # Is the name used for a different file_id ?
+                dupe_path = self.this_tree.id2path(file_id)
+                other_id = self.other_tree.path2id(dupe_path)
+                if other_id is not None:
+                    # Two entries for the same path again, but here, the other
+                    # entry will also be merged.  We simply inhibit the
+                    # 'content' conflict creation because we know OTHER will
+                    # create (or has already created depending on ordering) an
+                    # entry at the same path. This will trigger a 'duplicate'
+                    # conflict later.
+                    keep_this = True
+                    inhibit_content_conflict = True
+            if not inhibit_content_conflict:
+                if params.this_kind is not None:
+                    self.tt.unversion_file(trans_id)
+                # This is a contents conflict, because none of the available
+                # functions could merge it.
+                file_group = self._dump_conflicts(name, parent_id, file_id,
+                                                  set_version=True)
+                self._raw_conflicts.append(('contents conflict', file_group))
         elif hook_status == 'success':
             self.tt.create_file(lines, trans_id)
         elif hook_status == 'conflicted':
@@ -1420,36 +1458,23 @@
             raise AssertionError('unknown hook_status: %r' % (hook_status,))
         if not self.this_tree.has_id(file_id) and result == "modified":
             self.tt.version_file(file_id, trans_id)
-        # The merge has been performed, so the old contents should not be
-        # retained.
-        self.tt.delete_contents(trans_id)
+        if not keep_this:
+            # The merge has been performed and produced a new content, so the
+            # old contents should not be retained.
+            self.tt.delete_contents(trans_id)
         return result
     def _default_other_winner_merge(self, merge_hook_params):
         """Replace this contents with other."""
         file_id = merge_hook_params.file_id
         trans_id = merge_hook_params.trans_id
-        file_in_this = self.this_tree.has_id(file_id)
         if self.other_tree.has_id(file_id):
             # OTHER changed the file
-            wt = self.this_tree
-            if wt.supports_content_filtering():
-                # We get the path from the working tree if it exists.
-                # That fails though when OTHER is adding a file, so
-                # we fall back to the other tree to find the path if
-                # it doesn't exist locally.
-                try:
-                    filter_tree_path = wt.id2path(file_id)
-                except errors.NoSuchId:
-                    filter_tree_path = self.other_tree.id2path(file_id)
-            else:
-                # Skip the id2path lookup for older formats
-                filter_tree_path = None
-            transform.create_from_tree(self.tt, trans_id,
-                             self.other_tree, file_id,
-                             filter_tree_path=filter_tree_path)
+            transform.create_from_tree(
+                self.tt, trans_id, self.other_tree, file_id,
+                filter_tree_path=self._get_filter_tree_path(file_id))
             return 'done', None
-        elif file_in_this:
+        elif self.this_tree.has_id(file_id):
             # OTHER deleted the file
             return 'delete', None
         else:
@@ -1529,6 +1554,20 @@
                                               other_lines)
             file_group.append(trans_id)
+
+    def _get_filter_tree_path(self, file_id):
+        if self.this_tree.supports_content_filtering():
+            # We get the path from the working tree if it exists.
+            # That fails though when OTHER is adding a file, so
+            # we fall back to the other tree to find the path if
+            # it doesn't exist locally.
+            try:
+                return self.this_tree.id2path(file_id)
+            except errors.NoSuchId:
+                return self.other_tree.id2path(file_id)
+        # Skip the id2path lookup for older formats
+        return None
+
     def _dump_conflicts(self, name, parent_id, file_id, this_lines=None,
                         base_lines=None, other_lines=None, set_version=False,
                         no_base=False):
@@ -1652,10 +1691,12 @@
                 for trans_id in conflict[1]:
                     file_id = self.tt.final_file_id(trans_id)
                     if file_id is not None:
+                        # Ok we found the relevant file-id
                         break
                 path = fp.get_path(trans_id)
                 for suffix in ('.BASE', '.THIS', '.OTHER'):
                     if path.endswith(suffix):
+                        # Here is the raw path
                         path = path[:-len(suffix)]
                         break
                 c = _mod_conflicts.Conflict.factory(conflict_type,
=== modified file 'bzrlib/tests/test_conflicts.py'
--- bzrlib/tests/test_conflicts.py	2011-07-07 10:20:59 +0000
+++ bzrlib/tests/test_conflicts.py	2011-10-27 15:07:26 +0000
@@ -677,6 +677,14 @@
              ('fileb_created',
               dict(actions='create_file_b', check='file_content_b',
                    path='file', file_id='file-b-id')),),
+            # File created with different file-ids but deleted on one side
+            (dict(_base_actions='create_file_a'),
+             ('filea_replaced',
+              dict(actions='replace_file_a_by_b', check='file_content_b',
+                   path='file', file_id='file-b-id')),
+             ('filea_modified',
+              dict(actions='modify_file_a', check='file_new_content',
+                   path='file', file_id='file-a-id')),),
             ])
     def do_nothing(self):
@@ -694,6 +702,16 @@
     def check_file_content_b(self):
         self.assertFileEqual('file b content\n', 'branch/file')
+    def do_replace_file_a_by_b(self):
+        return [('unversion', 'file-a-id'),
+                ('add', ('file', 'file-b-id', 'file', 'file b content\n'))]
+
+    def do_modify_file_a(self):
+        return [('modify', ('file-a-id', 'new content\n'))]
+
+    def check_file_new_content(self):
+        self.assertFileEqual('new content\n', 'branch/file')
+
     def _get_resolve_path_arg(self, wt, action):
         return self._this['path']
@@ -1043,7 +1061,8 @@
         # This is nearly like TestResolveNonDirectoryParent but with branch and
         # trunk switched. As such it should certainly produce the same
         # conflict.
-        self.run_script("""
+        self.assertRaises(errors.MalformedTransform,
+                          self.run_script,"""
 $ bzr init trunk
 ...
 $ cd trunk
=== modified file 'bzrlib/transform.py'
--- bzrlib/transform.py	2011-07-06 20:52:00 +0000
+++ bzrlib/transform.py	2011-10-27 15:07:26 +0000
@@ -3068,7 +3068,7 @@
                 existing_file, new_file = conflict[2], conflict[1]
             else:
                 existing_file, new_file = conflict[1], conflict[2]
-            new_name = tt.final_name(existing_file)+'.moved'
+            new_name = tt.final_name(existing_file) + '.moved'
             tt.adjust_path(new_name, final_parent, existing_file)
             new_conflicts.add((c_type, 'Moved existing file to',
                                existing_file, new_file))
=== modified file 'doc/en/release-notes/bzr-2.4.txt'
--- doc/en/release-notes/bzr-2.4.txt	2011-10-27 14:16:10 +0000
+++ doc/en/release-notes/bzr-2.4.txt	2011-10-27 15:07:26 +0000
@@ -32,6 +32,17 @@
 .. Fixes for situations where bzr would previously crash or give incorrect
    or undesirable results.
+* During merges, when two entries end up using the same path for two
+  different file-ids (the same file being 'bzr added' in two different
+  branches) , 'duplicate' conflicts are created instead of 'content'
+  ones. This was previously leading to a 'Malformed tramsform' exception.
+  (Vincent Ladeuil, #880701)
+
+* 'Malformed transform' exceptions are now recognized as internal errors
+  instead of user errors and report a traceback. This will reduce user
+  confusion as there is generally nothing users can do about them.
+  (Vincent Ladeuil, #880701)
+
 Documentation
 *************
@@ -102,6 +113,7 @@
 * Return early from create_delta_index_from_delta given tiny inputs. This
   avoids raising a spurious MemoryError on certain platforms such as AIX.
   (John Arbash Meinel, #856731)
+
 Documentation
 *************