KeyError in knit.simple_annotate running annotate on a stacked branch

Bug #393366 reported by David I
This bug affects 2 people
Affects: Bazaar
Status: Fix Released
Importance: High
Assigned to: John A Meinel
Milestone: none listed

Bug Description

The following crash occurs consistently in a stacked branch created with bzr 1.15.1.
The same command in the parent repository runs correctly.

$ bzr annotate src/idbnewaccess/record_tables.txt
bzr: ERROR: exceptions.KeyError: ('record_tables.txt-20080609102156-ahxdt96jsifkgvt1-1', 'XXXXXXX@msdes004-20080613065152-m502g0j0kyg2ksz3')

Traceback (most recent call last):
  File "/data/id/release/bzr/bzr_20090616/lib/python/bzrlib/commands.py", line 729, in exception_to_return_code
    return the_callable(*args, **kwargs)
  File "/data/id/release/bzr/bzr_20090616/lib/python/bzrlib/commands.py", line 924, in run_bzr
    ret = run(*run_argv)
  File "/data/id/release/bzr/bzr_20090616/lib/python/bzrlib/commands.py", line 560, in run_argv_aliases
    return self.run(**all_cmd_args)
  File "/data/id/release/bzr/bzr_20090616/lib/python/bzrlib/commands.py", line 939, in ignore_pipe
    result = func(*args, **kwargs)
  File "/data/id/release/bzr/bzr_20090616/lib/python/bzrlib/builtins.py", line 4251, in run
    show_ids=show_ids)
  File "/data/id/release/bzr/bzr_20090616/lib/python/bzrlib/annotate.py", line 89, in annotate_file_tree
    annotations = list(tree.annotate_iter(file_id))
  File "/data/id/release/bzr/bzr_20090616/lib/python/bzrlib/decorators.py", line 138, in read_locked
    result = unbound(self, *args, **kwargs)
  File "/data/id/release/bzr/bzr_20090616/lib/python/bzrlib/workingtree.py", line 497, in annotate_iter
    return basis.annotate_iter(file_id)
  File "/data/id/release/bzr/bzr_20090616/lib/python/bzrlib/workingtree_4.py", line 1576, in annotate_iter
    annotations = self._repository.texts.annotate(text_key)
  File "/data/id/release/bzr/bzr_20090616/lib/python/bzrlib/knit.py", line 1006, in annotate
    return self._factory.annotate(self, key)
  File "/data/id/release/bzr/bzr_20090616/lib/python/bzrlib/knit.py", line 761, in annotate
    return annotator.annotate(key)
  File "/data/id/release/bzr/bzr_20090616/lib/python/bzrlib/knit.py", line 3567, in annotate
    return self._simple_annotate(key)
  File "/data/id/release/bzr/bzr_20090616/lib/python/bzrlib/knit.py", line 3607, in _simple_annotate
    parent_lines = [parent_cache[parent] for parent in parent_map[key]]
KeyError: ('record_tables.txt-20080609102156-ahxdt96jsifkgvt1-1', 'peter.scheeres@msdes004-20080613065152-m502g0j0kyg2ksz3')

bzr 1.15.1 on python 2.5.2 (linux2)
arguments: ['/data/id/release/bzr/current/bin/bzr', 'annotate', 'src/idbnewaccess/record_tables.txt']
encoding: 'ANSI_X3.4-1968', fsenc: 'ANSI_X3.4-1968', lang: None
plugins:
  bzrtools /data/id/release/bzr/current/lib/python/bzrlib/plugins/bzrtools [1.15]
  explorer /data/users/david.ingamells/.bazaar/plugins/explorer [0.3.1]
  gtk /data/id/release/bzr/current/lib/python/bzrlib/plugins/gtk [0.95.0.final.1]
  launchpad /data/id/release/bzr/current/lib/python/bzrlib/plugins/launchpad [1.15.1]
  netrc_credential_store /data/id/release/bzr/current/lib/python/bzrlib/plugins/netrc_credential_store [1.15.1]
  qbzr /data/id/release/bzr/current/lib/python/bzrlib/plugins/qbzr [0.10]
*** Bazaar has encountered an internal error.
    Please report a bug at https://bugs.launchpad.net/bzr/+filebug
    including this traceback, and a description of what you
    were doing when the error occurred.

Revision history for this message
Martin Pool (mbp) wrote :

Thanks for the report.

Could you please attach the result of running 'bzr info' on the stacked branch (with paths obscured if you wish). Can you reproduce it in a smaller case?

Changed in bzr:
importance: Undecided → High
status: New → Confirmed
summary: - bzr annotate exceptions.KeyError
+ KeyError in knit.simple_annotate running annotate on a stacked branch
David I (david-ingamells) wrote : Re: [Bug 393366] Re: bzr annotate exceptions.KeyError

Martin Pool wrote:
> Could you please attach the result of running 'bzr info' on the stacked
> branch (with paths obscured if you wish).
On a smaller repos the same occurs:

bzr: ERROR: exceptions.KeyError: ('cmsmanager.ini-20080220104629-krpntudqdg1zfzkp-1', '<obscured>-20080827133521-mk6up7txygk04r2m')

Traceback (most recent call last):
  File "/data/id/release/bzr/bzr_20090616/lib/python/bzrlib/commands.py", line 729, in exception_to_return_code
    return the_callable(*args, **kwargs)
  File "/data/id/release/bzr/bzr_20090616/lib/python/bzrlib/commands.py", line 924, in run_bzr
    ret = run(*run_argv)
  File "/data/id/release/bzr/bzr_20090616/lib/python/bzrlib/commands.py", line 560, in run_argv_aliases
    return self.run(**all_cmd_args)
  File "/data/id/release/bzr/bzr_20090616/lib/python/bzrlib/commands.py", line 939, in ignore_pipe
    result = func(*args, **kwargs)
  File "/data/id/release/bzr/bzr_20090616/lib/python/bzrlib/builtins.py", line 4251, in run
    show_ids=show_ids)
  File "/data/id/release/bzr/bzr_20090616/lib/python/bzrlib/annotate.py", line 89, in annotate_file_tree
    annotations = list(tree.annotate_iter(file_id))
  File "/data/id/release/bzr/bzr_20090616/lib/python/bzrlib/decorators.py", line 138, in read_locked
    result = unbound(self, *args, **kwargs)
  File "/data/id/release/bzr/bzr_20090616/lib/python/bzrlib/workingtree.py", line 497, in annotate_iter
    return basis.annotate_iter(file_id)
  File "/data/id/release/bzr/bzr_20090616/lib/python/bzrlib/workingtree_4.py", line 1576, in annotate_iter
    annotations = self._repository.texts.annotate(text_key)
  File "/data/id/release/bzr/bzr_20090616/lib/python/bzrlib/knit.py", line 1006, in annotate
    return self._factory.annotate(self, key)
  File "/data/id/release/bzr/bzr_20090616/lib/python/bzrlib/knit.py", line 761, in annotate
    return annotator.annotate(key)
  File "/data/id/release/bzr/bzr_20090616/lib/python/bzrlib/knit.py", line 3567, in annotate
    return self._simple_annotate(key)
  File "/data/id/release/bzr/bzr_20090616/lib/python/bzrlib/knit.py", line 3607, in _simple_annotate
    parent_lines = [parent_cache[parent] for parent in parent_map[key]]
KeyError: ('cmsmanager.ini-20080220104629-krpntudqdg1zfzkp-1', '<obscured>-20080827133521-mk6up7txygk04r2m')

bzr 1.15.1 on python 2.5.2 (linux2)
arguments: ['/data/id/release/bzr/current/bin/bzr', 'annotate', '.cmsmanager.ini']
encoding: 'ANSI_X3.4-1968', fsenc: 'ANSI_X3.4-1968', lang: None
plugins:
  bzrtools /data/id/release/bzr/current/lib/python/bzrlib/plugins/bzrtools [1.15]
  explorer /data/users/<obscured>/.bazaar/plugins/explorer [0.3.1]
  gtk /data/id/release/bzr/current/lib/python/bzrlib/plugins/gtk [0.95.0.final.1]
  launchpad /data/id/release/bzr/current/lib/python/bzrlib/plugins/launchpad [1.15.1]
  netrc_credential_store /data/id/release/bzr/current/lib/python/bzrlib/plugins/netrc_credential_store [1.15.1]
  qbzr /data/id/release/bzr/current/lib/python/bzrlib/plugins/qbzr [0.10]
*** Bazaar has encountered an internal error.

 and this is th...

Martin Pool (mbp) wrote : Re: [Bug 393366] Re: bzr annotate exceptions.KeyError

2009/6/29 David I <email address hidden>:
> It does also occur in about the smallest repos (see above) I have
> available here:
> One with about 30 files and a current revno of 133.

I meant, can you give us a script that will reproduce the failure that
we can run here.

--
Martin <http://launchpad.net/~mbp/>

David I (david-ingamells) wrote :

Found a sequence that fails!
This uses 2 different machines and bzr+ssh, but I don't know yet if that
is a factor.

The ". .profile" is to get bzr in the path as we're using a
self-installed version of bzr to ensure that everyone is using the same
version.

#!/bin/bash

bzr init --1.9-rich-root 1.0

# the following init does not result in a fail:
#bzr init 1.0

cd 1.0
echo "Hello" > file
bzr add file
bzr commit -m "first file"
bzr remove-tree

ssh mscvs01 ". .profile; cd temp/bzr/work; bzr branch --stacked bzr+ssh://msdes002/data/users/david.ingamells/temp/bzr/repos/1.0"

ssh mscvs01 ". .profile; cd temp/bzr/work/1.0; bzr annotate file"

bzr checkout
echo "Hello too" >> file
bzr commit -m "Updated file"

ssh mscvs01 ". .profile; cd temp/bzr/work/1.0; bzr pull"
ssh mscvs01 ". .profile; cd temp/bzr/work/1.0; bzr annotate file"

cd ..
rm -rf 1.0
ssh mscvs01 ". .profile; cd temp/bzr/work/; rm -rf 1.0"

Martin Pool wrote:
> 2009/6/29 David I <email address hidden>:
>
>> It does also occur in about the smallest repos (see above) I have
>> available here:
>> One with about 30 files and a current revno of 133.
>>
>
> I meant, can you give us a script that will reproduce the failure that
> we can run here.
>
>

John A Meinel (jameinel) wrote :

I haven't been able to reproduce this yet.
My immediate guess is that this has to do with bzr+ssh and stacking.

I'm also curious if it just happens to be fixed in the latest bzr.dev versus 1.15...

David I (david-ingamells) wrote : Re: [Bug 393366] Re: KeyError in knit.simple_annotate running annotate on a stacked branch

Is there anything more I can do to help track this bug down?

John A Meinel wrote:
> I haven't been able to reproduce this yet.
> My immediate guess is that this has to do with bzr+ssh and stacking.
>
> I'm also curious if it just happens to be fixed in the latest bzr.dev
> versus 1.15...
>
>

Martin Pool (mbp) wrote : Re: [Bug 393366] Re: KeyError in knit.simple_annotate running annotate on a stacked branch

2009/6/30 David I <email address hidden>:
> Is there anything more I can do to help track this bug down?

If you can provide a reproduction in non-private data either as a
script or a tarball that would be excellent.

--
Martin <http://launchpad.net/~mbp/>

David I (david-ingamells) wrote : Re: [Bug 393366] Re: KeyError in knit.simple_annotate running annotate on a stacked branch

Martin Pool wrote:
> 2009/6/30 David I <email address hidden>:
>
>> Is there anything more I can do to help track this bug down?
>>
>
> If you can provide a reproduction in non-private data either as a
> script or a tarball that would be excellent.
>
>
The script I sent before was completely self-contained. I can't pare it
down much more than that ;)

Here is a more abstract version. When I checked this script I noticed
that the annotate does not fail every time. Running with a pause after
the last run seems to consistently fail, but running twice with no pause
often succeeds on the second try.

#!/bin/bash
local_host=`hostname`
remote_host=mscvs01

repos_root=/data/users/david.ingamells/temp/bzr/repos
branch_root=/data/users/david.ingamells/temp/bzr/work

repos_name=1.0

cd ${repos_root}

bzr init --1.9-rich-root ${repos_name}

# the following init does not result in a fail:
#bzr init ${repos_name}

cd ${repos_name}
echo "Hello" > file
bzr add file
bzr commit -m "first file"
bzr remove-tree

ssh ${remote_host} ". .profile; cd ${branch_root}; bzr branch --stacked bzr+ssh://${local_host}/${repos_root}/${repos_name}"

#this works
ssh ${remote_host} ". .profile; cd ${branch_root}/${repos_name}; bzr annotate file"

bzr checkout
echo `date` >> file
bzr commit -m "Updated file"

ssh ${remote_host} ". .profile; cd ${branch_root}/${repos_name}; bzr pull"

#this fails <---------------------------------
ssh ${remote_host} ". .profile; cd ${branch_root}/${repos_name}; bzr annotate file"

#cleanup

cd ..
rm -rf ${repos_root}/${repos_name}
ssh ${remote_host} ". .profile; rm -rf ${branch_root}/${repos_name}"

David I (david-ingamells) wrote :

> The script I sent before was completely self-contained. I can't pare it
> down much more than that ;)
>
> Here is a more abstract version. When I checked this script I noticed
> that the annotate does not fail every time.
A small update.
Now that I have seen that it doesn't always fail, I re-tried other init
formats: 1.9 and without a format (which defaults to pack-0.92). They
all fail in the same way: often, but not every time. I must have just
been lucky the first time I tested calling init without a format
specifier.

Doing the branch without --stacked never fails.

Removing the lines "bzr remove-tree" and "bzr checkout" makes no difference.

Martin Pool (mbp) wrote : Re: [Bug 393366] Re: KeyError in knit.simple_annotate running annotate on a stacked branch

2009/7/1 David Ingamells <email address hidden>:
>
> The script I sent before was completely self-contained. I can't pare it down
> much more than that ;)
Sorry, I only saw your reply after I sent mine. Happens to the best of us ;-)

--
Martin <http://launchpad.net/~mbp/>

Michael Hudson-Doyle (mwhudson) wrote :

FWIW, this regularly trips up annotate on stacked branches on launchpad, for example:

bzr annotate http://bazaar.launchpad.net/~renatosilva/bzr-java-lib/log-view-fix/src/main/java/org/vcs/bazaar/client/commandline/parser/XMLParser.java

I can read the codebrowse logs to supply probably hundreds of examples of this if you like :)

John A Meinel (jameinel) wrote : Re: [Bug 393366] Re: KeyError in knit.simple_annotate running annotate on a stacked branch

Michael Hudson wrote:
> FWIW, this regularly trips up annotate on stacked branches on launchpad,
> for example:
>
> bzr annotate http://bazaar.launchpad.net/~renatosilva/bzr-java-lib/log-view-fix/src/main/java/org/vcs/bazaar/client/commandline/parser/XMLParser.java
>
> I can read the codebrowse logs to supply probably hundreds of examples
> of this if you like :)
>

My immediate guess is that there is a bug in:

get_record_stream(..., 'topological') when there is stacking involved.
I'm not really sure, though.

John
=:->

John A Meinel (jameinel) wrote :

So I believe the bug is in knit._get_remaining_record_stream() around this point:
        if include_delta_closure:
            # XXX: get_content_maps performs its own index queries; allow state
            # to be passed in.
            non_local_keys = needed_from_fallback - absent_keys
            for keys, non_local_keys in self._group_keys_for_io(present_keys,
                                                                non_local_keys,
                                                                positions):
                generator = _VFContentMapGenerator(self, keys, non_local_keys,
                                                   global_map)
                for record in generator.get_record_stream():
                    yield record

Namely, if "include_delta_closure=True" it ignores the source_keys grouping that had just been set up.

Which is why we have the problem in *simple_annotate* which wants fulltexts, and not elsewhere. I'll see what I can set up.

Changed in bzr:
assignee: nobody → John A Meinel (jameinel)
status: Confirmed → In Progress
John A Meinel (jameinel) wrote :

Digging deeper, it seems that _group_keys_for_io is designed to preserve the key ordering that we got from 'present_keys', which means that it does preserve the topological sorting (as long as there are no non-local keys).

If we have a non-local key, we have first put it into a set(), which destroys the ordering.
We then end up with 2 queues, one with the properly sorted 'local' keys, and another with randomly ordered 'nonlocal' keys.

Now in the common case, these won't be interleaved. However in the real world, I think it could happen. Consider:

(time goes down)
 #- fallback repo
    #- stacked repo
 A
 |\
 | B
 |/|
 C |
  \|
   E

In this case, the revisions A B C will be present in the fallback, and the revisions B & E will be present in the stacked repo.

The proper topological sorting would be:
 A B C E

However, because the lists are now split, we have:
 B E (correct) set([A, C])
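
To make the failure mode concrete, here is a small self-contained Python sketch. It is not bzrlib code: `consume` and `first_missing_parent` are hypothetical helpers that only imitate the parent lookup shown in the traceback, and the parent map is read off the graph above on the assumption that E's parents are C and B. It shows that a parents-first stream works, while draining the local queue before the non-local set asks for a parent that has not been cached yet.

```python
# Toy model of the graph above: each key maps to its parent keys.
parent_map = {
    "A": (),
    "B": ("A",),
    "C": ("A", "B"),
    "E": ("C", "B"),
}


def consume(stream):
    """Imitate the parent lookup from the traceback.

    Every record's parents must already be in parent_cache, so the
    stream has to arrive in topological (parents-first) order.
    """
    parent_cache = {}
    for key in stream:
        # Raises KeyError if a parent has not been seen yet.
        parent_lines = [parent_cache[parent] for parent in parent_map[key]]
        parent_cache[key] = (key, parent_lines)
    return parent_cache


def first_missing_parent(stream):
    """Return (key, parent) for the first out-of-order record, or None."""
    seen = set()
    for key in stream:
        for parent in parent_map[key]:
            if parent not in seen:
                return key, parent
        seen.add(key)
    return None


topological = ["A", "B", "C", "E"]
# Local keys first, then the non-local keys in whatever order the set gives.
split_queues = ["B", "E"] + sorted({"A", "C"})

consume(topological)                        # no error
print(first_missing_parent(topological))    # None
print(first_missing_parent(split_queues))   # ('B', 'A')
```

Calling `consume(split_queues)` raises `KeyError: 'A'`, the same shape of failure as the one reported, because B is processed before its parent A has been cached.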

Now, we hit this problem much more often than the above would indicate, *because* the code in _VFContentMapGenerator does:

   missing_keys = set(nonlocal_keys)
   # Read from remote versioned file instances and provide to our caller.
   for source in self.vf._fallback_vfs:
       if not missing_keys:
           break
       # Loop over fallback repositories asking them for texts - ignore
       # any missing from a particular fallback.
       for record in source.get_record_stream(missing_keys,
           'unordered', True):

           ^^^^^^^^^^- always an unordered fetch from fallbacks

The latter is pretty easy to fix, so I'm writing up something for that. I expect it to handle 90% of the cases.
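
As a rough illustration of why the hard-coded 'unordered' request matters, here is another toy model. It is a sketch under assumptions, not the fix in the associated branch: `ToyFallback`, its `get_record_stream`, and `fetch_missing` are hypothetical stand-ins that only imitate the shape of the loop quoted above. The point is that passing the caller's ordering through lets the fallback return parents before children.

```python
class ToyFallback(object):
    """Hypothetical stand-in for a fallback versioned-file store."""

    def __init__(self, parent_map, storage_order):
        self.parent_map = parent_map        # key -> tuple of parent keys
        self.storage_order = storage_order  # order the keys happen to be stored in

    def get_record_stream(self, keys, ordering):
        keys = set(keys)  # snapshot, so the caller may mutate its own set
        wanted = [k for k in self.storage_order if k in keys]
        if ordering == "topological":
            wanted = self._topo_sort(wanted)
        for key in wanted:
            yield key

    def _topo_sort(self, keys):
        """Parents-first order, restricted to the requested keys."""
        result, done = [], set()

        def visit(key):
            if key in done:
                return
            done.add(key)
            for parent in self.parent_map[key]:
                if parent in keys:
                    visit(parent)
            result.append(key)

        for key in keys:
            visit(key)
        return result


def fetch_missing(fallbacks, nonlocal_keys, ordering):
    """Ask each fallback for the still-missing keys, in the given ordering."""
    missing = set(nonlocal_keys)
    for source in fallbacks:
        if not missing:
            break
        for key in source.get_record_stream(missing, ordering):
            missing.discard(key)
            yield key


# The fallback holds A and C; C happens to be stored before its parent A.
fallback = ToyFallback({"A": (), "C": ("A", "B")}, storage_order=["C", "A"])
print(list(fetch_missing([fallback], ["A", "C"], "unordered")))    # ['C', 'A']
print(list(fetch_missing([fallback], ["A", "C"], "topological")))  # ['A', 'C']
```

Ordering the fallback stream does not by itself address keys that interleave between the local and non-local queues, which appears to be the case left for the follow-up bug.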

Going further, --2a formats handle this case properly *today*, so I'm loath to spend a lot of time fixing the old one. I'll open up a new bug on that.

John A Meinel (jameinel) wrote :

A fix for the simple cases is available in the associated branch.
See bug #399884 for the cases it does not handle.

Changed in bzr:
status: In Progress → Fix Committed
Michael Hudson-Doyle (mwhudson) wrote :

Here's a list of URLs that have caused this problem: http://pastebin.ubuntu.com/219306/

I haven't tried the simple fix on these yet.

Michael Hudson-Doyle (mwhudson) wrote :

The attached branch fixes all 19 problem URLs I found in 4 days' worth of logs.

Andrew Bennetts (spiv)
Changed in bzr:
status: Fix Committed → Fix Released