Robert Collins wrote:
> Robert Collins has proposed merging lp:~lifeless/bzr/check-1 into lp:bzr.
>
> Requested reviews:
>     bzr-core (bzr-core)
>
> This gives a single-pass progress bar to check, makes it talk 1/2 the
> time on 1.9 flavour formats and nochange on CHK formats. More work to
> come but this seems unqualified better to me.
>
>   
Thanks for working on check. This command will get quite a workout from
lots of users as they migrate to 2a so I think getting it performing
well and reliably is quite a high priority.

This is a pretty big patch and I'm finding it difficult to review it all
at once. Here's a partial review with numerous clean-ups outside the
guts of your new code. To begin with, I'm happy with the -Dprogress
changes. These are arguably completely independent from check so it
would have been nice to submit them as a separate patch. In any case,
the changes related to that are good to go so you can land them immediately.

> +API Changes
> +***********
> +
> +* ``WorkingTree._check`` now requires a references dict with keys matching
> +  those returned by ``WorkingTree._get_check_refs``. (Robert Collins)
> +
>   
And more importantly, you've changed the public API Branch.check() to
have a new mandatory parameter and InventoryEntry.check() to not have a
tree parameter. You need to document these as API breaks.

> +    def find_lefthand_distances(self, keys):
> +        """Find the distance to null for all the keys in keys.
> +
> +        :param keys: keys to lookup.
> +        :return: A dict key->distance for all of keys.
> +        """
> +        # Optimisable by concurrent searching, but a random spread should get
> +        # some sort of hit rate.
> +        result = {}
>   

result isn't used in this routine.

> +        else:
> +            # At the moment, check does not extra work over get_record_stream
> +            return self.get_record_stream(keys, 'unordered', True)
>   

This comment is confusing and needs rewording.

> +        for revid, revision in revisions_iterator:
> +            if revision is None:
> +                pass
> +            parent_map = vf.get_parent_map([(revid,)])
>   
"pass" does nothing here so I'm assuming it's wrong. "continue" maybe?

> === modified file 'bzrlib/smart/medium.py'
> --- bzrlib/smart/medium.py	2009-06-10 03:56:49 +0000
> +++ bzrlib/smart/medium.py	2009-06-16 05:29:42 +0000
> @@ -37,7 +37,6 @@
>  from bzrlib import (
>      debug,
>      errors,
> -    osutils,
>      symbol_versioning,
>      trace,
>      ui,
> @@ -46,7 +45,8 @@
>  from bzrlib.smart import client, protocol, request, vfs
>  from bzrlib.transport import ssh
>  """)
> -
> +#usually already imported, and getting IllegalScoperReplacer on it here.
> +from bzrlib import osutils
>   

Was this a transient issue? If not, it looks like a completely separate
bug to the check stuff and maybe ought to be submitted as such.
>      def check(self, progress_bar=None):
> -        """Check this object for integrity."""
> +        """Check this object for integrity.
> +        
> +        :param progress_bar: A progress bar to output as the check progresses.
> +        :param keys: Specific keys within the VersionedFiles to check. When
> +            this parameter is not None, check() becomes a generator as per
> +            get_record_stream. The difference to get_record_stream is that
> +            more or deeper checks will be performed.
> +        :return: None, or if keys was supplied a generator as per
> +            get_record_stream.
> +        """
>   
This is a really good docstring thanks. You need to add keys to the
parameter list though.

> +Repository
> +----------
> +
> +Things that can go wrong:
> +* Bit errors or alterations may occur in raw data.
> +* Data that is referenced may be missing
> +* There could be a lot of garbage in upload etc.
> +* File graphs may be inconsistent with inventories and parents.
> +* The revision graph cache can be inconsistent with the revision data.
> +
> +Branch
> +------
> +
> +Things that can go wrong:
> +* Tag or tip revision ids may be missing from the repo.
>   

ReST doesn't correctly format these lists. You need a blank line above them.

So, beyond the -Dprogress stuff, I'm not sure how best to break this up
to make review easier. I certainly don't want to see you spending much
time doing that but a patch with 24 files changed is stretching my small
brain's working set. If you can see an easy split into two parts - e.g.
land the tests first or land the VF changes first - please consider it.

Ian C.