Merge lp:~gz/bzr/root_drive_file_url_841322 into lp:bzr

Proposed by Martin Packman
Status: Merged
Approved by: Jelmer Vernooij
Approved revision: no longer in the source branch.
Merged at revision: 6127
Proposed branch: lp:~gz/bzr/root_drive_file_url_841322
Merge into: lp:bzr
Diff against target: 58 lines (+12/-3)
4 files modified
bzrlib/tests/test_urlutils.py (+6/-0)
bzrlib/transport/__init__.py (+1/-2)
bzrlib/urlutils.py (+1/-1)
doc/en/release-notes/bzr-2.5.txt (+4/-0)
To merge this branch: bzr merge lp:~gz/bzr/root_drive_file_url_841322
Reviewer Review Type Date Requested Status
Jelmer Vernooij (community) Approve
Review via email: mp+74034@code.launchpad.net

Commit message

Fix breakage when creating a transport at drive root on windows

Description of the change

Fixes massive breakage on windows from changes to path handling aimed at smuggling extra information for colocated branch addressing into urls. There's a fun set of interactions and a couple of very minor bugs that make for some serious fallout.

Lots of operations involve speculatively opening a location, and per `BzrDir.open_containing_from_transport` they ascend until they reach a control directory or the root. Additionally, transports represents locations as urls, and in the case of file system paths, as file: urls. Therefore on windows, creating a transport at "file:///C:/" or such like is common. (As an aside, the test suite is one place we generally avoid this, because we want tests isolated they have a containing control directory).

The simple bug is the urlutils._win32_extract_drive_letter check:

    if len(path) < 3 or path[2] not in ':|' or path[3] != '/':
        raise errors.InvalidURL(url_base + path,
            'win32 file:/// paths need a drive letter')

Which ensures the string is at least three characters long then looks at the fourth character.

This only gets exposed because of the set of changes landed in r6079:
<https://code.launchpad.net/~jelmer/bzr/transport-segments/+merge/71583>

The problems come from the added split_segment_parameters functions. They aim to remove the extra bit from the end of the url, but use the normal url split function, which requires a valid url an input. For most urls, adding a comma and some more characters to the end just looks like a longer final path segment, but there are some special cases that are rather different.

The last complication is in `Transport.__init__` which strips trailing forward slashes before passing on the url:

    self._segment_parameters = urlutils.split_segment_parameters(
        base.rstrip("/"))[1]

This makes the common form of "file:///C:/" into "file:///C:" which tickles the bug above, and would anyway not be accepted as valid by all applications. As far as I can see, the strip is entirely redundant as the first thing done inside is the call to split with the default argument `exclude_trailing_slash=True` which takes off the last slash without creating an invalid url.

So, the little things are easily fixed, but this is not a complete solution. There are at least two failing tests on windows (in bt.test_urlutils) from the addition of split_segment_parameters_raw inside the local_path_from_url implementations. That's from the issue above, "file:///" can be treated as valid as a translation of the path \ but "file:///,tip" can't be handled as a url. It's probably possible to hack around that and other issues, but resolving the confusion over paths, urls, and urls-plus should make life easier. Someone with a clear picture idea of the objectives may be able to suggest something sensible here.

To post a comment you must log in.
Revision history for this message
Jelmer Vernooij (jelmer) :
review: Approve
Revision history for this message
John A Meinel (jameinel) wrote :
Download full text (3.5 KiB)

I'm pretty hesitant to change the required number of char to 4, but it may
be correct.

John
=:->
On Sep 5, 2011 1:33 AM, "Martin [gz]" <email address hidden> wrote:
> Martin [gz] has proposed merging lp:~gz/bzr/root_drive_file_url_841322
into lp:bzr.
>
> Requested reviews:
> bzr-core (bzr-core)
> Related bugs:
> Bug #841322 in Bazaar: "IndexError on windows from any operation that
opens a transport at root drive"
> https://bugs.launchpad.net/bzr/+bug/841322
>
> For more details, see:
> https://code.launchpad.net/~gz/bzr/root_drive_file_url_841322/+merge/74034
>
> Fixes massive breakage on windows from changes to path handling aimed at
smuggling extra information for colocated branch addressing into urls.
There's a fun set of interactions and a couple of very minor bugs that make
for some serious fallout.
>
> Lots of operations involve speculatively opening a location, and per
`BzrDir.open_containing_from_transport` they ascend until they reach a
control directory or the root. Additionally, transports represents locations
as urls, and in the case of file system paths, as file: urls. Therefore on
windows, creating a transport at "file:///C:/" or such like is common. (As
an aside, the test suite is one place we generally avoid this, because we
want tests isolated they have a containing control directory).
>
> The simple bug is the urlutils._win32_extract_drive_letter check:
>
> if len(path) < 3 or path[2] not in ':|' or path[3] != '/':
> raise errors.InvalidURL(url_base + path,
> 'win32 file:/// paths need a drive letter')
>
> Which ensures the string is at least three characters long then looks at
the fourth character.
>
> This only gets exposed because of the set of changes landed in r6079:
> <https://code.launchpad.net/~jelmer/bzr/transport-segments/+merge/71583>
>
> The problems come from the added split_segment_parameters functions. They
aim to remove the extra bit from the end of the url, but use the normal url
split function, which requires a valid url an input. For most urls, adding a
comma and some more characters to the end just looks like a longer final
path segment, but there are some special cases that are rather different.
>
> The last complication is in `Transport.__init__` which strips trailing
forward slashes before passing on the url:
>
> self._segment_parameters = urlutils.split_segment_parameters(
> base.rstrip("/"))[1]
>
> This makes the common form of "file:///C:/" into "file:///C:" which
tickles the bug above, and would anyway not be accepted as valid by all
applications. As far as I can see, the strip is entirely redundant as the
first thing done inside is the call to split with the default argument
`exclude_trailing_slash=True` which takes off the last slash without
creating an invalid url.
>
>
> So, the little things are easily fixed, but this is not a complete
solution. There are at least two failing tests on windows (in
bt.test_urlutils) from the addition of split_segment_parameters_raw inside
the local_path_from_url implementations. That's from the issue above,
"file:///" can be treated as valid as a translation of the path \ but
"file:///,tip" can't be handled as a url. It's probably possible to hack
around that a...

Read more...

Revision history for this message
Martin Packman (gz) wrote :

> I'm pretty hesitant to change the required number of char to 4, but it may
> be correct.

Yeah, I was also very unsure about that. I think it's okay because:

* It was throwing IndexError, so throwing a more appropriate error is progress
* _win32_local_path_from_url already treats 'file:///C:' as invalid and has a test
* Explorer doesn't resolve a file url without a trailing slash for me

Revision history for this message
Martin Packman (gz) wrote :

sent to pqm by email

Revision history for this message
Martin Packman (gz) wrote :

A babune run with this landed shows it fixed many but not all of the failures introduced:
<http://babune.ladeuil.net:24842/job/selftest-windows/522/>

I've filed bug 842223 and bug 842233 for the remaining failures, and bug 788130 already exists for an intermittent failure that happened to reappear that run.

Revision history for this message
John A Meinel (jameinel) wrote :
Download full text (4.7 KiB)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 9/5/2011 6:58 PM, Martin [gz] wrote:
>> I'm pretty hesitant to change the required number of char to 4,
>> but it may be correct.
>
> Yeah, I was also very unsure about that. I think it's okay
> because:
>
> * It was throwing IndexError, so throwing a more appropriate error
> is progress * _win32_local_path_from_url already treats
> 'file:///C:' as invalid and has a test * Explorer doesn't resolve a
> file url without a trailing slash for me

This is what I'm getting with bzr.dev after your patch landed, trying
to run "bzr st" outside of a working tree. Thoughts?

$ bzr --version
Bazaar (bzr) 2.5.0dev1
  from bzr checkout C:/dev/bzr/bzr.dev
    revision: 6125
    revid: <email address hidden>
    branch nick: bzr
...

$ bzr st
bzr: ERROR: exceptions.IndexError: string index out of range

Traceback (most recent call last):
  File "C:\dev\bzr\bzr.dev\bzrlib\commands.py", line 918, in
exception_to_return_code
    return the_callable(*args, **kwargs)
  File "C:\dev\bzr\bzr.dev\bzrlib\commands.py", line 1118, in run_bzr
    ret = run(*run_argv)
  File "C:\dev\bzr\bzr.dev\bzrlib\commands.py", line 676, in
run_argv_aliases
    return self.run(**all_cmd_args)
  File "C:\dev\bzr\bzr.dev\bzrlib\commands.py", line 698, in run
    return self._operation.run_simple(*args, **kwargs)
  File "C:\dev\bzr\bzr.dev\bzrlib\cleanup.py", line 135, in run_simple
    self.cleanups, self.func, *args, **kwargs)
  File "C:\dev\bzr\bzr.dev\bzrlib\cleanup.py", line 165, in
_do_with_cleanups
    result = func(*args, **kwargs)
  File "C:\dev\bzr\bzr.dev\bzrlib\commands.py", line 1133, in ignore_pipe
    result = func(*args, **kwargs)
  File "C:\dev\bzr\bzr.dev\bzrlib\builtins.py", line 294, in run
    tree, relfile_list = WorkingTree.open_containing_paths(file_list)
  File "C:\dev\bzr\bzr.dev\bzrlib\workingtree.py", line 296, in
open_containing_paths
    tree = WorkingTree.open_containing(default_directory)[0]
  File "C:\dev\bzr\bzr.dev\bzrlib\workingtree.py", line 279, in
open_containing
    control, relpath = bzrdir.BzrDir.open_containing(path)
  File "C:\dev\bzr\bzr.dev\bzrlib\bzrdir.py", line 875, in open_containing
    return BzrDir.open_containing_from_transport(transport)
  File "C:\dev\bzr\bzr.dev\bzrlib\bzrdir.py", line 901, in
open_containing_from_transport
    new_t = a_transport.clone('..')
  File "C:\dev\bzr\bzr.dev\bzrlib\transport\local.py", line 92, in clone
    return LocalTransport(abspath)
  File "C:\dev\bzr\bzr.dev\bzrlib\transport\local.py", line 73, in
__init__
    super(LocalTransport, self).__init__(base)
  File "C:\dev\bzr\bzr.dev\bzrlib\transport\__init__.py", line 314, in
__init__
    base.rstrip("/"))[1]
  File "C:\dev\bzr\bzr.dev\bzrlib\urlutils.py", line 458, in
split_segment_parameters
    (base_url, subsegments) = split_segment_parameters_raw(url)
  File "C:\dev\bzr\bzr.dev\bzrlib\urlutils.py", line 445, in
split_segment_parameters_raw
    (parent_url, child_dir) = split(url)
  File "C:\dev\bzr\bzr.dev\bzrlib\urlutils.py", line 430, in split
    url_base, path = _win32_extract_drive_letter(url_base, path)
  File "C:\dev\bzr\bzr.dev\bzrlib\urlutils.py", line 391, i...

Read more...

Revision history for this message
John A Meinel (jameinel) wrote :

My bad. This landed in bzr.dev 6129, and I was at 6125.

It seems "bzr update" started failing because of:
$ bzr up
bzr: ERROR: Operation denied because it would change the main history, which is not permitted by the append_revisions_only setting on branch "C:/dev/bzr/bzr.dev/".

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'bzrlib/tests/test_urlutils.py'
2--- bzrlib/tests/test_urlutils.py 2011-08-15 13:12:36 +0000
3+++ bzrlib/tests/test_urlutils.py 2011-09-05 17:14:27 +0000
4@@ -457,6 +457,12 @@
5 self.assertEqual(('file:///C:', '/foo'), extract('file://', '/C:/foo'))
6 self.assertEqual(('file:///d|', '/path'), extract('file://', '/d|/path'))
7 self.assertRaises(InvalidURL, extract, 'file://', '/path')
8+ # Root drives without slash treated as invalid, see bug #841322
9+ self.assertEqual(('file:///C:', '/'), extract('file://', '/C:/'))
10+ self.assertRaises(InvalidURL, extract, 'file://', '/C:')
11+ # Invalid without drive separator or following forward slash
12+ self.assertRaises(InvalidURL, extract, 'file://', '/C')
13+ self.assertRaises(InvalidURL, extract, 'file://', '/C:ool')
14
15 def test_split(self):
16 # Test bzrlib.urlutils.split()
17
18=== modified file 'bzrlib/transport/__init__.py'
19--- bzrlib/transport/__init__.py 2011-08-28 15:33:21 +0000
20+++ bzrlib/transport/__init__.py 2011-09-05 17:14:27 +0000
21@@ -310,8 +310,7 @@
22 def __init__(self, base):
23 super(Transport, self).__init__()
24 self.base = base
25- self._segment_parameters = urlutils.split_segment_parameters(
26- base.rstrip("/"))[1]
27+ self._segment_parameters = urlutils.split_segment_parameters(base)[1]
28
29 def _translate_error(self, e, path, raise_generic=True):
30 """Translate an IOError or OSError into an appropriate bzr error.
31
32=== modified file 'bzrlib/urlutils.py'
33--- bzrlib/urlutils.py 2011-08-15 13:12:36 +0000
34+++ bzrlib/urlutils.py 2011-09-05 17:14:27 +0000
35@@ -388,7 +388,7 @@
36 """On win32 the drive letter needs to be added to the url base."""
37 # Strip off the drive letter
38 # path is currently /C:/foo
39- if len(path) < 3 or path[2] not in ':|' or path[3] != '/':
40+ if len(path) < 4 or path[2] not in ':|' or path[3] != '/':
41 raise errors.InvalidURL(url_base + path,
42 'win32 file:/// paths need a drive letter')
43 url_base += path[0:3] # file:// + /C:
44
45=== modified file 'doc/en/release-notes/bzr-2.5.txt'
46--- doc/en/release-notes/bzr-2.5.txt 2011-09-02 12:45:36 +0000
47+++ doc/en/release-notes/bzr-2.5.txt 2011-09-05 17:14:27 +0000
48@@ -148,6 +148,10 @@
49 * Decode ``BZR_HOME`` with fs encoding on posix platforms to avoid unicode
50 errors. (Vincent Ladeuil, #822571)
51
52+* Fix fallout from URL handling changes in 2.5 that caused an IndexError to be
53+ raised whenever a transport at the drive root was opened on windows.
54+ (Martin [gz], #841322)
55+
56 * Rather than an error being raised, a warning is now printed when the
57 current user does not have permission to read a configuration file.
58 (Jelmer Vernooij, #837324)