Merge lp:~jspashett/bzr/587868_args_handling_cant_debug into lp:bzr

Proposed by Jason Spashett
Status: Superseded
Proposed branch: lp:~jspashett/bzr/587868_args_handling_cant_debug
Merge into: lp:bzr
Diff against target: 105 lines (+29/-21)
4 files modified
NEWS (+5/-0)
bzr (+0/-1)
bzrlib/commands.py (+2/-0)
bzrlib/win32utils.py (+22/-20)
To merge this branch: bzr merge lp:~jspashett/bzr/587868_args_handling_cant_debug
Reviewer Review Type Date Requested Status
Alexander Belchenko Disapprove
Martin Packman Pending
John A Meinel Pending
Review via email: mp+27897@code.launchpad.net

This proposal supersedes a proposal from 2010-06-07.

This proposal has been superseded by a proposal from 2010-06-21.

Description of the change

Change to enable debugging of bzr under windows, pdb, eclipse et al.
* Compare the number of arguments from GetCommandLineW with sys.argv and remove any 'extra' arguments present obtained from GetCommandLineW therby eliminating any unwanted args if run under a debugger.

e.g. python -m pdb bzr status

currently yeilds -m pdb bzr status, and bzr complains about arguments.

Rationale: In all cases, using sys.argv or GetCommandLineW command arguments are seperated by spaces. Hence the number of arguments should agree. We do not use sys.argv becuase the encoding can be an unknown multi-byte. The assumption is that arguments will always be seperated by a space (as Splitter class assumes, and presumably python itself, as well as the command interpreter) thereore we can use this counting method to determine the proper argument start.

To post a comment you must log in.
Revision history for this message
Alexander Belchenko (bialix) wrote : Posted in a previous version of this proposal

1) Your changes are conflicting with current state of bzr.dev. Please, merge lp:bzr into your branch and resolve conflicts. Then commit and push new state to lp.

2) I think you don't need additional function process_metachars_in_commandline, do you?

3) Why you have changed name of the function to _win32syscall_command_line_to_argv? What it should say to other people?

4) Don't use `raise AssertionError()`, use instead BzrInternalError.

review: Needs Fixing
Revision history for this message
Alexander Belchenko (bialix) wrote : Posted in a previous version of this proposal

5) You don't need to use such complex code as

arguments = []
arguments.extend(s)

There is enough to use just:

arguments = list(s)

6) len (sys.argv) <-- don't put a space between len and parenthesis.

review: Needs Fixing
Revision history for this message
Jason Spashett (jspashett) wrote : Posted in a previous version of this proposal

1) Have merged and resolved, removing your fix for 588277 as this fix fixes that bug too.
2) Yes. Have refactored into call site
3) Thought it would be more informative, but have changed the name back
4) Ok, changed.
5) Yes, my bad python
6) Changed.

I have not updated the news file to say I've removed the code fix for 588277.

Revision history for this message
Alexander Belchenko (bialix) wrote : Posted in a previous version of this proposal

I don't think that removing my fix for bug 588277 is the right thing. And btw, have you tested your branch manually with --profile-imports in the real command line?

Revision history for this message
Jason Spashett (jspashett) wrote : Posted in a previous version of this proposal

Alexander Belchenko wrote:
> I don't think that removing my fix for bug 588277 is the right thing. And btw, have you tested your branch manually with --profile-imports in the real command line?
>
Yes I did test it with "python bzr --profile-imports status". Perhaps
it's not right as you say but It does get rid of the message "bzr:
ERROR: no such option: --profile-imports" though, and it does print out
the profile to stdout.

Like so:

C:\bzr\bazaar\587868_args_handling_cant_debug>python bzr.py
--profile-imports status 2>&1 | more
unknown:
  .project
  .pydevproject
  bzr.py
  cum inline name @ file:line
 92.2 1.5 bzrlib.commands @ __main__:133
 80.8 1.0 + [disable_plugins, load_plugins]bzrlib.plugin @
bzrlib.commands:52
 79.8 61.1 ++ [osutils]bzrlib @ bzrlib.plugin:36
  ...etc

It works because --profile-imports is removed from sys.argv, and that is
the length used in my fix, so it is also removed from the string
returned by GetCommandLineW
However this isn't correct because if you run python bzr status
--profile-imports 2>&1 | more with --profile-imports last, it does not
work. So your code is required. I could uncomit my change and do it
again. But since sys.argv is modified I don't think my fix will work
properly with as it assumes the arguments to the front should be removed
not arbitrary ones in the middle.

bzr, line 55: sys.argv.remove('--profile-imports')

I think therefore I will scrap this branch as I don't think there is a
way around this.

Revision history for this message
Alexander Belchenko (bialix) wrote : Posted in a previous version of this proposal

Jason Spashett пишет:
> I think therefore I will scrap this branch as I don't think there is a
> way around this.

Not so fast, please.

I think you need to add the code which removes --profile-imports
before you start comparing with len(sys.argv).
It's a bit tricky because arguments is a list of 2-tuples.

So you need something like this

if '--profile-imports' not in sys.argv:
    if (False, '--profile-imports') in arguments:
        arguments.remove((False, '--profile-imports'))
    elif (True, '--profile-imports') in arguments:
        arguments.remove((True, '--profile-imports'))

Which is a bit verbose, but it should handle both cases (plain and
quoted --profile-imports).

Revision history for this message
Alexander Belchenko (bialix) wrote : Posted in a previous version of this proposal
Revision history for this message
Alexander Belchenko (bialix) wrote : Posted in a previous version of this proposal

2 more fixes needed:

1) you can't raise InternalBzrError without importing this name first, so please add this line before raise:

    from bzrlib.errors import InternalBzrError

and sorry for my typo in error class name, it's actually InternalBzrError not BzrInternalError.

2) Bazaar code guideline requires only 2 blank lines between high-level functions, so please remove extra blank line after `return argv`

And just a note: I'm usually trying to create minimal possible patch and avoid changing names of variables unless it's strictly necessary, because it's just creates a noise which makes review harder. Your change of args to argv is not strictly necessary here, although I don't mind with such change. It won't block your patch. But such change don't add any real value to the patch.

Revision history for this message
Jason Spashett (jspashett) wrote : Posted in a previous version of this proposal

Alexander Belchenko wrote:
> does PEB means Process Enviroment Block? http://undocumented.ntinternals.net/UserMode/Undocumented%20Functions/NT%20Objects/Process/PEB.html
>
Yes, the RTL_USER_PROCESS_PARAMETERS is in there. All I meant was that
sys.argv in python and GetCommandLineW should look at the same data,
except sys.argv is locale encoded. At least this is what I believe.

Revision history for this message
Jason Spashett (jspashett) wrote : Posted in a previous version of this proposal

Alexander Belchenko wrote:
> 2 more fixes needed:
>
> 1) you can't raise InternalBzrError without importing this name first, so please add this line before raise:
>
> from bzrlib.errors import InternalBzrError
>
> and sorry for my typo in error class name, it's actually InternalBzrError not BzrInternalError.
>
> 2) Bazaar code guideline requires only 2 blank lines between high-level functions, so please remove extra blank line after `return argv`
>
> And just a note: I'm usually trying to create minimal possible patch and avoid changing names of variables unless it's strictly necessary, because it's just creates a noise which makes review harder. Your change of args to argv is not strictly necessary here, although I don't mind with such change. It won't block your patch. But such change don't add any real value to the patch.
>
Alexander,

I put the code you suggested in, but have trouble with the import and
can't tell why at the moment:

    module = __import__(module_python_path, scope, scope, [member])
  File
"C:\bzr\bazaar\587868_args_handling_cant_debug\bzrlib\win32utils.py",
line 28, in <module>
    from bzrlib.errors import InternalBzrError
  File "C:\bzr\bazaar\587868_args_handling_cant_debug\bzrlib\errors.py",
line 20, in <module>
    from bzrlib import (
ImportError: cannot import name osutils

Revision history for this message
Alexander Belchenko (bialix) wrote : Posted in a previous version of this proposal

Jason Spashett пишет:
> I put the code you suggested in, but have trouble with the import and
> can't tell why at the moment:
>
> module = __import__(module_python_path, scope, scope, [member])
> File
> "C:\bzr\bazaar\587868_args_handling_cant_debug\bzrlib\win32utils.py",
> line 28, in <module>
> from bzrlib.errors import InternalBzrError

You have this error because osutils module tries to import win32utils
module. But win32utils tries to import bzrlib which in fact imports
osutils.

Don't put this import at the beginning of the file. Put it inside the
"if" block, right before raise statement, e.g.:

    if len(arguments) < len(sys.argv):
        from bzrlib.errors import InternalBzrError
        raise InternalBzrError("len(GetCommandLineW(...)) <
len(sys.argv) should not be possible")

That will work.

Revision history for this message
Alexander Belchenko (bialix) wrote : Posted in a previous version of this proposal

I think, now it's ready. Except... you need corresponding NEWS entry! Please, add one in the bugfix section.

Also we need a second approver to land this patch. Maybe Martin gz? ;-)

review: Approve
Revision history for this message
Martin Packman (gz) wrote : Posted in a previous version of this proposal

This basically looks fine to me, but could do with some tests. It's hard to tell just from the diff that, for instance, the py2exe case is still correct. Perhaps Alexander could help you out with that?

Some follow on comments about specific parts of change already mentioned in review:

+ # Bug #588277 remove the --profile-imports from the command line arguments

I realise the below code is just trying to match what the code in the bzr script is doing, but a better option would be to change that to not do `sys.argv.remove('--profile-imports')` at all, and add it to the global args list.

+ from bzrlib.errors import InternalBzrError
+ raise InternalBzrError("len(GetCommandLineW(...)) < len(sys.argv) should not be possible")

Just AssertionError would be fine here.

I'd have no problem with this landing as-is though.

review: Approve
Revision history for this message
Jason Spashett (jspashett) wrote : Posted in a previous version of this proposal

Martin [gz] wrote:
> Review: Approve
> This basically looks fine to me, but could do with some tests. It's hard to tell just from the diff that, for instance, the py2exe case is still correct. Perhaps Alexander could help you out with that?
>
> Some follow on comments about specific parts of change already mentioned in review:
>
> + # Bug #588277 remove the --profile-imports from the command line arguments
>
> I realise the below code is just trying to match what the code in the bzr script is doing, but a better option would be to change that to not do `sys.argv.remove('--profile-imports')` at all, and add it to the global args list.
>
>
Yes I agree with that. I am not familiar with the global args list
though. My thought was that sys.argv shouldn't be modified, and that
there should be some args list that is passed around instead of
referencing sys.args from within the lower code. That way we can do what
ever is necessary to prepare the arguments list. Unfortunately that's
too many changes as sys.argv is used in many places as you would expect.

> + from bzrlib.errors import InternalBzrError
> + raise InternalBzrError("len(GetCommandLineW(...)) < len(sys.argv) should not be possible")
>
> Just AssertionError would be fine here.
>
> I'd have no problem with this landing as-is though.
>

If you want I can start up another branch try and do what you say Martin
and if Alexander thinks it's a good idea.

Revision history for this message
Jason Spashett (jspashett) wrote : Posted in a previous version of this proposal

Jason Spashett wrote:
> Martin [gz] wrote:
>
> ...
>> Just AssertionError would be fine here.
>>
>> I'd have no problem with this landing as-is though.
>>
>>
>
> If you want I can start up another branch try and do what you say Martin
> and if Alexander thinks it's a good idea.
>
I've just 'discovered' that the unit tests don't pass, and they don't
because of course sys.argv isn't set. And so an exception is raised, namely:
raise InternalBzrError("len(GetCommandLineW(...)) < len(sys.argv) should
not be possible")
It's all rather messy now somehow.

I suppose that could be transformed into a > if statement instead.

Is it possible to call near the start of execution the GetCommandLineW
function, process it for 'extraneous' args and then set sys.argv to the
result? would this work? I just say that because it may get around these
fiddly special cases that are now cropping up -- like the
--profile-imports fix. Instead, fix it up once and assign it to
sys.argv, and the rest of the code can carry on as it does removing args
or whatever it wants to do.

Revision history for this message
Martin Packman (gz) wrote : Posted in a previous version of this proposal

Hm, yes. Means move some code around, because the order must be:
1) Split line
2) Strip args from front
3) Glob unquoted args

The unit tests for _command_line_to_argv aren't expecting step 2 so don't think you'll need to be looking at sys.argv there. You could change the tests, but it's good to have them reasonably isolated. Might be better to split the three steps into separate private functions and call 1 followed by 3 for those tests.

I don't think assigning to sys.argv will really help anything, there's only one entry point for this code already, it's just been poked too many times and become rather gnarly.

On the --profile-imports front, how about just removing the removal from from the bzr script and adding to bzrlib.commands.run_bzr something like:
    elif a == '--profile-imports':
        pass # already handled in startup script

Revision history for this message
Jason Spashett (jspashett) wrote : Posted in a previous version of this proposal

Martin [gz] wrote:
> Hm, yes. Means move some code around, because the order must be:
> 1) Split line
> 2) Strip args from front
> 3) Glob unquoted args
>
> The unit tests for _command_line_to_argv aren't expecting step 2 so don't think you'll need to be looking at sys.argv there. You could change the tests, but it's good to have them reasonably isolated. Might be better to split the three steps into separate private functions and call 1 followed by 3 for those tests.
>
> I don't think assigning to sys.argv will really help anything, there's only one entry point for this code already, it's just been poked too many times and become rather gnarly.
>
>
Yes, well I have more questions than answers now. Presumably we have to
call GetCommandLineW on windows. So what happens on linux if the locale
is ascii but a file we try and open is utf with 'unicode' characters in
it? Does it fail? Sorry to be a pain but I am just trying to understand
why things are they way they are.

I was thinking of this sort of thing (not exactly this), so that the
command line is 'fixed' and out of the way. Of course there would need
to be changes in win32utils.py too, refactorings, which I've hacked in
but not shown.
C:\bzr\bazaar\587868_args_handling_cant_debug>bzr diff
=== modified file 'bzr'
--- bzr 2010-05-27 17:59:18 +0000
+++ bzr 2010-06-11 00:42:26 +0000
@@ -49,6 +49,16 @@
 if hasattr(os, "unsetenv"):
     os.unsetenv(REINVOKE)

+if sys.platform.startswith('win'):
+ from bzrlib.win32utils import get_unicode_argv
+ new_args = [u'bzr']
+ new_args.extend(get_unicode_argv())
+ if len(new_args) < len(sys.argv):
+ raise AssertError("len(GetCommandLineW(...)) < len(sys.argv)
should not be possible")
+ new_args = new_args[len(new_args) - len(sys.argv):]
+ sys.argv = new_args
+
+

In other words the work done to get the correct command line is called
once from bzr. Things appear to then work, including the unit tests.

> On the --profile-imports front, how about just removing the removal from from the bzr script and adding to bzrlib.commands.run_bzr something like:
> elif a == '--profile-imports':
> pass # already handled in startup script
>

Should probably do this anyway/aswell.

Revision history for this message
Vincent Ladeuil (vila) wrote : Posted in a previous version of this proposal

@Jason, Thanks a lot, I think the only missing bit now is the contributor agreement: http://www.canonical.com/contributors

Revision history for this message
Jason Spashett (jspashett) wrote : Posted in a previous version of this proposal

Vincent Ladeuil wrote:
> @Jason, Thanks a lot, I think the only missing bit now is the contributor agreement: http://www.canonical.com/contributors
>
I can do that Vincent, but my attempted patch isn't good enough yet, and
I would be interested to see what people thing of my last suggestion. If
that's of no help then I'll do what Matrin [gz] suggests.

Revision history for this message
Vincent Ladeuil (vila) wrote : Posted in a previous version of this proposal

@Jason: Ok, so, I'll put this proposal status to 'Work In progress' then.
From there, you can push new changes to the same branch and set the status back to 'Needs Review', people
that have already review the previous version should be notified and asked to review again.

If you have doubts about how your changes will behave outside of windows, feel free to add more tests and ping me
on IRC #bzr or here, I can help you run these tests on http://babune.ladeuil.net:24842.

Revision history for this message
Jason Spashett (jspashett) wrote : Posted in a previous version of this proposal

I have now re-done this fix, hopefully it's reasonable and should avoide the cases were we had to adjust the command line in win32utils.py becuase of things like --profile-imports

What I've done is to:

set sys.argv to a unicode copy of the command line, for all patforms, by calling bzrlib.osutils.get_unicode_argv

I've then gone and changed get_uncode_argv (both of them) to return the full command line complete with program name, because it's needed in te bzr script (as above)

So then I've changed _specified_or_unicode_argv to return sys.argv instead of call get_unicode_argv

You'll see from the diffs, but the upshot is, hopefully that sys.argv is unicode from the start. It's a bit 'odd' changing sys.argv I know, but perhaps it's not too hacky and it's done in one place as startup now.

Revision history for this message
Gordon Tyler (doxxx) wrote :

The first hunk of changes in win32utils seem superfluous.

Revision history for this message
Alexander Belchenko (bialix) wrote : Posted in a previous version of this proposal

Jason Spashett пишет:
> I have now re-done this fix, hopefully it's reasonable and should avoide the cases were we had to adjust the command line in win32utils.py becuase of things like --profile-imports
>
> What I've done is to:
>
> set sys.argv to a unicode copy of the command line, for all patforms, by calling bzrlib.osutils.get_unicode_argv
>
> I've then gone and changed get_uncode_argv (both of them) to return the full command line complete with program name, because it's needed in te bzr script (as above)
>
> So then I've changed _specified_or_unicode_argv to return sys.argv instead of call get_unicode_argv
>
> You'll see from the diffs, but the upshot is, hopefully that sys.argv is unicode from the start. It's a bit 'odd' changing sys.argv I know, but perhaps it's not too hacky and it's done in one place as startup now.

You can't import anything from bzrlib before --profile-imports will be
processed. At all. Otherwise the output of --profile-imports will be
incorrect. So this is wrong path.

--
All the dude wanted was his rug back

Revision history for this message
Alexander Belchenko (bialix) wrote :

You can't import anything before --profile-imports option will be processed, otherwise its result will be wrong. This is wrong path.

Martin gz explained you earlier what need to be done to avoid sys.argv manipulations when --profile-imports is processed: add --profile-imports to the global bzr options.

review: Disapprove
Revision history for this message
Jason Spashett (jspashett) wrote :

I have done what Matring gz explained. Removed the modification of sys.argv from the bzr script
Ignore --profile-imports option in run_bzr

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'NEWS'
2--- NEWS 2010-06-18 10:40:31 +0000
3+++ NEWS 2010-06-18 11:09:29 +0000
4@@ -80,6 +80,11 @@
5 * ``walkdirs`` now raises a useful message when the filenames are not using
6 the filesystem encoding. (Eric Moritz, #488519)
7
8+* Enable debugging of bzr on windows with pdb and other tools. This was
9+ broken because we call GetCommandLineW on windows. The fix adjusts the
10+ command line we get to be the same length as sys.argv.
11+ (Jason Spashett, Alexander Belchenko, #587868)
12+
13 Improvements
14 ************
15
16
17=== modified file 'bzr'
18--- bzr 2010-06-16 12:47:51 +0000
19+++ bzr 2010-06-18 11:09:29 +0000
20@@ -52,7 +52,6 @@
21
22 profiling = False
23 if '--profile-imports' in sys.argv:
24- sys.argv.remove('--profile-imports')
25 import profile_imports
26 profile_imports.install()
27 profiling = True
28
29=== modified file 'bzrlib/commands.py'
30--- bzrlib/commands.py 2010-05-27 21:16:48 +0000
31+++ bzrlib/commands.py 2010-06-18 11:09:29 +0000
32@@ -1054,6 +1054,8 @@
33 elif a == '--coverage':
34 opt_coverage_dir = argv[i + 1]
35 i += 1
36+ elif a == '--profile-imports':
37+ pass # already handled in startup script Bug #588277
38 elif a.startswith('-D'):
39 debug.debug_flags.add(a[2:])
40 else:
41
42=== modified file 'bzrlib/win32utils.py'
43--- bzrlib/win32utils.py 2010-06-01 14:24:01 +0000
44+++ bzrlib/win32utils.py 2010-06-18 11:09:29 +0000
45@@ -535,17 +535,33 @@
46 default.
47 :return: A list of unicode strings.
48 """
49+ # First, spit the command line
50 s = cmdline.Splitter(command_line, single_quotes_allowed=single_quotes_allowed)
51- # Now that we've split the content, expand globs if necessary
52+
53+ # Bug #587868 Now make sure that the length of s agrees with sys.argv
54+ # we do this by simply counting the number of arguments in each. The counts should
55+ # agree no matter what encoding sys.argv is in (AFAIK)
56+ # len(arguments) < len(sys.argv) should be an impossibility since python gets
57+ # args from the very same PEB as does GetCommandLineW
58+ arguments = list(s)
59+
60+ # Now shorten the command line we get from GetCommandLineW to match sys.argv
61+ if len(arguments) < len(sys.argv):
62+ from bzrlib.errors import InternalBzrError
63+ raise InternalBzrError("len(GetCommandLineW(...)) < len(sys.argv) should not be possible")
64+ arguments = arguments[len(arguments) - len(sys.argv):]
65+
66+ # Carry on to process globs (metachars) in the command line
67+ # expand globs if necessary
68 # TODO: Use 'globbing' instead of 'glob.glob', this gives us stuff like
69 # '**/' style globs
70- args = []
71- for is_quoted, arg in s:
72+ argv = []
73+ for is_quoted, arg in arguments:
74 if is_quoted or not glob.has_magic(arg):
75- args.append(arg)
76+ argv.append(arg)
77 else:
78- args.extend(glob_one(arg))
79- return args
80+ argv.extend(glob_one(arg))
81+ return argv
82
83
84 if has_ctypes and winver != 'Windows 98':
85@@ -558,20 +574,6 @@
86 raise ctypes.WinError()
87 # Skip the first argument, since we only care about parameters
88 argv = _command_line_to_argv(command_line)[1:]
89- if getattr(sys, 'frozen', None) is None:
90- # Invoked via 'python.exe' which takes the form:
91- # python.exe [PYTHON_OPTIONS] C:\Path\bzr [BZR_OPTIONS]
92- # we need to get only BZR_OPTIONS part,
93- # We already removed 'python.exe' so we remove everything up to and
94- # including the first non-option ('-') argument.
95- for idx in xrange(len(argv)):
96- if argv[idx][:1] != '-':
97- break
98- argv = argv[idx+1:]
99- # we should remove '--profile-imports' option as well (bug #588277)
100- # see bzr script ~ line 54
101- if '--profile-imports' in argv and '--profile-imports' not in sys.argv:
102- argv.remove('--profile-imports')
103 return argv
104 else:
105 get_unicode_argv = None