Merge lp:~gz/bzr/remove_monkey_patched_elementtree_escaping_614522 into lp:bzr
| Status: | Merged |
|---|---|
| Approved by: | John A Meinel on 2010-08-30 |
| Approved revision: | 5344 |
| Merged at revision: | 5407 |
| Proposed branch: | lp:~gz/bzr/remove_monkey_patched_elementtree_escaping_614522 |
| Merge into: | lp:bzr |
| Diff against target: |
85 lines (+2/-67) 1 file modified
bzrlib/xml_serializer.py (+2/-67) |
| To merge this branch: | bzr merge lp:~gz/bzr/remove_monkey_patched_elementtree_escaping_614522 |
| Related bugs: |
| Reviewer | Review Type | Date Requested | Status |
|---|---|---|---|
| John A Meinel | 2010-08-29 | Approve on 2010-08-30 | |
|
Review via email:
|
|||
Commit Message
Remove monkey patching for elementtree, we do it differently anyway
Description of the Change
The monkey patching of some ElementTree escaping functions for performance purposes mangles the output with newer versions of the library.
As this kind of thing is a bad idea and it's unclear to me what improvement this provides, this branch just removes all of that.
I expect people using non-ascii characters with this code on Python 2.7 currently risk corrupting their branch or something.
| Jelmer Vernooij (jelmer) wrote : | # |
| Jelmer Vernooij (jelmer) wrote : | # |
(If the optimization is only relevant for pre-2a formats then it would be an easier choice to get rid of an optimization).
| John A Meinel (jameinel) wrote : | # |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On 8/29/2010 1:51 PM, Jelmer Vernooij wrote:
> (If the optimization is only relevant for pre-2a formats then it would be an easier choice to get rid of an optimization).
we only use xml for bundles in 2a. So not everyday, but it does get used
from time to time.
John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://
iEYEARECAAYFAkx
hoIAmwfDioXiaVe
=uqPs
-----END PGP SIGNATURE-----
| Martin Packman (gz) wrote : | # |
Okay, I'm confused now. Every microbenchmark I could cook up the bzrlib version of the _escape_cdata is actually slower than the original. So, I tried profiling bundle, and sure enough there's an xml escaping function high up in the list, but it's not the etree one:
502 0 205.1471 16.3189 <C:\Python24\
+2610268 0 13.1339 8.5832 +<C:\Python24\
So, what command can I run instead to measure how much ripping this code out hurts performance?
| John A Meinel (jameinel) wrote : | # |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On 8/30/2010 7:44 AM, Martin [gz] wrote:
> Okay, I'm confused now. Every microbenchmark I could cook up the bzrlib version of the _escape_cdata is actually slower than the original. So, I tried profiling bundle, and sure enough there's an xml escaping function high up in the list, but it's not the etree one:
>
> 502 0 205.1471 16.3189 <C:\Python24\
> +2610268 0 13.1339 8.5832 +<C:\Python24\
>
> So, what command can I run instead to measure how much ripping this code out hurts performance?
I would guess a major factor would be "which version of Elementtree" :)
since it isn't bundled with earlier pythons.
As near as I can tell, the main change is to switch:
text = replace(text, "&", "&")
text = replace(text, "'", "'") # FIXME: overkill
text = replace(text, "\"", """)
text = replace(text, "<", "<")
text = replace(text, ">", ">")
to using:
escape_re = re.compile(
escape_map = {
"&":'&',
"'":"'", # FIXME: overkill
"\"":""",
"<":"<",
">":">",
}
def _escape_
return map[match.group()]
...
text = escape_
As such, I think a valid benchmark would be:
a) grab a large inventory content from pre-2a format (1.9-rich-root,
for example). This can be a single revision
b) Time the different between a single re.sub() versus 5 calls to
'string.
Anyway, as mentioned, this isn't a large perf issue for current formats,
so we probably can just revert it.
And your profiling shows... we worked around the ElementTree code
entirely in later revisions, so again, it is likely to not degrade
performance by applying your patch.
merge: approve
John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://
iEYEARECAAYFAkx
vjUAoJ/
=Sm4E
-----END PGP SIGNATURE-----
| John A Meinel (jameinel) wrote : | # |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On 8/30/2010 7:44 AM, Martin [gz] wrote:
> Okay, I'm confused now. Every microbenchmark I could cook up the bzrlib version of the _escape_cdata is actually slower than the original. So, I tried profiling bundle, and sure enough there's an xml escaping function high up in the list, but it's not the etree one:
>
> 502 0 205.1471 16.3189 <C:\Python24\
> +2610268 0 13.1339 8.5832 +<C:\Python24\
>
> So, what command can I run instead to measure how much ripping this code out hurts performance?
Note also that when we performance tuned it, we may have been running
under lsprof, which would also penalize the extra function calls more
than real runtime would (IME).
John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://
iEYEARECAAYFAkx
JnYAnjIFNiwViT9
=gdz3
-----END PGP SIGNATURE-----
| John A Meinel (jameinel) wrote : | # |
sent to pqm by email

Do we still use XML anywhere other than pre-2a formats?