Merge lp:~andreserl/maas/lp1570609 into lp:maas/trunk

Proposed by Andres Rodriguez on 2016-04-14
Status: Rejected
Rejected by: Mike Pontillo on 2016-04-15
Proposed branch: lp:~andreserl/maas/lp1570609
Merge into: lp:maas/trunk
Diff against target: 11 lines (+1/-1)
1 file modified
src/provisioningserver/drivers/power/amt.py (+1/-1)
To merge this branch: bzr merge lp:~andreserl/maas/lp1570609
Reviewer Review Type Date Requested Status
Mike Pontillo (community) 2016-04-14 Abstain on 2016-04-15
Gavin Panella (community) Needs Fixing on 2016-04-15
Review via email: mp+291953@code.launchpad.net

Commit message

Fix AMT when using wsman (LP: #1570609)

To post a comment you must log in.
Mike Pontillo (mpontillo) wrote :

Looks good, though I do have a minor nit. I would suggest changing this line instead:

        xmldecl = re.compile(b'<[?]xml\\s')

to:

        xmldecl = re.compile('<[?]xml\\s')

Reasoning: there is no need to encode the XML string into bytes if you can just compare it to a string using the regex. (In other words, the .encode() is doing a small amount of unnecessary extra work.)

review: Approve
Andres Rodriguez (andreserl) wrote :

Been there, done that :). I actually did that first, and came across http://pastebin.ubuntu.com/15841111/ which lead me to do what I did. If you have a better way of fixing, I would liek to see it :)

Mike Pontillo (mpontillo) wrote :

Well, that's annoying. In that case, it's not worth our time to figure out how to avoid the .encode().

Land it!

Gavin Panella (allenap) wrote :

The error:

  builtins.ValueError: Unicode strings with encoding declaration are not
  supported. Please use bytes input or XML fragments without
  declaration.

means that we actually need to go back a step further.

The change:

- for doc in self._parse_multiple_xml_docs(xml)
+ for doc in self._parse_multiple_xml_docs(xml.encode())

encodes `xml` as UTF-8 [1], but if the XML declaration is:

  <?xml version="1.0" encoding="ISO-8859-1"?>

then the XML parser will treat it as ISO-8859-1, which is subtly
different to UTF-8. Substitute ISO-8859-1 with UTF-16, Shift-JIS, or any
other recognised encoding and you'll have similar problems.

What you need are the bytes that were originally pulled off the wire,
read in from a process, or read in from a file. Those should be passed
to the XML parser which will DTRT with respect to encoding.

[1] FWIW, I *hate* that Python 3 has defined a default character set for
    str.encode() and bytes.decode().

review: Needs Fixing
Andres Rodriguez (andreserl) wrote :

That seems like an alternate solution to what I'm doing. Can you suggest a fix for it if not to be fixed this way?

Mike Pontillo (mpontillo) wrote :

Let me start by saying that Gavin is absolutely correct.

However, if fixing it Gavin's way will take more than an hour or so, I would suggest that we land Andres's fix first, and then prioritize Gavin's follow-on fix as a Medium/Low. I think there are more important bugs to work on right now.

Reasoning: it's unlikely that the AMT will return XML documents in inconsistent encodings. Since this is a power driver that is very specific to the AMT, it is likely to work for the foreseeable future if we hard-code the encoding we know the AMT is using today. (if it's using UTF-8, for example, it is unlikely to change for the lifetime of the AMT product, IMHO.)

Mike Pontillo (mpontillo) wrote :

Marking Rejected since this was superseded by Gavin's branch:

https://code.launchpad.net/~allenap/maas/xml-as-bytes--bug-1570609/+merge/292012

review: Abstain

Unmerged revisions

4921. By Andres Rodriguez on 2016-04-14

Fix LP: #1570609

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'src/provisioningserver/drivers/power/amt.py'
2--- src/provisioningserver/drivers/power/amt.py 2016-04-14 15:25:36 +0000
3+++ src/provisioningserver/drivers/power/amt.py 2016-04-14 22:36:26 +0000
4@@ -85,7 +85,7 @@
5 }
6 state = next(chain.from_iterable(
7 doc.xpath('//h:PowerState/text()', namespaces=namespaces)
8- for doc in self._parse_multiple_xml_docs(xml)
9+ for doc in self._parse_multiple_xml_docs(xml.encode())
10 ))
11 return state
12