Merge lp:~cmiller/apport/attach_drm into lp:~apport-hackers/apport/trunk

Proposed by Chad Miller
Status: Merged
Merged at revision: 2918
Proposed branch: lp:~cmiller/apport/attach_drm
Merge into: lp:~apport-hackers/apport/trunk
Diff against target: 19 lines (+3/-3)
1 file modified
apport/hookutils.py (+3/-3)
To merge this branch: bzr merge lp:~cmiller/apport/attach_drm
Reviewer Review Type Date Requested Status
Martin Pitt (community) Approve
Review via email: mp+250831@code.launchpad.net

Commit message

Open DRM files as if they're bytes, not an encoded Unicode string. Process bytes, and decode (as "latin1") as last step, to get Unicode. Avoid at least two TypeErrors in Py3.

To post a comment you must log in.
Revision history for this message
Martin Pitt (pitti) wrote :

Thanks for this! But why do you decode them as latin-1? It was just encoded to base64, so ASCII should be fine. But "modes" and the others also just look like ASCII. So UTF-8 might still be a better default, or do you have a case where these attributes contain non-ASCII data which isn't UTF-8 either?

When presenting the data to the user, or downloading it from LP, apport will try and convert byte arrays to UTF-8 unicode strings; if that fails, the values are simply left as byte arrays. So either leave the entire info as a byte array, or try to convert it from ASCII or UTF-8 and if that fails with an UnicodeDecodeError, just leave it like it is?

review: Needs Information
Revision history for this message
Chad Miller (cmiller) wrote :

Martin, I have little confidence that the kernel will always return lower-7-bit values or valid UTF8 in those files. I *never want to guess encoding at all*, but if I have to pick, I would rather preserve bytes and get some mojibake that I can decode manually, than crash or discard.

I didn't dig into whether I could change the interface of the report hash to have byte array. If you say it's okay to make its values byte arrays, I'll happily drop the decode()ing.

Revision history for this message
Martin Pitt (pitti) wrote :

I modified the decoding to default to UTF-8 and be robust against wrong characters, and merged. Thanks!

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'apport/hookutils.py'
--- apport/hookutils.py 2015-02-10 11:50:59 +0000
+++ apport/hookutils.py 2015-02-24 19:33:57 +0000
@@ -838,14 +838,14 @@
838 path = os.path.join(con, f)838 path = os.path.join(con, f)
839 if f == 'uevent' or not os.path.isfile(path):839 if f == 'uevent' or not os.path.isfile(path):
840 continue840 continue
841 val = open(path).read().strip()841 val = open(path, "rb").read().strip()
842 # format some well-known attributes specially842 # format some well-known attributes specially
843 if f == 'modes':843 if f == 'modes':
844 val = val.replace('\n', ' ')844 val = val.replace(b'\n', b' ')
845 if f == 'edid':845 if f == 'edid':
846 val = base64.b64encode(val)846 val = base64.b64encode(val)
847 f += '-base64'847 f += '-base64'
848 info += '%s: %s\n' % (f, val)848 info += '%s: %s\n' % (f, val.decode("latin1"))
849 return info849 return info
850850
851851

Subscribers

People subscribed via source and target branches