Apport

Merge lp:~cmiller/apport/attach_drm into lp:~apport-hackers/apport/trunk

attach_drm
Merge into trunk

Proposed by Chad Miller on 2015-02-24

Status:	Merged
Merged at revision:	2918
Proposed branch:	lp:~cmiller/apport/attach_drm
Merge into:	lp:~apport-hackers/apport/trunk
Diff against target:	19 lines (+3/-3) 1 file modified apport/hookutils.py (+3/-3)
To merge this branch:	bzr merge lp:~cmiller/apport/attach_drm
Related bugs:	Link a bug report

Reviewer	Review Type	Date Requested	Status
Martin Pitt (community)		2015-02-24	Approve on 2015-02-27
Review via email: mp+250831@code.launchpad.net

Commit message

Open DRM files as if they're bytes, not an encoded Unicode string. Process bytes, and decode (as "latin1") as last step, to get Unicode. Avoid at least two TypeErrors in Py3.

Revision history for this message

Martin Pitt (pitti) wrote on 2015-02-26:

Thanks for this! But why do you decode them as latin-1? It was just encoded to base64, so ASCII should be fine. But "modes" and the others also just look like ASCII. So UTF-8 might still be a better default, or do you have a case where these attributes contain non-ASCII data which isn't UTF-8 either?

When presenting the data to the user, or downloading it from LP, apport will try and convert byte arrays to UTF-8 unicode strings; if that fails, the values are simply left as byte arrays. So either leave the entire info as a byte array, or try to convert it from ASCII or UTF-8 and if that fails with an UnicodeDecodeError, just leave it like it is?

review: Needs Information

Revision history for this message

Chad Miller (cmiller) wrote on 2015-02-26:

Martin, I have little confidence that the kernel will always return lower-7-bit values or valid UTF8 in those files. I *never want to guess encoding at all*, but if I have to pick, I would rather preserve bytes and get some mojibake that I can decode manually, than crash or discard.

I didn't dig into whether I could change the interface of the report hash to have byte array. If you say it's okay to make its values byte arrays, I'll happily drop the decode()ing.

Revision history for this message

Martin Pitt (pitti) wrote on 2015-02-27:

I modified the decoding to default to UTF-8 and be robust against wrong characters, and merged. Thanks!

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk

Subscribers

People subscribed via source and target branches

to all changes:

Brian Murray

Bruno Maximilian Voss

Chad Miller

Martin Pitt

Ritesh Raj Sarraf

1	=== modified file 'apport/hookutils.py'
2	--- apport/hookutils.py 2015-02-10 11:50:59 +0000
3	+++ apport/hookutils.py 2015-02-24 19:33:57 +0000
4	@@ -838,14 +838,14 @@
5	path = os.path.join(con, f)
6	if f == 'uevent' or not os.path.isfile(path):
7	continue
8	- val = open(path).read().strip()
9	+ val = open(path, "rb").read().strip()
10	# format some well-known attributes specially
11	if f == 'modes':
12	- val = val.replace('\n', ' ')
13	+ val = val.replace(b'\n', b' ')
14	if f == 'edid':
15	val = base64.b64encode(val)
16	f += '-base64'
17	- info += '%s: %s\n' % (f, val)
18	+ info += '%s: %s\n' % (f, val.decode("latin1"))
19	return info
20
21

Apport

Merge lp:~cmiller/apport/attach_drm into lp:~apport-hackers/apport/trunk

Commit message

Description of the change

Preview Diff

Subscribers