Merge ~cjwatson/launchpad:py3-messageset-decode-header into launchpad:master

Proposed by Colin Watson
Status: Merged
Approved by: Colin Watson
Approved revision: 403e92f1059f373263a095cb6a9fa40cb77f019b
Merge reported by: Otto Co-Pilot
Merged at revision: not available
Proposed branch: ~cjwatson/launchpad:py3-messageset-decode-header
Merge into: launchpad:master
Diff against target: 31 lines (+6/-8)
1 file modified
lib/lp/services/messages/model/message.py (+6/-8)
Reviewer | Review Type | Date Requested | Status
Cristian Gonzalez (community) | | | Approve
Review via email: mp+395936@code.launchpad.net

Commit message

Fix MessageSet._decode_header for Python 3

Description of the change

On Python 3, decode_header returns (str, None) if the given header has no internal encoding, even though it normally returns (bytes, charset) pairs. Adjust MessageSet._decode_header to cope with this.
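
The behaviour described above can be reproduced with the standard library alone. The following is a minimal sketch, not the Launchpad code: decode_header_parts is a hypothetical helper introduced only for illustration, and the 'us-ascii' fallback for parts without a charset is an assumption. It shows that a header with no encoded words comes back as a str, while encoded words come back as bytes paired with a charset.

```python
import email.header


def decode_header_parts(header):
    """Return the parts of a header as a list of text strings.

    Hypothetical helper for illustration; not MessageSet._decode_header.
    """
    parts = []
    for word, charset in email.header.decode_header(header):
        if isinstance(word, bytes):
            # Encoded words arrive as bytes with their charset; unencoded
            # fragments of a mixed header arrive as bytes with charset None.
            # Falling back to 'us-ascii' here is an assumption of this sketch.
            word = word.decode(charset or "us-ascii", "replace")
        # A header with no encoded words arrives as a single (str, None)
        # pair, so the word is already text and needs no decoding.
        parts.append(word)
    return parts


# No encoded words: the original str is returned unchanged.
plain = email.header.decode_header("Hello world")

# One RFC 2047 encoded word: bytes plus the declared charset.
encoded = email.header.decode_header("=?utf-8?q?caf=C3=A9?=")
```

Checking the type of each word, rather than only whether the charset is None, is a design choice of this sketch so that it also covers headers that mix encoded and unencoded text.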

Cristian Gonzalez (cristiangsp) wrote :

Looks good!

review: Approve

Preview Diff

diff --git a/lib/lp/services/messages/model/message.py b/lib/lp/services/messages/model/message.py
index 05d8a5d..28d70b8 100644
--- a/lib/lp/services/messages/model/message.py
+++ b/lib/lp/services/messages/model/message.py
@@ -254,21 +254,19 @@ class MessageSet:
         # Re-encode the header parts using utf-8, replacing undecodable
         # characters with question marks.
         re_encoded_bits = []
-        for bytes, charset in bits:
-            if charset is None:
-                charset = 'us-ascii'
+        for word, charset in bits:
             # 2008-09-26 gary:
             # The RFC 2047 encoding names and the Python encoding names are
             # not always the same. A safer and more correct approach would use
-            #   bytes.decode(email.charset.Charset(charset).input_codec,
-            #                'replace')
+            #   word.decode(email.charset.Charset(charset).input_codec,
+            #               'replace')
             # or similar, rather than
-            #   bytes.decode(charset, 'replace')
+            #   word.decode(charset, 'replace')
             # That said, this has not bitten us so far, and is only likely to
             # cause problems in unusual encodings that we are hopefully
             # unlikely to encounter in this part of the code.
-            re_encoded_bits.append(
-                (self.decode(bytes, charset).encode('utf-8'), 'utf-8'))
+            decoded = word if charset is None else self.decode(word, charset)
+            re_encoded_bits.append((decoded.encode('utf-8'), 'utf-8'))
 
         return six.text_type(email.header.make_header(re_encoded_bits))
 
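
The comment in the diff mentions that decoding via email.charset.Charset(charset).input_codec would be safer than passing the RFC 2047 charset name straight to decode. A rough sketch of that idea follows. It is not part of this merge proposal; decode_word is a hypothetical name, and the fallbacks to 'us-ascii' are assumptions of the sketch.

```python
import email.charset


def decode_word(word, charset):
    """Decode one header word using the codec that email.charset maps to.

    Hypothetical helper for illustration; not part of the merged change.
    """
    if isinstance(word, str):
        # Already text, as for headers with no encoded words on Python 3.
        return word
    if charset is None:
        # Assumption: treat words without a declared charset as us-ascii.
        charset = "us-ascii"
    # input_codec can be None for some charsets (for example us-ascii), in
    # which case the charset name itself is used as the codec name.
    codec = email.charset.Charset(charset).input_codec or charset
    try:
        return word.decode(codec, "replace")
    except LookupError:
        # Unknown codec name: fall back to us-ascii with replacement
        # characters (an assumption of this sketch).
        return word.decode("us-ascii", "replace")
```

For example, decode_word(b"caf\xe9", "latin_1") resolves the latin_1 alias to the iso-8859-1 codec before decoding.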