Code review comment for lp:~brian-murray/apport/bug-1016380

Martin Pitt (pitti) wrote :

I added tests for this bug.

Why do you need to encode the regexp into UTF-8 when you already decode the UTF-8 string? I think the re and the text both have to have the same type (both str or both bytes).
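To illustrate the type mismatch, here is a minimal Python 3 sketch. The sample text and pattern are made up for illustration and are not taken from the branch; it only shows that a bytes pattern applied to a decoded str is rejected by the re module.

```python
import re

# Hypothetical sample data, not from the branch under review.
text = b"Package: apport".decode("UTF-8")        # decoded value: str
pattern = re.compile("apport".encode("UTF-8"))   # encoded regexp: bytes

try:
    pattern.search(text)
    mixed_ok = True
except TypeError:
    # Python 3 refuses to apply a bytes pattern to a str
    mixed_ok = False
```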

We also need to handle the case where the bytes object is not UTF-8, otherwise the .decode('UTF-8') will crash with a UnicodeDecodeError.
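A short sketch of that failure mode, assuming a made-up Latin-1 encoded value (the byte string is illustrative only):

```python
# Hypothetical value: "café" encoded as Latin-1, which is not valid UTF-8.
raw = b"caf\xe9"

try:
    raw.decode("UTF-8")
    decoded = True
except UnicodeDecodeError:
    # The lone 0xE9 byte cannot be decoded as UTF-8
    decoded = False
```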

So I think it's better to use re in bytes mode when the value is in bytes, as this is more efficient (avoids decoding), more robust (avoids UnicodeDecodeErrors), and also allows us to match against binary values.
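Roughly what I have in mind, as a sketch only: `value_matches` is a hypothetical helper name, not an existing apport function, and it assumes the regexp is supplied as a str. The pattern is converted to bytes only when the value is bytes, and the value itself is never decoded.

```python
import re


def value_matches(pattern, value):
    """Return True if the str regexp `pattern` matches `value`.

    `value` may be str or bytes. For bytes, the pattern is encoded so
    that re runs in bytes mode; the value is never decoded, so non-UTF-8
    and binary data cannot raise UnicodeDecodeError.
    """
    if isinstance(value, bytes):
        pattern = pattern.encode("UTF-8")
    return re.search(pattern, value) is not None
```

For example, `value_matches("f.o", b"\xff\xfefoo\x00")` searches the binary value directly, while `value_matches("foo.*bar", "foo bar")` takes the ordinary str path.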

Thanks for the initial fix!

review: Needs Fixing
