This is potentially buggy:
>>> print u'foo\xce'.encode('utf8').encode('utf8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 3: ordinal not in range(128)
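Do we know that the thing being encoded is always unicode? If so, the
patch is fine. If it /might be/ bytes already, re-encoding would go
boom: in Python 2, calling .encode() on a byte string first decodes it
implicitly as ASCII, which fails on any non-ASCII byte.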
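If we can't guarantee the input type, one option is to encode only when
the value is actually text. A minimal sketch, assuming a hypothetical
helper named to_utf8 (not part of the patch); text_type is just a
stand-in for unicode on Python 2 and str on Python 3:

```python
import sys

# Assumption: text_type is a stand-in name, not something from the patch.
# It resolves to `unicode` on Python 2 and `str` on Python 3.
text_type = str if sys.version_info[0] >= 3 else unicode  # noqa: F821


def to_utf8(value):
    """Hypothetical helper: return UTF-8 bytes, encoding only if needed."""
    if isinstance(value, text_type):
        return value.encode('utf8')
    # Already a byte string: pass it through untouched rather than
    # triggering the implicit ASCII decode shown above.
    return value
```

With this, to_utf8(to_utf8(u'foo\xce')) returns the same bytes as a
single call instead of raising. It does assume that any bytes passed in
are already UTF-8, which would need checking against the callers.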