Comment 5 for bug 896959

Robert Collins (lifeless) wrote :

def encode_cstring(value):
    if isinstance(value, unicode):
        value = value.encode("utf8")
    return value + "\x00"

This is clearly unsafe. Most invalid keys do raise an error, e.g. a TypeError for an integer key:
>>> bson.encode({1:0})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'encode'
>>> bson.dumps({1:0})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/robertc/source/launchpad/oops-datedir-repo/working/eggs/bson-0.3.2-py2.6.egg/bson/__init__.py", line 69, in dumps
    return encode_document(obj, [], generator_func = generator)
  File "/home/robertc/source/launchpad/oops-datedir-repo/working/eggs/bson-0.3.2-py2.6.egg/bson/codec.py", line 207, in encode_document
    encode_value(name, value, buf, traversal_stack, generator_func)
  File "/home/robertc/source/launchpad/oops-datedir-repo/working/eggs/bson-0.3.2-py2.6.egg/bson/codec.py", line 193, in encode_value
    buf.write(encode_int32_element(name, value))
  File "/home/robertc/source/launchpad/oops-datedir-repo/working/eggs/bson-0.3.2-py2.6.egg/bson/codec.py", line 310, in encode_int32_element
    return "\x10" + encode_cstring(name) + struct.pack("<i", value)
  File "/home/robertc/source/launchpad/oops-datedir-repo/working/eggs/bson-0.3.2-py2.6.egg/bson/codec.py", line 111, in encode_cstring
    return value + "\x00"
TypeError: unsupported operand type(s) for +: 'int' and 'str'

But a bytestring is assumed to be UTF-8 already, without any check, which is quite clearly not the case here. Something like this would be safer:
def encode_cstring(value):
    if isinstance(value, unicode):
        value = value.encode("utf8")
    elif isinstance(value, str):
        # Check that value is valid UTF-8; raises UnicodeDecodeError if not.
        value.decode('utf8')
    else:
        # Wrap value in a tuple so that tuple keys format correctly.
        raise TypeError('Invalid type for cstring %r' % (value,))
    return value + "\x00"
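
For anyone trying this on Python 3, here is a rough sketch of the same check. It is an adaptation, not the bson package's code: it assumes that str takes the place of unicode and bytes takes the place of str, and the sample inputs at the bottom are made up for illustration.

```python
def encode_cstring(value):
    """Python 3 sketch of the validating encode_cstring proposed above."""
    if isinstance(value, str):
        # Text is encoded to UTF-8.
        value = value.encode("utf8")
    elif isinstance(value, bytes):
        # Bytes are accepted only if they already decode as UTF-8;
        # invalid input raises UnicodeDecodeError.
        value.decode("utf8")
    else:
        # Anything else (ints, None, ...) is rejected up front.
        raise TypeError("Invalid type for cstring %r" % (value,))
    return value + b"\x00"


# Illustrative inputs (hypothetical, not taken from the bug report).
print(encode_cstring("caf\xe9"))   # text key
print(encode_cstring(b"key"))      # bytes key that is valid UTF-8
for bad in (1, b"\xff\xfe"):
    try:
        encode_cstring(bad)
    except (TypeError, UnicodeDecodeError) as exc:
        print(type(exc).__name__)
```

With this, a non-string key fails with an explicit TypeError at the type check rather than at the concatenation, and a bytestring that is not UTF-8 is rejected instead of being written out as-is.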