Merge lp:~allenap/storm/json-variable-unicode-bug-846867 into lp:storm

Proposed by Gavin Panella
Status: Merged
Approved by: Stuart Bishop
Approved revision: 406
Merged at revision: 403
Proposed branch: lp:~allenap/storm/json-variable-unicode-bug-846867
Merge into: lp:storm
Diff against target: 87 lines (+30/-5)
3 files modified
storm/variables.py (+13/-1)
tests/variables.py (+16/-3)
tests/zope/zstorm.py (+1/-1)
To merge this branch: bzr merge lp:~allenap/storm/json-variable-unicode-bug-846867
Reviewer Review Type Date Requested Status
Stuart Bishop (community) Approve
Jamu Kakar (community) Approve
Review via email: mp+75129@code.launchpad.net

Commit message

JSON properties now must back onto a unicode text column rather than a byte column.

Description of the change

Bug 846867 and bug 845904 have the background to this change. In essence, this ensures that JSONVariable always produces unicode strings at the right points, so that the JSON property can/should be used with a text column in the database rather than a byte column.

It also fixes a tiny bug in zstorm.py where tests are not skipped when the transaction module is not available but zope.component is.

To post a comment you must log in.
Revision history for this message
Jamu Kakar (jkakar) wrote :

Nice one, +1!

review: Approve
Revision history for this message
Stuart Bishop (stub) wrote :

As discussed, we should drop the _loads() change if it isn't absolutely necessary. If we are getting byte strings passed into _loads(), we need to know where they are coming from to determine the correct encoding rather than assume UTF-8. Assuming UTF-8 could cause data loss in some situations. I think that with the _dumps() change we will never see byte strings in _loads(), so replacing the str -> unicode decode with an assert seems appropriate.

review: Needs Fixing
405. By Gavin Panella

Complain loudly in JSONVariable._loads() when given a byte string, instead of trying to guess the encoding.

406. By Gavin Panella

Switch from assert to TypeError to be consistent with other parts of Storm.

Revision history for this message
Gavin Panella (allenap) wrote :

I've changed _loads() to raise a TypeError (suggested by Jamu as more
consistent with the rest of Storm than an assert).

Revision history for this message
Stuart Bishop (stub) :
review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'storm/variables.py'
2--- storm/variables.py 2011-09-07 14:04:04 +0000
3+++ storm/variables.py 2011-09-13 10:28:34 +0000
4@@ -633,10 +633,22 @@
5 super(JSONVariable, self).__init__(*args, **kwargs)
6
7 def _loads(self, value):
8+ if not isinstance(value, unicode):
9+ raise TypeError(
10+ "Cannot safely assume encoding of byte string %r." % value)
11 return json.loads(value)
12
13 def _dumps(self, value):
14- return json.dumps(value)
15+ # http://www.ietf.org/rfc/rfc4627.txt states that JSON is text-based
16+ # and so we treat it as such here. In other words, this method returns
17+ # unicode and never str.
18+ dump = json.dumps(value, ensure_ascii=False)
19+ if not isinstance(dump, unicode):
20+ # json.dumps() does not always return unicode. See
21+ # http://code.google.com/p/simplejson/issues/detail?id=40 for one
22+ # of many discussions of str/unicode handling in simplejson.
23+ dump = dump.decode("utf-8")
24+ return dump
25
26
27 class ListVariable(MutableValueVariable):
28
29=== modified file 'tests/variables.py'
30--- tests/variables.py 2011-09-07 14:04:04 +0000
31+++ tests/variables.py 2011-09-13 10:28:34 +0000
32@@ -833,7 +833,7 @@
33
34 def test_get_set(self):
35 d = {"a": 1}
36- d_dump = self.encoding.dumps(d, -1)
37+ d_dump = self.encode(d)
38
39 variable = self.variable_type()
40
41@@ -893,18 +893,31 @@
42
43 class PickleVariableTest(EncodedValueVariableTestMixin, TestHelper):
44
45- encoding = pickle
46+ encode = staticmethod(lambda data: pickle.dumps(data, -1))
47 variable_type = PickleVariable
48
49
50 class JSONVariableTest(EncodedValueVariableTestMixin, TestHelper):
51
52- encoding = json
53+ encode = staticmethod(lambda data: json.dumps(data).decode("utf-8"))
54 variable_type = JSONVariable
55
56 def is_supported(self):
57 return json is not None
58
59+ def test_unicode_from_db_required(self):
60+ # JSONVariable._loads() complains loudly if it does not receive a
61+ # unicode string because it has no way of knowing its encoding.
62+ variable = self.variable_type()
63+ self.assertRaises(TypeError, variable.set, '"abc"', from_db=True)
64+
65+ def test_unicode_to_db(self):
66+ # JSONVariable._dumps() works around unicode/str handling issues in
67+ # simplejson/json.
68+ variable = self.variable_type()
69+ variable.set({u"a": 1})
70+ self.assertIsInstance(variable.get(to_db=True), unicode)
71+
72
73 class ListVariableTest(TestHelper):
74
75
76=== modified file 'tests/zope/zstorm.py'
77--- tests/zope/zstorm.py 2010-07-01 17:47:01 +0000
78+++ tests/zope/zstorm.py 2011-09-13 10:28:34 +0000
79@@ -266,7 +266,7 @@
80 class ZStormUtilityTest(TestHelper):
81
82 def is_supported(self):
83- return has_zope_component
84+ return has_transaction and has_zope_component
85
86 def test_utility(self):
87 provideUtility(ZStorm())

Subscribers

People subscribed via source and target branches

to status/vote changes: