Merge lp:~leonardr/lazr.restful/multi-part-etag into lp:lazr.restful

Proposed by Leonard Richardson
Status: Merged
Approved by: Graham Binns
Approved revision: 124
Merged at revision: not available
Proposed branch: lp:~leonardr/lazr.restful/multi-part-etag
Merge into: lp:lazr.restful
Diff against target: 362 lines (+222/-29)
3 files modified
src/lazr/restful/_resource.py (+75/-28)
src/lazr/restful/example/base/tests/entry.txt (+146/-0)
src/lazr/restful/version.txt (+1/-1)
To merge this branch: bzr merge lp:~leonardr/lazr.restful/multi-part-etag
Reviewer Review Type Date Requested Status
Graham Binns (community) code Approve
Review via email: mp+21398@code.launchpad.net

Description of the change

OK, this branch will hopefully go a long way towards avoiding the 412 errors that results in bugs like bug 336866 and bug 534066. The solution is as described here: https://bugs.edge.launchpad.net/lazr.restful/+bug/336866/comments/41

An entry resource's ETag now has two components: a component describing the state of the read-only fields, and a component describing the state of the read-write fields as well as global information like the current code revision. Conditional GET requests still check the entire ETag, but conditional PUT and PATCH requests only check the read-write portion. This way, server-side changes to read-only fields don't cause PATCH requests to fail. A PATCH request should only fail if there's some possibility that _some other user_ made a conflicting change.

So, let's say a bug's Etag is "foo-bar". If you make a GET request with an If-None-Match of "foo-bar", you'll get 304 Not Modified. If you make a PATCH request with an If-Match of "foo-bar" your request will succeed.

Now let's say some behind-the-scenes cron job runs, and the bug's bug_heat changes. Now the bug's ETag is "baz-bar". Only the first part changed, because bug_heat is not writable. Now, if you make a GET request with an If-None-Match of "foo-bar", you'll get 200 Ok and a new representation with the new value of bug_heat. The request condition will fail. But if you make a PATCH request with an If-Match of "foo-bar", the condition will succeed and your PATCH will go through. After your PATCH goes through, the bug's ETag will be "baz-quux", and a PATCH with an If-Match of "foo-bar" or "baz-bar" will fail.

Hopefully the doctests will make it clear how this system works.

To post a comment you must log in.
Revision history for this message
Graham Binns (gmb) :
review: Approve (code)

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'src/lazr/restful/_resource.py'
2--- src/lazr/restful/_resource.py 2010-02-18 15:02:10 +0000
3+++ src/lazr/restful/_resource.py 2010-03-15 20:24:26 +0000
4@@ -276,9 +276,11 @@
5 """Handle a possible conditional PUT or PATCH request.
6
7 This method has side effects. If the write operation should
8- not continue, because the incoming ETag doesn't match the
9- generated ETag, it sets the response code to 412
10- ("Precondition Failed").
11+ not continue, because the read-write portion of the incoming
12+ ETag doesn't match the read-write portion of the generated
13+ ETag, it sets the response code to 412 ("Precondition
14+ Failed"). It's okay for the read-only portion of the incoming
15+ ETag not to match.
16
17 If the PUT or PATCH request is being tunneled through POST
18 with X-HTTP-Method-Override, the media type of the incoming
19@@ -307,7 +309,7 @@
20 # This is not a conditional write.
21 return media_type
22 existing_etag = self.getETag(media_type)
23- if existing_etag in incoming_etags:
24+ if self._etagMatchesForWrite(existing_etag, incoming_etags):
25 # The conditional write can continue.
26 return media_type
27 # The resource has changed since the client requested it.
28@@ -316,10 +318,17 @@
29 self.request.response.setStatus(412)
30 return None
31
32- def _getETagCore(self, cache=None):
33+ def _etagMatchesForWrite(self, existing_etag, incoming_etags):
34+ """Perform an ETag match to see if conditional write is OK.
35+
36+ By default, we match the full ETag strings against each other.
37+ """
38+ return existing_etag in incoming_etags
39+
40+ def _getETagCores(self, cache=None):
41 """Calculate the core ETag for a representation.
42
43- :return: a string that will be used to calculate the full
44+ :return: a list of string that will be used to calculate the full
45 ETag. If None is returned, no ETag at all will be
46 calculated for the given resource and media type.
47 """
48@@ -346,29 +355,32 @@
49 # The WADL representation of a resource only changes when
50 # the software itself changes. Thus, we don't need any
51 # special information for its ETag core.
52- etag_core = ''
53+ etag_cores = ['']
54 else:
55 # For other media types, we calculate the ETag core by
56 # delegating to the subclass.
57- etag_core = self._getETagCore(cache)
58+ etag_cores = self._getETagCores(cache)
59
60- if etag_core is None:
61+ if etag_cores is None:
62 return None
63
64- hash_object = sha_constructor()
65- hash_object.update(etag_core)
66+ core_hashes = []
67+ for core in etag_cores:
68+ hash_object = sha_constructor()
69+ hash_object.update(core)
70+ core_hashes.append(hash_object)
71
72 # Append the media type, so that web browsers won't treat
73 # different representations of a resource interchangeably.
74- hash_object.update("\0" + media_type)
75+ core_hashes[-1].update("\0" + media_type)
76
77 # Append the revision number, because the algorithm for
78 # generating the representation might itself change across
79 # versions.
80 revno = getUtility(IWebServiceConfiguration).code_revision
81- hash_object.update("\0" + revno)
82+ core_hashes[-1].update("\0" + revno)
83
84- etag = '"%s"' % hash_object.hexdigest()
85+ etag = '"%s"' % "-".join([core.hexdigest() for core in core_hashes])
86 self.etags_by_media_type[media_type] = etag
87 return etag
88
89@@ -1200,14 +1212,14 @@
90
91 do_PATCH = do_PUT
92
93- def _getETagCore(self, unmarshalled_field_values=None):
94+ def _getETagCores(self, unmarshalled_field_values=None):
95 """Calculate the ETag for an entry field.
96
97 The core of the ETag is the field value itself.
98 """
99 name, value = self._unmarshallField(
100 self.context.name, self.context.field)
101- return str(value)
102+ return [str(value)]
103
104 def _representation(self, media_type):
105 """Create a representation of the field value."""
106@@ -1265,23 +1277,58 @@
107 super(EntryResource, self).__init__(context, request)
108 self.entry = getMultiAdapter((context, request), IEntry)
109
110- def _getETagCore(self, unmarshalled_field_values=None):
111+ def _getETagCores(self, unmarshalled_field_values=None):
112 """Calculate the ETag for an entry.
113
114 :arg unmarshalled_field_values: A dict mapping field names to
115 unmarshalled values, obtained during some other operation such
116 as the construction of a representation.
117 """
118- values = []
119+ unwritable_values = []
120+ writable_values = []
121 for name, field in getFieldsInOrder(self.entry.schema):
122- if self.isModifiableField(field, False):
123- if (unmarshalled_field_values is not None
124- and unmarshalled_field_values.get(name)):
125- value = unmarshalled_field_values[name]
126- else:
127- ignored, value = self._unmarshallField(name, field)
128- values.append(decode_value(value))
129- return "\0".join(values).encode("utf-8")
130+ if self.isModifiableField(field, True):
131+ # The client can write to this value.
132+ bucket = writable_values
133+ elif self.isModifiableField(field, False):
134+ # The client can't write this value, but it still
135+ # might change.
136+ bucket = unwritable_values
137+ else:
138+ # This value can never change, and as such does not
139+ # need to be included in the ETag.
140+ continue
141+ if (unmarshalled_field_values is not None
142+ and unmarshalled_field_values.get(name)):
143+ value = unmarshalled_field_values[name]
144+ else:
145+ ignored, value = self._unmarshallField(name, field)
146+ bucket.append(decode_value(value))
147+
148+ unwritable = "\0".join(unwritable_values).encode("utf-8")
149+ writable = "\0".join(writable_values).encode("utf-8")
150+ return [unwritable, writable]
151+
152+
153+ def _etagMatchesForWrite(self, existing_etag, incoming_etags):
154+ """Make sure no other client has modified this resource.
155+
156+ For a conditional read to succeed, the entire ETag must
157+ match. But for a conditional write to succeed, only the second
158+ half of the ETag must match. This prevents spurious 412 errors
159+ on conditional writes where the only fields that changed are
160+ read-only fields that can't possibly cause a conflict.
161+ """
162+ existing_write_portion = existing_etag.split('-', 1)[-1]
163+ for etag in incoming_etags:
164+ if '-' in etag:
165+ incoming_write_portion = etag.rsplit('-', 1)[-1]
166+ else:
167+ incoming_write_portion = etag
168+ if existing_write_portion == incoming_write_portion:
169+ return True
170+ return False
171+
172
173 def toDataForJSON(self):
174 """Turn the object into a simple data structure."""
175@@ -1552,14 +1599,14 @@
176 """Fetch the current web service request."""
177 return get_current_web_service_request()
178
179- def _getETagCore(self, cache=None):
180+ def _getETagCores(self, cache=None):
181 """Calculate an ETag for a representation of this resource.
182
183 The service root resource changes only when the software
184 itself changes. This information goes into the ETag already,
185 so there's no need to provide anything.
186 """
187- return ''
188+ return ['']
189
190 def __call__(self, REQUEST=None):
191 """Handle a GET request."""
192
193=== modified file 'src/lazr/restful/example/base/tests/entry.txt'
194--- src/lazr/restful/example/base/tests/entry.txt 2010-01-25 20:56:57 +0000
195+++ src/lazr/restful/example/base/tests/entry.txt 2010-03-15 20:24:26 +0000
196@@ -543,6 +543,151 @@
197 HTTP/1.1 412 Precondition Failed
198 ...
199
200+Conditional writes are a little more complicated
201+************************************************
202+
203+OK, but consider the 'copyright_date' field of a cookbook. This is a
204+read-only field; the client can't change it. But what if it changes on
205+the server-side? What happens to the ETag then? Does it change,
206+causing a conditional PATCH to fail, even if the PATCH doesn't touch
207+that read-only field?
208+
209+ >>> greens = webservice.get(greens_url).jsonBody()
210+ >>> print greens['copyright_date']
211+ 2003-01-01
212+ >>> etag_before_server_modification = greens['http_etag']
213+
214+ >>> from lazr.restful.example.base.root import C4
215+ >>> greens_object = C4
216+ >>> old_date = greens_object.copyright_date
217+ >>> old_date
218+ datetime.date(2003, 1, 1)
219+
220+Let's change the server-side value and find out.
221+
222+ >>> import datetime
223+ >>> greens_object.copyright_date = datetime.date(2005, 12, 12)
224+
225+ >>> new_greens = webservice.get(greens_url).jsonBody()
226+ >>> print new_greens['copyright_date']
227+ 2005-12-12
228+ >>> etag_after_server_modification = new_greens['http_etag']
229+
230+The ETag has indeed changed.
231+
232+ >>> etag_before_server_modification == etag_after_server_modification
233+ False
234+
235+So if we try to modify the cookbook using the old ETag, it should
236+fail, right?
237+
238+ >>> body = {'description' : 'New description.'}
239+ >>> print modify_cookbook('Everyday Greens', body, 'PATCH',
240+ ... {'If-Match' : etag_before_server_modification})
241+ HTTP/1.1 209 Content Returned
242+ ...
243+
244+Actually, it succeeds! How does that work? Well, the ETag consists of
245+two parts, separated by a dash.
246+
247+ >>> read_before, write_before = etag_before_server_modification.split('-')
248+ >>> read_after, write_after = etag_after_server_modification.split('-')
249+
250+The first part of the ETag only changes when a field the client can't
251+modify changes. This is the part of the ETag that changed when
252+copyright_date changed.
253+
254+ >>> read_before == read_after
255+ False
256+
257+The second part only changes when a field changes that the client can
258+modify. This part of the ETag hasn't changed.
259+
260+ >>> write_before == write_after
261+ True
262+
263+When you make a conditional write, the second part of the ETag you
264+provide is checked against the second part of the ETag generated on
265+the server. The first part of the ETag is ignored.
266+
267+The point of checking the ETag is to avoid conflicts where two clients
268+modify the same resource. But no client can modify any of those
269+read-only fields, so changes to them don't matter for purposes of
270+avoiding conflicts.
271+
272+If you hack the ETag to something that's not "two parts, separated by
273+a dash", lazr.restful will still handle it. (Of course, since that
274+ETag will never match anything, you'll always get a 412 error.)
275+
276+ >>> body = {'description' : 'New description.'}
277+ >>> print modify_cookbook('Everyday Greens', body, 'PATCH',
278+ ... {'If-Match' : "Weird etag"})
279+ HTTP/1.1 412 Precondition Failed
280+ ...
281+
282+Conditional PUT fails where a PATCH would succeed, but not because
283+lazr.restful rejects an old ETag. To verify this, let's change the
284+cookbook's copyright_date again, behind the scenes.
285+
286+ >>> greens = webservice.get(greens_url).jsonBody()
287+ >>> old_etag = greens['http_etag']
288+ >>> greens_object.copyright_date = datetime.date(2005, 11, 11)
289+
290+A totally bogus ETag fails with a 412 error.
291+
292+ >>> greens['description'] = 'Another new description'
293+ >>> print modify_cookbook('Everyday Greens', greens, 'PUT',
294+ ... {'If-Match' : "Not the old ETag"})
295+ HTTP/1.1 412 Precondition Failed
296+ ...
297+
298+When we use the original ETag, we don't cause a 412 error, but the PUT
299+request fails anyway.
300+
301+ >>> print modify_cookbook('Everyday Greens', greens, 'PUT',
302+ ... {'If-Match' : old_etag})
303+ HTTP/1.1 400 Bad Request
304+ ...
305+ http_etag: You tried to modify a read-only attribute.
306+ copyright_date: You tried to modify a read-only attribute.
307+
308+Rather, it's because a PUT request includes a representation of the
309+entire resource, and lazr.restful thinks the client is trying to
310+modify the fields that changed behind the scenes--in this case,
311+copyright_date and http_etag.
312+
313+Conditional reads are *not* more complicated
314+********************************************
315+
316+The point of the two-part ETag is to avoid spurious 412 errors when
317+doing conditional writes. When making a conditional _read_ request,
318+the condition will fail if _any_ part of the ETag is different.
319+
320+ >>> new_etag = webservice.get(greens_url).jsonBody()['http_etag']
321+
322+ >>> print webservice.get(
323+ ... greens_url,
324+ ... headers={'If-None-Match': new_etag})
325+ HTTP/1.1 304 Not Modified
326+ ...
327+
328+ >>> print webservice.get(
329+ ... greens_url,
330+ ... headers={'If-None-Match': "changed" + new_etag})
331+ HTTP/1.1 200 Ok
332+ ...
333+
334+lazr.restful checks the entire ETag on conditional GET because the
335+purpose of a conditional read is to avoid getting data that hasn't
336+changed. A server-side change to a read-only field like copyright_date
337+doesn't affect future writes, but it _is_ a change to the
338+representation.
339+
340+A bit of cleanup: restore the old value for the cookbook's
341+copyright_date.
342+
343+ >>> greens_object.copyright_date = old_date
344+
345 Changing object relationships
346 =============================
347
348@@ -621,6 +766,7 @@
349 The tests that follow make a number of PATCH requests that include
350 values for a cookbook's 'copyright_date' attribute.
351
352+ >>> greens = webservice.get(greens_url).jsonBody()
353 >>> greens['copyright_date']
354 u'2003-01-01'
355
356
357=== modified file 'src/lazr/restful/version.txt'
358--- src/lazr/restful/version.txt 2010-03-11 20:33:25 +0000
359+++ src/lazr/restful/version.txt 2010-03-15 20:24:26 +0000
360@@ -1,1 +1,1 @@
361-0.9.23
362+0.9.24

Subscribers

People subscribed via source and target branches