Merge lp:~leonardr/lazr.restful/binary-field-type into lp:lazr.restful

Proposed by Leonard Richardson
Status: Merged
Approved by: Francis J. Lacoste
Approved revision: 38
Merged at revision: not available
Proposed branch: lp:~leonardr/lazr.restful/binary-field-type
Merge into: lp:lazr.restful
Diff against target: None lines
To merge this branch: bzr merge lp:~leonardr/lazr.restful/binary-field-type
Reviewer Review Type Date Requested Status
Francis J. Lacoste (community) Approve
Review via email: mp+6056@code.launchpad.net
To post a comment you must log in.
Revision history for this message
Leonard Richardson (leonardr) wrote :

Bug: 353805

This branch changes the WADL generation so that the <param> tags for
Bytes fields have a type of 'binary'. Currently lazr.restfulclient
thinks these fields are strings, and runs data for them through
simplejson.dumps(). simplejson.dumps() can't handle binary data and
crashes. The upshot is that launchpadlib cannot invoke any named
operation that takes a binary argument.

Note that the type is not 'xsd:binary'. There is no such type. XSD
only defines 'base64binary' and 'hexbinary', because XSD was designed
to describe the contents of XML documents themeselves. What we're
trying to do is use XML to define the contents of a non-XML
document. I searched for a while to find other examples of this and
didn't find any. I've got an email in to Marc Hadley asking him if
he's seen anyone need this before.

The expected media type is still x-www-form-urlencoded. Encoding all
that binary data will not be good for performance, so we should do
some branches that make named operations that include binary inputs
use multipart/form-data. But at least we'll be able to invoke the
named operations.

(Alternatively, we could just use base64binary; it'd be simpler, and more efficient than url-encoding everything, though less efficient than multipart/form-data.)

The code is a two-line patch, and to avoid changing the example web
service I moved one test section down so that an existing binary
parameter would be in scope.

Revision history for this message
Francis J. Lacoste (flacoste) wrote :

On Thursday 30 April 2009, Leonard Richardson wrote:
>
> The expected media type is still x-www-form-urlencoded. Encoding all
> that binary data will not be good for performance, so we should do
> some branches that make named operations that include binary inputs
> use multipart/form-data. But at least we'll be able to invoke the
> named operations.
>
> (Alternatively, we could just use base64binary; it'd be simpler, and more
> efficient than url-encoding everything, though less efficient than
> multipart/form-data.)
>

 status approved
 review approve

We need a lazr.restfulclient branch anyway to use this information. On that
branch, I think it shouldn't be much more work to use 'multipart/form-data'
when a parameter is using 'binary'. The regular zope publisher will handle
everything server-side.

--
Francis J. Lacoste
<email address hidden>

review: Approve
Revision history for this message
Leonard Richardson (leonardr) wrote :

I request a re-review with this more complex and complete branch:

https://pastebin.canonical.com/17197/

This branch changes the generated WADL so that a named operation that includes binary fields is defined as accepting the media type multipart/form-data. This in turn will be picked up by wadllib. I also added a definition of the 'binary' type.

Named operations that do not include binary fields are still defined as accepting application/x-www-form-urlencoded. This ensures that those operations will still work with old versions of wadllib. And operations that take binary arguments are broken anyway, which is what I'm fixing.

Revision history for this message
Francis J. Lacoste (flacoste) wrote :

On Monday 04 May 2009, Leonard Richardson wrote:
> I request a re-review with this more complex and complete branch:
>
> https://pastebin.canonical.com/17197/
>
> This branch changes the generated WADL so that a named operation that
> includes binary fields is defined as accepting the media type
> multipart/form-data. This in turn will be picked up by wadllib. I also
> added a definition of the 'binary' type.
>
> Named operations that do not include binary fields are still defined as
> accepting application/x-www-form-urlencoded. This ensures that those
> operations will still work with old versions of wadllib. And operations
> that take binary arguments are broken anyway, which is what I'm fixing.

That all looks good. I have a minor comment below.

  status approved
  review approve

>+
>+Parameter data types
>+====================
>+
>+Operation and representation parameters are always transfered between
>+client and server as strings. But a parameter may have a logical data
>+type which indicates how to treat the string. Data types are specified
>+using the primitive types from the XML Schema standard.
>+
>+Here's a parameter of type 'date'.
>+
>+ >>> copyright_date = single_list_value(
>+ ... [param for param in create_representation
>+ ... if param.attrib.get('name') == 'copyright_date'])
>+ >>> copyright_date.attrib['type']
>+ 'xsd:date'
>+
>+ >>> cover = single_list_value(
>+ ... [param for param in full_rep
>+ ... if param.attrib.get('name') == 'cover_link'])
>+ >>> cover.attrib['type']
>+ 'binary'

Maybe add a sample sentence in front of that example?

--
Francis J. Lacoste
<email address hidden>

review: Approve
39. By Leonard Richardson

A more complete representation.

40. By Leonard Richardson

Response to feedback.

41. By Leonard Richardson

Properly implemented and tested change_cover.

Revision history for this message
Leonard Richardson (leonardr) wrote :

Embarassingly, I need a follow-up reviewed.

https://pastebin.canonical.com/17304/

Basically, I didn't test or even implement the 'replace_cover' operation, because all I cared about was that it showed up in the WADL. But my lazr.restfulclient branch needs a working 'replace_cover' operation to be properly testable, so I've come back and implemented it correctly on the server side.

The most embarassing part is that I put 'replace_cover' on the wrong interface! It should be on ICookbook, and I put it on ICookbookSet.

The _makeMultipart method is based on code taken from wadllib. I don't have a good way to refactor the code, but it is fairly different from the wadllib code and it's not much code.

Revision history for this message
Francis J. Lacoste (flacoste) wrote :

On Thursday 07 May 2009, Leonard Richardson wrote:
> Embarassingly, I need a follow-up reviewed.
>
> https://pastebin.canonical.com/17304/
>
> Basically, I didn't test or even implement the 'replace_cover' operation,
> because all I cared about was that it showed up in the WADL. But my
> lazr.restfulclient branch needs a working 'replace_cover' operation to be
> properly testable, so I've come back and implemented it correctly on the
> server side.
>
> The most embarassing part is that I put 'replace_cover' on the wrong
> interface! It should be on ICookbook, and I put it on ICookbookSet.
>
> The _makeMultipart method is based on code taken from wadllib. I don't have
> a good way to refactor the code, but it is fairly different from the
> wadllib code and it's not much code.
>
>

I don't think the _makeMultipart is working nor necessary. It's useful for
lazr.restfulclient to use multipart encoding for efficiency reason. There is
no reason why it's needed in the webservice testing infrastructure. For
example, I'm pretty sure you could use a regular named_post() call with the
binary data here. My understanding is that your implementation anyway uses
form form-encoding throughout.

>=== modified file 'src/lazr/restful/testing/webservice.py'

>
>+ def _makeMultipart(self, operation_name, args):
>+ args['ws.op'] = operation_name
>+ outer = MIMEMultipart()
>+ outer.set_type('multipart/form-data')
>+ for name, value in args.items():
>+ try:
>+ # See if it's a text field.
>+ value.decode('utf-8')
>+ maintype, subtype = 'text', 'plain'
>+ params = {'charset' : 'utf-8'}
>+ except UnicodeDecodeError:
>+ # It's a binary field
>+ maintype, subtype = 'application', 'octet-stream'
>+ params = {}

I don't think the exception is raised in your actual example. The above logic
is flawed for several reasons:

- you shouldn't attempt that on strings that are already unicode.
- u"\x01\0x2".decode('utf-8') -> '\x01\x02' (You'd have to use binary string
with the 8th bit set to trigger a UnicodeDecodeError)
- You cannot assume that the system encoding is UTF-8, so it would be possible
to pass a valid text string "ĂȘtre" that blows up here and shouldn't be sent as
binary anyway.

The only benefit of getting this right would be to have a kind of integration
test in lazr.restful that shows that submitting binary data through
multipart/form-data is working. Not sure it's worth it. We have that
integration tests in lazr.restfulclient I think anyway. So I'm fine for you to
merge if you remove it.

--
Francis J. Lacoste
<email address hidden>

42. By Leonard Richardson

Removed multipart document generation.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'src/lazr/restful/example/tests/wadl.txt'
2--- src/lazr/restful/example/tests/wadl.txt 2009-04-22 17:56:52 +0000
3+++ src/lazr/restful/example/tests/wadl.txt 2009-04-30 16:07:31 +0000
4@@ -636,50 +636,6 @@
5 >>> find_response_representation.attrib['href']
6 'http://...#recipe-page'
7
8---------------------
9-Parameter data types
10---------------------
11-
12-Operation parameters are always transfered between client and server
13-as strings. But a parameter may have a logical data type which
14-indicates how to treat the string. Data types are specified using the
15-primitive types from the XML Schema standard.
16-
17-Here's a parameter of type 'date'.
18-
19- >>> copyright_date = single_list_value(
20- ... [param for param in create_representation
21- ... if param.attrib.get('name') == 'copyright_date'])
22- >>> copyright_date.attrib['type']
23- 'xsd:date'
24-
25-If no data type is provided, the WADL standard says to default to
26-"xsd:string". This is the case for the vast majority of parameters.
27-
28- >>> cuisine = single_list_value(
29- ... [param for param in create_representation
30- ... if param.attrib.get('name') == 'cuisine'])
31-
32- >>> 'type' in cuisine.attrib
33- False
34-
35-If a parameter represents a choice among several options, it might be
36-described with a list of valid values. The "cuisine" parameter is a
37-choice among several preset values, and the WADL describes those
38-values.
39-
40- >>> values = cuisine[1:]
41- >>> sorted([value.attrib['value'] for value in values])
42- ['American', ... 'Vegetarian']
43-
44-Here's the same field for a named operation.
45-
46- >>> doc, request, response = list(find_for_cuisine)
47- >>> ws_op, cuisine = list(request)
48- >>> values = cuisine[1:]
49- >>> sorted([value.attrib['value'] for value in values])
50- ['American', ... 'Vegetarian']
51-
52 ======================
53 An entry resource type
54 ======================
55@@ -882,6 +838,56 @@
56 >>> sorted(status.attrib['value'] for status in cuisine_options[1:])
57 ['American', ... 'General', 'Vegetarian']
58
59+
60+Parameter data types
61+====================
62+
63+Operation and representation parameters are always transfered between
64+client and server as strings. But a parameter may have a logical data
65+type which indicates how to treat the string. Data types are specified
66+using the primitive types from the XML Schema standard.
67+
68+Here's a parameter of type 'date'.
69+
70+ >>> copyright_date = single_list_value(
71+ ... [param for param in create_representation
72+ ... if param.attrib.get('name') == 'copyright_date'])
73+ >>> copyright_date.attrib['type']
74+ 'xsd:date'
75+
76+ >>> cover = single_list_value(
77+ ... [param for param in full_rep
78+ ... if param.attrib.get('name') == 'cover_link'])
79+ >>> cover.attrib['type']
80+ 'binary'
81+
82+If no data type is provided, the WADL standard says to default to
83+"xsd:string". This is the case for the vast majority of parameters.
84+
85+ >>> cuisine = single_list_value(
86+ ... [param for param in create_representation
87+ ... if param.attrib.get('name') == 'cuisine'])
88+
89+ >>> 'type' in cuisine.attrib
90+ False
91+
92+If a parameter represents a choice among several options, it might be
93+described with a list of valid values. The "cuisine" parameter is a
94+choice among several preset values, and the WADL describes those
95+values.
96+
97+ >>> values = cuisine[1:]
98+ >>> sorted([value.attrib['value'] for value in values])
99+ ['American', ... 'Vegetarian']
100+
101+Here's the same field for a named operation.
102+
103+ >>> doc, request, response = list(find_for_cuisine)
104+ >>> ws_op, cuisine = list(request)
105+ >>> values = cuisine[1:]
106+ >>> sorted([value.attrib['value'] for value in values])
107+ ['American', ... 'Vegetarian']
108+
109 The entry resource type itself
110 ==============================
111
112
113=== modified file 'src/lazr/restful/tales.py'
114--- src/lazr/restful/tales.py 2009-04-02 16:24:42 +0000
115+++ src/lazr/restful/tales.py 2009-04-30 16:07:31 +0000
116@@ -451,6 +451,8 @@
117 return 'xsd:dateTime'
118 elif IDate.providedBy(self.field):
119 return 'xsd:date'
120+ elif IBytes.providedBy(self.field):
121+ return 'binary'
122 else:
123 return None
124

Subscribers

People subscribed via source and target branches