Merge lp:~paul-lucas/zorba/bug-1002867 into lp:zorba

Proposed by Paul J. Lucas
Status: Merged
Approved by: Paul J. Lucas
Approved revision: 10863
Merged at revision: 10865
Proposed branch: lp:~paul-lucas/zorba/bug-1002867
Merge into: lp:zorba
Diff against target: 52 lines (+17/-4)
1 file modified
modules/com/zorba-xquery/www/modules/http-client.xq.src/http_response_parser.cpp (+17/-4)
To merge this branch: bzr merge lp:~paul-lucas/zorba/bug-1002867
Reviewer Review Type Date Requested Status
David Graf (community) Approve
Matthias Brantner Approve
Review via email: mp+106855@code.launchpad.net

Commit message

No longer setting the charset of an overridden media type to a default value.

Description of the change

No longer setting the charset of an overridden media type to a default value.

To post a comment you must log in.
Revision history for this message
Matthias Brantner (matthias-brantner) wrote :

It seems your investigation makes sense. However, I think that the solution should be to not do any transcoding for any non-textual content (even if the user gives one).

review: Needs Fixing
Revision history for this message
Paul J. Lucas (paul-lucas) wrote :

If the content is non-textual, then they shouldn't be giving a charset="...". So I don't see the problem.

Revision history for this message
Matthias Brantner (matthias-brantner) wrote :

Does you change imply that the default if no charset is given isn't ISO-8859-1 anymore.

review: Needs Information
Revision history for this message
Paul J. Lucas (paul-lucas) wrote :

If no override-media-type is given, then the default charset is still ISO 8859-1 (see line 76 in http_response_parser.cpp).

However, if the user gives an override-media-type, then there is no charset unless the user explicitly includes a "; charset=..." as part of the override-media-type; hence, the bytes will pass through unchanged.

I can't think of anything else that's sensible to do.

Revision history for this message
Matthias Brantner (matthias-brantner) wrote :

> If no override-media-type is given, then the default charset is still ISO
> 8859-1 (see line 76 in http_response_parser.cpp).
>
> However, if the user gives an override-media-type, then there is no charset
> unless the user explicitly includes a "; charset=..." as part of the override-
> media-type; hence, the bytes will pass through unchanged.
>
> I can't think of anything else that's sensible to do.
If the user overrides the media-type, the behavior should be no different then if the server returned one. That is, if the media type is overwritten to be a text content-type, the default encoding ISO-8859-1 should still be used. Of course, unless otherwise given by the user.

lp:~paul-lucas/zorba/bug-1002867 updated
10863. By Paul J. Lucas

Now setting default of ISO 8859-1 only for text/ MIME types.

Revision history for this message
Matthias Brantner (matthias-brantner) :
review: Approve
Revision history for this message
David Graf (davidagraf) :
review: Approve
Revision history for this message
Zorba Build Bot (zorba-buildbot) wrote :
Revision history for this message
Zorba Build Bot (zorba-buildbot) wrote :

Validation queue job bug-1002867-2012-05-23T13-42-01.572Z is finished. The final status was:

All tests succeeded!

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'modules/com/zorba-xquery/www/modules/http-client.xq.src/http_response_parser.cpp'
2--- modules/com/zorba-xquery/www/modules/http-client.xq.src/http_response_parser.cpp 2012-05-10 04:12:09 +0000
3+++ modules/com/zorba-xquery/www/modules/http-client.xq.src/http_response_parser.cpp 2012-05-22 18:26:18 +0000
4@@ -13,6 +13,7 @@
5 * See the License for the specific language governing permissions and
6 * limitations under the License.
7 */
8+#include <cstring>
9 #include <string>
10 #include <sstream>
11 #include <assert.h>
12@@ -41,8 +42,20 @@
13 std::string::size_type pos = s.find( ';' );
14 *mime_type = s.substr( 0, pos );
15
16- // The HTTP/1.1 spec says that the default charset is ISO-8859-1.
17- *charset = "ISO-8859-1";
18+ if ( std::strncmp( mime_type->c_str(), "text/", 5 ) == 0 ) {
19+ //
20+ // RFC 2616: "Hypertext Transfer Protocol -- HTTP/1.1," section 3.7.1,
21+ // "Canonicalization and Text Defaults":
22+ //
23+ // The "charset" parameter is used with some media types to define the
24+ // character set (section 3.4) of the data. When no explicit charset
25+ // parameter is provided by the sender, media subtypes of the "text"
26+ // type are defined to have a default charset value of "ISO-8859-1" when
27+ // received via HTTP.
28+ //
29+ *charset = "ISO-8859-1";
30+ } else
31+ charset->clear();
32
33 if ( pos != std::string::npos ) {
34 //
35@@ -74,7 +87,6 @@
36 theHandler(aHandler),
37 theCurl(aCurl),
38 theErrorThrower(aErrorThrower),
39- theCurrentCharset("ISO-8859-1"), // HTTP/1.1 says this is the default
40 theStatus(-1),
41 theStreamBuffer(0),
42 theInsideRead(false),
43@@ -109,7 +121,8 @@
44
45 std::auto_ptr<std::istream> lStream;
46 try {
47- if ( transcode::is_necessary( theCurrentCharset.c_str() ) ) {
48+ if ( !theCurrentCharset.empty() &&
49+ transcode::is_necessary( theCurrentCharset.c_str() ) ) {
50 lStream.reset(
51 new transcode::stream<std::istream>(
52 theCurrentCharset.c_str(), theStreamBuffer

Subscribers

People subscribed via source and target branches