Merge lp:~paul-lucas/zorba/bug-1002867 into lp:zorba

Proposed by Paul J. Lucas
Status: Merged
Approved by: Paul J. Lucas
Approved revision: 10863
Merged at revision: 10865
Proposed branch: lp:~paul-lucas/zorba/bug-1002867
Merge into: lp:zorba
Diff against target: 52 lines (+17/-4)
1 file modified
modules/com/zorba-xquery/www/modules/http-client.xq.src/http_response_parser.cpp (+17/-4)
To merge this branch: bzr merge lp:~paul-lucas/zorba/bug-1002867
Reviewer Review Type Date Requested Status
David Graf (community) Approve
Matthias Brantner Approve
Review via email: mp+106855@code.launchpad.net

Commit message

No longer setting the charset of an overridden media type to a default value.

Description of the change

No longer setting the charset of an overridden media type to a default value.

To post a comment you must log in.
Revision history for this message
Matthias Brantner (matthias-brantner) wrote :

It seems your investigation makes sense. However, I think that the solution should be to not do any transcoding for any non-textual content (even if the user gives one).

review: Needs Fixing
Revision history for this message
Paul J. Lucas (paul-lucas) wrote :

If the content is non-textual, then they shouldn't be giving a charset="...". So I don't see the problem.

Revision history for this message
Matthias Brantner (matthias-brantner) wrote :

Does you change imply that the default if no charset is given isn't ISO-8859-1 anymore.

review: Needs Information
Revision history for this message
Paul J. Lucas (paul-lucas) wrote :

If no override-media-type is given, then the default charset is still ISO 8859-1 (see line 76 in http_response_parser.cpp).

However, if the user gives an override-media-type, then there is no charset unless the user explicitly includes a "; charset=..." as part of the override-media-type; hence, the bytes will pass through unchanged.

I can't think of anything else that's sensible to do.

Revision history for this message
Matthias Brantner (matthias-brantner) wrote :

> If no override-media-type is given, then the default charset is still ISO
> 8859-1 (see line 76 in http_response_parser.cpp).
>
> However, if the user gives an override-media-type, then there is no charset
> unless the user explicitly includes a "; charset=..." as part of the override-
> media-type; hence, the bytes will pass through unchanged.
>
> I can't think of anything else that's sensible to do.
If the user overrides the media-type, the behavior should be no different then if the server returned one. That is, if the media type is overwritten to be a text content-type, the default encoding ISO-8859-1 should still be used. Of course, unless otherwise given by the user.

lp:~paul-lucas/zorba/bug-1002867 updated
10863. By Paul J. Lucas

Now setting default of ISO 8859-1 only for text/ MIME types.

Revision history for this message
Matthias Brantner (matthias-brantner) :
review: Approve
Revision history for this message
David Graf (davidagraf) :
review: Approve
Revision history for this message
Zorba Build Bot (zorba-buildbot) wrote :
Revision history for this message
Zorba Build Bot (zorba-buildbot) wrote :

Validation queue job bug-1002867-2012-05-23T13-42-01.572Z is finished. The final status was:

All tests succeeded!

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'modules/com/zorba-xquery/www/modules/http-client.xq.src/http_response_parser.cpp'
--- modules/com/zorba-xquery/www/modules/http-client.xq.src/http_response_parser.cpp 2012-05-10 04:12:09 +0000
+++ modules/com/zorba-xquery/www/modules/http-client.xq.src/http_response_parser.cpp 2012-05-22 18:26:18 +0000
@@ -13,6 +13,7 @@
13 * See the License for the specific language governing permissions and13 * See the License for the specific language governing permissions and
14 * limitations under the License.14 * limitations under the License.
15 */15 */
16#include <cstring>
16#include <string>17#include <string>
17#include <sstream>18#include <sstream>
18#include <assert.h>19#include <assert.h>
@@ -41,8 +42,20 @@
41 std::string::size_type pos = s.find( ';' );42 std::string::size_type pos = s.find( ';' );
42 *mime_type = s.substr( 0, pos );43 *mime_type = s.substr( 0, pos );
4344
44 // The HTTP/1.1 spec says that the default charset is ISO-8859-1.45 if ( std::strncmp( mime_type->c_str(), "text/", 5 ) == 0 ) {
45 *charset = "ISO-8859-1";46 //
47 // RFC 2616: "Hypertext Transfer Protocol -- HTTP/1.1," section 3.7.1,
48 // "Canonicalization and Text Defaults":
49 //
50 // The "charset" parameter is used with some media types to define the
51 // character set (section 3.4) of the data. When no explicit charset
52 // parameter is provided by the sender, media subtypes of the "text"
53 // type are defined to have a default charset value of "ISO-8859-1" when
54 // received via HTTP.
55 //
56 *charset = "ISO-8859-1";
57 } else
58 charset->clear();
4659
47 if ( pos != std::string::npos ) {60 if ( pos != std::string::npos ) {
48 //61 //
@@ -74,7 +87,6 @@
74 theHandler(aHandler),87 theHandler(aHandler),
75 theCurl(aCurl),88 theCurl(aCurl),
76 theErrorThrower(aErrorThrower),89 theErrorThrower(aErrorThrower),
77 theCurrentCharset("ISO-8859-1"), // HTTP/1.1 says this is the default
78 theStatus(-1),90 theStatus(-1),
79 theStreamBuffer(0),91 theStreamBuffer(0),
80 theInsideRead(false),92 theInsideRead(false),
@@ -109,7 +121,8 @@
109121
110 std::auto_ptr<std::istream> lStream;122 std::auto_ptr<std::istream> lStream;
111 try {123 try {
112 if ( transcode::is_necessary( theCurrentCharset.c_str() ) ) {124 if ( !theCurrentCharset.empty() &&
125 transcode::is_necessary( theCurrentCharset.c_str() ) ) {
113 lStream.reset(126 lStream.reset(
114 new transcode::stream<std::istream>(127 new transcode::stream<std::istream>(
115 theCurrentCharset.c_str(), theStreamBuffer128 theCurrentCharset.c_str(), theStreamBuffer

Subscribers

People subscribed via source and target branches