Merge lp:~zorba-coders/zorba/feature-transcode_streambuf into lp:zorba
Status: | Merged |
---|---|
Approved by: | Matthias Brantner |
Approved revision: | 10660 |
Merged at revision: | 10663 |
Proposed branch: | lp:~zorba-coders/zorba/feature-transcode_streambuf |
Merge into: | lp:zorba |
Diff against target: | 2967 lines (+1874/-555), 37 files modified |
To merge this branch: | bzr merge lp:~zorba-coders/zorba/feature-transcode_streambuf |
Related bugs: | |

Files modified:
- ChangeLog (+4/-0)
- include/zorba/internal/proxy.h (+48/-0)
- include/zorba/pregenerated/diagnostic_list.h (+4/-0)
- include/zorba/transcode_stream.h (+213/-0)
- modules/ExternalModules.conf (+1/-1)
- modules/com/zorba-xquery/www/modules/http-client.xq (+2/-2)
- modules/com/zorba-xquery/www/modules/http-client.xq.src/curl_stream_buffer.cpp (+337/-338)
- modules/com/zorba-xquery/www/modules/http-client.xq.src/curl_stream_buffer.h (+164/-143)
- modules/com/zorba-xquery/www/modules/http-client.xq.src/http_response_parser.cpp (+71/-21)
- modules/com/zorba-xquery/www/modules/http-client.xq.src/http_response_parser.h (+10/-6)
- modules/com/zorba-xquery/www/modules/pregenerated/errors.xq (+8/-0)
- modules/org/expath/ns/file.xq.src/file.cpp (+25/-10)
- modules/org/expath/ns/file.xq.src/file_function.cpp (+0/-5)
- modules/org/expath/ns/file.xq.src/file_function.h (+5/-9)
- modules/org/expath/ns/file.xq.src/file_module.cpp (+2/-5)
- modules/org/expath/ns/file.xq.src/file_module.h (+13/-6)
- src/api/CMakeLists.txt (+1/-0)
- src/api/transcode_streambuf.cpp (+102/-0)
- src/diagnostics/diagnostic_en.xml (+8/-0)
- src/diagnostics/pregenerated/diagnostic_list.cpp (+6/-0)
- src/diagnostics/pregenerated/dict_en.cpp (+2/-0)
- src/unit_tests/CMakeLists.txt (+4/-6)
- src/unit_tests/test_icu_streambuf.cpp (+151/-0)
- src/unit_tests/unit_test_list.h (+5/-0)
- src/unit_tests/unit_tests.cpp (+3/-0)
- src/util/CMakeLists.txt (+6/-1)
- src/util/icu_streambuf.cpp (+300/-0)
- src/util/icu_streambuf.h (+140/-0)
- src/util/passthru_streambuf.cpp (+105/-0)
- src/util/passthru_streambuf.h (+76/-0)
- src/util/transcode_streambuf.h (+47/-0)
- test/rbkt/ExpQueryResults/zorba/file/cp1252.xml.res (+1/-0)
- test/rbkt/Queries/zorba/file/cp1252.txt (+1/-0)
- test/rbkt/Queries/zorba/file/cp1252.xq (+3/-0)
- test/rbkt/Queries/zorba/file/invalid_encoding.spec (+1/-0)
- test/rbkt/Queries/zorba/file/invalid_encoding.xq (+3/-0)
- test/rbkt/Queries/zorba/http-client/send-request/http2-read-svg.xq (+2/-2)

Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Matthias Brantner | | | Approve |
Paul J. Lucas | | | Approve |

Review via email: mp+93327@code.launchpad.net
This proposal supersedes a proposal from 2012-02-08.
Commit message
- Added transcode_streambuf
- file:read-text now respects encodings
- http:send-
Description of the change
Added transcode_
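To make the idea behind the change concrete, here is a minimal sketch of a read-side transcoding stream buffer. It is an illustration only, not the code in this proposal: the class name `latin1_to_utf8_buf` and the helper `read_latin1_as_utf8` are hypothetical, the source encoding is hard-coded to ISO-8859-1, and writing and seeking are left out. Judging by the file list, the actual implementation delegates conversion to ICU (`src/util/icu_streambuf.cpp`) and handles arbitrary encodings.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical sketch: wraps another streambuf and converts ISO-8859-1 bytes
// to UTF-8 as they are read. Not Zorba's transcode::streambuf.
class latin1_to_utf8_buf : public std::streambuf {
public:
  explicit latin1_to_utf8_buf( std::streambuf *orig ) : orig_( orig ) {
    // An empty get area forces underflow() on the first read.
    setg( out_, out_, out_ );
  }

protected:
  int_type underflow() override {
    if ( gptr() < egptr() )
      return traits_type::to_int_type( *gptr() );

    // Pull exactly one byte from the wrapped buffer.
    int_type const c = orig_->sbumpc();
    if ( traits_type::eq_int_type( c, traits_type::eof() ) )
      return traits_type::eof();

    unsigned char const b =
      static_cast<unsigned char>( traits_type::to_char_type( c ) );
    std::size_t n;
    if ( b < 0x80 ) {
      // ASCII maps to itself.
      out_[0] = static_cast<char>( b );
      n = 1;
    } else {
      // U+0080..U+00FF become a two-byte UTF-8 sequence.
      out_[0] = static_cast<char>( 0xC0 | (b >> 6) );
      out_[1] = static_cast<char>( 0x80 | (b & 0x3F) );
      n = 2;
    }
    setg( out_, out_, out_ + n );
    return traits_type::to_int_type( out_[0] );
  }

private:
  std::streambuf *orig_;  // not owned
  char out_[2];           // holds the UTF-8 bytes of one converted character
};

// Hypothetical helper: reads ISO-8859-1 bytes through the sketch above and
// returns them as UTF-8.
std::string read_latin1_as_utf8( std::string const &bytes ) {
  std::istringstream raw( bytes );
  latin1_to_utf8_buf tbuf( raw.rdbuf() );
  std::istream in( &tbuf );
  std::string result;
  char ch;
  while ( in.get( ch ) )
    result += ch;
  return result;
}
```

Converting one character per `underflow()` call keeps the sketch short; a production buffer would convert in larger chunks.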
Paul J. Lucas (paul-lucas) : Posted in a previous version of this proposal
Matthias Brantner (matthias-brantner) : Posted in a previous version of this proposal
Zorba Build Bot (zorba-buildbot) wrote : Posted in a previous version of this proposal
Zorba Build Bot (zorba-buildbot) wrote : Posted in a previous version of this proposal
The attempt to merge lp:~zorba-coders/zorba/feature-transcode_streambuf into lp:zorba failed. Below is the output from the failed tests.
CMake Error at /home/ceej/
Validation queue job feature-
is finished. The final status was:
3 tests did not succeed - changes not commited.
Error in read script: /home/ceej/
Paul J. Lucas (paul-lucas) : Posted in a previous version of this proposal
Zorba Build Bot (zorba-buildbot) wrote : Posted in a previous version of this proposal
Validation queue starting for merge proposal.
Log at: http://
Zorba Build Bot (zorba-buildbot) wrote : Posted in a previous version of this proposal
The attempt to merge lp:~zorba-coders/zorba/feature-transcode_streambuf into lp:zorba failed. Below is the output from the failed tests.
CMake Error at /home/ceej/
Validation queue job feature-
is finished. The final status was:
1 tests did not succeed - changes not commited.
Error in read script: /home/ceej/
Zorba Build Bot (zorba-buildbot) wrote : Posted in a previous version of this proposal
Attempt to merge into lp:zorba failed due to conflicts:
text conflict in src/unit_
text conflict in src/unit_
Zorba Build Bot (zorba-buildbot) wrote : Posted in a previous version of this proposal
There are additional revisions which have not been approved in review. Please seek review and approval of these new revisions.
Paul J. Lucas (paul-lucas) :
Matthias Brantner (matthias-brantner) :
Zorba Build Bot (zorba-buildbot) wrote :
Validation queue starting for merge proposal.
Log at: http://
Zorba Build Bot (zorba-buildbot) wrote :
Validation queue starting for merge proposal.
Log at: http://
Zorba Build Bot (zorba-buildbot) wrote :
Validation queue job feature-
All tests succeeded!
Preview Diff
1 | === modified file 'ChangeLog' |
2 | --- ChangeLog 2012-02-16 00:52:25 +0000 |
3 | +++ ChangeLog 2012-02-16 02:19:18 +0000 |
4 | @@ -34,6 +34,10 @@ |
5 | * zerr is not predeclared anymore to be http://www.zorba-xquery.com/errors |
6 | * Add new XQuery interface for the PHP bindings. |
7 | * Added API method Item::getNamespaceBindings(). |
8 | + * Added a transcoding streambuffer to the API which allows transcoding arbitrary encodings |
9 | + from and to UTF-8 |
10 | + * file:read-text is able to handle arbitrary encodings (fixes bug #867159) |
11 | + * http:send-request is able to handle arbitrary encodings |
12 | * Fixed bug #917981 (disallow declaring same module twice). |
13 | * Added API method StaticContext::getNamespaceBindings() (see bug #905035) |
14 | * Deprecated StaticContext:getNamespaceURIByPrefix() |
15 | |
16 | === added file 'include/zorba/internal/proxy.h' |
17 | --- include/zorba/internal/proxy.h 1970-01-01 00:00:00 +0000 |
18 | +++ include/zorba/internal/proxy.h 2012-02-16 02:19:18 +0000 |
19 | @@ -0,0 +1,48 @@ |
20 | +/* |
21 | + * Copyright 2006-2008 The FLWOR Foundation. |
22 | + * |
23 | + * Licensed under the Apache License, Version 2.0 (the "License"); |
24 | + * you may not use this file except in compliance with the License. |
25 | + * You may obtain a copy of the License at |
26 | + * |
27 | + * http://www.apache.org/licenses/LICENSE-2.0 |
28 | + * |
29 | + * Unless required by applicable law or agreed to in writing, software |
30 | + * distributed under the License is distributed on an "AS IS" BASIS, |
31 | + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
32 | + * See the License for the specific language governing permissions and |
33 | + * limitations under the License. |
34 | + */ |
35 | + |
36 | +#ifndef ZORBA_INTERNAL_PROXY_H |
37 | +#define ZORBA_INTERNAL_PROXY_H |
38 | + |
39 | +namespace zorba { |
40 | +namespace internal { |
41 | +namespace ztd { |
42 | + |
43 | +/////////////////////////////////////////////////////////////////////////////// |
44 | + |
45 | +/** |
46 | + * \internal |
47 | + * A %proxy<T> is-a \c T that also contains a T* -- a pointer to the original. |
48 | + */ |
49 | +template<class OriginalType> |
50 | +class proxy : public OriginalType { |
51 | +public: |
52 | + proxy( OriginalType *p ) : original_( p ) { } |
53 | + |
54 | + OriginalType* original() const { |
55 | + return original_; |
56 | + } |
57 | +private: |
58 | + OriginalType *original_; |
59 | +}; |
60 | + |
61 | +/////////////////////////////////////////////////////////////////////////////// |
62 | + |
63 | +} // namespace ztd |
64 | +} // namespace internal |
65 | +} // namespace zorba |
66 | +#endif /* ZORBA_INTERNAL_PROXY_H */ |
67 | +/* vim:set et sw=2 ts=2: */ |
68 | |
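As a reading aid for `proxy.h` above, the following sketch shows one way a `proxy<std::streambuf>` could be used: a derived buffer that forwards reads to the original. The `proxy` template is copied from the diff (namespaces dropped). `passthru_sketch` and `drain` are hypothetical names introduced here and are not the `passthru_streambuf` added elsewhere in this proposal.

```cpp
#include <istream>
#include <sstream>
#include <streambuf>
#include <string>

// Copied from the diff above: a proxy<T> is-a T that also remembers a T*.
template<class OriginalType>
class proxy : public OriginalType {
public:
  proxy( OriginalType *p ) : original_( p ) { }

  OriginalType* original() const {
    return original_;
  }
private:
  OriginalType *original_;
};

// Hypothetical example: a streambuf that simply forwards reads to the
// original buffer it was constructed with.
class passthru_sketch : public proxy<std::streambuf> {
public:
  explicit passthru_sketch( std::streambuf *orig )
    : proxy<std::streambuf>( orig ) { }

protected:
  // Peek at the next character without consuming it.
  int_type underflow() override { return original()->sgetc(); }
  // Consume and return the next character.
  int_type uflow() override { return original()->sbumpc(); }
};

// Hypothetical helper: reads everything available from a streambuf.
std::string drain( std::streambuf *buf ) {
  std::istream in( buf );
  std::string s;
  char c;
  while ( in.get( c ) )
    s += c;
  return s;
}
```

Because the proxy is itself a `std::streambuf`, it can be handed to any stream, while `original()` still gives access to the buffer it stands in for.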
69 | === modified file 'include/zorba/pregenerated/diagnostic_list.h' |
70 | --- include/zorba/pregenerated/diagnostic_list.h 2012-01-26 01:35:11 +0000 |
71 | +++ include/zorba/pregenerated/diagnostic_list.h 2012-02-16 02:19:18 +0000 |
72 | @@ -392,6 +392,8 @@ |
73 | |
74 | extern ZORBA_DLL_PUBLIC ZorbaErrorCode ZXQP0005_NOT_ENABLED; |
75 | |
76 | +extern ZORBA_DLL_PUBLIC ZorbaErrorCode ZXQP0006_UNKNOWN_ENCODING; |
77 | + |
78 | extern ZORBA_DLL_PUBLIC ZorbaErrorCode ZXQP0007_FUNCTION_SIGNATURE_NOT_EQUAL; |
79 | |
80 | extern ZORBA_DLL_PUBLIC ZorbaErrorCode ZXQP0008_FUNCTION_IMPL_NOT_FOUND; |
81 | @@ -684,6 +686,8 @@ |
82 | |
83 | extern ZORBA_DLL_PUBLIC ZorbaErrorCode ZOSE0005_DLL_LOAD_FAILED; |
84 | |
85 | +extern ZORBA_DLL_PUBLIC ZorbaErrorCode ZOSE0006_TRANSCODING_ERROR; |
86 | + |
87 | extern ZORBA_DLL_PUBLIC ZorbaErrorCode ZSTR0001_INDEX_ALREADY_EXISTS; |
88 | |
89 | extern ZORBA_DLL_PUBLIC ZorbaErrorCode ZSTR0002_INDEX_DOES_NOT_EXIST; |
90 | |
91 | === added file 'include/zorba/transcode_stream.h' |
92 | --- include/zorba/transcode_stream.h 1970-01-01 00:00:00 +0000 |
93 | +++ include/zorba/transcode_stream.h 2012-02-16 02:19:18 +0000 |
94 | @@ -0,0 +1,213 @@ |
95 | +/* |
96 | + * Copyright 2006-2008 The FLWOR Foundation. |
97 | + * |
98 | + * Licensed under the Apache License, Version 2.0 (the "License"); |
99 | + * you may not use this file except in compliance with the License. |
100 | + * You may obtain a copy of the License at |
101 | + * |
102 | + * http://www.apache.org/licenses/LICENSE-2.0 |
103 | + * |
104 | + * Unless required by applicable law or agreed to in writing, software |
105 | + * distributed under the License is distributed on an "AS IS" BASIS, |
106 | + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
107 | + * See the License for the specific language governing permissions and |
108 | + * limitations under the License. |
109 | + */ |
110 | + |
111 | +#ifndef ZORBA_TRANSCODE_STREAM_API_H |
112 | +#define ZORBA_TRANSCODE_STREAM_API_H |
113 | + |
114 | +#include <stdexcept> |
115 | +#include <streambuf> |
116 | +#include <string> |
117 | + |
118 | +#include <zorba/config.h> |
119 | +#include <zorba/internal/proxy.h> |
120 | +#include <zorba/internal/unique_ptr.h> |
121 | + |
122 | +namespace zorba { |
123 | + |
124 | +typedef internal::ztd::proxy<std::streambuf> proxy_streambuf; |
125 | + |
126 | +namespace transcode { |
127 | + |
128 | +/////////////////////////////////////////////////////////////////////////////// |
129 | + |
130 | +/** |
131 | + * A %transcode::streambuf is-a std::streambuf for transcoding character |
132 | + * encodings from/to UTF-8 on-the-fly. |
133 | + * |
134 | + * To use it, replace a stream's streambuf: |
135 | + * \code |
136 | + * istream is; |
137 | + * // ... |
138 | + * transcode::streambuf tbuf( "ISO-8859-1", is.rdbuf() ); |
139 | + * is.ios::rdbuf( &tbuf ); |
140 | + * \endcode |
141 | + * Note that the %transcode::streambuf must exist for as long as it's being used |
142 | + * by the stream. If you are replacing the streabuf for a stream you did not |
143 | + * create, you should set it back to the original streambuf: |
144 | + * \code |
145 | + * void f( ostream &os ) { |
146 | + * transcode::streambuf tbuf( "ISO-8859-1", os.rdbuf() ); |
147 | + * try { |
148 | + * os.ios::rdbuf( &tbuf ); |
149 | + * // ... |
150 | + * } |
151 | + * catch ( ... ) { |
152 | + * os.ios::rdbuf( tbuf.orig_streambuf() ); |
153 | + * throw; |
154 | + * } |
155 | + * } |
156 | + * \endcode |
157 | + * |
158 | + * While %transcode::streambuf does support seeking, the positions are relative |
159 | + * to the original byte stream. |
160 | + */ |
161 | +class ZORBA_DLL_PUBLIC streambuf : public std::streambuf { |
162 | +public: |
163 | + /** |
164 | + * Constructs a %transcode::streambuf. |
165 | + * |
166 | + * @param charset The name of the character encoding to convert from/to. |
167 | + * @param orig The original streambuf to read/write from/to. |
168 | + * @throws std::invalid_argument if either \a charset is not supported or |
169 | + * \a orig is null. |
170 | + */ |
171 | + streambuf( char const *charset, std::streambuf *orig ); |
172 | + |
173 | + /** |
174 | + * Destructs a %transcode::streambuf. |
175 | + */ |
176 | + ~streambuf(); |
177 | + |
178 | + /** |
179 | + * Gets the original streambuf. |
180 | + * |
181 | + * @return said streambuf. |
182 | + */ |
183 | + std::streambuf* orig_streambuf() const { |
184 | + return proxy_buf_->original(); |
185 | + } |
186 | + |
187 | +protected: |
188 | + void imbue( std::locale const& ); |
189 | + pos_type seekoff( off_type, std::ios_base::seekdir, std::ios_base::openmode ); |
190 | + pos_type seekpos( pos_type, std::ios_base::openmode ); |
191 | + std::streambuf* setbuf( char_type*, std::streamsize ); |
192 | + std::streamsize showmanyc(); |
193 | + int sync(); |
194 | + int_type overflow( int_type ); |
195 | + int_type pbackfail( int_type ); |
196 | + int_type uflow(); |
197 | + int_type underflow(); |
198 | + std::streamsize xsgetn( char_type*, std::streamsize ); |
199 | + std::streamsize xsputn( char_type const*, std::streamsize ); |
200 | + |
201 | +private: |
202 | + std::unique_ptr<proxy_streambuf> proxy_buf_; |
203 | + |
204 | + // forbid |
205 | + streambuf( streambuf const& ); |
206 | + streambuf& operator=( streambuf const& ); |
207 | +}; |
208 | + |
209 | +/////////////////////////////////////////////////////////////////////////////// |
210 | + |
211 | +/** |
212 | + * A %transcode::stream is used to wrap a C++ standard I/O stream with a |
213 | + * transcode::streambuf so that transcoding and the management of the streambuf |
214 | + * happens automatically. |
215 | + * |
216 | + * @tparam StreamType The I/O stream class type to wrap. It must be a concrete |
217 | + * stream class. |
218 | + */ |
219 | +template<class StreamType> |
220 | +class stream : public StreamType { |
221 | +public: |
222 | + /** |
223 | + * Constructs a %transcode::stream. |
224 | + * |
225 | + * @param charset The name of the character encoding to convert from/to. |
226 | + * @throws std::invalid_argument if \a charset is not supported. |
227 | + */ |
228 | + stream( char const *charset ) : |
229 | + tbuf_( charset, this->rdbuf() ) |
230 | + { |
231 | + init(); |
232 | + } |
233 | + |
234 | + /** |
235 | + * Constructs a %stream. |
236 | + * |
237 | + * @tparam StreamArgType The type of the first argument of \a StreamType's |
238 | + * constructor. |
239 | + * @param charset The name of the character encoding to convert from/to. |
240 | + * @param stream_arg The argument to pass as the first argument to |
241 | + * \a StreamType's constructor. |
242 | + * @throws std::invalid_argument if \a charset is not supported. |
243 | + */ |
244 | + template<typename StreamArgType> |
245 | + stream( char const *charset, StreamArgType stream_arg ) : |
246 | + StreamType( stream_arg ), |
247 | + tbuf_( charset, this->rdbuf() ) |
248 | + { |
249 | + init(); |
250 | + } |
251 | + |
252 | + /** |
253 | + * Constructs a %transcode::stream. |
254 | + * |
255 | + * @tparam StreamArgType The type of the first argument of \a StreamType's |
256 | + * constructor. |
257 | + * @param charset The name of the character encoding to convert from/to. |
258 | + * @param stream_arg The argument to pass as the first argument to |
259 | + * \a StreamType's constructor. |
260 | + * @param mode The open-mode to pass to \a StreamType's constructor. |
261 | + * @throws std::invalid_argument if \a charset is not supported. |
262 | + */ |
263 | + template<typename StreamArgType> |
264 | + stream( char const *charset, StreamArgType stream_arg, |
265 | + std::ios_base::openmode mode ) : |
266 | + StreamType( stream_arg, mode ), |
267 | + tbuf_( charset, this->rdbuf() ) |
268 | + { |
269 | + init(); |
270 | + } |
271 | + |
272 | +private: |
273 | + streambuf tbuf_; |
274 | + |
275 | + void init() { |
276 | + this->std::ios::rdbuf( &tbuf_ ); |
277 | + } |
278 | +}; |
279 | + |
280 | +/////////////////////////////////////////////////////////////////////////////// |
281 | + |
282 | +/** |
283 | + * Checks whether it would be necessary to transcode from the given character |
284 | + * encoding to UTF-8. |
285 | + * |
286 | + * @param charset The name of the character encoding to check. |
287 | + * @return \c true only if it would be necessary to transcode from the given |
288 | + * character encoding to UTF-8. |
289 | + */ |
290 | +ZORBA_DLL_PUBLIC |
291 | +bool is_necessary( char const *charset ); |
292 | + |
293 | +/** |
294 | + * Checks whether the given character set is supported for transcoding. |
295 | + * |
296 | + * @param charset The name of the character encoding to check. |
297 | + * @return \c true only if the character encoding is supported. |
298 | + */ |
299 | +ZORBA_DLL_PUBLIC |
300 | +bool is_supported( char const *charset ); |
301 | + |
302 | +/////////////////////////////////////////////////////////////////////////////// |
303 | + |
304 | +} // namespace transcode |
305 | +} // namespace zorba |
306 | +#endif /* ZORBA_TRANSCODE_STREAM_API_H */ |
307 | +/* vim:set et sw=2 ts=2: */ |
308 | |
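The second example in the `transcode::streambuf` doc comment above restores the original streambuf only in the `catch` block, so on a normal return the stream would still point at the local `tbuf`. One possible way to cover both paths is a small RAII guard. The sketch below is an assumption about how that could look, not part of the Zorba API: `rdbuf_guard` and `write_captured` are hypothetical names, and a plain `std::ostringstream` stands in for the transcoding buffer.

```cpp
#include <ios>
#include <ostream>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical helper: installs a replacement streambuf on construction and
// puts the previous one back on destruction, whether the scope is left by a
// normal return or by an exception.
class rdbuf_guard {
public:
  rdbuf_guard( std::ios &s, std::streambuf *replacement )
    : stream_( s ), saved_( s.rdbuf( replacement ) ) { }

  ~rdbuf_guard() { stream_.rdbuf( saved_ ); }

  rdbuf_guard( rdbuf_guard const& ) = delete;
  rdbuf_guard& operator=( rdbuf_guard const& ) = delete;

private:
  std::ios &stream_;
  std::streambuf *saved_;
};

// Hypothetical usage: temporarily redirect whatever is written to `os` into
// a string, then return that string. `os` gets its own buffer back.
std::string write_captured( std::ostream &os, std::string const &text ) {
  std::ostringstream sink;
  {
    rdbuf_guard guard( os, sink.rdbuf() );
    os << text;
  } // guard's destructor restores the original streambuf here
  return sink.str();
}
```

The guard takes a `std::ios&` so that it calls `std::ios::rdbuf`, the same overload the doc comment reaches with `os.ios::rdbuf( ... )`. The `transcode::stream` wrapper in the same header automates installing the buffer for streams the caller creates.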
309 | === modified file 'modules/ExternalModules.conf' |
310 | --- modules/ExternalModules.conf 2012-02-16 00:52:25 +0000 |
311 | +++ modules/ExternalModules.conf 2012-02-16 02:19:18 +0000 |
312 | @@ -32,7 +32,7 @@ |
313 | email bzr lp:zorba/email-module zorba-2.1 |
314 | excel bzr lp:zorba/excel-module zorba-2.1 |
315 | geo bzr lp:zorba/geo-module zorba-2.1 |
316 | -http-client bzr lp:zorba/http-client-module 1.0 |
317 | +http-client bzr lp:zorba/http-client-module |
318 | image bzr lp:zorba/image-module zorba-2.1 |
319 | languages bzr lp:zorba/languages-module zorba-2.1 |
320 | oauth bzr lp:zorba/oauth-module zorba-2.1 |
321 | |
322 | === modified file 'modules/com/zorba-xquery/www/modules/http-client.xq' |
323 | --- modules/com/zorba-xquery/www/modules/http-client.xq 2011-08-26 23:36:24 +0000 |
324 | +++ modules/com/zorba-xquery/www/modules/http-client.xq 2012-02-16 02:19:18 +0000 |
325 | @@ -354,7 +354,7 @@ |
326 | :) |
327 | declare %ann:nondeterministic function http:get-node($href as xs:string) as item()+ |
328 | { |
329 | - http:http-nondeterministic-impl(validate {<http-schema:request method="GET" href="{$href}" follow-redirect="true" override-media-type="text/xml"/>}, (), ()) |
330 | + http:http-nondeterministic-impl(validate {<http-schema:request method="GET" href="{$href}" follow-redirect="true" override-media-type="text/xml; charset=utf-8"/>}, (), ()) |
331 | }; |
332 | |
333 | (:~ |
334 | @@ -374,7 +374,7 @@ |
335 | :) |
336 | declare %ann:nondeterministic function http:get-text($href as xs:string) as item()+ |
337 | { |
338 | - http:http-nondeterministic-impl(validate {<http-schema:request method="GET" href="{$href}" follow-redirect="true" override-media-type="text/plain"/>}, (), ()) |
339 | + http:http-nondeterministic-impl(validate {<http-schema:request method="GET" href="{$href}" follow-redirect="true" override-media-type="text/plain; charset=utf-8"/>}, (), ()) |
340 | }; |
341 | |
342 | (:~ |
343 | |
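The two hunks above append `; charset=utf-8` to the overriding media type. The parser changes in `http_response_parser.cpp` are not shown in this part of the diff, so the following is only a guess at the kind of step involved: pulling the `charset` parameter out of a media-type string before deciding whether to transcode. `charset_of` is a hypothetical helper and deliberately simplistic (it matches the first `charset=` substring and does not implement full HTTP parameter parsing).

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: returns the value of the charset parameter of a media
// type such as "text/plain; charset=utf-8", or an empty string if absent.
std::string charset_of( std::string const &media_type ) {
  // Parameter names are case-insensitive, so search a lower-cased copy.
  std::string lower( media_type );
  std::transform( lower.begin(), lower.end(), lower.begin(),
    []( unsigned char c ) { return static_cast<char>( std::tolower( c ) ); } );

  std::string const key( "charset=" );
  std::string::size_type pos = lower.find( key );
  if ( pos == std::string::npos )
    return std::string();
  pos += key.size();

  // The value ends at the next parameter separator or whitespace.
  std::string::size_type const end = lower.find_first_of( "; \t", pos );
  std::string value = media_type.substr(
    pos, end == std::string::npos ? std::string::npos : end - pos );

  // Strip optional surrounding double quotes.
  if ( value.size() >= 2 && value.front() == '"' && value.back() == '"' )
    value = value.substr( 1, value.size() - 2 );
  return value;
}
```

A caller could pass the result to `transcode::is_necessary()` from `transcode_stream.h` to decide whether wrapping the response stream is needed at all.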
344 | === modified file 'modules/com/zorba-xquery/www/modules/http-client.xq.src/curl_stream_buffer.cpp' |
345 | --- modules/com/zorba-xquery/www/modules/http-client.xq.src/curl_stream_buffer.cpp 2011-07-29 08:12:36 +0000 |
346 | +++ modules/com/zorba-xquery/www/modules/http-client.xq.src/curl_stream_buffer.cpp 2012-02-16 02:19:18 +0000 |
347 | @@ -21,6 +21,7 @@ |
348 | #include <iostream> |
349 | #include <cassert> |
350 | #ifndef WIN32 |
351 | +#include <cerrno> |
352 | #include <sys/time.h> |
353 | #endif /* WIN32 */ |
354 | |
355 | @@ -32,349 +33,347 @@ |
356 | using namespace std; |
357 | |
358 | namespace zorba { |
359 | - namespace curl { |
360 | - |
361 | - /////////////////////////////////////////////////////////////////////////////// |
362 | - |
363 | +namespace curl { |
364 | + |
365 | +/////////////////////////////////////////////////////////////////////////////// |
366 | + |
367 | #define ZORBA_CURL_ASSERT(expr) \ |
368 | -do { \ |
369 | -if ( CURLcode const code##__LINE__ = (expr) ) \ |
370 | -throw exception( #expr, "", code##__LINE__ ); \ |
371 | -} while (0) |
372 | - |
373 | + do { \ |
374 | + if ( CURLcode const code##__LINE__ = (expr) ) \ |
375 | + throw exception( #expr, "", code##__LINE__ ); \ |
376 | + } while (0) |
377 | + |
378 | #define ZORBA_CURLM_ASSERT(expr) \ |
379 | -do { \ |
380 | -if ( CURLMcode const code##__LINE__ = (expr) ) \ |
381 | -if ( code##__LINE__ != CURLM_CALL_MULTI_PERFORM ) \ |
382 | -throw exception( #expr, "", code##__LINE__ ); \ |
383 | -} while (0) |
384 | - |
385 | - exception::exception( char const *function, char const *uri, char const *msg ) : |
386 | - std::exception(), theMessage(msg) |
387 | - { |
388 | - } |
389 | - |
390 | - exception::exception( char const *function, char const *uri, CURLcode code ) : |
391 | - std::exception(), theMessage(curl_easy_strerror(code)) |
392 | - { |
393 | - } |
394 | - |
395 | - exception::exception( char const *function, char const *uri, CURLMcode code ) : |
396 | - std::exception(), theMessage(curl_multi_strerror(code)) |
397 | - { |
398 | - } |
399 | - |
400 | - const char* exception::what() const throw() { |
401 | - return theMessage; |
402 | - } |
403 | - |
404 | - |
405 | - /////////////////////////////////////////////////////////////////////////////// |
406 | - |
407 | - CURL* create( char const *uri, write_fn_t fn, void *data ) { |
408 | - // |
409 | - // Having cURL initialization wrapped by a class and using a singleton static |
410 | - // instance guarantees that cURL is initialized exactly once before use and |
411 | - // and also is cleaned-up at program termination (when destructors for static |
412 | - // objects are called). |
413 | - // |
414 | - struct curl_initializer { |
415 | - curl_initializer() { |
416 | - ZORBA_CURL_ASSERT( curl_global_init( CURL_GLOBAL_ALL ) ); |
417 | - } |
418 | - ~curl_initializer() { |
419 | - curl_global_cleanup(); |
420 | - } |
421 | - }; |
422 | - static curl_initializer initializer; |
423 | - |
424 | - CURL *const curl = curl_easy_init(); |
425 | - if ( !curl ) |
426 | - throw exception( "curl_easy_init()", uri, "" ); |
427 | - |
428 | - try { |
429 | - ZORBA_CURL_ASSERT( curl_easy_setopt( curl, CURLOPT_URL, uri ) ); |
430 | - ZORBA_CURL_ASSERT( curl_easy_setopt( curl, CURLOPT_WRITEDATA, data ) ); |
431 | - ZORBA_CURL_ASSERT( curl_easy_setopt( curl, CURLOPT_WRITEFUNCTION, fn ) ); |
432 | - |
433 | - // Tells cURL to follow redirects. CURLOPT_MAXREDIRS is by default set to -1 |
434 | - // thus cURL will do an infinite number of redirects. |
435 | - ZORBA_CURL_ASSERT( curl_easy_setopt( curl, CURLOPT_FOLLOWLOCATION, 1 ) ); |
436 | - |
437 | + do { \ |
438 | + if ( CURLMcode const code##__LINE__ = (expr) ) \ |
439 | + if ( code##__LINE__ != CURLM_CALL_MULTI_PERFORM ) \ |
440 | + throw exception( #expr, "", code##__LINE__ ); \ |
441 | + } while (0) |
442 | + |
443 | +exception::exception( char const *function, char const *uri, char const *msg ) : |
444 | + std::exception(), msg_( msg ) |
445 | +{ |
446 | +} |
447 | + |
448 | +exception::exception( char const *function, char const *uri, CURLcode code ) : |
449 | + std::exception(), |
450 | + msg_( curl_easy_strerror( code ) ) |
451 | +{ |
452 | +} |
453 | + |
454 | +exception::exception( char const *function, char const *uri, CURLMcode code ) : |
455 | + std::exception(), |
456 | + msg_( curl_multi_strerror( code ) ) |
457 | +{ |
458 | +} |
459 | + |
460 | +exception::~exception() throw() { |
461 | + // out-of-line since it's virtual |
462 | +} |
463 | + |
464 | +const char* exception::what() const throw() { |
465 | + return msg_.c_str(); |
466 | +} |
467 | + |
468 | +/////////////////////////////////////////////////////////////////////////////// |
469 | + |
470 | +CURL* create( char const *uri, write_fn_t fn, void *data ) { |
471 | + // |
472 | + // Having cURL initialization wrapped by a class and using a singleton static |
473 | + // instance guarantees that cURL is initialized exactly once before use and |
474 | + // and also is cleaned-up at program termination (when destructors for static |
475 | + // objects are called). |
476 | + // |
477 | + struct curl_initializer { |
478 | + curl_initializer() { |
479 | + ZORBA_CURL_ASSERT( curl_global_init( CURL_GLOBAL_ALL ) ); |
480 | + } |
481 | + ~curl_initializer() { |
482 | + curl_global_cleanup(); |
483 | + } |
484 | + }; |
485 | + static curl_initializer initializer; |
486 | + |
487 | + CURL *const curl = curl_easy_init(); |
488 | + if ( !curl ) |
489 | + throw exception( "curl_easy_init()", uri, "" ); |
490 | + |
491 | + try { |
492 | + ZORBA_CURL_ASSERT( curl_easy_setopt( curl, CURLOPT_URL, uri ) ); |
493 | + ZORBA_CURL_ASSERT( curl_easy_setopt( curl, CURLOPT_WRITEDATA, data ) ); |
494 | + ZORBA_CURL_ASSERT( curl_easy_setopt( curl, CURLOPT_WRITEFUNCTION, fn ) ); |
495 | + |
496 | + // Tells cURL to follow redirects. CURLOPT_MAXREDIRS is by default set to -1 |
497 | + // thus cURL will do an infinite number of redirects. |
498 | + ZORBA_CURL_ASSERT( curl_easy_setopt( curl, CURLOPT_FOLLOWLOCATION, 1 ) ); |
499 | + |
500 | #ifndef ZORBA_VERIFY_PEER_SSL_CERTIFICATE |
501 | - ZORBA_CURL_ASSERT( curl_easy_setopt( curl, CURLOPT_SSL_VERIFYPEER, 0 ) ); |
502 | - // |
503 | - // CURLOPT_SSL_VERIFYHOST is left default, value 2, meaning verify that the |
504 | - // Common Name or Subject Alternate Name field in the certificate matches |
505 | - // the name of the server. |
506 | - // |
507 | - // Tested with https://www.npr.org/rss/rss.php?id=1001 |
508 | - // About using SSL certs in curl: http://curl.haxx.se/docs/sslcerts.html |
509 | + ZORBA_CURL_ASSERT( curl_easy_setopt( curl, CURLOPT_SSL_VERIFYPEER, 0 ) ); |
510 | + // |
511 | + // CURLOPT_SSL_VERIFYHOST is left default, value 2, meaning verify that the |
512 | + // Common Name or Subject Alternate Name field in the certificate matches |
513 | + // the name of the server. |
514 | + // |
515 | + // Tested with https://www.npr.org/rss/rss.php?id=1001 |
516 | + // About using SSL certs in curl: http://curl.haxx.se/docs/sslcerts.html |
517 | #else |
518 | # ifdef WIN32 |
519 | - // set the root CA certificates file path |
520 | - if ( GENV.g_curl_root_CA_certificates_path[0] ) |
521 | - ZORBA_CURL_ASSERT( |
522 | - curl_easy_setopt( |
523 | - curl, CURLOPT_CAINFO, GENV.g_curl_root_CA_certificates_path |
524 | - ) |
525 | - ); |
526 | + // set the root CA certificates file path |
527 | + if ( GENV.g_curl_root_CA_certificates_path[0] ) |
528 | + ZORBA_CURL_ASSERT( |
529 | + curl_easy_setopt( |
530 | + curl, CURLOPT_CAINFO, GENV.g_curl_root_CA_certificates_path |
531 | + ) |
532 | + ); |
533 | # endif /* WIN32 */ |
534 | #endif /* ZORBA_VERIFY_PEER_SSL_CERTIFICATE */ |
535 | - |
536 | - // |
537 | - // Some servers don't like requests that are made without a user-agent |
538 | - // field, so we provide one. |
539 | - // |
540 | - ZORBA_CURL_ASSERT( |
541 | - curl_easy_setopt( curl, CURLOPT_USERAGENT, "libcurl-agent/1.0" ) |
542 | - ); |
543 | - |
544 | - return curl; |
545 | - } |
546 | - catch ( ... ) { |
547 | - destroy( curl ); |
548 | - throw; |
549 | - } |
550 | - } |
551 | - |
552 | - void destroy( CURL *curl ) { |
553 | - if ( curl ) { |
554 | - curl_easy_reset( curl ); |
555 | - curl_easy_cleanup( curl ); |
556 | - } |
557 | - } |
558 | - |
559 | - /////////////////////////////////////////////////////////////////////////////// |
560 | - |
561 | - streambuf::streambuf() : theInformer(0), theOwnInformer(false) { |
562 | -#ifdef WIN32 |
563 | - theDummySocket = socket(AF_INET, SOCK_DGRAM, 0); |
564 | - if (theDummySocket == CURL_SOCKET_BAD || theDummySocket == INVALID_SOCKET) { |
565 | - std::cerr << "creating the socket failed" << std::endl; |
566 | - } |
567 | -#endif |
568 | - init(); |
569 | - } |
570 | - |
571 | - streambuf::streambuf( char const *uri ) : theInformer(0), theOwnInformer(false) { |
572 | -#ifdef WIN32 |
573 | - theDummySocket = socket(AF_INET, SOCK_DGRAM, 0); |
574 | - if (theDummySocket == CURL_SOCKET_BAD || theDummySocket == INVALID_SOCKET) { |
575 | - std::cerr << "creating the socket failed" << std::endl; |
576 | - } |
577 | -#endif |
578 | - init(); |
579 | - open( uri ); |
580 | - } |
581 | - |
582 | - int streambuf::multi_perform() { |
583 | - underflow(); |
584 | - CURLMsg* msg; |
585 | - int msgInQueue; |
586 | - int error = 0; |
587 | - while ((msg = curl_multi_info_read(curlm_, &msgInQueue))) { |
588 | - if (msg->msg == CURLMSG_DONE) { |
589 | - error = msg->data.result; |
590 | - } |
591 | - } |
592 | - return error; |
593 | - } |
594 | - |
595 | - streambuf::streambuf( CURL* aCurl) : theInformer(0), theOwnInformer(false) { |
596 | -#ifdef WIN32 |
597 | - theDummySocket = socket(AF_INET, SOCK_DGRAM, 0); |
598 | - if (theDummySocket == CURL_SOCKET_BAD || theDummySocket == INVALID_SOCKET) { |
599 | - std::cerr << "creating the socket failed" << std::endl; |
600 | - } |
601 | -#endif |
602 | - init(); |
603 | - curl_ = aCurl; |
604 | - ZORBA_CURL_ASSERT( curl_easy_setopt( aCurl, CURLOPT_WRITEDATA, this ) ); |
605 | - ZORBA_CURL_ASSERT( curl_easy_setopt( aCurl, CURLOPT_WRITEFUNCTION, curl_write_callback ) ); |
606 | - |
607 | - init_curlm(); |
608 | - } |
609 | - |
610 | - streambuf::~streambuf() { |
611 | - free( buf_ ); |
612 | - close(); |
613 | -#ifdef WIN32 |
614 | - closesocket(theDummySocket); |
615 | -#endif |
616 | - // If we have been assigned memory ownership of theInformer, delete it now. |
617 | - if (theOwnInformer) |
618 | - delete theInformer; |
619 | - } |
620 | - |
621 | - void streambuf::close() { |
622 | - if ( curl_ ) { |
623 | - if ( curlm_ ) { |
624 | - curl_multi_remove_handle( curlm_, curl_ ); |
625 | - curl_multi_cleanup( curlm_ ); |
626 | - curlm_ = 0; |
627 | - } |
628 | - destroy( curl_ ); |
629 | - curl_ = 0; |
630 | - } |
631 | - } |
632 | - |
633 | - void streambuf::curl_read() { |
634 | - buf_len_ = 0; |
635 | - while ( curl_running_ && !buf_len_ ) { |
636 | - fd_set fd_read, fd_write, fd_except; |
637 | - FD_ZERO( &fd_read ); |
638 | - FD_ZERO( &fd_write ); |
639 | - FD_ZERO( &fd_except ); |
640 | - int max_fd = -1; |
641 | -#ifdef WIN32 |
642 | - // Windows does not like a call to select where all arguments are 0. So |
643 | - // we just add a dummy socket to make the call to select happy. |
644 | - FD_SET (theDummySocket, &fd_read); |
645 | -#endif |
646 | - ZORBA_CURLM_ASSERT( |
647 | - curl_multi_fdset( curlm_, &fd_read, &fd_write, &fd_except, &max_fd ) |
648 | - ); |
649 | - |
650 | - // |
651 | - // Note that the fopen.c sample code is unnecessary at best or wrong at |
652 | - // worst; see: http://curl.haxx.se/mail/lib-2011-05/0011.html |
653 | - // |
654 | - timeval timeout; |
655 | - long curl_timeout_ms; |
656 | - ZORBA_CURLM_ASSERT( curl_multi_timeout( curlm_, &curl_timeout_ms ) ); |
657 | - if ( curl_timeout_ms > 0 ) { |
658 | - timeout.tv_sec = curl_timeout_ms / 1000; |
659 | - timeout.tv_usec = curl_timeout_ms % 1000 * 1000; |
660 | - } else { |
661 | - // |
662 | - // From curl_multi_timeout(3): |
663 | - // |
664 | - // Note: if libcurl returns a -1 timeout here, it just means that |
665 | - // libcurl currently has no stored timeout value. You must not wait |
666 | - // too long (more than a few seconds perhaps) before you call |
667 | - // curl_multi_perform() again. |
668 | - // |
669 | - // So we just pick some not-too-long default. |
670 | - // |
671 | - timeout.tv_sec = 1; |
672 | - timeout.tv_usec = 0; |
673 | - } |
674 | - |
675 | - switch ( select( max_fd + 1, &fd_read, &fd_write, &fd_except, &timeout ) ) { |
676 | - case -1: // select error |
677 | -#ifdef WIN32 |
678 | - std::cout << "Error = " << WSAGetLastError() << std::endl; |
679 | -#endif |
680 | - throw exception( "select()", "" ); |
681 | - case 0: // timeout |
682 | - // no break; |
683 | - default: |
684 | - CURLMcode code; |
685 | - do { |
686 | - code = curl_multi_perform( curlm_, &curl_running_ ); |
687 | - } while ( code == CURLM_CALL_MULTI_PERFORM ); |
688 | - ZORBA_CURLM_ASSERT( code ); |
689 | - } |
690 | - } |
691 | - if (theInformer) { |
692 | - theInformer->afterRead(); |
693 | - } |
694 | - } |
695 | - |
696 | - size_t streambuf::curl_write_callback( void *ptr, size_t size, size_t nmemb, |
697 | - void *data ) { |
698 | - size *= nmemb; |
699 | - streambuf *const that = static_cast<streambuf*>( data ); |
700 | - |
701 | - std::streamoff buf_free = that->buf_capacity_ - that->buf_len_; |
702 | - if (that->theInformer) { |
703 | - that->theInformer->beforeRead(); |
704 | - } |
705 | - if ( size > buf_free ) { |
706 | - std::streamoff new_capacity = that->buf_capacity_ + size - buf_free; |
707 | - if ( void *const new_buf = realloc( that->buf_, static_cast<size_t>(new_capacity) ) ) { |
708 | - that->buf_ = static_cast<char*>( new_buf ); |
709 | - that->buf_capacity_ = new_capacity; |
710 | - } else |
711 | - throw exception( "realloc()", "" ); |
712 | - } |
713 | - ::memcpy( that->buf_ + that->buf_len_, ptr, size ); |
714 | - that->buf_len_ += size; |
715 | - return size; |
716 | - } |
717 | - |
718 | - void streambuf::init() { |
719 | - buf_ = 0; |
720 | - buf_capacity_ = 0; |
721 | - buf_len_ = 0; |
722 | - curl_ = 0; |
723 | - curlm_ = 0; |
724 | - curl_running_ = 0; |
725 | - } |
726 | - |
727 | - void streambuf::init_curlm() { |
728 | - // |
729 | - // Lie about cURL running initially so the while-loop in curl_read() will run |
730 | - // at least once. |
731 | - // |
732 | - curl_running_ = 1; |
733 | - |
734 | - // |
735 | - // Set the "get" pointer to the end (gptr() == egptr()) so a call to |
736 | - // underflow() and initial data read will be triggered. |
737 | - // |
738 | - buf_len_ = buf_capacity_; |
739 | - setg( buf_, buf_ + buf_len_, buf_ + buf_capacity_ ); |
740 | - |
741 | - // |
742 | - // Clean-up has to be done here with try/catch (as opposed to relying on the |
743 | - // destructor) because open() can be called from the constructor. If an |
744 | - // exception is thrown, the constructor will not have completed, hence the |
745 | - // object will not have been fully constructed; therefore the destructor will |
746 | - // not be called. |
747 | - // |
748 | - try { |
749 | - if ( !(curlm_ = curl_multi_init()) ) |
750 | - throw exception( "curl_multi_init()", "" ); |
751 | - try { |
752 | - ZORBA_CURLM_ASSERT( curl_multi_add_handle( curlm_, curl_ ) ); |
753 | - } |
754 | - catch ( ... ) { |
755 | - curl_multi_cleanup( curlm_ ); |
756 | - curlm_ = 0; |
757 | - throw; |
758 | - } |
759 | - } |
760 | - catch ( ... ) { |
761 | - destroy( curl_ ); |
762 | - curl_ = 0; |
763 | - throw; |
764 | - } |
765 | - } |
766 | - |
767 | - void streambuf::open( char const *uri ) { |
768 | - curl_ = create( uri, curl_write_callback, this ); |
769 | - |
770 | - init_curlm(); |
771 | - } |
772 | - |
773 | - streamsize streambuf::showmanyc() { |
774 | - return egptr() - gptr(); |
775 | - } |
776 | - |
777 | - streambuf::int_type streambuf::underflow() { |
778 | - while ( true ) { |
779 | - if ( gptr() < egptr() ) |
780 | - return traits_type::to_int_type( *gptr() ); |
781 | - curl_read(); |
782 | - if ( !buf_len_ ) |
783 | - return traits_type::eof(); |
784 | - setg( buf_, buf_, buf_ + buf_len_ ); |
785 | - } |
786 | - } |
787 | - |
788 | - /////////////////////////////////////////////////////////////////////////////// |
789 | - |
790 | - } // namespace curl |
791 | + |
792 | + // |
793 | + // Some servers don't like requests that are made without a user-agent |
794 | + // field, so we provide one. |
795 | + // |
796 | + ZORBA_CURL_ASSERT( |
797 | + curl_easy_setopt( curl, CURLOPT_USERAGENT, "libcurl-agent/1.0" ) |
798 | + ); |
799 | + |
800 | + return curl; |
801 | + } |
802 | + catch ( ... ) { |
803 | + destroy( curl ); |
804 | + throw; |
805 | + } |
806 | +} |
807 | + |
808 | +void destroy( CURL *curl ) { |
809 | + if ( curl ) { |
810 | + curl_easy_reset( curl ); |
811 | + curl_easy_cleanup( curl ); |
812 | + } |
813 | +} |
814 | + |
815 | +/////////////////////////////////////////////////////////////////////////////// |
816 | + |
817 | +streambuf::streambuf() { |
818 | + init(); |
819 | +} |
820 | + |
821 | +streambuf::streambuf( char const *uri ) { |
822 | + init(); |
823 | + open( uri ); |
824 | +} |
825 | + |
826 | +streambuf::streambuf( CURL *curl ) { |
827 | + init(); |
828 | + curl_ = curl; |
829 | + ZORBA_CURL_ASSERT( curl_easy_setopt( curl, CURLOPT_WRITEDATA, this ) ); |
830 | + ZORBA_CURL_ASSERT( curl_easy_setopt( curl, CURLOPT_WRITEFUNCTION, curl_write_callback ) ); |
831 | + init_curlm(); |
832 | +} |
833 | + |
834 | +streambuf::~streambuf() { |
835 | + free( buf_ ); |
836 | + close(); |
837 | +#ifdef WIN32 |
838 | + closesocket( dummy_socket_ ); |
839 | +#endif |
840 | + // If we have been assigned memory ownership of theInformer, delete it now. |
841 | + if ( theOwnInformer ) |
842 | + delete theInformer; |
843 | +} |
844 | + |
845 | +void streambuf::close() { |
846 | + if ( curl_ ) { |
847 | + if ( curlm_ ) { |
848 | + curl_multi_remove_handle( curlm_, curl_ ); |
849 | + curl_multi_cleanup( curlm_ ); |
850 | + curlm_ = 0; |
851 | + } |
852 | + destroy( curl_ ); |
853 | + curl_ = 0; |
854 | + } |
855 | +} |
856 | + |
857 | +void streambuf::curl_read() { |
858 | + buf_len_ = 0; |
859 | + while ( curl_running_ && !buf_len_ ) { |
860 | + fd_set fd_read, fd_write, fd_except; |
861 | + FD_ZERO( &fd_read ); |
862 | + FD_ZERO( &fd_write ); |
863 | + FD_ZERO( &fd_except ); |
864 | + int max_fd = -1; |
865 | +#ifdef WIN32 |
866 | + // |
867 | + // Windows does not like a call to select where all arguments are 0, so we |
868 | + // just add a dummy socket to make the call to select happy. |
869 | + // |
870 | + FD_SET( dummy_socket_, &fd_read ); |
871 | +#endif /* WIN32 */ |
872 | + ZORBA_CURLM_ASSERT( |
873 | + curl_multi_fdset( curlm_, &fd_read, &fd_write, &fd_except, &max_fd ) |
874 | + ); |
875 | + |
876 | + // |
877 | + // Note that the fopen.c sample code is unnecessary at best or wrong at |
878 | + // worst; see: http://curl.haxx.se/mail/lib-2011-05/0011.html |
879 | + // |
880 | + timeval timeout; |
881 | + long curl_timeout_ms; |
882 | + ZORBA_CURLM_ASSERT( curl_multi_timeout( curlm_, &curl_timeout_ms ) ); |
883 | + if ( curl_timeout_ms > 0 ) { |
884 | + timeout.tv_sec = curl_timeout_ms / 1000; |
885 | + timeout.tv_usec = curl_timeout_ms % 1000 * 1000; |
886 | + } else { |
887 | + // |
888 | + // From curl_multi_timeout(3): |
889 | + // |
890 | + // Note: if libcurl returns a -1 timeout here, it just means that |
891 | + // libcurl currently has no stored timeout value. You must not wait |
892 | + // too long (more than a few seconds perhaps) before you call |
893 | + // curl_multi_perform() again. |
894 | + // |
895 | + // So we just pick some not-too-long default. |
896 | + // |
897 | + timeout.tv_sec = 1; |
898 | + timeout.tv_usec = 0; |
899 | + } |
900 | + |
901 | + switch ( select( max_fd + 1, &fd_read, &fd_write, &fd_except, &timeout ) ) { |
902 | + case -1: // select error |
903 | +#ifdef WIN32 |
904 | + char err_buf[8]; |
905 | + sprintf( err_buf, "%d", WSAGetLastError() ); |
906 | + throw exception( "select()", "", err_buf ); |
907 | +#else |
908 | + throw exception( "select()", "", strerror( errno ) ); |
909 | +#endif |
910 | + case 0: // timeout |
911 | + // no break; |
912 | + default: |
913 | + CURLMcode code; |
914 | + do { |
915 | + code = curl_multi_perform( curlm_, &curl_running_ ); |
916 | + } while ( code == CURLM_CALL_MULTI_PERFORM ); |
917 | + ZORBA_CURLM_ASSERT( code ); |
918 | + } |
919 | + } |
920 | + if ( theInformer ) |
921 | + theInformer->afterRead(); |
922 | +} |
923 | + |
924 | +size_t streambuf::curl_write_callback( void *ptr, size_t size, size_t nmemb, |
925 | + void *data ) { |
926 | + size *= nmemb; |
927 | + streambuf *const that = static_cast<streambuf*>( data ); |
928 | + |
929 | + if ( that->theInformer ) |
930 | + that->theInformer->beforeRead(); |
931 | + |
932 | + size_t const buf_free = that->buf_capacity_ - that->buf_len_; |
933 | + if ( size > buf_free ) { |
934 | + streamoff new_capacity = that->buf_capacity_ + size - buf_free; |
935 | + if ( void *const new_buf = |
936 | + realloc( that->buf_, static_cast<size_t>( new_capacity ) ) ) { |
937 | + that->buf_ = static_cast<char*>( new_buf ); |
938 | + that->buf_capacity_ = new_capacity; |
939 | + } else |
940 | + throw exception( "realloc()", "" ); |
941 | + } |
942 | + ::memcpy( that->buf_ + that->buf_len_, ptr, size ); |
943 | + that->buf_len_ += size; |
944 | + return size; |
945 | +} |
946 | + |
947 | +void streambuf::init() { |
948 | + buf_ = 0; |
949 | + buf_capacity_ = 0; |
950 | + buf_len_ = 0; |
951 | + curl_ = 0; |
952 | + curlm_ = 0; |
953 | + curl_running_ = 0; |
954 | + theInformer = 0; |
955 | + theOwnInformer = false; |
956 | +#ifdef WIN32 |
957 | + dummy_socket_ = socket( AF_INET, SOCK_DGRAM, 0 ); |
958 | + if ( dummy_socket_ == CURL_SOCKET_BAD || dummy_socket_ == INVALID_SOCKET ) |
959 | + throw exception( "socket()", "" ); |
960 | +#endif /* WIN32 */ |
961 | +} |
962 | + |
963 | +void streambuf::init_curlm() { |
964 | + // |
965 | + // Lie about cURL running initially so the while-loop in curl_read() will run |
966 | + // at least once. |
967 | + // |
968 | + curl_running_ = 1; |
969 | + |
970 | + // |
971 | + // Set the "get" pointer to the end (gptr() == egptr()) so a call to |
972 | + // underflow() and initial data read will be triggered. |
973 | + // |
974 | + buf_len_ = buf_capacity_; |
975 | + setg( buf_, buf_ + buf_len_, buf_ + buf_capacity_ ); |
976 | + |
977 | + // |
978 | + // Clean-up has to be done here with try/catch (as opposed to relying on the |
979 | + // destructor) because open() can be called from the constructor. If an |
980 | + // exception is thrown, the constructor will not have completed, hence the |
981 | + // object will not have been fully constructed; therefore the destructor will |
982 | + // not be called. |
983 | + // |
984 | + try { |
985 | + if ( !(curlm_ = curl_multi_init()) ) |
986 | + throw exception( "curl_multi_init()", "" ); |
987 | + try { |
988 | + ZORBA_CURLM_ASSERT( curl_multi_add_handle( curlm_, curl_ ) ); |
989 | + } |
990 | + catch ( ... ) { |
991 | + curl_multi_cleanup( curlm_ ); |
992 | + curlm_ = 0; |
993 | + throw; |
994 | + } |
995 | + } |
996 | + catch ( ... ) { |
997 | + destroy( curl_ ); |
998 | + curl_ = 0; |
999 | + throw; |
1000 | + } |
1001 | +} |
1002 | + |
1003 | +int streambuf::multi_perform() { |
1004 | + underflow(); |
1005 | + CURLMsg *msg; |
1006 | + int msgInQueue; |
1007 | + int error = 0; |
1008 | + while ( (msg = curl_multi_info_read( curlm_, &msgInQueue )) ) { |
1009 | + if ( msg->msg == CURLMSG_DONE ) |
1010 | + error = msg->data.result; |
1011 | + } |
1012 | + return error; |
1013 | +} |
1014 | + |
1015 | +void streambuf::open( char const *uri ) { |
1016 | + curl_ = create( uri, curl_write_callback, this ); |
1017 | + |
1018 | + init_curlm(); |
1019 | +} |
1020 | + |
1021 | +streamsize streambuf::showmanyc() { |
1022 | + return egptr() - gptr(); |
1023 | +} |
1024 | + |
1025 | +streambuf::int_type streambuf::underflow() { |
1026 | + while ( true ) { |
1027 | + if ( gptr() < egptr() ) |
1028 | + return traits_type::to_int_type( *gptr() ); |
1029 | + curl_read(); |
1030 | + if ( !buf_len_ ) |
1031 | + return traits_type::eof(); |
1032 | + setg( buf_, buf_, buf_ + buf_len_ ); |
1033 | + } |
1034 | +} |
1035 | + |
1036 | +/////////////////////////////////////////////////////////////////////////////// |
1037 | + |
1038 | +} // namespace curl |
1039 | } // namespace zorba |
1040 | +/* vim:set et sw=2 ts=2: */ |
1041 | |
1042 | === modified file 'modules/com/zorba-xquery/www/modules/http-client.xq.src/curl_stream_buffer.h' |
1043 | --- modules/com/zorba-xquery/www/modules/http-client.xq.src/curl_stream_buffer.h 2011-07-29 08:12:36 +0000 |
1044 | +++ modules/com/zorba-xquery/www/modules/http-client.xq.src/curl_stream_buffer.h 2012-02-16 02:19:18 +0000 |
1045 | @@ -19,154 +19,175 @@ |
1046 | |
1047 | #include <zorba/config.h> |
1048 | |
1049 | +#include <exception> |
1050 | #include <istream> |
1051 | -#include <exception> |
1052 | #include <streambuf> |
1053 | +#include <string> |
1054 | #include <curl/curl.h> |
1055 | |
1056 | namespace zorba { |
1057 | - |
1058 | - namespace http_client { |
1059 | - class InformDataRead; |
1060 | - } |
1061 | - |
1062 | - namespace curl { |
1063 | - |
1064 | - class exception : public std::exception { |
1065 | - public: |
1066 | - exception( char const *function, char const *uri, char const *msg = 0 ); |
1067 | - exception( char const *function, char const *uri, CURLcode code ); |
1068 | - exception( char const *function, char const *uri, CURLMcode code ); |
1069 | - public: |
1070 | - virtual const char* what() const throw(); |
1071 | - private: |
1072 | - const char* theMessage; |
1073 | - }; |
1074 | - |
1075 | - |
1076 | - |
1077 | - ////////// create & destroy /////////////////////////////////////////////////// |
1078 | - |
1079 | - /** |
1080 | - * The signature type of cURL's write function callback. |
1081 | - */ |
1082 | - typedef size_t (*write_fn_t)( void*, size_t, size_t, void* ); |
1083 | - |
1084 | - /** |
1085 | - * Creates a new, initialized cURL instance. |
1086 | - * |
1087 | - * @throws exception upon failure. |
1088 | - */ |
1089 | - CURL* create( char const *uri, write_fn_t fn, void *data ); |
1090 | - |
1091 | - /** |
1092 | - * Destroys a cURL instance. |
1093 | - * |
1094 | - * @param instance A cURL instance. If \c NULL, does nothing. |
1095 | - */ |
1096 | - void destroy( CURL *instance ); |
1097 | - |
1098 | - ////////// streambuf ////////////////////////////////////////////////////////// |
1099 | - |
1100 | - /** |
1101 | - * A curl::streambuf is-a std::streambuf for streaming the contents of URI |
1102 | - * using cURL. However, do not use this class directly. Use uri::streambuf |
1103 | - * instead. |
1104 | - */ |
1105 | - class streambuf : public std::streambuf { |
1106 | - public: |
1107 | - /** |
1108 | - * Constructs a %streambuf. |
1109 | - */ |
1110 | - streambuf(); |
1111 | - |
1112 | - /** |
1113 | - * Constructs a %streambuf and opens a connection to the server hosting the |
1114 | - * given URI for subsequent streaming. |
1115 | - * |
1116 | - * @param uri The URI to stream. |
1117 | - */ |
1118 | - streambuf( char const *uri ); |
1119 | - |
1120 | - /** |
1121 | - * In case we already have a curl object, which was set up somewhere else, we |
1122 | - * take it here as an arument. This takes ownership over the object. |
1123 | - */ |
1124 | - streambuf( CURL* aCurl ); |
1125 | - |
1126 | - /** |
1127 | - * Destroys a %streambuf. |
1128 | - */ |
1129 | - ~streambuf(); |
1130 | - |
1131 | - /** |
1132 | - * Opens a connection to the server hosting the given URI for subsequent |
1133 | - * streaming. |
1134 | - * |
1135 | - * @param uri The URI to stream. |
1136 | - * @throws exception upon failure. |
1137 | - */ |
1138 | - void open( char const *uri ); |
1139 | - |
1140 | - /** |
1141 | - * Tests whether the buffer is open. |
1142 | - * |
1143 | - * @return Returns \c true only if the buffer is open. |
1144 | - */ |
1145 | - bool is_open() const { |
1146 | - return !!curl_; |
1147 | - } |
1148 | - |
1149 | - /** |
1150 | - * Closes this %streambuf. |
1151 | - */ |
1152 | - void close(); |
1153 | - |
1154 | - /** |
1155 | - * Provide a InformDataRead that will get callbacks about read events. |
1156 | - */ |
1157 | - void setInformer(::zorba::http_client::InformDataRead* aInformer) { theInformer = aInformer; } |
1158 | - |
1159 | - /** |
1160 | - * Specify whether this streambuf has memory ownership over the |
1161 | - * InformDataRead it has been passed. You can use this if, for example, |
1162 | - * the lifetime of the streambuf will extend past the lifetime of the |
1163 | - * object which created the InformDataRead. |
1164 | - */ |
1165 | - void setOwnInformer(bool aOwnInformer) { theOwnInformer = aOwnInformer; } |
1166 | - |
1167 | - int multi_perform(); |
1168 | - |
1169 | - protected: |
1170 | - // inherited |
1171 | - std::streamsize showmanyc(); |
1172 | - int_type underflow(); |
1173 | - |
1174 | - private: |
1175 | - void curl_read(); |
1176 | - static size_t curl_write_callback( void*, size_t, size_t, void* ); |
1177 | - |
1178 | - void init(); |
1179 | - void init_curlm(); |
1180 | - |
1181 | - char *buf_; |
1182 | - std::streamsize buf_capacity_; |
1183 | - std::streamoff buf_len_; |
1184 | - |
1185 | - CURL *curl_; |
1186 | - CURLM *curlm_; |
1187 | - int curl_running_; |
1188 | - ::zorba::http_client::InformDataRead* theInformer; |
1189 | - bool theOwnInformer; |
1190 | - |
1191 | - // forbid |
1192 | - streambuf( streambuf const& ); |
1193 | - streambuf& operator=( streambuf const& ); |
1194 | + |
1195 | +namespace http_client { |
1196 | + class InformDataRead; |
1197 | +} |
1198 | + |
1199 | +namespace curl { |
1200 | + |
1201 | +/////////////////////////////////////////////////////////////////////////////// |
1202 | + |
1203 | +class exception : public std::exception { |
1204 | +public: |
1205 | + exception( char const *function, char const *uri, char const *msg = 0 ); |
1206 | + exception( char const *function, char const *uri, CURLcode code ); |
1207 | + exception( char const *function, char const *uri, CURLMcode code ); |
1208 | + ~exception() throw(); |
1209 | + |
1210 | + virtual const char* what() const throw(); |
1211 | + |
1212 | +private: |
1213 | + std::string msg_; |
1214 | +}; |
1215 | + |
1216 | +////////// create & destroy /////////////////////////////////////////////////// |
1217 | + |
1218 | +/** |
1219 | + * The signature type of cURL's write function callback. |
1220 | + */ |
1221 | +typedef size_t (*write_fn_t)( void*, size_t, size_t, void* ); |
1222 | + |
1223 | +/** |
1224 | + * Creates a new, initialized cURL instance. |
1225 | + * |
1226 | + * @throws exception upon failure. |
1227 | + */ |
1228 | +CURL* create( char const *uri, write_fn_t fn, void *data ); |
1229 | + |
1230 | +/** |
1231 | + * Destroys a cURL instance. |
1232 | + * |
1233 | + * @param instance A cURL instance. If \c NULL, does nothing. |
1234 | + */ |
1235 | +void destroy( CURL *instance ); |
1236 | + |
1237 | +////////// streambuf ////////////////////////////////////////////////////////// |
1238 | + |
1239 | +/** |
1240 | + * A curl::streambuf is-a std::streambuf for streaming the contents of a URI |
1241 | + * using cURL. However, do not use this class directly. Use uri::streambuf |
1242 | + * instead. |
1243 | + */ |
1244 | +class streambuf : public std::streambuf { |
1245 | +public: |
1246 | + /** |
1247 | + * Constructs a %streambuf. |
1248 | + */ |
1249 | + streambuf(); |
1250 | + |
1251 | + /** |
1252 | + * Constructs a %streambuf and opens a connection to the server hosting the |
1253 | + * given URI for subsequent streaming. |
1254 | + * |
1255 | + * @param uri The URI to stream. |
1256 | + */ |
1257 | + streambuf( char const *uri ); |
1258 | + |
1259 | + /** |
1260 | + * Constructs a %streambuf using an existing CURL object. |
1261 | + * |
1262 | + * @param curl The CURL object to use. This %streambuf takes ownership of |
1263 | + * it. |
1264 | + */ |
1265 | + streambuf( CURL *curl ); |
1266 | + |
1267 | + /** |
1268 | + * Destroys a %streambuf. |
1269 | + */ |
1270 | + ~streambuf(); |
1271 | + |
1272 | + /** |
1273 | + * Opens a connection to the server hosting the given URI for subsequent |
1274 | + * streaming. |
1275 | + * |
1276 | + * @param uri The URI to stream. |
1277 | + * @throws exception upon failure. |
1278 | + */ |
1279 | + void open( char const *uri ); |
1280 | + |
1281 | + /** |
1282 | + * Tests whether the buffer is open. |
1283 | + * |
1284 | + * @return Returns \c true only if the buffer is open. |
1285 | + */ |
1286 | + bool is_open() const { |
1287 | + return !!curl_; |
1288 | + } |
1289 | + |
1290 | + /** |
1291 | + * Closes this %streambuf. |
1292 | + */ |
1293 | + void close(); |
1294 | + |
1295 | + /** |
1296 | + * Gets the CURL object in use. |
1297 | + * |
1298 | + * @return Returns said CURL object. |
1299 | + */ |
1300 | + CURL* curl() const { |
1301 | + return curl_; |
1302 | + } |
1303 | + |
1304 | + /** |
1305 | + * Provide an InformDataRead that will get callbacks about read events. |
1306 | + */ |
1307 | + void setInformer( http_client::InformDataRead *aInformer ) { |
1308 | + theInformer = aInformer; |
1309 | + } |
1310 | + |
1311 | + /** |
1312 | + * Specify whether this streambuf has memory ownership over the |
1313 | + * InformDataRead it has been passed. You can use this if, for example, |
1314 | + * the lifetime of the streambuf will extend past the lifetime of the |
1315 | + * object which created the InformDataRead. |
1316 | + */ |
1317 | + void setOwnInformer( bool aOwnInformer ) { |
1318 | + theOwnInformer = aOwnInformer; |
1319 | + } |
1320 | + |
1321 | + int multi_perform(); |
1322 | + |
1323 | +protected: |
1324 | + // inherited |
1325 | + std::streamsize showmanyc(); |
1326 | + int_type underflow(); |
1327 | + |
1328 | +private: |
1329 | + void curl_read(); |
1330 | + static size_t curl_write_callback( void*, size_t, size_t, void* ); |
1331 | + |
1332 | + void init(); |
1333 | + void init_curlm(); |
1334 | + |
1335 | + char *buf_; |
1336 | + std::streamsize buf_capacity_; |
1337 | + std::streamoff buf_len_; |
1338 | + |
1339 | + CURL *curl_; |
1340 | + CURLM *curlm_; |
1341 | + int curl_running_; |
1342 | + http_client::InformDataRead *theInformer; |
1343 | + bool theOwnInformer; |
1344 | + |
1345 | + // forbid |
1346 | + streambuf( streambuf const& ); |
1347 | + streambuf& operator=( streambuf const& ); |
1348 | #ifdef WIN32 |
1349 | - SOCKET theDummySocket; |
1350 | -#endif |
1351 | - }; |
1352 | - |
1353 | - } // namespace curl |
1354 | + SOCKET dummy_socket_; |
1355 | +#endif /* WIN32 */ |
1356 | +}; |
1357 | + |
1358 | +/////////////////////////////////////////////////////////////////////////////// |
1359 | + |
1360 | +} // namespace curl |
1361 | } // namespace zorba |
1362 | #endif /* ZORBA_CURL_UTIL_H */ |
1363 | +/* vim:set et sw=2 ts=2: */ |
1364 | |
1365 | === modified file 'modules/com/zorba-xquery/www/modules/http-client.xq.src/http_response_parser.cpp' |
1366 | --- modules/com/zorba-xquery/www/modules/http-client.xq.src/http_response_parser.cpp 2011-07-29 08:12:36 +0000 |
1367 | +++ modules/com/zorba-xquery/www/modules/http-client.xq.src/http_response_parser.cpp 2012-02-16 02:19:18 +0000 |
1368 | @@ -26,12 +26,44 @@ |
1369 | #include <zorba/error.h> |
1370 | #include <zorba/xquery_exception.h> |
1371 | #include <zorba/xquery_functions.h> |
1372 | +#include <zorba/transcode_stream.h> |
1373 | |
1374 | #include "http_response_parser.h" |
1375 | #include "http_request_handler.h" |
1376 | #include "curl_stream_buffer.h" |
1377 | |
1378 | -namespace zorba { namespace http_client { |
1379 | +namespace zorba { |
1380 | + |
1381 | +static void parse_content_type( std::string const &s, std::string *mime_type, |
1382 | + std::string *charset ) { |
1383 | + std::string::size_type pos = s.find( ';' ); |
1384 | + *mime_type = s.substr( 0, pos ); |
1385 | + |
1386 | + if ( pos != std::string::npos ) { |
1387 | + // |
1388 | + // Parse: charset="?XXXXX"?[ (comment)] |
1389 | + // |
1390 | + if ( (pos = s.find( '=' )) != std::string::npos ) { |
1391 | + std::string t = s.substr( pos + 1 ); |
1392 | + if ( !t.empty() ) { |
1393 | + if ( t[0] == '"' ) { |
1394 | + t.erase( 0, 1 ); |
1395 | + if ( (pos = t.find( '"' )) != std::string::npos ) |
1396 | + t.erase( pos ); |
1397 | + } else { |
1398 | + if ( (pos = t.find( ' ' )) != std::string::npos ) |
1399 | + t.erase( pos ); |
1400 | + } |
1401 | + *charset = t; |
1402 | + } |
1403 | + } |
1404 | + } else { |
1405 | + // The HTTP/1.1 spec says that the default charset is ISO-8859-1. |
1406 | + *charset = "ISO-8859-1"; |
1407 | + } |
1408 | +} |
1409 | + |
1410 | +namespace http_client { |
1411 | |
1412 | HttpResponseParser::HttpResponseParser(RequestHandler& aHandler, CURL* aCurl, |
1413 | ErrorThrower& aErrorThrower, |
1414 | @@ -60,19 +92,30 @@ |
1415 | if (lCode) |
1416 | return lCode; |
1417 | if (!theStatusOnly) { |
1418 | - std::auto_ptr<std::istream> lStream(new std::istream(theStreamBuffer)); |
1419 | + |
1420 | + if (!theOverridenContentType.empty()) { |
1421 | + parse_content_type( |
1422 | + theOverridenContentType, &theCurrentContentType, &theCurrentCharset |
1423 | + ); |
1424 | + } |
1425 | + |
1426 | + std::auto_ptr<std::istream> lStream; |
1427 | + if ( transcode::is_necessary( theCurrentCharset.c_str() ) ) { |
1428 | + lStream.reset( |
1429 | + new transcode::stream<std::istream>( |
1430 | + theCurrentCharset.c_str(), theStreamBuffer |
1431 | + ) |
1432 | + ); |
1433 | + } else |
1434 | + lStream.reset(new std::istream(theStreamBuffer)); |
1435 | + |
1436 | Item lItem; |
1437 | - if (theOverridenContentType != "") { |
1438 | - theCurrentContentType = theOverridenContentType; |
1439 | - } |
1440 | if (theCurrentContentType == "text/xml" || |
1441 | theCurrentContentType == "application/xml" || |
1442 | theCurrentContentType == "text/xml-external-parsed-entity" || |
1443 | theCurrentContentType == "application/xml-external-parsed-entity" || |
1444 | theCurrentContentType.find("+xml") == theCurrentContentType.size()-4) { |
1445 | lItem = createXmlItem(*lStream.get()); |
1446 | - } else if (theCurrentContentType.find("text/html") == 0) { |
1447 | - lItem = createTextItem(lStream.release()); |
1448 | } else if (theCurrentContentType.find("text/") == 0) { |
1449 | lItem = createTextItem(lStream.release()); |
1450 | } else { |
1451 | @@ -106,8 +149,8 @@ |
1452 | } |
1453 | theInsideRead = true; |
1454 | theHandler.beginResponse(theStatus, theMessage); |
1455 | - std::vector<std::pair<std::string, std::string> >::iterator lIter; |
1456 | - for (lIter = theHeaders.begin(); lIter != theHeaders.end(); ++lIter) { |
1457 | + for ( headers_type::const_iterator |
1458 | + lIter = theHeaders.begin(); lIter != theHeaders.end(); ++lIter) { |
1459 | theHandler.header(lIter->first, lIter->second); |
1460 | } |
1461 | if (!theStatusOnly) |
1462 | @@ -120,23 +163,20 @@ |
1463 | |
1464 | void HttpResponseParser::registerHandler() |
1465 | { |
1466 | - curl_easy_setopt(theCurl, CURLOPT_HEADERFUNCTION, |
1467 | - &HttpResponseParser::headerfunction); |
1468 | + curl_easy_setopt(theCurl, CURLOPT_HEADERFUNCTION, &curl_headerfunction); |
1469 | curl_easy_setopt(theCurl, CURLOPT_HEADERDATA, this); |
1470 | } |
1471 | |
1472 | - size_t HttpResponseParser::headerfunction(void *ptr, |
1473 | - size_t size, |
1474 | - size_t nmemb, |
1475 | - void *stream) |
1476 | + size_t HttpResponseParser::curl_headerfunction( void *ptr, size_t size, |
1477 | + size_t nmemb, void *data ) |
1478 | { |
1479 | size_t lSize = size*nmemb; |
1480 | size_t lResult = lSize; |
1481 | - HttpResponseParser* lParser = static_cast<HttpResponseParser*>(stream); |
1482 | + HttpResponseParser* lParser = static_cast<HttpResponseParser*>(data); |
1483 | if (lParser->theInsideRead) { |
1484 | lParser->theHandler.endBody(); |
1485 | + lParser->theInsideRead = false; |
1486 | } |
1487 | - lParser->theInsideRead = false; |
1488 | const char* lDataChar = (const char*) ptr; |
1489 | while (lSize != 0 && (lDataChar[lSize - 1] == 10 |
1490 | || lDataChar[lSize - 1] == 13)) { |
1491 | @@ -173,7 +213,9 @@ |
1492 | } |
1493 | String lNameS = fn::lower_case( lName ); |
1494 | if (lNameS == "content-type") { |
1495 | - lParser->theCurrentContentType = lValue.substr(0, lValue.find(';')); |
1496 | + parse_content_type( |
1497 | + lValue, &lParser->theCurrentContentType, &lParser->theCurrentCharset |
1498 | + ); |
1499 | } else if (lNameS == "content-id") { |
1500 | lParser->theId = lValue; |
1501 | } else if (lNameS == "content-description") { |
1502 | @@ -184,7 +226,7 @@ |
1503 | return lResult; |
1504 | } |
1505 | |
1506 | - void HttpResponseParser::parseStatusAndMessage(std::string aHeader) |
1507 | + void HttpResponseParser::parseStatusAndMessage(std::string const &aHeader) |
1508 | { |
1509 | std::string::size_type lPos = aHeader.find(' '); |
1510 | assert(lPos != std::string::npos); |
1511 | @@ -215,7 +257,12 @@ |
1512 | static void streamReleaser(std::istream* aStream) |
1513 | { |
1514 | // This istream contains our curl stream buffer, so we have to delete it too |
1515 | - delete aStream->rdbuf(); |
1516 | + std::streambuf *const sbuf = aStream->rdbuf(); |
1517 | + if ( transcode::streambuf *tbuf = |
1518 | + dynamic_cast<transcode::streambuf*>( sbuf ) ) |
1519 | + delete tbuf->orig_streambuf(); |
1520 | + else |
1521 | + delete sbuf; |
1522 | delete aStream; |
1523 | } |
1524 | |
1525 | @@ -265,4 +312,7 @@ |
1526 | return Item(); |
1527 | } |
1528 | } |
1529 | -}} |
1530 | + |
1531 | +} // namespace http_client |
1532 | +} // namespace zorba |
1533 | +/* vim:set et sw=2 ts=2: */ |
1534 | |
1535 | === modified file 'modules/com/zorba-xquery/www/modules/http-client.xq.src/http_response_parser.h' |
1536 | --- modules/com/zorba-xquery/www/modules/http-client.xq.src/http_response_parser.h 2011-07-29 08:12:36 +0000 |
1537 | +++ modules/com/zorba-xquery/www/modules/http-client.xq.src/http_response_parser.h 2012-02-16 02:19:18 +0000 |
1538 | @@ -31,6 +31,7 @@ |
1539 | namespace curl { |
1540 | class streambuf; |
1541 | } |
1542 | + |
1543 | namespace http_client { |
1544 | class RequestHandler; |
1545 | |
1546 | @@ -40,7 +41,9 @@ |
1547 | CURL* theCurl; |
1548 | ErrorThrower& theErrorThrower; |
1549 | std::string theCurrentContentType; |
1550 | - std::vector<std::pair<std::string, std::string> > theHeaders; |
1551 | + std::string theCurrentCharset; |
1552 | + typedef std::vector<std::pair<std::string, std::string> > headers_type; |
1553 | + headers_type theHeaders; |
1554 | int theStatus; |
1555 | std::string theMessage; |
1556 | zorba::curl::streambuf* theStreamBuffer; |
1557 | @@ -74,15 +77,16 @@ |
1558 | virtual void afterRead(); |
1559 | private: |
1560 | void registerHandler(); |
1561 | - void parseStatusAndMessage(std::string aHeader); |
1562 | + void parseStatusAndMessage(std::string const &aHeader); |
1563 | Item createXmlItem(std::istream& aStream); |
1564 | Item createHtmlItem(std::istream& aStream); |
1565 | Item createTextItem(std::istream* aStream); |
1566 | Item createBase64Item(std::istream& aStream); |
1567 | - public: //Handler |
1568 | - static size_t headerfunction( void *ptr, size_t size, size_t nmemb, |
1569 | - void *stream); |
1570 | + |
1571 | + static size_t curl_headerfunction( void*, size_t, size_t, void* ); |
1572 | }; |
1573 | -}} // namespace zorba, http_client |
1574 | + |
1575 | +} // namespace http_client |
1576 | +} // namespace zorba |
1577 | |
1578 | #endif //HTTP_RESPONSE_PARSER_H |
1579 | |
1580 | === modified file 'modules/com/zorba-xquery/www/modules/pregenerated/errors.xq' |
1581 | --- modules/com/zorba-xquery/www/modules/pregenerated/errors.xq 2012-01-26 01:35:11 +0000 |
1582 | +++ modules/com/zorba-xquery/www/modules/pregenerated/errors.xq 2012-02-16 02:19:18 +0000 |
1583 | @@ -81,6 +81,10 @@ |
1584 | |
1585 | (:~ |
1586 | :) |
1587 | +declare variable $zerr:ZXQP0006 as xs:QName := fn:QName($zerr:NS, "zerr:ZXQP0006"); |
1588 | + |
1589 | +(:~ |
1590 | +:) |
1591 | declare variable $zerr:ZXQP0007 as xs:QName := fn:QName($zerr:NS, "zerr:ZXQP0007"); |
1592 | |
1593 | (:~ |
1594 | @@ -664,6 +668,10 @@ |
1595 | |
1596 | (:~ |
1597 | :) |
1598 | +declare variable $zerr:ZOSE0006 as xs:QName := fn:QName($zerr:NS, "zerr:ZOSE0006"); |
1599 | + |
1600 | +(:~ |
1601 | +:) |
1602 | declare variable $zerr:ZSTR0001 as xs:QName := fn:QName($zerr:NS, "zerr:ZSTR0001"); |
1603 | |
1604 | (:~ |
1605 | |
1606 | === modified file 'modules/org/expath/ns/file.xq.src/file.cpp' |
1607 | --- modules/org/expath/ns/file.xq.src/file.cpp 2011-07-22 08:12:31 +0000 |
1608 | +++ modules/org/expath/ns/file.xq.src/file.cpp 2012-02-16 02:19:18 +0000 |
1609 | @@ -28,6 +28,7 @@ |
1610 | #include <zorba/singleton_item_sequence.h> |
1611 | #include <zorba/util/path.h> |
1612 | #include <zorba/user_exception.h> |
1613 | +#include <zorba/transcode_stream.h> |
1614 | |
1615 | #include "file_module.h" |
1616 | |
1617 | @@ -188,6 +189,7 @@ |
1618 | { |
1619 | String lFileStr = getFilePathString(aArgs, 0); |
1620 | File_t lFile = File::createFile(lFileStr.c_str()); |
1621 | + String lEncoding("UTF-8"); |
1622 | |
1623 | // preconditions |
1624 | if (!lFile->exists()) { |
1625 | @@ -198,18 +200,30 @@ |
1626 | } |
1627 | |
1628 | if (aArgs.size() == 2) { |
1629 | - // since Zorba currently only supports UTF-8 we only call this function |
1630 | - // to reject any other encoding requested bu the user |
1631 | - getEncodingArg(aArgs, 1); |
1632 | + lEncoding = getEncodingArg(aArgs, 1); |
1633 | } |
1634 | |
1635 | - std::auto_ptr<StreamableItemSequence> lSeq(new StreamableItemSequence()); |
1636 | - lFile->openInputStream(*lSeq->theStream, false, true); |
1637 | - |
1638 | - lSeq->theItem = theModule->getItemFactory()->createStreamableString( |
1639 | - *lSeq->theStream, &StreamableItemSequence::streamReleaser); |
1640 | - |
1641 | - return ItemSequence_t(lSeq.release()); |
1642 | + zorba::Item lResult; |
1643 | + std::unique_ptr<std::ifstream> lInStream; |
1644 | + if ( transcode::is_necessary( lEncoding.c_str() ) ) |
1645 | + { |
1646 | + try { |
1647 | + lInStream.reset( new transcode::stream<std::ifstream>(lEncoding.c_str()) ); |
1648 | + } catch (std::invalid_argument const&) |
1649 | + { |
1650 | + raiseFileError("FOFL0006", "Unsupported encoding", lEncoding.c_str()); |
1651 | + } |
1652 | + } |
1653 | + else |
1654 | + { |
1655 | + lInStream.reset( new std::ifstream() ); |
1656 | + } |
1657 | + lFile->openInputStream(*lInStream, false, true); |
1658 | + lResult = theModule->getItemFactory()->createStreamableString( |
1659 | + *lInStream.release(), &FileModule::streamReleaser |
1660 | + ); |
1661 | + return ItemSequence_t(new SingletonItemSequence(lResult)); |
1662 | + |
1663 | } |
1664 | |
1665 | //***************************************************************************** |
1666 | @@ -722,3 +736,4 @@ |
1667 | extern "C" DLL_EXPORT zorba::ExternalModule* createModule() { |
1668 | return new zorba::filemodule::FileModule(); |
1669 | } |
1670 | +/* vim:set et sw=2 ts=2: */ |
1671 | |
1672 | === modified file 'modules/org/expath/ns/file.xq.src/file_function.cpp' |
1673 | --- modules/org/expath/ns/file.xq.src/file_function.cpp 2011-07-13 01:56:45 +0000 |
1674 | +++ modules/org/expath/ns/file.xq.src/file_function.cpp 2012-02-16 02:19:18 +0000 |
1675 | @@ -141,11 +141,6 @@ |
1676 | arg_iter->close(); |
1677 | } |
1678 | |
1679 | - if (!(lEncoding == "UTF-8" || lEncoding == "UTF8")) { |
1680 | - // the rest are not supported encodings |
1681 | - raiseFileError("FOFL0006", "Unsupported encoding", lEncoding.c_str()); |
1682 | - } |
1683 | - |
1684 | return lEncoding; |
1685 | } |
1686 | |
1687 | |
1688 | === modified file 'modules/org/expath/ns/file.xq.src/file_function.h' |
1689 | --- modules/org/expath/ns/file.xq.src/file_function.h 2011-07-22 08:12:31 +0000 |
1690 | +++ modules/org/expath/ns/file.xq.src/file_function.h 2012-02-16 02:19:18 +0000 |
1691 | @@ -25,7 +25,9 @@ |
1692 | |
1693 | #include <fstream> |
1694 | |
1695 | -namespace zorba { namespace filemodule { |
1696 | +namespace zorba { |
1697 | + |
1698 | + namespace filemodule { |
1699 | |
1700 | class FileModule; |
1701 | |
1702 | @@ -136,18 +138,12 @@ |
1703 | next(Item& aResult); |
1704 | }; |
1705 | |
1706 | - Item theItem; |
1707 | - std::ifstream* theStream; |
1708 | + Item theItem; |
1709 | + std::ifstream* theStream; |
1710 | |
1711 | StreamableItemSequence() |
1712 | : theStream(new std::ifstream()) {} |
1713 | |
1714 | - static void |
1715 | - streamReleaser(std::istream* stream) |
1716 | - { |
1717 | - delete stream; |
1718 | - } |
1719 | - |
1720 | Iterator_t getIterator() |
1721 | { |
1722 | return new InternalIterator(this); |
1723 | |
1724 | === modified file 'modules/org/expath/ns/file.xq.src/file_module.cpp' |
1725 | --- modules/org/expath/ns/file.xq.src/file_module.cpp 2011-06-08 18:37:56 +0000 |
1726 | +++ modules/org/expath/ns/file.xq.src/file_module.cpp 2012-02-16 02:19:18 +0000 |
1727 | @@ -17,11 +17,10 @@ |
1728 | #include "file.h" |
1729 | #include "file_module.h" |
1730 | #include "file_function.h" |
1731 | +#include <cassert> |
1732 | |
1733 | namespace zorba { namespace filemodule { |
1734 | |
1735 | - ItemFactory* FileModule::theFactory = 0; |
1736 | - |
1737 | const char* FileModule::theNamespace = "http://expath.org/ns/file"; |
1738 | |
1739 | |
1740 | @@ -39,9 +38,7 @@ |
1741 | { |
1742 | ExternalFunction*& lFunc = theFunctions[aLocalname]; |
1743 | if (!lFunc) { |
1744 | - if (1 == 0) { |
1745 | - |
1746 | - } else if (aLocalname == "create-directory") { |
1747 | + if (aLocalname == "create-directory") { |
1748 | lFunc = new CreateDirectoryFunction(this); |
1749 | } else if (aLocalname == "delete-file-impl") { |
1750 | lFunc = new DeleteFileImplFunction(this); |
1751 | |
1752 | === modified file 'modules/org/expath/ns/file.xq.src/file_module.h' |
1753 | --- modules/org/expath/ns/file.xq.src/file_module.h 2011-06-08 18:37:56 +0000 |
1754 | +++ modules/org/expath/ns/file.xq.src/file_module.h 2012-02-16 02:19:18 +0000 |
1755 | @@ -27,7 +27,7 @@ |
1756 | class FileModule : public ExternalModule |
1757 | { |
1758 | private: |
1759 | - static ItemFactory* theFactory; |
1760 | + mutable ItemFactory* theFactory; |
1761 | |
1762 | public: |
1763 | static const char* theNamespace; |
1764 | @@ -43,10 +43,17 @@ |
1765 | }; |
1766 | |
1767 | typedef std::map<String, ExternalFunction*, ltstr> FuncMap_t; |
1768 | - |
1769 | FuncMap_t theFunctions; |
1770 | - |
1771 | + |
1772 | public: |
1773 | + static void |
1774 | + streamReleaser(std::istream* stream) |
1775 | + { |
1776 | + delete stream; |
1777 | + } |
1778 | + |
1779 | + FileModule() : theFactory(0) {} |
1780 | + |
1781 | virtual ~FileModule(); |
1782 | |
1783 | virtual String |
1784 | @@ -58,10 +65,10 @@ |
1785 | virtual void |
1786 | destroy(); |
1787 | |
1788 | - static ItemFactory* |
1789 | - getItemFactory() |
1790 | + ItemFactory* |
1791 | + getItemFactory() const |
1792 | { |
1793 | - if(!theFactory) |
1794 | + if (!theFactory) |
1795 | { |
1796 | theFactory = Zorba::getInstance(0)->getItemFactory(); |
1797 | } |
1798 | |
1799 | === modified file 'src/api/CMakeLists.txt' |
1800 | --- src/api/CMakeLists.txt 2011-08-31 13:17:59 +0000 |
1801 | +++ src/api/CMakeLists.txt 2012-02-16 02:19:18 +0000 |
1802 | @@ -55,6 +55,7 @@ |
1803 | zorba_functions.cpp |
1804 | annotationimpl.cpp |
1805 | auditimpl.cpp |
1806 | + transcode_streambuf.cpp |
1807 | ) |
1808 | |
1809 | IF (NOT ZORBA_NO_FULL_TEXT) |
1810 | |
1811 | === added file 'src/api/transcode_streambuf.cpp' |
1812 | --- src/api/transcode_streambuf.cpp 1970-01-01 00:00:00 +0000 |
1813 | +++ src/api/transcode_streambuf.cpp 2012-02-16 02:19:18 +0000 |
1814 | @@ -0,0 +1,102 @@ |
1815 | +/* |
1816 | + * Copyright 2006-2008 The FLWOR Foundation. |
1817 | + * |
1818 | + * Licensed under the Apache License, Version 2.0 (the "License"); |
1819 | + * you may not use this file except in compliance with the License. |
1820 | + * You may obtain a copy of the License at |
1821 | + * |
1822 | + * http://www.apache.org/licenses/LICENSE-2.0 |
1823 | + * |
1824 | + * Unless required by applicable law or agreed to in writing, software |
1825 | + * distributed under the License is distributed on an "AS IS" BASIS, |
1826 | + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
1827 | + * See the License for the specific language governing permissions and |
1828 | + * limitations under the License. |
1829 | + */ |
1830 | + |
1831 | +#include <zorba/transcode_stream.h> |
1832 | + |
1833 | +#include "util/transcode_streambuf.h" |
1834 | + |
1835 | +using namespace std; |
1836 | + |
1837 | +namespace zorba { |
1838 | +namespace transcode { |
1839 | + |
1840 | +/////////////////////////////////////////////////////////////////////////////// |
1841 | + |
1842 | +streambuf::streambuf( char const *charset, std::streambuf *orig ) : |
1843 | + proxy_buf_( new internal::transcode::streambuf( charset, orig ) ) |
1844 | +{ |
1845 | +} |
1846 | + |
1847 | +streambuf::~streambuf() { |
1848 | + // out-of-line since it's virtual |
1849 | +} |
1850 | + |
1851 | +void streambuf::imbue( std::locale const &loc ) { |
1852 | + proxy_buf_->pubimbue( loc ); |
1853 | +} |
1854 | + |
1855 | +streambuf::pos_type streambuf::seekoff( off_type o, ios_base::seekdir d, |
1856 | + ios_base::openmode m ) { |
1857 | + return proxy_buf_->pubseekoff( o, d, m ); |
1858 | +} |
1859 | + |
1860 | +streambuf::pos_type streambuf::seekpos( pos_type p, ios_base::openmode m ) { |
1861 | + return proxy_buf_->pubseekpos( p, m ); |
1862 | +} |
1863 | + |
1864 | +std::streambuf* streambuf::setbuf( char_type *p, streamsize s ) { |
1865 | + proxy_buf_->pubsetbuf( p, s ); |
1866 | + return this; |
1867 | +} |
1868 | + |
1869 | +streamsize streambuf::showmanyc() { |
1870 | + return proxy_buf_->in_avail(); |
1871 | +} |
1872 | + |
1873 | +int streambuf::sync() { |
1874 | + return proxy_buf_->pubsync(); |
1875 | +} |
1876 | + |
1877 | +streambuf::int_type streambuf::overflow( int_type c ) { |
1878 | + return proxy_buf_->sputc( c ); |
1879 | +} |
1880 | + |
1881 | +streambuf::int_type streambuf::pbackfail( int_type c ) { |
1882 | + return proxy_buf_->sputbackc( traits_type::to_char_type( c ) ); |
1883 | +} |
1884 | + |
1885 | +streambuf::int_type streambuf::uflow() { |
1886 | + return proxy_buf_->sbumpc(); |
1887 | +} |
1888 | + |
1889 | +streambuf::int_type streambuf::underflow() { |
1890 | + return proxy_buf_->sgetc(); |
1891 | +} |
1892 | + |
1893 | +streamsize streambuf::xsgetn( char_type *to, streamsize size ) { |
1894 | + return proxy_buf_->sgetn( to, size ); |
1895 | +} |
1896 | + |
1897 | +streamsize streambuf::xsputn( char_type const *from, |
1898 | + streamsize size ) { |
1899 | + return proxy_buf_->sputn( from, size ); |
1900 | +} |
1901 | + |
1902 | +/////////////////////////////////////////////////////////////////////////////// |
1903 | + |
1904 | +bool is_necessary( char const *charset ) { |
1905 | + return internal::transcode::streambuf::is_necessary( charset ); |
1906 | +} |
1907 | + |
1908 | +bool is_supported( char const *charset ) { |
1909 | + return internal::transcode::streambuf::is_supported( charset ); |
1910 | +} |
1911 | + |
1912 | +/////////////////////////////////////////////////////////////////////////////// |
1913 | + |
1914 | +} // namespace transcode |
1915 | +} // namespace zorba |
1916 | +/* vim:set et sw=2 ts=2: */ |
1917 | |
1918 | === modified file 'src/diagnostics/diagnostic_en.xml' |
1919 | --- src/diagnostics/diagnostic_en.xml 2012-02-16 00:52:25 +0000 |
1920 | +++ src/diagnostics/diagnostic_en.xml 2012-02-16 02:19:18 +0000 |
1921 | @@ -1581,6 +1581,10 @@ |
1922 | <value>"$1": feature not enabled</value> |
1923 | </diagnostic> |
1924 | |
1925 | + <diagnostic code="ZXQP0006" name="UNKNOWN_ENCODING"> |
1926 | + <value>"$1": unknown character encoding</value> |
1927 | + </diagnostic> |
1928 | + |
1929 | <diagnostic code="ZXQP0007" name="FUNCTION_SIGNATURE_NOT_EQUAL"> |
1930 | <value>"$1": function signature does not match declaration</value> |
1931 | </diagnostic> |
1932 | @@ -2193,6 +2197,10 @@ |
1933 | <value>"$1": error loading dynamic library${: 2}</value> |
1934 | </diagnostic> |
1935 | |
1936 | + <diagnostic code="ZOSE0006" name="TRANSCODING_ERROR"> |
1937 | + <value>stream transcoding error ($1)</value> |
1938 | + </diagnostic> |
1939 | + |
1940 | <!--////////// Zorba Store Errors //////////////////////////////////////--> |
1941 | |
1942 | <diagnostic code="ZSTR0001" name="INDEX_ALREADY_EXISTS"> |
1943 | |
1944 | === modified file 'src/diagnostics/pregenerated/diagnostic_list.cpp' |
1945 | --- src/diagnostics/pregenerated/diagnostic_list.cpp 2012-01-26 01:35:11 +0000 |
1946 | +++ src/diagnostics/pregenerated/diagnostic_list.cpp 2012-02-16 02:19:18 +0000 |
1947 | @@ -568,6 +568,9 @@ |
1948 | ZorbaErrorCode ZXQP0005_NOT_ENABLED( "ZXQP0005" ); |
1949 | |
1950 | |
1951 | +ZorbaErrorCode ZXQP0006_UNKNOWN_ENCODING( "ZXQP0006" ); |
1952 | + |
1953 | + |
1954 | ZorbaErrorCode ZXQP0007_FUNCTION_SIGNATURE_NOT_EQUAL( "ZXQP0007" ); |
1955 | |
1956 | |
1957 | @@ -1004,6 +1007,9 @@ |
1958 | ZorbaErrorCode ZOSE0005_DLL_LOAD_FAILED( "ZOSE0005" ); |
1959 | |
1960 | |
1961 | +ZorbaErrorCode ZOSE0006_TRANSCODING_ERROR( "ZOSE0006" ); |
1962 | + |
1963 | + |
1964 | ZorbaErrorCode ZSTR0001_INDEX_ALREADY_EXISTS( "ZSTR0001" ); |
1965 | |
1966 | |
1967 | |
1968 | === modified file 'src/diagnostics/pregenerated/dict_en.cpp' |
1969 | --- src/diagnostics/pregenerated/dict_en.cpp 2012-02-16 00:52:25 +0000 |
1970 | +++ src/diagnostics/pregenerated/dict_en.cpp 2012-02-16 02:19:18 +0000 |
1971 | @@ -354,6 +354,7 @@ |
1972 | { "ZOSE0003", "stream read failure" }, |
1973 | { "ZOSE0004", "${\"1\": }I/O error${: 2}" }, |
1974 | { "ZOSE0005", "\"$1\": error loading dynamic library${: 2}" }, |
1975 | + { "ZOSE0006", "stream transcoding error ($1)" }, |
1976 | { "ZSTR0001", "\"$1\": index already exists" }, |
1977 | { "ZSTR0002", "\"$1\": index does not exist" }, |
1978 | { "ZSTR0003", "\"$1\": partial key insertion into index \"$2\"" }, |
1979 | @@ -392,6 +393,7 @@ |
1980 | { "ZXQP0003", "internal error${: 1}" }, |
1981 | { "ZXQP0004", "not yet implemented: $1" }, |
1982 | { "ZXQP0005", "\"$1\": feature not enabled" }, |
1983 | + { "ZXQP0006", "\"$1\": unknown character encoding" }, |
1984 | { "ZXQP0007", "\"$1\": function signature does not match declaration" }, |
1985 | { "ZXQP0008", "\"$1\": function implementation not found" }, |
1986 | { "ZXQP0009", "\"$1\": function referred to by this local-name has the local-name \"$2\" instead" }, |
1987 | |
1988 | === modified file 'src/unit_tests/CMakeLists.txt' |
1989 | --- src/unit_tests/CMakeLists.txt 2012-02-02 16:38:39 +0000 |
1990 | +++ src/unit_tests/CMakeLists.txt 2012-02-16 02:19:18 +0000 |
1991 | @@ -11,7 +11,6 @@ |
1992 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
1993 | # See the License for the specific language governing permissions and |
1994 | # limitations under the License. |
1995 | - |
1996 | |
1997 | SET(UNIT_TEST_SRCS |
1998 | string_instantiate.cpp |
1999 | @@ -30,10 +29,9 @@ |
2000 | tokenizer.cpp) |
2001 | ENDIF (NOT ZORBA_NO_FULL_TEXT) |
2002 | |
2003 | -IF(ZORBA_WITH_DEBUGGER) |
2004 | - LIST(APPEND UNIT_TEST_SRCS |
2005 | -# test_debugger_protocol.cpp |
2006 | - ) |
2007 | -ENDIF(ZORBA_WITH_DEBUGGER) |
2008 | +IF (NOT ZORBA_NO_UNICODE) |
2009 | + LIST (APPEND UNIT_TEST_SRCS |
2010 | + test_icu_streambuf.cpp) |
2011 | +ENDIF (NOT ZORBA_NO_UNICODE) |
2012 | |
2013 | # vim:set et sw=2 tw=2: |
2014 | |
2015 | === added file 'src/unit_tests/test_icu_streambuf.cpp' |
2016 | --- src/unit_tests/test_icu_streambuf.cpp 1970-01-01 00:00:00 +0000 |
2017 | +++ src/unit_tests/test_icu_streambuf.cpp 2012-02-16 02:19:18 +0000 |
2018 | @@ -0,0 +1,151 @@ |
2019 | +/* |
2020 | + * Copyright 2006-2008 The FLWOR Foundation. |
2021 | + * |
2022 | + * Licensed under the Apache License, Version 2.0 (the "License"); |
2023 | + * you may not use this file except in compliance with the License. |
2024 | + * You may obtain a copy of the License at |
2025 | + * |
2026 | + * http://www.apache.org/licenses/LICENSE-2.0 |
2027 | + * |
2028 | + * Unless required by applicable law or agreed to in writing, software |
2029 | + * distributed under the License is distributed on an "AS IS" BASIS, |
2030 | + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
2031 | + * See the License for the specific language governing permissions and |
2032 | + * limitations under the License. |
2033 | + */ |
2034 | + |
2035 | +#include <fstream> |
2036 | +#include <iostream> |
2037 | +#include <sstream> |
2038 | + |
2039 | +#include "util/transcode_streambuf.h" |
2040 | + |
2041 | +using namespace std; |
2042 | +using namespace zorba; |
2043 | + |
2044 | +#define COPYRIGHT_ISO "\xA9" |
2045 | +#define COPYRIGHT_UTF8 "\xC2\xA9" |
2046 | + |
2047 | +#define ONE_THIRD_UTF8 "\xE2\x85\x93" |
2048 | +#define ONE_THIRD_UTF16BE "\x21\x53" |
2049 | + |
2050 | +struct test { |
2051 | + char const *ext_charset; |
2052 | + char const *ext_str; |
2053 | + int ext_len; |
2054 | + char const *utf8_str; |
2055 | +}; |
2056 | + |
2057 | +static test const tests[] = { |
2058 | + /* 0 */ { "ISO-8859-1", "Copyright " COPYRIGHT_ISO " 2011", 0, "Copyright " COPYRIGHT_UTF8 " 2011" }, |
2059 | + /* 1 */ { "UTF-16BE", ONE_THIRD_UTF16BE "\0 \0c\0u\0p", 10, ONE_THIRD_UTF8 " cup" }, |
2060 | + { 0, 0, 0, 0 } |
2061 | +}; |
2062 | + |
2063 | +static string make_ext_str( test const *t ) { |
2064 | + if ( t->ext_len ) |
2065 | + return string( t->ext_str, t->ext_len ); |
2066 | + return string( t->ext_str ); |
2067 | +} |
2068 | + |
2069 | +/////////////////////////////////////////////////////////////////////////////// |
2070 | + |
2071 | +static int failures; |
2072 | + |
2073 | +static bool assert_true( int no, char const *expr, int line, bool result ) { |
2074 | + if ( !result ) { |
2075 | + cout << '#' << no << " FAILED, line " << line << ": " << expr << endl; |
2076 | + ++failures; |
2077 | + } |
2078 | + return result; |
2079 | +} |
2080 | + |
2081 | +static void print_exception( int no, char const *expr, int line, |
2082 | + std::exception const &e ) { |
2083 | + assert_true( no, expr, line, false ); |
2084 | + cout << "+ exception: " << e.what() << endl; |
2085 | +} |
2086 | + |
2087 | +#define ASSERT_TRUE( NO, EXPR ) assert_true( NO, #EXPR, __LINE__, !!(EXPR) ) |
2088 | + |
2089 | +#define ASSERT_TRUE_AND_NO_EXCEPTION( NO, EXPR ) \ |
2090 | + try { ASSERT_TRUE( NO, EXPR ); } \ |
2091 | + catch ( std::exception const &e ) { print_exception( NO, #EXPR, __LINE__, e ); } |
2092 | + |
2093 | +/////////////////////////////////////////////////////////////////////////////// |
2094 | + |
2095 | +static bool test_getline( test const *t ) { |
2096 | + string const ext_str( make_ext_str( t ) ); |
2097 | + istringstream iss( ext_str ); |
2098 | + icu_streambuf xbuf( t->ext_charset, iss.rdbuf() ); |
2099 | + iss.ios::rdbuf( &xbuf ); |
2100 | + |
2101 | + char utf8_buf[ 1024 ]; |
2102 | + iss.getline( utf8_buf, sizeof utf8_buf ); |
2103 | + if ( iss.gcount() ) { |
2104 | + string const utf8_str( utf8_buf ); |
2105 | + return utf8_str == t->utf8_str; |
2106 | + } |
2107 | + return false; |
2108 | +} |
2109 | + |
2110 | +static bool test_read( test const *t ) { |
2111 | + string const ext_str( make_ext_str( t ) ); |
2112 | + istringstream iss( ext_str ); |
2113 | + icu_streambuf xbuf( t->ext_charset, iss.rdbuf() ); |
2114 | + iss.ios::rdbuf( &xbuf ); |
2115 | + |
2116 | + char utf8_buf[ 1024 ]; |
2117 | + iss.read( utf8_buf, sizeof utf8_buf ); |
2118 | + if ( iss.gcount() ) { |
2119 | + string const utf8_str( utf8_buf, iss.gcount() ); |
2120 | + return utf8_str == t->utf8_str; |
2121 | + } |
2122 | + return false; |
2123 | +} |
2124 | + |
2125 | +static bool test_insertion( test const *t ) { |
2126 | + ostringstream oss; |
2127 | + icu_streambuf xbuf( t->ext_charset, oss.rdbuf() ); |
2128 | + oss.ios::rdbuf( &xbuf ); |
2129 | + |
2130 | + oss << t->utf8_str << flush; |
2131 | + string const ext_str( oss.str() ); |
2132 | + |
2133 | + string const expected_ext_str( make_ext_str( t ) ); |
2134 | + return ext_str == expected_ext_str; |
2135 | +} |
2136 | + |
2137 | +static bool test_put( test const *t ) { |
2138 | + ostringstream oss; |
2139 | + icu_streambuf xbuf( t->ext_charset, oss.rdbuf() ); |
2140 | + oss.ios::rdbuf( &xbuf ); |
2141 | + |
2142 | + for ( char const *c = t->utf8_str; *c; ++c ) |
2143 | + oss.put( *c ); |
2144 | + string const ext_str( oss.str() ); |
2145 | + |
2146 | + string const expected_ext_str( make_ext_str( t ) ); |
2147 | + return ext_str == expected_ext_str; |
2148 | +} |
2149 | + |
2150 | +/////////////////////////////////////////////////////////////////////////////// |
2151 | + |
2152 | +namespace zorba { |
2153 | +namespace UnitTests { |
2154 | + |
2155 | +int test_icu_streambuf( int, char*[] ) { |
2156 | + int test_no = 0; |
2157 | + for ( test const *t = tests; t->utf8_str; ++t, ++test_no ) { |
2158 | + ASSERT_TRUE_AND_NO_EXCEPTION( test_no, test_getline( t ) ); |
2159 | + ASSERT_TRUE_AND_NO_EXCEPTION( test_no, test_read( t ) ); |
2160 | + ASSERT_TRUE_AND_NO_EXCEPTION( test_no, test_insertion( t ) ); |
2161 | + ASSERT_TRUE_AND_NO_EXCEPTION( test_no, test_put( t ) ); |
2162 | + } |
2163 | + cout << failures << " test(s) failed\n"; |
2164 | + return failures ? 1 : 0; |
2165 | +} |
2166 | + |
2167 | +} // namespace UnitTests |
2168 | +} // namespace zorba |
2169 | +/* vim:set et sw=2 ts=2: */ |
2170 | |
2171 | === modified file 'src/unit_tests/unit_test_list.h' |
2172 | --- src/unit_tests/unit_test_list.h 2012-02-02 16:38:39 +0000 |
2173 | +++ src/unit_tests/unit_test_list.h 2012-02-16 02:19:18 +0000 |
2174 | @@ -17,6 +17,8 @@ |
2175 | #ifndef ZORBA_UNIT_TEST_LIST_H |
2176 | #define ZORBA_UNIT_TEST_LIST_H |
2177 | |
2178 | +#include <iostream> |
2179 | + |
2180 | #include <zorba/config.h> |
2181 | |
2182 | namespace zorba { |
2183 | @@ -34,6 +36,9 @@ |
2184 | /** |
2185 | * ADD NEW UNIT TESTS HERE |
2186 | */ |
2187 | +#ifndef ZORBA_NO_UNICODE |
2188 | + int test_icu_streambuf( int, char*[] ); |
2189 | +#endif /* ZORBA_NO_UNICODE */ |
2190 | int json_parser( int, char*[] ); |
2191 | |
2192 | void initializeTestList(); |
2193 | |
2194 | === modified file 'src/unit_tests/unit_tests.cpp' |
2195 | --- src/unit_tests/unit_tests.cpp 2012-02-02 16:38:39 +0000 |
2196 | +++ src/unit_tests/unit_tests.cpp 2012-02-16 02:19:18 +0000 |
2197 | @@ -39,6 +39,9 @@ |
2198 | void initializeTestList() { |
2199 | libunittests["string"] = test_string; |
2200 | libunittests["uri"] = runUriTest; |
2201 | +#ifndef ZORBA_NO_UNICODE |
2202 | + libunittests["icu_streambuf"] = test_icu_streambuf; |
2203 | +#endif /* ZORBA_NO_UNICODE */ |
2204 | libunittests["json_parser"] = json_parser; |
2205 | libunittests["unique_ptr"] = test_unique_ptr; |
2206 | #ifndef ZORBA_NO_FULL_TEXT |
2207 | |
2208 | === modified file 'src/util/CMakeLists.txt' |
2209 | --- src/util/CMakeLists.txt 2011-12-20 18:29:15 +0000 |
2210 | +++ src/util/CMakeLists.txt 2012-02-16 02:19:18 +0000 |
2211 | @@ -41,7 +41,12 @@ |
2212 | ENDIF(ZORBA_WITH_FILE_ACCESS) |
2213 | |
2214 | IF(ZORBA_NO_UNICODE) |
2215 | - LIST(APPEND UTIL_SRCS regex_ascii.cpp) |
2216 | + LIST(APPEND UTIL_SRCS |
2217 | + regex_ascii.cpp |
2218 | + passthru_streambuf.cpp) |
2219 | +ELSE(ZORBA_NO_UNICODE) |
2220 | + LIST(APPEND UTIL_SRCS |
2221 | + icu_streambuf.cpp) |
2222 | ENDIF(ZORBA_NO_UNICODE) |
2223 | |
2224 | HEADER_GROUP_SUBFOLDER(UTIL_SRCS fx) |
2225 | |
2226 | === added file 'src/util/icu_streambuf.cpp' |
2227 | --- src/util/icu_streambuf.cpp 1970-01-01 00:00:00 +0000 |
2228 | +++ src/util/icu_streambuf.cpp 2012-02-16 02:19:18 +0000 |
2229 | @@ -0,0 +1,300 @@ |
2230 | +/* |
2231 | + * Copyright 2006-2008 The FLWOR Foundation. |
2232 | + * |
2233 | + * Licensed under the Apache License, Version 2.0 (the "License"); |
2234 | + * you may not use this file except in compliance with the License. |
2235 | + * You may obtain a copy of the License at |
2236 | + * |
2237 | + * http://www.apache.org/licenses/LICENSE-2.0 |
2238 | + * |
2239 | + * Unless required by applicable law or agreed to in writing, software |
2240 | + * distributed under the License is distributed on an "AS IS" BASIS, |
2241 | + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
2242 | + * See the License for the specific language governing permissions and |
2243 | + * limitations under the License. |
2244 | + */ |
2245 | + |
2246 | +#define ZORBA_DEBUG_ICU_STREAMBUF 0 |
2247 | + |
2248 | +#if ZORBA_DEBUG_ICU_STREAMBUF |
2249 | +# include <stdio.h> |
2250 | +#endif |
2251 | + |
2252 | +#include <algorithm> |
2253 | +#include <cassert> |
2254 | + |
2255 | +#include <zorba/diagnostic_list.h> |
2256 | + |
2257 | +#include "diagnostics/assert.h" |
2258 | +#include "diagnostics/diagnostic.h" |
2259 | +#include "diagnostics/zorba_exception.h" |
2260 | +#include "util/cxx_util.h" |
2261 | +#include "util/string_util.h" |
2262 | +#include "util/utf8_util.h" |
2263 | + |
2264 | +#include "icu_streambuf.h" |
2265 | + |
2266 | +using namespace std; |
2267 | + |
2268 | +namespace zorba { |
2269 | + |
2270 | +int const Small_External_Buf_Size = 6; |
2271 | +int const Large_External_Buf_Size = 4096; |
2272 | + |
2273 | +/////////////////////////////////////////////////////////////////////////////// |
2274 | + |
2275 | +inline void icu_streambuf::buf_type_base::reset() { |
2276 | + pivot_source_ = pivot_target_ = pivot_buf_; |
2277 | +} |
2278 | + |
2279 | +inline void icu_streambuf::resetg() { |
2280 | + setg( |
2281 | + g_.utf8_char_, g_.utf8_char_ + sizeof g_.utf8_char_, |
2282 | + g_.utf8_char_ + sizeof g_.utf8_char_ |
2283 | + ); |
2284 | +} |
2285 | + |
2286 | +icu_streambuf::icu_streambuf( char const *charset, streambuf *orig ) : |
2287 | + proxy_streambuf( orig ), |
2288 | + no_conv_( !is_necessary( charset ) ), |
2289 | + external_conv_( no_conv_ ? nullptr : create_conv( charset ) ), |
2290 | + utf8_conv_( no_conv_ ? nullptr : create_conv( "UTF-8" ) ) |
2291 | +{ |
2292 | + if ( !orig ) |
2293 | + throw invalid_argument( "null streambuf" ); |
2294 | + resetg(); |
2295 | +} |
2296 | + |
2297 | +icu_streambuf::~icu_streambuf() { |
2298 | + if ( external_conv_ ) |
2299 | + ucnv_close( external_conv_ ); |
2300 | + if ( utf8_conv_ ) |
2301 | + ucnv_close( utf8_conv_ ); |
2302 | +} |
2303 | + |
2304 | +void icu_streambuf::clear() { |
2305 | + if ( !no_conv_ ) { |
2306 | + ucnv_reset( external_conv_ ); |
2307 | + ucnv_reset( utf8_conv_ ); |
2308 | + g_.reset(); |
2309 | + p_.reset(); |
2310 | + resetg(); |
2311 | + } |
2312 | +} |
2313 | + |
2314 | +UConverter* icu_streambuf::create_conv( char const *charset ) { |
2315 | + UErrorCode err = U_ZERO_ERROR; |
2316 | + UConverter *const conv = ucnv_open( charset, &err ); |
2317 | + ucnv_setFromUCallBack( |
2318 | + conv, UCNV_FROM_U_CALLBACK_STOP, nullptr, nullptr, nullptr, &err |
2319 | + ); |
2320 | + ucnv_setToUCallBack( |
2321 | + conv, UCNV_TO_U_CALLBACK_STOP, nullptr, nullptr, nullptr, &err |
2322 | + ); |
2323 | + if ( !conv || U_FAILURE( err ) ) { |
2324 | + if ( conv ) |
2325 | + ucnv_close( conv ); |
2326 | + throw invalid_argument( charset ); |
2327 | + } |
2328 | + return conv; |
2329 | +} |
2330 | + |
2331 | +bool icu_streambuf::is_necessary( char const *charset ) { |
2332 | + // |
2333 | + // ucnv_compareNames() ignores case and punctuation but does not resolve |
2334 | + // aliases, so "US-ASCII" is transcoded (harmlessly) rather than skipped. |
2335 | + // |
2336 | + return ucnv_compareNames( charset, "ASCII" ) |
2337 | + && ucnv_compareNames( charset, "UTF-8" ); |
2338 | +} |
2339 | + |
2340 | +bool icu_streambuf::is_supported( char const *charset ) { |
2341 | + try { |
2342 | + ucnv_close( create_conv( charset ) ); |
2343 | + return true; |
2344 | + } |
2345 | + catch ( invalid_argument const& ) { |
2346 | + return false; |
2347 | + } |
2348 | +} |
2349 | + |
2350 | +icu_streambuf::pos_type icu_streambuf::seekoff( off_type o, ios_base::seekdir d, |
2351 | + ios_base::openmode m ) { |
2352 | + clear(); |
2353 | + return original()->pubseekoff( o, d, m ); |
2354 | +} |
2355 | + |
2356 | +icu_streambuf::pos_type icu_streambuf::seekpos( pos_type p, |
2357 | + ios_base::openmode m ) { |
2358 | + clear(); |
2359 | + return original()->pubseekpos( p, m ); |
2360 | +} |
2361 | + |
2362 | +streambuf* icu_streambuf::setbuf( char_type *p, streamsize s ) { |
2363 | + original()->pubsetbuf( p, s ); |
2364 | + return this; |
2365 | +} |
2366 | + |
2367 | +int icu_streambuf::sync() { |
2368 | + return original()->pubsync(); |
2369 | +} |
2370 | + |
2371 | +icu_streambuf::int_type icu_streambuf::overflow( int_type c ) { |
2372 | +#if ZORBA_DEBUG_ICU_STREAMBUF |
2373 | + printf( "overflow()\n" ); |
2374 | +#endif |
2375 | + if ( no_conv_ ) |
2376 | + return original()->sputc( c ); |
2377 | + |
2378 | + if ( traits_type::eq_int_type( c, traits_type::eof() ) ) |
2379 | + return traits_type::eof(); |
2380 | + |
2381 | + char_type const utf8_byte = traits_type::to_char_type( c ); |
2382 | + char_type const *from = &utf8_byte; |
2383 | + char ebuf[ Small_External_Buf_Size ], *to = ebuf; |
2384 | + |
2385 | + bool const ok = to_external( &from, from + 1, &to, to + sizeof ebuf ); |
2386 | + assert( ok ); |
2387 | + if ( streamsize const n = to - ebuf ) { |
2388 | + original()->sputn( ebuf, n ); |
2389 | + p_.reset(); |
2390 | + } |
2391 | + |
2392 | + return c; |
2393 | +} |
2394 | + |
2395 | +bool icu_streambuf::to_external( char_type const **from, |
2396 | + char_type const *from_end, char **to, |
2397 | + char const *to_end, bool flush ) { |
2398 | + UErrorCode err = U_ZERO_ERROR; |
2399 | + ucnv_convertEx( |
2400 | + external_conv_, utf8_conv_, to, to_end, from, from_end, |
2401 | + p_.pivot_buf_, &p_.pivot_source_, &p_.pivot_target_, |
2402 | + p_.pivot_buf_ + sizeof p_.pivot_buf_, |
2403 | + /*reset*/ false, flush, &err |
2404 | + ); |
2405 | + if ( err == U_TRUNCATED_CHAR_FOUND || err == U_BUFFER_OVERFLOW_ERROR ) |
2406 | + return false; |
2407 | + if ( U_FAILURE( err ) ) |
2408 | + throw ZORBA_EXCEPTION( |
2409 | + zerr::ZOSE0006_TRANSCODING_ERROR, ERROR_PARAMS( u_errorName( err ) ) |
2410 | + ); |
2411 | + return true; |
2412 | +} |
2413 | + |
2414 | +bool icu_streambuf::to_utf8( char const **from, char const *from_end, |
2415 | + char_type **to, char_type const *to_end, |
2416 | + bool flush ) { |
2417 | + UErrorCode err = U_ZERO_ERROR; |
2418 | + ucnv_convertEx( |
2419 | + utf8_conv_, external_conv_, to, to_end, from, from_end, |
2420 | + g_.pivot_buf_, &g_.pivot_source_, &g_.pivot_target_, |
2421 | + g_.pivot_buf_ + sizeof g_.pivot_buf_, |
2422 | + /*reset*/ false, flush, &err |
2423 | + ); |
2424 | + if ( err == U_TRUNCATED_CHAR_FOUND || err == U_BUFFER_OVERFLOW_ERROR ) |
2425 | + return false; |
2426 | + if ( U_FAILURE( err ) ) |
2427 | + throw ZORBA_EXCEPTION( |
2428 | + zerr::ZOSE0006_TRANSCODING_ERROR, ERROR_PARAMS( u_errorName( err ) ) |
2429 | + ); |
2430 | + return true; |
2431 | +} |
2432 | + |
2433 | +icu_streambuf::int_type icu_streambuf::underflow() { |
2434 | +#if ZORBA_DEBUG_ICU_STREAMBUF |
2435 | + printf( "underflow()\n" ); |
2436 | +#endif |
2437 | + if ( no_conv_ ) |
2438 | + return original()->sgetc(); |
2439 | + |
2440 | + if ( gptr() >= egptr() ) { |
2441 | + utf8::storage_type *to = g_.utf8_char_; |
2442 | + utf8::storage_type const *const to_end = to + sizeof g_.utf8_char_; |
2443 | + |
2444 | + while ( true ) { |
2445 | + int_type const c = original()->sbumpc(); |
2446 | + if ( traits_type::eq_int_type( c, traits_type::eof() ) ) |
2447 | + return traits_type::eof(); |
2448 | + |
2449 | + char const ebyte = traits_type::to_char_type( c ); |
2450 | + char const *from = &ebyte; |
2451 | + |
2452 | + to_utf8( &from, from + 1, &to, to_end ); |
2453 | + if ( to > g_.utf8_char_ ) { |
2454 | + setg( g_.utf8_char_, g_.utf8_char_, to ); |
2455 | + g_.reset(); |
2456 | + break; |
2457 | + } |
2458 | + } |
2459 | + } |
2460 | + return traits_type::to_int_type( *gptr() ); |
2461 | +} |
2462 | + |
2463 | +streamsize icu_streambuf::xsgetn( char_type *to, streamsize size ) { |
2464 | +#if ZORBA_DEBUG_ICU_STREAMBUF |
2465 | + printf( "xsgetn()\n" ); |
2466 | +#endif |
2467 | + if ( no_conv_ ) |
2468 | + return original()->sgetn( to, size ); |
2469 | + |
2470 | + streamsize return_size = 0; |
2471 | + char_type *const to_end = to + size; |
2472 | + |
2473 | + if ( streamsize const gsize = egptr() - gptr() ) { |
2474 | + // must first get any chars in g_.utf8_char_ |
2475 | + streamsize const n = min( gsize, size ); |
2476 | + traits_type::copy( to, gptr(), n ); |
2477 | + gbump( n ); |
2478 | + to += n; |
2479 | + size -= n, return_size += n; |
2480 | + } |
2481 | + |
2482 | + while ( size > 0 ) { |
2483 | + char ebuf[ Large_External_Buf_Size ]; |
2484 | + streamsize const get = min( (streamsize)(sizeof ebuf), size ); |
2485 | + if ( streamsize const got = original()->sgetn( ebuf, get ) ) { |
2486 | + char const *from = ebuf; |
2487 | + char_type const *const to_orig = to; |
2488 | + int_type const peek = original()->sgetc(); |
2489 | + bool const flush = traits_type::eq_int_type( peek, traits_type::eof() ); |
2490 | + to_utf8( &from, from + got, &to, to_end, flush ); |
2491 | + streamsize const n = to - to_orig; |
2492 | + size -= n, return_size += n; |
2493 | + if ( flush ) |
2494 | + break; |
2495 | + } else |
2496 | + break; |
2497 | + } |
2498 | + return return_size; |
2499 | +} |
2500 | + |
2501 | +streamsize icu_streambuf::xsputn( char_type const *from, streamsize size ) { |
2502 | +#if ZORBA_DEBUG_ICU_STREAMBUF |
2503 | + printf( "xsputn()\n" ); |
2504 | +#endif |
2505 | + if ( no_conv_ ) |
2506 | + return original()->sputn( from, size ); |
2507 | + |
2508 | + streamsize return_size = 0; |
2509 | + char_type const *const from_end = from + size; |
2510 | + char ebuf[ Large_External_Buf_Size ], *to = ebuf; |
2511 | + char const *const to_end = to + sizeof ebuf; |
2512 | + |
2513 | + while ( size > 0 ) { |
2514 | + char_type const *const from_orig = from; |
2515 | + to_external( &from, from_end, &to, to_end ); |
2516 | + streamsize n = to - ebuf; |
2517 | + if ( n && !original()->sputn( ebuf, n ) ) |
2518 | + break; |
2519 | + to = ebuf; |
2520 | + n = from - from_orig; |
2521 | + size -= n, return_size += n; |
2522 | + } |
2523 | + return return_size; |
2524 | +} |
2525 | + |
2526 | +/////////////////////////////////////////////////////////////////////////////// |
2527 | + |
2528 | +} // namespace zorba |
2529 | +/* vim:set et sw=2 ts=2: */ |
2530 | |
2531 | === added file 'src/util/icu_streambuf.h' |
2532 | --- src/util/icu_streambuf.h 1970-01-01 00:00:00 +0000 |
2533 | +++ src/util/icu_streambuf.h 2012-02-16 02:19:18 +0000 |
2534 | @@ -0,0 +1,140 @@ |
2535 | +/* |
2536 | + * Copyright 2006-2008 The FLWOR Foundation. |
2537 | + * |
2538 | + * Licensed under the Apache License, Version 2.0 (the "License"); |
2539 | + * you may not use this file except in compliance with the License. |
2540 | + * You may obtain a copy of the License at |
2541 | + * |
2542 | + * http://www.apache.org/licenses/LICENSE-2.0 |
2543 | + * |
2544 | + * Unless required by applicable law or agreed to in writing, software |
2545 | + * distributed under the License is distributed on an "AS IS" BASIS, |
2546 | + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
2547 | + * See the License for the specific language governing permissions and |
2548 | + * limitations under the License. |
2549 | + */ |
2550 | + |
2551 | +#ifndef ZORBA_ICU_STREAMBUF_H |
2552 | +#define ZORBA_ICU_STREAMBUF_H |
2553 | + |
2554 | +#include <zorba/transcode_stream.h> |
2555 | + |
2556 | +#include "util/utf8_util.h" |
2557 | + |
2558 | +namespace zorba { |
2559 | + |
2560 | +/////////////////////////////////////////////////////////////////////////////// |
2561 | + |
2562 | +/** |
2563 | + * An %icu_streambuf is-a std::streambuf for transcoding character encodings |
2564 | + * from/to UTF-8 on-the-fly. |
2565 | + * |
2566 | + * To use it, replace a stream's streambuf: |
2567 | + * \code |
2568 | + * istream is; |
2569 | + * // ... |
2570 | + * icu_streambuf xbuf( "ISO-8859-1", is.rdbuf() ); |
2571 | + * is.ios::rdbuf( &xbuf ); |
2572 | + * \endcode |
2573 | + * Note that the %icu_streambuf must exist for as long as it's being used by |
2574 | + * the stream. If you are replacing the streambuf for a stream you did not |
2575 | + * create, you should set it back to the original streambuf: |
2576 | + * \code |
2577 | + * void f( ostream &os ) { |
2578 | + * icu_streambuf xbuf( "ISO-8859-1", os.rdbuf() ); |
2579 | + * try { |
2580 | + * os.ios::rdbuf( &xbuf ); |
2581 | + * // ... |
2582 | + * } |
2583 | + * catch ( ... ) { |
2584 | + * os.ios::rdbuf( xbuf.original() ); |
2585 | + * throw; |
2586 | + * } |
2587 | + * } |
2588 | + * \endcode |
2589 | + * |
2590 | + * While %icu_streambuf does support seeking, the positions are relative to the |
2591 | + * original byte stream. |
2592 | + */ |
2593 | +class icu_streambuf : public proxy_streambuf { |
2594 | +public: |
2595 | + /** |
2596 | + * Constructs an %icu_streambuf. |
2597 | + * |
2598 | + * @param charset The name of the character encoding to convert from/to. |
2599 | + * @param orig The original streambuf to read/write from/to. |
2600 | + */ |
2601 | + icu_streambuf( char const *charset, std::streambuf *orig ); |
2602 | + |
2603 | + /** |
2604 | + * Destructs an %icu_streambuf. |
2605 | + */ |
2606 | + ~icu_streambuf(); |
2607 | + |
2608 | + /** |
2609 | + * Checks whether it would be necessary to transcode from the given character |
2610 | + * encoding to UTF-8. |
2611 | + * |
2612 | + * @param charset The name of the character encoding to check. |
2613 | + * @return \c true only if it would be necessary to transcode from the given |
2614 | + * character encoding to UTF-8. |
2615 | + */ |
2616 | + static bool is_necessary( char const *charset ); |
2617 | + |
2618 | + /** |
2619 | + * Checks whether the given character set is supported for transcoding. |
2620 | + * |
2621 | + * @param charset The name of the character encoding to check. |
2622 | + * @return \c true only if the character encoding is supported. |
2623 | + */ |
2624 | + static bool is_supported( char const *charset ); |
2625 | + |
2626 | +protected: |
2627 | + pos_type seekoff( off_type, std::ios_base::seekdir, std::ios_base::openmode ); |
2628 | + pos_type seekpos( pos_type, std::ios_base::openmode ); |
2629 | + std::streambuf* setbuf( char_type*, std::streamsize ); |
2630 | + int sync(); |
2631 | + int_type overflow( int_type ); |
2632 | + int_type underflow(); |
2633 | + std::streamsize xsgetn( char_type*, std::streamsize ); |
2634 | + std::streamsize xsputn( char_type const*, std::streamsize ); |
2635 | + |
2636 | +private: |
2637 | + struct buf_type_base { |
2638 | + UChar pivot_buf_[ 4096 ], *pivot_source_, *pivot_target_; |
2639 | + |
2640 | + buf_type_base() { reset(); } |
2641 | + void reset(); |
2642 | + }; |
2643 | + |
2644 | + struct gbuf_type : buf_type_base { |
2645 | + utf8::encoded_char_type utf8_char_; |
2646 | + }; |
2647 | + gbuf_type g_; |
2648 | + |
2649 | + typedef buf_type_base pbuf_type; |
2650 | + pbuf_type p_; |
2651 | + |
2652 | + bool const no_conv_; // true = no conversion needed |
2653 | + UConverter *const external_conv_, *const utf8_conv_; |
2654 | + |
2655 | + void clear(); |
2656 | + static UConverter* create_conv( char const *charset ); |
2657 | + void resetg(); |
2658 | + |
2659 | + bool to_external( char_type const **from, char_type const *from_end, |
2660 | + char **to, char const *to_end, bool flush = false ); |
2661 | + |
2662 | + bool to_utf8( char const **from, char const *from_end, char_type **to, |
2663 | + char_type const *to_end, bool flush = false ); |
2664 | + |
2665 | + // forbid |
2666 | + icu_streambuf( icu_streambuf const& ); |
2667 | + icu_streambuf& operator=( icu_streambuf const& ); |
2668 | +}; |
2669 | + |
2670 | +/////////////////////////////////////////////////////////////////////////////// |
2671 | + |
2672 | +} // namespace zorba |
2673 | +#endif /* ZORBA_ICU_STREAMBUF_H */ |
2674 | +/* vim:set et sw=2 ts=2: */ |
2675 | |
2676 | === added file 'src/util/passthru_streambuf.cpp' |
2677 | --- src/util/passthru_streambuf.cpp 1970-01-01 00:00:00 +0000 |
2678 | +++ src/util/passthru_streambuf.cpp 2012-02-16 02:19:18 +0000 |
2679 | @@ -0,0 +1,105 @@ |
2680 | +/* |
2681 | + * Copyright 2006-2008 The FLWOR Foundation. |
2682 | + * |
2683 | + * Licensed under the Apache License, Version 2.0 (the "License"); |
2684 | + * you may not use this file except in compliance with the License. |
2685 | + * You may obtain a copy of the License at |
2686 | + * |
2687 | + * http://www.apache.org/licenses/LICENSE-2.0 |
2688 | + * |
2689 | + * Unless required by applicable law or agreed to in writing, software |
2690 | + * distributed under the License is distributed on an "AS IS" BASIS, |
2691 | + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
2692 | + * See the License for the specific language governing permissions and |
2693 | + * limitations under the License. |
2694 | + */ |
2695 | + |
2696 | +#include "passthru_streambuf.h" |
2697 | + |
2698 | +using namespace std; |
2699 | + |
2700 | +namespace zorba { |
2701 | + |
2702 | +/////////////////////////////////////////////////////////////////////////////// |
2703 | + |
2704 | +passthru_streambuf::passthru_streambuf( char const*, streambuf *orig ) : |
2705 | + proxy_streambuf( orig ) |
2706 | +{ |
2707 | + if ( !orig ) |
2708 | + throw invalid_argument( "null streambuf" ); |
2709 | +} |
2710 | + |
2711 | +passthru_streambuf::~passthru_streambuf() { |
2712 | + // out-of-line since it's virtual |
2713 | +} |
2714 | + |
2715 | +void passthru_streambuf::imbue( std::locale const &loc ) { |
2716 | + original()->pubimbue( loc ); |
2717 | +} |
2718 | + |
2719 | +bool passthru_streambuf::is_necessary( char const *cc_charset ) { |
2720 | + zstring charset( cc_charset ); |
2721 | + ascii::trim_whitespace( charset ); |
2722 | + ascii::to_upper( charset ); |
2723 | + return charset != "ASCII" |
2724 | + && charset != "US-ASCII" |
2725 | + && charset != "UTF-8"; |
2726 | +} |
2727 | + |
2728 | +bool passthru_streambuf::is_supported( char const *cc_charset ) { |
2729 | + return !is_necessary( cc_charset ); |
2730 | +} |
2731 | + |
2732 | +passthru_streambuf::pos_type |
2733 | +passthru_streambuf::seekoff( off_type o, ios_base::seekdir d, |
2734 | + ios_base::openmode m ) { |
2735 | + return original()->pubseekoff( o, d, m ); |
2736 | +} |
2737 | + |
2738 | +passthru_streambuf::pos_type |
2739 | +passthru_streambuf::seekpos( pos_type p, ios_base::openmode m ) { |
2740 | + return original()->pubseekpos( p, m ); |
2741 | +} |
2742 | + |
2743 | +streambuf* passthru_streambuf::setbuf( char_type *p, streamsize s ) { |
2744 | + original()->pubsetbuf( p, s ); |
2745 | + return this; |
2746 | +} |
2747 | + |
2748 | +streamsize passthru_streambuf::showmanyc() { |
2749 | + return original()->in_avail(); |
2750 | +} |
2751 | + |
2752 | +int passthru_streambuf::sync() { |
2753 | + return original()->pubsync(); |
2754 | +} |
2755 | + |
2756 | +passthru_streambuf::int_type passthru_streambuf::overflow( int_type c ) { |
2757 | + return original()->sputc( c ); |
2758 | +} |
2759 | + |
2760 | +passthru_streambuf::int_type passthru_streambuf::pbackfail( int_type c ) { |
2761 | + return original()->sputbackc( traits_type::to_char_type( c ) ); |
2762 | +} |
2763 | + |
2764 | +passthru_streambuf::int_type passthru_streambuf::uflow() { |
2765 | + return original()->sbumpc(); |
2766 | +} |
2767 | + |
2768 | +passthru_streambuf::int_type passthru_streambuf::underflow() { |
2769 | + return original()->sgetc(); |
2770 | +} |
2771 | + |
2772 | +streamsize passthru_streambuf::xsgetn( char_type *to, streamsize size ) { |
2773 | + return original()->sgetn( to, size ); |
2774 | +} |
2775 | + |
2776 | +streamsize passthru_streambuf::xsputn( char_type const *from, |
2777 | + streamsize size ) { |
2778 | + return original()->sputn( from, size ); |
2779 | +} |
2780 | + |
2781 | +/////////////////////////////////////////////////////////////////////////////// |
2782 | + |
2783 | +} // namespace zorba |
2784 | +/* vim:set et sw=2 ts=2: */ |
2785 | |
2786 | === added file 'src/util/passthru_streambuf.h' |
2787 | --- src/util/passthru_streambuf.h 1970-01-01 00:00:00 +0000 |
2788 | +++ src/util/passthru_streambuf.h 2012-02-16 02:19:18 +0000 |
2789 | @@ -0,0 +1,76 @@ |
2790 | +/* |
2791 | + * Copyright 2006-2008 The FLWOR Foundation. |
2792 | + * |
2793 | + * Licensed under the Apache License, Version 2.0 (the "License"); |
2794 | + * you may not use this file except in compliance with the License. |
2795 | + * You may obtain a copy of the License at |
2796 | + * |
2797 | + * http://www.apache.org/licenses/LICENSE-2.0 |
2798 | + * |
2799 | + * Unless required by applicable law or agreed to in writing, software |
2800 | + * distributed under the License is distributed on an "AS IS" BASIS, |
2801 | + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
2802 | + * See the License for the specific language governing permissions and |
2803 | + * limitations under the License. |
2804 | + */ |
2805 | + |
2806 | +#ifndef ZORBA_PASSTHRU_STREAMBUF_H |
2807 | +#define ZORBA_PASSTHRU_STREAMBUF_H |
2808 | + |
2809 | +#include <zorba/transcode_stream.h> |
2810 | + |
2811 | +namespace zorba { |
2812 | + |
2813 | +/////////////////////////////////////////////////////////////////////////////// |
2814 | + |
2815 | +/** |
2816 | + * A %passthru_streambuf is-a std::streambuf that forwards every operation unchanged to the original streambuf (no transcoding). |
2817 | + */ |
2818 | +class passthru_streambuf : public proxy_streambuf { |
2819 | +public: |
2820 | + /** |
2821 | + * Constructs a %passthru_streambuf. |
2822 | + * |
2823 | + * @param charset The name of the character encoding to convert from/to. |
2824 | + * @param orig The original streambuf to read/write from/to. |
2825 | + */ |
2826 | + passthru_streambuf( char const *charset, std::streambuf *orig ); |
2827 | + |
2828 | + /** |
2829 | + * Destructs a %passthru_streambuf. |
2830 | + */ |
2831 | + ~passthru_streambuf(); |
2832 | + |
2833 | + /** |
2834 | + * Checks whether the given character set is supported for transcoding. |
2835 | + * |
2836 | + * @param charset The name of the character encoding to check. |
2837 | + * @return \c true only if the character encoding is supported. |
2838 | + */ |
2839 | + static bool is_supported( char const *charset ); |
2840 | + |
2841 | +protected: |
2842 | + void imbue( std::locale const& ); |
2843 | + pos_type seekoff( off_type, std::ios_base::seekdir, std::ios_base::openmode ); |
2844 | + pos_type seekpos( pos_type, std::ios_base::openmode ); |
2845 | + std::streambuf* setbuf( char_type*, std::streamsize ); |
2846 | + std::streamsize showmanyc(); |
2847 | + int sync(); |
2848 | + int_type overflow( int_type ); |
2849 | + int_type pbackfail( int_type ); |
2850 | + int_type uflow(); |
2851 | + int_type underflow(); |
2852 | + std::streamsize xsgetn( char_type*, std::streamsize ); |
2853 | + std::streamsize xsputn( char_type const*, std::streamsize ); |
2854 | + |
2855 | +private: |
2856 | + // forbid |
2857 | + passthru_streambuf( passthru_streambuf const& ); |
2858 | + passthru_streambuf& operator=( passthru_streambuf const& ); |
2859 | +}; |
2860 | + |
2861 | +/////////////////////////////////////////////////////////////////////////////// |
2862 | + |
2863 | +} // namespace zorba |
2864 | +#endif /* ZORBA_PASSTHRU_STREAMBUF_H */ |
2865 | +/* vim:set et sw=2 ts=2: */ |
2866 | |
2867 | === added file 'src/util/transcode_streambuf.h' |
2868 | --- src/util/transcode_streambuf.h 1970-01-01 00:00:00 +0000 |
2869 | +++ src/util/transcode_streambuf.h 2012-02-16 02:19:18 +0000 |
2870 | @@ -0,0 +1,47 @@ |
2871 | +/* |
2872 | + * Copyright 2006-2008 The FLWOR Foundation. |
2873 | + * |
2874 | + * Licensed under the Apache License, Version 2.0 (the "License"); |
2875 | + * you may not use this file except in compliance with the License. |
2876 | + * You may obtain a copy of the License at |
2877 | + * |
2878 | + * http://www.apache.org/licenses/LICENSE-2.0 |
2879 | + * |
2880 | + * Unless required by applicable law or agreed to in writing, software |
2881 | + * distributed under the License is distributed on an "AS IS" BASIS, |
2882 | + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
2883 | + * See the License for the specific language governing permissions and |
2884 | + * limitations under the License. |
2885 | + */ |
2886 | + |
2887 | +#ifndef ZORBA_TRANSCODE_STREAMBUF_H |
2888 | +#define ZORBA_TRANSCODE_STREAMBUF_H |
2889 | + |
2890 | +#include <zorba/config.h> |
2891 | + |
2892 | +/////////////////////////////////////////////////////////////////////////////// |
2893 | + |
2894 | +#ifdef ZORBA_NO_UNICODE |
2895 | +# include "passthru_streambuf.h" |
2896 | +#else |
2897 | +# include "icu_streambuf.h" |
2898 | +#endif /* ZORBA_NO_UNICODE */ |
2899 | + |
2900 | +namespace zorba { |
2901 | +namespace internal { |
2902 | +namespace transcode { |
2903 | + |
2904 | +#ifdef ZORBA_NO_UNICODE |
2905 | +typedef passthru_streambuf streambuf; |
2906 | +#else |
2907 | +typedef icu_streambuf streambuf; |
2908 | +#endif /* ZORBA_NO_UNICODE */ |
2909 | + |
2910 | +} // namespace transcode |
2911 | +} // namespace internal |
2912 | +} // namespace zorba |
2913 | + |
2914 | +/////////////////////////////////////////////////////////////////////////////// |
2915 | + |
2916 | +#endif /* ZORBA_TRANSCODE_STREAMBUF_H */ |
2917 | +/* vim:set et sw=2 ts=2: */ |
2918 | |
2919 | === added file 'test/rbkt/ExpQueryResults/zorba/file/cp1252.xml.res' |
2920 | --- test/rbkt/ExpQueryResults/zorba/file/cp1252.xml.res 1970-01-01 00:00:00 +0000 |
2921 | +++ test/rbkt/ExpQueryResults/zorba/file/cp1252.xml.res 2012-02-16 02:19:18 +0000 |
2922 | @@ -0,0 +1,1 @@ |
2923 | +üäö |
2924 | |
2925 | === added file 'test/rbkt/Queries/zorba/file/cp1252.txt' |
2926 | --- test/rbkt/Queries/zorba/file/cp1252.txt 1970-01-01 00:00:00 +0000 |
2927 | +++ test/rbkt/Queries/zorba/file/cp1252.txt 2012-02-16 02:19:18 +0000 |
2928 | @@ -0,0 +1,1 @@ |
2929 | +üäö |
2930 | |
2931 | === added file 'test/rbkt/Queries/zorba/file/cp1252.xq' |
2932 | --- test/rbkt/Queries/zorba/file/cp1252.xq 1970-01-01 00:00:00 +0000 |
2933 | +++ test/rbkt/Queries/zorba/file/cp1252.xq 2012-02-16 02:19:18 +0000 |
2934 | @@ -0,0 +1,3 @@ |
2935 | +import module namespace f = "http://expath.org/ns/file"; |
2936 | + |
2937 | +f:read-text(fn:resolve-uri("cp1252.txt"), "CP1252") |
2938 | |
2939 | === added file 'test/rbkt/Queries/zorba/file/invalid_encoding.spec' |
2940 | --- test/rbkt/Queries/zorba/file/invalid_encoding.spec 1970-01-01 00:00:00 +0000 |
2941 | +++ test/rbkt/Queries/zorba/file/invalid_encoding.spec 2012-02-16 02:19:18 +0000 |
2942 | @@ -0,0 +1,1 @@ |
2943 | +Error: http://expath.org/ns/file:FOFL0006 |
2944 | |
2945 | === added file 'test/rbkt/Queries/zorba/file/invalid_encoding.xq' |
2946 | --- test/rbkt/Queries/zorba/file/invalid_encoding.xq 1970-01-01 00:00:00 +0000 |
2947 | +++ test/rbkt/Queries/zorba/file/invalid_encoding.xq 2012-02-16 02:19:18 +0000 |
2948 | @@ -0,0 +1,3 @@ |
2949 | +import module namespace f = "http://expath.org/ns/file"; |
2950 | + |
2951 | +f:read-text(fn:resolve-uri("cp1252.txt"), "FOO") |
2952 | |
2953 | === modified file 'test/rbkt/Queries/zorba/http-client/send-request/http2-read-svg.xq' |
2954 | --- test/rbkt/Queries/zorba/http-client/send-request/http2-read-svg.xq 2011-08-23 07:11:31 +0000 |
2955 | +++ test/rbkt/Queries/zorba/http-client/send-request/http2-read-svg.xq 2012-02-16 02:19:18 +0000 |
2956 | @@ -7,9 +7,9 @@ |
2957 | auth-method="Basic" |
2958 | send-authorization="true" |
2959 | username="zorba" |
2960 | - password="blub"/>; |
2961 | + password="blub" |
2962 | + override-media-type="application/xml; charset=utf-8"/>; |
2963 | |
2964 | variable $http-res := http:send-request($req, (), ()); |
2965 | |
2966 | $http-res[2] |
2967 | - |
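For readers following the `underflow()` logic in `icu_streambuf.cpp`: the idea is to pull one external byte at a time from the original streambuf, transcode it, and expose the resulting UTF-8 bytes through a tiny get area. The sketch below illustrates that idea only. It is not the branch's code: `latin1_to_utf8_streambuf` and `latin1_to_utf8()` are hypothetical names, ICU is replaced by a hand-written ISO-8859-1 to UTF-8 conversion (an assumption made so the example is self-contained), and only reading is covered.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <streambuf>
#include <string>

// Hypothetical stand-in for icu_streambuf (read side only): presents the
// ISO-8859-1 bytes of an underlying streambuf as UTF-8.
class latin1_to_utf8_streambuf : public std::streambuf {
public:
  explicit latin1_to_utf8_streambuf( std::streambuf *orig ) : orig_( orig ) {
    if ( !orig )
      throw std::invalid_argument( "null streambuf" );
  }

protected:
  int_type underflow() {
    if ( gptr() < egptr() )              // bytes of the current char remain
      return traits_type::to_int_type( *gptr() );

    int_type const c = orig_->sbumpc();  // consume one external byte
    if ( traits_type::eq_int_type( c, traits_type::eof() ) )
      return traits_type::eof();

    unsigned char const b =
      static_cast<unsigned char>( traits_type::to_char_type( c ) );
    int n = 0;
    if ( b < 0x80 ) {                    // ASCII: one UTF-8 byte
      utf8_char_[ n++ ] = static_cast<char>( b );
    } else {                             // U+0080..U+00FF: two UTF-8 bytes
      utf8_char_[ n++ ] = static_cast<char>( 0xC0 | (b >> 6) );
      utf8_char_[ n++ ] = static_cast<char>( 0x80 | (b & 0x3F) );
    }
    assert( n == 1 || n == 2 );

    // Expose the encoded character as the whole get area.
    setg( utf8_char_, utf8_char_, utf8_char_ + n );
    return traits_type::to_int_type( *gptr() );
  }

private:
  std::streambuf *const orig_;
  char utf8_char_[ 2 ];                  // holds one encoded character

  // forbid
  latin1_to_utf8_streambuf( latin1_to_utf8_streambuf const& );
  latin1_to_utf8_streambuf& operator=( latin1_to_utf8_streambuf const& );
};

// Hypothetical convenience wrapper: transcodes a whole string via the buffer.
std::string latin1_to_utf8( std::string const &latin1 ) {
  std::istringstream in( latin1 );
  latin1_to_utf8_streambuf xbuf( in.rdbuf() );
  std::istream is( &xbuf );
  std::string out;
  char ch;
  while ( is.get( ch ) )
    out += ch;
  return out;
}
```

With this sketch, `üäö` encoded as ISO-8859-1 (bytes `FC E4 F6`; CP1252 uses the same values for these three characters) comes out as the UTF-8 bytes `C3 BC C3 A4 C3 B6`. A real converter such as ICU also has to handle multi-byte external encodings, which is why the branch loops until the converter actually produces output.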
Validation queue starting for merge proposal.
Log at: http://zorbatest.lambda.nu:8080/remotequeue/feature-transcode_streambuf-2012-02-08T19-21-05.882Z/log.html
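A side note on the usage example in the `icu_streambuf.h` doc comment: it restores the original streambuf only on the exception path. One possible way to cover both the normal and the exceptional exit is a small RAII guard. The sketch below is an assumption for illustration, not something this branch provides: `rdbuf_guard` and `restores_after_exception()` are hypothetical names, and plain `std::ostringstream` buffers stand in for a transcoding streambuf.

```cpp
#include <ios>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <streambuf>

// Hypothetical helper: installs a replacement streambuf on construction and
// puts the previous one back on destruction, whether or not an exception
// is in flight.
class rdbuf_guard {
public:
  // Taking std::ios& reaches basic_ios::rdbuf(streambuf*) even for stream
  // classes that hide it (the reason the doc comment writes is.ios::rdbuf()).
  rdbuf_guard( std::ios &stream, std::streambuf *replacement ) :
    stream_( stream ), saved_( stream.rdbuf( replacement ) )
  {
  }

  ~rdbuf_guard() {
    stream_.rdbuf( saved_ );
  }

private:
  std::ios &stream_;
  std::streambuf *const saved_;

  // forbid
  rdbuf_guard( rdbuf_guard const& );
  rdbuf_guard& operator=( rdbuf_guard const& );
};

// Usage: output goes to the temporary buffer only while the guard is alive.
bool restores_after_exception() {
  std::ostringstream original, temporary;
  std::ostream os( original.rdbuf() );
  try {
    rdbuf_guard guard( os, temporary.rdbuf() );
    os << "transcoded";
    throw std::runtime_error( "simulated failure" );
  }
  catch ( std::runtime_error const& ) {
    // the guard's destructor already ran during stack unwinding
  }
  os << "plain";
  return temporary.str() == "transcoded" && original.str() == "plain";
}
```

Note that `basic_ios::rdbuf(streambuf*)` also clears the stream's error state, so a guard like this resets any failbit when it restores the buffer; callers that care about the state would need to save and reapply it.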