Merge lp:~zorba-coders/zorba/tokenize into lp:zorba

Proposed by Matthias Brantner
Status: Superseded
Proposed branch: lp:~zorba-coders/zorba/tokenize
Merge into: lp:zorba
Diff against target: 603 lines (+366/-2)
25 files modified
ChangeLog (+2/-0)
modules/com/zorba-xquery/www/modules/CMakeLists.txt (+1/-1)
modules/com/zorba-xquery/www/modules/string.xq (+21/-1)
src/functions/pregenerated/func_strings.cpp (+23/-0)
src/functions/pregenerated/func_strings.h (+13/-0)
src/functions/pregenerated/function_enum.h (+1/-0)
src/runtime/spec/strings/strings.xml (+31/-0)
src/runtime/strings/pregenerated/strings.cpp (+42/-0)
src/runtime/strings/pregenerated/strings.h (+52/-0)
src/runtime/strings/strings_impl.cpp (+130/-0)
src/runtime/visitors/pregenerated/planiter_visitor.h (+5/-0)
src/runtime/visitors/pregenerated/printer_visitor.cpp (+14/-0)
src/runtime/visitors/pregenerated/printer_visitor.h (+3/-0)
test/rbkt/ExpQueryResults/zorba/string/tokenize01.xml.res (+1/-0)
test/rbkt/ExpQueryResults/zorba/string/tokenize02.xml.res (+1/-0)
test/rbkt/ExpQueryResults/zorba/string/tokenize03.xml.res (+1/-0)
test/rbkt/ExpQueryResults/zorba/string/tokenize04.xml.res (+1/-0)
test/rbkt/Queries/zorba/string/token01.txt (+1/-0)
test/rbkt/Queries/zorba/string/token02.txt (+1/-0)
test/rbkt/Queries/zorba/string/token03.txt (+1/-0)
test/rbkt/Queries/zorba/string/token04.txt (+1/-0)
test/rbkt/Queries/zorba/string/tokenize01.xq (+5/-0)
test/rbkt/Queries/zorba/string/tokenize02.xq (+5/-0)
test/rbkt/Queries/zorba/string/tokenize03.xq (+5/-0)
test/rbkt/Queries/zorba/string/tokenize04.xq (+5/-0)
To merge this branch: bzr merge lp:~zorba-coders/zorba/tokenize
Reviewer            Review Type   Date Requested   Status
William Candillon                                  Approve
Matthias Brantner                                  Approve
Paul J. Lucas                                      Pending
Review via email: mp+86829@code.launchpad.net

This proposal supersedes a proposal from 2011-12-22.

This proposal has been superseded by a proposal from 2011-12-23.

Commit message

implementation of string:split function that doesn't accept regular expressions but allows for streamable processing of the input (resolves bug #898074)

Description of the change

implementation of string:split function that doesn't accept regular expressions but allows for streamable processing of the input (resolves bug #898074)
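
For readers unfamiliar with the idea, a minimal sketch of streamable splitting on a literal separator follows. It is not the code from this branch: `split_stream` and `split_all` are hypothetical names, the input is a plain `std::istream` rather than a Zorba streamable string, and dropping a trailing empty token is an assumption chosen to match the expected results of the tests added in this proposal. Matching is done byte by byte, which is safe for UTF-8 because a valid separator can only match at character boundaries.

```cpp
#include <cassert>
#include <functional>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of Zorba): splits the bytes read from `in` on
// the literal `separator` and hands each token to `emit` as soon as it is
// complete, so only the current token is held in memory.
inline void split_stream(std::istream& in,
                         const std::string& separator,
                         const std::function<void(const std::string&)>& emit)
{
  if (separator.empty())
    throw std::invalid_argument("separator must not be empty");

  std::string token;
  char c;
  while (in.get(c)) {
    token.push_back(c);
    // A token is complete once the buffered text ends with the separator.
    if (token.size() >= separator.size() &&
        token.compare(token.size() - separator.size(),
                      separator.size(), separator) == 0) {
      token.erase(token.size() - separator.size());
      emit(token);
      token.clear();
    }
  }
  // Assumption: an empty token after a trailing separator is not emitted.
  if (!token.empty())
    emit(token);
}

// Convenience wrapper for small inputs; collects all tokens into a vector.
inline std::vector<std::string> split_all(const std::string& input,
                                          const std::string& separator)
{
  std::istringstream in(input);
  std::vector<std::string> tokens;
  split_stream(in, separator,
               [&tokens](const std::string& t) { tokens.push_back(t); });
  return tokens;
}
```

With this sketch, `split_all("abcd", "bc")` yields the tokens `a` and `d`, and `split_all("abcd", "f")` yields the single token `abcd`. Because no regular expression engine is involved, the only buffering needed is the token currently being built.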

Revision history for this message
Zorba Build Bot (zorba-buildbot) wrote : Posted in a previous version of this proposal

Validation queue job tokenize-2011-12-21T21-46-05.289Z is finished. The final status was:

All tests succeeded!

Revision history for this message
Zorba Build Bot (zorba-buildbot) wrote : Posted in a previous version of this proposal

Voting does not meet specified criteria. Required: Approve > 1, Disapprove < 1. Got: 2 Pending.

Revision history for this message
Paul J. Lucas (paul-lucas) wrote : Posted in a previous version of this proposal

On line 357, why do you call assert()? An invalid byte should throw an exception, not assert and dump core.

Revision history for this message
Paul J. Lucas (paul-lucas) wrote : Posted in a previous version of this proposal

> On line 357, why do you call assert()? An invalid byte should throw an
> exception, not assert and dump core.

I meant line 367.

Revision history for this message
Paul J. Lucas (paul-lucas) : Posted in a previous version of this proposal
review: Needs Fixing
Revision history for this message
Matthias Brantner (matthias-brantner) wrote : Posted in a previous version of this proposal

Once you have finished the implementation of the transcoding stream buffer, I don't even want to do this check anymore. This must not happen with the stream buffer.

Revision history for this message
Paul J. Lucas (paul-lucas) wrote : Posted in a previous version of this proposal

> Once you finished the implementation of the transcoding stream buffer, I don't
> even want to do this check anymore. This must not happen with the stream
> buffer.

I don't understand how it "must not happen." It can always happen. However, I think you're saying that you assume the check will happen in the transcoder. While it will be doing checks, bad input can still happen.

In the meantime, using an assert() is still too Draconian.

Revision history for this message
Matthias Brantner (matthias-brantner) wrote : Posted in a previous version of this proposal

I have replaced the assertion with a graceful error.
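
The point of this exchange, reporting bad input through an exception rather than aborting the process, can be sketched as below. This is only an illustration under stated assumptions: `invalid_utf8_error`, `format_bytes` and `utf8_sequence_length` are hypothetical names, whereas the branch itself throws `XQUERY_EXCEPTION` with `zerr::ZXQD0006_INVALID_UTF8_BYTE_SEQUENCE`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <stdexcept>
#include <string>

// Hypothetical exception type standing in for Zorba's XQUERY_EXCEPTION.
struct invalid_utf8_error : std::runtime_error {
  explicit invalid_utf8_error(const std::string& bytes)
    : std::runtime_error("invalid UTF-8 byte sequence: " + bytes) {}
};

// Formats raw bytes as e.g. "0xC3,0x28" for use in an error message.
inline std::string format_bytes(const unsigned char* bytes, std::size_t n)
{
  std::string out;
  char hex[5];  // "0x" + two hex digits + terminating NUL
  for (std::size_t i = 0; i < n; ++i) {
    if (i > 0)
      out += ',';
    std::snprintf(hex, sizeof hex, "0x%02X", static_cast<unsigned>(bytes[i]));
    out += hex;
  }
  return out;
}

// Returns the length of the UTF-8 sequence that starts with `lead`.
// An invalid lead byte is ordinary bad input, so it raises an exception
// the caller can catch instead of tripping an assert() and dumping core.
inline std::size_t utf8_sequence_length(unsigned char lead)
{
  if (lead < 0x80) return 1;
  if (lead >= 0xC2 && lead <= 0xDF) return 2;
  if (lead >= 0xE0 && lead <= 0xEF) return 3;
  if (lead >= 0xF0 && lead <= 0xF4) return 4;
  throw invalid_utf8_error(format_bytes(&lead, 1));
}
```

A caller can wrap the decoding loop in a `try` block and turn `invalid_utf8_error` into a user-visible query error, which is not possible once `assert()` has terminated the process.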

Revision history for this message
Paul J. Lucas (paul-lucas) : Posted in a previous version of this proposal
review: Approve
Revision history for this message
William Candillon (wcandillon) wrote : Posted in a previous version of this proposal

Is there an example that works with streaming?
I wasn't able to make the following work:
import module namespace http = "http://www.zorba-xquery.com/modules/http-client";

declare namespace h = "http://expath.org/ns/http-client";

let $item := http:send-request(<h:request href="https://stream.twitter.com/1/statuses/sample.json?delimited=length"
                                          method="GET"
                                          username="wcandillon"
                                          password="wnvbb86g"
                                          override-media-type="text/plain"
                               />
                               ,
                               "https://stream.twitter.com/1/statuses/sample.json?delimited=length",
                               ()
                                )[2]
for $tweet in tokenize($item,"a")
return $tweet

Whereas the following:
import module namespace http = "http://www.zorba-xquery.com/modules/http-client";

declare namespace h = "http://expath.org/ns/http-client";

let $item := http:send-request(<h:request href="https://stream.twitter.com/1/statuses/sample.json?delimited=length"
                                          method="GET"
                                          username="wcandillon"
                                          password="wnvbb86g"
                                          override-media-type="text/plain"
                               />
                               ,
                               "https://stream.twitter.com/1/statuses/sample.json?delimited=length",
                               ()
                                )[2]
return $item

streams fine.
What am I missing?

Revision history for this message
Matthias Brantner (matthias-brantner) wrote : Posted in a previous version of this proposal

As discussed in this thread, only the new tokenize function of the string module streams.
Use the following instead:

import module namespace s = "http://www.zorba-xquery.com/modules/string";

s:tokenize($item, "a")

Revision history for this message
William Candillon (wcandillon) wrote : Posted in a previous version of this proposal

Works like a charm.

review: Approve
Revision history for this message
Zorba Build Bot (zorba-buildbot) wrote : Posted in a previous version of this proposal

Attempt to merge into lp:zorba failed due to conflicts:

text conflict in ChangeLog

Revision history for this message
Matthias Brantner (matthias-brantner) :
review: Approve
Revision history for this message
William Candillon (wcandillon) wrote :

Works great.

review: Approve
Revision history for this message
Zorba Build Bot (zorba-buildbot) wrote :

The attempt to merge lp:~zorba-coders/zorba/tokenize into lp:zorba failed. Below is the output from the failed tests.

CMake Error at /home/ceej/zo/testing/zorbatest/tester/TarmacLander.cmake:273 (message):
  Validation queue job tokenize-2011-12-23T20-55-01.864Z is finished. The
  final status was:

  1 tests did not succeed - changes not commited.

Error in read script: /home/ceej/zo/testing/zorbatest/tester/TarmacLander.cmake

Revision history for this message
Zorba Build Bot (zorba-buildbot) wrote :

The attempt to merge lp:~zorba-coders/zorba/tokenize into lp:zorba failed. Below is the output from the failed tests.

CMake Error at /home/ceej/zo/testing/zorbatest/tester/TarmacLander.cmake:273 (message):
  Validation queue job tokenize-2011-12-23T21-25-57.422Z is finished. The
  final status was:

  1 tests did not succeed - changes not commited.

Error in read script: /home/ceej/zo/testing/zorbatest/tester/TarmacLander.cmake

lp:~zorba-coders/zorba/tokenize updated
10589. By Matthias Brantner

forgot to commit pregenerated file

Unmerged revisions

Preview Diff

=== modified file 'ChangeLog'
--- ChangeLog 2011-12-23 19:38:53 +0000
+++ ChangeLog 2011-12-23 20:33:38 +0000
@@ -12,6 +12,8 @@
     set multiple times via the c++ api).
   * Fixed bug #905050 (setting and getting the context item type via the c++ api)
   * Added createDayTimeDuration, createYearMonthDuration, createDocumentNode, createCommentNode, createPiNode to api's ItemFactory.
+  * Added split function to the string module that allows for streamable tokenization but doesn't have regular expression
+    support.
   * zerr is not predeclared anymore to be http://www.zorba-xquery.com/errors
 
 version 2.1
=== modified file 'modules/com/zorba-xquery/www/modules/CMakeLists.txt'
--- modules/com/zorba-xquery/www/modules/CMakeLists.txt 2011-12-21 14:40:33 +0000
+++ modules/com/zorba-xquery/www/modules/CMakeLists.txt 2011-12-23 20:33:38 +0000
@@ -58,7 +58,7 @@
   URI "http://www.zorba-xquery.com/modules/reflection")
 DECLARE_ZORBA_MODULE(FILE schema.xq VERSION 2.0
   URI "http://www.zorba-xquery.com/modules/schema")
-DECLARE_ZORBA_MODULE(FILE string.xq VERSION 2.0
+DECLARE_ZORBA_MODULE(FILE string.xq VERSION 2.1
   URI "http://www.zorba-xquery.com/modules/string")
 DECLARE_ZORBA_MODULE(FILE xml.xq VERSION 2.0
   URI "http://www.zorba-xquery.com/modules/xml")
=== modified file 'modules/com/zorba-xquery/www/modules/string.xq'
--- modules/com/zorba-xquery/www/modules/string.xq 2011-08-03 15:12:40 +0000
+++ modules/com/zorba-xquery/www/modules/string.xq 2011-12-23 20:33:38 +0000
@@ -25,7 +25,7 @@
  :)
 module namespace string = "http://www.zorba-xquery.com/modules/string";
 declare namespace ver = "http://www.zorba-xquery.com/options/versioning";
-declare option ver:module-version "2.0";
+declare option ver:module-version "2.1";
 
 (:~
  : This function materializes a streamable string.
@@ -63,3 +63,23 @@
  :
  :)
 declare function string:is-streamable($s as xs:string) as xs:boolean external;
+
+(:~
+ : Returns a sequence of strings constructed by splitting the input wherever the given
+ : separator is found.
+ :
+ : The function is different from fn:tokenize. It doesn't allow
+ : the separator to be a regular expression. This restriction allows for more
+ : performant implementation. Specifically, the function processes
+ : streamable strings as input in a streamable way which is particularly useful
+ : to tokenize huge strings (e.g. if returned by the file module's read-text
+ : function).
+ :
+ : @param $s the input string to split
+ : @param $separator the separator used for splitting the input string $s
+ :
+ : @return a sequence of strings constructed by splitting the input
+ :)
+declare function string:split(
+  $s as xs:string,
+  $separator as xs:string) as xs:string* external;
=== modified file 'src/functions/pregenerated/func_strings.cpp'
--- src/functions/pregenerated/func_strings.cpp 2011-12-21 14:40:33 +0000
+++ src/functions/pregenerated/func_strings.cpp 2011-12-23 20:33:38 +0000
@@ -320,6 +320,16 @@
   return new StringIsStreamableIterator(sctx, loc, argv);
 }
 
+PlanIter_t fn_zorba_string_split::codegen(
+  CompilerCB*,
+  static_context* sctx,
+  const QueryLoc& loc,
+  std::vector<PlanIter_t>& argv,
+  AnnotationHolder& ann) const
+{
+  return new StringSplitIterator(sctx, loc, argv);
+}
+
 void populate_context_strings(static_context* sctx)
 {
   {
@@ -890,6 +900,19 @@
 
   }
 
+
+  {
+
+
+    DECL_WITH_KIND(sctx, fn_zorba_string_split,
+        (createQName("http://www.zorba-xquery.com/modules/string","","split"),
+        GENV_TYPESYSTEM.STRING_TYPE_ONE,
+        GENV_TYPESYSTEM.STRING_TYPE_ONE,
+        GENV_TYPESYSTEM.STRING_TYPE_STAR),
+        FunctionConsts::FN_ZORBA_STRING_SPLIT_2);
+
+  }
+
 }
 
 
=== modified file 'src/functions/pregenerated/func_strings.h'
--- src/functions/pregenerated/func_strings.h 2011-12-22 14:14:53 +0000
+++ src/functions/pregenerated/func_strings.h 2011-12-23 20:33:38 +0000
@@ -481,6 +481,19 @@
 };
 
 
+//fn-zorba-string:split
+class fn_zorba_string_split : public function
+{
+public:
+  fn_zorba_string_split(const signature& sig, FunctionConsts::FunctionKind kind)
+    : function(sig, kind) {
+
+}
+
+  CODEGEN_DECL();
+};
+
+
 } //namespace zorba
 
 
=== modified file 'src/functions/pregenerated/function_enum.h'
--- src/functions/pregenerated/function_enum.h 2011-12-21 14:40:33 +0000
+++ src/functions/pregenerated/function_enum.h 2011-12-23 20:33:38 +0000
@@ -371,6 +371,7 @@
   FN_ANALYZE_STRING_3,
   FN_ZORBA_STRING_MATERIALIZE_1,
   FN_ZORBA_STRING_IS_STREAMABLE_1,
+  FN_ZORBA_STRING_SPLIT_2,
   FN_ZORBA_XQDOC_XQDOC_1,
   FN_ZORBA_XQDOC_XQDOC_CONTENT_1,
 
=== modified file 'src/runtime/spec/strings/strings.xml'
--- src/runtime/spec/strings/strings.xml 2011-12-21 14:40:33 +0000
+++ src/runtime/spec/strings/strings.xml 2011-12-23 20:33:38 +0000
@@ -729,4 +729,35 @@
 
 </zorba:iterator>
 
+<!--
+/*******************************************************************************
+ * string:tokenize
+********************************************************************************/
+-->
+<zorba:iterator name="StringSplitIterator">
+
+  <zorba:description author="Matthias Brantner">
+    string:split
+  </zorba:description>
+
+  <zorba:function>
+    <zorba:signature localname="split" prefix="fn-zorba-string">
+      <zorba:param>xs:string</zorba:param>
+      <zorba:param>xs:string</zorba:param>
+      <zorba:output>xs:string*</zorba:output>
+    </zorba:signature>
+  </zorba:function>
+
+  <zorba:state>
+    <zorba:member type="zstring" name="theSeparator"
+                  brief="separator for the tokenization"/>
+    <zorba:member type="std::istream*" name="theIStream"
+                  brief="the remaining string (if the input is streamable)"/>
+    <zorba:member type="zstring" name="theInput"
+                  brief="the string to tokenize (if the input is not streamable)"/>
+    <zorba:member type="size_t" name="theNextStartPos" defaultValue="0"/>
+  </zorba:state>
+
+</zorba:iterator>
+
 </zorba:iterators>
=== modified file 'src/runtime/strings/pregenerated/strings.cpp'
--- src/runtime/strings/pregenerated/strings.cpp 2011-12-21 14:40:33 +0000
+++ src/runtime/strings/pregenerated/strings.cpp 2011-12-23 20:33:38 +0000
@@ -830,6 +830,48 @@
 // </StringIsStreamableIterator>
 
 
+// <StringSplitIterator>
+const char* StringSplitIterator::class_name_str = "StringSplitIterator";
+StringSplitIterator::class_factory<StringSplitIterator>
+StringSplitIterator::g_class_factory;
+
+const serialization::ClassVersion
+StringSplitIterator::class_versions[] ={{ 1, 0x000905, false}};
+
+const int StringSplitIterator::class_versions_count =
+sizeof(StringSplitIterator::class_versions)/sizeof(struct serialization::ClassVersion);
+
+void StringSplitIterator::accept(PlanIterVisitor& v) const {
+  v.beginVisit(*this);
+
+  std::vector<PlanIter_t>::const_iterator lIter = theChildren.begin();
+  std::vector<PlanIter_t>::const_iterator lEnd = theChildren.end();
+  for ( ; lIter != lEnd; ++lIter ){
+    (*lIter)->accept(v);
+  }
+
+  v.endVisit(*this);
+}
+
+StringSplitIterator::~StringSplitIterator() {}
+
+StringSplitIteratorState::StringSplitIteratorState() {}
+
+StringSplitIteratorState::~StringSplitIteratorState() {}
+
+
+void StringSplitIteratorState::init(PlanState& planState) {
+  PlanIteratorState::init(planState);
+  theNextStartPos = 0;
+}
+
+void StringSplitIteratorState::reset(PlanState& planState) {
+  PlanIteratorState::reset(planState);
+  theNextStartPos = 0;
+}
+// </StringSplitIterator>
+
+
 
 }
 
=== modified file 'src/runtime/strings/pregenerated/strings.h'
--- src/runtime/strings/pregenerated/strings.h 2011-12-21 14:40:33 +0000
+++ src/runtime/strings/pregenerated/strings.h 2011-12-23 20:33:38 +0000
@@ -1075,6 +1075,58 @@
 };
 
 
+/**
+ *
+ * string:split
+ *
+ * Author: Matthias Brantner
+ */
+class StringSplitIteratorState : public PlanIteratorState
+{
+public:
+  zstring theSeparator; //separator for the tokenization
+  std::istream* theIStream; //the remaining string (if the input is streamable)
+  zstring theInput; //the string to tokenize (if the input is not streamable)
+  size_t theNextStartPos; //
+
+  StringSplitIteratorState();
+
+  ~StringSplitIteratorState();
+
+  void init(PlanState&);
+  void reset(PlanState&);
+};
+
+class StringSplitIterator : public NaryBaseIterator<StringSplitIterator, StringSplitIteratorState>
+{
+public:
+  SERIALIZABLE_CLASS(StringSplitIterator);
+
+  SERIALIZABLE_CLASS_CONSTRUCTOR2T(StringSplitIterator,
+    NaryBaseIterator<StringSplitIterator, StringSplitIteratorState>);
+
+  void serialize( ::zorba::serialization::Archiver& ar)
+  {
+    serialize_baseclass(ar,
+    (NaryBaseIterator<StringSplitIterator, StringSplitIteratorState>*)this);
+  }
+
+  StringSplitIterator(
+    static_context* sctx,
+    const QueryLoc& loc,
+    std::vector<PlanIter_t>& children)
+    :
+    NaryBaseIterator<StringSplitIterator, StringSplitIteratorState>(sctx, loc, children)
+  {}
+
+  virtual ~StringSplitIterator();
+
+  void accept(PlanIterVisitor& v) const;
+
+  bool nextImpl(store::Item_t& result, PlanState& aPlanState) const;
+};
+
+
 }
 #endif
 /*
=== modified file 'src/runtime/strings/strings_impl.cpp'
--- src/runtime/strings/strings_impl.cpp 2011-12-23 06:41:43 +0000
+++ src/runtime/strings/strings_impl.cpp 2011-12-23 20:33:38 +0000
@@ -140,6 +140,7 @@
       p = ec;
 
       if ( utf8::read( *state->theStream, ec ) == utf8::npos )
+      {
         if ( state->theStream->good() ) {
           //
           // If read() failed but the stream state is good, it means that an
@@ -165,6 +166,7 @@
             zerr::ZOSE0003_STREAM_READ_FAILURE, ERROR_LOC( loc )
           );
         }
+      }
       state->theResult.clear();
       state->theResult.push_back( utf8::next_char( p ) );
 
@@ -2284,5 +2286,133 @@
   STACK_END(state);
 }
 
+/**
+ *______________________________________________________________________
+ *
+ * http://www.zorba-xquery.com/modules/string
+ * string:split
+ */
+bool StringSplitIterator::nextImpl(
+    store::Item_t& result,
+    PlanState& planState) const
+{
+  store::Item_t item;
+  size_t lNewPos = 0;
+  zstring lToken;
+  zstring lPartialMatch;
+
+  StringSplitIteratorState* state;
+  DEFAULT_STACK_INIT(StringSplitIteratorState, state, planState);
+
+  // init phase, get input string and tokens
+  consumeNext(item, theChildren[0].getp(), planState);
+
+  if (item->isStreamable())
+  {
+    state->theIStream = &item->getStream();
+  }
+  else
+  {
+    state->theIStream = 0;
+    item->getStringValue2(state->theInput);
+  }
+
+  consumeNext(item, theChildren[1].getp(), planState);
+
+  item->getStringValue2(state->theSeparator);
+
+  // working phase, do the tokenization
+  if (state->theIStream)
+  {
+    while ( !state->theIStream->eof() )
+    {
+      utf8::encoded_char_type ec;
+      memset( ec, '\0' , sizeof(ec) );
+      utf8::storage_type *p;
+      p = ec;
+
+      if ( utf8::read( *state->theIStream, ec ) != utf8::npos )
+      {
+        if (state->theSeparator.compare(lNewPos, 1, ec) == 0)
+        {
+          if (++lNewPos == state->theSeparator.length())
+          {
+            GENV_ITEMFACTORY->createString(result, lToken);
+            STACK_PUSH(true, state);
+          }
+          else
+          {
+            lPartialMatch.append(ec);
+          }
+        }
+        else
+        {
+          lToken.append(lPartialMatch);
+          lToken.append(ec);
+        }
+      }
+      else
+      {
+        if (state->theIStream->good())
+        {
+          char buf[ 6 /* bytes at most */ * 5 /* chars per byte */ ], *b = buf;
+          bool first = true;
+          for ( ; *p; ++p ) {
+            if ( first )
+              first = false;
+            else
+              *b++ = ',';
+            ::strcpy( b, "0x" ); b += 2;
+            ::sprintf( b, "%0hhX", *p ); b += 2;
+          }
+          throw XQUERY_EXCEPTION(
+            zerr::ZXQD0006_INVALID_UTF8_BYTE_SEQUENCE,
+            ERROR_PARAMS( buf ),
+            ERROR_LOC( loc )
+          );
+        }
+        if (!lToken.empty())
+        {
+          GENV_ITEMFACTORY->createString(result, lToken);
+          STACK_PUSH(true, state);
+        }
+        break;
+      }
+    }
+  }
+  else
+  {
+    while (true)
+    {
+      if (state->theNextStartPos == zstring::npos)
+      {
+        break;
+      }
+
+      lNewPos =
+        state->theInput.find(state->theSeparator, state->theNextStartPos);
+      if (lNewPos != zstring::npos)
+      {
+        zstring lSubStr = state->theInput.substr(
+            state->theNextStartPos,
+            lNewPos - state->theNextStartPos);
+        GENV_ITEMFACTORY->createString(result, lSubStr);
+        state->theNextStartPos =
+          lNewPos==state->theInput.length() - state->theSeparator.length()
+          ? zstring::npos
+          : lNewPos + state->theSeparator.length();
+      }
+      else
+      {
+        zstring lSubStr = state->theInput.substr(state->theNextStartPos);
+        GENV_ITEMFACTORY->createString(result, lSubStr);
+        state->theNextStartPos = zstring::npos;
+      }
+      STACK_PUSH(true, state);
+    }
+  }
+
+  STACK_END(state);
+}
 } // namespace zorba
 /* vim:set et sw=2 ts=2: */
=== modified file 'src/runtime/visitors/pregenerated/planiter_visitor.h'
--- src/runtime/visitors/pregenerated/planiter_visitor.h 2011-12-21 14:40:33 +0000
+++ src/runtime/visitors/pregenerated/planiter_visitor.h 2011-12-23 20:33:38 +0000
@@ -582,6 +582,8 @@
 
     class StringIsStreamableIterator;
 
+    class StringSplitIterator;
+
     class XQDocIterator;
 
     class XQDocContentIterator;
@@ -1423,6 +1425,9 @@
     virtual void beginVisit ( const StringIsStreamableIterator& ) = 0;
     virtual void endVisit ( const StringIsStreamableIterator& ) = 0;
 
+    virtual void beginVisit ( const StringSplitIterator& ) = 0;
+    virtual void endVisit ( const StringSplitIterator& ) = 0;
+
     virtual void beginVisit ( const XQDocIterator& ) = 0;
     virtual void endVisit ( const XQDocIterator& ) = 0;
 
=== modified file 'src/runtime/visitors/pregenerated/printer_visitor.cpp'
--- src/runtime/visitors/pregenerated/printer_visitor.cpp 2011-12-21 14:40:33 +0000
+++ src/runtime/visitors/pregenerated/printer_visitor.cpp 2011-12-23 20:33:38 +0000
@@ -3961,6 +3961,20 @@
 // </StringIsStreamableIterator>
 
 
+// <StringSplitIterator>
+void PrinterVisitor::beginVisit ( const StringSplitIterator& a) {
+  thePrinter.startBeginVisit("StringSplitIterator", ++theId);
+  printCommons( &a, theId );
+  thePrinter.endBeginVisit( theId );
+}
+
+void PrinterVisitor::endVisit ( const StringSplitIterator& ) {
+  thePrinter.startEndVisit();
+  thePrinter.endEndVisit();
+}
+// </StringSplitIterator>
+
+
 // <XQDocIterator>
 void PrinterVisitor::beginVisit ( const XQDocIterator& a) {
   thePrinter.startBeginVisit("XQDocIterator", ++theId);
=== modified file 'src/runtime/visitors/pregenerated/printer_visitor.h'
--- src/runtime/visitors/pregenerated/printer_visitor.h 2011-12-21 14:40:33 +0000
+++ src/runtime/visitors/pregenerated/printer_visitor.h 2011-12-23 20:33:38 +0000
@@ -876,6 +876,9 @@
     void beginVisit( const StringIsStreamableIterator& );
     void endVisit ( const StringIsStreamableIterator& );
 
+    void beginVisit( const StringSplitIterator& );
+    void endVisit ( const StringSplitIterator& );
+
     void beginVisit( const XQDocIterator& );
     void endVisit ( const XQDocIterator& );
 
=== added file 'test/rbkt/ExpQueryResults/zorba/string/tokenize01.xml.res'
--- test/rbkt/ExpQueryResults/zorba/string/tokenize01.xml.res 1970-01-01 00:00:00 +0000
+++ test/rbkt/ExpQueryResults/zorba/string/tokenize01.xml.res 2011-12-23 20:33:38 +0000
@@ -0,0 +1,1 @@
+a d a d
=== added file 'test/rbkt/ExpQueryResults/zorba/string/tokenize02.xml.res'
--- test/rbkt/ExpQueryResults/zorba/string/tokenize02.xml.res 1970-01-01 00:00:00 +0000
+++ test/rbkt/ExpQueryResults/zorba/string/tokenize02.xml.res 2011-12-23 20:33:38 +0000
@@ -0,0 +1,1 @@
+a a
=== added file 'test/rbkt/ExpQueryResults/zorba/string/tokenize03.xml.res'
--- test/rbkt/ExpQueryResults/zorba/string/tokenize03.xml.res 1970-01-01 00:00:00 +0000
+++ test/rbkt/ExpQueryResults/zorba/string/tokenize03.xml.res 2011-12-23 20:33:38 +0000
@@ -0,0 +1,1 @@
+ d d
=== added file 'test/rbkt/ExpQueryResults/zorba/string/tokenize04.xml.res'
--- test/rbkt/ExpQueryResults/zorba/string/tokenize04.xml.res 1970-01-01 00:00:00 +0000
+++ test/rbkt/ExpQueryResults/zorba/string/tokenize04.xml.res 2011-12-23 20:33:38 +0000
@@ -0,0 +1,1 @@
+abcd abcd
=== added file 'test/rbkt/Queries/zorba/string/token01.txt'
--- test/rbkt/Queries/zorba/string/token01.txt 1970-01-01 00:00:00 +0000
+++ test/rbkt/Queries/zorba/string/token01.txt 2011-12-23 20:33:38 +0000
@@ -0,0 +1,1 @@
+abcd
\ No newline at end of file
=== added file 'test/rbkt/Queries/zorba/string/token02.txt'
--- test/rbkt/Queries/zorba/string/token02.txt 1970-01-01 00:00:00 +0000
+++ test/rbkt/Queries/zorba/string/token02.txt 2011-12-23 20:33:38 +0000
@@ -0,0 +1,1 @@
+abc
\ No newline at end of file
=== added file 'test/rbkt/Queries/zorba/string/token03.txt'
--- test/rbkt/Queries/zorba/string/token03.txt 1970-01-01 00:00:00 +0000
+++ test/rbkt/Queries/zorba/string/token03.txt 2011-12-23 20:33:38 +0000
@@ -0,0 +1,1 @@
+bcd
\ No newline at end of file
=== added file 'test/rbkt/Queries/zorba/string/token04.txt'
--- test/rbkt/Queries/zorba/string/token04.txt 1970-01-01 00:00:00 +0000
+++ test/rbkt/Queries/zorba/string/token04.txt 2011-12-23 20:33:38 +0000
@@ -0,0 +1,1 @@
+abcd
\ No newline at end of file
=== added file 'test/rbkt/Queries/zorba/string/tokenize01.xq'
--- test/rbkt/Queries/zorba/string/tokenize01.xq 1970-01-01 00:00:00 +0000
+++ test/rbkt/Queries/zorba/string/tokenize01.xq 2011-12-23 20:33:38 +0000
@@ -0,0 +1,5 @@
+import module namespace f = "http://expath.org/ns/file";
+import module namespace s = "http://www.zorba-xquery.com/modules/string";
+
+s:split(f:read-text(fn:resolve-uri("token01.txt")), "bc"),
+s:split(s:materialize(f:read-text(fn:resolve-uri("token01.txt"))), "bc")
=== added file 'test/rbkt/Queries/zorba/string/tokenize02.xq'
--- test/rbkt/Queries/zorba/string/tokenize02.xq 1970-01-01 00:00:00 +0000
+++ test/rbkt/Queries/zorba/string/tokenize02.xq 2011-12-23 20:33:38 +0000
@@ -0,0 +1,5 @@
+import module namespace f = "http://expath.org/ns/file";
+import module namespace s = "http://www.zorba-xquery.com/modules/string";
+
+s:split(f:read-text(fn:resolve-uri("token02.txt")), "bc"),
+s:split(s:materialize(f:read-text(fn:resolve-uri("token02.txt"))), "bc")
=== added file 'test/rbkt/Queries/zorba/string/tokenize03.xq'
--- test/rbkt/Queries/zorba/string/tokenize03.xq 1970-01-01 00:00:00 +0000
+++ test/rbkt/Queries/zorba/string/tokenize03.xq 2011-12-23 20:33:38 +0000
@@ -0,0 +1,5 @@
+import module namespace f = "http://expath.org/ns/file";
+import module namespace s = "http://www.zorba-xquery.com/modules/string";
+
+s:split(f:read-text(fn:resolve-uri("token03.txt")), "bc"),
+s:split(s:materialize(f:read-text(fn:resolve-uri("token03.txt"))), "bc")
=== added file 'test/rbkt/Queries/zorba/string/tokenize04.xq'
--- test/rbkt/Queries/zorba/string/tokenize04.xq 1970-01-01 00:00:00 +0000
+++ test/rbkt/Queries/zorba/string/tokenize04.xq 2011-12-23 20:33:38 +0000
@@ -0,0 +1,5 @@
+import module namespace f = "http://expath.org/ns/file";
+import module namespace s = "http://www.zorba-xquery.com/modules/string";
+
+s:split(f:read-text(fn:resolve-uri("token04.txt")), "f"),
+s:split(s:materialize(f:read-text(fn:resolve-uri("token04.txt"))), "f")
