Merge lp:~paul-lucas/zorba/feature-ft_bw into lp:zorba
Status: Merged
Approved by: Matthias Brantner
Approved revision: 10856
Merged at revision: 10908
Proposed branch: lp:~paul-lucas/zorba/feature-ft_bw
Merge into: lp:zorba
Diff against target: 2563 lines (+758/-670) 25 files modified
  ChangeLog (+2/-0)
  include/zorba/tokenizer.h (+1/-1)
  modules/com/zorba-xquery/www/modules/full-text.xq (+43/-7)
  src/functions/func_ft_module_impl.cpp (+32/-3)
  src/functions/func_ft_module_impl.h (+20/-0)
  src/functions/function_consts.h (+3/-1)
  src/runtime/full_text/CMakeLists.txt (+1/-0)
  src/runtime/full_text/apply.h (+4/-0)
  src/runtime/full_text/ft_module_impl.cpp (+226/-125)
  src/runtime/full_text/ft_module_util.cpp (+57/-0)
  src/runtime/full_text/ft_module_util.h (+80/-0)
  src/runtime/full_text/ft_util.cpp (+24/-0)
  src/runtime/full_text/ft_util.h (+17/-1)
  src/runtime/full_text/pregenerated/ft_module.cpp (+0/-463)
  src/runtime/full_text/pregenerated/ft_module.h (+64/-10)
  src/runtime/full_text/tokenizer.cpp (+6/-16)
  src/runtime/json/jsonml_array.cpp (+7/-14)
  src/runtime/pregenerated/iterator_enum.h (+1/-0)
  src/runtime/spec/full_text/ft_module.xml (+63/-24)
  src/runtime/visitors/pregenerated/planiter_visitor.h (+7/-0)
  src/runtime/visitors/pregenerated/printer_visitor.cpp (+15/-0)
  src/runtime/visitors/pregenerated/printer_visitor.h (+5/-0)
  src/util/xml_util.h (+37/-5)
  test/rbkt/ExpQueryResults/zorba/fulltext/ft-module-tokenize-nodes-1.xml.res (+1/-0)
  test/rbkt/Queries/zorba/fulltext/ft-module-tokenize-nodes-1.xq (+42/-0)
To merge this branch: bzr merge lp:~paul-lucas/zorba/feature-ft_bw
Related bugs:
Reviewer | Review Type | Date Requested | Status
---|---|---|---
Matthias Brantner | | | Approve
Paul J. Lucas | | | Approve
Review via email: mp+112811@code.launchpad.net
Commit message
Added tokenize-nodes() function.
Description of the change
Added tokenize-nodes() function.
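For readers who want the include/exclude behaviour at a glance, here is a minimal standalone sketch of the traversal that the new function appears to perform, based on the `TokenizeNodesIterator::nextImpl` hunk in the preview diff. Everything in it is an assumption made for illustration: `Node`, `split_words` and this free-standing `tokenize_nodes` are hypothetical stand-ins, not Zorba types or APIs, and exclusion is simplified to pointer identity rather than the store's structural comparison.

```cpp
#include <cassert>
#include <list>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a store node; this is not Zorba's store::Item.
struct Node {
  enum Kind { Element, Text, Comment } kind;
  std::string text;             // used by Text and Comment nodes
  std::vector<Node*> children;  // used by Element nodes
};

// Whitespace splitter standing in for a real, language-aware tokenizer.
static void split_words(const std::string& s, std::vector<std::string>& out) {
  std::istringstream in(s);
  for (std::string w; in >> w; )
    out.push_back(w);
}

// Tokenizes `includes` and their descendants, skipping any node listed in
// `excludes` together with its subtree.
std::vector<std::string> tokenize_nodes(const std::vector<Node*>& includes,
                                        const std::vector<Node*>& excludes) {
  assert(!includes.empty());  // $includes is declared as node()+
  std::vector<std::string> tokens;
  std::list<const Node*> work(includes.begin(), includes.end());
  while (!work.empty()) {
    const Node* n = work.front();
    work.pop_front();

    bool excluded = false;
    for (const Node* e : excludes)
      if (e == n) { excluded = true; break; }
    if (excluded)
      continue;  // children are never queued, so the subtree is pruned

    if (n->kind == Node::Element) {
      // Queue children at the front of the work list, in document order.
      std::list<const Node*>::iterator pos = work.begin();
      for (const Node* c : n->children) {
        if (c->kind == Node::Comment)
          continue;  // comments are never included implicitly
        pos = work.insert(pos, c);
        ++pos;
      }
    } else {
      // Text nodes, and comments that were included explicitly.
      split_words(n->text, tokens);
    }
  }
  return tokens;
}
```

Calling this with a root element in `includes` and one of its child elements in `excludes` yields the words of every text node outside the excluded subtree. A comment node contributes tokens only when it is itself listed in `includes`, mirroring the "included explicitly" case in the diff.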
Paul J. Lucas (paul-lucas) :
Matthias Brantner (matthias-brantner) wrote :
Zorba Build Bot (zorba-buildbot) wrote :
Validation queue starting for merge proposal.
Log at: http://
Zorba Build Bot (zorba-buildbot) wrote :
Validation queue job feature-
All tests succeeded!
Zorba Build Bot (zorba-buildbot) wrote :
Voting does not meet specified criteria. Required: Approve > 1, Disapprove < 1, Needs Fixing < 1, Pending < 1. Got: 1 Approve, 1 Needs Fixing.
Paul J. Lucas (paul-lucas) wrote :
> - the changelog says that it's a new function but it has been there before
No it hasn't.
> - ft:tokenize-nodes#2 comment is confusing. Why does it say
> The default
> 74 + : <a href="http://
> 75 + : is assumed to be the one returned by <code>ft:current-lang()</code>
> in between the two pragmas.
Because it specifies the language for the $includes. It's part of the $includes documentation.
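As a rough illustration of how a default language can apply to `$includes` while still being overridden inside a subtree, the sketch below mirrors the language stack and sentinel technique visible in the `TokenizeNodesIterator::nextImpl` hunk of the preview diff. It is a simplified model under stated assumptions: `LangNode` and `assign_langs` are hypothetical names, languages are plain strings, and the `xml_lang` field stands in for whatever `find_lang_attribute` detects in the real code.

```cpp
#include <cassert>
#include <list>
#include <stack>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node type; not part of Zorba.
struct LangNode {
  bool is_text;                     // true for text nodes, false for elements
  std::string text;                 // content of a text node
  std::string xml_lang;             // non-empty if the element sets a language
  std::vector<LangNode*> children;  // children of an element
};

// Returns (text, language) pairs in document order. `default_lang` plays the
// role of $lang, or of ft:current-lang() when no $lang argument is given.
std::vector<std::pair<std::string, std::string>>
assign_langs(const LangNode& root, const std::string& default_lang) {
  std::vector<std::pair<std::string, std::string>> out;
  std::stack<std::string> langs;
  langs.push(default_lang);

  std::list<const LangNode*> work;
  work.push_back(&root);
  while (!work.empty()) {
    const LangNode* n = work.front();
    work.pop_front();
    if (n == nullptr) {        // sentinel: a language scope has ended
      assert(!langs.empty());
      langs.pop();
      continue;
    }
    if (n->is_text) {
      out.emplace_back(n->text, langs.top());
      continue;
    }
    bool pushed = false;
    if (!n->xml_lang.empty()) {  // the element overrides the language
      langs.push(n->xml_lang);
      pushed = true;
    }
    // Queue the children in document order at the front of the work list.
    std::list<const LangNode*>::iterator pos = work.begin();
    for (const LangNode* c : n->children) {
      pos = work.insert(pos, c);
      ++pos;
    }
    if (pushed)
      work.insert(pos, nullptr);  // marks where the override stops applying
  }
  return out;
}
```

The sentinel sits immediately after the last child of the element that changed the language, so the override is popped exactly when that subtree has been consumed and the surrounding default resumes for later siblings.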
Matthias Brantner (matthias-brantner) :
Zorba Build Bot (zorba-buildbot) wrote :
Validation queue starting for merge proposal.
Log at: http://
Zorba Build Bot (zorba-buildbot) wrote :
Validation queue job feature-
All tests succeeded!
Preview Diff
1 | === modified file 'ChangeLog' |
2 | --- ChangeLog 2012-06-29 13:25:20 +0000 |
3 | +++ ChangeLog 2012-06-29 16:57:20 +0000 |
4 | @@ -4,8 +4,10 @@ |
5 | version 2.x |
6 | |
7 | New Features: |
8 | + |
9 | * Item::isSeekable API extension for streamable content (xs:string and xs:base64Binary). |
10 | * Implemented the latest W3C specification for the group by clause |
11 | + * Added ft:tokenize-nodes() function to full-text module |
12 | * New XQuery 3.0 functions |
13 | - fn:parse-xml-fragment#1 |
14 | * Added support for transient maps to the http://www.zorba-xquery.com/modules/store/data-structures/unordered-map module. |
15 | |
16 | === modified file 'include/zorba/tokenizer.h' |
17 | --- include/zorba/tokenizer.h 2012-06-28 04:14:03 +0000 |
18 | +++ include/zorba/tokenizer.h 2012-06-29 16:57:20 +0000 |
19 | @@ -79,7 +79,7 @@ |
20 | |
21 | /** |
22 | * This member-function is called whenever an item that is being tokenized |
23 | - * is entered or exited. |
24 | + * is entered or exited. The default implementation does nothing. |
25 | * |
26 | * @param item The item being entered or exited. |
27 | * @param entering If \c true, the item is being entered; if \c false, the |
28 | |
29 | === modified file 'modules/com/zorba-xquery/www/modules/full-text.xq' |
30 | --- modules/com/zorba-xquery/www/modules/full-text.xq 2012-06-28 04:14:03 +0000 |
31 | +++ modules/com/zorba-xquery/www/modules/full-text.xq 2012-06-29 16:57:20 +0000 |
32 | @@ -767,14 +767,14 @@ |
33 | as xs:string* external; |
34 | |
35 | (:~ |
36 | - : Tokenizes the given node and all of its descendants. |
37 | + : Tokenizes the given node and all of its decendants. |
38 | : |
39 | : @param $node The node to tokenize. |
40 | : @param $lang The default |
41 | : <a href="http://www.w3.org/TR/xmlschema-2/#language">language</a> |
42 | : of <code>$node</code>. |
43 | : @return a (possibly empty) sequence of tokens. |
44 | - : @error err:FTST0009 if <code>$lang</code> is not supported in general. |
45 | + : @error err:FTST0009 if <code>$lang</code> is not supported. |
46 | : @example test/rbkt/Queries/zorba/fulltext/ft-module-tokenize-node-1.xq |
47 | :) |
48 | declare function ft:tokenize-node( $node as node(), $lang as xs:language ) |
49 | @@ -784,12 +784,11 @@ |
50 | : Tokenizes the given node and all of its descendants. |
51 | : |
52 | : @param $node The node to tokenize. |
53 | - : The document's default |
54 | + : The node's default |
55 | : <a href="http://www.w3.org/TR/xmlschema-2/#language">language</a> |
56 | : is assumed to be the one returned by <code>ft:current-lang()</code>. |
57 | : @return a (possibly empty) sequence of tokens. |
58 | - : @error err:FTST0009 if <code>ft:current-lang()</code> is not supported in |
59 | - : general. |
60 | + : @error err:FTST0009 if <code>ft:current-lang()</code> is not supported. |
61 | : @example test/rbkt/Queries/zorba/fulltext/ft-module-tokenize-node-2.xq |
62 | : @example test/rbkt/Queries/zorba/fulltext/ft-module-tokenize-node-3.xq |
63 | : @example test/rbkt/Queries/zorba/fulltext/ft-module-tokenize-node-4.xq |
64 | @@ -798,10 +797,47 @@ |
65 | as element(ft-schema:token)* external; |
66 | |
67 | (:~ |
68 | + : Tokenizes the set of nodes comprising <code>$includes</code> (and all of its |
69 | + : descendants) but excluding <code>$excludes</code> (and all of its |
70 | + : descendants), if any. |
71 | + : |
72 | + : @param $includes The set of nodes (and its descendants) to include. |
73 | + : The default |
74 | + : <a href="http://www.w3.org/TR/xmlschema-2/#language">language</a> |
75 | + : is assumed to be the one returned by <code>ft:current-lang()</code>. |
76 | + : @param $excludes The set of nodes (and its descendants) to exclude. |
77 | + : @return a (possibly empty) sequence of tokens. |
78 | + : @error err:FTST0009 if <code>ft:current-lang()</code> is not supported. |
79 | + : @example test/rbkt/Queries/zorba/fulltext/ft-module-tokenize-nodes-1.xq |
80 | + :) |
81 | +declare function ft:tokenize-nodes( $includes as node()+, |
82 | + $excludes as node()* ) |
83 | + as element(ft-schema:token)* external; |
84 | + |
85 | +(:~ |
86 | + : Tokenizes the set of nodes comprising <code>$includes</code> (and all of its |
87 | + : descendants) but excluding <code>$excludes</code> (and all of its |
88 | + : descendants), if any. |
89 | + : |
90 | + : @param $includes The set of nodes (and its descendants) to include. |
91 | + : @param $excludes The set of nodes (and its descendants) to exclude. |
92 | + : @param $lang The default |
93 | + : <a href="http://www.w3.org/TR/xmlschema-2/#language">language</a> |
94 | + : for nodes. |
95 | + : @return a (possibly empty) sequence of tokens. |
96 | + : @error err:FTST0009 if <code>$lang</code> is not supported. |
97 | + : @example test/rbkt/Queries/zorba/fulltext/ft-module-tokenize-nodes-1.xq |
98 | + :) |
99 | +declare function ft:tokenize-nodes( $includes as node()+, |
100 | + $excludes as node()*, |
101 | + $lang as xs:language ) |
102 | + as element(ft-schema:token)* external; |
103 | + |
104 | +(:~ |
105 | : Tokenizes the given string. |
106 | : |
107 | : @param $string The string to tokenize. |
108 | - : @param $lang The default |
109 | + : @param $lang The |
110 | : <a href="http://www.w3.org/TR/xmlschema-2/#language">language</a> |
111 | : of <code>$string</code>. |
112 | : @return a (possibly empty) sequence of tokens. |
113 | @@ -816,7 +852,7 @@ |
114 | : Tokenizes the given string. |
115 | : |
116 | : @param $string The string to tokenize. |
117 | - : The string's default |
118 | + : The string's |
119 | : <a href="http://www.w3.org/TR/xmlschema-2/#language">language</a> |
120 | : is assumed to be the one returned by <code>ft:current-lang()</code>. |
121 | : @return a (possibly empty) sequence of tokens. |
122 | |
123 | === modified file 'src/functions/func_ft_module_impl.cpp' |
124 | --- src/functions/func_ft_module_impl.cpp 2012-06-28 04:14:03 +0000 |
125 | +++ src/functions/func_ft_module_impl.cpp 2012-06-29 16:57:20 +0000 |
126 | @@ -36,6 +36,17 @@ |
127 | } |
128 | |
129 | |
130 | +PlanIter_t full_text_tokenize_nodes::codegen( |
131 | + CompilerCB*, |
132 | + static_context* sctx, |
133 | + const QueryLoc& loc, |
134 | + std::vector<PlanIter_t>& argv, |
135 | + expr& ann) const |
136 | +{ |
137 | + return new TokenizeNodesIterator(sctx, loc, argv); |
138 | +} |
139 | + |
140 | + |
141 | PlanIter_t full_text_tokenizer_properties::codegen( |
142 | CompilerCB*, |
143 | static_context* sctx, |
144 | @@ -59,7 +70,6 @@ |
145 | |
146 | #endif // ZORBA_NO_FULL_TEXT |
147 | |
148 | - |
149 | /////////////////////////////////////////////////////////////////////////////// |
150 | |
151 | void populate_context_ft_module_impl(static_context* sctx) |
152 | @@ -105,6 +115,25 @@ |
153 | tokenize_return_type), |
154 | FunctionConsts::FULL_TEXT_TOKENIZE_NODE_2); |
155 | } |
156 | + { |
157 | + DECL_WITH_KIND(sctx, |
158 | + full_text_tokenize_nodes, |
159 | + (createQName( FT_MODULE_NS, "", "tokenize-nodes"), |
160 | + GENV_TYPESYSTEM.ANY_NODE_TYPE_PLUS, |
161 | + GENV_TYPESYSTEM.ANY_NODE_TYPE_STAR, |
162 | + tokenize_return_type), |
163 | + FunctionConsts::FULL_TEXT_TOKENIZE_NODES_2); |
164 | + } |
165 | + { |
166 | + DECL_WITH_KIND(sctx, |
167 | + full_text_tokenize_nodes, |
168 | + (createQName( FT_MODULE_NS, "", "tokenize-nodes"), |
169 | + GENV_TYPESYSTEM.ANY_NODE_TYPE_PLUS, |
170 | + GENV_TYPESYSTEM.ANY_NODE_TYPE_STAR, |
171 | + GENV_TYPESYSTEM.LANGUAGE_TYPE_ONE, |
172 | + tokenize_return_type), |
173 | + FunctionConsts::FULL_TEXT_TOKENIZE_NODES_3); |
174 | + } |
175 | |
176 | xqtref_t tokenizer_properties_return_type = |
177 | GENV_TYPESYSTEM.create_node_type(store::StoreConsts::elementNode, |
178 | @@ -128,10 +157,10 @@ |
179 | tokenizer_properties_return_type), |
180 | FunctionConsts::FULL_TEXT_TOKENIZER_PROPERTIES_1); |
181 | } |
182 | -#endif // ZORBA_NO_FULL_TEXT |
183 | +#endif /* ZORBA_NO_FULL_TEXT */ |
184 | } |
185 | |
186 | - |
187 | +/////////////////////////////////////////////////////////////////////////////// |
188 | |
189 | } // namespace zorba |
190 | /* vim:set et sw=2 ts=2: */ |
191 | |
192 | === modified file 'src/functions/func_ft_module_impl.h' |
193 | --- src/functions/func_ft_module_impl.h 2012-06-28 04:14:03 +0000 |
194 | +++ src/functions/func_ft_module_impl.h 2012-06-29 16:57:20 +0000 |
195 | @@ -49,6 +49,26 @@ |
196 | }; |
197 | |
198 | |
199 | +//full-text:tokenize_nodes |
200 | +class full_text_tokenize_nodes : public function |
201 | +{ |
202 | +public: |
203 | + full_text_tokenize_nodes(const signature& sig, |
204 | + FunctionConsts::FunctionKind kind) : |
205 | + function(sig, kind) |
206 | + { |
207 | + |
208 | + } |
209 | + |
210 | + // Mark the function as accessing the dyn ctx so that it won't be |
211 | + // const-folded. We must prevent const-folding because the function |
212 | + // uses the store to get access to the tokenizer provider. |
213 | + bool accessesDynCtx() const { return true; } |
214 | + |
215 | + CODEGEN_DECL(); |
216 | +}; |
217 | + |
218 | + |
219 | //full-text:tokenizer-properties |
220 | class full_text_tokenizer_properties : public function |
221 | { |
222 | |
223 | === modified file 'src/functions/function_consts.h' |
224 | --- src/functions/function_consts.h 2012-06-28 04:14:03 +0000 |
225 | +++ src/functions/function_consts.h 2012-06-29 16:57:20 +0000 |
226 | @@ -238,7 +238,9 @@ |
227 | FULL_TEXT_TOKENIZER_PROPERTIES_0, |
228 | FULL_TEXT_TOKENIZE_NODE_2, |
229 | FULL_TEXT_TOKENIZE_NODE_1, |
230 | -#endif |
231 | + FULL_TEXT_TOKENIZE_NODES_3, |
232 | + FULL_TEXT_TOKENIZE_NODES_2, |
233 | +#endif /* ZORBA_NO_FULL_TEXT */ |
234 | |
235 | #include "functions/function_enum.h" |
236 | |
237 | |
238 | === modified file 'src/runtime/full_text/CMakeLists.txt' |
239 | --- src/runtime/full_text/CMakeLists.txt 2012-06-28 04:14:03 +0000 |
240 | +++ src/runtime/full_text/CMakeLists.txt 2012-06-29 16:57:20 +0000 |
241 | @@ -41,6 +41,7 @@ |
242 | thesaurus.cpp |
243 | tokenizer.cpp |
244 | default_tokenizer.cpp |
245 | + ft_module_util.cpp |
246 | ft_module.cpp |
247 | ) |
248 | |
249 | |
250 | === modified file 'src/runtime/full_text/apply.h' |
251 | --- src/runtime/full_text/apply.h 2012-06-28 04:14:03 +0000 |
252 | +++ src/runtime/full_text/apply.h 2012-06-29 16:57:20 +0000 |
253 | @@ -24,6 +24,8 @@ |
254 | |
255 | namespace zorba { |
256 | |
257 | +/////////////////////////////////////////////////////////////////////////////// |
258 | + |
259 | void apply_ftand( ft_all_matches const&, ft_all_matches const&, |
260 | ft_all_matches &result ); |
261 | |
262 | @@ -52,6 +54,8 @@ |
263 | void apply_ftwindow( ft_all_matches const&, ft_int window_size, ft_unit::type, |
264 | ft_all_matches &result ); |
265 | |
266 | +/////////////////////////////////////////////////////////////////////////////// |
267 | + |
268 | } // namespace zorba |
269 | #endif /* ZORBA_FULL_TEXT_APPLY_H */ |
270 | /* vim:set et sw=2 ts=2: */ |
271 | |
272 | === modified file 'src/runtime/full_text/ft_module_impl.cpp' |
273 | --- src/runtime/full_text/ft_module_impl.cpp 2012-06-28 04:14:03 +0000 |
274 | +++ src/runtime/full_text/ft_module_impl.cpp 2012-06-29 16:57:20 +0000 |
275 | @@ -13,7 +13,7 @@ |
276 | * See the License for the specific language governing permissions and |
277 | * limitations under the License. |
278 | */ |
279 | -#include "stdafx.h" |
280 | + |
281 | #include <zorba/config.h> |
282 | |
283 | // |
284 | @@ -23,6 +23,8 @@ |
285 | // |
286 | #ifndef ZORBA_NO_FULL_TEXT |
287 | |
288 | +#include "stdafx.h" |
289 | + |
290 | #include <limits> |
291 | #include <typeinfo> |
292 | |
293 | @@ -42,10 +44,12 @@ |
294 | #include "types/casting.h" |
295 | #include "types/typeimpl.h" |
296 | #include "types/typeops.h" |
297 | +#include "util/stl_util.h" |
298 | #include "util/utf8_util.h" |
299 | #include "zorbatypes/URI.h" |
300 | #include "zorbautils/locale.h" |
301 | |
302 | +#include "ft_module_util.h" |
303 | #include "ft_stop_words_set.h" |
304 | #include "ft_token_seq_iterator.h" |
305 | #include "ft_util.h" |
306 | @@ -87,6 +91,85 @@ |
307 | ); |
308 | } |
309 | |
310 | +static Tokenizer::ptr get_tokenizer( iso639_1::type lang, |
311 | + Tokenizer::State *t_state, |
312 | + QueryLoc const &loc ) { |
313 | + TokenizerProvider const *const provider = GENV_STORE.getTokenizerProvider(); |
314 | + ZORBA_ASSERT( provider ); |
315 | + Tokenizer::ptr tokenizer; |
316 | + if ( !provider->getTokenizer( lang, t_state, &tokenizer ) ) |
317 | + throw XQUERY_EXCEPTION( |
318 | + err::FTST0009 /* lang not supported */, |
319 | + ERROR_PARAMS( |
320 | + iso639_1::string_of[ lang ], ZED( FTST0009_BadTokenizerLang ) |
321 | + ), |
322 | + ERROR_LOC( loc ) |
323 | + ); |
324 | + return std::move( tokenizer ); |
325 | +} |
326 | + |
327 | +static void make_token_element( FTToken const &token, |
328 | + TokenQNames const &qnames, |
329 | + store::Item_t &result ) { |
330 | + zstring base_uri = static_context::ZORBA_FULL_TEXT_FN_NS; |
331 | + store::Item_t item, attr_node, node_name, type_name; |
332 | + store::NsBindings const ns_bindings; |
333 | + zstring value_string; |
334 | + |
335 | + type_name = GENV_TYPESYSTEM.XS_UNTYPED_QNAME; |
336 | + node_name = qnames.token; |
337 | + GENV_ITEMFACTORY->createElementNode( |
338 | + result, nullptr, node_name, type_name, false, false, |
339 | + ns_bindings, base_uri |
340 | + ); |
341 | + |
342 | + if ( token.lang() ) { |
343 | + value_string = iso639_1::string_of[ token.lang() ]; |
344 | + GENV_ITEMFACTORY->createString( item, value_string ); |
345 | + type_name = GENV_TYPESYSTEM.XS_UNTYPED_QNAME; |
346 | + node_name = qnames.lang; |
347 | + GENV_ITEMFACTORY->createAttributeNode( |
348 | + attr_node, result, node_name, type_name, item |
349 | + ); |
350 | + } |
351 | + |
352 | + ztd::to_string( token.para(), &value_string ); |
353 | + GENV_ITEMFACTORY->createString( item, value_string ); |
354 | + type_name = GENV_TYPESYSTEM.XS_UNTYPED_QNAME; |
355 | + node_name = qnames.paragraph; |
356 | + GENV_ITEMFACTORY->createAttributeNode( |
357 | + attr_node, result, node_name, type_name, item |
358 | + ); |
359 | + |
360 | + ztd::to_string( token.sent(), &value_string ); |
361 | + GENV_ITEMFACTORY->createString( item, value_string ); |
362 | + type_name = GENV_TYPESYSTEM.XS_UNTYPED_QNAME; |
363 | + node_name = qnames.sentence; |
364 | + GENV_ITEMFACTORY->createAttributeNode( |
365 | + attr_node, result, node_name, type_name, item |
366 | + ); |
367 | + |
368 | + value_string = token.value(); |
369 | + GENV_ITEMFACTORY->createString( item, value_string ); |
370 | + type_name = GENV_TYPESYSTEM.XS_UNTYPED_QNAME; |
371 | + node_name = qnames.value; |
372 | + GENV_ITEMFACTORY->createAttributeNode( |
373 | + attr_node, result, node_name, type_name, item |
374 | + ); |
375 | + |
376 | + if ( store::Item const *const token_item = token.item() ) { |
377 | + if ( GENV_STORE.getNodeReference( item, token_item ) ) { |
378 | + item->getStringValue2( value_string ); |
379 | + GENV_ITEMFACTORY->createString( item, value_string ); |
380 | + type_name = GENV_TYPESYSTEM.XS_UNTYPED_QNAME; |
381 | + node_name = qnames.node_ref; |
382 | + GENV_ITEMFACTORY->createAttributeNode( |
383 | + attr_node, result, node_name, type_name, item |
384 | + ); |
385 | + } |
386 | + } |
387 | +} |
388 | + |
389 | /////////////////////////////////////////////////////////////////////////////// |
390 | |
391 | bool CurrentCompareOptionsIterator::nextImpl( store::Item_t &result, |
392 | @@ -296,10 +379,9 @@ |
393 | } |
394 | |
395 | try { |
396 | - static_context const *const sctx = getStaticContext(); |
397 | - ZORBA_ASSERT( sctx ); |
398 | iso639_1::type const lang = get_lang_from( item, loc ); |
399 | - |
400 | + static_context const *const sctx = getStaticContext(); |
401 | + ZORBA_ASSERT( sctx ); |
402 | zstring error_msg; |
403 | auto_ptr<internal::Resource> rsrc = sctx->resolve_uri( |
404 | uri, internal::EntityData::THESAURUS, error_msg |
405 | @@ -369,7 +451,6 @@ |
406 | PlanIteratorState *state; |
407 | DEFAULT_STACK_INIT( PlanIteratorState, state, plan_state ); |
408 | |
409 | - |
410 | consumeNext( item, theChildren[0], plan_state ); |
411 | item->getStringValue2( word ); |
412 | utf8::to_lower( word ); |
413 | @@ -535,45 +616,12 @@ |
414 | |
415 | /////////////////////////////////////////////////////////////////////////////// |
416 | |
417 | -TokenizeNodeIterator::TokenizeNodeIterator( static_context *sctx, |
418 | - QueryLoc const &loc, |
419 | - std::vector<PlanIter_t>& children ): |
420 | - NaryBaseIterator<TokenizeNodeIterator,TokenizeNodeIteratorState>(sctx, loc, children) |
421 | -{ |
422 | - initMembers(); |
423 | -} |
424 | - |
425 | -void TokenizeNodeIterator::initMembers() { |
426 | - GENV_ITEMFACTORY->createQName( |
427 | - token_qname_, static_context::ZORBA_FULL_TEXT_FN_NS, "", "token" ); |
428 | - |
429 | - GENV_ITEMFACTORY->createQName( |
430 | - lang_qname_, "", "", "lang" ); |
431 | - |
432 | - GENV_ITEMFACTORY->createQName( |
433 | - para_qname_, "", "", "paragraph" ); |
434 | - |
435 | - GENV_ITEMFACTORY->createQName( |
436 | - sent_qname_, "", "", "sentence" ); |
437 | - |
438 | - GENV_ITEMFACTORY->createQName( |
439 | - value_qname_, "", "", "value" ); |
440 | - |
441 | - GENV_ITEMFACTORY->createQName( |
442 | - ref_qname_, "", "", "node-ref" ); |
443 | -} |
444 | - |
445 | bool TokenizeNodeIterator::nextImpl( store::Item_t &result, |
446 | PlanState &plan_state ) const { |
447 | - store::Item_t node_name, attr_node; |
448 | - zstring base_uri; |
449 | store::Item_t item; |
450 | iso639_1::type lang; |
451 | Tokenizer::State t_state; |
452 | - store::NsBindings const ns_bindings; |
453 | TokenizerProvider const *tokenizer_provider; |
454 | - store::Item_t type_name; |
455 | - zstring value_string; |
456 | |
457 | TokenizeNodeIteratorState *state; |
458 | DEFAULT_STACK_INIT( TokenizeNodeIteratorState, state, plan_state ); |
459 | @@ -594,66 +642,11 @@ |
460 | state->doc_item_->getTokens( *tokenizer_provider, t_state, lang ); |
461 | |
462 | while ( state->doc_tokens_->hasNext() ) { |
463 | - FTToken const *token; |
464 | - token = state->doc_tokens_->next(); |
465 | - ZORBA_ASSERT( token ); |
466 | - |
467 | - base_uri = static_context::ZORBA_FULL_TEXT_FN_NS; |
468 | - type_name = GENV_TYPESYSTEM.XS_UNTYPED_QNAME; |
469 | - node_name = token_qname_; |
470 | - GENV_ITEMFACTORY->createElementNode( |
471 | - result, nullptr, node_name, type_name, false, false, |
472 | - ns_bindings, base_uri |
473 | - ); |
474 | - |
475 | - if ( token->lang() ) { |
476 | - value_string = iso639_1::string_of[ token->lang() ]; |
477 | - GENV_ITEMFACTORY->createString( item, value_string ); |
478 | - type_name = GENV_TYPESYSTEM.XS_UNTYPED_QNAME; |
479 | - node_name = lang_qname_; |
480 | - GENV_ITEMFACTORY->createAttributeNode( |
481 | - attr_node, result, node_name, type_name, item |
482 | - ); |
483 | - } |
484 | - |
485 | - ztd::to_string( token->para(), &value_string ); |
486 | - GENV_ITEMFACTORY->createString( item, value_string ); |
487 | - type_name = GENV_TYPESYSTEM.XS_UNTYPED_QNAME; |
488 | - node_name = para_qname_; |
489 | - GENV_ITEMFACTORY->createAttributeNode( |
490 | - attr_node, result, node_name, type_name, item |
491 | - ); |
492 | - |
493 | - ztd::to_string( token->sent(), &value_string ); |
494 | - GENV_ITEMFACTORY->createString( item, value_string ); |
495 | - type_name = GENV_TYPESYSTEM.XS_UNTYPED_QNAME; |
496 | - node_name = sent_qname_; |
497 | - GENV_ITEMFACTORY->createAttributeNode( |
498 | - attr_node, result, node_name, type_name, item |
499 | - ); |
500 | - |
501 | - value_string = token->value(); |
502 | - GENV_ITEMFACTORY->createString( item, value_string ); |
503 | - type_name = GENV_TYPESYSTEM.XS_UNTYPED_QNAME; |
504 | - node_name = value_qname_; |
505 | - GENV_ITEMFACTORY->createAttributeNode( |
506 | - attr_node, result, node_name, type_name, item |
507 | - ); |
508 | - |
509 | - if ( store::Item const *const token_item = token->item() ) { |
510 | - if ( GENV_STORE.getNodeReference( item, token_item ) ) { |
511 | - item->getStringValue2( value_string ); |
512 | - GENV_ITEMFACTORY->createString( item, value_string ); |
513 | - type_name = GENV_TYPESYSTEM.XS_UNTYPED_QNAME; |
514 | - node_name = ref_qname_; |
515 | - GENV_ITEMFACTORY->createAttributeNode( |
516 | - attr_node, result, node_name, type_name, item |
517 | - ); |
518 | - } |
519 | - } |
520 | - |
521 | + make_token_element( |
522 | + *state->doc_tokens_->next(), state->token_qnames_, result |
523 | + ); |
524 | STACK_PUSH( true, state ); |
525 | - } // while |
526 | + } |
527 | } |
528 | |
529 | STACK_END( state ); |
530 | @@ -669,12 +662,140 @@ |
531 | state->doc_tokens_->reset(); |
532 | } |
533 | |
534 | -void TokenizeNodeIterator::serialize( serialization::Archiver &ar ) { |
535 | - serialize_baseclass( |
536 | - ar, (NaryBaseIterator<TokenizeNodeIterator,TokenizeNodeIteratorState>*)this |
537 | - ); |
538 | - if ( !ar.is_serializing_out() ) |
539 | - initMembers(); |
540 | +/////////////////////////////////////////////////////////////////////////////// |
541 | + |
542 | +bool TokenizeNodesIterator::nextImpl( store::Item_t &result, |
543 | + PlanState &plan_state ) const { |
544 | + store::Item_t item; |
545 | + iso639_1::type lang; |
546 | + Tokenizer::State t_state; |
547 | + Tokenizer::ptr tokenizer; |
548 | + |
549 | + TokenizeNodesIteratorState *state; |
550 | + DEFAULT_STACK_INIT( TokenizeNodesIteratorState, state, plan_state ); |
551 | + |
552 | + if ( theChildren.size() > 2 ) { |
553 | + consumeNext( item, theChildren[2], plan_state ); |
554 | + lang = get_lang_from( item, loc ); |
555 | + } else { |
556 | + static_context const *const sctx = getStaticContext(); |
557 | + ZORBA_ASSERT( sctx ); |
558 | + lang = get_lang_from( sctx ); |
559 | + } |
560 | + |
561 | + tokenizer = get_tokenizer( lang, &state->t_state_, loc ); |
562 | + |
563 | + // $includes |
564 | + while ( consumeNext( item, theChildren[0], plan_state ) ) |
565 | + state->includes_.push_back( item ); |
566 | + state->includes_.push_back( store::Item_t() ); // sentinel |
567 | + |
568 | + // $excludes |
569 | + while ( consumeNext( item, theChildren[1], plan_state ) ) { |
570 | + store::Item_t exc_si; |
571 | + GENV_STORE.getStructuralInformation( exc_si, item.getp() ); |
572 | + state->excludes_.push_back( exc_si ); |
573 | + } |
574 | + |
575 | + state->callback_.set_tokens( state->tokens_ ); |
576 | + state->langs_.push( lang ); |
577 | + state->tokenizers_.push( tokenizer.release() ); |
578 | + |
579 | + while ( true ) { |
580 | + if ( state->tokens_.empty() ) { |
581 | + if ( state->includes_.empty() ) |
582 | + break; |
583 | + |
584 | + store::Item_t inc( state->includes_.front() ); |
585 | + state->includes_.pop_front(); |
586 | + if ( inc.isNull() ) { // sentinel |
587 | + state->langs_.pop(); |
588 | + Tokenizer::ptr deleter( ztd::pop_stack( state->tokenizers_ ) ); |
589 | + continue; |
590 | + } |
591 | + |
592 | + store::Item_t inc_si; |
593 | + GENV_STORE.getStructuralInformation( inc_si, inc.getp() ); |
594 | + bool excluded = false; |
595 | + FOR_EACH( vector<store::Item_t>, exc, state->excludes_ ) { |
596 | + if ( inc_si->equals( *exc ) || (*exc)->isInSubtreeOf( inc_si ) ) { |
597 | + excluded = true; |
598 | + break; |
599 | + } |
600 | + } |
601 | + if ( excluded ) |
602 | + continue; |
603 | + |
604 | + bool add_sentinel = false; |
605 | + switch ( inc->getNodeKind() ) { |
606 | + case store::StoreConsts::elementNode: |
607 | + ++state->t_state_.para; |
608 | + if ( find_lang_attribute( *inc, &lang ) ) { |
609 | + state->langs_.push( lang ); |
610 | + tokenizer = get_tokenizer( lang, &state->t_state_, loc ); |
611 | + state->tokenizers_.push( tokenizer.release() ); |
612 | + add_sentinel = true; |
613 | + } |
614 | + // no break; |
615 | + case store::StoreConsts::documentNode: { |
616 | + list<store::Item_t>::iterator pos = state->includes_.begin(); |
617 | + store::Iterator_t i = inc->getChildren(); |
618 | + i->open(); |
619 | + for ( store::Item_t child; i->next( child ); ) { |
620 | + switch ( child->getNodeKind() ) { |
621 | + case store::StoreConsts::attributeNode: |
622 | + case store::StoreConsts::commentNode: |
623 | + case store::StoreConsts::piNode: |
624 | + continue; // never include these implicitly |
625 | + default: |
626 | + pos = state->includes_.insert( pos, child ); |
627 | + ++pos; |
628 | + } |
629 | + } |
630 | + i->close(); |
631 | + if ( add_sentinel ) // sentinel |
632 | + state->includes_.insert( pos, store::Item_t() ); |
633 | + continue; |
634 | + } |
635 | + |
636 | + case store::StoreConsts::attributeNode: |
637 | + case store::StoreConsts::commentNode: |
638 | + case store::StoreConsts::piNode: |
639 | + // tokenize these because they were included explicitly |
640 | + case store::StoreConsts::textNode: { |
641 | + zstring const s( inc->getStringValue() ); |
642 | + Item const temp( inc.getp() ); |
643 | + state->tokenizers_.top()->tokenize_string( |
644 | + s.data(), s.size(), state->langs_.top(), false, state->callback_, |
645 | + &temp |
646 | + ); |
647 | + break; |
648 | + } |
649 | + |
650 | + default: |
651 | + break; |
652 | + } // switch |
653 | + continue; |
654 | + } // if ( state->tokens_.empty() ) |
655 | + |
656 | + make_token_element( |
657 | + state->tokens_.front(), state->token_qnames_, result |
658 | + ); |
659 | + state->tokens_.pop_front(); |
660 | + STACK_PUSH( true, state ); |
661 | + } // while |
662 | + |
663 | + STACK_END( state ); |
664 | +} |
665 | + |
666 | +void TokenizeNodesIterator::resetImpl( PlanState &plan_state ) const { |
667 | + NaryBaseIterator<TokenizeNodesIterator,TokenizeNodesIteratorState>:: |
668 | + resetImpl( plan_state ); |
669 | + TokenizeNodesIteratorState *const state = |
670 | + StateTraitsImpl<TokenizeNodesIteratorState>::getState( |
671 | + plan_state, this->theStateOffset |
672 | + ); |
673 | + state->doc_tokens_->reset(); |
674 | } |
675 | |
676 | /////////////////////////////////////////////////////////////////////////////// |
677 | @@ -689,7 +810,6 @@ |
678 | Tokenizer::ptr tokenizer; |
679 | store::Item_t type_name; |
680 | Tokenizer::Properties props; |
681 | - TokenizerProvider const *tokenizer_provider; |
682 | zstring value_string; |
683 | |
684 | PlanIteratorState *state; |
685 | @@ -704,15 +824,7 @@ |
686 | lang = get_lang_from( sctx ); |
687 | } |
688 | |
689 | - tokenizer_provider = GENV_STORE.getTokenizerProvider(); |
690 | - ZORBA_ASSERT( tokenizer_provider ); |
691 | - if ( !tokenizer_provider->getTokenizer( lang, &t_state, &tokenizer ) ) |
692 | - throw XQUERY_EXCEPTION( |
693 | - err::FTST0009 /* lang not supported */, |
694 | - ERROR_PARAMS( |
695 | - iso639_1::string_of[ lang ], ZED( FTST0009_BadTokenizerLang ) |
696 | - ) |
697 | - ); |
698 | + tokenizer = get_tokenizer( lang, &t_state, loc ); |
699 | tokenizer->properties( &props ); |
700 | |
701 | GENV_ITEMFACTORY->createQName( |
702 | @@ -840,19 +952,8 @@ |
703 | } |
704 | |
705 | { // local scope |
706 | - TokenizerProvider const *const tokenizer_provider = |
707 | - GENV_STORE.getTokenizerProvider(); |
708 | - ZORBA_ASSERT( tokenizer_provider ); |
709 | Tokenizer::State t_state; |
710 | - Tokenizer::ptr tokenizer; |
711 | - if ( !tokenizer_provider->getTokenizer( lang, &t_state, &tokenizer ) ) |
712 | - throw XQUERY_EXCEPTION( |
713 | - err::FTST0009 /* lang not supported */, |
714 | - ERROR_PARAMS( |
715 | - iso639_1::string_of[ lang ], ZED( FTST0009_BadTokenizerLang ) |
716 | - ) |
717 | - ); |
718 | - |
719 | + Tokenizer::ptr const tokenizer( get_tokenizer( lang, &t_state, loc ) ); |
720 | TokenizeStringIteratorCallback callback; |
721 | tokenizer->tokenize_string( |
722 | value_string.data(), value_string.size(), lang, false, callback |
723 | |
724 | === added file 'src/runtime/full_text/ft_module_util.cpp' |
725 | --- src/runtime/full_text/ft_module_util.cpp 1970-01-01 00:00:00 +0000 |
726 | +++ src/runtime/full_text/ft_module_util.cpp 2012-06-29 16:57:20 +0000 |
727 | @@ -0,0 +1,57 @@ |
728 | +/* |
729 | + * Copyright 2006-2008 The FLWOR Foundation. |
730 | + * |
731 | + * Licensed under the Apache License, Version 2.0 (the "License"); |
732 | + * you may not use this file except in compliance with the License. |
733 | + * You may obtain a copy of the License at |
734 | + * |
735 | + * http://www.apache.org/licenses/LICENSE-2.0 |
736 | + * |
737 | + * Unless required by applicable law or agreed to in writing, software |
738 | + * distributed under the License is distributed on an "AS IS" BASIS, |
739 | + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
740 | + * See the License for the specific language governing permissions and |
741 | + * limitations under the License. |
742 | + */ |
743 | + |
744 | +#include "api/unmarshaller.h" |
745 | +#include "context/static_context.h" |
746 | +#include "store/api/item_factory.h" |
747 | +#include "system/globalenv.h" |
748 | + |
749 | +#include "ft_module_util.h" |
750 | + |
751 | +using namespace std; |
752 | +using namespace zorba::locale; |
753 | + |
754 | +namespace zorba { |
755 | + |
756 | +/////////////////////////////////////////////////////////////////////////////// |
757 | + |
758 | +void TokenizeNodesCallback::token( char const *utf8_s, size_type utf8_len, |
759 | + iso639_1::type lang, size_type token_no, |
760 | + size_type sent_no, size_type para_no, |
761 | + Item const *api_item ) { |
762 | + store::Item const *const item = Unmarshaller::getInternalItem( *api_item ); |
763 | + tokens_->push_back( |
764 | + FTToken( utf8_s, utf8_len, token_no, sent_no, para_no, item ) |
765 | + ); |
766 | +} |
767 | + |
768 | +/////////////////////////////////////////////////////////////////////////////// |
769 | + |
770 | +TokenQNames::TokenQNames() { |
771 | + GENV_ITEMFACTORY->createQName( |
772 | + token, static_context::ZORBA_FULL_TEXT_FN_NS, "", "token" |
773 | + ); |
774 | + GENV_ITEMFACTORY->createQName( lang, "", "", "lang" ); |
775 | + GENV_ITEMFACTORY->createQName( paragraph, "", "", "paragraph" ); |
776 | + GENV_ITEMFACTORY->createQName( sentence, "", "", "sentence" ); |
777 | + GENV_ITEMFACTORY->createQName( value, "", "", "value" ); |
778 | + GENV_ITEMFACTORY->createQName( node_ref, "", "", "node-ref" ); |
779 | +} |
780 | + |
781 | +/////////////////////////////////////////////////////////////////////////////// |
782 | + |
783 | +} // namespace zorba |
784 | +/* vim:set et sw=2 ts=2: */ |
785 | |
786 | === added file 'src/runtime/full_text/ft_module_util.h' |
787 | --- src/runtime/full_text/ft_module_util.h 1970-01-01 00:00:00 +0000 |
788 | +++ src/runtime/full_text/ft_module_util.h 2012-06-29 16:57:20 +0000 |
789 | @@ -0,0 +1,80 @@ |
790 | +/* |
791 | + * Copyright 2006-2008 The FLWOR Foundation. |
792 | + * |
793 | + * Licensed under the Apache License, Version 2.0 (the "License"); |
794 | + * you may not use this file except in compliance with the License. |
795 | + * You may obtain a copy of the License at |
796 | + * |
797 | + * http://www.apache.org/licenses/LICENSE-2.0 |
798 | + * |
799 | + * Unless required by applicable law or agreed to in writing, software |
800 | + * distributed under the License is distributed on an "AS IS" BASIS, |
801 | + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
802 | + * See the License for the specific language governing permissions and |
803 | + * limitations under the License. |
804 | + */ |
805 | + |
806 | +#ifndef ZORBA_FT_MODULE_UTIL_H |
807 | +#define ZORBA_FT_MODULE_UTIL_H |
808 | + |
809 | +// |
810 | +// The reason this header (and related .cpp) are necessary (instead of just |
811 | +// putting this code into ft_module.h/.cpp directly) is because this header |
812 | +// needs to be #include'd into the .cpp generated from the ft_module.xml file. |
813 | +// |
814 | + |
815 | +#include <zorba/tokenizer.h> |
816 | + |
817 | +#include <deque> |
818 | + |
819 | +#include "store/api/item.h" |
820 | +#include "util/cxx_util.h" |
821 | +#include "zorbatypes/ft_token.h" |
822 | + |
823 | +#include "ft_module_util.h" |
824 | + |
825 | +namespace zorba { |
826 | + |
827 | +/////////////////////////////////////////////////////////////////////////////// |
828 | + |
829 | +/** |
830 | + * A %TokenizeNodesCallback is-a Tokenizer::Callback that's used exclusively by |
831 | + * the TokenizeNodesIterator that implements the ft:tokenize-nodes() full-text |
832 | + * module function. |
833 | + */ |
834 | +class TokenizeNodesCallback : public Tokenizer::Callback { |
835 | +public: |
836 | + TokenizeNodesCallback() : tokens_( nullptr ) { } |
837 | + TokenizeNodesCallback( std::deque<FTToken> &tokens ) : tokens_( &tokens ) { } |
838 | + |
839 | + void set_tokens( std::deque<FTToken> &tokens ) { |
840 | + tokens_ = &tokens; |
841 | + } |
842 | + |
843 | + // inherited |
844 | + void token( char const *utf8_s, size_type utf8_len, |
845 | + locale::iso639_1::type lang, size_type token_no, |
846 | + size_type sent_no, size_type para_no, Item const *item = 0 ); |
847 | + |
848 | +private: |
849 | + std::deque<FTToken> *tokens_; |
850 | +}; |
851 | + |
852 | +/////////////////////////////////////////////////////////////////////////////// |
853 | + |
854 | +struct TokenQNames { |
855 | + store::Item_t token; |
856 | + store::Item_t lang; |
857 | + store::Item_t paragraph; |
858 | + store::Item_t sentence; |
859 | + store::Item_t value; |
860 | + store::Item_t node_ref; |
861 | + |
862 | + TokenQNames(); |
863 | +}; |
864 | + |
865 | +/////////////////////////////////////////////////////////////////////////////// |
866 | + |
867 | +} // namespace zorba |
868 | +#endif /* ZORBA_FT_MODULE_UTIL_H */ |
869 | +/* vim:set et sw=2 ts=2: */ |
870 | |
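Review note: `TokenizeNodesCallback` stores a pointer to a caller-owned `std::deque<FTToken>` and appends one entry per `token()` call. The following is a minimal, self-contained sketch of that pattern. The `Token` struct, the `Callback` base class, and the toy `tokenize_words()` function are hypothetical simplifications for illustration; they do not reproduce the Zorba `Tokenizer::Callback` interface, which also carries a language and an item.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical, simplified token record (stand-in for FTToken).
struct Token {
  std::string value;
  std::size_t token_no;
  std::size_t sent_no;
  std::size_t para_no;
};

// Hypothetical, simplified callback interface.
class Callback {
public:
  virtual ~Callback() {}
  virtual void token(const char* utf8_s, std::size_t utf8_len,
                     std::size_t token_no, std::size_t sent_no,
                     std::size_t para_no) = 0;
};

// Collects tokens into a deque owned by the caller.
class CollectingCallback : public Callback {
public:
  CollectingCallback() : tokens_(nullptr) {}
  explicit CollectingCallback(std::deque<Token>& tokens) : tokens_(&tokens) {}

  void set_tokens(std::deque<Token>& tokens) { tokens_ = &tokens; }

  void token(const char* utf8_s, std::size_t utf8_len, std::size_t token_no,
             std::size_t sent_no, std::size_t para_no) override {
    assert(tokens_ != nullptr);  // set_tokens() must be called first
    tokens_->push_back(
        Token{std::string(utf8_s, utf8_len), token_no, sent_no, para_no});
  }

private:
  std::deque<Token>* tokens_;
};

// Toy tokenizer: splits on single spaces and reports each word.
void tokenize_words(const std::string& text, Callback& cb) {
  std::size_t token_no = 0;
  std::size_t i = 0;
  while (i < text.size()) {
    while (i < text.size() && text[i] == ' ') ++i;  // skip separators
    const std::size_t start = i;
    while (i < text.size() && text[i] != ' ') ++i;  // scan one word
    if (i > start)
      cb.token(text.data() + start, i - start, token_no++, 0, 0);
  }
}
```

One design point worth noting: a default-constructed callback holds a null pointer, so `set_tokens()` has to be called before the first `token()` call. The sketch asserts on this; the version in the diff does not check.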
871 | === modified file 'src/runtime/full_text/ft_util.cpp' |
872 | --- src/runtime/full_text/ft_util.cpp 2012-04-27 17:07:47 +0000 |
873 | +++ src/runtime/full_text/ft_util.cpp 2012-06-29 16:57:20 +0000 |
874 | @@ -19,14 +19,38 @@ |
875 | #include <stdexcept> |
876 | |
877 | #include "diagnostics/xquery_diagnostics.h" |
878 | +#include "zorbamisc/ns_consts.h" |
879 | #include "zorbatypes/numconversions.h" |
880 | +#include "zorbautils/locale.h" |
881 | |
882 | #include "ft_util.h" |
883 | |
884 | +using namespace zorba::locale; |
885 | + |
886 | namespace zorba { |
887 | |
888 | /////////////////////////////////////////////////////////////////////////////// |
889 | |
890 | +bool find_lang_attribute( store::Item const &item, iso639_1::type *lang ) { |
891 | + bool found_lang = false; |
892 | + if ( item.getNodeKind() == store::StoreConsts::elementNode ) { |
893 | + store::Iterator_t i( item.getAttributes() ); |
894 | + i->open(); |
895 | + for ( store::Item_t attr; i->next( attr ); ) { |
896 | + store::Item const *const qname = attr->getNodeName(); |
897 | + if ( qname && |
898 | + qname->getLocalName() == "lang" && |
899 | + qname->getNamespace() == XML_NS ) { |
900 | + *lang = locale::find_lang( attr->getStringValue().c_str() ); |
901 | + found_lang = true; |
902 | + break; |
903 | + } |
904 | + } |
905 | + i->close(); |
906 | + } |
907 | + return found_lang; |
908 | +} |
909 | + |
910 | ft_int to_ft_int( xs_integer const &i ) { |
911 | try { |
912 | return to_xs_unsignedInt( i ); |
913 | |
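Review note: `find_lang_attribute()` scans an element's attributes for one whose local name is `lang` and whose namespace is the XML namespace, writes the language through the out-parameter, and returns whether it was found. A stand-alone sketch of the same logic follows. The `Attr` struct and `find_lang_attr()` are hypothetical stand-ins for the store API, and the language is kept as a plain string rather than an `iso639_1::type`.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an attribute node; not a Zorba store type.
struct Attr {
  std::string ns;     // namespace URI of the attribute name
  std::string local;  // local part of the attribute name
  std::string value;  // attribute string value
};

// The namespace URI bound to the reserved "xml" prefix.
static const char XML_NS_URI[] = "http://www.w3.org/XML/1998/namespace";

// Writes the xml:lang value through *lang and returns true only if such an
// attribute exists; *lang is left unchanged otherwise.
bool find_lang_attr(const std::vector<Attr>& attrs, std::string* lang) {
  assert(lang != nullptr);
  for (const Attr& a : attrs) {
    if (a.local == "lang" && a.ns == XML_NS_URI) {
      *lang = a.value;
      return true;
    }
  }
  return false;
}
```

Checking the namespace as well as the local name matters: an unqualified `lang` attribute is not `xml:lang` and is ignored by both the sketch and the function in the diff.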
914 | === modified file 'src/runtime/full_text/ft_util.h' |
915 | --- src/runtime/full_text/ft_util.h 2012-06-28 04:14:03 +0000 |
916 | +++ src/runtime/full_text/ft_util.h 2012-06-29 16:57:20 +0000 |
917 | @@ -17,11 +17,13 @@ |
918 | #ifndef ZORBA_FULL_TEXT_UTIL_H |
919 | #define ZORBA_FULL_TEXT_UTIL_H |
920 | |
921 | +#include <zorba/item.h> |
922 | #include <zorba/locale.h> |
923 | |
924 | #include "compiler/expression/ftnode.h" |
925 | +#include "store/api/item.h" |
926 | +#include "util/cxx_util.h" |
927 | #include "zorbatypes/schema_types.h" |
928 | -#include "util/cxx_util.h" |
929 | |
930 | #include "ft_match.h" |
931 | |
932 | @@ -44,6 +46,18 @@ |
933 | ////////// Functions ////////////////////////////////////////////////////////// |
934 | |
935 | /** |
936 | + * Finds the <code>xml:lang</code> attribute, if any, of the XML element |
937 | + * specified by \a item and obtains its value. |
938 | + * |
939 | + * @param item The item for an XML element to check. |
940 | + * @param lang A pointer to receive the found language. |
941 | + * @return Returns \c true only if an <code>xml:lang</code> attribute was |
942 | + * found. |
943 | + */ |
944 | +bool find_lang_attribute( store::Item const &item, |
945 | + locale::iso639_1::type *lang ); |
946 | + |
947 | +/** |
948 | * Gets the language from the given ftmatch_options, if any. |
949 | * |
950 | * @param options The ftmatch_options to get the language from. This may be \c |
951 | @@ -98,6 +112,8 @@ |
952 | */ |
953 | ft_int to_ft_int( xs_integer const &i ); |
954 | |
955 | +/////////////////////////////////////////////////////////////////////////////// |
956 | + |
957 | } // namespace zorba |
958 | #endif /* ZORBA_FULL_TEXT_UTIL_H */ |
959 | /* vim:set et sw=2 ts=2: */ |
960 | |
961 | === added file 'src/runtime/full_text/pregenerated/ft_module.cpp' |
962 | --- src/runtime/full_text/pregenerated/ft_module.cpp 1970-01-01 00:00:00 +0000 |
963 | +++ src/runtime/full_text/pregenerated/ft_module.cpp 2012-06-29 16:57:20 +0000 |
964 | @@ -0,0 +1,506 @@ |
965 | +/* |
966 | + * Copyright 2006-2008 The FLWOR Foundation. |
967 | + * |
968 | + * Licensed under the Apache License, Version 2.0 (the "License"); |
969 | + * you may not use this file except in compliance with the License. |
970 | + * You may obtain a copy of the License at |
971 | + * |
972 | + * http://www.apache.org/licenses/LICENSE-2.0 |
973 | + * |
974 | + * Unless required by applicable law or agreed to in writing, software |
975 | + * distributed under the License is distributed on an "AS IS" BASIS, |
976 | + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
977 | + * See the License for the specific language governing permissions and |
978 | + * limitations under the License. |
979 | + */ |
980 | + |
981 | +// ****************************************** |
982 | +// * * |
983 | +// * THIS IS A GENERATED FILE. DO NOT EDIT! * |
984 | +// * SEE .xml FILE WITH SAME NAME * |
985 | +// * * |
986 | +// ****************************************** |
987 | + |
988 | +#include "stdafx.h" |
989 | +#include "zorbatypes/rchandle.h" |
990 | +#include "zorbatypes/zstring.h" |
991 | +#include "runtime/visitors/planiter_visitor.h" |
992 | +#include "runtime/full_text/ft_module.h" |
993 | +#include "system/globalenv.h" |
994 | + |
995 | + |
996 | +#include "store/api/iterator.h" |
997 | + |
998 | +namespace zorba { |
999 | + |
1000 | +#ifndef ZORBA_NO_FULL_TEXT |
1001 | +// <CurrentCompareOptionsIterator> |
1002 | +SERIALIZABLE_CLASS_VERSIONS(CurrentCompareOptionsIterator) |
1003 | + |
1004 | +void CurrentCompareOptionsIterator::serialize(::zorba::serialization::Archiver& ar) |
1005 | +{ |
1006 | + serialize_baseclass(ar, |
1007 | + (NaryBaseIterator<CurrentCompareOptionsIterator, PlanIteratorState>*)this); |
1008 | +} |
1009 | + |
1010 | + |
1011 | +void CurrentCompareOptionsIterator::accept(PlanIterVisitor& v) const |
1012 | +{ |
1013 | + v.beginVisit(*this); |
1014 | + |
1015 | + std::vector<PlanIter_t>::const_iterator lIter = theChildren.begin(); |
1016 | + std::vector<PlanIter_t>::const_iterator lEnd = theChildren.end(); |
1017 | + for ( ; lIter != lEnd; ++lIter ){ |
1018 | + (*lIter)->accept(v); |
1019 | + } |
1020 | + |
1021 | + v.endVisit(*this); |
1022 | +} |
1023 | + |
1024 | +CurrentCompareOptionsIterator::~CurrentCompareOptionsIterator() {} |
1025 | + |
1026 | +// </CurrentCompareOptionsIterator> |
1027 | + |
1028 | +#endif |
1029 | +#ifndef ZORBA_NO_FULL_TEXT |
1030 | +// <CurrentLangIterator> |
1031 | +SERIALIZABLE_CLASS_VERSIONS(CurrentLangIterator) |
1032 | + |
1033 | +void CurrentLangIterator::serialize(::zorba::serialization::Archiver& ar) |
1034 | +{ |
1035 | + serialize_baseclass(ar, |
1036 | + (NaryBaseIterator<CurrentLangIterator, PlanIteratorState>*)this); |
1037 | +} |
1038 | + |
1039 | + |
1040 | +void CurrentLangIterator::accept(PlanIterVisitor& v) const |
1041 | +{ |
1042 | + v.beginVisit(*this); |
1043 | + |
1044 | + std::vector<PlanIter_t>::const_iterator lIter = theChildren.begin(); |
1045 | + std::vector<PlanIter_t>::const_iterator lEnd = theChildren.end(); |
1046 | + for ( ; lIter != lEnd; ++lIter ){ |
1047 | + (*lIter)->accept(v); |
1048 | + } |
1049 | + |
1050 | + v.endVisit(*this); |
1051 | +} |
1052 | + |
1053 | +CurrentLangIterator::~CurrentLangIterator() {} |
1054 | + |
1055 | +// </CurrentLangIterator> |
1056 | + |
1057 | +#endif |
1058 | +#ifndef ZORBA_NO_FULL_TEXT |
1059 | +// <HostLangIterator> |
1060 | +SERIALIZABLE_CLASS_VERSIONS(HostLangIterator) |
1061 | + |
1062 | +void HostLangIterator::serialize(::zorba::serialization::Archiver& ar) |
1063 | +{ |
1064 | + serialize_baseclass(ar, |
1065 | + (NaryBaseIterator<HostLangIterator, PlanIteratorState>*)this); |
1066 | +} |
1067 | + |
1068 | + |
1069 | +void HostLangIterator::accept(PlanIterVisitor& v) const |
1070 | +{ |
1071 | + v.beginVisit(*this); |
1072 | + |
1073 | + std::vector<PlanIter_t>::const_iterator lIter = theChildren.begin(); |
1074 | + std::vector<PlanIter_t>::const_iterator lEnd = theChildren.end(); |
1075 | + for ( ; lIter != lEnd; ++lIter ){ |
1076 | + (*lIter)->accept(v); |
1077 | + } |
1078 | + |
1079 | + v.endVisit(*this); |
1080 | +} |
1081 | + |
1082 | +HostLangIterator::~HostLangIterator() {} |
1083 | + |
1084 | +// </HostLangIterator> |
1085 | + |
1086 | +#endif |
1087 | +#ifndef ZORBA_NO_FULL_TEXT |
1088 | +// <IsStemLangSupportedIterator> |
1089 | +SERIALIZABLE_CLASS_VERSIONS(IsStemLangSupportedIterator) |
1090 | + |
1091 | +void IsStemLangSupportedIterator::serialize(::zorba::serialization::Archiver& ar) |
1092 | +{ |
1093 | + serialize_baseclass(ar, |
1094 | + (NaryBaseIterator<IsStemLangSupportedIterator, PlanIteratorState>*)this); |
1095 | +} |
1096 | + |
1097 | + |
1098 | +void IsStemLangSupportedIterator::accept(PlanIterVisitor& v) const |
1099 | +{ |
1100 | + v.beginVisit(*this); |
1101 | + |
1102 | + std::vector<PlanIter_t>::const_iterator lIter = theChildren.begin(); |
1103 | + std::vector<PlanIter_t>::const_iterator lEnd = theChildren.end(); |
1104 | + for ( ; lIter != lEnd; ++lIter ){ |
1105 | + (*lIter)->accept(v); |
1106 | + } |
1107 | + |
1108 | + v.endVisit(*this); |
1109 | +} |
1110 | + |
1111 | +IsStemLangSupportedIterator::~IsStemLangSupportedIterator() {} |
1112 | + |
1113 | +// </IsStemLangSupportedIterator> |
1114 | + |
1115 | +#endif |
1116 | +#ifndef ZORBA_NO_FULL_TEXT |
1117 | +// <IsStopWordIterator> |
1118 | +SERIALIZABLE_CLASS_VERSIONS(IsStopWordIterator) |
1119 | + |
1120 | +void IsStopWordIterator::serialize(::zorba::serialization::Archiver& ar) |
1121 | +{ |
1122 | + serialize_baseclass(ar, |
1123 | + (NaryBaseIterator<IsStopWordIterator, PlanIteratorState>*)this); |
1124 | +} |
1125 | + |
1126 | + |
1127 | +void IsStopWordIterator::accept(PlanIterVisitor& v) const |
1128 | +{ |
1129 | + v.beginVisit(*this); |
1130 | + |
1131 | + std::vector<PlanIter_t>::const_iterator lIter = theChildren.begin(); |
1132 | + std::vector<PlanIter_t>::const_iterator lEnd = theChildren.end(); |
1133 | + for ( ; lIter != lEnd; ++lIter ){ |
1134 | + (*lIter)->accept(v); |
1135 | + } |
1136 | + |
1137 | + v.endVisit(*this); |
1138 | +} |
1139 | + |
1140 | +IsStopWordIterator::~IsStopWordIterator() {} |
1141 | + |
1142 | +// </IsStopWordIterator> |
1143 | + |
1144 | +#endif |
1145 | +#ifndef ZORBA_NO_FULL_TEXT |
1146 | +// <IsStopWordLangSupportedIterator> |
1147 | +SERIALIZABLE_CLASS_VERSIONS(IsStopWordLangSupportedIterator) |
1148 | + |
1149 | +void IsStopWordLangSupportedIterator::serialize(::zorba::serialization::Archiver& ar) |
1150 | +{ |
1151 | + serialize_baseclass(ar, |
1152 | + (NaryBaseIterator<IsStopWordLangSupportedIterator, PlanIteratorState>*)this); |
1153 | +} |
1154 | + |
1155 | + |
1156 | +void IsStopWordLangSupportedIterator::accept(PlanIterVisitor& v) const |
1157 | +{ |
1158 | + v.beginVisit(*this); |
1159 | + |
1160 | + std::vector<PlanIter_t>::const_iterator lIter = theChildren.begin(); |
1161 | + std::vector<PlanIter_t>::const_iterator lEnd = theChildren.end(); |
1162 | + for ( ; lIter != lEnd; ++lIter ){ |
1163 | + (*lIter)->accept(v); |
1164 | + } |
1165 | + |
1166 | + v.endVisit(*this); |
1167 | +} |
1168 | + |
1169 | +IsStopWordLangSupportedIterator::~IsStopWordLangSupportedIterator() {} |
1170 | + |
1171 | +// </IsStopWordLangSupportedIterator> |
1172 | + |
1173 | +#endif |
1174 | +#ifndef ZORBA_NO_FULL_TEXT |
1175 | +// <IsThesaurusLangSupportedIterator> |
1176 | +SERIALIZABLE_CLASS_VERSIONS(IsThesaurusLangSupportedIterator) |
1177 | + |
1178 | +void IsThesaurusLangSupportedIterator::serialize(::zorba::serialization::Archiver& ar) |
1179 | +{ |
1180 | + serialize_baseclass(ar, |
1181 | + (NaryBaseIterator<IsThesaurusLangSupportedIterator, PlanIteratorState>*)this); |
1182 | +} |
1183 | + |
1184 | + |
1185 | +void IsThesaurusLangSupportedIterator::accept(PlanIterVisitor& v) const |
1186 | +{ |
1187 | + v.beginVisit(*this); |
1188 | + |
1189 | + std::vector<PlanIter_t>::const_iterator lIter = theChildren.begin(); |
1190 | + std::vector<PlanIter_t>::const_iterator lEnd = theChildren.end(); |
1191 | + for ( ; lIter != lEnd; ++lIter ){ |
1192 | + (*lIter)->accept(v); |
1193 | + } |
1194 | + |
1195 | + v.endVisit(*this); |
1196 | +} |
1197 | + |
1198 | +IsThesaurusLangSupportedIterator::~IsThesaurusLangSupportedIterator() {} |
1199 | + |
1200 | +// </IsThesaurusLangSupportedIterator> |
1201 | + |
1202 | +#endif |
1203 | +#ifndef ZORBA_NO_FULL_TEXT |
1204 | +// <IsTokenizerLangSupportedIterator> |
1205 | +SERIALIZABLE_CLASS_VERSIONS(IsTokenizerLangSupportedIterator) |
1206 | + |
1207 | +void IsTokenizerLangSupportedIterator::serialize(::zorba::serialization::Archiver& ar) |
1208 | +{ |
1209 | + serialize_baseclass(ar, |
1210 | + (NaryBaseIterator<IsTokenizerLangSupportedIterator, PlanIteratorState>*)this); |
1211 | +} |
1212 | + |
1213 | + |
1214 | +void IsTokenizerLangSupportedIterator::accept(PlanIterVisitor& v) const |
1215 | +{ |
1216 | + v.beginVisit(*this); |
1217 | + |
1218 | + std::vector<PlanIter_t>::const_iterator lIter = theChildren.begin(); |
1219 | + std::vector<PlanIter_t>::const_iterator lEnd = theChildren.end(); |
1220 | + for ( ; lIter != lEnd; ++lIter ){ |
1221 | + (*lIter)->accept(v); |
1222 | + } |
1223 | + |
1224 | + v.endVisit(*this); |
1225 | +} |
1226 | + |
1227 | +IsTokenizerLangSupportedIterator::~IsTokenizerLangSupportedIterator() {} |
1228 | + |
1229 | +// </IsTokenizerLangSupportedIterator> |
1230 | + |
1231 | +#endif |
1232 | +#ifndef ZORBA_NO_FULL_TEXT |
1233 | +// <StemIterator> |
1234 | +SERIALIZABLE_CLASS_VERSIONS(StemIterator) |
1235 | + |
1236 | +void StemIterator::serialize(::zorba::serialization::Archiver& ar) |
1237 | +{ |
1238 | + serialize_baseclass(ar, |
1239 | + (NaryBaseIterator<StemIterator, PlanIteratorState>*)this); |
1240 | +} |
1241 | + |
1242 | + |
1243 | +void StemIterator::accept(PlanIterVisitor& v) const |
1244 | +{ |
1245 | + v.beginVisit(*this); |
1246 | + |
1247 | + std::vector<PlanIter_t>::const_iterator lIter = theChildren.begin(); |
1248 | + std::vector<PlanIter_t>::const_iterator lEnd = theChildren.end(); |
1249 | + for ( ; lIter != lEnd; ++lIter ){ |
1250 | + (*lIter)->accept(v); |
1251 | + } |
1252 | + |
1253 | + v.endVisit(*this); |
1254 | +} |
1255 | + |
1256 | +StemIterator::~StemIterator() {} |
1257 | + |
1258 | +// </StemIterator> |
1259 | + |
1260 | +#endif |
1261 | +#ifndef ZORBA_NO_FULL_TEXT |
1262 | +// <StripDiacriticsIterator> |
1263 | +SERIALIZABLE_CLASS_VERSIONS(StripDiacriticsIterator) |
1264 | + |
1265 | +void StripDiacriticsIterator::serialize(::zorba::serialization::Archiver& ar) |
1266 | +{ |
1267 | + serialize_baseclass(ar, |
1268 | + (NaryBaseIterator<StripDiacriticsIterator, PlanIteratorState>*)this); |
1269 | +} |
1270 | + |
1271 | + |
1272 | +void StripDiacriticsIterator::accept(PlanIterVisitor& v) const |
1273 | +{ |
1274 | + v.beginVisit(*this); |
1275 | + |
1276 | + std::vector<PlanIter_t>::const_iterator lIter = theChildren.begin(); |
1277 | + std::vector<PlanIter_t>::const_iterator lEnd = theChildren.end(); |
1278 | + for ( ; lIter != lEnd; ++lIter ){ |
1279 | + (*lIter)->accept(v); |
1280 | + } |
1281 | + |
1282 | + v.endVisit(*this); |
1283 | +} |
1284 | + |
1285 | +StripDiacriticsIterator::~StripDiacriticsIterator() {} |
1286 | + |
1287 | +// </StripDiacriticsIterator> |
1288 | + |
1289 | +#endif |
1290 | +#ifndef ZORBA_NO_FULL_TEXT |
1291 | +// <ThesaurusLookupIterator> |
1292 | +SERIALIZABLE_CLASS_VERSIONS(ThesaurusLookupIterator) |
1293 | + |
1294 | +void ThesaurusLookupIterator::serialize(::zorba::serialization::Archiver& ar) |
1295 | +{ |
1296 | + serialize_baseclass(ar, |
1297 | + (NaryBaseIterator<ThesaurusLookupIterator, ThesaurusLookupIteratorState>*)this); |
1298 | +} |
1299 | + |
1300 | + |
1301 | +void ThesaurusLookupIterator::accept(PlanIterVisitor& v) const |
1302 | +{ |
1303 | + v.beginVisit(*this); |
1304 | + |
1305 | + std::vector<PlanIter_t>::const_iterator lIter = theChildren.begin(); |
1306 | + std::vector<PlanIter_t>::const_iterator lEnd = theChildren.end(); |
1307 | + for ( ; lIter != lEnd; ++lIter ){ |
1308 | + (*lIter)->accept(v); |
1309 | + } |
1310 | + |
1311 | + v.endVisit(*this); |
1312 | +} |
1313 | + |
1314 | +ThesaurusLookupIterator::~ThesaurusLookupIterator() {} |
1315 | + |
1316 | +ThesaurusLookupIteratorState::ThesaurusLookupIteratorState() {} |
1317 | + |
1318 | +ThesaurusLookupIteratorState::~ThesaurusLookupIteratorState() {} |
1319 | + |
1320 | + |
1321 | +void ThesaurusLookupIteratorState::reset(PlanState& planState) { |
1322 | + PlanIteratorState::reset(planState); |
1323 | +} |
1324 | +// </ThesaurusLookupIterator> |
1325 | + |
1326 | +#endif |
1327 | +#ifndef ZORBA_NO_FULL_TEXT |
1328 | +// <TokenizeNodeIterator> |
1329 | +SERIALIZABLE_CLASS_VERSIONS(TokenizeNodeIterator) |
1330 | + |
1331 | +void TokenizeNodeIterator::serialize(::zorba::serialization::Archiver& ar) |
1332 | +{ |
1333 | + serialize_baseclass(ar, |
1334 | + (NaryBaseIterator<TokenizeNodeIterator, TokenizeNodeIteratorState>*)this); |
1335 | +} |
1336 | + |
1337 | + |
1338 | +void TokenizeNodeIterator::accept(PlanIterVisitor& v) const |
1339 | +{ |
1340 | + v.beginVisit(*this); |
1341 | + |
1342 | + std::vector<PlanIter_t>::const_iterator lIter = theChildren.begin(); |
1343 | + std::vector<PlanIter_t>::const_iterator lEnd = theChildren.end(); |
1344 | + for ( ; lIter != lEnd; ++lIter ){ |
1345 | + (*lIter)->accept(v); |
1346 | + } |
1347 | + |
1348 | + v.endVisit(*this); |
1349 | +} |
1350 | + |
1351 | +TokenizeNodeIterator::~TokenizeNodeIterator() {} |
1352 | + |
1353 | +TokenizeNodeIteratorState::TokenizeNodeIteratorState() {} |
1354 | + |
1355 | +TokenizeNodeIteratorState::~TokenizeNodeIteratorState() {} |
1356 | + |
1357 | + |
1358 | +void TokenizeNodeIteratorState::reset(PlanState& planState) { |
1359 | + PlanIteratorState::reset(planState); |
1360 | +} |
1361 | +// </TokenizeNodeIterator> |
1362 | + |
1363 | +#endif |
1364 | +#ifndef ZORBA_NO_FULL_TEXT |
1365 | +// <TokenizeNodesIterator> |
1366 | +SERIALIZABLE_CLASS_VERSIONS(TokenizeNodesIterator) |
1367 | + |
1368 | +void TokenizeNodesIterator::serialize(::zorba::serialization::Archiver& ar) |
1369 | +{ |
1370 | + serialize_baseclass(ar, |
1371 | + (NaryBaseIterator<TokenizeNodesIterator, TokenizeNodesIteratorState>*)this); |
1372 | +} |
1373 | + |
1374 | + |
1375 | +void TokenizeNodesIterator::accept(PlanIterVisitor& v) const |
1376 | +{ |
1377 | + v.beginVisit(*this); |
1378 | + |
1379 | + std::vector<PlanIter_t>::const_iterator lIter = theChildren.begin(); |
1380 | + std::vector<PlanIter_t>::const_iterator lEnd = theChildren.end(); |
1381 | + for ( ; lIter != lEnd; ++lIter ){ |
1382 | + (*lIter)->accept(v); |
1383 | + } |
1384 | + |
1385 | + v.endVisit(*this); |
1386 | +} |
1387 | + |
1388 | +TokenizeNodesIterator::~TokenizeNodesIterator() {} |
1389 | + |
1390 | +TokenizeNodesIteratorState::TokenizeNodesIteratorState() {} |
1391 | + |
1392 | +TokenizeNodesIteratorState::~TokenizeNodesIteratorState() {} |
1393 | + |
1394 | + |
1395 | +void TokenizeNodesIteratorState::reset(PlanState& planState) { |
1396 | + PlanIteratorState::reset(planState); |
1397 | +} |
1398 | +// </TokenizeNodesIterator> |
1399 | + |
1400 | +#endif |
1401 | +#ifndef ZORBA_NO_FULL_TEXT |
1402 | +// <TokenizerPropertiesIterator> |
1403 | +SERIALIZABLE_CLASS_VERSIONS(TokenizerPropertiesIterator) |
1404 | + |
1405 | +void TokenizerPropertiesIterator::serialize(::zorba::serialization::Archiver& ar) |
1406 | +{ |
1407 | + serialize_baseclass(ar, |
1408 | + (NaryBaseIterator<TokenizerPropertiesIterator, PlanIteratorState>*)this); |
1409 | +} |
1410 | + |
1411 | + |
1412 | +void TokenizerPropertiesIterator::accept(PlanIterVisitor& v) const |
1413 | +{ |
1414 | + v.beginVisit(*this); |
1415 | + |
1416 | + std::vector<PlanIter_t>::const_iterator lIter = theChildren.begin(); |
1417 | + std::vector<PlanIter_t>::const_iterator lEnd = theChildren.end(); |
1418 | + for ( ; lIter != lEnd; ++lIter ){ |
1419 | + (*lIter)->accept(v); |
1420 | + } |
1421 | + |
1422 | + v.endVisit(*this); |
1423 | +} |
1424 | + |
1425 | +TokenizerPropertiesIterator::~TokenizerPropertiesIterator() {} |
1426 | + |
1427 | +// </TokenizerPropertiesIterator> |
1428 | + |
1429 | +#endif |
1430 | +#ifndef ZORBA_NO_FULL_TEXT |
1431 | +// <TokenizeStringIterator> |
1432 | +SERIALIZABLE_CLASS_VERSIONS(TokenizeStringIterator) |
1433 | + |
1434 | +void TokenizeStringIterator::serialize(::zorba::serialization::Archiver& ar) |
1435 | +{ |
1436 | + serialize_baseclass(ar, |
1437 | + (NaryBaseIterator<TokenizeStringIterator, TokenizeStringIteratorState>*)this); |
1438 | +} |
1439 | + |
1440 | + |
1441 | +void TokenizeStringIterator::accept(PlanIterVisitor& v) const |
1442 | +{ |
1443 | + v.beginVisit(*this); |
1444 | + |
1445 | + std::vector<PlanIter_t>::const_iterator lIter = theChildren.begin(); |
1446 | + std::vector<PlanIter_t>::const_iterator lEnd = theChildren.end(); |
1447 | + for ( ; lIter != lEnd; ++lIter ){ |
1448 | + (*lIter)->accept(v); |
1449 | + } |
1450 | + |
1451 | + v.endVisit(*this); |
1452 | +} |
1453 | + |
1454 | +TokenizeStringIterator::~TokenizeStringIterator() {} |
1455 | + |
1456 | +TokenizeStringIteratorState::TokenizeStringIteratorState() {} |
1457 | + |
1458 | +TokenizeStringIteratorState::~TokenizeStringIteratorState() {} |
1459 | + |
1460 | + |
1461 | +void TokenizeStringIteratorState::reset(PlanState& planState) { |
1462 | + PlanIteratorState::reset(planState); |
1463 | +} |
1464 | +// </TokenizeStringIterator> |
1465 | + |
1466 | +#endif |
1467 | + |
1468 | +} |
1469 | + |
1470 | + |
1471 | |
1472 | === removed file 'src/runtime/full_text/pregenerated/ft_module.cpp' |
1473 | --- src/runtime/full_text/pregenerated/ft_module.cpp 2012-05-22 19:09:20 +0000 |
1474 | +++ src/runtime/full_text/pregenerated/ft_module.cpp 1970-01-01 00:00:00 +0000 |
1475 | @@ -1,463 +0,0 @@ |
1476 | -/* |
1477 | - * Copyright 2006-2008 The FLWOR Foundation. |
1478 | - * |
1479 | - * Licensed under the Apache License, Version 2.0 (the "License"); |
1480 | - * you may not use this file except in compliance with the License. |
1481 | - * You may obtain a copy of the License at |
1482 | - * |
1483 | - * http://www.apache.org/licenses/LICENSE-2.0 |
1484 | - * |
1485 | - * Unless required by applicable law or agreed to in writing, software |
1486 | - * distributed under the License is distributed on an "AS IS" BASIS, |
1487 | - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
1488 | - * See the License for the specific language governing permissions and |
1489 | - * limitations under the License. |
1490 | - */ |
1491 | - |
1492 | -// ****************************************** |
1493 | -// * * |
1494 | -// * THIS IS A GENERATED FILE. DO NOT EDIT! * |
1495 | -// * SEE .xml FILE WITH SAME NAME * |
1496 | -// * * |
1497 | -// ****************************************** |
1498 | - |
1499 | -#include "stdafx.h" |
1500 | -#include "zorbatypes/rchandle.h" |
1501 | -#include "zorbatypes/zstring.h" |
1502 | -#include "runtime/visitors/planiter_visitor.h" |
1503 | -#include "runtime/full_text/ft_module.h" |
1504 | -#include "system/globalenv.h" |
1505 | - |
1506 | - |
1507 | -#include "store/api/iterator.h" |
1508 | - |
1509 | -namespace zorba { |
1510 | - |
1511 | -#ifndef ZORBA_NO_FULL_TEXT |
1512 | -// <CurrentCompareOptionsIterator> |
1513 | -SERIALIZABLE_CLASS_VERSIONS(CurrentCompareOptionsIterator) |
1514 | - |
1515 | -void CurrentCompareOptionsIterator::serialize(::zorba::serialization::Archiver& ar) |
1516 | -{ |
1517 | - serialize_baseclass(ar, |
1518 | - (NaryBaseIterator<CurrentCompareOptionsIterator, PlanIteratorState>*)this); |
1519 | -} |
1520 | - |
1521 | - |
1522 | -void CurrentCompareOptionsIterator::accept(PlanIterVisitor& v) const |
1523 | -{ |
1524 | - v.beginVisit(*this); |
1525 | - |
1526 | - std::vector<PlanIter_t>::const_iterator lIter = theChildren.begin(); |
1527 | - std::vector<PlanIter_t>::const_iterator lEnd = theChildren.end(); |
1528 | - for ( ; lIter != lEnd; ++lIter ){ |
1529 | - (*lIter)->accept(v); |
1530 | - } |
1531 | - |
1532 | - v.endVisit(*this); |
1533 | -} |
1534 | - |
1535 | -CurrentCompareOptionsIterator::~CurrentCompareOptionsIterator() {} |
1536 | - |
1537 | -// </CurrentCompareOptionsIterator> |
1538 | - |
1539 | -#endif |
1540 | -#ifndef ZORBA_NO_FULL_TEXT |
1541 | -// <CurrentLangIterator> |
1542 | -SERIALIZABLE_CLASS_VERSIONS(CurrentLangIterator) |
1543 | - |
1544 | -void CurrentLangIterator::serialize(::zorba::serialization::Archiver& ar) |
1545 | -{ |
1546 | - serialize_baseclass(ar, |
1547 | - (NaryBaseIterator<CurrentLangIterator, PlanIteratorState>*)this); |
1548 | -} |
1549 | - |
1550 | - |
1551 | -void CurrentLangIterator::accept(PlanIterVisitor& v) const |
1552 | -{ |
1553 | - v.beginVisit(*this); |
1554 | - |
1555 | - std::vector<PlanIter_t>::const_iterator lIter = theChildren.begin(); |
1556 | - std::vector<PlanIter_t>::const_iterator lEnd = theChildren.end(); |
1557 | - for ( ; lIter != lEnd; ++lIter ){ |
1558 | - (*lIter)->accept(v); |
1559 | - } |
1560 | - |
1561 | - v.endVisit(*this); |
1562 | -} |
1563 | - |
1564 | -CurrentLangIterator::~CurrentLangIterator() {} |
1565 | - |
1566 | -// </CurrentLangIterator> |
1567 | - |
1568 | -#endif |
1569 | -#ifndef ZORBA_NO_FULL_TEXT |
1570 | -// <HostLangIterator> |
1571 | -SERIALIZABLE_CLASS_VERSIONS(HostLangIterator) |
1572 | - |
1573 | -void HostLangIterator::serialize(::zorba::serialization::Archiver& ar) |
1574 | -{ |
1575 | - serialize_baseclass(ar, |
1576 | - (NaryBaseIterator<HostLangIterator, PlanIteratorState>*)this); |
1577 | -} |
1578 | - |
1579 | - |
1580 | -void HostLangIterator::accept(PlanIterVisitor& v) const |
1581 | -{ |
1582 | - v.beginVisit(*this); |
1583 | - |
1584 | - std::vector<PlanIter_t>::const_iterator lIter = theChildren.begin(); |
1585 | - std::vector<PlanIter_t>::const_iterator lEnd = theChildren.end(); |
1586 | - for ( ; lIter != lEnd; ++lIter ){ |
1587 | - (*lIter)->accept(v); |
1588 | - } |
1589 | - |
1590 | - v.endVisit(*this); |
1591 | -} |
1592 | - |
1593 | -HostLangIterator::~HostLangIterator() {} |
1594 | - |
1595 | -// </HostLangIterator> |
1596 | - |
1597 | -#endif |
1598 | -#ifndef ZORBA_NO_FULL_TEXT |
1599 | -// <IsStemLangSupportedIterator> |
1600 | -SERIALIZABLE_CLASS_VERSIONS(IsStemLangSupportedIterator) |
1601 | - |
1602 | -void IsStemLangSupportedIterator::serialize(::zorba::serialization::Archiver& ar) |
1603 | -{ |
1604 | - serialize_baseclass(ar, |
1605 | - (NaryBaseIterator<IsStemLangSupportedIterator, PlanIteratorState>*)this); |
1606 | -} |
1607 | - |
1608 | - |
1609 | -void IsStemLangSupportedIterator::accept(PlanIterVisitor& v) const |
1610 | -{ |
1611 | - v.beginVisit(*this); |
1612 | - |
1613 | - std::vector<PlanIter_t>::const_iterator lIter = theChildren.begin(); |
1614 | - std::vector<PlanIter_t>::const_iterator lEnd = theChildren.end(); |
1615 | - for ( ; lIter != lEnd; ++lIter ){ |
1616 | - (*lIter)->accept(v); |
1617 | - } |
1618 | - |
1619 | - v.endVisit(*this); |
1620 | -} |
1621 | - |
1622 | -IsStemLangSupportedIterator::~IsStemLangSupportedIterator() {} |
1623 | - |
1624 | -// </IsStemLangSupportedIterator> |
1625 | - |
1626 | -#endif |
1627 | -#ifndef ZORBA_NO_FULL_TEXT |
1628 | -// <IsStopWordIterator> |
1629 | -SERIALIZABLE_CLASS_VERSIONS(IsStopWordIterator) |
1630 | - |
1631 | -void IsStopWordIterator::serialize(::zorba::serialization::Archiver& ar) |
1632 | -{ |
1633 | - serialize_baseclass(ar, |
1634 | - (NaryBaseIterator<IsStopWordIterator, PlanIteratorState>*)this); |
1635 | -} |
1636 | - |
1637 | - |
1638 | -void IsStopWordIterator::accept(PlanIterVisitor& v) const |
1639 | -{ |
1640 | - v.beginVisit(*this); |
1641 | - |
1642 | - std::vector<PlanIter_t>::const_iterator lIter = theChildren.begin(); |
1643 | - std::vector<PlanIter_t>::const_iterator lEnd = theChildren.end(); |
1644 | - for ( ; lIter != lEnd; ++lIter ){ |
1645 | - (*lIter)->accept(v); |
1646 | - } |
1647 | - |
1648 | - v.endVisit(*this); |
1649 | -} |
1650 | - |
1651 | -IsStopWordIterator::~IsStopWordIterator() {} |
1652 | - |
1653 | -// </IsStopWordIterator> |
1654 | - |
1655 | -#endif |
1656 | -#ifndef ZORBA_NO_FULL_TEXT |
1657 | -// <IsStopWordLangSupportedIterator> |
1658 | -SERIALIZABLE_CLASS_VERSIONS(IsStopWordLangSupportedIterator) |
1659 | - |
1660 | -void IsStopWordLangSupportedIterator::serialize(::zorba::serialization::Archiver& ar) |
1661 | -{ |
1662 | - serialize_baseclass(ar, |
1663 | - (NaryBaseIterator<IsStopWordLangSupportedIterator, PlanIteratorState>*)this); |
1664 | -} |
1665 | - |
1666 | - |
1667 | -void IsStopWordLangSupportedIterator::accept(PlanIterVisitor& v) const |
1668 | -{ |
1669 | - v.beginVisit(*this); |
1670 | - |
1671 | - std::vector<PlanIter_t>::const_iterator lIter = theChildren.begin(); |
1672 | - std::vector<PlanIter_t>::const_iterator lEnd = theChildren.end(); |
1673 | - for ( ; lIter != lEnd; ++lIter ){ |
1674 | - (*lIter)->accept(v); |
1675 | - } |
1676 | - |
1677 | - v.endVisit(*this); |
1678 | -} |
1679 | - |
1680 | -IsStopWordLangSupportedIterator::~IsStopWordLangSupportedIterator() {} |
1681 | - |
1682 | -// </IsStopWordLangSupportedIterator> |
1683 | - |
1684 | -#endif |
1685 | -#ifndef ZORBA_NO_FULL_TEXT |
1686 | -// <IsThesaurusLangSupportedIterator> |
1687 | -SERIALIZABLE_CLASS_VERSIONS(IsThesaurusLangSupportedIterator) |
1688 | - |
1689 | -void IsThesaurusLangSupportedIterator::serialize(::zorba::serialization::Archiver& ar) |
1690 | -{ |
1691 | - serialize_baseclass(ar, |
1692 | - (NaryBaseIterator<IsThesaurusLangSupportedIterator, PlanIteratorState>*)this); |
1693 | -} |
1694 | - |
1695 | - |
1696 | -void IsThesaurusLangSupportedIterator::accept(PlanIterVisitor& v) const |
1697 | -{ |
1698 | - v.beginVisit(*this); |
1699 | - |
1700 | - std::vector<PlanIter_t>::const_iterator lIter = theChildren.begin(); |
1701 | - std::vector<PlanIter_t>::const_iterator lEnd = theChildren.end(); |
1702 | - for ( ; lIter != lEnd; ++lIter ){ |
1703 | - (*lIter)->accept(v); |
1704 | - } |
1705 | - |
1706 | - v.endVisit(*this); |
1707 | -} |
1708 | - |
1709 | -IsThesaurusLangSupportedIterator::~IsThesaurusLangSupportedIterator() {} |
1710 | - |
1711 | -// </IsThesaurusLangSupportedIterator> |
1712 | - |
1713 | -#endif |
1714 | -#ifndef ZORBA_NO_FULL_TEXT |
1715 | -// <IsTokenizerLangSupportedIterator> |
1716 | -SERIALIZABLE_CLASS_VERSIONS(IsTokenizerLangSupportedIterator) |
1717 | - |
1718 | -void IsTokenizerLangSupportedIterator::serialize(::zorba::serialization::Archiver& ar) |
1719 | -{ |
1720 | - serialize_baseclass(ar, |
1721 | - (NaryBaseIterator<IsTokenizerLangSupportedIterator, PlanIteratorState>*)this); |
1722 | -} |
1723 | - |
1724 | - |
1725 | -void IsTokenizerLangSupportedIterator::accept(PlanIterVisitor& v) const |
1726 | -{ |
1727 | - v.beginVisit(*this); |
1728 | - |
1729 | - std::vector<PlanIter_t>::const_iterator lIter = theChildren.begin(); |
1730 | - std::vector<PlanIter_t>::const_iterator lEnd = theChildren.end(); |
1731 | - for ( ; lIter != lEnd; ++lIter ){ |
1732 | - (*lIter)->accept(v); |
1733 | - } |
1734 | - |
1735 | - v.endVisit(*this); |
1736 | -} |
1737 | - |
1738 | -IsTokenizerLangSupportedIterator::~IsTokenizerLangSupportedIterator() {} |
1739 | - |
1740 | -// </IsTokenizerLangSupportedIterator> |
1741 | - |
1742 | -#endif |
1743 | -#ifndef ZORBA_NO_FULL_TEXT |
1744 | -// <StemIterator> |
1745 | -SERIALIZABLE_CLASS_VERSIONS(StemIterator) |
1746 | - |
1747 | -void StemIterator::serialize(::zorba::serialization::Archiver& ar) |
1748 | -{ |
1749 | - serialize_baseclass(ar, |
1750 | - (NaryBaseIterator<StemIterator, PlanIteratorState>*)this); |
1751 | -} |
1752 | - |
1753 | - |
1754 | -void StemIterator::accept(PlanIterVisitor& v) const |
1755 | -{ |
1756 | - v.beginVisit(*this); |
1757 | - |
1758 | - std::vector<PlanIter_t>::const_iterator lIter = theChildren.begin(); |
1759 | - std::vector<PlanIter_t>::const_iterator lEnd = theChildren.end(); |
1760 | - for ( ; lIter != lEnd; ++lIter ){ |
1761 | - (*lIter)->accept(v); |
1762 | - } |
1763 | - |
1764 | - v.endVisit(*this); |
1765 | -} |
1766 | - |
1767 | -StemIterator::~StemIterator() {} |
1768 | - |
1769 | -// </StemIterator> |
1770 | - |
1771 | -#endif |
1772 | -#ifndef ZORBA_NO_FULL_TEXT |
1773 | -// <StripDiacriticsIterator> |
1774 | -SERIALIZABLE_CLASS_VERSIONS(StripDiacriticsIterator) |
1775 | - |
1776 | -void StripDiacriticsIterator::serialize(::zorba::serialization::Archiver& ar) |
1777 | -{ |
1778 | - serialize_baseclass(ar, |
1779 | - (NaryBaseIterator<StripDiacriticsIterator, PlanIteratorState>*)this); |
1780 | -} |
1781 | - |
1782 | - |
1783 | -void StripDiacriticsIterator::accept(PlanIterVisitor& v) const |
1784 | -{ |
1785 | - v.beginVisit(*this); |
1786 | - |
1787 | - std::vector<PlanIter_t>::const_iterator lIter = theChildren.begin(); |
1788 | - std::vector<PlanIter_t>::const_iterator lEnd = theChildren.end(); |
1789 | - for ( ; lIter != lEnd; ++lIter ){ |
1790 | - (*lIter)->accept(v); |
1791 | - } |
1792 | - |
1793 | - v.endVisit(*this); |
1794 | -} |
1795 | - |
1796 | -StripDiacriticsIterator::~StripDiacriticsIterator() {} |
1797 | - |
1798 | -// </StripDiacriticsIterator> |
1799 | - |
1800 | -#endif |
1801 | -#ifndef ZORBA_NO_FULL_TEXT |
1802 | -// <ThesaurusLookupIterator> |
1803 | -SERIALIZABLE_CLASS_VERSIONS(ThesaurusLookupIterator) |
1804 | - |
1805 | -void ThesaurusLookupIterator::serialize(::zorba::serialization::Archiver& ar) |
1806 | -{ |
1807 | - serialize_baseclass(ar, |
1808 | - (NaryBaseIterator<ThesaurusLookupIterator, ThesaurusLookupIteratorState>*)this); |
1809 | -} |
1810 | - |
1811 | - |
1812 | -void ThesaurusLookupIterator::accept(PlanIterVisitor& v) const |
1813 | -{ |
1814 | - v.beginVisit(*this); |
1815 | - |
1816 | - std::vector<PlanIter_t>::const_iterator lIter = theChildren.begin(); |
1817 | - std::vector<PlanIter_t>::const_iterator lEnd = theChildren.end(); |
1818 | - for ( ; lIter != lEnd; ++lIter ){ |
1819 | - (*lIter)->accept(v); |
1820 | - } |
1821 | - |
1822 | - v.endVisit(*this); |
1823 | -} |
1824 | - |
1825 | -ThesaurusLookupIterator::~ThesaurusLookupIterator() {} |
1826 | - |
1827 | -ThesaurusLookupIteratorState::ThesaurusLookupIteratorState() {} |
1828 | - |
1829 | -ThesaurusLookupIteratorState::~ThesaurusLookupIteratorState() {} |
1830 | - |
1831 | - |
1832 | -void ThesaurusLookupIteratorState::reset(PlanState& planState) { |
1833 | - PlanIteratorState::reset(planState); |
1834 | -} |
1835 | -// </ThesaurusLookupIterator> |
1836 | - |
1837 | -#endif |
1838 | -#ifndef ZORBA_NO_FULL_TEXT |
1839 | -// <TokenizeNodeIterator> |
1840 | -SERIALIZABLE_CLASS_VERSIONS(TokenizeNodeIterator) |
1841 | - |
1842 | - |
1843 | -void TokenizeNodeIterator::accept(PlanIterVisitor& v) const |
1844 | -{ |
1845 | - v.beginVisit(*this); |
1846 | - |
1847 | - std::vector<PlanIter_t>::const_iterator lIter = theChildren.begin(); |
1848 | - std::vector<PlanIter_t>::const_iterator lEnd = theChildren.end(); |
1849 | - for ( ; lIter != lEnd; ++lIter ){ |
1850 | - (*lIter)->accept(v); |
1851 | - } |
1852 | - |
1853 | - v.endVisit(*this); |
1854 | -} |
1855 | - |
1856 | -TokenizeNodeIterator::~TokenizeNodeIterator() {} |
1857 | - |
1858 | -TokenizeNodeIteratorState::TokenizeNodeIteratorState() {} |
1859 | - |
1860 | -TokenizeNodeIteratorState::~TokenizeNodeIteratorState() {} |
1861 | - |
1862 | - |
1863 | -void TokenizeNodeIteratorState::reset(PlanState& planState) { |
1864 | - PlanIteratorState::reset(planState); |
1865 | -} |
1866 | -// </TokenizeNodeIterator> |
1867 | - |
1868 | -#endif |
1869 | -#ifndef ZORBA_NO_FULL_TEXT |
1870 | -// <TokenizerPropertiesIterator> |
1871 | -SERIALIZABLE_CLASS_VERSIONS(TokenizerPropertiesIterator) |
1872 | - |
1873 | -void TokenizerPropertiesIterator::serialize(::zorba::serialization::Archiver& ar) |
1874 | -{ |
1875 | - serialize_baseclass(ar, |
1876 | - (NaryBaseIterator<TokenizerPropertiesIterator, PlanIteratorState>*)this); |
1877 | -} |
1878 | - |
1879 | - |
1880 | -void TokenizerPropertiesIterator::accept(PlanIterVisitor& v) const |
1881 | -{ |
1882 | - v.beginVisit(*this); |
1883 | - |
1884 | - std::vector<PlanIter_t>::const_iterator lIter = theChildren.begin(); |
1885 | - std::vector<PlanIter_t>::const_iterator lEnd = theChildren.end(); |
1886 | - for ( ; lIter != lEnd; ++lIter ){ |
1887 | - (*lIter)->accept(v); |
1888 | - } |
1889 | - |
1890 | - v.endVisit(*this); |
1891 | -} |
1892 | - |
1893 | -TokenizerPropertiesIterator::~TokenizerPropertiesIterator() {} |
1894 | - |
1895 | -// </TokenizerPropertiesIterator> |
1896 | - |
1897 | -#endif |
1898 | -#ifndef ZORBA_NO_FULL_TEXT |
1899 | -// <TokenizeStringIterator> |
1900 | -SERIALIZABLE_CLASS_VERSIONS(TokenizeStringIterator) |
1901 | - |
1902 | -void TokenizeStringIterator::serialize(::zorba::serialization::Archiver& ar) |
1903 | -{ |
1904 | - serialize_baseclass(ar, |
1905 | - (NaryBaseIterator<TokenizeStringIterator, TokenizeStringIteratorState>*)this); |
1906 | -} |
1907 | - |
1908 | - |
1909 | -void TokenizeStringIterator::accept(PlanIterVisitor& v) const |
1910 | -{ |
1911 | - v.beginVisit(*this); |
1912 | - |
1913 | - std::vector<PlanIter_t>::const_iterator lIter = theChildren.begin(); |
1914 | - std::vector<PlanIter_t>::const_iterator lEnd = theChildren.end(); |
1915 | - for ( ; lIter != lEnd; ++lIter ){ |
1916 | - (*lIter)->accept(v); |
1917 | - } |
1918 | - |
1919 | - v.endVisit(*this); |
1920 | -} |
1921 | - |
1922 | -TokenizeStringIterator::~TokenizeStringIterator() {} |
1923 | - |
1924 | -TokenizeStringIteratorState::TokenizeStringIteratorState() {} |
1925 | - |
1926 | -TokenizeStringIteratorState::~TokenizeStringIteratorState() {} |
1927 | - |
1928 | - |
1929 | -void TokenizeStringIteratorState::reset(PlanState& planState) { |
1930 | - PlanIteratorState::reset(planState); |
1931 | -} |
1932 | -// </TokenizeStringIterator> |
1933 | - |
1934 | -#endif |
1935 | - |
1936 | -} |
1937 | - |
1938 | - |
1939 | |
1940 | === modified file 'src/runtime/full_text/pregenerated/ft_module.h' |
1941 | --- src/runtime/full_text/pregenerated/ft_module.h 2012-06-28 04:14:03 +0000 |
1942 | +++ src/runtime/full_text/pregenerated/ft_module.h 2012-06-29 16:57:20 +0000 |
1943 | @@ -29,6 +29,11 @@ |
1944 | |
1945 | |
1946 | #include "runtime/base/narybase.h" |
1947 | +#include <deque> |
1948 | +#include <list> |
1949 | +#include <stack> |
1950 | +#include <vector> |
1951 | +#include "runtime/full_text/ft_module_util.h" |
1952 | #include "runtime/full_text/ft_token_seq_iterator.h" |
1953 | #include "runtime/full_text/thesaurus.h" |
1954 | |
1955 | @@ -416,6 +421,7 @@ |
1956 | public: |
1957 | store::Item_t doc_item_; // |
1958 | FTTokenIterator_t doc_tokens_; // |
1959 | + TokenQNames token_qnames_; // |
1960 | |
1961 | TokenizeNodeIteratorState(); |
1962 | |
1963 | @@ -426,13 +432,6 @@ |
1964 | |
1965 | class TokenizeNodeIterator : public NaryBaseIterator<TokenizeNodeIterator, TokenizeNodeIteratorState> |
1966 | { |
1967 | -protected: |
1968 | - store::Item_t token_qname_; // |
1969 | - store::Item_t lang_qname_; // |
1970 | - store::Item_t para_qname_; // |
1971 | - store::Item_t sent_qname_; // |
1972 | - store::Item_t value_qname_; // |
1973 | - store::Item_t ref_qname_; // |
1974 | public: |
1975 | SERIALIZABLE_CLASS(TokenizeNodeIterator); |
1976 | |
1977 | @@ -445,12 +444,67 @@ |
1978 | static_context* sctx, |
1979 | const QueryLoc& loc, |
1980 | std::vector<PlanIter_t>& children) |
1981 | - ; |
1982 | + : |
1983 | + NaryBaseIterator<TokenizeNodeIterator, TokenizeNodeIteratorState>(sctx, loc, children) |
1984 | + {} |
1985 | |
1986 | virtual ~TokenizeNodeIterator(); |
1987 | |
1988 | -public: |
1989 | - void initMembers(); |
1990 | + void accept(PlanIterVisitor& v) const; |
1991 | + |
1992 | + bool nextImpl(store::Item_t& result, PlanState& aPlanState) const; |
1993 | + |
1994 | + void resetImpl(PlanState&) const; |
1995 | +}; |
1996 | + |
1997 | +#endif |
1998 | + |
1999 | +#ifndef ZORBA_NO_FULL_TEXT |
2000 | +/** |
2001 | + * |
2002 | + * Author: |
2003 | + */ |
2004 | +class TokenizeNodesIteratorState : public PlanIteratorState |
2005 | +{ |
2006 | +public: |
2007 | + store::Item_t doc_item_; // |
2008 | + FTTokenIterator_t doc_tokens_; // |
2009 | + TokenQNames token_qnames_; // |
2010 | + std::list<store::Item_t> includes_; // |
2011 | + std::vector<store::Item_t> excludes_; // |
2012 | + std::stack<Tokenizer*> tokenizers_; // |
2013 | + std::stack<locale::iso639_1::type> langs_; // |
2014 | + TokenizeNodesCallback callback_; // |
2015 | + Tokenizer::State t_state_; // |
2016 | + std::deque<FTToken> tokens_; // |
2017 | + |
2018 | + TokenizeNodesIteratorState(); |
2019 | + |
2020 | + ~TokenizeNodesIteratorState(); |
2021 | + |
2022 | + void reset(PlanState&); |
2023 | +}; |
2024 | + |
2025 | +class TokenizeNodesIterator : public NaryBaseIterator<TokenizeNodesIterator, TokenizeNodesIteratorState> |
2026 | +{ |
2027 | +public: |
2028 | + SERIALIZABLE_CLASS(TokenizeNodesIterator); |
2029 | + |
2030 | + SERIALIZABLE_CLASS_CONSTRUCTOR2T(TokenizeNodesIterator, |
2031 | + NaryBaseIterator<TokenizeNodesIterator, TokenizeNodesIteratorState>); |
2032 | + |
2033 | + void serialize( ::zorba::serialization::Archiver& ar); |
2034 | + |
2035 | + TokenizeNodesIterator( |
2036 | + static_context* sctx, |
2037 | + const QueryLoc& loc, |
2038 | + std::vector<PlanIter_t>& children) |
2039 | + : |
2040 | + NaryBaseIterator<TokenizeNodesIterator, TokenizeNodesIteratorState>(sctx, loc, children) |
2041 | + {} |
2042 | + |
2043 | + virtual ~TokenizeNodesIterator(); |
2044 | + |
2045 | void accept(PlanIterVisitor& v) const; |
2046 | |
2047 | bool nextImpl(store::Item_t& result, PlanState& aPlanState) const; |
2048 | |
2049 | === modified file 'src/runtime/full_text/tokenizer.cpp' |
2050 | --- src/runtime/full_text/tokenizer.cpp 2012-06-28 04:14:03 +0000 |
2051 | +++ src/runtime/full_text/tokenizer.cpp 2012-06-29 16:57:20 +0000 |
2052 | @@ -21,12 +21,15 @@ |
2053 | #include <zorba/tokenizer.h> |
2054 | #include <zorba/zorba_string.h> |
2055 | |
2056 | +#include "api/unmarshaller.h" |
2057 | #include "diagnostics/assert.h" |
2058 | #include "store/api/store.h" |
2059 | #include "system/globalenv.h" |
2060 | #include "zorbamisc/ns_consts.h" |
2061 | #include "zorbautils/locale.h" |
2062 | |
2063 | +#include "ft_util.h" |
2064 | + |
2065 | using namespace zorba::locale; |
2066 | |
2067 | namespace zorba { |
2068 | @@ -38,22 +41,9 @@ |
2069 | } |
2070 | |
2071 | bool Tokenizer::find_lang_attribute( Item const &item, iso639_1::type *lang ) { |
2072 | - bool found_lang = false; |
2073 | - if ( item.getNodeKind() == store::StoreConsts::elementNode ) { |
2074 | - Iterator_t i( item.getAttributes() ); |
2075 | - i->open(); |
2076 | - for ( Item attr; i->next( attr ); ) { |
2077 | - Item qname; |
2078 | - if ( attr.getNodeName( qname ) && |
2079 | - qname.getLocalName() == "lang" && qname.getNamespace() == XML_NS ) { |
2080 | - *lang = locale::find_lang( attr.getStringValue().c_str() ); |
2081 | - found_lang = true; |
2082 | - break; |
2083 | - } |
2084 | - } |
2085 | - i->close(); |
2086 | - } |
2087 | - return found_lang; |
2088 | + return zorba::find_lang_attribute( |
2089 | + *Unmarshaller::getInternalItem( item ), lang |
2090 | + ); |
2091 | } |
2092 | |
2093 | void Tokenizer::item( Item const &item, bool entering ) { |
2094 | |
2095 | === modified file 'src/runtime/json/jsonml_array.cpp' |
2096 | --- src/runtime/json/jsonml_array.cpp 2012-06-28 04:14:03 +0000 |
2097 | +++ src/runtime/json/jsonml_array.cpp 2012-06-29 16:57:20 +0000 |
2098 | @@ -30,6 +30,7 @@ |
2099 | #include "util/omanip.h" |
2100 | #include "util/oseparator.h" |
2101 | #include "util/stl_util.h" |
2102 | +#include "util/xml_util.h" |
2103 | |
2104 | #include "jsonml_array.h" |
2105 | |
2106 | @@ -39,20 +40,12 @@ |
2107 | |
2108 | /////////////////////////////////////////////////////////////////////////////// |
2109 | |
2110 | -static void split_name( zstring const &name, zstring *prefix, zstring *local ) { |
2111 | - zstring::size_type const colon = name.find( ':' ); |
2112 | - if ( colon != zstring::npos ) { |
2113 | - *prefix = name.substr( 0, colon ); |
2114 | - *local = name.substr( colon + 1 ); |
2115 | - if ( prefix->empty() || local->empty() ) |
2116 | - throw XQUERY_EXCEPTION( |
2117 | - zerr::ZJPE0008_ILLEGAL_QNAME, |
2118 | - ERROR_PARAMS( name ) |
2119 | - ); |
2120 | - } else { |
2121 | - prefix->clear(); |
2122 | - *local = name; |
2123 | - } |
2124 | +inline void split_name( zstring const &name, zstring *prefix, zstring *local ) { |
2125 | + if ( !xml::split_name( name, prefix, local ) ) |
2126 | + throw XQUERY_EXCEPTION( |
2127 | + zerr::ZJPE0008_ILLEGAL_QNAME, |
2128 | + ERROR_PARAMS( name ) |
2129 | + ); |
2130 | } |
2131 | |
2132 | namespace expect { |
2133 | |
2134 | === modified file 'src/runtime/pregenerated/iterator_enum.h' |
2135 | --- src/runtime/pregenerated/iterator_enum.h 2012-06-28 21:54:08 +0000 |
2136 | +++ src/runtime/pregenerated/iterator_enum.h 2012-06-29 16:57:20 +0000 |
2137 | @@ -114,6 +114,7 @@ |
2138 | TYPE_StripDiacriticsIterator, |
2139 | TYPE_ThesaurusLookupIterator, |
2140 | TYPE_TokenizeNodeIterator, |
2141 | + TYPE_TokenizeNodesIterator, |
2142 | TYPE_TokenizerPropertiesIterator, |
2143 | TYPE_TokenizeStringIterator, |
2144 | TYPE_FunctionNameIterator, |
2145 | |
2146 | === modified file 'src/runtime/spec/full_text/ft_module.xml' |
2147 | --- src/runtime/spec/full_text/ft_module.xml 2012-06-28 04:14:03 +0000 |
2148 | +++ src/runtime/spec/full_text/ft_module.xml 2012-06-29 16:57:20 +0000 |
2149 | @@ -6,6 +6,12 @@ |
2150 | xsi:schemaLocation="http://www.zorba-xquery.com ../runtime.xsd"> |
2151 | |
2152 | <zorba:header> |
2153 | + <zorba:include form="Angle-bracket">deque</zorba:include> |
2154 | + <zorba:include form="Angle-bracket">list</zorba:include> |
2155 | + <zorba:include form="Angle-bracket">stack</zorba:include> |
2156 | + <zorba:include form="Angle-bracket">vector</zorba:include> |
2157 | + <zorba:include form="Angle-bracket">zorba/locale.h</zorba:include> |
2158 | + <zorba:include form="Quoted">runtime/full_text/ft_module_util.h</zorba:include> |
2159 | <zorba:include form="Quoted">runtime/full_text/ft_token_seq_iterator.h</zorba:include> |
2160 | <zorba:include form="Quoted">runtime/full_text/thesaurus.h</zorba:include> |
2161 | </zorba:header> |
2162 | @@ -14,6 +20,8 @@ |
2163 | <zorba:include form="Quoted">store/api/iterator.h</zorba:include> |
2164 | </zorba:source> |
2165 | |
2166 | +<!--========================================================================--> |
2167 | + |
2168 | <zorba:iterator name="CurrentCompareOptionsIterator" |
2169 | preprocessorGuard="#ifndef ZORBA_NO_FULL_TEXT"> |
2170 | </zorba:iterator> |
2171 | @@ -27,6 +35,8 @@ |
2172 | </zorba:function> |
2173 | </zorba:iterator> |
2174 | |
2175 | +<!--========================================================================--> |
2176 | + |
2177 | <zorba:iterator name="HostLangIterator" |
2178 | preprocessorGuard="#ifndef ZORBA_NO_FULL_TEXT"> |
2179 | <zorba:function> |
2180 | @@ -36,6 +46,8 @@ |
2181 | </zorba:function> |
2182 | </zorba:iterator> |
2183 | |
2184 | +<!--========================================================================--> |
2185 | + |
2186 | <zorba:iterator name="IsStemLangSupportedIterator" |
2187 | preprocessorGuard="#ifndef ZORBA_NO_FULL_TEXT"> |
2188 | <zorba:function> |
2189 | @@ -46,6 +58,8 @@ |
2190 | </zorba:function> |
2191 | </zorba:iterator> |
2192 | |
2193 | +<!--========================================================================--> |
2194 | + |
2195 | <zorba:iterator name="IsStopWordIterator" |
2196 | preprocessorGuard="#ifndef ZORBA_NO_FULL_TEXT"> |
2197 | <zorba:function> |
2198 | @@ -61,6 +75,8 @@ |
2199 | </zorba:function> |
2200 | </zorba:iterator> |
2201 | |
2202 | +<!--========================================================================--> |
2203 | + |
2204 | <zorba:iterator name="IsStopWordLangSupportedIterator" |
2205 | preprocessorGuard="#ifndef ZORBA_NO_FULL_TEXT"> |
2206 | <zorba:function> |
2207 | @@ -71,6 +87,8 @@ |
2208 | </zorba:function> |
2209 | </zorba:iterator> |
2210 | |
2211 | +<!--========================================================================--> |
2212 | + |
2213 | <zorba:iterator name="IsThesaurusLangSupportedIterator" |
2214 | preprocessorGuard="#ifndef ZORBA_NO_FULL_TEXT"> |
2215 | <zorba:function> |
2216 | @@ -86,6 +104,8 @@ |
2217 | </zorba:function> |
2218 | </zorba:iterator> |
2219 | |
2220 | +<!--========================================================================--> |
2221 | + |
2222 | <zorba:iterator name="IsTokenizerLangSupportedIterator" |
2223 | preprocessorGuard="#ifndef ZORBA_NO_FULL_TEXT"> |
2224 | <zorba:function> |
2225 | @@ -96,6 +116,8 @@ |
2226 | </zorba:function> |
2227 | </zorba:iterator> |
2228 | |
2229 | +<!--========================================================================--> |
2230 | + |
2231 | <zorba:iterator name="StemIterator" |
2232 | preprocessorGuard="#ifndef ZORBA_NO_FULL_TEXT"> |
2233 | <zorba:function> |
2234 | @@ -111,6 +133,8 @@ |
2235 | </zorba:function> |
2236 | </zorba:iterator> |
2237 | |
2238 | +<!--========================================================================--> |
2239 | + |
2240 | <zorba:iterator name="StripDiacriticsIterator" |
2241 | preprocessorGuard="#ifndef ZORBA_NO_FULL_TEXT"> |
2242 | <zorba:function> |
2243 | @@ -121,6 +145,8 @@ |
2244 | </zorba:function> |
2245 | </zorba:iterator> |
2246 | |
2247 | +<!--========================================================================--> |
2248 | + |
2249 | <zorba:iterator name="ThesaurusLookupIterator" |
2250 | generateResetImpl="true" |
2251 | preprocessorGuard="#ifndef ZORBA_NO_FULL_TEXT"> |
2252 | @@ -167,56 +193,69 @@ |
2253 | </zorba:state> |
2254 | </zorba:iterator> |
2255 | |
2256 | +<!--========================================================================--> |
2257 | + |
2258 | <zorba:iterator name="TokenizeNodeIterator" |
2259 | generateResetImpl="true" |
2260 | - generateSerialize="false" |
2261 | - generateConstructor="false" |
2262 | - preprocessorGuard="#ifndef ZORBA_NO_FULL_TEXT"> |
2263 | - |
2264 | - <zorba:state generateInit="use-default"> |
2265 | - <zorba:member type="store::Item_t" name="doc_item_"/> |
2266 | - <zorba:member type="FTTokenIterator_t" name="doc_tokens_"/> |
2267 | - </zorba:state> |
2268 | - |
2269 | - <zorba:member type="store::Item_t" name="token_qname_"/> |
2270 | - <zorba:member type="store::Item_t" name="lang_qname_"/> |
2271 | - <zorba:member type="store::Item_t" name="para_qname_"/> |
2272 | - <zorba:member type="store::Item_t" name="sent_qname_"/> |
2273 | - <zorba:member type="store::Item_t" name="value_qname_"/> |
2274 | - <zorba:member type="store::Item_t" name="ref_qname_"/> |
2275 | - |
2276 | - <zorba:method name="initMembers" return="void"/> |
2277 | - |
2278 | -</zorba:iterator> |
2279 | + preprocessorGuard="#ifndef ZORBA_NO_FULL_TEXT"> |
2280 | + <zorba:state generateInit="use-default"> |
2281 | + <zorba:member type="store::Item_t" name="doc_item_"/> |
2282 | + <zorba:member type="FTTokenIterator_t" name="doc_tokens_"/> |
2283 | + <zorba:member type="TokenQNames" name="token_qnames_"/> |
2284 | + </zorba:state> |
2285 | +</zorba:iterator> |
2286 | + |
2287 | +<!--========================================================================--> |
2288 | + |
2289 | +<zorba:iterator name="TokenizeNodesIterator" |
2290 | + generateResetImpl="true" |
2291 | + preprocessorGuard="#ifndef ZORBA_NO_FULL_TEXT"> |
2292 | + <zorba:state generateInit="use-default"> |
2293 | + <zorba:member type="store::Item_t" name="doc_item_"/> |
2294 | + <zorba:member type="FTTokenIterator_t" name="doc_tokens_"/> |
2295 | + |
2296 | + <zorba:member type="TokenQNames" name="token_qnames_"/> |
2297 | + |
2298 | + <zorba:member type="std::list<store::Item_t>" name="includes_"/> |
2299 | + <zorba:member type="std::vector<store::Item_t>" name="excludes_"/> |
2300 | + |
2301 | + <zorba:member type="std::stack<Tokenizer*>" name="tokenizers_"/> |
2302 | + <zorba:member type="std::stack<locale::iso639_1::type>" name="langs_"/> |
2303 | + <zorba:member type="TokenizeNodesCallback" name="callback_"/> |
2304 | + <zorba:member type="Tokenizer::State" name="t_state_"/> |
2305 | + <zorba:member type="std::deque<FTToken>" name="tokens_"/> |
2306 | + </zorba:state> |
2307 | +</zorba:iterator> |
2308 | + |
2309 | +<!--========================================================================--> |
2310 | |
2311 | <zorba:iterator name="TokenizerPropertiesIterator" |
2312 | preprocessorGuard="#ifndef ZORBA_NO_FULL_TEXT"> |
2313 | </zorba:iterator> |
2314 | |
2315 | +<!--========================================================================--> |
2316 | + |
2317 | <zorba:iterator name="TokenizeStringIterator" |
2318 | generateResetImpl="true" |
2319 | preprocessorGuard="#ifndef ZORBA_NO_FULL_TEXT"> |
2320 | |
2321 | <zorba:function> |
2322 | - |
2323 | <zorba:signature localname="tokenize-string" prefix="full-text"> |
2324 | <zorba:param>xs:string</zorba:param> <!-- string --> |
2325 | <zorba:output>xs:string*</zorba:output> |
2326 | </zorba:signature> |
2327 | - |
2328 | <zorba:signature localname="tokenize-string" prefix="full-text"> |
2329 | <zorba:param>xs:string</zorba:param> <!-- string --> |
2330 | <zorba:param>xs:language</zorba:param> <!-- lang --> |
2331 | <zorba:output>xs:string*</zorba:output> |
2332 | </zorba:signature> |
2333 | - |
2334 | </zorba:function> |
2335 | - |
2336 | <zorba:state generateInit="use-default"> |
2337 | <zorba:member type="FTTokenSeqIterator" name="string_tokens_"/> |
2338 | </zorba:state> |
2339 | - |
2340 | </zorba:iterator> |
2341 | |
2342 | +<!--========================================================================--> |
2343 | + |
2344 | </zorba:iterators> |
2345 | <!-- vim:set et sw=2 ts=2: --> |
2346 | |
2347 | === modified file 'src/runtime/visitors/pregenerated/planiter_visitor.h' |
2348 | --- src/runtime/visitors/pregenerated/planiter_visitor.h 2012-06-28 21:54:08 +0000 |
2349 | +++ src/runtime/visitors/pregenerated/planiter_visitor.h 2012-06-29 16:57:20 +0000 |
2350 | @@ -232,6 +232,9 @@ |
2351 | class TokenizeNodeIterator; |
2352 | #endif |
2353 | #ifndef ZORBA_NO_FULL_TEXT |
2354 | + class TokenizeNodesIterator; |
2355 | +#endif |
2356 | +#ifndef ZORBA_NO_FULL_TEXT |
2357 | class TokenizerPropertiesIterator; |
2358 | #endif |
2359 | #ifndef ZORBA_NO_FULL_TEXT |
2360 | @@ -1015,6 +1018,10 @@ |
2361 | virtual void endVisit ( const TokenizeNodeIterator& ) = 0; |
2362 | #endif |
2363 | #ifndef ZORBA_NO_FULL_TEXT |
2364 | + virtual void beginVisit ( const TokenizeNodesIterator& ) = 0; |
2365 | + virtual void endVisit ( const TokenizeNodesIterator& ) = 0; |
2366 | +#endif |
2367 | +#ifndef ZORBA_NO_FULL_TEXT |
2368 | virtual void beginVisit ( const TokenizerPropertiesIterator& ) = 0; |
2369 | virtual void endVisit ( const TokenizerPropertiesIterator& ) = 0; |
2370 | #endif |
2371 | |
2372 | === modified file 'src/runtime/visitors/pregenerated/printer_visitor.cpp' |
2373 | --- src/runtime/visitors/pregenerated/printer_visitor.cpp 2012-06-28 21:54:08 +0000 |
2374 | +++ src/runtime/visitors/pregenerated/printer_visitor.cpp 2012-06-29 16:57:20 +0000 |
2375 | @@ -1442,6 +1442,21 @@ |
2376 | |
2377 | #endif |
2378 | #ifndef ZORBA_NO_FULL_TEXT |
2379 | +// <TokenizeNodesIterator> |
2380 | +void PrinterVisitor::beginVisit ( const TokenizeNodesIterator& a) { |
2381 | + thePrinter.startBeginVisit("TokenizeNodesIterator", ++theId); |
2382 | + printCommons( &a, theId ); |
2383 | + thePrinter.endBeginVisit( theId ); |
2384 | +} |
2385 | + |
2386 | +void PrinterVisitor::endVisit ( const TokenizeNodesIterator& ) { |
2387 | + thePrinter.startEndVisit(); |
2388 | + thePrinter.endEndVisit(); |
2389 | +} |
2390 | +// </TokenizeNodesIterator> |
2391 | + |
2392 | +#endif |
2393 | +#ifndef ZORBA_NO_FULL_TEXT |
2394 | // <TokenizerPropertiesIterator> |
2395 | void PrinterVisitor::beginVisit ( const TokenizerPropertiesIterator& a) { |
2396 | thePrinter.startBeginVisit("TokenizerPropertiesIterator", ++theId); |
2397 | |
2398 | === modified file 'src/runtime/visitors/pregenerated/printer_visitor.h' |
2399 | --- src/runtime/visitors/pregenerated/printer_visitor.h 2012-06-28 21:54:08 +0000 |
2400 | +++ src/runtime/visitors/pregenerated/printer_visitor.h 2012-06-29 16:57:20 +0000 |
2401 | @@ -356,6 +356,11 @@ |
2402 | #endif |
2403 | |
2404 | #ifndef ZORBA_NO_FULL_TEXT |
2405 | + void beginVisit( const TokenizeNodesIterator& ); |
2406 | + void endVisit ( const TokenizeNodesIterator& ); |
2407 | +#endif |
2408 | + |
2409 | +#ifndef ZORBA_NO_FULL_TEXT |
2410 | void beginVisit( const TokenizerPropertiesIterator& ); |
2411 | void endVisit ( const TokenizerPropertiesIterator& ); |
2412 | #endif |
2413 | |
2414 | === modified file 'src/util/xml_util.h' |
2415 | --- src/util/xml_util.h 2012-06-28 04:14:03 +0000 |
2416 | +++ src/util/xml_util.h 2012-06-29 16:57:20 +0000 |
2417 | @@ -40,12 +40,14 @@ |
2418 | return o << version_string_of[ v ]; |
2419 | } |
2420 | |
2421 | -////////// "James Clark notation" universal name functions //////////////////// |
2422 | +////////// XML name handling ////////////////////////////////////////////////// |
2423 | |
2424 | /** |
2425 | * Attempts to extract the local name from a "universal name". |
2426 | * See: http://www.jclark.com/xml/xmlns.htm |
2427 | * |
2428 | + * @tparam InputStringType The input string type. |
2429 | + * @tparam OutputStringType The output string type. |
2430 | * @param uname The universal name. |
2431 | * @param local A pointer to the string to receive the local name. |
2432 | * @return Returns \c true only if the extraction was successful. |
2433 | @@ -64,6 +66,8 @@ |
2434 | * Attempts to extract the URI from a "universal name". |
2435 | * See: http://www.jclark.com/xml/xmlns.htm |
2436 | * |
2437 | + * @tparam InputStringType The input string type. |
2438 | + * @tparam OutputStringType The output string type. |
2439 | * @param uname The universal name. |
2440 | * @param uri A pointer to the string to receive the URI. |
2441 | * @return Returns \c true only if the extraction was successful. |
2442 | @@ -80,11 +84,39 @@ |
2443 | return false; |
2444 | } |
2445 | |
2446 | +/** |
2447 | + * Splits an XML name at a \c : if present. |
2448 | + * |
2449 | + * @tparam InputStringType The input string type. |
2450 | + * @tparam PrefixStringType The output prefix string type. |
2451 | + * @tparam LocalStringType The output local string type. |
2452 | + * @param name The XML name to be split. |
2453 | + * @param prefix The prefix is put here, if any. |
2454 | + * @param local The local name is put here. |
2455 | + * @return Returns \c false only if \a name contains a \c : and either the |
2456 | + * resulting \a prefix or \a local is empty; otherwise returns \c true. |
2457 | + */ |
2458 | +template<class InputStringType,class PrefixStringType,class LocalStringType> |
2459 | +inline bool split_name( InputStringType const &name, PrefixStringType *prefix, |
2460 | + LocalStringType *local ) { |
2461 | + typename InputStringType::size_type const colon = name.find( ':' ); |
2462 | + if ( colon != InputStringType::npos ) { |
2463 | + prefix->assign( name, 0, colon ); |
2464 | + local->assign( name, colon + 1, LocalStringType::npos ); |
2465 | + return !( prefix->empty() || local->empty() ); |
2466 | + } else { |
2467 | + prefix->clear(); |
2468 | + *local = name; |
2469 | + return true; |
2470 | + } |
2471 | +} |
2472 | + |
2473 | ////////// Character validity ///////////////////////////////////////////////// |
2474 | |
2475 | /** |
2476 | * Checks whether the given code-point is valid for the given XML version. |
2477 | * |
2478 | + * @tparam CodePointType The integral Unicode code-point type. |
2479 | * @param v The XML version to use. |
2480 | * @return Returns \c true only if the code-point is valid. |
2481 | */ |
2482 | @@ -196,7 +228,7 @@ |
2483 | /** |
2484 | * Parses an XML entity reference. |
2485 | * |
2486 | - * @tparam StringType The type of the input string. |
2487 | + * @tparam StringType The input string type. |
2488 | * @param ref The string pointing to the start of the entity reference. |
2489 | * @param c A pointer to the code-point result. |
2490 | * @return If successful, returns the number of characters parsed; otherwise |
2491 | @@ -211,7 +243,7 @@ |
2492 | * Parses an XML entity reference and appends the UTF-8 encoding of the |
2493 | * resulting code-point to the given string. |
2494 | * |
2495 | - * @tparam StringType The type of the output string. |
2496 | + * @tparam StringType The output string type. |
2497 | * @param ref The C string pointing to the start of the entity reference. |
2498 | * @param out A string to append to. |
2499 | * @return If successful, returns the number of characters parsed; otherwise |
2500 | @@ -230,8 +262,8 @@ |
2501 | * Parses an XML entity reference and appends the UTF-8 encoding of the |
2502 | * resulting code-point to the given string. |
2503 | * |
2504 | - * @tparam InputStringType The type of the input string. |
2505 | - * @tparam OutputStringType The type of the output string. |
2506 | + * @tparam InputStringType The input string type. |
2507 | + * @tparam OutputStringType The output string type. |
2508 | * @param ref The string pointing to the start of the entity reference. |
2509 | * @param out A string to append to. |
2510 | * @return If successful, returns the number of characters parsed; otherwise |
2511 | |
2512 | === added file 'test/rbkt/ExpQueryResults/zorba/fulltext/ft-module-tokenize-nodes-1.xml.res' |
2513 | --- test/rbkt/ExpQueryResults/zorba/fulltext/ft-module-tokenize-nodes-1.xml.res 1970-01-01 00:00:00 +0000 |
2514 | +++ test/rbkt/ExpQueryResults/zorba/fulltext/ft-module-tokenize-nodes-1.xml.res 2012-06-29 16:57:20 +0000 |
2515 | @@ -0,0 +1,1 @@ |
2516 | +true |
2517 | |
2518 | === added file 'test/rbkt/Queries/zorba/fulltext/ft-module-tokenize-nodes-1.xq' |
2519 | --- test/rbkt/Queries/zorba/fulltext/ft-module-tokenize-nodes-1.xq 1970-01-01 00:00:00 +0000 |
2520 | +++ test/rbkt/Queries/zorba/fulltext/ft-module-tokenize-nodes-1.xq 2012-06-29 16:57:20 +0000 |
2521 | @@ -0,0 +1,42 @@ |
2522 | +import module namespace ft = "http://www.zorba-xquery.com/modules/full-text"; |
2523 | +import schema namespace fts = "http://www.zorba-xquery.com/modules/full-text"; |
2524 | + |
2525 | +let $book := |
2526 | + <book> |
2527 | + <title>The C++ Programming Language</title> |
2528 | + <authors> |
2529 | + <author>Bjarne Stroustrup</author> |
2530 | + </authors> |
2531 | + <chapters> |
2532 | + <chapter> |
2533 | + <title>Notes to the Reader</title> |
2534 | + <content> |
2535 | + <quote> |
2536 | + <content> |
2537 | + "The time has come," the Walrus said, |
2538 | + "to talk of many things." |
2539 | + </content> |
2540 | + <source>Lewis Carroll</source> |
2541 | + </quote> |
2542 | + <!-- more content --> |
2543 | + </content> |
2544 | + </chapter> |
2545 | + </chapters> |
2546 | + </book> |
2547 | + |
2548 | +let $includes := $book//chapter |
2549 | +let $excludes := $book//quote |
2550 | + |
2551 | +let $tokens := ft:tokenize-nodes( $includes, $excludes, xs:language("en") ) |
2552 | + |
2553 | +let $t1 := validate { $tokens[1] } |
2554 | +let $t2 := validate { $tokens[2] } |
2555 | +let $t3 := validate { $tokens[3] } |
2556 | +let $t4 := validate { $tokens[4] } |
2557 | + |
2558 | +return $t1/@value = "Notes" |
2559 | + and $t2/@value = "to" |
2560 | + and $t3/@value = "the" |
2561 | + and $t4/@value = "Reader" |
2562 | + |
2563 | +(: vim:set et sw=2 ts=2: :) |
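For reviewers who want to check the return-value contract of the new split_name() template in src/util/xml_util.h, here is a minimal standalone sketch. The template body is copied from the diff above. The std::string instantiation and the split_name_examples() helper are assumptions made for illustration only and are not part of this merge; Zorba's own string types may be used instead in the real code.

```cpp
#include <cassert>
#include <string>

// Copied from the diff above (src/util/xml_util.h); the enclosing namespace
// is omitted here so the sketch stands alone.
template<class InputStringType,class PrefixStringType,class LocalStringType>
inline bool split_name( InputStringType const &name, PrefixStringType *prefix,
                        LocalStringType *local ) {
  typename InputStringType::size_type const colon = name.find( ':' );
  if ( colon != InputStringType::npos ) {
    prefix->assign( name, 0, colon );
    local->assign( name, colon + 1, LocalStringType::npos );
    return !( prefix->empty() || local->empty() );
  } else {
    prefix->clear();
    *local = name;
    return true;
  }
}

// Hypothetical helper (not in the merge): exercises the documented cases
// using std::string for all three template parameters.
inline void split_name_examples() {
  std::string prefix, local;

  // Prefixed name: split at the first ':'.
  assert( split_name( std::string( "ft:tokenize" ), &prefix, &local ) );
  assert( prefix == "ft" && local == "tokenize" );

  // No ':' at all: prefix is cleared, the whole name becomes the local part.
  assert( split_name( std::string( "book" ), &prefix, &local ) );
  assert( prefix.empty() && local == "book" );

  // ':' present but one side is empty: reported as failure.
  assert( !split_name( std::string( "ft:" ), &prefix, &local ) );
  assert( !split_name( std::string( ":tokenize" ), &prefix, &local ) );
}
```

Note that only the first colon is considered, so a name such as "a:b:c" yields the prefix "a" and the local part "b:c"; the function does not otherwise check that either part is a valid NCName.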
- the changelog says that it's a new function but it has been there before
- ft:tokenize-nodes#2 comment is confusing. Why does it say
74 + : The default <a href="http://www.w3.org/TR/xmlschema-2/#language">language</a>
75 + : is assumed to be the one returned by <code>ft:current-lang()</code>
in between the two pragmas.