Merge lp:~nbrinza/zorba/parse-fragment into lp:zorba

Proposed by Nicolae Brinza
Status: Superseded
Proposed branch: lp:~nbrinza/zorba/parse-fragment
Merge into: lp:zorba
Diff against target: 761 lines (+344/-17)
34 files modified
ChangeLog (+3/-0)
modules/com/zorba-xquery/www/modules/xml-options.xsd (+1/-0)
modules/com/zorba-xquery/www/modules/xml.xq (+8/-3)
src/diagnostics/diagnostic_en.xml (+9/-1)
src/diagnostics/pregenerated/dict_en.cpp (+3/-1)
src/runtime/parsing_and_serializing/fragment_istream.h (+10/-3)
src/runtime/parsing_and_serializing/parse_fragment_impl.cpp (+2/-0)
src/store/api/load_properties.h (+16/-0)
src/store/naive/loader.h (+6/-0)
src/store/naive/loader_dtd.cpp (+74/-9)
test/rbkt/ExpQueryResults/zorba/parsing_and_serializing/parse-fragment-doctype-01.xml.res (+5/-0)
test/rbkt/ExpQueryResults/zorba/parsing_and_serializing/parse-fragment-doctype-02.xml.res (+6/-0)
test/rbkt/ExpQueryResults/zorba/parsing_and_serializing/parse-fragment-doctype-03.xml.res (+6/-0)
test/rbkt/ExpQueryResults/zorba/parsing_and_serializing/parse-fragment-doctype-04.xml.res (+7/-0)
test/rbkt/ExpQueryResults/zorba/parsing_and_serializing/parse-fragment-doctype-05.xml.res (+6/-0)
test/rbkt/ExpQueryResults/zorba/parsing_and_serializing/parse-fragment-doctype-06.xml.res (+4/-0)
test/rbkt/ExpQueryResults/zorba/parsing_and_serializing/parse-fragment-doctype-07.xml.res (+7/-0)
test/rbkt/ExpQueryResults/zorba/parsing_and_serializing/parse-fragment-doctype-08.xml.res (+2/-0)
test/rbkt/ExpQueryResults/zorba/parsing_and_serializing/parse-fragment-doctype-11.xml.res (+8/-0)
test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-01.xq (+12/-0)
test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-02.xq (+13/-0)
test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-03.xq (+13/-0)
test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-04.xq (+14/-0)
test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-05.xq (+13/-0)
test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-06.xq (+12/-0)
test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-07.xq (+15/-0)
test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-08.xq (+12/-0)
test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-09.spec (+1/-0)
test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-09.xq (+13/-0)
test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-10.spec (+1/-0)
test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-10.xq (+13/-0)
test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-11.xq (+15/-0)
test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-12.spec (+1/-0)
test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-12.xq (+13/-0)
To merge this branch: bzr merge lp:~nbrinza/zorba/parse-fragment
Reviewer: Matthias Brantner
Status: Needs Fixing
Review via email: mp+115344@code.launchpad.net

This proposal has been superseded by a proposal from 2012-07-17.

Commit message

The parse-fragment function now allows a DOCTYPE declaration in the input.

Description of the change

The parse-fragment function now allows a DOCTYPE declaration in the input.
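
The acceptance rule that the diff below adds to loader_dtd.cpp can be read as a small decision function. The following is only a sketch under stated assumptions, not the patch itself: `FragmentState`, `DoctypeAction`, `startsWithDoctype` and `classifyDoctype` are hypothetical names, the three states stand in for `FRAGMENT_FIRST_START_DOC`, `FRAGMENT_PROLOG` and `FRAGMENT_CONTENT`, and the prefix test uses a bounds-checked `memcmp` where the patch compares nine characters one by one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Stand-in for FragmentIStream::FRAGMENT_PARSER_STATE (hypothetical names).
enum class FragmentState { FirstStartDoc, Prolog, Content };

// Hypothetical result type. The patch instead sets ctxt->instate or raises
// ZSTR0021 with one of the two new diagnostic messages.
enum class DoctypeAction
{
  NotADoctype,          // input does not start with "<!DOCTYPE"
  NoAction,             // first startDocument not seen yet; nothing is decided
  Accept,               // patch sets ctxt->instate = XML_PARSER_MISC
  ErrorNotAllowed,      // ParseFragmentDoctypeNotAllowed
  ErrorNotAllowedHere   // ParseFragmentDoctypeNotAllowedHere
};

// Bounds-checked prefix test (assumption: the caller knows how many bytes
// are available at `cur`).
inline bool startsWithDoctype(const char* cur, std::size_t len)
{
  assert(cur != nullptr);
  static const char kPrefix[] = "<!DOCTYPE";
  const std::size_t n = sizeof(kPrefix) - 1;
  return len >= n && std::memcmp(cur, kPrefix, n) == 0;
}

// Mirrors the branch order of the added if / else-if chain.
inline DoctypeAction classifyDoctype(const char* cur,
                                     std::size_t len,
                                     FragmentState state,
                                     bool errorOnDoctype)
{
  if (!startsWithDoctype(cur, len))
    return DoctypeAction::NotADoctype;
  if (state == FragmentState::Prolog && !errorOnDoctype)
    return DoctypeAction::Accept;
  if (state == FragmentState::FirstStartDoc)
    return DoctypeAction::NoAction;
  if (state != FragmentState::Prolog)
    return DoctypeAction::ErrorNotAllowedHere;  // after content or a second DOCTYPE
  return DoctypeAction::ErrorNotAllowed;        // prolog, but error-on-doctype is set
}
```

In this sketch a DOCTYPE seen in the prolog is accepted unless error-on-doctype is set, and a DOCTYPE seen once the state has moved to content is always reported as misplaced, regardless of the flag.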

Zorba Build Bot (zorba-buildbot) wrote :

Voting does not meet specified criteria. Required: Approve > 1, Disapprove < 1, Needs Fixing < 1, Pending < 1. Got: 1 Pending.

Matthias Brantner (matthias-brantner) wrote :

ChangeLog should mention the fix of bug https://bugs.launchpad.net/zorba/+bug/1016606. In fact, the patch fixes a bug and is not a new feature. ;-)

review: Needs Fixing
lp:~nbrinza/zorba/parse-fragment updated
10533. By Nicolae Brinza

Updated Changelog with the resolution of bug #1016606

10534. By Nicolae Brinza

Merged with Zorba trunk

10535. By Nicolae Brinza

Fixed parse-fragment not correctly handling the lifetime of streamable streams.

10536. By Nicolae Brinza

The input buffer of parse-fragment can grow if libxml is not able to parse the current chunk. Fixes bug #1027270

10537. By Nicolae Brinza

Merged with Zorba trunk

10538. By Nicolae Brinza

Merged with Zorba trunk

10539. By Nicolae Brinza

Updated the Changelog with fixes for bugs #1016606 and #1024033

10540. By Nicolae Brinza

Updated the Changelog with the fix for the bug #1023170

10541. By Nicolae Brinza

Merged with Zorba trunk

10542. By Nicolae Brinza

Fix for bug #1099535 endless loop in xml:parse()

10543. By Nicolae Brinza

Merged with Zorba trunk

10544. By Nicolae Brinza

Updated Changelog to mention fix for bug #1099535

10545. By Nicolae Brinza

Merged with Zorba trunk

10546. By Nicolae Brinza

Fixed bug #1099648 -- XML parsing failures on Red Hat

10547. By Nicolae Brinza

Updated Changelog to mention the fix for bug #1099648

10548. By Nicolae Brinza

Merged with Zorba trunk
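
Revision 10536 above says the parse-fragment input buffer can grow when libxml cannot parse the current chunk. The revision itself is not part of the preview diff, so the following is only a generic sketch of that idea, not Zorba's implementation: `ChunkParser`, `parseWithGrowingBuffer` and `collectTags` are hypothetical names, and the doubling strategy is an assumption. The callback reports how many bytes it consumed. When it reports zero and more input is available, the buffer is doubled and topped up before retrying.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical callback: returns how many leading bytes of [data, data + len)
// it consumed; 0 means "cannot make progress with this chunk yet".
using ChunkParser = std::function<std::size_t(const char*, std::size_t)>;

// Feeds `in` to `parse` through a buffer that starts at `initialSize` bytes
// and doubles whenever the parser stalls on a full buffer.
// Returns the final buffer capacity.
inline std::size_t parseWithGrowingBuffer(std::istream& in,
                                          const ChunkParser& parse,
                                          std::size_t initialSize)
{
  assert(initialSize > 0);
  std::vector<char> buf(initialSize);
  std::size_t used = 0;      // bytes currently held in buf
  bool exhausted = false;    // true once the stream cannot deliver more

  for (;;)
  {
    // Top up the free tail of the buffer.
    if (!exhausted && used < buf.size())
    {
      in.read(buf.data() + used, static_cast<std::streamsize>(buf.size() - used));
      used += static_cast<std::size_t>(in.gcount());
      exhausted = !in.good();
    }
    if (used == 0)
      break;                 // nothing left to parse

    const std::size_t consumed = parse(buf.data(), used);
    assert(consumed <= used);
    if (consumed > 0)
    {
      // Keep the unparsed remainder at the front of the buffer.
      std::memmove(buf.data(), buf.data() + consumed, used - consumed);
      used -= consumed;
    }
    else if (exhausted)
      break;                 // parser is stuck and no more input will arrive
    else
      buf.resize(buf.size() * 2);  // parser is stuck on a full buffer: grow
  }
  return buf.size();
}

// Demo parser: collects every '>'-terminated token from `input`.
inline std::vector<std::string> collectTags(const std::string& input,
                                            std::size_t initialSize,
                                            std::size_t* finalCapacity = nullptr)
{
  std::vector<std::string> tags;
  std::istringstream in(input);
  const std::size_t cap = parseWithGrowingBuffer(
      in,
      [&tags](const char* data, std::size_t len) -> std::size_t {
        const void* end = std::memchr(data, '>', len);
        if (end == nullptr)
          return 0;          // token is not complete in this chunk
        const std::size_t n =
            static_cast<std::size_t>(static_cast<const char*>(end) - data) + 1;
        tags.emplace_back(data, n);
        return n;
      },
      initialSize);
  if (finalCapacity != nullptr)
    *finalCapacity = cap;
  return tags;
}
```

With a 4-byte initial buffer, a 22-byte token only becomes parsable after the buffer has doubled three times to 32 bytes. Input that never completes a token ends the loop once the stream is exhausted instead of spinning.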

Unmerged revisions

10548. By Nicolae Brinza

Merged with Zorba trunk

Preview Diff

1=== modified file 'ChangeLog'
2--- ChangeLog 2012-07-12 17:29:55 +0000
3+++ ChangeLog 2012-07-17 15:50:31 +0000
4@@ -4,6 +4,8 @@
5 version 2.x
6
7 New Features:
8+ * The parse-fragment function now allows a DOCTYPE declaration in the input.
9+ (Also fixed bug #1016606 with the feature request).
10 * Implemented the new EQName syntax (use Q{namespace}local instead of "namespace":local).
11 Also updated the fn:path() function.
12 * Item::isSeekable API extension for streamable content (xs:string and xs:base64Binary).
13@@ -30,6 +32,7 @@
14 * Streaming execution for tumbling windows (also fixes bug #1010051).
15
16 Bug Fixes/Other Changes:
17+ * Fixed bug #1016606 (DOCTYPE in the input of the parse-fragment function)
18 * Fixed bug #1002993 (bug during revalidation after update; improper condition
19 for calling TypeOps::get_atomic_type_code() from
20 SchemaValidatorImpl::isPossibleSimpleContentRevalImpl())
21
22=== modified file 'modules/com/zorba-xquery/www/modules/xml-options.xsd'
23--- modules/com/zorba-xquery/www/modules/xml-options.xsd 2012-07-12 17:29:55 +0000
24+++ modules/com/zorba-xquery/www/modules/xml-options.xsd 2012-07-17 15:50:31 +0000
25@@ -61,6 +61,7 @@
26 </simpleType>
27 </attribute>
28 <attribute name="skip-top-level-text-nodes" type="boolean" use="optional"/>
29+ <attribute name="error-on-doctype" type="boolean" use="optional"/>
30 </complexType>
31 </element>
32 <element name="substitute-entities" minOccurs="0" maxOccurs="1">
33
34=== modified file 'modules/com/zorba-xquery/www/modules/xml.xq'
35--- modules/com/zorba-xquery/www/modules/xml.xq 2012-07-12 17:29:55 +0000
36+++ modules/com/zorba-xquery/www/modules/xml.xq 2012-07-17 15:50:31 +0000
37@@ -141,7 +141,11 @@
38 : external entities. If the option
39 : is enabled, the input must conform to the syntax extParsedEnt (production
40 : [78] in XML 1.0, see <a href="http://www.w3.org/TR/xml/#wf-entities">
41- : Well-Formed Parsed Entities</a>). The result of the function call is a list
42+ : Well-Formed Parsed Entities</a>). In addition, by default a DOCTYPE declaration is allowed,
43+ : as described by the [28] doctypedecl production, see <a href="http://www.w3.org/TR/xml/#NT-doctypedecl">
44+ : Document Type Definition</a>. A parameter is available to forbid the appearance of the DOCTYPE.
45+ :
46+ : The result of the function call is a list
47 : of nodes corresponding to the top-level components of the content of the
48 : external entity: that is, elements, processing instructions, comments, and
49 : text nodes. CDATA sections and character references are expanded, and
50@@ -151,7 +155,7 @@
51 : (<a href="http://www.w3.org/TR/xml/#sec-well-formed">production [1] in XML 1.0</a>).
52 : This option can not be used together with either the &lt;schema-validate/&gt; or the &lt;DTD-validate/&gt;
53 : option. Doing so will raise a zerr:ZXQD0003 error.
54- : The &lt;parse-external-parsed-entity/&gt; option has two parameters, given by attributes. The first
55+ : The &lt;parse-external-parsed-entity/&gt; option has three parameters, given by attributes. The first
56 : attribute is "skip-root-nodes" and it can have a non-negative value. Specifying the paramter
57 : tells the parser to skip the given number of root nodes and return only their children. E.g.
58 : skip-root-nodes="1" is equivalent to parse-xml($xml-string)/node()/node() . skip-root-nodes="2" is equivalent
59@@ -159,7 +163,8 @@
60 : boolean value. Specifying "true" will tell the parser to skip top level text nodes, returning
61 : only the top level elements, comments, PIs, etc. This parameter works in combination with
62 : the "skip-root-nodes" paramter, thus top level text nodes are skipped after "skip-root-nodes" has
63- : been applied.
64+ : been applied. The third parameter is "error-on-doctype" and will generate an error if a DOCTYPE
65+ : declaration appears in the input, which by default is allowed.
66 : </li>
67 :
68 : <li>
69
70=== modified file 'src/diagnostics/diagnostic_en.xml'
71--- src/diagnostics/diagnostic_en.xml 2012-07-12 17:29:55 +0000
72+++ src/diagnostics/diagnostic_en.xml 2012-07-17 15:50:31 +0000
73@@ -3834,7 +3834,15 @@
74 </entry>
75
76 <entry key="ParseFragmentInvalidOptions">
77- <value>invalid options passed to the parse-xml:parse() function, the element must in the schema target namespace</value>
78+ <value>invalid options passed to the parse-xml:parse() function, the element must be in the schema target namespace</value>
79+ </entry>
80+
81+ <entry key="ParseFragmentDoctypeNotAllowed">
82+ <value>a DOCTYPE declaration is not allowed</value>
83+ </entry>
84+
85+ <entry key="ParseFragmentDoctypeNotAllowedHere">
86+ <value>a DOCTYPE declaration must appear before any element or text node, and at most once</value>
87 </entry>
88
89 <entry key="FormatNumberDuplicates">
90
91=== modified file 'src/diagnostics/pregenerated/dict_en.cpp'
92--- src/diagnostics/pregenerated/dict_en.cpp 2012-07-12 17:29:55 +0000
93+++ src/diagnostics/pregenerated/dict_en.cpp 2012-07-17 15:50:31 +0000
94@@ -670,7 +670,9 @@
95 { "~OpNodeBeforeMustHaveNodes", "op:node-before() must have nodes as parameters" },
96 { "~OperationNotDef_23", "$2 not defined for type \"$3\"" },
97 { "~OperationNotPossibleWithTypes_234", "\"$2\": operation not possible with parameters of type \"$3\" and \"$4\"" },
98- { "~ParseFragmentInvalidOptions", "invalid options passed to the parse-xml:parse() function, the element must in the schema target namespace" },
99+ { "~ParseFragmentDoctypeNotAllowed", "a DOCTYPE declaration is not allowed" },
100+ { "~ParseFragmentDoctypeNotAllowedHere", "a DOCTYPE declaration must appear before any element or text node, and at most once" },
101+ { "~ParseFragmentInvalidOptions", "invalid options passed to the parse-xml:parse() function, the element must be in the schema target namespace" },
102 { "~ParseFragmentOptionCombinationNotAllowed", "only one of the <schema-validate/>, <DTD-validate/> or <parse-external-parsed-entity/> options can be specified" },
103 { "~ParserInitFailed", "parser initialization failed" },
104 { "~ParserNoCreateTree", "XML tree creation failed" },
105
106=== modified file 'src/runtime/parsing_and_serializing/fragment_istream.h'
107--- src/runtime/parsing_and_serializing/fragment_istream.h 2012-07-12 17:29:55 +0000
108+++ src/runtime/parsing_and_serializing/fragment_istream.h 2012-07-17 15:50:31 +0000
109@@ -33,6 +33,13 @@
110 static const unsigned int BUFFER_SIZE = 4096;
111 static const unsigned int LOOKAHEAD_BYTES = 3; // lookahead fetching is implemented, but currently not used
112 static const unsigned int PARSED_NODES_BATCH_SIZE = 1024;
113+
114+ // the state names are only indicative
115+ enum FRAGMENT_PARSER_STATE {
116+ FRAGMENT_FIRST_START_DOC = 0,
117+ FRAGMENT_PROLOG,
118+ FRAGMENT_CONTENT // this state is set once an element is encountered
119+ };
120
121 public:
122 std::istringstream* theIss;
123@@ -43,7 +50,7 @@
124 int current_element_depth;
125 int root_elements_to_skip;
126 xmlParserCtxtPtr ctxt;
127- bool first_start_doc;
128+ FRAGMENT_PARSER_STATE state;
129 bool forced_parser_stop;
130 bool reached_eof;
131 unsigned int parsed_nodes_count;
132@@ -63,7 +70,7 @@
133 current_element_depth(0),
134 root_elements_to_skip(0),
135 ctxt(NULL),
136- first_start_doc(true),
137+ state(FRAGMENT_FIRST_START_DOC),
138 forced_parser_stop(false),
139 reached_eof(false),
140 parsed_nodes_count(0),
141@@ -103,7 +110,7 @@
142 current_element_depth = 0;
143 root_elements_to_skip = 0;
144 ctxt = NULL;
145- first_start_doc = true;
146+ state = FRAGMENT_FIRST_START_DOC;
147 forced_parser_stop = false;
148 reached_eof = false;
149 parsed_nodes_count = 0;
150
151=== modified file 'src/runtime/parsing_and_serializing/parse_fragment_impl.cpp'
152--- src/runtime/parsing_and_serializing/parse_fragment_impl.cpp 2012-07-12 17:29:55 +0000
153+++ src/runtime/parsing_and_serializing/parse_fragment_impl.cpp 2012-07-17 15:50:31 +0000
154@@ -145,6 +145,8 @@
155 props.setSkipRootNodes(ztd::aton<xs_int>(attr->getStringValue().c_str()));
156 else if (attr->getNodeName()->getLocalName() == "skip-top-level-text-nodes")
157 props.setSkipTopLevelTextNodes(true);
158+ else if (attr->getNodeName()->getLocalName() == "error-on-doctype")
159+ props.setErrorOnDoctype(true);
160 }
161 attribs->close();
162 }
163
164=== modified file 'src/store/api/load_properties.h'
165--- src/store/api/load_properties.h 2012-07-12 17:29:55 +0000
166+++ src/store/api/load_properties.h 2012-07-17 15:50:31 +0000
167@@ -46,6 +46,10 @@
168 bool theParseExternalParsedEntity;
169 unsigned int theSkipRootNodes;
170 bool theSkipTopLevelTextNodes;
171+ bool theErrorOnDoctype; // Used by the fragment parser. By default it will allow Doctype
172+ // declarations. But if a Doctype declaration is
173+ // present, and the flag is set to true, an error will be generated.
174+
175 bool theSubstituteEntities;
176 bool theXincludeSubstitutions;
177 bool theRemoveRedundantNS;
178@@ -73,6 +77,7 @@
179 theParseExternalParsedEntity(false),
180 theSkipRootNodes(0),
181 theSkipTopLevelTextNodes(false),
182+ theErrorOnDoctype(false),
183 theSubstituteEntities(false),
184 theXincludeSubstitutions(false),
185 theRemoveRedundantNS(false),
186@@ -100,6 +105,7 @@
187 theParseExternalParsedEntity = false;
188 theSkipRootNodes = 0;
189 theSkipTopLevelTextNodes = false;
190+ theErrorOnDoctype = false;
191 theSubstituteEntities = false;
192 theXincludeSubstitutions = false;
193 theRemoveRedundantNS = false;
194@@ -238,6 +244,16 @@
195 {
196 return theSkipTopLevelTextNodes;
197 }
198+
199+ // theErrorOnDoctype
200+ void setErrorOnDoctype(bool aErrorOnDoctype)
201+ {
202+ theErrorOnDoctype = aErrorOnDoctype;
203+ }
204+ bool getErrorOnDoctype() const
205+ {
206+ return theErrorOnDoctype;
207+ }
208
209 // theSubstituteEntities
210 void setSubstituteEntities(bool aSubstituteEntities)
211
212=== modified file 'src/store/naive/loader.h'
213--- src/store/naive/loader.h 2012-07-12 17:29:55 +0000
214+++ src/store/naive/loader.h 2012-07-17 15:50:31 +0000
215@@ -306,6 +306,12 @@
216 void * ctx,
217 const xmlChar * target,
218 const xmlChar * data);
219+
220+ static void internalSubset(
221+ void *ctx,
222+ const xmlChar *name,
223+ const xmlChar *ExternalID,
224+ const xmlChar *SystemID);
225
226 protected:
227 FragmentIStream* theFragmentStream;
228
229=== modified file 'src/store/naive/loader_dtd.cpp'
230--- src/store/naive/loader_dtd.cpp 2012-07-12 17:29:55 +0000
231+++ src/store/naive/loader_dtd.cpp 2012-07-17 15:50:31 +0000
232@@ -141,6 +141,7 @@
233 theSaxHandler.getEntity = &FragmentXmlLoader::getEntity;
234 theSaxHandler.getParameterEntity = &FragmentXmlLoader::getParameterEntity;
235 theSaxHandler.entityDecl = &FragmentXmlLoader::entityDecl;
236+ theSaxHandler.internalSubset = &FragmentXmlLoader::internalSubset;
237 }
238
239
240@@ -251,7 +252,7 @@
241 theFragmentStream->parsed_nodes_count = 0;
242 theFragmentStream->forced_parser_stop = false;
243
244- if ( ! theFragmentStream->first_start_doc)
245+ if (theFragmentStream->state != FragmentIStream::FRAGMENT_FIRST_START_DOC)
246 {
247 theFragmentStream->ctxt->instate = XML_PARSER_CONTENT;
248 FragmentXmlLoader::startDocument(theFragmentStream->ctxt->userData);
249@@ -259,9 +260,7 @@
250
251 while ( ! theFragmentStream->forced_parser_stop && fillBuffer(theFragmentStream))
252 {
253- // std::cerr << "\n==================\n--> skip_root: " << theFragmentStream->root_elements_to_skip << " current_depth: " << theFragmentStream->current_element_depth << " about to parse: [" << theFragmentStream->ctxt->input->cur << "] " << std::endl;
254-
255- if (theFragmentStream->only_one_doc_node && !theFragmentStream->first_start_doc)
256+ if (theFragmentStream->only_one_doc_node && theFragmentStream->state != FragmentIStream::FRAGMENT_FIRST_START_DOC)
257 {
258 theFragmentStream->ctxt->instate = XML_PARSER_CONTENT;
259 theFragmentStream->ctxt->disableSAX = false; // xmlStopParser() sets disableSAX to true
260@@ -278,7 +277,51 @@
261 );
262 throw 0; // the argument to throw is not used by the catch clause
263 }
264-
265+
266+ // parse the DOCTYPE declaration, if any
267+ if (theFragmentStream->ctxt->input->cur[0] == '<' &&
268+ theFragmentStream->ctxt->input->cur[1] == '!' &&
269+ theFragmentStream->ctxt->input->cur[2] == 'D' &&
270+ theFragmentStream->ctxt->input->cur[3] == 'O' &&
271+ theFragmentStream->ctxt->input->cur[4] == 'C' &&
272+ theFragmentStream->ctxt->input->cur[5] == 'T' &&
273+ theFragmentStream->ctxt->input->cur[6] == 'Y' &&
274+ theFragmentStream->ctxt->input->cur[7] == 'P' &&
275+ theFragmentStream->ctxt->input->cur[8] == 'E')
276+ {
277+ if (theFragmentStream->state == FragmentIStream::FRAGMENT_PROLOG
278+ &&
279+ theLoadProperties.getErrorOnDoctype() == false)
280+ {
281+ theFragmentStream->ctxt->instate = XML_PARSER_MISC;
282+ }
283+ else if (theFragmentStream->state != FragmentIStream::FRAGMENT_FIRST_START_DOC)
284+ {
285+ if (theFragmentStream->state != FragmentIStream::FRAGMENT_PROLOG)
286+ {
287+ theXQueryDiagnostics->add_error(theDocUri.empty() ?
288+ NEW_ZORBA_EXCEPTION(zerr::ZSTR0021_LOADER_PARSING_ERROR, ERROR_PARAMS( ZED( ParseFragmentDoctypeNotAllowedHere ))) :
289+ NEW_ZORBA_EXCEPTION(zerr::ZSTR0021_LOADER_PARSING_ERROR, ERROR_PARAMS( ZED( ParseFragmentDoctypeNotAllowedHere ), theDocUri))
290+ );
291+ throw 0; // the argument to throw is not used by the catch clause
292+ }
293+ else // theLoadProperties.getErrorOnDoctype() == true
294+ {
295+ theXQueryDiagnostics->add_error(theDocUri.empty() ?
296+ NEW_ZORBA_EXCEPTION(zerr::ZSTR0021_LOADER_PARSING_ERROR, ERROR_PARAMS( ZED( ParseFragmentDoctypeNotAllowed ))) :
297+ NEW_ZORBA_EXCEPTION(zerr::ZSTR0021_LOADER_PARSING_ERROR, ERROR_PARAMS( ZED( ParseFragmentDoctypeNotAllowed ), theDocUri))
298+ );
299+ throw 0; // the argument to throw is not used by the catch clause
300+ }
301+ }
302+ }
303+
304+ /*
305+ std::cerr << "\n==================\n--> skip_root: " << theFragmentStream->root_elements_to_skip << " current_depth: " << theFragmentStream->current_element_depth
306+ << " state: " << theFragmentStream->ctxt->instate
307+ << " about to parse: [" << theFragmentStream->ctxt->input->cur << "] " << std::endl;
308+ */
309+
310 xmlParseChunk(theFragmentStream->ctxt, (const char*)theFragmentStream->ctxt->input->cur,
311 theFragmentStream->ctxt->input->length, 0);
312
313@@ -287,7 +330,7 @@
314 &&
315 theFragmentStream->current_offset == 0)
316 {
317- if (theFragmentStream->first_start_doc)
318+ if (theFragmentStream->state == FragmentIStream::FRAGMENT_FIRST_START_DOC)
319 FragmentXmlLoader::startDocument(theFragmentStream->ctxt->userData);
320 xmlParseCharData(theFragmentStream->ctxt, 0);
321 theFragmentStream->current_offset = getCurrentInputOffset(); // update current offset
322@@ -310,7 +353,7 @@
323 }
324
325 // this happens when the input is an empty string
326- if (theFragmentStream->first_start_doc
327+ if (theFragmentStream->state == FragmentIStream::FRAGMENT_FIRST_START_DOC
328 &&
329 theFragmentStream->stream_is_consumed())
330 FragmentXmlLoader::startDocument(theFragmentStream->ctxt->userData);
331@@ -363,6 +406,7 @@
332 return resultNode;
333 }
334
335+
336 unsigned long FragmentXmlLoader::getCurrentInputOffset() const
337 {
338 unsigned long offset = theFragmentStream->ctxt->input->cur
339@@ -372,6 +416,7 @@
340 return offset;
341 }
342
343+
344 void FragmentXmlLoader::checkStopParsing(void* ctx, bool force)
345 {
346 FragmentXmlLoader& loader = *(static_cast<FragmentXmlLoader*>(ctx));
347@@ -402,23 +447,26 @@
348 loader.theFragmentStream->parsed_nodes_count++;
349 }
350
351+
352 void FragmentXmlLoader::startDocument(void * ctx)
353 {
354 FragmentXmlLoader& loader = *(static_cast<FragmentXmlLoader*>(ctx));
355 ZORBA_LOADER_CHECK_ERROR(loader);
356 FastXmlLoader::startDocument(ctx);
357- if (loader.theFragmentStream->first_start_doc)
358+ if (loader.theFragmentStream->state == FragmentIStream::FRAGMENT_FIRST_START_DOC)
359 {
360- loader.theFragmentStream->first_start_doc = false;
361+ loader.theFragmentStream->state = FragmentIStream::FRAGMENT_PROLOG;
362 FragmentXmlLoader::checkStopParsing(ctx, true);
363 }
364 }
365
366+
367 void FragmentXmlLoader::endDocument(void * ctx)
368 {
369 FastXmlLoader::endDocument(ctx);
370 }
371
372+
373 void FragmentXmlLoader::startElement(
374 void * ctx,
375 const xmlChar * localname,
376@@ -432,6 +480,7 @@
377 {
378 FragmentXmlLoader& loader = *(static_cast<FragmentXmlLoader*>(ctx));
379 ZORBA_LOADER_CHECK_ERROR(loader);
380+ loader.theFragmentStream->state = FragmentIStream::FRAGMENT_CONTENT;
381 loader.theFragmentStream->current_element_depth++;
382 if (loader.theFragmentStream->current_element_depth > loader.theFragmentStream->root_elements_to_skip)
383 {
384@@ -447,6 +496,7 @@
385 }
386 }
387
388+
389 void FragmentXmlLoader::endElement(
390 void * ctx,
391 const xmlChar * localname,
392@@ -461,6 +511,7 @@
393 checkStopParsing(ctx);
394 }
395
396+
397 void FragmentXmlLoader::characters(
398 void * ctx,
399 const xmlChar * ch,
400@@ -512,6 +563,20 @@
401 }
402
403
404+void FragmentXmlLoader::internalSubset(
405+ void *ctx,
406+ const xmlChar *name,
407+ const xmlChar *ExternalID,
408+ const xmlChar *SystemID)
409+{
410+ FragmentXmlLoader& loader = *(static_cast<FragmentXmlLoader*>(ctx));
411+ ZORBA_LOADER_CHECK_ERROR(loader);
412+ loader.theFragmentStream->state = FragmentIStream::FRAGMENT_CONTENT;
413+ checkStopParsing(ctx);
414+ loader.theFragmentStream->current_offset++;
415+}
416+
417+
418 /*******************************************************************************
419
420 ********************************************************************************/
421
422=== added file 'test/rbkt/ExpQueryResults/zorba/parsing_and_serializing/parse-fragment-doctype-01.xml.res'
423--- test/rbkt/ExpQueryResults/zorba/parsing_and_serializing/parse-fragment-doctype-01.xml.res 1970-01-01 00:00:00 +0000
424+++ test/rbkt/ExpQueryResults/zorba/parsing_and_serializing/parse-fragment-doctype-01.xml.res 2012-07-17 15:50:31 +0000
425@@ -0,0 +1,5 @@
426+<?xml version="1.0" encoding="UTF-8"?>
427+
428+<from1>Jani</from1>
429+<from2>Jani</from2>
430+<from3>Jani</from3>
431\ No newline at end of file
432
433=== added file 'test/rbkt/ExpQueryResults/zorba/parsing_and_serializing/parse-fragment-doctype-02.xml.res'
434--- test/rbkt/ExpQueryResults/zorba/parsing_and_serializing/parse-fragment-doctype-02.xml.res 1970-01-01 00:00:00 +0000
435+++ test/rbkt/ExpQueryResults/zorba/parsing_and_serializing/parse-fragment-doctype-02.xml.res 2012-07-17 15:50:31 +0000
436@@ -0,0 +1,6 @@
437+<?xml version="1.0" encoding="UTF-8"?>
438+
439+
440+<from1>Jani</from1>
441+<from2>Jani</from2>
442+<from3>Jani</from3>
443\ No newline at end of file
444
445=== added file 'test/rbkt/ExpQueryResults/zorba/parsing_and_serializing/parse-fragment-doctype-03.xml.res'
446--- test/rbkt/ExpQueryResults/zorba/parsing_and_serializing/parse-fragment-doctype-03.xml.res 1970-01-01 00:00:00 +0000
447+++ test/rbkt/ExpQueryResults/zorba/parsing_and_serializing/parse-fragment-doctype-03.xml.res 2012-07-17 15:50:31 +0000
448@@ -0,0 +1,6 @@
449+<?xml version="1.0" encoding="UTF-8"?>
450+
451+
452+<from1>Jani</from1>
453+<from2>Jani</from2>
454+<from3>Jani</from3>
455\ No newline at end of file
456
457=== added file 'test/rbkt/ExpQueryResults/zorba/parsing_and_serializing/parse-fragment-doctype-04.xml.res'
458--- test/rbkt/ExpQueryResults/zorba/parsing_and_serializing/parse-fragment-doctype-04.xml.res 1970-01-01 00:00:00 +0000
459+++ test/rbkt/ExpQueryResults/zorba/parsing_and_serializing/parse-fragment-doctype-04.xml.res 2012-07-17 15:50:31 +0000
460@@ -0,0 +1,7 @@
461+<?xml version="1.0" encoding="UTF-8"?>
462+
463+
464+
465+<from1>Jani</from1>
466+<from2>Jani</from2>
467+<from3>Jani</from3>
468\ No newline at end of file
469
470=== added file 'test/rbkt/ExpQueryResults/zorba/parsing_and_serializing/parse-fragment-doctype-05.xml.res'
471--- test/rbkt/ExpQueryResults/zorba/parsing_and_serializing/parse-fragment-doctype-05.xml.res 1970-01-01 00:00:00 +0000
472+++ test/rbkt/ExpQueryResults/zorba/parsing_and_serializing/parse-fragment-doctype-05.xml.res 2012-07-17 15:50:31 +0000
473@@ -0,0 +1,6 @@
474+<?xml version="1.0" encoding="UTF-8"?>
475+
476+
477+<from1>Jani</from1>
478+<from2>Jani</from2>
479+<from3>Jani</from3>
480\ No newline at end of file
481
482=== added file 'test/rbkt/ExpQueryResults/zorba/parsing_and_serializing/parse-fragment-doctype-06.xml.res'
483--- test/rbkt/ExpQueryResults/zorba/parsing_and_serializing/parse-fragment-doctype-06.xml.res 1970-01-01 00:00:00 +0000
484+++ test/rbkt/ExpQueryResults/zorba/parsing_and_serializing/parse-fragment-doctype-06.xml.res 2012-07-17 15:50:31 +0000
485@@ -0,0 +1,4 @@
486+<?xml version="1.0" encoding="UTF-8"?>
487+
488+Jani
489+<from3>Jani</from3>
490\ No newline at end of file
491
492=== added file 'test/rbkt/ExpQueryResults/zorba/parsing_and_serializing/parse-fragment-doctype-07.xml.res'
493--- test/rbkt/ExpQueryResults/zorba/parsing_and_serializing/parse-fragment-doctype-07.xml.res 1970-01-01 00:00:00 +0000
494+++ test/rbkt/ExpQueryResults/zorba/parsing_and_serializing/parse-fragment-doctype-07.xml.res 2012-07-17 15:50:31 +0000
495@@ -0,0 +1,7 @@
496+<?xml version="1.0" encoding="UTF-8"?>
497+
498+<!-- comment -->
499+
500+<!-- comment -->
501+Jani
502+<from3>Jani</from3>
503\ No newline at end of file
504
505=== added file 'test/rbkt/ExpQueryResults/zorba/parsing_and_serializing/parse-fragment-doctype-08.xml.res'
506--- test/rbkt/ExpQueryResults/zorba/parsing_and_serializing/parse-fragment-doctype-08.xml.res 1970-01-01 00:00:00 +0000
507+++ test/rbkt/ExpQueryResults/zorba/parsing_and_serializing/parse-fragment-doctype-08.xml.res 2012-07-17 15:50:31 +0000
508@@ -0,0 +1,2 @@
509+<?xml version="1.0" encoding="UTF-8"?>
510+JaniJaniJani
511\ No newline at end of file
512
513=== added file 'test/rbkt/ExpQueryResults/zorba/parsing_and_serializing/parse-fragment-doctype-11.xml.res'
514--- test/rbkt/ExpQueryResults/zorba/parsing_and_serializing/parse-fragment-doctype-11.xml.res 1970-01-01 00:00:00 +0000
515+++ test/rbkt/ExpQueryResults/zorba/parsing_and_serializing/parse-fragment-doctype-11.xml.res 2012-07-17 15:50:31 +0000
516@@ -0,0 +1,8 @@
517+<?xml version="1.0" encoding="UTF-8"?>
518+
519+<!-- comment -->
520+
521+<!-- comment -->
522+<from1>Jani</from1>
523+<from2>Jani</from2>
524+<from3>Jani</from3>
525\ No newline at end of file
526
527=== added file 'test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-01.xq'
528--- test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-01.xq 1970-01-01 00:00:00 +0000
529+++ test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-01.xq 2012-07-17 15:50:31 +0000
530@@ -0,0 +1,12 @@
531+import module namespace x = "http://www.zorba-xquery.com/modules/xml";
532+import schema namespace opt = "http://www.zorba-xquery.com/modules/xml-options";
533+
534+x:parse(
535+"<!DOCTYPE HTML PUBLIC '-//W3C//DTD HTML 4.01 Transitional//EN' 'http://www.w3.org/TR/html4/loose.dtd'>
536+<from1>Jani</from1>
537+<from2>Jani</from2>
538+<from3>Jani</from3>",
539+<opt:options>
540+ <opt:parse-external-parsed-entity/>
541+</opt:options>
542+)
543
544=== added file 'test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-02.xq'
545--- test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-02.xq 1970-01-01 00:00:00 +0000
546+++ test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-02.xq 2012-07-17 15:50:31 +0000
547@@ -0,0 +1,13 @@
548+import module namespace x = "http://www.zorba-xquery.com/modules/xml";
549+import schema namespace opt = "http://www.zorba-xquery.com/modules/xml-options";
550+
551+x:parse(
552+"<!DOCTYPE HTML PUBLIC '-//W3C//DTD HTML 4.01 Transitional//EN' 'http://www.w3.org/TR/html4/loose.dtd'>
553+
554+<from1>Jani</from1>
555+<from2>Jani</from2>
556+<from3>Jani</from3>",
557+<opt:options>
558+ <opt:parse-external-parsed-entity/>
559+</opt:options>
560+)
561
562=== added file 'test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-03.xq'
563--- test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-03.xq 1970-01-01 00:00:00 +0000
564+++ test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-03.xq 2012-07-17 15:50:31 +0000
565@@ -0,0 +1,13 @@
566+import module namespace x = "http://www.zorba-xquery.com/modules/xml";
567+import schema namespace opt = "http://www.zorba-xquery.com/modules/xml-options";
568+
569+x:parse(
570+"
571+<!DOCTYPE HTML PUBLIC '-//W3C//DTD HTML 4.01 Transitional//EN' 'http://www.w3.org/TR/html4/loose.dtd'>
572+<from1>Jani</from1>
573+<from2>Jani</from2>
574+<from3>Jani</from3>",
575+<opt:options>
576+ <opt:parse-external-parsed-entity/>
577+</opt:options>
578+)
579
580=== added file 'test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-04.xq'
--- test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-04.xq 1970-01-01 00:00:00 +0000
+++ test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-04.xq 2012-07-17 15:50:31 +0000
@@ -0,0 +1,14 @@
+import module namespace x = "http://www.zorba-xquery.com/modules/xml";
+import schema namespace opt = "http://www.zorba-xquery.com/modules/xml-options";
+
+x:parse(
+"
+<!DOCTYPE HTML PUBLIC '-//W3C//DTD HTML 4.01 Transitional//EN' 'http://www.w3.org/TR/html4/loose.dtd'>
+
+<from1>Jani</from1>
+<from2>Jani</from2>
+<from3>Jani</from3>",
+<opt:options>
+ <opt:parse-external-parsed-entity/>
+</opt:options>
+)

=== added file 'test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-05.xq'
--- test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-05.xq 1970-01-01 00:00:00 +0000
+++ test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-05.xq 2012-07-17 15:50:31 +0000
@@ -0,0 +1,13 @@
+import module namespace x = "http://www.zorba-xquery.com/modules/xml";
+import schema namespace opt = "http://www.zorba-xquery.com/modules/xml-options";
+
+x:parse(
+"<?xml version='1.0'?>
+<!DOCTYPE HTML PUBLIC '-//W3C//DTD HTML 4.01 Transitional//EN' 'http://www.w3.org/TR/html4/loose.dtd'>
+<from1>Jani</from1>
+<from2>Jani</from2>
+<from3>Jani</from3>",
+<opt:options>
+ <opt:parse-external-parsed-entity/>
+</opt:options>
+)

=== added file 'test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-06.xq'
--- test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-06.xq 1970-01-01 00:00:00 +0000
+++ test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-06.xq 2012-07-17 15:50:31 +0000
@@ -0,0 +1,12 @@
+import module namespace x = "http://www.zorba-xquery.com/modules/xml";
+import schema namespace opt = "http://www.zorba-xquery.com/modules/xml-options";
+
+x:parse(
+"<!DOCTYPE HTML PUBLIC '-//W3C//DTD HTML 4.01 Transitional//EN' 'http://www.w3.org/TR/html4/loose.dtd'>
+Jani
+<from3>Jani</from3>",
+<opt:options>
+ <opt:parse-external-parsed-entity/>
+</opt:options>
+)
+

=== added file 'test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-07.xq'
--- test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-07.xq 1970-01-01 00:00:00 +0000
+++ test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-07.xq 2012-07-17 15:50:31 +0000
@@ -0,0 +1,15 @@
+import module namespace x = "http://www.zorba-xquery.com/modules/xml";
+import schema namespace opt = "http://www.zorba-xquery.com/modules/xml-options";
+
+x:parse(
+"
+<!-- comment -->
+<!DOCTYPE HTML PUBLIC '-//W3C//DTD HTML 4.01 Transitional//EN' 'http://www.w3.org/TR/html4/loose.dtd'>
+<!-- comment -->
+Jani
+<from3>Jani</from3>",
+<opt:options>
+ <opt:parse-external-parsed-entity/>
+</opt:options>
+)
+

=== added file 'test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-08.xq'
--- test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-08.xq 1970-01-01 00:00:00 +0000
+++ test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-08.xq 2012-07-17 15:50:31 +0000
@@ -0,0 +1,12 @@
+import module namespace x = "http://www.zorba-xquery.com/modules/xml";
+import schema namespace opt = "http://www.zorba-xquery.com/modules/xml-options";
+
+x:parse(
+"<!DOCTYPE HTML PUBLIC '-//W3C//DTD HTML 4.01 Transitional//EN' 'http://www.w3.org/TR/html4/loose.dtd'>
+<from1>Jani</from1>
+<from2>Jani</from2>
+<from3>Jani</from3>",
+<opt:options>
+ <opt:parse-external-parsed-entity opt:skip-root-nodes="1"/>
+</opt:options>
+)

=== added file 'test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-09.spec'
--- test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-09.spec 1970-01-01 00:00:00 +0000
+++ test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-09.spec 2012-07-17 15:50:31 +0000
@@ -0,0 +1,1 @@
+Error: http://www.w3.org/2005/xqt-errors:FODC0006

=== added file 'test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-09.xq'
--- test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-09.xq 1970-01-01 00:00:00 +0000
+++ test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-09.xq 2012-07-17 15:50:31 +0000
@@ -0,0 +1,13 @@
+import module namespace x = "http://www.zorba-xquery.com/modules/xml";
+import schema namespace opt = "http://www.zorba-xquery.com/modules/xml-options";
+
+x:parse(
+"
+<from1>Jani</from1>
+<!DOCTYPE HTML PUBLIC '-//W3C//DTD HTML 4.01 Transitional//EN' 'http://www.w3.org/TR/html4/loose.dtd'>
+<from2>Jani</from2>
+<from3>Jani</from3>",
+<opt:options>
+ <opt:parse-external-parsed-entity/>
+</opt:options>
+)

=== added file 'test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-10.spec'
--- test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-10.spec 1970-01-01 00:00:00 +0000
+++ test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-10.spec 2012-07-17 15:50:31 +0000
@@ -0,0 +1,1 @@
+Error: http://www.w3.org/2005/xqt-errors:FODC0006

=== added file 'test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-10.xq'
--- test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-10.xq 1970-01-01 00:00:00 +0000
+++ test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-10.xq 2012-07-17 15:50:31 +0000
@@ -0,0 +1,13 @@
+import module namespace x = "http://www.zorba-xquery.com/modules/xml";
+import schema namespace opt = "http://www.zorba-xquery.com/modules/xml-options";
+
+x:parse(
+"
+<!DOCTYPE HTML PUBLIC '-//W3C//DTD HTML 4.01 Transitional//EN' 'http://www.w3.org/TR/html4/loose.dtd'>
+<!DOCTYPE HTML PUBLIC '-//W3C//DTD HTML 4.01 Transitional//EN' 'http://www.w3.org/TR/html4/loose.dtd'>
+<from2>Jani</from2>
+<from3>Jani</from3>",
+<opt:options>
+ <opt:parse-external-parsed-entity/>
+</opt:options>
+)

=== added file 'test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-11.xq'
--- test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-11.xq 1970-01-01 00:00:00 +0000
+++ test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-11.xq 2012-07-17 15:50:31 +0000
@@ -0,0 +1,15 @@
+import module namespace x = "http://www.zorba-xquery.com/modules/xml";
+import schema namespace opt = "http://www.zorba-xquery.com/modules/xml-options";
+
+x:parse(
+"<?xml version='1.0'?>
+<!-- comment -->
+<!DOCTYPE HTML PUBLIC '-//W3C//DTD HTML 4.01 Transitional//EN' 'http://www.w3.org/TR/html4/loose.dtd'>
+<!-- comment -->
+<from1>Jani</from1>
+<from2>Jani</from2>
+<from3>Jani</from3>",
+<opt:options>
+ <opt:parse-external-parsed-entity/>
+</opt:options>
+)

=== added file 'test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-12.spec'
--- test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-12.spec 1970-01-01 00:00:00 +0000
+++ test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-12.spec 2012-07-17 15:50:31 +0000
@@ -0,0 +1,1 @@
+Error: http://www.w3.org/2005/xqt-errors:FODC0006

=== added file 'test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-12.xq'
--- test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-12.xq 1970-01-01 00:00:00 +0000
+++ test/rbkt/Queries/zorba/parsing_and_serializing/parse-fragment-doctype-12.xq 2012-07-17 15:50:31 +0000
@@ -0,0 +1,13 @@
+import module namespace x = "http://www.zorba-xquery.com/modules/xml";
+import schema namespace opt = "http://www.zorba-xquery.com/modules/xml-options";
+
+x:parse(
+"<?xml version='1.0'?>
+<!DOCTYPE HTML PUBLIC '-//W3C//DTD HTML 4.01 Transitional//EN' 'http://www.w3.org/TR/html4/loose.dtd'>
+<from1>Jani</from1>
+<from2>Jani</from2>
+<from3>Jani</from3>",
+<opt:options>
+ <opt:parse-external-parsed-entity opt:error-on-doctype="true"/>
+</opt:options>
+)
