lp:~gagern/xalan/bugfixes

Created by Martin von Gagern and last modified

Collection of bug fixes from upstream issue tracker.

Annoyed by the fact that patches to address issues with XalanJ lie around in Apache Jira for ages without so much as a comment, I created this branch to keep my own patches organized and accessible to others. I'll probably add patches created by others as well, when I find the time to review them.

Get this branch:
bzr branch lp:~gagern/xalan/bugfixes

Branch information

Owner:
Martin von Gagern
Project:
Xalan
Status:
Development

Recent revisions

4482. By Martin von Gagern

XALANJ-2346: fixes for faulty NaN handling
Fixes https://issues.apache.org/jira/browse/XALANJ-2346
Patch by Dave Brosius

4481. By Martin von Gagern

XALANJ-2325: don't drop unused global variables

Fixes https://issues.apache.org/jira/browse/XALANJ-2325 :
XSLTC Causes NoSuchFieldError if global variable is unused

Based on a patch by Fred Kruesi

4480. By Martin von Gagern

XALANJ-2438: predicates for xsl:key elements

Fixes https://issues.apache.org/jira/browse/XALANJ-2438 :
[PATCH] XSLTC ignores XPath predicates in xsl:key elements

Patch by Helge Schulz.

4479. By Martin von Gagern

XALANJ-2452: fix treatment of CData sections in DOM2DTM

Fixes https://issues.apache.org/jira/browse/XALANJ-2452 :
//text() does not work for CData sections

Patch by Michele Vivoda

4478. By Martin von Gagern

XALANJ-2458: implement erratum 23 of the XSLT recommendation

Fixes https://issues.apache.org/jira/browse/XALANJ-2458 :
Xalan-J does not implement erratum 23 of the XSLT recommendation

For xsl:number, with level="any", if there are no matching nodes, Xalan-J
generates a text node with "0". The erratum states that, if there are no
matching nodes, there will be no text node in the result tree.
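To observe the behaviour described above, here is a minimal sketch that runs such a stylesheet through whatever JAXP processor is on the classpath. The class name, the stylesheet, and the count pattern "nomatch" are made up for illustration. A processor that follows the erratum should produce "[]", one that does not should produce "[0]"; which one you get depends on the implementation in use.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class NumberAnyDemo {
    // Hypothetical stylesheet: counts elements named "nomatch", which never occur in the input.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/'>[<xsl:number level='any' count='nomatch'/>]</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies the stylesheet above to the given XML and returns the text output. */
    static String run(String xml) throws Exception {
        Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Brackets make an empty result visible.
        System.out.println(run("<doc><item/></doc>"));
    }
}
```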

See the erratum for more details:
http://www.w3.org/1999/11/REC-xslt-19991116-errata/

Patch by David Bertoni.

4477. By Martin von Gagern

XALANJ-2462: avoid NullPointerException on fatal error.

Fixes https://issues.apache.org/jira/browse/XALANJ-2462 :
TransfomerImpl.transformNode() Fails to throw some errors due to NullPointerException

Based on a patch by Brendan Durkin.

4476. By Martin von Gagern

XALANJ-2495: Corrections to German error messages
Fixes https://issues.apache.org/jira/browse/XALANJ-2495 (for now).

4475. By Martin von Gagern

XALANJ-2473: conforming implementation of DTMNodeProxy.getTextContent().

Fixes https://issues.apache.org/jira/browse/XALANJ-2473 :
DTMNodeProxy.getTextContent() does not return text content of child nodes

If a Java extension function takes an org.w3c.dom.Node as an argument, then
it gets an org.apache.xml.dtm.ref.DTMNodeProxy object. The "getTextContent"
method of this object returns null if the node is an element node, although
according to the Java 1.5 documentation, the "getTextContent" method should
return the "concatenation of the textContent attribute value of every child
node".
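
For comparison, this is the behaviour the DOM contract asks for, sketched with the JDK's own DOM parser rather than with DTMNodeProxy. The class and method names are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class TextContentDemo {
    /** Parses the XML and returns getTextContent() of the document element. */
    static String rootText(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Element root = doc.getDocumentElement();
        // For an element this is the concatenated text of all descendants,
        // not null; comments and processing instructions are skipped.
        return root.getTextContent();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(rootText("<p>Hello <b>big</b> world</p>"));
    }
}
```

An extension function receiving a node from Xalan would be expected to see the same result from the proxy object once the fix is applied.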

4474. By Martin von Gagern

XALANJ-2493: fix nodeList2Iterator to deal with attributes.

Fixes https://issues.apache.org/jira/browse/XALANJ-2493 :
BasisLibrary.nodeList2Iterator broken

The current implementation of nodeList2Iterator is broken, because it can
not deal with attribute nodes. It relies on copyNodes which in turn tries
to add attribute nodes as children of some top level node. Attributes don't
live on the children axis, though, so this is against DOM and causes a DOM
exception in the Xerces DOM implementation and probably most other
implementations. The resulting HIERARCHY_REQUEST_ERR was noted e.g. in
XALANJ-2424.
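
The DOM restriction mentioned above can be reproduced without Xalan. The following sketch uses the JDK's built-in DOM implementation; the class and method names are made up for illustration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class AttrAsChildDemo {
    /** Tries to append an attribute as a child node; returns the DOMException code, or -1 if none was thrown. */
    static short appendAttrAsChild() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element top = doc.createElement("top");
        doc.appendChild(top);
        Attr attr = doc.createAttribute("id");
        attr.setValue("42");
        try {
            // Attributes are not allowed on the children axis of an element.
            top.appendChild(attr);
            return -1;
        } catch (DOMException e) {
            return e.code;
        }
    }

    public static void main(String[] args) throws Exception {
        short code = appendAttrAsChild();
        System.out.println(code == DOMException.HIERARCHY_REQUEST_ERR
                ? "HIERARCHY_REQUEST_ERR" : "code " + code);
    }
}
```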

Furthermore, the implementation is inefficient, because it manually copies
each and every node from the source document to some new DTM.
The time overhead for the copying as well as the memory overhead for the
additional DOM can be avoided in cases where the nodes come from some input
document, as opposed to a document completely loaded within some extension
function.

A comment in the related XALANJ-2425 suggests returning DTMIterator from
extension functions and avoiding the re-import for those. I don't like this
idea because it exposes a lot of Xalan's internals to extension functions,
and because the returned node list might be newly created, while at least
some of the nodes might still be from the same document. So instead of
special cases for the list type, I implemented special cases for every node
of the list. If it is a proxy node of the same (Multi)DOM, we simply use
its handle.

If not, we add it to some w3c DOM and turn that into a DTM, pretty much like
the current implementation does. However, I dropped copyNodes in favor of
Document.importNode, to avoid code duplication and rely on supposedly more
heavily tested code. I also added another level of elements, so that there
is one such dummy node for every item of the source list, with always a
single child or element. A few assertions help ensure this single child
policy. This is especially important in the new implementation, because
otherwise it would become difficult to get the proxied nodes and the newly
DTMified nodes into correct order.
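
The wrapper idea can be sketched in plain DOM terms as follows. This is not the code from the patch: the class name, the dummy element names, and the choice to attach an imported attribute as an attribute of its wrapper are assumptions made for illustration, and the conversion to a DTM is left out entirely.

```java
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class WrapperImportDemo {
    /** Imports every node into a fresh document, one dummy wrapper element per list item. */
    static Document wrapNodes(List<Node> nodes) throws Exception {
        Document target = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element top = target.createElement("top");      // hypothetical name
        target.appendChild(top);
        for (Node n : nodes) {
            Element wrapper = target.createElement("dummy"); // hypothetical name
            top.appendChild(wrapper);
            Node copy = target.importNode(n, true);      // deep copy owned by target
            if (copy.getNodeType() == Node.ATTRIBUTE_NODE) {
                wrapper.setAttributeNode((Attr) copy);   // attributes cannot be children
            } else {
                wrapper.appendChild(copy);
            }
            // Enforce the "exactly one item per wrapper" policy.
            int payload = wrapper.getChildNodes().getLength()
                    + wrapper.getAttributes().getLength();
            if (payload != 1) {
                throw new IllegalStateException("unsupported node type: " + n.getNodeType());
            }
        }
        return target;
    }
}
```

Because the i-th wrapper always corresponds to the i-th list item, the position of each imported node in the original list stays recoverable, which is what makes interleaving with proxied nodes tractable.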

Right now, the import of DOM nodes is only implemented for those nodes I
expect to turn up in the DTM in pretty much the same form as they do in the
w3c DOM. For all other nodes, an internal error is thrown. This especially
concerns document fragment nodes. At least in w3c DOM, a document fragment
can never be a child, so if DTM behaves the same, we would need to import
document fragments separately, or expand them to the list of their children
instead. I'm not sure what correct behaviour would be here, so I'd rather
throw an exception than implement wrong behaviour.

I also noticed that
org.apache.xml.dtm.ref.DTMManagerDefault.getDTMHandleFromNode would in
theory provide an implementation for turning w3c nodes into DTM handles -
exactly what we need here. That method seems to start importing from the
topmost ancestor of a node, giving as much context as possible, in contrast
to both current and my suggested XSLTC approach, which destroys ancestor
references. That method also seems to employ caches in order to avoid
importing a document repeatedly. Sadly, actually using that method throws a
ClassCastException as it expects a DTM generated from a DOM source to be a
DOM2DTM, which SAXImpl is not. A comment inside that method also indicates
that future implementations might drop auto-importing and instead leave it
to the caller to import a DOM if it hasn't been imported before.

I left my own attempt at a nodeList2Iterator implementation using
getDTMHandleFromNode in place, renamed to
nodeList2IteratorUsingHandleFromNode and made private. So it's there, it
even gets compiled, but it won't get used. If that method gets fixed in
XSLTCDTMManager or its ancestor, then this method might be used instead,
giving a much simpler and cleaner implementation. If some of my code would
be useful for such an implementation as well, like the check for "is same
DOM", feel free to copy or move my code to those classes as well.

4473. By Martin von Gagern

XALANJ-1847: allow references as "this" parameter.

Fixes https://issues.apache.org/jira/browse/XALANJ-1847 :
Transform.setParameter doesn't function as an extension mecanism

The issue mostly concerns the class namespace format. The reason for this
is that in the other namespace formats, the class to search for methods has
to be determined from the "this" argument, and such type information is not
available for references. It could be implemented inefficiently via
reflection at runtime in the long run, but we had better skip this for now.

With the class namespace format, we cannot tell by looking at the function
name whether we are dealing with a static or an instance method. The best
information in this respect probably comes from the class itself. The patch
modifies the method search to return all public methods with matching name,
without regard for argument count. typeCheckExternal then does a case
distinction: either it's a static method matching all parameters, or it's an
instance method with the first argument of a suitable type (either
assignable object type or reference type) and the rest of the arguments
matching the method parameters. The "this" argument is removed from the
list of all arguments only after a method has been chosen.
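
The case distinction can be illustrated with plain Java reflection. This is only a sketch of the idea, not the XSLTC code: the class and method names are made up, XSLTC's own type system and its reference types are not modelled, and only simple assignability is checked.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

public class MethodSearchDemo {
    /** Counts public methods of cls with the given name that accept the given argument types. */
    static int countMatches(Class<?> cls, String name, Class<?>[] argTypes) {
        int count = 0;
        for (Method m : cls.getMethods()) {          // all public methods, any arity
            if (m.getName().equals(name) && matches(m, argTypes)) {
                count++;
            }
        }
        return count;
    }

    /** Static: all arguments are parameters. Instance: first argument is "this". */
    static boolean matches(Method m, Class<?>[] argTypes) {
        Class<?>[] params = m.getParameterTypes();
        if (Modifier.isStatic(m.getModifiers())) {
            return assignable(params, argTypes, 0);
        }
        return argTypes.length >= 1
                && m.getDeclaringClass().isAssignableFrom(argTypes[0])
                && assignable(params, argTypes, 1);
    }

    /** Checks that args[offset..] can be passed for params, position by position. */
    private static boolean assignable(Class<?>[] params, Class<?>[] args, int offset) {
        if (params.length != args.length - offset) {
            return false;
        }
        for (int i = 0; i < params.length; i++) {
            if (!params[i].isAssignableFrom(args[i + offset])) {
                return false;
            }
        }
        return true;
    }
}
```

For example, with java.lang.Integer and the name "toString", the argument list {Integer.class} only fits the instance method toString(), while {int.class} only fits the static toString(int). Note that the "this" argument stays in the list until a method has been picked.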

The patch also fixes what seems to be an error with the return type during
method selection, where a method with a higher distance might override the
return type of the expression without actually getting chosen. Haven't
written a test case exposing this issue yet. If you absolutely need it, I
can do so.

When the "this" argument is of reference type, it has to be cast to the
target type. To do so, reference types have to be castable in the first
place. I'm a bit worried about all those special cases in
ReferenceType.translateTo, so I wouldn't be surprised if working on classes
that are also used to express XSLT data types might cause trouble.

It might be that allowing references to be cast to anything might cause
trouble in other parts of the code. I would assume that the Java VM should
catch most of those, though. Having castable reference types solved another
issue for me, where I tried to pass them not as "this" but as arguments to
extension functions.

Branch metadata

Branch format:
Branch format 7
Repository format:
Bazaar RepositoryFormatKnitPack6RichRoot (bzr 1.9)
Stacked on:
lp:~gagern/xalan/trunk