Java(TM) API for XML Processing
Release Notes
Version: 1.1
This document contains notes that may help you use this library
more effectively. Please see the JAXP specification for more
information.
XSLT Support
- XSLT is supported in this release via the javax.xml.transform
package. See the associated Javadoc for details on accessing basic
functionality in an XSLT processor-independent manner.
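
As an illustration of processor-independent access, here is a minimal sketch that uses only the javax.xml.transform API. The class name TransformSketch, the inline stylesheet, and the sample input are hypothetical and are not part of this release. It is written against a current JDK, not JDK 1.1.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class TransformSketch {
    // Hypothetical stylesheet: writes the text of the <greeting> element.
    static final String XSL =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'>"
      + "<xsl:value-of select='greeting'/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String transform(String xml) throws TransformerException {
        // The factory hides which XSLT processor is actually in use.
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer =
            factory.newTransformer(new StreamSource(new StringReader(XSL)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        // Prints "hello"
        System.out.println(transform("<greeting>hello</greeting>"));
    }
}
```

Because the code names no processor class, a different XSLT processor on the classpath should work without source changes.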
Parser
- There are two factory classes for making parsers pluggable. If
you limit your application to the JAXP API in the javax.xml.parsers,
org.xml.sax, and org.w3c.dom packages, you can use the library in a
manner independent of the underlying implementing parser.
- To be notified of validation errors in an XML document, these
items must be true:
  - Validation must be turned on. See the setValidating methods of
    javax.xml.parsers.DocumentBuilderFactory or
    javax.xml.parsers.SAXParserFactory.
  - An application-defined ErrorHandler must be set. See the
    setErrorHandler methods of javax.xml.parsers.DocumentBuilder or
    org.xml.sax.XMLReader.
  See example programs for more information.
- Whenever you work with text encodings other than UTF-8 and
UTF-16, you should put an encoding declaration at the very
beginning of all your XML files (including DTDs). If you don't do
this, the parser will not be able to determine the encoding being
used, and will probably be unable to parse your document. A text
declaration like <?xml version='1.0' encoding='euc-jp'?> says that
the document uses the "euc-jp" encoding.
- The parser currently reports warnings, rather than errors,
in cases where the declared and actual text encodings don't match.
It may give those same warnings in the common case where the
encoding name used internally to Java is not the one used in the
document. If the declared encoding is truly an error, you'll
usually see other errors (not warnings) being reported by the
parser.
- The parser currently does not report an error for content
models which are not deterministic. Accordingly it may not behave
well when given data which matches an "ambiguous" content model
such as ((a,b)|(a,c)). DTDs with such models are in
error, and must be restructured to be unambiguous. (In the
example, (a,(b|c)) is an equivalent legal content model.)
- If you are using JDK 1.1 with large numbers of symbols
(more than can be counted in sixteen bits) you might encounter the
message "panic: 16-bit string hash table overflow" as the
Java VM aborts. The Java 2 SDK does not have this limitation.
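
As an illustration of the validation notes earlier in this section, the following sketch turns validation on and installs an application-defined ErrorHandler on a DocumentBuilder. The class name ValidationSketch and the two sample documents are hypothetical. It is written against a current JDK, and the wording of the reported messages depends on the underlying parser.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class ValidationSketch {
    // Hypothetical documents with an internal DTD subset.
    static final String DTD =
        "<!DOCTYPE a [<!ELEMENT a (b)><!ELEMENT b EMPTY>]>";
    static final String VALID_DOC = DTD + "<a><b/></a>";
    static final String INVALID_DOC = DTD + "<a/>"; // required <b> is missing

    static List<String> validate(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true);              // condition 1: validation on
        DocumentBuilder builder = factory.newDocumentBuilder();

        final List<String> problems = new ArrayList<>();
        builder.setErrorHandler(new ErrorHandler() {  // condition 2: handler set
            public void warning(SAXParseException e) {
                problems.add("warning: " + e.getMessage());
            }
            public void error(SAXParseException e) {
                // Validity errors are reported here; parsing continues.
                problems.add("error: " + e.getMessage());
            }
            public void fatalError(SAXParseException e) throws SAXParseException {
                throw e; // well-formedness errors stop the parse
            }
        });
        builder.parse(new InputSource(new StringReader(xml)));
        return problems;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(validate(VALID_DOC));   // no problems collected
        System.out.println(validate(INVALID_DOC)); // at least one validity error
    }
}
```

The same pattern should apply to SAX by calling setValidating on SAXParserFactory and setErrorHandler on the XMLReader.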
Object Model
- Conforming to the XML specification, the parser reports all
whitespace to the DOM, even if it's meaningless. Many applications
do not want to see such whitespace. You can remove it by invoking
the Element.normalize method, which merges adjacent text
nodes and also canonicalizes adjacent whitespace into a single
space (unless the xml:space="preserve" attribute prevents
it).
- Currently, attribute nodes do not have children. Access their
values as strings instead of enumerating children.
- Currently, when documents are cloned, the clone will not have a
clone of the associated ElementFactory or DocumentType.
- The in-memory representation of text nodes has not been tuned
to be efficient with respect to space utilization.
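
The whitespace and attribute notes above can be illustrated with a short sketch that relies only on standard org.w3c.dom calls. The class name DomSketch and the sample document are hypothetical. The sketch does not call normalize, since its whitespace handling may differ between DOM implementations.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class DomSketch {
    // Hypothetical sample: indentation becomes whitespace-only text nodes.
    static final String SAMPLE =
        "<list kind='demo'>\n  <item/>\n  <item/>\n</list>";

    static Element parseRoot(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)))
                      .getDocumentElement();
    }

    // Counts direct child text nodes that contain only whitespace.
    static int countWhitespaceText(Element element) {
        int count = 0;
        for (Node child = element.getFirstChild(); child != null;
             child = child.getNextSibling()) {
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        Element root = parseRoot(SAMPLE);
        // Whitespace before, between, and after the two <item/> elements.
        System.out.println(countWhitespaceText(root)); // prints 3
        // Read the attribute value as a string, not through child nodes.
        System.out.println(root.getAttribute("kind")); // prints demo
    }
}
```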
Other Issues
- If you get a "nonfatal internal JIT" error when running with
Java 2 SDK version 1.2, you can either ignore the message or
upgrade to the newer HotSpot compiler, which is shipped by default
with Java 2 SDK version 1.3, to fix the problem.
- If you recompile the DOM implementation using versions of "javac"
older than the Java 2 SDK version 1.2 you may run into a compiler
bug. The symptom is a report of illegal access violations for some
of the private classes inside the DOM implementation. This is
because of incorrect code generated by the compiler. You should
only compile these class files with a compiler that does not have
this bug; you may also use the pre-compiled version in this
release. There is no bytecode dependency on the Java 2 runtime;
you may use these classes on JDK 1.1 systems also.
- The Microsoft SDK 3.2 for Java (and presumably all earlier
versions) has bugs similar to the one noted above. There are both
compiler and JVM bugs; the JVM bugs prevent the correct byte codes
(as produced by the Java 2 SDK) from working. This means that you
can't compile or use this DOM code with Microsoft implementations
of Java until Microsoft fixes these bugs, which have been reported
to Microsoft.
Changes since JAXP RI (Reference Implementation) version 1.1ea1
- Improved scheme for locating pluggable implementations. For
example, JAXP looks for a resource on the classpath for factory
implementations. This allows other parsers such as Xerces to be
used simply by adding a suitable xerces.jar to the classpath.
- Created new javax.xml.transform package to handle XSLT
processing.
- Updated javax.xml.parsers to SAX 2.0 and DOM Level 2.
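
To see which implementations the lookup scheme selects in a given environment, a small sketch such as the following can print the concrete factory classes. The class name FactoryLookupSketch is hypothetical, and the names printed depend entirely on the JDK and the jars on the classpath.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;

class FactoryLookupSketch {
    // Returns the concrete classes chosen by each factory's newInstance().
    static String[] implementationNames() {
        return new String[] {
            DocumentBuilderFactory.newInstance().getClass().getName(),
            SAXParserFactory.newInstance().getClass().getName(),
            TransformerFactory.newInstance().getClass().getName()
        };
    }

    public static void main(String[] args) {
        for (String name : implementationNames()) {
            System.out.println(name);
        }
    }
}
```

Running it before and after adding another parser jar to the classpath is one way to check whether that jar's factories are being picked up.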
Changes since JAXP RI (Reference Implementation) version 1.0.1
- All previous releases (version 1.0.1 and earlier) used a
parser implementation with a package hierarchy beginning with
com.sun.xml. Between version 1.0.1 and the current
release, the parser was donated to the Apache Software Foundation
under the name "Crimson" and the packages were correspondingly
renamed to org.apache.crimson. Migration from
previous releases may involve renaming packages in your
application. In addition, if your application uses SAX1, you
may either convert it to use the preferred SAX2
org.xml.sax.XMLReader or obtain a SAX1 org.xml.sax.Parser
from the javax.xml.parsers.SAXParser.getParser() method.
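
For the SAX2 route mentioned above, a minimal sketch might obtain an XMLReader from a JAXP SAXParser and attach a ContentHandler. The class name Sax2Sketch and the element-counting handler are hypothetical, and the code is written against a current JDK.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class Sax2Sketch {
    static int countElements(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        // SAX2 entry point; getParser() would return the SAX1 Parser instead.
        XMLReader reader = parser.getXMLReader();

        final int[] count = {0};
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                count[0]++; // one call per start tag
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countElements("<a><b/><b/></a>")); // prints 3
    }
}
```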
Changes since 1.0
- The default parser is used in controlled environments, such as
applets, where System.getProperty() results in a SecurityException.
- A default Message.properties is provided to avoid getting error codes
in locales other than English.
Changes since EA1
- API for pluggability has changed. See the specification and
javadocs for more details.
- All reported bugs have been fixed, including those reported
internally for SAX 1.0, DOM Level 1, and the JAXP 1.0 API.