Items that need to be done:

--- RIGHT AWAY ---

* Implement the ability to output the same DOCTYPE as was input.  Store the
  internal DTD subset as a String variable in DocType.  This means we need
  to reconstruct the String based on SAX events of the
  DTDHandler/DeclHandler.  Or if reading from DOM then from the DOM
  DocumentType.
  (Harry Evans and Phil Nelson are working on this.)

* Consider a getNames() method to ProcessingInstruction to iterate over the
  names.

* Consider if setText() should not replace children

* Take action on detach() allowing a Doc to have its root removed with an
  ISE upon later access.  This also removes the need to remove the
  Document(null) usage by builders.

* Integrate Alex Chaffee's XMLOutputter rework.  Included in this is a fix
  for the minor bug where pretty print (indented, newlines, trim text)
  prints an extra whitespace line between one closing element and another.

* Add setIgnoringAllWhitespace(boolean) method.

* Determine if setContent(), setAttributes(), setPIs(), setChildren(), and
  addAll() should be atomic on failure (such as when an illegal object is
  added from the list).  Currently setContent() has been changed to be
  atomic.

* Examine if it's worth doing an intern() on element and attribute names,
  or if it's too much to pay since SAX is likely doing it already
    http://www.megginson.com/SAX/Java/features.html
  Probably turn off SAX interning and do it ourselves?  Note commentary:
    http://lists.denveronline.net/lists/jdom-interest/2000-October/003289.html
  Regardless, the SAX builder should not assume there's interning!

* Perhaps have builder flags to indicate if CDATA sections should be
  included and if comment sections should be included.  All seem like
  reasonable customizations.  The whitespace flag may respect xml:space.
  It might use an XMLFilter to do the job.

* Integrate FilterList (currently developed in the jdom-wip CVS module)
  which replaced PartialList by more efficiently driving directly off the
  backing list.
  - Find a way to reuse the FilterList impls returned by getContent().
    Possible way: store content in a FilterList but give it a pkg prot
    accessor method to retrieve the raw list?
  - Make sure FilterList doesn't let an elt be added as a child (or
    grandchild, etc) of itself.  See elt.addContent() logic that walks
    ancestry.
  - See about reducing the number of classes in org.jdom (perhaps leaving
    Filter and FilterList in there but moving the others to be inner
    classes)
  - Decide what classes/methods should be final
  - See about a way so matches() can be a fast no-op for pre-filtered lists
    (as is the case for most calls)
  - Fix issue where size is kept as an instance var but that's not reliable

* Verify we get good error handling if someone passes null to any add/set
  method.  Consensus is throw an NPE if passing null to a setList-style
  method.

* Determine if DOMBuilder and DOMOutputter should transparently support
  DOM1.

* Look again at if XMLOutputter output methods taking a Writer should
  auto-flush.  Currently they do, but is that overstepping the contract?

* Look at where Namespace may need to be synchronized or made no longer a
  flyweight.  See
    http://lists.denveronline.net/lists/jdom-interest/2000-September/003009.html
  and follow-ups.

* Look into having a flag to turn off DTD loading, since parsers often load
  the DTD even if validation is off.  Xerces has a custom option:
    http://xml.apache.org/xerces-j/features.html
  Crimson doesn't, nor probably do others.  That makes it tricky.

* Figure out how to deal with XMLOutputter writing of special characters
  like the non-breaking space (U+00A0).  Should it char escape only chars
  unprintable in the current character set?  Should there be a fancy API
  for selecting what's escaped?  Should this be something where you can
  subclass?
    http://lists.denveronline.net/lists/jdom-interest/2001-February/004521.html
    http://lists.denveronline.net/lists/jdom-interest/2001-April/005644.html
    http://lists.denveronline.net/lists/jdom-interest/2001-April/005649.html
    http://lists.denveronline.net/lists/jdom-interest/2001-April/005669.html

* Consider an XMLOutputter flag or feature to convert characters with well
  known named character entities to their named char entity form instead
  of numeric.

* Determine if there should be a way to output everything on one line
  (including the decl and doctype)

* Look into elharo's report of DOMBuilder throwing away ignorable
  whitespace.
    http://www.servlets.com/archive/servlet/ReadMsg?msgId=8313

* Add a Text class (currently stored but unused in org.jdom)

* Performance optimize.  See following thread for test data.
    http://lists.denveronline.net/lists/jdom-interest/2000-October/003418.html
    http://lists.denveronline.net/lists/jdom-interest/2000-October/003472.html

* Look into how the factory builder model could support giving the factory
  extra knowledge about the context (line number, element stack, etc), and
  allow it to report errors or to return a code indicating the element
  should be ignored.

--- FOR JDOM 1.0 COMMUNITY REVIEW ---

* Expand class-level Javadocs for inclusion into Frame using the MIF
  Doclet.

* Note in the docs where necessary our multithreading policy.

--- FOR JDOM 1.0 ---

* Create "build dist" for distribution
    Use fixcrlf in dist (instead of package as currently done)
    Probably include source with jdom.jar built

* Consider changing XMLOutputter to have more set methods like Enhydra's
  DOMFormatter.  Possible good ones:
    void setPreserveSpace(boolean preserve)
      Set the default space-preservation flag.
    void setXmlEncoding(java.lang.String newXmlEncoding)
      Set the encoding using the XML encoding name.
    void setXmlEncoding(java.lang.String newJavaEncoding,
                        java.lang.String newXmlEncoding)
      Set both the XML and Java encodings.
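
  Sketch for the two XMLOutputter escaping items in RIGHT AWAY (escape only
  what the output encoding cannot represent; optionally prefer named
  entities over numeric references).  This is only an illustration of one
  possible policy, not the XMLOutputter design.  The class EscapeSketch,
  its escape() method, and the tiny NAMED table are hypothetical; note that
  names such as "nbsp" are not predefined in XML and only work if the DTD
  declares them.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Hypothetical helper, not part of JDOM.
final class EscapeSketch {
    // Illustrative table only; assumes the document's DTD declares these names.
    private static final Map<Integer, String> NAMED = Map.of(0xA0, "nbsp", 0xA9, "copy");

    // Escapes markup characters always, and any other code point only when
    // the target charset cannot encode it.
    static String escape(String text, Charset charset, boolean useNames) {
        CharsetEncoder encoder = charset.newEncoder();
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (cp == '<') { out.append("&lt;"); continue; }
            if (cp == '>') { out.append("&gt;"); continue; }
            if (cp == '&') { out.append("&amp;"); continue; }
            String s = new String(Character.toChars(cp));
            if (encoder.canEncode(s)) {
                out.append(s);                                   // representable as-is
            } else if (useNames && NAMED.containsKey(cp)) {
                out.append('&').append(NAMED.get(cp)).append(';'); // named entity
            } else {
                out.append("&#").append(cp).append(';');         // numeric reference
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("a\u00A0b", StandardCharsets.US_ASCII, false));
        System.out.println(escape("a\u00A0b", StandardCharsets.US_ASCII, true));
    }
}
```

  A subclassable hook (one of the options the item asks about) could be had
  by making the per-code-point decision an overridable method instead of a
  static one.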

* Consider adding methods/logic to Verifier for all XML spec. constraints
  (Consider specifically a PCDATA check.  Downside is Elliotte says it
  causes a 20% performance penalty on building docs.)
  Probably go with sanity checking input unless it adds significant time
  to a SAX build.  See
    http://lists.denveronline.net/lists/jdom-interest/2000-August/002088.html
  And
    http://lists.denveronline.net/lists/jdom-interest/2000-August/002102.html

* Consider changing the Verifier method signatures to throw the
  IllegalXXXException directly instead of returning null on error, and let
  the caller pass the exception through

* Populate jdom-test.  Jools is leading this but Phil Nelson is currently
  doing a lot of work.  Hong Zhang is helping with the J2EE CTS.

* Make sure we have a plan for supporting obj serialization across current
  and future JDOM versions.  See "serialVersionUID" thread especially
  Peter V. Gadjokov's remarks at
    http://lists.denveronline.net/lists/jdom-interest/2000-September/subject.html
  It may be OK to worry about fast short-term serialization only.

* Consider visitor pattern
    Use cases: count elements, count nodes, translate comments, remove PIs
    Would implement with option to visit depth or breadth first
    Maybe go crazy with pre-order, in-order, and post-order too :-)
    Methods would exist on Document and Element
    FYI, DOM's much overweight Traversal-Range spec is at
      http://www.w3.org/TR/2000/REC-DOM-Level-2-Traversal-Range-20001113/
    Joe Bowbeer has ideas at:
      http://lists.denveronline.net/lists/jdom-interest/2000-November/003610.html
    We may be able to just have doc.iterator() methods

* Ensure JDOM is appropriately tweaked for subclassing, per the threads
  started by Joe Bowbeer.
    http://www.servlets.com/archive/servlet/ReadMsg?msgId=7601 begins it

* Ensure JDOM is flawless regarding clone semantics, per more threads by
  Joe Bowbeer.
    http://www.servlets.com/archive/servlet/ReadMsg?msgId=7602 begins it

* Joe summarizes his issues at
    http://www.servlets.com/archive/servlet/ReadMsg?msgId=7697

--- FOR JDOM 1.1 ---

* Add XPath support, most likely integrating Bob McWhirter's package.

* Figure out XPath interface, current best is this:
    List    XPath.getList(Element e, String xpath)  // or Document param
    Comment XPath.getComment(Element e, String xpath)
    Element XPath.getElement(Element e, String xpath)
    ProcIns XPath.getProcIns(Element e, String xpath)
    Entity  XPath.getEntity(Element e, String xpath)
    String  XPath.getText(Element e, String xpath)

* Eliminate string hardcoding.  Use resource bundles to allow for
  localization.

* Investigate a way to do in-memory validation.  First step is probably
  to get an in-memory representation of a DTD as per
    http://xmlhack.com/read.php?item=626
    http://www.wutka.com/dtdparser.html
    http://lists.denveronline.net/lists/jdom-interest/2000-July/001431.html
    http://lists.denveronline.net/lists/jdom-interest/2001-February/004661.html
  Maybe new DTDValidator(dtd).validate(doc);
  Then later new SchemaValidator(schema).validate(doc);
  Could instead do doc.validate(dtd/schema) but then we'd have to
  dynamically switch between recognizing DTDs and the various schemas.
  The method would probably either throw InvalidDocumentException or might
  take an ErrorHandler-style interface implementation if there are
  non-fatal errors possible.
  It'd also be possible to have a programmatic verifier, that determined
  for example if an orderid="100" entry was valid against a database entry.

* Consider a listener interface so you could listen to doc changes.
  (Probably after 1.1 honestly; this can be done through manual subclasses
  already.)
  Some pertinent messages on this topic:
    http://lists.denveronline.net/lists/jdom-interest/2000-July/001586.html
    http://lists.denveronline.net/lists/jdom-interest/2000-July/001587.html
    http://lists.denveronline.net/lists/jdom-interest/2000-July/001600.html

* Consider a "locator" ability for nodes to remember the line number on
  which they were declared, to help debug semantic errors.
    http://lists.denveronline.net/lists/jdom-interest/2000-October/003422.html

--- UNTIED TO A JDOM VERSION ---

* Create a builder based on Xerces' XNI, which will be more featureful and
  probably faster than the one based on SAX.  See
    http://lists.denveronline.net/lists/jdom-interest/2001-July/007362.html

* Contribute the samples from Elliotte's XML DevCon talk to the samples/
  directory.
    http://metalab.unc.edu/xml/slides/xmlsig/jdom/JDOM.html

* Add a search for jdom.org using Google with site:www.jdom.org, imitating
    http://www.zope.org/SiteIndex/searchForm

* Fix it so check-in messages include diffs.  (jools@jools.org)

* Add ElementLocator to contrib/ directory (from Alfred Lopez)

* Write a guide for contributors.  Short summary:
    Follow Sun's coding guidelines, use 4-space (no tab) indents, no lines
    longer than 80 characters

* Consider a builder for a read-only document.  It could "intern" objects
  to reduce memory consumption.  In fact, interning may be good for String
  objects regardless.

* Consider having the license be clear org.jdom is a protected namespace.

--- WILD IDEAS ---

* Figure out if there's a role for a Node interface.  It sounds easy but
  all attempts so far have hit obstacles.  Amy Lewis talks about it here:
    http://lists.denveronline.net/lists/jdom-interest/2000-December/004016.html
  There were many follow-on threads.

* Think about somewhat crazy idea of using more inheritance in JDOM to
  allow lightweight but not XML 1.0 complete implementations.  For example
  Element could have a superclass "CommonXMLElement" that supported only
  what Common XML requires.
  Builders could build such elements to be faster and lighter than full
  elements, perfect for things like reading config files.  Lots of
  difficulties with this design though.

* Look at Xerces parser features (http://apache.org/xml/features/dom) for
  ideas on things that may be needed.
    http://xml.apache.org/xerces-j/features.html

* Create a JDOM logo.

* Create a Verifier lookup table as an int[256] growable to int[64K] where
  bits in the returned value indicate that char's ability to be used for a
  task.  So "lookup[(int)'x'] & LETTER_MASK" tells us if it's a letter or
  not.

* Use new Ant regexp task for more efficient JDK 1.1 package renaming.

* Consider elt.getTreeText() which would recursively get the text (in
  order) for the subtree, effectively ripping out intervening Elements.
  (Suggested by Bob to help with XPath.)

* Shouldn't addNamespaceDeclaration() have a name to match
  getAdditionalNamespaces()?

* Consider an HTMLBuilder that reads not-necessarily-well-formed HTML and
  produces a JDOM Document.  The approach I'd suggest is to build on top of
  JTidy first.  That gives a working implementation fast, at the cost of a
  157K Tidy.jar in the distribution.  After that, perhaps someone would
  lead an effort to change the JTidy code to build a JDOM Document
  directly, instead of making a DOM Document or XML stream first.  That
  would be a lot faster, use less memory, and make our dist smaller.  See
    http://www.sourceforge.net/projects/jtidy
  for Tidy.

* Look at a (contrib?) outputter option using SAX filters per
    http://lists.denveronline.net/lists/jdom-interest/2000-October/003303.html
    http://lists.denveronline.net/lists/jdom-interest/2000-October/003304.html
    http://lists.denveronline.net/lists/jdom-interest/2000-October/003318.html
    http://lists.denveronline.net/lists/jdom-interest/2000-October/003535.html

* Look at event-based parsing as per the following thread:
    http://lists.denveronline.net/lists/jdom-interest/2000-November/003613.html
  and replies.
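
  Sketch for the Verifier lookup table item above: a minimal version of the
  idea, assuming a table that starts at 256 entries and grows once to 64K
  on the first lookup outside that range.  Only LETTER_MASK comes from the
  item itself; the class CharTable, the other masks, and the use of
  java.lang.Character as a stand-in for the XML spec's character classes
  are assumptions made for illustration.

```java
// Hypothetical sketch, not the JDOM Verifier.
final class CharTable {
    static final int LETTER_MASK = 1;           // char counts as a letter
    static final int DIGIT_MASK = 1 << 1;       // char counts as a digit
    static final int NAME_START_MASK = 1 << 2;  // char may start a name

    private int[] lookup = build(256);

    // Fills a table of the given size; Character.isLetter()/isDigit() stand
    // in for the XML 1.0 Letter and Digit productions here.
    private static int[] build(int size) {
        int[] table = new int[size];
        for (int c = 0; c < size; c++) {
            int bits = 0;
            if (Character.isLetter(c)) bits |= LETTER_MASK | NAME_START_MASK;
            if (Character.isDigit(c)) bits |= DIGIT_MASK;
            if (c == '_' || c == ':') bits |= NAME_START_MASK;
            table[c] = bits;
        }
        return table;
    }

    // Tests one capability bit, growing the table to 64K on demand.
    boolean is(char c, int mask) {
        if (c >= lookup.length) {
            lookup = build(1 << 16);
        }
        return (lookup[c] & mask) != 0;
    }

    int size() {
        return lookup.length;
    }

    public static void main(String[] args) {
        CharTable table = new CharTable();
        System.out.println(table.is('x', LETTER_MASK));
        System.out.println(table.is('7', LETTER_MASK));
    }
}
```

  Documents that stay within Latin-1 never pay for the 64K table, which is
  presumably the point of starting small.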

* Considering that local vars are considerably faster than instance vars,
  test if using local vars can speed building.

* Consider a builder.setFeature() pass-through method that allows any
  features to be set that aren't in the http://xml.org namespace.  Features
  in http://xml.org should not be touched, because either we have specific
  requirements for them to be set one way, or we have the feature exposed
  through a Java method.

* Consider Mike Jennings' proposal of two new methods on Element:
    public String getAttributeValue(String name, String def)
    public String getAttributeValue(String name, Namespace ns, String def)
  (The parameter is called "def" here because "default" is a reserved word
  in Java.)
    http://lists.denveronline.net/lists/jdom-interest/2000-December/004155.html

* Consider using a List of instance data so elements only use what they
  really need (saving attrib list, namespace list)

* Consider Element.hasAttributes() and Element.isEmpty()

* Investigate doc.getDescription() to let people add doc descriptions.
  It's an idea from IBM's parser suggested by andyk.

* Work on creating a deferred builder that parses only what's necessary to
  satisfy the programmer's requests.  See Ayal Spitz' post at
    http://lists.denveronline.net/lists/jdom-interest/2001-April/005685.html
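
  Sketch for the getAttributeValue() with default and hasAttributes() items
  above.  To stay self-contained this uses a stand-in class backed by a
  Map rather than org.jdom.Element; the class AttrHolder and its
  setAttribute() are hypothetical, and the Namespace overload is left out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an element's attribute storage, not org.jdom.Element.
final class AttrHolder {
    private final Map<String, String> attributes = new HashMap<>();

    AttrHolder setAttribute(String name, String value) {
        attributes.put(name, value);
        return this;
    }

    // Returns the attribute value, or null if no such attribute exists.
    String getAttributeValue(String name) {
        return attributes.get(name);
    }

    // Proposed overload: returns def instead of null when the attribute is missing.
    String getAttributeValue(String name, String def) {
        String value = getAttributeValue(name);
        return value != null ? value : def;
    }

    // Proposed convenience check; avoids exposing the attribute list just to test it.
    boolean hasAttributes() {
        return !attributes.isEmpty();
    }

    public static void main(String[] args) {
        AttrHolder elt = new AttrHolder().setAttribute("orderid", "100");
        System.out.println(elt.getAttributeValue("orderid", "none"));
        System.out.println(elt.getAttributeValue("status", "none"));
    }
}
```

  The overload mainly saves callers a null check when reading optional
  attributes such as config settings.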