This file is a combination of bugs and potential future enhancements. It is not always clear which is which...
Title: Reuse of prepared SELECT statements with JDBC-ODBC bridge and Access doesn't work
Description: Either the JDBC-ODBC bridge or the MS Access driver does not correctly support the reuse of prepared SELECT statements. As a consequence, the code to reuse these statements has been commented out. If your driver does correctly support prepared SELECT statements, see the comments in Map.checkInSelectStmt for how to increase performance.
Title: KeyGeneratorImpl doesn't work with Microsoft Access
Description: KeyGeneratorImpl doesn't work with Microsoft Access. The problem is that the transaction that updates the XMLDBMSKey table is never committed, in spite of a call to do this. As a workaround, comment out the line of code that sets auto-commit to false. Presumably, this means there is a possibility that two users could receive the same high key value. Note that it is not clear whether the problem is with Microsoft Access itself or the ODBC driver supplied by Microsoft.
Title: Namespace prefix "lost" if more than one prefix defined for same namespace
Description: If more than one prefix is declared for the same namespace in SubsetToDTD.convertDocument or SubsetToDTD.convertExternalSubset, then the prefix actually used is the first one encountered in the DTD.
Title: Order not supported when foreign key in parent
Description: When the foreign key in a parent/child relationship between two element types-as-classes is in the table of the parent element type, order information can be saved for the child element type but not retrieved. For example, consider the following:
   <SalesOrder>
      <Number>123</Number>
      <Customer>
         <Name>ABC Industries</Name>
         <Address>123 Main St., Chicago</Address>
      </Customer>
   </SalesOrder>
If the foreign key used to join the sales order and customer tables is in the sales order table, then no information about the order in which the Customer element appears in the SalesOrder element can be retrieved from the database. The problem here is that the code to construct SELECT statements and the code to access data in result sets both assume that SELECTs are done over a single table. In this case, the order column for the Customer element is stored in the sales order table, so a join to the sales order table needs to be done when retrieving customer data. Fixing this is likely to require some rethinking of the way in which the statement-generation code in Map and the data retrieval code in Row (as well as elsewhere) work.
Title: Root table assumed to have a key
Description: If the root table does not have a candidate key, then the DocumentInfo object returned by DOMToDBMS.storeDocument() cannot be used to retrieve the data. It is also quite likely that some code will fail, although this has not been tested.
Title: Columns with markup not expanded
Description: When a column containing markup is retrieved from the database, it is not expanded into the corresponding DOM nodes. Instead, it remains as marked up text.
Title: Markup not escaped on serialization
Description: Any markup that occurs when serializing a Map or DDML document is not replaced with entities as it should be. XMLOutputStream.characters needs to be fixed.
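As an illustration of the kind of replacement meant here, the following is a minimal sketch of escaping markup characters in character data. The class and method names (MarkupEscaper, escapeText) are hypothetical and are not part of XML-DBMS; how the fix would actually fit into XMLOutputStream.characters is not shown.

```java
/** Hypothetical helper for illustration only; not part of XML-DBMS. */
final class MarkupEscaper {
    private MarkupEscaper() {}

    /**
     * Returns a copy of text in which the characters that XML treats as
     * markup in character data are replaced with predefined entity references.
     */
    static String escapeText(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;");  break;
                case '>': out.append("&gt;");  break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }
}
```

With this sketch, MarkupEscaper.escapeText("a<b") yields a&lt;b. Attribute values would additionally need quote characters escaped, which this sketch does not handle.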
Title: Markup characters not serialized correctly
Description: The code that serializes the contents of a DOM node needs to know how to serialize markup characters that are not true markup. For example, if an element contains "a<b", how should the < character be serialized? If it is serialized as a < character, it will generate an error when it is retrieved from the database, as the code will try to parse it as markup. If it is serialized as the entity &lt;, the data in the database cannot be searched for the expected text, but must instead be searched for the &lt; entity usage. Which route to follow should be left as a choice for the user -- probably designated on a per-element-type basis in the mapping document.
Title: Duplicate keys not handled correctly?
Description: If the code encounters a duplicate key value when inserting data into the database, it returns an error. Although this makes sense in some cases -- for example, it is clearly an error to insert the same sales order twice -- it does not make sense in others. For example, suppose part information is sent with each sales order. Because multiple sales orders can refer to the same part, all sales orders after the first that refer to a given part will fail because they get a duplicate key error inserting the part information. One possible solution to this is to ignore duplicate key errors when the foreign key in the parent/child relationship is in the table of the parent element; in such a situation, the contents of the child and all its children would presumably be ignored. However, because it is not clear what to do in this case, the best solution for the moment is to simply throw the duplicate key error and find out how people are using the software and what they expect.
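Until the behavior is settled, calling code that wants to treat a duplicate key differently from other failures has to recognize it from the thrown SQLException. The following is a rough sketch, assuming the driver reports a standard SQLSTATE value; the class and method names are hypothetical and not part of XML-DBMS. Note that SQLSTATE class 23 covers all integrity constraint violations, not only duplicate keys, and that some drivers report no SQLSTATE at all.

```java
import java.sql.SQLException;

/** Hypothetical helper for illustration only; not part of XML-DBMS. */
final class DuplicateKeyCheck {
    private DuplicateKeyCheck() {}

    /**
     * Returns true if the exception, or any exception chained to it, carries
     * an SQLSTATE in class 23 (integrity constraint violation). A duplicate
     * key is one such violation; NOT NULL and foreign key failures are others.
     */
    static boolean isIntegrityViolation(SQLException e) {
        for (SQLException cur = e; cur != null; cur = cur.getNextException()) {
            String state = cur.getSQLState();
            if (state != null && state.startsWith("23")) {
                return true;
            }
        }
        return false;
    }
}
```

A caller could catch the SQLException thrown while storing a document, test it with isIntegrityViolation, and then decide whether to skip the child data or rethrow.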
Title: Empty elements-as-classes inserted as row of all NULLs
Description: If an element-as-class is empty and has no attributes, the code (probably) inserts a row with all NULLs into the database, which will often cause an INSERT error. It is not clear whether this is the correct behavior.
Title: SubsetToDTD ignores encoding declarations
Description: SubsetToDTD ignores encoding declarations, instead depending on the native abilities of Java to decode the bytes. This will most likely be encountered while using MapFactory_DTD.
Some of these changes are probably not that difficult; others will require a lot of work. If you are interested in implementing any of these, you might want to send me email first, as I can often explain what needs to be done and where the potential problems lie.
Title: Pass-through classes
Description: Pass-through is a way to compress structure that exists in the XML document but not in the database. The simplest example of pass-through is the IgnoreRoot element in the XML-DBMS mapping language. However, much more sophisticated types of pass-through are possible as well. For example, imagine that you have an Address element inside a Customer element and the Address element contains Street, City, PostCode, etc. elements. In the object-tree view supported by XML-DBMS, Address would generally require its own table. However, in many cases it is desirable to eliminate the Address element and store Street, City, PostCode directly in the Customer table.
Pass-through like this is relatively easy when transferring data from XML documents to the database, but can prove impossible in the other direction, as multiple XML structures can be mapped to the same database structure. More sophisticated types of pass-through are imaginable as well. For more information, see PassThrough.txt.
Title: Binary data not supported
Description: Binary data is not supported either internally (Base64) or externally (unparsed entities).
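For the internal (Base64) case, the conversion itself is simple. The sketch below assumes a Java version that provides java.util.Base64 (Java 8 or later); the class and method names are hypothetical, and how the result would be wired into the mapping and data transfer code is not shown.

```java
import java.util.Base64;

/** Hypothetical helper for illustration only; not part of XML-DBMS. */
final class BinaryText {
    private BinaryText() {}

    /** Encodes a binary column value as Base64 text for use as element content. */
    static String toBase64(byte[] columnValue) {
        return Base64.getEncoder().encodeToString(columnValue);
    }

    /** Decodes Base64 element content back into the bytes to store in the column. */
    static byte[] fromBase64(String elementContent) {
        // The MIME decoder ignores line breaks, which Base64 content in
        // XML documents often contains.
        return Base64.getMimeDecoder().decode(elementContent);
    }
}
```

For example, BinaryText.toBase64(new byte[] {1, 2, 3}) returns "AQID", and fromBase64 applied to that string returns the original three bytes.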
Title: CREATE TABLE statements inadequate
Description: Currently, CREATE TABLE statements have a number of drawbacks. 1) They do not correctly support DECIMAL and NUMERIC columns. 2) They use hard-coded data type names instead of querying the database for these names. 3) They do not state whether columns are nullable. 4) They do not include primary key / foreign key constraints. None of these should be difficult to fix except possibly the primary key / foreign key constraints.
Title: Map construction code is a mess
Description: The code in the Temp*Map classes is a mess, especially TempMap. (The map factory code is generally pretty good.) The problem is that the Temp*Map classes do not have well-defined interfaces, allowing direct access to class variables instead. Thus, writing map factories can be very confusing and the maps themselves never really track whether they are in a valid or invalid state. Furthermore, the code is poorly commented, especially TempMap.java. For more information, see TempMap.txt.
Title: No pretty printing
Description: The code that serializes the contents of a DOM node needs to have pretty-printing options: a) indent nested elements, b) normalize carriage return/line feeds as spaces, c) insert line breaks at a specified line length.
Title: Code to order DOM nodes very inefficient
Description: When retrieving data from the database, the code that inserts ordered child nodes into a parent node is very inefficient. It uses a linear search to determine where to place the child. This code should be rewritten, possibly using a binary search.
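The following is a minimal sketch of the binary search idea, not the existing XML-DBMS code. It keeps the order values of the children already placed in a sorted list and searches that list for the insertion position. The class OrderedChildren and its methods are hypothetical; in the real code the returned index would presumably be used to pick the sibling to insert before in the DOM parent.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical container for illustration only; not part of XML-DBMS. */
final class OrderedChildren<T> {
    private final List<Long> orders = new ArrayList<>();
    private final List<T> children = new ArrayList<>();

    /**
     * Inserts child so that order values stay in ascending order and returns
     * the index at which it was placed. Children with equal order values
     * keep the sequence in which they were inserted.
     */
    int insert(long order, T child) {
        int lo = 0;
        int hi = orders.size();
        // Binary search for the first position whose order value is greater
        // than the new one (an "upper bound" search).
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (orders.get(mid) <= order) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        orders.add(lo, order);
        children.add(lo, child);
        return lo;
    }

    /** Returns the children in ascending order of their order values. */
    List<T> children() {
        return children;
    }
}
```

The search takes O(log n) comparisons per child instead of O(n). Inserting into an ArrayList still shifts elements, so this sketch reduces comparisons rather than all of the work.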
Title: Number formats
Description: Currently, there is no way to specify the format of numbers in the XML document. It should be relatively easy to support this in a manner similar to date formats through an option in the mapping file. See the XMLDBMS DTD, MapFactory_MapDocument, Parameters, and DBMSToDOM for ideas.
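One possible basis for this is java.text.DecimalFormat, which takes a pattern string in much the same way that date formats do. The sketch below is only an assumption about how such an option might be applied; the class and method names are hypothetical, and the way a pattern would be declared in the mapping document is not shown.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

/** Hypothetical helper for illustration only; not part of XML-DBMS. */
final class NumberFormatSketch {
    private NumberFormatSketch() {}

    /**
     * Builds a formatter from a pattern (for example "#,##0.00") and a
     * locale. Both values are assumed to come from the mapping document.
     */
    static DecimalFormat formatter(String pattern, Locale locale) {
        return new DecimalFormat(pattern, DecimalFormatSymbols.getInstance(locale));
    }

    /** Formats a numeric column value as the text to place in the XML document. */
    static String toXmlText(double columnValue, String pattern, Locale locale) {
        return formatter(pattern, locale).format(columnValue);
    }
}
```

For example, NumberFormatSketch.toXmlText(1234.5, "#,##0.00", Locale.US) returns "1,234.50". The same DecimalFormat object can parse text from the XML document back into a number when transferring data to the database.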
Title: Per-property formats
Description: In many cases, it is useful to assign date/time or number formats on a per-property basis. This is most easily done by adding a Name attribute to the formats described in the mapping document (type ID) and optional Formats subelements to property maps. The Formats subelement would have IDREF attributes for the number and date/time formats to use (e.g. Number, DateTime). These must refer to formats defined in the Options element. If present, they would be used; if not present, the default format (defined in the Options element) would be used.
Title: Rewrite DOMToDBMS as SAX application
Description: It is possible that DOMToDBMS can be rewritten as a SAX application. If done, the result should be named SAXToDBMS and DOMToDBMS left in the package, as it will sometimes be useful to pass a DOM tree to the database. For more information, see SAXToDBMS.txt.