I originally tried to write DOMToDBMS as a SAX application but concluded
that it was impossible. However, this conclusion was based on the
assumption that the class did no caching of data and, in this case, it
was impossible. However, if some data is cached as Row objects, it is
quite possible that SAXToDBMS is possible. The problem that needs to
be solved is as follows.

Consider the sample Sales Order document. This has data for four tables: 
Sales, Customers, Lines, and Parts.  Now consider the key relationships: 
Sales has a foreign key that points to Customers and a primary key that 
points to Lines, and Lines has a foreign key that points to Parts.  The 
trick is that, to meet referential integrity constraints in the database, 
you need to add the rows with primary keys first. Thus, you need to add the 
Customers row before you can add the Sales row, you need to add the Sales 
row before you can add the Lines row, and you need to add the Parts row 
before you can add the Lines row.

The tree-walking algorithm I use is as follows. Start from the root. When 
you encounter a node mapped to a table, such as the SalesOrder node, create 
a buffer for a row in that table, then processes the children of that node. 
If a child is mapped to a column, such as the OrderDate node, store the 
node's value in the buffer. If a child is mapped to a table that stores a 
foreign key in the current table, such as the Customer node, process the 
node immediately and store the foreign key in the current row buffer. If a 
child is mapped to a table that uses the primary key of the current table, 
such as the Line nodes, save the node is for later processing, which you do 
after inserting the parent node, so you can pass the primary key to the 
child node.

In my original SAX implementation, I did not use row buffers. Thus, the 
implementation was clearly impossible -- I would, for example, have to 
immediately process (and insert) the sales data, which comes before the 
customer data. This leads to referential integrity problems. However, 
you've got me thinking now and it's quite possible you might be able to 
build the row buffers as you go, then insert them in the correct order. If 
so, this would be terrific.

Anyway, the general strategy is something along the lines of:

**********
In startElement().
1) Get the map for the element.

2) If the element is mapped as a class:
a) create a new Row for the element.
b) read the attributes linearly, getting each name and value.
c) get the property map for each attribute. if it is stored as TOCOLUMN, 
store the value in the current Row. if it is stored in as TOPROPERTYTABLE, 
create a Row for the property table and store the value in it.

3) If the element is mapped as a property then, if the element is mapped as 
TOCOLUMN, do nothing now. If it is mapped as TOPROPERTYTABLE, create a Row 
for the property table.

********
In characters():

1) If you are processing an element mapped as a property then:
a) If the element is mapped as TOCOLUMN, store the value in the parent 
element's Row.
b) If the element is mapped as TOPROPERTYTABLE, store the value in the 
property table Row.
2) If you are processing an element mapped as a class, get the map for the 
PCDATA. If this is mapped as TOCOLUMN, store the value in the element's 
Row. If this is mapped as TOPROPERTYTABLE, create a Row for the property 
table and store the value in it.
3) IMPORTANT! A single piece of PCDATA can result in multiple calls to 
characters, so you need to deal with this somehow -- for example, appending 
data to column values rather than just creating a string and storing it in 
the Row.

*********
In endElement():
1) If the element is mapped as a property, I don't think there is anything 
to do.

2) If the element is mapped as a class, then (I'm thinking out loud here), 
by reaching the end of the element, you know that all child elements are 
done. You don't need to worry about child elements-as-properties, as 
processing for them is finished.  All you care about is child 
elements-as-classes and their offspring.  I think what you do is 
recursively process these as follows:
a) First process all child elements-as-classes that contain primary keys. 
These need to be inserted and their foreign keys stored in the parent row.
b) Next insert the parent row.
c) Finally, insert the primary key from the parent row into the children 
skipped in (a) and insert these children.

Note that when I talk about child elements-as-classes, I think I am also 
referring to property tables for the current parent.

This is the only really tricky part of the code, as these elements could be 
nested quite deep and you need to keep track of which have been inserted 
and which haven't. However, the more I talk about this, the more I believe 
it is possible. Furthermore, although you end up caching a certain amount 
of data, I think it's a good bet that this approaches the same memory 
requirements as the DOM for trees in which there are no siblings, only 
parents and children. (For example, if your tree has a root with 10000 
children, each of which has three children and no grandchildren, then 
assuming the root is ignored, you will, at most, only cache the original 
child and its three children at any one point in time. On the other hand, 
if your tree has only parents and children, then depending on the PK/FK 
relationships, you might have to cache the whole thing and insert the most 
deeply nested child first.  In any case, caching the rows and keeping their 
maps around is likely to be far less expensive than the equivalent DOM.)

The last thing to consider is that you need to keep order information about 
the children of each element-as-class. Because elements can be nested, 
there will be a stack of order information, as you need to maintain this on 
a per-level basis.  I don't think this will be too hard -- just a push 
every time you start an element-as-class, a pop every you end an 
element-as-class, and an increment every time you start an 
element-as-property or call to characters (remembering, of course, the 
problem with multiple calls to characters for a single piece of PCDATA).