Diff - Metacat - Ecoinformatics Redmine


Revision 3148

Added by Chris Jones almost 18 years ago

As part of a patch fix for:

http://bugzilla.ecoinformatics.org/show_bug.cgi?id=2469

I've changed DocumentImpl.java in three locations:

  buildIndex()
  traverseParents()
  updatePathIndex()

This patch modifies buildIndex(). Like the prior two patches, it changes
the pathsFound Vector to a HashMap. This HashMap associates node ids, paths,
and the node data for each node being processed. The prior code would at times
index ATTRIBUTE nodes incorrectly, because it was not getting the node
data value from the correct location (the same database tuple for ATTRIBUTEs,
as opposed to a child TEXT tuple for ELEMENTs).
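
To illustrate where the data value lives for each node type, here is a minimal
sketch. SketchNode and NodeDataLookup are hypothetical, simplified stand-ins
written for this note; they are not Metacat's NodeRecord or DocumentImpl, and
the field names are assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for a node tuple; only the fields
// needed for this illustration are modeled.
class SketchNode {
    final long nodeId;
    final long parentNodeId;
    final String nodeType;   // "ELEMENT", "ATTRIBUTE", or "TEXT"
    final String nodeName;
    final String nodeData;   // null when the tuple itself carries no data

    SketchNode(long nodeId, long parentNodeId, String nodeType,
               String nodeName, String nodeData) {
        this.nodeId = nodeId;
        this.parentNodeId = parentNodeId;
        this.nodeType = nodeType;
        this.nodeName = nodeName;
        this.nodeData = nodeData;
    }
}

class NodeDataLookup {
    // Returns the data value to index for a node, or null if none is found.
    static String dataFor(SketchNode node, Map<Long, SketchNode> records) {
        if (node.nodeType.equals("ATTRIBUTE")) {
            // ATTRIBUTE: the value is stored in the node's own tuple.
            return node.nodeData;
        }
        if (node.nodeType.equals("ELEMENT")) {
            // ELEMENT: the value is stored in a child TEXT tuple.
            // This sketch returns the first TEXT child it encounters.
            for (SketchNode candidate : records.values()) {
                if (candidate.nodeType.equals("TEXT")
                        && candidate.parentNodeId == node.nodeId) {
                    return candidate.nodeData;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<Long, SketchNode> records = new HashMap<>();
        records.put(1L, new SketchNode(1L, 0L, "ELEMENT", "title", null));
        records.put(2L, new SketchNode(2L, 1L, "TEXT", "#text", "Metacat"));
        records.put(3L, new SketchNode(3L, 1L, "ATTRIBUTE", "lang", "en"));
        System.out.println(dataFor(records.get(1L), records));
        System.out.println(dataFor(records.get(3L), records));
    }
}
```

Reading an ATTRIBUTE's value from a child TEXT tuple, as if it were an
ELEMENT, would find nothing, which is the kind of mismatch described above.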

I've changed buildIndex() to process ATTRIBUTE nodes and TEXT nodes rather than
ELEMENT nodes. Both are leaf nodes, which is where the data values reside. The
major change is that the parent (ELEMENT) nodes of TEXT nodes are now traversed
with traverseParents(), with the TEXT node's nodeData and nodeDataNumerical set
in the parent's NodeRecord so that they will be indexed correctly.
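
The idea of copying a TEXT node's data onto its parent ELEMENT and then walking
up the hierarchy can be sketched roughly as follows. PathNode, PathIndexSketch,
pathToRoot and collectPaths are hypothetical names invented for this note; the
real traverseParents() takes different arguments and does more work than this.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for a node record.
class PathNode {
    final long nodeId;
    final long parentNodeId;
    final String nodeType;   // "ELEMENT", "ATTRIBUTE", or "TEXT"
    final String nodeName;
    String nodeData;         // mutable so a TEXT value can be copied to the parent

    PathNode(long nodeId, long parentNodeId, String nodeType,
             String nodeName, String nodeData) {
        this.nodeId = nodeId;
        this.parentNodeId = parentNodeId;
        this.nodeType = nodeType;
        this.nodeName = nodeName;
        this.nodeData = nodeData;
    }
}

class PathIndexSketch {
    // Walks from startId up to rootId and returns the absolute path
    // of element names separated by "/".
    static String pathToRoot(Map<Long, PathNode> records, long rootId, long startId) {
        StringBuilder path = new StringBuilder();
        long id = startId;
        while (true) {
            PathNode node = records.get(id);
            if (node == null) {
                throw new IllegalArgumentException("No record for node id " + id);
            }
            path.insert(0, "/" + node.nodeName);
            if (id == rootId) {
                break;
            }
            id = node.parentNodeId;
        }
        return path.toString();
    }

    // For each non-empty TEXT node, copies its data onto the parent ELEMENT
    // and records the parent's absolute path together with that data.
    static Map<String, String> collectPaths(Map<Long, PathNode> records, long rootId) {
        Map<String, String> pathsFound = new LinkedHashMap<>();
        for (PathNode current : records.values()) {
            if (!current.nodeType.equals("TEXT")) {
                continue;
            }
            if (current.nodeData == null || current.nodeData.isEmpty()) {
                continue;
            }
            PathNode parent = records.get(current.parentNodeId);
            if (parent == null || !parent.nodeType.equals("ELEMENT")) {
                continue;
            }
            parent.nodeData = current.nodeData;
            pathsFound.put(pathToRoot(records, rootId, parent.nodeId), parent.nodeData);
        }
        return pathsFound;
    }

    public static void main(String[] args) {
        Map<Long, PathNode> records = new HashMap<>();
        records.put(1L, new PathNode(1L, 0L, "ELEMENT", "eml", null));
        records.put(2L, new PathNode(2L, 1L, "ELEMENT", "title", null));
        records.put(3L, new PathNode(3L, 2L, "TEXT", "#text", "Metacat"));
        System.out.println(collectPaths(records, 1L));
    }
}
```

This sketch keys the map by path alone, so repeated sibling elements with the
same path would overwrite one another; that is a limitation of the sketch and
one reason an association that also includes the node id is useful.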

                 deleteNodeIndex(dbConn);
                 // Step through all of the node records we were given
                 // and build the new index and update the database
                 // and build the new index and update the database. Process
                 // TEXT nodes with their parent ELEMENT node ids to associate the
                 // element with it's node data (stored in the text node)
                 it = nodeRecordLists.iterator();
                 Vector pathsFound = new Vector();;
                 HashMap pathsFound = new HashMap();
                 while (it.hasNext()) {
                     NodeRecord currentNode = (NodeRecord) it.next();
                     HashMap pathList = new HashMap();
                     if (currentNode.getNodeType().equals("ELEMENT") ||
                         currentNode.getNodeType().equals("ATTRIBUTE")){
                     if ( currentNode.getNodeType().equals("ELEMENT") ||
                          currentNode.getNodeType().equals("ATTRIBUTE") ){
                         if (atRootElement) {
                             rootNodeId = currentNode.getNodeId();
-...
                                         currentNode.getNodeId(),
                                         "", pathList, pathsFound);
                         updateNodeIndex(dbConn, pathList);
                     } else if (currentNode.getNodeType().equals("TEXT")){
                     	if(!currentNode.getNodeData().trim().equals("")
                     			&& !pathsFound.isEmpty()){
                     		updatePathIndex(dbConn, currentNode, pathsFound);
                         	pathsFound.removeAllElements();
+                        }
                     } else if ( currentNode.getNodeType().equals("TEXT") ) {
                       // Set the parent node's nodedata and nodedatanumerical to
                       // that of the leaf TEXT node being processed, and then
                       // traverse the parents starting from the parent node.  The
                       // xml_path_index table will be populated with ELEMENT paths
                       // with TEXT nodedata (since it's modeled this way in the DOM)
                       NodeRecord parentNode =
                        (NodeRecord) nodeRecordMap.get(currentNode.getParentNodeId());
                       if ( parentNode.getNodeType().equals("ELEMENT") &&
                            !currentNode.getNodeData().equals("") ) {
                         parentNode.setNodeData(currentNode.getNodeData());
                         parentNode.setNodeDataNumerical(currentNode.getNodeDataNumerical());
                       	traverseParents(nodeRecordMap, rootNodeId,
                                         currentNode.getNodeId(),
                                         parentNode.getNodeId(),
                                         "", pathList, pathsFound);
+                      }
+                    }
                     // Lastly, update the xml_path_index table
                     if(!pathsFound.isEmpty()){
                   		updatePathIndex(dbConn, pathsFound);
                       pathsFound.clear();
+                    }
+                }
                 dbConn.commit();
-...
         /**
          * Recurse up the parent node hierarchy and add each node to the
          * hashmap of paths to be indexed.
          * hashmap of paths to be indexed.  Note: pathsForIndexing is a hash map of
          * paths
+         *
          * @param records the set of records hashed by nodeId
          * @param rootNodeId the id of the root element of the document
