Revision 3148

As part of the fix for

http://bugzilla.ecoinformatics.org/show_bug.cgi?id=2469

I've changed DocumentImpl.java in three locations:

  buildIndex()
  traverseParents()
  updatePathIndex()

This patch modifies buildIndex(). Like the prior two patches, it changes
the pathsFound Vector to a HashMap. This HashMap associates node ids, paths,
and the node data for each node being processed. The prior code would at times
incorrectly index ATTRIBUTE nodes, because it was not getting the node data
value from the correct location (the same database tuple for ATTRIBUTEs,
as opposed to a child TEXT tuple for ELEMENTs).
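
As a rough illustration of the distinction described above, the sketch below looks up the value to index for a node. It is not Metacat code: SketchNode is a hypothetical, simplified stand-in for NodeRecord, and dataFor() is an invented helper that only shows where the value lives for each node type.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for Metacat's NodeRecord (assumption:
// the real class has more fields and uses getters/setters).
class SketchNode {
    final long nodeId;
    final long parentId;
    final String nodeType;  // "ELEMENT", "ATTRIBUTE" or "TEXT"
    final String nodeName;
    final String nodeData;  // null when the tuple itself carries no value

    SketchNode(long nodeId, long parentId, String nodeType,
               String nodeName, String nodeData) {
        this.nodeId = nodeId;
        this.parentId = parentId;
        this.nodeType = nodeType;
        this.nodeName = nodeName;
        this.nodeData = nodeData;
    }
}

class NodeDataLookup {
    // Invented helper: returns the value that should be indexed for a node.
    static String dataFor(SketchNode node, Map<Long, SketchNode> records) {
        if (node.nodeType.equals("ATTRIBUTE")) {
            // An ATTRIBUTE keeps its value in its own tuple.
            return node.nodeData;
        }
        if (node.nodeType.equals("ELEMENT")) {
            // An ELEMENT's value lives in a child TEXT tuple.
            for (SketchNode candidate : records.values()) {
                if (candidate.parentId == node.nodeId
                        && candidate.nodeType.equals("TEXT")) {
                    return candidate.nodeData;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<Long, SketchNode> records = new HashMap<>();
        records.put(1L, new SketchNode(1, 0, "ELEMENT", "title", null));
        records.put(2L, new SketchNode(2, 1, "ATTRIBUTE", "lang", "en"));
        records.put(3L, new SketchNode(3, 1, "TEXT", null, "Kelp survey"));

        System.out.println(dataFor(records.get(2L), records)); // attribute value
        System.out.println(dataFor(records.get(1L), records)); // child TEXT value
    }
}
```

Reading an ATTRIBUTE's value from a child tuple, or an ELEMENT's value from its own tuple, would return the wrong data or none at all, which is the kind of mis-indexing the patch addresses.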

I've changed buildIndex() to process ATTRIBUTE nodes and TEXT nodes rather
than ELEMENT nodes, since both are leaf nodes and the leaf nodes are where the
data values reside. The major change is that the parent (ELEMENT) nodes of TEXT
nodes are now traversed with traverseParents(), but the TEXT node's nodeData and
nodeDataNumerical are first set in the parent's NodeRecord so that they will be
indexed correctly.
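
The TEXT handling described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the actual traverseParents() or updatePathIndex() code: PathNode is a hypothetical stand-in for NodeRecord, pathTo() and indexTextNode() are invented names, and the "/parent/child" path format and the path-to-value map are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for Metacat's NodeRecord.
class PathNode {
    final long nodeId;
    final long parentId;
    final String nodeType;  // "ELEMENT", "ATTRIBUTE" or "TEXT"
    final String nodeName;
    String nodeData;        // mutable so a TEXT value can be copied to the parent

    PathNode(long nodeId, long parentId, String nodeType,
             String nodeName, String nodeData) {
        this.nodeId = nodeId;
        this.parentId = parentId;
        this.nodeType = nodeType;
        this.nodeName = nodeName;
        this.nodeData = nodeData;
    }
}

class ParentWalkSketch {
    // Invented helper: builds "/root/.../element" by following parent ids
    // upward until the root element is reached.
    static String pathTo(PathNode element, long rootNodeId,
                         Map<Long, PathNode> records) {
        String path = "";
        PathNode current = element;
        while (current != null) {
            path = "/" + current.nodeName + path;
            if (current.nodeId == rootNodeId) {
                break;
            }
            current = records.get(current.parentId);
        }
        return path;
    }

    // Invented helper: for a non-empty TEXT leaf, copy its data onto the
    // parent ELEMENT, then record the parent's path with that data.
    static void indexTextNode(PathNode text, long rootNodeId,
                              Map<Long, PathNode> records,
                              Map<String, String> pathsFound) {
        PathNode parent = records.get(text.parentId);
        if (parent == null || !parent.nodeType.equals("ELEMENT")
                || text.nodeData == null || text.nodeData.isEmpty()) {
            return;
        }
        parent.nodeData = text.nodeData;
        pathsFound.put(pathTo(parent, rootNodeId, records), parent.nodeData);
    }

    public static void main(String[] args) {
        Map<Long, PathNode> records = new HashMap<>();
        records.put(1L, new PathNode(1, 0, "ELEMENT", "eml", null));
        records.put(2L, new PathNode(2, 1, "ELEMENT", "title", null));
        records.put(3L, new PathNode(3, 2, "TEXT", null, "Kelp survey"));

        Map<String, String> pathsFound = new HashMap<>();
        indexTextNode(records.get(3L), 1L, records, pathsFound);
        System.out.println(pathsFound); // prints {/eml/title=Kelp survey}
    }
}
```

Copying the value before walking the parents means the ELEMENT path is recorded together with the data that belongs to it, so no separate lookup of the TEXT child is needed later.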

View differences:

src/edu/ucsb/nceas/metacat/DocumentImpl.java
@@ -1293,14 +1293,16 @@
             deleteNodeIndex(dbConn);
 
             // Step through all of the node records we were given
-            // and build the new index and update the database
+            // and build the new index and update the database. Process
+            // TEXT nodes with their parent ELEMENT node ids to associate the
+            // element with it's node data (stored in the text node)
             it = nodeRecordLists.iterator();
-            Vector pathsFound = new Vector();;
+            HashMap pathsFound = new HashMap();
             while (it.hasNext()) {
                 NodeRecord currentNode = (NodeRecord) it.next();
                 HashMap pathList = new HashMap();
-                if (currentNode.getNodeType().equals("ELEMENT") ||
-                    currentNode.getNodeType().equals("ATTRIBUTE")){
+                if ( currentNode.getNodeType().equals("ELEMENT") ||
+                     currentNode.getNodeType().equals("ATTRIBUTE") ){
 
                     if (atRootElement) {
                         rootNodeId = currentNode.getNodeId();
@@ -1311,13 +1313,31 @@
                                     currentNode.getNodeId(),
                                     "", pathList, pathsFound);
                     updateNodeIndex(dbConn, pathList);
-                } else if (currentNode.getNodeType().equals("TEXT")){
-                	if(!currentNode.getNodeData().trim().equals("")
-                			&& !pathsFound.isEmpty()){
-                		updatePathIndex(dbConn, currentNode, pathsFound);
-                    	pathsFound.removeAllElements();
-                    }
+                } else if ( currentNode.getNodeType().equals("TEXT") ) {
+
+                  // Set the parent node's nodedata and nodedatanumerical to
+                  // that of the leaf TEXT node being processed, and then
+                  // traverse the parents starting from the parent node.  The
+                  // xml_path_index table will be populated with ELEMENT paths
+                  // with TEXT nodedata (since it's modeled this way in the DOM)
+                  NodeRecord parentNode =
+                   (NodeRecord) nodeRecordMap.get(currentNode.getParentNodeId());
+                  if ( parentNode.getNodeType().equals("ELEMENT") &&
+                       !currentNode.getNodeData().equals("") ) {
+                    parentNode.setNodeData(currentNode.getNodeData());
+                    parentNode.setNodeDataNumerical(currentNode.getNodeDataNumerical());
+
+                  	traverseParents(nodeRecordMap, rootNodeId,
+                                    currentNode.getNodeId(),
+                                    parentNode.getNodeId(),
+                                    "", pathList, pathsFound);
+                  }
                 }
+                // Lastly, update the xml_path_index table
+                if(!pathsFound.isEmpty()){
+              		updatePathIndex(dbConn, pathsFound);
+                  pathsFound.clear();
+                }
             }
 
             dbConn.commit();
@@ -1352,7 +1372,8 @@
 
     /**
      * Recurse up the parent node hierarchy and add each node to the
-     * hashmap of paths to be indexed.
+     * hashmap of paths to be indexed.  Note: pathsForIndexing is a hash map of
+     * paths
      *
      * @param records the set of records hashed by nodeId
      * @param rootNodeId the id of the root element of the document
