Bug #1427

closed

xml_index constrains depth of paths that can be inserted

Added by Matt Jones about 20 years ago. Updated about 20 years ago.

Status: Resolved
Priority: Immediate
Assignee:
Category: metacat
Target version:
Start date: 03/30/2004
Due date:
% Done: 0%
Estimated time:
Bugzilla-Id: 1427

Description

When an XML document contains a deeply nested structure, metacat accepts the
document for storage in xml_nodes, but the subsequent indexing phase throws an
exception: the composite paths to the deep nodes are too long to fit in the
path column of the xml_index table. This column was limited to a few hundred
characters so that it could be indexed (Oracle had a limit on the total
indexable width of columns).
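
As a rough illustration of the failure mode (not metacat's actual indexing
code), the sketch below builds the slash-separated path to every element of a
document and reports the paths that would overflow a fixed-width column. The
limit of 200 characters, the helper names, and the nested test document are
all assumptions made for the example; the report only says the column held
"a few hundred characters".

```python
import xml.etree.ElementTree as ET

# Hypothetical column width; the real xml_index limit is not stated in this report.
MAX_PATH_LEN = 200


def element_paths(xml_text):
    """Yield the slash-separated path from the root to every element."""
    root = ET.fromstring(xml_text)
    stack = [(root, root.tag)]
    while stack:
        elem, path = stack.pop()
        yield path
        for child in elem:
            stack.append((child, path + "/" + child.tag))


def paths_too_long(xml_text, limit=MAX_PATH_LEN):
    """Return the paths that would not fit in a column of `limit` characters."""
    return [p for p in element_paths(xml_text) if len(p) > limit]


def nested_doc(depth, name="taxonomicClassification"):
    """Build a test document with `depth` copies of one element nested in each other."""
    return "<eml>" + f"<{name}>" * depth + f"</{name}>" * depth + "</eml>"


if __name__ == "__main__":
    for depth in (3, 10):
        overflow = paths_too_long(nested_doc(depth))
        print(f"depth {depth}: {len(overflow)} path(s) exceed {MAX_PATH_LEN} characters")
```

With long element names, each nesting level adds a couple of dozen characters
to the path, so a document only needs to be about ten levels deep before its
paths pass a limit of this size.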

These problems were discovered and reported by Wade Sheldon (GCE LTER) when he
submitted EML documents with fully filled out taxonomic coverage entries. We
definitely need to support realistically filled out EML documents.

So, there are two possible solutions:
1) make the column much wider
-- this is only a partial solution, because the column still might not be big
enough for very deep docs or docs with long element names
-- if it's wider, it may no longer be indexable, and being indexable is the
reason the column exists
2) eliminate the dependency on the xml_index table altogether
-- the recursive search needed isn't that much slower, and may not be
slower at all as we tune the database
-- insert/update/delete should be MUCH faster
-- simpler database structure
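
To make option (2) concrete, here is a minimal sketch of matching a path
without a precomputed path column, by walking parent links from each
candidate node. The in-memory NODES table and the function names are
hypothetical stand-ins for illustration; they are not metacat's schema or
query code, and a real implementation would run against the database.

```python
# Simplified stand-in for a node table: node_id -> (parent_id, element name).
# Hypothetical layout for illustration only.
NODES = {
    1: (None, "eml"),
    2: (1, "dataset"),
    3: (2, "title"),
    4: (2, "coverage"),
    5: (4, "title"),
}


def matches_path(node_id, path, nodes=NODES):
    """Return True if the node's ancestor chain ends with `path`.

    The path is checked from its last step backwards, following the
    parent link one level per step, so no stored full path is needed.
    """
    for name in reversed(path.split("/")):
        if node_id is None:
            return False  # ran out of ancestors before the path was consumed
        parent_id, node_name = nodes[node_id]
        if node_name != name:
            return False
        node_id = parent_id
    return True


def find_nodes(path, nodes=NODES):
    """Return the ids of all nodes reachable by `path`, in ascending order."""
    return sorted(n for n in nodes if matches_path(n, path, nodes))


if __name__ == "__main__":
    print(find_nodes("dataset/title"))
    print(find_nodes("title"))
```

The cost of a lookup here grows with the length of the query path rather than
with the depth of the document, and nothing has to be written at insert time,
which is the trade-off the list above describes.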

We have decided to pursue (2) above because of the advantages listed. Rather
than completely removing the xml_index code, we are going to make its use
optional and ship with it turned off by default.
