Indexing for Performance |
Back | Home | Next |
Metacat DB stores indeces for the current version of all documents in xml_index table. All relative and absolute paths to any node in a document are stored in this table. These paths are used for structured query searches to a specified location in an XML tree. Path expressions specified in the pathquery document are queried from this table.
Indeces become necessary because relational databases are not efficient at querying tree structures. By slightly denormalizing the database and listing the tree structure in a flat table, the relational database engine can more effectively handle large path queries.
Example:
<?xml version="1.0"?> <!DOCTYPE employee> <employee> <name> <first>Chad</first> <last>Berkley</last> </name> <address> <street>735 State St. Ste. 303</street> <city>Santa Barbara</city> <state>California</state> <zip>93101</zip> </address> <occupation> <title>Metadata Systems Developer</title> <location> <type> <grant> <name>DBA</name> <grantor>NSF</grantor> <PI>Jim Reichman</PI> <PI>Matt Jones</PI> <PI>Mark Schildhauer</PI> </grant> </type> <system>UC</system> <location>Santa Barbara</location> </location> </occupation> </employee>
The XML document above is logically deconstructed into all possible paths through the document. Each path (relative and absolute) is entered as a record in xml_index along with the nodeid of the deepest node. In this way, any path can be quickly queried. It's contents can be assertained from cross linking the nodeid back to the xml_nodes table.
PATH | NODEID | PARENTNODEID | |
occupation/location/type/grant/grantor | 200150 | 200145 | |
location/type/grant/grantor | 200150 | 200145 | |
grantor | 200150 | 200145 | |
location/type/grant/PI | 200153 | 200145 | |
grant/PI | 200153 | 200145 | |
/employee/occupation/location/type/grant/PI | 200153 | 200145 | |
employee/occupation/location/type/grant/PI | 200153 | 200145 | |
occupation/location/type/grant/PI | 200153 | 200145 | |
type/grant/PI | 200153 | 200145 | |
PI | 200153 | 200145 | |
location/type/grant/PI | 200156 | 200145 | |
grant/PI | 200156 | 200145 | |
/employee/occupation/location/type/grant/PI | 200156 | 200145 | |
employee/occupation/location/type/grant/PI | 200156 | 200145 | |
occupation/location/type/grant/PI | 200156 | 200145 | |
type/grant/PI | 200156 | 200145 | |
PI | 200156 | 200145 | |
location/type/grant/PI | 200159 | 200145 | |
grant/PI | 200159 | 200145 | |
/employee/occupation/location/type/grant/PI | 200159 | 200145 | |
employee/occupation/location/type/grant/PI | 200159 | 200145 | |
occupation/location/type/grant/PI | 200159 | 200145 | |
type/grant/PI | 200159 | 200145 | |
PI | 200159 | 200145 | |
occupation/location/system | 200164 | 200141 | |
system | 200164 | 200141 | |
employee/occupation/location/system | 200164 | 200141 | |
location/system | 200164 | 200141 | |
/employee/occupation/location/system | 200164 | 200141 | |
/employee/occupation/location/location | 200167 | 200141 | |
employee/occupation/location/location | 200167 | 200141 | |
occupation/location/location | 200167 | 200141 | |
location | 200167 | 200141 | |
location/location | 200167 | 200141 | |
/employee/name | 200112 | 200110 | |
name | 200112 | 200110 | |
employee/name | 200112 | 200110 | |
/employee/name/first | 200114 | 200112 | |
name/first | 200114 | 200112 | |
/employee | 200110 | 200109 | |
employee | 200110 | 200109 | |
employee/name/first | 200114 | 200112 | |
first | 200114 | 200112 | |
name/last | 200117 | 200112 | |
/employee/name/last | 200117 | 200112 | |
employee/name/last | 200117 | 200112 | |
last | 200117 | 200112 | |
employee/address | 200121 | 200110 | |
address | 200121 | 200110 | |
/employee/address | 200121 | 200110 | |
/employee/address/street | 200123 | 200121 | |
employee/address/street | 200123 | 200121 | |
address/street | 200123 | 200121 | |
street | 200123 | 200121 | |
employee/address/city | 200126 | 200121 | |
address/city | 200126 | 200121 | |
/employee/address/city | 200126 | 200121 | |
city | 200126 | 200121 | |
address/state | 200129 | 200121 | |
/employee/address/state | 200129 | 200121 | |
employee/address/state | 200129 | 200121 | |
state | 200129 | 200121 | |
employee/address/zip | 200132 | 200121 | |
zip | 200132 | 200121 | |
/employee/address/zip | 200132 | 200121 | |
address/zip | 200132 | 200121 | |
employee/occupation | 200136 | 200110 | |
/employee/occupation | 200136 | 200110 | |
occupation | 200136 | 200110 | |
/employee/occupation/title | 200138 | 200136 | |
employee/occupation/title | 200138 | 200136 | |
occupation/title | 200138 | 200136 | |
title | 200138 | 200136 | |
location | 200141 | 200136 | |
employee/occupation/location | 200141 | 200136 | |
/employee/occupation/location | 200141 | 200136 | |
occupation/location | 200141 | 200136 | |
occupation/location/type | 200143 | 200141 | |
type | 200143 | 200141 | |
/employee/occupation/location/type | 200143 | 200141 | |
location/type | 200143 | 200141 | |
employee/occupation/location/type | 200143 | 200141 | |
type/grant | 200145 | 200143 | |
occupation/location/type/grant | 200145 | 200143 | |
grant | 200145 | 200143 | |
location/type/grant | 200145 | 200143 | |
/employee/occupation/location/type/grant | 200145 | 200143 | |
employee/occupation/location/type/grant | 200145 | 200143 | |
grant/name | 200147 | 200145 | |
/employee/occupation/location/type/grant/name | 200147 | 200145 | |
occupation/location/type/grant/name | 200147 | 200145 | |
name | 200147 | 200145 | |
type/grant/name | 200147 | 200145 | |
location/type/grant/name | 200147 | 200145 | |
employee/occupation/location/type/grant/name | 200147 | 200145 | |
/employee/occupation/location/type/grant/grantor | 200150 | 200145 | |
type/grant/grantor | 200150 | 200145 | |
grant/grantor | 200150 | 200145 | |
employee/occupation/location/type/grant/grantor | 200150 | 200145 |
The following is the document in xml_nodes.
nodeid | nodetype | nodename | nodedata | rootnodeid | parentnodeid |
200164 | ELEMENT | system | 200109 | 200141 | |
200165 | TEXT | UC | 200109 | 200164 | |
200166 | TEXT | 200109 | 200141 | ||
200167 | ELEMENT | location | 200109 | 200141 | |
200168 | TEXT | Santa Barbara | 200109 | 200167 | |
200169 | TEXT | 200109 | 200141 | ||
200170 | TEXT | 200109 | 200136 | ||
200171 | TEXT | 200109 | 200110 | ||
200109 | DOCUMENT | employee | 200109 | ||
200110 | ELEMENT | employee | 200109 | 200109 | |
200111 | TEXT | 200109 | 200110 | ||
200112 | ELEMENT | name | 200109 | 200110 | |
200113 | TEXT | 200109 | 200112 | ||
200114 | ELEMENT | first | 200109 | 200112 | |
200115 | TEXT | Chad | 200109 | 200114 | |
200116 | TEXT | 200109 | 200112 | ||
200117 | ELEMENT | last | 200109 | 200112 | |
200118 | TEXT | Berkley | 200109 | 200117 | |
200119 | TEXT | 200109 | 200112 | ||
200120 | TEXT | 200109 | 200110 | ||
200121 | ELEMENT | address | 200109 | 200110 | |
200122 | TEXT | 200109 | 200121 | ||
200123 | ELEMENT | street | 200109 | 200121 | |
200124 | TEXT | 735 State St. Ste. 303 | 200109 | 200123 | |
200125 | TEXT | 200109 | 200121 | ||
200126 | ELEMENT | city | 200109 | 200121 | |
200127 | TEXT | Santa Barbara | 200109 | 200126 | |
200128 | TEXT | 200109 | 200121 | ||
200129 | ELEMENT | state | 200109 | 200121 | |
200130 | TEXT | California | 200109 | 200129 | |
200131 | TEXT | 200109 | 200121 | ||
200132 | ELEMENT | zip | 200109 | 200121 | |
200133 | TEXT | 93101 | 200109 | 200132 | |
200134 | TEXT | 200109 | 200121 | ||
200135 | TEXT | 200109 | 200110 | ||
200136 | ELEMENT | occupation | 200109 | 200110 | |
200137 | TEXT | 200109 | 200136 | ||
200138 | ELEMENT | title | 200109 | 200136 | |
200139 | TEXT | Metadata Systems Developer | 200109 | 200138 | |
200140 | TEXT | 200109 | 200136 | ||
200141 | ELEMENT | location | 200109 | 200136 | |
200142 | TEXT | 200109 | 200141 | ||
200143 | ELEMENT | type | 200109 | 200141 | |
200144 | TEXT | 200109 | 200143 | ||
200145 | ELEMENT | grant | 200109 | 200143 | |
200146 | TEXT | 200109 | 200145 | ||
200147 | ELEMENT | name | 200109 | 200145 | |
200148 | TEXT | DBA | 200109 | 200147 | |
200149 | TEXT | 200109 | 200145 | ||
200150 | ELEMENT | grantor | 200109 | 200145 | |
200151 | TEXT | NSF | 200109 | 200150 | |
200152 | TEXT | 200109 | 200145 | ||
200153 | ELEMENT | PI | 200109 | 200145 | |
200154 | TEXT | Jim Reichman | 200109 | 200153 | |
200155 | TEXT | 200109 | 200145 | ||
200156 | ELEMENT | PI | 200109 | 200145 | |
200157 | TEXT | Matt Jones | 200109 | 200156 | |
200158 | TEXT | 200109 | 200145 | 200159 | ELEMENT | PI | 200109 | 200145 |
200160 | TEXT | Mark Schilhauer | 200109 | 200159 | |
200161 | TEXT | 200109 | 200145 | ||
200162 | TEXT | 200109 | 200143 | ||
200163 | TEXT | 200109 | 200141 |