<!--
  * xmlindex.html
  *
  *      Authors: Jivka Bojilova
  *    Copyright: 2000 Regents of the University of California and the
  *               National Center for Ecological Analysis and Synthesis
  *  For Details: http://www.nceas.ucsb.edu/
  *      Created: 2000 April 5
  *      Version: 0.01
  *    File Info: '$Id: xmlindex.html 878 2001-12-18 18:11:42Z berkley $'
  *
  * October Meeting SDSC, 2000
-->
<HTML>
<HEAD>
<TITLE>Metacat</TITLE>
<link rel="stylesheet" type="text/css" href="@docrooturl@default.css">
</HEAD>
<BODY>
  <table width="100%">
    <tr>
      <td class="tablehead" colspan="2"><p class="label">Indexing for Performance</p></td>
      <td class="tablehead" colspan="2" align="right">
        <a href="./saxparser.html">Back</a> | <a href="./metacattour.html">Home</a> | 
        <a href="./acontrol.html">Next</a>
      </td>
    </tr>
  </table>
  <p>The Metacat database stores indexes for the current version of every document
  in the <i>xml_index</i> table. All relative and absolute paths to any node in a
  document are stored in this table. These paths are used by structured queries
  that search a specified location in an XML tree. Path expressions specified in a
  <a href="./metacatquery.html">pathquery</a> document are looked up in this
  table.</p>
  <p>Indexes become necessary because relational databases are not efficient
  at querying tree structures.  By slightly denormalizing the database and listing
  the tree structure in a flat table, the relational database engine can handle
  large path queries more effectively.</p>
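  <p>To make the cost concrete, the following Python sketch contrasts the two
  approaches on a hypothetical five-node tree. The <i>nodes</i> and <i>index</i>
  dictionaries and the function name are illustrative assumptions, not Metacat
  code. Matching a path against parent links needs one lookup per path step for
  every candidate node (in SQL, roughly one self-join per step), while a flat
  index can answer the same question with a single lookup.</p>

```python
# Hypothetical miniature node table: nodeid -> (nodename, parentnodeid).
nodes = {
    1: ("employee", None),
    2: ("name", 1),
    3: ("first", 2),
    4: ("occupation", 1),
    5: ("name", 4),
}


def match_by_walking(path):
    """Find nodes matching a relative path by following parent links."""
    steps = path.split("/")
    hits = []
    for nodeid in nodes:
        current, ok = nodeid, True
        # Compare the path right to left against the chain of ancestors.
        for step in reversed(steps):
            if current is None or nodes[current][0] != step:
                ok = False
                break
            current = nodes[current][1]
        if ok:
            hits.append(nodeid)
    return hits


# Hand-written flat index for three of the paths in the tree above
# (illustrative only): path -> list of matching nodeids.
index = {
    "employee/name": [2],
    "occupation/name": [5],
    "name": [2, 5],
}

# Both approaches agree; the index needs only one dictionary lookup.
print(match_by_walking("employee/name"), index["employee/name"])
```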
  <p>Example: </p>
  <pre>
    &lt;?xml version="1.0"?&gt;
    &lt;!DOCTYPE employee&gt;
    &lt;employee&gt;
      &lt;name&gt;
        &lt;first&gt;Chad&lt;/first&gt;
        &lt;last&gt;Berkley&lt;/last&gt;
      &lt;/name&gt;
      &lt;address&gt;
        &lt;street&gt;735 State St. Ste. 303&lt;/street&gt;
        &lt;city&gt;Santa Barbara&lt;/city&gt;
        &lt;state&gt;California&lt;/state&gt;
        &lt;zip&gt;93101&lt;/zip&gt;
      &lt;/address&gt;
      &lt;occupation&gt;
        &lt;title&gt;Metadata Systems Developer&lt;/title&gt;
        &lt;location&gt;
          &lt;type&gt;
            &lt;grant&gt;
              &lt;name&gt;DBA&lt;/name&gt;
              &lt;grantor&gt;NSF&lt;/grantor&gt;
              &lt;PI&gt;Jim Reichman&lt;/PI&gt;
              &lt;PI&gt;Matt Jones&lt;/PI&gt;
              &lt;PI&gt;Mark Schildhauer&lt;/PI&gt;
            &lt;/grant&gt;
          &lt;/type&gt;
          &lt;system&gt;UC&lt;/system&gt;
          &lt;location&gt;Santa Barbara&lt;/location&gt;
        &lt;/location&gt;
      &lt;/occupation&gt;
    &lt;/employee&gt;
  </pre>
  <p>The XML document above is logically deconstructed into all possible paths
  through the document.  Each path (relative and absolute) is entered as a record
  in xml_index along with the nodeid of the deepest node in the path and the
  nodeid of that node's parent.  In this way, any path can be queried quickly.
  Its contents can be ascertained by cross-linking the nodeid back to the
  <a href="#xml_nodes">xml_nodes</a> table.</p>
  <table border="1">
    <tr>
      <td><b>PATH</b></td><td><b>NODEID</b></td><td><b>PARENTNODEID</b></td>
    </tr>
    <tr><td>occupation/location/type/grant/grantor</td><td>200150</td><td>200145</td></tr>
    <tr><td>location/type/grant/grantor</td><td>200150</td><td>200145</td></tr>
    <tr><td>grantor</td><td>200150</td><td>200145</td></tr>
    <tr><td>location/type/grant/PI</td><td>200153</td><td>200145</td></tr>
    <tr><td>grant/PI</td><td>200153</td><td>200145</td></tr>
    <tr><td>/employee/occupation/location/type/grant/PI</td><td>200153</td><td>200145</td></tr>
    <tr><td>employee/occupation/location/type/grant/PI</td><td>200153</td><td>200145</td></tr>
    <tr><td>occupation/location/type/grant/PI</td><td>200153</td><td>200145</td></tr>
    <tr><td>type/grant/PI</td><td>200153</td><td>200145</td></tr>
    <tr><td>PI</td><td>200153</td><td>200145</td></tr>
    <tr><td>location/type/grant/PI</td><td>200156</td><td>200145</td></tr>
    <tr><td>grant/PI</td><td>200156</td><td>200145</td></tr>
    <tr><td>/employee/occupation/location/type/grant/PI</td><td>200156</td><td>200145</td></tr>
    <tr><td>employee/occupation/location/type/grant/PI</td><td>200156</td><td>200145</td></tr>
    <tr><td>occupation/location/type/grant/PI</td><td>200156</td><td>200145</td></tr>
    <tr><td>type/grant/PI</td><td>200156</td><td>200145</td></tr>
    <tr><td>PI</td><td>200156</td><td>200145</td></tr>
    <tr><td>location/type/grant/PI</td><td>200159</td><td>200145</td></tr>
    <tr><td>grant/PI</td><td>200159</td><td>200145</td></tr>
    <tr><td>/employee/occupation/location/type/grant/PI</td><td>200159</td><td>200145</td></tr>
    <tr><td>employee/occupation/location/type/grant/PI</td><td>200159</td><td>200145</td></tr>
    <tr><td>occupation/location/type/grant/PI</td><td>200159</td><td>200145</td></tr>
    <tr><td>type/grant/PI</td><td>200159</td><td>200145</td></tr>
    <tr><td>PI</td><td>200159</td><td>200145</td></tr>
    <tr><td>occupation/location/system</td><td>200164</td><td>200141</td></tr>
    <tr><td>system</td><td>200164</td><td>200141</td></tr>
    <tr><td>employee/occupation/location/system</td><td>200164</td><td>200141</td></tr>
    <tr><td>location/system</td><td>200164</td><td>200141</td></tr>
    <tr><td>/employee/occupation/location/system</td><td>200164</td><td>200141</td></tr>
    <tr><td>/employee/occupation/location/location</td><td>200167</td><td>200141</td></tr>
    <tr><td>employee/occupation/location/location</td><td>200167</td><td>200141</td></tr>
    <tr><td>occupation/location/location</td><td>200167</td><td>200141</td></tr>
    <tr><td>location</td><td>200167</td><td>200141</td></tr>
    <tr><td>location/location</td><td>200167</td><td>200141</td></tr>
    <tr><td>/employee/name</td><td>200112</td><td>200110</td></tr>
    <tr><td>name</td><td>200112</td><td>200110</td></tr>
    <tr><td>employee/name</td><td>200112</td><td>200110</td></tr>
    <tr><td>/employee/name/first</td><td>200114</td><td>200112</td></tr>
    <tr><td>name/first</td><td>200114</td><td>200112</td></tr>
    <tr><td>/employee</td><td>200110</td><td>200109</td></tr>
    <tr><td>employee</td><td>200110</td><td>200109</td></tr>
    <tr><td>employee/name/first</td><td>200114</td><td>200112</td></tr>
    <tr><td>first</td><td>200114</td><td>200112</td></tr>
    <tr><td>name/last</td><td>200117</td><td>200112</td></tr>
    <tr><td>/employee/name/last</td><td>200117</td><td>200112</td></tr>
    <tr><td>employee/name/last</td><td>200117</td><td>200112</td></tr>
    <tr><td>last</td><td>200117</td><td>200112</td></tr>
    <tr><td>employee/address</td><td>200121</td><td>200110</td></tr>
    <tr><td>address</td><td>200121</td><td>200110</td></tr>
    <tr><td>/employee/address</td><td>200121</td><td>200110</td></tr>
    <tr><td>/employee/address/street</td><td>200123</td><td>200121</td></tr>
    <tr><td>employee/address/street</td><td>200123</td><td>200121</td></tr>
    <tr><td>address/street</td><td>200123</td><td>200121</td></tr>
    <tr><td>street</td><td>200123</td><td>200121</td></tr>
    <tr><td>employee/address/city</td><td>200126</td><td>200121</td></tr>
    <tr><td>address/city</td><td>200126</td><td>200121</td></tr>
    <tr><td>/employee/address/city</td><td>200126</td><td>200121</td></tr>
    <tr><td>city</td><td>200126</td><td>200121</td></tr>
    <tr><td>address/state</td><td>200129</td><td>200121</td></tr>
    <tr><td>/employee/address/state</td><td>200129</td><td>200121</td></tr>
    <tr><td>employee/address/state</td><td>200129</td><td>200121</td></tr>
    <tr><td>state</td><td>200129</td><td>200121</td></tr>
    <tr><td>employee/address/zip</td><td>200132</td><td>200121</td></tr>
    <tr><td>zip</td><td>200132</td><td>200121</td></tr>
    <tr><td>/employee/address/zip</td><td>200132</td><td>200121</td></tr>
    <tr><td>address/zip</td><td>200132</td><td>200121</td></tr>
    <tr><td>employee/occupation</td><td>200136</td><td>200110</td></tr>
    <tr><td>/employee/occupation</td><td>200136</td><td>200110</td></tr>
    <tr><td>occupation</td><td>200136</td><td>200110</td></tr>
    <tr><td>/employee/occupation/title</td><td>200138</td><td>200136</td></tr>
    <tr><td>employee/occupation/title</td><td>200138</td><td>200136</td></tr>
    <tr><td>occupation/title</td><td>200138</td><td>200136</td></tr>
    <tr><td>title</td><td>200138</td><td>200136</td></tr>
    <tr><td>location</td><td>200141</td><td>200136</td></tr>
    <tr><td>employee/occupation/location</td><td>200141</td><td>200136</td></tr>
    <tr><td>/employee/occupation/location</td><td>200141</td><td>200136</td></tr>
    <tr><td>occupation/location</td><td>200141</td><td>200136</td></tr>
    <tr><td>occupation/location/type</td><td>200143</td><td>200141</td></tr>
    <tr><td>type</td><td>200143</td><td>200141</td></tr>
    <tr><td>/employee/occupation/location/type</td><td>200143</td><td>200141</td></tr>
    <tr><td>location/type</td><td>200143</td><td>200141</td></tr>
    <tr><td>employee/occupation/location/type</td><td>200143</td><td>200141</td></tr>
    <tr><td>type/grant</td><td>200145</td><td>200143</td></tr>
    <tr><td>occupation/location/type/grant</td><td>200145</td><td>200143</td></tr>
    <tr><td>grant</td><td>200145</td><td>200143</td></tr>
    <tr><td>location/type/grant</td><td>200145</td><td>200143</td></tr>
    <tr><td>/employee/occupation/location/type/grant</td><td>200145</td><td>200143</td></tr>
    <tr><td>employee/occupation/location/type/grant</td><td>200145</td><td>200143</td></tr>
    <tr><td>grant/name</td><td>200147</td><td>200145</td></tr>
    <tr><td>/employee/occupation/location/type/grant/name</td><td>200147</td><td>200145</td></tr>
    <tr><td>occupation/location/type/grant/name</td><td>200147</td><td>200145</td></tr>
    <tr><td>name</td><td>200147</td><td>200145</td></tr>
    <tr><td>type/grant/name</td><td>200147</td><td>200145</td></tr>
    <tr><td>location/type/grant/name</td><td>200147</td><td>200145</td></tr>
    <tr><td>employee/occupation/location/type/grant/name</td><td>200147</td><td>200145</td></tr>
    <tr><td>/employee/occupation/location/type/grant/grantor</td><td>200150</td><td>200145</td></tr>
    <tr><td>type/grant/grantor</td><td>200150</td><td>200145</td></tr>
    <tr><td>grant/grantor</td><td>200150</td><td>200145</td></tr>
    <tr><td>employee/occupation/location/type/grant/grantor</td><td>200150</td><td>200145</td></tr>
  </table>
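  <p>The pattern in the table above (one absolute path plus every relative
  suffix path for each element) can be sketched in Python as follows. This is a
  minimal illustration and not Metacat's own indexing code: the function name
  <i>enumerate_paths</i> is hypothetical, and element objects stand in for
  nodeids because the way Metacat assigns nodeids is not described here.</p>

```python
import xml.etree.ElementTree as ET


def enumerate_paths(xml_text):
    """Return (path, element) pairs: one absolute path and every relative
    suffix path for each element in the document."""
    root = ET.fromstring(xml_text)
    rows = []

    def visit(element, ancestors):
        chain = ancestors + [element.tag]
        # Absolute path from the document root, e.g. /employee/name/first
        rows.append(("/" + "/".join(chain), element))
        # Relative paths: every suffix of the chain, down to the bare tag
        for start in range(len(chain)):
            rows.append(("/".join(chain[start:]), element))
        for child in element:
            visit(child, chain)

    visit(root, [])
    return rows


# A shortened version of the employee document used as sample input.
doc = """<employee>
  <name><first>Chad</first><last>Berkley</last></name>
  <occupation><title>Metadata Systems Developer</title></occupation>
</employee>"""

rows = enumerate_paths(doc)
paths_for_first = sorted(path for path, element in rows if element.tag == "first")
print(paths_for_first)
```

  <p>An element at depth <i>d</i> therefore contributes <i>d</i> + 1 rows, which
  is why the index grows quickly for deeply nested documents.</p>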
  <br>
  <a name="xml_nodes"></a>
  <p>The following table shows the same document as it is stored in xml_nodes.</p>
  <br>
  <table border="1">
    <tr>
      <td><b>nodeid</b></td><td><b>nodetype</b></td><td><b>nodename</b></td>
      <td><b>nodedata</b></td><td><b>rootnodeid</b></td><td><b>parentnodeid</b></td>
    </tr>
    <tr><td>200164</td><td>ELEMENT</td><td>system</td><td>&nbsp;</td><td>200109</td><td>200141</td></tr>
    <tr><td>200165</td><td>TEXT</td><td>&nbsp;</td><td>UC</td><td>200109</td><td>200164</td></tr>
    <tr><td>200166</td><td>TEXT</td><td>&nbsp;</td><td>&nbsp;</td><td>200109</td><td>200141</td></tr>
    <tr><td>200167</td><td>ELEMENT</td><td>location</td><td>&nbsp;</td><td>200109</td><td>200141</td></tr>
    <tr><td>200168</td><td>TEXT</td><td>&nbsp;</td><td>Santa Barbara</td><td>200109</td><td>200167</td></tr>
    <tr><td>200169</td><td>TEXT</td><td>&nbsp;</td><td>&nbsp;</td><td>200109</td><td>200141</td></tr>
    <tr><td>200170</td><td>TEXT</td><td>&nbsp;</td><td>&nbsp;</td><td>200109</td><td>200136</td></tr>
    <tr><td>200171</td><td>TEXT</td><td>&nbsp;</td><td>&nbsp;</td><td>200109</td><td>200110</td></tr>
    <tr><td>200109</td><td>DOCUMENT</td><td>employee</td><td>&nbsp;</td><td>200109</td><td>&nbsp;</td></tr>
    <tr><td>200110</td><td>ELEMENT</td><td>employee</td><td>&nbsp;</td><td>200109</td><td>200109</td></tr>
    <tr><td>200111</td><td>TEXT</td><td>&nbsp;</td><td>&nbsp;</td><td>200109</td><td>200110</td></tr>
    <tr><td>200112</td><td>ELEMENT</td><td>name</td><td>&nbsp;</td><td>200109</td><td>200110</td></tr>
    <tr><td>200113</td><td>TEXT</td><td>&nbsp;</td><td>&nbsp;</td><td>200109</td><td>200112</td></tr>
    <tr><td>200114</td><td>ELEMENT</td><td>first</td><td>&nbsp;</td><td>200109</td><td>200112</td></tr>
    <tr><td>200115</td><td>TEXT</td><td>&nbsp;</td><td>Chad</td><td>200109</td><td>200114</td></tr>
    <tr><td>200116</td><td>TEXT</td><td>&nbsp;</td><td>&nbsp;</td><td>200109</td><td>200112</td></tr>
    <tr><td>200117</td><td>ELEMENT</td><td>last</td><td>&nbsp;</td><td>200109</td><td>200112</td></tr>
    <tr><td>200118</td><td>TEXT</td><td>&nbsp;</td><td>Berkley</td><td>200109</td><td>200117</td></tr>
    <tr><td>200119</td><td>TEXT</td><td>&nbsp;</td><td>&nbsp;</td><td>200109</td><td>200112</td></tr>
    <tr><td>200120</td><td>TEXT</td><td>&nbsp;</td><td>&nbsp;</td><td>200109</td><td>200110</td></tr>
    <tr><td>200121</td><td>ELEMENT</td><td>address</td><td>&nbsp;</td><td>200109</td><td>200110</td></tr>
    <tr><td>200122</td><td>TEXT</td><td>&nbsp;</td><td>&nbsp;</td><td>200109</td><td>200121</td></tr>
    <tr><td>200123</td><td>ELEMENT</td><td>street</td><td>&nbsp;</td><td>200109</td><td>200121</td></tr>
    <tr><td>200124</td><td>TEXT</td><td>&nbsp;</td><td>735 State St. Ste. 303</td><td>200109</td><td>200123</td></tr>
    <tr><td>200125</td><td>TEXT</td><td>&nbsp;</td><td>&nbsp;</td><td>200109</td><td>200121</td></tr>
    <tr><td>200126</td><td>ELEMENT</td><td>city</td><td>&nbsp;</td><td>200109</td><td>200121</td></tr>
    <tr><td>200127</td><td>TEXT</td><td>&nbsp;</td><td>Santa Barbara</td><td>200109</td><td>200126</td></tr>
    <tr><td>200128</td><td>TEXT</td><td>&nbsp;</td><td>&nbsp;</td><td>200109</td><td>200121</td></tr>
    <tr><td>200129</td><td>ELEMENT</td><td>state</td><td>&nbsp;</td><td>200109</td><td>200121</td></tr>
    <tr><td>200130</td><td>TEXT</td><td>&nbsp;</td><td>California</td><td>200109</td><td>200129</td></tr>
    <tr><td>200131</td><td>TEXT</td><td>&nbsp;</td><td>&nbsp;</td><td>200109</td><td>200121</td></tr>
    <tr><td>200132</td><td>ELEMENT</td><td>zip</td><td>&nbsp;</td><td>200109</td><td>200121</td></tr>
    <tr><td>200133</td><td>TEXT</td><td>&nbsp;</td><td>93101</td><td>200109</td><td>200132</td></tr>
    <tr><td>200134</td><td>TEXT</td><td>&nbsp;</td><td>&nbsp;</td><td>200109</td><td>200121</td></tr>
    <tr><td>200135</td><td>TEXT</td><td>&nbsp;</td><td>&nbsp;</td><td>200109</td><td>200110</td></tr>
    <tr><td>200136</td><td>ELEMENT</td><td>occupation</td><td>&nbsp;</td><td>200109</td><td>200110</td></tr>
    <tr><td>200137</td><td>TEXT</td><td>&nbsp;</td><td>&nbsp;</td><td>200109</td><td>200136</td></tr>
    <tr><td>200138</td><td>ELEMENT</td><td>title</td><td>&nbsp;</td><td>200109</td><td>200136</td></tr>
    <tr><td>200139</td><td>TEXT</td><td>&nbsp;</td><td>Metadata Systems Developer</td><td>200109</td><td>200138</td></tr>
    <tr><td>200140</td><td>TEXT</td><td>&nbsp;</td><td>&nbsp;</td><td>200109</td><td>200136</td></tr>
    <tr><td>200141</td><td>ELEMENT</td><td>location</td><td>&nbsp;</td><td>200109</td><td>200136</td></tr>
    <tr><td>200142</td><td>TEXT</td><td>&nbsp;</td><td>&nbsp;</td><td>200109</td><td>200141</td></tr>
    <tr><td>200143</td><td>ELEMENT</td><td>type</td><td>&nbsp;</td><td>200109</td><td>200141</td></tr>
    <tr><td>200144</td><td>TEXT</td><td>&nbsp;</td><td>&nbsp;</td><td>200109</td><td>200143</td></tr>
    <tr><td>200145</td><td>ELEMENT</td><td>grant</td><td>&nbsp;</td><td>200109</td><td>200143</td></tr>
    <tr><td>200146</td><td>TEXT</td><td>&nbsp;</td><td>&nbsp;</td><td>200109</td><td>200145</td></tr>
    <tr><td>200147</td><td>ELEMENT</td><td>name</td><td>&nbsp;</td><td>200109</td><td>200145</td></tr>
    <tr><td>200148</td><td>TEXT</td><td>&nbsp;</td><td>DBA</td><td>200109</td><td>200147</td></tr>
    <tr><td>200149</td><td>TEXT</td><td>&nbsp;</td><td>&nbsp;</td><td>200109</td><td>200145</td></tr>
    <tr><td>200150</td><td>ELEMENT</td><td>grantor</td><td>&nbsp;</td><td>200109</td><td>200145</td></tr>
    <tr><td>200151</td><td>TEXT</td><td>&nbsp;</td><td>NSF</td><td>200109</td><td>200150</td></tr>
    <tr><td>200152</td><td>TEXT</td><td>&nbsp;</td><td>&nbsp;</td><td>200109</td><td>200145</td></tr>
    <tr><td>200153</td><td>ELEMENT</td><td>PI</td><td>&nbsp;</td><td>200109</td><td>200145</td></tr>
    <tr><td>200154</td><td>TEXT</td><td>&nbsp;</td><td>Jim Reichman</td><td>200109</td><td>200153</td></tr>
    <tr><td>200155</td><td>TEXT</td><td>&nbsp;</td><td>&nbsp;</td><td>200109</td><td>200145</td></tr>
    <tr><td>200156</td><td>ELEMENT</td><td>PI</td><td>&nbsp;</td><td>200109</td><td>200145</td></tr>
    <tr><td>200157</td><td>TEXT</td><td>&nbsp;</td><td>Matt Jones</td><td>200109</td><td>200156</td></tr>
    <tr><td>200158</td><td>TEXT</td><td>&nbsp;</td><td>&nbsp;</td><td>200109</td><td>200145</td></tr>
    <tr><td>200159</td><td>ELEMENT</td><td>PI</td><td>&nbsp;</td><td>200109</td><td>200145</td></tr>
    <tr><td>200160</td><td>TEXT</td><td>&nbsp;</td><td>Mark Schildhauer</td><td>200109</td><td>200159</td></tr>
    <tr><td>200161</td><td>TEXT</td><td>&nbsp;</td><td>&nbsp;</td><td>200109</td><td>200145</td></tr>
    <tr><td>200162</td><td>TEXT</td><td>&nbsp;</td><td>&nbsp;</td><td>200109</td><td>200143</td></tr>
    <tr><td>200163</td><td>TEXT</td><td>&nbsp;</td><td>&nbsp;</td><td>200109</td><td>200141</td></tr>
  </table>
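  <p>The cross-link between the two tables can be tried out with a small
  in-memory SQLite database. The sketch below is an assumption-laden
  illustration: the column names come from the table headers on this page, but
  the column types, the index name <i>xml_index_path</i>, the helper
  <i>text_at</i> and the SQL itself are hypothetical and are not the statements
  Metacat issues. Only a handful of rows from the example document are
  loaded.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE xml_nodes (
    nodeid INTEGER PRIMARY KEY, nodetype TEXT, nodename TEXT,
    nodedata TEXT, rootnodeid INTEGER, parentnodeid INTEGER);
CREATE TABLE xml_index (path TEXT, nodeid INTEGER, parentnodeid INTEGER);
-- Assumed index so that a path lookup does not scan the whole table.
CREATE INDEX xml_index_path ON xml_index (path);
""")

# The three PI elements and their text nodes from the example document.
conn.executemany("INSERT INTO xml_nodes VALUES (?, ?, ?, ?, ?, ?)", [
    (200153, "ELEMENT", "PI", None, 200109, 200145),
    (200154, "TEXT", None, "Jim Reichman", 200109, 200153),
    (200156, "ELEMENT", "PI", None, 200109, 200145),
    (200157, "TEXT", None, "Matt Jones", 200109, 200156),
    (200159, "ELEMENT", "PI", None, 200109, 200145),
    (200160, "TEXT", None, "Mark Schildhauer", 200109, 200159),
])
conn.executemany("INSERT INTO xml_index VALUES (?, ?, ?)", [
    ("grant/PI", 200153, 200145),
    ("grant/PI", 200156, 200145),
    ("grant/PI", 200159, 200145),
    ("grantor", 200150, 200145),
])


def text_at(path):
    """Return the text of every element indexed under `path`."""
    cursor = conn.execute(
        """SELECT n.nodedata
             FROM xml_index i
             JOIN xml_nodes n ON n.parentnodeid = i.nodeid
            WHERE i.path = ? AND n.nodetype = 'TEXT'
            ORDER BY n.nodeid""",
        (path,))
    return [row[0] for row in cursor]


pi_names = text_at("grant/PI")
print(pi_names)
```

  <p>The path is matched with a single equality test on xml_index, and the only
  join needed is the one that fetches the text nodes whose parentnodeid is the
  indexed nodeid.</p>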
  <br>
  <a href="./saxparser.html">Back</a> | <a href="./metacattour.html">Home</a> | 
  <a href="./acontrol.html">Next</a>

</BODY>
</HTML>