1 |
14
|
jones
|
<!--
|
2 |
|
|
No copyright, no warranty; use as you will.
|
3 |
|
|
Written by Ronald Bourret, Technical University of Darmstadt, 1998-9
|
4 |
|
|
-->
|
5 |
|
|
|
6 |
|
|
<!--
|
7 |
|
|
XML-DBMS is a system for transferring data between XML documents
|
8 |
|
|
and relational databases. It views an XML document as a tree of
|
9 |
|
|
objects and then uses an object-relational mapping to map these
|
10 |
|
|
objects to a relational database.
|
11 |
|
|
|
12 |
|
|
Generally, element types are viewed as classes, and attributes and
|
13 |
|
|
PCDATA are viewed as properties of those classes. However, element
|
14 |
|
|
types can also be viewed as properties of their parent element
|
15 |
|
|
type. Although this is most useful when an element type contains
|
16 |
|
|
only PCDATA, it is useful in other cases as well. For example,
|
17 |
|
|
consider an element type that contains a description written in
|
18 |
|
|
XHTML. Although this description has subelements such as <B> and
|
19 |
|
|
<P>, these subelements cannot be meaningfully interpreted on their
|
20 |
|
|
own and it makes more sense to view the contents of the element
|
21 |
|
|
type as a single value (property) rather than a class.
|
22 |
|
|
|
23 |
|
|
(Note that the tree of objects is *not* the DOM. This is because
|
24 |
|
|
the DOM models the document itself, not the data in that document.)
|
25 |
|
|
|
26 |
|
|
The XML-DBMS mapping language, which is described in this DTD,
|
27 |
|
|
allows users to:
|
28 |
|
|
|
29 |
|
|
a) Declare how element types are to be viewed (as classes or
|
30 |
|
|
properties),
|
31 |
|
|
|
32 |
|
|
b) Declare which subelements, attributes, and PCDATA are to be
|
33 |
|
|
viewed as properties of a given element type-as-class (unmapped
|
34 |
|
|
XML structures are ignored), and
|
35 |
|
|
|
36 |
|
|
c) State how to map the resulting classes and properties to the
|
37 |
|
|
database.
|
38 |
|
|
|
39 |
|
|
The resulting object-relational mapping maps classes to tables and
|
40 |
|
|
properties to either columns in those tables or to subtables. (The
|
41 |
|
|
latter is useful, for example, for storing BLOB properties
|
42 |
|
|
separately.) Inter-class relationships are mapped as candidate key
|
43 |
|
|
/ foreign key relationships. The mapping can also state whether to
|
44 |
|
|
preserve information about the order in which subelements and
|
45 |
|
|
PCDATA occur within their parent, which is generally important in
|
46 |
|
|
document-centric XML documents and unimportant in data-centric XML
|
47 |
|
|
documents.
|
48 |
|
|
-->
|
49 |
|
|
|
50 |
|
|
<!--
|
51 |
|
|
The XMLDBMS element type is the root element type of the
|
52 |
|
|
mapping document.
|
53 |
|
|
-->
|
54 |
|
|
|
55 |
|
|
<!ELEMENT XMLToDBMS (Options*, Maps)>
|
56 |
|
|
<!ATTLIST XMLToDBMS
|
57 |
|
|
Version CDATA #FIXED "1.0">
|
58 |
|
|
|
59 |
|
|
<!--
|
60 |
|
|
Options is simply a container to hold the various options you
|
61 |
|
|
can set.
|
62 |
|
|
-->
|
63 |
|
|
|
64 |
|
|
<!ELEMENT Options (EmptyStringIsNull?, DateTimeFormats?, Namespace*)>
|
65 |
|
|
|
66 |
|
|
<!--
|
67 |
|
|
The EmptyStringIsNull element states how empty strings in an
|
68 |
|
|
XML document correspond to NULLs in the database. Technically,
|
69 |
|
|
NULL means that there is no value and is distinct from an empty
|
70 |
|
|
string. In an XML document, this corresponds to an optional
|
71 |
|
|
element or attribute being missing, as opposed to its being
|
72 |
|
|
present and having an empty string as its value (this includes
|
73 |
|
|
empty elements).
|
74 |
|
|
|
75 |
|
|
However, many XML users are likely to think of empty strings
|
76 |
|
|
as NULLs. EmptyStringIsNull allows XML-DBMS users to handle
|
77 |
|
|
this situation. If it is present, empty strings are treated
|
78 |
|
|
the same as NULLs; if it is absent, empty strings are treated
|
79 |
|
|
as strings.
|
80 |
|
|
|
81 |
|
|
The following table shows how NULL values and empty strings in
|
82 |
|
|
the database are transferred to missing elements/attributes and
|
83 |
|
|
empty strings in the XML document and vice versa.
|
84 |
|
|
|
85 |
|
|
|
86 |
|
|
Transfer Direction
|
87 |
|
|
_______________________________________
|
88 |
|
|
EmptyStringIsNull | | |
|
89 |
|
|
element is: | DBMS => XML | XML => DBMS |
|
90 |
|
|
___________________|___________________|___________________|
|
91 |
|
|
| | | |
|
92 |
|
|
| | NULL => missing | missing => NULL |
|
93 |
|
|
| not present | | |
|
94 |
|
|
| | empty => empty | empty => empty |
|
95 |
|
|
| | string string | string string |
|
96 |
|
|
|___________________|___________________|___________________|
|
97 |
|
|
| | | |
|
98 |
|
|
| | NULL => empty | missing => NULL |
|
99 |
|
|
| | string | |
|
100 |
|
|
| present | | |
|
101 |
|
|
| | empty => empty | empty => NULL |
|
102 |
|
|
| | string string | string |
|
103 |
|
|
|___________________|___________________|___________________|
|
104 |
|
|
|
105 |
|
|
|
106 |
|
|
Note that EmptyStringIsNull applies only to elements and
|
107 |
|
|
attributes mapped as properties. (An empty element-as-class with
|
108 |
|
|
no attributes results in a row of all NULLs in the database.)
|
109 |
|
|
-->
|
110 |
|
|
|
111 |
|
|
<!ELEMENT EmptyStringIsNull EMPTY>
|
112 |
|
|
|
113 |
|
|
<!--
|
114 |
|
|
The DateTimeFormats element and its subelements specify the formats
|
115 |
|
|
used to parse dates, times, and timestamps. The information specified
|
116 |
|
|
here is used to construct one of Java's date formatting objects -
|
117 |
|
|
either a java.text.DateFormat or a java.text.SimpleDateFormat.
|
118 |
|
|
|
119 |
|
|
The value of the Language attribute must be a valid ISO Language Code.
|
120 |
|
|
These are defined by ISO-639 and are available on the Web. For example,
|
121 |
|
|
try:
|
122 |
|
|
|
123 |
|
|
http://www.ics.uci.edu/pub/ietf/http/related/iso639.txt
|
124 |
|
|
|
125 |
|
|
The value of the Country attribute must be a valid ISO Country Code.
|
126 |
|
|
These are defined by ISO-3166 and are also available on the Web. For
|
127 |
|
|
example, try:
|
128 |
|
|
|
129 |
|
|
http://www.din.de/gremien/nas/nabd/iso3166ma/codlstp1.html
|
130 |
|
|
|
131 |
|
|
The value of Date, Time, and Timestamp attributes must be either one
|
132 |
|
|
of the keywords FULL, LONG, MEDIUM, or SHORT, whose formats are
|
133 |
|
|
described in the documentation for DateFormat, or a formatting
|
134 |
|
|
pattern as defined in the documentation for SimpleDateFormat. Which
|
135 |
|
|
format is used depends on the data type of the target column. (If
|
136 |
|
|
values are not being formatted correctly, be sure to check how the
|
137 |
|
|
JDBC driver maps the type of the target column. For example,
|
138 |
|
|
MS Access only supports TIMESTAMP columns.)
|
139 |
|
|
|
140 |
|
|
If an element is missing, the default value is used. For example, if
|
141 |
|
|
the Locale element is missing, the default locale is used. If the
|
142 |
|
|
DateTimeFormats element is missing, the default locale and format
|
143 |
|
|
are used. Note that Locale is used only if Date, Time, or Timestamp
|
144 |
|
|
is present.
|
145 |
|
|
-->
|
146 |
|
|
|
147 |
|
|
<!ELEMENT DateTimeFormats (Locale?, Patterns)>
|
148 |
|
|
<!ELEMENT Locale EMPTY>
|
149 |
|
|
<!ATTLIST Locale
|
150 |
|
|
Language NMTOKEN #REQUIRED
|
151 |
|
|
Country NMTOKEN #REQUIRED>
|
152 |
|
|
<!ELEMENT Patterns EMPTY>
|
153 |
|
|
<!ATTLIST Patterns
|
154 |
|
|
Date CDATA #IMPLIED
|
155 |
|
|
Time CDATA #IMPLIED
|
156 |
|
|
Timestamp CDATA #IMPLIED>
|
157 |
|
|
|
158 |
|
|
<!--
|
159 |
|
|
Namespace elements give URIs and their associated prefixes. These
|
160 |
|
|
are used as follows:
|
161 |
|
|
|
162 |
|
|
a) In the mapping document, prefixes identify which namespace an
|
163 |
|
|
element or attribute belongs to. They can be used in the Name
|
164 |
|
|
attribute of the ElementType and Attribute element types.
|
165 |
|
|
|
166 |
|
|
b) When transferring data from an XML document to the database,
|
167 |
|
|
namespace URIs are used to identify elements and attributes in
|
168 |
|
|
that document. The XML document can use different prefixes than
|
169 |
|
|
are used in the mapping document.
|
170 |
|
|
|
171 |
|
|
c) When transferring data from the database to an XML document,
|
172 |
|
|
namespace URIs and prefixes are used to prefix element and
|
173 |
|
|
attribute names in that document.
|
174 |
|
|
|
175 |
|
|
Namespace elements are not required. If they are used, the same URI
|
176 |
|
|
or prefix cannot be used more than once. Zero-length prefixes ("")
|
177 |
|
|
are not currently supported.
|
178 |
|
|
-->
|
179 |
|
|
|
180 |
|
|
<!ELEMENT Namespace EMPTY>
|
181 |
|
|
<!ATTLIST Namespace
|
182 |
|
|
Prefix NMTOKEN #REQUIRED
|
183 |
|
|
URI CDATA #REQUIRED>
|
184 |
|
|
|
185 |
|
|
<!ELEMENT Maps (IgnoreRoot*, ClassMap+)>
|
186 |
|
|
|
187 |
|
|
<!--
|
188 |
|
|
IgnoreRoot elements instruct the transfer software to ignore the
|
189 |
|
|
root element of the XML document (when transferring data from an
|
190 |
|
|
XML document to the database) or to construct an enclosing root
|
191 |
|
|
element (when transferring data from the database to an XML
|
192 |
|
|
document). This is useful when a document contains multiple,
|
193 |
|
|
unrelated instances of a particular class.
|
194 |
|
|
|
195 |
|
|
For example, suppose a document contains multiple sales orders:
|
196 |
|
|
each sales order is represented by a SalesOrder element and a
|
197 |
|
|
single Orders element serves as the root of the document. If the
|
198 |
|
|
sales orders are unrelated - that is, no information is stored in
|
199 |
|
|
the database about which sales orders are in this particular
|
200 |
|
|
document - then the root element of the document (Orders) should
|
201 |
|
|
be ignored.
|
202 |
|
|
|
203 |
|
|
The ElementType sub-element of IgnoreRoot identifies the root
|
204 |
|
|
element type to be ignored. A given map can identify multiple roots
|
205 |
|
|
that are to be ignored.
|
206 |
|
|
|
207 |
|
|
The PseudoRoot sub-elements of IgnoreRoot identify the mapped
|
208 |
|
|
children of the ignored root. Each is identified by its ElementType
|
209 |
|
|
and must be mapped separately in a ClassMap element. CandidateKey
|
210 |
|
|
(optional) gives the candidate key in the table to which the
|
211 |
|
|
pseudo-root element is mapped and OrderColumn (optional) gives the
|
212 |
|
|
column containing information about the order in which the
|
213 |
|
|
pseudo-root occurs in the actual root.
|
214 |
|
|
-->
|
215 |
|
|
|
216 |
|
|
<!ELEMENT IgnoreRoot (ElementType, PseudoRoot+)>
|
217 |
|
|
<!ELEMENT PseudoRoot (ElementType, CandidateKey?, OrderColumn?)>
|
218 |
|
|
|
219 |
|
|
<!--
|
220 |
|
|
ClassMap elements state that an element type (identified by the
|
221 |
|
|
ElementType subelement) is to be treated as a class. They also
|
222 |
|
|
provide information about the properties of that class (PropertyMap
|
223 |
|
|
subelements), any classes that are related to the class
|
224 |
|
|
(RelatedClass subelements), and how to map that class to the
|
225 |
|
|
database (ToRootTable and ToClassTable subelements).
|
226 |
|
|
|
227 |
|
|
A root table is any table that can be used as the top-level table
|
228 |
|
|
when extracting data from the database. The CandidateKey and
|
229 |
|
|
OrderColumn subelements give the columns that are used in the
|
230 |
|
|
WHERE and ORDER BY clauses when extracting data. The root element
|
231 |
|
|
type must be mapped as either ToRootTable or IgnoreRoot.
|
232 |
|
|
-->
|
233 |
|
|
|
234 |
|
|
<!ELEMENT ClassMap (ElementType,
|
235 |
|
|
(ToRootTable | ToClassTable),
|
236 |
|
|
PropertyMap*, RelatedClass*)>
|
237 |
|
|
|
238 |
|
|
<!ELEMENT ToRootTable (Table, CandidateKey?, OrderColumn?)>
|
239 |
|
|
<!ELEMENT ToClassTable (Table)>
|
240 |
|
|
|
241 |
|
|
<!--
|
242 |
|
|
PropertyMap elements state that an attribute, PCDATA, or element
|
243 |
|
|
type is to be treated as a property. The property is identified by
|
244 |
|
|
the Attribute, PCDATA, or ElementType subelement and belongs to the
|
245 |
|
|
class in whose ClassMap the PropertyMap is nested.
|
246 |
|
|
|
247 |
|
|
Attributes and PCDATA can be properties only of their parent
|
248 |
|
|
element type-as-class. An element type can be a property of any
|
249 |
|
|
parent element type. Thus, an element type can be declared to be a
|
250 |
|
|
property of more than one element type-as-class.
|
251 |
|
|
|
252 |
|
|
Property values are stored in columns. These can be either in the
|
253 |
|
|
class table (ToColumn) or in a separate table (ToPropertyTable). In
|
254 |
|
|
the latter case, Table identifies the property table, and
|
255 |
|
|
CandidateKey and ForeignKey identify the keys used to join the two
|
256 |
|
|
tables.
|
257 |
|
|
|
258 |
|
|
The OrderColumn subelement designates the column in which the
|
259 |
|
|
system stores order information. For more information, see
|
260 |
|
|
OrderColumn below.
|
261 |
|
|
-->
|
262 |
|
|
|
263 |
|
|
<!ELEMENT PropertyMap ((Attribute | PCDATA | ElementType),
|
264 |
|
|
(ToColumn | ToPropertyTable),
|
265 |
|
|
OrderColumn?)>
|
266 |
|
|
|
267 |
|
|
<!ELEMENT ToColumn (Column)>
|
268 |
|
|
<!ELEMENT ToPropertyTable (Table, CandidateKey, ForeignKey, Column)>
|
269 |
|
|
<!ATTLIST ToPropertyTable
|
270 |
|
|
KeyInParentTable (Candidate | Foreign) #REQUIRED>
|
271 |
|
|
|
272 |
|
|
<!--
|
273 |
|
|
RelatedClass elements describe classes that are related to
|
274 |
|
|
the class being defined. In class terms, you can think of
|
275 |
|
|
this as meaning that a property is added to the class being
|
276 |
|
|
defined that points to the related class. In XML terms, this
|
277 |
|
|
means that the element type for the related class is a child
|
278 |
|
|
of the element type for the class being defined.
|
279 |
|
|
|
280 |
|
|
(Note that the term "child class" could have been used here,
|
281 |
|
|
but wasn't due to the potential for confusion with parent/
|
282 |
|
|
child table relationships, parent/child element
|
283 |
|
|
relationships, and class inheritance relationships.)
|
284 |
|
|
|
285 |
|
|
For example, in the following XML document, if the element
|
286 |
|
|
types <A> and <B> are mapped as classes, then <B> needs to
|
287 |
|
|
be defined as a related class of <A>.
|
288 |
|
|
|
289 |
|
|
<A>
|
290 |
|
|
<property_A1>123</property_A1>
|
291 |
|
|
<property_A2>abcde</property_A2>
|
292 |
|
|
<B>
|
293 |
|
|
<property_B1>123</property_B1>
|
294 |
|
|
<property_B2>abcde</property_B2>
|
295 |
|
|
</B>
|
296 |
|
|
</A>
|
297 |
|
|
|
298 |
|
|
|
299 |
|
|
The RelatedClass element specifies the element type of the
|
300 |
|
|
related class, the candidate and foreign keys used to join
|
301 |
|
|
the tables for the two classes, and the name of the column
|
302 |
|
|
(if any) which contains the order in which the elements for
|
303 |
|
|
the related class appear in the class being defined.
|
304 |
|
|
-->
|
305 |
|
|
|
306 |
|
|
<!ELEMENT RelatedClass (ElementType,
|
307 |
|
|
CandidateKey, ForeignKey,
|
308 |
|
|
OrderColumn?)>
|
309 |
|
|
<!ATTLIST RelatedClass
|
310 |
|
|
KeyInParentTable (Candidate | Foreign) #REQUIRED>
|
311 |
|
|
|
312 |
|
|
<!--
|
313 |
|
|
The CandidateKey and ForeignKey elements describe the keys used to
|
314 |
|
|
join two tables: either two class tables or a class table and a
|
315 |
|
|
property table. Which key occurs in the parent table is declared in
|
316 |
|
|
the RelatedClass or ToPropertyTable element with the
|
317 |
|
|
KeyInParentTable attribute.
|
318 |
|
|
|
319 |
|
|
In addition, the CandidateKey element is used to identify the
|
320 |
|
|
columns used to identify rows when extracting data from the root
|
321 |
|
|
table.
|
322 |
|
|
|
323 |
|
|
The Generate attribute tells the system whether to generate the
|
324 |
|
|
candidate key. If the key is generated, the user must provide a
|
325 |
|
|
class that generates the key; for more information, see:
|
326 |
|
|
|
327 |
|
|
de.tudarmstadt.ito.xmldbms.KeyGenerator
|
328 |
|
|
de.tudarmstadt.ito.xmldbms.helpers.KeyGeneratorImpl
|
329 |
|
|
|
330 |
|
|
If the key is not generated, other properties must be mapped to
|
331 |
|
|
the key columns.
|
332 |
|
|
-->
|
333 |
|
|
|
334 |
|
|
<!ELEMENT CandidateKey (Column+)>
|
335 |
|
|
<!ATTLIST CandidateKey
|
336 |
|
|
Generate (Yes | No) #REQUIRED>
|
337 |
|
|
<!ELEMENT ForeignKey (Column+)>
|
338 |
|
|
|
339 |
|
|
<!--
|
340 |
|
|
ElementType, Attribute, and PCDATA elements are used to identify
|
341 |
|
|
the corresponding XML structures.
|
342 |
|
|
|
343 |
|
|
The MultiValued attribute of the Attribute element type states
|
344 |
|
|
whether individual tokens in an attribute are separate values
|
345 |
|
|
(NMTOKENS, IDREFS, and ENTITIES attributes) or a single value
|
346 |
|
|
(CDATA, ID, IDREF, ENTITY, and NMTOKEN attributes).
|
347 |
|
|
-->
|
348 |
|
|
|
349 |
|
|
<!ENTITY % XMLName "Name NMTOKEN #REQUIRED">
|
350 |
|
|
<!ELEMENT ElementType EMPTY>
|
351 |
|
|
<!ATTLIST ElementType
|
352 |
|
|
%XMLName;>
|
353 |
|
|
<!ELEMENT Attribute EMPTY>
|
354 |
|
|
<!ATTLIST Attribute
|
355 |
|
|
%XMLName;
|
356 |
|
|
MultiValued (Yes | No) "No">
|
357 |
|
|
<!ELEMENT PCDATA EMPTY>
|
358 |
|
|
|
359 |
|
|
<!--
|
360 |
|
|
Table, Column, and OrderColumn names are used to identify the
|
361 |
|
|
corresponding XML structures. Table and column names must follow
|
362 |
|
|
the naming conventions used in the database. For example, if
|
363 |
|
|
column names are stored in upper case in the database, then they
|
364 |
|
|
must be specified in upper case in the mapping document. Table
|
365 |
|
|
names may be qualified with catalog and schema names. Column names
|
366 |
|
|
must not be qualified; the table to which they belong is determined
|
367 |
|
|
from context (see below). Column names must not be quoted; the system
|
368 |
|
|
quotes them before using them in SQL statements.
|
369 |
|
|
|
370 |
|
|
When transferring data from the database to an XML document, the
|
371 |
|
|
special table name "Result Set" is used when the root table is a
|
372 |
|
|
result set.
|
373 |
|
|
|
374 |
|
|
The table to which a column belongs is determined as follows:
|
375 |
|
|
|
376 |
|
|
Column element in: Column occurs in:
|
377 |
|
|
__________________ ________________________________________
|
378 |
|
|
|
379 |
|
|
ToColumn Class table
|
380 |
|
|
ToPropertyTable Property table
|
381 |
|
|
CandidateKey Determined by KeyInParentTable attribute
|
382 |
|
|
ForeignKey Determined by KeyInParentTable attribute
|
383 |
|
|
|
384 |
|
|
OrderColumn element in: Column occurs in:
|
385 |
|
|
__________________ ________________________________________
|
386 |
|
|
|
387 |
|
|
PseudoRoot Class table of pseudo-root element
|
388 |
|
|
PropertyTable Class table
|
389 |
|
|
(if property mapped as ToColumn)
|
390 |
|
|
Same table as foreign key
|
391 |
|
|
(if property mapped as ToPropertyTable)
|
392 |
|
|
RelatedClass Same table as foreign key
|
393 |
|
|
|
394 |
|
|
Order columns are used to store information about the order in
|
395 |
|
|
which elements and PCDATA occur in their parent element, as well as
|
396 |
|
|
the order of values in multi-valued attributes (IDREFS, NMTOKENS,
|
397 |
|
|
and ENTITIES).
|
398 |
|
|
|
399 |
|
|
Storing order information is optional; if it is not stored, there
|
400 |
|
|
is no guarantee that order will be preserved in a round trip from
|
401 |
|
|
an XML document to the database and back again. (Note that nesting
|
402 |
|
|
is preserved; that is, subelements and PCDATA always occur in the
|
403 |
|
|
correct parent.)
|
404 |
|
|
|
405 |
|
|
The Generate attribute of the OrderColumn element tells the system
|
406 |
|
|
whether to generate order information or not. (The presence or
|
407 |
|
|
absence of the OrderColumn element tells the system whether to use
|
408 |
|
|
order information.) If order information is generated, the order
|
409 |
|
|
column must be of type java.sql.Types.Integer. If order information
|
410 |
|
|
is not generated, another property must be mapped to the order
|
411 |
|
|
column.
|
412 |
|
|
-->
|
413 |
|
|
|
414 |
|
|
<!ENTITY % DatabaseName "Name CDATA #REQUIRED">
|
415 |
|
|
<!ELEMENT Table EMPTY>
|
416 |
|
|
<!ATTLIST Table
|
417 |
|
|
%DatabaseName;>
|
418 |
|
|
<!ELEMENT Column EMPTY>
|
419 |
|
|
<!ATTLIST Column
|
420 |
|
|
%DatabaseName;>
|
421 |
|
|
<!ELEMENT OrderColumn EMPTY>
|
422 |
|
|
<!ATTLIST OrderColumn
|
423 |
|
|
%DatabaseName;
|
424 |
|
|
Generate (Yes | No) #REQUIRED>
|