1
|
<!--
|
2
|
No copyright, no warranty; use as you will.
|
3
|
Written by Ronald Bourret, Technical University of Darmstadt, 1998-9
|
4
|
-->
|
5
|
|
6
|
<!--
|
7
|
XML-DBMS is a system for transferring data between XML documents
|
8
|
and relational databases. It views an XML document as a tree of
|
9
|
objects and then uses an object-relational mapping to map these
|
10
|
objects to a relational database.
|
11
|
|
12
|
Generally, element types are viewed as classes, and attributes and
|
13
|
PCDATA are viewed as properties of those classes. However, element
|
14
|
types can also be viewed as properties of their parent element
|
15
|
type. Although this is most useful when an element type contains
|
16
|
only PCDATA, it is useful in other cases as well. For example,
|
17
|
consider an element type that contains a description written in
|
18
|
XHTML. Although this description has subelements such as <B> and
|
19
|
<P>, these subelements cannot be meaningfully interpreted on their
|
20
|
own and it makes more sense to view the contents of the element
|
21
|
type as a single value (property) rather than a class.
|
22
|
|
23
|
(Note that the tree of objects is *not* the DOM. This is because
|
24
|
the DOM models the document itself, not the data in that document.)
|
25
|
|
26
|
The XML-DBMS mapping language, which is described in this DTD,
|
27
|
allows users to:
|
28
|
|
29
|
a) Declare how element types are to be viewed (as classes or
|
30
|
properties),
|
31
|
|
32
|
b) Declare which subelements, attributes, and PCDATA are to be
|
33
|
viewed as properties of a given element type-as-class (unmapped
|
34
|
XML structures are ignored), and
|
35
|
|
36
|
c) State how to map the resulting classes and properties to the
|
37
|
database.
|
38
|
|
39
|
The resulting object-relational mapping maps classes to tables and
|
40
|
properties to either columns in those tables or to subtables. (The
|
41
|
latter is useful, for example, for storing BLOB properties
|
42
|
separately.) Inter-class relationships are mapped as candidate key
|
43
|
/ foreign key relationships. The mapping can also state whether to
|
44
|
preserve information about the order in which subelements and
|
45
|
PCDATA occur within their parent, which is generally important in
|
46
|
document-centric XML documents and unimportant in data-centric XML
|
47
|
documents.
|
48
|
-->
|
49
|
|
50
|
<!--
|
51
|
The XMLDBMS element type is the root element type of the
|
52
|
mapping document.
|
53
|
-->
|
54
|
|
55
|
<!ELEMENT XMLToDBMS (Options*, Maps)>
|
56
|
<!ATTLIST XMLToDBMS
|
57
|
Version CDATA #FIXED "1.0">
|
58
|
|
59
|
<!--
|
60
|
Options is simply a container to hold the various options you
|
61
|
can set.
|
62
|
-->
|
63
|
|
64
|
<!ELEMENT Options (EmptyStringIsNull?, DateTimeFormats?, Namespace*)>
|
65
|
|
66
|
<!--
|
67
|
The EmptyStringIsNull element states how empty strings in an
|
68
|
XML document correspond to NULLs in the database. Technically,
|
69
|
NULL means that there is no value and is distinct from an empty
|
70
|
string. In an XML document, this corresponds to an optional
|
71
|
element or attribute being missing, as opposed to its being
|
72
|
present and having an empty string as its value (this includes
|
73
|
empty elements).
|
74
|
|
75
|
However, many XML users are likely to think of empty strings
|
76
|
as NULLs. EmptyStringIsNull allows XML-DBMS users to handle
|
77
|
this situation. If it is present, empty strings are treated
|
78
|
the same as NULLs; if it is absent, empty strings are treated
|
79
|
as strings.
|
80
|
|
81
|
The following table shows how NULL values and empty strings in
|
82
|
the database are transferred to missing elements/attributes and
|
83
|
empty strings in the XML document and vice versa.
|
84
|
|
85
|
|
86
|
Transfer Direction
|
87
|
_______________________________________
|
88
|
EmptyStringIsNull | | |
|
89
|
element is: | DBMS => XML | XML => DBMS |
|
90
|
___________________|___________________|___________________|
|
91
|
| | | |
|
92
|
| | NULL => missing | missing => NULL |
|
93
|
| not present | | |
|
94
|
| | empty => empty | empty => empty |
|
95
|
| | string string | string string |
|
96
|
|___________________|___________________|___________________|
|
97
|
| | | |
|
98
|
| | NULL => empty | missing => NULL |
|
99
|
| | string | |
|
100
|
| present | | |
|
101
|
| | empty => empty | empty => NULL |
|
102
|
| | string string | string |
|
103
|
|___________________|___________________|___________________|
|
104
|
|
105
|
|
106
|
Note that EmptyStringIsNull applies only to elements and
|
107
|
attributes mapped as properties. (An empty element-as-class with
|
108
|
no attributes results in a row of all NULLs in the database.)
|
109
|
-->
|
110
|
|
111
|
<!ELEMENT EmptyStringIsNull EMPTY>
|
112
|
|
113
|
<!--
|
114
|
The DateTimeFormats element and its subelements specify the formats
|
115
|
used to parse dates, times, and timestamps. The information specified
|
116
|
here is used to construct one of Java's date formatting objects -
|
117
|
either a java.text.DateFormat or a java.text.SimpleDateFormat.
|
118
|
|
119
|
The value of the Language attribute must be a valid ISO Language Code.
|
120
|
These are defined by ISO-639 and are available on the Web. For example,
|
121
|
try:
|
122
|
|
123
|
http://www.ics.uci.edu/pub/ietf/http/related/iso639.txt
|
124
|
|
125
|
The value of the Country attribute must be a valid ISO Country Code.
|
126
|
These are defined by ISO-3166 and are also available on the Web. For
|
127
|
example, try:
|
128
|
|
129
|
http://www.din.de/gremien/nas/nabd/iso3166ma/codlstp1.html
|
130
|
|
131
|
The value of Date, Time, and Timestamp attributes must be either one
|
132
|
of the keywords FULL, LONG, MEDIUM, or SHORT, whose formats are
|
133
|
described in the documentation for DateFormat, or a formatting
|
134
|
pattern as defined in the documentation for SimpleDateFormat. Which
|
135
|
format is used depends on the data type of the target column. (If
|
136
|
values are not being formatted correctly, be sure to check how the
|
137
|
JDBC driver maps the type of the target column. For example,
|
138
|
MS Access only supports TIMESTAMP columns.)
|
139
|
|
140
|
If an element is missing, the default value is used. For example, if
|
141
|
the Locale element is missing, the default locale is used. If the
|
142
|
DateTimeFormats element is missing, the default locale and format
|
143
|
are used. Note that Locale is used only if Date, Time, or Timestamp
|
144
|
is present.
|
145
|
-->
|
146
|
|
147
|
<!ELEMENT DateTimeFormats (Locale?, Patterns)>
|
148
|
<!ELEMENT Locale EMPTY>
|
149
|
<!ATTLIST Locale
|
150
|
Language NMTOKEN #REQUIRED
|
151
|
Country NMTOKEN #REQUIRED>
|
152
|
<!ELEMENT Patterns EMPTY>
|
153
|
<!ATTLIST Patterns
|
154
|
Date CDATA #IMPLIED
|
155
|
Time CDATA #IMPLIED
|
156
|
Timestamp CDATA #IMPLIED>
|
157
|
|
158
|
<!--
|
159
|
Namespace elements give URIs and their associated prefixes. These
|
160
|
are used as follows:
|
161
|
|
162
|
a) In the mapping document, prefixes identify which namespace an
|
163
|
element or attribute belongs to. They can be used in the Name
|
164
|
attribute of the ElementType and Attribute element types.
|
165
|
|
166
|
b) When transferring data from an XML document to the database,
|
167
|
namespace URIs are used to identify elements and attributes in
|
168
|
that document. The XML document can use different prefixes than
|
169
|
are used in the mapping document.
|
170
|
|
171
|
c) When transferring data from the database to an XML document,
|
172
|
namespace URIs and prefixes are used to prefix element and
|
173
|
attribute names in that document.
|
174
|
|
175
|
Namespace elements are not required. If they are used, the same URI
|
176
|
or prefix cannot be used more than once. Zero-length prefixes ("")
|
177
|
are not currently supported.
|
178
|
-->
|
179
|
|
180
|
<!ELEMENT Namespace EMPTY>
|
181
|
<!ATTLIST Namespace
|
182
|
Prefix NMTOKEN #REQUIRED
|
183
|
URI CDATA #REQUIRED>
|
184
|
|
185
|
<!ELEMENT Maps (IgnoreRoot*, ClassMap+)>
|
186
|
|
187
|
<!--
|
188
|
IgnoreRoot elements instruct the transfer software to ignore the
|
189
|
root element of the XML document (when transferring data from an
|
190
|
XML document to the database) or to construct an enclosing root
|
191
|
element (when transferring data from the database to an XML
|
192
|
document). This is useful when a document contains multiple,
|
193
|
unrelated instances of a particular class.
|
194
|
|
195
|
For example, suppose a document contains multiple sales orders:
|
196
|
each sales order is represented by a SalesOrder element and a
|
197
|
single Orders element serves as the root of the document. If the
|
198
|
sales orders are unrelated - that is, no information is stored in
|
199
|
the database about which sales orders are in this particular
|
200
|
document - then the root element of the document (Orders) should
|
201
|
be ignored.
|
202
|
|
203
|
The ElementType sub-element of IgnoreRoot identifies the root
|
204
|
element type to be ignored. A given map can identify multiple roots
|
205
|
that are to be ignored.
|
206
|
|
207
|
The PseudoRoot sub-elements of IgnoreRoot identify the mapped
|
208
|
children of the ignored root. Each is identified by its ElementType
|
209
|
and must be mapped separately in a ClassMap element. CandidateKey
|
210
|
(optional) gives the candidate key in the table to which the
|
211
|
pseudo-root element is mapped and OrderColumn (optional) gives the
|
212
|
column containing information about the order in which the
|
213
|
pseudo-root occurs in the actual root.
|
214
|
-->
|
215
|
|
216
|
<!ELEMENT IgnoreRoot (ElementType, PseudoRoot+)>
|
217
|
<!ELEMENT PseudoRoot (ElementType, CandidateKey?, OrderColumn?)>
|
218
|
|
219
|
<!--
|
220
|
ClassMap elements state that an element type (identified by the
|
221
|
ElementType subelement) is to be treated as a class. They also
|
222
|
provide information about the properties of that class (PropertyMap
|
223
|
subelements), any classes that are related to the class
|
224
|
(RelatedClass subelements), and how to map that class to the
|
225
|
database (ToRootTable and ToClassTable subelements).
|
226
|
|
227
|
A root table is any table that can be used as the top-level table
|
228
|
when extracting data from the database. The CandidateKey and
|
229
|
OrderColumn subelements give the columns that are used in the
|
230
|
WHERE and ORDER BY clauses when extracting data. The root element
|
231
|
type must be mapped as either ToRootTable or IgnoreRoot.
|
232
|
-->
|
233
|
|
234
|
<!ELEMENT ClassMap (ElementType,
|
235
|
(ToRootTable | ToClassTable),
|
236
|
PropertyMap*, RelatedClass*)>
|
237
|
|
238
|
<!ELEMENT ToRootTable (Table, CandidateKey?, OrderColumn?)>
|
239
|
<!ELEMENT ToClassTable (Table)>
|
240
|
|
241
|
<!--
|
242
|
PropertyMap elements state that an attribute, PCDATA, or element
|
243
|
type is to be treated as a property. The property is identified by
|
244
|
the Attribute, PCDATA, or ElementType subelement and belongs to the
|
245
|
class in whose ClassMap the PropertyMap is nested.
|
246
|
|
247
|
Attributes and PCDATA can be properties only of their parent
|
248
|
element type-as-class. An element type can be a property of any
|
249
|
parent element type. Thus, an element type can be declared to be a
|
250
|
property of more than one element type-as-class.
|
251
|
|
252
|
Property values are stored in columns. These can be either in the
|
253
|
class table (ToColumn) or in a separate table (ToPropertyTable). In
|
254
|
the latter case, Table identifies the property table, and
|
255
|
CandidateKey and ForeignKey identify the keys used to join the two
|
256
|
tables.
|
257
|
|
258
|
The OrderColumn subelement designates the column in which the
|
259
|
system stores order information. For more information, see
|
260
|
OrderColumn below.
|
261
|
-->
|
262
|
|
263
|
<!ELEMENT PropertyMap ((Attribute | PCDATA | ElementType),
|
264
|
(ToColumn | ToPropertyTable),
|
265
|
OrderColumn?)>
|
266
|
|
267
|
<!ELEMENT ToColumn (Column)>
|
268
|
<!ELEMENT ToPropertyTable (Table, CandidateKey, ForeignKey, Column)>
|
269
|
<!ATTLIST ToPropertyTable
|
270
|
KeyInParentTable (Candidate | Foreign) #REQUIRED>
|
271
|
|
272
|
<!--
|
273
|
RelatedClass elements describe classes that are related to
|
274
|
the class being defined. In class terms, you can think of
|
275
|
this as meaning that a property is added to the class being
|
276
|
defined that points to the related class. In XML terms, this
|
277
|
means that the element type for the related class is a child
|
278
|
of the element type for the class being defined.
|
279
|
|
280
|
(Note that the term "child class" could have been used here,
|
281
|
but wasn't due to the potential for confusion with parent/
|
282
|
child table relationships, parent/child element
|
283
|
relationships, and class inheritance relationships.)
|
284
|
|
285
|
For example, in the following XML document, if the element
|
286
|
types <A> and <B> are mapped as classes, then <B> needs to
|
287
|
be defined as a related class of <A>.
|
288
|
|
289
|
<A>
|
290
|
<property_A1>123</property_A1>
|
291
|
<property_A2>abcde</property_A2>
|
292
|
<B>
|
293
|
<property_B1>123</property_B1>
|
294
|
<property_B2>abcde</property_B2>
|
295
|
</B>
|
296
|
</A>
|
297
|
|
298
|
|
299
|
The RelatedClass element specifies the element type of the
|
300
|
related class, the candidate and foreign keys used to join
|
301
|
the tables for the two classes, and the name of the column
|
302
|
(if any) which contains the order in which the elements for
|
303
|
the related class appear in the class being defined.
|
304
|
-->
|
305
|
|
306
|
<!ELEMENT RelatedClass (ElementType,
|
307
|
CandidateKey, ForeignKey,
|
308
|
OrderColumn?)>
|
309
|
<!ATTLIST RelatedClass
|
310
|
KeyInParentTable (Candidate | Foreign) #REQUIRED>
|
311
|
|
312
|
<!--
|
313
|
The CandidateKey and ForeignKey elements describe the keys used to
|
314
|
join two tables: either two class tables or a class table and a
|
315
|
property table. Which key occurs in the parent table is declared in
|
316
|
the RelatedClass or ToPropertyTable element with the
|
317
|
KeyInParentTable attribute.
|
318
|
|
319
|
In addition, the CandidateKey element is used to identify the
|
320
|
columns used to identify rows when extracting data from the root
|
321
|
table.
|
322
|
|
323
|
The Generate attribute tells the system whether to generate the
|
324
|
candidate key. If the key is generated, the user must provide a
|
325
|
class that generates the key; for more information, see:
|
326
|
|
327
|
de.tudarmstadt.ito.xmldbms.KeyGenerator
|
328
|
de.tudarmstadt.ito.xmldbms.helpers.KeyGeneratorImpl
|
329
|
|
330
|
If the key is not generated, other properties must be mapped to
|
331
|
the key columns.
|
332
|
-->
|
333
|
|
334
|
<!ELEMENT CandidateKey (Column+)>
|
335
|
<!ATTLIST CandidateKey
|
336
|
Generate (Yes | No) #REQUIRED>
|
337
|
<!ELEMENT ForeignKey (Column+)>
|
338
|
|
339
|
<!--
|
340
|
ElementType, Attribute, and PCDATA elements are used to identify
|
341
|
the corresponding XML structures.
|
342
|
|
343
|
The MultiValued attribute of the Attribute element type states
|
344
|
whether individual tokens in an attribute are separate values
|
345
|
(NMTOKENS, IDREFS, and ENTITIES attributes) or a single value
|
346
|
(CDATA, ID, IDREF, ENTITY, and NMTOKEN attributes).
|
347
|
-->
|
348
|
|
349
|
<!ENTITY % XMLName "Name NMTOKEN #REQUIRED">
|
350
|
<!ELEMENT ElementType EMPTY>
|
351
|
<!ATTLIST ElementType
|
352
|
%XMLName;>
|
353
|
<!ELEMENT Attribute EMPTY>
|
354
|
<!ATTLIST Attribute
|
355
|
%XMLName;
|
356
|
MultiValued (Yes | No) "No">
|
357
|
<!ELEMENT PCDATA EMPTY>
|
358
|
|
359
|
<!--
|
360
|
Table, Column, and OrderColumn names are used to identify the
|
361
|
corresponding XML structures. Table and column names must follow
|
362
|
the naming conventions used in the database. For example, if
|
363
|
column names are stored in upper case in the database, then they
|
364
|
must be specified in upper case in the mapping document. Table
|
365
|
names may be qualified with catalog and schema names. Column names
|
366
|
must not be qualified; the table to which they belong is determined
|
367
|
from context (see below). Column names must not be quoted; the system
|
368
|
quotes them before using them in SQL statements.
|
369
|
|
370
|
When transferring data from the database to an XML document, the
|
371
|
special table name "Result Set" is used when the root table is a
|
372
|
result set.
|
373
|
|
374
|
The table to which a column belongs is determined as follows:
|
375
|
|
376
|
Column element in: Column occurs in:
|
377
|
__________________ ________________________________________
|
378
|
|
379
|
ToColumn Class table
|
380
|
ToPropertyTable Property table
|
381
|
CandidateKey Determined by KeyInParentTable attribute
|
382
|
ForeignKey Determined by KeyInParentTable attribute
|
383
|
|
384
|
OrderColumn element in: Column occurs in:
|
385
|
__________________ ________________________________________
|
386
|
|
387
|
PseudoRoot Class table of pseudo-root element
|
388
|
PropertyTable Class table
|
389
|
(if property mapped as ToColumn)
|
390
|
Same table as foreign key
|
391
|
(if property mapped as ToPropertyTable)
|
392
|
RelatedClass Same table as foreign key
|
393
|
|
394
|
Order columns are used to store information about the order in
|
395
|
which elements and PCDATA occur in their parent element, as well as
|
396
|
the order of values in multi-valued attributes (IDREFS, NMTOKENS,
|
397
|
and ENTITIES).
|
398
|
|
399
|
Storing order information is optional; if it is not stored, there
|
400
|
is no guarantee that order will be preserved in a round trip from
|
401
|
an XML document to the database and back again. (Note that nesting
|
402
|
is preserved; that is, subelements and PCDATA always occur in the
|
403
|
correct parent.)
|
404
|
|
405
|
The Generate attribute of the OrderColumn element tells the system
|
406
|
whether to generate order information or not. (The presence or
|
407
|
absence of the OrderColumn element tells the system whether to use
|
408
|
order information.) If order information is generated, the order
|
409
|
column must be of type java.sql.Types.Integer. If order information
|
410
|
is not generated, another property must be mapped to the order
|
411
|
column.
|
412
|
-->
|
413
|
|
414
|
<!ENTITY % DatabaseName "Name CDATA #REQUIRED">
|
415
|
<!ELEMENT Table EMPTY>
|
416
|
<!ATTLIST Table
|
417
|
%DatabaseName;>
|
418
|
<!ELEMENT Column EMPTY>
|
419
|
<!ATTLIST Column
|
420
|
%DatabaseName;>
|
421
|
<!ELEMENT OrderColumn EMPTY>
|
422
|
<!ATTLIST OrderColumn
|
423
|
%DatabaseName;
|
424
|
Generate (Yes | No) #REQUIRED>
|
425
|
|