<!--
   No copyright, no warranty; use as you will.
   Written by Ronald Bourret, Technical University of Darmstadt, 1998-9
-->

    
<!--
   XML-DBMS is a system for transferring data between XML documents
   and relational databases. It views an XML document as a tree of
   objects and then uses an object-relational mapping to map these
   objects to a relational database.

   Generally, element types are viewed as classes, and attributes and
   PCDATA are viewed as properties of those classes. However, element
   types can also be viewed as properties of their parent element
   type. Although this is most useful when an element type contains
   only PCDATA, it is useful in other cases as well. For example,
   consider an element type that contains a description written in
   XHTML. Although this description has subelements such as <B> and
   <P>, these subelements cannot be meaningfully interpreted on their
   own and it makes more sense to view the contents of the element
   type as a single value (property) rather than a class.

   (Note that the tree of objects is *not* the DOM. This is because
   the DOM models the document itself, not the data in that document.)

   The XML-DBMS mapping language, which is described in this DTD,
   allows users to:

   a) Declare how element types are to be viewed (as classes or
      properties),

   b) Declare which subelements, attributes, and PCDATA are to be
      viewed as properties of a given element type-as-class (unmapped
      XML structures are ignored), and

   c) State how to map the resulting classes and properties to the
      database.

   The resulting object-relational mapping maps classes to tables and
   properties to either columns in those tables or to subtables. (The
   latter is useful, for example, for storing BLOB properties
   separately.) Inter-class relationships are mapped as candidate key
   / foreign key relationships. The mapping can also state whether to
   preserve information about the order in which subelements and
   PCDATA occur within their parent, which is generally important in
   document-centric XML documents and unimportant in data-centric XML
   documents.
-->

    
<!--
   The XMLToDBMS element type is the root element type of the
   mapping document.
-->

<!ELEMENT XMLToDBMS (Options*, Maps)>
<!ATTLIST XMLToDBMS
          Version CDATA #FIXED "1.0">
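<!--
   As a rough illustration of the content model above, a minimal
   mapping document might be laid out as follows. This is a sketch
   only: the element type name (SalesOrder) and the table name
   (Sales) are hypothetical, and a real map would normally also
   contain PropertyMap elements.

      <XMLToDBMS Version="1.0">
         <Options>
            <EmptyStringIsNull/>
         </Options>
         <Maps>
            <ClassMap>
               <ElementType Name="SalesOrder"/>
               <ToRootTable>
                  <Table Name="Sales"/>
               </ToRootTable>
            </ClassMap>
         </Maps>
      </XMLToDBMS>
-->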

    
<!--
   Options is simply a container to hold the various options you
   can set.
-->

<!ELEMENT Options (EmptyStringIsNull?, DateTimeFormats?, Namespace*)>

    
<!--
   The EmptyStringIsNull element states how empty strings in an
   XML document correspond to NULLs in the database. Technically,
   NULL means that there is no value and is distinct from an empty
   string. In an XML document, this corresponds to an optional
   element or attribute being missing, as opposed to its being
   present and having an empty string as its value (this includes
   empty elements).

   However, many XML users are likely to think of empty strings
   as NULLs. EmptyStringIsNull allows XML-DBMS users to handle
   this situation. If it is present, empty strings are treated
   the same as NULLs; if it is absent, empty strings are treated
   as strings.

   The following table shows how NULL values and empty strings in
   the database are transferred to missing elements/attributes and
   empty strings in the XML document and vice versa.

                                     Transfer Direction
                          _______________________________________
       EmptyStringIsNull |                   |                   |
          element is:    |  DBMS  =>  XML    |    XML  =>  DBMS  |
      ___________________|___________________|___________________|
     |                   |                   |                   |
     |                   | NULL   => missing | missing => NULL   |
     |    not present    |                   |                   |
     |                   | empty  => empty   | empty   => empty  |
     |                   | string    string  | string     string |
     |___________________|___________________|___________________|
     |                   |                   |                   |
     |                   | NULL   => empty   | missing => NULL   |
     |                   |           string  |                   |
     |     present       |                   |                   |
     |                   | empty  => empty   | empty   => NULL   |
     |                   | string    string  | string            |
     |___________________|___________________|___________________|

   Note that EmptyStringIsNull applies only to elements and
   attributes mapped as properties. (An empty element-as-class with
   no attributes results in a row of all NULLs in the database.)
-->

<!ELEMENT EmptyStringIsNull EMPTY>
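<!--
   A short worked example of the table above, using a hypothetical
   Comment element that is mapped as a property and is present but
   empty in the XML document:

      <Order><Comment></Comment></Order>

   When transferring from XML to the database without
   EmptyStringIsNull, the mapped column receives an empty string;
   with it, the column receives NULL. The option is turned on like
   this:

      <Options>
         <EmptyStringIsNull/>
      </Options>
-->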

    
<!--
   The DateTimeFormats element and its subelements specify the formats
   used to parse dates, times, and timestamps. The information specified
   here is used to construct one of Java's date formatting objects -
   either a java.text.DateFormat or a java.text.SimpleDateFormat.

   The value of the Language attribute must be a valid ISO Language Code.
   These are defined by ISO-639 and are available on the Web. For example,
   try:

      http://www.ics.uci.edu/pub/ietf/http/related/iso639.txt

   The value of the Country attribute must be a valid ISO Country Code.
   These are defined by ISO-3166 and are also available on the Web. For
   example, try:

      http://www.din.de/gremien/nas/nabd/iso3166ma/codlstp1.html

   The value of Date, Time, and Timestamp attributes must be either one
   of the keywords FULL, LONG, MEDIUM, or SHORT, whose formats are
   described in the documentation for DateFormat, or a formatting
   pattern as defined in the documentation for SimpleDateFormat. Which
   format is used depends on the data type of the target column. (If
   values are not being formatted correctly, be sure to check how the
   JDBC driver maps the type of the target column. For example,
   MS Access only supports TIMESTAMP columns.)

   If an element is missing, the default value is used. For example, if
   the Locale element is missing, the default locale is used. If the
   DateTimeFormats element is missing, the default locale and format
   are used. Note that Locale is used only if Date, Time, or Timestamp
   is present.
-->

<!ELEMENT DateTimeFormats (Locale?, Patterns)>
<!ELEMENT Locale EMPTY>
<!ATTLIST Locale
          Language NMTOKEN #REQUIRED
          Country NMTOKEN #REQUIRED>
<!ELEMENT Patterns EMPTY>
<!ATTLIST Patterns
          Date CDATA #IMPLIED
          Time CDATA #IMPLIED
          Timestamp CDATA #IMPLIED>
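<!--
   For instance, the following fragment (a sketch; the locale and
   patterns are arbitrary choices for illustration, not defaults)
   asks for US English and explicit SimpleDateFormat patterns:

      <Options>
         <DateTimeFormats>
            <Locale Language="en" Country="US"/>
            <Patterns Date="yyyy-MM-dd"
                      Time="HH:mm:ss"
                      Timestamp="yyyy-MM-dd HH:mm:ss"/>
         </DateTimeFormats>
      </Options>

   A keyword can be given in place of a pattern, for example
   Date="MEDIUM".
-->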

    
<!--
   Namespace elements give URIs and their associated prefixes. These
   are used as follows:

   a) In the mapping document, prefixes identify which namespace an
      element or attribute belongs to. They can be used in the Name
      attribute of the ElementType and Attribute element types.

   b) When transferring data from an XML document to the database,
      namespace URIs are used to identify elements and attributes in
      that document. The XML document can use different prefixes than
      are used in the mapping document.

   c) When transferring data from the database to an XML document,
      namespace URIs and prefixes are used to prefix element and
      attribute names in that document.

   Namespace elements are not required. If they are used, the same URI
   or prefix cannot be used more than once. Zero-length prefixes ("")
   are not currently supported.
-->

<!ELEMENT Namespace EMPTY>
<!ATTLIST Namespace
          Prefix NMTOKEN #REQUIRED
          URI CDATA #REQUIRED>
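<!--
   For example (the prefix and URI here are made up for
   illustration):

      <Options>
         <Namespace Prefix="so" URI="http://www.example.com/sales"/>
      </Options>

   An element type in that namespace would then presumably be
   referred to elsewhere in the map by its prefixed name:

      <ElementType Name="so:SalesOrder"/>
-->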

    
<!ELEMENT Maps (IgnoreRoot*, ClassMap+)>

<!--
   IgnoreRoot elements instruct the transfer software to ignore the
   root element of the XML document (when transferring data from an
   XML document to the database) or to construct an enclosing root
   element (when transferring data from the database to an XML
   document). This is useful when a document contains multiple,
   unrelated instances of a particular class.

   For example, suppose a document contains multiple sales orders:
   each sales order is represented by a SalesOrder element and a
   single Orders element serves as the root of the document. If the
   sales orders are unrelated - that is, no information is stored in
   the database about which sales orders are in this particular
   document - then the root element of the document (Orders) should
   be ignored.

   The ElementType subelement of IgnoreRoot identifies the root
   element type to be ignored. A given map can identify multiple roots
   that are to be ignored.

   The PseudoRoot subelements of IgnoreRoot identify the mapped
   children of the ignored root. Each is identified by its ElementType
   and must be mapped separately in a ClassMap element. CandidateKey
   (optional) gives the candidate key in the table to which the
   pseudo-root element is mapped and OrderColumn (optional) gives the
   column containing information about the order in which the
   pseudo-root occurs in the actual root.
-->

<!ELEMENT IgnoreRoot (ElementType, PseudoRoot+)>
<!ELEMENT PseudoRoot (ElementType, CandidateKey?, OrderColumn?)>
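<!--
   The sales order case described above might be mapped roughly as
   follows. This is a sketch; the column names (Number, OrderInDoc)
   are hypothetical.

      <IgnoreRoot>
         <ElementType Name="Orders"/>
         <PseudoRoot>
            <ElementType Name="SalesOrder"/>
            <CandidateKey Generate="No">
               <Column Name="Number"/>
            </CandidateKey>
            <OrderColumn Name="OrderInDoc" Generate="Yes"/>
         </PseudoRoot>
      </IgnoreRoot>

   SalesOrder must still be mapped in its own ClassMap element.
-->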

    
<!--
   ClassMap elements state that an element type (identified by the
   ElementType subelement) is to be treated as a class. They also
   provide information about the properties of that class (PropertyMap
   subelements), any classes that are related to the class
   (RelatedClass subelements), and how to map that class to the
   database (ToRootTable and ToClassTable subelements).

   A root table is any table that can be used as the top-level table
   when extracting data from the database. The CandidateKey and
   OrderColumn subelements give the columns that are used in the
   WHERE and ORDER BY clauses when extracting data. The root element
   type must be mapped as either ToRootTable or IgnoreRoot.
-->

<!ELEMENT ClassMap (ElementType,
                    (ToRootTable | ToClassTable),
                    PropertyMap*, RelatedClass*)>

<!ELEMENT ToRootTable (Table, CandidateKey?, OrderColumn?)>
<!ELEMENT ToClassTable (Table)>
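<!--
   A sketch of a ClassMap for an element type that can serve as a
   root; the table and column names are hypothetical:

      <ClassMap>
         <ElementType Name="SalesOrder"/>
         <ToRootTable>
            <Table Name="Sales"/>
            <CandidateKey Generate="No">
               <Column Name="Number"/>
            </CandidateKey>
         </ToRootTable>
      </ClassMap>

   An element type that never serves as a root would use
   ToClassTable instead, for example
   <ToClassTable><Table Name="Lines"/></ToClassTable>.
-->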

    
<!--
   PropertyMap elements state that an attribute, PCDATA, or element
   type is to be treated as a property. The property is identified by
   the Attribute, PCDATA, or ElementType subelement and belongs to the
   class in whose ClassMap the PropertyMap is nested.

   Attributes and PCDATA can be properties only of their parent
   element type-as-class. An element type can be a property of any
   parent element type. Thus, an element type can be declared to be a
   property of more than one element type-as-class.

   Property values are stored in columns. These can be either in the
   class table (ToColumn) or in a separate table (ToPropertyTable). In
   the latter case, Table identifies the property table, and
   CandidateKey and ForeignKey identify the keys used to join the two
   tables.

   The OrderColumn subelement designates the column in which the
   system stores order information. For more information, see
   OrderColumn below.
-->

<!ELEMENT PropertyMap ((Attribute | PCDATA | ElementType),
                       (ToColumn | ToPropertyTable),
                       OrderColumn?)>

<!ELEMENT ToColumn (Column)>
<!ELEMENT ToPropertyTable (Table, CandidateKey, ForeignKey, Column)>
<!ATTLIST ToPropertyTable
          KeyInParentTable (Candidate | Foreign) #REQUIRED>
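<!--
   Two illustrative PropertyMap elements (all element, table, and
   column names are hypothetical). The first stores an attribute in a
   column of the class table:

      <PropertyMap>
         <Attribute Name="Number"/>
         <ToColumn>
            <Column Name="Number"/>
         </ToColumn>
      </PropertyMap>

   The second stores a child element type in a separate property
   table. Here the candidate key is assumed to be in the class table
   and the foreign key in the property table:

      <PropertyMap>
         <ElementType Name="Description"/>
         <ToPropertyTable KeyInParentTable="Candidate">
            <Table Name="Descriptions"/>
            <CandidateKey Generate="No">
               <Column Name="Number"/>
            </CandidateKey>
            <ForeignKey>
               <Column Name="SONumber"/>
            </ForeignKey>
            <Column Name="Description"/>
         </ToPropertyTable>
      </PropertyMap>
-->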

    
<!--
   RelatedClass elements describe classes that are related to
   the class being defined. In class terms, you can think of
   this as meaning that a property is added to the class being
   defined that points to the related class. In XML terms, this
   means that the element type for the related class is a child
   of the element type for the class being defined.

   (Note that the term "child class" could have been used here,
   but wasn't due to the potential for confusion with parent/
   child table relationships, parent/child element
   relationships, and class inheritance relationships.)

   For example, in the following XML document, if the element
   types <A> and <B> are mapped as classes, then <B> needs to
   be defined as a related class of <A>.

      <A>
         <property_A1>123</property_A1>
         <property_A2>abcde</property_A2>
         <B>
            <property_B1>123</property_B1>
            <property_B2>abcde</property_B2>
         </B>
      </A>

   The RelatedClass element specifies the element type of the
   related class, the candidate and foreign keys used to join
   the tables for the two classes, and the name of the column
   (if any) which contains the order in which the elements for
   the related class appear in the class being defined.
-->

<!ELEMENT RelatedClass (ElementType,
                        CandidateKey, ForeignKey,
                        OrderColumn?)>
<!ATTLIST RelatedClass
          KeyInParentTable (Candidate | Foreign) #REQUIRED>
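<!--
   For the <A>/<B> document above, the ClassMap for A might contain
   something like the following. The column names are hypothetical;
   the candidate key is assumed to be in A's table and the foreign
   key (and therefore the order column) in B's table:

      <RelatedClass KeyInParentTable="Candidate">
         <ElementType Name="B"/>
         <CandidateKey Generate="Yes">
            <Column Name="A_ID"/>
         </CandidateKey>
         <ForeignKey>
            <Column Name="A_FK"/>
         </ForeignKey>
         <OrderColumn Name="B_Order" Generate="Yes"/>
      </RelatedClass>

   B itself is mapped in a separate ClassMap.
-->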

    
<!--
   The CandidateKey and ForeignKey elements describe the keys used to
   join two tables: either two class tables or a class table and a
   property table. Which key occurs in the parent table is declared in
   the RelatedClass or ToPropertyTable element with the
   KeyInParentTable attribute.

   In addition, the CandidateKey element is used to identify the
   columns used to identify rows when extracting data from the root
   table.

   The Generate attribute tells the system whether to generate the
   candidate key. If the key is generated, the user must provide a
   class that generates the key; for more information, see:

      de.tudarmstadt.ito.xmldbms.KeyGenerator
      de.tudarmstadt.ito.xmldbms.helpers.KeyGeneratorImpl

   If the key is not generated, other properties must be mapped to
   the key columns.
-->

<!ELEMENT CandidateKey (Column+)>
<!ATTLIST CandidateKey
          Generate (Yes | No) #REQUIRED>
<!ELEMENT ForeignKey (Column+)>

    
<!--
   ElementType, Attribute, and PCDATA elements are used to identify
   the corresponding XML structures.

   The MultiValued attribute of the Attribute element type states
   whether individual tokens in an attribute are separate values
   (NMTOKENS, IDREFS, and ENTITIES attributes) or a single value
   (CDATA, ID, IDREF, ENTITY, and NMTOKEN attributes).
-->

<!ENTITY % XMLName "Name NMTOKEN #REQUIRED">
<!ELEMENT ElementType EMPTY>
<!ATTLIST ElementType
          %XMLName;>
<!ELEMENT Attribute EMPTY>
<!ATTLIST Attribute
          %XMLName;
          MultiValued (Yes | No) "No">
<!ELEMENT PCDATA EMPTY>
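<!--
   For example, a hypothetical IDREFS attribute named Parts, whose
   value is a list of tokens such as "p1 p2 p3", would be identified
   as follows so that each token is treated as a separate value:

      <Attribute Name="Parts" MultiValued="Yes"/>
-->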

    
<!--
   Table, Column, and OrderColumn names are used to identify the
   corresponding database structures. Table and column names must
   follow the naming conventions used in the database. For example, if
   column names are stored in upper case in the database, then they
   must be specified in upper case in the mapping document. Table
   names may be qualified with catalog and schema names. Column names
   must not be qualified; the table to which they belong is determined
   from context (see below). Column names must not be quoted; the system
   quotes them before using them in SQL statements.

   When transferring data from the database to an XML document, the
   special table name "Result Set" is used when the root table is a
   result set.

   The table to which a column belongs is determined as follows:

   Column element in:        Column occurs in:
   __________________        ________________________________________

   ToColumn                  Class table
   ToPropertyTable           Property table
   CandidateKey              Determined by KeyInParentTable attribute
   ForeignKey                Determined by KeyInParentTable attribute

   OrderColumn element in:   Column occurs in:
   _______________________   ________________________________________

   PseudoRoot                Class table of pseudo-root element
   PropertyMap               Class table
                               (if property mapped as ToColumn)
                             Same table as foreign key
                               (if property mapped as ToPropertyTable)
   RelatedClass              Same table as foreign key

   Order columns are used to store information about the order in
   which elements and PCDATA occur in their parent element, as well as
   the order of values in multi-valued attributes (IDREFS, NMTOKENS,
   and ENTITIES).

   Storing order information is optional; if it is not stored, there
   is no guarantee that order will be preserved in a round trip from
   an XML document to the database and back again. (Note that nesting
   is preserved; that is, subelements and PCDATA always occur in the
   correct parent.)

   The Generate attribute of the OrderColumn element tells the system
   whether to generate order information or not. (The presence or
   absence of the OrderColumn element tells the system whether to use
   order information.) If order information is generated, the order
   column must be of type java.sql.Types.INTEGER. If order information
   is not generated, another property must be mapped to the order
   column.
-->

<!ENTITY % DatabaseName "Name CDATA #REQUIRED">
<!ELEMENT Table EMPTY>
<!ATTLIST Table
          %DatabaseName;>
<!ELEMENT Column EMPTY>
<!ATTLIST Column
          %DatabaseName;>
<!ELEMENT OrderColumn EMPTY>
<!ATTLIST OrderColumn
          %DatabaseName;
          Generate (Yes | No) #REQUIRED>
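<!--
   As an illustration (column names hypothetical), a PCDATA property
   whose position within its parent is to be recorded by the system
   could be mapped like this:

      <PropertyMap>
         <PCDATA/>
         <ToColumn>
            <Column Name="Text"/>
         </ToColumn>
         <OrderColumn Name="TextOrder" Generate="Yes"/>
      </PropertyMap>

   Because Generate is "Yes" and the property is mapped with
   ToColumn, TextOrder would need to be an integer column in the
   class table.
-->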

    