<!--
  * Ecological Metadata Language (EML) - Data set variable descriptors
  *
  *      Authors: Matt Jones, Zheng Wang, and Noah Goldstein
  * Organization: National Center for Ecological Analysis and Synthesis
  *  For Details: http://www.nceas.ucsb.edu/
  *      Created: 1997 August 19
  *     Modified: 1999 June 23
  *      Version: 1.4
  *    File Info: '$Id$'
  *
  * Ecological Metadata Language is a general purpose metadata content
  * specification for documenting ecological data.  The specification
  * consists of a series of modular document type definitions (DTD) that
  * provide metadata content descriptors. It describes the owner and
  * contents of the dataset (eml-dataset.dtd), the research context in
  * which it was created (eml-context.dtd), the structural
  * characteristics of data files (eml-file.dtd), the
  * characteristics of variables in a file (eml-variable.dtd), current
  * status of data and metadata files (eml-status.dtd), access control
  * rules regarding the data and metadata (eml-access.dtd), software
  * information (eml-software) and a variety of miscellaneous
  * supplemental descriptors (eml-supplement.dtd).
  *
  * Files generated under the structural constraints of EML are
  * plain-text files and therefore are editable in ordinary
  * text processors.  However, these DTDs are intended for use within
  * general purpose metadata editors, and within a more specific
  * metadata editor being developed at NCEAS for the ecological
  * community.  This metadata editor will provide facilities for
  * version control and efficient metadata entry.
  * The purpose of this specification was to formalize the
  * Michener et al. work in a structured language to examine its
  * application to ecological data in a controlled manner.
  *
  * This specification was based on the work of the Ecological Society
  * of America's Committee on the Future of Long Term Data, and more
  * specifically on a related paper, Michener et al., 1997. See:
  * Michener, William K., et al., 1997. Ecological Applications,
  * "Nongeospatial metadata for the ecological sciences"
  * Vol 7(1). pp. 330-342.
  *
  * Where appropriate, we have used elements of the ISO/TC 211 draft
  * standard - the ISO Geographic information/Geomatics standard,
  * which includes XML code, as well as ISO 8601 schema. Some elements
  * in the ISO/TC 211 were expanded to allow for greater
  * resolution.
  *
  * For an explanation of the classes of metadata and elements defined
  * below, see Michener et al. 1997. In particular, the numbered comment
  * labels found below refer to Table 1 (pp. 336-337) of Michener
  * et al. 1997. In addition, each of the principal elements in the
  * specification is accompanied by a FIXED attribute called "description"
  * that provides a brief description of the content of the element. These
  * descriptions are derived from Michener et al. 1997.
  *
-->

<!--    *      *      *      *
        CLASS IV B - VARIABLE DESCRIPTORS
        *      *      *      *
        -->

<!-- Class 4 B -->
<!ELEMENT eml-variable (meta_file_id, variable*)>
<!ATTLIST eml-variable description CDATA #FIXED "Variable description for a file">

<!ELEMENT meta_file_id (#PCDATA)>
<!ATTLIST meta_file_id description CDATA #FIXED "Unique identifier of this metadata record">

<!ELEMENT variable (variable_name, variable_definition, unit?, storage_type?,
                    code_definition*, numeric_range*, missing_value_code*,
                    precision?, field_format?)>
<!ATTLIST variable description CDATA #FIXED "Variable information">
<!ELEMENT unit (#PCDATA) >
<!ATTLIST unit description CDATA #FIXED "Unit">
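
<!--
A minimal sketch of an instance document that follows the content model
above. All values (the identifier, the variable name, the unit, and so on)
are hypothetical and only illustrate the required order of the child
elements. The FIXED "description" attributes need not be written out in an
instance, since a parser that reads this DTD supplies their values.

<eml-variable>
  <meta_file_id>example.1</meta_file_id>
  <variable>
    <variable_name>temp</variable_name>
    <variable_definition>Air temperature at 2 m above ground</variable_definition>
    <unit>degrees Celsius</unit>
    <storage_type>floating point</storage_type>
    <missing_value_code>NA</missing_value_code>
    <precision>3</precision>
  </variable>
</eml-variable>
-->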

<!-- Class 4.B.1 -->
<!ELEMENT variable_name (#PCDATA) >
<!ATTLIST variable_name description CDATA #FIXED "Unique variable name or code">

<!-- Class 4.B.2 -->
<!ELEMENT variable_definition (#PCDATA)>
<!ATTLIST variable_definition description CDATA #FIXED "Precise definition of variables in data set">

<!-- Class 4.B.3 - see 4.A.2 -->

<!-- Class 4.B.4.a -->
<!ELEMENT storage_type (#PCDATA) >
<!ATTLIST storage_type description CDATA #FIXED "Storage type; Integer, floating point, character, string">

<!-- Class 4.B.4.b -->
<!ELEMENT code_definition (code, definition) >
<!ATTLIST code_definition description CDATA #FIXED "Description of any codes associated with variables">
<!ELEMENT code (#PCDATA) >
<!ATTLIST code description CDATA #FIXED "Code">
<!ELEMENT definition (#PCDATA) >
<!ATTLIST definition description CDATA #FIXED "List and definition of variable codes">
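
<!--
A hypothetical illustration of code_definition: each code and definition
pair gets its own code_definition element, and the element is repeated
within a variable once per code. The codes and definitions shown here are
invented for the example.

<code_definition><code>M</code><definition>Male</definition></code_definition>
<code_definition><code>F</code><definition>Female</definition></code_definition>
-->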

<!-- Class 4.B.4.c -->
<!ELEMENT numeric_range (minimum?, maximum?) >
<!ATTLIST numeric_range description CDATA #FIXED "Range for numeric values">
<!ELEMENT minimum (#PCDATA) >
<!ATTLIST minimum description CDATA #FIXED "Minimum value">
<!ELEMENT maximum (#PCDATA) >
<!ATTLIST maximum description CDATA #FIXED "Maximum value">
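
<!--
A hypothetical illustration of numeric_range. Both minimum and maximum are
optional in the content model, so a range that is open at one end can
presumably be expressed by omitting the corresponding element. The values
shown here are invented for the example.

<numeric_range><minimum>0</minimum><maximum>100</maximum></numeric_range>
<numeric_range><minimum>0</minimum></numeric_range>
-->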

<!-- Class 4.B.4.d -->
<!ELEMENT missing_value_code (#PCDATA) >
<!ATTLIST missing_value_code description CDATA #FIXED "Character used to represent missing data">

<!-- Class 4.B.4.e -->
<!ELEMENT precision (#PCDATA) >
<!ATTLIST precision description CDATA #FIXED "Precision; number of significant digits">

<!-- Class 4.B.5 -->
<!ELEMENT field_format (variable_width|fixed_width)>
<!ATTLIST field_format description CDATA #FIXED "Data format">

<!--
Data sets are generally classified as fixed_width format or
variable_width format, but we have determined that this is actually a
per-field classification because one may encounter fixed_width fields
mixed together in the same data file with variable_width fields.

In our encoding scheme, the start of each field is assumed to be the
column after the last column of the previous field, or the first column
if this is the first field in the dataset.  The end column of each
field is determined by its field_format, together with information
specific to that field_format type that indicates the column in which the
field ends. The two types of field formats are variable_width and
fixed_width. Variable_width fields can vary in length, and the end of the
field is marked by a special character called a field delimiter,
usually a comma or a tab character.  Fixed_width fields have a set
length, so the end of the field can always be determined by adding
the field_width to the starting column number.  Here is an example:

Assume we have the following data in a data set:

May,100aaa,1.2,
April,200aaa,3.4,
June,300bbb,4.6,

The metadata for the 4 fields would include the following (other child
elements of variable, such as variable_definition, are omitted for brevity):
<variable><variable_name>month</variable_name>
<field_format><variable_width><delimiter>,</delimiter>
</variable_width></field_format></variable>

<variable><variable_name>sitecode</variable_name>
<field_format><fixed_width><field_width>3</field_width>
</fixed_width></field_format></variable>

<variable><variable_name>subsitecode</variable_name>
<field_format><fixed_width><field_width>3</field_width>
</fixed_width></field_format></variable>

<variable><variable_name>response</variable_name>
<field_format><variable_width><delimiter>,</delimiter>
</variable_width></field_format></variable>

-->

<!ELEMENT variable_width (delimiter+)>
<!ATTLIST variable_width description CDATA #FIXED "Variable width field">
<!ELEMENT delimiter (#PCDATA)>
<!ATTLIST delimiter description CDATA #FIXED "Character used to delimit end of field">
<!ELEMENT fixed_width (field_width)>
<!ATTLIST fixed_width description CDATA #FIXED "Fixed width field">
<!ELEMENT field_width (#PCDATA)>
<!ATTLIST field_width description CDATA #FIXED "Width of field in characters">

<!-- Class 4.B.5.a - see Class 4.B.5 -->

<!-- Class 4.B.5.b - see Class 4.B.5 -->

<!-- Class 4.B.5.c -->
<!-- This section was removed as we were unsure of its usefulness -->

<!-- End of file -->