<!--
  * Ecological Metadata Language (EML) - Data set variable descriptors
  *
  *      Authors: Matt Jones, Zheng Wang, and Noah Goldstein
  * Organization: National Center for Ecological Analysis and Synthesis
  *  For Details: http://www.nceas.ucsb.edu/
  *      Created: 1997 August 19
  *     Modified: 1999 June 23
  *      Version: 1.4
  *    File Info: '$Id: eml-variable.dtd 808 2001-07-25 15:57:31Z berkley $'
  *
  * Ecological Metadata Language is a general purpose metadata content
  * specification for documenting ecological data.  The specification
  * consists of a series of modular document type definitions (DTD) that
  * provide metadata content descriptors. It describes the owner and
  * contents of the dataset (eml-dataset.dtd), the research context in
  * which it was created (eml-context.dtd), the structural
  * characteristics of data files (eml-file.dtd), the
  * characteristics of variables in a file (eml-variable.dtd), current
  * status of data and metadata files (eml-status.dtd), access control
  * rules regarding the data and metadata (eml-access.dtd), software
  * information (eml-software) and a variety of miscellaneous
  * supplemental descriptors (eml-supplement.dtd).
  *
  * Files generated under the structural constraints of EML are
  * plain-text files and therefore are editable in ordinary
  * text editors.  However, these DTDs are intended for use within
  * general purpose metadata editors, and within a more specific
  * metadata editor being developed at NCEAS for the ecological
  * community.  This metadata editor will provide facilities for
  * version control and efficient metadata entry.
  * The purpose of this specification was to formalize the
  * Michener et al. work in a structured language to examine its
  * application to ecological data in a controlled manner.
  *
  * This specification was based on the work of the Ecological Society
  * of America's Committee on the Future of Long Term Data, and more
  * specifically on a related paper, Michener et al., 1997. See:
  * Michener, William K., et al., 1997. Ecological Applications,
  * "Nongeospatial metadata for the ecological sciences"
  * Vol 7(1). pp. 330-342.
  *
  * Where appropriate, we have used elements of the ISO/TC 211 draft
  * standard - the ISO Geographic information/Geomatics standard,
  * which includes XML code, as well as ISO 8601 schema. Some elements
  * in the ISO/TC 211 were expanded to allow for greater
  * resolution.
  *
  * For an explanation of the classes of metadata and elements defined
  * below, see Michener et al. 1997. In particular, the numbered comment
  * labels found below refer to Table 1 (pp. 336-337) of Michener
  * et al. 1997. In addition, each of the principal elements in the
  * specification is accompanied by a FIXED attribute called "description"
  * that provides a brief description of the content of the element. These
  * descriptions are derived from Michener et al. 1997.
  *
-->

<!--    *      *      *      *
        CLASS IV B - VARIABLE DESCRIPTORS
        *      *      *      *
        -->

<!-- Class 4 B -->
<!ELEMENT eml-variable (meta_file_id, variable*)>
<!ATTLIST eml-variable description CDATA #FIXED "Variable description for a file">

<!ELEMENT meta_file_id (#PCDATA)>
<!ATTLIST meta_file_id description CDATA #FIXED "Unique identifier of this metadata record">

<!ELEMENT variable (variable_name, variable_definition, unit?, storage_type?,
                    code_definition*, numeric_range*, missing_value_code*,
                    precision?, field_format?)>
<!ATTLIST variable description CDATA #FIXED "Variable information">
<!ELEMENT unit (#PCDATA)>
<!ATTLIST unit description CDATA #FIXED "Unit">
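
<!--
Illustrative sketch only: an instance document that should validate
against the declarations in this file.  The identifier, variable names,
codes, units, and values below are hypothetical and are not taken from
any real data set.  The FIXED "description" attributes are supplied by
the DTD and do not need to be written out in the instance.

<eml-variable>
  <meta_file_id>example.1.1</meta_file_id>
  <variable>
    <variable_name>sitecode</variable_name>
    <variable_definition>Code identifying the sampling site</variable_definition>
    <storage_type>string</storage_type>
    <code_definition>
      <code>100</code>
      <definition>Upland forest plot</definition>
    </code_definition>
    <missing_value_code>NA</missing_value_code>
    <field_format>
      <fixed_width><field_width>3</field_width></fixed_width>
    </field_format>
  </variable>
  <variable>
    <variable_name>response</variable_name>
    <variable_definition>Measured response at the site</variable_definition>
    <unit>grams</unit>
    <storage_type>floating point</storage_type>
    <numeric_range><minimum>0.0</minimum><maximum>10.0</maximum></numeric_range>
    <missing_value_code>-999</missing_value_code>
    <precision>2</precision>
    <field_format>
      <variable_width><delimiter>,</delimiter></variable_width>
    </field_format>
  </variable>
</eml-variable>

Child elements must appear in the order given by the content model of
"variable"; only variable_name and variable_definition are required.
-->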

<!-- Class 4.B.1 -->
<!ELEMENT variable_name (#PCDATA)>
<!ATTLIST variable_name description CDATA #FIXED "Unique variable name or code">

<!-- Class 4.B.2 -->
<!ELEMENT variable_definition (#PCDATA)>
<!ATTLIST variable_definition description CDATA #FIXED "Precise definition of variables in data set">

<!-- Class 4.B.3 - see 4.A.2 -->

<!-- Class 4.B.4.a -->
<!ELEMENT storage_type (#PCDATA)>
<!ATTLIST storage_type description CDATA #FIXED "Storage type; Integer, floating point, character, string">

<!-- Class 4.B.4.b -->
<!ELEMENT code_definition (code, definition)>
<!ATTLIST code_definition description CDATA #FIXED "Description of any codes associated with variables">
<!ELEMENT code (#PCDATA)>
<!ATTLIST code description CDATA #FIXED "Code">
<!ELEMENT definition (#PCDATA)>
<!ATTLIST definition description CDATA #FIXED "List and definition of variable codes">

<!-- Class 4.B.4.c -->
<!ELEMENT numeric_range (minimum?, maximum?)>
<!ATTLIST numeric_range description CDATA #FIXED "Range for numeric values">
<!ELEMENT minimum (#PCDATA)>
<!ATTLIST minimum description CDATA #FIXED "Minimum value">
<!ELEMENT maximum (#PCDATA)>
<!ATTLIST maximum description CDATA #FIXED "Maximum value">

<!-- Class 4.B.4.d -->
<!ELEMENT missing_value_code (#PCDATA)>
<!ATTLIST missing_value_code description CDATA #FIXED "Character used to represent missing data">

<!-- Class 4.B.4.e -->
<!ELEMENT precision (#PCDATA)>
<!ATTLIST precision description CDATA #FIXED "Precision; number of significant digits">

<!-- Class 4.B.5 -->
<!ELEMENT field_format (variable_width|fixed_width)>
<!ATTLIST field_format description CDATA #FIXED "Data format">

<!--
Data sets are generally classified as fixed_width format or
variable_width format, but we have determined that this is actually a
per-field classification because one may encounter fixed_width fields
mixed together in the same data file with variable_width fields.

In our encoding scheme, the start of each field is assumed to be the
column after the last column of the previous field, or the first column
if this is the first field in the record.  The end column for each
field is determined by its field_format and some information specific to
each field_format type that indicates in which column the field ends. The
two types of field formats are variable_width and fixed_width.
Variable_width fields can vary in their field length, and the end of the
field is delimited by a special character called a field delimiter,
usually a comma or a tab character.  Fixed_width fields have a set
length, and so the end of the field can always be determined from the
starting column number and the field_width: the field ends at the
starting column plus the field_width minus one.  Here is an example:

Assume we have the following data in a data set:

May,100aaa,1.2,
April,200aaa,3.4,
June,300bbb,4.6,

The metadata for the 4 fields would include the following (other
elements, such as the required variable_definition, are omitted here
for brevity):
<variable><variable_name>month</variable_name>
<field_format><variable_width><delimiter>,</delimiter>
</variable_width></field_format></variable>

<variable><variable_name>sitecode</variable_name>
<field_format><fixed_width><field_width>3</field_width>
</fixed_width></field_format></variable>

<variable><variable_name>subsitecode</variable_name>
<field_format><fixed_width><field_width>3</field_width>
</fixed_width></field_format></variable>

<variable><variable_name>response</variable_name>
<field_format><variable_width><delimiter>,</delimiter>
</variable_width></field_format></variable>

-->

<!ELEMENT variable_width (delimiter+)>
<!ATTLIST variable_width description CDATA #FIXED "Variable width field">
<!ELEMENT delimiter (#PCDATA)>
<!ATTLIST delimiter description CDATA #FIXED "Character used to delimit end of field">
<!ELEMENT fixed_width (field_width)>
<!ATTLIST fixed_width description CDATA #FIXED "Fixed width field">
<!ELEMENT field_width (#PCDATA)>
<!ATTLIST field_width description CDATA #FIXED "Width of field in characters">

<!-- Class 4.B.5.a - see Class 4.B.5 -->

<!-- Class 4.B.5.b - see Class 4.B.5 -->

<!-- Class 4.B.5.c -->
<!-- This section was removed as we were unsure of its usefulness -->

<!-- End of file -->