'$RCSfile: eml-physical.xsd,v $'
Copyright: 1997-2002 Regents of the University of California,
University of New Mexico, and
Arizona State University
Sponsors: National Center for Ecological Analysis and Synthesis and
Partnership for Interdisciplinary Studies of Coastal Oceans,
University of California Santa Barbara
Long-Term Ecological Research Network Office,
University of New Mexico
Center for Environmental Studies, Arizona State University
Other funding: National Science Foundation (see README for details)
The David and Lucile Packard Foundation
For Details: http://knb.ecoinformatics.org/
'$Author: cjones $'
'$Date: 2002/09/16 23:40:58 $'
'$Revision: 1.43 $'
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
eml-physical
The eml-physical module describes the external
and internal physical characteristics of a data object as well as the
information required for its distribution. Examples of the external
physical characteristics of a data object would be the filename, size,
compression, encoding methods, and authentication of a file (or byte
stream) that resides on a filesystem or the name of a database table
if the data object resides in a relational database. Internal
physical characteristics describe the format of the data object being
described. Examples are Microsoft Access 2000, ASCII, or UTF-8. It
also includes the information needed to parse the data object to
extract the entity and its attributes from the data object.
Distribution information describes how to retrieve the data object.
The retrieval information can be either online with connection
information, a URL for example, or offline with the data object
residing on an archival tape.
Any data object that is being described by EML
needs this information so the entities and attributes that reside
within the data object can be extracted.
yes
Physical structure.
Physical structure of an entity or entities.
The content model for physical is a CHOICE between
"references" and all of the elements that let you describe the
internal/external characteristics and distribution of a data object
(e.g., dataObject, dataFormat, distribution). A physical element can
contain a reference to a physical element defined elsewhere. Using
a reference means that the referenced physical is identical, not just
in name but identical in its complete description.
The eml-physical was introduced into EML 1.4 as
eml-file.
Data object size
Describes the physical size of the
data object.
This element contains information on the
physical size of the entity, typically in
bytes.
13
The entitySize was introduced into EML
1.4.
Unit of measurement
Unit of measurement for the entity
size, typically bytes
This element gives the unit of
measurement for the size of the entity, and is
typically bytes.
13
The unit was introduced into EML
1.4.
Authentication method
A value, typically a checksum, used to
authenticate that the bitstream delivered to the user is
identical to the original.
This element describes authentication
procedures or techniques, typically by giving a checksum
method (e.g., MD5) and checksum value for the
bytestream.
f5b2177ea03aea73de12da81f896fe40
The authentication element was introduced into
EML 1.4.
Authentication method
The method used to calculate an
authentication checksum.
This element names the method used
to calculate an authentication checksum that can
be used to validate a bytestream. Typical checksum
methods include MD5 and CRC.
f5b2177ea03aea73de12da81f896fe40
The authentication element was
introduced into EML 1.4.
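As a minimal sketch of how a consumer might use the authentication
value, the following computes an MD5 checksum over a byte stream and
compares it with the value recorded in the metadata. The function names
md5_checksum and is_authentic are hypothetical and are not defined by
EML; only MD5 as a checksum method comes from the description above.

```python
import hashlib


def md5_checksum(data: bytes) -> str:
    """Return the hexadecimal MD5 digest of a byte stream."""
    return hashlib.md5(data).hexdigest()


def is_authentic(data: bytes, expected_hex: str) -> bool:
    """Compare the computed digest with the checksum stored in the metadata.

    The comparison ignores case and surrounding whitespace (an assumption
    about how the checksum value may have been written down).
    """
    return md5_checksum(data) == expected_hex.strip().lower()
```

A delivered byte stream would be accepted only when is_authentic returns
True for the checksum value given in the authentication element.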
Entity's compression method
Name of the entity's compression
method
This element describes any compression
methods used to compress the entity, such as zip, compress,
etc.
The compressed element was introduced into EML
1.4.
Encoding Method
Method used for encoding the
entity
This element describes the entity's
encoded method, such as MIME base64 encoding or binhex
encoding.
The encoded element was introduced into EML
1.4.
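The compression and encoding methods tell a reader which transformations
to undo before parsing. The sketch below is one possible way to do
that. It assumes that the encoding was applied after compression, that
the method names are spelled "base64" and "gzip", and it uses gzip only
as a stand-in compression method; the helper name unpack is
hypothetical. None of these details are prescribed by EML.

```python
import base64
import gzip
from typing import Optional


def unpack(payload: bytes,
           encoded: Optional[str] = None,
           compressed: Optional[str] = None) -> bytes:
    """Undo the encoding first, then the compression (assumed order)."""
    if encoded == "base64":
        payload = base64.b64decode(payload)
    elif encoded is not None:
        raise ValueError(f"unsupported encoding method: {encoded}")

    if compressed == "gzip":
        payload = gzip.decompress(payload)
    elif compressed is not None:
        raise ValueError(f"unsupported compression method: {compressed}")

    return payload
```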
Character Encoding
Contains the name of the character encoding
used for the data.
This element contains the name of the
character encoding. This is typically ASCII or UTF-8, or
one of the other common encodings.
UTF-8
Introduced in EML 2.0
Data format
Describes the internal physical format
of a data object.
This element is the parent of a CHOICE
between four possible internal physical formats,
each of which describes the internal
physical characteristics of the data object. Using this
information, the user should be able to construct the entity
and attributes described in those modules. Note that this is
the format of the
physical file itself.
The format element was introduced into EML
1.4.
Generic binary format
Generic binary format
Documentation for a generic binary
format
Introduced in EML 2.0.
Record delimiter character
Character used to delimit
records.
This element specifies the record
delimiter character when the format is text. The
record delimiter is usually a newline (\n) on UNIX, a
carriage return (\r) on MacOS, or both (\r\n) on
Windows/DOS. Multiline records are usually delimited
with two line ending characters, for example on UNIX
it would be two newline characters
(\n\n).
\n\r
The recordDelimiter element was
introduced into EML 1.4.
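A minimal sketch of how a reader might apply the record delimiter:
the delimiter is assumed to be written in the metadata with backslash
escapes (such as \n or \r\n), which are translated to the actual
characters before the data is split. The function names are
hypothetical.

```python
# Escape sequences assumed to appear in the metadata value.
_ESCAPES = {"\\n": "\n", "\\r": "\r", "\\t": "\t"}


def decode_delimiter(text: str) -> str:
    """Translate escape sequences such as '\\n' into the real characters."""
    for escaped, actual in _ESCAPES.items():
        text = text.replace(escaped, actual)
    return text


def split_records(data: str, record_delimiter: str) -> list:
    """Split a text data object into records on the given delimiter."""
    delimiter = decode_delimiter(record_delimiter)
    records = data.split(delimiter)
    # A trailing delimiter produces one empty record at the end; drop it.
    if records and records[-1] == "":
        records.pop()
    return records
```

With a delimiter of \n\n, each returned record may itself contain single
newlines, which matches the multiline case described above.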
Quote character
Character used to quote values for
delimiter escaping
This element specifies a character
to be used in the entity for quoting values so that
field delimiters can be used within the value. This
basically allows delimiter "escaping". The
quoteCharacter is typically a " or '.
"
The quoteCharacter element was taken
from the NBII standard.
Literal character
Character used to escape other
characters
This element specifies a character
to be used for escaping character values so that the
following character is treated as its literal value.
This allows "escaping" for special characters like
quotes, commas, and spaces when they aren't intended
as a delimiter value. The literalCharacter is
typically a \.
\
Introduced in EML 2.0.
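The interaction between the quote character and the literal character
can be sketched as a small field splitter. This is an illustration
under stated assumptions, not a parser defined by EML: the delimiter,
quote character, and literal character are each assumed to be a single
character, and the function name split_fields is hypothetical.

```python
def split_fields(line: str,
                 delimiter: str = ",",
                 quote_character: str = '"',
                 literal_character: str = "\\") -> list:
    """Split one record into fields, honoring quoting and escaping."""
    fields = []
    current = []
    in_quotes = False
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == literal_character and i + 1 < len(line):
            # The character after the literal character is taken as-is.
            current.append(line[i + 1])
            i += 2
            continue
        if ch == quote_character:
            # Quote characters toggle quoting and are not part of the value.
            in_quotes = not in_quotes
        elif ch == delimiter and not in_quotes:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    fields.append("".join(current))
    return fields
```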
ASCII fixed delimited
Describes physical format of entities
and attributes delimited by special characters like commas
and spaces.
Describes physical format of entities
and attributes delimited by special characters like commas
and spaces.
Introduced in EML 2.0.
Field width
FieldWidth specification for fixed
field length.
FixedWidth fields have a set
length, thus the end of the field can always be
determined by adding the fieldWidth to the
starting column number.
any positive integer, see example
in "delimiter" description
The fieldWidth element was
introduced into EML 1.4. Semantics changed to
work identically to the NBII DTD.
Physical Line Number
The line on which the data field
is found, when the data record is written over
more than one physical line in the
file.
A single logical data record
may be written over several physical lines in a
file, with no special marker to indicate the
end of a record. In such cases, the relative
location of a data field must be indicated by
both relative row and column
number.
3
Introduced into EML
2.0.
Start column
The starting column number for a
fixed format attribute.
FixedWidth fields have a set
length, thus the end of the field can always be
determined by adding the fieldWidth to the
starting column number.
any positive integer, see example
in "delimiter" description
Introduced into EML
2.0.
Number of physical lines
The number of physical lines in the file
spanned by a single logical data
record.
A single logical data record may be
written over several physical lines in a file, with
no special marker to indicate the end of a record. In
such cases, it is necessary to know the number of
lines per record in order to correctly read
them.
3
Introduced into EML 2.0.
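The following sketch shows how the number of physical lines, the
physical line number, the start column, and the field width could be
combined to locate one fixed-width field. It assumes that line numbers
and column numbers are 1-based; the function names and the sample data
in the usage note are hypothetical.

```python
def group_records(lines: list, num_physical_lines: int) -> list:
    """Group physical lines into logical records of a fixed line count."""
    if num_physical_lines < 1:
        raise ValueError("num_physical_lines must be a positive integer")
    if len(lines) % num_physical_lines != 0:
        raise ValueError("line count is not a multiple of the record length")
    return [lines[i:i + num_physical_lines]
            for i in range(0, len(lines), num_physical_lines)]


def field_at(record: list, line_number: int,
             start_column: int, field_width: int) -> str:
    """Extract a fixed-width field from one physical line of a record."""
    physical_line = record[line_number - 1]   # 1-based line number
    begin = start_column - 1                  # 1-based start column
    return physical_line[begin:begin + field_width]
```

For example, with three physical lines per record, field_at(record, 2,
1, 4) reads the first four characters of the second line of that record.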
ASCII field delimited
Describes physical format of entities
and attributes delimited by special characters like commas
and spaces.
Describes physical format of entities
and attributes delimited by special characters like commas
and spaces.
Introduced in EML 2.0.
Attribute delimiter
The end of the attribute (field) is
delimited by a special character called a field
delimiter.
Variable width format fields (attributes) can vary
in their
field length, thus the end of the field is
delimited by a special character called a
field delimiter (typically a comma or a space).
Data sets are generally classified as fixedWidth
format or variableWidth format, but we have
determined that this is actually a per-field
classification because one may encounter
fixedWidth fields mixed together in the same
data file with variableWidth fields.
In our encoding scheme, the start of each field
is assumed to be the column after the last column
of the previous field, or the first column
if this is the first field in the dataset, unless
the starting column is explicitly enumerated using the
"fieldStartColumn" element.
The end column for each field is classified
using either a special character delimiter indicated
using the "fieldDelimiter" element,
or a fixed field length indicated by using the
"fieldWidth"
element. The delimiter for the last field in the
data set can be omitted.
variableWidth fields can vary in their field length,
and the end of
the field is delimited by a special character
called a field delimiter, usually a comma or
a tab character. fixedWidth fields have a set
length, and so the end of the field can always
be determined by adding the fieldWidth to the
starting column number. Here is an example:
Assume we have the following data in a data set:
May,100aaaa,1.2,
April,200aaaa,3.4,
June,300bbbb,4.6,
The metadata indicating the physical layout of the
4 fields would include the
following:
,
3
3
,
In a strictly fixed format file, the metadata would
be slightly different:
May100aaaa1.2
Apr200aaaa3.4
Jun300bbbb4.6
3
3
4
3
or, one could explicitly describe the starting columns:
1
3
4
3
7
4
11
3
comma, tab, white space,
etc.
The delimiter element was introduced
into EML 1.4. Semantics changed to work identically
to the NBII DTD, and then modified to fit more
cases.
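The per-field scheme described above can be sketched as a parser that
walks a list of field specifications. Each specification is a plain
dictionary with either a "delimiter" or a "width" key and an optional
1-based "start" key; this dictionary layout and the name parse_record
are assumptions made for illustration and are not EML structures. The
mixed layout used in the usage note is one plausible reading of the
first sample data set, not a restatement of the schema's own example.

```python
def parse_record(line: str, fields: list) -> list:
    """Split one record using per-field delimiter or width specifications."""
    values = []
    pos = 0  # 0-based position where the next field starts
    for spec in fields:
        if "start" in spec:
            # An explicit start column overrides the running position.
            pos = spec["start"] - 1
        if "width" in spec:
            end = pos + spec["width"]
            values.append(line[pos:end])
            pos = end
        else:
            delimiter = spec["delimiter"]
            end = line.find(delimiter, pos)
            if end == -1:
                # The delimiter of the last field may be omitted.
                values.append(line[pos:])
                pos = len(line)
            else:
                values.append(line[pos:end])
                pos = end + len(delimiter)
    return values
```

For the mixed data line "May,100aaaa,1.2," a layout of delimiter ",",
width 3, delimiter ",", delimiter "," yields May, 100, aaaa, and 1.2.
For the strictly fixed line "May100aaaa1.2", widths of 3, 3, 4, and 3
give the same four values, with or without explicit start columns of 1,
4, 7, and 11.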
Physical Line Number
The line on which the data field
is found, when the data record is written over
more than one physical line in the
file.
A single logical data record
may be written over several physical lines in a
file, with no special marker to indicate the
end of a record. In such cases, the relative
location of a data field must be indicated by
both relative row and column
number.
3
Introduced into EML
2.0.
Quote character
Character used to quote values for
delimiter escaping
This element specifies a character
to be used in the entity for quoting values so that
field delimiters can be used within the value. This
basically allows delimiter "escaping". The
quoteCharacter is typically a " or '.
"
The quoteCharacter element was taken
from the NBII standard.
Literal character
Character used to escape other
characters
This element specifies a character
to be used for escaping character values so that the
following character is treated as its literal value.
This allows "escaping" for special characters like
quotes, commas, and spaces when they aren't intended
as a delimiter value. The literalCharacter is
typically a \.
\
Introduced in EML 2.0.
Format Name
Name of the internal format of the
data object
Name of the internal format of the
data object
Microsoft Excel
The formatName element was
introduced into EML 2.0
Format Version
Version of the internal format of the
data object
Version of the internal format of the
data object
2000 (9.0.2720)
The formatVersion element was
introduced into EML 2.0
citation
Data object is an eml-literature document.
Data object conforms to the
EML standard for citation as defined in the XML schema
for eml-literature.
eml-literature.xml
The citation element was
introduced into EML 2.0
raster image parameters
contains binary raster data header
parameters
The binaryRasterInfo element is a
container for various parameters used to describe the
contents of binary raster image files. In this case, it is
based on a white paper on the ESRI site that describes the
header information used for BIP and BIL files ("Extendable
Image Formats for ArcView GIS 3.1 and
3.2").
Introduced in EML 2.0.
Number of rows
The number of rows in the image.
The number of rows in the image.
Rows are parallel to the x-axis of the map coordinate
system. There is no default.
400
Introduced in EML 2.0.
Number of columns
The number of columns in the image.
The number of columns in the image.
Columns are parallel to the y-axis of the map
coordinate system. There is no
default.
600
Introduced in EML 2.0.
Entity's record
orientation
Specification of the binary raster
entity's record orientation.
This element contains specification
of the binary raster entity's record orientation by
defining the element's attribute "columnorrow". The
binary raster will be column major if the raster is
to be displayed column by column from the byte
stream, or row major if it is to be displayed row by
row from the byte stream.
The valid attribute values are
"columnmajor" or "rowmajor". If the attribute is not
specified, "columnmajor" is used.
The orientation element was introduced
into EML 2.0
Attribute of orientation
element
Specification of the entity's record
orientation.
This attribute specifies the
entity's record orientation.
The valid attribute values are
"columnmajor" or "rowmajor". If the attribute is
not specified, "columnmajor" is
used.
The columnorrow attribute was
introduced into EML 1.4.
Number of Bands
The number of spectral bands in the
image.
The number of spectral bands in the
image. The default is 1.
1
Introduced in EML 2.0.
Number of Bits
The number of bits per pixel per
band.
The number of bits per pixel per
band. Acceptable values are 1, 4, 8, 16, and 32. The
default value is eight bits per pixel per band. For a
true color image with three bands (R, G, B) stored
using eight bits for each pixel in each band, nbits
equals eight and nbands equals three, for a total of
twenty-four bits per pixel. For an image with nbits
equal to one, nbands must also equal
one.
8
Introduced in EML 2.0.
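The constraints on nbits and nbands stated above can be checked with a
few lines of code. This is a minimal sketch; the function name
bits_per_pixel is hypothetical.

```python
VALID_NBITS = (1, 4, 8, 16, 32)


def bits_per_pixel(nbits: int = 8, nbands: int = 1) -> int:
    """Validate nbits/nbands and return the total bits stored per pixel."""
    if nbits not in VALID_NBITS:
        raise ValueError(f"nbits must be one of {VALID_NBITS}")
    if nbands < 1:
        raise ValueError("nbands must be at least 1")
    if nbits == 1 and nbands != 1:
        raise ValueError("an image with nbits equal to 1 must have nbands equal to 1")
    return nbits * nbands
```

For the true color case described above, bits_per_pixel(8, 3) gives 24.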
Byte Order
The byte order in which image pixel
values are stored.
The byte order in which image pixel
values are stored. The byte order is important for
sixteen-bit images, with two bytes per pixel.
Acceptable values are: I, Intel byte order (Silicon
Graphics, DEC Alpha, PC), also known as little-endian; and
M, Motorola byte order (Sun, HP, etc.), also known as
big-endian. The default byte order is the same as
that of the host machine executing the
software.
I or M
Introduced in EML 2.0.
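To illustrate why the byte order matters for sixteen-bit images, the
sketch below decodes unsigned sixteen-bit pixel values from raw bytes.
The function name is hypothetical, and treating the samples as unsigned
is an assumption; the mapping of I to little-endian and M to big-endian,
and the fallback to the host byte order, follow the description above.

```python
import struct
import sys
from typing import Optional


def read_uint16_pixels(raw: bytes, byteorder: Optional[str] = None) -> list:
    """Decode unsigned 16-bit pixel values using the declared byte order."""
    if byteorder is None:
        # Default: the byte order of the machine running this code.
        prefix = "<" if sys.byteorder == "little" else ">"
    else:
        prefix = {"I": "<", "M": ">"}[byteorder.upper()]
    count = len(raw) // 2
    return list(struct.unpack(f"{prefix}{count}H", raw[:count * 2]))
```

The same four bytes decode to different pixel values under I and M,
which is why the byte order needs to be recorded in the metadata.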
Layout
The organization of the bands in the
image file.
The organization of the bands in the
image file. Acceptable values are: bil, band
interleaved by line; bip, band interleaved by pixel;
and bsq, band sequential. The default layout is
bil.
bil, bip, bsq
Introduced in EML 2.0.
Skip Bytes
The number of bytes of data in the
image file to skip in order to reach the start of the
image data.
The number of bytes of data in the
image file to skip in order to reach the start of the
image data. This keyword allows you to bypass any
existing image header information in the file. The
default value is zero bytes.
0
Introduced in EML 2.0.
upper left X map coordinate
The x-axis map coordinate of the
center of the upper-left pixel.
The x-axis map coordinate of the
center of the upper-left pixel. If this parameter is
specified, ulymap must also be set, otherwise a
default value is used.
340000
Introduced in EML 2.0.
upper left Y map coordinate
The y-axis map coordinate of the
center of the upper-left pixel.
The y-axis map coordinate of the
center of the upper-left pixel. If you specify this
parameter, set ulxmap, too, otherwise a default value
is used.
6486666
Introduced in EML 2.0.
X dimension
The x-dimension of a pixel in map
units.
The x-dimension of a pixel in map
units. If this parameter is specified, ydim must also
be set, otherwise a default value is
used.
16.665
Introduced in EML 2.0.
Y dimension
The y-dimension of a pixel in map
units.
The y-dimension of a pixel in map
units. If this parameter is specified, xdim must also
be set, otherwise a default value is
used.
16.665
Introduced in EML 2.0.
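Given ulxmap, ulymap, xdim, and ydim, the map coordinate of any pixel
center can be computed. The sketch below assumes zero-based row and
column indices and a north-up image in which the y coordinate decreases
as the row index increases; neither assumption is stated in the element
descriptions, and the function name pixel_center is hypothetical.

```python
def pixel_center(row: int, col: int, *,
                 ulxmap: float, ulymap: float,
                 xdim: float, ydim: float) -> tuple:
    """Return the (x, y) map coordinate of the center of a pixel."""
    x = ulxmap + col * xdim   # columns advance along the x-axis
    y = ulymap - row * ydim   # rows advance downward (assumed north-up)
    return (x, y)
```

With the example values above, pixel_center(0, 0, ...) returns the
upper-left pixel center itself, (340000, 6486666).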
Bytes per band per row
The number of bytes per band per
row.
The number of bytes per band per
row. This must be an integer. This keyword is used
only with BIL files when there are extra bits at the
end of each band within a row that must be
skipped.
3
Introduced in EML 2.0.
Total bytes of data per row
The total number of bytes of data
per row.
The total number of bytes of data
per row. Use totalrowbytes when there are extra
trailing bits at the end of each
row.
8
Introduced in EML 2.0.
Bytes between bands
The number of bytes between bands in
a BSQ format image.
The number of bytes between bands in
a BSQ format image. The default is
zero.
1
Introduced in EML 2.0.
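The layout, skipbytes, bandrowbytes, totalrowbytes, and bandgapbytes
parameters together determine where a given sample sits in the file.
The sketch below is one interpretation of those parameters, not a
definitive implementation: it handles only whole-byte samples (nbits of
8, 16, or 32), uses zero-based row, column, and band indices, and
derives defaults for bandrowbytes and totalrowbytes from ncols, nbands,
and nbits. The function name sample_offset is hypothetical.

```python
def sample_offset(row, col, band, *, nrows, ncols, nbands=1, nbits=8,
                  layout="bil", skipbytes=0, bandrowbytes=None,
                  totalrowbytes=None, bandgapbytes=0):
    """Return the byte offset of one sample within the image file."""
    if nbits % 8 != 0:
        raise ValueError("this sketch handles whole-byte samples only")
    sample_bytes = nbits // 8
    layout = layout.lower()

    if layout == "bil":
        # Each row stores one run of samples per band, band after band.
        band_row = bandrowbytes if bandrowbytes is not None else ncols * sample_bytes
        total_row = totalrowbytes if totalrowbytes is not None else nbands * band_row
        return skipbytes + row * total_row + band * band_row + col * sample_bytes

    if layout == "bip":
        # Each pixel stores its samples for all bands next to each other.
        total_row = (totalrowbytes if totalrowbytes is not None
                     else ncols * nbands * sample_bytes)
        return skipbytes + row * total_row + (col * nbands + band) * sample_bytes

    if layout == "bsq":
        # Whole bands follow each other, optionally separated by a gap.
        band_size = nrows * ncols * sample_bytes
        return (skipbytes + band * (band_size + bandgapbytes)
                + (row * ncols + col) * sample_bytes)

    raise ValueError(f"unknown layout: {layout}")
```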
Distribution Information
Information on how the resource is distributed
online and offline
This element provides information on how the
resource is distributed online and offline. Connections to online
systems can be described as URLs and as a list of relevant
connection parameters.
Derived from distribution elements in the FGDC
standard.
Online Distribution Information
Distribution information for accessing the
resource online.
Distribution information for accessing the
resource online, represented either as a URL or as a series of
named parameters that are needed in order to
connect. The URL field is provided for the simple cases where a
file is available for download directly from a web server or
other similar server and a complex connection protocol is not
needed. The connection field provides an alternative where a
complex protocol needs to be named and described, along with
the necessary parameters needed for the connection.
Download site URL
A URL (Uniform Resource Locator) from which
this resource can be downloaded or information can be
obtained about downloading it.
A URL (Uniform Resource Locator) from
which this resource can be downloaded or additional
information can be obtained. If accessing the URL would
directly return the data stream, then the "function"
attribute should be set to "download". If the URL
provides further information about downloading the
object but does not directly return the data stream, then
the "function" attribute should be set to "information".
If the "function" attribute is omitted, then "download"
is implied for the URL function.
In more complex cases where a non-standard connection
must be established that complies with application
specific procedures beyond what can be described in the
simple URL, then the "connection" element should
be used instead of the URL element.
http://data.org/getdata?id=98332
ISO CD 19115.3, Geographic information -
Metadata
Connection
A description of the information needed
to make an application connection to a data service.
A description of the information needed
to make an application connection to a data service.
The connection starts with a connectionDefinition which
lists all of the parameters needed for the connection
and possible default values for each. It then includes a
list of parameter values, one for each parameter, that
override the defaults for this particular connection.
One parameter element should exist for every
parameterDefinition that is present in the
connectionDefinition, except that parameters that were
defined with a defaultValue in their parameterDefinition
can be omitted from the connection and the default
will be used. All information about how to use the
parameters to establish a session and extract data is
present in the connectionDefinition, possibly implicitly
by naming a connection schemeName that is well-known.
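The rule for combining parameterDefinition defaults with the parameter
values of a particular connection can be sketched as follows. The
representation is an assumption made for illustration: definitions are
a dictionary mapping each parameter name to its default value (or None
when no default exists), and overrides map names to the values given in
the connection. The function name and the "port" parameter in the usage
note are hypothetical.

```python
def resolve_parameters(definitions: dict, overrides: dict) -> dict:
    """Merge connection parameter values over the defined defaults."""
    unknown = set(overrides) - set(definitions)
    if unknown:
        raise ValueError(f"parameters not in the definition: {sorted(unknown)}")

    resolved = {}
    for name, default in definitions.items():
        if name in overrides:
            # A value given in the connection overrides the default.
            resolved[name] = overrides[name]
        elif default is not None:
            # Omitted parameter with a default: the default is used.
            resolved[name] = default
        else:
            raise ValueError(f"missing value for required parameter: {name}")
    return resolved
```

For example, resolving {"hostname": None, "port": "80"} against
{"hostname": "nceas.ucsb.edu"} keeps the default port and takes the
hostname from the connection.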
Connection Definition
Definition of the connection protocol
to be used for this connection.
Definition of the connection
protocol to be used for this connection. The
definition has a "scheme" which identifies the
protocol by name, and a detailed description of
the scheme and its required parameters.
Parameter
A parameter to be used to make this
connection.
A parameter to be used to make
this connection. This value overrides any
default value that may have been provided in the
connection definition.
Parameter Name
Name of the parameter to be
used to make this connection.
The name of the parameter
to be used to make this connection.
hostname
Parameter Value
The value of the parameter to
be used to make this connection.
The value of the parameter
to be used to make this connection. This
value overrides any default value that may
have been provided in the connection
definition.
nceas.ucsb.edu
References
The id of another connection in this
EML document to be used to provide the connection
information.
The id of another connection in
this EML document to be used to provide the
connection information. This is used instead of
duplicating connection information when an identical
connection needs to be used multiple times in an
EML document.
medium of the resource
the medium on which this resource is distributed,
either digitally or as hardcopy
the medium on which this resource is distributed
digitally, such as 3.5" floppy disk, or various tape media types,
or 'hardcopy'
CD-ROM, 3.5 in. floppy disk, Zip disk
ISO CD 19115.3, Geographic information -
Metadata
Medium name
Name of the medium for this resource
distribution
Name of the medium on which this resource
is distributed. Can be various digital media such as tapes
and disks, or printed media which can collectively be
termed 'hardcopy'.
Tape, 3.5 inch Floppy Disk,
hardcopy
ISO CD 19115.3, Geographic information -
Metadata
density of the digital medium
the density of the digital medium if this is
relevant.
the density of the digital medium if this
is relevant. Used mainly for floppy disks or
tape.
High Density (HD), Double Density
(DD)
ISO CD 19115.3, Geographic information -
Metadata
units of a numerical density
a numerical density's units
if a density is given numerically, the
units should be given here.
B/cm
ISO CD 19115.3, Geographic information -
Metadata
storage volume
total volume of the storage
medium
the total volume of the storage medium on
which this resource is shipped.
650 MB
ISO CD 19115.3, Geographic information -
Metadata
medium format
format of the medium on which the resource is
shipped.
the file system format of the medium on
which the resource is shipped
NTFS, FAT32, EXT2, QIK80
ISO CD 19115.3, Geographic information -
Metadata
note about the media
note about the media
any additional pertinent information about
the media
ISO CD 19115.3, Geographic information -
Metadata