Bug #1794
open
modify temporalCoverage to support ongoing data sources
Description
---- Posted on behalf of Barbara Benson (bjbenson@wisc.edu) ----
I would like to raise some concerns that have arisen while developing EML
documents for the North Temperate Lakes LTER.
Our data reside in an Oracle database, and tables are updated with new data at
frequencies ranging from hourly to annually. We are creating EML documents to
describe these data, and the data can be accessed dynamically from our website.
Data from instrumented buoys are uploaded to the database every hour and are
thus accessible from our website current to within the last hour. Our problem
comes from trying to create temporal coverage for the NTL data. To have valid
EML, our options appear to be:
1) to inaccurately describe the end date of a data set by choosing a static
date; for example, the EML Best Practices document suggests using the end of the
current year
2) to choose not to populate temporal coverage, thus having data sets that
won't be located by temporal searches
3) to create data sets outside our database that are static
4) to use the "kluge" solution from a previous draft of the EML Best Practices,
giving the alternative time scale as "ongoing" and leaving the end date blank.
For data sets that are only updated annually, we are willing to create an end
date and just change that end date each year in the metadata. We have not
decided how to handle temporal coverage for data that are updated more
frequently, but none of the currently available (valid) options seems desirable.
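
For illustration, the annual end-date update described above (option 1, with the
end of the current year as the static end date) could be scripted. The following
is only a minimal sketch in Python, assuming the usual EML nesting of
temporalCoverage, rangeOfDates, beginDate/endDate, and calendarDate. The
function name and the 1981 start date are hypothetical and are not taken from
the NTL system.

```python
import datetime
import xml.etree.ElementTree as ET


def temporal_coverage(begin: datetime.date, today: datetime.date) -> ET.Element:
    """Build a temporalCoverage element ending on Dec 31 of today's year.

    This mirrors the "static end date" workaround: the end date is not the
    true end of an ongoing data set, only a placeholder that must be
    regenerated each year.
    """
    end = datetime.date(today.year, 12, 31)
    coverage = ET.Element("temporalCoverage")
    date_range = ET.SubElement(coverage, "rangeOfDates")
    for tag, value in (("beginDate", begin), ("endDate", end)):
        bound = ET.SubElement(date_range, tag)
        # calendarDate holds an ISO 8601 date string (YYYY-MM-DD)
        ET.SubElement(bound, "calendarDate").text = value.isoformat()
    return coverage


if __name__ == "__main__":
    # Hypothetical start date for an ongoing data set
    element = temporal_coverage(datetime.date(1981, 1, 1), datetime.date.today())
    print(ET.tostring(element, encoding="unicode"))
```

Rerunning such a script once a year keeps the document valid, but the end date
remains inaccurate for data that are updated hourly, which is the concern
raised here.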
The current focus for creation of EML documents is to harvest them to the
Metacat at the LTER Network Office. The rationale for this harvest is to
support the data discovery functionality through Metacat across the LTER
datasets. Given the well-developed functionality of the NTL dynamic database
access and the capability of capturing information about users accessing the NTL
data, we want the EML documents to point to our dynamic database access system
for each data set. Therefore, we don't find the creation of a static dataset a
viable option at present, since this higher level of functionality is not
available centrally and is not likely to become available in the near future.
To me the problems with creating temporal coverage for an ongoing data set
highlight what I perceive to be a more general problem regarding the
conceptualization of what objects EML is designed to describe. The set of
objects needs to be bigger than static data sets. There are other data sources
that need metadata description, e.g., database tables that are frequently
updated, data streams from sensor networks. Some features of the current
version of EML seem to be limited by this "static dataset" paradigm. It isn't
hard to envision applications for EML attached to data streams.
We would appreciate your response to these issues. We think the next version of
EML should accommodate ongoing data sets and allow the end date to be blank.
Thanks,
Barbara Benson