Introduction
============

Metacat is a repository for data and metadata (data about data), which helps
scientists find, understand and effectively use the data sets they manage or
that have been created by others. Thousands of data sets are currently
documented in a standardized way and stored in Metacat systems, providing the
scientific community with a broad range of science data that--because the
data are well and consistently described--can be easily searched, compared,
merged, or used in other ways.

Not only is the Metacat repository a reliable place to store metadata and data
(the database is replicated over a secure connection so that every record is
stored on multiple machines and no data is ever lost to technical failures), it
also provides a user-friendly interface for information entry and retrieval.
Scientists can search the repository via the Web using a customizable search
form. Searches return results based on user-specified criteria, such as desired
geographic coverage, taxonomic coverage, and/or keywords that appear in places
such as the data set's title or owner's name. Users need only click a linked
search result to open the corresponding data-set documentation in a browser
window and discover whom to contact to obtain the data themselves (or how to
immediately download the data via the Web).

Metacat's user-friendly Registry application allows data providers to enter
data-set documentation into Metacat using a Web form. When the form is
submitted, Metacat compiles the provided documentation into the required format
and saves it. Information providers need never work directly with the XML
format in which the data are stored or with the database records themselves. In
addition, the Metacat application can easily be extended to provide a
customized data-entry interface that suits the particular requirements of each
project. Metacat users can also choose to enter metadata using the Morpho
application, which provides data-entry wizards that guide information providers
through the process of documenting each data set.

The metadata stored in Metacat includes all of the information you and others
need to understand what the described data are and how to use them: a
descriptive data set title; an abstract; the temporal, spatial, and taxonomic
coverage of the data; the data collection methods; distribution information;
and contact information. Each information provider decides who has access to
this information (the public, or just specified users), and whether or not to
upload the data set itself with the data documentation. Information providers
can also edit the metadata or delete it from the repository, again using
Metacat's straightforward Web interface.

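To make the idea of a structured metadata record more concrete, the following
is a minimal Java sketch of how the kinds of fields listed above might be
collected and written out as XML. It is an illustration only: the class name
``DatasetMetadataSketch``, the element names, and the placeholder values are
all assumptions made for this example. They are not Metacat's API and do not
follow the actual EML schema.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: a hypothetical record, not Metacat's API or the EML schema. */
public class DatasetMetadataSketch {

    /** Escapes the characters that are unsafe inside XML text content. */
    public static String escape(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;");
    }

    /** Renders each field as a child element of a hypothetical dataset root. */
    public static String toXml(Map<String, String> fields) {
        StringBuilder xml = new StringBuilder("<dataset>\n");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            xml.append("  <").append(field.getKey()).append('>')
               .append(escape(field.getValue()))
               .append("</").append(field.getKey()).append(">\n");
        }
        return xml.append("</dataset>\n").toString();
    }

    /** Builds a placeholder record with the kinds of fields the text lists. */
    public static Map<String, String> exampleRecord() {
        // LinkedHashMap keeps the fields in insertion order.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("title", "Example data set title");
        fields.put("abstract", "Placeholder abstract");
        fields.put("temporalCoverage", "Placeholder date range");
        fields.put("spatialCoverage", "Placeholder bounding box");
        fields.put("taxonomicCoverage", "Placeholder taxa");
        fields.put("methods", "Placeholder collection methods");
        fields.put("distribution", "Placeholder distribution information");
        fields.put("contact", "Placeholder contact");
        return fields;
    }

    public static void main(String[] args) {
        System.out.print(toXml(exampleRecord()));
    }
}
```

Running ``main`` prints one element per field inside a single root element. A
real Metacat document would instead follow the structure defined by its
metadata standard, and the Registry or Morpho would normally generate it for
you.
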
Metacat is a Java servlet application that runs on Windows or Linux platforms in
conjunction with a database, such as PostgreSQL (or Oracle 8i), and a Web
server. The Metacat application stores data in an XML format using Ecological
Metadata Language (EML) or another ecological metadata standard. For more
information about Metacat or for examples of projects currently using Metacat,
please see http://knb.ecoinformatics.org.

What's in this Guide
--------------------
The Administrator guide includes information for installing, configuring,
managing and extending Metacat for both Linux and Windows systems. Chapter Two
contains instructions for downloading and installing Metacat and the
applications required to run the software on Linux and Windows platforms.
Chapter Three covers how to configure Metacat, both for new and upgraded
installations. Chapter Four details the ways in which you can customize the
Metacat interface so users can access and submit information easily: using
Metacat's generic Web interface (the Registry), creating your own HTML forms,
and creating your own desktop client (like Morpho). Chapter Five discusses how
to work with Metacat's Geoserver. Chapter Six describes how to set up
Metacat's replication service, which permits Metacat servers to share data with
each other, effectively backing up metadata and data files. Chapter Seven looks
at the Metacat Harvester, a program that automates the retrieval of EML
documents from one or more sites and their subsequent upload (insert or update)
to Metacat. Chapter Eight discusses logging, and Chapter Nine contains
instructions for creating a site map, which makes individual metadata entries
available via Web searches. Metacat's Java API is included as an appendix at
the end of the guide.

Metacat Features
----------------
Metacat is a repository for metadata (data about data), which helps scientists
find, understand and effectively use the data sets they manage or that have
been created by others. Specifically:

* Metacat is a Java servlet application, which can run on both Windows and Linux systems
* Metadata submitted to Metacat is broken into modules, which are stored to optimize rapid information retrieval
* Metacat's Web interface facilitates the input and retrieval of data (Figure 1.1)
* Metacat's optional mapping functionality enables you to query and visualize the geographic coverage of stored documents
* Metacat's replication feature ensures that all Metacat data and metadata are stored safely on multiple Metacat servers
* The Metacat interface can be easily extended and customized via Web forms, skins, and/or user-developed Java clients
* The Metacat harvester automates the process of retrieving and storing EML documents from one or more sites
* Metacat can be customized to use Life Sciences Identifiers (LSIDs), uniquely identifying every data record
* Metacat has a built-in logging system for tracking events such as document insertions, updates, deletes, and reads
* The appearance of Metacat's Web interface can be customized via skins