Introduction
============

Metacat is a repository for data and metadata (data about data), which helps
scientists find, understand and effectively use the data sets they manage or
that have been created by others. Thousands of data sets are currently
documented in a standardized way and stored in Metacat systems, providing the
scientific community with a broad range of science data that can be easily
searched, compared, merged, or used in other ways because the data are well
and consistently described.

Not only is the Metacat repository a reliable place to store metadata and data
(the database is replicated over a secure connection so that every record is
stored on multiple machines, protecting the data against loss from technical
failures), it also provides a user-friendly interface for information entry and
retrieval. Scientists can search the repository via the Web using a
customizable search form. Searches return results based on user-specified
criteria, such as desired geographic coverage, taxonomic coverage, and/or
keywords that appear in places such as the data set's title or owner's name.
Users need only click a linked search result to open the corresponding data-set
documentation in a browser window and discover whom to contact to obtain the
data themselves (or how to immediately download the data via the Web).

Metacat's user-friendly Registry application allows data providers to enter
data-set documentation into Metacat using a Web form. When the form is
submitted, Metacat compiles the provided documentation into the required format
and saves it. Information providers need never work directly with the XML
format in which the data are stored or with the database records themselves. In
addition, the Metacat application can easily be extended to provide a
customized data-entry interface that suits the particular requirements of each
project. Metacat users can also choose to enter metadata using the Morpho
application, which provides data-entry wizards that guide information providers
through the process of documenting each data set.

The metadata stored in Metacat includes all of the information you and others
need to understand what the described data are and how to use them: a
descriptive data set title; an abstract; the temporal, spatial, and taxonomic
coverage of the data; the data collection methods; distribution information;
and contact information. Each information provider decides who has access to
this information (the public, or just specified users), and whether or not to
upload the data set itself with the data documentation. Information providers
can also edit the metadata or delete it from the repository, again using
Metacat's straightforward Web interface.

Metacat is a Java servlet application that runs on Windows or Linux platforms in
conjunction with a database, such as PostgreSQL (or Oracle 8i), and a Web
server. The Metacat application stores data in an XML format using Ecological
Metadata Language (EML) or another ecological metadata standard. For more
information about Metacat or for examples of projects currently using Metacat,
please see http://knb.ecoinformatics.org.

What's in this Guide
--------------------
The Administrator Guide includes information for installing, configuring,
managing and extending Metacat for both Linux and Windows systems. Chapter Two
contains instructions for downloading and installing Metacat and the
applications required to run the software on Linux and Windows platforms.
Chapter Three covers how to configure Metacat, both for new and upgraded
installations. Chapter Four details the ways in which you can customize the
Metacat interface so users can access and submit information easily: using
Metacat's generic Web interface (the Registry), creating your own HTML forms,
and creating your own desktop client (like Morpho). Chapter Five discusses how
to work with Metacat's Geoserver. Chapter Six describes how to set up
Metacat's replication service, which permits Metacat servers to share data with
each other, effectively backing up metadata and data files. Chapter Seven looks
at the Metacat Harvester, a program that automates the retrieval of EML
documents from one or more sites and their subsequent upload (insert or update)
to Metacat. Chapter Eight discusses logging, and Chapter Nine contains
instructions for creating a site map, which makes individual metadata entries
available via Web searches. Metacat's Java API is included as an appendix at
the end of the guide.

Metacat Features
----------------
Metacat is a repository for metadata (data about data), which helps scientists
find, understand and effectively use the data sets they manage or that have
been created by others. Specifically,

* Metacat is a Java servlet application, which can run on both Windows and Linux systems
* Metadata submitted to Metacat is broken into modules, which are stored to optimize rapid information retrieval
* Metacat's Web interface facilitates the input and retrieval of data (Figure 1.1)
* Metacat's optional mapping functionality enables you to query and visualize the geographic coverage of stored documents
* Metacat's replication feature ensures that all Metacat data and metadata are stored safely on multiple Metacat servers
* The Metacat interface can be easily extended and customized via Web forms, skins, and/or user-developed Java clients
* The Metacat Harvester automates the process of retrieving and storing EML documents from one or more sites
* Metacat can be customized to use Life Sciences Identifiers (LSIDs), uniquely identifying every data record
* Metacat has a built-in logging system for tracking events such as document insertions, updates, deletes, and reads
* The appearance of Metacat's Web interface can be customized via skins