Introduction
============

    
Metacat is a repository for data and metadata (data about data), which helps
scientists find, understand and effectively use the data sets they manage or
that have been created by others. Thousands of data sets are currently
documented in a standardized way and stored in Metacat systems, providing the
scientific community with a broad range of science data that--because the
data are well and consistently described--can be easily searched, compared,
merged, or used in other ways.

    
Not only is the Metacat repository a reliable place to store metadata and data
(the database is replicated over a secure connection so that every record is
stored on multiple machines and no data is ever lost to technical failures), it
also provides a user-friendly interface for information entry and retrieval.
Scientists can search the repository via the Web using a customizable search
form. Searches return results based on user-specified criteria, such as desired
geographic coverage, taxonomic coverage, and/or keywords that appear in places
such as the data set's title or owner's name. Users need only click a linked
search result to open the corresponding data-set documentation in a browser
window and discover whom to contact to obtain the data themselves (or how to
immediately download the data via the Web).

    
Metacat's user-friendly Registry application allows data providers to enter
data-set documentation into Metacat using a Web form. When the form is
submitted, Metacat compiles the provided documentation into the required format
and saves it. Information providers need never work directly with the XML
format in which the data are stored or with the database records themselves. In
addition, the Metacat application can easily be extended to provide a
customized data-entry interface that suits the particular requirements of each
project. Metacat users can also choose to enter metadata using the Morpho
application, which provides data-entry wizards that guide information providers
through the process of documenting each data set.

    
The metadata stored in Metacat includes all of the information you and others
need to understand what the described data are and how to use them: a
descriptive data set title; an abstract; the temporal, spatial, and taxonomic
coverage of the data; the data collection methods; distribution information;
and contact information. Each information provider decides who has access to
this information (the public, or just specified users), and whether or not to
upload the data set itself with the data documentation. Information providers
can also edit the metadata or delete it from the repository, again using
Metacat's straightforward Web interface.

    
Metacat is a Java servlet application that runs on Windows or Linux platforms in
conjunction with a database, such as PostgreSQL (or Oracle 8i), and a Web
server. The Metacat application stores data in an XML format using Ecological
Metadata Language (EML) or another ecological metadata standard. For more
information about Metacat or for examples of projects currently using Metacat,
please see http://knb.ecoinformatics.org.

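
To make the XML storage format more concrete, the following is a minimal
sketch, in Python, of how a few of the descriptive fields mentioned above
(title, abstract, keywords, contact) might be assembled into an EML-style
document. It is not part of Metacat. The namespace URI and element names assume
EML 2.1.1 and are not validated against the EML schema here; the function name
``build_minimal_eml``, the package ID, and the ``system`` value are hypothetical
and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: EML 2.1.1 namespace; change this to match the EML version in use.
EML_NS = "eml://ecoinformatics.org/eml-2.1.1"


def build_minimal_eml(package_id, title, abstract, surname, keywords):
    """Return a small EML-style metadata document as a Unicode string.

    This is an illustrative sketch only; the result is not checked
    against the EML schema.
    """
    # Serialize the root element with the conventional "eml" prefix.
    ET.register_namespace("eml", EML_NS)
    root = ET.Element(
        "{%s}eml" % EML_NS,
        {"packageId": package_id, "system": "example"},
    )

    dataset = ET.SubElement(root, "dataset")
    ET.SubElement(dataset, "title").text = title

    # The data set creator, identified here by surname only.
    creator = ET.SubElement(dataset, "creator")
    creator_name = ET.SubElement(creator, "individualName")
    ET.SubElement(creator_name, "surName").text = surname

    # The abstract is wrapped in a single paragraph element.
    abstract_el = ET.SubElement(dataset, "abstract")
    ET.SubElement(abstract_el, "para").text = abstract

    # One <keyword> element per keyword, in the order given.
    keyword_set = ET.SubElement(dataset, "keywordSet")
    for keyword in keywords:
        ET.SubElement(keyword_set, "keyword").text = keyword

    # Contact information; this sketch reuses the creator's surname.
    contact = ET.SubElement(dataset, "contact")
    contact_name = ET.SubElement(contact, "individualName")
    ET.SubElement(contact_name, "surName").text = surname

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_minimal_eml(
        "example.1.1",
        "Sample data set title",
        "A short abstract describing the data.",
        "Smith",
        ["ecology", "sample"],
    ))
```

Information providers who use the Registry or Morpho never need to write such a
document by hand; the sketch only illustrates the kind of structure that is
stored.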

    
What's in this Guide
--------------------
The Administrator guide includes information for installing, configuring,
managing and extending Metacat for both Linux and Windows systems. Chapter Two
contains instructions for downloading and installing Metacat and the
applications required to run the software on Linux and Windows platforms.
Chapter Three covers how to configure Metacat, both for new and upgraded
installations. Chapter Four details the ways in which you can customize the
Metacat interface so users can access and submit information easily: using
Metacat's generic Web interface (the Registry), creating your own HTML forms,
and creating your own desktop client (like Morpho). Chapter Five discusses how
to work with Metacat's Geoserver. Chapter Six describes how to set up
Metacat's replication service, which permits Metacat servers to share data with
each other, effectively backing up metadata and data files. Chapter Seven looks
at the Metacat Harvester, a program that automates the retrieval of EML
documents from one or more sites and their subsequent upload (insert or update)
to Metacat. Chapter Eight discusses logging. Chapter Nine contains instructions
for creating a site map, which makes individual metadata entries available via
Web searches. Metacat's Java API is included as an appendix at the end of the
guide.

    
Metacat Features
----------------
Metacat is a repository for metadata (data about data), which helps scientists
find, understand and effectively use the data sets they manage or that have
been created by others. Specifically:

    
* Metacat is a Java servlet application, which can run on both Windows and Linux systems
* Metadata submitted to Metacat is broken into modules, which are stored to optimize rapid information retrieval
* Metacat's Web interface facilitates the input and retrieval of data (Figure 1.1)
* Metacat's optional mapping functionality enables you to query and visualize the geographic coverage of stored documents
* Metacat's replication feature ensures that all Metacat data and metadata are stored safely on multiple Metacat servers
* The Metacat interface can be easily extended and customized via Web forms, skins, and/or user-developed Java clients
* The Metacat harvester automates the process of retrieving and storing EML documents from one or more sites
* Metacat can be customized to use Life Sciences Identifiers (LSIDs), uniquely identifying every data record
* Metacat has a built-in logging system for tracking events such as document insertions, updates, deletes, and reads
* The appearance of Metacat's Web interface can be customized via skins

    