Bug #5516 » intro_mgb.rst

gastil gastil, 06/05/2012 04:14 PM

 
Introduction
============

Metacat is a repository for data and metadata (descriptions of data) that helps
scientists find, understand, and effectively use the data sets they manage or
that have been created by others. Thousands of data sets are currently
documented in a standardized way and stored in Metacat systems, providing the
scientific community with a broad range of science data that--because the
data are well and consistently described--can be easily searched, compared,
merged, or used in other ways.

Not only is the Metacat repository a reliable place to store metadata and data
(the database is replicated over a secure connection so that every record is
stored on multiple machines and a technical failure on one of them does not
cause data loss), it also provides a user-friendly interface for information
entry and retrieval. Scientists can search the repository via the Web using a
customizable search form. Searches return results based on user-specified
criteria, such as desired geographic coverage, taxonomic coverage, and/or
keywords that appear in places such as the data set's title or owner's name.
Users need only click a linked search result to open the corresponding data-set
documentation in a browser window and discover whom to contact to obtain the
data themselves (or how to immediately download the data via the Web).

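The kind of criteria-based search described above can be illustrated with a
small in-memory sketch. This is not Metacat's search implementation or API: the
``Record`` class, the ``search`` function, and the sample records are all
hypothetical, invented only to show how keyword and geographic criteria might
be combined.

.. code-block:: python

   from dataclasses import dataclass, field


   @dataclass
   class Record:
       """Hypothetical, simplified stand-in for a data-set description."""
       title: str
       owner: str
       keywords: list = field(default_factory=list)
       # Bounding box as (west, south, east, north) in decimal degrees.
       bbox: tuple = (-180.0, -90.0, 180.0, 90.0)


   def boxes_overlap(a, b):
       """Return True if two (west, south, east, north) boxes intersect."""
       return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


   def search(records, text=None, bbox=None):
       """Filter records by free text (title, owner, keywords) and/or region."""
       hits = []
       for rec in records:
           if text is not None:
               haystack = " ".join([rec.title, rec.owner, *rec.keywords]).lower()
               if text.lower() not in haystack:
                   continue
           if bbox is not None and not boxes_overlap(rec.bbox, bbox):
               continue
           hits.append(rec)
       return hits


   # Made-up sample records, for illustration only.
   records = [
       Record("Kelp forest survey", "J. Smith", ["kelp", "subtidal"],
              (-120.5, 34.0, -119.5, 34.5)),
       Record("Alpine plant phenology", "A. Jones", ["phenology"],
              (7.0, 46.0, 8.0, 47.0)),
   ]

   california = (-125.0, 32.0, -114.0, 42.0)
   print([r.title for r in search(records, text="kelp", bbox=california)])

Each criterion is optional, so a search with only a bounding box or only a
keyword narrows the result set on that criterion alone.
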
Metacat's user-friendly Registry application allows data providers to enter
data set documentation into Metacat using a Web form. When the form is
submitted, Metacat compiles the provided documentation into the required format
and saves it. Information providers need never work directly with the XML_
format in which the metadata are stored or with the database records
themselves. In addition, the Metacat application can easily be extended to
provide a customized data-entry interface that suits the particular
requirements of each project. Metacat users can also choose to enter metadata
using the Morpho application, which provides data entry wizards that guide
information providers through the process of documenting each data set.

The metadata stored in Metacat include all of the information needed
to understand what the described data are and how to use them: a
descriptive data set title; an abstract; the temporal, spatial, and taxonomic
coverage of the data; the data collection methods; distribution information;
and contact information. Each information provider decides who has access to
this information (the public, or just specified users), and whether or not to
upload the data set itself with the data documentation. Information providers
can also edit the metadata or delete them from the repository, again using
Metacat's straightforward Web interface.

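As a rough illustration of how descriptive fields like these can be compiled
into an XML document, the following sketch uses Python's standard
``xml.etree.ElementTree`` module. The element names (``dataset``, ``title``,
``abstract``, ``contact``, ``keywordSet``) and the ``build_record`` helper are
assumptions chosen for readability; the result is not a schema-valid EML
document and does not reflect how Metacat itself builds records.

.. code-block:: python

   import xml.etree.ElementTree as ET


   def build_record(title, abstract, contact, keywords=()):
       """Build a small XML description; element names are illustrative only."""
       root = ET.Element("dataset")
       ET.SubElement(root, "title").text = title
       ET.SubElement(root, "abstract").text = abstract
       ET.SubElement(root, "contact").text = contact
       keyword_set = ET.SubElement(root, "keywordSet")
       for word in keywords:
           ET.SubElement(keyword_set, "keyword").text = word
       return root


   # Made-up values, for illustration only.
   record = build_record(
       title="Example stream temperature data",
       abstract="Hourly temperature readings from a hypothetical stream gauge.",
       contact="data-manager@example.org",
       keywords=["temperature", "stream"],
   )
   xml_text = ET.tostring(record, encoding="unicode")
   print(xml_text)

A real metadata standard adds many more elements (coverage, methods,
distribution, access rules), but the principle is the same: structured form
input is serialized into a consistent XML layout.
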
Metacat is a `Java servlet`_ application that runs on Linux, Mac OS, and
Windows platforms in conjunction with a database, such as PostgreSQL_ (or
Oracle_), and a Web server. The Metacat application stores data in an XML_
format using `Ecological Metadata Language`_ (EML) or other metadata standards
such as `ISO 19139`_ or the `FGDC Biological Data Profile`_. For more
information about Metacat or for examples of projects currently using Metacat,
please see http://knb.ecoinformatics.org.

.. _XML: http://en.wikipedia.org/wiki/XML
.. _Java servlet: http://en.wikipedia.org/wiki/Java_Servlet
.. _PostgreSQL: http://www.postgresql.org/
.. _Oracle: http://www.oracle.com/
.. _Ecological Metadata Language: http://knb.ecoinformatics.org/software/eml
.. _ISO 19139: http://marinemetadata.org/references/iso19139
.. _FGDC Biological Data Profile: http://www.fgdc.gov/standards/projects/FGDC-standards-projects/metadata/biometadata

What's in this Guide
--------------------
This Administrator's Guide includes information for installing, configuring,
managing, and extending Metacat on Linux, Mac OS, and Windows systems.
Chapter Four contains instructions for downloading and installing Metacat and the
applications required to run the software on Linux and Windows platforms.
Chapter Five covers how to configure Metacat, both for new and upgraded
installations. Chapter Seven details the ways in which you can customize the
Metacat interface so users can access and submit information easily: using
Metacat's generic Web interface (the Registry), creating your own HTML forms,
and creating your own desktop client (like Morpho). Chapter Eight discusses how
to work with Metacat's embedded Geoserver. Chapter Nine describes how to set up
Metacat's replication service, which permits Metacat servers to share data with
each other, effectively backing up metadata and data files. Chapter Ten looks
at the Metacat Harvester, a program that automates the retrieval of EML
documents from one or more sites and their subsequent upload (insert or update)
to Metacat. Chapter Twelve discusses logging, and Chapter Thirteen contains
instructions for creating a site map, which makes individual metadata entries
available via Web searches. Metacat's Java `API documentation`_ is available
for developers.

.. _API documentation: ./api/index.html

Metacat Features
----------------
Metacat is a repository for data and metadata (documentation about data) that
helps scientists find, understand, and effectively use the data sets they manage
or that have been created by others. Specifically,

* Metacat is an open source Web application, written in Java, that runs on Linux, Mac OS, and Windows operating systems
* Metacat's Web interface facilitates the input and retrieval of data
* Metacat's optional mapping functionality enables you to query and visualize the geographic coverage of stored data sets
* Metacat's replication feature ensures that all Metacat data and metadata are stored safely on multiple Metacat servers
* The Metacat interface can be easily extended and customized via Web forms, skins, and/or user-developed client tools in Java and other languages
* The Metacat Harvester automates the process of retrieving and storing EML documents from one or more sites
* Metacat can be customized to use Life Sciences Identifiers (LSIDs), uniquely identifying every data record
* Metacat has a built-in logging system for tracking events such as document insertions, updates, deletes, and reads
* The appearance of Metacat's Web interface can be customized via skins
* Metacat fully supports the DataONE Member Node interface, allowing Metacat deployments to easily participate in the DataONE federation

.. figure:: images/screenshots/image007.png
   :align: center

   Metacat's default home page. Users can customize the appearance using skins.