Revision 6831

Added by Matt Jones about 13 years ago

Updating Sphinx doc structure in prep for moving metacat admin guide to Sphinx.

View differences:

docs/dev/metacat/source/identifiers.txt
1
.. raw:: latex
2

  
3
  \newpage
4
  
5

  
6
Identifier Management
7
=====================
8

  
9
.. index:: Identifiers
10

  
11
Author
12
  Matthew B. Jones
13

  
14
Date
15
  - 20100301 [MBJ] Initial draft of Identifier documentation
16

  
17
Goal
18
  Extend Metacat to support identifiers with arbitrary syntax
19

  
20
Summary 
21
  Metacat currently supports identifier strings called 'docids' that have
22
  the syntax 'scope.object.revision', such as 'foo.34.1' (we will refer to
23
  these as 'LocalIDs'). We now want Metacat to support identifiers that are 
24
  arbitrary strings, but still enforce uniqueness and proper revision
25
  handling (refer to these as GUIDs).  Metacat must be able to accept 
26
  these strings as identifiers for all CRUD operations, and reference them 
27
  in search results.
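
As a minimal sketch of the 'scope.object.revision' syntax described in the Summary, the following Python splits a LocalID into its three parts. The helper name ``parse_local_id`` and the ``LocalID`` tuple are hypothetical, and the rules that the object and revision are non-negative integers and that a scope may itself contain dots are assumptions rather than documented Metacat behavior.

```python
from typing import NamedTuple


class LocalID(NamedTuple):
    """Parts of a 'scope.object.revision' docid (hypothetical type)."""
    scope: str
    object_number: int
    revision: int


def parse_local_id(docid: str) -> LocalID:
    """Split a LocalID such as 'foo.34.1' into scope, object, and revision."""
    # Split from the right so that a scope containing dots stays intact
    # (an assumption; the source only shows a simple scope such as 'foo').
    parts = docid.rsplit(".", 2)
    if len(parts) != 3 or not parts[0]:
        raise ValueError(f"not a LocalID: {docid!r}")
    scope, obj, rev = parts
    if not (obj.isdigit() and rev.isdigit()):
        raise ValueError(f"object and revision must be integers: {docid!r}")
    return LocalID(scope, int(obj), int(rev))


print(parse_local_id("foo.34.1"))
```

An arbitrary string that does not fit this pattern raises ``ValueError`` here, which is the kind of identifier the GUID mapping is meant to accommodate.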
28

  
29
Identifier Resolution
30
---------------------
31
Because Metacat uses LocalIDs throughout the code for references to objects,
32
and because each LocalID has a constrained structure that includes semantics about
33
revisions in the identifier, it is difficult to wholesale replace it with
34
less-constrained string identifiers without re-writing much of Metacat.
35
Thus, our alternate strategy is to wrap the Metacat APIs with an
36
identifier resolution layer that keeps track of the unconstrained GUIDs and
37
maps them to constrained local identifiers which are used internally within
38
Metacat. The basic identifier table model is shown in Figure 1, while the
39
basic strategy for retrieving an object is shown in Figure 2, creating an 
40
object is shown in Figure 3, updating an object in Figure 4, and deleting
41
an object in Figure 5.
42

  
43

  
44
Identifier Table Structure
45
~~~~~~~~~~~~~~~~~~~~~~~~~~
46

  
47
.. figure:: images/identifiers.png
48

  
49
   Figure 1. Table structure for identifiers.
50

  
51
..
52
  This block defines the table structure diagram referenced above.
53
  @startuml images/identifiers.png
54

  
55
  identifiers "*" -- "1" xml_documents
56

  
57
  identifiers : String identifier
58
  identifiers : String docid
59
  identifiers : Integer rev
60

  
61
  xml_documents : String docid
62
  xml_documents : String rev
63

  
64
  note right of identifiers
65
    "identifiers.(docid,rev) is a foreign key into xml_documents"
66
  end note
67
  @enduml
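
The table structure in Figure 1 can be sketched in SQL. The example below uses Python's built-in ``sqlite3`` module purely for illustration. The column types, the primary keys, the reading of ``docid`` as the 'scope.object' part of a LocalID, the helper ``get_local_id``, and the sample GUID are all assumptions that the figure does not state.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign-key enforcement off unless it is enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE xml_documents (
        docid TEXT NOT NULL,
        rev   INTEGER NOT NULL,
        PRIMARY KEY (docid, rev)
    );
    CREATE TABLE identifiers (
        identifier TEXT PRIMARY KEY,   -- the GUID; uniqueness enforced here
        docid      TEXT NOT NULL,
        rev        INTEGER NOT NULL,
        -- identifiers.(docid, rev) is a foreign key into xml_documents
        FOREIGN KEY (docid, rev) REFERENCES xml_documents (docid, rev)
    );
""")

# Made-up sample rows: one stored document and one GUID that points at it.
conn.execute("INSERT INTO xml_documents VALUES ('foo.34', 1)")
conn.execute(
    "INSERT INTO identifiers VALUES ('doi:10.1000/example', 'foo.34', 1)"
)


def get_local_id(guid: str) -> str:
    """Resolve a GUID to a 'scope.object.revision' LocalID string."""
    row = conn.execute(
        "SELECT docid, rev FROM identifiers WHERE identifier = ?", (guid,)
    ).fetchone()
    if row is None:
        raise KeyError(guid)
    return f"{row[0]}.{row[1]}"


print(get_local_id("doi:10.1000/example"))
```

Making ``identifier`` the primary key is one way to let the database itself reject a duplicate GUID, while the composite foreign key keeps every mapping tied to an existing document revision.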
68

  
69
.. raw:: latex
70

  
71
  \newpage
72

  
73
.. raw:: pdf
74

  
75
  PageBreak
76

  
77

  
78
Handling document read operations
79
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
80

  
81
An overview of the process needed to read an object using a GUID.
82

  
83

  
84
.. figure:: images/guid_read.png
85

  
86
   Figure 2. Basic handling for string identifiers (GUIDs) as mapped to
87
   docids (LocalIDs) to retrieve an object.
88

  
89
..
90
  @startuml images/guid_read.png
91
  !include plantuml.conf
92
  actor User
93
  participant "Client" as app_client << Application >>
94
  participant "CRUD API" as c_crud << MetacatRestServlet >>
95
  participant "Identifier Manager" as ident_man << IdentifierManager >>
96
  participant "Handler" as handler << MetacatHandler >>
97
  User -> app_client
98
  app_client -> c_crud: get(token, GUID)
99
  c_crud -> ident_man: getLocalID(GUID)
100
  c_crud <-- ident_man: localID
101
  c_crud -> handler: handleReadAction(localID)
102
  c_crud <-- handler: object
103
  c_crud --> app_client: object
104
  
105
  @enduml
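
The read sequence in Figure 2 can be sketched as follows. The two classes are in-memory stand-ins named after the participants in the diagram; their constructors, the dictionary storage, and the sample values are assumptions for illustration and do not reflect the actual Metacat classes.

```python
class IdentifierManager:
    """In-memory stand-in for the GUID -> LocalID mapping."""

    def __init__(self, mapping):
        self._mapping = dict(mapping)

    def get_local_id(self, guid):
        # Raises KeyError when the GUID is unknown.
        return self._mapping[guid]


class Handler:
    """Stand-in for MetacatHandler; keeps objects keyed by LocalID."""

    def __init__(self, objects):
        self._objects = dict(objects)

    def handle_read_action(self, local_id):
        return self._objects[local_id]


def get(ident_man, handler, token, guid):
    """CRUD-level read: resolve the GUID, then read by LocalID."""
    # The token is accepted but unused; Figure 2 shows no authorization step.
    local_id = ident_man.get_local_id(guid)
    return handler.handle_read_action(local_id)


ident_man = IdentifierManager({"doi:10.1000/example": "foo.34.1"})
handler = Handler({"foo.34.1": "<eml/>"})
print(get(ident_man, handler, "token", "doi:10.1000/example"))
```

The point of the layering is that the handler only ever sees LocalIDs, so the existing read path needs no changes.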
106

  
107
Handling document create operations
108
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
109

  
110
An overview of the process needed to create an object using a GUID.
111

  
112
.. figure:: images/guid_insert.png
113

  
114
   Figure 3. Basic handling for string identifiers (GUIDs) as mapped to
115
   docids (LocalIDs) to create an object.
116

  
117
..
118
  @startuml images/guid_insert.png
119
  !include plantuml.conf
120
  actor User
121
  participant "Client" as app_client << Application >>
122
  participant "CRUD API" as c_crud << MetacatRestServlet >>
123
  participant "Identifier Manager" as ident_man << IdentifierManager >>
124
  participant "Handler" as handler << MetacatHandler >>
125
  User -> app_client
126
  app_client -> c_crud: create(token, GUID, object, sysmeta)
127
  c_crud -> ident_man: identifierExists(GUID)
128
  c_crud <-- ident_man: T or F 
129
  alt identifierExists == "F"
130
      c_crud -> ident_man: mapToLocalId(GUID)
131
      c_crud <-- ident_man: localID
132
      c_crud -> handler: handleInsertAction(localID)
133
      c_crud <-- handler: success
134
      note right of c_crud
135
        "Also need to address how to handle the sysmeta information wrt insertion methods"
136
      end note
137
      app_client <-- c_crud: success
138
  else identifierExists == "T"
139
      app_client <-- c_crud: IdentifierNotUnique
140
  end
141
  @enduml
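
A rough sketch of the create sequence in Figure 3 follows. How ``mapToLocalId`` chooses a new LocalID is not specified in the source, so the counter-based scheme, the ``autogen`` scope, and the dictionary that stands in for ``handleInsertAction`` are all assumptions. The ``sysmeta`` argument is accepted but ignored, since its handling is still an open question in the diagram.

```python
import itertools


class IdentifierNotUnique(Exception):
    """Raised when the requested GUID is already mapped."""


class IdentifierManager:
    """In-memory stand-in that hands out LocalIDs for new GUIDs."""

    def __init__(self, scope="autogen"):
        self._scope = scope
        self._mapping = {}
        self._next_object = itertools.count(1)

    def identifier_exists(self, guid):
        return guid in self._mapping

    def map_to_local_id(self, guid):
        # Assumed scheme: a fresh object number, starting at revision 1.
        local_id = f"{self._scope}.{next(self._next_object)}.1"
        self._mapping[guid] = local_id
        return local_id


def create(ident_man, store, token, guid, obj, sysmeta=None):
    """CRUD-level create: reject duplicate GUIDs, then insert by LocalID."""
    if ident_man.identifier_exists(guid):
        raise IdentifierNotUnique(guid)
    local_id = ident_man.map_to_local_id(guid)
    store[local_id] = obj  # stands in for handleInsertAction(localID)
    return local_id  # returned for demonstration; Figure 3 returns "success"


ident_man = IdentifierManager()
store = {}
print(create(ident_man, store, "token", "doi:10.1000/example", "<eml/>"))
```

Calling ``create`` a second time with the same GUID raises ``IdentifierNotUnique``, matching the ``else`` branch of the diagram.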
142

  
143
Handling document update operations
144
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
145

  
146
An overview of the process needed to update an object using a GUID.
147

  
148
.. figure:: images/guid_update.png
149

  
150
   Figure 4. Basic handling for string identifiers (GUIDs) as mapped to
151
   docids (LocalIDs) to update an object.
152

  
153
..
154
  @startuml images/guid_update.png
155
  !include plantuml.conf
156
  actor User
157
  participant "Client" as app_client << Application >>
158
  participant "CRUD API" as c_crud << MetacatRestServlet >>
159
  participant "Identifier Manager" as ident_man << IdentifierManager >>
160
  participant "Handler" as handler << MetacatHandler >>
161
  User -> app_client
162
  app_client -> c_crud: update(token, GUID, object, obsoletedGUID, sysmeta)
163

  
164
  c_crud -> ident_man: identifierExists(obsoletedGUID)
165
  c_crud <-- ident_man: T or F 
166
  alt identifierExists == "T"
167

  
168
      c_crud -> ident_man: identifierExists(GUID)
169
      c_crud <-- ident_man: T or F 
170
      alt identifierExists == "F"
171
          c_crud -> ident_man: mapToLocalId(GUID, obsoletedGUID)
172
          c_crud <-- ident_man: localID
173
          c_crud -> handler: handleUpdateAction(localID)
174
          c_crud <-- handler: success
175
          note right of c_crud
176
            "Also need to address how to handle the sysmeta information wrt update methods"
177
          end note
178
          app_client <-- c_crud: success
179
      else identifierExists == "T"
180
          app_client <-- c_crud: IdentifierNotUnique
181
      end
182
  else identifierExists == "F"
183
      app_client <-- c_crud: NotFound
184
  end
185
  @enduml
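
The nested checks in Figure 4 can be sketched as follows. The source does not say how ``mapToLocalId(GUID, obsoletedGUID)`` derives the new LocalID; this sketch assumes the new GUID is mapped to the next revision of the obsoleted object's LocalID. The plain dictionaries stand in for the Identifier Manager and the handler, and ``sysmeta`` is accepted but ignored.

```python
class NotFound(Exception):
    """Raised when the obsoleted GUID is not known."""


class IdentifierNotUnique(Exception):
    """Raised when the new GUID is already mapped."""


def next_revision(local_id):
    """Turn 'foo.34.1' into 'foo.34.2' (assumed revision scheme)."""
    base, rev = local_id.rsplit(".", 1)
    return f"{base}.{int(rev) + 1}"


def update(mapping, store, token, guid, obj, obsoleted_guid, sysmeta=None):
    """CRUD-level update following the two existence checks in Figure 4."""
    if obsoleted_guid not in mapping:  # identifierExists(obsoletedGUID) == F
        raise NotFound(obsoleted_guid)
    if guid in mapping:  # identifierExists(GUID) == T
        raise IdentifierNotUnique(guid)
    local_id = next_revision(mapping[obsoleted_guid])
    mapping[guid] = local_id
    store[local_id] = obj  # stands in for handleUpdateAction(localID)
    return local_id  # returned for demonstration; Figure 4 returns "success"


mapping = {"guid-v1": "foo.34.1"}
store = {"foo.34.1": "<eml version='1'/>"}
print(update(mapping, store, "token", "guid-v2", "<eml version='2'/>", "guid-v1"))
```

Checking the obsoleted GUID first means a request that fails both checks reports ``NotFound`` rather than ``IdentifierNotUnique``, which is the order the diagram shows.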
186

  
187
Handling document delete operations
188
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
189

  
190
An overview of the process needed to delete an object using a GUID.
191

  
192
.. figure:: images/guid_delete.png
193

  
194
   Figure 5. Basic handling for string identifiers (GUIDs) as mapped to
195
   docids (LocalIDs) to delete an object.
196

  
197
..
198
  @startuml images/guid_delete.png
199
  !include plantuml.conf
200
  actor User
201
  participant "Client" as app_client << Application >>
202
  participant "CRUD API" as c_crud << MetacatRestServlet >>
203
  participant "Identifier Manager" as ident_man << IdentifierManager >>
204
  participant "Handler" as handler << MetacatHandler >>
205
  User -> app_client
206
  app_client -> c_crud: delete(token, GUID)
207
  c_crud -> ident_man: identifierExists(GUID)
208
  c_crud <-- ident_man: T or F 
209
  alt identifierExists == "T"
210
      c_crud -> ident_man: mapToLocalId(GUID)
211
      c_crud <-- ident_man: localID
212
      c_crud -> handler: handleDeleteAction(localID)
213
      c_crud <-- handler: success
214
      app_client <-- c_crud: success
215
  else identifierExists == "F"
216
      app_client <-- c_crud: NotFound
217
  end
218
  @enduml
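
The delete sequence in Figure 5 is the simplest of the four and can be sketched in a few lines. The dictionaries again stand in for the Identifier Manager and the handler. Figure 5 does not say whether the GUID mapping is removed after a delete, so this sketch leaves it in place so that the GUID cannot be reused; that choice is an assumption.

```python
class NotFound(Exception):
    """Raised when the GUID is not known."""


def delete(mapping, store, token, guid):
    """CRUD-level delete: resolve the GUID, then delete by LocalID."""
    if guid not in mapping:  # identifierExists(GUID) == F
        raise NotFound(guid)
    local_id = mapping[guid]
    # Stands in for handleDeleteAction(localID); deleting an object that is
    # already gone is treated as a no-op in this sketch.
    store.pop(local_id, None)
    return True


mapping = {"doi:10.1000/example": "foo.34.1"}
store = {"foo.34.1": "<eml/>"}
print(delete(mapping, store, "token", "doi:10.1000/example"))
```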
219

  
220
..
221
  This block defines the interaction diagram referenced above.
222
  startuml images/01_interaction.png
223
    !include plantuml.conf
224
    actor User
225
    participant "Client" as app_client << Application >>
226
    User -> app_client
227

  
228
    participant "CRUD API" as c_crud << Coordinating Node >>
229
    activate c_crud
230
    app_client -> c_crud: resolve(GUID, auth_token)
231
    participant "Authorization API" as c_authorize << Coordinating Node >>
232
    c_crud -> c_authorize: isAuth(auth_token, GUID)
233
    participant "Verify API" as c_ver << Coordinating Node >>
234
    c_authorize -> c_ver: isValidToken (token)
235
    c_authorize <-- c_ver: T or F
236
    c_crud <-- c_authorize: T or F
237
    app_client <-- c_crud: handle_list
238
    deactivate c_crud
239

  
240
    participant "CRUD API" as m_crud << Member Node >>
241
    activate m_crud
242
    app_client -> m_crud: get(auth_token, handle)
243
    participant "Server Authentication API" as m_authenticate << Member Node >>
244
    m_crud -> m_authenticate: isAuth(auth_token, GUID)
245
    m_crud <-- m_authenticate: T or F
246
    m_crud -> m_crud: log(get, UserID, GUID)
247
    app_client <-- m_crud: object or unauth or doesNotExist
248
    deactivate m_crud
249
  enduml
250 0

  
docs/dev/metacat/source/index.txt
1

  
2
Metacat Administrator's Guide
3
=============================
4

  
5
.. sidebar:: Version: 2.0.0 Release
6

  
7
    .. image:: themes/readable/static/metacat-logo.png
8
       :height: 130pt
9

  
10
    Send feedback and bugs to: 
11
        metacat-dev@ecoinformatics.org
12
        http://bugzilla.ecoinformatics.org
13

  
14
    License: GPL
15

  
16
Metacat is a repository for data and metadata (data about data), which helps scientists find, understand and effectively use the data sets they manage or that have been created by others. Thousands of data sets are currently documented in a standardized way and stored in Metacat systems, providing the scientific community with a broad range of science data that--because the data are well and consistently described--can be easily searched, compared, merged, or used in other ways.  
17

  
18
- Metacat `Administrators Guide`_
19
- Download Metacat
20
    - Binary Distribution (A war file installation)
21
        - GZIP File: metacat-bin-2.0.0.tar.gz_
22
        - ZIP File: metacat-bin-2.0.0.zip_
23
    - Source Distribution (Full source, requiring build)
24
        - GZIP File: metacat-src-2.0.0.tar.gz_
25
        - ZIP File: metacat-src-2.0.0.zip_
26
    - `Older versions`_
27

  
28
.. _Administrators Guide: http://knb.ecoinformatics.org/software/metacat/MetacatAdministratorGuide.pdf
29

  
30
.. _metacat-bin-2.0.0.tar.gz: http://knb.ecoinformatics.org/software/dist/metacat-bin-2.0.0.tar.gz
31

  
32
.. _metacat-bin-2.0.0.zip: http://knb.ecoinformatics.org/software/dist/metacat-bin-2.0.0.zip
33

  
34
.. _metacat-src-2.0.0.tar.gz: http://knb.ecoinformatics.org/software/dist/metacat-src-2.0.0.tar.gz
35

  
36
.. _metacat-src-2.0.0.zip: http://knb.ecoinformatics.org/software/dist/metacat-src-2.0.0.zip
37

  
38
.. _Older versions: http://knb.ecoinformatics.org/software/dist/
39

  
40
Contents
41
========
42
.. toctree::
43
   :numbered:
44
   :maxdepth: 2
45

  
46
   01-intro.txt
47
   02-contributors.txt
48

  
49
..   03-install.txt
50
..   04-configuration.txt
51
..   05-submitting.txt
52
..   06-geoserver.txt
53
..   07-replication.txt
54
..   08-harvester.txt
55
..   09-event-logging.txt
56
..   10-sitemaps.txt
57
..   11-authinterface.txt
58
..   12-metacat-properties.txt
59
..   13-development.txt
60

  
61

  
62
Indices and tables
63
==================
64

  
65
* :ref:`genindex`
66
* :ref:`search`
67

  
68 0

  
docs/dev/metacat/source/12-metacat-properties.txt
1
Appendix: Metacat Properties
2
============================
3

  
4
Under construction!
5

  
6
Heading 1
7
------------
8

  
9
Heading 2
10
------------
11

  
docs/dev/metacat/source/05-submitting.txt
1
Accessing and Submitting Metadata and Data
2
==========================================
3

  
4
Under construction!
5

  
6
Heading 1
7
------------
8

  
9
Heading 2
10
------------
11

  
docs/dev/metacat/source/07-replication.txt
1
Replication
2
===========
3

  
4
Under construction!
5

  
6
Heading 1
7
------------
8

  
9
Heading 2
10
------------
11

  
docs/dev/metacat/source/11-authinterface.txt
1
Creating a Java Class that Implements AuthInterface
2
===================================================
3

  
4
Under construction!
5

  
6
Heading 1
7
------------
8

  
9
Heading 2
10
------------
11

  
docs/dev/metacat/source/08-harvester.txt
1
Harvester and Harvest List Editor
2
=================================
3

  
4
Under construction!
5

  
6
Heading 1
7
------------
8

  
9
Heading 2
10
------------
11

  
docs/dev/metacat/source/06-geoserver.txt
1
Metacat's Use of Geoserver
2
==========================
3

  
4
Under construction!
5

  
6
Heading 1
7
------------
8

  
9
Heading 2
10
------------
11

  
docs/dev/metacat/source/13-development.txt
1
Appendix: Development Issues
2
============================
3

  
4
Metacat is an open source application, and we welcome contributions to the
5
source from members of the community who are interested in helping out.  This
6
section will contain a series of technical details about Metacat that will help
7
contributors understand and extend Metacat.  Much of the detail needed here is
8
contained in the Administrator's guide, which we will gradually migrate to this
9
format.
10

  
11
Contents:
12

  
13
.. toctree::
14
   :maxdepth: 2
15

  
16
   identifiers.txt
17

  
docs/dev/metacat/source/01-intro.txt
1
Introduction
2
============
3

  
4
Metacat is a repository for data and metadata (data about data), which helps
5
scientists find, understand and effectively use the data sets they manage or
6
that have been created by others. Thousands of data sets are currently
7
documented in a standardized way and stored in Metacat systems, providing the
8
scientific community with a broad range of science data that--because the
9
data are well and consistently described--can be easily searched, compared,
10
merged, or used in other ways.
11

  
12
Not only is the Metacat repository a reliable place to store metadata and data
13
(the database is replicated over a secure connection so that every record is
14
stored on multiple machines and no data is ever lost to technical failures), it
15
provides a user-friendly interface for information entry and retrieval.
16
Scientists can search the repository via the Web using a customizable search
17
form. Searches return results based on user-specified criteria, such as desired
18
geographic coverage, taxonomic coverage, and/or keywords that appear in places
19
such as the data set's title or owner's name. Users need only click a linked
20
search result to open the corresponding data-set documentation in a browser
21
window and discover whom to contact to obtain the data themselves (or how to
22
immediately download the data via the Web).
23

  
24
Metacat's user-friendly Registry application allows data providers to enter
25
data-set documentation into Metacat using a Web form. When the form is
26
submitted, Metacat compiles the provided documentation into the required format
27
and saves it. Information providers need never work directly with the XML
28
format in which the data are stored or with the database records themselves. In
29
addition, the Metacat application can easily be extended to provide a
30
customized data-entry interface that suits the particular requirements of each
31
project. Metacat users can also choose to enter metadata using the Morpho
32
application, which provides data-entry wizards that guide information providers
33
through the process of documenting each data set.
34

  
35
The metadata stored in Metacat includes all of the information you and others
36
need to understand what the described data are and how to use them: a
37
descriptive data set title; an abstract; the temporal, spatial, and taxonomic
38
coverage of the data; the data collection methods; distribution information;
39
and contact information. Each information provider decides who has access to
40
this information (the public, or just specified users), and whether or not to
41
upload the data set itself with the data documentation. Information providers
42
can also edit the metadata or delete it from the repository, again using
43
Metacat's straightforward Web interface.
44

  
45
Metacat is a Java servlet application that runs on Windows or Linux platforms in
46
conjunction with a database, such as PostgreSQL (or Oracle 8i), and a Web
47
server. The Metacat application stores data in an XML format using Ecological
48
Metadata Language (EML) or another ecological metadata standard. For more
49
information about Metacat or for examples of projects currently using Metacat,
50
please see http://knb.ecoinformatics.org.
51

  
52
What's in this Guide
53
--------------------
54
The Administrator guide includes information for installing, configuring,
55
managing and extending Metacat for both Linux and Windows systems. Chapter Two
56
contains instructions for downloading and installing Metacat and the
57
applications required to run the software on Linux and Microsoft platforms.
58
Chapter Three covers how to configure Metacat, both for new and upgraded
59
installations. Chapter Four details the ways in which you can customize the
60
Metacat interface so users can access and submit information easily: using
61
Metacat's generic web-interface (the Registry), creating your own HTML forms,
62
and creating your own desktop client (like Morpho). Chapter Five discusses how
63
to work with Metacat's Geoserver. Chapter Six describes how to set up
64
Metacat's replication service, which permits Metacat servers to share data with
65
each other, effectively backing up metadata and data files. Chapter Seven looks
66
at the Metacat Harvester, a program that automates the retrieval of EML
67
documents from one or more sites and their subsequent upload (insert or update)
68
to Metacat. Chapter Eight discusses logging, and Chapter Nine contains instructions
69
for creating a site map, which makes individual metadata entries available via
70
Web searches. Metacat's Java API is included as an appendix at the end of the
71
guide.
72

  
73
Metacat Features
74
----------------
75
Metacat is a repository for metadata (data about data), which helps scientists
76
find, understand and effectively use the data sets they manage or that have
77
been created by others. Specifically,
78

  
79
* Metacat is a Java servlet application, which can run on both Windows and Linux systems
80
* Metadata submitted to Metacat is broken into modules, which are stored to optimize rapid information retrieval
81
* Metacat's Web interface facilitates the input and retrieval of data (Figure 1.1)
82
* Metacat's optional mapping functionality enables you to query and visualize the geographic coverage of stored documents
83
* Metacat's replication feature ensures that all Metacat data and metadata are stored safely on multiple Metacat servers
84
* The Metacat interface can be easily extended and customized via Web forms, skins, and/or user-developed Java clients
85
* The Metacat harvester automates the process of retrieving and storing EML documents from one or more sites
86
* Metacat can be customized to use Life Sciences Identifiers (LSIDs), uniquely identifying every data record
87
* Metacat has a built-in logging system for tracking events such as document insertions, updates, deletes, and reads
88
* The appearance of Metacat's Web interface can be customized via skins. 
89

  
docs/dev/metacat/source/02-contributors.txt
1
Contributors
2
============
3

  
4
Metacat has been designed and built by a large number of contributors to this
5
open source project.  Main developers and additional patch contributors are
6
listed here.
7

  
8
Contributors
9
------------
10
  - Matt Jones (jones@nceas.ucsb.edu)
11
  - Chad Berkley (berkley@nceas.ucsb.edu)
12
  - Jing Tao (tao@nceas.ucsb.edu)
13
  - Jivka Bojilova (bojilova@nceas.ucsb.edu)
14
  - Dan Higgins (higgins@nceas.ucsb.edu)
15
  - Saurabh Garg (sgarg@nceas.ucsb.edu)
16
  - Duane Costa (dcosta@lternet.edu)
17
  - Veronique Connolly (connolly@nceas.ucsb.edu)
18
  - Chris Jones (cjones@msi.ucsb.edu)
19
  - John Harris (harris@nceas.ucsb.edu)
20
  - Callie Bowdish (bowdish@ecoinformatics.org)
21
  - Will Tyburczy (tyburczy@ecoinformatics.org)
22
  - Matthew Perry (perry@nceas.ucsb.edu)
23
  - Chad Burt (underbluewaters@gmail.com)
24
  - Ben Leinfelder (leinfelder@nceas.ucsb.edu)
25
  - Chris Barteau (barteau@nceas.ucsb.edu)
26
  - Shaun Walbridge (walbridge@nceas.ucsb.edu)
27
  - Michael Daigle (daigle@nceas.ucsb.edu)
28

  
29
Patch contributors
30
------------------
31
  - Andrea Chadden (chadden@nceas.ucsb.edu)
32
  - Johnoel Ancheta (johnoel@hawaii.edu)
33
  - Owen Jones (owen.jones@imperial.ac.uk)
docs/dev/metacat/source/10-sitemaps.txt
1
Enabling Web Searches: Sitemaps
2
===============================
3

  
4
Under construction!
5

  
6
Heading 1
7
------------
8

  
9
Heading 2
10
------------
11

  
docs/dev/metacat/source/04-configuration.txt
1
Configuring Metacat
2
===================
3

  
4
Under construction!
5

  
6
Heading 1
7
------------
8

  
9
Heading 2
10
------------
11

  
docs/dev/metacat/source/03-install.txt
1
Downloading and Installing Metacat
2
==================================
3

  
4
Under construction!
5

  
6
System Requirements
7
-------------------
8

  
9
Installing on Linux
10
-------------------
11

  
12
Quick Start Overview
13
~~~~~~~~~~~~~~~~~~~~
14

  
15
Downloading Metacat
16
~~~~~~~~~~~~~~~~~~~
17

  
18
Download the Metacat Installer
19
..............................
20

  
21
Download Metacat Source Code
22
............................
23

  
24
Check Out Metacat Source Code from SVN (for Developers)
25
.......................................................
26

  
27
Installing and Configuring Required Software
28
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
29

  
30
Java 6
31
......
32

  
33
Apache Jakarta-Tomcat
34
.....................
35

  
36
Apache Web Server (Highly Recommended)
37
......................................
38

  
39
PostgreSQL Database (or Oracle 8i)
40
..................................
41

  
42
Installing and Configuring Oracle 8i
43
....................................
44

  
45
Apache Jakarta-Ant (if building from Source)
46
............................................
47

  
48
Installing Metacat
49
~~~~~~~~~~~~~~~~~~
50

  
51
Optional Installation Options (LSID Server)
52
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
53

  
54
Troubleshooting
55
~~~~~~~~~~~~~~~
56

  
57
Installing on Windows
58
---------------------
docs/dev/metacat/source/09-event-logging.txt
1
Event Logging
2
=============
3

  
4
Under construction!
5

  
6
Heading 1
7
------------
8

  
9
Heading 2
10
------------
11

  
docs/dev/metacat/source/metacat-properties.rst
1
Appendix: Metacat Properties
2
============================
3

  
4
Under construction!
5

  
6
Heading 1
7
------------
8

  
9
Heading 2
10
------------
11

  
docs/dev/metacat/source/replication.rst
1
Replication
2
===========
3

  
4
Under construction!
5

  
6
Heading 1
7
------------
8

  
9
Heading 2
10
------------
11

  
docs/dev/metacat/source/development.rst
1
Appendix: Development Issues
2
============================
3

  
4
Metacat is an open source application, and we welcome contributions to the
5
source from members of the community who are interested in helping out.  This
6
section will contain a series of technical details about Metacat that will help
7
contributors understand and extend Metacat.  Much of the detail needed here is
8
contained in the Administrator's guide, which we will gradually migrate to this
9
format.
10

  
11
Contents:
12

  
13
.. toctree::
14
   :maxdepth: 2
15

  
16
   identifiers
17

  
docs/dev/metacat/source/authinterface.rst
1
Creating a Java Class that Implements AuthInterface
2
===================================================
3

  
4
Under construction!
5

  
6
Heading 1
7
------------
8

  
9
Heading 2
10
------------
11

  
docs/dev/metacat/source/submitting.rst
1
Accessing and Submitting Metadata and Data
2
==========================================
3

  
4
Under construction!
5

  
6
Heading 1
7
------------
8

  
9
Heading 2
10
------------
11

  
docs/dev/metacat/source/sitemaps.rst
1
Enabling Web Searches: Sitemaps
2
===============================
3

  
4
Under construction!
5

  
6
Heading 1
7
------------
8

  
9
Heading 2
10
------------
11

  
docs/dev/metacat/source/intro.rst
1
Introduction
2
============
3

  
4
Metacat is a repository for data and metadata (data about data), which helps
5
scientists find, understand and effectively use the data sets they manage or
6
that have been created by others. Thousands of data sets are currently
7
documented in a standardized way and stored in Metacat systems, providing the
8
scientific community with a broad range of science data that--because the
9
data are well and consistently described--can be easily searched, compared,
10
merged, or used in other ways.
11

  
12
Not only is the Metacat repository a reliable place to store metadata and data
13
(the database is replicated over a secure connection so that every record is
14
stored on multiple machines and no data is ever lost to technical failures), it
15
provides a user-friendly interface for information entry and retrieval.
16
Scientists can search the repository via the Web using a customizable search
17
form. Searches return results based on user-specified criteria, such as desired
18
geographic coverage, taxonomic coverage, and/or keywords that appear in places
19
such as the data set's title or owner's name. Users need only click a linked
20
search result to open the corresponding data-set documentation in a browser
21
window and discover whom to contact to obtain the data themselves (or how to
22
immediately download the data via the Web).
23

  
24
Metacat's user-friendly Registry application allows data providers to enter
25
data-set documentation into Metacat using a Web form. When the form is
26
submitted, Metacat compiles the provided documentation into the required format
27
and saves it. Information providers need never work directly with the XML
28
format in which the data are stored or with the database records themselves. In
29
addition, the Metacat application can easily be extended to provide a
30
customized data-entry interface that suits the particular requirements of each
31
project. Metacat users can also choose to enter metadata using the Morpho
32
application, which provides data-entry wizards that guide information providers
33
through the process of documenting each data set.
34

  
35
The metadata stored in Metacat includes all of the information you and others
36
need to understand what the described data are and how to use them: a
37
descriptive data set title; an abstract; the temporal, spatial, and taxonomic
38
coverage of the data; the data collection methods; distribution information;
39
and contact information. Each information provider decides who has access to
40
this information (the public, or just specified users), and whether or not to
41
upload the data set itself with the data documentation. Information providers
42
can also edit the metadata or delete it from the repository, again using
43
Metacat's straightforward Web interface.
44

  
45
Metacat is a Java servlet application that runs on Windows or Linux platforms in
46
conjunction with a database, such as PostgreSQL (or Oracle 8i), and a Web
47
server. The Metacat application stores data in an XML format using Ecological
48
Metadata Language (EML) or another ecological metadata standard. For more
49
information about Metacat or for examples of projects currently using Metacat,
50
please see http://knb.ecoinformatics.org.
51

  
52
What's in this Guide
53
--------------------
54
The Administrator guide includes information for installing, configuring,
55
managing and extending Metacat for both Linux and Windows systems. Chapter Two
56
contains instructions for downloading and installing Metacat and the
57
applications required to run the software on Linux and Microsoft platforms.
58
Chapter Three covers how to configure Metacat, both for new and upgraded
59
installations. Chapter Four details the ways in which you can customize the
60
Metacat interface so users can access and submit information easily: using
61
Metacat's generic web-interface (the Registry), creating your own HTML forms,
62
and creating your own desktop client (like Morpho). Chapter Five discusses how
63
to work with Metacat's Geoserver. Chapter Six describes how to set up
64
Metacat's replication service, which permits Metacat servers to share data with
65
each other, effectively backing up metadata and data files. Chapter Seven looks
66
at the Metacat Harvester, a program that automates the retrieval of EML
67
documents from one or more sites and their subsequent upload (insert or update)
68
to Metacat. Chapter Eight discusses logging, and Chapter Nine contains instructions
69
for creating a site map, which makes individual metadata entries available via
70
Web searches. Metacat's Java API is included as an appendix at the end of the
71
guide.
72

  
73
Metacat Features
74
----------------
75
Metacat is a repository for metadata (data about data), which helps scientists
76
find, understand and effectively use the data sets they manage or that have
77
been created by others. Specifically,
78

  
79
* Metacat is a Java servlet application, which can run on both Windows and Linux systems
80
* Metadata submitted to Metacat is broken into modules, which are stored to optimize rapid information retrieval
81
* Metacat's Web interface facilitates the input and retrieval of data (Figure 1.1)
82
* Metacat's optional mapping functionality enables you to query and visualize the geographic coverage of stored documents
83
* Metacat's replication feature ensures that all Metacat data and metadata are stored safely on multiple Metacat servers
84
* The Metacat interface can be easily extended and customized via Web forms, skins, and/or user-developed Java clients
85
* The Metacat harvester automates the process of retrieving and storing EML documents from one or more sites
86
* Metacat can be customized to use Life Sciences Identifiers (LSIDs), uniquely identifying every data record
87
* Metacat has a built-in logging system for tracking events such as document insertions, updates, deletes, and reads
88
* The appearance of Metacat's Web interface can be customized via skins. 
89

  
docs/dev/metacat/source/contributors.rst
1
Contributors
2
============
3

  
4
Metacat has been designed and built by a large number of contributors to this
5
open source project.  Main developers and additional patch contributors are
6
listed here.
7

  
8
Contributors
9
------------
10
  - Matt Jones (jones@nceas.ucsb.edu)
11
  - Chad Berkley (berkley@nceas.ucsb.edu)
12
  - Jing Tao (tao@nceas.ucsb.edu)
13
  - Jivka Bojilova (bojilova@nceas.ucsb.edu)
14
  - Dan Higgins (higgins@nceas.ucsb.edu)
15
  - Saurabh Garg (sgarg@nceas.ucsb.edu)
16
  - Duane Costa (dcosta@lternet.edu)
17
  - Veronique Connolly (connolly@nceas.ucsb.edu)
18
  - Chris Jones (cjones@nceas.ucsb.edu)
19
  - John Harris (harris@nceas.ucsb.edu)
20
  - Callie Bowdish (bowdish@ecoinformatics.org)
21
  - Will Tyburczy (tyburczy@ecoinformatics.org)
22
  - Matthew Perry (perry@nceas.ucsb.edu)
23
  - Chad Burt (underbluewaters@gmail.com)
24
  - Ben Leinfelder (leinfelder@nceas.ucsb.edu)
25
  - Chris Barteau (barteau@nceas.ucsb.edu)
26
  - Shaun Walbridge (walbridge@nceas.ucsb.edu)
27
  - Michael Daigle (daigle@nceas.ucsb.edu)
28

  
29
Patch contributors
30
------------------
31
  - Andrea Chadden (chadden@nceas.ucsb.edu)
32
  - Johnoel Ancheta (johnoel@hawaii.edu)
33
  - Owen Jones (owen.jones@imperial.ac.uk)
docs/dev/metacat/source/geoserver.rst
1
Metacat's Use of Geoserver
2
==========================
3

  
4
Under construction!
5

  
6
Heading 1
7
------------
8

  
9
Heading 2
10
------------
11

  
docs/dev/metacat/source/harvester.rst
1
Harvester and Harvest List Editor
2
=================================
3

  
4
Under construction!
5

  
6
Heading 1
7
------------
8

  
9
Heading 2
10
------------
11

  
docs/dev/metacat/source/conf.py
@@ -28,7 +28,7 @@
 templates_path = ['_templates']
 
 # The suffix of source filenames.
-source_suffix = '.txt'
+source_suffix = '.rst'
 
 # The encoding of source files.
 #source_encoding = 'utf-8'
@@ -38,7 +38,7 @@
 
 # General information about the project.
 project = u'Metacat'
-copyright = u'2011, Regents of the University of California'
+copyright = u'2012, Regents of the University of California'
 
 # The version info for the project you're documenting, acts as replacement for
 # |version| and |release|, also used in various other places throughout the
docs/dev/metacat/source/event-logging.rst
1
Event Logging
2
=============
3

  
4
Under construction!
5

  
6
Heading 1
7
------------
8

  
9
Heading 2
10
------------
11

  
docs/dev/metacat/source/identifiers.rst
1
.. raw:: latex
2

  
3
  \newpage
4
  
5

  
6
Identifier Management
7
=====================
8

  
9
.. index:: Identifiers
10

  
11
Author
12
  Matthew B. Jones
13

  
14
Date
15
  - 20100301 [MBJ] Initial draft of Identifier documentation
16

  
17
Goal
18
  Extend Metacat to support identifiers with arbitrary syntax
19

  
20
Summary 
21
  Metacat currently supports identifier strings called 'docids' that have
22
  the syntax 'scope.object.revision', such as 'foo.34.1' (we will refer to
23
  these as 'LocalIDs'). We now want Metacat to support identifiers that are 
24
  arbitrary strings, but still enforce uniqueness and proper revision
25
  handling (refer to these as GUIDs).  Metacat must be able to accept 
26
  these strings as identifiers for all CRUD operations, and reference them 
27
  in search results.
28

  
29
Identifier Resolution
30
---------------------
31
Because Metacat uses LocalIDs throughout the code for references to objects,
32
and that LocalID has a constrained structure that includes semantics about
33
revisions in the identifier, it is difficult to wholesale replace it with
34
less-constrained string identifiers without re-writing much of Metacat.
35
Thus, our alternate strategy is to wrap the Metacat APIs with an
36
identifier resolution layer that keeps track of the unconstrained GUIDs and
37
maps them to constrained local identifiers, which are used internally within
38
Metacat. The basic identifier table model is shown in Figure 1, while the
39
basic strategy for retrieving an object is shown in Figure 2, for creating
40
an object in Figure 3, for updating an object in Figure 4, and for deleting
41
an object in Figure 5.
42

  
43

  
44
Identifier Table Structure
45
~~~~~~~~~~~~~~~~~~~~~~~~~~
46

  
47
.. figure:: images/identifiers.png
48

  
49
   Figure 1. Table structure for identifiers.
50

  
51
..
52
  This block defines the table structure diagram referenced above.
53
  @startuml images/identifiers.png
54

  
55
  identifiers "*" -- "1" xml_documents
56

  
57
  identifiers : String identifier
58
  identifiers : String docid
59
  identifiers : Integer rev
60

  
61
  xml_documents : String docid
62
  xml_documents : String rev
63

  
64
  note right of identifiers
65
    "identifiers.(docid,rev) is a foreign key into xml_documents"
66
  end note
67
  @enduml
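The table model in Figure 1 can be tried out with a small in-memory database. The following is a minimal sketch in Python using the standard ``sqlite3`` module, not Metacat's actual schema. The column types, the primary key on ``identifier``, the sample GUID, and the rule that a LocalID is ``docid`` plus ``rev`` joined by a dot are all assumptions made for illustration (in particular, ``rev`` is stored as an integer in both tables here).

```python
import sqlite3

# Illustrative schema only; column types and constraints are assumptions.
SCHEMA = """
CREATE TABLE xml_documents (
    docid TEXT NOT NULL,
    rev   INTEGER NOT NULL,
    PRIMARY KEY (docid, rev)
);
CREATE TABLE identifiers (
    identifier TEXT PRIMARY KEY,  -- the arbitrary GUID string, must be unique
    docid      TEXT NOT NULL,
    rev        INTEGER NOT NULL,
    FOREIGN KEY (docid, rev) REFERENCES xml_documents (docid, rev)
);
"""


def open_db():
    """Create an in-memory database holding the two sketch tables."""
    conn = sqlite3.connect(":memory:")
    # SQLite leaves foreign key checks off unless asked.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def local_id_for(conn, guid):
    """Return the 'docid.rev' LocalID mapped to a GUID, or None if unmapped."""
    row = conn.execute(
        "SELECT docid, rev FROM identifiers WHERE identifier = ?", (guid,)
    ).fetchone()
    return None if row is None else f"{row[0]}.{row[1]}"


conn = open_db()
conn.execute("INSERT INTO xml_documents VALUES ('foo.34', 1)")
conn.execute("INSERT INTO identifiers VALUES ('doi:10.1234/example', 'foo.34', 1)")
print(local_id_for(conn, "doi:10.1234/example"))
```

In this sketch the primary key rejects a second row with the same GUID, and the composite foreign key rejects a mapping to a (docid, rev) pair that does not exist in ``xml_documents``.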
68

  
69
.. raw:: latex
70

  
71
  \newpage
72

  
73
.. raw:: pdf
74

  
75
  PageBreak
76

  
77

  
78
Handling document read operations
79
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
80

  
81
Figure 2 gives an overview of the process needed to read an object using a GUID.
82

  
83

  
84
.. figure:: images/guid_read.png
85

  
86
   Figure 2. Basic handling for string identifiers (GUIDs) as mapped to
87
   docids (LocalIDs) to retrieve an object.
88

  
89
..
90
  @startuml images/guid_read.png
91
  !include plantuml.conf
92
  actor User
93
  participant "Client" as app_client << Application >>
94
  participant "CRUD API" as c_crud << MetacatRestServlet >>
95
  participant "Identifier Manager" as ident_man << IdentifierManager >>
96
  participant "Handler" as handler << MetacatHandler >>
97
  User -> app_client
98
  app_client -> c_crud: get(token, GUID)
99
  c_crud -> ident_man: getLocalID(GUID)
100
  c_crud <-- ident_man: localID
101
  c_crud -> handler: handleReadAction(localID)
102
  c_crud <-- handler: object
103
  c_crud --> app_client: object
104
  
105
  @enduml
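The read sequence in Figure 2 amounts to one lookup followed by one call on the internal handler. The sketch below illustrates that in Python with in-memory stand-ins. The classes only borrow the participant names from the diagram and are hypothetical, not Metacat's Java classes. Token checking is omitted, and raising ``NotFound`` for an unmapped GUID is an assumption, since the diagram shows only the successful path.

```python
class NotFound(Exception):
    """Raised when a GUID has no LocalID mapping (assumed behaviour)."""


class IdentifierManager:
    """Hypothetical stand-in: an in-memory GUID -> LocalID map."""

    def __init__(self, mapping):
        self._map = dict(mapping)

    def get_local_id(self, guid):
        try:
            return self._map[guid]
        except KeyError:
            raise NotFound(guid) from None


class Handler:
    """Hypothetical stand-in for the internal handler, keyed by LocalID."""

    def __init__(self, objects):
        self._objects = dict(objects)

    def handle_read_action(self, local_id):
        return self._objects[local_id]


def get(ident_man, handler, guid):
    """Resolve the GUID to a LocalID, then read the object by LocalID."""
    local_id = ident_man.get_local_id(guid)
    return handler.handle_read_action(local_id)


ident_man = IdentifierManager({"doi:10.1234/example": "foo.34.1"})
handler = Handler({"foo.34.1": "<eml/>"})
print(get(ident_man, handler, "doi:10.1234/example"))
```

The handler never sees the GUID, which is the point of the wrapping strategy: code that already works with LocalIDs stays unchanged.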
106

  
107
Handling document create operations
108
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
109

  
110
Figure 3 gives an overview of the process needed to create an object using a GUID.
111

  
112
.. figure:: images/guid_insert.png
113

  
114
   Figure 3. Basic handling for string identifiers (GUIDs) as mapped to
115
   docids (LocalIDs) to create an object.
116

  
117
..
118
  @startuml images/guid_insert.png
119
  !include plantuml.conf
120
  actor User
121
  participant "Client" as app_client << Application >>
122
  participant "CRUD API" as c_crud << MetacatRestServlet >>
123
  participant "Identifier Manager" as ident_man << IdentifierManager >>
124
  participant "Handler" as handler << MetacatHandler >>
125
  User -> app_client
126
  app_client -> c_crud: create(token, GUID, object, sysmeta)
127
  c_crud -> ident_man: identifierExists(GUID)
128
  c_crud <-- ident_man: T or F 
129
  alt identifierExists == "F"
130
      c_crud -> ident_man: mapToLocalId(GUID)
131
      c_crud <-- ident_man: localID
132
      c_crud -> handler: handleInsertAction(localID)
133
      c_crud <-- handler: success
134
      note right of c_crud
135
        "Also need to address how to handle the sysmeta information wrt insertion methods"
136
      end note
137
      app_client <-- c_crud: success
138
  else identifierExists == "T"
139
      app_client <-- c_crud: IdentifierNotUnique
140
  end
141
  @enduml
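The create sequence in Figure 3 checks uniqueness before a LocalID is assigned. A minimal Python sketch of that control flow follows. The ``IdentifierManager`` class, the ``auto`` scope, and the rule for generating new LocalIDs are hypothetical, and a plain dictionary stands in for the handler's insert action. Handling of the token and of ``sysmeta`` is left out, as the diagram itself notes that the latter is still open.

```python
import itertools


class IdentifierNotUnique(Exception):
    """Raised when the requested GUID is already mapped."""


class IdentifierManager:
    """Hypothetical stand-in that hands out LocalIDs of the form scope.N.1."""

    def __init__(self, scope="auto"):
        self._map = {}
        self._scope = scope
        self._counter = itertools.count(1)

    def identifier_exists(self, guid):
        return guid in self._map

    def map_to_local_id(self, guid):
        # Assumed generation rule: next object number, revision 1.
        local_id = f"{self._scope}.{next(self._counter)}.1"
        self._map[guid] = local_id
        return local_id


def create(ident_man, store, guid, obj):
    """Insert obj under a fresh LocalID unless the GUID is already taken."""
    if ident_man.identifier_exists(guid):
        raise IdentifierNotUnique(guid)
    local_id = ident_man.map_to_local_id(guid)
    store[local_id] = obj  # stands in for handleInsertAction(localID)
    return local_id


ident_man = IdentifierManager()
store = {}
print(create(ident_man, store, "doi:10.1234/example", "<eml/>"))
```

A second ``create`` call with the same GUID raises ``IdentifierNotUnique`` and leaves the store untouched.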
142

  
143
Handling document update operations
144
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
145

  
146
Figure 4 gives an overview of the process needed to update an object using a GUID.
147

  
148
.. figure:: images/guid_update.png
149

  
150
   Figure 4. Basic handling for string identifiers (GUIDs) as mapped to
151
   docids (LocalIDs) to update an object.
152

  
153
..
154
  @startuml images/guid_update.png
155
  !include plantuml.conf
156
  actor User
157
  participant "Client" as app_client << Application >>
158
  participant "CRUD API" as c_crud << MetacatRestServlet >>
159
  participant "Identifier Manager" as ident_man << IdentifierManager >>
160
  participant "Handler" as handler << MetacatHandler >>
161
  User -> app_client
162
  app_client -> c_crud: update(token, GUID, object, obsoletedGUID, sysmeta)
163

  
164
  c_crud -> ident_man: identifierExists(obsoletedGUID)
165
  c_crud <-- ident_man: T or F 
166
  alt identifierExists == "T"
167

  
168
      c_crud -> ident_man: identifierExists(GUID)
169
      c_crud <-- ident_man: T or F 
170
      alt identifierExists == "F"
171
          c_crud -> ident_man: mapToLocalId(GUID, obsoletedGUID)
172
          c_crud <-- ident_man: localID
173
          c_crud -> handler: handleUpdateAction(localID)
174
          c_crud <-- handler: success
175
          note right of c_crud
176
            "Also need to address how to handle the sysmeta information wrt update methods"
177
          end note
178
          app_client <-- c_crud: success
179
      else identifierExists == "T"
180
          app_client <-- c_crud: IdentifierNotUnique
181
      end
182
  else identifierExists == "F"
183
      app_client <-- c_crud: NotFound
184
  end
185
  @enduml
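The update sequence in Figure 4 nests two checks: the obsoleted GUID must already exist, and the new GUID must not. The Python sketch below illustrates that ordering with plain dictionaries standing in for the identifier manager and the handler. The helper ``next_revision`` and the rule that the new LocalID is the next revision of the obsoleted one are assumptions, one plausible reading of ``mapToLocalId(GUID, obsoletedGUID)``. Token and ``sysmeta`` handling are omitted.

```python
class NotFound(Exception):
    """Raised when the obsoleted GUID is not mapped."""


class IdentifierNotUnique(Exception):
    """Raised when the new GUID is already mapped."""


def next_revision(local_id):
    """Bump the trailing revision number of a scope.object.revision LocalID."""
    stem, _, rev = local_id.rpartition(".")
    return f"{stem}.{int(rev) + 1}"


def update(id_map, store, guid, obj, obsoleted_guid):
    """Store obj under a new GUID that supersedes obsoleted_guid."""
    # Outer check in the diagram: the object being replaced must exist.
    if obsoleted_guid not in id_map:
        raise NotFound(obsoleted_guid)
    # Inner check: the new GUID must not be in use yet.
    if guid in id_map:
        raise IdentifierNotUnique(guid)
    local_id = next_revision(id_map[obsoleted_guid])
    id_map[guid] = local_id
    store[local_id] = obj  # stands in for handleUpdateAction(localID)
    return local_id


id_map = {"guid-v1": "foo.34.1"}
store = {"foo.34.1": "<eml>v1</eml>"}
print(update(id_map, store, "guid-v2", "<eml>v2</eml>", "guid-v1"))
```

Both GUIDs stay resolvable after the update, so earlier revisions can still be read by their own identifiers.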
186

  
187
Handling document delete operations
188
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
189

  
190
Figure 5 gives an overview of the process needed to delete an object using a GUID.
191

  
192
.. figure:: images/guid_delete.png
193

  
194
   Figure 5. Basic handling for string identifiers (GUIDs) as mapped to
195
   docids (LocalIDs) to delete an object.
196

  
197
..
198
  @startuml images/guid_delete.png
199
  !include plantuml.conf
200
  actor User
201
  participant "Client" as app_client << Application >>
202
  participant "CRUD API" as c_crud << MetacatRestServlet >>
203
  participant "Identifier Manager" as ident_man << IdentifierManager >>
204
  participant "Handler" as handler << MetacatHandler >>
205
  User -> app_client
206
  app_client -> c_crud: delete(token, GUID)
207
  c_crud -> ident_man: identifierExists(GUID)
208
  c_crud <-- ident_man: T or F 
209
  alt identifierExists == "T"
210
      c_crud -> ident_man: mapToLocalId(GUID)
211
      c_crud <-- ident_man: localID
212
      c_crud -> handler: handleDeleteAction(localID)
213
      c_crud <-- handler: success
214
      app_client <-- c_crud: success
215
  else identifierExists == "F"
216
      app_client <-- c_crud: NotFound
217
  end
218
  @enduml
219

  
220
..
221
  This block defines the interaction diagram referenced above.
222
  startuml images/01_interaction.png
223
    !include plantuml.conf
224
    actor User
225
    participant "Client" as app_client << Application >>
226
    User -> app_client
227

  
228
    participant "CRUD API" as c_crud << Coordinating Node >>
229
    activate c_crud
230
    app_client -> c_crud: resolve(GUID, auth_token)
231
    participant "Authorization API" as c_authorize << Coordinating Node >>
232
    c_crud -> c_authorize: isAuth(auth_token, GUID)
233
    participant "Verify API" as c_ver << Coordinating Node >>
234
    c_authorize -> c_ver: isValidToken (token)
235
    c_authorize <-- c_ver: T or F
236
    c_crud <-- c_authorize: T or F
237
    app_client <-- c_crud: handle_list
238
    deactivate c_crud
239

  
240
    participant "CRUD API" as m_crud << Member Node >>
241
    activate m_crud
242
    app_client -> m_crud: get(auth_token, handle)
243
    participant "Server Authentication API" as m_authenticate << Member Node >>
244
    m_crud -> m_authenticate: isAuth(auth_token, GUID)
245
    m_crud <-- m_authenticate: T or F
246
    m_crud -> m_crud: log(get, UserID, GUID)
247
    app_client <-- m_crud: object or unauth or doesNotExist
248
    deactivate m_crud
249
  enduml
0 250

  
docs/dev/metacat/source/install.rst
1
Downloading and Installing Metacat
2
==================================
3

  
4
Under construction!
5

  
6
System Requirements
7
-------------------
8

  
9
Installing on Linux
10
-------------------
11

  
12
Quick Start Overview
13
~~~~~~~~~~~~~~~~~~~~
14

  
15
Downloading Metacat
16
~~~~~~~~~~~~~~~~~~~
17

  
18
Download the Metacat Installer
19
..............................
20

  
21
Download Metacat Source Code
22
............................
23

  
24
Check Out Metacat Source Code from SVN (for Developers)
25
.......................................................
26

  
27
Installing and Configuring Required Software
28
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
29

  
30
Java 6
31
......
32

  
33
Apache Jakarta-Tomcat
34
.....................
35

  
36
Apache Web Server (Highly Recommended)
37
......................................
38

  
39
PostgreSQL Database (or Oracle 8i)
40
..................................
41

  
42
Installing and Configuring Oracle 8i
43
....................................
44

  
45
Apache Jakarta-Ant (if building from Source)
46
............................................
47

  
48
Installing Metacat
49
~~~~~~~~~~~~~~~~~~
50

  
51
Optional Installation Options (LSID Server)
52
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
53

  
54
Troubleshooting
55
~~~~~~~~~~~~~~~
56

  
57
Installing on Windows
58
---------------------
docs/dev/metacat/source/configuration.rst
1
Configuring Metacat
2
===================
3

  
4
Under construction!
5

  
6
Heading 1
7
------------
8

  
9
Heading 2
10
------------
11

  
docs/dev/metacat/source/index.rst
1

  
2
Metacat Administrator's Guide
3
=============================
4

  
5
.. sidebar:: Version: 2.0.0 Release
6

  
7
    .. image:: themes/readable/static/metacat-logo.png
8
       :height: 130pt
9

  
10
    Send feedback and bugs to: 
11
        metacat-dev@ecoinformatics.org
12
        http://bugzilla.ecoinformatics.org
13

  
14
    License: GPL
15

  
16
Metacat is a repository for data and metadata (data about data), which helps scientists find, understand and effectively use the data sets they manage or that have been created by others. Thousands of data sets are currently documented in a standardized way and stored in Metacat systems, providing the scientific community with a broad range of science data that--because the data are well and consistently described--can be easily searched, compared, merged, or used in other ways.  
17

  
18
- Metacat `Administrators Guide`_
19
- Download Metacat
20
    - Binary Distribution (A war file installation)
21
        - GZIP File: metacat-bin-2.0.0.tar.gz_
22
        - ZIP File: metacat-bin-2.0.0.zip_
23
    - Source Distribution (Full source, requiring build)
24
        - GZIP File: metacat-src-2.0.0.tar.gz_
25
        - ZIP File: metacat-src-2.0.0.zip_
26
    - `Older versions`_
27

  
28
.. _Administrators Guide: http://knb.ecoinformatics.org/software/metacat/MetacatAdministratorGuide.pdf
29

  
30
.. _metacat-bin-2.0.0.tar.gz: http://knb.ecoinformatics.org/software/dist/metacat-bin-2.0.0.tar.gz
31

  
32
.. _metacat-bin-2.0.0.zip: http://knb.ecoinformatics.org/software/dist/metacat-bin-2.0.0.zip
33

  
34
.. _metacat-src-2.0.0.tar.gz: http://knb.ecoinformatics.org/software/dist/metacat-src-2.0.0.tar.gz
35

  
36
.. _metacat-src-2.0.0.zip: http://knb.ecoinformatics.org/software/dist/metacat-src-2.0.0.zip
37

  
38
.. _Older versions: http://knb.ecoinformatics.org/software/dist/
39

  
40
Contents
41
========
42
.. toctree::
43
   :numbered:
44
   :maxdepth: 2
45

  
46
   intro
47
   contributors
48
   install
49
   configuration
50
   submitting
51
   geoserver
52
   replication
53
   harvester
54
   event-logging
55
   sitemaps
56
   authinterface
57
   metacat-properties
58
   development
59

  
60

  
61
Indices and tables
62
==================
63

  
64
* :ref:`genindex`
65
* :ref:`search`
66

  
0 67

  
