Revision 6885
Added by Matt Jones about 11 years ago
dataone.rst | ||
---|---|---|
1 | 1 |
DataONE Member Node Support |
2 | 2 |
=========================== |
3 |
|
|
4 | 3 |
DataONE_ is a federation of data repositories that aims to improve |
5 | 4 |
interoperability among data repository software systems and advance the |
6 | 5 |
preservation of scientific data for future use. |
... | ... | |
16 | 15 |
and social scientists to build a robust, interoperable, and sustainable system for |
17 | 16 |
preserving and accessing Earth observational data at national and global scales. |
18 | 17 |
Supported by the U.S. National Science Foundation, DataONE partners focus on |
19 |
technological, finalncial, and organizational sustainability approaches to
|
|
18 |
technological, financial, and organizational sustainability approaches to |
|
20 | 19 |
building a distributed network of data repositories that are fully interoperable, |
21 | 20 |
even when those repositories use divergent underlying software and support different |
22 | 21 |
data and metadata content standards. DataONE defines a common web-service service |
... | ... | |
33 | 32 |
software tools for data management, analysis, visualization and other parts of |
34 | 33 |
the scientific lifecycle to directly communicate with Metacat without being |
35 | 34 |
further specialized beyond the support needed for DataONE. This streamlines the |
36 |
process of writing scientific software on both for servers and client tools.
|
|
35 |
process of writing scientific software both for servers and client tools. |
|
37 | 36 |
|
38 | 37 |
The DataONE Service Interface |
39 | 38 |
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
40 |
DataONE acheives interoperability |
|
41 |
by defining a lightweight but powerful set of web services that can be |
|
42 |
implemented by various data management software systems to allow those systems |
|
43 |
to effectively communicate with one another, exchange data, metadata, and other |
|
44 |
scientific objects. This `DataONE Service Interface`_ |
|
39 |
DataONE achieves interoperability by defining a lightweight but powerful set of |
|
40 |
REST_ web services that can be implemented by various data management software |
|
41 |
systems to allow those systems to effectively communicate with one another, |
|
42 |
exchange data, metadata, and other scientific objects. This `DataONE Service Interface`_ |
|
45 | 43 |
is an open standard that defines the communication protocols and technical |
46 | 44 |
expectations for software components that wish to participate in the DataONE |
47 | 45 |
federation. This service interface is divided into `four distinct tiers`_, with the |
... | ... | |
55 | 53 |
3. **Tier 3:** Full Write access |
56 | 54 |
4. **Tier 4:** Replication target services |
57 | 55 |
|
56 |
.. _REST: http://en.wikipedia.org/wiki/Representational_state_transfer |
|
57 |
|
|
58 | 58 |
.. _DataONE Service Interface: http://releases.dataone.org/online/d1-architecture-1.0.0 |
59 | 59 |
|
60 | 60 |
.. _four distinct tiers: http://releases.dataone.org/online/d1-architecture-1.0.0/apis/index.html |
61 | 61 |
|
62 | 62 |
Member Nodes |
63 | 63 |
~~~~~~~~~~~~ |
64 |
In DataONE, Member Nodes represent the core of the network, in that they represent |
|
65 |
particular scientific communities, manage and preserve their data and metadata, and |
|
66 |
provide tools to their community for contributing, managing, and accessing data. |
|
67 |
DataONE provides a standard way for these individual repositories to interact, and helps |
|
68 |
to coordinate among the Member Nodes in the federation. This allows Member Nodes |
|
69 |
to provide services to each other, such as replication of data for backup and failover. |
|
70 |
To be a Member Node, a repository must implement the Member Node service interface, |
|
71 |
and then register with DataONE. Metacat provides this implementation automatically, |
|
72 |
and provides an easy configuration option to register a Metacat instance as a |
|
73 |
DataONE Member Node (see configuration section below). If you are deploying a Metacat |
|
74 |
instance, it is relatively simple to become a Member Node, but keep in mind that |
|
75 |
DataONE is aiming for longevity and preservation, and so is selecting for nodes |
|
76 |
that have long-term data preservation as part of their mission. |
|
64 | 77 |
|
65 | 78 |
Coordinating Nodes |
66 | 79 |
~~~~~~~~~~~~~~~~~~ |
80 |
The DataONE Coordinating Nodes provide a set of services to Member Nodes that |
|
81 |
allow Member Nodes to easily interact with one another and to provide a unified |
|
82 |
view of the whole DataONE Federation. The main services provided by Coordinating |
|
83 |
Nodes are: |
|
67 | 84 |
|
85 |
* Global search index for all metadata and web portal for data discovery |
|
86 |
* Resolution service to map unique identifiers to the Member Nodes that hold data |
|
87 |
* Authentication against a shared set of accounts based on CILogon_ and InCommon_ |
|
88 |
* Replication management services to reliably replicate data according to |
|
89 |
policies set by the Member Nodes |
|
90 |
* Fixity checking to ensure that preserved objects remain valid |
|
91 |
* Member Node registration and management |
|
92 |
* Aggregated logging for data access across the whole federation |
|
93 |
|
|
94 |
Three geographically distributed Coordinating Nodes replicate these coordinating |
|
95 |
services at UC Santa Barbara, the University of New Mexico, and the Oak Ridge Campus. |
|
96 |
Coordinating Nodes are set up in a fully redundant manner, such that any of the coordinating |
|
97 |
nodes can be offline and the others will continue to provide availability of the services |
|
98 |
without interruption. The Coordinating Nodes expose their services at:: |
|
99 |
|
|
100 |
https://cn.dataone.org/cn |
|
101 |
|
|
102 |
And the DataONE search portal is available at: |
|
103 |
|
|
104 |
https://cn.dataone.org/ |
|
105 |
|
|
106 |
.. _CILogon: http://www.cilogon.org |
|
107 |
|
|
108 |
.. _InCommon: http://incommon.org |
|
109 |
|
|
68 | 110 |
Investigator Toolkit |
69 | 111 |
~~~~~~~~~~~~~~~~~~~~ |
112 |
In order to provide scientists with convenient access to the data and metadata in |
|
113 |
DataONE, the third component represents a library of software tools that have been |
|
114 |
adapted to work with DataONE via the service interface and can be used to |
|
115 |
discover, manage, analyze, and visualize data in DataONE. For example, DataONE |
|
116 |
plans to release metadata editors (e.g., Morpho), data search tools (e.g., Mercury), |
|
117 |
data access tools (e.g., ONEDrive), and data analysis tools (e.g., R) that all |
|
118 |
know how to interact with DataONE Member Nodes and Coordinating Nodes. Consequently, |
|
119 |
scientists will be able to access data from any DataONE Member Node, such as a Metacat |
|
120 |
node, directly from within the R environment. In addition, software tools that |
|
121 |
are written to work with one Member Node should also work with others, thereby |
|
122 |
greatly increasing the efficiency of creating an entire toolkit of software that |
|
123 |
is useful to investigators. |
|
70 | 124 |
|
71 |
Metacat as a Member Node |
|
72 |
------------------------ |
|
125 |
Because DataONE services are REST web services, software written in any |
|
126 |
programming language can be adapted to interact with DataONE. |
|
127 |
In addition, to ease the process of adapting tools to work with DataONE, libraries |
|
128 |
are provided for common programming languages such as Java (d1-libclient-java) |
|
129 |
and Python (d1_libclient-python) that allow simple function calls |
|
130 |
to be used to access any DataONE service. |
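Because the services are plain REST calls, a client needs little more than URL construction and an HTTP request. The following is a minimal sketch in Python of building a request URL against the Coordinating Node base URL quoted earlier. The ``v1/resolve/<identifier>`` path layout and the sample identifier are assumptions made for illustration; consult the DataONE Service Interface documentation for the authoritative paths.

```python
from urllib.parse import quote

# Base URL of the Coordinating Node services, as quoted in this document.
CN_BASE_URL = "https://cn.dataone.org/cn"


def resolve_url(pid, base_url=CN_BASE_URL, version="v1"):
    """Build a URL asking a Coordinating Node which nodes hold an object.

    The "<version>/resolve/<pid>" layout is an assumption for this sketch,
    not something stated in this document.
    """
    # Identifiers may contain '/', ':' and other reserved characters, so
    # percent-encode every reserved character (safe="" exempts none of them).
    encoded_pid = quote(pid, safe="")
    return "{}/{}/resolve/{}".format(base_url.rstrip("/"), version, encoded_pid)


# Hypothetical identifier, used only to show the encoding step.
example = resolve_url("doi:10.5063/EXAMPLE")
```

The resulting string can be passed to any HTTP client; the client libraries named above wrap this kind of URL handling behind ordinary function calls.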
|
73 | 131 |
|
132 |
Configuring Metacat as a Member Node |
|
133 |
------------------------------------ |
|
134 |
Configuring Metacat as a DataONE Member Node is accomplished with the standard |
|
135 |
Metacat Administrative configuration utility. To access the utility, visit the |
|
136 |
following URL:: |
|
137 |
|
|
138 |
http://<yourhost.org>/<context>/admin |
|
139 |
|
|
140 |
where ``<yourhost.org>`` represents the hostname of your webserver running Metacat, |
|
141 |
and ``<context>`` is the name of the web context in which Metacat was installed. |
|
142 |
Once at the administrative utility, click on the DataONE configuration link, which |
|
143 |
should show the following screen: |
|
144 |
|
|
145 |
.. figure:: images/screenshots/screen-dataone-config.png |
|
146 |
:align: center |
|
147 |
|
|
148 |
The configuration screen for configuring Metacat as a DataONE node. |
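The administrative URL pattern shown earlier can also be filled in programmatically, for example in a deployment script. This is a small sketch in Python; the function name ``admin_url`` and the sample context name ``metacat`` are hypothetical and chosen only for illustration.

```python
def admin_url(host, context, scheme="http"):
    """Return the Metacat administrative utility URL for one deployment.

    Follows the pattern http://<yourhost.org>/<context>/admin from the text.
    """
    # Strip stray slashes so "metacat" and "/metacat/" give the same result.
    return "{}://{}/{}/admin".format(scheme, host.strip("/"), context.strip("/"))


# "metacat" is an assumed context name; substitute the one used at install time.
url = admin_url("yourhost.org", "metacat")
```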
|
149 |
|
|
150 |
Being a replication target |
|
151 |
~~~~~~~~~~~~~~~~~~~~~~~~~~ |
|
152 |
TODO: Describe the configuration for acting as a replication target. |
|
153 |
|
|
154 |
Replication Policies |
|
155 |
-------------------- |
|
156 |
TODO: Describe the replication policies for objects in DataONE. |
|
157 |
|
|
158 |
Access Control Policies |
|
159 |
----------------------- |
|
160 |
TODO: Describe access control for objects in DataONE. |
|
161 |
|
|
162 |
|
|
163 |
|
|
164 |
|
Continued authoring the description of DataONE in Metacat. More to come.