DataONE Member Node Support
===========================
DataONE_ is a federation of data repositories that aims to improve
interoperability among data repository software systems and advance the
preservation of scientific data for future use.
... | ... | |
and social scientists to build a robust, interoperable, and sustainable system for
preserving and accessing Earth observational data at national and global scales.
Supported by the U.S. National Science Foundation, DataONE partners focus on
technological, financial, and organizational sustainability approaches to
building a distributed network of data repositories that are fully interoperable,
even when those repositories use divergent underlying software and support different
data and metadata content standards. DataONE defines a common web-service service
... | ... | |
software tools for data management, analysis, visualization and other parts of
the scientific lifecycle to directly communicate with Metacat without being
further specialized beyond the support needed for DataONE. This streamlines the
process of writing scientific software for both servers and client tools.

The DataONE Service Interface
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
DataONE achieves interoperability by defining a lightweight but powerful set of
REST_ web services that can be implemented by various data management software
systems to allow those systems to communicate effectively with one another and
exchange data, metadata, and other scientific objects. This `DataONE Service Interface`_
is an open standard that defines the communication protocols and technical
expectations for software components that wish to participate in the DataONE
federation. This service interface is divided into `four distinct tiers`_, with the
... | ... | |
3. **Tier 3:** Full Write access
4. **Tier 4:** Replication target services

.. _REST: http://en.wikipedia.org/wiki/Representational_state_transfer

.. _DataONE Service Interface: http://releases.dataone.org/online/d1-architecture-1.0.0

.. _four distinct tiers: http://releases.dataone.org/online/d1-architecture-1.0.0/apis/index.html

Member Nodes
~~~~~~~~~~~~
In DataONE, Member Nodes form the core of the network: they represent
particular scientific communities, manage and preserve their data and metadata, and
provide tools to their community for contributing, managing, and accessing data.
DataONE provides a standard way for these individual repositories to interact, and helps
to coordinate among the Member Nodes in the federation. This allows Member Nodes
to provide services to each other, such as replication of data for backup and failover.
To be a Member Node, a repository must implement the Member Node service interface
and then register with DataONE. Metacat provides this implementation automatically,
and provides an easy configuration option to register a Metacat instance as a
DataONE Member Node (see configuration section below). If you are deploying a Metacat
instance, it is relatively simple to become a Member Node, but keep in mind that
DataONE is aiming for longevity and preservation, and so is selecting for nodes
that have long-term data preservation as part of their mission.

Coordinating Nodes
~~~~~~~~~~~~~~~~~~
The DataONE Coordinating Nodes provide a set of services that allow Member Nodes
to easily interact with one another and that provide a unified
view of the whole DataONE Federation. The main services provided by Coordinating
Nodes are:

* Global search index for all metadata and web portal for data discovery
* Resolution service to map unique identifiers to the Member Nodes that hold data
* Authentication against a shared set of accounts based on CILogon_ and InCommon_
* Replication management services to reliably replicate data according to
  policies set by the Member Nodes
* Fixity checking to ensure that preserved objects remain valid
* Member Node registration and management
* Aggregated logging for data access across the whole federation

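
As a rough illustration of the fixity checking mentioned in the list above, the
following Python sketch recomputes a checksum for a stored object and compares it
with a previously recorded value. It is not taken from the DataONE or Metacat code
base: the function names ``compute_checksum`` and ``fixity_ok``, the choice of
SHA-256, and the sample bytes are all assumptions made for this example.

```python
import hashlib
import io


def compute_checksum(stream, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of a binary stream, reading it in chunks."""
    digest = hashlib.new(algorithm)
    # Read fixed-size chunks so large objects never need to fit in memory.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def fixity_ok(stream, expected_hex, algorithm="sha256"):
    """Return True when the object still matches its recorded checksum."""
    return compute_checksum(stream, algorithm) == expected_hex.lower()


# Hypothetical object content and the checksum recorded when it was stored.
original = b"site,temp_c\nA,12.5\n"
recorded = hashlib.sha256(original).hexdigest()

print(fixity_ok(io.BytesIO(original), recorded))         # unchanged copy
print(fixity_ok(io.BytesIO(original + b"x"), recorded))  # altered copy
```

The comparison only detects that an object has changed. How a changed object is
repaired is outside the scope of this sketch.
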

Three geographically distributed Coordinating Nodes replicate these coordinating
services at UC Santa Barbara, the University of New Mexico, and the Oak Ridge Campus.
Coordinating Nodes are set up in a fully redundant manner, such that any of the
Coordinating Nodes can be offline and the others will continue to provide the services
without interruption. The Coordinating Nodes expose their services at::

    https://cn.dataone.org/cn

And the DataONE search portal is available at::

    https://cn.dataone.org/

.. _CILogon: http://www.cilogon.org

.. _InCommon: http://incommon.org

Investigator Toolkit
~~~~~~~~~~~~~~~~~~~~
In order to provide scientists with convenient access to the data and metadata in
DataONE, the third component, the Investigator Toolkit, is a library of software
tools that have been adapted to work with DataONE via the service interface and can
be used to discover, manage, analyze, and visualize data in DataONE. For example, DataONE
plans to release metadata editors (e.g., Morpho), data search tools (e.g., Mercury),
data access tools (e.g., ONEDrive), and data analysis tools (e.g., R) that all
know how to interact with DataONE Member Nodes and Coordinating Nodes. Consequently,
scientists will be able to access data from any DataONE Member Node, such as a Metacat
node, directly from within the R environment. In addition, software tools that
are written to work with one Member Node should also work with others, thereby
greatly increasing the efficiency of creating an entire toolkit of software that
is useful to investigators.

Because DataONE services are REST web services, software written in any
programming language can be adapted to interact with DataONE.
In addition, to ease the process of adapting tools to work with DataONE, libraries
are provided for common programming languages such as Java (d1-libclient-java)
and Python (d1_libclient-python) that allow simple function calls
to be used to access any DataONE service.

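
To make the REST point concrete, here is a minimal Python sketch that builds the
URL for a service call using only the standard library. The base URL is the
Coordinating Node address given earlier in this document. The ``v1`` version
segment, the ``node`` and ``resolve`` service names, and the sample identifier are
assumptions for illustration and should be checked against the DataONE Service
Interface documentation. The helper ``service_url`` is hypothetical and is not
part of any DataONE client library.

```python
from urllib.parse import quote

# Coordinating Node base URL as given in this document.
CN_BASE_URL = "https://cn.dataone.org/cn"


def service_url(base_url, api_version, service, identifier=None):
    """Build a REST URL of the form <base>/<version>/<service>[/<identifier>]."""
    parts = [base_url.rstrip("/"), api_version, service]
    if identifier is not None:
        # Identifiers may contain ':' and '/', so percent-encode every
        # reserved character before placing the value in the URL path.
        parts.append(quote(identifier, safe=""))
    return "/".join(parts)


# Assumed path layout and a made-up identifier, for illustration only.
print(service_url(CN_BASE_URL, "v1", "node"))
print(service_url(CN_BASE_URL, "v1", "resolve", "doi:10.1234/EXAMPLE"))
```

Any HTTP client in any language can then issue a request against a URL built this
way, which is why no DataONE-specific library is strictly required.
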

Configuring Metacat as a Member Node
------------------------------------
Configuring Metacat as a DataONE Member Node is accomplished with the standard
Metacat Administrative configuration utility. To access the utility, visit the
following URL::

    http://<yourhost.org>/<context>/admin

where ``<yourhost.org>`` represents the hostname of your web server running Metacat,
and ``<context>`` is the name of the web context in which Metacat was installed.
Once at the administrative utility, click on the DataONE configuration link, which
should show the following screen:

.. figure:: images/screenshots/screen-dataone-config.png
   :align: center

   The configuration screen for configuring Metacat as a DataONE node.
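
The URL pattern above can be filled in programmatically, for example in a
deployment script. The sketch below assumes a hypothetical helper named
``admin_url`` and uses ``yourhost.org`` and ``metacat`` purely as placeholder
values. Substitute the hostname and web context of your own installation.

```python
def admin_url(host, context, scheme="http"):
    """Return the administrative utility URL for a Metacat deployment."""
    # Tolerate stray slashes and whitespace in the supplied values.
    host = host.strip().rstrip("/")
    context = context.strip().strip("/")
    return f"{scheme}://{host}/{context}/admin"


# Placeholder host and context, for illustration only.
print(admin_url("yourhost.org", "metacat"))
```
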

Being a replication target
~~~~~~~~~~~~~~~~~~~~~~~~~~
TODO: Describe the configuration for acting as a replication target.

Replication Policies
--------------------
TODO: Describe the replication policies for objects in DataONE.

Access Control Policies
-----------------------
TODO: Describe access control for objects in DataONE.

Continued authoring the description of DataONE in Metacat. More to come.