Revision 6895
Added by Matt Jones about 13 years ago
docs/user/metacat/source/dataone.rst | ||
---|---|---|
35 | 35 |
process of writing scientific software both for servers and client tools. |
36 | 36 |
|
37 | 37 |
The DataONE Service Interface |
38 |
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
38 |
-----------------------------
|
|
39 | 39 |
DataONE achieves interoperability by defining a lightweight but powerful set of |
40 | 40 |
REST_ web services that can be implemented by various data management software |
41 | 41 |
systems to allow those systems to effectively communicate with one another, |
... | ... | |
60 | 60 |
.. _four distinct tiers: http://releases.dataone.org/online/d1-architecture-1.0.0/apis/index.html |
61 | 61 |
|
62 | 62 |
Member Nodes |
63 |
~~~~~~~~~~~~
|
|
63 |
------------
|
|
64 | 64 |
In DataONE, Member Nodes represent the core of the network, in that they represent |
65 | 65 |
particular scientific communities, manage and preserve their data and metadata, and |
66 | 66 |
provide tools to their community for contributing, managing, and accessing data. |
... | ... | |
76 | 76 |
that have long-term data preservation as part of their mission. |
77 | 77 |
|
78 | 78 |
Coordinating Nodes |
79 |
~~~~~~~~~~~~~~~~~~
|
|
79 |
------------------
|
|
80 | 80 |
The DataONE Coordinating Nodes provide a set of services to Member Nodes that |
81 | 81 |
allow Member Nodes to easily interact with one another and to provide a unified |
82 | 82 |
view of the whole DataONE Federation. The main services provided by Coordinating |
... | ... | |
108 | 108 |
.. _InCommon: http://incommon.org |
109 | 109 |
|
110 | 110 |
Investigator Toolkit |
111 |
~~~~~~~~~~~~~~~~~~~~
|
|
111 |
--------------------
|
|
112 | 112 |
In order to provide scientists with convenient access to the data and metadata in |
113 | 113 |
DataONE, the third component represents a library of software tools that have been |
114 | 114 |
adapted to work with DataONE via the service interface and can be used to |
... | ... | |
146 | 146 |
:align: center |
147 | 147 |
|
148 | 148 |
The configuration screen for configuring Metacat as a DataONE node. |
149 |
|
|
150 |
Being a replication target |
|
151 |
~~~~~~~~~~~~~~~~~~~~~~~~~~ |
|
152 |
TODO: Describe the configuration for acting as a replication target. |
|
153 | 149 |
|
154 |
Replication Policies |
|
155 |
-------------------- |
|
156 |
TODO: Describe the replication policies for objects in DataONE. |
|
150 |
To configure Metacat as a node in the DataONE network, configure the properties shown |
|
151 |
in the figure above. The Node Name should be a short name for the node that can |
|
152 |
be used in user interface displays that list the node. For example, one node in |
|
153 |
DataONE is the 'Knowledge Network for Biocomplexity'. Also provide a brief sentence |
|
154 |
or two describing the node, including its intended scope and purpose. |
|
157 | 155 |
|
156 |
The Node Identifier field is a unique identifier assigned by DataONE to identify |
|
157 |
this node even when the node changes physical locations over time. When Metacat |
|
158 |
registers with the DataONE Coordinating Nodes (when you click 'Register' at the |
|
159 |
bottom of this form), the Node Identifier is automatically set. **It is critical that |
|
160 |
you not change the Node Identifier**, as that will break the connection with the |
|
161 |
DataONE network. The ability to edit this field is only provided for the rare case |
|
162 |
in which a new Metacat instance is being established to act as the provider for an |
|
163 |
existing DataONE Member Node, in which case the field can be edited to set it to |
|
164 |
the value of a valid, existing Node Identifier. |
|
165 |
|
|
166 |
The Node Subject and Node Certificate Path are linked fields that are critical for |
|
167 |
proper operation of the node. To act as a Member Node in DataONE, you must obtain |
|
168 |
an X.509 certificate that can be used to identify this node and allow it to securely |
|
169 |
communicate using SSL with other nodes and client applications. This certificate can |
|
170 |
either be obtained from the DataONE Certificate Authority, or from a commercial |
|
171 |
provider of certificates. Once you have the certificate in hand, use a tool such |
|
172 |
as ``openssl`` to determine the exact subject distinguished name in the |
|
173 |
certificate, and use that to set the Node Subject field. Set the Node |
|
174 |
Certificate Path to the location on the system in which you stored the |
|
175 |
certificate file. |
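
As a rough illustration of the step above, the following Python sketch wraps the
``openssl x509`` command to print a certificate's subject. The helper names and the
sample subject string are hypothetical, and the exact text ``openssl`` prints differs
between OpenSSL versions, so check the result before copying it into the Node Subject
field.

```python
import subprocess


def subject_command(cert_path):
    """Build the openssl command line that prints a certificate's subject."""
    return ["openssl", "x509", "-in", cert_path, "-noout", "-subject"]


def parse_subject(output):
    """Strip the leading 'subject=' label that openssl puts before the DN."""
    line = output.strip()
    prefix = "subject="
    if line.startswith(prefix):
        line = line[len(prefix):]
    return line.strip()


def read_subject(cert_path):
    """Run openssl on the certificate file and return the subject DN as text."""
    result = subprocess.run(
        subject_command(cert_path),
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_subject(result.stdout)


# Hypothetical sample of openssl output, used only to show the parsing step.
sample_output = "subject= /DC=org/DC=example/CN=ExampleNode\n"
sample_subject = parse_subject(sample_output)
```

Calling ``read_subject`` with the value of the Node Certificate Path would return the
string to compare against the Node Subject field; it requires ``openssl`` on the
system path.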
|
176 |
|
|
177 |
The ``Synchronize`` checkbox allows the administrator to decide whether to turn on |
|
178 |
synchronization with the DataONE network. When this box is unchecked, the DataONE |
|
179 |
Coordinating Nodes will not attempt to synchronize at all; when it is checked, |
|
180 |
DataONE will periodically contact the node to synchronize all metadata content. |
|
181 |
To be part of the DataONE network, this box must be checked as that allows |
|
182 |
DataONE to receive a copy of the metadata associated with each object in the Metacat |
|
183 |
system. The switch is provided for those rare cases when a node needs to be disconnected |
|
184 |
from DataONE for maintenance or service outages. When the box is checked, DataONE |
|
185 |
contacts the node using the schedule provided in the ``Synchronization Schedule`` |
|
186 |
fields. The example in the dialog above has synchronization occurring once every third |
|
187 |
minute, at the 10-second mark of that minute. The syntax for these schedules |
|
188 |
follows the Quartz Crontab Entry syntax, which provides for many flexible schedule |
|
189 |
configurations. If the administrator desires a less frequent schedule, such as daily, |
|
190 |
that can be configured by changing the ``*`` in the ``Hours`` field to be a concrete |
|
191 |
hour (such as ``11``) and the ``Minutes`` field to a concrete value like ``15``, |
|
192 |
which would change the schedule to synchronize at 11:15 am daily. |
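
To make the schedule syntax concrete, here is a minimal Python sketch that joins the
six required Quartz fields (seconds, minutes, hours, day of month, month, day of week)
into one expression. The function name, parameter names, and default values are
assumptions chosen to match the example described above; they are not taken from the
Metacat form itself.

```python
def quartz_expression(sec="10", minute="0/3", hour="*",
                      mday="*", mon="*", wday="?"):
    """Join the six required Quartz cron fields in their fixed order."""
    # Quartz rejects expressions that give concrete values for both
    # day-of-month and day-of-week, so one of them has to be '?'.
    if mday != "?" and wday != "?":
        raise ValueError("one of mday and wday must be '?'")
    fields = [sec, minute, hour, mday, mon, wday]
    return " ".join(str(field) for field in fields)


# Every third minute, at the 10-second mark (the defaults assumed above).
every_third_minute = quartz_expression()

# Daily at 11:15, keeping the 10-second offset.
daily = quartz_expression(minute="15", hour="11")

print(every_third_minute)
print(daily)
```

The second expression keeps the seconds field at ``10``, so the daily run would start
ten seconds after 11:15.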
|
193 |
|
|
194 |
Once these parameters have been properly set, use the ``Register`` button to |
|
195 |
request to register with the DataONE Coordinating Node. This will generate a |
|
196 |
registration document describing this Metacat instance and send it to the |
|
197 |
Coordinating Node registration service, which will return a unique Node Identifier |
|
198 |
that Metacat then records. At that point, all that remains is to wait for |
|
199 |
the DataONE administrators to approve the node registration. Details of the approval |
|
200 |
process can be found on the `DataONE web site`_. |
|
201 |
|
|
202 |
.. _DataONE web site: http://www.dataone.org |
|
203 |
|
|
158 | 204 |
Access Control Policies |
159 | 205 |
----------------------- |
160 |
TODO: Describe access control for objects in DataONE. |
|
206 |
Metacat has supported fine grained access control for objects in the system since |
|
207 |
its inception. DataONE has devised a simple but effective access control system |
|
208 |
that is compatible with the prior system in Metacat. For each object in the DataONE |
|
209 |
system (including data objects, scientific metadata objects, and resource maps), |
|
210 |
a SystemMetadata_ document describes the critical metadata needed to manage that |
|
211 |
object in the system. This metadata includes a ``RightsHolder`` field and an |
|
212 |
``AuthoritativeMemberNode`` field that are used to list the people and node that |
|
213 |
have ultimate control over the disposition of the object. In addition, a separate |
|
214 |
AccessPolicy_ can be included in the ``SystemMetadata`` for the object. This ``AccessPolicy`` |
|
215 |
consists of a set of rules that grant additional permissions to other people, |
|
216 |
groups, and systems in DataONE. For example, for one data file, two users |
|
217 |
(Alice and Bob) may be able to make changes to the object, and the general public may |
|
218 |
be allowed to read the object. In the absence of explicit rules extending these permissions, |
|
219 |
Metacat enforces the rule that only the ``RightsHolder`` and ``AuthoritativeMemberNode`` have |
|
220 |
rights to the object, and that the Coordinating Node can manage ``SystemMetadata`` |
|
221 |
for the object. An example AccessPolicy that might be submitted with a dataset |
|
222 |
(giving Alice and Bob permission to read and write the object) follows: |
|
161 | 223 |
|
224 |
:: |
|
162 | 225 |
|
226 |
... |
|
227 |
<accessPolicy> |
|
228 |
<allow> |
|
229 |
<subject>/C=US/O=SomeIdP/CN=Alice</subject> |
|
230 |
<subject>/C=US/O=SomeIdP/CN=Bob</subject> |
|
231 |
<permission>read</permission> |
|
232 |
<permission>write</permission> |
|
233 |
</allow> |
|
234 |
</accessPolicy> |
|
235 |
... |
|
236 |
|
|
237 |
These AccessPolicies can be embedded inside of the ``SystemMetadata`` that accompanies |
|
238 |
submission of an object through the `MNStorage.create`_ and `MNStorage.update`_ services, |
|
239 |
or can be set using the `CNAuthorization.setAccessPolicy`_ service. |
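
For administrators who generate these documents programmatically, the following
Python sketch builds an ``accessPolicy`` fragment like the one shown above using only
the standard library. The function name is hypothetical, the ``public`` subject used
for the general-public rule is an assumption, and the output omits the XML namespaces
that a complete ``SystemMetadata`` document would carry.

```python
import xml.etree.ElementTree as ET


def build_access_policy(rules):
    """Build an accessPolicy element from (subjects, permissions) pairs."""
    policy = ET.Element("accessPolicy")
    for subjects, permissions in rules:
        # Each pair becomes one <allow> rule listing subjects, then permissions.
        allow = ET.SubElement(policy, "allow")
        for subject in subjects:
            ET.SubElement(allow, "subject").text = subject
        for permission in permissions:
            ET.SubElement(allow, "permission").text = permission
    return policy


policy = build_access_policy([
    # Alice and Bob may read and write the object.
    (["/C=US/O=SomeIdP/CN=Alice", "/C=US/O=SomeIdP/CN=Bob"], ["read", "write"]),
    # Assumed subject name for the general public, granted read only.
    (["public"], ["read"]),
])

policy_xml = ET.tostring(policy, encoding="unicode")
print(policy_xml)
```

Grouping subjects that share the same permissions into one ``allow`` rule keeps the
policy short, which mirrors the layout of the example above.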
|
163 | 240 |
|
241 |
.. _SystemMetadata: http://releases.dataone.org/online/d1-architecture-1.0.0/apis/Types.html#Types.SystemMetadata |
|
164 | 242 |
|
243 |
.. _AccessPolicy: http://releases.dataone.org/online/d1-architecture-1.0.0/apis/Types.html#Types.AccessPolicy |
|
244 |
|
|
245 |
.. _MNStorage.create: http://releases.dataone.org/online/d1-architecture-1.0.0/apis/MN_APIs.html#MNStorage.create |
|
246 |
|
|
247 |
.. _MNStorage.update: http://releases.dataone.org/online/d1-architecture-1.0.0/apis/MN_APIs.html#MNStorage.update |
|
248 |
|
|
249 |
.. _CNAuthorization.setAccessPolicy: http://releases.dataone.org/online/d1-architecture-1.0.0/apis/CN_APIs.html#CNAuthorization.setAccessPolicy |
|
250 |
|
|
251 |
Configuration as a replication target |
|
252 |
------------------------------------- |
|
253 |
DataONE is designed to enable a robust preservation environment through replication |
|
254 |
of digital objects at multiple Member Nodes. Any Member Node in DataONE that implements |
|
255 |
the Tier 4 Service interface can offer to act as a target for object replication. |
|
256 |
Currently, Metacat configuration supports turning this replication function on or off. |
|
257 |
When the 'Act as a replication target' checkbox is checked, then Metacat will notify |
|
258 |
the Coordinating Nodes in DataONE that it is available to house replicas of objects |
|
259 |
from other nodes. Shortly thereafter, the Coordinating Nodes may notify Metacat to |
|
260 |
replicate objects from throughout the system, which it will start to do. These objects |
|
261 |
will begin to be listed in the Metacat catalog. |
|
262 |
|
|
263 |
.. Note:: |
|
264 |
|
|
265 |
Future versions of Metacat will allow finer specification of the Node |
|
266 |
Replication Policy, which determines the set of objects |
|
267 |
that it is willing to replicate, using constraints on object size, total objects, |
|
268 |
source nodes, and object format types. |
|
269 |
|
|
270 |
Object Replication Policies |
|
271 |
--------------------------- |
|
272 |
In addition to access control, each object also can have a ``ReplicationPolicy`` |
|
273 |
associated with it that determines whether DataONE should attempt to replicate the |
|
274 |
object for failover and backup purposes to other Member Nodes in the federation. |
|
275 |
Both the ``RightsHolder`` and ``AuthoritativeMemberNode`` for an object can set the |
|
276 |
``ReplicationPolicy``, which consists of fields that describe how many replicas |
|
277 |
should be maintained, and any nodes that are preferred for housing those replicas, or |
|
278 |
that should be blocked from housing replicas. |
|
279 |
|
|
280 |
These ReplicationPolicies can be embedded inside of the ``SystemMetadata`` that accompanies |
|
281 |
submission of an object through the `MNStorage.create`_ and `MNStorage.update`_ services, |
|
282 |
or can be set using the `CNReplication.setReplicationPolicy`_ service. |
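
The fields described above could be assembled as in the following Python sketch. The
element and attribute names (``replicationPolicy``, ``replicationAllowed``,
``numberReplicas``, ``preferredMemberNode``, ``blockedMemberNode``) are written from
memory of the DataONE type definitions and should be checked against the schema; the
function name and node identifiers are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_replication_policy(number_replicas, preferred=(), blocked=(),
                             allowed=True):
    """Build a replicationPolicy element (names assumed, see lead-in)."""
    policy = ET.Element("replicationPolicy", {
        "replicationAllowed": "true" if allowed else "false",
        "numberReplicas": str(number_replicas),
    })
    # Nodes that are preferred for housing replicas.
    for node in preferred:
        ET.SubElement(policy, "preferredMemberNode").text = node
    # Nodes that should be blocked from housing replicas.
    for node in blocked:
        ET.SubElement(policy, "blockedMemberNode").text = node
    return policy


policy = build_replication_policy(
    2,
    preferred=["urn:node:EXAMPLE_A"],  # hypothetical node identifier
    blocked=["urn:node:EXAMPLE_B"],    # hypothetical node identifier
)

print(ET.tostring(policy, encoding="unicode"))
```

Passing ``allowed=False`` in this sketch would mark the object as not to be
replicated at all.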
|
283 |
|
|
284 |
.. _CNReplication.setReplicationPolicy: http://releases.dataone.org/online/d1-architecture-1.0.0/apis/CN_APIs.html#CNReplication.setReplicationPolicy |
Completed first draft of Admin guide chapter on DataONE.