Revision 6895

Added by Matt Jones over 12 years ago

Completed first draft of Admin guide chapter on DataONE.

dataone.rst
35 35
process of writing scientific software both for servers and client tools.
36 36

  
37 37
The DataONE Service Interface
38
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
38
-----------------------------
39 39
DataONE achieves interoperability by defining a lightweight but powerful set of 
40 40
REST_ web services that can be implemented by various data management software 
41 41
systems to allow those systems to effectively communicate with one another, 
......
60 60
.. _four distinct tiers: http://releases.dataone.org/online/d1-architecture-1.0.0/apis/index.html
61 61

  
62 62
Member Nodes
63
~~~~~~~~~~~~
63
------------
64 64
In DataONE, Member Nodes represent the core of the network, in that they represent
65 65
particular scientific communities, manage and preserve their data and metadata, and
66 66
provide tools to their community for contributing, managing, and accessing data.
......
76 76
that have long-term data preservation as part of their mission. 
77 77

  
78 78
Coordinating Nodes
79
~~~~~~~~~~~~~~~~~~
79
------------------
80 80
The DataONE Coordinating Nodes provide a set of services to Member Nodes that
81 81
allow Member Nodes to easily interact with one another and to provide a unified
82 82
view of the whole DataONE Federation.  The main services provided by Coordinating
......
108 108
.. _InCommon: http://incommon.org
109 109

  
110 110
Investigator Toolkit
111
~~~~~~~~~~~~~~~~~~~~
111
--------------------
112 112
In order to provide scientists with convenient access to the data and metadata in
113 113
DataONE, the third component represents a library of software tools that have been 
114 114
adapted to work with DataONE via the service interface and can be used to
......
146 146
   :align: center
147 147
   
148 148
   The configuration screen for configuring Metacat as a DataONE node.
149
   
150
Being a replication target
151
~~~~~~~~~~~~~~~~~~~~~~~~~~
152
TODO: Describe the configuration for acting as a replication target.
153 149

  
154
Replication Policies
155
--------------------
156
TODO: Describe the replication policies for objects in DataONE.
150
To configure Metacat as a node in the DataONE network, configure the properties shown
151
in the figure above.  The Node Name should be a short name for the node that can
152
be used in user interface displays that list the node.  For example, one node in
153
DataONE is the 'Knowledge Network for Biocomplexity'.  Also provide a brief sentence
154
or two describing the node, including its intended scope and purpose.  
157 155

  
156
The Node Identifier field is a unique identifier assigned by DataONE to identify
157
this node even when the node changes physical locations over time.  When Metacat
158
registers with the DataONE Coordinating Nodes (when you click 'Register' at the
159
bottom of this form), the Node Identifier is automatically set.  **It is critical that
160
you not change the Node Identifier**, as that will break the connection with the
161
DataONE network.  The ability to edit this field is only provided for the rare case
162
in which a new Metacat instance is being established to act as the provider for an 
163
existing DataONE Member Node, in which case the field can be edited to set it to
164
the value of a valid, existing Node Identifier.
165

  
166
The Node Subject and Node Certificate Path are linked fields that are critical for
167
proper operation of the node.  To act as a Member Node in DataONE, you must obtain
168
an X.509 certificate that can be used to identify this node and allow it to securely
169
communicate using SSL with other nodes and client applications.  This certificate can 
170
either be obtained from the DataONE Certificate Authority, or from a commercial 
171
provider of certificates. Once you have the certificate in hand, use a tool such 
172
as ``openssl`` to determine the exact subject distinguished name in the 
173
certificate, and use that to set the Node Subject field.  Set the Node 
174
Certificate Path to the location on the system in which you stored the 
175
certificate file.
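
As a minimal sketch, assuming the certificate is PEM-encoded and the ``openssl`` command-line tool is installed, the subject can be printed with the ``x509`` subcommand. The helper names below are hypothetical and are not part of Metacat. The textual layout of the subject varies between ``openssl`` versions, so review the printed value before copying it into the Node Subject field.

```python
import subprocess


def subject_command(cert_path):
    """Build the openssl invocation that prints a certificate's subject."""
    return ["openssl", "x509", "-in", cert_path, "-noout", "-subject"]


def read_subject(cert_path):
    """Run openssl and return its raw output line (requires openssl installed)."""
    result = subprocess.run(
        subject_command(cert_path),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Usage (hypothetical path; substitute the Node Certificate Path value):
#   print(read_subject("/path/to/node-certificate.pem"))
```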
176

  
177
The ``Synchronize`` checkbox allows the administrator to decide whether to turn on
178
synchronization with the DataONE network.  When this box is unchecked, the DataONE
179
Coordinating Nodes will not attempt to synchronize at all; when it is checked,
180
DataONE will periodically contact the node to synchronize all metadata content.
181
To be part of the DataONE network, this box must be checked as that allows 
182
DataONE to receive a copy of the metadata associated with each object in the Metacat
183
system.  The switch is provided for those rare cases when a node needs to be disconnected
184
from DataONE for maintenance or service outages.  When the box is checked, DataONE
185
contacts the node using the schedule provided in the ``Synchronization Schedule``
186
fields.  The example in the dialog above has synchronization occurring once every three
187
minutes at the 10 second mark of those minutes.  The syntax for these schedules
188
follows the Quartz Crontab Entry syntax, which provides for many flexible schedule 
189
configurations.  If the administrator desires a less frequent schedule, such as daily, 
190
that can be configured by changing the ``*`` in the ``Hours`` field to be a concrete 
191
hour (such as ``11``) and the ``Minutes`` field to a concrete value like ``15``, 
192
which would change the schedule to synchronize at 11:15 am daily.  
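
For orientation, a Quartz cron expression lists seconds, minutes, hours, day of month, month, and day of week, in that order. The sketch below joins such fields into a single expression. The function name is hypothetical, and the ``0/3`` minutes value is an assumption about how "every three minutes" is written in the dialog; Metacat itself takes the fields individually through the form.

```python
def quartz_expression(seconds, minutes, hours,
                      day_of_month="*", month="*", day_of_week="?"):
    """Join fields in Quartz order: sec, min, hour, day-of-month, month, day-of-week."""
    return " ".join([seconds, minutes, hours, day_of_month, month, day_of_week])


# Every third minute, at the 10-second mark (assumed field values).
every_three_minutes = quartz_expression("10", "0/3", "*")

# Daily at 11:15 am, keeping the 10-second mark.
daily = quartz_expression("10", "15", "11")
```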
193

  
194
Once these parameters have been properly set, use the ``Register`` button to
195
request to register with the DataONE Coordinating Node.  This will generate a
196
registration document describing this Metacat instance and send it to the 
197
Coordinating Node registration service, which will return a unique Node Identifier
198
which will be recorded by Metacat.  At that point, all that remains is to wait for
199
the DataONE administrators to approve the node registration.  Details of the approval
200
process can be found on the `DataONE web site`_.
201

  
202
.. _DataONE web site: http://www.dataone.org
203

  
158 204
Access Control Policies
159 205
-----------------------
160
TODO: Describe access control for objects in DataONE.
206
Metacat has supported fine grained access control for objects in the system since
207
its inception.  DataONE has devised a simple but effective access control system
208
that is compatible with the prior system in Metacat.  For each object in the DataONE
209
system (including data objects, scientific metadata objects, and resource maps), 
210
a SystemMetadata_ document describes the critical metadata needed to manage that
211
object in the system.  This metadata includes a ``RightsHolder`` field and an
212
``AuthoritativeMemberNode`` field that are used to list the people and node that
213
have ultimate control over the disposition of the object.  In addition, a separate
214
AccessPolicy_ can be included in the ``SystemMetadata`` for the object.  This ``AccessPolicy``
215
consists of a set of rules that grant additional permissions to other people, 
216
groups, and systems in DataONE.  For example, for one data file, two users 
217
(Alice and Bob) may be able to make changes to the object, and the general public may
218
be allowed to read the object.  In the absence of explicit rules extending these permissions,
219
Metacat enforces the rule that only the ``RightsHolder`` and ``AuthoritativeMemberNode`` have
220
rights to the object, and that the Coordinating Node can manage ``SystemMetadata``
221
for the object.  An example AccessPolicy that might be submitted with a dataset
222
(giving Alice and Bob permission to read and write the object) follows:
161 223

  
224
::
162 225

  
226
  ...
227
  <accessPolicy>
228
      <allow>
229
        <subject>/C=US/O=SomeIdP/CN=Alice</subject>
230
        <subject>/C=US/O=SomeIdP/CN=Bob</subject>
231
        <permission>read</permission>
232
        <permission>write</permission>
233
      </allow>
234
  </accessPolicy>
235
  ...
236
  
237
These AccessPolicies can be embedded inside of the ``SystemMetadata`` that accompanies
238
submission of an object through the `MNStorage.create`_ and `MNStorage.update`_ services, 
239
or can be set using the `CNAuthorization.setAccessPolicy`_ service.
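
To illustrate how such a policy reads, the following sketch parses the example ``accessPolicy`` fragment shown earlier and lists the permissions granted to each subject. It assumes a fragment without XML namespaces, as in the example; the function name is hypothetical, and this is not how Metacat itself evaluates policies.

```python
import xml.etree.ElementTree as ET

# The example fragment from this section, without the surrounding SystemMetadata.
POLICY_XML = """
<accessPolicy>
  <allow>
    <subject>/C=US/O=SomeIdP/CN=Alice</subject>
    <subject>/C=US/O=SomeIdP/CN=Bob</subject>
    <permission>read</permission>
    <permission>write</permission>
  </allow>
</accessPolicy>
"""


def permissions_by_subject(policy_xml):
    """Map each subject to the set of permissions its allow rules grant."""
    root = ET.fromstring(policy_xml.strip())
    granted = {}
    for rule in root.findall("allow"):
        subjects = [s.text.strip() for s in rule.findall("subject")]
        permissions = [p.text.strip() for p in rule.findall("permission")]
        # Every subject in a rule receives every permission in that rule.
        for subject in subjects:
            granted.setdefault(subject, set()).update(permissions)
    return granted


granted = permissions_by_subject(POLICY_XML)
```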
163 240

  
241
.. _SystemMetadata: http://releases.dataone.org/online/d1-architecture-1.0.0/apis/Types.html#Types.SystemMetadata
164 242

  
243
.. _AccessPolicy: http://releases.dataone.org/online/d1-architecture-1.0.0/apis/Types.html#Types.AccessPolicy
244

  
245
.. _MNStorage.create: http://releases.dataone.org/online/d1-architecture-1.0.0/apis/MN_APIs.html#MNStorage.create
246

  
247
.. _MNStorage.update: http://releases.dataone.org/online/d1-architecture-1.0.0/apis/MN_APIs.html#MNStorage.update
248

  
249
.. _CNAuthorization.setAccessPolicy: http://releases.dataone.org/online/d1-architecture-1.0.0/apis/CN_APIs.html#CNAuthorization.setAccessPolicy
250

  
251
Configuration as a replication target
252
-------------------------------------
253
DataONE is designed to enable a robust preservation environment through replication
254
of digital objects at multiple Member Nodes.  Any Member Node in DataONE that implements
255
the Tier 4 Service interface can offer to act as a target for object replication.  
256
Currently, Metacat configuration supports turning this replication function on or off.
257
When the 'Act as a replication target' checkbox is checked, Metacat will notify
258
the Coordinating Nodes in DataONE that it is available to house replicas of objects
259
from other nodes.  Shortly thereafter, the Coordinating Nodes may notify Metacat to
260
replicate objects from throughout the system, which it will start to do.  These objects
261
will begin to be listed in the Metacat catalog.
262

  
263
.. Note:: 
264
  
265
  Future versions of Metacat will allow finer specification of the Node
266
  Replication Policy, which determines the set of objects
267
  that it is willing to replicate, using constraints on object size, total objects, 
268
  source nodes, and object format types.
269

  
270
Object Replication Policies
271
---------------------------
272
In addition to access control, each object also can have a ``ReplicationPolicy``
273
associated with it that determines whether DataONE should attempt to replicate the
274
object for failover and backup purposes to other Member Nodes in the federation. 
275
Both the ``RightsHolder`` and ``AuthoritativeMemberNode`` for an object can set the
276
``ReplicationPolicy``, which consists of fields that describe how many replicas 
277
should be maintained, and any nodes that are preferred for housing those replicas, or
278
that should be blocked from housing replicas.  
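
The interplay of these fields can be pictured with a small sketch. The class, the function, and the ``urn:node:...`` identifiers below are hypothetical, and the selection logic is only one plausible reading of "preferred" and "blocked"; it is not the algorithm the Coordinating Nodes actually use.

```python
from dataclasses import dataclass, field


@dataclass
class ReplicationPolicySketch:
    """Hypothetical stand-in for the fields described above."""
    number_replicas: int
    preferred_nodes: list = field(default_factory=list)
    blocked_nodes: list = field(default_factory=list)


def choose_targets(policy, candidate_nodes):
    """Drop blocked nodes, list preferred nodes first, keep number_replicas."""
    allowed = [n for n in candidate_nodes if n not in policy.blocked_nodes]
    preferred = [n for n in allowed if n in policy.preferred_nodes]
    others = [n for n in allowed if n not in policy.preferred_nodes]
    return (preferred + others)[: policy.number_replicas]


policy = ReplicationPolicySketch(
    number_replicas=2,
    preferred_nodes=["urn:node:B"],
    blocked_nodes=["urn:node:C"],
)
targets = choose_targets(
    policy, ["urn:node:A", "urn:node:B", "urn:node:C", "urn:node:D"]
)
```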
279

  
280
These ReplicationPolicies can be embedded inside of the ``SystemMetadata`` that accompanies
281
submission of an object through the `MNStorage.create`_ and `MNStorage.update`_ services, 
282
or can be set using the `CNReplication.setReplicationPolicy`_ service.
283

  
284
.. _CNReplication.setReplicationPolicy: http://releases.dataone.org/online/d1-architecture-1.0.0/apis/CN_APIs.html#CNReplication.setReplicationPolicy
