Replication
===========
Metacat has a built-in replication feature that allows different Metacat servers
to share data (both XML documents and data files) with each other. Metacat
can replicate not only its home server's original documents, but also those
that were replicated from partner Metacat servers. When changes are made to
one server in a replication network, the changes are automatically propagated
to the network, even if the network is down.

    
Replication allows users to manage their data locally and (by replicating them
to a shared Metacat repository) to make those data available to the greater
scientific community via a centralized search. In other words, your Metacat can
be part of a broader network, but you retain control over the local repository
and how it is managed.

    
For example, the KNB Network (Figure 6.1), which currently consists of ten
different Metacat servers from around the world, uses replication to "join"
the disparate servers to form a single robust and searchable data
repository, facilitating data discovery while leaving the data ownership and
management with the local administrators.

    
.. figure:: images/screenshots/image059.jpg
   :align: center

   A map of the KNB Metacat network.

    
When properly configured, Metacat's replication mechanism can be triggered by
several types of events that occur on either the home or partner server: a
document insertion, an update, or an automatic replication (i.e., Delta-T
monitoring), which is set at a user-specified time interval.

    
+----------------------+----------------------------------------------------------+
| Replication Triggers | Description                                              |
+======================+==========================================================+
| Insert               | Whenever a document is inserted into Metacat, the server |
|                      | notifies each server in its replication list             |
|                      | that it has a new file available.                        |
+----------------------+----------------------------------------------------------+
| Update               | Whenever a document is updated, the server notifies      |
|                      | each server in its replication list of the update.       |
+----------------------+----------------------------------------------------------+
| Delta-T monitoring   | At a user-specified time interval, Metacat checks each   |
|                      | of the servers in its replication list                   |
|                      | for updated documents.                                   |
+----------------------+----------------------------------------------------------+

    
Configuring Replication
-----------------------
To configure replication, you must configure both the home and partner servers:

1. Create a list of partner servers on your home server using the Replication Control Panel
2. Create certificate files for the home server
3. Create certificate files for the partner server
4. Import partner certificate files to the home server
5. Import home certificate to the partner server
6. Update your Metacat database

Each step is discussed in more detail in the following sections.

    
Using the Replication Control Panel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To add, remove, or alter servers on your home server's replication list, or to
activate and customize the Delta-T handler, use the Replication Control Panel,
which is accessed at the following URL::

   http://somehost.somelocation.edu/context/style/skins/dev/replControl.html

Replace "http://somehost.somelocation.edu/context" with the name
of your Metacat server and context (e.g., http://knb.ecoinformatics.org/knb/).
You must be logged in to Metacat as an administrator.
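
As a minimal sketch of the substitution described above, the following shell
snippet builds the control panel URL from a base URL. The variable names
``METACAT_BASE`` and ``REPL_CONTROL_URL`` are illustrative only, and the base
URL is the KNB example from the text; substitute your own server and context.

.. code-block:: shell

   # Base URL of the Metacat server plus context (example value from the text).
   METACAT_BASE="http://knb.ecoinformatics.org/knb/"

   # Strip one trailing slash, if present, then append the control panel path.
   REPL_CONTROL_URL="${METACAT_BASE%/}/style/skins/dev/replControl.html"

   echo "$REPL_CONTROL_URL"

Removing the trailing slash first avoids a doubled ``//`` in the result when
the base URL is written with a final slash, as in the example.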

    
.. figure:: images/screenshots/image061.jpg
   :align: center

   Replication control panel.

    
Note that you currently cannot use the Replication Control Panel to remove a
server after a replication has occurred. At that point, the only way to
remove a replication server is to remove the certificates.

Also note that you must SCP partner certificates to your machine; you cannot
use the "Download Certificate from" option on the Control Panel. For more
information about creating and installing certificates, please see Generating
and Exchanging Security Certificates.

    
Generating and Exchanging Security Certificates
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Before you can take advantage of Metacat's replication feature, you must
generate security certificates on both the replication partner and home servers.
The certificates will be exchanged so that each machine understands that the
other has replication access.

The process for generating certificates is different for Metacat servlets
running under Tomcat alone and those running under Tomcat/Apache (the
recommended configuration). For instructions on generating and exchanging
certificates on systems running only Tomcat (and Java 6), see Generating a
Certificate for Tomcat standalone (no Apache).

    
Generate Certificates for Metacat running under Apache/Tomcat
.............................................................
Note: Instructions are for Ubuntu/Debian systems.

1. Generate a private key and certificate request using openssl. The key will
   be named ``<hostname>-apache.key``, where ``<hostname>`` is the name of your
   Metacat server. Example values for the individual key fields are included
   in the table below.

   ::

     openssl req -new -out REQ.pem -keyout <hostname>-apache.key

    
   +--------------------------+-------------------------------------------------------------------------+
   | Key Field                | Description and Example Value                                           |
   +==========================+=========================================================================+
   | Country Name             | Two letter country code  (e.g., US)                                     |
   +--------------------------+-------------------------------------------------------------------------+
   | State or Province Name   | The name of your state or province spelled in full (e.g., California)   |
   +--------------------------+-------------------------------------------------------------------------+
   | Locality Name            | The name of your city (e.g., Santa Barbara)                             |
   +--------------------------+-------------------------------------------------------------------------+
   | Organization Name        | The company or organization name (e.g., UCSB)                           |
   +--------------------------+-------------------------------------------------------------------------+
   | Organizational Unit Name | The department or section name (e.g., NCEAS)                            |
   +--------------------------+-------------------------------------------------------------------------+
   | Common Name              | The host server name without port numbers (e.g., myserver.mydomain.edu) |
   +--------------------------+-------------------------------------------------------------------------+
   | Email Address            | Administrator's contact email (e.g., administrator@mydomain.edu)        |
   +--------------------------+-------------------------------------------------------------------------+
   | A challenge password     | --leave this field blank--                                              |
   +--------------------------+-------------------------------------------------------------------------+
   | An optional company name | --leave this field blank--                                              |
   +--------------------------+-------------------------------------------------------------------------+

    
2. Create the local certificate file by running the command:

   ::

     openssl req -x509 -days 800 -in REQ.pem -key <hostname>-apache.key -out <hostname>-apache.crt

   Use the same ``<hostname>`` you used when you generated the key. A file named
   ``<hostname>-apache.crt`` will be created in the directory from which you
   ran the openssl command. Note: You can name the certificate file anything
   you like, but keep in mind that the file will be sent to the partner
   machine used for replication. The certificate name should be meaningful
   enough that someone who sees it on that machine can figure out where it
   came from.

    
3. Enter the certificate into Apache's security configuration. You must
   register the certificate in the local Apache instance. Note that the
   security files may be in a different directory from the one used in these
   instructions, depending on how you installed Apache. Copy the certificate
   and key file using the following commands:

   ::

     sudo cp <hostname>-apache.crt /etc/ssl/certs
     sudo cp <hostname>-apache.key /etc/ssl/private

4. Apache needs to know about Metacat SSL. The helper file named "knb-ssl" has
   rules that tell Apache which traffic to route to the Metacat SSL port. Set up
   SSL by dropping the knb-ssl file into the sites-available directory and
   running ``a2ensite`` to enable the site:

   ::

     sudo cp <metacat_helper_dir>/knb-ssl <apache_install_dir>/sites-available
     sudo a2ensite knb-ssl

5. Restart Apache to bring in changes by typing:

   ::

     sudo /etc/init.d/apache2 restart

6. SCP ``<hostname>-apache.crt`` to the replication partner machine.
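
For scripted setups, steps 1 and 2 can be run without prompts. The sketch
below is a variant of those commands under stated assumptions, not part of the
documented procedure: it passes the example field values from the table
through ``-subj``, uses ``-nodes`` to leave the key unencrypted (the
interactive command prompts for a passphrase instead), works in a scratch
directory, and then checks that the certificate and key carry the same public
key. ``HOST`` and ``WORKDIR`` are illustrative names.

.. code-block:: shell

   set -e

   HOST="myserver.mydomain.edu"   # example Common Name from the table
   WORKDIR="$(mktemp -d)"         # scratch directory for this sketch
   cd "$WORKDIR"

   # Step 1 equivalent: new RSA key plus certificate request, no prompts.
   openssl req -new -newkey rsa:2048 -nodes \
     -subj "/C=US/ST=California/L=Santa Barbara/O=UCSB/OU=NCEAS/CN=${HOST}" \
     -out REQ.pem -keyout "${HOST}-apache.key"

   # Step 2 equivalent: self-sign the request, valid for 800 days.
   openssl req -x509 -days 800 -in REQ.pem \
     -key "${HOST}-apache.key" -out "${HOST}-apache.crt"

   # Sanity check: the certificate's public key must match the private key.
   CRT_PUB="$(openssl x509 -in "${HOST}-apache.crt" -noout -pubkey)"
   KEY_PUB="$(openssl pkey -in "${HOST}-apache.key" -pubout)"
   [ "$CRT_PUB" = "$KEY_PUB" ] && echo "certificate matches key"

Running the same public-key comparison on the files copied in step 3 is a
quick way to catch a certificate that was paired with the wrong key before
Apache is restarted.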

    
Generating a Certificate for Tomcat standalone (no Apache)
..........................................................
If you are running Metacat under Tomcat (no Apache), generate keys in the Java
default keystore. The generated key is placed in the binary certificates
file located at ``/etc/java-1.6.0-sun/security/cacerts``.

    
1. Generate the key by running the following command (note that you must be
   logged in as the root user to use the keytool):

   ::

     keytool -genkey -alias <aliasname> -keyalg RSA -validity 800 -keystore /etc/java-1.6.0-sun/security/cacerts

   ``<aliasname>`` is a unique name that you choose for this key. Something
   like ``<hostname>-tomcat`` might be appropriate, where ``<hostname>``
   is the name of the Metacat host.

    
2. Enter a password when keytool prompts for one. If writing to a pre-existing
   keystore, you must know the password. If you are creating a new keystore,
   the password you enter will become the keystore password.

   Sample values when creating the certificate:

   ::

     What is your first and last name? myserver.nceas.ucsb.edu (note: use the host name without port number)
     What is the name of your organizational unit? NCEAS
     What is the name of your organization? UCSB
     What is the name of your City or Locality? Santa Barbara
     What is the name of your State or Province? California (note: this is spelled in full)
     What is the two-letter country code for this unit? US

    
3. Create a certificate by running the command:

   ::

     keytool -export -alias <aliasname> -file <outputfile>.cert -keystore /etc/java-1.6.0-sun/security/cacerts

   ``<aliasname>`` is the same name you used when you created the key. A
   file named ``<outputfile>.cert`` will be created in the directory from which
   you ran the keytool command. You can name the output file anything you like,
   but keep in mind that it will be sent to the partner machine used for
   replication. The filename should be meaningful enough that someone who sees
   it on that machine can figure out where it came from. Again, something like
   ``<hostname>-tomcat.cert`` will suffice.

    
4. Edit the Tomcat server file at ``$TOMCAT_HOME/conf/server.xml`` to enable
   SSL in Tomcat.

   * Uncomment the section that starts with ``<Connector port="8443" ...``
     (Note: XML comments start with ``<!--`` and end with ``-->``).

   * Add two attributes to that section:

     ::

       keystoreFile="/etc/java-1.6.0-sun/security/cacerts"
       keystorePass="<keystore_password>"

     where ``<keystore_password>`` is the password you used when you created
     or accessed the keystore.

5. SCP the certificate to the partner server.

    
To import a certificate
.......................
1. Log in as a root user (the keytool must run as a root user)

   ::

     sudo su -

2. Import the remote certificate by running:

   ::

     keytool -import -alias <remotehostalias> -file <remotehostfilename>.crt -keystore /etc/java-1.6.0-sun/security/cacerts

   where ``<remotehostfilename>`` is the name of the certificate file
   created on the remote partner machine and SCP'd to the home machine.
   ``<remotehostalias>`` is the name the certificate will use in the
   keystore. The name should identify the remote host.
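
Because the certificate is copied between machines by hand, it can be worth
comparing its fingerprint on both hosts before importing it. The following
sketch assumes ``openssl`` is installed. It creates a short-lived stand-in
certificate (``partner.crt``, a hypothetical file name for a hypothetical host)
only so that the example is self-contained; in practice you would run the
``openssl x509`` line against the real file on each machine and compare the
two results.

.. code-block:: shell

   set -e

   WORKDIR="$(mktemp -d)"   # scratch directory for this sketch
   cd "$WORKDIR"

   # Stand-in for a partner certificate (hypothetical host, 1-day validity).
   openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
     -subj "/CN=partner.example.edu" \
     -keyout partner.key -out partner.crt 2>/dev/null

   # Print the SHA-256 fingerprint; run the same command on the sending host.
   FINGERPRINT="$(openssl x509 -in partner.crt -noout -fingerprint -sha256)"
   echo "$FINGERPRINT"

   # A faithfully copied file yields an identical fingerprint.
   cp partner.crt copied.crt
   COPIED="$(openssl x509 -in copied.crt -noout -fingerprint -sha256)"
   [ "$FINGERPRINT" = "$COPIED" ] && echo "fingerprints match"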

    
Update your Metacat database
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The simplest way to update the Metacat database to use replication is to use
the Replication Control Panel. You can also update the database using SQL.
Instructions for both options are included in this section.

    
.. figure:: images/screenshots/image063.jpg
   :align: center

   Using the Replication Control Panel to update the Metacat database.

    
To update your Metacat database to use replication, select the "Add this server"
radio button from the Replication Control Panel, enter the partner server name,
and specify how the replication should occur (whether to replicate XML, data,
or use the local machine as a hub). Note that you cannot download certificates
using this interface.

    
To update the database using SQL
................................

1. Log in to the database

   ::

     psql -U metacat -W -h localhost metacat

2. Select all rows from the replication table

   ::

     select * from xml_replication;

3. Insert the partner server.

   ::

     INSERT INTO xml_replication (server,last_checked,replicate,datareplicate,hub) VALUES ('<partner.server/context>/servlet/replication',NULL,1,1,0);

   where ``<partner.server/context>`` is the name of the partner server and
   context. The values 'NULL,1,1,0' indicate (respectively) the last time
   replication occurred, that XML docs should be replicated to the partner
   server, that data files should be replicated to the partner server, and
   that the local server should not act as a hub. Set the values to
   'NULL,0,0,0' if your Metacat is only receiving documents from the partner
   site and not replicating to that site.

4. Exit the database
5. Restart Apache and Tomcat on both home and partner replication machines
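
The ``INSERT`` statement in step 3 can also be assembled from shell variables,
which makes the meaning of each flag explicit. This is only a sketch: the
variable names and the partner value ``partner.example.edu/knb`` are
hypothetical, and the statement is printed rather than executed so that it can
be reviewed first.

.. code-block:: shell

   PARTNER="partner.example.edu/knb"   # hypothetical partner server and context
   REPLICATE=1       # 1 = replicate XML documents to the partner
   DATAREPLICATE=1   # 1 = replicate data files to the partner
   HUB=0             # 0 = the local server does not act as a hub

   # last_checked starts as NULL because no replication has occurred yet.
   SQL="INSERT INTO xml_replication (server,last_checked,replicate,datareplicate,hub) VALUES ('${PARTNER}/servlet/replication',NULL,${REPLICATE},${DATAREPLICATE},${HUB});"

   printf '%s\n' "$SQL"

Once the printed statement looks right, it could be pasted into the ``psql``
session opened in step 1. For a receive-only setup, set ``REPLICATE`` and
``DATAREPLICATE`` to ``0``.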