Replication
===========

.. Note::

   Much of the functionality provided by Metacat's replication subsystem has
   now been generalized and standardized by DataONE. Consider using the DataONE
   services for replication instead of this Metacat-specific replication
   system. The Metacat replication system will be supported for a while
   longer, but will likely be deprecated in a future release in favor of the
   DataONE replication approach.

Metacat has a built-in replication feature that allows different Metacat servers
to share data (both XML documents and data files) with each other. Metacat
can replicate not only its home server's original documents, but also those
that were replicated from partner Metacat servers. When changes are made to
one server in a replication network, the changes are automatically propagated
to the other servers in the network. A server that cannot be reached when a
change is made can still pick it up later, because partners are also checked
for updates at a regular interval (see Delta-T monitoring below).

Replication allows users to manage their data locally and (by replicating them
to a shared Metacat repository) to make those data available to the greater
scientific community via a centralized search. In other words, your Metacat can
be part of a broader network, but you retain control over the local repository
and how it is managed.

For example, the KNB Network (Figure 6.1), which currently consists of ten
different Metacat servers from around the world, uses replication to "join"
the disparate servers into a single robust and searchable data repository.
This facilitates data discovery while leaving data ownership and management
with the local administrators.

.. figure:: images/screenshots/image059.jpg
   :align: center

   A map of the KNB Metacat network.

When properly configured, Metacat's replication mechanism can be triggered by
several types of events that occur on either the home or partner server: a
document insertion, an update, or an automatic replication (i.e., Delta-T
monitoring), which is set at a user-specified time interval.

+----------------------+----------------------------------------------------------+
| Replication Triggers | Description                                              |
+======================+==========================================================+
| Insert               | Whenever a document is inserted into Metacat, the server |
|                      | notifies each server in its replication list             |
|                      | that it has a new file available.                        |
+----------------------+----------------------------------------------------------+
| Update               | Whenever a document is updated, the server notifies      |
|                      | each server in its replication list of the update.       |
+----------------------+----------------------------------------------------------+
| Delta-T monitoring   | At a user-specified time interval, Metacat checks each   |
|                      | of the servers in its replication list                   |
|                      | for updated documents.                                   |
+----------------------+----------------------------------------------------------+

Configuring Replication
-----------------------
To configure replication, you must configure both the home and partner servers:

1. Create a list of partner servers on your home server using the Replication Control Panel
2. Create certificate files for the home server
3. Create certificate files for the partner server
4. Import partner certificate files to the home server
5. Import the home certificate to the partner server
6. Update your Metacat database

Each step is discussed in more detail in the following sections.

Using the Replication Control Panel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To add, remove, or alter servers on your home server's replication list, or to
activate and customize the Delta-T handler, use the Replication Control Panel,
which is accessed via the Metacat Administration interface at the following URL::

  http://somehost.somelocation.edu/context/admin

Replace ``http://somehost.somelocation.edu/context`` with the name
of your Metacat server and context (e.g., http://knb.ecoinformatics.org/knb/).
You must be logged in to Metacat as an administrator.

.. figure:: images/screenshots/image061.jpg
   :align: center

   Replication control panel.

Note that currently, you cannot use the Replication Control Panel to remove a
server after a replication has occurred. To stop replication between two servers,
update the flags that control whether metadata and/or data are replicated.

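The flags mentioned above are not named in this section. The following is a
minimal sketch, assuming they are the ``replicate`` and ``datareplicate``
columns of the ``xml_replication`` table used in the SQL instructions later in
this document, and that setting both to 0 stops documents from being sent to
that partner. The function name ``stop_replication_sql`` and the example server
value are hypothetical; check the row in your own ``xml_replication`` table
before running any update.

::

  #!/bin/sh
  # Hypothetical helper: print an UPDATE statement that clears both
  # replication flags for one partner entry in xml_replication.
  # $1 is the value stored in the "server" column for that partner.
  # The value is inserted as-is, so it must not contain single quotes.
  stop_replication_sql() {
      printf "UPDATE xml_replication SET replicate = 0, datareplicate = 0 WHERE server = '%s';\n" "$1"
  }

  # Example partner entry (an assumption; use the value from your own table).
  stop_replication_sql 'partner.example.edu/knb/servlet/replication'

  # To apply it, the statement could be piped to psql, for example:
  #   stop_replication_sql 'your-server-value' | psql -U metacat -W -h localhost metacat

Printing the statement first, rather than running it directly, leaves room to
review it against the existing rows.
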
Generating and Exchanging Security Certificates
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Before you can take advantage of Metacat's replication feature, you must
generate security certificates on both the replication partner and home servers.
Depending on how the certificates are generated, they may need to be
exchanged so that each machine "trusts" that the other has replication access.
Certificates that are purchased from a commercial and well-recognized
Certificate Authority do not need to be exchanged with the other replication
partner before replication takes place. Metacat replication relies on SSL with
client certificate authentication enabled. When a replication partner server
communicates with another replication partner, it presents a certificate that
serves to verify and authenticate that the server is trusted.

If you must generate a self-signed certificate, the partner replication server
will need that public certificate (or the certificate of the signing CA) added
to its existing Certificate Authorities.

Generate Certificates for Metacat running under Apache/Tomcat
.............................................................
Note: Instructions are for Ubuntu/Debian systems.

1. Generate a private key and a certificate request using openssl. The key
   will be named ``<hostname>-apache.key``, where ``<hostname>`` is the name
   of your Metacat server, and the request will be written to ``REQ.pem``.
   Example values for the individual key fields are included in the table
   below.

   ::

     openssl req -new -out REQ.pem -keyout <hostname>-apache.key

   +--------------------------+-------------------------------------------------------------------------+
   | Key Field                | Description and Example Value                                           |
   +==========================+=========================================================================+
   | Country Name             | Two-letter country code (e.g., US)                                      |
   +--------------------------+-------------------------------------------------------------------------+
   | State or Province Name   | The name of your state or province spelled in full (e.g., California)   |
   +--------------------------+-------------------------------------------------------------------------+
   | Locality Name            | The name of your city (e.g., Santa Barbara)                             |
   +--------------------------+-------------------------------------------------------------------------+
   | Organization Name        | The company or organization name (e.g., UCSB)                           |
   +--------------------------+-------------------------------------------------------------------------+
   | Organizational Unit Name | The department or section name (e.g., NCEAS)                            |
   +--------------------------+-------------------------------------------------------------------------+
   | Common Name              | The host server name without port numbers (e.g., myserver.mydomain.edu) |
   +--------------------------+-------------------------------------------------------------------------+
   | Email Address            | Administrator's contact email (e.g., administrator@mydomain.edu)        |
   +--------------------------+-------------------------------------------------------------------------+
   | A challenge password     | --leave this field blank--                                              |
   +--------------------------+-------------------------------------------------------------------------+
   | An optional company name | --leave this field blank--                                              |
   +--------------------------+-------------------------------------------------------------------------+

2. Create the local certificate file by running the command:

   ::

     openssl req -x509 -days 800 -in REQ.pem -key <hostname>-apache.key -out <hostname>-apache.crt

   Use the same ``<hostname>`` you used when you generated the key. A file named
   ``<hostname>-apache.crt`` will be created in the directory from which you
   ran the openssl command. Note: You can name the certificate file anything
   you'd like, but keep in mind that the file will be sent to the partner
   machine used for replication. The certificate name should have enough
   meaning that someone who sees it on that machine can figure out where it
   came from and for what purpose it should be used.

3. Enter the certificate into Apache's security configuration. This will
   be used to identify your server to a replication partner. You must
   register the certificate in the local Apache instance. Note that the
   security files may be in a different directory from the one used in the
   instructions depending on how you installed Apache. Copy the certificate and
   key file using the following commands:

   ::

     sudo cp <hostname>-apache.crt /etc/ssl/certs
     sudo cp <hostname>-apache.key /etc/ssl/private

4. Apache needs to be configured to request a client certificate when the
   replication API is used. The helper file named ``metacat-site-ssl`` has
   default rules that configure Apache for SSL and client certificate
   authentication. Set up these SSL settings by copying the
   ``metacat-site-ssl`` file into the ``sites-available`` directory, editing
   its settings to match the specifics of your system and Metacat deployment,
   and running ``a2ensite`` to enable the site.

   ::

     sudo cp <metacat_helper_dir>/metacat-site-ssl <apache_install_dir>/sites-available
     sudo a2ensite metacat-site-ssl

5. Enable the ssl module:

   ::

     sudo a2enmod ssl

6. Restart Apache to bring in changes by typing:

   ::

     sudo /etc/init.d/apache2 restart

7. If using a self-signed certificate, SCP ``<hostname>-apache.crt`` to the
   replication partner machine where it will be added as an additional
   Certificate Authority.

If using self-signed certificates, each server must create a certificate file
and SCP it to each of its replication partners. Once certificates have been
exchanged in both directions, both home and partner servers must add the
certificates they received as Certificate Authorities.

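Steps 1 and 2 above prompt for each key field interactively. The same two
openssl commands can also be run without prompts by passing the field values
with ``-subj``. The following is a minimal sketch that uses the example values
from the table above, writes everything into a scratch directory, and then
prints the subject and expiry date of the resulting certificate. Note one
deliberate difference from step 1: ``-newkey rsa:2048 -nodes`` creates an
unencrypted 2048-bit RSA key so that no pass phrase prompt appears. That
choice, the host name, and the scratch directory are assumptions made for
illustration only.

::

  #!/bin/sh
  set -e

  # Example values only; replace with your own server's details.
  host="myserver.mydomain.edu"
  subject="/C=US/ST=California/L=Santa Barbara/O=UCSB/OU=NCEAS/CN=${host}/emailAddress=administrator@mydomain.edu"

  # Work in a scratch directory so no existing files are overwritten.
  workdir=$(mktemp -d)

  # Step 1 without prompts: private key plus certificate request.
  openssl req -new -newkey rsa:2048 -nodes \
      -subj "$subject" \
      -out "$workdir/REQ.pem" \
      -keyout "$workdir/${host}-apache.key"

  # Step 2: self-sign the request, valid for 800 days as in the text.
  openssl req -x509 -days 800 \
      -in "$workdir/REQ.pem" \
      -key "$workdir/${host}-apache.key" \
      -out "$workdir/${host}-apache.crt"

  # Show who the certificate identifies and when it expires.
  openssl x509 -in "$workdir/${host}-apache.crt" -noout -subject -enddate

If the key is left unencrypted as in this sketch, the
``replication.privatekey.password`` property described later would be left
blank.
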
To import a certificate
.......................
1. Copy it into the certificate directory used by Apache

   ::

     sudo cp <remotehostfilename> /etc/ssl/certs/

2. Rehash the certificates for Apache by running:

   ::

     cd /etc/ssl/certs
     sudo c_rehash

Here, ``<remotehostfilename>`` is the name of the certificate file
created on the remote partner machine and SCP'd to the home machine.

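One way to confirm that an imported self-signed certificate is trusted is to
ask openssl to verify it against the certificate directory. The sketch below
rehearses the copy, rehash, and verify sequence in a scratch directory so that
nothing under ``/etc/ssl`` is touched. The throwaway certificate, the file name
``partner-apache.crt``, and the use of ``openssl rehash`` (available in OpenSSL
1.1.0 and later, used here in place of ``c_rehash``) are assumptions made for
illustration only.

::

  #!/bin/sh
  set -e

  workdir=$(mktemp -d)
  cadir="$workdir/certs"          # stands in for /etc/ssl/certs
  mkdir "$cadir"

  # Throwaway self-signed certificate standing in for the partner's file.
  openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
      -subj "/CN=partner.example.edu" \
      -keyout "$workdir/partner-apache.key" \
      -out "$workdir/partner-apache.crt" 2>/dev/null

  # Step 1: copy the certificate into the certificate directory.
  cp "$workdir/partner-apache.crt" "$cadir/"

  # Step 2: rebuild the hash links that openssl uses for lookups.
  openssl rehash "$cadir"

  # Check: verify the certificate against that directory and show the result.
  verify_output=$(openssl verify -CApath "$cadir" "$workdir/partner-apache.crt")
  echo "$verify_output"

On the real server, the equivalent check after step 2 would be to run
``openssl verify -CApath /etc/ssl/certs`` on the imported file.
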
To import a certificate into Java keystore (for self-signed certificates)
.........................................................................
1. Use Java's keytool to import to the default Java keystore

   ::

     sudo keytool -import -alias <remotehostname_alias> -file <remotehostfilename> -keystore $JAVA_HOME/lib/security/cacerts

2. Restart Tomcat

   ::

     sudo /etc/init.d/tomcat6 restart

Here, ``<remotehostfilename>`` is the name of the certificate file
created on the remote partner machine and SCP'd to the home machine,
``<remotehostname_alias>`` is a short memorable alias for this certificate, and
``$JAVA_HOME`` is the same as configured for running Tomcat. NOTE: the cacerts
path may be different depending on your exact Java installation.

Update Metacat properties
.........................
Metacat needs to be configured with the path to both the server certificate and the private key.

1. Edit metacat.properties, modifying these properties to match your specific deployment.

   ::

     replication.certificate.file=/etc/ssl/certs/<hostname>-apache.crt
     replication.privatekey.file=/etc/ssl/private/<hostname>-apache.key
     replication.privatekey.password=<password, or blank if not protected>

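A quick check that the two configured paths exist and are readable can catch
typos before Tomcat is restarted. The helper below is a hypothetical sketch and
not part of Metacat: it extracts the two path properties from a properties file
and tests each path. Running it as the user that runs Tomcat makes the read
test reflect the permissions Metacat will have.

::

  #!/bin/sh
  # Hypothetical helper, not part of Metacat: check that the certificate
  # and private key named in a properties file exist and are readable.
  # $1 is the path to metacat.properties.
  check_replication_files() {
      props=$1
      status=0
      for name in replication.certificate.file replication.privatekey.file; do
          # Escape the dots so sed matches the property name literally,
          # then print whatever follows "name=" on the matching line.
          pattern=$(printf '%s' "$name" | sed 's/\./\\./g')
          path=$(sed -n "s/^${pattern}=//p" "$props")
          if [ -n "$path" ] && [ -r "$path" ]; then
              echo "ok: $name -> $path"
          else
              echo "problem: $name -> ${path:-not set}"
              status=1
          fi
      done
      return $status
  }

  # Usage: check_replication_files /path/to/metacat.properties

The function returns a non-zero status if either property is missing or points
to a file that cannot be read, so it can be used in a deployment script.
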
Update your Metacat database
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The simplest way to update the Metacat database to use replication is to use
the Replication Control Panel. You can also update the database using SQL.
Instructions for both options are included in this section.

.. figure:: images/screenshots/image063.jpg
   :align: center

   Using the Replication Control Panel to update the Metacat database.

To update your Metacat database to use replication, select the "Add this server"
radio button from the Replication Control Panel, enter the partner server name,
and specify how the replication should occur (whether to replicate XML, data,
or use the local machine as a hub).

To update the database using SQL
................................

1. Log in to the database

   ::

     psql -U metacat -W -h localhost metacat

2. Select all rows from the replication table

   ::

     select * from xml_replication;

3. Insert the partner server.

   ::

     INSERT INTO xml_replication (server,last_checked,replicate,datareplicate,hub) VALUES ('<partner.server/context>/servlet/replication',NULL,1,1,0);

   Here, ``<partner.server/context>`` is the name of the partner server and
   context. The values ``NULL,1,1,0`` indicate (respectively) the last time
   replication occurred, that XML docs should be replicated to the partner
   server, that data files should be replicated to the partner server, and
   that the local server should not act as a hub. Use the values ``NULL,0,0,0``
   if your Metacat is only receiving documents from the partner site and not
   replicating to that site.

4. Exit the database
5. Restart Apache and Tomcat on both home and partner replication machines

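The two flag combinations described in step 3 differ only in the values
supplied, so the statement can be generated from a few arguments. The helper
below is a hypothetical sketch (the function name ``partner_insert_sql`` and
the example host are not part of Metacat); it prints the INSERT statement for a
given partner so that it can be reviewed before being run in psql.

::

  #!/bin/sh
  # Hypothetical helper: print the INSERT statement from step 3.
  #   $1  partner server and context, e.g. partner.example.edu/knb
  #   $2  replicate flag      (1 = send XML documents to the partner)
  #   $3  datareplicate flag  (1 = send data files to the partner)
  #   $4  hub flag            (1 = local server acts as a hub)
  # The server value is inserted as-is, so it must not contain single quotes.
  partner_insert_sql() {
      printf "INSERT INTO xml_replication (server,last_checked,replicate,datareplicate,hub) VALUES ('%s/servlet/replication',NULL,%d,%d,%d);\n" \
          "$1" "$2" "$3" "$4"
  }

  # Replicate both XML and data to the partner, no hub.
  partner_insert_sql 'partner.example.edu/knb' 1 1 0

  # Only receive documents from the partner.
  partner_insert_sql 'partner.example.edu/knb' 0 0 0
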