<!--
  * replication.html
  *
  *      Authors: Chad Berkley
  *    Copyright: 2000 Regents of the University of California and the
  *               National Center for Ecological Analysis and Synthesis
  *  For Details: http://www.nceas.ucsb.edu/
  *      Created: 2001 January 23
  *      Version:
  *    File Info: '$ '
  *
  *
-->
<HTML>
<HEAD>
<TITLE>Metacat</TITLE>
<link rel="stylesheet" type="text/css" href="./default.css">
</HEAD>
<BODY>
  <table width="100%">
    <tr>
      <td class="tablehead" colspan="2"><p class="label">Replication</p></td>
      <td class="tablehead" colspan="2" align="right">
        <a href="./packages.html">Back</a> | <a href="./metacattour.html">Home</a> |
        <a href="./datafiles.html">Next</a>
      </td>
    </tr>
  </table>

  <div class="header1">Table of Contents</div>
  <div class="toc1"><a href="#Intro">Metacat Replication</a></div>
    <div class="toc2"><a href="#DatabasedInfo">Databased Information</a></div>
    <div class="toc2"><a href="#Example">Example</a></div>
      <div class="toc3"><a href="#gamma">What happens with gamma?</a></div>
      <div class="toc3"><a href="#alpha">What happens with alpha?</a></div>
      <div class="toc3"><a href="#lamda">What happens with lamda?</a></div>
  <div class="toc1"><a href="#Certificates">Certificates</a></div>
    <div class="toc2"><a href="#GenerateCertificates">Generate Certificates on both the replication client and server.</a></div>
      <div class="toc3"><a href="#GenerateCertTomcat">Generate Certificate for Tomcat standalone (no Apache)</a></div>
      <div class="toc3"><a href="#GenerateCertApache">Generate Certificate for Apache/Tomcat</a></div>
    <div class="toc2"><a href="#RegisterPartner">Register the partner machine's certificate</a></div>

  <a name="Intro"></a><div class="header1">Metacat Replication</div>
  <p>Metacat has built-in replication that allows different Metacat servers to
  share data with one another. Metacat replicates not only XML documents but
  also data files.</p>

  <p>Metacat's hub feature allows a server to replicate not only its own original
  documents, but also those that were replicated to it from other servers.  This
  allows for more complex, chained replication structures.</p>

  <p>The replication scheme that Metacat uses is both push and pull.  Several
  triggers can start replication:</p>
  <ul class="list1">
    <li><b>Delta-T monitoring</b> - At a set time interval, a server checks each of the
    other servers in its list for updated documents.</li>
    <li><b>INSERT trigger</b> - Whenever a document is inserted, the server notifies
    the remote hosts in its list that it has a new file available.</li>
    <li><b>UPDATE trigger</b> - Whenever a document is updated, the server notifies
    each server in its list of the update.</li>
    <li><b>File locking</b> - When a local user tries to alter a document on a local
    server that belongs to a remote server, the local server must first
    obtain a lock on that file.  Once the lock is obtained, the file can
    be updated, and it is then force replicated out to each server in the list.
    The lock ensures that the remote copy is up to date and that an older
    file does not overwrite a newer one.  Only a document's home server
    can grant a lock for that file to be altered.</li>
  </ul>
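  <p>The file-locking rule in the list above can be sketched in a few lines of Python.
  This is only an illustration of the rule as described here, not Metacat's
  implementation: the Document class, its revision counter, the document id and the
  function names are all assumptions made for the example.</p>

```python
from dataclasses import dataclass


@dataclass
class Document:
    docid: str        # hypothetical identifier
    home_server: str  # server on which the document was originally created
    revision: int     # assumed counter; a larger number means a newer copy


def can_grant_lock(doc, granting_server):
    """Only the document's home server may grant a lock on it."""
    return granting_server == doc.home_server


def should_overwrite(stored, incoming):
    """Accept an incoming copy only if it is newer than the stored one."""
    return incoming.docid == stored.docid and incoming.revision > stored.revision


stored = Document("example.1", home_server="gamma", revision=3)
print(can_grant_lock(stored, "gamma"))   # True: gamma is the home server
print(can_grant_lock(stored, "lamda"))   # False: lamda only holds a replica
print(should_overwrite(stored, Document("example.1", "gamma", 2)))  # False: older copy
```

  <p>The two checks mirror the two guarantees in the text: the lock comes from the
  home server only, and an older file never replaces a newer one.</p>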

  <a name="DatabasedInfo"></a><div class="header2">Databased Information</div>
  <p>Each server contains a list of servers to which it can replicate.  One-way
  replication is enabled by the 'replicate' and 'datareplicate' flags in the
  list.  The server list may look like the following.</p>
  <table border="1">
    <tr>
      <td><b>serverid</b></td>
      <td><b>server</b></td>
      <td><b>last_checked</b></td>
      <td><b>replicate</b></td>
      <td><b>datareplicate</b></td>
      <td><b>hub</b></td>
    </tr>
    <tr>
      <td>1</td>
      <td>localhost</td>
      <td>null</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
    </tr>
    <tr>
      <td>2</td>
      <td>alpha.nceas.ucsb.edu:8080/berkley/servlet/replication</td>
      <td>2001-01-22 14:52:12.1</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
    </tr>
    <tr>
      <td>3</td>
      <td>dev.nceas.ucsb.edu/Metacat/servlet/replication</td>
      <td>2001-01-23 9:10:02.5</td>
      <td>1</td>
      <td>1</td>
      <td>0</td>
    </tr>
  </table>

  <br>
  The server list is kept in a database table called xml_replication.
  Localhost must always be the first entry in the table and have a serverid of 1.
  The database fields are:
  <ul class="list1">
  <li><b>serverid</b> - a unique ID that is generated by the database when a new row is added.</li>
  <li><b>server</b> - this field always points to the partner server's replication servlet,
  hence the "servlet/replication" at the end of both sample servers.  Note
  that any port number (if your servlet engine is not running on port 80) must
  also be included.</li>
  <li><b>last_checked</b> - a system-generated value that holds the last time a check was
  made to see whether replication needed to be performed.</li>
  <li><b>replicate</b> - flag that is set to 1 if you want this server to replicate XML
  metadata documents TO the remote host.  Note that if this flag is set to 0, the datareplicate
  and hub fields have no meaning.</li>
  <li><b>datareplicate</b> - flag that is set to 1 if you want this server to copy data
  files to the remote host.  Note that this field has no meaning if replicate is not set to 1.</li>
  <li><b>hub</b> - flag that is set to 1 if this server is a hub to the remote host.  When it is
  set, this server will replicate not only its own original documents, but also documents that
  were replicated to it.  Thus it acts as a replication hub to one or more other Metacat servers.</li>
  </ul>
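  <p>As a rough illustration of how the fields above interact, the following Python
  sketch models one row of xml_replication and applies the rule that datareplicate
  and hub only count when replicate is 1.  The class and function names are
  assumptions made for the example, and the row with serverid 4 is hypothetical;
  nothing here is taken from the Metacat source.</p>

```python
from dataclasses import dataclass


@dataclass
class ServerEntry:
    """One row of xml_replication (last_checked omitted for brevity)."""
    serverid: int
    server: str
    replicate: int
    datareplicate: int
    hub: int


def effective_flags(entry):
    """Return (metadata, data, hub) booleans for a row.

    datareplicate and hub only count when replicate is 1.
    """
    metadata = entry.replicate == 1
    return (metadata,
            metadata and entry.datareplicate == 1,
            metadata and entry.hub == 1)


def server_field(host, context, port=80):
    """Build a value for the server column; the port is included unless it is 80."""
    authority = host if port == 80 else f"{host}:{port}"
    return f"{authority}/{context}/servlet/replication"


alpha = ServerEntry(2, server_field("alpha.nceas.ucsb.edu", "berkley", port=8080), 0, 0, 0)
dev = ServerEntry(3, server_field("dev.nceas.ucsb.edu", "Metacat"), 1, 1, 0)
print(alpha.server)          # alpha.nceas.ucsb.edu:8080/berkley/servlet/replication
print(effective_flags(dev))  # (True, True, False)

# Hypothetical row: replicate is 0, so the ones in datareplicate and hub are ignored.
ignored = ServerEntry(4, server_field("hypothetical.example.org", "Metacat"), 0, 1, 1)
print(effective_flags(ignored))  # (False, False, False)
```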

  <a name="Example"></a><div class="header2">Example</div>
  Here we show an example setup of three replication servers and discuss each in turn.<br><br>

  First, note that for replication to occur, both partner servers must list
  each other in their respective tables.  Also,
  certificates must be set up correctly on both servers for replication to
  work.  See the <a href="#Certificates">certificates</a> section below.<br><br>

  <table border="1">
    <tr>
      <td>host</td>
      <td>replication table</td>
    </tr>
    <tr>
     <td>gamma.nceas.ucsb.edu</td>
     <td>
      <table border="2">
        <tr>
          <td><b>server</b></td>
          <td><b>last_checked</b></td>
          <td><b>replicate</b></td>
          <td><b>datareplicate</b></td>
          <td><b>hub</b></td>
        </tr>
        <tr>
          <td>localhost</td>
          <td>null</td>
          <td>0</td>
          <td>0</td>
          <td>0</td>
        </tr>
        <tr>
          <td>alpha.nceas.ucsb.edu:8080/berkley/servlet/replication&nbsp;&nbsp;&nbsp;</td>
          <td>2001-01-22 14:52:12.1</td>
          <td>0</td>
          <td>0</td>
          <td>0</td>
        </tr>
        <tr>
          <td>lamda.nceas.ucsb.edu/Metacat/servlet/replication</td>
          <td>2001-01-23 9:10:02.5</td>
          <td>1</td>
          <td>1</td>
          <td>0</td>
        </tr>
      </table>
     </td>
    </tr>
    <tr>
      <td>alpha.nceas.ucsb.edu</td>
      <td>
        <table border="2">
          <tr>
            <td><b>server</b></td>
            <td><b>last_checked</b></td>
            <td><b>replicate</b></td>
            <td><b>datareplicate</b></td>
            <td><b>hub</b></td>
          </tr>
          <tr>
            <td>localhost</td>
            <td>null</td>
            <td>0</td>
            <td>0</td>
            <td>0</td>
          </tr>
          <tr>
            <td>gamma.nceas.ucsb.edu:8080/berkley/servlet/replication</td>
            <td>2001-01-21 11:33:12.7</td>
            <td>0</td>
            <td>1</td>
            <td>0</td>
          </tr>
          <tr>
            <td>lamda.nceas.ucsb.edu/Metacat/servlet/replication</td>
            <td>2001-01-23 10:22:02.5</td>
            <td>1</td>
            <td>0</td>
            <td>0</td>
          </tr>
        </table>
      </td>
    </tr>
    <tr>
      <td>lamda.nceas.ucsb.edu</td>
      <td>
        <table border="2">
          <tr>
            <td><b>server</b></td>
            <td><b>last_checked</b></td>
            <td><b>replicate</b></td>
            <td><b>datareplicate</b></td>
            <td><b>hub</b></td>
          </tr>
          <tr>
            <td>localhost</td>
            <td>null</td>
            <td>0</td>
            <td>0</td>
            <td>0</td>
          </tr>
          <tr>
            <td>gamma.nceas.ucsb.edu:8080/berkley/servlet/replication</td>
            <td>2001-01-21 11:33:12.7</td>
            <td>0</td>
            <td>0</td>
            <td>0</td>
          </tr>
          <tr>
            <td>alpha.nceas.ucsb.edu:8080/Metacat/servlet/replication</td>
            <td>2001-01-22 12:15:32.5</td>
            <td>1</td>
            <td>1</td>
            <td>1</td>
          </tr>
        </table>
      </td>
    </tr>
  </table>

  <a name="gamma"></a><div class="header3">What happens with gamma?</div>
  <ul class="list1">
  <li>The localhost entry is required internally for replication to work on
      gamma.  As long as it is there, we can safely disregard it.</li>
  <li>The entry for the alpha machine has all zeros in the replicate,
      datareplicate and hub columns.  This means that gamma is configured to
      accept replication information from alpha.  (As we will see in a moment,
      alpha is not actually configured to send data to gamma.)</li>
  <li>The entry for the lamda machine has ones in the replicate
      and datareplicate columns and a zero in the hub column.  This tells us
      that gamma will replicate its original documents to lamda, assuming that
      lamda is configured to accept replication from gamma (we will see that it
      is).  However, because the hub value is zero, any documents that were replicated
      to gamma will not be further replicated to lamda.</li>
  </ul>

  <a name="alpha"></a><div class="header3">What happens with alpha?</div>
  <ul class="list1">
  <li>The localhost entry is required internally for replication to work on
      alpha.  As long as it is there, we can safely disregard it.</li>
  <li>The entry for gamma has a zero in the replicate column.
      This means that the other flags in that entry are meaningless and can be disregarded.
      Even though there is a one in the datareplicate column on alpha and gamma
      is configured to accept replication from alpha, no replication will happen
      from alpha to gamma.</li>
  <li>The entry for lamda has a one in the replicate column and zeros
      in the datareplicate and hub columns.  Assuming lamda is configured to
      accept replication from alpha, alpha will replicate metadata only to lamda
      (and indeed, we will see that lamda is set up to accept replication from
      alpha).</li>
  </ul>

  <a name="lamda"></a><div class="header3">What happens with lamda?</div>
  <ul class="list1">
  <li>The localhost entry is required internally for replication to work on
      lamda.  As long as it is there, we can safely disregard it.</li>
  <li>The entry for gamma has all zeros in replicate, datareplicate
      and hub, so lamda is set up to accept replication from gamma.  As we have
      already seen, gamma is correctly configured to replicate metadata and data
      to lamda.  We should see data and metadata replication from gamma to lamda.</li>
  <li>The entry for alpha has ones in the replicate, datareplicate and
      hub columns.  There's a lot going on here:
    <ul class="list2">
    <li>First, lamda will replicate original metadata and data to alpha if
        alpha is configured to accept replication from lamda.  Because alpha
        has an entry for lamda, lamda will be allowed to replicate to alpha.</li>
    <li>Second, because the alpha entry has a one in the hub column, lamda
        will replicate not only its original data but also
        data that was replicated to it.  Remember that gamma was configured
        to replicate to lamda.  So any data or metadata that gamma sends to
        lamda will get further replicated to alpha.</li>
    <li>Finally, the alpha entry in the table allows the alpha server to
        replicate to lamda.  Since the alpha server is set up to replicate
        metadata only, we would expect any original metadata on alpha to
        wind up on lamda.</li>
    </ul>
  </li>
  </ul>
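  <p>The walkthrough above can be checked mechanically.  The Python sketch below
  copies the replicate, datareplicate and hub flags from the three example tables
  and applies the two rules stated in this section: a server sends to a partner only
  when its replicate flag for that partner is 1, and only when the partner also lists
  the sender in its own table.  The function name and the tuple layout are
  assumptions made for the illustration; this is not how Metacat itself evaluates
  the table.</p>

```python
# Flags are (replicate, datareplicate, hub), copied from the example tables above.
TABLES = {
    "gamma": {"alpha": (0, 0, 0), "lamda": (1, 1, 0)},
    "alpha": {"gamma": (0, 1, 0), "lamda": (1, 0, 0)},
    "lamda": {"gamma": (0, 0, 0), "alpha": (1, 1, 1)},
}


def replication_flows(tables):
    """List (source, target, sends_data, forwards_replicas) for every active flow."""
    flows = []
    for source, entries in tables.items():
        for target, (replicate, datareplicate, hub) in entries.items():
            if replicate != 1:
                continue  # the other flags are meaningless without replicate
            if source not in tables.get(target, {}):
                continue  # the partner must list this server too
            flows.append((source, target, datareplicate == 1, hub == 1))
    return flows


for flow in replication_flows(TABLES):
    print(flow)
# ('gamma', 'lamda', True, False)
# ('alpha', 'lamda', False, False)
# ('lamda', 'alpha', True, True)
```

  <p>The three flows match the discussion: gamma sends metadata and data to lamda,
  alpha sends metadata only to lamda, and lamda sends metadata and data to alpha
  while also forwarding what it received from gamma.  Nothing flows from alpha to
  gamma.</p>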

  There is an HTML control panel for controlling replication.  After
  <a href="./Metacatinstall.html">installing</a> Metacat, you can access
  it by going through the Metacat servlet context you have set up and calling up
  replControl.html.  For instance, if you set up a Metacat servlet instance
  called 'knb' you would probably type

  <div class="code">http://server.domain.com:8080/Metacat/style/skins/dev/replControl.html</div>

  <p>The control panel is an easy interface for adding, removing, and altering servers and
  starting the delta-T handler.  It will also allow you to 'force replicate' your
  server list.  This is useful if you want to initialize the state of one Metacat
  server from the existing state of another (i.e. copy all of the data from an existing
  server).</p>

  <a name="Certificates"></a><div class="header1">Certificates</div>
  You will need to generate security certificates on both the replication client
  and server.  The certificates will be exchanged so that each machine recognizes
  that the other has access for replication.<br><br>
  The following are the steps to generate and exchange certificates on systems
  running Tomcat 5 and Java 1.5.  Note that if Tomcat is running in conjunction with
  Apache, the process is somewhat different than if it is running standalone.

  <a name="GenerateCertificates"></a><div class="header2">Generate Certificates on both the replication client and server.</div>

  <a name="GenerateCertTomcat"></a><div class="header3">Generate Certificate for Tomcat standalone (no Apache)</div>
  <ul class="list1">
  <li>Generate keys in the Java default key store - this will create a secure key and put it
    into the binary certificates file located at $JAVA_HOME/lib/security/cacerts
    <ul class="list2">
    <li>Run the command:
      <div class="code">keytool -genkey -alias &lt;aliasname&gt; -keyalg RSA -validity 800 -keystore cacerts</div>
     where &lt;aliasname&gt; is a unique name that you choose for this certificate.  Something like "&lt;hostname&gt;-tomcat"
     might be appropriate.</li>
    </ul>
  </li>
  <li>Sample values when creating the certificate
    <ul class="list2">
    <li>What is your first and last name? <b>myserver.nceas.ucsb.edu</b>
        (note: use the host name without port number)</li>
    <li>What is the name of your organizational unit? <b>NCEAS</b></li>
    <li>What is the name of your organization? <b>UCSB</b></li>
    <li>What is the name of your City or Locality? <b>Santa Barbara</b></li>
    <li>What is the name of your State or Province? <b>California</b>
        (note: this is spelled in full)</li>
    <li>What is the two-letter country code for this unit? <b>US</b></li>
    </ul>
  </li>
  <li>Generate certificate - this will pull the certificate you created from the cacerts file
      and put it into a local file
    <ul class="list2">
    <li>Run the command:
      <div class="code">keytool -export -alias &lt;aliasname&gt; -file &lt;outputfile&gt;.cert -keystore cacerts</div>
      where &lt;aliasname&gt; is the same name you used when you created the certificate.</li>
    <li>A file named &lt;outputfile&gt;.cert will be created in the directory where you run the keytool
      command.  You can name the output file anything you like, but keep in mind that it will be sent to the
      partner machine used for replication.  The filename should have enough meaning that someone who sees
      it on that machine can tell where it came from.  Again, something like "&lt;hostname&gt;-tomcat.cert"
      will suffice.</li>
    </ul>
  </li>
  <li>Enable SSL in Tomcat
    <ul class="list2">
    <li>Edit the Tomcat server file at $TOMCAT_HOME/conf/server.xml</li>
    <li>Uncomment the section that starts with "&lt;Connector port="8443" ...</li>
    <li>Add another attribute to that section that reads:
      <div class="code">keystoreFile="&lt;JAVA_HOME&gt;/lib/security/cacerts"</div>
      where &lt;JAVA_HOME&gt; is replaced with the actual Java path.
    </li>
    </ul>
  </li>
  </ul>
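  <p>If you set up several servers, it can help to generate the two keytool command
  lines from the host name so that the alias and the exported file name stay
  consistent.  The Python sketch below only assembles and prints the commands shown
  in the steps above; it does not run keytool.  The function name, the host name
  "myserver" and the "&lt;hostname&gt;-tomcat" naming pattern are assumptions made
  for the example, and shlex.join needs Python 3.8 or later.</p>

```python
import shlex


def keytool_commands(hostname, keystore="cacerts", validity_days=800):
    """Return the -genkey and -export argument lists for one host."""
    alias = f"{hostname}-tomcat"  # assumed naming pattern
    genkey = ["keytool", "-genkey", "-alias", alias, "-keyalg", "RSA",
              "-validity", str(validity_days), "-keystore", keystore]
    export = ["keytool", "-export", "-alias", alias,
              "-file", f"{alias}.cert", "-keystore", keystore]
    return genkey, export


for command in keytool_commands("myserver"):
    print(shlex.join(command))
# keytool -genkey -alias myserver-tomcat -keyalg RSA -validity 800 -keystore cacerts
# keytool -export -alias myserver-tomcat -file myserver-tomcat.cert -keystore cacerts
```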

  <a name="GenerateCertApache"></a><div class="header3">Generate Certificate for Apache/Tomcat</div>
  <ul class="list1">
  <li>Generate keys using openssl
    <ul class="list2">
    <li>Run the command:
      <div class="code">openssl req -new -out REQ.pem -keyout &lt;hostname&gt;-apache.key</div>
    </li>
    </ul>
  </li>
  <li>Sample values when creating the certificate
    <ul class="list2">
    <li>Country Name (2 letter code) [AU]: <b>US</b></li>
    <li>State or Province Name (full name) [Some-State]: <b>California</b>
        (note: this is spelled in full)</li>
    <li>Locality Name (eg, city) []: <b>Santa Barbara</b></li>
    <li>Organization Name (eg, company) [Internet Widgits Pty Ltd]: <b>UCSB</b></li>
    <li>Organizational Unit Name (eg, section) []: <b>NCEAS</b></li>
    <li>Common Name (eg, YOUR name) []: <b>myserver.mydomain.edu</b>
        (note: use the host name without port number)</li>
    <li>Email Address []:  <b>administrator@mydomain.edu</b></li>
    <li>A challenge password []: (note: leave blank)</li>
    <li>An optional company name []: (note: leave blank)</li>
    </ul>
  </li>
  <li>Generate certificate - this will create a local file with your certificate
    <ul class="list2">
    <li>Run the command:
      <div class="code">openssl req -x509 -days 800 -in REQ.pem -key &lt;hostname&gt;-apache.key -out &lt;hostname&gt;-apache.crt</div>
    </li>
    <li>A file named &lt;hostname&gt;-apache.crt will be created in the directory where you run the openssl
      command.  Keep in mind that it will be sent to the
      partner machine used for replication, so the filename should have enough meaning that someone who sees
      it on that machine can tell where it came from.</li>
    </ul>
  </li>
  <li>Enter the certificate into the Apache security configuration - you need to register the certificate
      in the local Apache instance.  Note that the security files may be in a different place depending
      on how you installed Apache.
    <ul class="list2">
    <li>Copy the certificate and key file to the Apache SSL directories and enable SSL.</li>
    <li>For Ubuntu/Debian based systems:
      <ul class="list3">
      <li>sudo cp &lt;hostname&gt;-apache.crt /etc/ssl/certs</li>
      <li>sudo cp &lt;hostname&gt;-apache.key /etc/ssl/private</li>
      <li>As root, edit /etc/apache2/sites-available/default.  In the VirtualHost section
          after the DocumentRoot line, add:<br>
          SSLEngine on<br>
          SSLOptions +FakeBasicAuth +ExportCertData +CompatEnvVars +StrictRequire<br>
          SSLCertificateFile /etc/ssl/certs/&lt;hostname&gt;-apache.crt<br>
          SSLCertificateKeyFile /etc/ssl/private/&lt;hostname&gt;-apache.key<br>
      </li>
      </ul>
    </li>
    <li>For other systems:
      <ul class="list3">
      <li>sudo cp &lt;hostname&gt;-apache.crt $APACHE_HOME/conf/ssl.crt</li>
      <li>sudo cp &lt;hostname&gt;-apache.key $APACHE_HOME/conf/ssl.key</li>
      <li> ADD STEPS TO ENABLE SSL ON NON_DEBIAN SYSTEMS HERE</li>
      </ul>
    </li>
    </ul>
  </li>
  <li>scp &lt;hostname&gt;-apache.crt to the replication partner machine.</li>
  </ul>

  <a name="RegisterPartner"></a><div class="header2">Register the partner machine's certificate.</div>
  At this point, you have created a certificate for each replication server and
  copied each one to the other machine with scp.  Now you need to import the remote server's
  certificate on the local machine.  Perform the following steps on each
  replication server.
  <ul class="list1">
  <li>Import the remote certificate by running:
    <div class="code">keytool -import -alias &lt;remotehostalias&gt; -file &lt;remotehostfilename&gt;.cert -keystore cacerts</div>
    where &lt;remotehostfilename&gt; is the certificate file you created on the remote machine and
    copied to this machine.  The &lt;remotehostalias&gt; is the name the certificate will use in
    the keystore.  It should be something that identifies the remote host.
  </li>
  <li>Restart Apache and Tomcat on both replication machines.</li>
  </ul>
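  <p>Because every server has to import the certificate of each of its partners, the
  number of import commands grows quickly as servers are added.  The Python sketch
  below lists, for each host, the keytool -import commands it would run.  It only
  builds the command lines from the step above and does not execute them; the
  function name, the short host names and the "&lt;hostname&gt;-tomcat" alias and
  file naming are assumptions made for the example.</p>

```python
def import_plan(hosts, keystore="cacerts"):
    """Map each host to the keytool -import commands for its partners."""
    plan = {}
    for local in hosts:
        plan[local] = [
            ["keytool", "-import", "-alias", f"{remote}-tomcat",
             "-file", f"{remote}-tomcat.cert", "-keystore", keystore]
            for remote in hosts if remote != local
        ]
    return plan


plan = import_plan(["gamma", "alpha"])
print(" ".join(plan["gamma"][0]))
# keytool -import -alias alpha-tomcat -file alpha-tomcat.cert -keystore cacerts
```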

  <a href="./packages.html">Back</a> | <a href="./metacattour.html">Home</a> |
  <a href="./datafiles.html">Next</a>

</BODY>
</HTML>