<!--
  * replication.html
  *
  *      Authors: Chad Berkley
  *    Copyright: 2000 Regents of the University of California and the
  *               National Center for Ecological Analysis and Synthesis
  *  For Details: http://www.nceas.ucsb.edu/
  *      Created: 2001 January 23
  *      Version: 
  *    File Info: '$ '
  * 
  * 
-->
<HTML>
<HEAD>
<TITLE>Metacat Replication</TITLE>
<link rel="stylesheet" type="text/css" href="./default.css">
</HEAD> 
<BODY>
  <table width="100%">
    <tr>
      <td class="tablehead" colspan="2"><p class="label">Replication</p></td>
      <td class="tablehead" colspan="2" align="right">
        <a href="./packages.html">Back</a> | <a href="./metacattour.html">Home</a> | 
        <a href="./datafiles.html">Next</a>
      </td>
    </tr>
  </table>
  
  <div class="header1">Table of Contents</div>
  <div class="toc1"><a href="#Intro">Metacat Replication</a></div>
    <div class="toc2"><a href="#Overview">Overview</a></div>
    <div class="toc2"><a href="#DatabasedInfo">Database Information</a></div>
    <div class="toc2"><a href="#Example">Example</a></div>
      <div class="toc3"><a href="#gamma">What happens with gamma?</a></div>
      <div class="toc3"><a href="#alpha">What happens with alpha?</a></div>
      <div class="toc3"><a href="#lamda">What happens with lamda?</a></div>
  <div class="toc1"><a href="#ControlPanel">The Replication Control Panel</a></div>
  <div class="toc1"><a href="#Certificates">Certificates</a></div>
    <div class="toc2"><a href="#GenerateCertificates">Generate Certificates on both the replication client and server.</a></div> 
      <div class="toc3"><a href="#GenerateCertTomcat">Generate Certificate for Tomcat standalone (no Apache)</a></div>
      <div class="toc3"><a href="#GenerateCertApache">Generate Certificate for Apache/Tomcat</a></div>
    <div class="toc2"><a href="#RegisterPartner">Register the partner machine's certificate</a></div> 
  
  <a name="Intro"></a><div class="header1">Metacat Replication</div>
  <a name="Overview"></a><div class="header2">Overview</div>
  <p>Metacat has built-in replication that allows different Metacat servers to 
  share data with each other. Metacat replicates not only XML documents but 
  also data files.</p>
  
  <p>Metacat's hub feature allows a server to replicate not only its own original
  documents, but also those that were replicated to it from other servers.  This
  functionality allows for more complex, chained replication structures.</p>
  
  <p>The replication scheme that Metacat uses is both push and pull.  Several 
  triggers can start replication: </p>
  <ul class="list1">
    <li><b>Delta-T monitoring</b> - At a set time interval, a server checks each of the
    other servers in its list for updated documents.</li>
    <li><b>INSERT trigger</b> - Whenever a document is inserted, the server notifies
    the remote hosts in its list that it has a new file available.</li>
    <li><b>UPDATE trigger</b> - Whenever a document is updated, the server notifies
    each server in its list of the update.</li>
    <li><b>File locking</b> - When a local user tries to alter a document on a local 
    server that belongs to a remote server, the local server must first
    obtain a lock on that file.  Once the lock is obtained, the file can 
    be updated, and it is then force replicated out to each server in the list.
    The lock ensures that the remote copy is up to date and that an older
    file does not overwrite a newer one.  Only a document's home server
    can grant a lock that allows the file to be altered.</li>
  </ul>
  
  <a name="DatabasedInfo"></a><div class="header2">Database Information</div>
  <p>Each server contains a list of servers to which it can replicate.  One-way
  replication is enabled by the 'replicate' and 'datareplicate' flags in the 
  list.  The server list may look like the following.</p>
  <table border="1">
    <tr>
      <td><b>serverid</b></td>
      <td><b>server</b></td>
      <td><b>last_checked</b></td>
      <td><b>replicate</b></td>
      <td><b>datareplicate</b></td>
      <td><b>hub</b></td>
    </tr>
    <tr>
      <td>1</td>
      <td>localhost</td>
      <td>null</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
    </tr>
    <tr>
      <td>2</td>
      <td>alpha.nceas.ucsb.edu:8080/berkley/servlet/replication</td>
      <td>2001-01-22 14:52:12.1</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
    </tr>
    <tr>
      <td>3</td>
      <td>dev.nceas.ucsb.edu/Metacat/servlet/replication</td>
      <td>2001-01-23 9:10:02.5</td>
      <td>1</td>
      <td>1</td>
      <td>0</td>
    </tr>
  </table>
  
  <br>
  The server list is kept in a database table called xml_replication.
  Localhost must always be the first entry in the table and have a serverid of 1.
  The database fields are:
  <ul class="list1">
  <li><b>serverid</b> - a unique ID that the database generates when a new entry is added.</li>
  <li><b>server</b> - this field always points to the partner server's replication servlet,
  hence the "servlet/replication" at the end of both sample servers.  Note
  that the port number must also be included if your servlet engine is not running
  on port 80.</li>
  <li><b>last_checked</b> - a system-generated value that holds the last time a check was 
  made to see whether replication needed to be performed.</li>
  <li><b>replicate</b> - flag that is set to 1 if you want this server to replicate XML 
  metadata documents TO the remote host.  Note that if this flag is set to 0, the datareplicate
  and hub fields have no meaning.</li>
  <li><b>datareplicate</b> - flag that is set to 1 if you want this server to copy data 
  files to the remote host.  Note that this field has no meaning if replicate is not set to 1.</li>
  <li><b>hub</b> - flag that is set to 1 if this server is a hub to the remote host.  When it is
  set, this server will not only replicate its own original documents, it will also replicate
  documents that were replicated to it.  Thus it acts as a replication hub to one or more
  other Metacat servers.</li>
  </ul>
  
  <a name="Example"></a><div class="header2">Example</div>
  Here we show an example setup of three replication servers and discuss each in turn.<br><br>
  
  First, note that both partner servers must have each other in their respective
  tables, or replication will not take place.  Certificates must also be set up
  correctly on both servers for replication to work.  See the
  <a href="#Certificates">certificates</a> section below.<br><br>

  <table border="1">
    <tr>
      <td>host</td>
      <td>replication table</td>
    </tr>
    <tr>
     <td>gamma.nceas.ucsb.edu</td>
     <td>
      <table border="2">
        <tr>
          <td><b>server</b></td>
          <td><b>last_checked</b></td>
          <td><b>replicate</b></td>
          <td><b>datareplicate</b></td>
          <td><b>hub</b></td>
        </tr>
        <tr>
          <td>localhost</td>
          <td>null</td>
          <td>0</td>
          <td>0</td>
          <td>0</td>
        </tr>
        <tr>
          <td>alpha.nceas.ucsb.edu:8080/berkley/servlet/replication&nbsp;&nbsp;&nbsp;</td>
          <td>2001-01-22 14:52:12.1</td>
          <td>0</td>
          <td>0</td>
          <td>0</td>
        </tr>
        <tr>
          <td>lamda.nceas.ucsb.edu/Metacat/servlet/replication</td>
          <td>2001-01-23 9:10:02.5</td>
          <td>1</td>
          <td>1</td>
          <td>0</td>
        </tr>
      </table>
     </td>
    </tr>
    <tr>
      <td>alpha.nceas.ucsb.edu</td>
      <td>
        <table border="2">
          <tr>
            <td><b>server</b></td>
            <td><b>last_checked</b></td>
            <td><b>replicate</b></td>
            <td><b>datareplicate</b></td>
            <td><b>hub</b></td>
          </tr>
          <tr>
            <td>localhost</td>
            <td>null</td>
            <td>0</td>
            <td>0</td>
            <td>0</td>
          </tr>
          <tr>
            <td>gamma.nceas.ucsb.edu:8080/berkley/servlet/replication</td>
            <td>2001-01-21 11:33:12.7</td>
            <td>0</td>
            <td>1</td>
            <td>0</td>
          </tr>
          <tr>
            <td>lamda.nceas.ucsb.edu/Metacat/servlet/replication</td>
            <td>2001-01-23 10:22:02.5</td>
            <td>1</td>
            <td>0</td>
            <td>0</td>
          </tr>
        </table>
      </td>
    </tr>
    <tr>
      <td>lamda.nceas.ucsb.edu</td>
      <td>
        <table border="2">
          <tr>
            <td><b>server</b></td>
            <td><b>last_checked</b></td>
            <td><b>replicate</b></td>
            <td><b>datareplicate</b></td>
            <td><b>hub</b></td>
          </tr>
          <tr>
            <td>localhost</td>
            <td>null</td>
            <td>0</td>
            <td>0</td>
            <td>0</td>
          </tr>
          <tr>
            <td>gamma.nceas.ucsb.edu:8080/berkley/servlet/replication</td>
            <td>2001-01-21 11:33:12.7</td>
            <td>0</td>
            <td>0</td>
            <td>0</td>
          </tr>
          <tr>
            <td>alpha.nceas.ucsb.edu:8080/Metacat/servlet/replication</td>
            <td>2001-01-22 12:15:32.5</td>
            <td>1</td>
            <td>1</td>
            <td>1</td>
          </tr>
        </table>
      </td>
    </tr>
  </table>
  
  <a name="gamma"></a><div class="header3">What happens with gamma?</div>
  <ul class="list1">
  <li>The localhost entry is required internally for replication to work on 
      gamma.  As long as we see it there, we can safely disregard it.</li>
  <li>The entry for the alpha machine has all zeros in the replicate, 
      datareplicate and hub columns.  This means that gamma sends nothing to
      alpha, but is configured to accept replication from alpha.  (As we will
      see in a moment, alpha is not actually configured correctly to send data
      to gamma.)</li>
  <li>The entry for the lamda machine has ones in the replicate
      and datareplicate columns and a zero in the hub column.  This tells us
      that gamma will replicate its original documents to lamda, assuming that
      lamda is configured to accept replication from gamma (we will see that it
      is).  However, because the hub value is zero, any documents that are replicated 
      to gamma will not be further replicated to lamda.</li>
  </ul>
   
  <a name="alpha"></a><div class="header3">What happens with alpha?</div>
  <ul class="list1">
  <li>The localhost entry is required internally for replication to work on 
      alpha.  As long as we see it there, we can safely disregard it.</li>
  <li>The entry for gamma has a zero in the replicate column.  
      This means that the other flags in that row are meaningless and can be disregarded.
      Even though there is a one in the datareplicate column on alpha, and gamma 
      is configured to accept replication from alpha, no replication will happen 
      from alpha to gamma.</li>
  <li>The entry for lamda has a one in the replicate column and zeros
      in the datareplicate and hub columns.  Assuming lamda is configured to 
      accept replication from alpha, alpha will replicate metadata only to lamda 
      (and indeed, we will see that lamda is set up to accept replication from 
      alpha).</li>
  </ul>
      
  <a name="lamda"></a><div class="header3">What happens with lamda?</div>
  <ul class="list1">
  <li>The localhost entry is required internally for replication to work on 
      lamda.  As long as we see it there, we can safely disregard it.</li>
  <li>The entry for gamma has all zeros in replicate, datareplicate
      and hub, so lamda is set up to accept replication from gamma.  As we have
      already seen, gamma is correctly configured to replicate metadata and data
      to lamda.  We should see data and metadata replication from gamma to lamda.</li>
  <li>The entry for alpha has ones in the replicate, datareplicate and 
      hub columns.  There's a lot going on here:
    <ul class="list2">
    <li>First, lamda will replicate original metadata and data to alpha if 
        alpha is configured to accept replication from lamda.  Because alpha 
        has an entry for lamda, lamda will be allowed to replicate to alpha.</li>
    <li>Second, because the alpha entry has a one in the hub column, lamda 
        will not only replicate its original data, it will also replicate 
        data that was replicated to it.  Remember that gamma was configured 
        to replicate to lamda.  So any data or metadata that gamma sends to 
        lamda will get further replicated to alpha.</li>
    <li>Finally, the alpha entry in the table allows the alpha server to 
        replicate to lamda.  Since the alpha server is set up to replicate
        metadata only, we would expect any original metadata on alpha to 
        wind up on lamda.</li>
    </ul>
  </li>
  </ul>
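  <p>The flag rules used in this walkthrough can be sketched as a small shell
  function.  This is only an illustration of how to read one row of the
  replication table, not part of Metacat; the function name describe_row is
  hypothetical, and its three arguments are the replicate, datareplicate and
  hub values of a row.</p>
  <div class="code"><pre>
# Hypothetical helper: describe what one replication table row sends
# to the partner server.  Arguments: replicate datareplicate hub (0 or 1).
describe_row() {
  # With replicate set to 0, the other two flags have no meaning.
  if [ "$1" -ne 1 ]; then
    echo "nothing is sent"
    return
  fi
  if [ "$2" -eq 1 ]; then
    what="metadata and data"
  else
    what="metadata only"
  fi
  if [ "$3" -eq 1 ]; then
    echo "$what, originals plus documents replicated to this server"
  else
    echo "$what, original documents only"
  fi
}

describe_row 0 1 0   # alpha's row for gamma
describe_row 1 0 0   # alpha's row for lamda
describe_row 1 1 1   # lamda's row for alpha
</pre></div>
  <p>Keep in mind that a row only describes what the local server is willing to
  send; the partner must also list the local server in its own table before
  anything is replicated.</p>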

<a name="ControlPanel"></a><div class="header1">The Replication Control Panel</div>      
  <p>There is an HTML control panel for controlling replication.  After
  <a href="./Metacatinstall.html">installing</a> Metacat, you can access
  it by going through the Metacat servlet context you have set up and opening
  replControl.html.  For instance, if you set up a Metacat servlet context 
  called 'Metacat', you would probably type:</p>
  
  <div class="code">http://server.domain.com:8080/Metacat/style/skins/dev/replControl.html</div>  
  
  <p>The control panel is an easy interface for adding, removing and altering servers and 
  for starting the delta-T handler.  It also allows you to 'force replicate' your 
  server list.  This is useful if you want to initialize the state of one Metacat 
  server from the existing state of another (i.e. copy all of the data from an existing
  server).</p>
  
  <a name="Certificates"></a><div class="header1">Certificates</div>
  You will need to generate security certificates on both the replication client 
  and server.  The certificates will be exchanged so that each machine knows
  that the other has access for replication.<br><br>
  The following steps generate and exchange certificates on systems
  running Tomcat 5 and Java 1.5.  Note that if Tomcat is running in conjunction with
  Apache, the process is somewhat different than if it is running standalone.

  <a name="GenerateCertificates"></a><div class="header2">Generate Certificates on both the replication client and server.</div>  

  <a name="GenerateCertTomcat"></a><div class="header3">Generate Certificate for Tomcat standalone (no Apache)</div>
  <ul class="list1">
  <li>Generate keys in the Java default keystore - this will create a secure key and put it
    into the binary certificates file located at $JAVA_HOME/lib/security/cacerts
    <ul class="list2">
    <li>Run the command: 
      <div class="code">keytool -genkey -alias &lt;aliasname&gt; -keyalg RSA -validity 800 -keystore $JAVA_HOME/lib/security/cacerts</div>
      where &lt;aliasname&gt; is a unique name that you choose for this cert.  Something like "&lt;hostname&gt;-tomcat"
      might be appropriate, where &lt;hostname&gt; is the name of this host.</li>
    </ul>
  </li>
  <li>
    Password - keytool will ask for a password.  If this is a pre-existing keystore, you will need
    to know its password to modify it.  If you are creating a new keystore, the password you enter
    will become the keystore password.
  </li>
  <li>Sample values when creating certificate
    <ul class="list2">
    <li>What is your first and last name? <b>myserver.nceas.ucsb.edu</b>
        (note: use the host name without port number)</li>
    <li>What is the name of your organizational unit? <b>NCEAS</b></li>
    <li>What is the name of your organization? <b>UCSB</b></li>
    <li>What is the name of your City or Locality? <b>Santa Barbara</b></li>
    <li>What is the name of your State or Province? <b>California</b> 
        (note: this is spelled in full)</li>
    <li>What is the two-letter country code for this unit? <b>US</b></li>
    </ul>
  </li>
  <li>Generate certificate - this will pull the certificate you created from the cacerts file
      and put it into a local file
    <ul class="list2">
    <li>Run the command:
      <div class="code">keytool -export -alias &lt;aliasname&gt; -file &lt;outputfile&gt;.cert -keystore $JAVA_HOME/lib/security/cacerts</div>
      where &lt;aliasname&gt; is the same name you used when you created the certificate.</li>
    <li>A file named &lt;outputfile&gt;.cert will be created in the directory where you run the keytool 
      command.  You can name the output file anything you like, but keep in mind that it will get sent to the 
      partner machine used for replication.  The filename should have enough meaning that someone who sees 
      it on that machine can have some idea where it came from.  Again, something like "&lt;hostname&gt;-tomcat.cert"
      will suffice.</li>   
    </ul>
  </li>
  <li>Enable SSL in Tomcat 
    <ul class="list2">
    <li>Edit the Tomcat server file at $TOMCAT_HOME/conf/server.xml</li>
    <li>
      Uncomment the section that starts with "&lt;Connector port="8443" ..." (note: comments start with
      &lt;!-- and end with --&gt;).
    </li>
    <li>Add two attributes to that section that read:
      <div class="code">keystoreFile="&lt;JAVA_HOME&gt;/lib/security/cacerts"</div>
      <div class="code">keystorePass="&lt;keystore_password&gt;"</div>
      where &lt;JAVA_HOME&gt; should be the actual Java path and &lt;keystore_password&gt; should be the 
      password you used when you created the keystore.
    </li>
    </ul>
  </li>
  </ul>  
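  <p>As a rough check of the steps above, the following commands list the new
  keystore entry, print the exported certificate, and open a test connection to
  the SSL connector.  This is a sketch under assumptions: it reuses the same
  &lt;aliasname&gt;, &lt;outputfile&gt; and keystore path as above, &lt;hostname&gt; stands for
  this host, and the last command assumes Tomcat has been restarted, is listening
  on port 8443, and that the openssl command line tool is installed.</p>
  <div class="code">keytool -list -v -alias &lt;aliasname&gt; -keystore $JAVA_HOME/lib/security/cacerts</div>
  <div class="code">keytool -printcert -file &lt;outputfile&gt;.cert</div>
  <div class="code">openssl s_client -connect &lt;hostname&gt;:8443</div>
  <p>The owner name shown by the first two commands should match the host name
  you entered when creating the certificate.</p>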
    
  <a name="GenerateCertApache"></a><div class="header3">Generate Certificate for Apache/Tomcat</div>
  <ul class="list1">
  <li>Generate keys using openssl
    <ul class="list2">
    <li>Run the command: 
      <div class="code">openssl req -new -out REQ.pem -keyout &lt;hostname&gt;-apache.key</div>
    </li>
    </ul>
  </li>
  <li>Sample values when creating certificate
    <ul class="list2">
    <li>Country Name (2 letter code) [AU]: <b>US</b></li>
    <li>State or Province Name (full name) [Some-State]: <b>California</b> 
        (note: this is spelled in full)</li>
    <li>Locality Name (eg, city) []: <b>Santa Barbara</b></li>
    <li>Organization Name (eg, company) [Internet Widgits Pty Ltd]: <b>UCSB</b></li>
    <li>Organizational Unit Name (eg, section) []: <b>NCEAS</b></li>
    <li>Common Name (eg, YOUR name) []: <b>myserver.mydomain.edu</b>
        (note: use the host name without port number)</li>
    <li>Email Address []:  <b>administrator@mydomain.edu</b></li>
    <li>A challenge password []: (note: leave blank)</li>
    <li>An optional company name []: (note: leave blank)</li>
    </ul>
  </li>    
  <li>Generate certificate - this will create a local file with your certificate
    <ul class="list2">
    <li>Run the command:
      <div class="code">openssl req -x509 -days 800 -in REQ.pem -key &lt;hostname&gt;-apache.key -out &lt;hostname&gt;-apache.crt</div>
      where &lt;hostname&gt;-apache.key is the key file you created in the first step.</li>
    <li>A file named &lt;hostname&gt;-apache.crt will be created in the directory where you run the openssl 
      command.  Keep in mind that this file will get sent to the partner machine used for replication, so 
      its name should have enough meaning that someone who sees it on that machine can have some idea 
      where it came from.</li>   
    </ul>
  </li>   
  <li>Enter the certificate into the Apache security configuration - you need to register the certificate
      in the local Apache instance.  Note that the security files may be in a different place depending
      on how you installed Apache.
    <ul class="list2">
    <li>Copy the certificate and key file to the Apache SSL directories and enable SSL.</li>
    <li>For Ubuntu/Debian based systems:
      <ul class="list3">
      <li>sudo cp &lt;hostname&gt;-apache.crt /etc/ssl/certs</li>
      <li>sudo cp &lt;hostname&gt;-apache.key /etc/ssl/private</li>
      <li>As root, edit /etc/apache2/sites-available/default.  In the VirtualHost section,
          after the DocumentRoot line, add:<br>
          SSLEngine on<br>
          SSLOptions +FakeBasicAuth +ExportCertData +CompatEnvVars +StrictRequire<br>
          SSLCertificateFile /etc/ssl/certs/&lt;hostname&gt;-apache.crt<br>
          SSLCertificateKeyFile /etc/ssl/private/&lt;hostname&gt;-apache.key<br>
      </li>
      </ul>
    </li>  
    <li>For other systems:
      <ul class="list3">
      <li>sudo cp &lt;hostname&gt;-apache.crt $APACHE_HOME/conf/ssl.crt</li>
      <li>sudo cp &lt;hostname&gt;-apache.key $APACHE_HOME/conf/ssl.key</li> 
      <li>(Steps to enable SSL on non-Debian systems still need to be added here.)</li>
      </ul>
    </li>  
    </ul>
  </li>
  <li>Use scp to copy &lt;hostname&gt;-apache.crt to the replication partner machine.</li>
  </ul>  
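  <p>Before copying the certificate to the partner machine, it can be worth
  inspecting it.  The commands below are a sketch, assuming the file names used
  above and an RSA key (the usual openssl default).  The first prints the subject
  and validity dates; the last two print a digest of the modulus of the
  certificate and of the key, which should be identical if the two files belong
  together.  The key command will ask for the pass phrase if the key is
  encrypted.</p>
  <div class="code">openssl x509 -in &lt;hostname&gt;-apache.crt -noout -subject -dates</div>
  <div class="code">openssl x509 -in &lt;hostname&gt;-apache.crt -noout -modulus | openssl md5</div>
  <div class="code">openssl rsa -in &lt;hostname&gt;-apache.key -noout -modulus | openssl md5</div>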
  
  <a name="RegisterPartner"></a><div class="header2">Register the partner machine's certificate.</div>   
  At this point, you have created a certificate for each replication server and 
  copied each one to the other server with scp.  Now you need to import the remote server's
  certificate on the local machine.  Perform the following steps on each 
  replication server.
  <ul class="list1">
  <li>Import the remote certificate by running:
    <div class="code">keytool -import -alias &lt;remotehostalias&gt; -file &lt;remotehostfilename&gt;.cert -keystore $JAVA_HOME/lib/security/cacerts</div>
    where &lt;remotehostfilename&gt; is the certificate file you created on the remote machine and
    copied to this machine, and &lt;remotehostalias&gt; is the name the certificate will use in
    the keystore.  The alias should be something that identifies the remote host.  
  </li>
  <li>Restart Apache and Tomcat on both replication machines.</li>
  </ul>
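  <p>To confirm that the import worked, one option is to compare fingerprints.
  This is a sketch that reuses the alias, file name and keystore path from the
  import command above: the first command prints the certificate file received
  from the partner, the second lists the entry now stored in the local
  keystore.  The fingerprints reported by the two commands should match.</p>
  <div class="code">keytool -printcert -file &lt;remotehostfilename&gt;.cert</div>
  <div class="code">keytool -list -v -alias &lt;remotehostalias&gt; -keystore $JAVA_HOME/lib/security/cacerts</div>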

  <a href="./packages.html">Back</a> | <a href="./metacattour.html">Home</a> | 
  <a href="./datafiles.html">Next</a>

</BODY>
</HTML>