<!--
  * replication.html
  *
  *      Authors: Chad Berkley
  *    Copyright: 2000 Regents of the University of California and the
  *               National Center for Ecological Analysis and Synthesis
  *  For Details: http://www.nceas.ucsb.edu/
  *      Created: 2001 January 23
  *      Version: 
  *    File Info: '$ '
  * 
  * 
-->
<HTML>
<HEAD>
<TITLE>Metacat</TITLE>
<link rel="stylesheet" type="text/css" href="./default.css">
</HEAD> 
<BODY>
  <table width="100%">
    <tr>
      <td class="tablehead" colspan="2"><p class="label">Replication</p></td>
      <td class="tablehead" colspan="2" align="right">
        <a href="./packages.html">Back</a> | <a href="./metacattour.html">Home</a> | 
        <a href="./datafiles.html">Next</a>
      </td>
    </tr>
  </table>
  
  <div class="header1">Table of Contents</div>
  <div class="toc1"><a href="#Intro">Metacat Replication</a></div>
    <div class="toc2"><a href="#DatabasedInfo">Databased Information</a></div>
    <div class="toc2"><a href="#Example">Example</a></div>
      <div class="toc3"><a href="#gamma">What happens with gamma?</a></div>
      <div class="toc3"><a href="#alpha">What happens with alpha?</a></div>
      <div class="toc3"><a href="#lamda">What happens with lamda?</a></div>
  <div class="toc1"><a href="#Certificates">Certificates</a></div>
    <div class="toc2"><a href="#GenerateCertificates">Generate Certificates on both the replication client and server.</a></div> 
      <div class="toc3"><a href="#GenerateCertTomcat">Generate Certificate for Tomcat standalone (no Apache)</a></div>
      <div class="toc3"><a href="#GenerateCertApache">Generate Certificate for Apache/Tomcat</a></div>
    <div class="toc2"><a href="#RegisterPartner">Register the partner machine's certificate</a></div> 
  
  <a name="Intro"></a><div class="header1">Metacat Replication</div>
  <p>Metacat has built-in replication that allows different Metacat servers to 
  share data among themselves. Metacat replicates not only XML documents but 
  also data files.</p>
  
  <p>Metacat's hub feature allows a server to replicate not only its own original
  documents, but also those that were replicated to it from other servers.  This
  functionality allows for more complex, chained replication structures.</p>
  
  <p>The replication scheme that Metacat uses is both push and pull.  Several 
  triggers can start replication:</p>
  <ul class="list1">
    <li><b>Delta-T monitoring</b> - At a set time interval, a server checks each of the
    other servers in its list for updated documents.</li>
    <li><b>INSERT trigger</b> - Whenever a document is inserted, the server notifies
    the remote hosts in its list that it has a new file available.</li>
    <li><b>UPDATE trigger</b> - Whenever a document is updated, the server notifies
    each server in its list of the update.</li>
    <li><b>File locking</b> - When a local user tries to alter a document on a local 
    server that belongs to a remote server, the local server must first
    obtain a lock on that file.  Once the lock is obtained, the file can 
    be updated, and it is then force replicated out to each server in the list.
    The lock ensures that the remote copy is up to date and that an older
    file does not overwrite a newer one.  Only a document's home server
    can give a lock for that file to be altered.</li>
  </ul>
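  <p>The file locking rule can be pictured with a small Python sketch.  This is
  only an illustrative model, not Metacat's code: the <code>Document</code> class,
  the <code>revision</code> counter, the document id and both function names are
  assumptions made for this example.</p>

```python
from dataclasses import dataclass


@dataclass
class Document:
    docid: str        # hypothetical document identifier
    home_server: str  # server on which the document was originally created
    revision: int     # hypothetical counter; a higher value means a newer copy


def can_grant_lock(doc: Document, granting_server: str) -> bool:
    """Only the document's home server may give out a lock."""
    return granting_server == doc.home_server


def accept_replica(local: Document, incoming: Document) -> bool:
    """Refuse a replicated copy that is not newer than the local one,
    so an older file never overwrites a newer one."""
    return incoming.revision > local.revision


doc = Document("example.1", home_server="gamma", revision=3)
print(can_grant_lock(doc, "gamma"))   # the home server can grant the lock
print(can_grant_lock(doc, "lamda"))   # any other server cannot
```

  <p>In this model a server that holds a copy of the document would ask the home
  server for the lock, update its copy, and then push the new revision to every
  server in its list.</p>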
  
  <a name="DatabasedInfo"></a><div class="header2">Databased Information</div>
  <p>Each server contains a list of servers to which it can replicate.  One-way
  replication is enabled by the 'replicate' and 'datareplicate' flags in the 
  list.  The server list may look like the following.</p>
  <table border="1">
    <tr>
      <td><b>serverid</b></td>
      <td><b>server</b></td>
      <td><b>last_checked</b></td>
      <td><b>replicate</b></td>
      <td><b>datareplicate</b></td>
      <td><b>hub</b></td>
    </tr>
    <tr>
      <td>1</td>
      <td>localhost</td>
      <td>null</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
    </tr>
    <tr>
      <td>2</td>
      <td>alpha.nceas.ucsb.edu:8080/berkley/servlet/replication</td>
      <td>2001-01-22 14:52:12.1</td>
      <td>0</td>
      <td>0</td>
      <td>0</td>
    </tr>
    <tr>
      <td>3</td>
      <td>dev.nceas.ucsb.edu/Metacat/servlet/replication</td>
      <td>2001-01-23 9:10:02.5</td>
      <td>1</td>
      <td>1</td>
      <td>0</td>
    </tr>
  </table>
  
  <br>
  <p>The server list is kept in a database table called xml_replication.
  Localhost must always be the first entry in the table and have a serverid of 1.
  The database fields are:</p>
  <ul class="list1">
  <li><b>serverid</b> - a unique ID that is generated by the database when a new entry is added.</li>
  <li><b>server</b> - this field always points to the partner server's replication servlet,
  hence the "servlet/replication" on the end of both of the sample servers.  Note
  that the port number must also be included if your servlet engine is not running
  on port 80.</li>
  <li><b>last_checked</b> - a system-generated value that holds the last time that a check was 
  made to see if replication needed to be performed.</li>
  <li><b>replicate</b> - flag that is set to 1 if you want this server to replicate XML 
  metadata documents TO the remote host.  Note that if this flag is set to 0, the datareplicate
  and hub fields have no meaning.</li>
  <li><b>datareplicate</b> - flag that is set to 1 if you want this server to copy data 
  files to the remote host.  Note that this field has no meaning if replicate is not set to 1.</li>
  <li><b>hub</b> - flag that is set to 1 if this server is a hub to the remote host.  When it
  is set, this server will replicate not only its own original documents but also documents
  that were replicated to it.  Thus it acts as a replication hub to one or more other
  Metacat servers.</li>
  </ul>
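  <p>As a rough illustration of how the three flags combine, the following Python
  sketch reads one row of the server list and reports what this server would send
  to that remote host.  The <code>ServerEntry</code> class, the
  <code>outbound_content</code> function and the labels "metadata", "data" and
  "forwarded" are names invented for this example; they are not part of Metacat.</p>

```python
from dataclasses import dataclass


@dataclass
class ServerEntry:
    """One row of the xml_replication table (serverid and last_checked omitted)."""
    server: str
    replicate: int
    datareplicate: int
    hub: int


def outbound_content(entry: ServerEntry) -> set:
    """Return what this server sends TO the remote host described by entry."""
    if entry.replicate != 1:
        # With replicate set to 0, datareplicate and hub have no meaning.
        return set()
    sent = {"metadata"}           # XML metadata documents
    if entry.datareplicate == 1:
        sent.add("data")          # data files are copied as well
    if entry.hub == 1:
        sent.add("forwarded")     # documents that were replicated to this server
    return sent


dev = ServerEntry("dev.nceas.ucsb.edu/Metacat/servlet/replication", 1, 1, 0)
print(sorted(outbound_content(dev)))
```

  <p>For the sample table above, the row for dev.nceas.ucsb.edu yields metadata
  and data, while the row for alpha.nceas.ucsb.edu yields nothing.</p>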
  
  <a name="Example"></a><div class="header2">Example</div>
  Here we show an example setup of three replication servers and discuss each in turn.<br><br>
  
  First, note that replication only takes place between two partner servers if each 
  has the other in its replication table.  Also, 
  certificates must be set up correctly on both servers in order for replication to 
  work.  See the <a href="#Certificates">certificates</a> section below.<br><br>

  <table border="1">
    <tr>
      <td>host</td>
      <td>replication table</td>
    </tr>
    <tr>
     <td>gamma.nceas.ucsb.edu</td>
     <td>
      <table border="2">
        <tr>
          <td><b>server</b></td>
          <td><b>last_checked</b></td>
          <td><b>replicate</b></td>
          <td><b>datareplicate</b></td>
          <td><b>hub</b></td>
        </tr>
        <tr>
          <td>localhost</td>
          <td>null</td>
          <td>0</td>
          <td>0</td>
          <td>0</td>
        </tr>
        <tr>
          <td>alpha.nceas.ucsb.edu:8080/berkley/servlet/replication&nbsp;&nbsp;&nbsp;</td>
          <td>2001-01-22 14:52:12.1</td>
          <td>0</td>
          <td>0</td>
          <td>0</td>
        </tr>
        <tr>
          <td>lamda.nceas.ucsb.edu/Metacat/servlet/replication</td>
          <td>2001-01-23 9:10:02.5</td>
          <td>1</td>
          <td>1</td>
          <td>0</td>
        </tr>
      </table>
     </td>
    </tr>
    <tr>
      <td>alpha.nceas.ucsb.edu</td>
      <td>
        <table border="2">
          <tr>
            <td><b>server</b></td>
            <td><b>last_checked</b></td>
            <td><b>replicate</b></td>
            <td><b>datareplicate</b></td>
            <td><b>hub</b></td>
          </tr>
          <tr>
            <td>localhost</td>
            <td>null</td>
            <td>0</td>
            <td>0</td>
            <td>0</td>
          </tr>
          <tr>
            <td>gamma.nceas.ucsb.edu:8080/berkley/servlet/replication</td>
            <td>2001-01-21 11:33:12.7</td>
            <td>0</td>
            <td>1</td>
            <td>0</td>
          </tr>
          <tr>
            <td>lamda.nceas.ucsb.edu/Metacat/servlet/replication</td>
            <td>2001-01-23 10:22:02.5</td>
            <td>1</td>
            <td>0</td>
            <td>0</td>
          </tr>
        </table>
      </td>
    </tr>
    <tr>
      <td>lamda.nceas.ucsb.edu</td>
      <td>
        <table border="2">
          <tr>
            <td><b>server</b></td>
            <td><b>last_checked</b></td>
            <td><b>replicate</b></td>
            <td><b>datareplicate</b></td>
            <td><b>hub</b></td>
          </tr>
          <tr>
            <td>localhost</td>
            <td>null</td>
            <td>0</td>
            <td>0</td>
            <td>0</td>
          </tr>
          <tr>
            <td>gamma.nceas.ucsb.edu:8080/berkley/servlet/replication</td>
            <td>2001-01-21 11:33:12.7</td>
            <td>0</td>
            <td>0</td>
            <td>0</td>
          </tr>
          <tr>
            <td>alpha.nceas.ucsb.edu:8080/Metacat/servlet/replication</td>
            <td>2001-01-22 12:15:32.5</td>
            <td>1</td>
            <td>1</td>
            <td>1</td>
          </tr>
        </table>
      </td>
    </tr>
  </table>
  
  <a name="gamma"></a><div class="header3">What happens with gamma?</div>
  <ul class="list1">
  <li>The localhost entry is required internally for replication to work on 
      gamma.  As long as we see it there, we can safely disregard it.</li>
  <li>The entry for the alpha machine has all zeros in the replicate, 
      datareplicate and hub columns.  This means that gamma sends nothing to
      alpha but is configured to accept replication from alpha.  (As we will
      see in a moment, alpha is not actually correctly configured to send data
      to gamma.)</li>
  <li>The entry for the lamda machine has ones in the replicate
      and datareplicate columns and a zero in the hub column.  This tells us
      that gamma will replicate its original documents to lamda, assuming that
      lamda is configured to accept replication from gamma (we will see that it
      is).  However, because the hub value is zero, any documents that were
      replicated to gamma will not be further replicated to lamda.</li>
  </ul>
   
  <a name="alpha"></a><div class="header3">What happens with alpha?</div>
  <ul class="list1">
  <li>The localhost entry is required internally for replication to work on 
      alpha.  As long as we see it there, we can safely disregard it.</li>
  <li>The entry for gamma has a zero in the replicate column.  
      This means that the other flags in that entry are meaningless and can be
      disregarded.  Even though there is a one in the datareplicate column on
      alpha, and gamma is configured to accept replication from alpha, no
      replication will happen from alpha to gamma.</li>
  <li>The entry for lamda has a one in the replicate column and zeros
      in the datareplicate and hub columns.  Assuming lamda is configured to 
      accept replication from alpha, alpha will replicate metadata only to lamda 
      (and indeed, we will see that lamda is set up to accept replication from 
      alpha).</li>
  </ul>
      
  <a name="lamda"></a><div class="header3">What happens with lamda?</div>
  <ul class="list1">
  <li>The localhost entry is required internally for replication to work on 
      lamda.  As long as we see it there, we can safely disregard it.</li>
  <li>The entry for gamma has all zeros in replicate, datareplicate
      and hub, so lamda is set up to accept replication from gamma.  As we have
      already seen, gamma is correctly configured to replicate metadata and data
      to lamda.  We should see data and metadata replication from gamma to lamda.</li>
  <li>The entry for alpha has ones in the replicate, datareplicate and 
      hub columns.  There's a lot going on here:
    <ul class="list2">
    <li>First, lamda will replicate original metadata and data to alpha if 
        alpha is configured to accept replication from lamda.  Because alpha 
        has an entry for lamda, lamda will be allowed to replicate to alpha.</li>
    <li>Second, because the alpha entry has a one in the hub column, lamda 
        will replicate not only its original data but also 
        data that was replicated to it.  Remember that gamma was configured 
        to replicate to lamda.  So any data or metadata that gamma sends to 
        lamda will get further replicated to alpha.</li>
    <li>Finally, the alpha entry in the table allows the alpha server to 
        replicate to lamda.  Since the alpha server is set up to replicate
        metadata only, we would expect any original metadata on alpha to 
        wind up on lamda.</li>
    </ul>
  </li>
  </ul>
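  <p>The walk-through above can be checked with a short Python sketch.  It encodes
  the three example tables as (replicate, datareplicate, hub) tuples and applies
  the two rules from this page: a server sends to a partner only when replicate
  is 1, and only when the partner also lists the sender.  The
  <code>TABLES</code> layout, the <code>flows</code> function and the labels in
  the result are assumptions made for this illustration, not Metacat code.</p>

```python
# Replication tables from the example, keyed by host and then by partner.
# Each value is (replicate, datareplicate, hub); localhost rows are omitted.
TABLES = {
    "gamma": {"alpha": (0, 0, 0), "lamda": (1, 1, 0)},
    "alpha": {"gamma": (0, 1, 0), "lamda": (1, 0, 0)},
    "lamda": {"gamma": (0, 0, 0), "alpha": (1, 1, 1)},
}


def flows(tables):
    """Return {(sender, receiver): [kinds of content sent]} for every active flow."""
    result = {}
    for sender, entries in tables.items():
        for receiver, (replicate, datareplicate, hub) in entries.items():
            if replicate != 1:
                continue  # other flags are meaningless when replicate is 0
            if sender not in tables.get(receiver, {}):
                continue  # the partner must list the sender to accept from it
            kinds = ["metadata"]
            if datareplicate == 1:
                kinds.append("data")
            if hub == 1:
                kinds.append("forwarded")  # documents replicated to the sender
            result[(sender, receiver)] = kinds
    return result


for (sender, receiver), kinds in sorted(flows(TABLES).items()):
    print(sender, "->", receiver, ":", ", ".join(kinds))
```

  <p>Under these assumptions the sketch reports three flows: alpha to lamda
  (metadata only), gamma to lamda (metadata and data), and lamda to alpha
  (metadata, data and forwarded documents).  No flow from alpha to gamma appears,
  which matches the discussion above.</p>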
      
  <p>There is an HTML control panel for controlling replication.  After
  <a href="./Metacatinstall.html">installing</a> Metacat, you can access
  it by going through the Metacat servlet context you have set up and calling up
  replControl.html.  For instance, if your Metacat servlet context is 
  called 'Metacat', you would probably go to</p>
  
  <div class="code">http://server.domain.com:8080/Metacat/style/skins/dev/replControl.html</div>  
  
  <p>The control panel is an easy interface for adding, removing and altering servers and 
  starting the delta-T handler.  It also lets you 'force replicate' your 
  server list.  This is useful if you want to initialize the state of one Metacat 
  server from the existing state of another (i.e. copy all of the data from an existing
  server).</p>
  
  <a name="Certificates"></a><div class="header1">Certificates:</div>
  You will need to generate security certificates on both the replication client 
  and server.  The certificates are then exchanged so that each machine recognizes
  that the other has access for replication.<br><br>
  The following are the steps to generate and exchange certificates on systems
  running Tomcat 5 and Java 1.5.  Note that if Tomcat is running in conjunction with
  Apache, the process is somewhat different than if it is running standalone.

  <a name="GenerateCertificates"></a><div class="header2">Generate Certificates on both the replication client and server.</div>  

  <a name="GenerateCertTomcat"></a><div class="header3">Generate Certificate for Tomcat standalone (no Apache)</div>
  <ul class="list1">
  <li>Generate keys in the Java default keystore - this will create a secure key and put it
    into the binary certificates file located at $JAVA_HOME/lib/security/cacerts 
    <ul class="list2">
    <li>Run the command (from the directory that contains cacerts, or give the full path to -keystore): 
      <div class="code">keytool -genkey -alias &lt;aliasname&gt; -keyalg RSA -validity 800 -keystore cacerts</div>
     where &lt;aliasname&gt; is a unique name that you choose for this cert.  Something like "&lt;hostname&gt;-tomcat"
     might be appropriate.</li>
    </ul>
  </li>
  <li>Sample values when creating certificate
    <ul class="list2">
    <li>What is your first and last name? <b>myserver.nceas.ucsb.edu </b>
        (note: use the host name without port number)</li>
    <li>What is the name of your organizational unit? <b>NCEAS</b></li>
    <li>What is the name of your organization? <b>UCSB</b></li>
    <li>What is the name of your City or Locality? <b>Santa Barbara</b></li>
    <li>What is the name of your State or Province? <b>California</b> 
        (note: this is spelled in full)</li>
    <li>What is the two-letter country code for this unit? <b>US</b></li>
    </ul>
  </li>
  <li>Generate certificate - this will pull the certificate you created from the cacerts file
      and put it into a local file
    <ul class="list2">
    <li>Run the command:
      <div class="code">keytool -export -alias &lt;aliasname&gt; -file &lt;outputfile&gt;.cert -keystore cacerts</div>
      where &lt;aliasname&gt; is the same name you used when you created the certificate.  </li>
    <li>A file named &lt;outputfile&gt;.cert will be created in the same directory where you run the keytool 
      command.  You can name the output file anything you like, but keep in mind that it will get sent to the 
      partner machine used for replication.  The filename should have enough meaning that someone who sees 
      it on that machine can tell where it came from.  Again, something like "&lt;hostname&gt;-tomcat.cert"
      will suffice.</li>   
    </ul>
  </li>
  <li>Enable SSL in Tomcat 
    <ul class="list2">
    <li>Edit the Tomcat server file at $TOMCAT_HOME/conf/server.xml</li>
    <li>Uncomment the section that starts with "&lt;Connector port="8443" ...</li>
    <li>Add another attribute to that section that reads:
      <div class="code">keystoreFile="&lt;JAVA_HOME&gt;/lib/security/cacerts"</div>
      where &lt;JAVA_HOME&gt; is replaced by the actual Java installation path.
    </li>
    </ul>
  </li>
  </ul>  
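  <p>If you script these steps, the two keytool invocations can be assembled as
  argument lists.  The Python sketch below only builds the commands shown above;
  it does not run them.  The function names and the sample alias
  "myserver-tomcat" are assumptions made for this example.</p>

```python
import shlex


def keytool_genkey(alias, keystore="cacerts", validity_days=800):
    """Build the key generation command shown above as an argument list."""
    return ["keytool", "-genkey", "-alias", alias, "-keyalg", "RSA",
            "-validity", str(validity_days), "-keystore", keystore]


def keytool_export(alias, output_file, keystore="cacerts"):
    """Build the command that exports the certificate for alias into output_file."""
    return ["keytool", "-export", "-alias", alias,
            "-file", output_file, "-keystore", keystore]


alias = "myserver-tomcat"  # hypothetical alias following the "<hostname>-tomcat" pattern
for command in (keytool_genkey(alias), keytool_export(alias, alias + ".cert")):
    print(shlex.join(command))
```

  <p>Each list could be handed to <code>subprocess.run</code>, but note that
  keytool -genkey prompts interactively for the values listed above, so running
  it unattended would need additional options that are not covered here.</p>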
    
  <a name="GenerateCertApache"></a><div class="header3">Generate Certificate for Apache/Tomcat</div>
  <ul class="list1">
  <li>Generate keys using openssl
    <ul class="list2">
    <li>Run the command: 
      <div class="code">   openssl req -new -out REQ.pem -keyout &lt;hostname&gt;-apache.key</div>
    </li>
    </ul>
  </li>
  <li>Sample values when creating certificate
    <ul class="list2">
    <li>Country Name (2 letter code) [AU]: <b>US</b></li>
    <li>State or Province Name (full name) [Some-State]: <b>California</b> 
        (note: this is spelled in full)</li>
    <li>Locality Name (eg, city) []: <b>Santa Barbara</b></li>
    <li>Organization Name (eg, company) [Internet Widgits Pty Ltd]: <b>UCSB</b></li>
    <li>Organizational Unit Name (eg, section) []: <b>NCEAS</b></li>
    <li>Common Name (eg, YOUR name) []: <b>myserver.mydomain.edu</b>
        (note: use the host name without port number)</li>
    <li>Email Address []:  <b>administrator@mydomain.edu</b></li>
    <li>A challenge password []: (note: leave blank)</li>
    <li>An optional company name []: (note: leave blank)</li>
    </ul>
  </li>    
  <li>Generate certificate - this will create a local file with your certificate
    <ul class="list2">
    <li>Run the command:
      <div class="code">openssl req -x509 -days 800 -in REQ.pem -key &lt;hostname&gt;-apache.key -out &lt;hostname&gt;-apache.crt</div>
      where &lt;hostname&gt;-apache.key is the key file you created in the first step.  </li>
    <li>A file named &lt;hostname&gt;-apache.crt will be created in the same directory where you run the openssl 
      command.  You can name the output file anything you like, but keep in mind that it will get sent to the 
      partner machine used for replication.  The filename should have enough meaning that someone who sees 
      it on that machine can tell where it came from.</li>   
    </ul>
  </li>   
  <li>Enter the certificate into the Apache security configuration - you need to register the certificate
      in the local Apache instance.  Note that the security files may be in a different place depending
      on how you installed Apache.
    <ul class="list2">
    <li>Copy the certificate and key file to the Apache SSL directories and enable SSL.</li>
    <li>For Ubuntu/Debian based systems:
      <ul class="list3">
      <li>sudo cp &lt;hostname&gt;-apache.crt /etc/ssl/certs</li>
      <li>sudo cp &lt;hostname&gt;-apache.key /etc/ssl/private</li>
      <li>As root, edit /etc/apache2/sites-available/default.  In the VirtualHost section
          after the DocumentRoot line, add:<br>
          SSLEngine on<br>
          SSLOptions +FakeBasicAuth +ExportCertData +CompatEnvVars +StrictRequire<br>
          SSLCertificateFile /etc/ssl/certs/&lt;hostname&gt;-apache.crt<br>
          SSLCertificateKeyFile /etc/ssl/private/&lt;hostname&gt;-apache.key<br>
      </li>
      </ul>
    </li>  
    </ul>  
    <ul class="list2">
    <li>For other systems:
      <ul class="list3">
      <li>sudo cp &lt;hostname&gt;-apache.crt $APACHE_HOME/conf/ssl.crt</li>
      <li>sudo cp &lt;hostname&gt;-apache.key $APACHE_HOME/conf/ssl.key</li> 
      <li>(The steps to enable SSL on non-Debian systems are not yet documented here.)</li>
      </ul>
    </li>  
    </ul>
  </li>
  <li>scp &lt;hostname&gt;-apache.crt to the replication partner machine.</li>
  </ul>  
  
  <a name="RegisterPartner"></a><div class="header2">Register the partner machine's certificate.</div>   
  At this point, you have created a certificate for each replication server and 
  copied (scp-ed) each one to the other server.  Now you need to import the remote server's
  certificate on the local machine.  Perform the following steps on each 
  replication server.
  <ul class="list1">
  <li>Import the remote certificate by running:
    <div class="code">keytool -import -alias &lt;remotehostalias&gt; -file &lt;remotehostfilename&gt;.cert -keystore cacerts</div>
    where &lt;remotehostfilename&gt; is the certificate file you created on the remote machine and
    copied to this machine, and &lt;remotehostalias&gt; is the name the certificate will use in
    the keystore.  The alias should be something that identifies the remote host.  
  </li>
  <li>Restart Apache and Tomcat on both replication machines.</li>
  </ul>
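  <p>Because every server has to import the certificate of each of its partners,
  it can help to list the import commands per host before you start.  The Python
  sketch below does this for a set of hosts.  It assumes that each certificate
  file is named "&lt;hostname&gt;-tomcat.cert" and that the remote host name is
  used as the alias; both conventions, as well as the function name, are
  assumptions made for this example.</p>

```python
def import_commands(hosts, keystore="cacerts"):
    """Map each host to the keytool -import commands it must run,
    one for every other host's certificate."""
    plan = {}
    for local in hosts:
        plan[local] = [
            ["keytool", "-import", "-alias", remote,
             "-file", remote + "-tomcat.cert", "-keystore", keystore]
            for remote in hosts
            if remote != local
        ]
    return plan


for host, commands in import_commands(["gamma", "lamda"]).items():
    for command in commands:
        print(host + ":", " ".join(command))
```

  <p>For two partners this yields one import on each machine; with three or more
  partners each machine imports one certificate per partner it replicates with.</p>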

  <a href="./packages.html">Back</a> | <a href="./metacattour.html">Home</a> | 
  <a href="./datafiles.html">Next</a>
  
</BODY>
</HTML>