Revision 4482

Added by daigle almost 13 years ago

Add certificate information and update examples

View differences:

docs/user/replication.html
26 26
      </td>
27 27
    </tr>
28 28
  </table>
29
  <p>Metacat has built in replication to allow different Metacat servers to 
30
  share data between themselves. In this release, Metacat not only replicate 
31
  XML documents but also data files. </p>
32
  <p> A new hub feature was added to Metacat too. Previous version of Metacat
33
  only replicate XML documents whose home server is itself. But hub Metacat can 
34
  replicate both its documents and other's documents which replicated from 
35
  other server.</p>
29
  
30
  <div class="header1">Table of Contents</div>
31
  <div class="toc1"><a href="#Intro">Metacat Replication</a></div>
32
    <div class="toc2"><a href="#DatabasedInfo">Databased Information</a></div>
33
    <div class="toc2"><a href="#Example">Example</a></div>
34
      <div class="toc3"><a href="#gamma">What happens with gamma?</a></div>
35
      <div class="toc3"><a href="#alpha">What happens with alpha?</a></div>
36
      <div class="toc3"><a href="#lamda">What happens with lamda?</a></div>
37
  <div class="toc1"><a href="#Certificates">Certificates</a></div>
38
    <div class="toc2"><a href="#GenerateCertificates">Generate Certificates on both the replication client and server.</a></div> 
39
      <div class="toc3"><a href="#GenerateCertTomcat">Generate Certificate for Tomcat standalone (no Apache)</a></div>
40
      <div class="toc3"><a href="#GenerateCertApache">Generate Certificate for Apache/Tomcat</a></div>
41
    <div class="toc2"><a href="#RegisterPartner">Register the partner machine's certificate</a></div> 
42
  
43
  <a name="Intro"></a><div class="header1">Metacat Replication</div>
44
  <p>Metacat has built-in replication to allow different Metacat servers to 
45
  share data between themselves. Metacat not only replicates XML documents but 
46
  also data files. </p>
47
  
48
  <p>Metacat's hub feature allows it to replicate not only its own server's original
49
  documents, but also those that were replicated from other servers.  This functionality
50
  allows for a more complex chaining replication structure.</p>
51
  
36 52
  <p>The replication scheme that Metacat uses is both push and pull.  There are 
37
  several triggers that can start a replication mechanism. </p>
38
  <ul>
39
    <li>Delta-T monitoring. At a set time interval a server checks each of the
53
  several triggers that can start a replication mechanism: </p>
54
  <ul class="list1">
55
    <li><b>Delta-T monitoring</b> - at a set time interval a server checks each of the
40 56
    other servers in its list for updated documents</li>
41
    <li>INSERT trigger. Whenever a document is inserted, the server notifies
57
    <li><b>INSERT trigger</b> - Whenever a document is inserted, the server notifies
42 58
    the remote hosts in its list that it has a new file available.</li>
43
    <li>UPDATE trigger. Whenever a document is updated, the server notifies
59
    <li><b>UPDATE trigger</b> - Whenever a document is updated, the server notifies
44 60
    each server in its list of the update.</li>
45
    <li>File locking. When a local user tries to alter a document on a local 
61
    <li><b>File locking</b> - When a local user tries to alter a document on a local 
46 62
    server that belongs to a remote server, the local server must first
47 63
    obtain a lock on that file.  Once the lock is obtained, the file can 
48 64
    be updated, then it is force replicated out to each server in the list.
......
50 66
    file does not overwrite a newer one.  Only a documents home server
51 67
    can give a lock for that file to be altered.</li>
52 68
  </ul>
69
  
70
  <a name="DatabasedInfo"></a><div class="header2">Databased Information</div>
53 71
  <p>Each server contains a list of servers to which it can replicate.  One-way
54 72
  replication is enabled by the 'replicate' and 'datareplicate' flags in the 
55 73
  list.  The server list may look like the following.</p>
......
88 106
    </tr>
89 107
  </table>
90 108
  
91
  <p>The server list is kept in a table in the database called xml_replication.
109
  <br>
110
  The server list is kept in a table in the database called xml_replication.
92 111
  Localhost must always be the first entry in the table and have a serverid of 1.
93
  The server field must always point to the other server's replication servlet,
94
  hence the servlet/replication on the end of both of the sample servers.  Note
112
  The database fields are:
113
  <ul class="list1">
114
  <li><b>serverid</b> - a unique ID that is generated by the database when a new row is added.</li>
115
  <li><b>server</b> - this field always points to the partner server's replication servlet,
116
  hence the "servlet/replication" on the end of both of the sample servers.  Note
95 117
  that any port numbers (if your servlet engine is not running on port 80) must
96
  also be included.  The replicate flag is set to 1 if you want this server to 
97
  copy XML documents TO the remote host.  If replicate flag is set to 1 and 
98
  datareplicate is set to 1, this server can copy data file TO the remote host
99
  too. If this server is a hub to the remote host, the hub flag should be set to
100
  1. (Note that both servers (the local host and the remote host) must have each 
101
  other in their respective tables or replication will not take place.)</p>
102
  <b>Example:</b>
118
  also be included. </li>
119
  <li><b>last_checked</b> - a system-generated value that holds the last time that a check was 
120
  made to see if replication needed to be performed.</li>
121
  <li><b>replicate</b> - flag that is set to 1 if you want this server to replicate XML 
122
  metadata documents TO the remote host.  Note that if this flag is set to 0, datareplicate
123
  and hub fields have no meaning.</li>
124
  <li><b>datareplicate</b> - flag that is set to 1 if you want this server to copy data 
125
  files to the remote host.  Note that this field has no meaning if replicate is not set to 1.</li>
126
  <li><b>hub</b> - flag that is set to 1 if this server is a hub to the remote host.  As a hub,
127
  this server will not only replicate its own
128
  original documents, it will also replicate documents that were replicated to it.  Thus it 
129
  acts as a replication hub to one or more other Metacat servers.</li>
130
  </ul>
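As a rough illustration of how the three flags combine, here is a minimal shell sketch. The function name replication_mode is hypothetical and not part of Metacat; it only restates the rules listed above for a single row of xml_replication.

```shell
#!/bin/sh
# Sketch only: describe what this server sends to one partner, given that
# partner's replicate, datareplicate and hub values from xml_replication.
# replication_mode is a hypothetical helper, not a Metacat command.
replication_mode() {
  replicate=$1
  datareplicate=$2
  hub=$3

  # If replicate is not 1, datareplicate and hub have no meaning.
  if [ "$replicate" != 1 ]; then
    echo "nothing"
    return
  fi

  if [ "$datareplicate" = 1 ]; then
    scope="metadata and data"
  else
    scope="metadata only"
  fi

  if [ "$hub" = 1 ]; then
    origin="own and replicated documents"
  else
    origin="own documents"
  fi

  echo "$scope, $origin"
}

replication_mode 1 0 0
replication_mode 0 1 1
```

The second call prints "nothing" because the replicate flag is 0, regardless of the other two values.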
131
  
132
  <a name="Example"></a><div class="header2">Example</div>
133
  Here we show an example setup of three replication servers.  We will discuss each.<br><br>
134
  
135
  First, note that in order for replication to occur, both partner servers must have 
136
  each other in their respective tables or replication will not take place.  Also, 
137
  certificates must be set up correctly on both servers in order for replication to 
138
  work.  See the <a href="#Certificates">certificates</a> section below.<br><br>
139

  
103 140
  <table border="1">
104 141
    <tr>
105 142
      <td>host</td>
106 143
      <td>replication table</td>
107 144
    </tr>
108 145
    <tr>
109
     <td>snoopy.nceas.ucsb.edu</td>
146
     <td>gamma.nceas.ucsb.edu</td>
110 147
     <td>
111 148
      <table border="2">
112 149
        <tr>
......
131 168
          <td>0</td>
132 169
        </tr>
133 170
        <tr>
134
          <td>dev.nceas.ucsb.edu/Metacat/servlet/replication</td>
171
          <td>lamda.nceas.ucsb.edu/Metacat/servlet/replication</td>
135 172
          <td>2001-01-23 9:10:02.5</td>
136 173
          <td>1</td>
137 174
          <td>1</td>
......
159 196
            <td>0</td>
160 197
          </tr>
161 198
          <tr>
162
            <td>snoopy.nceas.ucsb.edu:8080/berkley/servlet/replication</td>
199
            <td>gamma.nceas.ucsb.edu:8080/berkley/servlet/replication</td>
163 200
            <td>2001-01-21 11:33:12.7</td>
164 201
            <td>0</td>
165 202
            <td>1</td>
166 203
            <td>0</td>
167 204
          </tr>
168 205
          <tr>
169
            <td>dev.nceas.ucsb.edu/Metacat/servlet/replication</td>
206
            <td>lamda.nceas.ucsb.edu/Metacat/servlet/replication</td>
170 207
            <td>2001-01-23 10:22:02.5</td>
171 208
            <td>1</td>
172 209
            <td>0</td>
......
176 213
      </td>
177 214
    </tr>
178 215
    <tr>
179
      <td>dev.nceas.ucsb.edu</td>
216
      <td>lamda.nceas.ucsb.edu</td>
180 217
      <td>
181 218
        <table border="2">
182 219
          <tr>
......
194 231
            <td>0</td>
195 232
          </tr>
196 233
          <tr>
197
            <td>snoopy.nceas.ucsb.edu:8080/berkley/servlet/replication</td>
234
            <td>gamma.nceas.ucsb.edu:8080/berkley/servlet/replication</td>
198 235
            <td>2001-01-21 11:33:12.7</td>
199 236
            <td>0</td>
200 237
            <td>0</td>
......
211 248
      </td>
212 249
    </tr>
213 250
  </table>
214
  <p>Our three servers, snoopy, alpha and dev are all set up to replicate
215
  between themselves.  Snoopy is a one way replicator.  Meaning that it only 
216
  pushes XML documents and data file to dev but does not pull back from it.  This
217
  is achieved by dev and alpha setting snoopy's 'replicate' value to 0 indicating
218
  that they do not want to send their files to snoopy(Even in in Alpha, 
219
  'datareplicate' is set to 1 for snoopy but nothing will be sent to Snoopy from alpha).  
220
  Alpha and dev have a two-way replication agreement since each of them have a 1 
221
  in their 'replicate' value for the other.</p>
222
  <p>Snoopy will replicate both XML documents and data file to dev because it 
223
  setting dev's 'replicata' and 'datareplicate' is 1. Alpha only replicate
224
  XML documents to dev and this is caused by it setting dev's 'datareplicate' 0.
225
  </p>
226
  <p>Dev is a hub of alpha because it setting alpha's 'hub' value to be 1. Moreover, 
227
  dev set alpha's 'replicate' and 'datareplicate' value 1. So dev will replicate
228
  XML documents and data file whose home server is dev or snoopy(replicated 
229
  from snoopy) to alpha</p>
230
  <P>Note: if 'replicate' value is 0, the value for 'datareplicate' and 'hub'
231
  has no sense.</P>
232
  <p>There is an html control panel for controling replication.  After
251
  
252
  <a name="gamma"></a><div class="header3">What happens with gamma?</div>
253
  <ul class="list1">
254
  <li>The localhost entry is required internally for replication to work on 
255
      gamma.  As long as we see it there, we can safely disregard it.</li>
256
  <li>We see the entry for the alpha machine has all zeros in replicate, 
257
      datareplicate and hub columns.  This means that gamma is configured to
258
      accept replication information from alpha.  (As we will see in a moment,
259
      alpha is not actually correctly configured to send data to gamma.)</li>
260
  <li>We see that the entry for the lamda machine has ones in the replicate
261
      and data replicate columns and a zero in the hub column.  This tells us
262
      that gamma will replicate its original documents to lamda, assuming that
263
      lamda is configured to accept replication from gamma (we will see that it
264
      is).  However, because the hub value is zero, any documents that replicate 
265
      to gamma will not be further replicated to lamda.</li>
266
  </ul>
267
   
268
  <a name="alpha"></a><div class="header3">What happens with alpha?</div>
269
  <ul class="list1">
270
  <li>The localhost entry is required internally for replication to work on 
271
      alpha.  As long as we see it there, we can safely disregard it.</li>
272
  <li>We see that the entry for gamma has a zero in the replicate column.  
273
      This means that the other flags in this entry are meaningless and can be disregarded.
274
      Even though there is a one in the datareplicate column on alpha and gamma 
275
      is configured to accept replication from alpha, no replication will happen 
276
      from alpha to gamma.</li>
277
  <li>We see that the entry for lamda is a one in the replicate column and zeros
278
      in the datareplicate and hub columns.  Assuming lamda is configured to 
279
      accept replication from alpha, alpha will replicate metadata only to lamda 
280
      (and indeed, we will see that lamda is set up to accept replication from 
281
      alpha). </li>
282
  </ul>
283
      
284
  <a name="lamda"></a><div class="header3">What happens with lamda?</div>
285
  <ul class="list1">
286
  <li>The localhost entry is required internally for replication to work on 
287
      lamda.  As long as we see it there, we can safely disregard it.</li>
288
  <li>We see that the entry for gamma has all zeros in replicate, datareplicate
289
      and hub, so lamda is set up to accept replication from gamma.  As we have
290
      already seen, gamma is correctly configured to replicate metadata and data
291
      to lamda.  We should see data and metadata replication from gamma to lamda.</li>
292
  <li>We see that the entry for alpha has ones in the replicate, datareplicate and 
293
      hub columns.  There's a lot going on here:
294
    <ul class="list2">
295
    <li>First, lamda will replicate original metadata and data to alpha if 
296
        alpha is configured to accept replication from lamda.  Because alpha 
297
        has an entry for lamda, lamda will be allowed to replicate to alpha. </li>
298
    <li>Second, because the alpha entry has a one in the hub column, lamda 
299
        will not only replicate its original data, it will also replicate 
300
        data that was replicated to it.  Remember that gamma was configured 
301
        to replicate to lamda.  So any data or metadata that gamma sends to 
302
        lamda will get further replicated to alpha.</li>
303
    <li>Finally, the alpha entry in the table allows the alpha server to 
304
        replicate to lamda.  Since the alpha server is set up to replicate
305
        metadata only, we would expect any original metadata on alpha to 
306
        wind up on lamda.</li>
307
    </ul>
308
  </ul>
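The walkthrough above can be checked mechanically. The sketch below hard-codes the flag values described for gamma, alpha and lamda and applies the two rules from this section: both partners must list each other, and the sender's replicate flag must be 1. The entry and flows helpers are hypothetical names used only for illustration.

```shell
#!/bin/sh
# entry <local> <partner>: prints "replicate datareplicate hub" for the row
# that <local> holds for <partner>, or nothing if there is no such row.
# Values are taken from the walkthrough of the example tables.
entry() {
  case "$1>$2" in
    "gamma>alpha") echo "0 0 0" ;;
    "gamma>lamda") echo "1 1 0" ;;
    "alpha>gamma") echo "0 1 0" ;;
    "alpha>lamda") echo "1 0 0" ;;
    "lamda>gamma") echo "0 0 0" ;;
    "lamda>alpha") echo "1 1 1" ;;
  esac
}

# flows <from> <to>: what <from> would send to <to>.
flows() {
  row=$(entry "$1" "$2")
  back=$(entry "$2" "$1")
  # Both servers must have each other in their tables.
  if [ -z "$row" ] || [ -z "$back" ]; then
    echo "none"
    return
  fi
  # Split the row into $1=replicate, $2=datareplicate, $3=hub.
  set -- $row
  if [ "$1" != 1 ]; then
    echo "none"
    return
  fi
  result="metadata"
  if [ "$2" = 1 ]; then result="metadata+data"; fi
  if [ "$3" = 1 ]; then result="$result (hub)"; fi
  echo "$result"
}

for pair in "gamma lamda" "alpha gamma" "alpha lamda" "lamda alpha"; do
  set -- $pair
  echo "$1 -> $2: $(flows "$1" "$2")"
done
```

The loop reports data and metadata flowing from gamma to lamda, nothing from alpha to gamma, metadata only from alpha to lamda, and hub replication from lamda to alpha, matching the three lists above.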
309
      
310
  There is an HTML control panel for controlling replication.  After
233 311
  <a href="./Metacatinstall.html">installing</a> Metacat, you can access
234 312
  it by going through the Metacat servlet context you have setup and calling up
235 313
  replControl.html.  For instance, if you setup a Metacat servlet instance 
236
  called 'Metacat' you would probably type 
237
  http://server.domain.com:8080/Metacat/replControl.html.  The control panel 
238
  is an easy interface for adding/removing/altering servers and starting the 
239
  delta-T handler.  It will also allow you to 'force replicate' your server list.
240
  This is useful if you want to initialize the state of one Metacat server
241
  from an existing state of another (i.e. copy all of the data from an existing
314
  called 'knb' you would probably type 
315
  
316
  <div class="code">http://server.domain.com:8080/knb/style/skins/dev/replControl.html</div>  
317
  
318
  The control panel is an easy interface for adding/removing/altering servers and 
319
  starting the delta-T handler.  It will also allow you to 'force replicate' your 
320
  server list.  This is useful if you want to initialize the state of one Metacat 
321
  server from an existing state of another (i.e. copy all of the data from an existing
242 322
  server).</p>
243 323
  
244
  <br>
324
  <a name="Certificates"></a><div class="header1">Certificates</div>
325
  You will need to generate security certificates on both the replication client 
326
  and server.  The certificates will be exchanged so that each machine understands
327
  that the other has access for replication.<br><br>
328
  The following are the steps to generate and exchange certificates on systems
329
  running Tomcat 5 and Java 1.5.  Note that if Tomcat is running in conjunction with
330
  Apache, the process is somewhat different than if it is running standalone.
331

  
332
  <a name="GenerateCertificates"></a><div class="header2">Generate Certificates on both the replication client and server.</div>  
333

  
334
  <a name="GenerateCertTomcat"></a><div class="header3">Generate Certificate for Tomcat standalone (no Apache)</div>
335
  <ul class="list1">
336
  <li>Generate keys in java default key store - this will create a secure key and put it
337
    into the binary certificates file located at $JAVA_HOME/lib/security/cacerts</li> 
338
    <ul class="list2">
339
    <li>Run the command: 
340
   	  <div class="code">keytool -genkey -alias &lt;aliasname&gt; -keyalg RSA -validity 800 -keystore cacerts</div>
341
     where &lt;aliasname&gt; is a unique name that you choose for this cert.  Something like "&lt;hostname&gt;-tomcat"
342
     might be appropriate.</li>
343
    </ul>
344
  </li>
345
  <li>Sample values when creating certificate</li>
346
    <ul class="list2">
347
    <li>What is your first and last name? <b>myserver.nceas.ucsb.edu </b>
348
        (note: use the host name without port number)</li>
349
    <li>What is the name of your organizational unit? <b>NCEAS</b></li>
350
    <li>What is the name of your organization? <b>UCSB</b></li>
351
    <li>What is the name of your City or Locality? <b>Santa Barbara</b></li>
352
    <li>What is the name of your State or Province? <b>California</b> 
353
        (note: this is spelled in full)</li>
354
    <li>What is the two-letter country code for this unit? <b>US</b></li>
355
    </ul>
356
  <li>Generate certificate - this will pull the certificate you created from the cacerts file
357
      and put it into a local file</li>
358
    <ul class="list2">
359
    <li>Run the command:
360
      <div class="code">keytool -export -alias &lt;aliasname&gt; -file &lt;outputfile&gt;.cert -keystore cacerts</div>
361
      where &lt;aliasname&gt; is the same name you used when you created the certificate.  </li>
362
    <li>A file named &lt;outputfile&gt;.cert will be created in the same directory where you run the keytool 
363
      command.  You can name the output file anything you like, but keep in mind that it will get sent to the 
364
      partner machine used for replication.  The filename should have enough meaning that someone who sees 
365
      it on that machine can have some idea where it came from.  Again, something like "&lt;hostname&gt;-tomcat.cert"
366
      will suffice.</li>   
367
    </ul>
368
  </li>
369
  <li>Enable SSL in Tomcat 
370
    <ul class="list2">
371
    <li>Edit the Tomcat server file at $TOMCAT_HOME/conf/server.xml</li>
372
    <li>uncomment the section that starts with "&lt;Connector port="8443" ...</li>
373
  	<li>add another attribute to that section that reads:
374
  	  <div class="code">keystoreFile="&lt;JAVA_HOME&gt;/lib/security/cacerts"</div>
375
  	  where &lt;JAVA_HOME&gt; should be replaced with the actual Java installation path.
376
  	</li>
377
  	</ul>
378
  </li>
379
  </ul>  
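The key generation and export steps above can be collected into a small script. This is a sketch under assumptions: CERT_HOST, the run helper and the DRY_RUN switch are hypothetical conveniences, and the script is expected to be started from the directory that holds the cacerts file. With DRY_RUN=1 (the default here) it only prints the keytool commands so they can be reviewed first.

```shell
#!/bin/sh
# Sketch only: wraps the two keytool commands from the steps above.
# CERT_HOST, ALIAS, DRY_RUN and run are hypothetical names.
CERT_HOST=${CERT_HOST:-myserver}
ALIAS="$CERT_HOST-tomcat"
DRY_RUN=${DRY_RUN:-1}

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "$DRY_RUN" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Create the key pair in the keystore (keytool prompts for the name fields).
generate_key() {
  run keytool -genkey -alias "$ALIAS" -keyalg RSA -validity 800 -keystore cacerts
}

# Export the certificate to a file named after the alias.
export_cert() {
  run keytool -export -alias "$ALIAS" -file "$ALIAS.cert" -keystore cacerts
}

generate_key
export_cert
```

Setting DRY_RUN=0 runs the commands for real; keytool will then ask for the keystore password and the certificate values shown in the sample above.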
380
    
381
  <a name="GenerateCertApache"></a><div class="header3">Generate Certificate for Apache/Tomcat</div>
382
  <ul class="list1">
383
  <li>Generate keys using openssl
384
    <ul class="list2">
385
    <li>Run the command: 
386
   	  <div class="code">   openssl req -new -out REQ.pem -keyout &lt;hostname&gt;-apache.key</div>
387
    </li>
388
    </ul>
389
  </li>
390
  <li>Sample values when creating certificate</li>
391
    <ul class="list2">
392
    <li>Country Name (2 letter code) [AU]: <b>US</b></li>
393
    <li>State or Province Name (full name) [Some-State]: <b>California</b> 
394
        (note: this is spelled in full)</li>
395
    <li>Locality Name (eg, city) []: <b>Santa Barbara</b></li>
396
    <li>Organization Name (eg, company) [Internet Widgits Pty Ltd]: <b>UCSB</b></li>
397
    <li>Organizational Unit Name (eg, section) []: <b>NCEAS</b></li>
398
    <li>Common Name (eg, YOUR name) []: <b>myserver.mydomain.edu</b>
399
        (note: use the host name without port number)</li>
400
    <li>Email Address []:  <b>administrator@mydomain.edu</b></li>
401
    <li>A challenge password []: (note: leave blank)</li>
402
    <li>An optional company name []: (note: leave blank)</li>
403
    </ul>
404
  </li>    
405
  <li>Generate certificate - this will create a local file with your certificate</li>
406
    <ul class="list2">
407
    <li>Run the command:
408
      <div class="code">openssl req -x509 -days 800 -in REQ.pem -key &lt;hostname&gt;-apache.key -out &lt;hostname&gt;-apache.crt</div>
409
      where &lt;hostname&gt; is the same name you used when you generated the key.  </li>
410
    <li>A file named &lt;hostname&gt;-apache.crt will be created in the same directory where you run the openssl 
411
      command.  You can name the output file anything you like, but keep in mind that it will get sent to the 
412
      partner machine used for replication.  The filename should have enough meaning that someone who sees 
413
      it on that machine can have some idea where it came from.  Something like "&lt;hostname&gt;-apache.crt"
414
      will suffice.</li>   
415
    </ul>
416
  </li>   
417
  <li>Enter the certificate into apache security configuration - you need to register the certificate
418
      in the local Apache instance.  Note that the security files may be in a different place depending
419
      on how you installed apache.</li>
420
    <ul class="list2">
421
    <li>Copy the certificate and key file to the apache ssl directories and enable ssl.</li>
422
    <li>For Ubuntu/Debian based systems:
423
      <ul class="list3">
424
      <li>sudo cp &lt;hostname&gt;-apache.crt /etc/ssl/certs</li>
425
      <li>sudo cp &lt;hostname&gt;-apache.key /etc/ssl/private</li>
426
      <li>As root edit /etc/apache2/sites-available/default.  In the VirtualHost section
427
          after the DocumentRoot line, add:<br>
428
          SSLEngine on<br>
429
          SSLOptions +FakeBasicAuth +ExportCertData +CompatEnvVars +StrictRequire<br>
430
          SSLCertificateFile /etc/ssl/certs/&lt;hostname&gt;-apache.crt<br>
431
          SSLCertificateKeyFile /etc/ssl/private/&lt;hostname&gt;-apache.key<br>
432
      </li>
433
      </ul>
434
    </li>  
435
    </ul>  
436
    <ul class="list2">
437
    <li>For other systems:
438
      <ul class="list3">
439
      <li>sudo cp &lt;hostname&gt;-apache.crt $APACHE_HOME/conf/ssl.crt</li>
440
      <li>sudo cp &lt;hostname&gt;-apache.key $APACHE_HOME/conf/ssl.key</li> 
441
      <li> ADD STEPS TO ENABLE SSL ON NON_DEBIAN SYSTEMS HERE</li>
442
      </ul>
443
    </li>  
444
    </ul>    	  	        
445
  <li>scp &lt;hostname&gt;-apache.crt to the replication partner machine.</li>
446
  </ul>  
447
  
448
  <a name="RegisterPartner"></a><div class="header2">Register the partner machine's certificate.</div>   
449
  At this point, you have created a certificate for each replication server and 
450
  scp-ed them across to each other.  Now you need to import the remote server's
451
  certificate on the local machine.  Perform the following steps for each 
452
  replication server.
453
  <ul class="list1">
454
  <li>Import the remote certificate by running:
455
    <div class="code">keytool -import -alias &lt;remotehostalias&gt; -file &lt;remotehostfilename&gt;.cert -keystore cacerts</div>
456
    where the &lt;remotehostfilename&gt; is the certificate file you created on the remote machine and
457
    copied to this machine.  The &lt;remotehostalias&gt; is the name the certificate will use in
458
    the keystore.  It should be something that identifies the remote host.  
459
  </li>
460
  <li>Restart Apache and Tomcat on both replication machines</li>
461
  </ul>
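After importing, it can be useful to confirm that the partner's certificate is actually in the keystore. The sketch below pairs the import command above with keytool's -list option for the same alias. The alias and file name partner-tomcat are placeholders, and the run helper and DRY_RUN switch are hypothetical; with DRY_RUN=1 the commands are only printed.

```shell
#!/bin/sh
# Sketch only: import a partner certificate, then list it to confirm.
# The alias, file name, DRY_RUN and run are hypothetical names.
DRY_RUN=${DRY_RUN:-1}

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "$DRY_RUN" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

# import_partner_cert <alias> <certificate file>
import_partner_cert() {
  run keytool -import -alias "$1" -file "$2" -keystore cacerts
}

# show_partner_cert <alias>: prints the keystore entry for that alias.
show_partner_cert() {
  run keytool -list -alias "$1" -keystore cacerts
}

import_partner_cert partner-tomcat partner-tomcat.cert
show_partner_cert partner-tomcat
```

If the list step reports that the alias does not exist, the import did not succeed and replication between the two servers should not be expected to work.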
462

  
245 463
  <a href="./packages.html">Back</a> | <a href="./metacattour.html">Home</a> | 
246 464
  <a href="./datafiles.html">Next</a>
465
  </ul>
247 466
  
248 467

  
249 468
</BODY>
