Revision 5316

Added by daigle over 14 years ago

Move all user images into the dev image directory. Remove all but the MetacatAdministratorsGuide from the user dir.

View differences:

docs/user/index.html
1
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">
2
<html>
3

  
4
<head>
5
  <title>KNB Software: Metacat</title>
6
  <link rel="stylesheet" type="text/css" href="./default.css">
7
</head>
8

  
9
<body>
10

  
11
<table class="tabledefault" width="100%">
12
<tr>
13
  <td rowspan="2"><img src="./images/KNBLogo.gif"></td>
14
  <td colspan="7"><div class="title">Metacat: data and metadata catalog</div></td>
15
</tr>
16
<tr>
17
  <td><a href="/" class="toollink"> KNB Home </a></td>
18
  <td><a href="/data.html" class="toollink"> Data </a></td>
19
  <td><a href="/people.html" class="toollink"> People </a></td>
20
  <td><a href="/informatics" class="toollink"> Informatics </a></td>
21
  <td><a href="/biodiversity" class="toollink"> Biocomplexity </a></td>
22
  <td><a href="/education" class="toollink"> Education </a></td>
23
  <td><a href="/software" class="toollink"> Software </a></td>
24
</tr>
25
</table>
26
<hr>
27

  
28
<p>&nbsp;</p>
29
<table class="tabledefault" width="100%">
30
<tr>
31
  <td class="tablehead" colspan="2"><p class="label">Metacat: a flexible 
32
      metadata catalog and data repository</p></td>
33
</tr>
34
<tr>
35
  <td>
36
  <p><a href="/software/download.html">Download Metacat</a></p>
37
  <h3>Documentation and Installation Instructions</h3>
38
  <ul>
39
    <li><a
40
    href="/software/dist/MetacatAdministratorGuide.pdf">Metacat
41
    Administrator's Guide</a></li>
42
  </ul>
43
  <img src="images/metacat-logo.png" alt="Metacat" style="float: right;
44
  height: 100px"/> 
45
  <p>Metacat is a flexible, open source metadata catalog and data repository 
46
  that targets scientific data, particularly from ecology and environmental
47
  science.  Metacat accepts 
48
     <a href="http://www.w3.org/TR/REC-xml" target="offline">XML</a> as a 
49
     common syntax for representing the large number of metadata content
50
     standards that are relevant to ecology and other sciences.  Thus, Metacat 
51
     is a generic XML database that allows storage, query, and retrieval of 
52
     arbitrary XML documents without prior knowledge of the XML schema.
53
  </p>
54
  <p>
55
    Metacat is being used extensively <a href="/community.jsp">throughout the 
56
    world</a> to manage environmental data. It is a key infrastructure
57
    component for the <a href="http://data.nceas.ucsb.edu">NCEAS data catalog</a>, the <a href="http://knb.ecoinformatics.org/">Knowledge Network
58
    for Biocomplexity (KNB)</a> data catalog, and for the <a href="http://dataone.org">DataONE</a> system, <a href="/community.jsp">among others</a>.</p>
59
  </td>
60
</tr>
61
<tr><td>&nbsp;</td></tr>
62

  
63
</table>
64

  
65
<iframe width="500" height="350" frameborder="0" scrolling="no"
66
marginheight="0" marginwidth="0"
67
src="http://maps.google.com/maps/ms?ie=UTF8&amp;hl=en&amp;msa=0&amp;msid=111257081347673397900.00000111e7923f163c54e&amp;ll=10.487812,-8.4375&amp;spn=152.923473,351.5625&amp;z=1&amp;output=embed"></iframe><br
68
/><small>View <a
69
href="http://maps.google.com/maps/ms?ie=UTF8&amp;hl=en&amp;msa=0&amp;msid=111257081347673397900.00000111e7923f163c54e&amp;ll=10.487812,-8.4375&amp;spn=152.923473,351.5625&amp;z=1&amp;source=embed"
70
style="color:#0000FF;text-align:left">Metacat deployments</a> in a larger
71
map</small>
72

  
73
<p class="contact">
74
Web Contact: <a href="mailto:jones@nceas.ucsb.edu">jones@nceas.ucsb.edu</a>
75
</p>
76
</body>
77
</html>

  
docs/user/replication.html
1
<!--
2
  * replication.html
3
  *
4
  *      Authors: Chad Berkley
5
  *    Copyright: 2000 Regents of the University of California and the
6
  *               National Center for Ecological Analysis and Synthesis
7
  *  For Details: http://www.nceas.ucsb.edu/
8
  *      Created: 2001 January 23
9
  *      Version: 
10
  *    File Info: '$ '
11
  * 
12
  * 
13
-->
14
<HTML>
15
<HEAD>
16
<TITLE>Metacat Replication</TITLE>
17
<link rel="stylesheet" type="text/css" href="./common.css">
18
<link rel="stylesheet" type="text/css" href="./default.css">
19
</HEAD> 
20
<BODY>
21
  <table width="100%">
22
    <tr>
23
      <td class="tablehead" colspan="2"><p class="label">Replication</p></td>
24
      <td class="tablehead" colspan="2" align="right">
25
        <a href="./packages.html">Back</a> | <a href="./metacattour.html">Home</a> | 
26
        <a href="./datafiles.html">Next</a>
27
      </td>
28
    </tr>
29
  </table>
30
  
31
  <div class="header1">Table of Contents</div>
32
  <div class="toc1"><a href="#Intro">Metacat Replication</a></div>
33
    <div class="toc2"><a href="#Overview">Overview</a></div>
34
    <div class="toc2"><a href="#DatabasedInfo">Database Information</a></div>
35
    <div class="toc2"><a href="#Example">Example</a></div>
36
      <div class="toc3"><a href="#gamma">What happens with gamma?</a></div>
37
      <div class="toc3"><a href="#alpha">What happens with alpha?</a></div>
38
      <div class="toc3"><a href="#lamda">What happens with lamda?</a></div>
39
  <div class="toc1"><a href="#ControlPanel">The Replication Control Panel</a></div>
40
  <div class="toc1"><a href="#Certificates">Certificates</a></div>
41
    <div class="toc2"><a href="#GenerateCertificates">Generate Certificates on both the replication client and server.</a></div> 
42
      <div class="toc3"><a href="#GenerateCertTomcat">Generate Certificate for Tomcat standalone (no Apache)</a></div>
43
      <div class="toc3"><a href="#GenerateCertApache">Generate Certificate for Apache/Tomcat</a></div>
44
    <div class="toc2"><a href="#RegisterPartner">Register the partner machine's certificate</a></div> 
45
  
46
  <a name="Intro"></a><div class="header1">Metacat Replication</div>
47
  <a name="Overview"></a><div class="header2">Overview</div>
48
  <p>Metacat has built-in replication to allow different Metacat servers to 
49
  share data between themselves. Metacat not only replicates XML documents but 
50
  also data files. </p>
51
  
52
  <p>Metacat's hub feature allows it to replicate not only its own server's original
53
  documents, but also those that were replicated from other servers.  This functionality
54
  allows for a more complex chaining replication structure.</p>
55
  
56
  <p>The replication scheme that Metacat uses is both push and pull.  There are 
57
  several triggers that can start a replication mechanism: </p>
58
  <ul class="list1">
59
    <li><b>Delta-T monitoring</b> - at a set time interval a server checks each of the
60
    other servers in its list for updated documents</li>
61
    <li><b>INSERT trigger</b> - Whenever a document is inserted, the server notifies
62
    the remote hosts in its list that it has a new file available.</li>
63
    <li><b>UPDATE trigger</b> - Whenever a document is updated, the server notifies
64
    each server in its list of the update.</li>
65
    <li><b>File locking</b> - When a local user tries to alter a document on a local 
66
    server that belongs to a remote server, the local server must first
67
    obtain a lock on that file.  Once the lock is obtained, the file can 
68
    be updated, then it is force replicated out to each server in the list.
69
    The lock ensures that the remote copy is up to date and that an older
70
    file does not overwrite a newer one.  Only a document's home server
71
    can give a lock for that file to be altered.</li>
72
  </ul>
73
  
74
  <a name="DatabasedInfo"></a><div class="header2">Database Information</div>
75
  <p>Each server contains a list of servers to which it can replicate.  One-way
76
  replication is enabled by the 'replicate' and 'datareplicate' flags in the 
77
  list.  The server list may look like the following.</p>
78
  <table border="1">
79
    <tr>
80
      <td><b>serverid</b></td>
81
      <td><b>server</b></td>
82
      <td><b>last_checked</b></td>
83
      <td><b>replicate</b></td>
84
      <td><b>datareplicate</b></td>
85
      <td><b>hub</b></td>
86
    </tr>
87
    <tr>
88
      <td>1</td>
89
      <td>localhost</td>
90
      <td>null</td>
91
      <td>0</td>
92
      <td>0</td>
93
      <td>0</td>
94
    </tr>
95
    <tr>
96
      <td>2</td>
97
      <td>alpha.nceas.ucsb.edu:8080/berkley/servlet/replication</td>
98
      <td>2001-01-22 14:52:12.1</td>
99
      <td>0</td>
100
      <td>0</td>
101
      <td>0</td>
102
    </tr>
103
    <tr>
104
      <td>3</td>
105
      <td>dev.nceas.ucsb.edu/Metacat/servlet/replication</td>
106
      <td>2001-01-23 9:10:02.5</td>
107
      <td>1</td>
108
      <td>1</td>
109
      <td>0</td>
110
    </tr>
111
  </table>
112
  
113
  <br>
114
  The server list is kept in a table in the database called xml_replication.
115
  Localhost must always be the first entry in the table and have a serverid of 1.
116
  The database fields are:
117
  <ul class="list1">
118
  <li><b>serverid</b> - a unique ID that is generated by the database when a new row is added.</li>
119
  <li><b>server</b> - this field always points to the partner server's replication servlet,
120
  hence the "servlet/replication" on the end of both of the sample servers.  Note
121
  that any port numbers (if your servlet engine is not running on port 80) must
122
  also be included. </li>
123
  <li><b>last_checked</b> - a system-generated value that holds the last time that a check was 
124
  made to see if replication needed to be performed.</li>
125
  <li><b>replicate</b> - flag that is set to 1 if you want this server to replicate XML 
126
  metadata documents TO the remote host.  Note that if this flag is set to 0, datareplicate
127
  and hub fields have no meaning.</li>
128
  <li><b>datareplicate</b> - flag that is set to 1 if you want this server to copy data 
129
  files to the remote host.  Note that this field has no meaning if replicate is not set to 1.</li>
130
  <li><b>hub</b> - flag that is set to 1 if this server is a hub to the remote host.
131
  If this flag is set, this server will not only replicate its own
132
  original documents, it will also replicate documents that were replicated to it.  Thus it 
133
  acts as a replication hub to one or more other Metacat servers.</li>
134
  </ul>
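  <p>The flag rules above can be sketched in a few lines of Python. This is an
  illustrative model only, not Metacat code: the <code>ReplicationEntry</code> class and the
  <code>outbound_behavior</code> function are hypothetical names, and the sketch assumes the
  flags are stored as the integers 0 and 1 shown in the sample table.</p>

```python
from dataclasses import dataclass


@dataclass
class ReplicationEntry:
    """One row of the server list (hypothetical in-memory form, not Metacat code)."""
    server: str
    replicate: int
    datareplicate: int
    hub: int


def outbound_behavior(entry: ReplicationEntry) -> dict:
    """Describe what this server sends to the partner named in the row."""
    if entry.replicate != 1:
        # With replicate at 0, datareplicate and hub have no meaning.
        return {"metadata": False, "data": False, "forwards_replicated": False}
    return {
        "metadata": True,
        "data": entry.datareplicate == 1,
        "forwards_replicated": entry.hub == 1,
    }


# The two remote rows from the sample server list above.
alpha = ReplicationEntry("alpha.nceas.ucsb.edu:8080/berkley/servlet/replication", 0, 0, 0)
dev = ReplicationEntry("dev.nceas.ucsb.edu/Metacat/servlet/replication", 1, 1, 0)

for row in (alpha, dev):
    print(row.server, outbound_behavior(row))
```

  <p>Under these assumptions the alpha row sends nothing, while the dev row sends
  metadata and data files but does not forward documents that were replicated in.</p>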
135
  
136
  <a name="Example"></a><div class="header2">Example</div>
137
  Here we show an example setup of three replication servers.  We will discuss each.<br><br>
138
  
139
  First, note that in order for replication to occur, both partner servers must have 
140
  each other in their respective tables or replication will not take place.  Also, 
141
  certificates must be set up correctly on both servers in order for replication to 
142
  work.  See the <a href="#Certificates">certificates</a> section below.<br><br>
143

  
144
  <table border="1">
145
    <tr>
146
      <td>host</td>
147
      <td>replication table</td>
148
    </tr>
149
    <tr>
150
     <td>gamma.nceas.ucsb.edu</td>
151
     <td>
152
      <table border="2">
153
        <tr>
154
          <td><b>server</b></td>
155
          <td><b>last_checked</b></td>
156
          <td><b>replicate</b></td>
157
          <td><b>datareplicate</b></td>
158
          <td><b>hub</b></td>
159
        </tr>
160
        <tr>
161
          <td>localhost</td>
162
          <td>null</td>
163
          <td>0</td>
164
          <td>0</td>
165
          <td>0</td>
166
        </tr>
167
        <tr>
168
          <td>alpha.nceas.ucsb.edu:8080/berkley/servlet/replication&nbsp;&nbsp;&nbsp;</td>
169
          <td>2001-01-22 14:52:12.1</td>
170
          <td>0</td>
171
          <td>0</td>
172
          <td>0</td>
173
        </tr>
174
        <tr>
175
          <td>lamda.nceas.ucsb.edu/Metacat/servlet/replication</td>
176
          <td>2001-01-23 9:10:02.5</td>
177
          <td>1</td>
178
          <td>1</td>
179
          <td>0</td>
180
        </tr>
181
      </table>
182
     </td>
183
    </tr>
184
    <tr>
185
      <td>alpha.nceas.ucsb.edu</td>
186
      <td>
187
        <table border="2">
188
          <tr>
189
            <td><b>server</b></td>
190
            <td><b>last_checked</b></td>
191
            <td><b>replicate</b></td>
192
            <td><b>datareplicate</b></td>
193
            <td><b>hub</b></td>
194
          </tr>
195
          <tr>
196
            <td>localhost</td>
197
            <td>null</td>
198
            <td>0</td>
199
            <td>0</td>
200
            <td>0</td>
201
          </tr>
202
          <tr>
203
            <td>gamma.nceas.ucsb.edu:8080/berkley/servlet/replication</td>
204
            <td>2001-01-21 11:33:12.7</td>
205
            <td>0</td>
206
            <td>1</td>
207
            <td>0</td>
208
          </tr>
209
          <tr>
210
            <td>lamda.nceas.ucsb.edu/Metacat/servlet/replication</td>
211
            <td>2001-01-23 10:22:02.5</td>
212
            <td>1</td>
213
            <td>0</td>
214
            <td>0</td>
215
          </tr>
216
        </table>
217
      </td>
218
    </tr>
219
    <tr>
220
      <td>lamda.nceas.ucsb.edu</td>
221
      <td>
222
        <table border="2">
223
          <tr>
224
            <td><b>server</b></td>
225
            <td><b>last_checked</b></td>
226
            <td><b>replicate</b></td>
227
            <td><b>datareplicate</b></td>
228
            <td><b>hub</b></td>
229
          </tr>
230
          <tr>
231
            <td>localhost</td>
232
            <td>null</td>
233
            <td>0</td>
234
            <td>0</td>
235
            <td>0</td>
236
          </tr>
237
          <tr>
238
            <td>gamma.nceas.ucsb.edu:8080/berkley/servlet/replication</td>
239
            <td>2001-01-21 11:33:12.7</td>
240
            <td>0</td>
241
            <td>0</td>
242
            <td>0</td>
243
          </tr>
244
          <tr>
245
            <td>alpha.nceas.ucsb.edu:8080/Metacat/servlet/replication</td>
246
            <td>2001-01-22 12:15:32.5</td>
247
            <td>1</td>
248
            <td>1</td>
249
            <td>1</td>
250
          </tr>
251
        </table>
252
      </td>
253
    </tr>
254
  </table>
255
  
256
  <a name="gamma"></a><div class="header3">What happens with gamma?</div>
257
  <ul class="list1">
258
  <li>The localhost entry is required internally for replication to work on 
259
      gamma.  As long as we see it there, we can safely disregard it.</li>
260
  <li>We see the entry for the alpha machine has all zeros in replicate, 
261
      datareplicate and hub columns.  This means that gamma is configured to
262
      accept replication information from alpha.  (As we will see in a moment,
263
      alpha is not actually correctly configured to send data to gamma.)</li>
264
  <li>We see that the entry for the lamda machine has ones in the replicate
265
      and datareplicate columns and a zero in the hub column.  This tells us
266
      that gamma will replicate its original documents to lamda, assuming that
267
      lamda is configured to accept replication from gamma (we will see that it
268
      is).  However, because the hub value is zero, any documents that replicate 
269
      to gamma will not be further replicated to lamda.</li>
270
  </ul>
271
   
272
  <a name="alpha"></a><div class="header3">What happens with alpha?</div>
273
  <ul class="list1">
274
  <li>The localhost entry is required internally for replication to work on 
275
      alpha.  As long as we see it there, we can safely disregard it.</li>
276
  <li>We see that the entry for gamma has a zero in the replicate column.  
277
      This means that the other flags in this row are meaningless and can be disregarded.
278
      Even though there is a one in the datareplicate column on alpha and gamma 
279
      is configured to accept replication from alpha, no replication will happen 
280
      from alpha to gamma.</li>
281
  <li>We see that the entry for lamda is a one in the replicate column and zeros
282
      in the datareplicate and hub columns.  Assuming lamda is configured to 
283
      accept replication from alpha, alpha will replicate metadata only to lamda 
284
      (and indeed, we will see that lamda is set up to accept replication from 
285
      alpha). </li>
286
  </ul>
287
      
288
  <a name="lamda"></a><div class="header3">What happens with lamda?</div>
289
  <ul class="list1">
290
  <li>The localhost entry is required internally for replication to work on 
291
      lamda.  As long as we see it there, we can safely disregard it.</li>
292
  <li>We see that the entry for gamma has all zeros in replicate, datareplicate
293
      and hub, so lamda is set up to accept replication from gamma.  As we have
294
      already seen, gamma is correctly configured to replicate metadata and data
295
      to lamda.  We should see data and metadata replication from gamma to lamda.</li>
296
  <li>We see that the entry for alpha has ones in the replicate, datareplicate and 
297
      hub columns.  There's a lot going on here:
298
    <ul class="list2">
299
    <li>First, lamda will replicate original metadata and data to alpha if 
300
        alpha is configured to accept replication from lamda.  Because alpha 
301
        has an entry for lamda, lamda will be allowed to replicate to alpha. </li>
302
    <li>Second, because the alpha entry has a one in the hub column, lamda 
303
        will not only replicate its original data, it will also replicate 
304
        data that was replicated to it.  Remember that gamma was configured 
305
        to replicate to lamda.  So any data or metadata that gamma sends to 
306
        lamda will get further replicated to alpha.</li>
307
    <li>Finally, the alpha entry in the table allows the alpha server to 
308
        replicate to lamda.  Since the alpha server is set up to replicate
309
        metadata only, we would expect any original metadata on alpha to 
310
        wind up on lamda.</li>
311
    </ul>
312
  </ul>
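  <p>The three-server walkthrough can be condensed into a small model. The Python
  sketch below is illustrative only and is not part of Metacat: the <code>TABLES</code>
  dictionary and the <code>flows</code> function are hypothetical names, host names are
  shortened to gamma, alpha and lamda, and the rule it encodes (the source needs replicate
  set to 1 and the target needs an entry for the source) is the one described in this
  example.</p>

```python
# Each server's replication table, keyed by partner name.
# Values are (replicate, datareplicate, hub) as shown in the example tables.
TABLES = {
    "gamma": {"alpha": (0, 0, 0), "lamda": (1, 1, 0)},
    "alpha": {"gamma": (0, 1, 0), "lamda": (1, 0, 0)},
    "lamda": {"gamma": (0, 0, 0), "alpha": (1, 1, 1)},
}


def flows(tables):
    """Return (source, target, sends_data, acts_as_hub) for each working link."""
    result = []
    for source, partners in tables.items():
        for target, (replicate, datareplicate, hub) in partners.items():
            if replicate != 1:
                # replicate = 0 switches the link off; other flags are ignored.
                continue
            if source not in tables.get(target, {}):
                # Both partners must list each other for replication to occur.
                continue
            result.append((source, target, datareplicate == 1, hub == 1))
    return sorted(result)


for source, target, sends_data, acts_as_hub in flows(TABLES):
    kinds = "metadata and data" if sends_data else "metadata only"
    extra = ", including documents replicated to it" if acts_as_hub else ""
    print(f"{source} -> {target}: {kinds}{extra}")
```

  <p>With the tables above, the sketch reports three links: alpha to lamda (metadata
  only), gamma to lamda (metadata and data), and lamda to alpha (metadata and data,
  acting as a hub), which matches the discussion of each server.</p>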
313

  
314
<a name="ControlPanel"></a><div class="header1">The Replication Control Panel:</div>      
315
  There is an HTML control panel for controlling replication.  After
316
  installing Metacat, you can access it by calling replControl.html.  For instance, if you 
317
  set up a Metacat application context called 'knb' you would probably type:
318
  
319
  <div class="code">http://server.domain.com/knb/style/skins/dev/replControl.html</div>  
320
  
321
  <p>The control panel is an easy interface for adding/removing/altering servers and 
322
  starting the delta-T handler.  It will also allow you to 'force replicate' your 
323
  server list.  This is useful if you want to initialize the state of one Metacat 
324
  server from an existing state of another (i.e. copy all of the data from an existing
325
  server).</p>
326
  
327
  <a name="Certificates"></a><div class="header1">Certificates:</div>
328
  You will need to generate security certificates on both the replication client 
329
  and server.  The certificates will be exchanged so that each machine understands
330
  that the other has access for replication.<br><br>
331
  The following are the steps to generate and exchange certificates on systems
332
  running Tomcat 5 and Java 1.5.  Note that if Tomcat is running in conjunction with
333
  Apache, the process is somewhat different than if it is running standalone.
334

  
335
  <a name="GenerateCertificates"></a><div class="header2">Generate Certificates on both the replication client and server.</div>  
336

  
337
  <a name="GenerateCertTomcat"></a><div class="header3">Generate Certificate for Tomcat standalone (no Apache)</div>
338
  <ul class="list1">
339
  <li>Generate keys in the Java default key store - this will create a secure key and put it
340
    into the binary certificates file located at $JAVA_HOME/lib/security/cacerts
341
    <ul class="list2">
342
    <li>Run the command: 
343
   	  <div class="code">keytool -genkey -alias &lt;aliasname&gt; -keyalg RSA -validity 800 -keystore $JAVA_HOME/lib/security/cacerts</div>
344
     where &lt;aliasname&gt; is a unique name that you choose for this cert.  Something like "&lt;hostname&gt;-tomcat"
345
     might be appropriate, where &lt;hostname&gt; is the name of this host.</li>
346
    </ul>
347
  </li>
348
  <li>
349
    Password - keytool will ask for a password.  If this is a pre-existing keystore, you will need
350
    to know its password to modify it.  If you are creating a new keystore, the password you enter
351
    will become the keystore password.
352
  </li>
353
  <li>Sample values when creating certificate</li>
354
    <ul class="list2">
355
    <li>What is your first and last name? <b>myserver.nceas.ucsb.edu </b>
356
        (note: use the host name without port number)</li>
357
    <li>What is the name of your organizational unit? <b>NCEAS</b></li>
358
    <li>What is the name of your organization? <b>UCSB</b></li>
359
    <li>What is the name of your City or Locality? <b>Santa Barbara</b></li>
360
    <li>What is the name of your State or Province? <b>California</b> 
361
        (note: this is spelled in full)</li>
362
    <li>What is the two-letter country code for this unit? <b>US</b></li>
363
    </ul>
364
  <li>Generate certificate - this will pull the certificate you created from the cacerts file
365
      and put it into a local file
366
    <ul class="list2">
367
    <li>Run the command:
368
      <div class="code">keytool -export -alias &lt;aliasname&gt; -file &lt;outputfile&gt;.cert -keystore $JAVA_HOME/lib/security/cacerts</div>
369
      where &lt;aliasname&gt; is the same name you used when you created the certificate.  </li>
370
    <li>A file named &lt;outputfile&gt;.cert will be created in the same directory where you run the keytool 
371
      command.  You can name the output file anything you like, but keep in mind that it will get sent to the 
372
      partner machine used for replication.  The filename should have enough meaning that someone who sees 
373
      it on that machine can have some idea where it came from.  Again, something like "&lt;hostname&gt;-tomcat.cert"
374
      will suffice.</li>   
375
    </ul>
376
  </li>
377
  <li>Enable SSL in Tomcat 
378
    <ul class="list2">
379
    <li>Edit the Tomcat server file at $TOMCAT_HOME/conf/server.xml</li>
380
    <li>
381
      uncomment the section that starts with "&lt;Connector port="8443" ... (Note: XML comments start with
382
      &lt;!-- and end with --&gt;).
383
    </li>
384
    <li>add two attributes to that section that read:
385
  	  <div class="code">keystoreFile="&lt;JAVA_HOME&gt;/lib/security/cacerts"</div>
386
  	  <div class="code">keystorePass="&lt;keystore_password&gt;"</div>
387
  	  where &lt;JAVA_HOME&gt; should be the actual java path and &lt;keystore_password&gt; should be the 
388
  	  password you used when you created the keystore.
389
  	</li>
390
  	</ul>
391
  </li>
392
  </ul>  
393
    
394
  <a name="GenerateCertApache"></a><div class="header3">Generate Certificate for Apache/Tomcat</div>
395
  <ul class="list1">
396
  <li>Generate keys using openssl
397
    <ul class="list2">
398
    <li>Run the command: 
399
   	  <div class="code">   openssl req -new -out REQ.pem -keyout &lt;hostname&gt;-apache.key</div>
400
    </li>
401
    </ul>
402
  </li>
403
  <li>Sample values when creating certificate
404
    <ul class="list2">
405
    <li>Enter PEM pass phrase: (note: I use the first part of the host name)</li>
406
    <li>Country Name (2 letter code) [AU]: <b>US</b></li>
407
    <li>State or Province Name (full name) [Some-State]: <b>California</b> 
408
        (note: this is spelled in full)</li>
409
    <li>Locality Name (eg, city) []: <b>Santa Barbara</b></li>
410
    <li>Organization Name (eg, company) [Internet Widgits Pty Ltd]: <b>UCSB</b></li>
411
    <li>Organizational Unit Name (eg, section) []: <b>NCEAS</b></li>
412
    <li>Common Name (eg, YOUR name) []: <b>myserver.mydomain.edu</b>
413
        (note: use the host name without port number)</li>
414
    <li>Email Address []:  <b>administrator@mydomain.edu</b></li>
415
    <li>A challenge password []: (note: leave blank)</li>
416
    <li>An optional company name []: (note: leave blank)</li>
417
    </ul>
418
  </li>    
419
  <li>Generate certificate - this will create a local file with your certificate
420
    <ul class="list2">
421
    <li>Run the command:
422
      <div class="code">openssl req -x509 -days 800 -in REQ.pem -key &lt;hostname&gt;-apache.key -out &lt;hostname&gt;-apache.crt</div>
423
      where &lt;hostname&gt; is the same name you used when you generated the key.  </li>
424
    <li>A file named &lt;hostname&gt;-apache.crt will be created in the same directory where you run the openssl 
425
      command.  You can name the output file anything you like, but keep in mind that it will get sent to the 
426
      partner machine used for replication.  The filename should have enough meaning that someone who sees 
427
      it on that machine can have some idea where it came from.  Again, something like "&lt;hostname&gt;-apache.crt"
428
      will suffice.</li>   
429
    </ul>
430
  </li>   
431
  <li>Enter the certificate into the Apache security configuration - you need to register the certificate
432
      in the local Apache instance.  Note that the security files may be in a different place depending
433
      on how you installed Apache.</li>
434
    <ul class="list2">
435
    <li>Copy the certificate and key file to the apache ssl directories and enable ssl.</li>
436
    <li>For Ubuntu/Debian based systems:
437
      <ul class="list3">
438
      <li>sudo cp &lt;hostname&gt;-apache.crt /etc/ssl/certs</li>
439
      <li>sudo cp &lt;hostname&gt;-apache.key /etc/ssl/private</li>
440
      <li>As root edit /etc/apache2/sites-available/default.  In the VirtualHost section
441
          after the DocumentRoot line, add:<br>
442
          SSLEngine on<br>
443
          SSLOptions +FakeBasicAuth +ExportCertData +CompatEnvVars +StrictRequire<br>
444
          SSLCertificateFile /etc/ssl/certs/&lt;hostname&gt;-apache.crt<br>
445
          SSLCertificateKeyFile /etc/ssl/private/&lt;hostname&gt;-apache.key<br>
446
      </li>
447
      </ul>
448
    </li>  
449
    </ul>  
450
    <ul class="list2">
451
    <li>For other systems:
452
      <ul class="list3">
453
      <li>sudo cp &lt;hostname&gt;-apache.crt $APACHE_HOME/conf/ssl.crt</li>
454
      <li>sudo cp &lt;hostname&gt;-apache.key $APACHE_HOME/conf/ssl.key</li> 
455
      <li> ADD STEPS TO ENABLE SSL ON NON_DEBIAN SYSTEMS HERE</li>
456
      </ul>
457
    </li>  
458
    </ul>    	  	        
459
  <li>scp &lt;hostname&gt;-apache.crt to the replication partner machine.</li>
460
  </ul>  
461
  
462
  <a name="RegisterPartner"></a><div class="header2">Register the partner machine's certificate.</div>   
463
  At this point, you have created a certificate for each replication server and 
464
  scp-ed them across to each other.  Now you need to import the remote server's
465
  certificate on the local machine.  Perform the following steps for each 
466
  replication server.
467
  <ul class="list1">
468
  <li>Import the remote certificate by running:
469
    <div class="code">keytool -import -alias &lt;remotehostalias&gt; -file &lt;remotehostfilename&gt;.crt -keystore $JAVA_HOME/jre/lib/security/cacerts</div>
470
    where the &lt;remotehostfilename&gt; is the certificate file you created on the remote machine and
471
    copied to this machine.  The &lt;remotehostalias&gt; is the name the certificate will use in
472
    the keystore.  It should be something that identifies the remote host.  
473
  </li>
474
  <li>Restart Apache and Tomcat on both replication machines</li>
475
  </ul>
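  <p>For the Tomcat standalone case, the three keytool invocations described above
  (generate, export, import) can be assembled programmatically. The Python sketch below
  only builds and prints the command lines; it does not run them. The function name
  <code>keytool_commands</code>, the aliases, the file names and the keystore path are
  hypothetical values chosen for illustration, and only flags that appear in the commands
  above are used.</p>

```python
import shlex


def keytool_commands(alias, cert_file, remote_alias, remote_cert_file, keystore):
    """Build the generate, export and import command lines as argument lists."""
    generate = ["keytool", "-genkey", "-alias", alias, "-keyalg", "RSA",
                "-validity", "800", "-keystore", keystore]
    export = ["keytool", "-export", "-alias", alias,
              "-file", cert_file, "-keystore", keystore]
    import_remote = ["keytool", "-import", "-alias", remote_alias,
                     "-file", remote_cert_file, "-keystore", keystore]
    return generate, export, import_remote


# Hypothetical values: substitute your own alias, file names and keystore path.
generate, export, import_remote = keytool_commands(
    alias="gamma-tomcat",
    cert_file="gamma-tomcat.cert",
    remote_alias="lamda-tomcat",
    remote_cert_file="lamda-tomcat.cert",
    keystore="/opt/java/lib/security/cacerts",
)

for command in (generate, export, import_remote):
    print(shlex.join(command))
```

  <p>Keeping the keystore path as a single parameter makes it harder to generate a key
  in one keystore and import the partner certificate into another by accident.</p>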
476

  
477
  <a href="./packages.html">Back</a> | <a href="./metacattour.html">Home</a> | 
478
  <a href="./datafiles.html">Next</a>
480
  
481

  
482
</BODY>
483
</HTML>

  
docs/user/pagedreturn.html
1
<!--
2
  *      Authors: Chad Berkley
3
  *    Copyright: 2000 Regents of the University of California and the
4
  *               National Center for Ecological Analysis and Synthesis
5
  *  For Details: http://www.nceas.ucsb.edu/
6
  *      Created: 2007 November 5
7
  *      Version: 
8
  *    File Info: '$Id$'
9
  * 
10
  * 
11
-->
12
<HTML>
13
<HEAD>
14
<TITLE>Metacat</TITLE>
15
<link rel="stylesheet" type="text/css" href="./default.css">
16
</HEAD> 
17
<BODY>
18
  <table width="100%">
19
    <tr>
20
      <td class="tablehead" colspan="2"><p class="label">Paged Query Returns</p></td>
21
      <td class="tablehead" colspan="2" align="right">
22
          <a href="./spatial_option.html">Back</a> | <a href="./metacattour.html">Home</a> | 
23
          <a href="./sitemaps.html">Next</a>
24
      </td>
25
    </tr>
26
  </table>
27
  
28
  <p>Metacat allows result sets to be broken up into "pages" to aid in the loading
29
  of large result sets and to make the result sets more readable to users.  This is
30
  facilitated by the addition of two variables to the parameter list.  Pagesize
31
  indicates how many results should be returned for a given page.  Pagestart
32
  indicates which page you are currently viewing.  These parameters can be
33
  passed to Metacat when a query is submitted.  </p>
34
  
35
  <p>When a paged query is performed, the resultset of that query contains
36
  four extra fields.  Pagestart and pagesize are contained in the XML resultset
37
  along with nextpage and previouspage.  This allows the XSLT transformation to
38
  include navigational links in the rendered resultset.  Here's an example 
39
  XML resultset.</p>
40
  
41
  <pre>
42
    &lt;resultset&gt;
43
      &lt;pagestart&gt;1&lt;/pagestart&gt;
44
      &lt;pagesize&gt;10&lt;/pagesize&gt;
45
      &lt;nextpage&gt;2&lt;/nextpage&gt;
46
      &lt;previouspage&gt;0&lt;/previouspage&gt;
47
      &lt;query&gt;
48
      ...
49
      &lt;/query&gt;
50
      &lt;document&gt;...&lt;/document&gt;
51
      &lt;document&gt;...&lt;/document&gt;
52
    &lt;/resultset&gt;
53
  </pre>
54
  
55
  <p>In the case of the resultset, pagestart will always indicate the page you
56
  are currently viewing.</p>
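  <p>A client that is not using XSLT can read the same four fields directly. The
  following Python sketch is an assumption-laden illustration rather than Metacat code:
  it uses the standard library XML parser, the function name <code>read_paging</code> is
  hypothetical, and the sample document simply mirrors the example resultset above.</p>

```python
import xml.etree.ElementTree as ET

# Sample document mirroring the example resultset; "..." stands for elided content.
SAMPLE = """<resultset>
  <pagestart>1</pagestart>
  <pagesize>10</pagesize>
  <nextpage>2</nextpage>
  <previouspage>0</previouspage>
  <query>...</query>
  <document>...</document>
  <document>...</document>
</resultset>"""

PAGING_FIELDS = ("pagestart", "pagesize", "nextpage", "previouspage")


def read_paging(xml_text):
    """Return the four paging fields of a resultset as integers."""
    root = ET.fromstring(xml_text)
    return {name: int(root.findtext(name)) for name in PAGING_FIELDS}


print(read_paging(SAMPLE))
```

  <p>The sketch assumes all four fields are present, as they are in a paged query; a
  resultset without them would need extra handling.</p>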
57
  
58
  <p>Here's an example of the rendered result.</p>
59
  
60
  <img src="images/pagedquery.jpg"/>
61
  
62
  <p>The links to the previous and next pages look like this.</p>
63
  
64
  <pre>
65
   &lt;a href="metacat?action=query&amp;operator=INTERSECT&amp;enableediting=false&amp;anyfield=actor&amp;qformat=kepler&amp;pagestart=0&amp;pagesize=10"&gt;Previous Page&lt;/a&gt;
66
   &lt;a href="metacat?action=query&amp;operator=INTERSECT&amp;enableediting=false&amp;anyfield=actor&amp;qformat=kepler&amp;pagestart=2&amp;pagesize=10"&gt;Next Page&lt;/a&gt;
67
  </pre>
68
  
69
  <p>The important bits of these links are the incremented pagestart parameters.  
70
  The rendered page is on page 1 so the previous page is 0 and the next
71
  page is 2.  By sequencing through the pages, the user can see the
72
  entire resultset.  </p>
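  <p>The links above differ only in their pagestart value, so they can be generated
  from the fields of the resultset. The Python sketch below is one possible way to do
  this outside of XSLT; the function name <code>page_link</code> is hypothetical, and the
  query parameters are copied from the example links.</p>

```python
from urllib.parse import urlencode


def page_link(base, query_params, pagestart, pagesize):
    """Build a query URL for one page, keeping the original query parameters."""
    params = dict(query_params)
    params["pagestart"] = pagestart
    params["pagesize"] = pagesize
    return base + "?" + urlencode(params)


# Query parameters taken from the example links above.
QUERY = {
    "action": "query",
    "operator": "INTERSECT",
    "enableediting": "false",
    "anyfield": "actor",
    "qformat": "kepler",
}

# previouspage and nextpage as they appear in the example resultset.
print(page_link("metacat", QUERY, 0, 10))
print(page_link("metacat", QUERY, 2, 10))
```

  <p>The returned string contains plain ampersands; it still has to be escaped as
  &amp;amp; when it is written into an HTML attribute, as in the links shown above.</p>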
73
  
74
  <p>The Kepler skin in lib/style/skins/kepler is a good example of how to
75
  render the resultset with paged query returns.  The XSLT uses the four extra 
76
  fields in the resultset to render the next and previous links.  This could
77
  also be used within a non-web application.
78
  </p>
79
  
80
  <br>
81
  <a href="./spatial_option.html">Back</a> | <a href="./metacattour.html">Home</a> | 
82
  <a href="./sitemaps.html">Next</a>
83
  
84

  
85
</BODY>
86
</HTML>

  
docs/user/metacatload.html
1
<!--
2
  * metacatload.html
3
  *
4
  *      Authors: Jivka Bojilova
5
  *    Copyright: 2000 Regents of the University of California and the
6
  *               National Center for Ecological Analysis and Synthesis
7
  *  For Details: http://www.nceas.ucsb.edu/
8
  *      Created: 2000 April 5
9
  *      Version: 0.01
10
  *    File Info: '$Id$'
11
  * 
12
  * October Meeting SDSC, 2000
13
-->
14
<HTML>
15
<HEAD>
16
<TITLE>Metacat</TITLE>
17
<link rel="stylesheet" type="text/css" href="./default.css">
18
</HEAD> 
19
<BODY>
20
  <table width="100%">
21
    <tr>
22
      <td class="tablehead" colspan="2"><p class="label">Document Loading</p></td>
23
      <td class="tablehead" colspan="2" align="right">
24
        <a href="./clientapi.html">Back</a> | <a href="./metacattour.html">Home</a> | 
25
        <a href="./metacatquery.html">Next</a>
26
      </td>
27
    </tr>
28
  </table>
29
  <P>Metacat provides functionality for inserting, updating, and deleting
30
  XML documents in the database. Inserted or updated documents are read, checked 
31
  for validity, decomposed into nodes and inserted into the db. Document validity 
32
  is checked if a valid DTD is provided.
33
  <P> <img alt="architecture diagram  of write action" src="metadatawrite.gif">
34
  <P> <b>Operations</b>
35
  <ul>
36
    <li>INSERT - A new XML document is inserted into the DB with given unique
37
    docid. The client must specify the docid.</li>
38
    <li>UPDATE - A document is provided to update a document that already
39
    has a valid docid.  The original document is archived, then overwritten.</li>
40
    <li>DELETE - Document is archived, and pointer in xml_documents is moved
41
    to xml_revisions, effectively deleting the document from public view but
42
    preserving the revision for the revision history.</li>
43
  </ul>
44
  
45
  <p>Insertions, updates and deletes are passed to Metacat as servlet parameters.
46
  The following is an example of a web form that can perform these tasks.</p>
47
  
48
  <pre>
49
    &lt;html&gt;
50
    &lt;head&gt;
51
    &lt;title&gt;MetaCat&lt;/title&gt;
52
    &lt;link rel="stylesheet" type="text/css" href="metacat/style/rowcol.css"&gt;
53
    &lt;/head&gt;
54
    &lt;body class="emlbody"&gt;
55
    &lt;b&gt;MetaCat XML Loader&lt;/b&gt;
56
    &lt;p&gt;
57
    Upload, Change, or Delete an XML document using this form.
58
    &lt;/p&gt;
59
    &lt;form action="http://dev.nceas.ucsb.edu/metacat/servlet/metacat" method="POST"&gt;
60
      &lt;strong&gt;1. Choose an action: &lt;/strong&gt;
61
      &lt;input type="radio" name="action" value="insert" checked&gt; Insert
62
      &lt;input type="radio" name="action" value="update"&gt; Update
63
      &lt;input type="radio" name="action" value="delete"&gt; Delete
64
      &lt;input type="submit" value="Process Action"&gt;
65
      &lt;br /&gt;
66
      &lt;strong&gt;2. Provide a Document ID &lt;/strong&gt;
67
      &lt;input type="text" name="docid"&gt; (optional for Insert)
68
      &nbsp;&nbsp;&nbsp;&lt;input type="checkbox" name="public" value="yes" checked&gt;&lt;strong&gt;Public Document&lt;/strong&gt;
69
      &lt;br /&gt;
70
      &lt;strong&gt;3. Provide XML text &lt;/strong&gt; (not needed for Delete)
71
      &lt;textarea name="doctext" cols="65" rows="15"&gt;&lt;/textarea&gt;
72
      &lt;strong&gt;4. Provide DTD text for upload &lt;/strong&gt; (optional; not needed for Delete)
73
      &lt;textarea name="dtdtext" cols="65" rows="15"&gt;&lt;/textarea&gt;
74
    &lt;/form&gt;
75
    &lt;/body&gt;
76
    &lt;/html&gt;
77
  </pre>
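  <p>The same servlet parameters can also be sent without a browser. The following
  Python sketch only builds the POST request; it does not send it. The helper name
  build_request is hypothetical, the servlet URL is simply copied from the form
  above, and whether the server requires a logged-in session first is not covered
  here.</p>

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: same servlet URL as the form above; replace with your own server.
METACAT_URL = "http://dev.nceas.ucsb.edu/metacat/servlet/metacat"


def build_request(action, docid, doctext=None, dtdtext=None, public=True):
    """Build (but do not send) a POST request mirroring the form fields."""
    if action not in ("insert", "update", "delete"):
        raise ValueError("action must be insert, update or delete")
    params = {"action": action, "docid": docid}
    if public:
        params["public"] = "yes"  # same value as the form's checkbox
    if action != "delete":
        if doctext is None:
            raise ValueError("doctext is required for insert and update")
        params["doctext"] = doctext
        if dtdtext:
            params["dtdtext"] = dtdtext
    body = urlencode(params).encode("utf-8")
    return Request(METACAT_URL, data=body, method="POST")


req = build_request("insert", "NCEAS:1", doctext="<eml-dataset/>")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

  <p>Passing the request to urllib.request.urlopen would submit it to the server.</p>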
78
  
79
  <p>Once inserted into the database, the document looks like the following:</p>
80
  
81
  <PRE><b>SQL> select * from xml_nodes where docid='NCEAS:1'</b> <br>
82
NODEID NODEINDEX NODETYPE   NODENAME        NODEDATA               PARENTNODEID ROOTNODEID DOCID  
83
------ --------- ---------- --------------- ---------------------- ------------ ---------- -------
84
     1           DOCUMENT   eml-dataset                                                  1 NCEAS:1
85
     2         1 ELEMENT    eml-dataset                                       1          1 NCEAS:1
86
     3         1 TEXT                                                         2          1 NCEAS:1
87

  
88

  
89
     4         2 ELEMENT    meta_file_id                                      2          1 NCEAS:1
90
     5         1 TEXT                       NCEAS:1                           4          1 NCEAS:1
91
     6         3 TEXT                                                         2          1 NCEAS:1
92

  
93

  
94
     7         4 ELEMENT    dataset_id                                        2          1 NCEAS:1
95
     8         1 TEXT                       Dist.ssd01                        7          1 NCEAS:1
96
     9         5 TEXT                                                         2          1 NCEAS:1
97

  
98

  
99
    10         6 ELEMENT    title                                             2          1 NCEAS:1
100
    11         1 TEXT                       Insights on community            10          1 NCEAS:1
101
                                            dynamics
102

  
103
    12         7 TEXT                                                         2          1 NCEAS:1
104

  
105

  
106
    13         8 ELEMENT    originator                                        2          1 NCEAS:1
107
    14         1 ATTRIBUTE  description     Names and addresses of           13          1 NCEAS:1
108
                                            principal investigator
109
  </PRE>
110
  
111
  <p>If you follow the parentnodeid pointers you can reconstruct this document.
112
  The <a href="./metacatdb.html">Metacat Database</a> section provides more details
113
  on the storage of XML documents.</p>
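  <p>As an illustration of following the parentnodeid pointers, the Python sketch
  below rebuilds an XML string from rows shaped like the query output above. It is
  not Metacat's own code: the tuple layout and the function name rebuild are
  assumptions, and the whitespace-only TEXT rows are left out for brevity.</p>

```python
from collections import defaultdict
from xml.sax.saxutils import escape, quoteattr

# (nodeid, nodeindex, nodetype, nodename, nodedata, parentnodeid)
rows = [
    (1, None, "DOCUMENT", "eml-dataset", None, None),
    (2, 1, "ELEMENT", "eml-dataset", None, 1),
    (4, 2, "ELEMENT", "meta_file_id", None, 2),
    (5, 1, "TEXT", None, "NCEAS:1", 4),
    (7, 4, "ELEMENT", "dataset_id", None, 2),
    (8, 1, "TEXT", None, "Dist.ssd01", 7),
    (10, 6, "ELEMENT", "title", None, 2),
    (11, 1, "TEXT", None, "Insights on community dynamics", 10),
    (13, 8, "ELEMENT", "originator", None, 2),
    (14, 1, "ATTRIBUTE", "description",
     "Names and addresses of principal investigator", 13),
]


def rebuild(rows):
    """Rebuild an XML string by grouping rows under their parent node."""
    children = defaultdict(list)
    root = None
    for row in rows:
        if row[2] == "DOCUMENT":
            root = row[0]
        else:
            children[row[5]].append(row)

    def render(row):
        nodeid, _, nodetype, name, data, _ = row
        if nodetype == "TEXT":
            return escape(data or "")
        # Children are ordered by nodeindex within their parent.
        kids = sorted(children[nodeid], key=lambda r: r[1])
        attrs = "".join(
            f" {k[3]}={quoteattr(k[4] or '')}"
            for k in kids if k[2] == "ATTRIBUTE"
        )
        body = "".join(render(k) for k in kids if k[2] != "ATTRIBUTE")
        return f"<{name}{attrs}>{body}</{name}>"

    top = sorted(children[root], key=lambda r: r[1])
    return "".join(render(r) for r in top)


print(rebuild(rows))
```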
114

  
115
  <br>
116
  <a href="./clientapi.html">Back</a> | <a href="./metacattour.html">Home</a> | 
117
  <a href="./metacatquery.html">Next</a>
118
  
119
</BODY>
120
</HTML>
121

  
122 0

  
docs/user/metacatquery.html
1
<!--
2
  * metacatquery.html
3
  *
4
  *      Authors: Jivka Bojilova
5
  *    Copyright: 2000 Regents of the University of California and the
6
  *               National Center for Ecological Analysis and Synthesis
7
  *  For Details: http://www.nceas.ucsb.edu/
8
  *      Created: 2000 April 5
9
  *      Version: 0.01
10
  *    File Info: '$Id$'
11
  * 
12
  * October Meeting SDSC, 2000
13
-->
14
<HTML>
15
<HEAD>
16
<TITLE>Metacat</TITLE>
17
<link rel="stylesheet" type="text/css" href="./default.css">
18
</HEAD> 
19
<BODY>
20
  <table width="100%">
21
    <tr>
22
      <td class="tablehead" colspan="2"><p class="label">Queries and Results</p></td>
23
      <td class="tablehead" colspan="2" align="right">
24
        <a href="./metacatload.html">Back</a> | <a href="./metacattour.html">Home</a> | 
25
        <a href="./metacatread.html">Next</a>
26
      </td>
27
    </tr>
28
  </table>
29
  <P>The Metacat Server provides 
30
  an interface for searching metadata stored in the Metacat database.
31
  <P> <img alt="architecture diagram of a Metacat query" src="metadataquery.gif">
32
  <br><br><b>Steps to perform a query in Metacat</b>
33
  <ol>
34
    <li>A pathquery document is created from the search criteria provided through
35
    the servlet parameters.</li>  
36
    <li>This pathquery document is sent to DBQuery where it 
37
    is processed and translated into SQL statements.</li>
38
    <li>The SQL statements are executed against the database and the resultsets
39
    are translated into an xml document of doctype "resultset".</li>
40
    <li>The resultset document is either returned directly to the client as XML
41
    or is transformed through XSLT and returned as HTML.</li>
42
  </ol>
43
  
44
  <b>The Pathquery Document</b>
45
  <pre>
46
   &lt;pathquery version="1.0"&gt;
47
      &lt;meta_file_id&gt;unspecified&lt;/meta_file_id&gt;
48
      &lt;querytitle&gt;unspecified&lt;/querytitle&gt;
49
      &lt;returnfield&gt;dataset/title&lt;/returnfield&gt;
50
      &lt;returnfield&gt;keyword&lt;/returnfield&gt;
51
      &lt;returnfield&gt;originator/individualName/surName&lt;/returnfield&gt;
52
      &lt;returndoctype&gt;eml://ecoinformatics.org/eml-2.0.1&lt;/returndoctype&gt;
53
      &lt;returndoctype&gt;eml://ecoinformatics.org/eml-2.0.0&lt;/returndoctype&gt;
54
      &lt;querygroup operator="UNION"&gt;
55
        &lt;queryterm casesensitive="false" searchmode="contains"&gt;
56
          &lt;value&gt;Plant&lt;/value&gt;
57
          &lt;pathexpr&gt;dataset/title&lt;/pathexpr&gt;
58
        &lt;/queryterm&gt;
59
        &lt;queryterm casesensitive="false" searchmode="contains"&gt;
60
          &lt;value&gt;plant&lt;/value&gt;
61
          &lt;pathexpr&gt;keyword&lt;/pathexpr&gt;
62
        &lt;/queryterm&gt;
63
      &lt;/querygroup&gt;
64
  &lt;/pathquery&gt;
65
  </pre>
66
  
67
  <p>The pathquery document was designed to be flexible enough to query specific
68
  fields of any XML document.  It also allows the client to specify which fields
69
  from a returned document are returned in the initial resultset.  Each
70
  &lt;returnfield&gt; parameter specifies a field which the DB will return
71
  for any query hit.  The returndoctype field allows the client to limit the 
72
  type of documents to be returned.  If no returndoctype element is present, all document types are returned.
73
  A &lt;querygroup&gt; creates an AND or an OR statement of the &lt;queryterm&gt;s
74
  in the group.  The operator can be UNION or INTERSECT.  A &lt;queryterm&gt;
75
  defines the actual field against which the query is being performed.  The value
76
  of the queryterm that we are querying for is encased in &lt;value&gt; tags.
77
  The &lt;pathexpr&gt; tag specifies an exact path to which you want to restrict
78
  the search.  A &lt;pathexpr&gt; tag which contains the keyword returndoc is 
79
  a special case which is discussed in <a href="./packages.html">Packages and 
80
  Relations</a>.</p><br>
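  <p>A pathquery document like the one above can also be assembled in code. The
  Python sketch below uses the standard xml.etree.ElementTree module. The function
  name build_pathquery and its arguments are hypothetical; the element and
  attribute names are taken from the example above.</p>

```python
import xml.etree.ElementTree as ET


def build_pathquery(terms, returnfields, returndoctypes=(), operator="UNION"):
    """Build a pathquery document.

    terms is a list of (value, pathexpr) pairs; pathexpr may be None.
    """
    if operator not in ("UNION", "INTERSECT"):
        raise ValueError("operator must be UNION or INTERSECT")
    root = ET.Element("pathquery", version="1.0")
    ET.SubElement(root, "meta_file_id").text = "unspecified"
    ET.SubElement(root, "querytitle").text = "unspecified"
    for field in returnfields:
        ET.SubElement(root, "returnfield").text = field
    for doctype in returndoctypes:
        ET.SubElement(root, "returndoctype").text = doctype
    group = ET.SubElement(root, "querygroup", operator=operator)
    for value, pathexpr in terms:
        term = ET.SubElement(group, "queryterm",
                             casesensitive="false", searchmode="contains")
        ET.SubElement(term, "value").text = value
        if pathexpr:
            ET.SubElement(term, "pathexpr").text = pathexpr
    return ET.tostring(root, encoding="unicode")


xml_text = build_pathquery(
    terms=[("Plant", "dataset/title"), ("plant", "keyword")],
    returnfields=["dataset/title", "keyword",
                  "originator/individualName/surName"],
    returndoctypes=["eml://ecoinformatics.org/eml-2.0.1",
                    "eml://ecoinformatics.org/eml-2.0.0"],
)
print(xml_text)
```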
81
  
82
  <b>The Resultset Document</b><br>
83
  
84
  <p>When the pathquery document is submitted and processed, Metacat returns
85
  another XML document called a resultset document.</p>
86
  
87
  <pre>
88
      &lt;resultset&gt;
89
        &lt;query&gt;
90
          &lt;pathquery version="1.0"&gt;
91
            &lt;meta_file_id&gt;unspecified&lt;/meta_file_id&gt;
92
            &lt;querytitle&gt;unspecified&lt;/querytitle&gt;
93
            &lt;returnfield&gt;dataset/title&lt;/returnfield&gt;
94
            &lt;returnfield&gt;keyword&lt;/returnfield&gt;
95
            &lt;returnfield&gt;originator/individualName/surName&lt;/returnfield&gt;
96
            &lt;returndoctype&gt;eml://ecoinformatics.org/eml-2.0.1&lt;/returndoctype&gt;
97
            &lt;returndoctype&gt;eml://ecoinformatics.org/eml-2.0.0&lt;/returndoctype&gt;
98
            &lt;querygroup operator="UNION"&gt;
99
                  &lt;queryterm casesensitive="false" searchmode="contains"&gt;
100
                      &lt;value&gt;Datos&lt;/value&gt;
101
                      &lt;pathexpr&gt;dataset/title&lt;/pathexpr&gt;
102
                  &lt;/queryterm&gt;
103
                  &lt;queryterm casesensitive="false" searchmode="contains"&gt;
104
                     &lt;value&gt;plant&lt;/value&gt;
105
                     &lt;pathexpr&gt;keyword&lt;/pathexpr&gt;
106
                  &lt;/queryterm&gt;
107
           &lt;/querygroup&gt;
108
         &lt;/pathquery&gt;
109
        &lt;/query&gt;  
110
      
111
        &lt;document&gt;
112
          &lt;docid&gt;nceas.44.1&lt;/docid&gt;
113
          &lt;docname&gt;resource&lt;/docname&gt;
114
          &lt;doctype&gt;eml://ecoinformatics.org/eml-2.0.1&lt;/doctype&gt;
115
          &lt;createdate&gt;2001-01-12 16:12:06.0&lt;/createdate&gt;
116
          &lt;updatedate&gt;2001-01-12 16:12:06.0&lt;/updatedate&gt;
117
          &lt;param name="dataset/title"&gt;Datos Meteorologicos&lt;/param&gt;
118
          &lt;param name="keyword"&gt;intertidal&lt;/param&gt;
119
          &lt;param name="originator/individualName/surName"&gt;Smith&lt;/param&gt;
120
        &lt;/document&gt;  
121
        
122
        &lt;document&gt;
123
          &lt;docid&gt;nceas.42.1&lt;/docid&gt;
124
          &lt;docname&gt;resource&lt;/docname&gt;
125
          &lt;doctype&gt;eml://ecoinformatics.org/eml-2.0.1&lt;/doctype&gt;
126
          &lt;createdate&gt;2001-01-12 16:11:31.0&lt;/createdate&gt;
127
          &lt;updatedate&gt;2001-01-12 16:11:31.0&lt;/updatedate&gt;
128
          &lt;param name="dataset/title"&gt;Ocean Surface Temperature&lt;/param&gt;
129
          &lt;param name="keyword"&gt;Plant&lt;/param&gt;
130
          &lt;param name="originator/individualName/surName"&gt;Henry&lt;/param&gt;   
131
       &lt;/document&gt;
132
      .....  
133
      &lt;/resultset&gt;
134
    
135
  </pre>
136
  <p>The first element in the resultset is &lt;query&gt;.  Its content is just 
137
  the pathquery document.  The resultset always returns 
138
  the pathquery document that created it in the &lt;query&gt; tag.  The next
139
  major tag is &lt;document&gt;.  Each XML document returned by the query
140
  is represented by a &lt;document&gt; tag.  The default document information returned
141
  is docid, docname, doctype, doctitle, createdate and  
142
  updatedate.  The param tags are present if the document found contained
143
  the returnfield chosen in the pathquery document.  The name attribute of the
144
  param tag is the full path to the node specified by the returnfield. </p>
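  <p>A client that receives the resultset as XML might read it roughly as follows.
  This Python sketch is an assumption about how a client could be written, not part
  of Metacat. The sample is a shortened copy of the resultset above, and
  parse_resultset is a hypothetical helper name.</p>

```python
import xml.etree.ElementTree as ET

SAMPLE = """<resultset>
  <document>
    <docid>nceas.44.1</docid>
    <docname>resource</docname>
    <doctype>eml://ecoinformatics.org/eml-2.0.1</doctype>
    <param name="dataset/title">Datos Meteorologicos</param>
    <param name="keyword">intertidal</param>
  </document>
  <document>
    <docid>nceas.42.1</docid>
    <docname>resource</docname>
    <doctype>eml://ecoinformatics.org/eml-2.0.1</doctype>
    <param name="dataset/title">Ocean Surface Temperature</param>
    <param name="keyword">Plant</param>
  </document>
</resultset>"""


def parse_resultset(xml_text):
    """Return one dict per <document>, with its params grouped by name."""
    root = ET.fromstring(xml_text)
    docs = []
    for doc in root.findall("document"):
        # Default fields such as docid, docname and doctype.
        record = {child.tag: child.text for child in doc if child.tag != "param"}
        # A returnfield may match more than once, so collect values in lists.
        params = {}
        for param in doc.findall("param"):
            params.setdefault(param.get("name"), []).append(param.text)
        record["params"] = params
        docs.append(record)
    return docs


docs = parse_resultset(SAMPLE)
for d in docs:
    print(d["docid"], d["params"].get("dataset/title"))
```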
145
  
146
  <br>
147
  <a href="./metacatload.html">Back</a> | <a href="./metacattour.html">Home</a> | 
148
  <a href="./metacatread.html">Next</a>
149

  
150
</BODY>
151
</HTML>
152

  
153 0

  
docs/user/packages.html
1
<!--
2
  * packages.html
3
  *
4
  *      Authors: Chad Berkley
5
  *    Copyright: 2000 Regents of the University of California and the
6
  *               National Center for Ecological Analysis and Synthesis
7
  *  For Details: http://www.nceas.ucsb.edu/
8
  *      Created: 2001 January 23
9
  *      Version: 
10
  *    File Info: '$ '
11
  * 
12
  * 
13
-->
14
<HTML>
15
<HEAD>
16
<TITLE>Metacat</TITLE>
17
<link rel="stylesheet" type="text/css" href="./default.css">
18
</HEAD> 
19
<BODY>
20
  <table width="100%">
21
    <tr>
22
      <td class="tablehead" colspan="2"><p class="label">Packages and Relationships</p></td>
23
      <td class="tablehead" colspan="2" align="right">
24
        <a href="./metacatapi.html">Back</a> | <a href="./metacattour.html">Home</a> | 
25
        <a href="./replication.html">Next</a>
26
      </td>
27
    </tr>
28
  </table>
29
  <p>Metacat allows a user to create a virtual link between XML documents within
30
  the system.  These links are called <i>Relationships</i> and are defined by triples
31
  in <i>eml-dataset</i>, <i>eml-literature</i> or <i>eml-software</i> files.  A relationship can be defined between two
32
  XML or <a href="./datafiles.html">non-XML</a> files. 
33
  The following is an example of an eml-dataset-2.0 file holding triples at the end:</p>
34
  
35
  <pre>
36
&lt;?xml version="1.0"?&gt;
37
&lt;!DOCTYPE dataset PUBLIC "-//NCEAS//eml-dataset-2.0//EN" "eml-dataset-2.0.dtd"&gt;
38
&lt;dataset&gt;
39
  &lt;identifier system="null"&gt;berkley.5.3&lt;/identifier&gt;
40
  &lt;shortName&gt;allsp&lt;/shortName&gt;
41
  &lt;title&gt;MARINE sampling data collected between spring 1992 and fall 1996&lt;/title&gt;
42
  &lt;originator&gt;
43
    &lt;individualName&gt;
44
      &lt;salutation&gt;Dr.&lt;/salutation&gt;
45
      &lt;givenName&gt;Peter&lt;/givenName&gt;
46
      &lt;surName&gt;Raimondi&lt;/surName&gt;
47
    &lt;/individualName&gt;
48
    &lt;organizationName&gt;UCSC&lt;/organizationName&gt;
49
    &lt;positionName&gt; &lt;/positionName&gt;
50
    &lt;address&gt;
51
      &lt;deliveryPoint&gt;Biology Dept.&lt;/deliveryPoint&gt;
52
      &lt;deliveryPoint&gt;A309 Earth and Marine Science Building&lt;/deliveryPoint&gt;
53
      &lt;city&gt;Santa Cruz&lt;/city&gt;
54
      &lt;administrativeArea&gt;CA&lt;/administrativeArea&gt;
55
      &lt;postalCode&gt;95060&lt;/postalCode&gt;
56
      &lt;country&gt;USA&lt;/country&gt;
57
    &lt;/address&gt;
58
    &lt;phone phonetype="voice"&gt;831-459-1234 x5674&lt;/phone&gt;
59
    &lt;electronicMailAddress&gt;raimondi@biology.ucsc.edu&lt;/electronicMailAddress&gt;
60
    &lt;onlineLink&gt; &lt;/onlineLink&gt;
61
    &lt;role&gt;Originator&lt;/role&gt;
62
  &lt;/originator&gt;
63
  &lt;pubdate&gt; &lt;/pubdate&gt;
64
  &lt;pubplace&gt; &lt;/pubplace&gt;
65
  &lt;series&gt; &lt;/series&gt;
66
  &lt;abstract&gt;
67
    &lt;paragraph&gt; &lt;/paragraph&gt;
68
  &lt;/abstract&gt;
69
  &lt;keywordSet&gt;
70
    &lt;keyword keywordType="null"&gt;intertidal&lt;/keyword&gt;
71
    &lt;keyword keywordType="null"&gt;santa barbara&lt;/keyword&gt;
72
    &lt;keyword keywordType="null"&gt;photoplot&lt;/keyword&gt;
73
    &lt;keyword keywordType="null"&gt;quadrat&lt;/keyword&gt;
74
    &lt;keywordThesaurus&gt; &lt;/keywordThesaurus&gt;
75
  &lt;/keywordSet&gt;
76
  &lt;additionalInfo&gt;
77
    &lt;paragraph&gt; &lt;/paragraph&gt;
78
  &lt;/additionalInfo&gt; <font color="red">
79
  &lt;triple&gt;
80
    &lt;subject&gt;berkley.6.1&lt;/subject&gt;
81
    &lt;relationship&gt;isRelatedTo&lt;/relationship&gt;
82
    &lt;object&gt;berkley.5.3&lt;/object&gt;
83
  &lt;/triple&gt;
84
  &lt;triple&gt;
85
    &lt;subject&gt;berkley.7.1&lt;/subject&gt;
86
    &lt;relationship&gt;isRelatedTo&lt;/relationship&gt;
87
    &lt;object&gt;berkley.6.1&lt;/object&gt;
88
  &lt;/triple&gt;
89
  &lt;triple&gt;
90
    &lt;subject&gt;berkley.8.1&lt;/subject&gt;
91
    &lt;relationship&gt;isRelatedTo&lt;/relationship&gt;
92
    &lt;object&gt;berkley.5.3&lt;/object&gt;
93
  &lt;/triple&gt;
94
  &lt;triple&gt;
95
    &lt;subject&gt;berkley.8.1&lt;/subject&gt;
96
    &lt;relationship&gt;isRelatedTo&lt;/relationship&gt;
97
    &lt;object&gt;berkley.6.1&lt;/object&gt;
98
  &lt;/triple&gt;
99
  &lt;triple&gt;
100
    &lt;subject&gt;berkley.8.1&lt;/subject&gt;
101
    &lt;relationship&gt;isRelatedTo&lt;/relationship&gt;
102
    &lt;object&gt;berkley.7.1&lt;/object&gt;
103
  &lt;/triple&gt;
104
  &lt;triple&gt;
105
    &lt;subject&gt;berkley.14.1&lt;/subject&gt;
106
    &lt;relationship&gt;isRelatedTo&lt;/relationship&gt;
107
    &lt;object&gt;berkley.6.1&lt;/object&gt;
108
  &lt;/triple&gt; </font>
109
  &lt;temporalCoverage&gt; 1992 to 1996&lt;/temporalCoverage&gt;
110
  &lt;geographicCoverage&gt; &lt;/geographicCoverage&gt;
111
  &lt;taxonomicCoverage&gt; &lt;/taxonomicCoverage&gt;
112
&lt;/dataset&gt;
113
  </pre>
114
  
115
  <b>Description of the Package File</b>
116
  <p>Note that the doctype of this document is an unregistered NCEAS-specific
117
  DTD (-//NCEAS//eml-dataset-2.0//EN).  The package doctype is an application 
118
  property of Metacat.  Setting this property (and others) is described in 
119
  <a href="./properties.html">Setting Metacat Properties</a>.  The package file
120
  contains <i>n</i> triples.  Each triple has a subject, relationship,
121
  and an object.  This grouping can be read as follows:  &lt;subject&gt; has
122
  &lt;relationship&gt; to &lt;object&gt;.  Each triple is a logical link
123
  between the subject and object with the relationship being a description of that
124
  link.</p>
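  <p>To make the subject, relationship and object reading concrete, the Python
  sketch below pulls the triples out of a shortened copy of the package file above.
  The function name read_triples is hypothetical and the sample keeps only the first
  two triples.</p>

```python
import xml.etree.ElementTree as ET

PACKAGE = """<dataset>
  <identifier system="null">berkley.5.3</identifier>
  <triple>
    <subject>berkley.6.1</subject>
    <relationship>isRelatedTo</relationship>
    <object>berkley.5.3</object>
  </triple>
  <triple>
    <subject>berkley.7.1</subject>
    <relationship>isRelatedTo</relationship>
    <object>berkley.6.1</object>
  </triple>
</dataset>"""


def read_triples(xml_text):
    """Return (subject, relationship, object) tuples from a package file."""
    root = ET.fromstring(xml_text)
    return [
        (t.findtext("subject"), t.findtext("relationship"), t.findtext("object"))
        for t in root.iter("triple")
    ]


triples = read_triples(PACKAGE)
for subject, relationship, obj in triples:
    # Reads as: <subject> has <relationship> to <object>.
    print(subject, relationship, obj)
```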
125
  <b>The Utility of Relations</b>
126
  <p>Relations become useful because many XML data schemas are broken up into 
127
  multiple DTDs.  Thus, there may be many different XML files that are all 
128
  related to each other yet are stored separately within the system.  Also, 
129
  since we, here at NCEAS, are developing Metacat for use as a metadata 
130
  repository for ecological data, we need some way of linking our metadata 
131
  to the datafiles that they describe.  Packages are the way we do this.</p>
132
  <b>Post Processed Relations</b>
133
  <p>The package file is inserted into Metacat as any other file is.  Its doctype
134
  is checked against the packagedoctype property in the <a href="properties.html">
135
  Metacat.properties file</a>.  If it is of that type, the file is sent
136
  to a postprocessor to be analyzed and inserted into the xml_relation table.  
137
  The table looks like the following:</p>
138
  
139
  <table border="1">
140
    <tr>
141
      <td>relationid</td><td>docid</td><td>packagetype</td>
142
      <td>subject</td><td>subjectdoctype</td>
143
      <td>relationship</td><td>object</td><td>objectdoctype</td>
144
    </tr>
145
    <tr>
146
      <td>1</td>
147
      <td>berkley.5</td>
148
      <td>-//NCEAS//eml-dataset-2.0//EN</td>
149
      <td>berkley.6.1</td>
150
      <td>null</td>
151
      <td>isRelatedTo</td>
152
      <td>berkley.5.3</td>
153
      <td>null</td>
154
    </tr>
155
    <tr>
156
      <td>2</td>
157
      <td>berkley.5</td>
158
      <td>-//NCEAS//eml-dataset-2.0//EN</td>
159
      <td>berkley.7.1</td>
160
      <td>null</td>
161
      <td>isRelatedTo</td>
162
      <td>berkley.6.1</td>
163
      <td>null</td>
164
    </tr>
165
    <tr>
166
      <td>3</td>
167
      <td>berkley.5</td>
168
      <td>-//NCEAS//eml-dataset-2.0//EN</td>
169
      <td>berkley.8.1</td>
170
      <td>null</td>
171
      <td>isRelatedTo</td>
172
      <td>berkley.5.3</td>
173
      <td>null</td>
174
    </tr>
175
    <tr>
176
      <td>4</td>
177
      <td>berkley.5</td>
178
      <td>-//NCEAS//eml-dataset-2.0//EN</td>
179
      <td>berkley.8.1</td>
180
      <td>null</td>
181
      <td>isRelatedTo</td>
182
      <td>berkley.6.1</td>
183
      <td>null</td>
184
    </tr>
185
    <tr>
186
      <td>5</td>
187
      <td>berkley.5</td>
188
      <td>-//NCEAS//eml-dataset-2.0//EN</td>
189
      <td>berkley.8.1</td>
190
      <td>null</td>
191
      <td>isRelatedTo</td>
192
      <td>berkley.7.1</td>
193
      <td>null</td>
194
    </tr>
195
    <tr>
196
      <td>6</td>
197
      <td>berkley.5</td>
198
      <td>-//NCEAS//eml-dataset-2.0//EN</td>
199
      <td>berkley.14.1</td>
200
      <td>null</td>
201
      <td>isRelatedTo</td>
202
      <td>berkley.6.1</td>
203
      <td>null</td>
204
    </tr>
205
  </table>
206
  
207
  <p>Once the system has processed the package file and inserted the relations
208
  into the xml_relation table, the file's relations are always returned with it
209
  in the <a href="./metacatquery.html">resultset</a> of a query.</p>
210

  
211
  <b>Package Views (formerly known as 'backtracking')</b>
212
  <p>Package View is a feature that was intentionally left out of the 
213
  <a href="./metacatquery.html">Queries and Results</a> section.  Package views 
214
  involve sending a doctype (called a returndoctype) along with a query request.  
215
  When there is a hit from that query, the system will check the doctype of the 
216
  hit document against the returndoctype.  If the doctypes do not match, 
217
  the system checks the xml_relation table to see if that document has been packed
218
  by a document of that doctype. If such a package document exists, it
219
  is returned instead of the one which was originally hit.  If no such
220
  package document exists, then the document which was originally hit is returned.  
221
  This allows a display system (such as a web browser) to try to display a 
222
  certain type of document.</p>
223
  <p>For example:  Take our package file from above.  Say we do a query for 
224
  "intertidal" which returns the document berkley.6 of type -//NCEAS//eml-entity-2.0//EN.
225
  However, we have set returndoctype equal to "-//NCEAS//eml-dataset-2.0//EN".
226
  When berkley.6 is hit, the system will check its package documents to see if 
227
  it is of type -//NCEAS//eml-dataset-2.0//EN.  Since it is, 
228
  (relationid 1, 2 and 4) document berkley.5 is returned instead of berkley.6.</p>
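  <p>The lookup described above can be sketched as follows. This is an
  illustration only, not Metacat's implementation: the tuple layout, the function
  names, and the way revision numbers are stripped before comparing docids are all
  assumptions. The relation rows are the ones from the table above that mention
  berkley.6, plus one that does not.</p>

```python
DATASET = "-//NCEAS//eml-dataset-2.0//EN"
ENTITY = "-//NCEAS//eml-entity-2.0//EN"

# (docid of package file, packagetype, subject, object)
RELATIONS = [
    ("berkley.5", DATASET, "berkley.6.1", "berkley.5.3"),
    ("berkley.5", DATASET, "berkley.7.1", "berkley.6.1"),
    ("berkley.5", DATASET, "berkley.8.1", "berkley.5.3"),
    ("berkley.5", DATASET, "berkley.8.1", "berkley.6.1"),
]


def without_revision(docid):
    """Assumption: docids look like scope.number[.revision]."""
    return ".".join(docid.split(".")[:2])


def resolve(hit_docid, hit_doctype, returndoctype, relations):
    """Return the docid to show for a query hit."""
    if returndoctype is None or hit_doctype == returndoctype:
        return hit_docid
    target = without_revision(hit_docid)
    for package_docid, packagetype, subject, obj in relations:
        if packagetype != returndoctype:
            continue
        if target in (without_revision(subject), without_revision(obj)):
            return package_docid  # a package of the requested type exists
    return hit_docid  # no such package: return the original hit


print(resolve("berkley.6", ENTITY, DATASET, RELATIONS))
```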
229
  <p>From a client the returndoctype is a servlet parameter.
230
  A URL with a returndoctype would look something like: </p>
231
  <pre>http://server.domain.com/Metacat?action=query&amp;anyfield=%&amp;qformat=html&amp;returndoctype=-//NCEAS//eml-dataset-2.0//EN</pre>
232
  <p>The system then inserts the returndoctype parameter value into a pathquery
233
  document as illustrated in <a href="./metacatquery.html">Queries and Results</a>.
234
  </p>
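  <p>When such a URL is built in code, it is safer to percent-encode the parameter
  values, since a literal % or / in a query string may not survive every client or
  proxy. The Python sketch below shows one way to do that with the standard
  library; the host name is the placeholder from the example above.</p>

```python
from urllib.parse import urlencode, urlsplit, parse_qs

BASE = "http://server.domain.com/Metacat"  # placeholder host from the example

params = {
    "action": "query",
    "anyfield": "%",
    "qformat": "html",
    "returndoctype": "-//NCEAS//eml-dataset-2.0//EN",
}

# urlencode percent-encodes reserved characters in each value.
url = BASE + "?" + urlencode(params)
print(url)

# Decoding the query string gives back the original values.
decoded = parse_qs(urlsplit(url).query)
print(decoded["returndoctype"][0])
```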
235
  
236
  
237
  <br>
238
  <a href="./metacatapi.html">Back</a> | <a href="./metacattour.html">Home</a> | 
239
  <a href="./replication.html">Next</a>
240
  
241

  
242
</BODY>
243
</HTML>
244 0

  
docs/user/metacat-apache-config.html
1
<!-- 
2
  *   '$RCSfile$'
3
  *     Purpose: web page describing the installation of Metacat
4
  *   Copyright: 2008 Regents of the University of California and the
5
  *               National Center for Ecological Analysis and Synthesis
6
  *     Authors: Chad Berkley
7
  *
8
  *    '$Author: daigle $'
9
  *    '$Date: 2008-11-24 11:57:40 -0800 (Mon, 24 Nov 2008) $'
10
  *    '$Revision: 4621 $'
11
  *
12
  *
13
  -->
14
  
15
<!DOCTYPE html PUBLIC "-//W3C//DTD html 4.0//EN">
16
<html>
17

  
18
<head>
19
  <title>Metacat Apache Configuration</title>
20
  <link rel="stylesheet" type="text/css" href="./common.css">
21
  <link rel="stylesheet" type="text/css" href="./default.css">
22
</head>
23

  
24
<body>
25

  
26
<table class="tabledefault" width="100%">
27
<tr><td rowspan="2"><img src="./images/KNBLogo.gif"></td>
28
<td colspan="7">
29
<div class="title">Metacat Apache Configuration</div>
30
</td>
31
</tr>
32
<tr>
33
  <td><a href="/" class="toollink"> KNB Home </a></td>
34
  <td><a href="/data.html" class="toollink"> Data </a></td>
35
  <td><a href="/people.html" class="toollink"> People </a></td>
36
  <td><a href="/informatics" class="toollink"> Informatics </a></td>
37
  <td><a href="/biodiversity" class="toollink"> Biocomplexity </a></td>
38
  <td><a href="/education" class="toollink"> Education </a></td>
39
  <td><a href="/software" class="toollink"> Software </a></td>
40
</tr>
41
</table>
42
<hr>
43

  
44
<div class="header1">Table of Contents</div>
45
  <div class="toc">
46
    <div class="toc1"><a href="#Intro">Introduction</a></div>
47
    <div class="toc1"><a href="#UbuntuConfig">Ubuntu/Debian Configuration</a></div>
48
      <div class="toc2"><a href="#ModJK">Set Up Mod JK</a></div>
49
      <div class="toc2"><a href="#SetUpKnb">Set Up Metacat Site</a></div>
50
      <div class="toc2"><a href="#LsidConfig">LSID Service Configuration</a></div>
51
      <div class="toc2"><a href="#UbuntuReloadApache">Reload Apache</a></div>
52
    <div class="toc1"><a href="#OtherConfig">Other O/S Configuration</a></div>
53
  </div> 
54

  
55
<a name="Intro"></a><div class="header1">Introduction</div>
56
  <p>If you are going to run Tomcat behind the Apache web server, you have some additional
57
  setup to perform.  One of the reasons for recommending the Ubuntu Linux O/S is its 
58
  ease of setup.  This becomes abundantly clear when configuring Apache.  We will
59
  address Ubuntu/Debian configurations separately from all other configurations.</p>
60
  
61
<a name="UbuntuConfig"></a><div class="header1">Ubuntu/Debian Configuration</div>  
62
   <p>If you are installing on an Ubuntu/Debian system, and you installed Apache using 
63
   apt-get, the Metacat code will have helper files that can be dropped into directories 
64
   to configure Apache.  Depending on whether you are installing from binary distribution 
65
   or source, these helper files will be in one of two locations:
66
   
67
   <ul>
68
   <li>Installing From Binary Distribution - the helper files will be located in the 
69
   directory where you extracted the distribution.</li>
70
   <li>Installing From Source - regardless of whether it's a source distribution or source that 
71
   you checked out from the SVN repository, the helper files will be located in:
72
     <div class="code">&lt;metacat_code_dir&gt;/src/scripts</div></li>
73
   </ul>
74
   <p>We will refer to the directory with the helper scripts as &lt;metacat_helper_dir&gt;</p>
75
   <p>We will refer to the directory where Apache is installed as &lt;apache_install_dir&gt;</p>
76
   
77
   <a name="ModJK"></a><div class="header2">Set Up Mod JK</div>
78
	<p>Apache uses a module called Mod JK to talk to Tomcat applications.  If you haven't done so 
79
	already, you can install it by typing:</p>	
80
	  <div class="code">sudo apt-get install libapache2-mod-jk</div>
81
	<p>The helper files that configure the interface between Apache and Tomcat are jk.conf and workers.properties.
82
	To install these files:</p>
83
      <div class="code">sudo cp &lt;metacat_helper_dir&gt;/jk.conf &lt;apache_install_dir&gt;/mods-available/</div>
84
      <div class="code">sudo cp &lt;metacat_helper_dir&gt;/workers.properties &lt;apache_install_dir&gt;</div>
85
    <p>Disabling and re-enabling the Apache Mod JK module will pick up the new changes:</p>
86
      <div class="code">sudo a2dismod jk</div>
87
      <div class="code">sudo a2enmod jk</div>
88
    
89
   <a name="SetUpKnb"></a><div class="header2">Set Up Metacat Site</div>   
90
    <p>Next, Apache needs to know about the Metacat site.  The helper file named "knb" has rules that
91
    tell Apache which traffic to route to Metacat.  Set up the knb (Metacat) site by dropping the 
92
    knb file into the sites-available directory and running the a2ensite to enable the site:</p>
93
      <div class="code">sudo cp &lt;metacat_helper_dir&gt;/knb &lt;apache_install_dir&gt;/sites-available</div>
94
      <div class="code">sudo a2ensite knb</div>
95
      
96
    <a name="LsidConfig"></a><div class="header2">LSID Service Configuration</div>   
97
    <p>If you want to run an optional LSID server along with the Metacat server, set up and enable the authority service site 
98
      configurations by typing:</p>
99
      <div class="code">sudo cp &lt;metacat_helper_dir&gt;/authority &lt;apache_install_dir&gt;/sites-available</div>
100
      <div class="code">sudo a2ensite authority</div>
101
      
102
    <a name="UbuntuReloadApache"></a><div class="header2">Reload Apache</div>
103
    <p>Reload Apache to bring in changes by typing:</p>
104
      <div class="code">sudo /etc/init.d/apache2 force-reload</div>
105
  
106
<a name="OtherConfig"></a><div class="header1">Other O/S Configuration</div>
107
  <p>If you are running on an O/S other than Ubuntu/Debian or you installed the
108
  Apache source or binary, you will need to manually edit the Apache configuration
109
  file.</p> 
110
     
111
  <p>We will refer to the directory where Apache is installed as &lt;apache_install_dir&gt;</p>
112
  
113
  <p>Edit:</p>
114
    <div class="code">&lt;apache_install_dir&gt;/conf/httpd.conf</div>
115
   
116
  <a name="ModJkLog"></a><div class="header2">Mod JK Log Configuration</div>
117
    <p> You should configure the log location and level for Mod JK.  If you do not
118
    already have a section like this, you should add it.</p>  
119
    
120
    <div class="code">
121
    &lt;IfModule mod_jk.c&gt;<br>
122
       &nbsp;&nbsp;JkLogFile "/var/log/tomcat/mod_jk.log"<br>
123
       &nbsp;&nbsp;JkLogLevel info<br>
124
    &lt;/IfModule&gt;
125
    </div>
126
    
127
    <p>You can set the log location to any place you like.</p>
128

  
129
  <a name="VirtualHost"></a><div class="header2">Virtual Host Configuration</div>  
130
  <p>The following section configures Apache to route traffic to the Metacat
131
  application.</p>
132
    <div class="code">
133
    &lt;VirtualHost XXX.XXX.XXX.XXX:80&gt;<br>
134
      &nbsp;&nbsp;DocumentRoot /var/www<br>
135
      &nbsp;&nbsp;ServerName dev.nceas.ucsb.edu<br>
136
      &nbsp;&nbsp;ErrorLog /var/log/httpd/error_log<br>
137
      &nbsp;&nbsp;CustomLog /var/log/httpd/access_log common<br>
138
    <br>
139
      &nbsp;&nbsp;ScriptAlias /cgi-bin/ "/var/www/cgi-knb/"<br>
140
      &nbsp;&nbsp;&lt;Directory /var/www/cgi-knb/&gt;<br>
141
        &nbsp;&nbsp;&nbsp;&nbsp;AllowOverride None<br>
142
        &nbsp;&nbsp;&nbsp;&nbsp;Options ExecCGI<br>
143
        &nbsp;&nbsp;&nbsp;&nbsp;Order allow,deny<br>
144
        &nbsp;&nbsp;&nbsp;&nbsp;Allow from all<br>
145
      &nbsp;&nbsp;&lt;/Directory&gt;<br>
146
    <br>
147
      &nbsp;&nbsp;ScriptAlias /knb/cgi-bin/ "/var/www/webapps/knb/cgi-bin/"<br>
148
      &nbsp;&nbsp;&lt;Directory "/var/www/webapps/knb/cgi-bin/"&gt;<br>
149
        &nbsp;&nbsp;&nbsp;&nbsp;AllowOverride None<br>
150
        &nbsp;&nbsp;&nbsp;&nbsp;Options ExecCGI<br>
151
        &nbsp;&nbsp;&nbsp;&nbsp;Order allow,deny<br>
152
        &nbsp;&nbsp;&nbsp;&nbsp;Allow from all<br>
153
      &nbsp;&nbsp;&lt;/Directory&gt;<br>
154
    <br>
155
      &nbsp;&nbsp;JkMount /knb ajp13<br>
156
      &nbsp;&nbsp;JkMount /knb/* ajp13<br>
157
      &nbsp;&nbsp;JkMount /knb/metacat ajp13<br>
158
      &nbsp;&nbsp;JkUnMount /knb/cgi-bin/* ajp13<br>
159
      &nbsp;&nbsp;JkMount /*.jsp ajp13<br>
160
      &nbsp;&nbsp;JkMount /metacat ajp13<br>
161
      &nbsp;&nbsp;JkMount /metacat/* ajp13<br>
162
    &lt;/VirtualHost&gt;<br>
163
    </div>
164
    
165
    <p>Some notes:</p>
166
    <ul>
167
    <li>ServerName - should be set to the DNS name where you will serve Metacat.</li>
168
    <li> ScriptAlias /knb/cgi-bin/ - this directive and the following Directory section
169
    should both point to the cgi-bin directory inside your Metacat installation.</li>
170
    </ul>
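    <p>As an illustration of these two notes, the lines you would typically change
    might look like the sketch below. The host name and the Tomcat path are hypothetical
    and are not taken from any Metacat distribution; substitute the values for your
    own server.</p>
    <div class="code">
      # Hypothetical host name and install path<br>
      &nbsp;&nbsp;ServerName metacat.example.org<br>
    <br>
      &nbsp;&nbsp;ScriptAlias /knb/cgi-bin/ "/usr/local/tomcat/webapps/knb/cgi-bin/"<br>
      &nbsp;&nbsp;&lt;Directory "/usr/local/tomcat/webapps/knb/cgi-bin/"&gt;<br>
        &nbsp;&nbsp;&nbsp;&nbsp;AllowOverride None<br>
        &nbsp;&nbsp;&nbsp;&nbsp;Options ExecCGI<br>
        &nbsp;&nbsp;&nbsp;&nbsp;Order allow,deny<br>
        &nbsp;&nbsp;&nbsp;&nbsp;Allow from all<br>
      &nbsp;&nbsp;&lt;/Directory&gt;<br>
    </div>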
171
    
172

  
173
  <a name="WorkersProperties"></a><div class="header2">workers.properties file</div> 
174
   <p>You will need to place a file named "workers.properties" in your Apache 
175
   configuration directory, and edit the file to make sure all properties are correct.
176
   Metacat provides a base workers.properties file for you to use.  Depending on 
177
   whether you are installing from binary distribution or source, the workers.properties
178
   file will be in one of two locations:</p>
179
   
180
   <ul>
181
   <li>Installing From Binary Distribution - the file will be located in the 
182
   directory where you extracted the distribution.</li>
183
   <li>Installing From Source - regardless of whether it is a source distribution or source that 
184
   you checked out from the SVN repository, the file will be located in:
185
     <div class="code">&lt;metacat_code_dir&gt;/src/scripts/workers.properties</div></li>
186
   </ul>   
187
   
188
   <p>Copy the workers.properties file into:</p>
189
     <div class="code">&lt;apache_install_dir&gt;/conf/</div>
190
     
191
   <p>Edit the workers.properties file and make sure the following properties are
192
   set correctly:</p>
193
   <ul>
194
   <li>workers.tomcat_home - should be set to the Tomcat install directory on your system.</li>
195
   <li>workers.java_home  - should be set to the Java install directory on your system.</li>
196
   </ul> 
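   <p>As a rough illustration, the relevant part of a workers.properties file might
   look like the sketch below. The two paths are assumptions for a typical Linux
   system, not values taken from the Metacat distribution; substitute the locations
   used on your own server. The worker name ajp13 matches the name used by the JkMount
   directives in the VirtualHost section, and 8009 is Tomcat's default AJP port.</p>
   <div class="code">
   # Hypothetical paths; adjust to your system<br>
   workers.tomcat_home=/usr/local/tomcat<br>
   workers.java_home=/usr/lib/jvm/java<br>
   <br>
   worker.list=ajp13<br>
   worker.ajp13.type=ajp13<br>
   worker.ajp13.host=localhost<br>
   worker.ajp13.port=8009<br>
   </div>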
197
   
198
  <a name="LsidConfig"></a><div class="header2">LSID Service Configuration</div>   
199
    <p>If you want to run an optional LSID server along with the Metacat server, add the following lines to the
200
    VirtualHost section you configured above:</p>
201
      <div class="code">
202
        &nbsp;&nbsp;JkMount /authority ajp13<br>
203
        &nbsp;&nbsp;JkMount /authority/* ajp13<br>
204
      </div>
205
      
206
    <a name="UbuntuReloadApache"></a><div class="header2">Reload Apache</div>
207
    <p>Restart Apache to load the changes by typing:</p>
208
      <div class="code">sudo /etc/init.d/apache2 restart</div>
209
      
210
    <br>    
211
</body>
212
</html>
213 0

  
... This diff was truncated because it exceeds the maximum size that can be displayed.
