Replication
===========

Metacat has a built-in replication feature that allows different Metacat
servers to share data (both XML documents and data files) with each other.
Metacat can replicate not only its home server's original documents, but also
those that were replicated from partner Metacat servers. When changes are made
to one server in a replication network, the changes are automatically
propagated to the rest of the network. A server that was unreachable when a
change was made can still pick it up later through Delta-T monitoring
(described below).

Replication allows users to manage their data locally and (by replicating them
to a shared Metacat repository) to make those data available to the greater
scientific community via a centralized search. In other words, your Metacat can
be part of a broader network, but you retain control over the local repository
and how it is managed.

For example, the KNB Network (Figure 6.1), which currently consists of ten
different Metacat servers from around the world, uses replication to "join" the
disparate servers to form a single robust and searchable data repository,
facilitating data discovery while leaving the data ownership and management
with the local administrators.

.. figure:: images/screenshots/image059.jpg
   :align: center

   A map of the KNB Metacat network.

When properly configured, Metacat's replication mechanism can be triggered by
several types of events that occur on either the home or partner server: a
document insertion, an update, or an automatic replication (i.e., Delta-T
monitoring), which is set at a user-specified time interval.

+----------------------+----------------------------------------------------------+
| Replication Triggers | Description                                              |
+======================+==========================================================+
| Insert               | Whenever a document is inserted into Metacat, the server |
|                      | notifies each server in its replication list             |
|                      | that it has a new file available.                        |
+----------------------+----------------------------------------------------------+
| Update               | Whenever a document is updated, the server notifies      |
|                      | each server in its replication list of the update.       |
+----------------------+----------------------------------------------------------+
| Delta-T monitoring   | At a user-specified time interval, Metacat checks each   |
|                      | of the servers in its replication list                   |
|                      | for updated documents.                                   |
+----------------------+----------------------------------------------------------+

Configuring Replication
-----------------------

To configure replication, you must configure both the home and partner servers:

1. Create a list of partner servers on your home server using the Replication
   Control Panel
2. Create certificate files for the home server
3. Create certificate files for the partner server
4. Import partner certificate files to the home server
5. Import the home certificate to the partner server
6. Update your Metacat database

Each step is discussed in more detail in the following sections.

Using the Replication Control Panel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To add, remove, or alter servers on your home server's replication list, or to
activate and customize the Delta-T handler, use the Replication Control Panel,
which is accessed at the following URL::

  http://somehost.somelocation.edu/context/style/skins/dev/replControl.html

Replace "http://somehost.somelocation.edu/context" with the name of your
Metacat server and context (e.g., http://knb.ecoinformatics.org/knb/). You must
be logged in to Metacat as an administrator.

.. figure:: images/screenshots/image061.jpg
   :align: center

   Replication control panel.

Note that you currently cannot use the Replication Control Panel to remove a
server after a replication has occurred. At this point, the only way to remove
a replication server after replication has occurred is to remove the
certificates.
Also note that you must SCP partner certificates to your machine; you cannot
use the "Download Certificate from" option on the Control Panel. For more
information about creating and installing certificates, please see Generating
and Exchanging Security Certificates.

Generating and Exchanging Security Certificates
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Before you can take advantage of Metacat's replication feature, you must
generate security certificates on both the replication partner and home
servers. The certificates are then exchanged so that each machine recognizes
that the other has replication access. The process for generating certificates
is different for Metacat servlets running under Tomcat alone and those running
under Tomcat/Apache (the recommended configuration). For instructions on
generating and exchanging certificates on systems running only Tomcat (and
Java 6), see Generating a Certificate for Tomcat standalone (no Apache).

Generate Certificates for Metacat running under Apache/Tomcat
.............................................................

Note: Instructions are for Ubuntu/Debian systems.

1. Generate a certificate key using openssl. The key will be named
   ``<hostname>-apache.key``, where ``<hostname>`` is the name of your Metacat
   server. Example values for the individual key fields are included in the
   table below.
   ::

     openssl req -new -out REQ.pem -keyout <hostname>-apache.key

   +--------------------------+-------------------------------------------------------------------------+
   | Key Field                | Description and Example Value                                           |
   +==========================+=========================================================================+
   | Country Name             | Two-letter country code (e.g., US)                                      |
   +--------------------------+-------------------------------------------------------------------------+
   | State or Province Name   | The name of your state or province spelled in full (e.g., California)   |
   +--------------------------+-------------------------------------------------------------------------+
   | Locality Name            | The name of your city (e.g., Santa Barbara)                             |
   +--------------------------+-------------------------------------------------------------------------+
   | Organization Name        | The company or organization name (e.g., UCSB)                           |
   +--------------------------+-------------------------------------------------------------------------+
   | Organizational Unit Name | The department or section name (e.g., NCEAS)                            |
   +--------------------------+-------------------------------------------------------------------------+
   | Common Name              | The host server name without port numbers (e.g., myserver.mydomain.edu) |
   +--------------------------+-------------------------------------------------------------------------+
   | Email Address            | Administrator's contact email (e.g., administrator@mydomain.edu)        |
   +--------------------------+-------------------------------------------------------------------------+
   | A challenge password     | --leave this field blank--                                              |
   +--------------------------+-------------------------------------------------------------------------+
   | An optional company name | --leave this field blank--                                              |
   +--------------------------+-------------------------------------------------------------------------+

2. Create the local certificate file by running the command:

   ::

     openssl req -x509 -days 800 -in REQ.pem -key <hostname>-apache.key -out <hostname>-apache.crt

   Use the same ``<hostname>`` you used when you generated the key. A file
   named ``<hostname>-apache.crt`` will be created in the directory from which
   you ran the openssl command.

   Note: You can name the certificate file anything you'd like, but keep in
   mind that the file will be sent to the partner machine used for
   replication. The certificate name should have enough meaning that someone
   who sees it on that machine can figure out where it came from.

3. Enter the certificate into Apache's security configuration. You must
   register the certificate in the local Apache instance. Note that, depending
   on how you installed Apache, the security files may be in a different
   directory from the one used in these instructions. Copy the certificate and
   key file using the following commands:

   ::

     sudo cp <hostname>-apache.crt /etc/ssl/certs
     sudo cp <hostname>-apache.key /etc/ssl/private

4. Apache needs to know about Metacat SSL. The helper file named "knb-ssl" has
   rules that tell Apache which traffic to route to the Metacat SSL port. Set
   up SSL by dropping the knb-ssl file into the sites-available directory and
   running ``a2ensite`` to enable the site:

   ::

     sudo cp <metacat_helper_dir>/knb-ssl <apache_install_dir>/sites-available
     sudo a2ensite knb-ssl

5. Restart Apache to bring in the changes by typing:

   ::

     sudo /etc/init.d/apache2 restart

6. SCP ``<hostname>-apache.crt`` to the replication partner machine.

Generating a Certificate for Tomcat standalone (no Apache)
..........................................................

If you are running Metacat under Tomcat (no Apache), generate keys in the Java
default key store. The generated key is placed in the binary certificates file
located at ``/etc/java-1.6.0-sun/security/cacerts``.

1. Generate the key by running the following command (note that you must be
   logged in as the root user to use the keytool):

   ::

     keytool -genkey -alias <alias> -keyalg RSA -validity 800 -keystore /etc/java-1.6.0-sun/security/cacerts

   ``<alias>`` is a unique name that you choose for this key. Something like
   ``<hostname>`` might be appropriate, where ``<hostname>`` is the name of
   the Metacat host.

2. The keytool will ask for a password. If writing to a pre-existing keystore,
   you must know the password. If you are creating a new keystore, the
   password you enter will become the keystore password.

   Sample values when creating the certificate:

   ::

     What is your first and last name? myserver.nceas.ucsb.edu (note: use the host name without port number)
     What is the name of your organizational unit? NCEAS
     What is the name of your organization? UCSB
     What is the name of your City or Locality? Santa Barbara
     What is the name of your State or Province? California (note: this is spelled in full)
     What is the two-letter country code for this unit? US

3. Create a certificate by running the command:

   ::

     keytool -export -alias <alias> -file <filename>.cert -keystore /etc/java-1.6.0-sun/security/cacerts

   ``<alias>`` is the same name you used when you created the key. A file
   named ``<filename>.cert`` will be created in the directory from which you
   ran the keytool command. You can name the output file anything you like,
   but keep in mind that it will be sent to the partner machine used for
   replication. The filename should have enough meaning that someone who sees
   it on that machine can figure out where it came from. Again, something like
   ``<hostname>-tomcat.cert`` will suffice.

4. Edit the Tomcat server file at ``$TOMCAT_HOME/conf/server.xml`` to enable
   SSL in Tomcat.

   * Uncomment the section that defines the SSL ``Connector``.
   * Add two attributes to that section:

     ::

       keystoreFile="/etc/java-1.6.0-sun/security/cacerts"
       keystorePass="<password>"

     where ``<password>`` is the password you used when you created or
     accessed the keystore.

5. SCP the certificate to the partner server.
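
The keytool steps above can be collected into a small script so that the same
alias and keystore path are used for both the key and the exported
certificate. The following is only a minimal sketch under stated assumptions:
the function names, the ``DRY_RUN`` switch, and the ``myserver-tomcat`` alias
are hypothetical and not part of Metacat, and the keystore path is the one
used in the steps above. By default the script only prints the commands it
would run.

```shell
#!/bin/sh
# Sketch: print (or run) the keytool commands from the steps above.
# The function names, DRY_RUN, and the example alias are illustrative only.

KEYSTORE="${KEYSTORE:-/etc/java-1.6.0-sun/security/cacerts}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands; 0 = also execute them

# Print a command line, and execute it only when DRY_RUN=0.
run() {
    echo "$*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# Step 1: generate an RSA key under the given alias (valid for 800 days).
generate_key() {
    run keytool -genkey -alias "$1" -keyalg RSA -validity 800 -keystore "$KEYSTORE"
}

# Step 3: export the certificate for that alias to the given file.
export_cert() {
    run keytool -export -alias "$1" -file "$2" -keystore "$KEYSTORE"
}

# Hypothetical alias and file name, following the naming advice above.
generate_key myserver-tomcat
export_cert myserver-tomcat myserver-tomcat.cert
```

Setting ``DRY_RUN=0`` (as root) would execute the commands; keytool then
prompts for the keystore password and the certificate fields as described in
step 2.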
To import a certificate
.......................

1. Log in as a root user (the keytool must run as a root user)

   ::

     sudo su -

2. Import the remote certificate by running:

   ::

     keytool -import -alias <alias> -file <certificate_file>.crt -keystore /etc/java-1.6.0-sun/security/cacerts

   where ``<certificate_file>.crt`` is the name of the certificate file
   created on the remote partner machine and SCP'd to the home machine, and
   ``<alias>`` is the name the certificate will use in the keystore. The name
   should identify the remote host.

Update your Metacat database
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The simplest way to update the Metacat database to use replication is to use
the Replication Control Panel. You can also update the database using SQL.
Instructions for both options are included in this section.

.. figure:: images/screenshots/image063.jpg
   :align: center

   Using the Replication Control Panel to update the Metacat database.

To update your Metacat database to use replication, select the "Add this
server" radio button from the Replication Control Panel, enter the partner
server name, and specify how the replication should occur (whether to
replicate XML, data, or use the local machine as a hub). Note that you cannot
download certificates using this interface.

To update the database using SQL
................................

1. Log in to the database

   ::

     psql -U metacat -W -h localhost metacat

2. Select all rows from the replication table

   ::

     select * from xml_replication;

3. Insert the partner server.

   ::

     INSERT INTO xml_replication (server,last_checked,replicate,datareplicate,hub) VALUES ('<partner_server/context>/servlet/replication',NULL,1,1,0);

   where ``<partner_server/context>`` is the name of the partner server and
   context. The values ``NULL,1,1,0`` indicate (respectively) the last time
   replication occurred, that XML docs should be replicated to the partner
   server, that data files should be replicated to the partner server, and
   that the local server should not act as a hub.
   Set a value of ``NULL,0,0,0`` if your Metacat is only receiving documents
   from the partner site and not replicating to that site.

4. Exit the database
5. Restart Apache and Tomcat on both home and partner replication machines
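
When several partners need to be registered, the INSERT statement from step 3
can be generated rather than typed by hand. The following is a minimal sketch,
not a Metacat tool: ``build_partner_insert`` is a hypothetical helper, and
``partner.example.edu/knb`` is a made-up partner server and context. The
column list and the ``/servlet/replication`` suffix are taken from the SQL
example above.

```shell
#!/bin/sh
# Sketch: build the INSERT statement for one replication partner.
# build_partner_insert is a hypothetical helper, not part of Metacat.

# Usage: build_partner_insert <server/context> <replicate> <datareplicate> <hub>
# Each of the three flags must be 0 or 1, as in the SQL example above.
build_partner_insert() {
    server="$1"; replicate="$2"; datareplicate="$3"; hub="$4"

    # Reject single quotes so the value cannot break out of the SQL literal.
    case "$server" in
        *\'*) echo "server name must not contain a single quote" >&2; return 1 ;;
    esac

    for flag in "$replicate" "$datareplicate" "$hub"; do
        case "$flag" in
            0|1) ;;
            *) echo "flags must be 0 or 1" >&2; return 1 ;;
        esac
    done

    # last_checked starts as NULL because no replication has occurred yet.
    printf "INSERT INTO xml_replication (server,last_checked,replicate,datareplicate,hub) VALUES ('%s/servlet/replication',NULL,%s,%s,%s);\n" \
        "$server" "$replicate" "$datareplicate" "$hub"
}

# Replicate XML and data files to the partner; the local server is not a hub.
build_partner_insert "partner.example.edu/knb" 1 1 0
```

The printed statement could then be pasted into the ``psql`` session from
step 1. Passing ``0 0 0`` for the flags produces the receive-only
configuration described in step 3.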