Replication

Metacat has built-in replication that allows different Metacat servers to share data among themselves. In this release, Metacat replicates not only XML documents but also data files.

A new hub feature has also been added to Metacat. Previous versions of Metacat replicated only XML documents whose home server was the local server. A hub Metacat can replicate both its own documents and documents that were replicated to it from other servers.

The replication scheme that Metacat uses is both push and pull. Several triggers can start replication.

Each server contains a list of servers to which it can replicate. One-way replication is enabled by the 'replicate' and 'datareplicate' flags in the list. The server list may look like the following.

serverid  server                                                 last_checked           replicate  datareplicate  hub
1         localhost                                              null                   0          0              0
2         alpha.nceas.ucsb.edu:8080/berkley/servlet/replication  2001-01-22 14:52:12.1  0          0              0
3         dev.nceas.ucsb.edu/Metacat/servlet/replication         2001-01-23 9:10:02.5   1          1              0

The server list is kept in a database table called xml_replication. Localhost must always be the first entry in the table and must have a serverid of 1. The server field must always point to the other server's replication servlet, hence the servlet/replication at the end of both sample servers. Note that the port number must also be included if the remote servlet engine is not running on port 80. Set the replicate flag to 1 if you want this server to copy XML documents TO the remote host. If both replicate and datareplicate are set to 1, this server also copies data files TO the remote host. If this server is a hub for the remote host, set the hub flag to 1. (Note that both servers, the local host and the remote host, must have each other in their respective tables, or replication will not take place.)
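The rules above can be checked mechanically. The following is a minimal sketch in Python, not part of Metacat: the ServerEntry class and the validate_server_list function are hypothetical names introduced for illustration, and the rows are modelled in memory rather than read from the real xml_replication table.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ServerEntry:
    """One row of the server list (hypothetical in-memory model)."""
    serverid: int
    server: str
    last_checked: Optional[str]
    replicate: int
    datareplicate: int
    hub: int


def validate_server_list(entries: List[ServerEntry]) -> List[str]:
    """Return a list of problems; an empty list means the rules are met."""
    if not entries:
        return ["server list is empty"]
    problems = []
    first = entries[0]
    # Rule: localhost must be the first entry and have a serverid of 1.
    if first.server != "localhost" or first.serverid != 1:
        problems.append("first entry must be localhost with serverid 1")
    # Rule: every remote entry must point to the replication servlet.
    for entry in entries[1:]:
        if not entry.server.endswith("servlet/replication"):
            problems.append(f"{entry.server}: must end with servlet/replication")
    return problems


sample = [
    ServerEntry(1, "localhost", None, 0, 0, 0),
    ServerEntry(2, "alpha.nceas.ucsb.edu:8080/berkley/servlet/replication",
                "2001-01-22 14:52:12.1", 0, 0, 0),
    ServerEntry(3, "dev.nceas.ucsb.edu/Metacat/servlet/replication",
                "2001-01-23 9:10:02.5", 1, 1, 0),
]
print(validate_server_list(sample))  # prints []
```

The check covers only the two structural rules stated above. Whether the remote host lists this server in its own table cannot be verified from one table alone.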

Example: the replication table on each of three hosts.

snoopy.nceas.ucsb.edu
  server                                                  last_checked           replicate  datareplicate  hub
  localhost                                               null                   0          0              0
  alpha.nceas.ucsb.edu:8080/berkley/servlet/replication   2001-01-22 14:52:12.1  0          0              0
  dev.nceas.ucsb.edu/Metacat/servlet/replication          2001-01-23 9:10:02.5   1          1              0

alpha.nceas.ucsb.edu
  server                                                  last_checked           replicate  datareplicate  hub
  localhost                                               null                   0          0              0
  snoopy.nceas.ucsb.edu:8080/berkley/servlet/replication  2001-01-21 11:33:12.7  0          1              0
  dev.nceas.ucsb.edu/Metacat/servlet/replication          2001-01-23 10:22:02.5  1          0              0

dev.nceas.ucsb.edu
  server                                                  last_checked           replicate  datareplicate  hub
  localhost                                               null                   0          0              0
  snoopy.nceas.ucsb.edu:8080/berkley/servlet/replication  2001-01-21 11:33:12.7  0          0              0
  alpha.nceas.ucsb.edu:8080/Metacat/servlet/replication   2001-01-22 12:15:32.5  1          1              1

Our three servers, snoopy, alpha, and dev, are all set up to replicate among themselves. Snoopy is a one-way replicator: it only pushes XML documents and data files to dev and does not pull anything back. This is achieved by dev and alpha setting snoopy's 'replicate' value to 0, indicating that they do not want to send their files to snoopy. (Even though alpha sets 'datareplicate' to 1 for snoopy, nothing will be sent from alpha to snoopy, because 'replicate' is 0.) Alpha and dev have a two-way replication agreement, since each of them has a 1 in its 'replicate' value for the other.

Snoopy replicates both XML documents and data files to dev because it sets dev's 'replicate' and 'datareplicate' values to 1. Alpha replicates only XML documents to dev because it sets dev's 'datareplicate' value to 0.

Dev is a hub for alpha because it sets alpha's 'hub' value to 1. Moreover, dev sets alpha's 'replicate' and 'datareplicate' values to 1. So dev will replicate to alpha the XML documents and data files whose home server is either dev or snoopy (the latter having been replicated from snoopy).

Note: if the 'replicate' value is 0, the 'datareplicate' and 'hub' values have no effect.
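The way the three flags combine can be sketched in a few lines of Python. This is an illustration of the rules described above, not Metacat's actual code: the Peer and Push types and the what_is_pushed and mutually_listed functions are hypothetical names, and the flag values are copied from the three example tables.

```python
from typing import Dict, NamedTuple


class Peer(NamedTuple):
    replicate: int
    datareplicate: int
    hub: int


class Push(NamedTuple):
    xml: bool      # XML documents are sent to the peer
    data: bool     # data files are sent to the peer
    foreign: bool  # documents whose home server is another host are sent too


# tables[host][peer] holds the flags that host keeps for peer.
tables: Dict[str, Dict[str, Peer]] = {
    "snoopy": {"alpha": Peer(0, 0, 0), "dev": Peer(1, 1, 0)},
    "alpha": {"snoopy": Peer(0, 1, 0), "dev": Peer(1, 0, 0)},
    "dev": {"snoopy": Peer(0, 0, 0), "alpha": Peer(1, 1, 1)},
}


def what_is_pushed(row: Peer) -> Push:
    """Apply the rule that 'datareplicate' and 'hub' only matter when 'replicate' is 1."""
    enabled = row.replicate == 1
    return Push(
        xml=enabled,
        data=enabled and row.datareplicate == 1,
        foreign=enabled and row.hub == 1,
    )


def mutually_listed(host: str, peer: str) -> bool:
    """Both hosts must list each other, or replication does not take place."""
    return peer in tables.get(host, {}) and host in tables.get(peer, {})


for host, peers in tables.items():
    for peer, row in peers.items():
        if mutually_listed(host, peer):
            print(f"{host} -> {peer}: {what_is_pushed(row)}")
```

For the example tables, only three of the six directions push anything: snoopy to dev (XML and data), alpha to dev (XML only), and dev to alpha (XML, data, and documents that originated elsewhere).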

There is an HTML control panel for controlling replication. After installing Metacat, you can access it by going to the Metacat servlet context you have set up and opening replControl.html. For instance, if you set up a Metacat servlet instance called 'Metacat', you would probably type http://server.domain.com:8080/Metacat/replControl.html. The control panel is an easy interface for adding, removing, and altering servers and for starting the delta-T handler. It also allows you to 'force replicate' your server list. This is useful if you want to initialize the state of one Metacat server from the existing state of another (i.e. copy all of the data from an existing server).
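As a small illustration of how the control panel address is put together, the sketch below builds it from a host name, a port, and a servlet context. The control_panel_url function is a hypothetical helper, and the host, port, and context values are placeholders that depend on your installation.

```python
def control_panel_url(host: str, context: str, port: int = 80) -> str:
    """Build the replControl.html address for a given servlet context."""
    # The port is only written out when the servlet engine is not on port 80.
    authority = host if port == 80 else f"{host}:{port}"
    return f"http://{authority}/{context.strip('/')}/replControl.html"


print(control_panel_url("server.domain.com", "Metacat", 8080))
# prints http://server.domain.com:8080/Metacat/replControl.html
```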

