Metacat UNIX Installation Instructions

***Disclaimer***

These installation instructions are meant for a systems administrator, DBA, or other advanced computer user. They are NOT meant for the average computer user. Be aware that by following these instructions you may have to troubleshoot many advanced issues yourself.

Pre-Installation

Minimum Requirements

Installing Metacat requires a server running an SQL92-compliant database (Oracle 8i recommended) with at least 128 MB of RAM and a Pentium III class processor or higher. The amount of disk space required depends on the size of your RDBMS tablespace (which should be at least 10 MB); Metacat itself requires only about 1 MB of free space after installation. These instructions assume a Linux environment. They may work on other UNIX-type environments, but this has not been tested.

Additional Required Software

The server on which you wish to install Metacat must have the following software installed and running correctly before attempting to install Metacat.

  • Oracle 8i (or another SQL92 compliant RDBMS like Postgres)
  • Apache Jakarta-Ant
  • Apache Jakarta-Tomcat

    Note: For a more robust web serving environment, Apache web server should be installed along with Tomcat and the two should be integrated as described on the Apache web site.

Additional Software Setup

Java

You'll need a recent Java SDK, preferably j2sdk1.4.2 or a later 1.4.x release. We haven't tested with any of the 1.5.x versions yet, so it is probably best to stay with 1.4.x. Make sure that the JAVA_HOME environment variable is properly set and that both java and javac are on your PATH.
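The Java prerequisites above can be checked from the shell before going further. The following is a minimal sketch, not part of the Metacat distribution; the function names require_tool and check_java_env are made up for this example.

```sh
# Report whether a command can be found on the current PATH.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1" >&2
        return 1
    fi
}

# Check JAVA_HOME and the two tools this guide asks for.
# (Hypothetical helper; it only inspects the environment.)
check_java_env() {
    status=0
    if [ -z "${JAVA_HOME:-}" ] || [ ! -d "$JAVA_HOME" ]; then
        echo "JAVA_HOME is not set or is not a directory" >&2
        status=1
    fi
    require_tool java  || status=1
    require_tool javac || status=1
    return "$status"
}
```

Running check_java_env in the account that will build Metacat should report both tools as found. It does not check the Java version, so confirm that separately with java -version.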

Oracle 8i or Postgres

Oracle:
The Oracle RDBMS must be installed and running as a daemon on the system. In addition the JDBC listener must be enabled. You can enable it by logging in as your Oracle user and typing the following:

lsnrctl start
Your instance should have a tablespace of at least 5 MB (10 MB or higher recommended). You should also create and enable a username specific to Metacat. This user must have most normal permissions, including CREATE SESSION, CREATE TABLE, CREATE INDEX, CREATE TRIGGER, EXECUTE PROCEDURE, EXECUTE TYPE, etc. If an action is inexplicably rejected by Metacat, it is probably because the user permissions are not set correctly.

Postgres:
Postgres can be easily installed on most Linux distributions, on Windows (using Cygwin), and on Mac OS X. On Fedora Core or Red Hat Linux, you can install the Postgres RPMs and then run /etc/init.d/postgresql start to start the database. This initializes the data files. You then need to do a bit of configuration to create a database, set up a user account, and allow network access via JDBC. See the Postgres documentation for details, but here is a quick start:

  • Switch to the "postgres" user account and edit "data/pg_hba.conf", adding the following line to the file:
    host metacat metacat 127.0.0.1 255.255.255.255 password
  • Edit the "data/postgresql.conf" file and uncomment and edit the line starting with "tcpip_socket" so that it reads tcpip_socket = true
  • Run createdb metacat to create a new database
  • Run psql metacat to log in using the postgres account and create a new "metacat" user account
    • In postgres, run CREATE USER metacat WITH UNENCRYPTED PASSWORD 'apasswordyoulike';
    • This creates a new account called metacat on the database named metacat
    • Note: there are many ways to do this, so others such as using ENCRYPTED passwords will work fine.
  • Exit the postgres account back to root and restart the postgres database with /etc/init.d/postgresql restart
  • Test logging into the postgres db using the metacat account with the following command: psql -U metacat -W -h localhost metacat
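If you script the first step of the quick start, it helps to make it safe to run more than once. The sketch below appends the pg_hba.conf entry shown above only when that exact line is not already present. The function name add_hba_entry is hypothetical, and the file path is whatever your installation uses.

```sh
# Append the pg_hba.conf entry from the quick start, but only if that exact
# line is not already in the file. Arguments: file, database, user.
# (Hypothetical helper; run it as the "postgres" user.)
add_hba_entry() {
    hba_file=$1
    entry="host $2 $3 127.0.0.1 255.255.255.255 password"
    if grep -qxF "$entry" "$hba_file" 2>/dev/null; then
        echo "already present: $entry"
    else
        echo "$entry" >> "$hba_file"
        echo "added: $entry"
    fi
}
```

For example, add_hba_entry data/pg_hba.conf metacat metacat adds the line on the first run and reports it as already present on later runs. Postgres still has to be restarted afterwards, as described in the list above.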

Ant

Ant is a Java-based build application similar to Make on UNIX systems. It takes in installation parameters from a file in the root installation directory named "build.xml". The Metacat CVS module contains a default build.xml file that may require some modification upon installation. Ant should be installed on the system and the "ant" executable shell script should be available in the user's path. Note that the current build does not work with Ant 1.6.x, so you'll need to use an earlier version. We have successfully used Ant 1.5.1, 1.5.2, and some earlier versions.
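Because the build is sensitive to the Ant version, a quick check before building can save time. This sketch accepts a version string and succeeds only for Ant 1.x releases up to 1.5.x. That range is an assumption based on the note above: only 1.5.1, 1.5.2, and some earlier versions are reported to work. The function name ant_version_supported is hypothetical.

```sh
# Succeed (status 0) only for Ant 1.x releases up to and including 1.5.x.
# Returns 1 for newer releases and 2 for input that is not a dotted version.
# Argument: a version string such as "1.5.2".
ant_version_supported() {
    case "$1" in
        *.*) ;;
        *) return 2 ;;
    esac
    major=${1%%.*}
    rest=${1#*.}
    minor=${rest%%.*}
    for part in "$major" "$minor"; do
        case "$part" in
            ''|*[!0-9]*) return 2 ;;   # empty or non-numeric component
        esac
    done
    [ "$major" -eq 1 ] && [ "$minor" -le 5 ]
}
```

You can read the installed version from the output of ant -version and pass the number to the function, for example ant_version_supported 1.5.2.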

Tomcat

Install Tomcat into the directory of your choice. The directory in which you install Tomcat will be referred to as "$CATALINA_HOME". We recommend installing Tomcat version 4.0. More details about Tomcat installation are available in the Tomcat documentation.

Once all of the prerequisite software is installed as described above, the installation of Metacat can begin. First you must have a current version of the source distribution of Metacat. You can get it two ways. Authorized users can check it out of the NCEAS CVS system. You'll need both the "metacat" module and the "utilities" module to be checked out in sibling directories. The command is as follows:

mkdir knb-software
cd knb-software
cvs checkout -P metacat
cvs checkout -P utilities
Or you can download a gzipped tar file from this site.
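If you used the CVS checkout, you can confirm that both modules ended up as sibling directories before you build. This is a small sketch; the function name check_source_layout is made up for illustration, and the directory names are the two module names given above.

```sh
# Verify that the "metacat" and "utilities" modules sit side by side.
# Argument: the parent directory (defaults to the current directory).
check_source_layout() {
    base=${1:-.}
    missing=0
    for module in metacat utilities; do
        if [ ! -d "$base/$module" ]; then
            echo "missing module directory: $base/$module" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For the checkout shown above, run check_source_layout knb-software from the directory in which you created knb-software.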

Edit build.xml File

Once you have either checked out or unzipped and untarred the source distribution, you can begin the installation process. Change into the metacat directory and edit the file called "build.xml". You will need to change a number of configuration properties to match the setup on your system. If you are using Oracle, you'll need to customize the properties in the "oracle" target. If you are using Postgres, you'll need to customize the properties in the "postgres" target. All users will need to customize the properties in the "config" target.

The properties that you need to change will include jdbc-connect, dbDriver, dbAdapter, oracle_home, jdbc, tomcat, webapps, context, user, server, systemidserver, web-base-url, and default-style. Each is described in detail below. You should also verify that the jar file properties mentioned in the remainder of the config target are accessible at the paths listed -- the defaults will usually work.

Note that the build file is preconfigured to install Metacat either using Oracle or PostgreSQL as a backend database. To change the database system, simply change the 'depends' attribute of the 'config' target to be the name of the database target that you wish to use (either 'oracle' or 'postgresql'). If you wish to use a different database system, add a new target for your database with the needed parameters and actions then add it to the 'depends' attribute.

Properties you will likely need to change:
  • The jdbc-connect parameter is the JDBC connection string needed to connect to your database.
  • The dbDriver parameter is the name of the JDBC driver class to use for connections to your database.
  • The dbAdapter parameter is the name of the Metacat adapter class to be used to communicate with a particular database.
  • The oracle_home parameter is the location that oracle is installed on your system.
  • The jdbc parameter is the location of your jdbc driver jar file.
  • The tomcat parameter is the location in which tomcat is installed.
  • The webapps parameter is the location in which your tomcat servlet contexts are installed. This is typically "$TOMCAT_HOME/webapps".
  • The context parameter is the name of the servlet context in which you want metacat to be installed. This will determine the installation directory for the servlet and many of the urls that are used to access the installed Metacat server.
  • The user and password parameters are the database user name and password that you set up for Metacat, for example an Oracle username and password.
  • The tomcatversion is the version of your Tomcat. You should put tomcat3 or tomcat4 here.
  • Web-base-url is the URL from which you want to load any stylesheets or supplementary images.
  • Server is the http address on which Metacat is running (note that you should not include the 'http://' in the server property).
  • The systemidserver is the protocol (http or https) and server location to get any DTDs.
  • The datafilepath is the directory to store the data file.
  • The inlinedatafilepath is the directory to store inline data (This is for EML2).
  • The default-style parameter defines the "style-set" that is to be used by default when the qformat parameter is missing or set to "html" during a query. It is set to "knb", which is the only style that ships with the default metacat distribution. If you create your own stylesheets for displaying metacat output, you may want to create a new config file in the config-dir (e.g., mystyle.xml) and then change the default-style to use your custom style (e.g., "mystyle").
  • The debuglevel controls how many debug messages Metacat prints. It generally ranges from 0 to 70. At level 70, Metacat will display all debug messages.
  • The forcereplicationwaitingtime is the time to wait before starting forced replication after a package is uploaded. The default value is usually appropriate.
Other properties that you can change but generally need not:
  • The installdir parameter is the directory in which Ant should install the servlet. It is your "servlet context path" that was defined above.
  • Replication path is the relative path to the replication servlet. This should be the name of your servlet followed by "/servlet/replication". For example 'metacat/servlet/replication'.
  • The servlet path is the relative path to your servlet as viewed by the Tomcat or Apache web server. Under Tomcat, the form is usually
    /<servlet-context-name>/servlet/metacat
  • The html-path is usually the first directory of the servlet-path. The only reason it wouldn't be is if you are doing something with your web server and you want the html served from a different location than where the servlet is located.
  • The image-path is where you want the Metacat image files stored. It should be a directory that is accessible by the web server.
  • Replication-log is the location at which you want Metacat to place any replication log files. The user that starts Tomcat must have permission to write to this directory.
  • The config-dir parameter specifies the location of the configuration files for the "style-sets" feature. It is set by default to the installation directory and generally does not need to be changed.
  • The eml-module, eml-version, and eml-tag parameters control the installation behavior with respect to EML. You should not need to change these parameters.
  • The cvsroot parameter is used when building the distribution and you should not need to change it.

Note: DO NOT add a slash [/] to the end of these paths. Metacat will not function correctly if you do so.
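Two of the rules above are easy to get wrong by hand: path values must not end in a slash, and the server value must not include the protocol. The following sketch checks a value against each rule. The function names are hypothetical, and the functions do not read build.xml themselves; you pass in the values you entered.

```sh
# Succeed if a build.xml path value has no trailing slash.
no_trailing_slash() {
    case "$1" in
        */) echo "trailing slash: $1" >&2; return 1 ;;
        *)  return 0 ;;
    esac
}

# Succeed if a server value omits the protocol prefix.
no_protocol_prefix() {
    case "$1" in
        http://*|https://*) echo "remove protocol from: $1" >&2; return 1 ;;
        *) return 0 ;;
    esac
}
```

For example, no_trailing_slash /usr/local/tomcat/webapps succeeds, while no_protocol_prefix http://yourserver.yourdomain.com fails with a message. The path used here is only an example value.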

SQL Scripts

You now need to set up the table structure in your database. You can do this either with the ant build system or by running the scripts manually with a SQL utility.

WARNING: Do NOT run this on an existing metacat installation as it will delete all of your data. If you have an existing metacat installation, see the instructions for "Upgrading" below.

To run the scripts using ant, type ant installdb. This does not work for postgres, so you'll need to run the xmltables-postgres.sql script manually (see next paragraph).

To run the scripts manually, change to the metacat/src directory. Then run your RDBMS's SQL utility. In Oracle it is SQLPlus. This tutorial assumes an Oracle database, so this example is for SQLPlus. Log in as the Oracle user that was set up for use with Metacat. At the SQLPlus prompt type the following:

@xmltables.sql;
For postgres, use a command like: psql -U metacat -W -h localhost -f build/src/xmltables-postgres.sql metacat

Either way, you should see a bunch of output showing the creation of the Metacat table structure. The first time you run this script you will get several errors at the beginning saying that you cannot drop a table/index/trigger because it does not exist. This is normal. Any other errors need to be resolved before continuing. The script file name for PostgreSQL is xmltables-postgres.sql and for Microsoft SQL Server is xmltables-sqlserver.sql.
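If you save the output of the SQL session to a file, you can filter out the expected "does not exist" errors and look only at what remains. This is a rough sketch: the function name unexpected_sql_errors is made up, and the patterns assume that Oracle error lines contain a code such as "ORA-00942:" and that Postgres error lines contain "ERROR:". Adjust them if your output looks different.

```sh
# Print error lines from a saved SQL session log, skipping the expected
# "does not exist" complaints raised by the initial DROP statements.
# Succeeds when no other error lines are found.
# Argument: path to the saved log file.
unexpected_sql_errors() {
    if grep -E 'ORA-[0-9]+:|ERROR:' "$1" | grep -v 'does not exist'; then
        return 1   # at least one unexpected error line was printed
    fi
    return 0
}
```

An empty result means only the expected errors occurred. Anything printed needs to be resolved before continuing.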

If the script has run correctly you should be able to type

describe xml_documents
and it should tell you
    Name            Null?         Type
    --------------  ------------  ---------------- 
     DOCID          NOT NULL      VARCHAR2(250)
     ROOTNODEID                   NUMBER(20)
     DOCNAME                      VARCHAR2(100)
     DOCTYPE                      VARCHAR2(100)
     DOCTITLE                     VARCHAR2(1000)
     USER_OWNER                   VARCHAR2(100)
     USER_UPDATED                 VARCHAR2(100)
     SERVER_LOCATION              NUMBER(20)
     REV                          NUMBER(10)
     DATE_CREATED                 DATE
     DATE_UPDATED                 DATE
     PUBLIC_ACCESS                NUMBER(1)
     UPDATED                      NUMBER(1)
   

Upgrading SQL Scripts

If you have an existing metacat installation, you should not run the install script because it will replace all of the older tables with new, empty copies of the tables. Thus you would lose your data! Instead, you can run some upgrade scripts that will change the table structure as needed for the new version. If you are skipping versions, run each upgrade script for the intermediate versions as well. Currently the upgrade scripts are:

  • upgrade-db-to-1.2.sql
  • upgrade-db-to-1.3.sql
  • upgrade-db-to-1.4.sql

So, if you had an existing Metacat 1.0 installation and you were upgrading to 1.3, you would need to run both upgrade-db-to-1.2.sql and upgrade-db-to-1.3.sql. However, if you were starting from a Metacat 1.2.x installation, you would only need to run the 1.3 upgrade script.
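The rule for choosing upgrade scripts can be written down as a short function. This sketch covers only the three scripts listed above and assumes versions of the form 1.N or 1.N.x. The function name upgrade_scripts is hypothetical.

```sh
# Print the upgrade scripts needed to go from one Metacat 1.x version to
# another, in the order they should be run.
# Arguments: current version, target version (for example 1.0 and 1.3).
upgrade_scripts() {
    from_rest=${1#*.}; from_minor=${from_rest%%.*}
    to_rest=${2#*.};   to_minor=${to_rest%%.*}
    for minor in 2 3 4; do
        if [ "$minor" -gt "$from_minor" ] && [ "$minor" -le "$to_minor" ]; then
            echo "upgrade-db-to-1.$minor.sql"
        fi
    done
}
```

For example, upgrade_scripts 1.0 1.3 prints upgrade-db-to-1.2.sql and upgrade-db-to-1.3.sql, and upgrade_scripts 1.2.x 1.3 prints only upgrade-db-to-1.3.sql, matching the two cases described above.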

Compilation and Installation

Ant allows compilation and installation to be done in one step. Change into the metacat directory and type:

ant geteml install
or, if you are upgrading an existing installation, type:
ant geteml upgrade

You should see a bunch of messages telling you the progress of compilation and installation. When it is done you should see the message BUILD SUCCESSFUL and you should be returned to a UNIX command prompt. If you do not see the message BUILD SUCCESSFUL then there was an error that you need to resolve. This may come up if you are logged in as a user that does not have write access to one or more of the directories that are listed in the build.xml file, or if any of the paths to files are not configured correctly in the "config" target.
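If you capture the build output in a file, the success check described above can be automated. The sketch below looks for the BUILD SUCCESSFUL line in a saved log. The function name build_succeeded and the log file name are assumptions made for this example.

```sh
# Succeed only if the saved Ant output contains the BUILD SUCCESSFUL line.
# Argument: path to a file holding the captured output of the ant command.
build_succeeded() {
    if grep -q 'BUILD SUCCESSFUL' "$1"; then
        echo "build ok"
    else
        echo "build did not report success; see $1" >&2
        return 1
    fi
}
```

One way to use it is to run ant geteml install 2>&1 | tee build.log and then build_succeeded build.log.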

Once metacat itself is installed, you should also register the Ecological Metadata Language (EML) DTDs and schemas. This process is done most easily by running:

ant dtdschemasql

This command registers the DTDs' and schemas' location in the metacat server. Your database username and password have to be set correctly for this to work.

Note: The 'data' directory in the installation directory must be writeable by whatever user is running Tomcat or you will not be able to upload data files to the system.

Restart Tomcat

Once you have successfully installed Metacat, there is one more step. Tomcat (and Apache if you have Tomcat integrated with it) must be restarted. To do this, login as the user that runs your tomcat server (often "tomcat"), go to $CATALINA_HOME/bin and type:

   ./shutdown.sh 
   ./startup.sh 
   
In the Tomcat startup messages (in the log file) you should see something like:
    MetacatServlet Initialize
    Context log path="/metadata" :Metacat: init
    MetacatServlet Initialize
   
If you see that message, Tomcat is successfully loading the Metacat servlet. Next, try to run your new servlet. Go to a web browser and type:
http://yourserver.yourdomain.com/yourcontext/
You should substitute your context name for "yourcontext" in the url above. If everything is working correctly, you should see a query page followed by an empty result set. Note that if you do not have Tomcat integrated with Apache you will probably have to type
http://yourserver.yourdomain.com:8080/yourcontext/
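The two URL forms above differ only in the port. If you test the installation from scripts, a small helper keeps them consistent. This is a sketch; the function name metacat_url is hypothetical.

```sh
# Build the URL for a Metacat context. Arguments: server, context, and an
# optional port (for example 8080 when Tomcat is not behind Apache).
metacat_url() {
    if [ -n "${3:-}" ]; then
        echo "http://$1:$3/$2/"
    else
        echo "http://$1/$2/"
    fi
}
```

For example, metacat_url yourserver.yourdomain.com yourcontext 8080 prints the second URL shown above. You can pass the result to a browser or to a command-line client such as curl.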