Queries and Results


The Metacat server provides an interface for searching the metadata stored in the Metacat database.

[Figure: architecture diagram of a Metacat query]

Steps to perform a query in Metacat

  1. A pathquery document is created from the search criteria provided through the servlet parameters.
  2. This pathquery document is sent to DBQuery where it is processed and translated into SQL statements.
  3. The SQL statements are executed against the database, and the result sets are translated into an XML document of doctype "resultset".
  4. The resultset document is either returned directly to the client as XML or is transformed through XSLT and returned as HTML.

The Pathquery Document

   <pathquery version="1.0">
      <meta_file_id>unspecified</meta_file_id>
      <querytitle>unspecified</querytitle>
      <returnfield>dataset/title</returnfield>
      <returnfield>keyword</returnfield>
      <returnfield>originator/individualName/surName</returnfield>
      <returndoctype>eml://ecoinformatics.org/eml-2.0.1</returndoctype>
      <returndoctype>eml://ecoinformatics.org/eml-2.0.0</returndoctype>
      <querygroup operator="UNION">
        <queryterm casesensitive="false" searchmode="contains">
          <value>Datos</value>
          <pathexpr>dataset/title</pathexpr>
        </queryterm>
        <queryterm casesensitive="false" searchmode="contains">
          <value>plant</value>
          <pathexpr>keyword</pathexpr>
        </queryterm>
      </querygroup>
   </pathquery>
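
A document like the one above could be assembled programmatically from a set of search criteria. The following is a minimal sketch in Python using only the standard library; it is not Metacat's own code. The function name build_pathquery and its parameters are hypothetical, and the sketch assumes every term is a case-insensitive "contains" match inside a single querygroup, as in the example.

```python
import xml.etree.ElementTree as ET


def build_pathquery(return_fields, return_doctypes, terms, operator="UNION"):
    """Build a pathquery document and return it as a string.

    terms is a list of (value, pathexpr) pairs. This hypothetical helper
    places every term in one querygroup as a case-insensitive "contains"
    match, mirroring the example document.
    """
    if operator not in ("UNION", "INTERSECT"):
        raise ValueError("operator must be UNION or INTERSECT")

    root = ET.Element("pathquery", version="1.0")
    ET.SubElement(root, "meta_file_id").text = "unspecified"
    ET.SubElement(root, "querytitle").text = "unspecified"
    for field in return_fields:
        ET.SubElement(root, "returnfield").text = field
    for doctype in return_doctypes:
        ET.SubElement(root, "returndoctype").text = doctype

    group = ET.SubElement(root, "querygroup", operator=operator)
    for value, pathexpr in terms:
        term = ET.SubElement(
            group, "queryterm", casesensitive="false", searchmode="contains"
        )
        ET.SubElement(term, "value").text = value
        ET.SubElement(term, "pathexpr").text = pathexpr

    return ET.tostring(root, encoding="unicode")


pathquery_xml = build_pathquery(
    return_fields=["dataset/title", "keyword"],
    return_doctypes=["eml://ecoinformatics.org/eml-2.0.1"],
    terms=[("Datos", "dataset/title"), ("plant", "keyword")],
)
print(pathquery_xml)
```

Building the document with an XML library rather than string concatenation means that special characters in a search value, such as & or <, are escaped automatically.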
  

The pathquery document was designed to be flexible enough to query specific fields of any XML document. It also lets the client specify which fields of each matching document are included in the initial resultset. Each <returnfield> element names a field that the database returns for every query hit. The <returndoctype> elements let the client limit the types of documents returned; if no <returndoctype> element is present, all document types are returned.

A <querygroup> combines the <queryterm>s it contains. Its operator attribute can be UNION, which ORs the terms together, or INTERSECT, which ANDs them. A <queryterm> defines one search condition: the <value> element holds the text being searched for, and the <pathexpr> element gives the exact path to which the search is restricted. A <pathexpr> that contains the keyword returndoc is a special case, which is discussed in Packages and Relations.
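
The difference between the two operators can be illustrated with a toy in-memory model. This sketch does not reflect how Metacat evaluates queries (it translates them into SQL); the dictionary layout and the names term_matches and evaluate_group are assumptions made for illustration only, and paths are compared as plain strings.

```python
def term_matches(doc, value, pathexpr, casesensitive=False):
    """Return True if any text stored under pathexpr contains value."""
    for text in doc.get(pathexpr, []):
        if casesensitive:
            if value in text:
                return True
        elif value.lower() in text.lower():
            return True
    return False


def evaluate_group(docs, operator, terms):
    """Return the set of docids that satisfy a querygroup.

    docs maps docid -> {path: [text, ...]}; terms is a list of
    (value, pathexpr) pairs, each treated as a "contains" search.
    """
    per_term = [
        {docid for docid, doc in docs.items() if term_matches(doc, value, path)}
        for value, path in terms
    ]
    if not per_term:
        return set()
    if operator == "UNION":        # OR: a hit on any term is enough
        return set.union(*per_term)
    if operator == "INTERSECT":    # AND: every term must hit
        return set.intersection(*per_term)
    raise ValueError("operator must be UNION or INTERSECT")


# Toy data modelled on the example documents in this section.
docs = {
    "nceas.44.1": {"dataset/title": ["Datos Meteorologicos"], "keyword": ["intertidal"]},
    "nceas.42.1": {"dataset/title": ["Ocean Surface Temperature"], "keyword": ["Plant"]},
}
terms = [("Datos", "dataset/title"), ("plant", "keyword")]

union_hits = evaluate_group(docs, "UNION", terms)
intersect_hits = evaluate_group(docs, "INTERSECT", terms)
print(sorted(union_hits))      # ['nceas.42.1', 'nceas.44.1']
print(sorted(intersect_hits))  # []
```

With UNION, each document matches because it satisfies one of the two terms. With INTERSECT, neither document satisfies both terms, so nothing is returned.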


The Resultset Document

When the pathquery document is submitted and processed, Metacat returns another XML document called a resultset document.

      <resultset>
        <query>
          <pathquery version="1.0">
            <meta_file_id>unspecified</meta_file_id>
            <querytitle>unspecified</querytitle>
            <returnfield>dataset/title</returnfield>
            <returnfield>keyword</returnfield>
            <returnfield>originator/individualName/surName</returnfield>
            <returndoctype>eml://ecoinformatics.org/eml-2.0.1</returndoctype>
            <returndoctype>eml://ecoinformatics.org/eml-2.0.0</returndoctype>
            <querygroup operator="UNION">
              <queryterm casesensitive="false" searchmode="contains">
                <value>Datos</value>
                <pathexpr>dataset/title</pathexpr>
              </queryterm>
              <queryterm casesensitive="false" searchmode="contains">
                <value>plant</value>
                <pathexpr>keyword</pathexpr>
              </queryterm>
            </querygroup>
          </pathquery>
        </query>
      
        <document>
          <docid>nceas.44.1</docid>
          <docname>resource</docname>
          <doctype>eml://ecoinformatics.org/eml-2.0.1</doctype>
          <createdate>2001-01-12 16:12:06.0</createdate>
          <updatedate>2001-01-12 16:12:06.0</updatedate>
          <param name="dataset/title">Datos Meteorologicos</param>
          <param name="keyword">intertidal</param>
          <param name="originator/individualName/surName">Smith</param>
        </document>  
        
        <document>
          <docid>nceas.42.1</docid>
          <docname>resource</docname>
          <doctype>eml://ecoinformatics.org/eml-2.0.1</doctype>
          <createdate>2001-01-12 16:11:31.0</createdate>
          <updatedate>2001-01-12 16:11:31.0</updatedate>
          <param name="dataset/title">Ocean Surface Temperature</param>
          <param name="keyword">Plant</param>
          <param name="originator/individualName/surName">Henry</param>
        </document>
      .....  
      </resultset>
    
  

The first element in the resultset is <query>, which always contains the pathquery document that produced the resultset. The next major element is <document>: each XML document returned by the query is represented by one <document> element. The default document information returned is docid, docname, doctype, doctitle, createdate and updatedate. <param> elements are present when the matched document contains a returnfield chosen in the pathquery document. The name attribute of each <param> is the full path to the node specified by the returnfield.
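
A client that receives the resultset as XML might read it as follows. This is a minimal sketch using Python's standard library, not part of Metacat; the function name parse_resultset is hypothetical, and the sample input is an abbreviated copy of the resultset shown above. Values for each <param> name are collected into a list, on the assumption that a field such as keyword may occur more than once in a document.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the example resultset from this section.
RESULTSET_XML = """
<resultset>
  <query>
    <pathquery version="1.0">
      <returnfield>dataset/title</returnfield>
      <returnfield>keyword</returnfield>
    </pathquery>
  </query>
  <document>
    <docid>nceas.44.1</docid>
    <docname>resource</docname>
    <doctype>eml://ecoinformatics.org/eml-2.0.1</doctype>
    <param name="dataset/title">Datos Meteorologicos</param>
    <param name="keyword">intertidal</param>
  </document>
  <document>
    <docid>nceas.42.1</docid>
    <docname>resource</docname>
    <doctype>eml://ecoinformatics.org/eml-2.0.1</doctype>
    <param name="dataset/title">Ocean Surface Temperature</param>
    <param name="keyword">Plant</param>
  </document>
</resultset>
"""


def parse_resultset(xml_text):
    """Return one dict per <document> element in a resultset.

    Plain child elements (docid, docname, ...) become string entries;
    <param> elements are grouped by their name attribute under "params".
    """
    root = ET.fromstring(xml_text)
    hits = []
    for doc in root.findall("document"):
        hit = {child.tag: child.text for child in doc if child.tag != "param"}
        params = {}
        for param in doc.findall("param"):
            params.setdefault(param.get("name"), []).append(param.text)
        hit["params"] = params
        hits.append(hit)
    return hits


hits = parse_resultset(RESULTSET_XML)
for hit in hits:
    print(hit["docid"], hit["params"].get("dataset/title"))
```

The echoed <query> element is skipped here because only <document> children of the root are read; a client that needs to re-run or display the query could read it from root.find("query") instead.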

