Accessing and Submitting Metadata and Data
==========================================

.. contents::

The Metacat repository can be accessed and updated using a number of tools,
including:

* the Registry, Metacat's optional Web interface
* user-created HTML forms
* Metacat's EarthGrid API
* existing clients, such as KNB's Morpho application, designed to help
  scientists create, edit, and manage metadata
* user-created desktop clients that take advantage of Metacat's Java API.

In this section, we will look at how to take advantage of these tools to
customize Metacat for your user-base.

A Brief Note about How Information is Stored
--------------------------------------------

Metacat stores XML files as a hierarchy of nodes, where each node is stored as
records in database tables. Because many XML data schemas are broken up into
multiple DTDs requiring multiple XML files that are related but stored
separately in the system, the system uses "packages" to link related but
separate documents. Packaged documents contain information that shows how they
are related to each other, essentially stating that file A has a relationship
to file B, etc. A package file also allows users to link metadata files to the
data files they describe. For more information about the structure of data
packages and how XML documents and data are stored in Metacat, please see the
developer's documentation.

Using the Registry
------------------

Metacat's optional Registry provides a simple Web-based interface for creating,
editing, and submitting metadata to the Metacat repository (screenshot below). The
interface includes help documentation, and can be customized using Metacat's
configuration settings. The Registry also includes an administrative interface
for managing LDAP user accounts, which is useful if you are using LDAP as your
Metacat authentication system. Note that you must be running your own LDAP
server if you wish to use the LDAP Web interface. If you do not have your own
LDAP server, you can create and manage new accounts on the KNB website
(http://knb.ecoinformatics.org/). Please note that at this time, the Registry
interface has only been tested on Linux systems.

.. figure:: images/screenshots/image033.jpg
   :align: center

   An example installation of the Registry's web interface. Customize the
   displayed and required modules with the Skins Configuration settings.

You can customize which modules (e.g., "Name of Submitter" or "Temporal
Coverage of Data") are displayed and which are required using the Skins
Configuration settings.

Installing the Registry
~~~~~~~~~~~~~~~~~~~~~~~

In order to install and run the Registry, you must have Metacat installed and
Tomcat must be running behind an Apache Web server (see previous sections for
information about installing and configuring Apache to run with Tomcat).

To install and run the Registry:

1. Build the Metacat Perl client library:

   ::

     cd $METACAT/src/perl/Metacat
     perl Makefile.PL
     sudo make
     sudo make install

.. sidebar:: Instructions for Red Hat (Alternate Step 2)

   * Install the libraries

     ::

       sudo yum install gcc libxml2-devel libxslt-devel ant -y

   * Install CPAN, which allows us to install the Perl dependencies for the
     registry and account management parts of Metacat. If asked to manually
     configure cpan, type 'no' and CPAN will be set up with its default values.

     ::

       sudo yum install perl-CPAN
       sudo cpan

   * You should now see a prompt which looks like:

     ::

       cpan>

   * The rest of the commands assume you're inside of CPAN. Let's get the most
     recent version of the CPAN software. Just press return after any prompts
     you receive during this process.

     ::

       install Bundle::CPAN
       reload cpan

   * Install the required modules. Here we're installing an old LibXSLT, as the
     current one requires a newer libxslt than is available on Red Hat 4 and 5.
     Again, just answer 'yes' to any questions.

     ::

       install AutoLoader
       install CGI
       install CGI::Session
       install LWP::UserAgent
       install Net::LDAP
       install Template
       install URI
       install MSERGEANT/XML-LibXSLT-1.58.tar.gz
       install Captcha::reCAPTCHA
       install DateTime

2. Install the required system libraries using Ubuntu/Debian (instructions
   for Red Hat are in the sidebar)

   * Install the libraries

     ::

       sudo apt-get install ant libappconfig-perl libxml-libxml-perl \
       libxml-libxslt-perl libtemplate-perl libcgi-session-perl \
       build-essential libnet-ldap-perl libterm-readkey-perl \
       libxml-dom-perl libsoap-lite-perl -y

   * Install the remaining packages using cpan

     ::

       sudo cpan -i Digest::SHA1
       sudo cpan -i Config::Properties
       sudo cpan -i Scalar::Util
       sudo cpan -i Captcha::reCAPTCHA
       sudo cpan -i DateTime

3. Double-check that Metacat's temporary folder, ``application.tempDir``, is
   writable by the Apache user, usually ``www-data`` or ``apache``.

4. Make sure that the following scripts (found in ``<tomcat-home>/webapps/metacat/cgi-bin``)
   are executable: ``register-dataset.cgi`` and ``ldapweb.cgi``.

   ::

     sudo chmod +x <tomcat-home>/webapps/metacat/cgi-bin/*.cgi

5. Restart Apache.

   ::

     sudo /etc/init.d/apache2 restart

6. Visit the resulting URL:

   Where ``<your_context_url>`` is the URL of the server hosting the Metacat
   followed by the name of the WAR file (i.e., the application context) that
   you installed. For instance, the context URL for the KNB Metacat is:

If everything worked correctly, the registry home page will open (see figure).

.. figure:: images/screenshots/image035.jpg
   :align: center

   An example of the Registry home page (with the default skin).

Customizing the Registry
~~~~~~~~~~~~~~~~~~~~~~~~

Before using the registry, you may wish to customize the interface using the
Skins Configuration settings. If you are using the default skin, you must
disable the 'show site list' setting before you can submit the form without
errors. You may also wish to remove (or modify) the list of NCEAS-specific
projects that appear in the default registry. To remove these form fields,
open Metacat's administrative interface (http://<your.context.url>/metacat/admin)
and select the Skins Specific Properties Configuration option. On the skins
configuration page, uncheck the boxes beside any form elements that you do not
wish to appear in the registry.

Once you have saved your changes, you must restart Tomcat for them to take
effect. To restart Tomcat, type ``sudo /etc/init.d/tomcat6 restart`` or an
equivalent command appropriate to your operating system.

.. figure:: images/screenshots/image037.jpg
   :align: center

   Uncheck the box beside any setting to remove it from the Registry form. In
   the example, the "Show Site List" and "Show Work Group" form fields,
   corresponding to the "Station Name" and "NCEAS Project" drop-down lists in
   the registry form, have been removed.

LDAP account management
~~~~~~~~~~~~~~~~~~~~~~~

If you intend to use Metacat's built-in LDAP account management feature,
you will need public and private keys for the reCAPTCHA widget.

1. Get private and public reCAPTCHA keys from Google using your Google account:

2. Configure Metacat to use those keys in the ``metacat.properties`` file:

   ::

     ldap.recaptcha.publickey=<your public key>
     ldap.recaptcha.privatekey=<your private key>

3. Restart Tomcat

Using HTML Forms (the HTTP Interface)
-------------------------------------

Metacat's HTTP interface supports GET and POST requests and a variety of
actions (Table 4.1) that facilitate information retrieval and storage. HTTP
requests can be sent from any client application that communicates over HTTP.

* Supported Actions (API)
* Logging in
* Inserting, Updating, and Deleting XML and Data Documents
* Searching Metacat
* Paged Query Return
* Reading Data and Metadata

Supported Actions
~~~~~~~~~~~~~~~~~

Metacat supports GET and POST requests as well as actions for writing, querying,
and reading stored XML. In addition, the HTTP interface includes functions for
validating and transforming XML documents (see table).

Note that if Replication is enabled, Metacat recognizes several additional
actions, included in Table 4.2. For more information about replication,
please see :doc:`replication`.

.. list-table::
   :header-rows: 1
   :widths: 20 80

   * - Action
     - Description and Parameters
   * - delete
     - Delete the specified document from the database. For an example,
       please see Inserting, Updating, and Deleting XML and Data Documents.

       * ``docid`` - the docid of the document to delete
   * - export
     - Export a data package in a zip file.

       * ``docid`` - the docid of the document to export
   * - getaccesscontrol
     - Get the access control list (ACL) for the specified document.

       * ``docid`` - the docid of the document whose ACL is requested
   * - getalldocids
     - Retrieve a list of all docids registered with the system.

       * ``scope`` - a string used to match a range of docids in a SQL LIKE statement
   * - getdataguide
     - Read a data guide for the specified document type. Use getdtdschema instead.

       * ``doctype`` - the doctype for which to get the data guide
   * - getdoctypes
     - Get all doctypes currently available in the Metacat Catalog System.
       No parameters.
   * - getdtdschema
     - Read the DTD or XMLSchema file for the specified doctype.

       * ``doctype`` - the doctype for which DTD or XMLSchema files to read
   * - getlastdocid
     - Get the latest docid with revision number used by scope.

       * ``scope`` - the scope to be queried
   * - getlog
     - Get the log of events recorded by Metacat, filtered by the
       parameters below.

       * ``ipaddress`` - the internet protocol address for the event
       * ``principal`` - the principal for the event (a username, etc.)
       * ``docid`` - the identifier of the document to which the event applies
       * ``event`` - the string code for the event
       * ``start`` - beginning of date-range for query
       * ``end`` - end of date-range for query
   * - getloggedinuserinfo
     - Get user info for the currently logged in user. No parameters.
   * - getprincipals
     - Get all users and groups in the current authentication schema.
       No parameters.
   * - getrevisionanddoctype
     - Return the revision and doctype of a document. The output is a
       string that looks like "rev;doctype".

       * ``docid`` - the docid of the document
   * - getversion
     - Get Metacat version. Return the current version of Metacat as XML.
       No parameters.
   * - insert
     - Insert an XML document into the database. For an example, please see
       Inserting, Updating, and Deleting XML and Data Documents.

       * ``docid`` - the user-defined docid to assign to the new XML document
       * ``doctext`` - the text of the XML document to insert
   * - insertmultipart
     - Insert an XML document using multipart encoding into the database.

       * ``docid`` - the user-defined docid to assign to the new XML document
       * ``doctext`` - the text of the XML document to insert
   * - isregistered
     - Check if an individual document exists in either the xml_documents or
       xml_revisions tables. For more information about Metacat's database
       schema, please see the developer documentation.

       * ``docid`` - the docid of the document
   * - login
     - Log the user in. You must log in using this action before you can
       perform many of the actions. For an example of the login action, see
       Logging In.

       * ``username`` - the user's login name
       * ``password`` - the user's password
   * - logout
     - Log the current user out and destroy the associated session.
       No parameters.
   * - query
     - Perform a free text query. For an example, please see Searching Metacat.

       * ``returndoctype`` - the doctype to use for your Package View. For
         more information about packages, see
         http://knb.ecoinformatics.org/software/metacat/packages.html
       * ``qformat`` - the format of the returned result set. Possible values
         are ``html``, ``xml``, or the name of your servlet's Metacat skin.
       * ``querytitle`` - OPTIONAL - the title of the query
       * ``doctype`` - OPTIONAL - if doctype is specified, the search is
         limited to the specified doctype(s) (e.g.,
         eml://ecoinformatics.org/eml-2.0.1 and/or
         eml://ecoinformatics.org/eml-2.0.0). If no doctype element is
         specified, all document types are returned.
       * ``returnfield`` - a custom field to be returned by any hit document
       * ``operator`` - the Boolean operator to apply to the query. Possible
         values are: union or intersect
       * ``searchmode`` - the type of search to be performed. Possible values
         are: contains, starts-with, ends-with, equals, isnot-equal,
         greater-than, less-than, greater-than-equals, less-than-equals.
       * ``anyfield`` - a free-text search variable. The value placed in this
         parameter will be searched for in any document in any node.
       * ``pagesize`` - the number of search results to display on each
         search results page (e.g., 10). Used with pagestart. See Paged Query
         Returns for an example.
       * ``pagestart`` - the displayed search results page (e.g., 1). Used
         with pagesize. See Paged Query Returns for an example.
   * - read
     - Get a document from the database and return it in the specified
       format. See Reading Data and Metadata for an example.

       * ``docid`` - the docid of the document to return
       * ``qformat`` - the format to return the document in. Possible values
         are: ``html``, ``xml``, or, if your Metacat uses a skin, the name of
         the skin.
   * - readinlinedata
     - Read inline data only.

       * ``inlinedataid`` - the id of the inline data to read
   * - setaccess
     - Change access permissions for a user on a specified document.

       * ``docid`` - the docid of the document to be modified
       * ``principal`` - the user or group whose permissions will be modified
       * ``permission`` - the permission to set (read, write, all)
       * ``permType`` - the type of permission to set (allow, deny)
       * ``permOrder`` - the order in which to apply the permission
         (allowFirst, denyFirst)
   * - spatial_query
     - Perform a spatial query. These queries may include any of the queries
       supported by the WFS / WMS standards. For more information, see
       Spatial Queries.

       * ``xmax`` - max x spatial coordinate
       * ``ymax`` - max y spatial coordinate
       * ``xmin`` - min x spatial coordinate
       * ``ymin`` - min y spatial coordinate
   * - squery
     - Perform a structured query. For an example, please see Searching Metacat.

       * ``query`` - the text of the pathquery document sent to the server
       * ``qformat`` - the format to return the results in. Possible values
         are: ``xml``, or the name of a skin.
   * - update
     - Overwrite an XML document with a new one and give the new one the same
       docid but with the next revision number. For an example, please see
       Inserting, Updating, and Deleting XML and Data Documents.

       * ``docid`` - the docid of the document to update
       * ``doctext`` - the text with which to update the XML document
   * - upload
     - Upload (insert or update) a data file into Metacat. Data files are
       stored on Metacat and may be in any format (binary or text), but they
       are all treated as if they were binary.

       * ``docid`` - the docid of the data file to upload
       * ``datafile`` - the data file to upload
   * - validate
     - Validate a specified document against its DTD.

       * ``docid`` - the docid of the document to validate
       * ``valtext`` - the DTD by which to validate this document

Metacat Replication Parameters

.. list-table::
   :header-rows: 1
   :widths: 20 80

   * - Action
     - Description and Parameters
   * - forcereplicate
     - Force the local server to get the specified document from the remote host.

       * ``server`` - the server to which this document is being sent
       * ``docid`` - the docid of the document to send
       * ``dbaction`` - the action to perform on the document: insert or
         update (the default)
   * - getall
     - Force the local server to check all known servers for updated
       documents. No parameters.
   * - getcatalog
     - Send the contents of the xml_catalog table encoded in XML. No parameters.
   * - getlock
     - Request a lock on the specified document.

       * ``docid`` - the docid of the document
       * ``updaterev`` - the revision number of docid
   * - gettime
     - Return the local time on this server. No parameters.
   * - servercontrol
     - Perform the specified replication control on the Replication daemon.

       * ``add`` - add a new server to the replication list
       * ``delete`` - remove a server from the replication list
       * ``list`` - list all of the servers currently in the server list
       * ``replicate`` - a Boolean flag (1 or 0) which determines if this
         server should copy files from the newly added server
       * ``server`` - the server to add/delete
   * - read
     - Sends docid to the remote host.

       * ``docid`` - the docid of the document to read
   * - start
     - Start the Replication daemon with a time interval of deltaT.

       * ``rate`` - the rate (in seconds) at which you want the replication
         daemon to check for updated documents. The value cannot be less
         than 30. The default is 1000.
   * - stop
     - Stop the Replication daemon. No parameters.
   * - update
     - Send a list of all documents on the local server along with their
       revision numbers. No parameters.

Logging In
~~~~~~~~~~

To log in to Metacat, use the ``login`` action.

The following is an example of a Web form (see figure) that logs a user into
Metacat. Example HTML code is included below the screenshot.

.. figure:: images/screenshots/image039.jpg
   :align: center

   Logging into Metacat using an HTML form.

::

  <!-- submitform() is a page script that is not shown in this example -->
  <form name="loginform" method="post"
        action="http://yourserver.com/yourcontext/servlet/metacat"
        target="_top" onsubmit="return submitform(this);" id="loginform">
    <input type="hidden" name="action" value="login">
    <input type="hidden" name="username" value="">
    <input type="hidden" name="qformat" value="xml">
    <input type="hidden" name="enableediting" value="false">

    <table>
      <tr valign="middle">
        <td align="left" valign="middle" class="text_plain">username:</td>
        <td width="173" align="left" class="text_plain"
            style="padding-top: 2px; padding-bottom: 2px;">
          <input name="uid" type="text" style="width: 140px;" value=""></td>
      </tr>
      <tr valign="middle">
        <td height="28" align="left" valign="middle" class="text_plain">organization:</td>
        <td align="left" class="text_plain"
            style="padding-top: 2px; padding-bottom: 2px;">
          <select name="organization" style="width:140px;">
            <option value="" selected>&mdash; choose one &mdash;</option>
            <option value="NCEAS">NCEAS</option>
            <option value="LTER">LTER</option>
            <option value="UCNRS">UCNRS</option>
            <option value="PISCO">PISCO</option>
            <option value="OBFS">OBFS</option>
            <option value="OSUBS">OSUBS</option>
            <option value="SAEON">SAEON</option>
            <option value="SANParks">SANParks</option>
            <option value="SDSC">SDSC</option>
            <option value="KU">KU</option>
            <option value="unaffiliated">unaffiliated</option>
          </select></td>
      </tr>
      <tr valign="middle">
        <td width="85" align="left" valign="middle" class="text_plain">password:</td>
        <td align="left" class="text_plain"
            style="padding-top: 2px; padding-bottom: 2px;">
          <input name="password" type="password" maxlength="50"
                 style="width:140px;"></td>
        <td align="center" class="buttonBG_login">
          <input type="submit" name="loginAction" value="Login"></td>
      </tr>
    </table>
  </form>

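The same ``login`` request can be prepared by a script instead of a browser.
The following is a minimal sketch using only the Python standard library. It
builds the POST request without sending it, the endpoint is the placeholder
URL from the form above, and the user name and password are made-up values.

.. code-block:: python

   from urllib.parse import urlencode
   from urllib.request import Request

   # Placeholder endpoint taken from the example form; replace with your own.
   METACAT_URL = "http://yourserver.com/yourcontext/servlet/metacat"


   def build_login_request(username, password, url=METACAT_URL):
       """Build (but do not send) a POST request for the ``login`` action."""
       fields = {
           "action": "login",
           "username": username,
           "password": password,
           "qformat": "xml",
       }
       body = urlencode(fields).encode("utf-8")
       return Request(url, data=body, method="POST")


   # Hypothetical credentials, for illustration only.
   request = build_login_request("example-user", "example-password")
   print(request.get_method(), request.full_url)
   print(request.data.decode("utf-8"))

Because ``logout`` destroys a session, a real client would most likely need to
keep whatever session information the server returns and send it with later
requests. How that is done depends on your HTTP client and is not shown here.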
Inserting, Updating, and Deleting XML and Data Documents
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Adding, editing, and deleting XML documents in Metacat can be accomplished
using the insert, update, and delete actions, respectively. Before you can
insert, delete, or update documents, you must log in to Metacat using the
login action. See Logging In for an example.

insert
  Insert a new XML or data document into Metacat. You must specify a document ID.

update
  Update an existing Metacat document. The original document is archived,
  then overwritten.

delete
  Archive a document and move the pointer in xml_documents to xml_revisions,
  effectively "deleting" the document from public view, but preserving the
  revision for the revision history. No further updates will be allowed for
  the Metacat document that was "deleted". All revisions of this identifier
  are no longer public.

.. warning::
   It is not possible to "delete" one revision without "deleting" all
   revisions of a given identifier.

The following is an example of a Web form (see figure) that can perform all
three tasks. Example HTML code is included below the screenshot.

.. figure:: images/screenshots/image041.jpg
   :align: center

   An example of a Web form used to insert, delete, or update XML documents in Metacat.

::

  <html>
  <body class="emlbody">
  <b>MetaCat XML Loader</b>
  <p>
  Upload, Change, or Delete an XML document using this form.
  </p>
  <form action="http://yourserver.com/yourcontext/servlet/metacat" method="POST">
    <strong>1. Choose an action: </strong>
    <input type="radio" name="action" value="insert" checked> Insert
    <input type="radio" name="action" value="update"> Update
    <input type="radio" name="action" value="delete"> Delete
    <input type="submit" value="Process Action">
    <br />
    <strong>2. Provide a Document ID </strong>
    <input type="text" name="docid"> (optional for Insert)
    <input type="checkbox" name="public" value="yes" checked><strong>Public Document</strong>
    <br />
    <strong>3. Provide XML text </strong> (not needed for Delete)<br/>
    <textarea name="doctext" cols="65" rows="15"></textarea><br/>
    <strong>4. Provide DTD text for upload </strong> (optional; not needed for Delete)
    <textarea name="dtdtext" cols="65" rows="15"></textarea>
  </form>
  </body>
  </html>

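The three actions differ only in the fields they send, which a short script
can make explicit. The sketch below builds the URL-encoded form body for each
action using the ``action``, ``docid``, and ``doctext`` parameters from the
table of supported actions. The function name and the docid ``example.1.1``
are illustrative, and nothing is sent to a server.

.. code-block:: python

   from urllib.parse import urlencode

   VALID_ACTIONS = {"insert", "update", "delete"}


   def build_document_request(action, docid, doctext=None):
       """Return the URL-encoded form body for insert, update, or delete."""
       if action not in VALID_ACTIONS:
           raise ValueError(f"unsupported action: {action!r}")
       fields = {"action": action, "docid": docid}
       if action != "delete":
           # insert and update both carry the XML text of the document
           if not doctext:
               raise ValueError(f"{action} requires doctext")
           fields["doctext"] = doctext
       return urlencode(fields)


   # Hypothetical docid and document text, for illustration only.
   print(build_document_request("insert", "example.1.1", "<eml/>"))
   print(build_document_request("delete", "example.1.1"))

The resulting string would be posted to the Metacat servlet URL after a
successful ``login``.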
Searching Metacat
~~~~~~~~~~~~~~~~~

To search Metacat use the ``query`` or ``squery`` actions.

query
  Perform a free text query. Specify the returndoctype, qformat, returnfield,
  operator, searchmode, anyfield, and (optionally) a querytitle and doctype.

squery
  Perform a structured query by submitting an XML pathquery document to the
  Metacat server.

When Metacat receives a query via HTTP (screenshot below), the server creates a
"pathquery" document, which is an XML document populated with the specified
search criteria. The pathquery document is then translated into
SQL statements that are executed against the database. Results are translated
into an XML "resultset" document, which can be returned as XML or transformed
into HTML and returned (specify which you would prefer with the qformat
parameter). You can also opt to submit a pathquery document directly,
using an squery action.

.. figure:: images/screenshots/image043.jpg
   :align: center

   Example of a basic search form using a query action. The HTML code used to create the form is displayed below.

::

  <form method="POST" action="http://panucci.nceas.ucsb.edu/metacat/metacat">

    Search for:

    <input name="action" value="query" type="hidden">
    <input name="operator" value="INTERSECT" type="hidden">
    <input name="anyfield" type="text" value="" size="40">
    <input name="qformat" value="html" type="hidden">

    <input name="returnfield" value="creator/individualName/surName" type="hidden">
    <input name="returnfield" value="creator/individualName/givenName" type="hidden">
    <input name="returnfield" value="creator/organizationName" type="hidden">
    <input name="returnfield" value="dataset/title" type="hidden">
    <input name="returnfield" value="keyword" type="hidden">

    <input name="returndoctype" value="eml://ecoinformatics.org/eml-2.0.1" type="hidden">

    <input value="Start Search" type="submit">

  </form>

Metacat's pathquery document can query specific fields of any XML document.
The pathquery can also be used to specify which fields from each hit are
returned and displayed in the search result set.

::

  <pathquery version="1.0">
      ...
      <querygroup operator="UNION">
          <queryterm casesensitive="true" searchmode="contains">
              <value>Charismatic megafauna</value>
              <pathexpr>...</pathexpr>
          </queryterm>
          <queryterm casesensitive="false" searchmode="starts-with">
              <value>sea otter</value>
              <pathexpr>...</pathexpr>
          </queryterm>
          <queryterm casesensitive="false" searchmode="contains">
              <value>...</value>
              <pathexpr>...</pathexpr>
          </queryterm>
      </querygroup>
  </pathquery>

Each ``<returnfield>`` parameter specifies a field that the database will
return (in addition to the fields Metacat returns by default) for each search
result.

The ``<returndoctype>`` field limits the type of returned documents
(e.g., eml://ecoinformatics.org/eml-2.0.1 and/or eml://ecoinformatics.org/eml-2.0.0).
If no returndoctype element is specified, all document types are returned.

A ``<querygroup>`` creates an AND or an OR statement that applies to the
nested ``<queryterm>`` tags. The querygroup operator can be UNION or INTERSECT.
A ``<queryterm>`` defines the actual field (contained in ``<pathexpr>`` tags)
against which the query (contained in the ``<value>`` tags) is being performed.

The ``<pathexpr>`` can also contain a document type keyword contained in
``<returndoc>`` tags. The specified document type applies only to documents
that are packaged together (e.g., a data set and its corresponding metadata file).
If Metacat identifies the search term in a packaged document, the servlet will
check to see if that document's type matches the specified one. If not,
Metacat will check if one of the other documents in the package matches. If so,
Metacat will return the matching document. For more information about packages,
please see the developer documentation.

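A pathquery for the ``squery`` action can also be assembled by a script. The
sketch below uses Python's ``xml.etree.ElementTree`` and only the elements
described above (``returnfield``, ``querygroup``, ``queryterm``, ``value``,
``pathexpr``). The function name and the sample field paths are illustrative,
and a real server may expect additional elements that this sketch omits.

.. code-block:: python

   import xml.etree.ElementTree as ET


   def build_pathquery(terms, returnfields, operator="UNION"):
       """Build a pathquery document as a string.

       ``terms`` is a list of (value, pathexpr, searchmode) tuples.
       """
       root = ET.Element("pathquery", version="1.0")
       for field in returnfields:
           ET.SubElement(root, "returnfield").text = field
       group = ET.SubElement(root, "querygroup", operator=operator)
       for value, pathexpr, searchmode in terms:
           term = ET.SubElement(
               group, "queryterm", casesensitive="false", searchmode=searchmode
           )
           ET.SubElement(term, "value").text = value
           ET.SubElement(term, "pathexpr").text = pathexpr
       return ET.tostring(root, encoding="unicode")


   # Illustrative field paths; substitute paths from your own schema.
   print(build_pathquery([("sea otter", "keyword", "starts-with")], ["dataset/title"]))

The returned string would be sent as the ``query`` parameter of an ``squery``
request.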
After Metacat has processed a pathquery document, it returns a resultset document.

::

  <resultset>
    <query>
      <pathquery version="1.0">
          ...
          <querygroup operator="UNION">
              <queryterm casesensitive="true" searchmode="contains">
                  <value>Charismatic megafauna</value>
                  <pathexpr>...</pathexpr>
              </queryterm>
              <queryterm casesensitive="false" searchmode="starts-with">
                  <value>sea otter</value>
                  <pathexpr>...</pathexpr>
              </queryterm>
              <queryterm casesensitive="false" searchmode="contains">
                  <value>...</value>
                  <pathexpr>...</pathexpr>
              </queryterm>
          </querygroup>
      </pathquery>
    </query>
    <document>
      <!-- docid, docname, doctype, createdate, updatedate (values elided) -->
      <param name="dataset/title">Marine Mammal slides</param>
      <param name="creator/individualName/surName">Bancroft</param>
    </document>
    <document>
      <!-- docid, docname, doctype, createdate, updatedate (values elided) -->
      <param name="dataset/creator/individualName/surName">Nelson</param>
      <param name="dataset/creator/individualName/surName">Harrer</param>
      <param name="dataset/creator/individualName/surName">Reed</param>
      <param name="dataset/title">SBC LTER: Reef: Sightings of Sea Otters (Enhydra lutris) near Santa Barbara and Channel Islands, ongoing since 2007</param>
    </document>
  </resultset>

When Metacat returns a resultset document, the servlet always includes the
pathquery used to create it. The pathquery XML is contained in the ``<query>`` tag,
the first element in the resultset.

Each XML document returned by the query is represented by a ``<document>`` tag. By
default, Metacat will return the docid, docname, doctype, doctitle, createdate
and updatedate for each search result. If the user specified additional return
fields in the pathquery using ``<returnfield>`` tags (e.g., dataset/title to return
the document title), the additional fields are returned in ``<param>`` tags.

Metacat can return the XML resultset to your client as either XML or HTML.

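A client that requests the XML form of the resultset can pull out the returned
fields with a few lines of code. The sketch below assumes that each
``<document>`` holds its docid in a ``<docid>`` child element, which is an
assumption about the layout. The ``<param name="...">`` elements are as
described above, and the sample document and its docid are made up.

.. code-block:: python

   import xml.etree.ElementTree as ET

   # Made-up sample; the <docid> child element is an assumed layout.
   SAMPLE = """<resultset>
     <query><pathquery version="1.0"/></query>
     <document>
       <docid>example.1.1</docid>
       <param name="dataset/title">Marine Mammal slides</param>
       <param name="creator/individualName/surName">Bancroft</param>
     </document>
   </resultset>"""


   def parse_resultset(xml_text):
       """Return one dict per <document>: its docid and its params.

       A field can repeat, so each param name maps to a list of values.
       """
       results = []
       for doc in ET.fromstring(xml_text).iter("document"):
           params = {}
           for param in doc.findall("param"):
               params.setdefault(param.get("name"), []).append(param.text)
           results.append({"docid": doc.findtext("docid"), "params": params})
       return results


   for hit in parse_resultset(SAMPLE):
       print(hit["docid"], hit["params"].get("dataset/title"))

Mapping each name to a list keeps repeated fields, such as several creator
surnames on one data set, from overwriting one another.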
Paged Query Returns
749 |
750 |
Dividing large search result sets over a number of pages speeds load-time and
751 |
makes the result sets more readable to users (Figure 4.12). To break your search
752 |
results into pages, use the query action's optional pagestart and pagesize
753 |
parameters. The pagesize parameter indicates how many results should be
754 |
returned for a given page. The pagestart parameter indicates which page you
755 |
are currently viewing.
756 |
757 |
.. figure:: images/screenshots/image045.jpg
758 |
:align: center
759 |
760 |
An example of paged search results.
761 |
762 |
When a paged query is performed, the query's resultset contains four extra
763 |
fields: pagestart, pagesize, nextpage, and previouspage (Figure 4.13). The
764 |
nextpage and previouspage fields help Metacat generate navigational links in
765 |
the rendered resultset using XSLT to transform the XML to HTML.
766 |
767 |
768 |
769 |
<!-- An example of an XML resultset that include support for page breaks.
770 |
The pagestart parameter will always indicate the page you are currently viewing.
771 |
772 |
773 |
774 |
775 |
776 |
777 |
<query> ...</query>
778 |
779 |
780 |
781 |
782 |
The HTML search results displayed in the figure were rendered using Kepler's XSLT,
783 |
which can be found in lib/style/skins/kepler. Kepler's XSLT uses the four extra
784 |
resultset fields to render the "Next" and "Previous" links.
785 |
786 |
787 |
788 |
<a href="metacat?action=query&operator=INTERSECT&enableediting=false&anyfield=actor&qformat=kepler&pagestart=0&pagesize=10">Previous Page</a>
789 |
<a href="metacat?action=query&operator=INTERSECT&enableediting=false&anyfield=actor&qformat=kepler&pagestart=2&pagesize=10">Next Page</a>
790 |
In the example above, the current page is 1. The ``pagestart`` values in the two links point to the previous page (page 0) and the next page (page 2).

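The links above follow a regular pattern, so a client that builds its own
navigation could generate them. The Java sketch below is one way to do that.
``PagedQueryLinks`` and ``pageUrl`` are hypothetical names, and the fixed
``operator`` and ``enableediting`` values are simply copied from the example
links rather than required by Metacat.

.. code-block:: java

   import java.net.URLEncoder;
   import java.nio.charset.StandardCharsets;

   // Hypothetical helper (not part of Metacat): builds paged query links.
   class PagedQueryLinks {
       // Builds a query URL for one page; parameter names mirror the example links.
       static String pageUrl(String servlet, String anyfield, String qformat,
                             int pagestart, int pagesize) {
           return servlet
                   + "?action=query&operator=INTERSECT&enableediting=false"
                   + "&anyfield=" + URLEncoder.encode(anyfield, StandardCharsets.UTF_8)
                   + "&qformat=" + URLEncoder.encode(qformat, StandardCharsets.UTF_8)
                   + "&pagestart=" + pagestart
                   + "&pagesize=" + pagesize;
       }
   }

For a current page of 1, calling ``pageUrl("metacat", "actor", "kepler", 0, 10)``
and ``pageUrl("metacat", "actor", "kepler", 2, 10)`` reproduces the "Previous Page"
and "Next Page" targets shown in the example.
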
793 |
Reading Data and Metadata
794 |
To read data or metadata from Metacat, use the ``read`` action. The ``read`` action
takes two parameters: ``docid``, which specifies the document ID of the document
to return, and ``qformat``, which specifies the return format for the document
(``html``, ``xml``, or the name of a configured style-set, e.g., ``default``). If ``qformat``
is set to ``xml``, Metacat will return the XML document untransformed. If the
return format is set to ``html``, Metacat will transform the XML document into
HTML using the default XSLT style sheet (specified in the Metacat
configuration). If the name of a style-set is specified, Metacat will use the
XSLT styles specified in the set to transform the XML.

.. figure:: images/screenshots/image047.jpg
   :align: center

   The same document displayed using different qformat parameters (from left
   to right: the default style-set, XML, and HTML).

Note that the ``read`` action can be used to read both data files and metadata files.
To read a data file, you could use the following request::

  http://your.server/your.context/servlet/metacat?action=read&docid=nceas.55&qformat=default

Where ``nceas.55`` is the docid of the data file stored in Metacat and
``default`` is the name of the style (you could also use ``html``, ``xml``, or the
name of a customized skin).

A ``read`` request can also be submitted from an HTML form::

  <html>
  <head>
  <title>Read Document</title>
  </head>
  <body>
  <form method="POST" action="http://your.server/your.context/servlet/metacat">
    <input name="action" value="read" type="hidden">
    <input name="docid" type="text" value="" size="40">
    <input name="qformat" value="default" type="hidden">
    <input value="Read" type="submit">
  </form>
  </body>
  </html>

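When a browser submits a form like this with ``method="POST"``, the fields
travel as a URL-encoded request body. The Java sketch below builds the same body
by hand, which may help when writing a non-browser client. ``ReadForm`` is a
hypothetical name, and the sketch only assembles the body; sending it to your
Metacat servlet URL is left to whichever HTTP library you use.

.. code-block:: java

   import java.net.URLEncoder;
   import java.nio.charset.StandardCharsets;
   import java.util.LinkedHashMap;
   import java.util.Map;
   import java.util.StringJoiner;

   // Hypothetical helper (not part of Metacat): encodes the read form's fields.
   class ReadForm {
       // Returns the application/x-www-form-urlencoded body for a read request.
       static String body(String docid, String qformat) {
           // LinkedHashMap keeps the fields in the order the form declares them.
           Map<String, String> fields = new LinkedHashMap<>();
           fields.put("action", "read");
           fields.put("docid", docid);
           fields.put("qformat", qformat);

           StringJoiner joined = new StringJoiner("&");
           for (Map.Entry<String, String> field : fields.entrySet()) {
               joined.add(URLEncoder.encode(field.getKey(), StandardCharsets.UTF_8)
                       + "="
                       + URLEncoder.encode(field.getValue(), StandardCharsets.UTF_8));
           }
           return joined.toString();
       }
   }

Encoding each value means a docid typed by a user cannot inject extra
parameters into the request.
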
Using the EarthGrid API (aka EcoGrid)
-------------------------------------

.. Note::

  The EarthGrid/EcoGrid web service API is *deprecated* as of Metacat 2.0.0 and
  will be removed from a future version of Metacat. Its functionality is being
  replaced by the standardized DataONE REST service interface. The EarthGrid API
  will be completely removed by the end of 2013.

The EarthGrid (aka EcoGrid) provides access to disparate data on different
networks (e.g., KNB, GBIF, GEON) and storage systems (e.g., Metacat and SRB),
giving scientists access to a wide variety of data and analytic resources
(e.g., data, metadata, analytic workflows and processors) networked at different
sites and at different organizations via the internet.

Because Metacat supports the EarthGrid API (see table), it can query the
distributed EarthGrid, retrieve metadata and data results, and write new and
updated metadata and data back to the grid nodes.

For more information about each EarthGrid service and its WSDL file, navigate
to the "services" page on your Metacat server
(e.g., http://knb.ecoinformatics.org/metacat/services).
Note that the AdminService and Version services that appear on this page are
not part of EarthGrid.

EarthGrid/EcoGrid API Summary

.. list-table::
   :header-rows: 1
   :widths: 25 75

   * - Service
     - Description
   * - AuthenticationQueryService
     - Search for and retrieve protected metadata and data from the EarthGrid as an authenticated user.

       Methods: ``query``, ``get``
   * - AuthenticationService
     - Log in and out of the EarthGrid.

       Methods: ``login``, ``logout``
   * - IdentifierService
     - List, look up, validate, and add Life Science Identifiers (LSIDs) to the EarthGrid.

       Methods: ``isRegistered``, ``addLSID``, ``getNextRevision``, ``getNextObject``, ``getAllIds``
   * - PutService
     - Write metadata to the EarthGrid.

       Methods: ``put``
   * - QueryService
     - Search for and retrieve metadata from the EarthGrid.

       Methods: ``query``, ``get``
   * - RegistryService
     - Add, update, remove, and search for registered EarthGrid services.
       Note: The WSDL for this service is found under http://ecogrid.ecoinformatics.org/registry/services

       Methods: ``add``, ``update``, ``remove``, ``list``, ``query``

Using Morpho
------------
Morpho is a desktop tool created to facilitate the creation, storage, and
retrieval of metadata. Morpho interfaces with any Metacat server, allowing
users to upload, download, store, query, and view relevant metadata and data
over the network. Users can authorize the public or only selected colleagues
to view their data files.

Morpho is part of the Knowledge Network for Biocomplexity (KNB), a national
network intended to facilitate ecological and environmental research on
biocomplexity. To use Morpho with your Metacat, set the Metacat URL in the
Morpho Preferences to point to your Metacat server.

.. figure:: images/screenshots/image049.png
   :align: center

   Set the Metacat URL in the Morpho preferences to point to your Metacat.

For more information about Morpho, please see: http://knb.ecoinformatics.org/

Creating Your Own Client
------------------------

.. Note::

  The Client API (and underlying servlet implementation) has been
  deprecated as of Metacat 2.0.0. Future development should utilize the DataONE
  REST service methods. The Client API will be completely removed by the end of 2013.

Metacat's client API is available in Java and Perl (the Java interface is
described in this section and further detailed in the appendix). Parts of the
API are also available in Python and Ruby. The API allows client applications
to easily authenticate users and perform basic Metacat operations such as
reading metadata and data files; inserting, updating, and deleting files; and
searching for packages based on metadata matches.

The Client API is defined by the interface ``edu.ucsb.nceas.metacat.client.Metacat``,
and all operations are fully defined in the javadoc_ documentation. To use the
client API, include ``metacat-client.jar``, ``utilities.jar``, ``commons-io-2.0.jar``, and
``httpclient.jar`` in your classpath. After including these JAR files, you can
begin using the API methods (see the next table).

.. _javadoc: http://knb.ecoinformatics.org/software/metacat/dev/api/index.html

The following code block displays a typical session for reading a document
from Metacat using the Java client API.

::

  // MetacatFactory, Metacat, and the exception classes ship in metacat-client.jar.
  import edu.ucsb.nceas.metacat.client.*;
  import java.io.Reader;

  String metacatUrl = "http://foo.com/context/metacat";
  String username = "uid=jones,o=NCEAS,dc=ecoinformatics,dc=org";
  String password = "neverHardcodeAPasswordInCode";
  try {
      // Connect, authenticate, and read the document.
      Metacat m = MetacatFactory.createMetacatConnection(metacatUrl);
      m.login(username, password);
      Reader r = m.read("testdocument.1.1");
      // Do whatever you want with Reader r
  } catch (MetacatAuthException mae) {
      // handleError stands in for your application's own error handling.
      handleError("Authorization failed:\n" + mae.getMessage());
  } catch (MetacatInaccessibleException mie) {
      handleError("Metacat Inaccessible:\n" + mie.getMessage());
  } catch (Exception e) {
      handleError("General exception:\n" + e.getMessage());
  }

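The ``Reader`` returned by ``read`` can be consumed like any other Java
character stream. As a minimal sketch, the helper below drains a ``Reader``
into a ``String`` and closes it afterwards. ``ReaderText`` is a hypothetical
name that is not part of the Metacat client API, and loading the whole document
into memory is only sensible for reasonably small documents.

.. code-block:: java

   import java.io.IOException;
   import java.io.Reader;
   import java.io.UncheckedIOException;

   // Hypothetical helper (not part of Metacat): reads a Reader to the end.
   class ReaderText {
       // Returns everything the reader produces as one String, then closes it.
       static String readAll(Reader reader) {
           StringBuilder text = new StringBuilder();
           char[] buffer = new char[4096];
           try (Reader in = reader) {
               int count;
               // read() returns -1 once the stream is exhausted.
               while ((count = in.read(buffer)) != -1) {
                   text.append(buffer, 0, count);
               }
           } catch (IOException e) {
               throw new UncheckedIOException(e);
           }
           return text.toString();
       }
   }

In the session above, ``String xml = ReaderText.readAll(r);`` would replace the
placeholder comment.
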
Operations provided by Client API (Metacat.java class)

.. list-table::
   :header-rows: 1
   :widths: 15 55 30

   * - Method
     - Parameters and Throws
     - Description
   * - delete
     - ``public String delete(String docid) throws InsufficientKarmaException, MetacatException, MetacatInaccessibleException;``
     - Delete an XML document in the repository.
   * - getAllDocids
     - ``public Vector getAllDocids(String scope) throws MetacatException;``
     - Return a list of all docids that match a given scope. If scope is null, return all docids registered in the system.
   * - getLastDocid
     - ``public String getLastDocid(String scope) throws MetacatException;``
     - Return the highest document ID for a given scope. Used by clients to determine the next free identifier in a sequence for a given scope.
   * - getloggedinuserinfo
     - ``public String getloggedinuserinfo() throws MetacatInaccessibleException;``
     - Return the logged in user for this session.
   * - getNewestDocRevision
     - ``public int getNewestDocRevision(String docId) throws MetacatException;``
     - Return the latest revision of the specified document from Metacat.
   * - getSessionId
     - ``public String getSessionId();``
     - Return the session identifier for this session.
   * - insert
     - ``public String insert(String docid, Reader xmlDocument, Reader schema) throws InsufficientKarmaException, MetacatException, IOException, MetacatInaccessibleException;``
     - Insert an XML document into the repository.
   * - isRegistered
     - ``public boolean isRegistered(String docid) throws MetacatException;``
     - Return true if given docid is registered; false if not.
   * - login
     - ``public String login(String username, String password) throws MetacatAuthException, MetacatInaccessibleException;``
     - Log in to a Metacat server.
   * - logout
     - ``public String logout() throws MetacatInaccessibleException, MetacatException;``
     - Log out of a Metacat server.
   * - query
     - ``public Reader query(Reader xmlQuery) throws MetacatInaccessibleException, IOException;``
     - Query the Metacat repository and return the result set as a Reader.
   * - query
     - ``public Reader query(Reader xmlQuery, String qformat) throws MetacatInaccessibleException, IOException;``
     - Query the Metacat repository with the given metacat-compatible query format and return the result set as a Reader.
   * - read
     - ``public Reader read(String docid) throws InsufficientKarmaException, MetacatInaccessibleException, DocumentNotFoundException, MetacatException;``
     - Read an XML document from the Metacat server.
   * - readInlineData
     - ``public Reader readInlineData(String inlinedataid) throws InsufficientKarmaException, MetacatInaccessibleException, MetacatException;``
     - Read inline data from the Metacat server session.
   * - setAccess
     - ``public String setAccess(String _docid, String _principal, String _permission, String _permType, String _permOrder) throws InsufficientKarmaException, MetacatException, MetacatInaccessibleException;``
     - Set permissions for an XML document in the Metacat repository.
   * - setMetacatUrl
     - ``public void setMetacatUrl(String metacatUrl);``
     - Set the MetacatUrl to which connections should be made.
   * - setSessionId
     - ``public void setSessionId(String sessionId);``
     - Set the session identifier for this session.
   * - update
     - ``public String update(String docid, Reader xmlDocument, Reader schema) throws InsufficientKarmaException, MetacatException, IOException, MetacatInaccessibleException;``
     - Update an XML document in the repository by providing a new version of the XML document.
   * - upload
     - ``public String upload(String docid, File file) throws InsufficientKarmaException, MetacatException, IOException, MetacatInaccessibleException;``
     - Upload a data document into the repository.
   * - upload
     - ``public String upload(String docid, String fileName, InputStream fileData, int size) throws InsufficientKarmaException, MetacatException, IOException, MetacatInaccessibleException;``
     - Upload a data document into the repository.