Metacat: Issues
https://projects.ecoinformatics.org/ecoinfo/
2010-11-16T20:26:48Z, Ecoinformatics Redmine
Bug #5241 (Resolved): OAI-PMH: ListRecords verb returns content containing XML processing instruc...
https://projects.ecoinformatics.org/ecoinfo/issues/5241
2010-11-16T20:26:48Z, Duane Costa (dcosta@lternet.edu)
<p>When returning lists of EML records, the ListRecords verb may return content that contains XML processing instructions within the OAI-PMH <metadata> elements. The <metadata> elements should contain only XML elements, with all XML processing instructions stripped out of the metadata content.</p>
<p>There is a logical bug in the data provider crosswalk logic for EML 2.0.0, EML 2.0.1, and EML 2.1.0 documents. It strips off the first XML processing instruction before inserting the document into the <metadata> element. For example:</p>
<p><?xml version="1.0"?> <!-- This gets stripped off --></p>
<p>But if there is more than one processing instruction, only the first gets stripped off and the rest are placed inside the <metadata> element:</p>
<p><?xml version="1.0"?> <!-- This gets stripped off --><br /><?xml-stylesheet type="text/xsl" href="../../style/eml/eml-2.0.0.xsl"?> <!-- This DOES NOT get stripped off --></p>
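<p>One possible way to remove every leading processing instruction, rather than only the first, is sketched below. This is an illustration under stated assumptions, not the data provider's actual crosswalk code: the class name LeadingProcessingInstructions and the method strip() are hypothetical, and the sketch assumes the instructions appear back to back at the start of the document, separated only by whitespace.</p>

```java
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of the Metacat code base. */
final class LeadingProcessingInstructions {

    // One or more "<?...?>" constructs anchored at the very start of the input,
    // each optionally preceded by whitespace. DOTALL lets a single instruction
    // span several lines, as the xml-stylesheet example in this report does.
    private static final Pattern LEADING_PIS =
            Pattern.compile("\\A(?:\\s*<\\?.*?\\?>)+\\s*", Pattern.DOTALL);

    private LeadingProcessingInstructions() { }

    /** Returns the document with all leading processing instructions removed. */
    static String strip(String xml) {
        // If the document does not start with "<?", nothing matches and the
        // input is returned unchanged.
        return LEADING_PIS.matcher(xml).replaceFirst("");
    }
}
```

<p>Because the pattern is anchored at the start of the input, processing instructions nested inside the document body are left alone. A comment placed between two leading instructions would stop the match early, so a parser-based approach may be more robust if such documents occur.</p>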
<p>The code should be modified to strip off all leading XML processing instructions.</p>
Bug #4637 (Resolved): Metacat Harvester fails to catch some insert and update failures
https://projects.ecoinformatics.org/ecoinfo/issues/4637
2009-12-17T19:34:31Z, Duane Costa (dcosta@lternet.edu)
<p>Metacat Harvester is not catching all insert and update errors.</p>
<p>Recently at LTER, Metacat Harvester has reported a handful of documents as successful inserts or updates when in fact they were not inserted or updated in Metacat.<br />The logical error is in the method HarvesterDocument.putMetacatDocument(). Harvester treats the absence of an exception as a success condition, when it should instead require explicit confirmation from the Metacat client that the insert or update operation succeeded:</p>
<pre><code>if (harvester.connectToMetacat()) {
  try {
    if (insert) {
      metacatReturnString = metacat.insert(docidFull, stringReader, null);
      inserted = true;
      harvester.addLogEntry(0, docidFull + " : " + metacatReturnString,
                            "harvester.InsertDocSuccess",
                            harvestSiteSchedule.siteScheduleID,
                            null, "");
    }
    else if (update) {
      metacatReturnString = metacat.update(docidFull, stringReader, null);
      updated = true;
      harvester.addLogEntry(0, docidFull + " : " + metacatReturnString,
                            "harvester.UpdateDocSuccess",
                            harvestSiteSchedule.siteScheduleID,
                            null, "");
    }
  }
  catch (MetacatInaccessibleException e) {
    logMetacatError(insert, metacatReturnString,
                    "MetacatInaccessibleException", e);
  }
  catch (InsufficientKarmaException e) {
    logMetacatError(insert, metacatReturnString,
                    "InsufficientKarmaException", e);
  }
  catch (MetacatException e) {
    logMetacatError(insert, metacatReturnString, "MetacatException", e);
  }
  catch (IOException e) {
    logMetacatError(insert, metacatReturnString, "IOException", e);
  }</code></pre>
<p>Harvester does not check the value of the string returned by Metacat ('metacatReturnString' in the above code). In the cases where the insert/update operations have been failing, the return string is empty or null. Harvester should examine the return string to confirm that it contains the substring "<success>" or something similar.</p>
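<p>A minimal sketch of such a check follows. It assumes that a successful response contains the literal substring "<success>", which this report offers only as a likely marker ("or something similar"); the class name MetacatResponseCheck and the method isSuccess() are hypothetical and do not exist in Harvester.</p>

```java
/** Hypothetical helper for illustration; not part of the Harvester code base. */
final class MetacatResponseCheck {

    // Assumed success marker; the real response format should be confirmed
    // against the Metacat client before relying on this check.
    private static final String SUCCESS_MARKER = "<success>";

    private MetacatResponseCheck() { }

    /**
     * Returns true only when the response is non-null and contains the
     * success marker. A null or empty response counts as a failure, which
     * covers the cases described in this report.
     */
    static boolean isSuccess(String metacatReturnString) {
        return metacatReturnString != null
                && metacatReturnString.contains(SUCCESS_MARKER);
    }
}
```

<p>With a check like this, putMetacatDocument() could set 'inserted' or 'updated' to true and write the success log entry only when isSuccess(metacatReturnString) holds, and otherwise log the operation as a failure.</p>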
<p>The fact that no exception is thrown by Metacat could point to an additional problem in Metacat, since the insert/update operation completes without raising an exception even though the document is not inserted or updated. The documents that appear to trigger this condition are unusually large EML documents (currently there are three documents from CDR and one document from LUQ that trigger this bug).</p>
<p>After the Harvester bug is resolved, or as part of resolving it, further investigation should be done to determine whether there is also a Metacat bug involved here, and if there is, a separate bug report should be filed for it.</p>
Bug #2207 (Resolved): Advanced Search integration
https://projects.ecoinformatics.org/ecoinfo/issues/2207
2005-09-26T22:20:40Z, Duane Costa (dcosta@lternet.edu)
<p>Over the past year, the LTER Network Office has developed an Advanced Search web<br />application that uses the Metacat client to run an advanced search on criteria<br />such as subject, author, spatial, and taxon. In its current form, the Advanced<br />Search interface exists as a separate web application, outside of the Metacat<br />code base. The goal of this task is to integrate the Advanced Search web<br />application with the Metacat code base, as described in more detail below.</p>
<hr />
<p>Proposal to Integrate Advanced Search Capability <br />with Metacat Distribution</p>
<p>The goal is to refactor the Query application so that major parts<br />of it would be integrated with Metacat, while other parts of it<br />could be customized for LTER-specific needs and maintained<br />independent of Metacat.</p>
<p>1. Query Engine</p>
<p>The back-end Query Engine can be fully integrated with the Metacat<br />code base. It contains search engine functionality that is<br />generic to Metacat and should be relatively easy to factor out of <br />the Query application. Once it is part of Metacat, it can be<br />packaged as a library that can be distributed with the Query<br />application. After it is integrated with Metacat, the Query Engine <br />code can be maintained by all Metacat developers. If logical <br />improvements or performance optimizations are made to the Query <br />Engine code by the Metacat community, the LTER Query application<br />will benefit from these improvements and optimizations because <br />it will utilize the same code that Metacat utilizes.</p>
<p>2. Advanced Search Form</p>
<p>The Advanced Search Form is implemented as a JSP. It can be <br />reimplemented to eliminate the Struts-based custom tags that it <br />currently uses. JavaScript could be added to replace the <br />Struts-generated JavaScript for client-side form validation; <br />alternatively, form validation could be moved to the server side.</p>
<p>An LTER-customized version of the Advanced Search Form could be <br />maintained in the Query Application. It would be nearly identical<br />to the Metacat version, but it would contain additional input fields,<br />such as a drop-down list that allows the user to restrict the search<br />results to a particular LTER site.</p>
<p>If possible, a mechanism would be worked out to minimize the<br />duplication of effort that would be required to maintain both the <br />Metacat form and the custom LTER form and to keep them consistent.</p>
<p>3. Login Page, Simple Search Page, and Browse Page</p>
<p>These pages in the Query Application would not be integrated with <br />Metacat, since equivalent functionality already exists in the <br />default Metacat skin. These pages would continue to be part of the <br />Query application. Since the Query application uses Struts to manage <br />the functionality of these pages, we could continue to use Struts <br />in the Query application, though we would not need Struts in Metacat.</p>
<p>We may be able to utilize the Metacat skin to provide this <br />functionality for the Query application. Eventually, we may be<br />able to fully migrate all of the functionality of the Query application<br />to Metacat, though we would need to provide a way to extend the<br />Metacat skin with LTER customizations.</p>
<p>The current browse capability of the Query Application is intended<br />to be replaced by a browsable hierarchy of terms based on the work<br />that is being initiated in the LTER working groups for Ontologies<br />and Controlled Vocabularies. Since this work would be a valuable <br />contribution to Metacat as well, it would be useful to integrate <br />these new browse capabilities in the Metacat skin rather than <br />restrict them to just the Query application.</p>
<p>Work Estimates:</p>
<pre><code>Task                                              Effort (Weeks)
----                                              --------------</code></pre>
<p>Phase One:</p>
<p>These tasks would fulfill Matt's requirements<br />to integrate the Advanced Search capability with<br />Metacat, while retaining the Query application as an<br />LTER custom application that shares some of its<br />software components with Metacat.</p>
<pre><code>Refactor Query Engine. Add code                   1
 to Metacat. Refactor Query Application
 to use Metacat library.</code></pre>
<pre><code>Refactor Advanced Search Form to eliminate        1
 Struts custom tags.</code></pre>
<pre><code>Implement generic Advanced Search Form and        1
 integrate it with Metacat Skin. Maintain custom
 LTER form in the Query Application as an
 extension to the Metacat form.</code></pre>
<p>Phase Two:</p>
<p>This optional phase would deprecate the Query application<br />as a separate entity, eliminating the duplication of <br />effort needed to keep its advanced search functionality <br />consistent with Metacat's. The time estimates for these<br />tasks should be adjusted after Phase One is completed,<br />since we will have a better understanding of the effort<br />required at that point.</p>
<pre><code>Fully migrate the Query Application to Metacat,   2
 allowing for LTER customizations within Metacat.
 Other organizations could use the LTER
 customizations in Metacat as a model for their
 own customizations.</code></pre>
<hr />
<p>On 9/6/2005, Mark Servilla wrote:</p>
<p>Hi Matt,</p>
<p>Attached, please find a brief statement of work proposed by Duane Costa<br />regarding the Advanced Query Interface for metacat. We consider the<br />re-unification of the two applications to be a high priority to the LTER, NCEAS,<br />and the eco-community, and will begin the planning/work effort immediately. At<br />your earliest convenience, please review the SOW and let us know if this is<br />acceptable and/or if you have any questions/comments.</p>
<p>Sincerely,<br />Mark</p>
<hr />
<p>On 9/6/2005, Matt Jones wrote:</p>
<p>Hi Mark,</p>
<p>Thanks. In general this statement of work looks great -- no real modifications<br />on my part. It will be a great time-saver for many people who want an advanced<br />search in their metacat installation, so I appreciate it.</p>
<p>A couple of brief comments for context:<br />1) We need to revise our login infrastructure. Right now metacat uses cookies<br />for session state. That has worked well, and is pretty robust.<br /> We also have a more recent javascript login for managing the cookies on the KNB<br />and default skins -- this is incomplete and therefore broken, although it is<br />what is used on the main KNB page. The problem is that the session information<br />does not propagate across pages as one navigates through the app, especially in<br />the EML pages. We know why, but haven't fixed it. It would be a good time to do<br />so, so let us know if you have particular needs in your part of the login<br />infrastructure.<br />2) We've wanted an effective browsable hierarchy of terms for KNB datasets, but<br />there just isn't consensus on a controlled vocabulary. The one on the KNB skin is<br />one I made up with feedback from Mark S and a few others. If you get consensus<br />with LTER, we'd probably want to switch the KNB in general over to using it and<br />creating a browsing interface that allows one to navigate it. So that might be<br />another area for shared work.<br />3) I think the advanced search page needs a map to draw a box for spatial<br />searches. We've found users simply don't know the lat/lon for their area of<br />interest.<br />4) We are working on a spatial option for metacat for Kruger that allows the<br />locations of data to be plotted against other GIS layers in a map and searched<br />using spatial queries. This is in prototype now, but will be released hopefully<br />with the next metacat release. Just an FYI in case a similar request has come<br />your way.<br />5) The new SEEK web developer located at LNO (in process of hiring replacement<br />for Tekell) will be working on a portal for access to EcoGrid data (both EML and<br />DarwinCore). I hope we will be able to adapt your client interface so that it<br />can be applied to the web services backend in EcoGrid, as I think the<br />requirements are essentially the same.</p>
<p>Thanks. Looking forward to working on this with you.</p>
<p>Matt</p>
<hr />
<p>On 9/23/05, Mark Servilla wrote:</p>
<p>Hi Matt,</p>
<p>Sorry for the delay in getting back to this matter. After discussing your<br />comments with Duane, the only item that impacts our involvement directly is<br />number 1 - the session management. It will be critical to work this problem to<br />completion. With respect to number 3, the map UI, we would appreciate any<br />suggestions in this area; short of installing a full-up ArcIMS/MapServer type of<br />application. Is there a simple javascript version for such a map? A first<br />version of this would require only the bounding lat/lon for the query. Thanks!</p>
<p>Sincerely,<br />Mark</p>
<hr />
<p>On 9/23/05, Matt Jones wrote:</p>
<p>Mark,</p>
<p>Thanks for the follow-through. I agree that (1) needs to be worked out, and I'd<br />like to see the GT4 GSI certificate stuff that NCSA did before we decide on a<br />solution. We should probably take an approach that accommodates that if we're<br />delving into the auth infrastructure. <br />Regarding (3), I think there are several open-source geospatial libs for doing<br />this (we use one of them in Morpho). Maybe Duane could talk to John Harris (who<br />is working on this stuff in metacat now) and Dan Higgins (who did the geospatial<br />map in morpho) and come up with a proposal? My impression is that the java lib<br />Dan used was pretty effective and easy to plug into morpho, and I think others<br />have come about. GEON has one in their portal search client, so we might be<br />able to borrow code or ideas from them. We definitely don't want an IMS for<br />this part -- our needs are much simpler.</p>
<p>Matt</p>
Bug #1217 (Closed): Extend Metacat Interface and Client
https://projects.ecoinformatics.org/ecoinfo/issues/1217
2003-11-21T16:39:20Z, Duane Costa (dcosta@lternet.edu)
<p>The Metacat interface, implemented by the MetacatClient class,<br />exposes a number of core actions of the MetacatServlet as HTTP<br />requests to the servlet. The current list of supported actions <br />includes:</p>
<p>login<br />logout<br />read<br />squery<br />insert<br />update<br />delete</p>
<p>A number of additional actions, most of which are not <br />critical but are convenient, are not supported by the interface:</p>
<p>query<br />export<br />readinlinedata<br />validate<br />setaccess<br />getaccesscontrol<br />getprincipals<br />getdoctypes<br />getdtdschema<br />getdataguide<br />getlastdocid<br />getrevisionanddoctype</p>
<p>(Note: some of the methods listed above are obsolete and<br />therefore should not be included in the extended interface.<br />The complete list of methods that should be added to the <br />interface is yet to be determined.)</p>
<p>Some of the methods in the extended set are useful to other<br />clients, such as Harvester. However, Harvester should interact<br />exclusively with the MetacatClient, rather than directly with the <br />Metacat servlet. Therefore, the current interface should be<br />extended with the additional convenience methods so that they<br />can be used by Harvester and potentially other clients.</p>
<p>There are two ways that the current interface could be extended:</p>
<p>(1) Add a new interface, 'MetacatExtendedInterface.java', to<br />hold the definitions for the convenience methods. The MetacatClient<br />class would implement both the current Metacat interface and the<br />MetacatExtendedInterface. The advantage of this approach is that<br />the current Metacat interface can be kept simple in that it will<br />only define the core methods.</p>
<p>(2) Add the new methods to the existing Metacat interface.<br />The advantage of adding these methods directly to Metacat.java is <br />that client code is simpler: callers only have to cast to one <br />interface (not two) to make method calls.</p>
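<p>The trade-off between the two approaches can be illustrated with a small sketch of approach (1). Only the name MetacatExtendedInterface comes from this report; CoreMetacat, StubMetacatClient, InterfaceOptionsDemo, and the simplified method signatures are hypothetical stand-ins and do not reflect the real Metacat client API.</p>

```java
/** Simplified stand-in for the existing core interface (hypothetical signature). */
interface CoreMetacat {
    String read(String docid);
}

/** Approach (1): convenience methods are declared in a separate interface. */
interface MetacatExtendedInterface {
    String getLastDocid(String scope);
}

/** Stub client implementing both interfaces; returns canned values, no HTTP. */
final class StubMetacatClient implements CoreMetacat, MetacatExtendedInterface {
    @Override
    public String read(String docid) {
        return "<eml docid='" + docid + "'/>";
    }

    @Override
    public String getLastDocid(String scope) {
        return scope + ".1.1";
    }
}

final class InterfaceOptionsDemo {
    private InterfaceOptionsDemo() { }

    /**
     * A caller that holds only the core type must cast to the second
     * interface before it can use a convenience method. Under approach (2)
     * this cast would disappear, at the cost of a larger core interface.
     */
    static String lastDocid(CoreMetacat metacat, String scope) {
        if (metacat instanceof MetacatExtendedInterface) {
            return ((MetacatExtendedInterface) metacat).getLastDocid(scope);
        }
        throw new UnsupportedOperationException("extended interface not implemented");
    }
}
```

<p>In this sketch, a client such as Harvester would call InterfaceOptionsDemo.lastDocid(client, scope) and pay for the extra cast, while clients that need only the core methods never see the convenience methods at all.</p>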
<p>We will determine in the next few days which of these two<br />approaches to use.</p>
<p>A secondary goal of this task is to improve the documentation for<br />the methods in the extended interface.</p>
<p>A future goal (outside the scope of this task) is to refactor <br />Morpho and other code to use the MetacatClient to interact with<br />MetacatServlet.</p>
Bug #1046 (Resolved): LTER LDAP database coordination perl script changes
https://projects.ecoinformatics.org/ecoinfo/issues/1046
2003-04-18T18:24:38Z, David Blankman (dblankman@lternet.edu)
<p>Need success and failure messages and routines built into the Perl scripts, since LDAP<br />inserts will be done behind the scenes.</p>
<p>success log file: uid, date<br />failure email to ? information needed: uid, action status, error message</p>
Bug #325 (Resolved): create site filters to convert site metadata to eml packages
https://projects.ecoinformatics.org/ecoinfo/issues/325
2001-11-08T19:43:34Z, Matt Jones (jones@nceas.ucsb.edu)
<p>Site specific metadata/data formats need to be converted into EML packages and<br />transferred via the harvester API to metacat. See visio diagram for design ideas.</p>
Bug #273 (Resolved): site deployment for metacat
https://projects.ecoinformatics.org/ecoinfo/issues/273
2001-08-31T17:16:59Z, Matt Jones (jones@nceas.ucsb.edu)
<p>Need to deploy metacat at selected LTER, OBFS, and NRS sites to begin to build<br />our network. OBFS and NRS may have a single point of presence. Initial<br />deployment (after NCEAS and LTERnet) should be to a couple of enthusiastic sites,<br />and we should target those sites that specifically agreed to participate in testing.</p>
<p>Jones is working with Nottrott, Michener, and Schildhauer to deploy Metacat in<br />the NRS and OBFS system, and has a working prototype. See him for details.</p>
Bug #162 (Resolved): need harvest/batch load for metacat
https://projects.ecoinformatics.org/ecoinfo/issues/162
2000-10-25T00:28:59Z, Matt Jones (jones@nceas.ucsb.edu)
<p>The metacat server needs to be able to accept large numbers of metadata<br />documents for insert and update from site metadata catalogs. This should be<br />enabled either through a pull or push mechanism, so the pull (harvest) will need<br />a registry service as well.</p>