Metacat: Issues (Ecoinformatics Redmine, https://projects.ecoinformatics.org/ecoinfo/), updated 2009-07-13T21:55:42Z
Bug #4245 (In Progress): Harvester command line scripts don't execute
https://projects.ecoinformatics.org/ecoinfo/issues/4245
2009-07-13T21:55:42Z, Duane Costa (dcosta@lternet.edu)
<p>Metacat Harvester is normally launched as a Java servlet, but also has the option of being invoked manually from a pair of command-line scripts ('lib/harvester/runHarvester.bat' on Windows, 'lib/harvester/runHarvester.sh' on Linux). As of Metacat 1.9.x, execution of Metacat Harvester via the command-line scripts is not working.</p>
<p>Solution:<br />
 1. Additional dependencies need to be specified in the Java CLASSPATH:<br />
 a. METACAT_LIB/log4j-1.2.12.jar<br />
 b. METACAT_LIB/xalan.jar<br />
 c. METACAT_LIB/postgresql-8.0-312.jdbc3.jar (for PostgreSQL)<br />
 2. The Harvester.java class needs the following changes:<br />
 a. Add support for log4j initialization in the 'main' method.<br />
 b. In the 'loadProperties()' method, change the PropertyService call from 'PropertyService.getInstance();' to 'PropertyService.getTestInstance(configDir);' where 'configDir' is a relative path to the directory where 'metacat.properties' resides.</p>
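<p>Step 1 can be sanity-checked before launching the scripts. The following is a minimal sketch, not part of Metacat: it assumes the jar names listed above, and the helper names 'missing_jars' and 'classpath_entries' are hypothetical. It reports which jars are absent from a given METACAT_LIB directory and joins the jar paths with the platform's CLASSPATH separator.</p>

```python
import os

# Jar names taken from the issue text; versions may differ in a given install.
REQUIRED_JARS = [
    "log4j-1.2.12.jar",
    "xalan.jar",
    "postgresql-8.0-312.jdbc3.jar",  # only needed for PostgreSQL
]


def missing_jars(metacat_lib, jars=REQUIRED_JARS):
    """Return the jar names that are not present in metacat_lib."""
    return [j for j in jars if not os.path.isfile(os.path.join(metacat_lib, j))]


def classpath_entries(metacat_lib, jars=REQUIRED_JARS):
    """Join the jar paths with the platform separator (';' on Windows, ':' elsewhere)."""
    return os.pathsep.join(os.path.join(metacat_lib, j) for j in jars)
```

<p>The string returned by 'classpath_entries' would be appended to the CLASSPATH that the runHarvester script already builds; how that is wired into the .bat and .sh files is left open here.</p>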
<p>Note: The solution implemented to resolve this problem for Metacat Harvester will also be beneficial toward the implementation of the new Metacat OAI-PMH Harvester described in Bug <a class="issue tracker-1 status-2 priority-5 priority-highest" title="Bug: design and implement OAI-PMH compliant harvest subsystem (In Progress)" href="https://projects.ecoinformatics.org/ecoinfo/issues/3835">#3835</a>.</p>
Bug #4243 (In Progress): Harvester db errors due to fixed character length overflow
https://projects.ecoinformatics.org/ecoinfo/issues/4243
2009-07-13T16:59:04Z, Duane Costa (dcosta@lternet.edu)
<p>In a recent release of Metacat (1.9.0), the Harvester property names were refactored to begin with the prefix 'harvester.'. Some of the Harvester property names are used as operation codes in the 'harvest_operation_code' field of Metacat's 'harvest_log' table, which is declared with a fixed length of 30 characters. The 'harvester.ValidateHarvestListSuccess' code is 36 characters long, which exceeds the limit and results in DB errors on record insertion during a harvest.</p>
Bug #3835 (In Progress): design and implement OAI-PMH compliant harvest subsystem
https://projects.ecoinformatics.org/ecoinfo/issues/3835
2009-02-24T02:06:53Z, Matt Jones (jones@nceas.ucsb.edu)
<p>Metacat's current harvest mechanism works well but is a proprietary system. The Dryad project has proposed to implement an OAI-PMH compliant harvest subsystem for Metacat in order to allow Metacat to interact more effectively with other systems that implement this protocol. This is a tracking bug for the design and implementation of this feature. Other more detailed bugs will be filed for specific tasks. It would be useful if the final system allowed Metacat to act as both an OAI-PMH Data Provider and as an OAI-PMH Service Provider, allowing us to both serve and harvest documents from OAI-PMH servers.</p>
<p>Some issues to consider and discuss:<br />1) lack of record authorization mechanisms in OAI-PMH. Metacat currently allows harvest with access controls on harvested records. Reverting to a purely OAI-PMH system would eliminate this capability that is used by many of our harvest clients (especially for data, but somewhat for metadata as well). So the design needs to consider a hybrid that allows both public records to be exposed through OAI-PMH and restricted records to be exposed through a protocol like Metacat's that supports access control. What is our design goal here?</p>
<p>2) A corollary of (1) is how to determine who is allowed to update a given record. Does OAI-PMH assume providers always originate from a constant URL endpoint in order to get around authenticating data providers? This is probably not reasonable for even short periods of time (a few years). A number of sites change domain names over short periods of time, and the harvester needs to be able to adjust to these changes, update endpoints, and still handle record replacement. Maybe this is a non-issue if PMH allows provider endpoints to be updated.</p>
<p>3) Date-based change detection in OAI-PMH versus GUID-based versioning in metacat. How should these be reconciled? If a PMH harvest occurs every ten days, but a metadata document is revised three times in that interval, does OAI-PMH only get the most recent version? How are the other versions archived and made accessible over time?</p>
<p>4) Data objects. The Metacat harvester allows one to transfer objects of any type, which is used to harvest both metadata objects of various formats (e.g., EML and FGDC) and the associated data objects. Each of these objects has its own unique identifier. How would this be handled under OAI-PMH?</p>
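<p>The question in (3) can be made concrete with a small simulation. This is a sketch under stated assumptions, not a model of any real OAI-PMH server: it assumes the repository exposes a single record per identifier whose datestamp is that of its latest change, and the identifier 'doc.1', the dates, and the function 'harvest' are all hypothetical.</p>

```python
from datetime import date

# Hypothetical revision history: (identifier, revision, datestamp).
REVISIONS = [
    ("doc.1", 1, date(2009, 1, 2)),
    ("doc.1", 2, date(2009, 1, 5)),
    ("doc.1", 3, date(2009, 1, 9)),
]


def harvest(revisions, since, until):
    """Date-selective harvest: one entry per identifier, holding its latest
    (revision, datestamp), included only if that datestamp is in [since, until]."""
    latest = {}
    for ident, rev, stamp in revisions:
        if ident not in latest or stamp > latest[ident][1]:
            latest[ident] = (rev, stamp)
    return {i: v for i, v in latest.items() if since <= v[1] <= until}


# A harvest covering all ten days sees only revision 3 of doc.1.
result = harvest(REVISIONS, date(2009, 1, 1), date(2009, 1, 11))
```

<p>Under these assumptions revisions 1 and 2 never reach the harvester, so keeping them accessible would need something beyond date-based selection, for example giving each revision its own identifier.</p>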
<p>A nice background set of slides is here:<br /><a class="external" href="http://www.oaforum.org/otherfiles/berl_oai-tutorial_e.ppt">http://www.oaforum.org/otherfiles/berl_oai-tutorial_e.ppt</a></p>
Bug #3367 (New): Harvester stores passwords in clear text
https://projects.ecoinformatics.org/ecoinfo/issues/3367
2008-06-05T20:18:24Z, Chad Berkley (berkley@nceas.ucsb.edu)
<p>The harvester stores the user's password in clear text in the database. Passwords need to be stored as MD5 hashes or protected by some other secure form of encryption.</p>
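<p>One possible shape for such a fix is sketched below. It swaps in a different technique from the one named above: salted PBKDF2-HMAC-SHA256 from Python's standard library rather than plain MD5. It also assumes the harvester only needs to verify a submitted password, not replay it to another service. The function names and the iteration count are illustrative.</p>

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # illustrative work factor; tune for the target hardware


def hash_password(password, salt=None):
    """Return (salt, digest). A fresh random salt is generated when none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected):
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

<p>Only the salt and digest would be written to the database. If the harvester must present the original password to a remote system, a one-way hash is not sufficient and reversible encryption with a separately stored key would be needed instead.</p>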