Metacat: Issues https://projects.ecoinformatics.org/ecoinfo/ 2004-04-05T23:23:26Z Ecoinformatics Redmine
Bug #1451 (Resolved): null returndoctype fails to return all documents https://projects.ecoinformatics.org/ecoinfo/issues/1451 2004-04-05T23:23:26Z Matt Jones jones@nceas.ucsb.edu
<p>When the pathquery "returndoctype" field is omitted, metacat is supposed to<br />return all matching documents. In fact, no documents are returned under version<br />1.3.1, because it filters out all documents if the returndoctype list is empty<br />or null. Fixing this will require changing the logic in the DBQuery module to<br />include results even when the returndoctype field is not present.</p>
Bug #1427 (Resolved): xml_index constrains depth of paths that can be inserted https://projects.ecoinformatics.org/ecoinfo/issues/1427 2004-03-30T22:20:05Z Matt Jones jones@nceas.ucsb.edu
<p>When an XML document contains a deeply nested structure, metacat accepts the<br />document for storage in xml_nodes, but during the subsequent indexing phase, it<br />throws an exception because the composite paths to the deep nodes are too long<br />to fit in the space allocated for the paths in the column in the xml_index<br />table. This column was limited to a few hundred characters so that it is<br />indexable (Oracle had a limit on the total indexable width of columns).</p>
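<p>The failure mode described above can be sketched in a few lines of Python. This is an illustration only, not Metacat's code: the 200-character limit, the slash-joined path format, and the helper names are all assumptions (the report says only "a few hundred characters"), and the element name is just a plausibly long one.</p>

```python
from xml.etree import ElementTree as ET

# Assumed limit for illustration; the report only says "a few hundred characters".
MAX_PATH_CHARS = 200


def composite_paths(xml_text):
    """Yield a slash-joined element path for every element in the document."""
    root = ET.fromstring(xml_text)

    def walk(element, prefix):
        path = f"{prefix}/{element.tag}" if prefix else element.tag
        yield path
        for child in element:
            yield from walk(child, path)

    yield from walk(root, "")


def paths_too_long(xml_text, limit=MAX_PATH_CHARS):
    """Return the paths that would not fit in a column of `limit` characters."""
    return [p for p in composite_paths(xml_text) if len(p) > limit]


# A document nested 12 levels deep using one long (illustrative) element name.
depth, name = 12, "taxonomicClassification"
doc = f"<{name}>" * depth + f"</{name}>" * depth
overflow = paths_too_long(doc)
print(len(overflow), "of", depth, "paths exceed", MAX_PATH_CHARS, "characters")
```

<p>With a 23-character element name, every path from the ninth level down exceeds the assumed limit, which shows why widening the column only postpones the problem.</p>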
<p>These problems were discovered and reported by Wade Sheldon (GCE LTER) when he<br />submitted EML documents with fully filled out taxonomic coverage entries. We<br />definitely need to support realistically filled out EML documents.</p>
<p>So, two possible solutions:<br /> 1) make the column much wider<br /> -- this is a partial solution, because the column still might not be big <br /> enough for very deep docs or docs with long element names<br /> -- if it's wider, it may not be indexable, and indexing is the reason the column exists<br /> 2) eliminate the dependency on the xml_index table altogether<br /> -- the recursive search needed isn't that much slower, and may not be<br /> slower at all as we tune the database<br /> -- insert/update/delete should be MUCH faster<br /> -- simpler database structure</p>
<p>We have decided to pursue (2) above because of the advantages listed. Rather<br />than completely removing the xml_index code, we are going to make it an option<br />whether or not it is used, but by default ship with it turned off.</p>
Bug #1390 (Resolved): add UCNRS to the ldapweb.cgi management lists https://projects.ecoinformatics.org/ecoinfo/issues/1390 2004-03-24T22:15:24Z Matt Jones jones@nceas.ucsb.edu
<p>The UCNRS LDAP tree cannot be managed through the web scripts now, partly<br />because these entries are in a different tree (o=ucnrs.org) than ecoinfo<br />(dc=ecoinformatics,dc=org). Need to make the changes that allow creation of new<br />LDAP entries for UCNRS and changing and resetting the passwords for these accounts.</p>
Bug #1346 (Resolved): The results sent back in response to query should be sorted https://projects.ecoinformatics.org/ecoinfo/issues/1346 2004-02-12T23:00:28Z Saurabh Garg sgarg@nceas.ucsb.edu
<p>The results that Metacat sends back to the client are in no particular order. It would be <br />helpful if the records were sorted. It would be even more helpful if the client <br />could specify the field on which they should be sorted.</p>
Bug #1295 (Resolved): Registry: Button texts are ambiguous and need to be changed. https://projects.ecoinformatics.org/ecoinfo/issues/1295 2004-02-02T22:38:27Z Saurabh Garg sgarg@nceas.ucsb.edu
<p>Change the text of the buttons to the following:</p>
<p>-&gt; Review Entry<br />-&gt; Change reset to 'Cancel'</p>
<p>-&gt; Yes, submit.<br />-&gt; No, go back to editing.</p>
Bug #1235 (Resolved): enable passthrough parameters to support stylesheet params https://projects.ecoinformatics.org/ecoinfo/issues/1235 2003-12-10T09:22:21Z Matt Jones jones@nceas.ucsb.edu
<p>Many different skins for metacat could take advantage of custom parameters in<br />the stylesheets. For example, the OBFS registry needs to add Edit and<br />Delete buttons to the resultset listing. A simple way to do this is to pass<br />parameters through metacat into the stylesheets to control the behavior of the<br />rendered output. This is currently hindered by the DBQuery.createSQuery()<br />function, which interprets all unknown parameters as XPaths that<br />should be written as an additional constraint in an squery. We need to<br />partially circumvent this feature in order for passthrough stylesheet parameters<br />to work.</p>
Bug #1230 (Resolved): move metacat.properties out of jar file https://projects.ecoinformatics.org/ecoinfo/issues/1230 2003-12-05T20:56:06Z Matt Jones jones@nceas.ucsb.edu
<p>The current configuration file for metacat (metacat.properties) is installed<br />inside of the metacat.jar JAR file. This makes changing the configuration<br />difficult for most users. Need to move it out of the jar, probably to a<br />location like ${context}/WEB-INF/metacat.properties. I have started code to<br />accomplish this change.</p>
Bug #1202 (Resolved): If request sessionid not recognized, Metacat SQL error & 0 records returned https://projects.ecoinformatics.org/ecoinfo/issues/1202 2003-11-05T04:46:49Z Matthew Brooke brooke@nceas.ucsb.edu
<p>Problem:<br />When searching metacat from an HTML form, if the form contains a hidden field name="sessionid", <br />and the value of sessionid is not recognized by metacat, a SQL error is generated in the tomcat log <br />and metacat returns 0 documents.</p>
<p>Proposed correction: <br />If Metacat gets a sessionid that it doesn't recognize, it should assume the user is not logged in and <br />return all the "public access" documents, instead of returning no documents. This would <br />presumably mean validating the sessionid first, before putting together the SQL query.</p>
<p>(usage note: the above scenario was encountered when using the getSessionId() method of the <br />metacat client API to get the sessionid when the user logs in - browser cookies were not involved. <br />If the user was not logged in, the sessionid hidden field value was the string value "NULL", which <br />led to this behavior)</p>
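<p>The proposed correction can be sketched as follows. This is a minimal Python illustration, not Metacat's implementation: the session table, the function names, and the principal "jones" are hypothetical. It resolves the session first, falls back to the "public" principal for any unrecognized id (including the literal string "NULL"), and only then builds the access clause, so the clause can never begin with a dangling OR.</p>

```python
# Hypothetical in-memory session table: session id -> user principal.
SESSIONS = {"A1B2C3": "jones"}

PUBLIC = "public"


def resolve_principals(session_id, sessions=SESSIONS):
    """Return the principals to check; unknown or missing ids get public only."""
    user = sessions.get(session_id) if session_id else None
    return [user, PUBLIC] if user else [PUBLIC]


def allow_clause(principals):
    """Build a WHERE fragment with placeholders and no leading OR."""
    per_principal = " OR ".join("principal_name = ?" for _ in principals)
    sql = (
        f"({per_principal}) AND perm_type = 'allow' "
        "AND (permission = '4' OR permission = '7')"
    )
    return sql, list(principals)


# The string "NULL" is not a known session, so only public access applies.
sql, params = allow_clause(resolve_principals("NULL"))
print(sql)
print(params)
```

<p>Joining the per-principal conditions from a list that always contains at least "public" is what keeps the generated SQL well formed when no user is logged in.</p>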
<p>Sample tomcat log output for a search on the keyword "grass" - note that the initial SQL query <br />(starting on line 5) is well formed - the problem comes later, presumably when the user ID is <br />incorporated into the query? (e.g. "...AND subtreeid IS NULL...")<br />--------------------------------<br />MetaCat: Connection pool size: 5<br />MetaCat: Free Connection number: 5<br />MetaCat: Line 230: Action is: query<br />MetaCat: percentage number: 0<br />MetaCat: query: SELECT docid,docname,doctype,date_created, date_updated, rev FROM <br />xml_documents WHERE docid IN ((SELECT DISTINCT docid FROM xml_nodes WHERE<br />UPPER(nodedata) LIKE '%GRASS%' )<br />)<br />MetaCat: OwnerQuery: SELECT docid FROM xml_documents WHERE<br />MetaCat: allow string is: OR (principal_name = 'public' AND perm_type = 'allow' AND <br />(permission='4' OR permission='7'))<br />MetaCat: allow query is: SELECT docid from xml_access WHERE( OR (principal_name = 'public' AND <br />perm_type = 'allow' AND (permission='4' OR permission='7'))) AND subtreeid IS NULL<br />MetaCat: denyquery is: SELECT docid from xml_access WHERE( OR (principal_name = 'public' AND <br />perm_type = 'deny' AND perm_order ='allowFirst' AND (permission='4' OR permission='7'))) AND <br />subtreeid IS NULL<br />MetaCat: accessquery is: AND (docid IN(SELECT docid FROM xml_documents WHERE ) OR (docid IN <br />(SELECT docid from xml_access WHERE( OR (principal_name = 'public' AND perm_type = 'allow' <br />AND (permission='4' OR permission='7'))) AND subtreeid IS NULL) AND docid NOT IN (SELECT docid <br />from xml_access WHERE( OR (principal_name = 'public' AND perm_type = 'deny' AND perm_order <br />='allowFirst' AND (permission='4' OR permission='7'))) AND subtreeid IS NULL )))<br />MetaCat: final query: SELECT docid,docname,doctype,date_created, date_updated, rev FROM <br />xml_documents WHERE docid IN ((SELECT DISTINCT docid FROM xml_nodes WHERE<br />UPPER(nodedata) LIKE '%GRASS%' )<br />) AND (docid IN(SELECT docid FROM xml_documents WHERE ) OR (docid IN (SELECT docid from <br />xml_access WHERE( OR (principal_name = 'public' AND perm_type = 'allow' AND (permission='4' <br />OR permission='7'))) AND subtreeid IS NULL) AND docid NOT IN (SELECT docid from xml_access <br />WHERE( OR (principal_name = 'public' AND perm_type = 'deny' AND perm_order ='allowFirst' AND <br />(permission='4' OR permission='7'))) AND subtreeid IS NULL )))<br />SQL Error in DBQuery.findDocuments: ERROR: parser: parse error at or near ")" at character 231</p>
<p>MetaCat: Trying style-set file: /Applications/jakarta-tomcat/webapps/knb/knb.xml<br />MetaCat: style system id is: <a class="external" href="http://anacapa.nceas.ucsb.edu:8080/knb/style/resultset.xsl">http://anacapa.nceas.ucsb.edu:8080/knb/style/resultset.xsl</a></p>
Bug #1142 (Resolved): Key issues between java 1.3 and java 1.4 https://projects.ecoinformatics.org/ecoinfo/issues/1142 2003-09-12T22:59:23Z Jing Tao tao@nceas.ucsb.edu
<p>There is an untrusted-chain problem in replication between two metacats when <br />one is running on Java 1.4.x and the other is running on Java 1.3.x. If both <br />metacats are switched to 1.3, or both to 1.4, the problem goes away.</p>
<p>It seems there is a key format difference between Java 1.3 and Java 1.4. We need <br />to figure out the issue to make sure replication can happen between Java 1.3 and<br />1.4.</p>
Bug #1139 (Resolved): squery produces output that is not well-formed https://projects.ecoinformatics.org/ecoinfo/issues/1139 2003-09-03T16:33:44Z Matt Jones jones@nceas.ucsb.edu
<p>When a metacat query is submitted using the "squery" approach and the query<br />document is missing the xml declaration (i.e., "&lt;?xml version="1.0"?&gt;"), then<br />the output that is returned is not well-formed. In particular, the leading<br />"pathquery" element is missing its "&lt;" opening tag delimiter. Metacat should<br />accept documents without the xml declaration, and should never produce<br />non-well-formed output (it's clear that the query is running successfully, but<br />the output formatting is getting messed up).</p>
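<p>A small Python check makes both points concrete. It is a sketch that uses only the standard-library parser, not anything from Metacat, and the shortened query and the helper name are illustrative. A pathquery is well-formed XML with or without the declaration, while a response whose "pathquery" start tag has lost its "&lt;" is rejected by the parser.</p>

```python
from xml.etree import ElementTree as ET

# Shortened version of the query from this report, without an XML declaration.
QUERY_BODY = """<pathquery version="1.0">
  <returndoctype>meeting</returndoctype>
  <returnfield>/meeting/title</returnfield>
  <querygroup operator="INTERSECT">
    <queryterm casesensitive="false" searchmode="contains">
      <value>%</value>
    </queryterm>
  </querygroup>
</pathquery>"""


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


with_declaration = '<?xml version="1.0"?>\n' + QUERY_BODY

# Response shaped like the broken output: the "<" of <pathquery> is missing.
broken_response = "<resultset><query>" + QUERY_BODY[1:] + "</query></resultset>"
good_response = "<resultset><query>" + QUERY_BODY + "</query></resultset>"

print(is_well_formed(QUERY_BODY), is_well_formed(with_declaration))
print(is_well_formed(broken_response), is_well_formed(good_response))
```

<p>A check of this kind on the server's output, in whatever language the server uses, would catch the symptom regardless of how the query was submitted.</p>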
<p>Here's an example return from metacat:</p>
<pre><code>&lt;?xml version="1.0"?&gt;
&lt;resultset&gt;
  &lt;query&gt;pathquery version="1.0"&gt;
  &lt;returndoctype&gt;meeting&lt;/returndoctype&gt;
  &lt;returnfield&gt;/meeting/title&lt;/returnfield&gt;
  &lt;returnfield&gt;/meeting/start&lt;/returnfield&gt;
  &lt;returnfield&gt;/meeting/end&lt;/returnfield&gt;
  &lt;querygroup operator="INTERSECT"&gt;
    &lt;queryterm casesensitive="false" searchmode="contains"&gt;
      &lt;value&gt;%&lt;/value&gt;
    &lt;/queryterm&gt;
  &lt;/querygroup&gt;
&lt;/pathquery&gt;&lt;/query&gt;
&lt;document&gt;&lt;docid&gt;mtg-test.1.2&lt;/docid&gt;&lt;docname&gt;meeting&lt;/docname&gt;&lt;doctype&gt;meeting&lt;/doctype&gt;&lt;createdate&gt;2003-08-14 21:37:36.0&lt;/createdate&gt;&lt;updatedate&gt;2003-08-15 15:09:41.0&lt;/updatedate&gt;&lt;param name="/meeting/title"&gt;SEEK All-Hands Meeting 2003&lt;/param&gt;&lt;param name="/meeting/end"&gt;14 Oct 2003&lt;/param&gt;&lt;param name="/meeting/start"&gt;10 Oct 2003&lt;/param&gt;&lt;/document&gt;&lt;/resultset&gt;</code></pre>
<p>That document was produced from the following query:</p>
<pre><code>&lt;pathquery version="1.0"&gt;
  &lt;returndoctype&gt;meeting&lt;/returndoctype&gt;
  &lt;returnfield&gt;/meeting/title&lt;/returnfield&gt;
  &lt;returnfield&gt;/meeting/start&lt;/returnfield&gt;
  &lt;returnfield&gt;/meeting/end&lt;/returnfield&gt;
  &lt;querygroup operator="INTERSECT"&gt;
    &lt;queryterm casesensitive="false" searchmode="contains"&gt;
      &lt;value&gt;%&lt;/value&gt;
    &lt;/queryterm&gt;
  &lt;/querygroup&gt;
&lt;/pathquery&gt;</code></pre>
<p>If instead one changes the query to this:</p>
<pre><code>&lt;?xml version="1.0"?&gt;
&lt;pathquery version="1.0"&gt;
  &lt;returndoctype&gt;meeting&lt;/returndoctype&gt;
  &lt;returnfield&gt;/meeting/title&lt;/returnfield&gt;
  &lt;returnfield&gt;/meeting/start&lt;/returnfield&gt;
  &lt;returnfield&gt;/meeting/end&lt;/returnfield&gt;
  &lt;querygroup operator="INTERSECT"&gt;
    &lt;queryterm casesensitive="false" searchmode="contains"&gt;
      &lt;value&gt;%&lt;/value&gt;
    &lt;/queryterm&gt;
  &lt;/querygroup&gt;
&lt;/pathquery&gt;</code></pre>
<p>then the metacat response is a well-formed result, like this:</p>
<pre><code>&lt;?xml version="1.0"?&gt;
&lt;resultset&gt;
  &lt;query&gt;
&lt;pathquery version="1.0"&gt;
  &lt;returndoctype&gt;meeting&lt;/returndoctype&gt;
  &lt;returnfield&gt;/meeting/title&lt;/returnfield&gt;
  &lt;returnfield&gt;/meeting/start&lt;/returnfield&gt;
  &lt;returnfield&gt;/meeting/end&lt;/returnfield&gt;
  &lt;querygroup operator="INTERSECT"&gt;
    &lt;queryterm casesensitive="false" searchmode="contains"&gt;
      &lt;value&gt;%&lt;/value&gt;
    &lt;/queryterm&gt;
  &lt;/querygroup&gt;
&lt;/pathquery&gt;&lt;/query&gt;
&lt;document&gt;&lt;docid&gt;mtg-test.1.2&lt;/docid&gt;&lt;docname&gt;meeting&lt;/docname&gt;&lt;doctype&gt;meeting&lt;/doctype&gt;&lt;createdate&gt;2003-08-14 21:37:36.0&lt;/createdate&gt;&lt;updatedate&gt;2003-08-15 15:09:41.0&lt;/updatedate&gt;&lt;param name="/meeting/title"&gt;SEEK All-Hands Meeting 2003&lt;/param&gt;&lt;param name="/meeting/end"&gt;14 Oct 2003&lt;/param&gt;&lt;param name="/meeting/start"&gt;10 Oct 2003&lt;/param&gt;&lt;/document&gt;&lt;/resultset&gt;</code></pre>
Bug #1137 (Resolved): add a metacat-info action https://projects.ecoinformatics.org/ecoinfo/issues/1137 2003-08-28T17:53:16Z Chad Berkley berkley@nceas.ucsb.edu
<p>I think we need to add a metacat-info action so that you can send a request to <br />metacat and it will print selected properties from the properties file as well <br />as the actual metacat version that is running. I think the version is <br />actually the most important info that we need, but other things that could be <br />returned are the database name, the JDBC connection string, etc. This would <br />be very useful for debugging.</p>
Bug #188 (Resolved): provide a metacat client library with a standard API https://projects.ecoinformatics.org/ecoinfo/issues/188 2001-04-09T19:31:50Z Matt Jones jones@nceas.ucsb.edu
<p>Need to be able to call metacat directly from another servlet or from within a<br />client app (using it for local metadata storage), so need to make it into a<br />library that can be called this way, rather than passing params via an<br />inefficient http mechanism when it isn't needed. Owne and John agreed to<br />collaborate on this.</p>
Bug #186 (Resolved): add web metadata entry form for Metacat https://projects.ecoinformatics.org/ecoinfo/issues/186 2001-04-09T19:17:28Z Matt Jones jones@nceas.ucsb.edu
<p>Discussed need for a web-based metadata entry form. Nottrott has a simple form<br />which is a starting point, and has agreed to work with higgins on this task.</p>
Bug #162 (Resolved): need harvest/batch load for metacat https://projects.ecoinformatics.org/ecoinfo/issues/162 2000-10-25T00:28:59Z Matt Jones jones@nceas.ucsb.edu
<p>The metacat server needs to be able to accept large numbers of metadata<br />documents for insert and update from site metadata catalogs. This should be<br />enabled either through a pull or push mechanism, so the pull (harvest) will need<br />a registry service as well.</p>
Bug #101 (Resolved): generate data set usage metadata/ provide access log https://projects.ecoinformatics.org/ecoinfo/issues/101 2000-08-24T17:27:35Z Matt Jones jones@nceas.ucsb.edu
<p>Tracking data set usage is an important part of running a metadata and data<br />archive. We need a new feature in the metacat server that tracks usage<br />statistics for each data-set and creates metadata records that document the<br />usage of data sets. This information is currently encoded in the<br />"data_request", "review_history", and "remarks" fields of eml-supplement.dtd. <br />These fields need to move forward as we revise these standards, and the metacat<br />servlet needs to automatically generate new records of this type as needed for a<br />data set, as it is accessed. We'll need to decide whether simply viewing the<br />metadata should be tracked, or whether a download is required to trigger a log<br />entry.</p>
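<p>One way to keep the open question (views versus downloads) as a configuration choice rather than a code change is sketched below in Python. The class, field names, and event names are hypothetical and are not part of Metacat or of eml-supplement.dtd. The docid is the one used in an example elsewhere in this tracker.</p>

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class UsageLog:
    """Collects access events; whether plain views count is a policy flag."""

    track_views: bool = False
    entries: list = field(default_factory=list)

    def record(self, docid, principal, event):
        """Store the event and return True, or return False if policy skips it."""
        if event == "view" and not self.track_views:
            return False
        self.entries.append(
            {
                "docid": docid,
                "principal": principal,
                "event": event,
                "time": datetime.now(timezone.utc).isoformat(),
            }
        )
        return True

    def counts_by_docid(self):
        """Return how many logged events each data set has."""
        return Counter(entry["docid"] for entry in self.entries)


log = UsageLog(track_views=False)
log.record("mtg-test.1.2", "public", "view")      # skipped under this policy
log.record("mtg-test.1.2", "public", "download")  # logged
print(log.counts_by_docid())
```

<p>Flipping track_views to True would log metadata views as well, without touching the code that generates the usage records.</p>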