Metacat: Issues | Ecoinformatics Redmine | https://projects.ecoinformatics.org/ecoinfo/ | 2010-11-17T02:09:47Z
Bug #5244 (Resolved): ldapweb.cgi shouldn't report ou=Account accounts since they're unusable | https://projects.ecoinformatics.org/ecoinfo/issues/5244 | 2010-11-17T02:09:47Z | Shaun Walbridge (walbridge@nceas.ucsb.edu)
<p>Because of security considerations, we can't allow users to reuse their credentials from ou=Account, which is where we create the majority of accounts for Plone and SVN. However, when suggesting existing accounts to users, we include these ou=Account entries, which confuses new users.</p>
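The temporary fix amounts to dropping any entry under ou=Account before building the suggestion list. A minimal sketch in Java (the class name and example DNs are hypothetical illustrations; the real ldapweb.cgi is a Perl CGI script):

```java
import java.util.List;
import java.util.stream.Collectors;

public class AccountSuggestions {
    // Returns true when the entry's DN sits under ou=Account and must be hidden.
    static boolean isUnusable(String dn) {
        return dn.toLowerCase().contains("ou=account");
    }

    // Filters a list of candidate DNs down to the ones safe to suggest.
    static List<String> usableSuggestions(List<String> dns) {
        return dns.stream().filter(dn -> !isUnusable(dn)).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> dns = List.of(
            "uid=jdoe,ou=Account,dc=ecoinformatics,dc=org",
            "uid=jdoe,o=NCEAS,dc=ecoinformatics,dc=org");
        System.out.println(usableSuggestions(dns));
    }
}
```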
<p>As a temporary fix, filter the ou=Account entries out of the accounts used when making suggestions.</p> Bug #5114 (Resolved): ESA registry: Bad link in 'register dataset' instructions | https://projects.ecoinformatics.org/ecoinfo/issues/5114 | 2010-07-29T18:23:31Z | Jim Regetz (regetz@nceas.ucsb.edu)
<p>If you click the Register Data menu link without being logged in, it returns a page of registration instructions. Under Step 4 is a hyperlink to "ESA Data Registry Form". Clicking it produces a blank page, I imagine because the application context 'esa' is missing from the URL:<br /><a class="external" href="http://data.esa.org/cgi-bin/register-dataset.cgi?cfg=esa">http://data.esa.org/cgi-bin/register-dataset.cgi?cfg=esa</a></p>
<p>It's not a particularly useful or important link in any case (it just returns the same page), but it at least shouldn't return a blank page.</p> Bug #5054 (Resolved): Unable to insert large EML documents and no error status returned | https://projects.ecoinformatics.org/ecoinfo/issues/5054 | 2010-06-18T21:24:22Z | Duane Costa (dcosta@lternet.edu)
<p>Two LTER sites (CDR and LUQ) have been unable to insert large EML documents into the LTER Metacat. The failure occurs both in batch mode (i.e. Metacat Harvester) and manually from a web form. There are really two problems:</p>
<p>1. The insert operation fails.</p>
<p>2. Metacat gives no indication that it fails. There is no apparent response of any kind other than that the insert operation terminates.</p>
<p>The following URL points to a very large (2+ MB) EML document with in-line data. Dan Bahauddin, Information Manager at Cedar Creek Ecosystem Science Reserve, has granted permission to Metacat developers to access this document for the purpose of testing this bug (thanks, Dan!):</p>
<pre><code><a class="external" href="http://www.cedarcreek.umn.edu/data/emlFiles/knb-lter-cdr.70120.101.xml">http://www.cedarcreek.umn.edu/data/emlFiles/knb-lter-cdr.70120.101.xml</a></code></pre> Bug #5011 (Resolved): Add support for DataONE service API | https://projects.ecoinformatics.org/ecoinfo/issues/5011 | 2010-05-14T17:25:51Z | Matt Jones (jones@nceas.ucsb.edu)
<p>Metacat will be used within DataONE as both a Member Node and as a component of the Coordinating Node infrastructure. We need to add support for the proper REST APIs for DataONE, which are described here:</p>
<p><a class="external" href="http://mule1.dataone.org/ArchitectureDocs/index.html">http://mule1.dataone.org/ArchitectureDocs/index.html</a></p>
<p>Initial support should include the following methods:<br />MN.get()<br />MN.getSystemMetadata()<br />MN.listObjects()</p>
<p>CN.create()<br />CN.update()<br />CN.get()<br />CN.getSystemMetadata()<br />CN.resolve()</p>
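The methods above are exposed over REST per the architecture docs linked below. As a hedged sketch of how a dispatcher might map incoming REST calls to the Member Node methods (the `/object` and `/meta` path layout is an assumption drawn loosely from the docs, not the final DataONE specification):

```java
public class DataOneRoutes {
    // Maps an HTTP method and REST path to the service API call it would invoke.
    // Path layout here is illustrative only; see the DataONE architecture docs.
    static String resolve(String httpMethod, String path) {
        if (!httpMethod.equals("GET")) return "unsupported";
        if (path.equals("/object")) return "MN.listObjects";
        if (path.startsWith("/object/")) return "MN.get";
        if (path.startsWith("/meta/")) return "MN.getSystemMetadata";
        return "unknown";
    }

    public static void main(String[] args) {
        System.out.println(resolve("GET", "/object/knb-lter-cdr.70120.101"));
    }
}
```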
<p>These tasks are a DataONE project, and so details are being tracked in the DataONE ticket system (<a class="external" href="http://trac.dataone.org">http://trac.dataone.org</a>). This is a tracking bug for the overall feature.</p> Bug #4907 (Resolved): Replication error stops insert/update of valid EML document | https://projects.ecoinformatics.org/ecoinfo/issues/4907 | 2010-03-29T19:50:47Z | Duane Costa (dcosta@lternet.edu)
<p>It appears that a replication error causes an insert/update to fail that would otherwise succeed:</p>
<p>On March 26, 2010, Margaret O'Brien wrote:</p>
<pre><code>Hi Duane -<br /> I am trying to do a document update to the LTER network catalog from the Metacat XML loader at:<br /> <a class="external" href="http://metacat.lternet.edu/knb/style/skins/dev/loadxml.jsp">http://metacat.lternet.edu/knb/style/skins/dev/loadxml.jsp</a><br /> after logging in as user=sbc here:<br /> <a class="external" href="http://metacat.lternet.edu/knb/style/skins/dev/login.html">http://metacat.lternet.edu/knb/style/skins/dev/login.html</a></code></pre>
<pre><code>I get this error:<br /> &lt;error&gt;<br /> sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target<br /> &lt;/error&gt;</code></pre>
<pre><code>The document was valid EML 2.0.1. Can you help? I asked Mike Daigle, and he says he has seen this occasionally during replication.</code></pre>
<p>thanks -<br />Margaret O'Brien</p>
<p>I repeated this same procedure, using the same EML document that Margaret tried to update, on two different Metacat servers:</p>
<p>(1) The first server was a test instance of Metacat which had <strong>no</strong> replication configured. The document was successfully inserted without errors.</p>
<p>(2) The second server was the LTER production server, which replicates to the KNB server. The result I got was identical to Margaret's: the document failed to update, and the error message involved replication; there was no indication of any problem with the document itself.</p>
<p>So it appears that there might be a logical flaw in the insert/update process, where a replication issue can prevent an insert/update from succeeding. My expectation would be that insert/update should not depend on replication succeeding first. (There is, of course, a dependency in the other direction: replication depends on the insert/update succeeding first.)</p>
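The expected behavior can be sketched as follows: commit the local write first and treat replication notification as best-effort. This is an illustration only (the interface and class names are hypothetical; Metacat's real write path is DocumentImpl.write(), which the traceback below shows calling MetacatReplication.getURLContent()):

```java
public class InsertWithReplication {
    interface Replicator { void notifyReplicas(String docid) throws Exception; }

    // Sketch: the local insert is committed first; a replication failure
    // (e.g. an SSL PKIX error) is logged but must not undo the insert.
    static boolean insert(String docid, Replicator replicator) {
        // ... write the document locally (omitted in this sketch) ...
        try {
            replicator.notifyReplicas(docid);
        } catch (Exception e) {
            System.err.println("replication deferred for " + docid + ": " + e.getMessage());
        }
        return true; // the local insert still succeeds
    }

    public static void main(String[] args) {
        boolean ok = insert("knb-lter-sbc.13.11",
            id -> { throw new Exception("PKIX path building failed"); });
        System.out.println(ok);
    }
}
```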
<p>Below is the full traceback from the Tomcat log file on host metacat.lternet.edu:</p>
<p>2010-03-26 08:14:28.721 Data Portal: [WARN] [edu.ucsb.nceas.metacat.MetaCatServlet]: Error in writing eml document to the databasesun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target<br />javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target<br /> at com.sun.net.ssl.internal.ssl.Alerts.getSSLException(Alerts.java:174)<br /> at com.sun.net.ssl.internal.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1611)<br /> at com.sun.net.ssl.internal.ssl.Handshaker.fatalSE(Handshaker.java:187)<br /> at com.sun.net.ssl.internal.ssl.Handshaker.fatalSE(Handshaker.java:181)<br /> at com.sun.net.ssl.internal.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1035)<br /> at com.sun.net.ssl.internal.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:124)<br /> at com.sun.net.ssl.internal.ssl.Handshaker.processLoop(Handshaker.java:516)<br /> at com.sun.net.ssl.internal.ssl.Handshaker.process_record(Handshaker.java:454)<br /> at com.sun.net.ssl.internal.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:884)<br /> at com.sun.net.ssl.internal.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1112)<br /> at com.sun.net.ssl.internal.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1139)<br /> at com.sun.net.ssl.internal.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1123)<br /> at sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:418)<br /> at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:166)<br /> at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1041)<br /> at 
sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:234)<br /> at java.net.URL.openStream(URL.java:1009)<br /> at edu.ucsb.nceas.metacat.MetacatReplication.getURLContent(MetacatReplication.java:1970)<br /> at edu.ucsb.nceas.metacat.DocumentImpl.write(DocumentImpl.java:2577)<br /> at edu.ucsb.nceas.metacat.DocumentImpl.write(DocumentImpl.java:2497)<br /> at edu.ucsb.nceas.metacat.DocumentImplWrapper.write(DocumentImplWrapper.java:63)<br /> at edu.ucsb.nceas.metacat.MetaCatServlet.handleInsertOrUpdateAction(MetaCatServlet.java:2160)<br /> at edu.ucsb.nceas.metacat.MetaCatServlet.handleGetOrPost(MetaCatServlet.java:726)<br /> at edu.ucsb.nceas.metacat.MetaCatServlet.doPost(MetaCatServlet.java:359)<br /> at javax.servlet.http.HttpServlet.service(HttpServlet.java:637)<br /> at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)<br /> at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)<br /> at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)<br /> at org.vfny.geoserver.filters.SetCharacterEncodingFilter.doFilter(SetCharacterEncodingFilter.java:122)<br /> at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)<br /> at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)<br /> at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)<br /> at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)<br /> at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:433)<br /> at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)<br /> at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)<br /> at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)<br /> at 
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)<br /> at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845)<br /> at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)<br /> at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)<br /> at java.lang.Thread.run(Thread.java:619)<br />Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target<br /> at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:285)<br /> at sun.security.validator.PKIXValidator.engineValidate(PKIXValidator.java:191)<br /> at sun.security.validator.Validator.validate(Validator.java:218)<br /> at com.sun.net.ssl.internal.ssl.X509TrustManagerImpl.validate(X509TrustManagerImpl.java:126)<br /> at com.sun.net.ssl.internal.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:209)<br /> at com.sun.net.ssl.internal.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:249)<br /> at com.sun.net.ssl.internal.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1014)<br /> ... 37 more<br />Caused by: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target<br /> at sun.security.provider.certpath.SunCertPathBuilder.engineBuild(SunCertPathBuilder.java:174)<br /> at java.security.cert.CertPathBuilder.build(CertPathBuilder.java:238)<br /> at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:280)<br /> ... 43 more</p> Bug #4728 (Resolved): grant document management privileges to admin user | https://projects.ecoinformatics.org/ecoinfo/issues/4728 | 2010-02-03T02:39:00Z | Matt Jones (jones@nceas.ucsb.edu)
<p>Metacat has an administrative user, but that user only has a few capabilities. We need to enhance the privileges of the administrative user. In particular, they should be able to:</p>
<p>1) Delete all data files and metadata files<br />2) Change ownership on all data and metadata files<br />3) Change permissions on all data and metadata files</p>
<p>There are probably additional services that should be enabled for admins, but these are lower priority and could be postponed to another bug if they would slow down implementation of 1-3 above:</p>
<p>4) Initiate replication<br />5) View logs (I think already implemented)<br />6) Rebuild indices (I think already implemented)<br />...</p> Bug #4405 (Resolved): PISCO, KNB and LTER have different query results for SBC datasets | https://projects.ecoinformatics.org/ecoinfo/issues/4405 | 2009-09-21T19:22:32Z | Jing Tao (tao@nceas.ucsb.edu)
<blockquote><blockquote><blockquote>
<p>On Mon, Sep 21, 2009 at 10:18 AM, Margaret O'Brien <<a class="email" href="mailto:mob@msi.ucsb.edu">mob@msi.ucsb.edu</a>> wrote:</p>
<blockquote>
<p>Hi metacat dev -<br />When I hit these 3 metacats with the query below, I get 3 different<br />resultsets.<br /><a class="external" href="http://knb.ecoinformatics.org/knb/metacat">http://knb.ecoinformatics.org/knb/metacat</a> (returns 13 datasets, some older<br />rev numbers)<br /><a class="external" href="http://metacat.lternet.edu/knb/metacat">http://metacat.lternet.edu/knb/metacat</a> (returns 14 datasets, with newer<br />revision numbers)<br /><a class="external" href="http://data.piscoweb.org/catalog/metacat">http://data.piscoweb.org/catalog/metacat</a> (returns 0 datasets)</p>
<p>All sbc datasets are uploaded to data.piscoweb.org, and are replicated to<br />the other 2. I believe it is a one-way replication to knb, and from there<br />two-way knb&lt;-&gt;lter, but I could be wrong.</p>
<p>thought you might want to know...<br />margaret</p>
<pre><code>&lt;pathquery version="1.2"&gt;
&lt;returndoctype&gt;eml://ecoinformatics.org/eml-2.1.0&lt;/returndoctype&gt;
&lt;returndoctype&gt;eml://ecoinformatics.org/eml-2.0.1&lt;/returndoctype&gt;
&lt;returndoctype&gt;eml://ecoinformatics.org/eml-2.0.0&lt;/returndoctype&gt;
&lt;returnfield&gt;eml/dataset/title&lt;/returnfield&gt;
&lt;returnfield&gt;eml/dataset/creator/individualName/surName&lt;/returnfield&gt;
&lt;returnfield&gt;eml/dataset/creator/organizationName&lt;/returnfield&gt;
&lt;returnfield&gt;eml/dataset/dataTable/entityName&lt;/returnfield&gt;
&lt;returnfield&gt;eml/dataset/dataTable/physical/distribution/online/url&lt;/returnfield&gt;
&lt;querygroup operator="INTERSECT"&gt;
  &lt;queryterm casesensitive="false" searchmode="contains"&gt;
    &lt;value&gt;reed&lt;/value&gt;
    &lt;pathexpr&gt;creator/individualName/surName&lt;/pathexpr&gt;
  &lt;/queryterm&gt;
  &lt;queryterm casesensitive="false" searchmode="starts-with"&gt;
    &lt;value&gt;knb-lter-sbc&lt;/value&gt;
    &lt;pathexpr&gt;@packageId&lt;/pathexpr&gt;
  &lt;/queryterm&gt;
&lt;/querygroup&gt;
&lt;/pathquery&gt;</code></pre>
</blockquote>
<p>Hi Margaret,</p>
<p>This is indeed a problem. The difference between the nceas and lternet servers<br />may be due to failures in indexing. It's not clear at all why the pisco<br />server would return 0, though. Do you know what the right answer should be<br />(e.g., how many results should be returned, and which docids, including<br />revisions, should match)?</p>
<p>Matt</p>
</blockquote></blockquote></blockquote>
<blockquote>
<p>SBC datasets are uploaded to the PISCO metacat, and I rely on replication to get them to LTER. But I have not queried PISCO's metacat since it upgraded to the<br />latest version. I have my own catalog application (I don't use the index.jsp), and only use metacat for storage and dataset display. So I did not see this<br />problem till I created a search-box to directly query metacat for datasets.</p>
<p>I've asked Chris about the 0 hits - but I think he was/is on vacation. I am attaching an email (to chris) that includes the log from a recent restart of that<br />tomcat. It has a lot of warnings and errors in it.</p>
<p>thanks,<br />Margaret</p>
</blockquote>
<p>The response from the lter metacat is correct: there are 14 datasets with creator=Reed, with these ids:<br />knb-lter-sbc.13.11<br />knb-lter-sbc.21.8<br />knb-lter-sbc.24.7<br />knb-lter-sbc.14.8<br />knb-lter-sbc.15.7<br />knb-lter-sbc.17.12<br />knb-lter-sbc.18.8<br />knb-lter-sbc.19.6<br />knb-lter-sbc.25.2<br />knb-lter-sbc.26.1<br />knb-lter-sbc.27.1<br />knb-lter-sbc.28.2<br />knb-lter-sbc.29.1<br />knb-lter-sbc.30.1</p>
<p>I checked the KNB metacat tables, and all 14 of these documents are present in the xml_documents table, with the proper revisions. This strengthens my<br />earlier thought that it is only the index that is not updated.</p>
<p>Matt</p>
<p>Since we replicate eml in this way, PISCO -> KNB -> LTER, and LTER has the correct packages, replication is working fine. So I agree with Matt (on irc): this may be<br />caused by the indexing issue we just introduced. However, both LTER and KNB should be running the same version of metacat (1.9.1), so I can't understand why they<br />behave differently.</p>
<p>Thanks,</p>
<p>Jing</p> Bug #4356 (Resolved): knb website query result shows old version of a document | https://projects.ecoinformatics.org/ecoinfo/issues/4356 | 2009-08-31T21:26:51Z | Jing Tao (tao@nceas.ucsb.edu)
<blockquote><blockquote><blockquote><blockquote>
<p>Oliver Soong wrote:</p>
<blockquote>
<p>I dropped a note about this on IRC, and Matt suggested you two might<br />be the ones to hear about it. If I search on KNB for Kruger, I can<br />turn up judithk.40.35, but if I adjust the URL to ask for<br />docid=judithk.40, I get judithk.40.40. The package was just recently<br />replicated, so that may be part of the issue, but at the moment,<br />things seem to be out of sync. I don't know if this is expected<br />(search index is slightly out of date?) or whether this represents a<br />more serious issue.</p>
<p>Oliver</p>
</blockquote></blockquote></blockquote></blockquote></blockquote>
<p>Hi, Mike:</p>
<p>I think I found the problem. I queried the xml_queryresult table with "select * from xml_queryresult where docid like 'judithk.40';" and got something<br />like:</p>
<pre><code>judithk.40 |
&lt;docid&gt;judithk.40.35&lt;/docid&gt;&lt;docname&gt;eml&lt;/docname&gt;&lt;doctype&gt;eml://ecoinformatics.org/eml-2.1.0&lt;/doctype&gt;&lt;createdate&gt;2004-07-20&lt;/createdate&gt;&lt;updatedate&gt;2009-05-29&lt;/updatedate&gt;&lt;param name="dataset/title"&gt;Kruger National Park ecological aerial survey data (1998-2007)&lt;/param&gt;&lt;param name="creator/individualName/surName"&gt;Kruger&lt;/param&gt;&lt;param name="creator/organizationName"&gt;SANParks&lt;/param&gt;&lt;param name="keyword"&gt;Ungulate population&lt;/param&gt;&lt;param name="keyword"&gt;Aerial survey&lt;/param&gt;&lt;param name="keyword"&gt;Distance methodology&lt;/param&gt;&lt;param name="keyword"&gt;SANParks, South Africa&lt;/param&gt;&lt;param name="keyword"&gt;Kruger National Park, South Africa&lt;/param&gt;&lt;param name="keyword"&gt;Kruger National Park&lt;/param&gt;</code></pre>
<p>You see, judithk.40.35 is stored in the xml_queryresult table. I think there was a mechanism to delete the cached item when a document was updated.<br />It seems this mechanism is broken.</p>
<p>It used to be possible to delete records from the xml_queryresult table, but I am not sure if metacat still works this way. If it does, we can delete<br />the records for judithk.40 and then I think the search will work.</p>
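The missing eviction step Jing describes can be sketched as follows. This is an in-memory stand-in used only to illustrate the invalidation logic; the real xml_queryresult is a database table, so the equivalent fix would be a DELETE on that table inside the document-update path:

```java
import java.util.HashMap;
import java.util.Map;

public class QueryResultCache {
    // In-memory stand-in for the xml_queryresult table: docid -> cached result row.
    private final Map<String, String> cache = new HashMap<>();

    void put(String docid, String cachedRow) { cache.put(docid, cachedRow); }
    String get(String docid) { return cache.get(docid); }

    // The step that appears to be broken: when a document is updated,
    // evict its stale cached query result so the next search rebuilds it.
    void onDocumentUpdate(String docid) { cache.remove(docid); }
}
```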
<p>I will file a bug report for this issue.</p>
<p>Thanks,</p>
<p>Jing</p> Bug #4083 (Resolved): Metacat doesn't declare XML document encoding | https://projects.ecoinformatics.org/ecoinfo/issues/4083 | 2009-05-19T01:26:50Z | Shaun Walbridge (walbridge@nceas.ucsb.edu)
<p>When generating EML documents, Metacat doesn't include the encoding. Currently, the Registry expects its documents to be ISO-8859-1, while MetaCatServlet.java usually generates documents without an 'encoding' declaration, which makes them default to UTF-8 (and that may not be handled correctly elsewhere).</p>
<p>If possible, we should consistently use one encoding for our documents to prevent data-munging issues.</p> Bug #3815 (Resolved): Ampersand character not correctly encoded | https://projects.ecoinformatics.org/ecoinfo/issues/3815 | 2009-02-10T01:35:18Z | Shaun Walbridge (walbridge@nceas.ucsb.edu)
<p>Ampersands are encoded as "&amp;amp;" within register-dataset.cgi in normalize(), but uploaded documents contain a "%26amp;" entry instead. "%26" is the URL-encoded form of "&amp;" (Unicode code point 0x0026), so it appears the already-escaped ampersand is being URL-encoded a second time.</p>
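An idempotent escaping step avoids this kind of double encoding. A hedged sketch (the class and method names are hypothetical, not the actual normalize() in register-dataset.cgi): ampersands that already begin an entity are left alone, so running the pass twice produces the same output.

```java
public class AmpersandEncoding {
    // Escapes a raw ampersand for XML, but leaves already-escaped entities
    // like &amp; or &#38; alone, so repeated passes are safe (idempotent).
    static String normalizeAmp(String s) {
        return s.replaceAll("&(?!amp;|lt;|gt;|quot;|apos;|#\\d+;)", "&amp;");
    }

    public static void main(String[] args) {
        System.out.println(normalizeAmp("U.S. Fish & Wildlife Service"));
        // A second pass over already-escaped text changes nothing:
        System.out.println(normalizeAmp("U.S. Fish &amp; Wildlife Service"));
    }
}
```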
<p>An example document exhibiting the behavior:<br /><a class="external" href="http://knb.ecoinformatics.org/knb/metacat?action=read&qformat=nceas&docid=nceas.912.8">http://knb.ecoinformatics.org/knb/metacat?action=read&qformat=nceas&docid=nceas.912.8</a></p>
<p>The organization is set to "U.S. Fish %26amp; Wildlife Service".</p> Bug #3811 (Resolved): Spatial caches should be backed up and restored | https://projects.ecoinformatics.org/ecoinfo/issues/3811 | 2009-02-06T21:50:30Z | Michael Daigle (daigle@nceas.ucsb.edu)
<p>Right now it takes over 40 minutes for the spatial caches to be regenerated every time metacat is upgraded on knb. These files should either be located outside the webapps directories (which may not be possible) or be backed up and restored at update time.</p> Bug #3296 (Resolved): Replication: Many EML documents fail to replicate | https://projects.ecoinformatics.org/ecoinfo/issues/3296 | 2008-05-13T21:37:34Z | Duane Costa (dcosta@lternet.edu)
<p>Over 1000 EML documents fail to replicate from the LTER Metacat to the KNB Metacat during every timed replication, currently scheduled once per week. Jing and I did some initial investigation of this issue last year. Jing suspects that the problem might be related to normalization (and/or lack of normalization) of character entities such as '&amp;lt;'. (See below.)</p>
<p>A secondary bug seems to be that each of the failed documents is sent twice, not once. (Also described below.) Thus, for each timed replication, LTER attempts to send over 2000 documents to KNB, KNB fails to write them into its Metacat, and the process is repeated again the following week. Because of the large number of documents sent during each timed replication, the process requires about eleven hours to complete.</p>
<p>We were previously scheduling timed replications once every 48 hours, but as an interim measure we reduced this to once every 7 days to reduce the load on both Metacat servers.</p>
<p>----<br />On Tue, 17 Jul 2007, Jing Tao wrote:</p>
<p>Here is something on the top of my head (if anybody know it is wrong, please correct me):</p>
<p>When you send an xml document containing something like "&amp;lt;" in a text node, xerces will automatically transform it to "&lt;" during our metacat insert process, so "&lt;" is what gets stored in our db. When we try to read the xml file back, metacat should transform the stored "&lt;" into "&amp;lt;"; otherwise the xml file will not be well-formed. The details of the transform (normalize) are in MetacatUtil.normalize().</p>
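The read-side step Jing describes can be sketched as a simple re-escape of stored text before it is embedded in XML output. The method name here is hypothetical; the real logic lives in MetacatUtil.normalize(). Note that "&" must be escaped first, or the escapes themselves get double-escaped:

```java
public class EntityRoundTrip {
    // On insert, the parser unescapes "&lt;" to "<" before storage.
    // On read, Metacat must re-escape stored text, or the emitted XML
    // is no longer well-formed. Escape "&" first to avoid double-escaping.
    static String denormalizeForOutput(String stored) {
        return stored.replace("&", "&amp;")
                     .replace("<", "&lt;")
                     .replace(">", "&gt;");
    }

    public static void main(String[] args) {
        String stored = "salinity < 35 ppt";              // what the database holds
        System.out.println(denormalizeForOutput(stored)); // safe to embed in XML text
    }
}
```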
<blockquote>
<p>The documents that are failing to be replicated are being accessed twice every forty-eight hours. This is strange, because we expected them to be accessed only once every forty-eight hours. Here's an example:</p>
<blockquote>
<p>grep knb-lter-and.3190.4 metacatreplication.log</p>
</blockquote>
<p>07-07-09 01:06:01 :: document knb-lter-and.3190.4 sent<br />07-07-09 06:29:32 :: document knb-lter-and.3190.4 sent<br />07-07-11 01:06:39 :: document knb-lter-and.3190.4 sent<br />07-07-11 06:25:31 :: document knb-lter-and.3190.4 sent<br />07-07-13 01:07:34 :: document knb-lter-and.3190.4 sent<br />07-07-13 07:03:49 :: document knb-lter-and.3190.4 sent<br />07-07-15 01:09:06 :: document knb-lter-and.3190.4 sent<br />07-07-15 07:34:07 :: document knb-lter-and.3190.4 sent</p>
<p>You can see how LNO attempts to send the same document twice, not once, about six hours apart every two days. Why is this happening, shouldn't it only be attempted once every forty-eight hours?</p>
</blockquote>
<p>I took a look at the replication log files and found the time interval is 48 hours, but somehow the replication request is issued twice in one replication cycle! It is a bug.</p>
<blockquote>
<p>Jing Tao wrote:</p>
<blockquote>
<p>Hi, Duane:</p>
<p>Thank you for the info. I took a look at the error log on the knb site. Yeah, knb rejected lots of documents from LNO. From the error message, it seems those documents have some problem in their format. Since LNO can accept them, I don't think there is any problem in the original docs. The most probable cause, I am guessing, is the normalization: the Xerces parser automatically normalizes some content, but metacat fails to denormalize the doc when we try to read it. What do you think? I will take a look soon. The attached files are the replication error and log from knb.</p>
<p>Thanks,</p>
<p>Jing</p>
<blockquote>
<p>Hi Jing,</p>
<p>I've attached the LNO replication log files from the past few days. It looks like LNO is replicating many files to KNB with every timed replication, but the KNB server is not accepting them. Also, each replication attempt seems to take a long time, about four or five minutes per document. I think the timed replications were beginning to take so long that they even started to overlap, i.e. a new timed replication would begin even though the previous one from forty-eight hours ago hadn't completed yet. When you get a chance, could you please take a look at the KNB logs to match this up and find out why the documents are not being accepted on the KNB side?<br />Thanks,<br />Duane</p>
<p>On 7/11/2007, Duane Costa wrote:</p>
<blockquote>
<p>Hi Jing,</p>
<p>I was just checking the replication logs on the LNO metacat, and it looks like there is an unusually large amount of replication happening with the timed replications (once every 48 hours), with many documents being sent from LNO to KNB. Do you know of any changes (on the KNB side) that might be causing this, and could you check the replication logs at the KNB metacat to see if you agree that it looks unusual?</p>
<p>Thanks,<br />Duane</p>
</blockquote></blockquote></blockquote></blockquote> Bug #2495 (Resolved): Charset bug: Internationalization | https://projects.ecoinformatics.org/ecoinfo/issues/2495 | 2006-07-19T22:31:05Z | Saurabh Garg (sgarg@nceas.ucsb.edu)
<p>Metacat should be modified so that it can handle characters from other languages as well.</p>
<p>Mr. Chau Chin Lin from Taiwan has reported that they have made the following set of changes in Metacat, which enables Metacat to work with 6 languages. The changes are as follows:</p>
<p>1. MetacatServlet.java (metacat-src-1.4.0\metacat-1.4.0\src\edu\ucsb\nceas\metacat\MetacatServlet.java)</p>
<pre><code>HandleGetOrPost()
if (action.equals("query")) {
    /*line:421*/ /*add this line*/ response.setContentType("text/xml; charset=UTF-8");
    PrintWriter out = response.getWriter();
    handleQuery(out, params, response, username, groupnames, sess_id);
    out.close();</code></pre>
<pre><code>handleReadAction() {
    /*line:1030*/ /*add this line*/ response.setContentType("text/xml; charset=UTF-8");
    ServletOutputStream out = null;
    ZipOutputStream zout = null;
    PrintWriter pw = null;
    boolean zip = false;</code></pre>
<p>2. build.properties</p>
<pre><code>line 27: jdbc-connect=jdbc:postgresql://localhost/metacat?charSet=UTF-8</code></pre>
<p>3. jsp files (metacat-src-1.4.0\metacat-1.4.0\lib\style\skins\default)</p>
<pre><code>&lt;%@ page contentType="text/html; charset=UTF-8" %&gt;</code></pre>
<p>UTF-8 is enforced as the character encoding for all types of communication.</p>
<p>Also worth noting is the way geoserver does things: it has an entry in web.xml which specifies a filter for encoding conversion.</p>
<pre><code>&lt;filter&gt;
  &lt;filter-name&gt;Set Character Encoding&lt;/filter-name&gt;
  &lt;filter-class&gt;org.vfny.geoserver.filters.SetCharacterEncodingFilter&lt;/filter-class&gt;
  &lt;init-param&gt;
    &lt;param-name&gt;encoding&lt;/param-name&gt;
    &lt;param-value&gt;UTF-8&lt;/param-value&gt;
  &lt;/init-param&gt;
&lt;/filter&gt;</code></pre>
<p>A test document with chinese characters can be found here: <a class="external" href="http://bugs.tfri.gov.tw/tfri/servlet/metacat?action=read&qformat=default&docid=test100.4.9">http://bugs.tfri.gov.tw/tfri/servlet/metacat?action=read&qformat=default&docid=test100.4.9</a></p>
<p>A chat log explaining related issues:</p>
<pre><code>[10:05] &lt;sid&gt; the changes which i made for storing all the possible characters in &amp;xxx; form in metacat will probably break things for Lin
[10:06] &lt;sid&gt; i am trying to debug it.. but we will probably need to change a bunch of code later on
[10:10] &lt;matt&gt; yep
[10:12] &lt;sid&gt; this document: http://bugs.tfri.gov.tw/tfri/servlet/metacat?action=read&amp;qformat=xml&amp;docid=test100.4.9
[10:13] &lt;sid&gt; comes back as this document: http://indus.msi.ucsb.edu/knb/metacat?action=read&amp;qformat=xml&amp;docid=sgtest.100.1
[10:14] &lt;matt&gt; if you insert it in 1.6+
[10:14] &lt;matt&gt; ?
[10:14] &lt;sid&gt; yes
[10:14] &lt;matt&gt; with or without their patches?
[10:14] &lt;sid&gt; i havnt tried the patches yet
[10:15] &lt;matt&gt; i think you need them
[10:15] &lt;matt&gt; in order to store the characters in postgres as UTF-8
[10:16] &lt;sid&gt; its mainly because of this code
[10:16] &lt;sid&gt; str.append("&amp;#");
[10:16] &lt;sid&gt; str.append(Integer.toString(ch));
[10:16] &lt;sid&gt; str.append(';');
[10:16] &lt;sid&gt; so any character that we are not familiar with is converted to &amp;#xxx; format
[10:17] &lt;sid&gt; the characters that we are familiar with are the characters in the range of 31 and 128 when converted to int.. newline, carriage return, tab, &amp;, &lt;, &gt;
[10:18] &lt;sid&gt; so that is good enough for most of the documents
[10:19] &lt;sid&gt; but it screws up when we have a character which is not between integer values 0 and 255
[10:19] &lt;sid&gt; which is the case for all other languages
[10:19] &lt;sid&gt; so i can try taking out this code and try setting the encoding to UTF-8 for postgres
[10:21] &lt;sid&gt; so any character that we are not familiar with, we try to store it as it is in metacat
[10:21] &lt;sid&gt; actually in that case i think we can just take away the normalize function
[10:22] &lt;sid&gt; as in maybe we wont need any normalization
[10:23] &lt;sid&gt; but this will probably screw up if the document being inserted has an encoding other than UTF-8
[10:24] &lt;sid&gt; so we will have to enforce that encoding or maybe have an encoding convertor filter</code></pre> Bug #2236 (Resolved): metacat parser allows eml with missing content | https://projects.ecoinformatics.org/ecoinfo/issues/2236 | 2005-10-27T21:13:12Z | Margaret O'Brien (mob@msi.ucsb.edu)
<p>Incomplete packages are allowed by the EML parser in metacat and at knb<br />(<a class="external" href="http://knb.ecoinformatics.org/emlparser/index.html">http://knb.ecoinformatics.org/emlparser/index.html</a>).<br />Packages missing these nodes are allowed:<br />methods/methodStep/description/<br />methods/methodStep/protocol/creator/individualName/surName<br />An example in metacat is knb-lter-sbc.13.3. Morpho 1.6rc1 chokes on these files,<br />so I checked the schema and found that morpho was right. The example file was<br />most likely created with morpho 1.4. Personally, I'd prefer you didn't fix this,<br />since it would make some of our existing packages invalid.</p> Bug #2084 (Resolved): Pathquery support for temporal search on date fields | https://projects.ecoinformatics.org/ecoinfo/issues/2084 | 2005-05-20T23:02:34Z | Duane Costa (dcosta@lternet.edu)
<p>Metacat pathquery relational search modes ("greater-than", "less-than", etc.) do<br />not currently support temporal searches on date fields. The reasons for this are<br />described in the email correspondence to metacat-dev below. This enhancement<br />would make it possible to do temporal searches using date ranges, which would be<br />an important feature in an "Advanced Search" form (such as the one currently<br />under development at LTER), and could also be added to the search dialog in Morpho.</p>
<p>---<br />On 5/17/2005, Duane Costa wrote:</p>
<p>Metacat supports the following pathquery search modes: contains, starts-with,<br />ends-with, equals, isnot-equal, greater-than, less-than, greater-than-equals,<br />less-than-equals.</p>
<p>For the search modes that are equivalent to relational operators (equals,<br />isnot-equal, greater-than, less-than, greater-than-equals, less-than-equals), is<br />it possible to use these search modes in EML fields that contain non-numeric<br />string values? In particular, is it possible to use the relational search modes<br />for date strings?</p>
<p>For example, here is a pathquery that attempts to find all documents with<br />temporal coverage between January 1, 1900 and January 1, 2005. It reads like<br />this: “Return all documents that have a beginDate or a singleDateTime greater<br />than or equal to 1900-01-01, and an endDate or a singleDateTime less than or<br />equal to 2005-01-01.”</p>
<pre><code><query>
<pathquery version="1.2">
  <querytitle>LTER Query</querytitle>
  <returnfield>dataset/title</returnfield>
  <returnfield>originator/individualName/surName</returnfield>
  <returnfield>creator/individualName/surName</returnfield>
  <returnfield>originator/organizationName</returnfield>
  <returnfield>creator/organizationName</returnfield>
  <returnfield>keyword</returnfield>
  <querygroup operator="INTERSECT">
    <querygroup operator="INTERSECT">
      <querygroup operator="INTERSECT">
        <querygroup operator="UNION">
          <queryterm searchmode="greater-than-equals" casesensitive="false">
            <value>1900-01-01</value>
            <pathexpr>temporalCoverage/rangeOfDates/beginDate/calendarDate</pathexpr>
          </queryterm>
          <queryterm searchmode="greater-than-equals" casesensitive="false">
            <value>1900-01-01</value>
            <pathexpr>temporalCoverage/singleDateTime/calendarDate</pathexpr>
          </queryterm>
        </querygroup>
        <querygroup operator="UNION">
          <queryterm searchmode="less-than-equals" casesensitive="false">
            <value>2005-01-01</value>
            <pathexpr>temporalCoverage/rangeOfDates/endDate/calendarDate</pathexpr>
          </queryterm>
          <queryterm searchmode="less-than-equals" casesensitive="false">
            <value>2005-01-01</value>
            <pathexpr>temporalCoverage/singleDateTime/calendarDate</pathexpr>
          </queryterm>
        </querygroup>
      </querygroup>
    </querygroup>
  </querygroup>
</pathquery>
</query></code></pre>
<p>When I run this against a test Metacat with an Oracle database, this pathquery<br />fails and the resultset contains zero documents. The Tomcat output shows that<br />the pathquery triggers a SQL error in Oracle:</p>
<pre><code>MetaCat: SQL Error in DBQuery.findDocuments: ORA-01722: invalid number</code></pre>
<p>So my question is this: can this problem with temporal search be fixed in the<br />same way that Sid fixed a similar bug for spatial search (Bug 1703, 1718), or is<br />this a different situation because of the fact that the temporal fields contain<br />non-numeric strings while the spatial fields contain numeric values? That is, is<br />it illegal to use the relational pathquery modes for EML fields that contain<br />non-numeric strings? If that is the case, it seems that there would be no<br />practical way to use pathquery for a temporal search involving date ranges: is<br />that correct?</p>
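<p>One detail worth noting about the question above: zero-padded ISO 8601 calendar dates (YYYY-MM-DD) sort chronologically when compared as plain strings, so in principle a purely text-based comparison could implement the date range in the pathquery without any numeric cast. A hedged Python sketch of the query's intent (the names here are illustrative, not Metacat's API):</p>

```python
# Zero-padded ISO 8601 dates compare chronologically as plain strings,
# so the >= / <= filters from the pathquery above can be applied as
# string comparisons. Illustrative sketch only, not Metacat code.

def in_range(calendar_date, begin="1900-01-01", end="2005-01-01"):
    # String comparison is only safe because every value is a
    # well-formed, zero-padded YYYY-MM-DD date.
    return begin <= calendar_date <= end

docs = {
    "doc1": "1899-12-31",  # before the range
    "doc2": "1950-06-15",  # inside
    "doc3": "2004-12-31",  # inside
    "doc4": "2005-01-02",  # after
}
matches = sorted(doc for doc, date in docs.items() if in_range(date))
```

<p>The caveat is the same one behind the ORA-01722 error: if the backend instead casts the column value to a number (as the spatial fix apparently did), any non-numeric string in that column makes the whole query fail.</p>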
<hr />
<p>On 5/18/2005, Chris Jones wrote:</p>
<p>Duane,</p>
<p>I've thought about this myself, and it seems that an internal metacat solution<br />needs to happen where date/time strings are converted or linked to a universal<br />date representation in order to do comparisons. This might be tough. Each<br />vendor's database seems to store dates internally in very different ways. It<br />would be difficult to create another column in the xml_nodes table that is of<br />type 'date' (depending on the vendor), because an EML (or other XML) date string<br />isn't necessarily recognizable as a date, whereas an integer or float is much<br />more discernible (in the case of the nodedatanumerical column) in xml_nodes.</p>
<p>However, with strong typing used in XMLSchema, metacat could in theory glean<br />'date' strings as type 'date' by referring to the element definition in the<br />XMLSchema document to which the instance document adheres. In that case, a<br />nodedatadate column in xml_nodes might work out where, upon insert or update,<br />the leaf node values get put into that column as a converted date. Of course,<br />this leads to: which date formats will be supported for conversion?</p>
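<p>The conversion step Chris describes could look roughly like the following Python sketch. The <code>nodedatadate</code> column and the list of supported formats are assumptions for illustration, not an actual Metacat schema: on insert or update, a leaf-node value is tried against each format, and only recognizable dates get stored in a canonical form.</p>

```python
# Sketch (assumed design, not Metacat code) of converting leaf-node
# values to a canonical date for a hypothetical nodedatadate column.
from datetime import datetime

# Illustrative set of supported input formats; the real set would be
# the open question raised above.
SUPPORTED_FORMATS = ["%Y-%m-%d", "%Y-%m-%dT%H:%M:%S", "%Y"]

def to_nodedatadate(value):
    """Return a canonical ISO date string for recognizable dates,
    else None (the value then lands only in the plain text column)."""
    for fmt in SUPPORTED_FORMATS:
        try:
            return datetime.strptime(value, fmt).date().isoformat()
        except ValueError:
            continue
    return None
```

<p>With the canonical value in its own typed column, the relational search modes could compare dates directly instead of failing on a numeric cast.</p>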
<p>Anyone have a better solution?</p>
<p>Chris</p>
<hr />
<p>On 5/20/2005, Matt Jones wrote:</p>
<p>Chris and Duane,</p>
<p>I think Chris' analysis of the issues with adding support for datetime<br />comparisons is just about right on. It would be a nice feature to have, but<br />implementation would need to accommodate multiple database systems.<br />I've even been contemplating adding support for Sleepycat XMLdb or other XML<br />databases instead of relational backends, and this would further complicate the<br />implementation issues for datetime values. But it would be worthwhile. Let's<br />get it into bugzilla as a feature request and we'll see where it falls out in<br />the priority order.</p>
<p>Matt</p>