EML: Issueshttps://projects.ecoinformatics.org/ecoinfo/https://projects.ecoinformatics.org/ecoinfo/ecoinfo/favicon.ico?14691340362011-06-27T18:47:45ZEcoinformatics Redmine
Redmine Bug #5431 (Resolved): Data Manager throws exception for otherEntity when attributeList is optiona...https://projects.ecoinformatics.org/ecoinfo/issues/54312011-06-27T18:47:45ZDuane Costadcosta@lternet.edu
<p>The EML schema (2.x.y) specifies that all data entities be classified as one of six different DatasetType types:</p>
<p>dataTable<br />spatialRaster<br />spatialVector<br />storedProcedure<br />view<br />otherEntity</p>
<p>Of these, only 'otherEntity' may omit the 'attributeList' element; the other five entity types are required to specify one.</p>
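<p>A minimal Java sketch of the schema rule described above. The class and method names are hypothetical and are not taken from the Data Manager Library source; the entity type names are the six listed in this ticket.</p>

```java
import java.util.Set;

/** Hypothetical helper; not part of the Data Manager Library API. */
final class AttributeListRule {
    // The five entity types that this ticket says must carry an attributeList.
    private static final Set<String> REQUIRES_ATTRIBUTE_LIST = Set.of(
            "dataTable", "spatialRaster", "spatialVector", "storedProcedure", "view");

    /** Returns true when the entity type must specify an attributeList. */
    static boolean isAttributeListRequired(String entityType) {
        return REQUIRES_ATTRIBUTE_LIST.contains(entityType);
    }

    /** Throws only when a required attributeList is missing. */
    static void check(String entityType, boolean hasAttributeList) {
        if (!hasAttributeList && isAttributeListRequired(entityType)) {
            throw new IllegalStateException(
                    "Entity type '" + entityType + "' requires an attributeList element");
        }
    }
}
```

<p>With a check of this shape, an 'otherEntity' without an 'attributeList' passes silently, while a 'dataTable' without one is still rejected.</p>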
<p>The Data Manager Library assumes that all data entities are required to include the 'attributeList' element, throwing an exception whenever it is not specified. The DML should change its behavior in accordance with the EML schema, allowing 'attributeList' to be an optional element when the DatasetType is 'otherEntity'.</p> Bug #5428 (Resolved): Leading whitespace in data URL causes download to hanghttps://projects.ecoinformatics.org/ecoinfo/issues/54282011-06-24T15:52:20ZDuane Costadcosta@lternet.edu
<p>Leading whitespace in a data URL has been demonstrated to cause the Data Manager to hang during its download operation. A data package that demonstrates this behavior is:</p>
<pre><code><a class="external" href="http://metacat.lternet.edu/knb/metacat/knb-lter-sbc.10.15">http://metacat.lternet.edu/knb/metacat/knb-lter-sbc.10.15</a></code></pre>
<p>The data URL of one of the two data table entities in the above data package contains leading whitespace:</p>
<p>"\n <a class="external" href="http://sbc.lternet.edu/external/Ocean/Data/Monthly_Water_Sampling/LTER_monthly_bottledata.txt">http://sbc.lternet.edu/external/Ocean/Data/Monthly_Water_Sampling/LTER_monthly_bottledata.txt</a>"</p>
<p>The fix for this bug would be for the Data Manager to trim leading and trailing whitespace from the data URL.</p>
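<p>The proposed fix could look roughly like the following Java sketch. The class and method names are illustrative assumptions, not existing Data Manager code; the only library call used is the standard <code>String.trim()</code>, which strips leading and trailing characters up to and including the space character, so it covers the newline in the example above.</p>

```java
/** Hypothetical helper; not part of the Data Manager Library API. */
final class DataUrlCleaner {
    /** Returns the URL without leading or trailing whitespace, or null for null input. */
    static String clean(String rawUrl) {
        return rawUrl == null ? null : rawUrl.trim();
    }

    /** Reports whether trimming changed the value that was read from the metadata. */
    static boolean wasTrimmed(String rawUrl) {
        return rawUrl != null && !rawUrl.equals(rawUrl.trim());
    }
}
```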
<p>A secondary enhancement would be for the Data Manager to produce a warning in the form of a quality check whenever it finds that the trimmed URL is different from the original URL. However, this secondary enhancement should be implemented in the DATAMANAGER_QUALITY branch rather than in the 'eml' trunk.</p> Bug #5317 (Resolved): Data Manager Library: Checks for collapseDelimiter instead of collapseDelim...https://projects.ecoinformatics.org/ecoinfo/issues/53172011-02-21T23:04:28ZDuane Costadcosta@lternet.edu
<p>There are two lines in the Data Manager Library source code that contain an apparent bug. The code checks for an EML element named "collapseDelimiter" when it should be checking for "collapseDelimiters". These lines are at:</p>
<p>src/org/ecoinformatics/datamanager/parser/eml/Eml200Parser.java, line 1204:</p>
<pre><code>elementName.equals("collapseDelimiter") &&</code></pre>
<p>src/org/ecoinformatics/datamanager/parser/generic/GenericDataPackageParser.java, line 1278:</p>
<pre><code>elementName.equals("collapseDelimiter") &&</code></pre>
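<p>One way to keep both parsers consistent is to compare against a single shared constant, as in this Java sketch. The class, constant, and method names are hypothetical; only the element name 'collapseDelimiters' comes from this ticket.</p>

```java
/** Hypothetical holder for EML element names; not part of the Data Manager Library. */
final class EmlElementNames {
    /** The element name the parsers should test for, as stated in this ticket. */
    static final String COLLAPSE_DELIMITERS = "collapseDelimiters";

    /** Null-safe comparison: the constant is the receiver of equals(). */
    static boolean isCollapseDelimiters(String elementName) {
        return COLLAPSE_DELIMITERS.equals(elementName);
    }
}
```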
<p>In addition, there are a large number of method names, method parameters, instance variables, and local variables throughout the DML code that are named 'collapseDelimiter' when the more appropriate name for these constructs would be 'collapseDelimiters'. Since these are only names, they do not affect the code logic, but it would be good to clean these up and rename them in accordance with the actual EML element name, 'collapseDelimiters'.</p> Bug #5308 (In Progress): Data Manager Library: storageType content should be stored and usedhttps://projects.ecoinformatics.org/ecoinfo/issues/53082011-02-15T15:57:28ZDuane Costadcosta@lternet.edu
<p>'storageType' is an optional, repeatable element within the EML 'attribute' element. In addition to the documentation available in the EML normative documents, several old bug tickets describe the rationale behind this element: <a class="issue tracker-1 status-3 priority-5 priority-highest closed" title="Bug: eml-attribute changes needed (Resolved)" href="https://projects.ecoinformatics.org/ecoinfo/issues/484">#484</a>, <a class="issue tracker-1 status-3 priority-2 priority-default closed" title="Bug: issues about storageType and attributeDomain (Resolved)" href="https://projects.ecoinformatics.org/ecoinfo/issues/544">#544</a>, <a class="issue tracker-1 status-3 priority-2 priority-default closed" title="Bug: storageType is repeatable in eml-attribute (Resolved)" href="https://projects.ecoinformatics.org/ecoinfo/issues/599">#599</a>.</p>
<p>When the Data Manager Library parses EML attributes, it does not record any 'storageType' content that may be present. This means that the hints that may have been provided by the metadata provider pertaining to how the attribute should be stored optimally (say, in a relational database table), are completely ignored by the Data Manager Library, which instead relies entirely on the 'measurementScale' content for this purpose.</p>
<p>To cite a specific example of how 'storageType' content can be helpful, the document knb-lter-gce.1.9 (<a class="external" href="http://metacat.lternet.edu/knb/metacat/knb-lter-gce.1.9">http://metacat.lternet.edu/knb/metacat/knb-lter-gce.1.9</a>) contains three attributes for year, month, and day, respectively. Each of the attributes has storageType set to 'integer' and measurementScale set to 'dateTime'. When loading the data table into a relational database, the Data Manager Library sets the corresponding database fields to type 'timestamp' (in Postgres), having no knowledge that the storage type "hint" was to set the fields to type integer ('int4' in Postgres). The result is that in the original data table entity, the fields appear like this:</p>
<p>2000 8 26</p>
<p>while in the relational database, they appear like this:</p>
<pre><code>year | month | day <br />---------------------+------------------------+------------------------<br /> 2000-01-01 00:00:00 | 0001-08-01 00:00:00 BC | 0001-01-26 00:00:00 BC</code></pre>
<p>It's clear that in this particular case, the Data Manager Library could have used the storageType hint to select a more appropriate data type for these attributes.</p>
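<p>A rough Java sketch of how a storageType hint might take precedence over the type derived from measurementScale. Everything here is an assumption made for illustration: the class, the method, and the mapping table are hypothetical, and the only pairing taken from the example above is 'integer' to 'int4'.</p>

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch; not part of the Data Manager Library API. */
final class StorageTypeHint {
    // Illustrative mapping only. 'integer' -> 'int4' is the pairing named in this ticket.
    private static final Map<String, String> HINT_TO_POSTGRES = Map.of("integer", "int4");

    /**
     * Returns the database type for the first recognized storageType hint,
     * or the measurementScale-derived type when no hint is recognized.
     */
    static String chooseType(List<String> storageTypes, String measurementScaleType) {
        for (String hint : storageTypes) {
            String mapped = HINT_TO_POSTGRES.get(hint.trim().toLowerCase(Locale.ROOT));
            if (mapped != null) {
                return mapped;
            }
        }
        return measurementScaleType;
    }
}
```

<p>Taking the first recognized hint is only one possible heuristic for attributes that carry several storageType elements.</p>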
<p>The goal of this task is to:</p>
<p>1. Enhance the EML parsing phase of the Data Manager Library, so that it parses and stores all storageType elements that are provided for an attribute.</p>
<p>2. Enhance the data loading phase of the Data Manager Library, so that it uses storageType content, if provided, to make a more informed decision about which data type to define for the attribute. This may involve the need for heuristics to determine which data type is most appropriate under a given set of circumstances, particularly in cases where more than one storageType element is provided for an attribute.</p> Bug #2835 (Resolved): Data Manager Library: Run-time errors involving Xalan classeshttps://projects.ecoinformatics.org/ecoinfo/issues/28352007-05-01T20:46:50ZDuane Costadcosta@lternet.edu
<p>Two developers (myself and Chad Burt) are getting run-time errors when our applications try to use the Data Manager library. The errors are NoClassDefFoundError involving Xalan classes.</p>
<hr />
<p>On 4/30/2007, Chad Burt wrote:</p>
<p>Hi guys,<br />I am trying to deploy my sbc app with the datamanager library on a different machine and am having some problems. It seems to be missing a dependency that is not on this Fedora Linux machine. I get this error:</p>
<p>Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/xpath/axes/PredicatedNodeTest</p>
<p>I never ran into this problem on my Mac; I just built the jar and it worked fine. Originally I thought it was because the jar needed to be recompiled because I was on a different machine. So I copied the whole eml tree over, hit "ant clean", then "ant dist-datamanager-lib", and uncompressed the zip to get my jar. No error messages. Ran my dataset import method based off the sample applications and I got the same error.</p>
<p>Is there some kind of apache library I need to have on this machine to get the datamanager library working?</p>
<p>Here is the full stack trace:<br />Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/xpath/axes/PredicatedNodeTest<br /> at java.lang.ClassLoader.defineClass1(Native Method)<br /> at java.lang.ClassLoader.defineClass(ClassLoader.java:620)<br /> at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:124)<br /> at java.net.URLClassLoader.defineClass(URLClassLoader.java:260)<br /> at java.net.URLClassLoader.access$100(URLClassLoader.java:56)<br /> at java.net.URLClassLoader$1.run(URLClassLoader.java:195)<br /> at java.security.AccessController.doPrivileged(Native Method)<br /> at java.net.URLClassLoader.findClass(URLClassLoader.java:188)<br /> at java.lang.ClassLoader.loadClass(ClassLoader.java:306)<br /> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:268)<br /> at java.lang.ClassLoader.loadClass(ClassLoader.java:251)<br /> at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:319)<br /> at org.apache.xpath.XPath.<init>(XPath.java:199)<br /> at org.apache.xpath.CachedXPathAPI.eval(CachedXPathAPI.java:322)<br /> at org.apache.xpath.CachedXPathAPI.selectNodeIterator(CachedXPathAPI.java:216)<br /> at org.apache.xpath.CachedXPathAPI.selectSingleNode(CachedXPathAPI.java:177)<br /> at org.apache.xpath.CachedXPathAPI.selectSingleNode(CachedXPathAPI.java:157)<br /> at org.ecoinformatics.datamanager.parser.eml.Eml200Parser.parseDocument(Eml200Parser.java:182)<br /> at org.ecoinformatics.datamanager.parser.eml.Eml200Parser.parse(Eml200Parser.java:160)<br /> at org.ecoinformatics.datamanager.DataManager.parseMetadata(DataManager.java:585)<br /> at org.ecoinformatics.datamanager.sample.ImportDataset.testParseMetadata(ImportDataset.java:331)<br /> at org.ecoinformatics.datamanager.sample.ImportDataset.main(ImportDataset.java:132)</p>
<p>-Chad Burt</p>
<hr />
<p>On 4/30/2007, Duane Costa wrote:</p>
<p>Hi Chad,</p>
<p>I recently started experiencing a very similar run-time error on my Windows machine in an application I am developing that uses the datamanager library too. The only difference is that in my case the NoClassDefFoundError was for a different class:</p>
<p>org.apache.xpath.patterns.NodeTest</p>
<p>After a little trial and error, I found that I could resolve the error by incorporating a newer version of xalan.jar, based on Xalan-Java Version 2.7.0, into my classpath. I am attaching the xalan.jar file that fixed the problem for me. It seems that the NodeTest class was missing from the older xalan.jar but present in the newer xalan.jar. I'm guessing that the same might be true for the PredicatedNodeTest class.</p>
<p>I don't really understand what caused the error to start occurring; it may have something to do with the Java version I am running (I upgraded from Java 1.4.2 to Java 1.5.0 fairly recently). Jing and I will need to investigate this further. Meanwhile, as a temporary fix, could you try including the new xalan.jar file in your sbc application's classpath and let us know if that resolves the error for you?</p>
<p>Thanks,<br />Duane</p>
<hr />
<p>On 4/30/2007, Chad Burt wrote:</p>
<p>Thanks Duane,<br />I'm not too familiar with java and classpaths. I'm calling the datamanager.jar from the command line. I replaced eml/lib/apache/xalan.jar with the one you gave me. Is that correct? I'm getting the same error.<br />-Chad</p>
<hr />
<p>On 4/30/2007, Duane Costa wrote:</p>
<p>Hi Chad,</p>
<p>Interesting, I did exactly what you did, replaced eml/lib/apache/xalan.jar with the one I sent you, and rebuilt datamanager.jar using 'ant jar-datamanager-lib'. Now I'm getting the same error you've been getting! So the new xalan.jar obviously is not the solution, but at least we're getting the same error now. This is going to take more investigation. I'll try to get this figured out. Meanwhile, could you please send me the following?:</p>
<p>(1) The output from running 'java -version' in a command window on your system.<br />(2) The output from running 'echo $CLASSPATH' in a command window on your system.<br />(3) The exact command you use when you run the Data Manager library code.</p>
<p>Thanks,<br />Duane</p>
<hr />
<p>On 4/30/2007, Chad Burt wrote:</p>
<p>Here's my info:</p>
<p>java -version:<br />java version "1.5.0_11" <br />Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_11-b03)<br />Java HotSpot(TM) Client VM (build 1.5.0_11-b03, mixed mode, sharing)</p>
<p>$CLASSPATH doesn't seem to be set, nor $CLASS_PATH. It's not set on my mac either.</p>
<p>I am running a custom method based on the sample apps. I've attached the file. It's usually under /src/org/ecoinformatics/datamanager/sample/ImportDataset.java.<br />I call it via :<br />java -cp "datamanager.jar" org.ecoinformatics.datamanager.sample.ImportDataset > /dev/null</p>
<p>Thanks for the help,<br />Chad</p> Bug #2702 (New): Data Manager Library: Support for online URL referenceshttps://projects.ecoinformatics.org/ecoinfo/issues/27022006-12-15T16:44:49ZDuane Costadcosta@lternet.edu
<p>Next release. Again, this will be rare. Not much to be gained from a URL reference.</p>
<p>Matt</p>
<p>Duane Costa wrote:</p>
<blockquote>
<p>Matt, Mark:</p>
<p>Do you think that handling references to online URLs should be a <br />requirement for the first release of the Data Manager Library (1.0.0), or recorded as an enhancement for the next release (1.1.0)?</p>
<p>Thanks,<br />Duane</p>
<blockquote>
<p>-----Original Message-----<br />From: Jing Tao [mailto:<a class="email" href="mailto:tao@nceas.ucsb.edu">tao@nceas.ucsb.edu</a>]<br />Sent: Wednesday, December 13, 2006 9:06 PM<br />To: Duane Costa<br />Cc: 'inigo san gil'; 'Mark Servilla'<br />Subject: RE: In-line data</p>
<p>Hi, Duane:</p>
<p>Yeah, current eml parser couldn't handle the reference for online url. <br />It can handle reference for attributeList and attribute. We can add <br />supporting online url reference as new feature into our data manager <br />library.</p>
<p>Thanks,</p>
<p>Jing</p>
<p>Jing Tao<br />National Center for Ecological<br />Analysis and Synthesis (NCEAS)<br />735 State St. Suite 204<br />Santa Barbara, CA 93101</p>
<p>On Wed, 13 Dec 2006, Duane Costa wrote:</p>
<blockquote>
<p>Date: Wed, 13 Dec 2006 15:37:27 -0700<br />From: Duane Costa <<a class="email" href="mailto:dcosta@lternet.edu">dcosta@lternet.edu</a>><br />To: 'Jing Tao' <<a class="email" href="mailto:tao@nceas.ucsb.edu">tao@nceas.ucsb.edu</a>><br />Cc: 'inigo san gil' <<a class="email" href="mailto:isangil@lternet.edu">isangil@lternet.edu</a>>,<br />'Mark Servilla' <<a class="email" href="mailto:servilla@lternet.edu">servilla@lternet.edu</a>><br />Subject: RE: In-line data</p>
<p>Hi Jing,</p>
<p>Inigo and I have looked into the second issue below a little more (the question about FTP protocol). The problem was not the FTP protocol -- we changed to HTTP and the Data Manager library had the same problem downloading the data. The problem is that the metadata is using a reference to the URL to the data like this:</p>
<p><dataTable><br />.<br />.<br />.<br /><distribution><br /><references>distributionReference</references><br /></distribution></p>
<p>In another part of the EML, we have:</p>
<p><distribution id="distributionReference"> <online><br /><url><br /><a class="external" href="http://lternet.lternet.edu/~isangil/NIN/nin_met_1982.txt">http://lternet.lternet.edu/~isangil/NIN/nin_met_1982.txt</a><br /></url><br /></online><br /></distribution></p>
<p>Because of the reference, Data Manager has no value for the entity identifier, and the download handler is not able to download the data. So it seems that this is a legal EML document but the EML parser is not able to follow the reference to the URL for the data.</p>
<p>Here is a link to the document that is having the problem:</p>
<p><a class="external" href="http://lternet.lternet.edu/~isangil/NIN/nin_lter_met_1982.xml">http://lternet.lternet.edu/~isangil/NIN/nin_lter_met_1982.xml</a></p>
<p>Could you take a look?</p>
<p>Thanks,<br />Duane</p>
</blockquote></blockquote></blockquote> Bug #2701 (New): Data Manager Library: Support for inline datahttps://projects.ecoinformatics.org/ecoinfo/issues/27012006-12-15T16:42:45ZDuane Costadcosta@lternet.edu
<p>Wait for the next release -- as far as I know there is very little or no inline data out there in the KNB collection.</p>
<p>Matt</p>
<p>Duane Costa wrote:</p>
<blockquote>
<p>Matt, Mark:</p>
<p>Do you think that handling inline data should be a priority for <br />release 1.0.0 of the Data Manager Library, or something that should be recorded in Bugzilla as an enhancement for the next release, 1.1.0?</p>
<p>Thanks,<br />Duane</p>
<blockquote>
<p>-----Original Message-----<br />From: Jing Tao [mailto:<a class="email" href="mailto:tao@nceas.ucsb.edu">tao@nceas.ucsb.edu</a>]<br />Sent: Wednesday, December 13, 2006 8:59 PM<br />To: Duane Costa<br />Subject: Re: In-line data</p>
<p>Hi, Duane:</p>
<p>Our datamanager couldn't handle inline data so far. Do you think this <br />feature has very high priority?</p>
</blockquote>
<p>.<br />.<br />.</p>
<blockquote>
<p>Jing</p>
<p>Jing Tao<br />National Center for Ecological<br />Analysis and Synthesis (NCEAS)<br />735 State St. Suite 204<br />Santa Barbara, CA 93101</p>
<p>On Wed, 13 Dec 2006, Duane Costa wrote:</p>
<blockquote>
<p>Date: Wed, 13 Dec 2006 12:20:05 -0700<br />From: Duane Costa <<a class="email" href="mailto:dcosta@lternet.edu">dcosta@lternet.edu</a>><br />To: 'Jing Tao' <<a class="email" href="mailto:tao@nceas.ucsb.edu">tao@nceas.ucsb.edu</a>><br />Subject: In-line data</p>
<p>Hi Jing,</p>
<p>We have some metadata that contains <inline> tags to the data. Is the Data Manager download handler able to use this to download the data?</p>
</blockquote></blockquote>
<p>.<br />.<br />.</p>
<blockquote><blockquote>
<p>Thanks,<br />Duane</p>
</blockquote></blockquote></blockquote> Bug #2700 (Resolved): Data Manager Library: Sample Calling Applicationhttps://projects.ecoinformatics.org/ecoinfo/issues/27002006-12-15T16:39:34ZDuane Costadcosta@lternet.edu
<blockquote>
<p>-----Original Message-----<br />From: Matthew Jones [mailto:<a class="email" href="mailto:jones@nceas.ucsb.edu">jones@nceas.ucsb.edu</a>] <br />Sent: Friday, December 15, 2006 12:37 AM<br />To: Duane Costa<br />Cc: 'Jing Tao'<br />Subject: Re: Sample calling application</p>
<p>Hi Duane and Jing,</p>
<p>A sample app sounds great. Comments inline...</p>
<p>Matt</p>
<p>Duane Costa wrote:</p>
<blockquote>
<p>Matt,</p>
<p>Could you add your comments to this discussion about a sample calling application in the Data Manager Library code? Jing and I both agree that a sample calling application (as opposed to Junit tests) would be a useful addition to the distribution, even if it's limited to just the user documentation. However, there are a couple of loose ends Jing and I feel unsure about (see below). After you add your comments, I'll open a Bugzilla entry for this.</p>
<p>Thanks,<br />Duane</p>
<blockquote>
<p>On Wed, 13 Dec 2006, Duane Costa wrote:</p>
<blockquote>
<p>Date: Wed, 13 Dec 2006 11:32:27 -0700<br />From: Duane Costa <<a class="email" href="mailto:dcosta@lternet.edu">dcosta@lternet.edu</a>><br />To: 'Jing Tao' <<a class="email" href="mailto:tao@nceas.ucsb.edu">tao@nceas.ucsb.edu</a>><br />Subject: Sample calling application</p>
<p>Hi Jing,</p>
<p>I think it would be nice to provide a sample calling application in the Data Manager Library source code distribution. It would just be a small program, together with implementations of the call-back interfaces for database connection pool and Ecogrid end point, to demonstrate the different use cases. Do you think this is a good idea? If so, there are a few minor things to decide:</p>
</blockquote>
<p>It is great idea.</p>
</blockquote>
<p>Good! I'll work on the sample program. I'll also add a new Bugzilla bug to document these ideas after Matt adds his comments.</p>
<blockquote><blockquote>
<ul>
<li>Where to put the source code -- One possible package would be:<br />org.ecoinformatics.datamanager.sample</li>
</ul>
</blockquote></blockquote></blockquote>
<p>This package sounds good to me.</p>
<blockquote><blockquote><blockquote>
</blockquote>
<p>I am not sure. But I think since it is sample and it will be good to be easy found by the user. So we can still use the package you proposed, but can we put them into another dir - sample, which is parallel to src? The dir structure will look like sample/org/ecoinformatics/datamanager/sample.</p>
</blockquote>
<p>This sounds fine. Maybe we need a separate ant target in build.xml to compile the sample code, something like 'ant compile-datamanager-sample'.</p>
</blockquote>
<p>Sounds fine. If it's easier to just include the code in src then I might just do that instead of making the parallel hierarchy. But either way is fine.</p>
<blockquote>
<blockquote><blockquote>
<ul>
<li>How to set properties -- The main program could hard-code the database values as constants, or the main program could read values from the lib/datamanager/datamanager.properties file. The advantage of the first approach is that it keeps the database values together in the same file with the main program; the second approach has the advantage that users can edit datamanager.properties and run the sample program without needing to recompile. Which approach do you like better?</li>
</ul>
</blockquote>
<p>First, I have a question. How do you plan to run this sample code? It will be compiled and distributed too? Or user should compile it by himself or through build.xml? Or even just give user an idea how to use the library and we don't have plan let user run it? If we just want to show user how to use the library and I think it is okay to hard code in main program.<br />If we plan to let user run it (like our test file), it is better put those values in the property file.</p>
</blockquote>
<p>I don't know which approach is best. Maybe we just want to include sample code primarily as part of the documentation, without any expectation that the user will actually compile and execute it. Or maybe we do want the end user to try it out themselves. I think we need Matt's input on this.</p>
</blockquote>
<p>Let's use a properties file. Hardcoding these values in code is a bad example to set.</p>
</blockquote> Bug #2674 (In Progress): Data Manager Library: Set database table life-span priorityhttps://projects.ecoinformatics.org/ecoinfo/issues/26742006-11-21T20:36:35ZDuane Costadcosta@lternet.edu
<p>Provide an API for the Calling Application to set a database table life-span priority on specific database tables.</p>
<p>When the upper limit on the database size is reached (see Bug <a class="issue tracker-1 status-2 priority-2 priority-default" title="Bug: Data Manager Library: Set upper limit on database size (In Progress)" href="https://projects.ecoinformatics.org/ecoinfo/issues/2673">#2673</a>), the Data Manager Library will free up space by reducing the number of cached data tables in the database based on a "least used" removal algorithm. However, the Calling Application should be able to protect specific tables from removal by setting them as high priority. This is a boolean setting: a table is either protected from removal or it isn't.</p>
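<p>The interaction between the size limit, "least used" removal, and protected tables can be sketched in Java as follows. This is an in-memory model only, under stated assumptions: every class, field, and method name is hypothetical, "least used" is modelled as the lowest use count, ties are broken by table name, and no real database tables are dropped.</p>

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; none of these names come from the Data Manager Library. */
final class TableEvictionSketch {
    private static final class TableInfo {
        final String name;
        final long sizeBytes;
        long useCount;
        boolean protectedFromRemoval;

        TableInfo(String name, long sizeBytes) {
            this.name = name;
            this.sizeBytes = sizeBytes;
        }
    }

    private final Map<String, TableInfo> tables = new HashMap<>();
    private final long maxBytes;

    TableEvictionSketch(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    void register(String name, long sizeBytes) {
        tables.put(name, new TableInfo(name, sizeBytes));
    }

    // The table must have been registered first; this sketch does no validation.
    void recordUse(String name) {
        tables.get(name).useCount++;
    }

    // Boolean setting: a table is either protected from removal or it is not.
    void setProtected(String name, boolean value) {
        tables.get(name).protectedFromRemoval = value;
    }

    long totalBytes() {
        long total = 0;
        for (TableInfo t : tables.values()) {
            total += t.sizeBytes;
        }
        return total;
    }

    /** Removes unprotected tables, least used first, until the total fits the limit. */
    List<String> evictUntilWithinLimit() {
        List<TableInfo> candidates = new ArrayList<>();
        for (TableInfo t : tables.values()) {
            if (!t.protectedFromRemoval) {
                candidates.add(t);
            }
        }
        candidates.sort(Comparator.comparingLong((TableInfo t) -> t.useCount)
                .thenComparing(t -> t.name));
        List<String> dropped = new ArrayList<>();
        long total = totalBytes();
        for (TableInfo t : candidates) {
            if (total <= maxBytes) {
                break;
            }
            tables.remove(t.name);
            total -= t.sizeBytes;
            dropped.add(t.name);
        }
        return dropped;
    }
}
```

<p>Note that if the protected tables alone exceed the limit, this sketch leaves the total above the limit rather than removing a protected table.</p>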
<p>This task supports Use Case #6 in the Data Manager Library UML documentation.</p> Bug #2673 (In Progress): Data Manager Library: Set upper limit on database sizehttps://projects.ecoinformatics.org/ecoinfo/issues/26732006-11-21T20:25:42ZDuane Costadcosta@lternet.edu
<p>Provide a means for the Calling Application to set an upper limit on the database size to prevent overloading the database. The table monitor component of the library must abide by the upper limit, and must include routines to drop tables when that limit is reached.</p>
<p>This task supports Use Case #5 in the Data Manager Library UML documentation.</p> Bug #2578 (In Progress): Data Manager Library: Release and Distributionhttps://projects.ecoinformatics.org/ecoinfo/issues/25782006-10-27T20:28:05ZDuane Costadcosta@lternet.edu
<p>The Data Manager Library should be assigned a release number and distributed with the following components:</p>
<p>1. jar file (datamanager.jar)<br />2. javadoc API documentation<br />3. overview document which describes the API and provides usage examples<br />4. UML documents<br /> a. class diagram<br /> b. sequence of operations</p>
<p>The expected end-user is a programmer who will integrate the library into a calling application. The Data Manager Library could be distributed on the EML product site.</p> Bug #2577 (Resolved): Data Manager Library: API to enumerate table and field nameshttps://projects.ecoinformatics.org/ecoinfo/issues/25772006-10-27T20:17:31ZDuane Costadcosta@lternet.edu
<p>Some applications may want to do direct queries on the data tables in the database. The application will need to map entity names to table names, and attribute names to field names. Extend the Data Manager Library API to provide a method to enumerate the table and field names for a given entity.</p> Bug #2576 (In Progress): Data Manager Library: Database Connection Poolinghttps://projects.ecoinformatics.org/ecoinfo/issues/25762006-10-27T20:11:08ZDuane Costadcosta@lternet.edu
<p>Rework the design and implementation of database connection pooling in the Data Manager Library. Provide a callback mechanism for the calling application to manage its own connection pool. This should include a mechanism for returning a "Connection not available" status to the Data Manager so that it will know that it needs to wait until a connection is available. The Data Manager should generally use one connection per operation, though if the operation has several steps it could re-use the same connection in more than one step if it's safe to do so.</p> Bug #2575 (Resolved): Data Manager Library: Support for query object APIhttps://projects.ecoinformatics.org/ecoinfo/issues/25752006-10-27T20:01:08ZDuane Costadcosta@lternet.edu
<p>The original design of the Data Manager Library was somewhat vague in its support for querying data tables. After further discussion (Matt, Jing, Duane, and Mark Servilla), we think that allowing the calling application to pass in an ANSI SQL string would be too problematic because of the parsing requirements. The problems arise from needing to parse non-standard entity names into database table names, and non-standard attribute names into database field names. For example:</p>
<p>SELECT SPECIES NAME, SPECIES ID FROM SPECIES</p>
<p>means one thing from the perspective of entities and attributes, but something else from a database perspective, where "NAME" and "ID" would be interpreted as column aliases.</p>
<p>Instead, we will design a query class that the calling application can use to construct its queries in a more structured way by setting various attributes of the query object. At some later point, we may also support queries in an XML format that could be mapped onto the query object by the Data Manager Library. This would facilitate passing queries between two or more processes (e.g. first from Morpho to Metacat, and then from Metacat to the Data Manager Library code).</p>
<p>The JDBC ResultSet object that is returned could also pose a problem, since it contains references to the database table field names, not the original attribute names. The calling application could get around this by restricting itself to accessing the fields by position rather than by name.</p>
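<p>To illustrate the idea of a structured query object, here is a small Java sketch. It is not the design that was adopted: the class, its methods, and the example table and field names (<code>species</code>, <code>species_name</code>, <code>species_id</code>) are all hypothetical. It shows how entity and attribute names containing spaces could be resolved to database identifiers before any SQL text is produced, which avoids the alias ambiguity in the SELECT example above.</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical query object; not the Data Manager Library's actual query API. */
final class QuerySketch {
    private final String entityName;
    private final List<String> attributeNames = new ArrayList<>();

    QuerySketch(String entityName) {
        this.entityName = entityName;
    }

    /** Adds an attribute, by its original metadata name, to the select list. */
    QuerySketch select(String attributeName) {
        attributeNames.add(attributeName);
        return this;
    }

    /** Builds SQL using caller-supplied mappings from metadata names to database names. */
    String toSql(Map<String, String> entityToTable, Map<String, String> attributeToField) {
        String table = entityToTable.get(entityName);
        if (table == null) {
            throw new IllegalArgumentException("No table mapped for entity: " + entityName);
        }
        List<String> columns = new ArrayList<>();
        for (String attribute : attributeNames) {
            String field = attributeToField.get(attribute);
            if (field == null) {
                throw new IllegalArgumentException("No field mapped for attribute: " + attribute);
            }
            columns.add(quote(field));
        }
        return "SELECT " + String.join(", ", columns) + " FROM " + quote(table);
    }

    // Standard SQL identifier quoting: wrap in double quotes and double any embedded quote.
    private static String quote(String identifier) {
        return "\"" + identifier.replace("\"", "\"\"") + "\"";
    }
}
```

<p>Because the select list preserves the order in which attributes were added, a caller could read the resulting JDBC ResultSet by position, as suggested above.</p>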