Ecoinformatics Redmine
Metacat - Bug #111: reading large documents from metacat is slow
https://projects.ecoinformatics.org/ecoinfo/issues/111?journal_id=374
2000-09-02T00:16:45Z, Matt Jones (jones@nceas.ucsb.edu)
<ul></ul><p>Fixed the document reading bug (Bugzilla bug <a class="issue tracker-1 status-3 priority-5 priority-highest closed" title="Bug: reading large documents from metacat is slow (Resolved)" href="https://projects.ecoinformatics.org/ecoinfo/issues/111">#111</a>) so that read time is<br />no longer a power function of the number of nodes in the document (which<br />used to be the case). Now, reading a document occurs entirely within<br />DocumentImpl, by making a single SQL call to get the document data and then<br />using the NodeComparator class to return a TreeSet of the nodes sorted in<br />depth-first traversal order. This TreeSet is then processed by the new<br />DocumentImpl.toXml() method, which formats and writes a text representation<br />of the document to the Writer that is passed in. The DocumentImpl.toString()<br />method has been rewritten to use DocumentImpl.toXml() as well.</p>
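<p>The read path described above can be sketched roughly as follows. This is only an illustration, not the actual DocumentImpl code: the NodeRecord class, its fields, the DOCUMENT_ORDER comparator (standing in for the role of NodeComparator), and the assumption that node IDs are assigned in document order are all hypothetical. The rows passed to toXml() stand in for the result of the single SQL call.</p>

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Deque;
import java.util.List;
import java.util.TreeSet;

class DepthFirstReadSketch {

    // Hypothetical flat node record: one row of the single query result.
    static final class NodeRecord {
        final long nodeId;    // ASSUMPTION: assigned in document (depth-first) order
        final long parentId;  // 0 for the root element
        final String type;    // "ELEMENT" or "TEXT"
        final String name;    // element name, null for text nodes
        final String data;    // text content, null for elements

        NodeRecord(long nodeId, long parentId, String type, String name, String data) {
            this.nodeId = nodeId;
            this.parentId = parentId;
            this.type = type;
            this.name = name;
            this.data = data;
        }
    }

    // Hypothetical stand-in for the comparator role: under the assumption
    // above, ordering by nodeId yields depth-first traversal order.
    static final Comparator<NodeRecord> DOCUMENT_ORDER =
            Comparator.comparingLong(n -> n.nodeId);

    // Sort the rows once, then write the XML text in a single pass.
    static void toXml(List<NodeRecord> rows, Writer out) throws IOException {
        TreeSet<NodeRecord> sorted = new TreeSet<>(DOCUMENT_ORDER);
        sorted.addAll(rows);

        Deque<NodeRecord> open = new ArrayDeque<>(); // currently open elements
        for (NodeRecord node : sorted) {
            // Close open elements until this node's parent is on top of the stack.
            while (!open.isEmpty() && open.peek().nodeId != node.parentId) {
                out.write("</" + open.pop().name + ">");
            }
            if (node.type.equals("ELEMENT")) {
                out.write("<" + node.name + ">");
                open.push(node);
            } else {
                out.write(escape(node.data));
            }
        }
        // Close whatever is still open once all nodes have been written.
        while (!open.isEmpty()) {
            out.write("</" + open.pop().name + ">");
        }
    }

    // Minimal escaping for text content.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Rows are supplied out of order on purpose; the TreeSet restores the order.
    static String demo() {
        List<NodeRecord> rows = Arrays.asList(
                new NodeRecord(5, 4, "TEXT", null, "Jones & co"),
                new NodeRecord(2, 1, "ELEMENT", "title", null),
                new NodeRecord(4, 1, "ELEMENT", "creator", null),
                new NodeRecord(1, 0, "ELEMENT", "dataset", null),
                new NodeRecord(3, 2, "TEXT", null, "Kelp counts"));
        StringWriter out = new StringWriter();
        try {
            toXml(rows, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }
}
```

<p>In this sketch, each node is visited once after the sort, and a stack of open elements decides when closing tags are written, so no per-node lookups are needed while the document text is produced.</p>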
<p>The old read algorithm (which used the ElementNode, textNode,<br />CommentNode, and PINode classes) is still implemented for comparison<br />purposes and can be accessed by calling the readUsingSlowAlgorithm() method.<br />A timing option has been added to DocumentImpl.main() so that the two methods<br />can be compared (see the -t and -old options). Although the difference<br />in read time is only a fraction of a second for small documents (&lt; 1K),<br />the new method of reading is 72 times faster than the old method for a<br />34K document (1.9 seconds versus 144 seconds). This difference continues<br />to grow as the node count increases.</p>
Metacat - Bug #111: reading large documents from metacat is slow
https://projects.ecoinformatics.org/ecoinfo/issues/111?journal_id=375
2013-03-27T21:13:21Z, Redmine Admin
<ul></ul><p>Original Bugzilla ID was 111</p>