Modified code to enter/remove data from xml_path_index and enter data into xml_nodes_revision when action=insert, update and delete are performed.
Check if all the paths in returnfield are indexed. If yes, then use xml_path_index for getting values of returnfields instead of using xml_nodes + xml_index
Adding code to check whether all the paths being searched for are indexed. If all paths are indexed, the code will use xml_path_index instead of using xml_nodes and xml_index
Adding a new method to MetaCatServlet for getting the value of paths to be indexed from metacat.properties file and indexing those paths in xml_path_index
Changed the name of the function
Added a new function which can be used to control the carriage return at the end of the debug message and 'Metacat' in front of the message...
Fixed a bug in printSQL function. The bug came into the picture for queries which involve multiple query groups and one of the query groups does a % search.
Added code to delete all entries in xml_queryresult table when buildIndex for a docid is called.
Adding the call to normalize function for now -- too many calls to normalize function are made and this needs to be looked into. Removing call to normalize function caused trouble in text like this "A&B"
Fixed a bug in last commit. Doing a search like <pathquery version="1.2"><returndoctype>eml://ecoinformatics.org/eml-2.0.0</returndoctype><returnfield>originator/individualName/surName</returnfield><querygroup operator="INTERSECT"><queryterm casesensitive="false" searchmode="contains"><value>%</value></queryterm><queryterm casesensitive="false" searchmode="contains"><value>National Center for Ecological Analysis and Synthesis</value><pathexpr>organizationName</pathexpr></queryterm></querygroup></pathquery>...
Fixed a bug - when returnfield_id is not found in xml_returnfield, records are not added to xml_queryresults
1. Removed the call to QuerySpecification.printAttributeSQL() from QuerySpecification.printExtendedSQL(). 2. In QuerySpecification.printExtendedSQL(), if ( returnFieldList.size() == 0 ) then it returns null. 3. DBQuery.addReturnfield() doesn't execute the element query if printExtendedSQL returns null.
A minor change from containAtrributeReturnField to containsAttributeReturnField
I've added in a test in QuerySpecification.printExtendedSQL that checks to see if attributes, and sometimes only attributes, are in the original returnfield request. If so, the printAttributeQuery is called.
Changed the handleReturnField() method so that it handles path expressions with only attributes in the path.
Added a check in printAttributeQuery() for returnPath to see that it is not null, so that this doesn't happen:
xml_index.path like 'null' AND xml_nodes.nodename like 'id'
Fixed a bug in previous commit
Modified code for computing the returnfield string - earlier only elements were used to construct the string, e.g. /dataset/title. Now attributes are also added to the returnfield, e.g. title/@id
Replaced 1.4.0 with release
release
When a path expression includes element content and attribute content, then the SQL generated needs to search for attribute nodetypes with parent nodenames equal to the path expression element content. However, when only searching for attribute content (such as just @packageId), then...
When searching for attributes in the XPATH expression, an 'index out of bounds' exception was thrown when only an attribute was included in the path string.
This fix changes the pathexpr.indexOf comparison to 0 rather than 1, since the index starts at 0....
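A minimal sketch of the kind of check described above, assuming attributes are marked with '@' in the path expression. The class and method names are hypothetical and are not taken from the Metacat source; this only illustrates why the comparison has to be made against index 0.

```java
// Illustrative sketch only; class and method names are hypothetical,
// not taken from the Metacat source.
final class AttributePathSketch {

    /**
     * Returns the element portion of a path expression, or an empty
     * string when the expression consists of an attribute only.
     */
    static String elementPart(String pathexpr) {
        int at = pathexpr.indexOf('@');
        if (at < 0) {
            return pathexpr;              // no attribute in the path
        }
        if (at == 0) {
            return "";                    // attribute-only path such as "@packageId"
        }
        // Drop the '/' separating the element path from the attribute.
        return pathexpr.substring(0, at - 1);
    }

    /** Returns the attribute name, or null when the path has no attribute. */
    static String attributeName(String pathexpr) {
        int at = pathexpr.indexOf('@');
        return at < 0 ? null : pathexpr.substring(at + 1);
    }
}
```

In this sketch, comparing against 1 instead of 0 would let "@packageId" fall through to substring(0, -1), which throws a StringIndexOutOfBoundsException.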
Removing call to normalize from getNodeRecordList()
Changing the normalize function. Adding changes submitted by Johnoel. Removing code for converting ", since " can be stored as is on Oracle. Removing code which strips out \n and \r
Removed occurrence of enum, which is a keyword in Java 1.5
Changed the queries so that PreparedStatement.setString() and .setInt() are used instead of writing the strings directly into the sql statements
Fixed a bug from previous commit which removed whitespaces from documents.
Removed irrelevant code from previous commit
Removed code which entered value for nodedata in xml_index
Modified code such that nodedata column in xml_index is not created and used. So now we are using the same logic for using xml_index which was used for metacat release 1.4
Added new database upgrade script for postgres. Modified the database upgrade script for oracle. Added comment to DBquery.java. Fixed a bug in xmltables.sql
Added code so that metacat administrator can delete any document.
Added the trailing '/' that was omitted when XPATH queries included attributes. This is a partial bug fix for:
http://bugzilla.ecoinformatics.org/show_bug.cgi?id=2052
Fixed bug in command-line mode which caused array index out of bounds exception.
Changed default maxHarvests value to 0. Added logic to ignore maxHarvests value when it is set to 0 or a negative number. This allows Harvester to run indefinitely without shutting down after reaching a maximum number of harvests. The previous default value of 30 would cause Harvester to terminate after 30 harvests.
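The maxHarvests rule above could be sketched as follows. The class and method names are hypothetical; only the behaviour (0 or a negative value means no limit) comes from the commit message.

```java
// Illustrative sketch only; not the actual Harvester code.
final class HarvestLimitSketch {

    /**
     * Decides whether another harvest should be run.
     * A maxHarvests value of 0 or less is treated as "no limit".
     */
    static boolean shouldContinue(int harvestsCompleted, int maxHarvests) {
        if (maxHarvests <= 0) {
            return true;                          // run indefinitely
        }
        return harvestsCompleted < maxHarvests;   // stop once the limit is reached
    }
}
```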
Added code to check if the document is indexed yet! So entry into xml_queryresult is only made if the doc is in xml_index.
Added code such that an offset can be specified in metacat.properties for entering records into xml_queryresult table. The value of xml_returnfield.usage_count should be more than the value specified in metacat.properties for records to be entered into xml_queryresult. So if you want results for any combination of returnfields to be stored in xml_queryresult only after that combination has been requested 50 times, set this value to 50
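A rough sketch of the threshold check described above. The property key used here is hypothetical (the real key in metacat.properties may differ), as are the class and method names; only the "usage_count must be more than the configured value" rule comes from the commit message.

```java
import java.util.Properties;

// Illustrative sketch only; not the actual Metacat code.
final class QueryResultThresholdSketch {

    // Hypothetical property key; the real key in metacat.properties may differ.
    static final String KEY = "returnfield.usage.threshold";

    /**
     * Returns true when results for a returnfield combination should be
     * stored, i.e. when its usage count exceeds the configured value.
     */
    static boolean shouldStore(int usageCount, Properties props) {
        int threshold = Integer.parseInt(props.getProperty(KEY, "0").trim());
        return usageCount > threshold;
    }
}
```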
Changes for entering new records into xml_queryresult:
1. Added a function which checks if a record exists in the xml_returnfield table for a given combination of return fields. If a record is found, the id for the record is returned. Otherwise a new record is created in the xml_returnfield table and the id of the new record is returned....
Added code for updating xml_queryresult for action=delete and action=update
For both actions, the entries in xml_queryresult are deleted. This works for action=update because deletion of entry is simple and the entries will be created again when the docid is part of a search result next time.
Added a function which gives back a string which is generated by sorting the returnfields requested for given query specifications
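One possible shape for such a function, as a sketch. The separator character and all names are assumptions; the point is only that sorting makes the resulting string independent of the order in which the returnfields were requested.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Illustrative sketch only; not the actual QuerySpecification code.
final class ReturnFieldKeySketch {

    /**
     * Builds a string from the requested returnfields that does not depend
     * on request order. The '|' separator is an assumption.
     */
    static String sortedKey(Collection<String> returnFields) {
        List<String> sorted = new ArrayList<>(returnFields);
        Collections.sort(sorted);
        return String.join("|", sorted);
    }
}
```

Two queries asking for the same returnfields in a different order then map to the same string, which is what allows a single xml_returnfield record to be shared between them.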
Modified the offset which is used for creating the resultset. This helps in fixing the 'more than 1000 entries in IN query' bug on Oracle.
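Oracle rejects an IN list with more than 1000 expressions. A common way to stay under that limit, shown here as a sketch and not necessarily how this commit does it, is to split the ids into batches. All names below are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; not the actual DBQuery code.
final class InClauseBatchSketch {

    // Maximum number of expressions Oracle accepts in a single IN list.
    static final int ORACLE_IN_LIMIT = 1000;

    /** Splits items into consecutive batches of at most maxPerBatch elements. */
    static <T> List<List<T>> batches(List<T> items, int maxPerBatch) {
        if (maxPerBatch <= 0) {
            throw new IllegalArgumentException("maxPerBatch must be positive");
        }
        List<List<T>> out = new ArrayList<>();
        for (int start = 0; start < items.size(); start += maxPerBatch) {
            int end = Math.min(start + maxPerBatch, items.size());
            out.add(new ArrayList<>(items.subList(start, end)));
        }
        return out;
    }
}
```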
Added a check for eml2.0.1 documents in ContentTypeProvider.java
Using a variable to replace the hard-coded namespace when inserting records into the xml_relation table.
Fixed a bug in the format of query result.
Formatting improvements.
Implement a new HarvesterServlet for running Harvester as a servlet. This eliminates the need to run Harvester in a terminal window. By default, the HarvesterServlet is commented out in lib/web.xml.tomcat(3,4,5). The user documentation will be modified to instruct Harvester administrators to uncomment the HarvesterServlet entry.
Re-implement logic to prune old log entries from the HARVEST_LOG and HARVEST_DETAIL_LOG tables. The old logic caused integrity constraint violations in the database because it tried to delete parent records from HARVEST_LOG prior to deleting child records from HARVEST_DETAIL_LOG....
Modifying code so that nodedata is stored in xml_index table next to the paths. This helps in making the search faster.
Increase number of rows in harvest list from 300 to 1200.
Fixed a bug from previous commit
Some more modifications so that % search doesn't run a select on xml_nodes.
Modified code to fix bug # 1850
Change default document type in Harvest List Editor to eml-2.0.1.
Remove DOS end-of-line carriage returns. Other minor formatting improvements to the code.
Modified code so that when a % search is done, an xml_nodes search is not done. This search is not required and hence saves time in doing a % search
Added code to check for NaN values when converting string to double.
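A sketch of such a check. Double.parseDouble accepts the literal "NaN", so a parse that succeeds can still yield a value that should not be treated as a number. The class and method names are hypothetical.

```java
// Illustrative sketch only; not the actual Metacat code.
final class NumericNodeDataSketch {

    /**
     * Converts text to a Double, returning null when the text is not a
     * number or parses to NaN.
     */
    static Double toNumeric(String text) {
        if (text == null) {
            return null;
        }
        try {
            double value = Double.parseDouble(text.trim());
            if (Double.isNaN(value)) {
                return null;              // "NaN" parses without error but is not usable
            }
            return value;
        } catch (NumberFormatException e) {
            return null;                  // not numeric at all
        }
    }
}
```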
Fixed a bug in containsKey() function.
Modified 'insert xml_nodes...' so that numerical data is also added into nodedatanumerical.
Modified handling of greater-than and less-than so that comparison is done against nodedatanumerical column.
Fixed a bug in previous commit. Moved normalization function before the string size is counted so that size change due to normalization is taken into account.
Added code to check if document is passed as part of the params. This is done to prevent NullPointerException which results in sending null to the user.
Made changes to fix bug# 1538. Changed the code of the normalize function in MetaCatUtil.java. Earlier code was not taking care of characters above 123.
In DBSAXHandler.java, added call to normalize function before text is written to db.
Add a new method to get newest version of a given document.
Add a new method to get newest version for a given document.
Modify value of redirect to match the servlet-mapping URL values in web.xml. The old value used the servlet class name. This worked in Tomcat 4 but seems to break in Tomcat 5 on Windows. The new value uses the servlet-mapping URL value. This should work in both Tomcat 4 and Tomcat 5.
Minor enhancement to support multiple email addresses for harvester administrator and site contact. Each address is separated by a comma or semicolon.
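Parsing such an address list could look like the sketch below. The class and method names are hypothetical; only the comma-or-semicolon separator rule comes from the commit message.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; not the actual Harvester code.
final class EmailListSketch {

    /** Splits a property value into addresses separated by ',' or ';'. */
    static List<String> parseAddresses(String value) {
        List<String> addresses = new ArrayList<>();
        if (value == null) {
            return addresses;
        }
        for (String part : value.split("[,;]")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                addresses.add(trimmed);   // skip empty entries such as "a@x.org;;b@x.org"
            }
        }
        return addresses;
    }
}
```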
Thanks Jing, Fixed those code comments.
Added a function to the metacat client to set access on an xml document in a metacat repository.
Modified buildIndex() to now include an '@' sign in the path for ATTRIBUTE nodes. Removed a bunch of debugging information. Fixed the BuildIndexTest so that it would work on any machine (removed hardcoded paths).
Added in servlet action 'buildindex' for building the XML_index table entries for either a set of documents (if one or more docid params are provided) or for the whole set of documents in the xml_documents table. The buildindex action is restricted to only be accessible by users who are listed in the...
Changed DBSAXHandler.run() to now use the new DocumentImpl.buildIndex() method for populating the xml_index table. The new method uses a jdbc ResultSet for populating the index rather than doing it with the DBSAXNode, so it is faster and can be run at any time on any document. This will allow us to...
Compose the Metacat URL from the httpserver and the servletpath properties, replacing hard-coded references to servlet.
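A sketch of composing the URL from the two properties. The property names httpserver and servletpath come from the commit message; the class name, the slash handling and any example values are assumptions.

```java
import java.util.Properties;

// Illustrative sketch only; not the actual Harvester code.
final class MetacatUrlSketch {

    /** Joins the httpserver and servletpath properties with exactly one '/'. */
    static String metacatUrl(Properties props) {
        String server = props.getProperty("httpserver", "");
        String path = props.getProperty("servletpath", "");
        if (server.endsWith("/")) {
            server = server.substring(0, server.length() - 1);
        }
        if (!path.startsWith("/")) {
            path = "/" + path;
        }
        return server + path;
    }
}
```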
Send redirect to HarvesterRegistration, instead of using the full class name which works on Tomcat-only installations but not with Apache.
Change the date format to one that is standard on both Oracle and Postgres.
Whitespace changes that fix a few formatting problems after Jing's commit.
Add new feature so that a delete can be broadcast by force replication.
The new buildIndex() function now can write all of the appropriate index paths to the database for any given document. Next need to create a function to rebuild on demand, and modify DBSAXHandler.run() to use the new buildIndex() function.
Added changes to buildIndex() function. Now it is finding the right set of paths, just have to save these in a hash and then add them to the DB xml_index table.
changed function parameters in accordance with changes in PermissionController
Modified inline data permission handling, so that access rules for old version of inline data can be checked correctly.
Added a function which returns inline data id without the revision number.
Changed error text that is returned when an invalid eml is inserted.
Add code to load the eml201 parser, fixing the bug where access rules could not be generated for eml201 docs.
Fixed error in handling of multiple additional metadata tags...
removed errors being generated in handling of qformat when action=insert. If qformat is not specified, xml is assumed as default.
Beginning new method for building the xml_index table. This uses the JDBC resultset directly rather than DBSaxNode, and recurses through the records of the table. The new function 'buildIndex()' would be called, but currently is not linked in to any code, so it shouldn't get in the way....
removed a bug which was pointed out by Bing and fixed by Jing.
Removed some unused code
Added new upload function which takes InputStream as input.
fixed some bugs in document update
Added method to metacat client for reading inline data - readInlineData()
Fixed a bug in upload function. For online data updates, access was not checked.
Fixed handling of various docid formats.
Code added to handle errors resulting from the following urls: http://metacat.nceas.ucsb.edu/knb/metacat?action2 - no action specified; http://metacat.nceas.ucsb.edu/knb/metacat?action=insert - no docid specified; http://metacat.nceas.ucsb.edu/knb/metacat?action=login - no username specified...
Added check for null condition so that proper error text is returned to user.
Checked to be sure the instance has been initialized in the getDBConnection static method call. Assuming it has been initialized could (and does) lead to NullPointerExceptions when used outside of the metacat servlet if the connection pool isn't initialized properly.
Reformatted code for readability. It was crazy. Still has problems, but it's better. Will be working on some new methods on Monday.
Fixed a bug in access handling when no access is specified.
Added eml-2.0.1 tags for eml processing.
Also fixed a bug. The error returned in case of no revision number specified was just null. Now it says that revision number is required.
Merging in changes made in branch 'dataaccess' by Jing Tao.