add DOI development page. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5513
disable EZID/DOI minting by default since we do not yet have a means of tracking minted DOIs and augmenting metadata for them when we actually receive the object in a subsequent create() or update() call. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5753
use utilities 1.3.0 tag
add solr index documentation outline. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5884
wordsmith the identity mapping page. Not fundamentally different, but hopefully more concise. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5814
use d1_libclient v1.2.1 (temp file creation fix)
tweak to pathquery/generic xpath handling
use utilities and eml style tag as we prep for release.
ready Metacat for 2.0.6 release (docs, db version, build files etc).
group user_owner clause as "AND (... OR .... OR ....)" to handle multiple pathquery <owner> elements. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5880
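A minimal sketch of the grouping described above, not the actual Metacat code. The function name `build_owner_clause` and the `?` placeholder style are assumptions; only the `user_owner` column and the "AND (... OR ...)" shape come from the commit message. Without the parentheses, SQL precedence would bind only the first owner comparison to the preceding AND chain.

```python
def build_owner_clause(owners):
    """Return (sql_fragment, params) restricting results to any of the given owners.

    Hypothetical helper: one comparison per pathquery <owner> element,
    wrapped in a single parenthesized group.
    """
    if not owners:
        # No <owner> elements: contribute nothing to the WHERE clause.
        return "", []
    comparisons = " OR ".join("user_owner = ?" for _ in owners)
    # Parentheses keep the ORs from escaping the surrounding AND chain.
    return f"AND ({comparisons})", list(owners)
```

The fragment is meant to be appended to an existing WHERE clause, with the returned parameters bound in order.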
accidentally added
typo
remove older lucene library and include ORE test to make sure that change does not prevent us from generating OREs. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5874
Search and indexing with Lucene/SOLR. Requires a manually configured SOLR installation. Not currently used by the rest of metacat.
PARC, OBFS, NRS: use only the paths that are indexed by default in metacat.properties. If deployments want to customize these, they are free to do so, but we should ship skins that match the paths we index with a vanilla installation.
generate ID from UUID. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5840
make sure serial version is included or set on MN.update(). http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5793
remove duplicate cgi-bin part in path to create account
Quick fix for bad handling of non-default data/backup directories.
Also add the 2.4.1 hazelcast jars to the trunk.
remove indexing task from the queue when we are updating the document
move DocInfo parsing into utilities project so that it can be used by Morpho as well as Metacat. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5737
use utilities tag to build (remember to 'fullclean' after this update!)
use default count = 1000 for CN.listObjects rather than -1 (because now -1 will cause an SQL error)
default replicaStatus to true for the CN.listObject call
make sure to call lock() on the SM when updating rightsholder (like every other method that gets a lock object from HZ).
return from test when we encounter the NotImplemented exception for CN.search()
include identifier.guid in the test SQL clause.
CN.search() is not implemented by metacat -- making that explicit and also testing for it.
default replicaStatus (aka "show replicas in results") to true rather than false
add debug statements for listObject slice debugging
Add the non-snapshot jars for the D1 libraries.
use utilities and eml RC tags for building Metacat.
include dataone.contactSubject in backup properties so it will be "remembered" during upgrades.
update release date to December
additional db indexes for pathquery performance. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5696
Do not set headers until response is ready to send (5756)
use jar generated from the git repo source (just in case it was different from svn). http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5755
use dual query for query slicing - one for count, another for the actual records when requested. https://redmine.dataone.org/issues/3065
get total (or subtotal when non-slicing params are present) count as a separate query from the field selection query.
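A rough illustration of the count/records split, under stated assumptions. The `objects` table, the `list_objects` function, and the use of SQLite with LIMIT/OFFSET are all hypothetical stand-ins; Metacat itself targets Postgres and Oracle and its schema is not shown here. The total comes from its own COUNT query, and the record query only runs when a non-zero count is requested.

```python
import sqlite3

def list_objects(conn, start=0, count=1000):
    """Return (total, guids): total from a COUNT query, guids from a sliced query."""
    # Query 1: total number of entries in the source list.
    total = conn.execute("SELECT COUNT(*) FROM objects").fetchone()[0]
    if count == 0:
        # Caller only wants the total, so skip the record query entirely.
        return total, []
    # Query 2: the requested slice of records.
    rows = conn.execute(
        "SELECT guid FROM objects ORDER BY guid LIMIT ? OFFSET ?",
        (count, start),
    ).fetchall()
    return total, [row[0] for row in rows]

# Hypothetical in-memory data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE objects (guid TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO objects VALUES (?)", [(f"doc.{i}",) for i in range(5)])
```

Calling `list_objects(conn, 1, 2)` returns the total alongside a two-item slice, while `list_objects(conn, 0, 0)` returns only the total.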
include Skye's suggestions about correctly limiting by D1 Event types
use test doi shoulder as the default for local server, at least during testing phase. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5755
first pass at DOI minting using the EZID service in mn.generateIdentifier(). http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5755
Fix a minor bug in listObjects() where total was set incorrectly when count=0. The definition of total in the D1 architecture docs says 'The total number of entries in the source list from which the slice was extracted.' With count=0, we assume the total is the total count from the entire object store. Needs testing.
remove empty package
rollback the delete() when there is an error performing part of it -- don't want to end up with partial delete.
use Identifier object not String when retrieving SM from the HZ map to set archived during delete()
for MN.update() we needed to pass the original pid, not the new pid
do not reject any schemes -- all handled the same at the moment.
simple autogen-based implementation of MN.generateIdentifier(). does not support DOIs, ARKs, etc. It does support including a fragment, returning an identifier like "<fragment>.2012113010215298206"
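A sketch of what a fragment-prefixed, timestamp-based identifier could look like. This is an assumption, not the Metacat implementation: the commit message only shows the "<fragment>.<digits>" shape, so the exact digit format, the `generate_identifier` name, and the behavior without a fragment are all hypothetical.

```python
from datetime import datetime

def generate_identifier(fragment=None, now=None):
    """Build an identifier from an optional fragment plus a timestamp string."""
    if now is None:
        now = datetime.now()
    # Year through microseconds, digits only (assumed format).
    stamp = now.strftime("%Y%m%d%H%M%S%f")
    # Assumed: with no fragment, return the bare timestamp.
    return f"{fragment}.{stamp}" if fragment else stamp
```

A timestamp alone does not guarantee uniqueness under concurrent calls, which is presumably one reason a later commit in this log moves to UUID-based IDs.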
add link for reference on how to do record limits in oracle
limit /log and /object calls to configurable maximum count for paging. defaults to existing Metacat value of 7000
use RDBMS-specific features to limit the resultset for paging the object list -- postgres and oracle have implementations. we don't really support mssql so I skipped that one.
use RDBMS-specific features to limit the resultset for paging -- postgres and oracle have implementations. we don't really support mssql so I skipped that one.
Add the latest SNAPSHOT build of the hazelcast jars built by Robert at:
http://dev-testing.dataone.org/maven/com/hazelcast/hazelcast/2.4.1-SNAPSHOT/hazelcast-2.4.1-SNAPSHOT.jar http://dev-testing.dataone.org/maven/com/hazelcast/hazelcast-client/2.4.1-SNAPSHOT/hazelcast-client-2.4.1-SNAPSHOT.jar ...
Update the hazelcast libraries based on the most recent build from the hazelcast trunk using patches that Robert submitted via git pull requests.
include debug msg about removing docid from index queue. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5750
remove document from the indexing queue when delete is called. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5750
clean up index queue code before tackling index/delete race condition. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5750
additional release notes about archive/delete behavior and HZ upgrade
no need to mark SM as archived now that DocumentImpl.delete() does it. https://redmine.dataone.org/issues/3406
mark documents as archived=true when they are deleted using the Metacat API. https://redmine.dataone.org/issues/3406
look up the archived value when retrieving SystemMetadata record. https://redmine.dataone.org/issues/3405
surround returned query in CDATA to prevent parsing of xml within xml
Update the two hazelcast jars to 2.4.1-SNAPSHOT versions that Robert generated after fixing certain hazelcast build problems.
correct the metacat.properties help anchors.
use sleeker "?" icon for the admin help links
correct the "?" links in the admin pages to the docs pages that are deployed as part of metacat.
In migrating to Hazelcast 2.4.x, replace deprecated methods.
In migrating to Hazelcast 2.4.x, replace deprecated methods. Use Hazelcast.newHazelcastInstance() rather than Hazelcast.init(). For other deprecated static methods, use the HazelcastInstance equivalent calls.
In CNodeService.updateReplicationMetadata(), we are setting the replicaVerifiedDate() when we update or wholesale add a new replica. However, in setReplicationStatus(), we only do so when there's a new entry. Change setReplicationStatus() to also update the replicaVerifiedDate on updates of existing entries to be more consistent with other changes. This affects node prioritization based on this date timestamp. Thanks to Skye for pointing this out.
To attempt to address performance and stability WRT Hazelcast communication, we're upgrading to the 2.x series of Hazelcast. remove the 1.9.x jar files, and add the 2.4.1-SNAPSHOT jars. Modify HazelcastService to handle the minor change in the ItemListener interface (now passes ItemEvent<Identifier> as an argument)....
implement query description for pathquery -- only tells callers about the pre-indexed paths we have in Metacat since there are an infinite number of "fields" when storing arbitrary XML, but we really don't want people using non-indexed paths for performance reasons anyway. I've typed all the fields as String, even though some are not just strings and can be used for numeric or date comparisons.
Implement MNQuery for "pathquery" engine. Optionally include guid in the pathquery results (https://redmine.dataone.org/issues/3083)
update pub_date when the length of that field is != 4 (use date_created in this scenario). There were 2 entries that had "193" as the pub_date.
replace new lines in creator with spaces. set blank " " titles and creators to "unknown". use "Baltimore Ecosystem Study LTER" for publisher on all BES objects.
include John Kunze's latest suggestions for improved metadata -- a lot of clean-up, especially on characters in the file. Note UTF-8 encoding of the script.
include note about pathquery performance fix (when using indexed fields)
use ObjectFormatInfo libclient utility to look up mimeType and filename extension during get() calls. Configurable mapping file is deployed by default to /var/metacat/dataone where it can then be augmented as needed. This location is controlled in the metacat.properties file (which is injected into the DataONE Settings values during webapp initialization)....
add count for the total processed pids (from ISet iterator)
handle /object?count=0 queries using simpler (quicker) sql https://redmine.dataone.org/issues/3065
allow getlog action to use docid parameters that do not include revision. In these cases, the latest revision will be used.
handle case where we do not have a pathexpr to check. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5696
simplify the xml_access query, and instead use guid to check for permission. Now the docid/rev join (to get most recent version for search results) happens "higher up" in the query. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5696
include log stats for total 'read' events when rendering a package.
rework simple log stats so that there is no saxon requirement (xslt 2)
pass parameters to the getLog action for rendering in xslt
remove morpho.jar -- moved needed classes into shared utilities project. (currently building from utilities trunk -- be sure to 'ant fullclean' to get the latest utilities.jar built)
remove use of HttpMessage (in morpho.jar) in favor of standard httpclient methods for calling the servlet in tests
Update d1_common_java and d1_libclient_java to the newest jar files. Add methods to CNodeService to throw NotImplemented exceptions for query(), listQueryEngines(), and getQueryEngineDescription() since these API calls are handled outside of metacat.
do not allow updates to orphan another branch of revision history. https://redmine.dataone.org/issues/3338
Change the set and get methods for the replication verified date to use java.sql.Timestamp rather than java.util.Date via setTimestamp(), not setDate(). The hh:mm:ss.sss was previously getting truncated.
include the subjects we are testing for authentication. https://redmine.dataone.org/issues/2778
remove the max(rev) clause in favor of a more straight-forward join to xml_documents (that will have the max rev). http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5696
add note about sanparks/saeon spatial file download: http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5718
include inverted sendParameters() method that uses the keys as values, and the values as keys so that multiple docid parameters can be specified for the zip download. This was a regression when moving to standard httpclient rather than the roll-your-own version we had been using. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5718
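To illustrate why inverting keys and values helps: a plain map cannot hold the same parameter name twice, but it can hold several distinct values that all point at the same name. The sketch below is a hypothetical analogue, not the Java sendParameters() method itself; the function name and the example docids are made up.

```python
from urllib.parse import urlencode

def encode_inverted_parameters(value_to_name):
    """Encode a {value: name} mapping as a query string, allowing repeated names."""
    # Flip each entry back to (name, value) order for encoding.
    pairs = [(name, value) for value, name in value_to_name.items()]
    return urlencode(pairs)

# Two docid values share one parameter name (illustrative values only).
query = encode_inverted_parameters({"doc.1.1": "docid", "doc.2.1": "docid"})
```

Here `query` carries both docid values, which a regular {name: value} mapping would have collapsed into one.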
integrate ecoinformatics login with the CILogon identity mapping flow so that a user is directed through the process with no manual navigation needed (at least in the url bar). https://redmine.dataone.org/issues/1480
use version 2.0.5
shorten the systemmetadata* table names for Oracle's 30 character limit. move version to 2.0.5. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5717