/src/edu/ucsb/nceas - Changes - Metacat - Ecoinformatics Redmine

metacat/src/edu/ucsb/nceas @ 7519

#	Date	Author	Comment
7519	03/22/2013 12:29 PM	ben leinfelder	include xml_revisions. do not allow removal of server_location = 1 documents (these are not replicas). https://redmine.dataone.org/issues/3539
7517	03/14/2013 09:56 AM	ben leinfelder	include size and format DataCite elements (optional) and use more general resourceType without formatId in them (Dataset/metadata and Dataset/data). http://schema.datacite.org/meta/kernel-2.2/doc/DataCite-MetadataKernel_v2.2.pdf
7516	03/13/2013 05:11 PM	ben leinfelder	lookup the title for EML files when registering DOIs. lookup the creator from DataONE CN (if available). add EML-based test. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5513
7515	03/13/2013 03:13 PM	Chris Jones	Set the session to null so that the call uses the CN certificate when calling MN.systemMetadataChanged();
7514	03/13/2013 07:26 AM	Chris Jones	To keep all nodes up to date with regard to system metadata changes, add the broadcastSystemMetadataChange() method that finds replica MNs in the node list and calls systemMetadataChanged(). Modify setReplicationStatus() and updateReplicationMetadata() to fire this off when a replica status changes to completed. We may decide to inform MNs at other times too, but this is a conservative amount of calls going to the MNs for now.
7512	03/12/2013 04:44 PM	ben leinfelder	refactor DOI registration into separate class. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5513
7511	03/12/2013 04:26 PM	ben leinfelder	refactor using ezid-client changes that split field names and values into separate enums. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5513
7510	03/12/2013 03:20 PM	ben leinfelder	Correctly mint and register DOIs in the MN API implementation. Add tests to exercise minting and creating. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5513
7507	03/11/2013 04:48 PM	ben leinfelder	register DOIs with minimal DataCite metadata. still need to determine which details to include and when, but the plumbing is in place as we refine those rules. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5513
7506	03/08/2013 03:49 PM	ben leinfelder	class for removing failed/invalid replicas from target nodes that previously held replicated content (KNB/LTER/PISCO/etc). https://redmine.dataone.org/issues/3539
7503	02/26/2013 10:27 AM	ben leinfelder	disable EZID/DOI minting by default since we do not yet have a means of tracking minted DOIs and augmenting metadata for them when we actually receive the object in a subsequent create() or update() call. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5753
7498	02/22/2013 01:06 PM	Brendan Hahn	tweak to pathquery/generic xpath handling
7495	02/22/2013 11:07 AM	ben leinfelder	group user_owner clause as "AND (... OR .... OR ....)" to handle multiple pathquery <owner> elements. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5880
7494	02/21/2013 05:15 PM	Brendan Hahn	accidentally added
7493	02/21/2013 05:13 PM	Brendan Hahn	typo
7491	02/21/2013 12:43 PM	Brendan Hahn	Search and indexing with Lucene/SOLR. Requires a manually configured SOLR installation. Not currently used by the rest of metacat.
7489	01/31/2013 04:02 PM	ben leinfelder	generate ID from UUID. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5840
7486	01/18/2013 02:12 PM	ben leinfelder	make sure serial version is included or set on MN.update(). http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5793
7480	01/07/2013 04:23 PM	Brendan Hahn	Quick fix for bad handling of non-default data/backup directories.
7477	12/18/2012 05:33 PM	ben leinfelder	remove indexing task from the queue when we are updating the document
7475	12/12/2012 02:38 PM	ben leinfelder	move DocInfo parsing into utilities project so that it can be used by Morpho as well as Metacat. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5737
7471	12/10/2012 09:07 AM	ben leinfelder	use default count = 1000 for CN.listObjects rather than -1 (because now -1 will cause an SQL error)
7469	12/08/2012 06:41 PM	ben leinfelder	default replicaStatus to true for the CN.listObject call
7467	12/07/2012 10:39 AM	ben leinfelder	make sure to call lock() on the SM when updating rightsholder (like every other method that gets a lock object from HZ).
7464	12/07/2012 10:25 AM	ben leinfelder	CN.search() is not implemented by metacat -- making that explicit and also testing for it.
7462	12/05/2012 11:04 AM	ben leinfelder	default replicaStatus (aka "show replicas in results") to true rather than false
7461	12/05/2012 10:29 AM	ben leinfelder	add debug statements for listObject slice debugging
7454	12/03/2012 02:25 PM	Brendan Hahn	Do not set headers until response is ready to send (5756)
7452	12/03/2012 12:30 PM	ben leinfelder	use dual query for query slicing - one for count, another for the actual records when requested. https://redmine.dataone.org/issues/3065
7451	12/03/2012 11:32 AM	ben leinfelder	get total (or subtotal when non-slicing params are present) count as a separate query from the field selection query.
7450	12/03/2012 10:16 AM	ben leinfelder	include Skye's suggestions about correctly limiting by D1 Event types
7448	12/02/2012 08:58 AM	ben leinfelder	first pass at DOI minting using the EZID service in mn.generateIdentifier() http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5755
7447	11/30/2012 05:19 PM	Chris Jones	Fix a minor bug in listObjects() where total was set incorrectly when count=0. The definition of total in the D1 architecture docs says 'The total number of entries in the source list from which the slice was extracted.' With count=0, we assume the total is the total count from the entire object store. Needs testing.
7446	11/30/2012 03:23 PM	ben leinfelder	remove empty package
7445	11/30/2012 02:53 PM	ben leinfelder	rollback the delete() when there is an error performing part of it -- don't want to end up with partial delete.
7444	11/30/2012 02:27 PM	ben leinfelder	use Identifier object not String when retrieving SM from the HZ map to set archived during delete()
7443	11/30/2012 12:17 PM	ben leinfelder	for MN.update() we needed to pass the original pid, not the new pid
7442	11/30/2012 10:49 AM	ben leinfelder	do not reject any schemes -- all handled the same at the moment.
7441	11/30/2012 10:23 AM	ben leinfelder	simple autogen-based implementation of MN.generateIdentifier(). does not support DOIs, ARKs, etc. It does support including a fragment, returning an identifier like "<fragment>.2012113010215298206"
7440	11/29/2012 04:54 PM	ben leinfelder	add link for reference on how to do record limits in oracle
7439	11/29/2012 04:52 PM	ben leinfelder	limit /log and /object calls to configurable maximum count for paging. defaults to existing Metacat value of 7000
7438	11/29/2012 04:33 PM	ben leinfelder	use RDBMS-specific features to limit the resultset for paging the object list -- postgres and oracle have implementations. we don't really support mssql so I skipped that one.
7437	11/29/2012 04:12 PM	ben leinfelder	use RDBMS-specific features to limit the resultset for paging -- postgres and oracle have implementations. we don't really support mssql so I skipped that one.
7434	11/26/2012 02:28 PM	ben leinfelder	include debug msg about removing docid from index queue. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5750
7433	11/26/2012 02:25 PM	ben leinfelder	remove document from the indexing queue when delete is called. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5750
7432	11/26/2012 01:50 PM	ben leinfelder	clean up index queue code before tackling index/delete race condition. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5750
7430	11/23/2012 10:02 AM	ben leinfelder	no need to mark SM as archived now that DocumentImpl.delete() does it. https://redmine.dataone.org/issues/3406
7429	11/23/2012 10:00 AM	ben leinfelder	mark documents as archived=true when they are deleted using the Metacat API. https://redmine.dataone.org/issues/3406
7428	11/21/2012 04:35 PM	ben leinfelder	look up the archived value when retrieving SystemMetadata record. https://redmine.dataone.org/issues/3405
7427	11/19/2012 04:03 PM	ben leinfelder	surround returned query in CDATA to prevent parsing of xml within xml
7421	11/10/2012 03:34 PM	Chris Jones	In migrating to Hazelcast 2.4.x, replace deprecated methods. Use Hazelcast.newHazelcastInstance() rather than Hazelcast.init(). For other deprecated static methods, use the HazelcastInstance equivalent calls.
7420	11/09/2012 10:57 AM	Chris Jones	In CNodeService.updateReplicationMetadata(), we are setting the replicaVerifiedDate() when we update or wholesale add a new replica. However, in setReplicationStatus(), we only do so when there's a new entry. Change setReplicationStatus() to also update the replicaVerifiedDate on updates of existing entries to be more consistent with other changes. This affects node prioritization based on this date timestamp. Thanks to Skye for pointing this out.
7419	11/09/2012 08:56 AM	Chris Jones	To attempt to address performance and stability WRT Hazelcast communication, we're upgrading to the 2.x series of Hazelcast. remove the 1.9.x jar files, and add the 2.4.1-SNAPSHOT jars. Modify HazelcastService to handle the minor change in the ItemListener interface (now passes ItemEvent<Identifier> as an argument)....
7418	11/07/2012 04:27 PM	ben leinfelder	implement query description for pathquery -- only tells callers about the pre-indexed paths we have in Metacat since there are an infinite number of "fields" when storing arbitrary XML, but we really don't want people using non-indexed paths for performance reasons anyway. I've typed all the fields as String, even though some are not just strings and can be used for numeric or date comparisons.
7417	11/07/2012 02:53 PM	ben leinfelder	Implement MNQuery for "pathquery" engine. Optionally include guid in the pathquery results (https://redmine.dataone.org/issues/3083)
7412	10/26/2012 09:11 AM	ben leinfelder	use ObjectFormatInfo libclient utility to look up mimeType and filename extension during get() calls. Configurable mapping file is deployed by default to /var/metacat/dataone where it can then be augmented as needed. This location is controlled in the metacat.properties file (which is injected into the DataONE Settings values during webapp initialization)....
7411	10/26/2012 09:08 AM	ben leinfelder	add count for the total processed pids (from ISet iterator)
7410	10/22/2012 01:38 PM	ben leinfelder	handle /object?count=0 queries using simpler (quicker) sql https://redmine.dataone.org/issues/3065
7409	10/19/2012 10:20 AM	ben leinfelder	allow getlog action to use docid parameters that do not include revision. In these cases, the latest revision will be used.
7408	10/19/2012 10:05 AM	ben leinfelder	handle case where we do not have a pathexpr to check http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5696
7407	10/18/2012 03:14 PM	ben leinfelder	simplify the xml_access query, and instead use guid to check for permission. Now the docid/rev join (to get most recent version for search results) happens "higher up" in the query. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5696
7404	10/18/2012 11:10 AM	ben leinfelder	pass parameters to the getLog action for rendering in xslt
7403	10/16/2012 01:50 PM	ben leinfelder	remove morpho.jar -- moved needed classes into shared utilities project. (currently building from utilities trunk -- be sure to 'ant fullclean' to get the latest utilities.jar built)
7401	10/15/2012 02:38 PM	Chris Jones	Update d1_common_java and d1_libclient_java to the newest jar files. Add methods to CNodeService to throw NotImplemented exceptions for query(), listQueryEngines(), and getQueryEngineDescription() since these API calls are handled outside of metacat.
7400	10/12/2012 01:35 PM	ben leinfelder	do not allow updates to orphan another branch of revision history. https://redmine.dataone.org/issues/3338
7399	10/12/2012 08:27 AM	Chris Jones	Change the set and get methods for the replication verified date to use java.sql.Timestamp rather than java.util.Date via setTimestamp(), not setDate(). The hh:mm:ss.sss was previously getting truncated.
7398	10/08/2012 11:09 AM	ben leinfelder	include the subjects we are testing for authentication. https://redmine.dataone.org/issues/2778
7397	09/28/2012 09:06 AM	ben leinfelder	remove the max(rev) clause in favor of a more straight-forward join to xml_documents (that will have the max rev). http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5696
7395	09/25/2012 09:49 AM	ben leinfelder	include inverted sendParameters() method that uses the keys as values, and the values as keys so that multiple docid parameters can be specified for the zip download. This was a regression when moving to standard httpclient rather than the roll-your-own version we had been using. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5718
7392	09/24/2012 01:09 PM	ben leinfelder	shorten the systemmetadata* table names for Oracle's 30 character limit. move version to 2.0.5. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5717
7382	09/14/2012 02:01 PM	ben leinfelder	use correct docid format when checking for existing mappings.
7379	09/12/2012 02:22 PM	ben leinfelder	use CDATA for docname field in docInfo so that XML parser ignores the content that can contain characters like "&
7370	09/04/2012 03:43 PM	ben leinfelder	use SchemaLocationResolver to fetch remote entries for the xml_catalog -- we want to be able to fetch included xsd files as well as use any error handling it provides for checking the schemas.
7368	09/03/2012 01:50 PM	ben leinfelder	when performing query, make sure we are using the access rules of the latest revision of a given docid, otherwise we may include documents that used to be public but have been made private in subsequent revisions. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5696
7367	08/31/2012 03:05 PM	ben leinfelder	correct the number of prepared statement parameters when inserting to xml_revisions table. Errors like the following were showing in the replication log file: knb 20120831-19:42:38: [ERROR]: DocumentImpl.writeReplication - Failed to create access rule for package: john.15950.1 because The column index is out of range: 12, number of columns: 11. [ReplicationLogging]
7366	08/24/2012 08:24 AM	ben leinfelder	include WHERE in the sql where clause - encountered by SAEON's node admin, Alex Niehaus.
7358	08/23/2012 09:45 PM	ben leinfelder	create docid-guid mapping during replication if it does not exist. we were [incorrectly] assuming that there would be SM coming with the document info that would fill this information in, but for traditional non-MN Metacat deployments there is no SM to provide a mapping. In this case we use the docid as the guid.
7356	08/17/2012 12:42 PM	ben leinfelder	stream the replication "update" response rather than building up a complete list in a stringbuffer. prompted by findings on the CN: https://redmine.dataone.org/issues/3141
7355	08/15/2012 03:46 PM	ben leinfelder	make sure data objects correctly use force replicate with action "insert" https://redmine.dataone.org/issues/3138
7350	08/06/2012 10:47 PM	ben leinfelder	when updating a document on a remote server, we still need to use the previous docid to check that the user has permissions to do so (rather than the new id that is obsoleting the old id). This was discovered by M Servilla at LTER.
7348	08/06/2012 11:08 AM	ben leinfelder	remove unused "dataonelogger"
7346	08/03/2012 02:27 PM	ben leinfelder	allow SM resynch to be executed any time, not just during start up. https://redmine.dataone.org/issues/3116
7345	08/03/2012 01:01 PM	ben leinfelder	change to debug log level when processing shared/local pids
7344	08/03/2012 10:41 AM	ben leinfelder	only lock the missing pid event if we know we have it locally to contribute. https://redmine.dataone.org/issues/3117
7343	08/03/2012 09:26 AM	Chris Jones	Add locking to the itemAdded() method so ideally only one CN will respond to the request for a 'wanted' pid from the cluster. The lock is on a string, not the pid, and so won't conflict with system metadata locking. The string is based on the pid, with "missing-" as a prefix.
7342	08/03/2012 08:53 AM	ben leinfelder	only publish to the missing pid "wanted list" when resynching system metadata. we were seeing redundant entry added/updated events when looking up the shared systemmetadata first.
7341	08/02/2012 10:18 PM	ben leinfelder	print the missing pid count, not the total shared pid count so we know how many will be processed.
7340	08/02/2012 05:50 PM	ben leinfelder	change the system metadata resynch approach: nodes will publish PIDs that they are missing after inspecting the shared identifier set. other nodes will be listening for the "wanted" pids and will put their local copy of SystemMetadata on the shared SM map. This should dramatically decrease the hazelcast chatter during a resynch and targets only the pids that are missing from any of the various nodes.
7339	08/01/2012 10:40 PM	ben leinfelder	logging for processing identifier set on restart.
7338	08/01/2012 07:00 PM	ben leinfelder	remove possibility for infinite loop in case data replication is not configured for the server and a data file is encountered (yikes!)
7337	08/01/2012 05:33 PM	ben leinfelder	added logging debug statements to see where the replication timeout might be occurring.
7331	07/26/2012 04:26 PM	ben leinfelder	check for null archived flag in ORE SM https://redmine.dataone.org/issues/3046
7330	07/26/2012 12:08 PM	ben leinfelder	check if the caller is the Node admin (the member node calling itself) as well as the existing check for the CN calling the service. Both of those callers should be given full admin rights.
7326	07/23/2012 11:55 AM	ben leinfelder	use local Set processing to determine which pids (if any) should be contributed to the shared set by this node during the resync. Should save time rather than checking each and every pid against the shared set.
7325	07/20/2012 03:44 PM	ben leinfelder	move the hzIdentifiers initialization into the resync thread so that it does not affect start up time. cleaned up unused methods and superfluous code.
7323	07/20/2012 10:51 AM	ben leinfelder	only load local pids into hzIdentifiers if they do not already exist in the shared set. increase logging severity and detail of messages emitted during this process to get a better sense of what is taking so long.
7322	07/19/2012 02:38 PM	ben leinfelder	utility methods to update/reserialize existing ORE maps that were generated with older foresite (and included bad dateTime strings). https://redmine.dataone.org/issues/3046
7319	07/17/2012 03:57 PM	Chris Jones	On the coordinating Nodes, we often get McdbDocNotFoundExceptions for data (doctype == 'BIN') documents because they are not synchronized to the CNs. Change the logging to only print the stack trace during load() and loadAll() when log debug is enabled.
7318	07/17/2012 01:34 PM	ben leinfelder	check for invalid (!) pids. thanks, M. Reyes for catching this https://redmine.dataone.org/issues/3047
7317	07/17/2012 12:06 PM	ben leinfelder	only look up the client timeout property once, not every time we make a call https://redmine.dataone.org/issues/3078
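
Revision 7495 above groups the user_owner conditions as "AND (... OR ... OR ...)" so that several pathquery <owner> elements can be combined. The parentheses matter because SQL's AND binds tighter than OR: without them, every owner after the first would match regardless of the other conditions in the WHERE clause. The following is a minimal sketch of that grouping, not Metacat's actual query code; the class name, the method name, and the surrounding SELECT statement are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not taken from the Metacat source.
class OwnerClauseSketch {

    // Builds a grouped clause such as " AND (user_owner = ? OR user_owner = ?)",
    // with one placeholder per owner. Returns an empty string when there are
    // no owners, so the caller's WHERE clause is left unchanged.
    static String buildOwnerClause(List<String> owners) {
        if (owners == null || owners.isEmpty()) {
            return "";
        }
        List<String> conditions = new ArrayList<>();
        for (int i = 0; i < owners.size(); i++) {
            conditions.add("user_owner = ?");
        }
        return " AND (" + String.join(" OR ", conditions) + ")";
    }

    public static void main(String[] args) {
        // The base query is an assumption; only the grouping is the point here.
        String sql = "SELECT docid FROM xml_documents WHERE doctype = ?"
                + buildOwnerClause(Arrays.asList("alice", "bob"));
        System.out.println(sql);
    }
}
```

The owner values themselves would be bound to the placeholders of a prepared statement in the same order as the list, which keeps them out of the SQL text.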
