After reviewing CNodeService and D1NodeService prompted by Robert comparing the Hazelcast locking with the d1_synchronization locking, I've made a number of changes that will prevent locking problems:
1) Multiple methods contained try/catch blocks that would:...
only delete replicated data files (server_location != 1)
use inherited access control from EML for the data file we download from a remote sourcehttp://bugzilla.ecoinformatics.org/show_bug.cgi?id=5522
download remote data and save locally when it is referenced by an EML package, then include it in the ORE map.http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5522
When the requested count in a call to listObjects() is 0, return an empty object list, not a full one. Fixes https://redmine.dataone.org/issues/2122
Minor formatting for querySystemMetadata().
exapnd permissions on the exisiting access rule not on the permission being checked. (hierarchical permissions)
upgrade routine to purge empty replicated data files so that they can be re-replicatedhttp://bugzilla.ecoinformatics.org/show_bug.cgi?id=5536
Make sure the local id isn't null when we try to get the object from the local instance.
Simplify the error handling, and throw the exception once the CN is updated with the new status.
Set the replica status to failed (not invalidated) when we get exceptions trying to read the object bytes. Not much of a difference, but only the CN, in theory, is supposed to be able to set the invalidated status.
Set the replication status to invalidated when we have a localId, but getting the object bytes fails for any reason.
Only call super.create() if there's no localId found on the MN (ie a replica is there from an out of band process).
Get the object inputstream from the local metacat instance using MetacatHandler.get() rather than MN.getReplica() so we don't throw an InvalidToken exception when passing in a null Session. The D1Client object is never used for this local call.
interpret permissions as hierarchicalhttps://redmine.dataone.org/issues/2150
remove flag for independent system metadata replication -- these entries are replicated along with the data/metadata objects or via hazelcast when the actual object is not on the server.
include SSL settings for client certificate-based replication
process the current revision, not the latest!use direct object/system metadata insertion for ORE maps.
allow other Metacat process (system metadata and ORE generation) to directly insert objects and system metadata without having to go through the MN/CN methods.
sort the docids so that "old" revisions are processed before newer ones
only attempt to unlock a lock if it was created (in the finally block)
new jars with many changes -- including new CN methods: ping, describe, listChecksumAlgorithm. Removed MN.setAccessPolicy. Refactored CN.setOwner() to CN.setRightsHolder().
refresh the SystemMetadata entry for EML and referenced data files when parsing EML access rules -- this ensures our in-memory system metadata map is up to date WRT the DB entries.
add revision history to the generated ORE objects -- we use the revision history of the EML package as a basis because the each ORE revision mirrors the revision of the EML package. Add a placeholder for checking if an equivalent ORE map exists in the DataONE infrastructure - this will be a call to CN.search() that looks at the solr index for OREs based on the EML package ID.
Update the parameter names expected for listObjects() to reflect the MN API changes in the architecture docs.
Change the query semantics such that we implement the MN.listObjects() where the lower datetime bound is inclusive (greater than or equal to" and the upper datetime bound in exclusive (less than). This allows easier paging in client applications.
In the call to MNReplication.replicate(), call back to CNReplication.setReplicationStatus() and set the status to failed when we get local exceptions, exceptions from the source MN when calling getReplica(). Send back an exception with a description when setting the status. Add a private setReplicationStatus() method to refactor these calls out.
Modify CNresourceHandler.setReplicationStatus() to use the new API signature, including the failure BaseException that is parsed out of the MMP as a file section. Log the exception message. Since this is an asynchronous call, ReplicationManager won't see a failed status, but the MNAuditTask eventually will.
Add collectReplicationStatus() to CNResourcHandler to return the BaseException or it's subclass, if any, provided in the the call to setReplicationStatus. The exception will be reported on the CN.
Change setReplicationStatus() to drop serialVersion and report the failure exception message in the CN log.
set SystemMetadata.archived=true on MN.deleteThere is ongoing discussion on what the exact behavior should be here, but this mimics Metacat's delete-as-archive action.http://redmine.dataone.org/issues/882
In MNodeService.replicate(), check to see if we have a replica (via an out of band channel) before we call sourceMN.getReplica().
only create guid->docid mapping during metadata replication if it does not already existhttp://bugzilla.ecoinformatics.org/show_bug.cgi?id=5520
do not treat access change as an update -- it should not attempt to retrieve the contents of the objecthttp://bugzilla.ecoinformatics.org/show_bug.cgi?id=5520
only create guid->docid mapping during data replication if it does not already existhttp://bugzilla.ecoinformatics.org/show_bug.cgi?id=5520
remove xml_acccess.docid reference (oops)http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5560
updated D1 API -- removed Permission.REPLICATE and associated parameters
process system metadata before access rules (access control is now driven by GUID so the mapping needs to be there)
Change the key of query result cache. The key now has the real search value.
include SerialVersion in describe responsehttps://redmine.dataone.org/issues/2135NOTE: d1 jars should be replaced once all schema changes are finalized and the generate d1_common code is committed to svn
ROLLBACK: check for non-public session in Metacat before showing the registry formhttp://bugzilla.ecoinformatics.org/process_bug.cgi
check for non-public session in Metacat before showing the registry formhttp://bugzilla.ecoinformatics.org/process_bug.cgi
include 'archived' system metadata element in backing DB store
add ; to end of update command
only update accessfileid for our new guid-based records
move latest postgres access upgrade statements to oracle scripthttp://bugzilla.ecoinformatics.org/show_bug.cgi?id=5560
include revision clause when updating the accessfileid on the xml_acccess table
remove docid column in favor of guidhttp://bugzilla.ecoinformatics.org/show_bug.cgi?id=5560
If a member node cannot be found in the node list matching the targetNodeSubject given in isNodeAuthorized(), throw a ServiceFailure exception.
Minor reformatting for readability.
update with latest d1_common/d1_lib (includes latest schema changes)
only handle 100 (consecutive!) docId generations per millisecond, otherwise the generated docid part is bigger than Long.MAX_VALUE and Metacat cannot fully handle that.
check previous revision when attempting to update access control with EML 2.0.x docshttp://bugzilla.ecoinformatics.org/show_bug.cgi?id=5560
remove old access rules for a data object when they are being updated by rules contained in an EML document. Now the OnlineDataAccessTest EML 2.1.0 tests pass.http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5560
construct the proper previousDocId when checking for update permission
for now, look up SystemMetadata directly from the table otherwise we won't have the latest access information. Need to refresh the in-memory copy everytime we edit the access policy via Metacat (includes EML parser)
check previous revision for permissions to update (includes data described by EML)
use correct "rev" column in xml_revisions table
refactor Metacat access handling to be on a per-revision basis so that it more closely aligns with the DataONE approachhttp://bugzilla.ecoinformatics.org/show_bug.cgi?id=5560
To avoid id generation conflicts happening at the same millisecond, append a 5 character random string to the end of the docid.
retry: add node name in the correct order for predicate navigationhttp://bugzilla.ecoinformatics.org/show_bug.cgi?id=5561
add node name in the correct order for predicate navigationhttp://bugzilla.ecoinformatics.org/show_bug.cgi?id=5561
handle queries with predicates correctly.when docids are used in the return field look up, we need to make sure they are included in the values in the correct order for their corresponding parameter place holders (?)
close prepared statement only if not nullhttp://bugzilla.ecoinformatics.org/show_bug.cgi?id=5562
fixed a bug that using a wrong table name - acces_log.
fixed a bug that using acces_log table name.
Use accessblock in setaccess method. So user can grant/revoke public readable access.
ensure that the revision list is ordered ascending in case someone changes the sql query without realizing that it matters...
set the byte size of the ORE map before adding it
set/update the obsoletes/obsoletedBy fields in system metadata so that we always have a complete revision history for each object.Note: ORE maps do not have revision history...yet(?)
order the revision list, ascending.
removing unused class -- can't find a reference to it and it's causing compilation errors for me.
for "all" permission, return a list of READ, WRITE, CHANGE_PERMISSION
generating ORE maps and creating/updating system metadata now. There are some Permission conversion issues to be worked out yet
look up access policy by guid or local idTODO: resolve the Metacat/EML "all" permission as it relates to DataONE (there is only READ, WRITE, CHANGE_PERMISSION). for now I am using CHANGE_PERMISSION when it is a Metacat "all"
make exception/error reporting clearer -- was getting lock messages when perhaps that was not the correct exception.
look up all docids is now a static method (ORE/SystemMetadata generation)
Add log statements for each call to ILock.unlock() for debugging.
evict the HazelCast SystemMetadata entry if we update the access control rules via Metacat's legacy API, otherwise stale SystemMetadata stays in memory instead of being looked up from the backing table store.
optionally include ORE generation/insertion into Metacat when generating SystemMetadatahttps://redmine.dataone.org/issues/2056
Set a default HazelcastInstance after init() is called, and use this instance in getLock() to acquire a lock in the cluster.
no need to cast docInfo entries to String -- they are all strings
set revision history, the create/update dates and the owner/submitter (correctly)
use shared method for looking up "docInfo" map -- both in Metacat replication and in D1 system metadata generation
make default formatting a little bit easier to read
reformat code -- no changes
refactor SystemMetadata creation into separate class from the MetacatHandler -- this will be shared by upgrade code and normal metacat api.
include all document revisions when generating "missing" system metadataTODO: revision graph captured in obsoletes/obsoletedBy
When using ILock.lock(), get a lock on the string value of the Identifier, not the Identifier object itself. Hazelcast locking won't work otherwise.
Use the Hazelcast ILock mechanism to lock the system metadata identifier rather than using IMap.lock(pid).
simplify SystemMetadata generation -- will be done during Metacat upgrade for D1 features/support.
clean up populator; use IOUtils library to do string<->stream conversions
Set sessionid for clientVeiwBean when it handle a request.Always set the order type to "allowFirst".
verify checksum when retrieving replica from another member node.https://redmine.dataone.org/issues/1794
make sure to get/put system metadata to the HZ map instead of using IdentifierManager directlyverified changes for: https://redmine.dataone.org/issues/1999
match documentation for the MN.describe() headerhttps://redmine.dataone.org/issues/1904
configure synch schedule in the admin screenhttps://redmine.dataone.org/issues/1933
look-up sych schedule from metacat properties instead of hardcoding themhttps://redmine.dataone.org/issues/1933