In the CN.notifyReplicaNodes method, if the target MN is a v2 MN, we use the v2 API; if it is a v1 node, we use the v1 API.
The setReplicationStatus method can be called by both CNs and MNs.
Only CNs can call these methods: CNCore.registerSystemMetadata(), CNCore.updateSystemMetadata(), CNReplication.setReplicationStatus(), CNReplication.updateReplicationMetadata(), CNReplication.deleteReplicationMetadata().
Use a NodeReference object instead of replicaStatus to restrict the results of the listObjects method.
Change the signature of the listObjects method: remove replicaStatus and add nodeId.
Add a new method, updateSystemMetadata. It can only be called by CNs.
Committed the changes that Andreit made: 1. Add the code for synchronize (not implemented). 2. Add the code for addForm.
Created the updateSystemMetadata method.
Remove the code to check sid on create and registerSystemMetadata.
Add the code to support CNView interface in CNodeService. Both CNodeService and MNodeService share the same code base.
The setObsoletedBy method only handles PIDs.
Call the method lock.lock() immediately after getting the lock. Otherwise, if an exception happens between the two calls (in other words, before lock.lock() is called), lock.unlock() can fail with the error "Current thread is not owner of lock!". See https://redmine.dataone.org/issues/6836.
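The ordering described in this entry can be sketched as follows. This is a minimal illustration, not the Metacat code: java.util.concurrent.locks.ReentrantLock stands in for the Hazelcast lock, and the class and method names (LockOrderingSketch, updateSafely, unlockWithoutOwning) are hypothetical. The error raised here by the standard library is IllegalMonitorStateException; the Hazelcast message quoted above is the analogous failure.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: ReentrantLock stands in for the Hazelcast lock.
class LockOrderingSketch {
    static final ReentrantLock LOCK = new ReentrantLock();

    // Acquire the lock right after obtaining it; everything that can
    // throw lives inside the try block, so finally always owns the lock.
    static String updateSafely(boolean failDuringWork) {
        ReentrantLock lock = LOCK;   // "getting the lock"
        lock.lock();                 // acquired before anything can throw
        try {
            if (failDuringWork) {
                throw new IllegalStateException("simulated failure");
            }
            return "updated";
        } catch (IllegalStateException e) {
            return "failed: " + e.getMessage();
        } finally {
            lock.unlock();           // safe: this thread holds the lock
        }
    }

    // What goes wrong when unlock() runs although lock() never did.
    static String unlockWithoutOwning() {
        ReentrantLock neverLocked = new ReentrantLock();
        try {
            neverLocked.unlock();
            return "no error";
        } catch (IllegalMonitorStateException e) {
            return "not owner";
        }
    }

    public static void main(String[] args) {
        System.out.println(updateSafely(false));
        System.out.println(updateSafely(true));
        System.out.println(unlockWithoutOwning());
    }
}
```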
The setReplicationStatus method only supports sid and the setRightsHolder method supports both PID and SID.
The CN.setReplicationPolicy method now only supports PIDs. Refer to https://redmine.dataone.org/issues/6734.
Add the rules to check whether a SID is valid in the updateSystemMetadata method.
The MN.listObjects and CN.listObjects methods now call the implementation in the D1NodeService class.
Add the code to check if the pid is an SID in the registerSystemMetadata method.
Add code to check if the sid equals pid for the method checkSidInRegisterSystemMetadata.
Moved the rules for the SID from D1NodeService.create to MNodeService.create. Also moved the code that checks the validity of a pid from D1NodeService.create to MNodeService.create, MNodeService.replicate and CNodeService.create.
Add the code to check if a sid is legitimate in the method create and registerSystemMetadata.
Add the code to handle SIDs in the v2 API methods setReplicationPolicy, setReplicationStatus, setAccessPolicy and setObsoletedBy.
Add the code to handle the SID in the delete and archive methods.
Fixed a bug in the IdentifierManager where the getLocalId method incorrectly swallowed an exception.
Add delete log for data objects on CNs.
Added code so that the NotFound exception informs users that the pid was deleted.
Remove the system metadata for data objects.
Persist the system metadata object in memory before deleting it from Hazelcast.
remove CN.systemMetadataChanged in favor of the CN.updateSystemMetadata method. Otherwise there's no good way to know where to fetch the auth copy from since the SM change might be to switch the authMN!
add support for v2 DataONE API.
do not set archived=false for all CN.create calls. The CN will use create() even when harvesting content that is new to it, and it needs to handle already-archived content. https://projects.ecoinformatics.org/ecoinfo/issues/6475
Change CNodeService.archive() to no longer broadcast MN.archive() calls to all of the replica MNs of a pid, but rather broadcast MN.systemMetadataChanged().
can only log events with a valid localId.
On changes to system metadata in CNodeService and DocumentImpl, increment the serialVersion.
Change CNodeService's archive() and delete() methods to only update Member Nodes in the replica list (not CNs!), since calling CN.archive() again would cause an infinite loop. Thanks for catching this Ben.
Update CNodeService.delete() and .archive() to handle situations where the pid is of formatType DATA and is therefore not registered in the identifier table, which caused NotFound exceptions. For DATA objects, we just update the system metadata now, and for all other objects (METADATA, RESOURCE), we continue to use super.{delete()|archive()}. Also, log the delete/archive into the event log....
Remove the broadcastSystemMetadataChange() method since it was a duplicate of notifyReplicaNodes(). Consolidated now.
Add the isAuthoritativeMNodeAdmin method. It applies to both CN and MN methods.
On calls to archive(), log the correct call (not delete()).
Merging the METACAT_2_0_6_BRANCH changes for [M|C]NodeService into the trunk.
allow verification date to be updated for replicas (patch from Skye). https://redmine.dataone.org/issues/3699
Set the session to null so that the call uses the CN certificate when calling MN.systemMetadataChanged();
To keep all nodes up to date with regard to system metadata changes, add the broadcastSystemMetadataChange() method that finds replica MNs in the node list and calls systemMetadataChanged(). Modify setReplicationStatus() and updateReplicationMetadata() to fire this off when a replica status changes to completed. We may decide to inform MNs at other times too, but this is a conservative amount of calls going to the MNs for now.
make sure to call lock() on the SM when updating rightsholder (like every other method that gets a lock object from HZ).
CN.search() is not implemented by metacat -- making that explicit and also testing for it.
limit /log and /object calls to a configurable maximum count for paging. defaults to the existing Metacat value of 7000
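A paging cap of this kind might look like the sketch below. The class and method names are hypothetical, and falling back to the maximum when the caller supplies no count (or a negative one) is an assumption, not behavior stated in this entry. Only the default of 7000 comes from the log.

```java
// Hypothetical sketch of capping a requested page size.
class PagingLimitSketch {
    // Existing Metacat default mentioned in the entry above; in practice
    // the value would be read from configuration.
    static final int DEFAULT_MAX_COUNT = 7000;

    // Returns the number of records to serve for one /log or /object call.
    // Assumption: a missing or negative count falls back to the maximum.
    static int effectiveCount(Integer requested, int maxCount) {
        if (requested == null || requested < 0) {
            return maxCount;
        }
        return Math.min(requested, maxCount);
    }

    public static void main(String[] args) {
        System.out.println(effectiveCount(10000, DEFAULT_MAX_COUNT));
        System.out.println(effectiveCount(100, DEFAULT_MAX_COUNT));
        System.out.println(effectiveCount(null, DEFAULT_MAX_COUNT));
    }
}
```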
In CNodeService.updateReplicationMetadata(), we are setting the replicaVerifiedDate() when we update or wholesale add a new replica. However, in setReplicationStatus(), we only do so when there's a new entry. Change setReplicationStatus() to also update the replicaVerifiedDate on updates of existing entries to be more consistent with other changes. This affects node prioritization based on this date timestamp. Thanks to Skye for pointing this out.
Update d1_common_java and d1_libclient_java to the newest jar files. Add methods to CNodeService to throw NotImplemented exceptions for query(), listQueryEngines(), and getQueryEngineDescription() since these API calls are handled outside of metacat.
Oops, previous commit suffered from a happy trigger finger. During deleteReplicationMetadata(), don't delete the replica on the replica Member Node. Call CN.delete() for that functionality. This call just updates system metadata (according to the API description).
In setReplicationStatus() and updateReplicationMetadata(), don't allow a status state change from COMPLETED to anything other than INVALIDATED. This prevents the completed status from being overwritten due to race conditions.
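The transition rule in this entry can be expressed as a small guard. This is a sketch under assumptions: the local Status enum stands in for the DataONE replication status type, the names are hypothetical, and treating COMPLETED to COMPLETED as an allowed no-op is an assumption the entry does not settle.

```java
// Hypothetical sketch of the COMPLETED-status guard.
class ReplicationStatusRuleSketch {
    // Local stand-in for the DataONE replication status values.
    enum Status { QUEUED, REQUESTED, COMPLETED, FAILED, INVALIDATED }

    // Once a replica is COMPLETED, only INVALIDATED may replace it.
    // Assumption: re-setting COMPLETED is a harmless no-op and is allowed.
    static boolean isTransitionAllowed(Status current, Status requested) {
        if (current == Status.COMPLETED) {
            return requested == Status.INVALIDATED
                    || requested == Status.COMPLETED;
        }
        return true; // other transitions are not restricted by this rule
    }

    public static void main(String[] args) {
        System.out.println(isTransitionAllowed(Status.COMPLETED, Status.FAILED));
        System.out.println(isTransitionAllowed(Status.COMPLETED, Status.INVALIDATED));
        System.out.println(isTransitionAllowed(Status.REQUESTED, Status.COMPLETED));
    }
}
```

A caller would throw when the guard returns false, which is the direction the following entry corrects.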
Throw an exception when NOT allowed, not when allowed =).
Add a few logging statements for round trip replication metrics.
remove exception from method decl - was not matching the interface def and not compiling.
implement MN and CN.archive() method -- really just the existing delete() methods. https://redmine.dataone.org/issues/2674 https://redmine.dataone.org/issues/2675
call MN.delete() for each replica when CN.delete() is called. https://redmine.dataone.org/issues/2676
include Session-less interface methods and updated jars that define them.
remove extraneous pid and permission parameters from isAdminAuthorized() method and make public so that it can be called in other locations - namely before our asynchronous replicate() implementation on the MN.
check for an empty or null (missing) node.subjectList. This should probably be a required element in the D1 schema, but it appears not to be. (The ORNL entry was missing subjects in the cn-dev environment.)
just use the e.getMessage() as e.getCause() may be null (seeing NPE when testing via the MN IT tester)
Also allow MNs to set the FAILED status in setReplicationStatus(). This was an oversight on my part, trying to keep MNs that truly did succeed from overriding the COMPLETED status with FAILED.
use isAdminAuthorized() to check access to CN.create(). Note this method takes a pid and permission parameter and neither is used. Also removed the NotFound exception because it would never come up.
check that caller is CN/admin for CN.delete(). https://redmine.dataone.org/issues/2506
include CN.delete(). https://redmine.dataone.org/issues/2506
Notify each replica MN when critical portions of system metadata change so the MN can pull the latest copy into its store. AccessPolicy and RightsHolder changes are the most critical for the MN to keep updated on.
Modify CNodeService.setReplicationStatus() slightly to restrict MN-based calls to only set the status to COMPLETED. The CNs should be setting failures or invalidations, or the status can remain at QUEUED or REQUESTED, and the MNAuditTask can revisit those replicas as needed.
Add a notifyReplicaNodes() method that calls MNStorage.systemMetadataChanged() on MN replica nodes for a given object identifier. This will be called when there are changes to AccessPolicy and rights holder since these are critical access metadata for an MN, but they can only be changed on the CN.
In setReplicationStatus(), first check for a replica target MN subject match with the session subject. If this fails, look to see if CN admin access is allowed. Otherwise throw NotAuthorized. Addresses https://redmine.dataone.org/issues/2494
Remove individual calls to isAdminAuthorized() in favor of the centralized isAuthorized() call that handles it now.
check for null Session before continuing with setReplicationStatus(). https://redmine.dataone.org/issues/2476#note-3
throw not authorized when attempting to getReplica as an invalid/non-existent node
throw InvalidToken when an invalid Permission is passed in. This requires that internal calls to the method also check for this exception. https://redmine.dataone.org/issues/2388
CNodeService.listChecksumAlgorithms() was returning null rather than the list. Fixed.
use RC-1 Dataone jars
For MNs that haven't set the archived flag to false on create(), set it here. Also, ensure that the CN sync code sets the authoritative and origin member node fields.
include new methods needed for replication (in new d1 jars). https://redmine.dataone.org/issues/2203
add method: setObsoletedBy (https://redmine.dataone.org/issues/2185); augment new method: deleteReplicationMetadata
add method: deleteReplicationMetadata; remove method: assertRelation; update the D1 jars. https://redmine.dataone.org/issues/2187 https://redmine.dataone.org/issues/2158
Simplify setReplicationStatus() to not call updateReplicationMetadata() if a replica doesn't exist. Just create it and update the system metadata, which we already have a lock for.
Update the CN methods to throw a VersionMismatch where the API changed (where serialVersion is a required parameter). These were previously throwing an InvalidRequest exception. Change the exception handling for calls to Hazelcast to catch a RuntimeException (not Exception) so we don't catch exceptions that we purposefully throw....
Don't lock() on the map.get() in isNodeAuthorized() (this assumes that the CN has queued the task already). Add more lock/unlock debug statements, and fix setReplicationStatus() - I missed a finally statement to unlock the pid.
Modify CNReplication methods setReplicationStatus(), updateReplicationMetadata() and setReplicationPolicy() to allow administrative access from a Coordinating Node by calling isAdminAuthorized().
In registerSystemMetadata(), lock the pid prior to calling map.containsKey(pid) since a put to the map could occur between the check and the subsequent put().
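The check-then-put race this entry closes can be illustrated with the sketch below. It is not the Metacat implementation: a plain HashMap and a single ReentrantLock stand in for the Hazelcast map and the per-pid lock, and all names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: lock first, then check and put as one guarded step.
class RegisterUnderLockSketch {
    private final Map<String, String> systemMetadata = new HashMap<>();
    // Stand-in for the lock taken on the pid.
    private final ReentrantLock pidLock = new ReentrantLock();

    // Returns true if the entry was registered, false if the pid existed.
    boolean register(String pid, String sysmeta) {
        pidLock.lock(); // taken before containsKey(), not between check and put
        try {
            if (systemMetadata.containsKey(pid)) {
                return false; // nobody can have added it since we locked
            }
            systemMetadata.put(pid, sysmeta);
            return true;
        } finally {
            pidLock.unlock();
        }
    }

    String get(String pid) {
        return systemMetadata.get(pid);
    }

    public static void main(String[] args) {
        RegisterUnderLockSketch store = new RegisterUnderLockSketch();
        System.out.println(store.register("pid.1", "first"));
        System.out.println(store.register("pid.1", "second"));
        System.out.println(store.get("pid.1"));
    }
}
```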
Use Lock instead of ILock to be consistent across classes.
After reviewing CNodeService and D1NodeService prompted by Robert comparing the Hazelcast locking with the d1_synchronization locking, I've made a number of changes that will prevent locking problems:
1) Multiple methods contained try/catch blocks that would:...
only attempt to unlock a lock if it was created (in the finally block)
new jars with many changes -- including new CN methods: ping, describe, listChecksumAlgorithm. Removed MN.setAccessPolicy. Refactored CN.setOwner() to CN.setRightsHolder().
Change setReplicationStatus() to drop serialVersion and report the failure exception message in the CN log.
updated D1 API -- removed Permission.REPLICATE and associated parameters
If a member node cannot be found in the node list matching the targetNodeSubject given in isNodeAuthorized(), throw a ServiceFailure exception.
Add log statements for each call to ILock.unlock() for debugging.
When using ILock.lock(), get a lock on the string value of the Identifier, not the Identifier object itself. Hazelcast locking won't work otherwise.
Use the Hazelcast ILock mechanism to lock the system metadata identifier rather than using IMap.lock(pid).
when comparing D1 Subject objects, use the equals() method, not direct string comparison. https://redmine.dataone.org/issues/2050
access nodeList list correctly. https://redmine.dataone.org/issues/2049
Use Subject.equals() when comparing DNs rather than CertificateManager.equalsDN(). Don't lock the pid in isNodeAuthorized() to debug for timeout issues. Minor debugging changes.
Minor logging for isNodeAuthorized(), and compare subjects properly. Change this to Subject.compareTo() when it is vetted.
Catch RuntimeExceptions thrown by Hazelcast as opposed to general Exceptions so we don't catch exceptions we're trying to throw.
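The effect of narrowing the catch clause can be sketched as follows, assuming the deliberately thrown exceptions are checked exceptions. The exception classes here are local stand-ins with hypothetical names, and IllegalStateException simulates a runtime failure from the data store.

```java
// Hypothetical sketch: catch only runtime failures so that exceptions
// thrown on purpose inside the try block propagate unchanged.
class ExceptionScopeSketch {
    // Local stand-ins for deliberately thrown, checked service exceptions.
    static class NotAuthorizedSketch extends Exception {
        NotAuthorizedSketch(String msg) { super(msg); }
    }
    static class ServiceFailureSketch extends Exception {
        ServiceFailureSketch(String msg, Throwable cause) { super(msg, cause); }
    }

    static void update(boolean authorized, boolean storeFails)
            throws NotAuthorizedSketch, ServiceFailureSketch {
        try {
            if (!authorized) {
                throw new NotAuthorizedSketch("not authorized"); // intentional
            }
            if (storeFails) {
                // Simulates a runtime failure raised by the store.
                throw new IllegalStateException("store unavailable");
            }
        } catch (RuntimeException e) {
            // Only unexpected runtime failures are wrapped; cause is kept.
            throw new ServiceFailureSketch(e.getMessage(), e);
        }
    }

    // Small helper that reports which path was taken.
    static String outcome(boolean authorized, boolean storeFails) {
        try {
            update(authorized, storeFails);
            return "ok";
        } catch (NotAuthorizedSketch e) {
            return "NotAuthorized";
        } catch (ServiceFailureSketch e) {
            return "ServiceFailure: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(outcome(true, false));
        System.out.println(outcome(false, false));
        System.out.println(outcome(true, true));
    }
}
```

With `catch (Exception e)` instead, the intentional NotAuthorizedSketch would have been wrapped as a service failure as well.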
generalize exception handling -- add cause detail
Changes to setReplicationStatus and isNodeAuthorized(), working out minor bugs in replication.
include exception cause when throwing new exception (combine RuntimeException in Exception handling -- they are almost identical)