Use a Logger instead of System.out for SystemMetadataMap.
Don't lock() on the map.get() in isNodeAuthorized() (this assumes that the CN has queued the task already). Add more lock/unlock debug statements, and fix setReplicationStatus() - I missed a finally statement to unlock the pid.
Modify CNReplication methods setReplicationStatus(), updateReplicationMetadata() and setReplicationPolicy() to allow administrative access from a Coordinating Node by calling isAdminAuthorized().
Add isAdminAuthorized() to D1NodeService to check if the operation is being requested from a CN. Consult the NodeList from the CN and test the NodeType of the given node and the X509 certificate Subject. Perhaps we should expand this to also check for service-level access in the future.
In registerSystemMetadata(), lock the pid prior to calling map.containsKey(pid) since a put to the map could occur between the check and the subsequent put().
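The check-then-put race described above can be sketched as follows. This is a minimal illustration, not the Metacat code: RegistrySketch, lockFor() and register() are hypothetical names, and a local ReentrantLock per pid stands in for the cluster-wide lock.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the system metadata registry; not the Metacat code.
final class RegistrySketch {
    private final Map<String, String> metadataByPid = new ConcurrentHashMap<>();
    private final Map<String, Lock> locksByPid = new ConcurrentHashMap<>();

    // One lock per pid string, created on first use.
    private Lock lockFor(String pid) {
        return locksByPid.computeIfAbsent(pid, k -> new ReentrantLock());
    }

    // Returns true if the pid was registered, false if it already existed.
    boolean register(String pid, String metadata) {
        Lock lock = lockFor(pid);
        lock.lock(); // acquire BEFORE the existence check, not between check and put
        try {
            if (metadataByPid.containsKey(pid)) {
                return false;
            }
            metadataByPid.put(pid, metadata);
            return true;
        } finally {
            lock.unlock(); // always release, even if put() throws
        }
    }

    String get(String pid) {
        return metadataByPid.get(pid);
    }
}
```

Because the lock is taken before containsKey(), no other caller holding the same pid lock can insert between the check and the put().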
Use Lock instead of ILock to be consistent across classes.
After reviewing CNodeService and D1NodeService prompted by Robert comparing the Hazelcast locking with the d1_synchronization locking, I've made a number of changes that will prevent locking problems:
1) Multiple methods contained try/catch blocks that would:...
use inherited access control from EML for the data file we download from a remote source http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5522
download remote data and save locally when it is referenced by an EML package, then include it in the ORE map. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5522
expand permissions on the existing access rule, not on the permission being checked. (hierarchical permissions)
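A minimal sketch of hierarchical permission expansion, assuming an ordering of READ < WRITE < CHANGE_PERMISSION. PermissionSketch, expand() and isAllowed() are hypothetical names for illustration and are not the Metacat or DataONE API.

```java
import java.util.EnumSet;
import java.util.Set;

final class PermissionSketch {
    // Assumed ordering, weakest to strongest.
    enum Permission { READ, WRITE, CHANGE_PERMISSION }

    // Expand the permission on the existing access rule:
    // a granted permission implies itself and every weaker one.
    static Set<Permission> expand(Permission granted) {
        return EnumSet.range(Permission.READ, granted);
    }

    // The requested permission is checked as-is against the expanded rule.
    static boolean isAllowed(Permission granted, Permission requested) {
        return expand(granted).contains(requested);
    }
}
```

Expanding the requested permission instead would wrongly let a READ rule satisfy a WRITE check, which is why the expansion is applied to the rule.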
Make sure the local id isn't null when we try to get the object from the local instance.
Simplify the error handling, and throw the exception once the CN is updated with the new status.
Set the replica status to failed (not invalidated) when we get exceptions trying to read the object bytes. Not much of a difference, but only the CN, in theory, is supposed to be able to set the invalidated status.
Set the replication status to invalidated when we have a localId, but getting the object bytes fails for any reason.
Only call super.create() if there's no localId found on the MN (a localId means a replica is already there from an out-of-band process).
Get the object inputstream from the local metacat instance using MetacatHandler.get() rather than MN.getReplica() so we don't throw an InvalidToken exception when passing in a null Session. The D1Client object is never used for this local call.
interpret permissions as hierarchical https://redmine.dataone.org/issues/2150
process the current revision, not the latest! use direct object/system metadata insertion for ORE maps.
allow other Metacat process (system metadata and ORE generation) to directly insert objects and system metadata without having to go through the MN/CN methods.
only attempt to unlock a lock if it was created (in the finally block)
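The guarded unlock can be sketched like this. GuardedUnlockSketch and runLocked() are hypothetical names, and the Supplier stands in for whatever factory hands out the lock; the real code may track this differently.

```java
import java.util.concurrent.locks.Lock;
import java.util.function.Supplier;

final class GuardedUnlockSketch {
    // Runs the task under a lock obtained from lockSource (hypothetical factory).
    // Returns true if the task ran to completion, false if obtaining the lock
    // or running the task threw a RuntimeException.
    static boolean runLocked(Supplier<Lock> lockSource, Runnable task) {
        Lock lock = null;
        boolean locked = false;
        try {
            lock = lockSource.get(); // may throw before any lock exists
            lock.lock();
            locked = true;
            task.run();
            return true;
        } catch (RuntimeException e) {
            return false;
        } finally {
            // Only unlock a lock that was actually created and acquired.
            if (lock != null && locked) {
                lock.unlock();
            }
        }
    }
}
```

Without the guard, a failure while creating the lock would surface as a NullPointerException from the finally block and hide the original error.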
new jars with many changes -- including new CN methods: ping, describe, listChecksumAlgorithm. Removed MN.setAccessPolicy. Refactored CN.setOwner() to CN.setRightsHolder().
add revision history to the generated ORE objects -- we use the revision history of the EML package as a basis because each ORE revision mirrors the revision of the EML package. Add a placeholder for checking if an equivalent ORE map exists in the DataONE infrastructure - this will be a call to CN.search() that looks at the solr index for OREs based on the EML package ID.
In the call to MNReplication.replicate(), call back to CNReplication.setReplicationStatus() and set the status to failed when we get local exceptions or exceptions from the source MN when calling getReplica(). Send back an exception with a description when setting the status. Add a private setReplicationStatus() method to refactor these calls out.
Change setReplicationStatus() to drop serialVersion and report the failure exception message in the CN log.
set SystemMetadata.archived=true on MN.delete. There is ongoing discussion on what the exact behavior should be here, but this mimics Metacat's delete-as-archive action. http://redmine.dataone.org/issues/882
In MNodeService.replicate(), check to see if we have a replica (via an out of band channel) before we call sourceMN.getReplica().
updated D1 API -- removed Permission.REPLICATE and associated parameters
include SerialVersion in describe response https://redmine.dataone.org/issues/2135 NOTE: d1 jars should be replaced once all schema changes are finalized and the generated d1_common code is committed to svn
If a member node cannot be found in the node list matching the targetNodeSubject given in isNodeAuthorized(), throw a ServiceFailure exception.
update with latest d1_common/d1_lib (includes latest schema changes)
for now, look up SystemMetadata directly from the table; otherwise we won't have the latest access information. Need to refresh the in-memory copy every time we edit the access policy via Metacat (includes EML parser)
refactor Metacat access handling to be on a per-revision basis so that it more closely aligns with the DataONE approach http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5560
ensure that the revision list is ordered ascending in case someone changes the sql query without realizing that it matters...
set the byte size of the ORE map before adding it
set/update the obsoletes/obsoletedBy fields in system metadata so that we always have a complete revision history for each object. Note: ORE maps do not have revision history...yet(?)
generating ORE maps and creating/updating system metadata now. There are some Permission conversion issues to be worked out yet
make exception/error reporting clearer -- was getting lock messages when perhaps that was not the correct exception.
Add log statements for each call to ILock.unlock() for debugging.
evict the Hazelcast SystemMetadata entry if we update the access control rules via Metacat's legacy API, otherwise stale SystemMetadata stays in memory instead of being looked up from the backing table store.
optionally include ORE generation/insertion into Metacat when generating SystemMetadata https://redmine.dataone.org/issues/2056
Set a default HazelcastInstance after init() is called, and use this instance in getLock() to acquire a lock in the cluster.
no need to cast docInfo entries to String -- they are all strings
set revision history, the create/update dates and the owner/submitter (correctly)
use shared method for looking up "docInfo" map -- both in Metacat replication and in D1 system metadata generation
make default formatting a little bit easier to read
reformat code -- no changes
refactor SystemMetadata creation into separate class from the MetacatHandler -- this will be shared by upgrade code and normal metacat api.
When using ILock.lock(), get a lock on the string value of the Identifier, not the Identifier object itself. Hazelcast locking won't work otherwise.
Use the Hazelcast ILock mechanism to lock the system metadata identifier rather than using IMap.lock(pid).
verify checksum when retrieving replica from another member node. https://redmine.dataone.org/issues/1794
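A minimal sketch of checksum verification on a retrieved stream, using java.security.MessageDigest. ChecksumSketch, hexDigest() and matches() are hypothetical names; how the expected checksum and algorithm are read from system metadata is not shown here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class ChecksumSketch {
    // Streams the object through the digest and returns a lowercase hex string.
    static String hexDigest(InputStream in, String algorithm) {
        try {
            MessageDigest digest = MessageDigest.getInstance(algorithm);
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unsupported algorithm: " + algorithm, e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // True when the computed digest equals the expected one (hex, case-insensitive).
    static boolean matches(InputStream in, String algorithm, String expectedHex) {
        return hexDigest(in, algorithm).equalsIgnoreCase(expectedHex);
    }
}
```

A caller would reject the replica when matches() returns false rather than storing the bytes.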
make sure to get/put system metadata to the HZ map instead of using IdentifierManager directly. verified changes for: https://redmine.dataone.org/issues/1999
look up synch schedule from metacat properties instead of hardcoding them https://redmine.dataone.org/issues/1933
when comparing D1 Subject objects, use the equals() method not direct string comparison https://redmine.dataone.org/issues/2050
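To illustrate why direct string comparison of certificate subjects is fragile, here is a sketch using the JDK's javax.security.auth.x500.X500Principal, whose equals() compares canonical forms. This is not the DataONE Subject class, and whether Subject.equals() normalizes the same way is an assumption; DnCompareSketch and sameDn() are hypothetical names.

```java
import javax.security.auth.x500.X500Principal;

final class DnCompareSketch {
    // Compares two distinguished names by their canonical form,
    // so spacing after commas and attribute keyword case do not matter.
    static boolean sameDn(String a, String b) {
        return new X500Principal(a).equals(new X500Principal(b));
    }
}
```

For example, "CN=Test User,O=Example,C=US" and "cn=Test User, o=Example, c=US" name the same subject, yet String.equals() reports them as different.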
access nodeList list correctly https://redmine.dataone.org/issues/2049
Use Subject.equals() when comparing DNs rather than CertificateManager.equalsDN(). Don't lock the pid in isNodeAuthorized() to debug for timeout issues. Minor debugging changes.
Minor logging for isNodeAuthorized(), and compare subjects properly. Change this to Subject.compareTo() when it is vetted.
check for authenticated and verified user permissions
throw NotAuthorized when there is no session
Catch RuntimeExceptions thrown by Hazelcast as opposed to general Exceptions so we don't catch exceptions we're trying to throw.
generalize exception handling -- add cause detail
Changes to setReplicationStatus and isNodeAuthorized(), working out minor bugs in replication.
include exception cause when throwing new exception (combine RuntimeException in Exception handling -- they are almost identical)
throw InvalidToken when session is null
Send the correct node id (the target node) when calling setReplicationStatus()
check obsoletes and obsoletedBy PIDs when updating objects
delete system metadata when MN.delete() is called.
throw InvalidToken when there is no session (certificate) provided in update() and delete() methods.
Calls to setReplicationStatus() can only be made by a CN or the MN that is the target replica node. Implement this service restriction in CNodeService using CertificateManager's equalsDN() method.
Added stack trace debugging for CNodeService.isNodeAuthorized() for tracking down replication issues.
Use a session object that is set to null when calling CNode.setReplicationStatus()
Add debugging code to MNodeService.getReplica().
Set a new Session object to null, to be overwritten by the CertificateManager session information from the X.509 certificate.
Fix cast to List<Node> in isNodeAuthorized().
upgrade to 1.0.1-SNAPSHOT DataONE jars
Update methods in MNodeService to reflect the modifications of the MN API with regard to exceptions being raised. Largely removed InvalidRequest from a number of methods, and threw an appropriate NotFound or ServiceFailure instead.
D1NodeService get(), getSystemMetadata(), and isAuthorized() no longer throw InvalidRequest.
Add in the systemMetadataChanged() method in MNodeService to respond to notifications. Only allow subjects from CNs listed in the node list to make the call. Update the local copy of the system metadata document for the given pid.
Include the serialVersion in the call to CN.setReplicationStatus() after replicating data.
make MNodeServiceTest pass JUnit testing
Update CNodeService to use the serialVersion parameter and compare it to the current serialVersion of the system metadata found in the hzSystemMetadata map. Throw an InvalidRequest exception if they are not equal. This affects updateReplicationMetadata(), setReplicationStatus(), setReplicationPolicy(), setAccessPolicy(), and setOwner().
Add updateReplicationMetadata() to the CN service implementation. This was missing from the API, and likely never called. It fully replaces the given replica item in the list of replicas in system metadata.
getReplica() should log replication events as DataONE Types.Event.replicate (vs 'getreplica')
Minor indentation cleanup.
Modify isAuthorized() to get the most up to date system metadata from the hzSystemMetadata map.
Add a placeholder setAccessPolicy() method in MNodeService that throws NotImplemented since this method is being deprecated. Note: need to confirm that this shouldn't be calling D1Client.getCN().setAccessPolicy().
Update getSystemMetadata() to lock(); get(); unlock() to ensure we have the latest version of system metadata from the hzSystemMetadata map. Remove the setAccessPolicy() method since it is being deprecated in the MNAuthorization API. Change insertSystemMetadata() to use a finer grained Date object on insertion. Locking of the pid happens in the subclass prior to the insert.
Add setAccessPolicy() to CNodeService since the CN should only make changes to access policies for objects registered with the D1 system. Increment the serial version after locking and getting the most up-to-date system metadata. Note: CCIT meeting decision says the serial version of the system metadata (during the change) should equal the current serial version, but setAccessPolicy() does not pass in the entire system metadata object, so there's no way to check. For now, increment the latest system metadata from the hzSystemMetadata map.
In CNodeService, separate the CN.create() functionality from the MN.create() functionality while still using the superclass to call create(). Deal with Hazelcast locks and setting serial versions only in the CN implementation.
Change updateSystemMetadata() to evaluate the incoming system metadata serial version against that found in the hzSystemMetadata map. If they are the same, do the update. If not, throw an InvalidRequest explaining that they need the most current version.
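The serial version check above amounts to an optimistic concurrency test, sketched below. SerialVersionSketch, create() and update() are hypothetical names; IllegalStateException stands in for InvalidRequest, synchronized stands in for the per-pid lock, and incrementing the version on success is an assumption taken from the setAccessPolicy() message.

```java
import java.util.HashMap;
import java.util.Map;

final class SerialVersionSketch {
    // Minimal stand-in for a stored system metadata record.
    private static final class Entry {
        final long serialVersion;
        final String body;

        Entry(long serialVersion, String body) {
            this.serialVersion = serialVersion;
            this.body = body;
        }
    }

    private final Map<String, Entry> store = new HashMap<>();

    // Registers a new record at serial version 1.
    synchronized void create(String pid, String body) {
        store.put(pid, new Entry(1L, body));
    }

    // Applies the update only if the caller holds the current serial version.
    // Returns the new serial version.
    synchronized long update(String pid, long callerSerialVersion, String newBody) {
        Entry current = store.get(pid);
        if (current == null) {
            throw new IllegalArgumentException("Unknown pid: " + pid);
        }
        if (current.serialVersion != callerSerialVersion) {
            // Stand-in for InvalidRequest: the caller needs the most current version.
            throw new IllegalStateException("Stale serial version " + callerSerialVersion
                    + ", current is " + current.serialVersion);
        }
        Entry next = new Entry(current.serialVersion + 1, newBody);
        store.put(pid, next);
        return next.serialVersion;
    }
}
```

A caller that is rejected would fetch the latest system metadata, reapply its change, and retry with the new serial version.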
Modify CNodeService's registerSystemMetadata() with support for SystemMetadata's serialVersion field. Also, use the hzSystemMetadata map for all system metadata reads using a lock on the pid in order to get the very latest version. This affected isNodeAuthorized(), getChecksum(), and assertRelation(). Since we're using Hazelcast, exceptions are masked as RuntimeException, so throw a ServiceFailure with the underlying message.
Modify CNodeService's updateSystemMetadata(), setReplicationStatus(), setReplicationPolicy(), and setOwner() with support for SystemMetadata's serialVersion field. Other methods still pending an update. Use the hzSystemMetadata map for all system metadata reads using a lock on the pid in order to get the very latest version.
SystemMetadataManager's functionality is handled by IdentifierManager. Removing it and its test.
MetadataTypeRegister is now replaced by ObjectFormatService. Removing it and its test.
move to the DataONE 1.0.0-SNAPSHOT
Configure and use CertificateManager in order to act as the MN when performing replicate() and getReplica() methods.
add User-Agent logging to support D1 requirements
Add debugging output to MNodeService.
update D1 jars to include recent SubjectList -> SubjectInfo refactoring and the SUBJECT_PUBLIC constant