refactor D1-specific upgrade utilities into their own package
remove createAndInsertSystemMetadata() method that acts on a single localId -- incorporated this into the localId-list-based method.
refactor IdentifierManager.createSystemMetadata(sm) to be insertSystemMetadata(sm) so that it is clear that this method inserts the SM object into the backing store. This differentiates it from the "generation" methods we use when we need to create SM about pre-existing objects or objects we get from non-D1 API calls.
generate SystemMetadata during D1 registration (not 2.0.0 upgrade). This process runs in a thread and updates a metacat.properties value when it is complete.
getMultipartParameters() outside of debug block -- thanks Mark Reyes @ CDL for catching this.
dataone configuration and registration enhancements: include flag to disable D1 services (currently only the MN side enforces this); do not allow multiple registration attempts if we have just submitted and are awaiting Node verification by the CN; do not allow configuration "bypass" if D1 settings have been configured previously....
use correct Collections import
Show "Update" button if this MemberNodeId is already registered with DataONE, otherwise use the "Register" label
match changes to MN service methods (return type as boolean)
Added new methods to generate a default replication policy based on properties from the metacat configuration. This is called during system metadata creation for objects that lack any system metadata.
Modify admin configuration to include default replication policy. Extensively revised the DataONE configuration page, including new wording for intro, improved tooltips throughout, new arrangement of sections, and other cosmetic changes.
Clean up warnings in class.
Remove ability to edit NodeID from D1 configuration page. Fix update of contactSubject and dataone.ore.generated property name.
handle "BIN" objects so as to avoid repeated calls to lookup the non-existent ObjectFormat
do not wait for SM generation to complete during the upgrade -- this way the web UI won't hang for days. The process sets a metacat property when it is complete.
do not shutdown hazelcast -- it needs to be running after the upgrade process so that Metacat actually works. I think the newer version of HZ makes it so the threads are all released as needed. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5572
Commenting out the parts of the upgrade script that started to refer to EZID. At this point, the registration of EZID identifiers will be done out-of-band with respect to the upgrade.
use plain String parameter for {pid} instead of XML serialization of it.
remove {pid} from POST URL on CN.registerSystemMetadata() https://redmine.dataone.org/issues/2284
remove {pid} from POST URL on CN.create() https://redmine.dataone.org/issues/2284
remove {pid} from POST URL on MN.create() https://redmine.dataone.org/issues/2284
catch cases where the previous/next revision of objects have not had system metadata generated yet
create system metadata object if it wasn't found in HZ
process systemMetadata from the docInfo string before writing to the database so that we guarantee guid-docid mapping exists before attempting to look it up.
upgrade to hazelcast 1.9.4.6 so that threadpools are released when not needed (http://code.google.com/p/hazelcast/issues/detail?id=765). include ant target to run a specific main class (mostly for debugging)
use File.deleteOnExit() not a half hour timer thread to do it.
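A minimal sketch of the deleteOnExit() approach mentioned above, using only the JDK. The class name, method name, and file prefix are hypothetical and are not taken from the Metacat source.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical helper; names are illustrative only.
final class TempFileSketch {

    // Writes the given bytes to a new temporary file and registers the file
    // for deletion on normal JVM exit, so no cleanup timer thread is needed.
    static File writeTemp(byte[] content) throws IOException {
        File tmp = File.createTempFile("metacat-sketch-", ".tmp");
        tmp.deleteOnExit();
        Files.write(tmp.toPath(), content);
        return tmp;
    }
}
```

Note that deleteOnExit() only runs on normal JVM termination, so in a long-running server the files remain until shutdown.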
multithreaded implementation for processing docids for system metadata generation. need to investigate ant/junit running that deadlocks hazelcast (config?)
additional logging of the config file being used - seem to have thread locking on the xmlConfig use when running under ant/junit
calculate object size using the size on the file system rather than re-reading as an input stream. Now only EML document bytes will be read twice: once for the checksum and again for parsing out datapackage details
system metadata generation optionally skips entries that have already been generated (data size, checksum) but allows the latest EML that describes them to have the last word on object format
remove DOM for parsing -- the D1 EML parser still uses DOM, so this may not be too big of a performance improvement
only attempt to update date-like nodedata values.
include generate system metadata upgrade in the success flag
more clean up - reuse prepared statement for data update
look up nodedata values first, then update each one - trying to avoid out of memory exception.
rollback processing Error change -- creates a loop on error. ugh
report processing errors after exceptions have been caught and recorded, otherwise the web UI is blank and there is no clue what happened unless you look in the logs.
fix a bug in MNodeService.replicate() where the checksum value was being compared to the computed checksum object, not its value.
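The kind of bug described above (comparing a checksum string to a checksum object rather than to its value) can be sketched as follows. The Checksum class here is a hypothetical stand-in, not the DataONE type, and the method names are illustrative.

```java
import java.util.Objects;

final class ChecksumCompareSketch {

    // Hypothetical stand-in for a checksum type holding an algorithm and a value.
    static final class Checksum {
        final String algorithm;
        final String value;

        Checksum(String algorithm, String value) {
            this.algorithm = algorithm;
            this.value = value;
        }
    }

    // Buggy form: a String is never equal to a Checksum object,
    // so this returns false even when the values match.
    static boolean matchesBuggy(String expectedValue, Checksum computed) {
        return expectedValue.equals(computed);
    }

    // Fixed form: compare the expected value to the computed checksum's value.
    static boolean matches(String expectedValue, Checksum computed) {
        return computed != null && Objects.equals(expectedValue, computed.value);
    }
}
```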
use UTC serialization for log entries so that the timestamp, not just the date, is preserved https://redmine.dataone.org/issues/2257
Update the D1Admin class to set the dataone.contactSubject property. I've added the property to the http request to be added to the JSP form, but for now am setting the property using the dataone.subject field value. Not sure if we want to expose the contact subject in the form yet or not.
In MN.getCapabilities(), the required contact subject was not being added to the node instance from the dataone properties. Add it in.
generate ORE maps only once -- and persist the flag to the main backup properties so that subsequent Metacat upgrades remember this value.
use RC-1 Dataone jars
Added DOI generation to the 2.0.0 upgrade process. To succeed, this script must be run on a fresh 2.0.0 database, or on a 1.9.5 version database, as those are the only ways to get the needed foreign keys to be marked as deferrable. The identifier conversion must be turned on by setting correct properties in metacat.properties. See the comments in GenerateGlobalIdentifiers for details. By default, conversion is set to false in the properties file. If you want to convert an instance to use DOIs, be sure to set metacat.properties up BEFORE running through the Metacat configuration and database upgrade.
Refactoring classes that throw generic Exception class to throw their more specific subclasses so that new exceptions are not hidden behind generic messages. Makes debugging easier.
try to read the local document before making the localid->guid mapping (in cases where we fail to read the data locally like if it is referenced in an EML file but does not exist on this Metacat instance)
Ensure we have the object and sysmeta params for MN.create(). We were getting a fatal SAX parsing error encapsulated in a ServiceFailure when a science metadata object param was null. Cut it off at the pass after parsing the MMP entity.
Use the Collections class from java.util.
Remove null field tests in the IdentifierManager class. Schema-level required fields are checked on serialization/deserialization using JibX during the REST resource handler classes. Other required fields are checked in MNodeService and CNodeService, higher in the stack.
For MNs that haven't set the archived flag to false on create(), set it here. Also, ensure that the CN sync code sets the authoritative and origin member node fields.
On MN.create(), set the archived flag to be false. This field isn't required in the schema, but is needed by the DataONE indexer once objects are sync'd.
generate system meta for all docids, even those not originating on the server (replicas from the past); generate ORE docs and download remote data only for those documents that originated on this server being upgraded. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5522
refactor generate system meta loop to the factory class -- to be reused in sysmeta and ORE generation http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5522
When managing obsoletes/obsoletedBy system metadata fields, set the archived flag to false initially, and set it to true on system metadata for objects that a revision obsoletes.
do NOT generate ORE maps or download data when we do the initial System Metadata generation -- this is deferred until D1 registration.
make more generic so that a custom list of IDs can be passed in.
check that the resourceMap (based on Id only) does not currently exist in the local metacat when generating OREs
insert OR update system metadata -- no need to do an update right after initial insert...
call the System Metadata generator during upgrade to 2.0.0
In IdentifierManager.updateSystemMetadata(), add a check for invalid system metadata (fields that throw a NullPointerException on access) to ensure that system metadata is populated correctly. Updated calling classes to handle the exception.
Properly initialize the servlet context when starting alternate servlets, which makes sure that the configuration files have been loaded and config properties are available.
Handle SQLExceptions when trying to save system metadata locally.
Convert SQLExceptions to RuntimeExceptions for Hazelcast MapStore operations.
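A rough sketch of the conversion described above, assuming the map store callbacks cannot declare checked exceptions. The Backend interface and all names are hypothetical; this is not the Hazelcast or Metacat API.

```java
import java.sql.SQLException;

final class MapStoreSketch {

    // Hypothetical database-backed loader that may fail with a checked exception.
    interface Backend {
        String load(String pid) throws SQLException;
    }

    // Wraps the checked SQLException in an unchecked RuntimeException and
    // keeps the original as the cause, so callers can still inspect it.
    static String load(Backend backend, String pid) {
        try {
            return backend.load(pid);
        } catch (SQLException e) {
            throw new RuntimeException("Could not load system metadata for " + pid, e);
        }
    }
}
```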
In IdentifierManager, throw SQLExceptions rather than just logging them, and let them be handled higher up in the stack.
use new endpoint/method: http://mule1.dataone.org/ArchitectureDocs-current/apis/CN_APIs.html#CNReplication.deleteReplicationMetadata
use PUT /obsoletedBy/{pid} for CNCore.setObsoletedBy per our discussion today
Keep the hzIdentifiers set in sync with the Metacat systemmetadata table. If entries are added/updated in the hzSystemMetadata map, make sure the identifier is in the set. If (for some administrative reason) the entry is removed, remove the identifier from the set. This usually doesn't happen.
When loading all keys from Metacat into the hzSystemMetadata map, also load identifiers into the hzIdentifiers set if they are not already there. Although entries may be evicted from the map, the list of identifiers will remain. The list will have a fairly small memory footprint since it's just identifiers.
Add support for the distributed Set of unique identifiers in the storage cluster called 'hzIdentifiers'. This set is a persistent total list of all identifiers (even when entries in the hzSystemMetadata map are evicted). It reflects the state of the identifiers in the postgresql systemmetadata table, but is distributed across the cluster. Add the getIdentifiers() method, which returns the ISet of identifiers.
include new methods needed for replication (in new d1 jars) https://redmine.dataone.org/issues/2203
add method: setObsoletedBy (https://redmine.dataone.org/issues/2185); augment new method: deleteReplicationMetadata
remove method: assertRelation https://redmine.dataone.org/issues/2158
add method: deleteReplicationMetadata; remove method: assertRelation; update the D1 jars. https://redmine.dataone.org/issues/2187 https://redmine.dataone.org/issues/2158
serialize the Identifier for the systemMetadata being registered https://redmine.dataone.org/issues/2204
Simplify setReplicationStatus() to not call updateReplicationMetadata() if a replica doesn't exist. Just create it and update the system metadata, which we already have a lock for.
Minor null checks to avoid NPEs when calling replicate()
Don't throw a NotAuthorized exception in isAdminAuthorized() - just return false.
do not download and save remote data resources which are HTML but are not expected to be such (login or info/splash pages before data content). http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5522
Update the CN methods to throw a VersionMismatch where the API changed (where serialVersion is a required parameter). These were previously throwing an InvalidRequest exception. Change the exception handling for calls to Hazelcast to catch a RuntimeException (not Exception) so we don't catch exceptions that we purposefully throw....
Use a Logger instead of System.out for SystemMetadataMap.
Don't lock() on the map.get() in isNodeAuthorized() (this assumes that the CN has queued the task already). Add more lock/unlock debug statements, and fix setReplicationStatus() - I missed a finally statement to unlock the pid.
Modify CNReplication methods setReplicationStatus(), updateReplicationMetadata() and setReplicationPolicy() to allow administrative access from a Coordinating Node by calling isAdminAuthorized().
Add isAdminAuthorized() to D1NodeService to check if the operation is being requested from a CN. Consult the NodeList from the CN and test the NodeType of the given node and the X509 certificate Subject. Perhaps we should expand this to also check for service-level access in the future.
store D1 configuration properties in the main backup so that they persist between upgrades.
In registerSystemMetadata(), lock the pid prior to calling map.containsKey(pid) since a put to the map could occur between the check and the subsequent put().
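The check-then-put pattern described above can be sketched with plain JDK locks. This is an illustration under assumptions: the real code locks through Hazelcast, and the class, field, and method names here are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

final class RegisterSketch {

    private final Map<String, String> sysmetaByPid = new ConcurrentHashMap<>();
    private final Map<String, Lock> locksByPid = new ConcurrentHashMap<>();

    // Registers system metadata for a pid only if none exists yet.
    // The lock is taken before containsKey() so that no other thread can
    // put an entry between the check and the put. Returns true on success.
    boolean register(String pid, String sysmeta) {
        Lock lock = locksByPid.computeIfAbsent(pid, k -> new ReentrantLock());
        lock.lock();
        try {
            if (sysmetaByPid.containsKey(pid)) {
                return false;
            }
            sysmetaByPid.put(pid, sysmeta);
            return true;
        } finally {
            // Always release the lock, even if an exception is thrown above.
            lock.unlock();
        }
    }

    String get(String pid) {
        return sysmetaByPid.get(pid);
    }
}
```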
update authoritative member node id when we change it (reconfiguration) and when we initially register as a MN with the CN.
Correctly deserialize the BaseException subclass in handling calls to setReplicationStatus()
Use Lock instead of ILock to be consistent across classes.
After reviewing CNodeService and D1NodeService prompted by Robert comparing the Hazelcast locking with the d1_synchronization locking, I've made a number of changes that will prevent locking problems:
1) Multiple methods contained try/catch blocks that would:...
only delete replicated data files (server_location != 1)
use inherited access control from EML for the data file we download from a remote source http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5522
download remote data and save locally when it is referenced by an EML package, then include it in the ORE map. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5522
When the requested count in a call to listObjects() is 0, return an empty object list, not a full one. Fixes https://redmine.dataone.org/issues/2122
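A small sketch of the paging behaviour described above, where a requested count of 0 yields an empty list rather than the full one. The method and its parameters are hypothetical and operate on a plain in-memory list, not on the Metacat query code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class ListObjectsSketch {

    // Returns at most `count` items beginning at index `start`.
    // A count of 0 (or less) returns an empty list, never the full list.
    static List<String> page(List<String> all, int start, int count) {
        if (count <= 0 || start >= all.size()) {
            return Collections.emptyList();
        }
        int from = Math.max(start, 0);
        // Use long arithmetic so a very large count cannot overflow.
        int to = (int) Math.min((long) from + count, (long) all.size());
        return new ArrayList<>(all.subList(from, to));
    }
}
```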
Minor formatting for querySystemMetadata().
expand permissions on the existing access rule, not on the permission being checked. (hierarchical permissions)
upgrade routine to purge empty replicated data files so that they can be re-replicated http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5536
Make sure the local id isn't null when we try to get the object from the local instance.
Simplify the error handling, and throw the exception once the CN is updated with the new status.
Set the replica status to failed (not invalidated) when we get exceptions trying to read the object bytes. Not much of a difference, but only the CN, in theory, is supposed to be able to set the invalidated status.