An example Python script that uses the Python client to loop through a list of files, read them from disk, and insert them into Metacat.
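A minimal sketch of such a script. The real Metacat Python client API is not shown in this log, so `insert_fn` stands in for the client's insert call (the name and signature are hypothetical); the loop-read-insert structure is the point.

```python
import os

def insert_files(doc_dir, insert_fn):
    """Read each XML file in doc_dir and hand (docid, content) to insert_fn.

    insert_fn is a placeholder for the Metacat client's insert call;
    the actual client interface may differ (this is a sketch).
    Returns the list of docids processed, in sorted filename order.
    """
    inserted = []
    for name in sorted(os.listdir(doc_dir)):
        if not name.endswith(".xml"):
            continue  # skip anything that is not a metadata document
        path = os.path.join(doc_dir, name)
        with open(path, "r") as f:
            content = f.read()
        docid = os.path.splitext(name)[0]  # e.g. "doc.1.xml" -> "doc.1"
        insert_fn(docid, content)
        inserted.append(docid)
    return inserted
```

In practice `insert_fn` would wrap an authenticated client session; here it can be any callable, which also makes the loop easy to test.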
make it clear that the Apache config files are samples and may need to be modified for different servers. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5516
update system requirements to be more reasonable
use larger ("text") db field for guid in the xml_access.accessfileid column
use EML 2.1.1 tag as final tag for the schema. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5532
use RELEASE_EML_UTILS_1_0_0 for EML style sheets. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5532
comment out the Demis world map layer -- it was prompting for username/password to use the WMS
Use the Collections class from java.util.
Remove null field tests in the IdentifierManager class. Schema-level required fields are checked on serialization/deserialization using JibX in the REST resource handler classes. Other required fields are checked in MNodeService and CNodeService, higher in the stack.
For MNs that haven't set the archived flag to false on create(), set it here. Also, ensure that the CN sync code sets the authoritative and origin member node fields.
On MN.create(), set the archived flag to be false. This field isn't required in the schema, but is needed by the DataONE indexer once objects are sync'd.
use EML 2.1.1 RC4 tag before final tag (schema). http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5532
use final tag for building with utilities (tags/UTILITIES_1_1_0). http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5532
use final tag for building with ecogrid. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5532
- generate system metadata for all docids, even those not originating on the server (replicas from the past)
- generate ORE docs and download remote data only for those documents that originated on the server being upgraded.
http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5522
refactor the generate-system-metadata loop into the factory class -- to be reused in sysmeta and ORE generation. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5522
When managing obsoletes/obsoletedBy system metadata fields, set the archived flag to false initially, and set it to true on system metadata for objects that a revision obsoletes.
do NOT generate ORE maps or download data when we do the initial System Metadata generation -- this is deferred until D1 registration.
make more generic so that a custom list of IDs can be passed in.
check that the resourceMap (based on Id only) does not currently exist in the local metacat when generating OREs
insert OR update system metadata -- no need to do an update right after initial insert...
call the System Metadata generator during upgrade to 2.0.0
In IdentifierManager.updateSystemMetadata(), add a check for invalid system metadata (fields that throw a NullPointerException on access) to ensure that system metadata is populated correctly. Updated calling classes to handle the exception.
Fixed formatting to make fixed-width line much shorter so boxes fit on the page.
Removing obsoleted Admin Guide -- this guide is now maintained using Sphinx in docs/user/metacat -- the word document and PDF file are now obsolete.
Properly initialize the servlet context when starting alternate servlets, which makes sure that the configuration files have been loaded and config properties are available.
Removed link to obsolete PDF file.
Modifications to make the build more readily find locally installed versions of sphinx-build, and to remove the unnecessary 'html' directory; output of admindoc is now written directly into build/docs.
Modified build to include documentation in the war file that is generated as part of the build, if the 'documentation' target has been run before the war target. This is not done automatically because it's not clear if all people will have the proper sphinx environment set up to build the...
Adapted the build to be able to generate the Sphinx Admin Guide, and to better handle the copyright for Javadoc generation.
Added links to Javadoc. Tuned the layout a bit.
Completed first draft of Admin guide chapter on DataONE.
Handle SQLExceptions when trying to save system metadata locally.
Convert SQLExceptions to RuntimeExceptions for Hazelcast MapStore operations.
In IdentifierManager, throw SQLExceptions rather than just logging them, and let them be handled higher up in the stack.
use new endpoint/method: http://mule1.dataone.org/ArchitectureDocs-current/apis/CN_APIs.html#CNReplication.deleteReplicationMetadata
use PUT /obsoletedBy/{pid} for CNCore.setObsoletedBy per our discussion today
Keep the hzIdentifiers set in sync with the Metacat systemmetadata table. If entries are added/updated in the hzSystemMetadata map, make sure the identifier is in the set. If (for some administrative reason) the entry is removed, remove the identifier from the set. This usually doesn't happen.
When loading all keys from Metacat into the hzSystemMetadata map, also load identifiers into the hzIdentifiers set if they are not already there. Although entries may be evicted from the map, the list of identifiers will remain. The list will have a fairly small memory footprint since it's just identifiers.
Add support for the distributed Set of unique identifiers in the storage cluster called 'hzIdentifiers'. This set is a persistent total list of all identifiers (even when entries in the hzSystemMetadata map are evicted). It reflects the state of the identifiers in the postgresql systemmetadata table, but is distributed across the cluster. Add the getIdentifiers() method, which returns the ISet of identifiers.
Add the dataone.hazelcast.storageCluster.identifiersSet property that defines the name of the distributed set of unique DataONE identifiers (called 'hzIdentifiers'). This ISet can't be configured in the hazelcast.xml file (only maps and queues can).
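The entries above keep the 'hzIdentifiers' set in sync with the hzSystemMetadata map: puts add the pid to the set, removes drop it, and evictions drop only the map entry while the identifier survives. A minimal local sketch of that contract, with plain Python containers standing in for the distributed Hazelcast structures (class and method names are hypothetical):

```python
class SyncedSysmetaStore:
    """Local analogue of hzSystemMetadata + hzIdentifiers.

    The identifier set is a persistent total list of pids, so it
    survives eviction of map entries; only an explicit remove()
    (a rare administrative action) drops a pid from the set.
    """

    def __init__(self):
        self.sysmeta = {}       # stands in for the hzSystemMetadata map
        self.identifiers = set()  # stands in for the hzIdentifiers ISet

    def put(self, pid, sm):
        self.sysmeta[pid] = sm
        self.identifiers.add(pid)   # keep the set in sync on add/update

    def evict(self, pid):
        # eviction frees memory but the identifier must remain known
        self.sysmeta.pop(pid, None)

    def remove(self, pid):
        # administrative removal: drop both the entry and the identifier
        self.sysmeta.pop(pid, None)
        self.identifiers.discard(pid)
```

The footprint argument from the log holds here too: the set stores only pids, not system metadata values.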
Continued authoring the description of DataONE in Metacat. More to come.
include new methods needed for replication (in new d1 jars). https://redmine.dataone.org/issues/2203
add method: setObsoletedBy (https://redmine.dataone.org/issues/2185); augment new method: deleteReplicationMetadata
remove method: assertRelation. https://redmine.dataone.org/issues/2158
add method: deleteReplicationMetadata; remove method: assertRelation; update the D1 jars. https://redmine.dataone.org/issues/2187 https://redmine.dataone.org/issues/2158
serialize the Identifier for the systemMetadata being registered. https://redmine.dataone.org/issues/2204
Use a Date with millisecond resolution.
Initial outline for DataONE chapter.
Added OAI-PMH chapter that was contributed by Duane Costa from LTER.
Simplify setReplicationStatus() to not call updateReplicationMetadata() if a replica doesn't exist. Just create it and update the system metadata, which we already have a lock for.
Minor null checks to avoid NPEs when calling replicate()
Don't throw a NotAuthorized exception in isAdminAuthorized() - just return false.
do not download and save remote data resources that are HTML when HTML is not expected (e.g., login or info/splash pages served before the data content). http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5522
Fixed formatting.
Moving Metacat Sphinx RST documentation from docs/dev to docs/user directory.
Merged most recent changes from trunk into the RST converted version of the Administrator's Guide. Now the Sphinx/RST version is up to date relative to the most recent word document, and is now the active copy. The MS Word document will be deprecated and removed. All future changes should be made to the RST version.
Update the CN methods to throw a VersionMismatch where the API changed (where serialVersion is a required parameter). These were previously throwing an InvalidRequest exception. Change the exception handling for calls to Hazelcast to catch a RuntimeException (not Exception) so we don't catch exceptions that we purposefully throw...
Use a Logger instead of System.out for SystemMetadataMap.
Don't lock() on the map.get() in isNodeAuthorized() (this assumes that the CN has queued the task already). Add more lock/unlock debug statements, and fix setReplicationStatus() - I missed a finally statement to unlock the pid.
Modify CNReplication methods setReplicationStatus(), updateReplicationMetadata() and setReplicationPolicy() to allow administrative access from a Coordinating Node by calling isAdminAuthorized().
Add isAdminAuthorized() to D1NodeService to check if the operation is being requested from a CN. Consult the NodeList from the CN and test the NodeType of the given node and the X509 certificate Subject. Perhaps we should expand this to also check for service-level access in the future.
store D1 configuration properties in the main backup so that they persist between upgrades.
In registerSystemMetadata(), lock the pid prior to calling map.containsKey(pid) since a put to the map could occur between the check and the subsequent put().
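The race described above is the classic check-then-act gap: between `containsKey(pid)` and the `put()`, another thread can insert the same pid. A minimal local sketch of the fix, using a single `threading.Lock` as the stand-in for Hazelcast's distributed per-pid lock (class and method names are hypothetical):

```python
import threading

class GuardedRegistry:
    """Sketch of the registerSystemMetadata() fix: take the lock BEFORE
    the containsKey check, so check and put happen atomically.
    Metacat actually locks the pid in the Hazelcast cluster; a local
    threading.Lock is the single-JVM analogue."""

    def __init__(self):
        self._lock = threading.Lock()
        self._map = {}

    def put_if_absent(self, pid, sysmeta):
        with self._lock:           # acquire before containsKey()
            if pid in self._map:   # no other thread can race us here
                return False       # already registered
            self._map[pid] = sysmeta
            return True
```

Without the lock, two concurrent callers could both see the pid absent and both put, which is exactly the window the commit closes.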
update authoritative member node id when we change it (reconfiguration) and when we initially register as a MN with the CN.
add description about what becoming a Member Node entails
Correctly deserialize the BaseException subclass in handling calls to setReplicationStatus()
Use Lock instead of ILock to be consistent across classes.
After reviewing CNodeService and D1NodeService prompted by Robert comparing the Hazelcast locking with the d1_synchronization locking, I've made a number of changes that will prevent locking problems:
1) Multiple methods contained try/catch blocks that would:...
Converted the metacat-properties chapter to RST format. Still need to merge in newer changes from the trunk, as I was accidentally working from the 1.9.4 branch for this whole conversion process.
only delete replicated data files (server_location != 1)
use inherited access control from EML for the data file we download from a remote source. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5522
Removing unused screenshots that are duplicates of the others in the admin doc.
Converted Harvester chapter to RST.
download remote data and save locally when it is referenced by an EML package, then include it in the ORE map. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5522
remove systemmetadata replication option -- it is no longer a separate document in metacat
Added stub documents for chapters on DataONE and OAI-PMH (to be converted from Duane's Word doc).
Small word choice change.
Improved formatting for index.
Added AuthInterface chapter, and a License chapter.
Converted Event Logging and Sitemaps chapters to RST.
Fixed table layout on geoserver and submission chapters. Converted Replication chapter to RST.
Completed 'Submission' page conversion, and also converted GeoServer docs to RST format.
Partial conversion of the accessing and submitting metadata section to RST. More coming later.
include the EML and data tests in the suite
debugging data locking test
cannot check for deleted data since it is forever available (archived)
Updated the configuration section, converted word doc to RST.
Updated the Installation chapter, converted to RST.
When the requested count in a call to listObjects() is 0, return an empty object list, not a full one. Fixes https://redmine.dataone.org/issues/2122
Minor formatting for querySystemMetadata().
Updated contributors.
Modified index to fix typo.
Edited introduction to Metacat admin guide, inserted figure.
Screenshots from the Metacat admin guide.
Updating Sphinx doc structure in prep for moving metacat admin guide to Sphinx.
expand permissions on the existing access rule, not on the permission being checked (hierarchical permissions).
defer to super class member variables
Upgrade to Hazelcast-1.9.4.5 to try to solve CLIENT_CONNECTION_LOST problems seen on the Coordinating Node.
mark client/servlet API and EarthGrid API as deprecated. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5517