# Date Author Comment
7007 02/09/2012 04:11 PM ben leinfelder

only run ORE generation for EML docs -- no need to run this for all documents (yikes!)

7006 02/09/2012 03:48 PM ben leinfelder

use IdMan method to find docids that do not already have system metadata records -- this lets us re-run without recomputing system metadata for every entry (in case the process is interrupted). I haven't been using this option because I wanted to continually regenerate all SM for everything in my test DBs, but we are so close to release that I want to get this in there.
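The resumable behaviour described in this entry can be sketched in Java as a simple filter. This is only an illustration under stated assumptions: the class and method names are hypothetical, and the real change uses an IdMan lookup (presumably against the database) rather than in-memory collections.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration; not the Metacat implementation.
class ResumeGenerationDemo {

    // Returns, in their original order, the docids that have no system metadata yet.
    static List<String> idsMissingSystemMetadata(Collection<String> allDocids,
                                                 Set<String> alreadyGenerated) {
        List<String> pending = new ArrayList<>();
        for (String docid : allDocids) {
            if (!alreadyGenerated.contains(docid)) {
                pending.add(docid);
            }
        }
        return pending;
    }

    public static void main(String[] args) {
        // Example ids are made up for the demo.
        List<String> all = Arrays.asList("doc.1.1", "doc.2.1", "doc.3.1");
        Set<String> done = new HashSet<>(Arrays.asList("doc.1.1", "doc.3.1"));
        // Only the ids skipped by an interrupted earlier run are processed again.
        System.out.println(idsMissingSystemMetadata(all, done));
    }
}
```

Because only the missing ids are returned, an interrupted run can be restarted without redoing completed entries.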

7005 02/09/2012 02:39 PM ben leinfelder

for testing: limit and randomize the docs to generate metadata for

7004 02/09/2012 08:39 AM ben leinfelder

FOR TESTING ONLY: limit number of records to 100 so that we can get an estimate

7003 02/08/2012 03:46 PM ben leinfelder

update the memberNodeId in existing system metadata only after the register/update is successful with the CN -- we can avoid unneeded SM updates in cases when the register/update fails because we gave the CN bad info that it rejects.
https://redmine.dataone.org/issues/2308

7002 02/08/2012 03:12 PM ben leinfelder

include member node id text field now that the CN is not assigning random Ids.
https://redmine.dataone.org/issues/2308

7001 02/08/2012 01:20 PM ben leinfelder

1. lookup and use the guid when processing obsoletes/obsoletedBy entries -- had previously been assuming localId==guid but now that we have introduced DOIs as part of the Metacat upgrade process, we may have DOIs for the guid that map to localIds.
2. base ORE guids on the localId of the data package they are describing and not on their DOI -- otherwise we might mash up the DOI prefix (or other id scheme that we are unaware of). By using resourceMapPrefix + localId we are sure to have a valid localId and guid for the ORE map we create and add to the system.

7000 02/08/2012 11:23 AM ben leinfelder

use updated authorization policies as discussed in:
https://redmine.dataone.org/issues/2277
and
http://epad.dataone.org/20120131-authn-authz-questions

6999 02/08/2012 10:58 AM ben leinfelder

refactor D1-specific upgrade utilities into their own package

6998 02/08/2012 10:53 AM ben leinfelder

remove createAndInsertSystemMetadata() method that acts on a single localId -- incorporated this into the localId-list-based method.

6997 02/08/2012 10:50 AM ben leinfelder

refactor IdentityManager.createSystemMetadata(sm) to be insertSystemMetadata(sm) so that it is clear that this method inserts the SM object into the backing store. This differentiates it from the "generation" methods we use when we need to create SM about pre-existing objects or objects we get from non-D1 api calls.

6996 02/08/2012 10:44 AM ben leinfelder

generate SystemMetadata during D1 registration (not 2.0.0 upgrade). This process runs in a thread and updates a metacat.properties value when it is complete.

6995 02/07/2012 09:54 PM ben leinfelder

getMultipartParameters() outside of debug block -- thanks Mark Reyes @ CDL for catching this.

6994 02/07/2012 04:53 PM ben leinfelder

dataone configuration and registration enhancements:
-include flag to disable D1 services, currently only the MN side enforces this
-do not allow multiple registration attempts if we have just submitted and are awaiting Node verification by the CN.
-do not allow configuration "bypass" if D1 settings have been configured previously....

6993 02/07/2012 03:18 PM ben leinfelder

use correct Collections import

6992 02/07/2012 11:08 AM ben leinfelder

Show "Update" button if this MemberNodeId is already registered with DataONE, otherwise use the "Register" label

6991 02/07/2012 09:54 AM ben leinfelder

match changes to MN service methods (return type as boolean)

6988 02/07/2012 12:02 AM Matt Jones

Added new methods to generate a default replication policy based on properties from the metacat configuration. This is called during system metadata creation for objects that lack any system metadata.

6987 02/07/2012 12:00 AM Matt Jones

Modify admin configuration to include default replication policy. Extensively revised the DataONE configuration page, including new wording for intro, improved tooltips throughout, new arrangement of sections, and other cosmetic changes.

6986 02/06/2012 11:56 PM Matt Jones

Clean up warnings in class.

6984 02/06/2012 01:05 PM Matt Jones

Remove ability to edit NodeID from D1 configuration page. Fix update of contactSubject and dataone.ore.generated property name.

6982 02/06/2012 12:38 PM ben leinfelder

handle "BIN" objects so as to avoid repeated calls to lookup the non-existent ObjectFormat

6981 02/06/2012 11:40 AM ben leinfelder

do not wait for SM generation to complete during the upgrade -- this way the web UI won't hang for days. The process sets a metacat property when it is complete.

6980 02/06/2012 11:38 AM ben leinfelder

6977 02/02/2012 05:15 PM ben leinfelder

do not shutdown hazelcast -- it needs to be running after the upgrade process so that Metacat actually works.
I think the newer version of HZ makes it so the threads are all released as needed.
http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5572

6976 02/02/2012 03:17 PM Matt Jones

Commenting out the parts of the upgrade script that started to refer to EZID. At this point, the registration of EZID identifiers will be done out-of-band with respect to the upgrade.

6975 02/02/2012 02:17 PM ben leinfelder

use plain String parameter for {pid} instead of XML serialization of it.

6974 02/02/2012 11:23 AM ben leinfelder

remove {pid} from POST URL on CN.registerSystemMetadata()
https://redmine.dataone.org/issues/2284

6973 02/02/2012 11:15 AM ben leinfelder

remove {pid} from POST URL on CN.create()
https://redmine.dataone.org/issues/2284

6972 02/02/2012 11:10 AM ben leinfelder

remove {pid} from POST URL on MN.create()
https://redmine.dataone.org/issues/2284

6971 02/01/2012 04:09 PM ben leinfelder

catch cases where the previous/next revision of objects have not had system metadata generated yet

6970 02/01/2012 03:52 PM ben leinfelder

create system metadata object if it wasn't found in HZ

6968 02/01/2012 09:44 AM ben leinfelder

process systemMetadata from the docInfo string before writing to the database so that we guarantee guid-docid mapping exists before attempting to look it up.

6966 01/30/2012 02:49 PM ben leinfelder

upgrade to hazelcast 1.9.4.6 so that threadpools are released when not needed (http://code.google.com/p/hazelcast/issues/detail?id=765).
include ant target to run a specific main class (mostly for debugging)

6965 01/30/2012 02:44 PM ben leinfelder

use File.deleteOnExit() rather than a half-hour timer thread to delete the file.
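A minimal Java sketch of the `File.deleteOnExit()` approach follows. The class name, method name, and file prefix are assumptions made for the example and are not taken from the Metacat source.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical illustration; not the Metacat implementation.
class TempFileCleanupDemo {

    // Writes content to a temp file that the JVM removes at normal shutdown.
    static File writeTemp(String content) {
        try {
            File tmp = File.createTempFile("metacat-demo-", ".tmp"); // made-up prefix
            tmp.deleteOnExit(); // replaces a timer thread that would delete it later
            Files.write(tmp.toPath(), content.getBytes(StandardCharsets.UTF_8));
            return tmp;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File tmp = writeTemp("example");
        System.out.println(tmp.getName() + " exists: " + tmp.exists());
    }
}
```

Note that `deleteOnExit()` only takes effect on normal JVM termination, so in a long-running servlet container the files stay on disk until the container stops.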

6964 01/27/2012 05:15 PM ben leinfelder

multithreaded implementation for processing docids for system metadata generation.
need to investigate ant/junit running that deadlocks hazelcast (config?)

6963 01/27/2012 05:12 PM ben leinfelder

additional logging of the config file being used - seem to have thread locking on the xmlConfig use when running under ant/junit

6962 01/27/2012 10:53 AM ben leinfelder

calculate object size using the size on the file system rather than re-reading as an input stream.
Now only EML document bytes will be read twice: once for the checksum and again for parsing out datapackage details

6961 01/26/2012 11:14 PM ben leinfelder

system metadata generation optionally skips entries that have already been generated (data size, checksum) but allows the latest EML that describes them to have the last word on object format

6960 01/26/2012 09:35 PM ben leinfelder

remove DML for parsing -- the D1 EML parser still uses DOM, so this may not be too big of a performance improvement

6957 01/26/2012 12:48 PM ben leinfelder

only attempt to update date-like nodedata values.

6955 01/26/2012 10:03 AM ben leinfelder

include generate system metadata upgrade in the success flag

6954 01/26/2012 10:02 AM ben leinfelder

more clean up - reuse prepared statement for data update

6953 01/26/2012 08:40 AM ben leinfelder

look up nodedata values first, then update each one - trying to avoid out of memory exception.

6952 01/25/2012 03:50 PM ben leinfelder

eliminate the cross product that occurred when updating xml_access with a join

6951 01/25/2012 07:41 AM ben leinfelder

rollback processing Error change -- creates a loop on error. ugh

6950 01/24/2012 10:55 PM ben leinfelder

report processing errors after exceptions have been caught and recorded, otherwise the web UI is blank and there is no clue what happened unless you look in the logs.

6949 01/24/2012 10:47 PM ben leinfelder

semicolons!

6948 01/24/2012 04:32 PM Chris Jones

fix a bug in MNodeService.replicate() where the checksum value was being compared to the computed checksum object, not its value.
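The kind of mistake described here can be illustrated with a short Java sketch. The nested `Checksum` class is a hypothetical stand-in, not the DataONE type, and the method names are made up for the example.

```java
import java.util.Objects;

// Hypothetical illustration; not the MNodeService code.
class ChecksumCompareDemo {

    // Stand-in for a checksum type that wraps an algorithm name and a digest value.
    static final class Checksum {
        private final String algorithm;
        private final String value;

        Checksum(String algorithm, String value) {
            this.algorithm = algorithm;
            this.value = value;
        }

        String getAlgorithm() { return algorithm; }
        String getValue() { return value; }
    }

    // Buggy form: a String never equals a Checksum object,
    // so this is false even when the digests agree.
    static boolean buggyMatches(String expectedValue, Checksum computed) {
        return Objects.equals(expectedValue, computed);
    }

    // Fixed form: compare the expected value with the computed object's value.
    static boolean matches(String expectedValue, Checksum computed) {
        return computed != null && Objects.equals(expectedValue, computed.getValue());
    }

    public static void main(String[] args) {
        Checksum computed = new Checksum("MD5", "d41d8cd98f00b204e9800998ecf8427e");
        String expected = "d41d8cd98f00b204e9800998ecf8427e";
        System.out.println("buggy: " + buggyMatches(expected, computed));
        System.out.println("fixed: " + matches(expected, computed));
    }
}
```

The buggy comparison compiles without complaint because `equals` accepts any `Object`, which is why this class of error tends to surface only at runtime.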

6947 01/24/2012 04:22 PM ben leinfelder

use a temporary table to calculate the maximum revision for a given docid and use that when setting the accessfileid during upgrade. the query plan for the all-in-one statement must be brutal as it's been running for 4 hours at this point....

6946 01/24/2012 12:20 PM ben leinfelder

do not insert duplicate GUID entries when adding rows from the xml_revisions table

6945 01/24/2012 11:57 AM ben leinfelder

add "IF EXISTS" clause to identifier table drop in case it does not exist on the given deployment (as is the case on the KNB)

6944 01/24/2012 10:35 AM ben leinfelder

use UTC serialization for log entries so that the timestamp, not just the date, is preserved
https://redmine.dataone.org/issues/2257

6941 01/23/2012 03:09 PM Chris Jones

Update the D1Admin class to set the dataone.contactSubject property. I've added the property to the http request to be added to the JSP form, but for now am setting the property using the dataone.subject field value. Not sure if we want to expose the contact subject in the form yet or not.

6938 01/23/2012 02:43 PM Chris Jones

In MN.getCapabilities(), the required contact subject was not being added to the node instance from the dataone properties. Add it in.

6935 01/23/2012 12:53 PM ben leinfelder

generate ORE maps only once -- and persist the flag to the main backup properties so that subsequent Metacat upgrades remember this value.

6934 01/23/2012 11:08 AM ben leinfelder

use RC-1 Dataone jars

6933 01/20/2012 10:46 PM Matt Jones

Added DOI generation to the 2.0.0 upgrade process. To succeed, this script must be run on a fresh 2.0.0 database, or on a 1.9.5 version database, as those are the only ways to get the needed foreign keys to be marked as deferrable. The identifier conversion must be turned on by setting correct properties in metacat.properties. See the comments in GenerateGlobalIdentifiers for details. By default, conversion is set to false in the properties file. If you want to convert an instance to use DOIs, be sure to set metacat.properties up BEFORE running through the Metacat configuration and database upgrade.

6932 01/20/2012 10:38 PM Matt Jones

Refactoring classes that throw generic Exception class to throw their more specific subclasses so that new exceptions are not hidden behind generic messages. Makes debugging easier.

6931 01/20/2012 03:45 PM ben leinfelder

try to read the local document before making the localid->guid mapping (in cases where we fail to read the data locally like if it is referenced in an EML file but does not exist on this Metacat instance)

6927 01/20/2012 10:14 AM Chris Jones

Ensure we have the object and sysmeta params for MN.create(). We were getting a fatal SAX parsing error encapsulated in a ServiceFailure when a science metadata object param was null. Cut it off at the pass after parsing the MMP entity.

6926 01/19/2012 04:02 PM Matt Jones

An example python script that uses the python client to loop through a list of
files, read them from disk, and insert them into metacat.

6923 01/18/2012 04:18 PM ben leinfelder

use larger ("text") db field for guid in the xml_access.accessfileid column

6919 01/17/2012 04:21 PM Chris Jones

Use the Collections class from java.util.

6918 01/17/2012 03:20 PM Chris Jones

Remove null field tests in the IdentifierManager class. Schema-level required fields are checked on serialization/deserialization using JibX during the REST resource handler classes. Other required fields are checked in MNodeService and CNodeService, higher in the stack.

6917 01/17/2012 03:17 PM Chris Jones

For MNs that haven't set the archived flag to false on create(), set it here. Also, ensure that the CN sync code sets the authoritative and origin member node fields.

6916 01/17/2012 03:15 PM Chris Jones

On MN.create(), set the archived flag to be false. This field isn't required in the schema, but is needed by the DataONE indexer once objects are sync'd.

6912 01/17/2012 12:06 PM ben leinfelder

-generate system meta for all docids, even those not originating on the server (replicas from the past)
-generate ORE docs and download remote data only for those documents that originated on this server being upgraded.
http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5522

6911 01/17/2012 11:43 AM ben leinfelder

refactor generate system meta loop to the factory class -- to be reused in sysmeta and ORE generation
http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5522

6910 01/17/2012 11:34 AM Chris Jones

When managing obsoletes/obsoletedBy system metadata fields, set the archived flag to false initially, and set it to true on system metadata for objects that a revision obsoletes.

6909 01/13/2012 04:57 PM ben leinfelder

do NOT generate ORE maps or download data when we do the initial System Metadata generation -- this is deferred until D1 registration.

6908 01/13/2012 02:25 PM ben leinfelder

make more generic so that a custom list of IDs can be passed in.

6907 01/13/2012 02:01 PM ben leinfelder

check that the resourceMap (based on Id only) does not currently exist in the local metacat when generating OREs

6906 01/13/2012 01:31 PM ben leinfelder

insert OR update system metadata -- no need to do an update right after initial insert...

6905 01/13/2012 01:05 PM ben leinfelder

call the System Metadata generator during upgrade to 2.0.0

6904 01/13/2012 11:17 AM Chris Jones

In IdentifierManager.updateSystemMetadata(), add a check for invalid system metadata (fields that throw a NullPointerException on access) to ensure that system metadata is populated correctly. Updated calling classes to handle the exception.

6901 01/13/2012 01:14 AM Matt Jones

Properly initialize the servlet context when starting alternate servlets, which makes sure that the configuration files have been loaded and config properties are available.

6894 01/12/2012 01:56 PM Chris Jones

Handle SQLExceptions when trying to save system metadata locally.

6893 01/12/2012 01:56 PM Chris Jones

Convert SQLExceptions to RuntimeExceptions for Hazelcast MapStore operations.

6892 01/12/2012 01:54 PM Chris Jones

In IdentifierManager, throw SQLExceptions rather than just logging them, and let them be handled higher up in the stack.

6891 01/12/2012 01:32 PM ben leinfelder

use new endpoint/method:
http://mule1.dataone.org/ArchitectureDocs-current/apis/CN_APIs.html#CNReplication.deleteReplicationMetadata

6890 01/12/2012 12:18 PM ben leinfelder

use PUT /obsoletedBy/{pid} for CNCore.setObsoletedBy per our discussion today

6889 01/12/2012 07:53 AM Chris Jones

Keep the hzIdentifiers set in sync with the Metacat systemmetadata table. If entries are added/updated in the hzSystemMetadata map, make sure the identifier is in the set. If (for some administrative reason) the entry is removed, remove the identifier from the set. This usually doesn't happen.

6888 01/12/2012 07:47 AM Chris Jones

When loading all keys from Metacat into the hzSystemMetadata map, also load identifiers into the hzIdentifiers set if they are not already there. Although entries may be evicted from the map, the list of identifiers will remain. The list will have a fairly small memory footprint since it's just identifiers.

6887 01/12/2012 07:44 AM Chris Jones

Add support for the distributed Set of unique identifiers in the storage cluster called 'hzIdentifiers'. This set is a persistent total list of all identifiers (even when entries in the hzSystemMetadata map are evicted). It reflects the state of the identifiers in the postgresql systemmetadata table, but is distributed across the cluster. Add the getIdentifiers() method, which returns the ISet of identifiers.

6884 01/11/2012 04:42 PM ben leinfelder

include new methods needed for replication (in new d1 jars)
https://redmine.dataone.org/issues/2203

6883 01/11/2012 01:25 PM ben leinfelder

add method: setObsoletedBy (https://redmine.dataone.org/issues/2185)
augment new method: deleteReplicationMetadata

6882 01/11/2012 11:31 AM ben leinfelder

remove method: assertRelation
https://redmine.dataone.org/issues/2158

6881 01/11/2012 11:24 AM ben leinfelder

add method: deleteReplicationMetadata
remove method: assertRelation
update the D1 jars
https://redmine.dataone.org/issues/2187
https://redmine.dataone.org/issues/2158

6880 01/11/2012 10:41 AM ben leinfelder

serialize the Identifier for the systemMetadata being registered
https://redmine.dataone.org/issues/2204

6876 01/10/2012 05:04 PM Chris Jones

Simplify setReplicationStatus() to not call updateReplicationMetadata() if a replica doesn't exist. Just create it and update the system metadata, which we already have a lock for.

6875 01/10/2012 05:03 PM Chris Jones

Minor null checks to avoid NPEs when calling replicate()

6874 01/10/2012 05:01 PM Chris Jones

Don't throw a NotAuthorized exception in isAdminAuthorized() - just return false.

6873 01/10/2012 12:12 PM ben leinfelder

do not download and save remote data resources which are HTML but are not expected to be such (login or info/splash pages before data content).
http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5522

6869 01/09/2012 05:08 PM Chris Jones

Update the CN methods to throw a VersionMismatch where the API changed (where serialVersion is a required parameter). These were previously throwing an InvalidRequest exception.
Change the exception handling for calls to Hazelcast to catch a RuntimeException (not Exception) so we don't catch exceptions that we purposefully throw....

6868 01/09/2012 04:59 PM Chris Jones

Use a Logger instead of System.out for SystemMetadataMap.

6867 01/07/2012 06:01 PM Chris Jones

Don't lock() on the map.get() in isNodeAuthorized() (this assumes that the CN has queued the task already). Add more lock/unlock debug statements, and fix setReplicationStatus() - I missed a finally statement to unlock the pid.
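The missing `finally` mentioned above follows a common locking pattern, sketched here in Java. This sketch uses `java.util.concurrent.locks.ReentrantLock` as a stand-in for the per-pid lock on the Hazelcast map, and all class and method names are hypothetical.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration; not the CNodeService code.
class UnlockInFinallyDemo {

    private final ReentrantLock lock = new ReentrantLock();
    private String status = "queued";

    // Updates the status while holding the lock; the lock is released
    // even if validation or the update throws.
    void setStatus(String newStatus) {
        lock.lock();
        try {
            if (newStatus == null) {
                throw new IllegalArgumentException("status is required");
            }
            status = newStatus;
        } finally {
            lock.unlock(); // without this, an exception would leave the lock held
        }
    }

    String getStatus() { return status; }

    boolean isLocked() { return lock.isLocked(); }

    public static void main(String[] args) {
        UnlockInFinallyDemo demo = new UnlockInFinallyDemo();
        try {
            demo.setStatus(null);
        } catch (IllegalArgumentException e) {
            System.out.println("update rejected: " + e.getMessage());
        }
        System.out.println("still locked: " + demo.isLocked());
    }
}
```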

6866 01/07/2012 12:39 PM Chris Jones

Modify CNReplication methods setReplicationStatus(), updateReplicationMetadata() and setReplicationPolicy() to allow administrative access from a Coordinating Node by calling isAdminAuthorized().

6865 01/07/2012 12:34 PM Chris Jones

Add isAdminAuthorized() to D1NodeService to check if the operation is being requested from a CN. Consult the NodeList from the CN and test the NodeType of the given node and the X509 certificate Subject. Perhaps we should expand this to also check for service-level access in the future.

6864 01/06/2012 01:51 PM ben leinfelder

store D1 configuration properties in the main backup so that they persist between upgrades.