# Date Author Comment
7325 07/20/2012 03:44 PM ben leinfelder

move the hzIdentifiers initialization into the resync thread so that it does not affect start up time. cleaned up unused methods and superfluous code.

7323 07/20/2012 10:51 AM ben leinfelder

only load local pids into hzIdentifiers if they do not already exist in the shared set. increase logging severity and detail of messages emitted during this process to get a better sense of what is taking so long.
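
The membership check described in this revision could look roughly like the sketch below. It is only an illustration: `SharedPidLoader` and `addMissing` are hypothetical names, and a plain `java.util.Set` stands in for the shared `hzIdentifiers` set.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical helper, not taken from the Metacat source. */
final class SharedPidLoader {

    /**
     * Adds each local pid to the shared set only when the shared set does
     * not already contain it, and returns the pids that were actually added.
     */
    static Set<String> addMissing(Set<String> shared, Collection<String> localPids) {
        Set<String> added = new LinkedHashSet<>();
        for (String pid : localPids) {
            // Skip pids that another node has already contributed.
            if (!shared.contains(pid)) {
                shared.add(pid);
                added.add(pid);
            }
        }
        return added;
    }
}
```

Returning the added pids makes it easy to log how many identifiers a node actually contributed, which fits the extra logging this revision mentions.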

7322 07/19/2012 02:38 PM ben leinfelder

utility methods to update/reserialize existing ORE maps that were generated with older foresite (and included bad dateTime strings).
https://redmine.dataone.org/issues/3046

7319 07/17/2012 03:57 PM Chris Jones

On the Coordinating Nodes, we often get McdbDocNotFoundExceptions for data (doctype == 'BIN') documents because they are not synchronized to the CNs. Change the logging to only print the stack trace during load() and loadAll() when log debug is enabled.

7318 07/17/2012 01:34 PM ben leinfelder

check for invalid (!) pids. thanks, M. Reyes for catching this
https://redmine.dataone.org/issues/3047

7315 07/17/2012 11:09 AM ben leinfelder

check for whitespace in identifiers during create() and update()
https://redmine.dataone.org/issues/3047
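
A whitespace check of this kind might be sketched as follows. `PidValidator` and `isValid` are hypothetical names, and treating a null or empty identifier as invalid is an assumption of this sketch rather than something the commit states.

```java
/** Hypothetical validator, not the actual Metacat implementation. */
final class PidValidator {

    /** Returns true when the pid is non-empty and contains no whitespace. */
    static boolean isValid(String pid) {
        // Assumption: null and empty identifiers are rejected as well.
        if (pid == null || pid.isEmpty()) {
            return false;
        }
        for (int i = 0; i < pid.length(); i++) {
            // Covers spaces, tabs, and line breaks.
            if (Character.isWhitespace(pid.charAt(i))) {
                return false;
            }
        }
        return true;
    }
}
```

A caller in `create()` or `update()` would reject the request when `isValid` returns false, before touching the store.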

7297 07/10/2012 10:20 AM ben leinfelder

set date SM modified when we are setting obsoletes/obsoletedBy/archived values. This way the CN can actually pick up the changes in revision history.

7295 07/09/2012 04:23 PM ben leinfelder

log error when looking up non-existent local SM rather than completely bombing out of the resynch thread.

7286 07/02/2012 03:35 PM ben leinfelder

use secure Metacat context URL for D1 registration
https://redmine.dataone.org/issues/3030

7285 07/02/2012 12:06 PM ben leinfelder

first pass: DataONE-specific log retrieval to avoid java-based post-processing.

7278 06/18/2012 03:43 PM ben leinfelder

set archived flag (true) when we set the obsoletedBy value in the ORE system metadata

7273 06/18/2012 12:13 PM ben leinfelder

use the localId for obsoletes/obsoletedBy ORE system metadata (https://redmine.dataone.org/issues/2964)

7252 06/06/2012 03:14 PM Chris Jones

Oops, previous commit suffered from a happy trigger finger. During deleteReplicationMetadata(), don't delete the replica on the replica Member Node. Call CN.delete() for that functionality. This call just updates system metadata (according to the API description).

7251 06/06/2012 03:10 PM Chris Jones
7245 06/06/2012 10:23 AM Chris Jones

Minor logging change.

7244 06/06/2012 10:01 AM Chris Jones

Add debug logging to delete() to understand why we're getting InsufficientKarmaException.

7236 06/05/2012 02:07 PM Chris Jones

Since we already have determined access via isAuthorized() and isAdminAuthorized(), act as the Metacat administrator during calls to DocumentImpl.delete() in archive(), passing in null username and group.

7234 06/04/2012 08:49 PM ben leinfelder

restrict getLogRecords (both MN and CN) to be called only by admin users (the CN)
https://redmine.dataone.org/issues/2855

7231 06/02/2012 05:46 AM Chris Jones

In setReplicationStatus() and UpdateReplicationMetadata(), don't allow a status state change from COMPLETED to anything other than INVALIDATED. This prevents the completed status from being overwritten due to race conditions.
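
The transition rule can be expressed as a small guard. This is a sketch under assumptions: the local `Status` enum stands in for the DataONE replication status type, `ReplicaStatusGuard` and `isAllowed` are hypothetical names, and the rule is read literally, so even COMPLETED to COMPLETED is refused.

```java
/** Hypothetical guard illustrating the COMPLETED transition rule. */
final class ReplicaStatusGuard {

    /** Local stand-in for the DataONE replication status values. */
    enum Status { QUEUED, REQUESTED, COMPLETED, FAILED, INVALIDATED }

    /**
     * Once a replica is COMPLETED, the only permitted change is to
     * INVALIDATED. Every other starting state is left unrestricted here.
     */
    static boolean isAllowed(Status current, Status requested) {
        if (current == Status.COMPLETED) {
            return requested == Status.INVALIDATED;
        }
        return true;
    }
}
```

Checking the current status before writing is what keeps a late FAILED or REQUESTED update from overwriting a replica that has already completed.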

7222 05/31/2012 09:04 PM ben leinfelder

use metacat.properties to specify the default checksum algorithm to use -- this way it will be easy for us to switch to whatever DataONE decrees.
https://redmine.dataone.org/issues/2834
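
Reading the algorithm name from configuration might look like the sketch below. The property key `checksum.algorithm.default` and the `MD5` fallback are placeholders invented for this example; the actual key used in metacat.properties is not given in the commit message.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Properties;

/** Hypothetical helper, not the actual Metacat implementation. */
final class ConfiguredChecksum {

    /** Placeholder key; the real property name is an assumption. */
    static final String KEY = "checksum.algorithm.default";

    /** Computes a lowercase hex digest using the configured algorithm. */
    static String hex(Properties props, byte[] data) {
        // Fall back to MD5 (an assumed default) when the key is absent.
        String algorithm = props.getProperty(KEY, "MD5");
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance(algorithm);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unsupported algorithm: " + algorithm, e);
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : digest.digest(data)) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }
}
```

With this shape, switching algorithms is a one-line change in the properties file and needs no recompilation.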

7221 05/31/2012 06:16 PM ben leinfelder

put(sm) for every pid we have a SM value for so that all members receive the entry event and can save locally.

7218 05/31/2012 10:56 AM Chris Jones

Throw an exception when NOT allowed, not when allowed =).

7217 05/31/2012 10:53 AM ben leinfelder

ignore partition owner -- always attempt to look up from local store if we were unable to get the SM from the shared map.

7216 05/31/2012 10:13 AM ben leinfelder

do not check if this CN has a "perfect" copy of the SM identifiers -- we need any CN coming online to contribute the records that they have locally so that in the event that all three CNs have a partial view of things they all eventually share each other's SM entries.

7215 05/31/2012 10:10 AM Chris Jones

Also get the list size, which may throw an NPE.

7214 05/31/2012 09:53 AM Chris Jones

Only add an AccessPolicy to SystemMetadata during generation when the AccessPolicy is not empty. We've had some scenarios where IdentifierManager.getaccessPolicy() is returning an empty policy because of an empty permission list coming from the db. This was causing InvalidSystemMetadata exceptions during MN to MN replication.

7213 05/31/2012 09:19 AM ben leinfelder

push SystemMetadata entries from the CN that has them all to the shared map where other nodes may not have all entries. The CN with the complete copy only pushes SM entries that it does not own and that return as null because those are the ones that are missing on the other, non-complete CNs....

7212 05/30/2012 10:00 PM ben leinfelder

trace level log for looping over EVERY pid in the system.

7211 05/30/2012 09:47 PM ben leinfelder

meant to log the guids (source) not the pids (target)

7210 05/30/2012 08:51 PM ben leinfelder

trace level log for looping over EVERY pid in the system.

7209 05/30/2012 08:18 PM ben leinfelder

logging for each step of shared identifiers loading.

7208 05/30/2012 08:07 PM ben leinfelder

remove pause/resume - seemed to make metacat just hang on SM retrieval. Add more logging when returned SM is null -- want to make sure it is because the local node "owns" the pid key even though there is no value for it.

7207 05/30/2012 06:12 PM ben leinfelder

due to hudson build issue, did not actually end up testing pause/resume -- trying that again

7206 05/30/2012 05:53 PM ben leinfelder

pause/resume was not enough. trying shutdown/restart

7205 05/30/2012 05:02 PM ben leinfelder

experiment with lifecycle pause/resume. hopefully it prevents our node from taking ownership of any keys before we are sure we have them all.

7204 05/30/2012 08:29 AM ben leinfelder

increase logging and add back in the call to saveLocally() in case the SM object has already been loaded into the shared map but before this node came back online.

7203 05/29/2012 11:21 PM ben leinfelder

no need to call saveLocally explicitly since loading from the shared store triggers that behavior locally because of the configured listeners.
use an iterator over the shared identifiers in case this set is constantly changing.

7202 05/29/2012 10:10 PM ben leinfelder

make only one DB call to look up local pids - no need to do a pstmt for every single shared pid.
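
The idea of fetching the local pids once and then testing membership in memory can be sketched as follows. `LocalPidLookup` and `missingLocally` are hypothetical names, and the `localPids` argument stands in for the result of the single database query.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper, not taken from the Metacat source. */
final class LocalPidLookup {

    /**
     * Returns the shared pids that have no local entry. The local pids are
     * assumed to come from one query, so no statement is run per shared pid.
     */
    static List<String> missingLocally(Iterable<String> sharedPids,
                                       Collection<String> localPids) {
        // Hash the single query result once for constant-time lookups.
        Set<String> local = new HashSet<>(localPids);
        List<String> missing = new ArrayList<>();
        for (String pid : sharedPids) {
            if (!local.contains(pid)) {
                missing.add(pid);
            }
        }
        return missing;
    }
}
```

This trades one larger result set held in memory for the removal of a database round trip per shared identifier.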

7201 05/29/2012 09:05 PM ben leinfelder

on init (start up) launch a synchronization thread that ensures all shared identifier entries have a corresponding local System Metadata entry.

7197 05/29/2012 10:31 AM ben leinfelder

fix NPE (logMetacat object was not initialized) that was occurring during store()

7192 05/25/2012 06:20 PM Chris Jones

Don't set the replication status to failed for an object when it is called by a public user. Just throw the NotAuthorized exception. This prevents this node from being de-prioritized because of public calls to the method.

7188 05/23/2012 04:41 PM ben leinfelder

share the same dbConnection when inserting and then updating SystemMetadata objects in the backing store.
any errors encountered during the update will rollback the entire transaction and the SM record will not exist, even in part.

7187 05/23/2012 03:28 PM ben leinfelder

Do not loadAllKeys() for SystemMetadataMap when Metacat first starts up. hzIdentifiers will be populated with a simple SQL statement rather than the serial loading of every single SystemMetadata object. It will remain in synch using the usual entryXXX() methods as before....

7184 05/23/2012 09:57 AM ben leinfelder

include pidFilter handling - only matches the complete pid. Issues a warning in the Metacat logs when pidFilter cannot be applied but allows the call to getLogs() to return as though there was no pidFilter given.
https://redmine.dataone.org/issues/2798
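
Exact-match filtering, as opposed to prefix or substring matching, might be sketched like this. `PidFilterSketch` and its `Entry` class are hypothetical; treating a null filter as "no filter" loosely mirrors the described fallback of returning results as though no pidFilter had been given.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of complete-pid filtering for log entries. */
final class PidFilterSketch {

    /** Minimal stand-in for a log record. */
    static final class Entry {
        final String pid;
        final String event;

        Entry(String pid, String event) {
            this.pid = pid;
            this.event = event;
        }
    }

    /** Keeps only entries whose pid equals the filter exactly. */
    static List<Entry> filter(List<Entry> entries, String pidFilter) {
        // Assumption: a null filter means the filter is not applied.
        if (pidFilter == null) {
            return new ArrayList<>(entries);
        }
        List<Entry> out = new ArrayList<>();
        for (Entry e : entries) {
            if (pidFilter.equals(e.pid)) {
                out.add(e);
            }
        }
        return out;
    }
}
```

Under this sketch a filter of `abc` does not match an entry for `abc.1`, since only the complete pid is compared.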

7179 05/21/2012 02:31 PM Chris Jones

Add a few logging statements for round trip replication metrics.

7178 05/21/2012 02:12 PM ben leinfelder

add trace statements for measuring time to complete SM generation.

7171 05/17/2012 12:46 PM ben leinfelder

remove exception from method decl - was not matching the interface def and not compiling.

7168 05/08/2012 04:30 PM ben leinfelder

only generate system metadata for original objects.
https://redmine.dataone.org/issues/2721

7162 05/02/2012 08:58 AM ben leinfelder

handle authorization for delete() differently for CN vs MN.
On the CN, only the CN (or tbd admin user) can call it.
On the MN, both the CN (or admin user) and the same MN can call it.
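
The two rules above reduce to a small decision function. This is a sketch with hypothetical names (`DeleteAuthSketch`, `mayDelete`); how the caller's identity is resolved into the two boolean flags is outside its scope.

```java
/** Hypothetical sketch of the delete() authorization rule. */
final class DeleteAuthSketch {

    /** The kind of node that is serving the delete() call. */
    enum NodeType { CN, MN }

    /**
     * On a CN, only a CN or admin caller may delete. On an MN, a CN or
     * admin caller may delete, and so may the MN calling itself.
     */
    static boolean mayDelete(NodeType serving,
                             boolean callerIsCnOrAdmin,
                             boolean callerIsSameMn) {
        if (callerIsCnOrAdmin) {
            return true;
        }
        return serving == NodeType.MN && callerIsSameMn;
    }
}
```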

7159 05/01/2012 02:48 PM ben leinfelder

add Session-less archive() method

7157 05/01/2012 11:14 AM ben leinfelder

only admin users can call MN/CN.delete(). This is limited to any CN and only the MN that is calling itself

7156 05/01/2012 10:47 AM ben leinfelder

update the sysmeta data modified when setting archived=true
https://redmine.dataone.org/issues/882

7150 04/30/2012 04:03 PM ben leinfelder

optionally remove the document/data file from the filesystem completely when 'deleting' it.
https://redmine.dataone.org/issues/2677

7149 04/30/2012 03:42 PM ben leinfelder

newer d1 jars that include shared AuthUtils method for isAuthorized() consistency
https://redmine.dataone.org/issues/2661

7148 04/30/2012 03:35 PM ben leinfelder

implement MN and CN.archive() method -- really just the existing delete() methods.
https://redmine.dataone.org/issues/2674
https://redmine.dataone.org/issues/2675

7147 04/30/2012 03:05 PM ben leinfelder

call MN.delete() for each replica when CN.delete() is called
https://redmine.dataone.org/issues/2676

7146 04/30/2012 02:20 PM ben leinfelder

defer to AuthUtils for flattening out the equivIdent subject list.
https://redmine.dataone.org/issues/2661

7145 04/27/2012 10:24 AM ben leinfelder

check normal access control rules for getSystemMetadata before deferring to MN replica information that may grant MNs additional access to the SM.
https://redmine.dataone.org/issues/2656

7144 04/25/2012 03:33 PM ben leinfelder

include Session-less interface methods and updated jars that define them.

7142 04/19/2012 02:04 PM ben leinfelder

remove extraneous pid and permission parameters from isAdminAuthorized() method and make public so that it can be called in other locations - namely before our asynchronous replicate() implementation on the MN.

7141 04/19/2012 01:50 PM ben leinfelder

check for empty or null (missing) node.subjectList. This should probably be a required element in the D1 schema, but it appears not. (ORNL entry was missing subjects in cn-dev environment)

7140 04/19/2012 11:57 AM ben leinfelder

just use the e.getMessage() as e.getCause() may be null (seeing NPE when testing via the MN IT tester)

7139 04/18/2012 04:04 PM ben leinfelder

check for empty or null (missing) node.subjectList. This should probably be a required element in the D1 schema, but it appears not. (ORNL entry was missing subjects in cn-dev environment)

7136 04/17/2012 09:20 AM ben leinfelder

needed to initialize the nodeList that stores matching nodes (by subject) -- this was the source of a NPE when we had a matching node subject.

7134 04/13/2012 04:40 PM Chris Jones

As Ben suggested, don't compare to the node list if there are no replicas listed. This reduces the number of calls to listNodes() on the CN.

7133 04/13/2012 04:32 PM Chris Jones

Minor logging change in throwing ServiceFailure when Hazelcast throws a RuntimeException.

7132 04/13/2012 04:07 PM Chris Jones

Modify getSystemMetadata() to allow nodes that are listed as replicas to access the system metadata. Use the Session.Subject to find a list of nodes from the CN that match the subject, and compare those node ids to the listed replica node ids. Add listNodesBySubject() helper method to do so.
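
The subject-to-replica comparison could be sketched as below. Only the name `listNodesBySubject` comes from the commit message; its signature here, the `isReplicaNode` helper, and the use of a plain map from node id to subject list in place of the CN node list are all assumptions made for illustration.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch, not the actual Metacat implementation. */
final class ReplicaAccessSketch {

    /** Returns the ids of all nodes whose subject list contains the subject. */
    static Set<String> listNodesBySubject(Map<String, List<String>> subjectsByNode,
                                          String subject) {
        Set<String> matches = new HashSet<>();
        for (Map.Entry<String, List<String>> e : subjectsByNode.entrySet()) {
            List<String> subjects = e.getValue();
            // A node entry may have no subject list at all.
            if (subjects != null && subjects.contains(subject)) {
                matches.add(e.getKey());
            }
        }
        return matches;
    }

    /** True when the session subject belongs to a node listed as a replica. */
    static boolean isReplicaNode(Map<String, List<String>> subjectsByNode,
                                 String subject,
                                 Collection<String> replicaNodeIds) {
        // With no replicas listed there is nothing to compare against.
        if (replicaNodeIds == null || replicaNodeIds.isEmpty()) {
            return false;
        }
        Set<String> matches = listNodesBySubject(subjectsByNode, subject);
        matches.retainAll(replicaNodeIds);
        return !matches.isEmpty();
    }
}
```

Returning early when no replicas are listed avoids building the subject match set in that case.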

7128 04/09/2012 03:18 PM ben leinfelder

add a parameter for optionally writing EML-embedded access control rules to the Metacat DB.
https://redmine.dataone.org/issues/2584
https://redmine.dataone.org/issues/2583

7127 04/06/2012 04:22 PM ben leinfelder

added comments and logging about https://redmine.dataone.org/issues/2572

7126 04/06/2012 03:01 PM ben leinfelder

generalize the exception handling because our actions are the same no matter what the specific error is during create - we just notify the CN that the replicate call failed

7125 04/06/2012 02:58 PM ben leinfelder

catch general Exception that may be thrown during MN.replicate() when creating the object locally. There are a few records that keep slipping off our radar with no explanation as to why they remain in "REQUESTED" status.

7123 04/06/2012 01:53 PM ben leinfelder

catch errors for each localid we are processing so that they do not prevent other ids from having ORE content generated

7122 04/06/2012 01:52 PM ben leinfelder

additional debug logging for tracking down MN replication errors

7117 04/04/2012 04:55 PM ben leinfelder

add comment about returning early when no system metadata can be found.
removed extraneous check on the content type of the SM -- was unused.
formatted indenting

7116 04/04/2012 04:49 PM ben leinfelder

for SystemMetadata events we first check the event for the SM value. If it returns null, we look it up from the shared map. It seems as if we don't always get a value with our events.
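
The fallback described here amounts to preferring the value carried by the event and consulting the shared map only when that value is null. A minimal sketch, with a plain `java.util.Map` standing in for the shared map and the hypothetical names `SmEventSketch` and `resolve`:

```java
import java.util.Map;

/** Hypothetical sketch of the event-value fallback. */
final class SmEventSketch {

    /**
     * Returns the value delivered with the event when present; otherwise
     * looks the key up in the shared map (which may also return null).
     */
    static <K, V> V resolve(K key, V eventValue, Map<K, V> sharedMap) {
        if (eventValue != null) {
            return eventValue;
        }
        return sharedMap.get(key);
    }
}
```

A caller still has to handle a null result, since the shared map may hold no value for the key either.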

7115 04/04/2012 03:35 PM ben leinfelder

comment out: synchronize local system metadata on cn restart

7114 04/03/2012 01:31 PM ben leinfelder

synchronize local system metadata on cn restart

7113 04/03/2012 11:58 AM ben leinfelder

additional logging in MN.replicate()

7112 04/03/2012 11:32 AM ben leinfelder

double check "ecogrid" data urls for valid docid.rev - namely integer rev numbers - when parsing EML and also generating system metadata when necessary. Log the errors as warnings.
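
A check of this sort might look like the sketch below. It assumes URLs of the form `ecogrid://<authority>/<docid>.<rev>` and takes the text after the last dot of the final path segment as the revision; both the URL layout and the names `EcogridUrlCheck` and `hasIntegerRev` are assumptions for illustration, not the parsing Metacat actually performs.

```java
/** Hypothetical validator for "ecogrid" data URLs. */
final class EcogridUrlCheck {

    /** True when the URL ends in a docid whose revision is all digits. */
    static boolean hasIntegerRev(String url) {
        if (url == null || !url.startsWith("ecogrid://")) {
            return false;
        }
        // Assumption: the docid.rev is the last path segment.
        String docid = url.substring(url.lastIndexOf('/') + 1);
        int dot = docid.lastIndexOf('.');
        if (dot <= 0 || dot == docid.length() - 1) {
            return false;
        }
        String rev = docid.substring(dot + 1);
        for (int i = 0; i < rev.length(); i++) {
            char c = rev.charAt(i);
            if (c < '0' || c > '9') {
                return false;
            }
        }
        return true;
    }
}
```

A caller would log a warning and skip the entity when the check fails, in line with the behaviour the commit describes.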

7111 04/02/2012 04:11 PM ben leinfelder

log calls to store() system metadata to the backing store

7108 03/30/2012 05:24 PM ben leinfelder

Add the listener for LifecycleEvent state changes

7107 03/30/2012 05:23 PM ben leinfelder

synchronizeLocalStore() when the cluster has a LifecycleEvent state change to RESUMED.

7106 03/29/2012 02:48 PM ben leinfelder

refactor memberAdded code to separate method - synchronizeLocalStore for possible reuse

7101 03/28/2012 11:08 AM ben leinfelder

change ordering of getLogRecords() parameter -- pidFilter is in the middle now

7099 03/27/2012 04:35 PM ben leinfelder

upgrade to latest RC in libclient and common jars -- includes updated getLogRecords and new mn.generateIdentifier method

7098 03/27/2012 02:25 PM ben leinfelder

-use MembershipListener to keep new members' backing store for system metadata synchronized with the shared system metadata map.
-remove the unused InstanceListener interface

7091 03/26/2012 04:25 PM ben leinfelder

add logging statements when there is a problem calling setReplicationStatus

7089 03/26/2012 02:10 PM Chris Jones

Add a few more debugging statements to HazelcastService for troubleshooting hazelcast map concurrency.

7087 03/22/2012 09:31 PM Chris Jones

Use java.util.Calendar rather than com.ibm ...

7086 03/22/2012 03:14 PM Chris Jones

Also allow MNs to set the FAILED status in setReplicationStatus(). this was an oversight on my part, trying to keep MNs that truly did succeed from overriding the COMPLETED status with FAILED.

7084 03/21/2012 11:26 AM ben leinfelder

use current datetime (at system metadata generation) as the date last modified

7083 03/19/2012 06:14 PM Chris Jones

Don't check for populated obsoletes and obsoletedBy fields during CN.create(), only MN.create(). The CN should expect that the MN has populated this field because of existing revision information, and should trust the MN information. Addresses https://redmine.dataone.org/issues/2507.

7082 03/19/2012 06:08 PM Chris Jones

Some minor logging changes.

7079 03/19/2012 10:12 AM ben leinfelder

use isAdminAuthorized() to check access to CN.create(). Note this method takes a pid and permission parameter and neither is used. Also removed the NotFound exception because it would never come up.

7078 03/19/2012 10:01 AM ben leinfelder

check that caller is CN/admin for CN.delete()
https://redmine.dataone.org/issues/2506

7077 03/19/2012 09:52 AM ben leinfelder

include CN.delete()
https://redmine.dataone.org/issues/2506

7076 03/16/2012 04:07 PM Chris Jones

Notify each replica MN when critical portions of system metadata change so the MN can pull the latest copy into its store. AccessPolicy and RightsHolder changes are the most critical for the MN to keep updated on.

7075 03/16/2012 11:40 AM Chris Jones

Only allow CNs to call MN.synchronizationFailed() by calling isAdminAuthorized(). The pid must also be valid.

7074 03/15/2012 07:50 PM Chris Jones

Modify CNodeService.setReplicationStatus() slightly to restrict MN-based calls to only set the status to COMPLETED. The CNs should be setting failures or invalidations, or the status can remain at QUEUED or REQUESTED, and the MNAuditTask can revisit those replicas as needed.

7073 03/15/2012 07:14 PM Chris Jones

Add a notifyReplicaNodes() method that calls MNStorage.systemMetadataChanged() on MN replica nodes for a given object identifier. This will be called when there are changes to AccessPolicy and rights holder since these are critical access metadata for an MN, but they can only be changed on the CN.