add an alternative method for loading system metadata identifiers but leave it commented out. We may find that using the ObjectList method is too much overhead, but it will always be consistent with what metacat reports for listObjects().
add note about long-running load for shared system metadata map
increase amount of text the 'xml_path_index.path' column can accommodate. I was seeing errors like this during indexing: knb 20120312-11:42:05: [ERROR]: DocumentImpl.buildIndex - SQL Exception while indexing document knb-lter-and.3147 : ERROR: value too long for type character varying(1000) [edu.ucsb.nceas.metacat.DocumentImpl]
Added the following values to the HTTPD site configuration: JkOptions +ForwardURICompatUnparsed AllowEncodedSlashes On AcceptPathInfo On
If PID is not part of the multipart params, we end up with a NullPointerException. Throw an InvalidRequest in this case rather than ServiceFailure resulting from the NPE.
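A minimal sketch of the guard described above, assuming the multipart params arrive as a simple map keyed by "pid". `InvalidRequestSketch` is a hypothetical stand-in for the DataONE InvalidRequest type (unchecked here only to keep the example short), and `requirePid` is an illustrative name, not a Metacat method.

```java
import java.util.Map;

// Hypothetical stand-in for the DataONE InvalidRequest exception (assumption).
final class InvalidRequestSketch extends RuntimeException {
    InvalidRequestSketch(String message) {
        super(message);
    }
}

final class PidParamCheck {
    // Returns the "pid" multipart parameter, or fails with an explicit
    // invalid-request error instead of letting a NullPointerException
    // surface later as a generic service failure.
    static String requirePid(Map<String, String> multipartParams) {
        String pid = (multipartParams == null) ? null : multipartParams.get("pid");
        if (pid == null || pid.trim().isEmpty()) {
            throw new InvalidRequestSketch("pid is a required multipart parameter");
        }
        return pid;
    }
}
```

Checking at the point where the params are read keeps the error message specific to the missing field.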
add note about https://redmine.dataone.org/issues/2451 to the documentation
translate "insert" events in Metacat as Event.CREATE events ("create") for DataONE https://redmine.dataone.org/issues/2461
for good measure, use the D1 encoding util for url decoding the parameters for listObjects https://redmine.dataone.org/issues/2460
log record paging: use start and count parameters; if start+count exceeds the total number of records, then only return from start to the end of the list; if start exceeds total record count, start at the end of the list (will be empty list) https://redmine.dataone.org/issues/2458
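The paging rules above can be sketched as a clamp on both ends of the requested window. This is an illustration only: `LogPagingSketch` and `page` are hypothetical names, and the records are assumed to be held in an in-memory list rather than fetched from the database.

```java
import java.util.ArrayList;
import java.util.List;

final class LogPagingSketch {
    // Returns at most `count` records beginning at index `start`.
    // - start + count past the end: return from start to the end of the list
    // - start past the end: start at the end of the list (empty result)
    static <T> List<T> page(List<T> records, int start, int count) {
        int total = records.size();
        int from = Math.min(Math.max(start, 0), total);
        // long arithmetic avoids int overflow when count is very large
        int to = (int) Math.min((long) from + Math.max(count, 0), (long) total);
        return new ArrayList<>(records.subList(from, to));
    }
}
```

For a five-element list, `page(list, 3, 10)` yields the last two elements and `page(list, 7, 2)` yields an empty list.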
Use 'fromDate' and 'toDate' as listObject param filters to comply with the API documentation. We had changed this in MNResourceHandler, but somehow missed it in CNResourceHandler.
check whether mapping (
catch additional NotFound exception for: "do not include log entries for documents that the caller is not allowed to read." https://redmine.dataone.org/issues/2444
serialize exception in header for describe response when there is a BaseException https://redmine.dataone.org/issues/2440
do not include log entries for documents that the caller is not allowed to read. https://redmine.dataone.org/issues/2444
use revision provided in the docid when looking up guid. had been using latest revision, which I think incorrectly reports on the log history. noticed this when looking at: https://redmine.dataone.org/issues/2444
Add testIsEquivIdentityAuthorized() to ensure that [MN|CN].isAuthorized() is authorizing equivalent identities correctly. Note: Using TypeMarshaller.marshalTypeToOutputStream(type, System.out) to serialize an object seems to jack up output to stdout - not sure why.
A minor change to isAuthorized() - compare each Person in the SubjectInfo (not just the primary Subject) since each person could have an equivalent identity mapped to the primary Subject. Add debug logging for the comparison.
added debug logging https://redmine.dataone.org/issues/2429
check if verified flag is null before evaluating (NPE during MN Auth test) https://redmine.dataone.org/issues/2429
throw InvalidToken when there is invalid SubjectInfo embedded in the certificate https://redmine.dataone.org/issues/2431
fixed Oracle script issues identified by: Brian Turcotte <bturcott@sfwmd.gov>. He provided the fixes, so thank you!
do not include stylesheet for list of checksum algorithms -- there is no template for it and therefore looks blank in a browser
update docs to match node registration behavior: we do not assign them nodeIds at registration
Roll back the nodeId default to blank (used to indicate registration on new installs - thanks Matt.)
Add a default nodeId in metacat.properties of 'urn:node:METACAT1' as a placeholder that needs to be changed on configuration.
Globally change the property 'dataone.memberNodeId' to 'dataone.nodeId'. This is more useful for both MNs and CNs implemented in Metacat. Also, change D1NodeService.getLogRecords() to return log entries with the actual node id rather than the IP address (looks like a cut/paste error)....
throw InvalidToken when an invalid Permission is passed in. This requires that internal calls to the method also check for this exception. https://redmine.dataone.org/issues/2388
Set mime type on images.
Set mime type.
call deleteReplica when we get that request (looks like an undetected copy and paste error)
do not allow blank node references to be used. https://redmine.dataone.org/issues/2362
only generate system metadata when the call comes from the legacy Metacat API, not the D1 API. https://redmine.dataone.org/issues/2362 (I think this was the culprit)
do not "lookup" object format when retrieving system metadata -- just return what we have stored as the formatId and don't [erroneously] default it to binary when there's a problem with the lookup (cache or service or otherwise). https://redmine.dataone.org/issues/2365
Get ReplicationPolicy correctly generated: tweak the regular expression for getting the pref/blocked node list for default replication policy; set blocked list (had mistakenly been two calls to set pref list)
actually, let's set the serialVersion during the MN.create() call so that the HZ map and the backing store have the same information immediately. Also, this is how the docs specify it. http://mule1.dataone.org/ArchitectureDocs-current/design/SystemMetadata.html
if serialVersion is null, use default value of 0
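A minimal sketch of the null default, assuming serialVersion is carried as a `BigInteger`. `SerialVersionDefault` and `orDefault` are hypothetical names used only for illustration.

```java
import java.math.BigInteger;

final class SerialVersionDefault {
    // Falls back to 0 when no serialVersion was supplied (assumed type: BigInteger).
    static BigInteger orDefault(BigInteger serialVersion) {
        return (serialVersion == null) ? BigInteger.ZERO : serialVersion;
    }
}
```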
handle both listing and getting checksums using the GET endpoint -- depends whether or not a pid is included in the URL https://redmine.dataone.org/issues/2089
include systemmetadata and ore generation flags as "remembered" configuration values for the admin UI.
remove ID mapping when a create()/"insert" call fails so that subsequent calls do not return an IdentifierNotUnique error. In this case it was due to invalid XML. https://redmine.dataone.org/issues/2341
use RC-3 DataONE jars and fix compilation error that arose. https://redmine.dataone.org/issues/2351
overload getAllDocidsByType() method for backward semtools compatibility
Use 'a2dissite' to disable the default site (not 'a2ensite').
do not subset the list for MS generation testing -- at least not as the default in svn!
CNodeService.listChecksumAlgorithms() was returning null rather than the list. Fixed.
restore "test" target that I nuked when adding runoneclass. (thanks, Chris)
ObjectFormatCache.getFormat(String formatStr) has been deprecated, and now only takes a formatId instance to get a format from the cache. It also throws Service Failure and NotImplemented, so here just set the format to application/octet-stream in any case.
Update D1NodeService to reflect new ObjectFormatCache signature.
Adding the new d1 [common|libclient] RC2 jars from the D1_COMMON_JAVA_v1.0.1-RC2 and D1_LIBCLIENT_JAVA_v1.0.1-RC2 tags in the repository.
only run ORE generation for EML docs -- no need to run this for all documents (yikes!)
use IdMan method to find docids that do not already have system metadata records -- this lets us re-run without recomputing system metadata for every entry (in case the process is interrupted). I haven't been using this option because I wanted to continually regenerate all SM for everything in my test DBs, but we are so close to release that I want to get this in there.
for testing: limit and randomize the docs to generate metadata for
FOR TESTING ONLY: limit number of records to 100 so that we can get an estimate
update the memberNodeId in existing system metadata only after the register/update is successful with the CN -- we can avoid unneeded SM updates in cases when the register/update fails because we gave the CN bad info that it rejects. https://redmine.dataone.org/issues/2308
include member node id text field now that the CN is not assigning random Ids. https://redmine.dataone.org/issues/2308
1. lookup and use the guid when processing obsoletes/obsoletedBy entries -- had previously been assuming localId==guid but now that we have introduced DOIs as part of the Metacat upgrade process, we may have DOIs for the guid that map to localIds. 2. base ORE guids on the localid of the data package they are describing and not on their DOI -- otherwise we might mash up the DOI prefix (or other id scheme that we are unaware of). By using resourceMapPrefix + localId we are sure to have a valid localid and guid for the ORE map we create and add to the system
use updated authorization policies as discussed in: https://redmine.dataone.org/issues/2277 and http://epad.dataone.org/20120131-authn-authz-questions
refactor D1-specific upgrade utilities into their own package
remove createAndInsertSystemMetadata() method that acts on a single localId -- incorporated this into the localId-list-based method.
refactor IdentityManager.createSystemMetadata(sm) to be insertSystemMetadata(sm) so that it is clear that this method inserts the SM object into the backing store. This differentiates it from the "generation" methods we use when we need to create SM about pre-existing objects or objects we get from non-D1 api calls.
generate SystemMetadata during D1 registration (not 2.0.0 upgrade). This process runs in a thread and updates a metacat.properties value when it is complete.
getMultipartParameters() outside of debug block -- thanks Mark Reyes @ CDL for catching this.
dataone configuration and registration enhancements: include flag to disable D1 services (currently only the MN side enforces this); do not allow multiple registration attempts if we have just submitted and are awaiting Node verification by the CN; do not allow configuration "bypass" if D1 settings have been configured previously....
use correct Collections import
Show "Update" button if this MemberNodeId is already registered with DataONE, otherwise use the "Register" label
match changes to MN service methods (return type as boolean)
updated d1 jars with latest libclient changes and objectformatcache use
Updated configuration documentation in admin guide for Metacat DataONE section. Changed links in configuration utility to point at the Admin guide.
Added new methods to generate a default replication policy based on properties from the metacat configuration. This is called during system metadata creation for objects that lack any system metadata.
Modify admin configuration to include default replication policy. Extensively revised the DataONE configuration page, including new wording for intro, improved tooltips throughout, new arrangement of sections, and other cosmetic changes.
Clean up warnings in class.
Remove ability to edit NodeID from D1 configuration page. Fix update of contactSubject and dataone.ore.generated property name.
include flag indicating that system metadata generation has completed (useful for independent long-running thread)
handle "BIN" objects so as to avoid repeated calls to lookup the non-existent ObjectFormat
do not wait for SM generation to complete during the upgrade -- this way the web UI won't hang for days. the process sets a metacat property when it is complete.
Fixed a bug where a hyperlink included the username/password input fields.
use RC-2 DataONE jars -- these are built from trunk still, but include the next tag naming convention
do not shutdown hazelcast -- it needs to be running after the upgrade process so that Metacat actually works. I think the newer version of HZ makes it so the threads are all released as needed. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5572
Commenting out the parts of the upgrade script that started to refer to EZID. At this point, the registration of EZID identifiers will be done out-of-band with respect to the upgrade.
use plain String parameter for {pid} instead of XML serialization of it.
remove {pid} from POST URL on CN.registerSystemMetadata()https://redmine.dataone.org/issues/2284
remove {pid} from POST URL on CN.create()https://redmine.dataone.org/issues/2284
remove {pid} from POST URL on MN.create()https://redmine.dataone.org/issues/2284
catch cases where the previous/next revision of objects have not had system metadata generated yet
create system metadata object if it wasn't found in HZ
adjust the width of the label suffix.
process systemMetadata from the docInfo string before writing to the database so that we guarantee guid-docid mapping exists before attempting to look it up.
Adjust the column width of the search result.
upgrade to hazelcast 1.9.4.6 so that threadpools are released when not needed (http://code.google.com/p/hazelcast/issues/detail?id=765). include ant target to run a specific main class (mostly for debugging)
use File.deleteOnExit() not a half hour timer thread to do it.
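For reference, a short sketch of the `File.deleteOnExit()` pattern. The class name, method name, and ".tmp" suffix are hypothetical; only `File.createTempFile` and `File.deleteOnExit` are standard JDK calls.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

final class TempFileSketch {
    // Creates a scratch file and asks the JVM to remove it at normal shutdown,
    // so no timer thread is needed for cleanup.
    // Note: File.createTempFile requires a prefix of at least three characters.
    static File createScratchFile(String prefix) {
        try {
            File tmp = File.createTempFile(prefix, ".tmp");
            tmp.deleteOnExit();
            return tmp;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

One trade-off to keep in mind: the file stays on disk until the JVM exits, which in a long-running servlet container can be much longer than a half hour.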
multithreaded implementation for processing docids for system metadata generation. need to investigate ant/junit running that deadlocks hazelcast (config?)
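One way to fan docid processing out over a fixed pool of threads, shown as a sketch rather than the code in this commit. `DocidWorkerPoolSketch`, `processAll`, the one-hour timeout, and the `Consumer<String>` worker are all assumptions; the per-docid system metadata generation itself is left to the caller.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

final class DocidWorkerPoolSketch {
    // Runs `worker` once per docid on a fixed-size pool and returns how many
    // docids completed without throwing.
    static int processAll(List<String> docids, int threads, Consumer<String> worker) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger completed = new AtomicInteger();
        for (String docid : docids) {
            pool.submit(() -> {
                try {
                    worker.accept(docid);
                    completed.incrementAndGet();
                } catch (RuntimeException e) {
                    // One bad document should not stop the rest of the batch.
                    System.err.println("failed on " + docid + ": " + e.getMessage());
                }
            });
        }
        pool.shutdown(); // accept no new tasks; queued tasks still run
        try {
            // Timeout value is arbitrary for this sketch.
            if (!pool.awaitTermination(1, TimeUnit.HOURS)) {
                pool.shutdownNow();
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }
}
```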
additional logging of the config file being used - seem to have thread locking on the xmlConfig use when running under ant/junit
calculate object size using the size on the file system rather than re-reading as an input stream. Now only EML document bytes will be read twice: once for the checksum and again for parsing out datapackage details
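A sketch of the file-system size lookup, assuming the object is available as a local `File`. `ObjectSizeSketch` and `sizeOnDisk` are hypothetical names.

```java
import java.io.File;

final class ObjectSizeSketch {
    // Reads the size from file-system metadata instead of streaming the bytes.
    static long sizeOnDisk(File file) {
        if (!file.isFile()) {
            // File.length() returns 0 for a missing file, so fail loudly instead.
            throw new IllegalArgumentException("not a regular file: " + file);
        }
        return file.length();
    }
}
```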
system metadata generation optionally skips entries that have already been generated (data size, checksum) but allows the latest EML that describes them to have the last word on object format
remove DML for parsing -- the D1 EML parser still uses DOM, so this may not be too big of a performance improvement
test harness for running system metadata generation outside of the upgrade process
include comment about KNB estimated time to run during upgrade: Total time: 20 minutes 58 seconds
only attempt to update date-like nodedata values.
use "test" to exercise upgrade code on staging DB.