
metacat / src / edu @ 6130

# Date Author Comment
6130 06/07/2011 02:56 PM ben leinfelder

organize imports so that it is clearer what dependencies exist on the D1 jars

6129 06/07/2011 12:08 PM ben leinfelder

include create() and reserveIdentifier() methods

6128 06/07/2011 11:14 AM ben leinfelder

include override annotation for register method

6127 06/07/2011 11:09 AM ben leinfelder

use Date not joda's DateTime

6126 06/07/2011 10:28 AM ben leinfelder

expose spatial cache regeneration option in the admin interface

6125 06/07/2011 10:27 AM ben leinfelder

force replication for newly-registered system metadata

6124 06/07/2011 09:53 AM Chris Jones

Merged in the D1_0_6_2_BRANCH changes that include the transition from ObjectFormat calls to ObjectFormatCache calls.

6123 06/06/2011 03:56 PM ben leinfelder

check system metadata for the id as well (in cases when we only have system metadata)

6122 06/06/2011 03:28 PM ben leinfelder

include GUID column for xml_access and related methods for storing/retrieving access rules

6121 06/06/2011 12:08 PM ben leinfelder

implement the old interface for now (until 0.6.2)

6120 06/03/2011 12:51 PM ben leinfelder

include CNCore implementation - only registerSystemMetadata is implemented at the moment. also - updated d1 jar (0.6.2) should be used since that is where the method is defined.
would like to consider making ResourceHandler more modular - seems like it does A LOT of different things

6119 06/02/2011 04:40 PM ben leinfelder

include System Metadata forced replication - just need to figure out when to call it!

6118 06/01/2011 01:45 PM ben leinfelder

handle timed replication of system metadata. there are still a few outstanding issues:
-track server location of system metadata-only entries
-replication policy flag for system metadata-only entries?
-locking for replicated entries?
-forced replication of entries

6108 05/27/2011 11:55 AM ben leinfelder

read and write D1 access policy rules from metacat xml_access table.
still TBD: which mechanism takes precedence when there are systemMetadata access rules and EML access rules and other access rules?

6107 05/27/2011 09:45 AM ben leinfelder

persist system metadata replication policy and status using db tables

6105 05/26/2011 11:51 AM ben leinfelder

rework SystemMetadata creation when inserting documents via the Metacat servlet api (in which case there was no client-supplied system metadata)

6104 05/26/2011 10:25 AM ben leinfelder

do not look in systemMetadata for a docid->guid mapping

6102 05/25/2011 03:53 PM ben leinfelder

transfer full System Metadata (as XML) during document and data replication

6099 05/25/2011 11:59 AM ben leinfelder

-remove system metadata guid -> local id mapping (there is no document for system metadata now)
-include system metadata elements when replicating data objects (TODO: transfer all system metadata structures with the docinfo request).
TODO: remove docid+rev from the systemMetadata table definition

6097 05/24/2011 04:18 PM ben leinfelder

do not use XML files for storing SystemMetadata - use DB tables only.

6092 05/19/2011 01:52 PM Matt Jones

Modified Metacat to build against the D1_SCHEMA_0_6_1 branch of the dataone schemas by incorporating the 0.6.1-SNAPSHOT version of d1_common and d1_libclient libraries, and refactoring Metacat code references to the d1 schema changed types.

6091 05/18/2011 04:32 PM Chris Jones

In order to sync up with DataONE 0.6.1 changes, I'm backing out ObjectFormatService changes temporarily in Metacat. Most functionality will be rolled back in using the DataONE 0.6.2 tag, but some methods in ObjectFormatService (such as getListFromDisk()) will be moved into d1_libclient_java.

6088 05/17/2011 08:02 PM Chris Jones

Changes in the DataONE ObjectFormat class deprecate the convert() method, and we're now using Metacat's ObjectFormatService to look up object format attributes. The following changes replace ObjectFormat.convert() with ObjectFormatService.getFormat() in several classes....

6079 05/05/2011 03:14 PM ben leinfelder

use update method to update the mapping between local and guid (d1) when we get a force replication request that is an "update"

6068 05/05/2011 10:45 AM rnahf

generateMissingSystemMetadata was swallowing Exceptions instead of throwing. Refactored so that specific exceptions are thrown, affecting [create/update]SystemMetadata methods, too.

6067 05/05/2011 10:19 AM rnahf

committing changes related to the new restservice update specification (newPid vs. obsoletedGuid)

6064 05/05/2011 10:00 AM ben leinfelder

replace whitespace in generated docid scope (sanparks patch from 1.9.4 branch)

6059 05/04/2011 12:39 PM ben leinfelder

use outputstream as an object, not a string. relax the Map typing to allow for mixed values. (sanparks patch)

6057 05/03/2011 04:04 PM ben leinfelder

use "object_format" element consistently so that it is replicated across instances

6053 04/28/2011 02:31 PM ben leinfelder

remove very old "metacat webservice" code - as far as i can tell it is never referenced or used. plus we have eocgrid and the new D1 rest services covering this territory now

6051 04/28/2011 01:05 PM rnahf

zero padded date string in DocumentUtil.generateDocumentId() for readability

6050 04/26/2011 08:22 AM Chris Jones

Use SystemUtil.getContextURL() in ResourceHandler to construct the DataONE service URL (rather than direct calls to PropertyService). This handles http and https URLs, and strips the :80 or :443 for the well known ports.
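The port handling described in this commit can be illustrated with a minimal sketch. This is not the actual `SystemUtil.getContextURL()` implementation; the class `ContextUrlSketch`, the method `buildContextUrl`, and its parameters are hypothetical names chosen for illustration, and the host and context values are made up.

```java
// Hypothetical illustration only; not Metacat's SystemUtil code.
class ContextUrlSketch {

    // Builds scheme://host[:port]/context, omitting the port when it is
    // the well-known default for the scheme (80 for http, 443 for https).
    static String buildContextUrl(String scheme, String host, int port, String context) {
        boolean wellKnownPort = ("http".equals(scheme) && port == 80)
                || ("https".equals(scheme) && port == 443);
        String authority = wellKnownPort ? host : host + ":" + port;
        return scheme + "://" + authority + "/" + context;
    }

    public static void main(String[] args) {
        // Default port is dropped, non-default port is kept.
        System.out.println(buildContextUrl("http", "example.org", 80, "knb"));
        System.out.println(buildContextUrl("https", "example.org", 8443, "knb"));
    }
}
```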

6048 04/25/2011 03:25 PM Chris Jones

Minor changes to MetacatHandler:
- Improved logging where MetaCatServlet.class was used in getLogger() rather than MetacatHandler.class (holdover from the refactor)
- Minor formatting changes, and replacement of 'MetaCatServlet' with 'MetacatHandler' in the logging output as needed.

6045 04/25/2011 11:08 AM rnahf

improved multipart handling (improved logging messages, code, and error checking). Added exception classname to error output when the generic Exception is thrown. Added error check for cases of null value for file parts 'sysmeta' and 'object.'

6044 04/25/2011 10:58 AM rnahf

added a few debugging lines in createSystemMetadata() related to contents of identifier strings

6042 04/24/2011 05:42 PM Chris Jones

Modified IdentifierManager.getDocumentInfo() to include the docid in the returned hash map, since it is useful to be able to obtain the docid and rev separately from a given fullDocidWithRev (e.g. test.1.1).
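As a rough sketch of what separating the docid from the revision might look like, the example below splits a full identifier at its last dot. It does not reproduce `IdentifierManager.getDocumentInfo()`; the class `DocidSplitSketch`, the method `splitDocid`, and the map keys `docid` and `rev` are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not Metacat's IdentifierManager code.
class DocidSplitSketch {

    // Splits "scope.id.rev" at the last dot, so "test.1.1" yields
    // docid "test.1" and rev "1".
    static Map<String, String> splitDocid(String fullDocidWithRev) {
        int lastDot = fullDocidWithRev.lastIndexOf('.');
        if (lastDot <= 0 || lastDot == fullDocidWithRev.length() - 1) {
            throw new IllegalArgumentException("No revision in: " + fullDocidWithRev);
        }
        Map<String, String> info = new HashMap<>();
        info.put("docid", fullDocidWithRev.substring(0, lastDot));
        info.put("rev", fullDocidWithRev.substring(lastDot + 1));
        return info;
    }

    public static void main(String[] args) {
        Map<String, String> info = splitDocid("test.1.1");
        System.out.println(info.get("docid") + " / " + info.get("rev"));
    }
}
```

Splitting at the last dot rather than the first keeps the sketch usable for scopes that themselves contain dots.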

6041 04/22/2011 01:44 PM rnahf

fixing annoying error message inaccuracy

6037 04/14/2011 03:07 AM Matt Jones

Changed AuthLDAP to deal with cases where getAttributes encounters non-string
attributes (which used to cause a ClassCastException). Now, if an attribute
value can not be cast to string, we catch the class cast exception and just
skip this value. This only typically occurs when an LDAP server is set to send...
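The skip-on-cast-failure pattern this commit describes could look roughly like the sketch below. It works on a plain list of values rather than the JNDI attribute enumeration that AuthLDAP presumably uses, and `AttributeValueSketch` and `stringValues` are hypothetical names, not part of Metacat.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not Metacat's AuthLDAP code.
class AttributeValueSketch {

    // Collects the values that can be cast to String and silently skips
    // any value (for example a byte array) that cannot.
    static List<String> stringValues(List<Object> rawValues) {
        List<String> result = new ArrayList<>();
        for (Object raw : rawValues) {
            try {
                result.add((String) raw);
            } catch (ClassCastException e) {
                // Non-string attribute value: skip it instead of failing.
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Object> raw = Arrays.<Object>asList("uid=jones", new byte[] {1, 2}, "cn=Matt");
        System.out.println(stringValues(raw));
    }
}
```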

6036 04/13/2011 08:03 PM Matt Jones

Modified MetacatHandler to catch cases where ObjectFormat is not being set properly on data files when
generating SystemMetadata. When the EML document contains a format for an entity that maps to a null
type in ObjectFormat.convert(), then the type ends up being null and an error is generated on insertion...

6035 04/11/2011 05:46 PM ben leinfelder

allow "docid override" queries to include the results of a "normal" query - if the operator is left null, it acts as the usual override, otherwise UNION and INTERSECT modes can be used to either augment or refine the results.
this is for incorporating semantic+spatial+keyword queries into one query operation/result

6034 04/08/2011 10:22 AM ben leinfelder

remove System.out statements in favor of logging

6033 04/08/2011 08:56 AM Chris Jones

Removed hardcoded D1 node type in ResourceHandler and added in a new 'dataone.nodeType' property. Also added 'dataone.coordinatingNodeBaseURL' property which points to the CN that stores the authoritative object format list. If this instance of Metacat is a CN, it may point to itself.

6032 04/08/2011 08:28 AM Chris Jones

ResourceHandler in Metacat was set to return the KNB site URL as the MN base URL rather than the node Id. Fixed.

6031 04/07/2011 03:17 PM ben leinfelder

initialize the HandlerPluginManager

6030 04/07/2011 03:16 PM ben leinfelder

allow the addition of properties via code

6029 04/01/2011 05:00 PM ben leinfelder

add event notification for insert/update/delete on documents (for semtools plugin)

6027 03/31/2011 05:12 PM ben leinfelder

do not attempt to check permissions when reading documents for systemMetadata generation (unless I completely do not understand this feature - please verify!).

6025 03/29/2011 11:23 AM ben leinfelder

do each table separately with its own connection - running into memory issues on dev.nceas running this.

6023 03/28/2011 08:51 AM Chris Jones

This is the start of the ObjectFormatService, which manages the list of object formats registered within Metacat. This includes schema types, mime types, and other information related to a particular format. The service provides functionality for the DataONE MemberNode and CoordinatingNode components, with CoordinatingNodes providing the authoritative list of object formats. See

6022 03/25/2011 03:04 PM Duane Costa

Bug 3835 - design and implement OAI-PMH compliant harvest subsystem
Minor bug fix to handle irregular Metacat docids containing two or more dot ('.') characters. In the LTER Metacat, the following docids (scope and identifier, excluding the revision value) have that characteristic:...

6021 03/25/2011 10:43 AM Duane Costa

Bug 3835 - design and implement OAI-PMH compliant harvest subsystem
Return a 'badVerb' response when the 'verb' request parameter is missing from the request. Previously this generated a NullPointerException.
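A null-safe verb check of the kind this fix implies might be sketched as follows. The verb names and the `badVerb` error code come from the OAI-PMH protocol; the class `VerbCheckSketch` and the method `errorCodeFor` are hypothetical and do not reflect the harvest subsystem's actual code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; not the Metacat OAI-PMH provider code.
class VerbCheckSketch {

    // The six verbs defined by the OAI-PMH protocol.
    private static final Set<String> VERBS = new HashSet<>(Arrays.asList(
            "Identify", "ListMetadataFormats", "ListSets",
            "ListIdentifiers", "ListRecords", "GetRecord"));

    // Returns "badVerb" when the verb is missing or unknown, otherwise null.
    // Checking for null first avoids the NullPointerException case.
    static String errorCodeFor(String verb) {
        if (verb == null || !VERBS.contains(verb)) {
            return "badVerb";
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(errorCodeFor(null));
        System.out.println(errorCodeFor("Identify"));
    }
}
```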

6020 03/24/2011 03:10 PM ben leinfelder

use the jaxb date parser for ISO 8601 formats. the numeric and date node values are now calculated after the document has been successfully inserted in the db so any sql exceptions do not prevent the raw node data from being saved.

6019 03/23/2011 03:39 PM ben leinfelder

rollback the accessDAO changes - leaving well enough alone.

6018 03/23/2011 02:14 PM ben leinfelder

only include accessfileid when it is not toplevel

6017 03/23/2011 01:31 PM ben leinfelder

include accessfileid and subtreeid when inserting xml_access values

6016 03/23/2011 12:51 PM ben leinfelder

use access control dao for setting access in EML parser. send additional xml_access info in replication request

6015 03/22/2011 05:19 PM ben leinfelder

insert/update documents with null user and null group to circumvent access control restrictions then update the user_owner and user_updated values to reflect what exists on the originating server (pisco)

6014 03/22/2011 01:24 PM ben leinfelder

use 'user_updated' field when writing the replicated document - allows most recent ownership/permissions to be used (in case LDAP groups have shifted) and is more accurate for both updates and initial inserts (hopefully addresses the replication issue we are having with pisco)

6012 03/16/2011 10:56 PM ben leinfelder

add support for temporal element query in pathquery

6001 03/02/2011 02:12 PM Chris Jones

DocumentImpl.delete() now throws finer grained exceptions (not a general exception). Consequently, the classes that call it have been updated to handle the thrown exceptions, including CrudService, ReplicationHandler, and ReplicationService.

6000 03/02/2011 10:39 AM ben leinfelder

refactor the names of these Data Manager implementation classes so that it's easier to use them with the default/local versions of similar. These classes utilize Metacat-specific configuration values rather than relying solely on the bundles that are used in the stand-alone DM lib.

5998 03/01/2011 09:13 AM Chris Jones

To support GUIDs in MetacatHandler.handleDeleteAction(), I've added in a new method:
deleteFromMetacat() - deletes a document based on the docid
This factors the deletion code out of handleDeleteAction(). handleDeleteAction() now does a docid lookup based on GUID, and if it is not found, reverts to the deletion based on docid instead.

5977 02/17/2011 02:57 AM Chris Jones

These are fairly significant changes to MetacatHandler.handleInsertOrUpdateAction() that add in support for creation or update of GUIDs and SystemMetadata. Upon insertion or update of DataPackages from non-DataONE aware clients (such as Morpho), the identifier table is updated by creating a GUID, and the systemmetadata table is updated with fields after the EML document is parsed for distribution information and entity typing. System Metadata documents are also generated and inserted into Metacat. The list of data entities is iterated over and System Metadata is generated for each data file as well.

5976 02/17/2011 02:39 AM Chris Jones

In MetacatHandler I've removed updateSystemMetadata() in favor of additions to insertOrUpdateSystemMetadata(). Modified createSystemMetadata() to reflect the changes as well.

5975 02/16/2011 06:23 PM Chris Jones

Modified MetacatHandler.createSystemMetadata() to take a localId, not a guid as an argument since there are times when the guid has not yet been created, and it is created in this method if so.
Also, put the read() call to get the InputStream of the data/metadata document into its own try/catch statement.

5968 02/16/2011 01:56 PM Chris Jones

Somehow missed adding in javadoc for read(). Here it is.

5967 02/16/2011 01:52 PM Chris Jones

For now, getSystemMetadata() will be private like the other *SystemMetadata() methods.

5966 02/16/2011 01:47 PM Chris Jones

Modified MetacatHandler, updated the getSystemMetadata() method to now use read() and deserializeSystemMetadata() to produce the SystemMetadata object. Exceptions are pushed up the stack, and so accordingly, modified createSystemMetadata() to reflect the changes.

5962 02/16/2011 09:25 AM Chris Jones

Modified MetacatHandler, added createSystemMetadata() - generates SystemMetadata objects for newly inserted data or documents. This is intended to be used from handleInsertOrUpdateAction(), and only for documents being inserted from clients that don't support the DataONE interface. The method parses EML documents to discover data entities, and updates the system metadata for those entries, with support for describes and describedBy metadata. Currently doesn't handle FGDC, etc. documents....

5961 02/16/2011 09:14 AM Chris Jones

Modified MetacatHandler, added three methods:
getSystemMetadata() - returns a SystemMetadata object from the systemmetadata table using the given GUID. Stub only.
updateSystemMetadata() - updates the systemmetadata table using the given SystemMetadata object....

5960 02/16/2011 09:10 AM Chris Jones

Modified MetacatHandler and added two methods:
serializeSystemMetadata() - Serialize a SystemMetadata object to XML string
deserializeSystemMetadata() - Deserialize a SystemMetadata object from an XML string

5959 02/16/2011 09:07 AM Chris Jones

Modified MetacatHandler, added read() - Read a document from metacat and return an InputStream. The XML or data document should be on disk, but if not, read from the metacat database. This method should be optimized, along with others, to not write stream data to disk for performance reasons.

5958 02/16/2011 09:06 AM Chris Jones

To support generation of SystemMetadata and GUIDs, added a number of methods to MetacatHandler that are also in CrudService(). CrudService should eventually be refactored to use the handler methods. Added:
readFromFilesystem() - Read a file from Metacat's configured file system data directory, and return a FileInputStream. This code has been factored out of handleInsertOrUpdateAction()....

5957 02/15/2011 02:08 PM berkley

fixed bug where the wrong checksum alg got written to the db

5955 02/14/2011 11:45 AM berkley

added file extension for txt or csv files

5953 02/13/2011 11:20 AM Chris Jones

To support the generatemissingsystemmetadata REST call, modified CrudService.createSystemMetadata() to use DataoneEMLParser and further determine object formats from EML metadata. Formats currently supported are text/plain, text/csv, image/[jpg|jp2|bmp|tiff|png], and only for EML documents with 'ecogrid://' defined entity urls....

5950 02/11/2011 01:48 PM berkley

adding more debugging and fixing bug with systemmetadata

5945 02/10/2011 06:06 PM Jing Tao

Add code to download the included schema.

5944 02/10/2011 04:17 PM berkley

fixed replication bug where systemmetadata was not getting processed correctly

5943 02/10/2011 10:56 AM berkley

think I fixed the connection problem. one connection in IdentifierManager was being leaked. added more debug info in case it happens again

5941 02/10/2011 10:27 AM Jing Tao

Add a static method to get base url based on a schema url.

5938 02/09/2011 05:00 PM Jing Tao

A SAX handler class can get the included schema path.

5933 02/08/2011 02:08 PM berkley

added some debug info to DBConnectionPool

5930 02/08/2011 12:05 PM berkley

fixed typo that prevented replication

5929 02/08/2011 11:46 AM berkley

fixed node response bug

5927 02/07/2011 03:41 PM berkley

fixed update problem

5926 02/07/2011 02:49 PM berkley

put the pid in the info section of the url

5923 02/07/2011 10:50 AM berkley

fixed content type problem where csv files were set as text/xml

5922 02/04/2011 06:34 PM berkley

fixed problem with count in listObjects()

5921 02/04/2011 06:14 PM Matt Jones

Cleaned up warnings, removed dead code.

5920 02/04/2011 05:53 PM Matt Jones

Updated to most recent DataONE libraries. Updated CrudService to set the correct origin MN and auth MN in system metadata. Refactored exception passing. More work to come in generating SystemMetadata.

5918 02/04/2011 01:05 PM berkley

removed debug statements

5917 02/04/2011 12:39 PM berkley

fixed bugs in listObjects

5916 02/04/2011 11:30 AM Matt Jones

Add in correct node references in system metadata.

5915 02/04/2011 10:29 AM Matt Jones

Cleanup harvester exceptions and generics.

5914 02/04/2011 06:17 AM ben leinfelder

remove httpclient 3.1 and custom-built httpclient.jar
rework MetacatClient (and other classes) to use httpclient 4
updated build to not create httpclient.jar
encoding tests now pass.

5913 02/04/2011 05:57 AM ben leinfelder

blank configuration value should be treated the same as null

5906 02/03/2011 03:39 PM berkley

few bug fixes for listObjects

5895 02/03/2011 03:25 PM berkley

added code to do database query for listObjects

5893 02/03/2011 01:19 PM Matt Jones

Cleaned up unused imports.