do not attempt to check permissions when reading documents for systemMetadata generation (unless I completely do not understand this feature - please verify!).
do each table separately with its own connection - running into memory issues on dev.nceas running this.
This is the start of the ObjectFormatService, which manages the list of object formats registered within Metacat. This includes schema types, mime types, and other information related to a particular format. The service provides functionality for the DataONE MemberNode and CoordinatingNode components, with CoordinatingNodes providing the authoritative list of object formats. See https://redmine.dataone.org/issues/1378....
Bug 3835 - design and implement OAI-PMH compliant harvest subsystem. Minor bug fix to handle irregular Metacat docids containing two or more dot ('.') characters. In the LTER Metacat, the following docids (scope and identifier, excluding the revision value) have that characteristic:...
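One way to handle docids whose scope itself contains dots is to split from the right instead of on the first dot. The sketch below assumes the last two dot-separated fields are always the identifier and the revision, and that everything before them is the scope. The class name, method name, and sample docid are hypothetical and are not taken from the harvester code.

```java
// Hypothetical helper (not part of Metacat): splits a docid from the right
// so that any dots inside the scope are preserved.
final class DocidParts {
    final String scope;
    final String identifier;
    final String revision;

    private DocidParts(String scope, String identifier, String revision) {
        this.scope = scope;
        this.identifier = identifier;
        this.revision = revision;
    }

    // Assumes the form <scope>.<identifier>.<revision>, where only the
    // scope is allowed to contain additional dots.
    static DocidParts parse(String docid) {
        int revDot = docid.lastIndexOf('.');
        int idDot = (revDot > 0) ? docid.lastIndexOf('.', revDot - 1) : -1;
        if (idDot <= 0) {
            throw new IllegalArgumentException(
                "Expected scope.identifier.revision but got: " + docid);
        }
        return new DocidParts(
            docid.substring(0, idDot),
            docid.substring(idDot + 1, revDot),
            docid.substring(revDot + 1));
    }
}
```

With this approach a made-up docid such as `example.site.data.12.3` yields the scope `example.site.data`, the identifier `12`, and the revision `3`.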
Bug 3835 - design and implement OAI-PMH compliant harvest subsystem. Return a 'badVerb' response when the 'verb' request parameter is missing from the request. Previously this generated a NullPointerException.
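A null-safe verb check of the kind this fix describes might look like the sketch below. The six verb names and the `badVerb` error code come from the OAI-PMH 2.0 specification; the class and method names are hypothetical, and the real handler presumably builds a full XML error response rather than returning a code.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: decide on the OAI-PMH error code before dispatching,
// so that a missing 'verb' parameter never reaches code that dereferences it.
final class OaiVerbCheck {
    private static final Set<String> VERBS = new HashSet<>(Arrays.asList(
        "Identify", "ListMetadataFormats", "ListSets",
        "ListIdentifiers", "ListRecords", "GetRecord"));

    // Returns "badVerb" when the verb is missing or unknown, otherwise null.
    static String errorCodeFor(String verb) {
        if (verb == null || !VERBS.contains(verb)) {
            return "badVerb";
        }
        return null;
    }
}
```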
use the jaxb date parser for ISO 8601 formats. the numeric and date node values are now calculated after the document has been successfully inserted in the db so any sql exceptions do not prevent the raw node data from being saved. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=2084
rollback the accessDAO changes - leaving well enough alone.
only include accessfileid when it is not toplevel
include accessfileid and subtreeid when inserting xml_access values
use access control dao for setting access in EML parser. send additional xml_access info in replication request
insert/update documents with null user and null group to circumvent access control restrictions then update the user_owner and user_updated values to reflect what exists on the originating server (pisco)
use 'user_updated' field when writing the replicated document - allows most recent ownership/permissions to be used (in case LDAP groups have shifted) and is more accurate for both updates and initial inserts (hopefully addresses the replication issue we are having with pisco)
add support for temporal element query in pathquery. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=2084
DocumentImpl.delete() now throws finer grained exceptions (not a general exception). Consequently, the classes that call it have been updated to handle the thrown exceptions, including CrudService, ReplicationHandler, and ReplicationService.
refactor the names of these Data Manager implementation classes so that it's easier to use them with the default/local versions of similar. These classes utilize Metacat-specific configuration values rather than relying solely on the bundles that are used in the stand-alone DM lib.
To support GUIDs in MetacatHandler.handleDeleteAction(), I've added a new method: deleteFromMetacat() - deletes a document based on the docid. This factors the deletion code out of handleDeleteAction(). handleDeleteAction() now does a docid lookup based on GUID, and if it is not found, reverts to the deletion based on docid instead.
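The lookup-then-fallback behaviour described for handleDeleteAction() can be sketched as follows. The in-memory map is a hypothetical stand-in for the identifier table, and the class and method names are illustrative only; the actual handler code is not shown in this log.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of "try the id as a GUID first, else treat it as a docid".
final class DeleteLookup {
    // Stand-in for the identifier table: GUID -> local docid.
    private final Map<String, String> guidToDocid = new HashMap<>();

    void register(String guid, String docid) {
        guidToDocid.put(guid, docid);
    }

    // Returns the mapped docid when the id is a known GUID;
    // otherwise returns the id unchanged so it is used as a docid.
    String resolveDocid(String id) {
        String docid = guidToDocid.get(id);
        return (docid != null) ? docid : id;
    }
}
```

The resolved value would then be passed to a docid-based delete such as deleteFromMetacat().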
These are fairly significant changes to MetacatHandler.handleInsertOrUpdateAction() that add in support for creation or update of GUIDs and SystemMetadata. Upon insertion or update of DataPackages from non-DataONE aware clients (such as Morpho), the identifier table is updated by creating a GUID, and the systemmetadata table is updated with fields after the EML document is parsed for distribution information and entity typing. System Metadata documents are also generated and inserted into Metacat. The list of data entities is iterated over and System Metadata is generated for each data file as well.
In MetacatHandler I've removed updateSystemMetadata() in favor of additions to insertOrUpdateSystemMetadata(). Modified createSystemMetadata() to reflect the changes as well.
Modified MetacatHandler.createSystemMetadata() to take a localId, not a guid, as an argument, since there are times when the guid has not yet been created, and it is created in this method if so. Also, put the read() call to get the InputStream of the data/metadata document into its own try/catch statement.
Somehow missed adding in javadoc for read(). Here it is.
For now, getSystemMetadata() will be private like the other *SystemMetadata() methods.
Modified MetacatHandler, updated the getSystemMetadata() method to now use read() and deserializeSystemMetadata() to produce the SystemMetadata object. Exceptions are pushed up the stack, and so accordingly, modified createSystemMetadata() to reflect the changes.
Modified MetacatHandler, added createSystemMetadata() - generates SystemMetadata objects for newly inserted data or documents. This is intended to be used from handleInsertOrUpdateAction(), and only for documents being inserted from clients that don't support the DataONE interface. The method parses EML documents to discover data entities, and updates the system metadata for those entries, with support for describes and describedBy metadata. Currently doesn't handle FGDC, etc. documents....
Modified MetacatHandler, added three methods: getSystemMetadata() - returns a SystemMetadata object from the systemmetadata table using the given GUID. Stub only. updateSystemMetadata() - updates the systemmetadata table using the given SystemMetadata object....
Modified MetacatHandler and added two methods: serializeSystemMetadata() - Serialize a SystemMetadata object to XML string. deserializeSystemMetadata() - Deserialize a SystemMetadata object from an XML string.
Modified MetacatHandler, added read() - Read a document from metacat and return an InputStream. The XML or data document should be on disk, but if not, read from the metacat database. This method should be optimized, along with others, to not write stream data to disk for performance reasons.
To support generation of SystemMetadata and GUIDs, added a number of methods to MetacatHandler that are also in CrudService(). CrudService should eventually be refactored to use the handler methods. Added: readFromFilesystem() - Read a file from Metacat's configured file system data directory, and return a FileInputStream. This code has been factored out of handleInsertOrUpdateAction()....
fixed bug where the wrong checksum alg got written to the db
added file extension for txt or csv files
To support the generatemissingsystemmetadata REST call, modified CrudService.createSystemMetadata() to use DataoneEMLParser and further determine object formats from EML metadata. Formats currently supported are text/plain, text/csv, image/[jpg|jp2|bmp|tiff|png], and only for EML documents with 'ecogrid://' defined entity urls....
adding more debugging and fixing bug with systemmetadata
Add code to download the included schema.
fixed replication bug where systemmetadata was not getting processed correctly
think I fixed the connection problem. one connection in IdentifierManager was being leaked. added more debug info in case it happens again
Add a static method to get the base url based on a schema url.
A SAX handler class can get the included schema path.
added some debug info to DBConnectionPool
fixed typo that prevented replication
fixed node response bug
fixed update problem
put the pid in the info section of the url
fixed content type problem where csv files were set as text/xml
fixed problem with count in listObjects()
Cleaned up warnings, removed dead code.
Updated to most recent DataONE libraries. Updated CrudService to set the correct origin MN and auth MN in system metadata. Refactored exception passing. More work to come in generating SystemMetadata.
removed debug statements
fixed bugs in listObjects
Add in correct node references in system metadata.
Cleanup harvester exceptions and generics.
remove httpclient 3.1 and custom-built httpclient.jar; rework MetacatClient (and other classes) to use httpclient 4; updated build to not create httpclient.jar; encoding tests now pass.
blank configuration value should be treated the same as null
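Treating blank values like null usually comes down to a single predicate applied wherever a configuration value is read. A minimal sketch, with a hypothetical class and method names that are not Metacat's property API:

```java
// Hypothetical helper: a value that is null, empty, or whitespace-only
// counts as "not configured".
final class ConfigValues {
    static boolean isUnset(String value) {
        return value == null || value.trim().isEmpty();
    }

    // Returns the fallback whenever the configured value is unset.
    static String valueOrDefault(String value, String fallback) {
        return isUnset(value) ? fallback : value;
    }
}
```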
few bug fixes for listObjects
added code to do database query for listObjects
Cleaned up unused imports.
pass the root exception message up the call chain so that it can effectively be reported as a helpful error message. also, the JUnit test expects the specific error message (SchemaRegistryTest)
adding fields for additional system metadata info
use the read() method instead of manually calling with parameters
some new code for debugging mmp
updated the populator
added code to run an squery for listObjects instead of an anyfield query
always re-write web.xml in case geoserver has been redeployed. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=4307
Modified MetacatPopulator to deal with change in D1Client static methods.
added more code for new mmp requests
refactor checksum and some other stuff
include a default location for the Geoserver data directory (under the metacat deployment)
added some debug lines
rework geoserver configuration: geoserver context is set to 'geoserver' by default, but can be reconfigured; data directory is set in the geoserver web.xml file (we have a template, set the value accordingly, then overwrite the deployed version in the geoserver webapp)...
add boolean return to indicate whether or not a property was modified
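A setter that reports whether it changed anything lets callers skip work, such as rewriting a file, when nothing was modified. The sketch below uses `java.util.Properties` as an assumed backing store; the class and method names are hypothetical.

```java
import java.util.Objects;
import java.util.Properties;

// Hypothetical sketch: set a property and report whether its value changed.
final class PropertyUpdater {
    // Returns true only when the stored value differed from the new one.
    // The value must be non-null because Properties rejects null values.
    static boolean setIfChanged(Properties props, String key, String value) {
        Objects.requireNonNull(value, "value");
        if (value.equals(props.getProperty(key))) {
            return false;
        }
        props.setProperty(key, value);
        return true;
    }
}
```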
trying to get the new MMP handler working with ResourceHandler
updating commons-fileupload to 1.2.2
replicate works on metacat now. just waiting for Robert's changes to the mmp clients
use local shape files if the Geoserver env variable is not set. They might also be the same
geoserver upgrade: remove embedded geoserver; include geotools api and update spatial harvesting; include simple template for using maps in skin (openlayers now, not mapbuilder)
new mmp code
refactored MMP handling
removing code I just added
adding default url to test against
fixes for creating SM for legacy docs
updated replicate to only use GET requests. added notes for tomorrow's standup
use the detected document encoding when getting the outputstream writer from the response. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=2495
fixed bug with http/https port
adding additional debugging info
undoing last commit
added additional debugging info
adding war version replacement token
fix for member variables bug in the request wrapper
adding war version to node response
implemented health api
use debug level for request encoding message
added url decoding to the filter
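Decoding a percent-encoded value inside a filter can be done with the standard `java.net.URLDecoder`. The sketch below assumes UTF-8 and uses a hypothetical class name; it does not reflect which request parts the actual filter decodes.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical sketch: decode a percent-encoded value using UTF-8.
final class PathDecoder {
    static String decode(String raw) {
        try {
            // URLDecoder follows form-encoding rules, so '+' becomes a space.
            return URLDecoder.decode(raw, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is required of every Java platform.
            throw new IllegalStateException("UTF-8 is not available", e);
        }
    }
}
```

Because `URLDecoder` treats `+` as a space, identifiers that may contain a literal plus sign would need to be percent-encoded by the client.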
use detected document encoding or Metacat's default encoding (UTF-8)
use UTF-8 encoding when getting bytes from the DB and converting them into a string. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=2495
only call response.getWriter() when we are about to send text/xml to the client, otherwise we end up calling both getWriter() and getOutputStream() - resulting in an illegal state.
use detected XML encoding when reading/writing files; use UTF-8 as default when performing queries in the DB (assume DB is using UTF-8); remove as many PrintWriters (uses system default character encoding only) as possible and construct OutputStreamWriters where explicit encoding can be given....
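The move to writers with an explicit encoding can be illustrated with `OutputStreamWriter`, which takes a charset instead of relying on the platform default. This is a sketch under assumptions: the class and method names are hypothetical, and falling back to UTF-8 when no encoding was detected is taken from the related commits, not from the actual code.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: write text with the detected encoding,
// using UTF-8 when none was detected.
final class EncodedOutput {
    static void write(OutputStream out, String text, String detectedEncoding) {
        Charset charset = (detectedEncoding == null || detectedEncoding.trim().isEmpty())
            ? StandardCharsets.UTF_8
            : Charset.forName(detectedEncoding);
        try {
            Writer writer = new OutputStreamWriter(out, charset);
            writer.write(text);
            // Flush but do not close, so the caller keeps control of the stream.
            writer.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The same character can produce different bytes depending on the charset (for example, `é` is two bytes in UTF-8 and one in ISO-8859-1), which is why a writer that uses the system default can corrupt documents on servers with a different default.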
fixed a bug with trailing slashes