In the reindex method, the error message will be sent back.
Add the code to check the authorization of the client which is reindexing a pid.
Fixed a bug that the method getLocalId swallowed an exception incorrectly in the IdentifierManager.
Fixed a bug that the solr index of data files doesn't reflect the change of access control.
remove AnnotatorService completely - was moved to cn-index-processor
let metacat-index lookup annotations for indexing rather than the metacat "reindex" action.
allow per-document reindexing to be initiated by any user (to support third-party annotations)
Write the input stream into the file system without alteration in dataone create and update methods.
look up annotations when reindexing a given pid. still very much a prototype in that we are looking up annotations from an external annotator-store. TODO: add pid filtering to query when annotateit.org supports it (pending upgrade on their site).
add support for v2 DataONE API.
prevent js scriptlets from running when we return error messages to the client by escaping any potentially harmful xml blocks. https://projects.ecoinformatics.org/ecoinfo/issues/6224
switch to use FIleUpload instead of O'Reilly COS library for handling chunked file uploads. https://projects.ecoinformatics.org/ecoinfo/issues/6517
index the ORE after we submit the metadata for indexing. https://projects.ecoinformatics.org/ecoinfo/issues/6520
recursively submit obsoleted objects for indexing when instructed. https://projects.ecoinformatics.org/ecoinfo/issues/6424
look up guid when done setting access by docid so we can sync and refresh accesspolicy on MN and CN.
additional logging for set access
setAccessAction: get guid from passed in id for calls to SyncAccessPolicy, HazelcastService.refreshSystemMetadataEntry
Add the code to check if the docid contains the whitespaces in the handleInsertOrUpdate, handleUpload and handleInsertMultipartInsertAction methods.
allow utf-8 user first/last names to be used in responses for: login, logout, validatesession, getprincipals.
When a docid's access policy is modified with metacat native api, update CN with the new access policy
Unify solr indexing with an IndexTask that is added to the queue -- allows us to send more than just the systemMetadata to the indexer. Initially this is for READ event counts for each document. https://projects.ecoinformatics.org/ecoinfo/issues/6346
Reviewed code for all uses of FileInputStream, checking to see if the method should be closing the stream, and if so, closing it in the method as well as in the finally clause to ensure we don't leak file descriptors.
Sparate the action reindex and reindexall.
Generate an ORE when sci metadata is added via the Metacat API.https://projects.ecoinformatics.org/ecoinfo/issues/6061
If the pathquery engine is disabled, the xml path index queue will be disabled as well.
add Metacat servlet action to force the reindexing of one or more or all pids in the system. https://projects.ecoinformatics.org/ecoinfo/issues/5945
do not use tmp file to return an inputstream on read() operations - just read from the file we already have. https://projects.ecoinformatics.org/ecoinfo/issues/6009
use standard File.createTempFile() method for uploaded data files and delete them when we are done with them. https://projects.ecoinformatics.org/ecoinfo/issues/6008
use an independent ISet<SystemMetadata> structure to communicate objects that should be indexed by metacat-index. https://projects.ecoinformatics.org/ecoinfo/issues/5943
remove extra lines from returned <docid/> block. https://projects.ecoinformatics.org/ecoinfo/issues/5932
Allow use of PID instead of docid in the Perl registry. At least for reading/editing and deleting existing content. Does not create content using a pid. https://projects.ecoinformatics.org/ecoinfo/issues/5932
adjust action=zip behavior to use full docids and entity names (data files) for the zip entry. Also uses the given qformat to render the metadata. https://projects.ecoinformatics.org/ecoinfo/issues/3816
move DocInfo parsing into utilities project so that it can be used by Morpho as well as Metacat.http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5737
Do not set headers until response is ready to send (5756)
allow getlog action to use docid parameters that do not include revision. In these cases, the latest revision will be used.
pass parameters to the getLog action for rendering in xslt
optionally remove the document/data file from the filesystem completely when 'deleting' it.https://redmine.dataone.org/issues/2677
add a parameter for optionally writing EML-embedded access control rules to the Metacat DB.https://redmine.dataone.org/issues/2584https://redmine.dataone.org/issues/2583
use Java-based temp file creation instead of Date (ms) timestamp to ensure uniqueness of the file and avoid re-use by two concurrent threads.
only generate system metadata when the call comes from the legacy Metacat API, not the D1 API.https://redmine.dataone.org/issues/2362 (I think this was the culprit)
use File.deleteOnExit() not a half hour timer thread to do it.
download remote data and save locally when it is referenced by an EML package, then include it in the ORE map.http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5522
refactor Metacat access handling to be on a per-revision basis so that it more closely aligns with the DataONE approachhttp://bugzilla.ecoinformatics.org/show_bug.cgi?id=5560
look up all docids is now a static method (ORE/SystemMetadata generation)
evict the HazelCast SystemMetadata entry if we update the access control rules via Metacat's legacy API, otherwise stale SystemMetadata stays in memory instead of being looked up from the backing table store.
optionally include ORE generation/insertion into Metacat when generating SystemMetadatahttps://redmine.dataone.org/issues/2056
refactor SystemMetadata creation into separate class from the MetacatHandler -- this will be shared by upgrade code and normal metacat api.
simplify SystemMetadata generation -- will be done during Metacat upgrade for D1 features/support.
When read a FGDC document, Metacat will add a new parameter enableFGDCediting params for the xml transforming.
move the DataONE 1.0.0-SNAPSHOT
add User-Agent logging to support D1 requirements
catch datapackage parsing errors as before
catch runtime exceptions that arise from hazelcast storage errors in the system metadata map
only "save" to the shared system metadata map - not directly to the table store.
rely on Hazelcast to store the SystemMetadata locally for the node. Entry event listeners store the shared system metadata on their local node when alerted. TODO: remove old replication code that included system metadata xml when replicating scimeta and data
include sysmeta for uploaded BIN data that comes through the legacy Metacat servlet API
changes for schema update (d1_common)
Update classes to use the DataONE 0.6.4 schema and types. Major changes involve using BigInteger vs long in SystemMetadata.size, and using ObjectFormatIdentifier rather than Object format.
latest D1 jars - changes include:updateSystemMetadata() impl for CNnew identifier methods (generate is its own method)removal of the resourceMap pointer from system metadata
remove ServiceTypeUtil - replace with TypeMarshaller
use new "v1" types from DataONE
remove extraneous update() call when create() does the call for us
Change Metacathandler.read() to be public since it's internal to Metacat, and use read() in D1NodeService after isAuthorized() for the calling Subject from the Session object.
Updated CNCoreImpl to implement listFormats() and getFormat(), and changed calls to ObjectFormatCache in IdentifierManager, MetacatHandler to call getInstance(). Removed the ObjectFormatService registration from MetaCatServlet since it is replaced by CNCoreImpl.
use Data Manager Library to parse EML when needed in DataONE classes.(augmented DML to parse data format elements in EML to estimate MIME type)https://redmine.dataone.org/issues/1634
organize imports so that it is clearer what dependencies exist on the D1 jars
Merged in the D1_0_6_2_BRANCH changes that include the transition from ObjectFormat calls to ObjectFormatCache calls.
rework SystemMetadata creation when inserting documents via the Metacat servlet api (in which case there was no client-supplued system metadata)
-remove system metadata guid -> local id mapping (there is no document for system metadata now)-include system metadata elements when replicating data objects (TODO: transfer all system metadata structures with the docinfo request).TODO: remove docid+rev from the systemMetadata table definition
do not use XML files for storing SystemMetadata - use DB tables only.
Modified Metacat to build against the D1_SCHEMA_0_6_1 branch of the dataone schemas by incorporating the 0.6.1-SNAPSHOT version of d1_common and d1_libclient libraries, and refactoring Metacat code references to the d1 schema changed types.
In order to sync up with DataONE 0.6.1 changes, I'm backing out ObjectFormatService changes temporarily in Metacat. Most functionality will be rolled back in using the DataONE 0.6.2 tag, but some methods in ObjectFormatService (such as getListFromDisk()) will be moved into d1_libclient_java.
Changes in the DataONE ObjectFormat class deprecate the convert() method, and we're now using Metacat's ObjectFormatService to look up object format attributes. The following changes replace ObjectFormat.convert() with ObjectFormatService.getFormat() in several classes....
Minor changes to MetacatHandler:- Improved logging where MetaCatServlet.class was used in getLogger() rather than MetacatHandler.class (holdover from the refactor)- Minor formatting changes, and replacement of 'MetaCatServlet' with 'MetacatHandler' in the logging output as needed.
added a few debugging lines in createSystemMetadata() related to contents of identifier strings
fixing annoying error message inaccuracy
MOdified MetacatHandler to catch cases where ObjectFormat is not being set properly on data files whengenerating SystemMetadata. When the EML document contains a format for an entity that maps to a nulltype in ObjectFormat.convert(), then the type ends up being null and an error is generated on insertion...
add event notification for insert/update/delete on documents (for semtools plugin)
do not attempt to check permissions when reading documents for systemMetadata generation (unless I completely do not understand this feature - please verify!).
include accessfileid and subtreeid when inserting xml_access values
To support GUIDs in MetacatHandler.handleDeleteAction(), I've added in a new method:deleteFromMetacat() - deletes a document based on the docidThis factors the deletion code out of handleDeleteAction(). handleDeleteAction() now does a docid lookup based on GUID, and if it is not found, reverts to the deletion based on docid instead.
These are fairly significant changes to MetacatHandler.handleInsertOrUpdateAction() that add in support for creation or update of GUIDs and SystemMetadata. Upon insertion or update of DataPackages from non-DataONE aware clients (such as Morpho), the identifier table is updated by creating a GUID, and the systemmetadata table is updated with fields after the EML document is parsed for distribution information and entity typing. System Metadata documents are also generated and inserted into Metacat. The list of data entities is iterated over and System Metadata is generated for each data file as well.
In MetacatHandler I've removed updateSystemMetadata() in favor of additions to insertOrUpdateSystemMetadata(). Modified createSystemMetadata() to reflect the changes as well.
Modified MetacatHandler.createSystemMetadata() to take a localId, not a guid as an argument since there are times when the guid has yet to have been created, and it is created in this method if so. Also, put the read() call to get the InputStream of the data/metadata document into it's own try/catch statement.
Somehow missed adding in javadoc for read(). Here it is.
For now, getSystemMetadata() will be private like the other *SystemMetadata() methods.
Modified MetacatHandler, updated the getSystemMetadata() method to now use read() and deserializeSystemMetadata() to produce the SystemMetadata object. Exceptions are pushed up the stack, and so accordingly, modified createSystemMetadata() to reflect the changes.
Modified MetacatHandler, added createSystemMetadata() - generates SystemMetadata objects for newly inserted data or documents. This is intended to be used from handleInsertOrUpdateAction(), and only for documents being inserted from clients that don't support the DataONE interface. The method parses EML documents to discover data entities, and updates the system metadata for those entries, with support for describes and describedBy metadata. Currently doesn't handle FGDC, etc. documents....
Modified MetacatHandler, added three methods:getSystemMetadata() - returns a SystemMetadata object from the systemmetadata table using the given GUID. Stub only.updateSystemMetadata() - updates the systemmetadata table using the given SystemMetadata object....
Modified MetacatHandler and added two methods:serializeSystemMetadata() - Serialize a SystemMetadata object to XML stringdeserializeSystemMetadata() - Deserialize a SystemMetadata object from an XML string
Modified MetacatHandler, added read() - Read a document from metacat and return an InputStream. The XML or data document should be on disk, but if not, read from the metacat database. This method should be optimized, along with others, to not write stream data to disk for performance reasons.
To support generation of SystemMetadata and GUIDs, added a number of methods to MetacatHandler that are also in CrudService(). CrudService should eventually be refactored to use the handler methods. Added:readFromFilesystem() - Read a file from Metacat's configured file system data directory, and return a FileInputStream. This code has been factored out of handleInsertOrUpdateAction()....
added some debug lines
use the detected document encoding when getting the outputstream writer from the responsehttp://bugzilla.ecoinformatics.org/show_bug.cgi?id=2495
use detected XML encoding when reading/writing filesuse UTF-8 as default when performing queries in the DB (assume DB is using UTF-8)remove as many PrintWriters (uses system default character encoding only) as possible and construct OutputStreamWriters where explicit encoding can be given....
fixed bug where permission would get set to -1 for no good reason
add support for EML 2.1.1
fix the setaccess() method so that it accepts strings not numbers (i.e. "read" not "4")
allow public access to log information when docid is given. IP and principal are not returned unless an administrator makes the request.
fixed a couple bugs with dates and fixed a major bug where metacat was reading the entire document from the database everytime a DocumentImpl instance was created even though it didn't need to