Sync the access policy from the MN to the CN when the native Metacat UI is used to update the access policy on the MN
Unify solr indexing with an IndexTask that is added to the queue -- allows us to send more than just the systemMetadata to the indexer. Initially this is for READ event counts for each document. https://projects.ecoinformatics.org/ecoinfo/issues/6346
Reviewed code for all uses of FileInputStream, checking to see if the method should be closing the stream, and if so, closing it in the method as well as in the finally clause to ensure we don't leak file descriptors.
Closing some more streams that were left open. Bug #6136 seems to be pervasive and is going to require an extensive audit to find all of the places where streams are not closed properly.
Refactor to use IOUtils.closeQuietly() which handles nulls and streams that are already closed.
Closing FileOutputStream handles so that the OS limits on filehandles are not exceeded.
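The stream-closing pattern described in the entries above can be sketched roughly as follows. This is an illustration only, not the Metacat code: the class and method names are hypothetical, and the hand-written closeQuietly() helper stands in for Commons IO's IOUtils.closeQuietly() so the sketch needs nothing beyond the JDK.

```java
import java.io.Closeable;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

class StreamCloseSketch {

    // Null-safe close that swallows IOException, similar in spirit to
    // Commons IO's IOUtils.closeQuietly(); written by hand here so the
    // sketch has no third-party dependency.
    static void closeQuietly(Closeable c) {
        if (c == null) {
            return;
        }
        try {
            c.close();
        } catch (IOException ignored) {
            // nothing useful can be done if close() itself fails
        }
    }

    // Counts the bytes in a file; the stream is released in the finally
    // clause whether the read succeeds or throws.
    static long countBytes(File file) throws IOException {
        InputStream in = null;
        try {
            in = new FileInputStream(file);
            byte[] buffer = new byte[4096];
            long total = 0;
            int read;
            while ((read = in.read(buffer)) != -1) {
                total += read;
            }
            return total;
        } finally {
            closeQuietly(in);
        }
    }
}
```

On Java 7 and later, a try-with-resources statement gives the same guarantee with less code.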
support a "force replication delete all action" during replication. This is used when we want Metacat to remove the content from the other target replicas because the DataONE delete() action was called (more powerful than just "archive").
use an independent ISet<SystemMetadata> structure to communicate objects that should be indexed by metacat-index. https://projects.ecoinformatics.org/ecoinfo/issues/5943
add space to prevent syntax error when additional clause is appended. https://projects.ecoinformatics.org/ecoinfo/issues/5929.
Change replication 'update' query to use a LEFT JOIN to improve the performance of the replication update action; the slow query had been causing an HTTP timeout for large Metacat installations. See https://projects.ecoinformatics.org/ecoinfo/issues/5929.
include xml_revisions. do not allow removal of server_location = 1 documents (these are not replicas). https://redmine.dataone.org/issues/3539
move DocInfo parsing into utilities project so that it can be used by Morpho as well as Metacat. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5737
use correct docid format when checking for existing mappings.
use CDATA for docname field in docInfo so that XML parser ignores the content that can contain characters like "&
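A minimal sketch of the CDATA wrapping mentioned above, under stated assumptions: the class, the helper names, and the <docname> element layout are hypothetical and are not taken from the Metacat source. The one subtlety is that a literal "]]>" in the content would end the section early, so the sketch splits it across two adjacent sections.

```java
class CdataSketch {

    // Wraps text in a CDATA section. A literal "]]>" inside the text would
    // end the section early, so it is split across two adjacent sections.
    static String wrapCdata(String text) {
        String safe = text.replace("]]>", "]]]]><![CDATA[>");
        return "<![CDATA[" + safe + "]]>";
    }

    // Hypothetical helper: builds a <docname> element for illustration only.
    static String docnameElement(String docname) {
        return "<docname>" + wrapCdata(docname) + "</docname>";
    }
}
```

With this wrapping, markup characters in the document name reach the receiving parser as character data rather than as markup.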
use SchemaLocationResolver to fetch remote entries for the xml_catalog -- we want to be able to fetch included xsd files as well as use any error handling it provides for checking the schemas.
create docid-guid mapping during replication if it does not exist. we were [incorrectly] assuming that there would be SM coming with the document info that would fill this information in, but for traditional non-MN Metacat deployments there is no SM to provide a mapping. In this case we use the docid as the guid.
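The fallback described above (use the docid as the guid when no system metadata supplies one, and only create the mapping if it is missing) might look roughly like this. The class is hypothetical, and an in-memory map stands in for the identifier table.

```java
import java.util.HashMap;
import java.util.Map;

class GuidMappingSketch {

    // In-memory stand-in for the identifier table (guid -> docid).
    private final Map<String, String> guidToDocid = new HashMap<String, String>();

    // Ensures a mapping exists for a replicated document. When no guid
    // arrives with the document info (guid == null), the docid doubles
    // as the guid. An existing mapping is left untouched.
    String ensureMapping(String guid, String docid) {
        String effectiveGuid = (guid != null) ? guid : docid;
        if (!guidToDocid.containsKey(effectiveGuid)) {
            guidToDocid.put(effectiveGuid, docid);
        }
        return effectiveGuid;
    }

    // Returns the docid mapped to the guid, or null when there is none.
    String docidFor(String guid) {
        return guidToDocid.get(guid);
    }
}
```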
stream the replication "update" response rather than building up a complete list in a StringBuffer. prompted by findings on the CN: https://redmine.dataone.org/issues/3141
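A rough sketch of the streaming approach, assuming a plain java.io.Writer as the response target. The class name and the XML element names are hypothetical, and the docids are assumed to need no XML escaping. Each entry is written as it is produced, so memory use does not grow with the number of documents.

```java
import java.io.IOException;
import java.io.Writer;
import java.util.Iterator;

class StreamingUpdateSketch {

    // Writes one element per document id straight to the writer instead of
    // accumulating the whole response in a StringBuffer first.
    // Element names are placeholders, not the real replication format.
    static void writeUpdates(Iterator<String> docids, Writer out) throws IOException {
        out.write("<replication>");
        while (docids.hasNext()) {
            out.write("<updatedDocument><docid>");
            out.write(docids.next());
            out.write("</docid></updatedDocument>");
        }
        out.write("</replication>");
        out.flush();
    }
}
```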
remove unused "dataonelogger"
remove possibility for infinite loop in case data replication is not configured for the server and a data file is encountered (yikes!)
added logging debug statements to see where the replication timeout might be occurring.
only look up the client timeout property once, not every time we make a call https://redmine.dataone.org/issues/3078
configurable replication client timeout https://redmine.dataone.org/issues/3078
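One way to read a configurable timeout only once is to cache the parsed value on first use, as sketched below. The class, the property key, and the default value are assumptions made for illustration; the real property name is not given in these entries.

```java
import java.util.Properties;

class ClientTimeoutSketch {

    // Hypothetical property key and default, for illustration only.
    static final String KEY = "replication.client.timeout";
    static final int DEFAULT_MILLIS = 30000;

    private final Properties props;
    private Integer cachedMillis; // null until the first lookup

    ClientTimeoutSketch(Properties props) {
        this.props = props;
    }

    // Parses the property on the first call and reuses the value afterwards.
    synchronized int timeoutMillis() {
        if (cachedMillis == null) {
            String raw = props.getProperty(KEY);
            int value = DEFAULT_MILLIS;
            if (raw != null) {
                try {
                    value = Integer.parseInt(raw.trim());
                } catch (NumberFormatException e) {
                    // fall back to the default when the value is not a number
                }
            }
            cachedMillis = value;
        }
        return cachedMillis;
    }
}
```

The trade-off is that a changed property value is not picked up until the holder is recreated.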
stack trace the HZ put exception during CN-CN replication
additional debugging statements for CONCURRENT_MAP_PUT error during CN-CN replication.
instead of generating SM and ORE maps during dataone configuration/MN registration, moved this all to the replication admin screen where we can target generation for specific nodes. That way it's more controlled as to when and where we generate DataONE required content....
add "Generate System Metadata" button to the replication server list display. When clicked, we generate SM for records belonging to that source server. This is only enabled when DataONE has been configured. https://redmine.dataone.org/issues/2762
optionally remove the document/data file from the filesystem completely when 'deleting' it. https://redmine.dataone.org/issues/2677
add a parameter for optionally writing EML-embedded access control rules to the Metacat DB. https://redmine.dataone.org/issues/2584 https://redmine.dataone.org/issues/2583
band-aid for CN-CN replication permOrder issue when access control is embedded in EML and the system metadata is replicated before the EML. we just log the inconsistency and allow the insert to succeed https://redmine.dataone.org/issues/2583
check whether mapping (
process systemMetadata from the docInfo string before writing to the database so that we guarantee guid-docid mapping exists before attempting to look it up.
remove flag for independent system metadata replication -- these entries are replicated along with the data/metadata objects or via hazelcast when the actual object is not on the server.
only create guid->docid mapping during metadata replication if it does not already exist http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5520
do not treat access change as an update -- it should not attempt to retrieve the contents of the object http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5520
only create guid->docid mapping during data replication if it does not already exist http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5520
process system metadata before access rules (access control is now driven by GUID so the mapping needs to be there)
use shared method for looking up "docInfo" map -- both in Metacat replication and in D1 system metadata generation
replication control panel now fully implemented as an admin configuration screen http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5528
move replication configuration actions to the admin servlet and out of the replication servlet http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5528
save SystemMetadata when replicating data and metadata -- this way if/when the node decides to be a DataONE MN it already has the information needed for each object
get server param only when it is expected
check replication table (not keystore) for trusted server host name match
started replication unit test
add note about alternative methods for getting cert/key
use DateTimeMarshaller for all replication date transfers
print the stacktrace when there is an error -- debuggin!
use SSL to get content from stream
http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5527
skip verification -- remaining TODO
verify certificate
inspect keystore entries for matching client certificate
lookup the correct property for keystore file
use HttpClient to set up SSL connection when doing replication calls -- this will use the server's configured certificate as the client certificate on the request. The server it is calling can then inspect that certificate and decide whether or not it trusts the caller.
check client-provided certificate when servicing ReplicationServlet requests.
add User-Agent logging to support D1 requirements
include SystemMetadata when replicating data and metadata documents -- this allows us to establish the guid-to-docid mapping that is crucial for being able to read the replicated document by guid (d1 api)
do not send <systemMetadata> with the <docInfo> replication information - this is handled by the Hazelcast shared map
rely on Hazelcast to store the SystemMetadata locally for the node. Entry event listeners store the shared system metadata on their local node when alerted. TODO: remove old replication code that included system metadata xml when replicating scimeta and data
remove ServiceTypeUtil - replace with TypeMarshaller
use new "v1" types from DataONE
use correct log name for the class
add option for replicating system metadata (dataone) https://redmine.dataone.org/issues/1626
force replication for newly-registered system metadata
Merged in the D1_0_6_2_BRANCH changes that include the transition from ObjectFormat calls to ObjectFormatCache calls.
include System Metadata forced replication - just need to figure out when to call it!
handle timed replication of system metadata. there are still a few outstanding issues:
- track server location of system metadata-only entries
- replication policy flag for system metadata-only entries?
- locking for replicated entries?
- forced replication of entries
transfer full System Metadata (as XML) during document and data replication
- remove system metadata guid -> local id mapping (there is no document for system metadata now)
- include system metadata elements when replicating data objects (TODO: transfer all system metadata structures with the docinfo request).
TODO: remove docid+rev from the systemMetadata table definition
do not use XML files for storing SystemMetadata - use DB tables only.
use update method to update the mapping between local and guid (d1) when we get a force replication request that is an "update
use "object_format" element consistently so that it is replicated across instances https://redmine.dataone.org/issues/1514
insert/update documents with null user and null group to circumvent access control restrictions then update the user_owner and user_updated values to reflect what exists on the originating server (pisco)
use 'user_updated' field when writing the replicated document - allows most recent ownership/permissions to be used (in case LDAP groups have shifted) and is more accurate for both updates and initial inserts (hopefully addresses the replication issue we are having with pisco)
DocumentImpl.delete() now throws finer grained exceptions (not a general exception). Consequently, the classes that call it have been updated to handle the thrown exceptions, including CrudService, ReplicationHandler, and ReplicationService.
adding more debugging and fixing bug with systemmetadata
fixed replication bug where systemmetadata was not getting processed correctly
fixed typo that prevented replication
fixed bugs in listObjects
added code to do database query for listObjects
only call response.getWriter() when we are about to send text/xml to the client, otherwise we end up calling both getWriter() and getOutputStream() - resulting in an illegal state.
use detected XML encoding when reading/writing files. use UTF-8 as default when performing queries in the DB (assume DB is using UTF-8). remove as many PrintWriters (uses system default character encoding only) as possible and construct OutputStreamWriters where explicit encoding can be given....
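To illustrate why an OutputStreamWriter with an explicit charset is preferable to relying on the platform default, here is a small sketch. The class and method names are hypothetical, and the try-with-resources form assumes Java 7 or later.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;

class EncodingSketch {

    // Encodes text with an explicitly named charset, so the resulting
    // bytes do not depend on the platform default encoding.
    static byte[] encode(String text, Charset charset) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer out = new OutputStreamWriter(bytes, charset)) {
            out.write(text);
        }
        return bytes.toByteArray();
    }
}
```

The same string yields different bytes under UTF-8 and ISO-8859-1, which is why the encoding detected for a document has to be carried through to the writer.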
add support for EML 2.1.1
reformatting logs for robert
added another logging statement
added replicate log statements with the guid and localId
hopefully fixed bug with systemmetadata replication
debug statements in dbsaxhandler
fixing problems with replication and systemmetadata
added functionality to set access permissions to system metadata the same as the document that it describes
fixed major bug in replication where the document info was being truncated due to a poorly implemented sax parser
added a DataOneLogger for event notifications on the CN. The logger is called DataOneLogger and can be managed in the log4j.properties file
refactored the sessionService to use a correct singleton initialization scheme. Added true authentication to ResourceHandler.
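The entry above does not say which singleton scheme was adopted. One common thread-safe option is the initialization-on-demand holder idiom, sketched below with a hypothetical class that is not the real SessionService.

```java
import java.util.concurrent.ConcurrentHashMap;

class SessionRegistrySketch {

    private final ConcurrentHashMap<String, String> sessions =
            new ConcurrentHashMap<String, String>();

    // Private constructor: instances are only created through the holder.
    private SessionRegistrySketch() {
    }

    // The JVM initializes Holder lazily and exactly once, on the first
    // call to getInstance(), so no explicit locking is needed.
    private static final class Holder {
        static final SessionRegistrySketch INSTANCE = new SessionRegistrySketch();
    }

    static SessionRegistrySketch getInstance() {
        return Holder.INSTANCE;
    }

    void register(String sessionId, String user) {
        sessions.put(sessionId, user);
    }

    // Returns the user for the session id, or null when it is unknown.
    String userFor(String sessionId) {
        return sessions.get(sessionId);
    }
}
```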
replication of guids now works. tested this for both forced replication and update/insert/delete triggered replication
fixed bug where guid end tag wasn't getting printed
added a method in IdentifierManager to get a guid from a docid and rev. added fields in the documentinfo replication document to pass the guid. now need to handle the guid and insert it into the table if it's found
Modifications to support the DataONE service API version 0.1.0. For DataONE, the get() and create() services are partially complete. Several more functions and checks need to be added to create() before it is viable. This DataONE support is not complete, and the current support breaks the MetacatRestClientTest for the time being (this client will eventually be removed).
Pass the doc xml as a string to docImpl.write and writeReplication. This is so a reader can be created for the parsing and for the write to disk. Also created a db access class for xml query result deletion.
Log doc and rev query counts and times. Fix mis-spellings.
Change add sql to use a prepared statement. Only try to download a cert if a url was provided.