During the replication, the remote content will be saved without alteration.
add support for v2 DataONE API.
show the SM and ORE generation buttons even if they have not registered/configured dataone. many potential MNs want to see their generated SM before registering (and we want them to too!).
Add admin service to update DOI registrations by specifying a list of formatIds or DOIs, or update all.
first pass at allowing admins to update DOI registration. This only acts on EML objects at the moment and is meant to illustrate one mechanism for updating the DOIs. https://projects.ecoinformatics.org/ecoinfo/issues/6530
Index the document after it has been inserted.
support content from all serverLocations when summarizing entity info (semtools)
recursively submit obsoleted objects for indexing when instructed. https://projects.ecoinformatics.org/ecoinfo/issues/6424
Run syncAll in a single thread so admin config UI doesn't freeze
Couple of modifications: - use "pid" throughout so as not to confuse docids and pids - ensure any failures in the set do not prevent synching for other pids in the set
Sync access policy from MN -> CN in the case where the Metacat native UI is used to update the access policy on the MN
Unify solr indexing with an IndexTask that is added to the queue -- allows us to send more than just the systemMetadata to the indexer. Initially this is for READ event counts for each document. https://projects.ecoinformatics.org/ecoinfo/issues/6346
Reviewed code for all uses of FileInputStream, checking to see if the method should be closing the stream, and if so, closing it in the method as well as in the finally clause to ensure we don't leak file descriptors.
Closing some more streams that were left open. This Bug #6136 seems to be pervasive and is going to require an extensive audit to find all of the places where streams are not closed properly.
Refactor to use IOUtils.closeQuietly() which handles nulls and streams that are already closed.
Closing FileOutputStream handles so that the OS limits on filehandles are not exceeded.
support a "force replication delete all action" during replication. This is used when we want Metacat to remove the content from the other target replicas because the DataONE delete() action was called (more powerful than just "archive").
use an independent ISet<SystemMetadata> structure to communicate objects that should be indexed by metacat-index. https://projects.ecoinformatics.org/ecoinfo/issues/5943
add space to prevent syntax error when additional clause is appended. https://projects.ecoinformatics.org/ecoinfo/issues/5929.
Change replication 'update' query to use a LEFT JOIN to improve the performance of the replication update action, which had been causing an HTTP timeout for large metacat installations. See https://projects.ecoinformatics.org/ecoinfo/issues/5929.
include xml_revisions. do not allow removal of server_location = 1 documents (these are not replicas). https://redmine.dataone.org/issues/3539
move DocInfo parsing into utilities project so that it can be used by Morpho as well as Metacat. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5737
use correct docid format when checking for existing mappings.
use CDATA for docname field in docInfo so that XML parser ignores the content that can contain characters like "&
create docid-guid mapping during replication if it does not exist. we were [incorrectly] assuming that there would be SM coming with the document info that would fill this information in, but for traditional non-MN Metacat deployments there is no SM to provide a mapping. In this case we use the docid as the guid.
stream the replication "update" response rather than building up a complete list in a stringbuffer. prompted by findings on the CN: https://redmine.dataone.org/issues/3141
remove possibility for infinite loop in case data replication is not configured for the server and a data file is encountered (yikes!)
added logging debug statements to see where the replication timeout might be occurring.
only look up the client timeout property once, not every time we make a call. https://redmine.dataone.org/issues/3078
configurable replication client timeout. https://redmine.dataone.org/issues/3078
instead of generating SM and ORE maps during dataone configuration/MN registration, moved this all to the replication admin screen where we can target generation for specific nodes. That way it's more controlled as to when and where we generate DataONE required content....
add "Generate System Metadata" button to the replication server list display. When clicked, we generate SM for records belonging to that source server. This is only enabled when DataONE has been configured. https://redmine.dataone.org/issues/2762
optionally remove the document/data file from the filesystem completely when 'deleting' it. https://redmine.dataone.org/issues/2677
add a parameter for optionally writing EML-embedded access control rules to the Metacat DB. https://redmine.dataone.org/issues/2584 https://redmine.dataone.org/issues/2583
band-aid for CN-CN replication permOrder issue when access control is embedded in EML and the system metadata is replicated before the EML. we just log the inconsistency and allow the insert to succeed https://redmine.dataone.org/issues/2583
check whether mapping (
remove flag for independent system metadata replication -- these entries are replicated along with the data/metadata objects or via hazelcast when the actual object is not on the server.
only create guid->docid mapping during metadata replication if it does not already exist. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5520
do not treat access change as an update -- it should not attempt to retrieve the contents of the object. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5520
only create guid->docid mapping during data replication if it does not already exist. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5520
process system metadata before access rules (access control is now driven by GUID so the mapping needs to be there)
use shared method for looking up "docInfo" map -- both in Metacat replication and in D1 system metadata generation
replication control panel now fully implemented as an admin configuration screen. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5528
move replication configuration actions to the admin servlet and out of the replication servlet. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5528
save SystemMetadata when replicating data and metadata -- this way if/when the node decides to be a DataONE MN it already has the information needed for each object
started replication unit test
add note about alternative methods for getting cert/key
use DateTimeMarshaller for all replication date transfers
print the stacktrace when there is an error -- debuggin!
use SSL to get content from stream
http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5527
use HttpClient to set up SSL connection when doing replication calls -- this will use the server's configured certificate as the client certificate on the request. The server it is calling can then inspect that certificate and decide whether or not it trusts the caller.
add User-Agent logging to support D1 requirements
include SystemMetadata when replicating data and metadata documents -- this allows us to establish the guid-to-docid mapping that is crucial for being able to read the replicated document by guid (d1 api)
do not send <systemMetadata> with the <docInfo> replication information - this is handled by the Hazelcast shared map
rely on Hazelcast to store the SystemMetadata locally for the node. Entry event listeners store the shared system metadata on their local node when alerted. TODO: remove old replication code that included system metadata xml when replicating scimeta and data
remove ServiceTypeUtil - replace with TypeMarshaller
use new "v1" types from DataONE
add option for replicating system metadata (dataone). https://redmine.dataone.org/issues/1626
Merged in the D1_0_6_2_BRANCH changes that include the transition from ObjectFormat calls to ObjectFormatCache calls.
include System Metadata forced replication - just need to figure out when to call it!
handle timed replication of system metadata. there are still a few outstanding issues: - track server location of system metadata-only entries - replication policy flag for system metadata-only entries? - locking for replicated entries? - forced replication of entries
transfer full System Metadata (as XML) during document and data replication
- remove system metadata guid -> local id mapping (there is no document for system metadata now) - include system metadata elements when replicating data objects (TODO: transfer all system metadata structures with the docinfo request). TODO: remove docid+rev from the systemMetadata table definition
do not use XML files for storing SystemMetadata - use DB tables only.
use update method to update the mapping between local and guid (d1) when we get a force replication request that is an "update
use "object_format" element consistently so that it is replicated across instances. https://redmine.dataone.org/issues/1514
insert/update documents with null user and null group to circumvent access control restrictions then update the user_owner and user_updated values to reflect what exists on the originating server (pisco)
use 'user_updated' field when writing the replicated document - allows most recent ownership/permissions to be used (in case LDAP groups have shifted) and is more accurate for both updates and initial inserts (hopefully addresses the replication issue we are having with pisco)
DocumentImpl.delete() now throws finer grained exceptions (not a general exception). Consequently, the classes that call it have been updated to handle the thrown exceptions, including CrudService, ReplicationHandler, and ReplicationService.
adding more debugging and fixing bug with systemmetadata
fixed replication bug where systemmetadata was not getting processed correctly
added code to do database query for listObjects
only call response.getWriter() when we are about to send text/xml to the client, otherwise we end up calling both getWriter() and getOutputStream() - resulting in an illegal state.
use detected XML encoding when reading/writing files. use UTF-8 as default when performing queries in the DB (assume DB is using UTF-8). remove as many PrintWriters (uses system default character encoding only) as possible and construct OutputStreamWriters where explicit encoding can be given....
add support for EML 2.1.1
reformatting logs for robert
added another logging statement
hopefully fixed bug with systemmetadata replication
debug statements in dbsaxhandler
fixing problems with replication and systemmetadata
added functionality to set access permissions to system metadata the same as the document that it describes
fixed major bug in replication where the document info was being truncated due to a poorly implemented sax parser
replication of guids now works. tested this for both forced replication and update/insert/delete triggered replication
fixed bug where guid end tag wasn't getting printed
added a method in IdentifierManager to get a guid from a docid and rev. added fields in the documentinfo replication document to pass the guid. now need to handle the guid and insert it into the table if it's found
Modifications to support the DataONE service API version 0.1.0. For DataONE, the get() and create() services are partially complete. Several more functions and checks need to be added to create() before it is viable. This DataONE support is not complete, and the current support breaks the MetacatRestClientTest for the time being (this client will eventually be removed).
Pass the doc xml as a string to docImpl.write and writeReplication. This is so a reader can be created for the parsing and for the write to disk. Also created a db access class for xml query result deletion.
Log doc and rev query counts and times. Fix mis-spellings.
Change add sql to use a prepared statement. Only try to download a cert if a url was provided.
change AccessControlForSingleFile to only be instantiated for one file. move ACL methods to AccessControlForSingleFile. Change format of access sections returned to EML 2.1.0.
Move access control source to its own directory.
Change location of PropertyService to properties directory
Change MetaCatVersion to MetacatVersion
Create replication directory. Move replication code there. Use log4j for replication logging (rollingfileappender). Beef up replication logging and error control.
Roll back replication user changes. Fix code that converts access levels to integer and to text.
Introduce replication user. Use the fileutil writer methods instead of writing directly.
Beef up exception handling from file utilities. Move UtilException to MetacatUtilException to eliminate conflict with similar exception in utility package.
Replace System.out.println statements with logMetacat statements.
Update replication documentation and fix code so that replication log is available.