use HttpMultipartRestClient since the DefaultHttpMultipartRestClient was removed from d1_libclient_java
During the replication, the remote content will be saved without alteration.
add support for v2 DataONE API.
show the SM and ORE generation buttons even if they have not registered/configured dataone. many potential MNs want to see their generated SM before registering (and we want them to too!).
Add admin service to update DOI registrations by specifying a list of formatIds or DOIs, or update all.
first pass at allowing admins to update DOI registration. This only acts on EML objects at the moment and is meant to illustrate one mechanism for updating the DOIs. https://projects.ecoinformatics.org/ecoinfo/issues/6530
Index the document after it has been inserted.
Index the document after document is written to the db.
support content from all serverLocations when summarizing entity info (semtools)
recursively submit obsoleted objects for indexing when instructed. https://projects.ecoinformatics.org/ecoinfo/issues/6424
Run syncAll in a single thread so admin config UI doesn't freeze
Couple modifications: - use "pid" throughout so as not to confuse docids and pids - ensure any failures in the set do not prevent synching for other pids in the set
Sync access policy between mn -> cn in case where metacat native ui being used to update ap on mn
Unify solr indexing with an IndexTask that is added to the queue -- allows us to send more than just the systemMetadata to the indexer. Initially this is for READ event counts for each document. https://projects.ecoinformatics.org/ecoinfo/issues/6346
Reviewed code for all uses of FileInputStream, checking to see if the method should be closing the stream, and if so, closing it in the method as well as in the finally clause to ensure we don't leak file descriptors.
Closing some more streams that were left open. This Bug #6136 seems to be pervasive and is going to require an extensive audit to find all of the places where streams are not closed properly.
Refactor to use IOUtils.closeQuietly() which handles nulls and streams that are already closed.
Closing FileOutputStream handles so that the OS limits on filehandles are not exceeded.
support a "force replication delete all action" during replication. This is used when we want Metacat to remove the content from the other target replicas because the DataONE delete() action was called (more powerful than just "archive").
use an independent ISet<SystemMetadata> structure to communicate objects that should be indexed by metacat-index. https://projects.ecoinformatics.org/ecoinfo/issues/5943
add space to prevent syntax error when additional clause is appended. https://projects.ecoinformatics.org/ecoinfo/issues/5929.
Change replication 'update' query to use a LEFT JOIN so that the performance of the replication update action is improved, which had been causing an HTTP timeout for large metacat installations. See https://projects.ecoinformatics.org/ecoinfo/issues/5929.
include xml_revisions. do not allow removal of server_location = 1 documents (these are not replicas). https://redmine.dataone.org/issues/3539
move DocInfo parsing into utilities project so that it can be used by Morpho as well as Metacat. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5737
use correct docid format when checking for existing mappings.
use CDATA for docname field in docInfo so that XML parser ignores the content that can contain characters like "&
use SchemaLocationResolver to fetch remote entries for the xml_catalog -- we want to be able to fetch included xsd files as well as use any error handling it provides for checking the schemas.
create docid-guid mapping during replication if it does not exist. we were [incorrectly] assuming that there would be SM coming with the document info that would fill this information in, but for traditional non-MN Metacat deployments there is no SM to provide a mapping. In this case we use the docid as the guid.
stream the replication "update" response rather than building up a complete list in a stringbuffer. prompted by findings on the CN: https://redmine.dataone.org/issues/3141
remove unused "dataonelogger"
remove possibility for infinite loop in case data replication is not configured for the server and a data file is encountered (yikes!)
added logging debug statements to see where the replication timeout might be occurring.
only look up the client timeout property once, not every time we make a call https://redmine.dataone.org/issues/3078
configurable replication client timeout https://redmine.dataone.org/issues/3078
stack trace the HZ put exception during CN-CN replication
additional debugging statements for CONCURRENT_MAP_PUT error during CN-CN replication.
instead of generating SM and ORE maps during dataone configuration/MN registration, moved this all to the replication admin screen where we can target generation for specific nodes. That way it's more controlled as to when and where we generate DataONE required content....
add "Generate System Metadata" button to the replication server list display. When clicked, we generate SM for records belonging to that source server. This is only enabled when DataONE has been configured. https://redmine.dataone.org/issues/2762
optionally remove the document/data file from the filesystem completely when 'deleting' it. https://redmine.dataone.org/issues/2677
add a parameter for optionally writing EML-embedded access control rules to the Metacat DB. https://redmine.dataone.org/issues/2584 https://redmine.dataone.org/issues/2583
band-aid for CN-CN replication permOrder issue when access control is embedded in EML and the system metadata is replicated before the EML. we just log the inconsistency and allow the insert to succeed https://redmine.dataone.org/issues/2583
check whether mapping (
process systemMetadata from the docInfo string before writing to the database so that we guarantee guid-docid mapping exists before attempting to look it up.
remove flag for independent system metadata replication -- these entries are replicated along with the data/metadata objects or via hazelcast when the actual object is not on the server.
only create guid->docid mapping during metadata replication if it does not already exist http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5520
do not treat access change as an update -- it should not attempt to retrieve the contents of the object http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5520
only create guid->docid mapping during data replication if it does not already exist http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5520
process system metadata before access rules (access control is now driven by GUID so the mapping needs to be there)
use shared method for looking up "docInfo" map -- both in Metacat replication and in D1 system metadata generation
replication control panel now fully implemented as an admin configuration screen http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5528
move replication configuration actions to the admin servlet and out of the replication servlet http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5528
save SystemMetadata when replicating data and metadata -- this way if/when the node decides to be a DataONE MN it already has the information needed for each object
get server param only when it is expected
check replication table (not keystore) for trusted server host name match
started replication unit test
add note about alternative methods for getting cert/key
use DateTimeMarshaller for all replication date transfers
print the stacktrace when there is an error -- debugging!
use SSL to get content from stream
http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5527
skip verification -- remaining TODO
verify certificate
inspect keystore entries for matching client certificate
lookup the correct property for keystore file
use HttpClient to set up SSL connection when doing replication calls -- this will use the server's configured certificate as the client certificate on the request. The server it is calling can then inspect that certificate and decide whether or not it trusts the caller.
check client-provided certificate when servicing ReplicationServlet requests.
add User-Agent logging to support D1 requirements
include SystemMetadata when replicating data and metadata documents -- this allows us to establish the guid-to-docid mapping that is crucial for being able to read the replicated document by guid (d1 api)
do not send <systemMetadata> with the <docInfo> replication information - this is handled by the Hazelcast shared map
rely on Hazelcast to store the SystemMetadata locally for the node. Entry event listeners store the shared system metadata on their local node when alerted. TODO: remove old replication code that included system metadata xml when replicating scimeta and data
remove ServiceTypeUtil - replace with TypeMarshaller
use new "v1" types from DataONE
use correct log name for the class
add option for replicating system metadata (dataone) https://redmine.dataone.org/issues/1626
force replication for newly-registered system metadata
Merged in the D1_0_6_2_BRANCH changes that include the transition from ObjectFormat calls to ObjectFormatCache calls.
include System Metadata forced replication - just need to figure out when to call it!
handle timed replication of system metadata. there are still a few outstanding issues: - track server location of system metadata-only entries - replication policy flag for system metadata-only entries? - locking for replicated entries? - forced replication of entries
transfer full System Metadata (as XML) during document and data replication
- remove system metadata guid -> local id mapping (there is no document for system metadata now) - include system metadata elements when replicating data objects (TODO: transfer all system metadata structures with the docinfo request). TODO: remove docid+rev from the systemMetadata table definition
do not use XML files for storing SystemMetadata - use DB tables only.
use update method to update the mapping between local and guid (d1) when we get a force replication request that is an "update
use "object_format" element consistently so that it is replicated across instances https://redmine.dataone.org/issues/1514
insert/update documents with null user and null group to circumvent access control restrictions then update the user_owner and user_updated values to reflect what exists on the originating server (pisco)
use 'user_updated' field when writing the replicated document - allows most recent ownership/permissions to be used (in case LDAP groups have shifted) and is more accurate for both updates and initial inserts (hopefully addresses the replication issue we are having with pisco)
DocumentImpl.delete() now throws finer grained exceptions (not a general exception). Consequently, the classes that call it have been updated to handle the thrown exceptions, including CrudService, ReplicationHandler, and ReplicationService.
adding more debugging and fixing bug with systemmetadata
fixed replication bug where systemmetadata was not getting processed correctly
fixed typo that prevented replication
fixed bugs in listObjects
added code to do database query for listObjects
only call response.getWriter() when we are about to send text/xml to the client, otherwise we end up calling both getWriter() and getOutputStream() - resulting in an illegal state.
use detected XML encoding when reading/writing files; use UTF-8 as default when performing queries in the DB (assume DB is using UTF-8); remove as many PrintWriters (uses system default character encoding only) as possible and construct OutputStreamWriters where explicit encoding can be given....
add support for EML 2.1.1
reformatting logs for robert
added another logging statement
added replicate log statements with the guid and localId
hopefully fixed bug with systemmetadata replication
debug statements in dbsaxhandler
fixing problems with replication and systemmetadata