upgrade to Metacat 2.1.0 on the trunk. This includes a new index_event table for storing indexing events that need to be reprocessed. https://projects.ecoinformatics.org/ecoinfo/issues/5944
stub for storing IndexEvent objects in Metacat (from metacat-index processing). https://projects.ecoinformatics.org/ecoinfo/issues/5944
do not force a get() during refresh (it was causing EML-defined data access rules to be lost when inserting EML docs about data files). Note that this reverses a change that was meant to trigger indexing, but we now use a new queue to share index events with metacat-index, so the forced get() should no longer be necessary.
do not use tmp file to return an inputstream on read() operations - just read from the file we already have. https://projects.ecoinformatics.org/ecoinfo/issues/6009
use standard File.createTempFile() method for uploaded data files and delete them when we are done with them. https://projects.ecoinformatics.org/ecoinfo/issues/6008
correct regex for whitespace in D1 identifier.
use an independent ISet<SystemMetadata> structure to communicate objects that should be indexed by metacat-index. https://projects.ecoinformatics.org/ecoinfo/issues/5943
do not create solr-home if there is no template to copy into that directory (need to be able to create it later if/when someone decides to use and deploy metacat-index). https://projects.ecoinformatics.org/ecoinfo/issues/6006
do not attempt to copy solr-home template from metacat-index webapp if it does not exist. This would be in cases where metacat-index is not deployed. https://projects.ecoinformatics.org/ecoinfo/issues/6006
Solr will be enabled if it is in the db.enabledEngines.
do not require PortalCertificateManager be configured. Fix NPE because session was not created when using old sessionid-based authentication. https://projects.ecoinformatics.org/ecoinfo/issues/5942
handle client certificates, portal certificates and jsessionid as three ways to prove you are an authenticated user. https://projects.ecoinformatics.org/ecoinfo/issues/5942
Use some constants from the EnabledQueryEngines.
Updated documentation, and added modification date to the sitemap index file entries.
Remove unused import.
Modified Sitemap class to also generate the sitemap index file that is needed when more than one sitemap file is provided.
use ContentTypeInputStream interface (and ByteArray implementation) to specify the desired content-type of the InputStream returned by MN.query().
load the evicted SM back into the map on a "Refresh" so that listeners hear the update. (metacat-index, for example)
switch back to log4j statements now that I am sure certificate delegation is working.
use System.out.println until the oa4mp logging issue is resolved.
add logging for portal certificate look up process.
use relative path for oa4mp_client.xml (within servlet context). https://projects.ecoinformatics.org/ecoinfo/issues/5936
first pass at integrating CILogon/MyProxy certificates in Metacat. Configuration is specific to mn-demo-4.test.dataone.org for the time being (this will cause localhost deployments to fail webapp deployment). https://projects.ecoinformatics.org/ecoinfo/issues/5936
Updated Sitemap generation to use latest version of the sitemap protocol schemas.
Use the SolrQueryServiceController to get the spec version and index schema information.
Change the package of SolrQueryReponseWriterFactory and SolrQueryResponseTransformer.
Use the new query(SolrParams param) method of the SolrQueryServiceController.
Use the SolrQueryServiceController class to handle the query.
Move the code that transforms the query response into an InputStream to the metacat-common module. Remove some obsolete imports.
Move the code to generate the QueryResponseWriter to the metacat-common module. So it can be shared with the metacat-index module.
organize imports
remove extra lines from returned <docid/> block. https://projects.ecoinformatics.org/ecoinfo/issues/5932
Allow use of PID instead of docid in the Perl registry. At least for reading/editing and deleting existing content. Does not create content using a pid. https://projects.ecoinformatics.org/ecoinfo/issues/5932
initialize the SOLR home directory if it does not already exist.
The query result reflects changes made in the metacat-index module only after the core is reloaded.
Fixed a bug so that "OR" is placed correctly in the query. Also removed the user "authorized_user" from the rightsholder clause in the query.
Use the set of subjects to replace the user and groups for the solr query.
escape special XML characters when constructing a pathquery from user input (&). https://projects.ecoinformatics.org/ecoinfo/issues/3017
adjust action=zip behavior to use full docids and entity names (data files) for the zip entry. Also uses the given qformat to render the metadata. https://projects.ecoinformatics.org/ecoinfo/issues/3816
Add the rightsHolder in the access filter.
adjust action=zip behavior to use full docids when checking for permissions/existence. https://projects.ecoinformatics.org/ecoinfo/issues/3816
Add code to handle query for the http solr server.
Use a new class to handle the solr query engine description request.
Add double quotes to surround the user or group names in the access fq. This fixes the issue that occurs when the names contain white space.
Add the access query filter.
Allow use of server-side XSLT for SOLR queries that include "wt=<qformat>". https://projects.ecoinformatics.org/ecoinfo/issues/5812
Allow null SM.submitter (per schema). There were null values in cn-dev (and probably elsewhere, since it is technically allowed in the schema). But with a null value, we need to have a null Subject for the SM.submitter field, not a Subject with a null getValue() return. Encountered this when testing for: https://projects.ecoinformatics.org/ecoinfo/issues/5929.
add space to prevent syntax error when additional clause is appended. https://projects.ecoinformatics.org/ecoinfo/issues/5929.
Change replication 'update' query to use a LEFT JOIN to improve the performance of the replication update action, which had been causing an HTTP timeout for large metacat installations. See https://projects.ecoinformatics.org/ecoinfo/issues/5929.
Add the code to read the index field information from the schema.xml.
Add code to handle the solr index information. We still need to figure out how to get the information.
Add the solr engine to the engine list.
use maven to manage most jar dependencies in Metacat. Exceptions include: LSID, Datamanager (EML),
Add code to handle solr query.
Add a class to handle solr query.
Remove those obsolete index classes.
Merging the METACAT_2_0_6_BRANCH changes for [M|C]NodeService into the trunk.
allow verification date to be updated for replicas (patch from Skye). https://redmine.dataone.org/issues/3699
select only distinct guids (synch may have failed more than once for any given guid). https://redmine.dataone.org/issues/3539
include xml_revisions. do not allow removal of server_location = 1 documents (these are not replicas). https://redmine.dataone.org/issues/3539
include size and format DataCite elements (optional) and use more general resourceType without formatId in them (Dataset/metadata and Dataset/data). http://schema.datacite.org/meta/kernel-2.2/doc/DataCite-MetadataKernel_v2.2.pdf
look up the title for EML files when registering DOIs. look up the creator from DataONE CN (if available). add EML-based test. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5513
Set the session to null so that the call uses the CN certificate when calling MN.systemMetadataChanged();
To keep all nodes up to date with regard to system metadata changes, add the broadcastSystemMetadataChange() method that finds replica MNs in the node list and calls systemMetadataChanged(). Modify setReplicationStatus() and updateReplicationMetadata() to fire this off when a replica status changes to completed. We may decide to inform MNs at other times too, but this is a conservative amount of calls going to the MNs for now.
refactor DOI registration into separate class. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5513
refactor using ezid-client changes that split field names and values into separate enums. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5513
Correctly mint and register DOIs in the MN API implementation. Add tests to exercise minting and creating. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5513
register DOIs with minimal DataCite metadata. still need to determine which details to include and when, but the plumbing is in place as we refine those rules. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5513
class for removing failed/invalid replicas from target nodes that previously held replicated content (KNB/LTER/PISCO/etc). https://redmine.dataone.org/issues/3539
disable EZID/DOI minting by default since we do not yet have a means of tracking minted DOIs and augmenting metadata for them when we actually receive the object in a subsequent create() or update() call. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5753
tweak to pathquery/generic xpath handling
group user_owner clause as "AND (... OR .... OR ....)" to handle multiple pathquery <owner> elements. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5880
accidentally added
typo
Search and indexing with Lucene/SOLR. Requires a manually configured SOLR installation. Not currently used by the rest of metacat.
generate ID from UUID. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5840
make sure serial version is included or set on MN.update(). http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5793
Quick fix for bad handling of non-default data/backup directories.
remove indexing task from the queue when we are updating the document
move DocInfo parsing into utilities project so that it can be used by Morpho as well as Metacat. http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5737
use default count = 1000 for CN.listObjects rather than -1 (because now -1 will cause an SQL error)
default replicaStatus to true for the CN.listObject call
make sure to call lock() on the SM when updating rightsholder (like every other method that gets a lock object from HZ).
CN.search() is not implemented by metacat -- making that explicit and also testing for it.
default replicaStatus (aka "show replicas in results") to true rather than false
add debug statements for listObject slice debugging
Do not set headers until response is ready to send (5756)
use dual query for query slicing - one for count, another for the actual records when requested. https://redmine.dataone.org/issues/3065
get total (or subtotal when non-slicing params are present) count as a separate query from the field selection query.
include Skye's suggestions about correctly limiting by D1 Event types
first pass at DOI minting using the EZID service in mn.generateIdentifier(). http://bugzilla.ecoinformatics.org/show_bug.cgi?id=5755
Fix a minor bug in listObjects() where total was set incorrectly when count=0. The definition of total in the D1 architecture docs says 'The total number of entries in the source list from which the slice was extracted.' With count=0, we assume the total is the total count from the entire object store. Needs testing.
remove empty package
rollback the delete() when there is an error performing part of it -- don't want to end up with partial delete.
use Identifier object not String when retrieving SM from the HZ map to set archived during delete()
for MN.update() we needed to pass the original pid, not the new pid
do not reject any schemes -- all handled the same at the moment.
simple autogen-based implementation of MN.generateIdentifier(). does not support DOIs, ARKs, etc. It does support including a fragment, returning an identifier like "<fragment>.2012113010215298206"
add link for reference on how to do record limits in oracle
limit /log and /object calls to configurable maximum count for paging. defaults to existing Metacat value of 7000
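
The paging limit in the entry above can be pictured as a simple clamp on the requested count. This is a minimal sketch, not Metacat's actual code: the class name PagingLimit, the method effectiveCount(), and the choice to map negative counts to the maximum are all assumptions made for illustration. Only the default of 7000 comes from the log entry.

```java
/**
 * Hypothetical helper that clamps a client-requested page size to a
 * configurable maximum. Names and behavior are illustrative only.
 */
public class PagingLimit {

    /** Default maximum taken from the log entry; assumed to be overridable via configuration. */
    public static final int DEFAULT_MAX_COUNT = 7000;

    private final int maxCount;

    public PagingLimit(int maxCount) {
        if (maxCount <= 0) {
            throw new IllegalArgumentException("maxCount must be positive");
        }
        this.maxCount = maxCount;
    }

    /**
     * Returns the number of records that should actually be fetched.
     * Assumption: a negative request (for example -1, meaning "everything")
     * is treated as a request for the maximum page size.
     */
    public int effectiveCount(int requested) {
        if (requested < 0) {
            return maxCount;
        }
        return Math.min(requested, maxCount);
    }

    public static void main(String[] args) {
        PagingLimit limit = new PagingLimit(DEFAULT_MAX_COUNT);
        System.out.println(limit.effectiveCount(1000));   // below the limit: unchanged
        System.out.println(limit.effectiveCount(50000));  // above the limit: clamped
    }
}
```

With a clamp like this, a caller that needs more records than the maximum would page through the results by advancing its start offset rather than by raising count.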