index the multiple values provided by award number lookup. https://github.com/NCEAS/mdqengine/issues/72
Remove the reference to the bean rdfXmlSubprocessor, which was defined in the removed file application-context-oa.xml.
remove unused OA indexing file and reference
Add the provenance subprocessor.
extract "metadata" fields from result output now rather than dedicated fields in the model
include group lookup field in mdq indexing
use text() in xpath for multi-valued funder extraction
correct xpath and solr settings for funder field
make funder a multivalued field.
extract check.echo.funder.1 output value for the "funder" index field
Add the code to handle the merge-needed fields having multiple pairs.
During index merging process, if a field is a system metadata field and current document doesn't have the field, we don't need to merge it from the existing solr document.
re-include mdq context file after merge
include mdq composite score
remove root node in xpath (for some reason this wasn't working on mn-demo-8 but was locally).
Increase the version to 2.8.1
Merge the code for rdfxml subprocessor from d1_cn_index_processor to metacat.
include v1 in the run mdq formatId
Add the prefix "prov_" to the solr fields.
adjust color-based score calculations to match UI and add up to total result count
add indexing for scores based on successful checks by check.type
use dateConverter for run timestamp indexing
include initial MDQ run processing in metacat-index
Use a new class to overwrite the class RdfXmlSubprocessor in d1-processor since that one has a method to use solr http server directly.
merge changes from d1 indexing lib
Add a call to PropertyConfigurator.configureAndWatch to monitor changes to the log4j.properties file.
Add more log information.
Remove the import of JiBXException.
Replace the JiBXException by our own MashallingException.
Centralize the version which will be modified. Bump the d1_cn_index_processor version to 2.3.0 snapshot.
Change it to 2.8.0 version.
Process the noaa variant of isotc211.
Add the file from d1_cn_index_processor.
Add a new copy from d1_cn_index_processor.
Add a statement to help diagnose issues.
Change it to 2.7.0 snapshot.
Change the version of d1_cn_index_processor to 2.2
Add beans for the iso index.
Add a bean file for the iso index.
add checks on archived flag to avoid NPE.
only consult fields to merge if there was an existing referenced doc
subclass AnnotatorSubprocessor for use in metacat-index (uses embedded solr server and solrj for retrieving/merging existing documents).
bump trunk to 2.6.0-SNAPSHOT and pull in d1 dependencies at 2.1.0-SNAPSHOT to continue trunk development.
add fileName, mediaType and mediaTypeProperties to solr schema and v2 system metadata processor
refactor v2 context bean to use the v1 pattern used in metacat
include seriesId in solr schema and context file (v2 system metadata)
Add the code to print the exception.
add missing quotation mark
fix xpath from CN changes for isPublic. https://redmine.dataone.org/issues/7374
include hierarchical permissions when evaluating isPublic during indexing. https://redmine.dataone.org/issues/7374
Index science metadata fields for the Dublin Core Extended metadata format. - Use d1_cn_index_processor 1.4.5 in metacat-index and update beans with new dcx subprocessor and xsi namespace
merge CN annotation context files to metacat (MN) to support semantic index fields.
use new IDocumentDeleteSubprocessors to handle clean-up of annotation index fields when annotations are removed.
moved RDF XML subprocessor to cn-index project.
move RDF/XML subprocessor and example configuration with SPARQL query to the cn index project from metacat so that it can be used by prov team when indexing ProvONE models in ORE documents
add fieldsToMerge property for annotation updates
add stub merge method to match Skye's recent refactoring to support reindexing when deletes are performed.
use refactored classes from cn-index-processor. still in flux, but improving to better handle non-xml files
add accessors for bean property
use ISolrDataField because RDF subprocessor uses the SparqlField subclass.
use input stream instead of Document for resource map processing test
update classes and context files that use cn-index-processor classes. allowing document subprocessors to be less tied to XML.
use sparql field and triple store from cn-index_processor (refactor). include annotatorSubprocessor for testing in metacat-index
let metacat-index lookup annotations for indexing rather than the metacat "reindex" action.
remove dev-testing in favor of maven.dataone.org repo
when we remove the solr index of a resource map, we don't need to know the content of the resource map. Instead, we will search the solr index to get the information.
Add the code to handle removing the resource map index.
Add code to handle removing a resource map solr index.
Create a valid URI by using all lowercase letters when creating a name for the triple model in the Rdf Xml Subprocessor. See bug: https://projects.ecoinformatics.org/ecoinfo/issues/6595
Change the d1_cn_index_processor version from 1.3.0 snapshot to 2.0.0 snapshot.
When indexing annotations from RDFs, use the doc id to access the system metadata, not the model name since they are not always the same.
Add PROV relationships to the Solr schema. Populate the fields using the RdfXmlSubprocessor
Add wasDerivedFrom field to the Solr schema and use Sparql query to retrieve the value from the RDF
update to use v2 types for indexing
add support for v2 DataONE API.
handle multiple subprocessors (RDF and ORE) before the object gets indexed by making sure to merge the solr doc map before submitting to the index.
use default "metacat" context name for metacat-index testing.
include ORE formatId as handled by the RDF subprocessor and index prov:wasDerivedFrom field where it exists in the RDF model. https://projects.ecoinformatics.org/ecoinfo/issues/6548
use a non-public rightsHolder for both EML and Annotation test documents now that the RDF subprocessor checks each annotation to see that it came from a user that has write permission for the object being annotated.
test for update using the updated EML file, not the original. Also add the SM to the shared map so that the indexing process can consult SM.accessPolicy when indexing annotations that assert things about those test documents.
ignore the metacat/solr comparator tests - they are one-offs.
pass around the object file path rather than the data stream so that multiple subprocessors can index the same object and not consume the stream before it gets to the next one. In preparation for extending the assertions stored in OREs. https://projects.ecoinformatics.org/ecoinfo/issues/6548
only allow multiple values for multi-valued fields....
use newer httpclient library so that Jena's dependency is met - this goes all the way back to d1_common/libclient needing to pull in the newer library.
allow multivalued fields to be indexed using the "fields" pass through.
Localized the file which doesn't have the bean for dataUrl.
Remove the reference to the bean eml.fileID.
Remove the bean named eml.fileID which used the ResolveSolrField class.
calculate geohash_3 to three places (typo)
use NSEW for the bounding box geohash calculation from EML - all versions
Using 1.3.0-SNAPSHOT from d1_cn_index_processor
Add beans to support geohashes
handle null Boolean in SM.archived field
use Matthew Jones for test creator since he has an ORCID in their staging environment.
augment annotation indexing test/sample to include orcid annotation. https://projects.ecoinformatics.org/ecoinfo/issues/6267 https://projects.ecoinformatics.org/ecoinfo/issues/6423
include characteristic_sm field with SPARQL query
switch to index standard since it is more likely we will be able to determine this from our existing EML attribute information. https://projects.ecoinformatics.org/ecoinfo/issues/6253