use new IDocumentDeleteSubprocessors to handle clean-up of annotation index fields when annotations are removed.
moved RDF XML subprocessor to cn-index project.
add stub merge method to match Skye's recent refactoring to support reindexing when deletes are performed.
use refactored classes from cn-index-processor. still in flux, but improving to better handle non-xml files
add accessors for bean property
use ISolrDataField because RDF subprocessor uses the SparqlField subclass.
update classes and context files that use cn-index-processor classes. allowing document subprocessors to be less tied to XML.
use sparql field and triple store from cn-index_processor (refactor). include annotatorSubprocessor for testing in metacat-index
when we remove the solr index of a resource map, we don't need to know the content of the resource map. Instead, we search the solr index to get the information.
Add code to handle removing the resource map index.
Add code to handle removing a resource map solr index.
Create a valid URI by using all lowercase letters when creating a name for the triple model in the Rdf Xml Subprocessor. See bug: https://projects.ecoinformatics.org/ecoinfo/issues/6595
When indexing annotations from RDFs, use the doc id to access the system metadata, not the model name since they are not always the same.
Add PROV relationships to the Solr schema. Populate the fields using the RdfXmlSubprocessor
update to use v2 types for indexing
handle multiple subprocessors (RDF and ORE) before the object gets indexed by making sure to merge the solr doc map before submitting to the index.
use a non-public rightsHolder for both EML and Annotation test documents now that the RDF subprocessor checks each annotation to see that it came from a user that has write permission for the object being annotated.
pass around the object file path rather than the data stream so that multiple subprocessors can index the same object and not consume the stream before it gets to the next one. In preparation for extending the assertions stored in OREs. https://projects.ecoinformatics.org/ecoinfo/issues/6548
only allow multiple values for multi-valued fields.
allow multivalued fields to be indexed using the "fields" pass through.
handle null Boolean in SM.archived field
switch to index standard since it is more likely we will be able to determine this from our existing EML attribute information. https://projects.ecoinformatics.org/ecoinfo/issues/6253
switch to the OpenAnnotation (OA) model for annotating datapackages with measurements/characteristics (semtools)
include ID field as a minimum for indexing additional fields.
correctly include stacktrace for error debugging.
return null if there is no existing SolrDoc for the given pid.
check for existing index document before trying to use existing fields.
allow indexing of RDF documents - provide a sparql query that will return values for the field name. Using measurement_sm initially (a dynamic multivalued solr field). https://projects.ecoinformatics.org/ecoinfo/issues/6253
check for existing documents - don't assume it exists.
Unify solr indexing with an IndexTask that is added to the queue -- allows us to send more than just the systemMetadata to the indexer. Initially this is for READ event counts for each document. https://projects.ecoinformatics.org/ecoinfo/issues/6346
Rename the IndexGenerator to IndexGeneratorTimerTask.
Fixed a bug where, when a data file was archived, the solr index for the metadata object still kept the "documents" element.
made the delete method synchronized.
If an object was archived, the solr index will be removed for it.
Use the setting from the metacat-common component.
Use the d1_cn_index_processor 1.2.0 version.
combine the index code for failed ids and other ids.
Clean up the code.
The IndexGenerator will index the obsoleted data objects as well.
Remove the obsoletes chain from the update method in the SolrIndex class.
When an object is archived, the solr index will not be removed.
merge from 2.2 branch: remove the index queue item when it is being processed. https://projects.ecoinformatics.org/ecoinfo/issues/6117
remove any index event errors if the pid has successfully been reindexed. https://projects.ecoinformatics.org/ecoinfo/issues/6089
Change the parameter order of the constructor. We may reuse some code from d1_cn_processor.
Modified the documentation.
Use the ResourceMapException when a component of a resource map isn't found in the solr index.
Add a ResourceMapException.
Use the class path configuration of spring to replace the file configuration. We can reuse the application context files in the d1_cn_index_processor jar.
Add a constructor.
Remove the constructor.
Remove a logFile method.
use the v1.1.x branch ResourceMap class for metacat-index
Exceptions are now caught inside the loop that deletes the solr index.
Remove the code to write some debug information into a temporary file.
Use the ResourceMapFactory rather than the ResourceMap constructor to build a resource map.
Write the ids from metacat into a temporary file.
Move a file to the temp dir.
Add a method to write ids which will be indexed into a file.
In addition to the getArchived() method, the getObsoletedBy() method is now used to determine whether the object is archived.
Add code to handle deleted ids.
Use schedule method to start the index.
Add code to write the error message to the log in the itemRemoved method.
When determining the time range, the equality comparison was removed.
Add code to handle failed ids.
Remove the EventLog write.
Add the EventLog code.
Throw an exception if the subprocessor can't handle the document.
Check if all the components of a resource map have been processed before processing the resource map.
Fixed a bug where the event log couldn't save the actual latest process date.
Change the date format. Remove the replication part of log4j.
Use a new date format.
Add code so that only the ids with the correct system metadata modification time are added to the index queue.
Add code to get and set the last process date.
move IndexEvent into metacat-common. Preparation for Metacat responding to events and writing them to a persistent store. https://projects.ecoinformatics.org/ecoinfo/issues/5944
refactor IndexEventLog a bit to simplify type/action information. prep for serializing IndexEvent objects to Metacat. https://projects.ecoinformatics.org/ecoinfo/issues/5944
remove serial number from indexeventlog - it is not used elsewhere in the api. https://projects.ecoinformatics.org/ecoinfo/issues/5944
correct spelling for index.eventlog.classname property
use an independent ISet<SystemMetadata> structure to communicate objects that should be indexed by metacat-index. https://projects.ecoinformatics.org/ecoinfo/issues/5943
consolidate SystemMetadata map retrieval in preparation for using a different structure for objects to index.
adding ability to remove event from the [error] queue.
Add code to implement setting and getting the last processed date.
Index only those objects that were modified after the marked time.
Add set and get methods for the lastprocessedDate in the IndexEventLog. Remove the code to write the successful event.
Log the timed index jobs.
Add the code to log the failed events.
Add a temporary file log for debugging.
Add a serial number for the event. Add method to set events to be archived.
Add a new class variable - isArchived for class IndexEvent.
Update the documentation about those classes.
Add an event and eventlog for the index.
Use the identifier set to get the list of ids in the member node.
The returned ISet should contain Identifier objects.
Add method to get identifier set.
Set up a Timer to run the regenerating solr index task periodically.
Add code to handle deleting data package information when a pid is deleted from the solr index.
Add two static methods to get the SystemMetadata and data object InputStream for the specified id.
Add code to check if the metacat.properties is available.
If solr is not enabled, it will not run.
Use another thread in the Servlet init method to wait for hazelcast.