Bug #7176

Metacat-index RDF/XML subprocessor not populating prov_hasDerivations field

Added by Peter Slaughter over 2 years ago. Updated over 2 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: index
Target version:
Start date: 03/23/2017
Due date:
% Done: 0%
Estimated time:
Bugzilla-Id:

Description

The package https://dev.nceas.ucsb.edu/#view/urn:uuid:c7cda366-5658-4350-ba5a-8d2b84829f5d has one prov relationship 'urn:uuid:94cb9677-be83-4873-aa7c-6691e32229a3 http://www.w3.org/ns/prov#wasDerivedFrom urn:uuid:146239cd-2f41-4312-8f90-75c8cad09a48'

From this prov relationship, the Solr index field 'prov_hasDerivations' for urn:uuid:146239cd-2f41-4312-8f90-75c8cad09a48 should be set to urn:uuid:94cb9677-be83-4873-aa7c-6691e32229a3, but the field is not populated.
See https://dev.nceas.ucsb.edu/knb/d1/mn/v1/query/solr/?q=id:%22urn:uuid:146239cd-2f41-4312-8f90-75c8cad09a48%22

However, the 'prov_wasDerivedFrom' field (the reciprocal relationship) is set for the derivation: https://dev.nceas.ucsb.edu/knb/d1/mn/v1/query/solr/?q=id:%22urn:uuid:94cb9677-be83-4873-aa7c-6691e32229a3%22
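
The two Solr lookups linked above can be rebuilt for any identifier. The following is a minimal Python sketch that only constructs the query URL and does not contact the server. The helper name `solr_query_url` is hypothetical, and the optional `fl` (field list) parameter is a standard Solr parameter that this endpoint is assumed to pass through.

```python
from urllib.parse import quote

# Base URL of the member node query endpoint, taken from the links in this report.
BASE = "https://dev.nceas.ucsb.edu/knb/d1/mn/v1/query/solr/"


def solr_query_url(base, pid, fields=()):
    """Build a Solr query URL that looks up one object by its identifier.

    Hypothetical helper for illustration; 'fields' maps to Solr's 'fl'
    parameter, which the endpoint is assumed (not confirmed) to honor.
    """
    # Quote the identifier as a Solr phrase; keep ':' literal, as in the report's links.
    url = base + "?q=" + quote('id:"%s"' % pid, safe=":")
    if fields:
        url += "&fl=" + quote(",".join(fields), safe=",")
    return url


# Source object: the document whose prov_hasDerivations field is expected to be set.
print(solr_query_url(BASE, "urn:uuid:146239cd-2f41-4312-8f90-75c8cad09a48",
                     fields=["id", "prov_hasDerivations"]))
# Derived object: the document whose prov_wasDerivedFrom field is already set.
print(solr_query_url(BASE, "urn:uuid:94cb9677-be83-4873-aa7c-6691e32229a3",
                     fields=["id", "prov_wasDerivedFrom"]))
```

Called without `fields`, the helper reproduces the links given in this description.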

The problem may be related to the 'prov_hasDerivations' SPARQL query in metacat-index/src/main/resources/application-context-prov-base.xml:
<bean id="prov20150115.hasDerivations" class="org.dataone.cn.indexer.annotation.SparqlField">
...
SELECT (str(?pidValue) as ?pid) (str(?derivedDataPidValue) as ?prov_hasDerivations)
FROM <$GRAPH_NAME>
WHERE {
?derived_data prov:wasDerivedFrom ?source_data .
?source_data cito:documentedBy ?source_metadata .
?source_metadata dcterms:identifier ?pidValue .
?derived_data dcterms:identifier ?derivedDataPidValue .
}

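It is not clear why '?source_metadata' is included in this query: it binds ?pid to the identifier of the metadata that documents the source data, not to the source data object itself, and it matches nothing when the source data has no cito:documentedBy statement. Also, this query is not the reciprocal of the 'prov_wasDerivedFrom' query: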

<bean id="prov20150115.wasDerivedFrom" class="org.dataone.cn.indexer.annotation.SparqlField">
...
SELECT (str(?pidValue) as ?pid) (str(?wasDerivedFromValue) as ?prov_wasDerivedFrom)
FROM <$GRAPH_NAME>
WHERE {
?derived_data prov:wasDerivedFrom ?primary_data .
?derived_data dcterms:identifier ?pidValue .
?primary_data dcterms:identifier ?wasDerivedFromValue .
}

Note that this will also be a problem for the CN DataONE d1_cn_index_processor component.
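
To illustrate how the two WHERE clauses above differ, here is a minimal Python sketch that evaluates them as plain triple-pattern joins over a tiny in-memory graph. It does not use a SPARQL engine or the indexer itself. The node names (`res:*`), the metadata identifier, the assumption that the resource map has no cito:documentedBy statement for the source object, and the `RECIPROCAL` pattern are all hypothetical; `RECIPROCAL` is only one possible reciprocal form, not necessarily the change that was eventually made.

```python
DERIVED = "urn:uuid:94cb9677-be83-4873-aa7c-6691e32229a3"
SOURCE = "urn:uuid:146239cd-2f41-4312-8f90-75c8cad09a48"

# Hypothetical graph: res:* node names are placeholders, and the missing
# cito:documentedBy statement for the source object is an assumption.
GRAPH = [
    ("res:derived", "prov:wasDerivedFrom", "res:source"),
    ("res:derived", "dcterms:identifier", DERIVED),
    ("res:source", "dcterms:identifier", SOURCE),
]

# WHERE clauses copied from the two beans quoted in this report.
HAS_DERIVATIONS = [
    ("?derived_data", "prov:wasDerivedFrom", "?source_data"),
    ("?source_data", "cito:documentedBy", "?source_metadata"),
    ("?source_metadata", "dcterms:identifier", "?pidValue"),
    ("?derived_data", "dcterms:identifier", "?derivedDataPidValue"),
]
WAS_DERIVED_FROM = [
    ("?derived_data", "prov:wasDerivedFrom", "?primary_data"),
    ("?derived_data", "dcterms:identifier", "?pidValue"),
    ("?primary_data", "dcterms:identifier", "?wasDerivedFromValue"),
]
# Hypothetical reciprocal pattern (an assumption, not the committed fix).
RECIPROCAL = [
    ("?derived_data", "prov:wasDerivedFrom", "?source_data"),
    ("?source_data", "dcterms:identifier", "?pidValue"),
    ("?derived_data", "dcterms:identifier", "?derivedDataPidValue"),
]


def match(patterns, triples, binding=None):
    """Yield every variable binding that satisfies all triple patterns."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        ok = True
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # A variable must keep the value it was first bound to.
                if candidate.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield from match(rest, triples, candidate)


def rows(patterns, triples, pid_var, value_var):
    """Return (pid, field value) pairs, mirroring the SELECT clauses."""
    return [(b[pid_var], b[value_var]) for b in match(patterns, triples)]


# Same graph plus a hypothetical metadata document for the source object.
GRAPH_WITH_METADATA = GRAPH + [
    ("res:source", "cito:documentedBy", "res:metadata"),
    ("res:metadata", "dcterms:identifier", "urn:uuid:hypothetical-metadata-pid"),
]

print(rows(HAS_DERIVATIONS, GRAPH, "?pidValue", "?derivedDataPidValue"))
print(rows(WAS_DERIVED_FROM, GRAPH, "?pidValue", "?wasDerivedFromValue"))
print(rows(RECIPROCAL, GRAPH, "?pidValue", "?derivedDataPidValue"))
print(rows(HAS_DERIVATIONS, GRAPH_WITH_METADATA, "?pidValue", "?derivedDataPidValue"))
```

Under these assumptions the 'prov_hasDerivations' pattern returns no rows for the first graph, and for the second graph it reports the metadata identifier as the pid rather than the source data identifier, while the reciprocal pattern pairs the source data identifier with the derived identifier in both cases.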

The resource map for this package is attached.

resmap-no-hasDerivations.xml (8.37 KB) Peter Slaughter, 03/23/2017 03:08 PM

History

#1 Updated by Jing Tao over 2 years ago

  • Target version set to 2.9.0

#2 Updated by Jing Tao over 2 years ago

  • Target version changed from 2.9.0 to 2.8.2

#3 Updated by Jing Tao over 2 years ago

  • Status changed from New to Closed

The query was modified and it worked.
