Task #6040
Metacat-index does not handle <references>
Description
I indexed a document from EVOS that uses a reference for a creator rather than the details of the person:
<creator><references>1359152217358</references></creator>
But in the index it shows up as "||" instead of following the reference back to the id where it was declared:
<associatedParty id="1359152217358">...
History
#1 Updated by ben leinfelder over 7 years ago
Here is a bit of the bean definition used by indexing to pick out the content from EML
<bean id="eml.origin" class="org.dataone.cn.indexer.parser.CommonRootSolrField"
      p:multivalue="true"
      p:root-ref="originRoot">
  <constructor-arg name="name" value="origin" />
</bean>
<bean id="originRoot" class="org.dataone.cn.indexer.parser.utility.RootElement"
      p:name="origin"
      p:xPath="//dataset/creator"
      p:template="[individualName]||[organizationName]">
  <property name="leafs"><list><ref bean="organizationNameLeaf"/></list></property>
  <property name="subRoots"><list><ref bean="individualNameRoot" /></list></property>
</bean>
#2 Updated by ben leinfelder over 7 years ago
- Target version changed from 2.1.0 to 2.1.1
#3 Updated by ben leinfelder over 7 years ago
- Target version changed from 2.1.1 to 2.2.0
#4 Updated by ben leinfelder over 7 years ago
- Target version changed from 2.2.0 to 2.2.1
#5 Updated by ben leinfelder over 7 years ago
- Target version changed from 2.2.1 to 2.2.0
- Priority changed from Normal to High
Apparently this is fixed in cn-index-processor v1.2.0 -- so we will need to pull in this newer dependency in metacat-index and adjust the code accordingly.
#6 Updated by ben leinfelder over 7 years ago
- Target version changed from 2.2.0 to 2.2.1
#7 Updated by ben leinfelder over 7 years ago
- Parent task set to #6114
This is included in the 1.2.0 d1 index release. It will not include || but instead will use blanks. Not a very great "solution" but better.
#8 Updated by Matt Jones over 7 years ago
Spaces aren't really sufficient as a solution, and there are a lot of references fields in EML. We probably need to contribute a fix for this if Skye is not going to fix it for DataONE.
#9 Updated by Jing Tao over 7 years ago
Skye said that a SAX parser is used to parse this information. Handling references may require switching to a DOM parser, which would be a big change.
#10 Updated by ben leinfelder over 7 years ago
Even with a SAX parser, the implementation could keep track of all elements with "id" attributes and anytime a "references" element is encountered, substitute with that node. The tricky part would be when we encounter a references element before the actual element that declares the id -- would have to track the references that are unfulfilled and fill them in when we actually get to the id elements.
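A minimal sketch of that bookkeeping, assuming plain JDK SAX and nothing from cn-index-processor. The class name ReferenceResolvingHandler and its methods are hypothetical, and it only collects the text of each <creator>, not the templated origin value. Unfulfilled references are kept as pending slots and filled in once parsing is done, so it does not matter whether the id element comes before or after the <references> element.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration; not part of metacat-index or cn-index-processor.
class ReferenceResolvingHandler extends DefaultHandler {
    // Text content of every element that declares an "id" attribute.
    private final Map<String, StringBuilder> declaredById = new HashMap<>();
    // Buffers of the id-bearing elements that are currently open.
    private final Deque<StringBuilder> openIdBuffers = new ArrayDeque<>();
    // One flag per open element: did it declare an id?
    private final Deque<Boolean> hasIdStack = new ArrayDeque<>();
    // One slot per <creator>: {pending reference id or null, literal text}.
    private final List<String[]> creatorSlots = new ArrayList<>();
    private StringBuilder creatorText;  // non-null while inside <creator>
    private StringBuilder referenceId;  // non-null while inside <references>
    private String creatorRef;          // id referenced by the current <creator>

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        String id = atts.getValue("id");
        hasIdStack.push(id != null);
        if (id != null) {
            StringBuilder buf = new StringBuilder();
            declaredById.put(id, buf);
            openIdBuffers.push(buf);
        }
        if ("creator".equals(qName)) {
            creatorText = new StringBuilder();
            creatorRef = null;
        } else if ("references".equals(qName)) {
            referenceId = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (referenceId != null) {
            referenceId.append(ch, start, length);  // the id itself is not display text
            return;
        }
        for (StringBuilder buf : openIdBuffers) {
            buf.append(ch, start, length);
        }
        if (creatorText != null) {
            creatorText.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("references".equals(qName)) {
            if (creatorText != null) {
                creatorRef = referenceId.toString().trim();
            }
            referenceId = null;
        } else if ("creator".equals(qName)) {
            creatorSlots.add(new String[] {creatorRef, creatorText.toString()});
            creatorText = null;
        }
        if (hasIdStack.pop()) {
            openIdBuffers.pop();
        }
    }

    // Fill pending slots from the id map; unknown ids resolve to an empty string.
    List<String> creators() {
        List<String> out = new ArrayList<>();
        for (String[] slot : creatorSlots) {
            String text = slot[1];
            if (slot[0] != null) {
                StringBuilder target = declaredById.get(slot[0]);
                text = (target == null) ? "" : target.toString();
            }
            out.add(text.trim().replaceAll("\\s+", " "));
        }
        return out;
    }

    static List<String> parse(String xml) {
        try {
            ReferenceResolvingHandler handler = new ReferenceResolvingHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.creators();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse document", e);
        }
    }
}
```

Calling ReferenceResolvingHandler.parse(...) on a document where a <creator> holds only a <references> element returns the text of the element that declares that id, even when that element appears later in the document. The cost is holding the text of every id-bearing element in memory until the end of the parse.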
#11 Updated by ben leinfelder about 7 years ago
- Parent task deleted (#6114)