Task #6040
open
Metacat-index does not handle <references>
Description
I indexed a document from EVOS that uses a reference for a creator rather than the details of the person:
<creator><references>1359152217358</references></creator>
But in the index it shows up as "||" instead of following the reference back to the id where it was declared:
<associatedParty id="1359152217358">...
Updated by ben leinfelder over 11 years ago
Here is a bit of the bean definition used by indexing to pick out the content from EML:
<bean id="eml.origin" class="org.dataone.cn.indexer.parser.CommonRootSolrField" p:multivalue="true" p:root-ref="originRoot">
  <constructor-arg name="name" value="origin" />
</bean>
<bean id="originRoot" class="org.dataone.cn.indexer.parser.utility.RootElement" p:name="origin" p:xPath="//dataset/creator" p:template="[individualName]||[organizationName]">
  <property name="leafs"><list><ref bean="organizationNameLeaf"/></list></property>
  <property name="subRoots"><list><ref bean="individualNameRoot" /></list></property>
</bean>
Updated by ben leinfelder about 11 years ago
- Target version changed from 2.1.0 to 2.1.1
Updated by ben leinfelder about 11 years ago
- Target version changed from 2.1.1 to 2.2.0
Updated by ben leinfelder about 11 years ago
- Target version changed from 2.2.0 to 2.2.1
Updated by ben leinfelder about 11 years ago
- Priority changed from Normal to High
- Target version changed from 2.2.1 to 2.2.0
Apparently this is fixed in cn-index-processor v1.2.0 -- so we will need to pull in this newer dependency in metacat-index and adjust the code accordingly.
Updated by ben leinfelder about 11 years ago
- Target version changed from 2.2.0 to 2.2.1
Updated by ben leinfelder about 11 years ago
- Parent task set to #6114
This is included in the 1.2.0 d1 index release. It will not include || but instead will use blanks. Not a very great "solution" but better.
Updated by Matt Jones about 11 years ago
Spaces aren't really sufficient as a solution, and there are a lot of references fields in EML. We probably need to contribute a fix for this if Skye is not going to fix it for DataONE.
Updated by Jing Tao about 11 years ago
Skye said that a SAX parser is used to parse this information. Handling references may require switching to a DOM parser, which would be a big change.
Updated by ben leinfelder about 11 years ago
Even with a SAX parser, the implementation could keep track of all elements with "id" attributes and anytime a "references" element is encountered, substitute with that node. The tricky part would be when we encounter a references element before the actual element that declares the id -- would have to track the references that are unfulfilled and fill them in when we actually get to the id elements.
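To make the idea above concrete, here is a minimal sketch of a SAX handler that remembers the text of every element with an "id" attribute and fills in <references> either immediately (id already seen) or at the end of the document (id declared later). It is not the cn-index-processor code: the class name ReferenceResolver, the method resolveCreators, and the sample names in the usage note are all made up for illustration, and it only collects plain text for <creator> elements rather than applying the "[individualName]||[organizationName]" template.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch, not part of metacat-index or cn-index-processor.
class ReferenceResolver extends DefaultHandler {
    // State for one open element.
    private static final class Frame {
        final String id;                              // value of the id attribute, or null
        final StringBuilder text = new StringBuilder();
        String refId;                                 // set when a <references> child was seen
        Frame(String id) { this.id = id; }
    }

    private final Deque<Frame> open = new ArrayDeque<>();
    private final Map<String, String> textById = new HashMap<>();
    private final List<String> creators = new ArrayList<>();
    // Creator position -> referenced id that had not been declared yet.
    private final Map<Integer, String> pending = new HashMap<>();

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        open.push(new Frame(atts.getValue("id")));
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (!open.isEmpty()) open.peek().text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        Frame f = open.pop();
        String text = f.text.toString().trim().replaceAll("\\s+", " ");
        Frame parent = open.peek();
        if (qName.equals("references")) {
            // Remember the target id on the parent; do not treat it as content.
            if (parent != null) parent.refId = text;
            return;
        }
        if (f.id != null) textById.put(f.id, text);
        if (qName.equals("creator")) {
            if (f.refId != null) {
                String known = textById.get(f.refId);  // null if declared later
                if (known == null) pending.put(creators.size(), f.refId);
                creators.add(known);
            } else {
                creators.add(text);
            }
        }
        // Bubble text up so a parent sees the text of its descendants.
        if (parent != null) parent.text.append(' ').append(text);
    }

    @Override
    public void endDocument() {
        // Fill in references whose ids were declared after they were used.
        // An id that was never declared leaves a null entry.
        pending.forEach((i, id) -> creators.set(i, textById.get(id)));
    }

    static List<String> resolveCreators(String xml) {
        try {
            ReferenceResolver handler = new ReferenceResolver();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.creators;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }
}
```

Called on a document where the first creator is only <references>1359152217358</references> and the matching <associatedParty id="1359152217358"> appears further down, resolveCreators returns the associated party's text for that creator, so forward references do not by themselves force a move to DOM. The cost is that the text of every id-bearing element is held in memory until the end of the document.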