.. raw:: latex

  \newpage


Metacat Indexing
===========================
Lorem ipsum
    
Solr background information
---------------------------
Features:

* something
* something
* more
* even more

Something to explain the advantage of Solr over the old Metacat index approach
    
Indexed documents and fields
-----------------------------
Metacat reuses the default DataONE index, which includes many common metadata formats
out of the box:

1. EML
2. FGDC
3. Dryad
    
Default indexed fields
-----------------------
Describe the existing fields as the DataONE docs do, with a link to those docs.
    
Index configuration overview
----------------------------
Describe the configuration files and extension points for the implementation
    
Adding additional document types and fields
--------------------------------------------
Step-by-step guide for adding new documents and indexed fields.
    
Querying the index
--------------------
Provide example Solr queries and expected results. Show a variety of return types
and query facets.
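
As a starting point, the sketch below assembles a Solr query URL using only the
Java standard library. The request parameters (``q``, ``fl``, ``rows``, ``wt``,
``facet``, ``facet.field``) are standard Solr parameters. The endpoint URL and
the field names ``id``, ``title`` and ``formatId`` are assumptions made for
illustration and should be checked against the schema of the actual index.

.. code-block:: java

    import java.io.UnsupportedEncodingException;
    import java.net.URLEncoder;
    import java.util.LinkedHashMap;
    import java.util.Map;
    import java.util.StringJoiner;

    class SolrQueryUrlExample {

        // URL-encodes one parameter name or value as UTF-8.
        static String encode(String s) {
            try {
                return URLEncoder.encode(s, "UTF-8");
            } catch (UnsupportedEncodingException e) {
                throw new IllegalStateException(e); // UTF-8 is always supported
            }
        }

        // Joins parameters into "k1=v1&k2=v2", preserving insertion order.
        static String toQueryString(Map<String, String> params) {
            StringJoiner joiner = new StringJoiner("&");
            for (Map.Entry<String, String> entry : params.entrySet()) {
                joiner.add(encode(entry.getKey()) + "=" + encode(entry.getValue()));
            }
            return joiner.toString();
        }

        public static void main(String[] args) {
            Map<String, String> params = new LinkedHashMap<>();
            params.put("q", "title:soil");          // hypothetical field name
            params.put("fl", "id,title");           // fields to return (assumed names)
            params.put("rows", "10");               // maximum number of results
            params.put("wt", "json");               // response format
            params.put("facet", "true");            // enable faceting
            params.put("facet.field", "formatId");  // hypothetical facet field

            // Placeholder endpoint; substitute the query URL of your deployment.
            String baseUrl = "https://metacat.example.org/solr/select";
            System.out.println(baseUrl + "?" + toQueryString(params));
        }
    }

Changing ``wt`` selects a different return type (for example ``xml`` or
``json``), and ``facet.field`` can name any indexed field for which counts are
wanted.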
    
Access Policy enforcement
-------------------------
Explain how access control is processed and honored when utilizing the index.
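
This section does not yet state how Metacat enforces access policy. One common
pattern with Solr, shown here purely as an illustration, is to append a filter
query (``fq``) that restricts results to documents readable by the caller's
subjects. The field name ``readPermission`` and the example subjects are
assumptions, not confirmed details of the Metacat implementation.

.. code-block:: java

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    class AccessFilterExample {

        // Wraps a subject in double quotes, escaping backslashes and quotes,
        // so the query parser treats it as a single phrase.
        static String quote(String subject) {
            return "\"" + subject.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
        }

        // Builds a filter such as field:("a" OR "b") for the given subjects.
        static String buildReadFilter(String permissionField, List<String> subjects) {
            if (subjects.isEmpty()) {
                throw new IllegalArgumentException("at least one subject is required");
            }
            List<String> quoted = new ArrayList<>();
            for (String subject : subjects) {
                quoted.add(quote(subject));
            }
            return permissionField + ":(" + String.join(" OR ", quoted) + ")";
        }

        public static void main(String[] args) {
            // Hypothetical subjects: the public group plus one authenticated user.
            List<String> subjects = Arrays.asList("public", "CN=Jane Doe,O=Example,C=US");
            System.out.println("fq=" + buildReadFilter("readPermission", subjects));
        }
    }

Applying the filter on the server side, rather than trusting a client-supplied
parameter, keeps a caller from widening their own access.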
    
Regenerating the index from scratch
-----------------------------------
When the Solr index has been drastically modified, a complete regeneration of the
index may be necessary. To accomplish this:

Step-by-step instructions

NOTE: This may take a long time depending on the size of your Metacat store.
    
Class design overview
----------------------

.. figure:: images/indexing-class-diagram.png

   Figure 1. Class design overview.
   
..
  @startuml images/indexing-class-diagram.png

  package cn-index-processor.parser {

      interface IDocumentSubprocessor {
          + boolean canProcess(Document doc)
          + initExpression(XPath xpath)
          + Map<String, SolrDoc> processDocument(String identifier, Map<String, SolrDoc> docs, Document doc)
      }
      class AbstractDocumentSubprocessor {
          - List<SolrField> fields
      }
      class ResourceMapSubprocessor {
      }
      class ScienceMetadataDocumentSubprocessor {
      }

      interface ISolrField {
          + initExpression(XPath xpathObject)
          + List<SolrElementField> getFields(Document doc, String identifier)
      }
      class SolrField {
          - String name
          - String xpath
          - boolean multivalue
      }
      class CommonRootSolrField {
      }
      class FullTextSolrField {
      }
      class MergeSolrField {
      }
      class ResolveSolrField {
      }
      class SolrFieldResourceMap {
      }

  }

  IDocumentSubprocessor <|-- AbstractDocumentSubprocessor
  AbstractDocumentSubprocessor <|-- ResourceMapSubprocessor
  AbstractDocumentSubprocessor <|-- ScienceMetadataDocumentSubprocessor

  ISolrField <|-- SolrField
  SolrField <|-- CommonRootSolrField
  SolrField <|-- FullTextSolrField
  SolrField <|-- MergeSolrField
  SolrField <|-- ResolveSolrField
  SolrField <|-- SolrFieldResourceMap

  AbstractDocumentSubprocessor o--"*" ISolrField

  package metacat.index {

      class MetacatIndex {
          - GenericIndex index
          + insert(String pid, InputStream)
          + update(String pid, InputStream)
          + remove(String pid)
          + OutputStream query(String solrQuery)
      }

      class GenericIndex {
      }
      class SolrjIndex {
      }
      class Embedded {
      }

  }

  GenericIndex <|-- SolrjIndex
  SolrjIndex <|-- Embedded

  package solr {

      abstract class SolrServer {
          + add(SolrInputDocument doc)
          + deleteByQuery(String id)
          + query(SolrQuery query)
      }
      class EmbeddedSolrServer {
      }
      class HttpSolrServer {
      }

  }

  SolrServer <|-- EmbeddedSolrServer
  SolrServer <|-- HttpSolrServer

  SolrjIndex o--"1" HttpSolrServer
  Embedded o--"1" EmbeddedSolrServer
  MetacatIndex o--"1" GenericIndex

  @enduml
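
To make the parser side of the class design more concrete, the following
minimal sketch mirrors the three attributes that the diagram gives
``SolrField`` (``name``, ``xpath``, ``multivalue``) and extracts values from an
XML document with the standard ``javax.xml.xpath`` API. The class
``SimpleSolrField``, its ``getValues`` method, and the sample document are
hypothetical and are not the cn-index-processor implementation, whose
``getFields`` method returns ``SolrElementField`` objects.

.. code-block:: java

    import java.io.StringReader;
    import java.util.ArrayList;
    import java.util.List;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.xpath.XPathConstants;
    import javax.xml.xpath.XPathFactory;
    import org.w3c.dom.Document;
    import org.w3c.dom.NodeList;
    import org.xml.sax.InputSource;

    class SimpleSolrField {
        private final String name;        // index field name
        private final String xpath;       // where the value lives in the document
        private final boolean multivalue; // keep all matches, or only the first

        SimpleSolrField(String name, String xpath, boolean multivalue) {
            this.name = name;
            this.xpath = xpath;
            this.multivalue = multivalue;
        }

        String getName() {
            return name;
        }

        // Returns the non-empty text of every node matched by the XPath,
        // stopping after the first value for single-valued fields.
        List<String> getValues(Document doc) {
            try {
                NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                        .evaluate(xpath, doc, XPathConstants.NODESET);
                List<String> values = new ArrayList<>();
                for (int i = 0; i < nodes.getLength(); i++) {
                    String text = nodes.item(i).getTextContent().trim();
                    if (text.isEmpty()) {
                        continue;
                    }
                    values.add(text);
                    if (!multivalue) {
                        break;
                    }
                }
                return values;
            } catch (Exception e) {
                throw new IllegalArgumentException("cannot evaluate " + xpath, e);
            }
        }

        // Parses an XML string into a DOM document.
        static Document parse(String xml) {
            try {
                return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                        .parse(new InputSource(new StringReader(xml)));
            } catch (Exception e) {
                throw new IllegalArgumentException("cannot parse XML", e);
            }
        }

        public static void main(String[] args) {
            // Made-up metadata document, not a real EML record.
            Document doc = parse("<dataset><title>Soil cores</title>"
                    + "<keyword>soil</keyword><keyword>carbon</keyword></dataset>");
            SimpleSolrField title = new SimpleSolrField("title", "/dataset/title", false);
            SimpleSolrField keywords = new SimpleSolrField("keywords", "/dataset/keyword", true);
            System.out.println(title.getName() + " = " + title.getValues(doc));
            System.out.println(keywords.getName() + " = " + keywords.getValues(doc));
        }
    }

In the diagram, a subprocessor aggregates many ``ISolrField`` instances, so a
new document type would presumably be described by a list of such
name-and-XPath pairs.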