.. raw:: latex

  \newpage

    
Metacat Indexing
===========================
Lorem ipsum

    
SOLR background information
---------------------------
Features:

* something
* something
* more
* even more

    
Something to explain the advantage of SOLR over the old Metacat index approach

    
Indexed documents and fields
-----------------------------
Metacat reuses the default DataONE index, which includes many common metadata formats
out of the box:

    
1. EML
2. FGDC
3. Dryad

    
Default indexed fields
-----------------------
Describe the existing fields as in the DataONE documentation, with a link to it.

    
Index configuration overview
----------------------------
Describe the configuration files and extension points for the implementation.

    
Adding additional document types and fields
--------------------------------------------
Step-by-step guide for adding new documents and indexed fields.

    
Querying the index
--------------------
Provide example SOLR queries and expected results. Show a variety of return types
and query facets.
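
As a starting point for such examples, here is a minimal sketch of how a query URL could be assembled from standard SOLR request parameters (``q``, ``fl``, ``rows``, ``facet``, ``facet.field``, ``wt``). The class name ``SolrQueryExample``, the base URL, and the field names ``title``, ``id`` and ``formatId`` are assumptions made for illustration; check them against your own deployment and index schema. No expected results are shown because they depend on the content of the index.

.. code-block:: java

    import java.net.URLEncoder;
    import java.nio.charset.StandardCharsets;
    import java.util.LinkedHashMap;
    import java.util.Map;
    import java.util.StringJoiner;

    // Hypothetical helper, not part of Metacat: builds a SOLR query URL.
    class SolrQueryExample {

        // Joins URL-encoded key=value pairs onto the given base URL.
        static String buildQueryUrl(String baseUrl, Map<String, String> params) {
            StringJoiner query = new StringJoiner("&", baseUrl + "?", "");
            for (Map.Entry<String, String> param : params.entrySet()) {
                query.add(encode(param.getKey()) + "=" + encode(param.getValue()));
            }
            return query.toString();
        }

        static String encode(String value) {
            return URLEncoder.encode(value, StandardCharsets.UTF_8);
        }

        public static void main(String[] args) {
            // Assumed field names; LinkedHashMap keeps the parameter order stable.
            Map<String, String> params = new LinkedHashMap<>();
            params.put("q", "title:soil");        // main query
            params.put("fl", "id,title");         // fields to return
            params.put("rows", "10");             // page size
            params.put("facet", "true");          // enable faceting
            params.put("facet.field", "formatId"); // facet on one field
            params.put("wt", "json");             // response format

            // Hypothetical endpoint; substitute your server's query URL.
            System.out.println(buildQueryUrl("https://example.org/query/solr/", params));
        }
    }

Changing ``wt`` switches the return type, and further ``facet.field`` values can be added by extending the helper to accept repeated parameters.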

    
Access Policy enforcement
-------------------------
Explain how access control is processed and honored when utilizing the index.

    
Regenerating the index from scratch
-----------------------------------
When the SOLR index has been drastically modified, a complete regeneration of the
index may be necessary. To accomplish this:

    
Step-by-step instructions

    
NOTE: this may take a long time depending on the size of your Metacat store.

    
Class design overview
----------------------
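
As a reading aid for the ``index.model`` package in this section's class diagram, the following Java skeleton is one possible rendering of it. Only the class names, the ``name``, ``xpath`` and ``fields`` members, and the ``extract``, ``add`` and ``remove`` operations come from the diagram. The constructors, the getters, the XPath evaluation in ``extract`` and the error handling are assumptions, and ``D1IndexField`` is omitted because the diagram gives it no members.

.. code-block:: java

    import java.io.Reader;
    import java.util.LinkedHashSet;
    import java.util.Set;
    import javax.xml.xpath.XPath;
    import javax.xml.xpath.XPathConstants;
    import javax.xml.xpath.XPathExpressionException;
    import javax.xml.xpath.XPathFactory;
    import org.w3c.dom.NodeList;
    import org.xml.sax.InputSource;

    // A named index field that knows how to pull its values out of a document.
    abstract class FieldSpec {
        private final String name;

        protected FieldSpec(String name) {
            this.name = name;
        }

        String getName() {
            return name;
        }

        // Returns zero or more values for this field.
        abstract String[] extract(Reader source);
    }

    // A field whose values are selected with an XPath expression (assumed behavior).
    class XpathIndexField extends FieldSpec {
        private final String xpath;

        XpathIndexField(String name, String xpath) {
            super(name);
            this.xpath = xpath;
        }

        @Override
        String[] extract(Reader source) {
            try {
                XPath evaluator = XPathFactory.newInstance().newXPath();
                NodeList nodes = (NodeList) evaluator.evaluate(
                        xpath, new InputSource(source), XPathConstants.NODESET);
                String[] values = new String[nodes.getLength()];
                for (int i = 0; i < values.length; i++) {
                    values[i] = nodes.item(i).getTextContent();
                }
                return values;
            } catch (XPathExpressionException e) {
                throw new IllegalArgumentException("Cannot evaluate " + xpath, e);
            }
        }
    }

    // The set of fields indexed for one document type.
    class MCIndexDocDef {
        private final Set<FieldSpec> fields = new LinkedHashSet<>();

        void add(FieldSpec field) {
            fields.add(field);
        }

        void remove(FieldSpec field) {
            fields.remove(field);
        }

        Set<FieldSpec> getFields() {
            return fields;
        }
    }

Under these assumptions, a document definition is built by adding one ``FieldSpec`` per indexed field, and each field is then asked to ``extract`` its values from the document's content.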

    
.. figure:: images/indexing-class-diagram.png

   Figure 1. Class design overview.
   
..
  @startuml images/indexing-class-diagram.png

  package index.model {

    abstract class FieldSpec {
      - String name
      + {abstract} String[] extract(Reader s)
    }

    class D1IndexField {
    }

    class XpathIndexField {
      - String xpath
    }

    class MCIndexDocDef {
      - Set<FieldSpec> fields
      + add(FieldSpec)
      + remove(FieldSpec)
    }

  }

  FieldSpec <|-- D1IndexField
  FieldSpec <|-- XpathIndexField

  MCIndexDocDef o-- "*" FieldSpec

  package index {

    interface GenericIndex {
      + insert(String, Map<String, String[]>)
      + String[] query(String)
      + remove(String)
      + update(String, Map<String, String[]>)
    }

    class D1Index {
    }

    class SolrjIndex {
    }

    class Embedded {
    }

    class LuceneIndex {
    }

    class MetacatIndex {
      + remove(String)
      + retrieve(String)
      + update(String, Reader)
    }

    class DocType {
      + boolean isEml()
      + boolean isSysmeta()
      + boolean isSyseml()
      + boolean isSysdryad()
      + boolean isSysfgdc()
    }

  }

  GenericIndex <|-- D1Index
  GenericIndex <|-- SolrjIndex
  SolrjIndex <|-- Embedded
  GenericIndex <|-- LuceneIndex

  @enduml
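
The ``GenericIndex`` contract from the class diagram can be illustrated with a small in-memory stand-in. The four operations and their parameter types come from the diagram. Everything else is an assumption made for this sketch: the parameter names, the ``InMemoryIndex`` class (which is not one of the diagram's implementations), the ``field:value`` query form, and the choice to return matching document identifiers from ``query``.

.. code-block:: java

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.LinkedHashMap;
    import java.util.List;
    import java.util.Map;

    // Operations listed for GenericIndex in the class diagram.
    interface GenericIndex {
        void insert(String id, Map<String, String[]> fields);
        String[] query(String query);
        void remove(String id);
        void update(String id, Map<String, String[]> fields);
    }

    // Hypothetical in-memory implementation, for illustration only.
    class InMemoryIndex implements GenericIndex {
        private final Map<String, Map<String, String[]>> documents = new LinkedHashMap<>();

        @Override
        public void insert(String id, Map<String, String[]> fields) {
            documents.put(id, fields);
        }

        @Override
        public void update(String id, Map<String, String[]> fields) {
            documents.put(id, fields);
        }

        @Override
        public void remove(String id) {
            documents.remove(id);
        }

        // Assumed query form "field:value"; returns ids with an exact value match.
        @Override
        public String[] query(String query) {
            int colon = query.indexOf(':');
            if (colon < 0) {
                throw new IllegalArgumentException("Expected field:value, got " + query);
            }
            String field = query.substring(0, colon);
            String value = query.substring(colon + 1);
            List<String> ids = new ArrayList<>();
            for (Map.Entry<String, Map<String, String[]>> doc : documents.entrySet()) {
                String[] values = doc.getValue().get(field);
                if (values != null && Arrays.asList(values).contains(value)) {
                    ids.add(doc.getKey());
                }
            }
            return ids.toArray(new String[0]);
        }
    }

Keeping callers on the interface is what would let a SOLR-backed or Lucene-backed implementation replace such a stand-in without changes elsewhere.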