Revision 8101

clean-up and flesh-out the metacat-index docs. https://projects.ecoinformatics.org/ecoinfo/issues/5884

View differences:

query-index.rst
5 5

  
6 6
Metacat Indexing
7 7
===========================
8
Lorem ipsum
8
Metacat v2.1 introduces support for building a SOLR index of Metacat content.
9
While we continue to support the "pathquery" search mechanism, this will be phased out 
10
in favor of the more efficient SOLR query interface.
9 11

  
10
SOLR background information
11
---------------------------
12
Features:
13 12

  
14
* something
15
* something
16
* more
17
* even more
13
Metacat deployments that opt to use the Metacat SOLR index will be able to take advantage 
14
of:
18 15

  
19
Something to explain the advantage of solr over the old metacat index approach
16
* fast search performance
17
* built-in paging features
18
* customizable return formats (for advanced admins)
20 19

  
21 20
Indexed documents and fields
22 21
-----------------------------
23
Metacat reuses the default DataONE index which includes many common metadata formats
24
out-of-the-box
22
Metacat integrates the existing DataONE index library which includes many common metadata formats
23
out-of-the-box:
25 24

  
26 25
1. EML
27 26
2. FGDC
28
3. Dryad
27
3. Dryad*
29 28

  
30 29

  
31 30
Default indexed fields
32 31
-----------------------
33
Describe the existing fields like in the DataONE docs, with link to them
32
For a complete listing of the indexed fields, please see the DataONE documentation.
34 33

  
34
http://mule1.dataone.org/ArchitectureDocs-current/design/SearchMetadata.html
35 35

  
36
Index configuration overview
36
Metacat also reports on the currently indexed fields; simply navigate to:
37

  
38
http://mule1.dataone.org/ArchitectureDocs-current/apis/MN_APIs.html#MNQuery.getQueryEngineDescription
39

  
40
with "solr" as the engine.
41

  
42
Index configuration
37 43
----------------------------
38
Describe the configuration files and extension points for the implementation
44
Metacat-index is deployed as a separate web application (metacat-index.war) and should be deployed 
45
as a sibling of the Metacat webapp (knb.war). Deploying metacat-index.war is only required when SOLR support
46
is desired and can safely be omitted if it will not be utilized for any given Metacat deployment.
39 47

  
48
During the initial installation/upgrade, an empty index will be initialized in the configured "solr-home" location.
49
Metacat-index will index all the existing Metacat content when the webapp next initializes.
50
Note: the configured solr-home directory should not exist before configuring Metacat with indexing for the first time, 
51
otherwise the blank index will not be created for metacat-index to utilize.
40 52

  
53
Additional advanced configuration options are available in the metacat.properties file (shared between Metacat and Metacat-index).
54

  
55

  
41 56
Adding additional document types and fields
42 57
--------------------------------------------
43
Step-by-step guide for adding new documents and indexed fields.
58
TBD: Step-by-step guide for adding new documents and indexed fields.
44 59

  
45 60

  
46 61
Querying the index
47 62
--------------------
48
Provide example SOLR queries and expected results. Show a variety of return types
49
and query facets.
63
The SOLR index can be queried using standard SOLR syntax and return options. 
64
The DataONE query interface exposes the SOLR query engine.
50 65

  
66
http://mule1.dataone.org/ArchitectureDocs-current/apis/MN_APIs.html#MNQuery.query
51 67

  
68
Please see the SOLR documentation for examples and exhaustive syntax information.
69

  
70
http://lucene.apache.org/solr/
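
As an illustration of the standard SOLR syntax and the built-in paging mentioned above, the following is a minimal sketch of how a client might assemble a query URL. It relies only on the standard SOLR request parameters ``q``, ``fl``, ``start``, ``rows`` and ``wt``. The endpoint URL and the field names (``id``, ``title``) are assumptions made for the example and should be replaced with the values of an actual deployment.

```python
from urllib.parse import urlencode


def solr_query_url(query_endpoint, q, fields=None, start=0, rows=10, wt="json"):
    """Build a SOLR query URL against a query endpoint.

    query_endpoint is supplied by the caller; this sketch does not assume
    any particular Metacat or DataONE path layout.
    """
    # Standard SOLR parameters: q (query), start/rows (paging), wt (format).
    params = [("q", q), ("start", start), ("rows", rows), ("wt", wt)]
    if fields:
        # fl restricts the fields returned for each matching document.
        params.append(("fl", ",".join(fields)))
    return f"{query_endpoint.rstrip('/')}/?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical endpoint; substitute the query URL of your own deployment.
    endpoint = "https://example.org/query/solr"
    # Third page of ten results, returning only two fields.
    print(solr_query_url(endpoint, "title:soil", fields=["id", "title"], start=20))
```

Paging is done by advancing ``start`` in steps of ``rows``; changing ``wt`` selects a different return format.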
71

  
72

  
52 73
Access Policy enforcement
53 74
-------------------------
54
Explain how access control is processed and honored when utilizing the index.
75
Access control is enforced by the index such that only records that are readable by the 
76
user performing the query are returned to the user. Any SOLR query submitted will be 
77
augmented with access control criteria that reflect whether and how the user is currently 
78
authenticated. Both certificate-based (DataONE API) and JSESSIONID-based (Metacat API) 
79
authentication are simultaneously supported.
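
To make the idea of query augmentation concrete, the sketch below appends an access control filter to a set of SOLR parameters. It is not Metacat's actual implementation: the ``readPermission`` field name, the ``public`` subject and the use of an ``fq`` filter query are assumptions chosen for illustration, and the helper names are hypothetical.

```python
def access_filter(subjects):
    """Return a SOLR clause matching records readable by any given subject.

    Assumes (for illustration only) an indexed field named readPermission
    and that every user implicitly holds the "public" subject.
    """
    all_subjects = ["public"] + [s for s in subjects if s != "public"]
    clauses = []
    for subject in all_subjects:
        # Escape backslashes and double quotes so the subject stays one phrase.
        escaped = subject.replace("\\", "\\\\").replace('"', '\\"')
        clauses.append(f'readPermission:"{escaped}"')
    return "(" + " OR ".join(clauses) + ")"


def augment(params, subjects):
    """Return a copy of params with the access filter added as a filter query."""
    augmented = dict(params)
    augmented["fq"] = list(augmented.get("fq", [])) + [access_filter(subjects)]
    return augmented


if __name__ == "__main__":
    # An unauthenticated user only sees publicly readable records.
    print(augment({"q": "title:soil"}, []))
    # An authenticated user also sees records readable by their own subject.
    print(augment({"q": "title:soil"}, ["CN=Jane Doe,O=Example"]))
```

Because the filter is added on the server side, the user's original query is left unchanged and cannot be used to widen the result set.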
55 80

  
56 81

  
57 82
Regenerating the index from scratch
58 83
-----------------------------------
59
When the SOLR index has been drastically modified, a complete regenration of the 
84
When the SOLR index has been drastically modified, a complete regeneration of the 
60 85
index may be necessary. In order to accomplish this:
61 86

  
62
Step-by-step instructions
87
Step-by-step instructions:
63 88

  
64
NOTE: this may take a long time depending on the size of your Metacat store.
89
1. Entirely remove the solr-home directory
90
2. Step through the Metacat admin interface main properties screen, specifying the solr-home directory you wish to use
91
3. Restart the webapp container (Tomcat).
65 92

  
93
Content can also be submitted for index regeneration by using the Metacat API:
66 94

  
95
1. Login as the Metacat administrator
96
2. Navigate to: <host>/<metacat_context>/metacat?action=reindex[&pid={pid}]
97
3. If the pid parameter is omitted, all objects in Metacat will be submitted for reindexing.
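
The reindex URL described in the steps above can be assembled as in the following minimal sketch. The host, the ``knb`` context and the identifier are placeholder values, and the helper name is hypothetical. The request itself still has to be sent from a session logged in as the Metacat administrator, which this sketch does not handle.

```python
from urllib.parse import urlencode


def reindex_url(host, metacat_context, pid=None):
    """Build <host>/<metacat_context>/metacat?action=reindex[&pid={pid}].

    When pid is omitted, the URL asks for all objects to be reindexed.
    """
    params = {"action": "reindex"}
    if pid is not None:
        # urlencode percent-encodes characters such as ':' and '/' in the pid.
        params["pid"] = pid
    base = f"{host.rstrip('/')}/{metacat_context.strip('/')}/metacat"
    return f"{base}?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder host and context; use the values of your own deployment.
    print(reindex_url("https://example.org", "knb"))
    print(reindex_url("https://example.org", "knb", pid="doi:10.5072/FK2.1"))
```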
67 98

  
99

  
100

  
68 101
Class design overview
69 102
----------------------
70 103

  
......
163 196
    SolrServer <|-- EmbeddedSolrServer
164 197
    SolrServer <|-- HttpSolrServer
165 198
	
166
	package "Stand-alone indexer (webapp or daemon)" {
199
	package "Metacat-index (webapp)" {
167 200
		  
168 201
		class ApplicationController {
169 202
		    - List<SolrIndex> solrIndex
......
180 213

  
181 214
		class SystemMetadataEventListener {
182 215
			- SolrIndex solrIndex
183
			- IMap hzSystemMetadata
184
			- IMap hzObjectPath
185
			+ entryAdded(EntryEvent<Identifier, SystemMetadata>)
186
			+ entryUpdated(EntryEvent<Identifier, SystemMetadata>)
187
			+ entryRemoved(EntryEvent<Identifier, SystemMetadata>)
216
			+ itemAdded(ItemEvent<SystemMetadata>)
217
			+ itemRemoved(ItemEvent<SystemMetadata>)
188 218
		}
189 219
	
190 220
	}
......
197 227
		}
198 228
		
199 229
		class HazelcastService {
230
			- IMap hzIndexQueue
200 231
			- IMap hzSystemMetadata
232
			- IMap hzObjectPath
201 233
		}
202 234
		
203
		class ObjectPathMap {
204
			- IMap hzObjectPath
205
		}
206 235
	}
207 236
	
208 237
	MetacatSolrIndex o--"1" SolrServer
209 238
	HazelcastService .. SystemMetadataEventListener
210
	ObjectPathMap .. SystemMetadataEventListener
211 239
	
212 240
	ApplicationController o--"*" SolrIndex
213 241
	SolrIndex o--"1" SolrServer	
