Metacat performance issue in Sanparks skin
Matt and Mike reported it would like about 4 or 5 minutes to do a search in sanparks skin of production server. We should fixed before 1.8.1 release.
#2 Updated by ben leinfelder about 12 years ago
duane's comments about the indexPath in LTER pointed me toward the search pathquery in the sanparks/saeon skins.
Perhaps the fix is as simple as adding the additional returnfields that are requested (from FGDC documents).
We had previously only added the FGDC field "placekey" because it was used as part of the queryterm/pathexpr element.
The additional returnfields are:
#3 Updated by ben leinfelder about 12 years ago
adding those extra returnfields to the index paths did not seem to make a difference with searches on dev.
The sanparks skin ends up generating SQL that uses an "intersect" so as to handle organization filters for both EML and FGDC document types- i believe this is where the performance hit is being introduced.
It might be nice to include a search option so that you could search for only one document type at a time. If you wanted all types it would just take longer...
*dev.nceas.ucsb.edu is giving about 85 seconds for a 'kruger' search vs. the default skin's 23 seconds for the same search term.
#4 Updated by ben leinfelder about 12 years ago
oh! one more thing:
looks like sanparks is always searching for the search term in any nodedata (not limited to title, abstract, etc....)
that means it does not use the xml_patch_index table to find some of the valid docids....
It is usually an option in the other skins' search interface to specify that searches only look in certain [predefined] fields. Perhaps we should add this to the sanparks and saeon skins
#5 Updated by Jing Tao about 12 years ago
Here is the selection query:
SELECT docid,docname,doctype,date_created, date_updated, rev FROM xml_documents WHERE docid IN (((SELECT DISTINCT docid FROM xml_path_index WHERE (UPPER LIKE '%PLANT%' AND path IN ('abstract/para','surName','givenName','organizationName','title','keyword','para','geographicDescription','literalLayout','@packageId','abstract','idinfo/citation/citeinfo/title','idinfo/citation/citeinfo/origin','idinfo/keywords/theme/themekey')) UNION (SELECT DISTINCT docid FROM xml_nodes WHERE UPPER LIKE '%PLANT%' AND parentnodeid IN (SELECT nodeid FROM xml_index WHERE path LIKE 'idinfo/keywords/theme/placekey') ) ) INTERSECT (SELECT DISTINCT docid FROM xml_path_index WHERE (UPPER LIKE '%SANPARKS, SOUTH AFRICA%' AND path IN ('placekey','keyword')) OR (UPPER LIKE '%SAEON, SOUTH AFRICA%' AND path IN ('placekey','keyword'))))) AND (docid IN (SELECT docid from xml_access WHERE = 'public') AND perm_type = 'allow' AND permission > 3)) AND docid NOT IN (SELECT docid from xml_access WHERE = 'public') AND perm_type = 'deny' AND perm_order ='allowFirst' AND permission > 3) ))