Bug #2159
open
Metacat Performance: Divide xml_nodes based on the doctype
0%
Description
Matt suggested that the xml_nodes table could probably be divided into multiple
tables based on the doctype. For example, xml_nodes could be divided into the
following:
1. xml_nodes_eml_2_0_0
2. xml_nodes_eml_2_0_1
3. xml_nodes_default
Any query searching for a given text would then be split into 3 sub-queries,
the results of which can be unioned. While we would still go through the same
number of records (assuming the query is for all doctypes), this might improve
performance in a db like Oracle on a multi-processor machine.
From experience, Postgres will still run the query on one processor, unless we
run a separate query on each of the tables and union the results in the servlet.
It is not clear, though, how much performance would be gained by this.
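To make the two options concrete, here is a minimal Java sketch of both: one UNION statement that the database combines itself, and one sub-query per table run concurrently and merged in application code (as the servlet would). The three table names come from the list above. Everything else is an assumption made for illustration: the class name PartitionedNodeSearch, the docid and nodedata column names, and the runQuery callback that stands in for the real JDBC call. It is not Metacat's actual code.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

// Hypothetical helper; the name and structure are illustrative only.
final class PartitionedNodeSearch {

    // Table names taken from the proposal in this ticket.
    static final List<String> TABLES = List.of(
            "xml_nodes_eml_2_0_0", "xml_nodes_eml_2_0_1", "xml_nodes_default");

    // Per-table sub-query. The column names docid and nodedata are assumptions.
    static String subQuery(String table) {
        return "SELECT docid FROM " + table + " WHERE nodedata LIKE ?";
    }

    // Option 1: one statement; the database unions the sub-queries.
    static String unionQuery(List<String> tables) {
        List<String> parts = new ArrayList<>();
        for (String table : tables) {
            parts.add(subQuery(table));
        }
        return String.join(" UNION ", parts);
    }

    // Option 2: one sub-query per table, run concurrently, merged in the
    // application. runQuery is a stand-in for the real database call.
    static Set<String> searchInParallel(List<String> tables,
                                        Function<String, List<String>> runQuery) {
        ExecutorService pool = Executors.newFixedThreadPool(tables.size());
        try {
            List<CompletableFuture<List<String>>> pending = new ArrayList<>();
            for (String table : tables) {
                String sql = subQuery(table);
                pending.add(CompletableFuture.supplyAsync(() -> runQuery.apply(sql), pool));
            }
            // A set removes duplicates across tables, like SQL UNION.
            Set<String> merged = new LinkedHashSet<>();
            for (CompletableFuture<List<String>> result : pending) {
                merged.addAll(result.join());
            }
            return merged;
        } finally {
            pool.shutdown();
        }
    }
}
```

With option 2 each sub-query needs its own database connection to actually run in parallel, so any gain depends on the connection pool size and on how many processors the database server has free.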
Updated by Saurabh Garg over 18 years ago
This approach might be worth considering, but I think it will give only short-term benefits, if any. This is because most of the searches we receive request both eml-2.0.0 and eml-2.0.1 as return doctypes, and these are the major doctypes that we hold right now. Hence I don't think this approach will be helpful.