OmniTI performed a software evaluation of Evergreen this past fall on behalf of the Massachusetts Library Network Cooperative (MassLNC). The evaluation covered the Evergreen database, staff client, and messaging framework. The study was funded mostly with a Library Services and Technology Act grant authorized through the Massachusetts Board of Library Commissioners, along with some funds from MassLNC.
OmniTI released its report in December 2013: masslnc_evergreen_-_summary_of_recommendations.pdf
Following the release of the report, MassLNC asked OmniTI to follow up on two additional questions, which are outlined below along with OmniTI's responses.
1. In the RFP, MassLNC asked whether there were ways to restore settings stored in the database that improve the relevancy of results retrieved through the OPAC, without a serious cost to performance (see https://bugs.launchpad.net/evergreen/+bug/844374). OmniTI shared the following thoughts with us:
The first observation came from looking at the bib search query, which passes the query shown below as the third argument to the search.query_parser_fts(int, int, text) function. What stood out in the plan were the sequential scans performed on both metabib.metarecord_source_map and metabib.record_attr.
The EXPLAIN output for the query passed into the function was:
evergreen=# explain SELECT m.source AS id,
    ARRAY[ m.source ] AS records,
    1.0 / ( AVG( (1) ) + 1 )::NUMERIC AS rel,
    1.0 / ( AVG( (1) ) + 1 )::NUMERIC AS rank,
    FIRST( mrd.attrs -> 'date1' ) AS tie_break
  FROM metabib.metarecord_source_map m
  INNER JOIN metabib.record_attr mrd ON ( m.source = mrd.id )
  WHERE 1 = 1 AND ( TRUE )
  GROUP BY 1
  ORDER BY 4 ASC NULLS LAST, 5 DESC NULLS LAST, 3 DESC
  LIMIT 10000;
┌────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                               QUERY PLAN                                               │
├────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Limit  (cost=1159860.83..1159885.83 rows=10000 width=533)                                              │
│   ->  Sort  (cost=1159860.83..1165766.53 rows=2362279 width=533)                                       │
│         Sort Key: ((1.0 / (avg(1) + 1::numeric))), (first((mrd.attrs -> 'date1'::text)))               │
│         ->  HashAggregate  (cost=932046.04..991103.01 rows=2362279 width=533)                          │
│               ->  Hash Join  (cost=68199.58..317853.50 rows=2362279 width=533)                         │
│                     Hash Cond: (mrd.id = m.source)                                                     │
│                     ->  Seq Scan on record_attr mrd  (cost=0.00..193549.79 rows=2362279 width=533)     │
│                     ->  Hash  (cost=38670.37..38670.37 rows=2362337 width=8)                           │
│                           ->  Seq Scan on metarecord_source_map m  (cost=0.00..38670.37 rows=2362337 width=8) │
└────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(9 rows)
Sequential scans on small tables (fully resident in memory) aren't really a red flag. Those two tables aren't exactly small, however, which may be an indication that the way they're being joined has some room for improvement. That could come either through more appropriate indexing or through something along the lines of a rollup table, where the necessary join of the two tables is pre-computed and stored in a separate physical table.
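One way to sketch the rollup-table idea, assuming PostgreSQL 9.3 or later, is a materialized view that pre-computes the join. The view and index names below are illustrative only and are not part of the stock Evergreen schema:

```sql
-- Hypothetical rollup pre-computing the join of the two tables;
-- object names here are examples, not Evergreen's own.
CREATE MATERIALIZED VIEW metabib.record_attr_source_rollup AS
SELECT m.metarecord,
       m.source,
       mrd.attrs
  FROM metabib.metarecord_source_map m
  INNER JOIN metabib.record_attr mrd ON ( m.source = mrd.id );

-- Index the lookup column so queries can avoid a sequential scan.
CREATE INDEX record_attr_source_rollup_source_idx
    ON metabib.record_attr_source_rollup (source);

-- The rollup only reflects the base tables as of the last refresh.
REFRESH MATERIALIZED VIEW metabib.record_attr_source_rollup;
```

The trade-off is staleness: the pre-computed join must be refreshed on a schedule (or maintained by triggers) as the underlying tables change, so this is a sketch of the approach rather than a drop-in change.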
2. The second question related to a long-running query found in the logs of one of our production databases that wasn't identified as a problem query in the original report. The production system was running Evergreen 2.3 at the time, and the query in question was:
SELECT "ahr".id AS "id",
       "ahr".cut_in_line AS "cut_in_line",
       "ahr".request_time AS "request_time"
  FROM action.hold_request AS "ahr"
  INNER JOIN action.hold_copy_map AS "ahcm" ON ( "ahcm".hold = "ahr".id )
  INNER JOIN action.hold_copy_map AS "ahcm2" ON ( "ahcm2".target_copy = "ahcm".target_copy )
  WHERE ( "ahcm2".hold = 0 )
  GROUP BY 0, 0, 0
  ORDER BY COALESCE( "ahr".cut_in_line, '' ) DESC, "ahr".request_time;
OmniTI's recommendation:
SELECT "ahr".id AS "id",
       "ahr".cut_in_line AS "cut_in_line",
       "ahr".request_time AS "request_time"
  FROM action.hold_request AS "ahr"
  INNER JOIN action.hold_copy_map AS "ahcm" ON ( "ahcm".hold = "ahr".id )
  WHERE ( "ahcm".hold = 2255161 )
  GROUP BY 1, 2, 3
  ORDER BY "ahr".cut_in_line DESC NULLS LAST, "ahr".request_time;
Changes:

- The self-join against action.hold_copy_map ("ahcm2") is removed; the hold filter is applied directly to "ahcm".hold. (The literal 2255161 is an example hold id standing in for the 0 placeholder in the logged query.)
- GROUP BY 0, 0, 0 is replaced with GROUP BY 1, 2, 3, so the grouping references the three selected columns.
- COALESCE( "ahr".cut_in_line, '' ) in the ORDER BY is replaced with "ahr".cut_in_line DESC NULLS LAST, which handles NULLs in the sort without coercing the column to an empty string.
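One way to check that the rewrite actually helps on a given system is to compare plans and timings with EXPLAIN ANALYZE (the hold id below is an arbitrary example, as above):

```sql
-- Run against original and rewritten forms and compare plan shape and
-- "actual time"; 2255161 is an arbitrary example hold id.
EXPLAIN ANALYZE
SELECT "ahr".id, "ahr".cut_in_line, "ahr".request_time
  FROM action.hold_request AS "ahr"
  INNER JOIN action.hold_copy_map AS "ahcm" ON ( "ahcm".hold = "ahr".id )
  WHERE "ahcm".hold = 2255161
  GROUP BY 1, 2, 3
  ORDER BY "ahr".cut_in_line DESC NULLS LAST, "ahr".request_time;
```

If the plan still shows a sequential scan on action.hold_copy_map, an index on its hold column would let the planner use an index scan; check whether the stock Evergreen schema already defines one before creating it.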