====== Optimizing the performance of your Evergreen server ======
===== PostgreSQL Database Configuration =====
NOTE: Originally compiled at the Developer Hack-a-way, November 2015
Operating System Considerations
* avoid the Linux 3.2 kernel (http://www.databasesoup.com/2014/09/why-you-need-to-avoid-linux-kernel-32.html)
* usually "deadline" I/O scheduler is better than CFQ
* mount filesystems used for Pg with the 'noatime' option
* disable NUMA zone-reclaim (set vm.zone_reclaim_mode to 0)
* Kernel settings per http://www.postgresql.org/message-id/50E4AAB1.9040902@optionshouse.com (be careful with sched_migration_cost: setting it too high can crash the kernel on boot)
* consider segregating I/O for WAL and WAL archiving on different drives/RAID controllers than those used for the main data directory
* Disable transparent huge pages ([[http://georgialibraries.markmail.org/thread/em237qdardxj443i|see email thread]])
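To see where a server currently stands on the I/O scheduler, zone-reclaim, and transparent huge page items above, the relevant kernel files can be read directly. The following is a minimal Python sketch, not an Evergreen or PostgreSQL tool. The function names are made up for illustration, and it assumes the usual Linux locations (/sys/block/<device>/queue/scheduler, /proc/sys/vm/zone_reclaim_mode, /sys/kernel/mm/transparent_hugepage/enabled), which may be missing on some kernels or inside containers.

```python
import glob
import re
from pathlib import Path


def active_choice(text):
    """Return the bracketed option from a line such as 'noop deadline [cfq]'.

    The kernel marks the active I/O scheduler and THP mode with brackets.
    Returns None if nothing is bracketed.
    """
    match = re.search(r"\[([^\]]+)\]", text)
    return match.group(1) if match else None


def read_if_present(path):
    """Read a small kernel settings file, or return None if it is missing."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def report_os_settings():
    """Collect current values for the settings listed above (illustrative)."""
    report = {}
    for sched_file in glob.glob("/sys/block/*/queue/scheduler"):
        device = sched_file.split("/")[3]  # e.g. 'sda'
        text = read_if_present(sched_file)
        report["scheduler:" + device] = active_choice(text) if text else None
    # A value of "0" means zone reclaim is disabled.
    report["zone_reclaim_mode"] = read_if_present("/proc/sys/vm/zone_reclaim_mode")
    thp = read_if_present("/sys/kernel/mm/transparent_hugepage/enabled")
    report["transparent_hugepage"] = active_choice(thp) if thp else None
    return report
```

Calling report_os_settings() on the database server returns a dictionary. A zone_reclaim_mode of "0" and a transparent_hugepage value of "never" correspond to the recommendations in the list. The script only reads settings and does not change anything.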
Hardware Considerations:
* RAID Cache + Battery + Writeback Cache (when battery is present / enabled)
* separate RAID volumes for the data directory
* separate RAID volume for the transaction log (WAL; the pg_xlog directory, renamed pg_wal in PostgreSQL 10)
* BIOS: hyperthreading disabled (bshum)
* Ideally enough RAM to cache entire database (or at least metabib tables and indexes) -- RAM is faster than SSD
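One rough way to check the last point is to compare the size PostgreSQL reports for the database with the RAM installed in the server. The sketch below is illustrative only and its function names are hypothetical. It assumes the database size in bytes has already been obtained, for example from `SELECT pg_database_size(current_database());` in psql, and that the memory figure comes from the contents of /proc/meminfo. The 25% headroom for the operating system and other processes is an arbitrary assumption, not a figure from this page.

```python
import re


def mem_total_bytes(meminfo_text):
    """Extract MemTotal from the contents of /proc/meminfo, in bytes."""
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemTotal line not found")
    return int(match.group(1)) * 1024


def fits_in_ram(data_bytes, ram_bytes, headroom=0.25):
    """True if data_bytes fits in RAM after reserving a fraction as headroom."""
    return data_bytes <= ram_bytes * (1.0 - headroom)
```

To check only the metabib tables and indexes, the same comparison can be made with the sum of `pg_total_relation_size()` over those tables in place of the whole database size.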
Application-level
* Use a recent version of MARC::File::XML that uses DOM rather than SAX
* Avoid running anything other than Pg on the database server
* Apply the fix for https://bugs.launchpad.net/evergreen/+bug/1438136
* Check page size for searching
* Tune max children in opensrf.xml, for the cstore and pcrud services in particular
* Separate tasks onto different physical (or virtual) machines
PostgreSQL
* https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
* autovacuum aggressiveness
* default_statistics_target
* random_page_cost (lower particularly if you have SSDs)
* log_min_duration_statement (to identify long-running queries)
* GIN indexes, especially for metabib tables
* Look at actual index usage (e.g., http://www.databasesoup.com/2014/05/new-finding-unused-indexes-query.html)
* use of a separate, replicated database for reporting purposes
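Once log_min_duration_statement is set, the server log records each slow statement with its duration. pgbadger (listed under Tools) is the thorough way to analyze these logs. For a quick look, something like the following Python sketch may be enough. It assumes log messages of the form `duration: 1234.567 ms  statement: SELECT ...` on a single line, ignores whatever log_line_prefix puts in front, and captures only the first line of a multi-line statement. The function name is illustrative.

```python
import re

# Matches e.g. "LOG:  duration: 1523.456 ms  statement: SELECT ..."
# and the extended-protocol form "duration: ... ms  execute <unnamed>: ...".
DURATION_RE = re.compile(
    r"duration: (?P<ms>\d+(?:\.\d+)?) ms\s+"
    r"(?:statement|execute [^:]*): (?P<sql>.*)"
)


def slowest_statements(log_lines, limit=10):
    """Return up to `limit` (milliseconds, statement) pairs, slowest first."""
    found = []
    for line in log_lines:
        match = DURATION_RE.search(line)
        if match:
            found.append((float(match.group("ms")), match.group("sql").strip()))
    found.sort(key=lambda pair: pair[0], reverse=True)
    return found[:limit]
```

The function accepts any iterable of lines, so an open log file can be passed to it directly. The location of the log file depends on the installation.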
Tools
* pg_prewarm - http://www.postgresql.org/docs/9.4/static/pgprewarm.html
* pgbadger - log analyzing and statistics: http://github.com/dalibo/pgbadger/
* EXPLAIN ANALYZE
===== Apache server optimization (written in 2009) =====
There are a number of steps you can follow to optimize your Apache server.
- Enable mod_deflate to send compressible content over the network in gzipped format:
a2enmod deflate # on Debian or Ubuntu
- Edit /etc/apache2/mods-enabled/deflate.conf so that CSS and JavaScript are the file types that will be compressed. Remove text/html and text/xml from the list, because XMLENT currently conflicts with DEFLATE:
AddOutputFilterByType DEFLATE text/css application/javascript
- Cache content that does not change often using the mod_expires Apache module. You can set some simple caching rules in /etc/apache2/sites-enabled/eg.conf:
ExpiresActive On
# Set default expiry time to one month - we could probably make this 6 months
ExpiresDefault "access plus 1 month"
ExpiresByType text/html "access plus 25 hours"
ExpiresByType application/xhtml+xml "access plus 25 hours"
- **Requires Evergreen 1.6 or higher**: Parallelize the requests to your server using multiple hosts. Most browsers will send a maximum of 2 to 4 concurrent requests to the same hostname, and as Evergreen pages pull in a lot of CSS, JavaScript, and image files, this can result in some unpleasant loading delays. A simple workaround is to set up CNAME entries to point multiple hostnames at the same IP address, but be warned that unless you have a valid wildcard SSL certificate, your users may have an unpleasant experience if they try to log in to their account.
- Set up hostnames to serve your JavaScript, CSS, and image files separately from your core content. If "library.example.org" is your primary hostname, you could set up "js.example.org", "css.example.org", and "images.example.org" as hostnames that point to the same IP address as "library.example.org". Then, edit /etc/apache2/eg_vhost.conf to set the environment variables for the various included files as follows:
... other stuff, snipped
SetEnvIf Request_URI ".*" OILS_OPAC_BASE=/opac/
# This gives you the option to configure a different host to serve OPAC images from
# Specify the hostname (without protocol) and path to the images. Protocol will
# be determined at runtime
SetEnvIf Request_URI ".*" OILS_OPAC_IMAGES_HOST=images.example.org/opac/
SetEnvIf Request_URI ".*" OILS_OPAC_CSS_HOST=css.example.org/opac/
SetEnvIf Request_URI ".*" OILS_OPAC_JS_HOST=js.example.org/opac/
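
After making the changes above, it is worth confirming that Apache is actually sending the expected headers. The following is a minimal sketch using only the Python standard library. It requests a URL while advertising gzip support and reports the Content-Encoding header (set by mod_deflate on compressed responses) and the Expires and Cache-Control headers (set by mod_expires). The function names are hypothetical and not part of Evergreen.

```python
import urllib.request


def summarize_headers(headers):
    """Pick out the compression and caching headers from a header mapping.

    Header names are compared case-insensitively.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "compressed": "gzip" in lowered.get("content-encoding", "").lower(),
        "expires": lowered.get("expires"),
        "cache_control": lowered.get("cache-control"),
    }


def check_url(url, timeout=10):
    """Fetch url, advertising gzip support, and summarize the response headers."""
    request = urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return summarize_headers(response.headers)
```

Calling check_url() with the URL of one of your CSS or JavaScript files should report "compressed" as True if mod_deflate is active for that type, and non-empty "expires" and "cache_control" values if the mod_expires rules are being applied. Running it against each of the separate hostnames also confirms that they resolve and serve the same content.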