Table of Contents
Automated SQL Report Scripting (Draft)
Requirements
- Setup or make use of a database user that can run SELECT queries against your database server.
- Use a converter script to change .csv to .xls file formatting.
- Install and configure Mutt for a text-based e-mail client to automatically send the finished reports to the library.
Overview of Process
- Run SQL query against database to retrieve data and produce output in CSV format.
- Add title of report and edit report with additional elements.
- Use converter script to change from CSV to XLS format.
- Generate e-mail text to include, and then attach XLS to email to library.
Sample scripts
The following sample scripts are used to provide daily e-mailed reports for the previous day's circulation for a given library.
Daily Sample Script
Control script
- The directory where we store scripts is in /home/<user>/sql where <user> will be the name of the user you setup to run these scripts with.
- Change other related anonymized places within this script with the appropriate values for your organization. Someday these should be updated to use more variables to allow for further customization or simpler setup instead of hardcoded values.
- <db_user> = user that can access postgres database where Evergreen is running
- <db_hostname> = hostname for Evergreen database server, usually localhost if single-server
- <db_name> = usually "evergreen" if using default configuration options
- The RECIPIENT variable defines the e-mail address you will be sending to
#!/bin/sh # TEST_daily_stats.sh # # Script for generating daily circ stats for TEST Library # Define our variables BASEDIR=/home/<user>/sql DATE=$(date --date='1 day ago' '+%F') FOLDER=TEST ORG=TEST RECIPIENT="test@library.org" # Run the SQL report to get csv output psql -U <db_user> -h <db_hostname> -f $BASEDIR/$FOLDER/$ORG'_daily_stats'.postgresql <db_name> # Insert Title at beginning and Delete the last line from the csv (number of rows) sed -e '1 i Daily Circ for '$DATE'|' -e '1 i \ |' -e '$d' $BASEDIR/$FOLDER/$ORG'_daily_stats2'.csv > $BASEDIR/$FOLDER/$ORG'_daily_stats'.csv rm $BASEDIR/$FOLDER/$ORG'_daily_stats2'.csv # Convert csv to xls $BASEDIR/csv2xls.pl -s '|' -v 0 -o $BASEDIR'/'$FOLDER'/'$ORG'_daily_circ_'$DATE $BASEDIR/$FOLDER/$ORG'_daily_stats'.csv # Create email text for the day $BASEDIR/$FOLDER/$ORG'_daily_text'.sh > $BASEDIR/$FOLDER/$ORG'_email_text_'$DATE.txt # E-mail out the report mutt -s "Daily Circ for $DATE" -a $BASEDIR/$FOLDER/$ORG'_daily_circ_'$DATE.xls -- $RECIPIENT < $BASEDIR/$FOLDER/$ORG'_email_text_'$DATE.txt # Remove leftovers rm $BASEDIR/$FOLDER/$ORG'_daily_stats'.csv rm $BASEDIR/$FOLDER/$ORG'_daily_circ_'$DATE.xls rm $BASEDIR/$FOLDER/$ORG'_email_text_'$DATE.txt
SQL Query
This file is called TEST_daily_circ.postgresql. It creates an output file used by the control script above. The path for the output is hardcoded into the first line of the SQL with the \o. The \f | indicates that | will be used as the delimiter between columns.
\o /home/<user>/SQL/TEST/TEST_daily_stats2.csv \a \f | SELECT acl.name AS "Shelving Location", COUNT(acirc.id) AS "# of Circs" FROM action.circulation acirc JOIN asset.copy acopy ON acopy.id = acirc.target_copy JOIN asset.copy_location acl ON acopy.location = acl.id WHERE acirc.circ_lib = '11' AND date_trunc('DAY'::text, acirc.create_time) = date_trunc('DAY'::text, now() - '1 day'::INTERVAL) GROUP BY acl.name UNION SELECT 'ZZZ Total' AS "Shelving Location", COUNT(acirc.id) AS "# of Circs" FROM action.circulation acirc JOIN asset.copy acopy ON acopy.id = acirc.target_copy JOIN asset.copy_location acl ON acopy.location = acl.id WHERE acirc.circ_lib = '11' AND date_trunc('DAY'::text, acirc.create_time) = date_trunc('DAY'::text, now() - '1 day'::INTERVAL) ORDER BY 1; \o
E-mail Text Script
#!/bin/bash # TEST_daily_text.sh # # Script for generating daily email body text for TEST Library DATE=$(date --date='1 day ago' '+%A, %B %d, %Y') echo "Hi, Attached is your daily circ report for "$DATE". Please let us know if you have any questions. Thank you, Your Helpdesk"
Monthly Sample Script
Using a similar method as described above for the daily control scripts but with some enhancements. First, there is now a control loop.sh script that runs through variables from a file to generate reports content. Queries are now stored in a folder called "queries" and use postgrsql variables passed from other scripts.
Running the LOOP Script
To run the action, we now have a single script called loop.sh which runs with a variable file. The example in this case could be libs.txt. The loop.sh script takes the variables out of libs.txt and passes them to the monthly_stats.sh script to be used there.
Run with:
./loop.sh libs.txt
libs.txt
SHORTNAME|11|test@email.com SHORTNAME2|13|test2@email.com
loop.sh
#!/bin/sh # Looping SQL Running echo "Starting Loop Cycle..." INPUT=$1 OLDIFS=$IFS IFS=\| [ ! -f $INPUT ] && { echo "$INPUT file not found"; exit 99; } while read shortname ou email do echo "\nLib : $shortname" echo "OU : $ou" echo "E-mail : $email\n" ./monthly_stats.sh $shortname $ou $email done < $INPUT IFS=$OLDIFS echo "\n... and we're done!"
monthly_stats.sh
#!/bin/sh # Generic monthly_stats.sh # # Script for generating monthly reports for an EG Library # # The whole thing # Define variables DATE=$(date --date='1 month ago' '+%Y-%m') EMAILDATE=$(date '+%F') QUERIES=/home/<user>/sql/queries BASEDIR=/home/<user>/sql/monthly FOLDER=TEST ORG=$1 OU=$2 RECIPIENT="$3" # Run the SQL reports to get csv outputs psql -U <db_user> -h <db_hostname> -f $BASEDIR/$FOLDER/$ORG'_daily_stats'.postgresql <db_name> psql -U <db_user> -h <db_hostname> -f $QUERIES/'patrons_without_residency'.sql <db_name> -o $BASEDIR/$FOLDER/$ORG'_'$DATE'_patrons_without_residency2'.csv -v circ_lib=$OU # Add title of report to csv sed -e '1 i Patrons Without Residency Stat Cat Entries as of '$DATE'|' -e '1 i \ |' -e '$d' $BASEDIR/$FOLDER/$ORG'_'$DATE'_patrons_without_residency2'.csv > $BASEDIR/$FOLDER/$ORG'_'$DATE'_patrons_without_residency'.csv rm $BASEDIR/$FOLDER/$ORG'_'$DATE'_patrons_without_residency2'.csv # Convert csv to xls /home/<user>/sql/csv2xls.pl -s '|' -v 0 -o $BASEDIR'/'$FOLDER'/'$ORG'_'$DATE'_patrons_without_residency' $BASEDIR/$FOLDER/$ORG'_'$DATE'_patrons_without_residency'.csv # Generate email text $BASEDIR/residency_monthly_text.sh > $BASEDIR/$FOLDER/$ORG'_'$DATE'_email_text'.txt # E-mail out the report mutt -s "$ORG - Patrons Missing Residency Code as of $EMAILDATE" -a $BASEDIR/$FOLDER/*.xls -- $RECIPIENT < $BASEDIR/$FOLDER/$ORG'_'$DATE'_email_text'.txt # Remove leftovers rm $BASEDIR/$FOLDER/$ORG'_'$DATE'_patrons_without_residency'.csv rm $BASEDIR/$FOLDER/$ORG'_'$DATE'_patrons_without_residency'.xls rm $BASEDIR/$FOLDER/$ORG'_'$DATE'_email_text'.txt
queries/patrons_without_residency.sql
-- patrons_without_residency.sql -- This report lists patron information for those who do not have an entry for stat_cat "1" (Residency) \a \f | SELECT acard.barcode AS "Barcode", au.family_name AS "Last Name", au.first_given_name AS "First Name", aou.name AS "Library Name" FROM actor.usr au INNER JOIN actor.org_unit aou ON au.home_ou = aou.id INNER JOIN actor.card acard ON acard.id = au.card INNER JOIN permission.grp_tree pgt ON pgt.id = au.profile LEFT OUTER JOIN (SELECT * FROM actor.stat_cat_entry_usr_map WHERE stat_cat = 1) AS asceum ON asceum.target_usr = au.id WHERE au.home_ou = :circ_lib AND asceum.id IS NULL AND au.deleted IS FALSE AND pgt.parent = 2 ORDER BY 4, 2, 1;
Some Future Ideas
- Have less hardcoded values in the script; customize variables that can be set or passed to the shell script during run.
- Add some additional options to the SQL query run:
- Use \t to output tuples instead of ugly hack with sed removing the row count.
- This didn't work because we lost the first row showing the column titles, so retaining sed hack for now.
- Make some components of the query into variables that can be passed via script.
- Perhaps consider moving the query as a saved view in the database.