==== Bibliographic queries in Evergreen ====
Evergreen, in trunk as of March 2010 and in all releases after the 1.6 series, uses a configurable query parser for bibliographic searches. This parser is considerably more flexible than the old one, which hard-coded the query syntax.
How about ... some fun with examples!
=== Examples ===
* Search for records containing "harry potter": harry potter
* Too easy? Limit those to records published in 2000 or later: harry potter after(2000)
* We only care about the actual series: harry potter after(2000) author:rowling
* And that are at our library: harry potter after(2000) author:rowling site(ARL-ATH)
* And, sort those by pub date: harry potter after(2000) author:rowling site(ARL-ATH) sort(pubdate)
* Descending (newest first): harry potter after(2000) author:rowling site(ARL-ATH) sort(pubdate)#descending
* How about some nested boolean action: ("harry potter" && (stone || chamber)) && (author:rowling || subject:rowling) item_form(d) subject[Magic in literature]
(That last part is a facet.)
* Phrase searches can be left-anchored: identifier|bibcn:"^123 ABC"
or right-anchored: bibcn:"2004$"
* Phrase searches can also consider punctuation literally: "C++"
* Speaking of facets, imagine a locally defined index definition in the keyword class called mat_type that indexes, say, 945$m, which is a local field holding bib-level "material type" strings. Imagine further that this field is marked as a facet field, but not a search field. You could:
* Browse all of your DVD mat_types sorted by author: keyword|mat_type[DVD] sort(author)
* Find all available audio cassettes in your South West branch: #available keyword|mat_type[AudioCassette] site(SW) sort(title)
* Get the bibs for all of your VHS and BetaMax items published during the 80s in one browse list ordered by their approximate purchase date: keyword|mat_type[VHS # BetaMax] between(1980,1989) sort(create_date)
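The examples above share one shape: search terms first, then filters written as name(values), then modifiers introduced by the # marker. The following is a minimal Python sketch of assembling such a string. The build_query helper and its parameters are hypothetical names used for illustration only; they are not part of Evergreen.

```python
def build_query(terms, filters=None, modifiers=None):
    """Assemble a query string: terms, then name(values) filters, then #modifiers.

    Hypothetical helper for illustration; it is not part of Evergreen.
    """
    parts = [terms]
    for name, values in (filters or {}).items():
        # Filter arguments form a comma-separated word list, e.g. between(1980,1989).
        parts.append(f"{name}({','.join(str(v) for v in values)})")
    for modifier in modifiers or []:
        # '#' is the default modifier marker.
        parts.append(f"#{modifier}")
    return " ".join(parts)


query = build_query(
    "harry potter author:rowling",
    filters={"after": [2000], "site": ["ARL-ATH"], "sort": ["pubdate"]},
    modifiers=["descending"],
)
print(query)
```

The sketch does no escaping or validation; it only shows how the pieces sit side by side in one query string.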
=== Grammar ===
First, we'll start with a pseudo-grammar for the new query parser with some inline notes:
regexp := valid PCRE
word := valid UTF-8 non-whitespace characters
whitespace := string matching PCRE /\s+/s
boolean_word := 'yes' | 'no' | 'true' | 'false' | '1' | '0'
modifier_marker := '#' ### configurable, default
phrase_boundary := '"'
phrase_left_anchor := '^'
phrase_right_anchor := '$'
1_word_phrase_marker := '+'
negator := '-'
search_separator := ':' | '='
class_field_separator := '|'
boolean_and := '&&' ### configurable, EG default
boolean_or := '||' ### configurable, EG default
subquery_start := '(' ### configurable, default
subquery_end := ')' ### configurable, default
word_list := word { ',' word }
negated_word := negator word
required_word := 1_word_phrase_marker word ### one-word phrase shortcut
phrase := phrase_boundary { phrase_left_anchor } word { whitespace word } { phrase_right_anchor } phrase_boundary
term := word | negated_word | required_word | phrase { whitespace term }
boolean_operator := boolean_and | boolean_or
registered_class := 'keyword' | 'title' | 'author' | 'subject' | 'series' ### configurable, default for EG
class_alias := regexp ### 'kw', 'ti', 'au', 'su', 'se' and many more, configurable, loaded from IDL class cmsa where field is null
search_class := registered_class | class_alias
registered_field := word ### configurable, loaded from IDL class cmf where search_field is true
field_alias := regexp ### configurable, loaded from IDL class cmsa where field is not null
registered_facet := word ### configurable, loaded from IDL class cmf where facet_field is true
classed_search := search_class search_separator term
search_field_list := registered_field { class_field_separator search_field_list }
fielded_search := search_class class_field_separator { search_field_list } search_separator term
field_alias_search := field_alias search_separator term
facet_list := registered_facet { class_field_separator facet_list }
facet_value_list := term { ' # ' term }
facet_search := search_class [ class_field_separator { facet_list } ] '[' facet_value_list ']'
search := term | classed_search | fielded_search | field_alias_search | facet_search
registered_modifier := 'available' | 'staff' | 'descending' ### and many more, defined in QueryParser implementation driver class_alias
search_modifier := modifier_marker registered_modifier | registered_modifier '(' boolean_word ')'
registered_filter := 'site' | 'sort' | 'item_type' ### and many more, defined in QueryParser implementation driver class_alias
search_filter := registered_filter '(' word_list ')' | registered_filter ':' word_list
boolean_term := term boolean_operator term
subquery := subquery_start query subquery_end
query := term | boolean_term | search | search_modifier | search_filter | subquery { [boolean_operator] query }
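To make a few of these productions concrete, here is a rough Python sketch that classifies single whitespace-delimited tokens as a search_modifier, search_filter, classed_search or plain term, using the default markers. It is not Evergreen's parser: the classify function, the regular expressions and the two small registries are assumptions made for illustration, and phrases, fielded searches, facets, booleans and subqueries are deliberately left out.

```python
import re

# Small illustrative registries; the real lists are longer and configurable.
REGISTERED_FILTERS = {"site", "sort", "item_type", "after", "before"}
REGISTERED_MODIFIERS = {"available", "staff", "descending"}

CALL = re.compile(r"^(\w+)\(([^()]*)\)$")      # name(args)
MARKED = re.compile(r"^#(\w+)$")               # #name
PREFIXED = re.compile(r"^([\w.]+)[:=](\S+)$")  # name:value or name=value


def classify(token):
    """Return (kind, name, argument) for one whitespace-delimited token."""
    m = MARKED.match(token)
    if m and m.group(1) in REGISTERED_MODIFIERS:
        return ("search_modifier", m.group(1), None)
    m = CALL.match(token)
    if m:
        name, args = m.groups()
        if name in REGISTERED_MODIFIERS:
            # registered_modifier '(' boolean_word ')'
            return ("search_modifier", name, args)
        if name in REGISTERED_FILTERS:
            # registered_filter '(' word_list ')'
            return ("search_filter", name, args.split(","))
    m = PREFIXED.match(token)
    if m:
        name, value = m.groups()
        if name in REGISTERED_FILTERS:
            # registered_filter ':' word_list
            return ("search_filter", name, value.split(","))
        return ("classed_search", name, value)
    return ("term", token, None)


for token in "potter author:rowling after(2000) site(ARL-ATH) #descending".split():
    print(classify(token))
```

The registries matter because site:ARL-ATH and author:rowling look identical on the surface; only the set of registered filter names tells them apart.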
=== Evergreen Configuration of the Query Parser ===
== Filters ==
* **audience** -- MARC audience codes
* **vr_format** -- MARC video recording format codes
* **format** -- '-' separated list of item_type and item_form codes, no commas
* **item_type** -- MARC item type code
* **item_form** -- MARC form codes
* **lit_form** -- MARC literary form codes
* **locations** -- shelving location IDs
* **site** -- Org Unit ID or short name
* **lasso** -- Staff-defined Org Lasso ID or name
* **my_lasso** -- User-defined Org Lasso ID or name (not yet implemented)
* **depth** -- Org Unit Type depth value, or Org Unit Type opac label or name string
* **sort** -- Bibliographic record sort axis, one of: title, author, pubdate, create_date or relevance
* **language** -- MARC language code
* **preferred_language** -- MARC language code or Evergreen translation-supported locale
* **preferred_language_weight** or **preferred_language_multiplier** -- relevance multiplier for records matching the preferred language
* **statuses** -- Evergreen status IDs
* **bib_level** -- MARC Bibliographic Level
* **before** -- Filter on MARC Date1 where the record value is earlier than or equal to the supplied value
* **after** -- Filter on MARC Date1 where the record value is later than or equal to the supplied value
* **between** -- Filter on MARC Date1 where the record value is between the supplied values (two required, start and end)
* **during** -- Filter on MARC Date1 and Date2 where the supplied value is between Date1 and Date2
* **offset** or **skip_check** -- Superpage offsetting, not intended for end users
* **limit** or **check_limit** or **superpage_size** -- Superpage sizing, not intended for end users
* **core_limit** -- Superpage visibility horizon, not intended for end users
* **superpage** -- Target superpage to generate, not intended for end users
* **estimation_strategy** -- How to treat deleted and hidden records in hit estimation, not intended for end users
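The four date filters differ only in which comparison they apply to MARC Date1 (and Date2 for during). The sketch below restates the descriptions above as Python predicates. It assumes dates are plain integer years and that between and during are inclusive at both ends, which the descriptions above do not state; matches_date_filter is a hypothetical name, not an Evergreen function.

```python
def matches_date_filter(name, args, date1, date2=None):
    """Apply one date filter to a record's Date1 (and Date2 for 'during').

    Assumes integer years; real MARC dates may contain non-numeric characters.
    """
    if name == "before":
        return date1 <= args[0]  # earlier than or equal to the supplied value
    if name == "after":
        return date1 >= args[0]  # later than or equal to the supplied value
    if name == "between":
        start, end = args        # two values required: start and end
        return start <= date1 <= end  # assumption: inclusive at both ends
    if name == "during":
        if date2 is None:
            raise ValueError("'during' needs both Date1 and Date2")
        return date1 <= args[0] <= date2  # assumption: inclusive at both ends
    raise ValueError(f"not a date filter: {name}")


print(matches_date_filter("between", [1980, 1989], 1985))
print(matches_date_filter("during", [1995], 1990, 2000))
```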
== Modifiers ==
* **available** -- Limit to statuses 0 (Available), 7 (Reshelving) and 12 (Reserves)
* **descending** -- Reverse the sort order (default is ascending)
* **ascending** -- Sort in ascending order
* **metabib** or **metarecord** -- Metarecord search, instead of direct bibliographic record search
* **staff** -- Staff search, which includes hidden records and non-transcendent records with no items or located URIs
* Cover density ranking algorithm tuning parameters, as described in the [[http://www.postgresql.org/docs/9.0/interactive/textsearch-controls.html#TEXTSEARCH-RANKING|Postgres documentation]].
* **CD_logDocumentLength** => 1
* **CD_documentLength** => 2
* **CD_meanHarmonic** => 4
* **CD_uniqueWords** => 8
* **CD_logUniqueWords** => 16
* **CD_selfPlusOne** => 32
* **lucky** -- Return only the first hit, à la Google's "I'm feeling lucky" button
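The values attached to the CD_ modifiers above are the normalization flags from the linked PostgreSQL ranking documentation, where they are described as a bit mask that can be combined with bitwise OR. Assuming Evergreen combines the CD_ modifiers in the same way (the source does not spell this out), the combined value could be computed as in this sketch; normalization_mask is a hypothetical helper.

```python
# Values copied from the list above.
CD_FLAGS = {
    "CD_logDocumentLength": 1,
    "CD_documentLength": 2,
    "CD_meanHarmonic": 4,
    "CD_uniqueWords": 8,
    "CD_logUniqueWords": 16,
    "CD_selfPlusOne": 32,
}


def normalization_mask(modifiers):
    """OR together the flag values of the named CD_ modifiers."""
    mask = 0
    for name in modifiers:
        mask |= CD_FLAGS[name]
    return mask


print(normalization_mask(["CD_documentLength", "CD_meanHarmonic"]))
```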
== Registered classes (from the stock config.metabib_class) ==
* **keyword** -- Anywhere in the record except the physical description
* **title** -- Any abbreviated, translated, alternate, uniform or proper title
* **author** -- Any personal, corporate or conference author
* **subject** -- Any standard subject field. Also broken down into topic, temporal, geographic and name subjects by field
* **series** -- Series title
* **identifier** -- Accession numbers, standard numbers (ISxN, UPC, EAN, etc.), local call number strings and similar identifiers
== Registered search fields (from the stock config.metabib_field) ==
^ **class** ^ **name** ^
|author|conference|
|author|corporate|
|author|other|
|author|personal|
|keyword|keyword|
|series|seriestitle|
|subject|complete|
|subject|geographic|
|subject|name|
|subject|temporal|
|subject|topic|
|title|abbreviated|
|title|alternative|
|title|proper|
|title|translated|
|title|uniform|
|identifier|isbn|
|identifier|issn|
|identifier|upc|
|identifier|ismn|
|identifier|ean|
|identifier|isrc|
|identifier|sici|
|identifier|bibcn|
|identifier|accession|
== Registered class and field aliases ==
^ **alias** ^ **class** ^ **field** ^
|au | author | |
|creator | author | |
|name | author | |
|kw | keyword | |
|se | series | |
|su | subject | |
|ti | title | |
|eg.author | author | |
|eg.name | author | |
|eg.keyword | keyword | |
|eg.series | series | |
|eg.subject | subject | |
|eg.title | title | |
|bib.name | author | |
|bib.nameconference | author | conference|
|bib.namecorporate | author | corporate|
|bib.namepersonal | author | personal|
|bib.namepersonalfamily | author | personal|
|bib.namepersonalgiven | author | personal|
|dc.contributor | author | |
|dc.creator | author | |
|bib.edition | keyword | |
|bib.genre | keyword | |
|bib.subjecttitle | keyword | |
|dc.identifier | keyword | |
|dc.publisher | keyword | |
|srw.serverchoice | keyword | |
|bib.titleseries | series | seriestitle|
|bib.subjectname | subject | name|
|bib.subjectoccupation | subject | complete|
|bib.subjectplace | subject | geographic|
|dc.subject | subject | |
|bib.title | title | abbreviated|
|bib.titleabbreviated | title | abbreviated|
|bib.titlealternative | title | alternative|
|bib.titletranslated | title | translated|
|bib.titleuniform | title | uniform|
|dc.title | title | |
|id | identifier | |
|dc.identifier | identifier | |
|eg.isbn | identifier | isbn|
|eg.issn | identifier | issn|
|eg.upc | identifier | upc|
|eg.callnumber | identifier | bibcn|
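Each alias stands in for a class, or for a class and field pair. As an illustration, the sketch below rewrites alias:term into the equivalent class:term or class|field:term form using a handful of rows from the table. It treats aliases as literal strings, although the grammar allows them to be regular expressions, and it does not reflect how the parser resolves aliases internally; ALIASES and expand_alias are hypothetical names.

```python
# A few rows from the alias table: alias -> (class, field or None).
ALIASES = {
    "au": ("author", None),
    "kw": ("keyword", None),
    "ti": ("title", None),
    "bib.namepersonal": ("author", "personal"),
    "bib.subjectplace": ("subject", "geographic"),
    "eg.isbn": ("identifier", "isbn"),
    "eg.callnumber": ("identifier", "bibcn"),
}


def expand_alias(search):
    """Rewrite 'alias:term' as 'class:term' or 'class|field:term'."""
    alias, sep, term = search.partition(":")
    if not sep or alias not in ALIASES:
        return search  # no known alias; leave the search untouched
    search_class, field = ALIASES[alias]
    prefix = search_class if field is None else f"{search_class}|{field}"
    return f"{prefix}:{term}"


print(expand_alias("au:rowling"))
print(expand_alias('eg.callnumber:"^123 ABC"'))
```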
== Registered facet fields (from the stock config.metabib_field) ==
^ **class** ^ **name** ^
|author|conference|
|author|corporate|
|author|other|
|author|personal|
|series|seriestitle|
|subject|geographic|
|subject|name|
|subject|temporal|
|subject|topic|
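Following the facet_search production in the grammar, a facet query is a class, optionally followed by one or more fields joined with the | separator, and then a bracketed list of values separated by ' # '. A small Python sketch of composing such strings is shown here; facet_search is a hypothetical helper and performs no validation against the registered facet fields.

```python
def facet_search(search_class, values, fields=()):
    """Compose class[values] or class|field[values] in facet syntax."""
    prefix = "|".join([search_class, *fields])
    return f"{prefix}[{' # '.join(values)}]"


print(facet_search("subject", ["Magic in literature"]))
print(facet_search("keyword", ["VHS", "BetaMax"], fields=["mat_type"]))
```

The second call reproduces the locally defined mat_type facet from the examples section.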