Bibliographic queries in Evergreen
Evergreen, in trunk as of March 2010 and in all versions after the 1.6 series, uses a configurable query parser for bibliographic searches. The new parser is considerably more flexible than the old one, which hard-coded its query syntax.
Grammar
We start with a pseudo-grammar for the new query parser, with some inline notes:
    regexp := valid PCRE
    word := valid UTF-8 non-whitespace characters
    whitespace := string matching PCRE /\s+/s
    boolean_word := 'yes' | 'no' | 'true' | 'false' | '1' | '0'

    modifier_marker := '#'  ### configurable, default
    phrase_boundary := '"'
    require-er := '+'
    negator := '-'
    search_seperator := ':' | '='
    class_field_seperator := '|'
    boolean_and := '&&'  ### configurable, EG default
    boolean_or := '||'  ### configurable, EG default
    subquery_start := '('  ### configurable, default
    subquery_end := ')'  ### configurable, default

    word_list := word { ',' word }
    negated_word := negator word
    required_word := require-er word
    phrase := phrase_boundary word { whitespace word } phrase_boundary
    term := word | negated_word | required_word | phrase { whitespace term }
    boolean_operator := boolean_and | boolean_or

    registered_class := 'keyword' | 'title' | 'author' | 'subject' | 'series'  ### configurable, default for EG
    class_alias := regexp  ### 'kw', 'ti', 'au', 'su', 'se' and many more, configurable, loaded from IDL class cmsa where field is null
    search_class := registered_class | class_alias
    registered_field := word  ### configurable, loaded from IDL class cmf where search_field is true
    field_alias := regexp  ### configurable, loaded from IDL class cmsa where field is not null
    registered_facet := word  ### configurable, loaded from IDL class cmf where facet_field is true

    classed_search := search_class search_seperator term
    search_field_list := registered_field { class_field_seperator search_field_list }
    fielded_search := search_class class_field_seperator { search_field_list } search_seperator term
    field_alias_search := field_alias search_seperator term
    facet_list := registered_facet { class_field_seperator facet_list }
    facet_value_list := term { ' # ' term }
    facet_search := search_class [ class_field_seperator { facet_list } ] '[' facet_value_list ']'
    search := term | classed_search | fielded_search | field_alias_search | facet_search

    registered_modifier := 'available' | 'staff' | 'descending'  ### and many more, defined in QueryParser implementation driver class_alias
    search_modifier := modifier_marker registered_modifier | registered_modifier '(' boolean_word ')'
    registered_filter := 'site' | 'sort' | 'item_type'  ### and many more, defined in QueryParser implementation driver class_alias
    search_filter := registered_filter '(' word_list ')' | registered_filter ':' word_list

    boolean_term := term boolean_operator term
    subquery := subquery_start query subquery_end
    query := term | boolean_term | search | search_modifier | search_filter | subquery { [boolean_operator] query }
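To make the grammar more concrete, here is a minimal Python sketch that splits a query on whitespace (keeping quoted phrases together) and labels each token with the grammar rule it most plausibly matches. This is not Evergreen's QueryParser. The names `tokenize` and `classify` are hypothetical, the class, modifier and filter sets contain only the stock examples named in the grammar, and subqueries, aliases and facet searches are ignored.

```python
import re

# Stock examples named in the grammar; a real installation can configure more.
SEARCH_CLASSES = {"keyword", "title", "author", "subject", "series"}
MODIFIERS = {"available", "staff", "descending"}
FILTERS = {"site", "sort", "item_type"}

# A token is a run of non-space text; quoted phrases inside it may hold spaces.
TOKEN_RE = re.compile(r'(?:[^\s"]+|"[^"]*")+')
CALL_RE = re.compile(r"(\w+)\(([^()]*)\)")              # name(arg,arg)
PREFIX_RE = re.compile(r"(\w+)((?:\|\w+)*)([:=])(.+)")  # class|field:term


def tokenize(query):
    """Split a query into whitespace-delimited tokens (hypothetical helper)."""
    return TOKEN_RE.findall(query)


def classify(token):
    """Return (rule_name, detail) for one token (hypothetical helper)."""
    if token in ("&&", "||"):
        return ("boolean_operator", token)
    if token.startswith("#") and token[1:] in MODIFIERS:
        return ("search_modifier", token[1:])
    call = CALL_RE.fullmatch(token)
    if call and call.group(1) in FILTERS:
        return ("search_filter", (call.group(1), call.group(2).split(",")))
    if call and call.group(1) in MODIFIERS:
        return ("search_modifier", call.group(1))
    prefixed = PREFIX_RE.fullmatch(token)
    if prefixed:
        name, fields, sep, term = prefixed.groups()
        field_list = [f for f in fields.split("|") if f]
        if name in FILTERS and not field_list and sep == ":":
            return ("search_filter", (name, term.split(",")))
        if name in SEARCH_CLASSES:
            kind = "fielded_search" if field_list else "classed_search"
            return (kind, (name, field_list, term))
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        return ("phrase", token[1:-1])
    if len(token) >= 2 and token[0] == "-":
        return ("negated_word", token[1:])
    if len(token) >= 2 and token[0] == "+":
        return ("required_word", token[1:])
    return ("word", token)


if __name__ == "__main__":
    for tok in tokenize('title:"harry potter" -stone site(BR1) #available'):
        print(classify(tok))
```

The check order follows the grammar's specificity: operators, modifiers and filters are tested before class prefixes, and anything unrecognized falls through to a plain word.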
Evergreen Configuration of the Query Parser
Filters
- audience – MARC audience codes
- vr_format – MARC video recording format codes
- format – '-' separated list of item_type and item_form codes, no commas
- item_type – MARC item type code
- item_form – MARC form codes
- lit_form – MARC literary form codes
- locations – shelving location IDs
- site – Org Unit ID or short name
- lasso – Staff-defined Org Lasso ID or name
- my_lasso – User-defined Org Lasso ID or name (not yet implemented)
- depth – Org Unit Type depth value, or Org Unit Type OPAC label or name string
- sort – Bibliographic record sort axis, one of: title, author, pubdate, create_date or relevance
- language – MARC language code
- preferred_language – MARC language code or Evergreen translation-supported locale
- preferred_language_weight or preferred_language_multiplier – relevance multiplier for records matching the preferred language
- statuses – Evergreen status IDs
- bib_level – MARC Bibliographic Level
- before – Filter on MARC Date1 where the record value is earlier than or equal to the supplied value
- after – Filter on MARC Date1 where the record value is later than or equal to the supplied value
- between – Filter on MARC Date1 where the record value is between the supplied values (two required, start and end)
- during – Filter on MARC Date1 and Date2 where the supplied value is between Date1 and Date2
- offset or skip_check – Superpage offsetting, not intended for end users
- limit or check_limit or superpage_size – Superpage sizing, not intended for end users
- core_limit – Superpage visibility horizon, not intended for end users
- superpage – Target superpage to generate, not intended for end users
- estimation_strategy – How to treat deleted and hidden records in hit estimation, not intended for end users
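As an illustration of how the filters above fit the `search_filter` rule, the following Python sketch renders a filter in the `name(word,word,...)` form. The helper `format_filter` is hypothetical and not part of Evergreen. It only checks what this page states: the filter names listed above, the two-value requirement for `between`, and the documented `sort` axes.

```python
# Filter names and aliases as listed on this page.
KNOWN_FILTERS = {
    "audience", "vr_format", "format", "item_type", "item_form", "lit_form",
    "locations", "site", "lasso", "my_lasso", "depth", "sort", "language",
    "preferred_language", "preferred_language_weight",
    "preferred_language_multiplier", "statuses", "bib_level", "before",
    "after", "between", "during", "offset", "skip_check", "limit",
    "check_limit", "superpage_size", "core_limit", "superpage",
    "estimation_strategy",
}
SORT_AXES = {"title", "author", "pubdate", "create_date", "relevance"}


def format_filter(name, *values):
    """Render name(word,word,...) per the search_filter rule (hypothetical)."""
    if name not in KNOWN_FILTERS:
        raise ValueError(f"unknown filter: {name}")
    words = [str(v) for v in values]
    if not words:
        raise ValueError("a filter needs at least one value")
    for w in words:
        # A grammar 'word' has no whitespace; ',' and ')' would break the list.
        if not w or "," in w or ")" in w or any(ch.isspace() for ch in w):
            raise ValueError(f"not a single word: {w!r}")
    if name == "between" and len(words) != 2:
        raise ValueError("between requires a start and an end value")
    if name == "sort" and (len(words) != 1 or words[0] not in SORT_AXES):
        raise ValueError("sort takes exactly one documented axis")
    return f"{name}({','.join(words)})"


if __name__ == "__main__":
    print(format_filter("site", "BR1"))
    print(format_filter("between", 1990, 2000))
```

The org unit short name `BR1` and the years are made-up sample values.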
Modifiers
- available – Limit to statuses 0 (Available), 7 (Reshelving) and 12 (Reserves)
- descending – Reverse the sort order (default is ascending)
- ascending – Sort in ascending order
- metabib or metarecord – Metarecord search, instead of direct bibliographic record search
- staff – Staff search, which includes hidden records and non-transcendent records with no items or located URIs
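The grammar's `search_modifier` rule allows two spellings for a modifier: the marker form (`#available`) and the call form with a `boolean_word` argument (`available(true)`). The short Python sketch below produces both. The function names are hypothetical, and this page does not say what a false argument does, so the sketch only checks that the argument is a valid `boolean_word`.

```python
# Modifier names and boolean words as listed on this page.
KNOWN_MODIFIERS = {
    "available", "descending", "ascending", "metabib", "metarecord", "staff",
}
BOOLEAN_WORDS = {"yes", "no", "true", "false", "1", "0"}


def marker_form(name, marker="#"):
    """Render modifier_marker + registered_modifier, e.g. '#available'."""
    if name not in KNOWN_MODIFIERS:
        raise ValueError(f"unknown modifier: {name}")
    return marker + name


def call_form(name, flag="true"):
    """Render registered_modifier '(' boolean_word ')', e.g. 'staff(true)'."""
    if name not in KNOWN_MODIFIERS:
        raise ValueError(f"unknown modifier: {name}")
    if flag not in BOOLEAN_WORDS:
        raise ValueError(f"not a boolean_word: {flag!r}")
    return f"{name}({flag})"


if __name__ == "__main__":
    # Append modifiers to a plain keyword query.
    print(" ".join(["dinosaurs", marker_form("available"), call_form("descending")]))
```

The `marker` parameter exists because the grammar marks the modifier marker as configurable, with `#` as the default.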
Registered classes (from the stock config.metabib_class)
- keyword
- title
- author
- subject
- series
Registered search fields (from the stock config.metabib_field)
class | name |
---|---|
author | conference |
author | corporate |
author | other |
author | personal |
keyword | keyword |
series | seriestitle |
subject | complete |
subject | geographic |
subject | name |
subject | temporal |
subject | topic |
title | abbreviated |
title | alternative |
title | proper |
title | translated |
title | uniform |
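The table above can be combined with the `classed_search` and `fielded_search` rules to build class- and field-restricted searches. The Python sketch below does this; the helper `fielded_search` is hypothetical, and the dictionary only mirrors the stock rows shown here, so a site with custom fields would need its own data.

```python
# Stock search fields per class, copied from the table above.
STOCK_SEARCH_FIELDS = {
    "author": {"conference", "corporate", "other", "personal"},
    "keyword": {"keyword"},
    "series": {"seriestitle"},
    "subject": {"complete", "geographic", "name", "temporal", "topic"},
    "title": {"abbreviated", "alternative", "proper", "translated", "uniform"},
}


def fielded_search(search_class, fields, term, separator=":"):
    """Render class|field|field:term, or class:term when fields is empty."""
    if search_class not in STOCK_SEARCH_FIELDS:
        raise ValueError(f"unknown search class: {search_class}")
    unknown = [f for f in fields if f not in STOCK_SEARCH_FIELDS[search_class]]
    if unknown:
        raise ValueError(f"not a {search_class} field: {unknown}")
    if separator not in (":", "="):
        raise ValueError("search_seperator must be ':' or '='")
    if not term:
        raise ValueError("empty search term")
    return search_class + "".join("|" + f for f in fields) + separator + term


if __name__ == "__main__":
    print(fielded_search("title", ["proper", "uniform"], "hamlet"))
    print(fielded_search("author", [], "shakespeare"))
```

With an empty field list the result is a plain `classed_search`, which is why one helper covers both rules.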
Registered facet fields (from the stock config.metabib_field)
class | name |
---|---|
author | conference |
author | corporate |
author | other |
author | personal |
series | seriestitle |
subject | geographic |
subject | name |
subject | temporal |
subject | topic |
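Facet searches use a different shape from fielded searches: per the `facet_search` rule, the values go in square brackets and are separated by ` # `. The following Python sketch builds such a string from the stock facet fields above. The helper name `facet_search` and the sample values are assumptions for illustration only.

```python
# Stock facet fields per class, copied from the table above.
STOCK_FACET_FIELDS = {
    "author": {"conference", "corporate", "other", "personal"},
    "series": {"seriestitle"},
    "subject": {"geographic", "name", "temporal", "topic"},
}


def facet_search(search_class, fields, values):
    """Render class|facet[value # value] per the facet_search rule."""
    if search_class not in STOCK_FACET_FIELDS:
        raise ValueError(f"no stock facets for class: {search_class}")
    unknown = [f for f in fields if f not in STOCK_FACET_FIELDS[search_class]]
    if unknown:
        raise ValueError(f"not a {search_class} facet: {unknown}")
    if not values:
        raise ValueError("at least one facet value is required")
    for v in values:
        # These would collide with the bracket and ' # ' delimiters.
        if not v or "]" in v or " # " in v:
            raise ValueError(f"unusable facet value: {v!r}")
    prefix = search_class + "".join("|" + f for f in fields)
    return prefix + "[" + " # ".join(values) + "]"


if __name__ == "__main__":
    print(facet_search("subject", ["topic"], ["Whales", "Oceans"]))
```

The grammar makes the facet list optional, so passing an empty `fields` list yields a class-level facet search such as `series[Discworld]`.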
documentation/technical/search_grammar.1270053785.txt.gz · Last modified: 2022/02/10 13:33 (external edit)