documentation:technical:search_grammar

Differences

This shows you the differences between two versions of the page.

documentation:technical:search_grammar [2010/03/31 12:30] – adding modifier list miker
documentation:technical:search_grammar [2022/02/10 13:34] (current) – external edit 127.0.0.1
Line 2: Line 2:
  
 Evergreen, in trunk as of March 2010 and for all versions after the 1.6 series, uses an advanced, configurable query parser for bibliographic searches.  This new parser is much more flexible and featureful than the old one, which hard coded the syntax for parsing queries.
 +
 +How about ... some fun with examples!
 +
 +=== Examples ===
 +
 +  * Search for records containing "harry potter":<code>harry potter</code>
 +    * Too easy? That were published after January 1, 2000:<code>harry potter after(2000)</code>
 +    * We only care about the actual series:<code>harry potter after(2000) author:rowling</code>
 +    * And that are at our library:<code>harry potter after(2000) author:rowling site(ARL-ATH)</code>
 +    * And, sort those by pub date:<code>harry potter after(2000) author:rowling site(ARL-ATH) sort(pubdate)</code>
 +    * Descending (newest first):<code>harry potter after(2000) author:rowling site(ARL-ATH) sort(pubdate)#descending</code>
 +  * How about some nested boolean action:<code>("harry potter" && (stone || chamber)) && (author:rowling || subject:rowling) item_form(d) subject[Magic in literature]</code> (That last part is a facet.)
 +  * Phrase searches can be left-anchored: <code>identifier|bibcn:"^123 ABC"</code> or right-anchored: <code>bibcn:"2004$"</code>
 +  * Phrase searches can also consider punctuation literally: <code>"C++"</code>
 +  * Speaking of facets, imagine a locally defined index definition in the keyword class called mat_type that indexes, say, 945$m, which is a local field holding bib-level "material type" strings.  Imagine further that this field is marked as a facet field, but not a search field.  You could:
 +    * Browse all of your DVD mat_types sorted by author:<code>keyword|mat_type[DVD] sort(author)</code>
 +    * Find all available audio cassettes in your South West branch:<code>#available keyword|mat_type[AudioCassette] site(SW) sort(title)</code>
 +    * Get the bibs for all of your VHS and BetaMax items published during the 80s in one browse list ordered by their approximate purchase date:<code>keyword|mat_type[VHS # BetaMax] between(1980,1989) sort(create_date)</code>
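The examples above all follow one shape: optional modifiers, then search terms, then filters, then a sort clause. A minimal Python sketch of assembling such strings follows. The helper ''build_query'' and its parameters are hypothetical and not part of Evergreen; it only concatenates text in the style shown above and does not validate anything against the grammar.

```python
def build_query(terms, filters=None, modifiers=None, sort=None, descending=False):
    """Assemble a query string in the style of the examples above.

    Hypothetical helper: it only joins text, it does not parse or validate.
    """
    parts = []
    # Modifiers are written with the default '#' marker, e.g. #available.
    for name in (modifiers or []):
        parts.append(f"#{name}")
    parts.append(terms)
    # Filters take their arguments in parentheses, e.g. between(1980,1989).
    for name, args in (filters or []):
        parts.append(f"{name}({','.join(str(a) for a in args)})")
    if sort:
        clause = f"sort({sort})"
        if descending:
            # Same form as the "newest first" example: sort(pubdate)#descending
            clause += "#descending"
        parts.append(clause)
    return " ".join(parts)


print(build_query(
    "keyword|mat_type[AudioCassette]",
    filters=[("site", ["SW"])],
    modifiers=["available"],
    sort="title",
))
# prints: #available keyword|mat_type[AudioCassette] site(SW) sort(title)
```

Keeping the pieces separate until the final join makes it easy to add or drop a filter without re-editing the whole string.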
  
 === Grammar ===
Line 14: Line 32:
 modifier_marker       := '#'   ### configurable, default
 phrase_boundary       := '"'
-require-er            := '+'
 +phrase_left_anchor    := '^'
 +phrase_right_anchor   := '$'
 +1_word_phrase_marker  := '+'
 negator               := '-'
 search_seperator      := ':' | '='
Line 25: Line 45:
 word_list             := word { ',' word }
 negated_word          := negator word
-required_word         := require-er word
 +required_word         := 1_word_phrase_marker word  ### one-word phrase shortcut
-phrase                := phrase_boundary word { whitespace word } phrase_boundary
 +phrase                := phrase_boundary { phrase_left_anchor } word { whitespace word } { phrase_right_anchor } phrase_boundary
 term                  := word | negated_word | required_word | phrase { whitespace term }
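To make the revised ''phrase'' rule concrete, here is a rough Python sketch that splits a quoted phrase into its words and anchor flags. It is not the Evergreen parser: the regular expression and the function ''parse_phrase'' are illustrative assumptions, and the sketch treats each anchor as appearing at most once, directly inside the quotes.

```python
import re

# Assumed shape of a phrase token: '"', optional '^', words, optional '$', '"'.
PHRASE_RE = re.compile(r'^"(\^?)(.*?)(\$?)"$')


def parse_phrase(token):
    """Split a quoted phrase into words plus left/right anchor flags."""
    match = PHRASE_RE.match(token)
    if match is None:
        raise ValueError(f"not a quoted phrase: {token!r}")
    left, body, right = match.groups()
    return {
        "words": body.split(),          # words are separated by whitespace
        "left_anchored": bool(left),    # phrase_left_anchor '^' present
        "right_anchored": bool(right),  # phrase_right_anchor '$' present
    }


print(parse_phrase('"^123 ABC"'))
print(parse_phrase('"2004$"'))
```

Punctuation inside the quotes is left untouched, so ''"C++"'' yields the single word ''C++'' with neither anchor set.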
  
Line 95: Line 115:
   * **metabib** or **metarecord** -- Metarecord search, instead of direct bibliographic record search
   * **staff** -- Staff search, which includes hidden records and non-transcendent records with no items or located URIs
 +  * Cover density ranking algorithm tuning parameters, as described in the [[http://www.postgresql.org/docs/9.0/interactive/textsearch-controls.html#TEXTSEARCH-RANKING|Postgres documentation]].
 +    * **CD_logDocumentLength**    => 1
 +    * **CD_documentLength**       => 2
 +    * **CD_meanHarmonic**         => 4
 +    * **CD_uniqueWords**          => 8
 +    * **CD_logUniqueWords**       => 16
 +    * **CD_selfPlusOne**          => 32
 +  * **lucky** -- Return only the first hit, à la Google's "I'm feeling lucky" button
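The cover density values listed above are powers of two, which matches the normalization bitmask described in the linked Postgres documentation, so several options can be combined into one integer. The Python sketch below only illustrates that bit arithmetic; the class name ''CoverDensity'' is made up here, and how Evergreen itself combines the modifiers internally is an assumption.

```python
from enum import IntFlag


class CoverDensity(IntFlag):
    """Values copied from the modifier list above (class name is hypothetical)."""
    CD_logDocumentLength = 1
    CD_documentLength = 2
    CD_meanHarmonic = 4
    CD_uniqueWords = 8
    CD_logUniqueWords = 16
    CD_selfPlusOne = 32


def decode(mask):
    """Return the names of the options set in an integer bitmask."""
    return [flag.name for flag in CoverDensity if flag in CoverDensity(mask)]


combined = CoverDensity.CD_logDocumentLength | CoverDensity.CD_uniqueWords
print(int(combined))     # 1 + 8
print(decode(combined))
```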
 +
 +== Registered classes (from the stock config.metabib_class) ==
 +  * **keyword** -- Anywhere in the record except the physical description
 +  * **title** -- Any abbreviated, translated, alternate, uniform or proper title
 +  * **author** -- Any personal, corporate or conference author
 +  * **subject** -- Any standard subject field.  Also broken down into topic, temporal, geographic and name subjects by field
 +  * **series** -- Series title
 +  * **identifier** -- Accession numbers, standard numbers (ISxN, UPC, EAN, etc), local call number strings, etc
 +
 +== Registered search fields (from the stock config.metabib_field) ==
 +^ **class** ^ **name** ^
 +|author|conference|
 +|author|corporate|
 +|author|other|
 +|author|personal|
 +|keyword|keyword|
 +|series|seriestitle|
 +|subject|complete|
 +|subject|geographic|
 +|subject|name|
 +|subject|temporal|
 +|subject|topic|
 +|title|abbreviated|
 +|title|alternative|
 +|title|proper|
 +|title|translated|
 +|title|uniform|
 +|identifier|isbn|
 +|identifier|issn|
 +|identifier|upc|
 +|identifier|ismn|
 +|identifier|ean|
 +|identifier|isrc|
 +|identifier|sici|
 +|identifier|bibcn|
 +|identifier|accession|
  
 +== Registered class and field aliases ==
 +^ **alias**         ^ **class**   ^ **field** ^
 +|au                     | author        | |
 +|creator                | author        | |
 +|name                   | author        | |
 +|kw                     | keyword       | |
 +|se                     | series        | |
 +|su                     | subject       | |
 +|ti                     | title         | |
 +|eg.author              | author        | |
 +|eg.name                | author        | |
 +|eg.keyword             | keyword       | |
 +|eg.series              | series        | |
 +|eg.subject             | subject       | |
 +|eg.title               | title         | |
 +|bib.name               | author        | |
 +|bib.nameconference     | author        | conference|
 +|bib.namecorporate      | author        | corporate|
 +|bib.namepersonal       | author        | personal|
 +|bib.namepersonalfamily | author        | personal|
 +|bib.namepersonalgiven  | author        | personal|
 +|dc.contributor         | author        | |
 +|dc.creator             | author        | |
 +|bib.edition            | keyword       | |
 +|bib.genre              | keyword       | |
 +|bib.subjecttitle       | keyword       | |
 +|dc.identifier          | keyword       | |
 +|dc.publisher           | keyword       | |
 +|srw.serverchoice       | keyword       | |
 +|bib.titleseries        | series        | seriestitle|
 +|bib.subjectname        | subject       | name|
 +|bib.subjectoccupation  | subject       | complete|
 +|bib.subjectplace       | subject       | geographic|
 +|dc.subject             | subject       | |
 +|bib.title              | title         | abbreviated|
 +|bib.titleabbreviated   | title         | abbreviated|
 +|bib.titlealternative   | title         | alternative|
 +|bib.titletranslated    | title         | translated|
 +|bib.titleuniform       | title         | uniform|
 +|dc.title               | title         | |
 +|id                     | identifier    | |
 +|dc.identifier          | identifier    | |
 +|eg.isbn                | identifier    | isbn|
 +|eg.issn                | identifier    | issn|
 +|eg.upc                 | identifier    | upc|
 +|eg.callnumber          | identifier    | bibcn|
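An alias in the table above stands for either a whole class or a specific ''class|field'' pair. As an illustration, a lookup over a handful of rows could be sketched in Python like this. The dictionary holds only a few entries copied from the table, and the function ''resolve'' is hypothetical; the real mapping lives in the Evergreen configuration, not in code like this.

```python
# A few rows copied from the alias table: alias -> (class, field or None).
ALIASES = {
    "au": ("author", None),
    "ti": ("title", None),
    "bib.namepersonal": ("author", "personal"),
    "bib.titleseries": ("series", "seriestitle"),
    "eg.isbn": ("identifier", "isbn"),
    "eg.callnumber": ("identifier", "bibcn"),
}


def resolve(alias):
    """Expand an alias to 'class' or 'class|field' (raises KeyError if unknown)."""
    field_class, field = ALIASES[alias]
    # The class|field form is the one used in examples such as identifier|bibcn.
    return f"{field_class}|{field}" if field else field_class


print(resolve("eg.callnumber"))  # identifier|bibcn
print(resolve("au"))             # author
```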
  
 +== Registered facet fields (from the stock config.metabib_field) ==
 +^ **class** ^ **name** ^
 +|author|conference|
 +|author|corporate|
 +|author|other|
 +|author|personal|
 +|series|seriestitle|
 +|subject|geographic|
 +|subject|name|
 +|subject|temporal|
 +|subject|topic|
 +  
documentation/technical/search_grammar.1270053034.txt.gz · Last modified: 2022/02/10 13:33 (external edit)

Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Share Alike 4.0 International

© 2008-2022 GPLS and others. Evergreen is open source software, freely licensed under GNU GPLv2 or later.
The Evergreen Project is a U.S. 501(c)3 non-profit organization.