documentation:indexing [2016/07/26 17:17] – rjs7 | documentation:indexing [2016/07/27 09:39] – [Field Aliases] rjs7 |
---|
</code> | </code> |
| |
Each indexed datum coming from a bibliographic record in Evergreen is extracted based on an Indexed Field Definition. There are approximately twenty default definitions in a stock Evergreen installation spread across all six search classes. These definitions provide Evergreen with the information it needs in order to extract interesting data from bibliographic records. | Each indexed datum coming from a bibliographic record in Evergreen is extracted based on an Indexed Field Definition. There are approximately thirty default definitions in a stock Evergreen installation spread across all six search classes. These definitions provide Evergreen with the information it needs in order to extract interesting data from bibliographic records. |
| |
==== Field Class ==== | ==== Field Class ==== |
Indexed data from different fields will probably be considered to have different importance when calculating the relevance of a matched query term. For instance, a match in a translated title may be considered less important than a match in the title proper. The **weight** allows control over this. | Indexed data from different fields will probably be considered to have different importance when calculating the relevance of a matched query term. For instance, a match in a translated title may be considered less important than a match in the title proper. The **weight** allows control over this. |
| |
By supplying a higher or lower relative **weight**, one field can be made more or less important, in relevance ranking terms, than others. This value is used as a multiplier to the baseline cover-density ranking (discussed below), and so setting this value to 0 will allow matches, but always rank them at the bottom of the list. Likewise, supplying a very large **weight** multiplier will cause matches to appear at the top of the list. Floating point values are allowed, and by supplying values between 0 and 1 can be used apply fine-grained, percentage-based adjustments. | By supplying a higher or lower relative **weight**, one field can be made more or less important, in relevance ranking terms, than others. This value is used as a multiplier to the baseline cover-density ranking (discussed below), and so setting this value to 0 will allow matches, but always rank them at the bottom of the list. Likewise, supplying a very large **weight** multiplier will cause matches to appear at the top of the list. Floating point values are allowed, and values between 0 and 1 can be used to apply fine-grained, percentage-based adjustments. |
| |
Evergreen ships with all Indexed Field Definition weights set to 1 by default. | Evergreen ships with all Indexed Field Definition weights set to 1 by default. |
First, these aliases provide a mechanism for internationalizing the user-supplied search constraints; for instance, "author" can be aliased to "skrywer" to support searching by native speakers of Afrikaans without having to know the English term "author". | First, these aliases provide a mechanism for internationalizing the user-supplied search constraints; for instance, "author" can be aliased to "skrywer" to support searching by native speakers of Afrikaans without having to know the English term "author". |
| |
In a similar manner, aliases can be used to map [[http://www.loc.gov/standards/sru/specs/cql.html|CQL]] context set match points, which have standard names external to any specific search backend, to appropriate match points in any given Evergreen installation. For instance, the [[http://www.loc.gov/standards/sru/cql-bibliographic-searching.html|proposed CQL 'bib' context set]] defines, among others, title indexes called **dc.title**, **bib.titleUniform** and **bib.titleSeries**. In a stock Evergreen installation, **dc.title** would be aliased to the entire **title** search class, **bib.titleUniform** to the **uniform** field within the **title** search class, and **bib.titleSeries** to the **seriestitle** field within the **series** search class. | In a similar manner, aliases can be used to map [[http://www.loc.gov/standards/sru/specs/cql.html|CQL]] context set match points, which have standard names external to any specific search backend, to appropriate match points in any given Evergreen installation. For instance, the [[http://www.loc.gov/standards/sru/cql/contextSets/bib-context-set.html|proposed CQL 'bib' context set]] defines, among others, title indexes called **dc.title**, **bib.titleUniform** and **bib.titleSeries**. In a stock Evergreen installation, **dc.title** would be aliased to the entire **title** search class, **bib.titleUniform** to the **uniform** field within the **title** search class, and **bib.titleSeries** to the **seriestitle** field within the **series** search class. |
| |
| |
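The alias mapping described in the Field Aliases text above can be sketched as a simple lookup. This is only an illustration, not Evergreen's implementation: the `ALIASES` table, the `resolve_alias` function, and the `(search class, field)` tuple representation are hypothetical names chosen for the example, and the entries mirror the examples given in the text.

```python
from typing import Optional, Tuple

# Hypothetical alias table (an assumption for illustration only).
# Each alias maps to (search class, field); a field of None means the
# alias targets the entire search class.
ALIASES = {
    "skrywer": ("author", None),                    # Afrikaans alias for "author"
    "dc.title": ("title", None),                    # whole title search class
    "bib.titleuniform": ("title", "uniform"),       # one field in the title class
    "bib.titleseries": ("series", "seriestitle"),   # one field in the series class
}


def resolve_alias(name: str) -> Tuple[str, Optional[str]]:
    """Return the (search class, field) target for a user-supplied name.

    Lookup is case-insensitive in this sketch. Names that are not known
    aliases are returned unchanged as a whole-class target.
    """
    key = name.lower()
    return ALIASES.get(key, (key, None))


if __name__ == "__main__":
    for term in ("skrywer", "dc.title", "bib.titleUniform", "bib.titleSeries"):
        print(term, "->", resolve_alias(term))
```

In this sketch a CQL match point and an internationalized term are resolved the same way, which is the point of the alias mechanism: the name a searcher types is independent of the search class and field it finally targets.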