Differences

This shows you the differences between two versions of the page.

--- documentation:indexing [2016/07/27 09:39] – [Field Aliases] rjs7
+++ documentation:indexing [2022/02/10 13:34] (current) – external edit 127.0.0.1
@@ Line 1: / Line 1: @@
+FIXME - Add Information on Virtual Index Definitions in 3.1+
 ======= Bibliographic Indexing in Evergreen ========
 Indexing and searching bibliographic data in Evergreen are complex processes.  Also, Evergreen is extremely configurable in these areas and can be tuned to the specific needs of each installation.  Because of this, there are many moving parts which are tightly integrated and interdependent.
@@ Line 195: / Line 197: @@
 Normalizer functions are in-database stored procedures, and can be written in any programming language supported by Postgres.  The stock normalizers are all written in either PL/PerlU, PL/pgSQL or SQL.
-Twelve normalizer functions are registered in the stock Evergreen installation.  They are
+Twenty-one normalizer functions are registered in the stock Evergreen installation.  They are
 ^ Name ^ Description ^
+|Approximate High Date Normalize|Normalize the value to the nearest date-ish value, rounding up|
+|Approximate Low Date Normalize|Normalize the value to the nearest date-ish value, rounding down|
+|Coded Value Map Normalizer|Applies coded_value_map mapping of values|
 |Down-case|Convert text to lower case.|
 |Extract Dewey-like number|Extract a string of numeric characters that resembles a DDC number.|
 |First word|Include only the first space-separated word of a string.|
+|Generic Mapping Normalizer|Map values or sets of values to new values.|
 |ISBN 10/13 conversion|Translate ISBN10 to ISBN13, and vice versa, for indexing purposes.|
 |Left truncation|Discard the specified number of characters from the left side of the string.|
@@ Line 206: / Line 212: @@
 |NACO Normalize -- retain first comma|Apply NACO normalization rules to the extracted text, retaining the first comma.  See https://www.loc.gov/aba/pcc/naco/normrule-2.html for details.|
 |Normalize date range|Split date ranges in the form of "XXXX-YYYY" into "XXXX YYYY" for proper indexing.|
-|Replace|Replace all occurances of first parameter in the string with the second parameter.|
+|Normalize date range|Normalize the value to NULL if it is not a number|
+|Replace|Replace all occurrences of first parameter in the string with the second parameter.|
+|Remove Parenthesized Substring|Remove any parenthesized substrings from the extracted text, such as the agency code preceding authority record control numbers in subfield 0.|
 |Right truncation|Include only the specified number of characters from the left side of the string.|
+|Search Normalize|Apply search normalization rules to the extracted text. A less extreme version of NACO normalization.|
 |Strip Diacritics|Convert text to NFD form and remove non-spacing combining marks.|
+|Trim Surrounding Space|Trim leading and trailing spaces from extracted text.|
+|Trim Trailing Punctuation|Eliminate extraneous trailing commas and periods in text.|
 |Up-case|Convert text to upper case.|
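
To make the table more concrete, the following is a minimal Python sketch of what a few of these normalizers do to extracted text, based only on the descriptions above. It is not the stock implementation: the real normalizers are in-database stored procedures written in PL/PerlU, PL/pgSQL or SQL, and every function name here (''strip_diacritics'', ''normalize_date_range'', ''trim_trailing_punctuation'', ''trim_surrounding_space'', ''down_case'', ''apply_normalizers'') is hypothetical. The exact rules of the stock functions, and the order in which they are applied to a field, may differ.

```python
import re
import unicodedata


def trim_surrounding_space(text: str) -> str:
    """Trim leading and trailing spaces (sketch of "Trim Surrounding Space")."""
    return text.strip(" ")


def trim_trailing_punctuation(text: str) -> str:
    """Drop trailing commas and periods (simplified sketch)."""
    return re.sub(r"[,.]+$", "", text)


def normalize_date_range(text: str) -> str:
    """Split "XXXX-YYYY" into "XXXX YYYY" so both years are indexed."""
    return re.sub(r"\b(\d{4})-(\d{4})\b", r"\1 \2", text)


def strip_diacritics(text: str) -> str:
    """Convert to NFD form and remove non-spacing combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def down_case(text: str) -> str:
    """Convert text to lower case."""
    return text.lower()


def apply_normalizers(text: str, normalizers) -> str:
    """Apply each normalizer in order, feeding the output of one into the next."""
    for normalize in normalizers:
        text = normalize(text)
    return text


if __name__ == "__main__":
    chain = [
        trim_surrounding_space,
        trim_trailing_punctuation,
        normalize_date_range,
        strip_diacritics,
        down_case,
    ]
    print(apply_normalizers("  Café, 1914-1918.  ", chain))
```

Chaining the functions in a list mirrors the idea that several normalizers can be attached to one field and applied in sequence; the order matters, since, for example, trailing punctuation can only be trimmed reliably once surrounding spaces are gone.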