====== Proposal: Serials - Basic Predicting and Receiving ====== FIXME This proposal is still in progress, please do not edit yet FIXME Okay - but let's keep a link to the [[https://docs.google.com/drawings/view?id=196Cyl9CAIzZ33cOVt0AphvUSbAwc9kfRGRB5Nz-5tTI&hl=en&pli=1|latest schema diagram]] handy, at least. ===== Summary ===== Now that MFHD records can be added and are linked to bib records, the next step is to start populating the issuance table in order to predict and receive issues. As issues are received the MFHD record must be updated accordingly. This proposal will focus on fleshing-out the database schema and defining how the MFHD record will hook in. ===== Details ===== ==== Proposed Schema - Major Revision ==== The following is a work-in-progress rewrite of the schema originally proposed (still shown below). It is changing rapidly and may contain some syntax errors! Primary changes include a de-emphasis of MFHD records and a more direct representation of individual issues. DROP SCHEMA serial CASCADE; BEGIN; CREATE SCHEMA serial; CREATE TABLE serial.record_entry ( id BIGSERIAL PRIMARY KEY, record BIGINT REFERENCES biblio.record_entry (id) ON DELETE SET NULL DEFERRABLE INITIALLY DEFERRED, owning_lib INT REFERENCES actor.org_unit (id) ON DELETE SET NULL DEFERRABLE INITIALLY DEFERRED, creator INT NOT NULL REFERENCES actor.usr (id) DEFERRABLE INITIALLY DEFERRED, editor INT NOT NULL REFERENCES actor.usr (id) DEFERRABLE INITIALLY DEFERRED, source INT, create_date TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT now(), edit_date TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT now(), active BOOL NOT NULL DEFAULT TRUE, deleted BOOL NOT NULL DEFAULT FALSE, marc TEXT NOT NULL, last_xact_id TEXT NOT NULL ); CREATE INDEX serial_record_entry_creator_idx ON serial.record_entry ( creator ); CREATE INDEX serial_record_entry_editor_idx ON serial.record_entry ( editor ); CREATE INDEX serial_record_entry_owning_lib_idx ON serial.record_entry ( owning_lib, deleted ); CREATE RULE protect_mfhd_delete AS ON DELETE TO serial.record_entry DO INSTEAD UPDATE serial.record_entry SET deleted = true WHERE old.id = serial.record_entry.id; CREATE TABLE serial.subscription ( id SERIAL PRIMARY KEY, start_date TIMESTAMP WITH TIME ZONE NOT NULL, end_date TIMESTAMP WITH TIME ZONE, -- interpret NULL as current subscription record_entry BIGINT REFERENCES serial.record_entry (id) ON DELETE SET NULL DEFERRABLE INITIALLY DEFERRED, expected_date_offset INTERVAL DEFAULT NULL -- acquisitions/business-side tables link to here ); --at least one distribution per org_unit holding issues CREATE TABLE serial.distribution ( id SERIAL PRIMARY KEY, subscription INT NOT NULL REFERENCES serial.subscription (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED, holding_lib INT REFERENCES actor.org_unit (id) ON DELETE SET NULL DEFERRABLE INITIALLY DEFERRED, label TEXT NOT NULL, receive_call_number BIGINT REFERENCES asset.call_number (id) DEFERRABLE INITIALLY DEFERRED, receive_unit_template INT REFERENCES asset.copy_template (id) DEFERRABLE INITIALLY DEFERRED, bind_call_number BIGINT REFERENCES asset.call_number (id) DEFERRABLE INITIALLY DEFERRED, bind_unit_template INT REFERENCES asset.copy_template (id) DEFERRABLE INITIALLY DEFERRED, unit_label_base TEXT DEFAULT NULL, unit_label_suffix TEXT DEFAULT NULL ); CREATE TABLE serial.stream ( id SERIAL PRIMARY KEY, distribution INT NOT NULL REFERENCES serial.distribution (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED ); CREATE TABLE serial.issuance ( id SERIAL PRIMARY KEY, creator INT NOT NULL REFERENCES actor.usr (id) DEFERRABLE INITIALLY DEFERRED, editor INT NOT NULL REFERENCES actor.usr (id) DEFERRABLE INITIALLY DEFERRED, create_date TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT now(), edit_date TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT now(), subscription INT NOT NULL REFERENCES serial.subscription (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED, label TEXT DEFAULT NULL, date_published TIMESTAMP WITH TIME ZONE DEFAULT NULL, holding_code TEXT DEFAULT NULL, holding_type TEXT DEFAULT NULL CHECK ( holding_type IS NULL OR holding_type IN ('basic','supplement','index') ), holding_link_id INT DEFAULT NULL -- TODO: add columns for separate enumeration/chronology values ); CREATE TABLE serial.item ( id SERIAL PRIMARY KEY, creator INT NOT NULL REFERENCES actor.usr (id) DEFERRABLE INITIALLY DEFERRED, editor INT NOT NULL REFERENCES actor.usr (id) DEFERRABLE INITIALLY DEFERRED, create_date TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT now(), edit_date TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT now(), issuance INT NOT NULL REFERENCES serial.issuance (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED, stream INT NOT NULL REFERENCES serial.stream (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED, unit INT REFERENCES serial.unit (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED, uri INT REFERENCES asset.uri (id) ON DELETE SET NULL DEFERRABLE INITIALLY DEFERRED, date_expected TIMESTAMP WITH TIME ZONE DEFAULT NULL, date_received TIMESTAMP WITH TIME ZONE DEFAULT NULL ); CREATE TABLE serial.unit ( label TEXT DEFAULT NULL, label_sort_key TEXT DEFAULT NULL, contents TEXT NOT NULL ) INHERITS (asset.copy); CREATE TABLE serial.item_note ( id SERIAL PRIMARY KEY, item INT NOT NULL REFERENCES serial.item (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED, creator INT NOT NULL REFERENCES actor.usr (id) DEFERRABLE INITIALLY DEFERRED, create_date TIMESTAMP WITH TIME ZONE DEFAULT NOW(), pub BOOL NOT NULL DEFAULT FALSE, title TEXT NOT NULL, value TEXT NOT NULL ); CREATE TABLE serial.bib_summary ( id SERIAL PRIMARY KEY, distribution INT NOT NULL REFERENCES serial.distribution (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED, generated_coverage TEXT NOT NULL, textual_holdings TEXT ); CREATE TABLE serial.sup_summary ( id SERIAL PRIMARY KEY, distribution INT NOT NULL REFERENCES serial.distribution (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED, generated_coverage TEXT NOT NULL, textual_holdings TEXT ); CREATE TABLE serial.index_summary ( id SERIAL PRIMARY KEY, distribution INT NOT NULL REFERENCES serial.distribution (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED, generated_coverage TEXT NOT NULL, textual_holdings TEXT ); COMMIT; -- to be moved to asset schema file CREATE TABLE asset.copy_template ( id SERIAL PRIMARY KEY, owning_lib INT NOT NULL REFERENCES actor.org_unit (id) DEFERRABLE INITIALLY DEFERRED, creator BIGINT NOT NULL REFERENCES actor.usr (id) DEFERRABLE INITIALLY DEFERRED, editor BIGINT NOT NULL REFERENCES actor.usr (id) DEFERRABLE INITIALLY DEFERRED, create_date TIMESTAMP WITH TIME ZONE DEFAULT NOW(), edit_date TIMESTAMP WITH TIME ZONE DEFAULT NOW(), name TEXT NOT NULL, -- columns above this point are attributes of the template itself -- columns after this point are attributes of the copy this template modifies/creates circ_lib INT REFERENCES actor.org_unit (id) DEFERRABLE INITIALLY DEFERRED, status INT, location INT, loan_duration INT CHECK ( loan_duration IS NULL OR loan_duration IN (1,2,3) ), fine_level INT CHECK ( loan_duration IS NULL OR loan_duration IN (1,2,3) ), age_protect INT, circulate BOOL, deposit BOOL, ref BOOL, holdable BOOL, deposit_amount NUMERIC(6,2), price NUMERIC(8,2), circ_modifier TEXT, circ_as_type TEXT, alert_message TEXT, opac_visible BOOL, floating BOOL, mint_condition BOOL ); ==== Schema ==== === Problems and Solutions === Starting with the database schema, here are some notable problems and possible solutions. - Problem: the current schema provides no means of storing prediction information or reception metadata.\\ Solution: expand the issuance table to include such data. - Problem: Multi-issue subscriptions are not easily accommodated.\\ Solution: Add another abstraction table called 'distribution' to sit between 'subscription' and the other pieces. It would perform many duties of the current subscription table, leaving subscription to deal with global data and to be the linking point to the acquisitions side. Subsequent tables (binding_unit, issuance, *_summary) will then link to 'distribution' rather than 'subscription'. The distribution table will also provide the linking point to the 'record_entry' (MFHD) table (and exist in a 1-to-1 relation with 'record_entry'). - Problem: binding unit is too specific for certain use cases.\\ Solution: Replace the 'binding_unit' table with a more flexible 'shelving_unit' table which can contain any number of single issues and bound volumes. === Proposed Schema Relationship Diagram === {{:acq:serials:serials_schema_diagram_v3.gif|Serials Schema Diagram}} === Proposed Schema === CREATE SCHEMA serial; CREATE TABLE serial.record_entry ( id BIGSERIAL PRIMARY KEY, record BIGINT REFERENCES biblio.record_entry (id) ON DELETE SET NULL DEFERRABLE INITIALLY DEFERRED, owning_lib INT NOT NULL REFERENCES actor.org_unit (id) ON DELETE SET NULL DEFERRABLE INITIALLY DEFERRED, creator INT NOT NULL REFERENCES actor.usr (id) DEFERRABLE INITIALLY DEFERRED, editor INT NOT NULL REFERENCES actor.usr (id) DEFERRABLE INITIALLY DEFERRED, source INT, create_date TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT now(), edit_date TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT now(), active BOOL NOT NULL DEFAULT TRUE, deleted BOOL NOT NULL DEFAULT FALSE, marc TEXT NOT NULL, last_xact_id TEXT NOT NULL ); CREATE INDEX serial_record_entry_creator_idx ON serial.record_entry ( creator ); CREATE INDEX serial_record_entry_editor_idx ON serial.record_entry ( editor ); CREATE INDEX serial_record_entry_owning_lib_idx ON serial.record_entry ( owning_lib, deleted ); CREATE RULE protect_mfhd_delete AS ON DELETE TO serial.record_entry DO INSTEAD UPDATE serial.record_entry SET deleted = true WHERE old.id = serial.record_entry.id; CREATE TABLE serial.subscription ( id SERIAL PRIMARY KEY, start_date TIMESTAMP WITH TIME ZONE NOT NULL, end_date TIMESTAMP WITH TIME ZONE -- interpret NULL as current subscription -- acquisitions/business-side tables link to here ); CREATE TABLE serial.distribution ( id SERIAL PRIMARY KEY, subscription INT NOT NULL REFERENCES serial.subscription (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED, -- distribution to record_entry is 1-to-1 record_entry BIGINT UNIQUE REFERENCES serial.record_entry (id) ON DELETE SET NULL DEFERRABLE INITIALLY DEFERRED, -- owning_lib column allows for a possible 'global' record_entry (e.g. record_entry per consortium, distribution per library) owning_lib INT REFERENCES actor.org_unit (id) ON DELETE SET NULL DEFERRABLE INITIALLY DEFERRED, uri INT REFERENCES asset.uri (id) ON DELETE SET NULL DEFERRABLE INITIALLY DEFERRED, label TEXT NOT NULL, default_call_number BIGINT REFERENCES asset.call_number (id) DEFERRABLE INITIALLY DEFERRED, -- the copy_transparency table was meant for something else, but may serve us here -- ultimately, we need some sort of shared/global copy template (current copy templates are stored as a user setting), -- ideally one which also stores location (rendering the default_location column obsolete) -- copy_transparency INT REFERENCES asset.copy_transparency (id) DEFERRABLE INITIALLY DEFERRED, default_location INT REFERENCES asset.copy_location (id) DEFERRABLE INITIALLY DEFERRED, shelving_unit_label_base TEXT DEFAULT NULL, shelving_unit_label_suffix TEXT DEFAULT NULL ); CREATE TABLE serial.shelving_unit ( id SERIAL PRIMARY KEY, distribution INT NOT NULL REFERENCES serial.distribution (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED, label TEXT NOT NULL, label_sort_key TEXT NOT NULL, call_number BIGINT REFERENCES asset.call_number (id) ON DELETE SET NULL DEFERRABLE INITIALLY DEFERRED, CONSTRAINT su_label_once_per_dist UNIQUE (distribution, label) ); CREATE TABLE serial.issuance ( id SERIAL PRIMARY KEY, creator INT NOT NULL REFERENCES actor.usr (id) DEFERRABLE INITIALLY DEFERRED, editor INT NOT NULL REFERENCES actor.usr (id) DEFERRABLE INITIALLY DEFERRED, create_date TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT now(), edit_date TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT now(), distribution INT NOT NULL REFERENCES serial.distribution (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED, shelving_unit INT REFERENCES serial.shelving_unit (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED, label TEXT DEFAULT NULL, date_published TIMESTAMP WITH TIME ZONE DEFAULT NULL, date_expected TIMESTAMP WITH TIME ZONE DEFAULT NULL, date_received TIMESTAMP WITH TIME ZONE DEFAULT NULL, copies_expected INT NOT NULL DEFAULT 1, copies_received INT NOT NULL DEFAULT 0, has_claims BOOL NOT NULL DEFAULT FALSE, expected_code TEXT DEFAULT NULL -- TODO: add columns for separate enumeration/chronology values? ); -- 1-to-many map of shelving_units (1) to copies (many) CREATE TABLE serial.shelving_unit_copy_map ( id SERIAL PRIMARY KEY, shelving_unit INT NOT NULL REFERENCES serial.shelving_unit (id), copy BIGINT NOT NULL UNIQUE REFERENCES asset.copy (id), CONSTRAINT su_copy_once UNIQUE (shelving_unit,copy) ); CREATE INDEX serial_shelving_unit_copy_map_su_idx ON serial.shelving_unit_copy_map (shelving_unit); CREATE TABLE serial.issuance_note ( id SERIAL PRIMARY KEY, issuance INT NOT NULL REFERENCES serial.issuance (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED, creator INT NOT NULL REFERENCES actor.usr (id) DEFERRABLE INITIALLY DEFERRED, create_date TIMESTAMP WITH TIME ZONE DEFAULT NOW(), pub BOOL NOT NULL DEFAULT FALSE, title TEXT NOT NULL, value TEXT NOT NULL ); CREATE TABLE serial.claim ( id SERIAL PRIMARY KEY, creator INT NOT NULL REFERENCES actor.usr (id) DEFERRABLE INITIALLY DEFERRED, editor INT NOT NULL REFERENCES actor.usr (id) DEFERRABLE INITIALLY DEFERRED, create_date TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT now(), edit_date TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT now(), issuance INT NOT NULL REFERENCES serial.issuance (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED, date_claimed TIMESTAMP WITH TIME ZONE DEFAULT NULL, notes TEXT DEFAULT NULL --vendor key? ); CREATE TABLE serial.bib_summary ( id SERIAL PRIMARY KEY, distribution INT NOT NULL REFERENCES serial.distribution (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED, generated_coverage TEXT NOT NULL, textual_holdings TEXT ); CREATE TABLE serial.sup_summary ( id SERIAL PRIMARY KEY, distribution INT NOT NULL REFERENCES serial.distribution (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED, generated_coverage TEXT NOT NULL, textual_holdings TEXT ); CREATE TABLE serial.index_summary ( id SERIAL PRIMARY KEY, distribution INT NOT NULL REFERENCES serial.distribution (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED, generated_coverage TEXT NOT NULL, textual_holdings TEXT ); ==== MFHD Choices ==== The MFHD standard allows for a fair amount of flexibility in some areas. An eventual goal will be to provide configuration switches to meet the preferences of individual libraries, but the initial implementation will focus on a single working model. A few examples are: === Level of compression === - No compression: When not using compression, MFHD records will have a separate field 863 for every issue. This is generally untenable for anything but the simplest cases (or cases where the serial is unpredictable), and there are no plans to support it at this time. - Partial compression: partially compressed holdings will normally have a separate 863 for each bound volume. This physical level of holding information is more or less duplicated in the 'binding_unit' table and unnecessary for expressing holdings from a coverage perspective. As some libraries wish to express this physical-item/holding relationship in their MFHD records, this option is still planned for future development. - Full compression (**initial plan**): fully compressed holdings express full coverage information in the least possible space. Links to physical items are not supported. The initial receiving code will focus on fully compressed holdings as they meet the primary purpose of holdings information (coverage) with the least effort. //I wonder if, for shoving data into the MFHD record itself, we should leave it uncompressed, and then just use the summary tables to store the fully compressed summary in the three summary tables? -miker// === Automatic vs. manual editing === - Fully automatic: the holdings fields are fully managed by the serials interface. This approach may seem ideal, but it restricts manual editing by knowledgeable librarians and depends upon a fully implemented and flexible model. It also depends on accommodating historical information (pre-Evergreen holdings) in some way. - Fully manual: manually editing holdings with no automation should be possible, but only as a last resort. - Partially automatic (**initial plan**): partial automation will allow a mixture of auto-generated fields and manually edited fields. There is no direct expression of this in the standard, so we will use 'internal note' subfield to tag fields which are under automated control (setting the note to 'EGAUTO' or something similar). Any field lacking the this tag will remain untouched by the automation code. ==== Workflow ==== The overall workflow and interface still needs to be hammered out, but will need to go something like this: - Whenever a serial is created (manually or through import), the MFHD record will be consulted to determine the latest held issue and the serial pattern. - The interface will ask the user to input (or estimate) the last-received date of the serial in order to determine a reasonable expected date (as offset from the chronology date). - The system will use the MFHD record and functions to populate the issuance table with a set number of predicted issues (eventually based on global or serial-level preference, but perhaps set at a sane level such as 30 to start). The system will generate and store a human-readable label of each issuance's enumeration, chronology, expected date, and a flattened 863 representation. //For subscription objects with an end date, we should project the currently purchased run -- that is, for a 5-year subscription, project the next 5 years at purchase time -- and for open-ended subscriptions we should probably use an OU setting with a sane (if not set) default. Eh? -miker// - When the next issue arrives, the user will search for the serial and be presented with a list of predicted issues in the order predicted. The matching issuance will be selected and set as received in some fashion. - The system will reinstantiate the received issuance's 863 and add it to the MFHD record, then sort the 'EGAUTO' tagged 863 fields (if necessary) and compress any which represent continuous holdings (if possible). - The system will then temporarily reinstantiate the last available issuance for the journal in question, predict another issuance based on this instance, then add the new issuance to the database. It is also desirable for the user to have the ability to 'regenerate all predictions' for any given serial. ===== Historical info of potential interest ===== [[acq:serials:proto_schema_geometry|Proto-schema using geometric datatypes for issuance tracking and indexing]] ===== Status Updates ===== * 2010-05-17: Added historical proto-schema * 2010-05-03: Schema and diagram updated * 2010-03-03: Schema proposal update, diagram added * 2009-11: Initial preview