User Tools

Site Tools


acq:serials:basic_predicting_and_receiving

Proposal: Serials - Basic Predicting and Receiving

FIXME This proposal is still in progress, please do not edit yet FIXME

Okay - but let's keep a link to the latest schema diagram handy, at least.

Summary

Now that MFHD records can be added and are linked to bib records, the next step is to start populating the issuance table in order to predict and receive issues. As issues are received the MFHD record must be updated accordingly. This proposal will focus on fleshing-out the database schema and defining how the MFHD record will hook in.

Details

Proposed Schema - Major Revision

The following is a work-in-progress rewrite of the schema originally proposed (still shown below). It is changing rapidly and may contain some syntax errors! Primary changes include a de-emphasis of MFHD records and a more direct representation of individual issues.

DROP SCHEMA serial CASCADE;

BEGIN;

CREATE SCHEMA serial;

CREATE TABLE serial.record_entry (
	id		BIGSERIAL	PRIMARY KEY,
	record		BIGINT		REFERENCES biblio.record_entry (id) ON DELETE SET NULL DEFERRABLE INITIALLY DEFERRED,
	owning_lib	INT		REFERENCES actor.org_unit (id) ON DELETE SET NULL DEFERRABLE INITIALLY DEFERRED,
	creator		INT		NOT NULL REFERENCES actor.usr (id) DEFERRABLE INITIALLY DEFERRED,
	editor		INT		NOT NULL REFERENCES actor.usr (id) DEFERRABLE INITIALLY DEFERRED,
	source		INT,
	create_date	TIMESTAMP WITH TIME ZONE	NOT NULL DEFAULT now(),
	edit_date	TIMESTAMP WITH TIME ZONE	NOT NULL DEFAULT now(),
	active		BOOL		NOT NULL DEFAULT TRUE,
	deleted		BOOL		NOT NULL DEFAULT FALSE,
	marc		TEXT		NOT NULL,
	last_xact_id	TEXT		NOT NULL
);

CREATE INDEX serial_record_entry_creator_idx ON serial.record_entry ( creator );
CREATE INDEX serial_record_entry_editor_idx ON serial.record_entry ( editor );
CREATE INDEX serial_record_entry_owning_lib_idx ON serial.record_entry ( owning_lib, deleted );

CREATE RULE protect_mfhd_delete AS ON DELETE TO serial.record_entry DO INSTEAD UPDATE serial.record_entry SET deleted = true WHERE old.id = serial.record_entry.id;

CREATE TABLE serial.subscription (
	id			SERIAL		PRIMARY KEY,
	start_date		TIMESTAMP WITH TIME ZONE	NOT NULL,
	end_date		TIMESTAMP WITH TIME ZONE,	-- interpret NULL as current subscription
        record_entry    	BIGINT		REFERENCES serial.record_entry (id) ON DELETE SET NULL DEFERRABLE INITIALLY DEFERRED,
	expected_date_offset	INTERVAL	DEFAULT NULL
	-- acquisitions/business-side tables link to here
);

--at least one distribution per org_unit holding issues
CREATE TABLE serial.distribution (
        id              	SERIAL  PRIMARY KEY,
        subscription    	INT     NOT NULL REFERENCES serial.subscription (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED,
	holding_lib		INT	REFERENCES actor.org_unit (id) ON DELETE SET NULL DEFERRABLE INITIALLY DEFERRED,
        label			TEXT    NOT NULL,
	receive_call_number	BIGINT	REFERENCES asset.call_number (id) DEFERRABLE INITIALLY DEFERRED,
	receive_unit_template	INT	REFERENCES asset.copy_template (id) DEFERRABLE INITIALLY DEFERRED,
	bind_call_number	BIGINT	REFERENCES asset.call_number (id) DEFERRABLE INITIALLY DEFERRED,
	bind_unit_template	INT	REFERENCES asset.copy_template (id) DEFERRABLE INITIALLY DEFERRED,
        unit_label_base		TEXT    DEFAULT NULL,
        unit_label_suffix	TEXT    DEFAULT NULL
);

CREATE TABLE serial.stream (
        id             	SERIAL  PRIMARY KEY,
        distribution   	INT     NOT NULL REFERENCES serial.distribution (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED
);

CREATE TABLE serial.issuance (
	id		SERIAL	PRIMARY KEY,
	creator		INT	NOT NULL REFERENCES actor.usr (id) DEFERRABLE INITIALLY DEFERRED,
        editor		INT	NOT NULL REFERENCES actor.usr (id) DEFERRABLE INITIALLY DEFERRED,
        create_date	TIMESTAMP WITH TIME ZONE        NOT NULL DEFAULT now(),
        edit_date	TIMESTAMP WITH TIME ZONE        NOT NULL DEFAULT now(),
	subscription	INT	NOT NULL REFERENCES serial.subscription (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED,
	label		TEXT	DEFAULT NULL,
        date_published	TIMESTAMP WITH TIME ZONE	DEFAULT NULL,
        holding_code	TEXT	DEFAULT NULL,
        holding_type	TEXT	DEFAULT NULL CHECK ( holding_type IS NULL OR holding_type IN ('basic','supplement','index') ),
        holding_link_id	INT	DEFAULT NULL
	-- TODO: add columns for separate enumeration/chronology values
);

CREATE TABLE serial.item (
	id		SERIAL	PRIMARY KEY,
	creator		INT	NOT NULL REFERENCES actor.usr (id) DEFERRABLE INITIALLY DEFERRED,
        editor		INT	NOT NULL REFERENCES actor.usr (id) DEFERRABLE INITIALLY DEFERRED,
        create_date	TIMESTAMP WITH TIME ZONE        NOT NULL DEFAULT now(),
        edit_date	TIMESTAMP WITH TIME ZONE        NOT NULL DEFAULT now(),
	issuance	INT	NOT NULL REFERENCES serial.issuance (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED,
	stream		INT	NOT NULL REFERENCES serial.stream (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED,
	unit		INT	REFERENCES serial.unit (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED,
	uri		INT	REFERENCES asset.uri (id) ON DELETE SET NULL DEFERRABLE INITIALLY DEFERRED,
        date_expected	TIMESTAMP WITH TIME ZONE	DEFAULT NULL,
        date_received	TIMESTAMP WITH TIME ZONE	DEFAULT NULL
);

CREATE TABLE serial.unit (
	label		TEXT	DEFAULT NULL,
	label_sort_key	TEXT	DEFAULT NULL,
	contents	TEXT	NOT NULL
) INHERITS (asset.copy);

CREATE TABLE serial.item_note (
	id		SERIAL	PRIMARY KEY,
	item		INT	NOT NULL REFERENCES serial.item (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED,
	creator		INT	NOT NULL REFERENCES actor.usr (id) DEFERRABLE INITIALLY DEFERRED,
	create_date	TIMESTAMP WITH TIME ZONE	DEFAULT NOW(),
	pub		BOOL	NOT NULL DEFAULT FALSE,
	title		TEXT	NOT NULL,
	value		TEXT	NOT NULL
);

CREATE TABLE serial.bib_summary (
	id			SERIAL	PRIMARY KEY,
	distribution		INT	NOT NULL REFERENCES serial.distribution (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED,
	generated_coverage	TEXT	NOT NULL,
	textual_holdings	TEXT
);

CREATE TABLE serial.sup_summary (
	id			SERIAL	PRIMARY KEY,
	distribution		INT	NOT NULL REFERENCES serial.distribution (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED,
	generated_coverage	TEXT	NOT NULL,
	textual_holdings	TEXT
);

CREATE TABLE serial.index_summary (
	id			SERIAL	PRIMARY KEY,
	distribution		INT	NOT NULL REFERENCES serial.distribution (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED,
	generated_coverage	TEXT	NOT NULL,
	textual_holdings	TEXT
);

COMMIT;

-- to be moved to asset schema file
CREATE TABLE asset.copy_template (
	id		SERIAL	PRIMARY KEY,
	owning_lib	INT	NOT NULL REFERENCES actor.org_unit (id) DEFERRABLE INITIALLY DEFERRED,
	creator		BIGINT	NOT NULL REFERENCES actor.usr (id) DEFERRABLE INITIALLY DEFERRED,
	editor		BIGINT	NOT NULL REFERENCES actor.usr (id) DEFERRABLE INITIALLY DEFERRED,
	create_date	TIMESTAMP WITH TIME ZONE	DEFAULT NOW(),
	edit_date	TIMESTAMP WITH TIME ZONE	DEFAULT NOW(),
	name		TEXT	NOT NULL,
	-- columns above this point are attributes of the template itself
	-- columns after this point are attributes of the copy this template modifies/creates
	circ_lib	INT	REFERENCES actor.org_unit (id) DEFERRABLE INITIALLY DEFERRED,
	status		INT,
	location	INT,
	loan_duration	INT	CHECK ( loan_duration IS NULL OR loan_duration IN (1,2,3) ),
	fine_level	INT	CHECK ( loan_duration IS NULL OR loan_duration IN (1,2,3) ),
	age_protect	INT,
	circulate	BOOL,
	deposit		BOOL,
	ref		BOOL,
	holdable	BOOL,
	deposit_amount	NUMERIC(6,2),
	price		NUMERIC(8,2),
	circ_modifier	TEXT,
	circ_as_type	TEXT,
	alert_message	TEXT,
	opac_visible	BOOL,
	floating	BOOL,
	mint_condition	BOOL
);

Schema

Problems and Solutions

Starting with the database schema, here are some notable problems and possible solutions.

  1. Problem: the current schema provides no means of storing prediction information or reception metadata.
    Solution: expand the issuance table to include such data.
  2. Problem: Multi-issue subscriptions are not easily accommodated.
    Solution: Add another abstraction table called 'distribution' to sit between 'subscription' and the other pieces. It would perform many duties of the current subscription table, leaving subscription to deal with global data and to be the linking point to the acquisitions side. Subsequent tables (binding_unit, issuance, *_summary) will then link to 'distribution' rather than 'subscription'. The distribution table will also provide the linking point to the 'record_entry' (MFHD) table (and exist in a 1-to-1 relation with 'record_entry').
  3. Problem: binding unit is too specific for certain use cases.
    Solution: Replace the 'binding_unit' table with a more flexible 'shelving_unit' table which can contain any number of single issues and bound volumes.

Proposed Schema Relationship Diagram

Serials Schema Diagram

Proposed Schema

CREATE SCHEMA serial;

CREATE TABLE serial.record_entry (
	id		BIGSERIAL	PRIMARY KEY,
	record		BIGINT		REFERENCES biblio.record_entry (id) ON DELETE SET NULL DEFERRABLE INITIALLY DEFERRED,
	owning_lib	INT		NOT NULL REFERENCES actor.org_unit (id) ON DELETE SET NULL DEFERRABLE INITIALLY DEFERRED,
	creator		INT		NOT NULL REFERENCES actor.usr (id) DEFERRABLE INITIALLY DEFERRED,
	editor		INT		NOT NULL REFERENCES actor.usr (id) DEFERRABLE INITIALLY DEFERRED,
	source		INT,
	create_date	TIMESTAMP WITH TIME ZONE	NOT NULL DEFAULT now(),
	edit_date	TIMESTAMP WITH TIME ZONE	NOT NULL DEFAULT now(),
	active		BOOL		NOT NULL DEFAULT TRUE,
	deleted		BOOL		NOT NULL DEFAULT FALSE,
	marc		TEXT		NOT NULL,
	last_xact_id	TEXT		NOT NULL
);

CREATE INDEX serial_record_entry_creator_idx ON serial.record_entry ( creator );
CREATE INDEX serial_record_entry_editor_idx ON serial.record_entry ( editor );
CREATE INDEX serial_record_entry_owning_lib_idx ON serial.record_entry ( owning_lib, deleted );

CREATE RULE protect_mfhd_delete AS ON DELETE TO serial.record_entry DO INSTEAD UPDATE serial.record_entry SET deleted = true WHERE old.id = serial.record_entry.id;

CREATE TABLE serial.subscription (
	id		SERIAL	PRIMARY KEY,
	start_date	TIMESTAMP WITH TIME ZONE	NOT NULL,
	end_date	TIMESTAMP WITH TIME ZONE	-- interpret NULL as current subscription 
	-- acquisitions/business-side tables link to here
);

CREATE TABLE serial.distribution (
        id              		SERIAL  PRIMARY KEY,
        subscription    		INT     NOT NULL REFERENCES serial.subscription (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED,
	-- distribution to record_entry is 1-to-1
        record_entry    		BIGINT  UNIQUE REFERENCES serial.record_entry (id) ON DELETE SET NULL DEFERRABLE INITIALLY DEFERRED,
	-- owning_lib column allows for a possible 'global' record_entry (e.g. record_entry per consortium, distribution per library)
	owning_lib			INT	REFERENCES actor.org_unit (id) ON DELETE SET NULL DEFERRABLE INITIALLY DEFERRED,
	uri				INT	REFERENCES asset.uri (id) ON DELETE SET NULL DEFERRABLE INITIALLY DEFERRED,
        label				TEXT    NOT NULL,
	default_call_number		BIGINT	REFERENCES asset.call_number (id) DEFERRABLE INITIALLY DEFERRED,
	-- the copy_transparency table was meant for something else, but may serve us here
	-- ultimately, we need some sort of shared/global copy template (current copy templates are stored as a user setting),
	-- ideally one which also stores location (rendering the default_location column obsolete)
	-- copy_transparency		INT	REFERENCES asset.copy_transparency (id) DEFERRABLE INITIALLY DEFERRED,
	default_location		INT	REFERENCES asset.copy_location (id) DEFERRABLE INITIALLY DEFERRED,
        shelving_unit_label_base	TEXT    DEFAULT NULL,
        shelving_unit_label_suffix	TEXT    DEFAULT NULL
);

CREATE TABLE serial.shelving_unit (
	id			SERIAL	PRIMARY KEY,
	distribution		INT	NOT NULL REFERENCES serial.distribution (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED,
	label			TEXT	NOT NULL,
	label_sort_key		TEXT	NOT NULL,
        call_number     	BIGINT  REFERENCES asset.call_number (id) ON DELETE SET NULL DEFERRABLE INITIALLY DEFERRED,
	CONSTRAINT su_label_once_per_dist UNIQUE (distribution, label)
);

CREATE TABLE serial.issuance (
	id		SERIAL	PRIMARY KEY,
	creator		INT	NOT NULL REFERENCES actor.usr (id) DEFERRABLE INITIALLY DEFERRED,
        editor		INT	NOT NULL REFERENCES actor.usr (id) DEFERRABLE INITIALLY DEFERRED,
        create_date	TIMESTAMP WITH TIME ZONE        NOT NULL DEFAULT now(),
        edit_date	TIMESTAMP WITH TIME ZONE        NOT NULL DEFAULT now(),
	distribution	INT	NOT NULL REFERENCES serial.distribution (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED,
	shelving_unit	INT	REFERENCES serial.shelving_unit (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED,
	label		TEXT	DEFAULT NULL,
        date_published	TIMESTAMP WITH TIME ZONE	DEFAULT NULL,
        date_expected	TIMESTAMP WITH TIME ZONE	DEFAULT NULL,
        date_received	TIMESTAMP WITH TIME ZONE	DEFAULT NULL,
        copies_expected	INT	NOT NULL DEFAULT 1,
        copies_received	INT	NOT NULL DEFAULT 0,
        has_claims	BOOL	NOT NULL DEFAULT FALSE,
        expected_code	TEXT	DEFAULT NULL
	-- TODO: add columns for separate enumeration/chronology values?
);

-- 1-to-many map of shelving_units (1) to copies (many)
CREATE TABLE serial.shelving_unit_copy_map (
	id		SERIAL	PRIMARY KEY,
	shelving_unit	INT	NOT NULL REFERENCES serial.shelving_unit (id),
	copy		BIGINT	NOT NULL UNIQUE REFERENCES asset.copy (id),
	CONSTRAINT su_copy_once UNIQUE (shelving_unit,copy)
);
CREATE INDEX serial_shelving_unit_copy_map_su_idx ON serial.shelving_unit_copy_map (shelving_unit);

CREATE TABLE serial.issuance_note (
	id		SERIAL	PRIMARY KEY,
	issuance	INT	NOT NULL REFERENCES serial.issuance (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED,
	creator		INT	NOT NULL REFERENCES actor.usr (id) DEFERRABLE INITIALLY DEFERRED,
	create_date	TIMESTAMP WITH TIME ZONE	DEFAULT NOW(),
	pub		BOOL	NOT NULL DEFAULT FALSE,
	title		TEXT	NOT NULL,
	value		TEXT	NOT NULL
);

CREATE TABLE serial.claim (
	id		SERIAL	PRIMARY KEY,
	creator		INT	NOT NULL REFERENCES actor.usr (id) DEFERRABLE INITIALLY DEFERRED,
        editor		INT	NOT NULL REFERENCES actor.usr (id) DEFERRABLE INITIALLY DEFERRED,
        create_date	TIMESTAMP WITH TIME ZONE        NOT NULL DEFAULT now(),
        edit_date	TIMESTAMP WITH TIME ZONE        NOT NULL DEFAULT now(),
	issuance	INT	NOT NULL REFERENCES serial.issuance (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED,
        date_claimed	TIMESTAMP WITH TIME ZONE	DEFAULT NULL,
	notes		TEXT	DEFAULT NULL
	--vendor key?
);

CREATE TABLE serial.bib_summary (
	id			SERIAL	PRIMARY KEY,
	distribution		INT	NOT NULL REFERENCES serial.distribution (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED,
	generated_coverage	TEXT	NOT NULL,
	textual_holdings	TEXT
);

CREATE TABLE serial.sup_summary (
	id			SERIAL	PRIMARY KEY,
	distribution		INT	NOT NULL REFERENCES serial.distribution (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED,
	generated_coverage	TEXT	NOT NULL,
	textual_holdings	TEXT
);

CREATE TABLE serial.index_summary (
	id			SERIAL	PRIMARY KEY,
	distribution		INT	NOT NULL REFERENCES serial.distribution (id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED,
	generated_coverage	TEXT	NOT NULL,
	textual_holdings	TEXT
);

MFHD Choices

The MFHD standard allows for a fair amount of flexibility in some areas. An eventual goal will be to provide configuration switches to meet the preferences of individual libraries, but the initial implementation will focus on a single working model. A few examples are:

Level of compression

  1. No compression: When not using compression, MFHD records will have a separate field 863 for every issue. This is generally untenable for anything but the simplest cases (or cases where the serial is unpredictable), and there are no plans to support it at this time.
  2. Partial compression: partially compressed holdings will normally have a separate 863 for each bound volume. This physical level of holding information is more or less duplicated in the 'binding_unit' table and unnecessary for expressing holdings from a coverage perspective. As some libraries wish to express this physical-item/holding relationship in their MFHD records, this option is still planned for future development.
  3. Full compression (initial plan): fully compressed holdings express full coverage information in the least possible space. Links to physical items are not supported. The initial receiving code will focus on fully compressed holdings as they meet the primary purpose of holdings information (coverage) with the least effort. I wonder if, for shoving data into the MFHD record itself, we should leave it uncompressed, and then just use the summary tables to store the fully compressed summary in the three summary tables? -miker

Automatic vs. manual editing

  1. Fully automatic: the holdings fields are fully managed by the serials interface. This approach may seem ideal, but it restricts manual editing by knowledgeable librarians and depends upon a fully implemented and flexible model. It also depends on accommodating historical information (pre-Evergreen holdings) in some way.
  2. Fully manual: manually editing holdings with no automation should be possible, but only as a last resort.
  3. Partially automatic (initial plan): partial automation will allow a mixture of auto-generated fields and manually edited fields. There is no direct expression of this in the standard, so we will use 'internal note' subfield to tag fields which are under automated control (setting the note to 'EGAUTO' or something similar). Any field lacking the this tag will remain untouched by the automation code.

Workflow

The overall workflow and interface still needs to be hammered out, but will need to go something like this:

  1. Whenever a serial is created (manually or through import), the MFHD record will be consulted to determine the latest held issue and the serial pattern.
  2. The interface will ask the user to input (or estimate) the last-received date of the serial in order to determine a reasonable expected date (as offset from the chronology date).
  3. The system will use the MFHD record and functions to populate the issuance table with a set number of predicted issues (eventually based on global or serial-level preference, but perhaps set at a sane level such as 30 to start). The system will generate and store a human-readable label of each issuance's enumeration, chronology, expected date, and a flattened 863 representation. For subscription objects with an end date, we should project the currently purchased run – that is, for a 5-year subscription, project the next 5 years at purchase time – and for open-ended subscriptions we should probably use an OU setting with a sane (if not set) default. Eh? -miker
  4. When the next issue arrives, the user will search for the serial and be presented with a list of predicted issues in the order predicted. The matching issuance will be selected and set as received in some fashion.
  5. The system will reinstantiate the received issuance's 863 and add it to the MFHD record, then sort the 'EGAUTO' tagged 863 fields (if necessary) and compress any which represent continuous holdings (if possible).
  6. The system will then temporarily reinstantiate the last available issuance for the journal in question, predict another issuance based on this instance, then add the new issuance to the database.

It is also desirable for the user to have the ability to 'regenerate all predictions' for any given serial.

Historical info of potential interest

Status Updates

  • 2010-05-17: Added historical proto-schema
  • 2010-05-03: Schema and diagram updated
  • 2010-03-03: Schema proposal update, diagram added
  • 2009-11: Initial preview
acq/serials/basic_predicting_and_receiving.txt · Last modified: 2022/02/10 13:34 by 127.0.0.1

Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Share Alike 4.0 International
CC Attribution-Share Alike 4.0 International Donate Powered by PHP Valid HTML5 Valid CSS Driven by DokuWiki

© 2008-2022 GPLS and others. Evergreen is open source software, freely licensed under GNU GPLv2 or later.
The Evergreen Project is a U.S. 501(c)3 non-profit organization.