XLGPR Crosscheck
The purpose of this crosscheck is to make sure that a large print record does not match a similar record for a non-large print item. To do this, we examine the fields listed under Data extraction and answer one question for each record: does this record represent a large print item (True or False)?
Pre-processing
- None
Data extraction
- The GMD (general material designation) is extracted from the 245 $h
- The edition statement is extracted from the 250 $a
- The description is extracted from the 300 $a
- Each occurrence of the 500 field is extracted (all subfields)
- Each occurrence of fields 600-655 is extracted (all subfields)
- The publisher is extracted from the 260 $b (first occurrence)
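The extraction list above can be sketched as follows. This is a minimal illustration, not the actual implementation: the record layout (a list of `(tag, [(subfield code, value), ...])` tuples) and the names `first_subfield`, `all_subfields`, and `extract_xlgpr_data` are hypothetical, and "first occurrence" is assumed to mean the first $b found in any 260 field.

```python
from typing import Dict, List, Tuple, Union

# Hypothetical record layout: [(tag, [(subfield code, value), ...]), ...]
Subfields = List[Tuple[str, str]]
Record = List[Tuple[str, Subfields]]


def first_subfield(record: Record, tag: str, code: str) -> str:
    """Return the first value of subfield `code` in field `tag`, or ''."""
    for field_tag, subfields in record:
        if field_tag == tag:
            for sub_code, value in subfields:
                if sub_code == code:
                    return value
    return ""


def all_subfields(record: Record, low: int, high: int) -> List[str]:
    """Return one string per field occurrence in low..high, joining all subfields."""
    out = []
    for field_tag, subfields in record:
        # Non-numeric tags are skipped rather than raising an error.
        if field_tag.isdigit() and low <= int(field_tag) <= high:
            out.append(" ".join(value for _, value in subfields))
    return out


def extract_xlgpr_data(record: Record) -> Dict[str, Union[str, List[str]]]:
    """Collect the data this crosscheck looks at, in the order listed above."""
    return {
        "gmd": first_subfield(record, "245", "h"),
        "edition": first_subfield(record, "250", "a"),
        "description": first_subfield(record, "300", "a"),
        "notes": all_subfields(record, 500, 500),
        "subjects": all_subfields(record, 600, 655),
        "publisher": first_subfield(record, "260", "b"),
    }
```

Python dictionaries preserve insertion order, so iterating over the result visits the data in the same order as the list above.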
Normalization
- Standard normalization is applied to all extracted data
- The GMD used in this crosscheck undergoes custom normalization
Processing rules
- The extracted data is checked, in the order listed above, for any occurrence of the word 'LARGE' followed by either of the words 'PRINT' or 'TYPE'. If this phrase is found, the large print flag for the record is set to True.
- If the phrase is not found, several well-known large print publisher strings ('CHIVERS', 'THORNDYKE', 'ULVERSCROFT') are matched against the 260 $b. If any of them is found, the large print flag for the record is set to True.
If both records represent large print items, or neither record represents a large print item, the crosscheck passes.
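The rules above can be sketched as follows. This is an illustration under stated assumptions, not the actual implementation: the names `is_large_print` and `xlgpr_passes` are hypothetical, upper-casing stands in for the normalization step (which this page does not specify), 'followed by' is taken to mean separated only by whitespace, and the publisher strings are matched as substrings of the 260 $b.

```python
import re
from typing import Iterable

# Assumption: 'LARGE' and 'PRINT'/'TYPE' are separated only by whitespace.
LARGE_PRINT_RE = re.compile(r"\bLARGE\s+(?:PRINT|TYPE)\b")

# Publisher strings exactly as listed in the rules above.
LARGE_PRINT_PUBLISHERS = ("CHIVERS", "THORNDYKE", "ULVERSCROFT")


def is_large_print(texts: Iterable[str], publisher: str) -> bool:
    """Return the large print flag for one record.

    `texts` holds the extracted data in extraction order; `publisher`
    is the 260 $b value (empty string if absent).
    """
    # Rule 1: look for the phrase in the extracted data, in order.
    for text in texts:
        if LARGE_PRINT_RE.search(text.upper()):  # upper() stands in for normalization
            return True
    # Rule 2: fall back to the well-known publisher strings.
    pub = publisher.upper()
    return any(name in pub for name in LARGE_PRINT_PUBLISHERS)


def xlgpr_passes(flag_a: bool, flag_b: bool) -> bool:
    """The crosscheck passes when both flags are True or both are False."""
    return flag_a == flag_b
```

For example, a record whose GMD reads "[text (large print)]" is flagged by the first rule, while a record with no such phrase but a 260 $b of "Ulverscroft" is flagged by the second; matched against each other, those two records pass.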
