phelp:helpsamerecorddupes [2015/03/20 13:36] |
phelp:helpsamerecorddupes [2021/12/29 16:21] (current) |
||
---|---|---|---|
Line 1: | Line 1: | ||
+ | SAME RECORD DUPLICATE DATA | ||
+ | |||
+ | It is possible to use MARC Global to identify duplicate data within the same record. | ||
+ | |||
+ | Note: If you want to identify records in a file that share a duplicate title, match key (ISBN, LCCN, OCLC), or any other MARC field, use the SORT Utility instead. | ||
+ | |||
+ | To run this type of task, start MARC Global, press Skip, select the ' | ||
+ | |||
+ | On the Options page, enter the tag that you want to check for duplicate data. You can specify a single tag, or an ' | ||
+ | |||
+ | NORMALIZATION | ||
+ | |||
+ | Several normalization options are also available on this form; these normalizations will be applied to each field before it is compared. | ||
+ | |||
+ | Ignore blanks--remove leading and trailing blanks, and compress two blanks to a single blank. | ||
+ | |||
+ | Ignore case--all data (except for MARC subfield codes) will be shifted to uppercase. | ||
+ | |||
+ | Ignore indicators--if whole tags are being compared, the indicators will be removed. | ||
+ | |||
+ | Ignore punctuation--all punctuation marks will be converted to blank spaces, and then multiple blanks will be compressed to a single blank. | ||
+ | |||
+ | Ignore subfield codes--each MARC subfield delimiter (hex 1F) and the byte that immediately follows it (normally a subfield code) will be replaced with a single blank space. | ||
+ | |||
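The normalization options above can be sketched in Python as follows. This is only an illustration under stated assumptions, not MARC Global's actual implementation: the function name `normalize_field`, its keyword arguments, and the order in which the options are applied are all hypothetical, and "punctuation" is taken to mean the ASCII punctuation set.

```python
import re
import string

DELIM = "\x1f"  # MARC subfield delimiter (hex 1F)

# Assumption: "punctuation" means ASCII punctuation; each mark becomes a blank.
PUNCT_TO_BLANK = str.maketrans(string.punctuation, " " * len(string.punctuation))


def normalize_field(indicators, data, *, ignore_blanks=False, ignore_case=False,
                    ignore_indicators=False, ignore_punctuation=False,
                    ignore_subfield_codes=False):
    """Return a comparison key for one field (hypothetical helper).

    The order of the steps below is an assumption; the help text does not
    say in which order the options are applied.
    """
    if ignore_case:
        # Uppercase all data except the byte right after each delimiter.
        parts = data.split(DELIM)
        rest = [p[:1] + p[1:].upper() for p in parts[1:]]
        data = DELIM.join([parts[0].upper()] + rest)
    if ignore_subfield_codes:
        # Replace the delimiter and the byte that follows it with one blank.
        data = re.sub(DELIM + ".?", " ", data, flags=re.DOTALL)
    if ignore_punctuation:
        # Punctuation becomes blanks, then runs of blanks are compressed.
        data = re.sub(" {2,}", " ", data.translate(PUNCT_TO_BLANK))
    if ignore_blanks:
        # Trim leading and trailing blanks and compress repeated blanks.
        data = re.sub(" {2,}", " ", data.strip(" "))
    prefix = "" if ignore_indicators else indicators
    return prefix + data


ALL_OPTIONS = dict(ignore_blanks=True, ignore_case=True, ignore_indicators=True,
                   ignore_punctuation=True, ignore_subfield_codes=True)
key_a = normalize_field("10", "\x1faWhales :\x1fbfiction.", **ALL_OPTIONS)
key_b = normalize_field(" 0", "\x1faWhales\x1fbFiction", **ALL_OPTIONS)
```

With every option switched on, the two sample fields reduce to the same key, which is why they would be reported as duplicates even though their indicators, case, and punctuation differ.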
+ | ADDITIONAL OPTIONS | ||
+ | |||
+ | Two additional options are available for this type of job. 'Find duplicate data only in matching tags' is applicable when the TAG is ' | ||
+ | |||
+ | The other option, ' | ||
+ | |||
+ | NOTES | ||
+ | |||
+ | This type of task can be saved to the saved reviews file. | ||
+ | |||
+ | Two types of Text Output are available for this task: ' | ||
+ | |||
+ | Some of the normalization options may have dependencies. For example, if a subfield is specified for the Tag being searched, then indicators are always ignored. On the other hand, if a subfield is specified for the Tag being searched, the ' | ||
+ | |||
+ | By default, an ' | ||
+ | |||
+ | An ' | ||
+ | For example: | ||
+ | -Control number fields are often repeated (001 and 010; 035 and 9XX) | ||
+ | -Call number fields are often repeated (050 and 090; 082 and 092) | ||
+ | -Authors in 100 often appear in 600 and 700 fields | ||
+ | -Title fields (22X-24X) often duplicate after normalization is applied | ||
+ | -Edition (250) statements are sometimes repeated in XXX notes | ||
+ | -Note (500) fields are often repeated as 650 fields (e.g. ' | ||
+ | -Subjects (6XX) commonly repeat verbatim with only a change in indicators, | ||
+ | -Untraced series (490) headings are almost always repeated in 830 | ||
+ | -Title (24X) and Series (4XX) headings are often repeated in 7XX fields | ||
+ | |||
+ | The list goes on. | ||
+ | |||
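To make the repetition examples above concrete, here is a minimal sketch of how duplicate data inside one record could be grouped after normalization. Everything in it is hypothetical: the record is modelled as a plain list of `(tag, data)` pairs, the `X` wildcard handling in `tag_matches` is an assumption about tag patterns such as 6XX, and the default key only ignores case and extra blanks. It does not describe how MARC Global works internally.

```python
from collections import defaultdict


def tag_matches(tag, pattern):
    """True if tag fits pattern; 'X' in the pattern matches any character.

    Assumption: patterns like '6XX' or '24X' are read position by position.
    """
    return len(tag) == len(pattern) and all(
        p in ("X", "x") or p == t for t, p in zip(tag, pattern)
    )


def find_same_record_dupes(fields, pattern="XXX",
                           key=lambda s: " ".join(s.upper().split())):
    """Group the fields of ONE record by normalized data.

    fields  -- list of (tag, data) pairs (hypothetical record model)
    pattern -- tag or tag pattern to check
    key     -- normalization applied before comparing (default: ignore
               case and extra blanks)
    Returns {normalized data: [tags]} for data that occurs more than once.
    """
    groups = defaultdict(list)
    for tag, data in fields:
        if tag_matches(tag, pattern):
            groups[key(data)].append(tag)
    return {k: tags for k, tags in groups.items() if len(tags) > 1}


record = [
    ("245", "Moby Dick"),
    ("490", "Penguin classics"),
    ("830", "Penguin  Classics"),
]
dupes_all = find_same_record_dupes(record, "XXX")
dupes_4xx = find_same_record_dupes(record, "4XX")
```

Checking all tags reports the 490/830 series repetition, while restricting the check to 4XX reports nothing, because the 830 field is no longer part of the comparison.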