Running RDA Automation in batch-mode

The option to run RDA automation on a file of records is found at the bottom of the 'Utilities' menu.

Selecting this option displays a form very similar to the RDA automation section of the main options.

The only difference in the options themselves is that here the 'Save record' option has been changed to 'Split output'.

In addition, the bottom of the form contains

  • a 'Run' button, which starts the job running on the selected file
  • a 'Cancel' button, which either closes the form or, if a job is running, interrupts it
  • a bar that tracks the job progress and provides brief status information

Usage

The first step is to select the MARC file to run the automation on.

One can make this selection either from the 'File|Select MARC' option, or from the 'Select MARC file' option at the top of the Utilities menu.

After that, simply adjust any options to be changed, and press the 'Run' button.
For a complete description of the RDA Automation options, follow this link.

When the 'Run' button is pressed, and for as long as the job is running, the caption of the 'Cancel' button changes to 'Stop', making it more obvious that pressing it will stop the job. Note that once stopped, a job cannot be resumed; it has to be started again from the beginning.

Once the job has completed, or been manually interrupted, a new button named 'View Log' appears to the right of the 'Run' button. Pressing this button will display various statistics on the job, as well as a complete and detailed list of changes performed.

Log

The log contains three sections:

  1. Report header
  2. Record-level details
  3. Summary details and statistics

Note that the size of the log is directly related to the size of the source file: running the utility on a large MARC file creates a large log. As a rough estimate, allow about 10 MB of log for every 100,000 source records.
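As a rough planning aid, the rule of thumb above can be turned into a small calculation. This is only a sketch: the function name is hypothetical, and the ratio of 10 MB per 100,000 records is the approximate figure quoted above, not a guarantee of the actual log size.

```python
def estimate_log_size_mb(record_count, mb_per_100k=10.0):
    """Rough log size in MB, assuming about 10 MB of log per 100,000 records."""
    return record_count / 100_000 * mb_per_100k


# Example: a source file of 250,000 records.
estimate = estimate_log_size_mb(250_000)
print(f"Estimated log size: about {estimate:.0f} MB")
```

Adjust `mb_per_100k` if your own runs show a different ratio; the amount of detail logged per record depends on how many tags are changed.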

The first section of the log lists the complete pathnames for all of the files involved in the run (including the log itself).

Hybridization report 

Job run on:          07/28/16 8:54:47 AM
Source file:         D:\un\_marc_big\LcEnglish-1607.mrc
Results file:        D:\un\_marc_big\LcEnglish-1607-hybridized.mrc
Split file (n/c):    D:\un\_marc_big\LcEnglish-1607-not-changed.mrc
Log file:            D:\un\_marc_big\LcEnglish-1607-hybridized.log.txt
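If the file names in the header need to be picked up by a script, the header can be read as simple 'label: value' pairs. The following is a minimal sketch, assuming the header always has the layout shown above (a title line, a blank line, then one pair per line); the function name is hypothetical and not part of the program.

```python
def parse_report_header(log_text):
    """Collect 'label: value' pairs from the first block of the log."""
    header = {}
    for raw in log_text.splitlines():
        line = raw.strip()
        if not line:
            if header:
                break  # a blank line after the pairs ends the header
            continue
        if ":" not in line:
            continue  # skip the 'Hybridization report' title line
        # Split on the first colon only, so paths and times stay intact.
        label, value = line.split(":", 1)
        header[label.strip()] = value.strip()
    return header


# Header lines copied from the sample log above (raw string keeps backslashes).
sample = r"""Hybridization report

Job run on:          07/28/16 8:54:47 AM
Source file:         D:\un\_marc_big\LcEnglish-1607.mrc
Results file:        D:\un\_marc_big\LcEnglish-1607-hybridized.mrc
Split file (n/c):    D:\un\_marc_big\LcEnglish-1607-not-changed.mrc
Log file:            D:\un\_marc_big\LcEnglish-1607-hybridized.log.txt

Record #1
"""

header = parse_report_header(sample)
for label, value in header.items():
    print(f"{label} = {value}")
```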

This filename list is followed by a detailed, record-level log of changes. This section makes up the bulk of the report, and may be quite large when processing a large source file.

An entry is made for each record processed and includes:

  • the record sequence number (in the source file, not the results file)
  • the record control number (set this up on the 'Batch reports' page of the options)
  • a statement of the result of running the automation on the record
  • the value of each RDA tag that was modified

Record #1
001:  00000004
Success--336, 337, and 338 added
336	$atext
337	$aunmediated
338	$avolume

Record #2
001:  2009509911
No change--already contains (3) 33X tags
Record output to split file

Record #3
001:  93517815
Fallback option applied: Do not fill unfilled tags (1)
336	$atwo-dimensional moving image
337	$aprojected
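Because each entry follows the same pattern, the record-level section lends itself to simple scripted post-processing. The sketch below is an illustration only. It assumes that every entry starts with a 'Record #n' line and ends at a blank line, that the control number is reported under tag 001 as in the sample, and that modified tags are listed as the tag, whitespace, then the subfield data. All names in the code are hypothetical.

```python
import re

RECORD_START = re.compile(r"^Record #(\d+)$")
CONTROL_LINE = re.compile(r"^001:\s*(.*)$")    # assumes the control number is the 001
TAG_LINE = re.compile(r"^(33[678])\s+(.*)$")   # e.g. '336<tab>$atext'


def parse_record_entries(log_text):
    """Return one dict per 'Record #n' entry in the record-level section."""
    entries = []
    current = None
    for raw in log_text.splitlines():
        line = raw.strip()
        start = RECORD_START.match(line)
        if start:
            current = {"seq": int(start.group(1)), "control": None,
                       "messages": [], "tags": []}
            entries.append(current)
        elif current is None:
            continue          # text outside any entry
        elif not line:
            current = None    # a blank line closes the entry
        elif CONTROL_LINE.match(line):
            current["control"] = CONTROL_LINE.match(line).group(1)
        elif TAG_LINE.match(line):
            tag, data = TAG_LINE.match(line).groups()
            current["tags"].append((tag, data))
        else:
            current["messages"].append(line)  # result statements
    return entries


# Entries copied from the sample log above.
sample = """Record #1
001:  00000004
Success--336, 337, and 338 added
336\t$atext
337\t$aunmediated
338\t$avolume

Record #2
001:  2009509911
No change--already contains (3) 33X tags
Record output to split file

Record #3
001:  93517815
Fallback option applied: Do not fill unfilled tags (1)
336\t$atwo-dimensional moving image
337\t$aprojected
"""

entries = parse_record_entries(sample)
for entry in entries:
    print(entry["seq"], entry["control"], entry["messages"][0], len(entry["tags"]))
```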

To access the summary statistics for the run, scroll down towards the end of the log (pressing Ctrl+End is probably the fastest way to get there) until you reach the line that reads 'Summary stats':

Summary stats

Records in:   2429
Records out:  2429
  -> Results file:   2332
  -> Split file:       97

Success--336, 337, and 338 added: 2330
No change--already contains (3+) 33X tags: 4
No change--already contains (3) 33X tags: 93
Fallback option applied: Do not fill unfilled tags (1 unfilled): 2

Here we get the record counts, including a breakdown of how many records were written to each of the two output files. Records successfully processed by the automation, and records to which a fallback option was applied, are written to the results file; records that were not processed because they were already OK are written to the split file.
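The relationship between these counts can be checked mechanically: the number of records out should equal the results file count plus the split file count. The following sketch assumes the summary lines have the 'label: number' form shown above and that the summary ends where the 'Detailed stats' line begins; the function and variable names are hypothetical.

```python
import re

# Matches lines such as 'Records in:   2429' or '-> Split file:       97'.
COUNT_LINE = re.compile(r"^(?:->\s*)?(.+):\s*(\d+)$")


def parse_summary_counts(log_text):
    """Return {label: count} for the lines that follow 'Summary stats'."""
    counts = {}
    in_summary = False
    for raw in log_text.splitlines():
        line = raw.strip()
        if line == "Summary stats":
            in_summary = True
        elif line == "Detailed stats":
            break  # the frequency table starts here
        elif in_summary:
            match = COUNT_LINE.match(line)
            if match:
                counts[match.group(1).strip()] = int(match.group(2))
    return counts


# Summary lines copied from the sample log above.
sample = """Summary stats

Records in:   2429
Records out:  2429
  -> Results file:   2332
  -> Split file:       97

Success--336, 337, and 338 added: 2330
No change--already contains (3+) 33X tags: 4
No change--already contains (3) 33X tags: 93
Fallback option applied: Do not fill unfilled tags (1 unfilled): 2
"""

counts = parse_summary_counts(sample)
outputs_balance = (
    counts["Records out"] == counts["Results file"] + counts["Split file"]
)
print("Output files account for all records out:", outputs_balance)
```

In the sample run, 2332 + 97 = 2429, so the two output files account for every record written.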

This is followed by another breakdown of the results. The first line, labeled 'Success', counts the number of records where automation filled all three tags (336, 337, and 338). The lines that follow vary according to the fallback option in effect.

The report concludes with a frequency table that lists each 336+337+338 value, followed by the number of times it occurred in the file, sorted from highest to lowest. This list may be large depending on the file size and the nature of the collection, so we will only show a snippet of it here:

Detailed stats

text; unmediated; volume: 1958
performed music; audio; audio disc: 220
notated music; unmediated; volume: 61
text; microform; microfilm reel: 46
two-dimensional moving image; video; videodisc: 9
spoken word; audio; audio disc: 9
text; microform; microfiche: 8
text; computer; online resource: 4

The values for each of the three tags are separated by a semicolon; when one of the RDA tags repeats, the values for that tag are separated by a comma.

For example, the following string:

two-dimensional moving image; projected, video; film reel, videocassette: 1

represents a record with one 336:

two-dimensional moving image; 

two 337's:

projected, video; 

and two 338's:

film reel, videocassette
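The same decomposition can be written as a short function. This is a sketch under stated assumptions: the count follows the last colon on the line, every line carries exactly three semicolon-separated groups, and no value itself contains a comma or semicolon. The function name is hypothetical.

```python
def parse_detailed_stat(line):
    """Split one 'Detailed stats' line into per-tag value lists and a count."""
    # The count follows the last colon on the line.
    combo, _, count = line.rpartition(":")
    parts = combo.split(";")
    if len(parts) != 3:
        raise ValueError(f"expected three ';'-separated groups, got {len(parts)}")
    # Within each group, repeated tags are separated by commas.
    content, media, carrier = (
        [value.strip() for value in part.split(",")] for part in parts
    )
    return {"336": content, "337": media, "338": carrier, "count": int(count)}


parsed = parse_detailed_stat(
    "two-dimensional moving image; projected, video; film reel, videocassette: 1"
)
for tag in ("336", "337", "338"):
    print(tag, parsed[tag])
```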

Notes

The detailed stats described above track a concatenation of the 336, 337, and 338 values. If you also want a frequency table of each individual value that appears in these tags, use the MARC Analysis utility as follows:

  • select your MARC File
  • start MARC Analysis
  • click Options and go to the 'Custom lists' page
  • enter the tag and subfield for the data you want, e.g. 336a

  • click Save, then click Analyze

In a few seconds you should have the desired data. Click 'Text' to open the results and then scroll down to the '33' tags:

Custom List for 336a
        cartographic image:1
        notated music:61
        performed music:226
        spoken word:10
        still image:1
        text:   2020
        two-dimensional movi:13
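For a quick cross-check without leaving the log, per-value totals can also be approximated by tallying the 'Detailed stats' lines themselves. This is a sketch only, not how MARC Analysis works: the function name is hypothetical, it assumes the line format described earlier, and because the input here is only the snippet of the table shown on this page, its totals will not match the full-file figures above.

```python
from collections import Counter


def tally_individual_values(detailed_lines):
    """Sum the counts of each individual 336/337/338 value across lines."""
    tallies = {"336": Counter(), "337": Counter(), "338": Counter()}
    for line in detailed_lines:
        combo, _, count = line.rpartition(":")
        parts = combo.split(";")
        if len(parts) != 3 or not count.strip().isdigit():
            continue  # skip anything that is not a 'a; b; c: n' line
        for tag, part in zip(("336", "337", "338"), parts):
            for value in part.split(","):
                tallies[tag][value.strip()] += int(count)
    return tallies


# The snippet of the frequency table shown earlier on this page.
snippet = [
    "text; unmediated; volume: 1958",
    "performed music; audio; audio disc: 220",
    "notated music; unmediated; volume: 61",
    "text; microform; microfilm reel: 46",
    "two-dimensional moving image; video; videodisc: 9",
    "spoken word; audio; audio disc: 9",
    "text; microform; microfiche: 8",
    "text; computer; online resource: 4",
]

tallies = tally_individual_values(snippet)
for value, total in tallies["336"].most_common():
    print(f"{value}: {total}")
```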
246/batch_hybridization.txt · Last modified: 2021/12/29 16:21 (external edit)