235:verify_good-inside-bad [2013/04/27 09:09] (current)
 +**Verify option to check for 'good records' within 'bad'**
 +
 +__Background__
 +
 +The Verify utility processes a MARC file one byte at a time((This isn't completely true, since it reads the file from the disk in large chunks, and then processes each chunk one byte at a time)). When it sees the MARC end_of_record delimiter (x1D), it creates a stream of data that begins with the byte following the last successfully verified record, and ends with the x1D just found. The utility then tries to //verify// the MARC structure of that stream.
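The scanning loop described above can be sketched roughly as follows. This is only an illustration, not Verify's actual code: the names `scan`, `looks_valid`, and `build_record` are hypothetical, the structural check is far weaker than a real verifier's, and the sketch assumes each new stream starts right after the previous x1D.

```python
EOR = 0x1D        # MARC end_of_record delimiter
FT = 0x1E         # MARC field terminator
LEADER_LEN = 24   # a MARC leader is always 24 bytes
ENTRY_LEN = 12    # each directory entry is 12 bytes


def looks_valid(stream: bytes) -> bool:
    """Rough structural check of one candidate record (hypothetical rules)."""
    if len(stream) < LEADER_LEN + 2 or stream[-1] != EOR:
        return False
    # Leader bytes 0-4 hold the record length, bytes 12-16 the base address.
    if not (stream[0:5].isdigit() and stream[12:17].isdigit()):
        return False
    record_length = int(stream[0:5])
    base_address = int(stream[12:17])
    if record_length != len(stream):
        return False
    if not LEADER_LEN < base_address < record_length:
        return False
    # The directory must end with a field terminator and consist of
    # whole 12-byte entries.
    directory_len = base_address - 1 - LEADER_LEN
    return stream[base_address - 1] == FT and directory_len % ENTRY_LEN == 0


def scan(data: bytes):
    """Yield (start, end, ok) for every stream that ends at an x1D."""
    start = 0
    for pos, byte in enumerate(data):   # one byte at a time
        if byte == EOR:
            yield start, pos + 1, looks_valid(data[start:pos + 1])
            start = pos + 1             # assumption: always resume after the x1D


def build_record(fields) -> bytes:
    """Build a minimal record from (tag, data) byte pairs, for testing only."""
    directory, body = b"", b""
    for tag, value in fields:
        field = value + bytes([FT])
        directory += b"%s%04d%05d" % (tag, len(field), len(body))
        body += field
    base = LEADER_LEN + len(directory) + 1
    total = base + len(body) + 1
    leader = b"%05dnam  22%05d   4500" % (total, base)
    return leader + directory + bytes([FT]) + body + bytes([EOR])
```

For example, `list(scan(build_record([(b"001", b"A1")])))` yields a single stream that passes the check. If a record cut off inside its directory is followed by a complete record, `scan` reports the two together as one failing stream.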
 +
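 +This approach has served well over the years, but it can be fooled by the following scenario. Suppose that, after a 'good' record has been parsed, the next record is truncated within its directory, so that a new record begins where the rest of that directory should be. Because the truncated record has no x1D of its own, the stream that Verify builds runs from the truncated leader through the end of the 'second' record. Verify then spits out the whole stream, 'second' record included, as an error, even though the 'second' record would prove valid on its own.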
 +
 +To deal with this problem (which we hope is rare!), we have added a new option to Verify called 'Check for good within bad':
 +
 +{{:235:new_verify_opt.jpg}} 
 +
 +When this option is selected, Verify will, in the above scenario, rescan the data stream looking for a piece of data that resembles a MARC leader and, if it finds one, check whether that 'second' leader marks the beginning of a valid MARC record. In this case, the truncated directory will still be spit out as an error, but the valid record will not be lost.
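A minimal sketch of such a rescan is shown below. How Verify actually decides that data "resembles" a leader is not documented here, so the pattern `LEADER_RE` (five digits, "22" in positions 10-11, five digits in positions 12-16, "4500" in positions 20-23) is an assumption, as are the names `looks_valid` and `find_good_within_bad` and the deliberately rough validity check.

```python
import re

EOR = 0x1D  # MARC end_of_record delimiter
FT = 0x1E   # MARC field terminator

# Assumed heuristic for "resembles a MARC leader" (24 bytes in total).
LEADER_RE = re.compile(rb"\d{5}[ -~]{5}22\d{5}[ -~]{3}4500")


def looks_valid(stream: bytes) -> bool:
    """Rough structural check, standing in for the real verification."""
    if len(stream) < 26 or stream[-1] != EOR:
        return False
    if not (stream[0:5].isdigit() and stream[12:17].isdigit()):
        return False
    length, base = int(stream[0:5]), int(stream[12:17])
    return (length == len(stream) and 24 < base < length
            and stream[base - 1] == FT and (base - 25) % 12 == 0)


def find_good_within_bad(stream: bytes, is_valid=looks_valid):
    """Rescan a failed stream for an embedded valid record.

    Returns (bad_prefix, good_record) if one is found, otherwise None.
    """
    pos = 1  # offset 0 is the leader that has already failed
    while True:
        match = LEADER_RE.search(stream, pos)
        if match is None:
            return None
        candidate = stream[match.start():]   # runs through the final x1D
        if is_valid(candidate):
            return stream[:match.start()], candidate
        pos = match.start() + 1              # false alarm, keep looking
```

The prefix returned alongside the good record corresponds to the truncated data that would still be reported as an error.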
 +
 +The new option is turned off by default; we recommend that you enable it only when you are dealing with a problematic file, that is, a file with MARC errors that, on inspection, have no other obvious cause. A telltale sign of this problem is an empty 'MarcErr' record, like the following:
 +<code>
 +
 +
 +---
 +Filename: D:\Marc\test records.mrc
 +Record Number: 20
 +File Offset: 35019
 +Last Good EOR: 33596
 +</code>
 +
 +Normally, whatever could be read of the record is printed above the three dashes; in the example above, that area is blank.
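If you have many error entries to inspect, a check for this telltale sign could be scripted along the following lines. This is a sketch under assumptions: it handles one 'MarcErr' entry at a time, passed in as text, and it treats the first line consisting only of three dashes as the separator. The function names are hypothetical and are not part of Verify.

```python
def is_empty_marcerr(entry: str) -> bool:
    """Return True if nothing but blank lines precedes the '---' line."""
    lines = entry.splitlines()
    for index, line in enumerate(lines):
        if line.strip() == "---":
            return all(not earlier.strip() for earlier in lines[:index])
    return False  # no separator found, so not a recognisable entry


def marcerr_details(entry: str) -> dict:
    """Collect the 'Name: value' lines that follow the '---' line."""
    details = {}
    seen_separator = False
    for line in entry.splitlines():
        if line.strip() == "---":
            seen_separator = True
        elif seen_separator and ":" in line:
            # Split on the first colon only, so 'D:\...' paths stay intact.
            name, _, value = line.partition(":")
            details[name.strip()] = value.strip()
    return details
```

Applied to the entry shown above, `is_empty_marcerr` returns `True`, and `marcerr_details` returns the four labelled values as strings.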
 +
  
235/verify_good-inside-bad.txt · Last modified: 2013/04/27 09:09 (external edit)