--- 235:verify_good-inside-bad [2013/04/27 13:09]
+++ 235:verify_good-inside-bad [2021/12/29 16:21] (current)
@@ Line 1: / Line 1: @@
+**Verify option to check for 'good records' within 'bad'**
+__Background__
+The Verify utility processes a MARC file one byte at a time((This isn't completely true, since it reads the file from the disk in large chunks, and then processes each chunk one byte at a time)). When it sees the MARC end_of_record delimiter (x1D), it creates a stream of data that begins with the byte following the last successfully verified record, and ends with the x1D just found. The utility then tries to //verify// the MARC structure of that stream.
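The scan described above can be sketched roughly as follows. This is an illustration only, not Verify's actual code: the names `looks_valid` and `scan` are hypothetical, the structural checks are a minimal subset based on the standard MARC layout (24-byte leader, 12-byte directory entries, x1E field terminator), and resuming the scan after a failed stream's x1D is an assumption.

```python
EOR = 0x1D           # MARC end-of-record delimiter
FT = 0x1E            # MARC field terminator (also ends the directory)
LEADER_LEN = 24      # fixed leader length
DIR_ENTRY_LEN = 12   # tag (3) + field length (4) + start position (5)

# A tiny one-field record, used only for demonstration.
SAMPLE = b"00043nam a2200037 a 4500" + b"001000500000\x1e" + b"rec1\x1e\x1d"


def looks_valid(stream: bytes) -> bool:
    """Minimal structural check of one candidate stream (hypothetical helper)."""
    if len(stream) <= LEADER_LEN or stream[-1] != EOR:
        return False
    length, base = stream[0:5], stream[12:17]
    if not (length.isdigit() and base.isdigit()):
        return False
    base_addr = int(base)
    if int(length) != len(stream):
        return False  # record length in the leader must match the stream
    if not (LEADER_LEN < base_addr < len(stream)):
        return False
    if stream[base_addr - 1] != FT:
        return False  # directory must end with a field terminator
    # The directory must be a whole number of 12-byte entries.
    return (base_addr - 1 - LEADER_LEN) % DIR_ENTRY_LEN == 0


def scan(data: bytes):
    """Walk the data one byte at a time and check each stream that ends in x1D.

    Returns a list of (start, end, is_valid) tuples. Assumption: after a
    failed stream, the next stream starts right after that stream's x1D.
    """
    results = []
    start = 0
    for pos, byte in enumerate(data):
        if byte == EOR:
            stream = data[start:pos + 1]
            results.append((start, pos + 1, looks_valid(stream)))
            start = pos + 1
    return results
```

Calling `scan(SAMPLE + SAMPLE)` under these assumptions yields two streams, each of which passes the structural check.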
+This approach has served well over the years, but one scenario can fool it. Suppose that, after a 'good' record has been parsed, the directory of the next record is truncated, so that a new record begins inside the truncated record's directory. Because the stream Verify checks runs from the first byte of the truncated record to the next x1D, it contains both the truncated fragment and the 'second' record, and Verify reports the whole stream as an error, even if the 'second' record is valid on its own.
+To deal with this problem (which we hope is rare!), we have added a new option to Verify called 'Check for good within bad':
+{{:235:new_verify_opt.jpg}}
+When this option is selected and Verify encounters the scenario above, it rescans the data stream for data that resembles a MARC leader. If it finds one, it checks whether that 'second' leader marks the beginning of a valid MARC record. The truncated directory is still reported as an error, but the valid record is not lost.
+The new option is turned off by default. We recommend enabling it only when you are dealing with a problematic file, that is, a file with MARC errors that, on inspection, have no other obvious cause. A telltale sign of this problem is an empty 'MarcErr' record, like the following:
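One way to picture the rescan is the sketch below. It is an assumption-laden illustration, not the utility's implementation: `resembles_leader`, `looks_valid`, and `find_good_within_bad` are hypothetical names, and the test for "resembles a leader" (digits in the record-length and base-address positions) is a guess at a reasonable heuristic, since the page does not say how Verify recognizes one.

```python
EOR = 0x1D           # MARC end-of-record delimiter
FT = 0x1E            # MARC field terminator (also ends the directory)
LEADER_LEN = 24      # fixed leader length
DIR_ENTRY_LEN = 12   # tag (3) + field length (4) + start position (5)


def resembles_leader(window: bytes) -> bool:
    """Cheap heuristic (assumed): digits where a leader keeps its two numbers."""
    return (len(window) >= LEADER_LEN
            and window[0:5].isdigit()      # record length
            and window[12:17].isdigit())   # base address of data


def looks_valid(stream: bytes) -> bool:
    """Minimal structural check of a candidate record (hypothetical helper)."""
    if not resembles_leader(stream) or stream[-1] != EOR:
        return False
    base_addr = int(stream[12:17])
    if int(stream[0:5]) != len(stream):
        return False
    if not (LEADER_LEN < base_addr < len(stream)):
        return False
    if stream[base_addr - 1] != FT:
        return False
    return (base_addr - 1 - LEADER_LEN) % DIR_ENTRY_LEN == 0


def find_good_within_bad(stream: bytes):
    """Rescan a stream that failed verification for an embedded valid record.

    Returns the offset at which a valid record begins, or None. Offset 0 is
    skipped because the stream as a whole has already failed.
    """
    for offset in range(1, len(stream) - LEADER_LEN):
        candidate = stream[offset:]
        if resembles_leader(candidate) and looks_valid(candidate):
            return offset
    return None
```

In this sketch, everything before the returned offset would still be reported as an error, while the bytes from the offset to the x1D would be treated as a good record.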
+<code>
+---
+Filename: D:\Marc\test records.mrc
+Record Number: 20
+File Offset: 35019
+Last Good EOR: 33596
+</code>
+Normally, whatever could be read of the record is printed above the three dashes; in an empty 'MarcErr' record like this one, nothing appears there.
