Changes to MARC Analysis in version 236

In version 236, we've added a statistic that identifies how many bytes each tag in the file occupies. (The program has always maintained this figure but never displayed it until now.) It appears at the top of the report, alongside the other per-file 'maximums':

File size: 18925471 bytes

MARC record count: 11947
...
Most repeated tag in file: 999  (501 times in record number 3856)

Tag using most bytes in file: 999  (occupies 4128834 bytes in 30108 occurrences)

Longest single field in file: 2698 bytes  (tag 505 in record number 4838)
...

From this we can glean that tag 999 occupies 4128834 of the 18925471 data bytes in the file (see note 1).
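To put those figures in proportion, here is a small Python sketch of the arithmetic. The variable names are ours, not anything the program exposes, and the share is approximate: the file size includes leaders and directories, whereas the per-tag total counts only data bytes (see note 1).

```python
# Figures copied from the sample report above.
file_size = 18_925_471    # "File size" line
tag_bytes = 4_128_834     # bytes occupied by tag 999
occurrences = 30_108      # number of 999 fields in the file

# Share of the file taken up by tag 999, as a percentage.
share_of_file = 100 * tag_bytes / file_size

# Average size of a single 999 field.
average_field = tag_bytes / occurrences

print(f"Tag 999 share of file: {share_of_file:.1f}%")
print(f"Average 999 field length: {average_field:.1f} bytes")
```

In this sample, tag 999 accounts for roughly a fifth of the file, at about 137 bytes per occurrence.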

As with all MARC Analysis statistics, this information will be most useful to those seeking it, and a curiosity (we hope) to all others.

The other change is in the options. Previously, the program maintained two frequency lists, which could be either displayed or suppressed.

In version 236, six lists are available, each controlled by its own option. Every one of them is simply a more exhaustive listing of information that is summarized at the top of the report. The six lists are as follows (the first two have always been present in MARC Analysis):

Frequency: What percentage of records contain each tag used in the file
Occurrence: How many times each tag is used in the file
Overall Size: How many bytes each tag uses in the file
Maximum Length: Which tag is the longest, 2nd longest, etc., in the file
Minimum Length: Which tag is the shortest, 2nd shortest, etc., in the file
Average Length: Which tag is the longest on average in the file
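
For readers who want to see how six such lists relate to one another, the following Python sketch derives them from the same per-field data. It is an illustration only, not the program's own code: the input shape (each record as a list of `(tag, data_length)` pairs) and the names `tag_statistics` and `ranked` are assumptions made for this example, as is the reading of Maximum and Minimum Length as a ranking of tags by their longest and shortest field.

```python
from collections import defaultdict


def tag_statistics(records):
    """Collect per-tag totals from records given as lists of (tag, data_length)."""
    stats = defaultdict(
        lambda: {"records": 0, "occurrences": 0, "bytes": 0, "max": 0, "min": None}
    )
    for record in records:
        seen = set()
        for tag, length in record:
            s = stats[tag]
            s["occurrences"] += 1                      # Occurrence
            s["bytes"] += length                       # Overall Size
            s["max"] = max(s["max"], length)           # Maximum Length
            s["min"] = length if s["min"] is None else min(s["min"], length)
            seen.add(tag)
        for tag in seen:
            stats[tag]["records"] += 1                 # count each record once per tag
    total_records = len(records)
    for s in stats.values():
        s["frequency"] = 100.0 * s["records"] / total_records  # Frequency (%)
        s["average"] = s["bytes"] / s["occurrences"]            # Average Length
    return dict(stats)


def ranked(stats, key, top=None, reverse=True):
    """Return (tag, value) pairs sorted by one statistic; top=None lists all."""
    rows = sorted(stats.items(), key=lambda item: item[1][key], reverse=reverse)
    return [(tag, s[key]) for tag, s in rows[:top]]


# Two made-up records, purely for demonstration.
records = [
    [("245", 40), ("999", 100), ("999", 120)],
    [("245", 60)],
]
stats = tag_statistics(records)
print(ranked(stats, "bytes", top=1))          # tag using the most bytes
print(ranked(stats, "min", reverse=False))    # shortest first
```

The `top` argument plays the role of the "all results or only the top 10" choice: passing `top=10` would truncate any of the six lists in the same way.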

The option to display all of the results, or only the top 10 results, etc., remains as before. In the program, this page of the options now looks like this:

1) Note: when we calculate the length of a tag in MARC Analysis, we do not count the 12 bytes of its directory entry; we count only the number of bytes used in the data portion of the record.
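
As an illustration of the note above, this Python sketch reads field lengths out of a record in the standard MARC (ISO 2709) layout: a 24-byte leader, a directory of 12-byte entries (3-byte tag, 4-byte length, 5-byte start position), then the data. The function names are hypothetical, and the sketch assumes the length recorded in the directory is the figure being counted; that value includes the one-byte field terminator, and the note does not say whether MARC Analysis counts that byte.

```python
FIELD_TERMINATOR = b"\x1e"
RECORD_TERMINATOR = b"\x1d"
LEADER_SIZE = 24
DIRECTORY_ENTRY_SIZE = 12


def field_data_lengths(record: bytes):
    """Return (tag, length) per field; the 12 directory bytes are not included."""
    base_address = int(record[12:17])                   # leader positions 12-16
    directory = record[LEADER_SIZE:base_address - 1]    # drop directory terminator
    lengths = []
    for offset in range(0, len(directory), DIRECTORY_ENTRY_SIZE):
        entry = directory[offset:offset + DIRECTORY_ENTRY_SIZE]
        tag = entry[0:3].decode("ascii")
        length = int(entry[3:7])                        # data bytes incl. terminator
        lengths.append((tag, length))
    return lengths


def build_record(fields):
    """Assemble a tiny record from (tag, value) pairs, for demonstration only."""
    directory, data = b"", b""
    for tag, value in fields:
        body = value + FIELD_TERMINATOR
        directory += tag.encode("ascii") + b"%04d%05d" % (len(body), len(data))
        data += body
    base = LEADER_SIZE + len(directory) + 1
    total = base + len(data) + 1
    leader = b"%05d" % total + b"nam a22" + b"%05d" % base + b" a 4500"
    return leader + directory + FIELD_TERMINATOR + data + RECORD_TERMINATOR


sample = build_record([("001", b"12345"), ("245", b"10\x1faTitle")])
print(field_data_lengths(sample))
```

The two fields in this sample account for 24 bytes of directory between them, none of which appear in the lengths returned.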
236/ma_changes_236.txt · Last modified: 2021/12/29 16:21 (external edit)
CC Attribution-Share Alike 4.0 International