MARC Report and subject validation

Subject heading validation is an experimental feature added in version 242, and updated in version 246.

We call this feature 'experimental' because it tests the functionality and feasibility of real-time validation of vocabulary data over HTTP. Subjects were chosen because the amount of data we need to maintain on the server is relatively small and manageable. 1)

Subject heading validation is controlled by a single checkbox, 'Validate subject headings', at the bottom of the Validation page of the options.

This option is disabled by default. When this option is selected, the program will attempt to validate subject headings in edit sessions.

You will be able to tell if this option is enabled directly from the Edit Session, in several ways. When enabled:

1. A Validation Button is displayed at the bottom right of the navigation panel:

If you cannot see this button, you will need to increase the width of your Edit session window.

When validation is enabled, this button contains a green check-mark. Clicking the button temporarily disables subject validation without going into the options, and the green check-mark changes to a red 'X'. This does not change the setting in the options. Click the button a second time to re-enable subject validation.

2. An extra column, containing checkboxes, appears at the end of each tag:

If the heading validates, the checkbox is checked. If it does not validate, the checkbox remains empty. If the heading matches an alternate label rather than the preferred label (an alternate label is what we used to call a 'see-from'), the checkbox is filled with a solid color.

When validation is disabled in the program options, this extra column is not visible.

When validation is (temporarily) disabled using the Validation button (#1 above), the extra column remains visible but the checkboxes are replaced with a zero.

Finally, while the program is validating headings, a spinning timer control will appear to the right of the button; once validation is complete, the spinner will be replaced with a period.

You will probably find that validation is best disabled when navigating rapidly through a file of records, because the program will not advance to the next record until it receives the search results from the server for the current record.


What is validated

The following table lists the data elements that are validated and the vocabularies that they are validated against:

Tag               Vocabulary                                      Source
650 Ind 2=0 $a    Library of Congress Subject Headings            http://id.loc.gov
650 Ind 2=1 $a    LC Children's Subjects and LCSH                 http://id.loc.gov
650 Ind 2=2 $a    National Library of Medicine Subject Headings   https://www.nlm.nih.gov/mesh
655 $2=gsafd      gsafd.mrc                                       Northwestern University
655 $2=lcgft      LC Genre/Form Terms                             http://id.loc.gov
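
The selection rules in the table can be summarized in a short Python sketch. The function name and the returned labels are illustrative only and are not part of MARC Report; the sketch simply picks a vocabulary from the tag, the second indicator, and $2.

```python
from typing import Optional


def vocabulary_for(tag: str, ind2: str = " ", source: Optional[str] = None) -> Optional[str]:
    """Return a label for the vocabulary a field is checked against, or None.

    Hypothetical helper for illustration; the labels are not program output.
    """
    if tag == "650":
        # 650 fields are selected by the second indicator.
        return {
            "0": "LCSH",
            "1": "LC Children's Subjects and LCSH",
            "2": "MeSH",
        }.get(ind2)
    if tag == "655":
        # 655 fields are selected by the source code in $2.
        # 'lcgft' is the MARC source code for LC Genre/Form Terms.
        return {
            "gsafd": "GSAFD",
            "lcgft": "LC Genre/Form Terms",
        }.get(source or "")
    # Any other field is not validated.
    return None
```

For example, vocabulary_for("650", "2") selects MeSH, while a 650 with any other second indicator returns None and is left unvalidated.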


Processing steps and Notes

The processing used for this validation is as follows. The program sends the TMQ server (www.marcofquality.com) a list of the subject headings in the record currently being viewed or edited. The server queries its database for these headings and returns the result of each lookup. When the program receives these results, it marks the TAG column (in the main record view) of each subject checked, using the checkbox states described in item 2 above.
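
The round trip described above can be sketched in Python. This is only a model of the flow, not the program's actual protocol: the request format used by the TMQ server is not documented here, so the server is replaced by a hypothetical in-memory function, and all names are assumptions.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for the server call: it takes the headings of one
# record and reports, for each heading, whether it was found.
Lookup = Callable[[List[str]], Dict[str, bool]]


def validate_record(headings: List[str], lookup: Lookup) -> Dict[str, bool]:
    """Send all headings of the current record in a single request."""
    if not headings:
        return {}  # nothing to send
    results = lookup(headings)
    # Treat a heading the server did not report on as not validated.
    return {h: results.get(h, False) for h in headings}


def fake_server(headings: List[str]) -> Dict[str, bool]:
    """In-memory substitute for the server-side database query (sample data)."""
    database = {"Cats", "Dogs"}
    return {h: h in database for h in headings}
```

Calling validate_record(["Cats", "Unicorn farming"], fake_server) yields one result per heading, which the editor would then use to mark each tag. Because the whole record is sent and answered as one unit, the caller waits for the reply before moving on.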

NOTES

This feature is not active in Batch Mode.

Re: LCSH, we validate only $a. Validating this subject source is problematic because of its design. There is no way to know if a given heading has an authority record or not. It would be much better if all possible valid subject headings were backed by authorities, and hopefully that will be the case in the future. Failing that though, we do not wish to make validating LCSH a full-time occupation, so we are just going to stick with $a for now.

Re: LCSH alternate labels (aka 'See-From'). If a 650 I2=0 heading does not validate, the same heading is then searched a second time in a table of alternate headings. Thus, subject validation will take longer if your I2=0 subjects do not validate (since each heading will be searched twice).
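
A minimal Python sketch of this two-step search follows. The two sets are made-up sample data standing in for the server tables, and the function name and return values are assumptions chosen to mirror the three checkbox states.

```python
from typing import Set, Tuple

# Made-up sample data standing in for the server tables.
PREFERRED: Set[str] = {"Cats"}
ALTERNATE: Set[str] = {"Domestic cats"}


def check_lcsh(heading: str,
               preferred: Set[str] = PREFERRED,
               alternate: Set[str] = ALTERNATE) -> Tuple[str, int]:
    """Return (checkbox state, number of searches performed)."""
    # First search: preferred labels.
    if heading in preferred:
        return ("checked", 1)
    # Second search, only when the first one fails: alternate labels.
    if heading in alternate:
        return ("filled", 2)
    return ("empty", 2)
```

The second element of the result shows why records whose I2=0 headings fail take longer: every heading that misses the preferred labels costs two searches instead of one.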

Re: MESH, we are able to provide a more comprehensive result, validating the $a and all $x subfields, with the exception of the MESH 'publication types' (similar to the LCSH form subdivisions but coded as $x instead of $v). The latter are removed from the subject string before validation.
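
The subfield handling for MeSH can be illustrated with a small Python sketch. The publication-type set is a two-item sample rather than the full list, and the function name and the "--" separator are assumptions made for the example, not the program's internal format.

```python
from typing import List, Tuple

# Sample only; the full list of publication types is maintained by NLM.
PUBLICATION_TYPES = {"Case Reports", "Congresses"}


def mesh_string(subfields: List[Tuple[str, str]]) -> str:
    """Build the string to validate from (code, value) subfield pairs."""
    parts = []
    for code, value in subfields:
        if code == "a":
            parts.append(value)
        elif code == "x" and value not in PUBLICATION_TYPES:
            # Topical $x subdivisions are kept; publication types are dropped.
            parts.append(value)
        # Subfields other than $a and $x are not part of the validated string.
    return "--".join(parts)
```

For a field with $a Neoplasms, $x therapy, and $x Case Reports, the sketch returns "Neoplasms--therapy", with the publication type removed before the lookup.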

If you have suggestions for files to add to the list above, please send them to us.

1) The initial goal was, and still is, to provide real-time validation of name headings. But the dataset needed for that project is very large. Hence this 'experiment'.
246/subject_validation.txt · Last modified: 2016/08/24 23:39 (external edit)
CC Attribution-Noncommercial-Share Alike 3.0 Unported