Differences

This shows you the differences between two versions of the page.

phelp:helplistmatch [2011/03/24 20:57]
phelp:helplistmatch [2023/11/01 01:34] (current)
Rick
Line 1: Line 1:
 +MATCHING ITEMS FROM A LIST IN MARC REVIEW
 +
 +MARC Review has the ability to match MARC Data (specified on the pattern form) against a list of items entered into a text file.
 +
 +The first step is to decide what kind of match you want to perform: simple string matching or value list matching.
 +
 +SIMPLE STRING MATCH
 +
 +Simple string matching is the default MARC Review match behavior. This type of matching simply checks for the presence of a string anywhere in the MARC data specified on the pattern form.
 +
 +For example, if the pattern form specifies
 +  
 +  TAG=650
 +  SUBF=a
 +  DATA=librar
 +  CASE=False
 +  
 +then the program will find all records that contain a 650 $a with the string 'librar' in it:
 +
 +  $aDigital libraries
 +  $aFriends of the library
 +  $aInternational librarianship
 +  $aLibrarians
 +  $aLibraries and people with disabilities
 +  $aLibrary catalogs
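 +
 +The behavior described above can be sketched in Python. This is a minimal illustration only: simple_match is a hypothetical helper, not MARC Review's own code, and the heading 'Cataloging' is added purely as a non-matching example.
 +
```python
# Hypothetical helper (not part of MARC Review): a plain substring test.
def simple_match(subfield_data: str, data: str, case: bool = False) -> bool:
    """Return True if `data` occurs anywhere in `subfield_data`."""
    if not case:
        # CASE=False: compare both sides in lower case.
        subfield_data, data = subfield_data.lower(), data.lower()
    return data in subfield_data


headings = [
    "Digital libraries",
    "Friends of the library",
    "Librarians",
    "Cataloging",  # illustrative non-match
]

matched = [h for h in headings if simple_match(h, "librar")]
print(matched)
```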
 +
 +We can anchor a match to the beginning or end of the MARC data by using regular expressions. Continuing with the 650 $a example, if we change DATA to:
 +
 +  DATA=^librar
 +  CASE=False
 +  RegEx=True
 +  
 +it then matches only subfields beginning with the string:
 +
 +  $aLibrarians
 +  $aLibraries and people with disabilities
 +  $aLibrary catalogs
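 +
 +The effect of the '^' anchor can be reproduced with Python's re module. This is a sketch that assumes MARC Review's regular expression support treats '^' the way Python does; it is not the program's own implementation.
 +
```python
import re

# DATA=^librar, CASE=False, RegEx=True
pattern = re.compile(r"^librar", re.IGNORECASE)

headings = [
    "Digital libraries",
    "Friends of the library",
    "International librarianship",
    "Librarians",
    "Libraries and people with disabilities",
    "Library catalogs",
]

# Only headings that begin with the string survive the anchored search.
anchored = [h for h in headings if pattern.search(h)]
print(anchored)
```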
 +
 +Matching to the end of a field is a bit harder, because a pattern like:
 +
 +  DATA=librar.*$
 +  CASE=False
 +  RegEx=True
 +  
 +will still match a heading like:
 +
 +  $aLibraries and people with disabilities
 +
 +because '.*' matches any run of characters up to the end of the field, so the pattern succeeds wherever 'librar' appears. To match only headings that end with the term, we would have to do something like this:
 +
 +  DATA=libraries$||library$||librarians$||librarianship$
 +  CASE=True
 +  RegEx=True
 +
 +turning on case sensitivity so that we do not match a heading that begins with the term, such as '$aLibraries'. (Additionally, we could add a blank space in front of each search term.)
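 +
 +The two end-of-field patterns can be compared in Python. This sketch assumes that the '||' separator in the DATA line means 'OR', which is written as '|' in Python's re syntax; the heading list is illustrative.
 +
```python
import re

# DATA=librar.*$ with CASE=False: '.*' runs to the end of the field,
# so any heading containing 'librar' matches.
loose = re.compile(r"librar.*$", re.IGNORECASE)

# The alternation with CASE=True: each term must sit at the end of the field,
# and a heading that starts with a capitalized term is skipped.
strict = re.compile(r"(?:libraries|library|librarians|librarianship)$")

headings = [
    "Digital libraries",
    "Friends of the library",
    "International librarianship",
    "Libraries",
    "Libraries and people with disabilities",
]

loose_hits = [h for h in headings if loose.search(h)]
strict_hits = [h for h in headings if strict.search(h)]
print(loose_hits)
print(strict_hits)
```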
 +
 +VALUE LIST MATCH
 +
 +Value list string matching was added to the program in version 236. The purpose of a value list is to support a controlled vocabulary. Examples of value lists are everything from the 'MARC Code List for Languages' to the Library of Congress Subject Headings. 
 +
 +Ideally, MARC fields that are to contain data from a controlled vocabulary should be entered using dropdown menus that contain all available values. For example, in MARC Report we may click on the 008 element for 'Language' and press <F1> to select from a list of all valid Language codes. However, this type of data entry may not be feasible for a large list of subject headings, which may contain many thousands (or hundreds of thousands) of items.
 +
 +Searching value lists in MARC Review differs somewhat from the default string matching described above. Simple string matching asks 'is the specified data ("librar") present in the field we are searching ("650$a")?', whereas value list matching asks 'is the field we are searching present in the value list we have specified?'.
 +
 +Value list matching is always left-anchored in MARC Review. We assume, by definition, that a subject heading of 
 +  
 +  650 $aLibraries.
 +  
 +should never match a value list item like 
 +
 +  Technical services (Libraries)
 +
 +Thus, we do not match strings within strings when validating a term in a MARC record against a value list. This has a benefit in that we can programmatically support very large lists when the search term is left-anchored.
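 +
 +One way a left-anchored lookup can scale to a very large list is a binary search over a sorted copy of the list. The Python sketch below only illustrates that idea; it is not a description of how MARC Review is implemented, and has_prefix_match is a hypothetical name. The comparison here is case-sensitive.
 +
```python
from bisect import bisect_left


def has_prefix_match(sorted_values: list[str], term: str) -> bool:
    """Return True if some item in the sorted list starts with `term`."""
    # In sorted order, the first item >= term is the only candidate we need
    # to inspect: if any item starts with `term`, that one does.
    i = bisect_left(sorted_values, term)
    return i < len(sorted_values) and sorted_values[i].startswith(term)


values = sorted([
    "Technical services (Libraries)",
    "Libraries",
    "Libraries (Rooms)",
    "Library catalogs",
])

print(has_prefix_match(values, "Libraries"))  # found at the left edge of an item
print(has_prefix_match(values, "services"))   # only occurs inside an item
```
 +
 +Because the search term must sit at the start of a list item, each lookup inspects a logarithmic number of entries rather than scanning every item for an embedded string.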
 +
 +Also, in value list matching there is no Regular expression option, as matching is always left-anchored. 
 +
 +Instead, there are the options 'Partial' and 'Complete'. If the 'Partial' option is selected, then the MARC field
 +
 +  650 $aLibraries.
 +  
 +would match all of the following items from a LCSH value list:
 +
 +  Libraries
 +  Libraries (Rooms)
 +  Libraries and adult education
 +  Libraries and booksellers
 +  Libraries and colleges
 +  Libraries and community
 +  Libraries and distance education.
 +  Libraries and education
 +  Libraries and electronic publishing
 +  Libraries and families.
 +  ...
 +  Libraries, Medical
 +
 +But if the 'Complete' option is selected, then the MARC field above would match only the value list item
 +
 +  Libraries
 +
 +Note: the Case sensitive option is also supported in value list matching.
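 +
 +The 'Partial' and 'Complete' options can be sketched in Python as follows. This is an illustration under stated assumptions: match_value_list and normalize are hypothetical names, and dropping the trailing period from '650 $aLibraries.' is an assumption inferred from the example above, where that field matches the list item 'Libraries'.
 +
```python
def normalize(field: str) -> str:
    # Assumption: trailing periods and spaces in the MARC data are ignored.
    return field.rstrip(". ")


def match_value_list(field, values, mode="Partial", case=False):
    """Return the value list items matched by `field` (always left-anchored)."""
    fold = (lambda s: s) if case else str.lower
    term = fold(normalize(field))
    if mode == "Complete":
        # The whole list item must equal the field data.
        return [v for v in values if fold(v) == term]
    # Partial: the list item must begin with the field data.
    return [v for v in values if fold(v).startswith(term)]


lcsh = [
    "Libraries",
    "Libraries (Rooms)",
    "Libraries and adult education",
    "Libraries, Medical",
    "Technical services (Libraries)",
]

partial = match_value_list("Libraries.", lcsh, mode="Partial")
complete = match_value_list("Libraries.", lcsh, mode="Complete")
print(partial)
print(complete)
```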
  
CC Attribution-Share Alike 4.0 International