Differences

This shows you the differences between two versions of the page.

plp:synonyms_advanced [2010/03/17 18:16]
tmqinc
plp:synonyms_advanced [2013/04/27 09:09] (current)
Line 1: Line 1:
 +====== PLP: Synonyms--Advanced Topics ======
 +
 +To begin, we want to stress that the effective use of synonyms requires a better-than-average knowledge of PLP crosscheck processing. It is a good idea to review the crosscheck documentation before embarking upon a major synonym creation project. You may find that a replacement you want to perform is already handled by a crosscheck's processing rules, or that a synonym rule you want to create will never match because of some subtlety in the way that PLP prepares the MARC data for a field.
 +
 +You can find [[plp:crosschecks|detailed crosscheck documentation here]].
 +
 +===== Wildcards =====
 +
 +__Using wildcards in the Word/Phrase field__
 +
 +In addition to the 'Match flags', an asterisk can be added to the beginning or end of any word/phrase
 +to extend ('wildcard') the match to the beginning or end of the subfield, respectively. For example:
 +
 +  VIDEO*        (and MatchFlag=Left)
 +  *RESOURCE     (and MatchFlag=Right)
 +
 +Limitations:
 +
 +Do not use an asterisk at the beginning together with the 'Left' match flag. \\
 +Do not use an asterisk at the end together with the 'Right' match flag.
 +
 +Always keep in mind normalization when creating a synonym rule. For instance, the following GMD
 +
 +  '$h[electronic resource] /'
 +  
 +will become
 +
 +  'ELECTRONIC RESOURCE'
 +  
 +after normalization (the quotes used above are added to show the beginning and end of the strings and are not part of the data).
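
As a rough illustration of why the word/phrase must be written in normalized form, here is a minimal Python sketch that reproduces the example above. The function `normalize_subfield` is hypothetical and is not PLP's actual routine: it assumes normalization drops the subfield code, turns punctuation into spaces, collapses whitespace, and uppercases ASCII text. The real rules vary by crosscheck, so consult the crosscheck documentation.

```python
import re


def normalize_subfield(raw: str) -> str:
    """Hypothetical approximation of normalization; NOT PLP's real rules."""
    # Drop a leading MARC subfield code such as "$h" or "$b".
    text = re.sub(r"^\$[a-z0-9]", "", raw)
    # Replace every run of non-alphanumeric (ASCII) characters with a space.
    text = re.sub(r"[^A-Za-z0-9]+", " ", text)
    # Collapse whitespace, trim the ends, and uppercase.
    return " ".join(text.split()).upper()


print(normalize_subfield("$h[electronic resource] /"))  # ELECTRONIC RESOURCE
print(normalize_subfield("$bLaw Journal Press,"))       # LAW JOURNAL PRESS
```

A rule whose word/phrase contains brackets, a trailing slash, or lowercase letters would therefore never match under this assumed normalization.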
 +
 +===== What goes where =====
 +
 +__Ordering of Word/Phrase and Synonym fields__
 +
 +When replacing a pattern with another string, it may seem arbitrary which string is entered into the 'Word/Phrase' box and which string is entered into the 'Synonym' box. On the contrary, the order of entry is critical.
 +
 +As a rule, you should always enter the longest, or fullest, form as a string in the 'word/phrase' field, and the shorter string in the 'synonym' field. If this isn't possible or applicable, then simply make sure that a word/phrase is never a substring of its own synonym. Also, be extremely careful when using the 'Any' Match Flag; try to use the 'All' Match flag as much as possible. 
 +
 +Here is a detailed example to illustrate this advice.
 +
 +Let's assume two otherwise matching records, where the 260 $b is as follows:
 +  Record 1:  $bLaw Journal,         [normalizes to LAW JOURNAL]
 +  Record 2:  $bLaw Journal Press,   [normalizes to LAW JOURNAL PRESS]
 +
 +During many hours of MCU review :-( you have observed that this difference causes X260B to fail in more than a few cases. You know that the synonym table lets you make these two strings equivalent, but which one should be the word/phrase, and which one should be the synonym?
 +
 +If you ignore the rule of thumb stated above, you might create a rule like this:
 +  word/phrase: LAW JOURNAL
 +  synonym:     LAW JOURNAL PRESS
 +  match flag:  Any [or Left]
 +
 +What happens when PLP runs this rule during crosscheck processing? 
 +
 +If this synonym rule is applied to Record 1, the 260$b becomes:
 +  $bLaw Journal,  
 +  normalizes to: LAW JOURNAL
 +  after synonym processing: LAW JOURNAL PRESS
 +
 +If this synonym rule is applied to Record 2, the 260$b becomes:
 +  $bLaw Journal Press,  
 +  normalizes to: LAW JOURNAL PRESS
 +  after synonym processing: LAW JOURNAL PRESS PRESS
 +
 +As a result, the two strings still differ (LAW JOURNAL PRESS versus LAW JOURNAL PRESS PRESS), and synonym processing fails in its goal of getting these records through the X260B crosscheck.
 +
 +There are two ways to avoid this unwanted result. First, use the 'All' Match flag. If this is done, then the publisher in Record 2 will not match the synonym rule and thus will not be changed (because 'all' of the data in the MARC subfield must match the word/phrase field in the rule). Record 1 still becomes LAW JOURNAL PRESS, so the two strings now match.
 +
 +Second, never create a word/phrase that is a substring of its synonym (i.e., the word/phrase can fit inside the synonym string). For example, 'LAW JOURNAL' is a substring of 'LAW JOURNAL PRESS'. 
 +
 +If the order were reversed, so that the word/phrase was 'LAW JOURNAL PRESS' and the synonym was 'LAW JOURNAL', then the publisher in Record 1 would not match, and the publisher in Record 2 would be changed to 'LAW JOURNAL', thus creating two matching strings and passing the crosscheck!
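
The three scenarios above can be traced with a small Python sketch. The function `apply_synonym` is hypothetical: it assumes that 'All' requires the whole subfield to equal the word/phrase, that 'Left' and 'Right' match at the start and end of the subfield, and that 'Any' replaces the first occurrence anywhere. These semantics are inferred from the worked example and may not match PLP's implementation in every detail.

```python
def apply_synonym(text: str, word: str, synonym: str, flag: str) -> str:
    """Hypothetical model of one synonym rule applied to a normalized subfield."""
    if flag == "All":
        # The entire subfield must equal the word/phrase.
        return synonym if text == word else text
    if flag == "Left":
        # The subfield must begin with the word/phrase.
        return synonym + text[len(word):] if text.startswith(word) else text
    if flag == "Right":
        # The subfield must end with the word/phrase.
        if text.endswith(word):
            return text[:len(text) - len(word)] + synonym
        return text
    if flag == "Any":
        # The word/phrase may occur anywhere; replace the first occurrence.
        return text.replace(word, synonym, 1)
    raise ValueError(f"unknown match flag: {flag}")


records = ["LAW JOURNAL", "LAW JOURNAL PRESS"]
rules = [
    ("LAW JOURNAL", "LAW JOURNAL PRESS", "Any"),  # the problematic rule
    ("LAW JOURNAL", "LAW JOURNAL PRESS", "All"),  # first fix: 'All' flag
    ("LAW JOURNAL PRESS", "LAW JOURNAL", "Any"),  # second fix: reversed order
]

for word, synonym, flag in rules:
    results = [apply_synonym(r, word, synonym, flag) for r in records]
    verdict = "match" if results[0] == results[1] else "no match"
    print(f"{word!r} -> {synonym!r} [{flag}]: {results} => {verdict}")
```

Under these assumptions only the first rule leaves the two publishers different, because Record 2 picks up a second PRESS.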
 +
 +Because this ordering of word/phrase and synonym is so critical, the program checks each rule to make sure that the word/phrase is not a substring of its synonym. Even so, knowing the gritty details about how synonyms work will help you design better rules.
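
If you draft rules in bulk before entering them, a similar substring check is easy to run yourself. The sketch below is only an assumption about what such a check might look like; the helper `is_safe_rule` is hypothetical, and how PLP performs its own check is not documented here.

```python
def is_safe_rule(word: str, synonym: str) -> bool:
    """Return False when the word/phrase is a substring of its synonym."""
    return word not in synonym


# Draft rules as (word/phrase, synonym) pairs, already in normalized form.
draft_rules = [
    ("LAW JOURNAL", "LAW JOURNAL PRESS"),  # unsafe: word fits inside synonym
    ("LAW JOURNAL PRESS", "LAW JOURNAL"),  # safe: longest form is the word/phrase
]

for word, synonym in draft_rules:
    status = "ok" if is_safe_rule(word, synonym) else "REJECT"
    print(f"{status}: {word!r} -> {synonym!r}")
```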
 +
  
CC Attribution-Noncommercial-Share Alike 3.0 Unported