PLP: Synonyms--Advanced Topics §

To begin, we want to stress that the effective use of synonyms requires a better-than-average knowledge of PLP crosscheck processing. It is a good idea to review the crosscheck documentation before embarking on a major synonym creation project. You may find that a replacement you want to perform is already handled by a crosscheck's processing rules, or that a synonym rule you want to create will never match because of some subtlety in the way that PLP prepares the MARC data for a field.

You can find detailed crosscheck documentation here


Wildcards §

Using wildcards in the Word/Phrase field 

In addition to the match flags, an asterisk can be added to the beginning or end of any word/phrase to make the match 'wildcard' out to the beginning or end of the subfield, respectively. For example:

VIDEO*        (and MatchFlag=Left)
*RESOURCE     (and MatchFlag=Right) 

Limitation:  

Do not use an asterisk at the beginning together with the 'Left' match flag.
Do not use an asterisk at the end together with the 'Right' match flag.
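The two limitations above are easy to check mechanically before a rule is saved. The following is only a sketch: the function name `check_wildcard_rule` and the idea of passing the match flag as a plain string are assumptions made for illustration, not part of PLP.

```python
def check_wildcard_rule(word_phrase, match_flag):
    """Return a list of problems; an empty list means the combination is allowed.

    Hypothetical helper: it only encodes the two limitations stated above.
    """
    problems = []
    flag = match_flag.strip().lower()
    if word_phrase.startswith("*") and flag == "left":
        problems.append("leading asterisk combined with match flag 'Left'")
    if word_phrase.endswith("*") and flag == "right":
        problems.append("trailing asterisk combined with match flag 'Right'")
    return problems


print(check_wildcard_rule("VIDEO*", "Left"))      # [] (allowed)
print(check_wildcard_rule("*RESOURCE", "Right"))  # [] (allowed)
print(check_wildcard_rule("*VIDEO", "Left"))      # one problem reported
```

Note that both examples given earlier (VIDEO* with Left, *RESOURCE with Right) pass this check, because the asterisk and the flag point at opposite ends of the string.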

Always keep in mind normalization when creating a synonym rule. For instance, the following GMD 

'$h[electronic resource] /' 

will become 

'ELECTRONIC RESOURCE' 

after normalization (the quotes used above are added to show the beginning and end of the strings and are not part of the data). 
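To get a feel for what your word/phrase has to match, it can help to approximate normalization yourself. The sketch below is an assumption, not PLP's actual routine: it drops a leading subfield code, turns punctuation and brackets into spaces, and uppercases the result. It reproduces the examples on this page, but the real rules may differ (for instance in how non-ASCII characters are handled).

```python
import re


def normalize_subfield(raw):
    """Rough approximation of normalization (hypothetical, not PLP code)."""
    # Drop a leading subfield code such as $h or $b.
    text = re.sub(r"^\$[a-z0-9]", "", raw)
    # Replace every run of punctuation, brackets, or whitespace with one space.
    text = re.sub(r"[^A-Za-z0-9]+", " ", text)
    return text.strip().upper()


print(normalize_subfield("$h[electronic resource] /"))  # ELECTRONIC RESOURCE
```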


What goes where §

Ordering of Word/Phrase and Synonym fields 

When replacing a pattern with another string, it may seem arbitrary which string is entered into the 'Word/Phrase' box and which string is entered into the 'Synonym' box. On the contrary, the order of entry is critical.

As a rule, you should always enter the longest, or fullest, form as the string in the 'word/phrase' field, and the shorter string in the 'synonym' field. If this isn't possible or applicable, then simply make sure that a word/phrase is never a substring of its own synonym. Also, be extremely careful when using the 'Any' match flag; use the 'All' match flag whenever possible.

Here is a detailed example to illustrate this advice.

Let's assume two otherwise matching records, where the 260 $b is as follows:

Record 1:  $bLaw Journal,         [normalizes to LAW JOURNAL]
Record 2:  $bLaw Journal Press,   [normalizes to LAW JOURNAL PRESS] 

You have observed, during many hours of MCU review :-(, that this difference causes X260B to fail in more than a few cases. You know that by using the synonym table you can now make these two strings equivalent, but which one should be the word/phrase, and which one should be the synonym?

If you ignore the rule of thumb stated above, you might create a rule like this: 

word/phrase: LAW JOURNAL
synonym:     LAW JOURNAL PRESS
match flag:  Any [or Left] 

What happens when PLP runs this rule during crosscheck processing?  

If this synonym rule is applied to Record 1, the 260$b becomes: 

$bLaw Journal,  
normalizes to: LAW JOURNAL
after synonym processing: LAW JOURNAL PRESS 

If this synonym rule is applied to Record 2, the 260$b becomes: 

$bLaw Journal Press,  
normalizes to: LAW JOURNAL PRESS
after synonym processing: LAW JOURNAL PRESS PRESS 

And as a result, synonym processing fails in its goal to get these records successfully through the X260B crosscheck. 
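The failure above can be reproduced with a few lines of Python. This is a sketch under one assumption: that an 'Any' match replaces the word/phrase wherever it occurs in the normalized subfield. The function name `apply_any_rule` is made up for the example.

```python
def apply_any_rule(value, word_phrase, synonym):
    """Assumed 'Any' behaviour: replace the word/phrase wherever it occurs."""
    return value.replace(word_phrase, synonym)


record_1 = apply_any_rule("LAW JOURNAL", "LAW JOURNAL", "LAW JOURNAL PRESS")
record_2 = apply_any_rule("LAW JOURNAL PRESS", "LAW JOURNAL", "LAW JOURNAL PRESS")

print(record_1)              # LAW JOURNAL PRESS
print(record_2)              # LAW JOURNAL PRESS PRESS
print(record_1 == record_2)  # False: the two records still differ
```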

There are two ways to avoid this unwanted result. First, use the 'All' Match flag. If this is done, then the publisher in Record 2 will not match the synonym rule and thus will not be changed (because 'all' of the data in the MARC subfield must match the word/phrase field in the rule).  

Second, never create a word/phrase that is a substring of its synonym (i.e., the word/phrase can fit inside the synonym string). For example, 'LAW JOURNAL' is a substring of 'LAW JOURNAL PRESS'.  

If the order were reversed, so that the word/phrase was 'LAW JOURNAL PRESS' and the synonym was 'LAW JOURNAL', then the publisher in Record 1 would not match, and the publisher in Record 2 would be changed to 'LAW JOURNAL', thus creating two matching strings and passing the crosscheck!
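Both remedies can be tried out on the two normalized publisher strings. As before, this is only a sketch: `apply_rule` is a hypothetical stand-in, and it assumes that 'All' requires the whole subfield to equal the word/phrase while 'Any' replaces the word/phrase wherever it occurs.

```python
def apply_rule(value, word_phrase, synonym, match_flag):
    """Hypothetical model of a synonym rule for the 'All' and 'Any' flags."""
    if match_flag == "All":
        # The whole subfield must equal the word/phrase.
        return synonym if value == word_phrase else value
    if match_flag == "Any":
        # The word/phrase may occur anywhere in the subfield.
        return value.replace(word_phrase, synonym)
    raise ValueError("unsupported match flag: " + match_flag)


records = ["LAW JOURNAL", "LAW JOURNAL PRESS"]

# Remedy 1: keep the original order, but use the 'All' flag.
with_all = [apply_rule(r, "LAW JOURNAL", "LAW JOURNAL PRESS", "All") for r in records]

# Remedy 2: longest form as the word/phrase, shorter form as the synonym.
reversed_order = [apply_rule(r, "LAW JOURNAL PRESS", "LAW JOURNAL", "Any") for r in records]

print(with_all)        # ['LAW JOURNAL PRESS', 'LAW JOURNAL PRESS']
print(reversed_order)  # ['LAW JOURNAL', 'LAW JOURNAL']
```

In both cases the two records end up with identical strings, which is what the crosscheck needs.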

Because this ordering of word/phrase and synonym is so critical, the program checks each rule to make sure that the word/phrase is not a substring of its synonym. Even so, knowing the gritty details of how synonyms work will help you design better rules.
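If you draft rules in a spreadsheet or text file before entering them, you can run the same kind of check yourself. The sketch below shows the idea only; how the program actually performs its check is not documented here, and the function name is invented for the example.

```python
def word_phrase_inside_synonym(word_phrase, synonym):
    """Return True when the word/phrase fits inside the synonym string."""
    return word_phrase in synonym


print(word_phrase_inside_synonym("LAW JOURNAL", "LAW JOURNAL PRESS"))  # True: reject this rule
print(word_phrase_inside_synonym("LAW JOURNAL PRESS", "LAW JOURNAL"))  # False: this order is safe
```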

plp/synonyms_advanced.txt · Last modified: 2013/04/27 09:09 (external edit)
CC Attribution-Noncommercial-Share Alike 3.0 Unported