Differences
This shows you the differences between two versions of the page.
— |
240:embobug [2021/12/29 16:21] (current) |
||
---|---|---|---|
Line 1: | Line 1: | ||
+ | ====== MARC Review and MARC Global: ' | ||
+ | |||
+ | Many years ago, we ' | ||
+ | into a single statement. See below for the manual entry for this syntax. | ||
+ | |||
+ | Although we are going to try to keep this way of doing things functional, we want to point out that in most cases using PCRE regular expressions //(the new way of doing things)// might be the better choice. | ||
+ | |||
+ | For example, consider this ' | ||
+ | |||
+ | {{: | ||
+ | |||
+ | Stringing together the starting numbers like this removes the need to make a separate pattern to match each number, and is a great improvement for the user. At runtime, though, the program splits these embedded patterns apart and runs the review just as if you had specified six different patterns: | ||
+ | |||
+ | < | ||
+ | TAG=082 SUBF=a DATA=^71 REGEX=True | ||
+ | TAG=082 SUBF=a DATA=^72 REGEX=True | ||
+ | TAG=082 SUBF=a DATA=^73 REGEX=True | ||
+ | TAG=082 SUBF=a DATA=^74 REGEX=True | ||
+ | TAG=082 SUBF=a DATA=^306 REGEX=True | ||
+ | TAG=082 SUBF=a DATA=^646 REGEX=True | ||
+ | </ | ||
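The run-time expansion described above can be sketched in Python. This is only an illustration, not the program's actual code: the function name `split_embedded_or` and the dictionary layout are hypothetical, and the input string assumes the six starting numbers were joined with the `||` symbol.

```python
def split_embedded_or(tag, subfield, data, regex=True):
    """Expand one pattern containing '||' into one pattern per alternative.

    Hypothetical helper: it mimics the splitting described in the text,
    not the program's real implementation.
    """
    # Split on the embedded 'or' symbol and drop empty pieces.
    parts = [p.strip() for p in data.split("||") if p.strip()]
    return [
        {"TAG": tag, "SUBF": subfield, "DATA": part, "REGEX": regex}
        for part in parts
    ]


# Assumed input: the six starting numbers joined with '||'.
patterns = split_embedded_or("082", "a", "^71||^72||^73||^74||^306||^646")
for p in patterns:
    print(f"TAG={p['TAG']} SUBF={p['SUBF']} DATA={p['DATA']} REGEX={p['REGEX']}")
```

Each alternative becomes its own pattern, so every 082 $a is tested once per alternative. That is where the extra cost comes from.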
+ | |||
+ | So, there is a performance hit with this way of doing things. Perhaps a more efficient way to write this review would be to combine the first four patterns into one-- | ||
+ | < | ||
+ | TAG=082 SUBF=a DATA=^7[1234] REGEX=True | ||
+ | </ | ||
+ | --so that the program only has to run three reviews on each 082, instead of six. | ||
+ | |||
+ | But now we have PCRE (which we did not have 20 years ago), and we can rewrite the above simply as: | ||
+ | |||
+ | {{: | ||
+ | |||
+ | PCRE uses parens to group **sub-patterns**((this, | ||
+ | |||
+ | It takes a bit of getting used to, but it should be worth it. | ||
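As a rough check of the idea, the sketch below uses Python's `re` module to compare the six separate patterns with one grouped pattern. Python's `re` is not PCRE itself, but grouping with parentheses and alternation with `|` behave the same way for a pattern this simple. The combined pattern `^(7[1234]|306|646)` is our own illustration and is not necessarily the pattern shown in the screenshot; the sample values are made up.

```python
import re

# The six patterns the program runs after splitting the embedded booleans.
SEPARATE = [r"^71", r"^72", r"^73", r"^74", r"^306", r"^646"]

# One grouped pattern (an assumed equivalent, written for this example).
COMBINED = re.compile(r"^(7[1234]|306|646)")


def matches_any(value):
    """True if any of the six separate patterns matches."""
    return any(re.search(p, value) for p in SEPARATE)


def matches_combined(value):
    """True if the single grouped pattern matches."""
    return COMBINED.search(value) is not None


# Made-up 082 $a values, including some that should not match.
samples = ["711.4", "720.9", "738", "746.92", "306.0973", "646.7",
           "750", "064.6", "17.1"]

for s in samples:
    print(s, matches_any(s), matches_combined(s))
```

For every sample both functions give the same answer, but the grouped version needs only one pass over each field.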
+ | |||
+ | ---- | ||
+ | < | ||
+ | |||
+ | It is possible, and sometimes necessary, to specify multiple patterns in a single | ||
+ | ' | ||
+ | one with one of the boolean symbols listed below. | ||
+ | |||
+ | The following boolean symbols are supported within the DATA box: | ||
+ | |||
+ | && = and | ||
+ | || = or | ||
+ | !! = not | ||
+ | |||
+ | You can use the following English equivalents for the above interchangeably, | ||
+ | they are enclosed in angle brackets (they are not case-sensitive): | ||
+ | |||
+ | < | ||
+ | < | ||
+ | < | ||
+ | |||
+ | An example of each of these three boolean expressions follows. | ||
+ | |||
+ | ' | ||
+ | True if both ' | ||
+ | |||
+ | ' | ||
+ | True if either ' | ||
+ | |||
+ | ' | ||
+ | True if there is a $d ' | ||
+ | |||
+ | These patterns can be combined with the standard Match Rules ' | ||
+ | (eg. 'NOT 035 $a OCoLC||TMQ' | ||
+ | |||
+ | NOTE: If you use a regular expression with an embedded boolean, it must be repeated for each | ||
+ | argument. For example: 949 $a = ' | ||
+ | </ | ||
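The manual entry above says the English equivalents are interchangeable with the symbols and are not case-sensitive. A minimal sketch of that normalization follows. The spellings `<and>`, `<or>` and `<not>` are an assumption based on the description (the exact tokens are not legible here), and `normalize_booleans` is a hypothetical name, not part of the program.

```python
import re

# Assumed English equivalents mapped to the symbols listed in the manual.
_EQUIVALENTS = {"and": "&&", "or": "||", "not": "!!"}

# Matches <and>, <or>, <not> in any letter case.
_TOKEN = re.compile(r"<(and|or|not)>", re.IGNORECASE)


def normalize_booleans(data):
    """Replace angle-bracket English operators with their symbol form."""
    return _TOKEN.sub(lambda m: _EQUIVALENTS[m.group(1).lower()], data)


print(normalize_booleans("OCoLC<or>TMQ"))
print(normalize_booleans("OCoLC<OR>TMQ"))
```

Both calls print `OCoLC||TMQ`, so a later splitting step only has to handle the symbol form.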
+ | |||