Differences
This shows you the differences between two versions of the page.
233:marc_review_diacritics [2013/04/27 13:09] |
233:marc_review_diacritics [2021/12/29 16:21] (current) |
||
---|---|---|---|
Line 1: | Line 1: | ||
+ | __Searching for special characters in MARC Review__ | ||
+ | There is now an easier way to find all records with special characters, such as diacritics, using MARC Review. | ||
+ | |||
+ | Start MARC Review, go to the Pattern form, and enter the following pattern: | ||
+ | |||
+ | {{: | ||
+ | |||
+ | If you run this review, it will match all records that have characters outside the normal ASCII range. | ||
+ | |||
+ | This review is possible because MARC Review now supports regular expression ranges that include non-printing characters. The example above uses hexadecimal notation, {x7F} and {xFF}; decimal notation also works: {127} and {255}. | ||
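For readers who want to see the same idea outside MARC Review, here is a minimal Python sketch. It assumes the pattern is a single range running from {x7F} to {xFF}, and that records are available as raw bytes; the function name `has_special_chars` is hypothetical and not part of MARC Review.

```python
import re

# Assumption: the MARC Review range {x7F}-{xFF} corresponds to this byte class.
# In decimal notation the same bounds are 127 and 255.
SPECIAL_CHARS = re.compile(rb"[\x7F-\xFF]")


def has_special_chars(record: bytes) -> bool:
    """Return True if the raw record contains any byte in the range 0x7F-0xFF."""
    return SPECIAL_CHARS.search(record) is not None


# The hex and decimal spellings name the same values.
assert (0x7F, 0xFF) == (127, 255)
```

A record made only of plain ASCII letters does not match, while any record containing a byte of 0x7F or above does.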
+ | |||
+ | At present (233 beta), this type of review is only possible for a whole-record search (i.e., a pattern where TAG=' | ||
+ | |||
+ | If your records use MARC-8 encoding, you could refine this review to find all invalid characters: | ||
+ | |||
+ | TAG=XXX | ||
+ | DATA= [{x01}-{x1A}{x1C}{x80}-{x87}{x8A}-{x8C}{x8F}{xAF}{xBB}{xBE}{xBF}{xC9}-{xDF}{xFC}{xFD}{xFF}] | ||
+ | REGULAR EXPRESSION=True | ||
+ | |||
+ | The character set specified above is simply the complement of the MARC-8 characters listed as valid by LC: http:// | ||
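The character class above can also be tried outside MARC Review. The following Python sketch transcribes the DATA pattern byte for byte into a `re` character class and reports where invalid bytes occur. It assumes records are read as raw MARC-8 bytes; the function name `find_invalid_marc8` is hypothetical.

```python
import re

# Direct transcription of the DATA pattern above: each {xNN} becomes \xNN.
INVALID_MARC8 = re.compile(
    rb"[\x01-\x1A\x1C\x80-\x87\x8A-\x8C\x8F\xAF\xBB\xBE\xBF\xC9-\xDF\xFC\xFD\xFF]"
)


def find_invalid_marc8(record: bytes) -> list[tuple[int, str]]:
    """Return (offset, hex value) for every byte matched by the invalid-character class."""
    return [
        (m.start(), f"0x{m.group()[0]:02X}")
        for m in INVALID_MARC8.finditer(record)
    ]
```

Note that the class deliberately skips {x1B} (escape) and {x1D} through {x1F} (the MARC structural delimiters), so those bytes are not reported.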