
Global Change Options

The options that appear on this page depend on the type of Global Change you selected on the previous page.

GENERAL NOTES

To enter a MARC subfield delimiter, press <Ctrl>D (hold down the control key and press the letter 'D').

In forms where a 'Case Sensitive' option is available, note that subfield codes (i.e. the character that follows the MARC subfield delimiter) are always case-sensitive, regardless of the value of this setting.

Keep in mind that, if you specify patterns, your global change will be applied only in the records that match these patterns; if you want a global change applied to all of the records in a file, do not specify a pattern.

However, also keep in mind that almost every global change that you can specify implies a pattern. For example, a task such as 'Delete Tag=650' can only be applied to records that have a 650 tag; there is no need to explicitly add the pattern 'AND Tag=650'. Another example, 'Copy Subfield 049 $a to 949 $a', can only be applied to records that have a 049 tag with a subfield $a in it; again, there is no need to specify the pattern 'AND Tag=949 Subf=a' (however, doing so, in either case, will do no harm).

If you need to restrict a global change to a specific occurrence of a Tag or Subfield, you should first create a Pattern Match to select the required occurrence. There is generally no support for Tag or Subfield occurrences in the options described on this page.

IMPORTANT: The examples below assume that no pattern matches have been created (i.e. the phrase 'in every record of the file' is used frequently below). This is not a recommendation! But it makes it easier to illustrate how each type of global change works. Also, the examples are designed to demonstrate the program's functionality and may not always make sense from a cataloging point of view.

DELETE A TAG (OR TAGS)

Enter the tag that you want to delete. This tag will be deleted from all matching records.

Example: Tag=650 Action: Deletes every occurrence of tag 650 in every record in the file.

You may also enter a list of tags to be deleted. If you enter a list of tags, each tag must contain three digits (or an 'X', see below), and tags must be separated from one another by one blank space.

Example: Tag=049 090 092 590 690 Action: Deletes every occurrence of every 049, 090, 092, 590, and 690 in every record of the file.

Both of the above cases also support MARC 'X' tags and tag ranges. For example:

Tag=90X              --Delete tags 900 901 902 903 904 905 906 907 908 909
Tag=900-909          --Delete tags 900 901 902 903 904 905 906 907 908 909
Tag=049 9XX          --Delete the 049 and all 9XX tags
Tag=900-950 952-999  --Delete all 9XX tags EXCEPT tag 951

The program also supports the use of 'not' in a tag expression. For example, the following two expressions in the Tag box evaluate to the same thing:

900-950 952-999
9XX not 951

In each case, all of the tags from 900 to 999, inclusive, with the exception of 951, will be deleted from each record.

'Not' also supports a list of tags, with both tag ranges and 'X' tags supported:

9XX not 949-953  --Delete all tags beginning with '9' except for 949 950 951 952 953
9XX not 91X      --Delete all tags beginning with '9' except for 910-919

Although it doesn't make much sense to enter a tag in the 'not' part of the expression that is not included in the list, it is not an error to do so (eg. 900-911 not 912). Also, a 'not' expression that results in no tags at all being specified has the peculiar result of deleting whatever has been entered into the Tag box (eg. 900-950 not 9XX).

There are restrictions on this syntax. You cannot use 'XXX' or '00X'. The expression '0XX' translates to '010-099', so it will not delete control tags (i.e. 000-009); work around this by specifying control tags one at a time: '002 004 006'. A tag range must contain three digits, followed by a dash, followed by three digits; using 'X' tags in a range will generate an error, as will a shortcut like '900-50'. Any tag range that begins with a control tag will be ignored.
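
The tag syntax described above ('X' tags, ranges, lists, and 'not') can be illustrated with a short Python sketch. This is not the program's own code: the function names `expand_term` and `expand` are hypothetical, and the sketch only models the rules stated in this section (it does not model the peculiar 'not' case that leaves no tags at all).

```python
import re

def expand_term(term):
    """Expand one term ('650', '90X', '9XX', '900-909') into a set of tags."""
    term = term.upper()
    if term in ("XXX", "00X"):
        raise ValueError(f"unsupported tag expression: {term}")
    if re.fullmatch(r"\d{3}", term):
        return {term}
    if re.fullmatch(r"\d{3}-\d{3}", term):
        low, high = (int(part) for part in term.split("-"))
        if low < 10:               # a range that begins with a control tag is ignored
            return set()
        return {f"{n:03d}" for n in range(low, high + 1)}
    if re.fullmatch(r"\d\dX", term):
        return {term[:2] + str(d) for d in range(10)}
    if re.fullmatch(r"\dXX", term):
        first = 10 if term[0] == "0" else 0    # '0XX' means 010-099
        return {f"{term[0]}{n:02d}" for n in range(first, 100)}
    raise ValueError(f"invalid tag expression: {term}")

def expand(expression):
    """Expand a Tag box expression, honouring an optional 'not' part."""
    include, _, exclude = expression.lower().partition(" not ")
    wanted = set().union(*(expand_term(t) for t in include.split()))
    unwanted = set().union(*(expand_term(t) for t in exclude.split()))
    return sorted(wanted - unwanted)

print(expand("9XX not 951") == expand("900-950 952-999"))  # True
```

A shortcut such as '900-50', or an 'X' tag inside a range, falls through to the final `ValueError`, mirroring the errors described above.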

DELETE A SUBFIELD

Enter the tag and the subfield code that you want to delete. This subfield will be deleted from all matching records. When a subfield is removed from a tag, it may affect the punctuation in the remaining subfields. Click on the Punctuation Options button to configure how punctuation will be handled.

Example: Tag=700 Subf=e Action: Deletes every occurrence of subfield e from every tag 700 in every record in the file.

ADD A NEW TAG

Enter the tag, the indicators (or leave the indicators empty for the default, which is two blank indicators), the subfield, and the data for the field you want to add. If the new data contains more than one subfield, leave the subfield box blank, and enter all of the subfield codes as part of the data (using <Ctrl>D).

You can also select an overlay option. Select 'Skip record' if you want to go on to the next record when the tag you are adding is already present in the record, select 'Add new tag' to add your tag to the record anyway, or select 'Replace tag' to replace the first matching tag in the record with the tag you have defined above.

Example: Tag=690 Ind1= Ind2= Subf= Data=±aOffline±bBin 55. [Replace] Action: Add a tag 690, with two blank indicators and the data shown above, to every record in the file; if the record has any 690s, delete all of them before adding the new tag.

NOTE ABOUT PATTERNS AND OVERLAY OPTIONS

Patterns are applied before the overlay option; that is, if you have specified patterns, and a record does not match, then the overlay option (Skip, Add, Replace) is never consulted. Once the program has evaluated all of the patterns in a review, and determines that the record does not match, it goes on to the next record without looking at any of the global change parameters.
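
The order of evaluation described above can be sketched in Python. This is an illustration only: the record is modelled as a plain list of (tag, data) tuples, the function name `add_tag` is hypothetical, new tags are simply appended rather than placed in tag order, and 'replace' follows the option description (it replaces the first matching tag).

```python
def add_tag(record, new_field, overlay, matches_pattern=True):
    """record: list of (tag, data) tuples; new_field: one (tag, data) tuple."""
    if not matches_pattern:
        return record                      # the overlay option is never consulted
    tag = new_field[0]
    present = [i for i, (t, _) in enumerate(record) if t == tag]
    if not present or overlay == "add":
        return record + [new_field]        # tag absent, or 'Add new tag'
    if overlay == "skip":
        return record                      # 'Skip record'
    if overlay == "replace":               # 'Replace tag': first matching tag
        i = present[0]
        return record[:i] + [new_field] + record[i + 1:]
    raise ValueError(f"unknown overlay option: {overlay}")

record = [("245", "Title"), ("690", "old")]
print(add_tag(record, ("690", "new"), "replace", matches_pattern=False) == record)  # True
```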

ADDING TAGS FROM A TEXT FILE

It is possible to use the 'Add a Tag' option to add data from a textfile to matching records.

If you are not familiar with the topic 'USING MARC REVIEW TO SEARCH A LIST OF ITEMS', you should review that now (start MARC Review, press Next, then Help–this section on 'Lists' is near the bottom).

In MARC Global, this feature requires a tab-delimited textfile that contains two columns. The first column must contain a match key (eg. a control number) that will select the records that you want to add the data to. Each match key should only appear once, and should uniquely identify a record.

The second column contains the data that will be added to the matching records. Use a '$' for the subfield delimiter in this column–it will be automatically converted.

If either column is null, the whole row is discarded. If the first column requires a regular expression to match the records correctly, then add the regular expression in the textfile.

For example, consider a textfile like the following, where the first column contains a control number, and the second contains a URI:

pn08050191  $uhttp://zebra.pnet.com/eBooks?docNum=34771$zClick here to access
pn08091322  $uhttp://zebra.pnet.com/eBooks?docNum=25901$zClick here to access
pn09020156  $uhttp://zebra.pnet.com/eBooks?docNum=28321$zClick here to access
pn09021159  $uhttp://zebra.pnet.com/eBooks?docNum=30981$zClick here to access
…

Because the control number in the above example is normalized, there's no need to wrap it in a regular expression.

To add the URI fields to the matching MARC records, follow these steps:

1. Start MARC Global and press Next.
2. In the PATTERN form, enter 001 (or the control number tag) in the TAG box, then press TAB.
3. Right-click on the DATA box and select your textfile. If there are no problems, the program will display the number of match keys and data rows loaded from the textfile. Click OK.
4. Press NEXT, select ADD A NEW TAG, then press NEXT again.
5. Enter 856 (or the tag you want to add the URI to) in the TAG box, then press TAB twice.
6. Right-click on the DATA box. The program will add the first URI from the textfile to the DATA box and ask for confirmation.

From here on in, you may use the standard MARC Global options. (For example, if an 856 already exists in the matching record, do you want to Add a new one? Replace the existing one? etc.)

When you press RUN to start the job, the program will:

1. Read each MARC record in the source file.
2. If the record contains the tag from the pattern form (001), check whether the data in that tag matches a key from the first column of the textfile.
3. If there is a match, add the corresponding data from the second column in the textfile to the MARC record in the specified tag (856).
4. If there is not a match, go to the next MARC record in the file.
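
The run described above can be sketched in Python. This is a simplified model, not the program's implementation: records are lists of (tag, data) tuples, the function names `load_list` and `add_from_list` are hypothetical, keys are compared by exact match only (no regular expressions), and indicators are ignored. The one fixed fact used is that the MARC subfield delimiter is the byte 0x1F.

```python
DELIM = "\x1f"   # MARC subfield delimiter

def load_list(text):
    """Parse a two-column, tab-delimited list into {match key: data}."""
    table = {}
    for line in text.splitlines():
        key, _, data = line.partition("\t")
        if not key or not data:
            continue                       # a null column discards the whole row
        if key in table:
            raise ValueError(f"duplicate match key: {key}")
        table[key] = data.replace("$", DELIM)
    return table

def add_from_list(records, table, key_tag="001", target_tag="856"):
    """Append the listed data to every record whose key tag matches."""
    for record in records:
        key = next((data for tag, data in record if tag == key_tag), None)
        if key in table:
            record.append((target_tag, table[key]))
    return records

listing = "pn08050191\t$uhttp://zebra.pnet.com/eBooks?docNum=34771$zClick here to access\n"
records = [[("001", "pn08050191")], [("001", "pn00000000")]]
add_from_list(records, load_list(listing))
print(len(records[0]), len(records[1]))  # 2 1
```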

NOTES

It's common to forget to check the regular expression box when loading a review from a textfile, so if nothing seems to match, check this first (if applicable).

If the data column (column 2) from the textfile lacks an opening subfield, then you must add it to the 'SUBF' box on the 'ADD A NEW TAG' form; on the other hand, if the opening subfield is included (as in the above example), then you must clear the SUBF box.

Duplicate match keys (column 1) are not permitted, and if the program detects them when loading the list, it will tell you so.

On the other hand, if more than one record in the MARC file matches a match key (column 1), then the same data column (column 2) will be added to each MARC record.

ADD A SUBFIELD TO A TAG

Enter the tag, the subfield, and the data for the subfield you want to add. You cannot embed subfield codes in the data using this option.

You can specify exactly where you want the subfield to go in the tag to which it is being added:

at the start of the tag, 
at the end of the tag, 
before any subfield code that you specify,
after any subfield code that you specify.

However, if you specify that the new subfield should go before or after an existing subfield, and that existing subfield is not present in the record, then no change will take place.

When a subfield is added to an existing tag, it may affect the punctuation in the other subfields. Click on the Punctuation Options button to configure how punctuation will be handled.

You can also select an overlay option. If the subfield you are adding is already present in the tag specified, select 'Skip record' if you want to go on to the next record, select 'Add new subfield' to add your subfield to the tag anyway, or select 'Replace subfield' to replace the first matching subfield in the tag with the subfield you have defined above.

Finally, if you are adding a subfield to a tag, and the tag is not present in the record, you can select the 'Create New Tag' option to add a new tag to the record that contains only your subfield.

Example: Tag=590 Subf=z Data=rf [To end of tag] [Skip] [Create a new tag] Action: Add subfield z containing 'rf' to the end of EVERY 590 tag in every record in the file that does not already have a 590 with $z. If a record does not have a 590 tag, a new 590 tag is added to the record that contains only '$zrf'.

COPY AND CHANGE TERMINOLOGY

In the remaining global change types, there are two sections on each form. The top section is where you describe what you want to be copied or changed, and the bottom section is where you describe what you want it to be copied or changed to. We refer to the section on the top as the 'Source', and the section on the bottom as the 'Target'.

The presence of the source data also has the same effect as a pattern match. That is, if the source data that you want to copy or change is not present in a record, then no change will take place.

COPY A TAG

Enter the tag you want to copy in the topmost Tag box (the Source), and the destination tag number in the Tag box below (the Target).

If you want the source Tag to be deleted from the record after it is copied to the target tag, select the 'Move Tag' option. There is no functional difference between using 'Copy Tag' with the 'Move Tag' option selected and simply renaming a tag using 'Change Tag' (see below).

You can also select an overlay option. Select 'Skip record' if you want to go on to the next record when the tag you are copying is already present in the record, select 'Add new tag' to add your tag to the record anyway, or select 'Replace tag' to replace the first matching tag in the record with the tag you have defined above.

By default, Copy a Tag copies all occurrences of the tag specified. If you want to copy only a specific occurrence of a tag, use a pattern to specify the occurrence you want to copy.

Example: Source Tag=650 Target Tag=690 [Skip] Action: Copies every 650 tag to a new 690 tag in every record that does not have a 690 tag.

Example: Source Tag=650 Target Tag=690 [Add new] Action: Copies every 650 tag in every record to a new 690 tag.

Example: Source Tag=650 Target Tag=690 [Move Tag] Action: Copies every 650 tag in every record to a new 690 tag, then deletes the copied 650 tag.

COPY A SUBFIELD

Enter the tag and subfield you want to copy in the Source section, and the destination tag and subfield in the Target section. Use the overlay options (as described above) to determine what will happen if the target subfield already exists in the target tag.

If you want the source subfield to be deleted from the record after it is copied to the target tag, select the 'Move Subfield' option. 'Move' is not supported when either 'Occ' box is set to 'All' (see below).

By default, Copy a Subfield copies only the first occurrence of the matching subfield. If you want to copy all matching subfields, select 'All' from the 'Occ' box in the 'Copy From' area; if you want to copy only a specific occurrence of a subfield in a tag, enter the occurrence in the 'Occ' box in the 'Copy From' area. In either case, for best results, set up a pattern to specify that occurrence before setting that occurrence on this form.

By default, Copy a Subfield copies the data to the first occurrence of the tag specified in the 'Copy To' area. If you want the data copied to a specific occurrence of a tag, enter the occurrence in the 'Occ' box of the 'Copy To' area. Note that if a specified occurrence does not exist, no copying will occur. If you want the data copied to all occurrences of a tag, select 'All' from the 'Occ' box in the 'Copy To' area.

The validation for this type of global change will block certain uses of the occurrence field that are not supported. For example, the use of 'All' in both the 'Copy From' and the 'Copy To' Occ boxes at the same time is not supported.

There is a special option in Copy Subfield to copy only a part of a subfield. To access this option, right-click in the 'Subf' box in the 'Copy From' area.

There are two ways to use this option:

By specifying the offset and length within the subfield of the substring to be copied
By specifying a pattern to match the beginning of the substring to be copied

The offset count begins with 1 (not 0), where '1' refers to the first byte after the specified subfield.

If you are working with fairly fixed data, you can use the 'Start At' field to enter the position–within the subfield–to begin copying from, and then use the 'Length' field to enter the number of bytes to copy. Leave 'Length' set to 0 to copy all remaining bytes in the subfield.

On the other hand, a pattern can also be entered in the 'Start At' field. The case-sensitive and regular expression options available in standard patterns are also supported here. The 'Length' field works as above; leave it to '0' to copy all remaining bytes (beginning with the first byte that matches the pattern), or set it to a number to copy only that many bytes from the point where the pattern matches.
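
The two ways of selecting part of a subfield can be sketched in Python. The function name `copy_part` is hypothetical, and the sketch counts characters rather than bytes, which is the same thing only for single-byte data. It uses the data values from the examples in this section.

```python
import re

def copy_part(value, start_at, length=0, regex=False, case_sensitive=True):
    """Return the part of a subfield value selected by 'Start At' and 'Length'.

    start_at is either a 1-based offset (int) or a pattern (str).
    length 0 means 'copy all remaining data'. Returns None if nothing matches.
    """
    if isinstance(start_at, int):
        begin = start_at - 1                       # offsets begin with 1, not 0
        if begin < 0 or begin >= len(value):
            return None
    else:
        pattern = start_at if regex else re.escape(start_at)
        flags = 0 if case_sensitive else re.IGNORECASE
        match = re.search(pattern, value, flags)
        if match is None:
            return None
        begin = match.start()                      # copy from the first matching character
    return value[begin:] if length == 0 else value[begin:begin + length]

url = "http://web.lib.com/eBooks?docId=CX3477199999"
print(copy_part(url, "CX[0-9][0-9]", 12, regex=True))   # CX3477199999
print(copy_part("Univ_NoColorado", "Univ"))             # Univ_NoColorado
```

The second call shows why this option is not a substitute for pattern matching: with 'Length' left at 0, everything from the match to the end of the subfield is copied.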

For example, this option might be used to copy document IDs that are embedded within a URL; consider the following field:

856 40 $uhttp://web.lib.com/eBooks?docId=CX3477199999$zAccess here.

The ID 'CX3477199999' might be useful in another field for matching and updating records. By using the 'Copy Subfield' task:

TAG=856 SUBF=u

then right-clicking the SUBF box:

START AT=CX[0-9][0-9]
LENGTH=12
REGULAR EXPRESSION=CHECKED

we can generate a control number in each record that contains only the 'docID' portion of the 856 $u:

035 $aCX3477199999

One caveat here: do not use this special option to replace pattern matching. For example, with the following fields:

901 $aColoradoSt_Univ$b.b22873715
901 $aUniv_NoColorado$b.b1482338x

if 'Start At' was set to 'Univ' and 'Length' set to '0', the result would be 'Univ' being copied from the first field, and 'Univ_NoColorado' being copied from the second field, which is probably not what you want.

Copying a subfield is similar to adding a subfield except that the data is taken from the record. Similarly, you can specify exactly where you want the subfield to be copied:

at the start of the target tag,
at the end of the target tag,
before or after any subfield code in the target tag that you specify.

If you specify that the subfield being copied should go before or after an existing subfield, and that existing subfield is not present in the record, then no change will take place.

Also, when a subfield is added to an existing tag, it may affect the punctuation in the other subfields. Click on the Punctuation Options button to configure how punctuation will be handled.

If you are copying a subfield, and the target tag is not present in the record, you can set an option to add a new tag for the subfield being copied.
And if the subfield you are copying is already present in the target tag, use one of the following overlay options: select 'Skip record' to cancel the copy and go on to the next record; select 'Add new subfield' to add your subfield to the tag in a new subfield; or select 'Replace subfield' to replace the first matching subfield in the tag with the subfield you have defined above.

Hint: If you want to rename all the subfields within a tag, use the Change Subfield option instead.

CHANGE A TAG

Enter the number of the tag you want to change in the top section, and the new tag number in the bottom section. This global change simply modifies the tag in the directory. It is much simpler than using the 'Move Tag' option in 'Copy Tag', although it lacks the overlay options of the latter.

Example: Source Tag=650 Target Tag=690 Action: Changes every occurrence of 650 to 690 in every record in the file.

CHANGE A TAG OCCURRENCE

This option was designed to make it possible to change a tag's position in a record. Enter a tag and an occurrence number in the top section, then enter which occurrence you want the tag moved to in the bottom section. You can enter numbers (1, 2, 3, etc) in the occurrence boxes, or select an option from the pulldown list. Note that this global change does not simply modify the tag's offset in the directory, but actually moves the data to the requested occurrence position within the record.

Example: Source Tag=650 Occ=First Target Occ=Last Action: Moves the first 650 so that it is the last occurring 650 in every record in the file.

CHANGE AN INDICATOR (OR INDICATORS)

Enter the tag for which you want to change the indicators in the top section. Enter the new indicator values for this tag in the bottom section. You can change both indicators at the same time, or just one or the other. If a value is entered for an indicator in the top section, then only indicators that match that value will be changed; if a value is not entered, then it acts as a wild card.
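
The difference between an empty indicator box (a wild card) and a blank indicator can be sketched in Python. The function name `change_indicators` is hypothetical; `None` stands for a box that was left empty, and `' '` stands for a literal blank indicator.

```python
def change_indicators(ind, source=(None, None), target=(None, None)):
    """ind: the tag's two indicators as a 2-character string.

    source/target: (Ind1, Ind2) from the top and bottom sections of the form.
    None = box left empty; ' ' = a literal blank indicator.
    """
    for have, want in zip(ind, source):
        if want is not None and have != want:
            return ind                     # a source value was given and does not match
    # An empty target box leaves that indicator unchanged.
    return "".join(old if new is None else new for old, new in zip(ind, target))

print(change_indicators(" 4", source=(None, " "), target=(None, "0")) == " 4")  # True
print(change_indicators("14", target=(None, "0")))                              # 10
```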
For example, if Indicator 1 in the top form is '1', and Indicator 1 in the bottom form is '0', only tags that contain Ind 1 = '1' will be changed to '0'. However, if Indicator 1 in the top form is empty (as opposed to containing a blank), all Indicator 1 values for the tag will be changed to '0'.

To enter a blank space, press the Space Bar on your keyboard. A special character (that looks like a small square) will appear when you tab away from the edit box. Note that there is a difference between entering a blank space (which is a valid indicator value in MARC), and not entering anything at all, which matches all indicators for that position.

Example: Source Tag=650 Ind1= Ind2=[blank space] Target Ind1= Ind2=0 Action: Changes the second indicator to 0 in every 650 where indicator 2 is a blank, in every record in the file.

Example: Source Tag=650 Ind1= Ind2= Target Ind1= Ind2=0 Action: Changes the second indicator to 0 in every 650 in every record in the file.

CHANGE A SUBFIELD

Enter the tag and subfield that you want to change in the top section, and enter the new subfield code in the bottom section.

Example: Source Tag=852 Subf=a Target Subf=b Action: Changes every subfield a to subfield b in every 852 in every record in the file.

CHANGE SUBFIELD ORDER

Enter the tag that you want to change the subfield order in. Then you can either select 'Sort subfields alphabetically', or enter your own subfield sort order in the 'Subf' box below.

If you choose to sort alphabetically, the subfields will be arranged in the order 'a'..'z', '0'..'9'. In the current version, numeric subfields are added (in order) to the end of the tag (whereas in a true ASCII sort, numeric subfields would precede the alphabetic subfields).

If you enter your own subfield order, the program will re-arrange the subfields in the field according to that order.
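
The two orderings in this section can be sketched in Python. The function name `reorder` is hypothetical and subfields are modelled as (code, value) tuples. The sketch assumes that repeated subfield codes keep their relative order, and it appends subfields missing from a custom list in their original order.

```python
def reorder(subfields, order=None):
    """subfields: list of (code, value) tuples.

    order=None sorts alphabetically: 'a'..'z' first, then '0'..'9'.
    Otherwise order is a string of subfield codes such as 'aw'.
    """
    if order is None:
        # Letters sort before digits, the reverse of a plain ASCII sort.
        return sorted(subfields, key=lambda sf: (sf[0].isdigit(), sf[0]))
    codes = list(dict.fromkeys(order))                 # drop repeated codes
    listed = [sf for code in codes for sf in subfields if sf[0] == code]
    rest = [sf for sf in subfields if sf[0] not in codes]
    return listed + rest                               # unlisted subfields go last

field = [("w", "1"), ("c", "2"), ("a", "3")]
print(reorder(field, "aw"))  # [('a', '3'), ('w', '1'), ('c', '2')]
```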
Any subfields that appear in the field and are not present in your list will be added to the end of the new field (after all the subfields in your list) in the same order that they occur in. Any subfields in your list that do not appear in a field are ignored.

Example: Source Tag=949 Target Subf=aw Action: Changes 949 so that the first two subfields are 'a' and 'w'. Any other subfields present in the 949 will be appended, after $w, in the same order that they occur.

CHANGE DATA

This will probably be the global change that you use the most. It is also the primary global change that can be applied to a fixed field (by which we refer to the leader and tags 006-008).

Enter the tag, subfield, and data that you want to change in the top section. You cannot leave the 'Data' box blank. If you leave the subfield box blank, the program will search the whole tag for whatever value is in the data box. Alternately, you can embed subfields in the Data box (but you cannot both enter a value into the subfield box, and embed a subfield into the data box).

If you want the program to match the source data without regard to case, un-select the Case Sensitive checkbox. If the source data contains a regular expression, you must select the Regular Expression checkbox (or the program will treat any metacharacters as literal characters).

In the bottom section, enter the new data. This is the data that will replace whatever you entered in the Data box above. If you leave this box blank, then any data that matches the source data will be deleted (i.e. the effect is to replace something with nothing).

The 'Data Occ' box allows you to specify which occurrence of a data pattern you want to change. The default is the first matching occurrence of the string entered in the 'Data' box below.
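
The effect of the 'Data Occ' setting can be sketched in Python for plain (non-regular-expression, case-sensitive) data. The function name `change_data` is hypothetical, and the sketch assumes that occurrences are counted left to right without overlapping. It uses the 'Mississippi' data from the example in this section.

```python
def change_data(value, old, new, occ="first"):
    """Replace one or all occurrences of old in value.

    occ: 'first', 'all', or a 1-based occurrence number.
    """
    if not old:
        raise ValueError("the 'Data' box cannot be left blank")
    if occ == "all":
        return value.replace(old, new)
    wanted = 1 if occ == "first" else int(occ)
    start = 0
    for _ in range(wanted):
        pos = value.find(old, start)
        if pos == -1:
            return value                   # the requested occurrence does not exist
        start = pos + len(old)
    return value[:pos] + new + value[pos + len(old):]

print(change_data("Mississippi", "iss", "izz"))          # Mizzissippi
print(change_data("Mississippi", "iss", "izz", "all"))   # Mizzizzippi
print(change_data("Mississippi", "iss", "izz", 2))       # Missizzippi
```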

Example: Tag=650 Subf=z Data=iss DataOcc=First New data=izz

If the 650 contains a '$zMississippi', the first matching data occurrence would occur after the 'M', and the second matching data occurrence would begin before 'ippi'. The default action would be to change 'Mississippi' to 'Mizzissippi'. If you want the result to be 'Mizzizzippi', set the 'Data Occ' box to 'All' instead of 'First'; if you want the result to be 'Missizzippi', enter '2' in the 'Data Occ' box.

Hint: Don't make the mistake of using the 'Data Occ' box when you really mean 'Tag Occ' or 'Subf Occ'. To single out an occurrence at one of these other levels, use pattern matching.

This example changes all occurrences of ' and ' to ' & ' in a 520:

Tag and Data to change: Tag=520 Subf= Data=' and ' (without the quotes) Data Occ=All
New data=' & ' (without the quotes)

At this time, the use of regular expressions in the 'New data' box has an extremely narrow focus. This option will be enabled only if:

1) the 'Data to change' contains a regular expression subpattern, and
2) the 'New data' contains a back reference to the subpattern

For more info and examples, please see the following article on our wiki: http://www.marcofquality.com/wiki/mrt/doku.php?id=236:pcre_and_mg

DIACRITICS

MARC Global recognizes the same syntax as MARC Review uses to specify diacritics, both in the 'Data to Change' and the 'New Data' boxes. For example, to change the letter 'c' into the copyright symbol:

Data=c New data=\xC2\xA9

IMPORTANT: Do not use the older MARC Review curly brace syntax, eg.

New data={xC2}{xA9}

as a replacement pattern in MARC Global, as it will, in this case, replace the 'c' with the string '{xC2}{xA9}'.

CHANGING FIXED FIELD DATA

To make changes to a fixed field, enter the tag and then press the <Tab> key. This will bring up the default fixed field template for that tag. You can change to a different template by clicking on the small 'book' icon next to the Cancel button.

Using a template has the result of restricting a global change to only those records associated with the template. For example, if you are entering data for an 008, and you select the 'Books' template, then your change can only take place on records that are 'Books' (ie Print materials); all other types of materials will be excluded from the global change.

If you want to remove this format restriction and change all occurrences of the fixed field, select the 'Any format' template. This template lists only elements that have the same meaning in each format or material type. The implication is that it is not safe to globally change elements whose meanings vary with the format of the record.

On the template, go to the fixed field element that you wish to change, and enter the data to be changed. When you are finished, click Save to return to the Change Data form.

You can only change one data element at a time in the current version. If you do happen to enter changes for more than one element on the template, the program will accept the last element entered and discard the others.

The default value for each element is a blank-filled element. These blanks do not act as wildcards, as in the Change Indicator section described above. They will match literal blanks in the corresponding MARC record. If you want to use a wildcard, enter a regular expression, eg. enter a period to match any single character. When you do this, just remember to select the Regular Expression option on the Change Data form.

Once you have selected a template for a fixed field, you can easily change it by clicking on the (blue) label for the format or material type. Or you can click on the Tag box and press the <Tab> key.

In the bottom section, enter the new data. This is the data that will replace whatever you entered in the Data box above. You cannot leave the 'New data' box empty for a fixed field.

SPECIAL WARNING FOR FIXED FIELDS

For fixed fields, the length of the data to be changed must equal the length of the replacement data. Changing the length of a fixed field will render some or all of the data in the field corrupt; changing the length of the leader will corrupt the whole record.

To guard against this problem, we use two techniques. First, in the program code, we do not allow the fixed field length to change (ie either become longer or shorter) during a global change. Second, during data entry on the Change Data form, if the program detects a situation where the length of the data being changed does not equal the length of the replacement data, it will display a warning, and prevent you from continuing until the problem is fixed. However, if a regular expression is being used, it may not be possible for the program to evaluate this situation correctly; in that case, the warning allows you to proceed at your own risk.

The program does not yet have a way to globally fix fixed fields that are not the correct length to start with. This will be added in a later version.

CHANGING ONLY A PART OF A FIXED FIELD ELEMENT

If a fixed field element is longer than one byte, and you only want to change a part of it, follow the steps outlined below. Remember that positions in fixed fields are zero-based; it might be a good idea to have your LC MARC manual open when you try this type of global change.

1. Select the element in the template, and enter the data to be changed (as above).
2. Manually edit the 'Pos', 'Len', and 'Data' fields on the Change Data form to meet your needs.
3. Finally, enter the replacement data.

For example, the Illustrations field for Books is 4 bytes long and begins at position 18 in the 008. If you wanted to set only the last three bytes of this field to blank, you would:

1. Select the element in the template and click Save.
2. Change the 'Pos' box from 18 to 19, change the 'Len' from 4 to 3, enter three periods in the 'Data' box, and select the Regular Expression option.
3. Enter three blanks in the 'New Data' box.

And although we do not recommend it, you could also use this technique to globally change any position in a fixed field, regardless of the format or material type.

ADD SEQUENTIAL NUMBER

Select this type of global change if you want to add a tag that contains an auto-incrementing number (i.e. a number that increases by one with each record processed).

Enter the tag where you want to add this number in the 'Tag' box, and the subfield (if it is a variable tag) in the 'Subf' box. The number to start at and the formatting for that number are to be entered in the 'Data' box.

If you just enter a simple number (eg. '456'), then that number will be added in the tag specified above to the first record. With each subsequent record, the number you enter is increased by one, so that in the second matching record, '457' is added, and in the third, '458' is added, etc.

If you select the option called 'Increase number for matching records only', then your starting number will only be increased when a record matches a pattern. (If no pattern was specified, then all records will match, and this option will not apply).

For example, let's say you enter '901' in the Tag box, 'a' in Subf box, '1' in the Data box. The first record will be assigned '1', the second '2', and so on. But if you also specify a pattern, like 'NOT 040a=DLC', and then select the 'matching records only' option, then the first record with an 040a that is not 'DLC' will be assigned '1', the second record with an 040a that is not 'DLC' will be assigned '2' (even if this is actually the 100th record in the file), etc.

If you want the number to be left-justified (or fixed-length), you must prefix the starting number with zeroes.
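
The way a starting value in the 'Data' box turns into a sequence can be sketched in Python, including the zero-prefix rule just described and the prefix and '%' rules described later in this section. The function name `sequence` is hypothetical, and the sketch only generates the values; it does not model the 'matching records only' option.

```python
import re

def sequence(start, count):
    """Return the first `count` values generated from a starting value."""
    if "%" in start:
        prefix, digits = start.split("%", 1)     # '%' separates a prefix that contains digits
    else:
        match = re.fullmatch(r"(\D*)(\d+)", start)
        if match is None:
            raise ValueError(f"badly formatted starting number: {start}")
        prefix, digits = match.groups()
    if not digits.isdigit():
        raise ValueError(f"badly formatted starting number: {start}")
    width, first = len(digits), int(digits)      # leading zeroes fix the minimum width
    return [f"{prefix}{first + i:0{width}d}" for i in range(count)]

print(sequence("000456", 3))        # ['000456', '000457', '000458']
print(sequence("t3r%00000001", 1))  # ['t3r00000001']
print(sequence("001", 1000)[-1])    # 1000
```

The last call shows the overflow case: once the counter outgrows the zero-padded width, the numbers simply become longer.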
For example, a starting number like '000456' will create numbers that are always 6 bytes long: '000456', '000457', '000458', etc. You can add as many zeroes as you want. Note that if you enter a starting number like '001', and your file is larger than 1,000 records, all records after 1,000 will have numbers that are (at least) 4 bytes long instead of 3. This may not be what you want.

On the other hand, if you enter a starting number like '2003000001' (like a LCCN), then all of your numbers will always be 10 digits long. If the program does count beyond a million records, it will simply change the '3' to a '4': '2003999999', '2004000000', '2004000001', etc. Again, this may not be what you want. So be careful to assign enough leading zeroes if you want a fixed-length number (7 is usually enough).

To summarize this point: a starting number of '1' will create a sequence like '1', '2', '3', etc; and a starting number of '00000001' will create a sequence like '00000001', '00000002', '00000003', etc. So, in this case, '1' is not quite the same as '00000001'. We recommend that you use fixed-length numbers if possible, especially if you ever plan on sorting the file on the number being added, or doing any kind of batch processing involving that number.

If you want a 'prefix' added to each number, simply add the prefix to the starting number that you enter in the 'Data' box. Here are some examples: 'tmq00000001', 'tmq1', 'myPrefix20030001', etc. Enter the number immediately after the prefix without intervening spaces.

If you are using an OCLC code for a prefix, and your OCLC code contains a number or other special character, enter a '%' after the prefix; for example: 't3r%00000001'. The program will drop the '%' when it creates the sequential number, so the first number assigned in the example will be 't3r00000001'.

Do not enter any other characters after the number (tmq0001 rev) or within the number (2003-0001).
If you do, a message will pop up reminding you of the formatting instructions explained here.

NORMALIZE NUMBER

This global change makes it possible to automatically normalize the most common match keys in a record. Normalized match keys are essential for databases, as they help to ensure that new records will correctly match existing records when they are loaded, and avoid creating duplicates.

There are six patterns that currently can be entered in the top part of the form:

Tag   Subf   Data
001          ocm
010   a
010   z
020   a
022   a
035   a      (OcoLC)

Anything else will generate a prompt to enter one of the above.

The next set of options applies ONLY to match keys that fail normalization. In this case, you can either log the tag with the problem, or log the problem and move the tag to a 9XX field of your choice (961 by default).

Note that if the normalization succeeds, the change applied to the tag will always be logged. On the other hand, if you don't want to log any changes, you can turn off logging altogether via the Output Options (on the 'Global Change' page).

The checkboxes at the bottom allow you to configure some of the finer points of the normalization performed.

If 'Ignore dupes' is checked, the program will ignore duplicate numbers in the same record; if it is not checked, the program will delete any duplicate numbers in the same record.

If 'Ignore trailing blanks' is checked, the program will ignore records with pre-2001 LCCNs that lack a trailing blank space in the 12th position; if it is not checked, the program will correct these LCCNs (and log them).

If 'Ignore invalid check digits' is checked, the program will ignore 020s or 022s with check digits that do not validate; if it is not checked, the program will report those fields as errors and, optionally, move them to the 9XX tag specified above.

Note that if one of these options does not apply (e.g. the 'Trailing blank' option when the tag specified is 020), changing the option has no effect.
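For readers unfamiliar with check digits, here is a minimal Python sketch of the standard ISBN-10, ISBN-13, and ISSN tests. This help page does not say how the program itself validates 020 and 022 check digits, so the standard published rules are assumed; the function names are hypothetical, and the sketch expects a bare number (hyphens allowed) without qualifiers such as '(pbk.)'.

```python
def _mod11_ok(chars):
    """Weighted mod-11 test: weights run from len(chars) down to 1."""
    total = 0
    for weight, ch in zip(range(len(chars), 0, -1), chars):
        if ch in "Xx" and weight == 1:
            value = 10  # 'X' stands for 10, in the final position only
        elif ch.isascii() and ch.isdigit():
            value = int(ch)
        else:
            return False
        total += weight * value
    return total % 11 == 0


def isbn_ok(isbn):
    """Return True if a 10- or 13-character ISBN has a valid check digit."""
    s = isbn.replace("-", "").replace(" ", "")
    if len(s) == 10:
        return _mod11_ok(s)
    if len(s) == 13 and s.isascii() and s.isdigit():
        # ISBN-13: alternating weights 1 and 3, total divisible by 10.
        total = sum(int(ch) * (3 if i % 2 else 1) for i, ch in enumerate(s))
        return total % 10 == 0
    return False


def issn_ok(issn):
    """Return True if an 8-character ISSN has a valid check digit."""
    s = issn.replace("-", "")
    return len(s) == 8 and _mod11_ok(s)
```

A number that fails one of these tests is the kind of value the 'Ignore invalid check digits' option is concerned with.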
Finally, by creating and saving a review for each of the six possibilities listed above, you can construct a powerful autoreview that you can then use to execute all six normalizations on any file.

FIX FILING INDICATOR

Correct filing indicators are important if your system follows MARC specifications when indexing titles. This global change makes it possible to automatically fix filing indicators in tags 245 and 440 (both of which use indicator 2 to count non-filing characters). Although there are many other tags in MARC for which non-filing indicators are defined (as well as the $t subfields in added entry tags), current AACR2 practice is to enter a uniform title (or the title of a related work) without articles and to set indicator 1 to 0.

There are three options that can be applied to this task.

'Change blanks to zero' will set the non-filing indicator to zero when the first word is not an initial article and the indicator 2 value is a blank space; if this option is not selected (which is the default), then the blank space will be retained. The rationale for this option is that, for some files/databases, changing blank to zero will generate a very large number of changes, something that may not always be desirable if, for example, you are planning to reload all of the changes to your local system.

The second option is 'Use eng as the default language code'. If, for whatever reason, a language code cannot be obtained from 008/35-37, this option, if selected, will set the language code to 'eng'. For the most part, the program uses the language code to recognize words that are initial articles in one language but not in another. For example, in records where the language code is 'ger', 'Die' would be seen as an initial article and the filing indicator would be set accordingly; but if the language code is not 'ger', no change will be made to titles beginning with 'Die'.

The third option is called 'Log Language Exceptions'.
By default, the program logs all the changes it makes to the non-filing indicators (even if you have 'Enable logging' turned off on the Global Change page of the options). The program also adds messages to the log for problems that were found but could not be fixed, such as: titles that do not begin with subfield $a, titles that contain junk in the indicator 2 position, and errors in the language code. Also, all occurrences of the English article 'Ye' are written to the exception log (so that the user can decide how to handle them!). The 'Log Language Exceptions' option will also log all titles that contain an initial article in a language that is different from the language code of the record. Since this happens fairly frequently, you may want to turn this logging off, which is what this option is for.

The Fi-fix task will also automatically fix (and log) problems in titles that begin with leading spaces (the blank spaces will be deleted), ellipses, and double-dashes. The reason for performing these fixes is that a correct filing indicator value cannot be determined without them.

The program supports ALL of the language codes and initial articles listed on the LC website, as well as some languages from Oceania not listed there.

SAME RECORD DUPLICATE FIELDS

[This topic has its own Help file. To access it, select this job from the list of MARC Global types, press Next, then press Help.]

SPLIT LONG TAGS

Use this option to break long fields into smaller fields at a specified length. For example, some OPACs may truncate a display field at a certain number of characters, and some systems may return an error when trying to load a record with a very long field. You might use this option to identify these long fields and split them into shorter fields.

In the top part of the form you must enter a pattern that specifies the tags to match.
This pattern must use the syntax specified in the MARC Review Help file, which we repeat below:

In the TAG box, enter the tag number of the field you want to check.

Splitting on subfields is not supported in this mode, so leave the SUBF box empty.

The DATA box is where you specify the length. The format that must be used is:

'-' (a dash)
operator (from the list below)
'#' (a pound sign)
number (the length of the field)

The operators are:

'ge'  Greater than or equal to
'gt'  Greater than
'le'  Less than or equal to
'lt'  Less than
'eq'  Equal to
'ne'  Not equal to

This syntax must be followed exactly: no spaces anywhere, the dash and the pound sign are required, no commas in the number, etc.

For example:

-gt#2000

will find all tags where the field length is greater than 2000 bytes. (Note: these are MARC bytes, and not Unicode characters.)

Note: Although the record directory is not used to compute the field length, the length of a tag as computed by 'Split long tags' should be the same as that in the length portion of the tag's directory entry.

The program uses the following logic to split the fields.

As each record is processed, the pattern in the top part of the form is applied. If no fields match, the program advances to the next record. If fields match, then for each field greater than the requested length, beginning at the position specified in the 'Break at' option (see below), the program reads the field backwards until it finds a blank space. It then copies all of the data from the starting position (in this case, the beginning of the field) up to the blank space, into a buffer. The next byte after the blank space becomes the new starting position, and the 'break at' position is incremented accordingly. The process is repeated until there are not enough bytes remaining to support a new 'break at' position. At this point, the remainder of the field is added as the last entry in the new tag buffer.
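The splitting loop just described can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the program's actual code: the function name split_field is hypothetical, the blank space at each break is dropped, and the per-tag overhead and the options described later in this section are ignored.

```python
from typing import List


def split_field(data: bytes, break_at: int) -> List[bytes]:
    """Split field data into pieces of at most break_at bytes at blank spaces."""
    pieces = []
    start = 0
    # Keep splitting while more than break_at bytes remain.
    while len(data) - start > break_at:
        # Read backward from the 'break at' position to the nearest blank.
        cut = data.rfind(b" ", start, start + break_at + 1)
        if cut <= start:
            # No usable blank space in range: the rest cannot be split.
            break
        pieces.append(data[start:cut])
        # The byte after the blank becomes the new starting position.
        start = cut + 1
    # Whatever remains becomes the last piece.
    pieces.append(data[start:])
    return pieces
```

For example, split_field(b"aaa bbb ccc ddd eee", 8) returns three pieces, b"aaa bbb", b"ccc ddd", and b"eee", none longer than 8 bytes. Data that contains no blank space comes back as a single unsplit piece.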
Once no more fields match the length criteria, the new (buffered) fields are added to the MARC record in the order that they were created. Finally, the original matching tags (the ones that were 'split') are deleted.

To support sorting and programmatic reconstruction of the split field, a field link and sequence number identifier is added to the beginning of each new field (in subfield $8, exactly as described in the 'General sequencing' section of http://www.loc.gov/marc/bibliographic/ecbdcntf.html ).

For example, if a 505 is split into three pieces, each new piece would have the following coding added to the beginning of the field:

$81.1\x
$81.2\x
$81.3\x

The next 505 that is split from the same record would have:

$82.1\x
$82.2\x
…

and so on.

The field link and sequence number, together with the need to repeat the indicators of the original tag, insert a leading subfield, and add a field terminator, add about 12-14 bytes of overhead to each new tag created. This overhead is 'computed' when the program sets each 'break at' position, so that the final length of the tag never exceeds the specified value.

OPTIONS

The 'Break At' option is a number that contains the requested maximum length of the tags created by the 'split' processing. This may be the same as the maximum length specified in your pattern, or it may be shorter, but it cannot be longer.

The 'Use Field link' option, which is checked by default, controls whether $8 field link and sequence numbers are added to order the tags; uncheck it to suppress them. In our testing, we found the resulting data easily became jumbled up without this information.

The '(Try to) break on a delimiter' option modifies the default behavior described above. When this is checked, whenever the program finds a 'break at' position, it continues its search (backward through the string) until it finds a MARC subfield delimiter.
If the new position falls within a pre-defined range (currently set to 20% of whatever the 'Break At' option is set to), the next tag created will begin with the subfield identified; otherwise the original break point (the first blank space) will be used. This option is only useful for coded fields, like Enhanced 505s. If it is more important that the result fields be of equal length, then do not select this option.

The 'Add/retain blank space' option tells the program whether to leave a blank space at the end of each new tag it creates, except the last. Depending on how your system reconstructs these fields for display, a blank space might be needed when the fields are re-joined.

Notes and caveats

First, please keep in mind that this is a machine process, and the split fields produced may not be 'pretty' in some cases.

This task was designed for MARC fields that contain “words” (like the 505 or 520); thus, fields that do not contain blank spaces cannot be split using this option.

The minimum 'break at' position is 100 bytes.

If you want to preserve a copy of the original tag before it is split, run the 'Copy a Tag' task before running the 'Split long tags' task. (You might first want to run a MARC Analysis on your file to get a list of all unused tags; you may also want to use the analysis to check whether copying these tags might overflow the MARC record length boundary.)

If a field that has been linked (using Linkage subfield $6) matches the pattern, that field will be excluded from the split operation, because $6 implies that there is another field in the record which would also need to be split in an exactly parallel manner (which might be impossible to determine programmatically if the linked field is in a different script).

It is generally not possible to get fields that exactly match the 'break at' length using this routine; but all result fields, except for the last, should be approximately the 'break at' length, while never exceeding it.
This option, although it works as designed, may not work as expected with long fields that have already been split, since there is no way to tell if, for example, two 505s in an existing record are the result of a previous split by a vendor process (none of the example records we have seen take measures to indicate the sequencing of split tags). Some manual re-ordering may be necessary in this case, especially if the previously split tags are not in order to begin with.

You may wonder if it makes sense to set the 'break at' position to a value smaller than the length specified in the pattern. For example, if you know that your OPAC display chokes on fields longer than 4000 bytes, then breaking the tag at that point could conceivably generate a tag that is only a couple of bytes long (if the original tag was, say, 4002 bytes long). But breaking these tags at, for example, 3800 bytes would mean that the shortest length of a split tag would be about 201 bytes, which should create a readable portion of text. Using this option wisely may require a bit of research: find out if you have any tags with lengths that are near the 'break at' position, and perhaps divert them to a separate file and/or handle them manually.
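To make the length pattern used in this task concrete (for example '-gt#4000'), here is a small Python sketch that parses the syntax described earlier and tests field lengths against it. The function name parse_length_pattern is hypothetical, and the sketch assumes the operators must be typed in lowercase exactly as listed; the program's own parser may differ.

```python
import operator
import re

# The six operators listed for the DATA box, mapped to comparison functions.
OPERATORS = {
    "ge": operator.ge,
    "gt": operator.gt,
    "le": operator.le,
    "lt": operator.lt,
    "eq": operator.eq,
    "ne": operator.ne,
}

# Dash, operator, pound sign, number: no spaces and no commas allowed.
_PATTERN = re.compile(r"-(ge|gt|le|lt|eq|ne)#([0-9]+)")


def parse_length_pattern(data):
    """Return a function that tests a field length (in bytes) against data."""
    match = _PATTERN.fullmatch(data)
    if match is None:
        raise ValueError(f"not a valid length pattern: {data!r}")
    compare = OPERATORS[match.group(1)]
    limit = int(match.group(2))
    return lambda field_length: compare(field_length, limit)
```

With matches = parse_length_pattern('-gt#4000'), matches(4000) is False and matches(4001) is True. The shortest field that can match is therefore 4001 bytes, which is where the figure of about 201 bytes (4001 minus 3800) in the example above comes from, before overhead is taken into account.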

phelp/helpglobalchangeoptions.txt · Last modified: 2021/12/29 16:21 (external edit)
