--- phelp:helpexportoptions [2021/12/29 16:21]
+++ phelp:helpexportoptions [2021/12/29 16:21] (current)
@@ Line 1: / Line 1: @@
+TEXT EXPORT
+These options generally apply only to the 'One Line' and 'Export' text output formats, with the exception of the 'Occs' and 'Blank Around Subs' options.
+TAG DELIMITER
+Select the character used to separate the data that is output for each tag. By default, the delimiter is the TAB character. If none of the delimiter characters in the list meets your needs, you can enter your own. To do this, you must know the decimal ASCII code for the character you want to use, and you must enter it as three zero-filled digits. For example, to use '#' as the delimiter, enter '035'.
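The three-digit code can be worked out ahead of time. The following is a minimal Python sketch; the helper name `delimiter_code` is hypothetical and is not part of MARC Review.

```python
def delimiter_code(char):
    """Return the zero-filled, three-digit decimal ASCII code for one character."""
    if len(char) != 1 or ord(char) > 127:
        raise ValueError("expected a single ASCII character")
    return format(ord(char), "03d")


print(delimiter_code("#"))   # 035
print(delimiter_code("|"))   # 124
print(delimiter_code("\t"))  # 009
```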
+IF THE DELIMITER APPEARS IN THE MARC DATA
+If the character that you select for a delimiter actually appears in the data that is being output, it will break the export format (if you are going to import your data into another program). For example, if you export in tab-delimited format, and the title includes an actual TAB character, the import program will add an extra field beginning at that TAB character in the title.
+This option gives you an opportunity to work around this problem. If you select:
+  Ignore: the delimiter character will be ignored if it occurs in the MARC data;
+  Preserve: the delimiter character will be preserved if it occurs in the MARC data by surrounding the data being output in quotation marks;
+  Replace: the delimiter character will be replaced with another character (usually a blank space) if it occurs in the MARC data;
+  Delete: the delimiter character will be deleted wherever it occurs in the MARC data.
+If you are running a review and have selected one-line output, accept the default, which is 'Ignore'. If you are exporting the data and expecting to import it into another program, choose one of the other three options.
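The four choices can be pictured as simple string transformations. This is only a sketch of the behavior as described above, not the program's actual code: the function name and parameters are hypothetical, and it assumes that 'Preserve' adds quotation marks only when the delimiter actually occurs in the data (the text does not say how embedded quotation marks are treated, so they are left alone here).

```python
MODES = ("ignore", "preserve", "replace", "delete")


def handle_delimiter(data, delimiter="\t", mode="ignore", replacement=" "):
    """Apply one of the four delimiter-handling choices to a field's data."""
    if mode not in MODES:
        raise ValueError("unknown mode: " + mode)
    if mode == "ignore" or delimiter not in data:
        return data                      # nothing to do
    if mode == "preserve":
        return '"' + data + '"'          # assumption: quote the whole field
    if mode == "replace":
        return data.replace(delimiter, replacement)
    return data.replace(delimiter, "")   # mode == "delete"


title = "Scottish\tmusic"
for mode in MODES:
    print(mode, repr(handle_delimiter(title, mode=mode)))
```

With 'Ignore', the embedded TAB survives and would split the title into two columns on import, which is why one of the other three choices is the better fit for data that will be imported elsewhere.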
+DEFAULT DATA
+This option only applies to 'One-line' or Export output formats. If any tag selected for export does not exist in the MARC record, the program writes an entry like '[xxx not found]' to the corresponding column in the export record (where 'xxx' is whatever tag was not found). However, with this option you can define a piece of 'default data' that will be used instead. For example, if you enter a zero in the default data option, the program will insert a '0' whenever a specified tag/subfield is not present in the record.
+NOTES:
+1) Default data is global (there is no way to specify different default data for different tags);
+2) Default data will only be consulted when the option 'Export a fixed number of tags' is also selected;
+3) Although the text box is small, there is no length requirement or restriction; and
+4) The program will remove any leading or trailing whitespace from whatever is entered there.
+OUTPUT OF MARC DATA ELEMENTS
+In this section you can configure which MARC data elements are included in the text output for each tag.
+Tags: If this option is checked, the MARC tag will be prepended (followed by a blank space) to the tag's data. For example:
+245 Scottish music[sound recording] :music rough guide.
+The default is not to output tag labels.
+Indicators: If this option is checked, the MARC indicators for the tag will be displayed:
+00Scottish music[sound recording] :music rough guide.
+The default is not to output indicators.
+Subfields: If this option is checked, MARC subfield delimiters ('$') and subfield codes will be output:
+00$aScottish music$h[sound recording] :$bmusic rough guide.
+This is the only Text Export option selected by default.
+Blank Around subfields: If this option is selected, a blank space is added before and after each subfield code:
+00 $a Scottish music $h [sound recording] : $b music rough guide.
+The default is not to add blank spaces around subfields.
+This option is an exception in that it can be applied to the One-line, Custom, and Full Record text output types.
+Occs: If this option is checked, the MARC tag's occurrence number will follow the MARC Tag. For example:
+245-01 00$aScottish music$h[sound recording] :$bmusic rough guide.
+The default is not to output tag occurrences.
+This option is an exception because it is only applicable in the 'Custom Record' type of text output.
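The way these options combine can be sketched in a few lines of Python. The function below is hypothetical and only reproduces the sample strings shown above; it assumes tag 245 for the title field, does not model the 'Occs' option, and assumes that 'Blank Around subfields' has no effect when subfield codes are not output.

```python
def render_field(subfields, tag="", indicators="",
                 show_subfields=True, blank_around=False):
    """Build the text for one field from (code, data) subfield pairs."""
    parts = []
    for code, data in subfields:
        if show_subfields and blank_around:
            parts.append(" $" + code + " " + data)  # blank before and after the code
        elif show_subfields:
            parts.append("$" + code + data)
        else:
            parts.append(data)                      # data only, no delimiters
    prefix = tag + " " if tag else ""               # tag is followed by a blank
    return prefix + indicators + "".join(parts)


title = [("a", "Scottish music"),
         ("h", "[sound recording] :"),
         ("b", "music rough guide.")]

print(render_field(title, indicators="00"))
print(render_field(title, indicators="00", blank_around=True))
print(render_field(title, tag="245", show_subfields=False))
```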
+Fixed number of tags: This option is important if you want to export text data so that it can be imported by a program like Excel. This option will ensure that each record output will appear to have the same number of fields. This means that when a tag that is in your list of 'Tags to Output' is missing from a record, a placeholder field will be inserted.
+For example, if tag 440 is in your list but missing from a record, then the program will output '[440 not found]' so that the correct mapping will be maintained during the import process. If this option is not selected, the program will not output anything at all for missing tags.
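The placeholder logic, together with the 'Default Data' option described earlier, can be sketched as follows. This is an illustration under stated assumptions rather than the program's implementation: the record is modeled as a plain dictionary of tag to text, and the function name `export_row` is hypothetical.

```python
def export_row(record, tags, fixed=True, default_data="", delimiter="\t"):
    """Join the selected tags into one export line.

    record: dict mapping a tag (e.g. "245") to its output text.
    tags:   the 'Tags to Output' list, in column order.
    """
    default_data = default_data.strip()   # leading/trailing whitespace is removed
    fields = []
    for tag in tags:
        if tag in record:
            fields.append(record[tag])
        elif fixed:
            # Default data is consulted only when a fixed number of tags is requested.
            fields.append(default_data or "[" + tag + " not found]")
        # Otherwise nothing at all is output for the missing tag.
    return delimiter.join(fields)


record = {"100": "Smith, John.", "245": "A title."}
tags = ["100", "245", "440"]
print(repr(export_row(record, tags)))
print(repr(export_row(record, tags, default_data=" 0 ")))
print(repr(export_row(record, tags, fixed=False)))
```

The first two calls produce three columns each, so every row lines up on import; the third produces only two columns, and later columns would shift left.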
+IF A DISPLAY TAG REPEATS
+This option allows you to configure how the export process should deal with repeating tags (like subject headings).
+Export first occurrence: If selected, this option exports only the first occurrence of any repeating tag that you have selected for display.
+Export each occurrence: If selected, this option exports each occurrence of a repeating tag as a separate field. Please note that this option will not be a good choice if you expect to re-import the data.
+Concatenate into one: If selected, this option will concatenate all the occurrences of a repeating tag into one export field. This is the best option to use if you are expecting to re-import the data.
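The effect of the three choices on the number of export fields can be sketched like this. The function is hypothetical, and the single blank used to join concatenated occurrences is an assumption, since the text above does not say which separator the program uses.

```python
def export_repeating(occurrences, mode="first", delimiter="\t", joiner=" "):
    """Return the export text for all occurrences of one repeating tag."""
    if not occurrences:
        return ""
    if mode == "first":
        return occurrences[0]                 # one field, first occurrence only
    if mode == "each":
        return delimiter.join(occurrences)    # one field per occurrence
    if mode == "concatenate":
        return joiner.join(occurrences)       # always exactly one field
    raise ValueError("unknown mode: " + mode)


subjects = ["$aCats$vFiction.", "$aShapeshifting$vFiction."]
for mode in ("first", "each", "concatenate"):
    text = export_repeating(subjects, mode=mode)
    print(mode, len(text.split("\t")), "field(s)")
```

Only 'each' makes the column count depend on the record, which is why it is a poor fit for data that will be re-imported.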
+KEY-VALUE FORMAT
+This option is only valid if the following review criteria are met:
+1) Two patterns must be specified (no more, no less)
+2) The first pattern must be marked as the "Key" (by right-clicking the tag for the first pattern on the "Tags to Output" list, and clicking the "Key" checkbox)
+3) No additional tags may be added to the "Tags to Output" list
+In this scenario the first pattern should reference an identifier, such as a unique record control number (this is the "key"); the second pattern may refer to any other value in the record that may or may not repeat (this is the "value").
+If selected, and if the pattern matching criteria are valid (as defined above), then each line of output will consist of two entries:
+1) the output for the key
+2) the output for the value
+This option invokes special processing on the output. If the "value" pattern (i.e. the second pattern) matches more than one tag/subfield in a record, the "value" for each match will be output on a separate line, preceded by the data from the "key" pattern.
+For example, with 001 specified as the first pattern, and the second pattern simply set to 650 (no subfields specified in the pattern), the output looks like this:
+  1372414	$aCats$vFiction.
+  1372414	$aShapeshifting$vFiction.
+  1372414	$aFantasy fiction.$2sears
+  1372414	$aCats.$2fast$0(OCoLC)fst00849374
+  1372414	$aShapeshifting.$2fast$0(OCoLC)fst01748560
+  1122393	$aAnimal attacks$vJuvenile literature.
+  1122393	$aDangerous animals$vMiscellanea$vJuvenile literature.
+  1122393	$aDangerous animals.
+  1122393	$aSharks$vJuvenile literature.
+  1122393	$aSharks.
+  1133738	$aPolice$zCalifornia$zLos Angeles$vFiction.
+The example should serve to demonstrate the purpose of this output format: to generate an index on repeatable MARC data (something often difficult to achieve). The result of such a review may require post-processing to be useful; for now, we will leave that as an exercise for the user (though keep in mind the list-processing capabilities available in MARC Review and MARC Global: http://www.marcofquality.com/wiki/mrt/doku.php?id=help:mr_list_search_236)
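One possible form of post-processing is to group the values under their key. The sketch below is an illustration only: it assumes tab-delimited key-value lines like those shown above, and the function name `build_index` is hypothetical.

```python
from collections import defaultdict


def build_index(lines, delimiter="\t"):
    """Group key-value output lines into {key: [values, in file order]}."""
    index = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue                      # skip blank lines
        key, _, value = line.partition(delimiter)
        index[key.strip()].append(value.strip())
    return dict(index)


sample = [
    "1372414\t$aCats$vFiction.",
    "1372414\t$aShapeshifting$vFiction.",
    "1122393\t$aSharks.",
]
for key, values in build_index(sample).items():
    print(key, len(values), values)
```

In practice the lines would come from the exported text file, for example by passing an open file object as `lines`.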
+What if we need to specify more than one subfield for the value pattern, and/or need each subfield to be output to its own line? To do this, go to the "Text Export" page of the output options and select the box labeled "Fixed number of tags".
+For example, to dump out all of the OCLC numbers in an 035 tag (which may appear in either $a or $z) on separate lines, follow these steps: create one pattern for the 001, and a second pattern for the 035, with subfields "az", and set data to match to "(OCoLC)". If we then select "key-value format" from the text output options, we will get something like this:
+  99023142        $a(OCoLC)137241444$z(OCoLC)175282078$z(OCoLC)786207682
+  1602940         $a(OCoLC)156809741$z(OCoLC)861541319
+  1608626         $a(OCoLC)156812066$z(OCoLC)671701739$z(OCoLC)859651370
+--but when we additionally select "Fixed number of tags" on the same page, the output becomes more amenable to subsequent processing:
+  99023142        $a(OCoLC)137241444
+  99023142        $z(OCoLC)175282078
+  99023142        $z(OCoLC)786207682
+  1602940         $a(OCoLC)156809741
+  1602940         $z(OCoLC)861541319
+  1608626         $a(OCoLC)156812066
+  1608626         $z(OCoLC)671701739
+  1608626         $z(OCoLC)859651370
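For output that was already produced without "Fixed number of tags", the same one-subfield-per-line shape can be approximated afterwards. This is a rough sketch with a hypothetical function name; it assumes that '$' followed by a lowercase letter or digit always starts a subfield, so data containing a literal '$' would be split incorrectly.

```python
import re

# A subfield is '$', one code character, then everything up to the next '$'.
SUBFIELD = re.compile(r"\$[a-z0-9][^$]*")


def split_subfields(key, value, delimiter="\t"):
    """Return one 'key<delimiter>subfield' line per subfield found in value."""
    return [key + delimiter + sf for sf in SUBFIELD.findall(value)]


value = "$a(OCoLC)137241444$z(OCoLC)175282078$z(OCoLC)786207682"
for line in split_subfields("99023142", value):
    print(line)
```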
+Regarding the output itself, the lines are written in the order that they are found in the record (whether they are complete tags, or strings of subfields within the same tag). There is no attempt to sort the results. Also, if either the "key" or the "value" pattern fails, there will be no output for that record. Thus, if it's essential that every record be represented in the output, it might be best to first pre-process the file and add dummy values to those records lacking the targeted fields.
+Also note that most of the options on the 'Text Export' page are ignored by this output type. Tags, occurrences, and indicators will never be output, whereas subfields (if applicable) are always output. The "Tag delimiter" option is, however, respected; in the examples above, the delimiter is set to the default which is a tab character; but one might also use a vertical bar, etc., between the key and the value.
+Finally, the option on the main "Text Output" page--"Output record sequence number"--is respected; therefore, when that option is selected, there will be three columns in each line of output instead of two.
+DISPLAY '#' IF FIELD CONSISTS SOLELY OF A FIELD TERMINATOR
+This option, checked by default, has a very limited scope. It applies primarily to control tags (like the 001) that are completely empty except for a field terminator.
+Depending on what options are selected, this scenario might cause an anomaly in MARC Review's text output. For example, if the 'One Line' output format is being used and the 'Tags' option above is not selected, then a control tag that is completely empty will not otherwise be displayed. Here is an example result summary manifesting this condition:
+record(s) in source file.
+tag(s) in 1680 record(s) matched the pattern:
+    AND 001
+text records output to:
+    D:\oneDrive\my documents\Desktop\mreview.txt
+So, the anomaly is: 1680 records contained the 001 and matched the (simple) pattern, but only 228 items were output? Upon investigation one discovers that the 001 in the other 1452 records was completely empty except for the field terminator, which is a non-printing character.
+Hence this option, which will force something to be output in the scenario in question.
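The condition can be sketched as follows. The MARC field terminator is the non-printing character with hexadecimal value 1E; the function name and the way the field is passed in as a string are hypothetical and serve only to illustrate the rule.

```python
FIELD_TERMINATOR = "\x1e"   # MARC field terminator, a non-printing character


def display_value(raw_field, show_hash=True):
    """Return the text to display for a raw field that may end in a terminator."""
    if raw_field == FIELD_TERMINATOR:
        # The field holds nothing but the terminator.
        return "#" if show_hash else ""
    return raw_field.rstrip(FIELD_TERMINATOR)


print(repr(display_value("\x1e")))                   # '#'
print(repr(display_value("\x1e", show_hash=False)))  # ''
print(repr(display_value("ocm12345\x1e")))           # 'ocm12345'
```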
+A NOTE ABOUT EXPORTING MARC DATA
+Please note that the nature of MARC is such that importing a MARC record into a 'flat-file' database will usually yield only mediocre results. This is because of the great flexibility of MARC, where each 'record' has its own unique definition. The record length, number of fields, and field lengths all vary widely from one MARC record to another. Therefore, when attempting to export, please keep in mind that MARC was designed as a communications format and not as a database format.
+Best results will be achieved when exporting fields that are almost always present in each record and do not repeat. For example, exporting MARC records into a 'bibliography format' is fairly straightforward: on the 'Text Output' screen, select MARC Tags 100, 245, 260, and 300 for display. When exporting, always select the 'Export a fixed number of tags' option, so that if a tag is not present (like 100), a placeholder field will be exported; this will prevent the title from being imported into the author column, and so on. If you are exporting repeatable fields, like subject headings, consider selecting the 'Concatenate into one' option, so that for each record, all subjects are passed to the import process as one field.
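Before importing an exported file, it can help to confirm that every row has the same number of columns. The following is a minimal sketch using Python's standard csv module; it assumes a tab-delimited export with no quoting, and the function name `column_counts` and the sample data are hypothetical.

```python
import csv
import io


def column_counts(text, delimiter="\t"):
    """Return the set of distinct column counts found in delimited text."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter,
                        quoting=csv.QUOTE_NONE)
    return {len(row) for row in reader if row}   # ignore empty lines


good = ("Smith, John.\tA title.\tLondon, 2001.\t200 p.\n"
        "[100 not found]\tAnother title.\tParis, 1999.\t10 p.\n")
bad = ("Smith, John.\tA title.\tLondon, 2001.\t200 p.\n"
       "Another title.\tParis, 1999.\t10 p.\n")

print(column_counts(good))   # a single count means the rows line up
print(column_counts(bad))    # more than one count means shifted columns
```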

phelp/helpexportoptions.txt · Last modified: 2021/12/29 16:21 (external edit)
