phelp:helptextimport [2021/12/29 16:21] |
phelp:helptextimport [2021/12/29 16:21] (current) |
+ | IMPORT FROM TEXT
+ |
+ | The import utility accepts two different formats: MARC tagged text, and data formatted in columns.
+ | |||
+ | The utility (as of version 236) will recognize and import Tagged records that use LC's MARC Breaker format (for information on this format, see: http:// | ||
+ | |||
+ | TAGGED FORMAT | ||
+ | |||
+ | By default, this utility will import text records in tagged format into MARC. This is a standard format for MARC records rendered as text. | ||
+ | |||
+ | For example, here are two (very brief) records in MARC tagged format: | ||
+ | |||
+ | 100 1 $aSilverstein, | ||
+ | 245 10$aWho wants a cheap rhinoceros? | ||
+ | 250 | ||
+ | 260 0 $aLondon :$bCollier Macmillan, | ||
+ | 300 | ||
+ | |||
+ | 100 1 $aCrocker, Betty. | ||
+ | 245 10$aBetty Crocker' | ||
+ | 250   $a1st ed.
+ | 260   $aNew York, N.Y. :
+ | 300   $a173 p. :$bill. (some col.) ;$c27 cm.
+ | |||
+ | The blank line after each record is important, as it tells the program where each record ends. (It does not matter how many blank lines separate the records.)
+ | |||
+ | Imported records should have a minimum of two fields. If there is only one field in a record, the importer will reject it.
+ | |||
+ | Each tag must be entered as three digits: ' | ||
+ | |||
+ | The subfield delimiter should be a dollar sign (' | ||
+ | |||
+ | There must be one blank space (and not a TAB) after each tag: '100 ' | ||
+ | |||
+ | There cannot be blank lines between tags. A blank line is the signal to the text-to-MARC conversion program that it has reached the end of the record. | ||
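
The record-boundary rules above can be sketched as follows. This is only an illustration, not the program's actual code: the function names are hypothetical, and it assumes that a field line begins with a three-digit tag followed by one blank space.

```python
import re

# Assumption: a field line starts with a three-digit tag and one blank space.
TAG_LINE = re.compile(r"^\d{3} ")

def split_records(text):
    """Split tagged text into records; one or more blank lines end a record."""
    records, current = [], []
    for line in text.splitlines():
        if line.strip() == "":
            if current:              # any run of blank lines closes the record
                records.append(current)
                current = []
        else:
            current.append(line)
    if current:                      # last record may lack a trailing blank line
        records.append(current)
    return records

def accept(record):
    """Mirror the two-field minimum by counting lines that begin with a tag."""
    return sum(1 for line in record if TAG_LINE.match(line)) >= 2
```

A record for which accept() returns False corresponds to the single-field case that the importer rejects.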
+ | |||
+ | For control fields (tags 000 to 009), the field data starts immediately after the blank space that follows the tag: | ||
+ | |||
+ | 001 ocm00811859 | ||
+ | 003 OCoLC | ||
+ | |||
+ | NOTE: If a fixed-length field like the 008 is present and is not the correct length, the conversion will take the field as given, and present a warning note after the text file has been processed. | ||
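
A length check of this kind might look like the sketch below. It assumes bibliographic records, where the MARC 21 008 field has 40 character positions; the function name and the warning text are hypothetical, not the program's own.

```python
def check_008(field_data):
    """Return a warning string if the 008 data is not 40 characters, else None."""
    expected = 40  # MARC 21 bibliographic 008: character positions 00-39
    if len(field_data) != expected:
        return f"Warning: 008 is {len(field_data)} characters, expected {expected}"
    return None
```

The field data itself is left untouched, which matches the behaviour described in the note: the field is taken as given and only a warning is reported.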
+ | |||
+ | For variable fields, the indicators start immediately after the blank space that follows the tag. There is no spacing between the indicators, and the first subfield of the data follows immediately after the second indicator: | ||
+ | |||
+ | [Both indicators blank] | ||
+ | 260 | ||
+ | |||
+ | [First indicator blank] | ||
+ | 650  0$aCorporations$zUnited States.
+ | |||
+ | [Second indicator blank] | ||
+ | 245 0 $aMoody' | ||
+ | |||
+ | [Both indicators coded] | ||
+ | 246 30$aOTC industrial news reports | ||
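
The layout described above (tag, one blank space, two indicator characters, then the subfields) can be read with a few string slices. The sketch below is a hypothetical helper, not the utility's own parser; it assumes the dollar sign is the subfield delimiter and that blank indicators are literal blank spaces.

```python
def parse_field(line, delimiter="$"):
    """Split one tagged line into its tag, indicators, and subfields."""
    tag, rest = line[:3], line[4:]        # line[3] is the blank after the tag
    if tag.startswith("00"):              # control fields 000-009: data only
        return {"tag": tag, "data": rest}
    ind1, ind2, body = rest[0:1], rest[1:2], rest[2:]
    # Each chunk after a delimiter is a one-character code plus its value.
    subfields = [(chunk[0], chunk[1:]) for chunk in body.split(delimiter)[1:] if chunk]
    return {"tag": tag, "ind1": ind1, "ind2": ind2, "subfields": subfields}
```

For instance, parse_field("246 30$aOTC industrial news reports") yields indicators "3" and "0" and a single $a subfield.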
+ | |||
+ | There should be no line breaks in variable fields (i.e., Word Wrap should be turned off). If there are line breaks, then the start of each text line that follows the line with the tag in it should be indented with three blank spaces:
+ | |||
+ | 520 | ||
+ | | ||
+ | | ||
+ | 600 | ||
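
As a rough sketch of the continuation rule, indented lines can be folded back into the field line that precedes them. The function name is hypothetical, and rejoining the pieces with a single space is an assumption; the source does not say how the program joins them.

```python
def join_continuations(lines):
    """Fold lines indented by three blank spaces into the preceding field."""
    fields = []
    for line in lines:
        if line.startswith("   ") and fields:
            fields[-1] += " " + line.strip()   # assumption: rejoin with one space
        else:
            fields.append(line)
    return fields
```

Applied to a wrapped 520 followed by a 600, this returns two field lines, with the 520 text restored to a single line.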
+ | |||
+ | MARC BREAKER FORMAT | ||
+ | |||
+ | This utility will also recognize and import MARC BREAKER formatted records. Here is an example of a tagged record in this format: | ||
+ | |||
+ | =000 00000nam\\2200000\a\4500 | ||
+ | =001 0123456789 | ||
+ | =003 NjP | ||
+ | =005 19930422091534.7 | ||
+ | =008 920806s1991\\\\nju\\\\\\\\\\\000\0\\eng\d | ||
+ | =020 \\$a0777000008 | ||
+ | =040 \\$aNjP$cNjP | ||
+ | =100 1\$aSmith, John W.,$d1955- | ||
+ | =245 10$aPolitical tides in America /$cby John W. Smith III | ||
+ | ; with an introduction and commentary by Spencer Yarborough. | ||
+ | =260 \\$aCamden, NJ : | ||
+ | =300 \\$a344 p. :$bill., maps ;$c26 cm. | ||
+ | =650 \7$aPolitical science$zUnited States.$2lcsh | ||
+ | =650 \7$aPopulism$zUnited States.$2lcsh | ||
+ | =700 1\$aYarborough, | ||
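
In the example above, each line starts with an equals sign and a backslash stands in for a blank in the indicators and fixed fields. A minimal sketch of turning such a line into the tagged form described earlier is given below. The function name is hypothetical, and the sketch does not handle the character mnemonics that MARC Breaker also defines.

```python
def breaker_to_tagged(line):
    """Convert one MARC Breaker field line to the plain tagged layout."""
    if not line.startswith("="):
        raise ValueError("not a MARC Breaker field line")
    tag = line[1:4]
    rest = line[4:].lstrip(" ")            # blanks are written as backslashes,
                                           # so leading spaces are only padding
    if tag.startswith("00"):               # control field: restore every blank
        return f"{tag} " + rest.replace("\\", " ")
    indicators = rest[:2].replace("\\", " ")
    return f"{tag} {indicators}{rest[2:]}"
```

Only the two indicator positions are rewritten for variable fields, so a backslash that happens to occur in the field data is left alone.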
+ | |||
+ | Our implementation follows the specifications listed in the MARCMaker/ | ||
+ | |||
+ | MARC Report will also recognize and load the MARC Breaker character set conversion file (text21.txt) if it is present. This file should be located in the user's My Documents\MarcReport\Templates folder. A copy of this file is included with the program during installation; | ||
+ | |||
+ | |||
+ | COLUMNAR FORMAT | ||
+ | |||
+ | This utility will also import text records formatted in columns. Most databases can dump data in this format, and spreadsheets like Excel can import data in this format. For example:
+ | |||
+ | AUTHOR          TITLE            PUBLISHER       DATE    PAGES
+ | Hader, Berta    The big snow     Aladdin Books   c1948   [48] p.
+ | Grant, Maxwell  Blood red rose   Macmillan       c1986   403 p.
+ | King, Tabitha   Small world      Macmillan       1981    229 p.
+ | Blume, Judy     Forever...       Bradbury Press  c1975   199 p.
+ | Lechner, Alan   Street games     Harper & Row    1980    xiii,
+ | Lobel, Arnold   Fables           Harper & Row    c1980   40 p.
+ | |||
+ | Currently, the program requires columnar data to be tab-delimited. | ||
+ | |||
+ | To import data from columns, you must go to the Options page, select the ' | ||
+ | |||
+ | In the example above, the tag mapping would be: | ||
+ | |||
+ | 100 245 260b 260c 300 | ||
+ | |||
+ | In the tag mapping box, tags must be separated by one blank space, and there must be one tag (with an optional subfield code) for each column in the data.
+ | |||
+ | One subfield can be specified for each variable tag; if no subfield is entered, the data will be added to a $a. | ||
+ | |||
+ | During import, empty rows will be automatically skipped. | ||
+ | |||
+ | During import, rows that do not have the same number of columns as tags in the mapping will be skipped; a count of any such rows will be displayed at the end of the process. | ||
+ | |||
+ | There is no way to specify leader bytes, indicators, or embedded subfields (unless the data contains the actual MARC subfield character) with this type of import. | ||
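
The columnar rules above can be sketched as follows. This is an illustration under stated assumptions, not the program's implementation: the function names are hypothetical, every row is treated as data (the source does not say how a header row is handled), and each column is returned as its own (tag, subfield, value) triple because the source does not say whether columns mapped to the same tag, such as 260b and 260c, are merged into one field.

```python
def parse_mapping(mapping):
    """Turn a mapping such as '100 245 260b 260c 300' into (tag, subfield) pairs."""
    columns = []
    for item in mapping.split(" "):
        tag, code = item[:3], item[3:] or "a"   # no subfield given: default to $a
        columns.append((tag, code))
    return columns

def import_rows(text, mapping):
    """Map tab-delimited rows to fields; return (records, skipped_row_count)."""
    columns = parse_mapping(mapping)
    records, skipped = [], 0
    for row in text.splitlines():
        if row.strip() == "":
            continue                            # empty rows are skipped silently
        cells = row.split("\t")
        if len(cells) != len(columns):
            skipped += 1                        # column count mismatch: count it
            continue
        records.append([(tag, code, cell) for (tag, code), cell in zip(columns, cells)])
    return records, skipped
```

The skipped count corresponds to the number reported at the end of the import for rows whose column count does not match the mapping.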
+ | |||
+ | ALEPH IMPORT | ||
+ | |||
+ | Text exported from Aleph systems has a format that is slightly different from the typical MARC tagged text. Use this option if it applies.
+ | |||
+ | These systems also may export alphabetic tags (see the next option). | ||
+ | |||
+ | ALPHABETIC TAG SUPPORT | ||
+ | |||
+ | When an import option known to contain alphabetic tags is selected, the program will prompt you to select a conversion table. If the conversion table (as outlined below) is successfully loaded, MARC Import will convert each alphabetic tag to its corresponding numeric tag. The alphabetic tag itself will be copied to a subfield $9 and inserted as the first subfield of the corresponding field. | ||
+ | |||
+ | When the import is complete, statistics on the alpha-to-numeric conversion will be displayed, including any alphabetic tags that were found in the text but not in the conversion table, if applicable. | ||
+ | |||
+ | If you wish to discard all alphabetic tags, simply clear the option box for the 'Alpha Tag translation table' (i.e., do not supply a translation table, and any fields with alphabetic tags will be silently ignored).
+ | |||
+ | ALPHABETIC TAG CONVERSION TABLE | ||
+ | |||
+ | The format of the alphabetic tag conversion table must be (exactly) as follows: a 3-byte alphabetic tag in uppercase, followed by an ' | ||
+ | |||
+ | A01=961 | ||
+ | CAT=962 | ||
+ | FIN=963 | ||
+ | FMT=964 | ||
+ | LCS=965 | ||
+ | SRC=966 | ||
+ | |||
+ | Only uppercase alphabetic tags are supported. | ||
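
To illustrate how a table like the one above could be applied, the sketch below loads the entries and rewrites a field line, copying the original alphabetic tag into a leading $9 as described under ALPHABETIC TAG SUPPORT. The function names are hypothetical, and the sketch assumes that fields with alphabetic tags use the same line layout as the tagged format (tag, one blank space, two indicators, subfields).

```python
import re

# Assumption: three uppercase letters or digits, '=', then a three-digit tag.
ENTRY = re.compile(r"^([A-Z0-9]{3})=(\d{3})$")

def load_table(text):
    """Read 'AAA=999' lines into a dict; lines in any other shape are ignored."""
    table = {}
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if match:
            table[match.group(1)] = match.group(2)
    return table

def convert_field(line, table, unknown):
    """Swap an alphabetic tag for its numeric tag and keep the original in $9."""
    alpha = line[:3]
    if alpha.isdigit():
        return line                              # already numeric: unchanged
    numeric = table.get(alpha)
    if numeric is None:
        unknown[alpha] = unknown.get(alpha, 0) + 1   # tally for the end report
        return None
    indicators, body = line[4:6], line[6:]
    return f"{numeric} {indicators}$9{alpha}{body}"
```

The unknown dictionary plays the role of the end-of-import statistics on alphabetic tags that were found in the text but not in the conversion table.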
+ | |||
+ | NOTES | ||
+ | |||
+ | Save conversion tables to the folder: My Documents\MarcReport\Options | ||
+ | |||
+ | There is a utility called ' | ||
+ | |||
+ | For a bit more information on working with files containing alphabetic tags, open the Verify utility and read the Help page there. | ||