
IMPORT FROM TEXT

The import utility accepts two different formats: MARC tagged text, and data formatted in columns.

As of version 236, the utility also recognizes and imports tagged records that use LC's MARC Breaker format (for information on this format, see: http://www.loc.gov/marc/makrbrkr.html).

TAGGED FORMAT

By default, this utility will import text records in tagged format into MARC. This is a standard format for MARC records rendered as text.

For example, here are two (very brief) records in MARC tagged format:

100 1 $aSilverstein, Shel.
245 10$aWho wants a cheap rhinoceros?
250   $aRev. and expanded ed.
260 0 $aLondon :$bCollier Macmillan,$cc1983.
300   $a[56] p. :$bill. ;$c20 x 25 cm.

100 1 $aCrocker, Betty.
245 10$aBetty Crocker's do-ahead cookbook.
250   $a1st ed.
260   $aNew York, N.Y. :$bMacmillan USA,$cc1994.
300   $a173 p. :$bill. (some col.) ;$c27 cm.

The blank line after each record is important as it tells the program where each record ends. (It does not matter how many blank lines separate the records).

Imported records should have a minimum of two fields. If there is only one field in a record, the importer will reject it.

Each tag must be entered as three digits: '008' and not '8'.

The subfield delimiter should be a dollar sign ('$').

There must be one blank space (and not a TAB) after each tag: '100 '

There cannot be blank lines between tags. A blank line is the signal to the text-to-MARC conversion program that it has reached the end of the record.

For control fields (tags 000 to 009), the field data starts immediately after the blank space that follows the tag:

001 ocm00811859
003 OCoLC

NOTE: If a fixed-length field like the 008 is present and is not the correct length, the conversion will take the field as given, and present a warning note after the text file has been processed.
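The behaviour described in the note can be sketched as follows. This is a minimal illustration in Python, not MARC Report's own code: the function name length_warnings and the FIXED_LENGTHS table are hypothetical, and only the 008 (which MARC 21 defines as 40 characters) is checked.

```python
# Assumption: only the 008 is checked; MARC 21 defines it as 40 characters.
FIXED_LENGTHS = {"008": 40}


def length_warnings(fields):
    """Return one warning string per fixed-length field with the wrong length.

    `fields` is a list of (tag, data) pairs. The fields themselves are left
    unchanged, mirroring the 'take the field as given' behaviour.
    """
    warnings = []
    for tag, data in fields:
        expected = FIXED_LENGTHS.get(tag)
        if expected is not None and len(data) != expected:
            warnings.append(f"{tag}: length {len(data)}, expected {expected}")
    return warnings
```

The warnings are collected rather than raised, so they can be shown together once the whole text file has been processed.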

For variable fields, the indicators start immediately after the blank space that follows the tag. There is no spacing between the indicators, and the first subfield of the data follows immediately after the second indicator:

[Both indicators blank]
260   $aAmsterdam :$bMoody's Investors Service.

[First indicator blank]
650  0$aCorporations$zUnited States.

[Second indicator blank]
245 0 $aMoody's OTC reports.

[Both indicators coded]
246 30$aOTC industrial news reports

There should be no line breaks in variable fields (i.e., Word Wrap should be turned off). If there are line breaks, each continuation line that follows the line containing the tag must be indented with three blank spaces:

520   $aEach player forms interlocking words cross-word
   fashion on the board and competes for high score by
   taking advantage of letter values.
600   $aGames.
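To make the tagged-format rules concrete, here is a minimal sketch of a reader that applies them. It is written in Python for illustration only and is not the program's actual implementation. The function name parse_tagged is hypothetical, and joining continuation lines with a single space is an assumption.

```python
def parse_tagged(text):
    """Parse tagged text into records.

    Returns (records, rejected). Each record is a list of
    (tag, indicators, data) tuples; `rejected` counts records that had
    fewer than two fields.
    """
    records, rejected, current = [], 0, []

    def close_record():
        nonlocal rejected, current
        if len(current) >= 2:
            records.append(current)
        elif current:
            rejected += 1          # a one-field record is rejected
        current = []

    for raw in text.splitlines():
        line = raw.rstrip()
        if not line:
            close_record()         # a blank line ends the record
            continue
        if line.startswith("   ") and current:
            # Continuation line (three-space indent): append to previous field.
            # Assumption: the pieces are joined with a single space.
            tag, ind, data = current[-1]
            current[-1] = (tag, ind, data + " " + line.strip())
            continue
        tag, rest = line[:3], line[4:]
        if not tag.isdigit() or line[3:4] != " ":
            raise ValueError(f"expected a 3-digit tag and one space: {line!r}")
        if tag < "010":
            current.append((tag, "", rest))            # control field
        else:
            current.append((tag, rest[:2], rest[2:]))  # indicators, then data
    close_record()
    return records, rejected
```

Because any run of blank lines simply closes an already-closed record, the number of blank lines between records does not matter.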

MARC BREAKER FORMAT

This utility will also recognize and import MARC BREAKER formatted records. Here is an example of a tagged record in this format:

=000 00000nam\\2200000\a\4500
=001 0123456789
=003 NjP
=005 19930422091534.7
=008 920806s1991\\\\nju\\\\\\\\\\\000\0 \\eng\d
=020 \\$a0777000008
=040 \\$aNjP$cNjP
=100 1\$aSmith, John W.,$d1955-
=245 10$aPolitical tides in America /$cby John W. Smith III ; with an introduction and commentary by Spencer Yarborough.
=260 \\$aCamden, NJ :$bTrendsetter Press$c1991.
=300 \\$a344 p. :$bill., maps ;$c26 cm.
=650 \7$aPolitical science$zUnited States.$2lcsh
=650 \7$aPopulism$zUnited States.$2lcsh
=700 1\$aYarborough, Spencer,$d1931-

Our implementation follows the specifications listed in the MARCMaker/MARCBreaker manual (http://www.loc.gov/marc/makrbrkr.html).
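In MARC Breaker text each field starts with '=' and the tag, and a backslash stands for a blank in indicators and in fixed-position data. The sketch below (Python, for illustration; parse_breaker_line is a hypothetical name, not part of MARC Report) shows how a single line such as '=100 1\$aSmith, John W.,$d1955-' might be split. It assumes the leader is tagged either 000 or LDR, and it leaves backslashes in variable-field data untouched.

```python
def parse_breaker_line(line):
    """Split one MARC Breaker line into (tag, indicators, data)."""
    if not line.startswith("=") or len(line) < 5:
        raise ValueError(f"not a MARC Breaker field line: {line!r}")
    tag = line[1:4]
    body = line[4:].lstrip(" ")   # tolerate one or more spaces after the tag
    is_control = tag == "LDR" or (tag.isdigit() and tag < "010")
    if is_control:
        # Leader and control fields: every backslash is a blank.
        return tag, "", body.replace("\\", " ")
    # Variable fields: two indicator characters, then the subfields.
    indicators = body[:2].replace("\\", " ")
    return tag, indicators, body[2:]
```

For example, the leader line '=000 00000nam\\2200000\a\4500' yields a 24-character leader once the backslashes are turned back into blanks.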

MARC Report will also recognize and load the MARC Breaker character set conversion file (text21.txt) if it is present. This file should be located in the user's My Documents\MarcReport\Templates folder. A copy of this file is included with the program during installation; a backup copy can be found in the program directory under 'defaults/templates'.

COLUMNAR FORMAT

This utility will also import text records formatted in columns. Most databases can dump data in this format, and spreadsheets like Excel can import data in this format. For example:

AUTHOR          TITLE           PUBLISHER       DATE   PAGES
Hader, Berta    The big snow    Aladdin Books   c1948  [48] p.
Grant, Maxwell  Blood red rose  Macmillan       c1986  403 p.
King, Tabitha   Small world     Macmillan       1981   229 p.
Blume, Judy     Forever…        Bradbury Press  c1975  199 p.
Lechner, Alan   Street games    Harper & Row    1980   xiii, 176 p.
Lobel, Arnold   Fables          Harper & Row    c1980  40 p.

Currently, the program requires columnar data to be tab-delimited.

To import data from columns, you must go to the Options page, select the 'Columnar format' option, and enter a tag mapping for the columns.

In the example above, the tag mapping would be:

100 245 260b 260c 300

In the tag mapping box, tags must be separated by one blank space, and there must be one tag (with an optional subfield code) for each column in the data.

One subfield can be specified for each variable tag; if no subfield is entered, the data will be added to a $a.

During import, empty rows will be automatically skipped.

During import, rows that do not have the same number of columns as tags in the mapping will be skipped; a count of any such rows will be displayed at the end of the process.

There is no way to specify leader bytes, indicators, or embedded subfields (unless the data contains the actual MARC subfield character) with this type of import.
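The columnar rules above can be illustrated with a short sketch. This is Python written for this page, not the program's own code: map_rows is a hypothetical name, the sketch emits one (tag, data) pair per column, and it does not attempt to show how the program combines columns that share a tag (such as 260b and 260c) into one field.

```python
def map_rows(text, mapping):
    """Apply a tag mapping such as '100 245 260b 260c 300' to tab-delimited rows.

    Returns (records, skipped), where `skipped` counts rows whose column
    count differs from the number of tags in the mapping.
    """
    specs = []
    for item in mapping.split(" "):
        tag, sub = item[:3], item[3:] or "a"   # no subfield given: use $a
        specs.append((tag, sub))

    records, skipped = [], 0
    for row in text.splitlines():
        if not row.strip():
            continue                            # empty rows are skipped
        cells = row.split("\t")
        if len(cells) != len(specs):
            skipped += 1                        # wrong number of columns
            continue
        record = []
        for (tag, sub), cell in zip(specs, cells):
            if tag < "010":
                record.append((tag, cell))      # control field: no subfield code
            else:
                record.append((tag, f"${sub}{cell}"))
        records.append(record)
    return records, skipped
```

With the mapping '100 245 260b 260c 300', the row for 'Hader, Berta' produces $a data for the 100, 245 and 300, and $b and $c data for the two 260 columns.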

ALEPH IMPORT

Text exported from Aleph systems has a format that is slightly different from the typical MARC tagged text. Use this option if it applies.

These systems also may export alphabetic tags (see the next option).

ALPHABETIC TAG SUPPORT

When an import option known to contain alphabetic tags is selected, the program will prompt you to select a conversion table. If the conversion table (as outlined below) is successfully loaded, MARC Import will convert each alphabetic tag to its corresponding numeric tag. The alphabetic tag itself will be copied to a subfield $9 and inserted as the first subfield of the corresponding field.

When the import is complete, statistics on the alpha-to-numeric conversion will be displayed, including any alphabetic tags that were found in the text but not in the conversion table, if applicable.

If you wish to discard all alphabetic tags, simply clear the option box for the 'Alpha Tag translation table' (i.e., do not supply a translation table; any fields with alphabetic tags will then be silently ignored).

ALPHABETIC TAG CONVERSION TABLE

The format of the alphabetic tag conversion table must be (exactly) as follows: a 3-byte alphabetic tag in uppercase, followed by an '=' sign, followed by a 3-digit numeric tag. For example:

A01=961
CAT=962
FIN=963
FMT=964
LCS=965
SRC=966

Only uppercase alphabetic tags are supported.
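As an illustration of the table format and of the conversion described under ALPHABETIC TAG SUPPORT, here is a minimal Python sketch. It is not MARC Report's code: load_table and convert_field are hypothetical names, entries are assumed to be separated by whitespace or line breaks, and digits are accepted in the alphabetic tag because the example table contains 'A01'.

```python
import re

# Three uppercase letters or digits, '=', then a three-digit numeric tag.
ENTRY = re.compile(r"^([A-Z0-9]{3})=([0-9]{3})$")


def load_table(text):
    """Read conversion entries such as 'CAT=962' into a dict."""
    table = {}
    for entry in text.split():
        match = ENTRY.match(entry)
        if match is None:
            raise ValueError(f"bad conversion table entry: {entry!r}")
        table[match.group(1)] = match.group(2)
    return table


def convert_field(tag, data, table):
    """Return (numeric_tag, data) with the alphabetic tag copied into a
    leading $9, or None when the tag is not in the table."""
    numeric = table.get(tag)
    if numeric is None:
        return None
    return numeric, f"$9{tag}{data}"
```

A field tagged CAT with data '$aBook' would come out as tag 962 with data '$9CAT$aBook'. A tag that is missing from the table returns None, so the caller can count it for the end-of-import statistics.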

NOTES

Save conversion tables to the folder: My Documents\MarcReport\Options

There is a utility called 'findAlphas.exe' in the MARC Report program folder. When you run this utility, it reads the selected file and lists every alphabetic tag in the file along with the number of times each one was found. The utility also generates an alphabetic tag conversion table from the information collected.

For a bit more information on working with files containing alphabetic tags, open the Verify utility and read the Help page there.

phelp/helptextimport.txt · Last modified: 2021/12/29 16:21 (external edit)
