MARC Report: Import from Text

The import utility accepts two different text formats: MARC Tagged, and Data formatted in columns.

Note that this page does not cover MARCXML. For that you should refer to the MARC-to-XML and XML-to-MARC documentation.

The Import utility (as of version 236) also recognizes and imports tagged records that use LC's MARC Breaker format (for information on this format, see http://www.loc.gov/marc/makrbrkr.html).

TAGGED FORMAT

By default, this utility will import text records in tagged format into MARC. This is a standard format for MARC records rendered as text.

For example, here are two (very brief) records in MARC tagged format:

100 1 $aSilverstein, Shel.
245 10$aWho wants a cheap rhinoceros? 
250   $aRev. and expanded ed.
260 0 $aLondon :$bCollier Macmillan,$cc1983.
300   $a[56] p. :$bill. ;$c20 x 25 cm.

100 1 $aCrocker, Betty.
245 10$aBetty Crocker's do-ahead cookbook.
250   $a1st ed.
260   $aNew York, N.Y. :$bMacmillan USA,$cc1994.
300   $a173 p. :$bill. (some col.) ;$c27 cm.

The blank line after each record is important as it tells the program where each record ends. (It does not matter how many blank lines separate the records).

Imported records should have a minimum of two fields. If there is only one field in a record, the importer will reject it.

Each tag must be entered as three digits: '008' and not '8'.

The subfield delimiter should be a dollar sign ('$').

There must be one blank space (and not a TAB) after each tag: '100 '

There cannot be blank lines between tags. A blank line is the signal to the text-to-MARC conversion program that it has reached the end of the record.
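
The rules above can be checked before import with a short script. The following is only a sketch in Python, not part of MARC Report: the function names split_records and check_record are hypothetical, and the checks cover just the rules listed on this page (blank-line record separator, three-digit tag, one blank space after the tag, at least two fields).

```python
import re

# Three digits, exactly one blank space, then the field data
TAG_LINE = re.compile(r"^(\d{3}) (.*)$")


def split_records(text):
    """Split tagged text into records; one or more blank lines end a record."""
    records, current = [], []
    for line in text.splitlines():
        if line.strip() == "":
            if current:
                records.append(current)
                current = []
        else:
            current.append(line)
    if current:
        records.append(current)
    return records


def check_record(lines):
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    # Indented lines are continuations of a wrapped field, not new fields
    fields = [line for line in lines if not line.startswith(" ")]
    if len(fields) < 2:
        problems.append("fewer than two fields")
    for line in fields:
        if not TAG_LINE.match(line):
            problems.append("bad tag or separator: " + repr(line[:8]))
    return problems


sample = (
    "100 1 $aSilverstein, Shel.\n"
    "245 10$aWho wants a cheap rhinoceros?\n"
    "\n\n"
    "100 1 $aCrocker, Betty.\n"
    "245 10$aBetty Crocker's do-ahead cookbook.\n"
)
for number, record in enumerate(split_records(sample), start=1):
    print(number, check_record(record))
```

A tag typed as '8' instead of '008', or a TAB after the tag, would be reported by check_record as a bad tag or separator.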

For control fields (tags 000 to 009), the field data starts immediately after the blank space that follows the tag:

001 ocm00811859 
003 OCoLC

NOTE: If a fixed-length field like the 008 is present and is not the correct length, the conversion will take the field as given, and present a warning note after the text file has been processed.

For variable fields, the indicators start immediately after the blank space that follows the tag. There is no spacing between the indicators, and the first subfield of the data follows immediately after the second indicator:

[Both indicators blank]
260   $aAmsterdam :$bMoody's Investors Service.
[First indicator blank]
650  0$aCorporations$zUnited States.
[Second indicator blank]
245 0 $aMoody's OTC reports.
[Both indicators coded]
246 30$aOTC industrial news reports
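
As an illustration of this layout, here is a minimal Python sketch that takes one tagged line apart. The function name parse_field is hypothetical, and the sketch assumes that the data contains no literal '$' other than subfield delimiters; it is not the program's own parser.

```python
def parse_field(line):
    """Split one tagged line into tag, indicators, and subfields."""
    tag, data = line[:3], line[4:]  # line[3] is the single blank after the tag
    if tag < "010":
        # Control field (000 to 009): data starts right after the blank
        return {"tag": tag, "data": data}
    ind1, ind2, body = data[0], data[1], data[2:]
    # Each subfield is '$', a one-character code, then the subfield text
    subfields = [(chunk[0], chunk[1:]) for chunk in body.split("$")[1:] if chunk]
    return {"tag": tag, "ind1": ind1, "ind2": ind2, "subfields": subfields}


print(parse_field("001 ocm00811859"))
print(parse_field("650  0$aCorporations$zUnited States."))
```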

There should be no line breaks in variable fields (i.e., Word Wrap should be turned off). If there are line breaks, then each text line that follows the line containing the tag should be indented with three blank spaces:

520   $aEach player forms interlocking words cross-word 
   fashion on the board and competes for high score by 
   taking advantage of letter values.
600   $aGames.
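
If a file has already been word-wrapped, the continuation lines can be folded back into their fields before import. The Python sketch below shows one way to do this. The function name unwrap_fields is hypothetical, and joining the pieces with a single space is an assumption; this page does not state how the importer itself rejoins wrapped lines.

```python
def unwrap_fields(lines):
    """Join indented continuation lines onto the preceding tagged line."""
    fields = []
    for line in lines:
        if line.startswith(" ") and fields:
            # Assumption: the wrapped pieces are rejoined with one space
            fields[-1] = fields[-1].rstrip() + " " + line.strip()
        else:
            fields.append(line.rstrip("\n"))
    return fields


wrapped = [
    "520   $aEach player forms interlocking words cross-word ",
    "   fashion on the board and competes for high score by ",
    "   taking advantage of letter values.",
    "600   $aGames.",
]
for field in unwrap_fields(wrapped):
    print(field)
```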

MARC BREAKER FORMAT

This utility will also recognize and import MARC BREAKER formatted records. Here is an example of a tagged record in this format:

=000  00000nam\\2200000\a\4500
=001  0123456789
=003  NjP
=005  19930422091534.7
=008  920806s1991\\\\nju\\\\\\\\\\\000\0\\eng\d
=020  \\$a0777000008
=040  \\$aNjP$cNjP
=100  1\$aSmith, John W.,$d1955-
=245  10$aPolitical tides in America /$cby John W. Smith III
 ; with an introduction and commentary by Spencer Yarborough.
=260  \\$aCamden, NJ :$bTrendsetter Press$c1991.
=300  \\$a344 p. :$bill., maps ;$c26 cm.
=650  \7$aPolitical science$zUnited States.$2lcsh
=650  \7$aPopulism$zUnited States.$2lcsh
=700  1\$aYarborough, Spencer,$d1931-

Our implementation follows the specifications listed in the MARCMaker/MARCBreaker manual (http://www.loc.gov/marc/makrbrkr.html).
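
To show how the two text layouts relate, the Python sketch below rewrites a MARC Breaker line in the tagged format described earlier on this page. It relies on two conventions visible in the example above: each line starts with '=', the tag, and two blank spaces, and a backslash stands for a blank in indicators and control fields. The function name breaker_to_tagged is hypothetical, and the sketch does not handle MARC Breaker mnemonics or the character set conversion file.

```python
def breaker_to_tagged(line):
    """Convert one MARC Breaker field line to the tagged form."""
    if not line.startswith("="):
        raise ValueError("not a MARC Breaker field line: " + repr(line))
    tag, data = line[1:4], line[6:]  # skip '=', the tag, and two blank spaces
    if tag < "010":
        # Control fields: every backslash stands for a blank
        data = data.replace("\\", " ")
    else:
        # Variable fields: only the two indicator positions are converted
        data = data[:2].replace("\\", " ") + data[2:]
    return tag + " " + data


print(breaker_to_tagged(r"=100  1\$aSmith, John W.,$d1955-"))
print(breaker_to_tagged(r"=020  \\$a0777000008"))
```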

MARC Report will also recognize and load the MARC Breaker character set conversion file (text21.txt) if it is present. This file should be located in the user's My Documents\MarcReport\Templates folder. A copy of this file is included with the program during installation; a backup copy can be found in the program directory under 'defaults/templates'.

COLUMNAR FORMAT

This utility will also import text records formatted in columns. Most databases can dump data in this format, and spreadsheets like Excel can import data in this format. For example:

AUTHOR		TITLE		PUBLISHER	DATE	PAGES
Hader, Berta	        The big snow 	Aladdin Books	c1948	[48] p. 
Grant, Maxwell	Blood red rose	Macmillan	c1986	403 p. 
King, Tabitha	        Small world 	Macmillan	1981	229 p. 
Blume, Judy 	        Forever... 	Bradbury Press	c1975	199 p. 
Lechner, Alan	        Street games 	Harper & Row	1980	xiii, 176 p.
Lobel, Arnold	        Fables 		Harper & Row	c1980	40 p.

Currently, the program requires columnar data to be tab-delimited.

To import data from columns, you must go to the Options page, select the 'Columnar format' option, and enter a tag mapping for the columns.

In the example above, the tag mapping would be:

100 245 260b 260c 300

In the tag mapping box, tags must be separated by one blank space, and there must be one tag (with an optional subfield code) for each column in the data.

One subfield can be specified for each variable tag; if no subfield is entered, the data will be added to a $a.

During import, empty rows will be automatically skipped.

During import, rows that do not have the same number of columns as tags in the mapping will be skipped; a count of any such rows will be displayed at the end of the process.

There is no way to specify leader bytes, indicators, or embedded subfields (unless the data contains the actual MARC subfield character) with this type of import.
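
The mapping rules above can be sketched in Python as follows. This is an illustration only: the function name columns_to_tagged is hypothetical, the sketch assumes that every tag in the mapping is a variable tag, and it writes one tagged line per column with blank indicators. Whether MARC Report combines '260b' and '260c' into a single 260 field is not stated on this page, so the sketch does not try to.

```python
def columns_to_tagged(rows, mapping):
    """Turn tab-delimited rows into tagged records using a tag mapping."""
    specs = mapping.split(" ")  # e.g. ['100', '245', '260b', '260c', '300']
    records, skipped = [], 0
    for row in rows:
        if row.strip() == "":
            continue  # empty rows are skipped silently
        cells = row.rstrip("\n").split("\t")
        if len(cells) != len(specs):
            skipped += 1  # wrong number of columns: skip the row and count it
            continue
        fields = []
        for spec, cell in zip(specs, cells):
            tag, code = spec[:3], spec[3:] or "a"  # default subfield is $a
            # Tag, one blank, two blank indicators, then the subfield
            fields.append(tag + "   $" + code + cell.strip())
        records.append(fields)
    return records, skipped


rows = [
    "Grant, Maxwell\tBlood red rose\tMacmillan\tc1986\t403 p.",
    "",
    "King, Tabitha\tSmall world",
]
records, skipped = columns_to_tagged(rows, "100 245 260b 260c 300")
print(records)
print("rows skipped:", skipped)
```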

OTHER IMPORT FORMATS

Text exported from Aleph systems has a format that is slightly different from the typical MARC tagged text. Use this option if it applies.

These systems also may export alphabetic tags (see the next option).

ALPHABETIC TAG SUPPORT

When an import option known to contain alphabetic tags is selected, the program will prompt you to select a conversion table. If the conversion table (as outlined below) is successfully loaded, MARC Import will convert each alphabetic tag to its corresponding numeric tag. The alphabetic tag itself will be copied to a subfield $9 and inserted as the first subfield of the corresponding field.

When the import is complete, statistics on the alpha-to-numeric conversion will be displayed, including any alphabetic tags that were found in the text but not in the conversion table, if applicable.

If you wish to discard all alphabetic tags, simply clear the option box for the 'Alpha Tag translation table' (i.e., do not supply a translation table; any fields with alphabetic tags will then be silently ignored).

ALPHABETIC TAG CONVERSION TABLE

The format of the alphabetic tag conversion table must be (exactly) as follows: a 3-byte alphabetic tag in uppercase, followed by an '=' sign, followed by a 3-digit numeric tag. For example:

A01=961
CAT=962
FIN=963
FMT=964
LCS=965
SRC=966

Only uppercase alphabetic tags are supported.
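
The table format and the conversion it drives can be illustrated with a short Python sketch. The function names load_table and convert_field are hypothetical, and the layout of the input line (alphabetic tag, one blank, two indicators, then subfields) is an assumption modelled on the tagged format above. Following the example table, the sketch accepts a tag that starts with an uppercase letter and may contain digits, such as 'A01'.

```python
import re

# Assumed pattern: uppercase letter, two uppercase letters or digits, '=', 3 digits
TABLE_LINE = re.compile(r"^([A-Z][A-Z0-9]{2})=(\d{3})$")


def load_table(lines):
    """Parse 'AAA=999' lines into a dict of alphabetic tag -> numeric tag."""
    table = {}
    for line in lines:
        match = TABLE_LINE.match(line.strip())
        if match:
            table[match.group(1)] = match.group(2)
    return table


def convert_field(line, table):
    """Renumber one tagged line; return None if the tag is not in the table."""
    alpha, rest = line[:3], line[4:]
    numeric = table.get(alpha)
    if numeric is None:
        return None
    indicators, body = rest[:2], rest[2:]
    # The original alphabetic tag is inserted as the first subfield, $9
    return numeric + " " + indicators + "$9" + alpha + body


table = load_table(["A01=961", "CAT=962", "FIN=963"])
print(convert_field("CAT   $aBatch 42", table))
```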

NOTES

Save alpha conversion tables to the folder: My Documents\MarcReport\Options

There is a utility called 'findAlphas.exe' in the Marc Report program folder. When you run this utility, it will read the file selected and create a list of all alphabetic tags in the file and the number of times each one was found. This utility will also generate an alphabetic tag conversion table using the information collected.

For a bit more information on working with files containing alphabetic tags, open the Verify utility and read the Help page there.

phelp/helptextimport.txt · Last modified: 2016/02/18 09:26 (external edit)
CC Attribution-Noncommercial-Share Alike 3.0 Unported