phelp:helptextimport [2021/12/29 16:21] (current)
 +IMPORT FROM TEXT
 +
 +The import utility accepts two different formats: MARC tagged text, and data formatted in columns.
 +
 +The utility (as of version 236) will recognize and import tagged records that use LC's MARC Breaker format (for information on this format, see: http://www.loc.gov/marc/makrbrkr.html).
 +
 +TAGGED FORMAT
 +
 +By default, this utility will import text records in tagged format into MARC. This is a standard format for MARC records rendered as text.
 +
 +For example, here are two (very brief) records in MARC tagged format:
 +
 +100 1 $aSilverstein, Shel.
 +245 10$aWho wants a cheap rhinoceros? 
 +250   $aRev. and expanded ed.
 +260 0 $aLondon :$bCollier Macmillan,$cc1983.
 +300   $a[56] p. :$bill. ;$c20 x 25 cm.
 +
 +100 1 $aCrocker, Betty.
 +245 10$aBetty Crocker's do-ahead cookbook.
 +250   $a1st ed.
 +260   $aNew York, N.Y. :$bMacmillan USA,$cc1994.
 +300   $a173 p. :$bill. (some col.) ;$c27 cm.
 +
 +The blank line after each record is important as it tells the program where each record ends. (It does not matter how many blank lines separate the records).
 +
 +Imported records should have a minimum of two fields. If there is only one field per record, the importer will reject it.
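The record-splitting rules above can be sketched in a few lines of Python. This is only an illustration, not the importer's actual code: the function name `split_records` is hypothetical, whitespace-only lines are assumed to count as blank, and each line is counted as one field (continuation lines are ignored here).

```python
def split_records(text):
    """Split tagged text into records; one or more blank lines end a record.

    Returns (records, rejected), where each record is a list of field lines
    and rejected counts records with fewer than two fields.
    """
    records, rejected, current = [], 0, []
    # The extra "" guarantees that the last record is flushed.
    for line in text.splitlines() + [""]:
        if line.strip():
            current.append(line)
        elif current:
            if len(current) >= 2:
                records.append(current)
            else:
                rejected += 1  # a single-field record is rejected
            current = []
    return records, rejected


sample = (
    "100 1 $aSilverstein, Shel.\n"
    "245 10$aWho wants a cheap rhinoceros?\n"
    "\n"
    "\n"
    "100 1 $aCrocker, Betty.\n"
    "245 10$aBetty Crocker's do-ahead cookbook.\n"
    "\n"
    "245 10$aA record with only one field.\n"
)
records, rejected = split_records(sample)
```

In this sample the two blank lines between the first two records are treated the same as one, and the final one-field record is counted as rejected.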
 +
 +Each tag must be entered as three digits: '008' and not '8'.
 +
 +The subfield delimiter should be a dollar sign ('$').
 +
 +There must be one blank space (and not a TAB) after each tag: '100 '.
 +
 +There cannot be blank lines between tags. A blank line is the signal to the text-to-MARC conversion program that it has reached the end of the record. 
 +
 +For control fields (tags 000 to 009), the field data starts immediately after the blank space that follows the tag:
 +
 +001 ocm00811859 
 +003 OCoLC
 +
 +NOTE: If a fixed-length field like the 008 is present and is not the correct length, the conversion will take the field as given, and present a warning note after the text file has been processed.
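As a rough sketch of this kind of check: MARC 21 defines the bibliographic 005 as 16 characters, the 006 as 18, and the 008 as 40. Which fields the importer actually checks, and how it words its warning, is not stated here, so the table and the function name `check_fixed_lengths` below are assumptions.

```python
# Lengths defined by MARC 21 for bibliographic fixed-length control fields.
# Which of these the importer really checks is an assumption.
FIXED_LENGTHS = {"005": 16, "006": 18, "008": 40}


def check_fixed_lengths(fields):
    """Return one warning per wrong-length field; the fields are not changed."""
    warnings = []
    for tag, data in fields:
        expected = FIXED_LENGTHS.get(tag)
        if expected is not None and len(data) != expected:
            warnings.append(
                f"{tag}: expected {expected} characters, found {len(data)}"
            )
    return warnings


# The 008 here is deliberately too short to trigger a warning.
fields = [("005", "19930422091534.7"), ("008", "920806s1991")]
warnings = check_fixed_lengths(fields)
```

The field data is taken as given and only reported on, which mirrors the behavior described in the note.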
 +
 +For variable fields, the indicators start immediately after the blank space that follows the tag. There is no spacing between the indicators, and the first subfield of the data follows immediately after the second indicator: 
 +
 +[Both indicators blank]
 +260   $aAmsterdam :$bMoody's Investors Service.
 +
 +[First indicator blank]
 +650  0$aCorporations$zUnited States.
 +
 +[Second indicator blank]
 +245 0 $aMoody's OTC reports.
 +
 +[Both indicators coded]
 +246 30$aOTC industrial news reports
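The layout rules above (three-digit tag, one space, then either control field data or two indicators followed by subfields) can be illustrated with a small parser. This is a minimal sketch under stated assumptions: `parse_field` is a hypothetical name, and every '$' in a variable field is assumed to start a subfield, so literal dollar signs in the data are not handled.

```python
def parse_field(line):
    """Parse one line of tagged text into a dict describing the field."""
    tag = line[:3]
    if len(tag) != 3 or not tag.isdigit() or line[3:4] != " ":
        raise ValueError(f"expected a 3-digit tag and one space: {line!r}")
    body = line[4:]
    if tag < "010":
        # Control fields: the data starts right after the space.
        return {"tag": tag, "data": body}
    if len(body) < 2:
        raise ValueError(f"missing indicators: {line!r}")
    ind1, ind2 = body[0], body[1]
    # Text before the first '$' is dropped; each chunk is code + value.
    subfields = [(c[0], c[1:]) for c in body[2:].split("$")[1:] if c]
    return {"tag": tag, "ind1": ind1, "ind2": ind2, "subfields": subfields}


control = parse_field("003 OCoLC")
subject = parse_field("650  0$aCorporations$zUnited States.")
title = parse_field("245 0 $aMoody's OTC reports.")
```

A blank indicator is kept as a single space, so the first indicator of the 650 and the second indicator of the 245 both come back as ' '.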
 +
 +There should be no line breaks in variable fields (i.e., Word Wrap should be turned off). If there are line breaks, then the start of each text line that follows the line with the tag in it should be indented with three blank spaces:
 +
 +520   $aEach player forms interlocking words cross-word 
 +   fashion on the board and competes for high score by 
 +   taking advantage of letter values.
 +600   $aGames.
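One way to picture how wrapped lines are rejoined is sketched below. It assumes the three-space indent is simply removed and the remaining text is appended to the previous field unchanged; whether the importer adds or trims spaces at the join is not documented here. The name `join_continuations` is hypothetical.

```python
def join_continuations(lines):
    """Merge lines indented with three spaces into the field above them."""
    fields = []
    for line in lines:
        if line.startswith("   ") and fields:
            fields[-1] += line[3:]  # drop the indent, keep the rest as is
        else:
            fields.append(line)
    return fields


lines = [
    "520   $aEach player forms interlocking words cross-word ",
    "   fashion on the board and competes for high score by ",
    "   taking advantage of letter values.",
    "600   $aGames.",
]
fields = join_continuations(lines)
```

Because a tag always starts with a digit, a line that begins with three spaces cannot be mistaken for the start of a new field.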
 +
 +MARC BREAKER FORMAT
 +
 +This utility will also recognize and import MARC BREAKER formatted records. Here is an example of a tagged record in this format:
 +
 +=000  00000nam\\2200000\a\4500
 +=001  0123456789
 +=003  NjP
 +=005  19930422091534.7
 +=008  920806s1991\\\\nju\\\\\\\\\\\000\0\\eng\d
 +=020  \\$a0777000008
 +=040  \\$aNjP$cNjP
 +=100  1\$aSmith, John W.,$d1955-
 +=245  10$aPolitical tides in America /$cby John W. Smith III
 + ; with an introduction and commentary by Spencer Yarborough.
 +=260  \\$aCamden, NJ :$bTrendsetter Press$c1991.
 +=300  \\$a344 p. :$bill., maps ;$c26 cm.
 +=650  \7$aPolitical science$zUnited States.$2lcsh
 +=650  \7$aPopulism$zUnited States.$2lcsh
 +=700  1\$aYarborough, Spencer,$d1931-
 +
 +Our implementation follows the specifications listed in the MARCMaker/MARCBreaker manual (http://www.loc.gov/marc/makrbrkr.html). 
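To illustrate how a MARC Breaker line differs from the plain tagged format, here is a minimal sketch of a line parser. It assumes an '=' prefix, a three-character tag, two spaces, and a backslash standing for a blank in the leader, control fields, and indicators. Backslashes inside subfield data are left untouched, and mnemonics such as those in text21.txt are not handled. The name `parse_breaker_line` is hypothetical, and accepting 'LDR' as a leader tag follows LC's format, not anything stated on this page.

```python
def parse_breaker_line(line):
    """Parse one MARC Breaker line such as '=245  10$aTitle'."""
    if not line.startswith("=") or line[4:6] != "  ":
        raise ValueError(f"not a MARC Breaker field line: {line!r}")
    tag, body = line[1:4], line[6:]
    if tag == "LDR" or tag < "010":
        # Leader and control fields: every backslash stands for a blank.
        return {"tag": tag, "data": body.replace("\\", " ")}
    if len(body) < 2:
        raise ValueError(f"missing indicators: {line!r}")
    ind1 = body[0].replace("\\", " ")
    ind2 = body[1].replace("\\", " ")
    subfields = [(c[0], c[1:]) for c in body[2:].split("$")[1:] if c]
    return {"tag": tag, "ind1": ind1, "ind2": ind2, "subfields": subfields}


author = parse_breaker_line(r"=100  1\$aSmith, John W.,$d1955-")
isbn = parse_breaker_line(r"=020  \\$a0777000008")
source = parse_breaker_line("=003  NjP")
```

Wrapped lines, like the continuation of the 245 in the example record, would need to be rejoined before a parser like this is applied.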
 +
 +MARC Report will also recognize and load the MARC Breaker character set conversion file (text21.txt) if it is present. This file should be located in the user's My Documents\MarcReport\Templates folder. A copy of this file is included with the program during installation; a backup copy can be found in the program directory under 'defaults/templates'.
 +
 +
 +COLUMNAR FORMAT
 +
 +This utility will also import text records formatted in columns. Most databases can dump data in this format, and spreadsheets like Excel can import data in this format. For example:
 +
 +AUTHOR          TITLE           PUBLISHER       DATE   PAGES
 +Hader, Berta    The big snow    Aladdin Books   c1948  [48] p.
 +Grant, Maxwell  Blood red rose  Macmillan       c1986  403 p.
 +King, Tabitha   Small world     Macmillan       1981   229 p.
 +Blume, Judy     Forever...      Bradbury Press  c1975  199 p.
 +Lechner, Alan   Street games    Harper & Row    1980   xiii, 176 p.
 +Lobel, Arnold   Fables          Harper & Row    c1980  40 p.
 +
 +Currently, the program requires columnar data to be tab-delimited.
 +
 +To import data from columns, you must go to the Options page, select the 'Columnar format' option, and enter a tag mapping for the columns. 
 +
 +In the example above, the tag mapping would be:
 +
 +100 245 260b 260c 300
 +
 +In the tag mapping box, tags must be separated by one blank space and there must be one tag/subfield for each column in the data.
 +
 +One subfield can be specified for each variable tag; if no subfield is entered, the data will be added to a $a. 
 +
 +During import, empty rows will be automatically skipped.
 +
 +During import, rows that do not have the same number of columns as tags in the mapping will be skipped; a count of any such rows will be displayed at the end of the process.
 +
 +There is no way to specify leader bytes, indicators, or embedded subfields (unless the data contains the actual MARC subfield character) with this type of import.
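The columnar rules above can be sketched as follows. This is an illustration only: the names `parse_mapping` and `import_rows` are hypothetical, and combining columns that share a tag (260b and 260c) into one field is an assumption about how the importer behaves. The sample omits the header row, since the page does not say whether a header is skipped.

```python
def parse_mapping(mapping):
    """Turn '100 245 260b 260c 300' into (tag, subfield code) pairs."""
    columns = []
    for item in mapping.split(" "):
        tag, code = item[:3], item[3:] or "a"  # no subfield given means $a
        columns.append((tag, code))
    return columns


def import_rows(text, mapping):
    """Return (records, skipped) for tab-delimited rows."""
    columns = parse_mapping(mapping)
    records, skipped = [], 0
    for row in text.splitlines():
        if not row.strip():
            continue  # empty rows are skipped and not counted
        cells = row.split("\t")
        if len(cells) != len(columns):
            skipped += 1  # column count does not match the mapping
            continue
        record = {}
        for (tag, code), value in zip(columns, cells):
            # Assumption: columns sharing a tag become one field.
            record.setdefault(tag, []).append((code, value))
        records.append(record)
    return records, skipped


data = (
    "Hader, Berta\tThe big snow\tAladdin Books\tc1948\t[48] p.\n"
    "\n"
    "King, Tabitha\tSmall world\n"
)
records, skipped = import_rows(data, "100 245 260b 260c 300")
```

Here the second data row has only two columns against five tags in the mapping, so it is skipped and counted.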
 +
 +ALEPH IMPORT
 +
 +Text exported from Aleph systems has a format that is slightly different from the typical MARC tagged text. Use this option if it applies.
 +
 +These systems also may export alphabetic tags (see the next option). 
 +
 +ALPHABETIC TAG SUPPORT
 +
 +When an import option known to contain alphabetic tags is selected, the program will prompt you to select a conversion table. If the conversion table (as outlined below) is successfully loaded, MARC Import will convert each alphabetic tag to its corresponding numeric tag. The alphabetic tag itself will be copied to a subfield $9 and inserted as the first subfield of the corresponding field. 
 +
 +When the import is complete, statistics on the alpha-to-numeric conversion will be displayed, including any alphabetic tags that were found in the text but not in the conversion table, if applicable. 
 +
 +If you wish to discard all alphabetic tags, simply clear the option box for the 'Alpha Tag translation table' (i.e., do not supply a translation table, and any fields with alphabetic tags will be silently ignored).
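The conversion described in this section can be pictured with a short sketch. It assumes fields are already parsed into (tag, subfields) pairs and that a field whose alphabetic tag is missing from the table is dropped and counted; the page only says that such tags are reported. The name `convert_alpha_tags` is hypothetical.

```python
from collections import Counter


def convert_alpha_tags(fields, table):
    """Map alphabetic tags to numeric ones, keeping the old tag in $9."""
    converted, unknown = [], Counter()
    for tag, subfields in fields:
        if tag.isdigit():
            converted.append((tag, subfields))
        elif tag in table:
            # The alphabetic tag becomes the first subfield, $9.
            converted.append((table[tag], [("9", tag)] + subfields))
        else:
            unknown[tag] += 1  # assumption: dropped, but reported
    return converted, unknown


fields = [
    ("245", [("a", "Title")]),
    ("CAT", [("a", "x")]),
    ("ZZZ", [("a", "y")]),
]
converted, unknown = convert_alpha_tags(fields, {"CAT": "962"})
discarded, _ = convert_alpha_tags(fields, {})
```

Passing an empty table corresponds to clearing the translation table option: every field with an alphabetic tag is left out of the result.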
 +
 +ALPHABETIC TAG CONVERSION TABLE
 +
 +The format of the alphabetic tag conversion table must be (exactly) as follows: a 3-byte alphabetic tag in uppercase, followed by an '=' sign, followed by a 3-digit numeric tag. For example:
 +
 +A01=961
 +CAT=962
 +FIN=963
 +FMT=964
 +LCS=965
 +SRC=966
 +
 +Only uppercase alphabetic tags are supported. 
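A table in this format is simple to read and validate. The sketch below is not the program's own loader: `load_table` is a hypothetical name, blank lines are assumed to be ignored, and digits are accepted in the alphabetic tag because the example includes 'A01'.

```python
import re

# Three uppercase letters or digits, '=', then three digits.
ENTRY = re.compile(r"([A-Z0-9]{3})=(\d{3})")


def load_table(text):
    """Read 'AAA=NNN' lines into a dict of alphabetic tag -> numeric tag."""
    table = {}
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # assumption: blank lines are ignored
        match = ENTRY.fullmatch(line)
        # An all-digit left-hand side is not an alphabetic tag.
        if match is None or match.group(1).isdigit():
            raise ValueError(f"line {number}: expected AAA=NNN, got {line!r}")
        table[match.group(1)] = match.group(2)
    return table


table = load_table("A01=961\nCAT=962\nFIN=963\n")
```

A lowercase entry such as 'cat=962' raises a ValueError, in keeping with the rule that only uppercase alphabetic tags are supported.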
 +
 +NOTES
 +
 +Save conversion tables to the folder: My Documents\MarcReport\Options
 +
 +There is a utility called 'findAlphas.exe' in the MARC Report program folder. When you run this utility, it will read the file selected and create a list of all alphabetic tags in the file and the number of times each one was found. This utility will also generate an alphabetic tag conversion table using the information collected.
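For readers who want to see the idea behind such a scan, here is a rough sketch. It is not the findAlphas.exe implementation. It assumes the text uses the tagged layout (tag in the first three characters, then a space), and the numbering scheme in `make_table` (sorted tags numbered upward from 961) is purely an assumption modeled on the example table. Both function names are hypothetical.

```python
import re
from collections import Counter

# A tag of three uppercase letters/digits that is not all digits.
ALPHA_TAG = re.compile(r"(?!\d{3})([A-Z0-9]{3}) ")


def count_alpha_tags(text):
    """Count how often each alphabetic tag starts a line."""
    counts = Counter()
    for line in text.splitlines():
        match = ALPHA_TAG.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def make_table(counts, first=961):
    """Build 'AAA=NNN' lines; the numbering scheme is an assumption."""
    return "\n".join(
        f"{tag}={first + offset:03d}"
        for offset, tag in enumerate(sorted(counts))
    )


text = "CAT   $ax\n245 10$aTitle\nA01   $ay\nCAT   $az\n"
counts = count_alpha_tags(text)
table_text = make_table(counts)
```

The generated text uses the same 'AAA=NNN' layout as the conversion table, so it could be saved and edited by hand before use.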
 +
 +For a bit more information on working with files containing alphabetic tags, open the Verify utility and read the Help page there.
  
CC Attribution-Share Alike 4.0 International