Importing Authority data

Needs rewriting. Deborah, 2014/10/02 00:23

To facilitate the import of MARC data into RIMMF, we have developed an interface that searches a few sources of freely available data on the web. To open this search form, press <F3>.

Searching

To use the service, select a search target on the left, enter some text in the top box, then press Search. The default search target is '[ID] NAF', by which we mean the LC Name Authority File as available via http://id.loc.gov (Thank you, LC!).

For example:

Using the search box may take a bit of practice, especially as each search target will behave somewhat differently.

As you can see in the example above, our search for 'Dewey, M' does not browse all the way through to 'Dewey, Melvil' 1).

To get down to 'Melvil' we will need to enter at least 'Dewey, Me' in the search box and try again.
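
For readers who want to see what sits behind this kind of search, here is a minimal sketch of reading an OpenSearch Suggestions response, which is a JSON array of the form [query, [completions], [descriptions], [urls]]. The endpoint path in `SUGGEST_BASE`, the function names, and the sample payload are our assumptions for illustration only; they are not taken from RIMMF, and the payload is not real service output.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint for the LC name authority suggest service; verify on id.loc.gov.
SUGGEST_BASE = "https://id.loc.gov/authorities/names/suggest/"


def build_suggest_url(query: str) -> str:
    """Return a suggest URL for a left-anchored query string."""
    return SUGGEST_BASE + "?" + urlencode({"q": query})


def parse_suggestions(payload: str) -> list:
    """Turn an OpenSearch Suggestions response into (heading, uri) pairs."""
    data = json.loads(payload)
    completions = data[1]
    # The URL list is optional in the format, so fall back to empty strings.
    urls = data[3] if len(data) > 3 else [""] * len(completions)
    return list(zip(completions, urls))


# Illustrative payload only: the heading and the URI are placeholders.
SAMPLE = json.dumps([
    "Dewey, Me",
    ["Dewey, Melvil (example heading)"],
    ["1 result"],
    ["http://id.loc.gov/authorities/names/EXAMPLE"],
])

hits = parse_suggestions(SAMPLE)
```

Because the service matches from the start of the heading and returns a short list, a longer query string such as 'Dewey, Me' is the practical way to reach headings further down the browse order.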

The same search against VIAF, which also offers a similar search protocol, will get different results:

Once a search has been executed, the options on the right become actionable.

  • View MARC: downloads the MARCXML version of the record and converts it to tagged text format, then displays it in the RIMMF browser
  • Import: maps the MARC record into a RIMMF RDA record, using the current mapping settings, then opens the RIMMF record (see below)
  • Help: displays this wiki page
  • Tips: some things we have learned about using this search form (and we have much to learn yet)
  • Cancel: useful if a search gets 'hung up'
  • Close: dismiss the search form
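
To give a feel for what 'View MARC' does conceptually, the following is a minimal sketch of converting a MARCXML record into tagged text lines. It is not RIMMF's own code. The function name `marcxml_to_tagged`, the 'LDR' label for the leader, and the sample record are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by MARCXML ("MARC21 slim") documents.
MARC_NS = "{http://www.loc.gov/MARC21/slim}"

# Illustrative record only, not downloaded from any service.
SAMPLE_XML = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nz  a2200000n  4500</leader>
  <controlfield tag="001">example001</controlfield>
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Dewey, Melvil</subfield>
    <subfield code="c">(Rapper)</subfield>
  </datafield>
</record>"""


def marcxml_to_tagged(xml_text: str) -> list:
    """Return one tagged-text line per leader, control field and data field."""
    root = ET.fromstring(xml_text)
    # Accept either a bare <record> or a <collection> wrapping one.
    record = root if root.tag == MARC_NS + "record" else root.find(MARC_NS + "record")
    if record is None:
        raise ValueError("no MARCXML <record> element found")
    lines = []
    for field in record:
        if field.tag == MARC_NS + "leader":
            lines.append("LDR " + (field.text or ""))
        elif field.tag == MARC_NS + "controlfield":
            lines.append(f"{field.get('tag')} {field.text or ''}")
        elif field.tag == MARC_NS + "datafield":
            indicators = (field.get("ind1") or " ") + (field.get("ind2") or " ")
            subfields = "".join(
                f"${sf.get('code')}{sf.text or ''}"
                for sf in field.findall(MARC_NS + "subfield")
            )
            lines.append(f"{field.get('tag')} {indicators} {subfields}")
    return lines


tagged = marcxml_to_tagged(SAMPLE_XML)
```

With the sample record, the last line produced is `100 1  $aDewey, Melvil$c(Rapper)`, the same layout used for MARC fields elsewhere on this page.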

In addition, right-clicking on any heading in the search results hitlist will browse to the html page for that URI:

Importing

The quality of the 'records' imported into RIMMF depends on how well we are able to map the linked data into our RDA element implementation. This part of RIMMF has always been, and still is, a work in progress.

Since authority data is typically generated in MARC, or in a MARC-like format, and since our mapping expertise (such as it is) lies in MARC, we have decided to use MARCXML serializations to create the RIMMF data. To access the MARCXML for a heading, right-click on the heading, and then page down to the bottom of the resulting web page. For an LC page, click on the last link, 'MARC/XML'; for a VIAF page, expand the 'Record Views' section, then click the 'MARC 21 record' link. One good thing about the MARCXML formats is that the data returned is as close to lossless as we will ever get.
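
The 'MARC/XML' link on an LC page can also be derived from the heading's URI. The sketch below assumes that id.loc.gov serves the MARCXML serialization at the heading URI with a `.marcxml.xml` suffix; please check this against the link on the page itself before relying on it. The function name and the sample URI are hypothetical, and VIAF is deliberately left out because we are not sure of its URL pattern.

```python
from urllib.parse import urlparse


def lc_marcxml_url(heading_uri: str) -> str:
    """Build the assumed MARCXML URL for an id.loc.gov heading URI."""
    if urlparse(heading_uri).netloc != "id.loc.gov":
        raise ValueError("not an id.loc.gov URI: " + heading_uri)
    base = heading_uri.rstrip("/")
    # Drop the suffix of the HTML view, if the URI was copied from a browser.
    if base.endswith(".html"):
        base = base[: -len(".html")]
    # Assumption: the MARCXML serialization uses this suffix.
    return base + ".marcxml.xml"


# Placeholder identifier, not a real authority record.
example = lc_marcxml_url("http://id.loc.gov/authorities/names/EXAMPLE.html")
```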

Now for the interesting part! If we select a heading in the list of search results, and press 'Import', then something like the following should pop up:

At this point, you may click the 'Add record' button at the bottom to add this record to your RIMMF collection, or 'Discard' to discard it.

Thus, it's quite easy to test the mapping of MARC to RDA by:

  • Searching LC
  • Selecting a heading
  • Pressing Import
  • Viewing the results, and
  • Pressing either 'Add' or 'Discard'

Problems

The example in the screenshot above shows the type of mapping problems we are going to have dealing with subfield $c in Name headings, since in MARC we have:

100 1  $aDewey, Melvil$c(Rapper)

Is “Rapper” a relator or a title? Part of the 'Preferred Name' or not?
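
A small sketch may help show why this is a mapping problem and not a parsing problem. Splitting the tagged field into its subfields is mechanical, but the result says nothing about what $c means. The function name `parse_tagged_field` is hypothetical, and the sketch assumes that '$' never appears inside the data itself.

```python
def parse_tagged_field(line: str):
    """Split a tagged-text MARC field into tag, indicators and subfields."""
    # Everything before the first '$' is the tag and the two indicators.
    head, _, rest = line.partition("$")
    tag = head[:3]
    indicators = head[4:6].ljust(2)
    # Each remaining chunk starts with a one-character subfield code.
    subfields = [(chunk[0], chunk[1:]) for chunk in rest.split("$") if chunk]
    return tag, indicators, subfields


tag, indicators, subfields = parse_tagged_field("100 1  $aDewey, Melvil$c(Rapper)")
# subfields now holds ("a", "Dewey, Melvil") and ("c", "(Rapper)"),
# but nothing here tells us which RDA element "(Rapper)" belongs to.
```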

Other problems that we might run into in 100$c alone:

100 $c: For persons, there are several instances where part of the preferred name element is recorded in MARC 100 $c, such as:

"Jr." or "II", etc. (for a person entered under surname, 9.2.2.9.5)
"Miss" or "Dr.", etc. (for a person known only by a surname, 9.2.2.9.3)
"Mrs.", etc. (for a married person identified only by a partner's name, 9.2.2.9.4)
"the Red", etc. (as in Eric, the Red, for persons known only by a given name, 9.2.2.18)
"of Aquitaine", etc. (as in Eleanor, of Aquitaine, for names of royal persons, 9.2.2.20)

MARC 100$c is also used for other RDA elements, such as:

Title of the person (9.4)
Other designation associated with the person (9.6)
Profession or occupation (9.16)
Place associated with the family (10.5)
Hereditary title (family) (10.7)

Thanks to Dave Reser, LC, for the above lists, which he came up with in a quick response to an email on the topic.
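
To illustrate how the two lists could feed a mapping routine, here is a rough sketch, under our own assumptions, of a lookup that proposes candidate RDA instructions for a 100 $c value. It is a hypothetical heuristic and not how RIMMF maps records. The term table contains only the examples quoted in the lists, and anything it does not recognise is returned with all of the 'other element' candidates so that a person can decide.

```python
# Terms copied from the examples in the first list; a real table would be far larger.
NAME_PART_EXAMPLES = {
    "Jr.": "9.2.2.9.5",
    "II": "9.2.2.9.5",
    "Miss": "9.2.2.9.3",
    "Dr.": "9.2.2.9.3",
    "Mrs.": "9.2.2.9.4",
    "the Red": "9.2.2.18",
    "of Aquitaine": "9.2.2.20",
}

# Elements from the second list, with their RDA instruction numbers.
OTHER_ELEMENTS = [
    ("Title of the person", "9.4"),
    ("Other designation associated with the person", "9.6"),
    ("Profession or occupation", "9.16"),
    ("Place associated with the family", "10.5"),
    ("Hereditary title (family)", "10.7"),
]


def candidates_for_subfield_c(value: str) -> list:
    """Return (element, instruction) candidates for a 100 $c value."""
    # Remove surrounding parentheses and punctuation often found in $c.
    term = value.strip().strip("(),").strip()
    if term in NAME_PART_EXAMPLES:
        return [("Preferred name (part)", NAME_PART_EXAMPLES[term])]
    # Unknown term: every other element remains possible.
    return list(OTHER_ELEMENTS)


known = candidates_for_subfield_c("Jr.")
unknown = candidates_for_subfield_c("(Rapper)")
```

Even this toy version shows the difficulty: 'Jr.' resolves to a single instruction, while '(Rapper)' leaves five candidates open.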

Thus, the example above only scratches the surface of the mapping issues to be resolved. (Refer to this page for specific details on the mapping process itself).

Another problem worth noting at this point is that the current support in RIMMF for VIAF needs some work. You may see what I mean by searching for a well-known name and importing it into RIMMF: the result is likely to be quite an overwhelming list of 'Variant Name for the Person' elements, many of them being duplicates, and so on:

A better implementation would allow the user some way to filter on the authority data's source (e.g. 'DNB', 'BNE', 'BNF', etc.). It also shows that we do not have an equivalent for MARC subfield $2 in RDA (though perhaps that is how 'Source Consulted' was really meant to be used?).
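
As a sketch of what such a filter might look like, the following assumes that each variant name arrives as a (name, source) pair, where the source is a code such as 'DNB'. That data shape, the function name `filter_variants`, and the sample values are all hypothetical; RIMMF does not currently work this way.

```python
def filter_variants(variants, allowed_sources=None):
    """Keep variant names from the allowed sources, dropping duplicates."""
    seen = set()
    kept = []
    for name, source in variants:
        if allowed_sources is not None and source not in allowed_sources:
            continue
        # Compare names ignoring case and runs of whitespace.
        key = " ".join(name.split()).casefold()
        if key in seen:
            continue
        seen.add(key)
        kept.append(name)
    return kept


# Illustrative data only, not taken from a real VIAF cluster.
SAMPLE_VARIANTS = [
    ("Dewey, Melvil", "DNB"),
    ("Dewey, Melvil", "BNF"),
    ("dewey,  melvil", "BNE"),
    ("Dui, Melvil", "LC"),
]

all_unique = filter_variants(SAMPLE_VARIANTS)
dnb_only = filter_variants(SAMPLE_VARIANTS, allowed_sources={"DNB"})
```

Note that the duplicate check here is deliberately crude; deciding when two variant forms are 'the same' is itself a cataloguing question.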

RIMMF, in version 2, still does not support the display of non-Roman scripts (although on this form, the View MARC option should be able to render Unicode text correctly).

1)
'ID' offers an OpenSearch/Suggest protocol, which we use here, and it appears that the results are limited to 10 headings per search in this mode
details/naco.txt · Last modified: 2023/06/07 20:39 by 127.0.0.1