MARC Report 250: RDA Language validation

The main enhancement in version 2.50 is to add support for the validation of RDA strings in non-English languages.

In this version, French, Spanish, and German are supported.

Support for other languages may be added in the future. Because RDA string validation is based on the RDA Registry, the languages that MARC Report can support are limited to those into which RDA has been fully translated in the Registry.

For information about RDA translation, visit this link: http://www.rdaregistry.info/rgAbout/rdaref/translations/

Our goal in this version was to support the three languages listed above as completely as possible, without affecting the vast majority of our users, who work in English. We performed extensive validation tests to ensure that version 2.50 behaves exactly the same as version 2.49 for our English users, and that loading the RDA strings for other languages has no substantial performance impact on the program (such as slowing down batch mode). We believe that unless a user goes into the options and adds support for another language, they should never even be aware that anything in the program has changed.

At the same time we wanted to ensure that:

  1. Updates to language strings could be easily added
  2. New languages could easily be added


Options

The options for RDA language support are very simple.

They are accessed via a button/bar on the RDA page of the options.

After clicking on that button, the following form appears:

Select the language or languages that you want to validate. That's it.

Note that English cannot be selected or deselected. It is the default language, and the English strings are always loaded into memory when the program starts.


How it works

The validation of RDA strings applies to the MARC content designators selected for validation on the preceding page of the options, i.e., the 'Options|RDA' page (the page from which the screenshot above was launched).

In the default setup, this would be the following fields:

100 $e
110 $e, 111 $j
336 $a
337 $a
338 $a
700 $e, 700 $i
710 $e, 710 $i
711 $i, 711 $j
730 $i

As time goes on, this list may change. For example, we hope to add support for validation of English terms in the 34X fields in a future version; when that happens, language support for these fields will also be simultaneously implemented.
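As a rough illustration only, the default list above could be represented as a mapping from tag to subfield codes, with a small helper that picks out the strings to be validated. This is a sketch, not MARC Report's actual code: the names DEFAULT_RDA_TARGETS and strings_to_validate, and the simplified record layout, are assumptions made for the example.

```python
# Hypothetical representation of the default validation targets listed above.
DEFAULT_RDA_TARGETS = {
    "100": {"e"},
    "110": {"e"},
    "111": {"j"},
    "336": {"a"},
    "337": {"a"},
    "338": {"a"},
    "700": {"e", "i"},
    "710": {"e", "i"},
    "711": {"i", "j"},
    "730": {"i"},
}


def strings_to_validate(fields, targets=DEFAULT_RDA_TARGETS):
    """Yield (tag, subfield_code, value) for each subfield selected for validation.

    `fields` is an assumed, simplified record form: a list of
    (tag, [(subfield_code, value), ...]) tuples.
    """
    for tag, subfields in fields:
        wanted = targets.get(tag, set())
        for code, value in subfields:
            if code in wanted:
                yield tag, code, value


# Sample record: only 100 $e and 337 $a are validation targets.
record = [
    ("100", [("a", "Hugo, Victor,"), ("e", "author.")]),
    ("245", [("a", "Les misérables")]),
    ("337", [("a", "unmediated"), ("2", "rdamedia")]),
]
selected = list(strings_to_validate(record))
```

Keeping the targets in a single table means that a change to the list, such as the 34X fields mentioned above, would only touch the table and not the helper.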

Validation of language strings is tied to a record's 'Language of cataloging', as specified in the record's 040 $b.

Thus, if the language of cataloging for a record is 'fre', and if you have selected 'fre' on this options page, then the program will validate the RDA strings from the selected subfields against the corresponding French strings in the RDA registry.

However, if the language of cataloging is 'fre', and you have not selected 'fre' on this options page, the program will validate the RDA strings against the default (English) strings from the registry; and even though the French terms might be valid, they will be flagged as invalid.

When an 040$b contains one of the languages supported by our RDA validation checks, but that language has not been enabled in your options, one of the first error messages you will see in the panel on the right will read:

040: Language validation disabled

as a way of prompting you to check your RDA language options. If this record is intended for use in a catalog that does not use that language, then you may need to replace this record with one that is cataloged in the language of the catalog.

Also note that our language support requires that the Leader/09 be coded with 'a', indicating Unicode. If 000/09 is blank, i.e. indicating MARC-8, we will be unable to validate strings in another language, even if that language is supported and you have selected it as a language validation option. So, if 000/09 is blank, and 040 $b is, e.g., 'fre', and you have selected 'French' as a language validation option, the following error message will display:

040: Language validation not possible

and go on to say that the MARC-8 encoding of the record allows validation against English values only.

If the language of cataloging of a record (040$b) is not one of the language codes currently supported, for example, 'ita', then the following message will appear:

040: Language validation not supported

to let you know that because we do not yet support validation of RDA strings for that language, those strings cannot be validated. If this record is intended for use in a catalog that does not use that language, then you may need to replace this record with one that is cataloged in the language of the catalog.

Note that the reason for this 'lack of support' of a language may be either:

  1. There is not a full translation available from the RDA Toolkit/Registry, or
  2. MARC Report lacks the capability to display diacritics for that language

As mentioned above, English is the default validation language in MARC Report. So in any of the three cases above where an error message regarding the 040 $b is displayed, the program will try to validate the RDA strings in the record against the default English language strings.
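The rules described in this section can be summarized in a short sketch. It is not MARC Report's actual implementation: the function name, the order in which the checks are made, and the treatment of a missing 040 $b as English are all assumptions. The codes 'spa' and 'ger' are the MARC language codes for Spanish and German; only 'fre' and 'ita' are named in the text above.

```python
# Languages with RDA validation support in this version (MARC language codes).
SUPPORTED = {"fre", "spa", "ger"}


def choose_validation_language(leader, lang_of_cataloging, enabled):
    """Return (language_code, message) for a record.

    leader             -- the record Leader as a string
    lang_of_cataloging -- the value of 040 $b, or None if absent (assumed: English)
    enabled            -- set of language codes selected in the options
    """
    lang = (lang_of_cataloging or "eng").strip().lower()
    if lang == "eng":
        return "eng", None
    if lang not in SUPPORTED:
        return "eng", "040: Language validation not supported"
    if lang not in enabled:
        return "eng", "040: Language validation disabled"
    # Leader/09 must be 'a' (Unicode); blank means MARC-8.
    if leader[9:10] != "a":
        return "eng", "040: Language validation not possible"
    return lang, None


UNICODE_LEADER = "00000nam a2200000 i 4500"
MARC8_LEADER = "00000nam  2200000 i 4500"

ok = choose_validation_language(UNICODE_LEADER, "fre", {"fre"})
disabled = choose_validation_language(UNICODE_LEADER, "fre", set())
marc8 = choose_validation_language(MARC8_LEADER, "fre", {"fre"})
unsupported = choose_validation_language(UNICODE_LEADER, "ita", {"fre"})
```

In every case where a message is returned, the language falls back to 'eng', matching the default behavior described above.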

To further illustrate, and to emphasize the importance of the 040 $b, let's consider the following scenario: you have selected French as a validation language option, and you import a record with 040 $b set to 'eng'; you then clone that record to create a new record, and change all of the RDA terms in the new record from English to French, but do not change 040 $b to 'fre'. MARC Report will see that the language of cataloging is still coded 'eng', will validate all of your French terms against the English RDA term list, and will report an 'Invalid' error message for every one of the tags/subfields in the record that are validated as RDA strings. But if you change the 040 $b to match the language of the strings (i.e., 'fre'), everything will validate as expected.
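The cloned-record scenario can be mimicked with a toy example. The term lists here are tiny illustrative samples chosen for the example, not the Registry data, and the names TERMS and invalid_terms are hypothetical.

```python
# Tiny illustrative term lists; the real lists come from the RDA Registry.
TERMS = {
    "eng": {"author", "unmediated"},
    "fre": {"auteur", "sans médiation"},
}


def invalid_terms(lang_of_cataloging, enabled, terms_in_record):
    """Return the terms that fail validation for the given 040 $b value."""
    # Fall back to English unless the language of cataloging is enabled.
    lang = lang_of_cataloging if lang_of_cataloging in enabled else "eng"
    valid = TERMS.get(lang, TERMS["eng"])
    return [term for term in terms_in_record if term not in valid]


enabled = {"fre"}
french_terms = ["auteur", "sans médiation"]

# 040 $b left as 'eng': every French term is reported as invalid.
still_eng = invalid_terms("eng", enabled, french_terms)

# 040 $b changed to 'fre': the same terms validate cleanly.
fixed = invalid_terms("fre", enabled, french_terms)
```

The terms themselves never change between the two calls; only the 040 $b value decides which list they are checked against.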


Function keys

The behavior of the function keys in MARC Report is also influenced by the 040 $b language of cataloging code. If the language of cataloging is selected in the RDA language support option, and one of the function keys that activates a list is pressed, the list will display strings from the language of cataloging (where we have them). However, in cases where there is a problem, such as when one of the three 040 $b validation error messages described above is shown, the list will be displayed in the default language: English.

In the following examples, assume 'fre' is selected in the RDA Language options, and that the 040 $b of the record in question is also set to 'fre'.

F1: MARC Help

If the cursor is on the 100 field and F1 is pressed, a list like the following will display in the $e section:

Everything else on this page (most of which is collapsed) will be in the default language, English.

Note the language of the definitions in the center of the display is also French. These are 'official' definitions from the French translation of RDA. Their availability and the ease with which we are able to access them is thanks to the open linked data design of the RDA Registry.

F7: Display codelist

If the cursor is sitting on 700 $e, and F7 is pressed, a list like the following will display:

Just as with the default English, the French term displays on the left, and the complete RDA definition in French on the right.

336/337/338 Automatic dropdown lists

Clicking on any of the three RDA fields will pop up the appropriate dropdown list in the language of cataloging:

This example illustrates the importance of the 040 $b language of cataloging. Even if the 337 (in this case) is already filled in with what would be a valid value in English, the dropdown still shows the French terms. Note that although it's not evident in the above screenshot, the program has here flagged the existing English value in Tag 337 subfield $a

unmediated

as Invalid.

F10: RDA Automation

Similarly, when the user presses F10 (the command to automatically fill the 336, 337, and 338 fields, according to the preferences set in the RDA Automation section of the options), the resulting fields will again match the language of cataloging, if applicable.

There is one caveat in this case (see the first limitation below). Some of the automation code for generating Carrier Type still relies on detecting English strings in the 300 field, so double-check the 338 $a until this limitation is addressed.


Limitations

The following issues regarding language support might be resolved in future releases, but exist in version 2.50:

First, many of our cataloging checks are based on looking for expected string values in a given field, and all of these expected values are in English (except for the specifically RDA strings that are found in the list of fields above).

RDA language validation is not yet supported for hybrid records; we will add that support in MARC Report 2.51.

As mentioned above, RDA language validation does not support strings encoded as MARC-8; there is no plan to add MARC-8 support.

The “Relationship editors” (for RDA Appendix I and RDA Appendix J, available via the RDA page of the program Options) that can be used to customize the validation of RDA terms have not yet been updated to support non-English strings. We will try to add that in a 2018 version. In the interim, please contact us if you want to customize a language file; it's possible, perhaps even easy, using something like Excel.

Finally, a note about the display of diacritics. Diacritics display correctly in all of the supported languages (and in the screenshots above). This display is based on ISO-8859. However, once a selection from a list is made and the corresponding string is added to the record, it will be converted to UTF-8; at that point the diacritics will display in the typically strange manner that MARC Report uses for this encoding. To verify that the string is correct, the user should flip to XML view, which displays all diacritics, as well as non-Latin scripts, correctly.
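For readers unfamiliar with the two encodings, the sketch below shows how the same French term differs in ISO-8859-1 and UTF-8, and what UTF-8 bytes look like when read as ISO-8859-1. This is a general illustration of the encodings and an assumption about why such a display can look strange; it is not a description of how MARC Report renders UTF-8 internally. The sample term is chosen for the example.

```python
term = "sans médiation"

# The accented letter takes one byte in ISO-8859-1 but two bytes in UTF-8.
latin1_bytes = term.encode("iso-8859-1")
utf8_bytes = term.encode("utf-8")

# Reading the UTF-8 bytes as if they were ISO-8859-1 splits the accented
# letter into two unrelated characters.
misread = utf8_bytes.decode("iso-8859-1")

# The underlying data is intact: reversing the misreading recovers the term.
recovered = misread.encode("iso-8859-1").decode("utf-8")

print(len(latin1_bytes), len(utf8_bytes))
print(misread)
print(recovered == term)
```

The point of the last line is that an odd-looking display does not by itself mean the stored string is wrong, which is why checking in XML view is recommended.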

250/major_changes.txt · Last modified: 2017/12/28 16:10 by richard
CC Attribution-Noncommercial-Share Alike 3.0 Unported