voyager-help@lib.cam.ac.uk
http://www.lib.cam.ac.uk/libraries/
Foreign Language Cataloguing Workshop
26/27 November 2009
Rachel Marsh
rem49@cam.ac.uk
Table of Contents
1. Pay and Display
1.1 Using appropriate fonts and coding
2. Taking the ‘dire’ out of diacritic?
2.1 Special Character Entry
2.2 Special Character Mode
3. MARC my words
3.1 880
3.2 066
3.3 041
4. Red sauce or brown sauce?
Sourcing Foreign language records
5. Foreign Keyboards/Adding Languages to the PC
1. Pay and Display
For diacritics and special characters to display successfully in Voyager and
your web browser the font preferences of these applications must be set
correctly. It is important to check all your preferences before starting work.
1.1 Using appropriate fonts and coding
Voyager Cataloguing client
The font must be set to ‘Arial Unicode MS’
To do this:
· Go to Options then Preferences
· The ‘Sessions defaults and Preferences’ window will open
· Go to the Colors/Fonts tab
· Select ‘Arial Unicode MS’ from the drop-down font menu
· Click OK
Browsers
Browsers should also be set to use Arial Unicode MS.
To set the font in Internet Explorer:
· From the Tools pull-down menu select Internet Options
· On the General tab click on Fonts under ‘Appearance’
· Select Latin based from the Language script menu
· Then select Arial Unicode MS from the Webpage font menu
· Click OK then OK again
To set the font in Firefox:
· From the Tools pull-down menu select Options
· In the Content tab under Fonts & Colors choose Arial Unicode MS as the Default font
· Click on Advanced…
· Select Western in the Fonts for menu
· Select Arial Unicode MS from the Sans-serif menu
You must also ensure that the character encoding for the page you are looking at is set to
Unicode (UTF-8). To check this:
· In Internet Explorer from the View pull-down menu, select Encoding and
check that the black dot is next to Unicode (UTF-8)
· In Firefox from the View pull-down menu, select Character Encoding and
check that the black dot is next to Unicode (UTF-8)
The above information is also available in the Newton Help Pages ‘displaying Unicode
characters’:
http://ul-newton.lib.cam.ac.uk/vwebv/ui/en_US/htdocs/help/unicode.htm
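Why the encoding setting matters can be seen in a minimal Python sketch. Python is used here purely for illustration and is not part of Voyager or the browsers; the word "rêve" is simply a sample. The same stored bytes give different text depending on the encoding used to read them:

```python
# Illustration only: one accented word read with the right and the wrong encoding.
text = "rêve"

utf8_bytes = text.encode("utf-8")        # ê is stored as two bytes in UTF-8
right = utf8_bytes.decode("utf-8")       # page read as Unicode (UTF-8)
wrong = utf8_bytes.decode("latin-1")     # page read as a Latin-1 encoding

print(len(text), "characters,", len(utf8_bytes), "bytes")
print("read as UTF-8:  ", right)
print("read as Latin-1:", wrong)
```

If accented letters in the OPAC appear as pairs of odd characters, a wrong encoding setting of this kind is one likely cause.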
Exercise 1
Check that the font preferences in Voyager Cataloguing, Internet Explorer and Firefox are
all set correctly on your PC.
2. Taking the ‘dire’ out of diacritic?
Inclusion in catalogue entries of characters outside the standard alphanumeric
set (letters A-Z in upper- and lower-case, numerals, and common marks of
punctuation) requires special procedures. Such characters include both
diacritics (marks attached to letters to indicate modified sound or value) and
special characters (pound sign, dagger etc.).
How do I enter diacritics and special characters in Voyager
Cataloguing?
2.1 Special Character Entry
To add a diacritic, place the cursor in the space after the character that requires it. From the
Edit menu select Special Character Entry (or use the shortcut Ctrl+E). A window will open
listing, in alphabetical order of name, all the diacritic characters that Voyager allows to be
entered. Click on the diacritic to be entered, then click either Insert to insert the character and
keep the window open, or Insert/Close to insert the character and return to the record. To
add a special character follow exactly the same steps, but place the cursor where you would
like the special character to appear.
Please note that the entries in the list of available diacritics appear in alphabetical order
of names that were agreed locally, chosen to give a sensible and useful output order. These
names are not the official MARC 21 names used for those characters in the MARC documentation:
http://www.loc.gov/marc/specifications/specchartables.html
Bug. Sometimes the ‘special character entry’ option in the edit menu appears greyed
out and cannot be accessed. If this happens try clicking in the main record window and then
going back to the edit menu. If that doesn’t work, try closing and re-opening the record. This is
an intermittent bug and unfortunately has no pattern to it.
2.2 Special Character Mode
From the Edit menu select Special Character Mode or use the shortcut Ctrl+D. This
changes your keyboard layout. The keys on the keyboard now produce diacritic characters
rather than the normal characters. ‘Special Character’ will appear in the bottom right hand
corner of the Voyager window to show that you are in Special Character Mode.
Information about which key now produces which diacritic can be found in two places:
1. in the first column in the special character entry window under key press
2. in the cataloguing documentation section of the Libraries@Cambridge website ‘list of
diacritics/ special characters by character’
http://www.lib.cam.ac.uk/libraries/login/documentation/diacritics_char.htm
Example
To write an e acute ‘é’ the key press or input symbol is b.
Place the cursor to the right of the e to which you wish to add the acute. Turn on the Special
Character Mode and press b on the keyboard.
Note that you cannot delete characters while in Special Character Mode (SCM).
To deactivate Special Character Mode and return the keyboard to normal, select Special
Character Mode from the Edit menu again, or use shortcut Ctrl+D.
Adding multiple diacritics
Multiple diacritics associated with a single character should be handled as follows:
· Enter diacritics from the letter outward: 1 Letter 2 Diacritic Nearer 3 Diacritic Farther
· Enter a letter with diacritics above and below in this order: 1 Letter 2 Diacritic Below 3 Diacritic Above
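The ordering rule can be illustrated outside Voyager with a minimal Python sketch using the standard unicodedata module. The letter and the two diacritics are chosen only as an example, and nothing here describes how Voyager itself stores or normalises text. It shows a letter followed by a diacritic below and then a diacritic above, which is also the order Unicode treats as canonical:

```python
import unicodedata

DOT_BELOW = "\u0323"     # COMBINING DOT BELOW
CIRCUMFLEX = "\u0302"    # COMBINING CIRCUMFLEX ACCENT

# 1 Letter, 2 Diacritic Below, 3 Diacritic Above
typed = "a" + DOT_BELOW + CIRCUMFLEX

for ch in typed:
    # combining() is 0 for the base letter and non-zero for a combining mark
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.combining(ch))

# NFC normalisation folds the three characters into one precomposed character
composed = unicodedata.normalize("NFC", typed)
print(len(typed), "characters typed,", len(composed), "after composition")
```

The mark below has a lower combining class than the mark above, which is why "below before above" is the stable order.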
Composition of the diacritic and character
Diacritics and special characters are always one of two types: "spacing" (usually special
characters) and "non-spacing" (usually diacritics). Spacing characters occupy their own
space when printed or displayed on a screen and non-spacing characters do not.
Example
The ñ shown above is composed of an n plus a non-spacing diacritic tilde. There are two
characters but they only occupy one space. To delete both characters you will need to
press Backspace twice.
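The two-character make-up of ñ can be checked with a short Python sketch (illustration only, using the standard unicodedata module). The helper name is_non_spacing is invented for this example; it relies on the Unicode category "Mn" (non-spacing mark), which corresponds to the "non-spacing" type described above:

```python
import unicodedata

precomposed = "\u00f1"                                   # ñ as a single character
decomposed = unicodedata.normalize("NFD", precomposed)   # n + combining tilde

def is_non_spacing(ch: str) -> bool:
    """True for characters in Unicode category Mn (non-spacing mark)."""
    return unicodedata.category(ch) == "Mn"

for ch in decomposed:
    kind = "non-spacing" if is_non_spacing(ch) else "spacing"
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), kind)

print(len(decomposed), "characters occupying one space on screen")
```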
*Steer clear of Alt Codes*
These work in Windows applications but could cause display problems in Voyager.
*Finally, don’t forget copy and paste*
Simply copying diacritics from other Voyager records is as good a method as any!
Exercise 2
In Voyager Cataloguing create a new record with a 100 author field and a 245 title field with
the following information:
Author: François Mitterrand (diacritic = cedilla)
Title: Mon rêve, ou la Bibliothèque nationale de France (diacritics = circumflex and grave)
Experiment with both Special Character Entry and Special Character Mode to enter the
diacritics. Please do NOT save the record to the database, but keep the record open.
3. MARC my words
The following MARC fields are ones to look out for when cataloguing a foreign
language item.
3.1 880
In the current version of Voyager (7.0.2 with Unicode) non-roman text (Japanese, Arabic,
Chinese, Korean, Persian, Hebrew, Yiddish etc.) can be input, edited and displayed in any
field in bibliographic, holding, or authority records using a standard keyboard. 880 fields are
used in the bibliographic record to display the non-roman text.
MARC21 standard definition of the 880 field
“Fully content-designated representation, in a different script, of another
field in the same record. Field 880 is linked to the associated regular
field by subfield $6 (linkage). A subfield $6 in the associated field also
links that field to the 880 field. The data in field 880 may be in more than
one script.”
In other words, an 880 field contains non-roman script and is linked via a ‡6 subfield to the
corresponding primary MARC field that contains the roman transliteration of that script. The
two fields form a kind of ‘couplet’ within a record. IMPORTANT: The ‡6 subfields in the
primary field and the 880 must be coded correctly and link up for the record to display
correctly in the OPAC.
Example:
The following record represents an item that has its author, title and publication data in
Arabic.
The primary MARC fields (100, 245, 260) contain the Roman transliteration of the Arabic.
The actual Arabic script is entered into the 880 fields.
The three 880 fields and the three primary MARC fields form couplets thus:
The 100 field / 1st 880 field couplet
The 100 field contains the Roman transliteration of the Arabic author name. It starts with a ‡6 subfield
linking it to the first 880 field in the record:
‡6 880-01.
‘880-01’ meaning the first 880 field in this record.
The first 880 field links back to the 100 field with a similar ‡6:
‡6 100-01
‘100-01’ = 100 meaning linked to the 100 field, ‘01’ referring again to the fact that it is the first 880 field
in this record.
The 245 field / 2nd 880 field couplet
The 245 field contains the Roman transliteration of the Arabic title. It starts with a ‡6 subfield linking it to
the second 880 field in the record:
‡6 880-02.
‘880-02’ meaning the second 880 field in this record.
The second 880 field links back to the 245 field with a similar ‡6:
‡6 245-02
‘245-02’ = 245 meaning linked to the 245 field, ‘02’ referring again to the fact that it is the second 880
field in this record.
The 260 field / 3rd 880 field couplet
The 260 field contains the Roman transliteration of the Arabic publishing details. It starts with a ‡6
subfield linking it to the third 880 field in the record:
‡6 880-03.
‘880-03’ meaning the third 880 field in this record.
The third 880 field links back to the 260 field with a similar ‡6:
‡6 260-03
‘260-03’ = 260 meaning linked to the 260 field, ‘03’ referring again to the fact that it is the third 880 field
in this record.
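The couplet rule can be sketched as a small check in Python. This is an illustration only and not a Voyager feature: the function name unmatched_links and the way a record is represented (a list of tag and ‡6 value pairs) are assumptions made for the example. It reports any field whose ‡6 has no partner pointing back at it with the same two-digit number:

```python
import re
from typing import List, Tuple

# Each field is (tag, value of its ‡6 subfield), as in the couplets above.
Field = Tuple[str, str]

# Linked tag, hyphen, two-digit occurrence number at the start of the ‡6 value
LINK = re.compile(r"^(\d{3})-(\d{2})")

def unmatched_links(fields: List[Field]) -> List[Field]:
    """Return the fields whose ‡6 has no partner linking back to them."""
    links = set()
    for tag, sub6 in fields:
        m = LINK.match(sub6)
        if m:
            links.add((tag, m.group(1), m.group(2)))   # (own tag, linked tag, number)
    problems = []
    for tag, sub6 in fields:
        m = LINK.match(sub6)
        # The partner must carry the reverse link with the same number
        if not m or (m.group(1), tag, m.group(2)) not in links:
            problems.append((tag, sub6))
    return problems

record = [
    ("100", "880-01"), ("245", "880-02"), ("260", "880-03"),
    ("880", "100-01"), ("880", "245-02"), ("880", "260-03"),
]
print(unmatched_links(record))   # an empty list means every couplet links up
```

If the second 880 were miscoded as 245-03, both it and the 245 field would be reported, since neither would then have a matching partner.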
Additional Codes In the 880 ‡6
Script Identification Code
Records downloaded from other databases may contain codes such as “(3”, “(N”, “(2” or “$1” in the 880
‡6 after the linking information. This is the Script Identification Code and was required in pre-Unicode
records to identify the script in the 880 field. It is NOT required in Unicode records. If it is present in a
downloaded record do not delete it, but you do not need to supply it for records you create yourself.
Orientation Code
If the script in the 880 field is one that is written right to left like Hebrew or Arabic, the ‡6 subfield must
also contain “/r”. For example, if the 880 field contained text in the Arabic script, the ‡6 might look as
follows
‡6 260-03/(3/r
This is known as the Orientation Code and must be supplied for such scripts for them to index
correctly. You do NOT need to put the orientation code into the linking primary field (100, 245, 260, etc.),
as the information in that field is romanised so is written left to right.
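The parts of a ‡6 value such as the one in the example above can be pulled apart with a minimal Python sketch. The names Linkage and parse_subfield_6 are invented for this illustration. It assumes the parts are separated by "/" in the order linking tag and number, Script Identification Code, Orientation Code, as in the example; the handout does not show how the Orientation Code is written when no Script Identification Code is present, so that case is not covered beyond an empty slot between two slashes:

```python
from typing import NamedTuple, Optional

class Linkage(NamedTuple):
    tag: str                     # linked field, e.g. "260"
    occurrence: str              # two-digit number, e.g. "03"
    script: Optional[str]        # Script Identification Code, e.g. "(3"
    orientation: Optional[str]   # "r" for scripts written right to left

def parse_subfield_6(value: str) -> Linkage:
    """Split a ‡6 value into linking tag, number, script code and orientation."""
    parts = value.split("/")
    tag, _, occurrence = parts[0].partition("-")
    script = parts[1] if len(parts) > 1 and parts[1] else None
    orientation = parts[2] if len(parts) > 2 and parts[2] else None
    return Linkage(tag, occurrence, script, orientation)

print(parse_subfield_6("260-03/(3/r"))
print(parse_subfield_6("880-03"))
```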
First and Second Indicators in the 880 field
Use the appropriate indicators as available in the associated field. Indicators in field 880 have the
same meaning and values as the indicators in the associated field.
Subfield Codes
‡6 linkage (Not repeatable)
‡a-z same as associated field
‡0-5, 7-9 same as associated field
3.2 066
Existing MARC regulations state that if a record contains any characters in a character set other than
the default MARC Latin sets, then the record must have an 066 field with the Script Identification Code
for the script in a ‡c subfield. The Script Identification Codes are as follows:
Script                       Code
Arabic                       (3
Chinese, Japanese, Korean    $1
Cyrillic                     (N
Hebrew                       (2
Greek                        (S
For example if the record contained text in the Cyrillic script, the 066 should be coded as follows:
066 ## ‡c(N
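As a rough aid, the codes needed in the 066 could be suggested from the text of a record with a Python sketch like the one below. It is illustration only: the function name codes_for_066 is invented, and identifying a script from the first word of each character's Unicode name is a simplifying assumption of this sketch, not a MARC or Voyager rule. The codes themselves are those listed in the table above:

```python
import unicodedata
from typing import List

# Script Identification Codes from the table above, keyed by the first word
# of the Unicode character name (an assumption made for this sketch).
SCRIPT_CODES = {
    "ARABIC": "(3",
    "CJK": "$1",
    "HIRAGANA": "$1",
    "KATAKANA": "$1",
    "HANGUL": "$1",
    "CYRILLIC": "(N",
    "HEBREW": "(2",
    "GREEK": "(S",
}

def codes_for_066(text: str) -> List[str]:
    """Return the codes suggested by the text, in order of first appearance."""
    found: List[str] = []
    for ch in text:
        first_word = unicodedata.name(ch, "").split(" ")[0]
        code = SCRIPT_CODES.get(first_word)
        if code and code not in found:
            found.append(code)
    return found

print(codes_for_066("Война и мир"))
print(codes_for_066("War and peace"))
```

Text in the Latin script alone gives an empty list, which matches the rule that the 066 is only needed for characters outside the default Latin sets.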
3.3 041 Language Code (R)
Codes for languages associated with an item when the language code in field
008/35-37 of the record is insufficient to convey full information. Includes
records for multilingual items and items that involve translation.
Sources of the codes are: MARC Code List for Languages
http://www.loc.gov/marc/languages/
Indicators
First - Translation indication
# - No information provided
0 - Item not a translation/does not include a translation
1 - Item is or includes a translation
Second - Source of code
# - MARC language code
7 - Source specified in subfield $2
Most useful Subfield Codes
$a - Language code of text/sound track or separate title (R)
Language code in the first occurrence of subfield $a is also recorded in
008/35-37 (Language) unless 008/35-37 contains blanks (###) or the code
"zxx" (No linguistic content).
$b - Language code of summary or abstract (R)
$h - Language code of original and/or intermediate translations of text (R)
Language code(s) for intermediate translations; codes precede those for
original languages.
Examples
041 ##$aeng$afre$aswe
A multilingual item in English, French and Swedish.
041 1#$aeng$hrus
An item in English translated from the original Russian OR an item in English
that includes a Russian translation somewhere in the text.
041 1#$aeng$hger$hswe
An item in English that has been translated from German, which was itself
translated from the original Swedish, OR an item in English that
has German and Swedish translations somewhere in the text!
041 0#$aeng$bfre$bger$bspa
An item in English with summaries and/or abstracts in French, German and
Spanish.
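The layout of these examples (two indicators, then repeatable subfields) can be made explicit with a minimal Python sketch. It is illustration only; the function name parse_041 is invented, and the input is assumed to be the text that follows the tag "041" written as in the examples above, with "#" standing for a blank indicator:

```python
from typing import Dict, List, Tuple

def parse_041(field: str) -> Tuple[str, str, Dict[str, List[str]]]:
    """Split text such as '1#$aeng$hrus' into indicators and subfields."""
    indicators, _, rest = field.partition("$")
    ind1, ind2 = indicators[0], indicators[1]
    subfields: Dict[str, List[str]] = {}
    for chunk in rest.split("$"):
        if chunk:
            # First character is the subfield code, the remainder the language code
            subfields.setdefault(chunk[0], []).append(chunk[1:].strip())
    return ind1, ind2, subfields

ind1, ind2, subs = parse_041("0#$aeng$bfre$bger$bspa")
print("translation indicator:", ind1)
print("language of text:", subs.get("a", []))
print("language of summaries:", subs.get("b", []))
```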
* Don’t get confused with 040*
Cataloguing Source (NR)
040 $b = language of cataloguing
The MARC code for the language of cataloguing in the record should be entered here, e.g. eng for
English. This is not the language of the item being catalogued.
Useful resources
· MARC standards Appendix D – Multi-script records. Contains full record examples.
· MARC standards - 880 Alternate Graphic Representation (R)
· MARC standards – 041 and 040
http://www.loc.gov/marc/bibliographic/ecbdhome.html
· Libraries@Cambridge website /documentation/Cataloguing/Cataloguing using non-roman scripts:
http://www.lib.cam.ac.uk/libraries/login/documentation/Unicode_non_roman_cataloguing_handout.pdf
Exercise 3
Go back to the record you created in Exercise 2. Add a 041 field. Note that Mitterrand's book
contains a summary in English but no translations.
4. Red sauce or brown sauce?
Sourcing reliable records for foreign language items can sometimes be difficult. There is
unfortunately no rule of thumb.
Foreign language cataloguers at the UL go to the usual sources first: OCLC, LC, RLUK. If
these agencies don't have a good record then they try the national library for the relevant
country. These national library records can give some help with subject headings even if they
do not use AACR2 or MARC21. It is also a good idea to look for a translation in English of the
item you are cataloguing. This sometimes comes up trumps.
Some recommended resources:
Greek books in print
http://www.biblionet.gr - but this does require a knowledge of Greek and of
course doesn't supply a bibliographic record.
NACSIS-CAT is the largest collection of cataloguing records for Japanese monographs and
serials and is sometimes useful for cataloguing. You can see the
cataloguing records of NACSIS-CAT on the following sites:
http://webcat.nii.ac.jp/webcat_eng.html
http://webcat.nii.ac.jp/
http://webcatplus.nii.ac.jp/en/
http://webcatplus.nii.ac.jp/
Sources for Chinese records (note that these are not MARC 21):
http://webcat.nii.ac.jp/webcat_eng.html
http://162.105.138.200/uhtbin/cgisirsi/Q0lb16NUNN/0/0/49
http://www.nlc.gov.cn/
http://nbinet.ncl.edu.tw/search
5. Foreign keyboards / adding languages to the PC
If you want to type large amounts of text in a foreign script it may be easier to add foreign language
keyboard options to your PC. Foreign language keyboards can also be used to enter the diacritics of a
language. Please note that you must have administrator rights on your machine to do this.
5.1 How to add languages on Windows XP
1. From the Start menu select Settings then Control Panel
2. Double click on Regional and Language Options
3. On the window that opens click on the Languages tab
4. Click on the Details button
5. The ‘Text Services and Input Languages’ window will open. The ‘Installed services’
box will list the languages and accompanying keyboards that are already installed on the machine.
6. To add a language click on the Add button
7. A new window will open
8. Click on the down arrow to the right of the “Input language” box. A drop-down list of all the
languages it is possible to install will appear. Select the language required then click OK.
9. If the language does not appear on the list, it will need to be installed from the Windows
XP CD – please consult your Computing Officer for details on how to do this.
10. For some languages there are several possible keyboard layouts/IMEs (Input Method
Editors). In nearly all cases the one the system initially offers as default is the one that is
generally used. If at a later date this proves unsuitable, you will need to add the
language again but select a different keyboard layout from the list.
11. To add another language click on the Add button and repeat the process.
12. When all the required languages have been added, click Apply.
13. In the “Text services and Input Languages” window click on the Language bar button.
Ensure that the ‘Show the language bar on the desktop’ and ‘Show additional language bar
icons in the taskbar’ boxes are ticked.
14. The language bar sits on the taskbar in the bottom right-hand corner of the screen and
allows you to move quickly and easily between languages, and also to see which language you
are currently using.
5.2 Changing the language once added
In order to enter non-English or non-Roman characters, the keyboard on the PC must be changed from
one that produces English characters when the keys are pressed to one that produces characters for
the specified language or script when keys are pressed.
The language setting is specific to the program that is open when the language is changed. Therefore
the first thing you need to do is click, in the Voyager client or OPAC window, where you want the
non-English characters to appear.
1. Click on the two-letter keyboard language indicator on the taskbar.
2. A list of the languages that have been installed on the PC will be displayed.
3. Click on the language you want to switch the keyboard to. The keyboard language
indicator will change to the code for the language you have selected.
4. The keyboard has now been changed from an English keyboard to one as it would appear
for the language that has been selected, and users can input characters from that language as
search criteria using the keyboard.
5. If the language chosen is one that is written right to left (such as Arabic and Hebrew),
characters will appear on the screen in this manner (the first character typed in a string will be
the first character from the right, and the last character typed will be first from the left).
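Whether a piece of text belongs to a right-to-left script can be checked with a short Python sketch (illustration only, using the standard unicodedata module). The function name is_right_to_left is invented for this example, and looking only at the first strongly directional character is a simplification. The text itself is still stored in the order typed; it is only the display that runs from the right:

```python
import unicodedata

def is_right_to_left(text: str) -> bool:
    """True if the first strongly directional character is right-to-left."""
    for ch in text:
        direction = unicodedata.bidirectional(ch)
        if direction in ("R", "AL"):    # Hebrew-type or Arabic-type letter
            return True
        if direction == "L":            # left-to-right letter, e.g. Latin
            return False
    return False                        # no directional letters found

for sample in ("שלום", "كتاب", "Cambridge"):
    print(sample, is_right_to_left(sample))
```

A check of this kind also indicates when the Orientation Code described in section 3.1 is likely to be needed in an 880 field.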
5.3 Using the on-screen keyboard
When the language on a PC is changed, the keyboard changes from an English keyboard to one as it
would appear for the language that has been selected, and users can input characters from that
language. As a result, the characters that display when the keys are pressed will no longer correspond
with the symbols on the keys on the keyboard. The on-screen keyboard displays the keyboard for the
language selected on the screen and allows the user to use it as a map to see what keys they need to
press to produce the required characters, or to click on the key on the on-screen keyboard to input that
character. This should be already installed on the PC, but if not consult your Computing Officer.
To access the on-screen keyboard, from the Start menu select Programs, then Accessories, then
Accessibility, and then On-Screen Keyboard.
The on-screen keyboard will appear on the screen.
Initially the keyboard displays in English, and the language indicator in the bottom right hand corner of
the screen changes to “EN”.
Return to the Voyager client or OPAC browser window. If you have already changed the language here
check that the language indicator in the taskbar has changed back to the language you selected earlier.
If it hasn’t, click in the Voyager client/browser window. If you now move the mouse pointer over the
on-screen keyboard the keys will change to the ones for the language selected. If you haven’t already
changed the language, do so then move the mouse pointer over the on-screen keyboard and again the
keys will change to the ones for the language selected.
** For instructions on how to add languages and keyboards to Windows 2000, and for full screenshots of
all the steps above, please see the libraries@cambridge documentation **
http://www.lib.cam.ac.uk/libraries/login/documentation/Unicode_Adding_Languages_to_PC.pdf
Appendices
Voyager shortcuts
Ctrl + E    opens Special Character Entry window
Ctrl + D    activates or deactivates Special Character Mode
Useful web resources
1. Pay and Display
Newton Help Pages ‘displaying Unicode characters’
http://ul-newton.lib.cam.ac.uk/vwebv/ui/en_US/htdocs/help/unicode.htm
2. Taking the ‘dire’ out of diacritic?
lib@cam: list of diacritics/ special characters by character and key press
http://www.lib.cam.ac.uk/libraries/login/documentation/diacritics_char.htm
MARC standards: character sets
http://www.loc.gov/marc/specifications/specchartables.html
3. MARC my words
Marc standards Appendix D – Multi-script records. Contains full record examples
http://www.loc.gov/marc/bibliographic/ecbdmulti.html
Marc standards - 880 Alternate Graphic Representation (R)
http://www.loc.gov/marc/bibliographic/bd880.html
lib@cam: Cataloguing using non-roman scripts
http://www.lib.cam.ac.uk/libraries/login/documentation/Unicode_non_roman_cataloguing_handout.pdf
MARC Code List for Languages
http://www.loc.gov/marc/languages/
4. Red sauce or brown sauce?
Greek books in print
http://www.biblionet.gr
NACSIS-CAT (Japanese)
http://webcat.nii.ac.jp/webcat_eng.html
http://webcat.nii.ac.jp/
http://webcatplus.nii.ac.jp/en/
http://webcatplus.nii.ac.jp/
Sources for Chinese records:
http://webcat.nii.ac.jp/webcat_eng.html
http://162.105.138.200/uhtbin/cgisirsi/Q0lb16NUNN/0/0/49
http://www.nlc.gov.cn/
http://nbinet.ncl.edu.tw/search
5. Foreign keyboards/adding languages to the PC
lib@cam: Adding languages to a PC
http://www.lib.cam.ac.uk/libraries/login/documentation/Unicode_Adding_Languages_to_PC.pdf
Extras
lib@cam: East European diacritics: potential pitfalls
http://www.lib.cam.ac.uk/libraries/login/documentation/diacritics_euro.htm
Wikipedia article on Unicode
http://en.wikipedia.org/wiki/Unicode
Unicode home page
UL Near and Middle Eastern Department Services
http://www.lib.cam.ac.uk/deptserv/neareastern/services.html - Near Eastern services
LC Romanization tables
http://www.loc.gov/catdir/cpso/roman.html