LING 001 Introduction to Linguistics
Spring 2010
Writing systems II
Chinese writing system
Reading
Mar. 31
Origins of Chinese characters
• Legend has it that Cangjie, a historian-official who lived in the time of
Huangdi (the Yellow Emperor), created Chinese characters, inspired by
such natural objects as the sun, the moon, the stars, and the footprints
of animals and birds.
• Historical records and archeological finds suggest that Cangjie may, in
fact, have been the first person to study and index Chinese characters.
Secrets discovered in Medicine
• In 1899, the Chinese scholar Wang Yirong discovered some peculiar
symbols on “dragon bones”, an ingredient of traditional Chinese medicine.
• The symbols were Jiaguwen, or script written on oracle bones (tortoise
shells and animal bones), and these particular bones were found to
date back 3,000 years.
Evolution of Chinese characters
Oracle bone script
Bronze script
Small seal script
Clerical script
Standard script
Grass script
Running script
Simplified script
Structure of Chinese characters
• Pictograms: 日 ‘sun, day’, 人 ‘person’
• Indicatives: 上 ‘up’, 下 ‘down’, 凹 ‘concave’, 凸 ‘convex’
• Semantic-semantic compounds: 休 ‘rest’ = 人 + 木 (person + tree)
• Semantic-phonetic compounds: 蝗 (虫 + 皇) ‘locust’ = ‘INSECT + huang2’
Structure of Chinese characters
• Most semantic-phonetic compounds (also called phonetic compounds)
have a left-right structure, with the semantic radical on the left
and the phonetic radical on the right.
Chinese Rebus: Phonetic Loans
English rebus: I can see you
• 來 lái = wheat
• lái = come
• 來 = wheat/come
• 麥 = wheat
• 來 = come
Chinese characters
Type                 1400-1100 BC   200 AD        18th century   7,000 frequent characters
Pictogram            227 (23%)      364 (4%)
Indicative           20 (2%)        125 (1%)      1500 (3%)*     19%*
Semantic-semantic    396 (41%)      1167 (13%)
Semantic-phonetic    334 (34%)      7697 (82%)    47141 (97%)    81%
* Single figure for the three non-phonetic types combined.
Note: The 3,500 most frequently used characters in Modern Chinese would cover 99.48%
of a 2 million character corpus, where the first 2,500 characters accounted for 97.97%.
Computer coding of characters
• Decimal (base 10): 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
e.g., 107 = 1*10^2 + 0*10^1 + 7*10^0
• Binary (base 2): 0, 1
e.g., 1101011 = 1*2^6 + 1*2^5 + 1*2^3 + 1*2^1 + 1*2^0 = 107
• Bit: the basic unit of information in a computer; it represents either
1 or 0 (on or off).
• Byte: a sequence of eight bits. It is used as a fundamental unit
in modern computers. Also called an octet.
• Characters are represented by binary numbers in a computer.
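The place-value arithmetic above can be checked with a short Python sketch. The helper name `from_digits` is made up for illustration; Python's built-in `int(text, base)` and `format(number, "b")` do the same conversions.

```python
def from_digits(digits: str, base: int) -> int:
    """Evaluate a digit string by place value: sum of digit * base**position."""
    total = 0
    # The rightmost digit has position 0, the next one position 1, and so on.
    for position, digit in enumerate(reversed(digits)):
        total += int(digit) * base ** position
    return total


decimal_value = from_digits("107", 10)     # 1*10^2 + 0*10^1 + 7*10^0
binary_value = from_digits("1101011", 2)   # 1*2^6 + 1*2^5 + 1*2^3 + 1*2^1 + 1*2^0
bits_of_107 = format(107, "b")             # built-in conversion back to binary digits

print(decimal_value, binary_value, bits_of_107)
```

Both digit strings evaluate to the same number, which is the point of the slide: 107 in base 10 and 1101011 in base 2 are two spellings of one value.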
Computer coding of characters
• American Standard Code for Information Interchange (ASCII)
• a 7-bit character set: codes range from 0 to 127
• 0-31 (and 127, Delete): control codes and formatting:
Escape, Tab, ...
• 32-126: space, punctuation, digits, and English letters:
!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFG
HIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklm
nopqrstuvwxyz{|}~
e.g., “A” - 1000001 (65)
• ISO-8859-1
• It uses 8 bits per character and contains ASCII as a subset.
• In addition to the ASCII characters, ISO-8859-1 contains various
accented characters and other letters needed for writing languages
of Western Europe, and some special characters.
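As a rough illustration of the two code tables above, the following Python sketch looks up the code for “A” and encodes an accented letter. It relies only on Python's built-in `ord`, `format`, and the standard `ascii` and `iso-8859-1` codecs; the sample characters are chosen here for illustration and are not from the slides.

```python
# Code number and 7-bit pattern for "A" in ASCII.
code_a = ord("A")
bits_a = format(code_a, "07b")

# ASCII is a subset of ISO-8859-1: the same byte encodes "A" in both.
same_byte = "A".encode("ascii") == "A".encode("iso-8859-1")

# An accented letter fits in one ISO-8859-1 byte (a value above 127) ...
e_acute_bytes = "é".encode("iso-8859-1")

# ... but it has no code in 7-bit ASCII.
try:
    "é".encode("ascii")
    e_acute_in_ascii = True
except UnicodeEncodeError:
    e_acute_in_ascii = False

print(code_a, bits_a, same_byte, list(e_acute_bytes), e_acute_in_ascii)
```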
Computer coding of characters
• Some languages (Chinese, Japanese, Korean, etc.) have more than
256 characters.
• Encoding standards for these languages use sequences of bytes for
characters.
• Because different standards use different numbers of bytes, the
computer can’t tell whether a given byte is a whole character or part of
a character; corruption of one byte can corrupt the whole data stream.
• Unicode: 21-bit encoding space allows for 1,114,112 code points;
95,156 code point values assigned to characters in Unicode 3.2;
879,626 code point values reserved for future character assignments.
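The sketch below illustrates these points with one character, 中, which is chosen here for illustration and does not appear on the slide. It uses Python's standard codecs `gb2312`, `big5`, `utf-8`, `utf-16-le`, and `utf-32-le`. It shows that the byte count depends on the standard, and that bytes written under one standard do not decode to the same character under another.

```python
# Size of the Unicode code space: code points 0 through 0x10FFFF.
UNICODE_CODE_POINTS = 0x10FFFF + 1   # 17 planes of 65,536 code points each

char = "中"
encodings = ("gb2312", "big5", "utf-8", "utf-16-le", "utf-32-le")

# Number of bytes the same character needs under each standard.
byte_lengths = {name: len(char.encode(name)) for name in encodings}

# Bytes written as GB2312 but read back as Big5 do not give the original
# character: the byte sequence alone does not say which standard was used.
misread = char.encode("gb2312").decode("big5", errors="replace")

print(UNICODE_CODE_POINTS, byte_lengths, misread != char)
```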
UTF-8
• For ASCII characters, the 21-bit value is truncated to 8 bits (a
single byte).
• For other characters, the 21-bit value is turned into a sequence of two,
three, or four 8-bit values.
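The byte layouts for this slide did not survive extraction. The Python sketch below shows the standard UTF-8 bit patterns instead, on the assumption that this is what the slide pictured. The function name `utf8_encode` is made up for illustration; in practice Python's built-in `str.encode("utf-8")` does this.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one Unicode code point as UTF-8 by hand."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not an encodable Unicode code point")
    if code_point < 0x80:
        # ASCII range, one byte: 0xxxxxxx
        return bytes([code_point])
    if code_point < 0x800:
        # Two bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:
        # Three bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


for ch in ["A", "é", "來", "😀"]:
    encoded = utf8_encode(ord(ch))
    # Show each byte as eight bits so the leading-bit patterns are visible.
    print(ch, len(encoded), " ".join(format(b, "08b") for b in encoded))
```

The leading bits of each byte say whether it starts a character or continues one, so a reader of the byte stream can find character boundaries without extra information.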
Phonology in reading development
• Phonological processes are generally considered to be important for
developing word reading skills.
• “The best predictor of reading difficulty in kindergarten or first grade
is the inability to segment words and syllables into constituent
sound units (phonemic awareness).” (Lyon, 1995)
• “Reading and phonemic awareness are mutually reinforcing:
Phonemic awareness is necessary for reading, and reading, in turn,
improves phonemic awareness still further.” (Shaywitz, 2003)
Dyslexia
• Phonological deficits are the most significant and consistent cognitive
marker of dyslexic children.
• Auditory Analysis Test: it asks a child to segment words into their
underlying phonological units and then to delete specific phonemes
from the words, e.g., say ‘block’ without ‘buh’.
• Even in high school students, phonological awareness was the best
indicator of reading ability.
Phonological activation in reading
• Skilled readers activate phonological representations in reading.
• Lexical decision: In a lexical decision paradigm, subjects are
presented with a written stimulus and are asked whether or not
the stimulus is a word of their language. A typical finding is that
participants take more time to reject pseudohomophone foils
(nonwords that sound like real words, e.g., “brane”) than control
foils.
• Naming: In the naming paradigm, subjects are again presented
with a written stimulus, but this time they are asked to
pronounce the stimulus aloud, that is, to “name” the word on
the screen. A typical finding is that participants take less time
to name the word if a homophone (prime) is presented before
the target word.
Phonological activation in reading
• Eye movement: Target words were read faster (shorter
fixation duration) when a phonologically similar word,
e.g., a homophone, was presented briefly at the onset of
fixation on the target region (Rayner et al. 1995).
• The prime for a given target (e.g., beach) was either
identical to the target (beach), a phonologically similar word
(the homophone beech), a visually similar nonhomophone
(bench), or a dissimilar word (noise).
• The critical comparison is between fixation times on the
target when it was preceded by the homophone versus the
visually similar word.
Phonological activation in reading
• ERP (event-related potential): Target words had smaller ERPs
(averaged electrical activity in the brain) when a phonologically
similar word or syllable was presented before the target word
(Ashby 2010).
Phonological activation in reading
• The morphemic nature of Chinese writing leads easily to the
assumption of a close connection between graphic form and meaning.
Characters are not alphabetic.
• On average, 11 characters share a single pronunciation if tone is
disregarded; if tone is considered, each character has about four
homophones.
石室诗士施氏, 嗜狮, 誓食十狮。
氏时时适市视狮。
十时, 适十狮适市。
是时,适施氏适市。
氏视是十狮, 恃矢势, 使是十狮逝世。
氏拾是十狮尸, 适石室。
石室湿, 氏使侍拭石室。
石室拭, 氏始试食是十狮。
食时, 始识是十狮, 实十石狮尸。
试释是事。
Phonological activation in reading
• As in English, Chinese readers take less time to name the
target word if a homophone (prime) is presented before it (Perfetti &
Tan 1998).
• Graphic information begins the identification process and is the first to
show a priming effect.
• Phonological information precedes semantic information in primed naming.
Phonological activation in reading
• Analysis of 19 published brain mapping studies (fMRI) of phonological
processing in reading, six with Chinese and 13 with alphabetic
languages, found significant differences between languages (Tan et al.
2005).
• The left middle frontal gyrus is responsible for addressed phonology in
Chinese.
• Left temporoparietal regions mediate assembled phonology in alphabetic
languages.
• More on language and brain later.