Orthography and Language Preservation

Breaking Rules
for Orthography Development
Pamela Munro, UCLA
Developing Orthographies for
Unwritten Languages
LSA Annual Meeting, Pittsburgh, 7 Jan. 2011
About this talk
• Orthographic rules
• Two case studies:
— Tlacolula Valley Zapotec, spoken in Oaxaca,
Mexico, by close to 30,000 people, several thousand
of whom live in Southern California
— Gabrielino/Tongva/Fernandeño, the language of
the original inhabitants of the Los Angeles Basin, last
recorded in the 1930s
• Orthography and community needs
What is orthography?
• An orthography is a systematic, standardized
writing system for a particular language. A good
orthography is designed to be appropriate for the
sounds and phonology of that language.
• An orthography is different from people’s ad hoc
writing on a word-by-word basis, although such
efforts may often represent the beginning of
orthography development.
• In optimal cases, an orthography can be typed
on a standard keyboard (though there are
certainly exceptions to this rule).
• An orthography must be acceptable to users (and
other stakeholders: Cahill, this symposium).
What isn’t orthography?
• Orthography is not phonetic transcription.
• No two languages can necessarily be written
with the same orthography.
• A standardized orthography gives users a
system for writing words so that other speakers
can read them.
• This does not necessarily mean that each word
has a standardized spelling or pronunciation, or
that some speakers’ pronunciations can or
should be labeled non-standard.
Orthographic rules vary from language to language —
each language’s orthography is different.
For example, consider the letter <j>
in familiar European languages:
• In English, <j> usually represents the sound [ǰ], as
in juvenile or junior.
• In French, <j> represents the zh sound ([ž]), as in
jeune 'young'.
• In Spanish, <j> represents the sound of German or
Scottish ch [x], as in joven 'young'.
• In German, <j> represents the y sound [y], as in
jung 'young'.
Two basic rules for
good orthographies
• Every symbol or symbol combination (a
sequence such as <ch>) must always
represent the same sound (or
phoneme) of the language.
• Every sound of the language must
always be written the same way.
Orthographies vary in terms of
how well they follow these rules…
• English violates both rules. English sounds are often written in
more than one way (for example, the [ey] sound in way,
weigh, raid, and rate), and the same English spelling may
represent more than one sound (for example, the <ough> in
cough, through, rough, and though).
• A Spanish speaker almost always knows exactly how to
pronounce a word from its spelling (though some letters, such
as <x>, present difficulties). However, many speakers
pronounce <y> and <ll> alike, so they may be uncertain of
how to spell an unfamiliar word containing a y sound. There is
also confusion among <s>, <z>, and <c> and between <g>
and <j>, and speakers are often puzzled about where to use
the letter <h>.
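The two basic rules can be checked mechanically against a list of spelling/sound pairs. The following is a minimal sketch, not part of the original talk: the function name `rule_violations` and the pair list are hypothetical, the spellings come from the English examples above, and the transcriptions given for the four <ough> words are assumptions added for illustration.

```python
from collections import defaultdict


def rule_violations(pairs):
    """Return (rule1, rule2) violations for a list of (spelling, sound) pairs.

    rule1: spellings that stand for more than one sound.
    rule2: sounds that are written in more than one way.
    """
    sounds_for = defaultdict(set)
    spellings_for = defaultdict(set)
    for spelling, sound in pairs:
        sounds_for[spelling].add(sound)
        spellings_for[sound].add(spelling)
    rule1 = {sp: so for sp, so in sounds_for.items() if len(so) > 1}
    rule2 = {so: sp for so, sp in spellings_for.items() if len(sp) > 1}
    return rule1, rule2


# Illustrative data only: way, weigh, raid, rate; cough, through, rough, though.
# The <ough> transcriptions are assumed values, not taken from the talk.
ENGLISH = [
    ("ay", "ey"), ("eigh", "ey"), ("ai", "ey"), ("a_e", "ey"),
    ("ough", "ɔf"), ("ough", "uw"), ("ough", "ʌf"), ("ough", "ow"),
]

rule1, rule2 = rule_violations(ENGLISH)
print("Rule 1 violations (one spelling, several sounds):", rule1)
print("Rule 2 violations (one sound, several spellings):", rule2)
```

An orthography that follows both rules would return two empty dictionaries for its full inventory.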
Orthographic rules need to be flexible
• A morpheme may be written the same even though
it is pronounced differently in different contexts.
• For example, the English plural suffix is always
written <s>, even though it is regularly pronounced
[z] when it follows a voiced sound: compare the
sound of the <s> in pots [s] vs. pods [z].
• My two case studies illustrate two quite different
ways in which orthographic rules must be broken to
be effective.
Case Study 1:
Tlacolula Valley Zapotec (TVZ)
Ladies from San Lucas Quiaviní with UCLA collaborators.
Tlacolula Valley Zapotec (TVZ / zab):
the academic project
Collaborators and contributors include
• Dr. Felipe H. Lopez, UCLA
• Dr. Brook Lillehaugen, University of Nevada,
and several others — at UCLA, there have
been three master’s theses and three
dissertations written about the language,
with more work in progress
The sounds of TVZ: Consonants
fortis stop: c / qu [k]
lenis stop: g / gu [g]
fortis affricate: ch [č]
fortis fricative: x [š], x: [ş], j [x]
lenis fricative: zh [ž], zh: [ȥ]
lenis nasal: ng [ŋ]
fortis nasal: nng (fortis [ŋ])
lenis lateral
fortis lateral
flap (lenis?)
trill (fortis?)
glide (lenis)
The sounds of TVZ: Vowels
Vowels: i, e, ë [ɯ], a, o, u
Tones: high, low, falling, rising
Phonation: modal (plain, e.g. a), checked
(postglottalized, e.g. a’), breathy (e.g. ah),
creaky (e.g. à) (at least…)
Up to three (or four?) vowel units may occur
in a syllable nucleus, with the same or
different phonation.
TVZ vowel complexes
There are 20 or so different
“vowel complex” types, each
of which is associated
uniquely with a specific tone.
Lopez’s and my 1999
dictionary is written in an
“academic” orthography that
differentiates each of these.
But even linguists find the
details of this writing system
hard to use and remember.
Speakers often remark that words like the
following are “spelled the same”….
<bel> ?
<gyia> ?
<na> ?
<nda> ?
<Bel> / <bel>
• Be’ll 'Abel'
• behll 'fish'
• bèèe’ll 'snake'
• bèe’l 'naked'
• bèe’ll '(woman’s) sister'
• beèe’l 'meat'
<gyia>
• gyiia 'will go home'
• gyihah 'rock'
• gyìa 'agave root'
• gyii’ah 'will drink'
• gyììa’ 'flower'
• gyìi’ah 'market'
<na>
• nah 'now'
• nnah 'says'
• nàa 'is'
• nnàa’ah 'heavy'
• nnaàa’ 'hand'
and also
• nàa’ 'I' / 'me' (now written <naa>)
<nda>
• ndàa 'sensitive'
• nda’ah 'had been poured'
• ndàa’ 'loose'
• ndàa’ah 'had broken'
• ndaàa’ 'hot'
Native speakers are as puzzled as linguists
about the best way to write Zapotec.
Nyi’ihs yàa ‘clean water’ on a water
purifying facility in Mitla.
Informally written Zapotec can be inconsistent,
for both vowels and consonants.
Dìi’zh x:tèe’n binguul ‘word of the elder’ in the
Tlacolula newspaper El Tlacolulense.
(<sh> for zh, <s> for x:, vowel contrasts neutralized)
TVZ: The teaching project
Lillehaugen, Lopez, and
I are collaborating on
Cali Chiu? A Course in
Valley Zapotec, now
being used for the sixth
time in first year college
Zapotec courses taught
by Lopez at UCLA (and
earlier at UC San
TVZ: The minimalist orthography
In Cali Chiu? we have replaced the
academic orthography of the dictionary
with a minimalist spelling system that
writes, for example, all 20+ varieties of a as
<a> and merges consonant contrasts with
a low functional load (x, x: both <x>; zh,
zh: both <zh>; m, mm both <m>; n, nn both
<n>; ng, nng both <ng>…).
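The reduction from academic to minimalist spelling can be approximated in a few lines. This is only a rough sketch under stated assumptions, not the authors' actual conversion procedure: the function name `minimalist` is hypothetical, and the rules (drop the grave accent, the apostrophe, and a post-vocalic h; collapse repeated vowels; merge x:/zh: and doubled l, m, n) are inferred from the word lists and mergers quoted in this talk.

```python
import re
import unicodedata

VOWELS = "aeiou\u00eb"  # a, e, i, o, u plus ë, which must keep its diaeresis


def minimalist(word):
    """Approximate the minimalist spelling of an academic-orthography word."""
    # Remove the combining grave accent (creaky phonation) but nothing else.
    text = unicodedata.normalize("NFD", word).replace("\u0300", "")
    text = unicodedata.normalize("NFC", text)
    # Remove the apostrophe that marks checked vowels (straight or curly).
    text = text.replace("'", "").replace("\u2019", "")
    # Merge the low-functional-load consonant contrasts.
    text = text.replace("x:", "x").replace("zh:", "zh")
    # Drop h after a vowel (assumed here to mark breathiness only).
    text = re.sub(rf"(?<=[{VOWELS}])h", "", text, flags=re.IGNORECASE)
    # Collapse runs of the same vowel, then doubled (fortis) l, m, n.
    text = re.sub(rf"([{VOWELS}])\1+", r"\1", text, flags=re.IGNORECASE)
    text = re.sub(r"([lmn])\1+", r"\1", text, flags=re.IGNORECASE)
    return text


for academic in ["bèèe'll", "bèe'l", "beèe'l", "behll", "gyihah", "nnaàa'", "ndàa'ah"]:
    print(academic, "->", minimalist(academic))
```

The sketch does not cover lexical exceptions such as nàa’ 'I', which is now written <naa> rather than <na>.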
Spelling and pronunciation guides
Normally, our book uses only the minimalist spelling.
The first time words are introduced, however, they are
followed by a pronunciation guide (using the academic
orthography) in square brackets:
bel [bèèe'll] 'snake'
bel [bèe'l] 'naked'
bel [bèe'll] '(woman’s) sister'
bel [beèe'l] 'meat'
Different people have different learning styles and
bracketed pronunciation guides may be more useful to
some than to others, so students are never tested on
pronunciation guide representations.
The (new) TVZ orthography: Conclusions
• The new orthography violates the first basic rule of
orthography design (every letter or letter combination
must always represent the same sound of the language),
just as the spellings <bow> and <read> do in the following English sentences:
– He made a bow with the ribbon.
vs. He made a bow to the queen.
– I read this book yesterday.
vs. I read this book every day.
• This “violation” makes the orthography less intimidating to
users and does not seem to impede written
communication among those familiar with the language.
Future Plans
• We’re continuing to revise the Cali Chiu? book,
and are working on a Spanish translation, which
Lillehaugen taught from in Mexico City.
• Lillehaugen has written a well-received book on
Tlacolula-area plants in the minimalist
orthography; versions are planned for other
• The dictionary will be revised using the new
system, with the old entries as pronunciation guides.
• Lopez and I are editing speakers’ narratives about
the immigration experience, to be published in the
new orthography.
Post-Minimalist Zapotec Writing
• Several years ago, Román López Reyes (a
teacher in San Lucas Quiaviní) established the
Colectivo Literario Quiaviní, a group of students
who write poetry and stories in TVZ (Chávez
Peón and López Reyes, et al., 2009).
• Crucially, however, they write using a form of the
(published, hence authoritative) academic
orthography, not the (easier!) minimalist one.
• But there are many inconsistencies (it’s a very
difficult system)….
Mario Chávez Peón (UBC), Román López Reyes, and I
with the Colectivo Literario Quiaviní in 2008.
Case Study 2:
Gabrielino/Tongva/Fernandeño
My fellow members of the Gabrielino/Tongva/Fernandeño
language committee include Virginia Carmelo and Jacob Gutierrez,
shown here with Pamela Villaseñor, the late Carol Ramirez, and L.
Frank Manriquez at the 2006 Breath of Life Workshop in Berkeley.
• The original language of the Los Angeles
area, traditionally known as Fernandeño
(in the San Fernando Valley) and
Gabrielino (in the Los Angeles basin)
• In recent decades, called Tongva by
descendants of speakers in the Gabrielino
areas (I’ll refer to the language this way)
• Last documented in the 1930s by
linguist/ethnologist J. P. Harrington
• Presumably “sleeping” for more than fifty years
J. P. Harrington (1884-1961)
Harrington recorded vast
amounts of data on
American Indian
languages for the
Bureau of American Ethnology.
There are 4 microfilm
reels of Tongva data,
which are being
transcribed by
volunteers working
with the J. P.
Harrington Project at
UC Davis.
The Takic (Uto-Aztecan) Languages
of Southern California
also Tataviam (and others?)
Linguistic work on Tongva
• I began work on Tongva (systematizing and
analyzing Harrington’s notes) in the late 1970s,
inheriting the project from Geraldine Anderson
and the late William Bright of UCLA.
• Drawing on my experience with other Southern
California Uto-Aztecan languages, I developed a
practical orthography for Tongva that I used in
several linguistic articles.
• Since 2004 I’ve been meeting monthly with the
other members of the Tongva language
committee, working on basic learning and on
developing materials for language learners.
My academic orthography for Tongva
(developed before 1980)
Consonants: ’, ch, h, k, kw, l, m, n, ng,
p, r, s, sh, t, v, w, x, y (and, in
Spanish loans, b, (d?,) f, g, z).
Vowels: aa, ee, ii, oo, uu (long
vowels); a, e, o (short vowels).
Difficult sounds and spellings (1)
For my Tongva colleagues, the difficult sounds are ’ (the glottal
stop, as found in the middle of English uh-oh, as in kwitii’
‘boy’), ng (as in yongaavewot ‘condor’ or ngooxavavet
‘metate’), and x (as in xaay ‘not’).
Like <j>, the letter <x> has different roles in different
orthographies —
• In English, <x> represents a [ks] sound.
• In TV Zapotec, as we just saw, <x> represents a [š] sound,
like English <sh> (or its retroflexed counterpart).
• In Tongva, <x> is used to represent [x] (like German <ch> in
Bach, or Spanish <j> in Baja)
• In Spanish, <x> can represent [ks] in éxito, [š] in mexica, [x]
in México, or [s]/[ks]/[š] in mixteco.
Difficult sounds and spellings (2)
• For the linguist, the greatest difficulty is finding the most
efficient and esthetic orthographic representation of the
Tongva vowels….
• In Harrington’s recordings, each word has one stressed
and long vowel (written double: these are <aa>, <ee>,
<oo>, <ii>, <uu>).
• Harrington wrote unstressed short vowels less consistently,
but contrasted only three vowel sounds (qualities): [a]; [e]
(occasionally written [i]); and [o] (occasionally written [u]).
• Therefore, in my original linguistic work (including the early
drafts of our Tongva dictionary) I wrote only unstressed
<a>, <e>, and <o>.
(The Tongva pattern is paralleled by that of closely related
Luiseño, which contrasts only unstressed a, i, u.)
Tongva plurals (1)
Tongva forms noun plurals by reduplication (sometimes there
is also a plural suffix):
naavot 'tuna cactus'        nanaavot 'tuna cactuses'
xaayy 'mountain'            xaxaayy 'mountains'
peet 'road'                 pepeet 'roads'
toomshar '(type of) oak'    totoomshar 'oaks'
naaxovar 'cane'             nanaaxovar 'canes'
paaytxo'ar 'bow'            papaaytxo'ar 'bows'
The first consonant and vowel of the singular are copied at
the beginning of the plural noun. In the words above, the
first two vowels of the plural noun are identical, except for
their length/stress. All plurals are stressed on their second
syllable (have a long second syllable vowel).
Tongva plurals (2)
The plurals of other nouns look less regular, however:
kiiy 'house'            kekiiy 'houses'
huunar 'bear'           hohuunar 'bears'
muuhot 'owl'            momuuhot 'owls'
piinor 'hummingbird'    pepiinoram 'hummingbirds'
The short copy (first vowel) and long original (second
vowel) of these plural nouns don’t look the same….
Tongva plurals (3)
Nouns that have the stress on the second vowel also have their
first vowel short, second vowel long in the plural. Once again,
though, in some plurals the first two vowels are the same, but in
some they are different:
shaxaat 'willow'        shashaaxat 'willows'
sheveer 'sycamore'      shesheever 'sycamores'
novoor 'tray basket'    nonoovor 'tray baskets'

mokaat 'song'           momuukat 'songs'
pekwaar 'blackberry'    pepiikwar 'blackberries'
shokaat 'deer'          shoshuukat 'deer (plural)'
Additionally, other vowels in the word may change in the plural:
xongiit 'squirrel'      xoxoonget 'squirrels'
The problem is the spelling system
• In the academic orthography, the only
short/unstressed vowels are <a>, <e>, and <o>.
• Thus, even though we believe that the first
vowel of plural kekiiy ‘houses’ is a short copy
of the ii in the singular kiiy (pronounced as [i] /
[e]), the system requires it be written as <e>.
• Similarly, even though we believe that the first
vowel of singular shokaat ‘deer’ is a short
copy of the stressed uu in shoshuukat ([u] /
[o]), we’re required to write it as <o>.
A new way to write the Tongva vowels
Relaxing the rules about writing unstressed/short vowels and
allowing the use of unstressed/short <i> and <u> makes all the
singular-plural pairs look more regular:
kiiy 'house'
kikiiy 'houses' (not kekiiy)
huunar 'bear'
huhuunar 'bears' (not hohuunar)
muuhot 'owl'
mumuuhot 'owls' (not momuuhot)
piinor 'hummingbird'
pipiinoram 'hummingbirds' …
mukaat 'song'
mumuukat 'songs'
pikwaar 'blackberry'
pipiikwar 'blackberries'
shukaat 'deer'
shushuukat 'deer (plural)'
xongiit 'squirrel'
xoxoongit 'squirrels'
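The reduplication pattern behind these pairs can be stated as a short procedure. The sketch below is an illustration only, under assumptions inferred from the examples in this talk: the function names `pluralize` and `academic_spelling` are hypothetical, every word is assumed to start with a consonant (or digraph) plus vowel, and plural suffixes such as the -am of pipiinoram are not handled.

```python
import re

VOWELS = "aeiou"


def pluralize(singular):
    """Form a reduplicated plural in the revised orthography."""
    match = re.match(rf"([^{VOWELS}]*)([{VOWELS}])(.*)", singular)
    if match is None:
        raise ValueError(f"no vowel found in {singular!r}")
    onset, vowel, rest = match.groups()
    copy = onset + vowel  # short copy of the first consonant and vowel
    if rest.startswith(vowel):
        # First vowel is already long (kiiy): just prefix the short copy.
        return copy + singular
    # Stress falls on the second syllable of the plural, so the first vowel
    # of the base lengthens and the formerly long vowel shortens (mukaat).
    shortened = re.sub(rf"([{VOWELS}])\1", r"\1", rest, count=1)
    return copy + onset + vowel + vowel + shortened


def academic_spelling(word):
    """Respell short i and u as e and o, as the academic orthography requires."""
    word = re.sub(r"(?<!i)i(?!i)", "e", word)
    return re.sub(r"(?<!u)u(?!u)", "o", word)


for noun in ["naavot", "kiiy", "huunar", "mukaat", "pikwaar", "xongiit"]:
    plural = pluralize(noun)
    print(noun, "->", plural, "| academic:",
          academic_spelling(noun), "->", academic_spelling(plural))
```

In the revised spelling the plural is a transparent copy of the singular's first consonant and vowel; passing both forms through `academic_spelling` reproduces the less regular-looking pairs such as kiiy / kekiiy and mokaat / momuukat.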
Consequences of the spelling changes
• The revised orthography is no longer strictly
phonological (since it suggests a grammatical
difference between unstressed/short <i>/<e> and
<u>/<o> which Harrington’s recordings don’t support).
• This change may tempt learners to make a
pronunciation difference where original speakers might
not have done so. But this is probably not too crucial.
• Another problem is that the status of unstressed/short
vowels that don’t alternate with stressed/long vowels is
uncertain. This too is probably not too important.
• The advantage is that in the new system plurals (and
other comparable phenomena) are much easier to
• The revised orthography is being used often. Here, our
group is teaching non-Tongva participants in the 2006
Breath of Life Workshop the Tongva Hooke’-Pooke’…
• Prayers and other cultural events offer another opportunity
to write and speak Tongva.
Jacob Gutierrez is a professional artist who
hosts our Tongva meetings.
At right are his design for the
“Day of the Whale” at the
Point Vicente Interpretive
Center and one of his
flash cards.
Jacob Gutierrez’s Tongva/Gabrielino/Fernandeño village map
…using village
name spellings in
our Tongva
(we’re still
learning more
about this and
The (revised) Gabrielino/Tongva/Fernandeño
orthography: Conclusions
• The revised orthography violates the second
basic rule of orthography design (every sound
of the language must always be written the
same way).
• The violation results in an orthography that
makes it easier for learners to see the
relationship between singular and plural
nouns (and other such relationships).
What all this shows
• Orthographies must be developed on a case-by-case basis to fit the needs of each
individual language and community (again,
see Cahill, this symposium).
• Ideally, an orthography should follow the one
sound / one symbol and one symbol / one
sound rules, but sometimes these rules must
be modified to serve language and
community needs.
That’s all…
• Niizyi… (Tlacolula Valley Zapotec)
• Horuura’… (Tongva)
(except for the references,
which are on the handout)
Thanks to everyone who has helped me learn about
these two wonderful languages!
