Natural Languages
© 2007
• Definition of Language
– In math and computer science:
• A lexicon & rules for combining terms from the lexicon
– In common use:
• Structured verbal interaction between people
• Any structured interaction such as “The Language of Film”
• Are computer languages a model for human natural language?
Wide Variability among Natural Languages
• Sentence Structure
– SVO (Subject-Verb-Object) (English, Chinese)
– VSO (Gaelic/Celtic)
– SOV (Hindi, Japanese, Hopi)
• Written
– Ideographic (Chinese)
– Syllabic (Thai)
– Alphabetic (English)
• Spoken
– Tonal (Chinese)
– Non-tonal (English)
• Words
– Morphology, Orthography, Phonetics, Phonology
– Words are categorized into parts of speech
• Syntax
– Phrase and sentence structure based on parts of speech
• Semantics
– Literal meaning
• Pragmatics/Discourse
– Uses beyond the literal meaning
• Grammars are most often associated with modeling syntax, though semantic grammars are also possible. In the broadest sense, grammars are rules for languages.
• The most widely used grammars are “context-free”. That is, the way a constituent is expanded does not depend on the context in which it appears.
• The grammars used for syntax are usually “constituent grammars”. That is, they identify the relationships among the components (constituents) of a phrase.
• Grammars taught in grade school are “prescriptive” grammars. Grammars in the formal analysis of language are “descriptive” and usually “generative”.
• Grammars are usually defined by rules, but statistical transition networks are also used to model the structure of language.
Modeling Natural Language Syntax with Grammars
• Rewrite (or production) rules (phrase-structure grammar)
• A very simple example of rewrite rules
S → NP+VP
NP → N, Adj+N
VP → V, V+NP
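The rewrite rules above can be tried out directly. The following is a minimal sketch, assuming the rules are stored in a dictionary (the names `GRAMMAR` and `expand` are illustrative, not part of the slides). It lists every part-of-speech pattern that the three rules can derive from S.

```python
import itertools

# Rewrite rules from the slide; each left-hand symbol maps to its alternatives.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["N"], ["Adj", "N"]],
    "VP": [["V"], ["V", "NP"]],
}

def expand(symbol):
    """Return every part-of-speech sequence derivable from `symbol`."""
    if symbol not in GRAMMAR:
        # N, V, and Adj have no rule of their own, so they stay as they are.
        return [[symbol]]
    results = []
    for alternative in GRAMMAR[symbol]:
        # Expand each symbol on the right-hand side, then combine the choices.
        parts = [expand(s) for s in alternative]
        for combo in itertools.product(*parts):
            results.append([tag for part in combo for tag in part])
    return results

sentence_patterns = expand("S")
for pattern in sentence_patterns:
    print(" ".join(pattern))
```

Because none of these rules refers back to itself, the expansion terminates; a recursive grammar would need a depth limit.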
• Can we identify the grammatical structure of a given statement?
• Parsing is the basis of syntax checking for computer program compilers.
• A parse tree is the structure of a given statement, given
– a lexicon with parts-of-speech
– a grammar
• A very simple sample parse tree is shown at the right. This has a Verb Phrase with a Direct Object. This Direct Object is itself a Noun Phrase.
[Figure: parse tree with root S, phrase nodes NP and VP, and leaves Adj, N, V]
• Difficulties: Garden path sentences
– “The man who hunts ducks out on weekends”
• Many algorithms have been developed for parsing.
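As one illustration of parsing with a lexicon and a grammar, here is a minimal sketch of a backtracking top-down parser. It is only one of many possible algorithms. The grammar is the simple rewrite-rule example (S, NP, VP); the lexicon, the sample sentence, and the function names are assumptions made for this sketch.

```python
# Grammar from the rewrite-rule example; the lexicon below is hypothetical.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["N"], ["Adj", "N"]],
    "VP": [["V"], ["V", "NP"]],
}
LEXICON = {"hungry": "Adj", "small": "Adj", "dogs": "N", "cats": "N", "chase": "V"}

def parse(symbol, words, start):
    """Yield (tree, next_position) for each way `symbol` can cover words from `start`."""
    if symbol not in GRAMMAR:
        # A part-of-speech category matches exactly one word with that tag.
        if start < len(words) and LEXICON.get(words[start]) == symbol:
            yield (symbol, words[start]), start + 1
        return
    for alternative in GRAMMAR[symbol]:
        yield from parse_sequence(alternative, words, start, (symbol,))

def parse_sequence(symbols, words, start, partial):
    """Match `symbols` left to right, growing the partial tree as we go."""
    if not symbols:
        yield partial, start
        return
    for subtree, position in parse(symbols[0], words, start):
        yield from parse_sequence(symbols[1:], words, position, partial + (subtree,))

def parse_sentence(text):
    """Return all parse trees that cover the whole sentence."""
    words = text.lower().split()
    return [tree for tree, end in parse("S", words, 0) if end == len(words)]

for tree in parse_sentence("hungry dogs chase small cats"):
    print(tree)
```

The resulting tree has a Verb Phrase whose Direct Object is itself a Noun Phrase, as in the sample tree. A word sequence the grammar cannot derive, such as "chase dogs", yields an empty list.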
• What do we know about how people process and learn language?
• Are all languages context free?
• Language learning
– Children sometimes seem to over-apply rules. “I goed to the store”
• Competence vs. performance
• Transformational grammars are a model that allows re-arrangement of structure.
Modeling Syntax with Statistical Models
• While most grammars are a rule-based representation, a statistical representation of language may capture structure more flexibly.
• In particular, Markov models can describe the transitions between different parts of speech. For instance, Nouns are often followed by Verbs, but Adjectives are rarely followed by Verbs.
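The part-of-speech transitions described above can be estimated by counting. The following is a minimal sketch of a first-order Markov model, assuming a tiny hand-made set of tagged sentences (the data and the names `TAGGED_SENTENCES` and `transition_probabilities` are invented for illustration, not drawn from a real corpus).

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus: each sentence is reduced to its part-of-speech tags.
TAGGED_SENTENCES = [
    ["Adj", "N", "V", "Adj", "N"],
    ["N", "V", "N"],
    ["N", "V"],
    ["Adj", "N", "V", "N"],
]

def transition_probabilities(sentences):
    """Estimate P(next tag | current tag) from adjacent tag pairs."""
    counts = defaultdict(Counter)
    for tags in sentences:
        for current, following in zip(tags, tags[1:]):
            counts[current][following] += 1
    return {
        current: {tag: n / sum(followers.values()) for tag, n in followers.items()}
        for current, followers in counts.items()
    }

probabilities = transition_probabilities(TAGGED_SENTENCES)
for current, followers in probabilities.items():
    print(current, followers)
```

In this toy data every Noun that has a successor is followed by a Verb, and no Adjective is followed by a Verb, so that transition does not appear in the table at all. A real model would be estimated from a large tagged corpus and would usually smooth such zero counts.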
• What exactly is a word?
– Sail-boat, Pennsylvania, 555-1212, F-16
• Definitions of words
– Why aren’t the definitions of words in dictionaries all the same?
– Are exact definitions of words possible?
• Across time, across groups
– Words evolve in meaning
• Sometimes by radial categories (that is, often by metaphor)
• What is the relationship between concepts and words?
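The question of what exactly counts as a word becomes concrete as soon as text has to be split automatically. The sketch below applies two possible definitions, written as regular expressions chosen only for illustration, to the examples from this slide. Neither definition is claimed to be the right one.

```python
import re

TEXT = "Sail-boat, Pennsylvania, 555-1212, F-16"

# Definition A: a word is a maximal run of letters.
letters_only = re.findall(r"[A-Za-z]+", TEXT)

# Definition B: a word is a run of letters or digits, optionally joined by hyphens.
with_hyphens = re.findall(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*", TEXT)

print(letters_only)   # ['Sail', 'boat', 'Pennsylvania', 'F']
print(with_hyphens)   # ['Sail-boat', 'Pennsylvania', '555-1212', 'F-16']
```

Definition A splits "Sail-boat" in two and loses the phone number entirely, while Definition B keeps all four items whole.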
Tools beyond Traditional Dictionaries: WordNet and FrameNet
• WordNet http://wordnet.princeton.edu/
– Shows hierarchical relationships for dictionary terms. Very loosely, this can be thought of as an ontology.
• FrameNet http://framenet.icsi.berkeley.edu/
– Verbs show the relationship among concepts. For instance “to give” implies that there is a gift, a gifter, and a giftee.
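The two kinds of structure described above can be sketched with plain data structures. This is a hand-written illustration only: the hierarchy and the frame below are invented for the example and are not taken from the WordNet or FrameNet databases, and the names `HYPERNYM`, `hypernym_chain`, and `Frame` are not part of either project's API.

```python
from dataclasses import dataclass

# Hypothetical "is a kind of" links, loosely in the spirit of a WordNet hierarchy.
HYPERNYM = {
    "dog": "canine",
    "canine": "carnivore",
    "carnivore": "mammal",
    "mammal": "animal",
}

def hypernym_chain(term):
    """Follow the hierarchy upward from `term` to its most general ancestor."""
    chain = [term]
    while chain[-1] in HYPERNYM:
        chain.append(HYPERNYM[chain[-1]])
    return chain

@dataclass
class Frame:
    """A verb together with the roles it implies, in the spirit of FrameNet."""
    verb: str
    roles: tuple

    def fill(self, **fillers):
        # Every role implied by the verb must be supplied.
        missing = [role for role in self.roles if role not in fillers]
        if missing:
            raise ValueError(f"missing roles for '{self.verb}': {missing}")
        return {role: fillers[role] for role in self.roles}

# "to give" implies a gift, a gifter, and a giftee.
GIVE = Frame("give", ("gifter", "gift", "giftee"))
event = GIVE.fill(gifter="Ann", gift="a book", giftee="Bill")

print(hypernym_chain("dog"))
print(event)
```

Leaving out a role, for example calling `GIVE.fill(gifter="Ann")`, raises an error, which mirrors the idea that the verb implies all three participants.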
• Very different statements can have similar semantics.
• The semantics of statements in a computer programming language (i.e., a program) can be determined from its behavior.
• The semantics of natural language is often judged by the meaning and relationship of the components. Subjective and contextualized meaning is considered pragmatics, which we will discuss later.
• Semantic grammar
– Even with different surface structures, can we develop a standard representation for the meaning?
• Interlingua
– A common mediator for meaning across languages. This could be useful for translation.
Referential
• Conveys information about some real phenomenon
• This is what we think about as normal language use
Expressive
• describes feelings of the speaker
Conative
• attempts to elicit some behavior from the addressee
Phatic
• builds a relationship between both parties in a conversation
Meta-lingual
• self-references
Poetic
• focuses on the text independent of reference
(from R. Jakobson)
• Sentences form macro-structures or superstructures of meaning. This includes structured language such as argumentation, negotiation, news, narrative, and explanations.
• What are the components (elements) and structure of discourse? For instance, how are messages structured to make them clear for listeners?
• Given-New
Bill (a person you know) went to the store (is in a new location)
• Theme-Rheme
When in Rome (theme), do as the Romans do (rheme)
• Toulmin has proposed a general structure for arguments
[Figure: Toulmin argument structure with elements Grounds, Claim, Evidence, Rebuttal]
• There are a lot of complex structured verbal interactions
– Legal arguments
– Design rationale
– Negotiations
• What an explanation consists of
– Two types of phenomena being explained
• Causal antecedents
– How do we explain the American Civil War?
• Sub-processes
– How does a gasoline engine work?
– Background for the person receiving the explanation needs to be considered.
• (Goals + Events + Resolution) + Characters
• Many stories seem highly structured
– Some stories seem so structured that they have been described as “story grammars”. This is most notably true of Russian fairy tales.
• Many stories also reflect familiar human quandaries
– “Romeo and Juliet”
• Interactive and dynamic narrative (useful in games)
– Could we become a player in an interactive “Romeo and Juliet”?
• Conversation adds a social and interactive component to language
• Conversational norms (Maxims)
• Truthful, informative, relevant, clear
• But these are routinely violated
• Managing conversations
– Opening / Closing
– Turn taking
• In Native American councils, the person holding the talking stick controlled the floor