A very short introduction to Natural Language Generation
Kees van Deemter
Computing Science
University of Aberdeen
Language Technology
[Diagram: Natural Language Understanding maps Text to Meaning; Natural Language Generation maps Meaning to Text; Speech Recognition maps Speech to Text; Speech Synthesis maps Text to Speech.]
First: NLG from a practical perspective
• Goal: Use computers to express information in human-accessible form.
• Input: Some non-linguistic representation of information (e.g., tables in a database, logical formulas, Java code, ...).
• Output: Documents, reports, explanations, help messages, ... in some human language (Chinese, English, Dutch).
• Knowledge sources required: Knowledge of language and of the domain; maybe of the intended audience as well.
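The mapping from non-linguistic input to text in a human language can be illustrated with a minimal template-based sketch. Everything in it is an assumption made for illustration: the train-departure record, the field names, and the two sentence templates are hypothetical and do not come from any system discussed in these slides.

```python
# Hypothetical input: one row of a database table (illustrative only).
record = {"destination": "Edinburgh", "platform": 3, "delay_min": 10}

# Knowledge of language, reduced to one sentence template per output language.
TEMPLATES = {
    "en": "The train to {destination} leaves from platform {platform} "
          "with a delay of {delay_min} minutes.",
    "nl": "De trein naar {destination} vertrekt van spoor {platform} "
          "met een vertraging van {delay_min} minuten.",
}


def generate(row, lang):
    """Express one database row as a sentence in the requested language."""
    return TEMPLATES[lang].format(**row)


print(generate(record, "en"))
print(generate(record, "nl"))
```

Templates like these only work when every document has the same shape. The choices discussed later (content, order, wording) are exactly what a fixed template cannot make.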
Example System: FoG
• Function: Produces textual weather reports in English and French
• Input: Graphical/numerical weather depiction
• User: Environment Canada (Canadian Weather Service)
• Developer: CoGenTex [Kittredge, Goldberg and Driedger 1994]
• Status: Fielded, in operational use since 1992
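To give a feel for what turning numerical weather data into text involves, here is a small sketch. It is not FoG's actual method: the function `describe_wind`, the input format, the 10-knot threshold, and the wording are all assumptions chosen for illustration. The sketch reports the initial wind and then mentions only changes that seem worth reporting.

```python
def describe_wind(readings, threshold=10):
    """Summarise wind readings, mentioning only significant changes.

    readings: list of (period, direction, speed_knots) tuples in time order.
    threshold: hypothetical minimum speed change (knots) worth reporting.
    """
    period, direction, speed = readings[0]
    phrases = [f"Winds {direction} {speed} knots {period}"]
    last_dir, last_speed = direction, speed
    for period, direction, speed in readings[1:]:
        if direction != last_dir:
            phrases.append(f"shifting to {direction} {speed} knots {period}")
        elif abs(speed - last_speed) >= threshold:
            verb = "increasing" if speed > last_speed else "diminishing"
            phrases.append(f"{verb} to {speed} knots {period}")
        else:
            continue  # change too small to mention
        last_dir, last_speed = direction, speed
    return ", ".join(phrases) + "."


readings = [
    ("this morning", "northwest", 15),
    ("this afternoon", "northwest", 18),
    ("this evening", "northwest", 30),
    ("overnight", "west", 30),
]
print(describe_wind(readings))
```

Note that the afternoon reading is silently dropped: deciding what not to say is already a content decision.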
FoG: Input
FoG: Output




Example System: Dial Your Disc (DYD)
• Function: Context-sensitive descriptions of Mozart’s instrumental music
• Input: Music database + history of interaction
• Target user: Music industry, customers for music-on-demand
• Developer: Philips Electronics (Nat Lab – IPO, Eindhoven; 1993-6) [Van Deemter & Odijk 1995]
• Status: Methods reused in GOALGETTER and other systems
• The user composes a home-made CD.
• A speech interface tells the system what type of music the user would like to add to the CD, e.g., “I’d like some piano music”, “I’m interested in solo performances” → “piano”, “solo”.
• The system chooses one composition with solo piano. The music starts. After a while, a text is spoken.
• The second time a piano sonata is selected, the following text might be generated:
Example of approximate output, in its most elaborate form:
“The following+ composition+, from which you are going to hear
a fragment+ of part three+, was written+ by Mozart in the
beginning+ of seventeen+ seventy+ five+, in Munich+. The work
is also+ a sonata+ in f+, like the preceding+ composition, but
now+ for piano+. The KV+ number of this work is K. two+ eight+
zero+. This sonata+ consists of three+ parts+: allegro assai+,
adagio+, and presto+. The presto lasts two+ minutes+ forty+
five+ seconds+. This presto is located on track six+ of first+ CD+
of volume seventeen+. The piano+ is played by Mitsuko
Uchida+. The recording+ of the sonata+ was made+ in the
Henry Wood+ Hall in London+, England, in the eighties+. The
quality+ of its recording is DDD+. The following+ is a fragment+
of the third+ part+.” [A fragment follows]
Each “+” marks a pitch accent on the preceding word
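The phrase “also a sonata in f, like the preceding composition, but now for piano” depends on the interaction history. The sketch below shows one way such context sensitivity could be obtained. It is not the DYD implementation: the function `describe`, the dictionary fields, and the three sentence patterns are assumptions for illustration, and pitch accents are not modelled.

```python
def describe(current, previous=None):
    """Describe a composition, contrasting it with the previous one if any."""
    base = f"a {current['genre']} in {current['key']}"
    plain = f"The work is {base} for {current['instrument']}."
    if previous is None:
        return plain
    same_form = (current["genre"], current["key"]) == (
        previous["genre"], previous["key"])
    same_instrument = current["instrument"] == previous["instrument"]
    if same_form and not same_instrument:
        # Shared properties are marked with "also", the difference with "but now".
        return (f"The work is also {base}, like the preceding composition, "
                f"but now for {current['instrument']}.")
    if same_form and same_instrument:
        return (f"The work is also {base} for {current['instrument']}, "
                f"like the preceding composition.")
    return plain


# Hypothetical database entries (illustrative values).
earlier = {"genre": "sonata", "key": "F", "instrument": "violin"}
now = {"genre": "sonata", "key": "F", "instrument": "piano"}
print(describe(now))           # no history: plain description
print(describe(now, earlier))  # with history: contrastive description
```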
When to use NLG?
When
• there are many potential documents to be written, differing according to the context (user, situation, language)
• there are some general principles behind document design.
Why is NLG hard?
• NLG involves many choices, e.g., which content to include, what order to say it in, what words to use.
• Linguistics does not yet provide us with a ready-made, precise theory about how to make such choices to produce coherent text.
Why does choice matter?
The Serbian Prime Minister, Zoran Djindjic, has been assassinated in the capital, Belgrade.
The pro-reform, pro-Western leader was shot in the stomach and in the back outside government offices at around 1300 (1200 GMT), and died of his wounds in hospital.
(BBC News, UK edition, 12/3/03)
Tasks and Architecture in NLG (Reiter 1994)
• Document Planning: Content Determination; Document Structuring
• Microplanning: Aggregation; Lexicalisation; Generation of Referring Expressions
• Surface Realisation: Linguistic Realisation; Physical Realisation
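The three stages can be sketched as a chain of functions. This is a toy illustration under stated assumptions, not Reiter's architecture in any detail: the fact format, the `PHRASES` lexicon, the fixed `ORDER`, and the default referring expression “the sonata” are all hypothetical, and Physical Realisation (layout or speech) is left out.

```python
# Hypothetical lexicon and ordering, for illustration only.
PHRASES = {"composer": "was written by {value}", "year": "in {value}"}
ORDER = ["composer", "year"]


def document_planning(facts, topic):
    """Content determination (select facts) and document structuring (order them)."""
    selected = [f for f in facts
                if f["subject"] == topic and f["attribute"] in ORDER]
    return sorted(selected, key=lambda f: ORDER.index(f["attribute"]))


def microplanning(messages, referring_expression):
    """Lexicalise each message and aggregate them under one shared subject."""
    phrases = [PHRASES[m["attribute"]].format(value=m["value"]) for m in messages]
    return {"subject": referring_expression, "predicate": " ".join(phrases)}


def surface_realisation(spec):
    """Linguistic realisation: linearise, capitalise, punctuate."""
    sentence = f"{spec['subject']} {spec['predicate']}"
    return sentence[0].upper() + sentence[1:] + "."


def generate(facts, topic, referring_expression="the sonata"):
    plan = document_planning(facts, topic)
    return surface_realisation(microplanning(plan, referring_expression))


facts = [
    {"subject": "K. 280", "attribute": "year", "value": "1775"},
    {"subject": "K. 280", "attribute": "composer", "value": "Mozart"},
    {"subject": "K. 281", "attribute": "composer", "value": "Mozart"},
]
print(generate(facts, "K. 280"))
```

Each function makes one kind of choice, which is the point of the staged architecture: content and order first, then words and references, then the surface string.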
Second perspective: NLG as a branch of linguistics
• NLG systems map ideas to words.
• Surely, this is linguistic territory! If linguists cannot say how the different stories about James Sportler differ, then who can?
• An NLG program might be seen as a model of language production (in terms of its output; the human production process may be very different).
• NLG is the smaller twin brother of NL Understanding.
• NLG poses deep theoretical problems about language and communication.
• NLG has great potential for applications.
• This course: Generation of Referring Expressions