Why are computers so stupid and what can be done about it?

Artificial intelligence
and commonsense knowledge
Ernest Davis
Science on Saturday
March 3, 2012
Two well-known truths about
computers
Computers are great and amazing and a lot
of fun to deal with.
Computers are stupid and frustrating and it
can be a huge amount of work to get what
you want out of them.
Chess
• Computers can play chess better than the
greatest chess masters.
But
• They’re no use for answering a question
like “Give me an example where White
can take Black’s queen, but if he does,
Black can immediately checkmate.”
Physics
• Computers can compute the interaction of
two galaxies colliding.
• Wolfram Alpha can answer “How far was
Jupiter from Saturn on Dec. 17, 1604?”
But
• They’re no use for answering a question
like, “Can you ever get a solar eclipse one
day and a lunar eclipse the next day?”.
• You can’t answer the question “When is
the next sunrise over crater Aristarchus?”
faster than an 18th-century astronomer could.
Movies
You can get a
list of all Frank
Capra’s movies,
but no computer
can answer the
question,
“What are Ellie
and Peter doing
here?”
Textual Analysis
Computers can tell which of the Federalist
Papers were written by Hamilton and
which by Madison.
But
They can’t answer the question “Why is
Federalist No. 54 no longer directly relevant?”
(It discussed the 3/5 rule for slaves.)
A different kind of computer
stupidity
(Boring, banal example, but that’s the point.)
NYU “upgraded” its software for student
registration. Endless problems.
We want to have a rule that a student can
register for at most 4 classes.
Answers: Can’t be done/Will cost a lot of
money.
Incompetent software engineering.
Artificial Intelligence
“Then somehow it achieved self-awareness,
and in a few nanoseconds had enslaved
the human race.”
Many tasks that are very easy for people are
extremely difficult for computers:
Vision
Natural Language
Operating in a rich environment (kitchen).
Simple reasoning (chess question)
Why is Natural Language Hard?
Many reasons. One of the hardest aspects
is ambiguity.
Lexical disambiguation:
“This gift is for Stuart.”
“This gift is for Christmas.”
“This bowl is for soup.”
The O.E.D. lists 36 primary meanings of “for”,
with more than 100 subcategories.
Ambiguity is ubiquitous
The juiciest prize is to become the face of a
luxury brand such as Dior or Burberry. To
have any chance, a model must first have
magazine shoots under her designer belt.
This fact allows fashion magazines to pay
peanuts, even for a cover-shoot.
"The beauty business", The Economist, Feb.
11, 2012.
Ambiguous words
The juiciest prize is to become the face of a
luxury brand such as Dior or Burberry. To
have any chance, a model must first have
magazine shoots under her designer belt.
This fact allows fashion magazines to pay
peanuts, even for a cover-shoot.
Color key on the slide:
Black – unambiguous word.
Blue – ambiguous word, used in its most frequent meaning.
Red – ambiguous word, not used in its most frequent meaning.
Reference disambiguation:
Winograd schemas
“Jane knocked on Susan’s door, but she
didn’t answer.”
“Jane knocked on Susan’s door, but she
didn’t get an answer”
“The trophy doesn’t fit in the suitcase,
because it’s too small.”
“The trophy doesn’t fit in the suitcase,
because it’s too large.”
Why is computer vision hard?
• Two images of the same thing may be
very different depending on viewpoint,
lighting, etc.
• Two things in the same category may be
geometrically very different.
• Context is used to interpret objects for
which there is actually very little image
information.
Concert / party. Warmish spring afternoon. Street
in suburban neighborhood. Wooded hill behind.
Canvas awning above, mended with duct tape.
Large mug front center. People in back.
Messy kitchen
Bottle is empty
Fridge in corner
Toaster oven, paper
towel on counter.
Plant is hung from
curtain rod.
Daytime
Two approaches to artificial
intelligence
• Corpus-based machine learning
• Knowledge-based techniques
Corpus-based machine learning
You have:
• Large body of data (text, pictures, etc.)
• A task
Find many patterns of superficial features
that are relevant to the task.
Determine how to combine them to carry out
the task.
Critical: These are done without any real
understanding of the task or content.
Notable successes
• Speech understanding
– Automatic dictation
– SIRI etc.
• Google translate
• Autonomous vehicle
• Automatic check reading
Automatic dictation
Start with: Corpus of recorded speech and
transcription
Extract patterns/rules:
• Sounds => phonemes
• Sequence of phonemes => word
• Common sequences of words
All labelled with probabilities
Compute: Most probable interpretation of a
sequence of sounds.
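The last step can be sketched in a few lines. This is a toy illustration, not the method of any real dictation system: the candidate transcriptions, the acoustic scores, and the word-pair probabilities are all invented, and the sound-to-phoneme and phoneme-to-word stages are collapsed into a single made-up `acoustic` table.

```python
import math

# Hypothetical acoustic scores: P(sounds | word sequence). Invented numbers.
acoustic = {
    ("recognize", "speech"): 0.4,
    ("wreck", "a", "nice", "beach"): 0.6,
}

# Hypothetical word-pair model: P(word | previous word); "<s>" marks the start.
bigram = {
    ("<s>", "recognize"): 0.01,
    ("recognize", "speech"): 0.2,
    ("<s>", "wreck"): 0.001,
    ("wreck", "a"): 0.1,
    ("a", "nice"): 0.02,
    ("nice", "beach"): 0.01,
}


def log_score(words):
    """Log of P(sounds | words) * P(words) under the toy tables above."""
    total = math.log(acoustic[words])
    previous = "<s>"
    for word in words:
        total += math.log(bigram[(previous, word)])
        previous = word
    return total


def most_probable(candidates):
    """Return the candidate transcription with the highest combined score."""
    return max(candidates, key=log_score)


best = most_probable(list(acoustic))
```

With these invented numbers the acoustically weaker candidate wins, because its word sequence is far more common. Nothing in the computation involves knowing what the words mean.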
Google translate
Start with:
• French/English dictionary
• Information about grammars
• “Bi-texts” e.g. Canadian parliamentary
proceedings.
Extract:
• Translations of words to words or phrases
to phrases, with probabilities.
• Rules for reorganizing sentence structure.
Limitations of Corpus-Based
Approach
• Task-specific. Learning to translate French
does not enable the program to answer
questions about a story in French.
• Corpus limitations. If your corpus is
Parliamentary proceedings, you end up
with a Parliamentary vocabulary.
• Data limitation. No huge corpus of bitexts.
• Errors can be weird.
Google translate: To French
and back
Original:
The juiciest prize is to become the face of a
luxury brand such as Dior or Burberry. To
have any chance, a model must first have
magazine shoots under her designer belt.
This fact allows fashion magazines to pay
peanuts, even for a cover-shoot.
After the round trip:
The price [sic] is more juicy to become the
face of a luxury brand like Dior and Burberry.
To have a chance, a model must first be
magazine shoots under his belt designer.
This fact can pay peanuts fashion magazines,
even coverage for rickshaws.
Google translate: To Japanese
and back
Original:
The juiciest prize is to become the face of a
luxury brand such as Dior or Burberry. To
have any chance, a model must first have
magazine shoots under her designer belt.
This fact allows fashion magazines to pay
peanuts, even for a cover-shoot.
After the round trip:
The juicy prize is to be a face of brands such
as Dior and luxury, such as Burberry. In order
to have any chance, the model must have the
shooting of her first magazine under designer
belt. This fact, and further, fashion magazines,
you can pay peanuts cover shoot.
Google translate: To Azerbaijani
and back
Original:
The juiciest prize is to become the face of a
luxury brand such as Dior or Burberry. To
have any chance, a model must first have
magazine shoots under her designer belt.
This fact allows fashion magazines to pay
peanuts, even for a cover-shoot.
After the round trip:
The juiciest a premium luxury brands such as
Dior, or to face Burberry. For any chance, a
model should be the first in the bottom of the
magazine tumurcuqlar designer belt. This fact
is even, the fashion magazines to pay peanuts
cover-shoot.
Knowledge-based approach
• Determine the knowledge needed for
reasoning in a domain.
• Develop a notation that is clearly defined
and that can express that knowledge.
• Encode all the domain knowledge.
• Find ways to automate reasoning with this
knowledge.
• Integrate the knowledge with the task.
The knowledge involves deep features of the
domain and task. It is constructed manually.
Commonsense Knowledge
The knowledge about the world that everyone has
by age 7.
Learned by living in the world, not book-learning
Time, Space, Physical objects, People, Animals
and Plants …
“If an open bottle full of liquid is turned upside
down, the contents will pour out.”
“Hitting someone will not make them like you.”
“An animal is the same species as its parents.”
So obvious that it’s not worth talking about.
“The trophy doesn’t fit in the suitcase
because it’s too large”.
Interpretation of “trophy is too large”:
The trophy does not fit in the suitcase, and
any larger trophy will also not fit, but some
smaller trophy would fit.
Interpretation of “suitcase is too large”:
The trophy does not fit in the suitcase and
would not fit in any larger suitcase, but
would fit in some smaller suitcase.
Fact
If an object fits in a container, it fits in any
larger container.
So we can rule out the second reading.
In logical notation:
∀o,c1,c2 FitsIn(o,c1) ∧ Larger(c1,c2) ⇒ FitsIn(o,c2)
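The argument can be checked mechanically in a toy model. The sketch below rests on assumptions that are not in the slides: every trophy and suitcase is reduced to a single whole-number size from 1 to 5, and an object fits when it is no larger than the container. The rule is read, as in the English statement of the fact, with c2 as the larger container.

```python
from itertools import product

SIZES = range(1, 6)  # hypothetical one-dimensional sizes


def fits_in(obj, container):
    # Toy model (an assumption): an object fits if it is no larger than the container.
    return obj <= container


def rule_holds():
    # The fact: if o fits in c1 and c2 is larger than c1, then o fits in c2.
    return all(fits_in(o, c2)
               for o, c1, c2 in product(SIZES, repeat=3)
               if fits_in(o, c1) and c2 > c1)


def trophy_too_large(trophy, suitcase):
    # Does not fit, no larger trophy fits, but some smaller trophy fits.
    return (not fits_in(trophy, suitcase)
            and all(not fits_in(t, suitcase) for t in SIZES if t > trophy)
            and any(fits_in(t, suitcase) for t in SIZES if t < trophy))


def suitcase_too_large(trophy, suitcase):
    # Does not fit, fits in no larger suitcase, but fits in some smaller suitcase.
    return (not fits_in(trophy, suitcase)
            and all(not fits_in(trophy, s) for s in SIZES if s > suitcase)
            and any(fits_in(trophy, s) for s in SIZES if s < suitcase))


pairs = list(product(SIZES, repeat=2))
reading_1 = [p for p in pairs if trophy_too_large(*p)]
reading_2 = [p for p in pairs if suitcase_too_large(*p)]
```

In this toy model `reading_2` comes out empty while `reading_1` does not, which matches the conclusion that “it” must refer to the trophy. A real system would need a much richer account of shape and containment than a single number.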
Commonsense spatial reasoning
“Jane knocked on Susan’s door, but she
didn’t [get an] answer.”
Much more difficult:
• Social interactions are more complex than
geometry.
• Narrative coherence, rather than
plausibility. Neither woman answered or
got an answer.
How far have we gotten?
A lot is known about representing: Ontology,
general reasoning methods, time.
A fair amount is known about: Space,
knowledge and belief, interactions
between people, plans and goals.
A little is known about: physical reasoning.
Not much is known about other categories.
Successes
• Planning. Mars rover.
• Debugging. Find quite subtle bugs in very
complex programs (operating systems,
aircraft control, etc.) and hardware design.
• Theorem proving: A couple of original
mathematical theorems have been proven.
Obstacles
• Commonsense reasoning is a small,
complex part of any AI task.
• Little payoff until there is a lot of
commonsense knowledge.
• Software development starts with simple
useful systems, and adds features. It is
unwelcoming to systems that need to be
very complex from the start.
• Shortcuts lead to chaos.
Combined approaches
• Information extraction from text (partial
success).
A suicide car bomber struck at the gates of Baghdad’s
police academy Sunday afternoon, as recruits were
leaving the compound, punctuating weeks of relative
calm here after a particularly violent January.
Extract:
Event: TerroristAttack. Place:Baghdad.
Date: 2/19/12 PM. Method: CarBomb
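A minimal sketch of the pattern-matching side of such an extractor, assuming a hand-written keyword pattern and a small list of known place names. Both are hypothetical and invented for this example. The date is left out, because resolving “Sunday afternoon” requires knowing when the article was published.

```python
import re

SENTENCE = ("A suicide car bomber struck at the gates of Baghdad's police "
            "academy Sunday afternoon, as recruits were leaving the compound.")

# Hypothetical hand-written resources; a usable system would need far more.
METHOD_PATTERNS = {"CarBomb": re.compile(r"\bcar bomb(?:er|ing)?\b", re.IGNORECASE)}
KNOWN_PLACES = ["Baghdad", "Kabul"]


def extract_event(text):
    """Fill a small event template from surface patterns in the text."""
    event = {}
    for method, pattern in METHOD_PATTERNS.items():
        if pattern.search(text):
            event["Event"] = "TerroristAttack"
            event["Method"] = method
    if event:
        # Only look for a place once some event pattern has matched.
        for place in KNOWN_PLACES:
            if place in text:
                event["Place"] = place
    return event


result = extract_event(SENTENCE)
```

The sketch fills the Event, Method, and Place slots for this sentence, but it would be fooled by a sentence such as “No car bomber struck Baghdad this month.” That is the kind of gap the knowledge-based side is meant to close.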
Combined approaches
• In a street photograph, a human must be
at street level, not floating next to
windows. (Alyosha Efros, CMU)
The Zipf Distribution: Bane of AI
• AKA: Inverse power distribution, long tail,
fat tail.
The kth [largest/most common] item has
[size/frequency] proportional to 1/k^α, where
1 ≤ α ≤ 2.5 or so.
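To get a feel for the shape of the distribution, the sketch below normalizes the weights 1/k^α into shares. The vocabulary size of 100,000 and the choice α = 1 are assumptions made for illustration, not figures from the talk.

```python
def zipf_shares(n_items, alpha=1.0):
    """Share of the total held by the kth item when weight is 1 / k**alpha."""
    weights = [1.0 / k ** alpha for k in range(1, n_items + 1)]
    total = sum(weights)
    return [w / total for w in weights]


shares = zipf_shares(100_000, alpha=1.0)  # hypothetical vocabulary size
top_20 = sum(shares[:20])                 # share held by the 20 largest items
tail = sum(shares[10_000:])               # share held by items ranked below 10,000
```

Under these assumptions the top 20 items hold roughly 30 percent of the total, and the 90,000 lowest-ranked items together still hold a noticeable share (close to a fifth). That is the long tail.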
Zipf’s law: Lots of things follow the Zipf
distribution: Income, city population,
number of inlinks, number of occurrences
of a word …
Consequences of the Zipf
distribution
A few rich people have most of the money.
A significant fraction of words in a corpus
appear very rarely (long tail)
In the British National Corpus (10^8 words),
the 20 most common words account for 28% of
the tokens.
0.5% of the tokens are words that occur only
once.
2.3% are words that occur no more than 20
times.
Reducing the miss rate by a fixed
percentage requires reading an
exponentially increasing corpus.
E.g. to reduce the miss rate by 5%, you
have to double the size of the corpus.
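A quick back-of-the-envelope calculation shows how fast this grows. It assumes, as one reading of the example, that the 5% is a relative reduction and that each further doubling of the corpus removes another 5% of whatever miss rate remains.

```python
import math


def doublings_needed(target_fraction, reduction_per_doubling=0.05):
    """Corpus doublings needed for the miss rate to fall to target_fraction of its start."""
    per_doubling = math.log(1.0 - reduction_per_doubling)
    return math.ceil(math.log(target_fraction) / per_doubling)


halving = doublings_needed(0.5)   # doublings needed to cut the miss rate in half
growth = 2 ** halving             # resulting growth factor for the corpus
```

Under that assumption, halving the miss rate takes 14 doublings, that is, a corpus about 16,000 times larger.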
Getting mediocre “promising” results is easy.
Getting good results is a lot of work.
Getting really excellent results is a huge
amount of work.
Why this is bad for AI
• Machine learning: Hard to get all the
patterns e.g. all sequences of three words
that may occur.
• Knowledge-based systems: Hard to get all
the facts you may need.
How to proceed
• Look in depth at a variety of different
domains.
• Get good solutions to basic issues
• Natural language texts must be used with
caution.
• Patience. This is a large, difficult project,
which may take centuries.