Artificial Intelligence CS 165A  Miscellaneous topics:  Final exam review

Artificial Intelligence
CS 165A
Thursday, December 6, 2007
 Miscellaneous topics:
creativity, computer vision, speech understanding
 Final exam review
Computer Vision
Making computers see
Creativity and intelligence
• What is the relationship between creativity and
intelligence?
– Are they separate and independent?
– Is one required for the other?
• Claims:
– Human-level intelligence requires creativity
– Computers can be creative
• Examples:
– Aaron
– Computers in art, drama, fiction, poetry, science, ....
http://www.aaai.org/AITopics/html/create.html
Emotion and intelligence
• What is the relationship between emotion and intelligence?
– Are they separate and independent?
– Is one required for the other?
• Claims:
– Human-level intelligence requires understanding and representing
emotion
– Computers can experience and communicate affect
• Examples:
– Affective Computing
http://affect.media.mit.edu/
Computer Vision
• What is computer vision?
– “Making computers see”
– “Extracting descriptions of the world from images or sequences of images”
(Figure text: “Nice sunset!”)
What Does Computer Vision Do?
3D models of objects
Object recognition
Navigation
Event/action recognition
…
Computer Vision
• Vision is easy, right? Just open your eyes!
– No, it’s a hard problem…
– Much of your very complex brain is devoted to doing vision
– It involves cognition, navigation, manipulation and learning
 Not just simple “match a feature vector to a database” tasks
• CV is about interpreting the content of images (incl. video)
– Field is ~40 years old
– Originally a child of AI
– Now closely related to several other research fields
 Pattern recognition, image processing, computer graphics,
learning, ....
CV goes from images to descriptions
• “Boats”
or
• “Outdoors”
or
• “A harbor with many dozens of
boats; water is calm and glassy;
masts are all vertical; mountains
in background, blue sky with a
touch of clouds…”
Why is this hard?
Vision Transforms From This…
[Figure: the image shown as a grid of raw pixel intensities, 1,800 two-digit hexadecimal values (apparently 40 rows of 45), beginning 01 03 03 00 00 00 02 02 02 01 …]
To this…
• Objects
– Cat, chair, window, star, bush, water, a shoe, my mother…
• Properties
– Big, bright, yellow, fast, graspable, moving…
• Relations
– In front, behind, on top, next to, larger, closer, identical…
• Shapes
– Round, rectangular, star-shaped, symmetric…
• Textures
– Rough, smooth, irregular…
• Movement
– Turning, looming, rolling…
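To make the gap between the two slides concrete: to a program, an image is only an array of intensities. The sketch below uses a tiny made-up 4×6 grid (not the slide's data) and an assumed brightness threshold to find the bounding box of the bright region, which is about the simplest "description" one can extract. The names `parse_image` and `bright_bounding_box` are illustrative only.

```python
# Toy 4x6 image as hex intensities (made-up values, not the slide's data).
HEX_ROWS = [
    "01 03 02 00 01 02",
    "02 9F A8 AD 03 01",
    "00 A6 B5 B1 02 00",
    "01 02 03 01 00 02",
]

def parse_image(rows):
    """Convert rows of space-separated hex bytes into lists of ints."""
    return [[int(token, 16) for token in row.split()] for row in rows]

def bright_bounding_box(image, threshold=0x80):
    """Return (top, left, bottom, right) of pixels >= threshold, or None."""
    coords = [(r, c)
              for r, row in enumerate(image)
              for c, value in enumerate(row)
              if value >= threshold]
    if not coords:
        return None
    row_ids = [r for r, _ in coords]
    col_ids = [c for _, c in coords]
    return (min(row_ids), min(col_ids), max(row_ids), max(col_ids))

image = parse_image(HEX_ROWS)
print(bright_bounding_box(image))  # (1, 1, 2, 3)
```

Going from this box to "a harbor with boats" is the hard part that the rest of the field addresses.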
Aims of Computer Vision
• Automate visual perception
• Construct scene descriptions from images
• Make useful decisions about real physical objects and
scenes based on sensed images
• Produce symbolic (perhaps task-dependent) descriptions
from images
• Produce from images of the external world a useful
description that is not cluttered with irrelevant information
• Support tasks that require visual information
Some applications of computer vision
• Photogrammetry, GIS
– Commercial, military, government
• Robotics
– Industrial, military, medical, space, entertainment
• Inspection, measurement
• Medical imaging
– Automatic detection, outlining, measurement
• Graphics and animation, special effects
• Surveillance and security
• Multimedia database indexing and retrieval, compression
• Human-computer interaction (vision-based interfaces, VBI)
Progress in Computer Vision
• First generation: Military/Early Research
– Few systems, each custom-built, cost $Ms
– “Users” have PhDs
– 1 hour per frame
• Second generation: Industrial/Medical
– Numerous systems, 1-1000 of each, cost $10Ks
– Users have college degree
– Real-time (RT) with special hardware
• Third generation: Consumer
– 100000(00) systems, cost $100s
– Users have little or no training
– Real-time in software
Examples
Viisage
FaceExplorer
CMU NavLab
Cognex
Speech Understanding
Speech recognition +
Natural language processing
Speech Understanding
• Language-based technologies have potentially many useful
applications
– Automatic translation
– Database query
– Computer interfaces
– etc.
• Two main components
– Speech recognition (SR or ASR)
– Natural language processing (NLP or NLU)
• SR + NLP = Speech Understanding (SU)
Speech and language technologies
Input:
• Speech recognition converts an audio signal to a set of
words
– A.k.a. “Automatic speech recognition” (ASR)
• Language processing derives meaning from the words
Output:
• Language generation converts concepts to natural language
• Speech generation converts text to audible speech
– A.k.a. “Text-to-speech” (TTS)
Note: Speech recognition by itself doesn’t solve much…
Speech recognition
• Converts a digital audio signal to a set of words
• Typical output: N-best words or complete sentences, based
on a frequency table or predefined grammar [Hypotheses]
• Linguistic constraints: e.g., statistical grammar, context-free grammar… [Filter]
• Trend – more linguistic information used in word
recognition
• Coarticulation problem (“Did chew no?”)
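The hypothesize-then-filter idea above can be sketched as N-best rescoring: the recognizer proposes several word sequences with acoustic scores, and a statistical grammar reorders them. Everything numeric here is made up for illustration (the acoustic scores, the bigram table, the unseen-bigram penalty), and `rescore` is a hypothetical helper, not part of any real recognizer.

```python
# Hypothetical acoustic log-scores for two competing hypotheses (made up).
N_BEST = [
    ("how to wreck a nice beach", -10.0),
    ("how to recognize speech", -10.5),
]

# Toy bigram log-probabilities standing in for a statistical grammar.
BIGRAM_LOGP = {
    ("how", "to"): -0.5,
    ("to", "recognize"): -2.0,
    ("recognize", "speech"): -1.0,
    ("to", "wreck"): -5.0,
    ("wreck", "a"): -2.0,
    ("a", "nice"): -2.0,
    ("nice", "beach"): -3.0,
}
UNSEEN_LOGP = -8.0  # assumed penalty for bigrams missing from the table

def language_score(sentence):
    """Sum bigram log-probabilities over adjacent word pairs."""
    words = sentence.split()
    return sum(BIGRAM_LOGP.get(pair, UNSEEN_LOGP)
               for pair in zip(words, words[1:]))

def rescore(n_best, lm_weight=1.0):
    """Sort hypotheses by acoustic score plus weighted language score."""
    scored = [(text, acoustic + lm_weight * language_score(text))
              for text, acoustic in n_best]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

print(rescore(N_BEST)[0][0])                 # how to recognize speech
print(rescore(N_BEST, lm_weight=0.0)[0][0])  # how to wreck a nice beach
```

With the language model switched off, the acoustically better but less plausible sentence wins, which is the point of adding linguistic constraints.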
Natural language processing (NLP)
• NLP uses tools from formal language theory to process
natural language
– Grammar, syntax, etc.
• NLP traditionally has been syntax-driven – start with a
complete syntactic analysis accounting for all the words in
the utterance
• This doesn’t work so well with spoken material
– Unknown words, novel linguistic constructs, recognition errors,
false starts, disfluencies…
• Alternative: more semantic-driven approaches
– Word spotting: looking for key words and phrases
– Almost all practical systems use some version of this
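A minimal sketch of word spotting, assuming a hypothetical flight-information domain: instead of parsing the whole utterance, the system scans for known key words and ignores everything else, so disfluencies and false starts do no harm. The keyword table and slot names are invented for the example.

```python
import re

# Hypothetical keyword table for a toy flight-information domain.
KEYWORDS = {
    "intent": {"book": "BOOK_FLIGHT", "cancel": "CANCEL", "status": "FLIGHT_STATUS"},
    "city": {"boston": "BOS", "denver": "DEN", "chicago": "ORD"},
}

def spot_words(utterance):
    """Return slot -> list of values for keywords found; ignore all other words."""
    tokens = re.findall(r"[a-z']+", utterance.lower())
    found = {}
    for slot, table in KEYWORDS.items():
        hits = [table[token] for token in tokens if token in table]
        if hits:
            found[slot] = hits
    return found

print(spot_words("uh I'd like to um book a a flight to Den- Denver please"))
# {'intent': ['BOOK_FLIGHT'], 'city': ['DEN']}
```

No grammar is consulted, so the repeated "a", the fillers, and the fragment "Den-" simply fall through.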
Problems with speech understanding
• Spontaneous dialogue is replete with
– Improper grammar
– Disfluencies (“Uh”, “Um”, word fragments)
– Interruptions (Barge-in)
– Confirmation
– Clarification
– Ellipsis (“and so on”)
– Sentence fragments
– Colloquialisms (“Kill two birds with one stone”)
– Slang (“He ain’t got no dough”)
Examples
• “How to recognize speech”
or
• “How to wreck a nice beach”
• “I saw the Grand Canyon flying to Colorado.”
Holy Grail of Speech Understanding Systems
• Low quality input (microphone, transmission, placement of
microphone)
• Noisy environment
• Continuous speech
• Speaker independent
• Large vocabulary (many 1000’s)
• Unrestricted grammar
• Domain independent
• Handles accents, non-native speakers
• Understands and generates natural prosody
• Understands and generates backchannel communication
Final Exam Review
Final Exam
• Friday, December 12, 4-7pm, HERE
• Allowed: An 8.5x11” sheet of paper, writing on both sides
• Bring a calculator
• Don’t need paper
• Some equations, etc. will be given (those on the midterm
and more) – posted on web soon
• The exam covers the whole quarter, with more emphasis
on material since the midterm
• Responsible for reading, lectures, homeworks, quizzes
• Like midterm in form
Introduction
• What is AI? What are its goals?
• What are its foundations?
• How does it relate to other areas of study?
• What do AI people do?
• Strong AI, Weak AI
• Thought vs. action, human vs. ideal
• Three parts of a typical AI program
– Data / knowledge (“knowledge base”)
– Operations / rules (“production rules”)
– Control
Intelligent agents
• Different kinds of agents
• Our goal: Ideal rational agents
• PEAS
– Performance measure, Environment, Actuators, Sensors
Problem solving and search
• Problem formulation, abstraction
• State space representations
• State space vs. search tree
• Branching factor
• Search criteria
– Completeness, optimality, time and space complexity
• Blind search methods
– Depth first
– Breadth first
– Uniform cost
– Depth limited
– Iterative deepening
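As a review aid, here is a minimal sketch of depth-limited search and iterative deepening on a small made-up state space. The graph, the function names, and the `max_depth` default are assumptions for illustration.

```python
# Hypothetical toy state space as an adjacency list.
GRAPH = {
    "A": ["B", "C"],
    "B": ["D", "E"],
    "C": ["F"],
    "D": [],
    "E": ["G"],
    "F": ["G"],
    "G": [],
}

def depth_limited(graph, node, goal, limit, path=None):
    """Depth-first search that never goes more than `limit` edges below `node`."""
    path = (path or []) + [node]
    if node == goal:
        return path
    if limit == 0:
        return None
    for child in graph[node]:
        if child in path:  # skip states already on the current path
            continue
        result = depth_limited(graph, child, goal, limit - 1, path)
        if result is not None:
            return result
    return None

def iterative_deepening(graph, start, goal, max_depth=20):
    """Run depth-limited search with limits 0, 1, 2, ... until a path is found."""
    for limit in range(max_depth + 1):
        result = depth_limited(graph, start, goal, limit)
        if result is not None:
            return result
    return None

print(iterative_deepening(GRAPH, "A", "G"))  # ['A', 'B', 'E', 'G']
```

Iterative deepening keeps the linear memory use of depth-first search while, like breadth-first search, finding a shallowest goal.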
Search (cont.)
• Heuristic (informed) search
– Best-first search
 Greedy best-first search
 A* search
– Memory bounded search
 IDA*
 SMA*
– Iterated improvement algorithms
 Hill-climbing
 Simulated annealing
• Admissible heuristics
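A short sketch of A* search, assuming a 4-connected grid with unit step costs and Manhattan distance as the heuristic. Manhattan distance never overestimates the remaining cost on such a grid, so it is admissible. The grid layout and function name are made up for the example.

```python
import heapq

def a_star(grid, start, goal):
    """A* on a 4-connected grid; '#' cells are walls. Returns step count or None."""
    def h(cell):
        # Manhattan distance: admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    frontier = [(h(start), 0, start)]  # entries are (f = g + h, g, cell)
    best_g = {start: 0}
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell == goal:
            return g
        if g > best_g.get(cell, float("inf")):
            continue  # stale queue entry; a cheaper route was found later
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != "#":
                new_g = g + 1
                if new_g < best_g.get((nr, nc), float("inf")):
                    best_g[(nr, nc)] = new_g
                    heapq.heappush(frontier, (new_g + h((nr, nc)), new_g, (nr, nc)))
    return None

GRID = [
    "S.#.",
    "..#G",
    "....",
]
print(a_star(GRID, (0, 0), (1, 3)))  # 6
```

The heuristic estimates 4 steps from the start, but the wall forces a 6-step detour; an admissible estimate may be too low, never too high.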
Adversarial search
• Game playing
• Minimax algorithm
• Alpha-beta pruning
• Non-deterministic games
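For review, a compact sketch of minimax with alpha-beta pruning on a game tree written as nested lists with numeric leaves. The tree values are an illustrative example, and the optional `visited` list exists only to show which leaves get evaluated.

```python
import math

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf, visited=None):
    """Return the minimax value of `node`, pruning branches that cannot matter."""
    if not isinstance(node, list):  # leaf: a static evaluation
        if visited is not None:
            visited.append(node)
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, value)
            if alpha >= beta:  # MIN above will never allow this branch
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, value)
        if alpha >= beta:  # MAX above already has something at least as good
            break
    return value

TREE = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]  # MAX at the root, MIN below
seen = []
print(alphabeta(TREE, True, visited=seen))  # 3
print(seen)                                 # [3, 12, 8, 2, 14, 5, 2]
```

The leaves 4 and 6 are never examined: once the second MIN node sees 2, it cannot beat the 3 that MAX already has.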
Knowledge and reasoning
• Logic, syntax, semantics
• Knowledge base, inference engine
• Inference vs. entailment
• Propositional logic
• Satisfiable, unsatisfiable, valid sentences
• Sound inference
• Inference rules:
– Modus ponens, and-introduction, ..., resolution
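One way to review the terms above is a truth-table check: enumerate every model and see whether a sentence is true in all of them (valid), some (satisfiable), or none (unsatisfiable). In the sketch below, sentences are represented as Python functions over a model dictionary, which is an assumption made for brevity. Modus ponens is sound because the corresponding sentence ((P ⇒ Q) ∧ P) ⇒ Q is valid.

```python
from itertools import product

def classify(sentence, symbols):
    """Label a sentence 'valid', 'satisfiable', or 'unsatisfiable' by model enumeration."""
    results = [sentence(dict(zip(symbols, values)))
               for values in product([False, True], repeat=len(symbols))]
    if all(results):
        return "valid"
    if any(results):
        return "satisfiable"
    return "unsatisfiable"

def implies(a, b):
    """Material implication: a => b."""
    return (not a) or b

# Modus ponens written as a single sentence: ((P => Q) and P) => Q
def modus_ponens(m):
    return implies(implies(m["P"], m["Q"]) and m["P"], m["Q"])

print(classify(modus_ponens, ["P", "Q"]))                  # valid
print(classify(lambda m: m["P"] or m["Q"], ["P", "Q"]))    # satisfiable
print(classify(lambda m: m["P"] and not m["P"], ["P"]))    # unsatisfiable
```

Enumeration takes 2^n models for n symbols, which is why inference rules such as resolution matter in practice.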
First order logic
• FOL syntax and semantics
– Universal and existential quantifiers
• Conversion to CNF and INF
• Inference in FOL
• Universal instantiation, existential instantiation, existential
introduction
– SUBST( )
• Generalized modus ponens
• Unification, skolemization
• Generalized resolution (2 versions)
– Disjunctions
– Implications
Knowledge Representation
• The frame problem
• Situation calculus
– Result(A, S)
• Other ways to deal with time
– Event calculus, generalized events, etc.
Probabilistic reasoning and belief nets
• Joint and conditional probabilities, marginalization, Bayes’
rule, independence
• Constructing belief networks
• Computing queries on belief networks
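A worked sketch of a query on the smallest possible belief network, Disease → Test, using made-up numbers. The joint probability is the product of each node's conditional probability given its parent, the evidence term comes from marginalizing over Disease, and the posterior follows from Bayes' rule. All names and probabilities are assumptions for illustration.

```python
# Hypothetical two-node network: Disease -> Test (all numbers made up).
P_DISEASE = 0.01                          # prior P(D = true)
P_TEST_GIVEN = {True: 0.90, False: 0.05}  # P(T = true | D)

def joint(disease, test):
    """P(D = disease, T = test) = P(D) * P(T | D)."""
    p_d = P_DISEASE if disease else 1 - P_DISEASE
    p_t = P_TEST_GIVEN[disease] if test else 1 - P_TEST_GIVEN[disease]
    return p_d * p_t

def posterior_disease(test=True):
    """P(D = true | T = test), normalizing by the marginal P(T = test)."""
    evidence = sum(joint(d, test) for d in (True, False))  # marginalize out D
    return joint(True, test) / evidence

print(round(posterior_disease(True), 4))  # 0.1538
```

Even with a 90% accurate positive test, the posterior is only about 15% because the prior is so low.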
Big questions in AI
• Is strong AI possible?
• How do minds work?
• The Turing test
• The Chinese room
• Simulation vs. reality
Misc.
• Nothing on robotics, vision, or speech understanding