Artificial Intelligence
CS 165A
Thursday, December 6, 2007

Miscellaneous topics: creativity, computer vision, speech understanding
Final exam review

Computer Vision
Making computers see

Creativity and intelligence
• What is the relationship between creativity and intelligence?
  – Are they separate and independent?
  – Is one required for the other?
• Claims:
  – Human-level intelligence requires creativity
  – Computers can be creative
• Examples:
  – Aaron
  – Computers in art, drama, fiction, poetry, science, ....
http://www.aaai.org/AITopics/html/create.html

Emotion and intelligence
• What is the relationship between emotion and intelligence?
  – Are they separate and independent?
  – Is one required for the other?
• Claims:
  – Human-level intelligence requires understanding and representing emotion
  – Computers can experience and communicate affect
• Examples:
  – Affective Computing
http://affect.media.mit.edu/

Computer Vision
• What is computer vision?
  – “Making computers see”
    Nice sunset!
    “Extracting descriptions of the world from images or sequences of images”

What Does Computer Vision Do?
3D models of objects
Object recognition
Navigation
Event/action recognition
…

Computer Vision
• Vision is easy, right? Just open your eyes!
  – No, it’s a hard problem…
  – Much of your very complex brain is devoted to doing vision
  – It involves cognition, navigation, manipulation and learning
    Not just simple “match a feature vector to a database” tasks
• CV is about interpreting the content of images (incl. video)
  – Field is ~40 years old
  – Originally a child of AI
  – Now closely related to several other research fields
    Pattern recognition, image processing, computer graphics, learning, ....

CV goes from images to descriptions
• “Boats” or
• “Outdoors” or
• “A harbor with many dozens of boats; water is calm and glassy; masts are all vertical; mountains in background, blue sky with a touch of clouds…”
Why is this hard?
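
One way to see the difficulty: to a program, an image is only a grid of numbers. The sketch below is a minimal Python illustration, not a real vision algorithm. The pixel values, the function names (`parse_rows`, `bright_bounding_box`) and the threshold of 0x80 are all assumptions made for this example. It parses hexadecimal intensity values and reports the bounding box of the bright pixels, which is roughly as far as naive thresholding gets toward a "description".

```python
# A tiny hypothetical grayscale patch, written as hex bytes the way a
# program receives it. Values are illustrative, not from a real image.
HEX_ROWS = [
    "01 03 03 00 02 01",
    "0A 0C 93 99 0B 09",
    "08 9F AF BA 0D 07",
    "06 A8 AD BC 0A 05",
    "02 04 05 03 06 01",
]


def parse_rows(rows):
    """Convert rows of space-separated hex bytes into a 2D list of ints."""
    return [[int(token, 16) for token in row.split()] for row in rows]


def bright_bounding_box(image, threshold=0x80):
    """Return (top, left, bottom, right) of pixels >= threshold, or None.

    The threshold is an arbitrary assumption; real scenes rarely separate
    object from background this cleanly.
    """
    coords = [
        (r, c)
        for r, row in enumerate(image)
        for c, value in enumerate(row)
        if value >= threshold
    ]
    if not coords:
        return None
    rows_hit = [r for r, _ in coords]
    cols_hit = [c for _, c in coords]
    return (min(rows_hit), min(cols_hit), max(rows_hit), max(cols_hit))


image = parse_rows(HEX_ROWS)
box = bright_bounding_box(image)
print("bright region (top, left, bottom, right):", box)
```

The result says where something bright is, but nothing about what it is (a boat, a sail, a reflection); closing that gap is the hard part of vision.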

Vision Transforms From This…
[Figure: an image shown as a grid of raw pixel intensity values in hexadecimal, beginning 01 03 03 00 00 00 02 02 02 01 02 01 01 02 02 00 …]

To this…
• Objects
  – Cat, chair, window, star, bush, water, a shoe, my mother…
• Properties
  – Big, bright, yellow, fast, graspable, moving…
• Relations
  – In front, behind, on top, next to, larger, closer, identical…
• Shapes
  – Round, rectangular, star-shaped, symmetric…
• Textures
  – Rough, smooth, irregular…
• Movement
  – Turning, looming, rolling…

Aims of Computer Vision
• Automate visual perception
• Construct scene descriptions from images
• Make useful decisions about real physical objects and scenes based on sensed images
• Produce symbolic (perhaps
task-dependent) descriptions from images
• Produce from images of the external world a useful description that is not cluttered with irrelevant information
• Support tasks that require visual information

Some applications of computer vision
• Photogrammetry, GIS
  – Commercial, military, government
• Robotics
  – Industrial, military, medical, space, entertainment
• Inspection, measurement
• Medical imaging
  – Automatic detection, outlining, measurement
• Graphics and animation, special effects
• Surveillance and security
• Multimedia database indexing and retrieval, compression
• Human-computer interaction (VBI)

Progress in Computer Vision
• First generation: Military/Early Research
  – Few systems, each custom-built, cost $Ms
  – “Users” have PhDs
  – 1 hour per frame
• Second generation: Industrial/Medical
  – Numerous systems, 1-1000 of each, cost $10Ks
  – Users have college degree
  – RT with special hardware
• Third generation: Consumer
  – 100000(00) systems, cost $100s
  – Users have little or no training
  – RT in software

Examples
Viisage FaceExplorer
CMU NavLab
Cognex

Speech Understanding
Speech recognition + Natural language processing

Speech Understanding
• Language-based technologies have potentially many useful applications
  – Automatic translation
  – Database query
  – Computer interfaces
  – etc.
• Two main components
  – Speech recognition (SR or ASR)
  – Natural language processing (NLP or NLU)
• SR + NLP = Speech Understanding (SU)

Speech and language technologies
Input:
• Speech recognition converts an audio signal to a set of words
  – A.k.a. “Automatic speech recognition” (ASR)
• Language processing derives meaning from the words
Output:
• Language generation converts concepts to natural language
• Speech generation converts text to audible speech
  – A.k.a.
“Text-to-speech” (TTS)
Note: Speech recognition by itself doesn’t solve much…

Speech recognition
• Converts a digital audio signal to a set of words
• Typical output: N-best words or complete sentences, based on a frequency table or predefined grammar [Hypotheses]
• Linguistic constraints: e.g., statistical grammar, context-free grammar… [Filter]
• Trend – more linguistic information used in word recognition
• Coarticulation problem (“Did chew no?”)

Natural language processing (NLP)
• NLP uses tools from formal language theory to process natural language
  – Grammar, syntax, etc.
• NLP traditionally has been syntax-driven
  – start with a complete syntactic analysis accounting for all the words in the utterance
• This doesn’t work so well with spoken material
  – Unknown words, novel linguistic constructs, recognition errors, false starts, disfluencies…
• Alternative: more semantic-driven approaches
  – Word spotting: looking for key words and phrases
  – Almost all practical systems use some version of this

Problems with speech understanding
• Spontaneous dialogue is replete with
  – Improper grammar
  – Disfluencies (“Uh”, “Um”, word fragments)
  – Interruptions (Barge-in)
  – Confirmation
  – Clarification
  – Ellipsis (“and so on”)
  – Sentence fragments
  – Colloquialisms (“Kill two birds with one stone”)
  – Slang (“He ain’t got no dough”)

Examples
• “How to recognize speech” or
• “How to wreck a nice beach”
• “I saw the Grand Canyon flying to Colorado.”

Holy Grail of Speech Understanding Systems
• Low quality input (microphone, transmission, placement of microphone)
• Noisy environment
• Continuous speech
• Speaker independent
• Large vocabulary (many 1000’s)
• Unrestricted grammar
• Domain independent
• Handles accents, non-native speakers
• Understands and generates natural prosody
• Understands and generates backchannel communication

Final Exam Review

Final Exam
• Friday, December 12, 4-7pm, HERE
• Allowed: An 8.5x11” sheet of paper, writing on both sides
• Bring a calculator
• Don’t
need paper
• Some equations, etc. will be given (those on the midterm and more) – posted on web soon
• The exam covers the whole quarter, with more emphasis on material since the midterm
• Responsible for reading, lectures, homeworks, quizzes
• Like midterm in form

Introduction
• What is AI? What are its goals?
• What are its foundations?
• How does it relate to other areas of study?
• What do AI people do?
• Strong AI, Weak AI
• Thought vs. action, human vs. ideal
• Three parts of a typical AI program
  – Data / knowledge (“knowledge base”)
  – Operations / rules (“production rules”)
  – Control

Intelligent agents
• Different kinds of agents
• Our goal: Ideal rational agents
• PEAS
  – Performance measure, Environment, Actuators, Sensors

Problem solving and search
• Problem formulation, abstraction
• State space representations
• State space vs. search tree
• Branching factor
• Search criteria
  – Completeness, optimality, time and space complexity
• Blind search methods
  – Depth first
  – Breadth first
  – Uniform cost
  – Depth limited
  – Iterative deepening

Search (cont.)
• Heuristic (informed) search
  – Best-first search
    Greedy best-first search
    A* search
  – Memory bounded search
    IDA*
    SMA*
  – Iterated improvement algorithms
    Hill-climbing
    Simulated annealing
• Admissible heuristics

Adversarial search
• Game playing
• Minimax algorithm
• Alpha-beta pruning
• Non-deterministic games

Knowledge and reasoning
• Logic, syntax, semantics
• Knowledge base, inference engine
• Inference vs.
entailment
• Propositional logic
• Satisfiable, unsatisfiable, valid sentences
• Sound inference
• Inference rules:
  – Modus ponens, and-introduction, ..., resolution

First order logic
• FOL syntax and semantics
  – Universal and existential quantifiers
• Conversion to CNF and INF
• Inference in FOL
• Universal instantiation, existential instantiation, existential introduction
  – SUBST( )
• Generalized modus ponens
• Unification, skolemization
• Generalized resolution (2 versions)
  – Disjunctions
  – Implications

Knowledge Representation
• The frame problem
• Situation calculus
  – Result(A, S)
• Other ways to deal with time
  – Event calculus, generalized events, etc.

Probabilistic reasoning and belief nets
• Joint and conditional probabilities, marginalization, Bayes Rule, independence
• Constructing belief networks
• Computing queries on belief networks

Big questions in AI
• Is strong AI possible?
• How do minds work?
• The Turing test
• The Chinese room
• Simulation vs. reality

Misc.
• Nothing on robotics, vision, or speech understanding
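
As a refresher on the probabilistic reasoning topics listed above (joint and conditional probabilities, marginalization, Bayes Rule), the following minimal Python sketch computes a marginal and a conditional from a small joint distribution and checks the result against Bayes' rule. The variables (Cavity, Toothache), the probabilities, and the helper names `marginal` and `conditional` are hypothetical, chosen only for illustration.

```python
# Hypothetical joint distribution P(Cavity, Toothache).
# Keys are (cavity, toothache); the numbers are made up and sum to 1.
joint = {
    (True, True): 0.12,
    (True, False): 0.08,
    (False, True): 0.18,
    (False, False): 0.62,
}


def marginal(joint, index, value):
    """Marginalization: sum the joint over every variable except `index`."""
    return sum(p for key, p in joint.items() if key[index] == value)


def conditional(joint, target, target_value, given, given_value):
    """P(target = target_value | given = given_value) from the joint."""
    numerator = sum(
        p
        for key, p in joint.items()
        if key[target] == target_value and key[given] == given_value
    )
    return numerator / marginal(joint, given, given_value)


p_cavity = marginal(joint, 0, True)
p_toothache = marginal(joint, 1, True)
p_cavity_given_toothache = conditional(joint, 0, True, 1, True)
p_toothache_given_cavity = conditional(joint, 1, True, 0, True)

# Bayes' rule: P(C | T) = P(T | C) * P(C) / P(T)
bayes = p_toothache_given_cavity * p_cavity / p_toothache

print("P(cavity) =", round(p_cavity, 3))
print("P(cavity | toothache) =", round(p_cavity_given_toothache, 3))
print("via Bayes' rule =", round(bayes, 3))
```

Both routes to P(cavity | toothache) agree, which is a quick sanity check worth doing by hand on exam-style problems as well.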