CS 764: Seminar in Computer Vision

CORNELL UNIVERSITY
CS 764 Seminar in Computer Vision
Ramin Zabih
Fall 1998

Course mechanics
■ Meeting time will be Tue/Thu 11-12, here
  • Starting a week from today
■ Home page is now up www/CS764
■ Assignment: present one paper
  • You’ll have a lot of freedom, but you need to talk to me in advance
  • Some possible papers will be posted shortly

Topic of this seminar
■ The use of “knowledge” in the analysis of visual data
  • Sometimes called “context”
■ Clearly this is vital
  • On both psychological and technical grounds
  • But how? No one has much of an idea…
■ What is the interface between reasoning and perception? (Or, mind and body?)

What is the visual system’s “contract”?
Two standard (bad) answers
■ Answer 1: describe the scene in terms of surfaces [low-level vision]
  • There is a green patch 2” wide 1’ away
■ Answer 2: describe the scene in terms of objects [model-based recognition]
  • Start with a set of 3D models (modelbase)
  • Determine position and pose

Why are these answers wrong?
■ They are almost purely data-driven
  • Bottom-up (from the data) versus top-down (from somewhere else)
■ They report “objective fact”, with no room for the task at hand
  • For a given image, there is only one right answer
■ Other problems as well
  • Not very useful, etc.

Technical and psychological arguments
■ There are technical arguments against this
  • Vision is an inverse problem
    – Many 3D scenes could explain a single 2D image
  • On engineering grounds, this makes no sense
    – Ultimately, perception is used for some task
■ The human perceptual system has both top-down and bottom-up elements
  • Various optical illusions
    – Two people can look at the same picture and see something completely different

Your vision system doesn’t listen

It makes “reasonable” assumptions

Low-level vision has its solution
Inverse problems require assumptions
■ The assumptions for low-level vision are extremely general (i.e., weak)
  • Reflect the physics of the visible world
  • For example, motion or depth or intensity tend to be “coherent”
    – Saying that every pixel is moving differently from its neighbors is a very unlikely answer
    – The world we live in tends not to do that
    – Helmholtz’s “unconscious inference”

We’ll need high-level vision
■ Most of the field is low-level vision or model-based recognition
  • Partly to avoid the confusion CS764 is about
■ Key question: how to avoid brittleness?
  • Can make the visual system compute just what we need for our task (e.g., berries)
  • But how to handle the unexpected (e.g., lions)?
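
Aside: the two claims that vision is an inverse problem and that low-level vision resolves it with a weak "coherence" assumption can be illustrated with a toy example. The sketch below is not from the slides; it assumes a pinhole camera with focal length 1, and every name in it (`project`, `back_project`, `roughness`, the sample depths) is hypothetical. It builds two different 3D scenes that produce the same image, then prefers the one whose depth varies least between neighboring pixels.

```python
# Toy illustration only: a pinhole camera discards depth, so one image has
# many 3D explanations; a coherence prior is one way to choose among them.
# All names and numbers here are assumptions made for this sketch.

F = 1.0  # assumed focal length


def project(point):
    """Pinhole projection of a 3D point (X, Y, Z) to image coordinates."""
    x, y, z = point
    return (F * x / z, F * y / z)


def back_project(pixel, depth):
    """Return the 3D point at the given depth that projects to `pixel`."""
    u, v = pixel
    return (u * depth / F, v * depth / F, depth)


def same_image(scene_a, scene_b, tol=1e-9):
    """True if two lists of 3D points project to the same pixels."""
    for p, q in zip(scene_a, scene_b):
        (u1, v1), (u2, v2) = project(p), project(q)
        if abs(u1 - u2) > tol or abs(v1 - v2) > tol:
            return False
    return True


def roughness(depths):
    """Sum of squared depth differences between neighbors (lower = more coherent)."""
    return sum((a - b) ** 2 for a, b in zip(depths, depths[1:]))


pixels = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (0.3, 0.0)]

# Two candidate depth assignments for the same four pixels.
smooth_depths = [2.0, 2.1, 2.2, 2.3]  # a gently slanted surface
jagged_depths = [2.0, 9.0, 1.5, 7.0]  # every pixel at an unrelated depth

smooth_scene = [back_project(p, d) for p, d in zip(pixels, smooth_depths)]
jagged_scene = [back_project(p, d) for p, d in zip(pixels, jagged_depths)]

# Both scenes explain the image equally well; the prior breaks the tie.
preferred = min([smooth_depths, jagged_depths], key=roughness)
```

Here `same_image(smooth_scene, jagged_scene)` holds, so the image alone cannot distinguish the two scenes, and `preferred` is the smooth one. Squared neighbor differences are just one simple choice of coherence measure, used here for brevity.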

A short historical perspective
■ 1960’s vision was completely task-specific
  • A black blob in the center of the image is a telephone
  • These efforts are now considered “hacks”
■ 1970’s vision became completely general
  • Marr pushed the field towards precise technical questions
  • Low-level vision and recognition became dominant

Tasks strike back
■ In the mid-1980’s, several attempts were made to re-introduce a notion of task
  • Active/animate/purposive vision
■ These attempts are widely viewed as failures, for good reasons
  • We’ll look at them a bit next week
■ It’s not enough to have good intuitions
  • There needs to be technical merit as well

Desiderata
■ Technical solutions (algorithms) that are very roughly consistent with human data
  • Goal is not AI, psychology or philosophy
■ Provide visual summaries useful for tasks, but degrade gracefully
  • Handle open/unstructured environments
  • Deal with expectations and breakdown

Our path for 764
■ No good computational work to read
  • Perhaps Vera will fix this?
■ We will examine papers along these lines:
  • Computational approaches that failed
  • Psychological data that is highly suggestive
  • Neurologically inspired architectures
  • Cognitive scientists and philosophers
    – Their goal is argument, not algorithm!
    – They’ve thought the most about these issues
