EECS 531 Computational Vision, Spring 2015

Instructor
Dr. Michael Lewicki
Associate Professor
Electrical Engineering and Computer Science Dept.
Case Western Reserve University
email: michael.lewicki@case.edu
Office: Olin 508. Office Hours: Tu/We 11:00-12:00 or by appointment.

Class meeting times
Tu/Th 2:45 - 4:00 in Nord 212

Main Textbook
Computer Vision: Algorithms and Applications by Richard Szeliski. This is also available electronically at http://szeliski.org/Book/.

Supplemental Textbooks
Computational Vision: Information Processing in Perception and Visual Behavior by Hanspeter A. Mallot. I also recommend this because it has better and more thorough explanations of some of the core topics, although it is not as broad as Szeliski.

Seeing: The Computational Approach to Biological Vision, 2nd edition, by John P. Frisby and James V. Stone. I recommend this for background and a broader perspective on visual perception and physiology. This is a good book to have if you haven't taken a course in perception. I will sometimes use material from this book in the lectures, but the main focus of the course will be on computational algorithms.

Multiple View Geometry in Computer Vision, 2nd edition, by Richard Hartley and Andrew Zisserman. This is an excellent resource for geometric approaches to computer vision and has good chapters on projective geometry and camera models. Szeliski can be dense and hard to follow on some of these topics. H&Z go through all the details.

Computer Vision: A Modern Approach, by David A. Forsyth and Jean Ponce. This book has a very good introduction to image formation and image models.

Web page
The course has a Blackboard site (https://blackboard.case.edu). Search within Blackboard to find the site for EECS 531. Check there periodically for the latest announcements, homework assignments, lecture slides, handouts, etc.
Course Description
The goal of computer vision is to create systems that recognize patterns and recover structures from complex images and scenes. This course teaches both the science behind our understanding of the fundamental problems in vision and the engineering that develops mathematical models and inference algorithms to solve these problems. Specific topics include feature detection and classification; visual representations and dimensionality reduction; motion detection and optical flow; image segmentation; depth perception, multi-view geometry, and 3D reconstruction; shape and surface perception; visual scene analysis and object recognition.

Course Goals
The goal of this course is to teach a comprehensive and practical understanding of computer vision problems ranging from pattern recognition to scene analysis. The course will teach how to reason scientifically about problems and issues in computational vision, how to extract the essential computational properties of those abstract ideas, and how to convert these into explicit mathematical models and computational algorithms. You will learn how to implement and test effective computational algorithms for problems in computer vision. In-class discussions are an essential aspect of the course. An important goal of the course is to teach productive discussion, analysis, and critique of issues and topics related to computer vision.
Course requirements
The course requirements consist of
• reading the assigned background material
• participation in class discussions
• completion of the homework assignments
• completion of an independent project (which includes a writeup and presentation)

Lectures
The lectures are designed to teach the following:
• the nature of the problem
• common mathematical solutions
• algorithmic implementations
• practical applications
• motivation for upcoming topics

When possible, the mathematical and conceptual background required for the lectures will be covered in prior lectures or assignments. Computer vision depends on a broad range of fields, and it is generally not possible to understand all the results down to a fundamental level. We will make an effort to clearly encapsulate results from other areas so that you can apply them even without understanding how they came about. This is in fact the essence of progress. The goal is to know less, i.e. to package knowledge so that it can be easily used. The overall goal is to teach the current state of the art in computer vision with sufficient background so that you can apply the algorithms to novel problems.

Assignments
Assignments are the primary means by which to learn the mathematical material presented in class and will be coordinated with the lectures. Some of the advanced methods discussed in class are not practical to cover in a homework because of their complexity. If you would like to study a particular topic in greater detail, it is well worth considering designing a class project around that topic. Some assignments will depend on material completed in earlier assignments. Therefore, complete the entire assignment and stay current.

The programming assignments will be in Matlab. It is possible to use other languages, e.g. numerical Python, but this is usually more work, and you will not be able to take advantage of the Matlab code that will be provided with some of the assignments.
Each assignment must be turned in as a single PDF file. The reason for this is that it makes grading far easier and avoids the formatting and version problems that often arise from MS Word files. The best way is to use LaTeX (for equations) and include code and figures as needed. Once you know how to do it, LaTeX is faster and far more flexible than a word processor, because if you need to update a figure, you can simply regenerate the PDF. Equations are also faster to specify and yield much more professional formatting. Students in previous years have also used the Matlab report generator, but you must join multiple PDF files into a single PDF for the assignment.

Readings
We will do readings throughout the course. Readings outside of the textbook will be incorporated into the assignments. Background material and research papers will be made available on Blackboard. You will be responsible for understanding the material and participating in class discussions.

Late policy
Late assignments will not be accepted except for medical reasons. This is to ensure that you do not fall behind, and so that the assignments can be returned in a timely fashion.

Course Projects
Each student is required to complete an independent project, which should be an implementation and/or application of an algorithm discussed in class or a closely related subject. You do not need to write your own code, and you are encouraged to find existing code for a topic you are interested in to use as the basis for exploration. Your written report should be in the style of a tutorial that explains the problem and algorithm(s) with illustrative examples. Each student is also responsible for giving a presentation on their project, which will be presented to the whole class and should last about 10-15 min. The presentations will be scheduled for each student throughout the semester at times that best fit the course topics and schedule.
This means that you have some flexibility in choosing your deadline for this part of the course. Look through the course topics, textbook, and lecture slides for project ideas. You will write a one-page project proposal and discuss your project with me. In your proposal, you should explain what you want to do, what code you will use, and what examples or test data you want to test the algorithms on.

Final Grade
Final grades will be a composite score of course requirements in the following proportions:

Assignments (total)     75%
Project report          15%
Project presentation    10%
Total                  100%

Extra credit, class participation, and any special circumstances will be used in determining borderline cases.

Collaboration
Collaborative discussion is encouraged, but any work submitted for an assignment must be entirely your own and may not be derived from the work of others, whether a published source, assignments from previous years, another student, or any other person. Doing otherwise without acknowledging that you have done so is cheating. It is your responsibility to take standard measures to protect your programs, homework assignments, and examinations from illicit inspection or copying. Violations will be handled in accordance with the University Policy on Cheating and Plagiarism.
Class Schedule (subject to revisions) [1]

Columns: lecture number, date, topics, readings, assignment due dates (out / draft / final).

[1] In notes, S.x.y refers to Szeliski chapter x, section y, M.x.y for Mallot, and FS.x.y for Frisby and Stone.

1. Tue, Jan 13. Introduction and Overview - image processing vs computer vision, perception is inference of the external scene, course design
   Readings: S.1, M.1

2. Thu, Jan 15. Feature Detection and Classification - convolution, feature detection, signals and noise, classification
   Readings: S.3-4, M.3-4; Viola and Jones, IJCV 2004
   Assignments: A1

3. Tue, Jan 20. Spectral Representation - filtering, 1D and 2D Fourier transforms, natural image statistics, wavelets and multi-scale representations
   Readings: S.3.4

4. Thu, Jan 22. Neural Networks and Classification - neural networks, optimization and gradient descent, choosing step size, neural units as feature detectors, multi-layer networks, non-linearities
   Readings: S.14.1-2

5. Tue, Jan 27. Basis Representation and Principal Components - linear basis representation, multivariate Gaussians, dimensionality reduction with principal component analysis
   Readings: S.A1, S.14.2; Turk and Pentland

6. Thu, Jan 29. Learning Visual Representations - efficient coding of images, independent component analysis, multi-scale coding
   Readings: Olshausen and Field, 1996, 2000; S.3.5, M.3.4

7. Tue, Feb 3. Image Segmentation - image boundaries in natural images, clustering, normalized cuts
   Readings: S.5; Shi and Malik

8. Thu, Feb 5. Generalized Features and Invariance - biological inspiration, feature pooling, feature correspondence, SIFT features
   Readings: S.4; Riesenhuber et al, 1999; Lowe 2004

9. Tue, Feb 10. Bayesian Inference & Image Denoising - vision as inference; Wiener filtering, denoising
   Readings: S.3.4.3; Lewicki and Olshausen, 1999
   Assignments (lectures 7-9): A2, A1, A1, A3, A2

10. Thu, Feb 12. Hierarchical Statistical Representations - hierarchical generative models; modeling of natural textures and boundaries; interpretation of complex cells
   Readings: Karklin & Lewicki, 2009

11. Tue, Feb 17. Object Recognition (Introduction) - shape constancy, reference frames, holistic features, invariant feature recognition
   Readings: Sinha, 2002
   Assignments: A2, A4, A3

12. Thu, Feb 19. Hierarchical Models for Recognition - feed-forward models, convolutional neural nets, deep belief nets, criticisms of object recognition systems; more object recognition
   Readings: Fei-Fei et al, 2007; Pinto et al, 2008, 2011; Hinton, 2006

13. Tue, Feb 24. Motion Estimation - aperture problem, motion gradient equation, optic flow fields, ill-posed problems, regularization
   Readings: M.9; S.8; FS.14

14. Thu, Feb 26. Motion Inference - Bayesian inference and motion estimation, human perception of motion
   Readings: FS.15; Weiss, Simoncelli, & Adelson 1999

15. Tue, Mar 3. Motion Representation - learning representations of complex motions, motion and inference during fixation
   Readings: Cadieu & Olshausen 2011; Burak et al 2010

16. Thu, Mar 5. Active Vision - types of eye movements, visual integration
   Readings: Burak et al 2010
   Assignments: A3, A4

Tue, Mar 10 and Thu, Mar 12. Spring break - no class

17. Tue, Mar 17. Shape from Shading - image formation models, inference of shape, reflectance, and lighting, regularization without priors
   Readings: S.12; Zheng et al, 1999; Freeman 1994

18. Thu, Mar 19. Shape and Surface Perception - perception of lighting, shape from two-tone images, shape perception of complex surfaces, higher-level shape representations
   Readings: Ostrovsky et al 2005; Purves et al, 2004; Fleming et al 2004
   Assignments: A5

19. Tue, Mar 24. Geometric Computer Vision Overview
   Assignments: A4, A5

20. Thu, Mar 26. Feature-based Alignment - 2D alignment using least squares, RANSAC and variations, image stitching, 2D coordinate transforms
   Readings: S.6.1; S.2.1.1-2

21. Tue, Mar 31. Planar Geometry and Projective Transformations - geometric primitives, projective space, projective transformations, removing projective distortion
   Readings: S.2.1.2; Hartley and Zisserman (2004), Ch.2
   Assignments: A6, A5

22. Thu, Apr 2. 3D Transformations - basic transformations, axis/angle rotation parameterization, unit quaternions
   Readings: S.2.1.3-4

23. Tue, Apr 7. Camera models and Pose estimation - 3D to 2D projections, camera models, optics
   Readings: S.2.1.5-6; S.6.2-3

24. Thu, Apr 9. TBD
   Readings: S.7; Snavely et al, 2006; Brown and Lowe, 2005

25. Tue, Apr 14. Triangulation - linear least squares, normal equations
   Readings: S.7.1; S.A.2
   Assignments: A6

26. Thu, Apr 16. Two-frame structure from motion - epipolar geometry, the essential matrix, 8-point algorithm

27. Tue, Apr 21. Visual Scene Analysis

28. Thu, Apr 23. Saliency and Visual Search - models of saliency, ideal visual search, inattentional blindness
   Readings: Itti et al 1998; Torralba et al 2006; Najemnik & Geisler 2005
   Assignments: A6

Potential additional topics (TBD)
• The fundamental matrix
• Factorization
• Bundle adjustment
   Readings: Triggs et al, 2000
• Shape Perception - perception of lighting, shape from two-tone images, shape perception of complex surfaces, higher-level shape representations
   Readings: Purves et al, 2004; Fleming et al, 2004
• Surface Perception - lightness perception, perceptual constancy, perception of surface and material properties
   Readings: Fleming, Dror, and Adelson, 2003
• Visual Scene Analysis
• Robust Coding and Image Reconstruction - image denoising, Wiener filtering?, robust coding in noisy systems, vision as inference, Bayesian restoration, super-resolution?
   Readings: S.3.4, FS.13; Doi and Lewicki, 2005, 2006; Lewicki and Olshausen, 1999