CS 523 (CS 423/EE 533) Computer Vision
Lecture 1
INTRODUCTION TO COMPUTER VISION

About the Course

Syllabus
http://vvgl.ozyegin.edu.tr
• Objective: Introduction to the theory, tools, and algorithms of 3D computer vision
• Instructor: Assist. Prof. M. Furkan Kıraç, E-mail: furkan.kirac@ozyegin.edu.tr, Room: 219
• Hours: Wednesdays, 10:40-13:30, Room: 241
• Grading: Projects: 6 x 10%, Final Exam: 40%

Grading
• Short Projects: Late submissions are not accepted. Copying answers from others' work is not permitted.
• Final Exam: At least 3 of the 6 Short Projects must be turned in by the due date in order to qualify for the Final Exam. No make-up will be given for the Final Exam. Students who miss the Final Exam can take the Bütünleme (resit) exam.

Recommended Books
• Computer Vision: Algorithms and Applications, Richard Szeliski, Springer, 2010.
• Computer Vision: A Modern Approach, David A. Forsyth and Jean Ponce, Prentice-Hall, 2002.
• Introductory Techniques for 3D Computer Vision, Emanuele Trucco and Alessandro Verri, Prentice-Hall, 1998.

OpenCV Resources
• Learning OpenCV, Gary Bradski and Adrian Kaehler, O'Reilly, 2008.
• OpenCV 2 Computer Vision Application Programming Cookbook, Robert Laganière, Packt Publishing, 2011.
• Mastering OpenCV with Practical Computer Vision Projects, Daniel Lélis Baggio, et al., Packt Publishing, 2012.

Week                 Lectures
24 September 2014    Lecture 1
1 October 2014       Lecture 2
8 October 2014       Lecture 3
15 October 2014      Lecture 4
22 October 2014      Lecture 5
29 October 2014      Lecture 6
5 November 2014      Lecture 7
12 November 2014     Lecture 8
19 November 2014     Lecture 9
26 November 2014     Lecture 10
3 December 2014      Lecture 11
10 December 2014     Lecture 12
17 December 2014     Lecture 13
24 December 2014     Lecture 14
31 December 2014     Lecture 15 ?
Applications of Computer Vision
• Image Stitching
• Image Matching
• Object Recognition
• 3D Reconstruction
• Interior Modeling
• 3D Augmented Reality
• 3D Camera Tracking
• Stereo Conversion for 3DTV
• Depth Estimation and View Interpolation for 3DTV
• Human Tracking
• License Plate Recognition
• Human Pose Estimation

Course Outline

Topics to be covered
• 3D geometry fundamentals
• Transformations and projections
• Camera calibration
• Feature detection and matching
• Image stitching
• Single view geometry
• Two view geometry
• Multiple view geometry
• Stereo vision and depth estimation
• 3D structure from motion
• 3D camera tracking

Relation to Other Fields

Computer Vision
[Figure from "Computer Vision: Algorithms and Applications," Richard Szeliski, Springer, 2010.]

Computer Graphics
• Lights and materials
• Shading
• Texture mapping
• Environment effects
• Animation
• 3D scene modeling
• 3D character modeling
(OpenGL)

Image Processing
Topics
• Resampling
• Enhancement
• Noise filtering
• Restoration
• Reconstruction
• Segmentation
• Image compression
(MATLAB and OpenCV)

Video Processing
Topics
• Spatio-temporal sampling
• Motion estimation
• Frame-rate conversion
• Multi-frame noise filtering
• Multi-frame restoration
• Super-resolution
• Video compression
(MATLAB and OpenCV)

Video acquisition-display chain
Capture → Representation → Coding → Transmission → Decoding → Rendering

Human vs. Computer

Optical illusions

Actual vs. Perceived Intensity (Mach band effect)

Brightness Adaptation of the Eye

Why is Computer Vision Difficult?

Human perception

Human Visual System

Human Eye

Photoreceptors: Rods & Cones

Rods vs. Cones
• Rods
  • Perceive brightness only
  • Night vision
• Cones
  • Perceive color
  • Day vision
  • Red, green, and blue cones

Cone Distribution
• Approximately 64% red, 32% green, and 2% blue cones
• Blue is less focused

Visual Threshold Drop during Dark Adaptation

Spatial Resolution of the Human Eye
• Photopic (bright-light) vision
  • Approximately 7 million cones
  • Concentrated around the fovea
• Scotopic (dim-light) vision
  • Approximately 75-150 million rods
  • Distributed over the retina
(For comparison, HDTV: 1920x1080 ≈ 2 million pixels)

Frequency Responses of Cones
• The same amount of energy produces different sensations of brightness at different wavelengths.
• The green wavelength contributes most to the perceived brightness.

Trichromatic Color Mixing
• Any color $C$ can be obtained by mixing three primary colors, Red, Green, and Blue (RGB), in the right proportions:
  $C = \sum_{k=1}^{3} T_k C_k$
  where $C_k$, $k = 1, 2, 3$, are the primaries and $T_k$ are the tristimulus values.

Image Formation

Human Eye vs. Camera
Camera components            Eye components
Lens                         Lens, cornea
Shutter                      Iris, pupil
Film                         Retina
Cable to transfer images     Optic nerve to send the incident light information to the brain

Human Vision

Image formation

Pin-Hole Camera Model

Point Spread Effect

Out-of-Focus Blur

Shrinking the Aperture

Converging Lens

Correction with a Converging Lens

Perfectly In-Focus for a Certain Distance Only
[Figure label: "circle of confusion"]

Depth-of-Field

"Sharp Image" within Depth-of-Field due to Finite Sensor Size
[Figure labels: $Z_F$, $Z_N$]

Focal Length (F) and Depth (Z)
A point $(X, Y, Z)$ in the scene projects to the image point $(x, y)$ by similar triangles:
  $\frac{y}{F} = \frac{Y}{Z} \;\Rightarrow\; y = \frac{F\,Y}{Z}, \qquad x = \frac{F\,X}{Z}$

Aperture Size Affects Depth-Of-Field
[Figure panels: f / 5.6 and f / 32]
• Aperture area: $A \propto d^2$, where $d$ is the aperture diameter
• Camera f-number: $F = f / d$, so $A \propto (f / F)^2$
(On this slide $F$ denotes the f-number and $f$ the focal length.)

Exposure Time

Motion Blur Effect due to Finite Exposure Time

Decrease in aperture implies…
• Increase in depth-of-field
• Decrease in exposure (less light reaches the sensor)
• A longer exposure time to compensate, and therefore more motion blur

2D Image Representation

Image Capture
(Courtesy Gonzalez & Woods)

Digital Image Capture
• Light-sensitive diodes convert photons to electrons

Color Image Capture: Single vs. Three CCD Arrays
• Bayer filter (cheaper but introduces spatial resolution loss)
• RGB splitter (three separate imaging sensors, higher resolution)

Digital Camera Issues
• Noise: caused by low light
• Color: color fringing (chromatic aberration), artifacts from Bayer patterns
• Blooming: charge overflowing into neighboring pixels
• In-camera processing: over-sharpening can produce halos
• Compression: creates blocking artifacts

Digitization: Sampling and Quantization

Over Sampling

Over Quantization

Images as Matrices of Integers
An M x N 8-bit gray-scale (intensity, luminance) image $s(m, n)$, indexed from the origin (0,0):

  126 127 126 128 127 124 158
  125 126 127 123 120 144 163
  123 126 125 121 128 155 160
  126 123 127 122 142 162 164
  120 122 124 130 157 161 166
  119 121 123 145 162 164 165

• Sampling: $0 \le m \le M-1$, $0 \le n \le N-1$
• Quantization: $0 \le s(m,n) \le 255$ (0 → black, 255 → white)

Images as Functions
• We can think of an image as a function $f$ from $\mathbb{R}^2$ to $\mathbb{R}$: $f(x, y)$ gives the intensity at position $(x, y)$.
• Realistically, we expect the image to be defined only over a rectangle, with a finite range:
  $f : [a,b] \times [c,d] \to [0,1]$
• A color image is just three functions pasted together. We can write this as a "vector-valued" function:
  $f(x, y) = \begin{bmatrix} r(x, y) \\ g(x, y) \\ b(x, y) \end{bmatrix}$

RGB Color Bands (Channels)
[Figure panels: Red, Green, Blue]

YUV Bands
Also called Y Cb Cr
• Y: Luma
• Cb: Chrominance_blue
• Cr: Chrominance_red
[Figure panels: Color, Y, U (Cb), V (Cr)]

YUV-RGB Conversion

Summary
• Human visual system
• Pin-hole camera model
• Image representation

Problems to be Addressed
• How to find camera parameters?
• Where is the camera, and where is it pointed?
• What is the movement of the camera?
• Where are the objects located in 3D?
• What are the dimensions of objects in 3D?
• What is the 3D structure of a scene?
• How to process stereo video?
• How to detect and match image features?
• How to stitch images?
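
Worked example: pin-hole projection and f-number
As a minimal sketch of the pin-hole relations from the Image Formation slides (image coordinates equal focal length times X or Y, divided by depth Z) and of the f-number definition (focal length divided by aperture diameter), the Python snippet below projects a 3D point and compares the aperture area at two f-numbers. The function names and sample values (a 50 mm lens at f/5.6 and f/32) are illustrative assumptions. The area formula assumes a circular aperture, whereas the slides only state that the area grows with the square of the diameter.

```python
import math


def project_pinhole(X, Y, Z, F):
    """Project a camera-frame point (X, Y, Z) onto the image plane.

    Uses x = F * X / Z and y = F * Y / Z, where F is the focal length
    and Z is the depth along the optical axis (must be positive).
    """
    if Z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return F * X / Z, F * Y / Z


def aperture_diameter(focal_length, f_number):
    """Aperture diameter d from the definition f_number = focal_length / d."""
    return focal_length / f_number


def aperture_area(focal_length, f_number):
    """Area of a circular aperture (assumption: A = pi * d**2 / 4)."""
    d = aperture_diameter(focal_length, f_number)
    return math.pi * d ** 2 / 4.0


if __name__ == "__main__":
    # Doubling the depth halves both image coordinates.
    print(project_pinhole(1.0, 2.0, 4.0, F=0.05))
    print(project_pinhole(1.0, 2.0, 8.0, F=0.05))

    # How much more light-gathering area f/5.6 has than f/32 (50 mm lens).
    ratio = aperture_area(50.0, 5.6) / aperture_area(50.0, 32.0)
    print(f"area ratio f/5.6 : f/32 = {ratio:.2f}")
```

The area ratio depends only on the two f-numbers, not on the focal length, which is why exposure settings are quoted as f-numbers rather than as aperture diameters.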
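
Worked example: quantization and RGB to Y Cb Cr
The "Images as Matrices of Integers" and "YUV-RGB Conversion" slides can be illustrated with a short Python sketch. It uniformly quantizes intensities in [0, 1] to 8-bit levels and converts a single RGB pixel to Y Cb Cr and back. The coefficients are the full-range ITU-R BT.601 (JFIF) ones; this is an assumption, since the slides do not say which conversion variant the course uses. The function names are illustrative.

```python
def quantize(value, bits=8):
    """Map an intensity in [0, 1] to an integer level in [0, 2**bits - 1]."""
    levels = 2 ** bits
    clipped = min(max(value, 0.0), 1.0)
    # int() truncates; the outer min() keeps value == 1.0 inside the range.
    return min(int(clipped * levels), levels - 1)


def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to full-range Y Cb Cr (BT.601 / JFIF, assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Inverse of rgb_to_ycbcr, up to rounding of the coefficients."""
    r = y + 1.402 * (cr - 128.0)
    g = y - 0.344136 * (cb - 128.0) - 0.714136 * (cr - 128.0)
    b = y + 1.772 * (cb - 128.0)
    return r, g, b


if __name__ == "__main__":
    # Black, mid-gray and white map to the ends and middle of the 8-bit range.
    print([quantize(v) for v in (0.0, 0.5, 1.0)])

    # A gray pixel carries no chrominance: Cb and Cr sit at the 128 offset.
    print(rgb_to_ycbcr(128, 128, 128))

    # Round trip for a saturated color; results are floats, not yet rounded.
    print(ycbcr_to_rgb(*rgb_to_ycbcr(200, 30, 90)))
```

Keeping Y Cb Cr as floats until the final step avoids accumulating rounding error; a real pipeline would round and clip to [0, 255] before storing 8-bit values.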