Lecture 1
Machine Vision (MCT-453)
Dr. Muhammad Usman
University of Engineering and Technology, Lahore (Faisalabad Campus)

Introduction

What is Machine Vision?
“The use of devices for optical, non-contact sensing to automatically receive and interpret an image of a real scene in order to obtain information and/or control machines or processes”

Machine Vision
• Machine Vision (MV) is concerned with the engineering of “integrated mechanical-optical-electronic-software systems”
• It examines:
– Natural objects and materials
– Human artifacts
– Manufacturing processes
• Its purpose is to:
– Detect defects
– Improve quality
– Improve operating efficiency
– Improve the safety of products and processes
• It is also used to control machines used in manufacturing

Goal – Machine Vision
• To create a model of the real world from images
– Recovering useful information about a scene from its two-dimensional projections
– This recovery requires the inversion of a many-to-one mapping
– Knowledge about the objects in the scene and the projection geometry is required
[Diagram label: Mathematical Model]

Goal – Steps
1. Image Acquisition
2. Pre-processing
3. Segmentation
4. Labelling
5. Feature Detection
6. Classification

Objectives
• Students will learn the tools to acquire and subsequently process images using a problem-solving approach.
• This approach requires assessing the needs first and then selecting solution components from a list of available procedures and algorithms.
• Although the course can be thought of as a mixture of algorithm development and mathematical implementation, the focus will be on understanding and developing algorithms.
• The course material and associated lab work are therefore aimed at giving students a range of image processing techniques to use later in their semester projects.
• The course also aims to motivate students by introducing the state of the art in the field of machine vision.

Course Contents
• Introduction: What is Machine Vision?
  Practical Mechatronic applications
• Image Acquisition and Representation: Concepts of representation of images, Digitization, binary, gray and color (RGB, CMYK, HSI etc.) images, elementary image processing functions (enhancement and filtration of digital images in the spatial as well as the frequency domain), image properties, adjacency conventions
• Fundamentals of Digital Image Processing: Point, Neighborhood, and Geometric operations, Image restoration, Mathematical Morphology
• Segmentation: Thresholding, Edge-based segmentation, Region-based Segmentation, Mean Shift Segmentation
• Image Analysis: Template Matching, Decision-theoretic approaches, The Hough transform
• Object Recognition: Statistical Pattern Recognition, Neural Nets, Syntactic Pattern recognition, Optimization techniques in recognition, Fuzzy Systems
• Motion Analysis: Differential motion analysis methods, Optical Flow, Analysis based on correspondence of interest points, Video tracking, Motion models to aid tracking
• Applications to robotics and intelligent machine interaction will also be included.
• Suggested Text:
1. Image Processing, Analysis and Machine Vision by Milan Sonka, Vaclav Hlavac and Roger Boyle
2. Machine Vision by David Vernon
3. Computer and Machine Vision - Theory, Algorithms, Practicalities by E. R. Davies
4. Digital Image Processing by Rafael C. Gonzalez
• Course Pre-requisites:
– MA-244: Probability and Statistics
– MA-234: Linear Algebra

Lecture 1 CLOs

CLOs – Machine Vision
CLO – 1 Understand fundamental concepts of machine vision systems and digital image processing.
CLO – 2 Apply basic image processing techniques including point, neighbourhood, geometric and morphological operations.
[Figure labels: Filter, Convolution, Contrast Stretching, Thresholding]
CLO – 3 Comprehend image processing algorithms for segmentation, image analysis, object recognition and motion analysis.

Relationship to other fields
Techniques developed in many areas are used for recovering information from images.

Image Processing:
• Usually transforms images into other images
• The task of information recovery is left to a human user
• e.g., image enhancement, compression, correcting blurred images
Machine Vision:
• Takes images as input but produces other kinds of output, such as a representation of the object contours in an image
• The emphasis is on recovering information automatically
Image processing algorithms are useful in the pre-processing stage of MV, to enhance particular information and suppress noise.

Computer Graphics (CG):
• Generates images from geometric primitives such as lines, circles, and free-form surfaces
• It plays a significant role in visualization and virtual reality
Machine Vision:
• Estimates the geometric primitives and other features from images
CG is the synthesis of images and MV is the analysis of images. MV uses curve and surface representations from CG, and CG uses MV techniques for creating realistic images.

Pattern Recognition:
• Classifies numerical and symbolic data
• Techniques from this field play an important role in MV for recognizing objects
• Many industrial applications rely heavily on pattern recognition
Artificial Intelligence:
• It is concerned with designing intelligent systems and with studying computational aspects of intelligence
• Techniques from this field are used in the later stages of MV
Perception translates signals from the world into symbols, cognition manipulates symbols, and action translates symbols into signals that effect changes in the world.
Computer vision is often considered a subfield of AI.
“MV produces measurements or abstractions from geometrical properties”
[Diagram labels: Measurement, Geometry, Interpretation, MV, Automated processing, IVS]

Lecture 2
Image Geometry

Vision
• Vision is the most powerful sense
• It allows us to interact with the world without making any direct physical contact
• Approx. 60% of the brain's processing is involved in visual perception
• It lets us navigate seamlessly in this complex world

Machine Vision
• The enterprise of building machines that can see or emulate human vision. Why?
• Several reasons:
– Various routine tasks can be performed by machines (e.g., tidying things up, driving home), so that we have time to perform other tasks
– Human vision focuses on the qualitative, not the quantitative
  • It is not capable of precise measurements of things in the physical world
– We can build a system that surpasses human vision and extracts information about the world that humans cannot perceive

Vision deals with images
• An image is an array of pixels
– A pixel has values
  • Brightness
  • Color
  • Distance (depth)
  • Material (soon)
[Figures: what we see in images; what the machine sees]
• Vision is challenging when we want to extract all the information we observed in the previous image. Therefore:
– Vision is hard
– It is multi-disciplinary (optics, mechanical, electronics, and IT; sometimes psychology and biology)
• Nevertheless, we have successful applications

Various Applications
• Factory automation
– Vision-guided robotics
  • Deals with uncertainty in the physical world (parts alignment, etc.)
– Visual inspection
  • Deals with defects
• Optical character recognition (OCR)
– Car number identification
• Security
– Object detection and tracking
• Biometrics
– Eye scan, face detection
• Many others:
– Optical mouse, gaming (Kinect), Snapchat, autonomous navigation (exploration, driverless cars), medical imaging

Image formation
There are two parts to the image formation process:
• The geometry of image formation
– This determines where in the image plane the projection of a point in the scene will be located
• The physics of light
– This determines the brightness of a point in the image plane as a function of scene illumination and surface properties

The position $(x', y')$ in the image plane of the point at position $(x, y, z)$ in the scene is found by intersecting the line of sight through $(x, y, z)$ with the image plane. Here $f$ denotes the distance of the image plane from the centre of projection.

The distance of the point $(x, y, z)$ from the z-axis is
$r = \sqrt{x^2 + y^2}$
The distance of the projected point $(x', y')$ from the origin of the image plane is
$r' = \sqrt{x'^2 + y'^2}$
The two triangles are similar, so the ratios of their corresponding sides must be the same:
$\frac{f}{z} = \frac{r'}{r}$
Similarly, from the second pair of similar triangles:
$\frac{x'}{x} = \frac{y'}{y} = \frac{r'}{r} = \frac{f}{z}$
The position of the point in the image plane is therefore given by
$x' = \frac{f}{z}\,x$
$y' = \frac{f}{z}\,y$

Industrial Vision System (IVS)
Inspection & other Applications

IVS-Inspection
• Inspection is necessary because consumers who purchase unsatisfactory products are less likely to make a repeat purchase.
• In the aerospace, automotive and food industries, failed products can cause injury or even fatal accidents.
• Humans are engaged in inspection tasks, but their performance is often less than satisfactory.
• In some cases, human inspection is not even possible.
• Thus, automated inspection is required.
• Automated inspection is highly desirable in the inspection of dangerous materials.
• These include flammable, explosive or radioactive substances.

Machine Vision Applications
• Vision systems are currently used extensively in the manufacturing industry, where they perform a very wide variety of inspection, monitoring and control functions.
– Electronics
– Automobiles
– Aircraft
– Domestic Applications (from furniture polish and toothpaste to refrigerators)
– Food Industry
– Agriculture
– Horticulture

Application Classified by Tasks
Present and projected applications of Machine Vision to natural products may be classified according to the function they perform.

Tasks | Function
Analyzing the Content | Count how many Sweets of each kind are put in a box
Analyzing the Shape | Fruit, Vegetables, Animal Carcasses
Analyzing Texture | Bread, Cork Floor Tiles, Wood Panels
Assembling Food Products | Pizza, Kimchi, Meat Pies
Checking Aesthetic Appearance | Loaves, Cakes, Quiches, Trifles
Cleaning | Selective Washing of Cauliflower, Broccoli, Leeks
Coating | Chocolate Coating of Confectionery Bars, Icing
Counting | Counting Cherries on the surface of a Cake
Decorating | Cakes, Chocolates, Trifles, Pies
Detecting Foreign Bodies | Seeds, Nut Shells, Twigs, Stones, Contact Lenses
Detecting Surface Contamination | Mildew, Mud, Bird Excrement
Estimating the Size or Volume | Fruit, Fish/Animal/Poultry Carcasses, Meat
Grading | Identifying Premier-Quality Fruit and Vegetables
Harvesting | There is a huge variety of tasks of this type
Packaging | Fragile/Variable Products, Cream Cakes, Meringues
Sorting, Spraying, Cutting | Fruit from Leaves; Fish by Size/Species on a Trawler

Other Machine Vision Applications

Application | Functions
Document Processing | Optical Character Recognition/Verification and Document Authentication
Security and Surveillance | Identifying Intruders in Secure Spaces
Medicine and Health Screening | Cell Samples for Genetic Screening, Identifying Cancer Cells
Military | Target Identification and Fire Control
Traffic Control/Monitoring | Both Pedestrian and Motor Vehicles
Forensic Science | Fingerprint Analysis
Research | Astronomy, Bio-Medical, Particle Physics, Materials Engineering

Machine Vision Major Components

Machine Vision
As an absolute minimum, a machine vision system must contain:
• some means of presenting the object to be inspected to the camera;
• lights;
• a camera;
• an electronic circuit card to digitize the signal from the camera;
• a computer, or dedicated electronic image-processing hardware;
• software, if a computer is used for image processing;
• an actuator: this may be anything from a simple accept/reject gate to a multi-axis robot.

Automated Vision Inspection (AVI)
• AVI operates by employing a camera to acquire an image of the object being inspected and then using appropriate image processing hardware and software routines to find and classify areas of interest in the image.
• Generally, AVI involves the following processing stages:
– Image acquisition, to obtain an image of the object to be inspected;
– Image enhancement, to improve the quality of the acquired image, which facilitates later processing;
– Segmentation, to divide the image into areas of interest and background. The result of this stage is called the segmented image, where objects represent the areas of interest;
– Feature extraction, to calculate the values of parameters that describe each object;
– Classification, to determine what is represented by each object.

Image Processing
Image processing involves changing the nature of an image in order to either
• improve its pictorial information for human interpretation, or
• render it more suitable for autonomous machine perception
Humans like their images to be sharp, clear and detailed; machines prefer their images to be simple and uncluttered.

Aspects of Image Processing
• We can subdivide the different image processing algorithms into broad subclasses
• Image enhancement.
  This refers to processing an image so that the result is more suitable for a particular application. It includes:
– Sharpening or de-blurring an out-of-focus image
– Highlighting edges
– Improving image contrast, or brightening an image
– Removing noise
• Image restoration. This may be considered as reversing the damage done to an image by a known cause, for example:
– Removing blur caused by linear motion
– Removing optical distortions
– Removing periodic interference
• Image segmentation. This involves subdividing an image into constituent parts, or isolating certain aspects of an image:
– Finding lines, circles, or particular shapes in an image
– In an aerial photograph, identifying cars, trees, buildings, or roads

An Image Processing Task
Example task: automatically reading the postcode from an envelope.
• Acquiring the image. First we need to produce a digital image from a paper envelope.
• Preprocessing. This is the step taken before the major image processing task, to render the resulting image more suitable for the job. It may involve enhancing the contrast, removing noise, or identifying regions likely to contain the postcode.
• Segmentation. Here is where we actually get the postcode; in other words, we extract from the image that part of it which contains just the postcode.
• Representation and description. These terms refer to extracting the particular features which allow us to differentiate between objects. Here we will be looking for curves, holes and corners which allow us to distinguish the different digits which constitute a postcode.
• Recognition and interpretation. This means assigning labels to objects based on their descriptors (from the previous step), and assigning meanings to those labels. So we identify particular digits, and we interpret a string of four digits at the end of the address as the postcode.
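The stages of the image processing task above can be sketched on a toy example. The following is only an illustrative sketch in plain Python, not a method from the lecture: the tiny synthetic image, the function names (`stretch_contrast`, `threshold`, `describe`, `recognise`), the threshold of 128, and the labelling rule are all assumptions chosen to keep the example short.

```python
# Toy walk-through of the task stages on a tiny synthetic grayscale image.
# Every name, the threshold, and the labelling rule are illustrative assumptions.

def stretch_contrast(img):
    """Preprocessing: linearly map the darkest pixel to 0 and the brightest to 255."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    if hi == lo:  # flat image: nothing to stretch
        return [row[:] for row in img]
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in img]

def threshold(img, t=128):
    """Segmentation: mark pixels >= t as object (1), the rest as background (0)."""
    return [[1 if p >= t else 0 for p in row] for row in img]

def describe(mask):
    """Representation and description: area and bounding box of the object pixels."""
    coords = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return {"area": len(coords), "bbox": (min(rows), min(cols), max(rows), max(cols))}

def recognise(features):
    """Recognition: a made-up rule that labels the object by its bounding-box shape."""
    if features is None:
        return "nothing found"
    top, left, bottom, right = features["bbox"]
    height, width = bottom - top + 1, right - left + 1
    return "vertical bar" if height > width else "horizontal bar or square"

# "Acquired" image: a faint 3x2 bright patch on a dark, low-contrast background.
acquired = [
    [60, 62, 61, 60, 63],
    [61, 110, 112, 60, 62],
    [60, 111, 113, 61, 60],
    [62, 112, 110, 60, 61],
    [60, 61, 62, 63, 60],
]

enhanced = stretch_contrast(acquired)   # preprocessing
mask = threshold(enhanced)              # segmentation
features = describe(mask)               # representation and description
label = recognise(features)             # recognition and interpretation
print(features, label)
```

Without the contrast stretch, no pixel of the acquired image would reach the threshold of 128, which is one reason preprocessing comes before segmentation. A real postcode reader would replace each stage with far more capable methods.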
[Figures: enhancing the edges of an image to make it appear sharper; removing noise from an image; removing motion blur from an image; obtaining the edges of an image; removing detail from an image]

Image Acquisition

Image Acquisition and Sampling
• Sampling refers to the process of digitizing a continuous function. For example, suppose we take the function $y = \sin(x) + \frac{1}{3}\sin(3x)$
• A continuous function can be reconstructed from its samples provided that the sampling frequency is at least twice the maximum frequency in the function (the Nyquist criterion)
• We consider an image as a continuous function of two variables, which is then sampled and quantized to produce a digital image
• The sampling rate determines how many pixels the digital image will have
• Quantization determines how many intensity levels will be used to represent the intensity value at each sample point
• To view the scene, we record the energy reflected from it; we may use visible light, or some other energy source

Using light
• It is the predominant energy source for images
• It is the energy source which human beings can observe directly
• It has the advantage of being safe, cheap, easily detected and readily processed with suitable hardware
• Two very popular methods of producing a digital image are with
– A digital camera
– A flat-bed scanner

Digital Camera
• Such a camera has an array of photo-sites; these are silicon electronic devices whose voltage output is proportional to the intensity of light falling on them.
• For a camera attached to a computer, information from the photo-sites is then output to a suitable storage medium.
• Generally this is done in hardware, using a frame-grabbing card.
• This allows a large number of images to be captured in a very short time, in the order of one ten-thousandth of a second each.
• The images can then be copied onto a permanent storage device.
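Returning to the sampling and quantization slides above, the sketch below samples the example function $y = \sin(x) + \frac{1}{3}\sin(3x)$ at two different spacings and then quantizes the samples. It is a minimal illustration in plain Python under stated assumptions: the helper names (`sample`, `quantize`), the choice of 8 levels, and the value range of -1 to 1 are not from the lecture. The highest angular frequency in the function is 3, so the Nyquist criterion asks for a sample spacing of at most $\pi/3$.

```python
import math

def f(x):
    """The example function from the sampling slide."""
    return math.sin(x) + math.sin(3 * x) / 3

def sample(func, start, stop, step):
    """Sampling: evaluate func at evenly spaced points from start to stop."""
    n = int(round((stop - start) / step))
    return [func(start + k * step) for k in range(n + 1)]

def quantize(values, levels, lo, hi):
    """Quantization: map each value in [lo, hi] to an integer code 0..levels-1."""
    scale = (levels - 1) / (hi - lo)
    return [min(levels - 1, max(0, round((v - lo) * scale))) for v in values]

# Highest angular frequency is 3 rad per unit, i.e. 3/(2*pi) cycles per unit,
# so the sample spacing should not exceed pi/3.
nyquist_step = math.pi / 3

# Too coarse (spacing pi > pi/3): every sample falls on a zero crossing,
# so the samples look like the constant function 0.
coarse = sample(f, 0.0, 2 * math.pi, math.pi)

# Fine enough (spacing pi/8 < pi/3): the shape of the function is captured.
fine = sample(f, 0.0, 2 * math.pi, math.pi / 8)

# Quantize the fine samples to 8 levels (3 bits), assuming values lie in [-1, 1].
codes = quantize(fine, levels=8, lo=-1.0, hi=1.0)

print([round(v, 3) for v in coarse])
print(codes)
```

In an image, the sample spacing plays the role of the pixel pitch and the number of quantization levels plays the role of the number of grey levels (for example, 256 for an 8-bit image).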
Digital Camera
• The output will be an array of values, each representing a sampled point from the original scene. The elements of this array are called picture elements, or more simply pixels.
• Digital still cameras use a range of storage devices, from floppy discs and CDs to various specialized cards and memory sticks.

Flat-Bed Scanner
• This works on a principle similar to the digital camera.
• Instead of the entire image being captured at once on a large array, a single row of photo-sites is moved across the image, capturing it row by row as it moves.
• This is a much slower process than taking a picture with a camera, so it is quite reasonable to allow all capture and storage to be handled by suitable software.

Other Energy Sources
• Visible light is part of the electromagnetic spectrum: radiation in which the energy takes the form of waves of varying wavelength.
• X-rays
– X-ray tomography
– CAT (Computed Axial Tomography)
• As the beam moves around the object, an image of the object can be constructed; such an image is called a tomogram.

Image Definition

Images and Digital Images
• Consider an image as a two-dimensional function $f(x, y)$, where the function values give the brightness of the image at any given point
• Image brightness values can be any real numbers in the range 0.0 (black) to 1.0 (white)
• The ranges of x and y will clearly depend on the image, but they can take all real values between their minima and maxima
• A digital image differs from such an image in that the x, y and $f(x, y)$ values are all discrete
• For example, x and y may each range from 1 to 256, with brightness ranging from 0 (black) to 255 (white)
• A digital image can be considered as a large array of sampled points, each of which has a particular quantized brightness
• The pixels surrounding a given pixel form its neighborhood; neighborhoods usually have odd numbers of rows and columns, so that the pixel of interest lies at the centre

Types of Images

Types of Digital Images
• There are four basic types of images
• Binary: each pixel is just black or white.
– Since there are only two possible values for each pixel, we only need one bit per pixel.
– Images for which a binary representation may be suitable include text (printed or handwritten), fingerprints, and architectural plans
• Grayscale (intensity): each pixel is a shade of grey, normally from 0 (black) to 255 (white)
– This means each pixel can be represented by eight bits
– This is a natural range for image file handling
– Such images arise in medicine (X-rays) and in images of printed works; 256 different gray levels are sufficient for the recognition of most natural objects
• True color, or RGB: each pixel has a particular color, described by the amount of red, green and blue in it
– If each of these components has a range 0-255, this gives a total of $256^3 = 16{,}777{,}216$ different possible colors in the image
– Since the total number of bits required for each pixel is 24, such images are also called 24-bit color images
– For every pixel there correspond three values
• Indexed: most color images only have a small subset of the more than sixteen million possible colors
– The image has an associated color map, which is simply a list of all the colors used in that image
– Each pixel has a value which does not give its color (as for an RGB image), but an index to the color in the map
– It is convenient if an image has 256 colors or fewer, for then the index values will only require one byte each to store

Image File Sizes
• A 512 × 512 binary image requires 512 × 512 × 1 = 262,144 bits = 32,768 bytes = 32.768 KB ≈ 0.033 MB
• A grayscale image of the same size requires 512 × 512 × 1 = 262,144 bytes = 262.14 KB ≈ 0.262 MB
• A color image, in which each pixel is associated with 3 bytes of color information, requires 512 × 512 × 3 = 786,432 bytes = 786.43 KB ≈ 0.786 MB
• Satellite images may be of the order of several thousand pixels in each direction

Image perception
We should be aware of the limitations of the human visual system.
Image perception consists of two basic steps:
• capturing the image with the eye,
• recognizing and interpreting the image with the visual cortex in the brain
The combination and immense variability of these steps influence the ways in which we perceive the world around us.

There are a number of things to bear in mind:
1. Observed intensities vary with the background
• A single block of grey will appear darker if placed on a white background than if it were placed on a black background
• That is, we don't perceive grey scales “as they are”, but rather as they differ from their surroundings
2. We may observe non-existent intensities as bars in continuously varying grey levels
3. Our visual system tends to undershoot or overshoot around the boundary of regions of different intensities

Thank You