Computer Vision
CSA 401
Department of CSE, School of Engineering & Technology, Sharda University
Dr. Ali Imam Abidi

Vision of the University
To serve the society by being a global University of higher learning in pursuit of academic excellence, innovation and nurturing entrepreneurship.

Mission of the University
1. Transformative educational experience
2. Enrichment by educational initiatives that encourage global outlook
3. Develop research, support disruptive innovations and accelerate entrepreneurship
4. Seeking beyond boundaries

Outcome-Based Education (OBE)
An education theory that organizes the teaching of a subject around a set of explicit goals, and asks whether those goals are being achieved and to what degree.

Course Outcomes (COs)
• Course outcomes are the goals against which learning of the course itself is assessed.
• They provide an estimate, at the end of the course, of the degree to which the intended outcomes, and the subject matter itself, have been achieved.

Course Outcomes (COs)
1. Define the fundamentals of Computer Vision and Computer Graphics and relate them to real-world applications.
2. Explain image formation models and the mathematical basis for various projection systems.
3. Apply image processing techniques such as segmentation and edge detection to real-time and real-world applications.
4. Analyze various feature extraction techniques for different problem domains.
5. Evaluate pattern recognition using clustering, classification, supervised learning and unsupervised learning techniques.
6. Build computer vision applications for real-world problems.

Computer Vision and Nearby Fields
• Computer Graphics: models to images
• Photography: images to images
• Image Processing: captured images to workable (refined) images
• Computer Vision: images to models

Computer Vision
• Make computers understand images and video.
  • What kind of scene is it?
  • Where are the cars?
  • How far is the building?
  • …

Vision is really hard
• Vision is an amazing feat of natural intelligence.
• Visual cortex occupies about 50% of the macaque brain.
• More of the human brain is devoted to vision than to anything else.
• Is that a queen or a bishop?

Why computer vision matters
• Safety
• Health
• Comfort
• Fun
• Security
• Access

Ridiculously brief history of computer vision
• 1966: Minsky assigns computer vision as an undergrad summer project
• 1960s: interpretation of synthetic worlds
• 1970s: some progress on interpreting selected images
• 1980s: ANNs come and go; shift toward geometry and increased mathematical rigor
• 1990s: face recognition; statistical analysis in vogue
• 2000s: broader recognition; large annotated datasets available; video processing starts
(Guzman '68; Ohta & Kanade '78; Turk and Pentland '91)

Optical character recognition (OCR)
• Technology to convert scanned documents to text.
• If you have a scanner, it probably came with OCR software.
• Digit recognition, AT&T Labs: http://www.research.att.com/~yann/
• License plate readers: http://en.wikipedia.org/wiki/Automatic_number_plate_recognition

Face detection
• Many new digital cameras now detect faces (Canon, Sony, Fuji, …).
• Smile detection: Sony Cyber-shot® T70 Digital Still Camera.

3D from thousands of images
• Building Rome in a Day: Agarwal et al. 2009.

Object recognition (in supermarkets)
• LaneHawk by Evolution Robotics: "A smart camera is flush-mounted in the checkout lane, continuously watching for items. When an item is detected and recognized, the cashier verifies the quantity of items that were found under the basket, and continues to close the transaction. The item can remain under the basket, and with LaneHawk, you are assured to get paid for it…"
Vision-based biometrics
• "How the Afghan Girl was Identified by Her Iris Patterns" (Wikipedia).

Login without a password
• Fingerprint scanners on many new laptops and other devices.
• Face recognition systems now beginning to appear more widely: http://www.sensiblevision.com/

Object recognition (in mobile phones)
• Point & Find, Nokia
• Google Goggles

Special effects: shape capture
• The Matrix movies, ESC Entertainment, XYZRGB, NRC.

Special effects: motion capture
• Pirates of the Caribbean, Industrial Light and Magic.

Sports

Smart cars
• Mobileye (slide content courtesy of Amnon Shashua).
• Vision systems currently in high-end BMW, GM and Volvo models.
• By 2010: 70% of car manufacturers.

Google cars
• http://www.nytimes.com/2010/10/10/science/10google.html?ref=artificialintelligence

Interactive games: Kinect
• Object recognition: http://www.youtube.com/watch?feature=iv&v=fQ59dXOo63o
• Mario: http://www.youtube.com/watch?v=8CTJL5lUjHg
• 3D: http://www.youtube.com/watch?v=7QrnwoO1-8A
• Robot: http://www.youtube.com/watch?v=w8BmgtMKFbY

Vision in space
• NASA's Mars Exploration Rover Spirit captured this westward view from atop a low plateau where Spirit spent the closing months of 2007.
• Vision systems (JPL) used for several tasks:
  • Panorama stitching
  • 3D terrain modeling
  • Obstacle detection, position tracking
• For more, read "Computer Vision on Mars" by Matthies et al.

Industrial robots
• Vision-guided robots position nut runners on wheels.

Mobile robots
• NASA's Mars Spirit Rover: http://en.wikipedia.org/wiki/Spirit_rover
• http://www.robocup.org/
• Saxena et al. 2008, STAIR at Stanford.

Medical imaging
• 3D imaging: MRI, CT.
• Image-guided surgery: Grimson et al., MIT.

The Computer Vision Hierarchy
• Low-level vision: process the image for feature extraction (edges, corners, optical flow).
• Middle-level vision: object recognition, motion analysis and 3D reconstruction, using features obtained from low-level vision.
• High-level vision: interpretation of the evolving information provided by middle-level vision, as well as directing which middle- and low-level vision tasks should be performed.

The Computer Vision Hierarchy
• Low level: image → image
• Mid level: image → features
• High level: features → analysis

Low Level Vision (≈ Digital Image Processing)
• Image acquisition: image captured by a sensor (digitized).
• Pre-processing: denoising, sharpening/blurring.
• Image enhancement: contrast stretch.
• Image segmentation: object separation from background; region growing, edge linking.
• (A minimal code sketch of these stages appears after the High Level Vision overview below.)

Mid Level Vision
• Attribute/feature selection.
• Attribute/feature extraction: features like edges and points (corners).
• Feature subset selection etc.

High Level Vision
• 'High-level' vision lacks a single, agreed-upon definition.
• It might usefully be defined as those stages of visual processing that transition from analyzing local image structure to analyzing the structure of the external world that produced those images.
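To make the lower rungs of this hierarchy concrete, here is a minimal sketch of the low-level stages listed above (acquisition, denoising, contrast stretch, and a crude edge-based segmentation cue). It assumes OpenCV (cv2) and NumPy are installed; the filename "scene.jpg" and all parameter values are illustrative, not prescribed by the course.

```python
# A minimal sketch of the low-level vision stages:
# acquisition -> denoising -> contrast stretch -> edge-based segmentation cue.
import cv2
import numpy as np

img = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)   # acquisition (already digitized)

denoised = cv2.GaussianBlur(img, (5, 5), sigmaX=1.0)  # pre-processing: denoise/blur

# Enhancement: linear contrast stretch to the full 8-bit range.
lo, hi = denoised.min(), denoised.max()
stretched = ((denoised - lo) * (255.0 / max(hi - lo, 1))).astype(np.uint8)

edges = cv2.Canny(stretched, 50, 150)  # crude edge cue toward segmentation
cv2.imwrite("edges.png", edges)
```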
High Level Vision
• The central computational challenge of vision arises from the fact that it is an 'ill-posed' problem:
  • The external world is three-dimensional and made up of surfaces with different reflectance properties, yet the visual system must make do with a pair of two-dimensional retinas containing only a handful of photoreceptor types.
  • Depending on the object's position relative to the viewer, the configuration of light sources, and the presence of other objects, any given object can cast an effectively infinite number of different images onto the retina.
• While much research has focused on object recognition as a central task in vision, many real-world scenes are only poorly described by object labels: such labels provide an extremely impoverished description of the scene, and it is unclear whether some of the labels (e.g. 'planter') are valid for animals besides humans.
• A segmentation-based description of the scene is better, but still represents a shadow of the total information content of the scene.
• We are easily able to extract a wealth of information from a scene, even for objects that are problematic to label. This additional information includes 3D information, such as normal vector directions, and even more abstract task-driven information, such as whether a portion of the scene is in the open or under cover. Such tasks may represent a more natural framing context for high-level vision, especially in non-human animals.

High Level Vision
• Object recognition:
  • Detection of classes of objects (faces, motorbikes, trees etc.) in images.
  • Recognition of specific objects, such as George Bush/Bill Clinton or a specific machine part.
• Classification of images or parts of images for medical or scientific applications.
• Recognition of events in surveillance videos.
• Measurement of distances for robotics.

High Level Vision Tools
• Graph matching: A*, constraint satisfaction, branch-and-bound search, simulated annealing.
• Learning methodologies: decision trees, neural nets, SVMs, EM classifier.
• Probabilistic reasoning, belief propagation, graphical models.

Overview of Diverse Computer Vision Applications

Document Image Analysis
• Algorithms and techniques that are applied to images of documents to obtain a computer-readable description from pixel data.
• [Flow: document image (non-textual format, pixel data) → algorithms & techniques → textual format]
• Optical Character Recognition (OCR): software that recognizes characters in a scanned document.
• Objective of document image analysis: to recognize the text and graphics components in images of documents, and to extract the intended information as a human would.
• [Figure: a hierarchy of document processing subareas listing the types of document components dealt with in each subarea. (Reproduced with permission from O'Gorman & Kasturi 1997.)]

Two components + pictures
1. Textual processing: deals with the text components of a document image. Some tasks here are:
  • determining the skew (any tilt at which the document may have been scanned into the computer),
  • finding columns, paragraphs, text lines, and words, and
  • finally recognizing the text (and possibly its attributes such as size, font etc.) by optical character recognition (OCR).
2. Graphics processing: deals with the non-textual line and symbol components that make up line diagrams, delimiting straight lines between text sections, company logos etc.
3. Pictures are a third major component of documents, but except for recognizing their location on a page, further analysis of these is usually the task of other image processing and machine vision techniques.
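A minimal sketch of the OCR step of textual processing, assuming the pytesseract wrapper around the Tesseract engine and Pillow are installed; "page.png" is a hypothetical scanned page, and a real pipeline would first correct skew and segment columns as described above.

```python
# A minimal OCR sketch for the textual-processing step above.
import pytesseract
from PIL import Image

page = Image.open("page.png").convert("L")  # grayscale scanned page
text = pytesseract.image_to_string(page)    # recognize the text (OCR)
print(text)
```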
Biometrics
• Associating an identity with an individual is called personal identification.
• Resolving the identity of a person can be categorized into two fundamentally distinct types of problems:
  • Verification
  • Recognition (identification)
• [Diagram: identity resolution → verification | recognition (identification)]
• Verification (authentication) refers to the problem of confirming or denying a person's claimed identity.
• Identification refers to the problem of establishing a subject's identity, either from a set of already known identities (closed identification problem) or otherwise (open identification problem).
• The term positive personal identification typically refers (in both the verification and the identification context) to identification of a person with high certainty.
• An engineering approach to the (abstract) problem of authenticating a person's identity is to reduce it to the problem of authenticating a concrete entity related to the person. These entities include:
  • a person's possession ("something that you possess"), e.g., permit physical access to a building to all persons whose identity could be authenticated by possession of a key;
  • a person's knowledge of a piece of information ("something that you know"), e.g., permit login access to a system to a person who knows the user-id and a password associated with it.
• Some systems, e.g., ATMs, use a combination of "something that you have" (ATM card) and "something that you know" (PIN) to establish an identity: ATM card + PIN → access.
• The problem with the traditional approaches of identification using possession as a means of identity is that possessions can be lost, stolen, forgotten, or misplaced. Further, once in control of the identifying possession, by definition, any other "unauthorized" person could abuse the privileges of the authorized user.
• The problem with using knowledge as an identity authentication mechanism is that it is difficult to remember passwords/PINs, while easily recallable passwords/PINs are easily guessed.
• Another approach to positive identification has been to reduce the problem of identification to the problem of identifying physical characteristics of the person. The characteristics could be either a person's physiological traits, e.g., fingerprints, hand geometry, etc., or behavioral characteristics, e.g., voice and signature. This method of identifying a person based on his/her physiological/behavioral characteristics is called biometrics.
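The distinction between verification (a 1:1 comparison against a claimed identity) and identification (a 1:N search over enrolled identities) can be sketched in a few lines of toy Python. Everything here is illustrative: a real system would compare biometric feature vectors with a trained matcher rather than single numbers, and the threshold would be chosen from error-rate requirements.

```python
# Toy sketch contrasting verification (1:1) with identification (1:N).
from typing import Dict, Optional

THRESHOLD = 0.8  # assumed operating point

def match_score(probe: float, template: float) -> float:
    """Toy similarity in [0, 1]; stands in for a real biometric matcher."""
    return max(0.0, 1.0 - abs(probe - template))

def verify(probe: float, claimed_id: str, gallery: Dict[str, float]) -> bool:
    """Verification: compare the probe against ONE claimed identity."""
    return match_score(probe, gallery[claimed_id]) >= THRESHOLD

def identify(probe: float, gallery: Dict[str, float]) -> Optional[str]:
    """Closed-set identification: search ALL enrolled identities for the best match."""
    best_id = max(gallery, key=lambda i: match_score(probe, gallery[i]))
    if match_score(probe, gallery[best_id]) >= THRESHOLD:
        return best_id
    return None  # open-set outcome: no enrolled identity matches

gallery = {"alice": 0.30, "bob": 0.75}  # enrolled templates (toy values)
print(verify(0.32, "alice", gallery))   # True: the claim is confirmed
print(identify(0.74, gallery))          # "bob"
```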
Biological Measurements That Qualify To Be A Biometric
• Any human physiological or behavioral characteristic can serve as a biometric provided it has the following desirable properties:
  i. Universality: every person should have the characteristic.
  ii. Uniqueness: no two persons should be the same in terms of the characteristic.
  iii. Permanence: the characteristic should be invariant with time.
  iv. Collectability: the characteristic can be measured quantitatively.
• Important requirements:
  i. Performance: the achievable identification accuracy, the resource requirements to achieve an acceptable identification accuracy, and the working or environmental factors that affect the identification accuracy.
  ii. Acceptability: to what extent people are willing to accept the biometric system.
  iii. Circumvention: how easy it is to fool the system by fraudulent techniques.

Biometric Markers: Overview
• No single biometric is expected to effectively satisfy the needs of all identification (authentication) applications.
• Each biometric has its strengths and limitations; accordingly, each biometric appeals to a particular identification (authentication) application.
  1. Voice
  2. Infrared facial and hand vein thermograms
  3. Fingerprints
  4. Face
  5. Iris
  6. Ear
  7. Gait
  8. Keystroke dynamics
  9. DNA
  10. Signature
  11. Retinal scan
  12. Hand and finger geometry

Object Recognition
• An object recognition system finds objects in the real world from an image of the world, using object models which are known a priori.
• The object recognition problem can be defined as a labeling problem based on models of known objects.
• Formally, given an image containing one or more objects of interest (and background) and a set of labels corresponding to a set of models known to the system, the system should assign correct labels to regions, or a set of regions, in the image.
• The OR problem is closely tied to the segmentation problem: without at least a partial recognition of objects, segmentation cannot be done, and without segmentation, object recognition is not possible.

OR: System Components
• Model database (also called modelbase)
• Feature detector
• Hypothesizer
• Hypothesis verifier

1. Model database
• Contains all the models known to the system; the information in the model database depends on the approach used for recognition:
  • a qualitative or functional description, and/or precise geometric surface information.
• The representation of an object should capture all relevant information without any redundancies, and should organize this information in a form that allows easy access by different components of the object recognition system.
• A feature is some attribute of the object that is considered important in describing and recognizing the object in relation to other objects. Size, color, and shape are some commonly used features.

2. Feature detector
• Applies operators to images and identifies locations of features that help in forming object hypotheses.
• The features used by a system depend on the types of objects to be recognized and the organization of the model database.
• Using the detected features in the image, the hypothesizer assigns likelihoods to objects present in the scene. This reduces the search space for the recognizer by using certain features.
• The modelbase is organized using some type of indexing scheme to facilitate elimination of unlikely object candidates from possible consideration.

2.a. Feature extraction
• Which features should be detected, and how can they be detected reliably?
• Most features can be computed in two-dimensional images, but they are related to three-dimensional characteristics of objects.
• Due to the nature of the image formation process, some features are easier to compute than others.

2.b. Feature-model matching
• How can features in images be matched to models in the database? (See the sketch below.)
• In most object recognition tasks, there are many features and numerous objects.
• An exhaustive matching approach will solve the recognition problem but may be too slow to be useful.
• The effectiveness of the features and the efficiency of the matching technique must be considered in developing a matching approach.
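As a concrete, hedged illustration of feature detection and feature-model matching, the sketch below uses OpenCV's ORB keypoints with brute-force descriptor matching. ORB is one classical choice among many, not the course's prescribed method; "model.png" and "scene.png" are hypothetical files standing in for a model view and a scene image.

```python
# Feature detection + feature-model matching sketch with ORB keypoints.
import cv2

model = cv2.imread("model.png", cv2.IMREAD_GRAYSCALE)
scene = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=500)              # feature detector
kp_m, des_m = orb.detectAndCompute(model, None)  # model features
kp_s, des_s = orb.detectAndCompute(scene, None)  # scene features

# Brute-force matching of binary descriptors; crossCheck keeps mutual best matches.
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(bf.match(des_m, des_s), key=lambda m: m.distance)

# Many good (low-distance) matches support the hypothesis "model is in scene".
print(f"{len(matches)} matches; best distance: {matches[0].distance:.0f}")
```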
3. Hypothesis formation
• How can a set of likely objects be selected based on the feature matching?
• How can probabilities be assigned to each possible object?
• The hypothesis formation step is basically a heuristic to reduce the size of the search space.
• It uses knowledge of the application domain to assign some kind of probability or confidence measure to different objects in the domain, reflecting the likelihood of the presence of objects based on the detected features.
• Further reading:
  • https://www.mygreatlearning.com/blog/yolo-object-detection-using-opencv/
  • https://www.kdnuggets.com/2020/08/metrics-evaluate-deep-learning-object-detectors.html
  • https://apple.github.io/turicreate/docs/userguide/object_detection/
  • https://www.fritz.ai/image-recognition/

Object Verification
• How can object models be used to select the most likely object from the set of probable objects in a given image?
• The presence of each likely object can be verified by using its model.
• One must examine each plausible hypothesis to verify the presence of the object or ignore it.
• If the models are geometric, it is easy to precisely verify objects using the camera location and other scene parameters.
• In other cases, it may not be possible to verify a hypothesis.

Medical Image Analysis: An Overview

Medical Imaging Modalities
• Common modalities include CT, MR, ultrasound, microscopy, and optical coherence tomography.
• Another critical topic is organ appearance: how different organs appear in different modalities, whether in a healthy state or a diseased state.
• E.g., bone appears brighter on X-rays and CT, and darker on T1 and T2 MR.
• Fatty regions appear brighter on MR and on ultrasound, but darker on X-rays.
• Water-filled regions appear brighter on MR but darker on ultrasound.
• Different organs under different modalities are thus viewed in quite different ways.
• References:
  • https://www.researchgate.net/figure/Typology-of-Medical-Imaging-Modalities_fig1_319535615
  • https://www.researchgate.net/figure/Overview-of-common-clinical-imaging-modalities-which-have-potential-for-multimodal_fig11_280117631
  • https://cancerimagingjournal.biomedcentral.com/articles/10.1186/s40644-020-00312-3

Modules of Medical Image Analysis
• https://link.springer.com/chapter/10.1007/978-3-540-74658-4_62

Medical Image Formats and Protocols
• Medical images can be efficiently processed, objectively evaluated and made available at many places at the same time by means of appropriate communication networks and protocols:
  • PACS: Picture Archiving and Communication Systems
  • DICOM: Digital Imaging and Communications in Medicine (see the sketch below)
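A minimal sketch of reading a DICOM file, assuming the pydicom package is installed; "slice.dcm" is a hypothetical file. Modality and PatientID are standard DICOM attributes; pixel_array requires the pixel data to be present and NumPy installed.

```python
# Reading a DICOM dataset and its image matrix with pydicom.
import pydicom

ds = pydicom.dcmread("slice.dcm")   # parse the DICOM dataset
print(ds.Modality, ds.PatientID)    # standard DICOM attributes
pixels = ds.pixel_array             # image matrix as a NumPy array
print(pixels.shape, pixels.dtype)
```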
Medical Image Analysis covers four major areas:
1. Image formation: all the steps from capturing the image to forming a digital image matrix.
2. Image visualization: all types of manipulation of this matrix, resulting in an optimized output of the image.
3. Image analysis: all the steps of processing used for quantitative measurements as well as abstract interpretations of medical images. This requires prior knowledge of the context and content of the images.
4. Image management: all techniques that provide efficient storage, communication, transmission, archiving, and access (retrieval) of images, since an uncompressed radiograph may require several megabytes of storage capacity. The methods of telemedicine are also a part of image management.

Low-Level vs. High-Level Medical Image Processing
• Low-level processing: manual or automatic techniques which can be implemented without a-priori knowledge of the specific content of the images.
  • This type of algorithm has similar effects regardless of the content of the images.
  • Morphological techniques.
  • E.g., histogram stretching of a radiograph improves the contrast just as it does on any holiday photograph.
• High-level processing: primarily image analysis methods, where prior knowledge is consequential.
  • Feature extraction, classification, interpretations etc.
  • E.g., the SURF algorithm, SVMs for images etc.

Degrees of abstraction of medical image data
• The raw data level records an image as a whole; the totality of all pixels is regarded on this level.
• The pixel level refers to discrete individual pixels.
• The edge level represents one-dimensional (1-D) structures, which are composed of at least two neighboring pixels.
• The texture level refers to two-dimensional (2-D) structures; on this level, however, the delineation of the area's contour may be unknown.
• The region level describes 2-D structures with a well-defined boundary.
• The object level associates textures or regions with a certain meaning or name, i.e. semantics is given on this level.
• The scene level considers the ensemble of image objects in spatial and/or temporal terms.

Enhancement (Why?)
• We can't distinguish between tissues: the nature of the physiological system under investigation and the procedures used in imaging may diminish the contrast and the visibility of details.
• The data is too noisy for a computer algorithm to perform well: medical images are often deteriorated by noise due to various sources of interference and other phenomena that affect the measurement processes in imaging and data acquisition systems.
• Imaging artifacts interfere with visualization or computer processing.

How?
• Increase contrast
• Remove noise
• Emphasize edges: edge boost, unsharp masking
• Modify shapes
• Examples: contrast enhancement by histogram equalization; enhancement by adaptive wavelet shrinkage denoising; enhancement by adaptive filtering (noise or speckle reduction).

Convolution
• A mathematical operation; convolution is performed using templates.
• A template is a mostly small, squared mask with odd lateral length. This template is mirrored along two axes (hence the name convolution) and positioned in one corner of the input image. The image pixels under the mask are called the kernel.
• The sliding average (a) and the binomial low-pass filter (b) cause a smoothing of the image. The binomial high-pass filter (c), however, increases contrast and edges, but also the noise in the image. Templates (a) to (c) must be normalized to make sure that the range of values is not exceeded. The contrast filter (d) is based on integer pixel values; the convolution with (d) is therefore very easy to calculate. The anisotropic templates (e) and (f) belong to the family of Sobel operators.
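The template convolution just described can be written out directly in NumPy. The sketch below uses a normalized 3×3 binomial low-pass template like (b) above; the random input image is purely for illustration.

```python
# Template convolution written out explicitly, with border replication.
import numpy as np

def convolve2d(image: np.ndarray, template: np.ndarray) -> np.ndarray:
    """Convolve image with template (mirrored along both axes), 'same' size output."""
    t = np.flip(template)                   # the mirroring that makes it convolution
    k = t.shape[0] // 2
    padded = np.pad(image, k, mode="edge")  # replicate border pixels
    out = np.zeros_like(image, dtype=float)
    for y in range(image.shape[0]):
        for x in range(image.shape[1]):
            # "kernel" = the image pixels currently under the mask at (y, x)
            kernel = padded[y:y + t.shape[0], x:x + t.shape[1]]
            out[y, x] = np.sum(kernel * t)
    return out

binomial = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]]) / 16.0  # normalized template (b)
img = np.random.randint(0, 256, (8, 8)).astype(float)
print(convolve2d(img, binomial).round(1))
```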
Registration
• Image registration is defined as the process of establishing correspondences between two images.
• It is the alignment/overlaying of two or more images so that the best structural superimposition can be achieved.
• Registration methods can be performed on two or more images, but in general registration involves only two images at a time.
• One is usually referred to as the source or moving image, while the other is referred to as the target or fixed image. The source image is denoted by S, while the target is denoted by T.
• [Figure: source (S) in green; target (T) in red; the registered result overlays the two.]
• Multimodal registration and fusion combine data obtained from a variety of imaging modalities (e.g., combining a CT and an MRI view of the same patient) to get more information about the disease at once.
• [Figure: multimodal registration and fusion. Row 1: T1-weighted MRI of a 66-year-old subject with right parietal glioblastoma. Row 2: corresponding PET layers after multimodal registration. Row 3: fusion of registered layers to support treatment planning. Row 4: the fusion of MRI with PET of the sensorimotor-activated cortex area proves that the relevant area is out of focus.]

Feature Extraction
• Feature extraction is defined as the first stage of intelligent (high-level) image analysis.
• It is followed by segmentation and classification, which often do not operate on the image itself, i.e. the data or pixel level, but are performed at a higher level of abstraction. Therefore, the task of feature extraction is to emphasize image information on the particular level where the following algorithms operate.
• Consequently, information provided on other levels must be suppressed; thus, data reduction is executed to obtain the characteristic properties.

Feature Extraction
• Data-based features (raw data level).
• Pixel-based features (individual pixels).
• Edge-based features (local contrast, i.e., a strong difference of gray-scale or color values of adjacent pixels).
• Textural features (e.g. the honeycomb-like lung):
  (i) structural approaches, which are based on texture primitives (so-called texels or textons) and their rules of combination, and
  (ii) statistical approaches.
• Regional features (object classification and identification):
  (i) localization-descriptive (along the major axes),
  (ii) delineation-descriptive measures such as shape, convexity, and length of the border etc.

Medical Image Segmentation
• Segmentation, the separation of structures of interest from the background and from each other, is an essential analysis function for which numerous algorithms have been developed in the field of image processing.
• The principal goal of the segmentation process is to partition an image into regions that are homogeneous with respect to one or more characteristics or features.
• Segmentation is an important tool in medical image processing, and it has been useful in many applications.
• [Example figures: a simple case and a complex case.]
• In medical imaging, segmentation is important for feature extraction, image measurements, and image display.
• In some applications it may be useful to classify image pixels into anatomical regions, such as bones, muscles, and blood vessels, while in others into pathological regions, such as cancer, tissue deformities, and multiple sclerosis lesions.
• Segmentation can be thought of as the preprocessor for further analysis.

Segmentation techniques can be divided into classes in different ways, e.g., based on the classification scheme:
• Manual, semi-automatic, and automatic.
• Pixel-based (local methods) and region-based (global methods).
• Low-level segmentation (thresholding, region growing, etc.; see the sketch below) versus model-based segmentation (multispectral or feature map techniques, Markov random fields, deformable models, etc.).
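As a minimal example of the low-level class above, the sketch below applies Otsu thresholding followed by connected-component labeling with OpenCV. "scan.png" is a hypothetical input; real medical images with noise and weak boundaries usually need the model-based methods discussed next.

```python
# Low-level segmentation sketch: Otsu threshold + connected components.
import cv2

img = cv2.imread("scan.png", cv2.IMREAD_GRAYSCALE)

# Global threshold chosen automatically by Otsu's method.
_, mask = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Label each connected foreground region (a crude object separation).
num_labels, labels = cv2.connectedComponents(mask)
print(f"{num_labels - 1} foreground regions found")  # label 0 is the background
```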
• Model-based techniques are suitable for segmentation of images that have artifacts, noise, and weak boundaries between structures.
• Deformable models: the snake model and level sets.
• Classical (thresholding, edge-based, and region-based techniques), statistical, fuzzy, and neural network techniques.

Classification
• Assigns all connected regions obtained from segmentation to particularly specified classes of objects.
• Usually, region-based features that capture the characteristics of the objects sufficiently abstractly are used to guide the classification process. In this case, another feature extraction step is performed between segmentation and classification.
• These features must be sufficiently discriminative and suitably adapted to the application, since they fundamentally impact the resulting quality of the classifier.

Classification
• Non-parametric classifiers: nearest neighbor (NN), k-NN.
• Parametric procedures: normally based on the assumption of distribution functions for the feature specifications of objects; this is not always possible in medical image processing.
• Statistical classifiers: regard object identification as a problem of statistical decision theory.
• Syntactic classifiers: based on a grammar, which can possibly generate an infinite number of symbol chains with a finite symbol formalism.
  • Can be understood as a knowledge-based classification system (expert system), because the classification is based on a formal heuristic, symbolic representation of expert knowledge, which is transferred into image processing systems by means of facts and rules.
• Computational intelligence-based classifiers: ANNs, GAs, fuzzy logic.

Face Detection/Recognition

Face Detection
• Face detection only involves the detection of a face within a digital image or video. It simply means that the face detection system can identify that there is a human face present in an image or video; it cannot identify that person.
• Face detection is a component of facial recognition systems: the first stage of facial recognition is detecting the presence of a human face in the first place.
• Face detection can also be used in cameras to help with auto-focus, as you will have noticed on some digital cameras and phones.
• Face detection does not identify people or give names to faces. The technology simply checks whether there is, in fact, a person in a certain photograph or video. It uses machine learning algorithms to scan digital images for human faces, typically by looking for the eyes first and then calculating the edges of each human face. This is how the system pinpoints exactly where human faces are and counts how many people are present in a photo or video.
• In short: labelling an object/element in an input image/video as a 'Face'.

Challenges:
1. Pose variation: the ideal scenario (only frontal images) is not likely in uncontrolled/dynamic situations, due to the subject's movements or the camera's angle.
2. Feature occlusion: elements like beards, glasses or hats introduce high variability; faces may be partially covered by objects or other faces.
3. Facial expression: facial features also vary greatly because of different facial gestures.
4. Imaging conditions: different cameras and ambient conditions.

Face Recognition
The problem: given a still image or video of a scene, identify or verify one or more persons in this scene using a stored database of facial images.
1. Who is this person?
2. Is he/she who he/she claims to be?
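As a sketch of the detection stage (not recognition), the snippet below runs OpenCV's bundled Haar-cascade face detector, a classical method rather than the one prescribed here; "photo.jpg" and the detector parameters are illustrative.

```python
# Face detection sketch with OpenCV's bundled Haar cascade.
import cv2

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

img = cv2.imread("photo.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Returns one (x, y, w, h) box per detected face; no identities are assigned.
faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
print(f"{len(faces)} face(s) found")
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite("faces.png", img)
```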
Face recognition in humans
• The human visual system starts with a preference for face-like patterns.
• The human visual system devotes special neural mechanisms to face perception.
• Facial features are processed holistically.
• Among facial features, eyebrows are most important for recognition.
• Humans can recognize faces in very low-dimensional images.
• Tolerance to image degradation increases with familiarity.
• Color and texture are as important as shape.
• Illumination changes influence generalization.
• View-generalization is mediated by temporal association.

Challenges: Intrapersonal variations
• If people can do it so easily, why can't computers?
• Intrapersonal (intra-class) variations are variations in the appearance of the same face caused by:
  • Illumination variations
  • Pose variations
  • Facial expressions
  • Use of cosmetics and accessories, hairstyle changes
  • Temporal variations (aging, etc.)

Challenges: Interclass similarity
• Interclass similarity: different persons may have very similar appearance.
  • Twins
  • Relatives
  • Strangers may look alike

Challenges: Illumination variations
• Illumination variations may significantly affect the appearance of a face in 2D images.
• Recognition performance may drop by more than 40% for images taken outdoors!
• Humans have difficulties recognizing familiar faces when the light direction changes (e.g. top-lit → bottom-lit).

Challenges: Pose variations
• The difference between two images of the same subject under different view angles can be greater than the difference between images of two different subjects under the same view.

Challenges: Facial expressions
• Facial expressions caused by facial muscle movements may significantly deform the face surface.

Challenges: Disguises
• R. Singh, M. Vatsa and A. Noore, "Recognizing Face Images with Disguise Variations", Recent Advances in Face Recognition, I-Tech, Vienna, 2008.

Challenges: Information redundancy
• A 20×20 facial image has 400 pixels with 256 intensity values each, so there are 256^400 = 2^3200 possible combinations of intensity values.
• Total world population as of Sept. 2021: 7.9 billion ≈ 2^33.
• An extremely high-dimensional space: vastly more possible images than people.
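A quick check of the arithmetic above in Python: a 20×20 image with 256 gray levels per pixel spans 256^400 = 2^3200 possible images, while the 2021 world population fits in about 33 bits.

```python
# Verifying the dimensionality arithmetic with exact integer math.
n_images = 256 ** (20 * 20)
print(n_images == 2 ** 3200)       # True: (2**8)**400 = 2**3200
print(n_images.bit_length() - 1)   # 3200 bits

population = 7_900_000_000
print(population.bit_length())     # 33 -> 7.9 billion is roughly 2**33
```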