Stanford CS223B Computer Vision, Winter 2006
Lecture 14: Object Detection and Classification Using Machine Learning
Gary Bradski, Intel, Stanford
CAs: Dan Maynes-Aminzade, Mitul Saha, Greg Corrado
(Slides: Sebastian Thrun & Gary Bradski, Stanford University)

"Who will be strong and stand with me? Beyond the barricade, is there a world you long to see?"
-- Enjolras, "Do You Hear the People Sing?", Les Misérables

Goal: fast, accurate and general object recognition, with rapid learning and generalization.
(Warm-up example: this guy is wearing a haircut called a "mullet". Find the mullets…)

Approaches to Recognition
The approaches can be laid out along two axes: local vs. global features, and no geometry vs. geometric relations among features. Examples:
– Eigen objects (Turk): global, no geometry
– Constellation shape models (Perona): local features with geometric relations
– Patches and their relations (Ullman)
– Histograms (Schiele): local, no geometry
– HMAX (Poggio): hierarchical
– MRFs (Freeman, Murphy): local features with global context
We'll see a few of these.

Global: Eigenfaces
– Find a new coordinate system that best captures the scatter of the data.
– Eigenvectors point in the directions of scatter, ordered by the magnitude of their eigenvalues.
– We can typically prune the number of eigenvectors down to a few dozen.

Global: Eigenfaces, the algorithm [slide credit: Alexander Roth]
Assumptions:
– Square images with W = H = N
– M is the number of images in the database
– P is the number of persons in the database
The database: each training image is flattened into a column vector of length N^2,
  a = (a_1, a_2, ..., a_{N^2})^T,  b = (b_1, b_2, ..., b_{N^2})^T,  ...,  h = (h_1, h_2, ..., h_{N^2})^T.

We compute the average face
  m = (1/M) (a + b + ... + h),  with M = 8,
then subtract it from the training faces:
  a_m = a - m,  b_m = b - m,  c_m = c - m,  d_m = d - m,  e_m = e - m,  f_m = f - m,  g_m = g - m,  h_m = h - m.

Now we build the matrix A, which is N^2 by M:
  A = [a_m  b_m  c_m  d_m  e_m  f_m  g_m  h_m].
The covariance matrix, which is N^2 by N^2, is
  C = A A^T.
We want the eigenvalues of C, but:
– the matrix is very large
– the computational effort is very big
We are interested in at most M eigenvalues, so we can reduce the dimension of the matrix.

Global: Eigenvalue Theorem
Define
  C = A A^T  (dimension N^2 by N^2),
  L = A^T A  (dimension M by M, e.g., 8 by 8).
Let v be an eigenvector of L, i.e. L v = \lambda v. Then A v is an eigenvector of C with the same eigenvalue: C (A v) = \lambda (A v).
Proof: C (A v) = A A^T (A v) = A (A^T A v) = A (L v) = A (\lambda v) = \lambda (A v).
This vast dimensionality reduction is what makes the whole thing work.
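To make the dimensionality-reduction trick concrete, here is a minimal NumPy sketch of the training steps above. This is not code from the lecture; names such as train_eigenfaces are illustrative.

```python
# Minimal eigenfaces training sketch: instead of eigendecomposing
# C = A A^T (N^2 x N^2), decompose L = A^T A (M x M) and map its
# eigenvectors back with U = A V, as in the theorem above.
import numpy as np

def train_eigenfaces(faces, k=None):
    """faces: (M, N*N) array, one flattened training face per row."""
    mean_face = faces.mean(axis=0)
    A = (faces - mean_face).T            # N^2 x M matrix of centered faces
    L = A.T @ A                          # M x M, small
    eigvals, V = np.linalg.eigh(L)       # eigenvalues in ascending order
    V = V[:, np.argsort(eigvals)[::-1]]  # re-sort descending by eigenvalue
    U = A @ V                            # columns are eigenfaces of C = A A^T
    if k is not None:
        U = U[:, :k]                     # keep only the top-k eigenfaces
    U = U / np.linalg.norm(U, axis=0)    # normalize each eigenface
    return mean_face, U

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    faces = rng.random((8, 64 * 64))     # M = 8 fake 64x64 "faces"
    m, U = train_eigenfaces(faces, k=5)
    omega = U.T @ (faces[0] - m)         # projection onto face space
    print(omega.shape)                   # (5,)
```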
[slide credit: Alexander Roth]

Global: Eigenfaces, the algorithm (continued)
– Compute the matrix L = A^T A, which is only M by M.
– Find its M eigenvalues and eigenvectors; by the theorem above, the eigenvectors of C and L are equivalent in this sense.
– Build the matrix V from the eigenvectors of L.
– The eigenvectors of C are linear combinations of the image space with the eigenvectors of L: U = A V.
– These eigenvectors (the eigenfaces) represent the variation in the faces.

Compute for each face its projection onto the face space:
  \Omega_1 = U^T a_m,  \Omega_2 = U^T b_m,  \Omega_3 = U^T c_m,  \Omega_4 = U^T d_m,
  \Omega_5 = U^T e_m,  \Omega_6 = U^T f_m,  \Omega_7 = U^T g_m,  \Omega_8 = U^T h_m.
Compute the between-class threshold:
  \theta = (1/2) \max_{i,j} \| \Omega_i - \Omega_j \|  for i, j = 1, ..., M.

Global: Example
(Figures: example training set; the resulting eigenfaces; normalized eigenfaces. Photobook, MIT. Note that the normalized eigenfaces are sharper.)

Global: Eigenfaces, the algorithm in use
To recognize a face r = (r_1, ..., r_{N^2})^T, subtract the average face from it:
  r_m = r - m.
Compute its projection onto the face space:
  \Omega = U^T r_m.
Compute the distances in face space between this face and all known faces,
  \epsilon_i = \| \Omega - \Omega_i \|,  i = 1, ..., M,
and the distance \epsilon between r_m and its reconstruction from face space. Distinguish between the cases:
– If \epsilon \ge \theta, it's not a face.
– If \epsilon < \theta and \epsilon_i \ge \theta for all i, it's a new face.
– If \epsilon < \theta and \min_i \epsilon_i < \theta, it's a known face.
Aside: beyond uses in recognition, eigen-"backgrounds" can be very effective for background subtraction.

Global: Eigenfaces, problems
Eigenfaces pick up spurious "scatter" from:
– different illumination
– different head pose
– different alignment
– different facial expression

Fisherfaces may beat eigenfaces:
– Developed in 1997 by P. Belhumeur et al., based on Fisher's LDA.
– Faster than eigenfaces, in some cases.
– Has lower error rates.
– Works well even under different illumination.
– Works well even under different facial expressions.

Global/local feature mix (Global, no geometry)
Global features work OK and are still used, but local features now seem to outperform them. A recent mix of local and global: use global features to bias local features that have no internal geometric dependencies (Murphy, Torralba & Freeman, 2003). [image credit: Kevin Murphy]

Using local features to find objects: each feature is built from an image patch template, convolved with a filter bank and matched by normalized correlation, weighted by a Gaussian within the object bounding box; training uses positive (x) and negative (o) examples. [image credit: Kevin Murphy]

The global feature: back to neural nets, via mixture density networks*. Feature used: a steerable pyramid transform with 4 orientations and 2 scales; the image is divided into a 4x4 grid and the average energy is computed in each channel, yielding 128 features, reduced to 80 by PCA. "Boosted random fields" are used to learn the graph structure. (Figure: propagate / iteration / final output.) [slide credit: Kevin Murphy]
* C. M. Bishop, "Mixture density networks", Technical Report NCRG 4288, Neural Computing Research Group, Department of Computer Science, Aston University, 1994.
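As an illustration of the global "gist"-style feature just described, here is a rough sketch: oriented band-pass energy averaged over a 4x4 grid, giving 4 orientations x 2 scales x 16 cells = 128 numbers. Hand-rolled Gabor kernels stand in for the steerable pyramid actually used by Torralba et al.; the filter parameters and names are assumptions for illustration.

```python
# Gist-like global descriptor sketch (not the authors' implementation).
import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(theta, sigma, freq, size=21):
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)      # rotated coordinate
    env = np.exp(-(x**2 + y**2) / (2 * sigma**2))   # Gaussian envelope
    return env * np.cos(2 * np.pi * freq * xr)

def gist_descriptor(img, n_orient=4, scales=((2.0, 0.25), (4.0, 0.125)), grid=4):
    feats = []
    for sigma, freq in scales:                      # 2 scales
        for k in range(n_orient):                   # 4 orientations
            kern = gabor_kernel(np.pi * k / n_orient, sigma, freq)
            energy = fftconvolve(img, kern, mode="same") ** 2
            h, w = energy.shape
            for i in range(grid):                   # average energy per cell
                for j in range(grid):
                    cell = energy[i * h // grid:(i + 1) * h // grid,
                                  j * w // grid:(j + 1) * w // grid]
                    feats.append(cell.mean())
    return np.asarray(feats)                        # 2 * 4 * 16 = 128-dim

if __name__ == "__main__":
    img = np.random.default_rng(0).random((128, 128))
    print(gist_descriptor(img).shape)               # (128,); then PCA to 80
```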
Example of context focus (Global, no geometry)
The algorithm knows where to focus for objects. [image credit: Kevin Murphy]

Results (Global, no geometry)
Performance is boosted by knowing context. [image credit: Kevin Murphy]

Completely Local: Color Histograms (Local, no geometry)
Swain and Ballard ('91) took the normalized r,g,b color histogram of objects and noted its tolerance to 3D rotation, partial occlusion, etc. [image credit: Swain & Ballard]

Color Histogram Matching
Objects were recognized based on the intersection between the image histogram I and each model histogram M,
  \cap(I, M) = \sum_j \min(I_j, M_j) / \sum_j M_j,
yielding excellent results over 30 objects. The problem is, color varies markedly with lighting… [image credit: Swain & Ballard]

Local Feature Histogram Matching
Schiele and Crowley used derivative-type features instead, with a probabilistic matching rule: Bayes' rule over the histogrammed local measurements gives the probability of each object, and the rule extends to multiple objects. [image credit: Schiele & Crowley]

Local Feature Histogram Results
Again with impressive performance results, much more tolerant to lighting (figure: 30 of the 100 test objects). The problem is: histograms suffer exponential blow-up with the number of features. [image credit: Schiele & Crowley]

Local Features
Local features, for example:
– Lowe's SIFT
– Malik's Shape Context
– Poggio's HMAX
– von der Malsburg's Gabor Jets
– Yokono's Gaussian Derivative Jets
Adding patches thereof seems to work great, but they are of high dimensionality. Idea: encode them in a hierarchy. We overview some techniques next.

Convolutional Neural Networks (Local, hierarchy)
Yann LeCun. Broke all the HIPs (Human Interactive Proofs) from Yahoo, MSN, eBay, … [image credit: LeCun]

Fragment-Based Hierarchy (Local, hierarchy)
Shimon Ullman: a top-down and bottom-up hierarchy of object fragments.
http://www.wisdom.weizmann.ac.il/~vision/research.html
See also Perona's group work on hierarchical feature models of objects: http://www.vision.caltech.edu/html-files/publications.html
[image credit: Ullman et al]
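Before moving on to constellation models, here is a minimal sketch of the histogram-intersection matching from the Swain & Ballard slides above. The bin count and function names are illustrative assumptions.

```python
# Histogram intersection sketch: the match score is the sum of
# element-wise minima of the image and model histograms, normalized
# by the model histogram, as in the formula above.
import numpy as np

def rgb_histogram(img, bins=8):
    """img: (H, W, 3) uint8 array -> normalized 3-D color histogram."""
    hist, _ = np.histogramdd(img.reshape(-1, 3),
                             bins=(bins, bins, bins),
                             range=((0, 256),) * 3)
    return hist / hist.sum()

def histogram_intersection(image_hist, model_hist):
    """Score in [0, 1]; 1 means the model's colors are fully present."""
    return np.minimum(image_hist, model_hist).sum() / model_hist.sum()

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    a = rng.integers(0, 256, (64, 64, 3), dtype=np.uint8)
    b = rng.integers(0, 256, (64, 64, 3), dtype=np.uint8)
    print(histogram_intersection(rgb_histogram(a), rgb_histogram(a)))  # 1.0
    print(histogram_intersection(rgb_histogram(a), rgb_histogram(b)))  # < 1.0
```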
Constellation Model (Local, hierarchy)
Perona's Bayesian-decision-based model. From: Rob Fergus, http://www.robots.ox.ac.uk/%7Efergus/
– The shape model: the mean location of each part is indicated by the cross, with the ellipse showing the uncertainty in location; the number by each part is the probability of that part being present.
– The appearance model: for each part, the patch closest to the mean of that part's appearance density.
(Figures: recognition results; feature detector results.) [image credit: Perona et al]
See also Perona's group work on hierarchical feature models of objects: http://www.vision.caltech.edu/html-files/publications.html

Sprites (Local, hierarchy)
Jojic and Frey: scene description as a hierarchy of "sprites". [image credit: Jojic et al]

Hierarchical Temporal Memory (Local, hierarchy)
Jeff Hawkins, Dileep George: modular, hierarchical spatial-temporal memory. (Figures: the hierarchy, a single module, results; training templates, good classifications, bad classifications, input (D) and output (E).) [image credit: George, Hawkins]

Peter Bock's ALISA (Local, hierarchy)
An explicit cognitive model, histogram based. (Figure: ALISA labeling two scenes.) [image credit: Bock et al]

HMAX from the "Standard Model" (Local, hierarchy)
Maximilian Riesenhuber and Tomaso Poggio: an object-recognition hierarchy built from basic building blocks, modulated by attention. [image credit: Riesenhuber et al]
We'll pick this up momentarily; first, a little on trees and boosting…

Machine Learning – Many Techniques
Libraries from Intel (key: optimized / implemented / not implemented):
• Supervised (the focus here): physical models, boosted decision trees, MART, influence diagrams, SVM, HMM, multi-layer perceptron, BayesNets (classification), CART, logistic regression, decision trees, K-NN, radial basis functions, naïve Bayes, Kalman filter, ARTMAP, associative networks, random forests, diagnostic BayesNets, BayesNet structure learning, adaptive filters.
• Unsupervised: histogram density estimation, kernel density estimation, K-means, tree distributions, Gaussian fitting, dependency nets, ART, spectral clustering, agglomerative clustering, PCA, Kohonen maps, BayesNet parameter fitting, inference.
These range from modeless to model-based, and are split between the Statistical Learning Library (MLL) and the Bayesian Networks Library (PNL).

Machine Learning
Learn a model/function f: INPUT -> OUTPUT that maps input to output. Find a function that describes the given data and predicts unknown data; a fit can be underfit, just right, or overfit.
Example uses of prediction:
– insurance risk prediction
– parameters that impact yields
– gene classification by function
– topics of a document
Specific example: prediction using a decision tree.

Binary Recursive Decision Trees
Leo Breiman's CART (Classification And Regression Trees). At each level, find the variable (predictor) and its threshold:
– that splits the data into 2 groups
– with maximal purity within each group.
All variables/predictors are considered at every level, and the data may be of different types, each record containing a vector of "predictors". A fully grown tree reaches perfect purity on the training set, but… it overfits. A sketch of the split search follows below.
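As promised, here is a sketch of the CART-style split search. The slides say only "maximal purity"; Gini impurity is used here as one common purity measure, and all names are illustrative.

```python
# Greedy split search: for every predictor and threshold, split the data
# in two and keep the split whose groups have the lowest weighted Gini
# impurity (i.e. maximal purity).
import numpy as np

def gini(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p**2)          # 0 means the group is perfectly pure

def best_split(X, y):
    """X: (n, d) predictors, y: (n,) class labels -> (feature, threshold)."""
    n, d = X.shape
    best = (None, None, np.inf)
    for j in range(d):                 # every predictor is considered
        for thresh in np.unique(X[:, j]):
            left, right = y[X[:, j] <= thresh], y[X[:, j] > thresh]
            if len(left) == 0 or len(right) == 0:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (j, thresh, score)
    return best[:2]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.random((100, 3))
    y = (X[:, 1] > 0.6).astype(int)    # class depends on predictor 1
    print(best_split(X, y))            # ~ (1, ~0.6)
```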
Binary Recursive Decision Trees (continued)
Growing the tree to perfect purity overfits. Prune the tree, using a complexity-cost measure, to avoid overfitting and reach the "just right" fit.

Consider a Face Detector via Decision Stumps
Consider a tree "stump": just one split. It selects the single most discriminative feature. For each rectangle-combination region (see the appendix for Viola and Jones's feature generator, integral images), find the threshold:
– that splits the data into 2 groups (face, non-face)
– with maximal purity within each group,
over a set of face and non-face data that the features can be tried on. For example, a bar detector over the nose region works well as a face-detecting stump. It doesn't detect cars.

We use "Boosting" to Select a "Forest of Stumps"
Each stump is a selected feature plus a split threshold. AdaBoost, as used by Viola and Jones:
– Given example images (x_1, y_1), …, (x_n, y_n), where y_i = 0, 1 for negative and positive examples respectively.
– Initialize weights w_{1,i} = 1/(2m) or 1/(2l) for training example i, where m and l are the numbers of negatives and positives respectively.
– For t = 1 … T:
  1) Normalize the weights so that w_t is a distribution.
  2) For each feature j, train a classifier h_j and evaluate its error \epsilon_j with respect to w_t.
  3) Choose the classifier h_t with the lowest error \epsilon_t.
  4) Update the weights: w_{t+1,i} = w_{t,i} \beta_t^{1 - e_i}, where e_i = 0 if x_i is classified correctly and 1 otherwise, and \beta_t = \epsilon_t / (1 - \epsilon_t).
– The final strong classifier is
  h(x) = 1 if \sum_{t=1}^T \alpha_t h_t(x) \ge (1/2) \sum_{t=1}^T \alpha_t, and 0 otherwise,
  where \alpha_t = \log(1/\beta_t).
(A code sketch of this loop appears at the end of this section.)

For Efficient Calculation, Form a Detection Cascade
A boosted cascade is assembled such that at each node, non-object regions stop further processing. If the detection rate of each node is high (~99.9%), at the cost of a high false-positive rate (say 50% of everything detected as "object"), and if the nodes are independent, then the overall detection and false-positive rates are
  d = \prod_{i=1}^n detect_i  and  f = \prod_{i=1}^n falsePos_i.
If so, then for a 20-node cascade we get d = 0.999^{20} \approx 0.98 and f = 0.5^{20} \approx 9.6e-7.
(Rapid Object Detection using a Boosted Cascade of Simple Features – Viola, Jones, 2001)

Improvements to the Cascade
J. Wu, J. M. Rehg, and M. D. Mullin just do one boosting round, then select from the feature pool as needed (figure: training-time comparison, Viola-Jones vs. Wu-Rehg-Mullin). Kobi Levi and Yair Weiss just used better features (gradient histograms) to cut training needs by an order of magnitude.
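Here is the boosting loop from the slide above as compact NumPy code, with one-feature threshold stumps standing in for the rectangle features. A real Viola-Jones trainer would evaluate Haar features via integral images; all names here are illustrative.

```python
# AdaBoost over decision stumps, following the slide's weight update
# w <- w * beta^(1-e) with beta = eps / (1 - eps).
import numpy as np

def train_stump(X, y, w):
    """Pick the (feature, threshold, polarity) with the lowest weighted error."""
    best = (0, 0.0, 1, np.inf)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = (pol * (X[:, j] - t) > 0).astype(int)
                err = np.sum(w * (pred != y))
                if err < best[3]:
                    best = (j, t, pol, err)
    return best

def adaboost(X, y, T=20):
    m, l = np.sum(y == 0), np.sum(y == 1)
    w = np.where(y == 1, 1.0 / (2 * l), 1.0 / (2 * m))  # slide's initialization
    stumps, alphas = [], []
    for _ in range(T):
        w = w / w.sum()                                  # 1) normalize weights
        j, t, pol, eps = train_stump(X, y, w)            # 2)-3) best weak learner
        eps = np.clip(eps, 1e-10, 1 - 1e-10)             # guard against eps = 0
        beta = eps / (1.0 - eps)
        pred = (pol * (X[:, j] - t) > 0).astype(int)
        w = w * beta ** (pred == y)                      # 4) e_i = 0 when correct
        stumps.append((j, t, pol))
        alphas.append(np.log(1.0 / beta))
    return stumps, np.asarray(alphas)

def classify(X, stumps, alphas):
    """Strong classifier: weighted stump votes vs. half the total weight."""
    votes = np.array([(pol * (X[:, j] - t) > 0).astype(int)
                      for j, t, pol in stumps])
    return (alphas @ votes >= 0.5 * alphas.sum()).astype(int)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.random((200, 5))
    y = (X[:, 2] > 0.5).astype(int)                      # toy face/non-face labels
    stumps, alphas = adaboost(X, y, T=10)
    print((classify(X, stumps, alphas) == y).mean())     # ~1.0 on training data
```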
[image credit: Wu et al]
Let's focus on better features and descriptors…

The Standard Model of Visual Cortex: Biologically Motivated Features
Thomas Serre, Lior Wolf and Tomaso Poggio used the model of the human visual cortex developed in Riesenhuber's lab:
– S1 layer: Gabor filters at 4 orientations.
– C1 layer: local spatial max.
– Intermediate layer: a dictionary of patches of C1 responses.
– S2 layer: radial basis fit of each patch template over the whole image.
– C2 layer: max S2 response per template.
– Classifier on top (SVM, boosting, …).
(Figure: the first 5 features chosen by boosting.) [image credit: Serre et al]
This results in state-of-the-art/top performance, and seems to handily beat SIFT features. [image credit: Serre et al]

Yokono's Generalization of the Standard Model
Used Gaussian derivatives: 3 orders x 3 scales x 4 orientations = 36 base features, similar to the standard model's Gabor base filters. [image credit: Yokono et al]
He created a local spatial jet, oriented to the gradient at the largest scale at the center pixel. Since a Gabor filter has a ringing spatial extent, this is still approximately similar to the standard model. [image credit: Yokono et al]
The full system:
– ~S1, C1: features are memorized from positive samples at Harris-corner interest points.
– ~S2: the dictionary of learned features is measured (by normalized cross-correlation) against all interest points in the image.
– ~C2: the maximum normalized cross-correlation scores are arranged in a feature vector (a sketch of this stage follows below).
– Classifier: again, SVM, boosting, …
[image credit: Yokono et al]
Excellent results: on the CBCL database (ROC curve for 1200 stumps), an SVM trained with 1 to 5 images beats other techniques. Further results: some of the chosen features, and an AIBO dog in articulated poses, with its ROC curve. [image credit: Yokono et al]

Brash Claim
We are in the high-90% performance range under lighting, articulation, scale and 3D rotation.
– The classifier inside humans is unlikely to be much more accurate. We are not that far from raw human-level performance (by 2015, I predict).
– The base classifier is embedded in a larger system that makes it more reliable:
  – attention
  – color-constancy features
  – context
  – temporal filtering
  – sensor fusion

Back to Kevin Murphy: Missing Context
We know there is a keyboard present in this scene even if we cannot see it clearly. We know there is no keyboard present in this scene, even if there is in fact one. [slide credit: Kevin Murphy]
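The ~S2/~C2 stage referenced above boils down to: for each dictionary patch, take the maximum normalized cross-correlation over the image. Here is a hedged sketch of that reduction; a real system would match only at interest points and use richer descriptors, and all names are illustrative.

```python
# C2-style features: one "max of template matches" score per dictionary
# patch, collected into a feature vector for an SVM or boosting classifier.
import numpy as np

def ncc(patch, window):
    """Normalized cross-correlation between two same-size arrays."""
    p, w = patch - patch.mean(), window - window.mean()
    denom = np.linalg.norm(p) * np.linalg.norm(w)
    return 0.0 if denom == 0 else float((p * w).sum() / denom)

def c2_features(img, dictionary):
    """dictionary: list of (k, k) patches -> one max-NCC score per patch."""
    feats = []
    for patch in dictionary:
        k = patch.shape[0]
        best = -1.0
        for i in range(img.shape[0] - k + 1):       # exhaustive scan here;
            for j in range(img.shape[1] - k + 1):   # interest points in practice
                best = max(best, ncc(patch, img[i:i + k, j:j + k]))
        feats.append(best)                          # C2: max response
    return np.asarray(feats)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.random((32, 32))
    dictionary = [img[5:13, 5:13].copy(), rng.random((8, 8))]
    print(c2_features(img, dictionary))             # first score should be ~1.0
```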
Missing Attention: Change Blindness
(Figures: the "farm" and "truck" change-blindness image pairs.)

Call for a Program: Generalize the Standard Model Even Further
A research framework:
– Detect: DOG, Harris corner
– Descriptors (local): SIFT, steerable filters, Gabor
– Image-level (global) features: histograms
– Dictionary: all descriptors, a subset, or clustered
– Scoring: max correlation, max probability
– Classifier: SVM, boosting, K-NN, …
– Embedding: attention/active vision; context (scene, 3D inference); sensor fusion/association; motion

Ashutosh Saxena, Chung and Ng learned depth from local features in an MRF (similar in spirit to Kevin Murphy's work). Ashutosh also has a robot picking up novel objects from local features. Together with active vision, active manipulation, and context: now is a good time for vision systems! Apply to "Stanley II" and to STAIR. [image credit: Saxena et al]

Summary: Mix Local with Global
Generalize the standard model even further, following the research framework above (detect, describe, build a dictionary, score, classify, embed). A skeleton of this pipeline follows below.
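To make the plug-in structure of the framework concrete, here is a skeleton in Python. Every component is a placeholder for one of the interchangeable boxes listed above, and all names and toy inputs are assumptions.

```python
# Generic detect / describe / dictionary / score / classify pipeline sketch.
import numpy as np

def recognize(img, detect, describe, dictionary, score, classifier):
    """Each argument is one interchangeable box of the research framework."""
    points = detect(img)                            # e.g. DoG or Harris corners
    descs = [describe(img, p) for p in points]      # e.g. SIFT, steerable, Gabor
    # image-level scoring: best score of each dictionary entry over the image
    feats = np.array([max(score(d, entry) for d in descs)
                      for entry in dictionary])
    return classifier(feats)                        # e.g. SVM, boosting, k-NN

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.random((16, 16))
    detect = lambda im: [(4, 4), (8, 8)]            # stand-in interest points
    describe = lambda im, p: im[p[0]-2:p[0]+2, p[1]-2:p[1]+2].ravel()
    dictionary = [rng.random(16), rng.random(16)]   # stand-in descriptor dictionary
    score = lambda d, e: float(d @ e / (np.linalg.norm(d) * np.linalg.norm(e)))
    classifier = lambda f: int(f.mean() > 0.5)      # stand-in decision rule
    print(recognize(img, detect, describe, dictionary, score, classifier))
```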
Bibliography for this lecture
1. R. Fergus, P. Perona and A. Zisserman, "Object Class Recognition by Unsupervised Scale-Invariant Learning", CVPR 2003.
2. M. Turk and A. Pentland, "Eigenfaces for Recognition", Journal of Cognitive Neuroscience, Vol. 3, No. 1, 1991.
3. T. Serre, L. Wolf and T. Poggio, "Object Recognition with Features Inspired by Visual Cortex", CVPR 2005, IEEE Computer Society Press, San Diego, June 2005.
4. J. J. Yokono and T. Poggio, "Boosting a Biologically Inspired Local Descriptor for Geometry-free Face and Full Multi-view 3D Object Recognition", MIT AI Memo 2005-023 / CBCL Memo 254, July 2005.
5. J. Wu, J. M. Rehg and M. D. Mullin, "Learning a Rare Event Detection Cascade by Direct Feature Selection", NIPS 16, MIT Press, 2004.
6. J. Wu, M. D. Mullin and J. M. Rehg, "Linear Asymmetric Classifier for Face Detection", ICML 2005, pp. 993-1000, Bonn, Germany, August 2005.
7. K. Levi and Y. Weiss, "Learning Object Detection from a Small Number of Examples: The Importance of Good Features", CVPR 2004.
8. P. Viola and M. Jones, "Rapid Object Detection Using a Boosted Cascade of Simple Features", CVPR 2001, pp. 511-518.
9. B. Schiele and J. L. Crowley, "Probabilistic Object Recognition Using Multidimensional Receptive Field Histograms", ICPR 1996.
10. M. J. Swain and D. H. Ballard, "Color Indexing", International Journal of Computer Vision, Vol. 7, pp. 11-32, 1991.
11. A. Torralba, K. Murphy and W. Freeman, "Contextual Models for Object Detection Using Boosted Random Fields", NIPS 2004.
12. K. Murphy, A. Torralba, D. Eaton and W. Freeman, "Object Detection and Localization Using Local and Global Features", Sicily Workshop on Object Recognition, 2005.
13. M. Riesenhuber and T. Poggio, "How Visual Cortex Recognizes Objects: The Tale of the Standard Model", The Visual Neurosciences, 2:1640-1653, 2003.
14. A. Saxena, S. H. Chung and A. Y. Ng, "Learning Depth from Single Monocular Images", NIPS 2005.

Appendix: Feature Set Generators
(These appendix slides follow Paul Viola and Michael Jones, www.cs.ucsd.edu/classes/fa01/cse291/ViolaJones.ppt, ICCV 2001 Workshop on Statistical and Computational Theories of Vision.)

Integral Images – a Feature Set Generator
Three rectangular feature types:
• two-rectangle features (horizontal/vertical)
• three-rectangle features
• four-rectangle features
Using a 24x24-pixel base detection window, with all possible combinations of horizontal and vertical location and scale of these feature types, the full set has 49,396 features. The motivation behind using rectangular features, as opposed to more expressive steerable filters, is their extreme computational efficiency.

Define an "Integral Image"
Definition: the integral image at location (x, y) is the sum of the pixel values above and to the left of (x, y), inclusive. Using the following two recurrences, where i(x, y) is the pixel value of the original image at the given location and s(x, y) is the cumulative column sum, we can calculate the integral image representation in a single pass:
  s(x, y) = s(x, y-1) + i(x, y)
  ii(x, y) = ii(x-1, y) + s(x, y)

Rapid Evaluation of Rectangular Features
Using the integral image representation, one can compute the value of any rectangular sum in constant time. For example, with corner points 1..4 around rectangle D, the sum inside D is
  ii(4) + ii(1) - ii(2) - ii(3).
As a result, two-, three-, and four-rectangle features can be computed with 6, 8 and 9 array references respectively.

Integral Image Example
  Image:        Integral image (calculated in one pass):
  0 8 6 1       0  8 14 15
  1 5 9 0       1 14 29 30
  0 7 5 0       1 21 41 42
  2 8 9 2       3 31 60 63
Find the sum of the 3x2 block spanning rows 2-4, columns 2-3 (values 5, 9, 7, 5, 8, 9):
  5 + 9 + 7 + 5 + 8 + 9 = 43.
From the integral image, using the four corner references:
  60 + 0 - (14 + 3) = 43.
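Finally, a short NumPy sketch of the integral image and constant-time rectangle sum, checked against the worked example above. Function names are illustrative.

```python
# Integral image via cumulative sums, plus a 4-reference rectangle sum.
import numpy as np

def integral_image(img):
    """ii(x, y) = sum of img[0..y, 0..x] (inclusive), built in one pass."""
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, top, left, bottom, right):
    """Sum of img[top..bottom, left..right] using four array references."""
    total = ii[bottom, right]
    if top > 0:
        total -= ii[top - 1, right]
    if left > 0:
        total -= ii[bottom, left - 1]
    if top > 0 and left > 0:
        total += ii[top - 1, left - 1]
    return total

if __name__ == "__main__":
    img = np.array([[0, 8, 6, 1],
                    [1, 5, 9, 0],
                    [0, 7, 5, 0],
                    [2, 8, 9, 2]])
    ii = integral_image(img)
    print(ii[-1, -1])                  # 63, the sum of the whole image
    print(rect_sum(ii, 1, 1, 3, 2))    # 5+9+7+5+8+9 = 43, as on the slide
```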