OpenCV
A brief introduction with examples.

Outline
1. Introduction
2. Basic structure of OpenCV: The Core Functionality <http://opencv.itseez.com/modules/core/doc/core.html>, http://opencv.itseez.com/index.html
   • cv::Mat: the fundamental class for representing images and video frames
   • Different encoding types (RGB, grey, RGB + alpha; data structure)
3. HighGUI: image handling, display, reading and writing. High-level GUI and Media I/O <http://opencv.itseez.com/modules/highgui/doc/highgui.html>
4. Image and video processing: methods and algorithms for mobility applications

Introduction
OpenCV (Open Source Computer Vision) is a free, open-source library, initially developed by Intel, specialized in real-time image processing. It can be used in an Android application through JNI (Java Native Interface).

26/10/11

Introduction: why?
The mobile phone is a multimedia communication tool:
• Taking well-focused photos
• Sending videos: "citizen reporter" (BBC, BFMTV); security: MotionCam
• Filtering personal information
• Browsing image databases

Basic structure of OpenCV
• OpenCV 2.x is a C++ library, as opposed to OpenCV 1.x, which exposed a C API
• Significant changes in module structure since version 2.2
• Modules: core, imgproc, video, calib3d, features2d, objdetect, highgui, gpu...
OpenCV core functionality
• All the OpenCV classes and functions are placed into the cv namespace
• core: compact module defining the basic data structures and basic functions used by all other modules
• Basic image class: cv::Mat

cv::Mat memory management
• OpenCV handles Mat memory automatically
• Memory is freed automatically when the last Mat header referring to it is destroyed (reference counting)
• When a Mat instance is copied, no pixel data is copied: only the header is, and both instances share the same data
• To make a real copy, use Mat::clone

The Mat class I
• cv::Mat covers the old CvMat and IplImage
  – Data representation
    • Data is row ordered
    • Colour pixels are interleaved (e.g. RGBRGBRGB...)
• Let's see some important members of the class:
  – Create and initialize

    // Mat(int _rows, int _cols, int _type);
    // Mat(Size _size, int _type);                    type = CV_8UC3, CV_32FC1, ...
    // Mat(Size _size, int _type, const Scalar& _s);  fill with values in _s
    Mat M(7, 7, CV_32FC2, Scalar(1, 3)); // 7x7, float, 2 channels, filled with (1,3)
    M.create(Size(15, 15), CV_8U);       // reallocate (if needed)
    // Matlab-like initializers
    Mat ident = Mat::eye(3, 3, CV_32F);  // also Mat::ones(..) and Mat::zeros(..)
    int data[] = {1, 2, 3, 9, 0, -3};    // must be an array: an int* cannot be brace-initialized like this
    Mat C(2, 3, CV_32S, data);           // no data copied: C only points to 'data'
    C = C.clone();                       // clone the matrix -> C now owns its own copy

The Mat class II
Important things to know:
• Shallow copy: Mat A = B; does not copy data.
• Deep copy: A = B.clone(); or B.copyTo(A); (for ROIs, etc.).
• Most OpenCV functions allocate or resize their output matrix if needed
• Lots of convenient functionality (matrix expressions), where s is a cv::Scalar and α a scalar (double):
  – Addition, scaling, ...: A±B, A±s, s±A, αA
  – Per-element multiplication, division, ...: A.mul(B), A/B, α/A
  – Matrix multiplication, dot, cross product: A*B, A.dot(B), A.cross(B)
  – Transposition, inversion: A.t(), A.inv([method])
  – And a few more.

Mat class: element access I
Rows, columns, ROIs, ...

    Mat A = B.row(i);     // i is an int row index; same for B.col()
    A = B.rowRange(rg);   // rg is a Range; same for B.colRange()
    A = B(r);             // r is a Rect: use a rectangle to set a ROI

Ranges, ROIs, etc. only create new headers: the pixel data stays shared with B.
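To make "row ordered, interleaved, and ROIs only create new headers" more concrete, here is a small toy model in plain C++. It is only a sketch: `View`, `wrap` and `roi` are hypothetical names invented for this illustration, not OpenCV types, and the model ignores reference counting entirely. It assumes 8-bit data and a parent buffer without row padding.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a matrix header: it does not own the pixels, it only
// describes how to reach them. (Hypothetical type, not part of OpenCV.)
struct View {
    unsigned char* data;  // first byte of the top-left pixel
    int rows, cols;       // size in pixels
    int channels;         // interleaved channels per pixel (3 for colour)
    std::size_t step;     // bytes from one row to the next in the parent buffer

    // Channel c of pixel (r, col): row-major storage, channels interleaved.
    unsigned char& at(int r, int col, int c) const {
        assert(r >= 0 && r < rows && col >= 0 && col < cols);
        assert(c >= 0 && c < channels);
        return data[r * step + static_cast<std::size_t>(col) * channels + c];
    }

    // A region of interest is just a new header: shifted pointer, same step.
    View roi(int x, int y, int w, int h) const {
        assert(x >= 0 && y >= 0 && w > 0 && h > 0);
        assert(x + w <= cols && y + h <= rows);
        return View{data + y * step + static_cast<std::size_t>(x) * channels,
                    h, w, channels, step};
    }
};

// Wrap a buffer that holds a rows x cols image with interleaved channels.
View wrap(std::vector<unsigned char>& buf, int rows, int cols, int channels) {
    assert(buf.size() == static_cast<std::size_t>(rows) * cols * channels);
    return View{buf.data(), rows, cols, channels,
                static_cast<std::size_t>(cols) * channels};
}
```

Writing through a `View` returned by `roi` changes the parent buffer, which is the behaviour the slides describe for `B(r)`; getting an independent copy would require duplicating the bytes, as `clone()` does for a real `Mat`.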
Where is a ROI in the bigger matrix?

    Mat A = B(r);             // r is a Rect inside B
    Size s; Point offset;
    A.locateROI(s, offset);   // 's' is the size of the whole matrix B and 'offset' the position of A inside it,
                              // so r == Rect(offset, A.size())

Element access: 3 options
Using at<>()

    double val = M.at<double>(i, j); // you have to know the type

Mat class: element access II
Old C style.

    // compute sum of positive matrix elements
    double sum = 0;
    for(int i = 0; i < M.rows; i++)
    {
        const double* Mi = M.ptr<double>(i); // we know it's double data
        for(int j = 0; j < M.cols; j++)
            sum += std::max(Mi[j], 0.);
    }

STL-like iterators

    // compute sum of positive matrix elements, iterator-based variant
    double sum = 0;
    MatConstIterator_<double> it = M.begin<double>(), it_end = M.end<double>();
    for(; it != it_end; ++it)
        sum += std::max(*it, 0.);

These iterators can be used with STL functions, like std::sort()

The Mat_ class I
• A thin wrapper around the Mat class.
• Mat ↔ Mat_ can be converted freely
  – With care: no data conversion is done
• Type specification is different
• Useful if you do lots of element access.
  – Same internal code, but shorter to write

    Mat_<double> M(20, 20);  // a 20x20 double matrix
    double k = M(2, 18);     // no type specification needed

• For multichannel (colour images), use cv::Vec

    Mat_<Vec3f> M3f(20, 20); // a 20x20 3-channel float matrix

Image examples (figures)
• Mat_<uchar> (8 bpp)
• Mat_<Vec3b> (24 bpp)

Manipulating images with the Mat class
Reading and writing images is easy

    Mat imread(const string& filename, int flags=1);
    // flags = 0 -> always grayscale
    // flags > 0 -> always color
    // flags < 0 -> read image as-is
    bool imwrite(const string& filename, const Mat& img, const vector<int>& params=vector<int>());
    // params set compression values; defaults are fine.

Example:

    Mat img = imread("filename.jpg", 1);
    imwrite("file.png", img);

Examples: thresholding

    #include <cv.h>
    #include <highgui.h>
    using namespace std;
    using namespace cv;

    int main( int argc, char** argv )
    {
        Mat src, gray, grayThresh;
        src = imread(argc >= 2 ? argv[1] : "fruits.jpg", 1);
        if (src.empty()) return -1;          // the image could not be read
        gray.create(src.size(), CV_8U);      // not needed, actually
        namedWindow("src", CV_WINDOW_AUTOSIZE);
        namedWindow("gray", CV_WINDOW_AUTOSIZE);
        namedWindow("grayThreshold", CV_WINDOW_AUTOSIZE);
        cvtColor(src, gray, CV_BGR2GRAY);    // color images are BGR!
        threshold(gray, grayThresh, 100, 250, CV_THRESH_BINARY);
        imshow("src", src);
        imshow("gray", gray);
        imshow("grayThreshold", grayThresh);
        waitKey(0);   // waits for a key; it also handles the GUI events
        return 0;     // no need to free the matrices, they are deleted automatically
    }

Examples: Canny edge detector

    #include <cv.h>
    #include <highgui.h>
    using namespace std;
    using namespace cv;

    int main( int argc, char** argv )
    {
        Mat src, dst;
        src = imread(argc >= 2 ? argv[1] : "fruits.jpg", 0); // 0: load as grayscale
        if (src.empty()) return -1;          // the image could not be read
        // dst = Mat(src.size(), src.type()); // not needed: Canny allocates dst
        Canny(src, dst, 100, 150, 3);
        namedWindow("src");   imshow("src", src);
        namedWindow("canny"); imshow("canny", dst);
        waitKey(0);
        return 0;
    }

HighGUI: Creating Interfaces I
Start off by creating a program that continuously reads images from a camera

    #include <stdio.h>
    #include <cv.h>
    #include <highgui.h>

    int main()
    {
        CvCapture* capture = 0;
        capture = cvCaptureFromCAM(0);
        if(!capture)
        {
            printf("Could not initialize capturing...\n");
            return -1;
        }
        cvNamedWindow("video");

This code creates a capture structure pointing to camera #0 and creates a window named "video"

HighGUI: Creating Interfaces II
Create two variables holding the values of the trackbars we'll create

        int bright = 128, contrast = 26;

And now we actually create the trackbars:

        cvCreateTrackbar("brightness", // name of the trackbar
                         "video",      // name of the window
                         &bright,      // pointer to a variable that will hold the value of the trackbar
                         255,          // maximum value of the trackbar (minimum is always 0)
                         NULL);        // callback function (called whenever the position of the trackbar changes)
        cvCreateTrackbar("contrast", "video", &contrast, 50, NULL);

Start the infinite loop requesting frames:

        while(true)
        {
            IplImage* frame = 0;
            frame = cvQueryFrame(capture);
            if (!frame)
                break;

HighGUI: Creating Interfaces III
bright is in the range [0,255], so subtract 128 to get a convenient range of -128...127 that can both reduce and increase brightness. Brightness is modified by adding this offset to every pixel. The slide also mentions scaling by contrast, but the snippet below only applies the brightness offset; the contrast value is read from its trackbar and not used.

            cvAddS(frame, cvScalar(bright-128, bright-128, bright-128), frame);

Display the image in the window "video" until the Esc key (ASCII = 27) is pressed

            cvShowImage("video", frame);
            int c = cvWaitKey(20);
            if ((char)c == 27)
                break;
        }
        cvReleaseCapture(&capture);
        return 0;
    }

HighGUI: trackbar example (figure)

OpenCV: image filtering
• In this tutorial you will learn how to apply diverse smoothing filters (linear and non-linear) to images using OpenCV functions such as:
  – Blur
  – Gaussian blur
  – Median blur
  – Bilateral filter

Theory
• Smoothing (blurring) is a simple and frequently used operation
• There are many reasons for smoothing, e.g. noise suppression
• To perform a smoothing operation we apply a filter to our image. The most common type of filter is linear: an output pixel's value g(i, j) is determined as a weighted sum of input pixel values f(i + k, j + l):

$$g(i, j) = \sum_{k,l} f(i + k,\, j + l)\, h(k, l)$$

• h(k, l) is called the kernel, which is nothing more than the coefficients of the filter.
• It helps to visualize a filter as a window of coefficients sliding across the image.

Normalized Box Filter
• This filter is the simplest of all! Each output pixel is the mean of its kernel neighbors (all of them contribute with equal weights)
• The kernel is below:

$$K = \frac{1}{K_{width} \cdot K_{height}} \begin{bmatrix} 1 & \cdots & 1 \\ \vdots & \ddots & \vdots \\ 1 & \cdots & 1 \end{bmatrix}$$

Gaussian Filter I
• Probably the most useful filter (although not the fastest). Gaussian filtering is done by convolving each point in the input array with a Gaussian kernel.
• 1D Gaussian kernel (figure)

Gaussian Filter II
• The pixel located in the middle has the biggest weight.
• The weight of its neighbors decreases as the spatial distance between them and the center pixel increases.
• 2D Gaussian kernel:

$$G_0(x, y) = A \exp\left( -\frac{(x - \mu_x)^2}{2\sigma_x^2} - \frac{(y - \mu_y)^2}{2\sigma_y^2} \right)$$

• where μ is the mean (the peak) and σ is the standard deviation, so σ² is the variance (one value per variable x and y)

Median filter
• The median filter runs through each element of the signal (in this case the image) and replaces each pixel with the median of its neighboring pixels (located in a square neighborhood around the evaluated pixel).
• The median of a finite list of numbers can be found by arranging all the observations from lowest value to highest value and picking the middle one.

Bilateral Filter
• The main goal of the filters considered so far is to smooth an input image. However, these filters sometimes not only dissolve the noise but also smooth away the edges. To avoid this (to a certain extent at least), we can use a bilateral filter.
• Like the Gaussian filter, the bilateral filter considers the neighboring pixels with a weight assigned to each of them.
• These weights have two components
  – The first component is the same weighting used by the Gaussian filter
  – The second component takes into account the difference in intensity between the neighboring pixels and the evaluated one.

Usage examples
• Box filter

    blur(src, dst, Size(filt_size_x, filt_size_y), Point(-1,-1));

  – src: Source image
  – dst: Destination image
  – Size(w, h): Defines the size of the kernel to be used (of width w pixels and height h pixels)
  – Point(-1, -1): Indicates where the anchor point (the pixel evaluated) is located with respect to the neighborhood. If there is a negative value, then the center of the kernel is considered the anchor point.
• Gaussian blur

    GaussianBlur(src, dst, Size(filt_size_x, filt_size_y), 0, 0);

  – Size(w, h): The size of the kernel to be used (the neighbors to be considered). w and h have to be odd and positive numbers; if they are zero, the size is calculated from the sigma_x and sigma_y arguments.
  – sigma_x: The standard deviation in x. Writing 0 implies that sigma_x is calculated from the kernel size.
  – sigma_y: The standard deviation in y. Writing 0 implies that sigma_y is calculated from the kernel size.

Lab exercise (TD)
• Face detection
• Masking

Usage examples II
• Median blur

    medianBlur(src, dst, filt_size);

  – src: Source image
  – dst: Destination image, must be the same type as src
  – filt_size: Size of the kernel (only one value because we use a square window). Must be odd.
• Bilateral filter

    bilateralFilter(src, dst, filt_size, filt_size*2, filt_size/2);

  – d (here filt_size): The diameter of each pixel neighborhood.
  – sigma_col (here filt_size*2): Standard deviation in the color space (pixel values).
  – sigma_space (here filt_size/2): Standard deviation in the coordinate space (in pixels).

Median blur example (figure)

Face detection
Slides partly borrowed from http://www.cs.unc.edu/~lazebnik/spring09/lec23_face_detection.ppt

Face detection II
• Basic idea: slide a window across the image and evaluate a face model at every location
• Challenges:
  – A sliding window detector must evaluate tens of thousands of location/scale combinations
  – Faces are rare: 0–10 per image
    • For computational efficiency, we should try to spend as little time as possible on the non-face windows
    • A megapixel image has ~10^6 pixels and a comparable number of candidate face locations
    • To avoid having a false positive in every image, our false positive rate has to be less than 10^-6

Object classification
• Takes as input an n-dimensional vector of parameters v
• Training step: given a number of samples of each class and the corresponding feature vectors, build a partitioning of the n-dimensional space
• Classification: using the acquired partitioning, decide which class an object belongs to based on its feature vector
• A feature vector is any reasonable set of parameters that separates objects of different classes well

The simplest classifier: thresholding

$$h(x) = \begin{cases} 1 & \text{if } f(x) > \theta \\ -1 & \text{otherwise} \end{cases}$$

(figure, axis label: f(x))

The Viola/Jones Face Detector
• Training is slow, but detection is very fast
• Key ideas
  – Integral images for fast feature evaluation
  – Boosting for feature selection
  – Classifier cascade for fast rejection of non-face windows
P. Viola and M. Jones. Rapid object detection using a boosted cascade of simple features. CVPR 2001.
P. Viola and M. Jones. Robust real-time face detection. IJCV 57(2), 2004.

Image Features
• Rectangular filters of variable size in a 24x24 pixel window
• Value = ∑ (pixels in white area) – ∑ (pixels in black area)
Examples of features (figure)
Example (figure): source and result, with regions where Value ≈ 0 and Value >> 0

Fast computation with integral image
• The integral image computes a value at each pixel (x,y) that is the sum of the pixel values above and to the left of (x,y), inclusive
• Sum of values inside a rectangle: sum = A – B – C + D, where A, B, C and D are the integral image values at the four corners of the rectangle (A at the bottom-right, D at the top-left, B and C at the other two)

Feature selection
• For a 24x24 detection region, the number of possible rectangle features is ~160,000!
• At detection time, it is impractical to evaluate the entire feature set
• Can we create a good classifier using just a small subset of all possible features?
• How to select such a subset?
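The integral image and the A – B – C + D rule described above can be sketched in plain C++. This is a minimal illustration, not the Viola/Jones or OpenCV implementation: the names `Integral`, `buildIntegral`, `rectSum` and `twoRectFeature` are made up for this example, and the table is padded with one extra row and column of zeros (a common convention, chosen here so that the corner lookups need no special cases at the image border).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Integral image with one extra row and column of zeros: at(x, y) is the sum
// of all pixels (x', y') with x' < x and y' < y.
struct Integral {
    int width, height;          // size of the source image
    std::vector<long long> ii;  // (height + 1) x (width + 1) sums, row-major

    long long at(int x, int y) const {
        return ii[static_cast<std::size_t>(y) * (width + 1) + x];
    }
};

// Build the table in a single pass over a row-major 8-bit grayscale image.
Integral buildIntegral(const std::vector<unsigned char>& img, int width, int height) {
    assert(width > 0 && height > 0);
    assert(img.size() == static_cast<std::size_t>(width) * height);
    Integral out{width, height,
                 std::vector<long long>(static_cast<std::size_t>(width + 1) * (height + 1), 0LL)};
    for (int y = 0; y < height; ++y) {
        long long rowSum = 0;  // running sum of the current row
        for (int x = 0; x < width; ++x) {
            rowSum += img[static_cast<std::size_t>(y) * width + x];
            out.ii[static_cast<std::size_t>(y + 1) * (width + 1) + (x + 1)] =
                out.at(x + 1, y) + rowSum;
        }
    }
    return out;
}

// Sum of the w x h rectangle whose top-left pixel is (x, y): A - B - C + D.
long long rectSum(const Integral& in, int x, int y, int w, int h) {
    assert(x >= 0 && y >= 0 && w > 0 && h > 0);
    assert(x + w <= in.width && y + h <= in.height);
    const long long A = in.at(x + w, y + h);  // bottom-right corner
    const long long B = in.at(x + w, y);      // top-right corner
    const long long C = in.at(x, y + h);      // bottom-left corner
    const long long D = in.at(x, y);          // top-left corner
    return A - B - C + D;
}

// Two-rectangle feature: left half counted as white, right half as black.
long long twoRectFeature(const Integral& in, int x, int y, int w, int h) {
    assert(w % 2 == 0);
    return rectSum(in, x, y, w / 2, h) - rectSum(in, x + w / 2, y, w / 2, h);
}
```

Once the table is built, every rectangle sum costs four lookups regardless of the rectangle size, which is what makes evaluating many rectangle features per window affordable.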
Boosting
• Learn a single simple classifier
• Classify the data
• Look at where it makes errors
• Reweight the data so that the inputs where we made errors get higher weight in the learning process
• Now learn a 2nd simple classifier on the weighted data
• Combine the 1st and 2nd classifier and weight the data according to where they make errors
• Learn a 3rd classifier on the weighted data
• Final classifier is the combination of all T classifiers
• This procedure is called "Boosting"; it works very well in practice

Boosting illustration (figures): Weak Classifier 1; Weights Increased; Weak Classifier 3; Final classifier is a combination of weak classifiers

Boosting for face detection
• Define weak learners based on rectangle features:

$$h_t(x) = \begin{cases} 1 & \text{if } f_t(x) > \theta_t \\ -1 & \text{otherwise} \end{cases}$$

  where x is the window, f_t(x) the value of the rectangle feature and θ_t the threshold
• Final classifier:

$$H_{final} = \operatorname{sign}(0.4\, h_1 + 0.6\, h_2 + 0.9\, h_3 + \dots)$$

Boosting for face detection
• First two features selected by boosting (figure)
• This feature combination can yield a 100% detection rate and a 50% false positive rate

Boosting for face detection
• A 200-feature classifier can yield a 95% detection rate and a false positive rate of 1 in 14084
• Not good enough!
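The weak learners, the weighted vote and the reweighting step described above can be sketched as follows. This is an assumption-laden illustration: it uses the textbook discrete AdaBoost update (alpha = ½ ln((1 − err)/err), weights multiplied by exp(−alpha · y · h)), which is not necessarily the exact variant used by Viola and Jones, and the names `Stump`, `weakClassify`, `strongClassify` and `reweight` are invented for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One weak learner: thresholds a single feature value (a "decision stump").
struct Stump {
    std::size_t feature;  // index into the feature vector
    double theta;         // threshold
    double alpha;         // weight of this learner in the final vote
};

// h_t(x): +1 if the selected feature exceeds the threshold, -1 otherwise.
int weakClassify(const Stump& s, const std::vector<double>& x) {
    assert(s.feature < x.size());
    return x[s.feature] > s.theta ? 1 : -1;
}

// H(x) = sign(sum_t alpha_t * h_t(x)); a zero sum is reported as -1 here.
int strongClassify(const std::vector<Stump>& stumps, const std::vector<double>& x) {
    double vote = 0.0;
    for (const Stump& s : stumps) vote += s.alpha * weakClassify(s, x);
    return vote > 0.0 ? 1 : -1;
}

// One reweighting round of discrete AdaBoost. 'weights' must sum to 1 and
// labels are +1 / -1. Sets and returns alpha for the stump, and updates the
// weights so that misclassified samples count more in the next round.
double reweight(Stump& s, const std::vector<std::vector<double>>& samples,
                const std::vector<int>& labels, std::vector<double>& weights) {
    assert(samples.size() == labels.size() && samples.size() == weights.size());
    double err = 0.0;  // weighted error of this stump
    for (std::size_t i = 0; i < samples.size(); ++i)
        if (weakClassify(s, samples[i]) != labels[i]) err += weights[i];
    assert(err > 0.0 && err < 0.5);  // sketch: only a useful, imperfect learner
    s.alpha = 0.5 * std::log((1.0 - err) / err);
    double total = 0.0;
    for (std::size_t i = 0; i < samples.size(); ++i) {
        weights[i] *= std::exp(-s.alpha * labels[i] * weakClassify(s, samples[i]));
        total += weights[i];
    }
    for (double& w : weights) w /= total;  // renormalize so the weights sum to 1
    return s.alpha;
}
```

In a full training loop one would, at each round, pick the stump with the lowest weighted error over all candidate features, call `reweight`, and append the stump to the list that `strongClassify` votes over.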
Receiver operating characteristic (ROC) curve (figure)

Classifier cascade
• We start with simple classifiers which reject many of the negative sub-windows while detecting almost all positive sub-windows
• A positive response from the first classifier triggers the evaluation of a second (more complex) classifier, and so on
• A negative outcome at any point leads to the immediate rejection of the sub-window

    IMAGE SUB-WINDOW -> Classifier 1 --T--> Classifier 2 --T--> Classifier 3 --T--> FACE
                            | F                 | F                 | F
                         NON-FACE            NON-FACE            NON-FACE

Classifier cascade
• A detection rate of 0.9 and a false positive rate on the order of 10^-6 can be achieved by a 10-stage cascade if each stage has a detection rate of 0.99 (0.99^10 ≈ 0.9) and a false positive rate of about 0.30 (0.3^10 ≈ 6×10^-6)

Output of Face Detector on Test Images (figure)

OpenCV face detection
• OpenCV already has cascade classifiers and Haar features implemented
• Pre-trained cascades for face detection, as well as for eyes and some more, are also included! (Look in "data/haarcascades/")

Code example

    #include <vector>
    #include <opencv2/objdetect/objdetect.hpp>
    #include <opencv2/highgui/highgui.hpp>
    #include <opencv2/imgproc/imgproc.hpp>
    using namespace std;
    using namespace cv;

    int main()
    {
        CascadeClassifier cascade("../../data/haarcascades/haarcascade_frontalface_alt.xml");
        Mat img = imread( "lena.jpg", 1 );
        if( cascade.empty() || img.empty() ) return -1; // cascade file or image not found
        vector<Rect> faces;
        Mat gray;
        cvtColor( img, gray, CV_BGR2GRAY );             // detection runs on the grayscale image
        cascade.detectMultiScale( gray, faces, 1.1, 2, 0 /*default options*/, Size(30, 30) );
        for( vector<Rect>::const_iterator r = faces.begin(); r != faces.end(); r++ )
        {
            // circle centred on the detected rectangle (radius and colour chosen for illustration)
            Point center( r->x + r->width/2, r->y + r->height/2 );
            int radius = (r->width + r->height)/4;
            Scalar color( 0, 0, 255 );                  // red in BGR
            circle( img, center, radius, color, 3, 8, 0 );
        }
        imshow( "result", img );
        waitKey(0);                                     // keep the window open until a key is pressed
        return 0;
    }

Lab exercise (TD)
• Face detection
• Masking
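The cascade arithmetic quoted earlier (0.99^10 ≈ 0.9 and 0.3^10 ≈ 6×10^-6) and the early-rejection logic can be checked with a short plain C++ sketch. It assumes that all stages have the same rate and behave independently, which is the simplification the slide itself makes; `cascadeRate`, `Window`, `Stage` and `runCascade` are hypothetical names used only for this illustration, and a window is reduced to a vector of already computed feature values.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Overall rate of an n-stage cascade when every stage has the same rate and
// the stages are assumed independent: a window must pass all of them.
double cascadeRate(double perStageRate, int stages) {
    assert(stages > 0 && perStageRate >= 0.0 && perStageRate <= 1.0);
    return std::pow(perStageRate, stages);
}

// A stage is any predicate on a window; here a window is a feature vector.
using Window = std::vector<double>;
using Stage = std::function<bool(const Window&)>;

// Returns true (FACE) only if every stage accepts. Evaluation stops at the
// first rejection; 'stagesRun' reports how many stages were evaluated.
bool runCascade(const std::vector<Stage>& stages, const Window& w,
                std::size_t& stagesRun) {
    stagesRun = 0;
    for (const Stage& s : stages) {
        ++stagesRun;
        if (!s(w)) return false;  // NON-FACE: later, costlier stages are skipped
    }
    return true;
}
```

Because most sub-windows are not faces, most calls to `runCascade` return after the first stage or two, so the average cost per window stays close to the cost of the cheapest stages.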