Introduction to IPL and OpenCV libraries

Bogdan Raducanu
Centre de Visió per Computador
E-mail: bogdan@cvc.uab.es

Cover story
OpenCV played a key role in the vision system of "Stanley".

What is IPL?
- The IPL (Image Processing Library) is a collection of functions implementing several image processing algorithms. It was developed by Intel.
- It is optimized for MMX and for different processor types (there is a DLL for each type of Intel processor).
- Images are stored in a specific data structure. To work with an image, we need to know the information contained in the structure header, which holds the image-specific characteristics.

Image structure
[Diagram: an IplImage* pointer refers to the IplImage header (width, height, bit depth, channel sequence, ..., pointer to image data), which in turn points to the pixel data.]

The "IplImage" structure includes a header containing image information and attributes:
- nChannels: number of channels (1 for grayscale, 3 for RGB, 4 for CMYK, ...)
- depth: number of bits per channel value and data type
  - IPL_DEPTH_1U (1-bit)
  - IPL_DEPTH_8U (8-bit unsigned)
  - IPL_DEPTH_8S (8-bit signed)
  - IPL_DEPTH_16U (16-bit unsigned)
  - IPL_DEPTH_16S (16-bit signed)
  - IPL_DEPTH_32S (32-bit signed)
  - IPL_DEPTH_32F (32-bit float)
- colorModel: "GRAY", "RGB", "CMYK", etc.
- channelSeq: "GRAY", "BGR", "BGRA", "RGB", "RGBA", "HSV", "YUV", etc.
- dataOrder: pixel-interleaved (RGBRGBRGB...)
  or plane-ordered (RRR...GGG...BBB)
- origin: top-left or bottom-left (IPL_ORIGIN_TL or IPL_ORIGIN_BL)
- scanline alignment: DWORD or QWORD
- width (in pixels)
- height (in pixels)
- ROI: region of interest (may be NULL)
- maskROI: (may be NULL)
- imageSize: image size (in bytes)
- imageData: pointer to pixel data

Function categories in IPL:
- create/destroy an image and access its content
- arithmetical/logical operations
- filtering
- morphological operations
- color space conversion
- histogram
- linear/geometrical transformations
- image statistics

Create/destroy an image and access its content:
- iplCreateImageHeader
- iplAllocateImage
- iplCreateROI
- iplSetROI
- iplCopy
- iplClone
- iplDeallocateImage
- iplPutPixel
- iplGetPixel

Example (create/destroy an image):

    #include "ipl.h"
    ...
    IplImage *img = iplCreateImageHeader(
        3, 0, IPL_DEPTH_8U, "RGB", "BGR",
        IPL_DATA_ORDER_PIXEL, IPL_ORIGIN_TL, IPL_ALIGN_QWORD,
        150, 100, NULL, NULL, NULL, NULL);
    iplAllocateImage(img, 0, 0);          /* allocate pixel data, no fill */
    /* ... use the image ... */
    iplDeallocate(img, IPL_IMAGE_ALL);    /* free header, data and ROIs */

• We created an image of 150x100 pixels with 3 channels. The color model is RGB and the data type is 8 bits per channel, unsigned. The channel order is BGRBGRBGR..., starting from the top row. Each row is aligned in memory on a QWORD (64-bit) boundary. No ROI is defined.
• We allocated memory for the data, without initializing it.
• With the IPL_IMAGE_ALL parameter, we freed the structure header, the data and any existing ROIs.

The content of an IplImage can be accessed in several ways:
- using the functions iplGetPixel and iplPutPixel:
  - Disadvantage: slow access
- going directly to the memory address of the pixel:
  - Disadvantages:
    • the operation is more complex, because the memory address has to be computed beforehand
    • care is needed with the data types
  - Advantage:
    • it is faster when accessing a large chunk of data

Example: let's assume we have an RGB image with 8 bits per channel, unsigned:

    #include "ipl.h"
    ...
    IplImage *img = iplCreateImageHeader(
        3, 0, IPL_DEPTH_8U, "RGB", "BGR",
        IPL_DATA_ORDER_PIXEL, IPL_ORIGIN_TL, IPL_ALIGN_QWORD,
        150, 100, NULL, NULL, NULL, NULL);
    iplAllocateImage(img, 0, 0);
    for (int y = 0; y < img->height; y++) {
        /* Rows are padded to the QWORD boundary (450 pixel bytes in a
           456-byte row here), so advance by widthStep, not by width*3. */
        unsigned char *B = (unsigned char *) img->imageData + y * img->widthStep;
        unsigned char *G = B + 1;
        unsigned char *R = G + 1;
        for (int x = 0; x < img->width; x++, B += 3, G += 3, R += 3) {
            /* here we can use/modify the pixel data */
        }
    }
    iplDeallocate(img, IPL_IMAGE_ALL);

Remarks:
- to use the IPL functions, you must include the header file "ipl.h" in your source code
- add "ipl.lib" to the project settings
- structure names use the "Ipl" prefix (uppercase "I"), whereas function names use the "ipl" prefix (lowercase "i")
- the remaining function categories are presented in the section dedicated to the OpenCV library
- for more information, consult the IPL manual: C:\Archivos de Programa\Intel\plsuite\doc\iplman.pdf

OpenCV library

Why OpenCV?
- IPL is a "low-level" library (it provides basic operations)
- OpenCV is a library that contains more complex data structures and "high-level" functions: optical flow, pattern recognition, 2D-3D real-time tracking, camera calibration, etc.
- it comes with some extensions that provide:
  - access to a camera and support for AVI files
  - a graphical user interface ("HighGUI") offering a faster and easier way to display and interact with images

OpenCV is in general compatible with the IPL library, and it is also based on the "IplImage" structure. It must, however, be used with the following restrictions in mind:
- the image statistics functions require the "IplImage" to have either a single channel or three channels, with one of the following data types: IPL_DEPTH_8U, IPL_DEPTH_8S or IPL_DEPTH_32F
- OpenCV supports only interleaved images
- the attributes colorModel, channelSeq, BorderMode and BorderConst are ignored
- the attributes maskROI and tileInfo must be set to 0
- the ROIs of the input and output images must be the same

Remark: structure names use the "Cv" prefix (uppercase "C"), whereas function names use the "cv" prefix (lowercase "c").

Create/destroy an image and access its content
- cvCreateImage
- cvCreateImageHeader
- cvReleaseImageHeader
- cvReleaseImage
- cvCreateImageData
- cvReleaseImageData
- cvSetImageROI
- cvCopyImage
- cvCloneImage
- cvGetImageRawData

Example:

    #include "ipl.h"
    #include "cv.h"
    #include "cxcore.h"
    ...
    IplImage *img = cvCreateImage(cvSize(150, 100), IPL_DEPTH_8U, 3);
    /* ... use the image ... */
    cvReleaseImage(&img);

Remark: in some cases (for example, when working with images captured from a camera), it may be more convenient to initialize the IplImage structure using the IPL functions.

Arithmetical/logical operations
- Arithmetical operations:
  - unary: cvAddS, cvSubS, ...
  - binary: cvAdd, cvMul, cvSub, cvMatMulAdd, cvInvert, ...
- Logical operations:
  - unary: cvAndS, cvOrS, cvXorS, ...
  - binary: cvAnd, cvOr, cvXor, ...

Remark: most of the OpenCV functions are defined to support both the IplImage and CvMat data types.
That is possible because of the definition of the CvArr data type:

    typedef void CvArr;

OpenCV functions inspect the first integer of the structure being passed in order to distinguish between the two data types. For an IplImage, this integer equals the size of the IplImage structure, whereas for a CvMat it equals 0x4242xxxx.

Image filtering
- Based on convolution with a fixed kernel:
  - cvLaplace, cvSobel, cvSmooth

Morphological operations
- cvErode, cvDilate
- cvMorphologyEx (advanced operations: opening, closing, top-hat, etc.)
- user-defined structuring elements:
  - cvCreateStructuringElementEx
  - cvReleaseStructuringElement

[Figure: example of applying the erosion function]

Color space conversion
- cvCvtColor allows the following color-space conversions:
  - CV_RGB2GRAY
  - CV_RGB2HSV
  - CV_RGB2YCrCb

Histogram
- cvCreateHist, cvReleaseHist
- cvCalcHist, cvCopyHist
- cvCompareHist, cvThreshHist
- cvGetMinMaxHistValue, cvNormalizeHist

Linear/geometrical transformations
- cvFFT (Fast Fourier Transform)
- cvDCT (Discrete Cosine Transform)
- cvResize, cvMirror, cvConvertScale

Feature extraction
- cvCanny (edge detection)
- cvHoughLines (line detection)
- cvFindCornerSubPix (corner detection)

[Figure: example of edge detection]

Image statistics
- cvNorm (C-, L1- and L2-norm)
- cvMoments (spatial and central moments)
- cvMinMaxLoc (finds the minimum and maximum values)

Drawing functions
- cvLine, cvRectangle
- cvCircle, cvEllipse
- cvPolyLine, cvFillPoly
- cvInitFont, cvPutText

Motion analysis
- estimators: Kalman, Condensation
  - cvKalmanXX, cvCondensXX
- movement patterns
  - cvCalcMotionGradient, cvMotionHistoryUpdate
- optical flow
  - cvCalcOpticalFlowPyrLK (implements the Lucas-Kanade method based on pyramidal decomposition)
- tracking
  - cvMeanShift, cvCamShift, cvSnakeImage
- background subtraction

3D reconstruction
- camera calibration
  - cvCalibrateCamera, cvFindExtrinsicCameraParams, cvUnDistort
- hand detection
  - cvFindHandRegion
- pose estimation
  - cvPOSIT
- finding pixel correspondences in a pair of stereo images
  - cvFindStereoCorrespondence

[Figure: example of distortion correction]

Object detection (faces)
The object detection algorithm implemented in OpenCV is based on the following papers:
- Paul Viola and Michael J. Jones. Rapid Object Detection using a Boosted Cascade of Simple Features. IEEE CVPR, 2001.
- Rainer Lienhart and Jochen Maydt. An Extended Set of Haar-like Features for Rapid Object Detection. IEEE ICIP 2002, Vol. 1, pp. 900-903, Sep. 2002.

Idea
A classifier (namely a cascade of boosted classifiers working with Haar-like features) is trained with a few hundred sample views of a particular object (e.g., a face or a car), called positive examples, which are scaled to the same size (say, 20x20), and with negative examples, which are arbitrary images of the same size.
- The trained classifier outputs "1" if the region is likely to show the object of interest (e.g., a face or a car), and "0" otherwise.
- The classifier is designed so that it can be easily "resized" in order to find objects of interest at different sizes, which is more efficient than resizing the image itself.
- The word "cascade" in the classifier name means that the resulting classifier consists of several simpler classifiers (stages) that are applied one after another to a region of interest, until at some stage the candidate is rejected or all the stages are passed.

Data structures and functions implemented in OpenCV for face detection
- CvHaarClassifierCascade (structure representing a cascade of classifiers)
- cvLoadHaarClassifierCascade (reads a cascade of classifiers from a file and stores it in the CvHaarClassifierCascade structure). The cascade is stored in an XML file: C:\Archivos de programa\Intel\OpenCV\data\haarcascades\...
- cvHaarDetectObjects (detects the objects in the image)
- cvReleaseHaarClassifierCascade (frees the memory occupied by the CvHaarClassifierCascade structure)

HighGUI library
Provides a fast and easy way to display and interact with images.
- Reading an image from a file
  - cvLoadImage (const char* filename, int iscolor CV_DEFAULT(1));
- Writing an image to a file
  - cvSaveImage (const char* filename, const CvArr* image);
  Several formats are supported: BMP, JPEG, PNG, TIFF, etc.
- Opening a window for visualization
  - cvNamedWindow (const char* name, int flags);
- Displaying the image
  - cvShowImage (const char* name, const CvArr* image);

The structure of OpenCV
'cv.h' contains:
- general image processing functions: filtering, color conversion, morphological operators, structural analysis, motion analysis, pattern recognition (object detection), camera calibration and 3D reconstruction
'cxcore.h' contains:
- basic structures, arithmetical/logical operators (copy, transformation), dynamic structures (sets, graphs, trees), drawing functions, error handling and system functions
'cvaux.h' contains:
- stereo correspondence, texture descriptors, 2D-3D trackers, background segmentation, morphing, etc.
'highgui.h' contains:
- the user interface

Installation of the IPL/OpenCV libraries and environment settings for MSVC++
The settings below assume that the OpenCV folder and the IPL folder (named 'plsuite') are installed in C:\Archivos de Programa\Intel\
- in the C source file, include the header files needed: 'ipl.h', 'cv.h', 'cxcore.h', 'highgui.h', 'cvaux.h', 'cvhaartraining.h'
- from the 'Project' menu, choose 'Settings' and click on the 'Link' tab. In the 'Object/library modules' field, add the following: ipl.lib cv.lib cxcore.lib highgui.lib cvaux.lib cvhaartraining.lib (the last one only if you use the face detector functions)
- from the 'Tools' menu, choose 'Options' and click on the 'Directories' tab.
  In the 'Show directories for' field, choose 'Include files', then add the following paths:
  - C:\ARCHIVOS DE PROGRAMA\INTEL\PLSUITE\INCLUDE
  - C:\ARCHIVOS DE PROGRAMA\INTEL\OPENCV\CV\INCLUDE
  - C:\ARCHIVOS DE PROGRAMA\INTEL\OPENCV\OTHERLIBS\HIGHGUI
  - C:\ARCHIVOS DE PROGRAMA\INTEL\OPENCV\CXCORE\INCLUDE
  - C:\ARCHIVOS DE PROGRAMA\INTEL\OPENCV\apps\HaarTraining\include (if you are using the face detector)
- from the 'Tools' menu, choose 'Options' and click on the 'Directories' tab. In the 'Show directories for' field, choose 'Library files', then add the following paths:
  - C:\ARCHIVOS DE PROGRAMA\INTEL\PLSUITE\LIB\MSVC
  - C:\ARCHIVOS DE PROGRAMA\INTEL\OPENCV\LIB
- the last step consists of adding the DLL paths to the environment variables:
  - from 'Control Panel', choose 'System', then 'Advanced', and finally click on 'Environment Variables'. In the dialog box that appears, select the 'PATH' item in the 'System Variables' section and edit it
  - add the following paths:
    C:\Archivos de programa\Intel\OpenCV\bin
    C:\Archivos de programa\Intel\plsuite\bin

Examples
Three examples are included (provided in separate files):
- the first gets a live stream from a webcam (make sure the DirectX library is installed)
- the second detects faces in an image using the detector that comes with the OpenCV library
- the third performs image convolution

More information...
- http://www.cvc.uab.es/~bogdan/CV/cv.html
- IPL/OpenCV online documentation (html, pdf files)
- C:\Archivos de programa\Intel\OpenCV\samples
- http://www.site.uottawa.ca/~laganier/tutorial/opencv+directshow/cvision.htm
- http://groups.yahoo.com/group/OpenCV/ (you have to register)