Detecting and Segmenting Objects for Mobile Manipulation

OpenCV Tutorial
Omri Perez
Adapted from: Gary Bradski, Senior Scientist, Willow Garage; Consulting Professor, Stanford CS Dept.
http://opencv.willowgarage.com
www.willowgarage.com

• Vision is Hard
• Camera Model, Lens, Problems and Corrections
• OpenCV
• OpenCV Tour
CS324

Vision is Hard
• What is it?
  – Turning sensor readings into perception.
• Why is it hard?
  – It's just numbers.
Maybe try gradients to find edges?

Use Edges? … It's not so simple
• Depth discontinuity
• Surface orientation discontinuity
• Reflectance discontinuity (i.e., change in surface material properties)
• Illumination discontinuity (e.g., shadow)
Slide credit: Christopher Rasmussen

Must deal with Lighting Changes …

Lighting is also a Strong Cue
Gary Bradski (c) 2008

The Brain Assumes 3D Geometry
Perception is ambiguous … depending on your point of view!

Geometrical aberrations
• spherical distortion
• astigmatism
• tangential distortion
• coma
Non-geometrical aberrations
• chromatic
• vignetting
Aberrations are reduced by combining lenses.
These are typically what is corrected for in camera calibration.
Marc Pollefeys

Distortion Correction so that Lens can Approximate a Pinhole Camera
• Distortions are corrected mathematically
  – We use a calibration pattern
  – We find where the points ended up
  – We know where the points should be
• OpenCV 2.2 function:
    double calibrateCamera(const vector<vector<Point3f> >& objectPoints,
                           const vector<vector<Point2f> >& imagePoints,
                           Size imageSize, Mat& cameraMatrix, Mat& distCoeffs,
                           vector<Mat>& rvecs, vector<Mat>& tvecs, int flags=0);

OpenCV Overview: opencv.willowgarage.com
> 2000 algorithms
• Robot support
• Image Pyramids
• General Image Processing Functions
• Geometric descriptors
• Camera calibration, Stereo, 3D
• Segmentation
• Features
• Utilities and Data Structures
• Transforms
• Tracking
• Fitting
• Machine Learning: Detection,
Recognition
• Matrix Math

OpenCV Tends Towards Real Time
http://opencv.willowgarage.com

Where is OpenCV Used?
• Google Maps, Google street view, Google Earth, Books
• Academic and Industry Research
• Safety monitoring (dam sites, mines, swimming pools)
• Security systems
• Image retrieval
• Video search
• Structure from motion in movies
• Machine vision factory production inspection systems
• Robotics
• Well over 2M downloads
Screen shots by Gary Bradski, 2005

OpenCV Modules
• Calib3d – Calibration, stereo, homography, rectify, projection, solvePnP
• Contrib – Octree, self-similar feature, sparse L-M, bundle adj, chamfer match
• Core – Data structures, access, matrix ops, basic image operations
• Features2d – Feature detectors, descriptors and matchers in one architecture
• Flann – Fast library for approximate nearest neighbors
• Gpu – CUDA speedups
• Highgui – GUI to read, write, draw, print and interact with images
• Imgproc – Image processing functions
• Ml – Statistical machine learning, boosting, clustering
• Objdetect – PASCAL VOC latent SVM and data reading
• Traincascade – Boosted rejection cascade

Software Engineering
• Works on:
  – Linux, Windows, Mac OS (+ Android since OpenCV 2.2)
• Languages:
  – C++, Python, C
• Online documentation:
  – Online reference manuals: C++, C and Python.

Gradients: Scharr instead of Sobel
• Sobel has been the traditional 3x3 gradient finder.
• Use the 3x3 Scharr operator instead since it is just as fast but has more accurate response on diagonals.
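To make the Sobel-versus-Scharr comparison concrete, here is a minimal sketch in plain C++ (no OpenCV) that evaluates a 3x3 x-derivative at one pixel. The smoothing taps are the standard Sobel (1, 2, 1) and Scharr (3, 10, 3) weights; the `Image` struct and the functions `gradX`, `sobelX` and `scharrX` are hypothetical helpers for illustration only, not part of the OpenCV API.

```cpp
#include <cassert>
#include <vector>

// Hypothetical minimal grayscale image: row-major floats.
struct Image {
    int rows, cols;
    std::vector<float> data;
    float at(int r, int c) const { return data[r * cols + c]; }
};

// x-derivative at interior pixel (r, c) with a separable 3x3 kernel:
// smoothing taps (side, center, side) down the rows, [-1 0 +1] across the columns.
float gradX(const Image& img, int r, int c, float side, float center) {
    assert(r > 0 && r < img.rows - 1 && c > 0 && c < img.cols - 1);
    const float w[3] = {side, center, side};
    float sum = 0.0f;
    for (int dr = -1; dr <= 1; ++dr)
        sum += w[dr + 1] * (img.at(r + dr, c + 1) - img.at(r + dr, c - 1));
    return sum;
}

// Sobel uses taps (1, 2, 1); Scharr uses taps (3, 10, 3).
float sobelX(const Image& img, int r, int c)  { return gradX(img, r, c, 1.0f, 2.0f); }
float scharrX(const Image& img, int r, int c) { return gradX(img, r, c, 3.0f, 10.0f); }
```

Both kernels cost the same nine multiply-adds per pixel; only the weights differ. Because the raw responses differ in scale (the taps sum to 4 versus 16), divide by 8 or 32 respectively before comparing the two.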
    void Scharr(const Mat& src, Mat& dst, int ddepth, int xorder, int yorder,
                double scale=1, double delta=0, int borderType=BORDER_DEFAULT)

Canny Edge Detector
    Canny()
OpenCV team, Gary Bradski

Hough Transform
    HoughCircles(), HoughLines(), HoughLinesP() (probabilistic Hough)

Scale Space
    void cvPyrDown(IplImage* src, IplImage* dst, IplFilter filter = IPL_GAUSSIAN_5x5);
    void cvPyrUp(IplImage* src, IplImage* dst, IplFilter filter = IPL_GAUSSIAN_5x5);

Space Variant vision: Log-Polar Transform
    cvLogPolar(src, dst, center, size, CV_INTER_LINEAR+CV_WARP_FILL_OUTLIERS)

Delaunay Triangulation, Voronoi Tessellation
    CvSubdiv2D* cvCreateSubdivDelaunay2D(CvRect rect, CvMemStorage* storage)

Contours
    void findContours()

Histogram Equalization
    void equalizeHist(const Mat& src, Mat& dst)

Image textures
• Inpainting:
  – Removes damage to images; in this case, it removes the text.
    void inpaint(const Mat& src, const Mat& inpaintMask, Mat& dst, double inpaintRadius, int flags);

Morphological Operations Examples
• Morphology - applying Min-Max
filters and their combinations
    void morphologyEx()
    createMorphologyFilter()
    erode()
    dilate()
• Image: $I$, with structuring element $B$
• Erosion: $I \ominus B$
• Dilation: $I \oplus B$
• Opening: $I \circ B = (I \ominus B) \oplus B$
• Closing: $I \bullet B = (I \oplus B) \ominus B$
• Grad(I) $= (I \oplus B) - (I \ominus B)$
• TopHat(I) $= I - (I \circ B)$
• BlackHat(I) $= (I \bullet B) - I$

Distance Transform
• Distance field from edges of objects
    void distanceTransform(const Mat& src, Mat& dst, int distanceType, int maskSize)

Flood Filling
    int floodFill(Mat& image, Point seed, Scalar newVal, Rect* rect=0,
                  Scalar loDiff=Scalar(), Scalar upDiff=Scalar(), int flags=4)

Thresholds
    void adaptiveThreshold()
    double threshold()

Segmentation
• Pyramid, mean-shift, graph-cut
• Here: Watershed
    void watershed(const Mat& image, Mat& markers)

Background Subtraction
    BackgroundSubtractorMOG2(), see samples/cpp/bgfg_segm.cpp

Image Segmentation & Minimum Cut
[Figure labels: image pixels, pixel neighborhood, similarity measure w, minimum cut]
* From Khurram Hassan-Shafique, CAP5415 Computer Vision 2003

GrabCut
• Graph Cut based segmentation
    void grabCut(const Mat& image, Mat& mask, Rect rect, Mat& bgdModel, Mat& fgdModel, int iterCount, int mode)

Motion Templates (my work with James Davies)
• Object silhouette
• Motion history images (MHI)
• Motion history gradients (MHG)
• Motion segmentation algorithm

Segmentation, Motion Tracking
    void updateMotionHistory();
    void calcMotionGradient();
    double calcGlobalOrientation();
• Motion Segmentation
• Pose Recognition
• Gesture Recognition
James Davies, Gary Bradski

Tracking with CAMSHIFT
• Control game with head
    RotatedRect CamShift(const Mat& probImage, Rect& window, TermCriteria criteria)

3D tracking
• Camera Calibration
• View Morphing
• POSIT
    void POSIT()
• A more general technique for solving pose is solving the Perspective-N-Point problem:
    void solvePnP(…)

Mean-Shift for Tracking
    CamShift();
    MeanShift();
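The idea behind mean-shift tracking on a probability image can be sketched in a few lines of plain C++: repeatedly move a search window so that its center lands on the centroid of the probability mass it covers. This is a simplified illustration under stated assumptions, not OpenCV's implementation. The `Window` struct and the functions `meanShiftStep` and `meanShiftTrack` are hypothetical names, and the probability image is assumed to be a row-major vector of non-negative floats.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical search window: top-left corner and size, in pixels.
struct Window { int x, y, w, h; };

// One iteration: recenter the window on the centroid of the mass inside it.
// Returns false when the window covers no mass (nothing to follow).
bool meanShiftStep(const std::vector<float>& prob, int rows, int cols, Window& win) {
    assert(static_cast<int>(prob.size()) == rows * cols);
    double m00 = 0.0, m10 = 0.0, m01 = 0.0;  // zeroth and first moments
    const int y0 = std::max(0, win.y), y1 = std::min(rows, win.y + win.h);
    const int x0 = std::max(0, win.x), x1 = std::min(cols, win.x + win.w);
    for (int y = y0; y < y1; ++y) {
        for (int x = x0; x < x1; ++x) {
            const double p = prob[y * cols + x];
            m00 += p;
            m10 += p * x;
            m01 += p * y;
        }
    }
    if (m00 <= 0.0) return false;
    win.x = static_cast<int>(std::lround(m10 / m00)) - win.w / 2;
    win.y = static_cast<int>(std::lround(m01 / m00)) - win.h / 2;
    return true;
}

// Iterate until the window stops moving or maxIter is reached.
// Returns the number of iterations performed.
int meanShiftTrack(const std::vector<float>& prob, int rows, int cols,
                   Window& win, int maxIter = 10) {
    int iter = 0;
    while (iter < maxIter) {
        const Window prev = win;
        if (!meanShiftStep(prob, rows, cols, win)) break;
        ++iter;
        if (win.x == prev.x && win.y == prev.y) break;  // converged
    }
    return iter;
}
```

In a tracker, `prob` would typically be a back-projected color histogram of the target, recomputed per frame, with the previous frame's window used as the starting point. CAMSHIFT additionally adapts the window size and orientation, which this sketch leaves out.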
Optical Flow
    calcOpticalFlowPyrLK()
    Also see dense optical flow: calcOpticalFlowFarneback()

    // opencv/samples/c/lkdemo.c
    int main(…){
        …
        CvCapture* capture = <…> ? cvCaptureFromCAM(camera_id) : cvCaptureFromFile(path);
        if( !capture ) return -1;
        for(;;) {
            IplImage* frame = cvQueryFrame(capture);
            if( !frame ) break;
            // … copy and process image
            cvCalcOpticalFlowPyrLK( … );
            cvShowImage( "LkDemo", result );
            c = cvWaitKey(30); // run at ~20-30fps speed
            if( c >= 0 ) {
                // process key
            }
        }
        cvReleaseCapture(&capture);
    }

  $$I(x + dx,\, y + dy,\, t + dt) = I(x, y, t)$$
  $$-\frac{\partial I}{\partial t} = \frac{\partial I}{\partial x}\,\frac{dx}{dt} + \frac{\partial I}{\partial y}\,\frac{dy}{dt}$$
  $$G\,\partial X = b, \quad \partial X = (\partial x, \partial y), \quad G = \sum \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}, \quad b = -\sum I_t \begin{bmatrix} I_x \\ I_y \end{bmatrix}$$

Features 2D
Read two input images:
    Mat img1 = imread(argv[1], CV_LOAD_IMAGE_GRAYSCALE);
Detect keypoints in both images:
    // detecting keypoints
    FastFeatureDetector detector(15);
    vector<KeyPoint> keypoints1;
    detector.detect(img1, keypoints1);
Compute descriptors for each of the keypoints:
    // computing descriptors
    SurfDescriptorExtractor extractor;
    Mat descriptors1;
    extractor.compute(img1, keypoints1, descriptors1);
Now, find the closest matches between descriptors from the first image to the second:
    // matching descriptors
    BruteForceMatcher<L2<float> > matcher;
    vector<DMatch> matches;
    matcher.match(descriptors1, descriptors2, matches);

Features 2D continued …
Visualize the results:
    namedWindow("matches", 1);
    Mat img_matches;
    drawMatches(img1, keypoints1, img2, keypoints2, matches, img_matches);
    imshow("matches", img_matches);
    waitKey(0);
Find the homography transformation between two sets of points:
    vector<Point2f> points1, points2;
    // fill the arrays with the points
    ....
    Mat H = findHomography(Mat(points1), Mat(points2), CV_RANSAC, ransacReprojThreshold);
Create a set of inlier matches and draw them.
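What "inlier" means in the step above can be sketched without OpenCV: map each point from the first image through H with a perspective divide, and keep the match if the result lands within a pixel threshold of its partner in the second image. The types `Pt` and `Mat3` and the functions `applyHomography` and `isInlier` are hypothetical helpers for illustration. This is a sketch of the reprojection test, not the code inside `findHomography`.

```cpp
#include <array>
#include <cassert>
#include <cmath>

struct Pt { double x, y; };                          // hypothetical 2D point
using Mat3 = std::array<std::array<double, 3>, 3>;   // row-major 3x3 homography

// Map p through H: multiply in homogeneous coordinates, then divide by w.
Pt applyHomography(const Mat3& H, const Pt& p) {
    const double xh = H[0][0] * p.x + H[0][1] * p.y + H[0][2];
    const double yh = H[1][0] * p.x + H[1][1] * p.y + H[1][2];
    const double w  = H[2][0] * p.x + H[2][1] * p.y + H[2][2];
    assert(std::fabs(w) > 1e-12);  // the point must not map to infinity
    return {xh / w, yh / w};
}

// A match (p1 -> p2) counts as an inlier when the reprojection error,
// measured in pixels, does not exceed the threshold.
bool isInlier(const Mat3& H, const Pt& p1, const Pt& p2, double threshold) {
    const Pt q = applyHomography(H, p1);
    return std::hypot(q.x - p2.x, q.y - p2.y) <= threshold;
}
```

Running `isInlier` over every match with the same threshold that was passed as `ransacReprojThreshold` gives a mask of matches to keep for the second `drawMatches()` call.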
Use perspectiveTransform function to map points with homography:
    Mat points1Projected;
    perspectiveTransform(Mat(points1), points1Projected, H);
Use drawMatches() again for drawing inliers.

Features2d contents
Detection: Detectors available
• SIFT
• SURF
• FAST
• STAR
• MSER
• GFTT (Good Features To Track)
Description: Descriptors available
• SIFT
• SURF
• One way
• Calonder (under construction)
• FERNS

Kalman Filter, Particle Filter for Tracking
• Kalman: ::KalmanFilter class
• Condensation or Particle Filter: ConDensation

Projections
• Find:
    Mat getAffineTransform()
    Mat getPerspectiveTransform()
• Warp:
    void warpAffine()
    void warpPerspective()

Homography
• Maps one plane to another
  – In our case: A plane in the world to the camera plane
  – Great notes on this: Robert Collins CSE486
    • http://www.cse.psu.edu/~rcollins/CSE486/lecture16.pdf
  – Derivation details: Learning OpenCV 384-387

Perspective Matrix Equation (camera coords Pt in world to pt on image)
  $$\begin{bmatrix} x' \\ y' \\ z' \end{bmatrix} = \begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}, \qquad p = M_{int} P_C$$
  $$x = f\,\frac{X}{Z}, \qquad y = f\,\frac{Y}{Z}$$
Gary Bradski and Adrian Kaehler: Learning OpenCV
Gary Bradski, CS223A, Intro to Robotics

Homography
• We often use the chessboard detector to find 4 non-collinear points
  – (X,Y) * 4 = 8 constraints
  – To solve for the 8 homography parameters.
• Code: Once again, OpenCV makes this easy
  – findHomography(…) or:
  – getPerspectiveTransform(…)

Single Camera Calibration
See samples/cpp/calibration.cpp
Now, camera calibration can be done by holding a checkerboard in front of the camera for a few seconds. After that you'll get:
• 3D view of checkerboard
• Un-distorted image

Stereo … Depth from Triangulation
• Involved topic, here we will just skim the basic geometry.
• Imagine two perfectly aligned image planes:
• Depth "Z" and disparity "d" are inversely related:

Stereo
• In aligned stereo, depth is from similar triangles (with disparity $d = x^l - x^r$):
  $$\frac{T - (x^l - x^r)}{Z - f} = \frac{T}{Z} \;\Rightarrow\; Z = \frac{fT}{x^l - x^r}$$
• Problem: Cameras are almost impossible to align
• Solution: Mathematically align them:
All: Gary Bradski and Adrian Kaehler: Learning OpenCV

Stereo Rectification
• Algorithm steps are shown at right:
• Goal:
  – Each row of the image contains the same world points
  – "Epipolar constraint"
• Result: Epipolar alignment of features:

samples/c
In ...\opencv_incomp\samples\c
bgfg_codebook.cpp - Use of an image value codebook for background detection for collecting objects
bgfg_segm.cpp - Use of a background learning engine
blobtrack.cpp - Engine for blob tracking in images
calibration.cpp - Camera calibration
camshiftdemo.c - Use of meanshift in simple color tracking
contours.c - Demonstrates how to compute and use object contours
convert_cascade.c - Change the window size in a recognition cascade
convexhull.c - Find the convex hull of an object
delaunay.c - Triangulate a 2D point cloud
demhist.c - Show how to use histograms for recognition
dft.c - Discrete Fourier transform
distrans.c - Distance map from edges in an image
drawing.c - Various drawing functions
edge.c - Edge detection
facedetect.c - Face detection by classifier cascade
ffilldemo.c - Flood filling demo
find_obj.cpp - Demo use of SURF features
fitellipse.c - Robust ellipse fitting
houghlines.c - Line detection
image.cpp - Shows use of new image class, CvImage();
inpaint.cpp - Texture infill to repair imagery
kalman.c - Kalman filter for tracking
kmeans.c - K-Means
laplace.c - Convolve image with Laplacian.
letter_recog.cpp - Example of using machine learning: Boosting, Backpropagation (MLP) and Random forests
lkdemo.c - Lucas-Kanade optical flow
minarea.c - For a cloud of points in 2D, find min bounding box and circle. Shows use of Cv_SEQ
morphology.c - Demonstrates Erode, Dilate, Open, Close
motempl.c - Demonstrates motion templates (orthogonal optical flow given silhouettes)
mushroom.cpp - Demonstrates use of decision trees (CART) for recognition
pyramid_segmentation.c - Color segmentation in pyramid
squares.c - Uses contour processing to find squares in an image
stereo_calib.cpp - Stereo calibration, recognition and disparity map computation
watershed.cpp - Watershed transform demo.

samples/cpp
Code of Possible use for Projects
• Brief_match_test
  – Use of fast det., brief descrp. ORB will replace. See video_homography.cpp
• Calibration (single camera)
• Chamfer (2D edge matching)
• Connected_components
  – Using contours to clean up regions in images.
• Contours2 (finding and drawing)
• Convexhull (finding in 2D)
• Cout_mat
  – (print out Mat)
• Demhist using calcHist()
  – histograms and histogram normalization
• Descriptor_extractor_matcher
  – Use of features 2D detector descriptor
  – Also see matcher_simple.cpp
• Distrans
  – Use of the distanceTransform on edge images and Voronoi tessel.
• Edge (Canny edge detection)
• Ffilldemo (flood fill methods)
• Filestorage (I/O of data structs)
• Fitellipse (find contours, fit ellipse)
• Grabcut (energy based segmentation)
• Imagelist_creator (yaml or xml lists)
  – Read using: starter_imagelist.cpp
• Kalman (using the Kalman filter)
• Kinect_maps (using Kinect in OpenCV)
• Kmeans (using kmeans clustering)
• Laplace (finding points/edges)
• Letter_recog (machine learning)
  – Use of Random trees, boosting, MLP
• Lkdemo (Lucas-Kanade optical flow)
• Morphology2 (erosion, dilation etc)
• Multicascadeclassifier (rejection cascade)
• Peopledetect (use of HOG)
• Select3dobj (calc R and t from calib)
• Stereo_* (stereo calib.
and matching)
• Watershed (segmentation algorithm)

ML for Recognition

Machine Learning Library (MLL)
CLASSIFICATION / REGRESSION
• (new) Fast Approximate NN (FLANN)
• (new) Extremely Random Trees
• (coming) LSH
• CART
• Naïve Bayes
• MLP (Back propagation)
• Statistical Boosting, 4 flavors
• Random Forests
• SVM
• Face Detector
• (Histogram matching)
• (Correlation)
CLUSTERING
• K-Means
• EM
• (Mahalanobis distance)
TUNING/VALIDATION
• Cross validation
• Bootstrapping
• Variable importance
• Sampling methods
http://opencv.willowgarage.com

K-Means, Mahalanobis
K-Means:
• Choose K data points as cluster centers
• While cluster centers change:
  – Assign each data point to the closest center
  – If a cluster has no points, choose a random point from points far away from other cluster centers
  – Move the centers to the mean position of points in their cluster
    double kmeans()
    double Mahalanobis()

Patch Matching
    void matchTemplate()

Gesture Recognition
    double compareHist()
• Gestures: Up, R, L, Stop, OK
• Gesture via: gradient histogram* based gesture recognition with tracking. The Meanshift algorithm is used to track; histogram intersection with gradient is used to recognize.
*Bill Freeman

Boosting: Face Detection with Viola-Jones Rejection Cascade
In samples/cpp, see: Multicascadeclassifier.cpp

Machine learning
• Good features often beat good algorithms
• Choose an operating point that trades off accuracy vs. cost
[Figure: confusion matrix with cells TP, FN, FP, TN and 100% axis marks]

Some project ideas: (feel free to steal, modify or ignore)
1. Identify faces in (cellphone) pictures using Facebook as database.
2. Use the (cellphone) camera to detect dangerous road events and/or detect when someone is awake or sleeping (even with sunglasses on?), also in low light conditions.
3.
Use webcam/cellphone to take pictures or videos of a room and then generate the floor plan.
4. Photograph or video a Jenga tower, and advise the player which is the safest block to remove.
5. Make a multiplayer game (if possible with more than one computer/camera) based on CV.
6. Make an intuitive two-handed UI for the OS (extra points for adding the use of facial gestures).
7. Do something with Kinect (e.g. a golf game).
8. For engineers: make a paintball turret (e.g. http://www.paintballsentry.com/Videos.htm).
9. Make a security system with multiple cameras that records high quality portrait images and low quality video and alerts to the presence of suspicious people in real time (e.g. covered faces).
10. Use the camera to cheat/gain an advantage in real life interactions (sports, gambling).
11. Make a system (on the cellphone) that identifies/classifies photographed objects (for example mushrooms).

Questions?

Useful OpenCV Links
• OpenCV Wiki: http://opencv.willowgarage.com/wiki
• User Group (44700 members 4/2011): http://tech.groups.yahoo.com/group/OpenCV/join
• OpenCV Code Repository: svn co https://code.ros.org/svn/opencv/trunk/opencv
• New Book on OpenCV: http://oreilly.com/catalog/9780596516130/
• Or, direct from Amazon: http://www.amazon.com/Learning-OpenCV-Computer-VisionLibrary/dp/0596516134
• Code examples from the book: http://examples.oreilly.com/9780596516130/
• Documentation: http://opencv.willowgarage.com/documentation/index.html
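As a closing sketch, the K-Means loop from the machine learning slides (choose K data points as centers, assign each point to the closest center, move each center to the mean of its points) can be written in plain C++ as below. This is a minimal illustration, not OpenCV's `kmeans()`. The `Point2` struct and the function `kmeans2d` are hypothetical names, the caller is assumed to supply the initial centers, and an empty cluster simply keeps its previous center rather than being re-seeded from a far-away point.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

struct Point2 { double x, y; };  // hypothetical 2D data point

// Returns the cluster label of each point; `centers` is updated in place.
// Assumption: `centers` already holds K initial centers (e.g. K of the data points).
std::vector<int> kmeans2d(const std::vector<Point2>& pts,
                          std::vector<Point2>& centers, int maxIter = 100) {
    assert(!pts.empty() && !centers.empty());
    std::vector<int> labels(pts.size(), -1);
    for (int iter = 0; iter < maxIter; ++iter) {
        bool changed = false;
        // Assign each data point to the closest center (squared Euclidean distance).
        for (std::size_t i = 0; i < pts.size(); ++i) {
            int best = 0;
            double bestDist = std::numeric_limits<double>::max();
            for (std::size_t k = 0; k < centers.size(); ++k) {
                const double dx = pts[i].x - centers[k].x;
                const double dy = pts[i].y - centers[k].y;
                const double d = dx * dx + dy * dy;
                if (d < bestDist) { bestDist = d; best = static_cast<int>(k); }
            }
            if (labels[i] != best) { labels[i] = best; changed = true; }
        }
        // Stable assignments imply the centers would not move any further.
        if (!changed) break;
        // Move each center to the mean position of the points in its cluster.
        std::vector<Point2> sum(centers.size(), Point2{0.0, 0.0});
        std::vector<int> count(centers.size(), 0);
        for (std::size_t i = 0; i < pts.size(); ++i) {
            sum[labels[i]].x += pts[i].x;
            sum[labels[i]].y += pts[i].y;
            ++count[labels[i]];
        }
        for (std::size_t k = 0; k < centers.size(); ++k) {
            // Simplification: an empty cluster keeps its old center.
            if (count[k] > 0) centers[k] = Point2{sum[k].x / count[k], sum[k].y / count[k]};
        }
    }
    return labels;
}
```

Seeding `centers` with K of the data points matches the first step on the slide. The result depends on that choice, so several runs with different seeds are a common precaution.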
