How to Accelerate OpenCV Applications with the Zynq-7000 All Programmable SoC using Vivado HLS Video Libraries
August 28, 2013
© Copyright 2013 Xilinx

OpenCV Overview
Open Source Computer Vision (OpenCV) is widely used to develop computer vision applications
– Library of 2500+ optimized video functions
– Optimized for desktop processors and GPUs
– Tens of thousands of users
– Runs out of the box on ARM processors in Zynq
However
– HD processing with OpenCV is often limited by external memory
– Memory bandwidth is a bottleneck for performance
– Memory accesses limit power efficiency
Zynq All Programmable SoCs are a great way of implementing embedded computer vision applications
– High performance and low power

Real-Time Computer Vision Applications
Computer Vision Application                  Real-time Analytics Function
Advanced Drivers Assist for Safety           Lane or pedestrian detection
Surveillance for Security                    Friend vs. foe recognition
Machine Vision for Quality                   High-velocity object detection
Medical Imaging for non-invasive surgery     Tumor detection

Real-time Video Analytics Processing
[Diagram: video input at 4Kx2K, 1080p, 720p or 480p -> pixel-based image processing and feature extraction -> features F1, F2, F3, ... -> frame-based feature processing and decision making]
Pixel-based image processing and feature extraction
– 100s of ops/pixel
– 8 MP x 100 ops/frame = 100s of Gops
Frame-based feature processing and decision making
– 10000s of ops/feature
– 1000s of features/sec = Mops

Heterogeneous Implementation of Real-time Video Analytics
[Diagram: the same pipeline, with pixel-based image processing and feature extraction in the Hardware Domain (FPGA) and frame-based feature processing and decision making in the Software Domain (ARM)]
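The operation counts quoted on the analytics slides can be checked with a little arithmetic. The sketch below is only an illustration: the function names are hypothetical, and the frame rate is an assumption, since the slides give per-frame and per-second figures without stating one.

```cpp
#include <cassert>
#include <cstdint>

// Pixel-domain workload in operations per second: every pixel of every
// frame is processed. frames_per_second is an assumed parameter; the
// slides do not state a frame rate for the 8 MP case.
std::uint64_t pixel_ops_per_second(std::uint64_t width, std::uint64_t height,
                                   std::uint64_t ops_per_pixel,
                                   std::uint64_t frames_per_second) {
    assert(width > 0 && height > 0);
    return width * height * ops_per_pixel * frames_per_second;
}

// Feature-domain workload in operations per second: the cost scales with
// the number of extracted features, not with the number of pixels.
std::uint64_t feature_ops_per_second(std::uint64_t features_per_second,
                                     std::uint64_t ops_per_feature) {
    return features_per_second * ops_per_feature;
}
```

For a 3840x2160 frame (about 8.3 MP) at 100 ops/pixel and an assumed 60 frames per second, pixel_ops_per_second returns 49,766,400,000, or roughly 50 Gops; a few hundred ops per pixel puts the pixel domain in the 100s of Gops. By contrast, 1000 features/sec at 10000 ops/feature is 10 Mops, which is why the feature stage fits comfortably on the ARM cores.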
Xilinx Real-time Image Analytics Implementation: Zynq All Programmable SoC
[Diagram: the same pipeline mapped onto a single Zynq All Programmable SoC. Pixel-based image processing and feature extraction (up to 4Kx2K; 100s of ops/pixel, 8 MP x 100 ops/frame = 100s of Gops) passes features F1, F2, F3, ... to frame-based feature processing and decision making (10000s of ops/feature, 1000s of features/sec = Mops)]

Vivado: Productivity gains for OpenCV functions
– C simulation of HD video algorithm: ~1 fps
– RTL simulation of HD video: 1 frame per hour
– Real-time FPGA implementation: up to 60 fps

Accelerating OpenCV Applications
[Diagram: target applications: Driver Assist, Broadcast Monitor, HD Surveillance, Video Conferencing, Studio Cinema Camera, Digital Signage, Consumer Displays, Office-class MFP, Machine Vision, Cinema Projection, Medical Displays]
– Frame-level processing library for PS
– Pixel processing interfaces and basic functions for analytics (Vivado HLS)

Zynq Video TRD architecture
[Block diagram: DDR3 external memory; Processing System with DDR memory controller, dual-core Cortex-A9, SD card and hardened peripherals; S_AXI_HP (64-bit) and S_AXI_GP (32-bit) ports; AXI Interconnect; AXI VDMA; HDMI video input; HLS-generated pipeline (AXI4-Stream IP core); Xylon display controller; HDMI output]
– Video access to external memory using 64-bit High Performance ports
– Control register access using 32-bit General Purpose ports
– Video streams implemented using AXI4-Stream

IP Centric Design flow
Accelerated IP Generation and Integration
[Diagram: C based IP creation (C, C++ or SystemC, with C libraries for floating point math.h, fixed point and video) and System Generator for DSP produce VHDL or Verilog plus SW drivers. The resulting IP joins Xilinx IP, 3rd party IP and user IP in an IP subsystem, which is integrated in the user's preferred system integration environment: Vivado IP Integrator, System Generator for DSP, or Vivado RTL integration]
Using OpenCV in FPGA designs
Pure OpenCV Application
– Image File Read (OpenCV) -> OpenCV function chain -> Image File Write (OpenCV)
Integrated OpenCV Application
– Live Video Input -> OpenCV function chain -> Live Video Output
OpenCV Reference
– Image File Read (OpenCV) -> OpenCV2AXIvideo -> Synthesizable Block -> AXIvideo2OpenCV -> Image File Write (OpenCV)
– Synthesizable Block: AXIvideo2Mat -> HLS video library function chain -> Mat2AXIvideo
Accelerated OpenCV Application
– Live Video Input -> Synthesized Block -> Live Video Output
– Synthesized Block: AXIvideo2Mat -> HLS video library function chain -> Mat2AXIvideo

Pure OpenCV Application
[Diagram, built up over four slides: Image File Read (OpenCV) -> OpenCV function chain -> Image File Write (OpenCV), mapped onto the Zynq Video TRD architecture (DDR3 external memory, DDR memory controller, SD card, dual-core Cortex-A9, hardened peripherals, AXI Interconnect, AXI VDMA, HDMI video input, HLS-generated pipeline, Xylon display controller, HDMI output), with numbered callouts 1-5; the OpenCV function chain sits in the Processing System]
Integrated OpenCV Application
[Diagram: Live Video Input -> OpenCV function chain -> Live Video Output, mapped onto the Zynq Video TRD architecture with numbered callouts 1-5; the OpenCV function chain sits in the Processing System]

OpenCV Reference / Software Execution
[Diagram: Image File Read (OpenCV) -> OpenCV2AXIvideo -> AXIvideo2Mat -> HLS video library function chain -> Mat2AXIvideo -> AXIvideo2OpenCV -> Image File Write (OpenCV), mapped onto the Zynq Video TRD architecture with numbered callouts 1-5]

OpenCV Reference / In system Test
[Diagram: the same chain, Image File Read (OpenCV) -> OpenCV2AXIvideo -> AXIvideo2Mat -> HLS video library function chain -> Mat2AXIvideo -> AXIvideo2OpenCV -> Image File Write (OpenCV), mapped onto the Zynq Video TRD architecture]

Accelerated OpenCV Application
[Diagram: Live Video Input -> AXIvideo2Mat -> HLS video library function chain -> Mat2AXIvideo -> Live Video Output, mapped onto the Zynq Video TRD architecture]
OpenCV design flow
1) Develop OpenCV application on desktop
2) Run OpenCV application on ARM cores without modification
3) Abstract FPGA portion using I/O functions
4) Replace OpenCV function calls with synthesizable code
5) Run HLS to generate FPGA accelerator
6) Replace call to synthesizable code with call to FPGA accelerator
[Diagram: application made up of OpenCV Blocks A, B, C and D]

Partitioned OpenCV Application
[Diagram: OpenCV Blocks A and D remain in software. OpenCV Blocks B and C are paired with synthesizable HLS Blocks B and C, which are bracketed by opencv2AXIvideo and AXIvideo2HLS on the input side and by HLS2AXIvideo and AXIvideo2opencv on the output side, with a synchronization step]

OpenCV Design Tradeoffs
OpenCV-based image processing is built around memory frame buffers
– Poor access locality -> small caches perform poorly
– Complex architectures for performance -> higher power
– Likely 'good enough' for many applications
  • Low resolution or frame rate
  • Processing of features or regions of interest in a larger image
Streaming architectures give high performance and low power
– Chaining image processing functions reduces external memory accesses
– Video-optimized line buffers and window buffers are simpler than processor caches
– Can be implemented with streaming optimizations in HLS
– Requires conversion of code to be synthesizable

HLS Video Libraries
OpenCV functions are not directly synthesizable with HLS
– Dynamic memory allocation
– Floating point
– Assumes images are modified in external memory
The HLS video library is intended to replace many basic OpenCV functions
– Similar interfaces and algorithms to OpenCV
– Focus on image processing functions implemented in FPGA fabric
– Includes FPGA-specific optimizations
  • Fixed point operations instead of floating point
  • On-chip line buffers and window buffers
– Not necessarily bit-accurate
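The line buffer and window buffer idea mentioned above can be sketched in plain C++. This is a software model under stated assumptions, not the HLS video library's implementation: the function name erode3x3_streaming is hypothetical, the filter is a simple 3x3 minimum, and border pixels are left at zero for brevity. What it is meant to show is that each input pixel is read exactly once, in raster order, while only two previous rows and a 3x3 window are kept on hand.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Streaming 3x3 minimum filter over a grayscale image in raster order.
// Storage is two image rows (the line buffer) plus a 3x3 window, never a
// full frame. Border outputs stay 0 (a simplification of this sketch).
std::vector<std::uint8_t> erode3x3_streaming(const std::vector<std::uint8_t>& in,
                                             int rows, int cols) {
    assert(rows >= 3 && cols >= 3);
    assert(in.size() == static_cast<std::size_t>(rows) * cols);

    std::vector<std::uint8_t> out(in.size(), 0);
    std::vector<std::uint8_t> line0(cols, 0);  // row r-2
    std::vector<std::uint8_t> line1(cols, 0);  // row r-1
    std::uint8_t win[3][3] = {};

    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            // The only access to the input: one read per pixel.
            std::uint8_t pix = in[static_cast<std::size_t>(r) * cols + c];

            // Shift the window one column to the left.
            for (int i = 0; i < 3; ++i) {
                win[i][0] = win[i][1];
                win[i][1] = win[i][2];
            }
            // New right-hand column: two buffered rows plus the new pixel.
            win[0][2] = line0[c];
            win[1][2] = line1[c];
            win[2][2] = pix;

            // Age the line buffers at this column.
            line0[c] = line1[c];
            line1[c] = pix;

            // The window is fully valid once three rows and three columns
            // have been seen; it is then centered on pixel (r-1, c-1).
            if (r >= 2 && c >= 2) {
                std::uint8_t m = 255;
                for (auto& row : win)
                    for (auto v : row) m = std::min(m, v);
                out[static_cast<std::size_t>(r - 1) * cols + (c - 1)] = m;
            }
        }
    }
    return out;
}
```

The working storage is 2 x cols + 9 bytes regardless of image height, which is the property that lets such a filter live in on-chip memory instead of an external frame buffer.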
Xilinx HLS Video Library 2013.2
Video Data Modeling
– Linebuffer class
– Window class
AXI4-Stream IO Functions
– AXIvideo2Mat, Mat2AXIvideo
OpenCV Interface Functions
– cvMat2AXIvideo, AXIvideo2cvMat, IplImage2AXIvideo, AXIvideo2IplImage, CvMat2AXIvideo, AXIvideo2CvMat
– cvMat2hlsMat, IplImage2hlsMat, CvMat2hlsMat, hlsMat2cvMat, hlsMat2IplImage, hlsMat2CvMat
Video Functions
– AbsDiff, AddS, AddWeighted, And, Avg, AvgSdv, Cmp, CmpS, CornerHarris, CvtColor, Dilate
– Duplicate, EqualizeHist, Erode, FASTX, Filter2D, GaussianBlur, Harris, HoughLines2, Integral, InitUndistortRectifyMap, Max
– MaxS, Mean, Merge, Min, MinMaxLoc, MinS, Mul, Not, PaintMask, Range, Reduce
– Remap, Resize, Scale, Set, Sobel, Split, SubRS, SubS, Sum, Threshold, Zero
For function signatures and descriptions, see the HLS user guide UG902

Video Library Functions
C++ code contained in hls namespace
    #include "hls_video.h"
Similar interface, equivalent behavior with OpenCV, e.g.
– OpenCV library: cvScale(src, dst, scale, shift);
– HLS video library: hls::Scale<...>(src, dst, scale, shift);
Some constructor arguments have corresponding or replacement template parameters, e.g.
– OpenCV library: cv::Mat mat(rows, cols, CV_8UC3);
– HLS video library: hls::Mat<ROWS, COLS, HLS_8UC3> mat(rows, cols);
ROWS and COLS specify the maximum size of an image processed

Video Library Core Structures
OpenCV                          |  HLS Video Library
cv::Point_<T>, CvPoint          |  hls::Point_<T>, hls::Point
cv::Size_<T>, CvSize            |  hls::Size_<T>, hls::Size
cv::Rect_<T>, CvRect            |  hls::Rect_<T>, hls::Rect
cv::Scalar_<T>, CvScalar        |  hls::Scalar<N, T>
cv::Mat, IplImage, CvMat        |  hls::Mat<ROWS, COLS, T>
(no equivalent)                 |  hls::Window<ROWS, COLS, T>
(no equivalent)                 |  hls::LineBuffer<ROWS, COLS, T>
Declaration examples
– OpenCV: cv::Mat mat(rows, cols, CV_8UC3);
  HLS: hls::Mat<ROWS, COLS, HLS_8UC3> mat(rows, cols);
– OpenCV: IplImage* img = cvCreateImage(cvSize(cols,rows), IPL_DEPTH_8U, 3);
  HLS: hls::Mat<ROWS, COLS, HLS_8UC3> img(rows, cols); or hls::Mat<ROWS, COLS, HLS_8UC3> img;
Limitations
Must replace OpenCV calls with video library functions
Frame buffer access not supported through pointers
– use VDMA and AXI Stream adapter functions
Random access not supported
– data read more than once must be duplicated
– see hls::Duplicate()
In-place update not supported
– e.g. cvRectangle (img, point1, point2)

                   OpenCV                           HLS Video Library
Read operation     pix = cv_mat.at<T>(i,j)          hls_img >> pix
                   pix = cvGet2D(cv_img,i,j)
Write operation    cv_mat.at<T>(i,j) = pix          hls_img << pix
                   cvSet2D(cv_img,i,j,pix)

OpenCV Code
One image input, one image output
– Processed by chain of functions sequentially
[Diagram: Image Read (OpenCV) -> OpenCV function chain -> Image Write (OpenCV)]

test_opencv.cpp:
    …
    IplImage* src = cvLoadImage("test_1080p.bmp");
    IplImage* dst = cvCreateImage(cvGetSize(src), src->depth, src->nChannels);
    cvSobel(src, dst, 1, 0);
    cvSubS(dst, cvScalar(100,100,100), src);
    cvScale(src, dst, 2, 0);
    cvErode(dst, src);
    cvDilate(src, dst);
    cvSaveImage("result_1080p.bmp", dst);
    cvReleaseImage(&src);
    cvReleaseImage(&dst);
    …

Integrated OpenCV Application
System provides pointer to frame buffers
Synthesizable code can also be run on ARM
[Diagram: Live Video Input -> OpenCV function chain -> Live Video Output]

img_filters.c:
    void img_process(ZNQ_S32 *rgb_data_in, ZNQ_S32 *rgb_data_out,
                     int height, int width, int stride, int flag_OpenCV)
    {
        // constructing OpenCV interface
        IplImage* src_dma = cvCreateImageHeader(cvSize(width, height), IPL_DEPTH_8U, 4);
        IplImage* dst_dma = cvCreateImageHeader(cvSize(width, height), IPL_DEPTH_8U, 4);
        src_dma->imageData = (char*)rgb_data_in;
        dst_dma->imageData = (char*)rgb_data_out;
        src_dma->widthStep = 4 * stride;
        dst_dma->widthStep = 4 * stride;

        if (flag_OpenCV) {
            opencv_image_filter(src_dma, dst_dma);
        } else {
            sw_image_filter(src_dma, dst_dma);
        }

        cvReleaseImageHeader(&src_dma);
        cvReleaseImageHeader(&dst_dma);
    }
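The streaming restrictions listed under Limitations (each pixel is consumed when read, no random access, no in-place update) can be modeled in plain C++ before any HLS code is written. The sketch below is an illustration only: PixelStream, read_pixel, duplicate and invert are hypothetical names, and std::queue merely stands in for a hardware FIFO. It is not the hls::Mat or hls::Duplicate implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <queue>

using Pixel = std::uint8_t;
using PixelStream = std::queue<Pixel>;  // hypothetical stand-in for a pixel FIFO

// Reading removes the pixel from the stream, so it cannot be read again.
Pixel read_pixel(PixelStream& s) {
    assert(!s.empty());
    Pixel p = s.front();
    s.pop();
    return p;
}

// Fork one stream into two so that two consumers can each see every pixel.
// This is the role the slide assigns to hls::Duplicate().
void duplicate(PixelStream& src, PixelStream& dst0, PixelStream& dst1) {
    while (!src.empty()) {
        Pixel p = read_pixel(src);
        dst0.push(p);
        dst1.push(p);
    }
}

// Out-of-place operation: results go to a new stream, never back into src.
void invert(PixelStream& src, PixelStream& dst) {
    while (!src.empty()) {
        dst.push(static_cast<Pixel>(255 - read_pixel(src)));
    }
}
```

After duplicate(src, a, b) the source stream is empty and both a and b hold every pixel once. Any algorithm that needs the same image on two paths therefore has to ask for the copy explicitly, which is the main structural change when porting frame-buffer code.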
Accelerated with Vivado HLS video library
Top level function extracted for HW acceleration
[Diagram: Image Read (OpenCV) -> OpenCV2AXIvideo -> AXIvideo2Mat -> HLS video library function chain -> Mat2AXIvideo -> AXIvideo2OpenCV -> Image Write (OpenCV)]

top.h:
    #include "hls_video.h"   // header file of HLS video library
    #include "hls_opencv.h"  // header file of OpenCV I/O

    // typedef video library core structures
    typedef hls::stream<ap_axiu<32,1,1,1> >  AXI_STREAM;
    typedef hls::Scalar<3, uchar>            RGB_PIXEL;
    typedef hls::Mat<1080,1920,HLS_8UC3>     RGB_IMAGE;

    void image_filter(AXI_STREAM& src_axi, AXI_STREAM& dst_axi, int rows, int cols);

test.cpp:
    #include "top.h"
    …
    IplImage* src = cvLoadImage("test_1080p.bmp");
    IplImage* dst = cvCreateImage(cvGetSize(src), src->depth, src->nChannels);
    AXI_STREAM src_axi, dst_axi;
    IplImage2AXIvideo(src, src_axi);
    image_filter(src_axi, dst_axi, src->height, src->width);
    AXIvideo2IplImage(dst_axi, dst);
    cvSaveImage("result_1080p.bmp", dst);
    cvReleaseImage(&src);
    cvReleaseImage(&dst);
Accelerated with Vivado HLS video library
HW Synthesizable Block for FPGA acceleration
– Consists of video library functions and interfaces
– Replace OpenCV function with similar function in hls namespace

top.cpp:
    void image_filter(AXI_STREAM& input, AXI_STREAM& output, int rows, int cols)
    {
        // Create AXI streaming interfaces for the core
    #pragma HLS RESOURCE variable=input core=AXIS metadata="-bus_bundle INPUT_STREAM"
    #pragma HLS RESOURCE variable=output core=AXIS metadata="-bus_bundle OUTPUT_STREAM"
    #pragma HLS RESOURCE variable=rows core=AXI_SLAVE metadata="-bus_bundle CONTROL_BUS"
    #pragma HLS RESOURCE variable=cols core=AXI_SLAVE metadata="-bus_bundle CONTROL_BUS"
    #pragma HLS RESOURCE variable=return core=AXI_SLAVE metadata="-bus_bundle CONTROL_BUS"
    #pragma HLS INTERFACE ap_stable port=rows
    #pragma HLS INTERFACE ap_stable port=cols

        RGB_IMAGE img_0(rows, cols), img_1(rows, cols), img_2(rows, cols);
        RGB_IMAGE img_3(rows, cols), img_4(rows, cols), img_5(rows, cols);
        RGB_PIXEL pix(50, 50, 50);

    #pragma HLS dataflow
        hls::AXIvideo2Mat(input, img_0);
        hls::Sobel<1,0,3>(img_0, img_1);
        hls::SubS(img_1, pix, img_2);
        hls::Scale(img_2, img_3, 2, 0);
        hls::Erode(img_3, img_4);
        hls::Dilate(img_4, img_5);
        hls::Mat2AXIvideo(img_5, output);
    }
[Diagram: Image Read (OpenCV) -> OpenCV2AXIvideo -> AXIvideo2Mat -> HLS video library function chain -> Mat2AXIvideo -> AXIvideo2OpenCV -> Image Write (OpenCV)]

Using Linux Userspace API
[Diagram: Live Video Input -> AXIvideo2Mat -> HLS video library function chain -> Mat2AXIvideo -> Live Video Output]
Modify device tree to include register map
    FILTER@0x400D0000 {
        compatible = "xlnx,generic-hls";
        reg = <0x400d0000 0xffff>;
        interrupts = <0x0 0x37 0x4>;
        interrupt-parent = <0x1>;
    };
Call from userspace after mmap()
    XImage_filter xsfilter;
    int fd_uio = 0;
    if ((fd_uio = open("/dev/uio0", O_RDWR)) < 0) {
        printf("UIO: Cannot open device node\n");
    }
    xsfilter.Control_bus_BaseAddress = (u32)mmap(NULL, XSOBEL_FILTER_CONTROL_BUS_SIZE,
                                                 PROT_READ|PROT_WRITE, MAP_SHARED, fd_uio, 0);
    xsfilter.IsReady = XIL_COMPONENT_IS_READY;

    // init the configuration for image filter
    XImage_filter_SetRows(&xsfilter, sobel_configuration.height);
    XImage_filter_SetCols(&xsfilter, sobel_configuration.width);
    XImage_filter_EnableAutoRestart(&xsfilter);
    XImage_filter_Start(&xsfilter);

HLS Directives for Video Processing
Assign 'input' to be an AXI4 stream named "INPUT_STREAM"
    #pragma HLS RESOURCE variable=input core=AXIS metadata="-bus_bundle INPUT_STREAM"
Assign control interface to an AXI4-Lite interface
    #pragma HLS RESOURCE variable=return core=AXI_SLAVE metadata="-bus_bundle CONTROL_BUS"
Assign 'rows' to be accessible through the AXI4-Lite interface
    #pragma HLS RESOURCE variable=rows core=AXI_SLAVE metadata="-bus_bundle CONTROL_BUS"
Declare that 'rows' will not be changed during the execution of the function
    #pragma HLS INTERFACE ap_stable port=rows
Enable streaming dataflow optimizations
    #pragma HLS dataflow
A more complex OpenCV example: fast-corners
This code is not 'streaming' and must be rewritten
– Random access and in-place operation on 'dst'

opencv_top.cpp:
    void opencv_image_filter(IplImage* img, IplImage* dst)
    {
        IplImage* gray = cvCreateImage(cvSize(img->width, img->height), 8, 1);
        cvCvtColor(img, gray, CV_BGR2GRAY);
        std::vector<cv::KeyPoint> keypoints;
        cv::Mat gray_mat(gray, 0);
        cv::FAST(gray_mat, keypoints, 20, true);
        int rect = 2;
        cvCopy(img, dst);
        for (int i = 0; i < keypoints.size(); i++) {
            cvRectangle(dst,
                        cvPoint(keypoints[i].pt.x, keypoints[i].pt.y),
                        cvPoint(keypoints[i].pt.x + rect, keypoints[i].pt.y + rect),
                        cvScalar(255,0,0), 1);
        }
        cvReleaseImage(&gray);
    }

A more complex OpenCV example: fast-corners
This code is 'streaming'
– Note that function correspondence is not 1:1!
[Annotations: cv::FAST followed by GenMask corresponds to hls::FASTX; cvCopy followed by PrintMask corresponds to hls::PaintMask]

opencv_top.cpp:
    void opencv_image_filter(IplImage* src, IplImage* dst)
    {
        IplImage* gray  = cvCreateImage(cvGetSize(src), 8, 1);
        IplImage* mask  = cvCreateImage(cvGetSize(src), 8, 1);
        IplImage* dmask = cvCreateImage(cvGetSize(src), 8, 1);
        std::vector<cv::KeyPoint> keypoints;
        cv::Mat gray_mat(gray, 0);
        cvCvtColor(src, gray, CV_BGR2GRAY);
        cv::FAST(gray_mat, keypoints, 20, true);
        GenMask(mask, keypoints);
        cvDilate(mask, dmask);
        cvCopy(src, dst);
        PrintMask(dst, dmask, cvScalar(255,0,0));
        cvReleaseImage(&mask);
        cvReleaseImage(&dmask);
        cvReleaseImage(&gray);
    }
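GenMask and PrintMask are called in the streaming version, but their bodies are not shown on the slide. The following is a minimal sketch of what such helpers might do, written against plain C++ containers instead of IplImage so that it stands alone. All types and names here (KeyPoint, Color, GrayImage, ColorImage, gen_mask, paint_mask) are hypothetical stand-ins, not the code from the presentation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct KeyPoint { int x; int y; };          // hypothetical stand-in for cv::KeyPoint
struct Color { std::uint8_t c0, c1, c2; };  // three channels in storage order

struct GrayImage {
    int rows, cols;
    std::vector<std::uint8_t> data;
    GrayImage(int r, int c)
        : rows(r), cols(c), data(static_cast<std::size_t>(r) * c, 0) {}
    std::uint8_t& at(int r, int c) { return data[static_cast<std::size_t>(r) * cols + c]; }
    std::uint8_t at(int r, int c) const { return data[static_cast<std::size_t>(r) * cols + c]; }
};

struct ColorImage {
    int rows, cols;
    std::vector<Color> data;
    ColorImage(int r, int c)
        : rows(r), cols(c), data(static_cast<std::size_t>(r) * c, Color{0, 0, 0}) {}
    Color& at(int r, int c) { return data[static_cast<std::size_t>(r) * cols + c]; }
    const Color& at(int r, int c) const { return data[static_cast<std::size_t>(r) * cols + c]; }
};

// Turn the keypoint list into a per-pixel mask: 255 at each keypoint, 0 elsewhere.
GrayImage gen_mask(int rows, int cols, const std::vector<KeyPoint>& keypoints) {
    GrayImage mask(rows, cols);
    for (const KeyPoint& kp : keypoints) {
        if (kp.x >= 0 && kp.x < cols && kp.y >= 0 && kp.y < rows) {
            mask.at(kp.y, kp.x) = 255;
        }
    }
    return mask;
}

// One raster-order pass: copy src to dst, overriding pixels under a nonzero mask.
// Both inputs are visited once and in the same order, which suits streaming.
ColorImage paint_mask(const ColorImage& src, const GrayImage& mask, Color color) {
    assert(src.rows == mask.rows && src.cols == mask.cols);
    ColorImage dst(src.rows, src.cols);
    for (int r = 0; r < src.rows; ++r) {
        for (int c = 0; c < src.cols; ++c) {
            if (mask.at(r, c) != 0) {
                dst.at(r, c) = color;
            } else {
                dst.at(r, c) = src.at(r, c);
            }
        }
    }
    return dst;
}
```

The point of the restructuring is that the marker drawing no longer jumps around the output image the way the cvRectangle loop does. The keypoints become an image-shaped mask, dilation grows each marked pixel into a small block, and the final paint step walks source and mask together in raster order. Note that gen_mask itself still writes at arbitrary positions; in this sketch it only models the software reference side.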
A more complex OpenCV example: fast-corners
Synthesizable code
– Note '#pragma HLS stream'

top.cpp:
    hls::Mat<MAX_HEIGHT,MAX_WIDTH,HLS_8UC3> _src(rows,cols);
    hls::Mat<MAX_HEIGHT,MAX_WIDTH,HLS_8UC3> _dst(rows,cols);
    hls::AXIvideo2Mat(input, _src);
    hls::Mat<MAX_HEIGHT,MAX_WIDTH,HLS_8UC3> src0(rows,cols);
    hls::Mat<MAX_HEIGHT,MAX_WIDTH,HLS_8UC3> src1(rows,cols);
    #pragma HLS stream depth=20000 variable=src1.data_stream
    hls::Mat<MAX_HEIGHT,MAX_WIDTH,HLS_8UC1> mask(rows,cols);
    hls::Mat<MAX_HEIGHT,MAX_WIDTH,HLS_8UC1> dmask(rows,cols);
    hls::Scalar<3,unsigned char> color(255,0,0);
    hls::Duplicate(_src,src0,src1);
    hls::Mat<MAX_HEIGHT,MAX_WIDTH,HLS_8UC1> gray(rows,cols);
    hls::CvtColor<HLS_BGR2GRAY>(src0,gray);
    hls::FASTX(gray,mask,20,true);
    hls::Dilate(mask,dmask);
    hls::PaintMask(src1,dmask,_dst,color);
    hls::Mat2AXIvideo(_dst, output);

Streams and Reconvergent paths
hls::Mat conceptually represents a whole image, but is implemented as a stream of pixels

hls_video_core.h:
    template<int ROWS, int COLS, int T>
    class Mat {
    public:
        HLS_SIZE_T rows, cols;
        hls::stream<HLS_TNAME(T)> data_stream[HLS_MAT_CN(T)];
    };

Fast-corners contains a reconvergent path
– The stream of pixels for src1 must include enough buffering to match the delay through FASTX and Dilate (approximately 10 video lines * 1920 pixels = 19200 pixels, hence the depth of 20000)
[Diagram: one path runs CvtColor -> FASTX -> Dilate -> PaintMask; src1 bypasses it and feeds PaintMask directly]
    #pragma HLS stream depth=20000 variable=src1.data_stream

Performance Analysis
AXI Performance Monitor collects statistics on memory bandwidth
– see /mnt/AXI_PerfMon.log
Video + fast corners
– 1920*1080*60*32 = ~4 Gb/s per stream
– HP0: Read 4.01 Gb/s, Write 4.01 Gb/s, Total 8.03 Gb/s
– HP2: Read 4.01 Gb/s, Write 4.01 Gb/s, Total 8.03 Gb/s

Power Analysis
Voltage and current can be read from the digital power regulators on the ZC702 board.
Custom, real-time HD video processing in 2-3 Watts total system power
– FASTX is less than 200 mW incremental power
[Chart: power broken down by rail (DDR, PL IO, PL core, PS IO, PS core) on an axis from 0 to 3000; x-axis labels: Active Idle, Idle + Video, Fast Corners + video]

HLS and Zynq accelerate OpenCV apps
OpenCV functions enable fast prototyping of computer vision algorithms
Computer vision applications are inherently heterogeneous and require a mix of HW and SW implementation
Vivado HLS video library accelerates mapping of OpenCV functions to FPGA programmable fabric
Zynq offers a power-optimized integrated solution with high performance programmable logic and embedded ARM

Additional OpenCV Collateral at Xilinx.com
Download XAPP1167 from Xilinx.com
QuickTake: Leveraging OpenCV and High-Level Synthesis with Vivado
http://www.xilinx.com/hls
http://www.xilinx.com/getlicense