Laboratory Manual
Digital Image Processing
Name: Sana Liaquat
DEPARTMENT OF MECHATRONICS ENGINEERING,
FACULTY OF ENGINEERING & TECHNOLOGY,
UNIVERSITY OF CHAKWAL, CHAKWAL
Engr. Sana Liaquat
Course Instructor
DEPARTMENT OF MECHATRONICS ENGINEERING UNIVERSITY OF
CHAKWAL, CHAKWAL
Subject: Digital Image Processing
Course Code:
List of Experiments
Sr. #   Lab Title
01      Installation of Anaconda, Jupyter and Running First Python Program.
02      Write program to read and display digital image using OpenCV
        • Become familiar with basic commands
        • Read and display image in OpenCV
        • Resize given image
        • Convert given color image into gray-scale image
        • Convert given color/gray-scale image into black & white image
        • Draw image profile
        • Separate color image in three R, G & B planes
        • Create color image using R, G and B three separate planes
        • Flow control and LOOP in OpenCV
        • Write given 2-D data in image file
03      To write and execute image processing programs using point processing method
        • Obtain Negative image
        • Obtain Flip image
        • Thresholding
        • Contrast stretching
04      To write and execute programs for image arithmetic operations
        • Addition of two images
        • Subtract one image from other image
        • Calculate mean value of image
        • Different Brightness by changing mean value
05      To write and execute programs for image logical operations
        • AND operation between two images
        • OR operation between two images
        • Calculate intersection of two images
        • Water Marking using EX-OR operation
        • NOT operation (Negative image)
06      To write a program for color detection, shape detection, contour detection. Explain and
        Code Image Segmentation techniques.
07      To write and execute program for geometric transformation of image
        • Translation
        • Scaling
        • Rotation
        • Shrinking
        • Zooming
08      To understand various image noise models and to write programs for image denoising
        in OpenCV
09      Understand Warping Effect in OpenCV. Explain and Code Digital Image
        Watermarking in OpenCV
10      Write and execute programs for image frequency domain filtering
        • Apply FFT on given image
        • Perform low pass and high pass filtering in frequency domain
        • Apply IFFT to reconstruct image
        • Edge Detection by DFT
11      Write a program in Python for edge detection using different edge detection mask
12      Write program for Feature Detection
        • Chain Code
        • Face Detection
          o Simple Face
          o Group Face
        • Webcam Use
13      Write and execute program for image morphological operations Erosion, dilation,
        opening and closing in Python.
        • Hit-or-Miss Transformation
        • Skeleton
        • Hole Filling
        • Boundary Extraction
        • Convex Hull
14      OpenCV project “Document Scanner”
LABORATORY MANUAL
LAB 01
    DOWNLOAD ANACONDA FOR WINDOWS:
    INSTALLING ANACONDA ON WINDOWS:
    VERIFY THE INSTALLATION OF ANACONDA:
    CREATING A PROJECT IN JUPYTER NOTEBOOK:
    RUNNING YOUR FIRST PROGRAM ON JUPYTER NOTEBOOK:
LAB 02
    OBJECTIVE:
    THEORY:
    OPENCV
    COMPUTER VISION
    I. READ AND DISPLAY IMAGE AND VIDEO:
    II. RESIZE A GIVEN IMAGE:
    III. CONVERT GIVEN COLOR IMAGE INTO GRAY-SCALE IMAGE
    IV. CONVERT GIVEN COLOR/GRAY-SCALE IMAGE INTO BLACK & WHITE IMAGE
    V. CONVERT IMAGE TO 2D ARRAY
LAB 03
    OBJECTIVE:
    TO WRITE AND EXECUTE IMAGE PROCESSING PROGRAMS USING POINT PROCESSING METHOD
    THEORY:
    OPENCV
    POINT OPERATION
    POINT PROCESSING IN SPATIAL DOMAIN
    I. NEGATIVE IMAGE:
    II. FLIP IMAGE:
    III. THRESHOLDING:
    IV. CONTRAST STRETCHING:
LAB 04
    OBJECTIVE:
    THEORY:
    ADDITION OF TWO IMAGES:
    CODE:
LAB 05
    OBJECTIVE:
    THEORY:
    BITWISE OPERATIONS
    I. BITWISE AND OPERATION ON IMAGES:
    II. BITWISE OR OPERATION ON IMAGES:
    III. BITWISE XOR OPERATION ON IMAGES:
    IV. BITWISE NOT OPERATION ON IMAGES:
LAB 06
    OBJECTIVE:
    THEORY:
    WHAT IS IMAGE SEGMENTATION?
    DIFFERENT TYPES OF IMAGE SEGMENTATION TECHNIQUES
    COLOR DETECTION
    SHAPE DETECTION
    CONTOUR DETECTION
LAB 07
    OBJECTIVE:
    THEORY:
    GEOMETRIC TRANSFORMATION
    SCALING
    TRANSLATION
    ROTATION
LAB 08
    OBJECTIVE:
    THEORY:
    NOISE
    SOURCES OF NOISE
    IMAGE NOISE MODELS
    POISSON NOISE
    ANALYSIS OF BEST SUITED FILTERS FOR NOISES
    ADAPTIVE FILTERING
    IMAGE DENOISING IN OPENCV
LAB 09
    OBJECTIVE:
    THEORY:
    WATERMARKING USING EX-OR OPERATION:
    CREATING A WATERMARK USING THE IMAGE GIVEN BELOW:
LAB 10
    OBJECTIVE:
    THEORY:
    FREQUENCY DOMAIN FILTERS:
    DOMAIN FILTER
    GAUSSIAN BLUR METHOD
    MEAN FILTERING TECHNIQUES
    MEDIAN FILTERING TECHNIQUES
    FREQUENCY BAND FILTERING TECHNIQUES
LAB 11
    OBJECTIVE:
    THEORY:
    EDGE DETECTION USING OPENCV:
    SOBEL EDGE DETECTION
    CANNY EDGE DETECTION:
LAB 12
    OBJECTIVE:
    THEORY:
    FACE DETECTION
    PROGRAM I:
    FACE DETECTION (SINGLE + GROUP) USING OPENCV & PYTHON
    PROGRAM II:
    FACE DETECTION (WEBCAM) USING OPENCV & PYTHON
LAB 13
    OBJECTIVE:
    THEORY:
    MORPHOLOGICAL OPERATIONS
LAB 14
    OBJECTIVE:
    THEORY:
    MAKING A DOCUMENT SCANNER:
    USE THE EDGES TO FIND ALL THE CONTOURS
Lab 01
Objective:
Installation of Anaconda and Running First Program.
Theory:
In this lab, we will download the Anaconda software and install it on our computer so we
can use it in our future labs. Here are some steps we perform to download and install the
software.
• Download Anaconda Setup for Windows
• Install Anaconda on Windows
• Verify Installation of Anaconda
• Running Your First Program in Python
Download Anaconda for Windows:
System Settings
Programming Language: Python 3.7.6
IDE: Jupyter Notebook
Platform: Windows 10 Pro 64-bit Operating System
Subject: Digital Image Processing
Step 1:
• Open your Web Browser
• I am using Google Chrome Version 83.0.4103.97
Step 2:
• Type the following URL in the Address Bar of the Web Browser and press the Enter Key
• URL: https://www.anaconda.com/products/individual
Step 3:
• Scroll the Web Page
• In the Your Data Toolkit section, Click on Download
Step 4:
• After you click the Download button, control automatically moves to the Anaconda Installers
section on the same Web Page
• Download the Anaconda Setup that matches your operating system
Step 5:
• Download the latest version of Anaconda released
• In my case it is Latest Anaconda 3 Release - Anaconda 3.7
Step 6:
• Click on the file named 64-bit graphical installer for download
Step 7:
• The file will start downloading
Installing Anaconda on Windows:
Step 1:
• Open the Folder containing Anaconda Setup for Windows
• In this case, it is in Download Folder.
Step 2:
• Double Click on the following File in the Folder
• File Name: Anaconda3-2020.02-Windows-x86_64.exe
Step 3:
• Anaconda Installer Wizard will open
• Click on Next
Step 4:
• The License Agreement will appear
• Click on I Agree
Step 5:
• By default, the Just Me (recommended) radio button is selected; keep the default settings
for installation
• Click on Next.
Step 6:
• In this step, the Install Location of Anaconda is required; the default location of Anaconda
is: C:/ProgramData
• You can change the Install Location of Anaconda by clicking on the Browse button and selecting
the Folder where you want to install Anaconda
• Click on Next
Step 7:
• You have two choices here
❖ Choice 01: If Python is not installed on your Laptop / PC
➢ Select both Options given below:
✓ Add Anaconda to system Path Variable
✓ Register Anaconda as System Path 3.6
❖ Choice 02: If Python is installed on your Laptop / PC
➢ Select the Option given below:
✓ Register Anaconda as System Path 3.6
• In my case, Python is already installed on my laptop
❖ We will select the following option
➢ Register Anaconda as System Path 3.6
• Click on Install
Step 8:
• The Anaconda Setup will start installing
• The installation may take several minutes; wait for it to finish
Step 9:
• After the completion of installation
• Click on Next
Step 10:
• In the next step, the installer offers the option of installing Visual Studio Code
• Click on Skip to skip this Step (see figure below)
Step 11:
• Uncheck both the boxes (see figure below)
• Click on Finish to close the Anaconda Installer Wizard
Verify the Installation of Anaconda:
Step 1:
• Click on the Start button in the Task Bar
Step 2:
• Search for Anaconda
• If you are using Windows 8/8.1 Pro Search button will appear as shown below
Step 3:
• Click on Anaconda Prompt
Step 4:
• Search for Anaconda Prompt
• If you are using Windows 10
• Search button will appear as shown in the Figure below
Step 5:
• Anaconda Prompt window will appear on your Computer Screen
Step 6:
• Checking the Anaconda Version
❖ Step 6.1:
✓ In Anaconda Prompt window, type either of the two commands (note the capital V;
lowercase -v starts Python in verbose mode instead of printing the version)
python -V
OR
python --version
❖ Step 6.2:
✓ Press Enter Key
✓ This command will display the version of Python installed on your Laptop/Personal
Computer
❖ Step 6.3:
✓ In Anaconda Prompt window, type the following Command.
conda info
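The same check can also be made from inside Python itself, for example in a notebook cell. The following is a minimal sketch; the helper name python_version_string is hypothetical and is not part of Anaconda or Jupyter.

```python
import sys


def python_version_string():
    """Return the running interpreter's version as 'major.minor.micro'."""
    info = sys.version_info
    return f"{info.major}.{info.minor}.{info.micro}"


# Should agree with the output of "python --version" in the Anaconda Prompt
print("Python version:", python_version_string())
```

If the printed version differs from the one reported in the Anaconda Prompt, the notebook is probably running a different Python installation.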
Creating a Project in Jupyter Notebook:
Step 1:
• Open the Anaconda Prompt
❖ Step 1.1:
✓ Type the following command in Anaconda Prompt
jupyter notebook
❖ Step 1.2:
✓ Press Enter key
✓ Jupyter Notebook will open in the Web Browser
Step 2:
• The GUI of Jupyter Notebook will display all Files and Folders from your Computer
System
❖ Step 2.1:
✓ Select the Folder amongst the given Folders where you want to save Jupyter
Notebook Projects
✓ In my case, I will be storing my Jupyter Notebook Projects on the Desktop.
✓ Click on Desktop
Step 3:
• It will take you to the Desktop Folder
Step 4:
• Click on New to make a Folder inside Desktop to save Jupyter Notebook Projects
• From drop down Click on Folder
Step 5:
• A new folder named Untitled Folder will be added
❖ Step 5.1:
✓ Check the Check Box
✓ Click on Rename
• A Dialog Box will appear on the Computer Screen
Step 6:
• Type the new name of the Folder
• In our case, Data is the Folder Name
Step 7:
• Click on New
• Then from the drop down, Click on Python 3
Step 8:
• A new File will Open in a new Tab
Step 9:
• Click on Untitled
• A Dialog Box will appear on the Computer Screen
Step 10:
• Rename the project
• I am naming it as MyFirstProgram
Step 11:
• The project will be renamed and saved
Running Your First Program on Jupyter Notebook:
• Finally, our Jupyter Notebook Project is created
• In this Section, I will try to show, how to run your first Python
Program on Jupyter Notebook.
Program Aim:
• The main aim of this program is to display the following message “Hello Python”
• Note: We’re using the Jupyter Notebook File (named MyFirstProgram) which I created in
the previous Section.
Step 1:
• In Jupyter Notebook, type the following Python Statement:
print("Hello Python")
Step 2:
• Click on Run (or press Shift + Enter on the Keyboard) to execute the cell. The print()
statement will be executed and the message will be displayed.
Step 3:
• Click on Plus (+) to add a new row in Jupyter Notebook File
Conclusion:
In this lab we were introduced to the digital image processing course and learnt how to install the
Anaconda software for writing Python code. We downloaded the software, installed it, verified
the installation of Anaconda and created and ran our first Python program in Jupyter Notebook.
Lab 02
Objective:
Write a program to read and display a digital image using OpenCV
Theory:
OpenCV
OpenCV is a cross-platform library using which we can develop real-time computer vision
applications. It mainly focuses on image processing, video capture and analysis including
features like face detection and object detection.
Computer Vision
Computer Vision can be defined as a discipline that explains how to reconstruct, interpret, and
understand a 3D scene from its 2D images, in terms of the properties of the structure present in
the scene. It deals with modeling and replicating human vision using computer software and
hardware. Its fields are:
• Image Processing: It focuses on image manipulation.
• Pattern Recognition: It explains various techniques to classify patterns.
• Photogrammetry: It is concerned with obtaining accurate measurements from images.
I. Read and Display Image and Video:
Import Package
import cv2
print("Package Imported")
Read an Image from Disk and Display
img = cv2.imread('img1.jpg')
# cv2.imread returns None if the file cannot be read
cv2.imshow('Output', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
FIG: READING IMAGE
Read Video from the User and Display
import cv2

# capture is an instance of the VideoCapture class.
# An integer argument (0, 1, 2, 3) selects a camera; normally 0 represents the webcam.
# To display a video file instead, give the path of the video.
capture = cv2.VideoCapture(0)

# Inside the while loop we grab the video frame by frame
# and display each frame with the imshow method.
while True:
    # read() returns a boolean that says whether the frame was read
    # successfully, together with the frame itself
    isTrue, frame = capture.read()
    if not isTrue:
        # No more frames, or the source could not be opened
        break
    cv2.imshow("Video", frame)
    # Press 'd' to stop the video
    if cv2.waitKey(20) & 0xFF == ord('d'):
        break

capture.release()
cv2.destroyAllWindows()
# Without the isTrue check, an assertion (-215) error appears when the video
# runs out of frames or when a wrong file path is specified.
# The same is the case for images.
II. Resize a Given Image:
Resize to specific width and height.
Resizing an image means changing the dimensions of it, be it width alone, height alone or
changing both of them. Also, the aspect ratio of the original image could be preserved in the
resized image.
• img.shape returns a tuple of the number of rows, columns, and channels (it is an attribute,
not a function).
• cv2.resize(src, dsize[, dst[, fx[, fy[, interpolation]]]]) is the function
for resizing.
Where:
• src is the [required] source input image
• dsize is the [required] desired size of the output image, given as (width, height); it may be
None when fx and fy are given instead
• fx is the [optional] scale factor along the horizontal axis
• fy is the [optional] scale factor along the vertical axis
• interpolation is an [optional] flag that takes one of the following methods:
  INTER_NEAREST: a nearest-neighbor interpolation
  INTER_LINEAR: a bilinear interpolation (used by default)
  INTER_AREA: resampling using pixel area relation. It may be a preferred method
  for image decimation, as it gives moiré-free results. But when the image is zoomed, it
  is similar to the INTER_NEAREST method.
  INTER_CUBIC: a bicubic interpolation over a 4×4-pixel neighborhood
import cv2
img = cv2.imread('img1.jpg')
print('Original Dimensions', img.shape)
width = 300
height = 200
dim = (width, height)
# resize image
resized = cv2.resize(img, dim, interpolation=cv2.INTER_AREA)
print('Resized Dimensions: ', resized.shape)
cv2.imshow("Resized image", resized)
cv2.waitKey(0)
cv2.destroyAllWindows()
OUTPUT:
FIG: IMAGE RESIZED
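The program above resizes to a fixed 300 x 200 size, which distorts the picture unless the original happens to have the same aspect ratio. A minimal sketch of how the target size could be computed so that the aspect ratio is preserved is given here; the helper name scaled_size and the example dimensions are assumptions made for illustration only.

```python
def scaled_size(width, height, target_width):
    """Return (new_width, new_height) that keeps the original aspect ratio."""
    scale = target_width / width
    new_height = max(1, round(height * scale))
    return target_width, new_height


# Example: an image 1200 pixels wide and 800 pixels high, scaled to 300 pixels wide.
# Note that img.shape is (rows, columns, channels), i.e. (height, width, channels).
dim = scaled_size(1200, 800, 300)
print(dim)

# The tuple is in (width, height) order, which is the order cv2.resize expects:
# resized = cv2.resize(img, dim, interpolation=cv2.INTER_AREA)
```

An alternative is to pass dsize=None together with equal fx and fy scale factors and let cv2.resize compute the output size.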
III. Convert given color image into gray-scale image
Convert to another color space.
• cv2.cvtColor() method is used to convert an image from one color space to another.
There are more than 150 color-space conversion methods available in OpenCV.
import cv2
image = cv2.imread('img1.jpg')
cv2.imshow('Original', image)
grayscale = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
cv2.imshow('Grayscale', grayscale)
cv2.waitKey(0)
cv2.destroyAllWindows()
FIG: GRAYSCALE IMAGE
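For the BGR to gray conversion, each gray value is a weighted sum of the three color planes, Y = 0.299 R + 0.587 G + 0.114 B. The sketch below reproduces that formula with NumPy on a few synthetic pixels so that it can be run without an image file. The function name bgr_to_gray is hypothetical, and OpenCV's own result may differ by one gray level because of its internal rounding.

```python
import numpy as np


def bgr_to_gray(image_bgr):
    """Weighted-sum grayscale conversion of an 8-bit image stored in B, G, R order."""
    b = image_bgr[..., 0].astype(np.float64)
    g = image_bgr[..., 1].astype(np.float64)
    r = image_bgr[..., 2].astype(np.float64)
    gray = 0.299 * r + 0.587 * g + 0.114 * b
    return np.clip(np.round(gray), 0, 255).astype(np.uint8)


# One row of four pixels: pure blue, pure green, pure red and white (B, G, R order)
pixels = np.array([[[255, 0, 0], [0, 255, 0], [0, 0, 255], [255, 255, 255]]],
                  dtype=np.uint8)
print(bgr_to_gray(pixels))
```

Green contributes the most and blue the least, which is why a pure green pixel appears much brighter in the gray-scale image than a pure blue one.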
IV. Convert given color/gray-scale image into black & white image
Convert to another color space.
• cv2.cvtColor() method is used to convert an image from one color space to another.
There are more than 150 color-space conversion methods available in OpenCV.
• The function cv2.threshold is used to apply the thresholding. The first argument is the
source image, which should be a grayscale image. The second argument is the threshold
value which is used to classify the pixel values. The third argument is the maximum
value which is assigned to pixel values exceeding the threshold. OpenCV provides
different types of thresholding, selected by the fourth parameter of the function.
import cv2
originalImage = cv2.imread("img1.jpg")
grayImage = cv2.cvtColor(originalImage, cv2.COLOR_BGR2GRAY)
(thresh, blackAndWhiteImage) = cv2.threshold(
grayImage, 127, 255, cv2.THRESH_BINARY)
cv2.imshow('Black white image', blackAndWhiteImage)
cv2.imshow('Original image', originalImage)
cv2.imshow('Gray image', grayImage)
cv2.waitKey(0)
cv2.destroyAllWindows()
FIG: BLACK & WHITE IMAGE
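V. Convert Image to 2D array
from numpy import asarray
from PIL import Image
# load the image and convert it into a numpy array
# (a color image gives a 3D array of rows x columns x channels;
# use Image.open('img1.jpg').convert('L') for a 2D grayscale array)
img = Image.open('img1.jpg')
numpydata = asarray(img)
# data
print(numpydata)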
FIG: IMAGE TO 2D ARRAY
Lab 03
Objective:
To Write and Execute Image Processing Programs using Point Processing Method
Theory:
OpenCV
OpenCV is a huge open-source library for computer vision, machine learning, and image
processing. It plays a major role in real-time operation, which is very important in
today’s systems. By using it, one can process images and videos to identify objects, faces, or
even the handwriting of a human.
Point Operation
Point operations are often used to change the grayscale range and distribution. The concept of
point operation is to map every pixel onto a new image with a predefined transformation
function.
g(x, y) = T(f(x, y))
• g(x, y) is the output image
• T is an operator of intensity transformation
• f(x, y) is the input image
Recall that a digital image is represented as a 2D ordered matrix of pixel values.
Operations that are used to modify a pixel value without affecting the neighboring pixels are
known as Point Operations. Point operations will:
• Not change the size of the image
• Not change the geometry of the image
• Not change the local structure of the image
• Not affect the neighbor pixels
Point processing in spatial domain
All the processing is done directly on the pixel values. Point processing operations take the form:
s = T (r)
Here, T is referred to as a grey level transformation function or a point processing operation, s
refers to the processed image pixel value and r refers to the original image pixel value.
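Because s depends only on r, any point operation on an 8-bit image can be applied through a 256-entry lookup table. The sketch below illustrates this with NumPy on a small synthetic array; the helper name apply_point_operation is an assumption made for illustration and is not an OpenCV function.

```python
import numpy as np


def apply_point_operation(image, transform):
    """Apply s = T(r) to every pixel of an 8-bit image through a lookup table."""
    levels = np.arange(256)                       # every possible input value r
    lut = np.clip(transform(levels), 0, 255).astype(np.uint8)
    return lut[image]                             # index the table with each pixel


image = np.array([[0, 64], [128, 255]], dtype=np.uint8)

# T(r) = 255 - r gives the negative of the image
negative = apply_point_operation(image, lambda r: 255 - r)
print(negative)
```

The output has the same size as the input and each output pixel depends only on the input pixel at the same position, which is exactly the property that defines a point operation.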
i. Negative Image:
A negative image is one in which the intensity of every pixel is inverted: the darkest areas of
the original appear brightest and the brightest areas appear darkest.
In a color negative, every color is replaced by its complement: red becomes cyan, green becomes
magenta, blue becomes yellow, and vice versa.
s = (L-1) - r,
where L = number of grey levels (L = 256 for an 8-bit image, so s = 255 - r).
Obtain Negative Image:
import cv2
image = cv2.imread('img1.jpg')
# An 8-bit image has L = 256 grey levels, so L - 1 = 255
negative = 255 - image
cv2.imshow('original', image)
cv2.imshow('negative', negative)
cv2.waitKey(0)
cv2.destroyAllWindows()
Original Image:
FIG: ORIGINAL IMAGE
Negative Image:
FIG: NEGATIVE IMAGE
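ii. Flip Image:
A flipped image or reversed image, the more formal term, is a static or moving image that
is generated by a mirror-reversal of an original across a horizontal axis. A flopped image
is mirrored across the vertical axis.
Syntax:
cv2.flip(src, flipCode)
where flipCode = 0 flips around the x-axis (top to bottom), a positive value flips around the
y-axis (left to right), and a negative value flips around both axes.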
Obtain Flip Image:
import cv2
image = cv2.imread('img1.jpg')
flippedimage = cv2.flip(image, -1)
cv2.imshow('Flipped Image', flippedimage)
cv2.waitKey(0)
cv2.destroyAllWindows()
Flipped Image:
FIG: FLIPPED IMAGE
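The effect of the three kinds of flip code can be checked on a tiny array. The following sketch mimics the behaviour of cv2.flip with NumPy slicing so that it runs without an image file; the function name flip is hypothetical and stands in for the OpenCV call.

```python
import numpy as np


def flip(image, flip_code):
    """NumPy stand-in for cv2.flip(image, flip_code)."""
    if flip_code == 0:
        return image[::-1, ...]        # around the x-axis: rows reversed
    if flip_code > 0:
        return image[:, ::-1, ...]     # around the y-axis: columns reversed
    return image[::-1, ::-1, ...]      # negative code: both axes reversed


image = np.array([[1, 2],
                  [3, 4]], dtype=np.uint8)

print(flip(image, 0))    # top and bottom rows swapped
print(flip(image, 1))    # left and right columns swapped
print(flip(image, -1))   # both, equivalent to a 180 degree rotation
```

The program above uses a flip code of -1, so its output is the original image mirrored both vertically and horizontally.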
iii. Thresholding:
Thresholding is a technique in OpenCV, which is the assignment of pixel values in relation to
the threshold value provided. In thresholding, each pixel value is compared with the
threshold value.
If the pixel value is smaller than or equal to the threshold, it is set to 0; otherwise, it is set
to a maximum value (generally 255).
Thresholding is a very popular segmentation technique, used for separating an object
considered as a foreground from its background. In Computer Vision, this technique of
thresholding is done on grayscale images. So initially, the image must be converted in
grayscale color space.
Syntax:
cv2.threshold(source, thresholdValue, maxVal, thresholdingTechnique)
# Python program to illustrate
# simple thresholding type on an image
# organizing imports
import cv2
image1 = cv2.imread('img1.jpg')
img = cv2.cvtColor(image1, cv2.COLOR_BGR2GRAY)
# Different Types of Threshold images
ret, thresh1 = cv2.threshold(img, 120, 255, cv2.THRESH_BINARY)
ret, thresh2 = cv2.threshold(img, 120, 255, cv2.THRESH_BINARY_INV)
ret, thresh3 = cv2.threshold(img, 120, 255, cv2.THRESH_TRUNC)
ret, thresh4 = cv2.threshold(img, 120, 255, cv2.THRESH_TOZERO)
ret, thresh5 = cv2.threshold(img, 120, 255, cv2.THRESH_TOZERO_INV)
# Printing Threshold In Different Windows
cv2.imshow('Binary Threshold', thresh1)
cv2.imshow('Binary Threshold Inverted', thresh2)
cv2.imshow('Truncated Threshold', thresh3)
cv2.imshow('Threshold type 4', thresh4)
cv2.imshow('Threshold type 4 Inverted', thresh5)
# Wait for any key press, then close all windows
cv2.waitKey(0)
cv2.destroyAllWindows()
Output:
FIG: 'BINARY THRESHOLD'
FIG: 'BINARY THRESHOLD INVERTED'
FIG: 'TRUNCATED THRESHOLD'
FIG: 'THRESHOLD TYPE 4'
FIG: 'THRESHOLD TYPE 4 INVERTED'
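The five thresholding types used above differ only in what is written for pixels above and below the threshold. The sketch below restates those rules with NumPy on one row of sample values. The function name threshold and the string labels are assumptions made for illustration; in OpenCV the type is selected with the cv2.THRESH_* flags.

```python
import numpy as np


def threshold(gray, thresh, max_val, kind):
    """Apply one of the five simple thresholding rules to an 8-bit array."""
    above = gray > thresh
    if kind == "binary":            # like cv2.THRESH_BINARY
        out = np.where(above, max_val, 0)
    elif kind == "binary_inv":      # like cv2.THRESH_BINARY_INV
        out = np.where(above, 0, max_val)
    elif kind == "trunc":           # like cv2.THRESH_TRUNC
        out = np.where(above, thresh, gray)
    elif kind == "tozero":          # like cv2.THRESH_TOZERO
        out = np.where(above, gray, 0)
    elif kind == "tozero_inv":      # like cv2.THRESH_TOZERO_INV
        out = np.where(above, 0, gray)
    else:
        raise ValueError("unknown thresholding type: " + kind)
    return out.astype(np.uint8)


row = np.array([[50, 120, 121, 200]], dtype=np.uint8)
for kind in ("binary", "binary_inv", "trunc", "tozero", "tozero_inv"):
    print(kind, threshold(row, 120, 255, kind))
```

Note that a pixel exactly equal to the threshold (120 here) is treated as "not above" in every type.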
iv. Contrast Stretching:
Contrast stretching (often called normalization) is a simple image enhancement technique
that attempts to improve the contrast in an image by `stretching’ the range of intensity values
it contains to span a desired range of values, e.g., the full range of pixel values that the image
type concerned allows. By changing the location of points (r1, s1) and (r2, s2), we can
control the shape of the transformation function. For example:
• When r1 = s1 and r2 = s2, the transformation becomes a linear (identity) function.
• When r1 = r2, s1 = 0 and s2 = L-1, the transformation becomes a thresholding function.
• When (r1, s1) = (rmin, 0) and (r2, s2) = (rmax, L-1), this is known as Min-Max
Stretching.
• When (r1, s1) = (rmin + c, 0) and (r2, s2) = (rmax - c, L-1), this is known as Percentile
Stretching.
In Min-Max Stretching, the lowest intensity of the input image is mapped to 0 and the highest
intensity is mapped to 255.
When the histogram has long tails, that is, when a few outlier pixels lie far from the bulk of
the intensities, Min-Max Stretching gives little or no improvement in image quality because rmin
and rmax are set by those outliers. So, it is better to clip a certain percentage, such as 1% or
2%, of the data from the tail ends of the input image histogram. This is known as Percentile
Stretching.
import numpy as np
import cv2

def Contrast_stretch(p, r1, s1, r2, s2):
    # Piecewise-linear mapping through the points (r1, s1) and (r2, s2)
    if 0 <= p <= r1:
        equation = (s1 / r1) * p
    elif r1 < p <= r2:
        equation = ((s2 - s1) / (r2 - r1)) * (p - r1) + s1
    else:
        equation = ((255 - s2) / (255 - r2)) * (p - r2) + s2
    return equation

image = cv2.imread('Cake.JFIF')
# The control points must satisfy 0 < r1 < r2 < 255 and s1 <= s2
r1 = 70
s1 = 0
r2 = 140
s2 = 255
pixelVal_vec = np.vectorize(Contrast_stretch)
contrast = pixelVal_vec(image, r1, s1, r2, s2)
# The result is floating point; convert back to 8-bit for display
contrast = np.clip(contrast, 0, 255).astype(np.uint8)
cv2.imshow('Contrast Stretching Image', contrast)
cv2.waitKey(0)
cv2.destroyAllWindows()
Contrasted Image:
FIG: CONTRAST STRETCHED IMAGE
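Min-Max and Percentile Stretching, described in the theory above, can also be written directly with NumPy. The following is a minimal sketch on a synthetic low-contrast array; the function names min_max_stretch and percentile_stretch and the default of 2 percent are assumptions made for illustration.

```python
import numpy as np


def min_max_stretch(gray):
    """Map the lowest intensity to 0 and the highest to 255."""
    r_min, r_max = int(gray.min()), int(gray.max())
    if r_max == r_min:
        return gray.copy()            # flat image: nothing to stretch
    stretched = (gray.astype(np.float64) - r_min) * 255.0 / (r_max - r_min)
    return np.round(stretched).astype(np.uint8)


def percentile_stretch(gray, percent=2.0):
    """Clip the given percentage from both histogram tails, then stretch."""
    lo, hi = np.percentile(gray, [percent, 100.0 - percent])
    if hi == lo:
        return gray.copy()
    clipped = np.clip(gray.astype(np.float64), lo, hi)
    return np.round((clipped - lo) * 255.0 / (hi - lo)).astype(np.uint8)


low_contrast = np.array([[100, 110], [120, 150]], dtype=np.uint8)
print(min_max_stretch(low_contrast))
print(percentile_stretch(low_contrast))
```

On a real photograph the two results differ most when a handful of very dark or very bright pixels would otherwise fix rmin and rmax.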
Lab 04
Objective:
To write and execute programs for image arithmetic operations
• Addition of two images
• Subtract one image from another image
• Calculate mean value of image
• Different Brightness by changing mean value
Theory:
Addition of two images:
We can add two images by using the function cv2.add(). This directly adds up the pixel values of
the two images, and any sum above 255 is saturated (clipped) to 255.
Warning: Because of this saturation, plain addition tends to wash out the bright regions. So, we
use cv2.addWeighted() to blend the two images instead.
Remember, both images should be of equal size and depth.
Terminologies:
Syntax:
cv2.addWeighted(img1, wt1, img2, wt2, gammaValue)
The output is computed as img1*wt1 + img2*wt2 + gammaValue.
Parameters:
• img1: First Input Image array
• wt1: Weight of the first input image elements to be applied to the final image
• img2: Second Input Image array (same size and number of channels as img1)
• wt2: Weight of the second input image elements to be applied to the final image
• gammaValue: Scalar value added to each weighted sum
Code:
import cv2
# path to input images are specified and images are loaded with imread command
image1 = cv2.imread('input1.jpg')
image2 = cv2.imread('input2.jpg')
# cv2.addWeighted is applied over the image inputs with applied parameters
weightedSum = cv2.addWeighted(image1, 0.5, image2, 0.4, 0)
# the window showing output image with the weighted sum
cv2.imshow('Weighted Image', weightedSum)
# Wait for any key press, then de-allocate any associated memory usage
cv2.waitKey(0)
cv2.destroyAllWindows()
Input Images
FIG: INPUT IMAGE 1
FIG: INPUT IMAGE 2
Weighted Image:
FIG: WEIGHTED IMAGE
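The remaining objectives of this lab (subtraction, mean value and brightness change through the mean) can be sketched with NumPy on small synthetic arrays. The helper names below are hypothetical. The saturated versions are written to behave like cv2.add and cv2.subtract on 8-bit images, whereas the plain NumPy + operator wraps around at 256.

```python
import numpy as np


def saturated_add(a, b):
    """Add two 8-bit images, clipping sums above 255."""
    total = a.astype(np.int16) + b.astype(np.int16)
    return np.clip(total, 0, 255).astype(np.uint8)


def saturated_subtract(a, b):
    """Subtract b from a, clipping negative results to 0."""
    diff = a.astype(np.int16) - b.astype(np.int16)
    return np.clip(diff, 0, 255).astype(np.uint8)


def shift_mean(image, target_mean):
    """Change the brightness by moving the mean value to target_mean."""
    offset = target_mean - image.mean()
    shifted = np.round(image.astype(np.float64) + offset)
    return np.clip(shifted, 0, 255).astype(np.uint8)


a = np.array([[200, 100], [60, 0]], dtype=np.uint8)
b = np.array([[100, 100], [100, 100]], dtype=np.uint8)

print(saturated_add(a, b))        # 200 + 100 is clipped to 255
print(a + b)                      # plain uint8 addition wraps: 300 becomes 44
print(saturated_subtract(a, b))   # negative differences become 0
print(a.mean())                   # mean value of the image
print(shift_mean(a, 110))         # brighter image whose mean is 110
```

If clipping occurs in shift_mean, the resulting mean will fall short of the requested value, so the target should be chosen with the image's intensity range in mind.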
Lab 05
Objective:
To write and execute programs for image logical operations
• AND operation between two images
• OR operation between two images
• Calculate intersection of two images
• Water Marking using EX-OR operation
• NOT operation (Negative image)
Theory:
Bitwise operations
Bitwise operations are used in image manipulation, in particular for extracting essential parts
of an image. In this lab, the Bitwise operations used are:
1) AND
2) OR
3) XOR
4) NOT
Bitwise operations also help in image masking, and new images can be created with the help of
these operations. These operations can be helpful in enhancing the properties of the input images.
NOTE: The Bitwise operations should be applied on input images of the same dimensions.
Input Images:
FIG: INPUT IMAGES
I. Bitwise AND operation on Images:
Bit-wise conjunction of input array elements.
Syntax:
cv2.bitwise_and(source1, source2, destination, mask)
Parameters:
• Source1: First Input Image array(Single-channel, 8-bit or floating-point)
• Source2: Second Input Image array(Single-channel, 8-bit or floating-point)
• Dest: Output array (Similar to the dimensions and type of Input image array)
• Mask: Operation mask, Input / output 8-bit single-channel mask
import cv2
# path to input images are specified and images are loaded with imread command
img1 = cv2.imread('i1.png')
img2 = cv2.imread('i2.png')
# cv2.bitwise_and is applied over the image inputs with applied parameters
dest_and = cv2.bitwise_and(img2, img1, mask=None)
# the window showing output image with the Bitwise AND operation on the input images
cv2.imshow('Bitwise And', dest_and)
# Wait for any key press, then de-allocate any associated memory usage
cv2.waitKey(0)
cv2.destroyAllWindows()
FIG: BITWISE AND OPERATION ON IMAGE
II. Bitwise OR operation on Images:
Syntax:
cv2.bitwise_or(source1, source2, destination, mask)
Parameters:
• Source1: First Input Image array(Single-channel, 8-bit or floating-point)
• Source2: Second Input Image array(Single-channel, 8-bit or floating-point)
• Dest: Output array (Similar to the dimensions and type of Input image array)
• Mask: Operation mask, Input / output 8-bit single-channel mask.
import cv2
# path to input images are specified and images are loaded with imread command
img1 = cv2.imread('i1.png')
img2 = cv2.imread('i2.png')
# cv2.bitwise_or is applied over the image inputs with applied parameters
dest_or = cv2.bitwise_or(img2, img1, mask=None)
# the window showing output image with the Bitwise OR operation on the input images
cv2.imshow('Bitwise OR', dest_or)
# Wait for any key press, then de-allocate any associated memory usage
cv2.waitKey(0)
cv2.destroyAllWindows()
FIG: BITWISE OR OPERATION ON IMAGE
III. Bitwise XOR operation on Images:
Syntax:
cv2.bitwise_xor(source1, source2, destination, mask)
Parameters:
• Source1: First Input Image array(Single-channel, 8-bit or floating-point)
• Source2: Second Input Image array(Single-channel, 8-bit or floating-point)
• Dest: Output array (Similar to the dimensions and type of Input image array)
• Mask: Operation mask, Input / output 8-bit single-channel mask.
FIG: BITWISE XOR OPERATION ON IMAGE
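XOR has the useful property that applying the same second image twice restores the original, which is the basis of the EX-OR watermarking named in the objective. The sketch below demonstrates this with NumPy on synthetic arrays, which are assumptions standing in for the i1.png and i2.png inputs; with OpenCV the same operation would be cv2.bitwise_xor(img1, img2).

```python
import numpy as np

# Synthetic 4 x 4 "image" with random 8-bit values (fixed seed for repeatability)
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(4, 4), dtype=np.uint8)

# Watermark: white (255) square in the centre, black (0) elsewhere
watermark = np.zeros((4, 4), dtype=np.uint8)
watermark[1:3, 1:3] = 255

# Embed the watermark, then remove it again with a second XOR
marked = np.bitwise_xor(image, watermark)
recovered = np.bitwise_xor(marked, watermark)

print(np.array_equal(recovered, image))   # the original image is restored
```

Where the watermark is 0 the image is left unchanged, and where it is 255 the pixel is inverted, so the watermark shows up as a negative patch inside the marked image.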
IV. Bitwise NOT operation on Images:
Syntax:
cv2.bitwise_not(source, destination, mask)
Parameters:
• Source: Input Image array(Single-channel, 8-bit or floating-point)
• Destination: Output array (Similar to the dimensions and type of Input image array)
• Mask: Operation mask, Input / output 8-bit single-channel mask
import cv2
# path to the input image is specified and the image is loaded with imread command
img1 = cv2.imread('i1.png')
# cv2.bitwise_not takes a single input image and inverts every bit of it
# (passing a second image would make it the destination and overwrite it)
dest_not = cv2.bitwise_not(img1)
# the window showing output image with the Bitwise NOT operation on the input image
cv2.imshow('Bitwise NOT', dest_not)
# Wait for any key press, then de-allocate any associated memory usage
cv2.waitKey(0)
cv2.destroyAllWindows()
FIG: BITWISE NOT OPERATION ON IMAGE
Lab 06
Objective:
To write a program for color detection, shape detection, contour detection. Explain and Code
Image Segmentation techniques
Theory:
What is Image Segmentation?
Image segmentation is a branch of digital image processing which focuses on partitioning an
image into different parts according to their features and properties. The primary goal of image
segmentation is to simplify the image for easier analysis. In image segmentation, you divide an
image into various parts that have similar attributes. The parts in which you divide the image are
called Image Objects.
Different Types of Image Segmentation Techniques
Following are the primary types of image segmentation techniques:
1. Thresholding Segmentation
2. Edge-Based Segmentation
3. Region-Based Segmentation
4. Watershed Segmentation
5. Clustering-Based Segmentation Algorithms
6. Neural Networks for Segmentation
Thresholding Segmentation
The simplest method for segmentation in image processing is the threshold method. It divides the
pixels in an image by comparing the pixel’s intensity with a specified value (threshold). It is
useful when the required object has a higher intensity than the background (unnecessary parts).
You can consider the threshold value (T) to be a constant but it would only work if the image has
very little noise (unnecessary information and data). You can keep the threshold value constant
or dynamic according to your requirements.
The thresholding method converts a grey-scale image into a binary image by dividing it into two
segments (required and not required sections).
According to the different threshold values, we can classify thresholding segmentation in the
following categories:
i. Simple Thresholding
In this method, you replace the image’s pixels with either white or black. Now, if the
intensity of a pixel at a particular position is less than the threshold value, you’d replace it
with black. On the other hand, if it’s higher than the threshold, you’d replace it with
white. This is simple thresholding and is particularly suitable for beginners in image
segmentation.
ii. Otsu’s Binarization
In simple thresholding, you picked a constant threshold value and used it to perform
image segmentation. However, how do you determine that the value you chose was the
right one? While the straightforward method for this is to test different values and choose
one, it is not the most efficient one.
Take an image with a histogram having two peaks, one for the foreground and one for the
background. By using Otsu binarization, you can take the approximate value of the
middle of those peaks as your threshold value.
In Otsu binarization, you calculate the threshold value from the image’s histogram if the
image is bimodal.
This process is quite popular for scanning documents, recognizing patterns, and removing
unnecessary colors from a file. However, it has limitations. It does not work well for
images that are not bimodal (images whose histograms do not have two distinct peaks).
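A common way to compute Otsu's threshold is to try every possible value and keep the one that maximizes the between-class variance of the two resulting groups of pixels. The following NumPy sketch follows that formulation on a synthetic bimodal image; the function name otsu_threshold is hypothetical. In OpenCV the same result is normally requested by adding the cv2.THRESH_OTSU flag to cv2.threshold.

```python
import numpy as np


def otsu_threshold(gray):
    """Return the threshold that maximizes the between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()                 # normalized histogram
    levels = np.arange(256)
    omega = np.cumsum(prob)                  # probability of the class <= t
    mu = np.cumsum(prob * levels)            # cumulative mean up to t
    mu_total = mu[-1]                        # mean of the whole image
    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / denom
    sigma_b[denom == 0] = 0.0                # thresholds that leave a class empty
    return int(np.argmax(sigma_b))


# Synthetic bimodal image: dark background (50) and bright object (200)
image = np.full((4, 4), 50, dtype=np.uint8)
image[1:3, 1:3] = 200

t = otsu_threshold(image)
binary = np.where(image > t, 255, 0).astype(np.uint8)
print(t)
print(binary)
```

For this two-level image every threshold between the two peaks separates the classes equally well, so the function returns the first such value.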
iii. Adaptive Thresholding
Having one constant threshold value might not be a suitable approach to take with every
image. Different images have different backgrounds and conditions which affect their
properties.
Thus, instead of using one constant threshold value for performing segmentation on the
entire image, you can keep the threshold value variable. In this technique, you’ll keep
different threshold values for different sections of an image.
This method works well with images that have varying lighting conditions. You’ll need
to use an algorithm that segments the image into smaller sections and calculates the
threshold value for each of them.
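The idea of one threshold per section can be sketched in a few lines of NumPy. This is a simplified, tile-based illustration of the description above, with the hypothetical name blockwise_mean_threshold. OpenCV's cv2.adaptiveThreshold instead computes a threshold from a neighborhood around every individual pixel.

```python
import numpy as np


def blockwise_mean_threshold(gray, block=2, offset=0):
    """Threshold each block x block tile against its own mean value."""
    out = np.zeros_like(gray)
    rows, cols = gray.shape
    for r in range(0, rows, block):
        for c in range(0, cols, block):
            tile = gray[r:r + block, c:c + block]
            local_t = tile.mean() - offset        # threshold for this tile only
            out[r:r + block, c:c + block] = np.where(tile > local_t, 255, 0)
    return out


# Left half is poorly lit, right half is well lit; each half has a darker
# and a brighter column.
gray = np.array([[10, 40, 110, 140],
                 [10, 40, 110, 140]], dtype=np.uint8)

print(blockwise_mean_threshold(gray, block=2))
print(np.where(gray > 75, 255, 0))    # one global threshold for comparison
```

With the global threshold the whole poorly lit half turns black, while the tile-based version still separates the darker and brighter columns in both halves.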
Edge-Based Segmentation
Edge-based segmentation is one of the most popular implementations of segmentation in image
processing. It focuses on identifying the edges of different objects in an image. This is a crucial
step as it helps you find the features of the various objects present in the image as edges contain
a lot of information you can use.
Edge detection is widely used because it discards information that is not needed for locating
objects. It considerably reduces the amount of data to process, which makes the image easier to
analyse.
Algorithms used in edge-based segmentation identify edges in an image according to the
differences in texture, contrast, grey level, colour, saturation, and other properties. You can
improve the quality of your results by connecting all the edges into edge chains that match the
image borders more accurately.
There are many edge-based segmentation methods available. We can divide them into two
categories:
i.
Search-Based Edge Detection
Search-based edge detection methods focus on computing a measure of edge strength and
look for local directional maxima of the gradient magnitude through a computed estimate
of the edge’s local orientation.
ii.
Zero-Crossing Based Edge Detection
Zero-crossing based edge detection methods look for zero crossings in a derivative
expression retrieved from the image to find the edges.
Typically, you’ll have to pre-process the image to remove unwanted noise and make it
easier to detect edges. Canny, Prewitt, Deriche, and Roberts cross are some of the most
popular edge detection operators. They make it easier to detect discontinuities and find
the edges.
In edge-based detection, your goal is to obtain at least a partial segmentation in which
all the local edges are grouped into a binary image. In the newly created binary image, the
edge chains must match the existing components of the image in question.
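To make the search-based idea concrete, the sketch below computes a gradient magnitude and keeps the pixels where it is large, which yields a binary edge image. The Sobel kernels, the threshold of 100 and the name sobel_edges are assumptions chosen for illustration, and the sketch leaves out the directional maxima search that full detectors perform.

```python
import numpy as np

def sobel_edges(gray, thresh=100.0):
    """Return a binary edge map (0 or 255) from the Sobel gradient magnitude."""
    g = np.asarray(gray, dtype=np.float64)
    p = np.pad(g, 1, mode="edge")
    # Horizontal and vertical Sobel responses built from shifted views
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) \
        - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2])
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) \
        - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:])
    magnitude = np.hypot(gx, gy)       # measure of edge strength
    return (magnitude > thresh).astype(np.uint8) * 255

# Synthetic image: dark left half, bright right half (hypothetical data)
step = np.zeros((4, 6), dtype=np.uint8)
step[:, 3:] = 255
edges = sobel_edges(step)
print(edges)
```

Only the two columns on either side of the intensity jump are marked as edges; the flat regions stay black.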
Region-Based Segmentation
Region-based segmentation algorithms divide the image into sections with similar
features. These regions are simply groups of pixels, and the algorithm finds these groups by
first locating a seed point, which could be a small section or a large portion of the input
image.
After finding the seed points, a region-based segmentation algorithm would either add
more pixels to them or shrink them so it can merge them with other seed points.
Based on these two methods, we can classify region-based segmentation into the
following categories:
a. Region Growing
In this method, you start with a small set of pixels and then start iteratively
merging more pixels according to particular similarity conditions. A region
growing algorithm would pick an arbitrary seed pixel in the image, compare it
with the neighboring pixels and start increasing the region by finding matches to
the seed point.
When a particular region can’t grow further, the algorithm will pick another seed
pixel which might not belong to any existing region. One region can have too
many attributes causing it to take over most of the image. To avoid such an error,
region growing algorithms grow multiple regions at the same time.
You should use region growing algorithms for images that have a lot of noise as
the noise would make it difficult to find edges or use thresholding algorithms.
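Region growing from a single seed can be sketched as a breadth-first search over neighbouring pixels. In this illustration the similarity condition is "gray level within tol of the seed value" and neighbours are 4-connected; both choices, like the name region_grow, are assumptions rather than a fixed part of the method.

```python
from collections import deque
import numpy as np

def region_grow(gray, seed, tol=10):
    """Grow a region from seed=(row, col); returns a boolean mask."""
    gray = np.asarray(gray, dtype=np.int32)   # int32 avoids uint8 wrap-around
    rows, cols = gray.shape
    mask = np.zeros((rows, cols), dtype=bool)
    seed_value = gray[seed]
    mask[seed] = True
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and not mask[nr, nc]:
                # Similarity condition: close enough to the seed intensity
                if abs(gray[nr, nc] - seed_value) <= tol:
                    mask[nr, nc] = True
                    queue.append((nr, nc))
    return mask

# Small synthetic image with a dark 2x2 patch in the corner (hypothetical data)
patch = np.array([[10, 12, 200],
                  [11, 13, 210],
                  [205, 198, 202]], dtype=np.uint8)
grown = region_grow(patch, seed=(0, 0), tol=10)
print(grown)
```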
b. Region Splitting and Merging
As the name suggests, a region splitting and merging method performs two actions
together: splitting and merging portions of the image.
It first splits the image into regions that have similar attributes and then merges the
adjacent portions which are similar to one another. In region splitting, the
algorithm considers the entire image, while in region growing, the algorithm
focuses on a particular point.
The region splitting and merging method follows a divide and conquer
methodology. It divides the image into different portions and then matches them
according to its predetermined conditions. Another name for the algorithms that
perform this task is split-merge algorithms.
Watershed Segmentation
In image processing, a watershed is a transformation on a grayscale image. It refers to the
geological watershed or a drainage divide. A watershed algorithm would handle the image as if it
was a topographic map. It considers the brightness of a pixel as its height and finds the lines that
run along the top of those ridges.
Watershed has many technical definitions and has several applications. Apart from identifying
the ridges of the pixels, it focuses on defining basins (the opposite of ridges) and floods the
basins with markers until they meet the watershed lines going through the ridges.
As basins have a lot of markers while the ridges don’t, the image gets divided into multiple
regions according to the ‘height’ of every pixel.
The watershed method converts every image into a topographical map. The watershed
segmentation method reflects the topography through the grey values of its pixels.
Now, a landscape with valleys and ridges would certainly have three-dimensional aspects. The
watershed would consider the three-dimensional representation of the image and create regions
accordingly, which are called “catchment basins”.
It has many applications in the medical sector such as MRI, medical imaging, etc. Watershed
segmentation is a prominent part of medical image segmentation so if you want to enter that
sector, you should focus on learning this method for segmentation in image processing
particularly.
Clustering-Based Segmentation Algorithms
If you’ve studied classification algorithms, you have probably come across clustering algorithms.
They are unsupervised algorithms that help you find structure in the image that might not be
apparent to the naked eye. This hidden structure includes information such as clusters,
structures, shadings, etc.
As the name suggests, a clustering algorithm divides the image into clusters (disjoint groups) of
pixels that have similar features. It would separate the data elements into clusters where the
elements in a cluster are more similar in comparison to the elements present in other clusters.
Some of the popular clustering algorithms include fuzzy c-means (FCM), k-means, and
improved k-means algorithms. In image segmentation, you’d mostly use the k-means clustering
algorithm as it’s quite simple and efficient. On the other hand, the FCM algorithm puts the pixels
in different classes according to their varying degrees of membership.
The most important clustering algorithms for segmentation in image processing are:
K-means Clustering
K-means is a simple unsupervised machine learning algorithm. It classifies an image
into a specific number of clusters, k. It starts the process by choosing k points in the
image space to represent the k initial group centroids.
Then it assigns each pixel to the group whose centroid is closest to it. When the
algorithm has assigned all pixels to clusters, it recomputes the centroids and reassigns
the pixels, repeating until the centroids stop moving.
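The assign-and-update loop can be sketched on gray levels alone. This is a minimal illustration under stated assumptions: the initial centroids are spread evenly over the intensity range instead of being picked at random, the number of iterations is fixed, and the name kmeans_intensity is hypothetical.

```python
import numpy as np

def kmeans_intensity(gray, k=2, iters=10):
    """Cluster pixel intensities into k groups; returns (label image, centroids)."""
    pixels = np.asarray(gray, dtype=np.float64).ravel()
    # Assumption: start with centroids spread evenly over the intensity range
    centroids = np.linspace(pixels.min(), pixels.max(), k)
    for _ in range(iters):
        # Assignment step: each pixel joins the nearest centroid
        labels = np.argmin(np.abs(pixels[:, None] - centroids[None, :]), axis=1)
        # Update step: move each centroid to the mean of its pixels
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = pixels[labels == j].mean()
    # Final assignment so the labels match the returned centroids
    labels = np.argmin(np.abs(pixels[:, None] - centroids[None, :]), axis=1)
    return labels.reshape(np.shape(gray)), centroids

# Synthetic image with a dark and a bright group of pixels (hypothetical data)
two_tone = np.array([[10, 12, 200],
                     [11, 205, 198]], dtype=np.uint8)
label_img, centres = kmeans_intensity(two_tone, k=2)
print(label_img)
print(centres)
```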
Fuzzy C Means
With the fuzzy c-means clustering method, the pixels in the image can get clustered in
multiple clusters. This means a pixel can belong to more than one cluster. However,
every pixel has a different degree of membership in each cluster. The fuzzy c-means
algorithm relies on an optimization (objective) function, which affects the accuracy of
your results.
Clustering algorithms can take care of most of your image segmentation needs.
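To illustrate what a degree of membership looks like in fuzzy c-means, the sketch below computes the membership of each pixel value in each cluster for fixed centroids, using the standard update u_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1)). The fuzzifier m = 2, the small eps that avoids division by zero and the name fcm_memberships are assumptions; a full algorithm would alternate this step with a centroid update.

```python
import numpy as np

def fcm_memberships(values, centroids, m=2.0, eps=1e-9):
    """Return an (n_pixels, n_clusters) array of memberships; rows sum to 1."""
    v = np.asarray(values, dtype=np.float64).ravel()
    c = np.asarray(centroids, dtype=np.float64).ravel()
    dist = np.abs(v[:, None] - c[None, :]) + eps   # distance to every centroid
    inv = dist ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

# Three gray levels and two cluster centres (hypothetical values)
u = fcm_memberships([0, 50, 100], centroids=[0, 100])
print(np.round(u, 3))
```

A pixel halfway between the two centres belongs equally to both clusters, while a pixel sitting on a centre belongs almost entirely to that cluster.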
Neural Networks for Segmentation
Perhaps you don’t want to do everything by yourself. Perhaps you want to have an AI do most of
your tasks, which you can certainly do with neural networks for image segmentation.
You’d use AI to analyse an image and identify its different components such as faces, objects,
text, etc. Convolutional Neural Networks are quite popular for image segmentation because they
can identify and process image data quickly and efficiently.
The experts at Facebook AI Research (FAIR) created a deep learning architecture called Mask
R-CNN which can make a pixel-wise mask for every object present in an image. It is an
enhanced version of the Faster R-CNN object detection architecture. The Faster R-CNN uses two
pieces of data for every object in an image, the bounding box coordinates and the class of the
object. With Mask R-CNN, you get an additional section in this process. Mask R-CNN outputs
the object mask after performing the segmentation.
In this process, you’d first pass the input image to the ConvNet which generates the feature map
for the image. Then the system applies the region proposal network (RPN) on the feature maps
and generates the object proposals with their objectness scores.
After that, the RoI pooling layer is applied to the proposals to bring them down to one size. In
the final stage, the system passes the proposals to the fully connected layers for classification and
generates the output with the bounding boxes for every object.
Color Detection
import cv2
import numpy as np
image = cv2.imread('umar.jpg')
# define the list of colour boundaries as (lower, upper) limits;
# cv2.imread returns pixels in BGR order, so each triple is [B, G, R]
boundaries = [
    ([17, 15, 100], [50, 56, 200]),
    ([86, 31, 4], [220, 88, 50]),
    ([25, 146, 190], [62, 174, 250]),
    ([103, 86, 65], [145, 133, 128])
]
# loop over the boundaries
for (lower, upper) in boundaries:
    # create NumPy arrays from the boundaries
    lower = np.array(lower, dtype="uint8")
    upper = np.array(upper, dtype="uint8")
    # find the colors within the specified boundaries and apply
    # the mask
    mask = cv2.inRange(image, lower, upper)
    output = cv2.bitwise_and(image, image, mask=mask)
    # show the original and the masked image side by side
    cv2.imshow("images", np.hstack([image, output]))
    cv2.waitKey(0)
cv2.destroyAllWindows()
FIG: COLOR DETECTION
Shape Detection
# import the necessary packages
import cv2
import imutils
class ShapeDetector:
    def __init__(self):
        pass
    def detect(self, c):
        # initialize the shape name and approximate the contour
        shape = "unidentified"
        peri = cv2.arcLength(c, True)
        approx = cv2.approxPolyDP(c, 0.04 * peri, True)
        # if the shape is a triangle, it will have 3 vertices
        if len(approx) == 3:
            shape = "triangle"
        # if the shape has 4 vertices, it is either a square or
        # a rectangle
        elif len(approx) == 4:
            # compute the bounding box of the contour and use the
            # bounding box to compute the aspect ratio
            (x, y, w, h) = cv2.boundingRect(approx)
            ar = w / float(h)
            # a square will have an aspect ratio that is approximately
            # equal to one, otherwise, the shape is a rectangle
            shape = "square" if ar >= 0.95 and ar <= 1.05 else "rectangle"
        # if the shape is a pentagon, it will have 5 vertices
        elif len(approx) == 5:
            shape = "pentagon"
        # otherwise, we assume the shape is a circle
        else:
            shape = "circle"
        # return the name of the shape
        return shape
# load the image and resize it to a smaller factor so that
# the shapes can be approximated better
image = cv2.imread('test.jpg')
resized = imutils.resize(image, width=300)
ratio = image.shape[0] / float(resized.shape[0])
# convert the resized image to grayscale, blur it slightly,
# and threshold it
gray = cv2.cvtColor(resized, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
thresh = cv2.threshold(blurred, 60, 255, cv2.THRESH_BINARY)[1]
# find contours in the thresholded image and initialize the
# shape detector
cnts = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL,
                        cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(cnts)
sd = ShapeDetector()
# loop over the contours
for c in cnts:
    # compute the center of the contour, then detect the name of the
    # shape using only the contour
    M = cv2.moments(c)
    # skip degenerate contours with zero area to avoid division by zero
    if M["m00"] == 0:
        continue
    cX = int((M["m10"] / M["m00"]) * ratio)
    cY = int((M["m01"] / M["m00"]) * ratio)
    shape = sd.detect(c)
    # multiply the contour (x, y)-coordinates by the resize ratio,
    # then draw the contours and the name of the shape on the image
    c = c.astype("float")
    c *= ratio
    c = c.astype("int")
    cv2.drawContours(image, [c], -1, (0, 255, 0), 2)
    cv2.putText(image, shape, (cX, cY), cv2.FONT_HERSHEY_SIMPLEX,
                0.5, (255, 255, 255), 2)
# show the output image
cv2.imshow("Image", image)
cv2.waitKey(0)
cv2.destroyAllWindows()
FIG: SHAPE DETECTION
Contour Detection
import cv2
# read the image
image = cv2.imread('test.jpg')
# convert the image to grayscale format
img_gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# apply binary thresholding
ret, thresh = cv2.threshold(img_gray, 150, 255, cv2.THRESH_BINARY)
# visualize the binary image
cv2.imshow('Binary image', thresh)
cv2.waitKey(0)
cv2.imwrite('image_thres1.jpg', thresh)
cv2.destroyAllWindows()
# detect the contours on the binary image using cv2.CHAIN_APPROX_NONE
contours, hierarchy = cv2.findContours(
    image=thresh, mode=cv2.RETR_TREE, method=cv2.CHAIN_APPROX_NONE)
# draw contours on the original image
image_copy = image.copy()
cv2.drawContours(image=image_copy, contours=contours, contourIdx=-1,
                 color=(0, 255, 0), thickness=2, lineType=cv2.LINE_AA)
# see the results
cv2.imshow('None approximation', image_copy)
cv2.waitKey(0)
cv2.imwrite('contours_none_image1.jpg', image_copy)
cv2.destroyAllWindows()
FIG: BINARY IMAGE
FIG: CONTOUR DETECTION
Lab 07
Objective:
To write and execute program for geometric transformation of image
• Scaling (i.e., Shrinking, Zooming)
• Translation
• Rotation
Theory:
Geometric Transformation
Geometric transformations are needed to give an entity the needed position, orientation,
or shape starting from existing position, orientation, or shape. The basic transformations are
scaling, rotation, translation, and shear. Other important types of transformations are projections
and mappings.
Scaling
Scaling is just resizing of the image. OpenCV comes with a function cv2.resize() for this
purpose. The size of the image can be specified manually, or you can specify the scaling factor.
Different interpolation methods are used. Preferable interpolation methods are
cv2.INTER_AREA for shrinking and cv2.INTER_CUBIC (slow) & cv2.INTER_LINEAR for
zooming. By default, the interpolation method used is cv2.INTER_LINEAR for all resizing
purposes. You can resize an input image using either of the following methods:
Code:
import cv2
from matplotlib import pyplot as plt
img = cv2.imread('img1.jpg')
# Method 1: specify the scaling factors
# res = cv2.resize(img, None, fx=2, fy=2, interpolation=cv2.INTER_CUBIC)
# OR
# Method 2: specify the output size as (width, height)
height, width = img.shape[:2]
res = cv2.resize(img, (2*width, 2*height), interpolation=cv2.INTER_CUBIC)
# OpenCV stores colour images in BGR order; convert to RGB for matplotlib
plt.subplot(121), plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB)), plt.title("Before")
plt.xlabel(str(img.shape))
plt.xticks([]), plt.yticks([])
plt.subplot(122), plt.imshow(cv2.cvtColor(res, cv2.COLOR_BGR2RGB)), plt.title("After")
plt.xlabel(str(res.shape))
plt.xticks([]), plt.yticks([])
plt.show()
Output:
FIG: SCALING
Translation
Translation is the shifting of an object’s location. If you know the shift in the (x, y) direction, let it
be $(t_x, t_y)$, you can create the transformation matrix $M$ as follows:
$$M = \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \end{bmatrix}$$
You can make it into a NumPy array of type np.float32 and pass it into the cv2.warpAffine()
function. See the example below for a shift of (100, 50).
The third argument of the cv2.warpAffine() function is the size of the output image, which should be
in the form of (width, height). Remember width = number of columns, and height = number of
rows.
Code:
import cv2
import numpy as np
from matplotlib import pyplot as plt
# read the image as grayscale
img = cv2.imread('img1.jpg', 0)
rows, cols = img.shape
# shift by 100 pixels along x and 50 pixels along y
M = np.float32([[1, 0, 100], [0, 1, 50]])
dst = cv2.warpAffine(img, M, (cols, rows))
cv2.imshow('img', dst)
cv2.waitKey(0)
cv2.destroyAllWindows()
# cmap="gray" makes matplotlib display the single-channel image in gray levels
plt.subplot(121), plt.imshow(img, cmap="gray"), plt.title("Before")
plt.xticks([]), plt.yticks([])
plt.subplot(122), plt.imshow(dst, cmap="gray"), plt.title("After")
plt.xticks([]), plt.yticks([])
plt.show()
Output:
FIG: TRANSLATION
Translating and rotating an image with OpenCV is easy, but a simple rotation or translation
can crop the sides of the image (as you see above), leaving only part of it visible. This
happens because the output keeps the original size: pixels shifted outside the frame are lost,
and the area the image vacates is padded with a fixed color (Default Color: Black).
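The cropping and padding behaviour can be reproduced with plain array slicing. The sketch below is only an illustration of the idea and not how OpenCV implements it: it handles non-negative integer shifts only, and the name translate and the fill parameter are assumptions.

```python
import numpy as np

def translate(gray, tx, ty, fill=0):
    """Shift an image right by tx and down by ty pixels (tx, ty >= 0)."""
    img = np.asarray(gray)
    rows, cols = img.shape[:2]
    out = np.full_like(img, fill)          # start from an all-padding canvas
    if tx < cols and ty < rows:
        # Pixels that would land outside the frame are simply dropped
        out[ty:, tx:] = img[:rows - ty, :cols - tx]
    return out

# 3x3 test image with values 1..9 (hypothetical data)
small = np.arange(1, 10, dtype=np.uint8).reshape(3, 3)
shifted = translate(small, tx=1, ty=1)
print(shifted)
```

The last row and column of the input disappear, and the first row and column of the output are filled with black (0).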
Rotation
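Rotation of an image by an angle $\theta$ is achieved by a transformation matrix of the form
$$M = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$
But OpenCV provides scaled rotation with an adjustable center of rotation so that you can rotate about
any location you prefer. The modified transformation matrix is given by
$$\begin{bmatrix} \alpha & \beta & (1-\alpha)\cdot center.x - \beta\cdot center.y \\ -\beta & \alpha & \beta\cdot center.x + (1-\alpha)\cdot center.y \end{bmatrix}$$
where:
$$\alpha = scale\cdot\cos\theta, \qquad \beta = scale\cdot\sin\theta$$
To find this transformation matrix, OpenCV provides a function, cv2.getRotationMatrix2D.
Check the example below, which rotates the image by 90 degrees with respect to the center without any
scaling.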
Code:
import cv2
from matplotlib import pyplot as plt
# read the image as grayscale
img = cv2.imread('img1.jpg', 0)
rows, cols = img.shape
# rotate by 90 degrees about the image center, with scale factor 1
M = cv2.getRotationMatrix2D((cols/2, rows/2), 90, 1)
dst = cv2.warpAffine(img, M, (cols, rows))
# cmap="gray" makes matplotlib display the single-channel image in gray levels
plt.subplot(121), plt.imshow(img, cmap="gray"), plt.title("Before")
plt.xticks([]), plt.yticks([])
plt.subplot(122), plt.imshow(dst, cmap="gray"), plt.title("After")
plt.xticks([]), plt.yticks([])
plt.show()
Output:
FIG: ROTATION
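For reference, the scaled rotation matrix about an arbitrary center can be written out by hand. The sketch below follows the form documented for cv2.getRotationMatrix2D, with alpha = scale * cos(angle) and beta = scale * sin(angle); the function name rotation_matrix_2d is an assumption, and the sketch is meant for checking your understanding rather than for replacing the OpenCV call.

```python
import math
import numpy as np

def rotation_matrix_2d(center, angle_deg, scale=1.0):
    """Build the 2x3 affine matrix for a scaled rotation about center=(x, y)."""
    cx, cy = center
    theta = math.radians(angle_deg)
    alpha = scale * math.cos(theta)
    beta = scale * math.sin(theta)
    return np.array([
        [alpha, beta, (1 - alpha) * cx - beta * cy],
        [-beta, alpha, beta * cx + (1 - alpha) * cy],
    ])

# 90 degree rotation about the point (50, 50), no scaling (hypothetical center)
M_rot = rotation_matrix_2d((50, 50), 90, 1.0)
center_h = np.array([50, 50, 1])        # the center in homogeneous coordinates
print(M_rot)
print(M_rot @ center_h)
```

A useful sanity check is that the center of rotation is mapped onto itself, whatever the angle.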
Lab 08
Objective:
To understand various image noise models and to write programs for image denoising in
OpenCV
• Noise and Image Noise Models
• Image Denoising in OpenCV
• Install OpenCV and other packages
Theory:
Noise
Noise is typically defined as a random variation in brightness or colour information and it is
frequently produced by technical limits of the image collection sensor or by improper
environmental circumstances. These difficulties are frequently inevitable in real scenarios,
making image noise a common issue that must be addressed with appropriate denoising
approaches.
Denoising an image is a difficult task since the noise is tied to the image’s high-frequency
content, i.e., the details. As a result, the goal is to strike a balance between suppressing noise as
much as possible while not losing too much information. Filter-based approaches for picture
denoising, such as the Inverse, Median, and Wiener Filters, are the most often utilized.
Sources of Noise
During picture acquisition and transmission, noise may be introduced into the image. The
quantification of noise is determined by the number of corrupted pixels in the image. The
following are the primary sources of noise in digital images:
• Environmental factors may have an impact on the imaging sensor.
• Low light and sensor temperature may cause image noise.
• Dust particles in the scanner can cause noise in the digital image.
• Transmission channel interference.
Image Noise models
Noise is distinguished by its pattern as well as by its probabilistic properties. There is a wide
range of noise types. We focus primarily on the most important forms: Gaussian noise, salt and
pepper (impulse) noise, Poisson noise, and speckle noise.
I.
Gaussian Noise
It is also called electronic noise because it arises in amplifiers or detectors. Gaussian noise
is caused by natural sources such as the thermal vibration of atoms and the discrete nature of
radiation from warm objects. Gaussian noise generally disturbs the gray values in digital images.
That is why the Gaussian noise model is essentially designed and characterized by its PDF, or
normalized histogram, with respect to gray value. This is given as
$$P(g) = \frac{1}{\sqrt{2\pi\sigma^{2}}}\, e^{-\frac{(g-\mu)^{2}}{2\sigma^{2}}}$$
where g = gray value, σ = standard deviation and µ = mean. Generally, the Gaussian noise
mathematical model is a good approximation of real-world scenarios. In this noise
model, the mean value is zero, the variance is 0.1 and there are 256 gray levels in terms of its
PDF, which is shown in the figure.
FIG: PDF OF GAUSSIAN NOISE
Because of this randomness, the normalized Gaussian noise curve is bell shaped. The
PDF of this noise model shows that about 68% of the noisy pixel values of the degraded image lie
between µ - σ and µ + σ, and about 95% lie between µ - 2σ and µ + 2σ. The shape of the
normalized histogram is almost the same in the spectral domain.
Gaussian Noise with OpenCV-Python:
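import cv2
from matplotlib import pyplot as plt
# Gaussian filtering: smooth the image with a 5x5 Gaussian kernel
# (this suppresses Gaussian noise; it does not add noise to the image)
img = cv2.imread('test.jpg')
blur = cv2.GaussianBlur(img, (5, 5), 0)
# OpenCV stores colour images in BGR order; convert to RGB for matplotlib
plt.subplot(121), plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB)), plt.title("Original")
plt.xticks([]), plt.yticks([])
plt.subplot(122), plt.imshow(cv2.cvtColor(blur, cv2.COLOR_BGR2RGB)), plt.title("Gaussian blur")
plt.xticks([]), plt.yticks([])
cv2.imwrite('GuassFilter.jpg', blur)
plt.show()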
FIG: GAUSSIAN NOISE
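Additive Gaussian noise of the kind described in this section can be simulated by adding normally distributed samples to every pixel. The sketch below is one possible way to do it; the standard deviation of 20 gray levels, the fixed random seed and the name add_gaussian_noise are assumptions chosen for illustration.

```python
import numpy as np

def add_gaussian_noise(gray, mean=0.0, sigma=20.0, seed=0):
    """Add zero-mean Gaussian noise to an 8-bit image and clip to [0, 255]."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(mean, sigma, size=np.shape(gray))
    noisy = np.asarray(gray, dtype=np.float64) + noise
    return np.clip(noisy, 0, 255).astype(np.uint8)

# Flat mid-gray test image (hypothetical data)
flat = np.full((64, 64), 128, dtype=np.uint8)
noisy_flat = add_gaussian_noise(flat, sigma=20.0)
print(noisy_flat.mean(), noisy_flat.std())
```

On a flat image the mean stays close to the original gray level, while the standard deviation of the result is close to the chosen sigma.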
II.
Impulse (Salt and Pepper) Noise
A low-quality image may contain isolated bright and dark pixels; this kind of noise is
referred to as salt and pepper noise. An image that contains salt and pepper noise will
generally have bright pixels in dark regions and dark pixels in bright regions of the image.
The noise arises from sharp and sudden disturbances in the image signal.
Salt and pepper noise is caused by dead pixels, analog-to-digital converter errors, bit errors
in transmission, and similar faults. It generally corrupts the digital image through
malfunctioning pixel elements in camera sensors, faulty memory locations in storage, errors in
the digitization process and many more. This kind of noise can be removed by using Dark Frame
Subtraction (DFS) and by constructing new data points around the dark and bright pixels with a
median filter or a morphological filter.
The probability density function is given as:
$$P(g) = \begin{cases} P_a & \text{for } g = a \\ P_b & \text{for } g = b \\ 0 & \text{otherwise} \end{cases}$$
Fig. 2 shows the PDF of salt and pepper noise for a mean of zero and a variance of 0.05. The
PDF has two spikes: one in the dark region (where the gray level is small), called ‘region a’,
and another in the bright region (where the gray level is large), called ‘region b’. The PDF is
zero everywhere except at these two spikes.
FIG: PDF of Salt and Pepper Noise
Salt and Pepper Noise with OpenCV-Python:
import random
import cv2
import numpy as np
from matplotlib import pyplot as plt
# salt and pepper noise
def sp_noise(image, prob):
    output = np.zeros(image.shape, np.uint8)
    thres = 1 - prob
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            rdn = random.random()
            if rdn < prob:
                output[i][j] = 0        # pepper: force the pixel to black
            elif rdn > thres:
                output[i][j] = 255      # salt: force the pixel to white
            else:
                output[i][j] = image[i][j]
    return output
image = cv2.imread('test.jpg', 0)
noise_img = sp_noise(image, 0.05)
cv2.imwrite('Saltpepper.jpg', noise_img)
plt.subplot(121), plt.imshow(image, cmap="gray")
plt.subplot(122), plt.imshow(noise_img, cmap="gray")
plt.show()
FIG: SALT & PEPPER NOISE
III.
Speckle Noise
Speckle noise is a granular noise that degrades the quality of an image and makes it more
difficult for the observer to distinguish fine details in the image.
This type of noise can be found in a wide range of systems, including synthetic aperture radar
(SAR) images, ultrasound imaging, and many more.
Speckle is a multiplicative noise. It appears in coherent imaging systems such as laser,
radar and acoustic systems. Speckle noise can look similar to Gaussian noise in an image.
Its probability density function follows a gamma distribution, which is shown in Fig. 3
and given as:
$$F(g) = \frac{g^{\alpha-1}}{(\alpha-1)!\,a^{\alpha}}\, e^{-\frac{g}{a}}$$
where g is the gray level, and α and a are the shape and scale parameters of the distribution.
Figure 3: PDF of Speckle Noise
Speckle Noise with OpenCV-Python:
import random
import cv2
import numpy as np
from matplotlib import pyplot as plt
# Speckle noise (a crude approximation: short gray streaks around level 128)
def spec_noise(image, prob):
    output = np.zeros(image.shape, np.uint8)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            rdn = random.random()
            if rdn < prob:
                output[i][j] = 128
                # mark a short diagonal streak, staying inside the image
                for k in range(5):
                    if i - k >= 0 and j - k >= 0:
                        output[i-k][j-k] = 128 + 10*rdn
            else:
                output[i][j] = image[i][j]
    return output
image = cv2.imread('test.jpg', 0)
noise_img = spec_noise(image, 0.07)
cv2.imwrite('out.jpg', noise_img)
cv2.imwrite('in.jpg', image)
plt.subplot(121), plt.imshow(image, cmap="gray")
plt.subplot(122), plt.imshow(noise_img, cmap="gray")
plt.show()
FIG: SPECKLE NOISE
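Since speckle is described as multiplicative, a common way to simulate it is to scale a random field by the image itself, J = I + I * n. In the sketch below n is drawn from a zero-mean Gaussian, which is a simplifying assumption and not the gamma model mentioned in the text; the name add_speckle, the sigma of 0.2 and the fixed seed are also assumptions.

```python
import numpy as np

def add_speckle(gray, sigma=0.2, seed=0):
    """Multiplicative noise: each pixel is perturbed in proportion to its value."""
    rng = np.random.default_rng(seed)
    g = np.asarray(gray, dtype=np.float64)
    n = rng.normal(0.0, sigma, size=g.shape)
    noisy = g + g * n                  # the noise scales with the intensity
    return np.clip(noisy, 0, 255).astype(np.uint8)

# Left half black, right half mid-gray (hypothetical data)
halves = np.zeros((64, 64), dtype=np.uint8)
halves[:, 32:] = 100
speckled = add_speckle(halves)
print(speckled[:, :32].max(), speckled[:, 32:].std())
```

Black pixels stay black because the noise is multiplied by zero, while the gray half becomes visibly grainy. This is the main difference from additive noise, which would disturb both halves equally.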
IV.
Poisson Noise
Poisson noise, also called shot noise, is an electronic noise that occurs in an image when the
number of particles that carry energy, such as electrons in an electronic circuit or photons in a
photosensitive device, is small enough to give rise to detectable statistical variations in a
measurement. Consider light as a stream of discrete photons coming out of a source and hitting a
point, which creates a visible spot. The physical process that governs light emission is such that
photons emitted from the source hit the point at random times, and billions of photons are needed
to create a visible spot. If the source emits only a handful of photons that hit the point every
second, the random variation in their number becomes noticeable and this noise appears.
FIG: PDF of Poisson Noise
Poisson Noise with OpenCV-Python:
import cv2
import numpy as np
from matplotlib import pyplot as plt
# Poisson Noise
def p_noise(image, lam=50):
    # draw one Poisson-distributed sample (mean lam) per pixel
    noise = np.random.poisson(lam, image.shape)
    # histogram of the generated noise values
    plt.hist(noise.ravel(), bins=256, range=(0, 256))
    plt.show()
    # add in a wider integer type, then clip back to the valid 8-bit range
    output = np.clip(image.astype(np.int32) + noise, 0, 255).astype(np.uint8)
    return output
image = cv2.imread('test.jpg', 0)
noise_img = p_noise(image)
plt.subplot(121), plt.imshow(image, cmap="gray")
plt.subplot(122), plt.imshow(noise_img, cmap="gray")
plt.show()
FIG: POISSON NOISE
Analysis of best suited filters for noises
Noise              Best Suited Filter
Gaussian           Gaussian filter
Salt and Pepper    Median
Poisson            Mean
Speckle            Wiener
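As an example of one pairing in this comparison, the sketch below applies a 3x3 median filter to an image corrupted by isolated salt and pepper pixels. It is written with plain NumPy for illustration; the window size, the edge padding and the name median_filter3 are assumptions.

```python
import numpy as np

def median_filter3(gray):
    """Replace every pixel by the median of its 3x3 neighbourhood."""
    g = np.asarray(gray)
    p = np.pad(g, 1, mode="edge")
    rows, cols = g.shape
    # Stack the nine shifted views that make up each 3x3 neighbourhood
    stack = np.stack([p[r:r + rows, c:c + cols]
                      for r in range(3) for c in range(3)])
    return np.median(stack, axis=0).astype(g.dtype)

# Flat image with one salt and one pepper pixel (hypothetical data)
spotted = np.full((5, 5), 100, dtype=np.uint8)
spotted[1, 1] = 255     # salt
spotted[3, 3] = 0       # pepper
cleaned = median_filter3(spotted)
print(cleaned)
```

Because an isolated outlier can never be the median of its neighbourhood, both corrupted pixels are restored without blurring the rest of the image.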
Adaptive Filtering
An adaptive filter is applied to the degraded image, which contains the original image plus
noise. A local adaptive filter depends on two statistical measures, the mean and the variance,
computed over a defined (m x n) window region.
The premise behind adaptive image filtering is that by varying the filtering method as the kernel
slides across the image (in the same manner as the convolution operation), they are able to tailor
themselves to the local properties and structures of an image. In essence, they can be thought of
as self-adjusting digital filters. While certain types of adaptive filters may perform better than
median filters at removing impulse noise (these are mostly variations on the basic median
filtering scheme), they are most often used for denoising non-stationary images, which tend to
exhibit abrupt intensity changes. Because the filtering operation is no longer purely uniform and
instead modulated based on the local characteristics of the image, these filters can be employed
effectively when there is little a priori knowledge of the signal being processed.
import cv2
from matplotlib import pyplot as plt
# Gaussian filtering applied to a region of interest (ROI) only
img = cv2.imread('test.jpg')
row = 20        # top-left corner of the ROI
col = 60
height = 200    # size of the ROI in pixels
width = 300
# NumPy indexing is [rows, columns]
roi_of_img = img[row:row+height, col:col+width]
blur = cv2.GaussianBlur(roi_of_img, (5, 5), 0)
# OpenCV stores colour images in BGR order; convert to RGB for matplotlib
plt.subplot(121), plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB)), plt.title("Before")
plt.xticks([]), plt.yticks([])
plt.subplot(122), plt.imshow(cv2.cvtColor(blur, cv2.COLOR_BGR2RGB)), plt.title("After")
plt.xticks([]), plt.yticks([])
cv2.imwrite('GuassFilter.jpg', blur)
plt.show()
FIG: ADAPTIVE FILTERING
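A filter that actually adapts to the local mean and variance can be sketched as follows. The rule used here is the textbook adaptive local noise-reduction filter, output = g - (noise_var / local_var) * (g - local_mean), with the ratio capped at 1. It assumes that the noise variance is known in advance, and the name local_adaptive_filter and the 3x3 window are assumptions for illustration.

```python
import numpy as np

def local_adaptive_filter(gray, noise_var, size=3):
    """Smooth strongly in flat areas and weakly near edges."""
    g = np.asarray(gray, dtype=np.float64)
    pad = size // 2
    p = np.pad(g, pad, mode="reflect")
    rows, cols = g.shape
    windows = np.stack([p[r:r + rows, c:c + cols]
                        for r in range(size) for c in range(size)])
    local_mean = windows.mean(axis=0)
    local_var = windows.var(axis=0)
    # Ratio near 1: flat area, move to the local mean. Near 0: edge, keep pixel.
    ratio = np.minimum(noise_var / np.maximum(local_var, 1e-12), 1.0)
    out = g - ratio * (g - local_mean)
    return np.clip(out, 0, 255).astype(np.uint8)

rng = np.random.default_rng(0)
# Noisy flat area and a clean step edge (hypothetical data)
noisy_area = np.clip(128 + rng.normal(0, 10, (32, 32)), 0, 255)
edge_img = np.zeros((8, 8))
edge_img[:, 4:] = 255
smoothed = local_adaptive_filter(noisy_area, noise_var=100.0)
kept_edge = local_adaptive_filter(edge_img, noise_var=100.0)
print(noisy_area.std(), smoothed.std())
```

The flat area comes out noticeably smoother, while the step edge is left almost untouched because its local variance is far larger than the noise variance.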
Image Denoising in OpenCV
OpenCV implements Non-local Means Denoising and provides four variations of this technique.
1. cv.fastNlMeansDenoising() - works with a single grayscale image
2. cv.fastNlMeansDenoisingColored() - works with a colour image.
3. cv.fastNlMeansDenoisingMulti() - works with image sequence captured in short period of
time (grayscale images)
4. cv.fastNlMeansDenoisingColoredMulti() - same as above, but for colour images.
Common arguments are:
• h: parameter deciding filter strength. A higher h value removes noise better, but also removes
details of the image. (10 is ok)
• hForColorComponents: same as h, but for colour images only. (Normally same as h)
• templateWindowSize: should be odd. (Recommended 7)
• searchWindowSize: should be odd. (Recommended 21)
cv.fastNlMeansDenoisingColored()
It is used to remove noise from color images. (Noise is expected to be gaussian).
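import cv2 as cv
from matplotlib import pyplot as plt
img = cv.imread('Noisy.jpg')
# arguments: src, dst, h, hForColorComponents, templateWindowSize, searchWindowSize
dst = cv.fastNlMeansDenoisingColored(img, None, 10, 10, 7, 21)
# OpenCV stores colour images in BGR order; convert to RGB for matplotlib
plt.subplot(121), plt.imshow(cv.cvtColor(img, cv.COLOR_BGR2RGB))
plt.subplot(122), plt.imshow(cv.cvtColor(dst, cv.COLOR_BGR2RGB))
plt.show()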
FIG: IMAGE DENOISING
cv.fastNlMeansDenoisingMulti()
This function is applied to video (a sequence of frames). The details of the arguments are:
• The first argument is the list of noisy frames.
• The second argument, imgToDenoiseIndex, specifies which frame we need to denoise; for that
we pass the index of the frame in our input list.
• The third is temporalWindowSize, which specifies the number of nearby frames to be
used for denoising. It should be odd.
In that case, a total of temporalWindowSize frames are used, where the central frame is the
frame to be denoised. For example, suppose you passed a list of 5 frames as input. Let
imgToDenoiseIndex = 2 and temporalWindowSize = 3. Then frame-1, frame-2 and
frame-3 are used to denoise frame-2. Let's see an example.
import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt
cap = cv.VideoCapture('man.mp4')
# create a list of first 5 frames
img = [cap.read()[1] for i in range(5)]
# convert all to grayscale
gray = [cv.cvtColor(i, cv.COLOR_BGR2GRAY) for i in img]
# convert all to float64
gray = [np.float64(i) for i in gray]
# create Gaussian noise with standard deviation 10 (variance 100)
noise = np.random.randn(*gray[1].shape)*10
# Add this noise to images
noisy = [i+noise for i in gray]
# Convert back to uint8
noisy = [np.uint8(np.clip(i, 0, 255)) for i in noisy]
# Denoise 3rd frame considering all the 5 frames
dst = cv.fastNlMeansDenoisingMulti(noisy, 2, 5, None, 4, 7, 35)
plt.subplot(131), plt.imshow(gray[2], 'gray')
plt.subplot(132), plt.imshow(noisy[2], 'gray')
plt.subplot(133), plt.imshow(dst, 'gray')
plt.show()
FIG: VIDEO DENOISING
Lab 09
Objective:
Understand Warping Effect in Open CV. Explain and Code Digital Image Watermarking in
Open CV
Theory:
Bitwise logical operations can be performed between pixels of one or more than one image.
AND/NAND logical operations can be used for the following applications:
• Compute intersection of the images
• Design of filter masks
• Slicing of gray scale images
OR/NOR logical operations can be used for the following applications:
• Merging of two images
XOR/XNOR operations can be used for the following applications:
• To detect change in gray level in the image
• Check similarity of two images
NOT operation is used for:
• To obtain negative image
• Making some features clear
Watermarking using EX-OR operation:
To provide copy protection for digital audio, image and video, two techniques are used:
encryption and watermarking. Encryption techniques are normally used to protect data during
transmission from sender to receiver. Once the data is received, it is decrypted and is then
identical to the original, which is no longer protected. Watermarking techniques can complement
encryption by embedding a secret key into the original data. Watermarking can be applied to
audio, image or video data.
Watermarking can be visible or non-visible. In visible watermarking, the watermark image is
visible on the original image in a faint form. In non-visible watermarking, the watermark image
is hidden inside the original image. In digital watermarking, the watermarking key bits are
scattered in the image and cannot be identified. Many techniques are being developed for secure
and robust watermarking. Watermarking using the EX-OR operation is the simplest technique.
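The reason EX-OR is convenient for watermarking is that applying the same key twice restores the original: (I XOR K) XOR K = I. The sketch below demonstrates this on small random arrays; the name xor_watermark and the random cover and key images are assumptions for illustration, and in practice the key would be a watermark image of the same size as the cover.

```python
import numpy as np

def xor_watermark(image, key):
    """Embed (or remove) a watermark key with a pixel-wise EX-OR."""
    return np.bitwise_xor(image, key)

rng = np.random.default_rng(0)
# Random 4x4 cover image and key (hypothetical data)
cover = rng.integers(0, 256, size=(4, 4), dtype=np.uint8)
key = rng.integers(0, 256, size=(4, 4), dtype=np.uint8)
marked = xor_watermark(cover, key)         # embedding
recovered = xor_watermark(marked, key)     # extraction with the same key
print(np.array_equal(recovered, cover))
```

The same function serves as both encoder and decoder. EX-ORing the marked image with the original cover also returns the key, which is how the watermark itself can be checked.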
Digital Watermarking:
Digital watermarking, or simply watermarking, covers the ways and mechanisms used to hide data,
such as a number or text, in digital media such as a picture or video. A watermark is a message
that is embedded into digital data like video, pictures, and text, and the embedded data can be
extracted later.
Steganography is a related technique in which messages are hidden in the content without
people noticing their presence. Indian currency is a good example of watermarking. In the
general watermarking procedure, the genuine image undergoes the embedding procedure along with
the watermark, and the output generated is a watermarked image.
General Framework for Watermarking:
Watermarking is the process that embeds data called a watermark or digital signature or tag or
label into a multimedia object such that watermark can be detected or extracted later to make an
assertion about the object. The object may be an image or audio or video.
In general, any watermarking scheme (algorithm) consists of three parts:
• The watermark
• The encoder (marking insertion algorithm)
• The decoder and comparator (verification or extraction or detection algorithm)
Each owner has a unique watermark, or an owner can also put different watermarks in different
objects. The marking algorithm incorporates the watermark into the object. The verification
algorithm authenticates the object, determining both the owner and the integrity of the object.
Encoding Process:
The figure illustrates the encoding process
FIG: ENCODING PROCESS
Let us denote an image by I, a signature by S = {s1, s2, …}, and the watermarked image by I’. E is an
encoder function: it takes an image I and a signature S and generates a new image, which is
called the watermarked image I’, i.e., E (I, S) = I’.
Decoding Process:
A decoder function D takes an image J (J can be a watermarked or unwatermarked image, and
possibly corrupted) whose ownership is to be determined and recovers a signature S’ from the
image.
In this process, an additional image I can also be included which is often the original and unwatermarked version of J. This is due to the fact that some encoding schemes may make use of
the original images in the watermarking process to provide extra robustness against intentional
and unintentional corruption of pixels.
Mathematically,
D (J, I) = S’
Depending on the way the watermark is inserted and depending on the nature of the
watermarking algorithm, the method used can involve very distinct approaches. In some
watermarking schemes, a watermark can be extracted in its exact form, a procedure we call
watermark extraction.
In other cases, we can detect only whether a specific given
watermarking signal is present in an image, a procedure we call watermark detection. It should
be noted that watermark extraction can prove ownership whereas watermark detection can only
verify ownership.
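The encoder, decoder and detector roles described above can be sketched as three small functions. This is only one possible instance, assuming a least-significant-bit scheme; the names `encode`, `decode` and `detect`, the 0.9 agreement threshold and the random test data are assumptions for illustration. The decoder here is "blind": it does not need the original image I, whereas some schemes use D (J, I).

```python
import numpy as np

def encode(I, S):
    """E(I, S) = I': write the bits of S into the least significant bit of I."""
    return (I & 0xFE) | (S & 1)

def decode(J):
    """D(J) = S': read the least significant bit plane back out (extraction)."""
    return J & 1

def detect(J, S, threshold=0.9):
    """Detection: report whether S appears to be present, without returning it."""
    agreement = np.mean(decode(J) == (S & 1))
    return bool(agreement >= threshold)

rng = np.random.default_rng(0)
I = rng.integers(0, 256, size=(8, 8), dtype=np.uint8)   # stand-in carrier image
S = rng.integers(0, 2, size=(8, 8), dtype=np.uint8)     # stand-in binary signature

I_marked = encode(I, S)
S_recovered = decode(I_marked)
print(np.array_equal(S_recovered, S), detect(I_marked, S))
```

Extraction returns the signature itself, while detection only answers yes or no, which mirrors the distinction drawn in the text.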
Creating a watermark using the image given below:
# Write a program to simulate the embedding and extraction process of a digital watermark
import cv2
import numpy as np
# Read the original carrier image
gray1 = cv2.imread("test.jpg", 0)
# Read watermark image
watermark = cv2.imread("watermark.png", 0)
# The value 255 in the watermark image is processed to 1 to facilitate embedding
w = watermark[:, :] > 0
watermark[w] = 1
# Read the shape value of the original carrier image
r, c = gray1.shape
# ----------------------Embedding process-----------------------
# Generate an array with element values of 254
t254 = np.ones((r, c), dtype=np.uint8)*254
# Keep the top seven bits of the carrier image (clear the least significant bit)
gray1H7 = cv2.bitwise_and(gray1, t254)
# Embed the watermark in the least significant bit of gray1H7
e = cv2.bitwise_or(gray1H7, watermark)
# ----------------------Extraction process-------------------------
# Generate an array with element values of 1
t1 = np.ones((r, c), dtype=np.uint8)
# Extracting watermark image from carrier image
wm = cv2.bitwise_and(e, t1)
print(wm)
# The value 1 in the watermark image is processed to 255 to facilitate display
# (this is a simple thresholding step)
w = wm[:, :] > 0
wm[w] = 255
# ------------------------Display--------------------------
cv2.imshow("gray1", gray1)
# The maximum value in the current watermark is 1
cv2.imshow("watermark", watermark*255)
cv2.imshow("e", e)
cv2.imshow("wm", wm)
cv2.waitKey()
cv2.destroyAllWindows()
FIG: Original Image
FIG: Watermarking Key
FIG: Watermarked Image
FIG: Extracted Key
Lab 10
Objective:
Write and execute programs for image frequency domain filtering
• Apply FFT on given image
• Perform low pass and high pass filtering in frequency domain
• Apply IFFT to reconstruct image
• Edge Detection by DFT
Theory:
Frequency Domain Filters:
Frequency domain filters are used for smoothing and sharpening images by removing high-
or low-frequency components.
Frequency domain filters differ from spatial domain filters in that they operate mainly on the
frequency content of the image. They are used for two basic operations: smoothing and sharpening.
Domain Filter
Let us apply a domain filter using the cv2.edgePreservingFilter() method.
import numpy as np
import cv2
img = cv2.imread("test.jpg")
domainFilter = cv2.edgePreservingFilter(img, flags=1, sigma_s=60, sigma_r=0.6)
cv2.imshow('Domain Filter', domainFilter)
cv2.waitKey(0)
cv2.destroyAllWindows()
FIG: Original Image vs Domain Filtered Image
Gaussian Blur Method
Gaussian blur (also known as Gaussian smoothing) is the result of blurring an image by a
Gaussian function.
It is a widely used effect in graphics software, typically to reduce image noise and reduce detail.
The visual effect of this blurring technique is a smooth blur resembling that of viewing the image
through a translucent screen, distinctly different from the bokeh effect produced by an out-of-focus lens or the shadow of an object under usual illumination.
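To make the "Gaussian function" concrete, the weights of a small Gaussian kernel can be computed directly. The helper name `gaussian_kernel` and the chosen size and sigma are assumptions for illustration; OpenCV builds its own kernel internally and may differ in detail.

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    """Build a normalized size x size Gaussian kernel (hypothetical helper)."""
    ax = np.arange(size) - (size - 1) / 2.0          # coordinates centred on zero
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return kernel / kernel.sum()                     # weights sum to 1

k = gaussian_kernel(5, 1.0)
print(k.round(3))
```

The largest weight sits at the centre and the weights fall off smoothly with distance, which is why the blur looks soft rather than blocky.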
import numpy as np
# import pandas as pd
import cv2
img = cv2.imread("test.jpg")
# The third argument is sigmaX; 0 lets OpenCV derive it from the kernel size
gaussBlur = cv2.GaussianBlur(img, (5, 5), 0)
cv2.imshow("Gaussian Smoothing", np.hstack((img, gaussBlur)))
cv2.waitKey(0)
cv2.destroyAllWindows()
FIG: GAUSSIAN SMOOTHING
Mean Filtering Techniques
The idea of mean filtering is simply to replace each pixel value in an image with the mean
('average') value of its neighbours, including itself. This has the effect of eliminating pixel
values that are unrepresentative of their surroundings. Mean filtering is usually thought of as a
convolution filter. Like other convolutions, it is based around a kernel, which represents the
shape and size of the neighbourhood to be sampled when calculating the mean.
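A minimal NumPy sketch of the same idea is shown here on a tiny synthetic image. The helper name `mean_filter` and the edge-replicating border are assumptions for illustration. Note that the sum of the window is divided by the number of pixels in the window, so the weights add up to 1 and the overall brightness is preserved.

```python
import numpy as np

def mean_filter(img, k=3):
    """Replace each pixel with the mean of its k x k neighbourhood (edge-replicated)."""
    pad = k // 2
    padded = np.pad(img.astype(np.float64), pad, mode="edge")
    out = np.zeros(img.shape, dtype=np.float64)
    # Add up the k*k shifted copies of the image, then divide by the window size
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

img = np.zeros((5, 5), dtype=np.uint8)
img[2, 2] = 90                      # one unrepresentative bright pixel
smoothed = mean_filter(img, 3)
print(smoothed)
```

The single bright pixel is spread over its 3×3 neighbourhood, which is the "eliminating unrepresentative values" effect described above.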
import numpy as np
# import pandas as pd
import cv2
img = cv2.imread("test.jpg")
# 5x5 averaging kernel: the 25 weights sum to 1, so overall brightness is preserved
kernel = np.ones((5, 5), np.float32)/25
meanFilter = cv2.filter2D(img, -1, kernel)
cv2.imshow("Mean Filtered Image", np.hstack((img, meanFilter)))
cv2.waitKey(0)
cv2.destroyAllWindows()
FIG: MEAN FILTERED IMAGE
Median Filtering Techniques
Median filtering is a nonlinear process useful in reducing impulsive, or salt-and-pepper, noise. It
is also useful in preserving edges in an image while reducing random noise. Impulsive or salt-and-pepper noise can occur due to a random bit error in a communication channel. In a median
filter, a window slides along the image, and the median intensity value of the pixels within the
window becomes the output intensity of the pixel being processed.
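The sliding-window median can be sketched in a few lines of NumPy. The helper name `median_filter`, the 3×3 window and the edge-replicating border are assumptions for illustration; cv2.medianBlur is the practical choice for real images.

```python
import numpy as np

def median_filter(img, k=3):
    """Replace each pixel with the median of its k x k window (edge-replicated)."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.empty_like(img)
    for y in range(img.shape[0]):
        for x in range(img.shape[1]):
            out[y, x] = np.median(padded[y:y + k, x:x + k])
    return out

img = np.full((5, 5), 100, dtype=np.uint8)
img[1, 1] = 255   # "salt" pixel
img[3, 2] = 0     # "pepper" pixel
cleaned = median_filter(img, 3)
print(cleaned)
```

Because each window holds at most two outliers among nine values, the median ignores them completely, whereas a mean filter would smear them into the neighbours.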
import numpy as np
# import pandas as pd
import cv2
img = cv2.imread("test.jpg")
# Median Filter
medianFilter = cv2.medianBlur(img, 5)
cv2.imshow("Median Filter", np.hstack((img, medianFilter)))
cv2.waitKey(0)
cv2.destroyAllWindows()
FIG: MEDIAN FILTER
Frequency Band Filtering Techniques
Frequency filters process an image in the frequency domain. The image is Fourier transformed,
multiplied with the filter function and then re-transformed into the spatial domain. Attenuating
high frequencies results in a smoother image in the spatial domain, while attenuating low frequencies
enhances the edges.
All frequency filters can also be implemented in the spatial domain and, if there exists a simple
kernel for the desired filter effect, it is computationally less expensive to perform the filtering in
the spatial domain. Frequency filtering is more appropriate if no straightforward kernel can be
found in the spatial domain, and may also be more efficient.
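The transform, multiply and inverse-transform sequence can be sketched with NumPy's FFT routines. This is a minimal illustration under stated assumptions: an ideal (hard cut-off) circular filter, a hypothetical helper `frequency_filter`, a cut-off radius of 10 samples, and a random array standing in for a grayscale image.

```python
import numpy as np

def frequency_filter(img, radius, mode="low"):
    """Ideal low-pass or high-pass filter applied in the frequency domain."""
    rows, cols = img.shape
    # Forward FFT, with the zero-frequency term shifted to the centre
    spectrum = np.fft.fftshift(np.fft.fft2(img.astype(np.float64)))
    # Distance of every frequency sample from the centre of the spectrum
    y, x = np.ogrid[:rows, :cols]
    dist = np.sqrt((y - rows // 2) ** 2 + (x - cols // 2) ** 2)
    mask = dist <= radius            # pass band of the ideal low-pass filter
    if mode == "high":
        mask = ~mask                 # the high-pass filter is the complement
    # Multiply by the filter, undo the shift, and apply the inverse FFT
    filtered = np.fft.ifft2(np.fft.ifftshift(spectrum * mask))
    return np.real(filtered)

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64)).astype(np.float64)  # stand-in image
low = frequency_filter(img, radius=10, mode="low")
high = frequency_filter(img, radius=10, mode="high")
print(np.allclose(low + high, img))
```

Since the two masks are complementary, the low-pass and high-pass results add back up to the original image. For a real image, the array returned by cv2.imread("test.jpg", 0) could be passed in place of the random data.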
For High-Pass Filter:
import numpy as np
# import pandas as pd
import cv2
img = cv2.imread("test.jpg")
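# Low-pass version of the image (sigmaX=0 lets OpenCV derive sigma from the kernel size)
gaussBlur = cv2.GaussianBlur(img, (5, 5), 0)
# Subtract in a signed type so that negative differences do not wrap around in uint8
highPass = img.astype(np.int16) - gaussBlur.astype(np.int16)
# Shift by 127 so that "no change" maps to mid-gray, then clip back to the uint8 range
highPass = np.clip(highPass + 127, 0, 255).astype(np.uint8)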
cv2.imshow("High Pass", np.hstack((img, highPass)))
cv2.waitKey(0)
cv2.destroyAllWindows()
FIG: HIGH PASS
For Low-Pass Filter:
import numpy as np
# import pandas as pd
import cv2
img = cv2.imread("test.jpg")
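# 5x5 averaging kernel whose weights sum to 1
kernel = np.ones((5, 5), np.float32)/25
# The averaged image is itself the low-pass result (subtracting it from img would give a high-pass image)
lowPass = cv2.filter2D(img, -1, kernel)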
cv2.imshow("Low Pass", np.hstack((img, lowPass)))
cv2.waitKey(0)
cv2.destroyAllWindows()
FIG: LOW PASS
Lab 11
Objective:
Write a program in Python for edge detection using different edge detection masks.
Theory:
Edge Detection Using OpenCV:
Edge detection is an image-processing technique used to identify the boundaries
(edges) of objects or regions within an image. Edges are among the most important features
associated with images. We learn about the underlying structure of an image through its
edges. Computer vision processing pipelines therefore use edge detection extensively in
applications.
FIG: Edge Detection Using OpenCV
The first step is to read in the image, using the imread() function in OpenCV. Here, we read in
the color image as a grayscale image because you do not need color information to detect edges.
After reading the image, we also blur it, using the GaussianBlur() function. This is done to
reduce the noise in the image.
In edge detection, numerical derivatives of the pixel intensities have to be computed, and this
typically results in ‘noisy’ edges. In other words, the intensity of neighboring pixels in an image
(especially near edges) can fluctuate quite a bit, giving rise to edges that don’t represent the
predominant edge structure we are looking for.
Blurring smoothens the intensity variation near the edges, making it easier to identify the
predominant edge structure within the image. You can refer to the OpenCV
documentation page for more details on the GaussianBlur() function. We supply the size of the
convolution kernel (in this case a 3×3 kernel), which specifies the degree of blurring.
Sobel Edge Detection
Sobel Edge Detection is one of the most widely used algorithms for edge detection. The Sobel
Operator detects edges that are marked by sudden changes in pixel intensity, as shown in the
figure below.
FIG: Pixel intensity as a function of t
The rise in intensity is even more evident, when we plot the first derivative of the intensity
function.
FIG: First Derivative of Pixel intensity as a function of t
The above plot demonstrates that edges can be detected in areas where the gradient is higher than
a particular threshold value. In addition, a sudden change in the derivative will reveal a change in
the pixel intensity as well. With this in mind, we can approximate the derivative, using a 3×3
kernel. We use one kernel to detect sudden changes in pixel intensity in the X direction, and
another in the Y direction.
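The "derivative above a threshold" idea can be tried on a single row of pixel values before moving to 2-D kernels. The sample intensities and the threshold of 20 are assumptions chosen for this toy signal.

```python
import numpy as np

# Intensity along one image row: dark region, a ramp (the edge), bright region
intensity = np.array([10, 10, 10, 12, 60, 120, 125, 125, 125], dtype=np.float64)

# Central-difference approximation of the first derivative
gradient = np.zeros_like(intensity)
gradient[1:-1] = (intensity[2:] - intensity[:-2]) / 2.0

threshold = 20.0  # illustrative value, chosen for this toy signal
edge_positions = np.flatnonzero(np.abs(gradient) > threshold)
print(edge_positions)
```

Only the positions on the ramp exceed the threshold; the flat dark and bright regions have a derivative close to zero.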
These are the kernels used for Sobel Edge Detection:
X-Direction Kernel
Y-Direction Kernel
When these kernels are convolved with the original image, you get a ‘Sobel edge image’.
• If we use only the Vertical Kernel, the convolution yields a Sobel image, with edges
enhanced in the X-direction.
• Using the Horizontal Kernel yields a Sobel image, with edges enhanced in the Y-direction.
Let $G_x$ and $G_y$ represent the intensity gradient in the $x$ and $y$ directions respectively.
If $A$ and $B$ denote the X and Y kernels defined above:
$G_x = A * I$, $G_y = B * I$
where $*$ denotes the convolution operator, and $I$ represents the input image. The final
approximation of the gradient magnitude, $G$, can be computed as:
$G = \sqrt{G_x^2 + G_y^2}$
And the orientation of the gradient can then be approximated as:
$\Theta = \arctan(G_y / G_x)$
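The gradient magnitude and orientation can be checked numerically on a toy image. This sketch makes several assumptions: the kernel values follow one common Sobel sign convention (conventions vary between references), the helper `filter3x3` slides the kernel without flipping it (correlation, as OpenCV's filtering functions do), and the border is handled by edge replication.

```python
import numpy as np

# Sobel kernels in one common sign convention (an assumption; conventions vary)
KX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
KY = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=np.float64)

def filter3x3(img, kernel):
    """Slide a 3x3 kernel over the image (correlation, edge-replicated border)."""
    padded = np.pad(img.astype(np.float64), 1, mode="edge")
    out = np.zeros(img.shape, dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out

# Toy image with a vertical step edge: dark left half, bright right half
img = np.zeros((6, 6), dtype=np.float64)
img[:, 3:] = 100.0

gx = filter3x3(img, KX)                 # responds to changes along x
gy = filter3x3(img, KY)                 # responds to changes along y
magnitude = np.sqrt(gx**2 + gy**2)      # gradient magnitude
orientation = np.arctan2(gy, gx)        # gradient direction in radians
print(magnitude)
```

Only the two columns next to the step produce a non-zero magnitude, and the Y response is zero everywhere because the image does not change from row to row. np.arctan2 is used instead of a plain arctangent of the ratio so that a zero X gradient does not cause a division by zero.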
In the code example below, we use the Sobel() function to compute:
• the Sobel edge image individually, in both directions (x and y),
• the composite gradient in both directions (xy)
The following is the syntax for applying Sobel edge detection using OpenCV:
Sobel(src, ddepth, dx, dy)
The parameter ddepth specifies the precision of the output image, while dx and dy specify the
order of the derivative in each direction. For example:
• If dx=1 and dy=0, we compute the 1st derivative Sobel image in the x-direction.
• If both dx=1 and dy=1, we compute the mixed derivative (1st order in both the x and y directions).
CODE:
import cv2
# Read the original image
img = cv2.imread('test.jpg')
# Display original image
cv2.imshow('Original', img)
cv2.waitKey(0)
# Convert to grayscale
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Blur the image for better edge detection
img_blur = cv2.GaussianBlur(img_gray, (3, 3), 0)
# Sobel Edge Detection
sobelx = cv2.Sobel(src=img_blur, ddepth=cv2.CV_64F, dx=1,
dy=0, ksize=5) # Sobel Edge Detection on the X axis
sobely = cv2.Sobel(src=img_blur, ddepth=cv2.CV_64F, dx=0,
dy=1, ksize=5) # Sobel Edge Detection on the Y axis
# Combined X and Y Sobel Edge Detection
sobelxy = cv2.Sobel(src=img_blur, ddepth=cv2.CV_64F, dx=1, dy=1, ksize=5)
# Display Sobel Edge Detection Images
cv2.imshow('Sobel X', sobelx)
cv2.waitKey(0)
cv2.imshow('Sobel Y', sobely)
cv2.waitKey(0)
cv2.imshow('Sobel X Y using Sobel() function', sobelxy)
cv2.waitKey(0)
cv2.destroyAllWindows()
FIG: ORIGINAL IMAGE
FIG: SOBEL X
FIG: SOBEL Y
FIG: SOBEL X Y USING SOBEL() FUNCTION
Canny Edge Detection:
Canny Edge Detection is one of the most popular edge-detection methods in use today because it
is so robust and flexible. The algorithm itself follows a three-stage process for extracting edges
from an image. Add to it image blurring, a necessary preprocessing step to reduce noise. This
makes it a four-stage process, which includes:
• Noise Reduction
• Calculating Intensity Gradient of the Image
• Suppression of False Edges
• Hysteresis Thresholding
CODE:
import cv2
# Read the original image
img = cv2.imread('umar.jpg')
# Display original image
cv2.imshow('Original', img)
cv2.waitKey(0)
# Convert to grayscale
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Blur the image for better edge detection
img_blur = cv2.GaussianBlur(img_gray, (3, 3), 0)
# Canny Edge Detection
edges = cv2.Canny(image=img_blur, threshold1=100,
threshold2=200) # Canny Edge Detection
# Display Canny Edge Detection Image
cv2.imshow('Canny Edge Detection', edges)
cv2.waitKey(0)
cv2.destroyAllWindows()
FIG: CANNY EDGE DETECTION
Lab 12
Objective:
Write program for Feature Detection
• Face Detection
▪ Single Face
▪ Group Face
• Webcam use
Theory:
Face Detection
First of all, make sure you have OpenCV installed. You can install it using pip:
pip install opencv-python
Face detection using Haar cascades is a machine learning based approach where a cascade
function is trained with a set of input data. OpenCV already contains many pre-trained classifiers
for face, eyes, smiles, etc. In this lab, we will be using the face classifier.
You need to download the trained classifier XML file (haarcascade_frontalface_default.xml),
which is available in OpenCV’s GitHub repository.
https://github.com/opencv/opencv/tree/master/data/haarcascades
Save it to your working location.
A few things to note:
• The detection works only on grayscale images. So it is important to convert the color
image to grayscale.
• detectMultiScale function is used to detect the faces. It takes 3 arguments:
▪ The input image,
▪ scaleFactor specifies how much the image size is reduced with each scale.
▪ minNeighbors specifies how many neighbors each candidate rectangle should
have to retain it. You may have to tweak these values to get the best results.
• faces contains a list of coordinates for the rectangular regions where faces were found.
We use these coordinates to draw the rectangles in our image.
Program I:
Face Detection (single + group) using OpenCV & Python
Code:
import cv2
# Load the cascade
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
# Read the input image
img = cv2.imread('image_name.jpg')
# Convert into grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Detect faces
faces = face_cascade.detectMultiScale(gray, 1.1, 4)
# Draw rectangle around the faces
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x+w, y+h), (255, 0, 0), 2)
# Display the output
cv2.imshow('img', img)
cv2.waitKey()
Input:
FIG: FACE 1
FIG: FACE 2
FIG: GROUP PHOTO (MULTIPLE FACES)
Output:
FIG: DETECTED FACE 1
FIG: DETECTED FACE 2
FIG: DETECTED MULTIPLE FACES IN GROUP PHOTO
Similarly, we can detect faces in videos. As you know videos are basically made up of frames,
which are still images. So, we perform the face detection for each frame in a video.
Program II:
Face Detection (webcam) using OpenCV & Python
Code:
import cv2
cascPath = "haarcascade_frontalface_default.xml"
faceCascade = cv2.CascadeClassifier(cascPath)
video_capture = cv2.VideoCapture(0)
while True:
    # Capture frame-by-frame
    ret, frame = video_capture.read()
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = faceCascade.detectMultiScale(
        gray,
        scaleFactor=1.1,
        minNeighbors=5,
        minSize=(30, 30),
        flags=cv2.CASCADE_SCALE_IMAGE
    )
    # Draw a rectangle around the faces
    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y), (x+w, y+h), (0, 255, 0), 2)
    # Display the resulting frame
    cv2.imshow('Video', frame)
    # Press 'q' to leave the loop
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
# When everything is done, release the capture
video_capture.release()
cv2.destroyAllWindows()
Output:
FIG: FACE DETECTION USING WEBCAM
The only difference is that we use an infinite loop to loop through each frame in the video. We
use video_capture.read() to read each frame. The first value returned is a flag that indicates whether the frame
was read correctly. We don’t need it here. The second value returned is the still frame on which
we will be performing the detection.
Lab 13
Objective:
Write and execute programs for the image morphological operations erosion, dilation, opening and
closing in Python
• Erosion
• Dilation
• Opening and Closing
• Hit-or-Miss Transformation
• Skeleton
• Hole Filling
• Boundary Extraction
• Convex Hull
Theory:
Morphological operations
Morphological operations are a set of operations that process images based on shapes. They
apply a structuring element to an input image and generate an output image.
The two most basic morphological operations are:
• Erosion
• Dilation
Basics of Erosion:
• Erodes away the boundaries of the foreground object
• Used to diminish the features of an image.
Working of erosion:
1. A kernel (a matrix of odd size, e.g. 3, 5 or 7) is convolved with the image.
2. A pixel in the original image (either 1 or 0) will be considered 1 only if all the pixels
under the kernel are 1, otherwise, it is eroded (made to zero).
3. Thus, all the pixels near the boundary will be discarded depending upon the size of the
kernel.
4. So, the thickness or size of the foreground object decreases or simply the white region
decreases in the image.
Basics of dilation:
• Increases the object area
• Used to accentuate features
Working of dilation:
1. A kernel (a matrix of odd size, e.g. 3, 5 or 7) is convolved with the image.
2. A pixel element in the original image is ‘1’ if at least one pixel under the kernel is ‘1’.
3. It increases the white region in the image, i.e. the size of the foreground object increases.
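The two rules above ("all pixels under the kernel are 1" for erosion, "at least one pixel is 1" for dilation) can be written out directly in NumPy. This is a minimal sketch for binary 0/1 images with a square kernel; the helper names are hypothetical, and the image border is treated as 0 here, which is an assumption that differs from the default border handling of cv2.erode.

```python
import numpy as np

def erode(img, k=3):
    """Binary erosion: a pixel stays 1 only if every pixel under the k x k kernel is 1."""
    pad = k // 2
    padded = np.pad(img, pad, mode="constant", constant_values=0)
    out = np.ones_like(img)
    for dy in range(k):
        for dx in range(k):
            out &= padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out

def dilate(img, k=3):
    """Binary dilation: a pixel becomes 1 if at least one pixel under the kernel is 1."""
    pad = k // 2
    padded = np.pad(img, pad, mode="constant", constant_values=0)
    out = np.zeros_like(img)
    for dy in range(k):
        for dx in range(k):
            out |= padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out

img = np.zeros((7, 7), dtype=np.uint8)
img[2:5, 2:5] = 1          # a 3x3 white square on a black background

eroded = erode(img)        # only the centre pixel survives
dilated = dilate(img)      # the square grows to 5x5
print(eroded.sum(), dilated.sum())
```

The white area shrinks from 9 pixels to 1 under erosion and grows to 25 under dilation, matching the description of the foreground object becoming thinner or thicker.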
Code
import cv2
import numpy as np
# Reading the input image
img = cv2.imread('test.jpg', 0)
kernel = np.ones((5, 5), np.uint8)
img_erosion = cv2.erode(img, kernel, iterations=1)
img_dilation = cv2.dilate(img, kernel, iterations=1)
cv2.imshow('Input', img)
cv2.imshow('Erosion', img_erosion)
cv2.imshow('Dilation', img_dilation)
cv2.waitKey(0)
Input:
FIG: INPUT IMAGE
Outputs:
Dilated Image
FIG: DILATED IMAGE
Eroded Image:
FIG: ERODED IMAGE
Opening and Closing
Opening is similar to erosion in that it tends to remove bright foreground pixels from the edges
of regions of foreground pixels. It is defined as an erosion followed by a dilation using the same
structuring element. The effect of the operator is to preserve foreground regions that
have a similar shape to the structuring element, or that can completely contain the structuring
element, while removing all other regions of foreground pixels. The opening operation is
used for removing noise in an image.
Closing is the counterpart of the opening operation: the basic premise is that
closing is opening performed in reverse. It is defined simply as a dilation followed by an
erosion using the same structuring element used in the opening operation.
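Opening and closing can be composed from erosion and dilation, as sketched below in NumPy for binary 0/1 images. The helper names, the 3×3 square structuring element and the border values are assumptions for illustration; in practice cv2.morphologyEx does this work.

```python
import numpy as np

def _morph(img, k, op, init, border):
    """Combine the k*k shifted copies of img with op (np.minimum or np.maximum)."""
    pad = k // 2
    padded = np.pad(img, pad, mode="constant", constant_values=border)
    out = np.full_like(img, init)
    for dy in range(k):
        for dx in range(k):
            out = op(out, padded[dy:dy + img.shape[0], dx:dx + img.shape[1]])
    return out

def erode(img, k=3):
    return _morph(img, k, np.minimum, 1, 1)   # border of 1s does not erode the image

def dilate(img, k=3):
    return _morph(img, k, np.maximum, 0, 0)   # border of 0s does not grow the image

def opening(img, k=3):
    return dilate(erode(img, k), k)           # erosion followed by dilation

def closing(img, k=3):
    return erode(dilate(img, k), k)           # dilation followed by erosion

square = np.zeros((10, 10), dtype=np.uint8)
square[3:8, 3:8] = 1

noisy = square.copy()
noisy[0, 0] = 1            # isolated bright speck outside the object
holed = square.copy()
holed[5, 5] = 0            # small dark hole inside the object

opened = opening(noisy)    # the speck is removed, the square survives
closed = closing(holed)    # the hole is filled, the square keeps its size
print(np.array_equal(opened, square), np.array_equal(closed, square))
```

Opening removes bright details smaller than the structuring element, while closing fills dark gaps smaller than it; in both cases the large square is left unchanged.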
import cv2
import numpy as np
# Reading the input image
img = cv2.imread('watermark.jpg', 0)
kernel = np.ones((5, 5), np.uint8)
opening = cv2.morphologyEx(img, cv2.MORPH_OPEN, kernel)
closing = cv2.morphologyEx(img, cv2.MORPH_CLOSE, kernel)
cv2.imshow('Input', img)
cv2.imshow('Opening', opening)
cv2.imshow('Closing', closing)
cv2.waitKey(0)
FIG: OPENING
FIG: CLOSING
The Hit-or-Miss Transformation
The Hit-or-Miss transform is a basic tool for shape detection. The objective is to find the location
of one of the shapes in an image. The small window, W, is assumed to be at least one pixel thicker than the object, so that the object is surrounded by background. In some applications, however, we may be interested in detecting certain
patterns, in which case a background is not required.
Region Filling
Beginning with a point p inside the boundary, the objective is to fill the entire region with 1’s by
iteratively applying dilation. By adding the intelligence to detect a black inner point of a sphere, we
can use region filling to fill the sphere so that it is completely white.
Boundary Extraction
The boundary of a set A can be obtained by first eroding A with a structuring element and then
taking the set difference between A and its erosion.
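Region filling and boundary extraction can both be built from the erosion and dilation primitives. The sketch below is a NumPy-only illustration under stated assumptions: boundary extraction is taken in its common form (the object minus its erosion), region filling grows a seed point with a cross-shaped dilation that is masked by the complement of the boundary at every step, and the helper names `erode`, `dilate_cross`, `boundary` and `fill_region` are hypothetical.

```python
import numpy as np

def erode(img):
    """3x3 binary erosion with a zero border."""
    padded = np.pad(img, 1, mode="constant", constant_values=0)
    out = np.ones_like(img)
    for dy in range(3):
        for dx in range(3):
            out &= padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out

def dilate_cross(img):
    """Binary dilation with a cross-shaped (4-neighbour) structuring element."""
    p = np.pad(img, 1, mode="constant", constant_values=0)
    h, w = img.shape
    return (p[1:h + 1, 1:w + 1] | p[0:h, 1:w + 1] | p[2:h + 2, 1:w + 1]
            | p[1:h + 1, 0:w] | p[1:h + 1, 2:w + 2])

def boundary(A):
    """Boundary extraction: pixels of A that are removed by one erosion."""
    return A & (1 - erode(A))

def fill_region(A, seed):
    """Region filling: grow from a seed inside the boundary A, never crossing A."""
    X = np.zeros_like(A)
    X[seed] = 1
    while True:
        X_next = dilate_cross(X) & (1 - A)   # dilate, then mask with the complement of A
        if np.array_equal(X_next, X):
            return X | A                     # filled interior plus the boundary itself
        X = X_next

solid = np.zeros((7, 7), dtype=np.uint8)
solid[1:6, 1:6] = 1                  # a solid 5x5 square

edge = boundary(solid)               # one-pixel-thick outline of the square
filled = fill_region(edge, (3, 3))   # start from a point inside the outline
print(edge.sum(), np.array_equal(filled, solid))
```

Extracting the boundary and then filling it again returns the original solid square, which is a convenient sanity check for both operations.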
Convex Hull
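A set A is said to be convex if the straight-line segment joining any two points in A lies entirely within
A. The convex hull of a set S is the smallest convex set that contains S. The hull is obtained by an iterative procedure that is repeated until it converges (“conv” denotes convergence).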
Lab 14
Objective:
Open CV Project “Document Scanner”
• How to make a Document Scanner
Theory:
Making a Document Scanner:
The steps that we need to follow to build this project are:
• Convert the image to grayscale
• Find the edges in the image
• Use the edges to find all the contours
• Select only the contours of the document
• Apply warp perspective to get the top-down view of the document
Load the Image
Create a new file inside the document-scanner directory, name it scanner.py and put the
following code:
from imutils.perspective import four_point_transform
import cv2
height = 800
width = 600
green = (0, 255, 0)
image = cv2.imread("h1.jpg")
image = cv2.resize(image, (width, height))
orig_image = image.copy()
We start by importing the OpenCV library and the four_point_transform helper function from
the imutils package.
This function will help us perform a 4-point perspective transform to obtain the top-down view
of the document.
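A 4-point perspective transform needs the four document corners in a consistent order and a size for the output image. The sketch below shows one common way to prepare these with NumPy; it is not necessarily how the imutils helper is implemented, and the names `order_points` and `output_size` as well as the sample corner coordinates are assumptions for illustration.

```python
import numpy as np

def order_points(pts):
    """Order four (x, y) corners as top-left, top-right, bottom-right, bottom-left."""
    pts = np.asarray(pts, dtype=np.float32)
    s = pts.sum(axis=1)            # x + y: smallest at top-left, largest at bottom-right
    d = pts[:, 1] - pts[:, 0]      # y - x: smallest at top-right, largest at bottom-left
    return np.array([pts[np.argmin(s)], pts[np.argmin(d)],
                     pts[np.argmax(s)], pts[np.argmax(d)]], dtype=np.float32)

def output_size(ordered):
    """Width and height of the flattened document, from its longest opposite sides."""
    tl, tr, br, bl = ordered
    width = int(max(np.linalg.norm(br - bl), np.linalg.norm(tr - tl)))
    height = int(max(np.linalg.norm(tr - br), np.linalg.norm(tl - bl)))
    return width, height

# Hypothetical corner coordinates of a document found in a photo, in arbitrary order
corners = [(400, 90), (60, 100), (420, 600), (50, 580)]
ordered = order_points(corners)
width, height = output_size(ordered)
print(ordered, width, height)
```

The ordered corners and the target rectangle of size width × height could then be passed to cv2.getPerspectiveTransform and cv2.warpPerspective to produce the top-down view.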
Next, we set the height and width of the image so that we can resize it, and we also create
the green variable for the contour display later on.
To load an image with OpenCV we use the cv2.imread() function, which takes the path of the image
as its argument.
Note that this function doesn’t throw an error if the image path is wrong; it simply
returns None.
To resize the image, we use the cv2.resize() function. The first argument is the image we
want to resize, and the second is the width and height of the new image.
The function has a third argument which defines the algorithm used for the resizing (the default
one is cv2.INTER_LINEAR).
Lastly, we take a copy of our image. This will allow us later to display the contours of the
document on the original image rather than the modified image.
Image Processing
Now we start preprocessing our image by converting it to grayscale, blurring it, and then finding
the edges in the image. Let's see how to do it:
# convert the image to gray scale
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (5, 5), 0) # Add Gaussian blur
edged = cv2.Canny(blur, 75, 200) # Apply the Canny algorithm to find the edges
# Show the image and the edges
cv2.imshow('Original image:', image)
cv2.imshow('Edged:', edged)
cv2.waitKey(0)
cv2.destroyAllWindows()
Now that our image is loaded, we start by converting it from BGR color (the channel order OpenCV uses when loading images) to grayscale.
Next, to remove noise from the image, we smooth it by using the cv2.GaussianBlur function.
The first argument is the image we want to blur. The second argument is the width and height of
the kernel, which must be positive and odd.
The last argument is the standard deviation. If we set it to 0, OpenCV calculates it from the kernel
size.
Lastly, we apply the well-known Canny edge detector. This is a multi-stage algorithm that is used to
remove noise and detect edges in the image.
The first argument is our input image. The second and third argument are the thresholds that the
algorithm uses to determine the edges and non-edges in the image.
We used the cv2.imshow function to display our images in a window.
The cv2.waitKey(delay) function will wait for a pressed key for delay milliseconds if delay is
positive. Otherwise, it will wait infinitely for a pressed key.
The cv2.destroyAllWindows() function simply destroys all the windows we created.
Below you can see the output that we get:
FIG: FOUND EDGES OF THE IMAGE
Use the Edges to Find all the Contours
Now we can use our edged image to find all the contours.
from imutils.perspective import four_point_transform
import cv2
height = 800
width = 600
green = (0, 255, 0)
image = cv2.imread("h1.jpg")
image = cv2.resize(image, (width, height))
orig_image = image.copy()
# convert the image to gray scale
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (5, 5), 0) # Add Gaussian blur
edged = cv2.Canny(blur, 75, 200) # Apply the Canny algorithm to find the edges
# If you are using OpenCV v3, v4-pre, or v4-alpha,
# cv2.findContours returns a tuple with 3 elements instead of 2,
# where `contours` is the second one.
# In OpenCV v2.4, v4-beta, and v4-official,
# the function returns a tuple with 2 elements.
contours, _ = cv2.findContours(edged, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
contours = sorted(contours, key=cv2.contourArea, reverse=True)
# Show the image and all the contours
cv2.imshow("Image", image)
cv2.drawContours(image, contours, -1, green, 3)
cv2.imshow("All contours", image)
cv2.waitKey(0)
cv2.destroyAllWindows()
To find the contours on the image we apply the cv2.findContours function. This function takes
three arguments, the first one is the source image.
The second parameter is the contour retrieval mode. Here we are using cv2.RETR_LIST to
retrieve all the contours. Please refer to the documentation for the other options.
The last argument represents the contour approximation method. For example, if we set it
to cv2.CHAIN_APPROX_NONE, the function will store all the (x, y) coordinates of a contour.
But do we really need that?
For example, for a rectangle contour, we only need 4 points.
That's why we used the cv2.CHAIN_APPROX_SIMPLE. This will allow us to save memory
by keeping only the important points.
Note that since OpenCV 3.2 this function does not change the source image.
The drawContours function allows us to draw contours on an image. The first argument is the
source image, then we need to pass it the contours that we want to draw.
The third argument is to indicate which contour we want to draw, a negative value means draw
all the contours.
The fourth parameter is the color of the contour and the fifth one is the thickness.
Let's see what we get so far:
FIG: USE EDGES TO FIND ALL CONTOURS