Object Detection using Python OpenCV


We started with learning the basics of OpenCV, then performed some basic image processing and manipulation on images, followed by image segmentation and many other operations using OpenCV and the Python language. Here, in this section, we will perform some simple object detection techniques using template matching. We will find an object in an image and then describe its features. Features are the common attributes of an image, such as corners and edges. We will also take a look at some common and popular object detection algorithms such as SIFT, SURF, FAST, BRIEF & ORB.

As told in the previous tutorials, OpenCV is an Open Source Computer Vision Library which has C++, Python and Java interfaces and supports Windows, Linux, Mac OS, iOS and Android. So it can easily be installed on a Raspberry Pi with a Python and Linux environment. And a Raspberry Pi with OpenCV and an attached camera can be used to create many real-time image processing applications like face detection, face lock, object tracking, car number plate detection, home security systems etc.

Object detection and recognition form the most important use cases for computer vision; they are used to do powerful things such as:

  • Labelling scenes
  • Robot Navigation
  • Self-driving cars
  • Body recognition (Microsoft Kinect)
  • Disease and cancer detection
  • Facial recognition
  • Handwriting recognition
  • Identifying objects in satellite images

 

Object Detection VS Recognition

Object recognition is the second level of object detection, in which the computer is able to pick out one object from among multiple objects in an image and may be able to identify it.

Now, we will perform some image processing functions to find an object from an image.

 

Finding an Object from an Image

Here we will use template matching to find a character/object in an image, using OpenCV's cv2.matchTemplate() function.

import cv2
import numpy as np

 

Load input image and convert it into gray

image = cv2.imread('WaldoBeach.jpg')
cv2.imshow('people', image)
cv2.waitKey(0)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

 

Load the template image

template = cv2.imread('waldo.jpg', 0)
# Result of template matching of object over an image
result = cv2.matchTemplate(gray, template, cv2.TM_CCOEFF)
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(result)

 

Create bounding box

top_left = max_loc
# Draw a 50 x 50 pixel bounding rectangle starting from the top-left match location
bottom_right = (top_left[0] + 50, top_left[1] + 50)
cv2.rectangle(image, top_left, bottom_right, (0, 255, 0), 5)

cv2.imshow('object found',image)
cv2.waitKey(0)
cv2.destroyAllWindows()

 

Find Object using OpenCV and Python

 

Object Found using OpenCV and Python

 

In cv2.matchTemplate(gray, template, cv2.TM_CCOEFF), we input the grayscale image in which to find the object and the template itself. The template matching method is then applied to find the object in the image; here cv2.TM_CCOEFF is used.

The function returns an array, which is stored in result; this is the output of the template matching procedure.

We then use cv2.minMaxLoc(result), which gives the coordinates where the best match was found in the image. Once we have those coordinates, we draw a rectangle over the match, stretching the dimensions of the box a little so the object fits easily inside.
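As an alternative to the fixed 50-pixel box, a minimal sketch that sizes the rectangle from the template itself (template is the grayscale image loaded earlier):

# Sketch: size the bounding box from the template dimensions instead of a fixed 50 pixels
h, w = template.shape   # grayscale image, so shape is (rows, cols)
bottom_right = (top_left[0] + w, top_left[1] + h)
cv2.rectangle(image, top_left, bottom_right, (0, 255, 0), 5)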

 

There are a variety of methods to perform template matching; in this case we are using cv2.TM_CCOEFF, which stands for correlation coefficient.

cv2.matchTemplate takes the template as a “sliding window” and slides it over the image from left to right and top to bottom, one pixel at a time. Then for each location, the correlation coefficient is computed to determine how “good” or “bad” the match is.

Regions with sufficiently high correlation can be considered matches; from there, all we need is a call to cv2.minMaxLoc to find the single best match, or a threshold over the result array to find all good matches, as sketched below.
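As a sketch of that thresholding idea (assuming the normalized method cv2.TM_CCOEFF_NORMED, whose scores lie in [-1, 1], and an arbitrary 0.8 threshold you would tune per image):

result = cv2.matchTemplate(gray, template, cv2.TM_CCOEFF_NORMED)
h, w = template.shape
# np.where returns the (row, col) indices of every score above the threshold
locations = np.where(result >= 0.8)
for y, x in zip(*locations):
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)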

 

Feature Description Theory

In template matching we slide a template image across a source image until a match is found. But it is not the best method for object recognition, as it has severe limitations; the method isn't very resilient.

The following factors make template matching a bad choice for object detection:

  • Rotation renders this method ineffective.
  • Size changes (known as scaling) affect it as well.
  • Photometric changes (e.g. brightness, contrast, hue etc.) degrade it.
  • Distortion from viewpoint changes (affine transformations) does too.

One solution to this problem is image features.

Image features are interesting areas of an image that are somewhat unique to that specific image. They are also called key point features or interest points.

Getting Image Features using OpenCV and Python

 

The sky is an uninteresting feature, whereas certain keypoints (marked with red circles) can be used for detection of the above image (interesting features). The image shown above clearly shows the difference between interesting and uninteresting features.

 

Importance of feature detection

Features are important as they can be used to analyze, describe and match images. They have extensive use in the following (a short matching sketch follows this list):

  • Image alignment – e.g. panorama stitching (finding corresponding matches so we can stitch images together)
  • 3D reconstruction
  • Robot navigation
  • Object recognition
  • Motion tracking
  • And more!
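As a minimal sketch of the "match images" use case, here is how two overlapping photos can be matched with ORB descriptors (ORB is introduced later in this article) and a brute-force matcher; the file names paris_1.jpg and paris_2.jpg are hypothetical, any two views of the same scene would do:

import cv2

# Sketch: match features between two views of the same scene
img1 = cv2.imread('paris_1.jpg', 0)   # hypothetical file names
img2 = cv2.imread('paris_2.jpg', 0)

orb = cv2.ORB_create()
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Hamming distance suits ORB's binary descriptors; crossCheck keeps only mutual best matches
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(bf.match(des1, des2), key=lambda m: m.distance)

# Draw the 20 best matches side by side
matched = cv2.drawMatches(img1, kp1, img2, kp2, matches[:20], None, flags=2)
cv2.imshow('Feature Matching', matched)
cv2.waitKey(0)
cv2.destroyAllWindows()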

 

Finding Corners of Image using OpenCV and Python

 

Image Stitching using OpenCV and Python

 

What defines the interest points?

Interesting areas carry a lot of distinct and unique information. Typically, they are areas of high change in intensity, such as corners and edges. But always be careful, as noise can appear “informative” when it is not! So try blurring the image to reduce noise, as sketched below.
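As a minimal sketch (assuming any input image; the (5, 5) kernel is just a typical starting value), smoothing with cv2.GaussianBlur before running a detector suppresses noise-driven “features”:

import cv2

# Sketch: blur before feature detection so noise is not mistaken for structure
image = cv2.imread('chess.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)   # (5, 5) kernel, sigma derived automatically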

Edge Detection using OpenCV and Python

 

Characteristic of Good or Interesting Features

Repeatable – They can be found in multiple pictures of the same scene.

Distinctive – Each feature is somewhat unique and different to other features of the same scene.

Compactness/Efficiency – Significantly less features than pixels in the image.

Locality – Feature occupies a small area of the image and is robust to clutter and occlusion.

Finding Features of Image using OpenCV and Python

 

Corners as features

Corners are identified when shifting a window in any direction over that point gives a large change in intensity.

Corner Detection using OpenCV and Python

 

Corners are not the best features for identifying images in every situation, but they certainly have good use cases which make them handy.

So to identify corners in your image, imagine a green window that we slide over the black image in which we want to find corners. When we move the window within a flat region, there is no change in intensity, and hence no corner is identified.

When we move the window along an edge, there is a change of intensity in one direction only, so it is an edge, not a corner.

When we move the window over a corner, there is a change in intensity no matter which direction we move it, and this is identified as a corner.

So let's identify corners with the help of the Harris Corner Detection algorithm, developed in 1988, which works fairly well for corner detection.

 

The following OpenCV function is used for the detection of the corners.

cv2.cornerHarris(input image, block size, ksize, k)

Input image – should be grayscale and of float32 type.

blockSize – the size of the neighborhood considered for corner detection.

ksize – aperture parameter of the Sobel derivative used.

k – Harris detector free parameter in the equation.

Output – an array of corner response values the same size as the input image; strong corners are the locations where the response is high.

Also, an important thing to note is that the Harris corner detection algorithm requires a float32 image, i.e. the grayscale image must be converted to float32 before being passed in.

import cv2
import numpy as np

 

Load image then grayscale

image = cv2.imread('chess.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

 

The cornerHarris function requires the array datatype to be float32

gray = np.float32(gray)
harris_corners = cv2.cornerHarris(gray, 3, 3, 0.05)

 

We use dilation of the corner points to enlarge them

kernel = np.ones((7,7),np.uint8)
harris_corners = cv2.dilate(harris_corners, kernel, iterations = 2)

 

Threshold for an optimal value, it may vary depending on the image

image[harris_corners > 0.025 * harris_corners.max() ] = [255, 127, 127]

cv2.imshow('Harris Corners', image)
cv2.waitKey(0)
cv2.destroyAllWindows()

 

cv2.cornerHarris returns the corner response map, so to visualize these tiny locations we use dilation to add pixels around the corners. To enlarge the corners we run the dilation twice. And then we do some thresholding to change the colors of the strongest corner pixels.

Harris Corner Detection using OpenCV and Python

 

The following function can be used for the same purpose, with the parameters mentioned below (by default it implements the Shi-Tomasi "good features to track" detector):

cv2.goodFeaturesToTrack(input image, maxCorners, qualityLevel, minDistance)

 

  • Input Image - 8-bit or floating-point 32-bit, single-channel image.
  • maxCorners – Maximum number of corners to return. If more corners than this are found, the strongest of them are returned.
  • qualityLevel – Parameter characterizing the minimal accepted quality of image corners. The parameter value is multiplied by the best corner quality measure (smallest eigenvalue). Corners with a quality measure less than the product are rejected. For example, if the best corner has a quality measure of 1500 and qualityLevel = 0.01, then all corners with a quality measure less than 15 are rejected.
  • minDistance – Minimum possible Euclidean distance between the returned corners.

 

import cv2
import numpy as np

img = cv2.imread('chess.jpg')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)

 

We specify the top 100 corners

corners = cv2.goodFeaturesToTrack(gray, 100, 0.01, 15)

for corner in corners:
    x, y = corner[0]
    x, y = int(x), int(y)
    cv2.rectangle(img, (x - 10, y - 10), (x + 10, y + 10), (0, 255, 0), 2)

cv2.imshow("Corners Found", img)
cv2.waitKey()
cv2.destroyAllWindows()

 

Like the previous method, it returns an array of corner locations, so we iterate through each corner position and plot a rectangle over it.

Detected Corners using OpenCV and Python

 

Problems with corners as features

Corner matching (and corner detection in general) has no problem when the image is:

• Rotated
• Translated (i.e. shifted within the image)
• Subject to slight photometric changes, e.g. brightness or affine intensity changes

However, it is intolerant of:

• Large changes in intensity or photometric changes
• Scaling (i.e. enlarging or shrinking) – the small demonstration below shows this
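A quick way to see the scale sensitivity for yourself is to count strong Harris responses at two scales. This is only a sketch, reusing chess.jpg from the Harris example above; the 0.5 scale factor and 0.025 threshold are arbitrary illustrative values.

import cv2
import numpy as np

# Sketch: the Harris corner response changes as the image is rescaled,
# which is why plain corners are not scale invariant
gray = cv2.cvtColor(cv2.imread('chess.jpg'), cv2.COLOR_BGR2GRAY)

for scale in (1.0, 0.5):
    resized = cv2.resize(gray, None, fx=scale, fy=scale)
    response = cv2.cornerHarris(np.float32(resized), 3, 3, 0.05)
    strong = int((response > 0.025 * response.max()).sum())
    print("scale", scale, "-> strong corner pixels:", strong)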

 

Enlarging and Scaling using OpenCV and Python

 

SIFT, SURF, FAST, BRIEF & ORB Algorithms

Scale Invariant Feature Transform (SIFT)

Corner detectors like the Harris corner detection algorithm are rotation invariant, which means that even if the image is rotated we can still get the same corners. That is expected, as corners remain corners in a rotated image too. But when we scale an image, a corner may not remain a corner, as shown in the above image.

SIFT detects interesting keypoints in an image using the difference-of-Gaussians method; these are the areas of the image where variation exceeds a certain threshold, and they are more distinctive than plain edges. The difference-of-Gaussians idea is sketched below.
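A rough sketch of a single difference-of-Gaussians layer (not SIFT's full scale-space pyramid; the sigma values 1.0 and 1.6 are illustrative only):

import cv2
import numpy as np

# Sketch: one difference-of-Gaussians layer; SIFT builds a whole pyramid of
# these across scales and looks for extrema in the response
gray = cv2.cvtColor(cv2.imread('paris.jpg'), cv2.COLOR_BGR2GRAY).astype(np.float32)
g1 = cv2.GaussianBlur(gray, (0, 0), 1.0)    # lightly blurred
g2 = cv2.GaussianBlur(gray, (0, 0), 1.6)    # more blurred
dog = g1 - g2    # candidate keypoints sit at extrema of this response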

Then we create a vector descriptor for these interesting areas. Scale invariance is achieved via the following process:

i. Interesting points are scanned at several different scales.
ii. The scale at which a specific stability criterion is met is then selected and encoded by the vector descriptor. Therefore, regardless of the initial size, the most stable scale is found, which allows us to be scale invariant.

Rotation invariance is achieved by obtaining the orientation assignment of the keypoint using image gradient magnitudes. Once we know the 2D direction, we can normalize this direction; the sketch below computes these raw gradients.
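The gradient magnitudes and orientations mentioned above can be computed directly with Sobel derivatives; a minimal sketch of the raw ingredients (this is not SIFT's orientation histogram itself):

import cv2

# Sketch: per-pixel gradient magnitude and direction from Sobel derivatives,
# the raw ingredients of SIFT's orientation assignment
gray = cv2.cvtColor(cv2.imread('paris.jpg'), cv2.COLOR_BGR2GRAY)
gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
magnitude, angle = cv2.cartToPolar(gx, gy, angleInDegrees=True)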

A full paper on SIFT can be read here:

http://www.cs.ubc.ca/~lowe/papers/ijcv04.pdf.

And you can also find a tutorial on the official OpenCV website.

Gaussian Method in OpenCV using Python

 

Image Gradient using OpenCV and Python

 

Speeded Up Robust Features (SURF)

SURF is the speeded-up version of SIFT, as SIFT is quite computationally expensive.

SURF was developed to improve the speed of a scale invariant feature detector. Instead of using the Difference of Gaussian approach, SURF uses Hessian matrix approximation to detect interesting points and uses the sum of Haar wavelet responses for orientation assignment.

A full paper on SURF can be read here: http://www.vision.ee.ethz.ch/~surf/eccv06.pdf

 

Alternatives of SIFT and SURF

As SIFT and SURF are patented, they are not freely available for commercial use; however, there are alternatives to these algorithms, which are explained in brief here.

 

Features from Accelerated Segment Test (FAST)

• Key point detection only (no descriptor; we can use SIFT or SURF to compute that)
• Used in real-time applications

Here you can find the papers on FAST

https://www.edwardrosten.com/work/rosten_2006_machine.pdf


Binary Robust Independent Elementary Features (BRIEF)

• Computes descriptors quickly (instead of using SIFT or SURF)
• It is quite fast.

Here you can find the paper on BRIEF

http://cvlabwww.epfl.ch/~lepetit/papers/calonder_pami11.pdf

 

Oriented FAST and Rotated BRIEF (ORB)

  • Developed out of OpenCV Labs (not patented, so free to use!)
  • Combines both FAST and BRIEF

Here you can find the paper on ORB

http://www.willowgarage.com/sites/default/files/orb_final.pdf

 

Using SIFT, SURF, FAST, BRIEF & ORB in OpenCV

Flow process for SIFT, SURF, FAST, BRIEF & ORB in OpenCV

 

Feature Detection implementation

The SIFT & SURF algorithms are patented by their respective creators, and while they are free to use in academic and research settings, you should technically obtain a license/permission from the creators if you are using them in a commercial (i.e. for-profit) application.

Below are programming examples of all the algorithms mentioned above.

 

SIFT

import cv2
import numpy as np

image = cv2.imread('paris.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

Create SIFT Feature Detector object

sift = cv2.xfeatures2d.SIFT_create()
# Note: in OpenCV 4.4+ (after the SIFT patent expired) this is also available as cv2.SIFT_create()

#Detect key points
keypoints = sift.detect(gray, None)
print("Number of keypoints Detected: ", len(keypoints))

 

Draw rich key points on input image

image = cv2.drawKeypoints(image, keypoints, None, flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)

cv2.imshow('Feature Method - SIFT', image)
cv2.waitKey(0)
cv2.destroyAllWindows()

 

Console Output:

Number of keypoints Detected:  1893

 

SIFT keypoint Detection using OpenCV and Python

 

Here the keypoints are (x, y) coordinates extracted using the SIFT detector and drawn over the image using the cv2.drawKeypoints function.
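Each entry in the keypoints list is actually a cv2.KeyPoint object carrying more than just coordinates; a short sketch inspecting the first one (assuming the keypoints variable from the code above):

# Sketch: each detected keypoint carries location, scale and orientation
kp = keypoints[0]
print("location:", kp.pt)    # (x, y) coordinates
print("size:", kp.size)      # diameter of the meaningful neighbourhood
print("angle:", kp.angle)    # orientation in degrees (-1 if not computed)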

 

SURF

import cv2
import numpy as np

image = cv2.imread('paris.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

 

Create SURF Feature Detector object; here we set the Hessian threshold to 500

# Only features whose Hessian response is larger than hessianThreshold are retained by the detector
# You can increase the Hessian threshold to decrease the number of keypoints
surf = cv2.xfeatures2d.SURF_create(500)

keypoints, descriptors = surf.detectAndCompute(gray, None)
print ("Number of keypoints Detected: ", len(keypoints))

 

Draw rich key points on input image

image = cv2.drawKeypoints(image, keypoints, None, flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)

cv2.imshow('Feature Method - SURF', image)
cv2.waitKey()
cv2.destroyAllWindows()

 

Console Output:

Number of keypoints Detected:  1548

SURF keypoint Detection using OpenCV and Python

 

FAST

import cv2
import numpy as np

image = cv2.imread('paris.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

 

Create FAST Detector object

fast = cv2.FastFeatureDetector_create()
# Obtain key points; by default non-max suppression is on
# to turn it off, use fast.setNonmaxSuppression(False)
keypoints = fast.detect(gray, None)
print ("Number of keypoints Detected: ", len(keypoints))

 

Draw rich keypoints on input image

image = cv2.drawKeypoints(image, keypoints, None, flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)

cv2.imshow('Feature Method - FAST', image)
cv2.waitKey()
cv2.destroyAllWindows()

 

Console Output:

Number of keypoints Detected:  8960

FAST keypoint Detection using OpenCV and Python

 

BRIEF

import cv2
import numpy as np

image = cv2.imread('paris.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

 

Create FAST detector object

# BRIEF is a descriptor extractor only, so we pair it with the FAST detector
fast = cv2.FastFeatureDetector_create()

 

Create BRIEF extractor object

brief = cv2.xfeatures2d.BriefDescriptorExtractor_create()

# Determine key points using FAST
keypoints = fast.detect(gray, None)

 

Obtain descriptors and new final keypoints using BRIEF

keypoints, descriptors = brief.compute(gray, keypoints)
print ("Number of keypoints Detected: ", len(keypoints))

 

Draw rich keypoints on input image

image = cv2.drawKeypoints(image, keypoints, None, flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)                                

cv2.imshow('Feature Method - BRIEF', image)
cv2.waitKey()
cv2.destroyAllWindows()

 

Console Output:

Number of keypoints Detected:  8735

BRIEF keypoint Detection using OpenCV and Python

 

ORB

import cv2
import numpy as np

image = cv2.imread('paris.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

 

Create ORB object; we can specify the number of key points we desire

orb = cv2.ORB_create()
# Determine key points
keypoints = orb.detect(gray, None)

 

Obtain the descriptors

keypoints, descriptors = orb.compute(gray, keypoints)
print("Number of keypoints Detected: ", len(keypoints))

 

Draw rich keypoints on input image

image = cv2.drawKeypoints(image, keypoints, None, flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)

cv2.imshow('Feature Method - ORB', image)
cv2.waitKey()
cv2.destroyAllWindows()

 

Console Output:

Number of keypoints Detected:  500

ORB keypoint Detection using OpenCV and Python

 

We can specify the number of keypoints through the nfeatures parameter; the default value is 500, i.e. ORB automatically detects the best 500 keypoints if no value is specified (as sketched below).
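For example, a one-line sketch asking ORB for up to 1000 keypoints:

# Sketch: request up to 1000 keypoints instead of the default 500
orb = cv2.ORB_create(nfeatures=1000)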

So this is how object detection takes place in OpenCV. The same programs can also be run on a Raspberry Pi with OpenCV installed, turning it into a portable device, much like smartphones with Google Lens.

This article is based on the Master Computer Vision™ OpenCV4 in Python with Deep Learning course on Udemy, created by Rajeev Ratan; subscribe to it to learn more about Computer Vision and Python.
