Communication between a user and a computer usually takes place through input devices such as the keyboard and mouse. This project report describes an alternative way of communicating: hand gestures, which are identified using image processing techniques. The system is coded in Python and uses the OpenCV library. Experiments show that the implementation is reliable enough for practical use.
Hand gestures can be identified by various algorithms. I use Python and, more specifically, OpenCV (a library of programming functions aimed mainly at computer vision, covering most common computer vision techniques) to identify the hand gesture.
The project takes gestures as input from the laptop camera. The whole system is divided into the following stages:
1. Acquiring the image frames
2. Creating a hand segmentation mask
3. Using dilation and erosion for noise removal
4. Identifying the contour (the most important stage)
5. Finding convexity defects and contour areas
6. Identifying the gesture shown based on the contour areas and ratio
The architecture (flow) diagram describes the modules and processes involved in this project and how they are arranged. The flow diagram given below shows the steps carried out to build the complete hand gesture recognition project.
Main function used: ret, frame = cap.read()
The first stage of any vision system is image acquisition. In this stage I
take video as input from my laptop camera. A video is simply a sequence of frames
played one after another very quickly.
Colour is a very powerful descriptor for object detection, so colour
information, which is invariant to rotation and geometric variation of the hand,
was used for segmentation. Humans perceive colour through components such as
brightness, saturation and hue rather than through percentages of the primary
colours red, green and blue, which is why the skin range is specified in HSV.
For this, a mask needs to be created that segments the hand from the background.
Lower and upper bounds of the skin colour range are used to identify the hand, and
the mask is then created from this range using cv2.inRange().
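Functions used:
# Lower and upper HSV bounds of the skin colour range
lower_skin = np.array([0,20,70], dtype=np.uint8)
upper_skin = np.array([20,255,255], dtype=np.uint8)
# Pixels inside the range become white (255), all others black (0)
mask = cv2.inRange(hsv, lower_skin, upper_skin)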
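To make the thresholding step concrete, here is a minimal pure-Python sketch of the per-pixel test that cv2.inRange() performs with the bounds above. It assumes OpenCV's 8-bit HSV convention (hue stored as degrees divided by 2, so 0 to 179; saturation and value scaled to 0 to 255). The helper names rgb_to_opencv_hsv and is_skin and the sample pixel values are hypothetical illustrations, not part of the project code.

```python
import colorsys

# Bounds copied from the listing above (H, S, V on OpenCV's 8-bit scale).
LOWER_SKIN = (0, 20, 70)
UPPER_SKIN = (20, 255, 255)


def rgb_to_opencv_hsv(r, g, b):
    """Convert an 8-bit RGB pixel to HSV on the assumed OpenCV 8-bit scale."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    # colorsys returns all three components in [0, 1].
    return round(h * 180) % 180, round(s * 255), round(v * 255)


def is_skin(r, g, b):
    """Return True if every HSV channel lies inside the skin range (inclusive)."""
    hsv = rgb_to_opencv_hsv(r, g, b)
    return all(lo <= x <= hi for lo, x, hi in zip(LOWER_SKIN, hsv, UPPER_SKIN))


# Hypothetical sample pixels: a skin-like tone and a pure blue background.
skin_like = is_skin(224, 172, 105)
blue_background = is_skin(0, 0, 255)
print(skin_like, blue_background)
```

A pixel passes only when all three channels are inside their bounds, which is why a fixed range like this one is sensitive to lighting and to skin-coloured objects in the background.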
The most basic morphological operations are dilation and erosion. Dilation adds pixels
to the boundaries of objects in an image, while erosion removes pixels on object
boundaries. The number of pixels added or removed from the objects in an image
depends on the size and shape of the structuring element used to process the image.
1. Erosion:
• It is useful for removing small white noise.
• It can also be used to detach two connected objects.
2. Dilation:
• In noise removal, erosion is followed by dilation, because erosion
removes white noise but also shrinks the object. Dilating afterwards
restores the object's area; since the noise is already gone, it does not come back.
It is also useful for joining broken parts of an object.
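The erosion-then-dilation sequence described above can be sketched on a tiny binary image in plain Python. This is only an illustration of the idea, assuming a 3x3 square structuring element and ignoring neighbours that fall outside the image; the function names and the 7x7 test image are hypothetical, and the project itself relies on OpenCV's own routines.

```python
def _apply_3x3(image, combine):
    """Apply a 3x3 neighbourhood rule to a binary image (list of lists)."""
    rows, cols = len(image), len(image[0])
    result = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Collect the in-bounds pixels of the 3x3 window centred on (r, c).
            window = [
                image[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            result[r][c] = 1 if combine(window) else 0
    return result


def erode(image):
    """A pixel stays white only if every pixel in its window is white."""
    return _apply_3x3(image, all)


def dilate(image):
    """A pixel becomes white if any pixel in its window is white."""
    return _apply_3x3(image, any)


# A 3x3 white object plus one isolated white noise pixel in the top-right corner.
noisy = [
    [0, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

eroded = erode(noisy)    # noise disappears, object shrinks to its centre pixel
opened = dilate(eroded)  # object grows back, noise stays gone

for row in opened:
    print(row)
```

Erosion alone would leave the object smaller than it really is, which is why the dilation step follows it.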
Functions used are:
contours, hierarchy = cv2.findContours(mask, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
# The contour with the largest area is taken to be the hand
cnt = max(contours, key=lambda x: cv2.contourArea(x))
In this step we focus on finding the boundary of the segmented hand. A contour
is simply a curve joining all the continuous points along a boundary that have
the same value. The function is applied to the mask created earlier.
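Choosing the hand among several detected contours comes down to picking the one that encloses the largest area. The sketch below shows that selection in plain Python, with the shoelace formula standing in for cv2.contourArea(). The two contours are made-up point lists, and polygon_area is a hypothetical helper name.

```python
def polygon_area(points):
    """Area of a closed polygon given as a list of (x, y) vertices (shoelace formula)."""
    total = 0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


# Hypothetical contours: a small noise blob and a larger hand-sized region.
contours = [
    [(0, 0), (2, 0), (2, 2), (0, 2)],
    [(10, 10), (60, 10), (60, 80), (10, 80)],
]

# Same selection rule as in the project listing: keep the largest contour.
cnt = max(contours, key=polygon_area)
print(polygon_area(cnt))
```

This rule assumes the hand is the largest skin-coloured region in view; a face or arm in the frame could be picked instead.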
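Functions used:
# Convex hull of the hand contour and the two areas to compare
hull = cv2.convexHull(cnt)
areahull = cv2.contourArea(hull)
areacnt = cv2.contourArea(cnt)
# Hull area not covered by the hand, as a percentage of the hand area
arearatio = ((areahull - areacnt) / areacnt) * 100
# Hull as point indices, which cv2.convexityDefects() requires
# (approx is assumed to be a polygonal approximation of cnt)
hull = cv2.convexHull(approx, returnPoints=False)
defects = cv2.convexityDefects(approx, hull)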
Given a set of points in the plane, the convex hull of the set is the smallest convex
polygon that contains all of its points.
Thus, in this step an enclosing boundary (the convex hull) is built around the hand,
and the convexity defects, which correspond to the gaps between the fingers, are then
found. I mark each defect with a dot; the defects, together with the contour
area, are used to determine which number is being shown.
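The relationship between the contour, its convex hull and the area ratio can be illustrated without OpenCV. The following is a rough sketch, assuming Andrew's monotone chain algorithm for the hull and the shoelace formula for the areas (OpenCV's internal algorithms may differ). The notched polygon is a made-up stand-in for a hand contour with one gap between two fingers.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Convex hull via Andrew's monotone chain, vertices in traversal order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def polygon_area(points):
    """Shoelace formula for the area of a closed polygon."""
    total = 0
    for i in range(len(points)):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % len(points)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


# Hypothetical contour: a square with a V-shaped notch cut into one side.
cnt = [(0, 0), (4, 0), (4, 4), (2, 1), (0, 4)]

hull = convex_hull(cnt)
areahull = polygon_area(hull)
areacnt = polygon_area(cnt)
# Same formula as in the listing above.
arearatio = ((areahull - areacnt) / areacnt) * 100
print(hull, areahull, areacnt, arearatio)
```

The deeper the notches between the fingers, the more the hull area exceeds the contour area, so a closed fist gives a small ratio and a spread hand a large one.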
Finally, for every defect we calculate the angle between the fingers, and the number of defects is used to identify the gesture. The input gestures are classified based on the values of output parameters such as the area of the contour.
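One common way to compute the angle at a defect is the law of cosines applied to the triangle formed by the two fingertip points and the deepest point of the gap. The sketch below assumes that approach. The 90 degree cut-off, the function names and the sample points are assumptions made for illustration and are not taken from the project.

```python
import math


def defect_angle(start, end, far):
    """Angle in degrees at `far`, the deepest point between `start` and `end`.

    Assumes the three points are distinct, so no side has zero length.
    """
    a = math.hypot(end[0] - start[0], end[1] - start[1])  # side opposite `far`
    b = math.hypot(far[0] - start[0], far[1] - start[1])
    c = math.hypot(end[0] - far[0], end[1] - far[1])
    cos_angle = (b * b + c * c - a * a) / (2 * b * c)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return math.degrees(math.acos(cos_angle))


def count_finger_gaps(defects, max_angle=90.0):
    """Count defects sharp enough to be a gap between fingers (assumed threshold)."""
    return sum(1 for s, e, f in defects if defect_angle(s, e, f) <= max_angle)


# Hypothetical defects as (start, end, far) triples.
defects = [
    ((0, 4), (4, 4), (2, 0)),    # narrow V: likely a gap between two fingers
    ((0, 1), (10, 1), (5, 0)),   # shallow, wide dip: likely not a finger gap
]

gaps = count_finger_gaps(defects)
print(gaps)
```

Under this assumption, n sharp gaps would suggest n + 1 raised fingers, with the contour area and area ratio used to separate cases that have no gaps at all.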
The entire project has been developed in Python. I chose Python
because it has many good modules and packages that can be imported and whose
functions can be used directly.
For image processing in particular, Python offers the OpenCV library:
OpenCV (Open Source Computer Vision) is a library of programming functions
mainly aimed at real-time computer vision. In simple terms, it is a library used for
image processing, covering most operations performed on images.
It contains built-in functions for many processing and enhancement tasks, such as:
1. cv2.VideoCapture()
2. cv2.rectangle()
3. cv2.inRange()
4. cv2.erode()
5. cv2.dilate()
6. cv2.cvtColor(), and many more such functions
The next requirement is to import NumPy, which is used to handle multi-dimensional
arrays. Since images are 2D representations of pixels, NumPy is used to handle
operations on them.
All the OpenCV array structures are converted to and from NumPy arrays. This also
makes it easier to integrate with other libraries that use NumPy, such as SciPy and
Matplotlib.
To perform mathematical operations on the images, we also need to import
Python's math module, which provides functions such as sqrt (abs, by contrast, is a Python built-in and needs no import).