The objective of this project is to utilize the camera and OpenCV to instruct the robocar to identify a tennis ball, collect the ball, and return it to home. The following steps were taken to accomplish this:
1) Setup camera to take images 2) Process images with OpenCV to isolate the ball in the images 3) Use the ball's pixel location to calculate distance and bearing to the ball 4) Determine the PWM values to send to the motors based on these calculations 5) Develop and implement code into the DonkeyCar Framework
Krishna Naik email@example.com
Using OpenCV to Find The Ball
OpenCV is open source computer vision library optimized to process images. It can be used in either Python or C++, but because the donkeycar framework is in Python, we also programmed our OpenCV functions in Python. OpenCV is also already installed within the donkeycar framework so you don't have to worry about downloading it.
- To use OpenCV, import the library with 'import cv2' without the quotes.
Converting, Filtering, and Finding Contours with OpenCV
1) First, convert the default color-space (BGR) of the Raspberry Pi Camera Module to HSV with the function cvtColor. img_hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV) #The function takes 2 arguments: an image and a flag (what color space to convert to) *Converting to HSV makes it easier to find distinct colors. Further explanation can be found here.
2) Next, to actually filter out the color green, use the function inRange. img_inRange = cv2.inRange(img_hsv, lowest_bound, upper_bound) #The function takes 3 arguments: an image, the lower bound of a certain color that you're looking for, and the upper bound of the color that you're looking for. It takes those pixels in the range of that bound and converts it to white pixels, and every other pixel is converted to the color black. *The lower bound we used was green_lower = (41, 25, 57), and an upper bound of green_upper = (180, 255, 255)
3) After converting the image to a binary black and white image, there might be noise (small white blobs of pixels that made it through the inRange filtering. To reduce the noise, we used the function erode to convert small blobs of white to black pixels. eroded_img = cv2.erode(img_inRange, None, iterations=2) #The function takes 3 arguments: an image, the Kernel (None -> not necessary to know for our purposes), and the number of iterations to perform. Each iteration erodes the image more and more.
4) Eroding our image also erodes the main blob of white pixels that we want to track, so we want to normalize it with the function dilate. dilated_img = cv2.dilate(eroded_img, None, iterations=2) #Again, the function takes 3 arguments: an image, the Kernel, and the number of iterations to perform. The number of iterations should be the same as the eroded iterations to keep the original size of the blob.
5) Now that we have our filtered image, we want to find the outline of the ball by using the function findContours. contours = cv2.findContours(dilated_img, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2] #Function takes 3 arguments: an image to find contours in, the contours retrieval mode (in this case, we only retrieve external contours from the image and not inner ones), and the third specifies what boundary points to save. This image shows the difference. The rectangle on the left saves all the boundary points (cv2.CHAIN_APPROX_NONE) whereas the right one only saves 4 (cv2.CHAIN_APPROX_SIMPLE), which saves a lot of memory. The variable 'contours' stores the number of boundary points found. Also, the [-2] at the end allows the function to be used in both versions 2's and 3's of OpenCV.
6) There might be noise on the contours found too, so we found the biggest contour with the function max. max_contours = max(contours, key=cv2.contourArea) #The function takes two arguments: the contours to compare, and how to compare which contour is the largest. In this case, we're using the area encompassed by the contour to determine which one is the max. *More about the contour approximations can be found here
7) After finding the largest contour, we can extract geometric information of the encompassing contour with function minEnclosingCircle. ((x, y), radius) = cv2.minEnclosingCircle(c) #The function takes one argument: the contours to work with. It returns the center_x and center_y pixel of the circle, and the radius.
8) We then use the size of the radius (in pixels) to calculate the distance of the ball from the camera. A more in depth explanation can be found below.
Many more OpenCV functions and tutorials can be found here
Determining Distance and Heading to Target
Determining an objects relative distance from the donkeycar given some 2D image is an important step. In order to accomplish this step, a function between pixel data and spacial location was determined in both the object heading and distance problems.
The image recognition process imposes a circular contour on a group of interesting pixels. This contour has some geometric properties which are used to infer longitudinal distance of the target from the robot (DL). In this project, the contour radius was used to determine object distance by taking pictures of the target at incrementally increasing distances from the camera and performing and fitting a function to the resulting curve of radius vs distance.
As seen from the chart and figure to the right, the relationship between contour radius and distance is nonlinear at a certain distance from the donkey car and roughly linear after a certain point. The function representing this fit form is noted in the code but not here since the fit itself is not unique but dependent on the type of fit selected. Regardless of the type of fit selected, the resulting function provides the mapping between the pixel contour size and the longitudinal distance from the camera.
The ball may not always be within the catching zone for the donkey; therefore, it is important to determine its heading from the camera image. Camera resolution settings are defined in terms of X by Y pixels (720 by 480 for this case). Knowing the following:horizontal pixel count is X pixels, the center location (Xc,Yc) of the imposed circular contour, and the camera is located in the center of the donkey car it is simple to determine the distance between the center of the camera frame and the detected object (Xcd).
- Xcd = X/2 - Xc
The above formulation gives the distance from the center of the camera frame in pixels, the pixel distance needs to be converted to real world distance and is performed via the following calculation
- Xd = Xcd * Rreal/Rimage
Where Rreal/Rimage is the ratio of the radius of a real tennis ball (1.375 in) to the radius given of the imposed contour
Once the distance to the center of the camera frame is determined (Xcd) and converted to distance to center of the donkeycar body (Xd) an inverse trig operation is sufficient to determine the needed heading angle given the distance of the target
- θ = tan-1(Xd/DL)
Once the heading and distance parameters for the target are determined they need to be converted to appropriate PWM (Pulse Width Modulation) signals to send to the steering and throttle systems. Details for these operations are found in the next section.
Move and End Conditions
This section describes some of the movement features added to the code.
Donuts: If the camera does not find any objects within a certain time, it will perform donuts. This is done by sending the lowest PWM value (hard left turn) and a median throttle value. If an object is found at any point, a boolean "startDonuts" is set to false and the car knowns to stop doing donuts.
Ball Find Conditions: The way the car knows it has found the ball is by checking to see if the y pixel is above a certain pixel number (note: the top of the image is at y=0). At a sufficiently high y value, determined in the calibration step above, the car knows the ball is at the distance in which it must have been grabbed by the arms. If this is the first ball (in our case, green), the code then switches the color range that it is looking for by changing color range values within a configuration file. This configuration file holds all the constants that are referenced within the code. With the color search changed, the robot repeats the steps performed when finding the ball until it reaches home. It is important that the color range be switched back to the first color upon ending the code (see "def shutdown():" in the Code Implementation section).
The process described above can be pretty easily implemented using a while loop and divvying up code into different functions. But, this kind of implementation puts restrictions on the code in that it is difficult to modify/build on without significant understanding and restructuring of the code. A better way is to modularize the code by using the DonkeyCar Framework, which is already used when running the car with manage.py.
To learn how to modify the DonkeyCar framework, start with the following tutorial videos done by the creator of DonkeyCar. Understanding of these two is all you need to know, and the following discussion should be used as a complement to the tutorials:
The GitHub file "FetchBall.py" is Team 1's overarching program that was used to run our code (in replacement of manage.py). This is the function to be called when running the car. All other programs are called from within this one.
FetchBall.py: This program sets up the DonkeyCar Vehicle, which holds each "part." A "part" consists of a series of functions defined in a class that is run at a specific time, as determined by the Framework. To add a part to the vehicle, an instance of the class needs to be created (e.g cam=CvCam1.CvCam()). This creates a variable that is linked to another program that holds the code for what the part is going to do. For example, the cam variable holds a class that will take an image with the camera and return the image. These parts need to be added to the Vehicle using the .add command.
Parts: In the case where threaded=True, the part is not restricted to the frequency at which the DonkeyCar runs each part (which can be changed in the Vehicle.py program, default=20Hz). In our case, all parts use threaded=False, in which case the Vehicle essentially runs like a while loop that will run the parts in the order that they are added. Each part can have specified inputs or outputs. These inputs and outputs are held by the Vehicle to be accessed by any other part. For example, the camera part outputs an image: outputs=["camera/image"]. This output can be accessed by the image filtering part by adding: inputs=["camera/image"]. The ability to add additional or threaded parts to the Vehicle is what gives this framework an advantage over a simple while loop.
Each part should have at least three functions: When parts are called, they are first fed the inputs and variables are set within the "def __init__():" function. These are the variables that are available to the class as it runs its code. Next, the "def run:" function is run. This run function should hold everything that you want the part to do. It may call other functions within the class, but when it returns its outputs (or return None if it has no outputs), the part will be finished and the Vehicle will move to the next part. Lastly, "def shutdown():" will be called when the user hits Ctl+C to stop the code. This is a good place to shut down motors/deactivate the camera so that the car does not keep running after the code has stopped. While these three functions should be in each part, you may place other functions in the part, which can be called by "def run():" during its operation.
Team 1's Parts: 5 parts were used in FetchBall.py:
cam - takes and outputs an Image with the RPI Camera
filterImage - Takes the Image and applies image manipulations. Outputs x,y, and radius pixel information
Controller - Takes x,y, and radius. Calculates distance and bearing to the ball and outputs the PWM to send to the car. This part is also responsible for determining when to perform donuts, checking the PWM bounds, applying the PWM control loop, and determining if the ball has been found.
SteeringPWMSender, ThrottlePWMSender - Takes PWM values and sends them to the car. These parts use the pre-existing PCA9685 class that is already used in manage.py. This class can be found in donkeycar/donkeycar/parts/actuators.py. PCA9685 is a class that is setup to communication with our specific hardware. When setting up the PCA9685 instance, the class takes an input of the channel number (e.g Channel 1 or Channel 2), as determined by your electronic hookup to your motor controller. Setting up this instance provides a method to send PWM values to the car, but does not provide flexibility to do other things with this instance (unless the PCA9685 class is edited directly). As a result, another class called SteeringPWMSender/ThrottlePWMSender is made, which uses the PCA9685 instance as an input, but also contains other actions. This second class is what is added as a part to the Vehicle.
CvCam1.py: This program holds the CvCam and ImageConvandFilter class.
FetchFunctionsModularized.py: This program holds the Controller, SteeringPWMSender, ThrottlePWMSender classes.
cfg2.py: This is a configuration file that holds all the constants for the code. It use makes for cleaner code and easy adjustable values.
This project had some additional extensions to the initial donkeycar build in the form of an improved camera mount a capture arm.
One of the design improvements was in the camera mount. The original camera mount design allowed for some variation in the camera position if the vehicle experienced some sort of impact or had to experience some slight readjustment from natural screw loosening and movement. This readjustment resulted in variation of captured image which detracted from color calibration settings and distance presets. In order to alleviate this issue a new camera mount was created which allows for rigid attachment of the camera to several different view angles which allows for greater consistency and repeatability.