The objective of this project is to utilize the camera and OpenCV to instruct the robocar to identify a tennis ball, collect the ball, and return it to home. The following steps were taken to accomplish this:
1) Setup camera to take images
2) Process images with OpenCV to isolate the ball in the images
3) Use the ball's pixel location to calculate distance and bearing to the ball
4) Determine the PWM values to send to the motors based on these calculations
5) Develop and implement code into the DonkeyCar Framework
The instruction below describe how each step was accomplished.
Krishna Naik email@example.com
Ian Liu - firstname.lastname@example.org
Nick Garrett - email@example.com
Setting Up Camera
To access the RaspberryPi Camera, make sure to enable the camera after running the command sudo raspi-config. Enabling the camera can be found under 'Interfacing Options'. After enabling the camera, connnect the camera module to the Pi.
Next, import these
from picamera.array import PiRGBArray
from picamera import PiCamera
import cfg2 #configuration file used to store variables
1) Initialize the camera. camera = PiCamera()
2) Set the resolution and framerate. camera.resolution = cfg2.Cam_Resolution and camera.framerate = cfg2.Cam_FrameRate
3) Add the camera as a donkeycar part. Further details are found below in code implementation
4) self.rawCapture=PiRGBArray(self.camera, size=[720, 480]) #Create a reference to the raw camera capture self.camera.capture(self.rawCapture, format="bgr",use_video_port=True) #Use the camera to take a raw image img = self.rawCapture.array #Store the image in arrays of pixels
- Note: This is easier to understand while looking through our code on our github.
After capturing an image, we can process it with OpenCV.
Using OpenCV to Find The Ball
OpenCV is open source computer vision library optimized to process images. It can be used in either Python or C++, but because the donkeycar framework is in Python, we also programmed our OpenCV functions in Python. OpenCV is also already installed within the donkeycar framework so you don't have to worry about downloading it.
- To use OpenCV, import the library with 'import cv2' without the quotes.
Converting, Filtering, and Finding Contours with OpenCV
1) First, convert the default color-space (BGR) of the Raspberry Pi Camera Module to HSV with the function cvtColor. img_hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV) #The function takes 2 arguments: an image and a flag (what color space to convert to) *Converting to HSV makes it easier to find distinct colors. Further explanation can be found here.
2) Next, to actually filter out the color green, use the function inRange. img_inRange = cv2.inRange(img_hsv, lowest_bound, upper_bound) #The function takes 3 arguments: an image, the lower bound of a certain color that you're looking for, and the upper bound of the color that you're looking for. It takes those pixels in the range of that bound and converts it to white pixels, and every other pixel is converted to the color black. *The lower bound we used was green_lower = (41, 25, 57), and an upper bound of green_upper = (180, 255, 255)
3) After converting the image to a binary black and white image, there might be noise (small white blobs of pixels that made it through the inRange filtering. To reduce the noise, we used the function erode to convert small blobs of white to black pixels. eroded_img = cv2.erode(img_inRange, None, iterations=2) #The function takes 3 arguments: an image, the Kernel (None -> not necessary to know for our purposes), and the number of iterations to perform. Each iteration erodes the image more and more.
4) Eroding our image also erodes the main blob of white pixels that we want to track, so we want to normalize it with the function dilate. dilated_img = cv2.dilate(eroded_img, None, iterations=2) #Again, the function takes 3 arguments: an image, the Kernel, and the number of iterations to perform. The number of iterations should be the same as the eroded iterations to keep the original size of the blob.
5) Now that we have our filtered image, we want to find the outline of the ball by using the function findContours. contours = cv2.findContours(dilated_img, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2] #Function takes 3 arguments: an image to find contours in, the contours retrieval mode (in this case, we only retrieve external contours from the image and not inner ones), and the third specifies what boundary points to save. This image shows the difference. The rectangle on the left saves all the boundary points (cv2.CHAIN_APPROX_NONE) whereas the right one only saves 4 (cv2.CHAIN_APPROX_SIMPLE), which saves a lot of memory. The variable 'contours' stores the number of boundary points found. Also, the [-2] at the end allows the function to be used in both versions 2's and 3's of OpenCV.
6) There might be noise on the contours found too, so we found the biggest contour with the function max. max_contours = max(contours, key=cv2.contourArea) #The function takes two arguments: the contours to compare, and how to compare which contour is the largest. In this case, we're using the area encompassed by the contour to determine which one is the max. *More about the contour approximations can be found here
7) After finding the largest contour, we can extract geometric information of the encompassing contour with function minEnclosingCircle. ((x, y), radius) = cv2.minEnclosingCircle(c) #The function takes one argument: the contours to work with. It returns the center_x and center_y pixel of the circle, and the radius.
8) We then use the size of the radius (in pixels) to calculate the distance of the ball from the camera. A more in depth explanation can be found below.
Many more OpenCV functions and tutorials can be found here
Determining Distance and Heading to Target
Determining an objects relative distance from the donkeycar given some 2D image is an important step. In order to accomplish this step, a function between pixel data and spacial location was determined in both the object heading and distance problems.
The image recognition process imposes a circular contour on a group of interesting pixels. This contour has some geometric properties which are used to infer longitudinal distance of the target from the robot (DL). In this project, the contour radius was used to determine object distance by taking pictures of the target at incrementally increasing distances from the camera and performing and fitting a function to the resulting curve of radius vs distance.
As seen from the chart and figure to the right, the relationship between contour radius and distance is nonlinear at a certain distance from the donkey car and roughly linear after a certain point. The function representing this fit form is noted in the code but not here since the fit itself is not unique but dependent on the type of fit selected. Regardless of the type of fit selected, the resulting function provides the mapping between the pixel contour size and the longitudinal distance from the camera.
The ball may not always be within the catching zone for the donkey; therefore, it is important to determine its heading from the camera image. Camera resolution settings are defined in terms of X by Y pixels (720 by 480 for this case). Knowing the following:horizontal pixel count is X pixels, the center location (Xc,Yc) of the imposed circular contour, and the camera is located in the center of the donkey car it is simple to determine the distance between the center of the camera frame and the detected object (Xcd).
- Xcd = X/2 - Xc
The above formulation gives the distance from the center of the camera frame in pixels, the pixel distance needs to be converted to real world distance and is performed via the following calculation
- Xd = Xcd * Rreal/Rimage
Where Rreal/Rimage is the ratio of the radius of a real tennis ball (1.375 in) to the radius given of the imposed contour
Once the distance to the center of the camera frame is determined (Xcd) and converted to distance to center of the donkeycar body (Xd) an inverse trig operation is sufficient to determine the needed heading angle given the distance of the target
- θ = tan-1(Xd/DL)
Once the heading and distance parameters for the target are determined they need to be converted to appropriate PWM (Pulse Width Modulation) signals to send to the steering and throttle systems. Details for these operations are found in the next section.
Steering and Throttle Control
There are a million ways to control the system. It all depends on what and how the user wants the robot to accomplish. In regards to throttle, we wanted the robot move faster when the ball was farther away and slow down when the robot was was closer to the ball. To do this, we implemented a closed loop Proportional-Integral (PI) controller around throttle and a Proportional controller around the steering.
The first step to controlling the robot is calibrating the steering and throttle values. Using the donkey calibrate code, we selected reasonable and safe minimum and maximum throttle speeds and recorded the associated PWM values.
(dk) donkey calibrate --channel [1 or 2]
The maximum throttle value was to be used in the case where the ball was far away and the minimum throttle value will be used when the ball is very close. The difference between max and min throttle for us was only 4. More on that later. Then, using the same code on the steering channel we measured the angle the each wheel was from straight ahead. Note that due to the steering mechanism, the wheels are actually only parallel when the steering is set to directly forward. In the case where the steering is not directly forward, the inside wheel is at a sharper angle away from center than the outside wheel. We approximated the steering by taking the average steering angle of each wheel. The relationship between the steering angle and PWM steering value will be used by the controller to command the robot where to point the wheels.
For both steering and throttle, the relationship between the PWM values and the output can be assumed to be linear to the nature of the servo and ESC driving the steering and throttle mechanisms. Because speed is a difficult thing to measure here, create a relationship between PWM and distance where the minimum PWM value maps to where distance is zero and the maximum PWM value maps to where the distance is maximum. Therefore, taking the values recorded using the donkey calibrate create two linear relations between the PWM inputs and the outputs.
Closed Loop Control
Control systems is an expansive topic, so this will give a very, very, brief synopsis. Generally, systems have inputs and outputs. Controllers modify the inputs to systems to produce a desired output. In the robot fetch example, the robot’s inputs are PWM values to the throttle and the steering. The robot’s outputs are the distance and angle calculations made by the computer vision code. The controller will receive the outputs, do something with them, and then send the robot a new set of inputs. Use the figure below and steering as an example. The desired angle reference measurement is zero degrees (aka straight ahead). So, when the image processor (from our robot system, G) outputs an angle of 15 degrees, it is compared to the reference (which is zero). Therefore, the error is 15 degrees. Now, the controller, C, uses the linear relationship created in the further calibration to output a PWM that the esc uses to steer the robot. Earlier this controller was called a proportional controller, because it uses proportional constant (the slope of the linear relationship) to control the system.
The throttle uses a slightly more complicated controller. The difference from the steering control is that the throttle utilizes an integrator term with the proportional term. The integrator term sums the error overtime. For example, if the error that should be going to zero based off of the proportional term (slope of the PWM to distance line) is reaching a nonzero constant, the integral term effects the system input to overcome that steady state error. This is particularly important when trying to control a system where the dynamics are difficult to model. The velocity of the robot at PWM values just above neutral are fairly consistent, so the integral term can compensate errors and noise.
In order to respect the max and min values, it is important to create a bounding function that takes in the maximum value, the minimum value, and the input value. If the value is outside of the bounds, then set the output value to the nearest bounding value. This is particularly important if the controller integral wants the input to exceed the maximum desired output.
Move and End Conditions
This section describes some of the movement features added to the code.
1)Donuts: If the camera does not find any objects within a certain time, it will perform donuts. This is done by sending the lowest PWM value (hard left turn) and a median throttle value. If an object is found at any point, a boolean "startDonuts" is set to false and the car knowns to stop doing donuts.
2)Ball Find Conditions: The way the car knows it has found the ball is by checking to see if the y pixel is above a certain pixel number (note: the top of the image is at y=0). At a sufficiently high y value, determined in the calibration step above, the car knows the ball is at the distance in which it must have been grabbed by the arms. If this is the first ball (in our case, green), the code then switches the color range that it is looking for by changing color range values within a configuration file. This configuration file holds all the constants that are referenced within the code. With the color search changed, the robot repeats the steps performed when finding the ball until it reaches home. It is important that the color range be switched back to the first color upon ending the code (see "def shutdown():" in the Code Implementation section).
The process described above can be pretty easily implemented using a while loop and divvying up code into different functions. But, this kind of implementation puts restrictions on the code in that it is difficult to modify/build on without significant understanding and restructuring of the code. A better way is to modularize the code by using the DonkeyCar Framework, which is already used when running the car with manage.py.
To learn how to modify the DonkeyCar framework, start with the following tutorial videos done by the creator of DonkeyCar. Understanding of these two is all you need to know, and the following discussion should be used as a complement to the tutorials:
The GitHub file "FetchBall.py" is Team 1's overarching program that was used to run our code (in replacement of manage.py). This is the function to be called when running the car. All other programs are called from within this one.
1) FetchBall.py: This program sets up the DonkeyCar Vehicle, which holds each "part." A "part" consists of a series of functions defined in a class that is run at a specific time, as determined by the Framework. To add a part to the vehicle, an instance of the class needs to be created (e.g cam=CvCam1.CvCam()). This creates a variable that is linked to another program that holds the code for what the part is going to do. For example, the cam variable holds a class that will take an image with the camera and return the image. These parts need to be added to the Vehicle using the .add command.
2) Parts: In the case where threaded=True, the part is not restricted to the frequency at which the DonkeyCar runs each part (which can be changed in the Vehicle.py program, default=20Hz). In our case, all parts use threaded=False, in which case the Vehicle essentially runs like a while loop that will run the parts in the order that they are added. Each part can have specified inputs or outputs. These inputs and outputs are held by the Vehicle to be accessed by any other part. For example, the camera part outputs an image: outputs=["camera/image"]. This output can be accessed by the image filtering part by adding: inputs=["camera/image"]. The ability to add additional or threaded parts to the Vehicle is what gives this framework an advantage over a simple while loop.
Each part should have at least three functions: When parts are called, they are first fed the inputs and variables are set within the "def __init__():" function. These are the variables that are available to the class as it runs its code. Next, the "def run:" function is run. This run function should hold everything that you want the part to do. It may call other functions within the class, but when it returns its outputs (or return None if it has no outputs), the part will be finished and the Vehicle will move to the next part. Lastly, "def shutdown():" will be called when the user hits Ctl+C to stop the code. This is a good place to shut down motors/deactivate the camera so that the car does not keep running after the code has stopped. While these three functions should be in each part, you may place other functions in the part, which can be called by "def run():" during its operation.
Team 1's Parts: 5 parts were used in FetchBall.py:
1) cam - takes and outputs an Image with the RPI Camera
2) filterImage - Takes the Image and applies image manipulations. Outputs x,y, and radius pixel information
3) Controller - Takes x,y, and radius. Calculates distance and bearing to the ball and outputs the PWM to send to the car. This part is also responsible for determining when to perform donuts, checking the PWM bounds, applying the PWM control loop, and determining if the ball has been found.
4,5) SteeringPWMSender, ThrottlePWMSender - Takes PWM values and sends them to the car. These parts use the pre-existing PCA9685 class that is already used in manage.py. This class can be found in donkeycar/donkeycar/parts/actuators.py. PCA9685 is a class that is setup to communication with our specific hardware. When setting up the PCA9685 instance, the class takes an input of the channel number (e.g Channel 1 or Channel 2), as determined by your electronic hookup to your motor controller. Setting up this instance provides a method to send PWM values to the car, but does not provide flexibility to do other things with this instance (unless the PCA9685 class is edited directly). As a result, another class called SteeringPWMSender/ThrottlePWMSender is made, which uses the PCA9685 instance as an input, but also contains other actions. This second class is what is added as a part to the Vehicle.
Functions called by Fetchball.py:
1) CvCam1.py: This program holds the CvCam and ImageConvandFilter class.
2) FetchFunctionsModularized.py: This program holds the Controller, SteeringPWMSender, ThrottlePWMSender classes.
3) cfg2.py: This is a configuration file that holds all the constants for the code. It use makes for cleaner code and easy adjustable values.
This project had some additional extensions to the initial donkeycar build in the form of an improved camera mount a capture arm.
One of the design improvements was in the camera mount. The original camera mount design allowed for some variation in the camera position if the vehicle experienced some sort of impact or had to experience some slight readjustment from natural screw loosening and movement. This readjustment resulted in variation of captured image which detracted from color calibration settings and distance presets. In order to alleviate this issue a new camera mount was created which allows for rigid attachment of the camera to several different view angles which allows for greater consistency and repeatability.
Another design implemented for this project was a set of arms used to capture the target. These arms were created with the intent of being simple with minimal moving parts to ensure smooth operation and high reliability. The basic design is two arms angled outwardly from the donkeycar front bumper and mounted to some pre-existing holes in the same front bumper with an angled ramp attached to the arms to allow for smooth transport of the ball into the holding area. The connection between the ramp and the arms is facilitated by a makeshift shock-absorber system consisting of a screw and a spring to ensure the ramp is kept flat against the ground.