Our team (Team 3-Ds on my transcript) chose to run an experimental comparison between an Intel Realsense 3D depth camera and a simple webcam for both static and dynamic obstacle avoidance. We chose this project for two reasons. First, obstacle avoidance always has to be considered when building autonomous vehicles, to help prevent damage to both the vehicle and its surroundings. Second, to our knowledge no one has integrated the Realsense into the Donkey framework, and we hope that our Donkey part and our trial and error with different 3D cameras will benefit the open source DonkeyCar community.
Ysabelle Lam [MAE]
Jordan Prazak [ECE]
Hongjian Cui [UCSD Extension]
Boomer's Revenge [Car/Zoomer Tracker]
Design and Assembly of Donkey
Plate and Camera Mount Design
Webcam: The camera mount was designed to be adjustable because we did not know the optimal camera angle for the indoor and outdoor tracks. The mount and camera case connect with an M3 screw, and the screw/nut combination lets the camera angle be adjusted easily for optimal training.
Intel Realsense: We started with the Realsense case at a 30 degree angle, which we had found to be the ideal angle for the webcam. With further use, however, we found that a flatter angle worked better, because the Realsense could see further ahead and we could make better use of the point cloud.
Mount Plate: We designed the mount plate with some tolerance. The car chassis was designed to be used for all terrain, so we wanted a mount plate that could keep up with the chassis.
Mount plate CAD
Autonomous Driving using Donkey Car Neural Network Training
Indoor video with autonomous lap.
Outdoor video with autonomous lap.
Sample loss diagram after training:
Project: 3D Camera v. Webcam Obstacle Avoidance
Choosing 3D Camera
-Software development kit only in C++
-No viable way to transfer depth data from the camera into a Python file in real time
-Could not implement into the Donkey framework
-SDK in Python
-Now able to collect depth and IR data in Python
-Wrote/found programs to implement in Donkey
-Only provides a depth field and infrared, meaning there are no RGB values
-Will be a problem outdoors (sunlight contains IR), and there is no other data to rely on outdoors
-SDK in Python
-Marketed for both indoor and outdoor use
-Provides a calibrated imaging sub-system that features Active IR or Passive Stereo Depth technology, RGB images, rolling or global shutter image sensor technology, and wide or standard FOV
-Meaning the point cloud and images are not dependent on IR
We started with the Realsense case at a 30 degree angle, which we had found to be the ideal angle for the webcam. With further use, however, we found that a flatter angle worked better, because the Realsense could see further ahead and we could make better use of the point cloud.
You can see the final iteration here:
Donkey Part for Realsense
Attempting to create a 4 channel image
At first we attempted to encode depth as a 4th channel of the image using the CMYK (Cyan, Magenta, Yellow, Black) image format. The depth channel is natively 16 bits and was scaled down to 8 bits to match the channels of the color images. We successfully generated a 4 channel image, but modifying all of the donkeycar framework code needed to accept a 4 channel image proved difficult and time consuming.
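As a rough illustration of the 4 channel idea, here is a minimal NumPy sketch. The function names are hypothetical, and the 16-to-8 bit scaling is done by keeping the top 8 bits, which is an assumption since the exact scaling method is not stated above.

```python
import numpy as np


def depth_to_uint8(depth16: np.ndarray) -> np.ndarray:
    """Scale a 16-bit depth map to 8 bits by keeping the top 8 bits.

    Assumption: a plain bit shift; the original part may have scaled differently.
    """
    return (depth16.astype(np.uint16) >> 8).astype(np.uint8)


def stack_rgbd(rgb: np.ndarray, depth16: np.ndarray) -> np.ndarray:
    """Append 8-bit depth as a 4th channel: (H, W, 3) + (H, W) -> (H, W, 4)."""
    if rgb.shape[:2] != depth16.shape:
        raise ValueError("color and depth frames must have the same height and width")
    return np.dstack((rgb, depth_to_uint8(depth16)))
```

The resulting (H, W, 4) uint8 array has the same layout that a CMYK image container expects, which is why that format can hold it.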
Removing an RGB Channel to fit Depth Data
To make a 3 channel image for the donkey car that includes depth data, we removed the green channel from the image and replaced it with the depth image.
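The channel swap can be sketched in a few lines of NumPy. The function name is hypothetical, and the sketch assumes the depth map has already been scaled to 8 bits and aligned to the color frame.

```python
import numpy as np


def replace_green_with_depth(color: np.ndarray, depth8: np.ndarray) -> np.ndarray:
    """Return a copy of a 3 channel image whose green channel holds depth.

    Green is channel index 1 in both RGB and BGR ordering.
    """
    if color.shape[:2] != depth8.shape:
        raise ValueError("color and depth frames must have the same height and width")
    out = color.copy()          # leave the input frame untouched
    out[..., 1] = depth8        # overwrite green with the 8-bit depth map
    return out
```

The output keeps the (H, W, 3) shape, so it can be fed to code that expects an ordinary color image.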
Integrating both RGB and Depth Data
To keep both the depth data and the full RGB image, we vertically concatenated a colorized depth image with the RGB data. In this step we also added the post-processing provided by Realsense (spatial/temporal filtering, hole filling).
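A minimal sketch of the concatenation step, using NumPy only. The function name is hypothetical, the colorized depth frame is assumed to be already available as an (H, W, 3) array, and placing RGB on top is an assumption since the order is not stated above.

```python
import numpy as np


def concat_rgb_and_depth(rgb: np.ndarray, depth_color: np.ndarray) -> np.ndarray:
    """Stack the RGB frame above the colorized depth frame: two (H, W, 3) -> (2H, W, 3)."""
    if rgb.shape != depth_color.shape:
        raise ValueError("both frames must have the same shape")
    return np.vstack((rgb, depth_color))
```

The result is still a 3 channel image, only twice as tall, so the framework needs a changed input height rather than a changed channel count.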
The depth data is only 1 layer, so using a 3 color image to represent depth produces redundant pixel values. To solve this, we split the depth image into 3 parts and spread them over the 3 layers (R, G, B), which reduces image size and improves training time.
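One possible reading of this step is that the single-layer depth image is cut into three horizontal strips and each strip is stored in one color layer, so the depth portion takes a third of the rows. The sketch below illustrates that reading only; the function name is hypothetical and the actual split used in the part may differ.

```python
import numpy as np


def pack_depth_into_three_layers(depth8: np.ndarray) -> np.ndarray:
    """Cut an (H, W) depth map into 3 horizontal strips and stack them as layers.

    Assumption: H is divisible by 3 (for example 480 rows -> three strips of 160).
    Returns an array of shape (H // 3, W, 3).
    """
    if depth8.shape[0] % 3 != 0:
        raise ValueError("image height must be divisible by 3")
    top, middle, bottom = np.split(depth8, 3, axis=0)
    return np.dstack((top, middle, bottom))
```

Under this reading no depth values are dropped; they are only rearranged so that every layer carries distinct data.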
Below is a sample image taken from the realsense camera while operating on the car.
The final improvement was to reduce the depth unit to 100 µm. At the default of 1 mm per unit, the 16-bit sensor output covered a range of 65.535 m, which was more than necessary and left a resolution of only about 256 mm per step after scaling down to 8 bits. After reducing the depth unit to 100 µm, the resolution improved to about 25.6 mm per step and the effective range was still 6.5535 m. The image below shows the improvement in the quality of the depth data: it is much more descriptive of the area immediately in front of the car.
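The range and resolution figures follow from simple arithmetic, sketched below with integer micrometres to avoid rounding noise. The function names are hypothetical, and the 8-bit step size assumes the scaling drops the low 8 bits (256 depth units per step), which lands close to the figures quoted above.

```python
def max_range_um(depth_unit_um: int, bits: int = 16) -> int:
    """Largest distance a depth value of the given bit width can represent."""
    return (2 ** bits - 1) * depth_unit_um


def step_after_8bit_scaling_um(depth_unit_um: int, bits: int = 16) -> int:
    """Distance covered by one 8-bit step once the low bits are dropped."""
    return 2 ** (bits - 8) * depth_unit_um


for unit_um in (1000, 100):  # default 1 mm unit, then the reduced 100 um unit
    range_m = max_range_um(unit_um) / 1_000_000
    step_mm = step_after_8bit_scaling_um(unit_um) / 1000
    print(f"unit={unit_um} um  range={range_m} m  8-bit step={step_mm} mm")
```

Shrinking the depth unit by a factor of 10 shrinks both the range and the step size by the same factor, which is the trade made here: less range that was not needed, in exchange for finer steps close to the car.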
Static Obstacle Comparison
Video w/ Webcam: https://youtu.be/NG8DB0FhqQU
Video w/ Real Sense removing Green Channel: https://youtu.be/4lgLF7CeVF4
Video w/ Real Sense and Concatenated Part: https://youtu.be/vBtSB5AywY0
Video w/ Webcam: https://youtu.be/mfGb-J7hcpY
Video w/ Real Sense and Concatenated Part: https://youtu.be/xeeLgV3EJok
Dynamic Obstacle Comparison
Ultimately, we recommend the Intel Realsense over a webcam for obstacle avoidance. For static obstacle avoidance, there isn't much of a difference in performance. With dynamic obstacle avoidance, however, the webcam completely fails, while the Realsense does a fairly good job. Realistically, anyone attempting obstacle avoidance will face obstacles that do not remain static, so the Intel Realsense is the clear choice.
Problems Encountered and Lessons Learned
This project had its ups and downs. We accomplished what we set out to do, but we hit some hiccups along the way. The main lesson was to read the documentation and product support before attempting to implement a product. We also learned that code wrappers are a blessing.