Our project integrated OpenCV into donkeycar to try to improve the training time and robustness of the model. We used OpenCV to pre-process images, extracting edges and highlighting the white and yellow lanes. We wanted to see whether training on a combined three-channel image (edges, white lane, yellow lane) would improve training time and robustness compared to the default donkeycar model, which trains on RGB images.
Here is our Google Drive with our presentations.
- Cyrus Shen - ECE
- Chenfeng Wu - ECE
- Maximilian Stadler - CSE (UCSD Extension)
- Isabella Franco - MAE
Our mechanical design is relatively simple. We designed a baseplate that fits onto the four mounting points originally meant to hold the chassis. A large slot in the middle provides the necessary space for the cables. All other components are fixed to the plate with Velcro. In retrospect, it would have been better to screw them to the baseplate: inertial forces are large in the corners and cause the parts to come loose. This is a problem, especially on the outdoor track.
To keep our recorded images consistent, we fixed the camera with four screws right from the beginning. The camera mount design itself is rather simple but provides sufficient housing for the lens. We went through two iterations because the first mount's inclination was too flat, which led to images that mainly showed the part of the track directly in front of the car. We therefore changed the inclination in the second iteration, which gave better-looking recordings, at least by human inspection.
The circuitry follows the descriptions given in class. As we do not have additional hardware, no further changes were necessary. For unsafe situations, a wireless relay acts as an emergency switch by deactivating the PWM outputs. While the relay is triggered, the outputs that have been set are not forwarded (for example, to the motors), which stops the car without resetting anything in the driving loop running on the Jetson. An LED indicator shows the status of the car: a red light indicates a stopped car, while a blue light indicates that the car is in the 'driving' state. Since we use the default circuitry, we do not give further details and only append our layout for an overview.
Aspect Ratios and Cropping
The original configuration suggested recording images at 160x120 pixels, i.e. a 4 by 3 aspect ratio, and doubling the resolution for the outdoor setting to 320x240. The camera we used is a 2 MP camera capable of recording FHD images, so its native aspect ratio is 16 by 9. To allow for easy comparison, we decided to record indoor and outdoor images at the same resolution. We furthermore tried different aspect ratios to find out how much information, especially in the corners, is lost due to cropping. We modified the framework such that only the height and the aspect ratio are specified in the configuration files; the image width is derived from them. We tested 1 by 1, 4 by 3, and 16 by 9. We concluded that the native wide-angle aspect ratio gives us the most information in the corners, which seemed to be prone to errors in the beginning anyway.
    # newly added flags for aspect ratio and image sizes
    IMAGE_H = 144
    IMAGE_ASPECT_RATIO = 16 / 9
    IMAGE_W = round(IMAGE_H * IMAGE_ASPECT_RATIO)
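As a quick check of the derivation, the following sketch computes the width for each of the three aspect ratios we tested at the configured height of 144 pixels. The helper name `derive_width` is only for illustration and is not part of the framework.

```python
# Sketch: derive the image width from a fixed height and an aspect ratio,
# mirroring IMAGE_W = round(IMAGE_H * IMAGE_ASPECT_RATIO) from the config.
IMAGE_H = 144


def derive_width(height, aspect_ratio):
    """Hypothetical helper: width in pixels for a given height and ratio."""
    return round(height * aspect_ratio)


ratios = {"1x1": 1 / 1, "4x3": 4 / 3, "16x9": 16 / 9}
widths = {name: derive_width(IMAGE_H, ratio) for name, ratio in ratios.items()}
print(widths)  # {'1x1': 144, '4x3': 192, '16x9': 256}
```

With the 16 by 9 setting, the recorded images are therefore 256x144 pixels.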
|track / aspect ratio||1x1||4x3||16x9|
The Donkeycar framework is coded in a relatively rigid fashion. Many hard-coded sections and in-line definitions make customization hard. Our goal was to compare the original models with models that use visual primitives as priors, namely edges and lane segmentations. While these primitives can be computed easily with OpenCV, the coding had to be done carefully to keep the code modular and to allow an easy comparison between the original version (without any code changes) and the OpenCV-based models. Furthermore, to allow flexible modifications, our code was kept parametric, which resulted in several additions to config.py for customizable preprocessing.
Our code-changes support different "modes". Each mode implements the abstract _OpenCvPreProcessor. Depending on the specification in the config file, the corresponding preprocessor is initialized and used. More details about the different modes are given in the next section.
    class _OpenCvPreProcessor:
        """
        Abstract class of image processor.
        Implements the framework of the shutdown and run functions
        as required by the Donkey car framework.
        """

        def __init__(self, cfg):
            self.cfg = cfg

        def shutdown(self):
            return

        def run(self, image_arr):
            # every mode overrides this method
            raise NotImplementedError
    class OpenCvPreProcessor:
        """
        Preprocessor manager: selects the kind of pre-processing based on a
        string argument and loads the corresponding class in __init__.
        """

        def __init__(self, cfg):
            self.cfg = cfg
            if cfg.CV_PREPROCESSOR_TYPE == 'canny':
                assert cfg.CV_TARGET_IMAGE_DEPTH == 1
                self.processor = OpenCvCanny(cfg)
            elif cfg.CV_PREPROCESSOR_TYPE == 'segmentation':
                assert cfg.CV_TARGET_IMAGE_DEPTH == 1
                self.processor = OpenCvColorSegmentation(cfg)
            elif cfg.CV_PREPROCESSOR_TYPE == 'combined':
                assert cfg.CV_TARGET_IMAGE_DEPTH == 3
                self.processor = OpenCvCannyAndSegmentation(cfg)
            elif cfg.CV_PREPROCESSOR_TYPE == 'combined_with_gray':
                assert cfg.CV_TARGET_IMAGE_DEPTH == 3
                self.processor = OpenCvCannySegmentationAndGray(cfg)
            else:
                raise NotImplementedError

        def run(self, image_arr):
            # the camera may not have delivered a frame yet
            if image_arr is None:
                return None
            processed_image = self.processor.run(image_arr)
            return processed_image

        def shutdown(self):
            self.processor.shutdown()
Newly Added Configuration Settings
To make changes to our cv-framework easy, we kept all parameters modular and added corresponding flags to the configuration. The newly added configuration options are given here.
    USE_OPENCV = False                          # boolean flag for whether images are preprocessed using opencv
    CV_STORE = False                            # whether to store raw-image or pre-processed image (TRUE == store cv-processed image)
    CV_PREPROCESSOR_TYPE = 'combined_with_gray' # describes type of pre-processing (canny | segmentation | combined | combined_with_gray)
    CV_TARGET_IMAGE_DEPTH = 3
    CV_STORE_INF = False                        # whether to store "inference-input" during ai-mode for testing
    CV_COLOR_MODE = 'indoor'                    # mode for color segmentation (indoor | outdoor)

    # OPENCV COLOR SEGMENTATION SETTING - INDOOR
    # WHITE
    CV_WHITE_LOWER_IN = [140, 140, 140]
    CV_WHITE_UPPER_IN = [255, 255, 255]
    # YELLOW
    CV_YELLOW_LOWER_IN = [25, 110, 30]
    CV_YELLOW_UPPER_IN = [70, 160, 70]

    # OPENCV COLOR SEGMENTATION SETTING - OUTDOOR
    # WHITE
    CV_WHITE_LOWER_OUT = [170, 170, 170]
    CV_WHITE_UPPER_OUT = [255, 255, 255]
    # YELLOW
    CV_YELLOW_LOWER_OUT = [5, 100, 70]
    CV_YELLOW_UPPER_OUT = [20, 180, 170]

    # OPENCV CANNY SETTING
    CV_CANNY_MIN = 50
    CV_CANNY_MAX = 125
    CV_CANNY_APPLY_HOUGH = False
    CV_HOUGH_MIN_VOTES = 20
    CV_HOUGH_LINE_THICKNESS = 5
    CV_MIN_LINE_LENGTH = 5
    CV_MAX_LINE_GAP = 10

    # OPENCV ROI SELECTION SETTING
    CV_MAKE_COORDINATE_UPPER_LIMIT = 3 / 4
    CV_ROI_TYPE = 'crop'                        # type of roi-operation (crop | mask | None)
    CV_ROI_Y_UPPER_EDGE = 65

    # OPENCV GAUSSIAN BLUR setting
    CV_GAUSSIAN_KERNEL = (5, 5)
    CV_GAUSSIAN_SIGMA = 0.0                     # might be unnecessary at all, if 0 is being used
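The indoor and outdoor thresholds are selected via `CV_COLOR_MODE`. One way such a lookup could work is sketched below. The helper `select_color_bounds` and the use of `SimpleNamespace` as a stand-in for the donkeycar config object are assumptions for illustration; only the flag names and values come from the settings above.

```python
from types import SimpleNamespace

# Stand-in for the donkeycar config object, holding the flags listed above
cfg = SimpleNamespace(
    CV_COLOR_MODE='indoor',
    CV_WHITE_LOWER_IN=[140, 140, 140], CV_WHITE_UPPER_IN=[255, 255, 255],
    CV_YELLOW_LOWER_IN=[25, 110, 30], CV_YELLOW_UPPER_IN=[70, 160, 70],
    CV_WHITE_LOWER_OUT=[170, 170, 170], CV_WHITE_UPPER_OUT=[255, 255, 255],
    CV_YELLOW_LOWER_OUT=[5, 100, 70], CV_YELLOW_UPPER_OUT=[20, 180, 170],
)


def select_color_bounds(cfg):
    """Hypothetical helper: return {color: (lower, upper)} for the active mode."""
    suffix = {'indoor': 'IN', 'outdoor': 'OUT'}[cfg.CV_COLOR_MODE]
    bounds = {}
    for color in ('WHITE', 'YELLOW'):
        lower = getattr(cfg, f'CV_{color}_LOWER_{suffix}')
        upper = getattr(cfg, f'CV_{color}_UPPER_{suffix}')
        bounds[color.lower()] = (lower, upper)
    return bounds


indoor_bounds = select_color_bounds(cfg)
cfg.CV_COLOR_MODE = 'outdoor'
outdoor_bounds = select_color_bounds(cfg)
print(indoor_bounds['yellow'], outdoor_bounds['yellow'])
```

An unknown mode string raises a `KeyError` here, which surfaces configuration typos early.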
Changes to manage.py
To include the preprocessor while the car is running, we mainly modified the sections of the code where image preprocessing already takes place (cropping and normalization). We introduced an if-else branch to include the new pipeline. Furthermore, we changed several other lines of code to allow storing the preprocessed images or the inference input for debugging purposes.
    if "coral" in model_type:
        inf_input = 'cam/image_array'
    # new OPENCV-flag: insert preprocessor between cropping and normalize
    # therefore splitting the normalize_and_crop function was necessary
    # default-code in else-branch
    elif cfg.USE_OPENCV:
        from opencv.preprocessor import OpenCvPreProcessor
        if cfg.CV_ROI_TYPE == 'crop':
            V.add(ImgPreProcessCrop(cfg),
                  inputs=['cam/image_array'],
                  outputs=['cam/cropped'])
            V.add(OpenCvPreProcessor(cfg),
                  inputs=['cam/cropped'],
                  outputs=['cam/image_array_cv'],
                  threaded=False)
        else:
            V.add(OpenCvPreProcessor(cfg),
                  inputs=['cam/image_array'],
                  outputs=['cam/image_array_cv'],
                  threaded=False)
        inf_input = 'cam/normalized/cropped'
        V.add(ImgPreProcessNormalize(cfg),
              inputs=['cam/image_array_cv'],
              outputs=[inf_input],
              run_condition='run_pilot')
    else:
        inf_input = 'cam/normalized/cropped'
        V.add(ImgPreProcess(cfg),
              inputs=['cam/image_array'],
              outputs=[inf_input],
              run_condition='run_pilot')
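The order crop, preprocess, normalize can be sketched on a plain NumPy array as follows. This is a simplified stand-in, not the actual parts: `preprocess_stub` replaces `OpenCvPreProcessor.run`, and reading `CV_ROI_Y_UPPER_EDGE` as "drop all rows above this index" is an assumption. One plausible reason for placing the preprocessor before normalization is that OpenCV routines such as `cv2.Canny` expect 8-bit input, whereas the normalized image is a float array in [0, 1].

```python
import numpy as np

ROI_Y_UPPER_EDGE = 65  # value of CV_ROI_Y_UPPER_EDGE; its exact meaning is assumed


def crop(img_u8, y_upper):
    """Keep only the rows below the upper ROI edge."""
    return img_u8[y_upper:, :, :]


def preprocess_stub(img_u8):
    """Stand-in for the OpenCV preprocessor: takes and returns 8-bit data."""
    assert img_u8.dtype == np.uint8
    return 255 - img_u8  # placeholder operation, not our real pre-processing


def normalize(img_u8):
    """Scale 8-bit pixel values to floats in [0, 1]."""
    return img_u8.astype(np.float32) / 255.0


frame = np.random.randint(0, 256, size=(144, 256, 3), dtype=np.uint8)
model_input = normalize(preprocess_stub(crop(frame, ROI_Y_UPPER_EDGE)))
print(model_input.shape, model_input.dtype)
```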
Changes to train.py (mainly in donkeycar.utils)
Changes in the training loop were also necessary to run the preprocessor when loading the training images. For this, we mainly changed the load_scaled_image_arr function, besides several small changes in the training loop.
    # Image (PIL) and np (numpy) are imported at module level in donkeycar/utils.py
    def load_scaled_image_arr(filename, cfg, cv_preprocessor=None):
        """
        load an image from the filename, and use the cfg to resize if needed
        also apply cropping and normalize
        """
        import donkeycar as dk
        try:
            img = Image.open(filename)
            if img.height != cfg.IMAGE_H or img.width != cfg.IMAGE_W:
                img = img.resize((cfg.IMAGE_W, cfg.IMAGE_H))
            img_arr = np.array(img)
            # new OPENCV-flag
            # run preprocessor before loading training image_arr
            if cfg.USE_OPENCV:
                img_arr = crop(img_arr, cfg)
                img_arr = cv_preprocessor.run(img_arr)
                img_arr = normalize(img_arr)
            else:
                img_arr = normalize_and_crop(img_arr, cfg)
            croppedImgH = img_arr.shape[0]
            croppedImgW = img_arr.shape[1]
            # account for canny image
            if cfg.CV_TARGET_IMAGE_DEPTH == 1:
                img_arr = img_arr.reshape(croppedImgH, croppedImgW, 1)
            elif img_arr.shape[2] == 3 and cfg.IMAGE_DEPTH == 1:
                img_arr = dk.utils.rgb2gray(img_arr).reshape(croppedImgH, croppedImgW, 1)
        except Exception as e:
            print(e)
            print('failed to load image:', filename)
            img_arr = None
        return img_arr
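The reshape for single-channel modes exists because an edge or segmentation image has shape (H, W), while the network input needs an explicit channel axis. A minimal sketch of that step, with a hypothetical helper name and an example size of 79x256:

```python
import numpy as np


def ensure_channel_axis(img_arr):
    """Hypothetical helper: turn an (H, W) image into (H, W, 1)."""
    if img_arr.ndim == 2:
        height, width = img_arr.shape[0], img_arr.shape[1]
        return img_arr.reshape(height, width, 1)
    return img_arr


edges = np.zeros((79, 256), dtype=np.float32)        # e.g. a normalized edge image
combined = np.zeros((79, 256, 3), dtype=np.float32)  # e.g. a three-channel image

print(ensure_channel_axis(edges).shape)     # (79, 256, 1)
print(ensure_channel_axis(combined).shape)  # (79, 256, 3)
```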
Lane Finding with OpenCV
HSL Colors for Lane Detection
To single out the white and yellow lanes from the rest of the image, we thresholded the image for the colors white and yellow in the HSL color space.
The RGB image taken by the camera was converted to HSL using OpenCV's COLOR_RGB2HLS and then thresholded so that only white and yellow colors in the image were kept.
|white lane segmentation||orange lane segmentation||final segmentation result|
Canny Edge Detection
To outline the lanes, we used OpenCV's Canny function, which detects edges in an image.
Combined 3-D image
We combined the white, yellow, and Canny edge images into a single three-channel image and trained on that instead of the original RGB image.
|combined||combined with grey-scale|
|canny edges (blue), white segmentation (green), yellow segmentation (red)||gray-scale image (blue)|
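Stacking the three single-channel images can be sketched with `np.dstack`. The channel order below (yellow in red, white in green, edges in blue, for an RGB image) follows the color assignment listed above; the function name and the tiny test masks are for illustration only.

```python
import numpy as np


def combine_channels(yellow_mask, white_mask, edges):
    """Stack three (H, W) uint8 images into one (H, W, 3) RGB image."""
    # R = yellow segmentation, G = white segmentation, B = canny edges
    return np.dstack([yellow_mask, white_mask, edges])


h, w = 4, 5
yellow_mask = np.zeros((h, w), dtype=np.uint8)
white_mask = np.zeros((h, w), dtype=np.uint8)
edges = np.zeros((h, w), dtype=np.uint8)
yellow_mask[:, 0] = 255  # left column: yellow lane
white_mask[:, 4] = 255   # right column: white lane
edges[0, :] = 255        # top row: an edge

combined = combine_channels(yellow_mask, white_mask, edges)
print(combined.shape)   # (4, 5, 3)
print(combined[2, 0])   # [255   0   0]
```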
In order to compare our OpenCV model to the current default model, we trained both models on the same images. For the indoor models, we trained separate models on 1000, 5000, 10000, 20000, and 30000 clockwise training images from the indoor track. For the outdoor models, we trained on 1000, 5000, and 10000 clockwise training images from the outdoor track.
None of the clockwise models worked when driving counterclockwise.
We tested each model based on the following criteria:
- Training time
- Local Angle completion of 3 clockwise laps on the respective track
- Autonomous completion of 3 clockwise laps on the respective track
- Ability to drive when the environment is brighter or darker than in the training images
Indoor Track Model Results
Robustness Against Brighter Environment
The 1000, 5000, and 10000 indoor images were taken with the Makerspace doors closed, so the part of the track in front of the doors appeared darker. When we tested those 3 models with the Makerspace doors open, which brightens that part of the track, none of the 3 default models could pass it. The OpenCV 10000 model, however, passed that part of the track most of the time, whereas the default 10000 model passed it 0 times. This is a good sign that our OpenCV model works better than the default under brighter track conditions.
Example of default models failing with door open:
At 20000 and 30000 training images, both the default and the OpenCV models could pass that part of the track, which could be due to overfitting.
Faster Training Time
Our OpenCV models trained on average 1 minute faster than the default models.
Less Training Data Required
Our OpenCV models could run autonomously on 5000 training images while the default model required 10000 training images.
The following training times are from training with 420 x 175 pixel images using Ubuntu 18.04.3 LTS on a GeForce GTX 965M NVIDIA GPU. Those marked with "N/A" were trained using the UCSD Supercomputer Center and are thus not comparable.
Default model:
|Criteria \ Images Trained||1000||5000||10000||20000||30000|
|Training Time (min:sec)||02:23||06:10||09:35||N/A||N/A|
|Local Angle Completion||Χ||Χ||✓||✓||✓|
OpenCV model:
|Criteria \ Images Trained||1000||5000||10000||20000||30000|
|Training Time (min:sec)||01:24||05:20||08:21||N/A||N/A|
|Local Angle Completion||Χ||✓||✓||✓||✓|
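The "1 minute faster on average" figure can be reproduced from the comparable indoor training times. The sketch below assumes that the first indoor table holds the default model's results and the second the OpenCV model's, which matches the reported data requirements (10000 versus 5000 images).

```python
def to_seconds(mmss):
    """Convert a 'mm:ss' string to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


# Indoor training times for 1000, 5000 and 10000 images (N/A entries omitted)
default_times = ["02:23", "06:10", "09:35"]
opencv_times = ["01:24", "05:20", "08:21"]

savings = [to_seconds(d) - to_seconds(o)
           for d, o in zip(default_times, opencv_times)]
mean_saving = sum(savings) / len(savings)
print(savings, mean_saving)  # [59, 50, 74] 61.0
```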
Outdoor Track Model Results
Our OpenCV outdoor models performed significantly better than the default outdoor models. The 5000 and 10000 OpenCV models completed the 3 laps around the outdoor track while the 5000 and 10000 default models could not finish half of the track, failing on the curves.
During the third week of class, we trained a default outdoor model with ~20000 images and it successfully completed 3 laps on the track (so we know that the default model can be used on the outdoor track). With our OpenCV preprocessing, we significantly reduce the number of training images needed, from ~20000 to as few as 5000.
To test criterion 4 (darker environment), we also tested the models at 5pm, when the sun was setting and the track was darker than in the training images. Unfortunately, none of the models worked when the track was darker, although our OpenCV models did get farther on the track than the default models.
Default model:
|Criteria \ Images Trained||1000||5000||10000|
|Training Time (min:sec)||00:55||04:19||06:56|
|Local Angle Completion||Χ||Χ||✓|
OpenCV model:
|Criteria \ Images Trained||1000||5000||10000|
|Training Time (min:sec)||01:19||03:45||06:30|
|Local Angle Completion||Χ||✓||✓|
The outdoor models were trained with the same image size and on the same hardware as the indoor models.
OpenCV image processing allows our car to drive autonomously with as few as 5000 training images.