From MAE/ECE 148 - Introduction to Autonomous Vehicles

Team Members

Flow Diagram for Stereo P-TAM
Baseplate. Designed to mount to the RC car and accommodate wiring from below as well as a camera mount
Monocular Camera Mount. Designed to be rigid in case of crashes
Stereo Camera Mount. Uses a fixed angle to allow for better training
The electronic schematic for the RC car

Keshav Rungta - Electrical Engineering
Adam Molani - Mechanical Engineering
Wayde Gilliam - Data Science


Our project was to create an autonomous RC car using stereoscopic imaging and simultaneous localization and mapping (SLAM), and to compare its performance to basic autonomous driving using a single webcam. We built upon Donkey Car, an existing framework for autonomous RC cars that is designed to run on single-board computers (SBCs) such as the Raspberry Pi. We opted for an alternative SBC, the Nvidia Jetson Nano, because it provides more processing power and therefore allowed us to pursue a more computationally intensive project. The remaining components and their connections are shown in the electronic schematic. Once we had assembled the car and confirmed that the Donkey Car framework performed properly, we shifted our attention to integrating SLAM into the framework.

Problem Overview

SLAM, or Simultaneous Localization and Mapping, is a major problem in robotics: the system must construct a map of an unfamiliar environment while simultaneously computing its own position within that map. With cameras as the only sensors, it is closely related to visual odometry. Raw depth information from sensors such as depth cameras and lidars makes the problem much easier. We instead attempted to implement SLAM with stereo vision, feed the result to the Donkey Car framework, and compare its performance with that of the native framework. We wanted to analyze the effect of these additional channels of information on the Donkey framework. Given more time, we would have liked to fine-tune the SLAM and Donkey parameters so that the car performed better than it does with the unmodified Donkey Car framework.


Capture synchronized images from two points of view and implement stereo vision

Perform SLAM using stereo vision on personal computer

Implement SLAM with the Donkey Car framework

Analyze the effects on performance compared to basic mono vision

Mechanical Design

The mechanical design of our project was limited, since achieving a SLAM implementation is primarily a software task. The main mechanical consideration is the location and angle of the camera relative to the chassis. Our camera mounts were designed with structural stability as the focus and were not adjustable. Their rigidity ensured that the camera did not move as the car drove and crashed into obstacles, which is essential because any variation in camera position can affect the training and execution of an autonomous model.

In addition, we laser-cut a baseplate so that we could attach electronics to its surface and wire them to the RC car components below. We left a large slot in the middle to allow easy wiring between the two levels.

View of the latency caused by the lack of synchronization. The latency is easy to see because the right image has moved closer to the cones (this was captured with no added delay between the two captures).
View of the image synchronization. The two images look very similar to each other with the only difference being that the right image is shifted to the right

Electrical Design

Beyond the basic electronics of the autonomous vehicle, the only addition was the stereo camera setup, which came with its own set of challenges.

Initial Attempt

We started by attempting stereoscopic vision with two independent webcams. The idea was to read the time stamp of every frame from each camera and use it to place the images into bins according to which frame each image most likely belonged to. We first determined the frame rate of each camera (20 FPS), which corresponds to one image every 0.05 seconds. We then created a bin of ±0.025 seconds around the time stamp of each frame from one camera and tried to slot the matching image from the second camera into this bin. This is certainly possible when post-processing the streams, but doing it in real time was very challenging, because placing the images into the proper bins and then running SLAM on them takes up a lot of time.
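The binning approach described above can be sketched for recorded (offline) streams as follows. This is only an illustration under stated assumptions, not the project's original code: the function name `pair_frames` and the plain lists of time stamps are hypothetical, while the 20 FPS rate and the ±0.025 second bin come from the description above.

```python
from bisect import bisect_left

FRAME_PERIOD = 1.0 / 20.0       # 20 FPS -> one frame every 0.05 s
HALF_BIN = FRAME_PERIOD / 2.0   # accept matches within +/- 0.025 s


def pair_frames(left_stamps, right_stamps, half_bin=HALF_BIN):
    """Pair each left time stamp with the closest right time stamp in its bin.

    Both inputs are sorted lists of capture times in seconds. Returns a list
    of (left_index, right_index) tuples; left frames with no right frame
    inside the bin are dropped.
    """
    pairs = []
    for i, t in enumerate(left_stamps):
        pos = bisect_left(right_stamps, t)
        # The closest right stamp is either just before or just after t.
        candidates = [j for j in (pos - 1, pos) if 0 <= j < len(right_stamps)]
        if not candidates:
            continue
        best = min(candidates, key=lambda j: abs(right_stamps[j] - t))
        if abs(right_stamps[best] - t) <= half_bin:
            pairs.append((i, best))
    return pairs
```

For example, with left time stamps `[0.00, 0.05, 0.10, 0.15]` and right time stamps `[0.01, 0.062, 0.14]`, the third left frame is dropped because its nearest right frame is 0.038 seconds away, which is outside the bin.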


Realizing that synchronizing independent webcams was too problematic, we requested a hardware-synchronized camera, which performed much better. However, the camera that we received was out of focus and the lens was not glued properly. To improve the image quality, we removed the glue, cleaned the lenses, and reattached them properly, adjusting the focus while checking the camera feed in the Donkey online viewer. Once we overcame these problems, the synchronized camera worked well with the Donkey framework, and the only adjustment needed was to split the image based on the resolution of the camera.

Although we chose the Nvidia Jetson Nano for its computational advantage over the Raspberry Pi, it is still limited in its capabilities. The Nano uses an ARM-based processor that is by no means comparable to the processors in modern personal computers. Because SLAM is very computationally intensive, we had to iterate through multiple libraries and packages to find one that ran successfully on the Nano.


We attempted to synchronize our two webcams so that we could collect an image from each simultaneously. This was, however, more challenging than we originally thought. No matter how fast we captured the data, one camera would always be slightly ahead of the other. This is a problem for the SLAM implementation, which needs to gauge the depth of feature points from the disparity between the two points of view.
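To see why even a small timing offset matters, recall the standard relation for a rectified stereo pair, Z = f * B / d, where f is the focal length in pixels, B is the baseline between the cameras, and d is the disparity in pixels. The sketch below uses made-up values (a 700 px focal length and a 6 cm baseline), not measurements from our camera.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in meters of a point seen by a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


# Hypothetical camera parameters, chosen only for illustration.
FOCAL_PX = 700.0
BASELINE_M = 0.06

# Disparity of a feature when both images are captured at the same instant.
true_depth = depth_from_disparity(21.0, FOCAL_PX, BASELINE_M)

# If the car moves between the two captures, the feature shifts in one image
# and the measured disparity is off by a few pixels.
skewed_depth = depth_from_disparity(24.0, FOCAL_PX, BASELINE_M)
```

In this hypothetical case, a 3-pixel disparity error moves the estimated depth from about 2.0 m to about 1.75 m.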

We set up ROS Melodic on the Jetson and wrote a node to collect images from the stereo camera. This is convenient because it allows Donkey to simply subscribe to the ROS topic and read the stream of data directly, which saves the Donkey framework a little run time. ROS also allows us to run RTAB-Map, a SLAM implementation, with our stereo vision setup.

Once we had stereo vision working with the hardware-synchronized camera, we wrote a few lines of code to split the single concatenated image from the two cameras into two separate images.
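A minimal sketch of such a split is given below. It assumes the synchronized camera delivers one frame with the left image in the left half of the columns and the right image in the right half, as a NumPy array (the format OpenCV returns). The function name `split_stereo_frame` is hypothetical, and this is not the project's original code.

```python
import numpy as np


def split_stereo_frame(frame):
    """Split a side-by-side stereo frame into (left, right) images.

    `frame` is an H x W x C (or H x W) array in which the left camera
    occupies the left half of the columns and the right camera the rest.
    """
    width = frame.shape[1]
    half = width // 2
    left = frame[:, :half]
    right = frame[:, half:2 * half]  # drops a trailing column if W is odd
    return left, right
```

Both halves are views into the original array, so call `.copy()` on them if the source frame buffer will be reused.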

ROS Environment Setup

This install guide is for ROS Melodic on Ubuntu 18.04. If you use a different version of ROS or Ubuntu, you will likely run into problems installing and building the various libraries needed for this project. This part was by far the most difficult to set up: it took a lot of trial and error to get everything to build and install correctly.

1) Install ROS Melodic

Follow the official installation instructions here.

Note: the majority of ROS packages are built/maintained in Python 2 and extra installation configuration is needed for installing ROS libraries in virtual environments. For our project, we will be installing everything using our system Python 2 for simplicity, though given more time it would be worth it to get everything running in a virtual environment.

2) Setup Catkin Workspace

Follow the official instructions for setting up your workspace here.

When creating our ROS package for DonkeySLAM, we will be working inside the Catkin Workspace, which holds all of our package data and links everything ROS-related together.

Note: Make sure you ```source devel/setup.bash``` every time you begin working in the workspace. This configures your development environment and sets all required PATH variables.

3) Build Required ROS Packages

We will be building several packages inside our Catkin Workspace to go from two monocular image feeds to SLAM point clouds:

1. Clone each of the following packages into ```[catkin_ws_path]/src/```

camera_info_manager_py  (ROS CameraInfo message conversion)
vision_opencv (OpenCV with ROSPY)
donkey_slam (our package that uses OpenCV to output SLAM points)

2. Change into the top level directory of your workspace and run ```catkin_make install```. This will build and install all packages inside your /src directory.

Note: You need to run ```catkin_make``` whenever you update or change packages to ensure that everything is rebuilt properly and nothing breaks.

4) Install RTABMAP

This is the ROS package we’ll be using to run SLAM on the stereo images we receive, as it’s the most well-documented and widely used ROS package for this kind of application. Follow the installation instructions here and install for Melodic.

You’ll have to calibrate your stereo camera with ROS, so follow the official calibration documentation here if you haven’t already.

5) Install ROSPY and required Python packages

We will be using ROSPY for communication between ROS nodes and the topics/messages they generate. It should already have been installed along with ROS Melodic. Install each of the packages below individually (we will need them for our donkey_slam ROS package):

# numpy - math/tensor operations
pip install numpy

# Pillow (PIL) - image processing
pip install Pillow

# cv_bridge - communicate between opencv images and ROS image messages
sudo apt-get install ros-melodic-cv-bridge

6) Optional - Configure Tmuxp

You will be running a lot of processes simultaneously by the time you get to running RTABMAP. To avoid having a dozen different terminal windows open, configure Tmuxp with as many panes/windows as you need (it lets you have multiple virtual windows in a single terminal window). An example YAML configuration file is shown below. You’ll have to modify it to suit your own environment:


session_name: ros
windows:
  - window_name: dev window
    layout: tiled
    shell_command_before:
      - cd ~/catkin_ws/
      - source devel/setup.bash
      - PS1='\u:\W\$ '
    panes:
      - shell_command:
          - clear
      - shell_command:
          - clear
      - shell_command:
          - source ~/.bashrc
          - conda activate ros
          - cd ~/development/classes/ece148/ros
          - jupyter notebook
          - clear

DonkeySLAM Environment Setup

While all of the ROS stuff will be running on our system Python 2, fortunately Donkeycar is built around running everything in an Anaconda Python 3 virtual environment. When driving, training, or doing inference, we will be working in this conda environment.

1) Install Donkeycar and configure Conda environment

Follow the official Donkeycar instructions for setting up your conda environment here.

We will simply be modifying a few of the framework files, along with the files generated in the local car directory from which you’d normally run ```manage.py``` and ```train.py```, so setting up your conda environment is identical to what you would do for normal Donkeycar operation. The following steps assume you’ve already installed and configured all the components of your own physical car and can drive and train it normally. Please go through the official Donkeycar documentation and do this if you haven’t already.

For reference, you can see what our Donkeycar library and car setup look like by cloning this repository. It has all of the modifications we’ll be detailing below.

2) Modify Donkeycar Framework

We will now go through each file that will be updated, what’s being changed, and why. Ideally, we would have a forked version of the official Donkeycar library that sets all of this up automatically and cleanly, but for now we are sticking to the classic copy-and-paste method of updating everything due to the time constraints of this project. For each file below, copy and paste all of the code from the reference repo into the corresponding file in the installed Donkeycar library or car directory. Paths to data tubs and SLAM output directories will need to be updated for your environment.

../donkeycar/donkeycar/parts/cv.py The ```CvCam``` class is updated in that we no longer get images from a cv2 video capture, but from saved images in a directory that are produced by our donkey_slam ROS node.

../donkeycar/donkeycar/parts/datastore.py Every time Donkey records json data, we open the most recent json file produced by our donkey_slam node and add its contents to our json dict. This is a very basic way of pulling in our SLAM points and should be done more robustly at some point, as there is some unaccounted-for desynchronization in the timing.

../donkeycar/donkeycar/parts/keras.py The ```default_n_linear``` Keras model is updated to take an image array and a SLAM point array as inputs. There are essentially two linear models inside this model: one that handles the image data and one that handles the SLAM data. Each produces two values, throttle and steering; their results are then averaged and evaluated with an MSE loss.

../mycar/train.py The SLAM point data is fed into the Keras linear model, which originally took only a single monocular image, so the data batch generator is updated accordingly.

../mycar/myconfig.py Some configuration settings will need to be updated based on your setup for this project, most importantly ```CAMERA_TYPE``` and ```DATA_PATH```.
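The datastore.py change described above might look roughly like the following. This is a sketch, not the code from the reference repository: the helper names, the `slam/points` record key, and the assumption that the ROS node writes one `.json` file per update into a single directory are all hypothetical.

```python
import glob
import json
import os


def latest_json(slam_dir):
    """Return the path of the most recently modified .json file, or None."""
    paths = glob.glob(os.path.join(slam_dir, "*.json"))
    return max(paths, key=os.path.getmtime) if paths else None


def add_slam_points(record, slam_dir, key="slam/points"):
    """Copy the newest SLAM output into a Donkey-style record dict."""
    path = latest_json(slam_dir)
    if path is None:
        return record  # nothing produced yet; leave the record unchanged
    try:
        with open(path) as f:
            record[key] = json.load(f)
    except ValueError:
        # The file may still be mid-write by the ROS node; skip this record.
        pass
    return record
```

Picking the newest file by modification time is what makes this approach simple, and it is also the source of the timing mismatch noted above: the newest SLAM file is not guaranteed to correspond to the image being recorded.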

A Walkthrough of the ROS donkey_slam Package

In the donkey_slam ROS package, there are two main Python files that we’ll be using.

donkey_slam/scripts/stereo_image_stream.py Processes and streams image data as ROS messages. A multitude of other messages will be generated downstream from the initial messages this image stream publishes.

donkey_slam/scripts/stereo_image_receiver.py Reads raw image messages and RTABMAP point cloud data and outputs them as images and json files, respectively. The most recent outputs are read in by Donkey and used in its pipeline of data preprocessing, model training, and inference.
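One way the receiver could write its point cloud output is sketched below. This is an illustration under stated assumptions rather than the actual `stereo_image_receiver.py`: the function name, the file naming by time stamp, and the json layout are hypothetical. Writing to a temporary file and then renaming it means a reader on the Donkey side never opens a half-written file. The sketch avoids Python 3 only calls because the ROS side runs on Python 2.

```python
import json
import os
import tempfile


def write_points_json(points, out_dir, stamp):
    """Write a list of [x, y, z] points to <out_dir>/<stamp>.json."""
    if not os.path.isdir(out_dir):
        os.makedirs(out_dir)
    final_path = os.path.join(out_dir, "{:.6f}.json".format(stamp))
    # Write to a temporary file in the same directory first. The ".tmp"
    # suffix keeps it out of any "*.json" search done by the reader.
    fd, tmp_path = tempfile.mkstemp(dir=out_dir, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump({"stamp": stamp, "points": points}, f)
    # On POSIX, a rename within one filesystem is atomic.
    os.rename(tmp_path, final_path)
    return final_path
```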

Running ROS and Donkey Simultaneously

1) Run ROS Functions

Before anything else, we must run all the necessary ROS functions to generate stereo images and SLAM point data. In your Catkin Workspace, run each of the following lines, in order, in separate terminal windows:

$ roscore

Links all ROS topics and nodes together.

$ rosrun donkey_slam stereo_image_stream.py

Pulls stereo images from OpenCV and publishes them, along with the camera info, to ROS topics.

$ ROS_NAMESPACE=stereo_cam rosrun stereo_image_proc stereo_image_proc

Preprocesses stereo images and generates a bunch of ROS topics that RTABMAP needs to do its thing.

$ roslaunch rtabmap_ros donkey_slam.launch stereo_namespace:="/stereo_cam" rtabmap_args:="--delete_db_on_start"

Launches RTABMAP. Make sure your donkey_slam.launch file is in the same directory as the RTABMAP launch files.

$ rosrun donkey_slam stereo_image_receiver.py

Writes stereo images and SLAM point cloud outputs into directories for Donkey to pull from. This is a quick and dirty way of transferring data between the two and should be improved.

2) Run Donkey Functions

Now you can just run Donkeycar functions as you normally would in your ```../mycar/``` directory and donkey conda environment.

For example, driving the car would just look like this:

$ python manage.py drive

For training:

$ python train.py --tub [tub_path] --model=models/stereo_model.h5

Useful Knowledge

Some background knowledge of ROS and embedded systems is very helpful, particularly for debugging: ARM-based development is not easy because compatibility is a big issue. A good understanding of Python and C++ is also helpful for writing code for the different sub-tasks in this project.

This is a view of how Stereo P-TAM performed without any hyperparameters being tuned for the smaller scale. We can see that the algorithm is able to generate the point cloud of the features.



ROS DonkeySLAM Package:


Source files for modified Donkeycar:



The biggest challenge we faced was implementing SLAM on the Jetson Nano. The Jetson has an ARM-based CPU, which means it is not compatible with many libraries and packages that were originally built for the x86 architecture. In addition, SLAM is notoriously demanding computationally, and some libraries would not even build on the Jetson Nano. As a result, we constantly had to adapt to these restrictions as we tried new implementations.

In addition, proper synchronization between the two cameras was crucial, and we were glad to get the hardware-synchronized camera working properly with ROS on the Jetson.

Future Work

We want to fine-tune our setup so that it can run SLAM properly and allow the car to predict its next actions much better than it would with the base Donkey setup. This should be possible because SLAM also tells the car where it is in the map it generates, allowing the car to navigate the track properly.


ROS Wiki
Stack Overflow
RTAB-Map
Stereo P-TAM
GeeksforGeeks

Final Presentation