[Part 2 of 2]
In the first post of our DeepRacer series, we introduced DeepRacer and explained the general concept behind it.
Well, that’s all well and good, I hear you say, but show us the good stuff!
First, we have to create our model. We must choose the track to train our model on.
AWS have 6 options; where you wish to race your model will determine which track you select. For this experiment we chose the London Loop so we could take part in the Virtual League.
Next, we set up the car's action space. This essentially outlines what the car can do: the maximum steering angle, the maximum speed, and the granularity of each of these settings. How to select these settings will ultimately depend on the track. If you are training on the straight track, then giving the car the ability to make left and right turns is somewhat wasteful - and may slow down your training.
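To make the idea of an action space a bit more concrete, here is a rough sketch in Python of how the four settings could expand into a list of discrete actions. This is our own illustration, not AWS code: the function name build_action_space and its parameters are made up for this post, and we are assuming the steering angles are spread evenly between full left and full right and the speeds evenly up to the maximum.

```python
from itertools import product


def build_action_space(max_steering_deg, steering_levels, max_speed, speed_levels):
    """Enumerate discrete (steering angle, speed) actions.

    Hypothetical helper for illustration only. Assumes angles are evenly
    spaced from -max to +max and speeds evenly spaced up to max_speed.
    """
    if steering_levels == 1:
        # Only one steering option: drive straight ahead.
        angles = [0.0]
    else:
        step = 2 * max_steering_deg / (steering_levels - 1)
        angles = [-max_steering_deg + i * step for i in range(steering_levels)]

    speeds = [max_speed * (i + 1) / speed_levels for i in range(speed_levels)]

    # Every combination of angle and speed is one action the car can pick.
    return [{"steering_angle": a, "speed": s} for a, s in product(angles, speeds)]


if __name__ == "__main__":
    for action in build_action_space(30, 5, 3.0, 2):
        print(action)
```

With 5 steering levels and 2 speed levels the car has 10 actions to choose from, while a straight-line setup with a single steering level has only 2. Fewer actions means less for the model to explore, which is why a small action space can speed up training on a simple track.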
Perhaps the most important thing to decide when setting up the model is the reward system. For the model to understand what it is doing right, you need to define a reward function.
This is written in Python and contains the basic logic for how to reward your model.
In the photo above, we chose to reward the car for:
1. Keeping close to the centre line
2. Progress made around the track
We would then penalise oversteering and leaving the track.
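For anyone who wants a starting point, here is a minimal sketch of a reward function along the lines described above. It is not the exact function we trained with: the weights, the steering threshold and the tiny off-track reward are illustrative values we picked for this example, so expect to tune them. The params keys used here are ones DeepRacer passes into the reward function.

```python
# Illustrative values only; these are assumptions, not tuned settings.
STEERING_THRESHOLD_DEG = 15.0  # beyond this we treat the turn as oversteering
OVERSTEER_FACTOR = 0.8         # multiply the reward by this when oversteering
MIN_REWARD = 1e-3              # near-zero reward for leaving the track


def reward_function(params):
    """Reward staying near the centre line and making progress."""
    all_wheels_on_track = params["all_wheels_on_track"]
    distance_from_center = params["distance_from_center"]
    track_width = params["track_width"]
    progress = params["progress"]  # percentage of the lap completed, 0 to 100
    steering_angle = abs(params["steering_angle"])

    # Leaving the track earns almost nothing.
    if not all_wheels_on_track:
        return MIN_REWARD

    # 1. Keeping close to the centre line: 1.0 on the line, 0.0 at the edge.
    half_width = 0.5 * track_width
    centre_score = max(0.0, 1.0 - distance_from_center / half_width)

    # 2. Progress made around the track, scaled to 0..1.
    progress_score = progress / 100.0

    reward = centre_score + progress_score

    # Penalise oversteering.
    if steering_angle > STEERING_THRESHOLD_DEG:
        reward *= OVERSTEER_FACTOR

    return float(max(reward, MIN_REWARD))
```

You can try it outside the console by calling it with a hand-made dictionary, for example reward_function({"all_wheels_on_track": True, "distance_from_center": 0.0, "track_width": 1.0, "progress": 50.0, "steering_angle": 0.0}).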
Next, we train our model! While the model is training, you see a graph showing how much reward the model has earned on each attempt, as well as a video stream showing how your car moves around the track. Your car will veer all over the place as it explores the track, often taking some exciting off-road adventures, but it’s all important for the model to learn. We caught ours going around in circles at one point.
The final stage of this process is to evaluate how well your model has learned. It is given 3 chances to make it around the track as fast as it can. At the time of writing, our model’s best time is 32 seconds, which is a far cry from the 12 seconds currently posted as the top time in the Virtual League!
But the fun of DeepRacer is improving your model. Once you have a model you like, you can clone it and fine-tune the hyperparameters or the reward function.
Why not have a go yourself and comment what your best times are, either on this blog or on the social media post!
Author - Michelle Chismon