Self-driving based on deep reinforcement learning, as one of the most important applications of artificial intelligence, has become a popular research topic. Most current self-driving methods focus on directly learning an end-to-end control strategy from raw sensory data. Essentially, this control strategy is a mapping from images to driving behavior, which usually suffers from low generalization ability. To improve the generalization of the driving behavior, reinforcement learning requires extrinsic reward from the real environment, which may damage the car. To obtain good generalization ability safely, a virtual simulation environment in which different driving scenes can be constructed is designed in Unity. A theoretical model is established and analyzed in the virtual simulation environment, and it is trained by a double deep Q-network. The trained model is then migrated to a scale car in the real world, a process also called sim2real. The sim2real training method efficiently handles these two problems. Simulations and experiments are carried out to evaluate the performance and effectiveness of the proposed algorithm. Finally, it is demonstrated that the scale car in the real world acquires the capability for autonomous driving.
Self-driving Scale Car Trained by Deep Reinforcement Learning
The automotive industry is a special industry: to keep passengers safe, no accident is acceptable, so reliability and security must satisfy stringent standards. The sensors and algorithms of self-driving vehicles are therefore required to be extremely accurate and robust. On the other hand, self-driving cars are products for average consumers, so their cost must be controlled. High-precision sensors [1] can improve the accuracy of the algorithms but are very expensive. This is a difficult contradiction that needs to be solved.
Recently, the rapid development of artificial intelligence technology, especially deep learning, has led to major breakthroughs in fields such as image recognition and intelligent control. Deep learning techniques, typically convolutional neural networks, are widely used in various types of image processing, which makes them suitable for self-driving applications. Researchers have used deep learning to build end-to-end self-driving cars whose core is a neural network trained under supervision to learn the mapping relationship and thereby replicate human driving skills [2]. While end-to-end driving is easy to scale and adapt, it has limited ability to handle long-term planning, a limitation inherent to imitation learning [3,4]. We prefer to let scale cars learn how to drive on their own rather than under human supervision.
This replication pattern has many problems, especially on the sensor side. The traffic accidents of Tesla were caused by failure of the perception module in a bright-light environment. Deep reinforcement learning can still make appropriate decisions even when some modules fail [5].
This paper focuses on self-driving based on deep reinforcement learning: we modify a 1:16 RC car and train it with a double deep Q-network. We use a virtual-to-reality process, which means training the car in a virtual environment and testing it in reality. To obtain a reliable simulation environment, we create a Unity simulation training environment based on OpenAI Gym. We set a reasonable reward mechanism and modify the double deep Q-learning network to make the algorithm suitable for training a self-driving car. The car was trained in the Unity simulation environment for many episodes. In the end, the scale car learns a good driving policy, and we successfully transfer the learned policy to the real world.
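The double deep Q-network target mentioned above can be sketched as follows. For readability the two Q-functions are plain dictionaries rather than neural networks, and the discount factor, state names, and three-way steering action set are illustrative assumptions, not the paper's exact configuration:

```python
# Double DQN target sketch. In the paper the Q-functions are neural
# networks; dicts keyed by (state, action) are used here only to keep
# the update rule itself visible. GAMMA and ACTIONS are assumptions.
GAMMA = 0.99
ACTIONS = [0, 1, 2]  # e.g. steer left / straight / steer right

def double_dqn_target(q_online, q_target, reward, next_state, done):
    """y = r + gamma * Q_target(s', argmax_a Q_online(s', a))."""
    if done:
        return reward
    # Online network chooses the next action; target network scores it.
    # This decoupling is what reduces Q-value overestimation.
    best_a = max(ACTIONS, key=lambda a: q_online[(next_state, a)])
    return reward + GAMMA * q_target[(next_state, best_a)]
```

The key design choice is the split of action selection (online network) from action evaluation (target network), which distinguishes double DQN from the original DQN target.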
Our aim is to build a self-driving car trained by deep reinforcement learning. At present, the most common methods for training a car to drive itself are behavioral cloning and line following. At a high level, behavioral cloning uses a convolutional neural network to learn, through supervised learning, a mapping between car images (taken by the front camera) and steering-angle and throttle values. The other method, line following, uses computer vision techniques to track the middle line and a PID controller to make the car follow it. Aditya Kumar Jain used CNN technology to build a self-driving car with a camera [6]. Kaspar Sakmann proposed a behavioral learning method [7]: collecting human driving data through a camera and then learning to drive with a CNN, a typical supervised-learning approach. Kwabena Agyeman designed a car using linear regression and blob tracking. However, all of these capabilities depend on manual intervention. We hope that cars can learn to drive by themselves, which is a more intelligent way.
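The line-following baseline described above can be sketched as a PID controller acting on the lateral offset of the tracked middle line. The gains, the error signal, and the class interface are illustrative assumptions, not part of any cited system:

```python
# Minimal PID sketch for line following: a vision module reports the
# car's lateral offset from the tracked middle line, and the controller
# turns that error into a steering command. Gains are illustrative.
class PID:
    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, error, dt):
        # error: lateral offset of the middle line (e.g. in pixels);
        # dt: time since the last control step, in seconds.
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

Unlike the learned policies discussed in this paper, such a controller has no notion of the scene beyond the single offset value, which is why it needs the hand-built line-tracking pipeline in front of it.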
In 1989, Watkins proposed the noted Q-learning algorithm, which maintains a Q table recording the value of each state-action pair and updates these values every episode. Recently, the approach of training reinforcement learning models in virtual simulation and then migrating them to reality has been widely verified. OpenAI developed a system called Dactyl [12] that trains a robot hand in a virtual environment and finally deploys it on physical hardware. In later research, this approach has been verified on tasks such as picking and placing objects [13], visual servoing [14], and flexible movement [15], all indicating its feasibility. In 2019, Luo, Wenhan, et al. proposed an end-to-end active target tracking method based on reinforcement learning, which trained a robust active tracker in a virtual environment through a custom reward function and environment augmentation techniques.
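The tabular Q-learning update that Watkins introduced can be sketched in a few lines. The learning rate, discount factor, exploration rate, and the four-action set are illustrative assumptions chosen for the sketch:

```python
import random
from collections import defaultdict

# Tabular Q-learning sketch (Watkins, 1989). Hyperparameters and the
# action set are illustrative, not tied to any particular task.
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1
ACTIONS = [0, 1, 2, 3]

Q = defaultdict(float)  # Q[(state, action)] -> estimated return

def choose_action(state):
    # Epsilon-greedy exploration over the Q table.
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state, done):
    # Q-learning target: r + gamma * max_a' Q(s', a'); just r if terminal.
    target = reward if done else reward + GAMMA * max(
        Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (target - Q[(state, action)])
```

Because the table stores one value per state-action pair, this only scales to small discrete problems; replacing the table with a neural network is exactly the step that leads to DQN and the double DQN used in this paper.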
From the above work, we can see that many visual self-driving algorithms learn a mapping relationship through a neural network under supervised learning and then use it for control. But this is not smart enough. Tesla's driverless accident was caused by perception-module failure in a bright-light environment. Reinforcement learning can still act appropriately even when certain modules fail.
…(Full text truncated)…