Nvidia published a very interesting paper on this technique called "End-To-End Learning For Self-Driving Cars": https://arxiv.org/abs/1604.07316 . It's an enjoyable read.
The car learns to steer itself on an empty road. It's a good experiment to witness the power of deep learning and neural nets. For autonomous vehicles though, you need much more than that (e.g. sensor fusion, obstacle detection, localization, behaviour prediction, trajectory prediction, path planning, motion control, etc.).
> Compared to explicit decomposition of the problem, such as lane marking detection, path planning, and control, our end-to-end system optimizes all processing steps simultaneously. […] Better performance will result because the internal components self-optimize to maximize overall system performance, instead of optimizing human-selected intermediate criteria, e. g., lane detection.
One can imagine that it might be more difficult to get a network to solve this large problem all at once, and that it might be easier to decompose the problem and solve each part separately. Would it be a good idea to guide the end-to-end system by first decomposing the problem, solving each part, and then using those solutions as a starting guess for the whole problem? I mean, the decomposition might perhaps be a reasonable approximation of how the whole problem should be solved. (Then again, it might not.)
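To make the idea concrete, here is a toy sketch (not Nvidia's method, just an illustration): a two-stage pipeline where the first stage is pre-trained on intermediate labels (a hypothetical "lane offset" subproblem), and the result is then used as the starting point for end-to-end fine-tuning on steering labels alone. All names, data, and the linear model are made up for illustration.

```python
# Hypothetical two-stage pipeline: pixel -> lane offset -> steering angle.
# Phase 1 solves the decomposed subproblem; Phase 2 fine-tunes end-to-end
# starting from that solution, as the comment above suggests.
import random

random.seed(0)

# Synthetic data: true offset = 2*pixel, true steering = -0.5*offset = -pixel
data = [(x, 2.0 * x, -1.0 * x) for x in (random.uniform(-1, 1) for _ in range(200))]

w1, w2 = 0.1, 0.1   # stage 1: offset = w1*pixel; stage 2: steering = w2*offset
lr = 0.05

# Phase 1: train stage 1 alone on intermediate (offset) labels
for _ in range(200):
    for x, offset, _ in data:
        w1 -= lr * 2 * (w1 * x - offset) * x   # w1 converges toward 2.0

# Phase 2: end-to-end fine-tuning on steering labels only,
# using the pre-trained w1 as the starting guess
for _ in range(200):
    for x, _, steer in data:
        err = w2 * (w1 * x) - steer
        w2 -= lr * 2 * err * (w1 * x)
        w1 -= lr * 2 * err * (w2 * x)

print(round(w1 * w2, 2))  # the composed pixel->steering gain, close to -1.0
```

The pre-training gives the joint optimization a sensible initialization, though as noted above, nothing guarantees the decomposed solution is near the optimum of the end-to-end objective.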