Automotive camera technology and computer vision algorithms

Just one thing I'd like to mention first: for those of you who don't know, Toyota Motor Europe is the European operation of the global Toyota company, and we're based in Belgium, right here in Brussels. Just down the road along the E40 near Brussels Airport we have a Technical Centre, which is the European R&D headquarters. That's where I work, in the recognition technology group. We're involved in Toyota's global R&D, focusing on creating the algorithm solutions required to make the concepts Toyota is now presenting around automated driving become a reality.

So, I was talking about the concepts. Many of you are probably familiar already with the Mobility Teammate Concept. It's an umbrella that covers different automated driving technologies: the Highway Teammate, for instance, which was announced last year, and the Guardian Angel concept, which was announced recently. They are two slightly different concepts. One is focused on more traditional automated driving in the highway scenario; the other is more subtle but very ambitious: a system that understands and learns about the driver's behaviour and only intervenes when the driver is in need of help.

We believe there are three cornerstones of driving intelligence required to make these technologies happen. One is driving intelligence itself, how to move the vehicle: we need to understand the environment, and we need the car to be able to make decisions based on that. We will also need to take advantage, efficiently, of the connectivity between vehicles that is being developed. And since many of these technologies aren't fully automated, we will have to hand over to the driver at some stage, and therefore we need to understand the driver's state and transfer control to the driver.

So what is TME's role in this? We do advanced research in computer vision; that's my background as well. We work with partners in industry and academia throughout Europe on projects such as stereo depth perception, object detection, pedestrian detection and tracking, and semantic segmentation, for example using the latest deep learning methods. Inside the cabin, we look at the driver's face and gaze, and we look at behaviours we can infer from image data, such as when vehicles are likely to change lanes.

What all this technology has in common is that it relies quite critically on the image data, and on the quality of the image data we receive from the imagers themselves. We've got cameras all around the vehicle for this: around-view, stereo, and cameras inside the cabin. There are many different scenarios where the imagers really have to perform well in order for us to give a good estimate of the performance envelope of the different algorithms we're using. So, as part of our activity, although we're focused mostly on algorithm development, we also need to understand the imager technology we're using. We obviously need to get the best quality image for the application we're targeting, but we also need to understand the trade-offs in imager technology: what those trade-offs are and how they will affect algorithm performance, so that we can decide how much we can compensate for the different trade-offs.
There are typical problems that most of you are probably very familiar with. We have global shutter and rolling shutter sensors; we have colour sensors and monochrome sensors; there are various competing HDR technologies, each with certain issues. And of course, driving situations very often involve extreme lighting conditions, like coming out of a tunnel: if your dynamic range is too low, you're just going to get a flat image without any information in it. If you're using a CCD sensor, specular reflections will cause bleeding effects which are going to ruin your stereo algorithm, for example.

This puts us in a difficult situation; it's a bit of a dilemma for us in R&D, because the expectation is that we build a bridge between the very open, blue-sky research we're doing and the reality of production systems in vehicles. We need to catch no-go problems early in the research process, and of course we need to build very strong arguments to justify any cost in vehicles that possibly cost as little as nine thousand euros. Those are really some of the challenges we face in R&D. As a result, there is quite limited flexibility in production vision systems: they're made for the product and optimized to fulfil all the guidelines, and so on. But as an R&D department we really need that flexibility: we need to be able to switch the sensors, we need to change the baselines of stereo cameras, and we want different processing pipelines. Also, surprisingly but true, it's often quite difficult to get access to some of these production systems. On the other hand, off-the-shelf components you can buy often aren't the kind of product that will be in a vehicle at some later stage.

So what options are there for us to anticipate production cameras while maintaining the forward-looking, blue-sky R&D we're doing? One is that we can simulate the camera stack we're using. Another, more obvious option is that we could build our own R&D camera hardware with these properties. Both are things we've investigated, and I'd like to talk about two approaches from our research.

As an example, to make this a bit more tangible, let's look at stereo depth sensing. You're most likely all familiar with this: we have two calibrated cameras, we can produce rectified images, and the problem reduces to finding the match of a pixel along an epipolar line in the other image. From that correspondence we get the disparity, and from the disparity we can triangulate the distance to the object.

The problem is that there are two competing interests here. On one side we have the algorithms, which can do a great job with a really high resolution sensor, a monochrome sensor, and especially a global shutter sensor. On the other hand we have the interests of the production system, which of course has to be low-cost: we can't afford very big hardware or very high algorithm complexity, we might need colour sensors for other processing tasks, and, as I said earlier, high dynamic range in these cameras is almost a must in automotive scenarios. But the problem is that currently all the high dynamic range sensors I know of are rolling shutter sensors. So if we want long range and high depth accuracy, for example in a stereo camera, we still have to understand what we can do to handle these trade-offs.
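To make the long-range accuracy question concrete, here is a minimal sketch of the triangulation arithmetic for a rectified stereo pair; the focal length, baseline and matching-error numbers are illustrative assumptions, not the parameters of any camera mentioned in the talk:

```python
# Minimal sketch of stereo triangulation and its long-range accuracy,
# assuming a rectified pinhole pair. Focal length, baseline and the
# matching error are illustrative assumptions, not real camera values.

F_PX = 1400.0  # focal length in pixels (assumed)
B_M = 0.30     # stereo baseline in metres (assumed)

def depth_from_disparity(d_px):
    """Depth of a rectified stereo match: Z = f * B / d."""
    return F_PX * B_M / d_px

def relative_depth_error(z_m, d_err_px=0.25):
    """First-order relative depth error for a matching error of d_err_px:
    dZ/Z ~ Z * d_err / (f * B), i.e. it grows linearly with distance
    (the absolute error grows quadratically)."""
    return z_m * d_err_px / (F_PX * B_M)

for z in (20.0, 60.0, 120.0, 220.0):
    print(f"Z = {z:5.0f} m -> disparity {F_PX * B_M / z:5.2f} px, "
          f"relative depth error ~ {100 * relative_depth_error(z):4.1f} %")
```

With these example numbers, a quarter-pixel matching error already pushes the relative depth error past the 5% mark beyond roughly 85 metres, which is why the sensor artifacts discussed next matter so much at long range.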
You're used to seeing reconstructions of your back garden from outer space; with really high resolution cameras that's all possible. But then you realize what you're actually working with in a vehicle. You're turning around a corner, the angular velocity is high, and you get all these interesting artifacts: suddenly straight lines start tilting due to the rolling shutter nature of the sensor; you get blooming effects around moving objects due to the HDR processing; and you have Bayer pattern issues due to the colour filter, the physical filter placed on top of the sensor, so in regions with spatial frequencies beyond the sampling limit you get aliasing artifacts. All of these are issues we somehow have to deal with to make a system that fulfils our requirements.

We've done some investigation on this, working with KU Leuven, the university just further down the road in Leuven, together with [inaudible], who many of you might know as a well-known researcher in computer vision. The Bayer pattern issue, for example, is something we can simulate fairly easily: we don't really need to know any specifics about the sensor, and ray-tracing software plus a 3D vehicle model can take us quite a long way. What we did was put together a virtual stereo camera with the different Bayer filters that are commonly used. We've got the monochrome case with clear filters; we've got RGGB, a very common Bayer pattern covering the full colour space; and we've got red-clear-clear-clear (RCCC), which is a bit of a compromise between the two. You can see the original rendered image of the vehicle, which is going to move up to about 220 metres into the distance, and the rays coming from that image go through the Bayer filter to generate the images on the right. You can already see that adding this filter on top creates an extra sampling of the signal, which reduces the frequencies we can see in the image without aliasing artifacts appearing.

The problem, of course, lies in the way these stereo algorithms work, by matching pixels between the two images. Once we get these aliasing artifacts (for example here on the left, at 120 metres), you can see that although the ideal disparity in this case would be roughly 3 pixels, the Bayer pattern, which alternates between the different filters, messes up the matching, because it encourages matches to be found at disparities that don't correspond to the actual disparity. The plot on the right shows the 5% distance accuracy envelope over disparity. The black line is the monochrome sensor; you can see that all the Bayer pattern filters tend to exceed that 5% accuracy at certain disparity levels, which are basically defined by the distance and the disparity at that position. There's a video here showing this. This is the monochrome case: you can see that the disparity estimation follows a nice hyperbola, which is what you expect. In the case of the different filters you can see jumps in that line, due to the aliasing that appears. These are all problems we have to learn to deal with if we want to do stereo out to a really long distance.
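As a rough illustration of the simulation idea, the sketch below pushes a rendered frame through an RGGB colour filter array next to a clear-filter reference; the function names and the use of numpy are assumptions for this example, not details from the talk:

```python
# Minimal sketch of simulating a Bayer colour filter array on a rendered
# frame, in the spirit of the virtual stereo experiment described above.
# Function names and the use of numpy are assumptions for illustration.
import numpy as np

def mosaic_rggb(rgb):
    """Sample an HxWx3 float image through an RGGB Bayer pattern.

    Each pixel keeps only one colour channel, so the per-channel sampling
    rate drops and high spatial frequencies alias."""
    h, w, _ = rgb.shape
    raw = np.empty((h, w), dtype=rgb.dtype)
    raw[0::2, 0::2] = rgb[0::2, 0::2, 0]  # red sites
    raw[0::2, 1::2] = rgb[0::2, 1::2, 1]  # green sites, even rows
    raw[1::2, 0::2] = rgb[1::2, 0::2, 1]  # green sites, odd rows
    raw[1::2, 1::2] = rgb[1::2, 1::2, 2]  # blue sites
    return raw

def mosaic_clear(rgb):
    """Clear-filter (monochrome) reference: every pixel sees the full
    luminance, so there is no per-channel subsampling."""
    return rgb.mean(axis=2)
```

Running the same block-matching stereo pipeline on the two outputs is enough to reproduce the kind of disparity jumps the plot shows, since the alternating filter responses bias the matcher toward offsets that line up with the pattern period rather than the true disparity.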
For the rolling shutter, things are a bit more interesting, because it is quite difficult to simulate realistically: we simply don't have the knowledge about the physics of the sensor itself to do it. This is really where it's valuable for us to have some kind of modular R&D camera system, where we can replace the sensor modules and change their parameters.

So this is what we did in this experiment. We found a facility where we can shake the vehicle; you can see it here on the left. There are these four moving posts, and when you put the vehicle on them, you can start bumping it around to simulate different bumpy road scenarios. I set up a range of targets to give us something to measure in the image. Those of you familiar with stereo can already see that this is a very unfavourable scene, considering that it's full of flat, untextured surfaces and the repetitive structures of the chessboards; it's actually very challenging for any stereo algorithm to solve. We've got a rolling shutter sensor there, which is a high dynamic range sensor, and we've got a global shutter sensor on the right.

Now, what happens when you start shaking the vehicle? What you see here is the 3D reconstruction we get from the stereo algorithm. It's a bit noisy; as I said, it's not an ideal scene, so this is unfortunately what we have to live with. But we can see that in some cases the reconstruction of the chessboards has worked fairly well: the distances I measured correspond to what we get from the algorithm. As the platform starts vibrating and shaking, you will see the image on the left becoming unstable and jelly-like; that's the effect of the rolling shutter manifesting itself. And as the shaking becomes intense, you start seeing some of these chessboards moving around in the 3D reconstruction, jumping between different disparity levels. It's quite entertaining to watch.

Of course, we did the same experiment with the global shutter sensor. Unfortunately the exposure dropped a little low on this one, which adds an extra challenge to the stereo matching algorithm, and you can see it goes a bit wrong on the left there. But the chessboard at the back is reconstructed correctly, and you can see it's really not affected at all by the shaking motion or vibration. That is down to the fact that the global shutter takes the image at an instant, with every pixel exposed in exactly the same way, whereas with a rolling shutter you start at the top of the image and, after a certain time period, you end up at the last line; in that time the image has moved, so you're fighting a moving target.
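The geometry behind that jelly effect is simple to sketch: each row is exposed slightly later than the one above it, so motion during readout shears the image. Below is a minimal simulation assuming a constant horizontal image velocity during readout; the line time and velocity are illustrative values, not measurements from this experiment:

```python
# Minimal sketch of rolling-shutter row shear under horizontal motion.
# Constant image velocity during readout is assumed; the line time and
# velocity are illustrative values, not measurements from the experiment.
import numpy as np

def rolling_shutter_skew(img, line_time_s=30e-6, vx_px_per_s=2000.0):
    """Shift each row by how far the scene moved while the rows above it
    were read out: row r is exposed r * line_time_s later than row 0, so
    it is displaced by vx_px_per_s * r * line_time_s pixels.
    (Border wrap-around from np.roll is ignored for simplicity.)"""
    out = np.empty_like(img)
    for r in range(img.shape[0]):
        shift = int(round(vx_px_per_s * r * line_time_s))
        out[r] = np.roll(img[r], shift, axis=0)
    return out
```

With these numbers, a 720-row frame takes about 22 ms to read out and the bottom row ends up displaced by roughly 43 pixels relative to the top, exactly the kind of shear that violates the epipolar geometry a rectified stereo matcher relies on.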
So much for that. What do we conclude from all of this? We really do need a high resolution, global shutter HDR imager. We'd love to have one, so if any of you have one, or are developing one, we're very interested. For me, another conclusion is that using these automotive imagers at the R&D level is still quite a challenging thing to do. We needed quite extensive specialist support in order to get what I thought were fairly simple things out of the imager, and we also had to re-engineer quite a lot of the basic image processing pipeline, because some of the processing chips weren't actually designed to give us the kind of information we need at the R&D stage. Things like exposure control and white balance are things we often had to do ourselves.

As a consequence, we really would like a flexible camera platform that we can use at the R&D level but which is close to production form factor, so that we can start making predictions, say, ten years before something goes into a vehicle: if we've got a great algorithm, we can already start saying roughly how it will perform in a vehicle at a later stage. As part of that, we'd like to be able to switch the imagers, we want the ability to do rolling updates as new imagers become available, and we would really like to reuse the vendor ISPs for basic imager control, because I'm sure the vendors do a much better job at that than we can while working on all the other tasks we're trying to solve.
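To give a flavour of the kind of basic pipeline step involved, here is a minimal grey-world white balance sketch; the talk only says that white balance and exposure control had to be done in-house, so the particular method and names here are assumptions for illustration:

```python
# Minimal sketch of a grey-world white balance, one of the basic ISP steps
# mentioned as having to be re-done in-house. The grey-world method itself
# is an assumed choice for illustration, not the speaker's stated algorithm.
import numpy as np

def grey_world_white_balance(rgb):
    """Scale each channel so its mean matches the global mean, under the
    grey-world assumption that the average scene colour is achromatic.
    Expects an HxWx3 float image with values in [0, 1]."""
    means = rgb.reshape(-1, 3).mean(axis=0)          # per-channel means
    gains = means.mean() / np.maximum(means, 1e-6)   # guard divide-by-zero
    return np.clip(rgb * gains, 0.0, 1.0)
```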
Okay, so thank you very much for your attention; that was all from me. Thanks a lot for inviting us to this conference in such a magnificent setting. I've really enjoyed it, and I hope to come back next time. Thank you very much.