Toshiba Corporation has developed a 3D-recognition AI that measures distance with the accuracy of a stereo camera using only an image taken with a commercially available camera. It does so by using deep learning to analyze the image blur caused by the camera lens. Because the technology removes the need for a stereo camera, it reduces both cost and space. Toshiba will present this achievement at the International Conference on Computer Vision (ICCV 2019) in South Korea on October 30, 2019, from 10 am.
Image sensing is becoming more important. Applications such as robots that move objects, autonomous unmanned vehicles, and remote-controlled drones that inspect infrastructure require more than just images of their subjects: they need a small device that can capture 3D data, including shape and distance. Research has therefore increased into distance measurement with monocular cameras, which are easy to miniaturize, using deep learning to learn the shape, background, and other scenery data of the imaged object.
This approach has a drawback: because the distance estimate from a monocular camera depends on learned scenery data, accuracy drops when images are taken in landscapes that differ from the training data. To overcome this, Toshiba previously developed colour-filtered aperture photography, in which a two-colour filter is attached to the lens and the colour and size of the resulting image blur are analyzed according to the distance to the subject. Although this removes the dependence on scenery data, modifying existing lenses costs time and money.
Toshiba has overcome this problem with a 3D-recognition AI that uses deep learning to analyze how image blur varies with position on the lens. It achieves distance measurement with the same high precision as a stereo camera system using a normal monocular camera and without any need for scenery data. Until now, it was considered theoretically impossible to measure distance from the shape of the blur, because the shape was thought to be the same for near and far objects that are equidistant from the focal point. However, analysis showed a substantial difference between the blur shapes of near and far objects, even when they are equidistant from the focal point. Building on this, Toshiba successfully analyzed blur data from captured images using a deep learning module based on a deep neural network model.
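The near/far ambiguity described above can be illustrated with the idealised thin-lens model. This model is an assumption made here for illustration and is not taken from Toshiba's announcement. In it the blur is a disc described only by its diameter, so one object in front of the focus plane and one behind it can produce exactly the same blur, which is why blur alone was long considered insufficient for measuring distance. The function names and the sample lens values below are hypothetical.

```python
def blur_diameter(distance, focus_distance, focal_length, aperture):
    """Thin-lens blur-disc diameter on the sensor (all lengths in the same unit)."""
    return (aperture * focal_length * abs(distance - focus_distance)
            / (distance * (focus_distance - focal_length)))


def ambiguous_pair(blur, focus_distance, focal_length, aperture):
    """Return the near and far object distances that both yield the given blur."""
    # Largest blur a far object can reach (object at infinity) in this model.
    limit = aperture * focal_length / (focus_distance - focal_length)
    if not 0 < blur < limit:
        raise ValueError("no far-side distance produces this blur diameter")
    near = focus_distance / (1 + blur / limit)
    far = focus_distance / (1 - blur / limit)
    return near, far


# Hypothetical lens: 50 mm focal length, 25 mm aperture, focused at 2000 mm.
near, far = ambiguous_pair(0.05, 2000.0, 50.0, 25.0)
print(near, blur_diameter(near, 2000.0, 50.0, 25.0))
print(far, blur_diameter(far, 2000.0, 50.0, 25.0))
```

Both printed blur diameters match the requested value even though the two distances lie on opposite sides of the focus plane. A real lens departs from this ideal, and the announcement reports that those departures make near and far blur shapes distinguishable.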
When light passes through a lens, the shape of the resulting blur is known to change with the light's wavelength and its position in the lens. In the developed network, position and colour are processed separately so that changes in blur shape are properly perceived. The results then pass through a weighted attention mechanism, which controls where on the brightness gradient to focus in order to measure distance correctly. Through learning, the network is updated to reduce the error between the measured distance and the actual distance. Using this AI module, Toshiba has confirmed that a single image captured with a commercially available camera achieves the same distance measurement accuracy as a stereo camera. More information is available on Toshiba's official page.
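The idea of weighting image regions by attention and then training against the true distance can be sketched in a few lines. This is a toy illustration under stated assumptions, not Toshiba's network: it assumes that earlier layers have already produced one distance estimate and one attention score per image patch, and it uses a softmax weighting and a squared-error loss, neither of which is confirmed by the announcement. All names and numbers are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw attention scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(estimates, scores):
    """Combine per-patch distance estimates using attention weights."""
    weights = softmax(scores)
    return sum(w * e for w, e in zip(weights, estimates))


def squared_error(predicted, actual):
    """Training signal: the gap between measured and actual distance."""
    return (predicted - actual) ** 2


# Hypothetical per-patch estimates in metres; the third patch is unreliable
# (for example, a flat region with little brightness gradient), so it is
# given a low attention score.
patch_estimates = [2.1, 1.9, 3.5]
patch_scores = [2.0, 2.0, -1.0]

predicted = attention_pool(patch_estimates, patch_scores)
loss = squared_error(predicted, actual=2.0)
print(predicted, loss)
```

In a trained network the scores themselves would be learned, so that reducing the loss gradually shifts attention toward the patches whose blur is most informative.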
Toshiba will confirm the versatility of the system with commercially available cameras and lenses and speed up the image processing, aiming for public implementation in fiscal year 2020.