MOSCOW, February 12. /TASS/. Russian mathematicians and programmers have developed a universal neural network that can improve computer vision systems for drones, a by-product of their work on a smartphone document scanning app, the press service of the Smart Engines company says.
"To put it in simple terms, the new architecture implements the laws of [optical] perspective as mathematical constraints. This new approach allows intuitive perception of geometric laws of the physical three-dimensional world, laws of perspective and dynamic scene change while moving," the company says.
In the last few years, scientists have developed dozens of neural networks capable of recognizing obstacles and calculating optimal trajectories for autonomous cars, drones and other devices that need to "see" the surrounding world and sort what they see into categories.
For example, such systems are used for automatic document scanning, quality control in factories and counting visitors in restaurants. They are usually optimized for a single task and tend to perform poorly on any other.
A team of Russian scientists led by Dmitry Nikolayev and Vladimir Arlazarov, laboratory heads at the Institute for Information Transmission Problems and the ‘Informatics and Control’ Federal Research Center of the Russian Academy of Sciences, has accidentally created a universal neural network that is equally good at solving all of these tasks. They did so while developing a document scanning app.
Neural network 'eyes'
The scientists explain that the key difficulty in developing such an app is that users rarely photograph a document head-on, but at some angle. As a result, the algorithm sees not a flat image of the page, but a three-dimensional scene distorted by perspective.
This is usually not a problem for humans, but artificial intelligence systems capable of solving this seemingly simple task usually require several dozen or even several hundred layers of neurons. This significantly increases their energy consumption, which makes them unfit for mobile devices and prevents the creation of universal neural networks.
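To illustrate the perspective effect that makes this task hard, here is a minimal sketch in Python. It is not the researchers' code: it assumes an idealized pinhole camera and a page tilted about its horizontal axis, and the function names, page size and camera distance are hypothetical values chosen for illustration.

```python
import math

def project_page(width, height, tilt_deg, distance, focal=1.0):
    """Project the corners of a flat page through an idealized pinhole camera.

    The page is centered on the optical axis at `distance` from the camera
    and tilted about its horizontal axis by `tilt_deg` degrees.
    Returns corners in the order: near-left, near-right, far-right, far-left.
    """
    tilt = math.radians(tilt_deg)
    corners = [(-width / 2, -height / 2), (width / 2, -height / 2),
               (width / 2, height / 2), (-width / 2, height / 2)]
    projected = []
    for x, y in corners:
        y_cam = y * math.cos(tilt)             # height after the tilt
        z_cam = distance + y * math.sin(tilt)  # depth after the tilt
        # Pinhole projection: image coordinates shrink with depth.
        projected.append((focal * x / z_cam, focal * y_cam / z_cam))
    return projected

def edge_length(p, q):
    """Euclidean length of the image-plane segment from p to q."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

if __name__ == "__main__":
    # An A4-sized page (in meters) seen from 0.4 m at a 40 degree tilt.
    page = project_page(0.210, 0.297, tilt_deg=40, distance=0.4)
    print("near edge:", edge_length(page[0], page[1]))
    print("far edge: ", edge_length(page[3], page[2]))
```

In this sketch the edge of the page farther from the camera comes out shorter than the nearer one, so the rectangular page appears as a trapezoid. That distortion is what a scanning app has to undo.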
Arlazarov, Nikolayev and their colleagues solved this problem by using several new mathematical principles, including the so-called Hough transform, an algorithm developed by US physicist Paul Hough in the mid-20th century for analyzing images from bubble chamber particle detectors.
This algorithm finds straight lines and certain other geometric shapes in an image from a smartphone or drone camera. By building the algorithm into one of the neural network's layers, the scientists created a universal system that is equally good at solving all of the most important computer vision tasks. The first tests showed that the system is one hundred times more effective than the classic U-Net system.
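The idea behind the classic Hough transform for straight lines can be sketched in a few lines of Python. This is a textbook illustration, not the layer used in the new network: every edge pixel votes for all lines x·cos(θ) + y·sin(θ) = ρ that pass through it, and the most-voted (ρ, θ) cell marks the dominant line. The function names and the one-degree angular step are assumptions made for the example.

```python
import math

def hough_accumulator(points, width, height, n_theta=180):
    """Collect votes for lines x*cos(theta) + y*sin(theta) = rho.

    `points` is a list of (x, y) edge pixels from a width-by-height image.
    Returns the vote table, the sampled angles and the rho index offset.
    """
    diag = math.ceil(math.hypot(width, height))  # bound on |rho|
    thetas = [math.pi * k / n_theta for k in range(n_theta)]  # [0, pi)
    acc = [[0] * n_theta for _ in range(2 * diag + 1)]
    for x, y in points:
        for k, theta in enumerate(thetas):
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            acc[rho + diag][k] += 1  # shift so negative rho fits the table
    return acc, thetas, diag

def strongest_line(points, width, height):
    """Return (rho, theta in degrees, votes) of the most-voted line."""
    acc, thetas, diag = hough_accumulator(points, width, height)
    votes, rho_idx, k = max(
        (acc[r][k], r, k) for r in range(len(acc)) for k in range(len(thetas))
    )
    return rho_idx - diag, math.degrees(thetas[k]), votes

if __name__ == "__main__":
    # Edge pixels of a horizontal line at y = 20 in a 50x50 image.
    line_pixels = [(x, 20) for x in range(50)]
    print(strongest_line(line_pixels, 50, 50))
```

Because each pixel votes independently, the line is still found when parts of it are missing or noisy, which is why the method suits document edges in photographs.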
According to the researchers, this algorithm could be applied not only to document scanning or to improving drones and self-driving cars, but also to analyzing medical tomography images and images from other branches of science where computer vision is not yet used.