With the help of artificial intelligence (AI), more and more information will be extracted from image data in the future. However, conventional camera sensors are often not optimally adapted to developments in the field of AI. In the "Learning to Sense" project, a research group consisting of seven working groups at the University of Siegen is now developing both together for the first time: innovative image sensors and matching AI software - for the cameras, scanners and microscopes of the future.
No matter how high-quality a smartphone or camera is, no image sensor can currently compete with the eye, because the eye and the image processing in the brain form a unit that works in an ingeniously simple way. It starts with the retina, which contains millions of sensory cells that are unevenly distributed. At the point of sharpest vision, many small sensory cells sit close together, so the image resolution is correspondingly high when we focus on an object. At the edge of the retina, the sensory cells are larger and less densely packed, resulting in a lower-resolution image. Nevertheless, we still perceive movement there very well.
In the early days of mankind, this was a matter of survival: our ancestors could spot attacking animals in time. Today, peripheral vision stops us from stepping into the road when a car is approaching. The sensory-cell layout of the retina also makes it much easier for the brain to process image data. Because the point of sharpest vision is very small, the brain only has to process a small amount of high-resolution data; the lower-resolution information from the edge of the retina is usually less important and requires less processing.
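The data savings of such retina-like, space-variant sampling can be illustrated with a minimal sketch (illustrative only, not project code; the pattern parameters are assumptions): a dense "fovea" at the point of fixation plus a fixed number of coarse peripheral samples, compared with a conventional uniform pixel grid.

```python
# Minimal sketch (not project code): compare the sample count of a uniform
# sensor grid with a retina-like, foveated sampling pattern in which
# sampling density falls off away from the point of fixation.
import math

def uniform_samples(width, height):
    """Conventional sensor: every position is sampled at full resolution."""
    return width * height

def foveated_samples(width, height, fovea_radius, rings=32, samples_per_ring=64):
    """Densely sampled central 'fovea' plus a fixed budget of coarse
    peripheral samples arranged on rings (a simple model of the retina).
    The ring and per-ring counts are assumed illustrative values."""
    fovea = math.pi * fovea_radius ** 2    # one sample per pixel in the centre
    periphery = rings * samples_per_ring   # coarse samples further out
    return int(fovea + periphery)

full = uniform_samples(1920, 1080)                    # ~2.07 million pixels
fov = foveated_samples(1920, 1080, fovea_radius=50)   # a few thousand samples
print(full, fov, round(full / fov))                   # reduction of two orders of magnitude
```

The exact numbers are arbitrary; the point is that a fixation-centred design reduces the data volume by orders of magnitude while still covering the whole field of view, if only coarsely.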
Less information is more
Until now, optical image sensors have had a completely different structure: their sensor pixels, the counterpart of the retina's sensory cells, are arranged at equal distances in a rectangular grid. "This structure makes automatic image evaluation increasingly difficult in many areas of application today," says Margret Keuper, Professor of Machine Learning at the University of Mannheim and sub-project leader of the "Learning to Sense" research group. Many applications that work with a fixed camera position only require high-resolution data in a small image section, even though the larger context remains relevant.
These applications would benefit greatly from a new chip design - for example, quality control in factories or traffic monitoring in self-driving cars. The problem with a conventional megapixel chip is that it captures the entire image in very high resolution and therefore delivers a huge amount of image data that software then has to process - even though in most cases only a small part of the image is relevant, such as a defective part on an assembly line or a significant change in the traffic situation.
This flood of data is becoming a problem because increasingly complex artificial intelligence (AI) is being used in image analysis - in particular neural networks, which process information in several steps known as layers. The more image information is poured into such a network, the greater the computing effort and the longer it takes for the network to output a result. "One can imagine a multitude of applications in which image processing systems would become immensely more effective if the sensors were designed in an unconventional way, so that they are better suited to AI data processing," says Michael Möller, Professor of Computer Vision and spokesperson for the AI research group "Learning to Sense".
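Why the input size matters so much can be made concrete with a back-of-the-envelope calculation (assumed layer sizes, not project figures): the cost of a convolutional layer grows linearly with the number of input pixels, so restricting the network to the relevant image region cuts the computation proportionally.

```python
# Rough illustration (assumed layer sizes, not project figures): the cost of
# one convolutional layer scales with the number of input pixels, so a small
# relevant crop is far cheaper to process than the full megapixel frame.

def conv_layer_macs(h, w, c_in, c_out, k=3):
    """Multiply-accumulate operations of one k x k convolution layer
    applied to an h x w input with c_in input and c_out output channels."""
    return h * w * c_in * c_out * k * k

full_frame = conv_layer_macs(1080, 1920, 3, 64)   # whole Full-HD frame
crop_only  = conv_layer_macs(128, 128, 3, 64)     # small relevant region

print(f"full frame: {full_frame:,} MACs")
print(f"crop only : {crop_only:,} MACs")
print(f"ratio     : {full_frame / crop_only:.0f}x")
```

The ratio is simply the ratio of pixel counts; the same linear scaling holds for every convolutional layer in the network, so the savings compound across the whole model.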
Basically, the two worlds have so far been separate: electrical engineering, which has kept optimizing image sensors in the traditional way, and computer science, which has developed its own tools. To date, these tools have hardly ever been tailored to the needs of the other discipline. "In our 'Learning to Sense' project, we now want to systematically merge these worlds - the development of the sensors and the automatic analysis of the data they produce," says Michael Möller.
"Learning to Sense"
Working on the project together with Möller are Prof. Dr. Volker Blanz and Prof. Dr. Andreas Kolb (both University of Siegen) and Prof. Dr. Margret Keuper (University of Mannheim) from the field of computer science, as well as Prof. Dr. Bhaskar Choubey, Prof. Dr. Peter Haring Bolívar and Prof. Dr. Ivo Ihrke (all University of Siegen) from the field of sensor technology. "Together with our doctoral students, we want to design new sensor chips and develop perfectly tailored machine learning methods," explains Möller. The seven working groups are jointly developing new techniques to optimize both the imaging systems and the AI-based data analysis. Beyond specific applications, the main focus of the project is basic research, so that the design of future sensor systems "learns" in the same way that artificial intelligence is already "learning" to understand our world today.
To achieve this, the group's results will be tested and validated in three main fields. The first is terahertz imaging, a technique that measures light at frequencies invisible to the human eye; it can be used, for example, to visualize defects hidden beneath the surface of workpieces. The second field is 3D microscopy, in which, for example, the illumination is optimized so that cell geometry can be imaged in a way that is particularly relevant to cancer research. The third field concerns the further development of CMOS sensors for visible light.
Today, neural networks and other AI software are so complex that even the experts who design them can hardly understand in detail how the networks analyze the data. The AI software is fed with training data - such as images showing typical damage to components - and over time the neural network learns what damage looks like. Its inner workings, however, remain a black box. As long as neural networks are fed conventional image information that a human can also interpret, one can ultimately check whether the network has worked correctly - whether a defect the software has found really is, say, a hole in a component. But if completely new sensors are designed that do not deliver conventional image information, this becomes difficult. Neural networks could then learn entirely different image characteristics, such as subtle brightness differences between adjacent pixels, that we humans cannot interpret. "When developing our AI solutions, we therefore have to ensure that the results are plausible and that the algorithms actually output the desired information in the end," says Margret Keuper.
The "Learning to Sense" project is one of eight projects funded by the German Research Foundation as part of its artificial intelligence initiative. With their special design optimized for AI, the new sensors are likely to take image processing a good step forward.
This text first appeared in Future, the research magazine of the University of Siegen:
Future 2023: Ich sehe was, was du nicht siehst (Author: Tim Schröder)