The most accurate computer vision algorithms around take in high-resolution images, examine every pixel, and use that information to make sense of the world, repeating the process dozens of times per second. This arrangement works quite well as far as understanding goes, but it is highly inefficient. Processing tens of millions of pixels every few tens of milliseconds requires a lot of processing power, and with it, a large amount of energy.
That there is a better way to process image data is obvious, because the brain does not operate in this way. Rather than poring over every pixel, even the ones that add no new information, the brain produces a general outline of a scene that captures everything important about it. It does this incredibly quickly, and while consuming very little energy. And it is not just a matter of efficiency: these simplified outlines make scene understanding more accurate and more robust to environmental changes and other small differences that trip up artificial systems.
A schematic of the adjustable synaptic phototransistors (📷: J. Kwon et al.)
A group led by researchers at the Korea Institute of Science and Technology wants to make computer vision more brain-like, so they have developed a system that mimics the dopamine-glutamate signaling pathway found in brain synapses. This pathway extracts the most important features from a visual scene, which helps us prioritize critical information while ignoring irrelevant details.
Inspired by this biological mechanism, the team created a novel synapse-mimicking vision sensor that selectively filters visual input, emphasizing high-contrast edges and object contours. This approach dramatically reduces the amount of data that needs to be processed (by as much as 91.8%), while simultaneously improving object recognition accuracy to 86.7%.
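To get a feel for why contour extraction saves so much data, here is a minimal software sketch. It uses a conventional Canny edge detector as a stand-in for the sensor's filtering, which in the actual device happens in hardware rather than in code, and the thresholds and test image below are arbitrary assumptions for illustration:

```python
# Software stand-in for in-sensor contour extraction (illustrative only;
# the team's filtering happens in the phototransistor hardware itself).
import cv2
import numpy as np

def contour_sketch(frame_bgr: np.ndarray) -> np.ndarray:
    """Return a sparse edge map that keeps only high-contrast contours."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)   # suppress pixel-level noise
    edges = cv2.Canny(blurred, 50, 150)           # keep strong intensity edges
    return edges

if __name__ == "__main__":
    frame = cv2.imread("road_scene.jpg")          # hypothetical test image
    edges = contour_sketch(frame)
    raw_values = frame.size                       # H * W * 3 color channels
    kept_values = int(np.count_nonzero(edges))    # pixels that survive filtering
    print(f"retained roughly {100 * kept_values / raw_values:.1f}% of the raw frame")
```

Even this crude software version typically keeps only a few percent of the original pixel values, which hints at how an edge-first representation can shrink the downstream workload so dramatically.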
All of this processing happens on-sensor. Rather than sending raw visual data to remote processors, the sensor itself adjusts brightness and contrast on the fly, much like how dopamine modulates synaptic activity to enhance signal clarity in the human brain. This is made possible by a synaptic phototransistor whose response can be tuned through electrostatic gating, allowing it to dynamically adapt to changes in lighting. This hardware-level adaptability allows the sensor to highlight contours even in difficult conditions, such as low-light or high-glare environments, without relying on computationally expensive software-based corrections.
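As a rough analogy for that adaptive behavior, the sketch below treats the gate bias as a global gain that is nudged each frame so the scene's mean brightness stays near a target. This is purely a software approximation under that assumption; the published device does this in analog hardware, not with a feedback loop like this one:

```python
# Rough software analogy for the sensor's adaptive response (assumption:
# the electrostatic gate bias behaves like a per-frame gain that tracks
# scene brightness; the real device adapts in analog hardware).
import numpy as np

def adapt_exposure(gray: np.ndarray, gain: float, target: float = 0.5,
                   rate: float = 0.2) -> tuple[np.ndarray, float]:
    """Scale a frame by `gain`, then nudge `gain` toward the brightness target."""
    frame = np.clip(gray.astype(np.float32) / 255.0 * gain, 0.0, 1.0)
    error = target - float(frame.mean())          # too dark -> positive error
    gain = max(0.1, gain + rate * error * gain)   # simple proportional update
    return frame, gain
```

Run frame after frame, a loop like this brightens dim scenes and tames glare before any edges are extracted, which is the same job the tunable phototransistor performs without leaving the sensor.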
Image adjustments take place in-sensor (📷: J. Kwon et al.)
In tests using road scenes from the Cambridge-driving Labeled Video Database (CamVid), the system excelled at semantic segmentation, the task of assigning a class label to every pixel in an image. By feeding these cleaner, high-clarity contours into standard vision models like DeepLab v3+, the team achieved both improved detection accuracy and faster data handling.
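The evaluation idea can be sketched in a few lines: preprocess a frame, then hand it to an off-the-shelf segmentation network. Note the assumptions here: torchvision ships DeepLabV3 rather than the DeepLab v3+ variant used in the paper, and the input file name stands in for a contour-enhanced frame that the sensor would normally produce:

```python
# Minimal sketch: run a (contour-enhanced) frame through a standard
# segmentation model. torchvision's DeepLabV3 is used as an approximation
# of the DeepLab v3+ model referenced in the work.
import torch
from torchvision import transforms
from torchvision.models.segmentation import deeplabv3_resnet50
from PIL import Image

model = deeplabv3_resnet50(weights="DEFAULT").eval()

preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

img = Image.open("contour_enhanced_frame.png").convert("RGB")  # hypothetical input
with torch.no_grad():
    out = model(preprocess(img).unsqueeze(0))["out"]  # shape: [1, classes, H, W]
    labels = out.argmax(dim=1)                        # per-pixel class predictions
```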
This development holds a lot of promise for autonomous vehicles, drones, and mobile robots, where every bit of saved processing power translates into longer operation times and more responsive systems. Traditional high-resolution cameras can generate up to 40 gigabits of data per second, overwhelming even the most advanced onboard processors. By compacting visual input through contour extraction, the new sensor dramatically lightens this load and could significantly speed up the development of future autonomous systems.
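A quick back-of-envelope calculation shows where numbers like that come from. The figures below are assumptions chosen for illustration, not values from the article:

```python
# Back-of-envelope raw data rate for one camera (assumed numbers):
# 8 MP sensor, 3 color channels, 10 bits per channel, 60 frames per second.
pixels = 8_000_000
bits_per_pixel = 3 * 10
fps = 60
gbits_per_second = pixels * bits_per_pixel * fps / 1e9
print(f"{gbits_per_second:.1f} Gbit/s of raw data")   # ~14.4 Gbit/s per camera
```

A vehicle carrying several such cameras quickly approaches the tens of gigabits per second cited above, which is exactly the kind of firehose that on-sensor contour extraction is meant to tame.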