Maker John Walters has been pushing the boundaries of what’s possible with an Espressif ESP32-S-based AI-Thinker ESP32-CAM development board, performing on-device edge detection on the incoming video feed from the camera sensor in real time.
“This project […] is simply just a proof-of-concept to see if the [Espressif] ESP32 is fast enough to do kernel convolutions in real time,” Walters explains. “[It] uses the AI-Thinker ESP32-CAM with the [Omnivision] OV2640 camera module with the GC9A01 circular 240×240 pixel display. An FTDI adapter is used for programming the ESP32-CAM. Oh yeah, this uses the PSRAM [Pseudo-Static RAM], so GPIO [General-Purpose Input/Output pin] 16 is off limits.”
The AI-Thinker ESP32-CAM at the heart of the system is far from the most powerful Espressif ESP32-based device, being built around two Tensilica Xtensa LX6 cores running at up to 240MHz paired with 520kB of static RAM (SRAM) plus 4MB of external pseudo-static RAM (PSRAM). As the name suggests, it’s designed for edge computer vision work, and either comes bundled or can be paired with Omnivision OV2640 or OV7670 camera sensors.
In Walters’ case, the module uses an Omnivision OV2640 sensor to capture an eight-bit grayscale image and transfer it to PSRAM. It then performs two key image processing steps, both on-device: a Gaussian blur followed by a Laplacian edge detection operation, each with the user’s choice of a 3×3 or 5×5 kernel. Finally, the processed image is output to a circular GC9A01 LCD with a 240×240 resolution.
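To give a feel for what those two filtering steps involve, here is a minimal sketch in plain Python rather than the project's own firmware. The kernel coefficients, the border handling, and the names `convolve` and `detect_edges` are assumptions chosen for illustration; the article does not say which values Walters uses.

```python
# Illustrative 3x3 kernels (assumed; the project's actual coefficients
# are not given in the article). Both are symmetric, so sliding them
# over the image without flipping is equivalent to true convolution.
GAUSSIAN_3X3 = [[1, 2, 1],
                [2, 4, 2],
                [1, 2, 1]]    # weights sum to 16
LAPLACIAN_3X3 = [[0, 1, 0],
                 [1, -4, 1],
                 [0, 1, 0]]   # weights sum to 0


def convolve(image, kernel, divisor=1):
    """Filter an 8-bit grayscale image (a list of rows) with a square kernel.

    Pixels beyond the border are taken from the nearest edge pixel, and
    each result is clamped to the 0..255 range.
    """
    height, width = len(image), len(image[0])
    radius = len(kernel) // 2
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0
            for ky in range(-radius, radius + 1):
                sy = min(max(y + ky, 0), height - 1)
                for kx in range(-radius, radius + 1):
                    sx = min(max(x + kx, 0), width - 1)
                    acc += image[sy][sx] * kernel[ky + radius][kx + radius]
            out[y][x] = min(max(acc // divisor, 0), 255)
    return out


def detect_edges(image):
    """Blur first to suppress noise, then apply the Laplacian."""
    blurred = convolve(image, GAUSSIAN_3X3, divisor=16)
    return convolve(blurred, LAPLACIAN_3X3)
```

Fed a frame that is black on the left half and white on the right, `detect_edges` returns zero everywhere except along the step between the two halves. A 3×3 kernel costs nine multiply-accumulates per pixel and a 5×5 kernel costs 25, which is why kernel size matters on a microcontroller processing a 240×240 frame.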
Walters has shared full source code for the project, plus the above wiring example. (📷: John Walters)
“I have no doubt that the frame rate could be increased by sharing the load with the other CPU core,” Walters adds of the process, which is written to only use one of the ESP32-CAM’s two Xtensa LX6 cores, “but I’m going to go down a different route.”
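Walters has not taken the dual-core route, so the following is only a sketch of how such a split could be arranged, not his code. It divides the frame into two bands of rows and filters each in its own worker. The Laplacian coefficients and the names `convolve_rows` and `convolve_two_workers` are assumptions, and Python threads stand in here for what would be one task per core on the ESP32.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumed 3x3 Laplacian kernel, used purely for illustration.
LAPLACIAN_3X3 = [[0, 1, 0],
                 [1, -4, 1],
                 [0, 1, 0]]


def convolve_rows(image, kernel, out, row_start, row_end):
    """Filter rows row_start..row_end-1 of image, writing results into out."""
    height, width = len(image), len(image[0])
    radius = len(kernel) // 2
    for y in range(row_start, row_end):
        for x in range(width):
            acc = 0
            for ky in range(-radius, radius + 1):
                sy = min(max(y + ky, 0), height - 1)  # clamp at the border
                for kx in range(-radius, radius + 1):
                    sx = min(max(x + kx, 0), width - 1)
                    acc += image[sy][sx] * kernel[ky + radius][kx + radius]
            out[y][x] = min(max(acc, 0), 255)


def convolve_two_workers(image, kernel):
    """Split the frame into a top and a bottom band, one per worker.

    Both workers read the same unmodified source image and write to
    disjoint rows of out, so no locking is needed between them.
    """
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    middle = height // 2
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [
            pool.submit(convolve_rows, image, kernel, out, 0, middle),
            pool.submit(convolve_rows, image, kernel, out, middle, height),
        ]
        for future in futures:
            future.result()  # re-raise any exception from a worker
    return out
```

Because the source frame is read-only during the pass, rows next to the split can read their neighbours in the other band directly, and the result matches a single-worker pass exactly. This sketch shows the partitioning only; CPython's global interpreter lock means it will not actually run faster on a desktop.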
The project is documented in full, with source code, on Walters’ website.