Five separate systems. One badge. Here's how we orchestrated wake word detection, audio embeddings, LLM inference, and TTS into a seamless voice-activated experience, all within 8 MB of RAM.
A deep dive into running a large language model on an ESP32-S3 microcontroller: exploring llama2.c, SIMD optimizations, and the challenges of streaming inference on embedded hardware.
Exploring embedded video playback on the ESP32-S3 with gyro-based auto-rotation. Covers MP4-to-MJPEG conversion, PSRAM buffering, IMU integration, and building a smooth, looping video player on a microcontroller.