REGISTER HERE – https://streamyard.com/bjmiqevr68
Join us for three back-to-back hands-on workshops with Intel, STMicroelectronics, MITO and ForestHub to learn the latest techniques and technologies in generative and agentic edge AI.
Please register, but note that these workshops are FIRST COME, FIRST SERVED, based on the availability of virtualized platforms.
——————————————–
WORKSHOP #1 – Building Your Local Voice Assistant on a Compact Edge Device – Hosted by Danilo Pietro Pau, IEEE and ST Fellow, and Ashutosh Kumar, Ph.D., M.B.A., AI Technical Marketing Lead at Intel
This workshop explores how to build a fully local, low-latency conversational AI agent optimized for edge devices. Participants will examine a production-ready pipeline that integrates streaming Speech-to-Text, a quantized Small Language Model, and real-time Text-to-Speech, orchestrated via OpenVINO across the CPU, GPU, and NPU of an Intel® Core™ Ultra processor-powered edge device. We highlight advanced optimizations including neural VAD for fast endpointing, overlapping audio windows for accurate transcription, and token-level output streaming with punctuation-aware chunking to reduce time-to-first-audio. Through hands-on insights, developers will learn how to design responsive, power-efficient voice assistants that run entirely on-device, delivering privacy, reliability, and real-time interaction without cloud dependency.
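As a rough illustration of the punctuation-aware chunking mentioned above (not the workshop's actual code), the sketch below buffers streamed language-model tokens and hands a chunk to Text-to-Speech as soon as a punctuation boundary arrives, so audio can start before the full reply is generated. The names `chunk_on_punctuation`, `BOUNDARY_CHARS` and `min_chars` are hypothetical and chosen only for this example.

```python
from typing import Iterable, Iterator

# Characters treated as chunk boundaries (an assumption for this sketch).
BOUNDARY_CHARS = ".!?;:,"


def chunk_on_punctuation(tokens: Iterable[str], min_chars: int = 12) -> Iterator[str]:
    """Yield speakable text chunks as soon as a punctuation boundary arrives."""
    buffer = ""
    for token in tokens:
        buffer += token
        stripped = buffer.rstrip()
        # Flush once the buffer ends in punctuation and is long enough to be
        # worth a separate TTS call; shorter fragments keep accumulating.
        if stripped and stripped[-1] in BOUNDARY_CHARS and len(stripped) >= min_chars:
            yield stripped.strip()
            buffer = ""
    # Flush whatever remains when the model stops generating.
    if buffer.strip():
        yield buffer.strip()


# Simulated token stream from a language model (illustrative data only).
stream = ["Sure", ",", " the", " living", " room", " is", " 21", " degrees",
          ".", " Any", "thing", " else", "?"]
chunks = list(chunk_on_punctuation(stream))
for chunk in chunks:
    print(chunk)
```

The `min_chars` threshold keeps very short fragments such as "Sure," from being synthesized on their own. A real pipeline would also need to handle cases this sketch ignores, such as decimal points and abbreviations.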
What you will learn:
– Understand how to quantize AI models with OpenVINO for optimized performance, whether speech-to-text, text-to-speech, or language models
– Explore how local machine learning models can be used to run multi-modal AI workloads for man
– Try the workflow hands-on in a lab with open-source models and a sample application in a Jupyter notebook environment
* The first 20 participants will get access to an instance to try it out and experience the compute efficiency of Intel® Core™ Ultra processor-powered edge devices. Other participants are welcome to download the workshop materials and try them on their own systems.
——————————
WORKSHOP #2 – The Self-Driving Home: TinyAgents Cooperating over Tiny A2A on STM32 – Hosted by Marcus Rueb of ForestHub.ai
TARGET AUDIENCE
Embedded & firmware engineers, edge-AI developers, and technical decision-makers building local, sensor-driven products – anyone interested in running multiple cooperating agents on constrained MCUs/MPUs without the cloud.
GOALS
By the end of the session attendees will understand:
– How several TinyAgents run locally on the STM32N6 (Neural-ART NPU) and STM32MP2, each owning a domain – comfort, energy, safety.
– How those agents coordinate over Tiny A2A, a lightweight agent-to-agent protocol for low-power, bandwidth-constrained devices, rather than relying on static rules/scenes.
– How this plays out across three concrete use cases:
* Comfort – a vision agent detects dirt/spills and dispatches the vacuum robot to the spot.
* Energy – agents detect real presence/usage, steer heating & cooling dynamically, and orchestrate bidirectional EV charging (V2H/V2G).
* Safety – on-device vision detects falls/injury (elderly care) and alerts instantly – no camera frame ever leaves the home.
– How to operate these agents in production: deploy, monitor (AgentOps), update and govern them across a fleet of homes – a full lifecycle, 100% local.
PREPARATION REQUIRED
– Nothing mandatory to follow along.
– To replay the demo: an STM32N6 Discovery kit and/or STM32MP2 board plus our walkthrough repo (link shared ahead of time); a Linux host or WSL environment is recommended for the local toolchain.
DURATION
1 hour (technical webinar + live demo)
——————————
WORKSHOP #3 – Grounded Video Understanding on the Edge with Small Language Models – Hosted by Andrea Basso, PhD, MITO Tech Ventures
Deploying vision-language AI on edge devices is no longer just a matter of shrinking models; it requires rethinking how they are structured, optimized, and grounded. In this hands-on workshop, we present a practical approach to building efficient multimodal pipelines using small language models, lightweight vision components, and embedded NPUs.
Participants will learn how to optimize llama.cpp, extend multimodal projection layers for flexible resolutions, quantize models, and deploy YOLO on STM32MP2 hardware. The workshop will present the Narrative Camera, a real-world example of a grounded, multi-stage pipeline that transforms raw video streams into structured, human-readable narratives while minimizing hallucinations.
By the end of the workshop, attendees will have the tools and insights to design their own STM32MP2 edge-native AI systems that move beyond raw perception to meaningful, interpretable understanding.
