Learn more: https://bit.ly/4vPQ3HE
Voice is one of the most natural human interfaces, but adding it to AI applications has historically forced a tradeoff: fast voice-to-voice models that sacrifice reliability, or accurate speech-to-text-to-LLM-to-speech pipelines that add latency. This course teaches you how to get both, using Vocal Bridge's architecture, which pairs a real-time foreground agent with a reasoning background agent.
Taught by Ashwyn Sharma, CEO and Co-Founder of Vocal Bridge (an AI Fund portfolio company), this course covers three practical integration patterns that meet you where you are: voice embedded in an application, voice layered onto an existing agent without touching its logic, and voice as a tool your LLM can call when it decides a conversation is the right modality.
In detail, you'll survey the traditional voice stack and its tradeoffs, then explore three live integration patterns to understand when each one applies. You'll build a voice-interactive tic-tac-toe game where voice commands and mouse clicks work together over a single synchronized channel, then add a voice layer to an existing agent with minimal code, leaving your prompts, RAG pipeline, and tools untouched. You'll give your agent a make_phone_call tool so it can dial a real number, hold a conversation with a demo agent, and stream the transcript back live. You'll set up evaluation-driven development using Vocal Bridge's multimodal evaluator to score calls, catch regressions, and refine prompts before issues reach users. Finally, you'll hear from Scott Johnston, former CEO of Docker and Vocal Bridge board member, on what it actually takes to move voice agents from demos to production.
By the end of this course, you’ll have implemented three hands-on voice AI patterns: adding voice to an interactive app, layering voice onto a text-based agent, and giving an agent the ability to place outbound calls. You’ll also know how to evaluate and improve voice interactions.
Enroll here: https://bit.ly/4vPQ3HE
From centralized to distributed: In the old world, organizations relied on one centralized data and AI platform. In the new world of AI agents, every agent needs its own sandboxed, secure, and modern data stack.
In this 20-minute talk with a live demo, Spice AI's Luke Kim explores why this architectural shift is critical and the key patterns required to give agents reliable, real-time data.
The next major shift in enterprise AI is underway: enterprises are moving from generic AI they rent to specialized AI they own. The benefits are clear: higher quality, dramatically lower costs, full control, and a flywheel that keeps improving quality in production.
But building specialized AI models has been prohibitively hard; each use case requires months of effort and deep AI expertise. Well, it used to be. VibeML enables engineers to build specialized AI models automatically from a prompt, in minutes. An AI agent builds your AI model end to end: evaluation, data synthesis, training, and repeat.
This talk by OUMI's Manos Koukoumidis & Stefan Webb demonstrates how VibeML can give deep AI experts superpowers while also enabling non-experts to build specialized models.