Stay up to date with your local chapter here.
It's good to see thinking in action - and we think you'll like this edition's Guest Editor.
MLOps Engineer @ Electric Twin // MLOps Community London Co-host
Still buzzing 🤠 after the great Community meetup we had in London last week. We packed in a lightning talk marathon with six speakers - and now I get to guest edit the newsletter!
A quick thanks again to our ⚡ speakers - Pier, Matt, Li, Vinay, Richard, and Wendy - and I'm taking this chance to share some of their materials:
Be sure to check here to see what’s happening near you.
To the newsletter! This week’s chat between Demetrios, Paul and Floris about real-time voice AI was interesting for us at Electric Twin. Their take on balancing latency with responsiveness - especially how turn detection impacts conversation flow - stood out. And going with the flow ties nicely into Médéric’s blog on building more flexible ML pipelines.
But the thing I’m most interested in this week? The stories that’ll come from the ML Confessions section! Please, share your stories, help us all feel better about ourselves!
Before I go, Demetrios asked me to highlight three ways he can help you (just hit reply and let him know):
- Curated intros to other community members
- What problems are you dealing with? Let him help you find the best solutions through his network
- Looking to augment your staff for an MLOps or AI project? He’s got you covered
Enjoy the newsletter!
The Challenge with Voice Agents
Paul van der Boor // Senior Director Data Science @ Prosus Group
Floris Fok // AI Engineer @ Prosus Group
There's a lot of talk about AI stealing jobs, but as a podcast host, I wasn’t too worried. After this chat, I’m starting to think I should have a backup plan.
Voice AI is moving fast, and Paul and Floris walked me through what it takes to build real-time voice agents. OpenAI’s Real-Time API is a big step forward, but challenges like latency, memory, and making interactions feel natural still remain. Unlike text-based LLMs, voice models struggle with context, hallucinations, and handling multiple languages.
One of the biggest hurdles is knowing when to listen and when to speak. Turn detection has to balance avoiding awkward silences with not interrupting the user. A few key things stood out (there’s a toy sketch after the list):
- Adapting to individual speaking styles - A pause might mean “I’m thinking” for one person and “I’m done talking” for another. Fine-tuning for different users is essential.
- Handling interruptions properly - If a user cuts the AI off, it needs to know they didn’t hear the last part. OpenAI’s API trims responses when interrupted, which helps.
- Allowing for natural clarifications - Most voice models are weirdly overconfident and never ask you to repeat yourself, even if they’ve completely misheard. A simple “Sorry, what was that?” would improve things massively.
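To make that trade-off concrete, here’s a toy sketch of the adaptive-threshold idea - the class and the numbers are mine, not from the episode or from OpenAI’s API:

```python
import time


class AdaptiveTurnDetector:
    """Toy end-of-turn detector that adapts its silence threshold per speaker.

    Illustrative only: production voice agents combine voice-activity
    detection, prosody, and semantic cues rather than a single silence timer.
    """

    def __init__(self, base_silence_s: float = 0.7):
        self.base_silence_s = base_silence_s
        self.avg_pause_s = base_silence_s  # running estimate of this user's pauses
        self.last_speech_ts = None         # monotonic time of the last detected speech

    def on_speech(self) -> None:
        """Call whenever voice activity is detected in the incoming audio."""
        now = time.monotonic()
        if self.last_speech_ts is not None:
            pause = now - self.last_speech_ts
            # Only short, mid-utterance gaps update the speaker's pause profile.
            if pause < 2.0:
                self.avg_pause_s = 0.9 * self.avg_pause_s + 0.1 * pause
        self.last_speech_ts = now

    def turn_finished(self) -> bool:
        """True once silence exceeds this speaker's adapted threshold."""
        if self.last_speech_ts is None:
            return False
        silence = time.monotonic() - self.last_speech_ts
        # Pause-heavy speakers get a longer grace period before the agent replies.
        threshold = max(self.base_silence_s, 1.5 * self.avg_pause_s)
        return silence > threshold
```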
We also looked at testing. Custom evals with synthetic voices surfaced some funny issues - certain accents made the model randomly switch languages, and slow speakers accidentally triggered turn detection. Open-source models like Kokoro are making text-to-speech more accessible, but voice AI still needs a rethink of workflows to handle the unpredictability of real conversations. So, my job’s safe for now, but I should probably polish up my resume… just to be safe.
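For a flavor of what those synthetic-voice evals could look like, here’s a hypothetical harness - the `agent` and `synthesize` callables and the reply fields are placeholders I made up, not the actual Prosus setup:

```python
from dataclasses import dataclass


@dataclass
class EvalCase:
    text: str             # what the synthetic voice will say
    accent: str           # e.g. "en-GB", "en-IN"
    speaking_rate: float  # 1.0 = normal, <1.0 = a slow speaker


def run_voice_evals(agent, synthesize, cases):
    """Run synthetic-voice eval cases against a voice agent.

    `agent` and `synthesize` stand in for a real-time agent client and a
    TTS function (e.g. an open-source model); only the eval loop is shown.
    """
    results = []
    for case in cases:
        audio = synthesize(case.text, accent=case.accent, rate=case.speaking_rate)
        reply = agent.respond(audio)
        results.append({
            "case": case,
            # Did an unusual accent make the model switch languages mid-reply?
            "stayed_in_english": reply.language == "en",
            # Did a slow speaker trip turn detection before they finished?
            "cut_user_off": reply.started_before_user_finished,
        })
    return results
```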
Stop Building Rigid AI/ML Pipelines: Embrace Reusable Components for Flexible MLOps
With thanks to Médéric Hurier for their contribution.
There’s a lot of talk about AI mimicking humans, and MLOps pipelines seem to follow suit - starting flexible but seizing up over time.
This stiffness makes iteration and experimentation harder, but an artifact-driven approach can fix that. This blog breaks down how modularizing each step into reusable components - Python packages, Docker images, and config files - creates a more adaptable workflow. These artifacts are versioned, stored in repositories, and combined using Directed Acyclic Graphs (DAGs) instead of tightly coupled steps. One of the biggest wins is how DAGs keep workflows flexible (a rough sketch follows the list):
- Easier experimentation – Need to test a new preprocessing method? Swap in a new artifact without reworking everything.
- Less duplication – The same preprocessing or model training code can be used for both training and inference, keeping things clean.
- Better scalability – DAGs allow parallel execution and smart resource allocation, so big jobs don’t slow everything down.
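As a rough illustration of the artifact-driven idea (the names and registry URIs below are my own, not from Médéric’s blog), a DAG step can point at a versioned artifact plus a config, so swapping a component is a one-line change:

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Artifact:
    """A versioned, reusable component (e.g. a Python package or Docker image)."""
    name: str
    version: str
    uri: str  # where it lives in your package/container registry


@dataclass
class Step:
    artifact: Artifact
    config: dict                                     # parameters kept outside the code artifact
    depends_on: list = field(default_factory=list)   # upstream step names (the DAG edges)


# Hypothetical training pipeline: every step references a versioned artifact.
pipeline = {
    "preprocess": Step(Artifact("preprocess", "1.4.0", "registry.example.com/preprocess:1.4.0"),
                       config={"impute": "median"}),
    "train": Step(Artifact("trainer", "2.1.0", "registry.example.com/trainer:2.1.0"),
                  config={"max_depth": 8}, depends_on=["preprocess"]),
    "evaluate": Step(Artifact("evaluator", "0.9.2", "registry.example.com/evaluator:0.9.2"),
                     config={"metric": "auc"}, depends_on=["train"]),
}

# Trying a new preprocessing method means swapping one artifact, not rewriting
# the pipeline - and an inference DAG can reuse the exact same "preprocess" artifact.
pipeline["preprocess"].artifact = Artifact(
    "preprocess", "2.0.0rc1", "registry.example.com/preprocess:2.0.0rc1"
)
```

Keeping the config outside the code artifact is what lets the same versioned component be reused across training and inference without rebuilding anything.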
This approach borrows from functional programming, treating each artifact as a self-contained unit, which keeps workflows modular and maintainable. Click below before your pipelines start groaning every time they bend down.