Building Real-Time Voice Agents from Scratch - Learning Roadmap | Nemorize

Learning Topics

This roadmap covers the following topics:

Part I: Foundations
Part II: The Pipeline
  • ASR with faster-whisper
    • Model Size Trade-offs
    • ASR as a Blocking Call
  • LLM Streaming & State
    • Speakable System Prompt
    • The Commit Pattern
  • TTS & Latency Trick
    • pop_sentences Deep Dive
    • Kokoro vs Piper Backends
Part III: The Hard Parts
  • Barge-in: Interruption
    • Yield-Point Latency
    • Cancel Wire Protocol
  • The Feedback Loop
    • Browser AEC
  • Playback State Machine
    • Three Distinct Moments
Part IV: Engineering It Well
  • Frontend Audio Scheduling
    • AudioWorklet for Mic Capture
    • Gapless playHead Scheduling
  • Concurrency & Orchestration
    • run_in_executor Pattern
    • asyncio vs Threads: Same Shape
Part V: Make It Yours
  • Capstone Extensions
    • Measurable Latency Fork
    • Extension Projects
  • The Production Bridge
    • Trade-offs You Now Own
    • Why Hosted APIs Choose as They Do
