🤖 AI · By Neel Vora · December 6, 2025 · 2 min read

Building a Real Voice Assistant with Tools

Voice, AI Agents, Tool Use, Speech Recognition, OpenAI

This post walks through how I built a Real Voice Assistant with Tools, and where it fits in the rest of my work.

Building voice-driven characters like Geary, Charleen, and Humphrey for museums taught me how critical voice UX is in high-traffic public spaces - lessons that shaped this assistant.

I wanted to build something more than a chat box: a voice assistant that could actually do things. The result is a full conversational assistant with:

  • Real tool calling
  • Speech recognition on the client
  • Text-to-speech with multiple voices
  • A clean agent loop that supports memory and context

Goals

My goals were simple:

  • Build a real assistant that feels responsive and natural
  • Support real actions such as fetching weather, doing math, checking time zones, and searching
  • Use a simple tool calling architecture that is easy to extend

Architecture

The system is split into three layers:

  • Frontend for speech recognition and UI events
  • Tool router that handles function calls from the model
  • Backend conversation engine powered by OpenAI and a lightweight memory system
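To make the "lightweight memory system" concrete, here is a minimal sketch of one way it could work: a sliding window that keeps the system prompt plus the most recent turns. The `ConversationMemory` class, the `Message` shape, and the default window size are hypothetical names and values chosen for illustration, not the code behind this project.

```typescript
// Hypothetical sketch: a sliding-window memory, not the project's actual code.
type Role = "system" | "user" | "assistant" | "tool";

interface Message {
  role: Role;
  content: string;
}

class ConversationMemory {
  private turns: Message[] = [];
  private readonly systemPrompt: string;
  private readonly maxTurns: number;

  // maxTurns = 20 is an arbitrary illustrative default.
  constructor(systemPrompt: string, maxTurns: number = 20) {
    this.systemPrompt = systemPrompt;
    this.maxTurns = maxTurns;
  }

  // Record a message and drop the oldest ones once the window is full.
  add(message: Message): void {
    this.turns.push(message);
    if (this.turns.length > this.maxTurns) {
      this.turns = this.turns.slice(-this.maxTurns);
    }
  }

  // Messages to send to the model: system prompt first, then recent turns.
  context(): Message[] {
    return [{ role: "system", content: this.systemPrompt }, ...this.turns];
  }
}
```

A bounded window like this keeps each request small and predictable. The trade-off is that anything older than the window is forgotten unless it is saved elsewhere, for example in session notes.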

Speech input

I used the browser SpeechRecognition API, with a fallback to typed text input when the API is unavailable. The assistant starts listening when you press the mic button.
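The detect-then-fall-back pattern might look roughly like the sketch below. Some browsers expose the constructor only under the prefixed name `webkitSpeechRecognition`, so the sketch checks both. The `RecognitionLike` interface, `pickRecognitionCtor`, and `startInput` are hypothetical helpers written for this example, and the `en-US` language is an assumption.

```typescript
// Minimal local typing of the parts of the Web Speech API used here,
// so the sketch does not depend on DOM type definitions.
interface RecognitionLike {
  lang: string;
  interimResults: boolean;
  onresult:
    | ((event: { results: ArrayLike<ArrayLike<{ transcript: string }>> }) => void)
    | null;
  start(): void;
}
type RecognitionCtor = new () => RecognitionLike;
type InputMode = "voice" | "text";

// Look for the standard constructor first, then the prefixed one.
function pickRecognitionCtor(scope: Record<string, unknown>): RecognitionCtor | null {
  const ctor = scope["SpeechRecognition"] ?? scope["webkitSpeechRecognition"];
  return typeof ctor === "function" ? (ctor as RecognitionCtor) : null;
}

// Hypothetical mic-button handler: start listening if speech recognition
// exists, otherwise tell the caller to show the text box instead.
function startInput(
  scope: Record<string, unknown>,
  onTranscript: (text: string) => void
): InputMode {
  const Ctor = pickRecognitionCtor(scope);
  if (!Ctor) return "text";

  const recognition = new Ctor();
  recognition.lang = "en-US"; // assumed language
  recognition.interimResults = false; // only deliver final results
  recognition.onresult = (event) => onTranscript(event.results[0][0].transcript);
  recognition.start();
  return "voice";
}
```

In a browser this would be called from the mic button's click handler as `startInput(window as unknown as Record<string, unknown>, sendToAssistant)`, where `sendToAssistant` stands in for whatever forwards the transcript to the backend.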

Tool calling

The tools I implemented:

  • Weather
  • Search
  • Calculator
  • Time zones
  • Session notes

The model chooses the tool, and my backend routes the call to the matching handler.
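One simple way to do that routing is a registry that maps tool names to handlers. The sketch below is an assumption about the shape, not the project's actual router. The tool names, argument fields, and `routeToolCall` are hypothetical, and the handlers are synchronous only to keep the example short. Real weather and search handlers would make network calls and return promises.

```typescript
type ToolArgs = Record<string, unknown>;
type ToolHandler = (args: ToolArgs) => string;

// Hypothetical registry: adding a tool means adding one entry here.
const handlers: Record<string, ToolHandler> = {
  calculator: (args) => {
    const a = Number(args["a"]);
    const b = Number(args["b"]);
    switch (args["op"]) {
      case "add": return String(a + b);
      case "sub": return String(a - b);
      case "mul": return String(a * b);
      case "div": return String(a / b);
      default: throw new Error(`unsupported op "${String(args["op"])}"`);
    }
  },
  // Throws a RangeError for an unknown zone, which the router catches below.
  timezone: (args) =>
    new Date().toLocaleTimeString("en-US", { timeZone: String(args["zone"]) }),
};

// Route a tool call to its handler. The model is assumed to supply the
// arguments as a JSON string. Errors are returned as text so the model
// can explain the failure to the user instead of the request crashing.
function routeToolCall(name: string, rawArgs: string): string {
  if (!Object.prototype.hasOwnProperty.call(handlers, name)) {
    return `error: unknown tool "${name}"`;
  }
  try {
    return handlers[name](JSON.parse(rawArgs) as ToolArgs);
  } catch (err) {
    return `error: ${err instanceof Error ? err.message : String(err)}`;
  }
}
```

For example, `routeToolCall("calculator", '{"op":"add","a":2,"b":3}')` returns `"5"`. That string would then be sent back to the model as the tool result.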

TTS

OpenAI TTS provides natural-sounding voices, and the user can choose which one the assistant speaks with.
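As a rough illustration, a backend could turn the user's voice choice into a request against OpenAI's `/v1/audio/speech` endpoint like this. The `buildSpeechRequest` helper, the `tts-1` model choice, the voice list, and the fallback to `alloy` are assumptions made for the example. The project's real model and voice options may differ.

```typescript
// Voices assumed for this sketch; check the current OpenAI docs for the full list.
const VOICES = ["alloy", "echo", "fable", "onyx", "nova", "shimmer"] as const;
type Voice = (typeof VOICES)[number];

interface SpeechRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

function isVoice(value: string): value is Voice {
  return (VOICES as readonly string[]).indexOf(value) !== -1;
}

// Hypothetical helper: build the HTTP request for one spoken reply.
// Unknown voice names fall back to "alloy" rather than failing.
function buildSpeechRequest(
  text: string,
  requestedVoice: string,
  apiKey: string
): SpeechRequest {
  const voice: Voice = isVoice(requestedVoice) ? requestedVoice : "alloy";
  return {
    url: "https://api.openai.com/v1/audio/speech",
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ model: "tts-1", voice, input: text }),
    },
  };
}
```

The result can be passed straight to `fetch(request.url, request.init)`, and the response body is the audio to play. This belongs on the server so the API key never reaches the browser.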

UI design

I aimed for clarity:

  • Big mic button
  • Transcript preview
  • Tool call badges
  • Clean message bubbles

What this project does

This project demonstrates that I understand:

  • Agent design
  • Tool calling ergonomics
  • Real-time UX
  • Speech interfaces

Next, I plan to extend it with streaming audio output.

Keep exploring

From here you can:

  • See how I applied similar patterns on the

Thanks for reading! If you found this useful, check out my other posts or explore the live demos in my AI Lab.