What’s the idea?
I’d love to see a native voice interface for OpenPawz: the ability to talk to your agent instead of typing.
Why would this be useful?
- Hands-free operation while coding or working
- Faster for quick tasks
- More natural conversation flow
- Accessibility benefits
How might it work?
- Use the Whisper API for speech-to-text
- Use ElevenLabs or a similar service for text-to-speech
- Hot word activation (“Hey Pawz!”)
- Optional: run speech-to-text entirely locally with faster-whisper
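
To make the flow above a bit more concrete, here is a rough sketch of how the pieces could fit together. It is only an illustration under some assumptions: the names `extract_command` and `handle_utterance` are made up for this post, the speech-to-text, agent, and text-to-speech steps are passed in as plain callables (so Whisper, faster-whisper, or ElevenLabs could be plugged in behind them), and the hot word is matched on the transcribed text rather than on raw audio, which a real wake-word detector would more likely do.

```python
import re
from typing import Callable, Optional

# Hypothetical hot word, taken from the "Hey Pawz!" idea in this post.
WAKE_PHRASE = "hey pawz"


def _normalize(text: str) -> str:
    # Lowercase and drop punctuation so "Hey, Pawz!" matches "hey pawz".
    return re.sub(r"[^a-z0-9\s]", "", text.lower()).strip()


def extract_command(transcript: str, wake_phrase: str = WAKE_PHRASE) -> Optional[str]:
    """Return the text after the wake phrase, or None if it is absent."""
    words = _normalize(transcript).split()
    wake = wake_phrase.split()
    if words[: len(wake)] != wake:
        return None
    return " ".join(words[len(wake):])


def handle_utterance(
    audio: bytes,
    transcribe: Callable[[bytes], str],  # assumed STT backend (hosted or local)
    ask_agent: Callable[[str], str],     # assumed hook into the agent
    speak: Callable[[str], None],        # assumed TTS backend
) -> Optional[str]:
    """Run one voice turn: audio -> text -> agent reply -> spoken reply."""
    command = extract_command(transcribe(audio))
    if not command:
        # No wake phrase, or nothing said after it: ignore the utterance.
        return None
    reply = ask_agent(command)
    speak(reply)
    return reply
```

Keeping the three backends as swappable callables would let the same loop work with a hosted API or a fully local setup, without the agent needing to know which one is in use.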
Anything else?
I’ve seen this done well in other projects. Would be happy to help build this if others are interested!
Like this post if you’d use voice mode!