Today Brethof Voice Pro v1.0 is available for download on Windows 10/11 and Linux (Ubuntu, Fedora, Arch). It is a desktop application that captures audio from your microphone, runs a Qwen-based ASR model locally on your CPU or GPU, and types the transcribed text into whatever window you are focused on.
What ships in v1.0
- 30 transcription languages plus 38 translation languages: Arabic, Bulgarian, Cantonese, Chinese, Croatian, Czech, Danish, Dutch, English, Filipino, Finnish, French, German, Greek, Hindi, Hungarian, Indonesian, Italian, Japanese, Korean, Macedonian, Malay, Persian, Polish, Portuguese, Romanian, Russian, Serbian, Slovak, Slovenian, Spanish, Swedish, Thai, Turkish, Ukrainian, Vietnamese.
- Global hotkey dictation — press your hotkey anywhere, speak, release, and the text types into the focused app.
- 100% offline — no cloud, no telemetry, no account required to transcribe.
- Three model tiers — base (fast, 0.6B params), medium (1.7B), large (3B). Pick based on your hardware.
- Noise reduction via DeepFilterNet and voice activity detection via Silero VAD, both running locally.
- Hotwords — inject names, acronyms, and domain terms as inference-time hints (no retraining needed).
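As a rough guide to picking a tier, a model's weight memory can be estimated from its parameter count and the number of bits stored per weight. The sketch below does that arithmetic for the three tiers listed above. The bit widths (16-bit and 4-bit) are illustrative assumptions, not the formats Voice Pro actually ships, and the estimate covers weights only, so real memory use will be higher.

```python
# Rough weight-memory estimate per model tier.
# Parameter counts come from the tier list above; the bit widths passed in
# are illustrative assumptions, not the formats Voice Pro actually ships.

TIER_PARAMS = {
    "base": 0.6e9,
    "medium": 1.7e9,
    "large": 3.0e9,
}


def weight_memory_gb(params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


def estimate_all(bits_per_weight: float) -> dict[str, float]:
    """Estimate weight memory for every tier at one bit width."""
    return {
        tier: round(weight_memory_gb(params, bits_per_weight), 2)
        for tier, params in TIER_PARAMS.items()
    }


if __name__ == "__main__":
    for bits in (16, 4):
        print(f"{bits}-bit weights: {estimate_all(bits)}")
```

Under these assumptions the large tier needs about 6 GB for 16-bit weights and about 1.5 GB at 4 bits. Compare those numbers against your free RAM or VRAM, leaving headroom for activations and the rest of the audio pipeline.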
One thing not in v1.0: model fine-tuning. Earlier builds supported it through PyTorch, but moving inference to GGUF broke the training pipeline, and we will not ship it until the PyTorch-to-GGUF conversion step is solid. See the fine-tuning roadmap post.
Pricing
One-time purchase, no subscription:
- Personal — $49, 2 machines, covers solo and freelance commercial use.
- Business — $149/seat, team/organisation use with volume discounts from 10 seats.
- Free 14-day trial — no credit card, all features unlocked.
Prices adjust automatically by region: if you are in Eastern Europe, South-East Asia, Latin America, or Africa, you will see purchasing-power-parity pricing. For example, a Personal license costs roughly $39 in Poland and $29 in India.
What we got right
The thing we are happiest about is time-to-first-word. Open the app, press your hotkey, speak. No signup. No browser. No uploading. No waiting for a cloud round-trip. On a mid-range laptop, Voice Pro transcribes a five-second sentence in under 400 ms with the base model. That speed was the whole point.
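The latency figure above can also be read as a real-time factor: processing time divided by audio duration. The sketch below does that arithmetic with the numbers from this post. Because 400 ms is an upper bound, the results are bounds rather than measurements, and they cover transcription only, not audio capture or typing.

```python
def real_time_factor(processing_s: float, audio_s: float) -> float:
    """Processing time divided by audio duration; below 1.0 is faster than real time."""
    if audio_s <= 0:
        raise ValueError("audio duration must be positive")
    return processing_s / audio_s


audio_s = 5.0        # the five-second sentence
processing_s = 0.4   # upper bound from the post: under 400 ms

rtf = real_time_factor(processing_s, audio_s)
speedup = 1 / rtf

print(f"RTF below {rtf:.2f}, i.e. more than {speedup:.1f}x faster than real time")
```

A real-time factor below 0.08 means the base model handles audio more than 12.5 times faster than it is spoken on that class of hardware.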
What is next
- macOS build — in progress, target Q3 2026.
- Fine-tuning on your voice and vocabulary — coming once the PyTorch-to-GGUF conversion pipeline is ready. Q3 2026 target.
- Real-time streaming mode — words appear as you speak, not after you release the hotkey.
- More model tiers — an ultra-small edge-device model for low-end hardware, and a 7B model for the serious machines.
If you have been waiting for a privacy-respecting dictation tool, today is the day. Download the trial, dictate for 14 days, tell us where it breaks. We read every email.