Launch announcements, Champions Program news, new features, and behind-the-scenes engineering from the Brethof Voice Pro team.
Brethof Voice Pro is no longer just voice-to-text. v2.0.0 ships offline translation powered by Tencent Hunyuan MT2. On FLORES-200 (XCOMET-XXL), the 7B tier reaches 97.9% of Google Gemini 3.1 Pro's score, and it surpasses Gemini on real-world and minority-language tests. Translation runs entirely on your machine. Two model tiers are downloaded on demand: Fast (~1 GB, sub-second on CPU or GPU) and Quality (~4.3 GB, sub-second on GPU). Plus several long-awaited additions.
What is new in v2.0.0:
EN: … || PL: …), or first target only.
translate_text, translate_srt, list_compute_devices, set_compute_device. Total tool count now 19.
Linux binary is 161 MB, Windows installer is 118 MB. Same launch prices: $49 personal, $149 business. Existing licences carry over: just download v2.0.0 and the translation models will appear in Settings → Models.
The training pipeline shipped. LoRA fine-tuning on your own voice now runs end-to-end on your machine: the app auto-picks NVIDIA CUDA or CPU, then auto-exports the trained model to GGUF when done. Every correction you make in the GUI is auto-saved to your local training dataset; the main window's training card shows total samples and minutes at a glance.
Bonus: voice-keyboard accuracy improved across all languages thanks to a llama.cpp upgrade (build b9222) that fixed a chunk-boundary collapse on long clips. Free with every paid licence.
Two new MCP tools land: start_transcription returns a job ID instantly so the agent can do other work, and get_transcription_status polls for completion. One job at a time, result inlined when done. Long files no longer block the agent loop.
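For agent authors, the start-then-poll flow described above might look like the sketch below. Only the two tool names come from this post. The `call_tool` helper, the `path` and `job_id` arguments, and the `status`, `text`, and `message` result fields are assumptions made for illustration; check the tool schemas your MCP client reports for the real shapes.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical helper type: call_tool(tool_name, arguments) -> result dict.
# A real MCP client exposes its own way of invoking tools.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def transcribe_in_background(call_tool: ToolCaller, path: str,
                             poll_seconds: float = 2.0,
                             timeout_seconds: float = 3600.0) -> str:
    """Start a transcription job, poll until it finishes, return the text."""
    # Assumed argument and field names: "path", "job_id", "status", "text".
    job = call_tool("start_transcription", {"path": path})
    job_id = job["job_id"]

    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        status = call_tool("get_transcription_status", {"job_id": job_id})
        if status["status"] == "done":
            return status["text"]  # the result is inlined once the job is done
        if status["status"] == "error":
            raise RuntimeError(status.get("message", "transcription failed"))
        time.sleep(poll_seconds)  # an agent would do other work here instead
    raise TimeoutError(f"job {job_id} did not finish in {timeout_seconds}s")
```

Because only one job runs at a time, a caller should wait for this function to return before starting the next file.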
Plus a chain of fixes to word-level SRT/VTT output: no more stranded spaces before punctuation, no more lone-dot cues, no more hotword/context strings leaking into the transcript. Cleaner subtitles, no manual cleanup needed.
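To make two of those fixes concrete, here is a minimal sketch of the kind of clean-up involved: closing up spaces stranded before punctuation and dropping cues that contain nothing but punctuation. It is an illustration written for this summary, not the code that ships in Voice Pro, and the function names are made up. It does not cover the hotword/context leak.

```python
import re
from typing import List


def tidy_cue_text(text: str) -> str:
    """Remove spaces stranded before punctuation and collapse repeated spaces."""
    text = re.sub(r"\s+([,.;:!?])", r"\1", text)
    return re.sub(r"\s{2,}", " ", text).strip()


def drop_punctuation_only_cues(cues: List[str]) -> List[str]:
    """Tidy each cue, then drop cues that contain no word characters."""
    tidied = [tidy_cue_text(cue) for cue in cues]
    return [cue for cue in tidied if re.search(r"\w", cue)]
```

For example, `drop_punctuation_only_cues(["Hello , world", ".", " ok !"])` keeps two cues, `"Hello, world"` and `"ok!"`, and drops the lone dot.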
Full engine rewrite. Brethof Voice Pro now runs Qwen3-ASR end-to-end on llama.cpp with GGUF-quantised weights via libmtmd. ONNX Runtime is gone. The result: smaller install (~83 MB binary, down from 400+ MB), faster cold-start, and no more fighting per-platform CUDA/DirectML wheels.
Vulkan picks up your GPU automatically — NVIDIA, AMD, or Intel Arc — with a CPU fallback when no GPU is present. The same engine now powers every downstream feature: ASR, voice keyboard, the MCP server, and (now in v2.0.0) translation.
The Model Context Protocol server lands. Any MCP-compatible AI agent — Claude Desktop, Claude Code, Cursor, Cline — can drive transcription over stdio (no port, no firewall). Same release ships a multi-GPU device selector so you can pick which Vulkan GPU runs ASR, plus the optional Forced Aligner add-on for word-level timestamps on every transcription.
Paid-tier only — the MCP server refuses to launch without a Personal or Business licence. brethof-voice --mcp is the one-line invocation.
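As a rough sketch, registering the server with a client that reads the common `mcpServers` JSON layout (the one Claude Desktop uses in its config file) could look like this. The entry name `brethof-voice-pro` is arbitrary, and the snippet assumes the `brethof-voice` binary is on your PATH; use the full path to the executable otherwise.

```python
import json


def mcp_server_entry(command: str = "brethof-voice") -> dict:
    """Build an mcpServers entry that launches the server over stdio."""
    return {
        "mcpServers": {
            # The key is a display name chosen for this example.
            "brethof-voice-pro": {
                "command": command,  # assumed to be on PATH
                "args": ["--mcp"],   # the invocation named in this post
            }
        }
    }


if __name__ == "__main__":
    # Print JSON that can be merged into the client's existing config file.
    print(json.dumps(mcp_server_entry(), indent=2))
```

Merge the printed entry into any `mcpServers` block you already have instead of overwriting the file.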
The most-asked question before launch was "when macOS?". Answer: in active development. Apple Silicon native build with Metal acceleration comes first, Intel follows. Target Q3 2026 — and we are opening a closed beta in Q2. Here is what is being built and how to sign up.
Personal voice training is live in Voice Pro. Every time you correct a misrecognised word, the audio clip and its correction are auto-saved to your local training dataset. One click in the Training tab fine-tunes a LoRA on your accent: the app picks NVIDIA CUDA or CPU automatically, then auto-exports the trained model to GGUF. Free with every paid licence.
After months of engineering, Voice Pro v1.0 ships today for Windows and Linux. 30 transcription languages plus 22 Chinese dialects, fully offline transcription, hotkey-anywhere dictation, and a one-time price with no subscription. Here is what made it into the launch build and what we are working on next.
The Champions Program opens today and runs until May 16. Fifty free Personal licences per supported language (1,800 total), plus 70% off for every qualifier who does not land in the top 50. Here is how it works and why we are doing it this way.
We listened to early feedback from writers, consultants, and translators who felt the old "personal use only" line was confusing. The Personal licence at $49 now explicitly covers solo and freelance commercial use. The Business licence ($149/seat) is for teams. Here is what changed and why.
Voice Pro now transcribes and presents itself in 30 languages plus 22 Chinese dialects: Arabic, Cantonese, Chinese, Czech, Danish, Dutch, English, Filipino, Finnish, French, German, Greek, Hindi, Hungarian, Indonesian, Italian, Japanese, Korean, Macedonian, Malay, Persian, Polish, Portuguese, Romanian, Russian, Spanish, Swedish, Thai, Turkish, Vietnamese. Here is how the app picks the right one and what "fully translated" actually means.
Voice Pro's ASR backend runs on llama.cpp with GGUF-quantised Qwen models instead of ONNX Runtime. The result: a smaller install (83 MB exe vs 400+ MB), faster cold-start, and no more fighting with CUDA/DirectML wheels on every platform. Here is the engineering story behind the switch.
Every major dictation product sends your voice through someone else's server. We think that is the wrong default. Your voice is the most personal data you generate: medical notes, legal drafts, private journals, work secrets. Here is why Voice Pro has no cloud mode, no "optional telemetry", and no account requirement to transcribe.