Features — Brethof Voice Pro

🔒

Complete Privacy

Every word you speak is processed on your device. No audio, text, or metadata is ever transmitted to any server. There is no cloud backend, no telemetry, no analytics, and no phone-home.

Zero network calls during transcription
Models stored locally after one-time download
Open-source Qwen3-ASR engine — fully auditable

⚡

GPU Acceleration

Brethof Voice Pro uses the GGUF-optimized engine with llama.cpp for blazing-fast inference. It supports all three major GPU vendors out of the box.

NVIDIA — Vulkan acceleration (GTX 10-series and newer)
AMD — Vulkan acceleration (RX 500-series and newer)
Intel — Vulkan acceleration (Arc GPUs and integrated graphics)
CPU fallback — runs without a GPU, just slower

🌐

36 Languages

Speak in any supported language and the engine detects it automatically, or lock the app to a specific language for maximum accuracy. All processing happens locally.

English, Chinese, Cantonese, Polish, German, French, Spanish, Portuguese, Italian, Dutch, Russian, Ukrainian, Czech, Slovak, Hungarian, Romanian, Bulgarian, Croatian, Slovenian, Serbian, Macedonian, Swedish, Danish, Finnish, Norwegian, Turkish, Arabic, Hebrew, Hindi, Japanese, Korean, Thai, Vietnamese, Indonesian, Malay, Greek

📈

6 Quality Tiers

Choose the perfect balance of accuracy, speed, and resource usage. All tiers use the same Qwen3-ASR architecture with different encoder precision and model sizes.

Max Full (3.2 GB) — Maximum accuracy, FP32 encoder + 1.7B decoder
Max Balanced (2.6 GB) — Near-identical quality, FP16 encoder
Max Light (2.3 GB) — Large model, INT8 encoder for lower VRAM
Fast Balanced (1.2 GB) — Recommended default, excellent quality
Fast Full (1.5 GB) — Compact model, maximum accuracy
Fast Light (1.0 GB) — Smallest tier, runs on integrated GPUs

Switch between tiers at any time from the app settings. No re-download needed — all tiers ship with the installer.
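
As a rough illustration of how a tier could be chosen programmatically, the sketch below picks the largest tier that fits a given memory budget. The sizes come from the tier list above. The function name `pick_tier`, the headroom factor, and the use of model size as a stand-in for memory use are assumptions made for illustration, not the app's actual selection logic.

```python
# Tier sizes in GB, taken from the tier list above.
TIERS = {
    "Max Full": 3.2,
    "Max Balanced": 2.6,
    "Max Light": 2.3,
    "Fast Full": 1.5,
    "Fast Balanced": 1.2,
    "Fast Light": 1.0,
}


def pick_tier(free_memory_gb, headroom=1.25):
    """Return the largest tier whose size times `headroom` fits the budget.

    Hypothetical helper: model size is only a rough proxy for memory use,
    and `headroom` is an assumed safety margin, not a measured value.
    Returns None when even the smallest tier does not fit.
    """
    fitting = [
        (size, name)
        for name, size in TIERS.items()
        if size * headroom <= free_memory_gb
    ]
    return max(fitting)[1] if fitting else None


for budget in (8.0, 2.0, 1.0):
    print(budget, "GB free ->", pick_tier(budget))
```

Sorting by size rather than by name keeps the choice tied to the one number the list actually publishes. A real selector would probably also weigh speed, which this sketch ignores.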

🎵

AI Noise Reduction

Built-in DeepFilter noise suppression processes your microphone input in real-time before transcription. Background noise, keyboard sounds, and room echo are removed automatically.

Runs in real-time on your audio input
Configurable attenuation (0–100 dB)
No extra hardware needed
Can be disabled when not needed
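
To give a feel for what the attenuation setting means, the sketch below converts a decibel value into the factor by which noise amplitude is scaled, using the standard relation gain = 10^(-dB/20). The function name is hypothetical, and how the app applies the setting internally is not described here.

```python
def attenuation_to_gain(db):
    """Convert an attenuation in dB to a linear amplitude factor."""
    if not 0 <= db <= 100:
        raise ValueError("attenuation must be between 0 and 100 dB")
    return 10 ** (-db / 20)


# Print the amplitude factor for a few sample settings.
for db in (0, 6, 20, 40, 100):
    print(f"{db:>3} dB -> noise amplitude x {attenuation_to_gain(db):.5f}")
```

At 0 dB the noise passes through unchanged, and every additional 20 dB reduces its amplitude by another factor of ten.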

🎓

Personal Voice Training (Coming Q3 2026)

Fine-tune the model on your own voice using LoRA (Low-Rank Adaptation) to improve recognition of your accent, language, and speaking style. Not available in v1.0 — we disabled training when we moved the inference engine to GGUF, and we are waiting to ship it until the PyTorch-to-GGUF conversion pipeline is solid. When it lands it will be free for anyone with a Personal or Business license, and all training will run on your machine.

Adapt to your accent, dialect, and speaking rhythm
Record training samples directly in the app
LoRA fine-tuning — fast, efficient, no full retrain
Train on domain-specific vocabulary (medical, legal, technical)
Your voice data stays local — always
Read the roadmap post →
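
For readers unfamiliar with LoRA, the arithmetic below shows why it avoids a full retrain: instead of updating a full d-by-k weight matrix, it trains two small matrices of shapes d-by-r and r-by-k. The layer size and rank used here are illustrative assumptions, not Brethof Voice Pro settings.

```python
def lora_param_counts(d, k, r):
    """Return (full, lora) trainable parameter counts for one weight matrix.

    full: parameters in a complete d x k update.
    lora: parameters in the low-rank pair (d x r) and (r x k).
    """
    full = d * k
    lora = r * (d + k)
    return full, lora


# Assumed example values: a 2048 x 2048 layer adapted with rank 8.
full, lora = lora_param_counts(d=2048, k=2048, r=8)
print(f"full update: {full:,} parameters")
print(f"LoRA update: {lora:,} parameters ({lora / full:.2%} of full)")
```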

⌨️

Direct Text Injection

Transcribed text is typed directly into whatever application has focus. No copy-paste, no clipboard. It works like a keyboard — press your hotkey, speak, and the text appears.

Works with any text field, editor, terminal, or chat
Configurable hotkey (default: Ctrl+D)
Extra mouse button support for hands-free recording
Streaming mode: text appears while you speak
Supports X11 and Wayland on Linux

📚

Hotword Context

Provide a list of domain-specific terms, names, or jargon to bias the model toward correct recognition. Ideal for technical dictation, medical notes, or any specialized vocabulary.

Add hotwords in Settings — one per line
Improves recognition of proper nouns and abbreviations
No retraining needed — applied at inference time
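
A one-per-line list is easy to clean up before use. The sketch below shows one way such a list might be parsed. The function name, the whitespace trimming, and the case-insensitive de-duplication are assumptions for illustration; the app's own handling of the hotword field is not documented here.

```python
def parse_hotwords(text):
    """Split a one-per-line hotword list into a clean list.

    Blank lines are skipped, surrounding whitespace is trimmed, and
    case-insensitive duplicates are dropped while keeping first-seen order.
    """
    seen = set()
    hotwords = []
    for line in text.splitlines():
        term = line.strip()
        key = term.casefold()
        if term and key not in seen:
            seen.add(key)
            hotwords.append(term)
    return hotwords


sample = """
Qwen3-ASR
llama.cpp

  DeepFilter
qwen3-asr
"""
print(parse_hotwords(sample))
```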

Everything in One App