Everything in One App

Professional voice-to-text that runs entirely on your machine. No cloud, no network latency, no compromise.

[Screenshot: Brethof Voice Pro, Main Screen]
[Screenshot: Brethof Voice Pro, Recording]
🔒

Complete Privacy

Every word you speak is processed on your device. No audio, text, or metadata is ever transmitted to any server. There is no cloud backend, no telemetry, no analytics, and no phone-home.

  • Zero network calls during transcription
  • Models stored locally after one-time download
  • Open-source Qwen3-ASR engine — fully auditable

GPU Acceleration

Brethof Voice Pro runs a GGUF-optimized engine on llama.cpp for fast local inference. It supports all three major GPU vendors out of the box through Vulkan, and falls back to the CPU when no GPU is available.

  • NVIDIA — Vulkan acceleration (GTX 10-series and newer)
  • AMD — Vulkan acceleration (RX 500-series and newer)
  • Intel — Vulkan acceleration (Arc GPUs and integrated graphics)
  • CPU fallback — runs without a GPU, just slower
🌐

36 Languages

Speak in any supported language and the engine detects it automatically, or lock the app to a specific language for maximum accuracy. All processing happens locally.

English, Chinese, Cantonese, Polish, German, French, Spanish, Portuguese, Italian, Dutch, Russian, Ukrainian, Czech, Slovak, Hungarian, Romanian, Bulgarian, Croatian, Slovenian, Serbian, Macedonian, Swedish, Danish, Finnish, Norwegian, Turkish, Arabic, Hebrew, Hindi, Japanese, Korean, Thai, Vietnamese, Indonesian, Malay, Greek
📈

6 Quality Tiers

Choose the perfect balance of accuracy, speed, and resource usage. All tiers use the same Qwen3-ASR architecture with different encoder precision and model sizes.

  • Max Full (3.2 GB) — Maximum accuracy, FP32 encoder + 1.7B decoder
  • Max Balanced (2.6 GB) — Near-identical quality, FP16 encoder
  • Max Light (2.3 GB) — Large model, INT8 encoder for lower VRAM
  • Fast Balanced (1.2 GB) — Recommended default, excellent quality
  • Fast Full (1.5 GB) — Compact model, highest accuracy among the Fast tiers
  • Fast Light (1.0 GB) — Smallest tier, runs on integrated GPUs

Switch between tiers at any time from the app settings. No re-download needed — all tiers ship with the installer.
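If you are unsure which tier fits your hardware, the choice can be reasoned about roughly as in the sketch below. It picks the largest tier whose listed size, multiplied by a safety margin, fits a memory budget you supply. The tier names and sizes come from the list above; the `pick_tier` function and the `headroom` factor are hypothetical illustrations, and treating the listed size as a proxy for memory use is an assumption, not a documented requirement of the app.

```python
from typing import Optional

# Tier names and sizes (GB) as listed on this page.
TIERS_GB = {
    "Max Full": 3.2,
    "Max Balanced": 2.6,
    "Max Light": 2.3,
    "Fast Full": 1.5,
    "Fast Balanced": 1.2,
    "Fast Light": 1.0,
}


def pick_tier(budget_gb: float, headroom: float = 1.25) -> Optional[str]:
    """Return the largest tier whose size * headroom fits within budget_gb.

    `headroom` is a hypothetical safety factor for runtime buffers.
    Returns None when even the smallest tier does not fit.
    """
    fitting = {
        name: size
        for name, size in TIERS_GB.items()
        if size * headroom <= budget_gb
    }
    if not fitting:
        return None
    # Largest size among the tiers that fit.
    return max(fitting, key=fitting.get)


if __name__ == "__main__":
    for budget in (8.0, 2.0, 1.6, 1.0):
        print(budget, "GB ->", pick_tier(budget))
```

Because every tier can be selected from the settings at any time, a heuristic like this is only a starting point; trying a heavier tier and stepping down if it feels slow works just as well.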

🎵

AI Noise Reduction

Built-in DeepFilter noise suppression processes your microphone input in real time before transcription. Background noise, keyboard sounds, and room echo are suppressed automatically.

  • Runs in real time on your audio input
  • Configurable attenuation (0–100 dB)
  • No extra hardware needed
  • Can be disabled when not needed
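To get a feel for what an attenuation value in decibels means, the sketch below converts it to a linear amplitude multiplier using the standard amplitude convention (20 dB per factor of ten). The function name `attenuation_to_gain` is hypothetical, and how the app applies the setting internally is not described here; this only illustrates the arithmetic behind the 0–100 dB range.

```python
def attenuation_to_gain(attenuation_db: float) -> float:
    """Convert an attenuation in dB to a linear amplitude multiplier.

    Uses the amplitude convention: gain = 10 ** (-dB / 20).
    The accepted range mirrors the 0-100 dB range of the setting.
    """
    if not 0.0 <= attenuation_db <= 100.0:
        raise ValueError("attenuation must be between 0 and 100 dB")
    return 10.0 ** (-attenuation_db / 20.0)


if __name__ == "__main__":
    for db in (0, 6, 20, 40, 100):
        print(f"{db:>3} dB -> amplitude x {attenuation_to_gain(db):.6f}")
```

Under this convention, 0 dB leaves the signal untouched, 20 dB reduces amplitude to one tenth, and each further 20 dB divides it by ten again.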
🎓

Personal Voice Training (Coming Q3 2026)

Fine-tune the model on your own voice using LoRA (Low-Rank Adaptation) to improve recognition of your accent, language, and speaking style. Not available in v1.0 — we disabled training when we moved the inference engine to GGUF, and we are waiting to ship it until the PyTorch-to-GGUF conversion pipeline is solid. When it lands it will be free for anyone with a Personal or Business license, and all training will run on your machine.

  • Adapt to your accent, dialect, and speaking rhythm
  • Record training samples directly in the app
  • LoRA fine-tuning — fast, efficient, no full retrain
  • Train on domain-specific vocabulary (medical, legal, technical)
  • Your voice data stays local — always
  • Read the roadmap post →
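Why LoRA avoids a full retrain comes down to parameter counts. Instead of updating a full weight matrix, LoRA trains two small low-rank factors whose product is added to the frozen weights. The sketch below compares the two counts for a single layer. The layer dimensions and rank are illustrative assumptions, not the actual sizes used by the model or by the planned training feature.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Parameters in a full d_out x d_in weight matrix."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in the low-rank factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


if __name__ == "__main__":
    # Hypothetical layer size and rank, chosen only for illustration.
    d_out, d_in, rank = 2048, 2048, 8
    full = full_params(d_out, d_in)
    lora = lora_params(d_out, d_in, rank)
    print(f"full update: {full:,} parameters")
    print(f"LoRA update: {lora:,} parameters ({full // lora}x fewer)")
```

For this hypothetical layer, the low-rank update trains 32,768 parameters instead of 4,194,304, which is the kind of reduction that makes on-device fine-tuning practical.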
⌨️

Direct Text Injection

Transcribed text is typed directly into whatever application has focus. No copy-paste, no clipboard. It works like a keyboard — press your hotkey, speak, and the text appears.

  • Works with any text field, editor, terminal, or chat
  • Configurable hotkey (default: Ctrl+D)
  • Extra mouse button support for hands-free recording
  • Streaming mode: text appears while you speak
  • Supports X11 and Wayland on Linux
📚

Hotword Context

Provide a list of domain-specific terms, names, or jargon to bias the model toward correct recognition. Ideal for technical dictation, medical notes, or any specialized vocabulary.

  • Add hotwords in Settings — one per line
  • Improves recognition of proper nouns and abbreviations
  • No retraining needed — applied at inference time
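If you keep your hotwords in a text file, a one-per-line list is easy to tidy before pasting it into Settings. The sketch below trims whitespace, skips blank lines, and drops case-insensitive duplicates while preserving order. The `parse_hotwords` helper is hypothetical and not part of the app; how the app itself handles duplicates or blank lines is an assumption left open here.

```python
from typing import List


def parse_hotwords(text: str) -> List[str]:
    """Split a one-per-line hotword list into clean, unique terms.

    Blank lines are skipped and duplicates are removed case-insensitively,
    keeping the first spelling encountered.
    """
    seen = set()
    hotwords = []
    for line in text.splitlines():
        term = line.strip()
        key = term.casefold()
        if term and key not in seen:
            seen.add(key)
            hotwords.append(term)
    return hotwords


if __name__ == "__main__":
    raw = "Qwen3-ASR\n\n  llama.cpp \nqwen3-asr\nGGUF"
    print(parse_hotwords(raw))
```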

Ready to try it?

14-day free trial. All features unlocked. No credit card.