Personal voice training has shipped in Brethof Voice Pro. Fine-tune the recognition model on your own voice, locally, with one click: no command line, no cloud.
How it works
Every time you correct a misrecognised word, the app saves the audio clip and your correction to a local training dataset. The main window's training card shows how many samples and minutes you have captured — click it to browse, play back, or delete entries. When you are ready, the Training tab fine-tunes a LoRA adapter on your voice.
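To make the data flow concrete, here is a minimal Python sketch of what one captured correction and the training card's totals might look like. The `CorrectionSample` record, its field names, and the `summarize` helper are all hypothetical illustrations, not the app's actual on-disk format.

```python
from dataclasses import dataclass


@dataclass
class CorrectionSample:
    """One hypothetical training entry: an audio clip plus the user's fix."""
    audio_path: str          # where the clip is stored locally (assumed layout)
    recognized_text: str     # what the model originally produced
    corrected_text: str      # what the user typed as the correction
    duration_seconds: float  # length of the audio clip


def summarize(samples):
    """Return (sample_count, total_minutes), the two figures a training card could show."""
    total_seconds = sum(s.duration_seconds for s in samples)
    return len(samples), total_seconds / 60.0


samples = [
    CorrectionSample("clips/0001.wav", "brett of voice", "Brethof Voice", 30.0),
    CorrectionSample("clips/0002.wav", "laura adapter", "LoRA adapter", 90.0),
]
count, minutes = summarize(samples)
print(f"{count} samples, {minutes:.1f} minutes")
```

Deleting an entry in a scheme like this would simply mean removing the record and its clip, which is consistent with everything staying on local disk.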
The pipeline picks its backend automatically: an NVIDIA CUDA build (cu128 PyTorch, auto-installed to a cached venv on first run) when a compatible GPU is present, otherwise a CPU fallback. When training finishes, the LoRA is merged with the base model and exported to GGUF so llama.cpp can load it — switch to your personal model from Settings → Models. The PyTorch-to-GGUF conversion that used to block this is done and runs end-to-end.
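The two steps described above, choosing a backend and merging the LoRA into the base weights, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the app's code: `cuda_is_available`, `pick_backend`, and `merge_lora` are hypothetical names, the GPU check assumes PyTorch's `torch.cuda.is_available()`, and the merge uses the standard LoRA update (base weight plus `alpha / rank` times the product of the two low-rank matrices) on plain lists so it runs without extra dependencies.

```python
import importlib.util


def cuda_is_available() -> bool:
    """Assumed check: True only if PyTorch is installed and reports a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch
    return torch.cuda.is_available()


def pick_backend(cuda_available: bool) -> str:
    """Prefer the CUDA build when a compatible GPU is present, else fall back to CPU."""
    return "cuda" if cuda_available else "cpu"


def merge_lora(base, lora_a, lora_b, alpha, rank):
    """Return base + (alpha / rank) * (lora_b @ lora_a).

    base is d_out x d_in, lora_b is d_out x rank, lora_a is rank x d_in,
    all given as lists of lists.
    """
    scale = alpha / rank
    merged = [row[:] for row in base]  # copy so the base weights stay untouched
    for i in range(len(base)):
        for j in range(len(base[0])):
            delta = sum(lora_b[i][k] * lora_a[k][j] for k in range(rank))
            merged[i][j] += scale * delta
    return merged


backend = pick_backend(False)  # pass cuda_is_available() on a real machine
merged = merge_lora(
    base=[[1.0, 0.0], [0.0, 1.0]],
    lora_a=[[3.0, 4.0]],
    lora_b=[[1.0], [2.0]],
    alpha=2.0,
    rank=1,
)
print(backend, merged)
```

Once merged, the result is an ordinary full-weight model, which is why it can be exported to a single GGUF file without shipping the adapter separately.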
What you get
- In-app data collection. Corrections become training data automatically — just keep using the app.
- One-click LoRA fine-tuning. Train on your GPU (NVIDIA CUDA) or CPU. Progress bar, not a log file.
- Auto GGUF export. The trained model is merged and converted for you — no manual steps.
- Model versioning. Keep the stock model alongside your fine-tuned one and switch between them in settings.
- Hotwords for vocabulary. Add names, acronyms, and product terms in the Hotwords dialog — applied instantly, no retraining needed.
Who this matters for
- Non-native speakers. The base model already handles heavy accents with solid accuracy, but fine-tuning on your own voice is where accuracy jumps into the 95%+ range.
- Medical, legal, and technical users. Drug names, case law citations, and codebase identifiers all benefit massively from a custom vocabulary.
- Speech-impaired users. Fine-tuning on dysarthric or atypical speech is one of the highest-value uses of local training. Commercial cloud services do not offer it.
- Low-resource languages. Base models are good in popular languages and weaker in less-common ones. Fine-tune on a few hours of your own audio and the gap closes fast.
What is not changing
- Training runs locally. Voice samples never leave your machine. That is the whole point.
- Your fine-tuned model is yours. We do not upload it. We do not pool it into a "shared" model. It lives on your disk.
- Free with any paid license. Not a separate upgrade. If you own Personal or Business, the training UI is included in the current build.
Availability
Available now: personal voice training ships in the current build and runs end-to-end on Linux and Windows. Free with any paid license; your voice data never leaves your machine.
Have a specific fine-tuning use case — specialised vocabularies, unusual accents, atypical speech? Tell us so we keep improving it: [email protected].