Principles Apr 4, 2026

Why Voice Pro runs 100% offline

Your voice is the most personal data you generate. It should not leave your machine unless you explicitly send it somewhere.

Most major dictation products (Dragon, Otter, Google's dictation, Apple's dictation, and every cloud transcription SaaS) send your voice through someone else's server, at least in their default or cloud configurations. The audio is captured on your machine, streamed over HTTPS to a data centre, transcribed there, and the text comes back. Sometimes the audio is stored. Sometimes it is used to train models. Sometimes it is "anonymised", which is a word that has lost most of its meaning.

We think that is the wrong default.

What people actually dictate

Once you pay attention to what people actually dictate, the content is almost never casual. It is:

  • Medical notes — a doctor transcribing a consultation.
  • Legal drafts — a lawyer dictating a motion or a client letter.
  • Journalism — interviews with named sources who assumed the recording would not leave the journalist's machine.
  • Therapy notes — a psychologist writing up session summaries.
  • Business strategy — an executive drafting memos about deals, people, money.
  • Personal journals — the stuff people actually think, which nobody else should ever read.

All of that is routinely uploaded to cloud services by default, often in violation of HIPAA, GDPR, attorney-client privilege, or plain decency, because the user did not realise or had no alternative.

What Voice Pro does differently

  • No cloud mode, period. There is no toggle to "send to server for better accuracy". The model runs on your CPU or GPU. That is the only option.
  • No telemetry. The app does not phone home with usage stats, crash reports, or anything else. The only network calls are (a) licence check on startup, (b) update check, (c) optional manual model downloads. All three are documented and can be disabled.
  • No account required to transcribe. You can use the 14-day trial without creating an account. An email is only needed if you want to buy a licence.
  • Audio never hits disk. The audio buffer lives in RAM during transcription and is freed the moment the text is produced. Nothing to leak, nothing to forensically recover.
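
The "audio never hits disk" behaviour can be pictured with a minimal sketch. This is not Voice Pro's actual code: `transcribe_in_memory`, `recognise`, and `fake_model` are hypothetical names, and the sketch assumes audio arrives as chunks of raw bytes. It keeps the samples in a single in-memory buffer, hands them to a local model, and overwrites the buffer with zeros as soon as the text exists. No file is ever opened.

```python
from typing import Callable, Iterable


def transcribe_in_memory(
    chunks: Iterable[bytes],
    recognise: Callable[[memoryview], str],
) -> str:
    """Collect audio in RAM, transcribe it, then wipe the buffer.

    `recognise` is a hypothetical stand-in for a local ASR model.
    """
    buffer = bytearray()
    try:
        for chunk in chunks:
            buffer.extend(chunk)  # audio accumulates in RAM only
        # Give the model a zero-copy view of the samples.
        with memoryview(buffer) as view:
            return recognise(view)
    finally:
        # Overwrite the samples in place, then release the storage.
        # This runs whether transcription succeeded or raised.
        buffer[:] = bytes(len(buffer))
        buffer.clear()


def fake_model(samples: memoryview) -> str:
    # Placeholder: a real model would decode the audio here.
    return f"<{len(samples)} bytes of audio transcribed>"


text = transcribe_in_memory([b"\x00\x01", b"\x02\x03"], fake_model)
```

In Python the wipe is best-effort, because the incoming `bytes` chunks are separate objects owned by the caller. A native implementation would have tighter control over every allocation that touches the audio.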

Can local models really match the cloud?

Two years ago, no. A local 100 MB model on a laptop CPU could not touch what Google's data centre was doing with a 200 GB model and 40 GPUs.

Today, yes. Qwen3-ASR 3B running on a mid-range CPU hits word error rates within 2% of the big cloud providers on most languages, and beats them on low-resource languages where the cloud providers have less training data. For dictation specifically — short, intentional utterances — the gap is closer to 0%. Local ASR has quietly become good enough, and it keeps getting better. We are just the ones choosing to ship it.
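
For readers who have not met the metric: word error rate is conventionally the word-level edit distance between a reference transcript and the model's output, divided by the number of words in the reference. The sketch below illustrates that standard definition only. It is not the evaluation harness behind the figures above, and the function name is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


wer = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
```

Here one substituted word in a six-word reference gives a WER of 1/6, roughly 17%. Real evaluations usually normalise case and punctuation first, which this sketch skips.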

The principle

Your voice is the most personal data you generate. It carries your identity, your thoughts, the people you are talking about, the words you chose when nobody was editing. It should not leave your machine unless you explicitly, knowingly send it somewhere.

That is not a marketing line. That is the entire reason the product exists.
