How to Dictate Code Documentation and Comments on Mac

April 23, 2026 · 6 min read · Tags: mac, dictation, developers, speech-to-text, productivity, documentation

Here’s something counterintuitive: the part of a developer’s job that benefits most from voice dictation isn’t the code — it’s everything else.

Think about your average workday. You write a few dozen lines of new code. You also write: inline comments explaining why you did it that way, a README section describing how to run the thing, a PR description walking reviewers through the change, three Slack threads, and a commit message that’s actually useful. That’s a lot of prose. And prose is where dictation shines.

If you’ve tried Mac’s built-in dictation for technical work and been disappointed, that’s a reasonable reaction — it genuinely struggles with technical vocabulary. But the problem is usually the tool, not the concept.

What to Dictate vs. What to Type

Before getting into setup, it’s worth being clear about what voice dictation is good for in a developer workflow — and what it isn’t.

Dictate:

Inline comments and docstrings
README and documentation sections
PR descriptions and review feedback
Commit messages
Slack and Discord messages
Linear/GitHub issues
Meeting notes and summaries

Keep typing:

Actual code syntax (brackets, semicolons, operators)
File paths and commands
Anything with dense symbol sequences

The reason is simple: natural speech maps well onto prose, but poorly onto const handleClick = (e: React.MouseEvent) => {. You’d spend more time correcting than you saved speaking.

The hybrid approach — voice for thinking and explaining, keyboard for syntax — is what most developers who successfully use dictation land on. Once you accept this, the productivity gains become real.

Why Built-in Dictation Falls Short for Developers

macOS has solid built-in dictation (press Fn twice, or use the Globe key, depending on your keyboard and shortcut setting). For everyday writing it works fine. For technical content, it breaks down in predictable ways:

Technical vocabulary: Say “useState” and get “use state.” Say “Kubernetes” and get something unrecognizable. Say “camelCase” and definitely get “camel case.”
No custom vocabulary: You can’t train Apple Dictation to learn your codebase’s naming conventions or project-specific terms.
Short timeout: Built-in dictation disconnects after longer pauses, which breaks flow when you’re thinking through a complex explanation.
No post-processing: What you say is what you get — raw transcription with whatever filler words and false starts you produced.

These limitations don’t matter much for everyday writing. For a developer explaining why useEffect has that specific dependency array, they matter a lot.
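One workaround for the vocabulary problem is a small correction pass over the transcript. The sketch below is only an illustration: the VOCABULARY table and the restore_identifiers function are hypothetical names, and the spoken forms listed are assumptions about what a dictation engine might produce, not observed output.

```python
import re

# Hypothetical project vocabulary: spoken form -> spelling used in the code.
VOCABULARY = {
    "use state": "useState",
    "use effect": "useEffect",
    "camel case": "camelCase",
}


def restore_identifiers(text: str, vocabulary: dict = VOCABULARY) -> str:
    """Replace spoken forms with their code spellings, longest phrase first."""
    for spoken in sorted(vocabulary, key=len, reverse=True):
        pattern = re.compile(r"\b" + re.escape(spoken) + r"\b", re.IGNORECASE)
        # A function replacement keeps the identifier literal (no escape handling).
        text = pattern.sub(lambda _match, repl=vocabulary[spoken]: repl, text)
    return text


print(restore_identifiers("Use state holds the flag and use effect syncs it"))
```

A table like this only covers terms you have already seen mangled, so it complements a better transcription engine rather than replacing one.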

Whisper-Based Tools Handle Technical Content Better

Apps built on OpenAI’s Whisper model — or equivalent — handle technical vocabulary significantly better than Apple’s built-in engine. Whisper was trained on a wide variety of audio including technical content, so it recognizes terms like React, Kubernetes, useState, and GraphQL more reliably.

The practical difference: you can speak naturally about your code without mentally translating technical terms into phonetics that the dictation engine might recognize.
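If you want to try Whisper directly, the open-source openai-whisper Python package accepts an initial_prompt argument that can nudge the model toward the spelling of specific terms. The following is a minimal sketch, assuming the package is installed and a recording exists at the path you pass in. The helper names build_initial_prompt and transcribe_with_vocabulary are made up for this example, and the prompt biases the output rather than guaranteeing it.

```python
def build_initial_prompt(terms: list) -> str:
    """Join project terms into a short context sentence for the recognizer."""
    return "Technical notes mentioning " + ", ".join(terms) + "."


def transcribe_with_vocabulary(audio_path: str, terms: list) -> str:
    """Transcribe a recording, hinting the spelling of project terms."""
    # Imported lazily so the prompt helper works without the package installed.
    import whisper  # the open-source `openai-whisper` package

    model = whisper.load_model("base")  # a small model; larger ones are slower
    result = model.transcribe(audio_path, initial_prompt=build_initial_prompt(terms))
    return result["text"].strip()


print(build_initial_prompt(["useState", "Kubernetes", "GraphQL"]))
```

This is not a description of how any particular app works internally. It only shows that term hints are available if you script Whisper yourself.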

Several macOS apps use Whisper under the hood. LittleWhisper is one option — it’s a menu bar app that sits out of the way until you press a hotkey, records your voice, runs it through Whisper (or Deepgram/Groq if you prefer speed), and types the result directly into whatever window is focused. There’s a Code Comment editor mode that automatically reformats raw speech into clean, comment-appropriate prose — removing filler words and structuring the output for inline documentation.

That last part matters more than it sounds. A spoken draft such as “Okay so uh this function basically handles the case where the user hasn’t set up their profile yet and we need to fall back to defaults” becomes the comment “// Falls back to default values when the user has not set up a profile yet” with no editing required.
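To make the idea of post-processing concrete, here is a deliberately simple, rule-based sketch. It is not how LittleWhisper or any AI editor works: it only strips a short, assumed list of filler words and wraps the rest as line comments, without condensing or rephrasing anything. The names FILLERS and to_line_comment are hypothetical.

```python
import re
import textwrap

# Illustrative filler list (an assumption); real speech needs a longer one.
FILLERS = re.compile(r"\b(?:okay so|uh|um|basically|you know)\b,?\s*", re.IGNORECASE)


def to_line_comment(spoken: str, prefix: str = "// ", width: int = 80) -> str:
    """Strip filler words and wrap the remaining text as line comments."""
    cleaned = FILLERS.sub("", spoken)
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    if cleaned:
        cleaned = cleaned[0].upper() + cleaned[1:]
    return textwrap.fill(
        cleaned, width=width, initial_indent=prefix, subsequent_indent=prefix
    )


print(to_line_comment("Okay so uh this basically retries the request once"))
```

The gap between this output and a well-edited comment is exactly what an AI editor mode is meant to close: rules can delete words, but they cannot decide what the comment should actually say.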

A Practical Workflow by Context

Inline Comments

The most natural use case. When you’ve just written a non-obvious piece of code, press your hotkey, explain your reasoning aloud in plain English, and let the app transcribe and reformat it. The spoken explanation is usually better than what you’d type anyway — you naturally include the why, not just the what.

Docstrings

For longer docstrings (function parameters, return values, edge cases), dictate a first draft and then edit. Speaking a four-sentence explanation takes well under a minute; typing it from scratch, composing as you go, can easily take a few minutes.
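As an illustration of where a dictated draft ends up after editing, here is a small Python function with a docstring covering parameters, return value, and an edge case. The function resolve_display_name and its behavior are invented for this example. Each docstring section corresponds to something you would naturally say aloud when explaining the function.

```python
from typing import Optional


def resolve_display_name(profile: Optional[dict], default: str = "Guest") -> str:
    """Return the name to show for a user, falling back to a default.

    Args:
        profile: The user's saved profile, or None if the user has not
            set one up yet.
        default: The name to use when no usable name is available.

    Returns:
        The profile's "name" value with surrounding whitespace removed,
        or the default if the profile is missing or the name is blank.

    Edge cases:
        A name made up only of whitespace is treated as missing, so the
        caller never receives an empty label.
    """
    name = (profile or {}).get("name") or ""
    return name.strip() or default


print(resolve_display_name({"name": "  Ada "}))
```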

README and Documentation

This is where voice input pays off most dramatically. README files are essentially essays — structured prose explaining how something works. Dictating a first draft and editing it is usually faster than typing the whole thing, especially for sections where you’re working through the explanation as you go.

Position your cursor at the start of a new section, press your hotkey, speak the section, and move on. Then do an editing pass.

PR Descriptions

PR descriptions written by voice tend to be better, not just faster. When you explain a change verbally, you naturally include context that typed descriptions often skip: why you chose this approach, what alternatives you considered, what reviewers should focus on. Speak it as if you’re explaining the PR to a teammate in person.

Commit Messages

A commit message is a sentence or two of prose. Three seconds of speaking, formatted by an AI editor into a clean imperative-mood message, is faster than typing and tends to produce more useful commit history.
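The layout half of that formatting can be sketched with plain rules. The example below assumes common Git conventions (a short capitalized subject without a trailing period, a blank line, then a body wrapped at 72 columns). It does not attempt the imperative-mood rewrite, which is the part that needs an AI editor or a human. The function name format_commit_message is hypothetical.

```python
import textwrap


def format_commit_message(subject: str, body: str = "", subject_limit: int = 72) -> str:
    """Tidy a dictated subject and optional body into commit-message layout."""
    # Collapse stray whitespace and drop the trailing period dictation tends to add.
    subject = " ".join(subject.split()).rstrip(".")
    subject = subject[:1].upper() + subject[1:]
    if len(subject) > subject_limit:
        raise ValueError(
            f"subject is {len(subject)} characters; limit is {subject_limit}"
        )
    if not body.strip():
        return subject
    wrapped_body = textwrap.fill(" ".join(body.split()), width=72)
    return subject + "\n\n" + wrapped_body


print(format_commit_message(
    "add profile fallback.",
    "Users without a profile saw a blank header.",
))
```

Raising an error on an overlong subject, rather than truncating it, is a deliberate choice here: a cut-off subject line is worse than being asked to say it again more briefly.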

Tips for Better Results with Technical Content

Slow down for proper nouns. Whisper handles “React” and “TypeScript” well, but less-common terms benefit from a deliberate pace.

Speak in complete sentences. Whisper is better at transcribing complete grammatical sentences than fragments. “This function handles the authentication flow” works better than “auth flow handler.”

Don’t try to fix while speaking. If you said something imperfectly, keep going and fix it in editing. Stopping and restarting mid-sentence trips up transcription.

Use editor modes for output shaping. Raw transcription of a spoken explanation will sound informal. A post-processing step that reformats for “code comment” style saves editing time and produces more consistent output.

Test with your actual vocabulary first. Run a few short tests dictating realistic content from your project. See what gets mangled, and decide whether it’s frequent enough to matter. In many projects, the bulk of the technical vocabulary comes through accurately enough to be usable.
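A vocabulary test like this is easy to script. The sketch below assumes you have pasted a transcript into a string and written down the terms you expected to see. The function name find_mangled_terms and the sample transcript are made up for illustration.

```python
import re


def find_mangled_terms(transcript: str, expected_terms: list) -> list:
    """Return the expected terms that do not appear verbatim in the transcript."""
    missing = []
    for term in expected_terms:
        # Case-sensitive on purpose: "use state" or "usestate" counts as mangled.
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        if not re.search(pattern, transcript):
            missing.append(term)
    return missing


sample = "The use state hook stores the flag, and Kubernetes restarts the pod."
print(find_mangled_terms(sample, ["useState", "Kubernetes"]))
```

Dividing the number of surviving terms by the number expected gives a rough accuracy figure for your own project, which is a better basis for choosing a tool than any general claim.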

Getting Started

If you’ve never tried voice input for code documentation, the lowest-friction entry point is macOS’s built-in dictation (System Settings → Keyboard → Dictation). Try it for a few PR descriptions or README sections. If you find the accuracy frustrating or want AI post-processing to clean up the output, a Whisper-based app is the next step.

The goal isn’t to replace typing. It’s to stop pushing your most prose-heavy work (composing explanations) through the keyboard, which is the slower channel for that kind of thinking. Speaking what you’re thinking, then editing the result, is almost always faster than typing a draft from scratch.

For most developers, documentation is the backlog that never gets done because it takes too long. Dictation makes it fast enough that it actually happens.