How to Dictate on Mac Without Saying 'Comma' and 'Period'

· 7 min read · mac · dictation · speech-to-text

If you’ve ever used Apple’s built-in dictation on your Mac, you know the drill. You speak a sentence, then awkwardly insert “comma” or “period” or “question mark” out loud, breaking your train of thought every few seconds. It sounds something like this:

“Hey Sarah comma I wanted to follow up on the proposal period Can we schedule a call this week question mark”

It works, technically. But it doesn’t feel like talking — it feels like performing. You’re simultaneously composing a thought and manually formatting it, which defeats the purpose of dictation. Dictation is supposed to let you capture thoughts faster than you can type them, not make you juggle two cognitive tasks at once.

The good news: this is a solved problem. Modern dictation apps handle punctuation automatically using AI, so you can speak naturally and get properly punctuated text without saying a single formatting command.

Why Apple Dictation Requires Spoken Punctuation

Apple Dictation on macOS does have an “auto-punctuation” feature, introduced with macOS Ventura. In theory, it adds commas and periods automatically based on your speech patterns.

In practice, it’s unreliable. If you’ve spent any time in Apple’s support forums, you’ll find years of complaints: auto-punctuation inserts commas in random places, misses periods entirely, adds question marks based on vocal inflection rather than actual questions, and sometimes simply stops working. Some users have found that turning auto-punctuation off (on a Mac, under System Settings → Keyboard, in the Dictation section) actually produces better results — because at least then you’re in control, even if that control means saying “comma” out loud.

The core issue is that Apple Dictation is doing basic speech-to-text. It converts audio waveforms to words. Punctuation is a separate, bolt-on step that tries to guess where pauses and inflections indicate sentence boundaries. It’s pattern-matching on audio cues, not understanding what you’re saying.

How AI Post-Processing Handles It Differently

Third-party dictation apps take a fundamentally different approach. Instead of trying to infer punctuation from how you speak, they transcribe your words first and then pass the raw text through an AI language model that understands grammar, context, and sentence structure.

The difference is like proofreading versus guessing. A language model can read “hey sarah i wanted to follow up on the proposal can we schedule a call this week” and understand — from the meaning, not the audio — that there should be a comma after “Hey Sarah,” a period after “proposal,” and a question mark at the end. It knows this because it understands English grammar, not because you paused at the right moments.

This is why the output from AI-powered dictation apps reads like something you’d actually type:

What you say: “hey sarah i wanted to follow up on the proposal can we schedule a call this week”

What Apple Dictation produces: “Hey Sarah I wanted to follow up on the proposal can we schedule a call this week” (maybe with some random commas thrown in)

What an AI-powered app produces: “Hey Sarah, I wanted to follow up on the proposal. Can we schedule a call this week?”

No spoken punctuation required. You just talk, and the AI figures out the structure.
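The transcribe-then-clean-up flow above can be sketched in a few lines of Python. This is an illustrative sketch, not any particular app’s implementation: the prompt wording and the model name are assumptions, and it presumes the `openai` package with an API key available in the environment.

```python
# Sketch of the two-stage pipeline: transcription happens elsewhere;
# this second pass sends the raw TEXT (not audio) to a language model
# with an instruction to punctuate without rewording.

CLEANUP_PROMPT = (
    "Add punctuation, capitalization, and sentence breaks to this raw "
    "dictation transcript. Do not change any words:\n\n{transcript}"
)


def build_cleanup_prompt(transcript: str) -> str:
    """Wrap the raw transcript in the cleanup instruction."""
    return CLEANUP_PROMPT.format(transcript=transcript)


def punctuate(transcript: str) -> str:
    """Run the second pass through a language model (requires an API key).

    The model name here is a placeholder assumption.
    """
    from openai import OpenAI

    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": build_cleanup_prompt(transcript)}],
    )
    return response.choices[0].message.content
```

Fed the raw transcript from the example above, a setup like this would return the punctuated version — the model infers structure from grammar and meaning, not from pauses in the audio.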

Which Apps Do This Well

Most modern Mac dictation apps include some form of automatic punctuation. Here’s how the main options handle it:

Cloud-based transcription engines

OpenAI’s Whisper, Deepgram’s Nova, and Groq’s hosted models all produce reasonably good automatic punctuation at the transcription stage, because they’re trained on enough punctuated transcripts to learn basic sentence boundaries. For everyday dictation — emails, messages, notes — the punctuation is usually correct without any post-processing at all.

Where these engines still struggle is with ambiguous cases. A long run-on thought might get one period where it should have two. A rhetorical question might not get a question mark. Lists embedded in sentences can get messy. But for most practical use, cloud transcription engines handle punctuation well enough that you’ll rarely need to fix anything.

AI post-processing (editor modes)

For truly polished output, the best approach is to run the transcribed text through a language model after transcription. This is what apps with “editor modes” or “AI cleanup” do — they take the raw transcript (which may have decent punctuation already) and rewrite it with proper grammar, punctuation, paragraph breaks, and formatting.

This second pass catches things that transcription-level punctuation misses: it can split a rambling thought into two clean sentences, add em dashes or semicolons where appropriate, and format lists properly. It also handles the filler words and false starts that automated punctuation can’t help with.

Apps that offer this include SuperWhisper (via its modes system), Wispr Flow (automatic context-aware formatting), Aqua Voice (context-aware cleanup), VoiceInk (AI enhancement modes), and LittleWhisper (customizable editor modes with multiple AI providers).

On-device transcription

Local Whisper models (used by SuperWhisper, VoiceInk, LittleWhisper, and others in their on-device mode) also produce automatic punctuation, though it’s slightly less polished than the cloud versions. The smaller models especially — tiny and base — can miss sentence boundaries or produce sparse punctuation. The “small” model does a noticeably better job.

If you’re using on-device transcription for privacy reasons and still want reliable punctuation, the best setup is to pair local transcription with AI post-processing via your own API key. The transcription stays on your Mac (audio never leaves the device), but the resulting text gets sent to a language model for cleanup. Since only text is sent — not audio — this is a reasonable privacy trade-off for most users.

Tips for Getting Better Punctuation from Any App

Regardless of which app you use, a few habits will help the AI produce better-punctuated output:

Pause briefly between sentences. You don’t need to say “period,” but a natural half-second pause gives the transcription engine a strong signal that one sentence ended and another is beginning. If you run sentences together without any pause, even good engines can struggle with boundaries.

Speak in complete thoughts. AI punctuation works best when you give it grammatically complete sentences. “Let’s push the meeting to Thursday, I think the deck needs more work” will punctuate better than a stream-of-consciousness ramble that changes direction mid-sentence. This doesn’t mean you need to speak formally — just aim for one idea per sentence.

Use a clear voice and decent microphone. Punctuation inference partly depends on transcription accuracy. If the engine mishears a word, the grammar around that word gets harder to parse, and punctuation suffers. A good microphone (even a $30 USB mic) makes a noticeable difference compared to a built-in laptop mic.

Don’t fight it for edge cases. Sometimes you genuinely need a semicolon, an ellipsis, or a specific formatting choice that no AI will guess. For these cases, it’s faster to dictate naturally, let the AI handle 95% of the punctuation, and then fix the remaining 5% by hand. That’s still vastly faster than saying “comma” and “period” throughout.

What About Other Formatting Commands?

Apple Dictation supports a long list of spoken commands beyond punctuation: “new paragraph,” “new line,” “all caps on/off,” “bold on/off,” and so on. If you’ve memorized these commands and use them fluently, they can be productive.

But most people don’t memorize them, and even those who do find that spoken formatting commands break their flow. AI post-processing handles most of this automatically — it adds paragraph breaks at logical points, capitalizes appropriately, and formats text for the destination (Slack, email, notes, etc.).

Custom editor modes take this even further. Instead of saying “new paragraph” and “bold that,” you can just talk and have the AI format the output according to a profile you’ve defined. A “meeting notes” mode might automatically produce bullet points. A “professional email” mode might add a greeting and sign-off. A “code comment” mode might strip all conversational language and produce terse technical prose.
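Conceptually, an editor mode is little more than a named prompt template applied to the raw transcript before the language-model pass. The mode names and instructions below are made-up examples to show the shape of the idea, not any app’s actual configuration:

```python
# Hypothetical editor-mode profiles: each mode is a different instruction
# prepended to the raw transcript before it goes to the language model.
MODES = {
    "meeting_notes": "Rewrite this transcript as concise bullet points.",
    "professional_email": (
        "Rewrite this transcript as a polite email with a greeting and sign-off."
    ),
    "code_comment": (
        "Strip conversational filler and rewrite as terse technical prose."
    ),
}


def apply_mode(mode: str, transcript: str) -> str:
    """Combine the selected mode's instruction with the raw transcript."""
    instruction = MODES[mode]
    return f"{instruction}\n\nTranscript:\n{transcript}"
```

Switching modes changes only the instruction, so the same spoken input can come out as bullets, an email, or a terse comment without any spoken formatting commands.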

The goal is the same: you think about what to say, and the tool handles how it’s formatted.

The Bigger Point

Having to say “comma” and “period” while dictating isn’t just annoying — it’s a design that fights against the fundamental advantage of dictation. You speak because it’s faster than typing. But if you have to constantly interrupt yourself with formatting commands, you’ve traded one form of friction (moving your fingers) for another (mentally tracking punctuation while composing thoughts).

AI post-processing solves this by separating the two tasks entirely. You handle the content. The AI handles the formatting. They happen sequentially, not simultaneously, which means neither one degrades the other.

If you’ve tried Apple Dictation, been annoyed by the “comma period” problem, and concluded that dictation isn’t for you, it’s worth giving a modern app a second look. The experience is significantly different when punctuation just works.

LittleWhisper handles punctuation automatically through AI editor modes — you speak naturally and get properly formatted text in any app. It’s free to try on macOS, no spoken punctuation required.