feat(voice): clean non-pronounceable chars from TTS replies #22

Merged: mattmezza merged 2 commits into main from fix/voice-clean-speech on Jun 25, 2026

@mattmezza

Owner

Closes #10.

Voice replies are now sanitized before TTS so they read cleanly aloud.

What

Added clean_for_speech() in voice/pipeline.py, applied at the synthesize() chokepoint (every voice-reply path routes through it). It strips:

  • Emojis / pictographic symbols (unicode category So)
  • URLs (http(s)://…, www.…)
  • Code: fenced ``` blocks and inline `…`
  • Leading list bullets (-, *, •, …)
  • Markdown symbols: * # _ ~ > | and the backtick
  • Separator dashes → comma pause (hyphens inside words kept, e.g. e-mail)
  • Blank/empty lines left behind
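
For readers without the diff open, the rules above can be sketched roughly as follows. This is a minimal illustration and not the code merged in voice/pipeline.py: the regular expressions, the helper constant names, and the order of the steps are all assumptions. Only the function name clean_for_speech() and the list of rules come from this PR.

```python
import re
import unicodedata

# All patterns below are illustrative assumptions, not the merged implementation.
_FENCED_CODE = re.compile(r"```.*?```", re.DOTALL)             # fenced code blocks
_INLINE_CODE = re.compile(r"`[^`\n]*`")                        # inline code spans
_URL = re.compile(r"(?:https?://|\bwww\.)\S+")                 # http(s):// and www. links
_LEADING_BULLET = re.compile(r"^[ \t]*[-*\u2022][ \t]+", re.MULTILINE)
_SEPARATOR_DASH = re.compile(r"[ \t]+[-\u2013\u2014]+[ \t]+")  # dash with spaces on both sides
_MARKDOWN_SYMBOLS = re.compile(r"[*#_~>|`]")


def clean_for_speech(text: str) -> str:
    """Return a version of `text` with the non-speakable parts removed."""
    text = _FENCED_CODE.sub(" ", text)
    text = _INLINE_CODE.sub(" ", text)
    text = _URL.sub(" ", text)
    # Drop emojis / pictographs: Unicode general category "So" (Symbol, other).
    text = "".join(ch for ch in text if unicodedata.category(ch) != "So")
    text = _LEADING_BULLET.sub("", text)
    # A dash surrounded by whitespace becomes a comma pause; "e-mail" has no
    # surrounding whitespace, so its hyphen is left alone.
    text = _SEPARATOR_DASH.sub(", ", text)
    text = _MARKDOWN_SYMBOLS.sub("", text)
    # Collapse runs of spaces and drop the blank lines left behind.
    lines = (re.sub(r"[ \t]+", " ", line).strip() for line in text.splitlines())
    return "\n".join(line for line in lines if line)


if __name__ == "__main__":
    print(clean_for_speech("**Build passed** 🎉 - see `make test` output or e-mail me"))
    # -> Build passed, see output or e-mail me
```

In this sketch the emoji pass runs before the dash rule so that a symbol sitting next to a separator does not leave a stray space before the comma. Code and URLs are removed before the Markdown-symbol pass so that backticks and underscores inside them are not handled twice.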

Test

tests/test_voice_clean.py — 7 asserts covering each rule. make test green, ruff clean.
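
The chokepoint claim (every voice-reply path is cleaned because cleaning happens inside synthesize()) can also be checked with a fake TTS backend. The sketch below is hypothetical: make_synthesize and FakeBackend are illustrative names, and the real synthesize() signature is not shown in this PR.

```python
from typing import Callable, List


def make_synthesize(
    clean: Callable[[str], str],
    backend: Callable[[str], bytes],
) -> Callable[[str], bytes]:
    """Build a synthesize() that always cleans text before calling the backend."""

    def synthesize(text: str) -> bytes:
        spoken = clean(text)    # single place where cleaning is applied
        return backend(spoken)  # the backend only ever sees cleaned text

    return synthesize


class FakeBackend:
    """Stand-in for a TTS engine: records its input and returns it as bytes."""

    def __init__(self) -> None:
        self.seen: List[str] = []

    def __call__(self, text: str) -> bytes:
        self.seen.append(text)
        return text.encode("utf-8")


if __name__ == "__main__":
    backend = FakeBackend()
    # A trivial cleaner stands in for clean_for_speech() in this sketch.
    synthesize = make_synthesize(lambda t: t.replace("*", ""), backend)
    synthesize("**hello**")
    print(backend.seen)
    # -> ['hello']
```

Asserting on what the backend received, rather than on the cleaner's return value, is what catches a caller that bypasses the cleaning step.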

🤖 Generated with Claude Code

mattmezza and others added 2 commits June 25, 2026 13:52
Voice responses now strip emojis, URLs, code snippets, bullets, dashes,
and markdown symbols (*, #, _, ~, >, |) before synthesis, leaving plain
speakable text. Cleaning lives at the synthesize() chokepoint so every
caller benefits.

Closes #10

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…lies

The post-hoc cleaner is a safety net; intent belongs at the source. When
the model chooses [respond_with_voice], the skill now tells it to write the
whole message to be spoken — no emojis, symbols, URLs, code, or bullets —
and to fall back to text when content only works on screen.

Relates to #10

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@mattmezza mattmezza merged commit 87900ec into main Jun 25, 2026
1 check passed
@mattmezza mattmezza deleted the fix/voice-clean-speech branch June 25, 2026 12:08

Development

Successfully merging this pull request may close these issues.

Clean up voice responses to remove non-pronounceable characters