
claude-stt

276 · Community Registry · Development · by Jarrod Watts · MIT

Speech-to-text input for Claude Code with live streaming dictation. Hold a hotkey, speak, and your words appear in the input field.


Overview

Native voice/STT support has been officially added to Claude Code via the /voice command — use that instead. This repo is no longer maintained.


Claude STT

Speech-to-text input for Claude Code. Press a hotkey (or hold it in push-to-talk mode), speak, and your words appear in the input field, all processed locally.


[Demo image: Claude STT in action]

Install

Inside a Claude Code instance, run the following commands:

Step 1: Add the marketplace

code
/plugin marketplace add jarrodwatts/claude-stt

Step 2: Install the plugin

code
/plugin install claude-stt

Step 3: Run setup

code
/claude-stt:setup

Done! Press Ctrl+Shift+Space to start recording, press again to stop and transcribe.

Note: Setup installs dependencies (uv if available, otherwise a local .venv), downloads the Moonshine model (~200MB), and checks microphone permissions.


What is Claude STT?

Claude STT gives you voice input directly into Claude Code. No typing required — just speak naturally.

| What You Get | Why It Matters |
|---|---|
| Local processing | All audio processed on-device using Moonshine STT |
| Low latency | ~400ms transcription time |
| Push-to-talk | Hold hotkey to record, release to transcribe |
| Cross-platform | macOS, Linux, Windows |
| Privacy first | No audio or text sent to external services |

How It Works

code
Press Ctrl+Shift+Space → start recording
        ↓
Audio captured from microphone
        ↓
Press Ctrl+Shift+Space → stop recording
        ↓
Moonshine STT processes locally (~400ms)
        ↓
Text inserted into Claude Code input

Key details:

  • Audio is processed in memory and immediately discarded
  • Uses Moonshine ONNX for fast local inference
  • Keyboard injection or clipboard fallback
  • Native system sounds for audio feedback
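
The toggle flow above can be modelled as a small state machine. The following is a minimal sketch, not the plugin's actual code: `ToggleRecorder`, `transcribe`, and `insert_text` are hypothetical names, and the transcriber is a stub standing in for Moonshine.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ToggleRecorder:
    """Hypothetical model of the toggle flow; not the plugin's real class."""

    transcribe: Callable[[bytes], str]   # stand-in for the local STT engine
    insert_text: Callable[[str], None]   # stand-in for injection or clipboard output
    recording: bool = False
    _chunks: List[bytes] = field(default_factory=list)

    def on_audio(self, chunk: bytes) -> None:
        # Audio is buffered only while a recording is active.
        if self.recording:
            self._chunks.append(chunk)

    def on_hotkey(self) -> None:
        if not self.recording:
            # First press: start with a fresh in-memory buffer.
            self._chunks.clear()
            self.recording = True
            return
        # Second press: stop, transcribe, and drop the audio right away.
        self.recording = False
        audio = b"".join(self._chunks)
        self._chunks.clear()
        if audio:
            self.insert_text(self.transcribe(audio))


if __name__ == "__main__":
    typed: List[str] = []
    recorder = ToggleRecorder(
        transcribe=lambda audio: f"<{len(audio)} bytes transcribed>",
        insert_text=typed.append,
    )
    recorder.on_hotkey()              # start recording
    recorder.on_audio(b"\x00" * 4)
    recorder.on_hotkey()              # stop and transcribe
    print(typed)
```

The buffer lives only in memory and is cleared before the text is handed off, which mirrors the "processed in memory and immediately discarded" behaviour described above.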

Configuration

Customize your settings anytime:

code
/claude-stt:config

Options

| Option | Values | Default | Description |
|---|---|---|---|
| hotkey | Key combo | ctrl+shift+space | Trigger recording |
| mode | toggle, push-to-talk | toggle | Press to toggle vs hold to record |
| engine | moonshine, whisper | moonshine | STT engine |
| moonshine_model | moonshine/tiny, moonshine/base, other Moonshine model IDs | moonshine/base | Model size |
| output_mode | auto, injection, clipboard | auto | How text is inserted |
| sound_effects | true, false | true | Play audio feedback |
| max_recording_seconds | 1-600 | 300 | Maximum recording duration |

Settings stored in ~/.claude/plugins/claude-stt/config.toml.
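
To make the allowed values concrete, here is a rough Python sketch that mirrors the options table and checks a set of values against it. `SttConfig` and `problems` are hypothetical names used for illustration; the plugin's own config loader and the exact layout of config.toml are not shown in this document and may differ.

```python
from dataclasses import dataclass

# Allowed values copied from the options table above.
ALLOWED = {
    "mode": {"toggle", "push-to-talk"},
    "engine": {"moonshine", "whisper"},
    "output_mode": {"auto", "injection", "clipboard"},
}


@dataclass(frozen=True)
class SttConfig:
    """Hypothetical container; defaults are taken from the options table."""

    hotkey: str = "ctrl+shift+space"
    mode: str = "toggle"
    engine: str = "moonshine"
    moonshine_model: str = "moonshine/base"
    output_mode: str = "auto"
    sound_effects: bool = True
    max_recording_seconds: int = 300

    def problems(self) -> list[str]:
        """Return human-readable problems; an empty list means valid."""
        errors = []
        for name, allowed in ALLOWED.items():
            value = getattr(self, name)
            if value not in allowed:
                errors.append(f"{name}={value!r} is not one of {sorted(allowed)}")
        if not 1 <= self.max_recording_seconds <= 600:
            errors.append("max_recording_seconds must be between 1 and 600")
        if not self.hotkey.strip():
            errors.append("hotkey must not be empty")
        return errors


if __name__ == "__main__":
    print(SttConfig().problems())
    print(SttConfig(mode="hold", max_recording_seconds=900).problems())
```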


Requirements

  • Python 3.10-3.13
  • ~200MB disk space for STT model
  • Microphone access

Platform-Specific

| Platform | Additional Requirements |
|---|---|
| macOS | Accessibility permissions (System Settings > Privacy & Security) |
| Linux | xdotool for window management; X11 recommended (Wayland has limitations); WSL not supported |
| Windows | pywin32 for window tracking |
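
Some of these requirements can be checked by hand before running setup. The sketch below is an illustrative preflight check, not what /claude-stt:setup actually runs: `preflight` and `is_wsl` are hypothetical helpers, and WSL detection relies on the kernel release string containing "microsoft", which is an assumption about typical WSL kernels.

```python
import platform
import shutil
import sys


def is_wsl(release: str) -> bool:
    # Assumption: WSL kernels report "microsoft" in their release string.
    return "microsoft" in release.lower()


def preflight(system: str, release: str, version: tuple[int, int],
              has_xdotool: bool) -> list[str]:
    """Return a list of problems based on the requirements listed above."""
    problems = []
    if not (3, 10) <= version <= (3, 13):
        problems.append(f"Python {version[0]}.{version[1]} is outside 3.10-3.13")
    if system == "Linux":
        if is_wsl(release):
            problems.append("WSL is not supported; use native Windows or Linux")
        elif not has_xdotool:
            problems.append("xdotool not found on PATH")
    return problems


if __name__ == "__main__":
    found = preflight(
        system=platform.system(),
        release=platform.uname().release,
        version=sys.version_info[:2],
        has_xdotool=shutil.which("xdotool") is not None,
    )
    print("\n".join(found) or "No problems found")
```

The macOS Accessibility permission and the Windows pywin32 dependency are left out because checking them needs platform-specific calls beyond a short sketch.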

Commands

| Command | Description |
|---|---|
| /claude-stt:setup | First-time setup: check environment, install deps, download model |
| /claude-stt:start | Start the STT daemon |
| /claude-stt:stop | Stop the STT daemon |
| /claude-stt:status | Show daemon status and readiness checks |
| /claude-stt:config | Change settings |

You can also use the CLI directly:

code
claude-stt setup
claude-stt start --background

Troubleshooting

| Issue | Solution |
|---|---|
| No audio input | Check microphone permissions in system settings |
| Keyboard injection not working | macOS: Grant Accessibility permissions. Linux: Ensure xdotool is installed |
| Model not loading | Run /claude-stt:setup to download. Check disk space (~200MB) |
| Hotkey test fails during setup | Fix permissions or rerun /claude-stt:setup --skip-hotkey-test to continue setup |
| Whisper dependencies missing | Run /claude-stt:setup --with-whisper, or uv sync --directory $CLAUDE_PLUGIN_ROOT --extra whisper, or python $CLAUDE_PLUGIN_ROOT/scripts/exec.py -m pip install .[whisper] |
| Hotkey not triggering | Check for conflicts with other apps. Try /claude-stt:config to change hotkey |
| Text going to wrong window | Plugin tracks the original window; ensure Claude Code was focused when recording started |
| Running under WSL | Not supported; use native Windows or Linux |

Logging

Set CLAUDE_STT_LOG_LEVEL=DEBUG to get verbose logs when starting the daemon.
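
For readers unfamiliar with this pattern, the sketch below shows one way a daemon could map such an environment variable onto Python's standard logging levels. It is an assumption about the general approach, not the plugin's actual implementation; `resolve_log_level` is a hypothetical helper, and the INFO fallback is a guess.

```python
import logging
import os

# Standard level names from Python's logging module.
LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
    "CRITICAL": logging.CRITICAL,
}


def resolve_log_level(env: dict[str, str], default: int = logging.INFO) -> int:
    """Read CLAUDE_STT_LOG_LEVEL from env, falling back to default."""
    name = env.get("CLAUDE_STT_LOG_LEVEL", "").strip().upper()
    return LEVELS.get(name, default)


if __name__ == "__main__":
    logging.basicConfig(level=resolve_log_level(dict(os.environ)))
    logging.getLogger("claude_stt_example").debug("visible only at DEBUG")
```

Unknown or empty values fall back to the default in this sketch, so a typo in the variable does not stop the process from starting.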


Privacy

All processing is local:

  • Audio captured from your microphone is processed entirely on-device
  • Moonshine runs locally — no cloud API calls
  • Audio is never sent anywhere, never stored (processed in memory, discarded)
  • Transcribed text only goes to Claude Code input or clipboard

No telemetry or analytics.


Development

bash
git clone https://github.com/jarrodwatts/claude-stt
cd claude-stt

# Install dependencies (uv preferred, falls back to local venv)
python scripts/setup.py --dev --skip-audio-test --skip-model-download --no-start

# Test locally without installing
claude --plugin-dir /path/to/claude-stt

See CONTRIBUTING.md for guidelines.

Release Checklist

  • Bump versions in pyproject.toml, .claude-plugin/plugin.json, .claude-plugin/marketplace.json
  • Update CHANGELOG.md
  • Run tests: uv run python -m unittest discover -s tests
  • Verify onboarding in Claude Code (/plugin install, /claude-stt:setup)

License

MIT — see LICENSE



Install & Usage

Step 1: Create the skills directory

code
mkdir -p .claude/skills

Step 2: Download the skill file

code
mkdir -p .claude/skills && curl -o .claude/skills/claude-stt.md https://raw.githubusercontent.com/jarrodwatts/claude-stt/main/SKILL.md

Step 3: Invoke in Claude Code

code
/claude-stt

Tags: speech-to-text, stt, voice, dictation, moonshine, audio, accessibility

Frequently Asked Questions

What is claude-stt?

Speech-to-text input for Claude Code with live streaming dictation. Hold a hotkey, speak, and your words appear in the input field.

How to install claude-stt?

To install claude-stt, create the .claude/skills directory in your project, then run the curl command to download the skill file. Once installed, invoke it in Claude Code with /claude-stt.

What is claude-stt best for?

claude-stt is a community plugin categorized under Development. It is designed for: speech-to-text, stt, voice, dictation, moonshine, audio, accessibility. Created by Jarrod Watts.