The MCP servers repository provides reference server implementations. A FunASR speech-to-text MCP server would enable any MCP-compatible AI client (Claude, Cursor, etc.) to transcribe audio without external API dependencies.
FunASR (17.8K+ stars, https://github.com/modelscope/FunASR) provides:
- SenseVoice: Ultra-fast multilingual ASR (50x faster than Whisper-large, 50+ languages including strong CJK)
- Paraformer: Production-grade ASR with timestamps and punctuation
- OpenAI-compatible API: POST /v1/audio/transcriptions
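
To illustrate the OpenAI-compatible endpoint, here is a minimal sketch of how a client might build the `POST /v1/audio/transcriptions` request using only the Python standard library. The `file` and `model` form fields follow the OpenAI transcription API convention. The base URL, the `sensevoice` model identifier, and the helper name `build_transcription_request` are assumptions for illustration and should be checked against the FunASR server actually deployed.

```python
import io
import urllib.request
import uuid

# Assumption: a FunASR server is listening locally; host and port are hypothetical.
BASE_URL = "http://localhost:8000"


def build_transcription_request(audio_bytes, filename, model="sensevoice", base_url=BASE_URL):
    """Build (but do not send) a multipart POST to /v1/audio/transcriptions."""
    boundary = uuid.uuid4().hex
    body = io.BytesIO()

    # "model" form field; the default value is a hypothetical model identifier.
    body.write(f"--{boundary}\r\n".encode())
    body.write(b'Content-Disposition: form-data; name="model"\r\n\r\n')
    body.write(model.encode() + b"\r\n")

    # "file" form field carrying the raw audio bytes.
    body.write(f"--{boundary}\r\n".encode())
    body.write(
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode()
    )
    body.write(b"Content-Type: application/octet-stream\r\n\r\n")
    body.write(audio_bytes + b"\r\n")

    # Closing boundary marks the end of the multipart body.
    body.write(f"--{boundary}--\r\n".encode())

    return urllib.request.Request(
        url=f"{base_url}/v1/audio/transcriptions",
        data=body.getvalue(),
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request. The shape of the response depends on the server configuration, so it is left out here.
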
A FunASR MCP server would expose:
- A tool that transcribes audio files to text
- A tool for real-time streaming transcription
- A tool that identifies the spoken language in audio
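
As a rough sketch of what the three tools listed above could look like to an MCP client, the descriptors below use the `name`, `description`, and `inputSchema` fields that MCP tool listings carry. The tool names (`transcribe_audio`, `transcribe_stream`, `detect_language`), their parameters, and the helper functions are hypothetical placeholders, not a proposed final interface.

```python
import json

# Hypothetical tool names and parameters; the proposal does not fix them.
TOOLS = [
    {
        "name": "transcribe_audio",
        "description": "Transcribe an audio file to text.",
        "inputSchema": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "Local path to the audio file."},
                "language": {"type": "string", "description": "Optional language hint."},
            },
            "required": ["path"],
        },
    },
    {
        "name": "transcribe_stream",
        "description": "Transcribe an audio stream in real time.",
        "inputSchema": {
            "type": "object",
            "properties": {
                "source": {"type": "string", "description": "Identifier of the audio stream."},
            },
            "required": ["source"],
        },
    },
    {
        "name": "detect_language",
        "description": "Identify the spoken language in an audio file.",
        "inputSchema": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "Local path to the audio file."},
            },
            "required": ["path"],
        },
    },
]


def list_tools_result():
    """Return the tool listing in the shape a client would receive."""
    return {"tools": TOOLS}


def find_tool(name):
    """Look up a descriptor by name, raising KeyError for unknown tools."""
    for tool in TOOLS:
        if tool["name"] == name:
            return tool
    raise KeyError(f"unknown tool: {name}")


def missing_arguments(name, arguments):
    """List required arguments that the caller did not supply."""
    required = find_tool(name)["inputSchema"].get("required", [])
    return [key for key in required if key not in arguments]


if __name__ == "__main__":
    print(json.dumps(list_tools_result(), indent=2))
```

A real server would most likely register these through an MCP SDK and forward each call to the local FunASR instance. The plain dictionaries here only show the surface a client would see.
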
Since FunASR runs fully self-hosted, this MCP server would give MCP clients local STT capabilities without needing external Whisper API calls. It is particularly valuable for Chinese and other Asian languages, where Whisper's performance tends to be weaker.
Would a FunASR MCP server contribution be welcome?