Feature Proposal: Add FunASR Speech-to-Text MCP Server #4299

@LauraGPT

Description

The MCP servers repository provides reference server implementations. A FunASR speech-to-text MCP server would enable any MCP-compatible AI client (Claude, Cursor, etc.) to transcribe audio without external API dependencies.

FunASR (17.8K+ stars, https://github.com/modelscope/FunASR) provides:

  • SenseVoice: Ultra-fast multilingual ASR (50x faster than Whisper-large, 50+ languages including strong CJK)
  • Paraformer: Production-grade ASR with timestamps and punctuation
  • OpenAI-compatible API: POST /v1/audio/transcriptions
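As a rough illustration of what calling that endpoint could look like, here is a minimal standard-library sketch that builds a multipart upload for `POST /v1/audio/transcriptions`. Only the path comes from the list above. The base URL, the `sensevoice` model identifier, and the `file` / `model` form field names (borrowed from OpenAI's transcription API) are assumptions and may not match what a given FunASR deployment expects.

```python
import io
import urllib.request
import uuid

# Assumption: a self-hosted FunASR server is listening here (placeholder host/port).
BASE_URL = "http://localhost:8000"


def build_transcription_request(audio_bytes, filename, model="sensevoice", base_url=BASE_URL):
    """Build (but do not send) a multipart POST to /v1/audio/transcriptions.

    The "model" and "file" field names follow OpenAI's transcription API;
    the default model identifier is a hypothetical placeholder.
    """
    boundary = uuid.uuid4().hex
    body = io.BytesIO()

    # Plain text form field: model
    body.write(f"--{boundary}\r\n".encode())
    body.write(b'Content-Disposition: form-data; name="model"\r\n\r\n')
    body.write(model.encode() + b"\r\n")

    # Binary form field: file
    body.write(f"--{boundary}\r\n".encode())
    body.write(
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode()
    )
    body.write(b"Content-Type: application/octet-stream\r\n\r\n")
    body.write(audio_bytes + b"\r\n")

    # Closing boundary
    body.write(f"--{boundary}--\r\n".encode())

    return urllib.request.Request(
        url=f"{base_url}/v1/audio/transcriptions",
        data=body.getvalue(),
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
```

Passing the returned request to `urllib.request.urlopen` would send it to a running server. The shape of the response is not specified in this proposal, so parsing it is left out.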

A FunASR MCP server would expose:

  1. tool — transcribe audio files to text
  2. tool — real-time streaming transcription
  3. tool — identify spoken language in audio
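To make the shape of those three tools concrete, here is a minimal sketch of how they might be described and dispatched over MCP's JSON-RPC messages (`tools/list` and `tools/call`). The tool names, argument names, and the `backends` mapping are hypothetical placeholders, not the names this proposal commits to. A real contribution would more likely build on an official MCP SDK than on hand-rolled dispatch.

```python
def _tool(name, description, arg):
    """Build an MCP tool descriptor with a single required string argument."""
    return {
        "name": name,
        "description": description,
        "inputSchema": {
            "type": "object",
            "properties": {arg: {"type": "string"}},
            "required": [arg],
        },
    }


# Hypothetical names and arguments, chosen only for illustration.
TOOLS = [
    _tool("transcribe_file", "Transcribe an audio file to text.", "path"),
    _tool("transcribe_stream", "Real-time streaming transcription.", "source"),
    _tool("detect_language", "Identify the spoken language in audio.", "path"),
]


def handle_request(message, backends):
    """Answer a JSON-RPC 2.0 'tools/list' or 'tools/call' request.

    `backends` maps a tool name to a callable returning text; in a real
    server each callable would wrap a FunASR model.
    """
    msg_id = message.get("id")
    method = message.get("method")

    if method == "tools/list":
        result = {"tools": TOOLS}
    elif method == "tools/call":
        params = message.get("params", {})
        name = params.get("name")
        backend = backends.get(name)
        if backend is None:
            # Unknown tools are reported as a JSON-RPC invalid-params error.
            return {
                "jsonrpc": "2.0",
                "id": msg_id,
                "error": {"code": -32602, "message": f"Unknown tool: {name}"},
            }
        text = backend(**params.get("arguments", {}))
        result = {"content": [{"type": "text", "text": text}], "isError": False}
    else:
        return {
            "jsonrpc": "2.0",
            "id": msg_id,
            "error": {"code": -32601, "message": f"Method not found: {method}"},
        }

    return {"jsonrpc": "2.0", "id": msg_id, "result": result}
```

With a stub backend such as `{"transcribe_file": lambda path: "..."}`, a `tools/call` request for `transcribe_file` returns a single text content item, which is how an MCP client would receive the transcript.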

Since FunASR runs fully self-hosted, this MCP server would give MCP clients local STT capabilities without external Whisper API calls. It is particularly valuable for Chinese and other Asian languages, where Whisper's performance is weaker.

Would a FunASR MCP server contribution be welcome?
