Kokoro Text-to-Speech
A text-to-speech service based on the Kokoro model (STDIO transport)
A Model Context Protocol server that provides text-to-speech capabilities using the Kokoro TTS model.
The server can be configured using the following environment variables:
| Variable | Description | Default | Valid Range |
|---|---|---|---|
| MCP_DEFAULT_SPEECH_SPEED | Default speed multiplier for text-to-speech | 1.1 | 0.5 to 2.0 |
| MCP_DEFAULT_VOICE | Default voice for text-to-speech | af_bella | Any valid voice ID |
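As an illustration of how these variables are typically consumed, the sketch below reads them at startup, falls back to the documented defaults, and clamps the speed to its valid range. This is a hypothetical example; the `loadConfig` helper is not part of this package.

```typescript
// Hypothetical sketch of reading the documented environment variables;
// not the package's actual implementation.
interface SpeechConfig {
  defaultSpeed: number; // speed multiplier, clamped to 0.5–2.0
  defaultVoice: string; // e.g. "af_bella"
}

function loadConfig(env: NodeJS.ProcessEnv = process.env): SpeechConfig {
  const rawSpeed = Number.parseFloat(env.MCP_DEFAULT_SPEECH_SPEED ?? "1.1");
  // Fall back to the documented default when the value is missing or malformed,
  // then clamp to the documented valid range.
  const defaultSpeed = Number.isFinite(rawSpeed)
    ? Math.min(2.0, Math.max(0.5, rawSpeed))
    : 1.1;
  return {
    defaultSpeed,
    defaultVoice: env.MCP_DEFAULT_VOICE ?? "af_bella",
  };
}
```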
In Cursor:
```json
{
  "mcpServers": {
    "speech": {
      "command": "npx",
      "args": [
        "-y",
        "speech-mcp-server"
      ],
      "env": {
        "MCP_DEFAULT_SPEECH_SPEED": "1.3",
        "MCP_DEFAULT_VOICE": "af_bella"
      }
    }
  }
}
```
```bash
# Using npm
npm install speech-mcp-server

# Using pnpm (recommended)
pnpm add speech-mcp-server

# Using yarn
yarn add speech-mcp-server
```
Run the server:
```bash
# Using default configuration
npm start

# With custom configuration
MCP_DEFAULT_SPEECH_SPEED=1.5 MCP_DEFAULT_VOICE=af_bella npm start
```
The server provides the following MCP tools:
- `text_to_speech`: Basic text-to-speech conversion
- `text_to_speech_with_options`: Text-to-speech with customizable speed
- `list_voices`: List all available voices
- `get_model_status`: Check the initialization status of the TTS model

For development:

```bash
# Clone the repository
git clone <your-repo-url>
cd speech-mcp-server

# Install dependencies
pnpm install

# Start development server with auto-reload
pnpm dev

# Build the project
pnpm build

# Run linting
pnpm lint

# Format code
pnpm format

# Test with MCP Inspector
pnpm inspector
```
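As a rough illustration of how these tools can be driven from code, the sketch below uses the TypeScript MCP SDK (`@modelcontextprotocol/sdk`) to spawn the server over stdio, list its tools, and call `text_to_speech`. This is an assumption-laden sketch rather than part of this project; exact import paths and client method names may differ between SDK versions.

```typescript
// Hedged sketch: drive the speech server from a TypeScript MCP client over stdio.
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

// Spawn the speech server as a child process and connect over stdio.
const transport = new StdioClientTransport({
  command: "npx",
  args: ["-y", "speech-mcp-server"],
});

const client = new Client(
  { name: "speech-example-client", version: "1.0.0" },
  { capabilities: {} }
);

await client.connect(transport);

// Discover the tools the server exposes (text_to_speech, list_voices, ...).
const { tools } = await client.listTools();
console.log(tools.map((tool) => tool.name));

// Call the basic text-to-speech tool with an optional voice argument.
const result = await client.callTool({
  name: "text_to_speech",
  arguments: { text: "Hello world", voice: "af_bella" },
});
console.log(result);

await client.close();
```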
The `text_to_speech` tool converts text to speech using the default settings.
{ "type": "request", "id": "1", "method": "call_tool", "params": { "name": "text_to_speech", "arguments": { "text": "Hello world", "voice": "af_bella" // optional } } }
The `text_to_speech_with_options` tool converts text to speech with customizable parameters.
{ "type": "request", "id": "1", "method": "call_tool", "params": { "name": "text_to_speech_with_options", "arguments": { "text": "Hello world", "voice": "af_bella", // optional "speed": 1.0, // optional (0.5 to 2.0) } } }
The `list_voices` tool lists all available voices for text-to-speech.
{ "type": "request", "id": "1", "method": "list_voices", "params": {} }
The `get_model_status` tool checks the current status of the TTS model initialization. This is particularly useful when first starting the server, as the model needs to be downloaded and initialized.
{ "type": "request", "id": "1", "method": "call_tool", "params": { "name": "get_model_status", "arguments": {} } }
Response example:
{ "content": [{ "type": "text", "text": "Model status: initializing (5s elapsed)" }] }
Possible status values:

- `uninitialized`: Model initialization hasn't started
- `initializing`: Model is being downloaded and initialized
- `ready`: Model is ready to use
- `error`: An error occurred during initialization

You can test the server using the MCP Inspector or by sending raw JSON messages:
```bash
# List available tools
echo '{"type":"request","id":"1","method":"list_tools","params":{}}' | node dist/index.js

# List available voices
echo '{"type":"request","id":"2","method":"list_voices","params":{}}' | node dist/index.js

# Convert text to speech
echo '{"type":"request","id":"3","method":"call_tool","params":{"name":"text_to_speech","arguments":{"text":"Hello world","voice":"af_bella"}}}' | node dist/index.js
```
To use this server with Claude Desktop, add the following to your Claude Desktop config file (`~/Library/Application Support/Claude/claude_desktop_config.json`):
{ "servers": { "speech": { "command": "npx", "args": ["@decodershq/speech-mcp-server"] } } }
Contributions are welcome! Please feel free to submit a Pull Request.
MIT License - see the LICENSE file for details.
The server automatically attempts to download and initialize the TTS model on startup. If you encounter initialization errors:
Use the `get_model_status` tool to monitor initialization progress and any errors, or remove the cached model files and restart the server:

```bash
# Remove model files (macOS/Linux)
rm -rf ~/.npm/_npx/**/node_modules/@huggingface/transformers/.cache/onnx-community/Kokoro-82M-v1.0-ONNX/onnx/model_quantized.onnx
rm -rf ~/.cache/huggingface/transformers/onnx-community/Kokoro-82M-v1.0-ONNX/onnx/model_quantized.onnx

# Then restart the server
npm start
```
The `get_model_status` tool will now include retry information in its response:
{ "content": [{ "type": "text", "text": "Model status: initializing (5s elapsed, retry 1/3)" }] }