Speech Converter
A powerful command-line utility for text-to-speech conversion using OpenAI's API.
Clone this repository:
git clone https://github.com/j3k0/speech.sh.git
cd speech.sh
Make the scripts executable:
chmod +x speech.sh mcp.sh launch
Ensure the required dependencies are installed, including jq and an audio player (ffmpeg or mplayer).
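A quick pre-flight check can confirm that the needed tools are on the PATH before running the script. The sketch below is only an illustration: check_deps is a hypothetical helper that is not part of speech.sh, and the tool names you pass to it are up to you.

```shell
# Hypothetical helper (not part of speech.sh): report which of the given
# tools cannot be found on the PATH.
check_deps() {
    missing=""
    for tool in "$@"; do
        # command -v exits non-zero when the tool is not available.
        command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
    done
    if [ -n "$missing" ]; then
        echo "missing dependencies:$missing" >&2
        return 1
    fi
    return 0
}
```

A call such as `check_deps jq ffmpeg || exit 1` at the top of a wrapper script would stop early with a readable message instead of failing midway through a conversion.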
Basic usage:
./speech.sh --text "Hello, world!"
With more options:
./speech.sh --text "Hello, world!" --voice nova --speed 1.2 --model tts-1-hd
  -h, --help             Show help message and exit
  -t, --text TEXT        Text to convert to speech (required)
  -v, --voice VOICE      Voice model to use (default: onyx)
  -s, --speed SPEED      Speech speed (default: 1.0)
  -o, --output FILE      Output file path (default: auto-generated)
  -a, --api_key KEY      OpenAI API key
  -m, --model MODEL      TTS model to use (default: tts-1)
  -p, --player PLAYER    Audio player to use: auto, ffmpeg, or mplayer (default: auto)
      --verbose          Enable verbose logging
  -V, --verbose          Same as --verbose
  -r, --retries N        Number of retry attempts for API calls (default: 3)
  -T, --timeout N        Timeout in seconds for API calls (default: 30)
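The auto value of --player prefers ffmpeg and falls back to mplayer. One way such a choice could be resolved is sketched below. This is not the script's actual code: resolve_player is a hypothetical function, and PLAYER_CANDIDATES is an assumed override that exists only to make the sketch easy to exercise.

```shell
# Hypothetical helper (not taken from speech.sh): map a --player value to
# the name of an installed command.
resolve_player() {
    mode=${1:-auto}
    case "$mode" in
        auto)
            # Try each candidate in order; print the first one installed.
            for candidate in ${PLAYER_CANDIDATES:-ffmpeg mplayer}; do
                if command -v "$candidate" >/dev/null 2>&1; then
                    printf '%s\n' "$candidate"
                    return 0
                fi
            done
            echo "no supported audio player found" >&2
            return 1
            ;;
        ffmpeg|mplayer)
            # Forced choice: fail if the requested player is not installed.
            if command -v "$mode" >/dev/null 2>&1; then
                printf '%s\n' "$mode"
            else
                echo "$mode is not installed" >&2
                return 1
            fi
            ;;
        *)
            echo "unknown player: $mode" >&2
            return 2
            ;;
    esac
}
```

Returning a distinct exit status for an unknown value lets a caller tell a typo in --player apart from a missing program.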
The script accepts an OpenAI API key in three ways (in order of precedence):
1. Command-line argument: --api_key "your-api-key"
2. Environment variable: export OPENAI_API_KEY="your-api-key"
3. A file named API_KEY in the script's directory
The script caches audio files by default to avoid unnecessary API calls. If you request the same text with the same voice and speed, it will reuse the previously generated audio file.
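Caching of this kind usually relies on a file name derived from the request parameters. The sketch below shows one possible scheme, under stated assumptions: cache_path is a hypothetical function, the use of cksum as the hash and the speech_<key>.mp3 naming pattern are illustrative choices, and the script's real cache layout may differ.

```shell
# Hypothetical helper (not taken from speech.sh): build a deterministic
# cache file path from the text, voice, and speed of a request.
#   $1 = text, $2 = voice, $3 = speed, $4 = cache directory (default: .)
cache_path() {
    # cksum prints "<checksum> <byte count>"; keep only the checksum.
    key=$(printf '%s|%s|%s' "$1" "$2" "$3" | cksum | cut -d ' ' -f 1)
    printf '%s/speech_%s.mp3\n' "${4:-.}" "$key"
}

# Possible use: only call the API when the file does not exist yet.
#   out=$(cache_path "$text" "$voice" "$speed" "$cache_dir")
#   [ -f "$out" ] || generate_audio "$out"   # generate_audio is hypothetical
```

Because the key depends only on the three inputs, repeating a request yields the same path, while changing the voice or speed yields a different one. A stronger hash would be a reasonable substitute where collisions are a concern.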
The script retries failed API calls. The number of attempts is set with --retries (default: 3) and the per-call timeout with --timeout (default: 30 seconds).
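A retry loop for API calls might look like the following sketch. It is not the script's actual implementation: retry is a hypothetical function, the doubling delay between attempts is an assumption (the source does not say how long the script waits), and RETRY_DELAY is an invented knob for the initial delay in seconds.

```shell
# Hypothetical helper (not taken from speech.sh).
# Usage: retry MAX_ATTEMPTS COMMAND [ARGS...]
retry() {
    max=$1
    shift
    attempt=1
    delay=${RETRY_DELAY:-1}   # assumed initial delay in seconds
    while :; do
        # Run the command; stop as soon as it succeeds.
        "$@" && return 0
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        echo "attempt $attempt failed; retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
        delay=$((delay * 2))  # assumed backoff: double the wait each time
    done
}
```

Wrapping a curl invocation that uses --fail and --max-time 30 in `retry 3` would mirror the --retries and --timeout defaults listed in the options, though whether speech.sh combines them this way is an assumption.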
You can choose your preferred audio player:
--player auto: Use ffmpeg if available, fall back to mplayer (default)
--player ffmpeg: Force using ffmpeg
--player mplayer: Force using mplayer
The mcp.sh script provides Model Context Protocol (MCP) compatibility, allowing the text-to-speech functionality to be used by MCP-compatible AI assistants like Claude.
To use the MCP server:
# Start the MCP server using the launch script
./launch
For detailed instructions on using the MCP integration, see MCP_README.md.
The script takes several steps to ensure security:
Uses jq for parameter processing
Convert text to speech with default settings:
./speech.sh --text "Hello, world!"
Use a different voice:
./speech.sh --text "Hello, world!" --voice nova
Adjust the speech speed:
./speech.sh --text "Hello, world!" --speed 1.5
Save to a specific file:
./speech.sh --text "Hello, world!" --output hello.mp3
Use environment variable for API key:
export OPENAI_API_KEY="your-api-key"
./speech.sh --text "Hello, world!"
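The API key lookup order described earlier (the --api_key option first, then the OPENAI_API_KEY environment variable, then an API_KEY file in the script's directory) could be implemented along these lines. This is a sketch, not the script's actual code: resolve_api_key is a hypothetical function, and reading only the first line of the file is an assumption.

```shell
# Hypothetical helper (not taken from speech.sh): pick the API key from the
# first source that provides one.
#   $1 = value passed to --api_key (may be empty)
#   $2 = directory containing the script
resolve_api_key() {
    if [ -n "$1" ]; then
        # 1. Command-line argument wins.
        printf '%s\n' "$1"
    elif [ -n "${OPENAI_API_KEY:-}" ]; then
        # 2. Environment variable comes next.
        printf '%s\n' "$OPENAI_API_KEY"
    elif [ -r "$2/API_KEY" ]; then
        # 3. Fall back to the key file (assumed: key on the first line).
        head -n 1 "$2/API_KEY"
    else
        echo "no API key found" >&2
        return 1
    fi
}
```

Checking the sources in a fixed order means a key given on the command line can override a shared environment setting for a single run.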
If you encounter issues:
Run the script with the --verbose flag
License: GPL