
Supadata
MCP server integrating Supadata for video transcript extraction and web scraping capabilities
A Model Context Protocol (MCP) server implementation that integrates with Supadata for video & web scraping capabilities.
Play around with our MCP Server on Smithery or on MCP.so's playground.
env SUPADATA_API_KEY=your-api-key npx -y @supadata/mcp
npm install -g @supadata/mcp
Configuring Cursor 🖥️
Note: Requires Cursor version 0.45.6+
For the most up-to-date configuration instructions, please refer to the official Cursor documentation on configuring MCP servers: Cursor MCP Server Configuration Guide
To configure Supadata MCP in Cursor v0.48.6:
{ "mcpServers": { "@supadata/mcp": { "command": "npx", "args": ["-y", "@supadata/mcp"], "env": { "SUPADATA_API_KEY": "YOUR-API-KEY" } } } }
To configure Supadata MCP in Cursor v0.45.6, use the following command:
env SUPADATA_API_KEY=your-api-key npx -y @supadata/mcp
If you are using Windows and are running into issues, try:
cmd /c "set SUPADATA_API_KEY=your-api-key && npx -y @supadata/mcp"
Replace your-api-key with your Supadata API key. If you don't have one yet, you can create an account and get it from https://www.supadata.dev/app/api-keys
After adding, refresh the MCP server list to see the new tools. The Composer Agent will automatically use Supadata MCP when appropriate, but you can explicitly request it by describing your web scraping needs. Access the Composer via Command+L (Mac), select "Agent" next to the submit button, and enter your query.
Add this to your ./codeium/windsurf/model_config.json:
{ "mcpServers": { "@supadata/mcp": { "command": "npx", "args": ["-y", "@supadata/mcp"], "env": { "SUPADATA_API_KEY": "YOUR_API_KEY" } } } }
To install Supadata for Claude Desktop automatically via Smithery:
npx -y @smithery/cli install @supadata-ai/mcp --client claude
For manual installation, add the following JSON block to your User Settings (JSON) file in VS Code. You can do this by pressing Ctrl + Shift + P and typing Preferences: Open User Settings (JSON).
{ "mcp": { "inputs": [ { "type": "promptString", "id": "apiKey", "description": "Supadata API Key", "password": true } ], "servers": { "supadata": { "command": "npx", "args": ["-y", "@supadata/mcp"], "env": { "SUPADATA_API_KEY": "${input:apiKey}" } } } } }
Optionally, you can add it to a file called .vscode/mcp.json in your workspace. This will allow you to share the configuration with others:
{ "inputs": [ { "type": "promptString", "id": "apiKey", "description": "Supadata API Key", "password": true } ], "servers": { "supadata": { "command": "npx", "args": ["-y", "@supadata/mcp"], "env": { "SUPADATA_API_KEY": "${input:apiKey}" } } } }
SUPADATA_API_KEY: Your Supadata API key
Add this to your claude_desktop_config.json:
{ "mcpServers": { "@supadata/mcp": { "command": "npx", "args": ["-y", "@supadata/mcp"], "env": { "SUPADATA_API_KEY": "YOUR_API_KEY_HERE" } } } }
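The same mcpServers entry appears in the Cursor, Windsurf, and Claude Desktop configurations above, differing only in the API key placeholder. As a minimal sketch, the entry could be generated from an environment variable instead of being edited by hand. The helper name buildSupadataConfig is hypothetical and not part of @supadata/mcp; where the JSON is written depends on your client.

```javascript
// Sketch only: buildSupadataConfig is a hypothetical helper, not part of @supadata/mcp.
// It reproduces the "mcpServers" entry shown in the configuration examples above.
function buildSupadataConfig(apiKey) {
  if (typeof apiKey !== "string" || apiKey.trim() === "") {
    throw new Error("SUPADATA_API_KEY is missing");
  }
  return {
    mcpServers: {
      "@supadata/mcp": {
        command: "npx",
        args: ["-y", "@supadata/mcp"],
        env: { SUPADATA_API_KEY: apiKey },
      },
    },
  };
}

// Print JSON that can be pasted into the client's config file.
// Falls back to a placeholder when the environment variable is unset or empty.
const config = buildSupadataConfig(process.env.SUPADATA_API_KEY || "YOUR_API_KEY");
console.log(JSON.stringify(config, null, 2));
```

Generating the block this way keeps the key out of files you might commit by accident, at the cost of one extra step before pasting.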
The server includes several configurable parameters that can be set via environment variables. Here are the default values if not configured:
const CONFIG = {
  retry: {
    maxAttempts: 3, // Number of retry attempts for rate-limited requests
    initialDelay: 1000, // Initial delay before first retry (in milliseconds)
    maxDelay: 10000, // Maximum delay between retries (in milliseconds)
    backoffFactor: 2, // Multiplier for exponential backoff
  },
};
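To make the retry parameters concrete, here is a rough sketch of how exponential backoff with these defaults could behave. It is not the server's actual implementation: retryDelay and withRetry are hypothetical names, maxAttempts is assumed to count total tries, and the sketch retries on any error rather than only on rate-limit responses.

```javascript
// Defaults copied from the configuration above.
const CONFIG = {
  retry: { maxAttempts: 3, initialDelay: 1000, maxDelay: 10000, backoffFactor: 2 },
};

// Delay in milliseconds before retry number `attempt` (1-based), capped at maxDelay.
function retryDelay(attempt, retry = CONFIG.retry) {
  const delay = retry.initialDelay * Math.pow(retry.backoffFactor, attempt - 1);
  return Math.min(delay, retry.maxDelay);
}

// Run `fn`, waiting an exponentially growing delay between failed tries.
// ASSUMPTION: every error is retried; a real client would check for a rate-limit error first.
async function withRetry(
  fn,
  retry = CONFIG.retry,
  sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))
) {
  let lastError;
  for (let attempt = 1; attempt <= retry.maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < retry.maxAttempts) {
        await sleep(retryDelay(attempt, retry));
      }
    }
  }
  throw lastError;
}
```

With the defaults, the first three delays are 1000 ms, 2000 ms, and 4000 ms; the 10000 ms cap would only apply from the fifth delay onward (16000 ms uncapped).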
The server uses Supadata's built-in rate limiting and batch processing capabilities.
Use this guide to select the right tool for your task:
Tool | Best for | Returns |
---|---|---|
transcript | Video transcript extraction | text/markdown |
scrape | Single page content | markdown/html |
map | Discovering URLs on a site | URL[] |
crawl | Multi-page extraction (with limits) | markdown/html[] |
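The short names in the table correspond to the supadata_-prefixed tool names used in the usage examples later in this document. As an illustrative sketch, a small lookup can turn a short name into the request payload shape those examples use. TOOLS and buildToolCall are hypothetical names introduced here, not part of the package.

```javascript
// Short names from the table, mapped to the full tool names used in the usage examples.
const TOOLS = {
  transcript: "supadata_transcript",
  scrape: "supadata_scrape",
  map: "supadata_map",
  crawl: "supadata_crawl",
};

// Hypothetical helper: build the { name, arguments } payload for a tool.
function buildToolCall(tool, args) {
  if (!Object.prototype.hasOwnProperty.call(TOOLS, tool)) {
    throw new Error(`Unknown tool: ${tool}`);
  }
  return { name: TOOLS[tool], arguments: args };
}

// Example: discover URLs on a site with the map tool.
console.log(JSON.stringify(buildToolCall("map", { url: "https://example.com" })));
```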
supadata_transcript
Extract transcripts from supported video platforms and file URLs.
Best for:
Not recommended for:
Common mistakes:
Prompt Example:
"Get the transcript from this YouTube video: https://youtube.com/watch?v=example"
Usage Example:
{ "name": "supadata_transcript", "arguments": { "url": "https://youtube.com/watch?v=example", "lang": "en", "text": false, "mode": "auto" } }
Returns:
supadata_check_transcript_status
Check the status of a transcript job.
{ "name": "supadata_check_transcript_status", "arguments": { "id": "550e8400-e29b-41d4-a716-446655440000" } }
Returns:
supadata_scrape
Scrape content from a single URL with advanced options.
Best for:
Not recommended for:
Common mistakes:
Prompt Example:
"Get the content of the page at https://example.com."
Usage Example:
{ "name": "supadata_scrape", "arguments": { "url": "https://example.com", "noLinks": false, "lang": "en" } }
Returns:
supadata_map
Map a website to discover all indexed URLs on the site.
Best for:
Not recommended for:
Common mistakes:
Prompt Example:
"List all URLs on example.com."
Usage Example:
{ "name": "supadata_map", "arguments": { "url": "https://example.com" } }
Returns:
supadata_crawl
Start an asynchronous crawl job on a website and extract content from all pages.
Best for:
Not recommended for:
Warning: Crawl responses can be very large and may exceed token limits. Limit the number of pages to crawl for better control.
Common mistakes:
Prompt Example:
"Get all pages from example.com/blog."
Usage Example:
{ "name": "supadata_crawl", "arguments": { "url": "https://example.com/blog", "limit": 100 } }
Returns:
{ "content": [ { "type": "text", "text": "Started crawl for: https://example.com/* with job ID: 550e8400-e29b-41d4-a716-446655440000. Use supadata_check_crawl_status to check progress." } ], "isError": false }
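Because the job ID arrives inside free text rather than as a separate field, a client has to parse it before calling supadata_check_crawl_status. The following is a sketch under the assumption that the "with job ID: <uuid>" wording shown above stays stable; extractJobId is a hypothetical helper, and the real message format may change between versions.

```javascript
// Pull the job ID out of the text returned by supadata_crawl.
// ASSUMPTION: the response text contains "job ID: <uuid>" as in the example above.
function extractJobId(response) {
  if (response.isError) {
    throw new Error("crawl call returned an error");
  }
  const text = response.content
    .filter((part) => part.type === "text")
    .map((part) => part.text)
    .join("\n");
  const match = text.match(/job ID:\s*([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})/i);
  if (!match) {
    throw new Error("no job ID found in response text");
  }
  return match[1];
}

// Sample response copied from the example above.
const response = {
  content: [
    {
      type: "text",
      text: "Started crawl for: https://example.com/* with job ID: 550e8400-e29b-41d4-a716-446655440000. Use supadata_check_crawl_status to check progress.",
    },
  ],
  isError: false,
};

// Build the follow-up status request from the extracted ID.
const statusCall = {
  name: "supadata_check_crawl_status",
  arguments: { id: extractJobId(response) },
};
console.log(JSON.stringify(statusCall));
```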
supadata_check_crawl_status
Check the status of a crawl job.
{ "name": "supadata_check_crawl_status", "arguments": { "id": "550e8400-e29b-41d4-a716-446655440000" } }
Returns:
# Install dependencies
npm install

# Build
npm run build

# Run tests
npm test
MIT License - see LICENSE file for details