
Omnisearch
Unified search and content processing MCP server integrating multiple search providers and AI tools.
A Model Context Protocol (MCP) server that provides unified access to multiple search providers and AI tools. This server combines the capabilities of Tavily, Perplexity, Kagi, Jina AI, Brave, Exa AI, and Firecrawl to offer comprehensive search, AI responses, content processing, and enhancement features through a single interface.
MCP Omnisearch provides powerful search capabilities through operators and parameters:
```json
// Using Brave or Kagi with query string operators
{
  "query": "filetype:pdf site:microsoft.com typescript guide"
}

// Using Tavily with API parameters
{
  "query": "typescript guide",
  "include_domains": ["microsoft.com"],
  "exclude_domains": ["github.com"]
}
```
- `filename:remote.ts` - Search for specific files
- `path:src/lib` - Search within specific directories
- `repo:user/repo` - Search within specific repositories
- `user:username` - Search within a user's repositories
- `language:typescript` - Filter by programming language
- `in:file "export function"` - Search for text within files

MCP Omnisearch is designed to work with the API keys you have available. You don't need to have keys for all providers - the server will automatically detect which API keys are available and only enable those providers.
For example:

- With only a `TAVILY_API_KEY`, the Tavily tools are enabled and the other providers stay disabled.
- Add a `KAGI_API_KEY` or `BRAVE_API_KEY` later and those search tools become available automatically.

This flexibility makes it easy to get started with just one or two providers and add more as needed.
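For instance, here is a minimal sketch of an MCP client configuration (full examples follow below) that enables only Tavily - the path and key value are placeholders:

```json
{
  "mcpServers": {
    "mcp-omnisearch": {
      "command": "node",
      "args": ["/path/to/mcp-omnisearch/dist/index.js"],
      "env": {
        "TAVILY_API_KEY": "your-tavily-key"
      }
    }
  }
}
```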
This server requires configuration through your MCP client. Here are examples for different environments:
Add this to your Cline MCP settings:
{ "mcpServers": { "mcp-omnisearch": { "command": "node", "args": ["/path/to/mcp-omnisearch/dist/index.js"], "env": { "TAVILY_API_KEY": "your-tavily-key", "PERPLEXITY_API_KEY": "your-perplexity-key", "KAGI_API_KEY": "your-kagi-key", "JINA_AI_API_KEY": "your-jina-key", "BRAVE_API_KEY": "your-brave-key", "GITHUB_API_KEY": "your-github-key", "EXA_API_KEY": "your-exa-key", "FIRECRAWL_API_KEY": "your-firecrawl-key", "FIRECRAWL_BASE_URL": "http://localhost:3002" }, "disabled": false, "autoApprove": [] } } }
For WSL environments, add this to your Claude Desktop configuration:
{ "mcpServers": { "mcp-omnisearch": { "command": "wsl.exe", "args": [ "bash", "-c", "TAVILY_API_KEY=key1 PERPLEXITY_API_KEY=key2 KAGI_API_KEY=key3 JINA_AI_API_KEY=key4 BRAVE_API_KEY=key5 GITHUB_API_KEY=key6 EXA_API_KEY=key7 FIRECRAWL_API_KEY=key8 FIRECRAWL_BASE_URL=http://localhost:3002 node /path/to/mcp-omnisearch/dist/index.js" ] } } }
The server uses API keys for each provider. You don't need keys for all providers - only the providers corresponding to your available API keys will be activated:

- `TAVILY_API_KEY`: For Tavily Search
- `PERPLEXITY_API_KEY`: For Perplexity AI
- `KAGI_API_KEY`: For Kagi services (FastGPT, Summarizer, Enrichment)
- `JINA_AI_API_KEY`: For Jina AI services (Reader, Grounding)
- `BRAVE_API_KEY`: For Brave Search
- `GITHUB_API_KEY`: For GitHub search services (Code, Repository, User search)
- `EXA_API_KEY`: For Exa AI services (Search, Answer, Contents, Similar)
- `FIRECRAWL_API_KEY`: For Firecrawl services (Scrape, Crawl, Map, Extract, Actions)
- `FIRECRAWL_BASE_URL`: For self-hosted Firecrawl instances (optional, defaults to the Firecrawl cloud service)

You can start with just one or two API keys and add more later as needed. The server will log which providers are available on startup.
To use the GitHub search features, you'll need a GitHub personal access token. For security, give it access to public repositories only:

1. Go to GitHub Settings: Navigate to GitHub Settings > Developer settings > Personal access tokens
2. Create a new token: Click "Generate new token" → "Generate new token (classic)"
3. Configure token settings:
   - Name: MCP Omnisearch - Public Search
   - Expiration: Choose your preferred expiration (90 days recommended)
   - Scopes: Leave all checkboxes UNCHECKED
   - ⚠️ Important: Do not select any scopes. A token with no scopes can only access public repositories and user profiles, which is exactly what we want for search functionality.
4. Generate and copy: Click "Generate token" and copy the token immediately
5. Add to environment: Set `GITHUB_API_KEY=your_token_here`
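For example, in the MCP settings shown earlier, the token simply joins the `env` block (the value below is a placeholder):

```json
{
  "env": {
    "GITHUB_API_KEY": "ghp_your_token_here"
  }
}
```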
Security Notes:

- A token with no scopes can only read public data, so the impact of a leaked token is limited.
- Treat the token like any credential: keep it in environment variables or MCP settings, and revoke it if it is ever exposed.
If you're running a self-hosted instance of Firecrawl, you can
configure MCP Omnisearch to use it by setting the FIRECRAWL_BASE_URL
environment variable. This allows you to maintain complete control
over your data processing pipeline.
Self-hosted Firecrawl setup:

1. Run your self-hosted Firecrawl instance (by default on `http://localhost:3002`)
2. Set the `FIRECRAWL_BASE_URL` environment variable to point at it:

```bash
FIRECRAWL_BASE_URL=http://localhost:3002
# or for a remote self-hosted instance:
FIRECRAWL_BASE_URL=https://your-firecrawl-domain.com
```
Important notes:

- If `FIRECRAWL_BASE_URL` is not set, MCP Omnisearch will default to the Firecrawl cloud service
- The same API endpoints (`/v1/scrape`, `/v1/crawl`, etc.) are used whether the instance is cloud-hosted or self-hosted
- You still need to set `FIRECRAWL_API_KEY`, even for self-hosted instances
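Putting this together, the relevant `env` entries in your MCP settings would look like this sketch (values are placeholders):

```json
{
  "env": {
    "FIRECRAWL_API_KEY": "your-firecrawl-key",
    "FIRECRAWL_BASE_URL": "http://localhost:3002"
  }
}
```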
The server implements MCP Tools organized by category:

**Search Tools**

**Tavily Search**

Search the web using the Tavily Search API. Best for factual queries requiring reliable sources and citations.
Parameters:

- `query` (string, required): Search query

Example:

```json
{
  "query": "latest developments in quantum computing"
}
```
**Brave Search**

Privacy-focused web search with good coverage of technical topics.
Parameters:

- `query` (string, required): Search query

Example:

```json
{
  "query": "rust programming language features"
}
```
**Kagi Search**

High-quality search results with minimal advertising influence. Best for finding authoritative sources and research materials.
Parameters:

- `query` (string, required): Search query
- `language` (string, optional): Language filter (e.g., "en")
- `no_cache` (boolean, optional): Bypass cache for fresh results

Example:

```json
{
  "query": "latest research in machine learning",
  "language": "en"
}
```
**GitHub Code Search**

Search for code on GitHub using advanced syntax. This tool searches through file contents in public repositories and provides code snippets with metadata.
Parameters:

- `query` (string, required): Search query with GitHub search syntax
- `limit` (number, optional): Maximum number of results (1-50, default: 10)

Example:

```json
{
  "query": "filename:remote.ts @sveltejs/kit",
  "limit": 5
}
```
Advanced query examples:

- `"filename:config.json path:src"` - Find config.json files in src directories
- `"function fetchData language:typescript"` - Find fetchData functions in TypeScript
- `"repo:microsoft/vscode extension"` - Search within a specific repository
- `"user:torvalds language:c"` - Search a user's repositories for C code

**GitHub Repository Search**

Discover GitHub repositories with enhanced metadata including stars, forks, language, and last update information.
Parameters:

- `query` (string, required): Repository search query
- `limit` (number, optional): Maximum number of results (1-50, default: 10)
- `sort` (string, optional): Sort results by 'stars', 'forks', or 'updated'

Example:

```json
{
  "query": "sveltekit remote functions",
  "sort": "stars",
  "limit": 5
}
```
**GitHub User Search**

Find GitHub users and organizations with profile information.
Parameters:

- `query` (string, required): User/organization search query
- `limit` (number, optional): Maximum number of results (1-50, default: 10)

Example:

```json
{
  "query": "Rich-Harris",
  "limit": 3
}
```
**Exa Search**

AI-powered web search using neural and keyword search. Automatically chooses between traditional keyword search and Exa's embeddings-based model to find the most relevant results for your query.
Parameters:

- `query` (string, required): Search query
- `limit` (number, optional): Maximum number of results (1-100, default: 10)
- `include_domains` (array, optional): Only include results from these domains
- `exclude_domains` (array, optional): Exclude results from these domains

Example:

```json
{
  "query": "latest AI research papers",
  "limit": 15,
  "include_domains": ["arxiv.org", "scholar.google.com"]
}
```
**AI Response Tools**

**Perplexity AI**

AI-powered response generation with real-time web search integration.
Parameters:

- `query` (string, required): Question or topic for AI response

Example:

```json
{
  "query": "Explain the differences between REST and GraphQL"
}
```
**Kagi FastGPT**

Quick AI-generated answers with citations.
Parameters:

- `query` (string, required): Question for quick AI response

Example:

```json
{
  "query": "What are the main features of TypeScript?"
}
```
**Exa Answer**

Get direct AI-generated answers to questions using the Exa Answer API.
Parameters:

- `query` (string, required): Question for AI response
- `include_domains` (array, optional): Only include sources from these domains
- `exclude_domains` (array, optional): Exclude sources from these domains

Example:

```json
{
  "query": "How does machine learning work?",
  "include_domains": ["arxiv.org", "nature.com"]
}
```
**Content Processing Tools**

**Jina AI Reader**

Convert URLs to clean, LLM-friendly text with image captioning.
Parameters:

- `url` (string, required): URL to process

Example:

```json
{
  "url": "https://example.com/article"
}
```
**Kagi Summarizer**

Summarize content from URLs.
Parameters:

- `url` (string, required): URL to summarize

Example:

```json
{
  "url": "https://example.com/long-article"
}
```
**Tavily Extract**

Extract raw content from web pages with Tavily Extract.
Parameters:

- `url` (string | string[], required): Single URL or array of URLs to extract content from
- `extract_depth` (string, optional): Extraction depth - 'basic' (default) or 'advanced'

Example:

```json
{
  "url": [
    "https://example.com/article1",
    "https://example.com/article2"
  ],
  "extract_depth": "advanced"
}
```
**Firecrawl Scrape**

Extract clean, LLM-ready data from single URLs with enhanced formatting options.
Parameters:

- `url` (string | string[], required): Single URL or array of URLs to extract content from
- `extract_depth` (string, optional): Extraction depth - 'basic' (default) or 'advanced'

Example:

```json
{
  "url": "https://example.com/article",
  "extract_depth": "basic"
}
```
**Firecrawl Crawl**

Deep crawling of all accessible subpages on a website with configurable depth limits.
Parameters:

- `url` (string | string[], required): Starting URL for crawling
- `extract_depth` (string, optional): Extraction depth - 'basic' (default) or 'advanced' (controls crawl depth and limits)

Example:

```json
{
  "url": "https://example.com",
  "extract_depth": "advanced"
}
```
**Firecrawl Map**

Fast URL collection from websites for comprehensive site mapping.
Parameters:

- `url` (string | string[], required): URL to map
- `extract_depth` (string, optional): Extraction depth - 'basic' (default) or 'advanced' (controls map depth)

Example:

```json
{
  "url": "https://example.com",
  "extract_depth": "basic"
}
```
**Firecrawl Extract**

Structured data extraction with AI using natural language prompts.
Parameters:

- `url` (string | string[], required): URL to extract structured data from
- `extract_depth` (string, optional): Extraction depth - 'basic' (default) or 'advanced'

Example:

```json
{
  "url": "https://example.com",
  "extract_depth": "basic"
}
```
**Firecrawl Actions**

Support for page interactions (clicking, scrolling, etc.) before extraction for dynamic content.
Parameters:

- `url` (string | string[], required): URL to interact with and extract content from
- `extract_depth` (string, optional): Extraction depth - 'basic' (default) or 'advanced' (controls complexity of interactions)

Example:

```json
{
  "url": "https://news.ycombinator.com",
  "extract_depth": "basic"
}
```
**Exa Contents**

Extract full content from Exa search result IDs.
Parameters:

- `ids` (string | string[], required): Exa search result ID(s) to extract content from
- `extract_depth` (string, optional): Extraction depth - 'basic' (default) or 'advanced'

Example:

```json
{
  "ids": ["exa-result-id-123", "exa-result-id-456"],
  "extract_depth": "advanced"
}
```
**Exa Similar**

Find web pages semantically similar to a given URL using Exa.
Parameters:

- `url` (string, required): URL to find similar pages for
- `extract_depth` (string, optional): Extraction depth - 'basic' (default) or 'advanced'

Example:

```json
{
  "url": "https://arxiv.org/abs/2106.09685",
  "extract_depth": "advanced"
}
```
**Enhancement Tools**

**Kagi Enrichment**

Get supplementary content from specialized indexes.
Parameters:

- `query` (string, required): Query for enrichment

Example:

```json
{
  "query": "emerging web technologies"
}
```
**Jina AI Grounding**

Verify statements against web knowledge.
Parameters:

- `statement` (string, required): Statement to verify

Example:

```json
{
  "statement": "TypeScript adds static typing to JavaScript"
}
```
MCP Omnisearch supports containerized deployment using Docker with MCPO (Model Context Protocol Over HTTP) integration, enabling cloud deployment and OpenAPI access.
```bash
# Clone the repository
git clone https://github.com/spences10/mcp-omnisearch.git
cd mcp-omnisearch

# Create .env file with your API keys
echo "TAVILY_API_KEY=your-tavily-key" > .env
echo "KAGI_API_KEY=your-kagi-key" >> .env
echo "PERPLEXITY_API_KEY=your-perplexity-key" >> .env
echo "EXA_API_KEY=your-exa-key" >> .env
# Add other API keys as needed
echo "GITHUB_API_KEY=your-github-key" >> .env

# Start the container
docker-compose up -d
```
```bash
docker build -t mcp-omnisearch .

docker run -d \
  -p 8000:8000 \
  -e TAVILY_API_KEY=your-tavily-key \
  -e KAGI_API_KEY=your-kagi-key \
  -e PERPLEXITY_API_KEY=your-perplexity-key \
  -e EXA_API_KEY=your-exa-key \
  -e GITHUB_API_KEY=your-github-key \
  --name mcp-omnisearch \
  mcp-omnisearch
```
Configure the container using environment variables for each provider:

- `TAVILY_API_KEY`: For Tavily Search
- `PERPLEXITY_API_KEY`: For Perplexity AI
- `KAGI_API_KEY`: For Kagi services (FastGPT, Summarizer, Enrichment)
- `JINA_AI_API_KEY`: For Jina AI services (Reader, Grounding)
- `BRAVE_API_KEY`: For Brave Search
- `GITHUB_API_KEY`: For GitHub search services
- `EXA_API_KEY`: For Exa AI services
- `FIRECRAWL_API_KEY`: For Firecrawl services
- `FIRECRAWL_BASE_URL`: For self-hosted Firecrawl instances (optional)
- `PORT`: Container port (defaults to 8000)

Once deployed, the MCP server is accessible via OpenAPI at:

`http://your-container-host:8000/omnisearch`
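As an illustrative sketch only - assuming MCPO maps each tool to its own POST route under the `/omnisearch` path - a search request body would carry the same JSON parameters documented for the MCP tools above, for example:

```json
{
  "query": "latest developments in quantum computing"
}
```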
The containerized version can be deployed to any container platform that supports Docker:
Example deployment to a cloud platform:
```bash
# Build and tag for your registry
docker build -t your-registry/mcp-omnisearch:latest .
docker push your-registry/mcp-omnisearch:latest

# Deploy with your platform's CLI or web interface
# Configure environment variables through your platform's settings
```
Install dependencies:

```bash
pnpm install
```

Build the project:

```bash
pnpm run build
```

Run in development mode:

```bash
pnpm run dev
```

To publish, build the project and then publish the package:

```bash
pnpm run build
pnpm publish
```
Each provider requires its own API key and may have different access requirements. Each provider also enforces its own rate limits; the server handles rate limit errors gracefully and returns appropriate error messages.
Please read CONTRIBUTING.md before opening a PR. In short: use `src/common/http.ts` (`http_json`) for HTTP, read keys from `src/config/env.ts`, respect timeouts, and surface errors via `ProviderError`.

MIT License - see the LICENSE file for details.
Built on the Model Context Protocol (MCP).