
# Cloudflare AutoRAG MCP Server

An HTTP/SSE Model Context Protocol (MCP) server that provides search capabilities for Cloudflare AutoRAG knowledge base instances. It enables AI assistants such as Claude to search and query your AutoRAG knowledge base directly using three distinct search methods.
All search tools support a configurable `score_threshold` (default: 0.5) and `max_num_results` (1-50, default: 10).

### autorag_basic_search
Performs a basic vector similarity search in your Cloudflare AutoRAG index without AI query rewriting or answer generation. Returns raw document chunks only.
Parameters:

- `query` (string, required) - The search query text (max 10,000 characters)
- `score_threshold` (number, optional) - Minimum similarity score threshold (0.0-1.0, default: 0.5)
- `max_num_results` (number, optional) - Maximum number of results to return (1-50, default: 10)
- `autorag_name` (string, optional) - Name of the AutoRAG instance to use (defaults to the configured default)

### autorag_rewrite_search
Performs a vector search with AI query rewriting but no answer generation. Uses Cloudflare's `search()` method with a configurable `rewrite_query` option for better semantic matching, and returns only document chunks.
Parameters:

- `query` (string, required) - The search query text (max 10,000 characters)
- `score_threshold` (number, optional) - Minimum similarity score threshold (0.0-1.0, default: 0.5)
- `max_num_results` (number, optional) - Maximum number of results to return (1-50, default: 10)
- `rewrite_query` (boolean, optional) - Whether to rewrite the query for better matching (default: true)
- `autorag_name` (string, optional) - Name of the AutoRAG instance to use (defaults to the configured default)

### autorag_ai_search
Performs an AI-powered search using Cloudflare's `aiSearch()` method with an optional AI-generated response. Returns document chunks and, depending on the `include_ai_response` parameter, an AI answer. Supports pagination for large result sets.
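Pagination can be driven from the client by following the `has_more` / `next_page` fields until the results are exhausted. A minimal sketch, assuming a hypothetical `search` callback that stands in for the MCP tool call (it is not part of this server's exported API):

```typescript
// Client-side pagination over autorag_ai_search-style responses.
// `search` is a hypothetical stand-in for issuing the MCP tool call
// with an optional cursor from the previous page.
interface Chunk { text: string; score: number }
interface SearchPage { data: Chunk[]; has_more: boolean; next_page?: string }

async function fetchAllChunks(
  search: (cursor?: string) => Promise<SearchPage>
): Promise<Chunk[]> {
  const chunks: Chunk[] = [];
  let cursor: string | undefined;
  do {
    const page = await search(cursor);   // first call: no cursor
    chunks.push(...page.data);
    cursor = page.has_more ? page.next_page : undefined;
  } while (cursor);
  return chunks;
}
```

The loop stops as soon as a page reports `has_more: false`, so a single-page result costs exactly one call.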
Parameters:

- `query` (string, required) - The search query text (max 10,000 characters)
- `score_threshold` (number, optional) - Minimum similarity score threshold (0.0-1.0, default: 0.5)
- `max_num_results` (number, optional) - Maximum number of results to return (1-50, default: 10)
- `rewrite_query` (boolean, optional) - Whether to rewrite the query for better semantic matching (default: true)
- `include_ai_response` (boolean, optional) - Whether to include the AI-generated response in the output (default: false)
- `cursor` (string, optional) - Pagination cursor from a previous response to fetch the next page of results (v1.2.0+)
- `autorag_name` (string, optional) - Name of the AutoRAG instance to use (defaults to the configured default)

Response includes:
- `data` - Array of source document chunks with scores and metadata (always included)
- `response` - AI-generated answer based on retrieved documents (only when `include_ai_response: true`)
- `has_more` - Boolean indicating if more results are available
- `next_page` - Cursor token for fetching the next page (when `has_more` is true)
- `nextCursor` - MCP-compliant cursor field (mirrors the `next_page` value)

### list_autorags (v2.0.0+)

Lists all available AutoRAG instances configured in the server.
Parameters: None
Response includes:
- `autorags` - Array of AutoRAG instances with name, description, and `is_default` flag
- `total` - Total number of configured AutoRAG instances
- `default` - Name of the default AutoRAG instance

### get_current_autorag (v2.0.0+)

Gets information about the currently configured default AutoRAG instance.
Parameters: None
Response includes:
- `current_autorag` - Name of the current default AutoRAG instance
- `description` - Description of the instance
- `is_default` - Always true for this endpoint

## Installation

Prerequisites: Wrangler CLI (`npm install --save-dev wrangler`)

Clone the repository:
```bash
git clone <repository-url>
cd cf-autorag-mcp
```
Install dependencies:
```bash
npm install
```
Configure your AutoRAG instance:
Edit `wrangler.toml` and update the configuration.
For a single AutoRAG instance:
```toml
[vars]
AUTORAG_NAME = "your-autorag-instance-name"
```
For multiple AutoRAG instances:
```toml
[vars]
AUTORAG_INSTANCES = "instance1,instance2,instance3"
AUTORAG_DESCRIPTIONS = "Description 1,Description 2,Description 3"
```
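The two comma-separated variables are positional: the nth description belongs to the nth instance. As an illustration only, a hypothetical helper could parse them into the structure that `list_autorags` returns, assuming the first entry is treated as the default (the server's actual parsing logic may differ):

```typescript
// Hypothetical helper: parse AUTORAG_INSTANCES / AUTORAG_DESCRIPTIONS
// into an instance list. Assumes the first entry is the default;
// this is an illustration, not the server's actual implementation.
interface AutoRagInstance { name: string; description: string; is_default: boolean }

function parseInstances(instances: string, descriptions: string): AutoRagInstance[] {
  const names = instances.split(",").map((s) => s.trim()).filter(Boolean);
  const descs = descriptions.split(",").map((s) => s.trim());
  return names.map((name, i) => ({
    name,
    description: descs[i] ?? "",      // tolerate a missing description
    is_default: i === 0,
  }));
}
```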
Deploy to Cloudflare Workers:
```bash
npx wrangler deploy
```
This will output your Worker URL, which you'll need for the MCP client configuration.
To use this MCP server with Claude Desktop, add the following configuration to your Claude Desktop config file:
macOS - edit `~/Library/Application Support/Claude/claude_desktop_config.json`:

Windows - edit `%APPDATA%/Claude/claude_desktop_config.json`:
```json
{
  "mcpServers": {
    "cf-autorag-mcp": {
      "command": "npx",
      "args": [
        "mcp-remote",
        "https://your-worker-url.workers.dev/"
      ]
    }
  }
}
```
Replace `https://your-worker-url.workers.dev/` with your actual deployed Worker URL.
After updating the configuration, restart Claude Desktop so it picks up the new server.
The server uses the following Cloudflare Worker bindings and variables:

- `AI` - Cloudflare AI binding for AutoRAG access (handles all AutoRAG operations)
- `AUTORAG_NAME` - Your AutoRAG instance name (for single-instance configuration)
- `AUTORAG_INSTANCES` - Comma-separated list of AutoRAG instances (for multi-instance configuration)
- `AUTORAG_DESCRIPTIONS` - Comma-separated list of descriptions for each instance

The `wrangler.toml` file includes:

```toml
name = "cf-autorag-mcp"
main = "src/server.ts"
compatibility_date = "2024-09-23"
compatibility_flags = ["nodejs_compat"]

[vars]
# For single AutoRAG instance:
AUTORAG_NAME = "your-autorag-instance-name"

# For multiple AutoRAG instances (v2.0.0+):
# AUTORAG_INSTANCES = "default-autorag,secondary-autorag,specialized-autorag"
# AUTORAG_DESCRIPTIONS = "Main knowledge base,Secondary knowledge base,Specialized documents"

[ai]
binding = "AI"
```
Note: The VECTORIZE binding is not required. AutoRAG manages its own vector index access internally through the AI binding.
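To show how the pieces fit together, here is a minimal sketch of a search dispatch through the `AI` binding. The binding interfaces below are assumptions modeled on Cloudflare's documented `env.AI.autorag(name).search()` API, not this server's actual code; check the current Workers AI documentation for exact option names.

```typescript
// Sketch: dispatch a basic search through the AI binding.
// Interfaces are assumptions based on Cloudflare's documented
// env.AI.autorag(name).search() API, not this repo's source.
interface SearchResponse { data: { text: string; score: number }[] }

interface AutoRag {
  search(opts: {
    query: string;
    max_num_results?: number;
    ranking_options?: { score_threshold?: number };
  }): Promise<SearchResponse>;
}

interface Env {
  AI: { autorag(name: string): AutoRag };
  AUTORAG_NAME: string;
}

async function basicSearch(
  env: Env,
  query: string,
  scoreThreshold = 0.5,   // tool defaults from the parameter docs above
  maxNumResults = 10
): Promise<SearchResponse> {
  return env.AI.autorag(env.AUTORAG_NAME).search({
    query,
    max_num_results: maxNumResults,
    ranking_options: { score_threshold: scoreThreshold },
  });
}
```

In the real Worker, `env` is supplied by the Workers runtime; here it is just a parameter so the function can be exercised with a mock.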
Once configured with Claude Desktop, you can use the tools like this:
Basic Search (no query rewriting, no AI response):

> Search for documents about "machine learning" in my AutoRAG with a minimum score threshold of 0.7

Rewrite Search (AI query rewriting, no AI response):

> Use rewrite search to find information about "deployment strategies" with query rewriting enabled

AI Search with Document Chunks Only (default behavior):

> Use AI search to find information about "deployment strategies" with max 5 results

AI Search with AI-Generated Response:

> Use AI search to find information about "deployment strategies" and include the AI-generated response

Multi-AutoRAG Usage (v2.0.0+):

> List all available AutoRAG instances

> Search for "security policies" in the secondary-autorag instance

> Use AI search in specialized-autorag to find "compliance requirements" with AI response
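Under the hood, each of these prompts becomes an MCP `tools/call` request against the server. A sketch of the request shape (values illustrative, following the JSON-RPC framing used by MCP):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "autorag_ai_search",
    "arguments": {
      "query": "deployment strategies",
      "max_num_results": 5,
      "include_ai_response": true
    }
  }
}
```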
Important Notes:

- `autorag_basic_search` performs pure vector search without any AI enhancements
- `autorag_rewrite_search` uses AI query rewriting but returns document chunks only
- `autorag_ai_search` by default returns document chunks only (letting the client LLM generate responses), but can optionally include Cloudflare's AI-generated response

## Development

```bash
# Start local development server
npm run dev

# Build for production
npm run build
```
Project structure:

```
cf-autorag-mcp/
├── src/
│   └── server.ts     # Main MCP server implementation
├── wrangler.toml     # Cloudflare Workers configuration
├── package.json      # Dependencies and scripts
└── README.md         # This file
```
## Troubleshooting

- "AutoRAG instance not found": Verify that `AUTORAG_NAME` in `wrangler.toml` matches your AutoRAG instance name.
- "MCP server disconnected": Confirm the Worker is deployed and that the URL in your Claude Desktop config is correct.
- "Tool not found" errors: Check the server logs with `npx wrangler tail`.
- Empty search results: Try lowering the `score_threshold` parameter (default is 0.5).

View real-time logs from your deployed Worker:
```bash
npx wrangler tail
```
Recent changes include the `include_ai_response` parameter for the AI search tool, a default score threshold of 0.5, and comprehensive parameter validation.

## License

This project is licensed under the MIT License.
## Support

For issues related to: