Sage Multi-Model Orchestrator
MCP server for sending prompts to OpenAI O3 or Google Gemini based on token count.
mcp-sage
An MCP (Model Context Protocol) server that provides tools for sending prompts to either OpenAI's O3 model or Google's Gemini 2.5 Pro based on token count. The tools embed the contents of all referenced file paths (recursively for folders) in the prompt. This is useful for getting second opinions or detailed code reviews from a model that can handle tons of context accurately.
I make heavy use of Claude Code. It's a great product that works well for my workflow, but newer models with large context windows seem really useful for more complex codebases where more context is needed. This lets me continue to use Claude Code as a development tool while leveraging the large context capabilities of O3 and Gemini 2.5 Pro to augment Claude Code's limited context.
The server automatically selects the appropriate model based on token count and available API keys:

- For smaller contexts (up to 200K tokens): uses OpenAI's O3 model (if `OPENAI_API_KEY` is set)
- For larger contexts (over 200K and up to 1M tokens): uses Google's Gemini 2.5 Pro (if `GEMINI_API_KEY` is set)

Fallback behavior:

- API Key Fallback: if only one API key is set, that provider's model is used for any context that fits within its token limit
- Network Connectivity Fallback: if the preferred provider is unreachable, the server automatically falls back to the other available model
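A minimal sketch of these rules (illustrative TypeScript, not the actual source; the model names and limits are taken from the log examples later in this README):

```typescript
// Illustrative sketch of the selection rules described above -- not the actual source.
type ModelChoice = { model: string; limit: number };

function selectModel(tokenCount: number, hasOpenAI: boolean, hasGemini: boolean): ModelChoice {
  // Prefer O3 for prompts that fit its 200K-token context window.
  if (hasOpenAI && tokenCount <= 200_000) {
    return { model: "o3-2025-04-16", limit: 200_000 };
  }
  // Fall back to Gemini 2.5 Pro for larger prompts (up to ~1M tokens),
  // or when no OpenAI key is configured.
  if (hasGemini && tokenCount <= 1_000_000) {
    return { model: "gemini-2.5-pro-preview-03-25", limit: 1_000_000 };
  }
  throw new Error("Prompt exceeds the limits of all available models");
}
```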
This project draws inspiration from two other open source projects:
This project implements an MCP server that exposes three tools:

- `sage-opinion`: get a second opinion on a question or a codebase
- `sage-review`: get a code review returned as SEARCH/REPLACE blocks
- `sage-plan`: generate an implementation plan via a multi-model debate
The `sage-plan` tool doesn't ask a single model for a plan. Instead, it orchestrates a structured debate that runs for one or more rounds and then asks a separate judge model (or the same model in CoRT mode) to pick the winner.
```mermaid
flowchart TD
    S0[Start Debate] -->|determine models, judge, budgets| R1

    subgraph R1["Round 1"]
        direction TB
        R1GEN["Generation Phase<br/>*ALL models run in parallel*"]
        R1GEN --> R1CRIT["Critique Phase<br/>*ALL models critique others in parallel*"]
    end

    subgraph RN["Rounds 2 to N"]
        direction TB
        SYNTH["Synthesis Phase<br/>*every model refines own plan*"]
        SYNTH --> CONS[Consensus Check]
        CONS -->|Consensus reached| JUDGE
        CONS -->|No consensus & round < N| CRIT["Critique Phase<br/>*models critique in parallel*"]
        CRIT --> SYNTH
    end

    R1 --> RN

    JUDGE[Judgment Phase<br/>*judge model selects/merges plan*]
    JUDGE --> FP[Final Plan]

    classDef round fill:#e2eafe,stroke:#4169E1;
    class R1GEN,R1CRIT,SYNTH,CRIT round;
    style FP fill:#D0F0D7,stroke:#2F855A,stroke-width:2px
    style JUDGE fill:#E8E8FF,stroke:#555,stroke-width:1px
```
Key phases in the multi-model debate (see the sketch after this list):

1. Setup Phase: determine the participating models, the judge model, and token budgets
2. Round 1: every model generates a plan in parallel, then all models critique the other plans in parallel
3. Rounds 2 to N (N defaults to 3): each model refines its own plan; a consensus check decides whether to stop early or run another critique phase
4. Judgment Phase: the judge model selects or merges the best plan
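In sketch form, the debate loop looks roughly like this (simplified TypeScript; the helper names are illustrative stand-ins for the internals of `src/debateOrchestrator.ts`, and the consensus threshold is an assumed value):

```typescript
// Simplified sketch of the multi-model debate loop -- names are illustrative, not the actual API.
declare function generatePlan(model: string): Promise<string>;
declare function critiquePlans(model: string, plans: string[]): Promise<string>;
declare function synthesizePlan(model: string, own: string, critiques: string[]): Promise<string>;
declare function checkConsensus(judge: string, plans: string[]): Promise<{ consensusScore: number }>;
declare function judge(judgeModel: string, plans: string[]): Promise<string>;

const CONSENSUS_THRESHOLD = 0.9; // assumed value for illustration

async function runDebate(models: string[], judgeModel: string, maxRounds = 3): Promise<string> {
  // Round 1: all models draft plans in parallel, then critique each other's plans.
  let plans = await Promise.all(models.map((m) => generatePlan(m)));
  let critiques = await Promise.all(models.map((m) => critiquePlans(m, plans)));

  // Rounds 2..N: each model refines its own plan, then the judge checks for consensus.
  for (let round = 2; round <= maxRounds; round++) {
    plans = await Promise.all(models.map((m, i) => synthesizePlan(m, plans[i], critiques)));
    const { consensusScore } = await checkConsensus(judgeModel, plans);
    if (consensusScore >= CONSENSUS_THRESHOLD) break; // consensus reached: go straight to judgment
    if (round < maxRounds) {
      critiques = await Promise.all(models.map((m) => critiquePlans(m, plans)));
    }
  }

  // Judgment phase: a separate judge model selects or merges the winning plan.
  return judge(judgeModel, plans);
}
```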
When only one model is available, a Chain of Recursive Thoughts (CoRT) approach is used:

```mermaid
flowchart TD
    SD0[Start Self-Debate] --> R1

    subgraph R1["Round 1 - Initial Plans"]
        direction TB
        P1[Generate Plan 1] --> P2[Generate Plan 2<br/>*different approach*]
        P2 --> P3[Generate Plan 3<br/>*different approach*]
    end

    subgraph RN["Rounds 2 to N"]
        direction TB
        REF[Generate Improved Plan<br/>*addresses weaknesses in all previous plans*]
        DEC{More rounds left?}
        REF --> DEC
        DEC -->|Yes| REF
    end

    R1 --> RN
    DEC -->|No| FP[Final Plan = last plan generated]

    style FP fill:#D0F0D7,stroke:#2F855A,stroke-width:2px
```
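The same loop in sketch form (illustrative TypeScript; `generatePlan` and `improvePlan` are hypothetical stand-ins for the prompts in `src/prompts/debatePrompts.ts`):

```typescript
// Sketch of the single-model CoRT loop -- helper names are illustrative, not the actual API.
declare function generatePlan(model: string, approachHint: string): Promise<string>;
declare function improvePlan(model: string, previousPlans: string[]): Promise<string>;

async function runSelfDebate(model: string, maxRounds = 3): Promise<string> {
  // Round 1: generate three initial plans, each taking a deliberately different approach.
  const plans: string[] = [];
  for (const hint of ["approach A", "approach B", "approach C"]) {
    plans.push(await generatePlan(model, hint));
  }

  // Rounds 2..N: generate one improved plan per round that addresses
  // weaknesses in all previous plans; the last plan generated wins.
  for (let round = 2; round <= maxRounds; round++) {
    plans.push(await improvePlan(model, plans));
  }
  return plans[plans.length - 1];
}
```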
| Phase / Functionality | Code Location | Notes |
|---|---|---|
| Generation Prompts | `prompts/debatePrompts.generatePrompt` | Adds heading "# Implementation Plan (Model X)" |
| Critique Prompts | `prompts/debatePrompts.critiquePrompt` | Uses "## Critique of Plan {ID}" sections |
| Synthesis Prompts | `prompts/debatePrompts.synthesizePrompt` | Model revises its own plan |
| Consensus Check | `debateOrchestrator.checkConsensus` | Judge model returns JSON with `consensusScore` |
| Judgment | `prompts/debatePrompts.judgePrompt` | Judge returns "# Final Implementation Plan" + confidence |
| Self-Debate Prompt | `prompts/debatePrompts.selfDebatePrompt` | Chain-of-Recursive-Thoughts loop |
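For instance, the consensus check expects the judge model to reply with JSON carrying a `consensusScore` (the field name comes from the table above; any additional fields in the real response are not documented here):

```json
{
  "consensusScore": 0.85
}
```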
⚠️ Important: The sage-plan tool can:

- consume significantly more API tokens than a single-model call
- take several minutes to complete (typically 5-15 for a full debate)
- incur correspondingly higher API costs

Typical resource usage scales with the number of participating models and debate rounds.
To install Sage for Claude Desktop automatically via Smithery:
```bash
npx -y @smithery/cli install @jalehman/mcp-sage --client claude
```
```bash
# Clone the repository
git clone https://github.com/your-username/mcp-sage.git
cd mcp-sage

# Install dependencies
npm install

# Build the project
npm run build
```
Set the following environment variables:

- `OPENAI_API_KEY`: Your OpenAI API key (for the O3 model)
- `GEMINI_API_KEY`: Your Google Gemini API key (for Gemini 2.5 Pro)

After building with `npm run build`, add the following to your MCP configuration:
```bash
OPENAI_API_KEY=your_openai_key GEMINI_API_KEY=your_gemini_key node /path/to/this/repo/dist/index.js
```
You can also use environment variables set elsewhere, like in your shell profile.
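For MCP clients configured via JSON (for example Claude Desktop's `claude_desktop_config.json`), the equivalent entry would look roughly like this (a sketch; the server name `sage` and the repo path are placeholders):

```json
{
  "mcpServers": {
    "sage": {
      "command": "node",
      "args": ["/path/to/this/repo/dist/index.js"],
      "env": {
        "OPENAI_API_KEY": "your_openai_key",
        "GEMINI_API_KEY": "your_gemini_key"
      }
    }
  }
}
```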
To get a second opinion on something, just ask for a second opinion. To get a code review, ask for a code review or an expert review. Both of these benefit from providing the paths of files you want included in context, but if omitted the host LLM will probably infer what to include.
The server provides detailed monitoring information via the MCP logging capability. These logs include token usage and model selection, the number of files and documents included in the prompt, and request timing (see the examples below).
Logs are sent via the MCP protocol's `notifications/message` method, ensuring they don't interfere with the JSON-RPC communication. MCP clients with logging support will display these logs appropriately.
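On the wire, a log notification follows the MCP logging format, roughly like this (the `data` payload here is one of the example entries below):

```json
{
  "jsonrpc": "2.0",
  "method": "notifications/message",
  "params": {
    "level": "info",
    "data": "Token usage: 1,234 tokens. Selected model: o3-2025-04-16 (limit: 200,000 tokens)"
  }
}
```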
Example log entries:
```
Token usage: 1,234 tokens. Selected model: o3-2025-04-16 (limit: 200,000 tokens)
Files included: 3, Document count: 3
Sending request to OpenAI o3-2025-04-16 with 1,234 tokens...
Received response from o3-2025-04-16 in 982ms

Token usage: 235,678 tokens. Selected model: gemini-2.5-pro-preview-03-25 (limit: 1,000,000 tokens)
Files included: 25, Document count: 18
Sending request to Gemini with 235,678 tokens...
Received response from gemini-2.5-pro-preview-03-25 in 3240ms
```
The `sage-opinion` tool accepts the following parameters:

- `prompt` (string, required): The prompt to send to the selected model
- `paths` (array of strings, required): List of file paths to include as context

Example MCP tool call (using JSON-RPC 2.0):
{ "jsonrpc": "2.0", "id": 1, "method": "tools/call", "params": { "name": "sage-opinion", "arguments": { "prompt": "Explain how this code works", "paths": ["path/to/file1.js", "path/to/file2.js"] } } }
The `sage-review` tool accepts the following parameters:

- `instruction` (string, required): The specific changes or improvements needed
- `paths` (array of strings, required): List of file paths to include as context

Example MCP tool call (using JSON-RPC 2.0):
{ "jsonrpc": "2.0", "id": 1, "method": "tools/call", "params": { "name": "sage-review", "arguments": { "instruction": "Add error handling to the function", "paths": ["path/to/file1.js", "path/to/file2.js"] } } }
The response will contain SEARCH/REPLACE blocks that you can use to implement the suggested changes:
```
<<<<<<< SEARCH
function getData() {
  return fetch('/api/data')
    .then(res => res.json());
}
=======
function getData() {
  return fetch('/api/data')
    .then(res => {
      if (!res.ok) {
        throw new Error(`HTTP error! Status: ${res.status}`);
      }
      return res.json();
    })
    .catch(error => {
      console.error('Error fetching data:', error);
      throw error;
    });
}
>>>>>>> REPLACE
```
The `sage-plan` tool accepts the following parameters:

- `prompt` (string, required): Description of what you need an implementation plan for
- `paths` (array of strings, required): List of file paths to include as context

Example MCP tool call (using JSON-RPC 2.0):
{ "jsonrpc": "2.0", "id": 1, "method": "tools/call", "params": { "name": "sage-plan", "arguments": { "prompt": "Create an implementation plan for adding user authentication to this application", "paths": ["src/index.js", "src/models/", "src/routes/"] } } }
The response contains a detailed implementation plan. This plan benefits from the collective intelligence of multiple AI models (or thorough self-review by a single model) and typically contains more robust, thoughtful, and detailed recommendations than a single-pass approach.
To test the tools:
```bash
# Test the sage-opinion tool
OPENAI_API_KEY=your_openai_key GEMINI_API_KEY=your_gemini_key node test/run-test.js

# Test the sage-review tool
OPENAI_API_KEY=your_openai_key GEMINI_API_KEY=your_gemini_key node test/test-expert.js

# Test the sage-plan tool
OPENAI_API_KEY=your_openai_key GEMINI_API_KEY=your_gemini_key node test/run-sage-plan.js

# Test the model selection logic specifically
OPENAI_API_KEY=your_openai_key GEMINI_API_KEY=your_gemini_key node test/test-o3.js
```
Note: The sage-plan test may take 5-15 minutes to run as it orchestrates a multi-model debate.
- `src/index.ts`: The main MCP server implementation with tool definitions
- `src/pack.ts`: Tool for packing files into a structured XML format
- `src/tokenCounter.ts`: Utilities for counting tokens in a prompt
- `src/gemini.ts`: Gemini API client implementation
- `src/openai.ts`: OpenAI API client implementation for the O3 model
- `src/debateOrchestrator.ts`: Multi-model debate orchestration for sage-plan
- `src/prompts/debatePrompts.ts`: Templates for debate prompts and instructions
- `test/run-test.js`: Test for the sage-opinion tool
- `test/test-expert.js`: Test for the sage-review tool
- `test/run-sage-plan.js`: Test for the sage-plan tool
- `test/test-o3.js`: Test for the model selection logic

License: ISC