# Ultimate MCP Server
Comprehensive MCP-native system providing AI agents with dozens of powerful capabilities for cognitive augmentation.
Getting Started • Key Features • Usage Examples • Architecture
Ultimate MCP Server is a comprehensive MCP-native system that serves as a complete AI agent operating system. It exposes dozens of powerful capabilities through the Model Context Protocol, enabling advanced AI agents to access a rich ecosystem of tools, cognitive systems, and specialized services.
While it includes intelligent task delegation from sophisticated models (e.g., Claude 3.7 Sonnet) to cost-effective ones (e.g., Gemini Flash 2.0 Lite), this is just one facet of its extensive functionality. The server provides unified access to multiple LLM providers while optimizing for cost, performance, and quality.
The system offers integrated cognitive memory systems, browser automation, Excel manipulation, database interactions, document processing, command-line utilities, dynamic API integration, OCR capabilities, vector operations, entity relation graphs, SQL database interactions, audio transcription, and much more. These capabilities transform an AI agent from a conversational interface into a powerful autonomous system capable of complex, multi-step operations across digital environments.
---

## 🎯 Vision: The Complete AI Agent Operating System
At its core, Ultimate MCP Server represents a fundamental shift in how AI agents operate in digital environments. It serves as a comprehensive operating system for AI agents, providing unified access to tools, cognitive memory systems, and external services.
This approach mirrors how sophisticated operating systems provide applications with access to hardware, services, and resources, but it is designed specifically to augment AI agents with powerful new capabilities beyond their native abilities.
The server is built entirely on the Model Context Protocol (MCP), making it specifically designed to work with AI agents like Claude. All functionality is exposed through standardized MCP tools that can be directly called by these agents, creating a seamless integration layer between AI agents and a comprehensive ecosystem of capabilities, services, and external systems.
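As a minimal sketch of what this looks like from the client side (using the `mcp-client` package that the usage examples later in this README assume), an agent or application simply connects to the server and discovers the tools it can call:

```python
import asyncio
from mcp.client import Client

async def discover_tools():
    # Connect to a running Ultimate MCP Server (default local address)
    client = Client("http://localhost:8013")
    # 'list_tools' is one of the server's meta tools (see the Meta Tools example below)
    response = await client.tools.list_tools(include_schemas=False)
    if response["success"]:
        for name, info in response.get("tools", {}).items():
            print(f"{name}: {info.get('description', '')[:80]}")
    await client.close()

# asyncio.run(discover_tools())
```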
The Ultimate MCP Server transforms AI agents like Claude 3.7 Sonnet into autonomous systems capable of sophisticated operations across digital environments:
```
                  interacts with
┌─────────────┐ ────────────────────► ┌───────────────────┐           ┌──────────────┐
│ Claude 3.7  │                       │    Ultimate MCP   │ ────────► │ LLM Providers│
│  (Agent)    │ ◄──────────────────── │       Server      │ ◄──────── │   External   │
└─────────────┘    returns results    └───────────────────┘           │    Systems   │
                                                │                      └──────────────┘
                                                ▼
                     ┌─────────────────────────────────────────────┐
                     │ Cognitive Memory Systems                    │
                     │ Web & Data: Browser, DB, RAG, Vector Search │
                     │ Documents: Excel, OCR, PDF, Filesystem      │
                     │ Analysis: Entity Graphs, Classification     │
                     │ Integration: APIs, CLI, Audio, Multimedia   │
                     └─────────────────────────────────────────────┘
```
Example workflow:
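A minimal illustrative sketch of such a workflow (tool names and response fields follow the client examples later in this README; treat it as a sketch rather than a fixed API):

```python
import asyncio
from mcp.client import Client

async def delegation_workflow():
    client = Client("http://localhost:8013")
    # 1. The agent (e.g., Claude) has the server split a large document into chunks
    chunks = await client.tools.chunk_document(document="... large document ...", chunk_size=1000)
    # 2. It delegates per-chunk summarization to a cheap, fast model
    summaries = []
    for chunk in chunks["chunks"]:
        result = await client.tools.summarize_document(
            document=chunk, provider="gemini", model="gemini-2.0-flash-lite"
        )
        summaries.append(result["summary"])
    # 3. Only the distilled results return to the agent for high-level synthesis
    print(f"{len(summaries)} summaries ready for synthesis")
    await client.close()

# asyncio.run(delegation_workflow())
```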
This integration unlocks transformative capabilities that enable AI agents to autonomously complete complex projects while intelligently utilizing resources - including potentially saving 70-90% on API costs by using specialized tools and cost-effective models where appropriate.
A unified hub enabling advanced AI agents to access an extensive ecosystem of tools:
API costs for advanced models can be substantial. Ultimate MCP Server helps reduce costs by:
Avoid provider lock-in with a unified interface:
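For instance (a sketch based on the `completion` tool shown in the usage examples below), switching providers is just a parameter change; the call shape stays the same:

```python
import asyncio
from mcp.client import Client

async def provider_agnostic():
    client = Client("http://localhost:8013")
    prompt = "Summarize the benefits of a unified LLM interface in one sentence."
    # Same call, different backends; the server handles provider-specific details
    for provider, model in [("openai", "gpt-4.1-mini"), ("anthropic", "claude-3-5-haiku-20241022")]:
        response = await client.tools.completion(prompt=prompt, provider=provider, model=model)
        if response["success"]:
            print(f"{provider}: {response['completion'][:100]}")
    await client.close()

# asyncio.run(provider_agnostic())
```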
Process documents and data efficiently:
- OCR tools (require the `ocr` extra dependencies: `uv pip install -e ".[ocr]"`).
- Web content extraction and HTML-to-Markdown conversion using `readability-lxml`, `trafilatura`, and `markdownify`.
- Local command-line text utilities exposed as MCP tools: `ripgrep` (fast regex search), `awk` (text processing), `sed` (stream editor), and `jq` (JSON processing). Process text locally without API calls.
- Meta tools for tool discovery (`list_tools`).
- Dynamic API integration (`register_api`, `call_dynamic_tool`).
- Rich console logging via `Rich`.
- `/healthz` endpoint for readiness checks.
- `umcp` CLI for management and interaction.

Install from source with `uv`:

```bash
# Install uv (fast Python package manager) if you don't have it:
curl -LsSf https://astral.sh/uv/install.sh | sh

# Clone the repository
git clone https://github.com/Dicklesworthstone/ultimate_mcp_server.git
cd ultimate_mcp_server

# Create a virtual environment and install dependencies using uv:
uv venv --python 3.13
source .venv/bin/activate
uv lock --upgrade
uv sync --all-extras
```
Note: The `uv sync --all-extras` command installs all optional extras defined in the project (e.g., OCR, Browser Automation, Excel). If you only need specific extras, adjust your project dependencies and run `uv sync` without `--all-extras`.
Create a file named `.env` in the root directory of the cloned repository. Add your API keys and any desired configuration overrides:
```bash
# --- API Keys (at least one provider required) ---
OPENAI_API_KEY=your_openai_sk-...
ANTHROPIC_API_KEY=your_anthropic_sk-...
GEMINI_API_KEY=your_google_ai_studio_key...   # For Google AI Studio (Gemini API)
# Or use GOOGLE_APPLICATION_CREDENTIALS=/path/to/your/service-account-key.json for Vertex AI
DEEPSEEK_API_KEY=your_deepseek_key...
OPENROUTER_API_KEY=your_openrouter_key...
GROK_API_KEY=your_grok_key...                 # For Grok via xAI API

# --- Server Configuration (Defaults shown) ---
GATEWAY_SERVER_PORT=8013
GATEWAY_SERVER_HOST=127.0.0.1  # Change to 0.0.0.0 to listen on all interfaces (needed for Docker/external access)
# GATEWAY_API_PREFIX=/

# --- Logging Configuration (Defaults shown) ---
LOG_LEVEL=INFO          # DEBUG, INFO, WARNING, ERROR, CRITICAL
USE_RICH_LOGGING=true   # Set to false for plain text logs

# --- Cache Configuration (Defaults shown) ---
GATEWAY_CACHE_ENABLED=true
GATEWAY_CACHE_TTL=86400          # Default Time-To-Live in seconds (24 hours)
# GATEWAY_CACHE_TYPE=memory      # Options might include 'memory', 'redis', 'diskcache' (check implementation)
# GATEWAY_CACHE_MAX_SIZE=1000    # Example: Max number of items for memory cache
# GATEWAY_CACHE_DIR=./.cache     # Directory for disk cache storage

# --- Provider Timeouts & Retries (Defaults shown) ---
# GATEWAY_PROVIDER_TIMEOUT=120     # Default timeout in seconds for API calls
# GATEWAY_PROVIDER_MAX_RETRIES=3   # Default max retries on failure

# --- Provider-Specific Configuration ---
# GATEWAY_OPENAI_DEFAULT_MODEL=gpt-4.1-mini                     # Customize default model
# GATEWAY_ANTHROPIC_DEFAULT_MODEL=claude-3-5-sonnet-20241022    # Customize default model
# GATEWAY_GEMINI_DEFAULT_MODEL=gemini-2.0-pro                   # Customize default model

# --- Tool Specific Config (Examples) ---
# FILESYSTEM__ALLOWED_DIRECTORIES=["/path/to/safe/dir1","/path/to/safe/dir2"]  # For Filesystem tools (JSON array)
# GATEWAY_AGENT_MEMORY_DB_PATH=unified_agent_memory.db   # Path for agent memory database
# GATEWAY_PROMPT_TEMPLATES_DIR=./prompt_templates        # Directory for prompt templates
```
Make sure your virtual environment is active (`source .venv/bin/activate`).
```bash
# Start the MCP server with all registered tools found
umcp run

# Start the server including only specific tools
umcp run --include-tools completion chunk_document read_file write_file

# Start the server excluding specific tools
umcp run --exclude-tools browser_init browser_navigate research_and_synthesize_report

# Start with Docker (ensure .env file exists in the project root or pass environment variables)
docker compose up --build  # Add --build the first time or after changes
```
Once running, the server will typically be available at `http://localhost:8013` (or the host/port configured in your `.env` file or on the command line). You should see log output indicating that the server has started and which tools are registered.
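As a quick sanity check, you can probe the `/healthz` readiness endpoint mentioned above (a minimal sketch using only the Python standard library and the default host/port):

```python
# Minimal readiness probe against the /healthz endpoint (assumes default host/port)
from urllib.request import urlopen

with urlopen("http://localhost:8013/healthz", timeout=5) as resp:
    print(resp.status)  # Expect 200 when the server is ready
```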
The Ultimate MCP Server provides a powerful command-line interface (CLI) through the `umcp` command that allows you to manage the server, interact with LLM providers, test features, and explore examples. This section details all available commands and their options.
The `umcp` command supports the following global option:

```bash
umcp --version  # Display version information
```
The `run` command starts the Ultimate MCP Server with specified options:
```bash
# Basic server start with default settings from .env
umcp run

# Run on a specific host (-h) and port (-p)
umcp run -h 0.0.0.0 -p 9000

# Run with multiple worker processes (-w)
umcp run -w 4

# Enable debug logging (-d)
umcp run -d

# Use stdio transport (-t)
umcp run -t stdio

# Use streamable-http transport (recommended for HTTP clients)
umcp run -t shttp

# Run only with specific tools (no shortcut for --include-tools)
umcp run --include-tools completion chunk_document read_file write_file

# Run with all tools except certain ones (no shortcut for --exclude-tools)
umcp run --exclude-tools browser_init browser_navigate
```
Example output:
```
┌─ Starting Ultimate MCP Server ───────────────────┐
│ Host: 0.0.0.0 │
│ Port: 9000 │
│ Workers: 4 │
│ Transport mode: streamable-http │
└────────────────────────────────────────────────┘
INFO: Started server process [12345]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:9000 (Press CTRL+C to quit)
```
Available options:
- `-h, --host`: Host or IP address to bind the server to (default: from .env)
- `-p, --port`: Port to listen on (default: from .env)
- `-w, --workers`: Number of worker processes to spawn (default: from .env)
- `-t, --transport-mode`: Transport mode for server communication ('shttp' for streamable-http, 'sse', or 'stdio'; default: shttp)
- `-d, --debug`: Enable debug logging
- `--include-tools`: List of tool names to include (comma-separated)
- `--exclude-tools`: List of tool names to exclude (comma-separated)

The `providers` command displays information about configured LLM providers:
```bash
# List all configured providers
umcp providers

# Check API keys (-c) for all configured providers
umcp providers -c

# List available models (no shortcut for --models)
umcp providers --models

# Check keys and list models
umcp providers -c --models
```
Example output:
```
┌─ LLM Providers ──────────────────────────────────────────────────┐
│ Provider Status Default Model API Key │
├───────────────────────────────────────────────────────────────────┤
│ openai ✓ gpt-4.1-mini sk-...5vX [VALID] │
│ anthropic ✓ claude-3-5-sonnet-20241022 sk-...Hr [VALID] │
│ gemini ✓ gemini-2.0-pro [VALID] │
│ deepseek ✗ deepseek-chat [NOT CONFIGURED] │
│ openrouter ✓ -- [VALID] │
│ grok ✓ grok-1 [VALID] │
└───────────────────────────────────────────────────────────────────┘
```
With `--models`:
```
OPENAI MODELS:
- gpt-4.1-mini
- gpt-4o
- gpt-4-0125-preview
- gpt-3.5-turbo
ANTHROPIC MODELS:
- claude-3-5-sonnet-20241022
- claude-3-5-haiku-20241022
- claude-3-opus-20240229
...
```
Available options:
- `-c, --check`: Check API keys for all configured providers
- `--models`: List available models for each provider

The `test` command allows you to test a specific provider:
```bash
# Test the default OpenAI model with a simple prompt
umcp test openai

# Test a specific model (--model) with a custom prompt (--prompt)
umcp test anthropic --model claude-3-5-haiku-20241022 --prompt "Write a short poem about coding."

# Test Gemini with a different prompt
umcp test gemini --prompt "What are three interesting AI research papers from 2024?"
```
Example output:
```
Testing provider 'anthropic'...
Provider: anthropic
Model: claude-3-5-haiku-20241022
Prompt: Write a short poem about coding.
❯ Response:
Code flows like water,
Logic cascades through the mind—
Bugs bloom like flowers.
Tokens: 13 input, 19 output
Cost: $0.00006
Response time: 0.82s
```
Available options:
- `--model`: Model ID to test (defaults to the provider's default)
- `--prompt`: Prompt text to send (default: "Hello, world!")

The `complete` command lets you generate text directly from the CLI:
```bash
# Generate text with default provider (OpenAI) using a prompt (--prompt)
umcp complete --prompt "Write a concise explanation of quantum computing."

# Specify a provider (--provider) and model (--model)
umcp complete --provider anthropic --model claude-3-5-sonnet-20241022 --prompt "What are the key differences between Rust and Go?"

# Use a system prompt (--system)
umcp complete --provider openai --model gpt-4o --system "You are an expert programmer..." --prompt "Explain dependency injection."

# Stream the response token by token (-s)
umcp complete --provider openai --prompt "Count from 1 to 10." -s

# Adjust temperature (--temperature) and token limit (--max-tokens)
umcp complete --provider gemini --temperature 1.2 --max-tokens 250 --prompt "Generate a creative sci-fi story opening."

# Read prompt from stdin (no --prompt needed)
echo "Tell me about space exploration." | umcp complete
```
Example output:
```
Quantum computing uses quantum bits (qubits) that can exist in multiple states simultaneously, unlike classical bits (0 or 1). This quantum superposition, along with entanglement, allows quantum computers to process vast amounts of information in parallel, potentially solving certain complex problems exponentially faster than classical computers. Applications include cryptography, materials science, and optimization problems.
Tokens: 13 input, 72 output
Cost: $0.00006
Response time: 0.37s
```
Available options:
- `--provider`: Provider to use (default: openai)
- `--model`: Model ID (defaults to provider's default)
- `--prompt`: Prompt text (reads from stdin if not provided)
- `--temperature`: Sampling temperature (0.0-2.0, default: 0.7)
- `--max-tokens`: Maximum tokens to generate
- `--system`: System prompt for providers that support it
- `-s, --stream`: Stream the response token by token

The `cache` command allows you to view or clear the request cache:
```bash
# Show cache status (default action)
umcp cache

# Explicitly show status (no shortcut for --status)
umcp cache --status

# Clear the cache (no shortcut for --clear, with confirmation prompt)
umcp cache --clear

# Show stats and clear the cache in one command
umcp cache --status --clear
```
Example output:
```
Cache Status:
Backend: memory
Enabled: True
Items: 127
Hit rate: 73.2%
Estimated savings: $1.47
```
Available options:
- `--status`: Show cache status (enabled by default if no other flag)
- `--clear`: Clear the cache (will prompt for confirmation)

The `benchmark` command lets you compare performance and cost across providers:
```bash
# Run default benchmark (3 runs per provider)
umcp benchmark

# Benchmark only specific providers
umcp benchmark --providers openai,anthropic

# Benchmark with specific models
umcp benchmark --providers openai,anthropic --models gpt-4o,claude-3.5-sonnet

# Use a custom prompt and more runs (-r)
umcp benchmark --prompt "Explain the process of photosynthesis in detail." -r 5
```
Example output:
```
┌─ Benchmark Results ───────────────────────────────────────────────────────┐
│ Provider Model Avg Time Tokens Cost Tokens/sec │
├──────────────────────────────────────────────────────────────────────────┤
│ openai gpt-4.1-mini 0.47s 76 / 213 $0.00023 454 │
│ anthropic claude-3-5-haiku 0.52s 76 / 186 $0.00012 358 │
│ gemini gemini-2.0-pro 0.64s 76 / 201 $0.00010 314 │
│ deepseek deepseek-chat 0.71s 76 / 195 $0.00006 275 │
└──────────────────────────────────────────────────────────────────────────┘
```
Available options:
- `--providers`: List of providers to benchmark (default: all configured)
- `--models`: Model IDs to benchmark (defaults to the default model of each provider)
- `--prompt`: Prompt text to use (default: built-in benchmark prompt)
- `-r, --runs`: Number of runs per provider/model (default: 3)

The `tools` command lists available tools, optionally filtered by category:
```bash
# List all tools
umcp tools

# List tools in a specific category
umcp tools --category document

# Show related example scripts
umcp tools --examples
```
Example output:
```
┌─ Ultimate MCP Server Tools ─────────────────────────────────────────┐
│ Category Tool Example Script │
├──────────────────────────────────────────────────────────────────────┤
│ completion generate_completion simple_completion_demo.py │
│ completion stream_completion simple_completion_demo.py │
│ completion chat_completion claude_integration_demo.py│
│ document summarize_document document_processing.py │
│ document chunk_document document_processing.py │
│ extraction extract_json advanced_extraction_demo.py│
│ filesystem read_file filesystem_operations_demo.py│
└──────────────────────────────────────────────────────────────────────┘
```
Tip: Run examples using the command `umcp examples <example_name>`.
Available options:
- `--category`: Filter tools by category
- `--examples`: Show example scripts alongside tools

The `examples` command lets you list and run example scripts:
```bash
# List all example scripts (default action)
umcp examples

# Explicitly list example scripts (-l)
umcp examples -l

# Run a specific example
umcp examples rag_example.py

# Can also run by just the name without extension
umcp examples rag_example
```
Example output when listing:
```
┌─ Ultimate MCP Server Example Scripts ─────────────────────────────────┐
│ Category Example Script │
├────────────────────────────────────────────────────────────────────────┤
│ text-generation simple_completion_demo.py │
│ text-generation claude_integration_demo.py │
│ document-processing document_processing.py │
│ search-and-retrieval rag_example.py │
│ browser-automation browser_automation_demo.py │
└────────────────────────────────────────────────────────────────────────┘
```
Run an example: `umcp examples <example_name>`
When running an example:
```
Running example: rag_example.py
Creating vector knowledge base 'demo_kb'...
Adding sample documents...
Retrieving context for query: "What are the benefits of clean energy?"
Generated response:
Based on the retrieved context, clean energy offers several benefits:
...
```
Available options:
- `-l, --list`: List example scripts only
- `--category`: Filter examples by category

Every command has detailed help available:
```bash
# General help
umcp --help

# Help for a specific command
umcp run --help
umcp providers --help
umcp complete --help
```
Example output:
```
Usage: umcp [OPTIONS] COMMAND [ARGS]...
Ultimate MCP Server: Multi-provider LLM management server
Unified CLI to run your server, manage providers, and more.
Options:
--version, -v Show the application version and exit.
--help Show this message and exit.
Commands:
run Run the Ultimate MCP Server
providers List Available Providers
test Test a Specific Provider
complete Generate Text Completion
cache Cache Management
benchmark Benchmark Providers
tools List Available Tools
  examples    Run or List Example Scripts
```

Command-specific help:

```
Usage: umcp run [OPTIONS]
Run the Ultimate MCP Server
Start the server with optional overrides.
Examples:
umcp run -h 0.0.0.0 -p 8000 -w 4 -t sse
umcp run -d
Options:
-h, --host TEXT Host or IP address to bind the server to.
Defaults from config.
-p, --port INTEGER Port to listen on. Defaults from config.
-w, --workers INTEGER Number of worker processes to spawn.
Defaults from config.
-t, --transport-mode [shttp|sse|stdio]
Transport mode for server communication (-t
shortcut). Options: 'shttp' (streamable-http,
recommended), 'sse', or 'stdio'.
-d, --debug Enable debug logging for detailed output (-d
shortcut).
--include-tools TEXT List of tool names to include when running
the server.
--exclude-tools TEXT List of tool names to exclude when running
the server.
  --help                          Show this message and exit.
```
This section provides Python examples demonstrating how an MCP client (like an application using `mcp-client` or an agent like Claude) would interact with the tools provided by a running Ultimate MCP Server instance.
Note: These examples assume you have `mcp-client` installed (`pip install mcp-client`) and that the Ultimate MCP Server is running at `http://localhost:8013`.
```python
import asyncio
from mcp.client import Client

async def basic_completion_example():
    client = Client("http://localhost:8013")

    response = await client.tools.completion(
        prompt="Write a short poem about a robot learning to dream.",
        provider="openai",
        model="gpt-4.1-mini",
        max_tokens=100,
        temperature=0.7
    )

    if response["success"]:
        print(f"Completion: {response['completion']}")
        print(f"Cost: ${response['cost']:.6f}")
    else:
        print(f"Error: {response['error']}")

    await client.close()

# if __name__ == "__main__": asyncio.run(basic_completion_example())
```
```python
import asyncio
from mcp.client import Client

async def document_analysis_example():
    # Assume Claude identifies a large document needing processing
    client = Client("http://localhost:8013")
    document = "... large document content ..." * 100  # Placeholder for large content

    print("Delegating document chunking...")
    # Step 1: Claude delegates document chunking (often a local, non-LLM task on server)
    chunks_response = await client.tools.chunk_document(
        document=document,
        chunk_size=1000,   # Target tokens per chunk
        overlap=100,       # Token overlap
        method="semantic"  # Use semantic chunking if available
    )

    if not chunks_response["success"]:
        print(f"Chunking failed: {chunks_response['error']}")
        await client.close()
        return

    print(f"Document divided into {chunks_response['chunk_count']} chunks.")

    # Step 2: Claude delegates summarization of each chunk to a cheaper model
    summaries = []
    total_cost = 0.0
    print("Delegating chunk summarization to gemini-2.0-flash-lite...")
    for i, chunk in enumerate(chunks_response["chunks"]):
        # Use Gemini Flash (much cheaper than Claude or GPT-4o) via the server
        summary_response = await client.tools.summarize_document(
            document=chunk,
            provider="gemini",  # Explicitly delegate to Gemini via server
            model="gemini-2.0-flash-lite",
            format="paragraph",
            max_length=150  # Request a concise summary
        )
        if summary_response["success"]:
            summaries.append(summary_response["summary"])
            cost = summary_response.get("cost", 0.0)
            total_cost += cost
            print(f"  Processed chunk {i+1}/{chunks_response['chunk_count']} summary. Cost: ${cost:.6f}")
        else:
            print(f"  Chunk {i+1} summarization failed: {summary_response['error']}")

    print("\nDelegating entity extraction to gpt-4.1-mini...")
    # Step 3: Claude delegates entity extraction for the whole document to another cheap model
    extracted_entities = []  # Default so the synthesis prompt below is still defined if extraction fails
    entities_response = await client.tools.extract_entities(
        document=document,  # Process the original document
        entity_types=["person", "organization", "location", "date", "product"],
        provider="openai",  # Delegate to OpenAI's cheaper model
        model="gpt-4.1-mini"
    )

    if entities_response["success"]:
        cost = entities_response.get("cost", 0.0)
        total_cost += cost
        print(f"Extracted entities. Cost: ${cost:.6f}")
        extracted_entities = entities_response['entities']
        # Claude would now process these summaries and entities using its advanced capabilities
        print(f"\nClaude can now use {len(summaries)} summaries and {len(extracted_entities)} entity groups.")
    else:
        print(f"Entity extraction failed: {entities_response['error']}")

    print(f"\nTotal estimated delegation cost for sub-tasks: ${total_cost:.6f}")

    # Claude might perform final synthesis using the collected results
    final_synthesis_prompt = f"""
    Synthesize the key information from the following summaries and entities extracted from a large document.
    Focus on the main topics, key people involved, and significant events mentioned.

    Summaries: {' '.join(summaries)}

    Entities: {extracted_entities}

    Provide a concise final report.
    """
    # This final step would likely use Claude itself (not shown here)

    await client.close()

# if __name__ == "__main__": asyncio.run(document_analysis_example())
```
```python
import asyncio
from mcp.client import Client

async def browser_research_example():
    client = Client("http://localhost:8013")

    print("Starting browser-based research task...")
    # This tool likely orchestrates multiple browser actions (search, navigate, scrape)
    # and uses an LLM (specified or default) for synthesis.
    result = await client.tools.research_and_synthesize_report(
        topic="Latest advances in AI-powered drug discovery using graph neural networks",
        instructions={
            "search_query": "graph neural networks drug discovery 2024 research",
            "search_engines": ["google", "duckduckgo"],  # Use multiple search engines
            "urls_to_include": ["nature.com", "sciencemag.org", "arxiv.org", "pubmed.ncbi.nlm.nih.gov"],  # Prioritize these domains
            "max_urls_to_process": 7,     # Limit the number of pages to visit/scrape
            "min_content_length": 500,    # Ignore pages with very little content
            "focus_areas": ["novel molecular structures", "binding affinity prediction", "clinical trial results"],  # Guide the synthesis
            "report_format": "markdown",  # Desired output format
            "report_length": "detailed",  # comprehensive, detailed, summary
            "llm_model": "anthropic/claude-3-5-sonnet-20241022"  # Specify LLM for synthesis
        }
    )

    if result["success"]:
        print("\nResearch report generated successfully!")
        print(f"Processed {len(result.get('extracted_data', []))} sources.")
        print(f"Total processing time: {result.get('processing_time', 'N/A'):.2f}s")
        print(f"Estimated cost: ${result.get('total_cost', 0.0):.6f}")  # Includes LLM synthesis cost
        print("\n--- Research Report ---")
        print(result['report'])
        print("-----------------------")
    else:
        print(f"\nBrowser research failed: {result.get('error', 'Unknown error')}")
        if 'details' in result:
            print(f"Details: {result['details']}")

    await client.close()

# if __name__ == "__main__": asyncio.run(browser_research_example())
```
```python
import asyncio
from mcp.client import Client
import uuid

async def cognitive_memory_example():
    client = Client("http://localhost:8013")

    # Generate a unique ID for this session/workflow if not provided
    workflow_id = str(uuid.uuid4())
    print(f"Using Workflow ID: {workflow_id}")

    print("\nCreating a workflow context...")
    # Create a workflow context to group related memories and actions
    workflow_response = await client.tools.create_workflow(
        workflow_id=workflow_id,
        title="Quantum Computing Investment Analysis",
        description="Analyzing the impact of quantum computing on financial markets.",
        goal="Identify potential investment opportunities or risks."
    )
    if not workflow_response["success"]:
        print(f"Error creating workflow: {workflow_response['error']}")

    print("\nRecording an agent action...")
    # Record the start of a research action
    action_response = await client.tools.record_action_start(
        workflow_id=workflow_id,
        action_type="research",
        title="Initial literature review on quantum algorithms in finance",
        reasoning="Need to understand the current state-of-the-art before assessing impact."
    )
    action_id = action_response.get("action_id") if action_response["success"] else None
    if not action_id:
        print(f"Error starting action: {action_response['error']}")

    print("\nStoring facts in semantic memory...")
    # Store some key facts discovered during research
    memory1 = await client.tools.store_memory(
        workflow_id=workflow_id,
        content="Shor's algorithm can break RSA encryption, posing a threat to current financial security.",
        memory_type="fact",
        memory_level="semantic",
        importance=9.0,
        tags=["quantum_algorithm", "cryptography", "risk", "shor"]
    )
    memory2 = await client.tools.store_memory(
        workflow_id=workflow_id,
        content="Quantum annealing (e.g., D-Wave) shows promise for portfolio optimization problems.",
        memory_type="fact",
        memory_level="semantic",
        importance=7.5,
        tags=["quantum_computing", "finance", "optimization", "annealing"]
    )
    if memory1["success"]:
        print(f"Stored memory ID: {memory1['memory_id']}")
    if memory2["success"]:
        print(f"Stored memory ID: {memory2['memory_id']}")

    print("\nStoring an observation (episodic memory)...")
    # Store an observation from a specific event/document
    obs_memory = await client.tools.store_memory(
        workflow_id=workflow_id,
        content="Read Nature article (doi:...) suggesting experimental quantum advantage in a specific financial modeling task.",
        memory_type="observation",
        memory_level="episodic",
        importance=8.0,
        source="Nature Article XYZ",
        timestamp="2024-07-20T10:00:00Z",  # Example timestamp
        tags=["research_finding", "publication", "finance_modeling"]
    )
    if obs_memory["success"]:
        print(f"Stored episodic memory ID: {obs_memory['memory_id']}")

    print("\nSearching for relevant memories...")
    # Search for memories related to financial risks
    search_results = await client.tools.hybrid_search_memories(
        workflow_id=workflow_id,
        query="What are the financial risks associated with quantum computing?",
        top_k=5,
        memory_type="fact",   # Search for facts first
        semantic_weight=0.7,
        keyword_weight=0.3    # Example weighting for hybrid search
    )
    if search_results["success"]:
        print(f"Found {len(search_results['results'])} relevant memories:")
        for res in search_results["results"]:
            print(f"  - Score: {res['score']:.4f}, ID: {res['memory_id']}, Content: {res['content'][:80]}...")
    else:
        print(f"Memory search failed: {search_results['error']}")

    print("\nGenerating a reflection based on stored memories...")
    # Generate insights or reflections based on the accumulated knowledge in the workflow
    reflection_response = await client.tools.generate_reflection(
        workflow_id=workflow_id,
        reflection_type="summary_and_next_steps",  # e.g., insights, risks, opportunities
        context_query="Summarize the key findings about quantum finance impact and suggest next research actions."
    )
    if reflection_response["success"]:
        print("Generated Reflection:")
        print(reflection_response["reflection"])
    else:
        print(f"Reflection generation failed: {reflection_response['error']}")

    # Mark the action as completed (assuming research phase is done)
    if action_id:
        print("\nCompleting the research action...")
        await client.tools.record_action_end(
            workflow_id=workflow_id,
            action_id=action_id,
            status="completed",
            outcome="Gathered initial understanding of quantum algorithms in finance and associated risks."
        )

    await client.close()

# if __name__ == "__main__": asyncio.run(cognitive_memory_example())
```
```python
import asyncio
from mcp.client import Client
import os

async def excel_automation_example():
    client = Client("http://localhost:8013")

    output_dir = "excel_outputs"
    os.makedirs(output_dir, exist_ok=True)
    output_path = os.path.join(output_dir, "financial_model.xlsx")

    print(f"Requesting creation of Excel financial model at {output_path}...")
    # Example: Create a financial model using natural language instructions
    create_result = await client.tools.excel_execute(
        instruction="Create a simple 3-year financial projection.\n"
                    "Sheet name: 'Projections'.\n"
                    "Columns: Year 1, Year 2, Year 3.\n"
                    "Rows: Revenue, COGS, Gross Profit, Operating Expenses, Net Income.\n"
                    "Data: Start Revenue at $100,000, grows 20% annually.\n"
                    "COGS is 40% of Revenue.\n"
                    "Operating Expenses start at $30,000, grow 10% annually.\n"
                    "Calculate Gross Profit (Revenue - COGS) and Net Income (Gross Profit - OpEx).\n"
                    "Format currency as $#,##0. Apply bold headers and add a light blue fill to the header row.",
        file_path=output_path,     # Server needs write access to this path/directory if relative
        operation_type="create",   # create, modify, analyze, format
        # sheet_name="Projections",  # Can specify sheet if modifying
        # cell_range="A1:D6",        # Can specify range
        show_excel=False           # Run Excel in the background (if applicable on the server)
    )

    if create_result["success"]:
        print(f"Excel creation successful: {create_result['message']}")
        print(f"File saved at: {create_result.get('output_file_path', output_path)}")  # Confirm output path

        # Example: Modify the created file - add a chart
        print("\nRequesting modification: Add a Revenue chart...")
        modify_result = await client.tools.excel_execute(
            instruction="Add a column chart showing Revenue for Year 1, Year 2, Year 3. "
                        "Place it below the table. Title the chart 'Revenue Projection'.",
            file_path=output_path,    # Use the previously created file
            operation_type="modify",
            sheet_name="Projections"  # Specify the sheet to modify
        )
        if modify_result["success"]:
            print(f"Excel modification successful: {modify_result['message']}")
            print(f"File updated at: {modify_result.get('output_file_path', output_path)}")
        else:
            print(f"Excel modification failed: {modify_result['error']}")
    else:
        print(f"Excel creation failed: {create_result['error']}")
        if 'details' in create_result:
            print(f"Details: {create_result['details']}")

    # Example: Analyze formulas (if the tool supports it)
    # analysis_result = await client.tools.excel_analyze_formulas(...)

    await client.close()

# if __name__ == "__main__": asyncio.run(excel_automation_example())
```
```python
import asyncio
from mcp.client import Client

async def multi_provider_completion_example():
    client = Client("http://localhost:8013")

    prompt = "Explain the concept of 'Chain of Thought' prompting for Large Language Models."
    print(f"Requesting completions for prompt: '{prompt}' from multiple providers...")

    # Request the same prompt from different models/providers
    multi_response = await client.tools.multi_completion(
        prompt=prompt,
        providers=[
            {"provider": "openai", "model": "gpt-4.1-mini", "temperature": 0.5},
            {"provider": "anthropic", "model": "claude-3-5-sonnet-20241022", "temperature": 0.5},
            {"provider": "gemini", "model": "gemini-2.0-pro", "temperature": 0.5},
            # {"provider": "deepseek", "model": "deepseek-chat", "temperature": 0.5},  # Add others if configured
        ],
        # Common parameters applied to all if not specified per provider
        max_tokens=300
    )

    if multi_response["success"]:
        print("\n--- Multi-completion Results ---")
        total_cost = multi_response.get("total_cost", 0.0)
        print(f"Total Estimated Cost: ${total_cost:.6f}\n")
        for provider_key, result in multi_response["results"].items():
            print(f"--- Provider: {provider_key} ---")
            if result["success"]:
                print(f"  Model: {result.get('model', 'N/A')}")
                print(f"  Cost: ${result.get('cost', 0.0):.6f}")
                print(f"  Tokens: Input={result.get('input_tokens', 'N/A')}, Output={result.get('output_tokens', 'N/A')}")
                print(f"  Completion:\n{result['completion']}\n")
            else:
                print(f"  Error: {result['error']}\n")
        print("------------------------------")
        # An agent could now analyze these responses for consistency, detail, accuracy etc.
    else:
        print(f"\nMulti-completion request failed: {multi_response['error']}")

    await client.close()

# if __name__ == "__main__": asyncio.run(multi_provider_completion_example())
```
```python
import asyncio
from mcp.client import Client

async def optimized_workflow_example():
    client = Client("http://localhost:8013")

    # Example document to process through the workflow
    document_content = """
    Project Alpha Report - Q3 2024
    Lead: Dr. Evelyn Reed ([email protected])
    Status: On Track
    Budget: $50,000 remaining. Spent $25,000 this quarter.
    Key Findings: Successful prototype development (v0.8). User testing feedback positive.
    Next Steps: Finalize documentation, prepare for Q4 deployment. Target date: 2024-11-15.
    Risks: Potential delay due to supplier issues for component X. Mitigation plan in place.
    """

    print("Defining a multi-stage workflow...")
    # Define a workflow with stages, dependencies, and provider preferences
    # Use ${stage_id.output_key} to pass outputs between stages
    workflow_definition = [
        {
            "stage_id": "summarize_report",
            "tool_name": "summarize_document",
            "params": {
                "document": document_content,
                "format": "bullet_points",
                "max_length": 100,
                # Let the server choose a cost-effective model for summarization
                "provider_preference": "cost",  # 'cost', 'quality', 'speed', or specific like 'openai/gpt-4.1-mini'
            }
            # No 'depends_on', runs first
            # Default output key is 'summary' for this tool, access via ${summarize_report.summary}
        },
        {
            "stage_id": "extract_key_info",
            "tool_name": "extract_json",  # Use JSON extraction for structured data
            "params": {
                "document": document_content,
                "json_schema": {
                    "type": "object",
                    "properties": {
                        "project_lead": {"type": "string"},
                        "lead_email": {"type": "string", "format": "email"},
                        "status": {"type": "string"},
                        "budget_remaining": {"type": "string"},
                        "next_milestone_date": {"type": "string", "format": "date"}
                    },
                    "required": ["project_lead", "status", "next_milestone_date"]
                },
                # Prefer a model known for good structured data extraction, balancing cost
                "provider_preference": "quality",  # Prioritize quality for extraction
                "preferred_models": ["openai/gpt-4o", "anthropic/claude-3-5-sonnet-20241022"]  # Suggest specific models
            }
        },
        {
            "stage_id": "generate_follow_up_questions",
            "tool_name": "generate_qa",  # Assuming a tool that generates questions
            "depends_on": ["summarize_report"],  # Needs the summary first
            "params": {
                # Use the summary from the first stage as input
                "document": "${summarize_report.summary}",
                "num_questions": 3,
                "provider_preference": "speed"  # Use a fast model for question generation
            }
            # Default output key 'qa_pairs', access via ${generate_follow_up_questions.qa_pairs}
        }
    ]

    print("Executing the optimized workflow...")
    # Execute the workflow - the server handles dependencies and model selection
    results = await client.tools.execute_optimized_workflow(
        workflow=workflow_definition
        # Can also pass initial documents if workflow steps reference 'original_document'
        # documents = {"report.txt": document_content}
    )

    if results["success"]:
        print("\nWorkflow executed successfully!")
        print(f"  Total processing time: {results.get('processing_time', 'N/A'):.2f}s")
        print(f"  Total estimated cost: ${results.get('total_cost', 0.0):.6f}\n")
        print("--- Stage Outputs ---")
        for stage_id, output in results.get("stage_outputs", {}).items():
            print(f"Stage: {stage_id}")
            if output["success"]:
                print(f"  Provider/Model Used: {output.get('provider', 'N/A')}/{output.get('model', 'N/A')}")
                print(f"  Cost: ${output.get('cost', 0.0):.6f}")
                print(f"  Output: {output.get('result', 'N/A')}")  # Access the primary result
                # You might access specific keys like output.get('result', {}).get('summary') etc.
            else:
                print(f"  Error: {output.get('error', 'Unknown error')}")
            print("-" * 20)
    else:
        print(f"\nWorkflow execution failed: {results.get('error', 'Unknown error')}")
        if 'details' in results:
            print(f"Details: {results['details']}")

    await client.close()

# if __name__ == "__main__": asyncio.run(optimized_workflow_example())
```
```python
import asyncio
from mcp.client import Client
# import networkx as nx              # To process the graph data if needed
# import matplotlib.pyplot as plt    # To visualize

async def entity_graph_example():
    client = Client("http://localhost:8013")

    document_text = """
    Meta Platforms, Inc., led by CEO Mark Zuckerberg, announced a partnership with IBM
    on developing new AI hardware accelerators. The collaboration aims to challenge Nvidia's dominance.
    IBM, headquartered in Armonk, New York, brings its deep expertise in semiconductor design.
    The project, codenamed 'Synergy', is expected to yield results by late 2025.
    """

    print("Extracting entity relationships from text...")
    # Request extraction of entities and their relationships
    entity_graph_response = await client.tools.extract_entity_relations(
        document=document_text,
        entity_types=["organization", "person", "location", "date", "project"],  # Specify desired entity types
        relationship_types=["led_by", "partnership_with", "aims_to_challenge", "headquartered_in", "expected_by"],  # Specify relationship types
        # Optional parameters:
        # provider_preference="quality",  # Choose model strategy
        # llm_model="anthropic/claude-3-5-sonnet-20241022",  # Suggest a specific model
        include_visualization=False  # Set True to request image data if tool supports it
    )

    if entity_graph_response["success"]:
        print("Entity relationship extraction successful.")
        print(f"Estimated Cost: ${entity_graph_response.get('cost', 0.0):.6f}")

        # The graph data might be in various formats (e.g., node-link list, adjacency list)
        graph_data = entity_graph_response.get("graph_data")
        print("\n--- Graph Data (Nodes & Edges) ---")
        print(graph_data)
        print("------------------------------------")

        # Example: Query the extracted graph using another tool or LLM call
        # (Assuming a separate query tool or using a general completion tool)
        print("\nQuerying the extracted graph (example)...")
        query_prompt = f"""
        Based on the following graph data representing relationships extracted from a text:
        {graph_data}

        Answer the question: Who is the CEO of Meta Platforms, Inc.?
        """
        query_response = await client.tools.completion(
            prompt=query_prompt,
            provider="openai",
            model="gpt-4.1-mini",
            max_tokens=50
        )
        if query_response["success"]:
            print(f"Graph Query Answer: {query_response['completion']}")
        else:
            print(f"Graph query failed: {query_response['error']}")
    else:
        print(f"Entity relationship extraction failed: {entity_graph_response.get('error', 'Unknown error')}")

    await client.close()

# if __name__ == "__main__": asyncio.run(entity_graph_example())
```
```python
import asyncio
from mcp.client import Client

async def document_chunking_example():
    client = Client("http://localhost:8013")

    large_document = """
    This is the first paragraph of a potentially very long document. It discusses various concepts.
    The second paragraph continues the discussion, adding more details and nuances.
    Proper chunking is crucial for processing large texts with Large Language Models,
    especially those with limited context windows. Different strategies exist, such as
    fixed token size, sentence splitting, or more advanced semantic chunking that tries
    to keep related ideas together. Overlap between chunks helps maintain context across boundaries.

    This paragraph is intentionally made longer to demonstrate how chunking might split it.
    It keeps going and going, describing the benefits of effective text splitting for
    downstream tasks like summarization, question answering, and retrieval-augmented generation (RAG).
    The goal is to create manageable pieces of text that still retain coherence.
    Semantic chunking often uses embedding models to find natural breakpoints in the text's meaning,
    potentially leading to better results than simple fixed-size chunks.
    The final sentence of this example paragraph.
    """ * 5  # Make it a bit longer for demonstration

    print("Requesting document chunking...")
    # Request chunking using a specific method and size
    chunking_response = await client.tools.chunk_document(
        document=large_document,
        chunk_size=100,    # Target size in tokens (approximate)
        overlap=20,        # Token overlap between consecutive chunks
        method="semantic"  # Options: "token", "sentence", "semantic", "structural" (if available)
    )

    if chunking_response["success"]:
        print(f"Document successfully divided into {chunking_response['chunk_count']} chunks.")
        print(f"Method Used: {chunking_response.get('method_used', 'N/A')}")  # Confirm method if returned
        print("\n--- Example Chunks ---")
        for i, chunk in enumerate(chunking_response['chunks'][:3]):  # Show first 3 chunks
            print(f"Chunk {i+1} (Length: {len(chunk)} chars):")
            print(f"'{chunk}'\n")
        if chunking_response['chunk_count'] > 3:
            print("...")
        print("----------------------")
        # These chunks can now be passed individually to other tools (e.g., summarize_document)
    else:
        print(f"Document chunking failed: {chunking_response['error']}")

    await client.close()

# if __name__ == "__main__": asyncio.run(document_chunking_example())
```
```python
import asyncio
from mcp.client import Client

async def multi_provider_completion_example():
    client = Client("http://localhost:8013")

    prompt = "What are the main benefits of using the Model Context Protocol (MCP)?"
    print(f"Requesting completions for prompt: '{prompt}' from multiple providers...")

    multi_response = await client.tools.multi_completion(
        prompt=prompt,
        providers=[
            {"provider": "openai", "model": "gpt-4.1-mini"},
            {"provider": "anthropic", "model": "claude-3-5-haiku-20241022"},
            {"provider": "gemini", "model": "gemini-2.0-flash-lite"}
            # Add more configured providers as needed
        ],
        temperature=0.5,
        max_tokens=250
    )

    if multi_response["success"]:
        print("\n--- Multi-completion Results ---")
        total_cost = multi_response.get("total_cost", 0.0)
        print(f"Total Estimated Cost: ${total_cost:.6f}\n")
        for provider_key, result in multi_response["results"].items():
            print(f"--- Provider: {provider_key} ---")
            if result["success"]:
                print(f"  Model: {result.get('model', 'N/A')}")
                print(f"  Cost: ${result.get('cost', 0.0):.6f}")
                print(f"  Completion:\n{result['completion']}\n")
            else:
                print(f"  Error: {result['error']}\n")
        print("------------------------------")
    else:
        print(f"\nMulti-completion request failed: {multi_response['error']}")

    await client.close()

# if __name__ == "__main__": asyncio.run(multi_provider_completion_example())
```
```python
import asyncio
from mcp.client import Client
import json

async def json_extraction_example():
    client = Client("http://localhost:8013")

    text_with_data = """
    Meeting Minutes - Project Phoenix - 2024-07-21
    Attendees: Alice (Lead), Bob (Dev), Charlie (QA)
    Date: July 21, 2024
    Project ID: PX-001

    Discussion Points:
    - Reviewed user feedback from v1.1 testing. Mostly positive.
    - Identified performance bottleneck in data processing module. Bob to investigate. Assigned High priority.
    - QA cycle for v1.2 planned to start next Monday (2024-07-29). Charlie confirmed readiness.

    Action Items:
    1. Bob: Investigate performance issue. Due: 2024-07-26. Priority: High. Status: Open.
    2. Alice: Prepare v1.2 release notes. Due: 2024-07-28. Priority: Medium. Status: Open.
    """

    # Define the desired JSON structure (schema)
    desired_schema = {
        "type": "object",
        "properties": {
            "project_name": {"type": "string", "description": "Name of the project"},
            "meeting_date": {"type": "string", "format": "date", "description": "Date of the meeting"},
            "attendees": {"type": "array", "items": {"type": "string"}, "description": "List of attendee names"},
            "action_items": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "task": {"type": "string"},
                        "assigned_to": {"type": "string"},
                        "due_date": {"type": "string", "format": "date"},
                        "priority": {"type": "string", "enum": ["Low", "Medium", "High"]},
                        "status": {"type": "string", "enum": ["Open", "In Progress", "Done"]}
                    },
                    "required": ["task", "assigned_to", "due_date", "priority", "status"]
                }
            }
        },
        "required": ["project_name", "meeting_date", "attendees", "action_items"]
    }

    print("Requesting JSON extraction based on schema...")
    # Request extraction using a model capable of following JSON schema instructions
    json_response = await client.tools.extract_json(
        document=text_with_data,
        json_schema=desired_schema,
        provider="openai",  # OpenAI models are generally good at this
        model="gpt-4o",     # Use a capable model like GPT-4o or Claude 3.5 Sonnet
        # provider_preference="quality"  # Could also use preference
    )

    if json_response["success"]:
        print("JSON extraction successful.")
        print(f"Estimated Cost: ${json_response.get('cost', 0.0):.6f}")

        # The extracted data should conform to the schema
        extracted_json_data = json_response.get('json_data')
        print("\n--- Extracted JSON Data ---")
        # Pretty print the JSON
        print(json.dumps(extracted_json_data, indent=2))
        print("---------------------------")

        # Optionally, validate the output against the schema client-side (requires jsonschema library)
        # try:
        #     from jsonschema import validate
        #     validate(instance=extracted_json_data, schema=desired_schema)
        #     print("\nClient-side validation successful: Output matches schema.")
        # except ImportError:
        #     print("\n(Install jsonschema to perform client-side validation)")
        # except Exception as e:
        #     print(f"\nClient-side validation failed: {e}")
    else:
        print(f"JSON Extraction Error: {json_response.get('error', 'Unknown error')}")
        if 'details' in json_response:
            print(f"Details: {json_response['details']}")

    await client.close()

# if __name__ == "__main__": asyncio.run(json_extraction_example())
```
```python
import asyncio
from mcp.client import Client

async def rag_query_example():
    # This example assumes the Ultimate MCP Server has been configured with a RAG pipeline,
    # including a vector store/index containing relevant documents.
    client = Client("http://localhost:8013")

    query = "What are the latest treatment options for mitigating Alzheimer's disease according to recent studies?"
    print(f"Performing RAG query: '{query}'...")

    # Call the RAG tool, which handles retrieval and generation
    rag_response = await client.tools.rag_query(  # Assuming the tool name is 'rag_query'
        query=query,
        # Optional parameters to control the RAG process:
        index_name="medical_research_papers",  # Specify the index/collection to search
        top_k=3,  # Retrieve top 3 most relevant documents/chunks
        # filter={"year": {"$gte": 2023}},  # Example filter (syntax depends on vector store)
        # generation_model={"provider": "anthropic", "model": "claude-3-5-sonnet-20241022"},  # Specify generation model
        # instruction_prompt="Based on the provided context, answer the user's query concisely."  # Customize generation prompt
    )

    if rag_response["success"]:
        print("\nRAG query successful.")
        print(f"Estimated Cost: ${rag_response.get('cost', 0.0):.6f}")  # Includes retrieval + generation cost
        print("\n--- Generated Answer ---")
        print(rag_response.get('answer', 'No answer generated.'))
        print("------------------------")

        # The response might also include details about the retrieved sources
        retrieved_sources = rag_response.get('sources', [])
        if retrieved_sources:
            print("\n--- Retrieved Sources ---")
            for i, source in enumerate(retrieved_sources):
                print(f"Source {i+1}:")
                print(f"  ID: {source.get('id', 'N/A')}")
                print(f"  Score: {source.get('score', 'N/A'):.4f}")
                # Depending on RAG setup, might include metadata or text snippet
                print(f"  Content Snippet: {source.get('text', '')[:150]}...")
                print("-" * 15)
            print("-----------------------")
        else:
            print("\nNo sources information provided in the response.")
    else:
        print(f"\nRAG Query Error: {rag_response.get('error', 'Unknown error')}")
        if 'details' in rag_response:
            print(f"Details: {rag_response['details']}")

    await client.close()

# if __name__ == "__main__": asyncio.run(rag_query_example())
```
```python
import asyncio
from mcp.client import Client

async def fused_search_example():
    # This example assumes the server is configured with a hybrid search provider like Marqo.
    client = Client("http://localhost:8013")

    query = "impact of AI on software development productivity and code quality"
    print(f"Performing fused search for: '{query}'...")

    # Call the fused search tool
    fused_search_response = await client.tools.fused_search(  # Assuming tool name is 'fused_search'
        query=query,
        # --- Parameters specific to the hybrid search backend (e.g., Marqo) ---
        index_name="tech_articles_index",  # Specify the target index
        searchable_attributes=["title", "content"],  # Fields to search within
        limit=5,  # Number of results to return
        # Tunable weights for keyword vs. semantic relevance (example)
        hybrid_factors={"keyword_weight": 0.4, "semantic_weight": 0.6},
        # Optional filter string (syntax depends on backend)
        filter_string="publication_year >= 2023 AND source_type='journal'"
        # --------------------------------------------------------------------
    )

    if fused_search_response["success"]:
        print("\nFused search successful.")
        results = fused_search_response.get("results", [])
        print(f"Found {len(results)} hits.")
        if results:
            print("\n--- Search Results ---")
            for i, hit in enumerate(results):
                print(f"Result {i+1}:")
                # Fields depend on Marqo index structure and what's returned
                print(f"  ID: {hit.get('_id', 'N/A')}")
                print(f"  Score: {hit.get('_score', 'N/A'):.4f}")  # Combined score
                print(f"  Title: {hit.get('title', 'N/A')}")
                print(f"  Content Snippet: {hit.get('content', '')[:150]}...")
                # Print highlight info if available
                highlights = hit.get('_highlights', {})
                if highlights:
                    print(f"  Highlights: {highlights}")
                print("-" * 15)
            print("--------------------")
        else:
            print("No results found matching the criteria.")
    else:
        print(f"\nFused Search Error: {fused_search_response.get('error', 'Unknown error')}")
        if 'details' in fused_search_response:
            print(f"Details: {fused_search_response['details']}")

    await client.close()

# if __name__ == "__main__": asyncio.run(fused_search_example())
```
```python
import asyncio
from mcp.client import Client

async def local_text_processing_example():
    client = Client("http://localhost:8013")

    # Example assumes a tool named 'process_local_text' exists on the server
    # that bundles various non-LLM text operations.
    raw_text = " This text has EXTRA whitespace,\n\nmultiple newlines, \t tabs, and needs Case Normalization. "

    print("Requesting local text processing operations...")
    local_process_response = await client.tools.process_local_text(
        text=raw_text,
        operations=[
            {"action": "trim_whitespace"},       # Remove leading/trailing whitespace
            {"action": "normalize_whitespace"},  # Collapse multiple spaces/tabs to single space
            {"action": "remove_blank_lines"},    # Remove empty lines
            {"action": "lowercase"}              # Convert to lowercase
            # Other potential actions: uppercase, remove_punctuation, normalize_newlines, etc.
        ]
    )

    if local_process_response["success"]:
        print("\nLocal text processing successful.")
        print(f"Original Text:\n'{raw_text}'")
        print(f"\nProcessed Text:\n'{local_process_response['processed_text']}'")
        # Note: This operation should ideally have zero LLM cost.
        print(f"Cost: ${local_process_response.get('cost', 0.0):.6f}")
    else:
        print(f"\nLocal Text Processing Error: {local_process_response['error']}")

    await client.close()

# if __name__ == "__main__": asyncio.run(local_text_processing_example())
```
```python
import asyncio
from mcp.client import Client

async def browser_basic_interaction_example():
    # This example shows fundamental browser actions controlled by an agent
    client = Client("http://localhost:8013")

    print("--- Browser Automation: Basic Interaction ---")

    # 1. Initialize the browser (creates a browser instance on the server)
    print("\nInitializing browser (headless)...")
    # `headless=True` runs without a visible GUI window (common for automation)
    init_response = await client.tools.browser_init(headless=True, browser_type="chromium")
    if not init_response["success"]:
        print(f"Browser initialization failed: {init_response.get('error', 'Unknown error')}")
        await client.close()
        return
    print("Browser initialized successfully.")
    # Might return session ID if needed for subsequent calls, depends on tool design

    # 2. Navigate to a page
    target_url = "https://example.com/"
    print(f"\nNavigating to {target_url}...")
    # `wait_until` controls when navigation is considered complete
    nav_response = await client.tools.browser_navigate(
        url=target_url,
        wait_until="domcontentloaded"  # Options: load, domcontentloaded, networkidle, commit
    )
    if nav_response["success"]:
        print("Navigation successful.")
        print(f"  Current URL: {nav_response.get('url', 'N/A')}")
        print(f"  Page Title: {nav_response.get('title', 'N/A')}")
        # The 'snapshot' gives the agent context about the page state (accessibility tree)
        # print(f"  Snapshot: {nav_response.get('snapshot', 'N/A')}")
    else:
        print(f"Navigation failed: {nav_response.get('error', 'Unknown error')}")
        # Attempt to close browser even if navigation failed
        await client.tools.browser_close()
        await client.close()
        return

    # 3. Extract text content using a CSS selector
    selector = "h1"  # CSS selector for the main heading
    print(f"\nExtracting text from selector '{selector}'...")
    text_response = await client.tools.browser_get_text(selector=selector)
    if text_response["success"]:
        print(f"Extracted text: '{text_response.get('text', 'N/A')}'")
    else:
        print(f"Text extraction failed: {text_response.get('error', 'Unknown error')}")
        # Optionally check text_response['snapshot'] for page state at time of failure

    # 4. Take a screenshot (optional)
    print("\nTaking a screenshot...")
    screenshot_response = await client.tools.browser_screenshot(
        file_path="example_com_screenshot.png",  # Path where server saves the file
        full_page=False,     # Capture only the viewport
        image_format="png"   # png or jpeg
    )
    if screenshot_response["success"]:
        print(f"Screenshot saved successfully on server at: {screenshot_response.get('saved_path', 'N/A')}")
        # Agent might use this path with a filesystem tool to retrieve the image if needed
    else:
        print(f"Screenshot failed: {screenshot_response.get('error', 'Unknown error')}")

    # 5. Close the browser session
    print("\nClosing the browser...")
    close_response = await client.tools.browser_close()
    if close_response["success"]:
        print("Browser closed successfully.")
    else:
        # Log error, but might happen if browser already crashed
        print(f"Browser close failed (might be expected if previous steps failed): {close_response.get('error', 'Unknown error')}")

    print("--- Browser Automation Example Complete ---")
    await client.close()

# if __name__ == "__main__": asyncio.run(browser_basic_interaction_example())
```
````python
import asyncio
from mcp.client import Client
import json

async def model_tournament_example():
    client = Client("http://localhost:8013")

    # Define the task and prompt for the tournament
    task_prompt = "Write a Python function that takes a list of integers and returns a new list containing only the even numbers."

    # Optional: Provide ground truth for automated evaluation if the tool supports it
    ground_truth_code = """
def get_even_numbers(numbers):
    \"\"\"Returns a new list containing only the even numbers from the input list.\"\"\"
    return [num for num in numbers if num % 2 == 0]
"""

    print("Setting up and running a model tournament for code generation...")
    # Call the tournament tool
    tournament_response = await client.tools.run_model_tournament(
        task_type="code_generation",  # Helps select appropriate evaluation metrics
        prompt=task_prompt,
        # List of models/providers to compete
        competitors=[
            {"provider": "openai", "model": "gpt-4.1-mini", "temperature": 0.2},
            {"provider": "anthropic", "model": "claude-3-5-sonnet-20241022", "temperature": 0.2},
            {"provider": "deepseek", "model": "deepseek-coder", "temperature": 0.2},  # Specialized coder model
            {"provider": "gemini", "model": "gemini-2.0-pro", "temperature": 0.2},
        ],
        # Criteria for evaluating the generated code
        evaluation_criteria=["correctness", "efficiency", "readability", "docstring_quality"],
        # Provide ground truth if available for automated correctness checks
        ground_truth=ground_truth_code,
        # Optional: Specify an LLM to act as the judge for qualitative criteria
        evaluation_model={"provider": "anthropic", "model": "claude-3-opus-20240229"},  # Use a powerful model for judging
        num_rounds=1  # Run multiple rounds for stability if needed
    )

    if tournament_response["success"]:
        print("\n--- Model Tournament Results ---")
        print(f"Task Prompt: {task_prompt}")
        print(f"Total Estimated Cost: ${tournament_response.get('total_cost', 0.0):.6f}\n")

        # Display the ranking
        ranking = tournament_response.get("ranking", [])
        if ranking:
            print("Overall Ranking:")
            for i, result in enumerate(ranking):
                provider = result.get('provider', 'N/A')
                model = result.get('model', 'N/A')
                score = result.get('overall_score', 'N/A')
                cost = result.get('cost', 0.0)
                print(f"  {i+1}. {provider}/{model} - Score: {score:.2f}/10 - Cost: ${cost:.6f}")
        else:
            print("No ranking information available.")

        # Display detailed results for each competitor
        detailed_results = tournament_response.get("results", {})
        if detailed_results:
            print("\nDetailed Scores per Competitor:")
            for competitor_key, details in detailed_results.items():
                print(f"  Competitor: {competitor_key}")
                print(f"    Generated Code:\n```python\n{details.get('output', 'N/A')}\n```")
                scores = details.get('scores', {})
                if scores:
                    for criterion, score_value in scores.items():
                        print(f"    - {criterion}: {score_value}")
                print("-" * 10)
        print("------------------------------")
    else:
        print(f"\nModel Tournament Failed: {tournament_response.get('error', 'Unknown error')}")
        if 'details' in tournament_response:
            print(f"Details: {tournament_response['details']}")

    await client.close()

# if __name__ == "__main__": asyncio.run(model_tournament_example())
````
```python
import asyncio
from mcp.client import Client
import json

async def meta_tools_example():
    client = Client("http://localhost:8013")
    print("--- Meta Tools Example ---")

    # 1. List all available tools
    print("\nFetching list of available tools...")
    # Assumes a tool named 'list_tools' provides this info
    list_tools_response = await client.tools.list_tools(include_schemas=False)  # Set True for full schemas
    if list_tools_response["success"]:
        tools = list_tools_response.get("tools", {})
        print(f"Found {len(tools)} available tools:")
        for tool_name, tool_info in tools.items():
            description = tool_info.get('description', 'No description available.')
            print(f"  - {tool_name}: {description[:100]}...")  # Print truncated description
    else:
        print(f"Failed to list tools: {list_tools_response.get('error', 'Unknown error')}")

    # 2. Get detailed information about a specific tool
    tool_to_inspect = "extract_json"
    print(f"\nFetching details for tool: '{tool_to_inspect}'...")
    # Assumes a tool like 'get_tool_info' or using list_tools with specific name/schema flag
    tool_info_response = await client.tools.list_tools(tool_names=[tool_to_inspect], include_schemas=True)
    if tool_info_response["success"] and tool_to_inspect in tool_info_response.get("tools", {}):
        tool_details = tool_info_response["tools"][tool_to_inspect]
        print(f"\nDetails for '{tool_to_inspect}':")
        print(f"  Description: {tool_details.get('description', 'N/A')}")
        # Print the parameter schema if available
        schema = tool_details.get('parameters', {}).get('json_schema', {})
        if schema:
            print(f"  Parameter Schema:\n{json.dumps(schema, indent=2)}")
        else:
            print("  Parameter Schema: Not available.")
    else:
        print(f"Failed to get info for tool '{tool_to_inspect}': {tool_info_response.get('error', 'Not found or error')}")

    # 3. Get tool recommendations for a task (if such a meta tool exists)
    task_description = "Read data from a PDF file, extract tables, and save them as CSV."
    print(f"\nGetting tool recommendations for task: '{task_description}'...")
    # Assumes a tool like 'get_tool_recommendations'
    recommendations_response = await client.tools.get_tool_recommendations(
        task=task_description,
        constraints={"priority": "accuracy", "max_cost_per_doc": 0.10}  # Example constraints
    )
    if recommendations_response["success"]:
        print("Recommended Tool Workflow:")
        recommendations = recommendations_response.get("recommendations", [])
        if recommendations:
            for i, step in enumerate(recommendations):
                print(f"  Step {i+1}: Tool='{step.get('tool', 'N/A')}' - Reason: {step.get('reason', 'N/A')}")
        else:
            print("  No recommendations provided.")
    else:
        print(f"Failed to get recommendations: {recommendations_response.get('error', 'Unknown error')}")

    print("\n--- Meta Tools Example Complete ---")
    await client.close()

# if __name__ == "__main__": asyncio.run(meta_tools_example())
```
```python
import asyncio
from mcp.client import Client
import json

async def local_cli_tool_example():
    client = Client("http://localhost:8013")
    print("--- Local CLI Tool Example (jq) ---")

    # Example JSON data to be processed by jq
    json_input_data = json.dumps({
        "users": [
            {"id": 1, "name": "Alice", "email": "alice@example.com", "status": "active"},
            {"id": 2, "name": "Bob", "email": "bob@example.com", "status": "inactive"},
            {"id": 3, "name": "Charlie", "email": "charlie@example.com", "status": "active"}
        ],
        "metadata": {"timestamp": "2024-07-21T12:00:00Z"}
    })

    # Define the jq filter to apply
    # This filter selects active users and outputs their name and email
    jq_filter = '.users[] | select(.status=="active") | {name: .name, email: .email}'

    print(f"\nRunning jq with filter: '{jq_filter}' on input JSON...")

    # Call the server tool that wraps jq (e.g., 'run_jq')
    jq_result = await client.tools.run_jq(
        args_str=jq_filter,          # Pass the filter as arguments (check tool spec how it expects filters)
        input_data=json_input_data,  # Provide the JSON string as input
        # Additional options might be available depending on the tool wrapper:
        # e.g., output_format="json_lines" or "compact_json"
    )

    if jq_result["success"]:
        print("jq execution successful.")
        # stdout typically contains the result of the jq filter
        print("\n--- jq Output (stdout) ---")
        print(jq_result.get("stdout", "No output"))
        print("--------------------------")
        # stderr might contain warnings or errors from jq itself
        stderr_output = jq_result.get("stderr")
        if stderr_output:
            print("\n--- jq Stderr ---")
            print(stderr_output)
            print("-----------------")
        # This should have minimal or zero cost as it runs locally on the server
        print(f"\nCost: ${jq_result.get('cost', 0.0):.6f}")
    else:
        print(f"\njq Execution Error: {jq_result.get('error', 'Unknown error')}")
        print(f"Stderr: {jq_result.get('stderr', 'N/A')}")

    print("\n--- Local CLI Tool Example Complete ---")
    await client.close()

# if __name__ == "__main__": asyncio.run(local_cli_tool_example())
```
```python
import asyncio
from mcp.client import Client
import json

async def dynamic_api_example():
    # This example assumes the server has tools like 'register_api', 'list_registered_apis',
    # 'call_dynamic_tool', and 'unregister_api'.
    client = Client("http://localhost:8013")
    print("--- Dynamic API Integration Example ---")

    # 1. Register an external API using its OpenAPI (Swagger) specification URL
    api_name_to_register = "public_cat_facts"
    openapi_spec_url = "https://catfact.ninja/docs/api-docs.json"  # Example public API spec

    print(f"\nRegistering API '{api_name_to_register}' from {openapi_spec_url}...")
    register_response = await client.tools.register_api(
        api_name=api_name_to_register,
        openapi_url=openapi_spec_url,
        # Optional: Provide authentication details if needed (e.g., Bearer token, API Key)
        # authentication={"type": "bearer", "token": "your_api_token"},
        # Optional: Set default headers
        # default_headers={"X-Custom-Header": "value"},
        # Optional: Cache settings for API responses (if tool supports it)
        cache_ttl=300  # Cache responses for 5 minutes
    )

    if register_response["success"]:
        print(f"API '{api_name_to_register}' registered successfully.")
        print(f"  Registered {register_response.get('tools_count', 0)} new MCP tools derived from the API.")
        print(f"  Tools Registered: {register_response.get('tools_registered', [])}")
    else:
        print(f"API registration failed: {register_response.get('error', 'Unknown error')}")
        await client.close()
        return

    # 2. List currently registered dynamic APIs
    print("\nListing registered dynamic APIs...")
    list_apis_response = await client.tools.list_registered_apis()
    if list_apis_response["success"]:
        registered_apis = list_apis_response.get("apis", {})
        print(f"Currently registered APIs: {list(registered_apis.keys())}")
        # print(json.dumps(registered_apis, indent=2))  # Print full details
    else:
        print(f"Failed to list registered APIs: {list_apis_response.get('error', 'Unknown error')}")

    # 3. Call a dynamically created tool corresponding to an API endpoint
    # The tool name is typically derived from the API name and endpoint's operationId or path.
    # Check the 'tools_registered' list from step 1 or documentation for the exact name.
    # Let's assume the tool for GET /fact is 'public_cat_facts_getFact'
    dynamic_tool_name = "public_cat_facts_getFact"  # Adjust based on actual registered name
    print(f"\nCalling dynamic tool '{dynamic_tool_name}'...")
    call_response = await client.tools.call_dynamic_tool(
        tool_name=dynamic_tool_name,
        # Provide inputs matching the API endpoint's parameters
        inputs={
            # Example query parameter for GET /fact (check API spec)
            "max_length": 100
        }
    )

    if call_response["success"]:
        print("Dynamic tool call successful.")
        # The result usually contains the API's response body and status code
        print(f"  Status Code: {call_response.get('status_code', 'N/A')}")
        print(f"  Response Body:\n{json.dumps(call_response.get('response_body', {}), indent=2)}")
    else:
        print(f"Dynamic tool call failed: {call_response.get('error', 'Unknown error')}")
        print(f"  Status Code: {call_response.get('status_code', 'N/A')}")
        print(f"  Response Body: {call_response.get('response_body', 'N/A')}")

    # 4. Unregister the API when no longer needed (optional cleanup)
    print(f"\nUnregistering API '{api_name_to_register}'...")
    unregister_response = await client.tools.unregister_api(api_name=api_name_to_register)
    if unregister_response["success"]:
        print(f"API unregistered successfully. Removed {unregister_response.get('tools_count', 0)} tools.")
    else:
        print(f"API unregistration failed: {unregister_response.get('error', 'Unknown error')}")

    print("\n--- Dynamic API Integration Example Complete ---")
    await client.close()

# if __name__ == "__main__": asyncio.run(dynamic_api_example())
```
```python
import asyncio
from mcp.client import Client
import os

async def ocr_example():
    # Requires 'ocr' extras installed: uv pip install -e ".[ocr]"
    # Also requires Tesseract OCR engine installed on the server host system.
    client = Client("http://localhost:8013")
    print("--- OCR Tool Example ---")

    # --- Create dummy files for testing ---
    # In a real scenario, these files would exist on a path accessible by the server.
    # Ensure the server process has permissions to read these files.
    dummy_files_dir = "ocr_test_files"
    os.makedirs(dummy_files_dir, exist_ok=True)
    dummy_pdf_path = os.path.join(dummy_files_dir, "dummy_document.pdf")
    dummy_image_path = os.path.join(dummy_files_dir, "dummy_image.png")

    # Create a simple dummy PDF (requires reportlab - pip install reportlab)
    try:
        from reportlab.pdfgen import canvas
        from reportlab.lib.pagesizes import letter
        c = canvas.Canvas(dummy_pdf_path, pagesize=letter)
        c.drawString(100, 750, "This is page 1 of a dummy PDF.")
        c.drawString(100, 730, "It contains some text for OCR testing.")
        c.showPage()
        c.drawString(100, 750, "This is page 2.")
        c.save()
        print(f"Created dummy PDF: {dummy_pdf_path}")
    except ImportError:
        print("Could not create dummy PDF: reportlab not installed. Skipping PDF test.")
        dummy_pdf_path = None
    except Exception as e:
        print(f"Error creating dummy PDF: {e}. Skipping PDF test.")
        dummy_pdf_path = None

    # Create a simple dummy PNG image (requires Pillow - pip install Pillow)
    try:
        from PIL import Image, ImageDraw, ImageFont
        img = Image.new('RGB', (400, 100), color=(255, 255, 255))
        d = ImageDraw.Draw(img)
        # Use a default font if possible, otherwise basic text
        try:
            font = ImageFont.truetype("arial.ttf", 15)
        except IOError:
            font = ImageFont.load_default()
        d.text((10, 10), "Dummy Image Text for OCR\nLine 2 of text.", fill=(0, 0, 0), font=font)
        img.save(dummy_image_path)
        print(f"Created dummy Image: {dummy_image_path}")
    except ImportError:
        print("Could not create dummy Image: Pillow not installed. Skipping Image test.")
        dummy_image_path = None
    except Exception as e:
        print(f"Error creating dummy Image: {e}. Skipping Image test.")
        dummy_image_path = None
    # --- End of dummy file creation ---

    # 1. Extract text from the PDF using OCR and LLM correction
    if dummy_pdf_path:
        print(f"\nExtracting text from PDF: {dummy_pdf_path} (using hybrid method)...")
        pdf_text_result = await client.tools.extract_text_from_pdf(
            file_path=dummy_pdf_path,     # Server needs access to this path
            extraction_method="hybrid",   # Try direct extraction, fallback to OCR
            max_pages=2,                  # Limit pages to process
            reformat_as_markdown=True,    # Request markdown formatting
            # Optional: Use an LLM to correct/improve the raw OCR text
            llm_correction_model={"provider": "openai", "model": "gpt-4.1-mini"}
        )
        if pdf_text_result["success"]:
            print("PDF text extraction successful.")
            print(f"  Method Used: {pdf_text_result.get('extraction_method_used', 'N/A')}")
            print(f"  Cost (incl. LLM correction): ${pdf_text_result.get('cost', 0.0):.6f}")
            print("\n--- Extracted PDF Text (Markdown) ---")
            print(pdf_text_result.get("text", "No text extracted."))
            print("-------------------------------------")
        else:
            print(f"PDF OCR failed: {pdf_text_result.get('error', 'Unknown error')}")
            if 'details' in pdf_text_result:
                print(f"Details: {pdf_text_result['details']}")
    else:
        print("\nSkipping PDF OCR test as dummy file could not be created.")

    # 2. Process the image file with OCR and preprocessing
    if dummy_image_path:
        print(f"\nProcessing image OCR: {dummy_image_path} with preprocessing...")
        image_text_result = await client.tools.process_image_ocr(
            image_path=dummy_image_path,  # Server needs access to this path
            # Optional preprocessing steps (require OpenCV on server)
            preprocessing_options={
                "grayscale": True,
                # "threshold": "otsu",  # e.g., otsu, adaptive
                # "denoise": True,
                # "deskew": True
            },
            ocr_language="eng"  # Specify language(s) for Tesseract e.g., "eng+fra"
            # Optional LLM enhancement for image OCR results
            # llm_enhancement_model={"provider": "gemini", "model": "gemini-2.0-flash-lite"}
        )
        if image_text_result["success"]:
            print("Image OCR successful.")
            print(f"  Cost (incl. LLM enhancement): ${image_text_result.get('cost', 0.0):.6f}")
            print("\n--- Extracted Image Text ---")
            print(image_text_result.get("text", "No text extracted."))
            print("----------------------------")
        else:
            print(f"Image OCR failed: {image_text_result.get('error', 'Unknown error')}")
            if 'details' in image_text_result:
                print(f"Details: {image_text_result['details']}")
    else:
        print("\nSkipping Image OCR test as dummy file could not be created.")

    # --- Clean up dummy files ---
    # try:
    #     if dummy_pdf_path and os.path.exists(dummy_pdf_path): os.remove(dummy_pdf_path)
    #     if dummy_image_path and os.path.exists(dummy_image_path): os.remove(dummy_image_path)
    #     if os.path.exists(dummy_files_dir): os.rmdir(dummy_files_dir)  # Only if empty
    # except Exception as e:
    #     print(f"\nError cleaning up dummy files: {e}")
    # --- End cleanup ---

    print("\n--- OCR Tool Example Complete ---")
    await client.close()

# if __name__ == "__main__": asyncio.run(ocr_example())
```
(Note: Many examples involving file paths assume the server process has access to those paths. For Docker deployments, volume mapping is usually required.)
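For instance, a Docker Compose volume mapping along these lines makes a host directory visible to the server container. This is only an illustrative sketch: the service name, image tag, and container mount point are assumptions, not values defined by the project.

```yaml
services:
  ultimate-mcp-server:                  # illustrative service name
    image: ultimate-mcp-server:latest   # illustrative image tag
    ports:
      - "8013:8013"
    env_file: .env
    volumes:
      # Map a host directory into the container so file-based tools can read/write it
      - ./data:/app/data
```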
The Ultimate MCP Server includes a powerful feature for autonomously analyzing, testing, and refining the documentation of registered MCP tools. This feature, implemented in `ultimate/tools/docstring_refiner.py`, helps improve the usability and reliability of tools when invoked by Large Language Models (LLMs) like Claude.
The documentation refiner follows a methodical, iterative approach:
This feature is particularly valuable when:
The documentation refiner is typically invoked as a server-side maintenance or administrative task, not directly exposed as an MCP tool for external agents to call.
```python
# This code snippet shows how the refiner might be called from within the
# server's environment (e.g., via a CLI command or admin interface).
# Assume necessary imports and context setup:
# from ultimate_mcp_server.tools.docstring_refiner import refine_tool_documentation
# from ultimate_mcp_server.core import mcp_context  # Represents the server's context

async def invoke_doc_refiner_task():
    # Ensure mcp_context is properly initialized with registered tools, config, etc.
    print("Starting Autonomous Documentation Refinement Task...")

    # Example: Refine documentation for a specific list of tools
    refinement_result = await refine_tool_documentation(
        tool_names=["extract_json", "browser_navigate", "chunk_document"],  # Tools to refine
        max_iterations=3,  # Limit refinement cycles per tool
        refinement_model_config={  # Specify LLM for refinement tasks
            "provider": "anthropic", "model": "claude-3-5-sonnet-20241022"
        },
        testing_model_config={  # Optional: Specify LLM for test generation/simulation
            "provider": "openai", "model": "gpt-4o"
        },
        enable_winnowing=True,      # Apply final streamlining pass
        stop_on_first_error=False,  # Continue refining other tools if one fails
        ctx=mcp_context             # Pass the server's MCP context
    )

    # Example: Refine all available tools (potentially very long running)
    # refinement_result = await refine_tool_documentation(
    #     refine_all_available=True,
    #     max_iterations=2,
    #     ctx=mcp_context
    # )

    print("\nDocumentation Refinement Task Complete.")

    # Process the results
    if refinement_result["success"]:
        print(f"Successfully processed {len(refinement_result.get('refined_tools', []))} tools.")
        # The actual docstrings/schemas of the tools in mcp_context might be updated in-memory.
        # Persisting these changes would require additional logic (e.g., writing back to source files).
        print("Detailed report available in the result object.")
        # print(refinement_result.get('report'))  # Contains detailed logs and changes
    else:
        print(f"Refinement task encountered errors: {refinement_result.get('error', 'Unknown error')}")
        # Check the report for details on which tools failed and why.

# To run this, it would need to be integrated into the server's startup sequence,
# a dedicated CLI command, or an administrative task runner.
# e.g., await invoke_doc_refiner_task()
```
The Ultimate MCP Server includes an extensive collection of 35+ end-to-end examples located in the `examples/` directory. These serve a dual purpose:

- Examples are organized by category (e.g., `model_integration`, `tool_specific`, `workflows`, `advanced_features`).
- Each example (`*.py`) is a runnable Python script that uses `mcp-client` to interact with a running server instance.
- Examples use the `Rich` library for formatted, color-coded console output, clearly showing requests, responses, costs, timings, and results.

Expect informative console output, such as the following snippet:
```
╭────────────── Tournament Results ──────────────╮
│ [1] claude-3-5-haiku-20241022: Score 8.7/10     │
│     Cost: $0.00013                              │
│ ...                                             │
╰─────────────────────────────────────────────────╯
```
You can adapt the examples to your environment: point them at your own API keys (via `.env`), different models, custom prompts, or input files, and pass command-line arguments where supported (e.g., `--model`, `--input-file`, `--headless`).

The `run_all_demo_scripts_and_check_for_errors.py` script orchestrates the execution of all examples as a test suite:

- It runs each script in `examples/*.py` sequentially.
- It checks `stderr` against predefined patterns to distinguish real errors from expected messages (e.g., missing API key warnings).

Example test framework configuration snippet:
"sql_database_interactions_demo.py": { "expected_exit_code": 0, "allowed_stderr_patterns": [ r"Could not compute statistics...", # Known non-fatal warning r"Connection failed...", # Expected if DB not set up r"Configuration not yet loaded..." # Standard info message ] }
```bash
# Ensure the Ultimate MCP Server is running in a separate terminal

# Run the entire test suite
python run_all_demo_scripts_and_check_for_errors.py

# Run a specific example script directly
python examples/browser_automation_demo.py --headless

# Run an example with custom arguments
python examples/text_redline_demo.py --input-file1 path/to/doc1.txt --input-file2 path/to/doc2.txt
```
This combined example library and testing framework provides invaluable resources for understanding, utilizing, and verifying the functionality of the Ultimate MCP Server.
Ultimate MCP Server comes with a command-line interface (`umcp`) for server management and tool interaction:
```bash
# Show available commands and global options
umcp --help

# --- Server Management ---
# Start the server (loads .env, registers tools)
umcp run [--host HOST] [--port PORT] [--include-tools tool1 tool2] [--exclude-tools tool3 tool4]

# --- Information ---
# List configured LLM providers
umcp providers [--check] [--models]

# List available tools
umcp tools [--category CATEGORY] [--examples]

# --- Testing & Interaction ---
# Test connection and basic generation for a specific provider
umcp test <provider_name> [--model MODEL_NAME] [--prompt TEXT]

# Generate a completion directly from the CLI
umcp complete --provider <provider_name> --model <model_name> --prompt "Your prompt here" [--temperature N] [--max-tokens N] [--system TEXT] [--stream]

# --- Cache Management ---
# View or clear the request cache
umcp cache [--status] [--clear]

# --- Benchmark ---
umcp benchmark [--providers P1 P2] [--models M1 M2] [--prompt TEXT] [--runs N]

# --- Examples ---
umcp examples [--list] [<example_name>] [--category CATEGORY]
```
Each command typically has additional options. Use `umcp COMMAND --help` to see the options for a specific command (e.g., `umcp complete --help`).
Configuration is primarily managed through environment variables, often loaded from a `.env` file in the project root upon startup.
- `SERVER_HOST`: (Default: `127.0.0.1`) Network interface to bind to. Use `0.0.0.0` to listen on all interfaces (necessary for Docker containers or external access).
- `SERVER_PORT`: (Default: `8013`) Port the server listens on.
- `API_PREFIX`: (Default: `/`) URL prefix for all API endpoints (e.g., set to `/mcp/v1` to serve under that path).
- `WORKERS`: (Optional, e.g., `4`) Number of worker processes for the web server (e.g., Uvicorn). Adjust based on CPU cores.

Control which tools are registered when the server starts using CLI flags:
- `--include-tools tool1,tool2,...`: Only register the specified tools.
- `--exclude-tools tool3,tool4,...`: Register all tools except those specified.

This is useful for creating lightweight instances, managing dependencies, or restricting agent capabilities.

```bash
# Example: Start with only filesystem and basic completion tools
umcp run --include-tools read_file,write_file,list_directory,completion

# Example: Start with all tools except browser automation
umcp run --exclude-tools browser_init,browser_navigate,browser_click
```
- `LOG_LEVEL`: (Default: `INFO`) Controls log verbosity (`DEBUG`, `INFO`, `WARNING`, `ERROR`, `CRITICAL`). `DEBUG` is very verbose.
- `USE_RICH_LOGGING`: (Default: `true`) Enables colorful, structured console logs via the Rich library. Set to `false` for plain text logs (better for file redirection or some logging systems).
- `LOG_FORMAT`: (Optional) Specify a Python `logging` format string for custom log formats (if `USE_RICH_LOGGING=false`).
- `LOG_TO_FILE`: (Optional, e.g., `/var/log/ultimate_mcp_server.log`) Path to a file where logs should also be written (in addition to the console). Ensure the server process has write permissions.
- `CACHE_ENABLED`: (Default: `true`) Globally enable or disable response caching.
- `CACHE_TTL`: (Default: `86400` seconds = 24 hours) Default Time-To-Live for cached items. Specific tools might have overrides.
- `CACHE_TYPE`: (Default: `memory`) Backend storage. Check the implementation for supported types (e.g., `memory`, `redis`, `diskcache`). `diskcache` provides persistence.
- `CACHE_DIR`: (Default: `./.cache`) Directory used if `CACHE_TYPE=diskcache`. Ensure write permissions.
- `CACHE_MAX_SIZE`: (Optional, e.g., `1000` for items or `536870912` for 512 MB with `diskcache`) Sets size limits for the cache.
- `REDIS_URL`: (Required if `CACHE_TYPE=redis`) Connection URL for the Redis server (e.g., `redis://localhost:6379/0`).
- `PROVIDER_TIMEOUT`: (Default: `120`) Default timeout in seconds for waiting for a response from an LLM provider API.
- `PROVIDER_MAX_RETRIES`: (Default: `3`) Default number of times to retry a failed request to a provider (for retryable errors like rate limits or temporary server issues). Uses exponential backoff.
- Provider-specific overrides may also be available (e.g., `OPENAI_TIMEOUT`, `ANTHROPIC_MAX_RETRIES`). Check the configuration loading logic or documentation.

Individual tools might load their own configuration from environment variables. Examples:

- `ALLOWED_DIRS`: Comma-separated list of base directories filesystem tools are restricted to. Crucial for security.
- `PLAYWRIGHT_BROWSER_TYPE`: (Default: `chromium`) Browser used by Playwright tools (`chromium`, `firefox`, `webkit`).
- `PLAYWRIGHT_TIMEOUT`: Default timeout for Playwright actions.
- `DATABASE_URL`: Connection string for the SQL Database Interaction tools (uses SQLAlchemy).
- `MARQO_URL`: URL for the Marqo instance used by the fused search tool.
- `TESSERACT_CMD`: Path to the Tesseract executable if not in the standard system PATH (for OCR).

Always ensure environment variables are set correctly before starting the server. Changes typically require a server restart to take effect.
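Putting a few of these together, a minimal illustrative `.env` might look like the sketch below. The values are placeholders, and the provider key names shown are the commonly used ones; confirm the exact variable names against the project's configuration documentation.

```bash
# .env (illustrative values only)
SERVER_HOST=127.0.0.1
SERVER_PORT=8013
LOG_LEVEL=INFO
CACHE_ENABLED=true
CACHE_TYPE=memory

# Provider API keys (set only the providers you actually use)
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
```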
While `umcp run` or `docker compose up` are fine for development, consider the following options for more robust deployments:
Ensure the server runs continuously and restarts automatically:

- `systemd` (Linux): Create a service unit file (`.service`) to manage the process with `systemctl start|stop|restart|status`. Provides robust control and logging integration.
- `supervisor`: A process control system written in Python. Configure `supervisord` to monitor and manage the server process.
- Docker restart policies: Use `--restart unless-stopped` or `--restart always` in your `docker run` command or in `docker-compose.yml` to have Docker manage restarts.

Placing a reverse proxy in front of the Ultimate MCP Server is highly recommended:
- Path routing: Map a public URL (e.g., `https://api.yourdomain.com/mcp/`) to the internal server (`http://localhost:8013`). Configure `API_PREFIX` if needed.
- Security headers: Add headers such as `Strict-Transport-Security` (HSTS) and `Content-Security-Policy` (CSP).

Example Nginx `location` block (simplified):
```nginx
location /mcp/ {  # Match your desired public path (corresponds to API_PREFIX if set)
    proxy_pass http://127.0.0.1:8013/;  # Point to the internal server (note trailing /)

    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;

    # Increase timeouts for potentially long-running AI tasks
    proxy_connect_timeout 60s;
    proxy_send_timeout 300s;
    proxy_read_timeout 300s;

    # Optional: Add basic authentication
    # auth_basic "Restricted Access";
    # auth_basic_user_file /etc/nginx/.htpasswd;
}
```
For scalable, managed deployments:

- Health checks: Configure liveness/readiness probes against a `/healthz` endpoint (or similar) in your deployment manifests (see the probe sketch below).
- Secrets: Use the platform's secrets management rather than shipping keys in images or `.env` files.
- Persistent storage: If you rely on stateful features (`diskcache`, persistent memory, file storage), configure persistent volumes (PVs/PVCs).
- Scaling: Scale out with multiple replicas and/or tune worker processes (the `WORKERS` env var).
- Storage performance: Matters for caching (`diskcache`) or extensive filesystem operations. Use fast storage (SSDs) if needed.
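For example, a readiness probe stanza along these lines could be used in a Kubernetes container spec. The `/healthz` path and port are assumptions; match whatever health endpoint your deployment actually exposes.

```yaml
# Excerpt from a Kubernetes container spec (illustrative)
readinessProbe:
  httpGet:
    path: /healthz      # assumed health endpoint
    port: 8013
  initialDelaySeconds: 10
  periodSeconds: 15
```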
| Task Scenario | High-End Model Only (Est.) | Delegated via MCP Server (Est.) | Estimated Savings | Notes |
|---|---|---|---|---|
| Summarize 100-page document | ~$4.50 - $6.00 | ~$0.45 - $0.70 (Gemini Flash) | ~90% | Chunking + parallel cheap summaries |
| Extract data from 50 records | ~$2.25 - $3.00 | ~$0.35 - $0.50 (GPT-4.1 Mini) | ~84% | Batch processing with cost-effective model |
| Generate 20 content ideas | ~$0.90 - $1.20 | ~$0.12 - $0.20 (DeepSeek/Haiku) | ~87% | Simple generation task on cheaper model |
| Process 1,000 customer queries | ~$45.00 - $60.00 | ~$7.50 - $12.00 (Mixed Models) | ~83% | Routing based on query complexity |
| OCR & Extract from 10 Scans | ~$1.50 - $2.50 (If LLM OCR) | ~$0.20 - $0.40 (OCR + LLM Fix) | ~85% | Using dedicated OCR + cheap LLM correction |
| Basic Web Scrape & Summarize | ~$0.50 - $1.00 | ~$0.10 - $0.20 (Browser + Haiku) | ~80% | Browser tool + cheap LLM for summary |
(Costs are highly illustrative, based on typical token counts and approximate 2024 pricing. Actual costs depend heavily on document size, complexity, specific models used, and current provider pricing.)
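As a quick illustration of how the savings column is derived, using the table's own illustrative midpoints for the first row (not real pricing):

```python
# Illustrative arithmetic only - midpoints of the first table row above
high_end_cost = (4.50 + 6.00) / 2    # ~$5.25 using the premium model for the whole document
delegated_cost = (0.45 + 0.70) / 2   # ~$0.58 with chunking + a cheaper model
savings = 1 - delegated_cost / high_end_cost
print(f"Estimated savings: {savings:.0%}")   # roughly 89%, i.e. the ~90% shown above
```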
These savings come from routing each task to the cheapest capable model or to a local tool, with Ultimate MCP Server acting as the intelligent routing layer that makes such cost optimizations feasible within a sophisticated agent architecture.
The strategic importance of AI-to-AI delegation, facilitated by systems like the Ultimate MCP Server, extends beyond simple cost savings:
This vision of efficient, intelligently delegated, self-optimizing AI systems represents the next frontier in practical AI deployment, moving beyond the current paradigm of often using a single, powerful (and expensive) model for almost everything.
The Ultimate MCP Server is built natively on the Model Context Protocol (MCP):
This adherence to the MCP standard ensures seamless, predictable integration with any MCP-compatible agent or client application.
```
+---------------------+   MCP Request    +------------------------------------+   API Request    +------------------+
|   MCP Agent/Client  | ---------------> |         Ultimate MCP Server        | ---------------> |   LLM Providers  |
|  (e.g., Claude 3.7) | <--------------- | (FastAPI + MCP Core + Tool Logic)  | <--------------- | (OpenAI, Anthro.)|
+---------------------+   MCP Response   +------------------------------------+   API Response   +------------------+
                                                           |
                                      Tool Invocation      |      External API Call
                                                           v
        Internal Services & Tool Implementations
        - Completion/LLM (Routing/Provider), Document Proc. (Chunking, Sum.), Data Extraction (JSON, Table),
          Browser Automation (Playwright), Excel Automation (OpenPyXL/COM)
        - Cognitive Memory (Storage/Query), Filesystem Ops (Secure Access), SQL Database (SQLAlchemy),
          Entity/Graph (NetworkX), Vector/RAG (Vector Stores)
        - Audio Transcription (Whisper, etc.), OCR Tools (Tesseract+LLM), Text Classify, CLI Tools (jq, rg, awk),
          Dynamic API (OpenAPI->Tool)
        - Caching Service (Memory/Disk/Redis), Analytics/Metrics (Cost/Usage Track), Prompt Management (Jinja2/Repo),
          Config Service (Loads .env), Meta Tools/Refiner (list_tools etc.)
```
A typical tool invocation flows through the server roughly as follows:

- The agent constructs an MCP request specifying the `tool_name` and required `inputs` according to the tool's schema (which it might have discovered via `list_tools`).
- The server looks up the `tool_name` in its registry of registered tools.
- The server validates the `inputs` provided in the request. If validation fails, an MCP error response is generated immediately.
- A cache key is derived from the `tool_name` and validated `inputs`. If a valid, non-expired cache entry exists for this key, the cached response is retrieved and returned directly, skipping the remaining processing steps.
- For LLM-dependent tools (e.g., `completion`, `summarize_document`, `extract_json`): the server selects a provider and model based on the request parameters (`provider`, `model`, `provider_preference`) and server configuration, then calls the provider API.
- For tools that execute directly on the server (e.g., `read_file`, `browser_navigate`, `run_sql_query`, `run_ripgrep`): the corresponding tool logic runs locally.
- The final MCP response includes `status`, `outputs`, `error` (if any), and potentially `cost` and `usage_metadata`.

Empower agents like Claude or custom-built autonomous agents to perform complex, multi-modal tasks by giving them tools for:
Build sophisticated automated processes that leverage AI reasoning and specialized tools:
Handle complex data tasks beyond simple ETL:
Support research teams with AI-powered tools:
Create comprehensive systems for understanding document collections:
Equip financial professionals with advanced tools:
When deploying and operating the Ultimate MCP Server, security must be a primary concern. Consider the following aspects:
🔑 API Key Management:

- Load provider API keys from the environment rather than hardcoding them (a `.env` file for local dev, system environment variables, or preferably secrets management tools like HashiCorp Vault, AWS Secrets Manager, GCP Secret Manager for production).
- Ensure the `.env` file (if used locally) has strict file permissions (e.g., `chmod 600 .env`) readable only by the user running the server.

🌐 Network Exposure & Access Control:
- Bind the server to `127.0.0.1` (`SERVER_HOST`) by default to only allow local connections. Only change to `0.0.0.0` if you intend to expose it, and only behind appropriate network controls.
- Restrict access to `SERVER_PORT` so it is reachable only from trusted sources (e.g., the reverse proxy's IP, specific application server IPs, VPN ranges); see the firewall example below.
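As one concrete (and optional) way to do this on a Linux host that uses `ufw`, a rule can limit the port to a single trusted address. The IP address below is purely illustrative.

```bash
# Allow only the reverse proxy (e.g., 10.0.0.5) to reach the server port, deny everyone else
sudo ufw allow from 10.0.0.5 to any port 8013 proto tcp
sudo ufw deny 8013/tcp
```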
👤 Authentication & Authorization:

🚦 Rate Limiting & Abuse Prevention:

🛡️ Input Validation & Sanitization:
- Filesystem tools: Configure `ALLOWED_DIRS` strictly. Validate and normalize all path inputs rigorously to prevent directory traversal (`../`), and ensure the server process runs with least privilege (a sketch of such a path check follows this list).
- Browser tools: Treat scripts executed in the browser (e.g., via `browser_evaluate_script`) with care. Avoid running scripts based directly on untrusted agent input if possible; Playwright's sandboxing helps but isn't foolproof.
- CLI tools: Sanitize arguments passed to `run_ripgrep`, `run_jq`, etc., to prevent command injection, especially if constructing complex command strings. Use safe methods for passing input data (e.g., stdin).
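A minimal sketch of the kind of path check the first point implies, assuming an allow-list loaded from `ALLOWED_DIRS`; the real filesystem tools may implement this differently:

```python
import os

ALLOWED_DIRS = ["/srv/mcp-data"]  # illustrative allow-list, normally taken from the ALLOWED_DIRS env var

def is_path_allowed(requested_path: str) -> bool:
    """Return True only if the resolved path stays inside an allowed base directory."""
    resolved = os.path.realpath(requested_path)  # collapses '..' segments and symlinks
    return any(
        os.path.commonpath([resolved, os.path.realpath(base)]) == os.path.realpath(base)
        for base in ALLOWED_DIRS
    )

# is_path_allowed("/srv/mcp-data/reports/q1.txt")  -> True
# is_path_allowed("/srv/mcp-data/../etc/passwd")   -> False
```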
📦 Dependency Security:

- Keep dependencies updated with `uv pip install --upgrade ...` or `uv sync` to patch known vulnerabilities in third-party libraries (FastAPI, Pydantic, Playwright, database drivers, etc.).
- Use vulnerability scanners (e.g., `pip-audit`, GitHub Dependabot, Snyk) to automatically identify vulnerable dependencies in your `pyproject.toml` or `requirements.txt` (see the commands below).
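For example, assuming `uv` and `pip-audit` are installed in the project environment:

```bash
# Upgrade the project and its dependencies in the current environment
uv pip install --upgrade -e .

# Scan installed packages for known vulnerabilities (CVEs)
pip-audit
```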
📄 Logging Security:

- `DEBUG` level logging might record sensitive information, including full prompts, API responses, file contents, or keys present in data. Configure `LOG_LEVEL` appropriately for production (`INFO` or `WARNING` is usually safer).
- Ensure log files (if `LOG_TO_FILE` is used) have appropriate permissions, and consider log rotation and retention policies. Avoid logging raw API keys.

⚙️ Tool-Specific Security:

- Ensure tool-specific settings (e.g., `ALLOWED_DIRS`, database credentials with limited permissions) follow the principle of least privilege. Disable tools that are not needed or cannot be secured adequately for your environment.

This project is licensed under the MIT License - see the `LICENSE` file for details.
This project builds upon the work of many fantastic open-source projects and services. Special thanks to:
- External engines and CLI utilities such as `Tesseract`, `ripgrep`, `jq`, `awk`, and `sed`.

This README provides a comprehensive overview. For specific tool parameters, advanced configuration options, and detailed implementation notes, please refer to the source code and individual tool documentation within the project.
Start the server using the CLI:
```bash
# Start in default stdio mode
umcp run

# Start in streamable-http mode for web interfaces or remote clients (recommended)
umcp run --transport-mode shttp
# Or use the shortcut:
umcp run -t shttp

# Run on a specific host and port (streamable-http mode)
umcp run -t shttp --host 0.0.0.0 --port 8080
```