OpenVision
MCP server providing image analysis capabilities through OpenRouter vision models integration
MCP OpenVision is a Model Context Protocol (MCP) server that provides image analysis capabilities powered by OpenRouter vision models. It enables AI assistants to analyze images via a simple interface within the MCP ecosystem.
To install mcp-openvision for Claude Desktop automatically via Smithery:
```bash
npx -y @smithery/cli install @Nazruden/mcp-openvision --client claude
```
Alternatively, install from PyPI:

```bash
# With pip
pip install mcp-openvision

# Or with uv
uv pip install mcp-openvision
```
MCP OpenVision requires an OpenRouter API key and can be configured through environment variables:
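For example, in a POSIX shell you might export the two variables this README uses (the key value here is a placeholder):

```shell
# Required: your OpenRouter API key
export OPENROUTER_API_KEY="your_openrouter_api_key_here"

# Optional: override the default vision model
export OPENROUTER_DEFAULT_MODEL="anthropic/claude-3-sonnet"
```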
MCP OpenVision works with any OpenRouter model that supports vision capabilities. The default model is qwen/qwen2.5-vl-32b-instruct:free, but you can specify any other compatible model.
Some popular vision models available through OpenRouter include:
- qwen/qwen2.5-vl-32b-instruct:free (default)
- anthropic/claude-3-5-sonnet
- anthropic/claude-3-opus
- anthropic/claude-3-sonnet
- openai/gpt-4o

You can specify a custom model by setting the `OPENROUTER_DEFAULT_MODEL` environment variable or by passing the `model` parameter directly to the `image_analysis` function.
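The fallback order described above (explicit parameter, then environment variable, then built-in default) can be sketched as a small helper. Note that `resolve_model` is a hypothetical name for illustration, not part of mcp-openvision's actual API:

```python
import os

# Default model named in this README
DEFAULT_MODEL = "qwen/qwen2.5-vl-32b-instruct:free"

def resolve_model(explicit_model=None):
    """Pick the vision model: an explicit argument wins, then the
    OPENROUTER_DEFAULT_MODEL environment variable, then the default."""
    return explicit_model or os.environ.get("OPENROUTER_DEFAULT_MODEL") or DEFAULT_MODEL
```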
The easiest way to test MCP OpenVision is with the MCP Inspector tool:
```bash
npx @modelcontextprotocol/inspector uvx mcp-openvision
```
Edit your MCP configuration file:
- Cursor (Windows): `%USERPROFILE%\.cursor\mcp.json`
- Cursor (macOS/Linux): `~/.cursor/mcp.json`
- Claude Desktop (macOS): `~/Library/Application Support/Claude/claude_desktop_config.json`

Add the following configuration:
```json
{
  "mcpServers": {
    "openvision": {
      "command": "uvx",
      "args": ["mcp-openvision"],
      "env": {
        "OPENROUTER_API_KEY": "your_openrouter_api_key_here",
        "OPENROUTER_DEFAULT_MODEL": "anthropic/claude-3-sonnet"
      }
    }
  }
}
```
```bash
# Set the required API key
export OPENROUTER_API_KEY="your_api_key"

# Run the server module directly
python -m mcp_openvision
```
MCP OpenVision provides one core tool, `image_analysis`, which accepts the following parameters:

- `image`: the image to analyze, provided as a URL, a local file path, or base64-encoded data
- `query`: user instruction for the image analysis task
- `system_prompt`: instructions that define the model's role and behavior (optional)
- `model`: vision model to use
- `temperature`: controls randomness (0.0-1.0)
- `max_tokens`: maximum response length

The `query` parameter is crucial for getting useful results from the image analysis. A well-crafted query provides context about what you're looking for and why, as these examples show:
| Basic Query | Enhanced Query | 
|---|---|
| "Describe this image" | "Identify all retail products visible in this store shelf image and estimate their price range" | 
| "What's in this image?" | "Analyze this medical scan for abnormalities, focusing on the highlighted area and providing possible diagnoses" | 
| "Analyze this chart" | "Extract the numerical data from this bar chart showing quarterly sales, and identify the key trends from 2022-2023" | 
| "Read the text" | "Transcribe all visible text in this restaurant menu, preserving the item names, descriptions, and prices" | 
By providing context about why you need the analysis and what specific information you're seeking, you help the model focus on relevant details and produce more valuable insights.
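Under the hood, these parameters map onto an OpenRouter chat-completions request. The sketch below shows one plausible shape of that payload, following OpenRouter's OpenAI-compatible message format with separate text and image_url content parts; the helper name and exact structure are illustrative, not mcp-openvision's actual code:

```python
def build_request(image_url, query, system_prompt=None,
                  model="qwen/qwen2.5-vl-32b-instruct:free",
                  temperature=0.7, max_tokens=1024):
    """Assemble a chat-completions payload with a text part and an image part."""
    messages = []
    if system_prompt:
        # Optional system message defining the model's role
        messages.append({"role": "system", "content": system_prompt})
    messages.append({
        "role": "user",
        "content": [
            {"type": "text", "text": query},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    })
    return {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens,
    }
```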
```python
# Analyze an image from a URL
result = await image_analysis(
    image="https://example.com/image.jpg",
    query="Describe this image in detail"
)

# Analyze an image from a local file with a focused query
result = await image_analysis(
    image="path/to/local/image.jpg",
    query="Identify all traffic signs in this street scene and explain their meanings for a driver education course"
)

# Analyze with a base64-encoded image and a specific analytical purpose
result = await image_analysis(
    image="SGVsbG8gV29ybGQ=...",  # base64 data
    query="Examine this product packaging design and highlight elements that could be improved for better visibility and brand recognition"
)

# Customize the system prompt for specialized analysis
result = await image_analysis(
    image="path/to/local/image.jpg",
    query="Analyze the composition and artistic techniques used in this painting, focusing on how they create emotional impact",
    system_prompt="You are an expert art historian with deep knowledge of painting techniques and art movements. Focus on formal analysis of composition, color, brushwork, and stylistic elements."
)
```
The `image_analysis` tool accepts several types of image inputs:

- Remote image URLs
- Local file paths (absolute, or relative together with the `project_root` parameter to specify a base directory)
- Base64-encoded image data

When using relative file paths (like "examples/image.jpg"), you have two options: pass an absolute path instead, or supply the `project_root` parameter:

```python
# Example with relative path and project_root
result = await image_analysis(
    image="examples/image.jpg",
    project_root="/path/to/your/project",
    query="What is in this image?"
)
```
This is particularly useful in applications where the current working directory may not be predictable or when you want to reference files using paths relative to a specific directory.
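A rough sketch of how such inputs might be classified and resolved; both helper names here are hypothetical illustrations, not mcp-openvision's internals:

```python
import base64
import binascii
from pathlib import Path

def resolve_image_path(image, project_root=None):
    """Resolve a relative file path against project_root when one is given."""
    path = Path(image)
    if project_root and not path.is_absolute():
        return Path(project_root) / path
    return path

def looks_like_base64(value):
    """Heuristic check: does the string decode as strict base64?"""
    try:
        base64.b64decode(value, validate=True)
        return True
    except binascii.Error:
        return False
```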
```bash
# Clone the repository
git clone https://github.com/modelcontextprotocol/mcp-openvision.git
cd mcp-openvision

# Install development dependencies
pip install -e ".[dev]"
```
This project uses Black for automatic code formatting. The formatting is enforced through GitHub Actions:
You can also run Black locally to format your code before committing:
```bash
# Format all Python code in the src and tests directories
black src tests
```
Run the test suite with:

```bash
pytest
```
This project uses an automated release process:

1. Update the version in `pyproject.toml` following Semantic Versioning principles (you can run `python scripts/bump_version.py [major|minor|patch]`)
2. Update `CHANGELOG.md` with details about the new version
3. Push to the `main` branch

This automation helps maintain a consistent release process and ensures that every release is properly versioned and documented.
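The version-bump step can be illustrated with a minimal semantic-versioning helper. This is a sketch of the idea only, not the actual contents of scripts/bump_version.py:

```python
def bump_version(version, part):
    """Increment the major, minor, or patch component of a SemVer string,
    resetting the lower-order components to zero."""
    major, minor, patch = (int(x) for x in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part}")
```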
If you find this project helpful, consider buying me a coffee to support ongoing development and maintenance.
This project is licensed under the MIT License - see the LICENSE file for details.