Image Recognition
Image recognition server based on the Anthropic and OpenAI vision APIs (STDIO)
An MCP server that provides image recognition capabilities using Anthropic and OpenAI vision APIs. Version 0.1.2.
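Install Tesseract OCR (used when ENABLE_OCR is true). On Debian/Ubuntu:
sudo apt-get install tesseract-ocr
On macOS with Homebrew:
brew install tesseract
Clone the repository:
git clone https://github.com/mario-andreschak/mcp-image-recognition.git
cd mcp-image-recognition
Create the environment file, then edit .env with your API keys and preferences:
cp .env.example .env
Build the project:
build.bat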
Start the server using Python:
python -m image_recognition_server.server
Alternatively, start the server using the batch script:
run.bat server
Start the server in development mode with the MCP Inspector:
run.bat debug
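With a stdio transport, an MCP client normally launches the server process itself. As a rough sketch, the snippet below builds a client configuration entry that starts the server with the Python command shown above. The `mcpServers` layout follows a convention used by several MCP clients, and the entry name `image-recognition` and the `env` contents are illustrative assumptions, not something this project prescribes.

```python
import json
import sys
from typing import Dict


def build_client_entry(env: Dict[str, str]) -> dict:
    """Return a config entry that launches the server as a stdio subprocess."""
    return {
        # Use the current interpreter so the same virtual environment is reused.
        "command": sys.executable,
        "args": ["-m", "image_recognition_server.server"],
        # Assumption: the client passes these variables to the server process.
        "env": env,
    }


entry = build_client_entry({"VISION_PROVIDER": "anthropic"})
# "image-recognition" is a hypothetical label chosen for this example.
config = {"mcpServers": {"image-recognition": entry}}

print(json.dumps(config, indent=2))
```

Where this JSON belongs, and whether the key names match, depends on the client you use, so check its documentation before copying the output.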
Available tools:
describe_image
describe_image_from_file
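MCP clients invoke tools such as these by sending a JSON-RPC `tools/call` request. The sketch below shows what such requests might look like for both tools. The argument names (`image`, `mime_type`, `prompt`, `filepath`) are assumptions made for illustration, since the tool inputs are not listed here; confirm them against the schema the server reports via `tools/list` before relying on them.

```python
import base64
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request that calls an MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Placeholder bytes (a PNG signature only); a real call would read an image file.
image_bytes = b"\x89PNG\r\n\x1a\n"

# Assumed arguments: base64-encoded image data plus its MIME type.
request = build_tool_call(
    1,
    "describe_image",
    {
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "mime_type": "image/png",
        "prompt": "Describe this image.",
    },
)

# Assumed arguments: a path that the server process can read.
file_request = build_tool_call(
    2,
    "describe_image_from_file",
    {"filepath": "/path/to/image.png", "prompt": "Describe this image."},
)

print(json.dumps(request, indent=2))
print(json.dumps(file_request, indent=2))
```

In practice an MCP client library builds and sends these messages for you; the raw form is shown only to make the request shape visible.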
Configure the server through the following variables in your .env file:
ANTHROPIC_API_KEY: Your Anthropic API key.
OPENAI_API_KEY: Your OpenAI API key.
VISION_PROVIDER: Primary vision provider (anthropic or openai).
FALLBACK_PROVIDER: Optional fallback provider.
LOG_LEVEL: Logging level (DEBUG, INFO, WARNING, ERROR).
ENABLE_OCR: Enable Tesseract OCR text extraction (true or false).
TESSERACT_CMD: Optional custom path to the Tesseract executable.
OPENAI_MODEL: OpenAI model (default: gpt-4o-mini). Other models can be selected using the OpenRouter format (e.g., anthropic/claude-3.5-sonnet:beta).
OPENAI_BASE_URL: Optional custom base URL for the OpenAI API. Set to https://openrouter.ai/api/v1 for OpenRouter.
OPENAI_TIMEOUT: Optional custom timeout (in seconds) for the OpenAI API.
OpenRouter allows you to access various models using the OpenAI API format. To use OpenRouter, follow these steps:
1. Set OPENAI_API_KEY in your .env file to your OpenRouter API key.
2. Set OPENAI_BASE_URL to https://openrouter.ai/api/v1.
3. Set OPENAI_MODEL to the desired model using the OpenRouter format (e.g., anthropic/claude-3.5-sonnet:beta).
4. Set VISION_PROVIDER to openai.
Default models:
Anthropic: claude-3.5-sonnet-beta
OpenAI: gpt-4o-mini
OpenRouter: use the anthropic/claude-3.5-sonnet:beta format in OPENAI_MODEL.
Run all tests:
run.bat test
Run specific test suite:
run.bat test server
run.bat test anthropic
run.bat test openai
Build the Docker image:
docker build -t mcp-image-recognition .
Run the container:
docker run -it --env-file .env mcp-image-recognition
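Since both the local run and the container take their settings from .env, it can help to sanity-check that file before starting. The sketch below is not part of the project. It parses simple KEY=VALUE lines and checks a few of the documented variables; the function names and the specific rules are assumptions derived from the variable descriptions in the configuration section, and the API key is a placeholder.

```python
from typing import Dict, List

VALID_PROVIDERS = {"anthropic", "openai"}
KEY_FOR_PROVIDER = {"anthropic": "ANTHROPIC_API_KEY", "openai": "OPENAI_API_KEY"}


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and full-line comments."""
    settings: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def check_settings(settings: Dict[str, str]) -> List[str]:
    """Return a list of human-readable problems; empty means no issue found."""
    problems: List[str] = []
    provider = settings.get("VISION_PROVIDER", "")
    if provider not in VALID_PROVIDERS:
        problems.append("VISION_PROVIDER must be 'anthropic' or 'openai'")
    elif not settings.get(KEY_FOR_PROVIDER[provider]):
        problems.append(f"{KEY_FOR_PROVIDER[provider]} is required for {provider}")
    fallback = settings.get("FALLBACK_PROVIDER", "")
    if fallback and fallback not in VALID_PROVIDERS:
        problems.append("FALLBACK_PROVIDER must be 'anthropic' or 'openai'")
    if settings.get("ENABLE_OCR", "false").lower() not in {"true", "false"}:
        problems.append("ENABLE_OCR must be 'true' or 'false'")
    return problems


# Example settings following the OpenRouter steps; the key is a placeholder.
EXAMPLE_ENV = """
# OpenRouter through the OpenAI-compatible API
OPENAI_API_KEY=your-openrouter-key
OPENAI_BASE_URL=https://openrouter.ai/api/v1
OPENAI_MODEL=anthropic/claude-3.5-sonnet:beta
VISION_PROVIDER=openai
ENABLE_OCR=false
"""

for problem in check_settings(parse_env(EXAMPLE_ENV)):
    print(problem)
```

This parser deliberately ignores quoting edge cases and trailing inline comments, so keep each value on its own line without a trailing `# comment` if you use it.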
MIT License - see LICENSE file for details.