Image Recognition
STDIO: Image recognition server based on AI vision APIs
An MCP server that provides image recognition capabilities using Anthropic and OpenAI vision APIs. Version 0.1.2.
sudo apt-get install tesseract-ocr
brew install tesseract
git clone https://github.com/mario-andreschak/mcp-image-recognition.git
cd mcp-image-recognition
cp .env.example .env
# Edit .env with your API keys and preferences
build.bat
Start the server with Python:
python -m image_recognition_server.server
Alternatively, start the server with the batch script:
run.bat server
Start the server in development mode with the MCP Inspector:
run.bat debug
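A client that launches the server itself can wrap the Python command above in a subprocess whose stdin and stdout carry the protocol traffic. The following is a minimal sketch, not the project's own launcher: the helper names `server_command` and `spawn_server` are hypothetical, and it assumes the package is installed in the Python environment being used.

```python
import subprocess
import sys
from typing import Optional


def server_command(python_executable: str = sys.executable) -> list:
    """Return the argv for the documented `python -m` entry point."""
    return [python_executable, "-m", "image_recognition_server.server"]


def spawn_server(env: Optional[dict] = None) -> subprocess.Popen:
    """Start the server with pipes attached to its stdin and stdout.

    Assumption: the server communicates over STDIO, as the listing states.
    Passing env=None makes the child inherit the parent's environment.
    """
    return subprocess.Popen(
        server_command(),
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        env=env,
    )
```

Keeping the argv in its own function makes the launch line easy to inspect or log before a process is actually started.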
describe_image
describe_image_from_file
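MCP clients invoke tools such as these two with a JSON-RPC 2.0 `tools/call` request. The sketch below only builds the request messages. The argument names (`image`, `mime_type`, `filepath`) are assumptions made for illustration, since this document does not list each tool's parameters; check the schema the server reports before relying on them.

```python
import base64
import json


def tool_call_request(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })


def describe_image_request(request_id: int, image_bytes: bytes, mime_type: str) -> str:
    # Assumption: the tool accepts base64 data under "image" plus a "mime_type".
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return tool_call_request(
        request_id, "describe_image", {"image": encoded, "mime_type": mime_type}
    )


def describe_image_from_file_request(request_id: int, filepath: str) -> str:
    # Assumption: the tool accepts the image path under "filepath".
    return tool_call_request(
        request_id, "describe_image_from_file", {"filepath": filepath}
    )
```

In practice an MCP client library would send these messages for you; the sketch is only meant to show what a call to either tool looks like on the wire.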
ANTHROPIC_API_KEY: Your Anthropic API key.
OPENAI_API_KEY: Your OpenAI API key.
VISION_PROVIDER: Primary vision provider (anthropic or openai).
FALLBACK_PROVIDER: Optional fallback provider.
LOG_LEVEL: Logging level (DEBUG, INFO, WARNING, ERROR).
ENABLE_OCR: Enable Tesseract OCR text extraction (true or false).
TESSERACT_CMD: Optional custom path to the Tesseract executable.
OPENAI_MODEL: OpenAI model (default: gpt-4o-mini). Can use the OpenRouter format for other models (e.g., anthropic/claude-3.5-sonnet:beta).
OPENAI_BASE_URL: Optional custom base URL for the OpenAI API. Set to https://openrouter.ai/api/v1 for OpenRouter.
OPENAI_TIMEOUT: Optional custom timeout (in seconds) for the OpenAI API.

OpenRouter allows you to access various models using the OpenAI API format. To use OpenRouter, follow these steps:
1. Set OPENAI_API_KEY in your .env file to your OpenRouter API key.
2. Set OPENAI_BASE_URL to https://openrouter.ai/api/v1.
3. Set OPENAI_MODEL to the desired model using the OpenRouter format (e.g., anthropic/claude-3.5-sonnet:beta).
4. Set VISION_PROVIDER to openai.

Default models:
Anthropic: claude-3.5-sonnet-beta
OpenAI: gpt-4o-mini
OpenRouter: use the anthropic/claude-3.5-sonnet:beta format in OPENAI_MODEL.
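The environment variables and OpenRouter settings described above can be read and validated in one place. This is a minimal sketch under stated assumptions, not the server's actual configuration code: the `Settings` class and the `load_settings` and `provider_order` helpers are hypothetical names, OCR is assumed to be off when ENABLE_OCR is unset, and a missing VISION_PROVIDER is treated as an error rather than given a default.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

PROVIDERS = ("anthropic", "openai")


@dataclass(frozen=True)
class Settings:
    vision_provider: str
    fallback_provider: Optional[str]
    enable_ocr: bool
    openai_model: str
    openai_base_url: Optional[str]


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read the documented variables from `env` (defaults to os.environ)."""
    env = os.environ if env is None else env
    provider = env.get("VISION_PROVIDER", "").strip().lower()
    if provider not in PROVIDERS:
        raise ValueError(f"VISION_PROVIDER must be one of {PROVIDERS}, got {provider!r}")
    fallback = env.get("FALLBACK_PROVIDER", "").strip().lower() or None
    if fallback is not None and fallback not in PROVIDERS:
        raise ValueError(f"FALLBACK_PROVIDER must be one of {PROVIDERS}, got {fallback!r}")
    return Settings(
        vision_provider=provider,
        fallback_provider=fallback,
        # Assumption: OCR stays disabled unless the value is exactly "true".
        enable_ocr=env.get("ENABLE_OCR", "false").strip().lower() == "true",
        # gpt-4o-mini is the documented default for OPENAI_MODEL.
        openai_model=env.get("OPENAI_MODEL", "gpt-4o-mini"),
        openai_base_url=env.get("OPENAI_BASE_URL") or None,
    )


def provider_order(settings: Settings) -> list:
    """Return the primary provider, followed by a distinct fallback if one is set."""
    order = [settings.vision_provider]
    if settings.fallback_provider and settings.fallback_provider != settings.vision_provider:
        order.append(settings.fallback_provider)
    return order
```

For an OpenRouter setup, calling `load_settings` with VISION_PROVIDER set to openai, OPENAI_BASE_URL set to https://openrouter.ai/api/v1, and OPENAI_MODEL set to an OpenRouter model name yields a `Settings` object whose `openai_base_url` and `openai_model` carry those values through unchanged.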
Run all tests:
run.bat test
Run specific test suite:
run.bat test server
run.bat test anthropic
run.bat test openai
Build the Docker image:
docker build -t mcp-image-recognition .
Run the container:
docker run -it --env-file .env mcp-image-recognition
MIT License - see LICENSE file for details.