
DINO-X

STDIO

Fine-grained object detection and image understanding for LLMs using DINO-X technology.

DINO-X MCP


The official DINO-X MCP Server, powered by the DINO-X and Grounding DINO models, brings fine-grained object detection and image understanding to your multimodal applications.

Why DINO-X MCP?

DINO-X MCP offers:

  • Fine-Grained Understanding: Full image detection, object detection, and region-level descriptions.

  • Structured Outputs: Get object categories, counts, locations, and attributes for VQA and multi-step reasoning tasks.

  • Composable: Works seamlessly with other MCP servers to build end-to-end visual agents or automation pipelines.

Transport Modes

DINO-X MCP supports two transport modes:

| Feature | STDIO (default) | Streamable HTTP |
| --- | --- | --- |
| Runtime | Local | Local or Cloud |
| Transport | Standard I/O | HTTP (streaming responses) |
| Input source | file:// and https:// | https:// only |
| Visualization | Supported (saves annotated images locally) | Not supported (for now) |

Quick Start

1. Prepare an MCP client

Any MCP-compatible client works.

2. Get your API key

Request an API key on the DINO-X platform (new users get a free quota).

3. Configure MCP

Option A: Official Hosted Streamable HTTP (Recommended)

Add the following to your MCP client config, replacing your-api-key with your own key:

    {
      "mcpServers": {
        "dinox-mcp": {
          "url": "https://mcp.deepdataspace.com/mcp?key=your-api-key"
        }
      }
    }
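If you generate client configs from a script, the hosted entry can be built programmatically. The following is a minimal sketch, not part of the DINO-X MCP package: the helper name buildHostedConfig is hypothetical, and it assumes only the URL shape shown in the config above.

```typescript
// Hypothetical helper (not shipped with DINO-X MCP): builds the hosted
// Streamable HTTP config entry shown in Option A.
type HostedConfig = { mcpServers: { "dinox-mcp": { url: string } } };

function buildHostedConfig(apiKey: string): HostedConfig {
  if (apiKey.trim() === "") {
    throw new Error("apiKey must not be empty");
  }
  // encodeURIComponent keeps characters such as & or = from breaking the query string.
  const url = "https://mcp.deepdataspace.com/mcp?key=" + encodeURIComponent(apiKey);
  return { mcpServers: { "dinox-mcp": { url: url } } };
}

// Print the JSON to paste into an MCP client config.
console.log(JSON.stringify(buildHostedConfig("your-api-key"), null, 2));
```

Because the key travels in the URL, treat the generated config file as a secret.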

Option B: Use the NPM package locally (STDIO)

Install Node.js first

  • Download the installer from nodejs.org

  • Or use command:

    # macOS / Linux
    curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.1/install.sh | bash
    # or
    wget -qO- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.1/install.sh | bash

    # load nvm into current shell (choose the one you use)
    source ~/.bashrc || true
    source ~/.zshrc || true

    # install and use LTS Node.js
    nvm install --lts
    nvm use --lts

    # Windows (one of the following)
    winget install OpenJS.NodeJS.LTS
    # or with Chocolatey (in admin PowerShell)
    iwr -useb https://raw.githubusercontent.com/chocolatey/chocolatey/master/chocolateyInstall/InstallChocolatey.ps1 | iex
    choco install nodejs-lts -y

Configure your MCP client:

    {
      "mcpServers": {
        "dinox-mcp": {
          "command": "npx",
          "args": ["-y", "@deepdataspace/dinox-mcp"],
          "env": {
            "DINOX_API_KEY": "your-api-key-here",
            "IMAGE_STORAGE_DIRECTORY": "/path/to/your/image/directory"
          }
        }
      }
    }

Note: Replace your-api-key-here with your real key.
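The STDIO entry can be generated the same way. This is a sketch under stated assumptions: buildStdioConfig is a hypothetical helper, and it treats IMAGE_STORAGE_DIRECTORY as optional, as listed under Environment variables. It also rejects the placeholder key so that an unedited config fails early.

```typescript
// Hypothetical helper (not shipped with DINO-X MCP): builds the STDIO
// config entry for the npm package.
type StdioEntry = { command: string; args: string[]; env: Record<string, string> };
type StdioConfig = { mcpServers: { "dinox-mcp": StdioEntry } };

function buildStdioConfig(apiKey: string, imageDir?: string): StdioConfig {
  // Fail early if the placeholder from the README was left in place.
  if (apiKey === "" || apiKey === "your-api-key-here") {
    throw new Error("replace the placeholder with your real DINO-X API key");
  }
  const env: Record<string, string> = { DINOX_API_KEY: apiKey };
  // IMAGE_STORAGE_DIRECTORY is optional; only set it when a directory is given.
  if (imageDir !== undefined) {
    env["IMAGE_STORAGE_DIRECTORY"] = imageDir;
  }
  return {
    mcpServers: {
      "dinox-mcp": { command: "npx", args: ["-y", "@deepdataspace/dinox-mcp"], env: env },
    },
  };
}

console.log(JSON.stringify(buildStdioConfig("my-real-key", "/tmp/dinox"), null, 2));
```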

Option C: Run from source locally

Make sure Node.js is installed (see Option B), then:

    # clone
    git clone https://github.com/IDEA-Research/DINO-X-MCP.git
    cd DINO-X-MCP

    # install deps
    npm install

    # build
    npm run build

Configure your MCP client:

    {
      "mcpServers": {
        "dinox-mcp": {
          "command": "node",
          "args": ["/path/to/DINO-X-MCP/build/index.js"],
          "env": {
            "DINOX_API_KEY": "your-api-key-here",
            "IMAGE_STORAGE_DIRECTORY": "/path/to/your/image/directory"
          }
        }
      }
    }

CLI Flags & Environment Variables

  • Common flags

    • --http: start in Streamable HTTP mode (otherwise STDIO by default)
    • --stdio: force STDIO mode
    • --dinox-api-key=...: set API key
    • --enable-client-key: allow API key via URL ?key= (Streamable HTTP only)
    • --port=<port>: HTTP port (default 3020), e.g. --port=8080
  • Environment variables

    • DINOX_API_KEY (required unless the key is passed via --dinox-api-key or supplied by clients via ?key= with --enable-client-key): DINO-X platform API key
    • IMAGE_STORAGE_DIRECTORY (optional, STDIO): directory to save annotated images
    • AUTH_TOKEN (optional, HTTP): if set, client must send Authorization: Bearer <token>

    Examples:

    # STDIO (local)
    node build/index.js --dinox-api-key=your-api-key

    # Streamable HTTP (server provides a shared API key)
    node build/index.js --http --dinox-api-key=your-api-key

    # Streamable HTTP (custom port)
    node build/index.js --http --dinox-api-key=your-api-key --port=8080

    # Streamable HTTP (require client-provided API key via URL)
    node build/index.js --http --enable-client-key
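The flag combinations above can be assembled from a small options object, for example in a launcher script. This is a sketch only: LaunchOptions and buildServerArgs are hypothetical names, and rejecting --port or --enable-client-key outside HTTP mode is this helper's own validation, not documented server behavior.

```typescript
// Hypothetical launcher helper: turns options into the argument list
// for `node`, using only the flags documented above.
interface LaunchOptions {
  http?: boolean;            // --http (STDIO is the default otherwise)
  apiKey?: string;           // --dinox-api-key=...
  enableClientKey?: boolean; // --enable-client-key (Streamable HTTP only)
  port?: number;             // --port=... (server default is 3020)
}

function buildServerArgs(opts: LaunchOptions): string[] {
  const args: string[] = ["build/index.js"];
  if (opts.http) args.push("--http");
  if (opts.apiKey) args.push("--dinox-api-key=" + opts.apiKey);
  if (opts.enableClientKey) {
    // Assumption: this helper refuses the flag outside HTTP mode.
    if (!opts.http) throw new Error("--enable-client-key requires --http");
    args.push("--enable-client-key");
  }
  if (opts.port !== undefined) {
    // Assumption: a port is only meaningful for the HTTP transport.
    if (!opts.http) throw new Error("--port requires --http");
    args.push("--port=" + opts.port);
  }
  return args;
}

// Reproduces the "custom port" example.
console.log("node " + buildServerArgs({ http: true, apiKey: "your-api-key", port: 8080 }).join(" "));
```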

Client config when using ?key=:

    {
      "mcpServers": {
        "dinox-mcp": {
          "url": "http://localhost:3020/mcp?key=your-api-key"
        }
      }
    }

Using AUTH_TOKEN with a gateway that injects Authorization: Bearer <token>:

AUTH_TOKEN=my-token node build/index.js --http --enable-client-key

Client example with supergateway:

    {
      "mcpServers": {
        "dinox-mcp": {
          "command": "npx",
          "args": [
            "-y",
            "supergateway",
            "--streamableHttp",
            "http://localhost:3020/mcp?key=your-api-key",
            "--oauth2Bearer",
            "my-token"
          ]
        }
      }
    }
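To make the AUTH_TOKEN rule concrete, here is a sketch of the kind of check it implies: when a token is configured, a request passes only if its Authorization header carries the same bearer token. This illustrates the rule as described above and is not the server's actual code; isAuthorized is a hypothetical name.

```typescript
// Illustrative only: models "if AUTH_TOKEN is set, the client must send
// Authorization: Bearer <token>". Not taken from the DINO-X MCP source.
function isAuthorized(
  authorizationHeader: string | undefined,
  authToken: string | undefined
): boolean {
  // AUTH_TOKEN unset: no bearer check is applied.
  if (!authToken) return true;
  if (!authorizationHeader) return false;
  // The auth scheme name is case-insensitive; the token itself is not.
  const match = /^Bearer\s+(.+)$/i.exec(authorizationHeader.trim());
  return match !== null && match[1] === authToken;
}

console.log(isAuthorized("Bearer my-token", "my-token")); // matching token
console.log(isAuthorized(undefined, "my-token"));         // header missing
```

A production check would normally compare tokens in constant time; plain equality is used here to keep the sketch short.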

Tools

| Capability | Tool ID | Transport | Input | Output |
| --- | --- | --- | --- | --- |
| Full-scene object detection | detect-all-objects | STDIO / HTTP | Image URL | Category + bbox + (optional) captions |
| Text-prompted object detection | detect-objects-by-text | STDIO / HTTP | Image URL + English nouns (dot-separated for multiple, e.g., person.car) | Target object bbox + (optional) captions |
| Human pose estimation | detect-human-pose-keypoints | STDIO / HTTP | Image URL | 17 keypoints + bbox + (optional) captions |
| Visualization | visualize-detection-result | STDIO only | Image URL + detection results array | Local path to annotated image |
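The text-prompted tool expects multiple nouns joined with dots (person.car). The sketch below builds that prompt and wraps it in an MCP tools/call request. The helper names are hypothetical, and the argument names imageFileUri and textPrompt are assumptions made for illustration; check the input schema returned by tools/list for the real ones.

```typescript
// Joins English nouns into the dot-separated prompt format, e.g. "person.car".
function toTextPrompt(nouns: string[]): string {
  const cleaned = nouns.map((n) => n.trim()).filter((n) => n.length > 0);
  if (cleaned.length === 0) throw new Error("at least one noun is required");
  for (const n of cleaned) {
    // A dot inside a noun would be read as a separator.
    if (n.indexOf(".") !== -1) throw new Error("noun must not contain '.': " + n);
  }
  return cleaned.join(".");
}

// Builds a JSON-RPC 2.0 "tools/call" request as defined by MCP.
function buildDetectByTextCall(id: number, imageUrl: string, nouns: string[]) {
  return {
    jsonrpc: "2.0",
    id: id,
    method: "tools/call",
    params: {
      name: "detect-objects-by-text",
      // ASSUMPTION: argument names are illustrative, not confirmed by this README.
      arguments: { imageFileUri: imageUrl, textPrompt: toTextPrompt(nouns) },
    },
  };
}

console.log(JSON.stringify(
  buildDetectByTextCall(1, "https://example.com/street.jpg", ["person", "car"]), null, 2));
```

In practice an MCP client library sends this request for you; the sketch only shows what the dot-separated prompt looks like on the wire.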

🎬 Use Cases

| 🎯 Scenario | 📝 Input | ✨ Output |
| --- | --- | --- |
| Detection & Localization | 💬 Prompt: Detect and visualize the fire areas in the forest. 🖼️ Input image: 1-1 | Image: 1-2 |
| Object Counting | 💬 Prompt: Please analyze this warehouse image, detect all the cardboard boxes, count the total number. 🖼️ Input image: 2-1 | Image: 2-2 |
| Feature Detection | 💬 Prompt: Find all red cars in the image. 🖼️ Input image: 4-1 | Image: 4-2 |
| Attribute Reasoning | 💬 Prompt: Find the tallest person in the image, describe their clothing. 🖼️ Input image: 5-1 | Image: 5-2 |
| Full Scene Detection | 💬 Prompt: Find the fruit with the highest vitamin C content in the image. 🖼️ Input image: 6-1 | Image: 6-3. Answer: Kiwi fruit (93mg/100g) |
| Pose Analysis | 💬 Prompt: Please analyze what yoga pose this is. 🖼️ Input image: 3-1 | Image: 3-3 |
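The Object Counting scenario comes down to tallying detections per category once a detection tool has returned them. The sketch below assumes a result shape of one category string and one bounding box per object, which follows the "Category + bbox" output listed in the Tools table; the exact field names and the sample data are made up for illustration.

```typescript
// ASSUMPTION: field names are illustrative; the real tool output may differ.
interface Detection {
  category: string;
  bbox: [number, number, number, number];
}

// Counts how many detections fall into each category.
function countByCategory(detections: Detection[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const d of detections) {
    counts[d.category] = (counts[d.category] || 0) + 1;
  }
  return counts;
}

// Made-up sample resembling a warehouse image.
const sample: Detection[] = [
  { category: "cardboard box", bbox: [10, 20, 110, 120] },
  { category: "cardboard box", bbox: [130, 20, 230, 120] },
  { category: "pallet", bbox: [0, 125, 300, 160] },
];
console.log(countByCategory(sample));
```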

FAQ

  • Supported image sources?
    • STDIO: file:// and https://
    • Streamable HTTP: https:// only
  • Supported image formats?
    • jpg, jpeg, webp, png
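The source and format rules above can be checked on the client before calling a tool. This is a rough sketch: checkImageSource is a hypothetical helper, and judging the format by file extension is a heuristic of this example, since a URL without an extension may still point to a supported image.

```typescript
type Transport = "stdio" | "http";

const SUPPORTED_EXTENSIONS = ["jpg", "jpeg", "webp", "png"];

// Returns null when the URI looks acceptable, otherwise a short reason.
function checkImageSource(uri: string, transport: Transport): string | null {
  const lower = uri.toLowerCase();
  const isHttps = lower.indexOf("https://") === 0;
  const isFile = lower.indexOf("file://") === 0;
  // STDIO accepts file:// and https://; Streamable HTTP accepts https:// only.
  if (!isHttps && !(isFile && transport === "stdio")) {
    return transport === "stdio" ? "expected file:// or https://" : "expected https://";
  }
  // Drop any query string or fragment before looking at the extension.
  const path = lower.split(/[?#]/)[0];
  const dot = path.lastIndexOf(".");
  const ext = dot >= 0 ? path.slice(dot + 1) : "";
  if (SUPPORTED_EXTENSIONS.indexOf(ext) === -1) {
    return "unsupported format (expected jpg, jpeg, webp, or png)";
  }
  return null;
}

console.log(checkImageSource("file:///tmp/photo.png", "stdio")); // accepted
console.log(checkImageSource("file:///tmp/photo.png", "http"));  // rejected: scheme
```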

Development & Debugging

Use watch mode to auto-rebuild during development:

npm run watch

Use MCP Inspector for debugging:

npm run inspector

License

Apache License 2.0
