
Image Gen
Universal MCP server enabling AI image generation across multiple providers and chatbot clients
Empowering Universal Image Generation for AI Chatbots
Traditional AI chatbot interfaces are limited to text-only interactions, regardless of how powerful their underlying language models are. Image Gen MCP Server bridges this gap by enabling any LLM-powered chatbot client to generate professional-quality images through the standardized Model Context Protocol (MCP).
Whether you're using Claude Desktop, a custom ChatGPT interface, a Llama-based application, or any other LLM client that supports MCP, this server gives you access to multiple AI image generation models, including OpenAI's gpt-image-1, dall-e-3, and dall-e-2 and Google's Imagen series (imagen-4, imagen-4-ultra, imagen-3), turning text-only conversations into rich, visual experiences.
📦 Package Manager: This project uses UV for fast, reliable Python package management. UV provides better dependency resolution, faster installs, and proper environment isolation compared to traditional pip/venv workflows.
The AI ecosystem has evolved to include powerful language models from multiple providers (OpenAI, Anthropic, Meta, Google, etc.), but image generation capabilities remain fragmented and platform-specific, leaving most chatbot clients without a portable way to create images.
Image Gen MCP Server closes this gap by exposing multiple image generation providers behind a single, standard MCP interface that any compatible client can use.
Screenshot: Claude Desktop seamlessly generating images through MCP integration.
Sample output: high-quality images generated through the MCP server, demonstrating professional-grade results.
Key Advantage: Unlike platform-specific solutions, this universal approach means your image generation capabilities move with you across different tools and workflows, eliminating vendor lock-in and maximizing workflow efficiency.
Clone and setup:
```bash
git clone <repository-url>
cd image-gen-mcp
uv sync
```
Configure environment:
```bash
cp .env.example .env
# Edit .env and add your API keys:
# - PROVIDERS__OPENAI__API_KEY for OpenAI models
# - PROVIDERS__GEMINI__API_KEY for Gemini models (optional)
```
Test the setup:
```bash
uv run python scripts/dev.py setup
uv run python scripts/dev.py test
```
```bash
# HTTP transport for web development and testing
./run.sh dev

# HTTP transport with development tools (Redis Commander)
./run.sh dev --tools

# STDIO transport for Claude Desktop integration
./run.sh stdio

# Production deployment with monitoring
./run.sh prod
```
```bash
# STDIO transport (default) - for Claude Desktop
uv run python -m gpt_image_mcp.server

# HTTP transport - for web deployment
uv run python -m gpt_image_mcp.server --transport streamable-http --port 3001

# SSE transport - for real-time applications
uv run python -m gpt_image_mcp.server --transport sse --port 8080

# With custom configuration
uv run python -m gpt_image_mcp.server --config /path/to/.env --log-level DEBUG

# Enable CORS for web development
uv run python -m gpt_image_mcp.server --transport streamable-http --cors
```
```
uv run python -m gpt_image_mcp.server --help

Image Gen MCP Server - Generate and edit images using OpenAI's gpt-image-1 model

options:
  --config PATH        Path to configuration file (.env format)
  --log-level LEVEL    Set logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
  --transport TYPE     Transport method (stdio, sse, streamable-http)
  --port PORT          Port for HTTP transports (default: 3001)
  --host HOST          Host address for HTTP transports (default: 127.0.0.1)
  --cors               Enable CORS for web deployments
  --version            Show version information
  --help               Show help message

Examples:
  # Claude Desktop integration
  uv run python -m gpt_image_mcp.server

  # Web deployment with Redis cache
  uv run python -m gpt_image_mcp.server --transport streamable-http --port 3001

  # Development with debug logging and tools
  uv run python -m gpt_image_mcp.server --log-level DEBUG --cors
```
This server works with any MCP-compatible chatbot client. Here are configuration examples:
{ "mcpServers": { "image-gen-mcp": { "command": "uv", "args": [ "--directory", "/path/to/image-gen-mcp", "run", "image-gen-mcp" ], "env": { "OPENAI_API_KEY": "your-api-key-here" } } } }
{ "mcpServers": { "gpt-image": { "command": "uv", "args": ["--directory", "/path/to/image-gen-mcp", "run", "image-gen-mcp"], "env": { "OPENAI_API_KEY": "your-api-key-here" } } } }
For other MCP-compatible applications, use the standard MCP STDIO transport:
```bash
uv run python -m gpt_image_mcp.server
```
Universal Compatibility: This server follows the standard MCP protocol, ensuring compatibility with current and future MCP-enabled clients across the AI ecosystem.
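The examples below all use an MCP client session object (`session`). For reference, here is a minimal sketch of how such a session might be established over STDIO using the official MCP Python SDK; the directory path is a placeholder, and the command mirrors the Claude Desktop configuration above.

```python
# Minimal sketch: open an MCP session to this server over STDIO using the
# official MCP Python SDK. The directory path is a placeholder.
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

server_params = StdioServerParameters(
    command="uv",
    args=["--directory", "/path/to/image-gen-mcp", "run", "image-gen-mcp"],
)


async def main() -> None:
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # `session` can now be used as in the examples below.


asyncio.run(main())
```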
```python
# Use via MCP client
result = await session.call_tool(
    "generate_image",
    arguments={
        "prompt": "A beautiful sunset over mountains, digital art style",
        "quality": "high",
        "size": "1536x1024",
        "style": "vivid",
    },
)
```
```python
# Get optimized prompt for social media
prompt_result = await session.get_prompt(
    "social_media_prompt",
    arguments={
        "platform": "instagram",
        "content_type": "product announcement",
        "brand_style": "modern minimalist",
    },
)
```
```python
# Access via resource URI
image_data = await session.read_resource("generated-images://img_20250630143022_abc123")

# Check recent images
history = await session.read_resource("image-history://recent?limit=5")

# Storage statistics
stats = await session.read_resource("storage-stats://overview")
```
list_available_models
List all available image generation models and their capabilities.
Returns: Dictionary with model information, capabilities, and provider details.
generate_image
Generate images from text descriptions using any supported model.
Parameters:
- prompt (required): Text description of the desired image
- model (optional): Model to use (e.g., "gpt-image-1", "dall-e-3", "imagen-4")
- quality: "auto" | "high" | "medium" | "low" (default: "auto")
- size: "1024x1024" | "1536x1024" | "1024x1536" (default: "1536x1024")
- style: "vivid" | "natural" (default: "vivid")
- output_format: "png" | "jpeg" | "webp" (default: "png")
- background: "auto" | "transparent" | "opaque" (default: "auto")

Note: Parameter availability depends on the selected model. Use list_available_models to check capabilities, as shown in the sketch below.
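Since parameter support varies by model, one reasonable pattern is to query list_available_models first and choose arguments accordingly. A hedged sketch, reusing the `session` object from the examples above; the inspection step is illustrative, since the exact response structure is not documented here.

```python
# Sketch: check model capabilities before generating. Treat the structure of
# the list_available_models response as illustrative, not documented.
models = await session.call_tool("list_available_models", arguments={})

# After confirming (for example) that "dall-e-3" supports the "style"
# parameter, generate with model-appropriate arguments:
result = await session.call_tool(
    "generate_image",
    arguments={
        "prompt": "A minimalist logo on a white background",
        "model": "dall-e-3",
        "style": "natural",
    },
)
```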
edit_image
Edit existing images with text instructions.
Parameters:
- image_data (required): Base64-encoded image or data URL
- prompt (required): Edit instructions
- mask_data (optional): Mask for targeted editing
- size, quality, output_format: Same as generate_image (a combined generate-then-edit sketch appears after this section)

Generated images and related data are exposed as MCP resources:

- generated-images://{image_id} - Access specific generated images
- image-history://recent - Browse recent generation history
- storage-stats://overview - Storage usage and statistics
- model-info://gpt-image-1 - Model capabilities and pricing

Built-in prompt templates cover common use cases (for example, the social_media_prompt template shown earlier).
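Putting the pieces together, here is a hedged sketch of a generate-then-edit round trip using the same `session` object as above; the base64 payload is a truncated placeholder.

```python
# Sketch: generate an image, then edit it with a follow-up instruction.
# The image_data payload below is a truncated placeholder; in practice you
# would pass the full base64 string or data URL of the image to edit.
result = await session.call_tool(
    "generate_image",
    arguments={
        "prompt": "A ceramic mug on a wooden table, product photo",
        "background": "transparent",
    },
)

edited = await session.call_tool(
    "edit_image",
    arguments={
        "image_data": "data:image/png;base64,iVBORw0KGgo...",
        "prompt": "Add a sprig of mint next to the mug",
        "output_format": "png",
    },
)
```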
Configure via environment variables or a .env file:
```bash
# =============================================================================
# Provider Configuration
# =============================================================================

# OpenAI Provider (default enabled)
PROVIDERS__OPENAI__API_KEY=sk-your-openai-api-key-here
PROVIDERS__OPENAI__BASE_URL=https://api.openai.com/v1
PROVIDERS__OPENAI__ORGANIZATION=org-your-org-id
PROVIDERS__OPENAI__TIMEOUT=300.0
PROVIDERS__OPENAI__MAX_RETRIES=3
PROVIDERS__OPENAI__ENABLED=true

# Gemini Provider (default disabled)
PROVIDERS__GEMINI__API_KEY=your-gemini-api-key-here
PROVIDERS__GEMINI__BASE_URL=https://generativelanguage.googleapis.com/v1beta/
PROVIDERS__GEMINI__TIMEOUT=300.0
PROVIDERS__GEMINI__MAX_RETRIES=3
PROVIDERS__GEMINI__ENABLED=false
PROVIDERS__GEMINI__DEFAULT_MODEL=imagen-4

# =============================================================================
# Image Generation Settings
# =============================================================================
IMAGES__DEFAULT_MODEL=gpt-image-1
IMAGES__DEFAULT_QUALITY=auto
IMAGES__DEFAULT_SIZE=1536x1024
IMAGES__DEFAULT_STYLE=vivid
IMAGES__DEFAULT_MODERATION=auto
IMAGES__DEFAULT_OUTPUT_FORMAT=png

# Base URL for image hosting (e.g., https://cdn.example.com for nginx/CDN)
IMAGES__BASE_HOST=

# =============================================================================
# Server Configuration
# =============================================================================
SERVER__NAME=Image Gen MCP Server
SERVER__VERSION=0.1.0
SERVER__PORT=3001
SERVER__HOST=127.0.0.1
SERVER__LOG_LEVEL=INFO
SERVER__RATE_LIMIT_RPM=50

# =============================================================================
# Storage Configuration
# =============================================================================
STORAGE__BASE_PATH=./storage
STORAGE__RETENTION_DAYS=30
STORAGE__MAX_SIZE_GB=10.0
STORAGE__CLEANUP_INTERVAL_HOURS=24

# =============================================================================
# Cache Configuration
# =============================================================================
CACHE__ENABLED=true
CACHE__TTL_HOURS=24
CACHE__BACKEND=memory
CACHE__MAX_SIZE_MB=500
# CACHE__REDIS_URL=redis://localhost:6379
```
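The double-underscore variable names suggest nested settings groups. As a point of reference, here is a minimal pydantic-settings sketch of how such names could map to nested fields; this is a hypothetical illustration, not the project's actual config module.

```python
# Hypothetical sketch of how PROVIDERS__OPENAI__API_KEY-style variables map
# to nested settings via pydantic-settings' env_nested_delimiter. All class
# and field names here are illustrative.
from pydantic import BaseModel
from pydantic_settings import BaseSettings, SettingsConfigDict


class ProviderSettings(BaseModel):
    api_key: str = ""
    base_url: str = ""
    timeout: float = 300.0
    max_retries: int = 3
    enabled: bool = True


class Providers(BaseModel):
    openai: ProviderSettings = ProviderSettings()
    gemini: ProviderSettings = ProviderSettings(enabled=False)


class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env", env_nested_delimiter="__")

    providers: Providers = Providers()


# PROVIDERS__OPENAI__API_KEY -> settings.providers.openai.api_key
settings = Settings()
```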
The server supports production deployment with Docker, monitoring, and reverse proxy:
```bash
# Quick production deployment
./run.sh prod

# Manual Docker Compose deployment
docker-compose -f docker-compose.prod.yml up -d
```
The production stack runs the MCP server behind a reverse proxy, with monitoring services alongside.

Access Points:

- MCP server: http://localhost:3001 (behind proxy)
- Monitoring dashboard: http://localhost:3000
- Metrics endpoint: http://localhost:9090 (localhost only)

For VPS deployment with SSL, monitoring, and production hardening:
```bash
# Download deployment script
wget https://raw.githubusercontent.com/your-repo/image-gen-mcp/main/deploy/vps-setup.sh
chmod +x vps-setup.sh
./vps-setup.sh
```
Included features cover SSL setup, monitoring, and production hardening; see the VPS Deployment Guide for detailed instructions.
Available Docker Compose profiles:
```bash
# Development with HTTP transport
docker-compose -f docker-compose.dev.yml up

# Development with Redis Commander
docker-compose -f docker-compose.dev.yml --profile tools up

# STDIO transport for desktop integration
docker-compose -f docker-compose.dev.yml --profile stdio up

# Production with monitoring
docker-compose -f docker-compose.prod.yml up -d
```
```bash
# Setup development environment
uv run python scripts/dev.py setup

# Run tests
uv run python scripts/dev.py test

# Code quality and formatting
uv run python scripts/dev.py lint    # Check code quality with ruff and mypy
uv run python scripts/dev.py format  # Format code with black

# Run example client
uv run python scripts/dev.py example

# Development server with auto-reload
./run.sh dev --tools  # Includes Redis Commander UI
```
```bash
# Run full test suite
./run.sh test

# Run specific test categories
uv run pytest tests/unit/             # Unit tests only
uv run pytest tests/integration/      # Integration tests only
uv run pytest -v --cov=gpt_image_mcp  # With coverage
```
The server follows a modular, production-ready architecture:
Core Components:
- server.py: FastMCP-based MCP server with multi-transport support
- config/: Environment-based settings management with validation
- tools/: Image generation and editing capabilities
- resources/: MCP resources for data access and model registry
- storage/: Organized local image storage with cleanup
- utils/cache.py: Memory- and Redis-based caching system

Multi-Provider Architecture:

- providers/registry.py: Centralized provider and model management
- providers/base.py: Abstract base class for all providers (sketched below)
- providers/openai.py: OpenAI API integration with retry logic
- providers/gemini.py: Google Gemini API integration
- types/: Pydantic models for type safety
- utils/validators.py: Input validation and sanitization

Infrastructure:

- prompts/: Template system for optimized prompts

Deployment: Docker Compose configurations (docker-compose.dev.yml, docker-compose.prod.yml), the run.sh helper script, and VPS setup scripts under deploy/.
The server provides cost estimation for image generation operations, comprehensive error handling for provider and validation failures, and built-in security measures such as input validation and sanitization (utils/validators.py) and per-minute rate limiting (SERVER__RATE_LIMIT_RPM).
MIT License - see LICENSE file for details.
For issues and questions, please open an issue on the project repository.
Built with ❤️ using the Model Context Protocol and OpenAI's gpt-image-1
The Model Context Protocol represents a paradigm shift towards standardized AI tool integration. As more LLM clients adopt MCP support, servers like this one become increasingly valuable by providing universal capabilities across the entire ecosystem.
MCP adoption is already underway: Claude Desktop and a growing number of third-party LLM clients support the protocol today.
Vision: A future where AI capabilities are modular, interoperable, and user-controlled rather than locked to specific platforms.
🌟 Building the Universal AI Ecosystem
Democratizing advanced AI capabilities across all platforms through the power of the Model Context Protocol. One server, infinite possibilities.