
MCP server enabling Claude Desktop to generate images with Google's Gemini AI models
Welcome to the Gemini MCP Server, the first MCP server with Smart Tool Intelligence - a revolutionary self-learning system that adapts to your preferences and improves over time. This comprehensive platform provides 7 AI-powered tools with automatic prompt enhancement and context awareness.
    git clone https://github.com/Garblesnarff/gemini-mcp-server.git
    cd gemini-mcp-server
    npm install
    cp .env.example .env
Edit .env and add your API key:
    GEMINI_API_KEY=your_actual_api_key_here
    OUTPUT_DIR=/path/to/your/output/directory  # Optional
    DEBUG=false  # Optional
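As a rough illustration of how these three variables might be consumed, the sketch below reads them from an environment object. The name `loadConfig` and the rule that only the string `true` enables debugging are assumptions made for this example, not the server's actual config module; the default output directory is the one listed under file locations in this README.

```javascript
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: turns environment variables into a config object.
function loadConfig(env = process.env) {
  if (!env.GEMINI_API_KEY) {
    throw new Error('Missing GEMINI_API_KEY');
  }
  return {
    apiKey: env.GEMINI_API_KEY,
    // OUTPUT_DIR is optional; fall back to the documented default.
    outputDir: env.OUTPUT_DIR || path.join(os.homedir(), 'Claude', 'gemini-images'),
    // DEBUG is optional; in this sketch only the literal string "true" enables it.
    debug: env.DEBUG === 'true',
  };
}

const config = loadConfig({ GEMINI_API_KEY: 'test-key', DEBUG: 'false' });
console.log(config);
```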
    npm start
    # or for development with debug logging:
    npm run dev
Add to your Claude Desktop config (claude_desktop_config.json):
    {
      "mcpServers": {
        "gemini": {
          "command": "node",
          "args": ["/path/to/gemini-mcp-server/gemini-server.js"],
          "env": {
            "GEMINI_API_KEY": "your_api_key_here"
          }
        }
      }
    }
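If the config file already lists other servers, the `gemini` entry has to be merged in rather than pasted over them. The sketch below shows one way to do that on an in-memory object; `addGeminiServer` is a hypothetical helper for illustration and does not read or write the real config file.

```javascript
// Hypothetical helper: returns a copy of a Claude Desktop config object
// with the "gemini" server entry added, leaving other servers untouched.
function addGeminiServer(config, serverPath, apiKey) {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers || {}),
      gemini: {
        command: 'node',
        args: [serverPath],
        env: { GEMINI_API_KEY: apiKey },
      },
    },
  };
}

// "other" is a made-up existing entry used only to show that it survives the merge.
const existing = { mcpServers: { other: { command: 'node', args: ['other.js'] } } };
const updated = addGeminiServer(
  existing,
  '/path/to/gemini-mcp-server/gemini-server.js',
  'your_api_key_here'
);
console.log(JSON.stringify(updated, null, 2));
```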
generate_image
Generate images from text descriptions using Gemini 2.0 Flash.
Parameters:
- prompt (string, required) - Description of the image to generate
- context (string, optional) - Context for Smart Tool Intelligence enhancement
Example:
    {
      "prompt": "A serene mountain landscape at sunset with vibrant colors",
      "context": "artistic"
    }
Returns:
    {
      "content": [
        {
          "type": "text",
          "text": "Generated a beautiful mountain landscape image."
        },
        {
          "type": "image",
          "data": "base64_image_data",
          "mimeType": "image/png"
        }
      ]
    }
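A client that receives this response usually wants the raw image bytes. The sketch below, which assumes only the response shape shown above, decodes the base64 payload of every image part; `extractImages` is a hypothetical helper name and the sample payload is a stand-in string, not real PNG data.

```javascript
// Hypothetical helper: collects the image parts of a tool result
// and decodes their base64 payloads into Buffers.
function extractImages(result) {
  return result.content
    .filter((part) => part.type === 'image')
    .map((part) => ({
      mimeType: part.mimeType,
      bytes: Buffer.from(part.data, 'base64'),
    }));
}

// Stand-in result: the payload is the base64 encoding of the string "fake-png".
const sampleResult = {
  content: [
    { type: 'text', text: 'Generated a beautiful mountain landscape image.' },
    { type: 'image', data: Buffer.from('fake-png').toString('base64'), mimeType: 'image/png' },
  ],
};

const images = extractImages(sampleResult);
console.log(images.length, images[0].mimeType);
```

The decoded `bytes` buffer can then be written to disk with `fs.writeFileSync`.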
gemini-edit-image
Edit existing images using natural language instructions.
Parameters:
- image_path (string, required) - Path to the image file to edit
- edit_instruction (string, required) - Description of desired changes
- context (string, optional) - Context for enhancement
Example:
    {
      "image_path": "/path/to/image.jpg",
      "edit_instruction": "Add shooting stars to the night sky",
      "context": "artistic"
    }
gemini-chat
Interactive conversations with Gemini AI that learns your preferences.
Parameters:
- message (string, required) - Your message or question
- context (string, optional) - Context for Smart Tool Intelligence
Example:
    {
      "message": "Explain quantum computing in simple terms",
      "context": "consciousness" // Will apply academic rigor enhancement
    }
gemini-transcribe-audio
Convert audio files to text with Smart Tool Intelligence enhancement.
Parameters:
- file_path (string, required) - Path to audio file (MP3, WAV, FLAC, AAC, OGG, WEBM, M4A)
- language (string, optional) - Language hint for better accuracy
- context (string, optional) - Use "verbatim" for exact word-for-word transcription
- preserve_spelled_acronyms (boolean, optional) - Keep U-R-L instead of URL
Example (Standard):
    {
      "file_path": "/path/to/audio.mp3",
      "language": "en"
    }
Example (Verbatim Mode):
    {
      "file_path": "/path/to/audio.mp3",
      "context": "verbatim", // Gets exact word-for-word transcription
      "preserve_spelled_acronyms": true
    }
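The two examples differ only in which optional fields are set, so a caller can build either from one function. The sketch below is a hypothetical client-side helper: it checks the file extension against the formats listed above before building the arguments. Whether the server itself validates by extension is an assumption, not something this README states.

```javascript
const path = require('node:path');

// Extensions taken from the format list in the file_path description.
const AUDIO_EXTENSIONS = new Set(['.mp3', '.wav', '.flac', '.aac', '.ogg', '.webm', '.m4a']);

// Hypothetical helper: builds the arguments object for gemini-transcribe-audio.
function buildTranscribeArgs(filePath, { language, verbatim = false } = {}) {
  const ext = path.extname(filePath).toLowerCase();
  if (!AUDIO_EXTENSIONS.has(ext)) {
    throw new Error(`Unsupported audio format: ${ext || '(none)'}`);
  }
  const args = { file_path: filePath };
  if (language) args.language = language;
  if (verbatim) {
    // Verbatim mode is selected through the context parameter.
    args.context = 'verbatim';
    args.preserve_spelled_acronyms = true;
  }
  return args;
}

const standardArgs = buildTranscribeArgs('/path/to/audio.mp3', { language: 'en' });
const verbatimArgs = buildTranscribeArgs('/path/to/audio.mp3', { verbatim: true });
console.log(standardArgs, verbatimArgs);
```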
Verbatim Mode Features:
gemini-code-execute
Execute Python code in a secure sandbox environment.
Parameters:
- code (string, required) - Python code to execute
- context (string, optional) - Context for enhancement
Example:
    {
      "code": "import pandas as pd\ndata = {'x': [1,2,3], 'y': [4,5,6]}\ndf = pd.DataFrame(data)\nprint(df.describe())",
      "context": "code"
    }
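Because `code` is a single JSON string, every line break in the Python source has to appear as an escaped `\n`. Writing that by hand is error-prone; the sketch below shows how `JSON.stringify` produces the escaping from readable source lines. The variable names are illustrative only.

```javascript
// The Python source, written as readable lines.
const pythonLines = [
  'import pandas as pd',
  "data = {'x': [1,2,3], 'y': [4,5,6]}",
  'df = pd.DataFrame(data)',
  'print(df.describe())',
];

// Joining with "\n" gives a real multi-line string; JSON.stringify then
// escapes each newline as the two characters backslash + n.
const codeArgs = { code: pythonLines.join('\n'), context: 'code' };
const codeArgsJson = JSON.stringify(codeArgs);
console.log(codeArgsJson);
```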
gemini-analyze-video
Analyze video content for summaries, transcripts, and detailed insights.
Parameters:
- file_path (string, required) - Path to video file (MP4, MOV, AVI, WEBM, MKV, FLV)
- analysis_type (string, optional) - "summary", "transcript", "objects", "detailed", "custom"
- context (string, optional) - Context for enhancement
Example:
    {
      "file_path": "/path/to/video.mp4",
      "analysis_type": "detailed"
    }
gemini-analyze-image
Extract detailed information from images including objects, text, and descriptions.
Parameters:
- file_path (string, required) - Path to image file (JPEG, PNG, WebP, HEIC, HEIF, BMP, GIF)
- analysis_type (string, optional) - "summary", "objects", "text", "detailed", "custom"
- context (string, optional) - Context for enhancement
Example:
    {
      "file_path": "/path/to/image.jpg",
      "analysis_type": "objects"
    }
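The video and image tools accept overlapping but not identical `analysis_type` values: "transcript" is video-only and "text" is image-only. A small client-side check, sketched below with a hypothetical helper name, can catch a mismatched value before the request is sent.

```javascript
// Allowed values copied from the parameter descriptions of the two tools.
const ANALYSIS_TYPES = {
  'gemini-analyze-video': ['summary', 'transcript', 'objects', 'detailed', 'custom'],
  'gemini-analyze-image': ['summary', 'objects', 'text', 'detailed', 'custom'],
};

// Hypothetical helper: true when analysisType is omitted (it is optional)
// or is one of the values listed for the given tool.
function isValidAnalysisType(toolName, analysisType) {
  const allowed = ANALYSIS_TYPES[toolName];
  if (!allowed) return false;
  return analysisType === undefined || allowed.includes(analysisType);
}

console.log(isValidAnalysisType('gemini-analyze-video', 'transcript')); // true
console.log(isValidAnalysisType('gemini-analyze-image', 'transcript')); // false
```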
The Smart Tool Intelligence system is the first of its kind in the MCP ecosystem. It automatically:
The system recognizes these contexts and applies appropriate enhancements:
- consciousness - Adds academic rigor, citations, detailed explanations
- code - Includes practical examples, working code, best practices
- debugging - Focuses on root cause analysis and specific fixes
- general - Applies comprehensive, structured responses
- verbatim - For audio transcription, provides exact word-for-word output
Preferences are stored internally at ./data/tool-preferences.json with automatic migration from external storage.
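To make the idea of a JSON-backed preference file concrete, here is a minimal sketch of a store that appends learned patterns per context. The on-disk layout of tool-preferences.json is not documented in this README, so the shape used here (an object mapping context names to arrays of patterns) and the class name `JsonPreferenceStore` are assumptions for illustration only.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical store: assumes a plain object mapping context names
// to arrays of pattern records. The real file layout may differ.
class JsonPreferenceStore {
  constructor(filePath) {
    this.filePath = filePath;
  }

  // Returns the stored data, or an empty object if the file does not exist yet.
  load() {
    if (!fs.existsSync(this.filePath)) return {};
    return JSON.parse(fs.readFileSync(this.filePath, 'utf8'));
  }

  // Appends one pattern under the given context and rewrites the file.
  addPattern(context, pattern) {
    const data = this.load();
    data[context] = data[context] || [];
    data[context].push(pattern);
    fs.mkdirSync(path.dirname(this.filePath), { recursive: true });
    fs.writeFileSync(this.filePath, JSON.stringify(data, null, 2));
  }
}

// Use a temporary directory so the sketch does not touch ./data.
const tempDir = fs.mkdtempSync(path.join(os.tmpdir(), 'prefs-'));
const store = new JsonPreferenceStore(path.join(tempDir, 'data', 'tool-preferences.json'));
store.addPattern('code', { original: 'sort a list', enhanced: 'sort a list\n\nInclude working code.' });
console.log(store.load());
```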
Want to add this revolutionary capability to your own MCP server? Here's how:
    // src/intelligence/context-detector.js
    class ContextDetector {
      detectContext(prompt, toolName) {
        // Implement pattern matching for different contexts
        if (this.isConsciousnessContext(prompt)) return 'consciousness';
        if (this.isCodeContext(prompt)) return 'code';
        if (this.isDebuggingContext(prompt)) return 'debugging';
        return 'general';
      }
    }

    // src/intelligence/prompt-enhancer.js
    class PromptEnhancer {
      enhancePrompt(originalPrompt, context, toolName) {
        // Apply context-specific enhancements
        const enhancement = this.getEnhancementForContext(context);
        return `${originalPrompt}\n\n${enhancement}`;
      }
    }

    // src/intelligence/preference-store.js
    class PreferencesManager {
      async storePattern(original, enhanced, context, toolName, success) {
        // Store successful patterns for future learning
      }

      async getPatterns(context) {
        // Retrieve learned patterns for context
      }
    }
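The skeleton above leaves the pattern matching unimplemented. One simple way to fill it in is an ordered list of keyword patterns, checked in the same order as the skeleton (consciousness, code, debugging). The keywords below are invented for this sketch; the repository's real detection rules may differ.

```javascript
// Hypothetical keyword patterns, listed in the order they are checked.
const CONTEXT_PATTERNS = [
  ['consciousness', /\b(consciousness|awareness)\b/i],
  ['code', /\b(function|class|code|script)\b/i],
  ['debugging', /\b(error|bug|stack trace|crash)\b/i],
];

// Returns the first context whose pattern matches, or "general" if none do.
function detectContext(prompt) {
  for (const [context, pattern] of CONTEXT_PATTERNS) {
    if (pattern.test(prompt)) return context;
  }
  return 'general';
}

console.log(detectContext('What is consciousness?'));              // consciousness
console.log(detectContext('Write a function that sorts numbers')); // code
console.log(detectContext('Fix this bug'));                        // debugging
console.log(detectContext('Tell me a story'));                     // general
```

Because the first match wins, the order of the list decides what happens when a prompt matches more than one context.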
    // In your tool's execute method:
    async execute(args) {
      const intelligence = IntelligenceSystem.getInstance();

      // Detect context and enhance prompt
      const context = args.context || intelligence.contextDetector.detectContext(args.prompt, this.name);
      const enhancedPrompt = await intelligence.enhancePrompt(args.prompt, context, this.name);

      // Execute with enhanced prompt
      const result = await this.geminiService.generateContent(enhancedPrompt);

      // Store successful pattern
      await intelligence.storeSuccessfulPattern(args.prompt, enhancedPrompt, context, this.name);

      return result;
    }
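To see the detect, enhance, call, store sequence run end to end without an API key, the sketch below replaces the intelligence system and the Gemini service with in-memory stubs. Everything in the stubs (the detection rule, the enhancement text, the echoing service) is made up for the demonstration; only the order of the four steps comes from the method above.

```javascript
// Stubs standing in for the real intelligence system and Gemini service.
const storedPatterns = [];
const intelligence = {
  contextDetector: {
    detectContext: (prompt) => (/code/i.test(prompt) ? 'code' : 'general'),
  },
  enhancePrompt: async (prompt, context) => `${prompt}\n\n[${context} enhancement]`,
  storeSuccessfulPattern: async (...entry) => {
    storedPatterns.push(entry);
  },
};
const geminiService = {
  // Echoes the prompt back so the enhancement is visible in the result.
  generateContent: async (prompt) => ({ echoed: prompt }),
};

// Same sequence as the execute method: detect, enhance, call, store.
async function execute(args, toolName) {
  const context = args.context || intelligence.contextDetector.detectContext(args.prompt, toolName);
  const enhancedPrompt = await intelligence.enhancePrompt(args.prompt, context, toolName);
  const result = await geminiService.generateContent(enhancedPrompt);
  await intelligence.storeSuccessfulPattern(args.prompt, enhancedPrompt, context, toolName);
  return result;
}

const pendingResult = execute({ prompt: 'Review my code' }, 'gemini-chat');
pendingResult.then((result) => console.log(result.echoed));
```

An explicit `args.context` bypasses detection entirely, which is how a caller forces a specific enhancement.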
Study these files from this repository:
- src/intelligence/index.js - Main intelligence coordinator
- src/intelligence/context-detector.js - Context recognition logic
- src/intelligence/prompt-enhancer.js - Enhancement application
- src/intelligence/preference-store.js - Pattern storage and retrieval
- src/tools/base-tool.js - Integration with tool execution

    # Test basic functionality
    npm test

    # Test Smart Tool Intelligence
    node test-tool-intelligence-full.js

    # Test internal storage
    node test-internal-storage.js

    # Test verbatim transcription
    node test-verbatim-mode.js
    # Test image generation
    echo '{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"generate_image","arguments":{"prompt":"A cute robot reading a book"}}}' | node gemini-server.js

    # Test chat with consciousness context
    echo '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"gemini-chat","arguments":{"message":"What is consciousness?","context":"consciousness"}}}' | node gemini-server.js
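Hand-writing these JSON-RPC lines gets tedious once the arguments contain quotes or newlines. The sketch below builds the same single-line request programmatically; `buildToolCall` is a hypothetical helper, and the envelope fields are exactly those used in the commands above.

```javascript
// Hypothetical helper: builds one JSON-RPC 2.0 "tools/call" request line.
function buildToolCall(id, name, args) {
  return JSON.stringify({
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name, arguments: args },
  });
}

const requestLine = buildToolCall(1, 'generate_image', {
  prompt: 'A cute robot reading a book',
});
console.log(requestLine);
```

The printed line can be piped to `node gemini-server.js` in place of the `echo` string.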
src/
├── server.js # MCP protocol handler
├── config.js # Configuration management
├── tools/ # Tool implementations
│ ├── index.js # Tool registry & dispatcher
│ ├── base-tool.js # Abstract base class
│ ├── chat.js # Chat tool
│ ├── image-generation.js # Image generation tool
│ ├── image-editing.js # Image editing tool
│ ├── audio-transcription.js # Audio transcription tool
│ ├── code-execution.js # Code execution tool
│ ├── video-analysis.js # Video analysis tool
│ └── image-analysis.js # Image analysis tool
├── intelligence/ # Smart Tool Intelligence
│ ├── index.js # Intelligence coordinator
│ ├── context-detector.js # Context recognition
│ ├── prompt-enhancer.js # Prompt enhancement
│ └── preference-store.js # Pattern storage
├── gemini/ # Gemini API integration
│ ├── gemini-service.js # API service layer
│ └── request-handler.js # Request formatting
└── utils/ # Utilities
    ├── logger.js # Logging system
    └── file-utils.js # File operations
    // In src/intelligence/context-detector.js
    isMyCustomContext(prompt) {
      const patterns = [
        /custom pattern 1/i,
        /custom pattern 2/i
      ];
      return patterns.some(pattern => pattern.test(prompt));
    }

    // In src/intelligence/prompt-enhancer.js
    getEnhancementForContext(context) {
      const enhancements = {
        'my_custom_context': 'Apply my custom enhancement instructions here.',
        // ... other contexts
      };
      return enhancements[context] || enhancements.general;
    }
1. Create src/tools/my-new-tool.js
2. Extend the BaseTool class
3. Implement the execute method with intelligence integration
4. Register the tool in src/tools/index.js
    // src/tools/my-new-tool.js
    class MyNewTool extends BaseTool {
      constructor(geminiService, intelligenceSystem) {
        super('my-new-tool', 'Description of my tool', geminiService, intelligenceSystem);
      }

      async execute(args) {
        // Use intelligence system for enhancement
        const context = args.context || this.detectContext(args.input);
        const enhancedPrompt = await this.enhancePrompt(args.input, context);

        // Your tool logic here
        const result = await this.geminiService.someMethod(enhancedPrompt);

        // Store successful pattern
        await this.storeSuccessfulPattern(args.input, enhancedPrompt, context);

        return result;
      }
    }
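Registration in src/tools/index.js is not shown in this README. As a rough idea of what a registry and dispatcher can look like, the sketch below keeps tools in a `Map` keyed by name. The `ToolRegistry` class and its method names are assumptions; the actual module may be organized differently.

```javascript
// Hypothetical registry; the actual src/tools/index.js may differ.
class ToolRegistry {
  constructor() {
    this.tools = new Map();
  }

  // Adds a tool, refusing duplicates so one name maps to one implementation.
  register(tool) {
    if (this.tools.has(tool.name)) {
      throw new Error(`Tool already registered: ${tool.name}`);
    }
    this.tools.set(tool.name, tool);
  }

  // Looks up the tool by name and forwards the arguments to execute().
  async dispatch(name, args) {
    const tool = this.tools.get(name);
    if (!tool) throw new Error(`Unknown tool: ${name}`);
    return tool.execute(args);
  }
}

// A stand-in object with the same name/execute shape as a BaseTool subclass.
const registry = new ToolRegistry();
registry.register({ name: 'my-new-tool', execute: async (args) => `handled ${args.input}` });

const dispatched = registry.dispatch('my-new-tool', { input: 'hello' });
dispatched.then((output) => console.log(output));
```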
"Missing GEMINI_API_KEY" Error
    # Ensure .env file exists and contains your API key
    cp .env.example .env
    # Edit .env and add: GEMINI_API_KEY=your_key_here
"File not found" Errors
    # Ensure file paths are absolute and files exist
    # Check file permissions and formats
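Both conditions can be checked before calling a tool. The sketch below is a hypothetical pre-flight helper using Node's standard `path` and `fs` modules; it reports which of the two checks a given path fails.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical pre-flight check: returns a list of problems with a file path.
function checkFilePath(filePath) {
  const problems = [];
  if (!path.isAbsolute(filePath)) problems.push('path is not absolute');
  if (!fs.existsSync(filePath)) problems.push('file does not exist');
  return problems;
}

// A relative path to a file that is assumed not to exist fails both checks.
console.log(checkFilePath('relative/missing-file.mp3'));
// process.execPath is the absolute path of the running node binary, so it passes.
console.log(checkFilePath(process.execPath));
```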
Intelligence System Not Learning
    # Check data directory permissions
    ls -la data/
    # Verify tool-preferences.json is writable
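The same writability check can be done from Node, which is useful when the server runs under a different user than your shell. The sketch below uses `fs.accessSync` with the `W_OK` flag; `isWritable` is a hypothetical helper name.

```javascript
const fs = require('node:fs');
const os = require('node:os');

// Returns true when the current process may write to the given path.
function isWritable(targetPath) {
  try {
    fs.accessSync(targetPath, fs.constants.W_OK);
    return true;
  } catch (err) {
    // accessSync throws if the path is missing or not writable.
    return false;
  }
}

// The system temp directory is used here only as a path expected to be writable.
console.log(isWritable(os.tmpdir()));
console.log(isWritable('./data/tool-preferences.json'));
```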
    DEBUG=true npm start
    # or
    npm run dev
- Preferences: ./data/tool-preferences.json
- Output directory: $OUTPUT_DIR (default: ~/Claude/gemini-images)
We welcome contributions! This project represents a new paradigm in MCP server development.
    git clone https://github.com/Garblesnarff/gemini-mcp-server.git
    cd gemini-mcp-server
    npm install
    npm run dev
This is the first MCP server that truly learns and adapts. Traditional MCP servers are static - they do the same thing every time. Our Smart Tool Intelligence system represents a paradigm shift toward AI tools that become more helpful over time.
For Users: Better results with less effort as the system learns your preferences.
For Developers: A blueprint for building truly intelligent, adaptive AI tools.
For the MCP Ecosystem: A new standard for what MCP servers can become.
This project is licensed under the MIT License - feel free to use, modify, and distribute.
Built with:
Ready to experience the future of MCP servers? Get started now and watch your AI tools become smarter with every interaction! 🚀