
Transport: STDIO. Lets Claude Desktop generate images through Google Gemini.
Welcome to the Gemini MCP Server, the first MCP server with Smart Tool Intelligence - a revolutionary self-learning system that adapts to your preferences and improves over time. This comprehensive platform provides 7 AI-powered tools with automatic prompt enhancement and context awareness.
    git clone https://github.com/Garblesnarff/gemini-mcp-server.git
    cd gemini-mcp-server
    npm install
    cp .env.example .env

Edit .env and add your API key:

    GEMINI_API_KEY=your_actual_api_key_here
    # Optional
    OUTPUT_DIR=/path/to/your/output/directory
    # Optional
    DEBUG=false
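As a rough sketch of how these three variables might be read at startup, the helper below validates the key and fills in defaults. The function name `loadConfig` is hypothetical and is not taken from the repository's `config.js`; the default output directory is the one this README lists for `$OUTPUT_DIR`.

```javascript
const os = require('os');
const path = require('path');

// Hypothetical helper (not the repository's config.js): reads the three
// variables from the .env example and applies defaults.
function loadConfig(env = process.env) {
  const apiKey = (env.GEMINI_API_KEY || '').trim();
  if (!apiKey) {
    throw new Error('Missing GEMINI_API_KEY: copy .env.example to .env and set it');
  }
  return {
    apiKey,
    // Default directory as documented for $OUTPUT_DIR in this README.
    outputDir: env.OUTPUT_DIR || path.join(os.homedir(), 'Claude', 'gemini-images'),
    // Only the literal string "true" switches debug logging on.
    debug: env.DEBUG === 'true',
  };
}
```

Passing the environment as an argument keeps the function easy to exercise without touching the real `process.env`.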
    npm start
    # or, for development with debug logging:
    npm run dev

Add to your Claude Desktop config (claude_desktop_config.json):

    {
      "mcpServers": {
        "gemini": {
          "command": "node",
          "args": ["/path/to/gemini-mcp-server/gemini-server.js"],
          "env": {
            "GEMINI_API_KEY": "your_api_key_here"
          }
        }
      }
    }
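If the config file already lists other servers, the entry has to be merged in rather than pasted over them. The sketch below shows one way to do that on an in-memory object; `addGeminiServer` and the `other` server entry are hypothetical names used only for illustration.

```javascript
// Hypothetical helper: returns a copy of a Claude Desktop config object with
// the "gemini" entry added, leaving any other configured servers untouched.
function addGeminiServer(config, serverPath, apiKey) {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers || {}),
      gemini: {
        command: 'node',
        args: [serverPath],
        env: { GEMINI_API_KEY: apiKey },
      },
    },
  };
}

// Example input with one pre-existing (made-up) server entry.
const existing = { mcpServers: { other: { command: 'node', args: ['/path/to/other.js'] } } };
const updated = addGeminiServer(
  existing,
  '/path/to/gemini-mcp-server/gemini-server.js',
  'your_api_key_here'
);
console.log(JSON.stringify(updated, null, 2));
```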
generate_image
Generate images from text descriptions using Gemini 2.0 Flash.

Parameters:
- prompt (string, required) - Description of the image to generate
- context (string, optional) - Context for Smart Tool Intelligence enhancement

Example:

    {
      "prompt": "A serene mountain landscape at sunset with vibrant colors",
      "context": "artistic"
    }

Returns:

    {
      "content": [
        { "type": "text", "text": "Generated a beautiful mountain landscape image." },
        { "type": "image", "data": "base64_image_data", "mimeType": "image/png" }
      ]
    }
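A client that receives a result in the shape shown under "Returns" needs to decode the base64 payload before saving or displaying it. The following is a minimal sketch under that assumption; `extractImages` is a hypothetical helper and the sample payload is placeholder data, not a real image.

```javascript
// Hypothetical helper: collects every image item from a tool result shaped
// like the "Returns" example and decodes its base64 payload into a Buffer.
function extractImages(result) {
  return (result.content || [])
    .filter((item) => item.type === 'image' && typeof item.data === 'string')
    .map((item) => ({
      mimeType: item.mimeType,
      bytes: Buffer.from(item.data, 'base64'),
    }));
}

const sample = {
  content: [
    { type: 'text', text: 'Generated an image.' },
    // Placeholder payload: base64 of the ASCII string "hello", not a real PNG.
    { type: 'image', data: 'aGVsbG8=', mimeType: 'image/png' },
  ],
};

const images = extractImages(sample);
console.log(images.length, images[0].mimeType, images[0].bytes.length);
// prints: 1 image/png 5
```

The decoded `bytes` buffer can then be written to disk with `fs.writeFileSync`.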
gemini-edit-image
Edit existing images using natural language instructions.

Parameters:
- image_path (string, required) - Path to the image file to edit
- edit_instruction (string, required) - Description of desired changes
- context (string, optional) - Context for enhancement

Example:

    {
      "image_path": "/path/to/image.jpg",
      "edit_instruction": "Add shooting stars to the night sky",
      "context": "artistic"
    }
gemini-chat
Interactive conversations with Gemini AI that learns your preferences.

Parameters:
- message (string, required) - Your message or question
- context (string, optional) - Context for Smart Tool Intelligence

Example (the "consciousness" context applies the academic rigor enhancement):

    {
      "message": "Explain quantum computing in simple terms",
      "context": "consciousness"
    }
gemini-transcribe-audio
Convert audio files to text with Smart Tool Intelligence enhancement.

Parameters:
- file_path (string, required) - Path to audio file (MP3, WAV, FLAC, AAC, OGG, WEBM, M4A)
- language (string, optional) - Language hint for better accuracy
- context (string, optional) - Use "verbatim" for exact word-for-word transcription
- preserve_spelled_acronyms (boolean, optional) - Keep U-R-L instead of URL

Example (Standard):

    {
      "file_path": "/path/to/audio.mp3",
      "language": "en"
    }

Example (Verbatim Mode), where the "verbatim" context requests an exact word-for-word transcription:

    {
      "file_path": "/path/to/audio.mp3",
      "context": "verbatim",
      "preserve_spelled_acronyms": true
    }
Verbatim Mode Features:
gemini-code-execute
Execute Python code in a secure sandbox environment.

Parameters:
- code (string, required) - Python code to execute
- context (string, optional) - Context for enhancement

Example:

    {
      "code": "import pandas as pd\ndata = {'x': [1,2,3], 'y': [4,5,6]}\ndf = pd.DataFrame(data)\nprint(df.describe())",
      "context": "code"
    }
gemini-analyze-video
Analyze video content for summaries, transcripts, and detailed insights.

Parameters:
- file_path (string, required) - Path to video file (MP4, MOV, AVI, WEBM, MKV, FLV)
- analysis_type (string, optional) - "summary", "transcript", "objects", "detailed", "custom"
- context (string, optional) - Context for enhancement

Example:

    {
      "file_path": "/path/to/video.mp4",
      "analysis_type": "detailed"
    }
gemini-analyze-image
Extract detailed information from images including objects, text, and descriptions.

Parameters:
- file_path (string, required) - Path to image file (JPEG, PNG, WebP, HEIC, HEIF, BMP, GIF)
- analysis_type (string, optional) - "summary", "objects", "text", "detailed", "custom"
- context (string, optional) - Context for enhancement

Example:

    {
      "file_path": "/path/to/image.jpg",
      "analysis_type": "objects"
    }
The Smart Tool Intelligence system is the first of its kind in the MCP ecosystem. It automatically:
The system recognizes these contexts and applies appropriate enhancements:
- consciousness - Adds academic rigor, citations, detailed explanations
- code - Includes practical examples, working code, best practices
- debugging - Focuses on root cause analysis and specific fixes
- general - Applies comprehensive, structured responses
- verbatim - For audio transcription, provides exact word-for-word output

Preferences are stored internally at ./data/tool-preferences.json with automatic migration from external storage.
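To make the idea of a JSON-backed preference file concrete, here is a minimal sketch of a store that appends pattern records to a single file and filters them by context. The class name `JsonPreferenceStore` and the `{ patterns: [...] }` file layout are assumptions for illustration; the actual schema of tool-preferences.json and the migration logic are not described in this README.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical minimal store; the repository's preference-store.js and its
// on-disk schema may differ from this sketch.
class JsonPreferenceStore {
  constructor(filePath) {
    this.filePath = filePath;
  }

  // Returns the parsed file, or an empty structure if it does not exist yet.
  load() {
    if (!fs.existsSync(this.filePath)) return { patterns: [] };
    const data = JSON.parse(fs.readFileSync(this.filePath, 'utf8'));
    if (!Array.isArray(data.patterns)) data.patterns = [];
    return data;
  }

  // Appends one pattern record and rewrites the whole file.
  storePattern(original, enhanced, context, toolName) {
    const data = this.load();
    data.patterns.push({
      original,
      enhanced,
      context,
      toolName,
      storedAt: new Date().toISOString(),
    });
    fs.mkdirSync(path.dirname(this.filePath), { recursive: true });
    fs.writeFileSync(this.filePath, JSON.stringify(data, null, 2));
  }

  // Returns every stored record for the given context.
  getPatterns(context) {
    return this.load().patterns.filter((p) => p.context === context);
  }
}
```

Rewriting the whole file on every call is simple but not safe under concurrent writers; a real implementation would likely need locking or atomic renames.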
Want to add this revolutionary capability to your own MCP server? Here's how:
    // src/intelligence/context-detector.js
    class ContextDetector {
      detectContext(prompt, toolName) {
        // Implement pattern matching for different contexts
        if (this.isConsciousnessContext(prompt)) return 'consciousness';
        if (this.isCodeContext(prompt)) return 'code';
        if (this.isDebuggingContext(prompt)) return 'debugging';
        return 'general';
      }
    }

    // src/intelligence/prompt-enhancer.js
    class PromptEnhancer {
      enhancePrompt(originalPrompt, context, toolName) {
        // Apply context-specific enhancements
        const enhancement = this.getEnhancementForContext(context);
        return `${originalPrompt}\n\n${enhancement}`;
      }
    }

    // src/intelligence/preference-store.js
    class PreferencesManager {
      async storePattern(original, enhanced, context, toolName, success) {
        // Store successful patterns for future learning
      }

      async getPatterns(context) {
        // Retrieve learned patterns for context
      }
    }

    // In your tool's execute method:
    async execute(args) {
      const intelligence = IntelligenceSystem.getInstance();

      // Detect context and enhance prompt
      const context = args.context ||
        intelligence.contextDetector.detectContext(args.prompt, this.name);
      const enhancedPrompt = await intelligence.enhancePrompt(args.prompt, context, this.name);

      // Execute with enhanced prompt
      const result = await this.geminiService.generateContent(enhancedPrompt);

      // Store successful pattern
      await intelligence.storeSuccessfulPattern(args.prompt, enhancedPrompt, context, this.name);

      return result;
    }
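The skeleton above leaves the pattern checks and the enhancement text unspecified. As a minimal runnable sketch, assuming a handful of illustrative regular expressions and enhancement strings (all hypothetical, none taken from the repository), detection and enhancement could look like this. The contexts are checked in the same order as in `detectContext` above.

```javascript
// Hypothetical patterns, for illustration only. Checked in order:
// consciousness, then code, then debugging, falling back to general.
const CONTEXT_PATTERNS = [
  ['consciousness', [/\bconsciousness\b/i, /\bqualia\b/i]],
  ['code', [/\bfunction\b/i, /\bimplement\b/i, /\brefactor\b/i]],
  ['debugging', [/\berror\b/i, /\bstack trace\b/i, /\bnot working\b/i]],
];

// Hypothetical enhancement strings, loosely paraphrasing the context list.
const ENHANCEMENTS = {
  consciousness: 'Use academic rigor and cite sources where possible.',
  code: 'Include a working example and note best practices.',
  debugging: 'Identify the root cause and propose a specific fix.',
  general: 'Give a comprehensive, structured answer.',
};

// Returns the first context whose patterns match, or 'general'.
function detectContext(prompt) {
  for (const [context, patterns] of CONTEXT_PATTERNS) {
    if (patterns.some((pattern) => pattern.test(prompt))) return context;
  }
  return 'general';
}

// Appends the context-specific instruction after a blank line.
function enhancePrompt(originalPrompt, context) {
  const enhancement = ENHANCEMENTS[context] || ENHANCEMENTS.general;
  return `${originalPrompt}\n\n${enhancement}`;
}

const prompt = 'I get a stack trace on startup';
console.log(enhancePrompt(prompt, detectContext(prompt)));
```

Because the first match wins, a prompt that mentions both a function and an error is classified as `code` here; a real detector might score all contexts instead.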
Study these files from this repository:
- src/intelligence/index.js - Main intelligence coordinator
- src/intelligence/context-detector.js - Context recognition logic
- src/intelligence/prompt-enhancer.js - Enhancement application
- src/intelligence/preference-store.js - Pattern storage and retrieval
- src/tools/base-tool.js - Integration with tool execution

    # Test basic functionality
    npm test

    # Test Smart Tool Intelligence
    node test-tool-intelligence-full.js

    # Test internal storage
    node test-internal-storage.js

    # Test verbatim transcription
    node test-verbatim-mode.js

    # Test image generation
    echo '{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"generate_image","arguments":{"prompt":"A cute robot reading a book"}}}' | node gemini-server.js

    # Test chat with consciousness context
    echo '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"gemini-chat","arguments":{"message":"What is consciousness?","context":"consciousness"}}}' | node gemini-server.js
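Hand-writing JSON inside a shell string is easy to get wrong once arguments contain quotes. As a small sketch, the same request can be built in Node and piped to the server; `buildToolCall` and the script name `build-request.js` are hypothetical.

```javascript
// Hypothetical helper: builds one newline-terminated JSON-RPC 2.0 message
// for a tools/call request, matching the shape used in the echo commands.
function buildToolCall(id, name, args) {
  return JSON.stringify({
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name, arguments: args },
  }) + '\n';
}

const line = buildToolCall(1, 'generate_image', {
  prompt: 'A cute robot reading a book',
});
process.stdout.write(line);
```

Saved as `build-request.js`, this could be run as `node build-request.js | node gemini-server.js`.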
src/
├── server.js # MCP protocol handler
├── config.js # Configuration management
├── tools/ # Tool implementations
│ ├── index.js # Tool registry & dispatcher
│ ├── base-tool.js # Abstract base class
│ ├── chat.js # Chat tool
│ ├── image-generation.js # Image generation tool
│ ├── image-editing.js # Image editing tool
│ ├── audio-transcription.js # Audio transcription tool
│ ├── code-execution.js # Code execution tool
│ ├── video-analysis.js # Video analysis tool
│ └── image-analysis.js # Image analysis tool
├── intelligence/ # Smart Tool Intelligence
│ ├── index.js # Intelligence coordinator
│ ├── context-detector.js # Context recognition
│ ├── prompt-enhancer.js # Prompt enhancement
│ └── preference-store.js # Pattern storage
├── gemini/ # Gemini API integration
│ ├── gemini-service.js # API service layer
│ └── request-handler.js # Request formatting
└── utils/ # Utilities
    ├── logger.js # Logging system
    └── file-utils.js # File operations
    // In src/intelligence/context-detector.js
    isMyCustomContext(prompt) {
      const patterns = [
        /custom pattern 1/i,
        /custom pattern 2/i
      ];
      return patterns.some(pattern => pattern.test(prompt));
    }

    // In src/intelligence/prompt-enhancer.js
    getEnhancementForContext(context) {
      const enhancements = {
        'my_custom_context': 'Apply my custom enhancement instructions here.',
        // ... other contexts
      };
      return enhancements[context] || enhancements.general;
    }
1. Create src/tools/my-new-tool.js
2. Extend the BaseTool class
3. Implement the execute method with intelligence integration
4. Register the tool in src/tools/index.js

    // src/tools/my-new-tool.js
    class MyNewTool extends BaseTool {
      constructor(geminiService, intelligenceSystem) {
        super('my-new-tool', 'Description of my tool', geminiService, intelligenceSystem);
      }

      async execute(args) {
        // Use intelligence system for enhancement
        const context = args.context || this.detectContext(args.input);
        const enhancedPrompt = await this.enhancePrompt(args.input, context);

        // Your tool logic here
        const result = await this.geminiService.someMethod(enhancedPrompt);

        // Store successful pattern
        await this.storeSuccessfulPattern(args.input, enhancedPrompt, context);

        return result;
      }
    }
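The example above covers the tool class but not the registration step in src/tools/index.js. As a rough sketch of what a registry and dispatcher might look like, the code below keeps tools in a `Map` keyed by name. `ToolRegistry` is a hypothetical class and the stand-in tool object only mimics the `name`/`execute` shape of `MyNewTool`; the repository's actual registry may be organized differently.

```javascript
// Hypothetical registry; the actual src/tools/index.js may differ.
class ToolRegistry {
  constructor() {
    this.tools = new Map();
  }

  // Adds a tool, refusing duplicates so a name always maps to one tool.
  register(tool) {
    if (this.tools.has(tool.name)) {
      throw new Error(`Tool already registered: ${tool.name}`);
    }
    this.tools.set(tool.name, tool);
  }

  // Looks up a tool by name and forwards the arguments to its execute method.
  async dispatch(name, args) {
    const tool = this.tools.get(name);
    if (!tool) throw new Error(`Unknown tool: ${name}`);
    return tool.execute(args);
  }
}

// Stand-in tool with the same name/execute shape as MyNewTool.
const registry = new ToolRegistry();
registry.register({
  name: 'my-new-tool',
  async execute(args) {
    return { content: [{ type: 'text', text: `echo: ${args.input}` }] };
  },
});

registry.dispatch('my-new-tool', { input: 'hello' }).then((result) => {
  console.log(result.content[0].text); // prints: echo: hello
});
```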
"Missing GEMINI_API_KEY" Error
    # Ensure .env file exists and contains your API key
    cp .env.example .env
    # Edit .env and add: GEMINI_API_KEY=your_key_here

"File not found" Errors

    # Ensure file paths are absolute and files exist
    # Check file permissions and formats
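The checks listed for "File not found" errors can be scripted before calling a tool. Below is a minimal sketch using Node's `fs` and `path` modules; `checkInputFile` is a hypothetical helper, and the extension list is copied from the gemini-transcribe-audio parameter description.

```javascript
const fs = require('fs');
const path = require('path');

// Extensions copied from the gemini-transcribe-audio parameter description.
const AUDIO_EXTENSIONS = ['.mp3', '.wav', '.flac', '.aac', '.ogg', '.webm', '.m4a'];

// Hypothetical helper: returns a list of human-readable problems.
// An empty list means the path passed every check.
function checkInputFile(filePath, allowedExtensions) {
  const problems = [];
  if (!path.isAbsolute(filePath)) problems.push('path is not absolute');
  if (!allowedExtensions.includes(path.extname(filePath).toLowerCase())) {
    problems.push('unsupported file extension');
  }
  if (!fs.existsSync(filePath)) {
    problems.push('file does not exist');
  } else {
    try {
      // Throws if the current user cannot read the file.
      fs.accessSync(filePath, fs.constants.R_OK);
    } catch {
      problems.push('file is not readable');
    }
  }
  return problems;
}

console.log(checkInputFile('relative/audio.txt', AUDIO_EXTENSIONS));
```

Returning all problems at once, rather than throwing on the first, makes it easier to report everything wrong with a path in a single message.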
Intelligence System Not Learning
    # Check data directory permissions
    ls -la data/
    # Verify tool-preferences.json is writable

To run with debug logging:

    DEBUG=true npm start
    # or
    npm run dev
- Preferences: ./data/tool-preferences.json
- Generated images: $OUTPUT_DIR (default: ~/Claude/gemini-images)

We welcome contributions! This project represents a new paradigm in MCP server development.

    git clone https://github.com/Garblesnarff/gemini-mcp-server.git
    cd gemini-mcp-server
    npm install
    npm run dev
This is the first MCP server that truly learns and adapts. Traditional MCP servers are static - they do the same thing every time. Our Smart Tool Intelligence system represents a paradigm shift toward AI tools that become more helpful over time.
For Users: Better results with less effort as the system learns your preferences.
For Developers: A blueprint for building truly intelligent, adaptive AI tools.
For the MCP Ecosystem: A new standard for what MCP servers can become.
This project is licensed under the MIT License - feel free to use, modify, and distribute.
Built with:
Ready to experience the future of MCP servers? Get started now and watch your AI tools become smarter with every interaction! 🚀