Gemini Claude
STDIO: Enables Claude Desktop to generate images through Google Gemini
Welcome to the Gemini MCP Server, the first MCP server with Smart Tool Intelligence - a revolutionary self-learning system that adapts to your preferences and improves over time. This comprehensive platform provides 7 AI-powered tools with automatic prompt enhancement and context awareness.
git clone https://github.com/Garblesnarff/gemini-mcp-server.git
cd gemini-mcp-server
npm install
cp .env.example .env
Edit .env and add your API key:
GEMINI_API_KEY=your_actual_api_key_here
OUTPUT_DIR=/path/to/your/output/directory  # Optional
DEBUG=false  # Optional
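The three variables above could be read and validated at startup roughly as follows. This is a minimal sketch, not the project's actual src/config.js: the `loadConfig` name and the returned field names are assumptions, and the fallback output directory is the default this README gives for `$OUTPUT_DIR`.

```javascript
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: the real src/config.js may be organized differently.
function loadConfig(env = process.env) {
  const apiKey = (env.GEMINI_API_KEY || '').trim();
  if (!apiKey) {
    // Mirrors the "Missing GEMINI_API_KEY" error named in this README.
    throw new Error('Missing GEMINI_API_KEY');
  }
  return {
    apiKey,
    // Optional: fall back to the README's stated default (~/Claude/gemini-images).
    outputDir: env.OUTPUT_DIR || path.join(os.homedir(), 'Claude', 'gemini-images'),
    // Optional: anything other than the string "true" (any case) disables debug.
    debug: String(env.DEBUG).toLowerCase() === 'true',
  };
}
```

Failing fast on a missing key keeps the error close to its cause instead of surfacing later as a failed API call.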
npm start
# or, for development with debug logging:
npm run dev
Add to your Claude Desktop config (claude_desktop_config.json):
{
  "mcpServers": {
    "gemini": {
      "command": "node",
      "args": ["/path/to/gemini-mcp-server/gemini-server.js"],
      "env": {
        "GEMINI_API_KEY": "your_api_key_here"
      }
    }
  }
}
generate_image: Generate images from text descriptions using Gemini 2.0 Flash.
Parameters:
prompt (string, required) - Description of the image to generate
context (string, optional) - Context for Smart Tool Intelligence enhancement
Example:
{
  "prompt": "A serene mountain landscape at sunset with vibrant colors",
  "context": "artistic"
}
Returns:
{
  "content": [
    {
      "type": "text",
      "text": "Generated a beautiful mountain landscape image."
    },
    {
      "type": "image",
      "data": "base64_image_data",
      "mimeType": "image/png"
    }
  ]
}
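A client that receives a result shaped like the one above might write the image parts to disk as sketched here. The `saveImages` helper, the file naming scheme, and the MIME-to-extension table are illustrative assumptions; only the `content` array layout comes from the example response.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Assumed mapping from MIME type to file extension; extend as needed.
const EXTENSIONS = { 'image/png': 'png', 'image/jpeg': 'jpg', 'image/webp': 'webp' };

// Hypothetical helper: decodes every image part of a tool result into a file.
function saveImages(result, outputDir) {
  fs.mkdirSync(outputDir, { recursive: true });
  const saved = [];
  for (const item of result.content || []) {
    if (item.type !== 'image') continue; // skip text parts
    const ext = EXTENSIONS[item.mimeType] || 'bin';
    const filePath = path.join(outputDir, `image-${saved.length + 1}.${ext}`);
    fs.writeFileSync(filePath, Buffer.from(item.data, 'base64'));
    saved.push(filePath);
  }
  return saved; // paths of the files written, in order
}
```

Text parts are ignored here; a real client would usually show them alongside the saved image.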
gemini-edit-image: Edit existing images using natural language instructions.
Parameters:
image_path (string, required) - Path to the image file to edit
edit_instruction (string, required) - Description of desired changes
context (string, optional) - Context for enhancement
Example:
{
  "image_path": "/path/to/image.jpg",
  "edit_instruction": "Add shooting stars to the night sky",
  "context": "artistic"
}
gemini-chat: Interactive conversations with Gemini AI that learns your preferences.
Parameters:
message (string, required) - Your message or question
context (string, optional) - Context for Smart Tool Intelligence
Example (the "consciousness" context applies the academic-rigor enhancement):
{
  "message": "Explain quantum computing in simple terms",
  "context": "consciousness"
}
gemini-transcribe-audio: Convert audio files to text with Smart Tool Intelligence enhancement.
Parameters:
file_path (string, required) - Path to audio file (MP3, WAV, FLAC, AAC, OGG, WEBM, M4A)
language (string, optional) - Language hint for better accuracy
context (string, optional) - Use "verbatim" for exact word-for-word transcription
preserve_spelled_acronyms (boolean, optional) - Keep spelled-out acronyms such as U-R-L instead of collapsing them to URL
Example (Standard):
{
  "file_path": "/path/to/audio.mp3",
  "language": "en"
}
Example (Verbatim Mode; the "verbatim" context returns an exact word-for-word transcription):
{
  "file_path": "/path/to/audio.mp3",
  "context": "verbatim",
  "preserve_spelled_acronyms": true
}
Verbatim Mode Features:
gemini-code-execute: Execute Python code in a secure sandbox environment.
Parameters:
code (string, required) - Python code to execute
context (string, optional) - Context for enhancement
Example:
{
  "code": "import pandas as pd\ndata = {'x': [1,2,3], 'y': [4,5,6]}\ndf = pd.DataFrame(data)\nprint(df.describe())",
  "context": "code"
}
gemini-analyze-video: Analyze video content for summaries, transcripts, and detailed insights.
Parameters:
file_path (string, required) - Path to video file (MP4, MOV, AVI, WEBM, MKV, FLV)
analysis_type (string, optional) - "summary", "transcript", "objects", "detailed", "custom"
context (string, optional) - Context for enhancement
Example:
{
  "file_path": "/path/to/video.mp4",
  "analysis_type": "detailed"
}
gemini-analyze-image: Extract detailed information from images including objects, text, and descriptions.
Parameters:
file_path (string, required) - Path to image file (JPEG, PNG, WebP, HEIC, HEIF, BMP, GIF)
analysis_type (string, optional) - "summary", "objects", "text", "detailed", "custom"
context (string, optional) - Context for enhancement
Example:
{
  "file_path": "/path/to/image.jpg",
  "analysis_type": "objects"
}
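Since the audio, video, and image tools each accept a fixed set of formats, a caller could check the extension before sending a request. The sketch below transcribes the format lists from the parameter descriptions; the `isSupportedFile` helper is hypothetical, and treating `.jpg` as JPEG is an assumption based on the example paths.

```javascript
const path = require('node:path');

// Extension lists copied from the tool parameter descriptions in this README.
// 'jpg' is added as an assumed alias for JPEG.
const SUPPORTED = {
  'gemini-transcribe-audio': ['mp3', 'wav', 'flac', 'aac', 'ogg', 'webm', 'm4a'],
  'gemini-analyze-video': ['mp4', 'mov', 'avi', 'webm', 'mkv', 'flv'],
  'gemini-analyze-image': ['jpeg', 'jpg', 'png', 'webp', 'heic', 'heif', 'bmp', 'gif'],
};

// Hypothetical pre-flight check: true if the file extension is listed for the tool.
function isSupportedFile(toolName, filePath) {
  const allowed = SUPPORTED[toolName];
  if (!allowed) throw new Error(`Unknown tool: ${toolName}`);
  const ext = path.extname(filePath).slice(1).toLowerCase();
  return allowed.includes(ext);
}
```

This only inspects the file name; the server may still reject a file whose contents do not match its extension.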
The Smart Tool Intelligence system is the first of its kind in the MCP ecosystem. It automatically:
The system recognizes these contexts and applies appropriate enhancements:
consciousness - Adds academic rigor, citations, detailed explanations
code - Includes practical examples, working code, best practices
debugging - Focuses on root cause analysis and specific fixes
general - Applies comprehensive, structured responses
verbatim - For audio transcription, provides exact word-for-word output
Preferences are stored internally at ./data/tool-preferences.json with automatic migration from external storage.
Want to add this revolutionary capability to your own MCP server? Here's how:
// src/intelligence/context-detector.js
class ContextDetector {
  detectContext(prompt, toolName) {
    // Implement pattern matching for different contexts
    if (this.isConsciousnessContext(prompt)) return 'consciousness';
    if (this.isCodeContext(prompt)) return 'code';
    if (this.isDebuggingContext(prompt)) return 'debugging';
    return 'general';
  }
}

// src/intelligence/prompt-enhancer.js
class PromptEnhancer {
  enhancePrompt(originalPrompt, context, toolName) {
    // Apply context-specific enhancements
    const enhancement = this.getEnhancementForContext(context);
    return `${originalPrompt}\n\n${enhancement}`;
  }
}

// src/intelligence/preference-store.js
class PreferencesManager {
  async storePattern(original, enhanced, context, toolName, success) {
    // Store successful patterns for future learning
  }

  async getPatterns(context) {
    // Retrieve learned patterns for context
  }
}
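The skeleton above leaves the pattern checks and enhancement lookup unimplemented. One way to fill them in is sketched below as two plain functions. The regular expressions are invented for illustration and are not the project's actual patterns; the enhancement strings paraphrase the context descriptions in this README.

```javascript
// Hypothetical patterns, checked in the same order as the skeleton:
// consciousness, then code, then debugging. The first match wins.
const CONTEXT_PATTERNS = [
  ['consciousness', /\b(consciousness|awareness|qualia|sentience)\b/i],
  ['code', /\b(function|class|code|script|refactor)\b/i],
  ['debugging', /\b(error|bug|stack trace|exception|crash)\b/i],
];

// Enhancement text paraphrased from the context list in this README.
const ENHANCEMENTS = {
  consciousness: 'Respond with academic rigor, citations, and detailed explanations.',
  code: 'Include practical examples, working code, and best practices.',
  debugging: 'Focus on root cause analysis and specific fixes.',
  general: 'Give a comprehensive, structured response.',
};

function detectContext(prompt) {
  const match = CONTEXT_PATTERNS.find(([, pattern]) => pattern.test(prompt));
  return match ? match[0] : 'general';
}

function enhancePrompt(prompt, context = detectContext(prompt)) {
  // Unknown contexts fall back to the general enhancement.
  const enhancement = ENHANCEMENTS[context] || ENHANCEMENTS.general;
  return `${prompt}\n\n${enhancement}`;
}
```

Because the first matching pattern wins, a prompt that mentions both code and an error is classified as code here; reordering the table changes that priority.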
// In your tool's execute method:
async execute(args) {
  const intelligence = IntelligenceSystem.getInstance();

  // Detect context and enhance prompt
  const context = args.context ||
    intelligence.contextDetector.detectContext(args.prompt, this.name);
  const enhancedPrompt = await intelligence.enhancePrompt(args.prompt, context, this.name);

  // Execute with enhanced prompt
  const result = await this.geminiService.generateContent(enhancedPrompt);

  // Store successful pattern
  await intelligence.storeSuccessfulPattern(args.prompt, enhancedPrompt, context, this.name);

  return result;
}
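The execute method above relies on an `IntelligenceSystem` singleton that exposes `contextDetector`, `enhancePrompt`, and `storeSuccessfulPattern`. The following in-memory stand-in shows one shape such a coordinator could take. It is a sketch under assumptions: the real src/intelligence/index.js may differ, the detection rule is a placeholder, and patterns are kept in an array rather than persisted to ./data/tool-preferences.json.

```javascript
// Hypothetical in-memory coordinator matching the calls made by execute().
class IntelligenceSystem {
  static instance = null;

  static getInstance() {
    if (!IntelligenceSystem.instance) {
      IntelligenceSystem.instance = new IntelligenceSystem();
    }
    return IntelligenceSystem.instance;
  }

  constructor() {
    this.patterns = []; // learned patterns, memory only in this sketch
    this.contextDetector = {
      // Placeholder rule; a real detector would use richer pattern matching.
      detectContext: (prompt, toolName) => (/\bcode\b/i.test(prompt) ? 'code' : 'general'),
    };
  }

  async enhancePrompt(prompt, context, toolName) {
    const hint = context === 'code'
      ? 'Include practical examples and working code.'
      : 'Give a comprehensive, structured response.';
    return `${prompt}\n\n${hint}`;
  }

  async storeSuccessfulPattern(original, enhanced, context, toolName) {
    this.patterns.push({ original, enhanced, context, toolName, storedAt: Date.now() });
  }
}
```

A singleton keeps learned patterns shared across all tools in the process, which is what lets one tool benefit from patterns stored by another.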
Study these files from this repository:
src/intelligence/index.js - Main intelligence coordinator
src/intelligence/context-detector.js - Context recognition logic
src/intelligence/prompt-enhancer.js - Enhancement application
src/intelligence/preference-store.js - Pattern storage and retrieval
src/tools/base-tool.js - Integration with tool execution

# Test basic functionality
npm test
# Test Smart Tool Intelligence
node test-tool-intelligence-full.js
# Test internal storage
node test-internal-storage.js
# Test verbatim transcription
node test-verbatim-mode.js
# Test image generation
echo '{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"generate_image","arguments":{"prompt":"A cute robot reading a book"}}}' | node gemini-server.js
# Test chat with consciousness context
echo '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"gemini-chat","arguments":{"message":"What is consciousness?","context":"consciousness"}}}' | node gemini-server.js
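Hand-writing these JSON-RPC payloads is error-prone, so a small builder can help when scripting manual tests. The `buildToolCall` function below is a hypothetical convenience, not part of the repository; the message shape is taken directly from the echo commands above.

```javascript
// Builds a single-line JSON-RPC 2.0 "tools/call" request, as used in the echo tests.
function buildToolCall(id, name, args = {}) {
  return JSON.stringify({
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name, arguments: args },
  });
}

// Example: print a request that can be piped into the server's stdin.
console.log(buildToolCall(1, 'generate_image', { prompt: 'A cute robot reading a book' }));
```

Saved to a file (say, a hypothetical make-request.js), the output can be piped the same way as the echo commands: `node make-request.js | node gemini-server.js`.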
src/
├── server.js              # MCP protocol handler
├── config.js              # Configuration management
├── tools/                 # Tool implementations
│   ├── index.js           # Tool registry & dispatcher
│   ├── base-tool.js       # Abstract base class
│   ├── chat.js            # Chat tool
│   ├── image-generation.js # Image generation tool
│   ├── image-editing.js   # Image editing tool
│   ├── audio-transcription.js # Audio transcription tool
│   ├── code-execution.js  # Code execution tool
│   ├── video-analysis.js  # Video analysis tool
│   └── image-analysis.js  # Image analysis tool
├── intelligence/          # Smart Tool Intelligence
│   ├── index.js           # Intelligence coordinator
│   ├── context-detector.js # Context recognition
│   ├── prompt-enhancer.js # Prompt enhancement
│   └── preference-store.js # Pattern storage
├── gemini/               # Gemini API integration
│   ├── gemini-service.js # API service layer
│   └── request-handler.js # Request formatting
└── utils/                # Utilities
    ├── logger.js         # Logging system
    └── file-utils.js     # File operations
// In src/intelligence/context-detector.js
isMyCustomContext(prompt) {
  const patterns = [
    /custom pattern 1/i,
    /custom pattern 2/i
  ];
  return patterns.some(pattern => pattern.test(prompt));
}

// In src/intelligence/prompt-enhancer.js
getEnhancementForContext(context) {
  const enhancements = {
    'my_custom_context': 'Apply my custom enhancement instructions here.',
    // ... other contexts
  };
  return enhancements[context] || enhancements.general;
}
1. Create src/tools/my-new-tool.js
2. Extend the BaseTool class
3. Implement the execute method with intelligence integration
4. Register the tool in src/tools/index.js

// src/tools/my-new-tool.js
class MyNewTool extends BaseTool {
  constructor(geminiService, intelligenceSystem) {
    super('my-new-tool', 'Description of my tool', geminiService, intelligenceSystem);
  }

  async execute(args) {
    // Use intelligence system for enhancement
    const context = args.context || this.detectContext(args.input);
    const enhancedPrompt = await this.enhancePrompt(args.input, context);

    // Your tool logic here
    const result = await this.geminiService.someMethod(enhancedPrompt);

    // Store successful pattern
    await this.storeSuccessfulPattern(args.input, enhancedPrompt, context);

    return result;
  }
}
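Registering the tool in src/tools/index.js is not shown in this README. Assuming the registry maps tool names to instances and dispatches calls by name, it might look roughly like the sketch below. `ToolRegistry`, `register`, and `dispatch` are hypothetical names, and the project's actual registry may work differently.

```javascript
// Hypothetical registry and dispatcher; not the project's actual src/tools/index.js.
class ToolRegistry {
  constructor() {
    this.tools = new Map(); // tool name -> tool instance
  }

  register(tool) {
    if (this.tools.has(tool.name)) {
      throw new Error(`Tool already registered: ${tool.name}`);
    }
    this.tools.set(tool.name, tool);
    return this; // allows chained register() calls
  }

  async dispatch(name, args) {
    const tool = this.tools.get(name);
    if (!tool) throw new Error(`Unknown tool: ${name}`);
    return tool.execute(args);
  }
}
```

Under this sketch, any object with a `name` and an async `execute(args)` method can be registered, for example `registry.register(new MyNewTool(geminiService, intelligenceSystem))`.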
"Missing GEMINI_API_KEY" Error
# Ensure .env file exists and contains your API key
cp .env.example .env
# Edit .env and add: GEMINI_API_KEY=your_key_here
"File not found" Errors
# Ensure file paths are absolute and files exist
# Check file permissions and formats
Intelligence System Not Learning
# Check data directory permissions
ls -la data/
# Verify tool-preferences.json is writable
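The same writability check can be done from Node when `ls` output is hard to interpret. This is a small diagnostic sketch; the `checkPreferencesWritable` name is an assumption, and the default path is the preferences file location this README gives, relative to the current working directory.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical diagnostic: reports whether the preferences file can be written.
function checkPreferencesWritable(file = path.join('data', 'tool-preferences.json')) {
  // If the file does not exist yet, its directory must be writable so it can be created.
  const target = fs.existsSync(file) ? file : path.dirname(file);
  try {
    fs.accessSync(target, fs.constants.W_OK);
    return { ok: true, target };
  } catch (err) {
    // err.code is e.g. 'EACCES' (permission denied) or 'ENOENT' (directory missing).
    return { ok: false, target, reason: err.code };
  }
}
```

Run it from the server's root directory so the relative `data/` path resolves to the same location the server uses.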
DEBUG=true npm start
# or
npm run dev
Preferences: ./data/tool-preferences.json
Output directory: $OUTPUT_DIR (default: ~/Claude/gemini-images)
We welcome contributions! This project represents a new paradigm in MCP server development.
git clone https://github.com/Garblesnarff/gemini-mcp-server.git
cd gemini-mcp-server
npm install
npm run dev
This is the first MCP server that truly learns and adapts. Traditional MCP servers are static - they do the same thing every time. Our Smart Tool Intelligence system represents a paradigm shift toward AI tools that become more helpful over time.
For Users: Better results with less effort as the system learns your preferences.
For Developers: A blueprint for building truly intelligent, adaptive AI tools.
For the MCP Ecosystem: A new standard for what MCP servers can become.
This project is licensed under the MIT License - feel free to use, modify, and distribute.
Built with:
Ready to experience the future of MCP servers? Get started now and watch your AI tools become smarter with every interaction! 🚀