Gemini MCP Server with Smart Tool Intelligence

Welcome to the Gemini MCP Server, the first MCP server with Smart Tool Intelligence - a revolutionary self-learning system that adapts to your preferences and improves over time. This comprehensive platform provides 7 AI-powered tools with automatic prompt enhancement and context awareness.

🚀 Features Overview

🤖 7 AI-Powered Tools

Image Generation - Create images from text prompts using Gemini 2.0 Flash
Image Editing - Edit existing images with natural language instructions
Chat - Interactive conversations with context-aware responses
Audio Transcription - Convert audio to text with optional verbatim mode
Code Execution - Run Python code in a secure sandbox environment
Video Analysis - Analyze video content for summaries, transcripts, and insights
Image Analysis - Extract objects, text, and detailed descriptions from images

🧠 Smart Tool Intelligence System (First in MCP Ecosystem)

Self-Learning - Automatically learns from successful interactions
Context Detection - Recognizes consciousness research, coding, and debugging contexts
Pattern Recognition - Identifies usage patterns and user preferences
Prompt Enhancement - Refines prompts for better AI model performance
Persistent Memory - Stores learned preferences across sessions
Automatic Migration - Seamlessly upgrades preference storage

📦 Quick Start

Installation

git clone https://github.com/Garblesnarff/gemini-mcp-server.git
cd gemini-mcp-server
npm install

Configuration

Get your Gemini API key from Google AI Studio
Copy the environment template:
```
cp .env.example .env
```

Edit .env and add your API key:

GEMINI_API_KEY=your_actual_api_key_here
OUTPUT_DIR=/path/to/your/output/directory  # Optional
DEBUG=false  # Optional
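
A minimal sketch of how these variables might be read and validated at startup. The name `loadConfig` and the returned field names are illustrative assumptions, not the contents of the repository's `src/config.js`. The sketch also assumes the values from `.env` are already present in `process.env`, and it uses the default output directory listed under Logs Location.

```javascript
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: build a config object from an environment map.
function loadConfig(env = process.env) {
  const apiKey = (env.GEMINI_API_KEY || '').trim();
  if (!apiKey) {
    throw new Error('Missing GEMINI_API_KEY');
  }
  return {
    apiKey,
    // OUTPUT_DIR is optional; fall back to the documented default location.
    outputDir: env.OUTPUT_DIR || path.join(os.homedir(), 'Claude', 'gemini-images'),
    // DEBUG is optional; only the string "true" (any case) enables it.
    debug: String(env.DEBUG).toLowerCase() === 'true',
  };
}
```

Failing fast on a missing key keeps the error close to its cause instead of surfacing later as a failed API call.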

Running the Server

npm start
# or for development with debug logging:
npm run dev

Integration with Claude Desktop

Add to your Claude Desktop config (claude_desktop_config.json):

{
  "mcpServers": {
    "gemini": {
      "command": "node",
      "args": ["/path/to/gemini-mcp-server/gemini-server.js"],
      "env": {
        "GEMINI_API_KEY": "your_api_key_here"
      }
    }
  }
}

🛠️ Tool Reference

1. Image Generation (`generate_image`)

Generate images from text descriptions using Gemini 2.0 Flash.

Parameters:

prompt (string, required) - Description of the image to generate
context (string, optional) - Context for Smart Tool Intelligence enhancement

Example:

{
  "prompt": "A serene mountain landscape at sunset with vibrant colors",
  "context": "artistic"
}

Returns:

{
  "content": [{
    "type": "text",
    "text": "Generated a beautiful mountain landscape image."
  }, {
    "type": "image",
    "data": "base64_image_data",
    "mimeType": "image/png"
  }]
}
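
A client receiving this result has to decode the base64 payload before it can save or display the image. The sketch below shows one way to do that, assuming the result has the shape shown above. `extractImages` is a hypothetical helper, not part of this server.

```javascript
// Hypothetical helper: collect every image part of a tool result as a Buffer.
function extractImages(result) {
  return (result.content || [])
    .filter((part) => part.type === 'image' && typeof part.data === 'string')
    .map((part) => ({
      mimeType: part.mimeType,
      // Decode the base64 string into raw bytes.
      bytes: Buffer.from(part.data, 'base64'),
    }));
}
```

Each returned `bytes` buffer can then be written to disk, for example with `fs.writeFileSync('landscape.png', images[0].bytes)`.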

2. Image Editing (`gemini-edit-image`)

Edit existing images using natural language instructions.

Parameters:

image_path (string, required) - Path to the image file to edit
edit_instruction (string, required) - Description of desired changes
context (string, optional) - Context for enhancement

Example:

{
  "image_path": "/path/to/image.jpg",
  "edit_instruction": "Add shooting stars to the night sky",
  "context": "artistic"
}

3. Chat (`gemini-chat`)

Interactive conversations with Gemini AI that learns your preferences.

Parameters:

message (string, required) - Your message or question
context (string, optional) - Context for Smart Tool Intelligence

Example:

{
  "message": "Explain quantum computing in simple terms",
  "context": "consciousness"
}

The "consciousness" context applies the academic rigor enhancement.

4. Audio Transcription (`gemini-transcribe-audio`)

Convert audio files to text with Smart Tool Intelligence enhancement.

Parameters:

file_path (string, required) - Path to audio file (MP3, WAV, FLAC, AAC, OGG, WEBM, M4A)
language (string, optional) - Language hint for better accuracy
context (string, optional) - Use "verbatim" for exact word-for-word transcription
preserve_spelled_acronyms (boolean, optional) - Keep U-R-L instead of URL

Example (Standard):

{
  "file_path": "/path/to/audio.mp3",
  "language": "en"
}

Example (Verbatim Mode):

{
  "file_path": "/path/to/audio.mp3",
  "context": "verbatim",
  "preserve_spelled_acronyms": true
}

Setting context to "verbatim" requests an exact word-for-word transcription.

Verbatim Mode Features:

Captures all "um", "uh", "like", repeated words
Preserves emotional expressions: [laughs], [sighs], [clears throat]
Maintains original punctuation and sentence structure
No summarization or cleanup

5. Code Execution (`gemini-code-execute`)

Execute Python code in a secure sandbox environment.

Parameters:

code (string, required) - Python code to execute
context (string, optional) - Context for enhancement

Example:

{
  "code": "import pandas as pd\ndata = {'x': [1,2,3], 'y': [4,5,6]}\ndf = pd.DataFrame(data)\nprint(df.describe())",
  "context": "code"
}

6. Video Analysis (`gemini-analyze-video`)

Analyze video content for summaries, transcripts, and detailed insights.

Parameters:

file_path (string, required) - Path to video file (MP4, MOV, AVI, WEBM, MKV, FLV)
analysis_type (string, optional) - "summary", "transcript", "objects", "detailed", "custom"
context (string, optional) - Context for enhancement

Example:

{
  "file_path": "/path/to/video.mp4",
  "analysis_type": "detailed"
}

7. Image Analysis (`gemini-analyze-image`)

Extract detailed information from images including objects, text, and descriptions.

Parameters:

file_path (string, required) - Path to image file (JPEG, PNG, WebP, HEIC, HEIF, BMP, GIF)
analysis_type (string, optional) - "summary", "objects", "text", "detailed", "custom"
context (string, optional) - Context for enhancement

Example:

{
  "file_path": "/path/to/image.jpg",
  "analysis_type": "objects"
}

🧠 Smart Tool Intelligence System

How It Works

The Smart Tool Intelligence system is the first of its kind in the MCP ecosystem. It automatically:

Detects Context - Recognizes if you're doing consciousness research, coding, debugging, etc.
Enhances Prompts - Adds relevant instructions based on learned patterns
Learns Patterns - Stores successful interaction patterns for future use
Adapts Over Time - Gets better at helping you with each interaction

Context Types

The system recognizes these contexts and applies appropriate enhancements:

consciousness - Adds academic rigor, citations, detailed explanations
code - Includes practical examples, working code, best practices
debugging - Focuses on root cause analysis and specific fixes
general - Applies comprehensive, structured responses
verbatim - For audio transcription, provides exact word-for-word output
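
The detection rules themselves are not listed in this README. As a rough sketch, keyword-based detection of the first four contexts could look like the following. The patterns are illustrative assumptions and not the ones used in `src/intelligence/context-detector.js`. The verbatim context is left out because it is passed explicitly by the caller.

```javascript
// Illustrative keyword patterns (assumed); checked in order, first match wins.
const CONTEXT_PATTERNS = [
  ['debugging', /\b(bug|error|exception|stack trace|crash)\b/i],
  ['code', /\b(function|class|implement|refactor|api)\b/i],
  ['consciousness', /\b(consciousness|qualia|phenomenal|sentience)\b/i],
];

// Return the first matching context name, or "general" when nothing matches.
function detectContext(prompt) {
  for (const [name, pattern] of CONTEXT_PATTERNS) {
    if (pattern.test(prompt)) return name;
  }
  return 'general';
}
```

Checking debugging before code means a prompt that mentions both a function and an error is treated as a debugging request. That ordering is a design choice of this sketch only.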

Storage Location

Preferences are stored internally at ./data/tool-preferences.json with automatic migration from external storage.
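
The on-disk schema is not documented here. The sketch below assumes the file holds one array of patterns per context and shows how such a store could be read and updated. `JsonPreferenceStore` and its field names are hypothetical and do not describe the repository's `preference-store.js`.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical store: one JSON file with an array of patterns per context.
class JsonPreferenceStore {
  constructor(filePath) {
    this.filePath = filePath;
  }

  // Read the whole file; a missing file is treated as an empty store.
  load() {
    if (!fs.existsSync(this.filePath)) return {};
    return JSON.parse(fs.readFileSync(this.filePath, 'utf8'));
  }

  // Append one pattern under its context and write the file back.
  storePattern(context, pattern) {
    const data = this.load();
    data[context] = data[context] || [];
    data[context].push({ ...pattern, storedAt: new Date().toISOString() });
    fs.mkdirSync(path.dirname(this.filePath), { recursive: true });
    fs.writeFileSync(this.filePath, JSON.stringify(data, null, 2));
  }

  // Return the stored patterns for a context, or an empty array.
  getPatterns(context) {
    return this.load()[context] || [];
  }
}
```

Usage under the same assumptions: `new JsonPreferenceStore('./data/tool-preferences.json').getPatterns('code')`.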

Implementing Smart Tool Intelligence in Your MCP Server

Want to add this revolutionary capability to your own MCP server? Here's how:

1. Core Architecture

// src/intelligence/context-detector.js
class ContextDetector {
  detectContext(prompt, toolName) {
    // Implement pattern matching for different contexts
    if (this.isConsciousnessContext(prompt)) return 'consciousness';
    if (this.isCodeContext(prompt)) return 'code';
    if (this.isDebuggingContext(prompt)) return 'debugging';
    return 'general';
  }
}

// src/intelligence/prompt-enhancer.js  
class PromptEnhancer {
  enhancePrompt(originalPrompt, context, toolName) {
    // Apply context-specific enhancements
    const enhancement = this.getEnhancementForContext(context);
    return `${originalPrompt}\n\n${enhancement}`;
  }
}

// src/intelligence/preference-store.js
class PreferencesManager {
  async storePattern(original, enhanced, context, toolName, success) {
    // Store successful patterns for future learning
  }
  
  async getPatterns(context) {
    // Retrieve learned patterns for context
  }
}

2. Integration Pattern

// In your tool's execute method:
async execute(args) {
  const intelligence = IntelligenceSystem.getInstance();
  
  // Detect context and enhance prompt
  const context = args.context || intelligence.contextDetector.detectContext(args.prompt, this.name);
  const enhancedPrompt = await intelligence.enhancePrompt(args.prompt, context, this.name);
  
  // Execute with enhanced prompt
  const result = await this.geminiService.generateContent(enhancedPrompt);
  
  // Store successful pattern
  await intelligence.storeSuccessfulPattern(args.prompt, enhancedPrompt, context, this.name);
  
  return result;
}

3. Key Implementation Files

Study these files from this repository:

src/intelligence/index.js - Main intelligence coordinator
src/intelligence/context-detector.js - Context recognition logic
src/intelligence/prompt-enhancer.js - Enhancement application
src/intelligence/preference-store.js - Pattern storage and retrieval
src/tools/base-tool.js - Integration with tool execution

🧪 Testing

Run Test Suite

# Test basic functionality
npm test

# Test Smart Tool Intelligence
node test-tool-intelligence-full.js

# Test internal storage
node test-internal-storage.js

# Test verbatim transcription
node test-verbatim-mode.js

Manual Testing Examples

# Test image generation
echo '{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"generate_image","arguments":{"prompt":"A cute robot reading a book"}}}' | node gemini-server.js

# Test chat with consciousness context
echo '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"gemini-chat","arguments":{"message":"What is consciousness?","context":"consciousness"}}}' | node gemini-server.js
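
Hand-writing these JSON-RPC payloads is error-prone, so a small builder can help when scripting manual tests. This is a sketch only, and `buildToolCall` is a hypothetical helper. MCP clients normally send an `initialize` request before calling tools. Whether this server accepts a bare `tools/call`, as the commands above imply, is taken from this README as given.

```javascript
// Build one newline-terminated JSON-RPC 2.0 "tools/call" message.
function buildToolCall(id, name, args) {
  return JSON.stringify({
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name, arguments: args },
  }) + '\n';
}
```

For example, `process.stdout.write(buildToolCall(1, 'generate_image', { prompt: 'A cute robot reading a book' }))` emits the same message as the first command above, ready to pipe into `node gemini-server.js`.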

📊 Performance & Limits

File Size Limits

Images: 20MB (JPEG, PNG, WebP, HEIC, HEIF, BMP, GIF)
Audio: 20MB (MP3, WAV, FLAC, AAC, OGG, WEBM, M4A)
Video: 100MB (MP4, MOV, AVI, WEBM, MKV, FLV)
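
The limits above can be checked on the client before a file is sent. The sketch below is one way to do that. `checkFile` is a hypothetical helper, and the sketch assumes that 1MB means 1024 * 1024 bytes and that `.jpg` is accepted as a JPEG extension. Neither is stated in this README.

```javascript
const path = require('node:path');

const MB = 1024 * 1024; // assumption: binary megabytes

// Limits and formats copied from the list above ("jpg" added as an assumption).
const LIMITS = {
  image: { maxBytes: 20 * MB, exts: ['jpeg', 'jpg', 'png', 'webp', 'heic', 'heif', 'bmp', 'gif'] },
  audio: { maxBytes: 20 * MB, exts: ['mp3', 'wav', 'flac', 'aac', 'ogg', 'webm', 'm4a'] },
  video: { maxBytes: 100 * MB, exts: ['mp4', 'mov', 'avi', 'webm', 'mkv', 'flv'] },
};

// Return null when the file is acceptable, otherwise a short reason string.
function checkFile(kind, filePath, sizeBytes) {
  const limit = LIMITS[kind];
  if (!limit) return `unknown media kind: ${kind}`;
  const ext = path.extname(filePath).slice(1).toLowerCase();
  if (!limit.exts.includes(ext)) return `unsupported ${kind} format: .${ext}`;
  if (sizeBytes > limit.maxBytes) return `${kind} exceeds ${limit.maxBytes / MB}MB limit`;
  return null;
}
```

The size argument would typically come from `fs.statSync(filePath).size`.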

API Rate Limits

Follows Google Gemini API rate limits
Built-in error handling and retry logic
Graceful degradation on quota exceeded
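
The retry strategy is not described in this README. The following is only a sketch of one common approach, exponential backoff. `withRetry`, `retries`, and `baseDelayMs` are hypothetical names and do not reflect the server's actual implementation.

```javascript
// Hypothetical wrapper: retry an async call, doubling the wait after each failure.
async function withRetry(fn, { retries = 3, baseDelayMs = 500 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn(attempt);
    } catch (err) {
      lastError = err;
      if (attempt === retries) break;
      // With the defaults this waits 500 ms, 1000 ms, then 2000 ms.
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}
```

A production version would usually retry only on retryable failures such as rate-limit responses, and fail immediately on errors like an invalid API key.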

🏗️ Architecture Deep Dive

Modular Design

src/
├── server.js              # MCP protocol handler
├── config.js              # Configuration management
├── tools/                 # Tool implementations
│   ├── index.js           # Tool registry & dispatcher
│   ├── base-tool.js       # Abstract base class
│   ├── chat.js            # Chat tool
│   ├── image-generation.js # Image generation tool
│   ├── image-editing.js   # Image editing tool
│   ├── audio-transcription.js # Audio transcription tool
│   ├── code-execution.js  # Code execution tool
│   ├── video-analysis.js  # Video analysis tool
│   └── image-analysis.js  # Image analysis tool
├── intelligence/          # Smart Tool Intelligence
│   ├── index.js           # Intelligence coordinator
│   ├── context-detector.js # Context recognition
│   ├── prompt-enhancer.js # Prompt enhancement
│   └── preference-store.js # Pattern storage
├── gemini/               # Gemini API integration
│   ├── gemini-service.js # API service layer
│   └── request-handler.js # Request formatting
└── utils/                # Utilities
    ├── logger.js         # Logging system
    └── file-utils.js     # File operations

Intelligence System Flow

Request Received → Tool's execute method called
Context Detection → Analyze prompt for context clues
Pattern Retrieval → Get relevant learned patterns
Prompt Enhancement → Apply context-specific improvements
API Execution → Send enhanced prompt to Gemini
Pattern Storage → Store successful interaction pattern
Response Return → Return enhanced result to user

🔧 Customization

Adding New Contexts

// In src/intelligence/context-detector.js
isMyCustomContext(prompt) {
  const patterns = [
    /custom pattern 1/i,
    /custom pattern 2/i
  ];
  return patterns.some(pattern => pattern.test(prompt));
}

// In src/intelligence/prompt-enhancer.js
getEnhancementForContext(context) {
  const enhancements = {
    'my_custom_context': 'Apply my custom enhancement instructions here.',
    // ... other contexts
  };
  return enhancements[context] || enhancements.general;
}

Adding New Tools

Create tool file in src/tools/my-new-tool.js
Extend BaseTool class
Implement execute method with intelligence integration
Register in src/tools/index.js

// src/tools/my-new-tool.js
class MyNewTool extends BaseTool {
  constructor(geminiService, intelligenceSystem) {
    super('my-new-tool', 'Description of my tool', geminiService, intelligenceSystem);
  }
  
  async execute(args) {
    // Use intelligence system for enhancement
    const context = args.context || this.detectContext(args.input);
    const enhancedPrompt = await this.enhancePrompt(args.input, context);
    
    // Your tool logic here
    const result = await this.geminiService.someMethod(enhancedPrompt);
    
    // Store successful pattern  
    await this.storeSuccessfulPattern(args.input, enhancedPrompt, context);
    
    return result;
  }
}

🐛 Troubleshooting

Common Issues

"Missing GEMINI_API_KEY" Error

# Ensure .env file exists and contains your API key
cp .env.example .env
# Edit .env and add: GEMINI_API_KEY=your_key_here

"File not found" Errors

# Ensure file paths are absolute and files exist
# Check file permissions and formats

Intelligence System Not Learning

# Check data directory permissions
ls -la data/
# Verify tool-preferences.json is writable

Debug Mode

DEBUG=true npm start
# or
npm run dev

Logs Location

Application logs: Console output
Intelligence patterns: ./data/tool-preferences.json
Generated images: $OUTPUT_DIR (default: ~/Claude/gemini-images)

🤝 Contributing

We welcome contributions! This project represents a new paradigm in MCP server development.

Development Setup

git clone https://github.com/Garblesnarff/gemini-mcp-server.git
cd gemini-mcp-server
npm install
npm run dev

Areas for Contribution

New Contexts - Add support for specialized domains
Enhanced Patterns - Improve learning algorithms
New Tools - Expand Gemini AI capabilities
Performance - Optimize intelligence system performance
Documentation - Improve guides and examples

📈 Roadmap

Multi-language Support - Context detection in multiple languages
Advanced Analytics - Usage patterns and performance metrics
Tool Chaining - Intelligent coordination between multiple tools
Custom Models - Support for fine-tuned Gemini models
Collaborative Learning - Share anonymized patterns across instances
Visual Interface - Web-based configuration and monitoring

🌟 Why This Matters

This is the first MCP server that truly learns and adapts. Traditional MCP servers are static - they do the same thing every time. Our Smart Tool Intelligence system represents a paradigm shift toward AI tools that become more helpful over time.

For Users: Better results with less effort as the system learns your preferences.
For Developers: A blueprint for building truly intelligent, adaptive AI tools.
For the MCP Ecosystem: A new standard for what MCP servers can become.

📄 License

This project is licensed under the MIT License - feel free to use, modify, and distribute.

🙏 Acknowledgments

Built with:

Google Gemini AI - Powering the core AI capabilities
Model Context Protocol - Enabling seamless integration
Node.js & NPM - Runtime and package management
Claude & Rob - Human-AI collaboration at its finest

Ready to experience the future of MCP servers? Get started now and watch your AI tools become smarter with every interaction! 🚀
