
Screen Monitor
Revolutionary MCP server providing AI with real-time screen monitoring and UI interaction capabilities
Screen Monitor is a Model Context Protocol (MCP) server that gives AI real-time vision and UI interaction capabilities. It goes beyond screen capture: the AI can watch the screen continuously, recognize UI elements, and read on-screen text.
🎯 NEW in v2.1.0: Enhanced Smart Click with 75% success rate, menu detection, and fuzzy matching!
start_continuous_monitoring() - Starts AI's continuous vision capability
stop_continuous_monitoring() - Stops continuous monitoring
get_monitoring_status() - Real-time status information and statistics
get_recent_changes() - Recently detected screen changes
analyze_ui_elements() - Recognizes and maps all UI elements on screen
smart_click() - Smart clicking with natural language commands ("Click the save button")
extract_text_from_screen() - OCR text extraction from screen
get_active_application() - Get currently active application context
register_application_events() - Register for application-specific events
broadcast_application_change() - Broadcast application changes to AI clients
capture_and_analyze() - Screen capture and AI analysis (enhanced)
list_tools() - Lists all tools in an MCP-standard-compliant format (categorized, detailed information)

    # Start AI's continuous vision capability
    await start_continuous_monitoring(fps=3, change_threshold=0.1)

    # Check monitoring status
    status = await get_monitoring_status()

    # View recent changes
    changes = await get_recent_changes(limit=5)
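The change_threshold parameter above suggests a comparison between successive frames. The following is a minimal sketch of one way such a check could work, assuming the threshold is the fraction of pixels that differ. The names changed_fraction, has_changed, and pixel_tolerance, and the grayscale frame representation, are illustrative assumptions and not the server's actual implementation.

```python
from typing import Sequence

Frame = Sequence[Sequence[int]]  # hypothetical: rows of grayscale values 0-255


def changed_fraction(prev: Frame, curr: Frame, pixel_tolerance: int = 10) -> float:
    """Return the fraction of pixels whose value differs by more than pixel_tolerance."""
    total = 0
    changed = 0
    for prev_row, curr_row in zip(prev, curr):
        for a, b in zip(prev_row, curr_row):
            total += 1
            if abs(a - b) > pixel_tolerance:
                changed += 1
    return changed / total if total else 0.0


def has_changed(prev: Frame, curr: Frame, change_threshold: float = 0.1) -> bool:
    """Report a change when at least change_threshold of the pixels differ."""
    return changed_fraction(prev, curr) >= change_threshold


before = [[0, 0, 0, 0], [0, 0, 0, 0]]
after = [[0, 0, 255, 255], [0, 0, 0, 0]]
print(changed_fraction(before, after))  # 2 of 8 pixels differ
print(has_changed(before, after, change_threshold=0.1))
```

Under this reading, a lower change_threshold makes monitoring more sensitive to small updates such as a blinking cursor, while a higher one reports only large redraws.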
    # Analyze all UI elements on screen (now with menu detection!)
    ui_analysis = await analyze_ui_elements()

    # Smart clicking with natural language (75% success rate!)
    await smart_click("File")         # ✅ Works!
    await smart_click("Save button")  # ✅ Enhanced matching!

    # Extract text from screen with OCR
    text_data = await extract_text_from_screen()
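The fuzzy matching behind smart_click is not documented here. As a rough illustration of the idea, the sketch below matches a natural-language command against a list of detected labels using the standard library's difflib. The function names, the filler-word list, and the 0.6 cutoff are assumptions made for this example only.

```python
from difflib import SequenceMatcher
from typing import Optional, Sequence

FILLER_WORDS = {"click", "the", "button", "on", "press"}  # hypothetical stop list


def normalize(text: str) -> str:
    """Lowercase the text and drop filler words such as 'click' or 'button'."""
    words = [w for w in text.lower().split() if w not in FILLER_WORDS]
    return " ".join(words)


def best_match(command: str, labels: Sequence[str], min_score: float = 0.6) -> Optional[str]:
    """Return the label most similar to the command, or None if nothing scores high enough."""
    target = normalize(command)
    best_label, best_score = None, 0.0
    for label in labels:
        score = SequenceMatcher(None, target, normalize(label)).ratio()
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_score >= min_score else None


labels = ["File", "Edit", "Save", "Save As...", "Cancel"]
print(best_match("Click the save button", labels))  # exact match after normalization
print(best_match("Fiel", labels))                   # tolerates a transposed letter
print(best_match("Print", labels))                  # nothing similar enough
```

Returning None instead of the nearest weak match is a deliberate choice in this sketch: clicking the wrong element is usually worse than reporting that nothing matched.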
    # Get active application context
    app_context = await get_active_application()

    # Register for application events
    await register_application_events(app_name="Blender")

    # Monitor application changes
    changes = await get_recent_changes(limit=5)
    # Navigate to project directory
    cd ScreenMonitorMCP

    # Install required libraries
    pip install -r requirements.txt
Edit the .env file:
    # Server Configuration
    HOST=127.0.0.1
    PORT=7777
    API_KEY=your_secret_key

    # AI Configuration
    OPENAI_API_KEY=your_openai_api_key
    OPENAI_BASE_URL=https://api.openai.com/v1
    DEFAULT_OPENAI_MODEL=gpt-4o
    DEFAULT_MAX_TOKENS=1000
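To make the file format concrete, here is a small standard-library sketch that reads KEY=VALUE settings of the kind shown above. How the server itself loads the file is not shown in this document, so parse_env is a hypothetical helper for illustration rather than the project's loader.

```python
from typing import Dict


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    settings: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)  # split once so values may contain '='
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


sample = """
# Server Configuration
HOST=127.0.0.1
PORT=7777
API_KEY=your_secret_key
"""

settings = parse_env(sample)
port = int(settings.get("PORT", "7777"))  # values arrive as strings
print(settings["HOST"], port)
```

Note that every value is a string, so numeric settings such as PORT and DEFAULT_MAX_TOKENS need an explicit conversion.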
    # Test the server
    python main.py

    # Test revolutionary features
    python test_revolutionary_features.py
Add the following JSON to your MCP client's configuration file:
Basic configuration:

    {
      "mcpServers": {
        "screenMonitorMCP": {
          "command": "python",
          "args": ["/path/to/ScreenMonitorMCP/main.py"],
          "cwd": "/path/to/ScreenMonitorMCP"
        }
      }
    }

With the OpenAI API key passed as an environment variable:

    {
      "mcpServers": {
        "screenMonitorMCP": {
          "command": "python",
          "args": [
            "/path/to/ScreenMonitorMCP/main.py"
          ],
          "cwd": "/path/to/ScreenMonitorMCP",
          "env": {
            "OPENAI_API_KEY": "your-api-key-here"
          }
        }
      }
    }

With the server API key passed as a command-line argument:

    {
      "mcpServers": {
        "screenMonitorMCP": {
          "command": "python",
          "args": [
            "/path/to/ScreenMonitorMCP/main.py",
            "--api-key",
            "your-secret-key"
          ],
          "cwd": "/path/to/ScreenMonitorMCP"
        }
      }
    }

With Windows paths:

    {
      "mcpServers": {
        "screenMonitorMCP": {
          "command": "python",
          "args": ["C:/path/to/ScreenMonitorMCP/main.py"],
          "cwd": "C:/path/to/ScreenMonitorMCP"
        }
      }
    }
- Adjust the /path/to/ScreenMonitorMCP/main.py path according to your project directory
- On Windows, use the full Python path if needed: "C:/Python311/python.exe"
- The cwd parameter is important for proper .env file reading
- Server and AI settings are read from the .env file

    # Start AI's continuous vision capability
    result = await start_continuous_monitoring(
        fps=3,
        change_threshold=0.1,
        smart_detection=True
    )

    # Check monitoring status
    status = await get_monitoring_status()

    # View recent changes
    changes = await get_recent_changes(limit=10)

    # Stop monitoring
    await stop_continuous_monitoring()
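The limit argument of get_recent_changes implies that the server keeps a bounded history of detected changes. One common way to do that is a fixed-size buffer, sketched below with collections.deque. ChangeEvent, ChangeLog, and their fields are hypothetical names chosen for this example, and the real server may store its history differently.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List


@dataclass
class ChangeEvent:  # hypothetical record type
    frame_index: int
    changed_fraction: float


class ChangeLog:
    """Keep only the most recent `capacity` change events."""

    def __init__(self, capacity: int = 100) -> None:
        self._events: Deque[ChangeEvent] = deque(maxlen=capacity)

    def record(self, event: ChangeEvent) -> None:
        self._events.append(event)  # the oldest entry is dropped when full

    def recent(self, limit: int = 10) -> List[ChangeEvent]:
        """Return up to `limit` events, newest first."""
        return list(reversed(self._events))[:limit]


log = ChangeLog(capacity=3)
for i in range(5):
    log.record(ChangeEvent(frame_index=i, changed_fraction=0.2))

print([e.frame_index for e in log.recent(limit=2)])  # the two newest frames
```

A bounded buffer keeps memory use constant even when monitoring runs for hours at several frames per second.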
    # Analyze all UI elements on screen
    ui_elements = await analyze_ui_elements(
        detect_buttons=True,
        extract_text=True,
        confidence_threshold=0.7
    )

    # Smart clicking with natural language
    await smart_click("Click the save button", dry_run=False)

    # Extract text from specific region
    text_data = await extract_text_from_screen(
        region={"x": 100, "y": 100, "width": 500, "height": 300}
    )
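The region argument above uses x, y, width, and height, while many imaging libraries (Pillow's Image.crop, for example) expect a (left, top, right, bottom) box. The sketch below converts between the two and clamps the result to the screen. region_to_box and the screen dimensions are assumptions for illustration, and the server may handle out-of-bounds regions differently.

```python
from typing import Dict, Tuple


def region_to_box(region: Dict[str, int], screen_width: int, screen_height: int) -> Tuple[int, int, int, int]:
    """Convert an x/y/width/height region to a (left, top, right, bottom) box clamped to the screen."""
    left = max(0, region["x"])
    top = max(0, region["y"])
    right = min(screen_width, region["x"] + region["width"])
    bottom = min(screen_height, region["y"] + region["height"])
    if right <= left or bottom <= top:
        raise ValueError("region lies outside the screen")
    return left, top, right, bottom


region = {"x": 100, "y": 100, "width": 500, "height": 300}
print(region_to_box(region, screen_width=1920, screen_height=1080))
```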
    # Start application monitoring
    await start_application_monitoring()

    # Get active application context
    app_context = await get_active_application()

    # Register Blender for monitoring
    await register_application_events(
        app_name="Blender",
        event_types=["scene_change", "object_modification"]
    )

    # Monitor application changes
    changes = await get_recent_application_events(limit=10)

    # Broadcast Blender scene change
    await broadcast_application_change(
        app_name="Blender",
        event_type="scene_change",
        event_data={"objects_modified": ["Cube", "Camera"]}
    )
With this system, you can relay real-time changes from Blender to your AI client (like Claude Desktop):
    # Add ScreenMonitorMCP to your Claude Desktop config
    python main.py
    # Run these commands in Claude Desktop:
    await start_application_monitoring()
    await register_application_events("Blender")
    # From within your Blender script:
    await broadcast_application_change(
        app_name="Blender",
        event_type="object_added",
        event_data={"object_name": "Suzanne", "object_type": "MESH"}
    )
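The register-then-broadcast flow above follows a publish/subscribe pattern. The sketch below models it with a small in-memory hub so the moving parts are visible. ApplicationEventHub, its methods, and the rule that unregistered event types are ignored are all assumptions for this example and do not describe the server's internals.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, Dict, List, Set

Listener = Callable[[str, str, Dict[str, Any]], None]


class ApplicationEventHub:
    """Hypothetical in-memory hub: apps register event types, broadcasts reach listeners."""

    def __init__(self) -> None:
        self._registered: DefaultDict[str, Set[str]] = defaultdict(set)
        self._listeners: List[Listener] = []

    def register(self, app_name: str, event_types: List[str]) -> None:
        self._registered[app_name].update(event_types)

    def subscribe(self, listener: Listener) -> None:
        self._listeners.append(listener)

    def broadcast(self, app_name: str, event_type: str, event_data: Dict[str, Any]) -> bool:
        """Deliver the event if the app registered this type; return whether it was delivered."""
        if event_type not in self._registered.get(app_name, set()):
            return False
        for listener in self._listeners:
            listener(app_name, event_type, event_data)
        return True


hub = ApplicationEventHub()
received = []
hub.subscribe(lambda app, kind, data: received.append((app, kind, data)))
hub.register("Blender", ["scene_change", "object_added"])
hub.broadcast("Blender", "object_added", {"object_name": "Suzanne", "object_type": "MESH"})
hub.broadcast("Blender", "render_done", {})  # not registered, so it is ignored
print(received)
```

In this model the AI client plays the role of the listener, and the Blender script plays the role of the broadcaster.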
    # Enhanced screen capture and analysis
    result = await capture_and_analyze(
        capture_mode="all",
        analysis_prompt="What do you see on this screen?",
        max_tokens=1500  # AI models can now use more tokens for detailed analysis
    )

    # List all tools
    tools = await list_tools()
This MCP server gives AI the following capabilities:
Unicode/Encoding Error (Windows)
UnicodeEncodeError: 'charmap' codec can't encode character
Solution: ✅ This error is fixed. The server now uses UTF-8 encoding automatically.
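For background, the error appears when a Windows console using a legacy code page is asked to print a character it cannot represent, such as an emoji. The sketch below reproduces the cause and shows one standard-library way to switch output streams to UTF-8. How the server applies its fix is not shown in this document, so can_encode and force_utf8_output are illustrative helpers only.

```python
import sys


def can_encode(text: str, encoding: str) -> bool:
    """Return True if text can be represented in the given encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


def force_utf8_output() -> None:
    """Switch stdout and stderr to UTF-8 where the stream supports reconfigure()."""
    for stream in (sys.stdout, sys.stderr):
        reconfigure = getattr(stream, "reconfigure", None)
        if reconfigure is not None:
            reconfigure(encoding="utf-8", errors="replace")


message = "\u2705 Monitoring started"      # starts with a check-mark emoji
print(can_encode(message, "cp1252"))  # legacy Windows code page cannot encode it
print(can_encode(message, "utf-8"))

force_utf8_output()
```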
JSON Configuration Error
    // ❌ Wrong
    {
      "command": "python",
      "args": ["path/to/main.py",]  // Trailing comma is wrong
    }

    // ✅ Correct
    {
      "command": "python",
      "args": ["path/to/main.py"]
    }
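A quick way to catch this class of mistake before restarting the client is to run the configuration text through a strict JSON parser. The sketch below uses Python's json module; validate_config is a hypothetical helper name.

```python
import json


def validate_config(text: str) -> str:
    """Return 'ok' if text is valid JSON, otherwise a message locating the problem."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}"
    return "ok"


wrong = '{"command": "python", "args": ["path/to/main.py",]}'
right = '{"command": "python", "args": ["path/to/main.py"]}'
print(validate_config(wrong))
print(validate_config(right))
```

The // comments in the snippet above are for illustration only. Strict JSON does not allow comments either, so they should not be copied into a real configuration file.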
Python Path Issue
    {
      "command": "C:/Python311/python.exe",  // Use full path
      "args": ["C:/path/to/ScreenMonitorMCP/main.py"]
    }
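Rather than typing absolute paths by hand, the configuration entry can be generated from the interpreter that is currently running. The sketch below is one way to do this, assuming it is run from the directory that contains the ScreenMonitorMCP folder; build_server_entry is a hypothetical helper and not part of the project.

```python
import json
import sys
from pathlib import Path


def build_server_entry(project_dir: str) -> dict:
    """Build an mcpServers entry using absolute paths and the running interpreter."""
    root = Path(project_dir).resolve()
    return {
        "command": Path(sys.executable).as_posix(),  # full path to python
        "args": [(root / "main.py").as_posix()],
        "cwd": root.as_posix(),
    }


config = {"mcpServers": {"screenMonitorMCP": build_server_entry("ScreenMonitorMCP")}}
print(json.dumps(config, indent=2))
```

Using as_posix() produces forward slashes on every platform, which matches the Windows examples in this document and avoids backslash escaping in JSON.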
Missing Dependencies
    cd ScreenMonitorMCP
    pip install -r requirements.txt
OCR Issues
    # Install Tesseract (optional)
    # EasyOCR installs automatically
MCP Connection Closed Error
MCP error -32000: Connection closed
Solution: Check file paths and add the cwd parameter.
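The checks in this solution can be automated. The sketch below inspects one server entry from the client configuration and reports a missing cwd or a script path that does not exist. check_server_entry is a hypothetical helper, and the example entry uses the placeholder path from this document, so it is expected to report problems until real paths are filled in.

```python
from pathlib import Path
from typing import Any, Dict, List


def check_server_entry(entry: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems found in one mcpServers entry."""
    problems: List[str] = []
    cwd = entry.get("cwd")
    if not cwd:
        problems.append("missing 'cwd'")
    elif not Path(cwd).is_dir():
        problems.append(f"cwd is not a directory: {cwd}")
    for arg in entry.get("args", []):
        # Only script arguments are checked; flags such as --api-key are skipped.
        if arg.endswith(".py") and not Path(arg).is_file():
            problems.append(f"script not found: {arg}")
    return problems


entry = {"command": "python", "args": ["/path/to/ScreenMonitorMCP/main.py"]}
for problem in check_server_entry(entry):
    print(problem)
```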
This project is licensed under the MIT License.
🚀 Revolutionary MCP server that gives AI real "eyes"! 🔥 Next-generation AI-human interaction starts here!