Appium
MCP server for mobile app automation using Appium with W3C actions support.
mcp-appium-visual is an AI-powered mobile automation platform with Model Context Protocol (MCP) integration. It enables seamless control of Android and iOS devices through Appium, featuring intelligent visual element detection and recovery.
Before executing any commands, ensure your environment variables are properly set up:
Make sure your .bash_profile, .zshrc, or other shell configuration file contains the necessary environment variables:
# Example environment variables in ~/.bash_profile
export JAVA_HOME=/path/to/your/java
export ANDROID_HOME=/path/to/your/android/sdk
export PATH=$PATH:$ANDROID_HOME/tools:$ANDROID_HOME/platform-tools
Then reload the file in your current shell:
source ~/.bash_profile # For bash
# OR
source ~/.zshrc # For zsh
Note: The system will attempt to source your .bash_profile automatically when initializing the driver, but it's recommended to set up the environment manually before running tests in a new terminal session.
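The environment check described above can also be done from code before a test run. The following is a minimal sketch, not part of mcp-appium-visual: `missingEnvVars` and `pathHasPlatformTools` are hypothetical helper names, and the required variables are simply the ones listed in the example above.

```typescript
// Hypothetical helpers (not part of mcp-appium-visual) for verifying the
// environment variables listed above before starting a session.
type Env = Record<string, string | undefined>;

const REQUIRED_ANDROID_VARS = ["JAVA_HOME", "ANDROID_HOME"] as const;

// Returns the names of required variables that are unset or blank.
function missingEnvVars(
  env: Env,
  required: readonly string[] = REQUIRED_ANDROID_VARS
): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Returns true when PATH contains $ANDROID_HOME/platform-tools (where adb lives).
// Trailing slashes are ignored on both sides of the comparison.
function pathHasPlatformTools(env: Env, separator = ":"): boolean {
  const androidHome = env["ANDROID_HOME"];
  const path = env["PATH"];
  if (!androidHome || !path) return false;
  const expected = `${androidHome.replace(/\/+$/, "")}/platform-tools`;
  return path
    .split(separator)
    .some((entry) => entry.replace(/\/+$/, "") === expected);
}
```

In a Node.js script, both functions can be called with `process.env`; a non-empty result from `missingEnvVars` means the shell configuration file has not been sourced in the current session.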
For iOS testing, proper Xcode command line tools configuration is essential:
xcode-select --install
xcode-select -p
sudo xcode-select -s /Applications/Xcode.app/Contents/Developer
sudo xcodebuild -license accept
For iOS real device testing, ensure your Apple Developer account is properly configured in Xcode:
Set up environment variables for iOS development:
# Add these to your ~/.bash_profile or ~/.zshrc
export DEVELOPER_DIR="/Applications/Xcode.app/Contents/Developer"
export PATH="$DEVELOPER_DIR/usr/bin:$PATH"
source ~/.bash_profile # For bash
# OR
source ~/.zshrc # For zsh
npm install
npm install -g appium
appium
Set up Android device/emulator:
adb devices
For iOS testing (macOS only):
xcode-select --install
npm run build
npm run dev
npm test
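The `adb devices` check from the device setup step can also be scripted. The sketch below parses the text that `adb devices` prints (a header line followed by tab-separated serial and state pairs) and returns the serials that are ready for automation. `parseAdbDevices` and `readyDevices` are hypothetical names, not functions exported by this project.

```typescript
// Hypothetical helpers for reading `adb devices` output; not part of this project.
interface AdbDevice {
  serial: string;
  state: string; // e.g. "device", "offline", "unauthorized"
}

// Parses the plain-text output of `adb devices` into serial/state pairs.
function parseAdbDevices(output: string): AdbDevice[] {
  const devices: AdbDevice[] = [];
  for (const rawLine of output.split(/\r?\n/)) {
    const line = rawLine.trim();
    // Skip blanks, the "List of devices attached" header, and daemon notices.
    if (line === "" || line.startsWith("*") || line.startsWith("List of devices")) {
      continue;
    }
    const [serial, state] = line.split(/\s+/);
    if (serial && state) devices.push({ serial, state });
  }
  return devices;
}

// Returns only the serials whose state is "device", i.e. connected and authorized.
function readyDevices(output: string): string[] {
  return parseAdbDevices(output)
    .filter((d) => d.state === "device")
    .map((d) => d.serial);
}
```

A serial returned by `readyDevices` is a reasonable candidate for the `deviceName` capability; an `unauthorized` entry usually means the USB debugging prompt on the device has not been accepted yet.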
The example test uses the Android Settings app as a demo. To test your own app:
Edit examples/appium-test.ts:
- Set deviceName to match your device
- Set the app path to your APK file, or
- Set appPackage and appActivity for an installed app
Common capabilities configuration:
const capabilities: AppiumCapabilities = {
  platformName: "Android",
  deviceName: "YOUR_DEVICE_NAME",
  automationName: "UiAutomator2",
  // For installing and testing an APK:
  app: "./path/to/your/app.apk",
  // OR for testing an installed app:
  appPackage: "your.app.package",
  appActivity: ".MainActivity",
  noReset: true,
};
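Because the configuration above offers two alternatives (an APK path, or a package and activity for an installed app), it can help to make the choice explicit in the types. The following is a sketch under assumptions: `AndroidCapabilities` is a local stand-in for the project's `AppiumCapabilities` type, whose real definition is not shown here, and `buildAndroidCapabilities` is a hypothetical helper.

```typescript
// Local stand-in for the project's AppiumCapabilities type (assumed shape).
interface AndroidCapabilities {
  platformName: "Android";
  deviceName: string;
  automationName: "UiAutomator2";
  app?: string;
  appPackage?: string;
  appActivity?: string;
  noReset: boolean;
}

// Exactly one of the two ways to point the session at an app.
type AppTarget =
  | { kind: "apk"; path: string }
  | { kind: "installed"; appPackage: string; appActivity: string };

// Hypothetical helper: builds capabilities with either `app` or
// `appPackage`/`appActivity`, never both.
function buildAndroidCapabilities(
  deviceName: string,
  target: AppTarget
): AndroidCapabilities {
  const base = {
    platformName: "Android" as const,
    deviceName,
    automationName: "UiAutomator2" as const,
    noReset: true,
  };
  return target.kind === "apk"
    ? { ...base, app: target.path }
    : { ...base, appPackage: target.appPackage, appActivity: target.appActivity };
}
```

For example, `buildAndroidCapabilities("emulator-5554", { kind: "installed", appPackage: "com.android.settings", appActivity: ".Settings" })` yields an object without an `app` key, so the installed Settings app is launched rather than an APK being installed.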
For iOS testing using the new Xcode command line support:
examples/xcode-appium-example.ts:
const capabilities: AppiumCapabilities = {
  platformName: "iOS",
  deviceName: "iPhone 13", // Your simulator or device name
  automationName: "XCUITest",
  udid: "DEVICE_UDID", // Get this from XcodeCommands.getIosSimulators()
  // For installing and testing an app:
  app: "./path/to/your/app.app",
  // OR for testing an installed app:
  bundleId: "com.your.app",
  noReset: true,
};
The MCP server supports various Appium actions:
Element Interactions:
App Management:
Device Controls:
Advanced Features:
Xcode Command Line Tools (iOS only):
The MCP-Appium library now implements the W3C WebDriver Actions API for touch gestures, which is the modern standard for mobile automation.
The tapElement method now uses the W3C Actions API with intelligent fallbacks:
// The method will try in this order:
// 1. Standard WebdriverIO click()
// 2. W3C Actions API
// 3. Legacy TouchAction API (fallback for backward compatibility)
await appium.tapElement("//android.widget.Button[@text='OK']");
// or using the click alias
await appium.click("//android.widget.Button[@text='OK']");
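The ordered fallback described above follows a general pattern: try each strategy in turn and return the first one that succeeds. The sketch below illustrates that pattern only; it is not the library's actual `tapElement` implementation, and `Strategy` and `firstSuccessful` are hypothetical names.

```typescript
// Illustrative fallback chain; not the actual tapElement implementation.
type Strategy<T> = { name: string; run: () => Promise<T> };

// Runs strategies in order and resolves with the first success.
// If every strategy throws, rejects with an error listing each failure.
async function firstSuccessful<T>(
  strategies: Strategy<T>[]
): Promise<{ name: string; value: T }> {
  const failures: string[] = [];
  for (const strategy of strategies) {
    try {
      return { name: strategy.name, value: await strategy.run() };
    } catch (error) {
      const reason = error instanceof Error ? error.message : String(error);
      failures.push(`${strategy.name}: ${reason}`);
    }
  }
  throw new Error(`All strategies failed (${failures.join("; ")})`);
}
```

Collecting the failure reasons, rather than discarding them, makes it easier to see why an earlier strategy was skipped when the legacy fallback ends up doing the work.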
The scrollToElement method now uses W3C Actions API:
// Uses W3C Actions API for more reliable scrolling
await appium.scrollToElement(
  "//android.widget.TextView[@text='About phone']", // selector
  "down", // direction: "up", "down", "left", "right"
  "xpath", // strategy
  10 // maxScrolls
);
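The `maxScrolls` parameter suggests a bounded loop: check whether the element is visible, scroll once if it is not, and give up after the limit. The following sketch shows that loop in isolation. It is an assumption about the general approach, not the library's code; `ScrollDriver` and `scrollUntilVisible` are hypothetical, and the driver is injected so the logic can be exercised without a device.

```typescript
type Direction = "up" | "down" | "left" | "right";

// Hypothetical minimal driver surface needed by the loop below.
interface ScrollDriver {
  isVisible(selector: string): Promise<boolean>;
  scrollOnce(direction: Direction): Promise<void>;
}

// Scrolls at most `maxScrolls` times, checking visibility before each scroll
// and once more after the last one. Resolves true as soon as the element is seen.
async function scrollUntilVisible(
  driver: ScrollDriver,
  selector: string,
  direction: Direction = "down",
  maxScrolls = 10
): Promise<boolean> {
  for (let attempt = 0; attempt <= maxScrolls; attempt++) {
    if (await driver.isVisible(selector)) return true;
    if (attempt < maxScrolls) await driver.scrollOnce(direction);
  }
  return false;
}
```

Returning `false` instead of throwing leaves the decision to the caller, who may want to retry in the opposite direction before failing the test.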
You can create your own custom W3C gestures using the executeMobileCommand method:
// Example coordinates and duration (placeholder values; adjust for your screen)
const startX = 500, startY = 1500;
const endX = 500, endY = 500;
const duration = 800; // milliseconds

// Create custom W3C Actions API gesture
const w3cActions = {
  actions: [
    {
      type: "pointer",
      id: "finger1",
      parameters: { pointerType: "touch" },
      actions: [
        // Move to start position
        { type: "pointerMove", duration: 0, x: startX, y: startY },
        // Press down
        { type: "pointerDown", button: 0 },
        // Move to end position over duration milliseconds
        {
          type: "pointerMove",
          duration: duration,
          origin: "viewport",
          x: endX,
          y: endY,
        },
        // Release
        { type: "pointerUp", button: 0 },
      ],
    },
  ],
};

// Execute the W3C Actions through executeMobileCommand
await appium.executeMobileCommand("performActions", [w3cActions.actions]);
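When several gestures share the same move, press, move, release structure, the action list can be produced by a small builder instead of being written out each time. This is a sketch: `buildSwipe` and the accompanying types are hypothetical and only mirror the object shape shown above.

```typescript
interface Point {
  x: number;
  y: number;
}

// Subset of W3C pointer actions used for a single-finger swipe.
type PointerAction =
  | { type: "pointerMove"; duration: number; x: number; y: number; origin?: "viewport" | "pointer" }
  | { type: "pointerDown"; button: number }
  | { type: "pointerUp"; button: number };

interface PointerSource {
  type: "pointer";
  id: string;
  parameters: { pointerType: "touch" | "mouse" | "pen" };
  actions: PointerAction[];
}

// Hypothetical builder: returns the `actions` array for a one-finger swipe
// from `start` to `end` lasting `durationMs` milliseconds.
function buildSwipe(
  start: Point,
  end: Point,
  durationMs: number,
  id = "finger1"
): PointerSource[] {
  if (durationMs < 0) throw new RangeError("durationMs must be non-negative");
  return [
    {
      type: "pointer",
      id,
      parameters: { pointerType: "touch" },
      actions: [
        // Coordinates are rounded because drivers generally expect integer pixels.
        { type: "pointerMove", duration: 0, x: Math.round(start.x), y: Math.round(start.y) },
        { type: "pointerDown", button: 0 },
        { type: "pointerMove", duration: durationMs, origin: "viewport", x: Math.round(end.x), y: Math.round(end.y) },
        { type: "pointerUp", button: 0 },
      ],
    },
  ];
}
```

The returned array has the same shape as `w3cActions.actions` above, so it could be passed to `executeMobileCommand("performActions", [...])` in the same way.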
See examples/w3c-actions-swipe-demo.ts for more examples of W3C standard gesture implementations.
The new XcodeCommands class provides powerful tools for iOS testing:
import { XcodeCommands } from "../src/lib/xcode/xcodeCommands.js";

// Check if Xcode CLI tools are installed
const isInstalled = await XcodeCommands.isXcodeCliInstalled();

// Get available simulators
const simulators = await XcodeCommands.getIosSimulators();

// Boot a simulator
await XcodeCommands.bootSimulator("SIMULATOR_UDID");

// Install an app
await XcodeCommands.installApp("SIMULATOR_UDID", "/path/to/app.app");

// Launch an app
await XcodeCommands.launchApp("SIMULATOR_UDID", "com.example.app");

// Take a screenshot
await XcodeCommands.takeScreenshot("SIMULATOR_UDID", "/path/to/output.png");

// Shutdown a simulator
await XcodeCommands.shutdownSimulator("SIMULATOR_UDID");
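The return shape of `XcodeCommands.getIosSimulators()` is not documented here, so the sketch below takes an independent route to the same information: it parses the JSON printed by `xcrun simctl list devices --json` and looks up a UDID by simulator name. `parseSimctlDevices` and `findSimulatorUdid` are hypothetical helpers, and the field names (`devices`, `name`, `udid`, `state`, `isAvailable`) reflect the output of recent Xcode versions as far as I know; treat them as assumptions to verify locally.

```typescript
interface SimDevice {
  name: string;
  udid: string;
  state: string; // e.g. "Booted" or "Shutdown"
  runtime: string; // runtime identifier the device is grouped under
}

// Parses `xcrun simctl list devices --json` output into a flat device list.
// Devices explicitly marked unavailable are skipped.
function parseSimctlDevices(json: string): SimDevice[] {
  const parsed = JSON.parse(json) as {
    devices?: Record<
      string,
      Array<{ name: string; udid: string; state: string; isAvailable?: boolean }>
    >;
  };
  const result: SimDevice[] = [];
  for (const [runtime, devices] of Object.entries(parsed.devices ?? {})) {
    for (const d of devices) {
      if (d.isAvailable === false) continue;
      result.push({ name: d.name, udid: d.udid, state: d.state, runtime });
    }
  }
  return result;
}

// Returns the UDID of the first simulator with the given name, preferring a
// booted one, or undefined when no simulator matches.
function findSimulatorUdid(devices: SimDevice[], name: string): string | undefined {
  const matches = devices.filter((d) => d.name === name);
  const booted = matches.find((d) => d.state === "Booted");
  return (booted ?? matches[0])?.udid;
}
```

The UDID found this way is the value the `udid` capability and the `XcodeCommands` calls above expect in place of `SIMULATOR_UDID`.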
The click() method provides a more intuitive alternative to tapElement():
// Using the click method
await appium.click("//android.widget.Button[@text='OK']");

// This is equivalent to:
await appium.tapElement("//android.widget.Button[@text='OK']");
Device not found:
Check the adb devices output
App not installing:
Elements not found:
Connection issues:
iOS Simulator issues:
xcode-select -p
xcrun simctl list devices
Feel free to submit issues and pull requests for additional features or bug fixes.
MIT