The Appium MCP server gives AI assistants tools for cross-platform mobile app automation and testing on iOS and Android devices, simulators, and emulators. Its capabilities include AI-powered element location using natural language and vision models, automated session management, element interactions (clicks, text input, screenshots), intelligent locator generation, and automated test code generation following the Page Object Model pattern. By enabling natural-language-driven interactions, the server reduces the need for complex XPath selectors and boilerplate test code.
MCP Appium is an intelligent MCP (Model Context Protocol) server designed to empower AI assistants with a robust suite of tools for mobile automation. It streamlines mobile app testing by enabling natural language interactions, intelligent locator generation, and automated test creation for both Android and iOS platforms.
Before you begin, ensure you have the following installed:
MCP Appium supports two driver modes:
- When appium_session_management creates an android or ios session without remoteServerUrl, MCP Appium uses the bundled appium-uiautomator2-driver or appium-xcuitest-driver dependency directly. You still need the platform toolchains below, but you do not need to install a global Appium server or run appium driver install uiautomator2 / appium driver install xcuitest for this mode.
- When remoteServerUrl is provided to action=create or action=attach, MCP Appium uses the webdriver client to talk to that existing server. In this mode the remote server is responsible for its installed drivers, plugins, device access, and capability handling. Use this mode for platform=general; embedded local creation is available only for Android and iOS.

Platform toolchains:

- Android: the ANDROID_HOME environment variable set to your Android SDK path.
- Android: adb available on PATH.
- iOS: Xcode command line tools, installed with xcode-select --install.
- iOS real devices: use appium_prepare_ios_real_device to download and sign WebDriverAgent in a single call. It guides you through provisioning profile selection and returns capabilities for session startup.

The standard config works in most tools:
{
  "mcpServers": {
    "appium-mcp": {
      "disabled": false,
      "timeout": 100,
      "type": "stdio",
      "command": "npx",
      "args": ["appium-mcp@latest"],
      "env": {
        "ANDROID_HOME": "/path/to/android/sdk",
        "CAPABILITIES_CONFIG": "/path/to/your/capabilities.json"
      }
    }
  }
}
The easiest way to install MCP Appium in Cursor IDE is using the one-click install button:
This will automatically configure the MCP server in your Cursor IDE settings. Make sure to update the ANDROID_HOME environment variable in the configuration to match your Android SDK path.
Go to Cursor Settings → MCP → Add new MCP Server. Name it as you like and use the command type with the command npx -y appium-mcp@latest. You can also verify the config or add command arguments by clicking Edit.
Here is the recommended configuration:
{
  "appium-mcp": {
    "disabled": false,
    "timeout": 100,
    "type": "stdio",
    "command": "npx",
    "args": ["appium-mcp@latest"],
    "env": {
      "ANDROID_HOME": "/Users/xyz/Library/Android/sdk"
    }
  }
}
Note: Make sure to update the ANDROID_HOME path to match your Android SDK installation path.
Use the Gemini CLI to add the MCP Appium server:
gemini mcp add appium-mcp npx -y appium-mcp@latest
This will automatically configure the MCP server for use with Gemini. Make sure to update the ANDROID_HOME environment variable in the configuration to match your Android SDK path.
Use the Claude Code CLI to add the MCP Appium server:
claude mcp add appium-mcp -- npx -y appium-mcp@latest
This will automatically configure the MCP server for use with Claude Code. Make sure to update the ANDROID_HOME environment variable in the configuration to match your Android SDK path.
Note: For embedded local Android/iOS sessions, MCP Appium already includes the UiAutomator2 and XCUITest driver packages. The system-level requirements are the platform toolchains (ANDROID_HOME, Java, Android SDK tools, Xcode/iOS signing or simulator setup). For remote sessions, configure those requirements on the remote Appium/WebDriver server instead.
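The choice between the two driver modes can be pictured as a small decision function. This is only a sketch of the rules stated in this README; chooseDriverMode and its types are hypothetical names, not part of the appium-mcp API.

```typescript
// Hypothetical helper illustrating the documented rules; not appium-mcp code.
type Platform = 'android' | 'ios' | 'general';
type DriverMode = 'embedded' | 'remote';

function chooseDriverMode(platform: Platform, remoteServerUrl?: string): DriverMode {
  if (remoteServerUrl) {
    // Any platform can target an existing Appium/WebDriver server.
    return 'remote';
  }
  if (platform === 'general') {
    // Embedded local creation is available only for Android and iOS.
    throw new Error('platform=general requires remoteServerUrl');
  }
  // Bundled UiAutomator2 or XCUITest driver is used directly.
  return 'embedded';
}

console.log(chooseDriverMode('android'));
console.log(chooseDriverMode('general', 'http://127.0.0.1:4723'));
```

In remote mode the platform toolchains matter on the remote host, not on the machine running MCP Appium.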
| Variable | Required | Description |
|---|---|---|
CAPABILITIES_CONFIG | Optional | Absolute path to a capabilities.json file with per-platform capability presets |
SCREENSHOTS_DIR | Optional | Directory where screenshots and screen recordings are saved. Defaults to the current working directory |
NO_UI | Optional | Set to true or 1 to disable HTML UI components — faster responses, fewer tokens. See NO_UI Mode |
APPIUM_MCP_ON_CLIENT_DISCONNECT | Optional | Session cleanup when the MCP client disconnects: delete_all (default) deletes MCP-owned Appium sessions (safeDeleteAllSessions); skip keeps those sessions across disconnects (e.g. HTTP/stream clients that reconnect). Attached/remote sessions are not removed by this path. See MCP disconnect behavior. |
APPIUM_MCP_WDA_APP_PATH | Optional | Absolute path to a pre-extracted WebDriverAgentRunner-Runner.app bundle. When set, prepare_ios_simulator skips all GitHub downloads and uses this bundle directly — useful in environments where external downloads are blocked |
REMOTE_SERVER_URL_ALLOW_REGEX | Optional | Regex pattern that remote Appium server URLs must match. Defaults to ^https?:// |
AI_VISION_ENABLED | Optional | Set to true to register the appium_ai tool (vision-based element finding). When unset or false, the AI tool is not registered and the LLM has no way to invoke vision-based finding. Requires AI_VISION_API_BASE_URL and AI_VISION_API_KEY to also be set, otherwise the server fails to start. |
AI_VISION_API_BASE_URL | Required when AI_VISION_ENABLED=true | Base URL of the OpenAI-compatible vision model API |
AI_VISION_API_KEY | Required when AI_VISION_ENABLED=true | API key for the vision model provider |
AI_VISION_MODEL | Optional | Vision model name (default: Qwen3-VL-235B-A22B-Instruct) |
AI_VISION_COORD_TYPE | Optional | Coordinate type: normalized (default) or absolute |
AI_VISION_IMAGE_MAX_WIDTH | Optional | Max image width in pixels before compression (default: 1080) |
AI_VISION_IMAGE_QUALITY | Optional | JPEG quality 1–100 for compressed screenshots sent to the vision API (default: 80) |
APPIUM_MCP_DOCS_ENABLED | Optional | Set to true (or 1/yes/on) to register the documentation tools (appium_documentation_query, appium_skills). Opt-in and disabled by default. Requires the optional @appium/mcp-documentation package (embeddings cache + ML stack) to be installed separately; when unset it is never downloaded. See Documentation Tools (opt-in). |
SENTENCE_TRANSFORMERS_MODEL | Optional | Hugging Face model used for semantic search in Appium documentation queries (default: Xenova/all-MiniLM-L6-v2). Only applies when APPIUM_MCP_DOCS_ENABLED is set. |
APPIUM_MCP_PERSIST_REMOTE_SESSIONS_PATH | Optional | Directory path for persisted attached remote session info. When set, attached remote sessions are stored as JSON files in that directory and can be rehydrated after restart. |
APPIUM_MCP_EVIDENCE | Optional | Set to true or 1 to attach a structured action evidence record (locator, resolved element id, context, timing, normalized error code) to appium_find_element and appium_gesture responses as an application/vnd.appium.evidence+json resource block, for CI/debugging. Disabled by default; responses are unchanged when unset. |
APPIUM_MCP_OTEL_ENABLED | Optional | Set to true to enable OpenTelemetry tracing (disabled by default). |
APPIUM_MCP_OTEL_INCLUDE_ARGUMENT_VALUES | Optional | Set to true to include sanitized non-sensitive argument values in spans; disabled by default because values may contain sensitive data. |
OTEL_SERVICE_NAME | Optional | Service name reported to the OpenTelemetry collector (example: appium-mcp). |
OTEL_EXPORTER_OTLP_TRACES_ENDPOINT | Optional | OTLP/HTTP traces endpoint (example: http://127.0.0.1:4318/v1/traces). |
OTEL_TRACES_SAMPLER | Optional | Trace sampling strategy; parentbased_always_on samples new root traces and follows parent decisions. |
OTEL_RESOURCE_ATTRIBUTES | Optional | Comma-separated key=value pairs attached as resource attributes to every span (example: testcase.id=my-test-123,team=platform). |
OpenTelemetry tracing is disabled by default. Set APPIUM_MCP_OTEL_ENABLED=true to initialize the Node.js OpenTelemetry SDK before the MCP server is constructed. The SDK uses standard OTEL_* environment variables, for example:
APPIUM_MCP_OTEL_ENABLED=true
# Optional: include sanitized non-sensitive argument values in spans.
# APPIUM_MCP_OTEL_INCLUDE_ARGUMENT_VALUES=true
# Optional: attach custom key=value pairs to every span (e.g. test case ID, team name).
# OTEL_RESOURCE_ATTRIBUTES=testcase.id=my-test-123,team=platform
OTEL_SERVICE_NAME=appium-mcp
OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=http://127.0.0.1:4318/v1/traces
OTEL_TRACES_SAMPLER=parentbased_always_on
(See the official OpenTelemetry documentation for the full set of OTEL_* variables.)
When enabled, appium-mcp creates spans for MCP tool calls, prompt loads, resource reads, and resource template reads. Error status is recorded for thrown operation errors and MCP tool results marked with isError. Span attributes intentionally avoid raw screenshots, XML page source, prompts, credentials, and other high-cardinality or sensitive payloads.
For local trace inspection, use the Jaeger setup in tools/telemetry:
npm run telemetry:jaeger:start
Then open http://127.0.0.1:16686 and run appium-mcp with the environment values in tools/telemetry/jaeger.env.
Create a capabilities.json file to define your device capabilities:
{
  "android": {
    "appium:app": "/path/to/your/android/app.apk",
    "appium:deviceName": "Android Device",
    "appium:platformVersion": "11.0",
    "appium:automationName": "UiAutomator2",
    "appium:udid": "your-device-udid"
  },
  "ios": {
    "appium:app": "/path/to/your/ios/app.ipa",
    "appium:deviceName": "iPhone 15 Pro",
    "appium:platformVersion": "17.0",
    "appium:automationName": "XCUITest",
    "appium:udid": "your-device-udid"
  },
  "general": {
    "platformName": "mac",
    "appium:automationName": "mac2",
    "appium:bundleId": "com.apple.Safari"
  }
}
Set the CAPABILITIES_CONFIG environment variable to point to your configuration file.
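As a rough illustration of per-platform presets, the sketch below picks one section of a parsed capabilities file and layers explicitly passed capabilities on top. The helper name resolveCapabilities and the precedence (explicit values win) are assumptions for illustration; the server's actual merge rules may differ.

```typescript
type Capabilities = Record<string, unknown>;
type PlatformKey = 'android' | 'ios' | 'general';
type CapabilityPresets = Partial<Record<PlatformKey, Capabilities>>;

// Hypothetical helper: preset section first, explicit capabilities override (assumed).
function resolveCapabilities(
  presets: CapabilityPresets,
  platform: PlatformKey,
  explicit: Capabilities = {},
): Capabilities {
  return { ...(presets[platform] ?? {}), ...explicit };
}

// Inline stand-in for the contents of a capabilities.json file.
const presets: CapabilityPresets = JSON.parse(`{
  "android": { "appium:automationName": "UiAutomator2" },
  "general": { "platformName": "mac", "appium:automationName": "mac2" }
}`);

const caps = resolveCapabilities(presets, 'general', {
  'appium:bundleId': 'com.apple.Safari',
});
console.log(caps);
```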
The platform is chosen on appium_session_management (action=create):

- For ios or android, the server builds capabilities for that platform (including selected device info when local).
- For general:
  - the server uses the provided capabilities exactly as given, or
  - if CAPABILITIES_CONFIG is set, it merges them with the general section from your capabilities file.

For CI, device farms, or multi-session setups:

sessionId

The process keeps one active Appium session, and any tool call that omits sessionId targets it. If more than one session exists (see appium_session_management with action=list), pass sessionId on every tool call that must target a specific session. Do not assume the active session is stable if other clients or flows can create, select, or delete sessions.
If APPIUM_MCP_PERSIST_REMOTE_SESSIONS_PATH is set, MCP Appium persists attached remote sessions to that directory as JSON files. The path may be absolute or relative to the current working directory. Each session is stored under a canonical filename derived from a hash of the sessionId; older legacy filenames are migrated, and duplicate files for the same session are removed when the directory is read. When a persisted attached session is used again, the server tries to reattach to the remote Appium session; unreachable entries are pruned automatically.
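A hash-derived filename could be computed along these lines. The README only says the name is derived from a hash of the sessionId; SHA-256, hex encoding, the .json suffix, and the function name persistedSessionPath are all assumptions made for this sketch.

```typescript
import { createHash } from 'node:crypto';
import { join } from 'node:path';

// Assumed scheme: SHA-256 hex digest of the sessionId plus ".json".
// The real appium-mcp naming may use a different hash or layout.
function persistedSessionPath(dir: string, sessionId: string): string {
  const digest = createHash('sha256').update(sessionId).digest('hex');
  return join(dir, `${digest}.json`);
}

console.log(persistedSessionPath('sessions', 'abc-123'));
```

One plausible benefit of hashing is that the filename has a fixed length and contains only filesystem-safe characters, whatever the remote server uses as a session id.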
When the MCP client disconnects, the server deletes only MCP-owned sessions it is tracking (Appium deleteSession for each, via safeDeleteAllSessions). Attached sessions (ownership=attached) are intentionally left on the remote Appium server. Transports that drop often—httpStream behind proxies, idle timeouts, or flaky clients—can wipe owned automation in one go under the default policy. stdio is usually safer for a single long-lived operator; if you use httpStream, expect reconnects to require new owned sessions where applicable.
For grids, cloud labs, or CI, prefer remoteServerUrl plus explicit capabilities on appium_session_management (action=create)—for example appium:udid, app path or id, platform version—rather than depending on local discovery. select_device is geared toward local ADB / simulator picking; use it as a dev convenience, not the main path for allocated remote devices.
Tool calls are logged with argument redaction implemented via JSON.stringify. Oversized payloads (especially long base64 strings, e.g., screenshot/image payloads, and also very large capabilities objects) cost CPU and log volume. Prefer CAPABILITIES_CONFIG and avoid passing large inline blobs in tool arguments when possible.
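Redaction through JSON.stringify is typically done with a replacer function, as in the sketch below. The list of sensitive key names and the helper name redactForLog are assumptions; the server's actual rules are not documented here.

```typescript
// Assumed set of sensitive key names, for illustration only.
const SENSITIVE_KEY = /password|token|secret|api_?key/i;

function redactForLog(args: unknown): string {
  // The replacer is invoked for every key/value pair, including nested ones.
  return JSON.stringify(args, (key, value) =>
    SENSITIVE_KEY.test(key) ? '[REDACTED]' : value,
  );
}

console.log(redactForLog({ udid: 'abc', password: 'hunter2' }));
```

Because a replacer visits every value and non-sensitive strings are serialized in full, a long base64 screenshot passed inline still costs CPU and log volume, which is the reason to prefer CAPABILITIES_CONFIG over large inline arguments.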
Set the SCREENSHOTS_DIR environment variable to specify where screenshots are saved. If not set, screenshots are saved to the current working directory. Supports both absolute and relative paths (relative paths are resolved from the current working directory). The directory is created automatically if it doesn't exist.
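The path handling described above maps naturally onto Node's path.resolve and fs.mkdirSync. The following is a minimal sketch with hypothetical helper names, not the server's implementation.

```typescript
import { mkdirSync } from 'node:fs';
import { resolve } from 'node:path';

// Resolve SCREENSHOTS_DIR: absolute paths are kept, relative paths are
// resolved against cwd, and an unset variable falls back to cwd itself.
function resolveScreenshotsDir(
  env: Record<string, string | undefined>,
  cwd: string,
): string {
  return resolve(cwd, env.SCREENSHOTS_DIR ?? '.');
}

// Create the directory if needed; recursive mode does not fail when it exists.
function ensureDir(dir: string): string {
  mkdirSync(dir, { recursive: true });
  return dir;
}

console.log(resolveScreenshotsDir({ SCREENSHOTS_DIR: 'artifacts/shots' }, '/work'));
```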
Screen recordings are saved as MP4 files to the same directory as screenshots (SCREENSHOTS_DIR, or os.tmpdir() if not set).
- iOS: requires ffmpeg on PATH. The default codec is libx264 with yuv420p pixel format for QuickTime compatibility.
- Android: uses the screenrecord command via UiAutomator2. No additional dependencies required.

To start recording, call appium_screen_recording with action="start". You may provide timeLimit in seconds to cap the recording duration, but the start call still returns immediately. To finalize the recording, save the video, and receive the file path, call appium_screen_recording again with action="stop".
Configure AI-powered element finding using vision models. When enabled, a separate tool — appium_ai — is registered alongside appium_find_element. It exposes action=find_element, which locates UI elements from natural-language descriptions and returns a coordinate UUID (ai-element:x,y:bbox) that can be passed to appium_gesture (tap / double_tap / long_press).
This feature is opt-in. When AI_VISION_ENABLED is unset or false, the appium_ai tool is not registered and the LLM has no way to invoke vision-based finding — keeping appium_find_element purely traditional. This deliberate gating prevents the model from defaulting to a slow, paid vision call when a stable locator (accessibility id, resource-id, etc.) would do the job.
Required Environment Variables:
{
  "appium-mcp": {
    "env": {
      "ANDROID_HOME": "/path/to/android/sdk",
      "AI_VISION_ENABLED": "true",
      "AI_VISION_API_BASE_URL": "https://dashscope.aliyuncs.com/compatible-mode/v1",
      "AI_VISION_API_KEY": "your_api_key_here"
    }
  }
}
If AI_VISION_ENABLED=true is set without both API vars, the server fails to start with a clear error message — misconfiguration is surfaced immediately rather than mid-test.
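The fail-fast behavior can be sketched as a startup check over the environment. The function name readAiVisionConfig and the error wording are hypothetical; only the variable names and the default model come from this README.

```typescript
type Env = Record<string, string | undefined>;

interface AiVisionConfig {
  baseUrl: string;
  apiKey: string;
  model: string;
}

// Returns null when the feature is off; throws when it is on but incomplete.
function readAiVisionConfig(env: Env): AiVisionConfig | null {
  if (env.AI_VISION_ENABLED !== 'true') {
    return null; // appium_ai would not be registered
  }
  const missing = ['AI_VISION_API_BASE_URL', 'AI_VISION_API_KEY'].filter(
    (name) => !env[name],
  );
  if (missing.length > 0) {
    throw new Error(`AI_VISION_ENABLED=true but missing: ${missing.join(', ')}`);
  }
  return {
    baseUrl: env.AI_VISION_API_BASE_URL as string,
    apiKey: env.AI_VISION_API_KEY as string,
    model: env.AI_VISION_MODEL ?? 'Qwen3-VL-235B-A22B-Instruct',
  };
}

console.log(readAiVisionConfig({}));
```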
Optional Environment Variables:
See the Environment Variables table above for the full list of AI_VISION_* options and their defaults.
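With AI_VISION_COORD_TYPE=normalized, the model answers in a 0–1000 range that is scaled to pixels using the original image dimensions. A minimal sketch of that scaling follows; the rounding choice and the name toAbsolute are assumptions.

```typescript
interface Point { x: number; y: number; }
interface Size { width: number; height: number; }

const NORMALIZED_RANGE = 1000;

// Scale a 0..1000 coordinate to pixels of the original (uncompressed) screenshot.
function toAbsolute(p: Point, original: Size): Point {
  return {
    x: Math.round((p.x / NORMALIZED_RANGE) * original.width),
    y: Math.round((p.y / NORMALIZED_RANGE) * original.height),
  };
}

// A point at the horizontal center, one quarter down a 1080x2400 screen.
console.log(toAbsolute({ x: 500, y: 250 }, { width: 1080, height: 2400 })); // { x: 540, y: 600 }
```

Since the scale factors come from the original dimensions, compressing the image sent to the vision API does not shift the resulting tap position.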
Supported Vision Model Providers:
Based on benchmark testing, the following models are recommended:
- Qwen3-VL-235B-A22B-Instruct (base URL: https://dashscope.aliyuncs.com/compatible-mode/v1)
- gemini-3-flash-preview (base URL: https://generativelanguage.googleapis.com/v1beta)

More benchmarked models can be found here.
Performance Features:
- In normalized mode (default), the model returns 0–1000 range coordinates that are automatically scaled to absolute pixel coordinates using the original image dimensions, independent of any image compression. In absolute mode, image resizing is disabled so the model's returned pixel coordinates always map directly to the original screen dimensions.

Set the NO_UI environment variable to true or 1 to disable UI components and improve performance:
{
  "appium-mcp": {
    "env": {
      "NO_UI": "true",
      "ANDROID_HOME": "/path/to/android/sdk"
    }
  }
}
Benefits:
Affected Tools:
The following tools return lightweight text-only responses when NO_UI is enabled:
- appium_screenshot - Screenshot files are still saved to disk, but base64 data is not embedded in responses
- appium_get_page_source - Returns XML as text without interactive inspector UI
- generate_locators - Returns locator data as JSON without interactive UI
- select_device - Returns device list as text without picker UI
- appium_session_management (action=create) - Returns session info as text without dashboard UI
- appium_context (action=list) - Returns context list as text without switcher UI
- appium_app_lifecycle (action=list) - Returns app list as JSON without interactive UI

When to Enable NO_UI:
The documentation tools — appium_documentation_query (RAG search over the Appium docs) and appium_skills — live in a separate package, @appium/mcp-documentation, that carries a multi-megabyte embeddings cache and pulls in a heavy ML stack (@xenova/transformers, @langchain/*). To keep the default install lean, this package is not a runtime dependency of appium-mcp and is never downloaded unless you opt in. It is declared as an optional peer dependency.
Enabling the tools is a two-step opt-in:
1. Install the optional package (in the same project/environment as appium-mcp):
npm install @appium/mcp-documentation
Installing it with your own package manager dedupes against appium-mcp's existing dependencies, so only the genuinely new code is added.
2. Set APPIUM_MCP_DOCS_ENABLED in your MCP server config:
{
  "appium-mcp": {
    "env": {
      "APPIUM_MCP_DOCS_ENABLED": "true",
      "ANDROID_HOME": "/path/to/android/sdk"
    }
  }
}
Behavior:
- Set (true/1/yes/on): the server registers the documentation tools if @appium/mcp-documentation is installed. If the flag is set but the package is not installed, the server starts normally without the documentation tools and logs a hint to run npm install @appium/mcp-documentation.

The gate is governed by the env var, not by mere presence of the package: with APPIUM_MCP_DOCS_ENABLED unset, the tools stay hidden even if the package happens to be installed.
Pre-installing it that way also avoids the first-run download delay.
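The two-step gate can be summarized as a decision over the flag and package presence. This sketch uses hypothetical names (isFlagEnabled, decideDocsTools); treating the flag as case-insensitive and trimming whitespace are assumptions.

```typescript
const TRUTHY = new Set(['true', '1', 'yes', 'on']);

// Assumption: values are compared case-insensitively after trimming.
function isFlagEnabled(value: string | undefined): boolean {
  return value !== undefined && TRUTHY.has(value.trim().toLowerCase());
}

type DocsDecision = 'register' | 'skip-flag-off' | 'skip-package-missing';

function decideDocsTools(flag: string | undefined, packageInstalled: boolean): DocsDecision {
  if (!isFlagEnabled(flag)) {
    return 'skip-flag-off'; // tools stay hidden even if the package is present
  }
  // Flag set but package absent: start normally and log an install hint.
  return packageInstalled ? 'register' : 'skip-package-missing';
}

console.log(decideDocsTools('true', true));
console.log(decideDocsTools(undefined, true));
```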
By default (APPIUM_MCP_ON_CLIENT_DISCONNECT unset or delete_all), when the MCP client disconnects, this server deletes every MCP-owned Appium session (the same sessions safeDeleteAllSessions targets) so embedded drivers are not left running after a short-lived assistant run. Attached sessions (ownership=attached) are unchanged by this teardown.
HTTP and streamable MCP clients may disconnect briefly (reconnect, reload, proxy). If that tears down drivers you still need, set APPIUM_MCP_ON_CLIENT_DISCONNECT to skip in your MCP server env (same pattern as NO_UI above). With skip, sessions survive disconnect until you call appium_session_management with action=delete, or you stop the Appium server / process.
Tradeoff: skip can leave orphaned sessions on your Appium server if nothing cleans up — use it when disconnect is not the same as “automation finished.”
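The disconnect policy boils down to selecting which tracked sessions to delete. Below is a sketch under stated assumptions: the 'owned' label, the TrackedSession shape, and both function names are hypothetical, while delete_all, skip, and ownership=attached come from this README.

```typescript
type Ownership = 'owned' | 'attached';
type DisconnectPolicy = 'delete_all' | 'skip';

interface TrackedSession {
  sessionId: string;
  ownership: Ownership;
}

// Unset or unrecognized values fall back to the documented default (assumed for unrecognized).
function parsePolicy(value: string | undefined): DisconnectPolicy {
  return value === 'skip' ? 'skip' : 'delete_all';
}

// Attached sessions are never selected; owned ones only under delete_all.
function sessionsToDelete(sessions: TrackedSession[], policy: DisconnectPolicy): string[] {
  if (policy === 'skip') {
    return [];
  }
  return sessions.filter((s) => s.ownership === 'owned').map((s) => s.sessionId);
}

const tracked: TrackedSession[] = [
  { sessionId: 'local-1', ownership: 'owned' },
  { sessionId: 'grid-7', ownership: 'attached' },
];
console.log(sessionsToDelete(tracked, parsePolicy(undefined))); // [ 'local-1' ]
```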
Use appium-mcp/core to compose the default Appium MCP server with custom business logic without maintaining a fork. Plugins can register MCP tools, prompts, resources, and resource templates, and can wrap tool execution with lifecycle hooks. Call hooks are tool-only: prompts, resources, and resource templates are registered with FastMCP but are not wrapped by beforeCall or afterCall.
createAppiumMcpServer({ policy }) can also hide nonmatching tools and resources from MCP discovery. The factory is async, so await it before starting the returned server. Policy rules are regular expressions matched against tool and resource names exactly as registered. The policy is applied at registration time to both single and batch registration methods. Resource policy matches the resource name only; resources or resource templates without a string name cannot match a non-empty allowResources list.
import { createAppiumMcpServer } from 'appium-mcp/core';
import type {
  AppiumMcpPlugin,
  McpRegistry,
  ToolCallContext,
} from 'appium-mcp/core';
import { z } from 'zod';

class CheckoutPlugin implements AppiumMcpPlugin {
  readonly name = 'checkout-plugin';
  readonly version = '1.0.0';

  register(registry: McpRegistry): void {
    const parameters = z.object({ orderId: z.string() });
    registry.addTool({
      name: 'assert_checkout_summary',
      description:
        'Assert that the checkout summary screen shows an expected order ID.',
      parameters,
      execute: async (args) => {
        const { orderId } = parameters.parse(args);
        return {
          content: [
            { type: 'text', text: `Assert checkout order ${orderId}` },
          ],
        };
      },
    });
  }

  async beforeCall(ctx: ToolCallContext): Promise<void> {
    if (ctx.toolName === 'appium_gesture') {
      console.error(`[checkout-plugin] about to call ${ctx.toolName}`);
    }
  }
}

const server = await createAppiumMcpServer({
  plugins: [new CheckoutPlugin()],
  additionalInstructions: 'Custom checkout policies are active.',
  policy: {
    allowTools: [/^appium_session_management$/, /^assert_checkout_summary$/],
    allowResources: [/^Generate Code With Locators$/],
  },
});

await server.start({ transportType: 'stdio' });
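To see why the policy rules in the example are anchored with ^ and $, consider this standalone sketch of allow-list matching. ToolPolicy and isToolAllowed are hypothetical stand-ins, and treating a missing or empty list as unrestricted is an assumption.

```typescript
interface ToolPolicy {
  allowTools?: RegExp[];
}

// A name is visible when any rule matches it; no list means no restriction (assumed).
function isToolAllowed(name: string, policy: ToolPolicy): boolean {
  const rules = policy.allowTools;
  if (!rules || rules.length === 0) {
    return true;
  }
  return rules.some((rule) => rule.test(name));
}

const strict: ToolPolicy = {
  allowTools: [/^appium_session_management$/, /^assert_checkout_summary$/],
};
const loose: ToolPolicy = { allowTools: [/appium_session/] };

console.log(isToolAllowed('appium_gesture', strict));            // false
console.log(isToolAllowed('appium_session_management', strict)); // true
console.log(isToolAllowed('appium_session_management', loose));  // true
```

An unanchored rule such as /appium_session/ matches any name containing that substring, so anchoring keeps the allow list exact.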
Plugin lifecycle:
- register(registry, core): called during server construction. Register custom tools, prompts, resources, and resource templates here.
- initialize(ctx): called lazily on the first MCP client connection. Use it for async setup such as artifact storage or internal service clients.
- beforeCall(ctx): called before a registered MCP tool executes. Return a ToolCallResult to short-circuit the tool. This hook only applies to tools, not prompts, resources, or resource templates.
- afterCall(ctx, result): called after a registered MCP tool executes. Return a modified ToolCallResult to decorate or replace the response. This hook only applies to tools, not prompts, resources, or resource templates.
- destroy(): called after the last MCP client disconnects.

The supported plugin API is intentionally small:
| Surface | Safe methods |
|---|---|
McpRegistry | addTool, addTools, addPrompt, addPrompts, addResource, addResources, addResourceTemplate, addResourceTemplates |
AppiumMcpCore | getSessionId(), getSessionInfo(sessionId?), getDriver(sessionId?), listSessions() |
ToolCallContext.session | getSessionId(), getSessionInfo(sessionId?), getDriver(sessionId?), listSessions() |
PluginContext | core, plugins |
McpRegistry methods delegate to the matching FastMCP registration APIs, so their object shapes follow FastMCP's documented tool, prompt, resource, and resource-template definitions. Appium MCP wraps registered tools with plugin call hooks, but prompts and resources are registered directly with FastMCP.
Each plugin name should be unique within the server. If two plugins use the same name, Appium MCP keeps the first plugin registered for that name and skips later plugins with a warning. Use a stable, package-style or organization-prefixed name, such as acme-checkout-plugin, to avoid collisions when composing plugins from multiple teams.
Each tool name should also be unique across all plugins and the core server. Tool names follow FastMCP behavior, not plugin-name behavior: when a tool is registered with the same name as an existing tool, FastMCP replaces the earlier tool definition with the later one. Appium MCP registers built-in tools before plugin tools, which means a plugin tool that uses the same name as a built-in tool replaces the built-in tool. Appium MCP tools usually have an appium_ prefix, so plugin tool names should use that pattern only when they intentionally override a core tool.
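The two collision rules behave differently, which a small simulation makes concrete. This is an illustrative model only: simulateRegistration and its data shapes are hypothetical and do not call appium-mcp or FastMCP.

```typescript
interface NamedPlugin {
  name: string;
  tools: string[];
}

// Models: first plugin wins on duplicate plugin names,
// later registration wins on duplicate tool names.
function simulateRegistration(coreTools: string[], plugins: NamedPlugin[]) {
  const toolOwner = new Map<string, string>();
  for (const tool of coreTools) {
    toolOwner.set(tool, 'core'); // built-in tools are registered first
  }
  const loaded = new Set<string>();
  const skippedPlugins: string[] = [];
  for (const plugin of plugins) {
    if (loaded.has(plugin.name)) {
      skippedPlugins.push(plugin.name); // later duplicate is skipped entirely
      continue;
    }
    loaded.add(plugin.name);
    for (const tool of plugin.tools) {
      toolOwner.set(tool, `plugin:${plugin.name}`); // replaces any earlier owner
    }
  }
  return { toolOwner, skippedPlugins };
}

const result = simulateRegistration(
  ['appium_gesture', 'appium_set_value'],
  [
    { name: 'acme-checkout-plugin', tools: ['appium_gesture', 'assert_checkout_summary'] },
    { name: 'acme-checkout-plugin', tools: ['never_registered'] },
  ],
);
console.log(result.toolOwner.get('appium_gesture')); // plugin:acme-checkout-plugin
console.log(result.skippedPlugins);                  // [ 'acme-checkout-plugin' ]
```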
Use verifyAppiumMcpNames before publishing or deploying a custom plugin setup. It registers your plugin capabilities into a lightweight collector, registers the Appium MCP core tools, and reports duplicate plugin names, duplicate tool names, and registration errors without starting the MCP server.
The recommended approach is to verify the same plugin array you pass to createAppiumMcpServer({ plugins }). This preserves your real plugin instances and order:
import {
  formatVerificationReport,
  verifyAppiumMcpNames,
} from 'appium-mcp/core';
import { plugins } from './plugins.js';

const report = verifyAppiumMcpNames({ plugins });
console.log(formatVerificationReport(report));
process.exit(report.ok ? 0 : 1);
When you provide multiple plugins, order is preserved. Plugins are verified in array order after the appium-mcp core tools. This matters because Appium MCP keeps the first plugin for a duplicate plugin name and skips later plugins with the same name, while duplicate tool names follow FastMCP's later-registration-wins behavior. Tool names still need to be unique across all loaded plugins and appium-mcp core; the verifier reports any collisions it finds.
The report labels this package's own shipped tools as appium-mcp core. Plugin sources are labeled as plugin:<name> with the plugin version.
Treat anything outside appium-mcp/core as internal. In particular, plugins should not rely on private server internals, internal session-store modules, tool implementation files, or the raw FastMCP server instance. If a plugin needs another stable primitive, open an issue so it can be added to AppiumMcpCore or McpRegistry deliberately.
See examples/plugin-example.ts for a fuller cookbook with tools, prompts, resources, resource templates, call hooks, and lifecycle setup.
MCP Appium provides a comprehensive set of tools organized into the following categories:
| Tool | Description |
|---|---|
select_device | REQUIRED FIRST: Discover available devices and select one. Auto-selects if only one device found |
prepare_ios_simulator | Boot an iOS/tvOS simulator, download WDA (if not cached), and install/launch WDA in a single call. Each step is skipped if already satisfied (iOS/tvOS only). Set APPIUM_MCP_WDA_APP_PATH to skip all downloads and use a local .app bundle instead. |
appium_prepare_ios_real_device | Prepare a real iOS device for Appium testing. Two-step flow: (1) call without provisioningProfileUuid to list available .mobileprovision profiles; (2) call again with the chosen UUID and isFreeAccount to download the matching WDA release, package it as an IPA, and resign it with the profile. Results are cached per WDA version and profile, so repeat runs are fast. Pass the returned capabilitiesHint to appium_session_management (action=create) so Appium installs and launches WDA. macOS + Xcode 16+ required. |
| Tool | Description |
|---|---|
appium_session_management | Unified session management. action=create: start a new session for Android, iOS, or general capabilities (see 'general' mode above); forwards capabilities to a remote server via WebDriver newSession when remoteServerUrl is provided. action=attach: connect MCP Appium to an already-running remote Appium session without taking ownership. action=detach: forget an attached session without deleting the real remote session. action=delete: stop and clean up an owned session (defaults to active). action=list: show all active sessions, including ownership. action=select: switch the active session by sessionId. |
appium_mobile_device_control | Control device behavior: lock/unlock the screen, shake the device, or open the notifications panel (action: lock, unlock, shake, or open_notifications). shake is iOS only; open_notifications is Android only; seconds is optional for timed lock. |
appium_driver_settings | Read or update Appium driver session settings in one tool. action=get returns current settings as JSON; action=update merges a settings map (driver-specific keys; use action=get first to inspect). |
The remote server URL in appium_session_management (action=create or action=attach) can be set via the remoteServerUrl parameter.
When remoteServerUrl is omitted, action=create starts an embedded local UiAutomator2 or XCUITest driver for platform=android or platform=ios. platform=general requires remoteServerUrl. When remoteServerUrl is present, action=create calls WebDriver newSession on the remote server, and action=attach connects MCP Appium to an existing remote session without owning its lifecycle.
If REMOTE_SERVER_URL_ALLOW_REGEX is set, the URL must match the provided regex pattern for security reasons.
This allows you to restrict which remote servers can be used with your MCP Appium instance, preventing unauthorized connections.
The default regex pattern allows any URL that starts with http:// or https://.
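A URL allow-list check of this kind can be sketched in a few lines. The function name isRemoteServerUrlAllowed and the example grid hostname are hypothetical; only the default pattern comes from this README.

```typescript
const DEFAULT_ALLOW_PATTERN = '^https?://';

// Test a candidate remoteServerUrl against the configured pattern.
function isRemoteServerUrlAllowed(
  url: string,
  pattern: string = DEFAULT_ALLOW_PATTERN,
): boolean {
  return new RegExp(pattern).test(url);
}

// Hypothetical stricter pattern pinning a single host.
const gridOnly = '^https://grid\\.example\\.com(:\\d+)?(/|$)';

console.log(isRemoteServerUrlAllowed('http://127.0.0.1:4723'));                      // true
console.log(isRemoteServerUrlAllowed('https://grid.example.com/wd/hub', gridOnly)); // true
console.log(isRemoteServerUrlAllowed('https://grid.example.com.evil.io', gridOnly)); // false
```

When pinning a host, escape the dots and end the host part with a port, slash, or end-of-string check; otherwise a lookalike domain that merely starts with the allowed hostname would pass.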
| Tool | Description |
|---|---|
appium_context | Manage contexts in one tool. action=list gets all available contexts including NATIVE_APP and WEBVIEW_* entries. action=switch switches to a target context (context required). |
| Tool | Description |
|---|---|
appium_find_element | Find a specific element using traditional locator strategies. Strategy priority: accessibility id > id > platform-native (-ios predicate string / -ios class chain on iOS, -android uiautomator on Android) > xpath (last resort — slow & brittle). To scroll until an element appears, use appium_gesture with action=scroll_to_element (same strategy / selector as find). |
appium_ai | Opt-in (gated by AI_VISION_ENABLED=true). Vision-based element finding — fallback for when traditional locators don't work. action=find_element takes a natural-language instruction (e.g., "yellow search button at bottom") and returns a coordinate UUID consumable by appium_gesture (tap / double_tap / long_press). See AI Vision Element Finding for setup. |
appium_gesture | Perform a touch gesture. action = back, tap, double_tap, long_press, scroll, swipe, pinch_zoom, or scroll_to_element. scroll_to_element scrolls vertically (direction = up or down) until the locator matches, the page source stops changing after a scroll (end of list), or maxScrollAttempts is reached (default 10, max 80). Optional scrollDistance (0.05–1) or scrollDistancePreset = small, medium, or large. Other actions accept element UUIDs or raw coordinates. For swipe, use speed = slow, normal, or fast (fast for pull-to-refresh). |
appium_drag_and_drop | Perform a drag and drop gesture from a source location to a target location (supports element-to-element, element-to-coordinates, coordinates-to-element, and coordinates-to-coordinates) |
appium_perform_actions | Execute raw W3C Actions API sequences for custom multi-touch gestures (rotate, three-finger swipe, edge swipes, precise timing). Prefer appium_gesture for standard gestures. |
appium_set_value | Enter text into an input field |
appium_mobile_keyboard | Hide the on-screen keyboard or query visibility. action=hide or is_shown (keys optional for hide). |
appium_get_text | Get text content from an element |
appium_mobile_clipboard | Read or set device clipboard plain text. action=get or set (content required for set). |
appium_alert | Handle alerts with action = accept, dismiss, or get_text (optional buttonLabel) |
| Tool | Description |
|---|---|
appium_screenshot | Take a screenshot and save as PNG. Optionally provide elementUUID to capture a specific element. |
appium_get_window_size | Get the width and height of the device screen in pixels |
appium_get_page_source | Get the page source (XML) from the current screen |
appium_orientation | Get or set device/screen orientation with action = get or set (requires orientation for set). |
appium_geolocation | Get, set, or reset the device GPS coordinates with action = get, set, or reset. For set, provide latitude and longitude (and optional altitude on Android). reset is not supported on Android emulators. |
appium_screen_recording | Start or stop screen recording with action = start or stop. On stop, returns the saved MP4 path. |
appium_mobile_device_info | Get device information, battery status, or current device time. Use action = info (model, OS version, locale, timezone, screen density, etc.), battery (level as percentage and charging state), or time (current device time; accepts an optional format moment.js string, defaults to ISO 8601). Works on both iOS and Android. |

| Tool | Action | Description |
|---|---|---|
| appium_app_lifecycle | activate | Activate (launch/bring to foreground) a specified app by bundle ID or name |
| appium_app_lifecycle | terminate | Terminate (close) a specified app |
| appium_app_lifecycle | install | Install an app on the device from a file path |
| appium_app_lifecycle | uninstall | Uninstall an app from the device by bundle ID or name |
| appium_app_lifecycle | list | List all installed apps on the device (Android and iOS) |
| appium_app_lifecycle | is_installed | Check whether an app is installed. Package name for Android, bundle ID for iOS. |
| appium_app_lifecycle | query_state | Query the current state of an app: 0=not installed, 1=not running, 2=background suspended, 3=background, 4=foreground |
| appium_app_lifecycle | background | Background the current app for a duration (optional; defaults to 5 seconds) |
| appium_app_lifecycle | clear | Clear app data and cache without uninstalling (mobile: clearApp). Android: stop the app first when possible. iOS: Simulator only; not supported on real devices. |
| appium_app_lifecycle | deep_link | Open a deep link URL with the default or a specified app |
| appium_mobile_permissions | get / update / reset | Get, update, or reset app permissions in one tool. Android: list or change runtime permissions. iOS Simulator: get/set privacy via bundle id; reset (action=reset) applies to the AUT on sim and device. |
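
The numeric codes returned by query_state are easy to misread in test logs. As an illustration only, the codes listed in the table can be wrapped in an enum on the client side. The names AppState, describe_app_state, and needs_activation are hypothetical and not part of MCP Appium, and the sketch assumes the state arrives as a plain integer.

```python
from enum import IntEnum


class AppState(IntEnum):
    """State codes as listed for the query_state action."""
    NOT_INSTALLED = 0
    NOT_RUNNING = 1
    BACKGROUND_SUSPENDED = 2
    BACKGROUND = 3
    FOREGROUND = 4


def describe_app_state(code):
    """Turn a numeric state code into a readable label; unknown codes raise ValueError."""
    return AppState(code).name.lower().replace("_", " ")


def needs_activation(code):
    """True when the app is installed but not in the foreground (codes 1 to 3)."""
    state = AppState(code)
    return AppState.NOT_RUNNING <= state < AppState.FOREGROUND
```

A test flow could call needs_activation on the query_state result and issue the activate action only when it returns True.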

| Tool | Description |
|---|---|
| generate_locators | Generate intelligent locators for all interactive elements on the current screen |
| appium_generate_tests | Generate automated test code from natural language scenarios |
| appium_documentation_query | Opt-in (gated by APPIUM_MCP_DOCS_ENABLED). Query Appium documentation using RAG for help and guidance |
| appium_skills | Opt-in (gated by APPIUM_MCP_DOCS_ENABLED). Return ordered setup or troubleshooting skills from appium/skills for local Appium environments |

MCP Appium is designed to be compatible with any MCP-compliant client.
Here's an example prompt to test the Amazon mobile app checkout process:
Open Amazon mobile app, search for "iPhone 15 Pro", select the first search result, add the item to cart, proceed to checkout, sign in with email "test@example.com" and password "testpassword123", select shipping address, choose payment method, review order details, and place the order. Use JAVA + TestNG for test generation.
This example demonstrates a complete e-commerce checkout flow that can be automated using MCP Appium's intelligent locator generation and test creation capabilities.
Traditional Mode — prefer stable identifiers:
Try strategies in priority order: accessibility id first, then id, then platform-native predicates (-ios predicate string / -ios class chain on iOS, -android uiautomator on Android). Reach for xpath only when nothing more stable exists.
```json
{
  "tool": "appium_find_element",
  "arguments": {
    "strategy": "accessibility id",
    "selector": "search-button"
  }
}
```
xpath fallback (when no accessibility id, resource-id, or platform-native predicate works):
```json
{
  "tool": "appium_find_element",
  "arguments": {
    "strategy": "xpath",
    "selector": "//android.widget.Button[@text='Search']"
  }
}
```
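
The priority order described above can be sketched as a small selection helper. This is an illustrative client-side sketch, not MCP Appium code: pick_locator and STRATEGY_PRIORITY are hypothetical names, and the relative order of the two iOS strategies is an assumption, since the text ranks them together.

```python
# Priority order from the text above; xpath is the last-resort fallback.
# The ordering of the two iOS-native strategies relative to each other is assumed.
STRATEGY_PRIORITY = [
    "accessibility id",
    "id",
    "-ios predicate string",
    "-ios class chain",
    "-android uiautomator",
    "xpath",
]


def pick_locator(candidates):
    """Given {strategy: selector}, return a find-element call for the most stable strategy."""
    for strategy in STRATEGY_PRIORITY:
        selector = candidates.get(strategy)
        if selector:
            return {
                "tool": "appium_find_element",
                "arguments": {"strategy": strategy, "selector": selector},
            }
    raise ValueError("no usable locator candidate")


# Both an accessibility id and an xpath are known; the accessibility id is chosen.
call = pick_locator({
    "xpath": "//android.widget.Button[@text='Search']",
    "accessibility id": "search-button",
})
```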
Scroll until element is on screen (appium_gesture / scroll_to_element):
```json
{
  "tool": "appium_gesture",
  "arguments": {
    "action": "scroll_to_element",
    "strategy": "xpath",
    "selector": "//*[contains(@text,'My header')]",
    "direction": "down",
    "maxScrollAttempts": 40,
    "scrollDistancePreset": "medium"
  }
}
```
Use scrollDistance (0.05–1) instead of scrollDistancePreset when you want an exact fraction. Then call appium_find_element with the same strategy / selector to obtain the element id.
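
The two-step flow above (scroll, then find with the same locator) can be sketched as a pair of payload builders. This is a hedged illustration: the function names are hypothetical, the argument names are copied from the JSON example, and the range check only restates the 0.05 to 1 bounds given in the text.

```python
def scroll_to_element_call(strategy, selector, direction="down",
                           max_attempts=40, distance=None, preset="medium"):
    """Build an appium_gesture scroll_to_element payload.

    Pass distance (a fraction between 0.05 and 1) for an exact scroll length;
    otherwise the named preset is sent instead.
    """
    arguments = {
        "action": "scroll_to_element",
        "strategy": strategy,
        "selector": selector,
        "direction": direction,
        "maxScrollAttempts": max_attempts,
    }
    if distance is not None:
        if not 0.05 <= distance <= 1:
            raise ValueError("scrollDistance must be between 0.05 and 1")
        arguments["scrollDistance"] = distance
    else:
        arguments["scrollDistancePreset"] = preset
    return {"tool": "appium_gesture", "arguments": arguments}


def find_after_scroll_call(scroll_call):
    """Reuse the scroll call's strategy and selector to obtain the element id."""
    args = scroll_call["arguments"]
    return {
        "tool": "appium_find_element",
        "arguments": {"strategy": args["strategy"], "selector": args["selector"]},
    }


scroll = scroll_to_element_call("xpath", "//*[contains(@text,'My header')]", distance=0.3)
find = find_after_scroll_call(scroll)
```

Deriving the second call from the first keeps the strategy and selector identical across both steps, which is what the flow requires.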
AI Mode (Natural Language) — requires AI_VISION_ENABLED=true:
When the AI tool is enabled, use appium_ai (not appium_find_element) for vision-based finding:
```json
{
  "tool": "appium_ai",
  "arguments": {
    "action": "find_element",
    "instruction": "yellow search button at the bottom of the screen"
  }
}
```
The returned UUID (ai-element:x,y:bbox) flows directly into appium_gesture:
```json
{
  "tool": "appium_gesture",
  "arguments": {
    "action": "tap",
    "elementUUID": "ai-element:540,2280:480,2240,600,2320"
  }
}
```
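
For logging or debugging it can help to unpack the AI element UUID on the client side. The sketch below assumes the ai-element:x,y:bbox layout shown above, with integer pixel values and a four-value bounding box ordered left, top, right, bottom. That ordering is inferred from the sample value, where the point is the midpoint of the box, and is not confirmed by the README. AiElement and parse_ai_element_uuid are hypothetical names.

```python
from typing import NamedTuple, Tuple


class AiElement(NamedTuple):
    x: int
    y: int
    bbox: Tuple[int, int, int, int]  # assumed order: left, top, right, bottom


def parse_ai_element_uuid(uuid):
    """Split 'ai-element:x,y:bbox' into a tap point and a bounding box."""
    prefix, point, box = uuid.split(":")  # wrong segment count raises ValueError
    if prefix != "ai-element":
        raise ValueError(f"not an AI element UUID: {uuid!r}")
    x, y = (int(value) for value in point.split(","))
    bbox = tuple(int(value) for value in box.split(","))
    if len(bbox) != 4:
        raise ValueError("bounding box must have four values")
    return AiElement(x, y, bbox)


element = parse_ai_element_uuid("ai-element:540,2280:480,2240,600,2320")
```

The UUID can still be passed to appium_gesture unchanged; parsing is only useful when the coordinates themselves need to be inspected.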
More instruction examples:
- "username input field at top"
- "settings icon in top-right corner"
- "red delete button next to the item"
- "blue submit button at bottom"
- "profile picture in navigation bar"

When to reach for appium_ai vs appium_find_element:

- Use appium_find_element whenever a stable accessibility id, resource-id, or unique text exists: it is faster, free, and deterministic.
- Use appium_ai only when the element has no stable identifier, the page source is unavailable, or you must locate by visual cues (color, position, icon).

MCP Appium works in any language, so you don't need to know English. The AI assistant understands and responds in your native language. Simply describe what you want to do in your preferred language:
Examples in different languages:
🇪🇸 Spanish: "Abre la aplicación de Amazon, busca 'iPhone 15 Pro' y agrégalo al carrito"
🇨🇳 Chinese: "打开Amazon应用,搜索'iPhone 15 Pro'并添加到购物车"
🇯🇵 Japanese: "Amazonアプリを開いて、'iPhone 15 Pro'を検索してカートに追加する"
🇰🇷 Korean: "Amazon 앱을 열고 'iPhone 15 Pro'를 검색한 후 장바구니에 추가"
🇫🇷 French: "Ouvre l'application Amazon, recherche 'iPhone 15 Pro' et ajoute-le au panier"
🇩🇪 German: "Öffne die Amazon App, suche nach 'iPhone 15 Pro' und füge es zum Warenkorb hinzu"
The AI will handle your requests naturally and generate the appropriate test code, regardless of the language you use.
Contributions are welcome! Please feel free to submit a pull request or open an issue to discuss any changes.
This project is licensed under the Apache-2.0 license. See the LICENSE file for details.