# UI Automation

> Automate simulator interactions with FlowDeck's UI automation commands

FlowDeck UI automation runs on iOS simulators and exposes commands for capture, interaction, assertions, and app control. Commands live under `flowdeck ui simulator`.

## Common Flags

Most UI automation commands support:

* `-S, --simulator` to target a specific simulator by **name or UDID**. Recommended when multiple simulators are booted. Examples: `-S "iPhone 16"` or `-S "A1B2C3D4-..."`.
* For AI agent workflows, always pass `-S` explicitly on every `flowdeck ui simulator ...` command.
* If `-S` is omitted, FlowDeck falls back to session/default simulator selection. This is usually fine when only one simulator is booted (common in CI).
* `-j, --json` for machine-readable output.
* `-v, --verbose` and `-e, --examples` on commands that support extra output.
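
For scripted or agent-driven use, a thin wrapper can make the `-S` and `--json` advice hard to forget. The following is a minimal Python sketch, not part of FlowDeck: `ui_command` and `run_ui` are hypothetical helper names, and the sketch assumes that flags may follow a subcommand's own arguments and that `--json` output on stdout is a single JSON document.

```python
import json
import subprocess


def ui_command(subcommand, *args, simulator, json_output=True):
    """Build the argv for one `flowdeck ui simulator` call.

    `subcommand` may contain spaces (for example "session start").
    """
    argv = ["flowdeck", "ui", "simulator", *subcommand.split(), *args]
    argv += ["-S", simulator]  # always target a simulator explicitly
    if json_output:
        argv.append("--json")
    return argv


def run_ui(subcommand, *args, simulator):
    """Run the command and parse its JSON output (needs flowdeck on PATH).

    Assumption: stdout contains only the JSON document.
    """
    result = subprocess.run(
        ui_command(subcommand, *args, simulator=simulator),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


print(ui_command("screen", "--tree", simulator="iPhone 16"))
```

Building the argument list separately from running it keeps the command easy to log and inspect before anything touches the simulator.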

## Capture

| Command                      | Purpose                                    | Key Options                                    |
| ---------------------------- | ------------------------------------------ | ---------------------------------------------- |
| `ui simulator screen`        | Screenshot + accessibility tree            | `--tree`, `--optimize`, `--output`             |
| `ui simulator record`        | Record simulator video                     | `--duration`, `--codec`, `--force`, `--output` |
| `ui simulator session start` | Start background tree + screenshot capture | `-S`, `--interval-ms`, `--retention-seconds`   |
| `ui simulator session stop`  | Stop the active capture session            | `-S`                                           |

## Query

| Command             | Purpose                          | Key Options                          |
| ------------------- | -------------------------------- | ------------------------------------ |
| `ui simulator find` | Locate elements by label/ID/role | `--by-id`, `--by-role`, `--contains` |

## Interaction

| Command                   | Purpose                             | Key Options                                                    |
| ------------------------- | ----------------------------------- | -------------------------------------------------------------- |
| `ui simulator tap`        | Tap element or coordinates          | `--point`, `--duration`, `--by-id`                             |
| `ui simulator double-tap` | Double tap element or coordinates   | `--point`, `--by-id`                                           |
| `ui simulator type`       | Type into focused field             | `--clear`, `--mask`                                            |
| `ui simulator swipe`      | Swipe by direction or coordinates   | `--from`, `--to`, `--duration`, `--distance`                   |
| `ui simulator scroll`     | Scroll content (gentler than swipe) | `--direction`, `--speed`, `--distance`, `--until`, `--timeout` |
| `ui simulator pinch`      | Pinch to zoom in/out                | `in`/`out`, `--scale`, `--point`, `--duration`                 |
| `ui simulator rotate`     | Rotate with a two-finger gesture    | `<degrees>`, `--point`, `--radius`, `--duration`               |
| `ui simulator back`       | Navigate back (edge swipe)          |                                                                |

## Wait and Assert

| Command                        | Purpose                    | Key Options                                              |
| ------------------------------ | -------------------------- | -------------------------------------------------------- |
| `ui simulator wait`            | Wait for element state     | `--timeout`, `--poll`, `--gone`, `--enabled`, `--stable` |
| `ui simulator assert visible`  | Assert element is visible  | `--by-id`                                                |
| `ui simulator assert hidden`   | Assert element is hidden   | `--by-id`                                                |
| `ui simulator assert enabled`  | Assert element is enabled  | `--by-id`                                                |
| `ui simulator assert disabled` | Assert element is disabled | `--by-id`                                                |
| `ui simulator assert text`     | Assert element text        | `--expected`, `--contains`, `--by-id`                    |

## Input and Keyboard

| Command                      | Purpose                       | Key Options            |
| ---------------------------- | ----------------------------- | ---------------------- |
| `ui simulator erase`         | Erase text from focused field | `--characters`         |
| `ui simulator hide-keyboard` | Hide the on-screen keyboard   |                        |
| `ui simulator key`           | Send keyboard keycodes        | `--sequence`, `--hold` |

## App Control

| Command                    | Purpose                 | Key Options |
| -------------------------- | ----------------------- | ----------- |
| `ui simulator open-url`    | Open a URL or deep link |             |
| `ui simulator clear-state` | Clear app data/state    |             |

## Hardware

| Command                       | Purpose                | Key Options                                                            |
| ----------------------------- | ---------------------- | ---------------------------------------------------------------------- |
| `ui simulator button`         | Press hardware buttons | `home`, `lock`, `siri`, `applepay`, `volumeup`, `volumedown`, `--hold` |
| `ui simulator set-appearance` | Set light or dark mode | `light`, `dark`                                                        |

## Advanced

| Command                   | Purpose                   | Key Options |
| ------------------------- | ------------------------- | ----------- |
| `ui simulator touch down` | Touch down at coordinates | `x,y`, `-S` |
| `ui simulator touch up`   | Touch up at coordinates   | `x,y`, `-S` |

## Key Options Explained

### Screen and Record

* `--output` takes a file path. Screenshots default to `.png` and recordings to `.mov`. If omitted, FlowDeck writes to a temp file and prints the path.
* `--tree` returns only the accessibility tree (no screenshot data).
* `--optimize` shrinks the screenshot output for AI workflows.
* `screen` reports size in points; JSON includes `point_width`/`point_height` and `pixel_width`/`pixel_height` when available.
* `--duration` is in seconds (supports decimals like `2.5`).
* `--codec` accepts `h264` or `hevc`.
* `--force` overwrites an existing output file.
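
The `record` options above can be validated before the command is launched. This is a rough Python sketch under stated assumptions: `record_args` is a hypothetical helper, each option is assumed to take its value as the next argument, and every option is treated as optional.

```python
VALID_CODECS = {"h264", "hevc"}


def record_args(output=None, duration=None, codec=None, force=False):
    """Return the option list for `flowdeck ui simulator record`."""
    args = []
    if duration is not None:
        if duration <= 0:
            raise ValueError("duration must be a positive number of seconds")
        args += ["--duration", str(duration)]  # seconds, decimals allowed
    if codec is not None:
        if codec not in VALID_CODECS:
            raise ValueError(f"codec must be one of {sorted(VALID_CODECS)}")
        args += ["--codec", codec]
    if output is not None:
        args += ["--output", output]
    if force:
        args.append("--force")  # overwrite an existing output file
    return args


print(record_args(output="login.mov", duration=2.5, codec="hevc", force=True))
```

Rejecting an unsupported codec locally gives a clearer error than waiting for the CLI to fail partway through a test run.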

### Sessions

* **Always pass `-S <name-or-udid>`** on `session start` and `session stop` to target a specific simulator. Accepts a simulator name (e.g., `"iPhone 16"`) or raw UDID. This is required when multiple simulators are booted and recommended in all automation workflows.
* `session start` captures tree + screenshots every 500ms by default and writes to `./.flowdeck/automation/sessions/<session-short-id>/` (the short ID is the first 8 characters of the session UUID).
* Starting a session stops any active session first and requires a booted simulator.
* `--interval-ms` and `--retention-seconds` tune capture frequency and retention (default 60s). Retention always keeps at least one capture.
* Sessions capture on every interval, but an entry is written only when the tree or screenshot changes, and a screenshot is stored only when its pixel content changes (JPEG at 50% quality).
* `latest.json` points to the most recent capture, `latest.jpg` symlinks to the latest screenshot, and `latest-tree.json` symlinks to the latest accessibility tree (no directory listing required).
* JSON output from `session start` includes full paths for the session directory, screens, trees, latest pointers, the UDID, and the current screen size in points (`screen`).
* `session stop` ends the active session and cleans up captured data. When `-S` is provided, it resolves the name to a UDID and verifies the active session matches before stopping (returns an error on mismatch). JSON output includes the stopped session's UDID.
* Sessions also end automatically if the simulator disappears.
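
The directory layout described above can be reconstructed from a session UUID. The Python sketch below is illustrative only: the helper names are hypothetical, and it assumes the three `latest` files sit directly inside the session directory relative to the current working directory. When available, prefer the full paths reported by `session start --json`.

```python
import json
from pathlib import Path

SESSIONS_ROOT = Path(".flowdeck") / "automation" / "sessions"


def session_dir(session_uuid):
    """Session directory: the short ID is the first 8 characters of the UUID."""
    return SESSIONS_ROOT / session_uuid[:8]


def latest_paths(session_uuid):
    """Paths of the three 'latest' pointers (assumed to live in the session dir)."""
    base = session_dir(session_uuid)
    return {
        "pointer": base / "latest.json",
        "screenshot": base / "latest.jpg",
        "tree": base / "latest-tree.json",
    }


def read_latest_tree(session_uuid):
    """Load the most recent accessibility tree; no schema is assumed here."""
    return json.loads(latest_paths(session_uuid)["tree"].read_text())


print(session_dir("1a2b3c4d-0000-4000-8000-000000000000"))
```

Because `latest.jpg` and `latest-tree.json` are symlinks, reading them always follows through to the newest capture without listing the directory.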

### Find, Tap, and Double-Tap

* `--by-id` matches accessibility identifiers instead of labels.
* `--by-role` matches element roles (e.g. `button`, `textField`).
* `--contains` does a substring match against labels.
* `--point` expects `x,y` coordinates in points (matches normalized screenshots and tree output).
* Do not scale by @2x/@3x or device resolution; use the image coordinates directly.
* Coordinate taps use the provided point exactly; use label/ID taps to target element centers.
* `--geometry` accepts `points` only.
* Session screenshots are normalized to point size so image coordinates map 1:1 to points.
* `--duration` on `tap` is a long-press hold time in seconds.
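
Since `--point` takes point coordinates read straight from the screenshot or tree, a bounds check is the only arithmetic needed. The Python sketch below assumes a hypothetical helper `point_flag`; the screen size passed in would come from the `point_width`/`point_height` values that `screen` reports, and the 393x852 size in the example is just a sample value.

```python
def point_flag(x, y, point_width, point_height):
    """Return ["--point", "x,y"] after checking the point lies on screen.

    x and y are in points, taken directly from a normalized screenshot or
    the tree output. No @2x/@3x scaling is applied.
    """
    if not (0 <= x < point_width and 0 <= y < point_height):
        raise ValueError(
            f"({x}, {y}) is outside the {point_width}x{point_height} point screen"
        )
    return ["--point", f"{x:g},{y:g}"]


tap = ["flowdeck", "ui", "simulator", "tap",
       *point_flag(196, 420, 393, 852), "-S", "iPhone 16"]
print(tap)
```

A coordinate that fails this check was most likely read from a pixel-sized image rather than a point-sized one.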

### Swipe and Scroll

* `swipe` supports directions `up`, `down`, `left`, `right` or explicit `--from x,y` + `--to x,y`.
* `--duration` for `swipe` is in seconds (default `0.3`).
* `--distance` sets swipe/scroll distance as a **fraction of the screen** (0.05–0.95), not pixels or points. Defaults: swipe `0.4`, scroll `0.2`.
  * Example: `--distance 0.25` scrolls 25% of the screen height.
  * Invalid: `--distance 300` (FlowDeck will reject values outside 0.05–0.95).
* `--direction` for `scroll` uses `UP`, `DOWN`, `LEFT`, or `RIGHT` (default `DOWN`).
* `--speed` is a 0–100 value (higher is faster).
* `--until` scrolls until an element is visible. Use `id:myElement` to match by accessibility identifier.
* `--timeout` for `scroll --until` is in milliseconds.
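
Because `--distance` is a fraction rather than a length, a distance measured in points has to be divided by the screen size first. A small Python sketch of that conversion follows; `distance_fraction` and `scroll_args` are hypothetical helpers, and using the screen width for horizontal gestures is an assumption (the example above only confirms height for vertical scrolling).

```python
MIN_DISTANCE = 0.05
MAX_DISTANCE = 0.95


def distance_fraction(points, screen_extent):
    """Convert a travel distance in points to a --distance fraction.

    `screen_extent` is the screen height (vertical gestures) or, by
    assumption, the width (horizontal gestures), in points. The result
    is clamped to the accepted 0.05-0.95 range.
    """
    if screen_extent <= 0:
        raise ValueError("screen_extent must be positive")
    fraction = points / screen_extent
    return min(MAX_DISTANCE, max(MIN_DISTANCE, fraction))


def scroll_args(points, screen_height, direction="DOWN"):
    """Option list for `flowdeck ui simulator scroll`."""
    fraction = distance_fraction(points, screen_height)
    return ["--direction", direction, "--distance", f"{fraction:.2f}"]


print(scroll_args(213, 852))  # 213 / 852 = 0.25 of the screen height
```

Clamping rather than raising is a design choice here: a request that is too long simply becomes the largest scroll FlowDeck accepts.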

### Pinch and Wait

* `pinch` uses a direction argument: `in` (zoom out) or `out` (zoom in).
* `--scale` overrides the zoom factor (defaults to `2.0` for `out`, `0.5` for `in`).
* `--point` expects `x,y` coordinates in points for the pinch center.
* `--geometry` accepts `points` only.
* `--duration` for `pinch` is in seconds.
* `--timeout` for `wait` is in seconds; `--poll` is in milliseconds.
* `--gone`, `--enabled`, and `--stable` change the wait condition (default is “exists”).
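
The timing flags do not share a unit: `wait --timeout` is in seconds, `wait --poll` is in milliseconds, and `scroll --until --timeout` (see Swipe and Scroll) is in milliseconds. One way to avoid mixing them up is to keep a single time budget in seconds and convert at the edge. The helpers below are hypothetical names in a Python sketch.

```python
def wait_timing(timeout_s, poll_ms=None):
    """Options for `wait`: --timeout in seconds, --poll in milliseconds."""
    args = ["--timeout", f"{timeout_s:g}"]
    if poll_ms is not None:
        args += ["--poll", str(int(poll_ms))]
    return args


def scroll_until_timing(target, timeout_s):
    """Options for `scroll --until`: its --timeout is in milliseconds."""
    return ["--until", target, "--timeout", str(round(timeout_s * 1000))]


budget_s = 10
print(wait_timing(budget_s, poll_ms=250))
print(scroll_until_timing("id:submitButton", budget_s))
```

Here `id:submitButton` is a placeholder identifier; the same 10-second budget becomes `10` for `wait` and `10000` for `scroll --until`.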

### Text, Key, and App Control

* `type --clear` clears the focused field before typing.
* `type --mask` hides typed text in terminal output and JSON.
* `erase --characters 5` deletes a specific number of characters; omit to clear all.
* `key 40` sends a single HID keycode (for example, `40` = Enter).
* `key --sequence 11,8,15` sends comma-separated HID keycodes.
* `key --hold 1.0` holds a single key for the given seconds.
* `open-url` accepts `https://...` or custom schemes like `myapp://path`.
* `clear-state` requires a bundle identifier (for example `com.example.app`) and resets the simulator container for that app.
* `rotate` performs a two-finger rotation gesture (degrees, optional center/radius).
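
The keycodes used by `key` follow the USB HID keyboard usage IDs, where `a` through `z` are 4 through 29, which is why `11,8,15` spells `hel`. The Python sketch below builds a `--sequence` value from plain text. `hid_sequence` is a hypothetical helper, and it covers only unshifted keys (lowercase letters, digits, Enter, Tab, Space), since how FlowDeck handles modifiers is not described here.

```python
import string

# USB HID keyboard usage IDs: a-z are 4-29, 1-9 are 30-38, 0 is 39.
HID_KEYCODES = {ch: 4 + i for i, ch in enumerate(string.ascii_lowercase)}
HID_KEYCODES.update({str(d): 29 + d for d in range(1, 10)})
HID_KEYCODES.update({"0": 39, "\n": 40, "\t": 43, " ": 44})  # Enter, Tab, Space


def hid_sequence(text):
    """Return the comma-separated keycodes for `key --sequence`."""
    try:
        return ",".join(str(HID_KEYCODES[ch]) for ch in text)
    except KeyError as err:
        raise ValueError(f"no keycode mapping for {err.args[0]!r}") from None


print(hid_sequence("hel"))  # 11,8,15
```

For ordinary text entry, `type` is the simpler choice; a keycode sequence is mainly useful for keys that have no printable character.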

### Set Appearance

* `set-appearance` accepts `light` or `dark` to switch the simulator's appearance mode.
* If `-S` is omitted, the booted simulator is used.
* JSON output includes the applied appearance and target UDID.

### Buttons and Touch

* `button --hold 1.5` holds a hardware button for the given seconds.
* `touch down` and `touch up` expect `x,y` coordinates in points.
* Do not scale by @2x/@3x or device resolution; use the image coordinates directly.
* `--geometry` accepts `points` only.
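
`touch down` and `touch up` are separate commands, so a custom press has to pair them and make sure the release always happens. The following Python sketch shows one way to do that. `touch_pair` and `press_and_hold` are hypothetical helpers, and the argument order (`x,y` before `-S`) is an assumption. For a plain long press, `tap --duration` is simpler.

```python
import subprocess
import time


def touch_pair(x, y, simulator):
    """Return the `touch down` and `touch up` argv lists for one point."""
    point = f"{x:g},{y:g}"  # points, not pixels
    base = ["flowdeck", "ui", "simulator", "touch"]
    return [
        base + ["down", point, "-S", simulator],
        base + ["up", point, "-S", simulator],
    ]


def press_and_hold(x, y, simulator, hold_s, run=subprocess.run, sleep=time.sleep):
    """Touch down, hold, then touch up, releasing even if the hold is interrupted."""
    down, up = touch_pair(x, y, simulator)
    run(down, check=True)
    try:
        sleep(hold_s)
    finally:
        run(up, check=True)  # never leave a touch pressed


print(touch_pair(120, 300, "iPhone 16"))
```

The `run` and `sleep` parameters exist so the sequence can be exercised without a simulator by passing in stand-ins.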

## Performance and Reliability Tips

* **Start a session before any UI work.** Run `flowdeck ui simulator session start -S <name-or-udid> --json`, parse the JSON output to get the `latest_screenshot` and `latest_tree` file paths, then use the Read tool on those paths to see the screen and inspect elements.
* **Verify after every action.** After each tap/type/swipe, wait about 1 second, then re-read `latest_screenshot` to confirm the UI changed. Never chain actions without checking.
* Use accessibility identifiers and `--by-id` whenever possible; label matching is slower and more ambiguous.
* Before tapping an element, read `latest_tree` to confirm it exists and is visible.
* For automation loops, re-read `latest-tree.json`/`latest.jpg` from disk instead of issuing `screen` for every step. The session updates these files automatically.
* For one-off tree-only checks, use `ui simulator screen --tree --json -S <name-or-udid>` (no screenshot). Use `--optimize` when you do need a one-off screenshot.
* Use `scroll --until "id:yourElement"` to bring off-screen targets into view before tapping.
* Increase `--poll` for slow UIs to reduce load; decrease when you need faster detection.
* Tune input timing with environment variables:
  * `FLOWDECK_HID_STABILIZATION_MS` (default 25) for tap/gesture stability
  * `FLOWDECK_TYPE_DELAY_MS` (default 20) for typing speed
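
The verify-after-every-action tip can be approximated without knowing the tree schema: hash the latest capture before the action and poll until the hash changes. The Python sketch below does that and also shows how the timing variables might be passed to a subprocess. All helper names are hypothetical, and the timeout and poll values are arbitrary starting points.

```python
import hashlib
import os
import time
from pathlib import Path


def file_digest(path):
    """SHA-256 of a file's bytes, or None if the file does not exist yet."""
    try:
        return hashlib.sha256(Path(path).read_bytes()).hexdigest()
    except FileNotFoundError:
        return None


def wait_for_change(path, previous_digest, timeout_s=3.0, poll_s=0.1):
    """Poll `path` until its content differs from `previous_digest`.

    Returns the new digest, or None if nothing changed within the timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        digest = file_digest(path)
        if digest is not None and digest != previous_digest:
            return digest
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll_s)


def flowdeck_env(stabilization_ms=None, type_delay_ms=None):
    """Copy of the current environment with optional timing overrides."""
    env = dict(os.environ)
    if stabilization_ms is not None:
        env["FLOWDECK_HID_STABILIZATION_MS"] = str(stabilization_ms)
    if type_delay_ms is not None:
        env["FLOWDECK_TYPE_DELAY_MS"] = str(type_delay_ms)
    return env
```

A typical loop records `file_digest(latest_tree)` before an action, performs the action, then calls `wait_for_change`. A `None` result means the UI did not visibly react and the next step should not proceed. The dictionary from `flowdeck_env` can be passed as `env=` to `subprocess.run`.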

For a full help listing, run `flowdeck ui simulator --help`.
