iOS development requires Apple’s CLI tools. xcodebuild compiles. simctl manages simulators. devicectl handles devices. Every build system for Apple platforms calls them eventually.
AI agents can call them too. The results are bad.
xcodebuild dumps thousands of lines of unstructured text. simctl needs UDIDs that vary across machines. devicectl uses completely different syntax for the same operations. The agent guesses flags, fails, retries, and burns context. Projects add their own complications: custom workspace structures, non-standard derived data paths, provisioning setups that only work with certain signing identities. The agent has no way to discover any of this.
That’s why wrappers exist. Something between the agent and Apple’s tools that parses the output and gives the agent structured data.
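As a rough illustration of what such a wrapper does, the sketch below turns a simulator name into a UDID by parsing JSON in the shape that `xcrun simctl list devices --json` emits. The sample data and its UDIDs are made up, and `find_simulators` is a hypothetical helper, not part of any real tool.

```python
import json

# Trimmed sample in the shape of `xcrun simctl list devices --json`.
# The UDIDs below are invented for illustration.
SAMPLE = """
{
  "devices": {
    "com.apple.CoreSimulator.SimRuntime.iOS-17-5": [
      {"name": "iPhone 15", "udid": "AAAAAAAA-0000-0000-0000-000000000001",
       "state": "Shutdown", "isAvailable": true},
      {"name": "iPhone 15 Pro", "udid": "AAAAAAAA-0000-0000-0000-000000000002",
       "state": "Booted", "isAvailable": true}
    ],
    "com.apple.CoreSimulator.SimRuntime.iOS-16-4": [
      {"name": "iPhone 15", "udid": "AAAAAAAA-0000-0000-0000-000000000003",
       "state": "Shutdown", "isAvailable": false}
    ]
  }
}
"""


def find_simulators(simctl_json: str, name: str) -> list[dict]:
    """Return available simulators with the given name as flat records."""
    data = json.loads(simctl_json)
    matches = []
    for runtime, devices in data["devices"].items():
        for device in devices:
            # Skip simulators whose runtime is not installed on this machine.
            if device["name"] == name and device.get("isAvailable", False):
                matches.append({
                    "name": device["name"],
                    "udid": device["udid"],
                    # Keep only the last component, e.g. "iOS-17-5".
                    "runtime": runtime.rsplit(".", 1)[-1],
                    "state": device["state"],
                })
    return matches


result = find_simulators(SAMPLE, "iPhone 15")
print(json.dumps(result, indent=2))
```

The agent asks for "iPhone 15" and gets back one flat record with the UDID for this machine, instead of guessing an identifier it cannot know.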
Two approaches have emerged.
MCP: register tools, let the agent pick
An MCP server registers tools with the agent. Each tool gets a name, description, and JSON parameter schema. The agent sees all definitions in its context window, picks one, calls it through the MCP stdio protocol, reads the response.
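To make the shape of a tool definition concrete, here is a minimal sketch of one, written as a Python dictionary. The `name`, `description`, and `inputSchema` keys follow the MCP tool format; the parameters and wording are invented for illustration and do not come from any real server. The token estimate uses a crude four-characters-per-token rule of thumb, which is an assumption and not a real tokenizer.

```python
import json

# Hypothetical tool definition: the description and parameters are
# invented for illustration, not copied from a real MCP server.
tool = {
    "name": "build_sim",
    "description": "Build an app for the iOS Simulator.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "scheme": {
                "type": "string",
                "description": "Xcode scheme to build.",
            },
            "simulatorName": {
                "type": "string",
                "description": "Name of the simulator to build for.",
            },
        },
        "required": ["scheme"],
    },
}

# Everything in this dictionary is serialized into the agent's context.
serialized = json.dumps(tool)

# Rough heuristic (assumption): about 4 characters per token.
approx_tokens = len(serialized) // 4
print(f"{len(serialized)} characters, roughly {approx_tokens} tokens")
```

This toy definition is much smaller than a production one. Real definitions carry longer descriptions and more parameters, which is where the per-tool cost comes from.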
For iOS, the most popular MCP server registers 59 tools by default. Build for simulator. Build for device. Screenshot. Log capture. Test. Launch. Stop. Each a separate tool definition.
Each definition costs 550 to 1,400 tokens. With 59 tools, that’s roughly 32,000 to 83,000 tokens consumed before the agent writes a single line of code. That context is gone for the entire session whether the agent uses those tools or not.
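The arithmetic behind that estimate is simple enough to check. The sketch below multiplies the per-definition range by the tool count; the 200,000-token context window is an assumed figure used only to put the totals in proportion.

```python
TOOL_COUNT = 59
TOKENS_PER_TOOL_LOW = 550
TOKENS_PER_TOOL_HIGH = 1_400

# Assumption for illustration only: a 200,000-token context window.
CONTEXT_WINDOW = 200_000

total_low = TOOL_COUNT * TOKENS_PER_TOOL_LOW
total_high = TOOL_COUNT * TOKENS_PER_TOOL_HIGH

share_low = total_low / CONTEXT_WINDOW
share_high = total_high / CONTEXT_WINDOW

print(f"Schema overhead: {total_low:,} to {total_high:,} tokens")
print(f"Share of assumed window: {share_low:.0%} to {share_high:.0%}")
```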
Tool discovery is automatic. That’s the real strength. The agent can’t miss a tool because it’s in context. The protocol is standardized. Every major AI coding tool supports it.
But the costs compound.
The agent chooses between build_sim, build_and_run, build_device, build_macOS, and spm_build on every step. More tools, more wrong choices. Multi-step reasoning degrades after 3 to 4 sequential MCP calls because each response adds to the context burden. MCP tool responses cap at 25K tokens. A large build log or full test suite exceeds that, forcing truncation or pagination across multiple calls.
CLI: teach the agent, let it execute
A CLI exposes a small number of commands with flags. build, run, test, clean. Simulator vs device vs macOS is a flag, not a separate command. The CLI calls xcodebuild, simctl, and devicectl under the hood. It handles flag resolution, destination matching, and output parsing.
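A minimal sketch of the "one command, one flag" idea, assuming a hypothetical wrapper called `mybuild`. The `--target` and `--simulator-name` flags are invented for this example; the `-scheme` and `-destination` arguments and the destination specifiers are standard xcodebuild usage. This is not how any particular CLI is implemented, only an illustration of the pattern.

```python
import argparse


def destination_for(target: str, simulator_name: str) -> str:
    """Map a target flag value to an xcodebuild destination specifier."""
    if target == "simulator":
        return f"platform=iOS Simulator,name={simulator_name}"
    if target == "device":
        return "generic/platform=iOS"
    if target == "macos":
        return "platform=macOS"
    raise ValueError(f"unknown target: {target}")


def build_parser() -> argparse.ArgumentParser:
    # "mybuild" and its flags are hypothetical names for this sketch.
    parser = argparse.ArgumentParser(prog="mybuild")
    commands = parser.add_subparsers(dest="command", required=True)
    build = commands.add_parser("build", help="Build a scheme")
    build.add_argument("--scheme", required=True)
    build.add_argument("--target",
                       choices=["simulator", "device", "macos"],
                       default="simulator")
    build.add_argument("--simulator-name", default="iPhone 15")
    return parser


def xcodebuild_argv(args: argparse.Namespace) -> list[str]:
    """Build the underlying xcodebuild invocation without running it."""
    return [
        "xcodebuild",
        "-scheme", args.scheme,
        "-destination", destination_for(args.target, args.simulator_name),
        "build",
    ]


args = build_parser().parse_args(
    ["build", "--scheme", "MyApp", "--target", "device"])
argv = xcodebuild_argv(args)
print(argv)
```

The agent picks one command and one flag value. The mapping to the right destination string lives in the wrapper, not in the agent's context.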
The agent learns the CLI from a skill file. A markdown document with the command reference and workflow patterns. The agent calls commands via bash. No protocol. No runtime. No tool registration.
LLMs already understand terminal commands. They’ve been trained on billions of shell interactions from Stack Overflow, GitHub, and documentation. There’s no schema to inject. Each command costs roughly 200 tokens versus 550 to 1,400 per MCP tool definition.
The numbers are consistent across benchmarks. CLI completes identical tasks with 10x to 32x fewer tokens than MCP. One benchmark found checking a repo’s language cost 1,365 tokens via CLI and 44,026 via MCP. The overhead is almost entirely schema definitions the agent never touches.
The disadvantage: discovery. The agent needs the skill file to know the CLI exists. Without it, the agent falls back to raw xcodebuild. MCP doesn’t have this problem.
The comparison
Context cost. 59 tool schemas sit in context all session. A skill file is one document referenced as needed. Not marginal. The benchmarks above put the gap at 10x to 32x.
Tool selection. Five build variants to choose from vs one command with a flag. Fewer commands, fewer wrong choices.
Response limits. MCP caps at 25K tokens per response. Build logs and test suites exceed that. CLI stdout has no protocol-imposed cap.
Composability. CLI commands chain. Pipe through jq, redirect to files, combine with grep. MCP tools don’t compose. Each call is isolated.
Human usability. Nobody runs MCP tools from the terminal during development. A CLI works for both humans and agents. Same command, same output, same failure mode.
Reasoning quality. Agents degrade after 3 to 4 sequential MCP tool calls as context fills with tool responses. CLI interactions are lighter. The agent keeps more of its context window for actual problem-solving.
Where MCP fits
Stateless integrations. Databases, APIs, CRMs. Services that don’t have a CLI (Slack, Notion, most SaaS platforms). Independent operations that don’t depend on each other. The tool model maps cleanly and automatic discovery is a genuine advantage.
Where CLI fits
Sequential workflows. Build pipelines. Anything where step two depends on step one. Operations that produce large output. Workflows where a human and an agent use the same tool. Local development where you don’t want a runtime, a daemon, or telemetry between you and your build.
iOS development is all of those things.
Why FlowDeck is a CLI
FlowDeck is a native Swift CLI for iOS, macOS, watchOS, and tvOS development. Eight commands. Structured NDJSON output on every command. Built-in UI automation for simulator interaction. Skill files for Claude Code, Codex, OpenCode, and Gemini CLI.
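For readers unfamiliar with NDJSON: it is one JSON object per line, so a consumer can parse output line by line and filter it without scraping text. The sketch below shows the general idea. The event types and field names are invented for illustration and are not FlowDeck's actual output schema.

```python
import json

# Hypothetical NDJSON stream. The "type", "severity", and other fields
# are assumptions for this sketch, not a documented schema.
STREAM = """\
{"type": "started", "scheme": "MyApp"}
{"type": "diagnostic", "severity": "warning", "file": "A.swift", "line": 10, "message": "unused variable"}
{"type": "diagnostic", "severity": "error", "file": "B.swift", "line": 42, "message": "cannot find 'foo' in scope"}
{"type": "finished", "success": false}
"""


def parse_ndjson(text: str) -> list[dict]:
    """Parse newline-delimited JSON, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


events = parse_ndjson(STREAM)

# Keep only the error diagnostics, the part an agent usually needs.
errors = [
    event for event in events
    if event.get("type") == "diagnostic" and event.get("severity") == "error"
]

for error in errors:
    print(f'{error["file"]}:{error["line"]}: {error["message"]}')
```

Because each line stands alone, the same filtering works on a stream as it arrives, and it is the kind of step that composes with tools like jq or grep.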
No Node.js. No MCP protocol. No telemetry. No background process. The agent calls flowdeck build the same way you do.
We chose CLI because a build system is a pipeline, not a bag of independent operations. The agent doesn’t need 59 tools. It needs 8 commands, structured output, and the ability to see the screen.
Try FlowDeck free for 7 days.
One CLI for builds, simulators, tests, logs, and UI automation. Native Swift. Runs locally. No telemetry.
$59/yr after trial · Zero telemetry