Flow-Next Drive
The flow-next-drive skill drives any UI surface the way a real user would — a web app, a Chromium-backed desktop app (Electron / Windows WebView2), or a genuinely native app (macOS AppKit/SwiftUI, or a webview exposing no CDP). It detects the surface, picks the highest available driver on a ladder, and degrades gracefully when a richer driver is absent.
It is a router, not a single driver. The default rung, Vercel’s agent-browser CLI, is the only driver assumed present; every other rung is detected and optional. A pass succeeds with whatever the environment actually has. Most cloud VMs, Linux hosts, and CI runners have no Computer Use, so Computer Use is never a hard dependency and is never attempted on a headless/no-display path.
When to use it
- Verifying a deployed UI change matches the spec.
- Driving or testing a web app, an Electron / WebView2 desktop app, or a native desktop app.
- Reading documentation that has no clean text version.
- Capturing baseline screenshots before a redesign.
- Logging into a service and pulling structured data.
- Light e2e probes that do not warrant a full test framework.
It orchestrates drivers; it does not reimplement them. The full native-desktop QA workflow (scenario authoring, bug filing, verdict) is a downstream /flow-next:qa concern. This skill provides only the driver actuation and the surface-detection branch.
Step 1 — Detect the surface, then branch
| Surface | What it is | Path |
|---|---|---|
| Web app | A URL in a browser (localhost dev server, staging, production) | Web ladder |
| Chromium-backed desktop app | Electron / Windows WebView2 — Chromium under the hood, exposes a CDP debug port | Web ladder, attaching over CDP to the app’s remote-debugging port |
| True-native / non-CDP surface | macOS AppKit/SwiftUI, Catalyst, or a webview exposing no CDP (macOS WKWebView, which Tauri uses on macOS) | Native rung → Computer Use |
Per-platform caveat: Windows WebView2 is CDP-drivable (web ladder); macOS WKWebView generally is not (native rung). When unsure whether a desktop app exposes CDP, probe for the web ladder first (try to launch/attach with a debug port); if no port is reachable, fall to the native rung.
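The probe-then-fall-back rule above can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: the function names `cdp_reachable` and `choose_path` are hypothetical, and the probe assumes the app exposes Chromium's standard DevTools HTTP endpoint (`/json/version`) on the loopback interface.

```python
import http.client
import json
import urllib.request
from typing import Optional


def cdp_reachable(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if a Chromium DevTools endpoint answers on host:port."""
    url = f"http://{host}:{port}/json/version"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            info = json.load(resp)
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, or a non-JSON reply: no usable debug port.
        return False
    # A CDP endpoint reports a browser-level WebSocket URL in this payload.
    return isinstance(info, dict) and "webSocketDebuggerUrl" in info


def choose_path(is_url: bool, debug_port: Optional[int] = None) -> str:
    """Map a surface to 'web-ladder' or 'native-rung' (hypothetical labels)."""
    if is_url:
        return "web-ladder"  # plain web app
    if debug_port is not None and cdp_reachable(debug_port):
        return "web-ladder"  # Electron / WebView2, attached over CDP
    return "native-rung"     # no reachable port: fall to Computer Use
```

For example, `choose_path(False, 9222)` only returns `"web-ladder"` if something that looks like a DevTools endpoint answers on port 9222; otherwise the surface is treated as native.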
Step 2 — The universal flow (all surfaces)
Whatever driver the environment has, the work has the same shape:
1. observe / navigate to the target
2. snapshot → fresh element refs (REQUIRED before each act)
3. act → click / fill / type / press / scroll
4. verify → confirm the expected text / state appeared
5. capture → screenshot + console/errors (and on failure)
6. release → close the tab / end the session when fully done

Refs (@e1, @e2, …) go stale after any navigation, click, or form submit, so always re-snapshot. “ref not found” or “pointer-events: none” almost always means a stale snapshot, not a real bug.
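The flow can be sketched as a driver-agnostic loop. Everything here is illustrative: the `Driver` protocol, the `Step` record, and `run_flow` are hypothetical names, not an API of agent-browser or any other rung. The point is the shape: a fresh snapshot before every act, a verify after it, a capture on failure, and a release no matter what.

```python
from dataclasses import dataclass
from typing import Protocol


class Driver(Protocol):
    """Hypothetical minimal surface that any rung would need to expose."""
    def snapshot(self) -> dict[str, str]: ...   # ref -> accessible name
    def act(self, ref: str, action: str, value: str = "") -> None: ...
    def page_text(self) -> str: ...
    def screenshot(self, label: str) -> None: ...
    def release(self) -> None: ...


@dataclass
class Step:
    target: str        # accessible name to find in the snapshot
    action: str        # "click", "fill", "press", ...
    value: str = ""
    expect: str = ""   # text that must be present after the act


def run_flow(driver: Driver, steps: list[Step]) -> bool:
    try:
        for step in steps:
            refs = driver.snapshot()  # never reuse refs from an earlier step
            ref = next((r for r, name in refs.items() if name == step.target), None)
            if ref is None:
                driver.screenshot(f"missing-{step.target}")
                return False
            driver.act(ref, step.action, step.value)
            if step.expect and step.expect not in driver.page_text():
                driver.screenshot(f"failed-{step.target}")  # capture on failure
                return False
        driver.screenshot("final")
        return True
    finally:
        driver.release()  # close the tab / end the session either way
```

Looking refs up by name after each snapshot, rather than caching `@e…` values across steps, is what keeps the loop safe against stale refs.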
Step 3 — Web ladder (web apps + Chromium-backed desktop apps)
Probe availability top-down and use the highest rung that passes; fail soft to the next; the terminal rung is manual.
| Rung | Driver | Use when |
|---|---|---|
| 1 (default) | agent-browser CLI | Always assumed present. CDP-based, headless-safe, no extra install. Drives web apps; drives Electron / WebView2 over CDP (--cdp <port> / --auto-connect). |
| 2 | chrome-devtools-mcp | You want built-in auto-wait, DevTools-grade network/console inspection, Lighthouse, or to attach to your real signed-in Chrome (--browser-url) so bot defenses don’t challenge an automated profile. |
| 3 | Playwright | The repo already has Playwright configured, or you need a headless CI-style / cross-browser regression run. |
| 4 | cursor-ide-browser MCP | Running inside Cursor with this MCP installed and you want its snapshot YAML + browser_cdp control. |
| 5 (terminal) | Manual + screenshot relay | No browser driver available — drive yourself, paste console errors and screenshots into chat. |
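A minimal sketch of "probe top-down, fail soft, end at manual", assuming each rung can be represented by a cheap availability probe. The `pick_rung` helper and the probes shown in the usage are hypothetical; a real probe would check for a binary on PATH, a configured MCP server, or a Playwright config in the repo.

```python
from typing import Callable

Probe = Callable[[], bool]


def pick_rung(ladder: list[tuple[str, Probe]], terminal: str = "manual") -> str:
    """Return the first rung whose probe passes, else the terminal rung."""
    for name, probe in ladder:
        try:
            if probe():
                return name
        except Exception:
            # Fail soft: a crashing probe counts as "rung not available".
            continue
    return terminal


def broken_probe() -> bool:
    raise RuntimeError("probe crashed")


# Usage with stand-in probes (assumptions, not real detection logic):
ladder = [
    ("chrome-devtools-mcp", broken_probe),
    ("playwright", lambda: False),
    ("agent-browser", lambda: True),
]
selected = pick_rung(ladder)
```

The order of the list is the caller's preference for the task at hand. The helper only guarantees that an unavailable or failing rung never aborts the pass.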
The same ladder drives Electron / WebView2 apps by attaching to the app’s remote-debugging port. Launch the app with a dedicated debug port and user-data-dir; treat the open debug port as a security exposure (any local app can drive that session).
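As a rough sketch of "a dedicated debug port and user-data-dir", the helper below builds a launch command using Chromium's `--remote-debugging-port` and `--user-data-dir` switches. It assumes the target app passes these switches through to its embedded Chromium, which is not guaranteed: some apps and WebView2 hosts are configured differently. The name `debug_launch_args` is hypothetical.

```python
import tempfile


def debug_launch_args(app_path: str, port: int) -> list[str]:
    """Build argv for launching a Chromium-backed app with an isolated debug session."""
    if not 1024 <= port <= 65535:
        raise ValueError("use an unprivileged port for the debug session")
    # A throwaway profile keeps the debug session away from the user's real data.
    profile_dir = tempfile.mkdtemp(prefix="drive-profile-")
    return [
        app_path,
        f"--remote-debugging-port={port}",
        f"--user-data-dir={profile_dir}",
    ]
```

The resulting list can be handed to `subprocess.Popen`. Because any local process can drive a session on an open debug port, the app should be shut down as soon as the pass is done.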
Step 4 — Native rung: Computer Use
A genuinely native app (or a non-CDP webview) has no browser tab to attach to. The only way to drive it is Computer Use: the model looks at the screen, moves a cursor, clicks, and types. This rung is driver-agnostic and uses whichever of the following the host offers:
- Codex Computer Use (macOS / Windows).
- Anthropic “Claude” Computer Use: the API computer tool, run via its own harness (a controlled display/sandbox or an MCP wrapper).
Detect availability and use whichever the environment provides; verify the tool and beta-header version at build time, since it drifts. The actuation differs from the web ladder, but the universal flow is identical.
Graceful degradation
When no Computer Use is present:
- A Chromium-backed app still drives via the web-ladder CDP attach, or by driving its local dev-server URL in a browser. (Shell-level integration — system tray, native menus, OS dialogs — can’t be reached this way; surface that limitation.)
- A genuinely native app with no Computer Use → document the limitation rather than fail.
agent-browser stays the only assumed-present driver. No MCP server or Computer Use is ever a hard install dependency.
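The degradation rules above can be read as a small decision function that returns a path plus the limitations to report, instead of raising. This is a sketch under assumptions: the surface labels, the `plan_drive` name, and the wording of the notes are all hypothetical.

```python
from typing import Optional

SHELL_LIMIT = (
    "Shell-level integration (system tray, native menus, OS dialogs) "
    "is not reachable over CDP or a dev-server URL."
)


def plan_drive(
    surface: str,                    # "chromium-desktop" or "native" (hypothetical labels)
    computer_use: bool,
    cdp_port: Optional[int] = None,
    dev_url: Optional[str] = None,
) -> tuple[str, list[str]]:
    """Return (path, limitations). Never raises just because a driver is absent."""
    if surface == "chromium-desktop":
        if cdp_port is not None:
            return f"web-ladder: attach over CDP on port {cdp_port}", [SHELL_LIMIT]
        if dev_url:
            return f"web-ladder: drive {dev_url} in a browser", [SHELL_LIMIT]
    if computer_use:
        return "native-rung: Computer Use", []
    # Nothing can drive this surface here: document it rather than fail.
    return "not driven", [f"No Computer Use available for the {surface} surface."]
```

Returning the limitation as data lets the caller put it in the final report, which matches "document the limitation rather than fail".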
What this is not
- Not a test framework. There is no assertion DSL, no parallel runner, no flake retry.
- Not a scraper for restricted sites. Respect robots.txt and the target’s terms of service.
- It orchestrates drivers — it does not reimplement Playwright or Computer Use.
Common pitfalls
- Forgetting to re-snapshot after a click or form submission, which leaves refs stale.
- Routing an Electron / WebView2 app to Computer Use — it’s Chromium, drive it over CDP via the web ladder.
- Treating Computer Use as a default — most environments lack it; it’s the native-rung fallback, not the common case.
Driver ladder and universal-flow structure inspired by Ray Fernando’s running-bug-review-board skill (Apache-2.0).