SDKs, MCP clients, CLI, and custom WebSocket clients
This is where models or orchestrators live. They need a stable way to request highlights, annotations, actions, state reads, and playbook runs.
Architecture explainer
The key idea is simple: agent intent should travel through a structured runtime that keeps actions routable, observable, and explainable to a human operator.
Agent surfaces feed a shared runtime core, which then routes intent into the browser overlay, browser extension, or desktop accessibility layer.
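The flow above can be sketched in a few lines. This is a minimal illustration, not the project's actual API: the `Intent` and `RuntimeCore` names, the surface identifiers, and the result shape are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical surface names derived from the prose; not a real API.
SURFACES = ("browser_overlay", "browser_extension", "desktop_accessibility")


@dataclass
class Intent:
    kind: str      # e.g. "highlight", "annotate", "action", "read_state"
    surface: str   # which surface should carry out the intent
    payload: dict = field(default_factory=dict)


class RuntimeCore:
    """Routes intents to registered surface handlers and records each decision."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[Intent], str]] = {}
        self.log: List[dict] = []  # observable trail for a human operator

    def register(self, surface: str, handler: Callable[[Intent], str]) -> None:
        if surface not in SURFACES:
            raise ValueError(f"unknown surface: {surface}")
        self._handlers[surface] = handler

    def route(self, intent: Intent) -> dict:
        handler = self._handlers.get(intent.surface)
        if handler is None:
            # Structured failure instead of a silent drop.
            result = {"ok": False, "error": "no_handler", "surface": intent.surface}
        else:
            result = {"ok": True, "result": handler(intent)}
        self.log.append(
            {"kind": intent.kind, "surface": intent.surface, "ok": result["ok"]}
        )
        return result


core = RuntimeCore()
core.register("browser_overlay", lambda i: f"overlay handled {i.kind}")

routed = core.route(Intent("highlight", "browser_overlay", {"selector": "#submit"}))
unrouted = core.route(Intent("action", "desktop_accessibility"))
```

Every call to `route` appends to `core.log`, whether or not a handler exists, so the operator sees unroutable intents as well as successful ones.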
Layers
Agent surfaces are where models or orchestrators live: SDKs, MCP clients, the CLI, and custom WebSocket clients. They need a stable way to request highlights, annotations, actions, state reads, and playbook runs.
The runtime core handles sessions, role-aware routing, backpressure, tool execution, structured failures, and the verify-after-action loop.
The final layer is where intent becomes visible or actionable: browser overlays for narration, extension state replay to cope with tab churn, and macOS accessibility for desktop control.
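One way to picture the runtime core's sessions, backpressure, and structured failures is a bounded queue per session. The sketch below is an assumption-laden illustration: the `Session` class, the `max_pending` limit, and the failure fields (`code`, `retryable`) are hypothetical and do not describe the real runtime.

```python
import queue
from typing import Optional


class Session:
    """Hypothetical per-session request buffer with a hard capacity."""

    def __init__(self, session_id: str, role: str, max_pending: int = 2) -> None:
        self.session_id = session_id
        self.role = role  # e.g. "agent" or "operator"; used for routing decisions
        self._pending: "queue.Queue[dict]" = queue.Queue(maxsize=max_pending)

    def submit(self, request: dict) -> dict:
        try:
            self._pending.put_nowait(request)
        except queue.Full:
            # Backpressure surfaces as a structured, retryable failure.
            return {
                "ok": False,
                "code": "backpressure",
                "retryable": True,
                "session": self.session_id,
            }
        return {"ok": True, "queued": self._pending.qsize()}

    def drain_one(self) -> Optional[dict]:
        """Take the oldest pending request, or None if the queue is empty."""
        try:
            return self._pending.get_nowait()
        except queue.Empty:
            return None


session = Session("s-1", role="agent", max_pending=2)
results = [session.submit({"kind": "highlight", "n": n}) for n in range(3)]
first = session.drain_one()
after_drain = session.submit({"kind": "annotate", "n": 3})
```

Rejecting the third request with an explicit `backpressure` code, rather than blocking or dropping it, lets the caller decide whether to retry or slow down.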
Adoption paths
Add the overlay package to a web app and let agents visibly guide users without changing the entire architecture.
Run the combined MCP server and relay as a single runtime so model hosts get a structured tool surface immediately.
Extend beyond browser-only workflows by pairing the browser integration with the native accessibility backend on macOS.
Trust boundaries
Highlights, annotations, and cursor movement give the operator a readable model of intent before a step executes.
The runtime can classify failures, re-read state, and recover instead of assuming a click or action worked.
SDKs, overlays, extensions, and accessibility layers stay composable so product teams can adopt the parts they need.
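The idea of re-reading state after an action, rather than assuming it worked, can be sketched as a small loop. This is an illustrative sketch only: `act_and_verify`, its parameters, and the failure labels `action_error` and `state_mismatch` are assumptions for this example, not the runtime's real interface.

```python
from typing import Any, Callable, Dict, List


def act_and_verify(
    act: Callable[[], None],
    read_state: Callable[[], Any],
    expected: Callable[[Any], bool],
    max_attempts: int = 3,
) -> Dict[str, Any]:
    """Run an action, re-read state, and retry until the state matches."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    history: List[dict] = []
    for attempt in range(1, max_attempts + 1):
        try:
            act()
        except Exception as exc:
            # The action itself raised: classify and try again.
            history.append(
                {"attempt": attempt, "failure": "action_error", "detail": str(exc)}
            )
            continue

        observed = read_state()  # never assume the action worked
        if expected(observed):
            return {"ok": True, "attempts": attempt, "state": observed, "history": history}
        history.append(
            {"attempt": attempt, "failure": "state_mismatch", "detail": observed}
        )

    return {"ok": False, "failure": history[-1]["failure"], "history": history}


# Simulated page where the first click is silently ignored.
page = {"clicks": 0, "submitted": False}


def flaky_click() -> None:
    page["clicks"] += 1
    if page["clicks"] >= 2:
        page["submitted"] = True


outcome = act_and_verify(flaky_click, lambda: dict(page), lambda s: s["submitted"])
```

In this run the first click has no effect, the state check catches the mismatch, and the second attempt succeeds; the `history` list keeps the classified failure so an operator can see what was retried and why.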
Bridge to docs
These are the best technical references once a developer understands the top-level architecture.