Files
openclaw/docs/concepts/architecture.md
2026-02-09 10:19:26 -05:00

4.9 KiB
Raw Blame History

summary, read_when, title
summary read_when title
WebSocket gateway architecture, components, and client flows
Working on gateway protocol, clients, or transports
Gateway Architecture

Gateway architecture

Last updated: 2026-01-22

Overview

  • A single longlived Gateway owns all messaging surfaces (WhatsApp via Baileys, Telegram via grammY, Slack, Discord, Signal, iMessage, WebChat).
  • Control-plane clients (macOS app, CLI, web UI, automations) connect to the Gateway over WebSocket on the configured bind host (default 127.0.0.1:18789).
  • Nodes (macOS/iOS/Android/headless) also connect over WebSocket, but declare role: node with explicit caps/commands.
  • One Gateway per host; it is the only place that opens a WhatsApp session.
  • A canvas host (default 18793) serves agenteditable HTML and A2UI.

Components and flows

Gateway (daemon)

  • Maintains provider connections.
  • Exposes a typed WS API (requests, responses, serverpush events).
  • Validates inbound frames against JSON Schema.
  • Emits events like agent, chat, presence, health, heartbeat, cron.

Clients (mac app / CLI / web admin)

  • One WS connection per client.
  • Send requests (health, status, send, agent, system-presence).
  • Subscribe to events (tick, agent, presence, shutdown).

Nodes (macOS / iOS / Android / headless)

  • Connect to the same WS server with role: node.
  • Provide a device identity in connect; pairing is devicebased (role node) and approval lives in the device pairing store.
  • Expose commands like canvas.*, camera.*, screen.record, location.get.

Protocol details:

WebChat

  • Static UI that uses the Gateway WS API for chat history and sends.
  • In remote setups, connects through the same SSH/Tailscale tunnel as other clients.

Connection lifecycle (single client)

```mermaid %%{init: { 'theme': 'base', 'themeVariables': { 'primaryColor': '#ffffff', 'primaryTextColor': '#000000', 'primaryBorderColor': '#000000', 'lineColor': '#000000', 'secondaryColor': '#f9f9fb', 'tertiaryColor': '#ffffff', 'clusterBkg': '#f9f9fb', 'clusterBorder': '#000000', 'nodeBorder': '#000000', 'mainBkg': '#ffffff', 'edgeLabelBackground': '#ffffff' } }}%% sequenceDiagram participant Client participant Gateway

Client->>Gateway: req:connect
Gateway-->>Client: res (ok)
Note right of Gateway: or res error + close
Note left of Client: payload=hello-ok<br>snapshot: presence + health

Gateway-->>Client: event:presence
Gateway-->>Client: event:tick

Client->>Gateway: req:agent
Gateway-->>Client: res:agent<br>ack {runId, status:"accepted"}
Gateway-->>Client: event:agent<br>(streaming)
Gateway-->>Client: res:agent<br>final {runId, status, summary}
</p>

## Wire protocol (summary)

- Transport: WebSocket, text frames with JSON payloads.
- First frame **must** be `connect`.
- After handshake:
  - Requests: `{type:"req", id, method, params}` → `{type:"res", id, ok, payload|error}`
  - Events: `{type:"event", event, payload, seq?, stateVersion?}`
- If `OPENCLAW_GATEWAY_TOKEN` (or `--token`) is set, `connect.params.auth.token`
  must match or the socket closes.
- Idempotency keys are required for sideeffecting methods (`send`, `agent`) to
  safely retry; the server keeps a shortlived dedupe cache.
- Nodes must include `role: "node"` plus caps/commands/permissions in `connect`.

## Pairing + local trust

- All WS clients (operators + nodes) include a **device identity** on `connect`.
- New device IDs require pairing approval; the Gateway issues a **device token**
  for subsequent connects.
- **Local** connects (loopback or the gateway hosts own tailnet address) can be
  autoapproved to keep samehost UX smooth.
- **Nonlocal** connects must sign the `connect.challenge` nonce and require
  explicit approval.
- Gateway auth (`gateway.auth.*`) still applies to **all** connections, local or
  remote.

Details: [Gateway protocol](/gateway/protocol), [Pairing](/channels/pairing),
[Security](/gateway/security).

## Protocol typing and codegen

- TypeBox schemas define the protocol.
- JSON Schema is generated from those schemas.
- Swift models are generated from the JSON Schema.

## Remote access

- Preferred: Tailscale or VPN.
- Alternative: SSH tunnel

  ```bash
  ssh -N -L 18789:127.0.0.1:18789 user@host
  • The same handshake + auth token apply over the tunnel.
  • TLS + optional pinning can be enabled for WS in remote setups.

Operations snapshot

  • Start: openclaw gateway (foreground, logs to stdout).
  • Health: health over WS (also included in hello-ok).
  • Supervision: launchd/systemd for autorestart.

Invariants

  • Exactly one Gateway controls a single Baileys session per host.
  • Handshake is mandatory; any nonJSON or nonconnect first frame is a hard close.
  • Events are not replayed; clients must refresh on gaps.