feat: add fast mode toggle for OpenAI models

This commit is contained in:
Peter Steinberger
2026-03-12 23:30:58 +00:00
parent ddcaec89e9
commit d5bffcdeab
66 changed files with 990 additions and 36 deletions

View File

@@ -47,6 +47,7 @@ OpenClaw ships with the piai catalog. These providers require **no**
- Override per model via `agents.defaults.models["openai/<model>"].params.transport` (`"sse"`, `"websocket"`, or `"auto"`)
- OpenAI Responses WebSocket warm-up defaults to enabled via `params.openaiWsWarmup` (`true`/`false`)
- OpenAI priority processing can be enabled via `agents.defaults.models["openai/<model>"].params.serviceTier`
- OpenAI fast mode can be enabled per model via `agents.defaults.models["<provider>/<model>"].params.fastMode`
```json5
{
@@ -78,6 +79,7 @@ OpenClaw ships with the piai catalog. These providers require **no**
- CLI: `openclaw onboard --auth-choice openai-codex` or `openclaw models auth login --provider openai-codex`
- Default transport is `auto` (WebSocket-first, SSE fallback)
- Override per model via `agents.defaults.models["openai-codex/<model>"].params.transport` (`"sse"`, `"websocket"`, or `"auto"`)
- Shares the same `/fast` toggle and `params.fastMode` config as direct `openai/*`
- Policy note: OpenAI Codex OAuth is explicitly supported for external tools/workflows like OpenClaw.
```json5

View File

@@ -281,7 +281,7 @@ Runtime override (owner only):
- `openclaw status` — shows store path and recent sessions.
- `openclaw sessions --json` — dumps every entry (filter with `--active <minutes>`).
- `openclaw gateway call sessions.list --params '{}'` — fetch sessions from the running gateway (use `--url`/`--token` for remote gateway access).
- Send `/status` as a standalone message in chat to see whether the agent is reachable, how much of the session context is used, current thinking/verbose toggles, and when your WhatsApp web creds were last refreshed (helps spot relink needs).
- Send `/status` as a standalone message in chat to see whether the agent is reachable, how much of the session context is used, current thinking/fast/verbose toggles, and when your WhatsApp web creds were last refreshed (helps spot relink needs).
- Send `/context list` or `/context detail` to see whats in the system prompt and injected workspace files (and the biggest context contributors).
- Send `/stop` (or standalone abort phrases like `stop`, `stop action`, `stop run`, `stop openclaw`) to abort the current run, clear queued followups for that session, and stop any sub-agent runs spawned from it (the reply includes the stopped count).
- Send `/compact` (optional instructions) as a standalone message to summarize older context and free up window space. See [/concepts/compaction](/concepts/compaction).

View File

@@ -165,6 +165,46 @@ pass that field through on direct `openai/*` Responses requests.
Supported values are `auto`, `default`, `flex`, and `priority`.
### OpenAI fast mode
OpenClaw exposes a shared fast-mode toggle for both `openai/*` and
`openai-codex/*` sessions:
- Chat/UI: `/fast status|on|off`
- Config: `agents.defaults.models["<provider>/<model>"].params.fastMode`
When fast mode is enabled, OpenClaw applies a low-latency OpenAI profile:
- `reasoning.effort = "low"` when the payload does not already specify reasoning
- `text.verbosity = "low"` when the payload does not already specify verbosity
- `service_tier = "priority"` for direct `openai/*` Responses calls to `api.openai.com`
Example:
```json5
{
agents: {
defaults: {
models: {
"openai/gpt-5.4": {
params: {
fastMode: true,
},
},
"openai-codex/gpt-5.4": {
params: {
fastMode: true,
},
},
},
},
},
}
```
Session overrides win over config. Clearing the session override in the Sessions UI
returns the session to the configured default.
### OpenAI Responses server-side compaction
For direct OpenAI Responses models (`openai/*` using `api: "openai-responses"` with

View File

@@ -14,7 +14,7 @@ The host-only bash chat command uses `! <cmd>` (with `/bash <cmd>` as an alias).
There are two related systems:
- **Commands**: standalone `/...` messages.
- **Directives**: `/think`, `/verbose`, `/reasoning`, `/elevated`, `/exec`, `/model`, `/queue`.
- **Directives**: `/think`, `/fast`, `/verbose`, `/reasoning`, `/elevated`, `/exec`, `/model`, `/queue`.
- Directives are stripped from the message before the model sees it.
- In normal chat messages (not directive-only), they are treated as “inline hints” and do **not** persist session settings.
- In directive-only messages (the message contains only directives), they persist to the session and reply with an acknowledgement.
@@ -102,6 +102,7 @@ Text + native (when enabled):
- `/send on|off|inherit` (owner-only)
- `/reset` or `/new [model]` (optional model hint; remainder is passed through)
- `/think <off|minimal|low|medium|high|xhigh>` (dynamic choices by model/provider; aliases: `/thinking`, `/t`)
- `/fast status|on|off` (omitting the arg shows the current effective fast-mode state)
- `/verbose on|full|off` (alias: `/v`)
- `/reasoning on|off|stream` (alias: `/reason`; when on, sends a separate message prefixed `Reasoning:`; `stream` = Telegram draft only)
- `/elevated on|off|ask|full` (alias: `/elev`; `full` skips exec approvals)
@@ -130,6 +131,7 @@ Notes:
- Discord thread-binding commands (`/focus`, `/unfocus`, `/agents`, `/session idle`, `/session max-age`) require effective thread bindings to be enabled (`session.threadBindings.enabled` and/or `channels.discord.threadBindings.enabled`).
- ACP command reference and runtime behavior: [ACP Agents](/tools/acp-agents).
- `/verbose` is meant for debugging and extra visibility; keep it **off** in normal use.
- `/fast on|off` persists a session override. Use the Sessions UI `inherit` option to clear it and fall back to config defaults.
- Tool failure summaries are still shown when relevant, but detailed failure text is only included when `/verbose` is `on` or `full`.
- `/reasoning` (and `/verbose`) are risky in group settings: they may reveal internal reasoning or tool output you did not intend to expose. Prefer leaving them off, especially in group chats.
- **Fast path:** command-only messages from allowlisted senders are handled immediately (bypass queue + model).

View File

@@ -1,7 +1,7 @@
---
summary: "Directive syntax for /think + /verbose and how they affect model reasoning"
summary: "Directive syntax for /think, /fast, /verbose, and reasoning visibility"
read_when:
- Adjusting thinking or verbose directive parsing or defaults
- Adjusting thinking, fast-mode, or verbose directive parsing or defaults
title: "Thinking Levels"
---
@@ -42,6 +42,19 @@ title: "Thinking Levels"
- **Embedded Pi**: the resolved level is passed to the in-process Pi agent runtime.
## Fast mode (/fast)
- Levels: `on|off`.
- Directive-only message toggles a session fast-mode override and replies `Fast mode enabled.` / `Fast mode disabled.`.
- Send `/fast` (or `/fast status`) with no mode to see the current effective fast-mode state.
- OpenClaw resolves fast mode in this order:
1. Inline/directive-only `/fast on|off`
2. Session override
3. Per-model config: `agents.defaults.models["<provider>/<model>"].params.fastMode`
4. Fallback: `off`
- For `openai/*`, fast mode applies the OpenAI fast profile: `service_tier=priority` when supported, plus low reasoning effort and low text verbosity.
- For `openai-codex/*`, fast mode applies the same low-latency profile on Codex Responses. OpenClaw keeps one shared `/fast` toggle across both auth paths.
## Verbose directives (/verbose or /v)
- Levels: `on` (minimal) | `full` | `off` (default).

View File

@@ -75,7 +75,7 @@ The Control UI can localize itself on first load based on your browser locale, a
- Stream tool calls + live tool output cards in Chat (agent events)
- Channels: WhatsApp/Telegram/Discord/Slack + plugin channels (Mattermost, etc.) status + QR login + per-channel config (`channels.status`, `web.login.*`, `config.patch`)
- Instances: presence list + refresh (`system-presence`)
- Sessions: list + per-session thinking/verbose overrides (`sessions.list`, `sessions.patch`)
- Sessions: list + per-session thinking/fast/verbose/reasoning overrides (`sessions.list`, `sessions.patch`)
- Cron jobs: list/add/edit/run/enable/disable + run history (`cron.*`)
- Skills: status, enable/disable, install, API key updates (`skills.*`)
- Nodes: list + caps (`node.list`)

View File

@@ -37,7 +37,7 @@ Use `--password` if your Gateway uses password auth.
- Header: connection URL, current agent, current session.
- Chat log: user messages, assistant replies, system notices, tool cards.
- Status line: connection/run state (connecting, running, streaming, idle, error).
- Footer: connection state + agent + session + model + think/verbose/reasoning + token counts + deliver.
- Footer: connection state + agent + session + model + think/fast/verbose/reasoning + token counts + deliver.
- Input: text editor with autocomplete.
## Mental model: agents + sessions
@@ -92,6 +92,7 @@ Core:
Session controls:
- `/think <off|minimal|low|medium|high>`
- `/fast <status|on|off>`
- `/verbose <on|full|off>`
- `/reasoning <on|off|stream>`
- `/usage <off|tokens|full>`