feat: add fast mode toggle for OpenAI models

2026-05-28 15:00:41 +00:00 · 2026-03-12 23:30:58 +00:00
parent ddcaec89e9
commit d5bffcdeab
66 changed files with 990 additions and 36 deletions
--- a/docs/tools/slash-commands.md
+++ b/docs/tools/slash-commands.md
@@ -14,7 +14,7 @@ The host-only bash chat command uses `! <cmd>` (with `/bash <cmd>` as an alias).
 There are two related systems:

 - **Commands**: standalone `/...` messages.
- **Directives**: `/think`, `/verbose`, `/reasoning`, `/elevated`, `/exec`, `/model`, `/queue`.
+- **Directives**: `/think`, `/fast`, `/verbose`, `/reasoning`, `/elevated`, `/exec`, `/model`, `/queue`.
  - Directives are stripped from the message before the model sees it.
  - In normal chat messages (not directive-only), they are treated as “inline hints” and do **not** persist session settings.
  - In directive-only messages (the message contains only directives), they persist to the session and reply with an acknowledgement.
@@ -102,6 +102,7 @@ Text + native (when enabled):
 - `/send on|off|inherit` (owner-only)
 - `/reset` or `/new [model]` (optional model hint; remainder is passed through)
 - `/think <off|minimal|low|medium|high|xhigh>` (dynamic choices by model/provider; aliases: `/thinking`, `/t`)
+- `/fast status|on|off` (omitting the arg shows the current effective fast-mode state)
 - `/verbose on|full|off` (alias: `/v`)
 - `/reasoning on|off|stream` (alias: `/reason`; when on, sends a separate message prefixed `Reasoning:`; `stream` = Telegram draft only)
 - `/elevated on|off|ask|full` (alias: `/elev`; `full` skips exec approvals)
@@ -130,6 +131,7 @@ Notes:
 - Discord thread-binding commands (`/focus`, `/unfocus`, `/agents`, `/session idle`, `/session max-age`) require effective thread bindings to be enabled (`session.threadBindings.enabled` and/or `channels.discord.threadBindings.enabled`).
 - ACP command reference and runtime behavior: [ACP Agents](/tools/acp-agents).
 - `/verbose` is meant for debugging and extra visibility; keep it **off** in normal use.
+- `/fast on|off` persists a session override. Use the Sessions UI `inherit` option to clear it and fall back to config defaults.
 - Tool failure summaries are still shown when relevant, but detailed failure text is only included when `/verbose` is `on` or `full`.
 - `/reasoning` (and `/verbose`) are risky in group settings: they may reveal internal reasoning or tool output you did not intend to expose. Prefer leaving them off, especially in group chats.
 - **Fast path:** command-only messages from allowlisted senders are handled immediately (bypass queue + model).
--- a/docs/tools/thinking.md
+++ b/docs/tools/thinking.md
@@ -1,7 +1,7 @@
 ---
-summary: "Directive syntax for /think + /verbose and how they affect model reasoning"
+summary: "Directive syntax for /think, /fast, /verbose, and reasoning visibility"
 read_when:
-  - Adjusting thinking or verbose directive parsing or defaults
+  - Adjusting thinking, fast-mode, or verbose directive parsing or defaults
 title: "Thinking Levels"
 ---

@@ -42,6 +42,19 @@ title: "Thinking Levels"

 - **Embedded Pi**: the resolved level is passed to the in-process Pi agent runtime.

+## Fast mode (/fast)
+
+- Levels: `on|off`.
+- Directive-only message toggles a session fast-mode override and replies `Fast mode enabled.` / `Fast mode disabled.`.
+- Send `/fast` (or `/fast status`) with no mode to see the current effective fast-mode state.
+- OpenClaw resolves fast mode in this order:
+  1. Inline/directive-only `/fast on|off`
+  2. Session override
+  3. Per-model config: `agents.defaults.models["<provider>/<model>"].params.fastMode`
+  4. Fallback: `off`
+- For `openai/*`, fast mode applies the OpenAI fast profile: `service_tier=priority` when supported, plus low reasoning effort and low text verbosity.
+- For `openai-codex/*`, fast mode applies the same low-latency profile on Codex Responses. OpenClaw keeps one shared `/fast` toggle across both auth paths.
+
 ## Verbose directives (/verbose or /v)

 - Levels: `on` (minimal) | `full` | `off` (default).