Commit Graph

85 Commits

Author SHA1 Message Date
Vincent Koc
5320ee7731 fix(venice): harden discovery limits and tool support (#38306)
* Config: add supportsTools compat flag

* Agents: add model tool support helper

* Venice: sync discovery and fallback metadata

* Agents: skip tools for unsupported models

* Changelog: note Venice provider hardening

* Update CHANGELOG.md

* Venice: cap degraded discovery metadata

* Apply suggestion from @greptile-apps[bot]

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

* Venice: tolerate partial discovery capabilities

* Venice: tolerate missing discovery specs

---------

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
2026-03-06 19:07:11 -05:00
Josh Lehman
fee91fefce feature(context): extend plugin system to support custom context management (#22201)
* feat(context-engine): add ContextEngine interface and registry

Introduce the pluggable ContextEngine abstraction that allows external
plugins to register custom context management strategies.

- ContextEngine interface with lifecycle methods: bootstrap, ingest,
  ingestBatch, afterTurn, assemble, compact, prepareSubagentSpawn,
  onSubagentEnded, dispose
- Module-level singleton registry with registerContextEngine() and
  resolveContextEngine() (config-driven slot selection)
- LegacyContextEngine: pass-through implementation wrapping existing
  compaction behavior for 100% backward compatibility
- ensureContextEnginesInitialized() guard for safe one-time registration
- 19 tests covering contract, registry, resolution, and legacy parity

* feat(plugins): add context-engine slot and registerContextEngine API

Wire the ContextEngine abstraction into the plugin system so external
plugins can register context engines via the standard plugin API.

- Add 'context-engine' to PluginKind union type
- Add 'contextEngine' slot to PluginSlotsConfig (default: 'legacy')
- Wire registerContextEngine() through OpenClawPluginApi
- Export ContextEngine types from plugin-sdk for external consumers
- Restore proper slot-based resolution in registry

* feat(context-engine): wire ContextEngine into agent run lifecycle

Integrate the ContextEngine abstraction into the core agent run path:

- Resolve context engine once per run (reused across retries)
- Bootstrap: hydrate canonical store from session file on first run
- Assemble: route context assembly through pluggable engine
- Auto-compaction guard: disable built-in auto-compaction when
  the engine declares ownsCompaction (prevents double-compaction)
- AfterTurn: post-turn lifecycle hook for ingest + background
  compaction decisions
- Overflow compaction: route through contextEngine.compact()
- Dispose: clean up engine resources in finally block
- Notify context engine on subagent lifecycle events

Legacy engine: all lifecycle methods are pass-through/no-op, preserving
100% backward compatibility for users without a context engine plugin.

* feat(plugins): add scoped subagent methods and gateway request scope

Expose runtime.subagent.{run, waitForRun, getSession, deleteSession}
so external plugins can spawn sub-agent sessions without raw gateway
dispatch access.

Uses AsyncLocalStorage request-scope bridge to dispatch internally via
handleGatewayRequest with a synthetic operator client. Methods are only
available during gateway request handling.

- Symbol.for-backed global singleton for cross-module-reload safety
- Fallback gateway context for non-WS dispatch paths (Telegram/WhatsApp)
- Set gateway request scope for all handlers, not just plugin handlers
- 3 staleness tests for fallback context hardening

* feat(context-engine): route /compact and sessions.get through context engine

Wire the /compact command and sessions.get handler through the pluggable
ContextEngine interface.

- Thread tokenBudget and force parameters to context engine compact
- Route /compact through contextEngine.compact() when registered
- Wire sessions.get as runtime alias for plugin subagent dispatch
- Add .pebbles/ to .gitignore

* style: format with oxfmt 0.33.0

Fix duplicate import (ControlUiRootState in server.impl.ts) and
import ordering across all changed files.

* fix: update extension test mocks for context-engine types

Add missing subagent property to bluebubbles PluginRuntime mock.
Add missing registerContextEngine to lobster OpenClawPluginApi mock.

* fix(subagents): keep deferred delete cleanup retryable

* style: format run attempt for CI

* fix(rebase): remove duplicate embedded-run imports

* test: add missing gateway context mock export

* fix: pass resolved auth profile into afterTurn compaction

Ensure the embedded runner forwards resolved auth profile context into
legacy context-engine compaction params on the normal afterTurn path,
matching overflow compaction behavior. This allows downstream LCM
summarization to use the intended provider auth/profile consistently.

Also fix strict TS typing in external-link token dedupe and align an
attempt unit test reasoningLevel value with the current ReasoningLevel
enum.

Regeneration-Prompt: |
  We were debugging context-engine compaction where downstream summary
  calls were missing the right auth/profile context in normal afterTurn
  flow, while overflow compaction already propagated it. Preserve current
  behavior and keep changes additive: thread the resolved authProfileId
  through run -> attempt -> legacy compaction param builder without
  broad refactors.

  Add tests that prove the auth profile is included in afterTurn legacy
  params and that overflow compaction still passes it through run
  attempts. Keep existing APIs stable, and only adjust small type issues
  needed for strict compilation.

* fix: remove duplicate imports from rebase

* feat: add context-engine system prompt additions

* fix(rebase): dedupe attempt import declarations

* test: fix fetch mock typing in ollama autodiscovery

* fix(test): add registerContextEngine to diffs extension mock APIs

* test(windows): use path.delimiter in ios-team-id fixture PATH

* test(cron): add model formatting and precedence edge case tests

Covers:
- Provider/model string splitting (whitespace, nested paths, empty segments)
- Provider normalization (casing, aliases like bedrock→amazon-bedrock)
- Anthropic model alias normalization (opus-4.5→claude-opus-4-5)
- Precedence: job payload > session override > config default
- Sequential runs with different providers (CI flake regression pattern)
- forceNew session preserving stored model overrides
- Whitespace/empty model string edge cases
- Config model as string vs object format

* test(cron): fix model formatting test config types

* test(phone-control): add registerContextEngine to mock API

* fix: re-export ChannelKind from config-reload-plan

* fix: add subagent mock to plugin-runtime-mock test util

* docs: add changelog fragment for context engine PR #22201
2026-03-06 05:31:59 -08:00
Vignesh Natarajan
4daaea1190 fix(agents): avoid synthetic tool-result writes on idle-timeout cleanup 2026-03-05 19:29:18 -08:00
Vincent Koc
71ec42127d feat(hooks): emit compaction lifecycle hooks (#16788) 2026-03-05 19:08:26 -08:00
Sid
6c0376145f fix(agents): skip compaction API call when session has no real messages (#36451)
Merged via squash.

Prepared head SHA: 52dd631789
Co-authored-by: Sid-Qin <201593046+Sid-Qin@users.noreply.github.com>
Co-authored-by: jalehman <550978+jalehman@users.noreply.github.com>
Reviewed-by: @jalehman
2026-03-05 11:40:25 -08:00
Gustavo Madeira Santana
2370ea5d1b agents: propagate config for embedded skill loading 2026-03-03 02:44:56 -05:00
Vincent Koc
747902a26a fix(hooks): propagate run/tool IDs for tool hook correlation (#32360)
* Plugin SDK: add run and tool call fields to tool hooks

* Agents: propagate runId and toolCallId in before_tool_call

* Agents: thread runId through tool wrapper context

* Runner: pass runId into tool hook context

* Compaction: pass runId into tool hook context

* Agents: scope after_tool_call start data by run

* Tests: cover run and tool IDs in before_tool_call hooks

* Tests: add run-scoped after_tool_call collision coverage

* Hooks: scope adjusted tool params by run

* Tests: cover run-scoped adjusted param collisions

* Hooks: preserve active tool start metadata until end

* Changelog: add tool-hook correlation note
2026-03-02 17:23:08 -08:00
Vincent Koc
0954b6bf5f fix(hooks): propagate ephemeral sessionId through embedded tool contexts (#32273)
* fix(plugins): expose ephemeral sessionId in tool contexts for per-conversation isolation

The plugin tool context (`OpenClawPluginToolContext`) and tool hook
context (`PluginHookToolContext`) only provided `sessionKey`, which
is a durable channel identifier that survives /new and /reset.
Plugins like mem0 that need per-conversation isolation (e.g. mapping
Mem0 `run_id`) had no way to distinguish between conversations,
causing session-scoped memories to persist unbounded across resets.

Add `sessionId` (ephemeral UUID regenerated on /new and /reset) to:
- `OpenClawPluginToolContext` (factory context for plugin tools)
- `PluginHookToolContext` (before_tool_call / after_tool_call hooks)
- Internal `HookContext` for tool call wrappers

Thread the value from the run attempt through createOpenClawCodingTools
→ createOpenClawTools → resolvePluginTools and through the tool hook
wrapper.

Closes #31253

Made-with: Cursor

* fix(agents): propagate embedded sessionId through tool hook context

* test(hooks): cover sessionId in embedded tool hook contexts

* docs(changelog): add sessionId hook context follow-up note

* test(hooks): avoid toolCallId collision in after_tool_call e2e

---------

Co-authored-by: SidQin-cyber <sidqin0410@gmail.com>
2026-03-02 15:11:51 -08:00
Vincent Koc
a19a7f5e6e feat(security): Harden Docker browser container chromium flags (#23889) (#31504)
* Gateway: honor OPENCLAW_GATEWAY_URL override for remote/local calls

* Agents: fix sandbox sessionKey usage for PI embedded subagent calls

* Sandbox: tighten browser container Chromium runtime flags

* fix: add sandbox browser defaults for container hardening

* docs: expand sandbox browser default flags list

* fix: make sandbox browser flags optional and preserve gateway env auth overrides

* docs: scope PR 31504 changelog entry

* style: format gateway call override handling

* fix: dedupe sandbox browser chrome args

* fix: preserve remote tls fingerprint for env gateway override

* fix: enforce auth for env gateway URL override

* chore: document gateway override auth security expectations
2026-03-02 11:28:27 -08:00
Peter Steinberger
611dff985d fix(agents): harden embedded pi project settings loading 2026-02-26 21:46:39 +01:00
Onur Solmaz
a7d56e3554 feat: ACP thread-bound agents (#23580)
* docs: add ACP thread-bound agents plan doc

* docs: expand ACP implementation specification

* feat(acp): route ACP sessions through core dispatch and lifecycle cleanup

* feat(acp): add /acp commands and Discord spawn gate

* ACP: add acpx runtime plugin backend

* fix(subagents): defer transient lifecycle errors before announce

* Agents: harden ACP sessions_spawn and tighten spawn guidance

* Agents: require explicit ACP target for runtime spawns

* docs: expand ACP control-plane implementation plan

* ACP: harden metadata seeding and spawn guidance

* ACP: centralize runtime control-plane manager and fail-closed dispatch

* ACP: harden runtime manager and unify spawn helpers

* Commands: route ACP sessions through ACP runtime in agent command

* ACP: require persisted metadata for runtime spawns

* Sessions: preserve ACP metadata when updating entries

* Plugins: harden ACP backend registry across loaders

* ACPX: make availability probe compatible with adapters

* E2E: add manual Discord ACP plain-language smoke script

* ACPX: preserve streamed spacing across Discord delivery

* Docs: add ACP Discord streaming strategy

* ACP: harden Discord stream buffering for thread replies

* ACP: reuse shared block reply pipeline for projector

* ACP: unify streaming config and adopt coalesceIdleMs

* Docs: add temporary ACP production hardening plan

* Docs: trim temporary ACP hardening plan goals

* Docs: gate ACP thread controls by backend capabilities

* ACP: add capability-gated runtime controls and /acp operator commands

* Docs: remove temporary ACP hardening plan

* ACP: fix spawn target validation and close cache cleanup

* ACP: harden runtime dispatch and recovery paths

* ACP: split ACP command/runtime internals and centralize policy

* ACP: harden runtime lifecycle, validation, and observability

* ACP: surface runtime and backend session IDs in thread bindings

* docs: add temp plan for binding-service migration

* ACP: migrate thread binding flows to SessionBindingService

* ACP: address review feedback and preserve prompt wording

* ACPX plugin: pin runtime dependency and prefer bundled CLI

* Discord: complete binding-service migration cleanup and restore ACP plan

* Docs: add standalone ACP agents guide

* ACP: route harness intents to thread-bound ACP sessions

* ACP: fix spawn thread routing and queue-owner stall

* ACP: harden startup reconciliation and command bypass handling

* ACP: fix dispatch bypass type narrowing

* ACP: align runtime metadata to agentSessionId

* ACP: normalize session identifier handling and labels

* ACP: mark thread banner session ids provisional until first reply

* ACP: stabilize session identity mapping and startup reconciliation

* ACP: add resolved session-id notices and cwd in thread intros

* Discord: prefix thread meta notices consistently

* Discord: unify ACP/thread meta notices with gear prefix

* Discord: split thread persona naming from meta formatting

* Extensions: bump acpx plugin dependency to 0.1.9

* Agents: gate ACP prompt guidance behind acp.enabled

* Docs: remove temp experiment plan docs

* Docs: scope streaming plan to holy grail refactor

* Docs: refactor ACP agents guide for human-first flow

* Docs/Skill: add ACP feature-flag guidance and direct acpx telephone-game flow

* Docs/Skill: add OpenCode and Pi to ACP harness lists

* Docs/Skill: align ACP harness list with current acpx registry

* Dev/Test: move ACP plain-language smoke script and mark as keep

* Docs/Skill: reorder ACP harness lists with Pi first

* ACP: split control-plane manager into core/types/utils modules

* Docs: refresh ACP thread-bound agents plan

* ACP: extract dispatch lane and split manager domains

* ACP: centralize binding context and remove reverse deps

* Infra: unify system message formatting

* ACP: centralize error boundaries and session id rendering

* ACP: enforce init concurrency cap and strict meta clear

* Tests: fix ACP dispatch binding mock typing

* Tests: fix Discord thread-binding mock drift and ACP request id

* ACP: gate slash bypass and persist cleared overrides

* ACPX: await pre-abort cancel before runTurn return

* Extension: pin acpx runtime dependency to 0.1.11

* Docs: add pinned acpx install strategy for ACP extension

* Extensions/acpx: enforce strict local pinned startup

* Extensions/acpx: tighten acp-router install guidance

* ACPX: retry runtime test temp-dir cleanup

* Extensions/acpx: require proactive ACPX repair for thread spawns

* Extensions/acpx: require restart offer after acpx reinstall

* extensions/acpx: remove workspace protocol devDependency

* extensions/acpx: bump pinned acpx to 0.1.13

* extensions/acpx: sync lockfile after dependency bump

* ACPX: make runtime spawn Windows-safe

* fix: align doctor-config-flow repair tests with default-account migration (#23580) (thanks @osolmaz)
2026-02-26 11:00:09 +01:00
Peter Steinberger
6c2e999776 refactor(security): unify secure id paths and guard weak patterns 2026-02-22 10:16:19 +01:00
Peter Steinberger
c99e7696e6 fix: decouple owner display secret from gateway auth token 2026-02-22 09:35:07 +01:00
Vignesh Natarajan
cdfe45eeb8 Agents: validate persisted tool-call names 2026-02-21 23:06:44 -08:00
Vincent Koc
9abab6a2c9 Add explicit ownerDisplaySecret for owner ID hash obfuscation (#22520)
* feat(config): add owner display secret setting

* feat(prompt): add explicit owner hash secret to obfuscation path

* test(prompt): assert owner hash secret mode behavior

* Update src/agents/system-prompt.ts

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

---------

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
2026-02-21 03:13:56 -05:00
Glucksberg
1410d15c5e fix: compaction safeguard extension not loading in production builds (openclaw#22349) thanks @Glucksberg
Verified:
- pnpm build
- pnpm check
- pnpm test:macmini (local run had unrelated baseline failures; Tak approved proceed)

Co-authored-by: Glucksberg <80581902+Glucksberg@users.noreply.github.com>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-02-20 21:21:09 -06:00
Tak Hoffman
c1ac37a641 Config: expose Pi compaction tuning values (openclaw#21568) thanks @Takhoffman
Verified:
- pnpm install --frozen-lockfile
- pnpm build
- pnpm check
- pnpm test:macmini

Co-authored-by: Takhoffman <781889+Takhoffman@users.noreply.github.com>
2026-02-19 21:41:09 -06:00
Peter Steinberger
b8b43175c5 style: align formatting with oxfmt 0.33 2026-02-18 01:34:35 +00:00
Peter Steinberger
31f9be126c style: run oxfmt and fix gate failures 2026-02-18 01:29:02 +00:00
Peter Steinberger
6dcc052bb4 fix: stabilize model catalog and pi discovery auth storage compatibility 2026-02-18 02:09:40 +01:00
Peter Steinberger
b05e89e5e6 fix(agents): make image sanitization dimension configurable 2026-02-18 00:54:20 +01:00
Tyler Yust
087dca8fa9 fix(subagent): harden read-tool overflow guards and sticky reply threading (#19508)
* fix(gateway): avoid premature agent.wait completion on transient errors

* fix(agent): preemptively guard tool results against context overflow

* fix: harden tool-result context guard and add message_id metadata

* fix: use importOriginal in session-key mock to include DEFAULT_ACCOUNT_ID

The run.skill-filter test was mocking ../../routing/session-key.js with only
buildAgentMainSessionKey and normalizeAgentId, but the module also exports
DEFAULT_ACCOUNT_ID which is required transitively by src/web/auth-store.ts.

Switch to importOriginal pattern so all real exports are preserved alongside
the mocked functions.

* pi-runner: guard accumulated tool-result overflow in transformContext

* PI runner: compact overflowing tool-result context

* Subagent: harden tool-result context recovery

* Enhance tool-result context handling by adding support for legacy tool outputs and improving character estimation for message truncation. This includes a new function to create legacy tool results and updates to existing functions to better manage context overflow scenarios.

* Enhance iMessage handling by adding reply tag support in send functions and tests. This includes modifications to prepend or rewrite reply tags based on provided replyToId, ensuring proper message formatting for replies.

* Enhance message delivery across multiple channels by implementing sticky reply context for chunked messages. This includes preserving reply references in Discord, Telegram, and iMessage, ensuring that follow-up messages maintain their intended reply targets. Additionally, improve handling of reply tags in system prompts and tests to support consistent reply behavior.

* Enhance read tool functionality by implementing auto-paging across chunks when no explicit limit is provided, scaling output budget based on model context window. Additionally, add tests for adaptive reading behavior and capped continuation guidance for large outputs. Update related functions to support these features.

* Refine tool-result context management by stripping oversized read-tool details payloads during compaction, ensuring repeated read calls do not bypass context limits. Introduce new utility functions for handling truncation content and enhance character estimation for tool results. Add tests to validate the removal of excessive details in context overflow scenarios.

* Refine message delivery logic in Matrix and Telegram by introducing a flag to track if a text chunk was sent. This ensures that replies are only marked as delivered when a text chunk has been successfully sent, improving the accuracy of reply handling in both channels.

* fix: tighten reply threading coverage and prep fixes (#19508) (thanks @tyler6204)
2026-02-17 15:32:52 -08:00
cpojer
d0cb8c19b2 chore: wtf. 2026-02-17 13:36:48 +09:00
Sebastian
ed11e93cf2 chore(format) 2026-02-16 23:20:16 -05:00
cpojer
ac38d51290 chore: Fix types in tests 7/N. 2026-02-17 11:22:49 +09:00
Peter Steinberger
fb6e415d0c fix(agents): align session lock hold budget with run timeouts 2026-02-17 03:10:36 +01:00
cpojer
90ef2d6bdf chore: Update formatting. 2026-02-17 09:18:40 +09:00
Peter Steinberger
824901083b refactor(pi): dedupe compaction failure 2026-02-15 19:09:05 +00:00
Tyler Yust
b8f66c260d Agents: add nested subagent orchestration controls and reduce subagent token waste (#14447)
* Agents: add subagent orchestration controls

* Agents: add subagent orchestration controls (WIP uncommitted changes)

* feat(subagents): add depth-based spawn gating for sub-sub-agents

* feat(subagents): tool policy, registry, and announce chain for nested agents

* feat(subagents): system prompt, docs, changelog for nested sub-agents

* fix(subagents): prevent model fallback override, show model during active runs, and block context overflow fallback

Bug 1: When a session has an explicit model override (e.g., gpt/openai-codex),
the fallback candidate logic in resolveFallbackCandidates silently appended the
global primary model (opus) as a backstop. On reinjection/steer with a transient
error, the session could fall back to opus which has a smaller context window
and crash. Fix: when storedModelOverride is set, pass fallbacksOverride ?? []
instead of undefined, preventing the implicit primary backstop.

Bug 2: Active subagents showed 'model n/a' in /subagents list because
resolveModelDisplay only read entry.model/modelProvider (populated after run
completes). Fix: fall back to modelOverride/providerOverride fields which are
populated at spawn time via sessions.patch.

Bug 3: Context overflow errors (prompt too long, context_length_exceeded) could
theoretically escape runEmbeddedPiAgent and be treated as failover candidates
in runWithModelFallback, causing a switch to a model with a smaller context
window. Fix: in runWithModelFallback, detect context overflow errors via
isLikelyContextOverflowError and rethrow them immediately instead of trying the
next model candidate.

* fix(subagents): track spawn depth in session store and fix announce routing for nested agents

* Fix compaction status tracking and dedupe overflow compaction triggers

* fix(subagents): enforce depth block via session store and implement cascade kill

* fix: inject group chat context into system prompt

* fix(subagents): always write model to session store at spawn time

* Preserve spawnDepth when agent handler rewrites session entry

* fix(subagents): suppress announce on steer-restart

* fix(subagents): fallback spawned session model to runtime default

* fix(subagents): enforce spawn depth when caller key resolves by sessionId

* feat(subagents): implement active-first ordering for numeric targets and enhance task display

- Added a test to verify that subagents with numeric targets follow an active-first list ordering.
- Updated `resolveSubagentTarget` to sort subagent runs based on active status and recent activity.
- Enhanced task display in command responses to prevent truncation of long task descriptions.
- Introduced new utility functions for compacting task text and managing subagent run states.

* fix(subagents): show model for active runs via run record fallback

When the spawned model matches the agent's default model, the session
store's override fields are intentionally cleared (isDefault: true).
The model/modelProvider fields are only populated after the run
completes. This left active subagents showing 'model n/a'.

Fix: store the resolved model on SubagentRunRecord at registration
time, and use it as a fallback in both display paths (subagents tool
and /subagents command) when the session store entry has no model info.

Changes:
- SubagentRunRecord: add optional model field
- registerSubagentRun: accept and persist model param
- sessions-spawn-tool: pass resolvedModel to registerSubagentRun
- subagents-tool: pass run record model as fallback to resolveModelDisplay
- commands-subagents: pass run record model as fallback to resolveModelDisplay

* feat(chat): implement session key resolution and reset on sidebar navigation

- Added functions to resolve the main session key and reset chat state when switching sessions from the sidebar.
- Updated the `renderTab` function to handle session key changes when navigating to the chat tab.
- Introduced a test to verify that the session resets to "main" when opening chat from the sidebar navigation.

* fix: subagent timeout=0 passthrough and fallback prompt duplication

Bug 1: runTimeoutSeconds=0 now means 'no timeout' instead of applying 600s default
- sessions-spawn-tool: default to undefined (not 0) when neither timeout param
  is provided; use != null check so explicit 0 passes through to gateway
- agent.ts: accept 0 as valid timeout (resolveAgentTimeoutMs already handles
  0 → MAX_SAFE_TIMEOUT_MS)

Bug 2: model fallback no longer re-injects the original prompt as a duplicate
- agent.ts: track fallback attempt index; on retries use a short continuation
  message instead of the full original prompt since the session file already
  contains it from the first attempt
- Also skip re-sending images on fallback retries (already in session)

* feat(subagents): truncate long task descriptions in subagents command output

- Introduced a new utility function to format task previews, limiting their length to improve readability.
- Updated the command handler to use the new formatting function, ensuring task descriptions are truncated appropriately.
- Adjusted related tests to verify that long task descriptions are now truncated in the output.

* refactor(subagents): update subagent registry path resolution and improve command output formatting

- Replaced direct import of STATE_DIR with a utility function to resolve the state directory dynamically.
- Enhanced the formatting of command output for active and recent subagents, adding separators for better readability.
- Updated related tests to reflect changes in command output structure.

* fix(subagent): default sessions_spawn to no timeout when runTimeoutSeconds omitted

The previous fix (75a791106) correctly handled the case where
runTimeoutSeconds was explicitly set to 0 ("no timeout"). However,
when models omit the parameter entirely (which is common since the
schema marks it as optional), runTimeoutSeconds resolved to undefined.

undefined flowed through the chain as:
  sessions_spawn → timeout: undefined (since undefined != null is false)
  → gateway agent handler → agentCommand opts.timeout: undefined
  → resolveAgentTimeoutMs({ overrideSeconds: undefined })
  → DEFAULT_AGENT_TIMEOUT_SECONDS (600s = 10 minutes)

This caused subagents to be killed at exactly 10 minutes even though
the user's intent (via TOOLS.md) was for subagents to run without a
timeout.

Fix: default runTimeoutSeconds to 0 (no timeout) when neither
runTimeoutSeconds nor timeoutSeconds is provided by the caller.
Subagent spawns are long-running by design and should not inherit the
600s agent-command default timeout.

* fix(subagent): accept timeout=0 in agent-via-gateway path (second 600s default)

* fix: thread timeout override through getReplyFromConfig dispatch path

getReplyFromConfig called resolveAgentTimeoutMs({ cfg }) with no override,
always falling back to the config default (600s). Add timeoutOverrideSeconds
to GetReplyOptions and pass it through as overrideSeconds so callers of the
dispatch chain can specify a custom timeout (0 = no timeout).

This complements the existing timeout threading in agentCommand and the
cron isolated-agent runner, which already pass overrideSeconds correctly.

* feat(model-fallback): normalize OpenAI Codex model references and enhance fallback handling

- Added normalization for OpenAI Codex model references, specifically converting "gpt-5.3-codex" to "openai-codex" before execution.
- Updated the `resolveFallbackCandidates` function to utilize the new normalization logic.
- Enhanced tests to verify the correct behavior of model normalization and fallback mechanisms.
- Introduced a new test case to ensure that the normalization process works as expected for various input formats.

* feat(tests): add unit tests for steer failure behavior in openclaw-tools

- Introduced a new test file to validate the behavior of subagents when steer replacement dispatch fails.
- Implemented tests to ensure that the announce behavior is restored correctly and that the suppression reason is cleared as expected.
- Enhanced the subagent registry with a new function to clear steer restart suppression.
- Updated related components to support the new test scenarios.

* fix(subagents): replace stop command with kill in slash commands and documentation

- Updated the `/subagents` command to replace `stop` with `kill` for consistency in controlling sub-agent runs.
- Modified related documentation to reflect the change in command usage.
- Removed legacy timeoutSeconds references from the sessions-spawn-tool schema and tests to streamline timeout handling.
- Enhanced tests to ensure correct behavior of the updated commands and their interactions.

* feat(tests): add unit tests for readLatestAssistantReply function

- Introduced a new test file for the `readLatestAssistantReply` function to validate its behavior with various message scenarios.
- Implemented tests to ensure the function correctly retrieves the latest assistant message and handles cases where the latest message has no text.
- Mocked the gateway call to simulate different message histories for comprehensive testing.

* feat(tests): enhance subagent kill-all cascade tests and announce formatting

- Added a new test to verify that the `kill-all` command cascades through ended parents to active descendants in subagents.
- Updated the subagent announce formatting tests to reflect changes in message structure, including the replacement of "Findings:" with "Result:" and the addition of new expectations for message content.
- Improved the handling of long findings and stats in the announce formatting logic to ensure concise output.
- Refactored related functions to enhance clarity and maintainability in the subagent registry and tools.

* refactor(subagent): update announce formatting and remove unused constants

- Modified the subagent announce formatting to replace "Findings:" with "Result:" and adjusted related expectations in tests.
- Removed constants for maximum announce findings characters and summary words, simplifying the announcement logic.
- Updated the handling of findings to retain full content instead of truncating, ensuring more informative outputs.
- Cleaned up unused imports in the commands-subagents file to enhance code clarity.

* feat(tests): enhance billing error handling in user-facing text

- Added tests to ensure that normal text mentioning billing plans is not rewritten, preserving user context.
- Updated the `isBillingErrorMessage` and `sanitizeUserFacingText` functions to improve handling of billing-related messages.
- Introduced new test cases for various scenarios involving billing messages to ensure accurate processing and output.
- Enhanced the subagent announce flow to correctly manage active descendant runs, preventing premature announcements.

* feat(subagent): enhance workflow guidance and auto-announcement clarity

- Added a new guideline in the subagent system prompt to emphasize trust in push-based completion, discouraging busy polling for status updates.
- Updated documentation to clarify that sub-agents will automatically announce their results, improving user understanding of the workflow.
- Enhanced tests to verify the new guidance on avoiding polling loops and to ensure the accuracy of the updated prompts.

* fix(cron): avoid announcing interim subagent spawn acks

* chore: clean post-rebase imports

* fix(cron): fall back to child replies when parent stays interim

* fix(subagents): make active-run guidance advisory

* fix(subagents): update announce flow to handle active descendants and enhance test coverage

- Modified the announce flow to defer announcements when active descendant runs are present, ensuring accurate status reporting.
- Updated tests to verify the new behavior, including scenarios where no fallback requester is available and ensuring proper handling of finished subagents.
- Enhanced the announce formatting to include an `expectFinal` flag for better clarity in the announcement process.

* fix(subagents): enhance announce flow and formatting for user updates

- Updated the announce flow to provide clearer instructions for user updates based on active subagent runs and requester context.
- Refactored the announcement logic to improve clarity and ensure internal context remains private.
- Enhanced tests to verify the new message expectations and formatting, including updated prompts for user-facing updates.
- Introduced a new function to build reply instructions based on session context, improving the overall announcement process.

* fix: resolve prep blockers and changelog placement (#14447) (thanks @tyler6204)

* fix: restore cron delivery-plan import after rebase (#14447) (thanks @tyler6204)

* fix: resolve test failures from rebase conflicts (#14447) (thanks @tyler6204)

* fix: apply formatting after rebase (#14447) (thanks @tyler6204)
2026-02-14 22:03:45 -08:00
Bin Deng
c0cd3c3c08 fix: add safety timeout to session.compact() to prevent lane deadlock (#16533)
Merged via /review-pr -> /prepare-pr -> /merge-pr.

Prepared head SHA: 21e4045add
Co-authored-by: BinHPdev <219093083+BinHPdev@users.noreply.github.com>
Co-authored-by: gumadeiras <5599352+gumadeiras@users.noreply.github.com>
Reviewed-by: @gumadeiras
2026-02-14 17:54:12 -05:00
Nikolay Petrov
3b5a9c14dd Fix: Preserve Per-Agent Exec Override After Session Compaction (#15833)
Merged via /review-pr -> /prepare-pr -> /merge-pr.

Prepared head SHA: 9dfe5bdf23
Co-authored-by: napetrov <18015221+napetrov@users.noreply.github.com>
Co-authored-by: steipete <58493+steipete@users.noreply.github.com>
Reviewed-by: @steipete
2026-02-14 02:34:04 +01:00
Glucksberg
9bd2ccb017 feat: add pre-prompt context size diagnostic logging (openclaw#8930) thanks @Glucksberg
Verified:
- pnpm build
- pnpm check
- pnpm test

Co-authored-by: Glucksberg <80581902+Glucksberg@users.noreply.github.com>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-02-13 17:54:22 -06:00
solstead
ab71fdf821 Plugin API: compaction/reset hooks, bootstrap file globs, memory plugin status (#13287)
* feat: add before_compaction and before_reset plugin hooks with session context

- Pass session messages to before_compaction hook
- Add before_reset plugin hook for /new and /reset commands
- Add sessionId to plugin hook agent context

* feat: extraBootstrapFiles config with glob pattern support

Add extraBootstrapFiles to agent defaults config, allowing glob patterns
(e.g. "projects/*/TOOLS.md") to auto-load project-level bootstrap files
into agent context every turn. Missing files silently skipped.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(status): show custom memory plugins as enabled, not unavailable

The status command probes memory availability using the built-in
memory-core manager. Custom memory plugins (e.g. via plugin slot)
can't be probed this way, so they incorrectly showed "unavailable".
Now they show "enabled (plugin X)" without the misleading label.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use async fs.glob and capture pre-compaction messages

- Replace globSync (node:fs) with fs.glob (node:fs/promises) to match
  codebase conventions for async file operations
- Capture session.messages BEFORE replaceMessages(limited) so
  before_compaction hook receives the full conversation history,
  not the already-truncated list

* fix: resolve lint errors from CI (oxlint strict mode)

- Add void to fire-and-forget IIFE (no-floating-promises)
- Use String() for unknown catch params in template literals
- Add curly braces to single-statement if (curly rule)

* fix: resolve remaining CI lint errors in workspace.ts

- Remove `| string` from WorkspaceBootstrapFileName union (made all
  typeof members redundant per no-redundant-type-constituents)
- Use type assertion for extra bootstrap file names
- Drop redundant await on fs.glob() AsyncIterable (await-thenable)

* fix: address Greptile review — path traversal guard + fs/promises import

- workspace.ts: use path.resolve() + traversal check in loadExtraBootstrapFiles()
- commands-core.ts: import fs from node:fs/promises, drop fs.promises prefix

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: resolve symlinks before workspace boundary check

Greptile correctly identified that symlinks inside the workspace could
point to files outside it, bypassing the path prefix check. Now uses
fs.realpath() to resolve symlinks before verifying the real path stays
within the workspace boundary.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address Greptile review — hook reliability and type safety

1. before_compaction: add compactingCount field so plugins know both
   the full pre-compaction message count and the truncated count being
   fed to the compaction LLM. Clarify semantics in comment.

2. loadExtraBootstrapFiles: use path.basename() for the name field
   so "projects/quaid/TOOLS.md" maps to the known "TOOLS.md" type
   instead of an invalid WorkspaceBootstrapFileName cast.

3. before_reset: fire the hook even when no session file exists.
   Previously, short sessions without a persisted file would silently
   skip the hook. Now fires with empty messages array so plugins
   always know a reset occurred.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: validate bootstrap filenames and add compaction hook timeout

- Only load extra bootstrap files whose basename matches a recognized
  workspace filename (AGENTS.md, TOOLS.md, etc.), preventing arbitrary
  files from being injected into agent context.
- Wrap before_compaction hook in a 30-second Promise.race timeout so
  misbehaving plugins cannot stall the compaction pipeline.
- Clarify hook comments: before_compaction is intentionally awaited
  (plugins need messages before they're discarded) but bounded.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: make before_compaction non-blocking, add sessionFile to after_compaction

- before_compaction is now true fire-and-forget — no await, no timeout.
  Plugins that need full conversation data should persist it themselves
  and return quickly, or use after_compaction for async processing.
- after_compaction now includes sessionFile path so plugins can read
  the full JSONL transcript asynchronously. All pre-compaction messages
  are preserved on disk, eliminating the need to block compaction.
- Removes Promise.race timeout pattern that didn't actually cancel
  slow hooks (just raced past them while they continued running).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add sessionFile to before_compaction for parallel processing

The session JSONL already has all messages on disk before compaction
starts. By providing sessionFile in before_compaction, plugins can
read and extract data in parallel with the compaction LLM call rather
than waiting for after_compaction. This is the optimal path for memory
plugins that need the full conversation history.

sessionFile is also kept on after_compaction for plugins that only
need to act after compaction completes (analytics, cleanup, etc.).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor: move bootstrap extras into bundled hook

---------

Co-authored-by: Solomon Steadman <solstead@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Clawdbot <clawdbot@alfie.local>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-02-14 00:45:45 +01:00
rodbland2021
d3b2135f86 fix(agents): wait for agent idle before flushing pending tool results (#13746)
* fix(agents): wait for agent idle before flushing pending tool results

When pi-agent-core's auto-retry mechanism handles overloaded/rate-limit
errors, it resolves waitForRetry() on assistant message receipt — before
tool execution completes in the retried agent loop. This causes the
attempt's finally block to call flushPendingToolResults() while tools
are still executing, inserting synthetic 'missing tool result' errors
and causing silent agent failures.

The fix adds a waitForIdle() call before the flush to ensure the agent's
retry loop (including tool execution) has fully completed.

Evidence from real session: tool call and synthetic error were only 53ms
apart — the tool never had a chance to execute before being flushed.

Root cause is in pi-agent-core's _resolveRetry() firing on message_end
instead of agent_end, but this workaround in OpenClaw prevents the
symptom without requiring an upstream fix.

Fixes #8643
Fixes #13351
Refs #6682, #12595

* test: add tests for tool result flush race condition

Validates that:
- Real tool results are not replaced by synthetic errors when they arrive in time
- Flush correctly inserts synthetic errors for genuinely orphaned tool calls
- Flush is a no-op after real tool results have already been received

Refs #8643, #13748

* fix(agents): add waitForIdle to all flushPendingToolResults call sites

The original fix only covered the main run finally block, but there are
two additional call sites that can trigger flushPendingToolResults while
tools are still executing:

1. The catch block in attempt.ts (session setup error handler)
2. The finally block in compact.ts (compaction teardown)

Both now await agent.waitForIdle() with a 30s timeout before flushing,
matching the pattern already applied to the main finally block.

Production testing on VPS with debug logging confirmed these additional
paths can fire during sub-agent runs, producing spurious synthetic
'missing tool result' errors.

* fix(agents): centralize idle-wait flush and clear timeout handle

---------

Co-authored-by: Renue Development <dev@renuebyscience.com>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-02-13 20:35:43 +01:00
0xRain
43818e1583 fix(agents): re-run tool_use pairing repair after history truncation (#13926)
Co-authored-by: 0xRaini <0xRaini@users.noreply.github.com>
2026-02-12 08:42:05 +09:00
Tak Hoffman
f0722498a4 Agents: include runtime shell (#1835)
* Agents: include runtime shell

* Agents: fix compact runtime build

* chore: fix CLAUDE.md formatting, security regex for secret

---------

Co-authored-by: Tak hoffman <takayukihoffman@gmail.com>
Co-authored-by: quotentiroler <max.nussbaumer@maxhealth.tech>
2026-02-07 09:32:31 -08:00
Gustavo Madeira Santana
392bbddf29 Security: owner-only tools + command auth hardening (#9202)
* Security: gate whatsapp_login by sender auth

* Security: treat undefined senderAuthorized as unauthorized (opt-in)

* fix: gate whatsapp_login to owner senders (#8768) (thanks @victormier)

* fix: add explicit owner allowlist for tools (#8768) (thanks @victormier)

* fix: normalize escaped newlines in send actions (#8768) (thanks @victormier)

---------

Co-authored-by: Victor Mier <victormier@gmail.com>
2026-02-04 19:49:36 -05:00
Vignesh Natarajan
5d3af3bc62 feat (memory): Implement new (opt-in) QMD memory backend 2026-02-02 23:45:05 -08:00
Justin
0da6de6624 Agent: repair malformed tool calls and session files 2026-02-02 23:56:27 +00:00
Peter Steinberger
d03eca8450 fix: harden plugin and hook install paths 2026-02-02 02:07:47 -08:00
Mario Zechner
cf1d3f7a7c fix: update pi packages to 0.51.0, remove bogus type augmentation
- Update @mariozechner/pi-agent-core, pi-ai, pi-coding-agent, pi-tui to 0.51.0
- Delete src/types/pi-coding-agent.d.ts (declared additionalExtensionPaths which SDK never supported)
- Fix ToolDefinition.execute signature (parameter order changed in 0.51.0)
- Remove dead additionalExtensionPaths from createAgentSession calls
2026-02-02 01:52:33 +01:00
Peter Steinberger
e58291e070 fix: align embedded runner with pi-coding-agent API 2026-02-01 15:51:46 -08:00
Peter Steinberger
3367b2aa27 fix: align embedded runner with session API changes 2026-02-01 15:06:55 -08:00
Peter Steinberger
bcde2fca5a fix: align embedded agent session setup 2026-02-01 22:23:16 +00:00
cpojer
f06dd8df06 chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
cpojer
5ceff756e1 chore: Enable "curly" rule to avoid single-statement if confusion/errors. 2026-01-31 16:19:20 +09:00
Mario Zechner
cbc405c9e3 Agents: update pi-coding-agent API usage 2026-01-31 07:35:52 +01:00
Peter Steinberger
d2a852b982 fix: align embedded session setup with sdk 2026-01-31 06:22:24 +00:00
Peter Steinberger
e9f0be06eb fix: repair docker build typing 2026-01-31 06:50:56 +01:00
Peter Steinberger
08ed62852a chore: update deps and pi model discovery 2026-01-31 06:45:57 +01:00