Commit Graph

39 Commits

Author SHA1 Message Date
sebslight
553d17f8af refactor(agents): use silent token constant in prompts 2026-02-16 08:20:24 -05:00
Marcus Widing
ade11ec892 fix(announce): use deterministic idempotency keys to prevent duplicate subagent announces (#17150)
Merged via /review-pr -> /prepare-pr -> /merge-pr.

Prepared head SHA: 54bba3cea1
Co-authored-by: widingmarcus-cyber <245375637+widingmarcus-cyber@users.noreply.github.com>
Co-authored-by: gumadeiras <5599352+gumadeiras@users.noreply.github.com>
Reviewed-by: @gumadeiras
2026-02-15 10:34:34 -05:00
Tyler Yust
b8f66c260d Agents: add nested subagent orchestration controls and reduce subagent token waste (#14447)
* Agents: add subagent orchestration controls

* Agents: add subagent orchestration controls (WIP uncommitted changes)

* feat(subagents): add depth-based spawn gating for sub-sub-agents

* feat(subagents): tool policy, registry, and announce chain for nested agents

* feat(subagents): system prompt, docs, changelog for nested sub-agents

* fix(subagents): prevent model fallback override, show model during active runs, and block context overflow fallback

Bug 1: When a session has an explicit model override (e.g., gpt/openai-codex),
the fallback candidate logic in resolveFallbackCandidates silently appended the
global primary model (opus) as a backstop. On reinjection/steer with a transient
error, the session could fall back to opus which has a smaller context window
and crash. Fix: when storedModelOverride is set, pass fallbacksOverride ?? []
instead of undefined, preventing the implicit primary backstop.

Bug 2: Active subagents showed 'model n/a' in /subagents list because
resolveModelDisplay only read entry.model/modelProvider (populated after run
completes). Fix: fall back to modelOverride/providerOverride fields which are
populated at spawn time via sessions.patch.

Bug 3: Context overflow errors (prompt too long, context_length_exceeded) could
theoretically escape runEmbeddedPiAgent and be treated as failover candidates
in runWithModelFallback, causing a switch to a model with a smaller context
window. Fix: in runWithModelFallback, detect context overflow errors via
isLikelyContextOverflowError and rethrow them immediately instead of trying the
next model candidate.

* fix(subagents): track spawn depth in session store and fix announce routing for nested agents

* Fix compaction status tracking and dedupe overflow compaction triggers

* fix(subagents): enforce depth block via session store and implement cascade kill

* fix: inject group chat context into system prompt

* fix(subagents): always write model to session store at spawn time

* Preserve spawnDepth when agent handler rewrites session entry

* fix(subagents): suppress announce on steer-restart

* fix(subagents): fallback spawned session model to runtime default

* fix(subagents): enforce spawn depth when caller key resolves by sessionId

* feat(subagents): implement active-first ordering for numeric targets and enhance task display

- Added a test to verify that subagents with numeric targets follow an active-first list ordering.
- Updated `resolveSubagentTarget` to sort subagent runs based on active status and recent activity.
- Enhanced task display in command responses to prevent truncation of long task descriptions.
- Introduced new utility functions for compacting task text and managing subagent run states.

* fix(subagents): show model for active runs via run record fallback

When the spawned model matches the agent's default model, the session
store's override fields are intentionally cleared (isDefault: true).
The model/modelProvider fields are only populated after the run
completes. This left active subagents showing 'model n/a'.

Fix: store the resolved model on SubagentRunRecord at registration
time, and use it as a fallback in both display paths (subagents tool
and /subagents command) when the session store entry has no model info.

Changes:
- SubagentRunRecord: add optional model field
- registerSubagentRun: accept and persist model param
- sessions-spawn-tool: pass resolvedModel to registerSubagentRun
- subagents-tool: pass run record model as fallback to resolveModelDisplay
- commands-subagents: pass run record model as fallback to resolveModelDisplay

* feat(chat): implement session key resolution and reset on sidebar navigation

- Added functions to resolve the main session key and reset chat state when switching sessions from the sidebar.
- Updated the `renderTab` function to handle session key changes when navigating to the chat tab.
- Introduced a test to verify that the session resets to "main" when opening chat from the sidebar navigation.

* fix: subagent timeout=0 passthrough and fallback prompt duplication

Bug 1: runTimeoutSeconds=0 now means 'no timeout' instead of applying 600s default
- sessions-spawn-tool: default to undefined (not 0) when neither timeout param
  is provided; use != null check so explicit 0 passes through to gateway
- agent.ts: accept 0 as valid timeout (resolveAgentTimeoutMs already handles
  0 → MAX_SAFE_TIMEOUT_MS)

Bug 2: model fallback no longer re-injects the original prompt as a duplicate
- agent.ts: track fallback attempt index; on retries use a short continuation
  message instead of the full original prompt since the session file already
  contains it from the first attempt
- Also skip re-sending images on fallback retries (already in session)

* feat(subagents): truncate long task descriptions in subagents command output

- Introduced a new utility function to format task previews, limiting their length to improve readability.
- Updated the command handler to use the new formatting function, ensuring task descriptions are truncated appropriately.
- Adjusted related tests to verify that long task descriptions are now truncated in the output.

* refactor(subagents): update subagent registry path resolution and improve command output formatting

- Replaced direct import of STATE_DIR with a utility function to resolve the state directory dynamically.
- Enhanced the formatting of command output for active and recent subagents, adding separators for better readability.
- Updated related tests to reflect changes in command output structure.

* fix(subagent): default sessions_spawn to no timeout when runTimeoutSeconds omitted

The previous fix (75a791106) correctly handled the case where
runTimeoutSeconds was explicitly set to 0 ("no timeout"). However,
when models omit the parameter entirely (which is common since the
schema marks it as optional), runTimeoutSeconds resolved to undefined.

undefined flowed through the chain as:
  sessions_spawn → timeout: undefined (since undefined != null is false)
  → gateway agent handler → agentCommand opts.timeout: undefined
  → resolveAgentTimeoutMs({ overrideSeconds: undefined })
  → DEFAULT_AGENT_TIMEOUT_SECONDS (600s = 10 minutes)

This caused subagents to be killed at exactly 10 minutes even though
the user's intent (via TOOLS.md) was for subagents to run without a
timeout.

Fix: default runTimeoutSeconds to 0 (no timeout) when neither
runTimeoutSeconds nor timeoutSeconds is provided by the caller.
Subagent spawns are long-running by design and should not inherit the
600s agent-command default timeout.

* fix(subagent): accept timeout=0 in agent-via-gateway path (second 600s default)

* fix: thread timeout override through getReplyFromConfig dispatch path

getReplyFromConfig called resolveAgentTimeoutMs({ cfg }) with no override,
always falling back to the config default (600s). Add timeoutOverrideSeconds
to GetReplyOptions and pass it through as overrideSeconds so callers of the
dispatch chain can specify a custom timeout (0 = no timeout).

This complements the existing timeout threading in agentCommand and the
cron isolated-agent runner, which already pass overrideSeconds correctly.

* feat(model-fallback): normalize OpenAI Codex model references and enhance fallback handling

- Added normalization for OpenAI Codex model references, specifically converting "gpt-5.3-codex" to "openai-codex" before execution.
- Updated the `resolveFallbackCandidates` function to utilize the new normalization logic.
- Enhanced tests to verify the correct behavior of model normalization and fallback mechanisms.
- Introduced a new test case to ensure that the normalization process works as expected for various input formats.

* feat(tests): add unit tests for steer failure behavior in openclaw-tools

- Introduced a new test file to validate the behavior of subagents when steer replacement dispatch fails.
- Implemented tests to ensure that the announce behavior is restored correctly and that the suppression reason is cleared as expected.
- Enhanced the subagent registry with a new function to clear steer restart suppression.
- Updated related components to support the new test scenarios.

* fix(subagents): replace stop command with kill in slash commands and documentation

- Updated the `/subagents` command to replace `stop` with `kill` for consistency in controlling sub-agent runs.
- Modified related documentation to reflect the change in command usage.
- Removed legacy timeoutSeconds references from the sessions-spawn-tool schema and tests to streamline timeout handling.
- Enhanced tests to ensure correct behavior of the updated commands and their interactions.

* feat(tests): add unit tests for readLatestAssistantReply function

- Introduced a new test file for the `readLatestAssistantReply` function to validate its behavior with various message scenarios.
- Implemented tests to ensure the function correctly retrieves the latest assistant message and handles cases where the latest message has no text.
- Mocked the gateway call to simulate different message histories for comprehensive testing.

* feat(tests): enhance subagent kill-all cascade tests and announce formatting

- Added a new test to verify that the `kill-all` command cascades through ended parents to active descendants in subagents.
- Updated the subagent announce formatting tests to reflect changes in message structure, including the replacement of "Findings:" with "Result:" and the addition of new expectations for message content.
- Improved the handling of long findings and stats in the announce formatting logic to ensure concise output.
- Refactored related functions to enhance clarity and maintainability in the subagent registry and tools.

* refactor(subagent): update announce formatting and remove unused constants

- Modified the subagent announce formatting to replace "Findings:" with "Result:" and adjusted related expectations in tests.
- Removed constants for maximum announce findings characters and summary words, simplifying the announcement logic.
- Updated the handling of findings to retain full content instead of truncating, ensuring more informative outputs.
- Cleaned up unused imports in the commands-subagents file to enhance code clarity.

* feat(tests): enhance billing error handling in user-facing text

- Added tests to ensure that normal text mentioning billing plans is not rewritten, preserving user context.
- Updated the `isBillingErrorMessage` and `sanitizeUserFacingText` functions to improve handling of billing-related messages.
- Introduced new test cases for various scenarios involving billing messages to ensure accurate processing and output.
- Enhanced the subagent announce flow to correctly manage active descendant runs, preventing premature announcements.

* feat(subagent): enhance workflow guidance and auto-announcement clarity

- Added a new guideline in the subagent system prompt to emphasize trust in push-based completion, discouraging busy polling for status updates.
- Updated documentation to clarify that sub-agents will automatically announce their results, improving user understanding of the workflow.
- Enhanced tests to verify the new guidance on avoiding polling loops and to ensure the accuracy of the updated prompts.

* fix(cron): avoid announcing interim subagent spawn acks

* chore: clean post-rebase imports

* fix(cron): fall back to child replies when parent stays interim

* fix(subagents): make active-run guidance advisory

* fix(subagents): update announce flow to handle active descendants and enhance test coverage

- Modified the announce flow to defer announcements when active descendant runs are present, ensuring accurate status reporting.
- Updated tests to verify the new behavior, including scenarios where no fallback requester is available and ensuring proper handling of finished subagents.
- Enhanced the announce formatting to include an `expectFinal` flag for better clarity in the announcement process.

* fix(subagents): enhance announce flow and formatting for user updates

- Updated the announce flow to provide clearer instructions for user updates based on active subagent runs and requester context.
- Refactored the announcement logic to improve clarity and ensure internal context remains private.
- Enhanced tests to verify the new message expectations and formatting, including updated prompts for user-facing updates.
- Introduced a new function to build reply instructions based on session context, improving the overall announcement process.

* fix: resolve prep blockers and changelog placement (#14447) (thanks @tyler6204)

* fix: restore cron delivery-plan import after rebase (#14447) (thanks @tyler6204)

* fix: resolve test failures from rebase conflicts (#14447) (thanks @tyler6204)

* fix: apply formatting after rebase (#14447) (thanks @tyler6204)
2026-02-14 22:03:45 -08:00
Robby
cab0abf52a fix(sessions): resolve transcript paths with explicit agent context (#16288)
Merged via /review-pr -> /prepare-pr -> /merge-pr.

Prepared head SHA: 7cbe9deca9
Co-authored-by: robbyczgw-cla <239660374+robbyczgw-cla@users.noreply.github.com>
Co-authored-by: gumadeiras <5599352+gumadeiras@users.noreply.github.com>
Reviewed-by: @gumadeiras
2026-02-14 13:44:51 -05:00
Peter Steinberger
6daa4911e7 perf(subagents): speed announce retry polling and trim duplicate e2e coverage 2026-02-14 00:28:20 +00:00
Peter Steinberger
4199f9889f fix: harden session transcript path resolution 2026-02-13 01:28:17 +01:00
max
a1123dd9be Centralize date/time formatting utilities (#11831) 2026-02-08 04:53:31 -08:00
Tyler Yust
191da1feb5 fix: context overflow compaction and subagent announce improvements (#11664) (thanks @tyler6204)
* initial commit

* feat: implement deriveSessionTotalTokens function and update usage tests

* Added deriveSessionTotalTokens function to calculate total tokens based on usage and context tokens.
* Updated usage tests to include cases for derived session total tokens.
* Refactored session usage calculations in multiple files to utilize the new function for improved accuracy.

* fix: restore overflow truncation fallback + changelog/test hardening (#11551) (thanks @tyler6204)
2026-02-07 20:02:32 -08:00
Tyler Yust
8fae55e8e0 fix(cron): share isolated announce flow + harden cron scheduling/delivery (#11641)
* fix(cron): comprehensive cron scheduling and delivery fixes

- Fix delivery target resolution for isolated agent cron jobs
- Improve schedule parsing and validation
- Add job retry logic and error handling
- Enhance cron ops with better state management
- Add timer improvements for more reliable cron execution
- Add cron event type to protocol schema
- Support cron events in heartbeat runner (skip empty-heartbeat check,
  use dedicated CRON_EVENT_PROMPT for relay)

* fix: remove cron debug test and add changelog/docs notes (#11641) (thanks @tyler6204)
2026-02-07 19:46:01 -08:00
Tyler Yust
3f82daefd8 feat(cron): enhance delivery modes and job configuration
- Updated isolated cron jobs to support new delivery modes: `announce` and `none`, improving output management.
- Refactored job configuration to remove legacy fields and streamline delivery settings.
- Enhanced the `CronJobEditor` UI to reflect changes in delivery options, including a new segmented control for delivery mode selection.
- Updated documentation to clarify the new delivery configurations and their implications for job execution.
- Improved tests to validate the new delivery behavior and ensure backward compatibility with legacy settings.

This update provides users with greater flexibility in managing how isolated jobs deliver their outputs, enhancing overall usability and clarity in job configurations.
2026-02-04 01:03:59 -08:00
cpojer
f06dd8df06 chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
Peter Steinberger
a42e1c82d9 fix: restore tsc build and plugin install tests 2026-01-31 07:54:15 +00:00
cpojer
9e908ad6be chore: Fix TypeScript errors 4/n. 2026-01-31 16:48:44 +09:00
cpojer
5ceff756e1 chore: Enable "curly" rule to avoid single-statement if confusion/errors. 2026-01-31 16:19:20 +09:00
cpojer
15792b153f chore: Enable more lint rules, disable some that trigger a lot. Will clean up later. 2026-01-31 16:04:04 +09:00
Tyler Yust
57248a7ca1 fix: prefer requesterOrigin over stale session entry in subagent announce routing (#4957)
* fix: prefer requesterOrigin over stale session entry in subagent announce routing

When a subagent finishes and announces results back, resolveAnnounceOrigin
merged the session entry (primary) with requesterOrigin (fallback). If the
session store had a stale lastChannel (e.g. whatsapp) from a previous
interaction but the user was now on a different channel (e.g. bluebubbles),
the announce would route to the wrong channel.

Swap the merge order so requesterOrigin (captured at spawn time, reflecting
the actual current channel) takes priority, with the session entry as
fallback for any missing fields.

Error before fix:
  Delivery failed (whatsapp to bluebubbles:chat_guid:...): Unknown channel: whatsapp

Adds regression test for the stale-channel scenario.

* fix: match test to exact failure scenario and improve reliability (#4957) (thanks @tyler6204)

- Remove lastTo from stale session store to match the exact mismatch scenario described in the PR
- Replace 5ms setTimeout sleeps with expect.poll for better test reliability
- Prevents flakiness on slower CI machines
2026-01-30 15:52:19 -08:00
Peter Steinberger
02ca148583 fix: preserve subagent thread routing (#1241)
Thanks @gnarco.

Co-authored-by: gnarco <gnarco@users.noreply.github.com>
2026-01-20 17:22:07 +00:00
Peter Steinberger
37a2eee837 refactor: drop legacy session store keys 2026-01-17 06:48:44 +00:00
Peter Steinberger
e59d8c5436 style: oxfmt format 2026-01-17 05:48:56 +00:00
Peter Steinberger
4d314db750 refactor: extract subagent announce queue
Co-authored-by: adam91holt <adam91holt@users.noreply.github.com>
2026-01-17 05:29:07 +00:00
Peter Steinberger
19ee6699d2 refactor: clarify subagent announce origin
Co-authored-by: adam91holt <adam91holt@users.noreply.github.com>
2026-01-17 04:33:24 +00:00
Peter Steinberger
4f37f66264 refactor: normalize delivery context
Co-authored-by: adam91holt <adam91holt@users.noreply.github.com>
2026-01-17 04:24:59 +00:00
Peter Steinberger
9f4b7a1683 fix: normalize subagent announce delivery origin
Co-authored-by: Adam Holt <mail@adamholt.co.nz>
2026-01-17 04:03:28 +00:00
Peter Steinberger
a82217a5f3 chore: format + regenerate protocol 2026-01-17 03:40:49 +00:00
Peter Steinberger
bc7d603867 test: expand accountId delivery coverage
Co-authored-by: Adam Holt <adam91holt@users.noreply.github.com>
2026-01-17 03:17:33 +00:00
Peter Steinberger
0291105913 fix: thread accountId through subagent announce delivery
Co-authored-by: Adam Holt <adam91holt@users.noreply.github.com>
2026-01-17 02:45:18 +00:00
Peter Steinberger
d5332ae29a fix: thread accountId through subagent announce
Co-authored-by: Adam Holt <adam91holt@users.noreply.github.com>
2026-01-17 02:09:35 +00:00
Peter Steinberger
3fb699a84b style: apply oxfmt 2026-01-17 01:55:42 +00:00
Peter Steinberger
19016f16e0 fix: queue subagent announce delivery 2026-01-17 01:44:13 +00:00
Peter Steinberger
a4b347b454 feat: refine subagents + add chat.inject
Co-authored-by: Tyler Yust <tyler6204@users.noreply.github.com>
2026-01-15 23:44:31 +00:00
Roshan Singh
1baa55c145 Structured subagent announce output + include run outcome (#835)
* docs: clarify subagent announce status

* Make subagent announce structured and include run outcome

* fix: stabilize sub-agent announce status (#835) (thanks @roshanasingh4)

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-01-15 04:48:07 +00:00
Peter Steinberger
c379191f80 chore: migrate to oxlint and oxfmt
Co-authored-by: Christoph Nakazawa <christoph.pojer@gmail.com>
2026-01-14 15:02:19 +00:00
Peter Steinberger
b071f73fef fix: resume subagent registry safely (#831) (thanks @roshanasingh4) 2026-01-13 10:10:15 +00:00
Peter Steinberger
90342a4f3a refactor!: rename chat providers to channels 2026-01-13 08:40:39 +00:00
user
7d6f17d77f fix(subagent): make announce prompt more emphatic
The previous prompt was too permissive about skipping announcements.
Updated to strongly encourage announcing results since the requester
is waiting for a response.

- Add 'You MUST announce your result' instruction
- Clarify ANNOUNCE_SKIP is only for complete failures
- Improve guidance on providing useful summaries
2026-01-12 00:52:36 +00:00
Peter Steinberger
21eebb6d3b fix: limit subagent bootstrap context 2026-01-10 00:01:16 +00:00
Peter Steinberger
79f5ccc99d fix(gateway): harden agent provider routing 2026-01-09 23:00:36 +01:00
Azade
3133c7c84e feat(sessions): expose label in sessions.list and support label lookup in sessions_send
- Add `label` field to session entries and expose it in `sessions.list`
- Display label column in the web UI sessions table
- Support `label` parameter in `sessions_send` for lookup by label instead of sessionKey

- `sessions.patch`: Accept and store `label` field
- `sessions.list`: Return `label` in session entries
- `sessions_spawn`: Pass label through to registry and announce flow
- `sessions_send`: Accept optional `label` param, lookup session by label if sessionKey not provided
- `agent` method: Accept `label` and `spawnedBy` params (stored in session entry)

- Add `label` column to sessions table in web UI

- Changed session store writes to merge with existing entry (`{ ...existing, ...new }`)
  to preserve fields like `label` that might be set separately

We attempted to implement label persistence "properly" by passing the label
through the `agent` call and storing it during session initialization. However,
the auto-reply flow has multiple write points that overwrite the session entry,
and making all of them merge-aware proved unreliable.

The working solution patches the label in the `finally` block of
`runSubagentAnnounceFlow`, after all other session writes complete.
This is a workaround but robust - the patch happens at the very end,
just before potential cleanup.

A future refactor could make session writes consistently merge-based,
which would allow the cleaner approach of setting label at spawn time.

```typescript
// Spawn with label
sessions_spawn({ task: "...", label: "my-worker" })

// Later, find by label
sessions_send({ label: "my-worker", message: "continue..." })

// Or use sessions_list to see labels
sessions_list() // includes label field in response
```
2026-01-09 15:32:49 +01:00
Peter Steinberger
75c66acfd8 feat: update subagent announce + archive 2026-01-07 06:53:01 +01:00