Agents: add nested subagent orchestration controls and reduce subagent token waste (#14447)

* Agents: add subagent orchestration controls * Agents: add subagent orchestration controls (WIP uncommitted changes) * feat(subagents): add depth-based spawn gating for sub-sub-agents * feat(subagents): tool policy, registry, and announce chain for nested agents * feat(subagents): system prompt, docs, changelog for nested sub-agents * fix(subagents): prevent model fallback override, show model during active runs, and block context overflow fallback Bug 1: When a session has an explicit model override (e.g., gpt/openai-codex), the fallback candidate logic in resolveFallbackCandidates silently appended the global primary model (opus) as a backstop. On reinjection/steer with a transient error, the session could fall back to opus which has a smaller context window and crash. Fix: when storedModelOverride is set, pass fallbacksOverride ?? [] instead of undefined, preventing the implicit primary backstop. Bug 2: Active subagents showed 'model n/a' in /subagents list because resolveModelDisplay only read entry.model/modelProvider (populated after run completes). Fix: fall back to modelOverride/providerOverride fields which are populated at spawn time via sessions.patch. Bug 3: Context overflow errors (prompt too long, context_length_exceeded) could theoretically escape runEmbeddedPiAgent and be treated as failover candidates in runWithModelFallback, causing a switch to a model with a smaller context window. Fix: in runWithModelFallback, detect context overflow errors via isLikelyContextOverflowError and rethrow them immediately instead of trying the next model candidate. * fix(subagents): track spawn depth in session store and fix announce routing for nested agents * Fix compaction status tracking and dedupe overflow compaction triggers * fix(subagents): enforce depth block via session store and implement cascade kill * fix: inject group chat context into system prompt * fix(subagents): always write model to session store at spawn time * Preserve spawnDepth when agent handler rewrites session entry * fix(subagents): suppress announce on steer-restart * fix(subagents): fallback spawned session model to runtime default * fix(subagents): enforce spawn depth when caller key resolves by sessionId * feat(subagents): implement active-first ordering for numeric targets and enhance task display - Added a test to verify that subagents with numeric targets follow an active-first list ordering. - Updated `resolveSubagentTarget` to sort subagent runs based on active status and recent activity. - Enhanced task display in command responses to prevent truncation of long task descriptions. - Introduced new utility functions for compacting task text and managing subagent run states. * fix(subagents): show model for active runs via run record fallback When the spawned model matches the agent's default model, the session store's override fields are intentionally cleared (isDefault: true). The model/modelProvider fields are only populated after the run completes. This left active subagents showing 'model n/a'. Fix: store the resolved model on SubagentRunRecord at registration time, and use it as a fallback in both display paths (subagents tool and /subagents command) when the session store entry has no model info. Changes: - SubagentRunRecord: add optional model field - registerSubagentRun: accept and persist model param - sessions-spawn-tool: pass resolvedModel to registerSubagentRun - subagents-tool: pass run record model as fallback to resolveModelDisplay - commands-subagents: pass run record model as fallback to resolveModelDisplay * feat(chat): implement session key resolution and reset on sidebar navigation - Added functions to resolve the main session key and reset chat state when switching sessions from the sidebar. - Updated the `renderTab` function to handle session key changes when navigating to the chat tab. - Introduced a test to verify that the session resets to "main" when opening chat from the sidebar navigation. * fix: subagent timeout=0 passthrough and fallback prompt duplication Bug 1: runTimeoutSeconds=0 now means 'no timeout' instead of applying 600s default - sessions-spawn-tool: default to undefined (not 0) when neither timeout param is provided; use != null check so explicit 0 passes through to gateway - agent.ts: accept 0 as valid timeout (resolveAgentTimeoutMs already handles 0 → MAX_SAFE_TIMEOUT_MS) Bug 2: model fallback no longer re-injects the original prompt as a duplicate - agent.ts: track fallback attempt index; on retries use a short continuation message instead of the full original prompt since the session file already contains it from the first attempt - Also skip re-sending images on fallback retries (already in session) * feat(subagents): truncate long task descriptions in subagents command output - Introduced a new utility function to format task previews, limiting their length to improve readability. - Updated the command handler to use the new formatting function, ensuring task descriptions are truncated appropriately. - Adjusted related tests to verify that long task descriptions are now truncated in the output. * refactor(subagents): update subagent registry path resolution and improve command output formatting - Replaced direct import of STATE_DIR with a utility function to resolve the state directory dynamically. - Enhanced the formatting of command output for active and recent subagents, adding separators for better readability. - Updated related tests to reflect changes in command output structure. * fix(subagent): default sessions_spawn to no timeout when runTimeoutSeconds omitted The previous fix (75a791106) correctly handled the case where runTimeoutSeconds was explicitly set to 0 ("no timeout"). However, when models omit the parameter entirely (which is common since the schema marks it as optional), runTimeoutSeconds resolved to undefined. undefined flowed through the chain as: sessions_spawn → timeout: undefined (since undefined != null is false) → gateway agent handler → agentCommand opts.timeout: undefined → resolveAgentTimeoutMs({ overrideSeconds: undefined }) → DEFAULT_AGENT_TIMEOUT_SECONDS (600s = 10 minutes) This caused subagents to be killed at exactly 10 minutes even though the user's intent (via TOOLS.md) was for subagents to run without a timeout. Fix: default runTimeoutSeconds to 0 (no timeout) when neither runTimeoutSeconds nor timeoutSeconds is provided by the caller. Subagent spawns are long-running by design and should not inherit the 600s agent-command default timeout. * fix(subagent): accept timeout=0 in agent-via-gateway path (second 600s default) * fix: thread timeout override through getReplyFromConfig dispatch path getReplyFromConfig called resolveAgentTimeoutMs({ cfg }) with no override, always falling back to the config default (600s). Add timeoutOverrideSeconds to GetReplyOptions and pass it through as overrideSeconds so callers of the dispatch chain can specify a custom timeout (0 = no timeout). This complements the existing timeout threading in agentCommand and the cron isolated-agent runner, which already pass overrideSeconds correctly. * feat(model-fallback): normalize OpenAI Codex model references and enhance fallback handling - Added normalization for OpenAI Codex model references, specifically converting "gpt-5.3-codex" to "openai-codex" before execution. - Updated the `resolveFallbackCandidates` function to utilize the new normalization logic. - Enhanced tests to verify the correct behavior of model normalization and fallback mechanisms. - Introduced a new test case to ensure that the normalization process works as expected for various input formats. * feat(tests): add unit tests for steer failure behavior in openclaw-tools - Introduced a new test file to validate the behavior of subagents when steer replacement dispatch fails. - Implemented tests to ensure that the announce behavior is restored correctly and that the suppression reason is cleared as expected. - Enhanced the subagent registry with a new function to clear steer restart suppression. - Updated related components to support the new test scenarios. * fix(subagents): replace stop command with kill in slash commands and documentation - Updated the `/subagents` command to replace `stop` with `kill` for consistency in controlling sub-agent runs. - Modified related documentation to reflect the change in command usage. - Removed legacy timeoutSeconds references from the sessions-spawn-tool schema and tests to streamline timeout handling. - Enhanced tests to ensure correct behavior of the updated commands and their interactions. * feat(tests): add unit tests for readLatestAssistantReply function - Introduced a new test file for the `readLatestAssistantReply` function to validate its behavior with various message scenarios. - Implemented tests to ensure the function correctly retrieves the latest assistant message and handles cases where the latest message has no text. - Mocked the gateway call to simulate different message histories for comprehensive testing. * feat(tests): enhance subagent kill-all cascade tests and announce formatting - Added a new test to verify that the `kill-all` command cascades through ended parents to active descendants in subagents. - Updated the subagent announce formatting tests to reflect changes in message structure, including the replacement of "Findings:" with "Result:" and the addition of new expectations for message content. - Improved the handling of long findings and stats in the announce formatting logic to ensure concise output. - Refactored related functions to enhance clarity and maintainability in the subagent registry and tools. * refactor(subagent): update announce formatting and remove unused constants - Modified the subagent announce formatting to replace "Findings:" with "Result:" and adjusted related expectations in tests. - Removed constants for maximum announce findings characters and summary words, simplifying the announcement logic. - Updated the handling of findings to retain full content instead of truncating, ensuring more informative outputs. - Cleaned up unused imports in the commands-subagents file to enhance code clarity. * feat(tests): enhance billing error handling in user-facing text - Added tests to ensure that normal text mentioning billing plans is not rewritten, preserving user context. - Updated the `isBillingErrorMessage` and `sanitizeUserFacingText` functions to improve handling of billing-related messages. - Introduced new test cases for various scenarios involving billing messages to ensure accurate processing and output. - Enhanced the subagent announce flow to correctly manage active descendant runs, preventing premature announcements. * feat(subagent): enhance workflow guidance and auto-announcement clarity - Added a new guideline in the subagent system prompt to emphasize trust in push-based completion, discouraging busy polling for status updates. - Updated documentation to clarify that sub-agents will automatically announce their results, improving user understanding of the workflow. - Enhanced tests to verify the new guidance on avoiding polling loops and to ensure the accuracy of the updated prompts. * fix(cron): avoid announcing interim subagent spawn acks * chore: clean post-rebase imports * fix(cron): fall back to child replies when parent stays interim * fix(subagents): make active-run guidance advisory * fix(subagents): update announce flow to handle active descendants and enhance test coverage - Modified the announce flow to defer announcements when active descendant runs are present, ensuring accurate status reporting. - Updated tests to verify the new behavior, including scenarios where no fallback requester is available and ensuring proper handling of finished subagents. - Enhanced the announce formatting to include an `expectFinal` flag for better clarity in the announcement process. * fix(subagents): enhance announce flow and formatting for user updates - Updated the announce flow to provide clearer instructions for user updates based on active subagent runs and requester context. - Refactored the announcement logic to improve clarity and ensure internal context remains private. - Enhanced tests to verify the new message expectations and formatting, including updated prompts for user-facing updates. - Introduced a new function to build reply instructions based on session context, improving the overall announcement process. * fix: resolve prep blockers and changelog placement (#14447) (thanks @tyler6204) * fix: restore cron delivery-plan import after rebase (#14447) (thanks @tyler6204) * fix: resolve test failures from rebase conflicts (#14447) (thanks @tyler6204) * fix: apply formatting after rebase (#14447) (thanks @tyler6204)
2026-05-08 09:11:26 +00:00 · 2026-02-14 22:03:45 -08:00
parent c46f395bb9
commit b8f66c260d
86 changed files with 5700 additions and 821 deletions
--- a/src/agents/subagent-announce.ts
+++ b/src/agents/subagent-announce.ts
@@ -1,16 +1,13 @@
 import crypto from "node:crypto";
-import path from "node:path";
 import { resolveQueueSettings } from "../auto-reply/reply/queue.js";
 import { loadConfig } from "../config/config.js";
 import {
  loadSessionStore,
  resolveAgentIdFromSessionKey,
  resolveMainSessionKey,
-  resolveSessionFilePath,
  resolveStorePath,
 } from "../config/sessions.js";
 import { callGateway } from "../gateway/call.js";
-import { formatDurationCompact } from "../infra/format-time/format-duration.ts";
 import { normalizeMainKey } from "../routing/session-key.js";
 import { defaultRuntime } from "../runtime.js";
 import {
@@ -25,10 +22,28 @@ import {
  waitForEmbeddedPiRunEnd,
 } from "./pi-embedded.js";
 import { type AnnounceQueueItem, enqueueAnnounce } from "./subagent-announce-queue.js";
+import { getSubagentDepthFromSessionStore } from "./subagent-depth.js";
 import { readLatestAssistantReply } from "./tools/agent-step.js";

+function formatDurationShort(valueMs?: number) {
+  if (!valueMs || !Number.isFinite(valueMs) || valueMs <= 0) {
+    return "n/a";
+  }
+  const totalSeconds = Math.round(valueMs / 1000);
+  const hours = Math.floor(totalSeconds / 3600);
+  const minutes = Math.floor((totalSeconds % 3600) / 60);
+  const seconds = totalSeconds % 60;
+  if (hours > 0) {
+    return `${hours}h${minutes}m`;
+  }
+  if (minutes > 0) {
+    return `${minutes}m${seconds}s`;
+  }
+  return `${seconds}s`;
+}
+
 function formatTokenCount(value?: number) {
-  if (!value || !Number.isFinite(value)) {
+  if (typeof value !== "number" || !Number.isFinite(value) || value <= 0) {
    return "0";
  }
  if (value >= 1_000_000) {
@@ -40,65 +55,44 @@ function formatTokenCount(value?: number) {
  return String(Math.round(value));
 }

-function formatUsd(value?: number) {
-  if (value === undefined || !Number.isFinite(value)) {
-    return undefined;
-  }
-  if (value >= 1) {
-    return `$${value.toFixed(2)}`;
-  }
-  if (value >= 0.01) {
-    return `$${value.toFixed(2)}`;
-  }
-  return `$${value.toFixed(4)}`;
-}
-
-function resolveModelCost(params: {
-  provider?: string;
-  model?: string;
-  config: ReturnType<typeof loadConfig>;
-}):
-  | {
-      input: number;
-      output: number;
-      cacheRead: number;
-      cacheWrite: number;
-    }
-  | undefined {
-  const provider = params.provider?.trim();
-  const model = params.model?.trim();
-  if (!provider || !model) {
-    return undefined;
-  }
-  const models = params.config.models?.providers?.[provider]?.models ?? [];
-  const entry = models.find((candidate) => candidate.id === model);
-  return entry?.cost;
-}
-
-async function waitForSessionUsage(params: { sessionKey: string }) {
+async function buildCompactAnnounceStatsLine(params: {
+  sessionKey: string;
+  startedAt?: number;
+  endedAt?: number;
+}) {
  const cfg = loadConfig();
  const agentId = resolveAgentIdFromSessionKey(params.sessionKey);
  const storePath = resolveStorePath(cfg.session?.store, { agentId });
  let entry = loadSessionStore(storePath)[params.sessionKey];
-  if (!entry) {
-    return { entry, storePath };
-  }
-  const hasTokens = () =>
-    entry &&
-    (typeof entry.totalTokens === "number" ||
-      typeof entry.inputTokens === "number" ||
-      typeof entry.outputTokens === "number");
-  if (hasTokens()) {
-    return { entry, storePath };
-  }
-  for (let attempt = 0; attempt < 4; attempt += 1) {
-    await new Promise((resolve) => setTimeout(resolve, 200));
-    entry = loadSessionStore(storePath)[params.sessionKey];
-    if (hasTokens()) {
+  for (let attempt = 0; attempt < 3; attempt += 1) {
+    const hasTokenData =
+      typeof entry?.inputTokens === "number" ||
+      typeof entry?.outputTokens === "number" ||
+      typeof entry?.totalTokens === "number";
+    if (hasTokenData) {
      break;
    }
+    await new Promise((resolve) => setTimeout(resolve, 150));
+    entry = loadSessionStore(storePath)[params.sessionKey];
  }
-  return { entry, storePath };
+
+  const input = typeof entry?.inputTokens === "number" ? entry.inputTokens : 0;
+  const output = typeof entry?.outputTokens === "number" ? entry.outputTokens : 0;
+  const ioTotal = input + output;
+  const promptCache = typeof entry?.totalTokens === "number" ? entry.totalTokens : undefined;
+  const runtimeMs =
+    typeof params.startedAt === "number" && typeof params.endedAt === "number"
+      ? Math.max(0, params.endedAt - params.startedAt)
+      : undefined;
+
+  const parts = [
+    `runtime ${formatDurationShort(runtimeMs)}`,
+    `tokens ${formatTokenCount(ioTotal)} (in ${formatTokenCount(input)} / out ${formatTokenCount(output)})`,
+  ];
+  if (typeof promptCache === "number" && promptCache > ioTotal) {
+    parts.push(`prompt/cache ${formatTokenCount(promptCache)}`);
+  }
+  return `Stats: ${parts.join(" • ")}`;
 }

 type DeliveryContextSource = Parameters<typeof deliveryContextFromSession>[0];
@@ -114,6 +108,8 @@ function resolveAnnounceOrigin(
 }

 async function sendAnnounce(item: AnnounceQueueItem) {
+  const requesterDepth = getSubagentDepthFromSessionStore(item.sessionKey);
+  const requesterIsSubagent = requesterDepth >= 1;
  const origin = item.origin;
  const threadId =
    origin?.threadId != null && origin.threadId !== "" ? String(origin.threadId) : undefined;
@@ -122,15 +118,14 @@ async function sendAnnounce(item: AnnounceQueueItem) {
    params: {
      sessionKey: item.sessionKey,
      message: item.prompt,
-      channel: origin?.channel,
-      accountId: origin?.accountId,
-      to: origin?.to,
-      threadId,
-      deliver: true,
+      channel: requesterIsSubagent ? undefined : origin?.channel,
+      accountId: requesterIsSubagent ? undefined : origin?.accountId,
+      to: requesterIsSubagent ? undefined : origin?.to,
+      threadId: requesterIsSubagent ? undefined : threadId,
+      deliver: !requesterIsSubagent,
      idempotencyKey: crypto.randomUUID(),
    },
-    expectFinal: true,
-    timeoutMs: 60_000,
+    timeoutMs: 15_000,
  });
 }

@@ -219,74 +214,6 @@ async function maybeQueueSubagentAnnounce(params: {
  return "none";
 }

-async function buildSubagentStatsLine(params: {
-  sessionKey: string;
-  startedAt?: number;
-  endedAt?: number;
-}) {
-  const cfg = loadConfig();
-  const { entry, storePath } = await waitForSessionUsage({
-    sessionKey: params.sessionKey,
-  });
-
-  const sessionId = entry?.sessionId;
-  const agentId = resolveAgentIdFromSessionKey(params.sessionKey);
-  let transcriptPath: string | undefined;
-  if (sessionId && storePath) {
-    try {
-      transcriptPath = resolveSessionFilePath(sessionId, entry, {
-        agentId,
-        sessionsDir: path.dirname(storePath),
-      });
-    } catch {
-      transcriptPath = undefined;
-    }
-  }
-
-  const input = entry?.inputTokens;
-  const output = entry?.outputTokens;
-  const total =
-    entry?.totalTokens ??
-    (typeof input === "number" && typeof output === "number" ? input + output : undefined);
-  const runtimeMs =
-    typeof params.startedAt === "number" && typeof params.endedAt === "number"
-      ? Math.max(0, params.endedAt - params.startedAt)
-      : undefined;
-
-  const provider = entry?.modelProvider;
-  const model = entry?.model;
-  const costConfig = resolveModelCost({ provider, model, config: cfg });
-  const cost =
-    costConfig && typeof input === "number" && typeof output === "number"
-      ? (input * costConfig.input + output * costConfig.output) / 1_000_000
-      : undefined;
-
-  const parts: string[] = [];
-  const runtime = formatDurationCompact(runtimeMs);
-  parts.push(`runtime ${runtime ?? "n/a"}`);
-  if (typeof total === "number") {
-    const inputText = typeof input === "number" ? formatTokenCount(input) : "n/a";
-    const outputText = typeof output === "number" ? formatTokenCount(output) : "n/a";
-    const totalText = formatTokenCount(total);
-    parts.push(`tokens ${totalText} (in ${inputText} / out ${outputText})`);
-  } else {
-    parts.push("tokens n/a");
-  }
-  const costText = formatUsd(cost);
-  if (costText) {
-    parts.push(`est ${costText}`);
-  }
-  parts.push(`sessionKey ${params.sessionKey}`);
-  if (sessionId) {
-    parts.push(`sessionId ${sessionId}`);
-  }
-  if (transcriptPath) {
-    parts.push(`transcript ${transcriptPath}`);
-  }
-
-  return `Stats: ${parts.join(" \u2022 ")}`;
-}
-
 function loadSessionEntryByKey(sessionKey: string) {
  const cfg = loadConfig();
  const agentId = resolveAgentIdFromSessionKey(sessionKey);
@@ -323,49 +250,85 @@ export function buildSubagentSystemPrompt(params: {
  childSessionKey: string;
  label?: string;
  task?: string;
+  /** Depth of the child being spawned (1 = sub-agent, 2 = sub-sub-agent). */
+  childDepth?: number;
+  /** Config value: max allowed spawn depth. */
+  maxSpawnDepth?: number;
 }) {
  const taskText =
    typeof params.task === "string" && params.task.trim()
      ? params.task.replace(/\s+/g, " ").trim()
      : "{{TASK_DESCRIPTION}}";
+  const childDepth = typeof params.childDepth === "number" ? params.childDepth : 1;
+  const maxSpawnDepth = typeof params.maxSpawnDepth === "number" ? params.maxSpawnDepth : 1;
+  const canSpawn = childDepth < maxSpawnDepth;
+  const parentLabel = childDepth >= 2 ? "parent orchestrator" : "main agent";
+
  const lines = [
    "# Subagent Context",
    "",
-    "You are a **subagent** spawned by the main agent for a specific task.",
+    `You are a **subagent** spawned by the ${parentLabel} for a specific task.`,
    "",
    "## Your Role",
    `- You were created to handle: ${taskText}`,
    "- Complete this task. That's your entire purpose.",
-    "- You are NOT the main agent. Don't try to be.",
+    `- You are NOT the ${parentLabel}. Don't try to be.`,
    "",
    "## Rules",
    "1. **Stay focused** - Do your assigned task, nothing else",
-    "2. **Complete the task** - Your final message will be automatically reported to the main agent",
+    `2. **Complete the task** - Your final message will be automatically reported to the ${parentLabel}`,
    "3. **Don't initiate** - No heartbeats, no proactive actions, no side quests",
    "4. **Be ephemeral** - You may be terminated after task completion. That's fine.",
+    "5. **Trust push-based completion** - Descendant results are auto-announced back to you; do not busy-poll for status.",
    "",
    "## Output Format",
    "When complete, your final response should include:",
-    "- What you accomplished or found",
-    "- Any relevant details the main agent should know",
+    `- What you accomplished or found`,
+    `- Any relevant details the ${parentLabel} should know`,
    "- Keep it concise but informative",
    "",
    "## What You DON'T Do",
-    "- NO user conversations (that's main agent's job)",
+    `- NO user conversations (that's ${parentLabel}'s job)`,
    "- NO external messages (email, tweets, etc.) unless explicitly tasked with a specific recipient/channel",
    "- NO cron jobs or persistent state",
-    "- NO pretending to be the main agent",
-    "- Only use the `message` tool when explicitly instructed to contact a specific external recipient; otherwise return plain text and let the main agent deliver it",
+    `- NO pretending to be the ${parentLabel}`,
+    `- Only use the \`message\` tool when explicitly instructed to contact a specific external recipient; otherwise return plain text and let the ${parentLabel} deliver it`,
    "",
+  ];
+
+  if (canSpawn) {
+    lines.push(
+      "## Sub-Agent Spawning",
+      "You CAN spawn your own sub-agents for parallel or complex work using `sessions_spawn`.",
+      "Use the `subagents` tool to steer, kill, or do an on-demand status check for your spawned sub-agents.",
+      "Your sub-agents will announce their results back to you automatically (not to the main agent).",
+      "Default workflow: spawn work, continue orchestrating, and wait for auto-announced completions.",
+      "Do NOT repeatedly poll `subagents list` in a loop unless you are actively debugging or intervening.",
+      "Coordinate their work and synthesize results before reporting back.",
+      "",
+    );
+  } else if (childDepth >= 2) {
+    lines.push(
+      "## Sub-Agent Spawning",
+      "You are a leaf worker and CANNOT spawn further sub-agents. Focus on your assigned task.",
+      "",
+    );
+  }
+
+  lines.push(
    "## Session Context",
-    params.label ? `- Label: ${params.label}` : undefined,
-    params.requesterSessionKey ? `- Requester session: ${params.requesterSessionKey}.` : undefined,
-    params.requesterOrigin?.channel
-      ? `- Requester channel: ${params.requesterOrigin.channel}.`
-      : undefined,
-    `- Your session: ${params.childSessionKey}.`,
+    ...[
+      params.label ? `- Label: ${params.label}` : undefined,
+      params.requesterSessionKey
+        ? `- Requester session: ${params.requesterSessionKey}.`
+        : undefined,
+      params.requesterOrigin?.channel
+        ? `- Requester channel: ${params.requesterOrigin.channel}.`
+        : undefined,
+      `- Your session: ${params.childSessionKey}.`,
+    ].filter((line): line is string => line !== undefined),
    "",
-  ].filter((line): line is string => line !== undefined);
+  );
  return lines.join("\n");
 }

@@ -376,6 +339,21 @@ export type SubagentRunOutcome = {

 export type SubagentAnnounceType = "subagent task" | "cron job";

+function buildAnnounceReplyInstruction(params: {
+  remainingActiveSubagentRuns: number;
+  requesterIsSubagent: boolean;
+  announceType: SubagentAnnounceType;
+}): string {
+  if (params.remainingActiveSubagentRuns > 0) {
+    const activeRunsLabel = params.remainingActiveSubagentRuns === 1 ? "run" : "runs";
+    return `There are still ${params.remainingActiveSubagentRuns} active subagent ${activeRunsLabel} for this session. If they are part of the same workflow, wait for the remaining results before sending a user update. If they are unrelated, respond normally using only the result above.`;
+  }
+  if (params.requesterIsSubagent) {
+    return "Convert this completion into a concise internal orchestration update for your parent agent in your own words. Keep this internal context private (don't mention system/log/stats/session details or announce type). If this result is duplicate or no update is needed, reply ONLY: NO_REPLY.";
+  }
+  return `A completed ${params.announceType} is ready for user delivery. Convert the result above into your normal assistant voice and send that user-facing update now. Keep this internal context private (don't mention system/log/stats/session details or announce type), and do not copy the system message verbatim. Reply ONLY: NO_REPLY if this exact result was already delivered to the user in this same turn.`;
+}
+
 export async function runSubagentAnnounceFlow(params: {
  childSessionKey: string;
  childRunId: string;
@@ -396,7 +374,8 @@ export async function runSubagentAnnounceFlow(params: {
  let didAnnounce = false;
  let shouldDeleteChildSession = params.cleanup === "delete";
  try {
-    const requesterOrigin = normalizeDeliveryContext(params.requesterOrigin);
+    let targetRequesterSessionKey = params.requesterSessionKey;
+    let targetRequesterOrigin = normalizeDeliveryContext(params.requesterOrigin);
    const childSessionId = (() => {
      const entry = loadSessionEntryByKey(params.childSessionKey);
      return typeof entry?.sessionId === "string" && entry.sessionId.trim()
@@ -478,12 +457,19 @@ export async function runSubagentAnnounceFlow(params: {
      outcome = { status: "unknown" };
    }

-    // Build stats
-    const statsLine = await buildSubagentStatsLine({
-      sessionKey: params.childSessionKey,
-      startedAt: params.startedAt,
-      endedAt: params.endedAt,
-    });
+    let activeChildDescendantRuns = 0;
+    try {
+      const { countActiveDescendantRuns } = await import("./subagent-registry.js");
+      activeChildDescendantRuns = Math.max(0, countActiveDescendantRuns(params.childSessionKey));
+    } catch {
+      // Best-effort only; fall back to direct announce behavior when unavailable.
+    }
+    if (activeChildDescendantRuns > 0) {
+      // The finished run still has active descendant subagents. Defer announcing
+      // this run until descendants settle so we avoid posting in-progress updates.
+      shouldDeleteChildSession = false;
+      return false;
+    }

    // Build status label
    const statusLabel =
@@ -498,24 +484,70 @@ export async function runSubagentAnnounceFlow(params: {
    // Build instructional message for main agent
    const announceType = params.announceType ?? "subagent task";
    const taskLabel = params.label || params.task || "task";
-    const triggerMessage = [
-      `A ${announceType} "${taskLabel}" just ${statusLabel}.`,
+    const announceSessionId = childSessionId || "unknown";
+    const findings = reply || "(no output)";
+    let triggerMessage = "";
+
+    let requesterDepth = getSubagentDepthFromSessionStore(targetRequesterSessionKey);
+    let requesterIsSubagent = requesterDepth >= 1;
+    // If the requester subagent has already finished, bubble the announce to its
+    // requester (typically main) so descendant completion is not silently lost.
+    if (requesterIsSubagent) {
+      const { isSubagentSessionRunActive, resolveRequesterForChildSession } =
+        await import("./subagent-registry.js");
+      if (!isSubagentSessionRunActive(targetRequesterSessionKey)) {
+        const fallback = resolveRequesterForChildSession(targetRequesterSessionKey);
+        if (!fallback?.requesterSessionKey) {
+          // Without a requester fallback we cannot safely deliver this nested
+          // completion. Keep cleanup retryable so a later registry restore can
+          // recover and re-announce instead of silently dropping the result.
+          shouldDeleteChildSession = false;
+          return false;
+        }
+        targetRequesterSessionKey = fallback.requesterSessionKey;
+        targetRequesterOrigin =
+          normalizeDeliveryContext(fallback.requesterOrigin) ?? targetRequesterOrigin;
+        requesterDepth = getSubagentDepthFromSessionStore(targetRequesterSessionKey);
+        requesterIsSubagent = requesterDepth >= 1;
+      }
+    }
+
+    let remainingActiveSubagentRuns = 0;
+    try {
+      const { countActiveDescendantRuns } = await import("./subagent-registry.js");
+      remainingActiveSubagentRuns = Math.max(
+        0,
+        countActiveDescendantRuns(targetRequesterSessionKey),
+      );
+    } catch {
+      // Best-effort only; fall back to default announce instructions when unavailable.
+    }
+    const replyInstruction = buildAnnounceReplyInstruction({
+      remainingActiveSubagentRuns,
+      requesterIsSubagent,
+      announceType,
+    });
+    const statsLine = await buildCompactAnnounceStatsLine({
+      sessionKey: params.childSessionKey,
+      startedAt: params.startedAt,
+      endedAt: params.endedAt,
+    });
+    triggerMessage = [
+      `[System Message] [sessionId: ${announceSessionId}] A ${announceType} "${taskLabel}" just ${statusLabel}.`,
      "",
-      "Findings:",
-      reply || "(no output)",
+      "Result:",
+      findings,
      "",
      statsLine,
      "",
-      "Summarize this naturally for the user. Keep it brief (1-2 sentences). Flow it into the conversation naturally.",
-      `Do not mention technical details like tokens, stats, or that this was a ${announceType}.`,
-      "You can respond with NO_REPLY if no announcement is needed (e.g., internal task with no user-facing result).",
+      replyInstruction,
    ].join("\n");

    const queued = await maybeQueueSubagentAnnounce({
-      requesterSessionKey: params.requesterSessionKey,
+      requesterSessionKey: targetRequesterSessionKey,
      triggerMessage,
      summaryLine: taskLabel,
-      requesterOrigin,
+      requesterOrigin: targetRequesterOrigin,
    });
    if (queued === "steered") {
      didAnnounce = true;
@@ -526,29 +558,30 @@ export async function runSubagentAnnounceFlow(params: {
      return true;
    }

-    // Send to main agent - it will respond in its own voice
-    let directOrigin = requesterOrigin;
-    if (!directOrigin) {
-      const { entry } = loadRequesterSessionEntry(params.requesterSessionKey);
+    // Send to the requester session. For nested subagents this is an internal
+    // follow-up injection (deliver=false) so the orchestrator receives it.
+    let directOrigin = targetRequesterOrigin;
+    if (!requesterIsSubagent && !directOrigin) {
+      const { entry } = loadRequesterSessionEntry(targetRequesterSessionKey);
      directOrigin = deliveryContextFromSession(entry);
    }
    await callGateway({
      method: "agent",
      params: {
-        sessionKey: params.requesterSessionKey,
+        sessionKey: targetRequesterSessionKey,
        message: triggerMessage,
-        deliver: true,
-        channel: directOrigin?.channel,
-        accountId: directOrigin?.accountId,
-        to: directOrigin?.to,
+        deliver: !requesterIsSubagent,
+        channel: requesterIsSubagent ? undefined : directOrigin?.channel,
+        accountId: requesterIsSubagent ? undefined : directOrigin?.accountId,
+        to: requesterIsSubagent ? undefined : directOrigin?.to,
        threadId:
-          directOrigin?.threadId != null && directOrigin.threadId !== ""
+          !requesterIsSubagent && directOrigin?.threadId != null && directOrigin.threadId !== ""
            ? String(directOrigin.threadId)
            : undefined,
        idempotencyKey: crypto.randomUUID(),
      },
      expectFinal: true,
-      timeoutMs: 60_000,
+      timeoutMs: 15_000,
    });

    didAnnounce = true;