* fix: preserve ${VAR} env var references when writing config back to disk
Fixes#11466
When config is loaded, ${VAR} references are resolved to their plaintext
values. Previously, writeConfigFile would serialize the resolved values,
silently replacing "${ANTHROPIC_API_KEY}" with "sk-ant-api03-..." in the
config file.
Now writeConfigFile reads the current file pre-substitution, and for each
value that matches what a ${VAR} reference would resolve to, restores the
original reference. Values the caller intentionally changed are kept as-is.
This fixes all 50+ writeConfigFile call sites (doctor, configure wizard,
gateway config.set/apply/patch, plugins, hooks, etc.) without requiring
any caller changes.
New files:
- src/config/env-preserve.ts — restoreEnvVarRefs() utility
- src/config/env-preserve.test.ts — 11 unit tests
* fix: remove global config env snapshot race
* docs(changelog): note config env snapshot race fix
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
* fix: defer gateway restart until all replies are sent
Fixes a race condition where gateway config changes (e.g., enabling
plugins via iMessage) trigger an immediate SIGUSR1 restart, killing the
iMessage RPC connection before replies are delivered.
Both restart paths (config watcher and RPC-triggered) now defer until
all queued operations, pending replies, and embedded agent runs complete
(polling every 500ms, 30s timeout). A shared emitGatewayRestart() guard
prevents double SIGUSR1 when both paths fire simultaneously.
Key changes:
- Dispatcher registry tracks active reply dispatchers globally
- markComplete() called in finally block for guaranteed cleanup
- Pre-restart deferral hook registered at gateway startup
- Centralized extractDeliveryInfo() for session key parsing
- Post-restart sentinel messages delivered directly (not via agent)
- config-patch distinguished from config-apply in sentinel kind
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: single-source gateway restart authorization
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
* fix(gateway): normalize session key casing to prevent ghost sessions on Linux
On case-sensitive filesystems (Linux), mixed-case session keys like
agent:ops:MySession and agent:ops:mysession resolve to different store
entries, creating ghost duplicates that never converge.
Core changes in session-utils.ts:
- resolveSessionStoreKey: lowercase all session key components
- canonicalizeSpawnedByForAgent: accept cfg, resolve main-alias references
via canonicalizeMainSessionAlias after lowercasing
- loadSessionEntry: return legacyKey only when it differs from canonicalKey
- resolveGatewaySessionStoreTarget: scan store for case-insensitive matches;
add optional scanLegacyKeys param to skip disk reads for read-only callers
- Export findStoreKeysIgnoreCase for use by write-path consumers
- Compare global/unknown sentinels case-insensitively in all canonicalization
functions
sessions-resolve.ts:
- Make resolveSessionKeyFromResolveParams async for inline migration
- Check canonical key first (fast path), then fall back to legacy scan
- Delete ALL legacy case-variant keys in a single updateSessionStore pass
Fixes#12603
* fix(gateway): propagate canonical keys and clean up all case variants on write paths
- agent.ts: use canonicalizeSpawnedByForAgent (with cfg) instead of raw
toLowerCase; use findStoreKeysIgnoreCase to delete all legacy variants
on store write; pass canonicalKey to addChatRun, registerAgentRunContext,
resolveSendPolicy, and agentCommand
- sessions.ts: replace single-key migration with full case-variant cleanup
via findStoreKeysIgnoreCase in patch/reset/delete/compact handlers; add
case-insensitive fallback in preview (store already loaded); make
sessions.resolve handler async; pass scanLegacyKeys: false in preview
- server-node-events.ts: use findStoreKeysIgnoreCase to clean all legacy
variants on voice.transcript and agent.request write paths; pass
canonicalKey to addChatRun and agentCommand
* test(gateway): add session key case-normalization tests
Cover the case-insensitive session key canonicalization logic:
- resolveSessionStoreKey normalizes mixed-case bare and prefixed keys
- resolveSessionStoreKey resolves mixed-case main aliases (MAIN, Main)
- resolveGatewaySessionStoreTarget includes legacy mixed-case store keys
- resolveGatewaySessionStoreTarget collects all case-variant duplicates
- resolveGatewaySessionStoreTarget finds legacy main alias keys with
customized mainKey configuration
All 5 tests fail before the production changes, pass after.
* fix: clean legacy session alias cleanup gaps (openclaw#12846) thanks @mcaxtr
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
* feat(gateway): add register and awaitDecision methods to ExecApprovalManager
Separates registration (synchronous) from waiting (async) to allow callers
to confirm registration before the decision is made. Adds grace period for
resolved entries to prevent race conditions.
* feat(gateway): add two-phase response and waitDecision handler for exec approvals
Send immediate 'accepted' response after registration so callers can confirm
the approval ID is valid. Add exec.approval.waitDecision endpoint to wait for
decision on already-registered approvals.
* fix(exec): await approval registration before returning approval-pending
Ensures the approval ID is registered in the gateway before the tool returns.
Uses exec.approval.request with expectFinal:false for registration, then
fire-and-forget exec.approval.waitDecision for the decision phase.
Fixes#2402
* test(gateway): update exec-approval test for two-phase response
Add assertion for immediate 'accepted' response before final decision.
* test(exec): update approval-id test mocks for new two-phase flow
Mock both exec.approval.request (registration) and exec.approval.waitDecision
(decision) calls to match the new internal implementation.
* fix(lint): add cause to errors, use generics instead of type assertions
* fix(exec-approval): guard register() against duplicate IDs
* fix: remove unused timeoutMs param, guard register() against duplicates
* fix(exec-approval): throw on duplicate ID, capture entry in closure
* fix: return error on timeout, remove stale test mock branch
* fix: wrap register() in try/catch, make timeout handling consistent
* fix: update snapshot on timeout, make two-phase response opt-in
* fix: extend grace period to 15s, return 'expired' status
* fix: prevent double-resolve after timeout
* fix: make register() idempotent, capture snapshot before await
* fix(gateway): complete two-phase exec approval wiring
* fix: finalize exec approval race fix (openclaw#3357) thanks @ramin-shirali
* fix(protocol): regenerate exec approval request models (openclaw#3357) thanks @ramin-shirali
* fix(test): remove unused callCount in discord threading test
---------
Co-authored-by: rshirali <rshirali@rshirali-haga.local>
Co-authored-by: rshirali <rshirali@rshirali-haga-1.home>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
* refactor: add config.get to READ_METHODS set
* refactor(gateway): scope talk secrets via talk.config
* fix: resolve rebase conflicts for talk scope refactor
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>