feat(secrets): expand SecretRef coverage across user-supplied credentials (#29580)

* feat(secrets): expand secret target coverage and gateway tooling

* docs(secrets): align gateway and CLI secret docs

* chore(protocol): regenerate swift gateway models for secrets methods

* fix(config): restore talk apiKey fallback and stabilize runner test

* ci(windows): reduce test worker count for shard stability

* ci(windows): raise node heap for test shard stability

* test(feishu): make proxy env precedence assertion windows-safe

* fix(gateway): resolve auth password SecretInput refs for clients

* fix(gateway): resolve remote SecretInput credentials for clients

* fix(secrets): skip inactive refs in command snapshot assignments

* fix(secrets): scope gateway.remote refs to effective auth surfaces

* fix(secrets): ignore memory defaults when enabled agents disable search

* fix(secrets): honor Google Chat serviceAccountRef inheritance

* fix(secrets): address tsgo errors in command and gateway collectors

* fix(secrets): avoid auth-store load in providers-only configure

* fix(gateway): defer local password ref resolution by precedence

* fix(secrets): gate telegram webhook secret refs by webhook mode

* fix(secrets): gate slack signing secret refs to http mode

* fix(secrets): skip telegram botToken refs when tokenFile is set

* fix(secrets): gate discord pluralkit refs by enabled flag

* fix(secrets): gate discord voice tts refs by voice enabled

* test(secrets): make runtime fixture modes explicit

* fix(cli): resolve local qr password secret refs

* fix(cli): fail when gateway leaves command refs unresolved

* fix(gateway): fail when local password SecretRef is unresolved

* fix(gateway): fail when required remote SecretRefs are unresolved

* fix(gateway): resolve local password refs only when password can win

* fix(cli): skip local password SecretRef resolution on qr token override

* test(gateway): cast SecretRef fixtures to OpenClawConfig

* test(secrets): activate mode-gated targets in runtime coverage fixture

* fix(cron): support SecretInput webhook tokens safely

* fix(bluebubbles): support SecretInput passwords across config paths

* fix(msteams): make appPassword SecretInput-safe in onboarding/token paths

* fix(bluebubbles): align SecretInput schema helper typing

* fix(cli): clarify secrets.resolve version-skew errors

* refactor(secrets): return structured inactive paths from secrets.resolve

* refactor(gateway): type onboarding secret writes as SecretInput

* chore(protocol): regenerate swift models for secrets.resolve

* feat(secrets): expand extension credential secretref support

* fix(secrets): gate web-search refs by active provider

* fix(onboarding): detect SecretRef credentials in extension status

* fix(onboarding): allow keeping existing ref in secret prompt

* fix(onboarding): resolve gateway password SecretRefs for probe and tui

* fix(onboarding): honor secret-input-mode for local gateway auth

* fix(acp): resolve gateway SecretInput credentials

* fix(secrets): gate gateway.remote refs to remote surfaces

* test(secrets): cover pattern matching and inactive array refs

* docs(secrets): clarify secrets.resolve and remote active surfaces

* fix(bluebubbles): keep existing SecretRef during onboarding

* fix(tests): resolve CI type errors in new SecretRef coverage

* fix(extensions): replace raw fetch with SSRF-guarded fetch

* test(secrets): mark gateway remote targets active in runtime coverage

* test(infra): normalize home-prefix expectation across platforms

* fix(cli): only resolve local qr password refs in password mode

* test(cli): cover local qr token mode with unresolved password ref

* docs(cli): clarify local qr password ref resolution behavior

* refactor(extensions): reuse sdk SecretInput helpers

* fix(wizard): resolve onboarding env-template secrets before plaintext

* fix(cli): surface secrets.resolve diagnostics in memory and qr

* test(secrets): repair post-rebase runtime and fixtures

* fix(gateway): skip remote password ref resolution when token wins

* fix(secrets): treat tailscale remote gateway refs as active

* fix(gateway): allow remote password fallback when token ref is unresolved

* fix(gateway): ignore stale local password refs for none and trusted-proxy

* fix(gateway): skip remote secret ref resolution on local call paths

* test(cli): cover qr remote tailscale secret ref resolution

* fix(secrets): align gateway password active-surface with auth inference

* fix(cli): resolve inferred local gateway password refs in qr

* fix(gateway): prefer resolvable remote password over token ref pre-resolution

* test(gateway): cover none and trusted-proxy stale password refs

* docs(secrets): sync qr and gateway active-surface behavior

* fix: restore stability blockers from pre-release audit

* Secrets: fix collector/runtime precedence contradictions

* docs: align secrets and web credential docs

* fix(rebase): resolve integration regressions after main rebase

* fix(node-host): resolve gateway secret refs for auth

* fix(secrets): harden secretinput runtime readers

* gateway: skip inactive auth secretref resolution

* cli: avoid gateway preflight for inactive secret refs

* extensions: allow unresolved refs in onboarding status

* tests: fix qr-cli module mock hoist ordering

* Security: align audit checks with SecretInput resolution

* Gateway: resolve local-mode remote fallback secret refs

* Node host: avoid resolving inactive password secret refs

* Secrets runtime: mark Slack appToken inactive for HTTP mode

* secrets: keep inactive gateway remote refs non-blocking

* cli: include agent memory secret targets in runtime resolution

* docs(secrets): sync docs with active-surface and web search behavior

* fix(secrets): keep telegram top-level token refs active for blank account tokens

* fix(daemon): resolve gateway password secret refs for probe auth

* fix(secrets): skip IRC NickServ ref resolution when NickServ is disabled

* fix(secrets): align token inheritance and exec timeout defaults

* docs(secrets): clarify active-surface notes in cli docs

* cli: require secrets.resolve gateway capability

* gateway: log auth secret surface diagnostics

* secrets: remove dead provider resolver module

* fix(secrets): restore gateway auth precedence and fallback resolution

* fix(tests): align plugin runtime mock typings

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>
This commit is contained in:
Josh Avant
2026-03-02 20:58:20 -06:00
committed by GitHub
parent f212351aed
commit 806803b7ef
236 changed files with 16810 additions and 2861 deletions

View File

@@ -1,35 +1,70 @@
---
summary: "Secrets management: SecretRef contract, runtime snapshot behavior, and safe one-way scrubbing"
read_when:
- Configuring SecretRefs for providers, auth profiles, skills, or Google Chat
- Operating secrets reload/audit/configure/apply safely in production
- Understanding fail-fast and last-known-good behavior
- Configuring SecretRefs for provider credentials and `auth-profiles.json` refs
- Operating secrets reload, audit, configure, and apply safely in production
- Understanding startup fail-fast, inactive-surface filtering, and last-known-good behavior
title: "Secrets Management"
---
# Secrets management
OpenClaw supports additive secret references so credentials do not need to be stored as plaintext in config files.
OpenClaw supports additive SecretRefs so supported credentials do not need to be stored as plaintext in configuration.
Plaintext still works. Secret refs are optional.
Plaintext still works. SecretRefs are opt-in per credential.
## Goals and runtime model
Secrets are resolved into an in-memory runtime snapshot.
- Resolution is eager during activation, not lazy on request paths.
- Startup fails fast if any referenced credential cannot be resolved.
- Reload uses atomic swap: full success or keep last-known-good.
- Runtime requests read from the active in-memory snapshot.
- Startup fails fast when an effectively active SecretRef cannot be resolved.
- Reload uses atomic swap: full success, or keep the last-known-good snapshot.
- Runtime requests read from the active in-memory snapshot only.
This keeps secret-provider outages off the hot request path.
This keeps secret-provider outages off hot request paths.
## Active-surface filtering
SecretRefs are validated only on effectively active surfaces.
- Enabled surfaces: unresolved refs block startup/reload.
- Inactive surfaces: unresolved refs do not block startup/reload.
- Inactive refs emit non-fatal diagnostics with code `SECRETS_REF_IGNORED_INACTIVE_SURFACE`.
Examples of inactive surfaces:
- Disabled channel/account entries.
- Top-level channel credentials that no enabled account inherits.
- Disabled tool/feature surfaces.
- Web search provider-specific keys that are not selected by `tools.web.search.provider`.
In auto mode (provider unset), provider-specific keys are also active for provider auto-detection.
- `gateway.remote.token` / `gateway.remote.password` SecretRefs are active (when `gateway.remote.enabled` is not `false`) if one of these is true:
- `gateway.mode=remote`
- `gateway.remote.url` is configured
- `gateway.tailscale.mode` is `serve` or `funnel`
In local mode without those remote surfaces:
- `gateway.remote.token` is active when token auth can win and no env/auth token is configured.
- `gateway.remote.password` is active only when password auth can win and no env/auth password is configured.
## Gateway auth surface diagnostics
When a SecretRef is configured on `gateway.auth.password`, `gateway.remote.token`, or
`gateway.remote.password`, gateway startup/reload logs the surface state explicitly:
- `active`: the SecretRef is part of the effective auth surface and must resolve.
- `inactive`: the SecretRef is ignored for this runtime because another auth surface wins, or
because remote auth is disabled/not active.
These entries are logged with `SECRETS_GATEWAY_AUTH_SURFACE` and include the reason used by the
active-surface policy, so you can see why a credential was treated as active or inactive.
## Onboarding reference preflight
When onboarding runs in interactive mode and you choose secret reference storage, OpenClaw performs a fast preflight check before saving:
When onboarding runs in interactive mode and you choose SecretRef storage, OpenClaw runs preflight validation before saving:
- Env refs: validates env var name and confirms a non-empty value is visible during onboarding.
- Provider refs (`file` or `exec`): validates the selected provider, resolves the provided `id`, and checks value type.
- Provider refs (`file` or `exec`): validates provider selection, resolves `id`, and checks resolved value type.
If validation fails, onboarding shows the error and lets you retry.
@@ -122,22 +157,24 @@ Define providers under `secrets.providers`:
- `mode: "json"` expects JSON object payload and resolves `id` as pointer.
- `mode: "singleValue"` expects ref id `"value"` and returns file contents.
- Path must pass ownership/permission checks.
- Windows fail-closed note: if ACL verification is unavailable for a path, resolution fails. For trusted paths only, set `allowInsecurePath: true` on that provider to bypass path security checks.
### Exec provider
- Runs configured absolute binary path, no shell.
- By default, `command` must point to a regular file (not a symlink).
- Set `allowSymlinkCommand: true` to allow symlink command paths (for example Homebrew shims). OpenClaw validates the resolved target path.
- Enable `allowSymlinkCommand` only when required for trusted package-manager paths, and pair it with `trustedDirs` (for example `["/opt/homebrew"]`).
- When `trustedDirs` is set, checks apply to the resolved target path.
- Pair `allowSymlinkCommand` with `trustedDirs` for package-manager paths (for example `["/opt/homebrew"]`).
- Supports timeout, no-output timeout, output byte limits, env allowlist, and trusted dirs.
- Request payload (stdin):
- Windows fail-closed note: if ACL verification is unavailable for the command path, resolution fails. For trusted paths only, set `allowInsecurePath: true` on that provider to bypass path security checks.
Request payload (stdin):
```json
{ "protocolVersion": 1, "provider": "vault", "ids": ["providers/openai/apiKey"] }
```
- Response payload (stdout):
Response payload (stdout):
```json
{ "protocolVersion": 1, "values": { "providers/openai/apiKey": "sk-..." } }
@@ -242,37 +279,33 @@ Optional per-id errors:
}
```
## In-scope fields (v1)
## Supported credential surface
### `~/.openclaw/openclaw.json`
Canonical supported and unsupported credentials are listed in:
- `models.providers.<provider>.apiKey`
- `skills.entries.<skillKey>.apiKey`
- `channels.googlechat.serviceAccount`
- `channels.googlechat.serviceAccountRef`
- `channels.googlechat.accounts.<accountId>.serviceAccount`
- `channels.googlechat.accounts.<accountId>.serviceAccountRef`
- [SecretRef Credential Surface](/reference/secretref-credential-surface)
### `~/.openclaw/agents/<agentId>/agent/auth-profiles.json`
- `profiles.<profileId>.keyRef` for `type: "api_key"`
- `profiles.<profileId>.tokenRef` for `type: "token"`
OAuth credential storage changes are out of scope.
Runtime-minted or rotating credentials and OAuth refresh material are intentionally excluded from read-only SecretRef resolution.
## Required behavior and precedence
- Field without ref: unchanged.
- Field with ref: required at activation time.
- If plaintext and ref both exist, ref wins at runtime and plaintext is ignored.
- Field without a ref: unchanged.
- Field with a ref: required on active surfaces during activation.
- If both plaintext and ref are present, ref takes precedence on supported precedence paths.
Warning code:
Warning and audit signals:
- `SECRETS_REF_OVERRIDES_PLAINTEXT`
- `SECRETS_REF_OVERRIDES_PLAINTEXT` (runtime warning)
- `REF_SHADOWED` (audit finding when `auth-profiles.json` credentials take precedence over `openclaw.json` refs)
Google Chat compatibility behavior:
- `serviceAccountRef` takes precedence over plaintext `serviceAccount`.
- Plaintext value is ignored when sibling ref is set.
## Activation triggers
Secret activation is attempted on:
Secret activation runs on:
- Startup (preflight plus final activation)
- Config reload hot-apply path
@@ -283,9 +316,9 @@ Activation contract:
- Success swaps the snapshot atomically.
- Startup failure aborts gateway startup.
- Runtime reload failure keeps last-known-good snapshot.
- Runtime reload failure keeps the last-known-good snapshot.
## Degraded and recovered operator signals
## Degraded and recovered signals
When reload-time activation fails after a healthy state, OpenClaw enters degraded secrets state.
@@ -297,13 +330,22 @@ One-shot system event and log codes:
Behavior:
- Degraded: runtime keeps last-known-good snapshot.
- Recovered: emitted once after a successful activation.
- Recovered: emitted once after the next successful activation.
- Repeated failures while already degraded log warnings but do not spam events.
- Startup fail-fast does not emit degraded events because no runtime snapshot exists yet.
- Startup fail-fast does not emit degraded events because runtime never became active.
## Command-path resolution
Credential-sensitive command paths that opt in (for example `openclaw memory` remote-memory paths and `openclaw qr --remote`) can resolve supported SecretRefs via gateway snapshot RPC.
- When gateway is running, those command paths read from the active snapshot.
- If a configured SecretRef is required and gateway is unavailable, command resolution fails fast with actionable diagnostics.
- Snapshot refresh after backend secret rotation is handled by `openclaw secrets reload`.
- Gateway RPC method used by these command paths: `secrets.resolve`.
## Audit and configure workflow
Use this default operator flow:
Default operator flow:
```bash
openclaw secrets audit --check
@@ -311,26 +353,22 @@ openclaw secrets configure
openclaw secrets audit --check
```
Migration completeness:
- Include `skills.entries.<skillKey>.apiKey` targets when those skills use API keys.
- If `audit --check` still reports plaintext findings after a partial migration, migrate the remaining reported paths and rerun audit.
### `secrets audit`
Findings include:
- plaintext values at rest (`openclaw.json`, `auth-profiles.json`, `.env`)
- unresolved refs
- precedence shadowing (`auth-profiles` taking priority over config refs)
- legacy residues (`auth.json`, OAuth out-of-scope reminders)
- precedence shadowing (`auth-profiles.json` taking priority over `openclaw.json` refs)
- legacy residues (`auth.json`, OAuth reminders)
### `secrets configure`
Interactive helper that:
- configures `secrets.providers` first (`env`/`file`/`exec`, add/edit/remove)
- lets you select secret-bearing fields in `openclaw.json`
- lets you select supported secret-bearing fields in `openclaw.json` plus `auth-profiles.json` for one agent scope
- can create a new `auth-profiles.json` mapping directly in the target picker
- captures SecretRef details (`source`, `provider`, `id`)
- runs preflight resolution
- can apply immediately
@@ -339,10 +377,11 @@ Helpful modes:
- `openclaw secrets configure --providers-only`
- `openclaw secrets configure --skip-provider-setup`
- `openclaw secrets configure --agent <id>`
`configure` apply defaults to:
`configure` apply defaults:
- scrub matching static creds from `auth-profiles.json` for targeted providers
- scrub matching static credentials from `auth-profiles.json` for targeted providers
- scrub legacy static `api_key` entries from `auth.json`
- scrub matching known secret lines from `<config-dir>/.env`
@@ -361,26 +400,31 @@ For strict target/path contract details and exact rejection rules, see:
## One-way safety policy
OpenClaw intentionally does **not** write rollback backups that contain pre-migration plaintext secret values.
OpenClaw intentionally does not write rollback backups containing historical plaintext secret values.
Safety model:
- preflight must succeed before write mode
- runtime activation is validated before commit
- apply updates files using atomic file replacement and best-effort in-memory restore on failure
- apply updates files using atomic file replacement and best-effort restore on failure
## `auth.json` compatibility notes
## Legacy auth compatibility notes
For static credentials, OpenClaw runtime no longer depends on plaintext `auth.json`.
For static credentials, runtime no longer depends on plaintext legacy auth storage.
- Runtime credential source is the resolved in-memory snapshot.
- Legacy `auth.json` static `api_key` entries are scrubbed when discovered.
- OAuth-related legacy compatibility behavior remains separate.
- Legacy static `api_key` entries are scrubbed when discovered.
- OAuth-related compatibility behavior remains separate.
## Web UI note
Some SecretInput unions are easier to configure in raw editor mode than in form mode.
## Related docs
- CLI commands: [secrets](/cli/secrets)
- Plan contract details: [Secrets Apply Plan Contract](/gateway/secrets-plan-contract)
- Credential surface: [SecretRef Credential Surface](/reference/secretref-credential-surface)
- Auth setup: [Authentication](/gateway/authentication)
- Security posture: [Security](/gateway/security)
- Environment precedence: [Environment Variables](/help/environment)