feat(security): add client-side skill security enforcement

Add a capability-based security model for community skills, inspired by how mobile and Apple ecosystem apps declare capabilities upfront. This is not a silver bullet for prompt injection, but it's a significant step up from the status quo and encourages responsible developer practices by making capability requirements explicit and visible.

Runtime enforcement for community skills installed from ClawHub:

- Capability declarations (shell, filesystem, network, browser, sessions) parsed from SKILL.md frontmatter and enforced at tool-call time
- Static SKILL.md scanner detecting prompt injection patterns, suspicious constructs, and capability mismatches
- Global skill security context tracking loaded community skills and their aggregate capabilities
- Before-tool-call enforcement gate blocking undeclared tool usage
- Command-dispatch capability check preventing shell/filesystem access without explicit declaration
- Trust tier classification (builtin/community/local); only community skills are subject to enforcement
- System prompt trust context warning for skills with scan warnings or missing capability declarations
- CLI: `skills list -v`, `skills info`, `skills check` now surface capabilities, scan results, and security status
- TUI security log panel for skill enforcement events
- Docs updated across 7 files covering the full security model

Companion PR: openclaw/clawhub (capability visibility + UI badges)
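
For reviewers, the before-tool-call gate described above can be sketched roughly as follows. This is a minimal illustration, not the code in this commit: the type names, the `checkToolCall` function, and the tool-to-capability mapping (`exec`, `write_file`, `web_fetch`) are all hypothetical, and the sketch assumes that the gate checks a tool call against the aggregate capabilities of the loaded community skills and that enforcement is skipped when no community skill is loaded.

```typescript
// Hypothetical sketch of a before-tool-call capability gate.
// All names here are illustrative assumptions, not the actual implementation.

type Capability = "shell" | "filesystem" | "network" | "browser" | "sessions";
type TrustTier = "builtin" | "community" | "local";

interface SkillInfo {
  name: string;
  tier: TrustTier;
  capabilities: Capability[];
}

interface GateDecision {
  allowed: boolean;
  reason?: string;
}

// Assumed mapping from tool name to the capability it requires.
const REQUIRED_CAPABILITY = new Map<string, Capability>([
  ["exec", "shell"],
  ["write_file", "filesystem"],
  ["web_fetch", "network"],
]);

function checkToolCall(loadedSkills: SkillInfo[], toolName: string): GateDecision {
  const required = REQUIRED_CAPABILITY.get(toolName);
  if (required === undefined) {
    // The tool needs no capability, so it is always allowed.
    return { allowed: true };
  }

  // Only community skills are subject to enforcement; builtin and local are exempt.
  const community = loadedSkills.filter((skill) => skill.tier === "community");
  if (community.length === 0) {
    return { allowed: true };
  }

  // Aggregate the capabilities declared by all loaded community skills.
  const declared = new Set<Capability>();
  for (const skill of community) {
    for (const capability of skill.capabilities) {
      declared.add(capability);
    }
  }

  if (declared.has(required)) {
    return { allowed: true };
  }
  return {
    allowed: false,
    reason: `tool "${toolName}" requires undeclared capability "${required}"`,
  };
}
```

With a single community skill that declares only `network`, this sketch allows `web_fetch` and blocks `exec`, while a session with only local skills is not gated at all.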
2026-05-23 19:48:12 +00:00 · 2026-02-17 02:26:41 +11:00
parent 602a1ebd55
commit 2c61fb69c1
29 changed files with 1571 additions and 120 deletions
--- a/docs/gateway/security/index.md
+++ b/docs/gateway/security/index.md
@@ -215,6 +215,18 @@ If a macOS node is paired, the Gateway can invoke `system.run` on that node. Thi
 - Controlled on the Mac via **Settings → Exec approvals** (security + ask + allowlist).
 - If you don’t want remote execution, set security to **deny** and remove node pairing for that Mac.

+## Skill security
+
+Community skills (installed from ClawHub) are subject to runtime security enforcement:
+
+- **Capabilities**: Skills declare the system access they need (`shell`, `filesystem`, `network`, `browser`, `sessions`) in `metadata.openclaw.capabilities`. A skill that declares no capabilities is treated as read-only. Community skills that call tools without declaring the matching capability are blocked at runtime.
+- **SKILL.md scanning**: Content is scanned for prompt injection patterns, capability inflation, and boundary spoofing before entering the system prompt. Skills with critical findings are blocked from loading.
+- **Trust tiers**: Skills are classified as `builtin`, `community`, or `local`. Only `community` skills (installed from ClawHub) are subject to enforcement — builtin and local skills are exempt. Author verification may be introduced in a future release to provide an additional trust signal.
+- **Command dispatch gating**: Community skills using `command-dispatch: tool` can't dispatch to dangerous tools (such as those with shell or filesystem access) without declaring the matching capability.
+- **Audit logging**: All security events are tagged with `category: "security"` and include session context.
+
+Use `openclaw skills check` for a security overview and `openclaw skills info <name>` for per-skill details. See [Skills CLI](/cli/skills) for full command reference.
+
 ## Dynamic skills (watcher / remote nodes)

 OpenClaw can refresh the skills list mid-session:
@@ -222,7 +234,7 @@ OpenClaw can refresh the skills list mid-session:
 - **Skills watcher**: changes to `SKILL.md` can update the skills snapshot on the next agent turn.
 - **Remote nodes**: connecting a macOS node can make macOS-only skills eligible (based on bin probing).

-Treat skill folders as **trusted code** and restrict who can modify them.
+Restrict who can modify skill folders. Community skills are subject to scanning and capability enforcement (see above), but local and workspace skills are treated as trusted — if someone can write to your skill folders, they can inject instructions into the system prompt.

 ## The Threat Model