feat(security): add client-side skill security enforcement

Add a capability-based security model for community skills, inspired by how mobile and Apple ecosystem apps declare capabilities upfront. This is not a silver bullet for prompt injection, but it's a significant step up from the status quo and encourages responsible developer practices by making capability requirements explicit and visible.

Runtime enforcement for community skills installed from ClawHub:

- Capability declarations (shell, filesystem, network, browser, sessions) parsed from SKILL.md frontmatter and enforced at tool-call time
- Static SKILL.md scanner detecting prompt injection patterns, suspicious constructs, and capability mismatches
- Global skill security context tracking loaded community skills and their aggregate capabilities
- Before-tool-call enforcement gate blocking undeclared tool usage
- Command-dispatch capability check preventing shell/filesystem access without explicit declaration
- Trust tier classification (builtin/community/local); only community skills are subject to enforcement
- System prompt trust context warning for skills with scan warnings or missing capability declarations
- CLI: `skills list -v`, `skills info`, `skills check` now surface capabilities, scan results, and security status
- TUI security log panel for skill enforcement events
- Docs updated across 7 files covering the full security model

Companion PR: openclaw/clawhub (capability visibility + UI badges)
2026-05-23 12:48:11 +00:00 · 2026-02-17 02:26:41 +11:00
parent 602a1ebd55
commit 2c61fb69c1
29 changed files with 1571 additions and 120 deletions
--- a/docs/cli/security.md
+++ b/docs/cli/security.md
@@ -34,6 +34,60 @@ It also warns when npm-based plugin/hook install records are unpinned, missing i
 It warns when Discord allowlists (`channels.discord.allowFrom`, `channels.discord.guilds.*.users`, pairing store) use name or tag entries instead of stable IDs.
 It warns when `gateway.auth.mode="none"` leaves Gateway HTTP APIs reachable without a shared secret (`/tools/invoke` plus any enabled `/v1/*` endpoint).

+## Skill security
+
+Community skills (installed from ClawHub) are subject to additional security enforcement:
+
+- **SKILL.md scanning**: content is scanned for prompt injection patterns, capability inflation, and boundary spoofing before entering the system prompt. Skills with critical findings are blocked from loading.
+- **Capability enforcement**: community skills must declare `capabilities` (e.g., `shell`, `network`) in frontmatter. Undeclared dangerous tool usage is blocked at runtime by the before-tool-call hook — a hard code gate that prompt injection cannot bypass.
+- **Command dispatch gating**: community skills using `command-dispatch: tool` can't dispatch to dangerous tools without the matching capability.
+- **Audit logging**: all security events are tagged with `category: "security"` and include session context for forensics. View in the web UI Logs tab using the Security filter.
+
+See `openclaw skills check` for a runtime security overview, `openclaw skills info <name>` for per-skill details, and [Skills — Tool enforcement matrix](/tools/skills#tool-enforcement-matrix) for the complete tool-by-tool breakdown.
+
+### Tool enforcement matrix
+
+Every tool falls into one of three tiers when community skills are loaded:
+
+**Always denied** — blocked unconditionally, no capability can override:
+
+| Tool | Reason |
+|------|--------|
+| `gateway` | Control-plane reconfiguration (restart, shutdown, auth changes) |
+| `nodes` | Cluster node management (add/remove compute, redirect traffic) |
+
+**Capability-gated** — blocked by default, allowed if the skill declares the matching capability:
+
+| Capability | Tools | What it unlocks |
+|------------|-------|-----------------|
+| `shell` | `exec`, `process`, `lobster` | Run shell commands and manage processes |
+| `filesystem` | `write`, `edit`, `apply_patch` | File mutations (read is always allowed) |
+| `network` | `web_fetch`, `web_search` | Outbound HTTP requests |
+| `browser` | `browser` | Browser automation |
+| `sessions` | `sessions_spawn`, `sessions_send`, `subagents` | Cross-session orchestration |
+| `messaging` | `message` | Send messages to configured channels |
+| `scheduling` | `cron` | Schedule recurring jobs |
+
+**Always allowed** — safe read-only or output-only tools, no capability required:
+
+| Tool | Why safe |
+|------|---------|
+| `read` | Read-only file access |
+| `memory_search`, `memory_get` | Read-only memory access |
+| `agents_list` | List agents (read-only) |
+| `sessions_list`, `sessions_history`, `session_status` | Session introspection (read-only) |
+| `canvas` | UI rendering (output-only) |
+| `image` | Image generation (output-only) |
+| `tts` | Text-to-speech (output-only) |
+
+A community skill with no capabilities declared gets access only to the always-allowed tier. Declare capabilities in SKILL.md frontmatter:
+
+```yaml
+metadata:
+  openclaw:
+    capabilities: [shell, filesystem, network]
+```
+
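The three tiers above can be read as a small decision function. The following is a minimal sketch, not the code from this commit: `checkToolCall`, `Decision`, and the table names are hypothetical, and only the tool and capability names are taken from the matrix. Denying tools that appear in none of the tables is an assumption of the sketch.

```typescript
// Sketch only: type and function names are hypothetical. Tool and capability
// names are copied from the enforcement matrix.
type Capability =
  | "shell" | "filesystem" | "network" | "browser"
  | "sessions" | "messaging" | "scheduling";

type Decision = { allowed: true } | { allowed: false; reason: string };

// Tier 1: no capability can unlock these.
const ALWAYS_DENIED = new Set<string>(["gateway", "nodes"]);

// Tier 3: read-only or output-only tools.
const ALWAYS_ALLOWED = new Set<string>([
  "read", "memory_search", "memory_get", "agents_list",
  "sessions_list", "sessions_history", "session_status",
  "canvas", "image", "tts",
]);

// Tier 2: each gated tool and the capability that unlocks it.
const REQUIRED_CAPABILITY = new Map<string, Capability>([
  ["exec", "shell"], ["process", "shell"], ["lobster", "shell"],
  ["write", "filesystem"], ["edit", "filesystem"], ["apply_patch", "filesystem"],
  ["web_fetch", "network"], ["web_search", "network"],
  ["browser", "browser"],
  ["sessions_spawn", "sessions"], ["sessions_send", "sessions"], ["subagents", "sessions"],
  ["message", "messaging"],
  ["cron", "scheduling"],
]);

function checkToolCall(tool: string, declared: readonly Capability[]): Decision {
  if (ALWAYS_DENIED.has(tool)) {
    return { allowed: false, reason: `${tool}: always denied` };
  }
  if (ALWAYS_ALLOWED.has(tool)) {
    return { allowed: true };
  }
  const required = REQUIRED_CAPABILITY.get(tool);
  if (required === undefined) {
    // Assumption of this sketch: fail closed on tools outside the matrix.
    return { allowed: false, reason: `${tool}: not in the enforcement matrix` };
  }
  if (declared.indexOf(required) !== -1) {
    return { allowed: true };
  }
  return { allowed: false, reason: `${tool}: requires undeclared capability "${required}"` };
}
```

Under these tables, `checkToolCall("exec", [])` is denied while `checkToolCall("exec", ["shell"])` is allowed, and `gateway` stays denied whatever is declared.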
 ## JSON output

 Use `--json` for CI/policy checks:
--- a/docs/cli/skills.md
+++ b/docs/cli/skills.md
@@ -18,9 +18,163 @@ Related:

 ## Commands

+### `openclaw skills list`
+
+List all skills with status, capabilities, and source.
+
 ```bash
-openclaw skills list
-openclaw skills list --eligible
-openclaw skills info <name>
-openclaw skills check
+openclaw skills list              # all skills
+openclaw skills list --eligible   # only ready-to-use skills
+openclaw skills list --json       # JSON output
+openclaw skills list -v           # verbose (show missing requirements)
+```
+
+Output columns: **Status** (`+ ready`, `x missing`, `x blocked`), **Skill** (name + capability icons), **Description**, **Source**.
+
+Capability icons displayed next to skill names:
+
+| Icon | Capability |
+|------|-----------|
+| `>_` | `shell` — run shell commands |
+| `📂` | `filesystem` — read/write files |
+| `🌐` | `network` — outbound HTTP |
+| `🔍` | `browser` — browser automation |
+| `⚡` | `sessions` — cross-session orchestration |
+
+Skills blocked by security scanning show `x blocked` instead of `x missing`.
+
+Example output:
+
+```
+Skills (10/12 ready)
+
+Status      Skill                          Description                          Source
++ ready     git-autopush >_ 🌐            Automate git workflows               openclaw-managed
++ ready     think                          Extended thinking                    bundled
++ ready     peekaboo 🔍 ⚡                 Browser peek and screenshot          bundled
+x missing   summarize >_                   Summarize with CLI tool              bundled
+x blocked   evil-injector >_               Totally harmless skill               openclaw-managed
+- disabled  old-skill                      Deprecated skill                     workspace
+```
+
+With `-v` (verbose), two extra columns appear — **Scan** and **Missing**:
+
+```
+Status      Skill              Description          Source              Scan        Missing
++ ready     git-autopush >_ 🌐 Automate git wor...  openclaw-managed
+x missing   summarize >_       Summarize with...    bundled                         bins: summarize
+x blocked   evil-injector >_   Totally harmless...  openclaw-managed    [blocked]
++ ready     sketch-tool 🌐 >_  Generate sketches    openclaw-managed    [warn]
+```
+
+### `openclaw skills info <name>`
+
+Show detailed information about a single skill including security status.
+
+```bash
+openclaw skills info git-autopush
+openclaw skills info git-autopush --json
+```
+
+Displays: description, source, file path, capabilities (with descriptions), security scan results, requirements (met/unmet), and install options.
+
+Example output:
+
+```
+git-autopush + Ready
+
+  Automate git commit, push, and PR workflows.
+
+  Source        openclaw-managed
+  Path          ~/.openclaw/skills/git-autopush/SKILL.md
+  Homepage      https://github.com/example/git-autopush
+  Primary env   GH_TOKEN
+
+  Capabilities
+  >_ shell        Run shell commands
+  🌐 network      Make outbound HTTP requests
+
+  Security
+  Scan          + clean
+
+  Requirements
+  bin           git         + ok
+  bin           gh          + ok
+  env           GH_TOKEN    + ok
+```
+
+For a skill with missing requirements:
+
+```
+summarize x Missing requirements
+
+  Summarize URLs and files using the summarize CLI.
+
+  Source        bundled
+  Path          /opt/openclaw/skills/summarize/SKILL.md
+
+  Capabilities
+  >_ shell        Run shell commands
+
+  Security
+  Scan          + clean
+
+  Requirements
+  bin           summarize   x missing
+
+  Install options
+  brew          Install summarize (brew install summarize)
+```
+
+For a skill blocked by scanning:
+
+```
+evil-injector x Blocked (security)
+
+  Totally harmless skill.
+
+  Source        openclaw-managed
+  Path          ~/.openclaw/skills/evil-injector/SKILL.md
+
+  Capabilities
+  >_ shell        Run shell commands
+
+  Security
+  Scan          [blocked] prompt injection detected
+```
+
+### `openclaw skills check`
+
+Security-focused overview of all skills.
+
+```bash
+openclaw skills check
+openclaw skills check --json
+```
+
+Shows: total/eligible/disabled/blocked/missing counts, capabilities requested by community skills, runtime policy restrictions, and scan result summary.
+
+Example output:
+
+```
+Skills Status Check
+
+Status                      Count
+Total                       12
+Eligible                    10
+Disabled                    1
+Blocked (allowlist)         0
+Missing requirements        1
+
+Community skill capabilities
+Icon    Capability    #    Skills
+>_      shell         3    git-autopush, deploy-helper, node-runner
+📂      filesystem    2    git-autopush, file-editor
+🌐      network       2    git-autopush, sketch-tool
+
+Scan results
+Result      #
+Clean       11
+Warning     1
+Blocked     0
 ```
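The "Community skill capabilities" table in the `skills check` output groups skill names by declared capability. A minimal sketch of that grouping, assuming a hypothetical `LoadedSkill` shape and a hypothetical `aggregateCapabilities` helper (neither name comes from this commit):

```typescript
// Sketch only: LoadedSkill and aggregateCapabilities are hypothetical names.
interface LoadedSkill {
  name: string;
  capabilities: readonly string[];
}

// Group skill names by capability, preserving the order skills were loaded in.
function aggregateCapabilities(skills: readonly LoadedSkill[]): Map<string, string[]> {
  const byCapability = new Map<string, string[]>();
  for (const skill of skills) {
    for (const capability of skill.capabilities) {
      const names = byCapability.get(capability) ?? [];
      // Ignore a capability that a skill lists more than once.
      if (names.indexOf(skill.name) === -1) {
        names.push(skill.name);
      }
      byCapability.set(capability, names);
    }
  }
  return byCapability;
}

// Sample data matching the example output above.
const communitySkills: LoadedSkill[] = [
  { name: "git-autopush", capabilities: ["shell", "filesystem", "network"] },
  { name: "deploy-helper", capabilities: ["shell"] },
  { name: "node-runner", capabilities: ["shell"] },
  { name: "file-editor", capabilities: ["filesystem"] },
  { name: "sketch-tool", capabilities: ["network"] },
];

aggregateCapabilities(communitySkills).forEach((names, capability) => {
  console.log(`${capability}\t${names.length}\t${names.join(", ")}`);
});
```

For the sample data this yields three skills under `shell` and two each under `filesystem` and `network`, the same counts as the table.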
--- a/docs/gateway/security/index.md
+++ b/docs/gateway/security/index.md
@@ -215,6 +215,18 @@ If a macOS node is paired, the Gateway can invoke `system.run` on that node. Thi
 - Controlled on the Mac via **Settings → Exec approvals** (security + ask + allowlist).
 - If you don’t want remote execution, set security to **deny** and remove node pairing for that Mac.

+## Skill security
+
+Community skills (installed from ClawHub) are subject to runtime security enforcement:
+
+- **Capabilities**: Skills declare what system access they need (`shell`, `filesystem`, `network`, `browser`, `sessions`, `messaging`, `scheduling`) in `metadata.openclaw.capabilities`. A skill with no declared capabilities is limited to the always-allowed read-only and output-only tools. Community skills that use tools without declaring the matching capability are blocked at runtime.
+- **SKILL.md scanning**: Content is scanned for prompt injection patterns, capability inflation, and boundary spoofing before entering the system prompt. Skills with critical findings are blocked from loading.
+- **Trust tiers**: Skills are classified as `builtin`, `community`, or `local`. Only `community` skills (installed from ClawHub) are subject to enforcement — builtin and local skills are exempt. Author verification may be introduced in a future release to provide an additional trust signal.
+- **Command dispatch gating**: Community skills using `command-dispatch: tool` can't dispatch to dangerous tools without declaring the matching capability.
+- **Audit logging**: All security events are tagged with `category: "security"` and include session context.
+
+Use `openclaw skills check` for a security overview and `openclaw skills info <name>` for per-skill details. See [Skills CLI](/cli/skills) for full command reference.
+
 ## Dynamic skills (watcher / remote nodes)

 OpenClaw can refresh the skills list mid-session:
@@ -222,7 +234,7 @@ OpenClaw can refresh the skills list mid-session:
 - **Skills watcher**: changes to `SKILL.md` can update the skills snapshot on the next agent turn.
 - **Remote nodes**: connecting a macOS node can make macOS-only skills eligible (based on bin probing).

-Treat skill folders as **trusted code** and restrict who can modify them.
+Restrict who can modify skill folders. Community skills are subject to scanning and capability enforcement (see above), but local and workspace skills are treated as trusted — if someone can write to your skill folders, they can inject instructions into the system prompt.

 ## The Threat Model

--- a/docs/tools/clawhub.md
+++ b/docs/tools/clawhub.md
@@ -81,9 +81,15 @@ A typical skill includes:

 - A `SKILL.md` file with the primary description and usage.
 - Optional configs, scripts, or supporting files used by the skill.
-- Metadata such as tags, summary, and install requirements.
+- Metadata such as tags, summary, install requirements, and capabilities.
+
+ClawHub uses metadata to power discovery and display skill capabilities.
+Skills declare what system access they need via `capabilities` in frontmatter
+(e.g., `shell`, `filesystem`, `network`). OpenClaw enforces these at runtime —
+community skills that use tools without declaring the matching capability are
+blocked. See [Skills](/tools/skills#gating-load-time-filters) for the
+full capability reference.

-ClawHub uses metadata to power discovery and safely expose skill capabilities.
 The registry also tracks usage signals (such as stars and downloads) to improve
 ranking and visibility.

@@ -103,7 +109,17 @@ ClawHub is open by default. Anyone can upload skills, but a GitHub account must
 be at least one week old to publish. This helps slow down abuse without blocking
 legitimate contributors.

-Reporting and moderation:
+### Capabilities and enforcement
+
+Skills declare `capabilities` in their SKILL.md frontmatter to describe what
+system access they need. ClawHub displays these to users before install.
+OpenClaw enforces them at runtime: community skills that attempt to use tools
+without the matching declared capability are blocked. Skills with no capabilities
+are treated as read-only: their instructions can use only the always-allowed
+read-only and output-only tools.
+
+Available capabilities: `shell`, `filesystem`, `network`, `browser`, `sessions`,
+`messaging`, `scheduling`.
+
+### Reporting and moderation

 - Any signed in user can report a skill.
 - Report reasons are required and recorded.
--- a/docs/tools/creating-skills.md
+++ b/docs/tools/creating-skills.md
@@ -35,11 +35,27 @@ description: A simple skill that says hello.
 When the user asks for a greeting, use the `echo` tool to say "Hello from your custom skill!".
 ```

-### 3. Add Tools (Optional)
+### 3. Declare Capabilities
+
+If your skill uses system tools, declare the capabilities they require in the `metadata.openclaw.capabilities` field:
+
+```markdown
+---
+name: deploy-helper
+description: Automate deployment workflows.
+metadata: { "openclaw": { "capabilities": ["shell", "filesystem"] } }
+---
+```
+
+Available capabilities: `shell`, `filesystem`, `network`, `browser`, `sessions`, `messaging`, `scheduling`.
+
+Skills without capabilities are treated as read-only (model-only instructions). Community skills published to ClawHub **must** declare capabilities matching their tool usage; tool calls that need an undeclared capability are blocked at runtime.
+
+### 4. Add Tools (Optional)

 You can define custom tools in the frontmatter or instruct the agent to use existing system tools (like `bash` or `browser`).

-### 4. Refresh OpenClaw
+### 5. Refresh OpenClaw

 Ask your agent to "refresh skills" or restart the gateway. OpenClaw will discover the new directory and index the `SKILL.md`.

--- a/docs/tools/skills.md
+++ b/docs/tools/skills.md
@@ -68,12 +68,199 @@ that up as `<workspace>/skills` on the next session.

 ## Security notes

-- Treat third-party skills as **untrusted code**. Read them before enabling.
+- Treat third-party skills as **untrusted** until you have reviewed them. Runtime enforcement reduces blast radius but does not eliminate risk — read a skill's SKILL.md and declared capabilities before enabling it.
+- **Capabilities**: Community skills (from ClawHub) must declare `capabilities` in `metadata.openclaw` to describe what system access they need. Skills that don't declare capabilities are treated as read-only. Undeclared dangerous tool usage (e.g., `exec` without `shell` capability) is blocked at runtime for community skills. SKILL.md content is scanned for prompt injection before entering the system prompt.
+- Local and workspace skills are exempt from capability enforcement. If someone can write to your skill folders, they can inject instructions into the system prompt — restrict who can modify them.
 - Prefer sandboxed runs for untrusted inputs and risky tools. See [Sandboxing](/gateway/sandboxing).
 - `skills.entries.*.env` and `skills.entries.*.apiKey` inject secrets into the **host** process
  for that agent turn (not the sandbox). Keep secrets out of prompts and logs.
 - For a broader threat model and checklists, see [Security](/gateway/security).

+### Tool enforcement matrix
+
+When community skills are loaded, every tool falls into one of three tiers. Enforcement is applied by a hard code gate in the before-tool-call hook — prompt injection cannot bypass it.
+
+**Always denied** — blocked unconditionally when community skills are loaded, regardless of capability declarations:
+
+| Tool | Reason |
+|------|--------|
+| `gateway` | Control-plane reconfiguration (restart, shutdown, auth changes) |
+| `nodes` | Cluster node management (add/remove devices, redirect traffic) |
+
+**Capability-gated** — blocked by default, allowed when the skill declares the matching capability in `metadata.openclaw.capabilities`:
+
+| Capability | Tools | What it unlocks |
+|------------|-------|-----------------|
+| `shell` | `exec`, `process`, `lobster` | Run shell commands and manage processes |
+| `filesystem` | `write`, `edit`, `apply_patch` | File mutations (`read` is always allowed) |
+| `network` | `web_fetch`, `web_search` | Outbound HTTP requests |
+| `browser` | `browser` | Browser automation |
+| `sessions` | `sessions_spawn`, `sessions_send`, `subagents` | Cross-session orchestration |
+| `messaging` | `message` | Send messages to configured channels |
+| `scheduling` | `cron` | Schedule recurring jobs |
+
+**Always allowed** — safe read-only or output-only tools, no capability required:
+
+| Tool | Why safe |
+|------|---------|
+| `read` | Read-only file access |
+| `memory_search`, `memory_get` | Read-only memory access |
+| `agents_list` | List agents (read-only) |
+| `sessions_list`, `sessions_history`, `session_status` | Session introspection (read-only) |
+| `canvas` | UI rendering (output-only) |
+| `image` | Image generation (output-only) |
+| `tts` | Text-to-speech (output-only) |
+
+A community skill with no capabilities declared gets access only to the always-allowed tier.
+
+### Example: correct capability declaration
+
+This skill runs shell commands and makes HTTP requests. It declares both capabilities, so OpenClaw allows the tool calls:
+
+```markdown
+---
+name: git-autopush
+description: Automate git commit, push, and PR workflows.
+metadata: { "openclaw": { "capabilities": ["shell", "network"], "requires": { "bins": ["git", "gh"] } } }
+---
+
+# git-autopush
+
+When the user asks to push their changes:
+1. Run `git add -A && git commit` via the exec tool.
+2. Run `git push` via the exec tool.
+3. If requested, create a PR using `gh pr create`.
+```
+
+`openclaw skills info git-autopush` shows:
+
+```
+git-autopush + Ready
+
+  Automate git commit, push, and PR workflows.
+
+  Source        openclaw-managed
+  Path          ~/.openclaw/skills/git-autopush/SKILL.md
+
+  Capabilities
+  >_ shell        Run shell commands
+  🌐 network      Make outbound HTTP requests
+
+  Security
+  Scan          + clean
+```
+
+### Example: missing capability declaration
+
+This skill runs shell commands but doesn't declare `shell`. OpenClaw blocks the `exec` calls at runtime:
+
+```markdown
+---
+name: deploy-helper
+description: Deploy to production.
+metadata: { "openclaw": { "requires": { "bins": ["rsync"] } } }
+---
+
+# deploy-helper
+
+When the user asks to deploy, run `rsync -avz ./dist/ user@host:/var/www/` via the exec tool.
+```
+
+This skill has no `capabilities` declared, so it's treated as read-only. When the model tries to call `exec` on behalf of this skill's instructions, OpenClaw denies it. `openclaw skills info deploy-helper` shows:
+
+```
+deploy-helper + Ready
+
+  Deploy to production.
+
+  Source        openclaw-managed
+  Path          ~/.openclaw/skills/deploy-helper/SKILL.md
+
+  Capabilities
+  (none — read-only skill)
+
+  Security
+  Scan          + clean
+```
+
+The fix is to add `"capabilities": ["shell"]` to the metadata.
+
+### Example: blocked skill (failed security scan)
+
+If a SKILL.md contains prompt injection patterns, the scan blocks it from loading entirely:
+
+```
+evil-injector x Blocked (security)
+
+  Totally harmless skill.
+
+  Source        openclaw-managed
+  Path          ~/.openclaw/skills/evil-injector/SKILL.md
+
+  Capabilities
+  >_ shell        Run shell commands
+
+  Security
+  Scan          [blocked] prompt injection detected
+```
+
+This skill never enters the system prompt. It shows as `x blocked` in `openclaw skills list`.
+
+### How the model sees skills
+
+The model does not see the full SKILL.md in the system prompt. It only sees a compact XML listing with three fields per skill: `name`, `description`, and `location` (the file path). The model then uses the `read` tool to load the full SKILL.md on demand when the task matches.
+
+This is what the model receives in the system prompt:
+
+```
+## Skills (mandatory)
+Before replying: scan <available_skills> <description> entries.
+- If exactly one skill clearly applies: read its SKILL.md at <location> with `read`, then follow it.
+- If multiple could apply: choose the most specific one, then read/follow it.
+- If none clearly apply: do not read any SKILL.md.
+Constraints: never read more than one skill up front; only read after selecting.
+
+The following skills provide specialized instructions for specific tasks.
+Use the read tool to load a skill's file when the task matches its description.
+When a skill file references a relative path, resolve it against the skill
+directory (parent of SKILL.md / dirname of the path) and use that absolute
+path in tool commands.
+
+<available_skills>
+  <skill>
+    <name>git-autopush</name>
+    <description>Automate git commit, push, and PR workflows.</description>
+    <location>/home/user/.openclaw/skills/git-autopush/SKILL.md</location>
+  </skill>
+  <skill>
+    <name>todoist-cli</name>
+    <description>Manage Todoist tasks, projects, and labels.</description>
+    <location>/home/user/.openclaw/skills/todoist-cli/SKILL.md</location>
+  </skill>
+</available_skills>
+```
+
+**What this means for skill authors:**
+
+- **`description` is your pitch** — it's the only thing the model reads to decide whether to load your skill. Make it specific and task-oriented. "Manage Todoist tasks, projects, and labels from the command line" is better than "Todoist integration."
+- **`name` must be lowercase `[a-z0-9-]`**, max 64 characters, must match the parent directory name.
+- **`description` max 1024 characters.**
+- **Your SKILL.md body is loaded on demand** — it needs to be self-contained instructions the model can follow after reading.
+- **Relative paths in SKILL.md** are resolved against the skill directory. Use relative paths to reference supporting files.
+
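The naming and length rules in the list above are easy to check before publishing. A minimal sketch, assuming a hypothetical `validateSkillFrontmatter` helper and POSIX-style paths; it mirrors the stated rules and is not the validator OpenClaw itself uses:

```typescript
// Sketch only: validateSkillFrontmatter is a hypothetical helper that mirrors
// the documented rules (name charset, 64-char limit, directory match,
// required description of at most 1024 chars).
const NAME_PATTERN = /^[a-z0-9-]+$/;
const MAX_NAME_LENGTH = 64;
const MAX_DESCRIPTION_LENGTH = 1024;

function validateSkillFrontmatter(
  name: string,
  description: string,
  skillFilePath: string, // e.g. "/home/user/.openclaw/skills/git-autopush/SKILL.md"
): string[] {
  const problems: string[] = [];

  if (!NAME_PATTERN.test(name)) {
    problems.push("name must use only lowercase letters, digits, and hyphens");
  }
  if (name.length > MAX_NAME_LENGTH) {
    problems.push(`name must be at most ${MAX_NAME_LENGTH} characters`);
  }

  // The parent directory is the path segment just before SKILL.md.
  const segments = skillFilePath.split("/");
  const parentDir = segments.length >= 2 ? segments[segments.length - 2] : "";
  if (parentDir !== name) {
    problems.push(`name must match the parent directory ("${parentDir}")`);
  }

  if (description.length === 0) {
    problems.push("description is required");
  } else if (description.length > MAX_DESCRIPTION_LENGTH) {
    problems.push(`description must be at most ${MAX_DESCRIPTION_LENGTH} characters`);
  }

  return problems;
}
```

An empty result means the frontmatter satisfies the rules listed above; each string in the result names one rule that failed.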
+The `Skill` type from `@mariozechner/pi-coding-agent`:
+
+```typescript
+interface Skill {
+  name: string;              // from frontmatter (or parent dir name)
+  description: string;       // from frontmatter (required, max 1024 chars)
+  filePath: string;          // absolute path to SKILL.md
+  baseDir: string;           // parent directory of SKILL.md
+  source: string;            // origin identifier
+  disableModelInvocation: boolean;  // if true, excluded from prompt
+}
+```
+
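Given a list of objects shaped like the `Skill` interface above, the `<available_skills>` listing could be produced roughly as follows. This is a sketch under assumptions: `renderAvailableSkills` and `escapeXml` are hypothetical helpers, and whether the real renderer escapes XML characters is not stated in the source.

```typescript
// Sketch only: a trimmed copy of the Skill fields used for the listing.
interface SkillSummary {
  name: string;
  description: string;
  filePath: string;
  disableModelInvocation: boolean;
}

// Assumption: escaping is applied so a description cannot break the XML.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

function renderAvailableSkills(skills: readonly SkillSummary[]): string {
  const entries = skills
    // Skills with disableModelInvocation are excluded from the prompt.
    .filter((skill) => !skill.disableModelInvocation)
    .map((skill) =>
      [
        "  <skill>",
        `    <name>${escapeXml(skill.name)}</name>`,
        `    <description>${escapeXml(skill.description)}</description>`,
        `    <location>${escapeXml(skill.filePath)}</location>`,
        "  </skill>",
      ].join("\n"),
    );
  return ["<available_skills>", ...entries, "</available_skills>"].join("\n");
}
```

Only `name`, `description`, and the file path reach the listing, which is why the description has to carry enough detail for the model to pick the right skill.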
 ## Format (AgentSkills + Pi-compatible)

 `SKILL.md` must include at least:
@@ -116,6 +303,7 @@ metadata:
      {
        "requires": { "bins": ["uv"], "env": ["GEMINI_API_KEY"], "config": ["browser.enabled"] },
        "primaryEnv": "GEMINI_API_KEY",
+        "capabilities": ["browser", "network"],
      },
  }
 ---
@@ -125,8 +313,18 @@ Fields under `metadata.openclaw`:

 - `always: true` — always include the skill (skip other gates).
 - `emoji` — optional emoji used by the macOS Skills UI.
-- `homepage` — optional URL shown as “Website” in the macOS Skills UI.
+- `homepage` — optional URL shown as "Website" in the macOS Skills UI.
 - `os` — optional list of platforms (`darwin`, `linux`, `win32`). If set, the skill is only eligible on those OSes.
+- `capabilities` — list of system access the skill needs. Used for security enforcement and user-facing display. Allowed values:
+  - `shell` — run shell commands (maps to `exec`, `process`, `lobster`)
+  - `filesystem` — read/write/edit files (maps to `write`, `edit`, `apply_patch`; `read` is always allowed)
+  - `network` — outbound HTTP (maps to `web_search`, `web_fetch`)
+  - `browser` — browser automation (maps to `browser`)
+  - `sessions` — cross-session orchestration (maps to `sessions_spawn`, `sessions_send`, `subagents`)
+  - `messaging` — send messages to configured channels (maps to `message`)
+  - `scheduling` — schedule recurring jobs (maps to `cron`)
+
+  No capabilities declared = read-only, model-only skill. Community skills that attempt to use a capability-gated tool without declaring the matching capability are blocked at runtime. See [Tool enforcement matrix](#tool-enforcement-matrix) above and [Security](/gateway/security) for full details.
 - `requires.bins` — list; each must exist on `PATH`.
 - `requires.anyBins` — list; at least one must exist on `PATH`.
 - `requires.env` — list; env var must exist **or** be provided in config.