feat(security): add client-side skill security enforcement

Add a capability-based security model for community skills, inspired by
how mobile and Apple ecosystem apps declare capabilities upfront. This is
not a silver bullet for prompt injection, but it's a significant step up
from the status quo and encourages responsible developer practices by
making capability requirements explicit and visible.

Runtime enforcement for community skills installed from ClawHub:

- Capability declarations (shell, filesystem, network, browser, sessions)
  parsed from SKILL.md frontmatter and enforced at tool-call time
- Static SKILL.md scanner detecting prompt injection patterns, suspicious
  constructs, and capability mismatches
- Global skill security context tracking loaded community skills and
  their aggregate capabilities
- Before-tool-call enforcement gate blocking undeclared tool usage
- Command-dispatch capability check preventing shell/filesystem access
  without explicit declaration
- Trust tier classification (builtin/community/local) — only community
  skills are subject to enforcement
- System prompt trust context warning for skills with scan warnings or
  missing capability declarations
- CLI: `skills list -v`, `skills info`, `skills check` now surface
  capabilities, scan results, and security status
- TUI security log panel for skill enforcement events
- Docs updated across 7 files covering the full security model

Companion PR: openclaw/clawhub (capability visibility + UI badges)
This commit is contained in:
theonejvo
2026-02-17 02:26:41 +11:00
parent 602a1ebd55
commit 2c61fb69c1
29 changed files with 1571 additions and 120 deletions

View File

@@ -81,9 +81,15 @@ A typical skill includes:
- A `SKILL.md` file with the primary description and usage.
- Optional configs, scripts, or supporting files used by the skill.
- Metadata such as tags, summary, and install requirements.
- Metadata such as tags, summary, install requirements, and capabilities.
ClawHub uses metadata to power discovery and display skill capabilities.
Skills declare what system access they need via `capabilities` in frontmatter
(e.g., `shell`, `filesystem`, `network`). OpenClaw enforces these at runtime —
community skills that use tools without declaring the matching capability are
blocked. See [Skills](/tools/skills#gating-load-time-filters) for the
full capability reference.
ClawHub uses metadata to power discovery and safely expose skill capabilities.
The registry also tracks usage signals (such as stars and downloads) to improve
ranking and visibility.
@@ -103,7 +109,17 @@ ClawHub is open by default. Anyone can upload skills, but a GitHub account must
be at least one week old to publish. This helps slow down abuse without blocking
legitimate contributors.
Reporting and moderation:
### Capabilities and enforcement
Skills declare `capabilities` in their SKILL.md frontmatter to describe what
system access they need. ClawHub displays these to users before install.
OpenClaw enforces them at runtime — community skills that attempt to use tools
without the matching declared capability are blocked. Skills with no capabilities
are treated as read-only (model-only instructions, no tool access).
Available capabilities: `shell`, `filesystem`, `network`, `browser`, `sessions`.
### Reporting and moderation
- Any signed in user can report a skill.
- Report reasons are required and recorded.

View File

@@ -35,11 +35,27 @@ description: A simple skill that says hello.
When the user asks for a greeting, use the `echo` tool to say "Hello from your custom skill!".
```
### 3. Add Tools (Optional)
### 3. Declare Capabilities
If your skill uses system tools, declare them in the `metadata.openclaw.capabilities` field:
```markdown
---
name: deploy_helper
description: Automate deployment workflows.
metadata: { "openclaw": { "capabilities": ["shell", "filesystem"] } }
---
```
Available capabilities: `shell`, `filesystem`, `network`, `browser`, `sessions`.
Skills without capabilities are treated as read-only (model-only instructions). Community skills published to ClawHub **must** declare capabilities matching their tool usage — undeclared capabilities are blocked at runtime.
### 4. Add Tools (Optional)
You can define custom tools in the frontmatter or instruct the agent to use existing system tools (like `bash` or `browser`).
### 4. Refresh OpenClaw
### 5. Refresh OpenClaw
Ask your agent to "refresh skills" or restart the gateway. OpenClaw will discover the new directory and index the `SKILL.md`.

View File

@@ -68,12 +68,199 @@ that up as `<workspace>/skills` on the next session.
## Security notes
- Treat third-party skills as **untrusted code**. Read them before enabling.
- Treat third-party skills as **untrusted** until you have reviewed them. Runtime enforcement reduces blast radius but does not eliminate risk — read a skill's SKILL.md and declared capabilities before enabling it.
- **Capabilities**: Community skills (from ClawHub) must declare `capabilities` in `metadata.openclaw` to describe what system access they need. Skills that don't declare capabilities are treated as read-only. Undeclared dangerous tool usage (e.g., `exec` without `shell` capability) is blocked at runtime for community skills. SKILL.md content is scanned for prompt injection before entering the system prompt.
- Local and workspace skills are exempt from capability enforcement. If someone can write to your skill folders, they can inject instructions into the system prompt — restrict who can modify them.
- Prefer sandboxed runs for untrusted inputs and risky tools. See [Sandboxing](/gateway/sandboxing).
- `skills.entries.*.env` and `skills.entries.*.apiKey` inject secrets into the **host** process
for that agent turn (not the sandbox). Keep secrets out of prompts and logs.
- For a broader threat model and checklists, see [Security](/gateway/security).
### Tool enforcement matrix
When community skills are loaded, every tool falls into one of three tiers. Enforcement is applied by a hard code gate in the before-tool-call hook — prompt injection cannot bypass it.
**Always denied** — blocked unconditionally when community skills are loaded, regardless of capability declarations:
| Tool | Reason |
|------|--------|
| `gateway` | Control-plane reconfiguration (restart, shutdown, auth changes) |
| `nodes` | Cluster node management (add/remove devices, redirect traffic) |
**Capability-gated** — blocked by default, allowed when the skill declares the matching capability in `metadata.openclaw.capabilities`:
| Capability | Tools | What it unlocks |
|------------|-------|-----------------|
| `shell` | `exec`, `process`, `lobster` | Run shell commands and manage processes |
| `filesystem` | `write`, `edit`, `apply_patch` | File mutations (`read` is always allowed) |
| `network` | `web_fetch`, `web_search` | Outbound HTTP requests |
| `browser` | `browser` | Browser automation |
| `sessions` | `sessions_spawn`, `sessions_send`, `subagents` | Cross-session orchestration |
| `messaging` | `message` | Send messages to configured channels |
| `scheduling` | `cron` | Schedule recurring jobs |
**Always allowed** — safe read-only or output-only tools, no capability required:
| Tool | Why safe |
|------|---------|
| `read` | Read-only file access |
| `memory_search`, `memory_get` | Read-only memory access |
| `agents_list` | List agents (read-only) |
| `sessions_list`, `sessions_history`, `session_status` | Session introspection (read-only) |
| `canvas` | UI rendering (output-only) |
| `image` | Image generation (output-only) |
| `tts` | Text-to-speech (output-only) |
A community skill with no capabilities declared gets access only to the always-allowed tier.
### Example: correct capability declaration
This skill runs shell commands and makes HTTP requests. It declares both capabilities, so OpenClaw allows the tool calls:
```markdown
---
name: git-autopush
description: Automate git commit, push, and PR workflows.
metadata: { "openclaw": { "capabilities": ["shell", "network"], "requires": { "bins": ["git", "gh"] } } }
---
# git-autopush
When the user asks to push their changes:
1. Run `git add -A && git commit` via the exec tool.
2. Run `git push` via the exec tool.
3. If requested, create a PR using `gh pr create`.
```
`openclaw skills info git-autopush` shows:
```
git-autopush + Ready
Automate git commit, push, and PR workflows.
Source openclaw-managed
Path ~/.openclaw/skills/git-autopush/SKILL.md
Capabilities
>_ shell Run shell commands
🌐 network Make outbound HTTP requests
Security
Scan + clean
```
### Example: missing capability declaration
This skill runs shell commands but doesn't declare `shell`. OpenClaw blocks the `exec` calls at runtime:
```markdown
---
name: deploy-helper
description: Deploy to production.
metadata: { "openclaw": { "requires": { "bins": ["rsync"] } } }
---
# deploy-helper
When the user asks to deploy, run `rsync -avz ./dist/ user@host:/var/www/` via the exec tool.
```
This skill has no `capabilities` declared, so it's treated as read-only. When the model tries to call `exec` on behalf of this skill's instructions, OpenClaw denies it. `openclaw skills info deploy-helper` shows:
```
deploy-helper + Ready
Deploy to production.
Source openclaw-managed
Path ~/.openclaw/skills/deploy-helper/SKILL.md
Capabilities
(none — read-only skill)
Security
Scan + clean
```
The fix is to add `"capabilities": ["shell"]` to the metadata.
### Example: blocked skill (failed security scan)
If a SKILL.md contains prompt injection patterns, the scan blocks it from loading entirely:
```
evil-injector x Blocked (security)
Totally harmless skill.
Source openclaw-managed
Path ~/.openclaw/skills/evil-injector/SKILL.md
Capabilities
>_ shell Run shell commands
Security
Scan [blocked] prompt injection detected
```
This skill never enters the system prompt. It shows as `x blocked` in `openclaw skills list`.
### How the model sees skills
The model does not see the full SKILL.md in the system prompt. It only sees a compact XML listing with three fields per skill: `name`, `description`, and `location` (the file path). The model then uses the `read` tool to load the full SKILL.md on demand when the task matches.
This is what the model receives in the system prompt:
```
## Skills (mandatory)
Before replying: scan <available_skills> <description> entries.
- If exactly one skill clearly applies: read its SKILL.md at <location> with `read`, then follow it.
- If multiple could apply: choose the most specific one, then read/follow it.
- If none clearly apply: do not read any SKILL.md.
Constraints: never read more than one skill up front; only read after selecting.
The following skills provide specialized instructions for specific tasks.
Use the read tool to load a skill's file when the task matches its description.
When a skill file references a relative path, resolve it against the skill
directory (parent of SKILL.md / dirname of the path) and use that absolute
path in tool commands.
<available_skills>
<skill>
<name>git-autopush</name>
<description>Automate git commit, push, and PR workflows.</description>
<location>/home/user/.openclaw/skills/git-autopush/SKILL.md</location>
</skill>
<skill>
<name>todoist-cli</name>
<description>Manage Todoist tasks, projects, and labels.</description>
<location>/home/user/.openclaw/skills/todoist-cli/SKILL.md</location>
</skill>
</available_skills>
```
**What this means for skill authors:**
- **`description` is your pitch** — it's the only thing the model reads to decide whether to load your skill. Make it specific and task-oriented. "Manage Todoist tasks, projects, and labels from the command line" is better than "Todoist integration."
- **`name` must be lowercase `[a-z0-9-]`**, max 64 characters, must match the parent directory name.
- **`description` max 1024 characters.**
- **Your SKILL.md body is loaded on demand** — it needs to be self-contained instructions the model can follow after reading.
- **Relative paths in SKILL.md** are resolved against the skill directory. Use relative paths to reference supporting files.
The `Skill` type from `@mariozechner/pi-coding-agent`:
```typescript
interface Skill {
name: string; // from frontmatter (or parent dir name)
description: string; // from frontmatter (required, max 1024 chars)
filePath: string; // absolute path to SKILL.md
baseDir: string; // parent directory of SKILL.md
source: string; // origin identifier
disableModelInvocation: boolean; // if true, excluded from prompt
}
```
## Format (AgentSkills + Pi-compatible)
`SKILL.md` must include at least:
@@ -116,6 +303,7 @@ metadata:
{
"requires": { "bins": ["uv"], "env": ["GEMINI_API_KEY"], "config": ["browser.enabled"] },
"primaryEnv": "GEMINI_API_KEY",
"capabilities": ["browser", "network"],
},
}
---
@@ -125,8 +313,18 @@ Fields under `metadata.openclaw`:
- `always: true` — always include the skill (skip other gates).
- `emoji` — optional emoji used by the macOS Skills UI.
- `homepage` — optional URL shown as Website in the macOS Skills UI.
- `homepage` — optional URL shown as "Website" in the macOS Skills UI.
- `os` — optional list of platforms (`darwin`, `linux`, `win32`). If set, the skill is only eligible on those OSes.
- `capabilities` — list of system access the skill needs. Used for security enforcement and user-facing display. Allowed values:
- `shell` — run shell commands (maps to `exec`, `process`)
- `filesystem` — read/write/edit files (maps to `write`, `edit`, `apply_patch`; `read` is always allowed)
- `network` — outbound HTTP (maps to `web_search`, `web_fetch`)
- `browser` — browser automation (maps to `browser`)
- `sessions` — cross-session orchestration (maps to `sessions_spawn`, `sessions_send`, `subagents`)
- `messaging` — send messages to configured channels (maps to `message`)
- `scheduling` — schedule recurring jobs (maps to `cron`)
No capabilities declared = read-only, model-only skill. Community skills with undeclared capabilities that attempt to use dangerous tools will be blocked at runtime. See [Tool enforcement matrix](#tool-enforcement-matrix) below and [Security](/gateway/security) for full details.
- `requires.bins` — list; each must exist on `PATH`.
- `requires.anyBins` — list; at least one must exist on `PATH`.
- `requires.env` — list; env var must exist **or** be provided in config.