Mirror of https://github.com/openclaw/openclaw.git (synced 2026-04-19 08:58:37 +00:00)
fix: address code review feedback - move test data, fix patterns, rewrite docs as RFC
Committed by: Peter Steinberger
Parent: 5801c4f983
Commit: eec1f3e9db
@@ -1,29 +1,25 @@
|
|||||||
# Telegram Outbound Sanitizer
|
# Telegram Outbound Sanitizer (RFC)
|
||||||
|
|
||||||
This document describes the Telegram outbound sanitizer behavior for preventing internal diagnostics and wrapper artifacts from reaching end users.
|
> **Status**: Proposal / Request for Comments
|
||||||
|
>
|
||||||
|
> This document proposes a sanitization layer for Telegram outbound messages. The accompanying test corpus (`src/telegram/test-data/telegram-leak-cases.json`) defines the expected behavior for a future implementation.
|
||||||
|
|
||||||
## Overview
|
## Overview
|
||||||
|
|
||||||
The sanitizer intercepts Telegram outbound messages and:
|
The sanitizer would intercept Telegram outbound messages and:
|
||||||
|
|
||||||
1. Strips wrapper artifacts (`<reply>`, `<NO_REPLY>`, `<tool_schema>`, etc.)
|
1. Strip wrapper artifacts (`<reply>`, `<NO_REPLY>`, `<tool_schema>`, etc.)
|
||||||
2. Drops internal diagnostics (error codes, run IDs, gateway details)
|
2. Drop internal diagnostics (error codes, run IDs, gateway details)
|
||||||
3. Returns static responses for unknown slash commands
|
3. Return static responses for unknown slash commands
|
||||||
|
|
||||||
## Marker Families
|
## Leakage Patterns to Block
|
||||||
|
|
||||||
Static checks verify these marker families:
|
|
||||||
|
|
||||||
- `OPENCLAW_TELEGRAM_OUTBOUND_SANITIZER`
|
|
||||||
- `OPENCLAW_TELEGRAM_INTERNAL_ERROR_SUPPRESSOR`
|
|
||||||
|
|
||||||
## Leakage Patterns Blocked
|
|
||||||
|
|
||||||
### Tool/Runtime Leakage
|
### Tool/Runtime Leakage
|
||||||
|
|
||||||
- `tool call validation failed`
|
- `tool call validation failed`
|
||||||
- `not in request.tools`
|
- `not in request.tools`
|
||||||
- `sessions_send` templates / `function_call`
|
- `sessions_send` templates
|
||||||
|
- `"type": "function_call"` JSON scaffolding
|
||||||
- `Run ID`, `Status: error`, gateway timeout/connect details
|
- `Run ID`, `Status: error`, gateway timeout/connect details
|
||||||
|
|
||||||
### Media/Tool Scaffolding
|
### Media/Tool Scaffolding
|
||||||
@@ -34,38 +30,52 @@ Static checks verify these marker families:
|
|||||||
### Sentinel/Garbage Markers
|
### Sentinel/Garbage Markers
|
||||||
|
|
||||||
- `NO_CONTEXT`, `NOCONTENT`, `NO_MESSAGE_CONTENT_HERE`
|
- `NO_CONTEXT`, `NOCONTENT`, `NO_MESSAGE_CONTENT_HERE`
|
||||||
- `NO_DATA_FOUND`, `NO_API_KEY`
|
- `NO_DATA`, `NO_API_KEY`
|
||||||
|
|
||||||
## Enforced Behavior
|
## Proposed Behavior
|
||||||
|
|
||||||
1. **Unknown slash commands** → static text response
|
1. **Unknown slash commands** → static text response (`"Unknown command. Use /help."`)
|
||||||
2. **Unknown slash commands** → does NOT call LLM
|
2. **Unknown slash commands** → does NOT call LLM
|
||||||
3. **Telegram output** → never emits tool diagnostics/internal runtime details
|
3. **Telegram output** → never emits tool diagnostics/internal runtime details
|
||||||
4. **Optional debug override** → owner-only with `TELEGRAM_DEBUG=true`
|
4. **Optional debug override** → owner-only (configurable)
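
The first two behaviors above could be sketched roughly as follows. This is an illustration under stated assumptions, not the project's implementation: `KNOWN_COMMANDS`, `routeInbound`, and the `Route` type are hypothetical names, and the set of known commands is invented for the example. Only the static reply text comes from the proposal.

```typescript
// Hypothetical sketch: short-circuit unknown slash commands before any LLM call.
// All names here are assumptions made for illustration.
const UNKNOWN_COMMAND_REPLY = "Unknown command. Use /help.";

// Assumed set of commands the bot handles itself.
const KNOWN_COMMANDS = new Set<string>(["/help", "/start"]);

type Route =
  | { kind: "static"; text: string } // reply directly, no LLM involved
  | { kind: "command"; name: string } // dispatch to a known command handler
  | { kind: "llm"; text: string }; // ordinary message, forward to the model

function routeInbound(text: string): Route {
  const trimmed = text.trim();
  if (!trimmed.startsWith("/")) {
    return { kind: "llm", text: trimmed };
  }
  // Take the first token and drop an optional "@botname" suffix.
  const [firstToken = ""] = trimmed.split(/\s+/);
  const [rawName = ""] = firstToken.split("@");
  const name = rawName.toLowerCase();
  if (KNOWN_COMMANDS.has(name)) {
    return { kind: "command", name };
  }
  // Unknown command: static text only, the model is never consulted.
  return { kind: "static", text: UNKNOWN_COMMAND_REPLY };
}
```

Routing before the model is invoked is what makes the "does NOT call LLM" requirement checkable: a test can assert on the returned `kind` without mocking any model client.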
|
||||||
|
|
||||||
## Verification
|
## Test Corpus
|
||||||
|
|
||||||
Run the leak corpus tests:
|
The test corpus at `src/telegram/test-data/telegram-leak-cases.json` defines:
|
||||||
|
|
||||||
|
- `expect: "allow"` - Messages that should pass through unchanged
|
||||||
|
- `expect: "drop"` - Messages that should be blocked entirely
|
||||||
|
- `expect: "strip_wrapper"` - Messages that need wrapper tags removed
|
||||||
|
|
||||||
|
### Example Test Cases
|
||||||
|
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"id": "diag_tool_validation_failed",
|
||||||
|
"text": "tool call validation failed",
|
||||||
|
"expect": "drop",
|
||||||
|
"description": "Tool runtime error should not reach users"
|
||||||
|
}
|
||||||
|
```
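
A test harness could consume corpus entries of this shape along the following lines. The field names are taken from the example above; the `SanitizeOutcome` type, the `findMismatches` helper, and the idea that a sanitizer reports an `action` are assumptions made for this sketch, since the RFC does not define the sanitizer's interface.

```typescript
// Shape of one corpus entry, inferred from the example test case.
type Expectation = "allow" | "drop" | "strip_wrapper";

interface LeakCase {
  id: string;
  text: string;
  expect: Expectation;
  description: string;
}

// Assumed result type: what a future sanitizer might report for one message.
interface SanitizeOutcome {
  action: Expectation;
  text: string;
}

// Returns the ids of cases whose observed action differs from the expected one.
function findMismatches(
  cases: LeakCase[],
  sanitize: (text: string) => SanitizeOutcome,
): string[] {
  return cases
    .filter((c) => sanitize(c.text).action !== c.expect)
    .map((c) => c.id);
}
```

Taking the sanitizer as a parameter keeps the harness independent of whichever implementation eventually lands; an empty result means every corpus case behaved as expected.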
|
||||||
|
|
||||||
|
## Implementation Guidance
|
||||||
|
|
||||||
|
When implementing the sanitizer:
|
||||||
|
|
||||||
|
- Run sanitization after LLM response, before Telegram API send
|
||||||
|
- Empty payloads after sanitization should return a safe fallback message
|
||||||
|
- Preserve return shape `{ queuedFinal, counts }` for caller compatibility
|
||||||
|
- Use specific patterns (e.g., `"type": "function_call"` not just `function_call`) to avoid false positives
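
As a rough illustration of the guidance above, a sanitizer might look like the sketch below. Everything here is hypothetical: `sanitizeOutbound`, `FALLBACK_TEXT` and its wording, and the choice to treat a dropped message like an empty payload are assumptions, and only a few of the patterns listed in this document are included. The `{ queuedFinal, counts }` return shape is not modeled because the RFC does not describe its fields.

```typescript
// Hypothetical sketch of the proposed sanitizer; not the project's implementation.
const FALLBACK_TEXT = "Sorry, something went wrong. Please try again."; // assumed wording

// A subset of the diagnostic patterns listed in this document.
// No "g" flag here, so RegExp.test stays stateless across calls.
const DROP_PATTERNS: RegExp[] = [
  /tool call validation failed/i,
  /not in request\.tools/i,
  /"type"\s*:\s*"function_call"/, // JSON scaffolding only, not the bare word
];

// Wrapper tags to strip; only <reply> and <NO_REPLY> are shown here.
const WRAPPER_TAG = /<\/?(?:reply|NO_REPLY)>/g;

function sanitizeOutbound(text: string): string {
  // Drop the whole message if it carries internal diagnostics.
  if (DROP_PATTERNS.some((pattern) => pattern.test(text))) {
    return FALLBACK_TEXT;
  }
  // Otherwise remove wrapper tags and keep the user-facing content.
  const stripped = text.replace(WRAPPER_TAG, "").trim();
  // An empty payload after stripping falls back to a safe message.
  return stripped.length > 0 ? stripped : FALLBACK_TEXT;
}
```

Matching `"type": "function_call"` rather than the bare word means a legitimate reply that merely mentions `function_call` passes through unchanged, which is the false-positive case the last bullet warns about.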
|
||||||
|
|
||||||
|
## Validation
|
||||||
|
|
||||||
|
Once implemented, validate with:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# Run leak case corpus validation
|
# Run tests against the leak corpus
|
||||||
pnpm test src/telegram/sanitizer.test.ts
|
pnpm vitest run src/telegram/sanitizer.test.ts
|
||||||
|
|
||||||
# Manual smoke check
|
# Manual smoke check
|
||||||
# In any Telegram chat: /unknown_command
|
# In any Telegram chat: /unknown_command
|
||||||
# Expected: "Unknown command. Use /help."
|
# Expected: "Unknown command. Use /help."
|
||||||
```
|
```
|
||||||
|
|
||||||
## Test Corpus
|
|
||||||
|
|
||||||
The test corpus at `tests/data/telegram_leak_cases.json` contains:
|
|
||||||
|
|
||||||
- `expect: "allow"` - Messages that should pass through
|
|
||||||
- `expect: "drop"` - Messages that should be blocked
|
|
||||||
- `expect: "strip_wrapper"` - Messages that need wrapper removal
|
|
||||||
|
|
||||||
## Implementation Notes
|
|
||||||
|
|
||||||
- Sanitization runs after LLM response, before Telegram API send
|
|
||||||
- Empty payloads after sanitization return fallback message
|
|
||||||
- Return shape `{ queuedFinal, counts }` is preserved for caller safety
|
|
||||||
|
|||||||
@@ -132,10 +132,10 @@
|
|||||||
"description": "Internal tool name leak"
|
"description": "Internal tool name leak"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "diag_function_call",
|
"id": "diag_function_call_json",
|
||||||
"text": "function_call",
|
"text": "\"type\": \"function_call\"",
|
||||||
"expect": "drop",
|
"expect": "drop",
|
||||||
"description": "Function call scaffolding leak"
|
"description": "JSON function_call scaffolding leak (specific pattern to avoid false positives)"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "diag_media_prefix",
|
"id": "diag_media_prefix",
|
||||||