Commit Graph

25 Commits

Author SHA1 Message Date
Tyler Yust
d0ac1b0195 feat: add PDF analysis tool with native provider support (#31319)
* feat: add PDF analysis tool with native provider support

New `pdf` tool for analyzing PDF documents with model-powered analysis.

Architecture:
- Native PDF path: sends raw PDF bytes directly to providers that support
  inline document input (Anthropic via DocumentBlockParam, Google Gemini
  via inlineData with application/pdf MIME type)
- Extraction fallback: for providers without native PDF support, extracts
  text via pdfjs-dist and rasterizes pages to images via @napi-rs/canvas,
  then sends through the standard vision/text completion path

Key features:
- Single PDF (`pdf` param) or multiple PDFs (`pdfs` array, up to 10)
- Page range selection (`pages` param, e.g. "1-5", "1,3,7-9")
- Model override (`model` param) and file size limits (`maxBytesMb`)
- Auto-detects provider capability and falls back gracefully
- Same security patterns as image tool (SSRF guards, sandbox support,
  local path roots, workspace-only policy)

Config (agents.defaults):
- pdfModel: primary/fallbacks (defaults to imageModel, then session model)
- pdfMaxBytesMb: max PDF file size (default: 10)
- pdfMaxPages: max pages to process (default: 20)

Model catalog:
- Extended ModelInputType to include "document" alongside "text"/"image"
- Added modelSupportsDocument() capability check

Files:
- src/agents/tools/pdf-tool.ts - main tool factory
- src/agents/tools/pdf-tool.helpers.ts - helpers (page range, config, etc.)
- src/agents/tools/pdf-native-providers.ts - direct API calls for Anthropic/Google
- src/agents/tools/pdf-tool.test.ts - 43 tests covering all paths
- Modified: model-catalog.ts, openclaw-tools.ts, config schema/types/labels/help

* fix: prepare pdf tool for merge (#31319) (thanks @tyler6204)
2026-03-01 22:39:12 -08:00
joshavant
4c5a2c3c6d Agents: inject pi auth storage from runtime profiles 2026-02-26 14:47:22 +00:00
Gustavo Madeira Santana
5239b55c0a Config: expand Kilo catalog and persist selected Kilo models (#24921)
Merged via /review-pr -> /prepare-pr -> /merge-pr.

Prepared head SHA: f5a7e1a385
Co-authored-by: gumadeiras <5599352+gumadeiras@users.noreply.github.com>
Co-authored-by: gumadeiras <5599352+gumadeiras@users.noreply.github.com>
Reviewed-by: @gumadeiras
2026-02-23 21:17:37 -05:00
Harry Cui Kepler
ffa63173e0 refactor(agents): migrate console.warn/error/info to subsystem logger (#22906)
Merged via /review-pr -> /prepare-pr -> /merge-pr.

Prepared head SHA: a806c4cb27
Co-authored-by: Kepler2024 <166882517+Kepler2024@users.noreply.github.com>
Co-authored-by: gumadeiras <5599352+gumadeiras@users.noreply.github.com>
Reviewed-by: @gumadeiras
2026-02-21 17:11:47 -05:00
Peter Steinberger
6dcc052bb4 fix: stabilize model catalog and pi discovery auth storage compatibility 2026-02-18 02:09:40 +01:00
loiie45e
07faab6ac3 openai-codex: bridge OAuth profiles into pi auth.json for model discovery (#15184) 2026-02-13 11:39:37 +00:00
Lucky
e3cb2564d7 Agents: allow gpt-5.3-codex-spark in fallback and thinking (#14990)
* Agents: allow gpt-5.3-codex-spark in fallback and thinking

* Fix: model picker issue for openai-codex/gpt-5.3-codex-spark

Fixed an issue in the model picker.
2026-02-13 11:39:22 +00:00
cpojer
5ceff756e1 chore: Enable "curly" rule to avoid single-statement if confusion/errors. 2026-01-31 16:19:20 +09:00
cpojer
15792b153f chore: Enable more lint rules, disable some that trigger a lot. Will clean up later. 2026-01-31 16:04:04 +09:00
Peter Steinberger
08ed62852a chore: update deps and pi model discovery 2026-01-31 06:45:57 +01:00
Mario Zechner
c0a6e675a3 Agents: update pi dependencies to 0.50.7 2026-01-31 04:20:12 +01:00
Peter Steinberger
9a7160786a refactor: rename to openclaw 2026-01-30 03:16:21 +01:00
Peter Steinberger
6d16a658e5 refactor: rename clawdbot to moltbot with legacy compat 2026-01-27 12:21:02 +00:00
Tyler Yust
fdecf5c59a fix: skip image understanding when primary model has vision
When the primary model supports vision natively (e.g., Claude Opus 4.5),
skip the image understanding call entirely. The image will be injected
directly into the model context instead, saving an API call and avoiding
redundant descriptions.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-25 09:57:19 +00:00
Peter Steinberger
b341512564 fix: model catalog cache + TUI editor ctor (#1326) (thanks @dougvk) 2026-01-20 22:58:41 +00:00
Doug von Kohorn
e4f9555f21 fix(model-catalog): avoid caching import failures
Move dynamic import of @mariozechner/pi-coding-agent into the try/catch so transient module resolution errors don't poison the model catalog cache with a rejected promise.

This previously caused Discord/Telegram handlers and heartbeat to fail until process restart if the import failed once.
2026-01-20 20:09:55 +01:00
Peter Steinberger
c379191f80 chore: migrate to oxlint and oxfmt
Co-authored-by: Christoph Nakazawa <christoph.pojer@gmail.com>
2026-01-14 15:02:19 +00:00
Peter Steinberger
246adaa119 chore: rename project to clawdbot 2026-01-04 14:38:51 +00:00
Jake
81f4a7cdb7 Agents: Fix Gemini schema compatibility and robust model discovery 2026-01-03 13:57:29 +01:00
Peter Steinberger
b6301c719b fix: default low thinking for reasoning models 2026-01-03 12:19:06 +00:00
Peter Steinberger
557f8e5a04 fix: restore build after deps update 2025-12-26 12:17:36 +00:00
Peter Steinberger
0709586e3a fix: support mocked model registry in catalog 2025-12-26 11:53:55 +01:00
Peter Steinberger
82ced33747 fix: align pi model discovery with auth storage 2025-12-26 11:49:13 +01:00
Peter Steinberger
267cdf20e1 style: fix biome lint 2025-12-24 00:33:35 +00:00
Peter Steinberger
364a6a9444 feat: add per-session model selection 2025-12-23 23:45:20 +00:00