Commit Graph

37 Commits

Author SHA1 Message Date
Peter Steinberger
b8b43175c5 style: align formatting with oxfmt 0.33 2026-02-18 01:34:35 +00:00
Peter Steinberger
31f9be126c style: run oxfmt and fix gate failures 2026-02-18 01:29:02 +00:00
cpojer
d0cb8c19b2 chore: wtf. 2026-02-17 13:36:48 +09:00
Sebastian
826e62a3bc fix(sessions): purge deleted transcript archives 2026-02-16 22:35:27 -05:00
cpojer
90ef2d6bdf chore: Update formatting. 2026-02-17 09:18:40 +09:00
Peter Steinberger
12a947223b fix(ci): restore main checks after bulk merges 2026-02-16 23:47:27 +00:00
Peter Steinberger
eaa2f7a7bf fix(ci): restore main lint/typecheck after direct merges 2026-02-16 23:26:11 +00:00
Winston
94eecaa446 fix: atomic session store writes to prevent context loss on Windows
On Windows, fs.promises.writeFile truncates the target file to 0 bytes
before writing. Since loadSessionStore reads the file synchronously
without holding the write lock, a concurrent read can observe the empty
file, fail to parse it, and fall through to an empty store — causing the
agent to lose its session context.

Changes:
- saveSessionStoreUnlocked (Windows path): write to a temp file first,
  then rename it onto the target. If rename fails due to file locking,
  retry 3 times with backoff, then fall back to copyFile (which
  overwrites in-place without truncating to 0 bytes).
- loadSessionStore: on Windows, retry up to 3 times with 50ms
  synchronous backoff (via Atomics.wait) when the file is empty or
  unparseable, giving the writer time to finish. SharedArrayBuffer is
  allocated once and reused across retry attempts.
2026-02-16 23:57:21 +01:00
Hudson
93fbe6482b fix(sessions): archive transcript files when pruning stale entries
pruneStaleEntries() removed entries from sessions.json but left the
corresponding .jsonl transcript files on disk indefinitely.

Added an onPruned callback to collect pruned session IDs, then
archives their transcript files via archiveSessionTranscripts()
after pruning completes. Only runs in enforce mode.
2026-02-16 23:50:56 +01:00
大猫子
0931a35709 fix(sessions): guard withSessionStoreLock against undefined storePath (#14717) (openclaw#14755) thanks @lailoo
Verified:
- pnpm build
- pnpm check
- pnpm test:macmini

Co-authored-by: lailoo <20536249+lailoo@users.noreply.github.com>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-02-15 07:57:51 -06:00
Kentaro Kuribayashi
c6ecd2a044 fix: replace file-based session store lock with in-process Promise chain mutex (#14498)
* fix: replace file-based session store lock with in-process Promise chain mutex

Node.js is single-threaded, so file-based locking (open('wx') + polling +
stale eviction) is unnecessary and causes timeouts under heavy session load.

Replace with a simple per-storePath Promise chain that serializes access
without any filesystem overhead.

In a 1159-session environment over 3 hours:
- Lock timeouts: 25
- Stuck sessions: 157 (max 1031s, avg 388s)
- Slow listeners: 39 (max 265s, avg 70s)

Root cause: during sessions.json file I/O, await yields control and other
lock requests hit the 10s timeout waiting for the .lock file to be released.

* test: add comprehensive tests for Promise chain mutex lock

- Concurrent access serialization (10 parallel writers, counter integrity)
- Error resilience (single & multiple consecutive throws don't poison queue)
- Independent storePath parallelism (different paths run concurrently)
- LOCK_QUEUES cleanup after completion and after errors
- No .lock file created on disk

Also fix: store caught promise in LOCK_QUEUES to avoid unhandled rejection
warnings when queued fn() throws.

* fix: add timeout to Promise chain mutex to prevent infinite hangs on Windows

* fix(session-store): enforce strict queue timeout + cross-process lock

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-02-13 05:12:59 +01:00
hyf0-agent
5c32989f53 perf: use JSON.parse instead of JSON5.parse for sessions.json (~35x faster) (#14530)
Co-authored-by: hyf0-agent <hyf0-agent@users.noreply.github.com>
2026-02-12 17:10:21 +09:00
Gustavo Madeira Santana
e19a23520c fix: unify session maintenance and cron run pruning (#13083)
* fix: prune stale session entries, cap entry count, and rotate sessions.json

The sessions.json file grows unbounded over time. Every heartbeat tick (default: 30m)
triggers multiple full rewrites, and session keys from groups, threads, and DMs
accumulate indefinitely with large embedded objects (skillsSnapshot,
systemPromptReport). At >50MB the synchronous JSON parse blocks the event loop,
causing Telegram webhook timeouts and effectively taking the bot down.

Three mitigations, all running inside saveSessionStoreUnlocked() on every write:

1. Prune stale entries: remove entries with updatedAt older than 30 days
   (configurable via session.maintenance.pruneDays in openclaw.json)

2. Cap entry count: keep only the 500 most recently updated entries
   (configurable via session.maintenance.maxEntries). Entries without updatedAt
   are evicted first.

3. File rotation: if the existing sessions.json exceeds 10MB before a write,
   rename it to sessions.json.bak.{timestamp} and keep only the 3 most recent
   backups (configurable via session.maintenance.rotateBytes).

All three thresholds are configurable under session.maintenance in openclaw.json
with Zod validation. No env vars.

Existing tests updated to use Date.now() instead of epoch-relative timestamps
(1, 2, 3) that would be incorrectly pruned as stale.

27 new tests covering pruning, capping, rotation, and integration scenarios.

* feat: auto-prune expired cron run sessions (#12289)

Add TTL-based reaper for isolated cron run sessions that accumulate
indefinitely in sessions.json.

New config option:
  cron.sessionRetention: string | false  (default: '24h')

The reaper runs piggy-backed on the cron timer tick, self-throttled
to sweep at most every 5 minutes. It removes session entries matching
the pattern cron:<jobId>:run:<uuid> whose updatedAt + retention < now.

Design follows the Kubernetes ttlSecondsAfterFinished pattern:
- Sessions are persisted normally (observability/debugging)
- A periodic reaper prunes expired entries
- Configurable retention with sensible default
- Set to false to disable pruning entirely

Files changed:
- src/config/types.cron.ts: Add sessionRetention to CronConfig
- src/config/zod-schema.ts: Add Zod validation for sessionRetention
- src/cron/session-reaper.ts: New reaper module (sweepCronRunSessions)
- src/cron/session-reaper.test.ts: 12 tests covering all paths
- src/cron/service/state.ts: Add cronConfig/sessionStorePath to deps
- src/cron/service/timer.ts: Wire reaper into onTimer tick
- src/gateway/server-cron.ts: Pass config and session store path to deps

Closes #12289

* fix: sweep cron session stores per agent

* docs: add changelog for session maintenance (#13083) (thanks @skyfallsin, @Glucksberg)

* fix: add warn-only session maintenance mode

* fix: warn-only maintenance defaults to active session

* fix: deliver maintenance warnings to active session

* docs: add session maintenance examples

* fix: accept duration and size maintenance thresholds

* refactor: share cron run session key check

* fix: format issues and replace defaultRuntime.warn with console.warn

---------

Co-authored-by: Pradeep Elankumaran <pradeepe@gmail.com>
Co-authored-by: Glucksberg <markuscontasul@gmail.com>
Co-authored-by: max <40643627+quotentiroler@users.noreply.github.com>
Co-authored-by: quotentiroler <max.nussbaumer@maxhealth.tech>
2026-02-09 20:42:35 -08:00
Ayaan Zaidi
d85f0566a9 fix: tighten thread-clear and telegram retry guards 2026-02-09 08:59:21 +05:30
Ayaan Zaidi
d7bd68ff24 fix: recover telegram sends from stale thread ids 2026-02-09 08:59:21 +05:30
cpojer
f06dd8df06 chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
cpojer
5ceff756e1 chore: Enable "curly" rule to avoid single-statement if confusion/errors. 2026-01-31 16:19:20 +09:00
cpojer
15792b153f chore: Enable more lint rules, disable some that trigger a lot. Will clean up later. 2026-01-31 16:04:04 +09:00
Ayaan Zaidi
310eed825e fix: preserve delivery thread fallback (#4911) (thanks @yevhen) 2026-01-31 09:31:40 +05:30
Yevhen Bobrov
a642ca4ea8 Fix telegram threadId in deliveryContext 2026-01-31 09:31:40 +05:30
Peter Steinberger
9a7160786a refactor: rename to openclaw 2026-01-30 03:16:21 +01:00
Peter Steinberger
975f5a5284 fix: guard session store against array corruption 2026-01-24 04:51:46 +00:00
Peter Steinberger
02ca148583 fix: preserve subagent thread routing (#1241)
Thanks @gnarco.

Co-authored-by: gnarco <gnarco@users.noreply.github.com>
2026-01-20 17:22:07 +00:00
Peter Steinberger
744d1329cb feat: make inbound envelopes configurable
Co-authored-by: Shiva Prasad <shiv19@users.noreply.github.com>
2026-01-18 18:50:37 +00:00
Peter Steinberger
82e49af5a7 fix: resolve plugin tool meta typing 2026-01-18 04:24:16 +00:00
Peter Steinberger
0d9172d761 fix: persist session origin metadata 2026-01-18 03:41:51 +00:00
Peter Steinberger
34590d2144 feat: persist session origin metadata across connectors 2026-01-18 02:42:10 +00:00
Peter Steinberger
bbb71c9198 refactor: prune legacy group targets 2026-01-17 09:01:47 +00:00
Peter Steinberger
1f3a09b43b fix: persist deliveryContext on last-route updates
Co-authored-by: Adam Holt <mail@adamholt.co.nz>
2026-01-17 06:54:31 +00:00
Peter Steinberger
65a8a93854 fix: normalize delivery routing context
Co-authored-by: adam91holt <adam91holt@users.noreply.github.com>
2026-01-17 06:38:33 +00:00
Peter Steinberger
285ed8bac3 fix: sync delivery routing context
Co-authored-by: adam91holt <adam91holt@users.noreply.github.com>
2026-01-17 06:02:27 +00:00
Peter Steinberger
f4f20c6762 refactor: normalize session route fields
Co-authored-by: adam91holt <adam91holt@users.noreply.github.com>
2026-01-17 05:29:06 +00:00
Yurii Chukhlib
cf72b9db3c fix(sessions): preserve 0600 permissions on sessions.json writes 2026-01-16 23:14:33 +00:00
Peter Steinberger
688a0ce439 refactor: harden session store updates
Co-authored-by: Tyler Yust <tyler6204@users.noreply.github.com>
2026-01-15 23:41:34 +00:00
Peter Steinberger
1c96477686 fix: harden session cache + heartbeat restore
Co-authored-by: Ronak Guliani <ronak-guliani@users.noreply.github.com>
2026-01-15 07:07:12 +00:00
Peter Steinberger
c379191f80 chore: migrate to oxlint and oxfmt
Co-authored-by: Christoph Nakazawa <christoph.pojer@gmail.com>
2026-01-14 15:02:19 +00:00
Peter Steinberger
bcbfb357be refactor(src): split oversized modules 2026-01-14 01:17:56 +00:00