Commit Graph

2767 Commits

Author SHA1 Message Date
echoVic
9176571ec1 fix(gemini): sanitize thoughtSignatures for native Google provider
Native Google Gemini provider was accumulating 2K-8K tokens of Base64
thoughtSignature blobs per turn, causing premature context overflow.

The sanitizer was only enabled for OpenRouter Gemini, not native Google.

Fixes #23392
2026-02-22 12:24:53 +01:00
Peter Steinberger
78c3c2a542 fix: stabilize flaky tests and sanitize directive-only chat tags 2026-02-22 12:19:33 +01:00
Peter Steinberger
7d09a9e74d test: update agent tool assertions and reclassify suites 2026-02-22 11:18:50 +00:00
Peter Steinberger
fcb86408fd test: move embedded and tool agent suites out of e2e 2026-02-22 11:17:47 +00:00
Peter Steinberger
e441390fd1 test: reclassify agent local suites out of e2e 2026-02-22 11:16:37 +00:00
Peter Steinberger
713e2928b2 test: move duplicate local scenario suites out of agents e2e 2026-02-22 10:56:58 +00:00
Peter Steinberger
bfada9e425 test: move more local agents helper suites out of e2e 2026-02-22 10:55:22 +00:00
Peter Steinberger
4267fc8593 test: reclassify pi embedded helper suites out of agents e2e 2026-02-22 10:53:50 +00:00
Peter Steinberger
adace58505 test: reclassify local helper suites out of agents e2e 2026-02-22 10:53:40 +00:00
Peter Steinberger
1d4e9ad8d1 test: reclassify remaining bash suites as unit tests 2026-02-22 10:48:32 +00:00
Peter Steinberger
ab38e1e6b2 test: reclassify image tool suite as unit test 2026-02-22 10:47:16 +00:00
Peter Steinberger
aa487bd4f3 test: reclassify bash pty suites as unit tests 2026-02-22 10:47:10 +00:00
Peter Steinberger
3c9f98452e test: reclassify tool-result persist hook suite as unit test 2026-02-22 10:46:02 +00:00
Peter Steinberger
047e18693e test: reclassify exec approval-id suite as unit test 2026-02-22 10:45:23 +00:00
Peter Steinberger
17a65a6f4c test: split pure docker exec arg checks from bash e2e suite 2026-02-22 10:44:40 +00:00
Peter Steinberger
239963ac44 perf(test): shrink bash command fixtures and polling windows 2026-02-22 10:43:22 +00:00
Peter Steinberger
1d7dbd8cd9 test: reclassify web fetch/readability suites as unit tests 2026-02-22 10:41:29 +00:00
Peter Steinberger
304eef575b test: reclassify sandbox and web/image tool suites as unit tests 2026-02-22 10:40:40 +00:00
Peter Steinberger
3b09a0d2d0 perf(test): trim bash e2e log fixtures and abort wait bounds 2026-02-22 10:39:18 +00:00
Peter Steinberger
c68bb8d6d5 test: stabilize bash e2e suites with explicit exec approvals mode 2026-02-22 10:37:44 +00:00
Peter Steinberger
97eb4af01e test: harden models-config env isolation list 2026-02-22 10:34:23 +00:00
Peter Steinberger
744df0fbe7 test: reclassify models-config suites from e2e to unit lane 2026-02-22 10:34:23 +00:00
Peter Steinberger
740fd7ae35 test: reclassify skills suites from e2e to unit lane 2026-02-22 10:34:23 +00:00
Peter Steinberger
c56ab39da5 perf(test): reduce bash e2e wait windows 2026-02-22 10:28:43 +00:00
Peter Steinberger
abff3f0f61 test: reclassify sessions_spawn lifecycle suite as unit test 2026-02-22 10:28:43 +00:00
Peter Steinberger
0b7c7ee1aa perf(test): speed up sessions_spawn lifecycle suite setup 2026-02-22 10:28:43 +00:00
Peter Steinberger
c962bcba37 test: reclassify sandbox merge and exec path suites as unit tests 2026-02-22 10:28:43 +00:00
Peter Steinberger
9ab7b85a66 perf(test): tighten background abort timing windows 2026-02-22 10:28:43 +00:00
Peter Steinberger
c995f9be07 test: reclassify mocked announce and sandbox suites as unit tests 2026-02-22 10:28:43 +00:00
Peter Steinberger
27f0d7ebcc test: reclassify auth-profile-rotation suite as unit test 2026-02-22 10:28:43 +00:00
Peter Steinberger
c0b1c10a08 test: reclassify mocked runner/safe-bins suites as unit tests 2026-02-22 10:28:43 +00:00
Peter Steinberger
a9b26d83de perf(test): narrow pi-embedded runner e2e import path 2026-02-22 10:28:42 +00:00
Peter Steinberger
2b0ca9447c perf(test): trim bash e2e sleep and poll windows 2026-02-22 10:28:42 +00:00
Peter Steinberger
c348a13640 perf(test): lower subagent fast-mode wait floors 2026-02-22 10:28:42 +00:00
Peter Steinberger
54e0786ba6 perf(test): reduce subagent announce fast-mode polling waits 2026-02-22 10:28:42 +00:00
Peter Steinberger
a96139e18c perf(test): mock compact module in auth rotation e2e 2026-02-22 10:28:42 +00:00
Peter Steinberger
eda941f395 perf(test): remove flaky transport timeout and dedupe safeBins checks 2026-02-22 10:28:42 +00:00
Peter Steinberger
d72b4ead18 perf(test): lower fast-mode nested output wait floor to 70ms 2026-02-22 10:28:42 +00:00
Peter Steinberger
7ccf62fb4c test(agents): remove dead shell-timeout override in safeBins suite 2026-02-22 10:28:42 +00:00
Peter Steinberger
60773c124e perf(test): lower fast-mode nested output wait floor to 80ms 2026-02-22 10:28:42 +00:00
Peter Steinberger
36375f121f perf(test): trim nested subagent output wait floor in fast mode 2026-02-22 10:28:42 +00:00
Peter Steinberger
2900eb5456 perf(test): trim background abort settle waits and dedupe cmd fixture 2026-02-22 10:28:42 +00:00
Peter Steinberger
7d13227d41 test(agents): dedupe auth profile rotation fixture setup 2026-02-22 10:28:42 +00:00
Peter Steinberger
6b5c20055b perf(test): speed subagent announce retry polling in fast mode 2026-02-22 10:28:42 +00:00
Peter Steinberger
1b327da6e3 fix: harden exec sandbox fallback semantics (#23398) (thanks @bmendonca3) 2026-02-22 11:12:01 +01:00
Brian Mendonca
c76a47cce2 Exec: fail closed when sandbox host is unavailable 2026-02-22 11:12:01 +01:00
Peter Steinberger
35d5bd4e07 perf(test): shrink subagent announce fast-mode settle waits 2026-02-22 09:29:04 +00:00
Peter Steinberger
703f7213b6 test(agents): simplify subagent announce suite imports and call assertions 2026-02-22 09:29:04 +00:00
Peter Steinberger
6c2e999776 refactor(security): unify secure id paths and guard weak patterns 2026-02-22 10:16:19 +01:00
Peter Steinberger
c3e13175d2 perf(test): bypass queue debounce in fast mode and tighten announce defaults 2026-02-22 09:13:01 +00:00