Five high-impact improvements to the memory system:
1. Minimum RRF (reciprocal rank fusion) score threshold on auto-recall
(default 0.25): filters low-relevance results before they are injected
into context
2. Deduplicate auto-recall against core memories already present in context
3. Capture assistant messages (decisions, recommendations, synthesized facts)
with stricter attention gating and "auto-capture-assistant" source type
4. LLM-judged importance scoring at capture time (range 0.1-1.0), replacing
the flat 0.5 default; falls back to 0.5 if the LLM call exceeds a 5s timeout
5. Conflict detection in sleep cycle (Phase 1b): finds contradictory memories
sharing entities, uses an LLM to resolve them, and invalidates the loser
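A minimal sketch of items 1 and 2, for illustration only. The RecallResult
shape, the function name, the inclusive comparison, and deduplication by
memory ID are all assumptions; only the 0.25 default comes from this commit.

```typescript
// Hypothetical shape of one auto-recall hit; field names are assumptions.
interface RecallResult {
  id: string;
  text: string;
  rrfScore: number; // fused relevance score from the hybrid search
}

// Default threshold stated in this commit.
const DEFAULT_MIN_RRF_SCORE = 0.25;

// Keep only hits that clear the score threshold and are not already
// present in context as core memories (assumed here to match by ID).
function filterAutoRecall(
  results: RecallResult[],
  coreMemoryIds: Set<string>,
  minScore: number = DEFAULT_MIN_RRF_SCORE,
): RecallResult[] {
  return results.filter(
    (r) => r.rrfScore >= minScore && !coreMemoryIds.has(r.id),
  );
}
```

Filtering before injection keeps the threshold and the dedup in one pass, so
nothing below the cutoff or already in context consumes prompt tokens.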
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds a CLI command to re-embed all Memory and Entity nodes after
changing the embedding model or provider. Drops old vector indexes,
re-embeds in batches via the configured provider, and recreates
indexes with the correct dimensions.
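The batching step might look roughly like the sketch below. EmbedFn,
EmbeddableNode, reembedAll, and the batch size of 50 are hypothetical names
and values, not the command's actual API; dropping and recreating the vector
indexes is omitted.

```typescript
// Hypothetical provider signature: one vector per input text.
type EmbedFn = (texts: string[]) => Promise<number[][]>;

// Hypothetical minimal view of a Memory or Entity node.
interface EmbeddableNode {
  id: string;
  text: string;
}

// Split a list into consecutive batches of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (size <= 0) throw new Error("batch size must be positive");
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Re-embed every node batch by batch and report the vector dimension,
// which the caller would need when recreating the indexes.
async function reembedAll(
  nodes: EmbeddableNode[],
  embed: EmbedFn,
  batchSize: number = 50, // assumed value
): Promise<{ vectors: Map<string, number[]>; dimensions: number }> {
  const vectors = new Map<string, number[]>();
  let dimensions = 0;
  for (const batch of chunk(nodes, batchSize)) {
    const embeddings = await embed(batch.map((n) => n.text));
    batch.forEach((node, i) => {
      vectors.set(node.id, embeddings[i]);
      dimensions = embeddings[i].length;
    });
  }
  return { vectors, dimensions };
}
```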
Prevents managed/bundled skill file paths from leaking into sandboxed
agent skill snapshots, which caused 'path escapes sandbox root' errors.
Adds scopeToWorkspace option to loadSkillEntries/buildWorkspaceSkillSnapshot.
Also fixes stale Docker mount detection on container probe failure.
- Add extraction config section (apiKey, model, baseUrl) to plugin schema
with env-var fallback and Ollama/local LLM support (no API key required)
- Add category classification to extraction prompt; update memories from
'other' to LLM-assigned category
- Reorder sleep phases: extraction before decay
- Parallelize extraction (3 concurrent via Promise.allSettled)
- Pre-compute effective scores once and reuse for promotion/demotion
- Replace O(n²) Cartesian dedup with per-memory HNSW vector index queries
- Use mentionCount for orphan entity detection instead of a subquery
- Remove dead auto-capture code (evaluateAutoCapture, CaptureItem, etc.)
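One way to read "3 concurrent via Promise.allSettled" is fixed-size groups
that each settle before the next starts. The sketch below assumes that
reading; extractAll and extractOne are hypothetical names.

```typescript
// Concurrency stated in this commit.
const EXTRACTION_CONCURRENCY = 3;

// Run extractOne over all items, at most `concurrency` at a time.
// A rejected extraction is counted but does not abort the others.
async function extractAll<T, R>(
  items: T[],
  extractOne: (item: T) => Promise<R>,
  concurrency: number = EXTRACTION_CONCURRENCY,
): Promise<{ succeeded: R[]; failed: number }> {
  const succeeded: R[] = [];
  let failed = 0;
  for (let i = 0; i < items.length; i += concurrency) {
    const group = items.slice(i, i + concurrency);
    const settled = await Promise.allSettled(
      group.map((item) => extractOne(item)),
    );
    for (const outcome of settled) {
      if (outcome.status === "fulfilled") {
        succeeded.push(outcome.value);
      } else {
        failed += 1;
      }
    }
  }
  return { succeeded, failed };
}
```

Promise.allSettled never rejects, so one failed LLM call leaves the rest of
the group's results intact, unlike Promise.all.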
Add `coreMemory.refreshAtContextPercent` config option to re-inject
core memories when context usage exceeds a threshold. This counters
the "lost in the middle" phenomenon documented by Liu et al. (2023).
Implementation:
- Extend before_agent_start hook event with context usage info
- Pass contextWindowTokens and estimatedUsedTokens to hooks
- Track mid-session refresh per session to prevent over-refreshing
- Clear refresh tracking on compaction
- Add comprehensive tests
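A rough sketch of the refresh decision, assuming at most one refresh per
session until compaction clears the tracking. The ContextUsage interface and
both function names are hypothetical; only contextWindowTokens,
estimatedUsedTokens, and refreshAtContextPercent come from this commit.

```typescript
// Context usage info passed to the hook (field names from this commit).
interface ContextUsage {
  contextWindowTokens: number;
  estimatedUsedTokens: number;
}

// Sessions that have already had a mid-session refresh.
const refreshedSessions = new Set<string>();

// Return true when core memories should be re-injected for this session.
function shouldRefreshCoreMemories(
  sessionId: string,
  usage: ContextUsage,
  refreshAtContextPercent: number,
): boolean {
  if (usage.contextWindowTokens <= 0) return false;
  const usedPercent =
    (usage.estimatedUsedTokens / usage.contextWindowTokens) * 100;
  if (usedPercent <= refreshAtContextPercent) return false; // not exceeded yet
  if (refreshedSessions.has(sessionId)) return false; // avoid over-refreshing
  refreshedSessions.add(sessionId);
  return true;
}

// Compaction resets tracking so a later refresh becomes possible again.
function onCompaction(sessionId: string): void {
  refreshedSessions.delete(sessionId);
}
```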
Based on research: Liu et al., "Lost in the Middle: How Language
Models Use Long Contexts" (Stanford, 2023)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Detect configured memory plugins (memory-neo4j, memory-lancedb) and show
their status alongside core memory search. Provides helpful hints about
plugin-specific commands when plugins are enabled.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implement retrieval tracking and Pareto-based memory consolidation:
- Track retrievalCount and lastRetrievedAt on every search
- Effective importance formula: importance × freq_boost × recency_factor
- Seven-phase sleep cycle: dedup, Pareto scoring, promotion, demotion,
decay/pruning, extraction, cleanup
- Bidirectional mobility between core (≤20%) and regular memory tiers
- Core memories ranked by pure usage (no importance multiplier)
Based on ACT-R memory model and Ebbinghaus forgetting curve research.
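The effective importance product can be sketched as below. The commit does
not define freq_boost or recency_factor, so the logarithmic boost, the
exponential decay, and the 30-day half-life are assumptions chosen only to
make the example concrete.

```typescript
// Per-memory stats tracked on search (field names from this commit).
interface MemoryStats {
  importance: number;
  retrievalCount: number;
  lastRetrievedAt: number; // epoch milliseconds
}

const MS_PER_DAY = 86400000;
const HALF_LIFE_DAYS = 30; // assumed value, not from the commit

// Assumed form: diminishing returns on repeated retrievals.
function freqBoost(retrievalCount: number): number {
  return 1 + Math.log1p(retrievalCount);
}

// Assumed form: halves every HALF_LIFE_DAYS since the last retrieval.
function recencyFactor(lastRetrievedAt: number, now: number): number {
  const ageDays = Math.max(0, (now - lastRetrievedAt) / MS_PER_DAY);
  return Math.pow(0.5, ageDays / HALF_LIFE_DAYS);
}

// importance × freq_boost × recency_factor, as stated in this commit.
function effectiveImportance(m: MemoryStats, now: number): number {
  return (
    m.importance *
    freqBoost(m.retrievalCount) *
    recencyFactor(m.lastRetrievedAt, now)
  );
}

// Upper bound on the core tier: at most 20% of all memories.
function maxCoreMemories(total: number, coreFraction: number = 0.2): number {
  return Math.floor(total * coreFraction);
}
```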
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>