- Add composite index on (agentId, category) for faster filtered queries
- Combine graph search into a single UNION Cypher query (previously two sequential queries)
- Parallelize conflict resolution with LLM_CONCURRENCY chunks
- Batch entity operations (merge, mentions, relationships, tags, category,
extraction status) into a single managed transaction
- Make auto-capture fire-and-forget with shared captureMessage helper
- Extract attention-gate.ts and message-utils.ts modules from index.ts
and extractor.ts for better separation of concerns
- Update tests to match new batched/combined APIs
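
The chunked parallelism mentioned above can be sketched as follows. This is a minimal illustration, not the code from this commit: `processInChunks` is a hypothetical helper, the value of `LLM_CONCURRENCY` is assumed, and the use of `Promise.allSettled` is an assumption about how failures are isolated.

```typescript
// Assumed value; the commit names LLM_CONCURRENCY but does not state it.
const LLM_CONCURRENCY = 3;

// Hypothetical helper: runs `worker` over `items` one chunk at a time,
// so at most `chunkSize` calls are in flight at once.
async function processInChunks<T, R>(
  items: readonly T[],
  worker: (item: T) => Promise<R>,
  chunkSize: number = LLM_CONCURRENCY,
): Promise<PromiseSettledResult<R>[]> {
  if (chunkSize < 1) throw new RangeError("chunkSize must be >= 1");
  const results: PromiseSettledResult<R>[] = [];
  for (let i = 0; i < items.length; i += chunkSize) {
    const chunk = items.slice(i, i + chunkSize);
    // allSettled keeps one failed call from discarding the rest of the chunk.
    const settled = await Promise.allSettled(chunk.map((item) => worker(item)));
    results.push(...settled);
  }
  return results;
}
```

Results come back in input order, so a caller can pair each settled result with the item that produced it and skip the rejected ones.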
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix initPromise retry: reset to null on failure so subsequent calls
retry instead of returning cached rejected promise
- Remove dead code: findPromotionCandidates, findDemotionCandidates,
calculateEffectiveImportance (~190 lines, never called)
- Add agentId filter to deleteMemory() to prevent cross-agent deletion
- Fix swapped phase labels: 1b=Semantic Dedup, 1c=Conflict Detection
(CLI banner, phaseNames map, SleepCycleResult/Options type comments)
- Add autoRecallMinScore and coreMemory config to plugin JSON schema
so the UI can validate and display these options
- Add embedding LRU cache (200 entries, SHA-256 keyed) to eliminate
redundant API calls across auto-recall, auto-capture, and tools
- Add Ollama concurrency limiter (chunks of 4) to prevent thundering
herd on single-threaded embedding server
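
The initPromise retry fix in the list above follows a common pattern, sketched here under assumptions: `LazyInit`, `doInit`, and `ensureInitialized` are hypothetical names, and only the reset-to-null-on-failure behaviour comes from the commit.

```typescript
// Hypothetical wrapper: caches an in-flight or successful init,
// but clears the cache when the init fails.
class LazyInit {
  private initPromise: Promise<void> | null = null;
  private readonly doInit: () => Promise<void>;

  constructor(doInit: () => Promise<void>) {
    this.doInit = doInit;
  }

  ensureInitialized(): Promise<void> {
    if (this.initPromise === null) {
      this.initPromise = this.doInit().catch((err: unknown) => {
        // Reset so the next call retries instead of reusing the rejected promise.
        this.initPromise = null;
        throw err;
      });
    }
    return this.initPromise;
  }
}
```

Concurrent callers still share a single in-flight attempt; only a later call made after a failure triggers a new one.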
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Raise MIN_CAPTURE_CHARS from 10 to 30 to reject trivially short messages
- Add noise patterns for conversational filler (haha, lol, hmm, etc.)
- Add noise pattern to reject /new and /reset session prompts
- Raise importance threshold for assistant auto-captures to >= 0.7
- Add Slack protocol prefix/suffix stripping in stripMessageWrappers()
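
A rough sketch of how the capture gate described above might combine these checks. The two thresholds come from this commit; the regular expressions, the `shouldCapture` name, and the order of the checks are illustrative assumptions, not the actual patterns in attention-gate.ts.

```typescript
// Thresholds taken from the commit message.
const MIN_CAPTURE_CHARS = 30;
const ASSISTANT_MIN_IMPORTANCE = 0.7;

// Illustrative patterns only; the real list is not shown in the commit.
const NOISE_PATTERNS: RegExp[] = [
  /^(?:(?:ha)+h?|lol|hmm+)[\s.!?]*$/i, // whole-message conversational filler
  /^\s*\/(?:new|reset)\b/i, // session prompts starting with /new or /reset
];

// Hypothetical gate: returns true when a message is worth capturing.
function shouldCapture(
  text: string,
  role: "user" | "assistant",
  importance: number,
): boolean {
  const trimmed = text.trim();
  if (trimmed.length < MIN_CAPTURE_CHARS) return false;
  if (NOISE_PATTERNS.some((pattern) => pattern.test(trimmed))) return false;
  // Assistant messages must clear a higher importance bar than user messages.
  if (role === "assistant" && importance < ASSISTANT_MIN_IMPORTANCE) return false;
  return true;
}
```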
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Five high-impact improvements to the memory system:
1. Min RRF score threshold on auto-recall (default 0.25) — filters low-relevance
results before injecting into context
2. Deduplicate auto-recall against core memories already present in context
3. Capture assistant messages (decisions, recommendations, synthesized facts)
with stricter attention gating and "auto-capture-assistant" source type
4. LLM-judged importance scoring at capture time (0.1-1.0) with 5s timeout
fallback to 0.5, replacing the flat 0.5 default
5. Conflict detection in sleep cycle (Phase 1b) — finds contradictory memories
sharing entities, uses LLM to resolve, invalidates the loser
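
Item 4 above (LLM-judged importance with a timeout fallback) could be sketched as below. The 5s timeout, the 0.5 fallback, and the 0.1-1.0 range come from this commit; the `scoreImportance` name, the injected `judge` callback, and falling back on errors as well as timeouts are assumptions for illustration.

```typescript
const IMPORTANCE_TIMEOUT_MS = 5000;
const FALLBACK_IMPORTANCE = 0.5;

// Hypothetical scorer: `judge` stands in for the LLM call, which is not shown here.
async function scoreImportance(
  text: string,
  judge: (text: string) => Promise<number>,
  timeoutMs: number = IMPORTANCE_TIMEOUT_MS,
): Promise<number> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<number>((resolve) => {
    timer = setTimeout(() => resolve(FALLBACK_IMPORTANCE), timeoutMs);
  });
  try {
    // Whichever settles first wins: the judge's score or the fallback.
    const score = await Promise.race([judge(text), timeout]);
    if (!Number.isFinite(score)) return FALLBACK_IMPORTANCE;
    // Clamp to the 0.1-1.0 range stated in the commit.
    return Math.min(1.0, Math.max(0.1, score));
  } catch {
    // Assumption: a failed judge call is treated like a timeout.
    return FALLBACK_IMPORTANCE;
  } finally {
    clearTimeout(timer);
  }
}
```

Note that `Promise.race` does not cancel the losing judge call; it only stops waiting for it.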
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds a CLI command to re-embed all Memory and Entity nodes after
changing the embedding model or provider. Drops old vector indexes,
re-embeds in batches via the configured provider, and recreates
indexes with the correct dimensions.
Prevents managed/bundled skill file paths from leaking into sandboxed
agent skill snapshots, which caused 'path escapes sandbox root' errors.
Adds scopeToWorkspace option to loadSkillEntries/buildWorkspaceSkillSnapshot.
Also fixes stale Docker mount detection on container probe failure.
- Add extraction config section (apiKey, model, baseUrl) to plugin schema
with env-var fallback and Ollama/local LLM support (no API key required)
- Add category classification to extraction prompt; update memories from
'other' to LLM-assigned category
- Reorder sleep phases: extraction before decay
- Parallelize extraction (3 concurrent via Promise.allSettled)
- Pre-compute effective scores once and reuse for promotion/demotion
- Replace O(n²) Cartesian dedup with per-memory HNSW vector index queries
- Use mentionCount for orphan entity detection instead of subquery
- Remove dead auto-capture code (evaluateAutoCapture, CaptureItem, etc.)
Add `coreMemory.refreshAtContextPercent` config option to re-inject
core memories when context usage exceeds a threshold. This counters
the "lost in the middle" phenomenon documented by Liu et al. (2023).
Implementation:
- Extend before_agent_start hook event with context usage info
- Pass contextWindowTokens and estimatedUsedTokens to hooks
- Track mid-session refresh per session to prevent over-refreshing
- Clear refresh tracking on compaction
- Add comprehensive tests
Based on research: Liu et al., "Lost in the Middle: How Language
Models Use Long Contexts" (Stanford, 2023)
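
The refresh decision described in this commit might look roughly like the sketch below. `contextWindowTokens`, `estimatedUsedTokens`, and `refreshAtContextPercent` are the names from the commit; the `ContextUsage` shape, the module-level `Set`, and the function names are hypothetical, and limiting each session to one refresh between compactions is an assumption about what "prevent over-refreshing" means.

```typescript
interface ContextUsage {
  contextWindowTokens: number;
  estimatedUsedTokens: number;
}

// Hypothetical tracker: sessions that already had a mid-session refresh.
const refreshedSessions = new Set<string>();

function shouldRefreshCoreMemories(
  sessionId: string,
  usage: ContextUsage,
  refreshAtContextPercent: number,
): boolean {
  if (usage.contextWindowTokens <= 0) return false;
  if (refreshedSessions.has(sessionId)) return false;
  const usedPercent =
    (usage.estimatedUsedTokens / usage.contextWindowTokens) * 100;
  // Refresh only once usage strictly exceeds the configured threshold.
  if (usedPercent <= refreshAtContextPercent) return false;
  refreshedSessions.add(sessionId);
  return true;
}

// Compaction shrinks the context, so the session may refresh again later.
function onCompaction(sessionId: string): void {
  refreshedSessions.delete(sessionId);
}
```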
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Detect configured memory plugins (memory-neo4j, memory-lancedb) and show
their status alongside core memory search. Provides helpful hints about
plugin-specific commands when plugins are enabled.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implement retrieval tracking and Pareto-based memory consolidation:
- Track retrievalCount and lastRetrievedAt on every search
- Effective importance formula: importance × freq_boost × recency_factor
- Seven-phase sleep cycle: dedup, pareto scoring, promotion, demotion,
decay/pruning, extraction, cleanup
- Bidirectional mobility between core (≤20%) and regular memory tiers
- Core memories ranked by pure usage (no importance multiplier)
Based on ACT-R memory model and Ebbinghaus forgetting curve research.
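
To make the effective importance formula concrete, here is one possible reading of it. Only the product importance × freq_boost × recency_factor comes from this commit; the logarithmic boost, the exponential half-life decay, and the 14-day constant are assumptions chosen for illustration and may differ from the actual implementation.

```typescript
interface MemoryStats {
  importance: number; // base importance assigned to the memory
  retrievalCount: number; // incremented on every search hit
  lastRetrievedAt: number; // epoch milliseconds
}

// Assumed constant; the commit does not state a half-life.
const HALF_LIFE_DAYS = 14;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

function effectiveImportance(memory: MemoryStats, now: number): number {
  // Assumed boost: grows with retrievals, with diminishing returns.
  const freqBoost = 1 + Math.log1p(memory.retrievalCount);
  // Assumed decay: halves every HALF_LIFE_DAYS since the last retrieval.
  const ageDays = Math.max(0, now - memory.lastRetrievedAt) / MS_PER_DAY;
  const recencyFactor = Math.pow(0.5, ageDays / HALF_LIFE_DAYS);
  return memory.importance * freqBoost * recencyFactor;
}
```

Under these assumptions, a memory that was never retrieved and was last touched 14 days ago scores half its base importance, while frequent retrieval pushes the score back up.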
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>