Automatic Fact Extraction via LangMem

Status: Accepted
Category: Architecture & Design
Date: February 26, 2026

Overview

This change adds automatic background fact extraction from conversations using LangMem's create_memory_store_manager. After each agent response, a background task analyzes the conversation and persists extracted facts, preferences, and context to the cross-thread LangGraph Store. The agent can then recall user-specific information across threads without the user repeating it.

Problem Statement

The cross-thread store (introduced in the Cross-Thread LangGraph Store ADR) received data only when context compression triggered, which requires the context window to be nearly full. Short conversations therefore produced no cross-thread memory at all, leaving the ("memories", user_id) namespace effectively empty. Users had to re-explain their environment, preferences, and project details in every new conversation.

Decision

Use LangMem's create_memory_store_manager API to automatically extract facts from every conversation turn, running as a background task after the agent responds.

Alternative | Pros | Cons | Decision
create_memory_store_manager (chosen) | Native BaseStore integration, automatic search/insert/update/delete, consolidates duplicates | Extra LLM call per turn | Selected
create_memory_manager + manual store writes | More control over extraction output | No store integration, must manually search/write/deduplicate | Rejected
Inline extraction during compression only | No extra LLM calls | Facts only captured when context window is near-full; short conversations produce no memories | Rejected (status quo)
Custom fact extraction prompt | Full control over prompt | Reinvents what LangMem already provides, no dedup/update logic | Rejected

Solution Architecture

Background Extraction Flow

  1. User sends message; agent streams response back to user.
  2. After the response stream completes, fact extraction is launched as a background task via asyncio.create_task().
  3. The MemoryStoreManager searches the store for existing memories relevant to this conversation.
  4. It analyzes the conversation messages alongside existing memories using an LLM.
  5. It generates insert/update/delete operations and applies them to the store.
  6. On the next conversation (same or new thread), store_get_cross_thread_context retrieves these facts.
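
The flow above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the project's implementation: extract_facts, handle_turn, and the dict-based store are hypothetical stand-ins for the LangMem manager, the agent's stream() handler, and the LangGraph Store, and a trivial string rule takes the place of the LLM call.

```python
import asyncio

# Hypothetical in-memory stand-in for the cross-thread store:
# maps a namespace tuple such as ("memories", user_id) to {key: fact}.
store: dict[tuple[str, str], dict[str, str]] = {}

# Keep strong references so pending tasks are not garbage-collected.
_background_tasks: set[asyncio.Task] = set()


async def extract_facts(user_id: str, messages: list[str]) -> None:
    """Stand-in for steps 3-5: search, analyze, then apply operations."""
    namespace = ("memories", user_id)
    existing = store.setdefault(namespace, {})  # step 3: existing memories
    await asyncio.sleep(0)  # placeholder for the LLM call (step 4)
    for message in messages:
        # Toy rule instead of an LLM: remember "I use X" statements.
        if message.startswith("I use "):
            existing["tooling"] = message  # step 5: insert or update


def _on_done(task: asyncio.Task) -> None:
    """Drop the task reference and report failures instead of losing them."""
    _background_tasks.discard(task)
    if not task.cancelled() and task.exception() is not None:
        print(f"fact extraction failed: {task.exception()!r}")


async def handle_turn(user_id: str, user_message: str) -> str:
    reply = f"echo: {user_message}"  # steps 1-2: respond to the user first
    task = asyncio.create_task(extract_facts(user_id, [user_message]))
    _background_tasks.add(task)
    task.add_done_callback(_on_done)
    return reply  # the user never waits on extraction


async def main() -> dict:
    await handle_turn("alice", "I use ArgoCD on prod-west")
    # Awaited here only so the demo can inspect the store afterwards.
    await asyncio.gather(*_background_tasks)
    return store


result = asyncio.run(main())
```

Holding the task in a set and attaching a done callback are general asyncio practices: the event loop keeps only weak references to tasks, and an exception in an un-awaited task would otherwise surface only as a warning when the task is destroyed.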

Memory Types Extracted

LangMem's memory manager extracts three categories:

  • Semantic: Facts, preferences, relationships (e.g., "User's team uses ArgoCD on prod-west cluster")
  • Episodic: Past experiences and conversation context (e.g., "User debugged an OOM issue in the monitoring namespace")
  • Procedural: Behavioral patterns (e.g., "User prefers concise responses with YAML examples")
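
One way to picture the three categories is as a tagged record. The sketch below is illustrative only: MemoryKind, Memory, and by_kind are hypothetical names, and LangMem's actual storage schema is not shown here.

```python
from dataclasses import dataclass
from enum import Enum


class MemoryKind(str, Enum):
    SEMANTIC = "semantic"      # facts, preferences, relationships
    EPISODIC = "episodic"      # past experiences and conversation context
    PROCEDURAL = "procedural"  # behavioral patterns


@dataclass(frozen=True)
class Memory:
    kind: MemoryKind
    content: str


# The examples from the list above, tagged by category.
memories = [
    Memory(MemoryKind.SEMANTIC, "User's team uses ArgoCD on prod-west cluster"),
    Memory(MemoryKind.EPISODIC, "User debugged an OOM issue in the monitoring namespace"),
    Memory(MemoryKind.PROCEDURAL, "User prefers concise responses with YAML examples"),
]


def by_kind(items: list[Memory], kind: MemoryKind) -> list[str]:
    """Return the content of every memory with the given category."""
    return [m.content for m in items if m.kind is kind]
```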

Feature Flag

Extraction is controlled by ENABLE_FACT_EXTRACTION (default: false). It is disabled by default because:

  • It adds one LLM call per conversation turn (cost/latency consideration)
  • Teams should opt in after evaluating cost vs. benefit
  • InMemoryStore (default) loses data on restart, so extraction is most useful with Redis/Postgres store backends
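
A flag check along these lines would keep extraction off unless a team opts in. This is a sketch, not the code in fact_extraction.py: the function name and the set of accepted truthy spellings are assumptions.

```python
import os
from typing import Mapping, Optional

# Assumed spellings that count as "enabled"; the real module may differ.
_TRUTHY = {"1", "true", "yes", "on"}


def is_fact_extraction_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when ENABLE_FACT_EXTRACTION is explicitly truthy."""
    env = os.environ if env is None else env
    value = env.get("ENABLE_FACT_EXTRACTION", "false")
    return value.strip().lower() in _TRUTHY
```

Passing the environment as a parameter keeps the check easy to unit test without mutating os.environ.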

Configuration

ENABLE_FACT_EXTRACTION=false           # Enable/disable background fact extraction
FACT_EXTRACTION_MODEL= # Model for extraction (empty = use default LLM)
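
The "empty = use default LLM" rule could be resolved roughly as follows. The helper name and the model names in the usage lines are hypothetical placeholders, not values from the project.

```python
import os
from typing import Mapping, Optional


def resolve_extraction_model(
    default_model: str, env: Optional[Mapping[str, str]] = None
) -> str:
    """Use FACT_EXTRACTION_MODEL when set, else fall back to the default."""
    env = os.environ if env is None else env
    override = env.get("FACT_EXTRACTION_MODEL", "").strip()
    return override or default_model


# Placeholder model names for illustration only.
unset = resolve_extraction_model("default-llm", {})
overridden = resolve_extraction_model(
    "default-llm", {"FACT_EXTRACTION_MODEL": "small-extraction-model"}
)
```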

Components Changed

  • ai_platform_engineering/utils/agent_memory/fact_extraction.py (new) - Extraction logic, feature flag, MemoryStoreManager factory
  • ai_platform_engineering/multi_agents/platform_engineer/protocol_bindings/a2a/agent.py - Background task launch after stream()
  • ai_platform_engineering/multi_agents/agent_registry.py - Added FACT_EXTRACTION to DEFAULT_REGISTRY_EXCLUSIONS
  • .env.example / docker-compose.dev.yaml - New environment variables
  • .specify/specs/cross-thread-store.md - Updated spec with Phase 4
  • tests/test_fact_extraction.py (new) - 65 unit tests covering feature flag, config, extraction, store compatibility, edge cases
  • tests/test_store.py (enhanced) - 86 unit tests covering store factory, operations, cross-thread context, integration
  • tests/test_checkpoint.py (new) - 49 unit tests covering checkpointer lifecycle, thread isolation, state round-trip, auto-trim, safe split, repair fallback, concurrent access, agent wiring
  • integration/test_fact_extraction_live.py (new) - Live integration test with recall verification and user isolation

Dependency: cnoe-agent-utils

The trace_agent_stream decorator in cnoe-agent-utils required a fix to forward **kwargs so that user_id can be propagated through the decorated stream() method. This fix is backward-compatible.
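
The kind of fix described here can be illustrated with a generic decorator for an async-generator method. This is a sketch only: trace_stream and Agent are hypothetical, and the real trace_agent_stream signature and tracing logic are not reproduced.

```python
import asyncio
import functools


def trace_stream(func):
    """Hypothetical tracing decorator for an async-generator method."""

    @functools.wraps(func)
    async def wrapper(self, *args, **kwargs):
        # Tracing setup would go here. Forwarding **kwargs is the key point:
        # a wrapper with a fixed signature would reject or drop user_id.
        async for event in func(self, *args, **kwargs):
            yield event

    return wrapper


class Agent:
    @trace_stream
    async def stream(self, query, context_id=None, user_id=None):
        yield f"user={user_id} query={query}"


async def collect(**kwargs):
    return [event async for event in Agent().stream("hello", **kwargs)]


with_user = asyncio.run(collect(user_id="alice"))
without_user = asyncio.run(collect())
```

Callers that pass no extra keyword arguments behave as before, which is what makes this style of change backward-compatible.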

Testing

All persistence-related tests (289 total) pass:

  • tests/test_checkpoint.py — 49 tests for checkpointer (thread isolation, trim, safe split, repair fallback, concurrent access, agent wiring)
  • tests/test_store.py — 86 tests for cross-thread store (factory, put/get, user isolation, LangMem integration)
  • tests/test_fact_extraction.py — 65 tests for fact extraction (feature flag, config, extraction, store compatibility, edge cases)
  • tests/test_persistence_unit.py — 89 tests for persistence internals (tool-call extraction, summarization, orphan repair, config wiring)

Related

  • ADR: 2026-02-26-cross-thread-langgraph-store.md (Checkpointer + cross-thread store infrastructure)
  • ADR: 2025-12-13-context-management-and-resilience.md (LangMem context management)
  • Spec: .specify/specs/cross-thread-store.md