Skip to main content

Spec: LangGraph Persistence β€” Checkpointer & Cross-Thread Store

Overview​

Implement full LangGraph persistence with two complementary layers: (1) per-thread Checkpointer for saving conversation state (messages) within a thread, including auto-trimming and orphan repair; and (2) cross-thread Store for long-term user-scoped memory (summaries, facts, preferences) that persists across threads. Additionally, integrate LangMem for automatic fact extraction from conversations.

Motivation​

Without persistence, conversation state is lost on restart, threads are isolated, and agents have no memory of prior interactions. This spec covers:

  • Checkpointer: Enables multi-turn conversation continuity within a thread. Without it, the agent forgets the conversation after each message.
  • Cross-thread Store: When a user starts a new conversation, the agent can recall context from prior threads β€” no need to re-explain preferences, project details, or environment.
  • Fact extraction: Actively extracts and persists facts from conversations so the agent proactively remembers details, even from short conversations that never trigger context compression.

Scope​

In Scope​

  • Checkpointer (per-thread state persistence):
    • InMemorySaver (default), with InMemorySaver disabled when LANGGRAPH_DEV is set
    • Checkpointer wired into graph compilation for deep agent, GitHub, GitLab, Slack, AWS, Splunk agents
    • _trim_messages_if_needed auto-compression when context exceeds token limit
    • _find_safe_split_index respects tool-call/tool-result boundaries during trimming
    • Repair fallback: resets thread state via aupdate_state when orphan repair fails
    • Thread isolation: different thread_id values produce isolated state
    • context_id β†’ thread_id mapping for A2A protocol
  • Cross-thread Store (user-scoped long-term memory):
    • Store factory with InMemoryStore (default), Redis, and Postgres backends
    • Wiring the store through deepagents graph compilation
    • Saving LangMem compression summaries to the store for cross-thread access
    • Retrieving cross-thread context (summaries + memories) when starting new threads
    • Propagating user identity from JWT middleware into agent config
  • Automatic fact extraction from conversations using LangMem's create_memory_store_manager
  • Environment variable configuration
  • Unit tests for all layers

Out of Scope​

  • AsyncRedisSaver / AsyncPostgresSaver checkpointer backends (future β€” infrastructure not yet wired)
  • Explicit "remember this" / "forget this" user commands (future)
  • Store-based RAG / semantic search over memories (future)
  • Admin UI for viewing/managing stored memories

Design​

Architecture​

LangGraph persistence has two independent layers:

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Layer 1: Checkpointer (per-thread state) β”‚
β”‚ ───────────────────────────────────────────── β”‚
β”‚ Scope: thread_id β”‚
β”‚ Stores: Raw messages (Human, AI, Tool, System) β”‚
β”‚ Backends: InMemorySaver (default) β”‚
β”‚ Features: β”‚
β”‚ β€’ Multi-turn conversation continuity β”‚
β”‚ β€’ Auto-trim when context exceeds token limit β”‚
β”‚ β€’ Safe split: respects tool-call/result pairs β”‚
β”‚ β€’ Orphan repair fallback: resets corrupted state β”‚
β”‚ β€’ Disabled when LANGGRAPH_DEV is set β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Layer 2: Store (cross-thread user memory) β”‚
β”‚ ───────────────────────────────────────────── β”‚
β”‚ Scope: user_id (across all threads) β”‚
β”‚ Stores: User memories, conversation summaries β”‚
β”‚ Backends: InMemoryStore (default), Redis, Postgres β”‚
β”‚ Features: β”‚
β”‚ β€’ Cross-thread recall on new conversations β”‚
β”‚ β€’ LangMem summary persistence after compression β”‚
β”‚ β€’ Automatic fact extraction (opt-in) β”‚
β”‚ β€’ User isolation: each user has own namespace β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Data Model​

Checkpointer:
thread_id -> [HumanMessage, AIMessage, ToolMessage, SystemMessage, ...]

Store Namespaces:
("memories", <user_id>) -> {key: uuid, value: {"data": "...", "source_thread": "...", "timestamp": ...}}
("summaries", <user_id>) -> {key: uuid, value: {"summary": "...", "thread_id": "...", "timestamp": ...}}

Components Affected​

  • Multi-Agents (ai_platform_engineering/multi_agents/) - deep_agent.py (checkpointer + store wiring), agent.py (repair fallback, fact extraction), agent_executor.py (user_id propagation), agent_registry.py
  • Utils (ai_platform_engineering/utils/) - store.py, agent_memory/fact_extraction.py, a2a_common/langmem_utils.py, a2a_common/base_langgraph_agent.py (trim, safe split, checkpointer)
  • Deepagents (deepagents/) - graph.py (store parameter)
  • Agents (ai_platform_engineering/agents/) - GitHub, GitLab, Slack, AWS, Splunk all wire checkpointers via base_langgraph_agent
  • Documentation (docs/)
  • MCP Servers
  • Knowledge Bases (ai_platform_engineering/knowledge_bases/)
  • UI (ui/)
  • Helm Charts (charts/)

Acceptance Criteria​

Checkpointer (per-thread persistence)​

  • InMemorySaver attached to deep agent by default
  • Checkpointer disabled when LANGGRAPH_DEV env var is set
  • Thread isolation: different thread_ids produce independent state
  • Same-thread accumulation: messages persist across invocations
  • _trim_messages_if_needed trims old messages when context exceeds token limit
  • _find_safe_split_index never orphans tool-call/tool-result pairs
  • System messages preserved during trimming
  • Repair fallback: adds reset message via aupdate_state when orphan repair fails
  • Repair fallback skipped when checkpointer is None or thread_id is absent
  • context_id correctly mapped to thread_id in stream config
  • Individual agents (GitHub, GitLab, Slack, AWS, Splunk) wire checkpointers
  • Graph compiles correctly with, without, and with None checkpointer
  • Checkpoint tests pass (49 tests)

Cross-thread Store (user-scoped memory)​

  • Store factory creates InMemoryStore by default
  • Store factory supports Redis and Postgres via env vars
  • Store is wired into graph compilation via deepagents
  • User identity flows from JWT middleware to agent config
  • LangMem summaries are saved to store after compression
  • New threads retrieve cross-thread summaries/memories
  • Graceful fallback when store is unavailable
  • Store unit tests pass (86 tests)

Automatic Fact Extraction​

  • Fact extraction runs in background after each response (when enabled)
  • Controlled by ENABLE_FACT_EXTRACTION env var (default false)
  • Extracted facts persisted to ("memories", user_id) namespace via MemoryStoreManager
  • Fact extraction unit tests pass (65 tests)

Documentation & Overall​

  • Documentation updated (ADR + env vars)
  • All 289 persistence-related unit tests pass

Implementation Plan​

Phase 1: Checkpointer (per-thread persistence)​

  • InMemorySaver wired into deep_agent.py with LANGGRAPH_DEV toggle
  • base_langgraph_agent.py uses MemorySaver for individual agent graphs
  • _trim_messages_if_needed auto-compression with _find_safe_split_index boundary safety
  • Repair fallback in agent.py when orphan repair fails (checks checkpointer presence)
  • context_id β†’ thread_id mapping and user_id/trace_id metadata propagation

Phase 2: Cross-Thread Store Infrastructure​

  • Create store factory (ai_platform_engineering/utils/store.py)
  • Add store parameter to deepagents graph builder
  • Wire store into deep_agent.py

Phase 3: Cross-Thread Data Flow​

  • Propagate user_id from JWT middleware through executor to agent
  • Save LangMem summaries to store
  • Retrieve cross-thread context on new threads

Phase 4: Configuration & Tests​

  • Update .env.example and docker-compose.dev.yaml
  • Write unit tests for store (86 tests)
  • Create ADR for cross-thread store

Phase 5: Automatic Fact Extraction​

  • Create ai_platform_engineering/utils/agent_memory/fact_extraction.py with LangMem create_memory_store_manager integration
  • Add background asyncio.create_task() in agent.py stream() to extract facts after response
  • Add ENABLE_FACT_EXTRACTION and FACT_EXTRACTION_MODEL env vars
  • Verify store_get_cross_thread_context handles MemoryStoreManager output format
  • Write unit tests for fact extraction (65 tests)
  • Create ADR for automatic fact extraction decision

Phase 6: Checkpoint Testing​

  • Write comprehensive checkpoint tests (49 tests) covering:
    • InMemorySaver lifecycle and thread isolation
    • State round-trip (Human, AI, System, Unicode messages)
    • _find_safe_split_index boundary safety with tool-call pairs
    • _trim_messages_if_needed all branches (disabled, no state, under limit, over limit, system preserved)
    • Repair fallback with/without checkpointer/thread_id, error handling
    • Concurrent checkpoint access (10 threads write, 10 concurrent reads)
    • Graph compilation variants (with, without, None checkpointer)
    • Agent checkpointer wiring verification (source inspection)
    • Edge cases (long thread IDs, special chars, 50-message accumulation)

Testing Strategy​

Unit Tests (289 total)​

Test FileCountCoverage
tests/test_checkpoint.py49InMemorySaver lifecycle, thread isolation, state round-trip, _find_safe_split_index, _trim_messages_if_needed (all branches), repair fallback, context_id→thread_id, concurrent access, graph compilation, agent wiring, edge cases
tests/test_store.py86Store factory, put memory/summary, cross-thread retrieval, user isolation, LangMem integration, user_id extraction/propagation, InMemoryStore integration, lazy Postgres
tests/test_fact_extraction.py65Feature flag, config builder, extraction model, extractor creation/caching, extract-and-store, store compatibility, agent integration, edge cases
tests/test_persistence_unit.py89_extract_tool_call_ids, _find_safe_summarization_boundary, summarize_messages, _fallback_summarize, preflight_context_check, _repair_orphaned_tool_calls, stream config wiring, deep_agent checkpointer wiring

Integration Tests​

  • integration/test_fact_extraction_live.py -- Seeds facts via multi-turn conversation, waits for background extraction, verifies recall on a new thread, and checks user isolation
  • integration/test_persistence_features.py -- End-to-end thread persistence, recall, isolation, multi-turn via A2A HTTP API

Manual verification​

  • Multi-thread conversation with memory recall

Rollout Plan​

  1. Merge with InMemoryStore default (no infrastructure changes needed)
  2. Teams can opt-in to Redis/Postgres store via env vars
  3. Future: semantic search over memories, explicit remember/forget commands