Architecture: Cross-Thread Store & Automatic Fact Extraction
Decision
| Alternative | Pros | Cons | Decision |
|---|---|---|---|
| LangGraph Store with LangMem extraction (chosen) | Native LangGraph API, pluggable backends, LangMem handles extraction logic | Requires store + LLM for extraction | Selected |
| Custom key-value persistence | Full control, simpler data model | No LangGraph integration, manual wiring | Rejected |
| Thread-spanning checkpointer | Reuses existing checkpointer | Violates thread isolation, data leakage risk | Rejected |
| Manual "remember this" commands only | User control over what is stored | Requires explicit user action, no automatic learning | Deferred |
Solution Architecture
Store Factory Pattern
The store factory in store.py creates backend-specific store instances based on LANGGRAPH_STORE_TYPE:
LANGGRAPH_STORE_TYPE ──▶ create_store()
│
├── memory ──▶ InMemoryStore (default)
├── redis ──▶ _LazyAsyncRedisStore (wraps AsyncRedisStore)
├── postgres ──▶ _LazyAsyncPostgresStore (wraps AsyncPostgresStore)
└── mongodb ──▶ _LazyAsyncMongoDBStore (custom motor-based wrapper)
All external backends use a lazy async initialization pattern: the store is created synchronously, but the actual connection is deferred until the first async operation via _ensure_initialized(). This allows the synchronous graph builder to accept the store without blocking.
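The lazy initialization described above can be sketched roughly as follows. This is a minimal illustration, not the code in store.py: the class name `LazyAsyncStore`, the `connect` callable, and the `FakeBackend` stand-in are all hypothetical; only the `_ensure_initialized()` name comes from the description above.

```python
import asyncio


class FakeBackend:
    """Stand-in for a real async store backend (demo only, hypothetical)."""

    def __init__(self):
        self.data = {}

    async def aput(self, namespace, key, value):
        self.data[(namespace, key)] = value

    async def aget(self, namespace, key):
        return self.data.get((namespace, key))


class LazyAsyncStore:
    """Hypothetical wrapper: constructed synchronously, connects on first async use."""

    def __init__(self, connect):
        # `connect` is an async callable that returns the real backend store.
        self._connect = connect
        self._store = None
        self._lock = None

    async def _ensure_initialized(self):
        if self._store is None:
            # Create the lock inside the running event loop, on first use.
            if self._lock is None:
                self._lock = asyncio.Lock()
            async with self._lock:
                # Re-check after acquiring the lock so concurrent first
                # calls open only one connection.
                if self._store is None:
                    self._store = await self._connect()
        return self._store

    async def aput(self, namespace, key, value):
        store = await self._ensure_initialized()
        await store.aput(namespace, key, value)

    async def aget(self, namespace, key):
        store = await self._ensure_initialized()
        return await store.aget(namespace, key)
```

Because the constructor does no I/O, a synchronous builder can hold a reference to the wrapper before any event loop is running; the connection cost is paid by whichever coroutine touches the store first.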
Namespace Layout
Store data is organized into user-scoped namespaces:
Namespace: ("memories", <sanitized_user_id>)
└── key: uuid
└── value: {"data": "...", "source_thread": "...", "timestamp": ...}
Namespace: ("summaries", <sanitized_user_id>)
└── key: uuid
└── value: {"summary": "...", "thread_id": "...", "timestamp": ...}
User IDs containing periods (e.g., email addresses) are sanitized via sanitize_namespace_label(), which replaces each . with _, because LangGraph namespace labels cannot contain periods.
An optional LANGGRAPH_STORE_KEY_PREFIX allows multiple deployments to share a single Redis instance without key collisions.
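As a small sketch of the namespace construction, assuming the sanitizer does nothing beyond the period replacement described above (the real helper may handle more cases). The `memories_namespace` and `summaries_namespace` helpers are hypothetical names used only for illustration.

```python
def sanitize_namespace_label(label: str) -> str:
    """Replace periods, which namespace labels may not contain."""
    return label.replace(".", "_")


def memories_namespace(user_id: str) -> tuple:
    # Hypothetical helper: builds the ("memories", <sanitized_user_id>) tuple.
    return ("memories", sanitize_namespace_label(user_id))


def summaries_namespace(user_id: str) -> tuple:
    # Hypothetical helper: builds the ("summaries", <sanitized_user_id>) tuple.
    return ("summaries", sanitize_namespace_label(user_id))
```

One consequence of a plain replacement is that two distinct IDs such as `a.b` and `a_b` would map to the same label, which is worth keeping in mind if user IDs are not already unique after sanitization.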
Cross-Thread Context Retrieval
When a new thread starts, store_get_cross_thread_context() retrieves prior context:
New thread ──▶ store_get_cross_thread_context(store, user_id)
│
├── asearch("summaries", user_id) ──▶ sorted by timestamp desc
│ └── formatted as "[Previous Conversation Summaries]\n..."
│
└── asearch("memories", user_id) ──▶ sorted by timestamp desc
└── formatted as "[User Memories]\n- fact1\n- fact2\n..."
│
└── combined ──▶ injected into system prompt
Limits are configurable via LANGGRAPH_STORE_MAX_SUMMARIES (default 10) and LANGGRAPH_STORE_MAX_MEMORIES (default 50).
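The sort-and-format step can be illustrated with plain dictionaries standing in for the stored values. This is a sketch under assumptions: the function name `format_cross_thread_context` is hypothetical, the store lookup itself is omitted, and the exact separators between entries are guesses; only the section headers, the newest-first ordering, and the default limits come from the description above.

```python
def format_cross_thread_context(summaries, memories,
                                max_summaries=10, max_memories=50):
    """Build a context string from stored values (plain dicts in this sketch)."""

    def newest_first(items, limit):
        # Sort by timestamp descending and keep at most `limit` entries.
        ordered = sorted(items, key=lambda v: v.get("timestamp", 0), reverse=True)
        return ordered[:limit]

    sections = []

    recent_summaries = newest_first(summaries, max_summaries)
    if recent_summaries:
        body = "\n".join(s["summary"] for s in recent_summaries)
        sections.append("[Previous Conversation Summaries]\n" + body)

    recent_memories = newest_first(memories, max_memories)
    if recent_memories:
        body = "\n".join(f"- {m['data']}" for m in recent_memories)
        sections.append("[User Memories]\n" + body)

    # An empty string means there is nothing to inject into the system prompt.
    return "\n\n".join(sections)
```

Applying the limits after sorting means that when a user has more entries than the limit, the oldest ones are the ones dropped.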
Automatic Fact Extraction
When ENABLE_FACT_EXTRACTION=true, the system automatically extracts facts after each agent response:
Agent response complete
│
└── asyncio.create_task(extract_and_store_facts(...))
│
├── create_fact_extractor(store) ──▶ cached MemoryStoreManager
│ └── LangMem create_memory_store_manager()
│ ├── model: FACT_EXTRACTION_MODEL or default LLM
│ ├── instructions: platform engineering extraction priorities
│ ├── namespace: ("memories", "{langgraph_user_id}")
│ ├── enable_inserts: true
│ └── enable_deletes: false
│
└── extractor.ainvoke({"messages": messages}, config=config)
└── extracted facts persisted to store
The extraction runs as a background asyncio task, so the response is returned without waiting for it to finish. Failures are logged but never propagated to the caller.
Embedding Configuration
The store supports optional semantic search via embeddings, sharing configuration with the RAG stack:
- EMBEDDINGS_PROVIDER / EMBEDDINGS_MODEL (shared with RAG)
- LANGGRAPH_STORE_EMBEDDINGS_PROVIDER / LANGGRAPH_STORE_EMBEDDINGS_MODEL (store-specific overrides)
- Auto-detected dimensions for known models (text-embedding-3-small: 1536, text-embedding-3-large: 3072)
Note: MongoDB store does not support semantic/vector search. Use Redis or Postgres for full semantic memory.
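The override order for the embedding settings could be resolved along these lines. This is a sketch: the function name `resolve_store_embeddings`, the returned dictionary shape, and the choice to return `None` when nothing is configured are assumptions; the variable names and the two known dimensions are taken from the list above.

```python
import os

# Dimensions for the models named above; other models are left undetected here.
KNOWN_DIMS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}


def resolve_store_embeddings(env=None):
    """Return embedding settings for the store, or None if not configured."""
    env = os.environ if env is None else env

    # Store-specific variables take precedence over the shared RAG ones.
    provider = (env.get("LANGGRAPH_STORE_EMBEDDINGS_PROVIDER")
                or env.get("EMBEDDINGS_PROVIDER"))
    model = (env.get("LANGGRAPH_STORE_EMBEDDINGS_MODEL")
             or env.get("EMBEDDINGS_MODEL"))

    if not provider or not model:
        # Semantic search is optional, so missing settings are not an error.
        return None

    return {"provider": provider, "model": model, "dims": KNOWN_DIMS.get(model)}
```

Passing a dictionary as `env` instead of reading `os.environ` directly keeps the resolution logic easy to exercise in isolation.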
Components Changed
| File | Description |
|---|---|
| ai_platform_engineering/utils/store.py | Store factory with InMemoryStore, Redis, Postgres, MongoDB backends; namespace helpers; CRUD operations; cross-thread context retrieval; global singleton |
| ai_platform_engineering/utils/agent_memory/fact_extraction.py | LangMem integration for automatic fact extraction; cached MemoryStoreManager; background async extraction |
| ai_platform_engineering/multi_agents/platform_engineer/protocol_bindings/a2a/agent.py | Triggers background fact extraction after response; saves summaries to store during compression |
| ai_platform_engineering/multi_agents/platform_engineer/deep_agent.py | Creates store via factory; passes to graph builder |
Related
- Spec: spec.md
- ADR: Cross-Thread LangGraph Store
- ADR: Automatic Fact Extraction