# Platform Engineer Streaming Architecture

**Status:** 🔴 Abandoned (Superseded by `2024-10-23-platform-engineer-streaming-architecture.md`)
**Category:** Architecture & Core Design
**Date:** October 22, 2024

## Current Status: ⚠️ Streaming Not Fully Working (Historical Note)

Token-by-token streaming from sub-agents (such as `agent-komodor-p2p`) to clients is currently NOT working due to LangGraph's tool execution model. This document explains why and outlines the solution path.
## The Problem

### Current Architecture

```
Client Request
    ↓
Platform Engineer (Deep Agent + LangGraph)
    ↓
A2ARemoteAgentConnectTool (blocks here!)
    ↓ (internally streams from sub-agent)
Sub-Agent streams response → Tool accumulates → Returns complete text
    ↓
Platform Engineer receives complete response as one chunk
    ↓
Client receives full response at once (no streaming)
```
### Root Cause

LangGraph tools are blocking by design. When Deep Agent invokes a tool:

1. Tool execution blocks the graph until the tool returns.
2. `A2ARemoteAgentConnectTool._arun()` is called.
3. Inside `_arun()`, the tool DOES stream from the sub-agent via the A2A protocol, BUT it accumulates all chunks into `accumulated_text`.
4. The tool only returns the complete response when streaming finishes.
5. LangGraph receives this as a single `ToolMessage`.
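The blocking behavior described above can be reproduced with a stdlib-only sketch. Note that `fake_sub_agent_stream` and `blocking_tool` are hypothetical stand-ins, not the real implementation:

```python
import asyncio

events: list[str] = []  # records the interleaving of producer and consumer activity

async def fake_sub_agent_stream():
    """Hypothetical stand-in for self._client.send_message_streaming()."""
    for chunk in ["Komodor ", "clusters: ", "prod, staging"]:
        events.append(f"produced {chunk!r}")
        yield chunk

async def blocking_tool() -> str:
    """Mirrors _arun(): streams internally, but only returns the full text."""
    accumulated_text: list[str] = []
    async for chunk in fake_sub_agent_stream():
        accumulated_text.append(chunk)  # accumulating, not yielding
    return "".join(accumulated_text)

async def main() -> str:
    response = await blocking_tool()
    events.append(f"client saw {response!r}")  # the first (and only) thing the client sees
    return response

print(asyncio.run(main()))  # → Komodor clusters: prod, staging
```

Every `produced` event lands before the single `client saw` event, which is exactly the chunk-at-the-end behavior observed in the CLI.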
### Code Evidence

From `ai_platform_engineering/utils/a2a_common/a2a_remote_agent_connect.py:198-226`:

```python
accumulated_text: list[str] = []
async for chunk in self._client.send_message_streaming(streaming_request):
    # Chunks ARE received from the sub-agent
    writer({"type": "a2a_event", "data": chunk_dump})  # ← Writes somewhere, but doesn't propagate to the client
    if isinstance(chunk, A2ATaskArtifactUpdateEvent):
        text = extract_text(chunk)
        accumulated_text.append(text)  # ← Accumulating, not yielding!

# Return the complete response only after ALL chunks are received
final_response = " ".join(accumulated_text).strip()
return Output(response=final_response)  # ← Blocking return
```
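For contrast, a minimal sketch of the solution direction: if the tool could yield chunks instead of returning once, the caller would see each chunk as it arrives. The names below are hypothetical, and the real fix depends on how LangGraph exposes streaming from inside tools:

```python
import asyncio
from typing import AsyncIterator

async def fake_sub_agent_stream() -> AsyncIterator[str]:
    """Hypothetical stand-in for self._client.send_message_streaming()."""
    for chunk in ["Komodor ", "clusters: ", "prod, staging"]:
        yield chunk

async def streaming_tool() -> AsyncIterator[str]:
    """Forwards each chunk the moment it arrives, instead of accumulating."""
    async for chunk in fake_sub_agent_stream():
        yield chunk  # the caller sees this chunk immediately

async def main() -> list[str]:
    seen: list[str] = []
    async for chunk in streaming_tool():
        seen.append(chunk)  # a real client would render each token here
    return seen

chunks = asyncio.run(main())
print(chunks)  # → ['Komodor ', 'clusters: ', 'prod, staging']
```

The structural change is small (`yield` instead of `return`), but it requires the graph runtime to consume tools as async iterators rather than awaiting a single `ToolMessage`.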
## Testing Streaming

### Current State (Not Streaming)

```shell
uvx --no-cache git+https://github.com/cnoe-io/agent-chat-cli.git a2a \
  --host 10.99.255.178 --port 8000

# Type: show me komodor clusters
#
# Behavior: Shows "Calling komodor..." → wait → complete response appears
```

### After Fix (Streaming)

```shell
# Same command as above
#
# Expected: Tokens appear one by one as they're generated by the komodor agent
```
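Beyond eyeballing the CLI, streaming vs. blocking can be distinguished by event interleaving. A stdlib-only sketch (all names hypothetical):

```python
import asyncio

log: list[str] = []

async def producer():
    """Hypothetical sub-agent token stream that logs each emission."""
    for i in range(3):
        log.append(f"produced chunk-{i}")
        yield f"chunk-{i}"

async def streaming_consumer() -> None:
    # Pull-based async iteration: each chunk is consumed before the next is produced
    async for chunk in producer():
        log.append(f"consumed {chunk}")

asyncio.run(streaming_consumer())
print(log)
# With true streaming the "produced"/"consumed" entries interleave;
# with the blocking tool, every "produced" entry would precede any "consumed".
```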
## Related

- LangGraph Streaming: https://python.langchain.com/docs/langgraph/streaming
- A2A Protocol: https://github.com/cnoe-io/a2a-spec
- Deep Agent: https://docs.deepagent.ai/
- Related Issue: https://github.com/langchain-ai/langgraph/issues/XXXX (streaming tools)
- Architecture: architecture.md