Orphaned Tool Call Repair for Bedrock Multi-Turn Conversations
Status: Implemented
Category: Bug Fix / Resilience
Date: February 24, 2026
PRs: #842 (supervisor fixes), #31 (OTel fix)
Overview
These fixes improve supervisor resilience during multi-turn conversations that delegate to sub-agents when AWS Bedrock is the LLM provider. They address two problems: orphaned tool calls that permanently break a conversation, and a response_format incompatibility with Bedrock's Converse API.
Motivation
1. Orphaned Tool Calls Break Multi-Turn Conversations
Symptom: After 2-3 turns involving sub-agent delegation, users see:
✅ I've recovered from an interrupted tool call. Let me continue processing your request...
❌ Recovery retry failed. Please ask your question again.
Root Cause: When a sub-agent call (e.g., AWS_Agent, GitHub_Agent) times out or the client disconnects mid-stream, LangGraph records an AIMessage with tool_calls but no corresponding ToolMessage. On the next turn, Bedrock's Converse API rejects the conversation with:
ValidationException: Expected toolResult blocks at messages.0.content
for the following Ids: tooluse_y6Ma8ihoB4Lqbmm4bumT7p
Impact: The conversation becomes permanently broken for that context, and users must start a new session.
Frequency: Common in multi-turn conversations with sub-agent delegations, especially when responses are large (ArgoCD listing 800+ apps, GitHub listing many PRs).
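The root cause above comes down to a history in which an assistant message announces a tool call that no later tool message answers. A minimal sketch of detecting and patching that state follows. It models messages as plain dicts rather than LangChain's AIMessage and ToolMessage classes, and the function names, the role strings, and the placeholder text are all hypothetical, chosen only for illustration. It is not the code from PR #842.

```python
from typing import Any

# Assumption: each message is a plain dict with a "role" key. Assistant
# messages may carry "tool_calls"; tool results carry "tool_call_id".
Message = dict[str, Any]


def _answered_ids(messages: list[Message]) -> set[str]:
    """Collect the IDs of tool calls that already have a tool result."""
    return {m["tool_call_id"] for m in messages if m.get("role") == "tool"}


def find_orphaned_tool_call_ids(messages: list[Message]) -> list[str]:
    """Return IDs of tool calls that never received a tool result."""
    answered = _answered_ids(messages)
    orphaned = []
    for message in messages:
        if message.get("role") != "assistant":
            continue
        for call in message.get("tool_calls", []):
            if call["id"] not in answered:
                orphaned.append(call["id"])
    return orphaned


def repair_orphaned_tool_calls(
    messages: list[Message],
    placeholder: str = "Tool call was interrupted before it returned a result.",
) -> list[Message]:
    """Return a new history with a synthetic tool result after each orphan.

    The input list is left unchanged.
    """
    answered = _answered_ids(messages)
    repaired: list[Message] = []
    for message in messages:
        repaired.append(message)
        if message.get("role") != "assistant":
            continue
        for call in message.get("tool_calls", []):
            if call["id"] not in answered:
                # Insert the synthetic result directly after the assistant
                # message so every tool call is paired with a result.
                repaired.append(
                    {
                        "role": "tool",
                        "tool_call_id": call["id"],
                        "content": placeholder,
                    }
                )
    return repaired
```

Running a repair like this over the stored history before the next model call would give every tool call a matching result, which is the pairing the Bedrock error message asks for. Whether to insert a placeholder result or drop the orphaned call instead is a design choice this sketch does not settle.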
2. Bedrock response_format Causes Prefill ValidationException
Symptom: Sub-agents using the aws-bedrock provider fail with:
ValidationException: This model does not support assistant message prefill.
The conversation must end with a user message.
Root Cause: LangGraph's create_react_agent with response_format appends a hidden AIMessage prefill. Bedrock's Converse API does not support assistant message prefill, causing every structured response attempt to fail.
Impact: Sub-agents fall back to error handling, producing orphaned ResponseFormat tool calls that cascade into the supervisor.
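The condition named in the error text, a conversation that does not end with a user message, can be checked before a request is sent. The sketch below assumes the same plain-dict message shape with a "role" key. The function name and role strings are hypothetical and are not part of LangGraph or the Bedrock SDK.

```python
from typing import Any


def ends_with_assistant_message(messages: list[dict[str, Any]]) -> bool:
    """Return True when the last message is an assistant message.

    A trailing assistant message acts as a prefill, which is the case
    the ValidationException above reports as unsupported.
    """
    return bool(messages) and messages[-1].get("role") == "assistant"
```

A guard like this only detects the prefill. How the request is then changed (for example, by not passing response_format for the aws-bedrock provider) is left open here.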
Related
- Architecture: architecture.md