
Research: Slack Bot AG-UI Migration

Date: 2026-04-14 | Spec: spec.md | Plan: plan.md

Research Questions

R1: AG-UI Protocol Event Types and Payloads

Decision: Use the AG-UI SSE protocol as implemented by AGUIStreamEncoder in the dynamic agents backend.

Rationale: The AG-UI encoder is already production-ready, serving the web UI. The Slack bot's existing sse_client.py already defines the correct SSEEventType enum. The event format is stable and well-documented in the encoder source.

Event types emitted by the dynamic agents backend (from agui_sse.py):

| Event Type | When Emitted | Payload Fields |
|---|---|---|
| RUN_STARTED | Once at stream start | runId, threadId, timestamp |
| TEXT_MESSAGE_START | First content chunk per namespace | messageId, role: "assistant", timestamp |
| TEXT_MESSAGE_CONTENT | Each text token | messageId, delta, timestamp |
| TEXT_MESSAGE_END | End of text sequence | messageId, timestamp |
| TOOL_CALL_START | AIMessage with tool_calls | toolCallId, toolCallName, timestamp |
| TOOL_CALL_ARGS | Immediately after TOOL_CALL_START | toolCallId, delta (truncated JSON args), timestamp |
| TOOL_CALL_END | ToolMessage arrives | toolCallId, timestamp |
| RUN_FINISHED (success) | Stream completes normally | runId, threadId, outcome: "success", timestamp |
| RUN_FINISHED (interrupt) | HITL input requested | runId, threadId, outcome: "interrupt", interrupt: {id, reason, payload} |
| RUN_ERROR | Unrecoverable error | message, code (optional), timestamp |
| CUSTOM (WARNING) | Non-fatal warning | name: "WARNING", value: {message, namespace} |
| CUSTOM (NAMESPACE_CONTEXT) | Before subagent events | name: "NAMESPACE_CONTEXT", value: {namespace: [...]} |
| CUSTOM (TOOL_ERROR) | ToolMessage starts with "ERROR:" | name: "TOOL_ERROR", value: {tool_call_id, error} |

Events in SSEEventType not emitted by current encoder: STEP_STARTED, STEP_FINISHED, STATE_SNAPSHOT, STATE_DELTA, RAW. These are reserved for future AG-UI protocol extensions. The Slack bot should handle them gracefully (log and skip).
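The "log and skip" behavior for reserved events could look roughly like the sketch below. It is only an illustration: `dispatch_event`, the `handlers` mapping, and the logger name are hypothetical and are not taken from sse_client.py. Treating reserved and unknown event types differently (debug versus warning) is also an assumption.

```python
import logging
from typing import Any, Callable, Dict

logger = logging.getLogger("slack_bot.agui")  # hypothetical logger name

# Event types listed in SSEEventType that the current encoder does not emit.
RESERVED_EVENT_TYPES = {
    "STEP_STARTED",
    "STEP_FINISHED",
    "STATE_SNAPSHOT",
    "STATE_DELTA",
    "RAW",
}

Handler = Callable[[Dict[str, Any]], None]


def dispatch_event(
    event_type: str, payload: Dict[str, Any], handlers: Dict[str, Handler]
) -> bool:
    """Route one event to its handler.

    Returns True if a handler ran, False if the event was logged and skipped.
    """
    handler = handlers.get(event_type)
    if handler is not None:
        handler(payload)
        return True
    if event_type in RESERVED_EVENT_TYPES:
        # Known but unused today: skip quietly.
        logger.debug("Skipping reserved AG-UI event %s", event_type)
    else:
        # Not in the table at all: skip, but make it visible in logs.
        logger.warning("Skipping unknown AG-UI event %s", event_type)
    return False
```

With this shape, supporting a reserved event later only requires registering a handler for it; the streaming loop itself would not change.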

Alternatives considered: Custom SSE protocol — rejected because AG-UI is the standardized protocol and the UI already uses it.

R2: SSE Client Request Format

Decision: Rewrite stream_chat() to send ChatRequest to /api/v1/chat/stream/start?protocol=agui.

Rationale: The current sse_client.py sends a RunAgentInput-style payload (with threadId, runId, messages, state, tools, context, forwardedProps) to /chat/stream. The dynamic agents backend expects a ChatRequest body (message, conversation_id, agent_id, optional trace_id) at /api/v1/chat/stream/start. The current format is wrong for the target endpoint.

Request format (from dynamic_agents/models.py and dynamic_agents/routes/chat.py):

{
  "message": "user's question text",
  "conversation_id": "uuid-v5-from-thread-ts",
  "agent_id": "agent-config-id-from-channel",
  "trace_id": "optional-langfuse-trace-id"
}

Authentication: Authorization: Bearer <jwt> from existing OAuth2ClientCredentials client. When AUTH_ENABLED=false on dynamic agents, no token needed.
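As a rough sketch of how the request above might be assembled, the helper below builds the body, query parameters, and headers in one place. `build_chat_request` is a hypothetical name, and the `Accept: text/event-stream` header is an assumption about what the endpoint expects. The token is passed in as a plain string so the sketch does not depend on the OAuth2ClientCredentials client.

```python
from typing import Any, Dict, Optional

CHAT_STREAM_PATH = "/api/v1/chat/stream/start"


def build_chat_request(
    message: str,
    conversation_id: str,
    agent_id: str,
    trace_id: Optional[str] = None,
    token: Optional[str] = None,
) -> Dict[str, Any]:
    """Return the path, query params, JSON body, and headers for a chat request."""
    body: Dict[str, Any] = {
        "message": message,
        "conversation_id": conversation_id,
        "agent_id": agent_id,
    }
    if trace_id is not None:
        body["trace_id"] = trace_id  # optional field, omitted when unset

    headers = {
        "Accept": "text/event-stream",  # assumption: not confirmed by the backend docs
        "Content-Type": "application/json",
    }
    if token:
        # Skipped when AUTH_ENABLED=false on the dynamic agents side.
        headers["Authorization"] = f"Bearer {token}"

    return {
        "path": CHAT_STREAM_PATH,
        "params": {"protocol": "agui"},
        "json": body,
        "headers": headers,
    }
```

Keeping request construction separate from the HTTP call makes the payload shape easy to unit test without a running backend.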

Alternatives considered: Adapting the RunAgentInput format to work with dynamic agents — rejected because the backend does not accept it; ChatRequest is the defined contract.

R3: HITL Interrupt and Resume Format

Decision: Parse RUN_FINISHED events with outcome: "interrupt" and resume via POST /api/v1/chat/stream/resume.

Rationale: AG-UI uses RUN_FINISHED with a special outcome field to signal HITL interrupts, unlike A2A which used caipe_form artifacts. The interrupt payload contains structured field definitions compatible with the existing HITLForm dataclass.

Interrupt payload structure (from agui_sse.py:on_input_required):

{
  "id": "interrupt-uuid",
  "reason": "human_input",
  "payload": {
    "prompt": "Please confirm you want to proceed",
    "fields": [
      {
        "field_name": "approval",
        "field_label": "Do you approve?",
        "field_type": "boolean",
        "required": true
      }
    ],
    "agent": "platform-engineer"
  }
}
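The following sketch shows one way a RUN_FINISHED event could be turned into a structured interrupt. The `Interrupt` and `InterruptField` dataclasses are hypothetical stand-ins; the real HITLForm dataclass may use different field names. The defaults for missing keys are assumptions.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class InterruptField:
    """Hypothetical stand-in for one form field definition."""
    name: str
    label: str
    type: str = "text"
    required: bool = False
    values: List[str] = field(default_factory=list)


@dataclass
class Interrupt:
    """Hypothetical stand-in for the parsed interrupt."""
    id: str
    reason: str
    prompt: str
    agent: Optional[str]
    fields: List[InterruptField]


def parse_interrupt(event: Dict[str, Any]) -> Optional[Interrupt]:
    """Return an Interrupt for RUN_FINISHED with outcome "interrupt", else None."""
    if event.get("outcome") != "interrupt":
        return None
    interrupt = event.get("interrupt") or {}
    payload = interrupt.get("payload") or {}
    fields = [
        InterruptField(
            name=raw["field_name"],
            label=raw.get("field_label", raw["field_name"]),
            type=raw.get("field_type", "text"),
            required=bool(raw.get("required", False)),
            values=list(raw.get("field_values") or []),
        )
        for raw in payload.get("fields", [])
    ]
    return Interrupt(
        id=interrupt.get("id", ""),
        reason=interrupt.get("reason", ""),
        prompt=payload.get("prompt", ""),
        agent=payload.get("agent"),
        fields=fields,
    )
```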

Field type mapping (AG-UI InputFieldType → Slack Block Kit):

| AG-UI Type | Slack Block Kit Element |
|---|---|
| text | plain_text_input |
| select | static_select with field_values as options |
| multiselect | multi_static_select with field_values as options |
| boolean | Button pair (Yes/No) or static_select with Yes/No |
| number | plain_text_input with numeric placeholder |
| url | plain_text_input with URL placeholder |
| email | plain_text_input with email placeholder |
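The mapping above can be sketched as a small function that returns a Block Kit element dictionary per field. This is an illustration under several assumptions: `field_to_element` is a hypothetical name, booleans use the static_select variant rather than a button pair, the placeholder texts are invented, and unknown field types fall back to plain_text_input.

```python
from typing import Any, Dict, List

# Illustrative placeholder texts; the real wording is not specified in the source.
PLACEHOLDERS = {
    "number": "Enter a number",
    "url": "https://example.com",
    "email": "name@example.com",
}


def _plain_text(text: str) -> Dict[str, Any]:
    return {"type": "plain_text", "text": text}


def _options(values: List[str]) -> List[Dict[str, Any]]:
    return [{"text": _plain_text(v), "value": v} for v in values]


def field_to_element(form_field: Dict[str, Any]) -> Dict[str, Any]:
    """Map one AG-UI field definition to a Slack Block Kit element dict."""
    field_type = form_field.get("field_type", "text")
    action_id = form_field["field_name"]
    values = list(form_field.get("field_values") or [])

    if field_type == "select":
        return {"type": "static_select", "action_id": action_id, "options": _options(values)}
    if field_type == "multiselect":
        return {"type": "multi_static_select", "action_id": action_id, "options": _options(values)}
    if field_type == "boolean":
        # static_select variant of the Yes/No choice.
        return {"type": "static_select", "action_id": action_id, "options": _options(["Yes", "No"])}

    # text, number, url, email, and (by assumption) any unknown type.
    element: Dict[str, Any] = {"type": "plain_text_input", "action_id": action_id}
    placeholder = PLACEHOLDERS.get(field_type)
    if placeholder:
        element["placeholder"] = _plain_text(placeholder)
    return element
```

Using field_name as the action_id keeps the submitted values keyed the same way the backend defined them, which simplifies building form_data for the resume call.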

Resume request (from chat.py:ResumeStreamRequest):

{
  "agent_id": "agent-config-id",
  "conversation_id": "uuid-v5",
  "form_data": "{\"approval\": true}",
  "trace_id": "optional"
}

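Note: form_data is a JSON-encoded string, not a parsed object, so the submitted values must be serialized before sending. For rejections, form_data carries the plain message "User dismissed the input form without providing values." instead of JSON.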
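A minimal sketch of building the resume body, with form_data serialized to a string, might look like this. `build_resume_request` is a hypothetical helper; passing `None` to signal a dismissed form is an assumption made for the example.

```python
import json
from typing import Any, Dict, Optional

RESUME_PATH = "/api/v1/chat/stream/resume"
DISMISSED_MESSAGE = "User dismissed the input form without providing values."


def build_resume_request(
    agent_id: str,
    conversation_id: str,
    values: Optional[Dict[str, Any]],
    trace_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Build the resume body; `values=None` means the user dismissed the form."""
    if values is None:
        form_data = DISMISSED_MESSAGE
    else:
        # form_data must be a JSON string, not a nested object.
        form_data = json.dumps(values)

    body: Dict[str, Any] = {
        "agent_id": agent_id,
        "conversation_id": conversation_id,
        "form_data": form_data,
    }
    if trace_id is not None:
        body["trace_id"] = trace_id
    return body
```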

Alternatives considered: Custom interrupt format — rejected because AG-UI's format is already implemented in the encoder and matches what the web UI consumes.

R4: Conversation ID Strategy

Decision: Deterministic UUID v5 from thread_ts using a fixed namespace.

Rationale: Eliminates the supervisor lookup API call (GET /api/v1/conversations/lookup). Same thread_ts always produces the same conversation ID, ensuring follow-up messages in the same thread reuse the same LangGraph checkpoint. No race conditions, no network dependency for ID resolution.

Implementation (from research doc):

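import uuid

# Fixed namespace so every bot instance derives the same IDs.
SLACK_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "slack.caipe.io")

def thread_ts_to_conversation_id(thread_ts: str) -> str:
    # Same thread_ts always maps to the same UUID v5.
    return str(uuid.uuid5(SLACK_NAMESPACE, thread_ts))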

Alternatives considered:

  • Keep supervisor lookup API — rejected because the supervisor is deprecated
  • Random UUID per thread (stored in cache) — rejected because it requires persistence and doesn't survive bot restarts
  • Use thread_ts directly as conversation ID — rejected because LangGraph expects UUID format

R5: httpx for SSE Streaming

Decision: Use httpx's client.stream() for SSE streaming, replacing the current requests-based implementation (requests.post(stream=True)).

Rationale: The user specified httpx. httpx provides async-compatible streaming, better timeout handling, and is already a dependency of the Slack bot (via pyproject.toml — it's used by FastMCP and other components). The current requests.post(stream=True) approach works but httpx's client.stream() context manager provides cleaner resource management.

SSE parsing approach: Line-by-line iteration over the response stream, parsing event: and data: lines. This is the same approach as the current implementation — no SSE library dependency needed.
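The line-by-line approach described above can be sketched as a generator over text lines. `iter_sse_events` is a hypothetical name, not the function in sse_client.py, and the sketch assumes every data payload is a JSON object. The lines could come from any iterable of strings, for example httpx's `response.iter_lines()` inside a `client.stream(...)` block.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List, Optional, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yield (event_type, payload) pairs from an iterable of SSE text lines."""
    event_type: Optional[str] = None
    data_lines: List[str] = []

    for raw in lines:
        line = raw.rstrip("\r\n")

        if line == "":
            # A blank line terminates the current event.
            if data_lines:
                payload = json.loads("\n".join(data_lines))
                yield (event_type or "message", payload)
            event_type, data_lines = None, []
            continue

        if line.startswith(":"):
            continue  # SSE comment, often used as a keep-alive

        name, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space after the colon is dropped

        if name == "event":
            event_type = value
        elif name == "data":
            data_lines.append(value)
        # Other fields (id, retry) are ignored in this sketch.

    # An event without a terminating blank line is discarded.
```

Because the parser only depends on an iterable of strings, it can be tested with a plain list and reused unchanged if the transport library changes.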

Alternatives considered: aiohttp — rejected because the Slack bot runs synchronously (Slack Bolt is sync); sseclient-py — rejected as unnecessary dependency for simple line parsing.

R6: Per-Channel Agent Routing

Decision: Add agent_id: Optional[str] to ChannelConfig with a default_agent_id fallback in GlobalDefaults.

Rationale: Each Slack channel can be configured to route to a different dynamic agent (e.g., platform-engineer for ops channels, code-reviewer for dev channels). This replaces the single-supervisor model with per-channel specialization.

Configuration format:

defaults:
  default_agent_id: "platform-engineer"

C12345ABC:
  name: "platform-support"
  agent_id: "platform-engineer"
  ai_enabled: true

Validation: If ai_enabled=True and neither agent_id nor default_agent_id is set, log a warning, do not forward the message to any agent, and reply with an error response.
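The fallback and validation rule could be sketched as follows. The two dataclasses are simplified, hypothetical stand-ins for the real ChannelConfig and GlobalDefaults, which likely carry more fields. `resolve_agent_id` is an illustrative name, and returning `None` to signal "do not route" is an assumption.

```python
import logging
from dataclasses import dataclass
from typing import Optional

logger = logging.getLogger("slack_bot.config")  # hypothetical logger name


@dataclass
class GlobalDefaults:
    """Simplified stand-in for the real defaults model."""
    default_agent_id: Optional[str] = None


@dataclass
class ChannelConfig:
    """Simplified stand-in for the real per-channel model."""
    name: str
    ai_enabled: bool = False
    agent_id: Optional[str] = None


def resolve_agent_id(channel: ChannelConfig, defaults: GlobalDefaults) -> Optional[str]:
    """Return the agent to route to, or None if the message should not be routed."""
    if not channel.ai_enabled:
        return None
    agent_id = channel.agent_id or defaults.default_agent_id
    if agent_id is None:
        # Misconfiguration: AI is enabled but there is nothing to route to.
        logger.warning("Channel %s has ai_enabled but no agent_id configured", channel.name)
    return agent_id
```

A caller would treat `None` on an AI-enabled channel as the case where it replies with an error instead of starting a stream.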

Alternatives considered: Environment variable per channel — rejected because the YAML config already handles per-channel settings; global single agent_id — rejected because it defeats the purpose of dynamic agents.

R7: Web UI Conversation Isolation

Decision: Slack conversations are isolated by their deterministic UUID v5 namespace. No additional filtering is needed on the dynamic agents side.

Rationale: The web UI generates conversation IDs using uuid4() (random). The Slack bot generates them using uuid5(SLACK_NAMESPACE, thread_ts). These UUID spaces are disjoint: version 4 and version 5 UUIDs carry different values in the version field, so a Slack-derived ID can never equal a web UI ID. The web UI's conversation list endpoint (GET /api/v1/conversations) filters by authenticated user, and Slack bot requests use service-level credentials (not user-level), so Slack conversations are naturally excluded from user queries.
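The separation between the two ID schemes can be checked with the standard library alone, as in the sketch below. `is_uuid5` is a hypothetical helper, and the thread_ts value is invented. Note that the version field only identifies how the UUID was generated; it does not prove that an ID came from the Slack bot.

```python
import uuid

SLACK_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "slack.caipe.io")


def is_uuid5(conversation_id: str) -> bool:
    """True if the ID parses as a version 5 (name-based, SHA-1) UUID."""
    try:
        return uuid.UUID(conversation_id).version == 5
    except ValueError:
        return False


# Example thread_ts, made up for illustration.
slack_id = str(uuid.uuid5(SLACK_NAMESPACE, "1712345678.000100"))
web_id = str(uuid.uuid4())

# Deterministic: the same thread_ts always yields the same ID.
same_again = str(uuid.uuid5(SLACK_NAMESPACE, "1712345678.000100"))
```

Here `is_uuid5(slack_id)` holds and `is_uuid5(web_id)` does not, because the version field differs between the two schemes.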

Alternatives considered: Adding a source metadata field to conversations — deferred; the namespace-based approach is sufficient for 0.4.0.

R8: Existing Broken State on Branch

Decision: This migration fixes all three pre-existing issues from the main → release/0.4.0 merge.

| Issue | Root Cause | Fix |
|---|---|---|
| app.py calls stream_sse_response() but ai.py defines stream_a2a_response() | Naming mismatch from merge | Phase 2: New function is named stream_response(), all callers updated |
| ai.py imports throttler but throttler.py was deleted | Dead import from commit 02525bea | Phase 2: Import removed; throttler not needed for AG-UI |
| app.py passes SSEClient but ai.py expects A2AClient interface | Interface mismatch from partial migration | Phase 2: New stream_response() expects SSEClient |