Tasks: Slack Streaming Conformance — Artifact ID Reset, No-Tool Queries, RAG Cap Isolation
Input: Spec from .specify/specs/099-slack-streaming-conformance/spec.md
Related: 098-unify-single-distributed-binding (parent spec — binding unification, RAG caps)
Organization: Tasks migrated from 098 Phase 10, organized by user story. All tasks are completed.
Format: [ID] [P?] [Story] Description
- [P]: Can run in parallel (different files, no dependencies)
- [Story]: Which user story this task belongs to (e.g., US1, US4)
- Include exact file paths in descriptions
Path Conventions
- A2A binding: ai_platform_engineering/multi_agents/platform_engineer/protocol_bindings/a2a/
- Slack bot: ai_platform_engineering/integrations/slack_bot/
- RAG tools: ai_platform_engineering/multi_agents/platform_engineer/rag_tools.py
- Middleware: ai_platform_engineering/utils/deepagents_custom/middleware.py
- Prompts: charts/ai-platform-engineering/data/
- Tests: tests/
Phase 1: Live Final-Answer Streaming & Metadata Propagation ✅
Goal: Restore word-by-word streaming of the final answer to Slack and propagate is_final_answer metadata through the executor.
- T001 [US1] Restore live post-marker token yield with is_final_answer: True metadata in ai_platform_engineering/multi_agents/platform_engineer/protocol_bindings/a2a/agent.py (lines ~1054-1061)
- T002 [US1] Propagate is_final_answer from event to artifact metadata in ai_platform_engineering/multi_agents/platform_engineer/protocol_bindings/a2a/agent_executor.py (lines ~779-781)
Checkpoint: STREAMING_RESULT events carry is_final_answer=True in metadata; executor propagates to artifacts.
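The flow in T001 and T002 can be sketched as follows. This is a minimal illustration, not the code in agent.py or agent_executor.py: the marker string, the event dict shape, and the function names stream_events and to_artifact_metadata are assumptions made for the example.

```python
from typing import Iterable, Iterator

FINAL_MARKER = "[FINAL ANSWER]"  # hypothetical marker; the real one is not given here


def stream_events(tokens: Iterable[str]) -> Iterator[dict]:
    """Yield one event per token, tagging tokens that follow the marker."""
    seen_marker = False
    for token in tokens:
        if token == FINAL_MARKER:
            seen_marker = True
            continue  # the marker itself is not user-visible content
        event = {"type": "STREAMING_RESULT", "content": token, "metadata": {}}
        if seen_marker:
            event["metadata"]["is_final_answer"] = True
        yield event


def to_artifact_metadata(event: dict) -> dict:
    """Copy is_final_answer from the event onto artifact metadata, if present."""
    metadata = {}
    if event.get("metadata", {}).get("is_final_answer"):
        metadata["is_final_answer"] = True
    return metadata


events = list(stream_events(["thinking", FINAL_MARKER, "Hello", " world"]))
artifacts = [to_artifact_metadata(e) for e in events]
```

In this sketch only the two post-marker tokens produce artifact metadata with is_final_answer set, which is the property the checkpoint describes.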
Phase 2: Per-Tool RAG Cap Tracking ✅
Goal: Replace global RAG hard-stop with per-tool cap tracking to prevent premature graph termination.
- T003 [US2] Replace global _rag_hard_stop_set with _rag_capped_tools: dict[str, set[str]] in ai_platform_engineering/multi_agents/platform_engineer/rag_tools.py
- T004 [US2] Add is_rag_tool_capped(thread_id, tool_name) function in rag_tools.py
- T005 [US2] Update _record_rag_cap_hit to accept tool_name argument in rag_tools.py
- T006 [US2] Update DeterministicTaskMiddleware.after_model to use per-tool cap checks in ai_platform_engineering/utils/deepagents_custom/middleware.py
- T007 [US2] Update tests/test_rag_tools_hard_stop.py for new cap tracking structure; add test_is_rag_tool_capped_tracks_individual_tools
Checkpoint: Capping search does not block fetch_document; graph only terminates when all requested tools are individually capped. SC-003, SC-004 satisfied.
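A minimal sketch of the per-tool cap structure from T003 to T005, assuming the dict is keyed by thread ID and holds the names of capped tools. The names _rag_capped_tools, _record_rag_cap_hit, and is_rag_tool_capped come from the tasks above; the bodies and the all_requested_tools_capped helper are hypothetical and only illustrate the termination rule in the checkpoint.

```python
from collections import defaultdict

# thread_id -> names of RAG tools that have hit their cap on that thread
_rag_capped_tools: dict[str, set[str]] = defaultdict(set)


def _record_rag_cap_hit(thread_id: str, tool_name: str) -> None:
    """Mark one tool as capped for one thread."""
    _rag_capped_tools[thread_id].add(tool_name)


def is_rag_tool_capped(thread_id: str, tool_name: str) -> bool:
    """True only if this specific tool was capped on this thread."""
    return tool_name in _rag_capped_tools.get(thread_id, set())


def all_requested_tools_capped(thread_id: str, requested: list[str]) -> bool:
    """Hypothetical helper: terminate only if every requested tool is capped."""
    return bool(requested) and all(
        is_rag_tool_capped(thread_id, name) for name in requested
    )


_record_rag_cap_hit("thread-1", "search")
```

With only search capped, a request that also includes fetch_document does not satisfy the termination condition in this sketch.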
Phase 3: No-Tool Query Stream Opening ✅
Goal: Ensure simple/off-topic queries (no tool calls) deliver answers to Slack.
- T008 [US3] Fix pre-stream typing guard: if not stream_ts and not streaming_final_answer: in ai_platform_engineering/integrations/slack_bot/utils/ai.py (line ~401)
Checkpoint: "tell me a joke" and "how is the weather in SF?" produce visible Slack output. SC-002 satisfied.
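One possible reading of the guard in T008, sketched under assumptions: before a stream exists, intermediate content only updates a typing indicator, but once the final answer has started, the content opens the stream instead. Only the condition itself comes from the task; the handle_chunk function, the state dict, and the returned action strings are hypothetical.

```python
def handle_chunk(state: dict, text: str, is_final_answer: bool) -> str:
    """Return the action taken for one incoming chunk (illustrative only)."""
    if is_final_answer:
        state["streaming_final_answer"] = True  # latch once set

    stream_ts = state.get("stream_ts")
    streaming_final_answer = state.get("streaming_final_answer", False)

    if not stream_ts and not streaming_final_answer:
        # No stream yet and no final answer: show a typing status only.
        return "typing"

    if not stream_ts:
        # Final answer arrived before any stream was opened (no-tool query).
        state["stream_ts"] = "ts-placeholder"  # stands in for a Slack timestamp
        state["buffer"] = ""
        action = "opened"
    else:
        action = "appended"
    state["buffer"] += text
    return action


state: dict = {}
actions = [
    handle_chunk(state, "thinking...", False),
    handle_chunk(state, "Why did the", True),
    handle_chunk(state, " chicken cross?", True),
]
```

Here the first final-answer chunk opens the stream even though no tool call preceded it, which is the behavior the checkpoint expects for simple and off-topic queries.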
Phase 4: RAG Parallel Tool Call Prompts ✅
Goal: Hint the LLM to issue multiple RAG tool calls per response for faster retrieval.
- T009 [P] [US5] Add parallel tool call hint to search_tool_prompt in charts/ai-platform-engineering/data/prompt_config.rag.yaml
- T010 [P] [US5] Add parallel tool call hint to RAG instructions in charts/ai-platform-engineering/data/prompt_config.deep_agent.yaml
Checkpoint: RAG prompts include "Parallel Tool Calls" hint section.
Phase 5: Conformance Test Suite ✅
Goal: Build an automated suite that validates all Slack streaming scenarios end-to-end.
- T011 [US4] Add --suite mode to tests/simulate_slack_stream.py with run_suite() function
- T012 [US4] Define 4 conformance scenarios: simple-chat, off-topic, rag-simple, rag-complex
- T013 [US4] Implement conformance checks: content_delivered, stream_opened, live_streamed, no_duplicate, final_answer_latched, tools_used, multi_chunk, no_tools
- T014 [US4] Modify simulate() to return structured result dict for suite consumption
Checkpoint: All 22 conformance checks pass. SC-001 satisfied.
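To illustrate how a structured result dict could feed the named checks, here is a rough sketch covering a subset of them. The check names come from T013; the result fields (stream_opened, chunks, final_text, tool_calls), the run_checks function, and the exact meaning given to each check are assumptions, not the definitions used in tests/simulate_slack_stream.py.

```python
def run_checks(result: dict, expect_tools: bool) -> dict[str, bool]:
    """Evaluate a few conformance checks against one scenario result."""
    streamed = "".join(result["chunks"])
    final_text = result["final_text"]
    checks = {
        "content_delivered": bool(streamed.strip()),
        "stream_opened": bool(result["stream_opened"]),
        "multi_chunk": len(result["chunks"]) > 1,
        # Assumed meaning: the final answer appears at most once in the stream.
        "no_duplicate": not final_text or streamed.count(final_text) <= 1,
    }
    if expect_tools:
        checks["tools_used"] = len(result["tool_calls"]) > 0
    else:
        checks["no_tools"] = len(result["tool_calls"]) == 0
    return checks


simple_chat = {
    "stream_opened": True,
    "chunks": ["Why did ", "the chicken ", "cross the road?"],
    "final_text": "Why did the chicken cross the road?",
    "tool_calls": [],
}
report = run_checks(simple_chat, expect_tools=False)
failed = [name for name, ok in report.items() if not ok]
```

Selecting tools_used or no_tools per scenario mirrors the idea that each scenario runs its own subset of checks.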
Phase 6: Benchmark & Enforcement ✅
Goal: Define the formal benchmark and automate enforcement via a Cursor rule.
- T015 [US4] Create tests/STREAMING_CONFORMANCE.md with scenarios, invariants, scope, and instructions for adding new scenarios (SC-006)
- T016 [US4] Create .cursor/rules/streaming-conformance.mdc glob-matched to 7 pipeline files (SC-005)
Checkpoint: Benchmark document exists; Cursor rule triggers on pipeline edits.
Dependencies & Execution Order
Phase 1 (live streaming) ──→ Phase 3 (no-tool fix) ──→ Phase 5 (conformance suite)
                                                        ↑
Phase 2 (RAG caps) ─────────────────────────────────────┘
                                                        ↓
Phase 4 (prompts) ──────────────────────────────────→ Phase 6 (benchmark)
- Phases 1, 2, 4 can run in parallel (different files, no shared state)
- Phase 3 depends on Phase 1 (is_final_answer metadata must exist)
- Phase 5 depends on Phases 1, 2, 3 (validates all fixes)
- Phase 6 depends on Phase 5 (benchmark references the suite)
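The ordering above can be checked mechanically. The sketch below encodes the stated dependencies, plus the Phase 4 → Phase 6 arrow from the diagram, and uses the standard-library graphlib module to group phases into batches that could run in parallel. The parallel_batches helper is a hypothetical name introduced for this example.

```python
from graphlib import TopologicalSorter

# phase number -> set of phases it depends on
deps = {
    3: {1},        # no-tool fix needs is_final_answer metadata
    5: {1, 2, 3},  # conformance suite validates all fixes
    6: {4, 5},     # benchmark references the suite; Phase 4 edge is from the diagram
}


def parallel_batches(dependencies: dict[int, set[int]]) -> list[list[int]]:
    """Group phases into successive batches whose members have no pending deps."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


batches = parallel_batches(deps)
print(batches)  # [[1, 2, 4], [3], [5], [6]]
```

The first batch matches the note that Phases 1, 2, and 4 can run in parallel.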
Notes
- All 16 tasks (T001–T016) are completed
- All tasks use exact file paths for traceability
- These tasks were originally Phase 10 / T058–T071 in the 098 spec
- The conformance test suite requires a running supervisor:
PYTHONPATH=. uv run python tests/simulate_slack_stream.py --suite