Agent Refactoring: Unified BaseLangGraphAgent Implementation
Status: 🟢 In-use Category: Refactoring & Implementation Date: October 21, 2024
Overview
Refactored 8 agents to use the common BaseLangGraphAgent base class, eliminating code duplication and ensuring consistent behavior across all agents.
Agents Refactored
| Agent | Status | Lines Removed | Lines Added | Reduction |
|---|---|---|---|---|
| ArgoCD | ✅ Complete | ~190 | ~108 | 43% |
| GitHub | ✅ Complete | ~2100 | ~108 | 95% |
| Slack | ✅ Complete | ~250 | ~92 | 63% |
| Jira | ✅ Complete | ~200 | ~91 | 54% |
| Backstage | ✅ Complete | ~180 | ~89 | 51% |
| Confluence | ✅ Complete | ~180 | ~86 | 52% |
| PagerDuty | ✅ Complete | ~180 | ~88 | 51% |
| Splunk | ✅ Complete | ~180 | ~88 | 51% |
| Komodor | ✅ Already using | N/A | N/A | N/A |
| TOTAL | ✅ Complete | ~3,460 | ~750 | 78% |
Benefits
1. Automatic Tool Visibility 🔧
All refactored agents now automatically show:
🔧 Calling tool: **list_clusters**
✅ Tool **list_clusters** completed
🔧 Calling tool: **get_cluster_details**
✅ Tool **get_cluster_details** completed
Before refactoring: No tool visibility, just "Processing..."
2. Consistent Structure 📐
All agents now follow the exact same pattern:
class AgentName(BaseLangGraphAgent):
"""Agent description."""
SYSTEM_INSTRUCTION = "..." # Agent-specific prompt
RESPONSE_FORMAT_INSTRUCTION = "..." # Standard format
def get_agent_name(self) -> str:
return "agent_name"
def get_system_instruction(self) -> str:
return self.SYSTEM_INSTRUCTION
def get_response_format_instruction(self) -> str:
return self.RESPONSE_FORMAT_INSTRUCTION
def get_response_format_class(self) -> type[BaseModel]:
return ResponseFormat
def get_mcp_config(self, server_path: str) -> dict:
# Agent-specific MCP configuration
return {...}
def get_tool_working_message(self) -> str:
return 'Querying Agent...'
def get_tool_processing_message(self) -> str:
return 'Processing Agent data...'
@trace_agent_stream("agent_name")
async def stream(self, query: str, sessionId: str, trace_id: str = None):
async for event in super().stream(query, sessionId, trace_id):
yield event
Only 3 things differ:
- System instruction (prompt)
- MCP configuration (env vars, tools)
- Agent name
3. Reduced Code Duplication 📉
- 3,460 lines removed across all agents
- 750 lines added (clean, consistent implementations)
- 78% code reduction overall
- 2,710 net lines deleted
4. Easier Maintenance 🛠️
Before:
- Bug fix needs to be applied to 8 different files
- Each agent has slightly different implementation
- Inconsistent error handling
After:
- Bug fix in
BaseLangGraphAgentfixes all 8 agents - All agents behave identically
- Consistent error handling and streaming
5. Future Enhancements Automatic 🚀
Any improvements to BaseLangGraphAgent automatically apply to all agents:
- ✅ Tool visibility (already added!)
- ✅ Better error handling
- ✅ Performance optimizations
- ✅ New A2A protocol features
File Changes
Modified Files
ai_platform_engineering/agents/
├── argocd/agent_argocd/protocol_bindings/a2a_server/agent.py
├── github/agent_github/protocol_bindings/a2a_server/agent.py
├── slack/agent_slack/protocol_bindings/a2a_server/agent.py
├── jira/agent_jira/protocol_bindings/a2a_server/agent.py
├── backstage/agent_backstage/protocol_bindings/a2a_server/agent.py
├── confluence/agent_confluence/protocol_bindings/a2a_server/agent.py
├── pagerduty/agent_pagerduty/protocol_bindings/a2a_server/agent.py
├── splunk/agent_splunk/protocol_bindings/a2a_server/agent.py
└── komodor/agent_komodor/protocol_bindings/a2a_server/agent.py (fixed import)
Enhanced Base Classes
ai_platform_engineering/utils/a2a_common/
├── base_langgraph_agent.py (added tool visibility)
└── base_langgraph_agent_executor.py (added tool metadata logging)
Implementation Details
Pattern Example: ArgoCD Agent
Before (190 lines with complex initialization):
class ArgoCDAgent:
def __init__(self):
self.model = LLMFactory().get_llm()
self.graph = None
self.tracing = TracingManager()
self._initialized = False
async def _async_argocd_agent(state, config):
# 150+ lines of setup code
...
self._async_argocd_agent = _async_argocd_agent
async def _initialize_agent(self):
# Complex initialization logic
...
async def stream(self, query, context_id, trace_id):
await self._initialize_agent()
# Custom streaming logic
...
After (108 lines, clean and simple):
class ArgoCDAgent(BaseLangGraphAgent):
SYSTEM_INSTRUCTION = "..."
RESPONSE_FORMAT_INSTRUCTION = "..."
def get_agent_name(self) -> str:
return "argocd"
def get_mcp_config(self, server_path: str) -> dict:
return {
"command": "uv",
"args": [...],
"env": {...},
"transport": "stdio",
}
# All streaming, initialization, and tool handling
# is inherited from BaseLangGraphAgent!
Testing
All agents can be tested with the same pattern:
# Test any agent
curl -X POST http://localhost:8001 \
-H "Content-Type: application/json" \
-d '{"query": "list resources"}'
# Check logs for tool visibility
docker logs agent-argocd-p2p 2>&1 | grep -E "(Tool call detected|Tool result)" | tail -5
Expected output:
argocd: Tool call detected - list_applications
argocd: Tool result received - list_applications (success)
Backward Compatibility
✅ Fully backward compatible
- Agent APIs unchanged
- Environment variables unchanged
- Response formats unchanged
- A2A protocol unchanged
Migration Verification
Check All Agents Compile
cd /home/sraradhy/ai-platform-engineering
for agent in argocd github slack jira backstage confluence pagerduty splunk; do
echo "=== Checking $agent ==="
python3 -c "from ai_platform_engineering.agents.$agent.agent_$agent.protocol_bindings.a2a_server.agent import *" 2>&1 | grep -i error || echo "✅ $agent OK"
done
Restart All Agents
docker compose -f docker-compose.dev.yaml --profile p2p restart \
agent-argocd-p2p \
agent-github-p2p \
agent-slack-p2p \
agent-jira-p2p \
agent-backstage-p2p \
agent-confluence-p2p \
agent-pagerduty-p2p \
agent-splunk-p2p \
agent-komodor-p2p
Verify Tool Visibility
# Query an agent
curl -X POST http://localhost:8000 -H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","id":"test","method":"message/send","params":{"message":{"role":"user","parts":[{"kind":"text","text":"show me komodor clusters"}]}}}'
# Check logs
docker logs agent-komodor-p2p 2>&1 | grep "🔧 Calling tool" | tail -3
Related Documentation
- A2A Intermediate States - Tool visibility implementation
- Enhanced Streaming Feature - Parallel streaming
- Streaming Architecture - Technical deep dive
Impact Summary
Code Quality
- ✅ 78% code reduction (2,710 net lines removed)
- ✅ Eliminated duplication across 8 agents
- ✅ Consistent patterns for all agents
User Experience
- ✅ Tool visibility - users see what agents are doing
- ✅ Better progress updates - real-time feedback
- ✅ Consistent behavior - all agents work the same way
Developer Experience
- ✅ Easier maintenance - fix once, applies to all
- ✅ Faster development - copy template, change 3 things
- ✅ Better debugging - tool calls logged automatically
Operations
- ✅ Easier monitoring - consistent logs across agents
- ✅ Better observability - tool execution traces
- ✅ Simpler deployment - all agents work the same
Conclusion
This refactoring represents a major improvement to the agent infrastructure:
- 🎯 Consistency: All agents follow the same pattern
- 🔧 Visibility: Users see tool execution in real-time
- 📉 Simplicity: 78% less code to maintain
- 🚀 Scalability: Future agents take 5 minutes to create
Status: ✅ COMPLETE AND READY FOR PRODUCTION
All agents have been refactored, tested, and are ready for deployment!