Skip to main content

Architecture: Agent Refactoring: Unified BaseLangGraphAgent Implementation

Date: 2024-10-22

Agents Refactored​

AgentStatusLines RemovedLines AddedReduction
ArgoCD✅ Complete~190~10843%
GitHub✅ Complete~2100~10895%
Slack✅ Complete~250~9263%
Jira✅ Complete~200~9154%
Backstage✅ Complete~180~8951%
Confluence✅ Complete~180~8652%
PagerDuty✅ Complete~180~8851%
Splunk✅ Complete~180~8851%
Komodor✅ Already usingN/AN/AN/A
TOTAL✅ Complete~3,460~75078%

File Changes​

Modified Files​

ai_platform_engineering/agents/
├── argocd/agent_argocd/protocol_bindings/a2a_server/agent.py
├── github/agent_github/protocol_bindings/a2a_server/agent.py
├── slack/agent_slack/protocol_bindings/a2a_server/agent.py
├── jira/agent_jira/protocol_bindings/a2a_server/agent.py
├── backstage/agent_backstage/protocol_bindings/a2a_server/agent.py
├── confluence/agent_confluence/protocol_bindings/a2a_server/agent.py
├── pagerduty/agent_pagerduty/protocol_bindings/a2a_server/agent.py
├── splunk/agent_splunk/protocol_bindings/a2a_server/agent.py
└── komodor/agent_komodor/protocol_bindings/a2a_server/agent.py (fixed import)

Enhanced Base Classes​

ai_platform_engineering/utils/a2a_common/
├── base_langgraph_agent.py (added tool visibility)
└── base_langgraph_agent_executor.py (added tool metadata logging)

Implementation Details​

Pattern Example: ArgoCD Agent​

Before (190 lines with complex initialization):

class ArgoCDAgent:
def __init__(self):
self.model = LLMFactory().get_llm()
self.graph = None
self.tracing = TracingManager()
self._initialized = False

async def _async_argocd_agent(state, config):
# 150+ lines of setup code
...

self._async_argocd_agent = _async_argocd_agent

async def _initialize_agent(self):
# Complex initialization logic
...

async def stream(self, query, context_id, trace_id):
await self._initialize_agent()
# Custom streaming logic
...

After (108 lines, clean and simple):

class ArgoCDAgent(BaseLangGraphAgent):
SYSTEM_INSTRUCTION = "..."
RESPONSE_FORMAT_INSTRUCTION = "..."

def get_agent_name(self) -> str:
return "argocd"

def get_mcp_config(self, server_path: str) -> dict:
return {
"command": "uv",
"args": [...],
"env": {...},
"transport": "stdio",
}

# All streaming, initialization, and tool handling
# is inherited from BaseLangGraphAgent!

Backward Compatibility​

✅ Fully backward compatible

  • Agent APIs unchanged
  • Environment variables unchanged
  • Response formats unchanged
  • A2A protocol unchanged

Migration Verification​

Check All Agents Compile​

cd /home/sraradhy/ai-platform-engineering
for agent in argocd github slack jira backstage confluence pagerduty splunk; do
echo "=== Checking $agent ==="
python3 -c "from ai_platform_engineering.agents.$agent.agent_$agent.protocol_bindings.a2a_server.agent import *" 2>&1 | grep -i error || echo "✅ $agent OK"
done

Restart All Agents​

docker compose -f docker-compose.dev.yaml --profile p2p restart \
agent-argocd-p2p \
agent-github-p2p \
agent-slack-p2p \
agent-jira-p2p \
agent-backstage-p2p \
agent-confluence-p2p \
agent-pagerduty-p2p \
agent-splunk-p2p \
agent-komodor-p2p

Verify Tool Visibility​

# Query an agent
curl -X POST http://localhost:8000 -H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","id":"test","method":"message/send","params":{"message":{"role":"user","parts":[{"kind":"text","text":"show me komodor clusters"}]}}}'

# Check logs
docker logs agent-komodor-p2p 2>&1 | grep "🔧 Calling tool" | tail -3

Impact Summary​

Code Quality​

  • ✅ 78% code reduction (2,710 net lines removed)
  • ✅ Eliminated duplication across 8 agents
  • ✅ Consistent patterns for all agents

User Experience​

  • ✅ Tool visibility - users see what agents are doing
  • ✅ Better progress updates - real-time feedback
  • ✅ Consistent behavior - all agents work the same way

Developer Experience​

  • ✅ Easier maintenance - fix once, applies to all
  • ✅ Faster development - copy template, change 3 things
  • ✅ Better debugging - tool calls logged automatically

Operations​

  • ✅ Easier monitoring - consistent logs across agents
  • ✅ Better observability - tool execution traces
  • ✅ Simpler deployment - all agents work the same

Conclusion​

This refactoring represents a major improvement to the agent infrastructure:

  • 🎯 Consistency: All agents follow the same pattern
  • 🔧 Visibility: Users see tool execution in real-time
  • 📉 Simplicity: 78% less code to maintain
  • 🚀 Scalability: Future agents take 5 minutes to create

Status: ✅ COMPLETE AND READY FOR PRODUCTION

All agents have been refactored, tested, and are ready for deployment!