Splunk Agent

🤖 Splunk Agent is an LLM-powered agent built using the LangGraph ReAct Agent workflow and Splunk MCP Server.
🌐 Protocol Support: Compatible with A2A protocol for integration with external user clients.
🛡️ Secure by Design: Enforces Splunk API token-based RBAC and supports secondary external authentication for strong access control.
🏭 MCP Server: The MCP server is generated by our first-party openapi-mcp-codegen utility, ensuring version/API compatibility and software supply chain integrity.
🔌 MCP Tools: Uses langchain-mcp-adapters to glue the tools from Splunk MCP server to LangGraph ReAct Agent Graph.

🏗️ Architecture

Detailed Sequence Diagram with Agentgateway

System Diagram

Sequence Diagram

⚙️ Local Development Setup

Use this setup to test the agent against a Splunk instance.

🔑 Get Splunk API Token

Log in to your Splunk instance
Go to Settings → Data Inputs → HTTP Event Collector
Create a new token with appropriate permissions
Save the token for your .env file

Add to your .env:

SPLUNK_TOKEN=<your_token>
SPLUNK_API_URL=https://your-splunk-instance.com/api
SPLUNK_VERIFY_SSL=true

Local Development

# Navigate to the Splunk agent directory
cd ai_platform_engineering/agents/splunk

# Run the MCP server in stdio mode
make run-a2a

✨ Features

Log Search & Analytics: Search logs, run queries, and analyze data
Alert Management: Create, update, and manage alerts and detectors
Incident Management: Handle incidents and track their status
Team Management: Manage teams and team members
System Monitoring: Monitor system health and performance metrics
Data Ingestion: Manage data sources and ingestion pipelines
API Integration: Full Splunk API coverage through MCP tools

🎯 Example Use Cases

Ask the agent natural language questions like:

Log Analysis

Error Investigation: "Search for error logs in the last 24 hours from the web application"
Performance Analysis: "Show me the top 10 slowest API calls from yesterday"
Security Monitoring: "Find all failed login attempts in the last hour"

Alert Management

Alert Creation: "Create an alert for when CPU usage exceeds 80% for more than 5 minutes"
Alert Monitoring: "Show me all active alerts and their current status"
Alert Configuration: "Update the threshold for the database connection alert"

System Health

Health Check: "Show me the current system health and any active alerts"
Performance Metrics: "Display the average response time for the last 7 days"
Resource Usage: "What's the current memory and CPU utilization?"

Incident Response

Incident Management: "List all open incidents and their current status"
Incident Investigation: "Help me investigate the cause of the recent service outage"
Incident Resolution: "Update the status of incident INC-123 to resolved"

Data Management

Data Sources: "List all configured data sources and their status"
Data Ingestion: "Check the health of the log ingestion pipeline"
Data Retention: "Show me the data retention policies for different log types"

🏗️ Architecture​

System Diagram​

Sequence Diagram​

⚙️ Local Development Setup​

🔑 Get Splunk API Token​

Local Development​

✨ Features​

🎯 Example Use Cases​

Log Analysis​

Alert Management​

System Health​

Incident Response​

Data Management​

🏗️ Architecture

System Diagram

Sequence Diagram

⚙️ Local Development Setup

🔑 Get Splunk API Token

Local Development

✨ Features

🎯 Example Use Cases

Log Analysis

Alert Management

System Health

Incident Response

Data Management