RAG and Git Agents
1. Overview​
This is the third part of the AI agents lab series. In this part, you'll learn about Retrieval-Augmented Generation (RAG) and build a multi-agent system that combines knowledge retrieval with version control automation.
What you'll learn in this part:
- Core concepts of Retrieval-Augmented Generation (RAG)
- How RAG enhances LLM responses with external knowledge
- Vector databases and semantic search
- Building a RAG-powered agent for documentation queries
- Integrating Git automation with AI agents
- Coordinating multiple specialized agents for complex workflows
Prerequisites:
- Completion of Part 1 and Part 2
- Understanding of multi-agent systems and the A2A protocol
- Access to Azure OpenAI (credentials provided in lab environment)
- GitHub personal access token (provided in lab environment)
2. Understanding Retrieval-Augmented Generation (RAG)​
2.1 What is RAG?​
Retrieval-Augmented Generation (RAG) is a technique that enhances LLM responses by retrieving relevant information from external knowledge sources before generating an answer. Instead of relying solely on the model's training data, RAG dynamically fetches up-to-date, domain-specific information.
[!NOTE] Think of RAG like an open-book exam: instead of memorizing everything, the LLM can look up relevant information when needed, leading to more accurate and current responses.
Key benefits of RAG:
- Up-to-date information: Access current data beyond the model's training cutoff
- Domain expertise: Incorporate specialized knowledge not in the base model
- Reduced hallucinations: Ground responses in actual retrieved documents
- Transparency: Cite sources and provide evidence for answers
- Cost-effective: Avoid expensive model retraining for new information
2.2 How RAG Works​
RAG operates in two main phases: ingestion and retrieval.
Ingestion Phase​
This is the process of preparing your knowledge base:
- Document Collection: Gather documents (web pages, PDFs, markdown files, etc.)
- Content Extraction: Parse and extract text from various formats
- Chunking: Split documents into smaller, manageable pieces
- Embedding: Convert text chunks into vector representations
- Storage: Store embeddings in a vector database with metadata
Why chunking matters:
- LLMs have token limits for context windows
- Smaller chunks enable more precise retrieval
- Better matching between queries and relevant content
Common chunking strategies:
- Recursive Text Splitter: Splits on paragraphs, then sentences, then words
- Fixed-size chunks: Equal-sized segments with optional overlap
- Semantic chunking: Split based on topic or meaning boundaries
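To make the fixed-size strategy concrete, here is a minimal sketch that splits text into word-based chunks with overlap. The function name `chunk_words` and the word-based sizing are assumptions for illustration only; the lab's RAG server uses a Recursive Text Splitter, which this does not reproduce.

```shell
# chunk_words SIZE OVERLAP: read text on stdin, print one chunk per line.
# Each chunk holds up to SIZE words and repeats the last OVERLAP words of
# the previous chunk. Hypothetical helper, not part of the lab repository.
chunk_words() {
  awk -v size="$1" -v overlap="$2" '
    BEGIN {
      size += 0; overlap += 0
      if (size < 1 || overlap < 0 || overlap >= size) {
        print "chunk_words: need 0 <= overlap < size" > "/dev/stderr"
        bad = 1
        exit 1
      }
    }
    { for (i = 1; i <= NF; i++) words[++n] = $i }   # collect every word
    END {
      if (bad) exit 1
      step = size - overlap                         # how far each chunk advances
      for (start = 1; start <= n; start += step) {
        line = words[start]
        for (j = start + 1; j < start + size && j <= n; j++)
          line = line " " words[j]
        print line
        if (start + size > n) break                 # this chunk reached the end
      }
    }'
}
```

For example, `printf 'a b c d e f g\n' | chunk_words 3 1` prints three chunks: `a b c`, `c d e`, and `e f g`. The overlap keeps a sentence that straddles a boundary visible in both neighboring chunks.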
Retrieval Phase​
This is how the system answers queries:
- Query Embedding: Convert the user's question into a vector
- Similarity Search: Find the most similar document chunks in the vector database
- Context Assembly: Gather the top-k most relevant chunks
- Augmented Generation: Pass the query + retrieved context to the LLM
- Response: LLM generates an answer grounded in the retrieved information
Key insight: The same embedding model must be used for both ingestion and retrieval to ensure vectors are in the same semantic space.
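The context-assembly and augmented-generation steps can be sketched as follows, assuming the similarity search has already produced lines of the form `score<TAB>chunk`. Both function names and the prompt wording are hypothetical; the lab's RAG server may assemble its prompt differently.

```shell
# top_k K: read "score<TAB>chunk" lines on stdin and print the K chunks
# with the highest scores, best first.
top_k() {
  LC_ALL=C sort -t "$(printf '\t')" -k1,1nr | head -n "$1" | cut -f2-
}

# build_prompt QUESTION: read chunks on stdin (one per line) and print a
# prompt that places the numbered context before the question.
build_prompt() {
  question="$1"
  printf 'Answer the question using only the context below.\n\nContext:\n'
  n=0
  while IFS= read -r chunk; do
    n=$((n + 1))
    printf '[%d] %s\n' "$n" "$chunk"
  done
  printf '\nQuestion: %s\n' "$question"
}
```

A pipeline such as `top_k 3 < scored_chunks.tsv | build_prompt "What is SLIM?"` (with `scored_chunks.tsv` as an assumed input file) yields the text that would be sent to the LLM. Numbering the chunks makes it easier for the model to cite its sources.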
2.3 Vector Databases​
Vector databases are specialized storage systems optimized for similarity search on high-dimensional vectors (embeddings).
Popular vector databases:
- Milvus: Open-source, highly scalable
- Pinecone: Managed service, easy to use
- Weaviate: GraphQL API, hybrid search
- Chroma: Lightweight, developer-friendly
How similarity search works:
Vector databases compare embeddings using similarity metrics such as:
- Cosine similarity: Measures angle between vectors
- Euclidean distance: Measures straight-line distance
- Dot product: Measures alignment of vectors
These enable finding semantically similar content even when exact words don't match.
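The three metrics are simple to compute by hand. The sketch below uses `awk` on two small vectors passed as space-separated strings; the function name `similarity` is an assumption for illustration, and a real vector database computes these over indexed, high-dimensional data rather than pairwise like this.

```shell
# similarity "x1 x2 ..." "y1 y2 ...": print the dot product, Euclidean
# distance, and cosine similarity of two equal-length vectors.
similarity() {
  LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN {
    n = split(a, x, " "); m = split(b, y, " ")
    if (n != m || n == 0) {
      print "similarity: vectors must be non-empty and equal length" > "/dev/stderr"
      exit 1
    }
    for (i = 1; i <= n; i++) {
      dot   += x[i] * y[i]           # alignment of the two vectors
      dist2 += (x[i] - y[i]) ^ 2     # squared straight-line distance
      nx    += x[i] ^ 2
      ny    += y[i] ^ 2
    }
    if (nx == 0 || ny == 0) {
      print "similarity: cosine is undefined for a zero vector" > "/dev/stderr"
      exit 1
    }
    printf "dot=%.4f euclidean=%.4f cosine=%.4f\n", dot, sqrt(dist2), dot / sqrt(nx * ny)
  }'
}
```

Running `similarity '1 2 3' '2 4 6'` prints `dot=28.0000 euclidean=3.7417 cosine=1.0000`: the vectors point in the same direction (cosine 1) even though they are not close in Euclidean terms, which is why cosine similarity is a common choice for text embeddings.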
2.4 RAG System Architecture​
In this lab, you'll deploy a complete RAG system with these components:
Components:
- RAG Server: Handles ingestion and retrieval operations
- RAG Agent: Interfaces with the supervisor using A2A protocol
- Milvus: Vector database for storing embeddings
- RAG Web UI: Interface for managing knowledge base
- Embedding Model: Azure OpenAI for generating embeddings
3. Introduction to Git Agent​
3.1 What is the Git Agent?​
The Git Agent is a specialized agent that automates version control operations. It can perform git commands like commits, pushes, and repository management through natural language instructions.
Capabilities:
- Create and commit files to repositories
- Push changes to remote branches
- Manage repository operations
- Handle authentication securely
Use cases:
- Automated documentation updates
- Report generation and archival
- Code snippet management
- Collaborative workflows with AI assistance
3.2 Multi-Agent Workflow​
In this lab, you'll see how the RAG and Git agents work together:
- Research: RAG agent retrieves information from documentation
- Synthesis: Supervisor coordinates report generation
- Persistence: Git agent commits the report to a repository
This demonstrates how specialized agents collaborate to complete complex, multi-step workflows.
4. Deploy the RAG and Git System​
Now let's deploy the multi-agent system with RAG and Git capabilities!
Clone the repository if it has not been cloned already:
if [ ! -d "$HOME/work/ai-platform-engineering" ]; then
  cd "$HOME/work"
  git clone https://github.com/cnoe-io/ai-platform-engineering
fi
Copy the example environment file:
cd $HOME/work/ai-platform-engineering
cp -f .env.example .env
Task 1: Configure Environment Variables​
Populate Azure OpenAI and GitHub credentials:
sed -i "s|^LLM_PROVIDER=.*|LLM_PROVIDER='${LLM_PROVIDER}'|" $HOME/work/ai-platform-engineering/.env
sed -i "s|^AZURE_OPENAI_API_KEY=.*|AZURE_OPENAI_API_KEY='${AZURE_OPENAI_API_KEY}'|" $HOME/work/ai-platform-engineering/.env
sed -i "s|^AZURE_OPENAI_API_VERSION=.*|AZURE_OPENAI_API_VERSION='${AZURE_OPENAI_API_VERSION}'|" $HOME/work/ai-platform-engineering/.env
sed -i "s|^AZURE_OPENAI_DEPLOYMENT=.*|AZURE_OPENAI_DEPLOYMENT='${AZURE_OPENAI_DEPLOYMENT}'|" $HOME/work/ai-platform-engineering/.env
sed -i "s|^AZURE_OPENAI_ENDPOINT=.*|AZURE_OPENAI_ENDPOINT='${AZURE_OPENAI_ENDPOINT}'|" $HOME/work/ai-platform-engineering/.env
sed -i "s|^GITHUB_PERSONAL_ACCESS_TOKEN=.*|GITHUB_PERSONAL_ACCESS_TOKEN=${GITHUB_PERSONAL_ACCESS_TOKEN}|" $HOME/work/ai-platform-engineering/.env
echo "EMBEDDINGS_PROVIDER=${LLM_PROVIDER}" >> $HOME/work/ai-platform-engineering/.env
What this does:
- Configures Azure OpenAI for LLM and embedding operations
- Sets up GitHub authentication for repository access
- Prepares the environment for RAG system and Git agent
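Because the `sed` commands silently write empty values when a lab variable is unset, it can help to confirm that each key ended up with a value. The following is a minimal sketch; `check_env_keys` is a hypothetical helper, and it deliberately prints only key names so that secrets never reach the terminal.

```shell
# check_env_keys FILE KEY...: print each KEY that is absent from FILE or
# has an empty value (including '' or ""). Returns non-zero if any is missing.
check_env_keys() {
  file="$1"; shift
  missing=0
  for key in "$@"; do
    # A key counts as set when KEY= is followed, after an optional quote,
    # by at least one character that is not a quote or whitespace.
    if ! grep -Eq "^${key}=['\"]?[^'\"[:space:]]" "$file"; then
      printf 'missing or empty: %s\n' "$key"
      missing=1
    fi
  done
  return "$missing"
}

# Example usage with the keys populated in this task:
# check_env_keys "$HOME/work/ai-platform-engineering/.env" \
#   LLM_PROVIDER AZURE_OPENAI_API_KEY AZURE_OPENAI_API_VERSION \
#   AZURE_OPENAI_DEPLOYMENT AZURE_OPENAI_ENDPOINT \
#   GITHUB_PERSONAL_ACCESS_TOKEN EMBEDDINGS_PROVIDER
```

If the function reports a key, re-check that the corresponding variable is exported in your lab shell and run the matching `sed` command again.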
Adjust the backend URLs that the UI uses so they match the lab environment:
sed -i "s|^NEXT_PUBLIC_A2A_BASE_URL=.*|NEXT_PUBLIC_A2A_BASE_URL=https://%%LABURL%%:3000|" $HOME/work/ai-platform-engineering/.env
sed -i "s|^NEXT_PUBLIC_RAG_URL=.*|NEXT_PUBLIC_RAG_URL=https://%%LABURL%%:19446|" $HOME/work/ai-platform-engineering/.env
Task 2: Enable RAG and Git Agents​
Enable the required agents and disable the previous ones:
sed -i "s|^ENABLE_RAG=.*|ENABLE_RAG=true|" $HOME/work/ai-platform-engineering/.env
sed -i "s|^NEXT_PUBLIC_RAG_ENABLED=.*|NEXT_PUBLIC_RAG_ENABLED=true|" $HOME/work/ai-platform-engineering/.env
sed -i "s|^ENABLE_GITHUB=.*|ENABLE_GITHUB=true|" $HOME/work/ai-platform-engineering/.env
sed -i "s|^ENABLE_CAIPE_UI=.*|ENABLE_CAIPE_UI=true|" $HOME/work/ai-platform-engineering/.env
sed -i "s|^ENABLE_WEATHER=.*|ENABLE_WEATHER=false|" $HOME/work/ai-platform-engineering/.env
sed -i "s|^ENABLE_NETUTILS=.*|ENABLE_NETUTILS=false|" $HOME/work/ai-platform-engineering/.env
What this does:
- Enables the RAG system for knowledge retrieval
- Enables the Git agent for version control operations
- Enables the CAIPE UI
- Disables the weather and NetUtils agents from Part 2
Task 3: Deploy the System​
Start all services:
cd $HOME/work/ai-platform-engineering
./deploy.sh
What this deploys:
The Docker Compose stack starts these services:
- caipe-supervisor: Platform engineer supervisor agent
- agent-github: Git automation agent
- rag_server: RAG backend server
- web-ingestor: Web content ingestion service
- caipe-ui: Web interface for agent management
- milvus-standalone: Vector database
- milvus-etcd: Milvus metadata storage
- milvus-minio: Milvus object storage
- rag-redis: Redis cache for RAG operations
[!NOTE] The deployment may take 2-3 minutes as it starts the vector database and all agents and services.
[!IMPORTANT] Wait until this process is completed before proceeding.
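One way to confirm that the deployment finished is to compare the running containers against the services listed above. This sketch assumes that container names match the service names, which the lab confirms only for `caipe-supervisor`; Docker Compose may prefix or suffix the others, so adjust the names to what `docker ps` actually shows. `missing_containers` is a hypothetical helper.

```shell
# missing_containers NAME...: print each NAME that `docker ps` does not list
# as a running container. Returns non-zero if any is missing.
missing_containers() {
  running="$(docker ps --format '{{.Names}}')"
  status=0
  for name in "$@"; do
    # -F: fixed string, -x: whole-line match, -q: quiet
    if ! printf '%s\n' "$running" | grep -Fxq "$name"; then
      printf 'not running: %s\n' "$name"
      status=1
    fi
  done
  return "$status"
}

# Example usage (names assumed to equal the service names):
# missing_containers caipe-supervisor agent-github rag_server web-ingestor \
#   caipe-ui milvus-standalone milvus-etcd milvus-minio rag-redis
```

An empty result means every expected container is up; any `not running:` line points at the service to inspect with `docker logs`.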
Task 4: Verify Supervisor Agent​
Check that the supervisor agent is healthy:
curl http://localhost:8000/.well-known/agent.json | jq
Expected output: A JSON object containing the A2A agent card with capabilities from RAG and Git agents.
[!NOTE] If you get an error instead of a JSON object, wait 1-2 minutes and try again; the agents may still be starting up.
What to look for:
- ✅ RAG-related capabilities (search, retrieval)
- ✅ Git-related capabilities (commit, push)
- ✅ Valid JSON structure
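Instead of retrying by hand, the check can be wrapped in a small polling loop. This is a sketch under assumptions: `wait_for_agent_card` is a hypothetical helper, the default of 12 attempts with a 10-second delay is arbitrary, and "looks like JSON" is approximated by checking that the body starts with `{`.

```shell
# wait_for_agent_card URL [ATTEMPTS] [DELAY_SECONDS]
# Poll URL until it returns a body that starts with "{", then print it.
wait_for_agent_card() {
  url="$1"; attempts="${2:-12}"; delay="${3:-10}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl fail on HTTP errors; -sS hides the progress bar but
    # still prints error messages.
    if card="$(curl -fsS "$url")"; then
      case "$card" in
        '{'*) printf '%s\n' "$card"; return 0 ;;   # looks like a JSON object
      esac
    fi
    echo "attempt $i/$attempts: agent card not ready yet" >&2
    if [ "$i" -lt "$attempts" ]; then sleep "$delay"; fi
    i=$((i + 1))
  done
  return 1
}

# Example usage:
# wait_for_agent_card http://localhost:8000/.well-known/agent.json | jq
```

The function returns a non-zero status if the card never becomes available, so it can also gate later steps in a script.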
The supervisor agent automatically detects which agents and tools are running and builds its capabilities around them. Detection is continuous: the supervisor re-checks the environment for changes every 5 minutes. If the RAG MCP server was not yet running during the first detection, you can restart the caipe-supervisor container to force a new detection instead of waiting 5 minutes. First, check whether the RAG tools were identified at startup:
docker logs caipe-supervisor 2>&1 | grep "RAG tools"
The output should look similar to:
$ docker logs caipe-supervisor 2>&1 | grep "RAG tools"
2026-02-07 00:02:46 [ai_platform_engineering.multi_agents.platform_engineer.deep_agent] [INFO] [_load_rag_tools:165] Loading RAG tools from MCP server...
2026-02-07 00:02:46 [ai_platform_engineering.multi_agents.platform_engineer.deep_agent] [INFO] [_load_rag_tools:167] ✅ Loaded 3 RAG tools: ['search', 'fetch_document', 'fetch_datasources_and_entity_types']
2026-02-07 00:02:46 [ai_platform_engineering.multi_agents.platform_engineer.deep_agent] [INFO] [_build_graph:220] ✅📚 Loaded 3 RAG tools at startup
2026-02-07 00:02:46 [ai_platform_engineering.multi_agents.platform_engineer.deep_agent] [INFO] [_build_graph:289] ✅📚 Added 3 RAG tools to supervisor
If these lines are missing, restart the supervisor agent and repeat the checks above:
cd $HOME/work/ai-platform-engineering
docker-compose up -d --force-recreate --no-deps caipe-supervisor
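The log check and the restart can be combined so that the container is recreated only when needed. This is a minimal sketch: both function names are hypothetical, and the pattern relies on the `Loaded <n> RAG tools` wording shown in the sample log output above, which may change between releases.

```shell
# rag_tools_loaded CONTAINER: succeed if the container log contains a
# "Loaded <n> RAG tools" line.
rag_tools_loaded() {
  docker logs "$1" 2>&1 | grep -q 'Loaded [0-9][0-9]* RAG tools'
}

# ensure_rag_tools: recreate the supervisor only if the RAG tools are missing.
ensure_rag_tools() {
  if rag_tools_loaded caipe-supervisor; then
    echo "RAG tools already loaded"
  else
    echo "RAG tools not found in log; recreating caipe-supervisor"
    (cd "$HOME/work/ai-platform-engineering" &&
      docker-compose up -d --force-recreate --no-deps caipe-supervisor)
  fi
}
```

After a restart, give the supervisor a moment to start and then run `ensure_rag_tools` again to confirm the tools were picked up.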
5. Populate the RAG Knowledge Base​
Task 5: Open the CAIPE UI
Access the RAG management interface:
Open RAG UI
Task 6: Ingest AGNTCY Documentation​
Once the CAIPE UI is open, select the Knowledge Bases section and follow these steps:
1. Copy the documentation URL:
https://docs.agntcy.org
2. Paste it in the Ingest URL field
3. Click the Ingest button
[!NOTE]
- The server should start ingesting the docs. You can click on the datasource to see the progress.
- Some URLs may take longer, but feel free to move forward while ingestion continues.
Task 7: Understand the Ingestion Process​
What's happening behind the scenes:
- Crawling: The RAG server crawls the webpage (supports sitemaps) and fetches all pages
- Parsing: HTML is parsed and content is extracted
- Chunking: Pages are split into chunks using Recursive Text Splitter
- Embedding: Each chunk is sent to the embedding model to generate vector embeddings
- Storage: Embeddings are stored in Milvus along with metadata (source, title, description, etc.)
Why this matters:
- The chunking strategy affects retrieval quality
- Metadata enables filtering and source attribution
- Vector embeddings capture semantic meaning
- The vector database enables fast similarity search
6. Test the RAG System​
Task 8: Verify RAG Retrieval​
Let's test the RAG system directly through the UI.
1. Navigate to the Search option of the Knowledge Bases tab
2. Type this query in the search box:
What is SLIM
3. Click the Search button
[!NOTE] The response should return relevant document chunks. The chunks may not be formatted in a way that is easy to read. As long as some document chunks are returned, the RAG system is working.
What you're seeing:
- Raw document chunks retrieved from the vector database
- Similarity scores indicating relevance
- Source metadata (URL, title, etc.)
This is the raw retrieval output before the LLM synthesizes it into a coherent answer.
7. Interact with the RAG Agent​
Task 9: Switch to the Chat tab of the CAIPE UI
Task 10: Query the RAG Agent​
Ask the agent about AGNTCY:
Tell me more about SLIM in AGNTCY
[!NOTE] The agent should respond with information about the SLIM protocol, synthesized from the retrieved documentation.
What's happening behind the scenes:
- Query Embedding: Your question is converted to a vector using the same embedding model
- Similarity Search: The vector database finds the most similar document chunks
- Context Retrieval: Top-k relevant chunks are retrieved
- Augmented Generation: The LLM receives your question + retrieved context
- Response Synthesis: The LLM generates a coherent answer grounded in the documentation
Key difference from raw search:
- The LLM synthesizes information from multiple chunks
- The response is coherent and conversational
- Sources can be cited for transparency
8. Multi-Agent Workflow: RAG + Git​
Task 11: Execute a Complex Multi-Agent Task​
Now let's test a workflow that requires both the RAG and Git agents to collaborate.
In the chat, ask:
Research and write a report on AGNTCY in markdown format, wait for this to be completed then commit this report under a file named '%%LABNAME%%-report.md' with commit message "agntcy-report" to repo %%REPO_URL%% on the main branch.
[!NOTE] The agent should have:
- Created a report named %%LABNAME%%-report.md
- Committed it to the CAIPE Labs git repository with commit message "agntcy-report"
Task 12: Understand the Multi-Agent Coordination​
What happened in this workflow:
- Task Analysis: The supervisor agent parsed the complex request and identified two sub-tasks
- Research Phase:
- Supervisor delegates to RAG agent
- RAG agent searches the knowledge base for AGNTCY information
- RAG agent synthesizes findings into a markdown report
- Persistence Phase:
- Supervisor waits for research completion
- Supervisor delegates to Git agent
- Git agent creates the file and commits it to the repository
- Confirmation: Supervisor reports success back to the user
Key coordination patterns:
- Sequential execution: Git task waits for RAG task completion
- Data passing: Report content flows from RAG agent to Git agent
- Error handling: Agents report failures back to supervisor
- State management: Supervisor tracks overall workflow progress
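The first three patterns can be illustrated with plain shell functions. This is only an analogy: `research`, `commit_report`, and `run_workflow` are hypothetical stand-ins, and in the lab the supervisor delegates these steps to agents over A2A rather than calling local functions.

```shell
# Hypothetical stand-in for the RAG agent: prints a report for a topic.
research() { printf '# Report on %s\n' "$1"; }

# Hypothetical stand-in for the Git agent: reads the report on stdin and
# "commits" it by writing it to the file named in $1.
commit_report() { cat > "$1" && echo "committed $1"; }

# run_workflow TOPIC TARGET_FILE: plays the supervisor role.
run_workflow() {
  topic="$1"; target="$2"
  # Sequential execution: the commit step starts only after research succeeds.
  if ! report="$(research "$topic")"; then
    echo "research failed" >&2   # error handling: report the failure upward
    return 1
  fi
  # Data passing: the output of step 1 becomes the input of step 2.
  if ! printf '%s\n' "$report" | commit_report "$target"; then
    echo "commit failed" >&2
    return 2
  fi
}
```

Distinct return codes let the caller tell which phase failed, which mirrors how the supervisor needs to know whether to retry research or persistence.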
This demonstrates the power of multi-agent systems: complex workflows are broken down and distributed to specialized agents, each doing what it does best.
9. Verify the Git Commit​
Task 13: Check the Repository​
Visit the CAIPE Labs repository to verify your report was committed:
Repository URL:
%%REPO_URL%%
What to look for:
- ✅ A file named %%LABNAME%%-report.md in the repository
- ✅ Commit message: "agntcy-report"
- ✅ Content about AGNTCY from the RAG knowledge base
- ✅ Proper markdown formatting
What this proves:
- The RAG agent successfully retrieved and synthesized information
- The Git agent successfully authenticated and pushed changes
- The supervisor correctly coordinated the multi-step workflow
- Agents can interact with external systems (GitHub) autonomously
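The same verification can be done from the command line with a local clone. The sketch below assumes you have already cloned the repository into a directory of your choice; `last_commit_subject` is a hypothetical helper built on standard `git log` options.

```shell
# last_commit_subject REPO_DIR FILE: print the subject line of the most
# recent commit that touched FILE. Fails if FILE has no history in REPO_DIR.
last_commit_subject() {
  subject="$(git -C "$1" log -1 --format=%s -- "$2")" || return 1
  [ -n "$subject" ] || return 1
  printf '%s\n' "$subject"
}

# Example usage, after cloning the lab repository into ./caipe-labs
# (directory name is an assumption):
# last_commit_subject ./caipe-labs "<your lab name>-report.md"
```

If the workflow succeeded, the printed subject should be `agntcy-report`. Use a full clone rather than a shallow one, because a shallow clone can attribute every file to its single boundary commit.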
10. Clean Up​
Task 14: Stop the System​
When you're done exploring, stop all containers:
cd $HOME/work/ai-platform-engineering
./deploy.sh stop
What this does:
- Gracefully shuts down all agent containers
- Stops the RAG server and vector database
- Stops the CAIPE UI
- Cleans up network connections
[!NOTE] The vector database data persists in Docker volumes. If you restart the system, your ingested documents will still be available.
11. Summary​
Congratulations! You've completed Part 3 of the AI Agents lab series. Here's what you accomplished:
✅ Understood Retrieval-Augmented Generation (RAG) concepts
✅ Learned how vector databases enable semantic search
✅ Deployed a complete RAG system with Milvus
✅ Ingested documentation into a knowledge base
✅ Tested RAG retrieval and generation
✅ Used the Git agent for version control automation
✅ Orchestrated a complex multi-agent workflow
Key Takeaways from Part 3​
- RAG grounds LLM responses in external knowledge - Reduces hallucinations and provides up-to-date information
- Vector databases enable semantic search - Find relevant content based on meaning, not just keywords
- Chunking strategy affects retrieval quality - Proper document splitting is crucial for good results
- Embeddings capture semantic meaning - Same model must be used for ingestion and retrieval
- Multi-agent workflows enable complex automation - Specialized agents collaborate on multi-step tasks
RAG System Components​
- Ingestion Pipeline: Crawling → Parsing → Chunking → Embedding → Storage
- Retrieval Pipeline: Query Embedding → Similarity Search → Context Assembly → Generation
- Vector Database: Milvus for high-performance similarity search
- Agents: RAG agent for knowledge retrieval, Git agent for version control
Multi-Agent Coordination Patterns​
- Sequential execution: Tasks with dependencies run in order
- Data passing: Information flows between agents
- Error handling: Failures propagate to supervisor
- State management: Supervisor tracks workflow progress
What's Next?​
- Part 4: Tracing and Observability — Add Langfuse and observe agent interactions end-to-end.
Then explore advanced topics:
- Fine-tuning embedding models for your domain
- Implementing hybrid search (vector + keyword)
- Building custom agents for your workflows
- Scaling RAG systems for production
- Advanced chunking and retrieval strategies
Additional Resources​
For deeper exploration:
- LangChain RAG Tutorial: Comprehensive RAG implementation guide
- Recursive Text Splitter: Chunking strategies
- Milvus Documentation: Vector database operations
- RAG Best Practices: Optimization techniques
- CAIPE GitHub Repository: Source code and examples
Part 3 Complete! You now understand how to build RAG-powered agents and orchestrate complex multi-agent workflows that combine knowledge retrieval with external system automation.