
RAG and Git Agents

1. Overview

This is the third part of the AI agents lab series. In this part, you'll learn about Retrieval-Augmented Generation (RAG) and build a multi-agent system that combines knowledge retrieval with version control automation.

What you'll learn in this part:

  • Core concepts of Retrieval-Augmented Generation (RAG)
  • How RAG enhances LLM responses with external knowledge
  • Vector databases and semantic search
  • Building a RAG-powered agent for documentation queries
  • Integrating Git automation with AI agents
  • Coordinating multiple specialized agents for complex workflows

Prerequisites:

  • Completion of Part 1 and Part 2
  • Understanding of multi-agent systems and the A2A protocol
  • Access to Azure OpenAI (credentials provided in lab environment)
  • GitHub personal access token (provided in lab environment)

2. Understanding Retrieval-Augmented Generation (RAG)

2.1 What is RAG?

Retrieval-Augmented Generation (RAG) is a technique that enhances LLM responses by retrieving relevant information from external knowledge sources before generating an answer. Instead of relying solely on the model's training data, RAG dynamically fetches up-to-date, domain-specific information.

[!NOTE] Think of RAG like an open-book exam: instead of memorizing everything, the LLM can look up relevant information when needed, leading to more accurate and current responses.

Key benefits of RAG:

  • Up-to-date information: Access current data beyond the model's training cutoff
  • Domain expertise: Incorporate specialized knowledge not in the base model
  • Reduced hallucinations: Ground responses in actual retrieved documents
  • Transparency: Cite sources and provide evidence for answers
  • Cost-effective: Avoid expensive model retraining for new information

2.2 How RAG Works

RAG operates in two main phases: ingestion and retrieval.

Ingestion Phase

This is the process of preparing your knowledge base:

  1. Document Collection: Gather documents (web pages, PDFs, markdown files, etc.)
  2. Content Extraction: Parse and extract text from various formats
  3. Chunking: Split documents into smaller, manageable pieces
  4. Embedding: Convert text chunks into vector representations
  5. Storage: Store embeddings in a vector database with metadata

[Figure: RAG Ingestion Process]

Why chunking matters:

  • LLMs have token limits for context windows
  • Smaller chunks enable more precise retrieval
  • Better matching between queries and relevant content

Common chunking strategies:

  • Recursive Text Splitter: Splits on paragraphs, then sentences, then words
  • Fixed-size chunks: Equal-sized segments with optional overlap
  • Semantic chunking: Split based on topic or meaning boundaries
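
As a rough illustration of the fixed-size strategy with overlap, here is a minimal sketch. The function name `chunk_text` and the character-based sizes are assumptions made for this example; the lab's RAG server uses its own splitter, and real systems often measure chunk size in tokens rather than characters.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks that overlap by `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


# Each chunk repeats the last character of the previous one (overlap=1).
print(chunk_text("abcdefghij", chunk_size=4, overlap=1))
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.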

Retrieval Phase

This is how the system answers queries:

  1. Query Embedding: Convert the user's question into a vector
  2. Similarity Search: Find the most similar document chunks in the vector database
  3. Context Assembly: Gather the top-k most relevant chunks
  4. Augmented Generation: Pass the query + retrieved context to the LLM
  5. Response: LLM generates an answer grounded in the retrieved information

[Figure: RAG Agent Architecture]

Key insight: The same embedding model must be used for both ingestion and retrieval to ensure vectors are in the same semantic space.
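
The retrieval steps can be sketched end to end with a toy example. Everything here is an assumption for illustration: `embed` is a bag-of-words counter standing in for a real embedding model, and the three sample chunks are made up. The point is that the same `embed` function is applied to both the stored chunks and the query.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(query: str, index: list[tuple[str, Counter]], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)  # same embed() that was used at ingestion time
    scored = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in scored[:k]]


chunks = [
    "Milvus stores vector embeddings",
    "Git commits record changes to files",
    "Chunking splits documents into pieces",
]
index = [(c, embed(c)) for c in chunks]  # ingestion: embed and store each chunk

question = "where are embeddings stored"
context = top_k(question, index, k=1)    # retrieval: embed query, rank chunks
prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: " + question
print(prompt)
```

If ingestion and retrieval used different embedding functions, the query vector and the stored vectors would not be comparable and the ranking would be meaningless.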


2.3 Vector Databases

Vector databases are specialized storage systems optimized for similarity search on high-dimensional vectors (embeddings).

Popular vector databases:

  • Milvus: Open-source, highly scalable
  • Pinecone: Managed service, easy to use
  • Weaviate: GraphQL API, hybrid search
  • Chroma: Lightweight, developer-friendly

How similarity search works:

Vector databases compare embeddings using similarity metrics such as:

  • Cosine similarity: Measures angle between vectors
  • Euclidean distance: Measures straight-line distance
  • Dot product: Measures alignment of vectors

These enable finding semantically similar content even when exact words don't match.
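
To make the three metrics concrete, here is a small pure-Python sketch. The vectors are made-up two-dimensional examples; real embeddings have hundreds or thousands of dimensions, and a vector database computes these measures with optimized index structures rather than plain loops.

```python
import math


def dot(a: list[float], b: list[float]) -> float:
    """Dot product: grows with both alignment and vector length."""
    return sum(x * y for x, y in zip(a, b))


def euclidean(a: list[float], b: list[float]) -> float:
    """Straight-line distance between the two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between the vectors; ignores their lengths."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


a = [1.0, 0.0]
b = [2.0, 0.0]  # same direction as a, twice as long
c = [0.0, 3.0]  # perpendicular to a

print(cosine(a, b), euclidean(a, b), dot(a, b))
print(cosine(a, c), euclidean(a, c), dot(a, c))
```

Vectors `a` and `b` point in the same direction, so their cosine similarity is at its maximum even though their Euclidean distance is not zero. This is why cosine similarity is a common choice when only the direction of an embedding matters.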


2.4 RAG System Architecture

In this lab, you'll deploy a complete RAG system with these components:

[Figure: RAG Architecture Overview]

Components:

  • RAG Server: Handles ingestion and retrieval operations
  • RAG Agent: Interfaces with the supervisor using the A2A protocol
  • Milvus: Vector database for storing embeddings
  • RAG Web UI: Interface for managing knowledge base
  • Embedding Model: Azure OpenAI for generating embeddings

3. Introduction to Git Agent

3.1 What is the Git Agent?

The Git Agent is a specialized agent that automates version control operations. It performs Git operations such as commits, pushes, and repository management in response to natural-language instructions.

Capabilities:

  • Create and commit files to repositories
  • Push changes to remote branches
  • Manage repository operations
  • Handle authentication securely

Use cases:

  • Automated documentation updates
  • Report generation and archival
  • Code snippet management
  • Collaborative workflows with AI assistance

3.2 Multi-Agent Workflow

In this lab, you'll see how the RAG and Git agents work together:

  1. Research: RAG agent retrieves information from documentation
  2. Synthesis: Supervisor coordinates report generation
  3. Persistence: Git agent commits the report to a repository

This demonstrates how specialized agents collaborate to complete complex, multi-step workflows.


4. Deploy the RAG and Git System

Now let's deploy the multi-agent system with RAG and Git capabilities!

Clone the repository if it has not been cloned already:

if [ ! -d "$HOME/work/ai-platform-engineering" ]; then
  cd "$HOME/work"
  git clone https://github.com/cnoe-io/ai-platform-engineering
fi

Copy the example environment file:

cd $HOME/work/ai-platform-engineering
cp -f .env.example .env

Task 1: Configure Environment Variables

Populate Azure OpenAI and GitHub credentials:

sed -i "s|^LLM_PROVIDER=.*|LLM_PROVIDER='${LLM_PROVIDER}'|" $HOME/work/ai-platform-engineering/.env
sed -i "s|^AZURE_OPENAI_API_KEY=.*|AZURE_OPENAI_API_KEY='${AZURE_OPENAI_API_KEY}'|" $HOME/work/ai-platform-engineering/.env
sed -i "s|^AZURE_OPENAI_API_VERSION=.*|AZURE_OPENAI_API_VERSION='${AZURE_OPENAI_API_VERSION}'|" $HOME/work/ai-platform-engineering/.env
sed -i "s|^AZURE_OPENAI_DEPLOYMENT=.*|AZURE_OPENAI_DEPLOYMENT='${AZURE_OPENAI_DEPLOYMENT}'|" $HOME/work/ai-platform-engineering/.env
sed -i "s|^AZURE_OPENAI_ENDPOINT=.*|AZURE_OPENAI_ENDPOINT='${AZURE_OPENAI_ENDPOINT}'|" $HOME/work/ai-platform-engineering/.env
sed -i "s|^GITHUB_PERSONAL_ACCESS_TOKEN=.*|GITHUB_PERSONAL_ACCESS_TOKEN=${GITHUB_PERSONAL_ACCESS_TOKEN}|" $HOME/work/ai-platform-engineering/.env
echo "EMBEDDINGS_PROVIDER=${LLM_PROVIDER}" >> $HOME/work/ai-platform-engineering/.env

What this does:

  • Configures Azure OpenAI for LLM and embedding operations
  • Sets up GitHub authentication for repository access
  • Prepares the environment for RAG system and Git agent
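
If one of the shell variables was not exported in your session, the `sed` commands above write an empty value such as `AZURE_OPENAI_API_KEY=''`. A small check like the following can catch that before deployment. This is only a sketch: the helper names `parse_env` and `missing_keys` are made up for this example, and the parser handles simple `KEY=VALUE` lines only.

```python
REQUIRED_KEYS = [
    "LLM_PROVIDER",
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_API_VERSION",
    "AZURE_OPENAI_DEPLOYMENT",
    "AZURE_OPENAI_ENDPOINT",
    "GITHUB_PERSONAL_ACCESS_TOKEN",
    "EMBEDDINGS_PROVIDER",
]


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding single or double quotes from the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(text: str, required: list[str] = REQUIRED_KEYS) -> list[str]:
    """Return the required keys that are absent or have an empty value."""
    values = parse_env(text)
    return [key for key in required if not values.get(key)]


# Usage (path taken from the steps above):
#   from pathlib import Path
#   env_file = Path.home() / "work" / "ai-platform-engineering" / ".env"
#   print(missing_keys(env_file.read_text()))
```

An empty list means every required key has a non-empty value. It does not verify that the credentials themselves are valid.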

Adjust the backend URLs accessed from the UI to match the lab environment:

sed -i "s|^NEXT_PUBLIC_A2A_BASE_URL=.*|NEXT_PUBLIC_A2A_BASE_URL=https://%%LABURL%%:3000|" $HOME/work/ai-platform-engineering/.env
sed -i "s|^NEXT_PUBLIC_RAG_URL=.*|NEXT_PUBLIC_RAG_URL=https://%%LABURL%%:19446|" $HOME/work/ai-platform-engineering/.env

Task 2: Enable RAG and Git Agents

Enable the required agents and disable the previous ones:

sed -i "s|^ENABLE_RAG=.*|ENABLE_RAG=true|" $HOME/work/ai-platform-engineering/.env
sed -i "s|^NEXT_PUBLIC_RAG_ENABLED=.*|NEXT_PUBLIC_RAG_ENABLED=true|" $HOME/work/ai-platform-engineering/.env
sed -i "s|^ENABLE_GITHUB=.*|ENABLE_GITHUB=true|" $HOME/work/ai-platform-engineering/.env
sed -i "s|^ENABLE_CAIPE_UI=.*|ENABLE_CAIPE_UI=true|" $HOME/work/ai-platform-engineering/.env
sed -i "s|^ENABLE_WEATHER=.*|ENABLE_WEATHER=false|" $HOME/work/ai-platform-engineering/.env
sed -i "s|^ENABLE_NETUTILS=.*|ENABLE_NETUTILS=false|" $HOME/work/ai-platform-engineering/.env

What this does:

  • Enables the RAG system for knowledge retrieval
  • Enables the Git agent for version control operations
  • Enables the CAIPE UI
  • Disables the weather and NetUtils agents from Part 2

Task 3: Deploy the System

Start all services:

cd $HOME/work/ai-platform-engineering
./deploy.sh

What this deploys:

The Docker Compose stack starts these services:

  • caipe-supervisor: Platform engineer supervisor agent
  • agent-github: Git automation agent
  • rag_server: RAG backend server
  • web-ingestor: Web content ingestion service
  • caipe-ui: Web interface for agent management
  • milvus-standalone: Vector database
  • milvus-etcd: Milvus metadata storage
  • milvus-minio: Milvus object storage
  • rag-redis: Redis cache for RAG operations

[!NOTE] The deployment may take 2-3 minutes as it starts the vector database and all agents and services.

[!IMPORTANT] Wait until this process is completed before proceeding.


Task 4: Verify Supervisor Agent

Check that the supervisor agent is healthy:

curl http://localhost:8000/.well-known/agent.json | jq

Expected output: A JSON object containing the A2A agent card with capabilities from RAG and Git agents.

[!NOTE] If you get an error instead of the agent card, wait 1-2 minutes and try again; the agents may still be starting up.

What to look for:

  • ✅ RAG-related capabilities (search, retrieval)
  • ✅ Git-related capabilities (commit, push)
  • ✅ Valid JSON structure

The supervisor agent automatically detects which agents and tools are running and builds its capabilities around them. Detection is continuous: every 5 minutes it re-checks the environment for changes. If the RAG MCP server was not yet running at the first detection pass, you can restart the caipe-supervisor container to force a new detection instead of waiting 5 minutes. First, check whether the RAG tools were identified at startup:

docker logs caipe-supervisor 2>&1 | grep "RAG tools" 

The output should look similar to:

$ docker logs caipe-supervisor 2>&1 | grep "RAG tools"
2026-02-07 00:02:46 [ai_platform_engineering.multi_agents.platform_engineer.deep_agent] [INFO] [_load_rag_tools:165] Loading RAG tools from MCP server...
2026-02-07 00:02:46 [ai_platform_engineering.multi_agents.platform_engineer.deep_agent] [INFO] [_load_rag_tools:167] ✅ Loaded 3 RAG tools: ['search', 'fetch_document', 'fetch_datasources_and_entity_types']
2026-02-07 00:02:46 [ai_platform_engineering.multi_agents.platform_engineer.deep_agent] [INFO] [_build_graph:220] ✅📚 Loaded 3 RAG tools at startup
2026-02-07 00:02:46 [ai_platform_engineering.multi_agents.platform_engineer.deep_agent] [INFO] [_build_graph:289] ✅📚 Added 3 RAG tools to supervisor

If these log lines are missing, restart the supervisor agent and repeat the checks above:

cd $HOME/work/ai-platform-engineering
docker-compose up -d --force-recreate --no-deps caipe-supervisor
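
The log check can also be scripted. The sketch below assumes the log line format shown in the sample output above (`Loaded N RAG tools: [...]`); the function names are made up for this example, and the pattern would need adjusting if the log format changes.

```python
import ast
import re
import subprocess

# Matches lines such as: "... Loaded 3 RAG tools: ['search', 'fetch_document', ...]"
PATTERN = re.compile(r"Loaded \d+ RAG tools: (\[.*\])")


def loaded_rag_tools(log_text: str) -> list[str]:
    """Return the tool names from the last matching log line, or [] if none."""
    tools: list[str] = []
    for line in log_text.splitlines():
        match = PATTERN.search(line)
        if match:
            tools = ast.literal_eval(match.group(1))  # parse the Python-style list
    return tools


def supervisor_logs() -> str:
    """Fetch the container logs, combining stdout and stderr like `2>&1`."""
    result = subprocess.run(
        ["docker", "logs", "caipe-supervisor"],
        capture_output=True, text=True, check=False,
    )
    return result.stdout + result.stderr


# Usage: an empty list suggests the supervisor should be restarted.
#   print(loaded_rag_tools(supervisor_logs()))
```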

5. Populate the RAG Knowledge Base

Task 5: Open the CAIPE UI

Access the RAG management interface:

Open RAG UI


Task 6: Ingest AGNTCY Documentation

[Figure: RAG UI screenshot]

Once the CAIPE UI is open, select the Knowledge Bases section and follow these steps:

1. Copy the documentation URL:

https://docs.agntcy.org

2. Paste it in the Ingest URL field

3. Click the Ingest button

[!NOTE]

  • The server should start ingesting the docs. You can click on the datasource to see the progress.
  • Some URLs may take longer, but feel free to move forward while ingestion continues.

Task 7: Understand the Ingestion Process

What's happening behind the scenes:

  1. Crawling: The RAG server crawls the site (sitemaps are supported) and fetches its pages
  2. Parsing: HTML is parsed and content is extracted
  3. Chunking: Pages are split into chunks using a Recursive Text Splitter
  4. Embedding: Each chunk is sent to the embedding model to generate vector embeddings
  5. Storage: Embeddings are stored in Milvus along with metadata (source, title, description, etc.)

[Figure: RAG Ingestion Process]

Why this matters:

  • The chunking strategy affects retrieval quality
  • Metadata enables filtering and source attribution
  • Vector embeddings capture semantic meaning
  • The vector database enables fast similarity search
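
The parsing, chunking, and metadata steps can be sketched with the Python standard library. This is not the RAG server's implementation: the class `TextExtractor`, the function `build_records`, and the record field names are assumptions for illustration, and the chunking here is plain fixed-size slicing rather than a recursive splitter. In a full pipeline each record's text would then be embedded and stored in the vector database.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the page <title> and visible text, skipping <script> and <style>."""

    def __init__(self) -> None:
        super().__init__()
        self.title = ""
        self.parts: list[str] = []
        self._in_title = False
        self._skip = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style"):
            self._skip = False

    def handle_data(self, data):
        text = data.strip()
        if not text or self._skip:
            return
        if self._in_title:
            self.title = text
        else:
            self.parts.append(text)


def build_records(url: str, html: str, chunk_size: int = 200) -> list[dict]:
    """Parse one page and return chunk records that carry source metadata."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.parts)
    return [
        {"chunk_id": i, "source": url, "title": parser.title,
         "text": text[start:start + chunk_size]}
        for i, start in enumerate(range(0, len(text), chunk_size))
    ]


page = "<html><head><title>Doc</title></head><body><p>Hello world</p><p>Bye</p></body></html>"
for record in build_records("https://docs.agntcy.org", page, chunk_size=8):
    print(record)
```

Keeping `source` and `title` on every chunk is what later allows results to be filtered by datasource and attributed to a page.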

6. Test the RAG System

Task 8: Verify RAG Retrieval

Let's test the RAG system directly through the UI.

[Figure: RAG UI screenshot]

1. Navigate to the Search option of the Knowledge Bases tab

2. Type this query in the search box:

What is SLIM

3. Click the Search button

[!NOTE] The response should return relevant document chunks. The chunks may not be formatted in a way that is easy to read. As long as some document chunks are returned, the RAG system is working.

What you're seeing:

  • Raw document chunks retrieved from the vector database
  • Similarity scores indicating relevance
  • Source metadata (URL, title, etc.)

This is the raw retrieval output before the LLM synthesizes it into a coherent answer.


7. Interact with the RAG Agent

Task 9: Switch to the Chat tab of the CAIPE UI

[Figure: RAG UI screenshot]

Task 10: Query the RAG Agent

Ask the agent about AGNTCY:

Tell me more about SLIM in AGNTCY

[!NOTE] The agent should respond with information about the SLIM protocol, synthesized from the retrieved documentation.

What's happening behind the scenes:

  1. Query Embedding: Your question is converted to a vector using the same embedding model
  2. Similarity Search: The vector database finds the most similar document chunks
  3. Context Retrieval: Top-k relevant chunks are retrieved
  4. Augmented Generation: The LLM receives your question + retrieved context
  5. Response Synthesis: The LLM generates a coherent answer grounded in the documentation

[Figure: RAG Agent Architecture]

Key difference from raw search:

  • The LLM synthesizes information from multiple chunks
  • The response is coherent and conversational
  • Sources can be cited for transparency
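
To see what "question + retrieved context" might look like in practice, here is a hypothetical prompt builder. The function `build_prompt`, the instruction wording, and the chunk field names are assumptions for this sketch; the prompt that the lab's agent actually sends to the LLM is not shown in this guide.

```python
def build_prompt(question: str, chunks: list[dict]) -> str:
    """Format retrieved chunks as numbered sources, followed by the question."""
    lines = ["Answer using only the sources below. Cite sources as [n].", ""]
    for n, chunk in enumerate(chunks, start=1):
        lines.append(f"[{n}] {chunk['title']} ({chunk['source']})")
        lines.append(chunk["text"])
        lines.append("")
    lines.append(f"Question: {question}")
    return "\n".join(lines)


# Made-up retrieval results, shaped like chunk text plus source metadata.
retrieved = [
    {"title": "Example page A", "source": "https://docs.agntcy.org", "text": "First retrieved chunk."},
    {"title": "Example page B", "source": "https://docs.agntcy.org", "text": "Second retrieved chunk."},
]
print(build_prompt("Tell me more about SLIM in AGNTCY", retrieved))
```

Numbering the sources gives the LLM something stable to cite, which is how a response can point back to the documents it was grounded in.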

8. Multi-Agent Workflow: RAG + Git

Task 11: Execute a Complex Multi-Agent Task

Now let's test a workflow that requires both the RAG and Git agents to collaborate.

In the chat, ask:

Research and write a report on AGNTCY in markdown format, wait for this to be completed then commit this report under a file named '%%LABNAME%%-report.md' with commit message "agntcy-report" to repo %%REPO_URL%% on the main branch.

[!NOTE] The agent should have:

  • Created a report with name: %%LABNAME%%-report.md
  • Committed it to the CAIPE Labs git repository with commit message "agntcy-report".

Task 12: Understand the Multi-Agent Coordination

What happened in this workflow:

  1. Task Analysis: The supervisor agent parsed the complex request and identified two sub-tasks
  2. Research Phase:
    • Supervisor delegates to RAG agent
    • RAG agent searches the knowledge base for AGNTCY information
    • RAG agent synthesizes findings into a markdown report
  3. Persistence Phase:
    • Supervisor waits for research completion
    • Supervisor delegates to Git agent
    • Git agent creates the file and commits it to the repository
  4. Confirmation: Supervisor reports success back to the user

Key coordination patterns:

  • Sequential execution: Git task waits for RAG task completion
  • Data passing: Report content flows from RAG agent to Git agent
  • Error handling: Agents report failures back to supervisor
  • State management: Supervisor tracks overall workflow progress

This demonstrates the power of multi-agent systems: complex workflows are broken down and distributed to specialized agents, each doing what it does best.
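
The four coordination patterns can be illustrated with a small sketch. The `Workflow` class and the two agent functions are hypothetical stand-ins written for this example; the real supervisor delegates to agents over the A2A protocol rather than calling local functions.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Workflow:
    """Runs steps in order and records each step's status (state management)."""
    status: dict[str, str] = field(default_factory=dict)

    def run(self, steps: list[tuple[str, Callable[[str], str]]], request: str) -> str:
        data = request
        for name, agent in steps:  # sequential execution
            self.status[name] = "running"
            try:
                data = agent(data)  # data passing: output feeds the next step
            except Exception as exc:
                # Error handling: record the failure and surface it to the caller.
                self.status[name] = f"failed: {exc}"
                raise RuntimeError(f"step '{name}' failed") from exc
            self.status[name] = "done"
        return data


def rag_agent(topic: str) -> str:
    """Stand-in for the research step: returns a markdown report."""
    return f"# Report\n\nFindings about {topic}."


def git_agent(report: str) -> str:
    """Stand-in for the persistence step: refuses to commit an empty report."""
    if not report.strip():
        raise ValueError("nothing to commit")
    return f"committed {len(report)} characters"


workflow = Workflow()
result = workflow.run([("research", rag_agent), ("persist", git_agent)], "AGNTCY")
print(result, workflow.status)
```

Because the Git step only runs after the research step returns, a failure during research leaves the repository untouched and the status dictionary shows which step failed.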


9. Verify the Git Commit

Task 13: Check the Repository

Visit the CAIPE Labs repository to verify your report was committed:

Repository URL:

%%REPO_URL%%

What to look for:

  • ✅ A file named %%LABNAME%%-report.md in the repository
  • ✅ Commit message: "agntcy-report"
  • ✅ Content about AGNTCY from the RAG knowledge base
  • ✅ Proper markdown formatting

What this proves:

  • The RAG agent successfully retrieved and synthesized information
  • The Git agent successfully authenticated and pushed changes
  • The supervisor correctly coordinated the multi-step workflow
  • Agents can interact with external systems (GitHub) autonomously

10. Clean Up

Task 14: Stop the System

When you're done exploring, stop all containers:

cd $HOME/work/ai-platform-engineering
./deploy.sh stop

What this does:

  • Gracefully shuts down all agent containers
  • Stops the RAG server and vector database
  • Stops the CAIPE UI
  • Cleans up network connections

[!NOTE] The vector database data persists in Docker volumes. If you restart the system, your ingested documents will still be available.


11. Summary

Congratulations! You've completed Part 3 of the AI Agents lab series. Here's what you accomplished:

✅ Understood Retrieval-Augmented Generation (RAG) concepts
✅ Learned how vector databases enable semantic search
✅ Deployed a complete RAG system with Milvus
✅ Ingested documentation into a knowledge base
✅ Tested RAG retrieval and generation
✅ Used the Git agent for version control automation
✅ Orchestrated a complex multi-agent workflow

Key Takeaways from Part 3

  1. RAG grounds LLM responses in external knowledge - Reduces hallucinations and provides up-to-date information
  2. Vector databases enable semantic search - Find relevant content based on meaning, not just keywords
  3. Chunking strategy affects retrieval quality - Proper document splitting is crucial for good results
  4. Embeddings capture semantic meaning - Same model must be used for ingestion and retrieval
  5. Multi-agent workflows enable complex automation - Specialized agents collaborate on multi-step tasks

RAG System Components

  • Ingestion Pipeline: Crawling → Parsing → Chunking → Embedding → Storage
  • Retrieval Pipeline: Query Embedding → Similarity Search → Context Assembly → Generation
  • Vector Database: Milvus for high-performance similarity search
  • Agents: RAG agent for knowledge retrieval, Git agent for version control

Multi-Agent Coordination Patterns

  • Sequential execution: Tasks with dependencies run in order
  • Data passing: Information flows between agents
  • Error handling: Failures propagate to supervisor
  • State management: Supervisor tracks workflow progress

What's Next?

Explore advanced topics:

  • Fine-tuning embedding models for your domain
  • Implementing hybrid search (vector + keyword)
  • Building custom agents for your workflows
  • Scaling RAG systems for production
  • Advanced chunking and retrieval strategies


Part 3 Complete! You now understand how to build RAG-powered agents and orchestrate complex multi-agent workflows that combine knowledge retrieval with external system automation.