Knowledge Base Systems

Overview

CAIPE RAG is an intelligent knowledge platform that combines vector-based retrieval and graph-based reasoning to provide comprehensive, contextually relevant information for AI agents and users.

The platform:

  • Ingests data from multiple sources (web pages, AWS, Kubernetes, Backstage, Slack, Confluence, and more)
  • Performs hybrid search combining semantic understanding with keyword matching
  • Maintains a knowledge graph for entity relationships and complex reasoning
  • Automatically discovers relationships between entity types using AI-powered ontology agents
  • Exposes MCP tools for AI agents to search, fetch, and explore the knowledge base

[Figure: Unified RAG Architecture]

Key Capabilities

Hybrid search in CAIPE RAG uses a dual-vector approach, then reranks the combined results:

  • Semantic Search: Dense vector embeddings capture meaning and context
  • Keyword Search: BM25 sparse vectors match exact terms and phrases
  • Weighted Reranking: Configurable balance between semantic and keyword results
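
To make the "configurable balance" concrete, here is a minimal sketch of one common way to blend dense and BM25 results. It is not taken from the CAIPE RAG codebase: the function names, the min-max normalization step, and the `semantic_weight` parameter are all assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so dense and BM25 scores become comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal: treat every hit as equally strong.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def weighted_rerank(
    semantic: Dict[str, float],
    keyword: Dict[str, float],
    semantic_weight: float = 0.5,
) -> List[Tuple[str, float]]:
    """Blend two result sets; semantic_weight=1.0 ignores keyword scores."""
    if not 0.0 <= semantic_weight <= 1.0:
        raise ValueError("semantic_weight must be between 0 and 1")
    sem = min_max_normalize(semantic)
    kw = min_max_normalize(keyword)
    combined = {
        doc_id: semantic_weight * sem.get(doc_id, 0.0)
        + (1.0 - semantic_weight) * kw.get(doc_id, 0.0)
        for doc_id in set(sem) | set(kw)
    }
    # Highest combined score first; ties are broken by document ID.
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical scores: cosine similarity for dense hits, raw BM25 for keyword hits.
semantic_hits = {"a": 0.9, "b": 0.5, "c": 0.1}
keyword_hits = {"b": 12.0, "c": 3.0, "d": 7.5}
ranking = weighted_rerank(semantic_hits, keyword_hits, semantic_weight=0.5)
```

A document found by only one retriever contributes zero from the other side, so raising `semantic_weight` favors meaning-based matches and lowering it favors exact term matches.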

Knowledge Graph

When Graph RAG is enabled, the system stores structured entities in Neo4j:

  • Entity Storage: Structured data with properties and relationships
  • Graph Traversal: Explore entity neighborhoods and find paths between entities
  • Automatic Splitting: Nested structures are split into connected sub-entities
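
As an illustration of what automatic splitting could look like, the sketch below flattens a nested record into a parent entity, sub-entities, and the relationships connecting them. The ID scheme (`parent/key/index`), the `HAS_<KEY>` relationship names, and the function itself are hypothetical; the real splitting rules may differ.

```python
from typing import Any, Dict, List, Tuple

Entity = Dict[str, Any]
Relation = Tuple[str, str, str]  # (parent_id, relation_type, child_id)


def split_entity(
    entity_id: str, entity_type: str, data: Dict[str, Any]
) -> Tuple[List[Entity], List[Relation]]:
    """Split nested dicts (or lists of dicts) into connected sub-entities."""
    entities: List[Entity] = []
    relations: List[Relation] = []
    properties: Dict[str, Any] = {}

    for key, value in data.items():
        # Normalize so a single dict and a list of dicts share one code path.
        children = value if isinstance(value, list) else [value]
        if children and all(isinstance(child, dict) for child in children):
            for index, child in enumerate(children):
                child_id = f"{entity_id}/{key}/{index}"
                sub_entities, sub_relations = split_entity(child_id, key, child)
                entities.extend(sub_entities)
                relations.append((entity_id, f"HAS_{key.upper()}", child_id))
                relations.extend(sub_relations)
        else:
            # Scalars and lists of scalars stay on the entity as properties.
            properties[key] = value

    entities.insert(0, {"id": entity_id, "type": entity_type, "properties": properties})
    return entities, relations


# Hypothetical Kubernetes-like record.
pod = {
    "name": "web",
    "labels": ["frontend", "prod"],
    "spec": {
        "nodeName": "node-1",
        "containers": [{"image": "nginx"}, {"image": "envoy"}],
    },
}
entities, relations = split_entity("pod-1", "Pod", pod)
```

Here the pod keeps `name` and `labels` as plain properties, while `spec` and each container become separate nodes that a graph traversal can reach from the pod.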

Automatic Ontology Discovery

The Ontology Agent automatically discovers relationships between entity types:

  • Uses fuzzy matching and LLM evaluation to identify valid relationships
  • Runs in the background with configurable intervals
  • Syncs discovered relationships back to the data graph
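
The two-stage idea (cheap fuzzy matching first, LLM evaluation second) can be sketched as follows. This is an assumption-laden illustration, not the Ontology Agent's code: it fuzzy-matches property names with Python's standard `difflib`, and the LLM step is represented only by a caller-supplied `evaluate` callback. The schema, threshold, and all function names are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict, List, Tuple

# (source_type, source_property, target_type, target_property, score)
Candidate = Tuple[str, str, str, str, float]


def normalize(name: str) -> str:
    """Lowercase and drop separators so 'node_name' and 'nodeName' compare equal."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def propose_candidates(
    schema: Dict[str, List[str]], threshold: float = 0.8
) -> List[Candidate]:
    """Pair up properties of different entity types whose names look alike."""
    candidates: List[Candidate] = []
    types = sorted(schema)
    for i, src in enumerate(types):
        for dst in types[i + 1:]:
            for src_prop in schema[src]:
                for dst_prop in schema[dst]:
                    score = SequenceMatcher(
                        None, normalize(src_prop), normalize(dst_prop)
                    ).ratio()
                    if score >= threshold:
                        candidates.append((src, src_prop, dst, dst_prop, score))
    return sorted(candidates, key=lambda c: -c[4])


def accept(
    candidates: List[Candidate], evaluate: Callable[[Candidate], bool]
) -> List[Candidate]:
    """Keep only candidates the evaluator (for example an LLM call) approves."""
    return [c for c in candidates if evaluate(c)]


# Hypothetical schema: entity type -> property names.
schema = {"Pod": ["name", "node_name"], "Node": ["nodeName", "region"]}
candidates = propose_candidates(schema)
```

Filtering with fuzzy matching before calling an evaluator keeps the number of expensive evaluations small, which fits a process that runs repeatedly in the background.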

Documentation

Page            Description
Architecture    System components, data flows, and technology stack
Ingestors       Overview of available data source integrations
Ontology Agent  Automatic relationship discovery system
MCP Tools       AI agent integration via Model Context Protocol
Authentication  Security concepts and RBAC overview

Getting Started

Prerequisites

  • Docker and Docker Compose
  • Environment variables configured (see Server README)

Start All Services

# Clone the repository
git clone https://github.com/cnoe-io/ai-platform-engineering.git
cd ai-platform-engineering/ai_platform_engineering/knowledge_bases/rag

# Start all services using Docker Compose
docker compose --profile apps up

Access Points

Interface      URL                         Description
Web UI         http://localhost:9447       Interactive search and graph visualization
API Docs       http://localhost:9446/docs  Swagger UI for REST API
MCP Endpoint   http://localhost:9446/mcp   Model Context Protocol for AI agents
Neo4j Browser  http://localhost:7474       Graph database explorer
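
Once the containers are up, a quick way to confirm the access points respond is a small probe script. The sketch below uses only the Python standard library and the URLs listed above; the `is_reachable` helper and the timeout value are illustrative choices, and any HTTP response (even an error status) is treated as "up".

```python
import urllib.error
import urllib.request

ACCESS_POINTS = {
    "Web UI": "http://localhost:9447",
    "API Docs": "http://localhost:9446/docs",
    "MCP Endpoint": "http://localhost:9446/mcp",
    "Neo4j Browser": "http://localhost:7474",
}


def is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if anything answers HTTP at the given URL."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a 2xx status.
        return True
    except OSError:
        # Connection refused, DNS failure, or timeout.
        return False


if __name__ == "__main__":
    for name, url in ACCESS_POINTS.items():
        status = "up" if is_reachable(url) else "unreachable"
        print(f"{name:<14} {url:<28} {status}")
```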

Connect AI Agents

If you use Claude Desktop, VS Code with Copilot, Cursor, or other MCP-compatible tools, connect to the MCP server at:

http://localhost:9446/mcp

Supported Data Sources

CAIPE RAG includes ingestors for various data sources. See Ingestors for details.

Source      Type            Description
Web Pages   Documents       Crawl sitemaps and web pages
AWS         Graph Entities  EC2, S3, RDS, Lambda, EKS, DynamoDB
Kubernetes  Graph Entities  Pods, Deployments, Services, CRDs
Backstage   Graph Entities  Service catalog entities
ArgoCD      Graph Entities  Applications, projects, clusters
GitHub      Graph Entities  Organizations, repositories, teams
Confluence  Documents       Space pages with incremental sync
Slack       Documents       Channel conversations and threads
Webex       Documents       Space messages

Further Reading