
CAIPE RBAC

Audience: Junior engineers getting oriented + security architects reviewing the design.

This is the canonical reference for how authentication and authorization work in CAIPE. Start with the feature guide if you need one linear explanation, then use the focused docs for deeper reference:

| If you want to… | Read |
| --- | --- |
| Explain the feature front to back to CAIPE users, admins, operators, or security reviewers | Enterprise RBAC and ReBAC Feature Guide |
| Understand each component (Keycloak, UI, Supervisor, AgentGateway, Dynamic Agents) and how they're wired | Architecture |
| Get the short end-to-end summary of the Comprehensive RBAC refactor, including Keycloak roles, AgentGateway, and OpenFGA | Comprehensive RBAC Refactor |
| Understand how JWT identity and OpenFGA relationship checks work together | JWT and OpenFGA |
| Understand exactly where OpenFGA union/computed permissions are evaluated, what gets stored vs. computed, and follow a worked end-to-end Probe-button example | OpenFGA Permission Evaluation |
| Trace a request: login, OBO token exchange, end-to-end Slack/Webex flow, Slack channel or Webex space → agent routing | Workflows |
| Log in, exercise a role, verify a denial, link a Slack/Webex user, run the demo | Usage |
| Find the file that owns a specific piece of the auth path | File map |
| Understand the difference between Keycloak roles and client scopes, what a slug is, and what happens when you create a team | Roles vs Scopes |
| Install or upgrade the RBAC/OpenFGA refactor with Helm, including optional Keycloak, AgentGateway, OpenFGA, and bridge runtime components | Helm installation and upgrade guide |
| Install CAIPE on a real K8s cluster: bootstrap admin, IdP, and slack-bot client secrets via dev defaults, manual K8s Secrets, or ESO (Vault / AWS-SM / GCP-SM) | Secrets bootstrap |

For the live caipe/rbac GitOps upgrade, see CAIPE RBAC Helm Migration Guide.

Every component-level doc opens with a badge analogy to build intuition, followed by the precise technical detail. Read the analogy first, then the technical section; they describe the same thing at different levels of abstraction.


The Big Picture

Think of CAIPE like a secure corporate office building:

  • Keycloak is HR + the front desk. It issues ID badges, manages who works here, and verifies contractors through a partner agency (your enterprise IdP, typically Okta or Duo SSO).
  • Every service is a room with its own badge reader. You prove who you are once at the front desk, get a badge, and that badge is checked at every door, with no call back to HR each time.
  • AgentGateway is the armed security checkpoint between the office and the server room. Everyone must show their badge, and the checkpoint calls OpenFGA through ext_authz for the PDP decision before proxying.
  • Team Resources and OpenFGA ReBAC in the Admin UI are the rich ReBAC authoring surfaces: admins assign agents and MCP tool prefixes to a team, preview effective OpenFGA access, inspect all relationships in a full-screen graph, edit relationships on a drag/drop graph canvas, and inspect materialized tuples.
  • CEL policy editing is retired for the management plane. Admins use OpenFGA/ReBAC relationships instead of editing AgentGateway or admin-tab CEL rules.
  • Identity Group Sync maps enterprise groups into CAIPE team memberships using a hybrid source model: memberOf / groups claims refresh the signed-in user's memberships at login, while direct Okta directory queries power full admin dry-runs, removals, and drift detection.
  • Slack channels and Webex spaces are external messaging rooms. They are not identity providers; each bot maps the messaging surface to a Keycloak user, exchanges an OBO token, and checks OpenFGA before dispatching.
  • The badge itself is a JWT: a tamper-evident, digitally signed card that any badge reader can verify independently without phoning HR.
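The Slack/Webex bullet above describes a fixed order of operations: map the messaging user to a Keycloak user, exchange an OBO token, check OpenFGA, then dispatch. A minimal sketch of that ordering follows. Every name in it (`gate_message`, `GateResult`, and the four injected callables) is hypothetical and stands in for whatever the real bot runtime uses; it is meant to show the sequencing, not the actual implementation.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class GateResult:
    allowed: bool
    reason: str


def gate_message(
    messaging_user_id: str,
    agent_id: str,
    lookup_linked_user: Callable[[str], Optional[str]],  # hypothetical: Slack/Webex ID -> Keycloak sub
    exchange_obo_token: Callable[[str], str],            # hypothetical: Keycloak sub -> OBO JWT
    check_access: Callable[[str, str], bool],            # hypothetical: (sub, agent) -> OpenFGA decision
    dispatch: Callable[[str, str], None],                # hypothetical: (token, agent) -> call supervisor
) -> GateResult:
    """Run the bot-side checks in order and dispatch only if all of them pass."""
    # 1. The messaging surface is not an identity provider: resolve the linked user.
    keycloak_sub = lookup_linked_user(messaging_user_id)
    if keycloak_sub is None:
        return GateResult(False, "unlinked user")

    # 2. Obtain a token that represents the user, with the bot as the actor.
    token = exchange_obo_token(keycloak_sub)

    # 3. Ask the policy decision point before anything is dispatched.
    if not check_access(keycloak_sub, agent_id):
        return GateResult(False, "denied by policy")

    # 4. Only now is the supervisor called.
    dispatch(token, agent_id)
    return GateResult(True, "dispatched")
```

Passing the dependencies in as callables keeps the sketch independent of any Slack, Webex, Keycloak, or OpenFGA client library.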

Technically: CAIPE uses OpenID Connect (OIDC) for authentication and JWT bearer tokens for stateless authorization across all service boundaries. There is one token issuer (Keycloak), and every service verifies tokens against Keycloak's published JWKS public keys, so there are no shared secrets and no per-hop re-authentication.

In 0.5.0, the umbrella Helm chart can deploy the RBAC runtime components (tags.keycloak, openfga.enabled, openfgaAuthzBridge.enabled, and agentgateway.enabled) so release consumers do not need separate companion app manifests for the default stack.

┌──────────────────────────────────────────────────────────────────────────────┐
│                             CAIPE Trust Boundary                             │
│                                                                              │
│  ┌────────────┐  ┌──────────────┐  ┌─────────────┐  ┌──────────────┐         │
│  │  Keycloak  │  │   CAIPE UI   │  │ Supervisor  │  │   Dynamic    │         │
│  │ (OIDC IdP) │  │  (Next.js)   │  │ A2A Server  │  │    Agents    │         │
│  │ port 7080  │  │  port 3000   │  │  port 8000  │  │  port 8001   │         │
│  └────────────┘  └──────────────┘  └─────────────┘  └──────────────┘         │
│  Token issuer    NextAuth + RBAC   JwtUserContext   get_current_user         │
│  JWKS endpoint   middleware        middleware       FastAPI Depends          │
│  User profile    Session → API     contextvar       JWKS validation          │
│                                                                              │
│  ┌─────────────────────────────────────────────────────────────────┐         │
│  │             AgentGateway (Policy Enforcement Point)             │         │
│  │         port 4000 · ext_authz→OpenFGA · JWT passthrough         │         │
│  └─────────────────────────────────────────────────────────────────┘         │
│             │                      │                                         │
│             ▼                      │                                         │
│      ┌──────────────┐              │                                         │
│      │   OpenFGA    │              │                                         │
│      │  Remote PDP  │              │                                         │
│      └──────────────┘              │                                         │
│              ┌─────────────────────┼───────────────────┐                     │
│              ▼                     ▼                   ▼                     │
│        ┌───────────┐         ┌───────────┐       ┌───────────┐               │
│        │  RAG MCP  │         │ ArgoCD MCP│       │GitHub MCP │  ...          │
│        │  Server   │         │  Server   │       │  Server   │               │
│        └───────────┘         └───────────┘       └───────────┘               │
│    JWKS validation at each MCP: tokens verified independently                │
└──────────────────────────────────────────────────────────────────────────────┘

Security properties the architecture is designed to guarantee:

| Property | How it's achieved |
| --- | --- |
| Single source of truth for identity | Keycloak is the only token issuer; all services verify against its JWKS |
| No credentials in transit between services | The JWT is a signed assertion; no password or secret is passed between hops |
| User identity preserved end-to-end | The same JWT travels Slack/Webex Bot → Supervisor → AgentGateway → MCP unchanged |
| Delegation is auditable | OBO tokens carry act.sub (the acting party, such as the bot client) alongside sub (the real user) |
| Policy enforcement is centralised | AgentGateway is the single PEP for all MCP tool calls; tools don't implement their own authz |
| Remote PDP for relationships | AgentGateway extAuthz calls OpenFGA before proxying MCP traffic |
| Admin-configured ReBAC | Saving in Team Resources writes OpenFGA team/agent/tool tuples from the same source of truth as Keycloak roles; OpenFGA ReBAC provides guided tuple creation, checks, full-screen all-relationship graph viewing, drag/drop graph editing, and tuple inspection |
| Group-to-team provenance | Identity Group Sync records whether a membership came from login claims, Okta sync, manual admin action, bootstrap, or policy rules |
| Least privilege at tool layer | OpenFGA ReBAC is the authoritative AgentGateway policy path; service-side checks provide defense in depth |
| Tenant isolation | The tenant claim in the JWT scopes the data visible to the MCP server |
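To make the "delegation is auditable" property concrete, here is a small sketch of how an audit record might separate the real user from the acting party using the `sub` and `act.sub` claims. The function name `audit_identity` and the record layout are assumptions for illustration, not CAIPE's actual audit format, and the input is expected to be a payload whose signature has already been verified.

```python
from typing import Any, Dict


def audit_identity(claims: Dict[str, Any]) -> Dict[str, Any]:
    """Build a hypothetical audit record from already-verified JWT claims."""
    act = claims.get("act") or {}
    return {
        "user": claims.get("sub"),     # the real user the request is for
        "actor": act.get("sub"),       # the acting party (e.g. a bot client), or None
        "delegated": bool(act),        # True only for OBO-style tokens carrying `act`
        "tenant": claims.get("tenant"),
    }


# Usage: a direct login versus an on-behalf-of token.
direct = audit_identity({"sub": "user-1", "tenant": "acme"})
obo = audit_identity({"sub": "user-1", "tenant": "acme", "act": {"sub": "slack-bot-client"}})
print(direct["delegated"], obo["actor"])  # prints: False slack-bot-client
```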

Core Concept: The JWT

When you log in, Keycloak issues a JWT (JSON Web Token) signed with RS256 using its realm private key. It's a base64url-encoded envelope of three parts: header.payload.signature.

A decoded payload looks like this:

{
  "iss": "http://localhost:7080/realms/caipe",
  "sub": "a3f9b2c1-...",
  "email": "alice@example.com",
  "name": "Alice Smith",
  "realm_access": {
    "roles": ["admin", "chat_user"]
  },
  "resource_access": {
    "caipe-ui": { "roles": ["uma_protection"] }
  },
  "tenant": "acme",
  "exp": 1713200000,
  "iat": 1713196400,
  "act": {
    "sub": "slack-bot-client"
  }
}
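The `header.payload.signature` layout can be inspected with nothing but the standard library. The sketch below splits a token and base64url-decodes the first two segments. It deliberately does not verify the signature, so it is only suitable for debugging, never for authorization decisions. The helper names (`b64url_decode`, `b64url_encode`, `peek_jwt`) and the demo token are made up for this example.

```python
import base64
import json
from typing import Any, Dict, Tuple


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the padding that JWTs strip."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def b64url_encode(raw: bytes) -> str:
    """Encode bytes as unpadded base64url, the form used inside a JWT."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def peek_jwt(token: str) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return (header, payload) WITHOUT verifying the signature."""
    header_b64, payload_b64, _signature_b64 = token.split(".")
    return json.loads(b64url_decode(header_b64)), json.loads(b64url_decode(payload_b64))


# Usage: build an unsigned demo token, then take it apart again.
demo_token = ".".join([
    b64url_encode(json.dumps({"alg": "RS256", "typ": "JWT"}).encode()),
    b64url_encode(json.dumps({"tenant": "acme", "exp": 1713200000}).encode()),
    b64url_encode(b"not-a-real-signature"),
])
header, payload = peek_jwt(demo_token)
print(header["alg"], payload["tenant"])  # prints: RS256 acme
```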

Key fields for security architects:

| Claim | Purpose | Where it's enforced |
| --- | --- | --- |
| iss | Token issuer; services reject tokens from unknown issuers | Dynamic agents JWKS validation, RAG server |
| sub | Opaque user ID (Keycloak UUID); stable, not guessable | Conversation ownership, audit logs |
| email | Human-readable identity, used for display and Slack linking | UI, supervisor user context |
| realm_access.roles | Realm-level role assignments | Dynamic agents is_admin, Web UI backend fallback checks, service-side defense in depth |
| exp | Token expiry; the claim is covered by the signature, so it cannot be extended without invalidating the token | All JWKS validators, NextAuth refresh |
| act.sub | Delegation chain; set on OBO tokens only | Audit: proves bot acted on behalf of user |
| tenant | Multi-tenant data scoping | RAG server query isolation |
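As a rough illustration of how the claims in this table are used once the signature has been verified, the sketch below checks `iss` and `exp` and derives an admin flag from `realm_access.roles`. The issuer URL and the `admin` role name come from the example payload above; the function names `check_claims` and `is_admin` are stand-ins and are not taken from the CAIPE codebase.

```python
import time
from typing import Any, Dict, Optional

# Issuer from the example payload; a real deployment would read this from config.
EXPECTED_ISSUER = "http://localhost:7080/realms/caipe"


def check_claims(claims: Dict[str, Any], now: Optional[float] = None) -> None:
    """Raise ValueError if the (already signature-verified) claims are unacceptable."""
    current = time.time() if now is None else now
    if claims.get("iss") != EXPECTED_ISSUER:
        raise ValueError("unknown issuer")
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or current >= exp:
        raise ValueError("token expired or missing exp")


def is_admin(claims: Dict[str, Any]) -> bool:
    """True if the realm-level roles include 'admin' (role name assumed from the example)."""
    roles = (claims.get("realm_access") or {}).get("roles") or []
    return "admin" in roles


# Usage with values from the example payload; `now` is pinned so the result is repeatable.
claims = {
    "iss": EXPECTED_ISSUER,
    "exp": 1713200000,
    "realm_access": {"roles": ["admin", "chat_user"]},
}
check_claims(claims, now=1713196400)  # passes silently: issued-at time is before exp
print(is_admin(claims))  # prints: True
```

These checks are only meaningful after signature verification; on an unverified payload they prove nothing.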

Services never call Keycloak on each request. They validate the signature offline using the cached JWKS public key. JWKS is refreshed on cache miss (unknown kid) or on a TTL (1 hour).
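The refresh rule described above (refetch on an unknown `kid` or after a one-hour TTL) can be sketched as a small cache. This is an illustrative model only: `JwksCache`, `fetch_jwks`, and the injectable `clock` are hypothetical names, and the actual validators in CAIPE may be structured differently.

```python
import time
from typing import Any, Callable, Dict, Optional


class JwksCache:
    """Cache JWKS keys by `kid`; refetch on unknown kid or when the TTL has elapsed."""

    def __init__(
        self,
        fetch_jwks: Callable[[], Dict[str, Any]],  # hypothetical: returns {"keys": [...]}
        ttl_seconds: float = 3600.0,               # 1 hour, as stated above
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_jwks
        self._ttl = ttl_seconds
        self._clock = clock
        self._keys: Dict[str, Dict[str, Any]] = {}
        self._fetched_at: Optional[float] = None

    def _refresh(self) -> None:
        jwks = self._fetch()
        self._keys = {k["kid"]: k for k in jwks.get("keys", []) if "kid" in k}
        self._fetched_at = self._clock()

    def get_key(self, kid: str) -> Optional[Dict[str, Any]]:
        """Return the key for `kid`, or None if the issuer does not publish it."""
        stale = self._fetched_at is None or self._clock() - self._fetched_at >= self._ttl
        if stale or kid not in self._keys:
            self._refresh()
        return self._keys.get(kid)
```

One design note: because every unknown `kid` triggers a fetch, a production validator would usually rate-limit refreshes so that tokens with bogus key IDs cannot be used to flood the issuer.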


Threat Model Considerations

| Threat | Mitigation |
| --- | --- |
| JWT forgery | RS256 signature verified against Keycloak JWKS; private key never leaves Keycloak |
| JWT replay after expiry | exp claim enforced at every JWKS validation point |
| Token theft from browser | NextAuth stores tokens in an httpOnly server-side session cookie; the raw JWT is never exposed to browser JavaScript |
| Bot impersonating arbitrary user via OBO | Keycloak's token-exchange permission must be explicitly granted to the bot client; it is not available by default |
| Privilege escalation via claim manipulation | JWT is signed; any claim modification invalidates the RS256 signature |
| Tenant data leakage | tenant claim in JWT used for query scoping at MCP layer and service-side filters |
| PDP outage fail-open | AgentGateway extAuthz.failureMode.denyWithStatus=403 fails closed if OpenFGA/bridge is unavailable |
| AgentGateway admin exposure | Only the data-plane listener (4000) should be ingress-exposed; the admin listener (15000) remains private inside the cluster |
| Unlinked Slack/Webex users bypassing RBAC | Bot runtime gates block unlinked users before the supervisor is called |
| AUTH_ENABLED=false in production | Startup log emits a WARNING when auth is disabled; also documented in Architecture › Dynamic Agents env vars |
| Bootstrap admin left permanently enabled | No automatic enforcement; this is a documented operational risk, and the bootstrap admin must be removed after setup |

Where to next

  • Architecture: Component-by-component reference covering Keycloak, UI, Supervisor, AgentGateway, and Dynamic Agents.
  • Workflows: Sequence diagrams for login, OBO, end-to-end requests, and Slack channel and Webex space routing.
  • Usage: Bring up the stack, log in as test users, verify RBAC denials, run the demo.
  • File map: When you need to change something, this tells you which file to open.