Per-KB Ontology Graph Filtering (Phase 4 Follow-Up)
Branch: TBD · Date: 2026-05-27 · Status: Draft
Origin: Phase 5 of the 2026-05-27 fine-grained KB ReBAC plan
deliberately deferred per-entity ontology-graph filtering. That phase
gates the Knowledge Bases → Graph tab on "the caller can read at
least one KB" (or is organization#admin) and shows an info banner
warning that the entities displayed are the global ontology. This spec
covers the RAG-server work required to actually narrow the Neo4j result
set to the KBs the caller is granted on.
Goal
A non-admin user with knowledge_base:k1#can_read (and no other KB
grants) should see only entities whose _datasource_id == k1 when they
load the Graph tab. Admins (organization#admin) keep the unfiltered
view and the existing banner. The caipe-ui BFF must not perform a
client-side post-filter on entities; the filter must be applied
inside the Neo4j query so the response size scales with the
authorisation scope.
Non-Goals
- Editing the ontology model itself.
- Per-entity ACLs (entities below the KB layer keep KB-level authorisation).
- Cross-tenant or cross-org graph isolation. That stays single-org for this delivery.
- UI redesign of the Graph view. The existing controls and overlays stay.
Background: what exists today
- The RAG server exposes graph endpoints under ai_platform_engineering/knowledge_bases/rag/server/src/server/restapi.py (/v1/graph/explore, /v1/graph/schema, related entity-detail routes).
- Each ingested entity already carries a _datasource_id property written by the ingestor pipeline (Neo4j label / property naming may vary across entity types; see the ingestor module for the authoritative list).
- The caipe-ui BFF proxies these endpoints via ui/src/app/api/rag/[...path]/route.ts and already applies KB-level filtering on GET /v1/datasources and GET /v1/mcp/custom-tools.
- The Knowledge Bases → Graph page (ui/src/app/(app)/knowledge-bases/graph/page.tsx) now consumes useKbTabGates and shows an amber "global entity graph" banner.
- OpenFGA exposes per-KB reader grants as knowledge_base:<id>#can_read (model in deploy/openfga/model.fga).
Proposed design
1. Authorisation surface
The caipe-ui BFF resolves the caller's readable-KB set with the
existing helper loadReadableKnowledgeBases (or a thin wrapper that
returns just the IDs). For org admins the set is the literal sentinel
"__all__": the BFF must not enumerate every KB in the deployment
when the caller is allowed everything.
The BFF forwards the resolved scope to the RAG server on every graph request as either:
- header X-Caipe-Kb-Scope: <comma-separated ids> for the bounded case, or
- header X-Caipe-Kb-Scope: * for the org-admin case.
The BFF adds the header only after it has itself performed the OpenFGA check. The RAG server must treat the header as advisory and re-derive scope from its own OpenFGA client when the request is direct (no BFF). Both paths converge on the same scope helper inside the RAG server.
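As a minimal sketch of the header contract above, the RAG server could parse X-Caipe-Kb-Scope as follows. The names KbScope and parse_kb_scope_header are hypothetical and do not come from the codebase. Returning None for a missing header is an assumption that lets the caller fall back to its own OpenFGA lookup, and treating an empty header value as an empty scope is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Optional

ADMIN_WILDCARD = "*"  # header value for the org-admin case, as described in this spec


@dataclass(frozen=True)
class KbScope:
    """Hypothetical container: unrestricted (admin) or a bounded tuple of KB ids."""
    unrestricted: bool
    kb_ids: tuple[str, ...] = ()


def parse_kb_scope_header(value: Optional[str]) -> Optional[KbScope]:
    """Parse the X-Caipe-Kb-Scope header value.

    Returns None when the header is absent so the caller can re-derive
    the scope from OpenFGA (assumed behaviour for direct requests).
    """
    if value is None:
        return None
    stripped = value.strip()
    if stripped == ADMIN_WILDCARD:
        return KbScope(unrestricted=True)
    # Split on commas, drop blank entries, de-duplicate while keeping order.
    ids = [part.strip() for part in stripped.split(",") if part.strip()]
    return KbScope(unrestricted=False, kb_ids=tuple(dict.fromkeys(ids)))
```

A handler would only trust the parsed result after validating the BFF identity on the request; that check is out of scope for this sketch.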
2. RAG server filter
In restapi.py graph handlers:
- Resolve allowed_ids: list[str] from either the BFF header (when the request carries a valid signed BFF identity) or via a direct OpenFGA list_objects call keyed on knowledge_base#can_read.
- If allowed_ids == [] → return 204 No Content. A 204 response carries no body, so the client treats it as an empty graph; the caipe-ui Graph view already renders an empty canvas in this case.
- If allowed_ids == "*" (admin bypass) → skip the Cypher filter and run the existing query unchanged.
- Otherwise rewrite the Cypher to constrain on _datasource_id:
MATCH (n)
WHERE n._datasource_id IN $allowed_ids
WITH n
... existing graph traversal ...
Edges that cross the boundary (entity inside scope → entity
outside scope) must be elided so the caller does not learn that an
out-of-scope entity exists. The implementation must add a second
WHERE m._datasource_id IN $allowed_ids on the relationship target.
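The rewrite described above can be sketched as a small query builder. This is an illustration only: the function name build_graph_query and both query strings are hypothetical stand-ins for the existing traversal, and None is used here to represent the admin bypass. The scoped query filters the source node and, through OPTIONAL MATCH with its own WHERE, drops relationships whose target lies outside the scope while keeping the in-scope source node.

```python
from typing import Any, Optional, Sequence

# Hypothetical stand-in for the existing, unfiltered traversal.
UNFILTERED_QUERY = (
    "MATCH (n) "
    "OPTIONAL MATCH (n)-[r]->(m) "
    "RETURN n, r, m"
)

# Same shape, with both endpoints constrained to the caller's KBs.
SCOPED_QUERY = (
    "MATCH (n) "
    "WHERE n._datasource_id IN $allowed_ids "
    "OPTIONAL MATCH (n)-[r]->(m) "
    "WHERE m._datasource_id IN $allowed_ids "
    "RETURN n, r, m"
)


def build_graph_query(
    allowed_ids: Optional[Sequence[str]],
) -> tuple[str, dict[str, Any]]:
    """Return (cypher, params). None means admin bypass (assumed convention)."""
    if allowed_ids is None:
        return UNFILTERED_QUERY, {}
    # KB ids travel only as a query parameter, never inside the Cypher text.
    return SCOPED_QUERY, {"allowed_ids": list(allowed_ids)}
```

Because the ids are passed as a parameter, the Cypher text is identical for every bounded caller, which is what allows the plan cache to be reused.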
3. Performance budget
- allowed_ids length limit: 256 KBs. Beyond that, return 400 Bad Request from the BFF with guidance to either ask an org admin to add the user to a team-scoped KB group or contact platform engineering to raise the limit. 256 keeps the Neo4j IN clause comfortably within the per-query planning budget on the current deployment size.
- Cypher must use a parameterised IN $allowed_ids rather than string interpolation (security + plan cache hits).
- For org admins the unfiltered query path is unchanged from today, so no regression for the existing common case.
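The size ceiling can be illustrated with a short guard. The BFF itself is TypeScript, so this Python version only sketches the logic; ScopeTooLargeError and enforce_scope_limit are hypothetical names. The boundary follows the wording above: exactly 256 ids pass, 257 are rejected.

```python
from typing import Sequence

MAX_KB_SCOPE = 256  # ceiling stated in this spec


class ScopeTooLargeError(ValueError):
    """Hypothetical error the caller would map to 400 Bad Request."""


def enforce_scope_limit(kb_ids: Sequence[str], limit: int = MAX_KB_SCOPE) -> list[str]:
    """Return the ids unchanged, or raise when the scope exceeds the limit."""
    if len(kb_ids) > limit:
        raise ScopeTooLargeError(
            f"resolved scope has {len(kb_ids)} KBs; the limit is {limit}"
        )
    return list(kb_ids)
```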
4. Caching
The BFF caches the resolved scope per session for 30 seconds (matches the existing KB-tab-gate hook). Cache key includes the user subject and the OpenFGA store id. Cache busts on team-membership change events the BFF already publishes.
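The caching rule above can be sketched as a small TTL cache keyed on (user subject, OpenFGA store id). The real cache would live in the TypeScript BFF; this Python version only illustrates the behaviour. ScopeCache and its method names are hypothetical, and the injectable clock exists purely to make expiry testable.

```python
import time
from typing import Callable, Optional, Sequence


class ScopeCache:
    """Hypothetical per-session scope cache with a fixed time-to-live."""

    def __init__(
        self,
        ttl_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict[tuple[str, str], tuple[float, tuple[str, ...]]] = {}

    def put(self, subject: str, store_id: str, kb_ids: Sequence[str]) -> None:
        """Store the resolved scope together with its expiry time."""
        self._entries[(subject, store_id)] = (self._clock() + self._ttl, tuple(kb_ids))

    def get(self, subject: str, store_id: str) -> Optional[tuple[str, ...]]:
        """Return the cached scope, or None when missing or expired."""
        key = (subject, store_id)
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, kb_ids = entry
        if self._clock() >= expires_at:
            del self._entries[key]
            return None
        return kb_ids

    def invalidate_subject(self, subject: str) -> None:
        """Drop every entry for a user, e.g. on a team-membership change event."""
        for key in [k for k in self._entries if k[0] == subject]:
            del self._entries[key]
```

Including the store id in the key means a scope resolved against one OpenFGA store is never served for another.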
5. Telemetry
- rag_graph_scope_size: histogram of len(allowed_ids) per request, with a label scope=admin|bounded|empty.
- rag_graph_request_total{scope}: counter.
- rag_graph_filter_rewrites_total: counter, increments on every bounded-scope query so we can see filter coverage.
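To show how the three metrics relate, here is a sketch that uses plain in-memory counters in place of a real metrics client. GraphMetrics and scope_label are hypothetical names, None stands for the admin bypass, and recording a size of 0 for admin requests is an assumption, since the spec does not say what the histogram observes in that case.

```python
from collections import Counter
from typing import Optional, Sequence


def scope_label(allowed_ids: Optional[Sequence[str]]) -> str:
    """Map a resolved scope to the scope label: admin, bounded, or empty."""
    if allowed_ids is None:  # None = admin bypass (assumed convention)
        return "admin"
    return "bounded" if len(allowed_ids) > 0 else "empty"


class GraphMetrics:
    """In-memory stand-in for the counters and histogram named above."""

    def __init__(self) -> None:
        self.request_total: Counter = Counter()          # rag_graph_request_total{scope}
        self.filter_rewrites_total = 0                   # rag_graph_filter_rewrites_total
        self.scope_sizes: list[tuple[str, int]] = []     # rag_graph_scope_size samples

    def record(self, allowed_ids: Optional[Sequence[str]]) -> str:
        """Record one graph request and return its scope label."""
        label = scope_label(allowed_ids)
        self.request_total[label] += 1
        size = 0 if allowed_ids is None else len(allowed_ids)  # admin size: assumption
        self.scope_sizes.append((label, size))
        if label == "bounded":
            # Only bounded requests have their Cypher rewritten.
            self.filter_rewrites_total += 1
        return label
```

Recording both counters in one place keeps the bounded request count and the rewrite count in step.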
Files we expect to touch
- ai_platform_engineering/knowledge_bases/rag/server/src/server/restapi.py: add scope resolver and rewrite graph Cypher.
- ai_platform_engineering/knowledge_bases/rag/server/src/server/rbac.py: add list_readable_kb_ids(user_sub) OpenFGA helper.
- ai_platform_engineering/knowledge_bases/rag/server/src/server/graph.py (or wherever the Cypher templates live today): parameterise.
- ui/src/app/api/rag/[...path]/route.ts: resolve KB scope and attach X-Caipe-Kb-Scope for graph paths.
- ui/src/app/(app)/knowledge-bases/graph/page.tsx: refine the banner copy once filtering is live (e.g. "Showing entities from N knowledge bases you can read").
- docs/docs/security/rbac/architecture.md and pdp-coverage-audit.md: flip the Graph row from "covered (tab gate only)" to "covered (per-entity filter)".
Acceptance criteria
- A non-admin caller with knowledge_base:k1#can_read (and no other KB grant) calling GET /v1/graph/explore receives only entities with _datasource_id=k1. Verified with an integration test seeded with entities across at least 3 datasources.
- An admin caller (organization:caipe#admin) calling the same endpoint receives entities across all datasources. Verified with the same integration test plus a privileged caller.
- A caller with zero readable KBs receives 204 No Content and the Graph view renders the existing empty state.
- The Cypher generator emits parameterised $allowed_ids. Verified with a unit test on the query builder.
- The BFF rejects requests where the resolved scope exceeds 256 KBs.
- rag_graph_request_total{scope="bounded"} and rag_graph_filter_rewrites_total increment 1:1 in bounded mode.
Risks & trade-offs
- Implicit graph holes. Eliding cross-boundary edges means an entity may appear "disconnected" to a scoped caller when the bridging entity lives in a different KB. This is the intended behaviour but worth surfacing in the Graph tooltip.
- Cypher plan churn. Adding IN $allowed_ids may shift the optimiser onto an index scan on _datasource_id. We need to verify that index exists in all environments (it is created by the ingestor bootstrap script today; confirm before rollout).
- 256-KB ceiling. For tenants with very large fan-out the BFF will start returning 400 Bad Request. That is a deliberate forcing function to encourage team scoping; raising it requires explicit ops approval.
Out of scope (explicit non-goals)
- Refactoring the ontology model.
- Per-entity ACLs.
- Changes to ingestion or the way _datasource_id is written.
- Multi-org/multi-tenant graph isolation.
Open questions
- Should the BFF cache the scope longer (5 min) once the team membership eventing pipeline is fully live? Today 30 s matches the tab-gate hook for predictable cache invalidation.
- Do we need a richer scope header (e.g. signed JWT-shaped payload) for defence-in-depth, or is the BFF identity check enough?
- Should we emit a per-KB rag_graph_scope_kb_total{kb_id} counter for utilisation analytics, or is the histogram on the size sufficient?
Related work
- Cursor plan .cursor/plans/caipe-fine-grained-rbac-kb-graph-mcp_b6961a8b.plan.md
- Spec 2026-05-18 RAG Team ReBAC
- RBAC architecture
- PDP coverage audit