Implementation Plan: Comprehensive RBAC Tests + Completion of 098
Branch: prebuild/feat/comprehensive-rbac (existing โ do not create a new branch; FR-015) | Date: 2026-04-22 | Spec: spec.md
Input: Feature specification at docs/docs/specs/102-comprehensive-rbac-tests-and-completion/spec.md
Companion docs:
spec.mdโ what & why (8 user stories, 15 FRs, 6 personas)call-sequences.mdโ code-level sequence diagrams (realfile:functionreferences) for every flow this spec touchesresearch.mdโ Phase 0 output (this command)data-model.mdโ Phase 1 output (this command)quickstart.mdโ Phase 1 output (this command)contracts/python-rbac-helper.mdโ Phase 1 output (this command)
Summaryโ
Close every authorization gap identified between spec 098 and the post-merge state of prebuild/feat/comprehensive-rbac, then prove the closure with a comprehensive, matrix-driven test suite that runs against a real Keycloak (no mocks for the PDP).
The work is one PR (#1257, FR-015). It lands in 5 sequential phases, each gated on green tests for the prior phase. The biggest delta is Phase 4 (Custom Agents โ 5 layers of changes per call-sequences.md Flow 4) which is the riskiest and is therefore deliberately kept after the simpler BFF migration is locked in by Phase 1.
Scope sanity check: ~25 files modified, ~12 new files, ~120 new test cases, 1 new docker-compose file, 1 new Make target, 4 new automation scripts, 1 doc rewrite. Implementation effort: estimated 6-10 working days for a single engineer; trivially parallelizable across phases for two.
Technical Contextโ
Language/Version:
- Python 3.11+ (supervisor, dynamic_agents, RAG server, agent MCPs, slack bot)
- TypeScript / Next.js 16 + React 19 (UI BFF + Playwright suite)
Primary Dependencies:
- LangGraph, LangChain, FastAPI, A2A Protocol, deepagents (โฅ0.3.8) โ Python
- NextAuth.js, Keycloak Authorization Services, MongoDB, Tailwind CSS, shadcn/ui โ TypeScript
@slack/web-api, Slack Bolt (Python) โ Slack bot- Playwright (NEW for this spec โ already used in repo for non-RBAC E2E)
- Existing:
cel-python(Python CEL),cel-js(TypeScript CEL)
Storage:
- Keycloak (users, realm/client roles, user attributes, sessions, AuthZ Services policies/resources/scopes) โ single source of truth for identity & policy
- MongoDB โ
teams,team_kb_ownership,slack_link_nonces,slack_user_metrics,authz_decisions(NEW),audit_events(existing dual-write target) - In-process: PDP decision cache (TS
permissionDecisionCache, Python equivalent NEW), JWKS cache (existing), userinfo cache (existing)
Testing:
- Jest (TS) โ BFF unit tests, parameterised over personas
- pytest (Py) โ middleware unit tests, parameterised over personas
- Playwright โ end-to-end browser tests against
docker-compose.dev.yamldriven byCOMPOSE_PROFILES(per spec Clarification 2026-04-22) plus a thindocker-compose/docker-compose.e2e.override.yamlfor port remaps make test-rbac(NEW) โ single orchestration target
Target Platform:
- Linux server (production); macOS / Linux dev environments
- Docker Compose for local dev and CI E2E
- Helm-deployed in production (chart updates out of scope for this spec)
Project Type: Web application โ multi-service backend (Python FastAPI / A2A) + Next.js frontend with BFF pattern.
Performance Goals:
- PDP cache hit ratio โฅ80% in steady-state (existing TTL = 60s, accepted as-is per spec out-of-scope)
make test-rbactotal wall time โค 10 minutes locally on M-series Mac, โค 12 minutes in CI (SC-005, SC-008)- JWKS cache TTL โฅ 5 minutes per FR-002
Constraints:
- No new branches (FR-015) โ single PR
#1257 - No NextAuth replacement, no MongoDB replacement, no CEL evaluator replacement (spec out-of-scope)
- Default-deny on PDP-unavailable for resources without an explicit fallback rule (spec edge case + Open Question 1, recommendation accepted)
- Audit-log write failure MUST NOT cause gate to fail open (FR-007)
- Shared-key auth for MCPs MUST be removed (not deprecated-with-warning) per FR-012
Scale/Scope:
- ~15 BFF routes to migrate from
requireAdminโrequireRbacPermission - ~12 agent MCP servers to migrate from
SharedKeyMiddlewareโJwtUserContextMiddleware+requireRbacPermission - 1 RAG server, 1 dynamic_agents backend, 1 slack bot to gain a uniform Python middleware
- 6 personas ร ~30 protected endpoints ร ~3 scopes = ~500 test cases (reduced via matrix parameterisation to ~120 unique test functions)
Constitution Checkโ
GATE: Must pass before Phase 0 research. Re-checked after Phase 1 design (see end of file).
Constitution principles evaluated against this plan (per .specify/memory/constitution.md):
| Principle | Status | Notes |
|---|---|---|
| I. Worse is Better | โ Pass | Plan deliberately reuses the existing requireRbacPermission / JwtUserContextMiddleware patterns rather than building a new abstraction. Python helpers are mechanical translations of TypeScript helpers. |
| II. YAGNI | โ Pass | No speculative features. Every NEW file maps directly to a numbered FR or user story. PDP cache, CEL layer, role fallback are all kept as-is. |
| III. Rule of Three | โ ๏ธ Justified deviation | Two new TSโPy mirrors are introduced (requireRbacPermission, logAuthzDecision) on first use, not third. Justified in Complexity Tracking โ they are exact behavioural mirrors of existing code, and writing them once each in TS and Py is the simplest implementation given the polyglot trust boundary. |
| IV. Composition over Inheritance | โ Pass | All new modules are functions or middleware classes consumed via composition (withAuth(req, handler), FastAPI Depends(...)). No class hierarchies. |
| V. Specs as Source of Truth | โ Pass | This is a spec-driven flow; tests/rbac-matrix.yaml becomes the in-code source of truth that the spec mandates and CI validates (FR-010). |
| VI. CI Gates Are Non-Negotiable | โ Pass | New make test-rbac target runs in CI; scripts/validate-rbac-matrix.py and scripts/validate-rbac-doc.py are hard gates (FR-010, FR-014, SC-006). |
| VII. Security by Default | โ Pass | Default-deny on PDP-unavailable; no secrets in code; JWKS verification at every Python service; OBO tokens (not service-account passthrough) at every hop; explicit removal of forgeable X-User-Context trust. |
Coding Practices:
- Type hints required (FR-003 implementations will follow)
- Docstrings on all new public helpers
logurufor Python logging;console.logfor TS audit log JSON (existing pattern)- Constants centralised in
tests/rbac-matrix.yamlanddeploy/keycloak/realm-config.json
Result: โ Constitution Check passes with one justified deviation (Rule of Three for TSโPy mirrors), tracked in Complexity Tracking.
Project Structureโ
Documentation (this feature)โ
docs/docs/specs/102-comprehensive-rbac-tests-and-completion/
โโโ spec.md # Feature spec (already written)
โโโ call-sequences.md # Code-level sequence diagrams (already written)
โโโ plan.md # This file
โโโ research.md # Phase 0 โ Open Questions resolved + tech decisions
โโโ data-model.md # Phase 1 โ entities, schemas, transitions
โโโ quickstart.md # Phase 1 โ local & CI usage
โโโ contracts/ # Phase 1 โ interface contracts
โ โโโ audit-event.schema.json # authz_decisions document shape
โ โโโ rbac-matrix.schema.json # tests/rbac-matrix.yaml shape
โ โโโ realm-config-extras.schema.json # PDP-unavailable fallback rules
โ โโโ python-rbac-helper.md # Python `requireRbacPermission` API contract
โโโ tasks.md # Phase 2 output (NOT this command โ see /speckit.tasks)
Source Code (repository root)โ
This is a polyglot web application; structure already exists in the repo and is reused. The plan is modifications and additions within these existing trees, not a new project layout.
ai_platform_engineering/
โโโ multi_agents/platform_engineer/protocol_bindings/a2a/
โ โโโ main.py # MODIFIED: middleware stack already correct (preserved as-is)
โ โโโ agent_executor.py # MODIFIED: OBO mint already correct (test coverage NEW)
โโโ utils/auth/
โ โโโ jwt_context.py # PRESERVED
โ โโโ jwt_user_context_middleware.py # PRESERVED
โ โโโ token_context.py # PRESERVED (used by supervisor MCP factory)
โ โโโ jwks_validate.py # NEW (FR-002 โ Python JWKS validator)
โ โโโ keycloak_authz.py # NEW (FR-003 โ Python `require_rbac_permission`)
โ โโโ audit.py # NEW (FR-007 โ Python `log_authz_decision`)
โโโ utils/obo_exchange.py # PRESERVED (supervisor OBO)
โโโ utils/a2a_common/base_langgraph_agent.py # PRESERVED (supervisor MCP factory)
โโโ dynamic_agents/src/dynamic_agents/
โ โโโ auth/
โ โ โโโ auth.py # MODIFIED: drop X-User-Context trust (FR-004)
โ โ โโโ access.py # MODIFIED: add Keycloak PDP call (defense-in-depth)
โ โ โโโ jwt_middleware.py # NEW (mirrors supervisor JwtUserContextMiddleware)
โ โ โโโ keycloak_authz.py # NEW (DA-side wrapper around utils.auth.keycloak_authz)
โ โ โโโ token_context.py # NEW (DA ContextVar mirrors supervisor)
โ โ โโโ obo_exchange.py # NEW (DA OBO client mirrors supervisor)
โ โโโ services/
โ โโโ agent_runtime.py # MODIFIED: set ContextVar at entry; remove _auth_bearer attr
โ โโโ mcp_client.py # MODIFIED: NEW httpx_client_factory (FR-005)
โโโ knowledge_bases/rag/
โ โโโ server/ # MODIFIED: add JwtUserContextMiddleware + hybrid gate (FR-013)
โโโ agents/
โ โโโ argocd/mcp/ # MODIFIED: shared key โ Keycloak (FR-012)
โ โโโ aws/mcp/ # MODIFIED: same
โ โโโ jira/mcp/ # MODIFIED: same
โ โโโ github/mcp/ # MODIFIED: same
โ โโโ pagerduty/mcp/ # MODIFIED: same
โ โโโ splunk/mcp/ # MODIFIED: same
โ โโโ confluence/mcp/ # MODIFIED: same
โ โโโ webex/mcp/ # MODIFIED: same
โ โโโ slack/mcp/ # MODIFIED: same
โ โโโ komodor/mcp/ # MODIFIED: same
โ โโโ aigateway/mcp/ # MODIFIED: same
โ โโโ backstage/mcp/ # MODIFIED: same
โโโ integrations/slack_bot/
โโโ app.py # MODIFIED: use impersonate_user per command (FR-011)
โโโ utils/obo_exchange.py # PRESERVED
โโโ utils/rbac_middleware.py # MODIFIED: remove channel-allowlist gate
ui/src/
โโโ lib/
โ โโโ api-middleware.ts # MODIFIED: deprecate requireAdmin/requireAdminView usage in production routes
โ โโโ rbac/
โ โ โโโ keycloak-authz.ts # PRESERVED
โ โ โโโ audit.ts # PRESERVED
โ โ โโโ types.ts # MODIFIED: add new resources/scopes
โ โ โโโ matrix-loader.ts # NEW (loads tests/rbac-matrix.yaml at build time)
โ โโโ da-proxy.ts # MODIFIED: drop X-User-Context construction (FR-004)
โโโ app/api/
โโโ admin/**/route.ts # MODIFIED (~5 routes): requireAdmin โ requireRbacPermission
โโโ dynamic-agents/route.ts # MODIFIED: requireAdmin โ requireRbacPermission
โโโ mcp-servers/**/route.ts # MODIFIED (~3 routes): same
โโโ teams/**/route.ts # MODIFIED (~3 routes): same
โโโ agents/**/route.ts # MODIFIED (~3 routes): same
โโโ v1/chat/stream/start/route.ts # MODIFIED: ADD requireRbacPermission (currently has none)
deploy/keycloak/
โโโ realm-config.json # MODIFIED (FR-006): seed all resources/scopes
โโโ realm-config-extras.json # NEW (Open Q1): per-resource PDP-unavailable fallback rules
โโโ docker-compose.yml # MODIFIED (Open Q4): enable token-exchange feature
docker-compose/
โโโ docker-compose.e2e.override.yaml # NEW (FR-008, [spec Clarification 2026-04-22](./spec.md#session-2026-04-22)):
# thin overlay on docker-compose.dev.yaml; remaps host ports
# only โ no service duplication. Driven by COMPOSE_PROFILES.
tests/
โโโ rbac/ # NEW (Open Q2 recommendation accepted)
โโโ rbac-matrix.yaml # NEW (FR-010): single source of truth
โโโ conftest.py # NEW: pytest persona-token fixture (FR-008)
โโโ fixtures/
โ โโโ keycloak.py # NEW: Python persona helper
โ โโโ keycloak.ts # NEW: TypeScript persona helper (consumed by Jest + Playwright)
โ โโโ audit.py / audit.ts # NEW: assertion helpers for authz_decisions writes
โโโ unit/
โ โโโ ts/ # NEW: Jest BFF tests parameterised over matrix
โ โโโ py/ # NEW: pytest middleware tests parameterised over matrix
โโโ e2e/ # NEW: Playwright suite (8 user journeys)
scripts/
โโโ validate-rbac-matrix.py # NEW (FR-010): CI linter โ every protected route in matrix
โโโ validate-rbac-doc.py # NEW (FR-014): CI linter โ file map current
โโโ extract-rbac-resources.py # NEW (FR-006): emit KeycloakResourceCatalog from code
โโโ validate-realm-config.py # NEW (FR-006): assert realm-config covers code
Makefile # MODIFIED: add `test-rbac` target (FR-009)
Structure Decision: This is a web application (multi-service backend + frontend BFF). The repo already has the canonical layout โ backend under ai_platform_engineering/, frontend under ui/. No new project type is introduced. The new top-level tests/rbac/ directory is justified by Open Question 2's accepted recommendation: a single repo-root location for the matrix and persona fixtures, with thin shims in each component's existing test runner. Component-local test trees (ui/src/app/api/__tests__/, tests/) are preserved and untouched except where they must be updated to consume the new fixtures.
Phasesโ
The 8 user stories collapse into 5 implementation phases ordered by risk-asc and dependency-asc. Each phase ends with a green-light checkpoint (matrix-test pass + reviewer signoff). The phases are explicitly designed so each one is independently shippable: Phase 1 alone is a real security improvement even if Phases 2-5 slip. This is per Constitution principle I (Worse is Better) โ bias toward visible incremental delivery over a single big-bang merge.
Phase 0 โ Research, Schema, Fixtures (foundations)โ
Outputs: research.md, data-model.md, quickstart.md, contracts/*.schema.json, tests/rbac/rbac-matrix.yaml (skeleton), tests/rbac/fixtures/keycloak.{py,ts}, deploy/keycloak/realm-config-extras.json (skeleton with admin_ui rule), deploy/keycloak/docker-compose.yml enabling --features=token-exchange, docker-compose/docker-compose.e2e.override.yaml (thin port-remap overlay on docker-compose.dev.yaml per spec Clarification 2026-04-22).
Why first: Every later phase consumes the matrix and the persona fixture. If these are wrong, every test in every phase has to be rewritten.
Acceptance:
make test-rbacexists and runs (it can be vacuous โ the tests will be filled in by later phases) and brings up the e2e compose stack to a healthy state in <2 minutes.- The persona fixture mints a real Keycloak token for
alice_adminandbob_chat_useragainst the seeded realm. - All four schema files in
contracts/exist and validate the seed data they describe. scripts/validate-rbac-matrix.pyexists and passes against an empty matrix (no protected routes yet).
Phase 1 โ UI BFF Keycloak Migration (Story 1)โ
Files modified: ~15 routes under ui/src/app/api/{admin,dynamic-agents,mcp-servers,teams,agents}/**/route.ts. Plus ui/src/lib/api-middleware.ts (mark requireAdmin/requireAdminView as legacy, do not delete).
Files added: ~15 jest tests in tests/rbac/unit/ts/ (one per route, persona-parameterised).
Approach (mechanical):
- For each route, replace
requireAdmin(session)withawait requireRbacPermission(session, '<resource>', '<scope>'). - Add the
(route, resource, scope)triple totests/rbac-matrix.yaml. - Add the seed for
(resource, scope)todeploy/keycloak/realm-config.json(FR-006). - Write the parameterised jest test file consuming the matrix.
Acceptance:
scripts/validate-rbac-matrix.pyfinds zero unprotected routes.- Every test in
tests/rbac/unit/ts/passes for all 6 personas. requireAdminhas zero production callers (confirmed byrg -l 'requireAdmin\(session\)' ui/src/app/api/).
Maps to: Story 1, FR-001, FR-006, FR-010 (subset)
Phase 2 โ Python Service Hardening (Stories 2 + 3)โ
Files added: ai_platform_engineering/utils/auth/{jwks_validate,keycloak_authz,audit}.py โ three Python helpers that mirror the TS implementations.
Files modified: every agent MCP server's main.py to register JwtUserContextMiddleware + replace SharedKeyMiddleware with the new requireRbacPermission dependency. Same for the supervisor (already partially done โ verify and add tests).
Approach:
- Implement Python helpers (see
contracts/python-rbac-helper.mdfor the API contract). - For each MCP server, swap auth middleware. Wire
requireRbacPermission(token, '<agent>_mcp', 'read'|'write')into every tool entry point. - Add the
(<agent>_mcp, read|write)resources torealm-config.json. - Write parameterised pytest tests in
tests/rbac/unit/py/per MCP server. - Confirm supervisor's existing
httpx_client_factorypropagates the OBO token end-to-end via a stub MCP server that records the inboundAuthorizationheader.
Acceptance:
- pytest suite in
tests/rbac/unit/py/passes for all 6 personas ร all 12 MCPs ร 2 scopes. - A stub-MCP integration test asserts that a
bob_chat_userrequest to the supervisor results in the stub seeing anAuthorizationheader whose JWT hassub=bob_chat_user.keycloak_subandact.sub=supervisor-sa. rg -l 'SharedKeyMiddleware' ai_platform_engineering/agents/returns zero results (FR-012).
Maps to: Stories 2 + 3, FR-002, FR-003, FR-007 (Python side), FR-012
Phase 3 โ RAG Hybrid Gate (Story 4)โ
Files modified: ai_platform_engineering/knowledge_bases/rag/server/ โ add JwtUserContextMiddleware and a per-route requireRbacPermission call (rag#ingest for /v1/ingest, rag#retrieve for /v1/query). Then add per-KB filtering: union of TeamKbOwnership (Mongo) and per-KB realm roles (kb_reader:<id>, kb_ingestor:<id>).
Files added: pytest suite in tests/rbac/unit/py/test_rag_*.py covering the 5 acceptance scenarios + edge cases (empty result set, KB exists but user lacks role, KB does not exist).
Approach:
- Add
ragresource (ingest,retrieve,manage) torealm-config.json. - Add the per-KB roles convention (
kb_reader:<id>,kb_ingestor:<id>) โ these are realm roles created by the team-management UI when a KB is provisioned. Note: per-KB role creation already exists in the team-management code from spec 098; this phase only consumes them. - Wire the hybrid gate: coarse Keycloak check โ fine MongoDB + per-KB role filter inside the RAG service (FR-013).
- Write tests using the
team-a-docs/team-b-docstwo-KB fixture from the spec's Independent Test.
Acceptance:
- Sentinel docs from
team-a-docsare visible tocarol_kb_ingestor(team-a member withkb_ingestor:team-a-docsrole) and invisible tobob_chat_user(team-a member without the role). dave_no_rolegets 403 on/v1/query.- All 5 acceptance scenarios from Story 4 pass.
Maps to: Story 4, FR-002, FR-003, FR-013
Phase 4 โ Custom Agents Full Migration (Story 6) โ biggest deltaโ
This is the largest behaviour change. It implements all 5 layers identified in the audit and tracked under "full_da" in the spec's user-decisions log.
Files added:
ai_platform_engineering/dynamic_agents/src/dynamic_agents/auth/jwt_middleware.pyai_platform_engineering/dynamic_agents/src/dynamic_agents/auth/keycloak_authz.pyai_platform_engineering/dynamic_agents/src/dynamic_agents/auth/token_context.pyai_platform_engineering/dynamic_agents/src/dynamic_agents/auth/obo_exchange.py
Files modified:
ai_platform_engineering/dynamic_agents/src/dynamic_agents/auth/auth.pyโ dropX-User-Contexttrust;get_user_context()reads from new ContextVar (FR-004)ai_platform_engineering/dynamic_agents/src/dynamic_agents/auth/access.pyโcan_view_agent,can_use_agent,can_access_conversationnow also call Keycloak PDP (defense-in-depth)ai_platform_engineering/dynamic_agents/src/dynamic_agents/services/agent_runtime.pyโ set ContextVar at entry; remove_auth_bearerattributeai_platform_engineering/dynamic_agents/src/dynamic_agents/services/mcp_client.pyโ NEWhttpx_client_factoryreading fromcurrent_user_tokenContextVar (FR-005)ui/src/lib/da-proxy.tsโ dropX-User-Contextconstruction; passAuthorizationheader throughui/src/app/api/v1/chat/stream/start/route.tsโ ADDrequireRbacPermission(session, 'dynamic_agent:<agent_id>', 'invoke')ui/src/app/api/dynamic-agents/route.tsโ already covered in Phase 1, verify with new tests
Approach:
- Add
dynamic_agentresource (view,invoke,manage) and the per-agent resources convention (dynamic_agent:<agent_id>) torealm-config.json. - Implement the 4 new DA-side modules (mostly copies of the supervisor's equivalents โ see
call-sequences.mdFlow 4b). - Modify the 6 existing files in the order:
da-proxy.tsandchat/stream/start/route.tsfirst (BFF gate), then DA backend changes (defense-in-depth). - Write tests:
- jest: BFF allow/deny per persona per agent
- pytest: DA backend allow/deny + the forged-header-is-ignored test (per Story 6 acceptance scenario 3)
- pytest: MCP-tool-call-from-DA carries fresh OBO (per Story 6 acceptance scenario 4)
- Cleanup:
rg "X-User-Context" ai_platform_engineeringMUST return only test fixtures and audit-log lines (SC-002).
Acceptance:
- All 5 Story 6 acceptance scenarios pass.
- Forged
X-User-Contextheader is rejected by DA with 401. - A stub MCP server records the inbound
Authorizationheader from a DA-initiated tool call; assertion: header is present and JWTsubmatches the chatting user.
Maps to: Story 6, FR-002, FR-003, FR-004, FR-005
Phase 5 โ Slack OBO + E2E + Doc Update (Stories 5 + 7 + 8)โ
Files modified:
ai_platform_engineering/integrations/slack_bot/app.pyโ every command path usesimpersonate_user(keycloak_sub)to mint the OBOai_platform_engineering/integrations/slack_bot/utils/rbac_middleware.pyโ remove channel-allowlist gate (it becomes a Keycloakslack#usescope check)docs/docs/security/rbac/{architecture,workflows,usage,file-map}.mdโ full update per FR-014, including:- File map table in
file-map.md(auto-validated byscripts/validate-rbac-doc.py) - Sequence diagrams for all post-migration flows in
workflows.md(cross-referencecall-sequences.md) - Component sections in
architecture.mdfor all NEW Python helpers and the DA migration - Removed-features callouts:
requireAdmin,requireAdminView(production),X-User-Contextheader trust,SharedKeyMiddleware, channel-allowlist - (The original
docs/docs/specs/098-enterprise-rbac-slack-ui/how-rbac-works.mdis now a redirect stub.)
- File map table in
Makefileโtest-rbactarget now real (was stub in Phase 0)
Files added:
- Playwright suite in
tests/rbac/e2e/โ 8 user journeys covering the 8 stories scripts/validate-rbac-doc.py(final form)
Approach:
- Wire Slack OBO via
impersonate_user(the helper exists atslack_bot/utils/obo_exchange.py:89). - Verify Keycloak's token-exchange + impersonation permissions are granted to
caipe-slack-botclient inrealm-config.json(Phase 0 enabled the feature flag; this phase wires the policies). - Build the Playwright suite incrementally, one user story per spec file under
tests/rbac/e2e/. - Final pass: rewrite the four RBAC docs under
docs/docs/security/rbac/end-to-end using the now-true post-migration code state. Run the doc validator. Run the 10-question quiz against a junior reviewer (SC-007).
Acceptance:
- All 8 user stories' Playwright suites pass against
docker-compose.dev.yaml+docker-compose/docker-compose.e2e.override.yamldriven byCOMPOSE_PROFILES. make test-rbactotal time โค 10 minutes locally (SC-005, SC-008).scripts/validate-rbac-doc.pypasses (file-map current).- 10-question quiz passes 9/10 with a junior reviewer (SC-007).
docs/docs/security/rbac/review checklist signed off by a security-architect-equivalent reviewer.
Maps to: Stories 5 + 7 + 8, FR-008, FR-009, FR-010, FR-011, FR-014
Open Questions โ Resolvedโ
The four open questions in the spec are resolved as follows; rationale lives in research.md:
| Question | Resolution | Reference |
|---|---|---|
| 1. PDP-unavailable behaviour for non-admin scopes | Configurable per-resource, default deny-all. Rules live in deploy/keycloak/realm-config-extras.json. admin_ui preserves its existing admin realm-role fallback. | research.md ยง1, contracts/realm-config-extras.schema.json |
| 2. Test fixture location | Single repo-root tests/rbac/. Component runners load the matrix and personas from there. | research.md ยง2 |
| 3. Realm-config drift detection severity | Hard gate. CI fails if any (resource, scope) referenced in code is missing from realm-config.json. | research.md ยง3 |
| 4. Slack OBO token-exchange in dev compose | Enable in deploy/keycloak/docker-compose.yml so dev mirrors prod. Done in Phase 0. | research.md ยง4 |
Risk Registerโ
| Risk | Severity | Mitigation |
|---|---|---|
| Keycloak token-exchange feature flag breaks unrelated dev workflows | Medium | Phase 0 verifies with a smoke test against the existing dev compose; if breakage occurs, isolate the feature behind a separate compose profile. |
| Phase 4 ContextVar wiring for DA misses an async boundary, OBO leaks across requests | High | Reuse the supervisor's exact pattern (verified working post-merge); add an explicit pytest that asserts current_user_token.get() returns None after a request completes. |
Mongo authz_decisions collection grows unbounded | Low | Out-of-scope to add TTL index here; documented in data-model.md as a follow-up; recommend operators add expireAfterSeconds on ts. |
tests/rbac-matrix.yaml becomes stale relative to actual code | Medium | scripts/validate-rbac-matrix.py is a hard CI gate (FR-010, SC-006); adding a route without an entry fails the build. |
| Playwright E2E flakiness from real Keycloak races | Medium | Phase 5 uses await on Keycloak /health/ready before each test; persona-token fixture caches tokens for the test session to avoid redundant token endpoint hits. |
10-min wall-time budget for make test-rbac is exceeded | Medium | Parameterise tests over matrix entries to amortise fixture setup; cache JWKS at test fixture level; run E2E with parallel workers (Playwright supports --workers). Budget includes 4-min headroom (SC-008). |
| ~25-file PR is large, increases review burden | Low (accepted) | The spec mandates single PR (FR-015); review is structured by phase via call-sequences.md. Reviewer playbook is documented in spec ยง"How to read these diagrams when reviewing the migration PR". |
Complexity Trackingโ
| Violation | Why Needed | Simpler Alternative Rejected Because |
|---|---|---|
Two TSโPy mirrors of requireRbacPermission and logAuthzDecision introduced on first occurrence (vs Constitution Rule of Three) | The trust boundary is polyglot โ UI BFF is Node/TypeScript, every service that handles JWTs server-side is Python. Without the Python mirrors, services fall back to ad-hoc auth that's exactly the gap this spec closes. | (a) Sharing logic via a sidecar (e.g., OPA) was rejected because Keycloak's PDP is already the policy engine; adding OPA would duplicate it. (b) Sharing via a thin HTTP service was rejected because every gate would add a network hop. (c) Generating Py from TS via codegen was rejected as YAGNI for two functions. The mirrors are mechanical and ~30 lines each. |
New top-level tests/rbac/ directory (vs colocating tests with each component) | Personas, the matrix, and the persona-token fixture are shared across Jest, pytest, and Playwright. A single source of truth eliminates drift. | Colocation was rejected because the matrix would need to live in three places (or one, with three loaders) and one of the spec's own goals is to make drift impossible. |
Phase 1 Re-evaluation: Constitution Checkโ
Re-checked after the design above is fleshed out (no new gates emerged in Phase 1 of the speckit flow):
| Principle | Status | Notes |
|---|---|---|
| I. Worse is Better | โ Pass (unchanged) | Phases are deliberately ordered to ship value early; each phase is independently shippable. |
| II. YAGNI | โ Pass (unchanged) | No speculative features; all NEW files trace to FRs. |
| III. Rule of Three | โ ๏ธ Justified deviation (unchanged) | TSโPy mirrors documented in Complexity Tracking. |
| IV. Composition over Inheritance | โ Pass (unchanged) | All new modules expose functions/middleware classes consumed via composition. |
| V. Specs as Source of Truth | โ Pass (unchanged) | tests/rbac-matrix.yaml is the in-code embodiment of the spec's matrix. |
| VI. CI Gates Are Non-Negotiable | โ Pass (unchanged) | make test-rbac is the new CI signal; both validators are hard gates. |
| VII. Security by Default | โ Pass (unchanged) | Default-deny + audit-log + JWKS validation + OBO at every hop + explicit forgeable-header removal. |
Result: โ Constitution Check passes after Phase 1 design with one justified deviation (unchanged from initial check).
What this command did NOT produceโ
This is a /speckit.plan output. It deliberately stops short of:
tasks.mdโ produced by the next command,/speckit.tasks. That command will break each phase above into individually-trackable tasks (one per route migration, one per MCP server, one per test file, etc.).- Code changes โ no source files have been modified by this plan command. All edits begin in the implementation phase, after
/speckit.tasksis approved. docs/docs/security/rbac/updates โ explicitly deferred to Phase 5 per spec ยง"Pending Tasks".