Tasks: Slack JIT Keycloak user creation with web-UI auto-merge
Input: spec.md, plan.md
Branch: prebuild/feat/slack-jit-user-creation
Format
[ID] [P?] [Story] [Type] Description
- [P] = can run in parallel with other [P] tasks of the same phase (different files, no dependencies)
- [Story] = which user story (US1-US5) the task primarily serves; INFRA = infrastructure shared by all stories
- [Type] = code | test | config | docs | verify
Phase 1 - Foundational realm changes (blocks every code path)
The slack-bot uses a single Keycloak admin client (caipe-platform)
for both lookup and JIT creation. Phase 1 ensures its service account
holds exactly {view-users, query-users, manage-users} (no more, no
less) and confirms the existing IdP auto-merge flow still works.
- T001 [INFRA] [config] In charts/ai-platform-engineering/charts/keycloak/scripts/init-idp.sh, add an idempotent function _ensure_caipe_platform_user_roles() that:
  a) Resolves service-account-caipe-platform's user ID.
  b) Resolves the realm-management client's internal ID.
  c) GETs the current client-role-mapping for that service account.
  d) POSTs any of {view-users, query-users, manage-users} that are missing.
  e) Echoes the final mapping at INFO so the script's logs serve as the audit trail.

  The function MUST NOT delete other roles (operators may have added legitimate ones); R-9 in plan.md is handled by a periodic CI assertion, not by destructive cleanup at boot.
- T002 [INFRA] [config] In deploy/keycloak/realm-config.json and charts/ai-platform-engineering/charts/keycloak/realm-config.json, update the service-account-caipe-platform user's clientRoles block to include all three roles:

      "clientRoles": {
        "realm-management": [
          "view-users",
          "query-users",
          "manage-users"
        ]
      }

  This is the source of truth for fresh realm imports; T001 handles drift correction on already-running clusters.
- T003 [INFRA] [verify] The IdP auto-merge flow is already configured. The existing init-idp.sh (lines ~328-476) creates a "silent broker login" flow with idp-create-user-if-unique + idp-auto-link (both ALTERNATIVE), sets it as the IdP's firstBrokerLoginFlowAlias, and configures trustEmail=true + syncMode=FORCE. No new code needed; just verify on a fresh make e2e-test-minimal that:
  a) The flow silent-broker-login (or whatever SILENT_FLOW_ALIAS resolves to) exists and contains both executions as ALTERNATIVE.
  b) The IdP entry's firstBrokerLoginFlowAlias matches.
  c) trustEmail=true, syncMode=FORCE on the IdP entry.
- T004 [INFRA] [verify] Run make e2e-test-minimal-down && make e2e-test-minimal. Assert:
  a) service-account-caipe-platform's realm-management mapping contains all three of {view-users, query-users, manage-users}.
  b) No new caipe-slack-bot-provisioner client exists (we deliberately did not introduce one).
  c) The Duo (or current placeholder) IdP entry has trustEmail=true, syncMode=FORCE, and the broker-login flow has the auto-set step.

  Capture the verification commands inline in this task block for future operators:

      KC_TOKEN=$(curl -sf -d "client_id=admin-cli" -d "username=admin" \
        -d "password=admin" -d "grant_type=password" \
        http://localhost:8080/realms/master/protocol/openid-connect/token \
        | jq -r .access_token)
      SA_ID=$(curl -sf -H "Authorization: Bearer $KC_TOKEN" \
        "http://localhost:8080/admin/realms/caipe/users?username=service-account-caipe-platform" \
        | jq -r '.[0].id')
      RM_ID=$(curl -sf -H "Authorization: Bearer $KC_TOKEN" \
        "http://localhost:8080/admin/realms/caipe/clients?clientId=realm-management" \
        | jq -r '.[0].id')
      curl -sf -H "Authorization: Bearer $KC_TOKEN" \
        "http://localhost:8080/admin/realms/caipe/users/$SA_ID/role-mappings/clients/$RM_ID" \
        | jq -r '.[].name' | sort
      # Expected: manage-users, query-users, view-users
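The additive reconciliation described in T001 steps (c) and (d) can be sketched as below. The real function is a shell function in init-idp.sh; this Python version only illustrates the set logic. The name roles_to_add is hypothetical, and the sketch assumes the role-mapping response is a list of objects with a name field, as the jq filter in T004 suggests.

```python
# Sketch only: roles_to_add is a hypothetical helper, not part of init-idp.sh.
REQUIRED_ROLES = frozenset({"view-users", "query-users", "manage-users"})


def roles_to_add(current_mapping: list[dict]) -> list[str]:
    """Return the required roles missing from the current mapping.

    Extra roles already present are left alone: the result only ever
    lists roles to POST, never roles to delete.
    """
    current = {role["name"] for role in current_mapping}
    return sorted(REQUIRED_ROLES - current)


if __name__ == "__main__":
    mapping = [{"name": "view-users"}, {"name": "impersonation"}]
    print(roles_to_add(mapping))
```

Because the function only computes a difference against the required set, running it twice against an already-correct mapping yields an empty list, which is what makes the boot-time step idempotent.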
Phase 2 - JIT helper in keycloak_admin (US1, US5; depends on Phase 1)
- T006 [US1] [code] In ai_platform_engineering/integrations/slack_bot/utils/keycloak_admin.py, reuse the existing KeycloakAdminConfig (no new dataclass). All JIT calls use the same KEYCLOAK_SLACK_BOT_ADMIN_CLIENT_ID / KEYCLOAK_SLACK_BOT_ADMIN_CLIENT_SECRET credentials already consumed by lookups. Document this in the module docstring with a pointer to spec FR-004/FR-005 explaining why we deliberately avoided a second client.
- T007 [US1] [code] In the same file, add async def create_user_from_slack(slack_user_id: str, email: str) -> str that:
  a) acquires a token via the existing admin token cache (no new token plumbing),
  b) sends POST /admin/realms/{realm}/users with the body specified in spec.md FR-003,
  c) parses the Location header to obtain the new user's UUID,
  d) on 409 Conflict, re-queries by email and returns the existing user's UUID (FR-008),
  e) on 401/403, raises JitAuthError / JitForbiddenError with an error_kind field for the caller to log,
  f) restricts any follow-up PUT /users/{id} to the freshly-returned UUID only (helper-function-shape mitigation per spec M1), so the caller never gets a generic "PUT any user" surface,
  g) returns the kc_user_id UUID string on success.
- T008 [US1, US5] [code] Add a small email_masking.py module with mask_email(email: str) -> str returning "<first_3_chars>***@<domain>". Used by FR-010, FR-011 logs.
- T009 [US1] [test] Create ai_platform_engineering/integrations/slack_bot/tests/test_keycloak_admin_jit.py with the following tests:
  a) test_create_user_from_slack_uses_admin_credentials (asserts no separate provisioner env var is read)
  b) test_create_user_from_slack_posts_correct_body (httpx mock)
  c) test_create_user_from_slack_handles_409_by_requery
  d) test_create_user_from_slack_raises_on_401
  e) test_create_user_from_slack_raises_on_403
  f) test_create_user_from_slack_secret_never_in_logs (uses SecretRedactionFilter and asserts captured log records)
  g) test_post_users_url_targets_only_freshly_created_id (regression on M1 helper-shape)
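The response handling in T007 (c) through (e) and the masking format in T008 can be sketched without any HTTP client. This is a minimal illustration under stated assumptions, not the module itself: user_id_from_create_response and its requery_by_email callback are hypothetical names, the exception classes are stand-ins for the ones T007 describes, and the error_kind strings follow the values mentioned in T034.

```python
from typing import Callable, Mapping


class JitAuthError(Exception):
    """Stand-in for the T007 exception raised on 401."""
    error_kind = "auth_failure"


class JitForbiddenError(Exception):
    """Stand-in for the T007 exception raised on 403."""
    error_kind = "forbidden"


def mask_email(email: str) -> str:
    """Mask an email as '<first_3_chars>***@<domain>' (the T008 format)."""
    local, _, domain = email.partition("@")
    return f"{local[:3]}***@{domain}"


def user_id_from_create_response(
    status: int,
    headers: Mapping[str, str],
    requery_by_email: Callable[[], str],  # hypothetical callback for the 409 path
) -> str:
    """Map a POST /users response to a user UUID or a typed error."""
    if status == 201:
        # The new user's ID is the last path segment of the Location header.
        # A real client would look this header up case-insensitively.
        return headers["Location"].rstrip("/").rsplit("/", 1)[-1]
    if status == 409:
        # User already exists: fall back to a lookup by email (FR-008).
        return requery_by_email()
    if status == 401:
        raise JitAuthError("admin token rejected")
    if status == 403:
        raise JitForbiddenError("service account lacks manage-users")
    raise RuntimeError(f"unexpected status {status}")
```

Keeping the status mapping in a pure function like this is one way to make tests (c) through (e) in T009 cheap to write, since no network mock is needed for the branching itself.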
Phase 3 - JIT branch in identity_linker (US1, US3, US4, US5)
- T010 [US1, US3] [code] In ai_platform_engineering/integrations/slack_bot/utils/identity_linker.py, add module-level constants:

      SLACK_JIT_CREATE_USER = os.environ.get("SLACK_JIT_CREATE_USER", "true").lower() == "true"
      _JIT_ALLOWED_DOMAINS = [d.strip().lower() for d in os.environ.get("SLACK_JIT_ALLOWED_EMAIL_DOMAINS", "").split(",") if d.strip()]

  Document precedence in the module docstring (matches plan.md §7).
- T011 [US1, US3] [code] In auto_bootstrap_slack_user, replace the current if kc_user is None: return None block with:
  a) If JIT is off, return None (the caller's off-path will send the link).
  b) If JIT is on but _JIT_ALLOWED_DOMAINS is non-empty and the email's domain is not in the list, log a structured WARNING (FR-011, error_kind=domain_excluded) and return None.
  c) If JIT is on and the existing admin client config (KeycloakAdminConfig.from_env_or_none()) is None, log one WARNING per process startup (suppress repeats) and return None.
  d) Otherwise call await create_user_from_slack(slack_user_id, email), then await set_user_attribute(...) to add the slack_user_id attribute (already in the POST body, but defensive), then log INFO slack_jit_user_created per FR-010 and return the new kc_user_id.
  e) On any JitAuthError / JitForbiddenError / httpx.HTTPError in (d), log WARNING per FR-011 and return None (fall through to the off-path linking flow).
- T012 [US1, US3, US4, US5] [test] Create ai_platform_engineering/integrations/slack_bot/tests/test_identity_linker_jit.py:
  a) test_jit_off_returns_none_does_not_call_create_user
  b) test_jit_on_lookup_miss_calls_create_user_and_returns_id
  c) test_jit_on_lookup_hit_short_circuits (no regression of existing email-match path)
  d) test_jit_on_admin_unconfigured_warns_once_returns_none
  e) test_jit_on_create_user_401_returns_none_logs_warning
  f) test_jit_on_create_user_403_returns_none_logs_warning
  g) test_jit_on_create_user_409_returns_existing_user_id
  h) test_jit_domain_allowlist_excludes_non_listed_domain
  i) test_jit_domain_allowlist_empty_means_allow_all
  j) test_log_record_event_field_is_slack_jit_user_created (FR-010 stable field)
  k) test_log_record_does_not_contain_admin_secret (uses SecretRedactionFilter)
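The gate ordering in T011 (a) through (c) can be expressed as a small pure function, which may make tests (a), (d), (h), and (i) in T012 easier to reason about. This is a sketch under assumptions: jit_gate and the "jit_off", "admin_unconfigured", and "create" labels are hypothetical; only domain_excluded comes from the task text.

```python
def parse_allowed_domains(raw: str) -> list[str]:
    """Parse the CSV allowlist the same way the T010 constant does."""
    return [d.strip().lower() for d in raw.split(",") if d.strip()]


def jit_gate(
    email: str,
    jit_enabled: bool,
    allowed_domains: list[str],
    admin_configured: bool,
) -> str:
    """Return 'create' when JIT may proceed, else the reason it may not.

    The checks run in the order listed in T011: flag, allowlist, admin config.
    """
    if not jit_enabled:
        return "jit_off"
    domain = email.rsplit("@", 1)[-1].lower()
    # An empty allowlist means "allow all", so only a non-empty list can exclude.
    if allowed_domains and domain not in allowed_domains:
        return "domain_excluded"
    if not admin_configured:
        return "admin_unconfigured"
    return "create"


if __name__ == "__main__":
    domains = parse_allowed_domains(" Corp.com , partner.com ,")
    print(jit_gate("a@corp.com", True, domains, True))
    print(jit_gate("a@other.org", True, domains, True))
```

Every non-"create" result corresponds to a return None path in T011, so the caller's off-path linking flow handles all of them the same way.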
Phase 4 - Off-path message fix (US3; independent of Phase 2/3)
- T013 [US3] [code] In ai_platform_engineering/integrations/slack_bot/app.py, locate the if rbac_status == "unlinked": block (around line 395). In the else: branch (lines 410-415, the dead-end "could not be auto-linked" message), replace the message with linking_url = asyncio.run(generate_linking_url(slack_user_id)) and a text that tells the user "Click here to link your account before using this feature" with the URL. Keep the _linking_prompt_sent cooldown logic. Both the SLACK_FORCE_LINK=true and "JIT failed/unconfigured" paths then produce the same actionable user experience.
- T014 [US3] [test] Create ai_platform_engineering/integrations/slack_bot/tests/test_app_offpath_message.py:
  a) test_unlinked_with_jit_off_sends_linking_url
  b) test_unlinked_with_jit_on_failure_sends_linking_url
  c) test_message_does_not_contain_email_match_dead_end_text
  d) test_cooldown_still_applies_to_offpath_message

  Use mocker.patch on chat_postEphemeral and assert the captured text arg contains "/api/auth/slack-link?".
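The interaction between the linking message and the cooldown in T013 can be illustrated as follows. The actual shape of _linking_prompt_sent in app.py is not described here, so PromptCooldown, its window parameter, and build_link_text are all hypothetical; the sketch only shows one way a per-user cooldown could keep applying once the dead-end text is replaced.

```python
import time
from typing import Callable


class PromptCooldown:
    """Hypothetical per-user cooldown: allow one prompt per time window."""

    def __init__(self, window_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._window = window_seconds
        self._clock = clock
        self._last_sent: dict[str, float] = {}

    def should_send(self, slack_user_id: str) -> bool:
        """Return True and record the send time if the user is outside the window."""
        now = self._clock()
        last = self._last_sent.get(slack_user_id)
        if last is not None and now - last < self._window:
            return False
        self._last_sent[slack_user_id] = now
        return True


def build_link_text(linking_url: str) -> str:
    """Compose the actionable off-path message quoted in T013."""
    return f"Click here to link your account before using this feature: {linking_url}"
```

Injecting the clock keeps a test such as test_cooldown_still_applies_to_offpath_message deterministic, since the test can advance time without sleeping.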
Phase 5 - Compose + Helm wiring (US1, US3; runs after Phases 1-4)
No new Keycloak Secret resources, no new chart values for credentials.
The slack-bot already mounts KEYCLOAK_SLACK_BOT_ADMIN_CLIENT_ID and
KEYCLOAK_SLACK_BOT_ADMIN_CLIENT_SECRET; JIT reuses them. The only
new env vars are the feature flags SLACK_JIT_CREATE_USER and
SLACK_JIT_ALLOWED_EMAIL_DOMAINS.
- T015 [INFRA] [config] In docker-compose.dev.yaml, in the slack-bot service env block, add:

      SLACK_JIT_CREATE_USER: ${SLACK_JIT_CREATE_USER:-true}
      # SLACK_JIT_ALLOWED_EMAIL_DOMAINS: ""   # CSV; empty = allow all

  Add an inline comment that JIT reuses the existing KEYCLOAK_SLACK_BOT_ADMIN_* credentials (no new secret to wire).
- T016 [INFRA] [config] In .env.example, add a commented documentation block for SLACK_JIT_CREATE_USER and SLACK_JIT_ALLOWED_EMAIL_DOMAINS. Do NOT set them in .env (preserves operator opt-in for any non-default behavior).
- T017 [INFRA] [config] In charts/ai-platform-engineering/charts/slack-bot/values.yaml, add:

      jit:
        createUser: true
        # allowedEmailDomains: []   # list of domains; empty = allow all
- T018 [INFRA] [config] In charts/ai-platform-engineering/charts/slack-bot/templates/deployment.yaml, under env:, add:
  a) SLACK_JIT_CREATE_USER from .Values.jit.createUser
  b) SLACK_JIT_ALLOWED_EMAIL_DOMAINS from .Values.jit.allowedEmailDomains | join "," (only if non-empty; omit the env var entirely when empty so unit tests see "unset" not "empty string")
- T019 [INFRA] [verify] Run helm template charts/ai-platform-engineering --show-only charts/slack-bot/templates/deployment.yaml three times (with default values; then with --set slackBot.jit.createUser=false; then with --set 'slackBot.jit.allowedEmailDomains={corp.com,partner.com}') and assert that all three render without error and the deployment env section contains the expected variables in each path.
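The rendering rule in T018 (always emit the flag, emit the allowlist only when it is non-empty) is easy to get subtly wrong, so a plain-Python model of the expected output may help when reviewing the template. The function jit_env is hypothetical and is not part of the chart; it only mirrors what the rendered env section is expected to contain for a given values dict.

```python
def jit_env(values: dict) -> dict[str, str]:
    """Model the env vars T018 is expected to render from chart values."""
    jit = values.get("jit", {})
    env = {
        # Booleans are rendered as lowercase strings, matching the
        # "true"/"false" comparison in the T010 constant.
        "SLACK_JIT_CREATE_USER": str(bool(jit.get("createUser", True))).lower(),
    }
    domains = jit.get("allowedEmailDomains") or []
    if domains:
        # Only set the variable when there is something to restrict;
        # an absent key means "unset", not "empty string".
        env["SLACK_JIT_ALLOWED_EMAIL_DOMAINS"] = ",".join(domains)
    return env


if __name__ == "__main__":
    print(jit_env({"jit": {"createUser": True}}))
    print(jit_env({"jit": {"createUser": False, "allowedEmailDomains": ["corp.com", "partner.com"]}}))
```

The three cases checked in T019 map directly onto three calls of this model: default values, createUser set to false, and a populated allowedEmailDomains list.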
Phase 6 - Documentation (runs in parallel with Phase 5; required by repo CLAUDE.md rule)
- T023 [P] [docs] Update docs/docs/specs/098-enterprise-rbac-slack-ui/how-rbac-works.md:
  a) New row in the env-var table for the slack-bot component
  b) New diagram for the JIT auto-bootstrap flow + auto-merge on web sign-in
  c) New entry in the file map for each of the new code/config files
- T024 [P] [docs] Update docs/docs/specs/098-enterprise-rbac-slack-ui/operator-guide.md: add a new section "Enabling JIT user creation" covering when to enable it, when not to, how to opt out, how to set SLACK_JIT_ALLOWED_EMAIL_DOMAINS, how to identify JIT users after the fact, how to clean up, and what to expect on Duo first-login auto-merge.
- T025 [P] [docs] Update docs/docs/security/rbac/architecture.md: add a "Slack JIT shell-user creation" subsection under the existing "Slack bot → Keycloak Admin REST API" block, documenting the single-client model (with the R-8 trade-off acknowledgment) and the IdP flow change.
- T026 [P] [docs] Update docs/docs/security/rbac/file-map.md: entries for keycloak_admin.py:create_user_from_slack, email_masking.py, and init-idp.sh:_ensure_caipe_platform_user_roles.
- T027 [P] [docs] Update docs/docs/security/rbac/secrets-bootstrap.md: in the existing slack-bot admin-client subsection, document that the same secret now also authorizes JIT user creation (and reference R-8 in plan.md for the rationale). No new secret-bootstrap section is required.
- T028 [P] [docs] Update docs/docs/specs/098-enterprise-rbac-slack-ui/quickstart.md: add a commented stanza for SLACK_JIT_CREATE_USER / SLACK_JIT_ALLOWED_EMAIL_DOMAINS and a one-line note that JIT defaults to ON in dev and reuses the existing slack-bot admin credentials.
- T029 [docs] Create docs/docs/specs/103-slack-jit-user-creation/research.md capturing decisions and rejected alternatives (single-client caipe-platform chosen over a dedicated caipe-slack-bot-provisioner; auto-merge vs confirm; default ON vs OFF) and references to the Keycloak docs sections used.
- T030 [docs] Create docs/docs/specs/103-slack-jit-user-creation/security-review.md with a STRIDE walkthrough and the threat catalog from spec.md §7 expanded.
Phase 7 - Live verification (runs last; depends on Phases 1-5)
These tasks replace the former root-level BLOCKERS.md note for spec 103. Keep
their live status here so verification remains attached to the feature spec.
- T031 [US1, US2] [verify] In a clean stack, send a Slack DM from a real corporate email that does NOT exist in the realm. Assert:
  a) Slack response is the normal bot reply (or a normal RBAC denial such as "channel has no agent mapping"), NOT the dead-end "could not be auto-linked" message.
  b) Keycloak admin UI shows a new user with created_by=slack-bot:jit, slack_user_id=<U…>, enabled=true, emailVerified=true, no credentials array, no requiredActions.
  c) docker logs slack-bot contains exactly one slack_jit_user_created log line.
- T032 [US2] [verify] Sign in to the web UI as the same person via Duo (or simulated equivalent if Duo is not yet wired). Assert:
  a) Keycloak still has exactly one user for that email.
  b) The user now has both the slack_user_id attribute (preserved from T031) and a federatedIdentities entry pointing at the upstream IdP.
  c) The web UI did NOT show a "We found an existing account, link?" confirmation prompt.
- T033 [US3] [verify] Set SLACK_JIT_CREATE_USER=false and restart slack-bot. Repeat T031's setup with a different unknown email. Assert:
  a) The bot's ephemeral message contains /api/auth/slack-link?.
  b) No new Keycloak user was created.
- T034 [US5] [verify] Temporarily strip manage-users from service-account-caipe-platform (so JIT now hits 403 even though lookups still work) and restart slack-bot. Send a Slack DM from a third unknown email. Assert:
  a) The bot's ephemeral message contains /api/auth/slack-link? (fall-through to link flow).
  b) docker logs slack-bot contains a single WARNING with event=slack_jit_user_creation_failed and error_kind=forbidden (and no auth_failure, since the token itself is still valid for lookups).
  c) No new Keycloak user was created.

  Then re-add manage-users and re-run T031 to confirm the happy path is restored.
- T035 [US4] [verify] Run kc-export (or curl the admin API) to list all users with q=created_by:slack-bot:jit. Assert the list matches T031 + T034 expectations exactly.
- T036 [US3] [verify] Set SLACK_JIT_ALLOWED_EMAIL_DOMAINS to a restrictive value such as cisco.com, then send a DM from a Slack user whose email domain is outside the allowlist. Assert the bot falls back to the administrator/linking guidance and no Keycloak user is created.
- T037 [US5] [verify] Run slack-bot with LOG_LEVEL=DEBUG, send a JIT DM, and inspect docker compose logs slack-bot. Assert every email is masked (for example, s***y@example.com) and no plaintext email or Slack admin secret appears in logs. This is a manual sanity pass over the unit coverage in test_email_masking.py and test_log_redaction.py.
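The manual log inspection in T037 could be backed by a small scan for unmasked addresses. This is only a sketch: unmasked_emails is a hypothetical helper, the regular expression is a rough approximation of an email address rather than a full validator, and it treats any address whose local part contains *** as masked.

```python
import re

# Rough email pattern; '*' is allowed in the local part so masked forms match too.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+*-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def unmasked_emails(log_text: str) -> list[str]:
    """Return every email-like token whose local part is not masked."""
    return [
        match
        for match in EMAIL_RE.findall(log_text)
        if "***" not in match.split("@", 1)[0]
    ]


if __name__ == "__main__":
    sample = "event=slack_jit_user_created email=s***y@example.com"
    print(unmasked_emails(sample))
```

Piping the captured docker compose logs slack-bot output through such a check turns the "every email is masked" assertion into an empty-list comparison.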
Dependencies
T001,T002,T003 ──► T004 (Phase 1 verify)
                    │
                    ├──► T006,T007,T008 ──► T009 (Phase 2)
                    │                        │
                    │                        └──► T010,T011 ──► T012 (Phase 3)
                    │
                    └──► T013 ──► T014 (Phase 4, independent)

(Phases 2, 3, 4 may interleave on different files; T012 needs T011 done)

T015,T016,T017,T018 ──► T019 (Phase 5 verify, runs after Phase 4)
T023..T028 [P] ──► T029,T030 (Phase 6, parallel docs; can start as soon as plan is fixed)
T004 + T019 + Phase 4 done ──► T031..T037 (Phase 7 live verify)
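One way to sanity-check a proposed execution order against these dependencies is a topological sort at phase granularity. The graph below is an interpretation of the phase headings and the diagram, not an authoritative encoding, and the phase1 to phase7 labels and execution_order are hypothetical names; graphlib is in the Python standard library from 3.9.

```python
from graphlib import TopologicalSorter

# Each phase maps to the set of phases that must finish first
# (an interpretation of the headings: Phase 5 after Phases 1-4,
# Phase 7 after Phases 1-5, Phase 6 with no code dependency).
PHASE_DEPENDS_ON: dict[str, set[str]] = {
    "phase1": set(),
    "phase2": {"phase1"},
    "phase3": {"phase2"},
    "phase4": {"phase1"},
    "phase5": {"phase1", "phase2", "phase3", "phase4"},
    "phase6": set(),
    "phase7": {"phase1", "phase2", "phase3", "phase4", "phase5"},
}


def execution_order() -> list[str]:
    """Return one valid order; raises graphlib.CycleError if the graph has a cycle."""
    return list(TopologicalSorter(PHASE_DEPENDS_ON).static_order())


if __name__ == "__main__":
    print(execution_order())
```

Any order the sorter returns is valid, so phases with no path between them (for example Phase 4 versus Phases 2 and 3) may appear in either relative position, which matches the note that they can interleave.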
Parallelization opportunities
- [P] tasks within Phase 6: six docs files, six different paths, no inter-file dependency. Land them in a single doc-update commit at the end of the phase.
- Phase 4 (T013/T014) runs independently of Phases 2/3. Can be a small standalone PR if you want to ship the dead-end-message fix before the rest of JIT lands.
- Phase 5 chart wiring tasks (T015..T018) touch different files and can be split among reviewers.
Definition of done (PR-level)
- All tests in T009, T012, T014 pass under make test-supervisor.
- make lint clean.
- helm template renders for default + JIT-off + allowedEmailDomains populated paths.
- All five user stories have every Phase 7 [verify] task ticked.
- All six doc files updated in the same PR.
- Conventional commit + DCO + Assisted-by: trailer present on every commit.
- Spec, plan, tasks, research, security-review files all merged together.