Feature Specification: Integrated Skills with Single Source, Chat Commands, and Skill Hubs
Feature Branch: 097-skills-middleware-integration
Created: 2026-03-18
Status: Draft
Input: User description: "Create a new integrated feature with the current skills in UI and CAIPE supervisor reading the skill from MongoDB and using LangGraph skills middleware, also no more run skills in the chat, add support for /skills in chat window to show skills, also add the ability to add other skill hub via GitHub or public GitHub that supervisor is able to incorporate into skills middleware."
User Scenarios & Testing (mandatory)
User Story 1 - Single Source of Skills for UI and Assistant (Priority: P1)
Users and the assistant see the same, up-to-date skill catalog. The chat UI and the platform's assistant (supervisor) both consume skills from one central store through a shared skills layer, so the list of available skills is consistent everywhere and there is no separate "run skills" action in chat.
Why this priority: This is the foundation. Without a single source, the UI and the assistant can disagree on what skills exist, and duplicate or legacy flows (like "run skills" in chat) create confusion and inconsistency.
Independent Test: Confirm that the skill list shown in the UI matches the skills the assistant can use, and that no "run skills" control remains in the chat experience.
Acceptance Scenarios:
- Given a skill is available in the central catalog, When a user opens the skills experience in the UI, Then that skill appears in the list.
- Given the same central catalog and the same authenticated principal, When the assistant is deciding what to do, Then it uses the same entitled set of skills as the UI and `/skills` (no separate or conflicting source; FR-020).
- Given the new integrated model is in place, When a user is in the chat window, Then there is no "run skills" action or equivalent; skill use is driven by the assistant via the shared catalog.
- Given a skill is removed or disabled in the central catalog, When the UI and assistant refresh or reload, Then that skill is no longer listed or used.
User Story 2 - /skills in Chat to Show Available Skills (Priority: P1)
Users can type a dedicated command (e.g. /skills) in the chat window to see the list of skills available to the assistant. This replaces the need for a separate "run skills" flow and gives quick, in-context visibility into what the assistant can do.
Why this priority: Directly supports clarity and trust. Users can discover and confirm available capabilities without leaving the conversation.
Independent Test: Open chat, type the designated command (e.g. /skills), and verify that the list of available skills is shown in the chat (and matches the single source from User Story 1).
Acceptance Scenarios:
- Given a user is in the chat window, When they enter the agreed command (e.g. `/skills`), Then the system shows the list of skills available to the assistant.
- Given the skills list is shown, When the user views it, Then the list is consistent with the central catalog (same as in UI and used by the assistant).
- Given the user has not typed the command, When they send a normal message, Then the system does not automatically show the skills list; the list appears only when the user invokes the command.
- Given the command is invoked, When the catalog is empty or no skills are available, Then the user sees an appropriate message (e.g. no skills available) rather than an error or blank state that implies a failure.
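The command behavior in these scenarios can be sketched as a small pure function that decides whether an input is a slash-command before anything is sent to the assistant. This is a minimal illustration, not the product's implementation: `ChatCommand`, `parse_chat_command`, `render_skills_reply`, and the empty-state wording are all hypothetical names and copy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChatCommand:
    name: str  # command name without the leading slash, e.g. "skills"
    args: str  # remaining text after the command (may be empty)


def parse_chat_command(message: str) -> Optional[ChatCommand]:
    """Return a ChatCommand if the message is a slash-command, else None."""
    text = message.strip()
    if not text.startswith("/") or len(text) == 1:
        return None
    head, _, rest = text[1:].partition(" ")
    # Reject inputs such as "/ skills" or file paths like "/etc/hosts".
    if not head.isalpha():
        return None
    return ChatCommand(name=head.lower(), args=rest.strip())


def render_skills_reply(skill_names: list[str]) -> str:
    """Format the in-chat reply, including the empty-catalog case."""
    if not skill_names:
        # Hypothetical copy: an explicit empty state, not an error.
        return "No skills are available right now."
    lines = [f"- {name}" for name in sorted(skill_names)]
    return "Available skills:\n" + "\n".join(lines)
```

A normal message returns `None` and flows to the assistant unchanged, so the list only appears when the user explicitly invokes the command.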
User Story 3 - Add Skill Hubs from External Sources (e.g. GitHub) (Priority: P2)
Administrators (or authorized users) can register additional skill hubs, such as a public or private GitHub repository, so that skills from those sources are incorporated into the shared skills layer. The assistant can then use these skills in the same way as skills from the default catalog.
Why this priority: Enables extension and reuse (e.g. org-specific or community skill packs) without changing core product code. Important for scale and customization, but depends on User Stories 1 and 2 being in place.
Independent Test: Register an external skill hub (e.g. a public GitHub repo), verify its skills appear in the central catalog and in the response to the chat command (e.g. /skills), and confirm the assistant can use those skills in conversation.
Acceptance Scenarios (include UI onboarding):
- Given an authorized user has access to add skill hubs, When they use the UI onboarding feature to crawl and add a GitHub repository (e.g. by providing owner/repo or URL, optional credentials, and previewing discovered SKILL.md paths), Then the system validates and registers that hub and makes skills read from that repo available through the shared skills layer.
- Given a hub is registered, When the catalog is refreshed or the assistant loads skills, Then skills from that hub appear in the same catalog as default skills and are usable by the assistant.
- Given a hub is registered, When a user invokes the chat command to show skills (e.g. `/skills`), Then skills from that hub are included in the list when they are successfully loaded.
- Given a hub is removed or disabled, When the catalog is refreshed, Then skills from that hub are no longer listed or used.
- Given a hub fails to load (e.g. network or permission issue), When the system refreshes skills, Then the rest of the catalog remains available and the user or admin receives a clear indication that the hub could not be loaded (no silent failure of the entire catalog).
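The last scenario (one hub failing without taking down the catalog) can be sketched as a loop that isolates each hub's loader. This is an illustrative sketch only: `HubLoadReport`, `load_hubs`, the record shape (`{"id": ...}`), and the "first hub wins" handling of duplicates are assumptions, not part of the spec.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class HubLoadReport:
    skills: dict[str, dict] = field(default_factory=dict)      # skill id -> record
    failed_hubs: dict[str, str] = field(default_factory=dict)  # hub id -> error text


def load_hubs(loaders: dict[str, Callable[[], list[dict]]]) -> HubLoadReport:
    """Load every hub independently and record failures instead of raising."""
    report = HubLoadReport()
    for hub_id, loader in loaders.items():
        try:
            hub_skills = loader()
        except Exception as exc:
            # One failing hub must not make the rest of the catalog unavailable.
            report.failed_hubs[hub_id] = f"{type(exc).__name__}: {exc}"
            continue
        for skill in hub_skills:
            # Skip malformed entries rather than failing the whole hub.
            if not isinstance(skill, dict) or not skill.get("id"):
                continue
            # Assumption: the first hub to define an id keeps it.
            report.skills.setdefault(skill["id"], {**skill, "hub": hub_id})
    return report
```

The `failed_hubs` map is what an admin surface would read to show "hub X failed to load" while the remaining skills stay listed.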
User Story 4 - Try Skills Gateway, External Clients, and Guided Setup (Priority: P1)
Developers and power users can use CAIPE as a Try skills gateway: they authenticate with Okta-issued tokens (OIDC Bearer, same lineage as FR-014) or a dedicated API key, call the catalog with search/query parameters, and see step-by-step instructions in the UI for wiring the same catalog into Claude and Cursor (and compatible Agent Skills layouts). Access respects personal, team, and global visibility (see FR-020).
Why this priority: Unlocks integrations and self-serve adoption without using the chat UI alone; aligns enterprise auth (Okta) with automation (API keys).
Independent Test: From the skills experience, open the gateway/docs panel, copy a curl or example using Bearer or API key, run a search query, and confirm results match visibility rules; follow the Claude and Cursor sections without external docs.
Acceptance Scenarios:
- Given a user with a valid Okta/OIDC access token or a valid API key, When they call the documented catalog endpoint with optional search query params, Then they receive only skills they are allowed to see (global + their teams + personal).
- Given the skills UI gateway page, When the user views integration help, Then they see ordered steps for Claude and Cursor (and pointers to agentskills.io-style layout where applicable).
- Given an invalid or expired token/key, When they call the gateway API, Then the system returns 401 with a generic, non-enumerating error.
- Given the Try skills gateway (or linked observability), When the user views skills sync status, Then they see whether the HTTP catalog (skills gateway / `GET /skills` lineage) and the in-process CAIPE supervisor snapshot are in sync, stale, or unknown, with actionable copy if refresh is needed (FR-026).
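The visibility rule in the first scenario (global + the caller's teams + personal) combined with search and pagination can be sketched as follows. The `Skill` shape, `visible_to`, `search_catalog`, and the `q` / `page` / `page_size` parameter names are assumptions for illustration; the spec does not fix the query parameter names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Skill:
    id: str
    description: str
    visibility: str                  # "global" | "team" | "personal"
    owner: Optional[str] = None      # set for personal skills
    teams: frozenset = frozenset()   # set for team skills


def visible_to(skill: Skill, sub: str, team_ids: set) -> bool:
    """Entitlement check: union of global, the caller's teams, and personal."""
    if skill.visibility == "global":
        return True
    if skill.visibility == "team":
        return bool(skill.teams & team_ids)
    if skill.visibility == "personal":
        return skill.owner == sub
    return False  # unknown visibility values are hidden


def search_catalog(catalog, sub, team_ids, q="", page=1, page_size=50):
    """Filter by entitlement first, then by text, then paginate."""
    needle = q.strip().lower()
    hits = [
        s for s in catalog
        if visible_to(s, sub, team_ids)
        and (not needle or needle in s.id.lower() or needle in s.description.lower())
    ]
    start = (page - 1) * page_size
    return hits[start:start + page_size]
```

Filtering by entitlement before applying the text match means a search can never reveal the existence of a skill the caller is not allowed to see.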
Edge Cases
- What happens when the central catalog is temporarily unavailable? The system should degrade gracefully: the chat command (e.g. `/skills`) and the assistant should show a clear "skills unavailable" or "try again" state rather than crashing or showing stale data as if it were current.
- What happens when an external hub returns invalid or malformed skill definitions? The system should reject or skip those entries, log or report the issue, and continue to serve the rest of the catalog.
- What happens when two hubs (or the default catalog and a hub) define a skill with the same identifier? The system should apply a consistent, predictable rule (e.g. one source wins, or explicit override order) and document that behavior so admins can avoid conflicts.
- What happens when a user without permission to add hubs tries to add one? The system should deny the action and return a clear permission error.
- What happens when the chat command (e.g. `/skills`) is used while skills are still loading? The system should show a loading or "fetching skills" state and then show the list when ready, or a clear message if loading fails.
- Where do users learn to see their loaded skills after "Run in Chat" / run-skills controls are removed? The UI must direct users to type `/skills` in chat to see their loaded skills (e.g. chat input placeholder "Type /skills to see available skills" or similar).
- Where do admins onboard a GitHub skill hub if it is easy to miss? The skills experience MUST surface a visible entry point (e.g. link or CTA to Admin → Skill Hubs, plus empty-state copy when no hubs exist) so onboarding is discoverable without reading internal docs.
- What happens when skill-scanner reports high-severity findings on an ingested skill? The system MUST apply a documented policy (e.g. block ingest, quarantine, or warn-only for admins) and MUST NOT imply that a clean scan guarantees safety (scanner is best-effort per vendor documentation).
- What happens when a search query matches no skills? The UI and API MUST return an empty set with a clear message, not a generic error.
- What happens when thousands of skills exist? The catalog API MUST remain paginated/filterable; the supervisor MUST NOT rely on unbounded injection of every skill's full body into the context window (see FR-015, FR-024, Assumptions).
- What happens when the skills gateway catalog and the supervisor are out of sync? The UI MUST show a stale or not yet applied state (FR-026), not silently imply the assistant already has the same skill set as the latest `GET /skills` response after a hub or config change.
- What happens when an agent-skills document's total ancillary files exceed the 5 MB soft limit (FR-028)? The UI MUST warn the user before save and suggest using a hub instead. If the total exceeds MongoDB's 16 MB hard document limit, the save MUST fail with a clear error message.
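For the duplicate-identifier edge case above, one deterministic rule is a fixed source precedence. The sketch below is only an example of such a rule: the order in `SOURCE_PRECEDENCE` is an assumption (the spec requires a documented rule but does not choose one), and `merge_by_precedence` is a hypothetical helper.

```python
# Assumed order for illustration only: earlier entries win over later ones.
SOURCE_PRECEDENCE = ("default", "agent_skills", "hub")


def merge_by_precedence(skills):
    """Merge skill records by id; return (winners, shadowed) for admin reporting."""
    rank = {source: i for i, source in enumerate(SOURCE_PRECEDENCE)}
    unknown = len(rank)  # unknown sources sort last
    merged = {}
    shadowed = []  # (skill id, losing source) pairs
    for skill in skills:
        current = merged.get(skill["id"])
        if current is None:
            merged[skill["id"]] = skill
        elif rank.get(skill["source"], unknown) < rank.get(current["source"], unknown):
            shadowed.append((skill["id"], current["source"]))
            merged[skill["id"]] = skill
        else:
            shadowed.append((skill["id"], skill["source"]))
    return merged, shadowed
```

Returning the shadowed entries alongside the winners gives admins a way to see conflicts instead of having one definition silently disappear.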
Requirements (mandatory)
Functional Requirements
- FR-001: The system MUST expose a single, shared skill catalog to both the chat UI (for display) and the platform assistant (for execution); both MUST consume skills from this same source.
- FR-002: The system MUST remove any "run skills", "Run in Chat", or equivalent action from the chat experience; skill execution MUST be driven by the assistant using the shared catalog. The UI MUST direct users to use the `/skills` command to see their loaded skills (e.g. via chat placeholder, tooltip, or help text).
- FR-003: The system MUST support a designated chat command (e.g. `/skills`) that, when invoked, displays the list of skills available to the assistant in that conversation.
- FR-004: The list of skills shown by the chat command MUST match the shared catalog (same as used by the assistant and by the rest of the UI), including search/pagination and entitlement when those features apply (see FR-019, FR-020 for the same caller context).
- FR-005: The system MUST allow authorized users to register external skill hubs (e.g. by repository or URL); once registered, skills from those hubs MUST be incorporated into the shared catalog and usable by the assistant.
- FR-006: The system MUST support at least one type of external hub that is commonly used for code or asset sharing (e.g. public or private repository); registration MUST accept the necessary identifiers (e.g. repository location and optional credentials or access method).
- FR-007: When an external hub is removed or disabled, the system MUST stop listing and using skills from that hub after the next catalog refresh or equivalent update.
- FR-008: The system MUST handle catalog or hub load failures gracefully: partial catalog availability and clear, non-misleading feedback to the user or admin (e.g. "skills temporarily unavailable" or "hub X failed to load").
- FR-009: The system MUST enforce access control so that only authorized users can add, update, or remove skill hubs; unauthorized attempts MUST be rejected with a clear permission message.
- FR-010: When multiple sources define a skill with the same identifier, the system MUST apply a deterministic, documented resolution rule (e.g. precedence by source or explicit override) so that behavior is predictable.
- FR-011: The system MUST accept SKILL.md files in both Anthropic/agentskills.io-style and OpenClaw-style format (YAML frontmatter + markdown body) when loading skills from hubs (e.g. GitHub). ClawHub as a hub source is out of scope for v1.
- FR-012: The CAIPE supervisor MUST read runtime-updated skills from MongoDB (and other catalog sources) and load them into the skills middleware so that catalog updates (e.g. new or changed skills, hub registration, agent_skills changes) are reflected without restarting the supervisor. The supervisor MUST hot reload skills (e.g. on each catalog request or short TTL cache) or support a trigger from the UI (e.g. "Refresh skills" or action after onboarding a hub) to reload the catalog.
- FR-013: The system MUST provide a UI feature that allows authorized users to onboard GitHub repositories as skill hubs (add repo identifier/location, optional credentials, crawl/preview discovered skills per FR-017, list onboarded repos, remove or disable). Skills read from onboarded repos are incorporated into the shared catalog.
- FR-014: The backend endpoint that serves the skill catalog to the UI (e.g. GET /skills or /internal/skills) MUST validate the request using the same authentication pattern as the RAG server: JWT validation via JWKS and/or user_info (e.g. Bearer token validated with JWKS, optionally userinfo for identity/groups). Okta-issued OIDC access tokens MUST be supported when Okta is the IdP. Unauthenticated or invalid tokens MUST be rejected (e.g. 401).
- FR-015: The system MUST use the upstream `deepagents.middleware.skills.SkillsMiddleware` to inject skills into the supervisor's system prompt via progressive disclosure (metadata listing in prompt, full SKILL.md content read on demand). The custom `skills_middleware` catalog layer MUST feed aggregated skills into the `SkillsMiddleware`'s backend (e.g. `StateBackend`) so that the upstream middleware handles system prompt formatting, YAML frontmatter parsing, and the "Skills System" prompt section. The custom catalog layer retains responsibility for MongoDB/agent_skills/hub aggregation, precedence rules, dual SKILL.md format normalization (FR-011), hot reload (FR-012), and visibility filtering (FR-020) so the supervisor's effective skill set for a conversation matches the caller's entitled catalog where the product defines user-bound execution.
- FR-016: The system MUST expose supervisor skills-load observability to authorized users or operators (e.g. admin UI panel and/or authenticated API): at minimum whether the in-process supervisor matches a recent catalog refresh, using concrete fields such as graph generation (`_graph_generation`), count of merged skills last loaded into the deep agent, and/or timestamp of last successful skills merge, so it is visible when the supervisor has picked up refreshed skills versus stale in-memory state. FR-026 requires the Try skills gateway to present a gateway-vs-supervisor comparison for the same lineage of data.
- FR-017: The hub onboarding UI MUST include a crawl step for GitHub skill repositories: discover/list SKILL.md (and related) paths from the repo (e.g. preview before save) in addition to registering the hub, so admins can confirm what will be ingested. Crawl MUST respect the same auth and hub validation rules as registration (FR-009).
- FR-018 (Try Skills Gateway): The UI MUST expose a Try skills gateway experience: overview of the catalog HTTP API, authentication options (Okta/OIDC Bearer per FR-014 and API key for machine clients), example requests, and step-by-step integration guides for Claude and Cursor (copy-paste friendly). API keys MUST be revocable, scope-limited to catalog read (and related gateway operations), and stored per security policy (no plaintext keys in logs).
- FR-019 (Search): The catalog API and UI MUST support search/query filtering (e.g. text search on name/description and optional filters) so clients can narrow large catalogs; behavior MUST match between UI, gateway docs, and `/skills` chat command surfaces for the same caller context.
- FR-020 (Visibility): Each skill (or catalog entry) MUST support a visibility dimension: global (org-wide), team (limited to named team(s)), and personal (owner only). The system MUST enforce visibility on catalog list and detail responses for UI, gateway, and chat command based on the authenticated principal (and team membership). Precedence and indexing MUST be documented (e.g. user sees union of global + their teams + personal).
- FR-021 (Source UX): The skills UI MUST visually distinguish skills by origin using user-friendly labels: Custom (user-created, `source: agent_skills`), Built-in (platform/default, `source: default`), and Skill hub (GitHub or other registered hub, `source: hub`). The label "Agent config" MUST NOT appear in user-facing UI copy; use "Custom" instead. Consistent labels, grouping, or filters MUST be used so users understand provenance and trust boundaries.
- FR-022 (Hub discoverability): The main skills experience MUST include a discoverable path to GitHub hub onboarding (e.g. prominent link or button to Admin → Skill Hubs, tooltip, or settings entry) and onboarding-related empty states that explain how to add a repo.
- FR-023 (Skill security scanning): The pipeline MUST integrate Skill Scanner from Cisco AI Defense (or an equivalent approved wrapper) for skills loaded from GitHub hubs and, where feasible, default packaged skills: run static/best-effort analysis before or during merge, surface results to admins, and apply a documented gate (e.g. fail on high/critical, or warn-only). Documentation MUST state that no findings does not prove safety (aligned with upstream scope/limitations). Third-party attribution: Product documentation, repository NOTICE / third-party credits (where the project maintains them), and admin UI surfaces that name or summarize the scanner MUST clearly state that Skill Scanner is provided by Cisco AI Defense and MUST include the repository URL https://github.com/cisco-ai-defense/skill-scanner.
- FR-024 (Supervisor scale / context): For large catalogs (e.g. thousands of skills), the system MUST ensure the deep agent / `SkillsMiddleware` path does not linearly bloat the context window with full SKILL.md bodies: progressive disclosure (metadata in prompt, full content loaded on demand per FR-015) is mandatory; additionally, the implementation MUST apply a documented cap or windowing strategy for how many skill summaries appear in the system prompt (e.g. max N, relevance to task, or pagination contract) so token usage stays bounded. Merging thousands of files into backend storage is allowed; prompt injection MUST remain bounded by design.
- FR-025 (agent_skills source / naming alignment): The implementation MAY refactor code and UI so MongoDB `agent_skills`, the `agent_skills` loader (`skills_middleware/loaders/agent_skill.py`), and catalog `source: agent_skills` are the single conceptual model for user/agent-authored skills, consolidating or renaming legacy surfaces (routes, components, copy) that duplicate the same data. Any refactor MUST preserve backward compatibility for existing stored documents (migration or dual-read) until a documented cutover; MUST NOT change merge precedence (FR-010) or visibility rules (FR-020) without a spec amendment.
- FR-026 (Gateway-supervisor sync status): The Try skills gateway (FR-018) MUST expose skills sync status between the HTTP catalog (merged skills cache / generation the gateway documents for API clients) and the in-process CAIPE supervisor (FR-016 fields: e.g. `graph_generation`, merged skill count, last successful merge timestamp). Operators MUST see a clear state such as In sync, Supervisor stale (catalog newer than supervisor; suggest refresh), or Unknown (e.g. status unavailable). The same comparison MAY also appear in admin observability (FR-016) if not duplicated confusingly.
- FR-027 (Scan on agent-skills save): When a user creates or updates an agent-skills document that contains `skill_content` (or the equivalent skill body), the system MUST synchronously invoke the skill scanner (FR-023) against that content before completing the save response. The document MUST always be persisted regardless of scan outcome, but with a `scan_status` field: `passed` (scanner ran, no blocking findings), `flagged` (scanner ran, findings met the severity threshold under `SKILL_SCANNER_GATE=strict`), or `unscanned` (scanner unavailable or `skill_content` absent). When `SKILL_SCANNER_GATE=strict`, the catalog loader (`agent_skills` source) MUST exclude documents with `scan_status: "flagged"` from the merged catalog so they do not appear in the supervisor's skill set or the UI catalog until remediated. The save response MUST include `scan_status` (and optionally `scan_summary`) so the UI can inform the user. Scan findings MUST be persisted to `skill_scan_findings` with `source_type: "agent_skills"` for admin review. The Next.js API route (`/api/agent-skills`) MUST have `BACKEND_SKILLS_URL` configured (in `.env.local`, documented in `.env.example`) pointing to the FastAPI backend; without it the scanner is silently skipped and `scan_status` defaults to `"unscanned"`.
- FR-028 (Multi-file skills): Skills MUST support ancillary files (`scripts/`, `references/`, `assets/`, and other directories) alongside `SKILL.md`, per the agentskills.io specification.
  - (a) Hub skills: When fetching skills from a GitHub hub, the system MUST fetch all files under each skill's directory (not just `SKILL.md`) and write them into `StateBackend` at the correct relative paths so the agent can read scripts, references, and other files at runtime via `SkillsMiddleware`.
  - (b) Agent-skills (`source: agent_skills`): Users MUST be able to attach ancillary files to agent-skills documents via two mechanisms: (1) a file drop zone in the skill editor UI (drag-and-drop upload stored as `ancillary_files` map in MongoDB), and (2) a GitHub repo/directory link that imports (fetches and snapshots) all files at save time into the same `ancillary_files` field. The repo link is an import convenience, not a live reference; users re-import to update.
  - (c) Storage limits: Agent-skills `ancillary_files` MUST enforce a 5 MB soft limit (total size of all ancillary files per document) with a UI warning; the UI MUST suggest using a hub for larger skills. MongoDB's 16 MB hard document limit is the backstop.
  - (d) Hub skills have no fetch size limit (trust the repo maintainer).
  - (e) Ancillary files MUST be written to `StateBackend` under `{source_dir}/{skill_name}/{relative_path}` so the agent can access them via the standard `read_file` / `download_files` tools provided by `FilesystemMiddleware`.
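The `scan_status` rules in FR-027 can be sketched as two small functions: one that derives the status at save time and one that the `agent_skills` loader could use to exclude flagged documents. This is one reading of the requirement, not the implementation: the function names, the `BLOCKING_SEVERITIES` threshold, and the choice to report `passed` under a non-strict gate even when high-severity findings exist are assumptions.

```python
from typing import Optional

# Assumed severity threshold; FR-023 only gives "high/critical" as an example.
BLOCKING_SEVERITIES = {"high", "critical"}


def compute_scan_status(
    skill_content: Optional[str],
    scanner_available: bool,
    finding_severities: list[str],
    gate: str,  # value of SKILL_SCANNER_GATE, e.g. "strict"
) -> str:
    """Derive scan_status for an agent-skills save; the document is saved either way."""
    if not skill_content or not scanner_available:
        return "unscanned"
    blocking = any(s.lower() in BLOCKING_SEVERITIES for s in finding_severities)
    # Per FR-027, "flagged" is defined relative to the strict gate.
    if blocking and gate == "strict":
        return "flagged"
    return "passed"


def include_in_catalog(doc: dict, gate: str) -> bool:
    """Loader-side filter: hide flagged documents only when the gate is strict."""
    return not (gate == "strict" and doc.get("scan_status") == "flagged")
```

Keeping persistence unconditional and filtering in the loader matches the requirement that a flagged document is stored but stays out of the merged catalog until remediated.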
Supervisor runtime reference (CAIPE platform engineer)
- Primary Python modules (supervisor + skills wiring):
  - `ai_platform_engineering/multi_agents/platform_engineer/deep_agent.py` (`AIPlatformEngineerMAS`): builds the deep agent graph, calls `get_merged_skills` / `build_skills_files` inside `_build_graph()`, holds `_skills_files` / `_skills_sources`, increments `_graph_generation` on each build.
  - `ai_platform_engineering/multi_agents/platform_engineer/protocol_bindings/a2a/agent.py` (`AIPlatformEngineerA2ABinding`): constructs `AIPlatformEngineerMAS()`, exposes `get_graph()`, seeds `files` on each graph invoke from `_mas_instance._skills_files` (see Architecture: `SkillsMiddleware` lifecycle... below; per-user entitlement should override this copy at invoke time).
  - `ai_platform_engineering/multi_agents/platform_engineer/protocol_bindings/fastapi/main.py`: FastAPI app mounting the `skills_middleware` router (`GET /skills`, `POST /skills/refresh`).
  - `ai_platform_engineering/skills_middleware/`: catalog aggregation, cache invalidation, hub loaders, `router.py` (HTTP API).
- When a new deep agent graph is created: Each call to `_build_graph()` constructs a new compiled graph via `create_deep_agent(...)`. That runs on process startup (`AIPlatformEngineerMAS.__init__`) and whenever `platform_registry` dynamic monitoring reports agent connectivity changes (`_on_agents_changed` → `_rebuild_graph` → `_build_graph`). Each such build re-reads the merged skills catalog (`get_merged_skills`) and rebuilds `skills` / `files` passed into the deep agent. Catalog-only invalidation (e.g. `POST /skills/refresh` clearing the skills cache) MUST be wired so the supervisor reloads skills consistent with FR-012 and FR-016 (rebuild graph or equivalent in-process reload); the spec treats "refresh visible in supervisor" as incomplete if only the HTTP cache clears without updating the MAS snapshot.
Architecture: SkillsMiddleware lifecycle, shared graph, and per-invoke files (FR-015 / FR-020)
This subsection is normative for implementers: it avoids a common mistake (assuming skills are re-bound on every message) and records how multi-user entitlement must be achieved without rebuilding middleware per user.
| Question | Answer |
|---|---|
| Is a new `SkillsMiddleware` created on each user message or A2A invoke? | No. `SkillsMiddleware` is wired when `create_deep_agent(..., skills=...)` runs inside `_build_graph()`. The same compiled graph (and thus the same middleware instances) is reused for subsequent invocations until the next graph rebuild. |
| What runs on every invoke? | A new conversation thread (`thread_id`) and initial graph state, including the `files` map seeded for `StateBackend` / `SkillsMiddleware`. |
| Where is `files` set today? | `AIPlatformEngineerMAS.serve()` / `serve_stream()` copy `self._skills_files` into `state_dict["files"]`. A2A (`protocol_bindings/a2a/agent.py`) copies `self._mas_instance._skills_files` into invoke inputs the same way. Both paths use the snapshot produced at the last `_build_graph()`. |
| Implication for concurrent users on one supervisor process | All users currently share the same `self._skills_files` snapshot unless implementation is extended. FR-020 requires the effective skill set for a turn to match the caller's entitlement. That does not require a new `SkillsMiddleware` or a new compiled graph per user. |
| Preferred pattern for per-user entitled skills | At the invoke boundary (A2A handler and any FastAPI path that invokes the graph), compute `files` (and apply `MAX_SKILL_SUMMARIES_IN_PROMPT` after filtering) from the org-wide merged catalog + caller `sub` + team ids, then pass that dict as `files` in initial state instead of blindly copying `self._skills_files`. The compiled graph and `SkillsMiddleware` stay shared; state carries the per-principal skill bundle. |
| Anti-patterns | (1) Rebuilding `_build_graph()` per user per request: unnecessary cost and operational risk. (2) Mutating `self._skills_files` per request without synchronization: race conditions across concurrent invocations. (3) Assuming "refresh catalog" alone changes per-user `files` without invoke-time filtering when visibility is enforced. |
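The preferred pattern in the table can be sketched as a pure function that builds a fresh `files` dict per invoke from the shared merged catalog. This is a sketch under stated assumptions: `build_files_for_caller`, `_entitled`, the catalog record shape, the example paths, and the cap value are hypothetical; only the name `MAX_SKILL_SUMMARIES_IN_PROMPT` comes from the spec, and file values are treated as opaque (whatever the existing `_skills_files` snapshot stores).

```python
MAX_SKILL_SUMMARIES_IN_PROMPT = 50  # name from the spec; the value is an assumption


def _entitled(skill: dict, sub: str, team_ids: set) -> bool:
    """Visibility check: global, or a shared team, or personal ownership."""
    visibility = skill.get("visibility")
    if visibility == "global":
        return True
    if visibility == "team":
        return bool(set(skill.get("teams", ())) & set(team_ids))
    if visibility == "personal":
        return skill.get("owner") == sub
    return False


def build_files_for_caller(merged_catalog, sub, team_ids,
                           max_skills=MAX_SKILL_SUMMARIES_IN_PROMPT):
    """Return a new files dict for one invoke; shared state is never mutated."""
    files = {}
    kept = 0
    for skill in merged_catalog:
        if kept >= max_skills:
            break  # cap is applied after entitlement filtering
        if not _entitled(skill, sub, team_ids):
            continue
        files.update(skill["files"])  # path -> opaque file value
        kept += 1
    return files


# At the invoke boundary (hypothetical wiring), the result would replace the
# blind copy of self._skills_files in the initial state:
#   state = {"messages": [...], "files": build_files_for_caller(catalog, sub, teams)}
```

Because the function only reads the catalog and returns a new dict, concurrent invocations cannot race on a shared snapshot, and the compiled graph and middleware instances stay shared.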
Key Entities
- Skill: A capability offered to the assistant (e.g. a named action or tool) with a stable identifier, description, optional parameters, visibility (`global` | `team` | `personal`), optional team binding(s), source metadata (`default` / Built-in, `agent_skills` / Custom, `hub` / Skill hub), and optional ancillary files (`scripts/`, `references/`, `assets/`, or other directories per FR-028). Consumed from the shared catalog by the UI and the assistant. Skill format: support both Anthropic/agentskills.io-style and OpenClaw-style SKILL.md (YAML frontmatter + markdown body) when loaded from GitHub or other supported hubs. User-facing labels: Custom (not "Agent config"), Built-in, Skill hub.
- Catalog API key: A revocable, scope-limited credential (machine client) used with the Try skills gateway alongside Okta/OIDC Bearer tokens; MUST NOT appear in logs and MUST be rotatable without downtime of unrelated features.
- Skill scan finding: A record produced by Skill Scanner (provided by Cisco AI Defense) (severity, rule id, path) associated with an ingested skill from a hub revision (FR-023) or an agent-skills save/publish (FR-027); used for admin review and optional ingest/save gates. The `source_type` field distinguishes hub vs agent-skills findings.
- Skill catalog (central / shared): The single source of truth for available skills; used by the chat UI for display (e.g. `/skills`) and by the platform assistant for execution. The catalog layer feeds skills into the upstream `SkillsMiddleware` via a backend (e.g. `StateBackend`) so the middleware handles system prompt injection.
- Skill hub: An external source of skills (e.g. a repository) that can be registered so that its skills are merged into the shared catalog; has an identifier, location, optional credentials, and status (e.g. enabled/disabled, last load success/failure).
- Chat command: A reserved input (e.g. `/skills`) that triggers a specific in-chat behavior (e.g. showing the list of skills) instead of being sent as a normal user message to the assistant.
- Skills sync status (gateway-supervisor): A composed operational view comparing catalog cache generation (or equivalent from the skills HTTP API) with supervisor graph generation / merge metadata (FR-016) so operators know whether the assistant has applied the same skill set the gateway API advertises (FR-026).
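The skills sync status entity can be illustrated with a timestamp comparison. This is only one possible composition: it assumes the gateway can report when the catalog was last refreshed and the supervisor can report its last successful merge time (FR-016), and the function name and state strings are hypothetical. A generation-based comparison would work the same way if both sides tracked the same counter.

```python
from datetime import datetime
from typing import Optional


def skills_sync_status(
    catalog_refreshed_at: Optional[datetime],
    supervisor_merged_at: Optional[datetime],
) -> str:
    """Map the two timestamps onto the three operator-facing states in FR-026."""
    if catalog_refreshed_at is None or supervisor_merged_at is None:
        return "unknown"  # status unavailable on either side
    if supervisor_merged_at >= catalog_refreshed_at:
        return "in_sync"
    return "supervisor_stale"  # catalog is newer than the supervisor: suggest refresh
```

Both timestamps should come from the same clock domain (for example, both UTC and timezone-aware); Python raises `TypeError` when comparing naive and aware datetimes.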
Success Criteria (mandatory)
Measurable Outcomes
- SC-001: Users can see the same set of available skills in the UI and in the chat command response, with no "run skills" or "Run in Chat" flow in chat; consistency is verifiable by comparing the two surfaces.
- SC-002: Users can discover available skills from within the chat in under two actions (e.g. typing the command and viewing the result).
- SC-003: After an admin registers an external hub, skills from that hub appear in the shared catalog and in the chat command response within one refresh or documented time window (e.g. within one minute under normal conditions).
- SC-004: When the central catalog or a hub is unavailable, users see a clear, non-technical message (e.g. "Skills are temporarily unavailable") and the chat remains usable for non-skill interactions.
- SC-005: Unauthorized attempts to add or remove skill hubs fail with a clear permission message in 100% of tested cases; no hub is added or removed without proper authorization.
- SC-006: At least one external hub type (e.g. a public or private repository) is supported for registration; an admin can add a hub and the assistant can use skills from it in a real conversation.
- SC-007: After a catalog or hub refresh, an authorized operator can confirm from the UI or API that the supervisor's loaded skills metadata (e.g. generation, count, timestamp per FR-016) reflects the refresh or documents stale state.
- SC-008: A developer can complete gateway setup using only in-product docs: obtain a Bearer token or API key, run a documented search query against the catalog API, and receive results consistent with global / team / personal visibility for that principal.
- SC-009: Admins can see Skill Scanner outcomes (or absence of blockers) for hub-ingested skills in a dedicated surface; documentation states that clean scans are best-effort, not a security guarantee (skill-scanner scope). Attribution to Cisco AI Defense with the canonical repo link is present per FR-023 wherever the scanner is named in product or admin UI.
- SC-010: From the Try skills gateway (or linked panel), an authorized operator can see skills sync status between the catalog the gateway documents and the CAIPE supervisor snapshot (aligned vs stale vs unknown per FR-026).
- SC-011: When a user saves an agent-skills document with `skill_content`, the save response includes `scan_status` (`passed`, `flagged`, or `unscanned`). Under `SKILL_SCANNER_GATE=strict`, a `flagged` document does not appear in the merged catalog (`GET /skills` or supervisor skill set) until its content is remediated and re-scanned. Scan findings are persisted to `skill_scan_findings` with `source_type: "agent_skills"` and are visible to admins.
- SC-012: When a GitHub hub skill contains ancillary files (e.g. `scripts/thumbnail.py`, `references/api.md`), those files are available to the agent at runtime via `read_file` / `download_files` through `StateBackend`. The hub preview/crawl shows the count of ancillary files per skill.
- SC-013: When a user saves an agent-skills document with ancillary files (via drop zone or GitHub import), the files are stored in MongoDB and available to the agent at runtime. If total ancillary file size exceeds 5 MB, the UI warns the user before save.
Assumptions
- "Current skills" refers to the existing skill definitions or catalog used by the platform today; the feature integrates these into a single catalog consumed by both UI and assistant.
- The shared skills layer has two parts: (1) a custom catalog layer (ai_platform_engineering/skills_middleware/) that aggregates skills from MongoDB, agent_skills, filesystem, and registered hubs, applying precedence and normalization; and (2) the upstream deepagents.middleware.skills.SkillsMiddleware that handles system prompt injection via progressive disclosure (listing skills in the prompt, reading full SKILL.md on demand via the backend). The catalog layer writes normalized skills into the SkillsMiddleware's backend storage (e.g. StateBackend), and the upstream middleware injects them into the supervisor's system prompt automatically (FR-015). The CAIPE supervisor reads runtime-updated skills from MongoDB and loads them into the skills middleware so updates are visible without restart (FR-012). The supervisor must hot reload skills (e.g. on each catalog read or short TTL) or support a UI-triggered refresh (e.g. "Refresh skills" or after onboarding a hub) so the catalog reloads without restart.
- The chat command (e.g. /skills) is the primary in-chat way to list skills; the exact syntax (e.g. /skills vs another slash-command) can be decided during design, but the behavior (show list of skills) is fixed.
- "Skill hub" includes at least one option that is repository-based (e.g. GitHub public or private); other hub types may be added later. ClawHub (OpenClaw marketplace) as a hub source is out of scope for v1 (risk/complexity); document as a future option. Hub-loaded skills may be in Anthropic/agentskills.io or OpenClaw-style SKILL.md format; both are supported when discovered from GitHub (or other v1 hubs).
- Only authorized roles (e.g. administrators or configured "skill hub managers") can register, update, or remove external hubs; end users can only view and use skills.
- The assistant uses the same entitlement rules as the catalog API for the active user/context: skills visible in UI, gateway, /skills, and loaded into the supervisor for that session are drawn from the union of global + the user's team-scoped + personal entries (FR-020). There is no separate ad-hoc per-conversation override list unless explicitly specified as a future feature.
- Large catalogs: Storing and merging thousands of skill definitions in MongoDB/backends is expected; system prompt size MUST stay bounded via progressive disclosure (FR-015) and explicit caps/windowing on how many skill summaries appear at once (FR-024). Full SKILL.md bodies are not all injected into the context window simultaneously.
- Multi-user supervisor (FR-015 + FR-020): One supervisor process may handle many concurrent users. SkillsMiddleware is not recreated per invoke (see Architecture: SkillsMiddleware lifecycle... in the Supervisor section). Per-user entitlement is satisfied by supplying per-invoke files (filtered + capped) in initial graph state, not by compiling a separate graph per principal. Keep an org-wide merged catalog for merge/precedence; at each A2A/FastAPI invoke resolve the caller's sub and team ids and build the files map for that turn. Optional: memoize filtered file bundles keyed by (principal_id, catalog_cache_generation) with a short TTL. Alternative (usually heavier): separate graph or process per tenant. Tasks T066 and the Supervisor architecture table name the entry points (deep_agent.py, protocol_bindings/a2a/agent.py).
- Multi-file skills (FR-028): A skill is a directory whose only required file is SKILL.md. Optional first-class directories per the agentskills.io specification: scripts/ (executable code), references/ (extra documentation), assets/ (static resources). Hub skills are fetched as full directory trees (no size limit). Agent-skills documents store ancillary files inline in MongoDB as ancillary_files: Record<string, string> with a 5 MB soft cap; a GitHub repo link can import (snapshot) files at save time but is not a live reference.
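The per-invoke entitlement step in the multi-user assumption can be sketched as follows. This is a minimal illustration under stated assumptions, not the supervisor's actual code: the Skill dataclass, its visibility / team_id / owner_sub fields, the build_files_for_principal helper, the default cap of 50, and the /skills/<name>/SKILL.md path layout are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Skill:
    """Hypothetical catalog entry; field names are illustrative only."""
    name: str
    body: str                        # full SKILL.md text
    visibility: str                  # "global", "team", or "personal"
    team_id: Optional[str] = None    # set when visibility == "team"
    owner_sub: Optional[str] = None  # set when visibility == "personal"


def is_entitled(skill: Skill, sub: str, team_ids: set[str]) -> bool:
    """Union rule from FR-020: global + caller's teams + caller's own."""
    if skill.visibility == "global":
        return True
    if skill.visibility == "team":
        return skill.team_id in team_ids
    if skill.visibility == "personal":
        return skill.owner_sub == sub
    return False  # unknown visibility: fail closed (assumption)


def build_files_for_principal(
    catalog: list[Skill],
    sub: str,
    team_ids: set[str],
    max_skills: int = 50,  # hypothetical cap standing in for FR-024
) -> dict[str, str]:
    """Filter the org-wide catalog for one caller and cap the result."""
    entitled = [s for s in catalog if is_entitled(s, sub, team_ids)]
    entitled.sort(key=lambda s: s.name)  # deterministic order before capping
    return {f"/skills/{s.name}/SKILL.md": s.body for s in entitled[:max_skills]}
```

The org-wide catalog is built once; only the filter and cap run per invoke, which is why the compiled graph does not have to be rebuilt per principal.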
Clarifications
Session 2026-03-18
- Q: Should this feature support OpenClaw/ClawHub skills (e.g. as a hub source or compatible format), or is the initial scope limited to GitHub repos and agentskills.io-aligned SKILL.md only? → A: Support both Anthropic/agentskills.io-style and OpenClaw-style SKILL.md format when loaded from GitHub (or other supported hubs). ClawHub as a hub source is out of scope for v1 (too risky); document as future option.
- Q: When and how should the CAIPE supervisor see skill catalog updates from MongoDB? → A: The CAIPE supervisor MUST read runtime-updated skills from MongoDB and load them into the skills middleware so that catalog updates (new skills, hub registration, agent_skills changes) are reflected without restarting the supervisor.
- Q: How do authorized users add GitHub repos as skill hubs? → A: The system MUST provide a UI feature to onboard GitHub repositories to read skills from (e.g. admin or settings page where users can add repo location, optional credentials, and see list/status of onboarded repos).
- Q: How should the UI handle removal of "run in chat" and discovery of skills? → A: In the UI, "run in chat" MUST be removed; users MUST be directed to use the /skills command to see their loaded skills (e.g. via placeholder, tooltip, or help text in chat).
- Q: How should the CAIPE supervisor pick up catalog updates (new hubs, changed skills)? → A: The CAIPE supervisor MUST hot reload skills (e.g. on each catalog read or short TTL cache) or support a trigger from the UI (e.g. "Refresh skills" or post-onboard action) so the catalog is reloaded without restarting the supervisor.
- Q: How should the backend catalog endpoint authenticate requests? → A: Validate the token using JWKS or user_info, same pattern as the RAG server (FR-014).
- Q: Should we use the upstream deepagents.middleware.skills.SkillsMiddleware for system prompt injection? → A: Yes. Use the upstream SkillsMiddleware (from deepagents>=0.3.8) for injecting skills into the supervisor's system prompt via progressive disclosure. Our custom catalog layer (skills_middleware/) handles aggregation, precedence, hub fetch, and normalization, then writes the merged skills into the SkillsMiddleware's backend (e.g. StateBackend) so the upstream middleware handles prompt formatting and "read on demand" (FR-015).
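The two hot-reload options from this session (short TTL cache, or an explicit UI-triggered refresh) can be combined in one small cache. The sketch below is illustrative only: SkillsCache, its loader argument (standing in for whatever reads the merged catalog from MongoDB), the 30-second default, and the generation counter are assumptions for the example rather than names taken from the codebase.

```python
import time
from typing import Callable, Optional


class SkillsCache:
    """TTL cache with a manual refresh hook (hypothetical helper)."""

    def __init__(
        self,
        loader: Callable[[], list[dict]],       # stand-in for the catalog read
        ttl_seconds: float = 30.0,              # assumed default, not from the spec
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._skills: list[dict] = []
        self._loaded_at: Optional[float] = None
        self.generation = 0  # bumped on every reload, useful for status reporting

    def get(self) -> list[dict]:
        """Return cached skills, reloading first if the TTL has expired."""
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._ttl:
            self.refresh()
        return self._skills

    def refresh(self) -> None:
        """Force a reload, e.g. from a 'Refresh skills' action or after hub onboarding."""
        self._skills = self._loader()
        self._loaded_at = self._clock()
        self.generation += 1
```

Injecting the clock keeps the expiry logic testable without sleeping; a production version would also need to decide what to serve when the loader raises.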
Session 2026-03-23
- Q: What should the spec say about supervisor files, deep agent reload, "Run in Chat", GitHub crawl UI, and visibility of skills refresh? → A: Document a Supervisor runtime reference listing primary modules (deep_agent.py, A2A agent.py, FastAPI main.py, skills_middleware/). New deep agent: each _build_graph() creates a new compiled graph (startup + platform_registry change → _rebuild_graph); skills re-read via get_merged_skills on each build; catalog HTTP refresh MUST be wired to supervisor reload for FR-012/FR-016 consistency. Remove "Run in Chat" explicitly alongside run-skills (FR-002, edge cases, SC-001). UI: add GitHub repo crawl/preview step (FR-017, FR-013 story). Observability: FR-016 + SC-007 for supervisor skills-load metadata (generation, count, timestamp).
Session 2026-03-24
- Q: Should the UI skills area act as a Try skills gateway with Okta token or API key and guided Claude/Cursor steps? → A: Yes: FR-018, User Story 4; Okta/OIDC Bearer (FR-014) plus revocable catalog API keys; in-product step-by-step for Claude and Cursor.
- Q: Must the catalog support search and personal / team / global access? → A: Yes: FR-019 (search/query), FR-020 (visibility); union of entitlements for list/detail/chat/gateway; FR-015 updated so supervisor effective set aligns with entitled catalog.
- Q: How to improve UX for current vs default vs GitHub-crawled skills and where to onboard a GitHub repo? → A: FR-021 (source labels/grouping), FR-022 (discoverable path to Admin → Skill Hubs, empty states); edge case for onboarding discoverability.
- Q: How do thousands of skills affect the supervisor / deep agent context? → A: FR-024 + Assumptions: storage scale is fine; prompt must use progressive disclosure (FR-015) and bounded summary count/windowing; full SKILL.md on demand only, not all-at-once injection.
- Q: Integrate skill-scanner? → A: Yes: FR-023, SC-009, edge case for high-severity policy; link cisco-ai-defense/skill-scanner; no-findings ≠ safe per upstream.
Session 2026-03-26
- Q: May we align MongoDB agent_skills, the catalog source: agent_skills tag, and UI/route naming? → A: Yes (FR-025): the agent_skills loader and source: agent_skills are the conceptual model for user/agent-authored skills; refactor UI/routes/modules allowed with backward-compatible reads or migration; no silent precedence/visibility changes.
- Q: Should we show sync status between skills gateway and CAIPE supervisor? → A: Yes: FR-026, SC-010, User Story 4 scenario 4, Key entity Skills sync status; gateway MUST show catalog vs supervisor alignment (in sync / stale / unknown).
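The three-state sync status (in sync / stale / unknown) reduces to a small comparison if both sides expose a comparable generation counter. That premise is an assumption of this sketch: the SyncStatus enum, the skills_sync_status function, and the idea that a missing value maps to "unknown" are illustrative and not confirmed by the spec.

```python
from enum import Enum
from typing import Optional


class SyncStatus(str, Enum):
    IN_SYNC = "in_sync"
    STALE = "stale"
    UNKNOWN = "unknown"


def skills_sync_status(
    catalog_generation: Optional[int],
    supervisor_generation: Optional[int],
) -> SyncStatus:
    """Compare the catalog's generation with the supervisor's loaded snapshot.

    None means that side could not be reached or has not reported yet
    (hypothetical convention for this example).
    """
    if catalog_generation is None or supervisor_generation is None:
        return SyncStatus.UNKNOWN
    if catalog_generation == supervisor_generation:
        return SyncStatus.IN_SYNC
    return SyncStatus.STALE
```

Treating any mismatch as stale keeps the gateway display conservative: it never reports alignment it cannot verify.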
Session 2026-03-27
- Q: How should we attribute the integrated skills scanner? → A: Skill Scanner is provided by Cisco AI Defense; credit and https://github.com/cisco-ai-defense/skill-scanner MUST appear in docs, NOTICE/third-party credits as applicable, and admin UI that names the scanner (FR-023, SC-009, Key entity Skill scan finding).
Session 2026-03-28
- Q: Should the skill scanner run when a user saves/publishes an agent-skills document with skill_content? → A: Yes (FR-027, SC-011). The scanner runs synchronously (blocking the save response). The document is always persisted but marked with scan_status: passed, flagged, or unscanned. Under SKILL_SCANNER_GATE=strict, flagged documents are excluded from the merged catalog (agent_skills loader filters them out). Findings are persisted to skill_scan_findings with source_type: "agent_skills". The UI receives scan_status in the save response.
- Q: Why not reject the save entirely on scan failure? → A: Persist-but-flag is preferred over reject because (1) user work is never lost, (2) admins can review and remediate, (3) the catalog exclusion under strict gate provides the same security boundary as rejection without the UX cost of data loss.
- Q: Why synchronous instead of async? → A: The user wants immediate feedback on whether their skill content passed scanning. Async would require polling or notifications and a more complex UI state machine. The scanner subprocess typically completes in seconds for a single skill.
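The persist-but-flag flow and the strict-gate filter described in this session can be sketched as two small functions. This is a hedged illustration, not the real route or loader: save_agent_skill, load_agent_skills, the in-memory store dict (standing in for MongoDB), and the choice to record "unscanned" when the scanner raises are all assumptions made for the example. Any gate value other than "strict" is treated as permissive here, which is also an assumption.

```python
from typing import Callable


def save_agent_skill(
    doc: dict,
    scan: Callable[[str], str],  # returns "passed" or "flagged" (hypothetical scanner hook)
    store: dict,                 # in-memory stand-in for the MongoDB collection
) -> dict:
    """Scan synchronously, then persist regardless of the outcome."""
    try:
        status = scan(doc["skill_content"])
    except Exception:
        # Assumption: an unreachable scanner degrades to "unscanned"
        # instead of failing the save, so user work is never lost.
        status = "unscanned"
    saved = {**doc, "scan_status": status}
    store[saved["name"]] = saved  # always persisted, even when flagged
    return {"name": saved["name"], "scan_status": status}


def load_agent_skills(store: dict, gate: str) -> list[dict]:
    """Loader-side filter: under the strict gate, flagged documents are dropped."""
    docs = list(store.values())
    if gate == "strict":
        return [d for d in docs if d.get("scan_status") != "flagged"]
    return docs
```

Because exclusion happens in the loader rather than at save time, remediating and re-scanning a flagged document is enough to bring it back into the merged catalog.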
Session 2026-03-24 (second)
- Q: What user-facing label should replace "Agent config" for user-created skills? → A: "Custom". The label "Agent config" is internal jargon and MUST NOT appear in user-facing UI copy. The source label triad is: Custom (source: agent_skills), Built-in (source: default), Skill hub (source: hub). Updated FR-021, Key entity: Skill.
- Q: Why is the skill scanner not running on agent-skills save? → A: The scanSkillContent function in ui/src/app/api/agent-skills/route.ts requires BACKEND_SKILLS_URL to be set. When absent, it silently returns "unscanned". The fix is to configure BACKEND_SKILLS_URL in .env.local (gitignored, developer-specific) and document it in .env.example. Updated FR-027.
- Q: How should SKILL.md include ancillary files like scripts/ folders? → A: Per the agentskills.io specification, a skill is a directory with SKILL.md as the only required file plus optional scripts/, references/, assets/, and other directories. Hub skills: fetch the full directory tree (no size limit). Agent-skills (source: agent_skills) entries: support ancillary files via file drop zone (upload) and GitHub repo link (fetch-and-snapshot at save time, not a live reference). Storage: ancillary_files map in MongoDB with 5 MB soft limit and UI warning. Added FR-028, SC-012, SC-013, updated Key entity: Skill, Assumptions, Edge Cases.
- Q: For agent-skills ancillary files, should the GitHub repo link be a live reference or snapshot? → A: Fetch-and-snapshot at save time. The repo link is an import convenience; files are stored inline in MongoDB. Users re-import to update. This keeps agent-skills documents self-contained and avoids runtime GitHub API dependencies per-user.
- Q: Should agent-skills ancillary files have a size limit given MongoDB's 16 MB document limit? → A: Yes: 5 MB soft limit with UI warning. Suggest using a hub for larger skills. MongoDB's 16 MB hard limit is the backstop.
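The soft-limit and backstop logic for ancillary files can be sketched as a pre-save check. The names check_ancillary_files and ancillary_size_bytes, the "ok" / "warn" / "reject" return values, and the reading of "5 MB" as 5 * 1024 * 1024 bytes are assumptions for illustration. The check counts only UTF-8 content bytes, so it approximates the stored document size and ignores BSON overhead and the skill body itself.

```python
SOFT_LIMIT_BYTES = 5 * 1024 * 1024   # assumed interpretation of the 5 MB soft cap
HARD_LIMIT_BYTES = 16 * 1024 * 1024  # MongoDB's maximum BSON document size


def ancillary_size_bytes(files: dict[str, str]) -> int:
    """Total UTF-8 size of file contents in an ancillary_files-style map."""
    return sum(len(content.encode("utf-8")) for content in files.values())


def check_ancillary_files(files: dict[str, str]) -> str:
    """Classify a pending save: 'ok', 'warn' (soft cap), or 'reject' (backstop)."""
    total = ancillary_size_bytes(files)
    if total > HARD_LIMIT_BYTES:
        return "reject"  # would not fit in a single MongoDB document
    if total > SOFT_LIMIT_BYTES:
        return "warn"    # UI warns and can suggest using a hub instead
    return "ok"
```

Measuring encoded bytes rather than string length matters for non-ASCII content, where a character can occupy several bytes.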