Skills and hooks

Two mutable primitives extend the CLI agent: skills (declarative capabilities) and hooks (lifecycle signals). Both are subject to the GEPA tournament.

Skills: declarative capabilities

A skill is a reusable capability that a pod can invoke like a function. But unlike a tool or an MCP server, a skill is defined by a markdown file with frontmatter:

---
name: dataviz
description: |
  Decision skill for choosing the right chart type AND the appropriate
  data-science treatment given a dataset and analytical intent.
trigger: "user requests a chart, plot, dashboard, or data visualization"
---

# How this skill decides

[markdown body of the skill...]

The pod loads the list of available skills at each session start. When the semantic trigger matches the user request, the skill is proposed to the pod, which can invoke it via Skill(skill="dataviz", args=...).
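
To make the file format concrete, here is a minimal sketch of how the frontmatter could be separated from the markdown body. It assumes only the constructs visible in the example above (flat `key: value` pairs and `|` block scalars). `parse_skill` is a hypothetical helper, not the pod's actual loader, which would more likely rely on a full YAML parser.

```python
def parse_skill(text: str) -> tuple[dict[str, str], str]:
    """Split a SKILL.md document into (frontmatter, body).

    Hypothetical helper: handles only flat `key: value` pairs and
    `key: |` block scalars, which is all the example above uses.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening '---'")
    try:
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        raise ValueError("missing closing '---'") from None

    meta: dict[str, str] = {}
    key = None  # key of the block scalar currently being collected
    for line in lines[1:end]:
        if key is not None and (line.startswith(" ") or not line.strip()):
            # Indented or blank line: continuation of the block scalar.
            meta[key] = (meta[key] + "\n" + line.strip()).strip()
            continue
        key = None
        if not line.strip():
            continue
        name, _, value = line.partition(":")
        value = value.strip()
        if value == "|":
            key = name.strip()
            meta[key] = ""
        else:
            meta[name.strip()] = value.strip('"')
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


SAMPLE = '''---
name: dataviz
description: |
  Decision skill for choosing the right chart type AND the appropriate
  data-science treatment given a dataset and analytical intent.
trigger: "user requests a chart, plot, dashboard, or data visualization"
---

# How this skill decides
'''

meta, body = parse_skill(SAMPLE)
```

The `trigger` value is kept as free text here; how the pod scores a semantic match against the user request is not covered by this sketch.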

Why skills rather than prompts

Composable: a skill can call another skill.
Versioned: each skill lives in $NIKA_CONFIG/skills/{name}/SKILL.md (path resolved at install, independent of the CLI).
Semantically triggerable: no exact string match is needed.
Mutable by GEPA: the prompt harness evolves under the tournament.
Observable: every invocation is traced.

The GEPA tournament

GEPA (Genetic-Pareto, 2025 academic research) is the algorithm we use to evolve the prompt formulation of each skill. The principle:

From a v0 skill, generate several variants by mutation (rephrasings, added examples, removal of redundancies).
Evaluate each variant on a set of memorized example cases.
Multi-objective selection: output quality, latency, token cost, retry rate.
The best variant becomes v1 and replaces the old one. The others are dropped.
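
The multi-objective selection step can be illustrated with a Pareto filter. This is a simplified sketch, not the GEPA algorithm itself: the `Variant` fields and all scores are invented for illustration, and the text above does not say how a single winner is picked among non-dominated variants, so the sketch stops at the Pareto front.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str
    quality: float     # higher is better
    latency_s: float   # lower is better
    tokens: int        # lower is better
    retry_rate: float  # lower is better

    def costs(self) -> tuple[float, ...]:
        # Express every objective as "lower is better".
        return (-self.quality, self.latency_s, float(self.tokens), self.retry_rate)


def dominates(a: Variant, b: Variant) -> bool:
    """True if a is no worse than b everywhere and strictly better somewhere."""
    ca, cb = a.costs(), b.costs()
    return all(x <= y for x, y in zip(ca, cb)) and any(x < y for x, y in zip(ca, cb))


def pareto_front(variants: list[Variant]) -> list[Variant]:
    """Keep the variants that no other variant dominates."""
    return [v for v in variants if not any(dominates(o, v) for o in variants)]


# Made-up scores for a v0 skill and three mutations of it.
candidates = [
    Variant("v0", quality=0.80, latency_s=2.0, tokens=1200, retry_rate=0.10),
    Variant("rephrased", quality=0.85, latency_s=1.8, tokens=1100, retry_rate=0.08),
    Variant("more-examples", quality=0.90, latency_s=2.5, tokens=1900, retry_rate=0.05),
    Variant("trimmed", quality=0.78, latency_s=2.1, tokens=1300, retry_rate=0.12),
]
front = pareto_front(candidates)
# "rephrased" and "more-examples" survive: each beats the other on some objective.
```

With these numbers, `v0` and `trimmed` are dominated by `rephrased` and drop out, while the two survivors trade quality against latency and token cost.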

GEPA never mutates the kernel, only the harness. This separation guarantees that security invariants, output contracts, and business policies stay fixed.

Hooks: POSIX-like signals for the CLI agent

Hooks are Python scripts attached to lifecycle events. The pod fires the hook; the hook reads stdin, writes to stdout/stderr, and returns an exit code that the pod honors.

The six active events

Event	When	Example hooks
`SessionStart`	Pod startup	`on_session_start.py`: boot, RAG context, IPC consumer groups
`UserPromptSubmit`	User submits a prompt	`on_user_prompt.py`: 3-stage router (normalize, split, route)
`PreToolUse`	Before each tool call	`dep_guard.py`, `task_redirect.py`, `pod_primitives_guard.py`
`PostToolUse`	After each tool call	`context_checkpoint.py`, `redis_telemetry.py`, `heartbeat_repair.py`
`SubagentStop`	When a child pod terminates	`on_subagent_stop.py`: entity update, IPC publish, review request
`Stop`	When the pod terminates	`on_stop.py`: session summary, karma scoring

Exit codes and their semantics

exit 0 — success, no action on the pod side. The tool call continues.
exit 1 — error, the hook failed. Pod logs and continues (fail-open).
exit 2 — blocking. The pod honors the hook and cancels the action. Used by dep_guard.py (block torch install without CPU index) and task_redirect.py (block TaskCreate → redirect to hierarchy entities).
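
The three exit codes can be sketched from the hook's side as follows. This is an illustration in the spirit of dep_guard.py, not its actual source. The payload shape (`tool_input.command`) and the check for the `/whl/cpu` index URL are assumptions made for the example.

```python
import json
import sys

EXIT_OK, EXIT_ERROR, EXIT_BLOCK = 0, 1, 2


def decide(payload: dict) -> tuple[int, str]:
    """Return (exit_code, message) for one PreToolUse payload."""
    # ASSUMPTION: the shell command is found under tool_input.command.
    command = payload.get("tool_input", {}).get("command", "")
    words = command.split()
    installs_torch = "install" in words and any(
        w.split("==")[0] == "torch" for w in words
    )
    # ASSUMPTION: a crude heuristic for "uses a CPU index".
    uses_cpu_index = "--index-url" in words and "/whl/cpu" in command
    if installs_torch and not uses_cpu_index:
        return EXIT_BLOCK, "torch install without a CPU index"
    return EXIT_OK, ""


def main() -> int:
    try:
        code, message = decide(json.load(sys.stdin))
    except Exception as exc:
        # A broken hook reports exit 1; the pod logs it and continues.
        print(f"hook error: {exc}", file=sys.stderr)
        return EXIT_ERROR
    if message:
        print(message, file=sys.stderr)
    return code

# A real hook script would end with: sys.exit(main())
```

Keeping the decision in a pure function such as `decide` makes the rule testable without spawning a process; `main` only adapts it to stdin and the exit code.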

Why hooks rather than middleware

Three reasons:

Language-independent — a hook in Python can be rewritten in Rust or Go without touching the pod. It is just a binary that reads stdin and writes stdout.
Composable — multiple hooks can chain on the same event.
Observable — each hook logs in logs/{hook_name}.log. The operator can tail -f any hook to see what it is doing.
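
Seen from the pod's side, chaining hooks on one event could look like the sketch below. `run_hooks` is a hypothetical name and the pod's real dispatcher is not shown in this document. Treating a timeout like exit 1 (log and continue) is also an assumption.

```python
import json
import subprocess
import sys


def run_hooks(commands: list[list[str]], event: dict, timeout_s: float = 5.0) -> bool:
    """Run each hook command in order; return False as soon as one blocks (exit 2)."""
    payload = json.dumps(event)
    for cmd in commands:
        try:
            proc = subprocess.run(
                cmd, input=payload, capture_output=True, text=True, timeout=timeout_s
            )
        except subprocess.TimeoutExpired:
            # ASSUMPTION: a hung hook is treated as fail-open.
            print(f"{cmd[0]}: timed out, continuing", file=sys.stderr)
            continue
        if proc.returncode == 2:
            return False  # blocking: the pod cancels the action
        if proc.returncode != 0:
            print(f"{cmd[0]}: exit {proc.returncode}, continuing", file=sys.stderr)
    return True


# Stand-in hooks written inline so the example is self-contained.
allow = [sys.executable, "-c", "import sys; sys.stdin.read(); sys.exit(0)"]
fail = [sys.executable, "-c", "import sys; sys.stdin.read(); sys.exit(1)"]
block = [sys.executable, "-c", "import sys; sys.stdin.read(); sys.exit(2)"]

proceeds = run_hooks([allow, fail], {"event": "PreToolUse"})
cancelled = not run_hooks([allow, block, allow], {"event": "PreToolUse"})
```

Because each hook is just a command line, any entry in `commands` could be replaced by a compiled executable without changing the runner.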

Shared utilities

Hooks share a utility library to avoid duplication:

scripts/hooks/hook_base.py
├── read_hook_input()      — parse stdin JSON
├── emit(data)             — write stdout JSON
├── write_bus(message)     — append to the JSONL bus
├── load_state(name)       — read $NIKA_CONFIG/cache/{name}_state.json
├── save_state(name, data) — atomic tmp-rename
├── setup_logging(name)    — handler to logs/{name}.log
└── check_cooldown(name)   — prevent hooks firing too often

scripts/hooks/rag_utils.py
├── search_qdrant(query, k, filter)
└── ingest_to_qdrant(text, meta)

scripts/hooks/ipc.py
├── publish_result(...)
├── publish_signal(...)
├── publish_entity_event(...)
├── consume_entity_events(group, consumer)
├── ack_entity_event(...)
└── ensure_consumer_groups()

Factoring out these utilities keeps each hook readable and under 100 lines. If a hook exceeds that, it is doing too much and must be split or have its shared logic moved into the library.
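
As an illustration of the state helpers listed above, here is a minimal sketch of an atomic tmp-rename write and a cooldown check built on it. It is not the code in hook_base.py: the extra `cache_dir` parameter, the `last_run` key, and the default cooldown are assumptions added so the example runs on its own.

```python
import json
import os
import tempfile
import time
from pathlib import Path


def state_path(name: str, cache_dir: Path) -> Path:
    return cache_dir / f"{name}_state.json"


def load_state(name: str, cache_dir: Path) -> dict:
    try:
        return json.loads(state_path(name, cache_dir).read_text())
    except FileNotFoundError:
        return {}


def save_state(name: str, data: dict, cache_dir: Path) -> None:
    """Write to a temp file in the same directory, then rename over the target."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=cache_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f)
        # os.replace is atomic when source and target share a filesystem.
        os.replace(tmp, state_path(name, cache_dir))
    except BaseException:
        os.unlink(tmp)
        raise


def check_cooldown(name: str, cache_dir: Path, seconds: float = 30.0) -> bool:
    """Return True if the hook may run now, and record the run time."""
    state = load_state(name, cache_dir)
    now = time.time()
    if now - state.get("last_run", 0.0) < seconds:
        return False
    state["last_run"] = now
    save_state(name, state, cache_dir)
    return True
```

Writing the temp file next to the target matters: a rename across filesystems is not atomic, so a reader could otherwise observe a partially written state file.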

Guardrails on hooks

No blocking I/O — a hook that hangs blocks the pod. Strict timeout.
Local logs only — do not write to Qdrant from a critical hook; use the JSONL bus (asynchronous by design).
Idempotence — a hook may be triggered twice on the same event in case of retry. It must be a no-op on the second invocation.
No surprising side-effects — a PostToolUse hook does not send emails or open PRs. It observes and logs.
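
The idempotence guardrail can be sketched as a deduplication check keyed on the event. This assumes each event carries a unique `event_id`, which this document does not specify; `handle_once` and the `seen_file` layout are hypothetical.

```python
import json
from pathlib import Path
from typing import Callable


def handle_once(event: dict, seen_file: Path, action: Callable[[dict], None]) -> bool:
    """Run action(event) unless this event id was already processed."""
    # ASSUMPTION: every event has a unique "event_id" field.
    seen = set(json.loads(seen_file.read_text())) if seen_file.exists() else set()
    if event["event_id"] in seen:
        return False  # retry of an event already handled: no-op
    action(event)
    seen.add(event["event_id"])
    seen_file.write_text(json.dumps(sorted(seen)))
    return True
```

In a real hook the set of seen ids would need pruning and an atomic write; both are omitted here to keep the sketch short.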