Multi-kernel federation
Beyond a single tenant, one kernel is no longer enough. Federation exposes N Nika OS kernels, one per /goal, routed via a semantic RAG triage and federated through Redis pub/sub.
Why a single kernel is not enough
A well-tuned Nika OS kernel can absorb sustained usage for several months. But once past the single-tenant threshold — several distinct business perimeters, several teams, several sets of doctrines — the intrinsic limits of a single instance appear:
- Context window saturation — even with handover at 41%, a kernel shared between R&D, customer operations, and public procurement monitoring accumulates too much heterogeneous signal. RAG compression cannot keep up.
- Doctrine conflicts — two perimeters can require contradictory invariants (e.g. tolerance for external writes in R&D vs. systematic validation on a customer engagement). A single kernel policy cannot honor both at once.
- Credential isolation — a BCUB3-IT kernel and a customer Pulsa kernel must not be able to read each other’s credentials, even accidentally.
- Decoupled mutation cycle — GEPA mutations useful to one perimeter are not necessarily useful to another. A unified tournament dilutes signals.
The architectural answer is to spawn N kernels, each with its own
/goal, its own primitives, and its own isolated memory — then federate
them through a shared layer.
The pattern: one kernel per /goal
Each kernel is an isolated Nika OS instance:
| Field | Kernel-specific content |
|---|---|
| /goal | Distinct strategic mission (short string) |
| Pod space | tmux nika-os-{kernel_id} or dedicated K3s namespace |
| Memory MD | Separate ~/.claude/projects/{kernel_id}/ directory |
| settings.json | Tailored skills, hooks, and permissions |
| Local autonomy | Internal decisions without consulting other kernels |
A few typical /goals:
| Kernel | Perimeter |
|---|---|
| kernel-alpha-bcub3-it | BCUB3 IT operator (daily reference) |
| kernel-beta-bcub3-lab | BCUB3 Lab R&D: KTW, sWELU patents, benchmarks |
| kernel-gamma-client-mission | Customer engagements: mails, reports, deliverables |
| kernel-delta-iot-edge | Embedded controller + sensor data |
| kernel-epsilon-tender-watch | Public procurement watch and sectoral benchmarks |
Each kernel has its own tmux session, its own context, and its own tailored skills. None has direct access to another’s context; they communicate only via the federation layer.
Semantic RAG triage
A user prompt arriving at the federation must be routed to the right kernel. That is the role of the triage layer:
- The prompt is embedded (bge-small or equivalent model, 384 dimensions).
- A cosine search is run on the nika_federation_goals collection (1 chunk = 1 kernel /goal + its semantic tags).
- The top-1 is selected if its score exceeds a confidence threshold.
- Otherwise, the prompt is routed to the default generalist kernel, which can request clarification or create a new kernel via the kernel_spawn primitive.
Triage is fast (< 100 ms) and deterministic on non-ambiguous prompts. For hybrid prompts (e.g. “generate an ISO technical drawing for customer X”), the triage can split the prompt into several sub-prompts, each addressed to a different kernel.
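The routing rule above can be sketched in a few lines. This is a minimal illustration, not the actual triage code: the `GoalEntry` type, the `route` function, the `0.75` threshold, and the `kernel-default` name are all assumptions made for the example, and the toy vectors stand in for real 384-dimensional embeddings.

```python
import math
from dataclasses import dataclass


@dataclass
class GoalEntry:
    kernel_id: str
    vector: list  # embedding of the /goal text plus its semantic tags


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def route(prompt_vec, goals, threshold=0.75, default="kernel-default"):
    """Return the top-1 kernel if it clears the threshold, else the default.

    The threshold value and the default kernel name are hypothetical.
    """
    best_id, best_score = default, -1.0
    for goal in goals:
        score = cosine(prompt_vec, goal.vector)
        if score > best_score:
            best_id, best_score = goal.kernel_id, score
    return best_id if best_score > threshold else default


goals = [
    GoalEntry("kernel-alpha-bcub3-it", [1.0, 0.0, 0.0]),
    GoalEntry("kernel-beta-bcub3-lab", [0.0, 1.0, 0.0]),
]
clear_match = route([0.9, 0.1, 0.0], goals)   # close to the alpha /goal
ambiguous = route([1.0, 1.0, 0.0], goals)     # halfway between two /goals
```

An ambiguous prompt falls below the threshold for every kernel and lands on the generalist, which is where clarification or a `kernel_spawn` request would happen.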
Federation via Redis pub/sub
Kernels exchange weak signals through a shared Redis bus:
| Channel | Direction | Usage |
|---|---|---|
| nika:federation:directives | any kernel | Signal kernel A → kernel B: here is an event that concerns you |
| nika:federation:work_stealing | any kernel | Announces that a kernel has free capacity and can take pending work |
| nika:federation:health | any kernel | Heartbeat + aggregated metrics (TTFL, error_rate, KTW Y) |
| nika:federation:goals | triage | Update of the registry of active /goals |
No kernel shares its context window directly with another. Pub/sub carries only signals (entity references, task IDs, scores, alerts). To transfer content, we go through the shared RAG.
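One way to enforce the "signals only" rule is to validate every payload before it reaches the bus. The sketch below is an assumption about how that could look: the `Signal` fields, the `encode_signal` helper, and the 1 KB cap are hypothetical; only the channel names come from the table above.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional

ALLOWED_CHANNELS = {
    "nika:federation:directives",
    "nika:federation:work_stealing",
    "nika:federation:health",
    "nika:federation:goals",
}
MAX_SIGNAL_BYTES = 1024  # assumed cap that keeps bulk content off the bus


@dataclass
class Signal:
    source_kernel: str
    kind: str                      # e.g. "alert" or "task_available"
    entity_ref: str                # a reference to an entity, never its content
    task_id: Optional[str] = None
    score: Optional[float] = None


def encode_signal(channel, signal):
    """Serialize a signal to JSON, rejecting unknown channels and big payloads."""
    if channel not in ALLOWED_CHANNELS:
        raise ValueError(f"unknown federation channel: {channel}")
    payload = json.dumps(asdict(signal), separators=(",", ":"))
    if len(payload.encode("utf-8")) > MAX_SIGNAL_BYTES:
        raise ValueError("payload too large for a signal; use the shared RAG")
    return payload


sig = Signal("kernel-alpha-bcub3-it", "alert", "entity:1234", score=0.8)
wire = encode_signal("nika:federation:directives", sig)
```

With the redis-py client, the resulting string would typically be sent with `client.publish(channel, wire)`; that call is left out here so the sketch runs without a Redis server.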
Shared RAG and triage layer
The Qdrant nika_vault store is shared between all kernels, with a
kernel_id field filtered by default in every query. A kernel does
not read another’s memory by default. But the triage maintains two
complementary collections:
- nika_federation_summaries: a compact summary per kernel and per day (1 chunk / kernel / 24h). Lets a kernel understand what the others are doing without reading their transcripts.
- nika_federation_goals: the knowledge base that drives the triage. Each entry describes a kernel by its /goal + its domains + its canonical prompt examples.
Continuous triage deduplicates (cosine sim > 0.95 = drop), summarizes (frequent chunks → super-chunk), and hierarchizes (strategy → project → job → task → atomic) to keep the RAG actionable.
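The deduplication step can be illustrated with a greedy pass over the chunks. This is a sketch under assumptions: the source only states the 0.95 cosine threshold, so the greedy "first seen wins" policy and the `deduplicate` helper are illustrative choices, and the two-dimensional vectors stand in for real embeddings.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def deduplicate(chunks, threshold=0.95):
    """Keep a chunk only if no already-kept chunk is more similar than threshold.

    `chunks` is a list of (chunk_id, vector) pairs; the first occurrence wins.
    """
    kept = []
    for chunk_id, vec in chunks:
        if all(cosine(vec, kept_vec) <= threshold for _, kept_vec in kept):
            kept.append((chunk_id, vec))
    return kept


chunks = [
    ("a", [1.0, 0.0]),
    ("b", [0.999, 0.01]),  # near-duplicate of "a"
    ("c", [0.0, 1.0]),
]
kept_ids = [chunk_id for chunk_id, _ in deduplicate(chunks)]
```

The pairwise comparison is quadratic in the number of kept chunks; at real collection sizes the same check would more likely be delegated to a vector-store nearest-neighbour query.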
Work-stealing between kernels
When a kernel A is silent (no current prompt, empty queue), it can
subscribe to nika:federation:work_stealing and offer its resources:
- Kernel B publishes a task_available signal with a priority score.
- Kernel A (free capacity) replies task_claim with its aptitude score (computed on the similarity of its /goal with the task).
- If A is the only candidate and its similarity exceeds a threshold, it takes the task.
- Otherwise, B keeps the task in its own queue.
This pattern absorbs load peaks without having to pre-allocate capacity on every kernel.
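The claim rule described above reduces to a small decision function on the publisher's side. The sketch below assumes a hypothetical `TaskClaim` record and a `0.6` aptitude threshold; neither is specified in the text.

```python
from dataclasses import dataclass


@dataclass
class TaskClaim:
    kernel_id: str
    aptitude: float  # similarity between the claimant's /goal and the task


def resolve_claims(owner_id, claims, threshold=0.6):
    """Return the kernel that should run the task.

    The task moves only when there is exactly one claimant and its aptitude
    exceeds the threshold; in every other case the owner keeps it.
    """
    if len(claims) == 1 and claims[0].aptitude > threshold:
        return claims[0].kernel_id
    return owner_id


stolen = resolve_claims("kernel-b", [TaskClaim("kernel-a", 0.82)])
kept = resolve_claims("kernel-b", [TaskClaim("kernel-a", 0.30)])
```

Keeping the task with its owner whenever several kernels claim it is the conservative reading of the rule; a tie-break on the highest aptitude would be a different policy from the one stated.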
Architecture diagram
flowchart TB
U["User prompt"] --> T["RAG triage<br/>cosine on /goal"]
T -->|"BCUB3-IT match"| KA["Kernel Alpha<br/>BCUB3-IT"]
T -->|"BCUB3-LAB match"| KB["Kernel Beta<br/>BCUB3-LAB"]
T -->|"client match"| KC["Kernel Gamma<br/>Customer mission"]
KA <-->|"directives + signals"| BUS[("Redis pub/sub<br/>nika:federation:*")]
KB <-->|"directives + signals"| BUS
KC <-->|"directives + signals"| BUS
KA -->|"scoped read/write"| RAG[("Qdrant nika_vault<br/>+ federation_summaries")]
KB -->|"scoped read/write"| RAG
KC -->|"scoped read/write"| RAG
T --> RAG
classDef prompt fill:#F5F1E8,color:#2C3E42,stroke:#7DB5A5,stroke-width:2px;
classDef triage fill:#E99971,color:#FDFBF8,stroke:#C97A55,stroke-width:2px;
classDef kernel fill:#7DB5A5,color:#FDFBF8,stroke:#5E9384,stroke-width:2px;
classDef bus fill:#2C3E42,color:#F5F1E8,stroke:#1A262A;
classDef store fill:#F5F1E8,color:#2C3E42,stroke:#A86640;
class U prompt;
class T triage;
class KA,KB,KC kernel;
class BUS bus;
class RAG store;
Solid arrows carry content (prompts, embeddings, vectors). Dotted arrows carry weak signals (alerts, scores, references).
Auto-handover via session_registry
When a kernel reaches its context threshold (41% by default), it triggers an automatic handover to a successor kernel:
- The current kernel serializes its essential state into nika_federation_summaries (summary) + session_registry (table of active sessions ↔ entities).
- A successor kernel is spawned with an identical /goal and the parent entity injected at boot.
- The new kernel’s session_id is linked to the root entity_id via session_registry, which lets child pods find their lineage without reading transcripts.
This mechanism is described in more detail in the session ↔ hierarchy doctrine — the registry is the source of truth.
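To make the registry's role concrete, here is a minimal in-memory sketch of the handover bookkeeping. The `SessionRecord` fields, the `SessionRegistry` class, and the `should_handover` helper are assumptions for illustration; the real session_registry schema is not described in this section.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SessionRecord:
    session_id: str
    kernel_id: str
    goal: str
    root_entity_id: str
    parent_session_id: Optional[str] = None


def should_handover(used_tokens, window_tokens, threshold=0.41):
    """True once context usage reaches the handover threshold."""
    return used_tokens / window_tokens >= threshold


class SessionRegistry:
    """Hypothetical in-memory stand-in for the session_registry table."""

    def __init__(self):
        self._records = {}

    def register(self, record):
        self._records[record.session_id] = record

    def handover(self, current_session_id, new_session_id, new_kernel_id):
        """Create a successor with the same /goal and root entity."""
        parent = self._records[current_session_id]
        successor = SessionRecord(
            session_id=new_session_id,
            kernel_id=new_kernel_id,
            goal=parent.goal,
            root_entity_id=parent.root_entity_id,
            parent_session_id=current_session_id,
        )
        self.register(successor)
        return successor

    def lineage(self, session_id):
        """Walk parent links from a session back to the first one."""
        chain = []
        current = session_id
        while current is not None:
            chain.append(current)
            current = self._records[current].parent_session_id
        return chain


registry = SessionRegistry()
registry.register(SessionRecord("s1", "kernel-1", "BCUB3 IT operator", "entity:root"))
successor = registry.handover("s1", "s2", "kernel-2")
```

Lineage is resolved purely from registry rows, which matches the point above: a child can find its ancestors without opening any transcript.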
Conditions to spawn a new kernel
Four algorithmic triggers can cause the creation of an additional kernel:
| Trigger | Measurement | Default threshold |
|---|---|---|
| Context saturation | Average % context used over 24h | > 70% |
| Topic diversity | Number of disjoint clusters in the kernel’s RAG | > 5 |
| Doctrine conflict | Detection of two mutually contradictory active doctrines | binary |
| Explicit request | User command /kernel new <goal> | — |
None of these triggers spawns a kernel without explicit human validation — the cost (RAM, embeddings, mutation tournaments) is high and deserves arbitration.
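The trigger table and the human-validation gate can be expressed together. In the sketch below the thresholds are the defaults from the table, while the `KernelMetrics` fields and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class KernelMetrics:
    avg_context_pct_24h: float      # average % of context used over 24h
    disjoint_topic_clusters: int    # disjoint clusters in the kernel's RAG
    doctrine_conflict: bool         # two contradictory active doctrines detected
    explicit_request: bool = False  # user ran /kernel new <goal>


def spawn_triggers(metrics):
    """Return the names of the triggers that fired, using the default thresholds."""
    fired = []
    if metrics.avg_context_pct_24h > 70:
        fired.append("context_saturation")
    if metrics.disjoint_topic_clusters > 5:
        fired.append("topic_diversity")
    if metrics.doctrine_conflict:
        fired.append("doctrine_conflict")
    if metrics.explicit_request:
        fired.append("explicit_request")
    return fired


def should_spawn(metrics, human_approved):
    """A trigger is necessary but never sufficient: a human must approve."""
    return bool(spawn_triggers(metrics)) and human_approved


saturated = KernelMetrics(82.0, 3, False)
fired = spawn_triggers(saturated)
```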
Lifecycle: five verbs
Like a pod, a kernel follows a lifecycle of five verbs:
spawn → kernel_spawn(goal, scope, resource_limits)
invoke → kernel_invoke(kernel_id, prompt)
readjust → kernel_readjust(kernel_id, new_directive)
observe → kernel_observe(kernel_id, since, query?)
kill → kernel_kill(kernel_id, reason)
readjust lets us inject a directive mid-session without killing or
respawning the kernel — the equivalent of a patch on the context. The
verb is useful when the business scope drifts slightly without
changing the /goal.
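The semantics of the five verbs, and of readjust in particular, can be shown with an in-memory stand-in. Only the verb names and their parameter lists come from the listing above; the `Federation` class, the generated kernel IDs, the event log, and the reading of `since` as an event index are assumptions made for this sketch.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class Kernel:
    kernel_id: str
    goal: str
    scope: str
    resource_limits: dict
    directives: list = field(default_factory=list)
    log: list = field(default_factory=list)
    alive: bool = True


class Federation:
    """Hypothetical in-memory model of the five lifecycle verbs."""

    def __init__(self):
        self._kernels = {}
        self._ids = itertools.count(1)

    def _live(self, kernel_id):
        kernel = self._kernels[kernel_id]
        if not kernel.alive:
            raise RuntimeError(f"{kernel_id} has been killed")
        return kernel

    def kernel_spawn(self, goal, scope, resource_limits):
        kernel_id = f"kernel-{next(self._ids)}"
        self._kernels[kernel_id] = Kernel(kernel_id, goal, scope, resource_limits)
        return kernel_id

    def kernel_invoke(self, kernel_id, prompt):
        self._live(kernel_id).log.append(("invoke", prompt))

    def kernel_readjust(self, kernel_id, new_directive):
        # Patch the running kernel: same kernel_id, same /goal, one more directive.
        kernel = self._live(kernel_id)
        kernel.directives.append(new_directive)
        kernel.log.append(("readjust", new_directive))

    def kernel_observe(self, kernel_id, since=0, query=None):
        events = self._kernels[kernel_id].log[since:]
        if query is not None:
            events = [event for event in events if query in event[1]]
        return events

    def kernel_kill(self, kernel_id, reason):
        kernel = self._live(kernel_id)
        kernel.alive = False
        kernel.log.append(("kill", reason))


fed = Federation()
kid = fed.kernel_spawn("Customer engagements", "client-x", {"ram_mb": 1024})
fed.kernel_invoke(kid, "draft the weekly report")
fed.kernel_readjust(kid, "validate every external write")
```

After the readjust call the kernel keeps its ID and its /goal; only its list of directives has grown, which is the difference from a kill followed by a respawn.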
When to orchestrate several kernels in parallel
Three canonical use cases:
- Load burst: massive ingestion (hundreds of PDFs, videos, data sets). Spawn a temporary worker kernel for the duration of the burst, then decommission it.
- Doctrine diversification: an R&D kernel can tolerate aggressive exploration, while a customer kernel requires systematic validation. Separating them lets us optimize each for its own perimeter.
- Data region: one kernel hosted in France for sensitive customer data, another in a neutral zone for public watch. The triage routes according to the nature of the prompt.
Cost and bottlenecks
Each kernel has a fixed cost (RAM ~512 MB – 2 GB depending on loaded skills, CPU during embedding bursts) and a variable cost (API or GPU tokens per decision). Observed bottlenecks on a shared VPS:
- RAM: primary limit. A 29 GB VPS fits 14 to 58 parallel kernels, depending on the per-kernel envelope (2 GB down to 512 MB).
- CPU: mainly on Qdrant bursts and heavy Python hooks.
- Network: provider rate limits (Anthropic, OpenRouter, Mistral).
- Disk: Qdrant growth. The 24h TTL retention on nika:pod_streams mitigates this.
K3s pod or namespace isolation lets us enforce hard limits (Linux
cgroups) and avoid one kernel saturating the whole machine.
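The 14 to 58 range follows directly from dividing the machine's RAM by the per-kernel envelope. The helper below is a back-of-the-envelope sketch (the function name is hypothetical) and deliberately ignores the memory taken by the OS, Redis, and Qdrant, so real headroom would be lower.

```python
def max_parallel_kernels(total_ram_gb, per_kernel_gb):
    """Upper bound on kernels that fit in RAM, ignoring all other processes."""
    return int(total_ram_gb // per_kernel_gb)


heavy = max_parallel_kernels(29, 2.0)   # 2 GB envelope   -> 14 kernels
light = max_parallel_kernels(29, 0.5)   # 512 MB envelope -> 58 kernels
```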
See also
- Kernel and pods: a pod’s spawn contract also applies to a kernel.
- IPC and bus: Redis Streams, consumer groups, JSONL.
- Observability and controllers: cross-kernel aggregated metrics via nika:federation:health.
- Doctrines: antifragility, kernel/pod split, multi-CLI agnosticism all apply to federation.