Multi-kernel federation

Beyond a single tenant, one kernel is no longer enough. Federation runs N Nika OS kernels, one per /goal, with incoming prompts routed by a semantic RAG triage and the kernels coordinated over Redis pub/sub.

Why a single kernel is not enough

A well-tuned Nika OS kernel can absorb sustained usage for several months. But once past the single-tenant threshold — several distinct business perimeters, several teams, several sets of doctrines — the intrinsic limits of a single instance appear:

Context window saturation — even with handover at 41%, a kernel shared between R&D, customer operations, and public procurement monitoring accumulates too much heterogeneous signal. RAG compression cannot keep up.
Doctrine conflicts — two perimeters can require contradictory invariants (e.g. tolerance for external writes in R&D vs. systematic validation on a customer engagement). A single kernel policy cannot honor both at once.
Credential isolation — a kernel-it-ops and a kernel-client-missions must not be able to read each other’s credentials, even accidentally.
Decoupled mutation cycle — GEPA mutations useful to one perimeter are not necessarily useful to another. A unified tournament dilutes signals.

The architectural answer is to spawn N kernels, each with its own /goal, its own primitives, and its own isolated memory — then federate them through a shared layer.

The pattern: one kernel per /goal

Each kernel is an isolated Nika OS instance:

Field	Kernel-specific content
`/goal`	Distinct strategic mission (short string)
Pod space	tmux session `nika-os-{kernel_id}` or dedicated K3s namespace
Memory MD	Separate memory directory `$NIKA_CONFIG/projects/{kernel_id}/`
`settings.json`	Tailored skills, hooks, and permissions
Local autonomy	Internal decisions made without consulting other kernels

A few typical /goals:

Kernel	Perimeter
`kernel-it-ops`	IT operations (daily reference)
`kernel-rnd-lab`	R&D and continual improvement — benchmarks
`kernel-client-missions`	Customer engagements: mails, reports, deliverables
`kernel-iot-edge`	Embedded controller + sensor data
`kernel-market-watch`	Market / opportunity watch and sectoral benchmarks

Each kernel has its own tmux session (or namespace), its own context, and its own tailored skills. No kernel has direct access to another’s context; kernels communicate only through the federation layer.

Semantic RAG triage

A user prompt arriving at the federation must be routed to the right kernel. That is the role of the triage layer:

The prompt is embedded (bge-small or equivalent model, 384 dimensions).
A cosine search is run on the nika_federation_goals collection (1 chunk = 1 kernel /goal + its semantic tags).
The top-1 match is selected if its score exceeds a confidence threshold.
Otherwise, the prompt is routed to the default generalist kernel, which can request clarification or create a new kernel via the kernel_spawn primitive.

Triage is fast (< 100 ms) and deterministic on unambiguous prompts. For hybrid prompts (e.g. “generate an ISO technical drawing for customer X”), the triage can split the prompt into several sub-prompts, each addressed to a different kernel.
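The routing steps above can be sketched in a few lines. This is a minimal illustration, not the actual triage implementation: toy 3-dimensional vectors stand in for real 384-dimensional embeddings, and the threshold value `0.75` and the name `kernel-generalist` are assumptions chosen for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def route(prompt_vec, goal_vecs, threshold=0.75, default="kernel-generalist"):
    """Return (kernel_id, score): the top-1 /goal if confident, else the default kernel.

    `threshold` and `default` are hypothetical values, not documented defaults.
    """
    best_id, best_score = default, -1.0
    for kernel_id, vec in goal_vecs.items():
        score = cosine(prompt_vec, vec)
        if score > best_score:
            best_id, best_score = kernel_id, score
    if best_score > threshold:
        return best_id, best_score
    return default, best_score


# Toy vectors: one axis per kernel /goal (real embeddings would come from the model).
GOALS = {
    "kernel-it-ops": [1.0, 0.0, 0.0],
    "kernel-rnd-lab": [0.0, 1.0, 0.0],
    "kernel-client-missions": [0.0, 0.0, 1.0],
}

print(route([0.9, 0.1, 0.0], GOALS)[0])  # close to the it-ops /goal
print(route([1.0, 1.0, 1.0], GOALS)[0])  # ambiguous, falls back to the default
```

A prompt that sits equally close to several /goals scores below the threshold and lands on the generalist kernel, which is where clarification or a split into sub-prompts would take place.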

Federation via Redis pub/sub

Kernels exchange weak signals through a shared Redis bus:

Channel	Publisher	Usage
`nika:federation:directives`	any kernel	Signal `kernel A → kernel B: here is an event that concerns you`
`nika:federation:work_stealing`	any kernel	Announces that a kernel has free capacity and can take pending work
`nika:federation:health`	any kernel	Heartbeat + aggregated metrics (TTFL, error_rate, quality score)
`nika:federation:goals`	triage	Update of the registry of active `/goal`s

No kernel shares its context window directly with another. Pub/sub carries only signals (entity references, task IDs, scores, alerts). To transfer content, we go through the shared RAG.
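One way to keep the bus limited to signals is to give every message a small, fixed envelope and to reject anything that looks like content. The sketch below assumes a JSON envelope; the field names (`sender`, `kind`, `refs`) and the 1 KB size cap are hypothetical, and only the channel names come from the table above.

```python
import json
import time
from dataclasses import dataclass, asdict, field

# Channel names taken from the table above.
CHANNELS = {
    "nika:federation:directives",
    "nika:federation:work_stealing",
    "nika:federation:health",
    "nika:federation:goals",
}


@dataclass
class Signal:
    channel: str
    sender: str   # kernel_id of the publisher (hypothetical field name)
    kind: str     # e.g. "directive", "task_available", "heartbeat"
    refs: dict = field(default_factory=dict)    # entity ids, task ids, scores
    ts: float = field(default_factory=time.time)


def encode(signal, max_bytes=1024):
    """Serialize a signal, refusing unknown channels and oversized payloads.

    The size cap is an assumed guard: anything larger is probably content
    and belongs in the shared RAG rather than on the bus.
    """
    if signal.channel not in CHANNELS:
        raise ValueError(f"unknown channel: {signal.channel}")
    raw = json.dumps(asdict(signal), sort_keys=True)
    if len(raw.encode("utf-8")) > max_bytes:
        raise ValueError("payload too large for a signal; store it in the shared RAG")
    return raw


msg = Signal(
    channel="nika:federation:directives",
    sender="kernel-it-ops",
    kind="directive",
    refs={"target": "kernel-rnd-lab", "entity_id": "e-42"},
)
payload = encode(msg)
```

With the redis-py client, such a payload would typically be sent with `client.publish(msg.channel, payload)`; the connection details are left out because they depend on the deployment.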

Shared RAG and triage layer

The Qdrant nika_vault store is shared between all kernels, and every query is filtered on a kernel_id field by default, so by default a kernel does not read another’s memory. The triage also maintains two complementary collections:

nika_federation_summaries — a compact summary per kernel and per day (1 chunk / kernel / 24h). Lets a kernel understand what the others are doing without reading their transcripts.
nika_federation_goals — the knowledge base that drives the triage. Each entry describes a kernel by its /goal + its domains + its canonical prompt examples.
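The default kernel_id scoping on nika_vault can be expressed as a Qdrant payload filter. The helper below only builds the filter dictionary in Qdrant's JSON filter format (`must` / `key` / `match`); the function name and the `cross_kernel` opt-out flag are assumptions made for the example, not part of a documented API.

```python
def scoped_filter(kernel_id, extra_must=None, cross_kernel=False):
    """Build a Qdrant-style filter restricted to one kernel's points.

    `cross_kernel=True` is a hypothetical explicit opt-out of the default scoping.
    """
    must = list(extra_must or [])
    if not cross_kernel:
        must.append({"key": "kernel_id", "match": {"value": kernel_id}})
    return {"must": must}


# Default: only this kernel's memory is visible to the query.
print(scoped_filter("kernel-it-ops"))
```

Building the filter in one place makes the isolation rule hard to forget: a query that skips the helper is easy to spot in review.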

Continuous triage keeps the RAG actionable in three ways: it deduplicates (chunks with cosine similarity > 0.95 are dropped), summarizes (frequent chunks are merged into a super-chunk), and organizes entries into a hierarchy (strategy → project → job → task → atomic).
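The deduplication rule is simple enough to sketch directly. This version is a greedy, quadratic pass over in-memory vectors, which is an assumption made for readability; a real store would more likely query its index for near neighbors. Only the 0.95 threshold comes from the text.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def deduplicate(chunks, threshold=0.95):
    """Keep a chunk only if it is not a near-duplicate of a chunk already kept.

    `chunks` is a list of (chunk_id, vector) pairs; earlier chunks win.
    """
    kept = []
    for chunk_id, vec in chunks:
        if all(cosine(vec, kept_vec) <= threshold for _, kept_vec in kept):
            kept.append((chunk_id, vec))
    return kept


chunks = [
    ("a", [1.0, 0.0]),
    ("b", [0.999, 0.01]),  # almost identical to "a", so it is dropped
    ("c", [0.0, 1.0]),
]
print([chunk_id for chunk_id, _ in deduplicate(chunks)])
```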

Work-stealing between kernels

When kernel A is idle (no current prompt, empty queue), it can subscribe to nika:federation:work_stealing and offer its resources:

Kernel B publishes a task_available signal with a priority score.
Kernel A (free capacity) replies task_claim with its aptitude score (computed on the similarity of its /goal with the task).
If A is the only candidate and its similarity exceeds a threshold, it takes the task.
Otherwise, B keeps the task in its own queue.

This pattern absorbs load peaks without having to pre-allocate capacity on every kernel.
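The claim rule described above can be sketched as a pure function that kernel B would run once the claims have arrived. The threshold value `0.6` is an assumption. The function follows the text literally: with zero or several candidates, B keeps the task; picking the highest aptitude among several claimants would be a different policy that the text does not describe.

```python
def resolve_claims(claims, threshold=0.6):
    """Decide who takes a task announced with `task_available`.

    `claims` maps claiming kernel_id -> aptitude score from its `task_claim`.
    Returns the winning kernel_id, or None when the publisher keeps the task.
    """
    if len(claims) != 1:
        return None  # no candidate, or more than one: the publisher keeps it
    ((kernel_id, aptitude),) = claims.items()
    return kernel_id if aptitude > threshold else None


print(resolve_claims({"kernel-rnd-lab": 0.8}))                        # single apt candidate
print(resolve_claims({"kernel-rnd-lab": 0.8, "kernel-it-ops": 0.7}))  # several candidates
```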

Architecture diagram

flowchart TB
    U["User prompt"] --> T["RAG triage<br/>cosine on /goal"]
    T -->|"it-ops match"| KA["kernel-it-ops"]
    T -->|"rnd-lab match"| KB["kernel-rnd-lab"]
    T -->|"client match"| KC["kernel-client-missions"]

    KA <-->|"directives + signals"| BUS[("Redis pub/sub<br/>nika:federation:*")]
    KB <-->|"directives + signals"| BUS
    KC <-->|"directives + signals"| BUS

    KA -->|"scoped read/write"| RAG[("Qdrant nika_vault<br/>+ federation_summaries")]
    KB -->|"scoped read/write"| RAG
    KC -->|"scoped read/write"| RAG

    T --> RAG

    classDef prompt fill:#F5F1E8,color:#2C3E42,stroke:#7DB5A5,stroke-width:2px;
    classDef triage fill:#E99971,color:#FDFBF8,stroke:#C97A55,stroke-width:2px;
    classDef kernel fill:#7DB5A5,color:#FDFBF8,stroke:#5E9384,stroke-width:2px;
    classDef bus fill:#2C3E42,color:#F5F1E8,stroke:#1A262A;
    classDef store fill:#F5F1E8,color:#2C3E42,stroke:#A86640;

    class U prompt;
    class T triage;
    class KA,KB,KC kernel;
    class BUS bus;
    class RAG store;

Arrows to and from the Redis bus carry only weak signals (alerts, scores, references). All other arrows carry content (prompts, embeddings, vectors).

Auto-handover via session_registry

When a kernel reaches its context threshold (41% by default), it triggers an automatic handover to a successor kernel:

The current kernel serializes its essential state into nika_federation_summaries (summary) + session_registry (table of active sessions ↔ entities).
A successor kernel is spawned with an identical /goal and the parent entity injected at boot.
The new kernel’s session_id is linked to the root entity_id via session_registry, which lets child pods find their lineage without reading transcripts.

This mechanism is described in more detail in the session ↔ hierarchy doctrine — the registry is the source of truth.
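As an illustration of the handover bookkeeping, the sketch below models session_registry as an in-memory table. The row layout (`session_id`, `kernel_goal`, `entity_id`, `parent_session`) and the method names are assumptions; only the 41% default and the idea of linking each session to a root entity come from the text.

```python
from dataclasses import dataclass
from typing import Optional

HANDOVER_THRESHOLD = 0.41  # default context threshold from the text


def needs_handover(tokens_used, context_window, threshold=HANDOVER_THRESHOLD):
    """True once the share of the context window in use reaches the threshold."""
    return tokens_used / context_window >= threshold


@dataclass(frozen=True)
class SessionRow:
    session_id: str
    kernel_goal: str
    entity_id: str                 # root entity the session is linked to
    parent_session: Optional[str]  # None for the first session of a lineage


class SessionRegistry:
    """Hypothetical in-memory stand-in for the session_registry table."""

    def __init__(self):
        self._rows = {}

    def register(self, row):
        self._rows[row.session_id] = row

    def handover(self, old_session_id, new_session_id):
        """Record a successor that inherits the same /goal and root entity."""
        old = self._rows[old_session_id]
        new = SessionRow(new_session_id, old.kernel_goal, old.entity_id, old_session_id)
        self.register(new)
        return new

    def lineage(self, session_id):
        """Walk parent links back to the first session, newest first."""
        chain = []
        while session_id is not None:
            chain.append(session_id)
            session_id = self._rows[session_id].parent_session
        return chain


registry = SessionRegistry()
registry.register(SessionRow("s1", "IT operations", "e-root", None))
if needs_handover(tokens_used=90_000, context_window=200_000):
    successor = registry.handover("s1", "s2")
print(registry.lineage("s2"))
```

Because lineage is resolved from the registry alone, a successor never needs to open its predecessor's transcript to know where it comes from.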

Conditions to spawn a new kernel

Four triggers can lead to the creation of an additional kernel, three measured automatically and one requested explicitly:

Trigger	Measurement	Default threshold
Context saturation	Average % context used over 24h	> 70%
Topic diversity	Number of disjoint clusters in the kernel’s RAG	> 5
Doctrine conflict	Detection of two mutually contradictory active doctrines	binary
Explicit request	User command `/kernel new <goal>`	—

None of these triggers spawns a kernel without explicit human validation — the cost (RAM, embeddings, mutation tournaments) is high and deserves arbitration.
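The trigger table and the validation rule can be combined into a small check. In this sketch the metric names and the `KernelMetrics` container are hypothetical; the thresholds (70%, 5 clusters) are the defaults from the table, and a spawn is allowed only when at least one trigger fired and a human approved it.

```python
from dataclasses import dataclass


@dataclass
class KernelMetrics:
    avg_context_used_24h: float   # fraction of the context window, 0.0 to 1.0
    disjoint_clusters: int        # disjoint topic clusters in the kernel's RAG
    doctrine_conflict: bool       # two contradictory active doctrines detected
    explicit_request: bool = False  # user issued `/kernel new <goal>`


def spawn_triggers(m):
    """Return the names of the triggers that fired for these metrics."""
    fired = []
    if m.avg_context_used_24h > 0.70:
        fired.append("context_saturation")
    if m.disjoint_clusters > 5:
        fired.append("topic_diversity")
    if m.doctrine_conflict:
        fired.append("doctrine_conflict")
    if m.explicit_request:
        fired.append("explicit_request")
    return fired


def may_spawn(m, human_approved):
    """A trigger only proposes a spawn; human validation is always required."""
    return bool(spawn_triggers(m)) and human_approved


metrics = KernelMetrics(avg_context_used_24h=0.82, disjoint_clusters=3, doctrine_conflict=False)
print(spawn_triggers(metrics), may_spawn(metrics, human_approved=False))
```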

Lifecycle: five verbs

Like pods, kernels are managed through five verbs:

spawn       → kernel_spawn(goal, scope, resource_limits)
invoke      → kernel_invoke(kernel_id, prompt)
readjust    → kernel_readjust(kernel_id, new_directive)
observe     → kernel_observe(kernel_id, since, query?)
kill        → kernel_kill(kernel_id, reason)

readjust lets us inject a directive mid-session without killing or respawning the kernel — the equivalent of a patch on the context. The verb is useful when the business scope drifts slightly without changing the /goal.
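To make the contract of the five verbs concrete, here is an in-memory stand-in that starts no real kernel. The verb names and parameters follow the listing above; everything else is an assumption for illustration, in particular the generated kernel ids, the event log, and the reading of `since` as an index into that log.

```python
import itertools


class KernelRegistry:
    """Toy model of the five lifecycle verbs; it only records what was asked."""

    def __init__(self):
        self._kernels = {}
        self._ids = itertools.count(1)

    def kernel_spawn(self, goal, scope, resource_limits):
        kernel_id = f"kernel-{next(self._ids)}"
        self._kernels[kernel_id] = {
            "goal": goal, "scope": scope, "limits": resource_limits,
            "directives": [], "events": [], "alive": True,
        }
        return kernel_id

    def _live(self, kernel_id):
        kernel = self._kernels[kernel_id]
        if not kernel["alive"]:
            raise RuntimeError(f"{kernel_id} has been killed")
        return kernel

    def kernel_invoke(self, kernel_id, prompt):
        self._live(kernel_id)["events"].append(("invoke", prompt))

    def kernel_readjust(self, kernel_id, new_directive):
        # The /goal is left untouched: readjust patches the context only.
        kernel = self._live(kernel_id)
        kernel["directives"].append(new_directive)
        kernel["events"].append(("readjust", new_directive))

    def kernel_observe(self, kernel_id, since=0, query=None):
        # Observation stays possible after a kill, for post-mortem reading.
        events = self._kernels[kernel_id]["events"][since:]
        if query is not None:
            events = [e for e in events if query in e[1]]
        return events

    def kernel_kill(self, kernel_id, reason):
        kernel = self._live(kernel_id)
        kernel["alive"] = False
        kernel["events"].append(("kill", reason))


reg = KernelRegistry()
kid = reg.kernel_spawn("R&D benchmarks", scope="rnd", resource_limits={"ram_mb": 1024})
reg.kernel_invoke(kid, "run the nightly benchmark")
reg.kernel_readjust(kid, "focus on latency, not throughput")
print(reg.kernel_observe(kid, query="latency"))
```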

When to orchestrate several kernels in parallel

Three canonical use cases:

Load burst: massive ingestion (hundreds of PDFs, videos, data sets). Spawn a temporary worker kernel for the duration of the burst, then decommission it.
Doctrine diversification: an R&D kernel can tolerate aggressive exploration, whereas a customer kernel requires systematic validation. Separating them lets us optimize each for its business.
Data region: one kernel hosted in France for sensitive customer data, another in a neutral zone for public watch. The triage routes according to the nature of the prompt.

Cost and bottlenecks

Each kernel has a fixed cost (RAM ~512 MB – 2 GB depending on loaded skills, CPU during embedding bursts) and a variable cost (API or GPU tokens per decision). Observed bottlenecks on a shared VPS:

RAM: primary limit. Depending on the per-kernel envelope, a shared host runs from a handful to several dozen parallel kernels.
CPU: mainly on Qdrant bursts and heavy Python hooks.
Network: provider rate limits (Anthropic, OpenRouter, Mistral).
Disk: Qdrant growth. The 24h TTL retention on nika:pod_streams mitigates this.

K3s pod or namespace isolation lets us enforce hard limits (Linux cgroups) and avoid one kernel saturating the whole machine.
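Since RAM is the primary limit, the number of parallel kernels follows from simple arithmetic on the per-kernel envelope. The host size (16 GB) and the 4 GB reserved for shared services in the example are hypothetical figures, not measurements; only the 512 MB to 2 GB envelope comes from the text.

```python
def max_parallel_kernels(host_ram_mb, per_kernel_mb, reserved_mb=0):
    """Upper bound on parallel kernels when RAM is the limiting resource.

    `reserved_mb` is memory set aside for shared services and the OS.
    """
    if per_kernel_mb <= 0:
        raise ValueError("per_kernel_mb must be positive")
    return max(0, (host_ram_mb - reserved_mb) // per_kernel_mb)


# Hypothetical 16 GB host with 4 GB reserved for shared services.
print(max_parallel_kernels(16_384, 2_048, reserved_mb=4_096))  # heavy 2 GB envelope
print(max_parallel_kernels(16_384, 512, reserved_mb=4_096))    # light 512 MB envelope
```

The same per-kernel figure is what a cgroup memory limit would enforce, so the estimate and the hard limit can be kept in sync from one setting.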