Doctrines
Four operational doctrines that guide every decision in Nika OS: antifragility, algos as tools, kernel/pod split, multi-CLI agnosticism.
Why doctrines rather than rules
A rule says what to do. A doctrine explains why, and lets an autonomous system make the right call in cases not covered by written rules. The four doctrines below structure every mutation, every pod spawn, every extension of the system.
Doctrine 1 — Antifragility (no rigid harness)
A rigid system degrades when the environment shifts. A robust system resists. An antifragile system, in the Taleb sense, improves when the environment shifts.
Practical consequences:
- No hardcoded harness when a deterministic algorithm or an MCP server can do the work. If an external integration (CLI-Anything, for example) adds an abstraction layer that prevents the system from learning, we uninstall.
- Continuous mutation of the harness under GEPA tournament. Prompt wording evolves; the kernel never mutates.
- No complacent fallback. When a pod fails, we mutate its primitives and retry. We don’t wrap the failure in a try/except that hides the cause.
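The "mutate and retry" behaviour in the last point can be sketched as follows. This is a minimal illustration, not the Nika OS implementation: `run_with_mutation`, `PodFailure`, and the `run_pod` / `mutate` callables are hypothetical names introduced here. Every failure cause is recorded and re-raised instead of being hidden.

```python
from typing import Any, Callable, Dict

Primitives = Dict[str, Any]  # hypothetical shape for a pod's mutable primitives


class PodFailure(Exception):
    """Raised when a pod still fails after every mutation attempt."""


def run_with_mutation(
    run_pod: Callable[[Primitives], Any],
    mutate: Callable[[Primitives, Exception], Primitives],
    primitives: Primitives,
    max_attempts: int = 3,
) -> Any:
    """Run a pod; on failure, mutate its primitives and retry.

    Each failure cause is kept and surfaced in the final error.
    """
    causes = []
    last_exc = None
    for attempt in range(1, max_attempts + 1):
        try:
            return run_pod(primitives)
        except Exception as exc:
            causes.append(f"attempt {attempt}: {exc!r}")
            last_exc = exc
            if attempt < max_attempts:
                # The failure itself drives the next mutation.
                primitives = mutate(primitives, exc)
    # No silent fallback: the full history is raised, chained to the last cause.
    raise PodFailure("; ".join(causes)) from last_exc
```

The only `except` clause here exists to feed the mutation step; when all attempts are exhausted, the caller sees every cause.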
Concrete application
Decision 2026-05-21: uninstallation of the cli-anything-hub and
cli-anything-freecad packages. Why? Because they introduced an external
CLI harness layer that prevented the freecad-pilot skill from calling
FreeCAD’s internal Python API directly. A skill mutation had to go
through the external package’s API, which did not follow FreeCAD’s own
evolution. Result: we depended on someone else to evolve.
Doctrine applied: fewer intermediaries, more direct control, more capacity to evolve.
Doctrine 2 — Algos as tools, not as LLM
For a problem that has a deterministic algorithmic solution, we don’t use an LLM. We expose the algo as a tool the LLM can call.
Consequences:
- Inferential statistics, DoE, SPC, ANOVA are implemented in OSS Python (pyDOE3, statsmodels, scipy.stats, scikit-learn) and exposed via the stats-doe-spc-python skill.
- Generation of ISO technical drawings goes through FreeCAD AppImage + the techdraw-fab-drawings skill. No LLM-generated technical drawing.
- Company search via the recherche-entreprises.api.gouv.fr API + the recherche-entreprises skill. No LLM Q&A on this data.
Why: an LLM is expensive, slow, and stochastic. A deterministic algo is free, fast, and reproducible. The LLM should serve decisions, synthesis, and communication — not calculations already solved.
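To illustrate "algo as a tool", here is a small sketch using only the Python standard library. The tool name `spc_control_limits`, the `TOOLS` registry, and `call_tool` are assumptions made for this example, and the limits use a simplified mean ± 3 sample standard deviations rather than a full SPC chart method.

```python
import statistics
from typing import Callable, Dict, List


def control_limits(samples: List[float]) -> Dict[str, float]:
    """Simplified control limits: mean +/- 3 sample standard deviations."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    center = statistics.fmean(samples)
    sigma = statistics.stdev(samples)
    return {"center": center, "lcl": center - 3 * sigma, "ucl": center + 3 * sigma}


# Hypothetical registry: the LLM only picks a tool name and arguments;
# the arithmetic itself is done by deterministic code.
TOOLS: Dict[str, Callable[..., Dict[str, float]]] = {
    "spc_control_limits": control_limits,
}


def call_tool(name: str, **kwargs) -> Dict[str, float]:
    """Dispatch a tool call by name; unknown names raise KeyError."""
    return TOOLS[name](**kwargs)


limits = call_tool("spc_control_limits", samples=[10, 12, 11, 13, 9])
print(limits["center"])  # 11.0
```

Calling the tool twice with the same samples gives the same result, which is the reproducibility property this doctrine relies on.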
The criterion
We promote a primitive to kernel status only if it:
- brings a measurable net advantage on user satisfaction;
- lets us get it right the first time, thus saving tokens.
Otherwise, it stays an accessory mutable primitive — therefore in the GEPA scope, therefore subject to continuous evolution.
Doctrine 3 — Kernel/pod split
The kernel of a long-running agentic system must be etched. The harness that surrounds it must be mutable. Mixing the two is the most frequent cause of incidents.
| Layer | Status | Examples |
|---|---|---|
| Kernel | Off-limits for auto-mutation | CLAUDE.md, agent system prompts, security contracts, PALACE PROTOCOL, SQP matrix |
| Harness | Mutable under GEPA | Skill wording, model hyperparams, prompt ordering, routing heuristics |
Strict rule: no automatic mutation can modify a kernel file. A mutation that would need to touch the kernel is intercepted and escalated to the human operator for explicit validation.
This discipline allows:
- guaranteeing the stability of critical invariants (security, integrity, brand);
- authorizing aggressive learning of the harness without risking a regression on the invariants;
- rolling back a harness mutation quickly without having to rebuild the kernel.
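The interception rule could look like the sketch below. The file layout (`KERNEL_FILES`, `KERNEL_DIRS`) and the function names are hypothetical; only the behaviour comes from the text above: a mutation touching any kernel path is escalated rather than applied. Path normalisation (for example `..` segments) is deliberately left out.

```python
from pathlib import PurePosixPath
from typing import Iterable

# Hypothetical kernel layout, for illustration only.
KERNEL_FILES = {"CLAUDE.md"}
KERNEL_DIRS = {"agents", "security"}


def is_kernel(path: str) -> bool:
    """True if the path is a kernel file or sits under a kernel directory."""
    p = PurePosixPath(path)
    if p.name in KERNEL_FILES:
        return True
    return bool(p.parts) and p.parts[0] in KERNEL_DIRS


def review_mutation(touched_paths: Iterable[str]) -> str:
    """Return "escalate" if any touched path is kernel, else "apply"."""
    if any(is_kernel(path) for path in touched_paths):
        return "escalate"  # handed to the human operator, never auto-applied
    return "apply"


print(review_mutation(["skills/route/SKILL.md"]))               # apply
print(review_mutation(["skills/route/SKILL.md", "CLAUDE.md"]))  # escalate
```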
Doctrine 4 — Multi-CLI agnostic
Nika OS is designed to work with several agent runtimes: the CLI agent (main reference), Hermes (in-house CLI), Gemini CLI, on-premise Mistral. The business logic does not depend on a specific runtime.
Consequences:
- Skills are markdown + frontmatter, therefore readable by any runtime that can parse markdown.
- Hooks are Python scripts that read stdin and write stdout. No dependency on the agent CLI’s internal API.
- MCP servers follow the standard protocol, so any MCP client can use them.
- The JSONL bus is readable by any process able to parse JSON.
Why: a system that depends on a single provider is fragile. If the provider changes its pricing, its limits, or its behavior, the whole system is exposed. By keeping the architecture neutral, we keep the freedom to switch.
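A runtime-neutral hook, as described in the consequences above, needs nothing beyond stdin, stdout, and JSON. The sketch below assumes a hypothetical event shape (`tool`, `args`); the real payload depends on the runtime that invokes the hook. The output is a single JSON line, so it can be appended directly to a JSONL bus.

```python
import io
import json
from typing import TextIO


def run_hook(stdin: TextIO, stdout: TextIO) -> None:
    """Read one JSON event from stdin and write one JSON line to stdout."""
    event = json.load(stdin)
    record = {
        "tool": event.get("tool", "unknown"),
        # Only argument names are echoed, not their values.
        "arg_names": sorted(event.get("args", {})),
    }
    stdout.write(json.dumps(record) + "\n")


# Usage with in-memory streams standing in for the real pipes.
_out = io.StringIO()
run_hook(io.StringIO('{"tool": "read_file", "args": {"path": "a.md"}}'), _out)
print(_out.getvalue(), end="")
```

As a standalone script, the entry point would simply call `run_hook(sys.stdin, sys.stdout)`, with no import from any agent CLI.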
Concrete application
The nika_route_request cognitive router is invokable via the route
skill. It dispatches an intent toward the appropriate primitives (skill,
pod, MCP, schedule). The dispatch does not assume it is the agent CLI
that receives the request: the output is a structured plan that any
runtime can execute.
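One possible shape for such a structured plan is sketched below. The actual output schema of nika_route_request is not documented here, so `PlanStep`, `serialize_plan`, and the field names are assumptions; only the four primitive kinds come from the text.

```python
import json
from dataclasses import asdict, dataclass
from typing import List

PRIMITIVE_KINDS = {"skill", "pod", "mcp", "schedule"}


@dataclass(frozen=True)
class PlanStep:
    kind: str    # one of PRIMITIVE_KINDS
    target: str  # name of the primitive to invoke
    args: dict   # arguments passed to that primitive

    def __post_init__(self) -> None:
        if self.kind not in PRIMITIVE_KINDS:
            raise ValueError(f"unknown primitive kind: {self.kind!r}")


def serialize_plan(intent: str, steps: List[PlanStep]) -> str:
    """Serialize the plan as plain JSON so that any runtime can read it."""
    return json.dumps(
        {"intent": intent, "steps": [asdict(step) for step in steps]},
        sort_keys=True,
    )


plan = serialize_plan(
    "draw bracket",
    [PlanStep("skill", "techdraw-fab-drawings", {"part": "bracket"})],
)
print(plan)
```

Because the plan is plain data, executing it is the receiving runtime's job, and the router does not need to know which runtime that is.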
In one sentence each
- Antifragility: prefer learning over resisting.
- Algos as tools: deterministic wins over stochastic when possible.
- Kernel/pod split: what is etched stays etched, what mutates is bounded.
- Multi-CLI: no entrenchment in a single runtime.
These four doctrines, together, explain every architecture decision documented on this site.
Doctrine — Measured reliability + human validation on external writes
Nika OS is an autonomous system that acts on tools, but this autonomy is bounded by two kernel-immutable primitives that cannot be mutated by GEPA, nor disabled by a pod, nor bypassed by a skill.
Primitive 1 — Reliability level measurement
Every deliverable produced by Nika OS is accompanied by a reliability score computed continuously:
- Sources: observed KTW Y (chunks/min, error_rate, tool_diversity), deterministic LLM-as-judge meta-pod score, dispersion across N CLIs in tournament mode, recency / staleness of the RAG retrieval, internal contradiction rate detected.
- Scale: 0–100, empirically calibrated on a gold dataset.
- Display: the score is visible alongside each deliverable (web page, WhatsApp message, mail draft). Not a hidden score.
- Granularity: global deliverable score + sub-scores per dimension (factuality, completeness, coherence, style).
Without measurement, no verifiable trust. Without trust, no production use. This measurement is constitutive of the system.
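As an illustration of the granularity described above, a global score can be derived from the four sub-scores. The weighted mean and the default equal weights below are assumptions for this sketch; the calibration against the gold dataset is not modelled.

```python
from typing import Dict, Optional

DIMENSIONS = ("factuality", "completeness", "coherence", "style")


def global_score(
    sub_scores: Dict[str, float],
    weights: Optional[Dict[str, float]] = None,
) -> float:
    """Weighted mean of the per-dimension sub-scores, on the 0-100 scale."""
    weights = weights or {dim: 1.0 for dim in DIMENSIONS}
    for dim in DIMENSIONS:
        if not 0 <= sub_scores[dim] <= 100:
            raise ValueError(f"{dim} sub-score must be within 0-100")
    total_weight = sum(weights[dim] for dim in DIMENSIONS)
    return sum(sub_scores[dim] * weights[dim] for dim in DIMENSIONS) / total_weight


scores = {"factuality": 90, "completeness": 80, "coherence": 70, "style": 60}
print(global_score(scores))  # 75.0
```

Displaying both the global value and the sub-scores lets the reader see which dimension pulls a deliverable down.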
Primitive 2 — Human-in-the-loop on external writes
Any action that writes outside the Nika OS system requires an explicit human validation before execution. The exhaustive list:
| Category | Examples |
|---|---|
| Outgoing communications | Mail (Gmail, Outlook), WhatsApp to an external contact, LinkedIn / public Teams post, SMS, fax |
| External DB writes | Public API POST / PUT / DELETE, modification of a client SharePoint folder, write into a third-party ERP / CRM |
| Financial commitments | Payment, transfer, purchase, quote / contract signature |
| Code in production | Direct push to main, merge without review, deployment, modification of shared infra |
| Legal documents | Submission of an official file (Greffe, DREETS, INPI, tax), signature of a deed |
| Personal data | Creation of a customer account, modification of an HR / medical file |
This rule is etched in the kernel via a PreToolUse hook that:
- Intercepts each tool call before execution.
- Classifies the action (read / internal write / external write).
- If external write → blocks + requests validation from the human operator via the channel they configured (WhatsApp / Teams by default, admin console as backup).
- The operator validates or rejects. Without an answer within the configured delay (default 30 min) → action cancelled and traced as rejected.
- The rule cannot be disabled by GEPA, nor by a skill, nor by a pod.
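The steps above can be sketched as a small gate function. The tool names, the `ask_operator` callback, and the return shape are hypothetical. Treating unknown tools as external writes (fail-closed) is a design choice of this sketch, not something stated in the rule.

```python
from enum import Enum
from typing import Callable, Dict, Optional


class ActionClass(Enum):
    READ = "read"
    INTERNAL_WRITE = "internal write"
    EXTERNAL_WRITE = "external write"


# Hypothetical tool names, for illustration only.
READ_TOOLS = {"read_file", "search_rag"}
INTERNAL_WRITE_TOOLS = {"append_jsonl", "update_metrics"}


def classify(tool_name: str) -> ActionClass:
    """Classify a tool call; anything unknown is treated as an external write."""
    if tool_name in READ_TOOLS:
        return ActionClass.READ
    if tool_name in INTERNAL_WRITE_TOOLS:
        return ActionClass.INTERNAL_WRITE
    return ActionClass.EXTERNAL_WRITE


def gate(
    tool_name: str,
    ask_operator: Callable[[str, int], Optional[bool]],
    timeout_s: int = 30 * 60,  # default delay of 30 minutes
) -> Dict[str, object]:
    """Decide whether a tool call may run.

    ask_operator returns True (validated), False (rejected),
    or None when no answer arrived within timeout_s.
    """
    action = classify(tool_name)
    if action is not ActionClass.EXTERNAL_WRITE:
        return {"allowed": True, "status": action.value}
    answer = ask_operator(tool_name, timeout_s)
    if answer is True:
        return {"allowed": True, "status": "validated"}
    # A timeout is traced as a rejection, like an explicit refusal.
    detail = "timeout" if answer is None else "operator"
    return {"allowed": False, "status": "rejected", "detail": detail}
```

For example, `gate("send_mail", lambda tool, timeout: None)` returns a rejection with detail `"timeout"`, while `gate("read_file", ...)` never calls the operator at all.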
The hook is marked kernel_immutable: true in settings.json.
Strict exceptions
Only three cases are auto-validated:
- Outgoing replies on a thread initiated by Paul (reply to a mail where Paul is explicitly in CC and has already authorized the draft).
- Internal Nika OS updates (logs, metrics, RAG, JSONL, tenant-local configs).
- Explicitly budgeted actions in the wizard with a maximum threshold (e.g. consumable purchase < 10 € via dedicated purchase API).
Any other external write → mandatory human validation.
Why this doctrine
Autonomy without guardrails leads to incidents that cost far more than the friction of a validation. The system can afford to be wrong on a meeting summary (negligible impact), but not on sending a badly worded mail to a customer (reputational, legal, and commercial impact).
This doctrine is the operational expression of the dual-control principle historically applied to critical systems: automated decision + human validation = safe autonomy.
Without this doctrine, Nika OS would not be industrializable.