Doctrines

Four operational doctrines that guide every decision in Nika OS: antifragility, algos as tools, kernel/pod split, multi-CLI agnosticism.

Why doctrines rather than rules

A rule says what to do. A doctrine explains why, and lets an autonomous system make the right call in cases not covered by written rules. The four doctrines below structure every mutation, every pod spawn, every extension of the system.

Doctrine 1 — Antifragility (no rigid harness)

A rigid system degrades when the environment shifts. A robust system resists. An antifragile system, in the Taleb sense, improves when the environment shifts.

Practical consequences:

  • No hardcoded harness when a deterministic algorithm or an MCP server can do the work. If an external integration (CLI-Anything, for example) adds an abstraction layer that prevents the system from learning, we uninstall it.
  • Continuous mutation of the harness under GEPA tournament. Prompt wording evolves; the kernel never mutates.
  • No complacent fallback. When a pod fails, we mutate its primitives and retry. We don’t wrap the failure in a try/except that hides the cause.
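The "mutate and retry" rule can be sketched as follows. This is a minimal illustration, not the Nika OS implementation: `Attempt`, `run_with_mutation`, and the `mutate` callback are hypothetical names. The point is that every failure cause is recorded and the final error carries the full history instead of being swallowed.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Attempt:
    """One pod attempt: the parameters used and the error, if any (hypothetical type)."""
    params: dict
    error: Optional[str] = None


def run_with_mutation(
    task: Callable[[dict], Any],
    params: dict,
    mutate: Callable[[dict, Exception], dict],
    max_attempts: int = 3,
) -> tuple:
    """Run task; on failure, record the cause, mutate the params, and retry."""
    history = []
    for _ in range(max_attempts):
        try:
            result = task(params)
        except Exception as exc:
            # The cause is kept in the history, never hidden.
            history.append(Attempt(dict(params), repr(exc)))
            params = mutate(params, exc)
            continue
        history.append(Attempt(dict(params)))
        return result, history
    # No silent fallback: surface every recorded cause to the caller.
    causes = [attempt.error for attempt in history]
    raise RuntimeError(f"pod failed after {max_attempts} attempts: {causes}")
```

The `try/except` here exists only to record the cause and trigger a mutation. When attempts run out, the failure is raised with its full history.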

Concrete application

Decision of 2026-05-21: we uninstalled the cli-anything-hub and cli-anything-freecad packages. They introduced an external CLI harness layer that prevented the freecad-pilot skill from calling FreeCAD’s internal Python API directly. Any skill mutation had to go through the external package’s API, which did not track FreeCAD’s own evolution. Result: we depended on someone else to evolve.

Doctrine applied: fewer intermediaries, more direct control, more capacity to evolve.

Doctrine 2 — Algos as tools, not as LLM

For a problem that has a deterministic algorithmic solution, we don’t use an LLM. We expose the algo as a tool the LLM can call.

Consequences:

  • Inferential statistics, DoE, SPC, ANOVA are implemented in OSS Python (pyDOE3, statsmodels, scipy.stats, scikit-learn) and exposed via the stats-doe-spc-python skill.
  • Generation of ISO technical drawings goes through FreeCAD AppImage + the techdraw-fab-drawings skill. No LLM-generated technical drawing.
  • Company search via the recherche-entreprises.api.gouv.fr API + the recherche-entreprises skill. No LLM Q&A on this data.

Why: an LLM is expensive, slow, and stochastic. A deterministic algo is close to free at run time, fast, and reproducible. The LLM should serve decisions, synthesis, and communication, not calculations that are already solved.
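As an illustration of "algo as a tool", here is a small SPC computation written with the standard library only and registered under a tool name. It is a sketch under assumptions: the function names and the `TOOLS` registry are hypothetical, and the actual stats-doe-spc-python skill relies on the OSS libraries listed above. The limits are those of a standard individuals chart (center ± 2.66 × mean moving range).

```python
from statistics import mean


def individuals_control_limits(values):
    """Control limits of an individuals (X) chart from the mean moving range."""
    if len(values) < 2:
        raise ValueError("need at least two measurements")
    center = mean(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    # 2.66 = 3 / d2, with d2 = 1.128 for moving ranges of size 2.
    spread = 2.66 * mean(moving_ranges)
    return {"center": center, "lcl": center - spread, "ucl": center + spread}


def out_of_control(values, limits):
    """Indices of the points that fall outside the control limits."""
    return [i for i, v in enumerate(values)
            if v < limits["lcl"] or v > limits["ucl"]]


# Hypothetical registry: the LLM picks a tool name, the algorithm does the math.
TOOLS = {
    "spc_individuals_limits": individuals_control_limits,
    "spc_out_of_control": out_of_control,
}
```

Calling `TOOLS["spc_individuals_limits"]([10, 12, 11, 13, 12])` returns the same limits on every call, which is exactly what a stochastic model cannot guarantee.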

The criterion

We promote a primitive to kernel status only if it:

  1. brings a measurable net advantage on user satisfaction;
  2. lets us get it right the first time, thus saving tokens.

Otherwise, it stays an accessory, mutable primitive: it remains in the GEPA scope and is therefore subject to continuous evolution.

Doctrine 3 — Kernel/pod split

The kernel of a long-running agentic system must be etched. The harness that surrounds it must be mutable. Mixing the two is the most frequent cause of incidents.

Layer | Status | Examples
Kernel | Off-limits for auto-mutation | CLAUDE.md, agent system prompts, security contracts, PALACE PROTOCOL, SQP matrix
Harness | Mutable under GEPA | Skill wording, model hyperparams, prompt ordering, routing heuristics

Strict rule: no automatic mutation can modify a kernel file. A mutation that would need to touch the kernel is intercepted and escalated to the human operator for explicit validation.

This discipline allows:

  • guaranteeing the stability of critical invariants (security, integrity, brand);
  • authorizing aggressive learning of the harness without risking a regression on the invariants;
  • rolling back a harness mutation quickly without having to rebuild the kernel.
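The strict rule on kernel files can be pictured as a guard that inspects the paths a mutation wants to touch. This is a sketch under assumptions: the file layout (`KERNEL_FILES`, `KERNEL_DIRS`) and the function names are hypothetical, and the real interception mechanism is not described here.

```python
import posixpath

# Hypothetical layout: which paths count as kernel is an assumption.
KERNEL_FILES = {"CLAUDE.md"}
KERNEL_DIRS = {"agents", "security"}


def is_kernel_path(path: str) -> bool:
    """True if the normalized path is a kernel file or lives in a kernel directory."""
    norm = posixpath.normpath(path)  # resolves "./" and "../" segments
    if norm in KERNEL_FILES:
        return True
    return any(norm == d or norm.startswith(d + "/") for d in KERNEL_DIRS)


def review_mutation(changed_paths):
    """Apply harness-only mutations; escalate anything that touches the kernel."""
    kernel_hits = [p for p in changed_paths if is_kernel_path(p)]
    if kernel_hits:
        return {"decision": "escalate", "kernel_files": kernel_hits}
    return {"decision": "apply", "kernel_files": []}
```

Paths are normalized before the check, so a mutation cannot reach `CLAUDE.md` through a path such as `skills/../CLAUDE.md`.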

Doctrine 4 — Multi-CLI agnostic

Nika OS is designed to work with several agent runtimes: the CLI agent (main reference), Hermes (in-house CLI), Gemini CLI, on-premise Mistral. The business logic does not depend on a specific runtime.

Consequences:

  • Skills are markdown + frontmatter, therefore readable by any runtime that can parse markdown.
  • Hooks are Python scripts that read stdin and write stdout. No dependency on the agent CLI’s internal API.
  • MCP servers follow the standard protocol, so any MCP client can use them.
  • The JSONL bus is readable by any process able to parse JSON.

Why: a system that depends on a single provider is fragile. If the provider changes its pricing, its limits, or its behavior, the whole system is exposed. By keeping the architecture neutral, we keep the freedom to switch.
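The stdin/stdout contract for hooks and the JSONL bus format need nothing beyond the standard library. The sketch below assumes a hypothetical event shape (`tool`, `decision`); the real hook payloads are not specified in this page.

```python
import json
import sys


def handle_event(event: dict) -> dict:
    """Hypothetical hook logic: echo the tool name with a decision."""
    return {"tool": event.get("tool"), "decision": "allow"}


def run_hook(stdin=None, stdout=None) -> None:
    """Read one JSON event from stdin and write one JSON answer to stdout."""
    stdin = stdin if stdin is not None else sys.stdin
    stdout = stdout if stdout is not None else sys.stdout
    event = json.load(stdin)
    json.dump(handle_event(event), stdout)
    stdout.write("\n")


def read_bus(lines):
    """Parse a JSONL stream: one JSON object per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)
```

Because the hook only touches two streams, any runtime that can spawn a process and pipe JSON to it can call it. The same holds for `read_bus`, which accepts any iterable of lines, including an open file.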

Concrete application

The nika_route_request cognitive router is invokable via the route skill. It dispatches an intent toward the appropriate primitives (skill, pod, MCP, schedule). The dispatch does not assume it is the agent CLI that receives the request: the output is a structured plan that any runtime can execute.
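A runtime-neutral plan can be as simple as a JSON document listing the primitives to invoke. The sketch below is illustrative only: `Step`, `build_plan`, and the field names are hypothetical and do not describe the actual output schema of nika_route_request.

```python
import json
from dataclasses import asdict, dataclass, field

# The four primitive kinds named in the text.
PRIMITIVE_KINDS = ("skill", "pod", "mcp", "schedule")


@dataclass
class Step:
    """One step of a plan (hypothetical schema)."""
    kind: str
    target: str
    args: dict = field(default_factory=dict)


def build_plan(intent: str, steps) -> str:
    """Serialize an intent and its steps as JSON that any runtime can parse."""
    for step in steps:
        if step.kind not in PRIMITIVE_KINDS:
            raise ValueError(f"unknown primitive kind: {step.kind}")
    plan = {"intent": intent, "steps": [asdict(s) for s in steps]}
    return json.dumps(plan, sort_keys=True)
```

The plan carries no reference to the runtime that produced it or the one that will execute it, which is what makes it portable.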

In one sentence each

  • Antifragility: prefer learning over resisting.
  • Algos as tools: deterministic wins over stochastic when possible.
  • Kernel/pod split: what is etched stays etched, what mutates is bounded.
  • Multi-CLI: no entrenchment in a single runtime.

These four doctrines, together, explain every architecture decision documented on this site.

Doctrine — Measured reliability + human validation on external writes

Nika OS is an autonomous system that acts on tools, but this autonomy is bounded by two kernel-immutable primitives. They cannot be mutated by GEPA, disabled by a pod, or bypassed by a skill.

Primitive 1 — Reliability level measurement

Every deliverable produced by Nika OS is accompanied by a reliability score computed continuously:

  • Sources: observed KTW Y (chunks/min, error_rate, tool_diversity), deterministic LLM-as-judge meta-pod score, dispersion across N CLIs in tournament mode, recency / staleness of the RAG retrieval, internal contradiction rate detected.
  • Scale: 0–100, empirically calibrated on a gold dataset.
  • Display: the score is shown alongside each deliverable (web page, WhatsApp message, mail draft). It is never hidden.
  • Granularity: global deliverable score + sub-scores per dimension (factuality, completeness, coherence, style).

Without measurement, no verifiable trust. Without trust, no production use. This measurement is constitutive of the system.
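To make the "global score + sub-scores" granularity concrete, here is one possible aggregation. It is an assumption, not the documented method: the weighted mean, the default equal weights, and the function names are hypothetical, and the empirical calibration on the gold dataset is not shown.

```python
# The four dimensions named in the text.
DIMENSIONS = ("factuality", "completeness", "coherence", "style")


def global_score(sub_scores: dict, weights: dict = None) -> float:
    """Weighted mean of the sub-scores, on the same 0-100 scale (assumed rule)."""
    weights = weights or {d: 1.0 for d in DIMENSIONS}
    for dim in DIMENSIONS:
        if dim not in sub_scores:
            raise ValueError(f"missing sub-score: {dim}")
        if not 0 <= sub_scores[dim] <= 100:
            raise ValueError(f"sub-score out of range: {dim}")
    total_weight = sum(weights[d] for d in DIMENSIONS)
    score = sum(sub_scores[d] * weights[d] for d in DIMENSIONS) / total_weight
    return round(score, 1)


def format_badge(sub_scores: dict, weights: dict = None) -> str:
    """Text shown next to a deliverable: global score, then each dimension."""
    details = ", ".join(f"{d} {sub_scores[d]}" for d in DIMENSIONS)
    return f"Reliability {global_score(sub_scores, weights)}/100 ({details})"
```

Showing the sub-scores next to the global score keeps the number auditable: a reader can see whether a mediocre score comes from style or from factuality.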

Primitive 2 — Human-in-the-loop on external writes

Any action that writes outside the Nika OS system requires explicit human validation before execution. The exhaustive list:

Category | Examples
Outgoing communications | Mail (Gmail, Outlook), WhatsApp to an external contact, LinkedIn / public Teams post, SMS, fax
External DB writes | Public API POST / PUT / DELETE, modification of a client SharePoint folder, write into a third-party ERP / CRM
Financial commitments | Payment, transfer, purchase, quote / contract signature
Code in production | Direct push to main, merge without review, deployment, modification of shared infra
Legal documents | Submission of an official file (Greffe, DREETS, INPI, tax), signature of a deed
Personal data | Creation of a customer account, modification of an HR / medical file

This rule is etched in the kernel via a PreToolUse hook that:

  1. Intercepts each tool call before execution.
  2. Classifies the action (read / internal write / external write).
  3. If external write → blocks + requests validation from the human operator via the channel they configured (WhatsApp / Teams by default, admin console as backup).
  4. The operator validates or rejects. Without an answer within the configured delay (default 30 min) → action cancelled and traced as rejected.
  5. The rule cannot be disabled by GEPA, nor by a skill, nor by a pod. The hook is marked kernel_immutable: true in settings.json.
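The steps above can be sketched as a small gate function. Everything named here is an assumption made for illustration: the tool names, the `ask_operator` callback (returning `True`, `False`, or `None` on timeout), and the choice to treat unknown tools as external writes. The real hook and its settings.json wiring are not shown.

```python
from enum import Enum
from typing import Callable, Optional


class ActionClass(Enum):
    READ = "read"
    INTERNAL_WRITE = "internal_write"
    EXTERNAL_WRITE = "external_write"


# Hypothetical tool names, for illustration only.
READ_TOOLS = {"read_file", "search_rag"}
INTERNAL_WRITE_TOOLS = {"append_log", "update_metrics"}
DEFAULT_TIMEOUT_S = 30 * 60  # default delay stated in the text: 30 min


def classify(tool_name: str) -> ActionClass:
    """Assumption: anything not known to be safe is treated as an external write."""
    if tool_name in READ_TOOLS:
        return ActionClass.READ
    if tool_name in INTERNAL_WRITE_TOOLS:
        return ActionClass.INTERNAL_WRITE
    return ActionClass.EXTERNAL_WRITE


def gate(
    tool_name: str,
    ask_operator: Callable[[str, int], Optional[bool]],
    timeout_s: int = DEFAULT_TIMEOUT_S,
) -> dict:
    """Allow reads and internal writes; external writes need operator approval."""
    action_class = classify(tool_name)
    if action_class is not ActionClass.EXTERNAL_WRITE:
        return {"decision": "allow", "class": action_class.value,
                "reason": "no external write"}
    answer = ask_operator(tool_name, timeout_s)
    if answer is True:
        return {"decision": "allow", "class": action_class.value,
                "reason": "operator approved"}
    # No answer within the delay is traced as a rejection.
    reason = "operator rejected" if answer is False else "timeout"
    return {"decision": "reject", "class": action_class.value, "reason": reason}
```

Defaulting unknown tools to the external-write class is a fail-closed choice: a new tool is blocked until someone classifies it.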

Strict exceptions

Only three cases are auto-validated:

  • Outgoing replies in a thread initiated by Paul (reply to a mail where Paul is explicitly in CC and has already authorized the draft).
  • Internal Nika OS updates (logs, metrics, RAG, JSONL, tenant-local configs).
  • Explicitly budgeted actions in the wizard with a maximum threshold (e.g. consumable purchase < 10 € via dedicated purchase API).

Any other external write → mandatory human validation.
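The three exceptions can be read as a single predicate that defaults to "not auto-validated". This is a sketch: the action dictionary, its field names (`kind`, `operator_in_cc`, `draft_authorized`, `amount_eur`), and the 10 € cap taken from the example above are illustrative assumptions.

```python
def is_auto_validated(action: dict, budget_cap_eur: float = 10.0) -> bool:
    """True only for the three strict exceptions; everything else needs a human."""
    kind = action.get("kind")
    if kind == "internal_update":
        # Logs, metrics, RAG, JSONL, tenant-local configs.
        return True
    if kind == "thread_reply":
        # Both conditions are required: operator in CC and draft already authorized.
        return bool(action.get("operator_in_cc")) and bool(action.get("draft_authorized"))
    if kind == "budgeted_purchase":
        # Strictly below the cap; a missing amount never passes.
        return action.get("amount_eur", float("inf")) < budget_cap_eur
    return False
```

A purchase at exactly the cap, or a reply whose draft was not authorized, falls back to mandatory human validation.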

Why this doctrine

Autonomy without guardrails leads to incidents that cost far more than the friction of a validation. The system can afford to be wrong on a meeting summary (negligible impact), but not on a badly worded mail sent to a customer (reputation, legal, commercial impact).

This doctrine is the operational expression of the dual-control principle historically applied to critical systems: automated decision + human validation = safe autonomy.

Without this doctrine, Nika OS would not be industrializable.