MCP and schedule

Two primitives extend the functional perimeter of Nika OS beyond the agent CLI: MCP servers (external tools) and schedule triggers (time-based execution).

MCP: Model Context Protocol

The Model Context Protocol is an open protocol introduced by Anthropic in 2024 to expose external tools to a language model. Each MCP server is an independent process that:

  • declares a list of tools with their JSON schema;
  • communicates with the agent CLI via stdio (stdin/stdout) or HTTP;
  • optionally exposes resources (files, databases).
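As a rough illustration of the first two bullets, an MCP server answers a `tools/list` JSON-RPC request with each tool's name, description, and `inputSchema` (a JSON Schema object). The sketch below builds such a reply with the standard library only. The `search_vault` tool and its parameters are hypothetical and are not taken from an actual Nika OS server.

```python
import json

# Hypothetical tool declaration; the name and fields are illustrative only.
SEARCH_TOOL = {
    "name": "search_vault",
    "description": "Semantic search in a vector store",
    "inputSchema": {
        "type": "object",
        "properties": {
            "query": {"type": "string"},
            "limit": {"type": "integer", "minimum": 1, "default": 5},
        },
        "required": ["query"],
    },
}


def tools_list_response(request_id: int, tools: list[dict]) -> str:
    """Serialize a JSON-RPC 2.0 reply to a `tools/list` request."""
    return json.dumps({"jsonrpc": "2.0", "id": request_id, "result": {"tools": tools}})


# Over the stdio transport, each message is written as a single line on stdout.
print(tools_list_response(1, [SEARCH_TOOL]))
```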

Nika OS uses 22 MCP servers configured in ~/.mcp.json. A few examples:

| Server | Role |
|---|---|
| qdrant | Search and write in the nika_vault vector store |
| nika-mcp | In-house kernel: pod spawn, bus, RAG, tournaments |
| playwright | Headless browser automation |
| runpod | Dispatch of GPU jobs on demand |
| grafana | Read monitoring dashboards |
| datagouv | Open API of French government data |
| mcp-builder | Generates a new MCP server from API documentation (OAuth-compatible if scopes provided) |

Concrete usage examples by category

A few operational workflows (by tool category, without customer names):

| Category | MCP tools | Typical workflow |
|---|---|---|
| Mail | gmail, outlook | Send, read, classify, summarize a thread, index in RAG |
| Banking / fintech | mcp-banking (OAuth API) | List transactions, export monthly statements, create a quote or invoice |
| CAD / 3D | freecad-pilot, techdraw-fab-drawings | Generate a parametric part, export STL/STEP, produce an ISO-compliant technical drawing |
| Drive / Storage | drive (Google), sharepoint (M365) | Sync, multimodal ingest (PDF + images), semantic search |
| Calendar | google-calendar, outlook-calendar | List events, suggest a slot, create a meeting |
| Web / scraping | playwright, chrome-cdp | Navigate an authenticated site, extract, automate a flow |
| GPU compute | runpod | Dispatch a fine-tune or a benchmark, kill the pod at the end |
| Monitoring | grafana, qdrant | Read dashboards, query the vector store |

Why MCP rather than in-house plugins

Three reasons:

  1. Interoperability — an MCP server written for a compatible agent CLI also works with other clients that speak MCP. The implementation cost is shared.
  2. Security by default — each MCP server runs in its own process, with its own credentials. The attack surface is contained.
  3. Hot-swap — an MCP server can be enabled or disabled without restarting the agent. Useful to isolate a bug or test a new integration.

The server-skill pattern

Several MCP servers are accompanied by a skill that knows how to use them. Generic example: a low-level server mcp-foo-api (exposes the raw REST endpoints of a third-party API) is paired with a foo-workflow skill (high level, knows when to call which endpoint to produce a business report). The server stays reusable, the skill encodes the business logic.

This separation respects the Unix principle: one tool per responsibility. The server knows how to talk to the API. The skill knows when to call it.
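A minimal sketch of this split, assuming a stubbed endpoint in place of a real API: `list_invoices` plays the role of a raw endpoint exposed by the hypothetical mcp-foo-api server, and `monthly_report` plays the role of the foo-workflow skill. Both names and the invoice fields are invented for illustration.

```python
from typing import Callable


# "Server" layer: thin wrapper over a raw endpoint (stubbed here, no network).
def list_invoices(month: str) -> list[dict]:
    """Stand-in for an endpoint exposed by the hypothetical mcp-foo-api."""
    return [
        {"id": "inv-1", "month": month, "amount": 120.0},
        {"id": "inv-2", "month": month, "amount": 80.0},
    ]


# "Skill" layer: decides which endpoint to call and how to shape the result.
def monthly_report(month: str, fetch: Callable[[str], list[dict]] = list_invoices) -> dict:
    invoices = fetch(month)
    return {
        "month": month,
        "count": len(invoices),
        "total": sum(invoice["amount"] for invoice in invoices),
    }


print(monthly_report("2025-01"))
```

Because the skill receives the endpoint as a parameter, the same server function can be reused by other skills, and the skill can be tested with a fake endpoint.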

Creating an MCP server from an API (with OAuth)

Nika OS includes an mcp-builder skill that automates the creation of a new MCP server from API documentation (OpenAPI, README, or arbitrary markdown). The generator produces:

  1. A Python wrapper that exposes the endpoints as MCP tools (@tool).
  2. An OAuth 2.0 flow if the API uses OAuth (PKCE by default), with encrypted refresh token storage in ~/.config/nika-os/credentials/ and automatic rotation before expiration.
  3. A declarative manifest (mcp.toml) listing scopes, rate limits, and permissions per tool.
  4. An optional companion skill that encodes frequent workflows (e.g. “weekly sync”, “monthly report”) to avoid re-specifying the logic at every session.
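To give an idea of what item 1 could look like, here is a small sketch of a `@tool` decorator that registers a function and derives a JSON schema from its signature. This is an assumption about the general shape of such a wrapper, not the code that mcp-builder actually generates. `TOOLS`, `tool`, and `get_invoice` are hypothetical names.

```python
import inspect
from typing import Callable

TOOLS: dict[str, dict] = {}  # registry filled when decorated functions are defined

_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def tool(func: Callable) -> Callable:
    """Hypothetical decorator: register `func` with a schema built from its signature."""
    params = inspect.signature(func).parameters
    TOOLS[func.__name__] = {
        "description": (func.__doc__ or "").strip(),
        "inputSchema": {
            "type": "object",
            "properties": {
                name: {"type": _JSON_TYPES.get(p.annotation, "string")}
                for name, p in params.items()
            },
            # Parameters without a default value are required.
            "required": [
                name for name, p in params.items()
                if p.default is inspect.Parameter.empty
            ],
        },
        "handler": func,
    }
    return func


@tool
def get_invoice(invoice_id: str, include_lines: bool = False) -> dict:
    """Fetch one invoice from the wrapped API (stubbed)."""
    return {"id": invoice_id, "lines": [] if include_lines else None}
```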

The pattern makes it possible to integrate a new API in a few minutes: documentation as input, usable MCP server as output, without duplication between CLI backends (Claude Code / Mistral Vibe / Gemini CLI).

Pods exposed as servers (N concurrent agents)

A Nika OS pod is not only an ephemeral worker. It can be exposed as a server (local port + ephemeral token) to allow N concurrent agents (humans or other pods, or even third-party CLIs) to:

  • Read its state in real time (transcript, RAG retrieval, working memory).
  • Send it directives without restarting the session.
  • Vote on its proposals (multi-pod tournament on the same task).
  • Re-route its messages to another backend in case of exhausted quota.

This exposure transforms a pod into a composable primitive: an agent becomes an addressable resource from anywhere in the private Tailscale mesh, with explicit access contracts (read / write / kill / re-adjust). Several agent sessions can thus collaborate on the same objective without stepping on each other, and a meta-orchestrator can arbitrate between them by reading their respective states.
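The access contracts mentioned above (read / write / kill / re-adjust) could be modeled as permissions attached to ephemeral tokens. The sketch below is an in-memory illustration under that assumption. `PodServer`, `issue_token`, and `authorize` are hypothetical names, and nothing here reflects the real token format.

```python
import secrets
from dataclasses import dataclass, field

PERMISSIONS = {"read", "write", "kill", "re-adjust"}  # contract names from the text


@dataclass
class PodServer:
    """Hypothetical in-memory model of a pod exposed behind ephemeral tokens."""
    grants: dict[str, frozenset[str]] = field(default_factory=dict)

    def issue_token(self, *permissions: str) -> str:
        unknown = set(permissions) - PERMISSIONS
        if unknown:
            raise ValueError(f"unknown permissions: {sorted(unknown)}")
        token = secrets.token_urlsafe(16)
        self.grants[token] = frozenset(permissions)
        return token

    def authorize(self, token: str, action: str) -> bool:
        # Unknown tokens map to an empty grant, so every action is refused.
        return action in self.grants.get(token, frozenset())


pod = PodServer()
observer = pod.issue_token("read")
orchestrator = pod.issue_token("read", "write", "kill")
```

An observer agent can then read the pod's state but cannot stop it, while a meta-orchestrator holding a broader token can.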

Schedule: temporal triggers

The Nika OS router automatically transforms a recurring request into a schedule trigger. It is a routing decision at the same level as spawn_pod. When the router classifies an intent as query_class=cron, it invokes one of the three backends.

The three backends

| Backend | Persistence | Local access | When to use it |
|---|---|---|---|
| Remote Triggers | Yes (cloud) | Via env bridge | Tasks with connectors (Gmail, Calendar, etc.) |
| CronCreate | No (session) | Yes | Fast polling, in-session reminders |
| Crontab OS | Yes | Yes | Infra scripts (health, oauth, compaction) |

Remote Triggers: the persistent cloud layer

Remote Triggers are created via Anthropic’s /v1/code/triggers API. They keep running even when no pod is open. Schedules are cron expressions evaluated in UTC (so Europe/Paris times must be converted), with a minimum interval of 1 hour.

RemoteTrigger(
    action="create",
    body={
        "name": "morning-brief",
        "cron_expression": "0 5 * * *",  # 05:00 UTC = 06:00 Paris in winter (CET), 07:00 in summer (CEST)
        "job_config": {...},
        "mcp_connections": ["gmail", "calendar"],
    },
)

The trigger spawns an agent pod (MCP-compatible CLI) at time T with the requested MCP connectors and the brief in job_config. The pod runs, produces its deliverable, notifies via the configured channels, and kills itself.
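Since the cron expression is evaluated in UTC, a Paris wall-clock time maps to a different UTC hour in winter and in summer. The helper below is a small sketch of that conversion using the standard `zoneinfo` module. `daily_cron_utc` is a hypothetical helper, not part of the triggers API.

```python
from datetime import date, datetime, time, timezone
from zoneinfo import ZoneInfo

PARIS = ZoneInfo("Europe/Paris")


def daily_cron_utc(local_hour: int, local_minute: int, on_day: date) -> str:
    """Return the UTC cron expression matching a Paris wall-clock time on `on_day`."""
    local = datetime.combine(on_day, time(local_hour, local_minute), tzinfo=PARIS)
    utc = local.astimezone(timezone.utc)
    return f"{utc.minute} {utc.hour} * * *"


print(daily_cron_utc(7, 0, date(2025, 1, 15)))  # winter day (CET, UTC+1)
print(daily_cron_utc(7, 0, date(2025, 7, 15)))  # summer day (CEST, UTC+2)
```

A fixed UTC expression therefore drifts by one hour in local time at each daylight-saving change, unless the trigger is recreated with the new offset.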

CronCreate: the session layer

When the need is one-off (polling to check a CI build passed, reminder at 2pm in the current session), CronCreate creates a cron at session scope. It disappears when the pod terminates.

This is the primitive used by the verify and loop skills, and by pods that need to wait for an external event without blocking their main loop.

Crontab OS: the infra layer

For infra scripts that do not need the agent loop (log rotation, health checks, oauth refresh, database compaction), Nika uses classic Linux crontab. It is the most basic and reliable layer.

The code of these scripts lives in scripts/cron_jobs/. They all log in JSONL in logs/cron/ to respect the auditability pattern.
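A JSONL log is one JSON object per line, which keeps each cron run independently parseable. The sketch below shows one possible shape for such a logger. The field names (`ts`, `job`, `status`) and the `log_event` helper are assumptions, not the format used by the actual scripts.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def log_event(log_file: Path, job: str, status: str, **details) -> dict:
    """Append one JSON object per line (JSONL) and return the record written."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "job": job,
        "status": status,
        **details,
    }
    log_file.parent.mkdir(parents=True, exist_ok=True)
    with log_file.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record
```

A health-check script could then call `log_event(Path("logs/cron/health.jsonl"), "health_check", "ok", latency_ms=12)` at the end of each run.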

A routing decision, not a static config

Important: these three backends are not hardcoded. The on_user_prompt.py router looks at the request and chooses the appropriate backend at runtime. A request like “every morning send me a brief of my mails” will be routed to a Remote Trigger. A request like “check every 2 minutes that pod X has finished” will be routed to a CronCreate.

This routing dynamic is what allows the system to transform a user intent into an operational primitive without human intervention.
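The routing described above could be sketched as a small decision function. The field names and the rules are assumptions inferred from the backend table and the 1-hour minimum interval of Remote Triggers. They are not the actual logic of on_user_prompt.py.

```python
from dataclasses import dataclass


@dataclass
class ScheduleRequest:
    interval_minutes: int         # how often the task should fire
    needs_agent: bool = True      # False for plain infra scripts
    session_scoped: bool = False  # only meaningful while the pod is alive


def choose_backend(req: ScheduleRequest) -> str:
    """Hypothetical decision rules derived from the backend table."""
    if not req.needs_agent:
        return "crontab_os"
    if req.session_scoped or req.interval_minutes < 60:
        # Remote Triggers cannot fire more often than once per hour.
        return "cron_create"
    return "remote_trigger"


# "Every morning send me a brief of my mails"
print(choose_backend(ScheduleRequest(interval_minutes=24 * 60)))
# "Check every 2 minutes that pod X has finished"
print(choose_backend(ScheduleRequest(interval_minutes=2, session_scoped=True)))
```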

Multimodal capabilities: voice, vision, gateways

Nika OS is not limited to text. Several multimodal primitives open the interface beyond the terminal:

Voice (Text-to-Speech + Speech-to-Text)

  • TTS via Vertex AI Gemini 2.5 Flash TTS (Kore voice by default) or OpenAI TTS. A text answer is synthesized into Opus .ogg and returned as a WhatsApp voice note or Teams voice message. Latency ~2–4s for a 30-second message.
  • STT via local Whisper (CPU) or Vertex Speech-to-Text. Received voice messages are transcribed before entering the standard text pipeline.
  • Heuristic: if the user sends a voice message, we reply with voice (with a text fallback in parallel for searches).

Vision (the agent’s eyes) — advanced image analysis

  • Simple image input — any pod can read an image (PDF, screenshot, product photo, technical drawing) via native multimodal models (Claude vision, Gemini vision, OpenAI vision depending on the backend chosen by the router).
  • Advanced image analysis — multi-pass pipeline:
    1. High-fidelity OCR via DeepSeek OCR hosted on private GPU (multilingual, technical drawing + handwriting recognition).
    2. Layout parsing of complex PDFs (Docling, LayoutParser) — extracts tables, figures, captions, title hierarchy into structured JSON.
    3. Business element detection (mechanical dimensions on a drawing, dial values, barcodes, signatures) via specialized models.
    4. Multimodal vision (Gemini 2.5 Pro / Claude vision) for semantic understanding: “what do we see? what is the state of the process?”.
    5. Cross-check text ↔ image — a fact extracted from text is confronted with what is visible on the image (inconsistency detection).
  • Multimodal RAG — Qdrant stores in parallel two collections: nika_vault (text, 384d embedding) and nika_multimodal (images + structured extracts, Vertex 1408d embedding). A question can retrieve a relevant image even if the query is textual.
  • Generation of images via Vertex Imagen or specialized models exposed via RunPod (for example a logo fine-tune).

Messaging gateways

So that a pod can interact with a human without a terminal:

  • WhatsApp Business — local Baileys bridge + Redis dispatcher. The received message triggers a synthetic UserPromptSubmit on the target kernel. Reply sent via POST /send (text or media).
  • Microsoft Teams — Graph API delegated OAuth. The bot listens to @mentions in authorized channels, routes to the kernel, returns the answer as a rich adaptive card.
  • Google Chat — Chat API webhook + configurable bot account, same pattern as Teams.

1-click install wizard

The installation of a new Nika OS tenant follows a detailed 7-step flow:

  1. Preferred CLI choice — the user selects their default agent CLI (Claude Code / Mistral Vibe / Gemini CLI / Hermes Agent). It is the first backend Nika OS calls for its prompts.

  2. Fallback chain configuration — order of fallback backends in case of usage limit or exhausted quota on the preferred one. By default: [preferred] → Mistral Vibe → Gemini CLI → Hermes. The user can reorder. Automatic usage-limit detector (regex on stdout) that switches transparently.

  3. Messaging gateways — one-click OAuth:

    • WhatsApp Business: scan QR to connect the Baileys bridge.
    • Microsoft Teams: delegated Graph API OAuth, multi-tenant.
    • Google Chat (optional): Chat API webhook.
  4. Inventory of tools used — the wizard asks which tools the user wants to integrate (Drive, Calendar, Gmail, Outlook, SharePoint, Qonto, GitHub, LinkedIn, etc.). For each, one-click OAuth if the provider supports it (Family A/B/C below), otherwise API key requested. Automatic generation of the corresponding MCP servers.

  5. Voice + vision setup — entry of Vertex API key (Gemini 2.5 Flash TTS + STT) or local Whisper key. Activation of multimodal RAG (Qdrant nika_multimodal 1408d Vertex collection). Smoke test: send a test voice message → transcription → voice reply.

5bis. Multi-cloud configuration (optional): for users who want to launch ML / AI jobs on demand or activate Nika OS horizontal auto-scaling. The wizard accepts API keys for the following providers:

  • Scaleway — sovereign EU VPS, L4 / H100 GPU, ideal for GDPR compliance.
  • Contabo — high-RAM / CPU VPS at low monthly cost, standard entry point.
  • Microsoft Azure — Compute + ML Compute (NDv4, NC, NDm), tenant Active Directory integration.
  • Google Cloud — Vertex AI + Compute Engine (A100, L4, TPU if project is eligible).
  • AWS — EC2 spot + SageMaker (p4d, g5, inf2).
  • RunPod — spot or serverless GPU, billed per minute.

The connection is done by SSH key generated locally and copied to the provider via their official API. The wizard then tests that nvidia-smi responds from the triggering pod. Once configured, the runpod skill (and its per-provider equivalents) can dispatch a job in one command.

This step is optional: a tenant that does not need heavy compute can skip it and enable it later from the UI.

Horizontal auto-scale — if several providers are configured, the Nika OS scheduler can automatically spawn a worker VPS when load exceeds a threshold (CPU > 80% / RAM > 75% / Redis queue > N tasks). The worker joins the Tailscale mesh, downloads the Nika OS image, drains the queue, then decommissions itself when load goes back down. Switching from one provider to another is done according to cost and availability — Scaleway and Contabo for EU base load, RunPod or GCP for GPU peaks.
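The spawn condition can be written as a simple predicate. The sketch below assumes that exceeding any one of the three thresholds is enough to trigger a spawn, since the text does not say whether they combine with "or" or "and". `Load` and `should_spawn_worker` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Load:
    cpu_pct: float
    ram_pct: float
    queue_len: int


def should_spawn_worker(load: Load, queue_threshold: int) -> bool:
    """True when any threshold is exceeded (CPU > 80, RAM > 75, queue > N)."""
    return (
        load.cpu_pct > 80
        or load.ram_pct > 75
        or load.queue_len > queue_threshold
    )


print(should_spawn_worker(Load(cpu_pct=85.0, ram_pct=40.0, queue_len=3), queue_threshold=50))
```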

Key use case: continuous training of a proprietary embedding + retrieval model for Nika OS. Rather than depending on an external embedding model (baseline bge-small-en-v1.5 384d), the tenant can periodically train its own model on its own Qdrant nika_vault corpus. The pipeline:

  1. Dataset builder — periodic pull of (query, relevant document) pairs from session history and JSONL audit.
  2. Hard negatives — generation by BM25 + adversarial LLM.
  3. Fine-tune — LoRA / QLoRA over 1–3 epochs (3–8 hours of A100 GPU).
  4. Eval — NDCG@10, MRR, Recall@10 vs baseline; deployment if significant gain (>5%).
  5. Zero-downtime migration — incremental Qdrant re-embedding via the qdrant-model-migration skill, without interrupting retrieval.
  6. Loop — a weekly or monthly cron restarts the pipeline.

The effect: retrieval quality increases progressively as the tenant uses Nika OS, because the embedding model learns the specific semantics of its business.
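For the eval step of the pipeline, the three metrics can be computed per query with binary relevance and then averaged over the evaluation set (averaging the reciprocal rank gives MRR). The sketch below assumes a non-empty set of relevant documents per query and reads the ">5%" gate as a relative gain over the baseline, which is an interpretation rather than something the text specifies.

```python
import math


def recall_at_k(ranked: list[str], relevant: set[str], k: int = 10) -> float:
    """Fraction of the relevant documents found in the top k results."""
    return len(set(ranked[:k]) & relevant) / len(relevant)


def reciprocal_rank(ranked: list[str], relevant: set[str]) -> float:
    """1 / rank of the first relevant document, or 0 if none is retrieved."""
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked: list[str], relevant: set[str], k: int = 10) -> float:
    """Binary-relevance NDCG: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(i + 1)
        for i, doc in enumerate(ranked[:k], start=1)
        if doc in relevant
    )
    ideal = sum(1.0 / math.log2(i + 1) for i in range(1, min(len(relevant), k) + 1))
    return dcg / ideal


def should_deploy(candidate: float, baseline: float, min_gain: float = 0.05) -> bool:
    """Deployment gate: relative gain over the baseline above `min_gain`."""
    return (candidate - baseline) / baseline > min_gain
```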

  6. Instance provisioning: dedicated K3s pod OR shared instance, with per-tenant quotas. The wizard packages all dependencies into a self-contained Docker image:

    • Chosen agent CLI + adapters of the others
    • Redis 7 + Qdrant (pre-filled with public doctrines + skills)
    • PostgreSQL + pgvector (hierarchy state)
    • Tailscale client to join the private mesh
    • Baileys WhatsApp bridge
    • All MCP servers from step 4
    • Local fastembed bge-small embedding models

    A single nika-install.sh command orchestrates everything, starts services in order, verifies healthchecks.

6bis. 1-click ingestion of personal documentation — before the onboarding chatbot, the wizard asks the user for a few profile fields (name, role, industry, preferred language, business context in a few sentences). On this basis, it offers a guided ingestion of personal or professional documentation:

  • Guided Tailscale SSH connection — the wizard opens a page with an ephemeral Tailscale auth-key identifier + a command to paste in a local terminal (Mac, Linux, Windows via WSL). Once executed, the user’s machine joins the tenant’s private mesh.
  • Simplified transfer — the user drags and drops their folders (Documents, local Drive, SaaS tool exports, old PDFs, drawing photos, Excel files) into a web UI served by Nika OS via Tailscale. No data transits over open Internet.
  • Ingestion pipeline — Nika OS detects the type (PDF, image, docx, xlsx, txt, eml, jsonl), applies the right pipeline (OCR, layout parsing, table extraction), tags with metadata (tenant_id, scope, entity_type, language, domain), embeds via the current embedding model, and inserts into Qdrant nika_vault filtered by tenant.
  • Validation — a summary is presented with: number of documents ingested, size, distribution of detected themes. The user validates or rejects.
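The type detection and tagging in the ingestion pipeline could look like the sketch below, which picks pipeline steps from the file extension and attaches filtering metadata. The extension-to-steps mapping and the `plan_ingestion` helper are assumptions made for illustration. The real pipeline may inspect file contents rather than extensions.

```python
from pathlib import Path

# Hypothetical mapping from file extension to pipeline steps.
PIPELINES = {
    ".pdf": ["layout_parsing", "ocr", "table_extraction"],
    ".png": ["ocr"],
    ".jpg": ["ocr"],
    ".docx": ["text_extraction"],
    ".xlsx": ["table_extraction"],
    ".txt": ["text_extraction"],
    ".eml": ["text_extraction"],
    ".jsonl": ["text_extraction"],
}


def plan_ingestion(path: str, tenant_id: str, language: str = "en") -> dict:
    """Pick a pipeline from the extension and attach the metadata used for filtering."""
    suffix = Path(path).suffix.lower()
    if suffix not in PIPELINES:
        raise ValueError(f"unsupported file type: {suffix or path}")
    return {
        "path": path,
        "steps": PIPELINES[suffix],
        "metadata": {"tenant_id": tenant_id, "language": language},
    }


print(plan_ingestion("exports/Report.PDF", tenant_id="tenant-42"))
```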

The effect: from the first prompt to Nika OS, the RAG already contains the tenant’s own knowledge. Nika does not need 10 conversational sessions to become useful; it knows your context right away.

This step is optional but strongly recommended for business uses (admin, finance, R&D, support). For purely conversational use (general chat), it can be skipped.

  7. Chatbot onboarding: via WhatsApp or Teams, an assistant guides the user through their first 10 prompts (present their business stack, preferences, pain points). Everything is memorized in the RAG from the start, which makes subsequent sessions immediately contextual.
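The fallback chain from step 2 can be pictured as a loop over backends that skips any backend whose output matches a usage-limit pattern. The sketch below uses stub callables in place of the CLI adapters. The regex is a placeholder, since each CLI words its quota messages differently, and `run_with_fallback` is a hypothetical name.

```python
import re
from typing import Callable

# Hypothetical pattern; real CLIs word their quota messages differently.
USAGE_LIMIT = re.compile(r"usage limit|quota exceeded|rate limit", re.IGNORECASE)

Backend = tuple[str, Callable[[str], str]]


def run_with_fallback(prompt: str, chain: list[Backend]) -> tuple[str, str]:
    """Try each backend in order; skip one whose stdout matches the usage-limit regex."""
    for name, backend in chain:
        output = backend(prompt)
        if not USAGE_LIMIT.search(output):
            return name, output
    raise RuntimeError("all backends reported a usage limit")


# Stub backends standing in for the CLI adapters.
chain = [
    ("preferred", lambda p: "Error: usage limit reached, try again later"),
    ("mistral-vibe", lambda p: f"answer to: {p}"),
]
print(run_with_fallback("hello", chain))
```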

Let’s go — minimum viable install

For a fast PoC, the functional minimum is:

| Step | Minimum |
|---|---|
| CLI | 1 (the preferred one); fallback optional |
| OAuth | Google OR Microsoft (at least 1) |
| Gateway | WhatsApp OR Teams (at least 1) |
| Voice | Optional; text-only OK to start |
| Vision | Optional; can be enabled later |
| Tools | Drive + Calendar + Gmail = 90% of common usage |
| RAG | Qdrant preloaded with public doctrines |

→ ~15 minutes from the “Install” click to the first functional prompt.

The whole gateway stack relies on the CLI adapters (Claude Code, Mistral Vibe, Gemini CLI, Hermes Agent) that can run in parallel, which allows offering several quality / sovereignty / cost tiers without changing the tenant-side wiring.

One-click OAuth to MCP — detailed flow

The Nika OS install wizard automates the connection of external providers via one-click OAuth. Three families are supported:

Family A — Google Cloud / Google Workspace (GWS)

  1. The user clicks “Connect Google” on the install page.
  2. Nika OS generates an OAuth 2.0 URL with PKCE + minimal scopes:
    • https://www.googleapis.com/auth/drive.readonly
    • https://www.googleapis.com/auth/calendar
    • https://www.googleapis.com/auth/gmail.modify (modify labels, drafts)
    • https://www.googleapis.com/auth/cloud-platform (if GCP is needed)
  3. The browser opens the Google consent. The user validates.
  4. Callback https://install.bcub3.com/oauth/google/callback receives the code.
  5. Exchange code → refresh + access token. Encrypted storage in ~/.config/nika-os/credentials/google.enc (age + tenant key).
  6. Automatic generation of 4 MCP servers (gmail-mcp, gdrive-mcp, gcalendar-mcp, gcp-mcp) with ready credentials. Hot-swap into the pod without restart.
  7. Smoke test: gmail.list_threads(maxResults=1) must return 200.
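Step 2 of this flow relies on PKCE (RFC 7636): the client generates a random code verifier, sends its SHA-256 hash as the code challenge, and later proves possession of the verifier when exchanging the code. The sketch below builds the pair and the Google authorization URL with the standard library. The client ID is a placeholder, and the helper names are hypothetical.

```python
import base64
import hashlib
import secrets
from urllib.parse import urlencode


def make_pkce_pair() -> tuple[str, str]:
    """Return (code_verifier, code_challenge) using the S256 method of RFC 7636."""
    verifier = secrets.token_urlsafe(64)  # 86 URL-safe characters, within the 43-128 range
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge


def authorization_url(client_id: str, redirect_uri: str, scopes: list[str], challenge: str) -> str:
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",
        "scope": " ".join(scopes),
        "code_challenge": challenge,
        "code_challenge_method": "S256",
        "access_type": "offline",  # asks Google to return a refresh token
    }
    return "https://accounts.google.com/o/oauth2/v2/auth?" + urlencode(params)


verifier, challenge = make_pkce_pair()
url = authorization_url(
    client_id="YOUR_CLIENT_ID",  # placeholder
    redirect_uri="https://install.bcub3.com/oauth/google/callback",
    scopes=["https://www.googleapis.com/auth/drive.readonly"],
    challenge=challenge,
)
```

The verifier stays on the client and is sent only in the token exchange of step 5, so an intercepted authorization code alone is not enough to obtain tokens.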

Family B — Microsoft 365 / Azure

Identical pattern for Outlook + Teams + SharePoint + OneDrive via Microsoft Graph API (delegated OAuth). The Azure app is multi-tenant; each customer authorizes in their own tenant. Scopes:

  • Mail.ReadWrite, Calendars.ReadWrite, Sites.Read.All, Files.ReadWrite
  • Chat.ReadWrite (Teams)

Family C — Classic MCP servers (API key)

For services that do not have OAuth (LinkedIn, some internal APIs, fintech banks with raw API key), the wizard asks for the key directly, encrypts it (age), and generates the corresponding MCP. The Python wrapper exposes @tool decorators that load the key at runtime from the vault.

Rotation and revocation

  • Google refresh token → automatic rotation 24h before expiration.
  • If OAuth refresh fails (user revoked) → WA notification + UI re-consent in one click.
  • JSONL audit log on every rotation / use for GDPR traceability.
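The 24-hour rotation margin amounts to a simple time comparison. A minimal sketch, assuming the expiration timestamp is stored alongside the credential; `needs_rotation` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

ROTATION_MARGIN = timedelta(hours=24)  # margin stated in the text


def needs_rotation(expires_at: datetime, now: Optional[datetime] = None) -> bool:
    """True once the credential is within 24 hours of its expiration."""
    now = now or datetime.now(timezone.utc)
    return expires_at - now <= ROTATION_MARGIN
```

A periodic job can call this for each stored credential and trigger the refresh, falling back to the re-consent notification when the refresh fails.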

Network recommendations

Nika OS runs on modest infra (1 Contabo VPS + on-demand spawn workers). The network choices favor simplicity and defense in depth rather than sophistication.

flowchart LR
    U["Paul (iPhone / laptop)"] -->|"Tailscale<br/>WireGuard"| TS["Tailscale Mesh<br/>(private, end-to-end encrypted)"]
    TS --> VPS["Contabo VPS<br/>(Ubuntu 24.04 LTS)"]
    VPS -->|Caddy/Traefik| TLS["TLS 1.3 + HSTS"]
    TLS --> NIKA["Nika OS Kernel"]
    TLS --> BAILEYS["WhatsApp Bridge<br/>port 3300 (loopback)"]
    NIKA --> REDIS[("Redis<br/>127.0.0.1:6379")]
    NIKA --> QDRANT[("Qdrant<br/>127.0.0.1:6333")]

    classDef user fill:#2C3E42,color:#F5F1E8,stroke:#1A262A;
    classDef mesh fill:#7DB5A5,color:#F5F1E8,stroke:#5E9384;
    classDef vps fill:#F5F1E8,color:#2C3E42,stroke:#7DB5A5;
    classDef edge fill:#E99971,color:#F5F1E8,stroke:#C97A55;
    classDef store fill:#F5F1E8,color:#2C3E42,stroke:#A86640;
    class U user;
    class TS mesh;
    class VPS,NIKA vps;
    class TLS,BAILEYS edge;
    class REDIS,QDRANT store;

Strict rules

  1. Bind to loopback by default — all internal services (Redis, Qdrant, Postgres, WhatsApp bridge) listen only on 127.0.0.1. No port is exposed on the public IP except 80/443 (Traefik) and 22 (restricted SSH).

  2. Tailscale as access VPN — the admin accesses internal services exclusively via the Tailscale WireGuard mesh. No exposed OpenVPN VPN, no IPSec, no admin panel on open Internet. Spawn workers (future multi-VPS) automatically join the mesh with an ephemeral auth-key.

  3. TLS everywhere — Traefik (or Caddy as an alternative) terminates TLS 1.3 + HSTS on 443. Automatic Let’s Encrypt certs. No TLS 1.0/1.1 accepted.

  4. Restrictive UFW firewall — default deny; allow only 22 (from admin IP), 80 (redirects 443), 443. Tailscale manages its tunnel separately.

  5. SSH hardening — no password auth, public key only, port 22 restricted to admin IPs via UFW. fail2ban active on SSH + audit cron.

  6. MTA-STS + DKIM + SPF + DMARC for outgoing mail from bcub3.com. DMARC policy at minimum quarantine, ideally reject once the chain is stable. MTA-STS published at .well-known/mta-sts.txt.

  7. Secrets never in clear — credentials encrypted with age, keys managed by a local vault (Vaultwarden self-hosted recommended) accessible only via Tailscale. No secrets in git, no logs in clear.

  8. Regular audit: daily-security-audit (hourly cron) checks public open ports, SSH attempts, sudo events, OOM kills, integrity of authorized_keys + sudoers, fail2ban banlist, state of critical services. Anomaly → encrypted WA notification.
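The open-port check from rules 1 and 8 can be expressed as a filter over (address, port) listeners: anything bound to a non-loopback address on a port other than 22, 80, or 443 is a violation. The sketch below assumes the listener list has already been collected by the audit script. How it is collected is out of scope, and `exposed_listeners` is a hypothetical name.

```python
import ipaddress

PUBLIC_PORTS = {22, 80, 443}  # the only ports allowed on the public IP


def exposed_listeners(listeners: list[tuple[str, int]]) -> list[tuple[str, int]]:
    """Return listeners bound to a non-loopback address on a port outside the allow-list."""
    violations = []
    for address, port in listeners:
        is_loopback = ipaddress.ip_address(address).is_loopback
        if not is_loopback and port not in PUBLIC_PORTS:
            violations.append((address, port))
    return violations


observed = [("127.0.0.1", 6379), ("0.0.0.0", 443), ("0.0.0.0", 6333), ("::1", 5432)]
print(exposed_listeners(observed))
```

In this simplified form, a service bound to the Tailscale interface address would also be flagged. A real audit would likely add that interface to its allow-list.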

What we deliberately don’t do

  • No direct Internet exposure of the Nika kernel. If an external client must interact, it goes through a public gateway (Teams, WhatsApp, web) with OAuth; never SSH or private API directly exposed.
  • No Cloudflare Worker for the kernel — we keep full control on the chain, even if it means supporting less traffic.
  • No managed multi-zone cluster (GKE / EKS) — overkill for our current load. Single-node K3s + ephemeral VPS workers is enough.
  • No separate Bastion host — Tailscale already covers the case with less attack surface.

This network frugality is consistent with the antifragility doctrine: fewer moving parts, simpler to reason about, faster to recover in case of incident.