MCP and schedule
Two primitives extend the functional perimeter of Nika OS beyond the agent CLI: MCP servers (external tools) and schedule triggers (time-based triggering).
MCP: Model Context Protocol
The Model Context Protocol is an open protocol introduced by Anthropic in late 2024 to expose external tools to a language model. Each MCP server is an independent process that:
- declares a list of tools with their JSON schema;
- communicates with the agent CLI via stdio (stdin/stdout) or HTTP;
- optionally exposes resources (files, databases).
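As a rough illustration of the first point, a tool declaration pairs a name and a description with a JSON Schema for its arguments. The sketch below uses the name / description / inputSchema fields of an MCP tool definition; the search_vault tool and its parameters are hypothetical, and the helper only checks required keys rather than validating the full schema.

```python
# Hypothetical tool declaration, shaped like an MCP tool definition.
SEARCH_TOOL = {
    "name": "search_vault",  # hypothetical tool name
    "description": "Semantic search in a vector store.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "query": {"type": "string"},
            "limit": {"type": "integer", "default": 5},
        },
        "required": ["query"],
    },
}


def missing_arguments(tool: dict, arguments: dict) -> list:
    """Return the required argument names absent from a tool call."""
    required = tool["inputSchema"].get("required", [])
    return [name for name in required if name not in arguments]
```

A client can run this kind of check before sending the call, so that a malformed request fails locally instead of reaching the server process.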
Nika OS uses 22 MCP servers configured in ~/.mcp.json. A few examples:
| Server | Role |
|---|---|
| qdrant | Search and write in the nika_vault vector store |
| nika-mcp | In-house kernel: pod spawn, bus, RAG, tournaments |
| playwright | Headless browser automation |
| runpod | Dispatch of GPU jobs on demand |
| grafana | Read monitoring dashboards |
| datagouv | Open API of French government data |
| mcp-builder | Generates a new MCP server from API documentation (OAuth-compatible if scopes provided) |
Concrete usage examples by category
A few operational workflows (by tool category, without customer names):
| Category | MCP tools | Typical workflow |
|---|---|---|
| Email | gmail, outlook | Send, read, classify, summarize a thread, index in RAG |
| Banking / fintech | mcp-banking (OAuth API) | List transactions, export monthly statements, create a quote or invoice |
| CAD / 3D | freecad-pilot, techdraw-fab-drawings | Generate a parametric part, export STL/STEP, produce an ISO-compliant technical drawing |
| Drive / Storage | drive (Google), sharepoint (M365) | Sync, multimodal ingest (PDF + images), semantic search |
| Calendar | google-calendar, outlook-calendar | List events, suggest a slot, create a meeting |
| Web / scraping | playwright, chrome-cdp | Navigate an authenticated site, extract, automate a flow |
| GPU compute | runpod | Dispatch a fine-tune or a benchmark, kill the pod at the end |
| Monitoring | grafana, qdrant | Read dashboards, query the vector store |
Why MCP rather than in-house plugins
Three reasons:
- Interoperability — an MCP server written for a compatible agent CLI also works with other clients that speak MCP. The implementation cost is shared.
- Security by default — each MCP server runs in its own process, with its own credentials. The attack surface is contained.
- Hot-swap — an MCP server can be enabled or disabled without restarting the agent. Useful to isolate a bug or test a new integration.
The server-skill pattern
Several MCP servers are accompanied by a skill that knows how to use
them. Generic example: a low-level server mcp-foo-api (exposes the raw
REST endpoints of a third-party API) is paired with a foo-workflow
skill (high level, knows when to call which endpoint to produce a
business report). The server stays reusable, the skill encodes the
business logic.
This separation respects the Unix principle: one tool per responsibility. The server knows how to talk to the API. The skill knows when to call it.
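The split can be sketched in a few lines. This is only an illustration of the pattern, not the actual Nika OS code: FooApiServer, list_invoices, and monthly_report are hypothetical names, and the server returns canned data instead of calling a real API.

```python
class FooApiServer:
    """Low-level layer: one method per raw endpoint, no business logic."""

    def list_invoices(self, month: str) -> list:
        # A real server would call the third-party REST API here.
        return [
            {"id": "inv-1", "month": month, "amount": 120.0},
            {"id": "inv-2", "month": month, "amount": 80.0},
        ]


def monthly_report(server: FooApiServer, month: str) -> dict:
    """High-level skill: decides which endpoint to call and how to aggregate."""
    invoices = server.list_invoices(month)
    return {
        "month": month,
        "count": len(invoices),
        "total": sum(inv["amount"] for inv in invoices),
    }
```

Swapping the report logic never touches the server class, and another skill can reuse list_invoices for a different purpose.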
Creating an MCP server from an API (with OAuth)
Nika OS includes an mcp-builder skill that automates the creation of a
new MCP server from API documentation (OpenAPI, README, or arbitrary
markdown). The generator produces:
- A Python wrapper that exposes the endpoints as MCP tools (@tool).
- An OAuth 2.0 flow if the API uses OAuth (PKCE by default), with encrypted refresh token storage in ~/.config/nika-os/credentials/ and automatic rotation before expiration.
- A declarative manifest (mcp.toml) listing scopes, rate limits, and permissions per tool.
- An optional companion skill that encodes frequent workflows (e.g. “weekly sync”, “monthly report”) to avoid re-specifying the logic at every session.
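To give an idea of what a decorator-based wrapper can look like, here is a minimal sketch. It is an assumption about the general shape, not the generator's actual output: the tool decorator, the TOOLS registry, get_balance, and call_tool are all hypothetical, and no real MCP library is involved.

```python
TOOLS = {}  # hypothetical registry: tool name -> Python callable


def tool(func):
    """Register a function as a tool under its own name."""
    TOOLS[func.__name__] = func
    return func


@tool
def get_balance(account_id: str) -> dict:
    """Stand-in for a wrapped REST endpoint (no network call here)."""
    return {"account_id": account_id, "balance": 0.0}


def call_tool(name: str, **arguments):
    """Dispatch a tool call by name, as a server would on a client request."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**arguments)
```

Each generated endpoint then becomes one decorated function, which keeps the wrapper readable and easy to regenerate when the API documentation changes.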
The pattern allows industrializing the integration of a new API in a few minutes: doc as input, usable MCP server as output, without duplication between CLI backends (Claude Code / Mistral Vibe / Gemini CLI).
Pods exposed as servers (N concurrent agents)
A Nika OS pod is not only an ephemeral worker. It can be exposed as a server (local port + ephemeral token) to allow N concurrent agents (humans or other pods, or even third-party CLIs) to:
- Read its state in real time (transcript, RAG retrieval, working memory).
- Send it directives without restarting the session.
- Vote on its proposals (multi-pod tournament on the same task).
- Re-route its messages to another backend in case of exhausted quota.
This exposure transforms a pod into a composable primitive: an agent becomes an addressable resource from anywhere in the private Tailscale mesh, with explicit access contracts (read / write / kill / re-adjust). Several agent sessions can thus collaborate on the same objective without stepping on each other, and a meta-orchestrator can arbitrate between them by reading their respective states.
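One possible way to model these access contracts is a map from ephemeral tokens to allowed actions. The sketch below is an assumption for illustration only: the PodAccess class and its methods are hypothetical, and a real deployment would also need expiry and transport security.

```python
import hmac
import secrets

PERMISSIONS = {"read", "write", "kill", "re-adjust"}


class PodAccess:
    """Maps ephemeral tokens to the set of actions they allow."""

    def __init__(self) -> None:
        self._grants = {}

    def issue_token(self, *allowed: str) -> str:
        unknown = set(allowed) - PERMISSIONS
        if unknown:
            raise ValueError(f"unknown permissions: {sorted(unknown)}")
        token = secrets.token_urlsafe(32)
        self._grants[token] = frozenset(allowed)
        return token

    def is_allowed(self, token: str, action: str) -> bool:
        # Constant-time comparison against every known token.
        for known, allowed in self._grants.items():
            if hmac.compare_digest(known, token):
                return action in allowed
        return False

    def revoke(self, token: str) -> None:
        self._grants.pop(token, None)
```

A read-only observer and a meta-orchestrator with kill rights would simply hold different tokens for the same pod.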
Schedule: temporal triggers
The Nika OS router automatically transforms a recurring request into a
schedule trigger. It is a routing decision at the same level as
spawn_pod. When the router classifies an intent as query_class=cron,
it invokes one of the three backends.
The three backends
| Backend | Persistence | Local access | When to use it |
|---|---|---|---|
| Remote Triggers | Yes (cloud) | Via env bridge | Tasks with connectors (Gmail, Calendar, etc.) |
| CronCreate | No (session) | Yes | Fast polling, in-session reminders |
| Crontab OS | Yes | Yes | Infra scripts (health, oauth, compaction) |
Remote Triggers: the persistent cloud layer
Remote Triggers are created via Anthropic’s /v1/code/triggers API.
They keep running even when no pod is open. Schedules are cron
expressions in UTC (convert from Europe/Paris), with a minimum interval of 1 hour.
RemoteTrigger(
    action="create",
    body={
        "name": "morning-brief",
        "cron_expression": "0 5 * * *",  # 05:00 UTC = 06:00 Paris in winter (CET), 07:00 in summer (CEST)
        "job_config": {...},
        "mcp_connections": ["gmail", "calendar"],
    },
)
The trigger spawns an agent pod (MCP-compatible CLI) at time T with the
requested MCP connectors and the brief in job_config. The pod runs,
produces its deliverable, notifies via the configured channels, and
kills itself.
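Because the cron expression is fixed in UTC, the local firing time in Paris shifts by one hour between winter and summer. The helper below is a small sketch of the conversion, with the UTC offset passed explicitly (a simplification; daily_cron_utc is a hypothetical name, not part of any trigger API).

```python
def daily_cron_utc(local_hour: int, local_minute: int, utc_offset_hours: int) -> str:
    """Build a daily cron expression in UTC from a local wall-clock time."""
    utc_hour = (local_hour - utc_offset_hours) % 24
    return f"{local_minute} {utc_hour} * * *"


CET, CEST = 1, 2  # Europe/Paris offsets from UTC: winter, summer

print(daily_cron_utc(7, 0, CET))   # 0 6 * * *
print(daily_cron_utc(7, 0, CEST))  # 0 5 * * *
```

A brief that must arrive at the same local time all year therefore needs either two triggers or an update of the expression at each daylight-saving change.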
CronCreate: the session layer
When the need is one-off (polling to check a CI build passed, reminder
at 2pm in the current session), CronCreate creates a cron at session
scope. It disappears when the pod terminates.
This is the primitive used by the verify and loop skills, and by pods
that need to wait for an external event without blocking their main loop.
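The non-blocking wait can be illustrated with plain asyncio. This is not the CronCreate implementation, only a sketch of the idea: poll_until and build_passed are hypothetical, and the CI check is simulated by a counter.

```python
import asyncio


async def poll_until(check, interval: float, max_attempts: int) -> bool:
    """Call check() every `interval` seconds until it returns True."""
    for _ in range(max_attempts):
        if check():
            return True
        await asyncio.sleep(interval)
    return False


async def main() -> bool:
    attempts = {"count": 0}

    def build_passed() -> bool:
        # Stand-in for a real CI status lookup: succeeds on the third call.
        attempts["count"] += 1
        return attempts["count"] >= 3

    # The poller runs as a task, so the main loop stays free for other work.
    poller = asyncio.create_task(poll_until(build_passed, 0.01, 10))
    await asyncio.sleep(0)  # other session work would happen here
    return await poller
```

The task lives inside the event loop of the session, so it disappears with it, which matches the session-scoped lifetime described above.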
Crontab OS: the infra layer
For infra scripts that do not need the agent loop (log rotation, health
checks, oauth refresh, database compaction), Nika uses classic Linux
crontab. It is the most basic and reliable layer.
The code of these scripts lives in scripts/cron_jobs/. They all log
in JSONL in logs/cron/ to respect the auditability pattern.
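JSONL logging means one JSON object per line, appended to a file. A minimal sketch, assuming one file per job and hypothetical field names (ts, job, status):

```python
import json
import time
from pathlib import Path


def log_event(log_dir: Path, job: str, status: str, **fields) -> Path:
    """Append one JSON object per line to <log_dir>/<job>.jsonl."""
    log_dir.mkdir(parents=True, exist_ok=True)
    path = log_dir / f"{job}.jsonl"
    record = {"ts": time.time(), "job": job, "status": status, **fields}
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, ensure_ascii=False) + "\n")
    return path


def read_events(path: Path) -> list:
    """Read the log back, one record per non-empty line."""
    with path.open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]
```

Append-only lines are easy to tail, grep, and ship to an audit store without parsing the whole file.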
A routing decision, not a static config
Important: these three backends are not hardcoded. The
on_user_prompt.py router looks at the request and chooses the
appropriate backend at runtime. A request like “every morning send me a
brief of my mails” will be routed to a Remote Trigger. A request like
“check every 2 minutes that pod X has finished” will be routed to a
CronCreate.
This routing dynamic is what allows the system to transform a user intent into an operational primitive without human intervention.
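The decision can be approximated by a few rules derived from the table of backends. The sketch below is a simplification under stated assumptions: ScheduleRequest and choose_backend are hypothetical names, and the real router classifies free-text intents rather than structured fields.

```python
from dataclasses import dataclass

MIN_REMOTE_INTERVAL_S = 3600  # Remote Triggers: minimum interval of 1 hour


@dataclass
class ScheduleRequest:
    interval_seconds: int
    needs_agent_loop: bool = True  # False for plain infra scripts
    session_scoped: bool = False   # True if it only matters in this session


def choose_backend(req: ScheduleRequest) -> str:
    """Pick one of the three schedule backends for a recurring request."""
    if not req.needs_agent_loop:
        return "crontab_os"
    if req.session_scoped or req.interval_seconds < MIN_REMOTE_INTERVAL_S:
        return "cron_create"
    return "remote_trigger"
```

With these rules, a daily mail brief goes to a Remote Trigger, a two-minute pod check goes to CronCreate, and a log rotation script goes to the OS crontab.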
Multimodal capabilities: voice, vision, gateways
Nika OS is not limited to text. Several multimodal primitives open the interface beyond the terminal:
Voice (Text-to-Speech + Speech-to-Text)
- TTS via Vertex AI Gemini 2.5 Flash TTS (Kore voice by default) or OpenAI TTS. A text answer is synthesized into Opus .ogg and returned as a WhatsApp voice note or Teams voice message. Latency ~2–4s for a 30-second message.
- STT via local Whisper (CPU) or Vertex Speech-to-Text. Received voice messages are transcribed before entering the standard text pipeline.
- Heuristic: if the user sends a voice message, we reply with voice (with a text fallback in parallel for searches).
Vision (the agent’s eyes) — advanced image analysis
- Simple image input: any pod can read an image (PDF, screenshot, product photo, technical drawing) via native multimodal models (Claude vision, Gemini vision, OpenAI vision depending on the backend chosen by the router).
- Advanced image analysis, as a multi-pass pipeline:
  - High-fidelity OCR via DeepSeek OCR hosted on private GPU (multilingual, technical drawing + handwriting recognition).
  - Layout parsing of complex PDFs (Docling, LayoutParser): extracts tables, figures, captions, title hierarchy into structured JSON.
  - Business element detection (mechanical dimensions on a drawing, dial values, barcodes, signatures) via specialized models.
  - Multimodal vision (Gemini 2.5 Pro / Claude vision) for semantic understanding: “what do we see? what is the state of the process?”.
  - Cross-check text ↔ image: a fact extracted from text is compared with what is visible in the image (inconsistency detection).
- Multimodal RAG: Qdrant stores two collections in parallel, nika_vault (text, 384d embedding) and nika_multimodal (images + structured extracts, Vertex 1408d embedding). A question can retrieve a relevant image even if the query is textual.
- Generation of images via Vertex Imagen or specialized models exposed via RunPod (for example a logo fine-tune).
Messaging gateways
So a pod can interact with a human without a terminal:
- WhatsApp Business: local Baileys bridge + Redis dispatcher. The received message triggers a synthetic UserPromptSubmit on the target kernel. Reply sent via POST /send (text or media).
- Microsoft Teams: Graph API delegated OAuth. The bot listens to @mentions in authorized channels, routes to the kernel, returns the answer as a rich adaptive card.
- Google Chat: Chat API webhook + configurable bot account, same pattern as Teams.
1-click install wizard
The installation of a new Nika OS tenant follows a detailed 7-step flow:
1. Preferred CLI choice: the user selects their default agent CLI (Claude Code / Mistral Vibe / Gemini CLI / Hermes Agent). It is the first backend Nika OS calls for its prompts.
2. Fallback chain configuration: order of fallback backends in case of usage limit or exhausted quota on the preferred one. By default: [preferred] → Mistral Vibe → Gemini CLI → Hermes. The user can reorder. An automatic usage-limit detector (regex on stdout) switches backends transparently.
3. Messaging gateways, with one-click OAuth:
   - WhatsApp Business: scan QR to connect the Baileys bridge.
   - Microsoft Teams: delegated Graph API OAuth, multi-tenant.
   - Google Chat (optional): Chat API webhook.
4. Inventory of tools used: the wizard asks which tools the user wants to integrate (Drive, Calendar, Gmail, Outlook, SharePoint, Qonto, GitHub, LinkedIn, etc.). For each, one-click OAuth if the provider supports it (Family A/B/C below), otherwise API key requested. Automatic generation of the corresponding MCP servers.
5. Voice + vision setup: entry of Vertex API key (Gemini 2.5 Flash TTS + STT) or local Whisper key. Activation of multimodal RAG (Qdrant nika_multimodal 1408d Vertex collection). Smoke test: send a test voice message → transcription → voice reply.
5bis. Multi-cloud configuration (optional) — for users who want to launch ML / AI jobs on demand or activate Nika OS horizontal auto-scaling. The wizard offers API keys for the following providers:
- Scaleway — sovereign EU VPS, L4 / H100 GPU, ideal for GDPR compliance.
- Contabo — high-RAM / CPU VPS at low monthly cost, standard entry point.
- Microsoft Azure — Compute + ML Compute (NDv4, NC, NDm), tenant Active Directory integration.
- Google Cloud — Vertex AI + Compute Engine (A100, L4, TPU if project is eligible).
- AWS — EC2 spot + SageMaker (p4d, g5, inf2).
- RunPod — spot or serverless GPU, billed per minute.
The connection is done by SSH key generated locally and copied to
the provider via their official API. The wizard then tests that
nvidia-smi responds from the triggering pod. Once configured, the
runpod skill (and its per-provider equivalents) can dispatch a job
in one command.
This step is optional: a tenant that does not need heavy compute can skip it and enable it later from the UI.
Horizontal auto-scale — if several providers are configured, the Nika OS scheduler can automatically spawn a worker VPS when load exceeds a threshold (CPU > 80% / RAM > 75% / Redis queue > N tasks). The worker joins the Tailscale mesh, downloads the Nika OS image, drains the queue, then decommissions itself when load goes back down. Switching from one provider to another is done according to cost and availability — Scaleway and Contabo for EU base load, RunPod or GCP for GPU peaks.
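The spawn condition can be written as a simple predicate. The sketch below assumes that exceeding any one of the three thresholds is enough to trigger a spawn (the text does not say whether they combine with "or" or "and"); Load and should_spawn_worker are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Load:
    cpu_percent: float
    ram_percent: float
    queue_length: int


def should_spawn_worker(load: Load, max_queue: int) -> bool:
    """True when any of the three thresholds is exceeded (assumed 'or')."""
    return (
        load.cpu_percent > 80
        or load.ram_percent > 75
        or load.queue_length > max_queue
    )
```

In practice such a check is usually averaged over a time window so that a short spike does not provision a VPS.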
Key use case: continuous training of a proprietary embedding +
retrieval model for Nika OS. Rather than depending on an external
embedding model (baseline bge-small-en-v1.5 384d), the tenant can
periodically train its own model on the Qdrant nika_vault corpus
that is theirs. The pipeline:
- Dataset builder: periodic pull of (query, relevant document) pairs from session history and JSONL audit.
- Hard negatives: generation by BM25 + adversarial LLM.
- Fine-tune: LoRA / QLoRA over 1–3 epochs (3–8 hours of A100 GPU).
- Eval: NDCG@10, MRR, Recall@10 vs baseline; deployment if significant gain (>5%).
- Zero-downtime migration: incremental Qdrant re-embedding via the qdrant-model-migration skill, without interrupting retrieval.
- Loop: a weekly or monthly cron restarts the pipeline.
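The eval step of this pipeline can be sketched with two of the metrics and a deployment gate. This is an illustration under assumptions: the 5% threshold is read as a relative gain over the baseline, and the function names are hypothetical.

```python
def mrr(rankings: dict, relevant: dict) -> float:
    """Mean reciprocal rank of the first relevant document per query."""
    total = 0.0
    for query, ranked_docs in rankings.items():
        for rank, doc in enumerate(ranked_docs, start=1):
            if doc in relevant[query]:
                total += 1.0 / rank
                break
    return total / len(rankings)


def recall_at_k(rankings: dict, relevant: dict, k: int = 10) -> float:
    """Average share of relevant documents found in the top k results."""
    scores = [
        len(set(ranked[:k]) & relevant[q]) / len(relevant[q])
        for q, ranked in rankings.items()
    ]
    return sum(scores) / len(scores)


def should_deploy(baseline: float, candidate: float, min_gain: float = 0.05) -> bool:
    """Deploy only if the relative gain over the baseline exceeds min_gain."""
    return candidate > baseline * (1 + min_gain)
```

Here rankings maps each query to its ordered list of retrieved document ids, and relevant maps each query to the set of ids judged relevant.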
The effect: retrieval quality increases progressively as the tenant uses Nika OS, because the embedding model learns the specific semantics of its business.
6. Instance provisioning: dedicated K3s pod OR shared instance, with per-tenant quotas. The wizard packages all dependencies into a self-contained Docker image:
   - Chosen agent CLI + adapters of the others
   - Redis 7 + Qdrant (pre-filled with public doctrines + skills)
   - PostgreSQL + pgvector (hierarchy state)
   - Tailscale client to join the private mesh
   - Baileys WhatsApp bridge
   - All MCP servers from step 4
   - Local fastembed bge-small embedding models

   A single nika-install.sh command orchestrates everything, starts services in order, and verifies healthchecks.
6bis. 1-click ingestion of personal documentation — before the onboarding chatbot, the wizard asks the user for a few profile fields (name, role, industry, preferred language, business context in a few sentences). On this basis, it offers a guided ingestion of personal or professional documentation:
- Guided Tailscale SSH connection: the wizard opens a page with an ephemeral Tailscale auth-key identifier + a command to paste in a local terminal (Mac, Linux, Windows via WSL). Once executed, the user’s machine joins the tenant’s private mesh.
- Simplified transfer: the user drags and drops their folders (Documents, local Drive, SaaS tool exports, old PDFs, drawing photos, Excel files) into a web UI served by Nika OS via Tailscale. No data transits over open Internet.
- Ingestion pipeline: Nika OS detects the type (PDF, image, docx, xlsx, txt, eml, jsonl), applies the right pipeline (OCR, layout parsing, table extraction), tags with metadata (tenant_id, scope, entity_type, language, domain), embeds via the current embedding model, and inserts into Qdrant nika_vault filtered by tenant.
- Validation: a summary is presented with: number of documents ingested, size, distribution of detected themes. The user validates or rejects.
The effect: from the first prompt to Nika OS, the RAG already contains the proper knowledge of the tenant. No need for 10 conversational sessions for Nika to become useful — it knows your context right away.
This step is optional but strongly recommended for business uses (admin, finance, R&D, support). For purely conversational use (general chat), it can be skipped.
7. Chatbot onboarding: via WhatsApp or Teams, an assistant guides the user through their first 10 prompts (present their business stack, preferences, pain points). Everything is memorized in the RAG from the start, which makes subsequent sessions immediately contextual.
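The fallback chain configured in step 2 can be sketched as a loop over backends with a regex check on their output. The pattern and the function name are hypothetical; real CLIs word their quota messages differently, so the regex would need to be tuned per backend.

```python
import re

# Hypothetical pattern; real CLIs word their quota messages differently.
USAGE_LIMIT = re.compile(r"usage limit|quota exceeded|rate limit", re.IGNORECASE)


def run_with_fallback(prompt: str, chain: list) -> tuple:
    """Try each (name, backend) pair in order; skip those that hit a limit."""
    for name, backend in chain:
        output = backend(prompt)
        if not USAGE_LIMIT.search(output):
            return name, output
    raise RuntimeError("all backends in the chain are exhausted")
```

Each backend is modeled here as a callable that takes the prompt and returns the captured stdout, which keeps the switch transparent for the caller.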
Let’s go — minimum viable install
For a fast PoC, the functional minimum is:
| Step | Minimum |
|---|---|
| CLI | 1 (the preferred one) — fallback optional |
| OAuth | Google OR Microsoft (at least 1) |
| Gateway | WhatsApp OR Teams (at least 1) |
| Voice | Optional — text-only OK to start |
| Vision | Optional — can be enabled later |
| Tools | Drive + Calendar + Gmail = 90% of common usage |
| RAG | Qdrant preloaded with public doctrines |
→ ~15 minutes from the “Install” click to the first functional prompt.
The whole gateway stack relies on the CLI adapters (Claude Code, Mistral Vibe, Gemini CLI, Hermes Agent) that can run in parallel, which allows offering several quality / sovereignty / cost tiers without changing the tenant-side wiring.
One-click OAuth to MCP — detailed flow
The Nika OS install wizard automates the connection of external providers via one-click OAuth. Three families are supported:
Family A — Google Cloud / Google Workspace (GWS)
- The user clicks “Connect Google” on the install page.
- Nika OS generates an OAuth 2.0 URL with PKCE + minimal scopes:
  - https://www.googleapis.com/auth/drive.readonly
  - https://www.googleapis.com/auth/calendar
  - https://www.googleapis.com/auth/gmail.modify (modify labels, drafts)
  - https://www.googleapis.com/auth/cloud-platform (if GCP is needed)
- The browser opens the Google consent screen. The user validates.
- Callback https://install.bcub3.com/oauth/google/callback receives the code.
- Exchange code → refresh + access token. Encrypted storage in ~/.config/nika-os/credentials/google.enc (age + tenant key).
- Automatic generation of 4 MCP servers (gmail-mcp, gdrive-mcp, gcalendar-mcp, gcp-mcp) with ready credentials. Hot-swap into the pod without restart.
- Smoke test: gmail.list_threads(maxResults=1) must return 200.
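The PKCE part of this flow relies on a random code verifier and its S256 challenge, as defined in RFC 7636. The sketch below shows only that derivation with the standard library; the function names are hypothetical and the rest of the flow (authorization URL, token exchange) is left out.

```python
import base64
import hashlib
import secrets


def make_code_verifier() -> str:
    """Random URL-safe verifier (43 characters from 32 random bytes)."""
    raw = secrets.token_bytes(32)
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def code_challenge_s256(verifier: str) -> str:
    """S256 challenge: base64url(SHA-256(verifier)) without padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

The challenge is sent with the authorization request and the verifier with the code exchange, so an intercepted authorization code is useless without the verifier held by the installer.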
Family B — Microsoft 365 / Azure
Identical pattern for Outlook + Teams + SharePoint + OneDrive via Microsoft Graph API (delegated OAuth). The Azure app is multi-tenant; each customer authorizes in their own tenant. Scopes:
- Mail.ReadWrite, Calendars.ReadWrite, Sites.Read.All, Files.ReadWrite
- Chat.ReadWrite (Teams)
Family C — Classic MCP servers (API key)
For services that do not have OAuth (LinkedIn, some internal APIs,
fintech banks with raw API key), the wizard asks for the key directly,
encrypts it (age), and generates the corresponding MCP. The Python
wrapper exposes @tool decorators that load the key at runtime from the
vault.
Rotation and revocation
- Google refresh token → automatic rotation 24h before expiration.
- If OAuth refresh fails (user revoked) → WhatsApp notification + UI re-consent in one click.
- JSONL audit log on every rotation / use for GDPR traceability.
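The 24-hour rotation rule comes down to a comparison between the expiration timestamp and the current time. A minimal sketch, assuming timezone-aware datetimes and a hypothetical needs_rotation helper:

```python
from datetime import datetime, timedelta, timezone

ROTATION_MARGIN = timedelta(hours=24)


def needs_rotation(expires_at: datetime, now: datetime) -> bool:
    """True once the token is within 24 hours of its expiration."""
    return expires_at - now <= ROTATION_MARGIN
```

A periodic job can call this for every stored credential and trigger the refresh, then write one audit record per rotation.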
Network recommendations
Nika OS runs on modest infra (1 Contabo VPS + on-demand spawn workers). The network choices favor simplicity and defense in depth rather than sophistication.
Recommended topology
flowchart LR
U["Paul (iPhone / laptop)"] -->|"Tailscale<br/>WireGuard"| TS["Tailscale Mesh<br/>(private, end-to-end encrypted)"]
TS --> VPS["Contabo VPS<br/>(Ubuntu 24.04 LTS)"]
VPS -->|Caddy/Traefik| TLS["TLS 1.3 + HSTS"]
TLS --> NIKA["Nika OS Kernel"]
TLS --> BAILEYS["WhatsApp Bridge<br/>port 3300 (loopback)"]
NIKA --> REDIS[("Redis<br/>127.0.0.1:6379")]
NIKA --> QDRANT[("Qdrant<br/>127.0.0.1:6333")]
classDef user fill:#2C3E42,color:#F5F1E8,stroke:#1A262A;
classDef mesh fill:#7DB5A5,color:#F5F1E8,stroke:#5E9384;
classDef vps fill:#F5F1E8,color:#2C3E42,stroke:#7DB5A5;
classDef edge fill:#E99971,color:#F5F1E8,stroke:#C97A55;
classDef store fill:#F5F1E8,color:#2C3E42,stroke:#A86640;
class U user;
class TS mesh;
class VPS,NIKA vps;
class TLS,BAILEYS edge;
class REDIS,QDRANT store;
Strict rules
- Bind to loopback by default: all internal services (Redis, Qdrant, Postgres, WhatsApp bridge) listen only on 127.0.0.1. No port is exposed on the public IP except 80/443 (Traefik) and 22 (restricted SSH).
- Tailscale as access VPN: the admin accesses internal services exclusively via the Tailscale WireGuard mesh. No exposed OpenVPN, no IPSec, no admin panel on open Internet. Spawn workers (future multi-VPS) automatically join the mesh with an ephemeral auth-key.
- TLS everywhere: Traefik (or Caddy as an alternative) terminates TLS 1.3 + HSTS on 443. Automatic Let’s Encrypt certs. No TLS 1.0/1.1 accepted.
- Restrictive UFW firewall: default deny; allow only 22 (from admin IP), 80 (redirects to 443), 443. Tailscale manages its tunnel separately.
- SSH hardening: no password auth, public key only, port 22 restricted to admin IPs via UFW. fail2ban active on SSH + audit cron.
- MTA-STS + DKIM + SPF + DMARC for outgoing mail from bcub3.com. DMARC policy at minimum quarantine, ideally reject once the chain is stable. MTA-STS published at .well-known/mta-sts.txt.
- Secrets never in clear: credentials encrypted with age, keys managed by a local vault (Vaultwarden self-hosted recommended) accessible only via Tailscale. No secrets in git, no logs in clear.
- Regular audit: daily-security-audit (hourly cron) checks public open ports, SSH attempts, sudo events, OOM kills, integrity of authorized_keys + sudoers, fail2ban banlist, state of critical services. Anomaly → encrypted WhatsApp notification.
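The open-port check of the audit can be sketched as a filter over the list of listening sockets. This is an illustration, not the actual audit script: the function name is hypothetical, the listeners are passed in as (address, port) pairs, and mesh interfaces such as Tailscale addresses would need their own allowlist.

```python
ALLOWED_PUBLIC_PORTS = {22, 80, 443}
LOOPBACK = {"127.0.0.1", "::1"}


def unexpected_public_ports(listeners: list) -> list:
    """Return (address, port) pairs that listen publicly outside the allowlist."""
    return [
        (address, port)
        for address, port in listeners
        if address not in LOOPBACK and port not in ALLOWED_PUBLIC_PORTS
    ]
```

Any non-empty result means a service escaped the loopback rule and should raise an anomaly.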
What we deliberately don’t do
- No direct Internet exposure of the Nika kernel. If an external client must interact, it goes through a public gateway (Teams, WhatsApp, web) with OAuth; never SSH or private API directly exposed.
- No Cloudflare Worker for the kernel — we keep full control on the chain, even if it means supporting less traffic.
- No managed multi-zone cluster (GKE / EKS) — overkill for our current load. Single-node K3s + ephemeral VPS workers is enough.
- No separate Bastion host — Tailscale already covers the case with less attack surface.
This network frugality is consistent with the antifragility doctrine: fewer moving parts, simpler to reason about, faster to recover in case of incident.