Roadmap

What’s shipped, what’s being built right now, and what’s next. Dates reflect real engineering plans, not marketing promises.

Shipped: Done and working
In Progress: Actively being built
Planned: Designed, not yet started
Shipped — Live today

Core Platform

End-to-end declarative VM lifecycle on bare metal. Controller, node agent, CLI, networking, storage, mTLS — all working in production.

VM Lifecycle

  • Create, delete, start, stop, update VMs
  • Container workloads via containerd (kctl workload / kctl container)
  • Unified workload API for VMs and containers
  • Desired-state reconciliation via Nix
  • Cloud Hypervisor runtime
  • YAML manifest support (kctl create vm -f)
  • Wait for boot / SSH readiness

Storage & Volumes

  • Filesystem, LVM, and ZFS backends
  • Volume create, delete, attach, detach
  • Per-VM storage provisioning on create
  • Storage overview API (dashboard)

Networking

  • NAT, bridge, and VXLAN network types
  • Cross-host VXLAN with optional outbound NAT
  • VLAN ID support on gateway
  • Security groups with ingress + optional DNAT policy
  • kctl apply -f ... with kind-based manifest dispatch (kubectl-style)
  • dnsmasq + nftables per network

Security & Operations

  • Full mTLS PKI (CA, sub-CA, per-node certs)
  • CN-based gRPC authorization
  • Cert rotation (kctl rotate)
  • Node drain and approval flow
  • Compliance reporting

Node Bootstrap

  • Automated node install via kctl node install
  • NixOS ISO with disko integration
  • PKI bootstrap into /etc/kcore/certs
  • Multi-disk layout (OS + data disks)

Platform

  • Capacity-aware scheduler (most-free-first)
  • Image management (URL+SHA256, upload, pull)
  • SSH key management
  • Dashboard (web UI)
  • Terraform provider (separate repo)
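To illustrate what "most-free-first" placement can look like, here is a minimal sketch. It is not the kcore scheduler: the NodeCapacity and Request types, the pick_node function, and the choice of free memory as the primary key (with vCPUs and node name as tie-breakers) are all assumptions made for the example.

```rust
/// Hypothetical view of a node's free capacity; field names are illustrative.
#[derive(Debug, Clone, PartialEq)]
pub struct NodeCapacity {
    pub name: String,
    pub free_vcpus: u32,
    pub free_memory_mib: u64,
}

/// Resources requested by a workload (hypothetical shape).
#[derive(Debug, Clone, Copy)]
pub struct Request {
    pub vcpus: u32,
    pub memory_mib: u64,
}

/// Among nodes that can fit the request, pick the one with the most free
/// memory. Ties fall back to free vCPUs, then to the lexicographically
/// smallest name so the result is deterministic.
pub fn pick_node<'a>(nodes: &'a [NodeCapacity], req: Request) -> Option<&'a NodeCapacity> {
    nodes
        .iter()
        .filter(|n| n.free_vcpus >= req.vcpus && n.free_memory_mib >= req.memory_mib)
        .max_by(|a, b| {
            a.free_memory_mib
                .cmp(&b.free_memory_mib)
                .then(a.free_vcpus.cmp(&b.free_vcpus))
                // Reversed on purpose: the smaller name compares as "greater".
                .then(b.name.cmp(&a.name))
        })
}
```

Filtering before ranking means a node with plenty of memory but too few vCPUs is never chosen, and a request that fits nowhere yields None instead of a bad placement.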
April 2026 — Shipped

Declarative Disk Partitioning

Declarative disk layouts from install through day-2. The live ISO generates disko-config.nix and runs disko --mode format,mount. The kcore.disko NixOS module expresses OS and data disks with optional merged fragments. Nodes default to installer-only ownership and can be moved to controller-managed, which enables declarative DiskLayout resources. The controller stores DiskLayout objects in its replicated DB and reconciles them to the target node, which atomically persists the layout to /etc/kcore/disk/current.nix and chains nixos-rebuild test + switch. The node-agent classifier refuses any layout that would touch a disk currently backing an active kcore volume, LVM PV, or ZFS pool member; the controller never touches VMs.
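The classifier rule described above can be sketched as a pure function over two sets of device paths. This is an illustration only, not the kcore-disko-types implementation: the Refusal, Verdict, and InUse types, the classify and devices functions, and the variant names are all hypothetical and do not correspond to kcore's actual refusal codes.

```rust
use std::collections::BTreeSet;

/// Why a layout was refused. Variant names are illustrative only.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Refusal {
    BacksActiveVolume(String),
    LvmPhysicalVolume(String),
    ZfsPoolMember(String),
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Verdict {
    Safe,
    Dangerous(Vec<Refusal>),
}

/// Snapshot of devices currently in use on a node (hypothetical shape).
#[derive(Debug, Default)]
pub struct InUse {
    pub volume_disks: BTreeSet<String>,
    pub lvm_pvs: BTreeSet<String>,
    pub zpool_members: BTreeSet<String>,
}

/// Convenience constructor for a set of device paths.
pub fn devices(paths: &[&str]) -> BTreeSet<String> {
    paths.iter().map(|p| p.to_string()).collect()
}

/// A layout is safe only if none of its target devices is in use.
/// BTreeSet iteration is ordered, so identical inputs always produce
/// an identical verdict.
pub fn classify(targets: &BTreeSet<String>, in_use: &InUse) -> Verdict {
    let mut refusals = Vec::new();
    for dev in targets {
        if in_use.volume_disks.contains(dev) {
            refusals.push(Refusal::BacksActiveVolume(dev.clone()));
        }
        if in_use.lvm_pvs.contains(dev) {
            refusals.push(Refusal::LvmPhysicalVolume(dev.clone()));
        }
        if in_use.zpool_members.contains(dev) {
            refusals.push(Refusal::ZfsPoolMember(dev.clone()));
        }
    }
    if refusals.is_empty() {
        Verdict::Safe
    } else {
        Verdict::Dangerous(refusals)
    }
}
```

Keeping the check free of I/O is what would let the same logic run as a controller pre-flight and as the node-agent gate, with only the InUse snapshot differing between the two.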

  • Shipped Install-time disko: generates disko-config.nix, runs disko --mode format,mount
  • Shipped modules/kcore-disko.nix with kcore.disko.* (including managementMode, controllerFragments, and persistedLayoutPath)
  • Shipped Ownership split: /etc/kcore/disk-management-mode — default installer-only; controller-managed unlocks declarative apply (legacy disko-management-mode path read as fallback)
  • Shipped Per-node kctl node apply-disk -f … [--apply] [--no-rebuild] → node-agent ApplyDiskLayout: classifier-gated, atomic persist to /etc/kcore/disk/current.nix, automatic nixos-rebuild chain
  • Shipped Declarative fleet rollout: kind: DiskLayout resource, controller CRUD + replication outbox, controller-side reconciler tick that pushes unobserved generations to nodes and writes back observed_generation + phase
  • Shipped Safe / dangerous classifier in kcore-disko-types shared crate (used by both controller pre-flight and node-agent authoritative gate); refusal codes surface on status.refusalReason
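The generation-based reconcile loop mentioned in the list above follows a common pattern, sketched here under stated assumptions. The DiskLayout struct, Phase and ApplyResult enums, and the reconcile_tick function are hypothetical stand-ins, not the controller's real types. In particular, recording a refused generation as observed (so it is not retried until the spec changes) is an assumption of this sketch.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Phase {
    Pending,
    Applied,
    Refused,
}

/// Hypothetical controller-side record for one DiskLayout resource.
#[derive(Debug, Clone)]
pub struct DiskLayout {
    pub name: String,
    pub generation: u64,          // bumped on every spec change
    pub observed_generation: u64, // last generation the node reported on
    pub phase: Phase,
    pub refusal_reason: Option<String>,
}

/// Outcome reported by the node after an apply attempt (hypothetical).
pub enum ApplyResult {
    Applied,
    Refused(String),
}

/// A layout needs work when the node has not yet seen the latest generation.
pub fn needs_reconcile(layout: &DiskLayout) -> bool {
    layout.observed_generation < layout.generation
}

/// One reconciler tick: push every unobserved generation through `push`
/// and write the result back. Returns how many layouts were pushed.
pub fn reconcile_tick<F>(layouts: &mut [DiskLayout], mut push: F) -> usize
where
    F: FnMut(&DiskLayout) -> ApplyResult,
{
    let mut pushed = 0;
    for layout in layouts.iter_mut() {
        if !needs_reconcile(layout) {
            continue;
        }
        let result = push(&*layout);
        match result {
            ApplyResult::Applied => {
                layout.phase = Phase::Applied;
                layout.refusal_reason = None;
            }
            ApplyResult::Refused(reason) => {
                layout.phase = Phase::Refused;
                layout.refusal_reason = Some(reason);
            }
        }
        // Assumption: a refusal also counts as observed.
        layout.observed_generation = layout.generation;
        pushed += 1;
    }
    pushed
}
```

Because the tick only acts when observed_generation lags generation, running it twice in a row with no spec change pushes nothing the second time.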
April 2026 — Shipped

🔗 High Availability & Replication

Multi-controller clusters with CRDT-based state merge and cross-DC replication. Replication now runs with automatic reconciliation and compensation paths instead of manual conflict resolution.

  • Shipped Replication DB tables, ack frontiers, LWW/MV merge, and anti-entropy fan-out
  • Shipped Domain materialization for node/network/vm/security-group/ssh-key event families
  • Shipped Typed, idempotent compensation executor with zero-manual SLO gating
  • Shipped kctl get replication-status [--require-healthy] for hard runtime gates
  • Shipped Manual kctl resolve conflict removed from operator workflow
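For readers unfamiliar with LWW merge, the following is a minimal last-writer-wins register, shown as a sketch rather than as kcore's replication code. The Lww type, its field names, and the use of the writer ID as a tie-breaker are assumptions. It also assumes each (timestamp, writer) pair identifies at most one write. The multi-value (MV) variant, which keeps concurrent writes side by side, is not shown.

```rust
/// A last-writer-wins register: the write with the highest
/// (timestamp, writer) pair wins. Field names are illustrative.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Lww<T> {
    pub value: T,
    pub timestamp: u64,
    pub writer: String,
}

impl<T: Clone> Lww<T> {
    /// Merge two replicas of the same register. Comparing the
    /// (timestamp, writer) tuple makes the result independent of
    /// the order in which replicas are merged.
    pub fn merge(&self, other: &Self) -> Self {
        if (other.timestamp, &other.writer) > (self.timestamp, &self.writer) {
            other.clone()
        } else {
            self.clone()
        }
    }
}
```

The merge is commutative, associative, and idempotent under the stated assumption, which is what allows anti-entropy rounds to deliver updates in any order, or more than once, and still converge.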
April 2026 — Shipped

📐 TLA+ Specifications

Formal models of distributed protocols to catch design bugs before they become code bugs. Models and trace checks are now wired as required CI gates.

  • Shipped Bounded TLA+ specs and required CI integration
  • Shipped ControllerNodeReconcile.tla — reconciliation model
  • Shipped ControllerReplication.tla — single-DC replication
  • Shipped CrossDcReplication.tla — multi-DC model
  • Shipped DiskLayoutReconcile.tla — per-resource state machine for the controller-orchestrated DiskLayout; safety invariant “no Applied transition without classifier approval”
  • Shipped Trace bridge: Rust test traces validated against TLA+ invariants
April 2026 — Shipped

🔬 Property-Based Testing & Bounded Model Checking

The remaining three phases of the formal-methods plan landed alongside TLA+: generative property tests across every Rust crate, Kani bounded proofs on the security-critical sanitisers, and proptest coverage of database CRUD invariants. All four phases (proptest, Kani, DB invariants, TLA+) are now required CI gates.

  • Shipped proptest on Nix generation: escape round-trip, attribute-key sanitisation, structural validity (controller, node-agent, kctl, dashboard)
  • Shipped Kani harnesses for path traversal, NUL injection, segment validation, and Nix string escaping in the dedicated kcore-sanitize leaf crate (zero non-std deps, so proofs compile in seconds)
  • Shipped Kani harnesses on the DiskLayout diff parser in kcore-disko-types: extract_target_devices never panics, comment stripper preserves length, every emitted path is /dev/-prefixed, extraction is deterministic
  • Shipped Per-harness CI matrix — each #[kani::proof] runs on its own GitHub Actions runner with shared toolchain cache, completing the gate in ~1–2 min on warm cache
  • Shipped proptest on database CRUD invariants: VM, node, network, security-group, SSH-key, and DiskLayout round-trips, idempotent upserts, foreign-key integrity, cascading delete, and the reconciler queue (list_disk_layouts_needing_reconcile)
  • Shipped proptest fuzzers for the DiskLayout safe/dangerous classifier: safe verdicts never overlap an active-VM, system-mount, LVM PV, or zpool-member device; classifier is deterministic on identical inputs
  • Shipped 575 workspace tests (incl. proptest fuzzers) + Kani matrix + TLA+ trace bridge run on every pull request
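To give a feel for the kind of property these gates enforce, here is a small std-only sketch of a path-segment validator with a bounded exhaustive check. It is not the kcore-sanitize code and uses neither Kani nor proptest: validate_segment, SegmentError, and check_bounded are hypothetical names, and enumerating every string over a tiny alphabet stands in for what a bounded proof or a fuzzer would explore.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum SegmentError {
    Empty,
    DotSegment,
    ContainsSeparator,
    ContainsNul,
}

/// Accept a single path segment only if it cannot escape its parent
/// directory or smuggle in a NUL byte.
pub fn validate_segment(segment: &str) -> Result<&str, SegmentError> {
    if segment.is_empty() {
        return Err(SegmentError::Empty);
    }
    if segment == "." || segment == ".." {
        return Err(SegmentError::DotSegment);
    }
    if segment.contains('/') {
        return Err(SegmentError::ContainsSeparator);
    }
    if segment.contains('\0') {
        return Err(SegmentError::ContainsNul);
    }
    Ok(segment)
}

/// Enumerate every string up to `max_len` over a small adversarial
/// alphabet and assert the safety property on each accepted one.
/// Returns (strings checked, strings accepted).
pub fn check_bounded(max_len: usize) -> (usize, usize) {
    const ALPHABET: [char; 4] = ['a', '.', '/', '\0'];
    let mut checked = 0;
    let mut accepted = 0;
    let mut frontier = vec![String::new()];
    for _ in 0..=max_len {
        let mut next = Vec::new();
        for s in &frontier {
            checked += 1;
            if let Ok(ok) = validate_segment(s) {
                accepted += 1;
                assert!(!ok.contains('/') && !ok.contains('\0'));
                assert!(ok != "." && ok != "..");
            }
            // Extend each string by one character for the next round.
            for c in ALPHABET.iter() {
                let mut t = s.clone();
                t.push(*c);
                next.push(t);
            }
        }
        frontier = next;
    }
    (checked, accepted)
}
```

The bound plays the same role as an input-length limit in a bounded proof: raising it widens coverage at the cost of exponentially more cases.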
Q3 2026

🔒 Security & Compliance

RBAC, audit logging, SBOM, and compliance APIs for regulated environments.

  • Planned RBAC roles: read-only, vm-admin, cluster-admin
  • Planned Append-only audit log with actor, action, resource, timestamp
  • Planned Automated certificate rotation and CRL/OCSP revocation
  • Planned SBOM generation and signed release artifacts
  • Planned New APIs: ListAuditEvents, GetCryptoConfig, ExportSbom
  • Planned gRPC rate limiting
Q3 2026

📡 Webhooks & Events

Push-based event system for external integrations and alerting.

  • Planned Events: node.heartbeat.missed, cert.expiry.warning, vm.state.changed
  • Planned Configurable webhook endpoints in controller.yaml
Q4 2026

📊 Advanced Scheduler

Topology-aware placement beyond the current most-free-first algorithm.

  • Planned Label-based affinity (--label dc=dc-a)
  • Planned Anti-affinity rules for HA workloads
  • Planned Overcommit ratio and load-aware placement
Q4 2026

🌐 Networking Improvements

Expanding from single-NIC IPv4 to richer network topologies and access control.

  • Planned Multi-NIC / multi-homed VMs
  • Planned IPv6 support
  • Planned East-west firewall between VMs on the same bridge
  • Shipped Security groups with YAML apply flow and nftables rendering on nodes
  • Planned VXLAN IP reclamation and improved peer discovery
2027

🔬 Formal Verification — Next Iteration

Builds on the four shipped phases (proptest, Kani, DB invariants, TLA+) by extending depth and coverage where the current bounds or fuzz domains leave room for higher-confidence guarantees.

  • Planned Raise Kani MAX_INPUT_LEN bound once the per-harness matrix budget allows it
  • Planned Kani harness for validate_path_under_root alongside the already-shipped segment validator
  • Planned TLA+ invariants for the compensation executor and replication reservations
  • Planned Extend proptest coverage to security-group rule rendering and replication outbox state machines
2027

🛠 Operations & Node Lifecycle

Day-2 tooling and stronger safety for node removal and VM continuity.

  • Planned GetClusterHealth API
  • Planned Backup and restore tooling
  • Planned Automatic VM migration on node failure
  • Planned Node cordon and certificate invalidation on delete

Explicitly not planned

  • CSR approval flow (Kubernetes-style certificate signing requests)
  • Admission webhooks
  • Formal API versioning contract

Want to shape the roadmap?

Open an issue on GitHub or reach out directly. We prioritise based on real usage.