What’s shipped, what’s being built right now, and what’s next. Dates reflect real engineering plans, not marketing promises.
End-to-end declarative VM lifecycle on bare metal. Controller, node agent, CLI, networking, storage, mTLS — all working in production.
kctl workload / kctl container
kctl create vm -f
kctl apply -f ... with kind-based manifest dispatch (kubectl-style)
kctl rotate
kctl node install
Certificates under /etc/kcore/certs
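The dispatch works the way kubectl's does: kctl apply -f reads the manifest's kind field and routes to the matching handler. A minimal sketch of that pattern with serde and serde_yaml, assuming hypothetical manifest kinds; the real kcore schema is not shown here.

```rust
use serde::Deserialize;

// Hypothetical manifest shapes; `kind` selects the variant, kubectl-style.
#[derive(Debug, Deserialize)]
#[serde(tag = "kind")]
enum Manifest {
    Vm { name: String, memory_mb: u64 },
    DiskLayout { node: String },
}

fn apply(yaml: &str) -> Result<(), serde_yaml::Error> {
    // Dispatch on the parsed kind: one handler per manifest type.
    match serde_yaml::from_str::<Manifest>(yaml)? {
        Manifest::Vm { name, .. } => println!("reconcile VM {name}"),
        Manifest::DiskLayout { node } => println!("reconcile DiskLayout on {node}"),
    }
    Ok(())
}

fn main() {
    apply("kind: Vm\nname: web-1\nmemory_mb: 2048").unwrap();
}
```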
Declarative disk layouts from install through day-2. The live ISO generates disko-config.nix and runs disko --mode format,mount; the kcore.disko NixOS module expresses OS and data disks with optional merged fragments; nodes default to installer-only ownership and can be moved to controller-managed mode for declarative DiskLayout resources. The controller stores DiskLayout objects in its replicated DB and reconciles them to the target node, which atomically persists the layout to /etc/kcore/disk/current.nix and chains nixos-rebuild test + switch. The node-agent classifier refuses any layout that would touch a disk currently backing an active kcore volume, an LVM PV, or a ZFS pool member, so the controller never touches VMs.
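A minimal sketch of that refusal gate, assuming illustrative type names rather than the real kcore-disko-types API: the layout is refused as soon as any target device appears in the node's in-use set.

```rust
use std::collections::BTreeMap;

/// Why a device counts as in use. Illustrative, not kcore's refusal codes.
#[derive(Debug, Clone, Copy)]
enum InUse {
    ActiveVmVolume,
    LvmPv,
    ZpoolMember,
}

#[derive(Debug)]
enum Verdict {
    Safe,
    Refused { device: String, reason: InUse },
}

/// Refuse any layout whose target devices overlap live state.
fn classify(targets: &[&str], in_use: &BTreeMap<&str, InUse>) -> Verdict {
    for dev in targets {
        if let Some(reason) = in_use.get(dev) {
            return Verdict::Refused { device: dev.to_string(), reason: *reason };
        }
    }
    Verdict::Safe
}

fn main() {
    let mut in_use = BTreeMap::new();
    in_use.insert("/dev/sdb", InUse::LvmPv);
    println!("{:?}", classify(&["/dev/sdc"], &in_use)); // Safe
    println!("{:?}", classify(&["/dev/sdb"], &in_use)); // Refused
}
```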
Live ISO generates disko-config.nix, runs disko --mode format,mount
modules/kcore-disko.nix with kcore.disko.* (including managementMode, controllerFragments, and persistedLayoutPath)
/etc/kcore/disk-management-mode — default installer-only; controller-managed unlocks declarative apply (legacy disko-management-mode path read as fallback)
kctl node apply-disk -f … [--apply] [--no-rebuild] → node-agent ApplyDiskLayout: classifier-gated, atomic persist to /etc/kcore/disk/current.nix, automatic nixos-rebuild chain (sketched after this list)
kind: DiskLayout resource, controller CRUD + replication outbox, controller-side reconciler tick that pushes unobserved generations to nodes and writes back observed_generation + phase
kcore-disko-types shared crate (used by both controller pre-flight and node-agent authoritative gate); refusal codes surface on status.refusalReason
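The apply path called out in the list above pairs an atomic persist with the rebuild chain. A hedged sketch of how a node agent could do both, assuming a temp-file-plus-rename persist; this is not the actual ApplyDiskLayout handler.

```rust
use std::{fs, io::Write, path::Path, process::Command};

// Persist so /etc/kcore/disk/current.nix is never observed half-written:
// write a sibling temp file, fsync it, then rename over the target.
fn atomic_persist(target: &Path, layout_nix: &str) -> std::io::Result<()> {
    let tmp = target.with_extension("nix.tmp");
    let mut f = fs::File::create(&tmp)?;
    f.write_all(layout_nix.as_bytes())?;
    f.sync_all()?;
    fs::rename(&tmp, target)
}

// `nixos-rebuild test` activates without creating a boot entry; only if
// that succeeds is the generation promoted with `switch`.
fn rebuild_chain() -> std::io::Result<bool> {
    for mode in ["test", "switch"] {
        if !Command::new("nixos-rebuild").arg(mode).status()?.success() {
            return Ok(false);
        }
    }
    Ok(true)
}

fn main() -> std::io::Result<()> {
    atomic_persist(Path::new("/etc/kcore/disk/current.nix"), "{ }")?;
    println!("rebuild ok: {}", rebuild_chain()?);
    Ok(())
}
```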
Multi-controller clusters with CRDT-based state merge and cross-DC replication. Replication now runs with automatic reconciliation and compensation paths instead of manual conflict resolution (a toy merge sketch follows the list below).
kctl get replication-status [--require-healthy] for hard runtime gates
kctl resolve conflict removed from operator workflow
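A toy last-writer-wins register shows why merge-based replication needs no manual conflict step: merging is commutative, so controllers converge in whatever order updates arrive. kcore's real CRDTs and compensation paths are richer than this, and the (timestamp, controller id) stamp is an assumption.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Lww<T> {
    value: T,
    // (lamport timestamp, controller id): the id breaks timestamp ties,
    // giving a total order so both sides pick the same winner.
    stamp: (u64, u64),
}

impl<T: Clone> Lww<T> {
    fn merge(&self, other: &Self) -> Self {
        if other.stamp > self.stamp { other.clone() } else { self.clone() }
    }
}

fn main() {
    let a = Lww { value: "running", stamp: (7, 1) };
    let b = Lww { value: "stopped", stamp: (9, 2) };
    // Commutative: same result regardless of merge order.
    assert_eq!(a.merge(&b), b.merge(&a));
    println!("{:?}", a.merge(&b).value);
}
```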
Formal models of distributed protocols to catch design bugs before they become code bugs. Models and trace checks are now wired as required CI gates.
ControllerNodeReconcile.tla — reconciliation model
ControllerReplication.tla — single-DC replication
CrossDcReplication.tla — multi-DC model
DiskLayoutReconcile.tla — per-resource state machine for the controller-orchestrated DiskLayout; safety invariant “no Applied transition without classifier approval”
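That last invariant can also be read in type-state form: make classifier approval a value the apply step requires, so Applied is unreachable without it. A speculative Rust rendering of the TLA+ rule, with illustrative names; the authoritative model lives in DiskLayoutReconcile.tla.

```rust
// The only constructor for this token is the classifier itself, so any
// path that reaches Phase::Applied necessarily passed classification.
struct ClassifierApproval(());

#[derive(Debug)]
enum Phase {
    Pending,
    Applied,
}

fn classify(touches_in_use_disk: bool) -> Option<ClassifierApproval> {
    // Stand-in for the real device classifier.
    (!touches_in_use_disk).then(|| ClassifierApproval(()))
}

fn apply(_approval: ClassifierApproval) -> Phase {
    // No approval token, no call: the invariant holds by construction.
    Phase::Applied
}

fn main() {
    let phase = match classify(false) {
        Some(approval) => apply(approval),
        None => Phase::Pending,
    };
    println!("{phase:?}");
}
```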
The remaining three phases of the formal-methods plan landed alongside TLA+: generative property tests across every Rust crate, Kani bounded proofs on the security-critical sanitisers, and proptest coverage of database CRUD invariants. All four phases (proptest, Kani, DB invariants, TLA+) are now required CI gates.
kcore-sanitize leaf crate (zero non-std deps, so proofs compile in seconds)
DiskLayout diff parser in kcore-disko-types: extract_target_devices never panics, comment stripper preserves length, every emitted path is /dev/-prefixed, extraction is deterministic (harness style sketched after this list)
#[kani::proof] runs on its own GitHub Actions runner with shared toolchain cache, completing the gate in ~1–2 min on warm cache
DiskLayout round-trips, idempotent upserts, foreign-key integrity, cascading delete, and the reconciler queue (list_disk_layouts_needing_reconcile)
DiskLayout safe/dangerous classifier: safe verdicts never overlap an active-VM, system-mount, LVM PV, or zpool-member device; classifier is deterministic on identical inputs
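To give a flavour of the harness style referenced above, a hedged sketch: a toy comment stripper (not kcore's) with a bounded Kani proof that stripping preserves byte length on every 8-byte UTF-8 input. Function names, input bound, and unwind limit are all illustrative.

```rust
/// Blank out `#`-to-newline comments byte-for-byte so offsets and total
/// length survive. Only whole characters are overwritten, so the result
/// is always valid UTF-8.
fn strip_comments(input: &str) -> String {
    let mut out: Vec<u8> = input.bytes().collect();
    let mut in_comment = false;
    for b in &mut out {
        if *b == b'#' { in_comment = true; }
        if *b == b'\n' { in_comment = false; }
        if in_comment { *b = b' '; }
    }
    String::from_utf8(out).unwrap()
}

#[cfg(kani)]
#[kani::proof]
#[kani::unwind(16)]
fn comment_stripper_preserves_length() {
    // Exhaustively covers all 8-byte inputs that are valid UTF-8.
    let bytes: [u8; 8] = kani::any();
    if let Ok(s) = core::str::from_utf8(&bytes) {
        assert_eq!(strip_comments(s).len(), s.len());
    }
}
```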
RBAC, audit logging, SBOM, and compliance APIs for regulated environments.
ListAuditEvents, GetCryptoConfig, ExportSbom
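One way to picture that surface is as a trait over the three calls. This is purely speculative: the real transport and message schemas are not documented here, and every field below is an assumption.

```rust
// Hypothetical message shapes; kcore's actual wire types may differ.
pub struct AuditEvent {
    pub at_epoch: u64,
    pub actor: String,
    pub action: String,
}

pub struct CryptoConfig {
    pub tls_min_version: String,
    pub cert_rotation_days: u32,
}

pub trait ComplianceApi {
    fn list_audit_events(&self, since_epoch: u64) -> Vec<AuditEvent>;
    fn get_crypto_config(&self) -> CryptoConfig;
    /// Serialized SBOM document (e.g. SPDX or CycloneDX bytes).
    fn export_sbom(&self) -> Vec<u8>;
}
```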
Push-based event system for external integrations and alerting.
node.heartbeat.missed, cert.expiry.warning, vm.state.changed
Configured via controller.yaml
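A hedged serde model of those events: the three kinds come from the list above, while the payload fields and JSON shape are assumptions for illustration (requires serde and serde_json).

```rust
use serde::Serialize;

#[derive(Serialize)]
#[serde(tag = "event")]
enum Event {
    #[serde(rename = "node.heartbeat.missed")]
    NodeHeartbeatMissed { node: String, last_seen_epoch: u64 },
    #[serde(rename = "cert.expiry.warning")]
    CertExpiryWarning { subject: String, days_left: u32 },
    #[serde(rename = "vm.state.changed")]
    VmStateChanged { vm: String, from: String, to: String },
}

fn main() {
    let e = Event::VmStateChanged {
        vm: "web-1".into(),
        from: "running".into(),
        to: "stopped".into(),
    };
    // A push sink would POST this JSON to the configured endpoints.
    println!("{}", serde_json::to_string(&e).unwrap());
}
```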
Topology-aware placement beyond the current most-free-first algorithm.
Label-based placement constraints (--label dc=dc-a)
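A sketch of how a label constraint could layer on top of most-free-first: filter candidates by selector, then keep the existing tie-break. Only the --label dc=dc-a form comes from the plan; the Node shape and selector format are illustrative.

```rust
#[derive(Debug)]
struct Node {
    name: &'static str,
    free_mb: u64,
    labels: Vec<(&'static str, &'static str)>,
}

/// Most-free-first among nodes matching every selector label.
fn place<'a>(nodes: &'a [Node], selector: &[(&str, &str)]) -> Option<&'a Node> {
    nodes
        .iter()
        .filter(|n| selector.iter().all(|s| n.labels.iter().any(|l| l == s)))
        .max_by_key(|n| n.free_mb)
}

fn main() {
    let nodes = [
        Node { name: "n1", free_mb: 8_192, labels: vec![("dc", "dc-a")] },
        Node { name: "n2", free_mb: 16_384, labels: vec![("dc", "dc-b")] },
    ];
    // Equivalent to the planned `--label dc=dc-a` constraint: n1 wins
    // even though n2 has more free memory.
    println!("{:?}", place(&nodes, &[("dc", "dc-a")]).map(|n| n.name));
}
```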
Expanding from single-NIC IPv4 to richer network topologies and access control.
Building on the four shipped phases (proptest, Kani, DB invariants, TLA+), extending depth and coverage in places where the current bounds or fuzz domains leave room for higher-confidence guarantees.
Raising the MAX_INPUT_LEN bound once the per-harness matrix budget allows it
Proving validate_path_under_root alongside the already-shipped segment validator (speculative harness shape below)
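Since validate_path_under_root is still on the to-prove list, the following is only a speculative shape for such a harness, with a stand-in validator; the real kcore-sanitize function and its property may differ.

```rust
/// Stand-in validator (an assumption, not kcore's): reject any `..`
/// segment, then require the root prefix.
fn validate_path_under_root(root: &str, candidate: &str) -> bool {
    candidate.split('/').all(|seg| seg != "..") && candidate.starts_with(root)
}

#[cfg(kani)]
#[kani::proof]
#[kani::unwind(16)]
fn validate_path_under_root_never_panics() {
    // Totality on arbitrary short UTF-8 input: no panic, no overflow.
    let bytes: [u8; 8] = kani::any();
    if let Ok(p) = core::str::from_utf8(&bytes) {
        let _ = validate_path_under_root("/etc/", p);
    }
}
```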
Day-2 tooling and stronger safety for node removal and VM continuity.
GetClusterHealth API
Open an issue on GitHub or reach out directly. We prioritise based on real usage.