RBC Capital Markets has built a cloud-native platform that manages over 50 Kubernetes clusters across on-premises VMware and multiple clouds using three open-source projects: Kairos, k0rdent, and bindy. The platform is designed to meet the compliance requirements of a regulated financial institution while eliminating manual operations at the node, cluster, and DNS layers.
The problem: Three gaps in a growing platform
RBC Capital Markets had already adopted FluxCD for GitOps-based deployment. But as the fleet grew past 50 clusters, three operational gaps became critical:
- Node configuration drift: VM-based nodes that were patched and mutated in place over time became impossible to reason about.
- Cluster provisioning: Spinning up new clusters for trading desks or risk teams was a multi-day manual exercise with no single source of truth.
- DNS integration: Every new service or ingress endpoint required a manual ticket to the network team, creating a bottleneck and an audit trail outside the GitOps workflow.
The team decided to solve each gap from the ground up, using cloud-native projects where they existed and building their own where they did not.
Kairos: Immutable OS for reproducible nodes
Kairos, a CNCF Sandbox project, provides a Linux distribution designed to be immutable, declaratively configured, and reproducible. Every node in the fleet boots from an OCI image built from a RHEL-derived base, baked with approved security configuration, and published to an internal registry.
The cloud-config model defines node behavior — SSH keys, network configuration, SSSD authentication against Active Directory, Kubernetes agent registration — as versioned YAML that flows through FluxCD like any other platform component.
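To make this concrete, the fragment below sketches what such a Kairos cloud-config might look like. The stage names follow Kairos's yip-based schema, but the user, key reference, and commands are illustrative placeholders, not RBC's actual configuration:

```yaml
#cloud-config
# Illustrative sketch only — values are hypothetical.
users:
  - name: kairos
    ssh_authorized_keys:
      - "ssh-ed25519 AAAA... ops@example"   # hypothetical key
stages:
  boot:
    - name: "Validate SSSD configuration for AD authentication"
      commands:
        - sssctl config-check   # illustrative check; real AD join config omitted
```

Because this file is plain YAML in Git, a change to node behavior is reviewed and reconciled exactly like an application deployment.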
A CI/CD pipeline treats Kairos images exactly like application container images: every change triggers a GitHub Actions pipeline that builds the image, runs integration tests against a live VM, and publishes a new OCI tag only on a clean pass. Nightly builds catch upstream regressions in base packages or the Kairos framework itself before they reach production.
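A workflow implementing that pipeline might look roughly like the following. The registry name and test script path are hypothetical; the structure (build on push, nightly schedule, publish only after tests pass) mirrors what the article describes:

```yaml
# Illustrative GitHub Actions sketch — registry and script names are assumptions.
name: kairos-image
on:
  push:
    paths: ["images/**"]
  schedule:
    - cron: "0 2 * * *"   # nightly build to catch upstream regressions
jobs:
  build-test-publish:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build candidate OCI image
        run: docker build -t registry.internal/kairos-node:${{ github.sha }} images/
      - name: Integration-test against a live VM
        run: ./hack/run-vm-tests.sh registry.internal/kairos-node:${{ github.sha }}   # hypothetical script
      - name: Publish only on a clean pass
        run: docker push registry.internal/kairos-node:${{ github.sha }}
```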
For VM provisioning, the team uses VirtRigaud, a Kubernetes operator that provides declarative VM management across multiple hypervisors (vSphere, Libvirt/KVM, and Proxmox) through a unified CRD API. Kairos-built OCI images are registered as VMImage CRDs, and VMs are expressed as VirtualMachine CRDs referencing that image. FluxCD reconciles these manifests like any other platform resource. Provisioning a new Kairos node on vSphere is a pull request, reviewed, merged, and reconciled automatically.
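A pull request provisioning a node might contain manifests shaped like the sketch below. The API group, versions, and field names here are assumptions made for illustration; consult VirtRigaud's CRD definitions for the actual schema:

```yaml
# Illustrative only — apiVersion and field names are assumptions.
apiVersion: virtrigaud.io/v1alpha1
kind: VMImage
metadata:
  name: kairos-node-v1
spec:
  source:
    oci: registry.internal/kairos-node:v1.2.3   # hypothetical internal registry tag
---
apiVersion: virtrigaud.io/v1alpha1
kind: VirtualMachine
metadata:
  name: trading-node-01
spec:
  imageRef:
    name: kairos-node-v1      # references the VMImage above
  providerRef:
    name: vsphere-prod        # hypothetical vSphere provider
```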
k0rdent: Cluster lifecycle management as code
k0rdent, built on Cluster API (CAPI), provides a Kubernetes-native control plane for managing Kubernetes clusters. Combined with k0smotron for in-cluster control planes, the entire cluster topology is expressed declaratively, and FluxCD reconciles that state continuously.
The team chose k0s, a CNCF Sandbox project, as the Kubernetes distribution for workload clusters. k0s is a fully self-contained, single-binary distribution with no host OS dependencies beyond the kernel. That property matters when nodes run an immutable OS: k0s installs cleanly into a Kairos image without requiring package managers or systemd unit file manipulation at runtime.
The architecture uses a hub-and-spoke model:
- A management cluster runs k0rdent, k0smotron, and the CAPI controllers.
- Workload clusters run k0s, provisioned and decommissioned through CRD manifests stored in Git.
- MetalLB handles load-balancing on bare-metal segments; Traefik provides ingress with consistent configuration across all spoke clusters.
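In this model, a new workload cluster is a single manifest in Git. The sketch below follows the shape of k0rdent's ClusterDeployment API, but the template name and sizing values are hypothetical:

```yaml
# Illustrative sketch — template name and config values are hypothetical.
apiVersion: k0rdent.mirantis.com/v1alpha1
kind: ClusterDeployment
metadata:
  name: risk-compute-01
  namespace: kcm-system
spec:
  template: vsphere-standalone-cp-1-0-0   # hypothetical cluster template
  credential: vsphere-prod-credential     # hypothetical credential reference
  config:
    controlPlaneNumber: 3
    workersNumber: 5
```

Decommissioning the cluster is the reverse pull request: delete the manifest, and FluxCD reconciles the fleet back to the declared state.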
Day-two operations are transformed: cluster upgrades are a pull request, cluster templates standardize configurations for common use cases (trading desk, risk compute, tooling), and compliance posture is consistent by default because every cluster is expressed as code.
bindy: Kubernetes-native DNS operations
DNS was the gap where no existing project fully covered the requirements. At RBC Capital Markets, DNS infrastructure runs on Infoblox, an enterprise DDI platform. Previously, every DNS record request went through a ticketing workflow routed to the network team, processed on a timescale of hours or days.
bindy, built by Erick Bourgeois, is a Kubernetes operator written in Rust using kube-rs that manages DNS zones and records as first-class Kubernetes resources. The core design philosophy: make DNS a GitOps citizen with the same reconciliation guarantees applied to everything else on the platform.
Key design elements:
- Zones and records are CRDs. A DNSZone or ARecord manifest in Git is the source of truth, reconciled continuously by bindy's controllers.
- RFC 2136 dynamic updates allow bindy to push record changes to the DNS backend without manual intervention or ticket queues.
- bindcar, a sidecar REST API, provides an RNDC interface for zone lifecycle operations (creation, deletion, reload) alongside dynamic updates.
- A multi-controller architecture with strict write boundaries prevents split-brain scenarios.
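Putting the CRD model together, a zone and record pair might look like the sketch below. bindy's actual API group and schema may differ; the names here are assumptions for illustration:

```yaml
# Illustrative only — bindy's real API group and fields may differ.
apiVersion: bindy.io/v1alpha1
kind: DNSZone
metadata:
  name: apps-internal
spec:
  zoneName: apps.internal.example.com.   # hypothetical zone
---
apiVersion: bindy.io/v1alpha1
kind: ARecord
metadata:
  name: trading-api
spec:
  zoneRef:
    name: apps-internal   # references the DNSZone above
  hostname: trading-api
  ipv4Address: 10.20.30.40
  ttl: 300
```

Merging such a manifest alongside a service's Deployment and Ingress makes the DNS record part of the same reviewed, reconciled change.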
The impact: DNS records for new services are created automatically as part of the same GitOps workflow that deploys the service itself. Provisioning time drops from hours to seconds, and the audit trail is Git history, not a ticket system.
How the three fit together
The stack is coherent because each layer builds on the same foundational principle: everything is code, reconciled continuously, with no manual state.
- Git is the source of truth.
- FluxCD is the reconciliation engine.
- Kairos ensures every node boots from a known, auditable image.
- k0rdent ensures every cluster is expressed and managed declaratively.
- bindy ensures every DNS record is a versioned artifact.
Drift — at the node, cluster, or network level — is structurally prevented rather than operationally managed.
Challenges and lessons learned
- Immutable OS adoption requires patience with enterprise integration. SSSD, NetworkManager, and corporate CA trust chains all need explicit attention when baking immutable images.
- CRD-based cluster management shifts responsibility left. When cluster provisioning is a pull request, platform teams need to invest in review processes and template governance up front.
- Building operators in Rust is the right long-term call, but the ecosystem is still maturing. kube-rs is excellent, but patterns for multi-controller architectures with reflector/store caching require deliberate design decisions.
Looking ahead
The platform continues to evolve. Active development areas include SPIRE/SPIFFE integration for workload identity across all 50+ clusters, an internal self-service API layer called Foundry built in Rust, and Kairos-based spot computing using k0smotron and Kata Containers to absorb donated physical server capacity dynamically.
Bottom line
RBC Capital Markets has demonstrated that a regulated financial institution can build a fully GitOps-native Kubernetes platform using open-source projects. The combination of Kairos for immutable nodes, k0rdent for declarative cluster lifecycle, and bindy for DNS-as-code eliminates manual operations and provides the audit trail that compliance requires. The key insight: treat every layer — OS, cluster, DNS — as code reconciled through a single GitOps pipeline.