Cluster Topology

The component inventory per plane: what lives in the control-plane region, what runs in each customer cluster, and what stays external. The same picture as a theme-aware diagram is in System Topology.

4.1 Control-Plane Region & Regional Telemetry (control plane: one region at MVP)

At MVP the control plane runs in a single region. Shuttles in every region pull desired state from it, and Starbase reads each region's telemetry stores to render dashboards (§35.4). Telemetry stores run per region on DO VM droplets, not in Kubernetes: a single-node stateful store is simpler and cheaper on a dedicated droplet than as an in-cluster StatefulSet, and this matches DO Managed Postgres/Valkey, which already live off-cluster.

Single-region control plane — a SPOF by decision

The control plane is centralized at MVP. Running workloads survive a control-plane outage thanks to Shuttle's level-driven loop, but new deploys and dashboards stop until the control plane is back. Distributing the control plane would need a distributed DB (Fly's path) and is out of near-term scope; the groundwork is seeded in §39.3.
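One way to picture why a level-driven loop tolerates a control-plane outage is the sketch below. It is illustrative only and does not describe Shuttle's actual code: `reconcile_once`, `ControlPlaneUnavailable`, and the idea that the agent remembers the last desired state it pulled are all assumptions made for the example.

```python
from __future__ import annotations

from typing import Callable, Optional


class ControlPlaneUnavailable(Exception):
    """Raised by the (hypothetical) fetcher when the control plane is unreachable."""


def reconcile_once(
    fetch_desired: Callable[[], dict],
    last_known: Optional[dict],
    actual: dict,
) -> tuple[Optional[dict], list[str]]:
    """Run one pass of a level-driven loop.

    Returns the desired state to remember for the next pass and the
    actions needed to converge `actual` onto it.
    """
    try:
        desired = fetch_desired()
    except ControlPlaneUnavailable:
        # Outage: keep converging on the last state that was pulled successfully.
        desired = last_known
    if desired is None:
        # Nothing has ever been pulled, so there is nothing to enforce.
        return None, []

    actions = []
    for name, spec in sorted(desired.items()):
        if actual.get(name) != spec:
            actions.append(f"apply {name}")
    for name in sorted(actual.keys() - desired.keys()):
        actions.append(f"delete {name}")
    return desired, actions
```

Because each pass compares full desired state with full actual state, a missed update during an outage is not lost: the next successful pull simply produces a new diff. New deploys still stall during the outage, since no newer desired state can be fetched.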

Control-plane region — infra cluster + managed services:

| Component | Purpose |
| --- | --- |
| Starbase API | HTTP server for dashboard, Shuttle, webhooks; telemetry-query broker (injects the server-side tenant filter, §35.2 / FR-065; routes each query to the project's region) |
| Starbase Worker | Background jobs (builds, provisioning, billing) |
| Stardeck (Next.js) | Customer dashboard |
| DO Managed Postgres | Starbase database |
| DO Container Registry | Customer app images |
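The telemetry-query broker role of the Starbase API (inject the tenant filter server-side, route to the project's region) could look roughly like the sketch below. It assumes the filter is enforced with VictoriaMetrics' `extra_label` query argument; the region names, hostnames, the `tenant_id` label, and the `build_metrics_query_url` helper are hypothetical and not taken from §35.2.

```python
from urllib.parse import urlencode

# Hypothetical per-region VictoriaMetrics base URLs (placeholders, not from the source).
REGION_ENDPOINTS = {
    "nyc3": "http://vm.nyc3.internal:8428",
    "fra1": "http://vm.fra1.internal:8428",
}
TENANT_LABEL = "tenant_id"                # assumed label name
ALLOWED_CLIENT_PARAMS = {"time", "step"}  # allow-list; every other client param is dropped


def build_metrics_query_url(project: dict, promql: str, client_params: dict) -> str:
    """Build the upstream query URL for one project.

    The tenant filter is always set by the server, so a client-supplied
    `extra_label` can never widen or replace the tenant scope.
    """
    base = REGION_ENDPOINTS[project["region"]]  # route to the project's region
    params = {k: v for k, v in client_params.items() if k in ALLOWED_CLIENT_PARAMS}
    params["query"] = promql
    params["extra_label"] = f"{TENANT_LABEL}={project['tenant_id']}"
    return f"{base}/api/v1/query?{urlencode(sorted(params.items()))}"
```

An allow-list is used instead of a deny-list so that any query argument added upstream later is rejected by default rather than passed through.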

Regional telemetry tier — per region, DO VM droplets (off-cluster):

| Component | Purpose |
| --- | --- |
| ClickHouse (VM droplet) | Customer logs + billing audit trail; written only by the regional Vector aggregator (§35.1) |
| VictoriaMetrics (VM droplet) | Customer metrics; read by Starbase via the server-side tenant filter (§35.2 / FR-065) |
| Vector aggregator | Regional log fan-in (Fluent Bit forward → ClickHouse); stateless, runs in the region's infra footprint |

Each store runs on its own droplet because the resource profiles differ (ClickHouse is disk- and memory-heavy). The shared regional VPC keeps intra-region ingest at $0: it is same-VPC traffic with no peering (§4.4), and peering carries only cross-region reads (§35.4). Platform self-monitoring is offloaded to Grafana Cloud (§35.5), so there is no platform-internal VictoriaMetrics or Alertmanager at MVP. A vmalert still runs in the telemetry tier, but only to evaluate the customer recording rules against the customer VictoriaMetrics (§35.2), not for platform alerting.
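The transfer-cost consequence of this layout can be worked through with a few lines of arithmetic. The sketch assumes the cross-region figure quoted in the cross-references means $0.01 per GiB over peering; the traffic volumes in the usage note are invented for illustration.

```python
# Rates as quoted in this section's cross-references (assumed to be per GiB).
INTRA_REGION_USD_PER_GIB = 0.0    # same-VPC ingest, no peering
CROSS_REGION_USD_PER_GIB = 0.01   # reads carried over VPC peering


def monthly_transfer_cost(ingest_gib: float, cross_region_read_gib: float) -> float:
    """Estimate monthly telemetry transfer cost in USD.

    Ingest stays inside the regional VPC; only dashboard reads from
    another region cross the peering link.
    """
    cost = (
        ingest_gib * INTRA_REGION_USD_PER_GIB
        + cross_region_read_gib * CROSS_REGION_USD_PER_GIB
    )
    return round(cost, 2)
```

For example, `monthly_transfer_cost(5000, 200)` prices 5,000 GiB of in-region ingest plus 200 GiB of cross-region reads: the ingest term contributes nothing, so the total depends only on read volume.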

4.2 Customer Cluster (one or more per region, lean)

| Component | Type | Purpose |
| --- | --- | --- |
| Shuttle | Deployment (1 replica) | Agent: applies desired state, reports snapshots/capacity |
| Envoy Gateway | Deployment (2+ replicas) | Traffic routing via HTTPRoute, future JWT/JWKS/ext_authz; also the source of L7 metrics (latency, RPS, throughput, error rate) |
| Fluent Bit | DaemonSet | Ships customer logs (forward protocol) → regional Vector aggregator (§35.1). Light agent (~64Mi/node); the heavy batching/CH-sink work lives at the aggregator |
| vmagent | Deployment | Scrapes Envoy stats, kubelet/cAdvisor, and kube-state-metrics; remote_writes customer metrics → VictoriaMetrics |
| metrics-server | Deployment | Resource metrics (CPU/mem) via the K8s Metrics API for HPA and kubectl top. Not bundled on DOKS; must be installed at bootstrap (§26.3) |
| kube-state-metrics | Deployment | Cluster object state; source of pod/namespace label metadata for metric attribution |
| Grafana Alloy | Deployment (k8s-monitoring Helm chart) | Platform self-monitoring agent: scrapes Starform-component series (Shuttle, Envoy gateway-self, node-exporter, KSM platform objects) and remote_writes them to Grafana Cloud (§35.5). A separate agent from vmagent, so the platform-monitoring path is independent of the customer pipeline |

Seven components. No cert-manager, no Cilium Gateway, no Istio, no service mesh.
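Since metrics-server has to be installed at bootstrap, a bootstrap step might verify the inventory before marking a cluster ready. The check below is a minimal sketch under that assumption; the workload names in `EXPECTED` and the `missing_components` helper are hypothetical, and the input is assumed to be a name-to-kind mapping gathered elsewhere.

```python
from __future__ import annotations

# The seven customer-cluster components and their workload kinds.
# Workload names are illustrative, not the real resource names.
EXPECTED = {
    "shuttle": "Deployment",
    "envoy-gateway": "Deployment",
    "fluent-bit": "DaemonSet",
    "vmagent": "Deployment",
    "metrics-server": "Deployment",
    "kube-state-metrics": "Deployment",
    "grafana-alloy": "Deployment",
}


def missing_components(installed: dict[str, str]) -> list[str]:
    """Return expected components that are absent or have the wrong kind."""
    return sorted(
        name for name, kind in EXPECTED.items() if installed.get(name) != kind
    )
```

On a cluster where everything except metrics-server is present, the function returns `["metrics-server"]`, which is the gap the bootstrap step in the table has to close.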

4.3 External Services (not in any cluster)

| Service | Purpose |
| --- | --- |
| Cloudflare | CDN, DNS, WAF, DDoS protection, TLS termination (edge + origin cert) |
| Stripe | Payments, subscriptions, invoicing |
| Depot SaaS | Build execution (remote BuildKit; Railpack frontend + Dockerfile fallback) |
| Tigris | Object storage (Partner Integration API; isolated tenant per workspace; zero egress) |
| GitHub / GitLab / Bitbucket | Source code, webhooks |
| Grafana Cloud | Platform self-monitoring: hosted metrics + Grafana Alerting + OnCall for Starform's own components (§35.5). MVP-only; revisit post-MVP |

VPC & IP topology → §4.4 (owned by Networking).

Cross-references

The topology diagram → System Topology · per-region VPC / IP plan → §4.4 · the telemetry transport (peering, $0 intra-region / $0.01 per GiB cross-region) → §35.4 · platform self-monitoring → §35.5 · how it scales → Scaling Model. Canonical map: Canonical Sources.