Tenant isolation & retention

Part of the self-contained SRE guide

This page restates the server-side query-isolation rule (FR-065) and the retention contract (FR-051) inline so the guide stands alone. Owned by PRD §35 / FR-065 / FR-051 — the PRD wins on conflict. The Reference collects every inlined contract with its provenance.

Tenant isolation

In plain words

A customer must see only their own data. When someone opens a dashboard, the backend takes their logged-in identity and bolts a filter onto every query — and the user can never supply or change it. That single step is the isolation model.

How to build it

The API derives the tenant key from the authenticated session and injects {project_id="X", environment="..."} into PromQL and WHERE project_id='X' into ClickHouse queries — server-side, never from client input. Recording rules pre-compute per-service series so these queries stay simple. Cover it with a CI isolation test that asserts a query never returns another tenant's series.
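
As a rough illustration of the server-side injection described above, the following Python sketch derives the tenant from an already-authenticated session and builds both filters. The `Tenant` class, the function names, the `message` column, and the choice of ClickHouse `{name:Type}` query parameters are assumptions made for illustration; none of them are specified by FR-065.

```python
import re
from dataclasses import dataclass

# Accept only bare Prometheus-style metric names, so callers cannot pass a full selector.
METRIC_NAME = re.compile(r"[a-zA-Z_:][a-zA-Z0-9_:]*")


@dataclass(frozen=True)
class Tenant:
    project_id: str
    environment: str


def tenant_from_session(session: dict) -> Tenant:
    """Read the tenant key from the authenticated session only, never from request params."""
    return Tenant(session["project_id"], session["environment"])


def _escape(value: str) -> str:
    """Escape a value for use inside a double-quoted PromQL label matcher."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def scoped_selector(metric: str, tenant: Tenant) -> str:
    """Return a PromQL selector for `metric` pinned to the tenant's labels."""
    if not METRIC_NAME.fullmatch(metric):
        raise ValueError(f"not a bare metric name: {metric!r}")
    return (
        f'{metric}{{project_id="{_escape(tenant.project_id)}",'
        f'environment="{_escape(tenant.environment)}"}}'
    )


def scoped_log_query(tenant: Tenant, limit: int = 100) -> tuple[str, dict]:
    """Return ClickHouse SQL plus bound parameters; the tenant value is never concatenated in."""
    sql = (
        "SELECT ts, message FROM logs.customer_logs "
        "WHERE project_id = {project_id:String} "
        "ORDER BY ts DESC LIMIT {limit:UInt32}"
    )
    return sql, {"project_id": tenant.project_id, "limit": limit}
```

This sketch only scopes bare metric names. Scoping every selector inside an arbitrary PromQL expression would need a real parser or a filter enforced by the store, which is not attempted here.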

Gotchas & what lives elsewhere

A single VictoriaMetrics instance is fine for the MVP. Splitting stores does not fix cross-tenant leakage; that is purely a query-filter concern. If the client could pass its own filter, isolation would be gone, so keep the filter construction unreachable from request parameters.
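
One possible shape for the isolation test, sketched in Python under heavy simplification: the `Series` type, the in-memory `STORE`, and `handle_query` are hypothetical stand-ins for the real API and metrics store. The point it illustrates is that a `project_id` smuggled into the request parameters has no effect on the filter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Series:
    project_id: str
    name: str
    value: float


# Hypothetical in-memory stand-in for the metrics store, holding two tenants' data.
STORE = [
    Series("tenant-a", "http_requests_total", 12.0),
    Series("tenant-b", "http_requests_total", 99.0),
]


def handle_query(session: dict, request_params: dict) -> list[Series]:
    """Serve a dashboard query; the tenant filter comes from the session alone."""
    project_id = session["project_id"]  # request_params is never consulted for the tenant
    name = request_params.get("metric", "")
    return [s for s in STORE if s.project_id == project_id and s.name == name]


def test_query_never_returns_another_tenants_series() -> None:
    session = {"project_id": "tenant-a"}
    # A hostile client tries to supply its own tenant filter.
    hostile = {"metric": "http_requests_total", "project_id": "tenant-b"}
    result = handle_query(session, hostile)
    assert result, "expected the tenant's own series"
    assert all(s.project_id == "tenant-a" for s in result)
```

A real CI test would run the same assertion against the actual API with seeded data for at least two tenants.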

PRD reference & inlined contracts

Owned by FR-065 (server-side filter) and SC-016 (isolation test). The server-side filter rule is restated above so this guide stands alone; if it ever diverges, the PRD wins.

Retention

In plain words

Plans promise different history lengths (7 / 30 / 90 days). The two stores enforce that differently, and metrics are the awkward one.

| Signal | MVP approach | Why |
| --- | --- | --- |
| Logs | Per-tier TTL in ClickHouse by partition | ClickHouse does per-partition TTL cleanly, so retention can vary per tenant. |
| Metrics | Global 90 days for everyone | Single-node VictoriaMetrics retention is a per-instance setting, not per-tenant. Storage is trivial at launch volume. Per-tier metric retention needs downsampling or multiple instances (deferred). |

How to build it

Drive log retention from a per-row retention_days column (so it varies per tenant/tier); set metric retention globally on VictoriaMetrics (per-tier is deferred):

ClickHouse SQL · retention
-- logs: a retention_days column drives the row TTL, so retention varies per tenant/tier
ALTER TABLE logs.customer_logs ADD COLUMN retention_days UInt16 DEFAULT 30;
ALTER TABLE logs.customer_logs MODIFY TTL ts + toIntervalDay(retention_days);

-- metrics: single-node VictoriaMetrics is global-only at MVP (per-tier deferred)
-- set the flag on the single-node binary (vmstorage is the cluster component):
-- victoria-metrics  -retentionPeriod=90d
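
To show how the `retention_days` value might be chosen at ingest, here is a minimal Python sketch. The tier names are hypothetical; only the 7 / 30 / 90 day lengths come from the plans described above. `expires_at` simply mirrors the table TTL expression so the arithmetic can be checked.

```python
from datetime import datetime, timedelta, timezone

# Tier names are assumptions for illustration; the day counts are the plan lengths.
RETENTION_DAYS = {"starter": 7, "team": 30, "business": 90}


def retention_days_for(tier: str) -> int:
    """Return the retention_days value to write on each log row for this tier."""
    try:
        return RETENTION_DAYS[tier]
    except KeyError:
        raise ValueError(f"unknown tier: {tier!r}") from None


def expires_at(ts: datetime, retention_days: int) -> datetime:
    """Mirror the table TTL expression: ts + toIntervalDay(retention_days)."""
    return ts + timedelta(days=retention_days)


row_ts = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(expires_at(row_ts, retention_days_for("starter")))  # 2024-01-08 00:00:00+00:00
```

ClickHouse applies TTL during background merges, so a row can remain on disk for some time after the computed expiry.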

PRD reference & inlined contracts

Owned by FR-051 (retention). The retention split (per-tier logs · global-90d metrics) is restated above so this guide stands alone — if it ever diverges, the PRD wins. Canonical map: Canonical Sources.