ADR-021: Citus para horizontal scaling¶

Status: Proposed
Date: 2026-05-14
Tags: infrastructure, scaling, postgres, phase-4

Status: Proposed — esta decisión aplica a Phase 4 del scaling roadmap. Re-evaluar cuando esté en horizonte cercano (Phase 3 ya implementado y triggers para Phase 4 alcanzados).

Context and Problem Statement¶

Single Postgres primary tiene ceiling de throughput (~5K-10K TPS dependiendo del hardware). Para escalar más allá, opciones son: app-level sharding (Phase 3) o database-level sharding via Citus (Phase 4). ¿Qué approach es mejor para crecimiento más allá de Phase 3?

Decision Drivers¶

Citus is now open-source (Microsoft acquired y mantiene)
Transparent SQL distribution (no app changes para queries)
Already extension of Postgres (no fork)
Maturity en producción (Microsoft, Azure deployments)
Cross-tenant queries muy difíciles en app-level sharding

Considered Options¶

Citus extension (transparent distributed Postgres)
App-level sharding continuado (smart router)
CockroachDB (Postgres-compatible distributed)
YugabyteDB (Postgres-compatible distributed)
TimescaleDB (no — different use case)
Custom OLAP separado (export para queries analytics)

Decision Outcome (Proposed)¶

Proposed: Citus extension cuando Phase 4 sea needed porque: - Transparent al app code (queries SQL normales) - Same Postgres tooling/ecosystem - Mature (Microsoft Azure backing) - Cross-shard queries native - Migration from Phase 3 app-sharding factible

Positive Consequences (if accepted)¶

SQL queries transparentes (no smart router en app)
Cross-shard joins manejados automáticamente
Mismo Postgres ecosystem (Patroni, pgBackRest, etc.)
Linear scaling adding workers
Maintenance similar a single Postgres
Migration tooling de Citus (online rebalance)

Negative Consequences (if accepted)¶

Citus tiene some Postgres features unsupported
Coordinator node es bottleneck (potential)
Operacional complexity increases (workers + coordinator)
Some queries patterns inefficient distributed
Licensing concerns historically (resolved now: open-source)

Pros and Cons of the Options¶

Citus extension¶

Pros: - Transparent SQL - Microsoft maintenance - Open-source ahora - Mature - Postgres-compatible (SQL exacto)

Cons: - Some Postgres features unsupported - Cross-shard transactions complex - Coordinator scaling limit - Re-sharding requires planning

App-level sharding (Phase 3 continuado)¶

Pros: - Total control - No DB feature limitations

Cons: - Smart router complexity en app - Cross-shard queries manual - Migration tooling custom - Maintenance burden

CockroachDB¶

Pros: - Globally distributed - Strong consistency (Raft) - Postgres-compatible (mostly)

Cons: - Different storage engine (not PG) - License changes recently (BSL) - Some Postgres features unsupported - More expensive

YugabyteDB¶

Pros: - Distributed - Postgres compatible

Cons: - Less mature than CockroachDB - Smaller community - Migration from PG requires testing

TimescaleDB¶

NOT applicable — designed for time-series, not OLTP workload.

OLAP separado¶

Pros: - Analytics separated from OLTP

Cons: - Doesn't solve OLTP scaling - Export pipeline overhead - Two stores

Why proposed (not accepted)¶

Phase 4 está horizon planning, not imminent. Things que pueden cambiar antes:

Citus features pueden evolve — checks otra vez at Phase 3 completion
CockroachDB/Yugabyte pueden mature más
Postgres native sharding is in roadmap (TBD)
MVP product direction puede shift away from need for Phase 4

Re-evaluate antes de commit. Mientras tanto, code para Phase 3 should be Phase-4-compatible (use tenant_id as natural shard key).

Schema considerations¶

If Citus accepted, schema patterns:

-- Tables distributed by tenant_id
SELECT create_distributed_table('process_instances', 'tenant_id');
SELECT create_distributed_table('element_instances', 'tenant_id');
SELECT create_distributed_table('variables', 'tenant_id');
SELECT create_distributed_table('jobs', 'tenant_id');
SELECT create_distributed_table('user_tasks', 'tenant_id');
SELECT create_distributed_table('command_log', 'tenant_id');
SELECT create_distributed_table('event_log', 'tenant_id');

-- Co-located (same tenant data on same node):
-- All tables sharded by tenant_id are auto co-located
-- Joins on tenant_id are local

-- Reference tables (small, replicated to all nodes):
SELECT create_reference_table('process_definitions');
SELECT create_reference_table('tenants');
SELECT create_reference_table('forms');

Co-location is critical — queries que filter por tenant_id stay local.

Migration from Phase 3 to Phase 4¶

Phase 3 has multiple shards each running own Postgres cluster. Migration:

Phase 3 setup:
  Shard A: tenants 1-100 (PG cluster A)
  Shard B: tenants 101-200 (PG cluster B)
  Shard C: tenants 201-300 (PG cluster C)

Phase 4 setup:
  Citus cluster:
    Coordinator
    Worker 1: tenants 1-100 (data)
    Worker 2: tenants 101-200 (data)
    Worker 3: tenants 201-300 (data)

Migration steps: 1. Setup Citus cluster en paralelo 2. Backup Phase 3 shard A → restore to Citus worker 1 3. Verify data integrity 4. Cut over Phase 3 shard A traffic to Citus (router change) 5. Repeat for B, C 6. Decommission Phase 3 shards

Online or with downtime — depends on tolerance.

Operational impact¶

Aspect	App-sharding (Phase 3)	Citus (Phase 4)
Code complexity	High (smart router)	Low (transparent SQL)
Ops complexity	High (N clusters)	Medium (Citus cluster)
Cross-shard queries	Hard (app code)	Native
Re-sharding	Manual tooling	Citus rebalance
Backup	Per shard	Cluster-wide
Monitoring	Per shard	Cluster-wide

Triggers para advance to Phase 4¶

Si en Phase 3 emergen estos issues:

Cross-tenant queries needed frequently: analytics across tenants
Skewed shards: some shards much bigger than others
Single tenant exceeds shard capacity: needs sub-tenant sharding
Re-sharding tooling burden is exhausting team
Operational complexity of N clusters > Citus cluster

Re-evaluation criteria (when Phase 3 about to be reached)¶

Before commit to Citus, evaluate:

¿Citus features support all queries we have?
¿Performance benchmarks meet our targets?
¿Operational expertise available (or training feasible)?
¿Cost analysis vs cloud-managed Citus services (Azure Cosmos DB for Postgres)?
¿Alternative tech matured (CockroachDB, Yugabyte)?

If criteria fail → re-open decision.

Links¶

adr 003 single node mvp incremental scaling — Phase context
scaling strategy postgres — Roadmap completo
Citus
Citus docs
Azure Cosmos DB for PostgreSQL