ADR-021: Citus para horizontal scaling¶
- Status: Proposed
- Date: 2026-05-14
- Tags: infrastructure, scaling, postgres, phase-4
Status: Proposed — esta decisión aplica a Phase 4 del scaling roadmap. Re-evaluar cuando esté en horizonte cercano (Phase 3 ya implementado y triggers para Phase 4 alcanzados).
Context and Problem Statement¶
Single Postgres primary tiene ceiling de throughput (~5K-10K TPS dependiendo del hardware). Para escalar más allá, opciones son: app-level sharding (Phase 3) o database-level sharding via Citus (Phase 4). ¿Qué approach es mejor para crecimiento más allá de Phase 3?
Decision Drivers¶
- Citus is now open-source (Microsoft acquired y mantiene)
- Transparent SQL distribution (no app changes para queries)
- Already extension of Postgres (no fork)
- Maturity en producción (Microsoft, Azure deployments)
- Cross-tenant queries muy difíciles en app-level sharding
Considered Options¶
- Citus extension (transparent distributed Postgres)
- App-level sharding continuado (smart router)
- CockroachDB (Postgres-compatible distributed)
- YugabyteDB (Postgres-compatible distributed)
- TimescaleDB (no — different use case)
- Custom OLAP separado (export para queries analytics)
Decision Outcome (Proposed)¶
Proposed: Citus extension cuando Phase 4 sea needed porque: - Transparent al app code (queries SQL normales) - Same Postgres tooling/ecosystem - Mature (Microsoft Azure backing) - Cross-shard queries native - Migration from Phase 3 app-sharding factible
Positive Consequences (if accepted)¶
- SQL queries transparentes (no smart router en app)
- Cross-shard joins manejados automáticamente
- Mismo Postgres ecosystem (Patroni, pgBackRest, etc.)
- Linear scaling adding workers
- Maintenance similar a single Postgres
- Migration tooling de Citus (online rebalance)
Negative Consequences (if accepted)¶
- Citus tiene some Postgres features unsupported
- Coordinator node es bottleneck (potential)
- Operacional complexity increases (workers + coordinator)
- Some queries patterns inefficient distributed
- Licensing concerns historically (resolved now: open-source)
Pros and Cons of the Options¶
Citus extension¶
Pros: - Transparent SQL - Microsoft maintenance - Open-source ahora - Mature - Postgres-compatible (SQL exacto)
Cons: - Some Postgres features unsupported - Cross-shard transactions complex - Coordinator scaling limit - Re-sharding requires planning
App-level sharding (Phase 3 continuado)¶
Pros: - Total control - No DB feature limitations
Cons: - Smart router complexity en app - Cross-shard queries manual - Migration tooling custom - Maintenance burden
CockroachDB¶
Pros: - Globally distributed - Strong consistency (Raft) - Postgres-compatible (mostly)
Cons: - Different storage engine (not PG) - License changes recently (BSL) - Some Postgres features unsupported - More expensive
YugabyteDB¶
Pros: - Distributed - Postgres compatible
Cons: - Less mature than CockroachDB - Smaller community - Migration from PG requires testing
TimescaleDB¶
NOT applicable — designed for time-series, not OLTP workload.
OLAP separado¶
Pros: - Analytics separated from OLTP
Cons: - Doesn't solve OLTP scaling - Export pipeline overhead - Two stores
Why proposed (not accepted)¶
Phase 4 está horizon planning, not imminent. Things que pueden cambiar antes:
- Citus features pueden evolve — checks otra vez at Phase 3 completion
- CockroachDB/Yugabyte pueden mature más
- Postgres native sharding is in roadmap (TBD)
- MVP product direction puede shift away from need for Phase 4
Re-evaluate antes de commit. Mientras tanto, code para Phase 3 should be Phase-4-compatible (use tenant_id as natural shard key).
Schema considerations¶
If Citus accepted, schema patterns:
-- Tables distributed by tenant_id
SELECT create_distributed_table('process_instances', 'tenant_id');
SELECT create_distributed_table('element_instances', 'tenant_id');
SELECT create_distributed_table('variables', 'tenant_id');
SELECT create_distributed_table('jobs', 'tenant_id');
SELECT create_distributed_table('user_tasks', 'tenant_id');
SELECT create_distributed_table('command_log', 'tenant_id');
SELECT create_distributed_table('event_log', 'tenant_id');
-- Co-located (same tenant data on same node):
-- All tables sharded by tenant_id are auto co-located
-- Joins on tenant_id are local
-- Reference tables (small, replicated to all nodes):
SELECT create_reference_table('process_definitions');
SELECT create_reference_table('tenants');
SELECT create_reference_table('forms');
Co-location is critical — queries que filter por tenant_id stay local.
Migration from Phase 3 to Phase 4¶
Phase 3 has multiple shards each running own Postgres cluster. Migration:
Phase 3 setup:
Shard A: tenants 1-100 (PG cluster A)
Shard B: tenants 101-200 (PG cluster B)
Shard C: tenants 201-300 (PG cluster C)
Phase 4 setup:
Citus cluster:
Coordinator
Worker 1: tenants 1-100 (data)
Worker 2: tenants 101-200 (data)
Worker 3: tenants 201-300 (data)
Migration steps: 1. Setup Citus cluster en paralelo 2. Backup Phase 3 shard A → restore to Citus worker 1 3. Verify data integrity 4. Cut over Phase 3 shard A traffic to Citus (router change) 5. Repeat for B, C 6. Decommission Phase 3 shards
Online or with downtime — depends on tolerance.
Operational impact¶
| Aspect | App-sharding (Phase 3) | Citus (Phase 4) |
|---|---|---|
| Code complexity | High (smart router) | Low (transparent SQL) |
| Ops complexity | High (N clusters) | Medium (Citus cluster) |
| Cross-shard queries | Hard (app code) | Native |
| Re-sharding | Manual tooling | Citus rebalance |
| Backup | Per shard | Cluster-wide |
| Monitoring | Per shard | Cluster-wide |
Triggers para advance to Phase 4¶
Si en Phase 3 emergen estos issues:
- Cross-tenant queries needed frequently: analytics across tenants
- Skewed shards: some shards much bigger than others
- Single tenant exceeds shard capacity: needs sub-tenant sharding
- Re-sharding tooling burden is exhausting team
- Operational complexity of N clusters > Citus cluster
Re-evaluation criteria (when Phase 3 about to be reached)¶
Before commit to Citus, evaluate:
- ¿Citus features support all queries we have?
- ¿Performance benchmarks meet our targets?
- ¿Operational expertise available (or training feasible)?
- ¿Cost analysis vs cloud-managed Citus services (Azure Cosmos DB for Postgres)?
- ¿Alternative tech matured (CockroachDB, Yugabyte)?
If criteria fail → re-open decision.
Links¶
- adrs/adr-003-single-node-mvp-incremental-scaling — Phase context
- analysis/scaling-strategy-postgres — Roadmap completo
- Citus
- Citus docs
- Azure Cosmos DB for PostgreSQL