Saltar a contenido

ADR-021: Citus para horizontal scaling

  • Status: Proposed
  • Date: 2026-05-14
  • Tags: infrastructure, scaling, postgres, phase-4

Status: Proposed — esta decisión aplica a Phase 4 del scaling roadmap. Re-evaluar cuando esté en horizonte cercano (Phase 3 ya implementado y triggers para Phase 4 alcanzados).

Context and Problem Statement

Single Postgres primary tiene ceiling de throughput (~5K-10K TPS dependiendo del hardware). Para escalar más allá, opciones son: app-level sharding (Phase 3) o database-level sharding via Citus (Phase 4). ¿Qué approach es mejor para crecimiento más allá de Phase 3?

Decision Drivers

  • Citus is now open-source (Microsoft acquired y mantiene)
  • Transparent SQL distribution (no app changes para queries)
  • Already extension of Postgres (no fork)
  • Maturity en producción (Microsoft, Azure deployments)
  • Cross-tenant queries muy difíciles en app-level sharding

Considered Options

  1. Citus extension (transparent distributed Postgres)
  2. App-level sharding continuado (smart router)
  3. CockroachDB (Postgres-compatible distributed)
  4. YugabyteDB (Postgres-compatible distributed)
  5. TimescaleDB (no — different use case)
  6. Custom OLAP separado (export para queries analytics)

Decision Outcome (Proposed)

Proposed: Citus extension cuando Phase 4 sea needed porque: - Transparent al app code (queries SQL normales) - Same Postgres tooling/ecosystem - Mature (Microsoft Azure backing) - Cross-shard queries native - Migration from Phase 3 app-sharding factible

Positive Consequences (if accepted)

  • SQL queries transparentes (no smart router en app)
  • Cross-shard joins manejados automáticamente
  • Mismo Postgres ecosystem (Patroni, pgBackRest, etc.)
  • Linear scaling adding workers
  • Maintenance similar a single Postgres
  • Migration tooling de Citus (online rebalance)

Negative Consequences (if accepted)

  • Citus tiene some Postgres features unsupported
  • Coordinator node es bottleneck (potential)
  • Operacional complexity increases (workers + coordinator)
  • Some queries patterns inefficient distributed
  • Licensing concerns historically (resolved now: open-source)

Pros and Cons of the Options

Citus extension

Pros: - Transparent SQL - Microsoft maintenance - Open-source ahora - Mature - Postgres-compatible (SQL exacto)

Cons: - Some Postgres features unsupported - Cross-shard transactions complex - Coordinator scaling limit - Re-sharding requires planning

App-level sharding (Phase 3 continuado)

Pros: - Total control - No DB feature limitations

Cons: - Smart router complexity en app - Cross-shard queries manual - Migration tooling custom - Maintenance burden

CockroachDB

Pros: - Globally distributed - Strong consistency (Raft) - Postgres-compatible (mostly)

Cons: - Different storage engine (not PG) - License changes recently (BSL) - Some Postgres features unsupported - More expensive

YugabyteDB

Pros: - Distributed - Postgres compatible

Cons: - Less mature than CockroachDB - Smaller community - Migration from PG requires testing

TimescaleDB

NOT applicable — designed for time-series, not OLTP workload.

OLAP separado

Pros: - Analytics separated from OLTP

Cons: - Doesn't solve OLTP scaling - Export pipeline overhead - Two stores

Why proposed (not accepted)

Phase 4 está horizon planning, not imminent. Things que pueden cambiar antes:

  1. Citus features pueden evolve — checks otra vez at Phase 3 completion
  2. CockroachDB/Yugabyte pueden mature más
  3. Postgres native sharding is in roadmap (TBD)
  4. MVP product direction puede shift away from need for Phase 4

Re-evaluate antes de commit. Mientras tanto, code para Phase 3 should be Phase-4-compatible (use tenant_id as natural shard key).

Schema considerations

If Citus accepted, schema patterns:

-- Tables distributed by tenant_id
SELECT create_distributed_table('process_instances', 'tenant_id');
SELECT create_distributed_table('element_instances', 'tenant_id');
SELECT create_distributed_table('variables', 'tenant_id');
SELECT create_distributed_table('jobs', 'tenant_id');
SELECT create_distributed_table('user_tasks', 'tenant_id');
SELECT create_distributed_table('command_log', 'tenant_id');
SELECT create_distributed_table('event_log', 'tenant_id');

-- Co-located (same tenant data on same node):
-- All tables sharded by tenant_id are auto co-located
-- Joins on tenant_id are local

-- Reference tables (small, replicated to all nodes):
SELECT create_reference_table('process_definitions');
SELECT create_reference_table('tenants');
SELECT create_reference_table('forms');

Co-location is critical — queries que filter por tenant_id stay local.

Migration from Phase 3 to Phase 4

Phase 3 has multiple shards each running own Postgres cluster. Migration:

Phase 3 setup:
  Shard A: tenants 1-100 (PG cluster A)
  Shard B: tenants 101-200 (PG cluster B)
  Shard C: tenants 201-300 (PG cluster C)

Phase 4 setup:
  Citus cluster:
    Coordinator
    Worker 1: tenants 1-100 (data)
    Worker 2: tenants 101-200 (data)
    Worker 3: tenants 201-300 (data)

Migration steps: 1. Setup Citus cluster en paralelo 2. Backup Phase 3 shard A → restore to Citus worker 1 3. Verify data integrity 4. Cut over Phase 3 shard A traffic to Citus (router change) 5. Repeat for B, C 6. Decommission Phase 3 shards

Online or with downtime — depends on tolerance.

Operational impact

Aspect App-sharding (Phase 3) Citus (Phase 4)
Code complexity High (smart router) Low (transparent SQL)
Ops complexity High (N clusters) Medium (Citus cluster)
Cross-shard queries Hard (app code) Native
Re-sharding Manual tooling Citus rebalance
Backup Per shard Cluster-wide
Monitoring Per shard Cluster-wide

Triggers para advance to Phase 4

Si en Phase 3 emergen estos issues:

  1. Cross-tenant queries needed frequently: analytics across tenants
  2. Skewed shards: some shards much bigger than others
  3. Single tenant exceeds shard capacity: needs sub-tenant sharding
  4. Re-sharding tooling burden is exhausting team
  5. Operational complexity of N clusters > Citus cluster

Re-evaluation criteria (when Phase 3 about to be reached)

Before commit to Citus, evaluate:

  1. ¿Citus features support all queries we have?
  2. ¿Performance benchmarks meet our targets?
  3. ¿Operational expertise available (or training feasible)?
  4. ¿Cost analysis vs cloud-managed Citus services (Azure Cosmos DB for Postgres)?
  5. ¿Alternative tech matured (CockroachDB, Yugabyte)?

If criteria fail → re-open decision.