Language Choice
Exhaustive analysis of candidate languages for implementing the MVP. Priority 1 is resilience (memory safety, predictable behavior, a mature distributed-systems ecosystem, mature Postgres drivers, no surprise failures). Priority 2 is efficiency (memory footprint, startup time, throughput, GC behavior). Recommendation: Go. Reasons: a mature distributed-systems ecosystem (validated by etcd, Consul, Temporal, Kubernetes), the excellent pgx driver, predictable GC, a single static binary, simple concurrency, and a hireable talent pool. Rust is the second choice if the team has the expertise. Java/Kotlin only if the team comes from Camunda. Avoid Node.js, Elixir (niche), and Python (too dynamic) for the engine. Generates ADR-026.
Criteria framework¶
Priorities, in order:
Priority 1: Resilience¶
Workflow engines are stateful, long-running, and mission-critical. Crashes = lost business state. Criteria:
- Memory safety: no segfaults, no buffer overflows in production
- Predictable behavior under load: no surprise OOM, no GC pause spikes
- Mature ecosystem for distributed systems: HA libraries battle-tested
- Mature Postgres drivers: connection pooling, prepared statements, robust
- Strong typing: catch bugs at compile time, not runtime
- Error handling primitives: explicit handling, no swallowed exceptions
- Long-term maintainability: language stability, backwards compat
- Observability tooling: mature OTel SDK
- Production failure modes well-understood: industry experience with workload
Priority 2: Efficiency¶
- Memory footprint: smaller = less infra cost, more density
- Startup time: faster restart on failure (per FMEA F1)
- GC behavior: low/no pauses, predictable
- Hot path performance: high TPS without saturating
- Concurrency model: matches workflow engine pattern (single-thread per partition + many connections)
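The "single thread per partition" pattern can be sketched in Go as one goroutine per partition, each draining its own channel. This is a minimal illustration, not the engine design: `Command`, `Engine`, `NewEngine`, `PartitionFor`, `Submit`, and `Close` are hypothetical names, and the FNV hash and the buffer size of 64 are arbitrary assumptions.

```go
package main

import (
	"hash/fnv"
	"sync"
)

// Command is a hypothetical unit of work addressed to one workflow instance.
type Command struct {
	InstanceKey string
	Payload     string
}

// Engine routes commands to a fixed set of partitions. Each partition is
// served by exactly one goroutine, so per-partition state needs no locking.
type Engine struct {
	inboxes []chan Command
	wg      sync.WaitGroup
}

// NewEngine starts one goroutine per partition. handle is called sequentially
// for every command that lands on the same partition.
func NewEngine(partitions int, handle func(partition int, cmd Command)) *Engine {
	e := &Engine{inboxes: make([]chan Command, partitions)}
	for i := range e.inboxes {
		e.inboxes[i] = make(chan Command, 64) // buffer size is an arbitrary assumption
		e.wg.Add(1)
		go func(id int, inbox <-chan Command) {
			defer e.wg.Done()
			for cmd := range inbox {
				handle(id, cmd)
			}
		}(i, e.inboxes[i])
	}
	return e
}

// PartitionFor hashes the instance key so that one instance always maps to
// the same partition.
func (e *Engine) PartitionFor(key string) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32() % uint32(len(e.inboxes)))
}

// Submit enqueues a command on the partition that owns its instance key.
func (e *Engine) Submit(cmd Command) {
	e.inboxes[e.PartitionFor(cmd.InstanceKey)] <- cmd
}

// Close stops accepting commands and waits for all partitions to drain.
func (e *Engine) Close() {
	for _, inbox := range e.inboxes {
		close(inbox)
	}
	e.wg.Wait()
}
```

Because every command for a given instance key goes through the same channel and the same goroutine, commands for one instance are handled in submission order, while different partitions proceed in parallel.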
Priority 3: Practical considerations¶
- Hireability: can we find devs?
- Development velocity: time-to-implement
- Ecosystem maturity: libraries available
- Team familiarity: existing skill
- Tooling quality: build, test, debug
Candidates evaluated¶
Real candidates for a workflow engine in 2026:
- Go (Golang)
- Rust
- Java/Kotlin (JVM)
- C#/.NET
- TypeScript/Node.js (incl. Bun, Deno)
- Elixir/Erlang (BEAM)
- Python (mentioned but ruled out early)
Per-language analysis¶
Go (Golang) — Score 92/100¶
Resilience: 9/10
- Memory safe (garbage collected, no manual pointer arithmetic)
- Static binary deployment (no runtime version issues)
- Excellent concurrency primitives (goroutines + channels match actor pattern)
- Strong typing with type inference
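- Explicit error handling (`if err != nil`): verbose, but it makes every failure path visible; linters such as errcheck flag ignored errors
- Modern GC: pauses typically sub-millisecond, predictable
- Built-in race detector (`-race` flag): it instruments the binary and reports data races at runtime, so it only catches races that tests or real traffic actually exercise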
- Mature ecosystem for distributed systems: Temporal, CockroachDB, Kubernetes, etcd, Consul, Vault, Nomad, NATS, TiDB all in Go
- pgx Postgres driver is excellent (production-grade)
- Mature OpenTelemetry SDK
- Backwards compatibility guarantee (Go 1 promise)
Efficiency: 8/10
- Fast startup (< 100 ms typical for the engine process)
- Small memory footprint (~50-100 MB baseline)
- Good throughput (handles 100K+ goroutines easily)
- GC overhead minimal at MVP scale
- Single static binary = simple deployment
- No JVM warmup
- Lower raw performance than Rust, but this rarely matters for an IO-heavy workload
Practical: 9/10
- Hireable: huge talent pool, many developers know Go
- Development velocity: high (simple language, fast compile)
- Ecosystem: massive (similar problem domains well covered)
- Tooling: excellent (gofmt, gopls, `go test`, pprof, race detector)
- Learning curve: shallow (Go is intentionally simple)
Cons, honest assessment:
- Verbose error handling (more lines than exceptions)
- Generics only since Go 1.18 (March 2022), so some older libraries are less idiomatic
- No exceptions: panic/recover exists but is reserved for unrecoverable bugs, not control flow
- GC exists (unlike Rust) but is rarely problematic
- Boilerplate-heavy for some patterns
Validated by: Temporal (a competing workflow engine, forked from Uber's Cadence) chose Go after evaluating alternatives. CockroachDB (distributed SQL) chose Go. The Kubernetes orchestration ecosystem standardized on Go.
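To illustrate what the race detector covers, here is a small sketch of a shared counter updated from many goroutines. `CountProcessed` is a hypothetical function for illustration only. With `sync/atomic` the code is race-free; swapping the atomic for a plain `n++` on a shared integer would be a data race that `go test -race` reports when the test exercises it.

```go
package main

import (
	"sync"
	"sync/atomic"
)

// CountProcessed simulates `workers` goroutines that each record `perWorker`
// processed commands on a shared counter. The counter uses sync/atomic, so
// concurrent increments are safe without a mutex.
func CountProcessed(workers, perWorker int) int64 {
	var processed atomic.Int64 // atomic.Int64 requires Go 1.19 or newer
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				processed.Add(1)
			}
		}()
	}
	wg.Wait() // all increments happen before the final Load
	return processed.Load()
}
```

Running the test suite with `go test -race ./...` in CI is a common way to get value from the detector, with the caveat that it only sees code paths the tests actually run.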
Rust — Score 87/100¶
Resilience: 10/10
- Memory safety enforced by compiler (no GC needed)
- No null pointer exceptions
- Result<T, E> forces error handling
- Strong type system catches many bugs
- No data races (compile-time guarantee)
- Predictable performance (no GC at all)
- Stable since 1.0 (2015) — mature
Efficiency: 10/10
- C++-level performance
- Zero-cost abstractions
- No GC pauses (deterministic latency)
- Smallest memory footprint of mainstream languages
- Fast startup
Practical: 5/10
- Steep learning curve (the borrow checker can take 6+ months to master)
- Slower development velocity (compile times, fighting the borrow checker)
- Smaller talent pool (Rust developers are hard to find and expensive)
- Async ecosystem still evolving: Tokio is the dominant runtime, but web frameworks are fragmented (axum, actix-web, rocket, warp)
- Slow compile times (especially the first, non-incremental build)
- Macro magic can be opaque
- "Rewrite it in Rust" overhead: fewer libraries than Go
Honest assessment:
- Best LANGUAGE for resilience
- Worst CHOICE for a fast MVP if the team doesn't already know Rust
- Roughly 20-40% lower development velocity than Go for engineers not fluent in Rust (60-80% of the Go baseline, per the cost analysis)
Best for: established companies with 3+ Rust experts, willing to invest in correctness over speed-to-market.
Java/Kotlin (JVM) — Score 76/100¶
Resilience: 8/10
- Memory safe (GC)
- Mature ecosystem (decades)
- Excellent for long-running services
- Mature OTel SDK
- Robust Postgres drivers (JDBC, R2DBC)
- Strong typing
- Camunda uses Java: proven at scale
Efficiency: 6/10
- JVM memory overhead (~200-500 MB baseline)
- Slow startup (10-30 s typical for Spring Boot)
- Larger GC pauses (G1 ~50-100 ms; ZGC around 1 ms or less)
- Higher infra cost (more RAM per instance)
- More CPU per request than Go/Rust
Practical: 8/10
- Hireable: huge talent pool
- Mature tooling
- Spring ecosystem
- Existing Camunda developers already know it
- BUT: differentiation issue (if the MVP is Camunda in Java, why bother?)
Honest assessment:
- Smart choice if the team comes from Camunda 8 and wants minimal language risk
- BUT it loses the operational simplicity advantage over Camunda
- JVM ops complexity is a real cost (heap tuning, GC tuning, monitoring)
- Kotlin reduces verbosity, but the JVM trade-offs are the same
Best for: enterprise teams migrating from Camunda 8, already invested in JVM.
C#/.NET — Score 78/100¶
Resilience: 8/10
- Memory safe
- Mature ecosystem
- Strong typing
- async/await primitives
- Good OTel support
- ASP.NET Core mature
Efficiency: 7/10
- .NET 8+ much improved (roughly comparable to Go now)
- AOT compilation available
- Smaller footprint than JVM
- Faster startup than JVM
- Good GC (server GC)
Practical: 7/10
- Hireable in some regions (Windows shops)
- Microsoft stewardship (positive + risk)
- Less popular in the cloud-native / distributed systems space
- Smaller open-source ecosystem for our problem domain
Honest assessment: viable but niche for our domain. Workflow engine community/ecosystem is JVM + Go + small Rust. C# would be lonely.
Best for: shops already on .NET stack heavily.
TypeScript/Node.js (incl. Bun) — Score 71/100¶
Resilience: 6/10
- Type safety is opt-in (TypeScript checks at compile time, but plain JavaScript is what runs)
- Single-threaded event loop fits the actor pattern
- Mature Postgres drivers (pg, postgres.js)
- Memory leaks are easier to introduce in JS than in typed languages
- V8 GC can have pause spikes
- Long-running Node.js processes need care
- Many production stories of Node.js memory issues
Efficiency: 7/10
- V8 is fast (modern JS)
- Bun is often faster still (runtime written in Zig, built on JavaScriptCore rather than V8)
- Single-threaded model matches the engine
- Fast startup
- Moderate memory usage
Practical: 9/10
- Huge talent pool
- Fast development
- Frontend + backend in the same language (TypeScript)
- Massive ecosystem
- TypeScript types decent
Honest assessment:
- Concerning for a resilience-critical workload: V8 memory limits, JS quirks, type safety not enforced at runtime
- Production workflow engines are NOT typically written in Node.js
- Memory profile less predictable than Go/Java
- For the RUNTIME engine: risky
- For SDK / Tasklist / Inspector: excellent
Best for: MVP frontend stack + worker SDK. NOT engine runtime.
Elixir/Erlang (BEAM) — Score 79/100¶
Resilience: 10/10
- Built for resilience: Erlang/OTP was designed for telecom systems with extreme uptime targets
- Actor model native (matches the engine pattern PERFECTLY)
- Process isolation (one crash doesn't kill others)
- "Let it crash" philosophy with supervisors
- Hot code reload (zero-downtime deploys)
- Distributed by default
- Battle-tested for decades
Efficiency: 6/10
- BEAM has lower raw performance than Go/Rust
- BUT it is excellent for concurrent workloads (millions of cheap processes)
- Memory per process ~2 KB (very small)
- GC per process (no global stop-the-world)
Practical: 5/10
- Smaller talent pool (hard to hire)
- Dynamic typing (mostly; Elixir has typespecs, but they are not enforced)
- Postgres ecosystem less mature than Go's (Ecto is good but smaller)
- Workflow engine community small
- OpenTelemetry support newer
- Niche choice = bus factor risk
Honest assessment:
- Best language for "resilience first" by design philosophy
- BUT practical concerns: hiring, talent retention, ecosystem maturity
- If you ONLY optimized for resilience, Elixir is the answer
- For a pragmatic MVP that weighs resilience together with practical factors, Go wins
Best for: teams with Erlang/Elixir expertise already. Otherwise too risky.
Python — Score 55/100¶
Resilience: 4/10
- Memory safe, but GC overhead and memory leaks are common
- Dynamic typing = runtime errors
- GIL limits parallelism
- Async story is complex (asyncio fragmentation)
- Long-running Python processes have memory issues
Efficiency: 4/10
- Slow (interpreted) compared to Go/Rust/Java
- GIL = no true parallelism across threads
- Moderate memory footprint
- Startup OK
Practical: 9/10
- Easiest to develop in
- Huge talent pool
Honest assessment:
- Ruled out for the engine runtime. Dynamic typing, the GIL, and memory characteristics make it a bad fit for our workload.
- Perfect for SDK and tooling. Acceptable for some scripts.
NOT recommended for engine.
Comparative table¶
| Language | Resilience (/10) | Efficiency (/10) | Practical (/10) | Total (/100) | Recommended? |
|---|---|---|---|---|---|
| Go | 9 | 8 | 9 | 92 | ✅ Primary |
| Rust | 10 | 10 | 5 | 87 | ⚠️ If expertise |
| Elixir | 10 | 6 | 5 | 79 | ⚠️ If expertise |
| C#/.NET | 8 | 7 | 7 | 78 | ⚠️ Niche fit |
| Java/Kotlin | 8 | 6 | 8 | 76 | ⚠️ If from Camunda |
| TypeScript | 6 | 7 | 9 | 71 | ❌ Not engine |
| Python | 4 | 4 | 9 | 55 | ❌ Not engine |
Industry validation¶
What did serious workflow/orchestration projects choose?
| Project | Language | Notes |
|---|---|---|
| Temporal | Go | Direct competitor to MVP. Chose Go after evaluating alternatives. |
| Camunda 8 (Zeebe) | Java | Inherited from Camunda 7 ecosystem |
| Conductor (Netflix) | Java + Go (rewrites) | OSS rewrites going to Go |
| Cadence (Uber) | Go | Temporal predecessor |
| Argo Workflows | Go | K8s controller |
| Airflow | Python | Different domain (data pipelines) |
| n8n | TypeScript | Different domain (low-code) |
| Restate | Rust | Newer durable execution engine |
Pattern: distributed and orchestration systems of the last decade lean heavily toward Go (Temporal, Cadence, Argo, CockroachDB, K8s, etcd, Consul, Vault, Nomad). Newer projects experiment with Rust (Restate).
Recommendation: Go¶
Why Go wins our criteria¶
Resilience considerations:
- Memory safety: GC eliminates a whole class of bugs (use-after-free, double-free). No segfaults from manual memory management in production.
- Predictable behavior: GC pauses are typically sub-millisecond with modern Go. No surprises like JVM full GCs.
- Mature distributed systems ecosystem: many of the major OSS distributed systems of the last 10 years are written in Go. Patterns, libraries, and lessons are all transferable.
- pgx Postgres driver: widely regarded as one of the best Postgres drivers. Connection pooling, prepared statements, COPY protocol, and LISTEN/NOTIFY are all production-grade.
- Strong typing: catches most bugs at compile time. Not as strict as Rust, but pragmatic.
- Explicit error handling: `if err != nil` is verbose but forces conscious decisions. No accidentally swallowed exceptions.
- Backwards compatibility guarantee: under the Go 1 promise, code written for Go 1.0 in 2012 generally still compiles on Go 1.22. Critical for long-lived workflow data and replay determinism.
- OpenTelemetry SDK: mature and widely deployed; the OpenTelemetry Collector itself is written in Go.
- Industry knowledge: failure modes are well understood. Many production stories to learn from.
Efficiency considerations:
- Memory footprint: 50-100 MB baseline for the engine process. Affordable per instance.
- Startup time: < 1 second typical. Per FMEA F1, restart recovery is fast.
- GC overhead: minimal at MVP scale. Tunable for higher scale.
- Concurrency: goroutines + channels map naturally onto the actor model. 100K concurrent goroutines is routine.
- Single static binary: simple deployment, no runtime dependencies, small container images.
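As a sketch of what "tunable" can mean in practice, the Go runtime exposes a soft memory limit and a GC percentage, settable through the `GOMEMLIMIT` and `GOGC` environment variables or programmatically via `runtime/debug`. `TuneGC` and `CurrentMemoryLimit` are hypothetical helper names, and any concrete values passed to them are deployment-specific assumptions rather than recommendations from this analysis.

```go
package main

import "runtime/debug"

// TuneGC applies a soft memory limit (in bytes) and a GC percentage.
// It returns the previous settings so that callers can restore them.
func TuneGC(memLimitBytes int64, gcPercent int) (prevLimit int64, prevPercent int) {
	prevLimit = debug.SetMemoryLimit(memLimitBytes) // available since Go 1.19
	prevPercent = debug.SetGCPercent(gcPercent)
	return prevLimit, prevPercent
}

// CurrentMemoryLimit reads the limit without changing it: a negative
// argument to SetMemoryLimit leaves the setting untouched.
func CurrentMemoryLimit() int64 {
	return debug.SetMemoryLimit(-1)
}
```

For example, `TuneGC(512<<20, 50)` would set a 512 MiB soft limit and trigger collection at 50% heap growth. For a containerized engine, setting the limit somewhat below the container's memory limit is a common approach.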
Practical considerations:
- Hireable: large Go talent pool in 2026. Many engineers with Kubernetes and distributed systems backgrounds.
- Development velocity: faster than Rust/Java. Slower than Python/TS but acceptable.
- Tooling: gopls (LSP), gofmt, `go test`, pprof, and the race detector are all excellent.
- Ecosystem fit: nearly every dependency we need is mature in Go (pgx, otel, gin/chi, sqlc, etc.).
When Go would be wrong¶
Consider alternatives if:
- Team has 3+ Rust experts AND time → Rust offers slight resilience edge
- Team is Java/Camunda veterans + commercial pressure → Java/Kotlin to leverage skills
- Team has Erlang/Elixir experts AND want maximum resilience → Elixir is theoretically superior
- Resilience is so critical it justifies slower velocity → Rust
Otherwise: Go.
Concrete architecture stack recommendation¶
Engine (core, performance-critical):
  Language: Go 1.22+
  Web framework: Chi (lightweight) or Gin (popular)
  Database: pgx/v5 (Postgres driver)
  Data access: sqlc (compile-time SQL → Go) or pgxscan (lightweight)
  CEL: cel-go (Google's CEL implementation)
  OpenTelemetry: go.opentelemetry.io/otel
  BPMN parsing: custom (no good Go BPMN library, build minimal)
  Tests: standard library + testify
Workers SDK:
  Go: same as engine
  TypeScript: official SDK (frontend + backend devs)
  Python: official SDK (data/ML teams)
  Java: official SDK (Camunda migrants)
  Other languages: REST API + OpenAPI-generated clients
Frontend (Tasklist + Inspector):
  Language: TypeScript
  Framework: React (familiar) or Vue (simpler)
  Build: Vite
  UI: Tailwind CSS + Radix UI primitives
  State: TanStack Query
  Forms: react-jsonschema-form
CLI:
  Language: Go (same binary as engine, shared types)
  Framework: cobra
  Output: tabular + JSON
Tooling:
  Migrations: golang-migrate
  Linting: golangci-lint
  Build: Make + Docker
  CI: GitHub Actions / GitLab CI
Result: two languages for the core product (Go + TypeScript); Python and Java appear only in the worker SDKs. Manageable.
Alternative stack: Rust¶
If team has Rust expertise and willing to accept slower velocity:
Engine:
  Language: Rust
  Web framework: Axum (Tokio-based, modern)
  Database: sqlx (async, type-checked SQL at compile time)
  CEL: cel-rust (less mature than cel-go) or build subset
  OpenTelemetry: opentelemetry-rust (newer, but viable)
  Async runtime: Tokio
Workers SDK:
  Rust: same engine code
  TypeScript / Python / Go: separate teams
Frontend: same as Go stack
Trade-off: MVP delivery takes roughly 25-60% longer (32-42 weeks versus 26 for Go, per the cost analysis), in exchange for stronger compile-time guarantees.
Cost analysis¶
For 1 year of development:
Go stack¶
Dev velocity: baseline (100%)
Time to MVP (M4): 26 weeks (per [analysis/implementation-roadmap-concrete](<../../analysis/implementation-roadmap-concrete.md>))
Infrastructure cost (Phase 0): ~$345/month
Talent cost: moderate (mature Go market)
Total Year 1: $450K + $4K infra
Rust stack¶
Dev velocity: 60-80% of Go baseline
Time to MVP (M4): 32-42 weeks (slower)
Infrastructure cost: slightly lower (~$300/month, less memory)
Talent cost: 30-50% premium (Rust devs scarcer)
Total Year 1: $600-700K + $4K infra
NPV: slower revenue start, higher risk of project not shipping
Java stack¶
Dev velocity: 90% of Go baseline (Spring boilerplate)
Time to MVP (M4): 28-30 weeks
Infrastructure cost: 2-3x higher (~$700-900/month, more RAM)
Talent cost: moderate (huge Java market)
Total Year 1: $480K + $10K infra
Ops cost ongoing: 20-30% higher (JVM tuning)
Elixir stack¶
Dev velocity: variable, depends on team familiarity
Time to MVP (M4): 28-40 weeks
Infrastructure cost: similar to Go
Talent cost: 50-100% premium (Elixir devs very scarce)
Total Year 1: $550-700K + $4K infra
Risk: bus factor, hiring difficulty
Go wins on combined factors.
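The schedule and infrastructure figures above are consistent with simple scaling, which can be sketched as follows. This is one plausible reading of how the numbers relate, not a documented cost model: `WeeksToMVP` and `AnnualInfraUSD` are hypothetical helpers, and the assumption is that delivery time scales inversely with relative velocity.

```go
package main

// WeeksToMVP scales the baseline schedule by a relative velocity
// (1.0 = Go baseline, 0.8 = 80% of the baseline velocity).
func WeeksToMVP(baselineWeeks, velocity float64) float64 {
	return baselineWeeks / velocity
}

// AnnualInfraUSD converts a monthly infrastructure bill to a yearly figure.
func AnnualInfraUSD(monthlyUSD float64) float64 {
	return monthlyUSD * 12
}
```

With the 26-week Go baseline, a velocity of 0.8 gives 32.5 weeks and 0.6 gives about 43.3 weeks, in line with the 32-42 week Rust estimate after rounding. Likewise, $345/month comes to $4,140 per year, matching the ~$4K infra line for the Go stack.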
Common counterarguments addressed¶
"But Rust is more resilient than Go"¶
True in absolute terms (no GC, compile-time prevention of data races). But:
- At MVP scale, the difference matters less than other factors (DB query design, etc.)
- Go's mature ecosystem provides resilience advantages too (battle-tested libraries)
- Team velocity matters for resilience (faster bug fixes, more time for tests)
"But Camunda uses Java"¶
Yes, AND that's why the MVP wins by NOT using Java:
- Differentiate on operational simplicity (no JVM)
- Lower infrastructure cost
- Faster startup (better for HA, see FMEA F1)
If the team comes from Camunda, Java is tempting. But the strategic differentiation is in moving away.
"But Elixir was designed for this"¶
Theoretically yes. Practically:
- Erlang/Elixir hiring is HARD in 2026
- Ecosystem (especially Postgres + OTel) less mature than Go's
- "Designed for this" doesn't beat "industry validated for this with Temporal, Cadence, Argo"
If the team has 3+ Elixir experts and wants maximum theoretical resilience, it is viable. Otherwise risky.
"But Go has GC"¶
True. But:
- Modern Go GC pauses are typically sub-millisecond
- For the MVP target (P99 < 1 s), GC pauses are irrelevant noise
- If we ever need Rust-class latencies (P99 < 100 μs), specific hot paths can be rewritten
- Not worth the development velocity tax at the start
"But TypeScript would let us share code with frontend"¶
True for SDKs and simple services. NOT true for the engine because:
- Node.js memory characteristics are a concern for a long-running stateful service
- V8 GC is less predictable than Go's for this workload
- Backend uses different patterns than frontend anyway
- TypeScript types are not enforced at runtime, so runtime errors remain possible
TypeScript IS the right choice for: frontend (Tasklist, Inspector), worker SDK.
Mitigation strategies for Go cons¶
Verbose error handling¶
// Pattern 1: wrap the error with context (fmt.Errorf with %w)
if _, err := db.Exec(...); err != nil {
	return fmt.Errorf("failed to insert: %w", err)
}

// Pattern 2: helper function for a common call chain.
// Assumes a package-level db (*sql.DB) and imports of context and fmt.
func execWrapped(ctx context.Context, query string, args ...any) error {
	if _, err := db.ExecContext(ctx, query, args...); err != nil {
		return fmt.Errorf("exec failed: %w", err)
	}
	return nil
}
Verbose but explicit. Linters catch unhandled errors.
Lack of exceptions¶
Use panic only for unrecoverable bugs. Recover at goroutine boundaries:
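// Assumes imports of log/slog and runtime/debug; metrics is a project-local helper.
func processCommand(cmd Command) {
	defer func() {
		if r := recover(); r != nil {
			// Log the panic value and stack, count it, and keep the process alive.
			slog.Error("panic in processCommand", "panic", r, "stack", string(debug.Stack()))
			metrics.Inc("engine.panics")
		}
	}()
	// ...
}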
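A variation on the same idea is to convert the panic into an ordinary error, so the caller decides whether to retry, fail the command, or alert. This is a minimal sketch; `SafeRun` is a hypothetical helper, not part of any library.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// SafeRun executes fn and converts a panic into an error that carries the
// panic value and stack trace. The named return value lets the deferred
// function set the result after fn has panicked.
func SafeRun(name string, fn func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("panic in %s: %v\n%s", name, r, debug.Stack())
		}
	}()
	fn()
	return nil
}
```

Note that `recover` only catches panics raised in the goroutine where the deferred function runs, so each goroutine the engine spawns would need its own wrapper of this kind.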
Generics still maturing¶
Use generics where they help (collections, optionals). Don't force in early code.
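As an illustration of the two cases mentioned (optionals and collections), here is a small sketch. `Optional`, `Some`, `None`, and `Map` are hypothetical helpers written for this example, not types from the standard library.

```go
package main

// Optional holds a value that may be absent, avoiding nil-pointer sentinels.
type Optional[T any] struct {
	value T
	ok    bool
}

// Some wraps a present value.
func Some[T any](v T) Optional[T] { return Optional[T]{value: v, ok: true} }

// None returns an empty Optional of the requested type.
func None[T any]() Optional[T] { return Optional[T]{} }

// Get returns the value and whether it is present.
func (o Optional[T]) Get() (T, bool) { return o.value, o.ok }

// OrElse returns the value, or fallback when the value is absent.
func (o Optional[T]) OrElse(fallback T) T {
	if o.ok {
		return o.value
	}
	return fallback
}

// Map applies fn to each element of in and collects the results in order.
func Map[T, U any](in []T, fn func(T) U) []U {
	out := make([]U, 0, len(in))
	for _, v := range in {
		out = append(out, fn(v))
	}
	return out
}
```

Helpers like these remove a lot of repeated loop and nil-check code, but for one-off cases a plain loop or a `(value, ok)` return is usually clearer in Go.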
Decision: Go¶
Primary: Go 1.22+ for engine, CLI, workers (Go SDK).
Secondary: TypeScript for frontend (Tasklist, Inspector) and workers (TS SDK).
Tertiary: Python, Java SDKs for ecosystem coverage. REST API spec available for any language.
Honest caveats¶
This recommendation is for an MVP started from scratch by a mainstream team. Other teams should weigh the options differently:
- All-Rust team: Rust justified
- Camunda migrants: Java/Kotlin tempting (but consider strategic differentiation)
- All-Elixir team: Elixir justified
- Microservices already on .NET: C# acceptable
- Greenfield with no constraints: Go almost certainly right
Generates ADR-026¶
This analysis generates ADR-026 (formal decision):
ADR-026: Go as the implementation language for engine + CLI + Go SDK.
Status: Accepted.
Links¶
- adrs/adr-002-postgresql-as-state-store — Why Go pgx matters
- adrs/adr-006-single-threaded-per-partition — Goroutine fits actor model
- analysis/implementation-roadmap-concrete — Velocity assumptions
- analysis/failure-mode-analysis — Resilience requirements
- Temporal's Go choice rationale
- pgx driver
- Go 1 compatibility promise