ADR-017: JSON serialization
- Status: Accepted
- Date: 2026-05-14
- Tags: data, serialization, performance
Context and Problem Statement
Camunda uses SBE (10-50 ns encode/decode) for its internal protocol and MessagePack for variables (50-200 ns). That gives extreme performance, but it requires custom schemas and is not human-readable. What should the MVP use?
Decision Drivers
- The MVP performance target is single-digit milliseconds, not nanoseconds
- The Postgres round trip dominates serialization cost
- JSON is debuggable (it can be read in logs, SQL queries, etc.)
- Postgres JSONB is optimized
- The JSON ecosystem is massive (tools, libraries, IDE support)
Considered Options
- JSON only (text + JSONB in Postgres)
- MessagePack (compact, binary, self-describing)
- Protobuf (binary, schema-based)
- SBE (ultra-fast but complex)
- Mixed: JSON externally, binary internally
Decision Outcome
Chosen option: JSON only, because:
- Performance is sufficient for the MVP target
- Debuggability is top-tier
- Postgres JSONB is efficient
- The ecosystem is complete
- Reading event_log with SQL is trivial
Positive Consequences
- Logs are readable
- SQL queries run directly against payloads
- Standard tools work (jq, IDE support)
- Onboarding is trivial
- Migration from Camunda is traceable (Camunda also uses JSON externally)
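Negative Consequences
- ~100-1000x slower than SBE at the serialization step (1-10 μs vs 10-50 ns per the benchmarks in this ADR)
- Larger storage size (~2-5x vs SBE)
- JSON parsing overhead on every request
- If performance becomes critical, the format will have to change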
Performance reality check
Benchmarks (from concepts/sbe-serialization):
| Format | Encode | Decode | Self-describing |
|---|---|---|---|
| SBE | 10-50 ns | 10-50 ns | NO |
| Protobuf | 100-500 ns | 100-500 ns | NO |
| MessagePack | 50-200 ns | 50-200 ns | YES |
| JSON | 1-10 μs | 1-10 μs | YES |
JSON is 100-1000x slower than SBE. But:
Total request time = Network + Parse + Process + DB + Response
For the MVP:
  Network:      1-5 ms
  Parse (JSON): 0.1-1 ms   ← optimization headroom here: ~1-5% of total
  Process:      1-10 ms
  DB:           5-50 ms    ← dominant
  Response:     1-5 ms
  Total:        ~10-70 ms
Cutting JSON parsing from 1 ms to 0.01 ms (100x) saves about 1 ms per request, roughly 1% of the total at the slow end. That is not worth the complexity of SBE.
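The arithmetic behind this budget can be checked with a short sketch. The stage ranges are copied from the breakdown above; the names `BUDGET_MS`, `total_ms`, and `share_of_total` are illustrative only and not part of any engine code.

```python
# Per-request latency budget in milliseconds, as (low, high) ranges
# taken from the MVP breakdown in this ADR.
BUDGET_MS = {
    "network": (1.0, 5.0),
    "parse_json": (0.1, 1.0),
    "process": (1.0, 10.0),
    "db": (5.0, 50.0),
    "response": (1.0, 5.0),
}


def total_ms(budget):
    """Sum the low ends and the high ends of every stage."""
    low = sum(lo for lo, _ in budget.values())
    high = sum(hi for _, hi in budget.values())
    return low, high


def share_of_total(budget, stage):
    """Share of one stage when every stage is at its low end, and at its high end."""
    low_total, high_total = total_ms(budget)
    lo, hi = budget[stage]
    return lo / low_total, hi / high_total


if __name__ == "__main__":
    print("total ms (low, high):", total_ms(BUDGET_MS))
    for stage in BUDGET_MS:
        lo_share, hi_share = share_of_total(BUDGET_MS, stage)
        print(f"{stage:>10}: {lo_share:.1%} to {hi_share:.1%}")
```

Comparing like ends of each range, JSON parsing stays under 2% of the request while the database accounts for well over half, which is the basis for the conclusion above.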
Storage in Postgres
-- JSONB is efficient
CREATE TABLE event_log (
    position  BIGSERIAL PRIMARY KEY,
    intent    TEXT NOT NULL,
    payload   JSONB NOT NULL,
    timestamp TIMESTAMPTZ DEFAULT NOW()
);
-- Indexes on specific JSON fields
CREATE INDEX idx_event_pi ON event_log ((payload->>'processInstanceKey'));
-- Direct queries
SELECT payload->>'processInstanceKey' AS pid,
       payload->>'elementId' AS element,
       intent
FROM event_log
WHERE payload->>'processInstanceKey' = '12345'
ORDER BY position;
JSONB is stored in a decomposed binary format, not as raw text.
Process variables
// serialized process variables
{
  "customerId": "CUST-001",
  "orderItems": [
    { "sku": "WIDGET-A", "qty": 2, "price": 19.99 },
    { "sku": "GADGET-B", "qty": 1, "price": 49.99 }
  ],
  "shippingAddress": {
    "street": "123 Main St",
    "city": "Springfield"
  }
}
The MessagePack encoding of the same data is smaller but illegible.
JSON wins on developer experience.
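To make the size and legibility trade-off concrete, here is a minimal sketch that encodes the variables above both ways. It does not use the `msgpack` library: `pack_subset` is a hypothetical, hand-rolled encoder covering only the handful of MessagePack types this example needs (nil, bool, small non-negative ints, float 64, short strings, small arrays and maps), so it is an illustration rather than a usable serializer.

```python
import json
import struct


def pack_subset(obj) -> bytes:
    """Encode obj with a small subset of the MessagePack format (sketch only)."""
    if obj is None:
        return b"\xc0"                                   # nil
    if isinstance(obj, bool):                            # check before int: bool subclasses int
        return b"\xc3" if obj else b"\xc2"
    if isinstance(obj, int):
        if 0 <= obj <= 0x7F:
            return bytes([obj])                          # positive fixint
        raise ValueError("only integers 0..127 are supported in this sketch")
    if isinstance(obj, float):
        return b"\xcb" + struct.pack(">d", obj)          # float 64, big-endian
    if isinstance(obj, str):
        raw = obj.encode("utf-8")
        if len(raw) <= 31:
            return bytes([0xA0 | len(raw)]) + raw        # fixstr
        if len(raw) <= 0xFF:
            return bytes([0xD9, len(raw)]) + raw         # str 8
        raise ValueError("string too long for this sketch")
    if isinstance(obj, list):
        if len(obj) > 15:
            raise ValueError("array too long for this sketch")
        return bytes([0x90 | len(obj)]) + b"".join(pack_subset(x) for x in obj)
    if isinstance(obj, dict):
        if len(obj) > 15:
            raise ValueError("map too large for this sketch")
        body = b"".join(pack_subset(k) + pack_subset(v) for k, v in obj.items())
        return bytes([0x80 | len(obj)]) + body           # fixmap
    raise TypeError(f"unsupported type: {type(obj).__name__}")


variables = {
    "customerId": "CUST-001",
    "orderItems": [
        {"sku": "WIDGET-A", "qty": 2, "price": 19.99},
        {"sku": "GADGET-B", "qty": 1, "price": 49.99},
    ],
    "shippingAddress": {"street": "123 Main St", "city": "Springfield"},
}

json_bytes = json.dumps(variables, separators=(",", ":")).encode("utf-8")
packed = pack_subset(variables)

if __name__ == "__main__":
    print("JSON bytes:       ", len(json_bytes))
    print("MessagePack bytes:", len(packed))
    print("first bytes (hex):", packed[:24].hex())
```

The binary form saves some bytes on this payload, but a hex dump is all a log line or an ad hoc SQL query would show.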
Variable size limit
Recall analysis/intuit-production-benchmarks: variables larger than 100-150 KB slow down exports. Enforce a limit:
ALTER TABLE variables
    ADD CONSTRAINT max_variable_size
    CHECK (octet_length(value::text) <= 102400); -- 100 KB
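An application-side check can reject oversized variables before they reach the database, so the caller gets a clear error instead of a constraint violation. The sketch below assumes the variable value is JSON-serializable; `check_variable_size` and `MAX_VARIABLE_BYTES` are hypothetical names. The size it computes only approximates `octet_length(value::text)`, because Postgres normalizes JSONB text on output, so the database constraint remains the source of truth.

```python
import json

MAX_VARIABLE_BYTES = 102_400  # 100 KB, the same threshold as the CHECK constraint


def variable_size_bytes(value) -> int:
    """UTF-8 size of the JSON text for value (an approximation of the stored text size)."""
    text = json.dumps(value, ensure_ascii=False)
    return len(text.encode("utf-8"))


def check_variable_size(name: str, value) -> int:
    """Return the serialized size, or raise ValueError if it exceeds the limit."""
    size = variable_size_bytes(value)
    if size > MAX_VARIABLE_BYTES:
        raise ValueError(
            f"variable {name!r} is {size} bytes; the limit is {MAX_VARIABLE_BYTES} bytes"
        )
    return size


if __name__ == "__main__":
    print(check_variable_size("customerId", "CUST-001"))
    try:
        check_variable_size("blob", "x" * 200_000)
    except ValueError as exc:
        print("rejected:", exc)
```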
If a user needs to store more than 100 KB, externalize it.
Do NOT store large documents such as a PDF in variables.
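One way to externalize is to put the document in a separate blob store and keep only a small reference in the process variable. The following is a sketch under that assumption: `InMemoryBlobStore` stands in for whatever storage is actually used (object storage, filesystem, a dedicated table), and the reference field names (`blobKey`, `sizeBytes`, `contentType`) are invented for illustration.

```python
import hashlib
import json


class InMemoryBlobStore:
    """Hypothetical stand-in for an external store; keeps blobs in a dict."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # Content-addressed key: identical documents map to the same key.
        key = hashlib.sha256(data).hexdigest()
        self._blobs[key] = data
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]


def document_reference(data: bytes, store, content_type: str) -> dict:
    """Store the document externally and return a small JSON-serializable reference."""
    return {
        "blobKey": store.put(data),
        "sizeBytes": len(data),
        "contentType": content_type,
    }


if __name__ == "__main__":
    store = InMemoryBlobStore()
    pdf = b"%PDF-1.7\n" + b"\x00" * 500_000      # fake 500 KB document
    variables = {"invoice": document_reference(pdf, store, "application/pdf")}
    print(json.dumps(variables))                 # a few hundred bytes, not 500 KB
```

The variable stays far below the 100 KB limit regardless of document size, and the event log remains readable.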
When to reconsider
Switch to a binary format IF:
- Required throughput exceeds 50K TPS
- The profiler shows JSON parsing taking more than 20% of request time
- Storage cost becomes an issue ($$$ spent on the database)
- Network bandwidth becomes a bottleneck
For 99% of cases, JSON wins. Premature optimization is the root of all evil.
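The criteria above can be written down as an explicit check to run against measured numbers during a performance review. This is only a sketch: `EngineMetrics` and `reasons_to_reconsider` are hypothetical names, and since this ADR gives no numeric thresholds for storage cost or bandwidth, those two are modeled as plain flags.

```python
from dataclasses import dataclass

TPS_THRESHOLD = 50_000            # from this ADR: throughput above 50K TPS
JSON_PARSE_SHARE_THRESHOLD = 0.20  # from this ADR: JSON parsing above 20% of time


@dataclass
class EngineMetrics:
    required_tps: float
    json_parse_time_share: float   # fraction of request time, 0.0 to 1.0
    storage_cost_is_issue: bool
    network_is_bottleneck: bool


def reasons_to_reconsider(m: EngineMetrics) -> list:
    """Return the triggered criteria; an empty list means JSON is still fine."""
    reasons = []
    if m.required_tps > TPS_THRESHOLD:
        reasons.append("required throughput above 50K TPS")
    if m.json_parse_time_share > JSON_PARSE_SHARE_THRESHOLD:
        reasons.append("JSON parsing above 20% of request time")
    if m.storage_cost_is_issue:
        reasons.append("storage cost is an issue")
    if m.network_is_bottleneck:
        reasons.append("network bandwidth is a bottleneck")
    return reasons


if __name__ == "__main__":
    mvp = EngineMetrics(1_000, 0.014, False, False)
    print(reasons_to_reconsider(mvp) or "stay on JSON")
```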
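Migration path (if needed)
If a switch becomes necessary:
import json
from dataclasses import dataclass

import msgpack  # third-party package: pip install msgpack

@dataclass
class Record:
    format: str   # format header: 'json' or 'msgpack'
    bytes: bytes  # serialized payload

def decode(record):
    # The engine handles both formats by checking the header
    if record.format == 'json':
        return json.loads(record.bytes)
    elif record.format == 'msgpack':
        return msgpack.unpackb(record.bytes)
    raise ValueError(f"unknown record format: {record.format!r}")

def encode(data):
    # Newly written records use the current format ('msgpack', eventually)
    return Record(format='msgpack', bytes=msgpack.packb(data))
Lazy migration: old records stay JSON, new records are written as MessagePack. The engine handles both.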
Links
- concepts/sbe-serialization: SBE details (for contrast)
- analysis/intuit-production-benchmarks: variable size limits
- Postgres JSONB docs