REST API Design
Specification of the MVP's REST API. Conventions:
/v2/ prefix (Camunda compatibility), JSON request/response, cursor pagination, Idempotency-Key header, auto-generated OpenAPI 3.0 spec. Standardized error responses (RFC 7807 Problem Details). Versioning via URL path (/v2/ now, /v3/ in the future). Rate limiting via 429 + Retry-After. Webhook signing via HMAC. Auth via Bearer JWT (users) or Bearer API key (workers). Complete endpoint inventory: 60+ endpoints organized by resource. Aligned with the OpenAPI ecosystem (Swagger UI, client generation, Postman).
Design principles¶
- Camunda v2 compatibility: /v2/ prefix to ease migration
- REST conventions: nouns for resources, verbs only for actions
- JSON everywhere: request + response, no other formats
- Stateless: no session cookies, auth per request
- Self-describing: OpenAPI spec auto-generated, valid for client gen
- Standardized errors: RFC 7807 Problem Details
- Cursor pagination (not OFFSET) for large lists
- Idempotency opt-in via header
URL structure¶
Examples:
GET /v2/process-instances/12345
POST /v2/process-instances
DELETE /v2/process-instances/12345
GET /v2/process-instances/12345/variables
POST /v2/process-instances/12345/variables
POST /v2/jobs/activate
POST /v2/jobs/67890/complete
POST /v2/user-tasks/24681/assignment
DELETE /v2/user-tasks/24681/assignee
POST /v2/messages/payment-received/correlate
POST /v2/signals/maintenance-mode/broadcast
GET /v2/incidents
POST /v2/incidents/13579/resolve
Resource names: kebab-case, plural form, semantic.
Authentication¶
User auth (JWT Bearer)¶
JWT validated per ADR-014. Claims used:
- sub → user_id
- aud → must match engine audience
- iss → must match configured IdP
- Custom claim tenants → user's tenant memberships
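The claim checks listed above can be sketched roughly as follows. This is a minimal illustration, not the ADR-014 implementation: it assumes the JWT signature and expiry were already verified elsewhere, and the function name `check_claims` and its parameters are hypothetical.

```python
from typing import Any


def check_claims(claims: dict[str, Any], expected_aud: str, expected_iss: str) -> tuple[str, list[str]]:
    """Return (user_id, tenants) or raise ValueError.

    Assumes the token signature and expiry were verified before this call.
    """
    if claims.get("iss") != expected_iss:
        raise ValueError("iss does not match the configured IdP")

    aud = claims.get("aud")
    # RFC 7519 allows "aud" to be a single string or a list of strings.
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    if expected_aud not in audiences:
        raise ValueError("aud does not match the engine audience")

    sub = claims.get("sub")
    if not sub:
        raise ValueError("sub is missing")

    # Custom claim: tenant memberships; treated as empty when absent (assumption).
    tenants = list(claims.get("tenants") or [])
    return sub, tenants
```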
Worker auth (API key)¶
API key format:
- Prefix wfk_ (workflow key) for identification
- 32 random bytes, base64-encoded
- Hashed in DB (bcrypt)
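A key in this format could be generated along these lines. The sketch assumes the URL-safe base64 alphabet without padding (the format above only says "base64"), and the helper names are hypothetical. Hashing with bcrypt is left out because it needs a third-party library.

```python
import base64
import secrets

KEY_PREFIX = "wfk_"


def generate_api_key() -> str:
    """Build "wfk_" + base64 of 32 random bytes (URL-safe, unpadded: an assumption)."""
    raw = secrets.token_bytes(32)
    body = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
    return KEY_PREFIX + body


def looks_like_api_key(token: str) -> bool:
    """Cheap format check to run before any database lookup."""
    if not token.startswith(KEY_PREFIX):
        return False
    body = token[len(KEY_PREFIX):]
    try:
        padded = body + "=" * (-len(body) % 4)  # restore stripped padding
        return len(base64.urlsafe_b64decode(padded)) == 32
    except ValueError:
        return False
```

The prefix lets the auth layer tell an API key from a JWT without parsing either one.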
Standard headers¶
Request¶
| Header | Purpose | Required |
|---|---|---|
| Authorization | Bearer token/key | Yes |
| Content-Type | application/json | If body |
| Accept | application/json | Optional |
| Idempotency-Key | Client-generated idempotency | Optional |
| X-Request-ID | Request tracking | Optional |
| Prefer | wait=30s for sync mode | Optional |
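As an illustration of how a client might assemble these request headers, here is a small sketch. The function and parameter names are hypothetical; only the header names and the wait=30s form come from the table.

```python
import uuid
from typing import Optional


def build_request_headers(token: str, *, has_body: bool, idempotent: bool = False,
                          wait_seconds: Optional[int] = None) -> dict[str, str]:
    """Assemble the standard request headers for one call."""
    headers = {
        "Authorization": f"Bearer {token}",
        "Accept": "application/json",
        "X-Request-ID": str(uuid.uuid4()),  # lets the caller correlate logs
    }
    if has_body:
        headers["Content-Type"] = "application/json"
    if idempotent:
        # Reuse the same value when retrying the same logical request.
        headers["Idempotency-Key"] = str(uuid.uuid4())
    if wait_seconds is not None:
        headers["Prefer"] = f"wait={wait_seconds}s"
    return headers
```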
Response¶
| Header | Purpose |
|---|---|
| Content-Type | Always application/json |
| X-Request-ID | Echo of request, generated if missing |
| X-RateLimit-Limit | Rate limit ceiling |
| X-RateLimit-Remaining | Tokens remaining |
| X-RateLimit-Reset | Unix timestamp when bucket resets |
| Retry-After | Seconds to wait before retry (429/503) |
| ETag | For conditional GET (future) |
Response envelope¶
Successful list response:
- totalEstimate: NOT an exact count (avoids an expensive COUNT(*))
- nextCursor: opaque base64-encoded continuation token
- hasMore: boolean for cleaner client code
Singleton response:
No envelope wrapper for single resources.
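A list envelope carrying the three fields named above might be built like this. The `items` key is an assumption, since the field that holds the page of resources is not shown here. The function name is hypothetical too.

```python
from typing import Any, Optional


def list_envelope(items: list[dict[str, Any]], next_cursor: Optional[str],
                  total_estimate: int) -> dict[str, Any]:
    """Wrap one page of results; single resources are returned bare, without this wrapper."""
    return {
        "items": items,                      # assumed key name
        "totalEstimate": total_estimate,     # approximate, not COUNT(*)
        "nextCursor": next_cursor,           # opaque to the client
        "hasMore": next_cursor is not None,  # derived, so the two never disagree
    }
```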
Error responses (RFC 7807)¶
Standard format for ALL errors:
HTTP/1.1 404 Not Found
Content-Type: application/problem+json

{
  "type": "https://errors.mvp.dev/process-instance-not-found",
  "title": "Process Instance Not Found",
  "status": 404,
  "detail": "Process instance with key 12345 does not exist or you don't have access",
  "instance": "/v2/process-instances/12345",
  "code": "PROCESS_INSTANCE_NOT_FOUND"
}
Standard fields per RFC 7807:
- type: URI to error documentation
- title: Short human-readable summary
- status: HTTP status code (also in HTTP layer)
- detail: Specific to this occurrence
- instance: URI reference identifying this specific occurrence (here, the request path)
- code: Machine-readable error code (custom extension)
Validation errors:
HTTP/1.1 400 Bad Request
Content-Type: application/problem+json

{
  "type": "https://errors.mvp.dev/validation-failed",
  "title": "Validation Failed",
  "status": 400,
  "detail": "Request validation failed for 2 fields",
  "instance": "/v2/process-instances",
  "code": "VALIDATION_FAILED",
  "errors": [
    {
      "field": "bpmnProcessId",
      "code": "REQUIRED",
      "message": "Field is required"
    },
    {
      "field": "variables.amount",
      "code": "TYPE_MISMATCH",
      "message": "Expected number, got string"
    }
  ]
}
errors[] is a custom extension member carrying field-level details.
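The two example bodies above share one shape, so a single helper could produce both. In this sketch the `type` URL is derived from the error code by lowercasing it and swapping underscores for hyphens. That rule is inferred from the examples in this document, not a confirmed convention, and the helper name is hypothetical.

```python
from typing import Any, Optional

ERROR_DOCS_BASE = "https://errors.mvp.dev/"


def problem(status: int, code: str, title: str, detail: str, instance: str,
            errors: Optional[list[dict[str, str]]] = None) -> dict[str, Any]:
    """Build an RFC 7807 body with the custom "code" and optional "errors" extensions."""
    body: dict[str, Any] = {
        # Assumed slug rule: PROCESS_INSTANCE_NOT_FOUND -> process-instance-not-found
        "type": ERROR_DOCS_BASE + code.lower().replace("_", "-"),
        "title": title,
        "status": status,
        "detail": detail,
        "instance": instance,
        "code": code,
    }
    if errors:
        body["errors"] = errors  # field-level details, validation failures only
    return body
```

The response would be sent with Content-Type: application/problem+json, as in the examples.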
Error code catalog¶
| HTTP | Code | Cause |
|---|---|---|
| 400 | VALIDATION_FAILED | Field validation failed |
| 400 | INVALID_BPMN | BPMN parsing/validation error |
| 400 | INVALID_EXPRESSION | CEL expression invalid |
| 401 | UNAUTHENTICATED | No/invalid auth token |
| 401 | TOKEN_EXPIRED | JWT expired |
| 403 | INSUFFICIENT_PERMISSIONS | Auth OK, but no permission |
| 404 | PROCESS_INSTANCE_NOT_FOUND | PI doesn't exist |
| 404 | PROCESS_DEFINITION_NOT_FOUND | PD doesn't exist |
| 404 | JOB_NOT_FOUND | Job doesn't exist |
| 404 | USER_TASK_NOT_FOUND | Task doesn't exist |
| 409 | STATE_CONFLICT | Resource not in expected state |
| 409 | JOB_ALREADY_COMPLETED | Trying to complete completed job |
| 410 | RESOURCE_DELETED | Resource was deleted |
| 413 | PAYLOAD_TOO_LARGE | Variables > 100KB |
| 422 | BUSINESS_RULE_VIOLATION | Domain logic violation |
| 429 | RATE_LIMITED | Tenant rate limit exceeded |
| 500 | INTERNAL_ERROR | Engine bug, see logs |
| 503 | ENGINE_OVERLOADED | Queue full, retry |
| 504 | OPERATION_TIMEOUT | Sync operation timed out |
Pagination¶
Cursor-based (preferred for large lists)¶
Request:
Response:
The cursor is opaque (base64 of internal state); clients must treat it as a black box and pass it back unchanged.
Server implementation:
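import base64
import json

def encode_cursor(state):
    """Serialize pagination state into an opaque, URL-safe token."""
    return base64.urlsafe_b64encode(json.dumps(state).encode()).decode()

def decode_cursor(cursor):
    """Inverse of encode_cursor."""
    return json.loads(base64.urlsafe_b64decode(cursor.encode()).decode())

# In query (wrapped in a function so the await is valid; the name is illustrative):
async def list_process_instances(db, tenant_id, cursor, limit):
    state = decode_cursor(cursor) if cursor else None
    last_key = state['key'] if state else None
    # Keyset predicate: only rows strictly older than the last key already seen.
    results = await db.fetch_all("""
        SELECT * FROM process_instances
        WHERE tenant_id = $1
          AND ($2::bigint IS NULL OR process_instance_key < $2)
        ORDER BY process_instance_key DESC
        LIMIT $3
    """, tenant_id, last_key, limit)
    # A full page suggests more rows may follow; the column is process_instance_key.
    full_page = bool(results) and len(results) == limit
    next_cursor = encode_cursor({'key': results[-1]['process_instance_key']}) if full_page else None
    return results, next_cursor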
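Because each page is an index seek on process_instance_key followed by at most limit rows, page cost does not grow with how deep the client has paged (unlike OFFSET) and stays close to constant as the dataset grows.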
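To see the keyset technique end to end, the sketch below runs it against an in-memory SQLite table instead of the production database. It also shows one possible variation: fetching limit + 1 rows to decide whether more data exists, which avoids the empty trailing page that a `len(results) == limit` check produces when the row count is an exact multiple of limit. The table contents and the `fetch_page` helper are illustrative assumptions.

```python
import base64
import json
import sqlite3


def encode_cursor(state: dict) -> str:
    return base64.urlsafe_b64encode(json.dumps(state).encode()).decode()


def decode_cursor(cursor: str) -> dict:
    return json.loads(base64.urlsafe_b64decode(cursor.encode()).decode())


def fetch_page(conn, tenant_id, cursor, limit):
    """Return (keys, next_cursor) for one page, newest key first."""
    last_key = decode_cursor(cursor)["key"] if cursor else None
    rows = conn.execute(
        """
        SELECT process_instance_key FROM process_instances
        WHERE tenant_id = ?
          AND (? IS NULL OR process_instance_key < ?)
        ORDER BY process_instance_key DESC
        LIMIT ?
        """,
        (tenant_id, last_key, last_key, limit + 1),  # one extra row as a look-ahead
    ).fetchall()
    has_more = len(rows) > limit
    keys = [row[0] for row in rows[:limit]]
    next_cursor = encode_cursor({"key": keys[-1]}) if has_more else None
    return keys, next_cursor


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE process_instances ("
    "tenant_id TEXT, process_instance_key INTEGER, "
    "PRIMARY KEY (tenant_id, process_instance_key))"
)
conn.executemany(
    "INSERT INTO process_instances VALUES ('acme', ?)", [(k,) for k in range(1, 21)]
)

# Walk all pages the way a client would, feeding each cursor back in.
pages, cursor = [], None
while True:
    keys, cursor = fetch_page(conn, "acme", cursor, limit=10)
    pages.append(keys)
    if cursor is None:
        break
```

With 20 rows and a limit of 10, the loop stops after two pages because the second fetch finds no look-ahead row.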
Offset-based (small lists only)¶
For lists known to be small (process definitions, tenants):
Returns total count (small list, COUNT(*) OK):
Idempotency¶
For state-changing operations (POST/PATCH/DELETE):
POST /v2/process-instances HTTP/1.1
Idempotency-Key: client-generated-uuid-here
Content-Type: application/json
{...}
The server caches the response by (tenant_id, idempotency_key) for 24 hours; a repeated request with the same key returns the cached response instead of executing again. See concepts/api-engine-serialization.
CREATE TABLE idempotency_keys (
    tenant_id       TEXT NOT NULL,
    key             TEXT NOT NULL,
    request_hash    TEXT NOT NULL,  -- detect different request with same key
    response_status INT NOT NULL,
    response_body   JSONB NOT NULL,
    created_at      TIMESTAMPTZ DEFAULT NOW(),
    expires_at      TIMESTAMPTZ DEFAULT NOW() + INTERVAL '24 hours',
    PRIMARY KEY (tenant_id, key)
);
If a key is reused with a different request_hash, the server returns 409 with an explanation.
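The lookup logic around that table can be sketched with an in-memory dict standing in for the database. Hashing a canonical JSON encoding of the body is an assumption, since how request_hash is computed is not specified here. The `handle` function and its `execute` callback are hypothetical names.

```python
import hashlib
import json
import time
from typing import Any, Callable, Optional

TTL_SECONDS = 24 * 60 * 60

# (tenant_id, key) -> (request_hash, response_status, response_body, expires_at)
_store: dict[tuple[str, str], tuple[str, int, Any, float]] = {}


def request_hash(body: Any) -> str:
    """Hash a canonical JSON form so key order does not change the digest (assumed scheme)."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()


def handle(tenant_id: str, key: str, body: Any,
           execute: Callable[[Any], tuple[int, Any]],
           now: Optional[float] = None) -> tuple[int, Any]:
    """Run execute(body) at most once per (tenant_id, key) within the TTL."""
    now = time.time() if now is None else now
    digest = request_hash(body)
    entry = _store.get((tenant_id, key))
    if entry is not None and entry[3] > now:
        if entry[0] != digest:
            # Same key, different request: refuse rather than replay the wrong response.
            return 409, {"status": 409, "detail": "Idempotency-Key reused with a different request"}
        return entry[1], entry[2]  # replay the cached response
    status, response = execute(body)
    _store[(tenant_id, key)] = (digest, status, response, now + TTL_SECONDS)
    return status, response
```

A real implementation would also need to handle two concurrent first requests with the same key, for example by inserting the row before executing and relying on the primary key to reject the second one.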
Webhooks (outbound)¶
For event notifications to external systems:
POST https://customer-app.example.com/webhook HTTP/1.1
Content-Type: application/json
X-MVP-Signature: t=1715731200,v1=a1b2c3...
X-MVP-Event: process_instance.completed

{
  "eventType": "process_instance.completed",
  "data": {
    "processInstanceKey": 12345,
    "bpmnProcessId": "order-approval",
    ...
  },
  "timestamp": "2025-05-14T12:00:00Z"
}
Signing protocol (HMAC-SHA256)¶
import hashlib
import hmac
import time

def sign_webhook(payload: bytes, secret: str, timestamp: int) -> str:
    # Sign "<timestamp>.<raw body>" so the timestamp is covered by the MAC.
    signed = f"{timestamp}.".encode() + payload
    digest = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()
    return f"t={timestamp},v1={digest}"

def verify_webhook(payload: bytes, signature: str, secret: str, max_age: int = 300) -> bool:
    # Parse signature header
    try:
        parts = dict(p.split('=', 1) for p in signature.split(','))
        timestamp = int(parts['t'])
    except (KeyError, ValueError):
        return False  # malformed header
    # Reject if too old (prevent replay)
    if abs(time.time() - timestamp) > max_age:
        return False
    expected = sign_webhook(payload, secret, timestamp)
    # Constant-time comparison to avoid leaking how many characters matched
    return hmac.compare_digest(signature.encode(), expected.encode())
This follows the timestamped signature scheme Stripe uses for its webhooks.
Endpoints inventory¶
Deployments¶
POST /v2/deployments Deploy resource (BPMN, form)
GET /v2/deployments/{key} Get deployment details
GET /v2/deployments List deployments (paginated)
Process definitions¶
GET /v2/process-definitions/{key} Get definition
GET /v2/process-definitions/{key}/xml Get BPMN XML
GET /v2/process-definitions List/search
DELETE /v2/process-definitions/{key} Delete (only if no active instances)
Process instances¶
POST /v2/process-instances Create instance
POST /v2/process-instances/search Search (complex filters)
GET /v2/process-instances/{key} Get instance
DELETE /v2/process-instances/{key} Cancel
GET /v2/process-instances/{key}/variables List variables
POST /v2/process-instances/{key}/variables Update variables
GET /v2/process-instances/{key}/element-instances Element tree
GET /v2/process-instances/{key}/events History timeline
GET /v2/process-instances/{key}/incidents Active incidents
Jobs¶
POST /v2/jobs/activate Activate jobs (workers)
POST /v2/jobs/{key}/complete Complete
POST /v2/jobs/{key}/fail Fail (with retries)
POST /v2/jobs/{key}/throw-error BPMN error
PATCH /v2/jobs/{key} Update (extend timeout, retries)
User tasks¶
POST /v2/user-tasks/search Search tasks
GET /v2/user-tasks/{key} Get task
POST /v2/user-tasks/{key}/assignment Claim
DELETE /v2/user-tasks/{key}/assignee Unclaim
POST /v2/user-tasks/{key}/completion Complete
PATCH /v2/user-tasks/{key} Update vars
GET /v2/user-tasks/{key}/form Get form schema
GET /v2/user-tasks/{key}/variables Get variables
Incidents¶
GET /v2/incidents/{key} Get incident
POST /v2/incidents/search Search
POST /v2/incidents/{key}/resolve Resolve
Messages¶
POST /v2/messages/{name}/correlate Correlate to instance
POST /v2/messages/{name}/publish Publish (no correlation)
Signals¶
Variables (direct access)¶
Forms¶
Identity (Phase 1 minimal)¶
GET /v2/users/me Current user info
GET /v2/users/me/tenants User's tenants
POST /v2/auth/token-exchange OIDC token exchange
POST /v2/auth/logout Logout (revoke tokens)
Admin (admin role only)¶
POST /v2/tenants Create tenant
GET /v2/tenants List
DELETE /v2/tenants/{id} Delete
POST /v2/api-keys Create API key
DELETE /v2/api-keys/{id} Delete API key
POST /v2/users/{id}/roles Assign role
DELETE /v2/users/{id}/roles/{role} Remove role
Operations (admin/operator)¶
POST /v2/operations/batch-cancel-instances Batch cancel
POST /v2/operations/batch-resolve-incidents Batch resolve
GET /v2/operations/{key} Operation status
GET /v2/operations List operations
Audit (admin)¶
POST /v2/audit/search Search audit log
GET /v2/audit/{id} Get audit entry
GET /v2/audit/export Export (compliance)
System (no auth or admin)¶
GET /health Liveness
GET /ready Readiness
GET /metrics Prometheus metrics
GET /v2/system/info Version, capabilities
GET /v2/system/queue-depth Engine queue depth (admin)
Total: ~60 endpoints organized in ~12 resource groups.
Versioning¶
URL path-based: /v2/, /v3/ future.
Within /v2/, additive changes only (new fields, new endpoints). Breaking changes require new version.
Deprecation policy:
- Announce deprecation 6 months ahead
- Sunset after 1 year minimum
- Sunset HTTP header on deprecated endpoints
Sunset: Fri, 01 Jan 2027 00:00:00 GMT
Deprecation: true
Link: </v3/process-instances>; rel="successor-version"
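On the client side, the Sunset header can be turned into a warning before the endpoint disappears. A minimal sketch, assuming response headers arrive as a plain mapping with exact-case names; the function name is hypothetical.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def days_until_sunset(headers: Mapping[str, str], now: datetime) -> Optional[int]:
    """Whole days left before the endpoint is retired, or None if it is not deprecated."""
    sunset = headers.get("Sunset")
    if sunset is None:
        return None
    # Sunset uses the HTTP-date format, which the stdlib email parser understands.
    when = parsedate_to_datetime(sunset)
    return (when - now).days


example = {"Sunset": "Fri, 01 Jan 2027 00:00:00 GMT", "Deprecation": "true"}
remaining = days_until_sunset(example, datetime(2026, 12, 2, tzinfo=timezone.utc))
```

An SDK could log a warning once `remaining` drops below some threshold of its choosing.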
OpenAPI 3.0 spec¶
Auto-generated from code (via decorators/annotations). Published at:
Swagger UI for interactive exploration:
Auto-generated SDKs from spec:
openapi-generator-cli generate \
-i https://mvp.example.com/v2/openapi.json \
-g typescript-axios \
-o ./mvp-sdk-ts/
Rate limit headers¶
Per concepts/backpressure-rest-strategy:
When exceeded:
HTTP/1.1 429 Too Many Requests
Retry-After: 1
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1715731261

{
  "type": "https://errors.mvp.dev/rate-limited",
  "title": "Rate Limit Exceeded",
  "status": 429,
  "detail": "Tenant 'acme' exceeded 100 req/s limit",
  "code": "RATE_LIMITED"
}
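A client can use these headers to decide how long to pause after a 429. The sketch below prefers Retry-After and falls back to X-RateLimit-Reset. The one-second default when neither header is usable is an assumption, as is the function name.

```python
from typing import Mapping


def retry_delay_seconds(headers: Mapping[str, str], now: float) -> float:
    """Seconds to wait before retrying a rate-limited request."""
    retry_after = headers.get("Retry-After")
    if retry_after is not None and retry_after.isdigit():
        return float(retry_after)
    reset = headers.get("X-RateLimit-Reset")
    if reset is not None and reset.isdigit():
        # Reset is a Unix timestamp; never return a negative wait.
        return max(0.0, float(reset) - now)
    return 1.0  # assumed fallback when the server gave no hint
```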
CORS policy¶
For browser clients (Inspector, Tasklist):
Access-Control-Allow-Origin: https://app.mvp.example.com
Access-Control-Allow-Methods: GET, POST, PATCH, DELETE, OPTIONS
Access-Control-Allow-Headers: Content-Type, Authorization, Idempotency-Key
Access-Control-Max-Age: 86400
Configurable per deployment. Default: same-origin only.
Content negotiation¶
Only application/json accepted. No XML, no MessagePack (per ADR-017).
If Accept: application/xml → 406 Not Acceptable.
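The 406 rule could be checked with a small Accept parser like the one below. It is deliberately simplified: it ignores q-values (so a type listed with q=0 would still match), and the function name is hypothetical.

```python
from typing import Optional


def accepts_json(accept_header: Optional[str]) -> bool:
    """True if the Accept header allows application/json; otherwise respond 406."""
    if not accept_header:
        return True  # a missing Accept header means any media type is acceptable
    for part in accept_header.split(","):
        # Drop parameters such as ";q=0.8" and normalize case.
        media_type = part.split(";", 1)[0].strip().lower()
        if media_type in ("application/json", "application/*", "*/*"):
            return True
    return False
```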
API versioning strategy¶
Major versions (v2, v3):
- URL path
- Breaking changes
- Maintained in parallel during deprecation period
Minor versions:
- Header-based (optional): MVP-API-Version: 2.5.0
- Backwards-compatible additions
- Feature flags
Patch versions:
- Bug fixes only
- No client-visible changes
Webhook delivery semantics¶
Same as job delivery: at-least-once, with retries.
Engine emits event
↓
Webhook delivery service (separate worker process)
↓
POST to webhook URL with HMAC signature
↓
If response 2xx → success, mark delivered
If response 4xx (except 408, 429) → permanent failure, alert tenant
If response 5xx, 408, 429, timeout → retry with backoff
Backoff: exponential with jitter. Max retries: 10. After exhausted: dead letter queue.
Because delivery is at-least-once, the same event can arrive more than once; customers must make their webhook receivers idempotent.
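The delivery rules above translate into a small decision function plus a backoff calculation. This sketch uses "full jitter" (a uniform random delay up to the exponential ceiling). The base delay of 1 second and the 300 second cap are assumptions, as only "exponential with jitter" and a maximum of 10 retries are specified. All function names are hypothetical.

```python
import random
from typing import Callable, Optional

MAX_RETRIES = 10


def classify(status: Optional[int]) -> str:
    """Map a delivery attempt result to an outcome; status None means a timeout."""
    if status is None:
        return "retry"
    if 200 <= status < 300:
        return "delivered"
    if status in (408, 429) or status >= 500:
        return "retry"
    # Remaining 4xx are permanent; other statuses (e.g. 3xx) are treated the same (assumption).
    return "permanent_failure"


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 300.0,
                  rng: Callable[[], float] = random.random) -> float:
    """Full jitter: random delay in [0, min(cap, base * 2**attempt))."""
    return rng() * min(cap, base * (2 ** attempt))


def next_action(status: Optional[int], retries_done: int) -> str:
    """Like classify, but routes exhausted retries to the dead letter queue."""
    outcome = classify(status)
    if outcome == "retry" and retries_done >= MAX_RETRIES:
        return "dead_letter"
    return outcome
```

Jitter spreads retries from many failed deliveries over time, so a receiver that is recovering does not get hit by all of them at once.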
SDK considerations¶
Per ADR-016 (minimal SDK):
import { MVPClient } from '@mvp/sdk';

const client = new MVPClient({
  endpoint: 'https://mvp.example.com',
  apiKey: process.env.MVP_API_KEY,
  timeout: 30000,
  retries: 3,
  rateLimitBackoff: true
});

// All operations
const pi = await client.processInstances.create({...});
const tasks = await client.userTasks.search({...});
const jobs = await client.jobs.activate({...});
Generated TypeScript types from OpenAPI provide full type safety.
Links¶
- concepts/api-engine-serialization — Internal serialization
- concepts/backpressure-rest-strategy — Rate limiting
- adrs/adr-014-oidc-single-idp — Auth
- adrs/adr-017-json-serialization-not-sbe — JSON only
- RFC 7807 Problem Details
- OpenAPI 3.0
- Stripe webhooks signing