REST API Design

REST API spec for the MVP. Conventions: /v2/ prefix (Camunda compatibility), JSON request/response, cursor pagination, Idempotency-Key header, auto-generated OpenAPI 3.0 spec. Standardized error responses (RFC 7807 Problem Details). Versioning via URL path (/v2/ now, /v3/ in the future). Rate limiting via 429 + Retry-After. Webhook signing via HMAC. Auth via Bearer JWT (users) or Bearer API key (workers). Complete endpoint inventory: 60+ endpoints organized by resource. Aligned with the OpenAPI ecosystem (Swagger UI, client generation, Postman).

Design principles

Camunda v2 compatibility: /v2/ prefix to ease migration
REST conventions: nouns for resources, verbs only for actions
JSON everywhere: request + response, no other formats
Stateless: no session cookies, auth per request
Self-describing: OpenAPI spec auto-generated, valid for client gen
Standardized errors: RFC 7807 Problem Details
Cursor pagination (not OFFSET) for large lists
Idempotency opt-in via header

URL structure

https://<host>/v2/<resource>[/<id>][/<action>][?<query>]

Examples:

GET    /v2/process-instances/12345
POST   /v2/process-instances
DELETE /v2/process-instances/12345
GET    /v2/process-instances/12345/variables
POST   /v2/process-instances/12345/variables
POST   /v2/jobs/activate
POST   /v2/jobs/67890/complete
POST   /v2/user-tasks/24681/assignment
DELETE /v2/user-tasks/24681/assignee
POST   /v2/messages/payment-received/correlate
POST   /v2/signals/maintenance-mode/broadcast
GET    /v2/incidents
POST   /v2/incidents/13579/resolve

Resource names: kebab-case, plural form, semantic.

Authentication

User auth (JWT Bearer)

GET /v2/process-instances HTTP/1.1
Authorization: Bearer eyJhbGciOiJSUzI1NiIsImtpZCI6Ii...

JWT validated per ADR-014. Claims used:
- sub → user_id
- aud → must match engine audience
- iss → must match configured IdP
- Custom claim tenants → user's tenant memberships
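The claim checks listed above could look roughly like this once the token's signature and expiry have been verified elsewhere. It is a minimal sketch: `check_claims`, `ClaimError`, and the audience and issuer values passed in are hypothetical names for illustration and are not taken from ADR-014.

```python
class ClaimError(Exception):
    """Raised when a verified JWT carries unacceptable claims."""


def check_claims(claims: dict, audience: str, issuer: str) -> dict:
    """Map verified JWT claims to the identity the engine uses.

    Assumes signature and expiry were already validated upstream.
    """
    if claims.get("iss") != issuer:
        raise ClaimError("iss does not match configured IdP")

    # Per RFC 7519, aud may be a single string or a list of strings.
    aud = claims.get("aud")
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    if audience not in audiences:
        raise ClaimError("aud does not match engine audience")

    if not claims.get("sub"):
        raise ClaimError("sub is missing")

    # "tenants" is the custom claim holding tenant memberships.
    return {"user_id": claims["sub"], "tenants": list(claims.get("tenants", []))}
```

Returning a small identity dict keeps the rest of the request pipeline independent of the raw token layout.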

Worker auth (API key)

POST /v2/jobs/activate HTTP/1.1
Authorization: Bearer wfk_a1b2c3d4e5f6...

API key format:
- Prefix wfk_ (workflow key) for identification
- 32 random bytes, base64-encoded
- Stored only as a bcrypt hash in the DB
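A key in this format can be generated with the standard library alone. The sketch below assumes the URL-safe base64 alphabet (the spec only says base64) and uses hypothetical helper names; the bcrypt hashing step is left out because it needs a third-party package.

```python
import base64
import secrets

KEY_PREFIX = "wfk_"


def generate_api_key() -> str:
    """Return a new API key: the wfk_ prefix plus 32 random bytes, base64-encoded."""
    raw = secrets.token_bytes(32)
    # URL-safe alphabet (an assumption) keeps the key free of '+' and '/'.
    return KEY_PREFIX + base64.urlsafe_b64encode(raw).decode("ascii")


def looks_like_api_key(token: str) -> bool:
    """Cheap format check to route a Bearer token to API-key auth rather than JWT auth."""
    if not token.startswith(KEY_PREFIX):
        return False
    try:
        raw = base64.urlsafe_b64decode(token[len(KEY_PREFIX):])
    except ValueError:  # binascii.Error is a ValueError subclass
        return False
    return len(raw) == 32
```

The prefix check is what lets one `Authorization: Bearer` header serve both users and workers without trying to parse an API key as a JWT.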

Standard headers

Request

Header	Purpose	Required
`Authorization`	Bearer token/key	Yes
`Content-Type`	`application/json`	If body
`Accept`	`application/json`	Optional
`Idempotency-Key`	Client-generated idempotency	Optional
`X-Request-ID`	Request tracking	Optional
`Prefer`	`wait=30` for sync mode (seconds, per RFC 7240)	Optional

Response

Header	Purpose
`Content-Type`	`application/json` (`application/problem+json` for errors)
`X-Request-ID`	Echo of request, generated if missing
`X-RateLimit-Limit`	Rate limit ceiling
`X-RateLimit-Remaining`	Tokens remaining
`X-RateLimit-Reset`	Unix timestamp when bucket resets
`Retry-After`	Seconds to wait before retry (429/503)
`ETag`	For conditional GET (future)

Response envelope

Successful list response:

{
  "items": [...],
  "totalEstimate": 1247,
  "nextCursor": "eyJrZXkiOjE3ODR9",
  "hasMore": true
}

totalEstimate: NOT exact count (avoiding expensive COUNT(*))
nextCursor: opaque base64-encoded continuation
hasMore: boolean for cleaner client code

Singleton response:

{
  "processInstanceKey": 12345,
  "bpmnProcessId": "order-approval",
  "state": "ACTIVE",
  ...
}

No envelope wrapper for single resources.

Error responses (RFC 7807)

Standard format for ALL errors:

HTTP/1.1 404 Not Found
Content-Type: application/problem+json

{
  "type": "https://errors.mvp.dev/process-instance-not-found",
  "title": "Process Instance Not Found",
  "status": 404,
  "detail": "Process instance with key 12345 does not exist or you don't have access",
  "instance": "/v2/process-instances/12345",
  "code": "PROCESS_INSTANCE_NOT_FOUND"
}

Standard fields per RFC 7807:
- type: URI identifying the problem type; resolves to the error documentation
- title: Short human-readable summary
- status: HTTP status code (mirrors the status line)
- detail: Explanation specific to this occurrence
- instance: URI reference identifying this specific occurrence (here, the request path)
- code: Machine-readable error code (custom extension)
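One way to keep every error in this shape is to build all of them through a single helper. The following is a sketch only: the `problem` function and its signature are hypothetical, while the field names and the `https://errors.mvp.dev/` base come from the examples in this section.

```python
import json


def problem(type_slug, title, status, detail, instance, code, **extensions):
    """Build an RFC 7807 problem document as (status, headers, body)."""
    body = {
        "type": f"https://errors.mvp.dev/{type_slug}",
        "title": title,
        "status": status,
        "detail": detail,
        "instance": instance,
        "code": code,
    }
    # Extension members (for example a field-level errors list) sit beside the standard ones.
    body.update(extensions)
    headers = {"Content-Type": "application/problem+json"}
    return status, headers, json.dumps(body)
```

A validation failure would pass its field-level list as an extension, for example `problem("validation-failed", "Validation Failed", 400, "...", "/v2/process-instances", "VALIDATION_FAILED", errors=[...])`.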

Validation errors:

HTTP/1.1 400 Bad Request
Content-Type: application/problem+json

{
  "type": "https://errors.mvp.dev/validation-failed",
  "title": "Validation Failed",
  "status": 400,
  "detail": "Request validation failed for 2 fields",
  "instance": "/v2/process-instances",
  "code": "VALIDATION_FAILED",
  "errors": [
    {
      "field": "bpmnProcessId",
      "code": "REQUIRED",
      "message": "Field is required"
    },
    {
      "field": "variables.amount",
      "code": "TYPE_MISMATCH",
      "message": "Expected number, got string"
    }
  ]
}

errors[] is an extension member that carries field-level details.

Error code catalog

HTTP	Code	Cause
400	`VALIDATION_FAILED`	Field validation failed
400	`INVALID_BPMN`	BPMN parsing/validation error
400	`INVALID_EXPRESSION`	CEL expression invalid
401	`UNAUTHENTICATED`	No/invalid auth token
401	`TOKEN_EXPIRED`	JWT expired
403	`INSUFFICIENT_PERMISSIONS`	Auth OK, but no permission
404	`PROCESS_INSTANCE_NOT_FOUND`	PI doesn't exist
404	`PROCESS_DEFINITION_NOT_FOUND`	PD doesn't exist
404	`JOB_NOT_FOUND`	Job doesn't exist
404	`USER_TASK_NOT_FOUND`	Task doesn't exist
409	`STATE_CONFLICT`	Resource not in expected state
409	`JOB_ALREADY_COMPLETED`	Trying to complete completed job
410	`RESOURCE_DELETED`	Resource was deleted
413	`PAYLOAD_TOO_LARGE`	Variables > 100KB
422	`BUSINESS_RULE_VIOLATION`	Domain logic violation
429	`RATE_LIMITED`	Tenant rate limit exceeded
500	`INTERNAL_ERROR`	Engine bug, see logs
503	`ENGINE_OVERLOADED`	Queue full, retry
504	`OPERATION_TIMEOUT`	Sync operation timed out

Pagination

Cursor-based (preferred for large lists)

Request:

GET /v2/process-instances?cursor=eyJrZXkiOjE3ODR9&limit=50

Response:

{
  "items": [...],
  "nextCursor": "eyJrZXkiOjE4MzR9",
  "hasMore": true
}

Cursor is opaque (base64 of internal state). Client treats as black box.

Server implementation:

import base64
import json

def encode_cursor(state):
    return base64.urlsafe_b64encode(json.dumps(state).encode()).decode()

def decode_cursor(cursor):
    return json.loads(base64.urlsafe_b64decode(cursor.encode()).decode())

async def list_process_instances(db, tenant_id, cursor, limit):
    # Resume after the last key the client saw; no cursor means first page.
    state = decode_cursor(cursor) if cursor else None
    last_key = state['key'] if state else None

    # Fetch one extra row so hasMore is exact even when the final page is full.
    rows = await db.fetch_all("""
        SELECT * FROM process_instances
        WHERE tenant_id = $1
          AND ($2::bigint IS NULL OR process_instance_key < $2)
        ORDER BY process_instance_key DESC
        LIMIT $3
    """, tenant_id, last_key, limit + 1)

    has_more = len(rows) > limit
    items = rows[:limit]
    next_cursor = encode_cursor({'key': items[-1]['process_instance_key']}) if has_more else None
    return items, next_cursor, has_more

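With an index on (tenant_id, process_instance_key), each page is an index seek, so page cost depends on limit rather than on dataset size or on how deep the client has paged.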
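On the client side, consuming a cursor-paginated list comes down to following nextCursor until the server says there is nothing left. The sketch below assumes a hypothetical `fetch_page` callable that performs the HTTP request and returns the decoded list envelope; only the `items`, `nextCursor`, and `hasMore` field names come from this spec.

```python
from typing import Callable, Iterator, Optional


def iterate_all(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every item by following nextCursor until the server reports no more pages."""
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page["items"]
        cursor = page.get("nextCursor")
        # Stop on either signal so a missing cursor can never cause an endless loop.
        if not page.get("hasMore") or cursor is None:
            return
```

Because the cursor is passed back unchanged, the client never needs to know what it encodes.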

Offset-based (small lists only)

For lists known to be small (process definitions, tenants):

GET /v2/process-definitions?offset=0&limit=20

Returns total count (small list, COUNT(*) OK):

{
  "items": [...],
  "total": 47,
  "offset": 0,
  "limit": 20
}

Idempotency

For state-changing operations (POST/PATCH/DELETE):

POST /v2/process-instances HTTP/1.1
Idempotency-Key: client-generated-uuid-here
Content-Type: application/json

{...}

The server caches the response by (tenant_id, idempotency_key) for 24 hours; a repeated request with the same key returns the cached response instead of executing again. Per api engine serialization.

CREATE TABLE idempotency_keys (
    tenant_id TEXT NOT NULL,
    key TEXT NOT NULL,
    request_hash TEXT NOT NULL,        -- detect different request with same key
    response_status INT NOT NULL,
    response_body JSONB NOT NULL,
    created_at TIMESTAMPTZ DEFAULT NOW(),
    expires_at TIMESTAMPTZ DEFAULT NOW() + INTERVAL '24 hours',
    PRIMARY KEY (tenant_id, key)
);

If request_hash differs but key reused → 409 with explanation.
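The replay and conflict rules can be sketched with an in-memory stand-in for the idempotency_keys table. Everything named here is hypothetical: the `IdempotencyStore` class, the `handler` callback, hashing the request as SHA-256 over canonical JSON, and the `IDEMPOTENCY_KEY_REUSED` code. A real implementation would also need to handle two concurrent requests carrying the same key.

```python
import hashlib
import json
import time

TTL_SECONDS = 24 * 60 * 60


class IdempotencyStore:
    """In-memory stand-in for the idempotency_keys table."""

    def __init__(self, clock=time.time):
        self._rows = {}
        self._clock = clock

    def execute(self, tenant_id, key, request_body, handler):
        """Run handler once per (tenant_id, key); replay the stored response afterwards."""
        request_hash = hashlib.sha256(
            json.dumps(request_body, sort_keys=True).encode()
        ).hexdigest()
        row = self._rows.get((tenant_id, key))
        if row and row["expires_at"] > self._clock():
            if row["request_hash"] != request_hash:
                # Same key, different payload: refuse rather than replay (hypothetical code).
                return 409, {"code": "IDEMPOTENCY_KEY_REUSED"}
            return row["status"], row["body"]
        status, body = handler(request_body)
        self._rows[(tenant_id, key)] = {
            "request_hash": request_hash,
            "status": status,
            "body": body,
            "expires_at": self._clock() + TTL_SECONDS,
        }
        return status, body
```

Injecting the clock makes the 24-hour expiry testable without waiting.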

Webhooks (outbound)

For event notifications to external systems:

POST https://customer-app.example.com/webhook HTTP/1.1
Content-Type: application/json
X-MVP-Signature: t=1715731200,v1=a1b2c3...
X-MVP-Event: process_instance.completed

{
  "eventType": "process_instance.completed",
  "data": {
    "processInstanceKey": 12345,
    "bpmnProcessId": "order-approval",
    ...
  },
  "timestamp": "2025-05-14T12:00:00Z"
}

Signing protocol (HMAC-SHA256)

import hashlib
import hmac
import time

def sign_webhook(payload: bytes, secret: str, timestamp: int) -> str:
    # Sign "<timestamp>.<raw body>" as bytes, so non-UTF-8 payloads also work
    signed = f"{timestamp}.".encode() + payload
    digest = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()
    return f"t={timestamp},v1={digest}"

def verify_webhook(payload: bytes, signature: str, secret: str, max_age: int = 300) -> bool:
    # Parse signature header; a malformed header is rejected instead of raising
    try:
        parts = dict(p.split('=', 1) for p in signature.split(','))
        timestamp = int(parts['t'])
    except (ValueError, KeyError):
        return False

    # Reject if too old (prevent replay)
    if abs(time.time() - timestamp) > max_age:
        return False

    # Constant-time comparison against the recomputed header
    expected = sign_webhook(payload, secret, timestamp)
    return hmac.compare_digest(signature.encode(), expected.encode())

The scheme follows Stripe's webhook signature format: a t=<timestamp>,v1=<hex digest> header computed over "<timestamp>.<payload>".

Endpoints inventory

Deployments

POST   /v2/deployments                    Deploy resource (BPMN, form)
GET    /v2/deployments/{key}              Get deployment details
GET    /v2/deployments                    List deployments (paginated)

Process definitions

GET    /v2/process-definitions/{key}           Get definition
GET    /v2/process-definitions/{key}/xml       Get BPMN XML
GET    /v2/process-definitions                 List/search
DELETE /v2/process-definitions/{key}           Delete (only if no active instances)

Process instances

POST   /v2/process-instances                                Create instance
POST   /v2/process-instances/search                         Search (complex filters)
GET    /v2/process-instances/{key}                          Get instance
DELETE /v2/process-instances/{key}                          Cancel
GET    /v2/process-instances/{key}/variables                List variables
POST   /v2/process-instances/{key}/variables                Update variables
GET    /v2/process-instances/{key}/element-instances        Element tree
GET    /v2/process-instances/{key}/events                   History timeline
GET    /v2/process-instances/{key}/incidents                Active incidents

Jobs

POST   /v2/jobs/activate                                 Activate jobs (workers)
POST   /v2/jobs/{key}/complete                          Complete
POST   /v2/jobs/{key}/fail                              Fail (with retries)
POST   /v2/jobs/{key}/throw-error                       BPMN error
PATCH  /v2/jobs/{key}                                   Update (extend timeout, retries)

User tasks

POST   /v2/user-tasks/search                            Search tasks
GET    /v2/user-tasks/{key}                             Get task
POST   /v2/user-tasks/{key}/assignment                  Claim
DELETE /v2/user-tasks/{key}/assignee                    Unclaim
POST   /v2/user-tasks/{key}/completion                  Complete
PATCH  /v2/user-tasks/{key}                             Update vars
GET    /v2/user-tasks/{key}/form                        Get form schema
GET    /v2/user-tasks/{key}/variables                   Get variables

Incidents

GET    /v2/incidents/{key}                              Get incident
POST   /v2/incidents/search                             Search
POST   /v2/incidents/{key}/resolve                      Resolve

Messages

POST   /v2/messages/{name}/correlate                    Correlate to instance
POST   /v2/messages/{name}/publish                      Publish (no correlation)

Signals

POST   /v2/signals/{name}/broadcast                     Broadcast to all

Variables (direct access)

GET    /v2/variables/{key}                              Get variable

Forms

POST   /v2/forms                                        Deploy form
GET    /v2/forms/{id}/{version}                         Get form
GET    /v2/forms                                        List

Identity (Phase 1 minimal)

GET    /v2/users/me                                     Current user info
GET    /v2/users/me/tenants                             User's tenants
POST   /v2/auth/token-exchange                          OIDC token exchange
POST   /v2/auth/logout                                  Logout (revoke tokens)

Admin (admin role only)

POST   /v2/tenants                                      Create tenant
GET    /v2/tenants                                      List
DELETE /v2/tenants/{id}                                 Delete
POST   /v2/api-keys                                     Create API key
DELETE /v2/api-keys/{id}                                Delete API key
POST   /v2/users/{id}/roles                             Assign role
DELETE /v2/users/{id}/roles/{role}                      Remove role

Operations (admin/operator)

POST   /v2/operations/batch-cancel-instances           Batch cancel
POST   /v2/operations/batch-resolve-incidents          Batch resolve
GET    /v2/operations/{key}                            Operation status
GET    /v2/operations                                  List operations

Audit (admin)

POST   /v2/audit/search                                Search audit log
GET    /v2/audit/{id}                                  Get audit entry
GET    /v2/audit/export                                Export (compliance)

System (no auth or admin)

GET    /health                                          Liveness
GET    /ready                                           Readiness
GET    /metrics                                         Prometheus metrics
GET    /v2/system/info                                  Version, capabilities
GET    /v2/system/queue-depth                          Engine queue depth (admin)

Total: 62 endpoints organized in 15 resource groups.

Versioning

URL path-based: /v2/, /v3/ future.

Within /v2/, additive changes only (new fields, new endpoints). Breaking changes require new version.

Deprecation policy:
- Announce deprecation 6 months ahead
- Sunset after 1 year minimum
- Sunset HTTP header on deprecated endpoints

Sunset: Fri, 01 Jan 2027 00:00:00 GMT
Deprecation: true
Link: </v3/process-instances>; rel="successor-version"
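A client can surface these headers as an early warning. The sketch below is an assumption about how an SDK or monitoring script might do it, with a hypothetical `days_until_sunset` helper; it relies only on the Sunset header carrying an HTTP-date.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def days_until_sunset(headers: dict, now: Optional[datetime] = None) -> Optional[int]:
    """Return whole days left before the Sunset date, or None if no Sunset header is present."""
    sunset = headers.get("Sunset")
    if sunset is None:
        return None
    now = now or datetime.now(timezone.utc)
    # Sunset carries an HTTP-date, which parsedate_to_datetime understands.
    return (parsedate_to_datetime(sunset) - now).days
```

A negative result means the sunset date has already passed and the endpoint may stop responding at any time.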

OpenAPI 3.0 spec

Auto-generated from code (via decorators/annotations). Published at:

https://<host>/v2/openapi.json
https://<host>/v2/openapi.yaml

Swagger UI for interactive exploration:

https://<host>/v2/docs

Auto-generated SDKs from spec:

openapi-generator-cli generate \
    -i https://mvp.example.com/v2/openapi.json \
    -g typescript-axios \
    -o ./mvp-sdk-ts/

Rate limit headers

Per backpressure rest strategy:

HTTP/1.1 200 OK
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 87
X-RateLimit-Reset: 1715731260

When exceeded:

HTTP/1.1 429 Too Many Requests
Content-Type: application/problem+json
Retry-After: 1
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1715731261

{
  "type": "https://errors.mvp.dev/rate-limited",
  "title": "Rate Limit Exceeded",
  "status": 429,
  "detail": "Tenant 'acme' exceeded 100 req/s limit",
  "code": "RATE_LIMITED"
}
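A client can turn these headers into a wait time before retrying. The following is a sketch under stated assumptions: `retry_delay` is a hypothetical helper, it only handles the delta-seconds form of Retry-After, and the one-second fallback is an arbitrary choice for the case where the server gives no hint.

```python
def retry_delay(status: int, headers: dict, now: float) -> float:
    """Seconds a client should wait before retrying; 0.0 means no wait is needed."""
    if status not in (429, 503):
        return 0.0
    # Retry-After (in seconds) is the server's explicit instruction, so prefer it.
    retry_after = headers.get("Retry-After")
    if retry_after is not None and retry_after.isdigit():
        return float(retry_after)
    # Otherwise fall back to the bucket reset time, a Unix timestamp.
    reset = headers.get("X-RateLimit-Reset")
    if reset is not None and reset.isdigit():
        return max(0.0, float(reset) - now)
    return 1.0  # assumed default when the server gives no hint
```

Passing `now` explicitly, rather than reading the clock inside the function, keeps the logic deterministic and easy to test.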

CORS policy

For browser clients (Inspector, Tasklist):

Access-Control-Allow-Origin: https://app.mvp.example.com
Access-Control-Allow-Methods: GET, POST, PATCH, DELETE, OPTIONS
Access-Control-Allow-Headers: Content-Type, Authorization, Idempotency-Key
Access-Control-Max-Age: 86400

Configurable per deployment. Default: same-origin only.

Content negotiation

Only application/json accepted. No XML, no MessagePack (per ADR-017).

Accept: application/json
Content-Type: application/json

If Accept: application/xml → 406 Not Acceptable.
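The Accept check can be sketched as a small function. This is a simplified illustration with a hypothetical `negotiate` helper: it accepts wildcards that cover JSON and deliberately ignores q-values, which a production implementation would need to honor.

```python
def negotiate(accept_header):
    """Return 200 if the Accept header admits application/json, otherwise 406."""
    if not accept_header:
        return 200  # Accept is optional; a missing header means any type is fine
    for part in accept_header.split(","):
        # Drop parameters such as ";q=0.8" or ";charset=utf-8" before comparing.
        media_type = part.split(";", 1)[0].strip().lower()
        if media_type in ("application/json", "application/*", "*/*"):
            return 200
    return 406
```

Browsers and most HTTP libraries send `*/*` by default, so only clients that explicitly ask for another format see the 406.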

API versioning strategy

Major versions (v2, v3):
- URL path
- Breaking changes
- Maintained in parallel during deprecation period

Minor versions:
- Header-based (optional): MVP-API-Version: 2.5.0
- Backwards-compatible additions
- Feature flags

Patch versions:
- Bug fixes only
- No client-visible changes

Webhook delivery semantics

Same as job delivery, at-least-once with retries:

Engine emits event
  ↓
Webhook delivery service (separate worker process)
  ↓
POST to webhook URL with HMAC signature
  ↓
If response 2xx → success, mark delivered
If response 4xx (except 408, 429) → permanent failure, alert tenant
If response 5xx, 408, 429, timeout → retry with backoff

Backoff: exponential with jitter. Max retries: 10. After exhausted: dead letter queue.

Because delivery is at-least-once, customers must make their webhook receivers idempotent.
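The delivery decision and backoff described above can be sketched as three small functions. The names, the one-second base, the one-hour cap, the full-jitter variant, and treating 3xx responses as permanent failures are all assumptions; the spec only fixes the status classes, "exponential with jitter", and the limit of 10 retries.

```python
import random

MAX_RETRIES = 10


def classify(status):
    """Map a receiver's HTTP status (None = timeout) to the next delivery action."""
    if status is None or status >= 500 or status in (408, 429):
        return "retry"
    if 200 <= status < 300:
        return "delivered"
    # Remaining 4xx, plus 3xx by assumption, are not retried.
    return "permanent_failure"


def backoff_delay(attempt, base=1.0, cap=3600.0, rng=random.random):
    """Full-jitter exponential backoff: rng() scaled by min(cap, base * 2**attempt)."""
    return rng() * min(cap, base * 2 ** attempt)


def next_step(status, attempt):
    """Decide what the worker does after one attempt (attempt 0 is the initial delivery)."""
    action = classify(status)
    if action == "retry" and attempt >= MAX_RETRIES:
        return "dead_letter"
    return action
```

Jitter spreads retries from many failed deliveries over time, so a receiver that has just recovered is not hit by all of them at once.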

SDK considerations

Per ADR-016 (minimal SDK):

import { MVPClient } from '@mvp/sdk';

const client = new MVPClient({
    endpoint: 'https://mvp.example.com',
    apiKey: process.env.MVP_API_KEY,
    timeout: 30000,
    retries: 3,
    rateLimitBackoff: true
});

// All operations
const pi = await client.processInstances.create({...});
const tasks = await client.userTasks.search({...});
const jobs = await client.jobs.activate({...});

Generated TypeScript types from OpenAPI provide full type safety.

Links

api engine serialization — Internal serialization
backpressure rest strategy — Rate limiting
adr 014 oidc single idp — Auth
adr 017 json serialization not sbe — JSON only
RFC 7807 Problem Details
OpenAPI 3.0
Stripe webhooks signing