REST API Design

Specification of the MVP's REST API. Conventions: /v2/ prefix (Camunda compatibility), JSON request/response, cursor pagination, Idempotency-Key header, auto-generated OpenAPI 3.0 spec. Standardized error responses (RFC 7807 Problem Details). Versioning via URL path (/v2/ now, /v3/ in the future). Rate limiting via 429 + Retry-After. Webhook signing via HMAC. Auth via Bearer JWT (users) or Bearer API key (workers). Complete endpoint inventory: 60+ endpoints organized by resource. Aligned with the OpenAPI ecosystem (Swagger UI, client generation, Postman).

Design principles

  1. Camunda v2 compatibility: /v2/ prefix to ease migration
  2. REST conventions: nouns for resources, verbs only for actions
  3. JSON everywhere: request + response, no other formats
  4. Stateless: no session cookies, auth per request
  5. Self-describing: OpenAPI spec auto-generated, valid for client gen
  6. Standardized errors: RFC 7807 Problem Details
  7. Cursor pagination (not OFFSET) for large lists
  8. Idempotency opt-in via header

URL structure

https://<host>/v2/<resource>[/<id>][/<action>][?<query>]

Examples:

GET    /v2/process-instances/12345
POST   /v2/process-instances
DELETE /v2/process-instances/12345
GET    /v2/process-instances/12345/variables
POST   /v2/process-instances/12345/variables
POST   /v2/jobs/activate
POST   /v2/jobs/67890/complete
POST   /v2/user-tasks/24681/assignment
DELETE /v2/user-tasks/24681/assignee
POST   /v2/messages/payment-received/correlate
POST   /v2/signals/maintenance-mode/broadcast
GET    /v2/incidents
POST   /v2/incidents/13579/resolve

Resource names: kebab-case, plural form, semantic.

Authentication

User auth (JWT Bearer)

GET /v2/process-instances HTTP/1.1
Authorization: Bearer eyJhbGciOiJSUzI1NiIsImtpZCI6Ii...

JWT validated per ADR-014. Claims used:

  • sub → user_id
  • aud → must match engine audience
  • iss → must match configured IdP
  • tenants (custom claim) → user's tenant memberships
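
The claim checks listed above can be sketched as a small function. This is an illustration only: it assumes the token's signature and expiry were already verified by a JWT library (per ADR-014), and the name check_claims and its parameters are hypothetical, not part of the spec.

```python
from typing import Any, Mapping


def check_claims(claims: Mapping[str, Any], audience: str, issuer: str, tenant_id: str) -> str:
    """Return the user_id (sub) if the claims are acceptable for tenant_id.

    Assumes signature and expiry were verified before this is called.
    Raises PermissionError on any mismatch.
    """
    if claims.get("iss") != issuer:
        raise PermissionError("unexpected issuer")

    # Per RFC 7519, aud may be a single string or a list of strings.
    aud = claims.get("aud")
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    if audience not in audiences:
        raise PermissionError("unexpected audience")

    # Custom claim: the tenants the user belongs to.
    if tenant_id not in claims.get("tenants", []):
        raise PermissionError("user is not a member of this tenant")

    sub = claims.get("sub")
    if not sub:
        raise PermissionError("missing sub claim")
    return sub
```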

Worker auth (API key)

POST /v2/jobs/activate HTTP/1.1
Authorization: Bearer wfk_a1b2c3d4e5f6...

API key format:

  • Prefix wfk_ (workflow key) for identification
  • 32 random bytes, base64-encoded
  • Stored hashed in the DB (bcrypt)
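
A minimal sketch of key generation under this format. The URL-safe base64 alphabet and the stripped padding are assumptions (the spec only says "base64"), and the function names are hypothetical. Hashing before storage is left out because bcrypt needs a third-party library.

```python
import base64
import secrets

PREFIX = "wfk_"  # "workflow key" prefix from the spec


def generate_api_key() -> str:
    """Return a new API key: prefix + 32 random bytes, base64-encoded."""
    raw = secrets.token_bytes(32)  # cryptographically secure randomness
    # Assumption: URL-safe alphabet, padding stripped, so the key is header-friendly.
    return PREFIX + base64.urlsafe_b64encode(raw).decode().rstrip("=")


def looks_like_api_key(token: str) -> bool:
    """Cheap routing check: tell worker API keys apart from user JWTs."""
    return token.startswith(PREFIX)
```

Only the hash of the key would be stored; the plaintext is shown to the caller once at creation time.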

Standard headers

Request

Header            Purpose                                        Required
Authorization     Bearer token/key                               Yes
Content-Type      application/json                               If body
Accept            application/json                               Optional
Idempotency-Key   Client-generated idempotency key               Optional
X-Request-ID      Request tracking                               Optional
Prefer            wait=30 (seconds, per RFC 7240) for sync mode  Optional

Response

Header                  Purpose
Content-Type            application/json (application/problem+json for errors)
X-Request-ID            Echo of request, generated if missing
X-RateLimit-Limit       Rate limit ceiling
X-RateLimit-Remaining   Tokens remaining
X-RateLimit-Reset       Unix timestamp when bucket resets
Retry-After             Seconds to wait before retry (429/503)
ETag                    For conditional GET (future)

Response envelope

Successful list response:

{
  "items": [...],
  "totalEstimate": 1247,
  "nextCursor": "eyJrZXkiOjE3ODR9",
  "hasMore": true
}
  • totalEstimate: NOT exact count (avoiding expensive COUNT(*))
  • nextCursor: opaque base64-encoded continuation
  • hasMore: boolean for cleaner client code

Singleton response:

{
  "processInstanceKey": 12345,
  "bpmnProcessId": "order-approval",
  "state": "ACTIVE",
  ...
}

No envelope wrapper for single resources.

Error responses (RFC 7807)

Standard format for ALL errors:

HTTP/1.1 404 Not Found
Content-Type: application/problem+json

{
  "type": "https://errors.mvp.dev/process-instance-not-found",
  "title": "Process Instance Not Found",
  "status": 404,
  "detail": "Process instance with key 12345 does not exist or you don't have access",
  "instance": "/v2/process-instances/12345",
  "code": "PROCESS_INSTANCE_NOT_FOUND"
}

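Standard fields per RFC 7807:

  • type: URI identifying the error type (points to error documentation)
  • title: Short human-readable summary
  • status: HTTP status code (mirrors the HTTP status line)
  • detail: Explanation specific to this occurrence
  • instance: URI reference identifying this specific occurrence (here, the request path)
  • code: Machine-readable error code (custom extension, not defined by RFC 7807)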
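
A minimal sketch of a helper that builds such a body. Deriving the type slug from the error code is an assumption that happens to match the examples in this document (PROCESS_INSTANCE_NOT_FOUND → process-instance-not-found); the function name problem is hypothetical.

```python
from typing import Any, Dict

ERROR_BASE = "https://errors.mvp.dev/"  # base URI taken from the examples in this document


def problem(status: int, code: str, title: str, detail: str, instance: str,
            **extensions: Any) -> Dict[str, Any]:
    """Build an RFC 7807 Problem Details body as a plain dict."""
    body: Dict[str, Any] = {
        # Assumption: the type slug is the error code in kebab-case.
        "type": ERROR_BASE + code.lower().replace("_", "-"),
        "title": title,
        "status": status,
        "detail": detail,
        "instance": instance,
        "code": code,  # custom extension member
    }
    body.update(extensions)  # further extension members, e.g. errors=[...]
    return body
```

The result would be serialized with Content-Type: application/problem+json.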

Validation errors:

HTTP/1.1 400 Bad Request
Content-Type: application/problem+json

{
  "type": "https://errors.mvp.dev/validation-failed",
  "title": "Validation Failed",
  "status": 400,
  "detail": "Request validation failed for 2 fields",
  "instance": "/v2/process-instances",
  "code": "VALIDATION_FAILED",
  "errors": [
    {
      "field": "bpmnProcessId",
      "code": "REQUIRED",
      "message": "Field is required"
    },
    {
      "field": "variables.amount",
      "code": "TYPE_MISMATCH",
      "message": "Expected number, got string"
    }
  ]
}

errors[] is an extension member that carries field-level details.

Error code catalog

HTTP  Code                          Cause
400   VALIDATION_FAILED             Field validation failed
400   INVALID_BPMN                  BPMN parsing/validation error
400   INVALID_EXPRESSION            CEL expression invalid
401   UNAUTHENTICATED               No/invalid auth token
401   TOKEN_EXPIRED                 JWT expired
403   INSUFFICIENT_PERMISSIONS      Auth OK, but no permission
404   PROCESS_INSTANCE_NOT_FOUND    PI doesn't exist
404   PROCESS_DEFINITION_NOT_FOUND  PD doesn't exist
404   JOB_NOT_FOUND                 Job doesn't exist
404   USER_TASK_NOT_FOUND           Task doesn't exist
409   STATE_CONFLICT                Resource not in expected state
409   JOB_ALREADY_COMPLETED         Trying to complete completed job
410   RESOURCE_DELETED              Resource was deleted
413   PAYLOAD_TOO_LARGE             Variables > 100KB
422   BUSINESS_RULE_VIOLATION       Domain logic violation
429   RATE_LIMITED                  Tenant rate limit exceeded
500   INTERNAL_ERROR                Engine bug, see logs
503   ENGINE_OVERLOADED             Queue full, retry
504   OPERATION_TIMEOUT             Sync operation timed out

Pagination

Cursor-based (preferred for large lists)

Request:

GET /v2/process-instances?cursor=eyJrZXkiOjE3ODR9&limit=50

Response:

{
  "items": [...],
  "nextCursor": "eyJrZXkiOjE4MzR9",
  "hasMore": true
}

The cursor is opaque (base64 of internal state). Clients must treat it as a black box and pass it back unchanged.

Server implementation:

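import base64
import json

def encode_cursor(state):
    # Compact JSON, then URL-safe base64, so the cursor fits in a query string
    raw = json.dumps(state, separators=(",", ":")).encode()
    return base64.urlsafe_b64encode(raw).decode()

def decode_cursor(cursor):
    return json.loads(base64.urlsafe_b64decode(cursor.encode()).decode())

# In query (db is the engine's async database handle):
async def list_process_instances(db, tenant_id, cursor, limit):
    state = decode_cursor(cursor) if cursor else None
    last_key = state['key'] if state else None

    # Keyset pagination: seek past the last key seen instead of using OFFSET
    results = await db.fetch_all("""
        SELECT * FROM process_instances
        WHERE tenant_id = $1
          AND ($2::bigint IS NULL OR process_instance_key < $2)
        ORDER BY process_instance_key DESC
        LIMIT $3
    """, tenant_id, last_key, limit)

    # A full page means more rows may follow; a short page is the last one
    has_more = bool(results) and len(results) == limit
    next_cursor = encode_cursor({'key': results[-1]['process_instance_key']}) if has_more else None
    return results, next_cursor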

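The cost per page stays flat no matter how deep the client pages, provided an index covers (tenant_id, process_instance_key). OFFSET, by contrast, reads and discards every skipped row.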
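
On the client side, consuming a cursor-paginated list comes down to following nextCursor until hasMore is false. A minimal sketch, assuming a caller-supplied fetch_page function (hypothetical) that performs the GET and returns the parsed JSON envelope:

```python
from typing import Any, Callable, Dict, Iterator, Optional


def iter_all(fetch_page: Callable[[Optional[str]], Dict[str, Any]]) -> Iterator[Any]:
    """Yield every item of a cursor-paginated list, page by page.

    fetch_page(cursor) is expected to return the list envelope
    {"items": [...], "nextCursor": ..., "hasMore": ...}.
    """
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page["items"]
        # Stop when the server reports no more pages or gives no cursor.
        if not page.get("hasMore") or not page.get("nextCursor"):
            return
        cursor = page["nextCursor"]  # opaque: passed back unchanged
```

Because this is a generator, a caller that stops iterating early never fetches the remaining pages.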

Offset-based (small lists only)

For lists known to be small (process definitions, tenants):

GET /v2/process-definitions?offset=0&limit=20

Returns total count (small list, COUNT(*) OK):

{
  "items": [...],
  "total": 47,
  "offset": 0,
  "limit": 20
}

Idempotency

For state-changing operations (POST/PATCH/DELETE):

POST /v2/process-instances HTTP/1.1
Idempotency-Key: client-generated-uuid-here
Content-Type: application/json

{...}

Server caches response by (tenant_id, idempotency_key) for 24 hours. Re-request returns cached response. Per concepts/api-engine-serialization.

CREATE TABLE idempotency_keys (
    tenant_id TEXT NOT NULL,
    key TEXT NOT NULL,
    request_hash TEXT NOT NULL,        -- detect different request with same key
    response_status INT NOT NULL,
    response_body JSONB NOT NULL,
    created_at TIMESTAMPTZ DEFAULT NOW(),
    expires_at TIMESTAMPTZ DEFAULT NOW() + INTERVAL '24 hours',
    PRIMARY KEY (tenant_id, key)
);

If request_hash differs but key reused → 409 with explanation.
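
The lookup rules can be sketched with an in-memory stand-in for the idempotency_keys table. This is illustrative only: the class and method names are hypothetical, and hashing a canonical JSON form of method, path, and body is an assumption about how request_hash is computed.

```python
import hashlib
import json
import time
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional, Tuple

TTL_SECONDS = 24 * 60 * 60  # 24-hour retention from the spec


class IdempotencyConflict(Exception):
    """Same key reused with a different request; maps to HTTP 409."""


@dataclass
class CachedResponse:
    request_hash: str
    status: int
    body: Any
    expires_at: float


class IdempotencyStore:
    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._rows: Dict[Tuple[str, str], CachedResponse] = {}
        self._clock = clock

    @staticmethod
    def hash_request(method: str, path: str, body: Any) -> str:
        # Assumption: canonical JSON of (method, path, body) identifies the request.
        canonical = json.dumps([method, path, body], sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode()).hexdigest()

    def lookup(self, tenant_id: str, key: str, request_hash: str) -> Optional[CachedResponse]:
        row = self._rows.get((tenant_id, key))
        if row is None:
            return None
        if row.expires_at <= self._clock():
            del self._rows[(tenant_id, key)]  # expired: treat as never seen
            return None
        if row.request_hash != request_hash:
            raise IdempotencyConflict("Idempotency-Key reused with a different request")
        return row  # replay the cached response

    def save(self, tenant_id: str, key: str, request_hash: str, status: int, body: Any) -> None:
        self._rows[(tenant_id, key)] = CachedResponse(
            request_hash, status, body, self._clock() + TTL_SECONDS)
```

A real implementation would also need to handle two concurrent requests carrying the same key, for example by relying on the table's primary key.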

Webhooks (outbound)

For event notifications to external systems:

POST https://customer-app.example.com/webhook HTTP/1.1
Content-Type: application/json
X-MVP-Signature: t=1715731200,v1=a1b2c3...
X-MVP-Event: process_instance.completed

{
  "eventType": "process_instance.completed",
  "data": {
    "processInstanceKey": 12345,
    "bpmnProcessId": "order-approval",
    ...
  },
  "timestamp": "2025-05-14T12:00:00Z"
}

Signing protocol (HMAC-SHA256)

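import hashlib
import hmac
import time

def sign_webhook(payload: bytes, secret: str, timestamp: int) -> str:
    # Signed content is "<timestamp>.<raw body>", so the MAC also covers the timestamp
    signed = f"{timestamp}.".encode() + payload
    hmac_obj = hmac.new(secret.encode(), signed, hashlib.sha256)
    return f"t={timestamp},v1={hmac_obj.hexdigest()}"

def verify_webhook(payload: bytes, signature: str, secret: str, max_age: int = 300) -> bool:
    # Parse signature header; a malformed header is rejected instead of raising
    try:
        parts = dict(p.split('=', 1) for p in signature.split(','))
        timestamp = int(parts['t'])
        received = parts['v1']
    except (KeyError, ValueError):
        return False

    # Reject if too old (prevent replay)
    if abs(time.time() - timestamp) > max_age:
        return False

    # Recompute the MAC and compare only the v1 digest, in constant time
    expected = sign_webhook(payload, secret, timestamp).split(',v1=', 1)[1]
    return hmac.compare_digest(received.encode(), expected.encode())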

Modeled on Stripe's webhook signature scheme (timestamp plus v1 HMAC in a single header).

Endpoints inventory

Deployments

POST   /v2/deployments                    Deploy resource (BPMN, form)
GET    /v2/deployments/{key}              Get deployment details
GET    /v2/deployments                    List deployments (paginated)

Process definitions

GET    /v2/process-definitions/{key}           Get definition
GET    /v2/process-definitions/{key}/xml       Get BPMN XML
GET    /v2/process-definitions                 List/search
DELETE /v2/process-definitions/{key}           Delete (only if no active instances)

Process instances

POST   /v2/process-instances                                Create instance
POST   /v2/process-instances/search                         Search (complex filters)
GET    /v2/process-instances/{key}                          Get instance
DELETE /v2/process-instances/{key}                          Cancel
GET    /v2/process-instances/{key}/variables                List variables
POST   /v2/process-instances/{key}/variables                Update variables
GET    /v2/process-instances/{key}/element-instances        Element tree
GET    /v2/process-instances/{key}/events                   History timeline
GET    /v2/process-instances/{key}/incidents                Active incidents

Jobs

POST   /v2/jobs/activate                                 Activate jobs (workers)
POST   /v2/jobs/{key}/complete                          Complete
POST   /v2/jobs/{key}/fail                              Fail (with retries)
POST   /v2/jobs/{key}/throw-error                       BPMN error
PATCH  /v2/jobs/{key}                                   Update (extend timeout, retries)

User tasks

POST   /v2/user-tasks/search                            Search tasks
GET    /v2/user-tasks/{key}                             Get task
POST   /v2/user-tasks/{key}/assignment                  Claim
DELETE /v2/user-tasks/{key}/assignee                    Unclaim
POST   /v2/user-tasks/{key}/completion                  Complete
PATCH  /v2/user-tasks/{key}                             Update vars
GET    /v2/user-tasks/{key}/form                        Get form schema
GET    /v2/user-tasks/{key}/variables                   Get variables

Incidents

GET    /v2/incidents/{key}                              Get incident
POST   /v2/incidents/search                             Search
POST   /v2/incidents/{key}/resolve                      Resolve

Messages

POST   /v2/messages/{name}/correlate                    Correlate to instance
POST   /v2/messages/{name}/publish                      Publish (no correlation)

Signals

POST   /v2/signals/{name}/broadcast                     Broadcast to all

Variables (direct access)

GET    /v2/variables/{key}                              Get variable

Forms

POST   /v2/forms                                        Deploy form
GET    /v2/forms/{id}/{version}                         Get form
GET    /v2/forms                                        List

Identity (Phase 1 minimal)

GET    /v2/users/me                                     Current user info
GET    /v2/users/me/tenants                             User's tenants
POST   /v2/auth/token-exchange                          OIDC token exchange
POST   /v2/auth/logout                                  Logout (revoke tokens)

Admin (admin role only)

POST   /v2/tenants                                      Create tenant
GET    /v2/tenants                                      List
DELETE /v2/tenants/{id}                                 Delete
POST   /v2/api-keys                                     Create API key
DELETE /v2/api-keys/{id}                                Delete API key
POST   /v2/users/{id}/roles                             Assign role
DELETE /v2/users/{id}/roles/{role}                      Remove role

Operations (admin/operator)

POST   /v2/operations/batch-cancel-instances           Batch cancel
POST   /v2/operations/batch-resolve-incidents          Batch resolve
GET    /v2/operations/{key}                            Operation status
GET    /v2/operations                                  List operations

Audit (admin)

POST   /v2/audit/search                                Search audit log
GET    /v2/audit/{id}                                  Get audit entry
GET    /v2/audit/export                                Export (compliance)

System (no auth or admin)

GET    /health                                          Liveness
GET    /ready                                           Readiness
GET    /metrics                                         Prometheus metrics
GET    /v2/system/info                                  Version, capabilities
GET    /v2/system/queue-depth                          Engine queue depth (admin)

Total: 62 endpoints listed above, organized in 15 resource groups.

Versioning

URL path-based: /v2/, /v3/ future.

Within /v2/, additive changes only (new fields, new endpoints). Breaking changes require new version.

Deprecation policy:

  • Announce deprecation 6 months ahead
  • Sunset after 1 year minimum
  • Sunset HTTP header on deprecated endpoints

Sunset: Fri, 01 Jan 2027 00:00:00 GMT
Deprecation: true
Link: </v3/process-instances>; rel="successor-version"
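
A client can surface these headers as a warning before the endpoint disappears. A minimal sketch, assuming response headers are available as a plain mapping keyed by canonical header name; the function name days_until_sunset is hypothetical:

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def days_until_sunset(headers: Mapping[str, str],
                      now: Optional[datetime] = None) -> Optional[int]:
    """Return whole days left until the Sunset date, or None if no Sunset header."""
    value = headers.get("Sunset")
    if value is None:
        return None
    sunset = parsedate_to_datetime(value)  # parses the HTTP-date format
    if now is None:
        now = datetime.now(timezone.utc)
    return (sunset - now).days
```

A negative result means the sunset date has already passed.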

OpenAPI 3.0 spec

Auto-generated from code (via decorators/annotations). Published at:

https://<host>/v2/openapi.json
https://<host>/v2/openapi.yaml

Swagger UI for interactive exploration:

https://<host>/v2/docs

Auto-generated SDKs from spec:

openapi-generator-cli generate \
    -i https://mvp.example.com/v2/openapi.json \
    -g typescript-axios \
    -o ./mvp-sdk-ts/

Rate limit headers

Per concepts/backpressure-rest-strategy:

HTTP/1.1 200 OK
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 87
X-RateLimit-Reset: 1715731260

When exceeded:

HTTP/1.1 429 Too Many Requests
Retry-After: 1
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1715731261

{
  "type": "https://errors.mvp.dev/rate-limited",
  "title": "Rate Limit Exceeded",
  "status": 429,
  "detail": "Tenant 'acme' exceeded 100 req/s limit",
  "code": "RATE_LIMITED"
}
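
On the client side, these headers are enough to decide how long to wait before retrying. A minimal sketch: the function name retry_delay, the preference for Retry-After over X-RateLimit-Reset, and the 1-second fallback are all assumptions, not part of the spec.

```python
import time
from typing import Mapping, Optional


def retry_delay(status: int, headers: Mapping[str, str],
                now: Optional[float] = None) -> Optional[float]:
    """Seconds to wait before retrying, or None if the response is not a throttle."""
    if status not in (429, 503):
        return None

    # Prefer Retry-After when present (delta-seconds form only in this sketch).
    retry_after = headers.get("Retry-After")
    if retry_after is not None and retry_after.strip().isdigit():
        return float(retry_after.strip())

    # Otherwise fall back to the bucket reset time (Unix timestamp).
    reset = headers.get("X-RateLimit-Reset")
    if reset is not None and reset.strip().isdigit():
        current = time.time() if now is None else now
        return max(0.0, float(reset.strip()) - current)

    return 1.0  # assumed default when the server gives no hint
```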

CORS policy

For browser clients (Inspector, Tasklist):

Access-Control-Allow-Origin: https://app.mvp.example.com
Access-Control-Allow-Methods: GET, POST, PATCH, DELETE, OPTIONS
Access-Control-Allow-Headers: Content-Type, Authorization, Idempotency-Key
Access-Control-Max-Age: 86400

Configurable per deployment. Default: same-origin only.
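
One way to read "configurable per deployment, same-origin by default" is an allowlist that is empty unless configured. A minimal sketch under that assumption; the function name cors_headers is hypothetical, and the Vary: Origin header is an addition not listed above (it keeps caches from reusing a response across origins).

```python
from typing import Dict, Iterable, Optional


def cors_headers(origin: Optional[str], allowed_origins: Iterable[str] = ()) -> Dict[str, str]:
    """Return CORS response headers for a request's Origin, or {} if not allowed.

    With the default empty allowlist no CORS headers are emitted, so browsers
    only permit same-origin requests.
    """
    if origin is None or origin not in set(allowed_origins):
        return {}
    return {
        "Access-Control-Allow-Origin": origin,  # echo the matched origin, never "*"
        "Vary": "Origin",                       # assumption: added for cache correctness
        "Access-Control-Allow-Methods": "GET, POST, PATCH, DELETE, OPTIONS",
        "Access-Control-Allow-Headers": "Content-Type, Authorization, Idempotency-Key",
        "Access-Control-Max-Age": "86400",
    }
```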

Content negotiation

Only application/json accepted. No XML, no MessagePack (per ADR-017).

Accept: application/json
Content-Type: application/json

If Accept: application/xml → 406 Not Acceptable.
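
The 406 decision can be sketched as a small Accept-header check. This is a simplified reading of HTTP content negotiation (wildcards and q=0 are handled, ranking is not); the function name accepts_json is hypothetical.

```python
from typing import Optional

JSON_RANGES = ("application/json", "application/*", "*/*")


def accepts_json(accept: Optional[str]) -> bool:
    """Return True if the Accept header allows application/json.

    A missing or empty header means the client accepts anything.
    A False result would map to 406 Not Acceptable.
    """
    if accept is None or not accept.strip():
        return True
    for item in accept.split(","):
        media_type, _, params = item.strip().partition(";")
        if media_type.strip().lower() not in JSON_RANGES:
            continue
        # q=0 means "explicitly not acceptable".
        q = 1.0
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        if q > 0:
            return True
    return False
```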

API versioning strategy

Major versions (v2, v3):

  • URL path
  • Breaking changes
  • Maintained in parallel during deprecation period

Minor versions:

  • Header-based (optional): MVP-API-Version: 2.5.0
  • Backwards-compatible additions
  • Feature flags

Patch versions:

  • Bug fixes only
  • No client-visible changes

Webhook delivery semantics

Same semantics as job delivery (at-least-once, with retries):

Engine emits event
Webhook delivery service (separate worker process)
POST to webhook URL with HMAC signature
If response 2xx → success, mark delivered
If response 4xx (except 408, 429) → permanent failure, alert tenant
If response 5xx, 408, 429, timeout → retry with backoff

Backoff: exponential with jitter. Max retries: 10. After exhausted: dead letter queue.
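
The outcome rules and backoff can be sketched as follows. The base delay, the delay cap, the "full jitter" variant, and the handling of 1xx/3xx responses are assumptions, since the spec only fixes "exponential with jitter" and a maximum of 10 retries. All names are hypothetical.

```python
import random
from typing import Callable, Optional

MAX_RETRIES = 10     # from the spec
BASE_DELAY = 1.0     # seconds; assumption
MAX_DELAY = 3600.0   # seconds; assumption


def classify(status: Optional[int]) -> str:
    """Map a delivery result to 'delivered', 'retry', or 'failed'.

    status=None stands for a timeout or connection error.
    """
    if status is None:
        return "retry"
    if 200 <= status < 300:
        return "delivered"
    if status in (408, 429) or status >= 500:
        return "retry"
    # Remaining 4xx are permanent failures. 1xx/3xx are not covered by the
    # spec and are treated as failures here (assumption).
    return "failed"


def backoff_delay(attempt: int, rng: Callable[[], float] = random.random) -> float:
    """Full-jitter backoff: uniform in [0, min(MAX_DELAY, BASE_DELAY * 2**attempt))."""
    ceiling = min(MAX_DELAY, BASE_DELAY * (2 ** attempt))
    return rng() * ceiling


def next_action(status: Optional[int], attempt: int) -> str:
    """Like classify(), but sends exhausted retries to the dead letter queue."""
    outcome = classify(status)
    if outcome == "retry" and attempt >= MAX_RETRIES:
        return "dead_letter"
    return outcome
```

Here attempt counts the retries already made, so the first retry is scheduled with backoff_delay(0).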

Because delivery is at-least-once, the same event may arrive more than once. Customers must make their webhook receivers idempotent.

SDK considerations

Per ADR-016 (minimal SDK):

import { MVPClient } from '@mvp/sdk';

const client = new MVPClient({
    endpoint: 'https://mvp.example.com',
    apiKey: process.env.MVP_API_KEY,
    timeout: 30000,
    retries: 3,
    rateLimitBackoff: true
});

// All operations
const pi = await client.processInstances.create({...});
const tasks = await client.userTasks.search({...});
const jobs = await client.jobs.activate({...});

TypeScript types generated from the OpenAPI spec give compile-time checking of request and response shapes.