Saltar a contenido

ADR-024: Postgres Row-Level Security as defense in depth

  • Status: Accepted
  • Date: 2026-05-14
  • Tags: security, multi-tenancy, postgres

Context and Problem Statement

Tenant isolation depends on every SQL query including tenant_id filter. Single bug (forgotten WHERE clause) → cross-tenant data leak (T4.1 threat). ¿Como defend en depth?

Decision Drivers

  • Application-level checks can be bypassed by bugs
  • SQL injection could circumvent app-level checks
  • Compliance (SOC2 multi-tenant requirements)
  • Postgres RLS is mature (since 9.5, 2016)
  • Negligible performance impact

Considered Options

  1. App-level tenant_id checks únicamente
  2. Postgres RLS as defense in depth (recomendado)
  3. Per-tenant separate schemas (Phase 3 sharding)
  4. Per-tenant separate databases (max isolation, max ops)

Decision Outcome

Chosen: Postgres RLS + app-level checks. Defense in depth.

Positive Consequences

  • Bug en código NO causa data leak (RLS catches it)
  • Compliance posture strengthened
  • No performance penalty (Postgres optimizes RLS-filtered queries)
  • Audit trail easier (RLS violations logged)

Negative Consequences

  • RLS adds slight complexity para developer onboarding
  • Migration of legacy tables needs careful planning
  • Some queries más complex con RLS context setting

Implementation

-- Enable RLS on all tenant-scoped tables
ALTER TABLE process_instances ENABLE ROW LEVEL SECURITY;
ALTER TABLE process_definitions ENABLE ROW LEVEL SECURITY;
ALTER TABLE jobs ENABLE ROW LEVEL SECURITY;
ALTER TABLE user_tasks ENABLE ROW LEVEL SECURITY;
ALTER TABLE incidents ENABLE ROW LEVEL SECURITY;
ALTER TABLE variables ENABLE ROW LEVEL SECURITY;
-- ... (all tables with tenant_id column)

-- Policy: tenant isolation
CREATE POLICY tenant_isolation ON process_instances
  FOR ALL                            -- SELECT, INSERT, UPDATE, DELETE
  USING (tenant_id = current_setting('app.current_tenant', true))
  WITH CHECK (tenant_id = current_setting('app.current_tenant', true));

-- Repeat for each table

Engine sets context per request:

async def handle_request(request):
    tenant_id = request.user.tenant_id

    async with db.acquire() as conn:
        # Set RLS context for this transaction
        await conn.execute(f"SET LOCAL app.current_tenant = '{tenant_id}'")

        # All queries now automatically filtered by tenant
        result = await conn.fetch("SELECT * FROM process_instances")
        # ← Even if forgot WHERE tenant_id, RLS filters it

Bypass for admin operations

Some ops legitimately cross tenants (engine internals, admin tools):

-- Create admin role that bypasses RLS
CREATE ROLE engine_admin BYPASSRLS;
GRANT engine_admin TO engine_user;

-- App switches role for admin ops only
SET LOCAL ROLE engine_admin;       -- bypasses RLS
-- ... admin query ...
RESET ROLE;                         -- back to normal

Carefully audited (per adrs/adr-025-audit-logging-mandatory).

Testing

async def test_rls_prevents_cross_tenant_access():
    # Insert data for tenant A
    async with db.acquire() as conn:
        await conn.execute("SET LOCAL app.current_tenant = 'tenant-a'")
        await conn.execute("INSERT INTO process_instances ... tenant_id='tenant-a'")

    # Try to read as tenant B
    async with db.acquire() as conn:
        await conn.execute("SET LOCAL app.current_tenant = 'tenant-b'")
        result = await conn.fetch("SELECT * FROM process_instances")
        assert len(result) == 0, "RLS allowed cross-tenant read!"

    # Try to update as tenant B
    async with db.acquire() as conn:
        await conn.execute("SET LOCAL app.current_tenant = 'tenant-b'")
        result = await conn.execute("UPDATE process_instances SET state='CANCELED'")
        assert result.rowcount == 0, "RLS allowed cross-tenant update!"

Performance

Postgres optimizes RLS: - Adds WHERE tenant_id = current_setting(...) to query plans - Uses index on tenant_id (already exists per schema) - Negligible overhead in benchmarks (~1-2%)