Problem & Constraints
Two production Go backends, same stack, but very different access-control requirements. The first is a single-tenant service running one operational context per deployment, with the main RBAC challenge being row-level data scoping — which organisational unit a user can see, not who they are across tenants. The second is a multi-tenant platform where the same user account can hold different roles across independent tenants, permissions are organised into feature modules, and isolation must be enforced strictly per tenant.
Both share the same structural primitives: User → Role → Permission join graphs, soft-delete reactivation, and per-request permission resolution. The constraints diverge enough that each required a different shape. This doc covers both designs, compares them explicitly, and records what the simpler design revealed when requirements expanded.
- Correctness over caching: revoked permissions must not stay live after an admin removes them.
- Soft-delete reactivation: revoke-and-regrant must not collide with UNIQUE indexes.
- Per-request resolution: tokens carry identity, never permissions — permissions are resolved on every request.
- Auditability: every grant/revoke is recoverable from the database, not from logs.
Comparative Overview
At a glance, the two designs differ along eight axes, from tenancy model and role tiering through scope mechanism and module gating to revocation strategy and the superadmin escape hatch. The table below summarises where each design lands.
| Axis | Design A — Flat | Design B — Module-Gated Multi-Tenant |
|---|---|---|
| Tenancy | Single operational context | Many independent tenants per deployment |
| Role tiers | One tier (flat roles) | Two tiers: system + tenant-scoped |
| Per-tenant isolation | Not required | TenantMemberRole 3-way join |
| Data scoping | ScopeContext injected on every request | DataAccess enum (single_tenant / all_tenant) |
| Module gating | None — flat permission catalog | TenantModule check is the final gate |
| Permission count | ~100 constants, resource.action | 300+ constants across 11 modules |
| Revocation | Immediate session kill via Redis | Per-request DB resolution, no cache |
| Superadmin escape | Hardcoded 'Super Admin' role name | is_superadmin user flag + system role |
Design A — Project Structure
Single-tenant service. Permissions live in one flat catalog; scope is a per-request context, not a per-role attribute.

    design-a/
    ├── ent/schema/
    │   ├── role.go               # Role: name (unique), is_active
    │   ├── permission.go         # Permission: code, is_active
    │   ├── user_role.go          # User → Role (soft-delete)
    │   └── role_permission.go    # Role → Permission (soft-delete)
    ├── internal/
    │   ├── transport/http/middleware/
    │   │   └── permission_middleware.go   # RequirePermission, ScopeContext injection
    │   ├── pkg/
    │   │   ├── permissions/
    │   │   │   └── constants.go  # ~100 resource.action constants
    │   │   └── scope/
    │   │       └── context.go    # ScopeContext {ScopeID, DeptIDs, DataAccess}
    │   └── app/
    │       ├── auth/             # Session service, Redis revocation
    │       └── role/             # Role assignment + revocation
    └── migrations/               # Atlas versioned schema migrations

Design A — Flat Roles with Scope Context Injection
The data model is a textbook RBAC join graph: User → UserRole → Role → RolePermission → Permission. Every join row carries is_active so previously-revoked assignments can be reactivated without violating the UNIQUE (user_id, role_id) and (role_id, permission_id) indexes.

    CREATE TABLE roles (
        id        UUID PRIMARY KEY,
        name      TEXT NOT NULL UNIQUE,
        is_active BOOLEAN NOT NULL DEFAULT true
    );

    CREATE TABLE user_roles (
        user_id   UUID NOT NULL REFERENCES users(id),
        role_id   UUID NOT NULL REFERENCES roles(id),
        is_active BOOLEAN NOT NULL DEFAULT true,
        UNIQUE (user_id, role_id)
    );

    CREATE TABLE role_permissions (
        role_id       UUID NOT NULL REFERENCES roles(id),
        permission_id UUID NOT NULL REFERENCES permissions(id),
        is_active     BOOLEAN NOT NULL DEFAULT true,
        UNIQUE (role_id, permission_id)
    );

Permission resolution flow

    HTTP request
        │
        ▼
    [1] JWT middleware                ── extract user_id, set on ctx
        │
        ▼
    [2] RequirePermission("item.create")
        │
        ▼
    [3] Load active UserRoles         ── WHERE user_id=? AND is_active
        │     │
        │     └─► If role.name == "Super Admin" → BYPASS (allow)
        │
        ▼
    [4] Load active RolePermissions   ── WHERE role_id IN (...) AND is_active
        │
        ▼
    [5] Build O(1) lookup map         ── map[code]struct{}
        │
        ▼
    [6] Build ScopeContext            ── {ScopeID, DeptIDs, DataAccess}
        │
        ▼
    [7] code ∈ map ?
        ├── yes → enrich ctx, call handler
        └── no  → 403 Forbidden

The non-obvious part of Design A is step 6: scope context injection. Beyond permission checking, the middleware injects ScopeContext{ScopeID, DeptIDs, DataAccess} into the request context. Handlers use these fields for row-level filtering without re-querying: SELECT * FROM items WHERE scope_id = $1. The ScopeContext is the row-level multi-tenancy substitute in a system that has no tenant axis.
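A minimal sketch of steps 3 to 7 may make the injection concrete. It is not the production middleware: Store is a hypothetical in-memory stand-in for the database lookups, the ScopeContext field types are assumed, the user ID is read from a context key that the JWT middleware is assumed to set, and the Super Admin bypass is left out for brevity.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// ScopeContext mirrors the fields named in the text; the field types are assumed.
type ScopeContext struct {
	ScopeID    string
	DeptIDs    []string
	DataAccess string
}

// Unexported key types keep context values private to this package.
type (
	userKey  struct{}
	scopeKey struct{}
)

// ScopeFrom returns the ScopeContext injected by RequirePermission, if present.
func ScopeFrom(ctx context.Context) (ScopeContext, bool) {
	sc, ok := ctx.Value(scopeKey{}).(ScopeContext)
	return sc, ok
}

// Store is a hypothetical stand-in for the per-request database lookups.
type Store struct {
	Perms  map[string]map[string]struct{} // userID -> set of permission codes
	Scopes map[string]ScopeContext        // userID -> row-level scope
}

// RequirePermission resolves permissions, builds the scope, checks, then enriches ctx.
func RequirePermission(st Store, code string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		userID, _ := r.Context().Value(userKey{}).(string)
		perms := st.Perms[userID] // nil for unknown users, so every lookup misses
		scope := st.Scopes[userID]
		if _, ok := perms[code]; !ok {
			http.Error(w, "forbidden", http.StatusForbidden)
			return
		}
		ctx := context.WithValue(r.Context(), scopeKey{}, scope)
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

// listItems shows a handler reading the scope without re-querying.
func listItems(w http.ResponseWriter, r *http.Request) {
	sc, ok := ScopeFrom(r.Context())
	if !ok {
		http.Error(w, "scope missing", http.StatusInternalServerError)
		return
	}
	// A real handler would bind sc.ScopeID as $1 in its WHERE clause.
	fmt.Fprintf(w, "scope=%s", sc.ScopeID)
}

// run drives one request through the middleware and reports status and body.
func run(userID, code string) (int, string) {
	st := Store{
		Perms:  map[string]map[string]struct{}{"alice": {"item.create": {}}},
		Scopes: map[string]ScopeContext{"alice": {ScopeID: "unit-7"}},
	}
	h := RequirePermission(st, code, http.HandlerFunc(listItems))
	req := httptest.NewRequest(http.MethodPost, "/items", nil)
	req = req.WithContext(context.WithValue(req.Context(), userKey{}, userID))
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	fmt.Println(run("alice", "item.create"))
	fmt.Println(run("bob", "item.create"))
}
```

The handler never touches the role tables: everything it needs for row-level filtering arrives on the request context.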
Permission catalog shape
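
    // ~80 CRUD constants generated by resource × action.
    // String values are assumed here from the resource.action convention.
    const (
        ITEM_VIEW    = "item.view"
        ITEM_CREATE  = "item.create"
        ITEM_UPDATE  = "item.update"
        ITEM_DELETE  = "item.delete"
        STOCK_VIEW   = "stock.view"
        STOCK_CREATE = "stock.create"
        STOCK_UPDATE = "stock.update"
        STOCK_DELETE = "stock.delete"
        USER_VIEW    = "user.view"
        USER_CREATE  = "user.create"
        USER_UPDATE  = "user.update"
        USER_DELETE  = "user.delete"
        ROLE_VIEW    = "role.view"
        ROLE_CREATE  = "role.create"
        ROLE_UPDATE  = "role.update"
        ROLE_DELETE  = "role.delete"
        // ... 20+ resource types × 4 verbs
    )

    // ~20 named workflow actions that can't be derived from CRUD verbs.
    const (
        WORKFLOW_APPROVE  = "workflow.approve"  // Approve a pending workflow item
        WORKFLOW_REJECT   = "workflow.reject"   // Reject a workflow item
        COMMITTEE_CREATE  = "committee.create"  // Create a recommendation
        STAGE_ACTION      = "stage.action"      // Act on a multi-step approval stage
        ADMIN_OVERRIDE    = "admin.override"    // Bypass approval chain (admin only)
        ITEM_ISSUE        = "item.issue"        // Issue items against a requisition
        DOCUMENT_DOWNLOAD = "document.download" // Download a generated document
        BULK_IMPORT       = "bulk.import"       // Bulk import resources
    )

Revocation strategy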
On any RBAC mutation (role assign, permission revoke), the service immediately invalidates the user's active sessions via Redis. A revoked operator must not retain access while their session is still live, so any TTL-based cache was ruled out from the start.
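The ordering matters here: persist the mutation first, then kill sessions. The sketch below illustrates that with an in-memory SessionStore standing in for Redis. The type, its methods, and RevokeRole are hypothetical names for illustration, not the service's real API.

```go
package main

import (
	"fmt"
	"sync"
)

// SessionStore is an in-memory stand-in for Redis; the layout is an assumption.
type SessionStore struct {
	mu     sync.Mutex
	byUser map[string]map[string]struct{} // userID -> set of session IDs
}

func NewSessionStore() *SessionStore {
	return &SessionStore{byUser: make(map[string]map[string]struct{})}
}

// Create registers a live session for the user.
func (s *SessionStore) Create(userID, sessionID string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.byUser[userID] == nil {
		s.byUser[userID] = make(map[string]struct{})
	}
	s.byUser[userID][sessionID] = struct{}{}
}

// Valid reports whether the session is still live.
func (s *SessionStore) Valid(userID, sessionID string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	_, ok := s.byUser[userID][sessionID]
	return ok
}

// RevokeAll drops every session for the user and returns how many were killed.
func (s *SessionStore) RevokeAll(userID string) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	n := len(s.byUser[userID])
	delete(s.byUser, userID)
	return n
}

// RevokeRole is a hypothetical RBAC mutation: persist first, then kill sessions.
func RevokeRole(persist func() error, sessions *SessionStore, userID string) error {
	if err := persist(); err != nil {
		// Sessions stay untouched when the mutation itself failed.
		return fmt.Errorf("revoke role: %w", err)
	}
	sessions.RevokeAll(userID)
	return nil
}

func main() {
	sessions := NewSessionStore()
	sessions.Create("alice", "s1")
	sessions.Create("alice", "s2")
	err := RevokeRole(func() error { return nil }, sessions, "alice")
	fmt.Println(err, sessions.Valid("alice", "s1"))
}
```

Because invalidation is explicit rather than TTL-driven, there is no window in which a revoked operator's session remains usable.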
Design B — Project Structure
Multi-tenant platform. Same user account can be a member of many tenants with different roles in each. Permissions are organised by feature module; tenants opt in to modules independently.

    design-b/
    ├── ent/schema/
    │   ├── tenant.go               # Tenant
    │   ├── module.go               # Module: 11 feature namespaces
    │   ├── tenant_module.go        # Tenant × Module activation (is_enabled)
    │   ├── role.go                 # Role: tenant_id (nullable), is_system
    │   ├── permission.go           # Permission: module_id FK, code
    │   ├── user.go                 # User: data_access, is_superadmin
    │   ├── user_role.go            # User → system Role (global)
    │   ├── tenant_member.go        # User × Tenant membership
    │   ├── tenant_member_role.go   # Tenant × Member × Role (3-way)
    │   └── role_permission.go      # Role → Permission (soft-delete)
    ├── internal/
    │   ├── transport/http/middleware/
    │   │   ├── tenant_context.go           # Resolver: 5-step ladder
    │   │   └── permission_middleware.go    # checkViaTenantMembership / checkViaSystemRoles
    │   ├── pkg/
    │   │   ├── permissions/
    │   │   │   ├── constants.go    # 300+ constants tagged by ModuleKey
    │   │   │   └── modules.go      # 11 ModuleKey identifiers
    │   │   └── tenantaccess/
    │   │       └── resolver.go     # EffectiveTenantScope, CanAccessTenant
    │   └── app/
    │       ├── auth/               # Session, JWT (tenant_id claim)
    │       └── role/               # System + tenant role assignment
    └── migrations/                 # Atlas versioned schema migrations

Design B — Two-Tier Roles with Module Gating
A flat single-tier model cannot satisfy multi-tenant isolation. Design B introduces two-tier roles: system roles at the platform level (tenant_id = NULL, is_system = true) and tenant-scoped roles tied to a specific tenant. The same role name can exist independently across tenants because uniqueness is scoped per (tenant_id, name).

    -- Two-tier role: system (tenant_id NULL) or tenant-scoped (tenant_id N)
    CREATE TABLE roles (
        id        INT PRIMARY KEY,
        name      TEXT NOT NULL,
        tenant_id INT REFERENCES tenants(id),      -- NULL for system roles
        is_system BOOLEAN NOT NULL DEFAULT false,
        is_active BOOLEAN NOT NULL DEFAULT true,
        CHECK (NOT is_system OR tenant_id IS NULL), -- system → no tenant
        UNIQUE NULLS NOT DISTINCT (tenant_id, name) -- per-tenant role names
    );

    -- Per-tenant membership
    CREATE TABLE tenant_members (
        id        INT PRIMARY KEY,
        user_id   INT NOT NULL REFERENCES users(id),
        tenant_id INT NOT NULL REFERENCES tenants(id),
        is_active BOOLEAN NOT NULL DEFAULT true,
        UNIQUE (user_id, tenant_id)
    );

    -- 3-way join: this is the per-tenant role isolation mechanism
    CREATE TABLE tenant_member_roles (
        tenant_id        INT NOT NULL REFERENCES tenants(id),
        tenant_member_id INT NOT NULL REFERENCES tenant_members(id),
        tenant_role_id   INT NOT NULL REFERENCES roles(id),
        is_active        BOOLEAN NOT NULL DEFAULT true,
        UNIQUE (tenant_id, tenant_member_id, tenant_role_id)
    );

    -- Module activation per tenant: the final gate at check time
    CREATE TABLE tenant_modules (
        tenant_id  INT NOT NULL REFERENCES tenants(id),
        module_id  INT NOT NULL REFERENCES modules(id),
        is_enabled BOOLEAN NOT NULL DEFAULT false,
        is_active  BOOLEAN NOT NULL DEFAULT true,
        UNIQUE (tenant_id, module_id)
    );

Permission check flow

    Request enters checkUserPermission(user, tenant, code)
        │
        ▼
    is_superadmin? ── yes ─► ALLOW (fast bypass)
        │ no
        ▼
    user.data_access ?
        │
        ├── single_tenant
        │       │
        │       ▼
        │   Path A: checkViaTenantMembership
        │       │
        │       ├─[1]─► Active TenantMember (user, tenant)?  no → skip
        │       ├─[2]─► Load active TenantMemberRoles
        │       ├─[3]─► Collect active role IDs (dedup)
        │       ├─[4]─► RolePermission(role IN ..., code=?)
        │       └─[5]─► TenantModule(tenant, perm.module).is_enabled?
        │                   yes → ALLOW
        │                   no  → fall through
        │
        └── all_tenant
                │
                ▼
            Path B: checkViaSystemRoles
                │
                ├─► Active UserRole → system Role
                ├─► RolePermission(role IN ..., code=?)
                └─► TenantModule gate still applies
                        yes → ALLOW
                        no  → DENY

Step 5, TenantModule gating, is the architectural commitment that makes module subscriptions tractable. A permission granted via a role is denied if the tenant has not enabled the owning module. Activating a module on a tenant therefore unlocks all of its associated permissions automatically; no role reassignment, no migration. Conversely, deactivating a module instantly closes the door on all of its permissions, regardless of who holds them in their role.
| DataAccess value | Check path | Tenant membership required | Use case |
|---|---|---|---|
| single_tenant | Path A (membership) → Path B fallback | Yes (for Path A) | Regular tenant users |
| all_tenant | Path B (system roles only) | No | Platform admins, integrations |
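The two paths and the module gate can be sketched against in-memory maps. This is an illustration under stated assumptions rather than the real checkUserPermission: Store and its fields are hypothetical stand-ins for the join tables, the membership and member-role lookups are merged into one map, and "billing.view" is an assumed permission code.

```go
package main

import "fmt"

type DataAccess string

const (
	SingleTenant DataAccess = "single_tenant"
	AllTenant    DataAccess = "all_tenant"
)

type User struct {
	ID           int
	IsSuperadmin bool
	DataAccess   DataAccess
}

// Store holds in-memory stand-ins for the join tables; all names are illustrative.
type Store struct {
	MemberRoles   map[[2]int][]int        // (userID, tenantID) -> active tenant role IDs
	SystemRoles   map[int][]int           // userID -> active system role IDs
	RolePerms     map[int]map[string]bool // roleID -> active permission codes
	PermModule    map[string]string       // permission code -> owning module key
	TenantModules map[int]map[string]bool // tenantID -> module key -> is_enabled
}

// rolesGrant reports whether any of the roles carries the permission code.
func (s *Store) rolesGrant(roleIDs []int, code string) bool {
	for _, id := range roleIDs {
		if s.RolePerms[id][code] {
			return true
		}
	}
	return false
}

// moduleEnabled is the final gate: the tenant must have enabled the owning module.
func (s *Store) moduleEnabled(tenantID int, code string) bool {
	mod, ok := s.PermModule[code]
	return ok && s.TenantModules[tenantID][mod]
}

// CheckUserPermission follows the flow: bypass, Path A for single_tenant, then Path B.
func (s *Store) CheckUserPermission(u User, tenantID int, code string) bool {
	if u.IsSuperadmin {
		return true // fast bypass, before any gate
	}
	if u.DataAccess == SingleTenant {
		// Path A: a missing membership simply yields no roles.
		roles := s.MemberRoles[[2]int{u.ID, tenantID}]
		if s.rolesGrant(roles, code) && s.moduleEnabled(tenantID, code) {
			return true
		}
	}
	// Path B: system roles, with the module gate still applied.
	return s.rolesGrant(s.SystemRoles[u.ID], code) && s.moduleEnabled(tenantID, code)
}

func newDemoStore() *Store {
	return &Store{
		MemberRoles:   map[[2]int][]int{{1, 100}: {10}},
		SystemRoles:   map[int][]int{2: {20}},
		RolePerms:     map[int]map[string]bool{10: {"billing.view": true}, 20: {"billing.view": true}},
		PermModule:    map[string]string{"billing.view": "billing"},
		TenantModules: map[int]map[string]bool{100: {"billing": true}, 200: {"billing": false}},
	}
}

func main() {
	s := newDemoStore()
	admin := User{ID: 2, DataAccess: AllTenant}
	fmt.Println(s.CheckUserPermission(admin, 200, "billing.view")) // module disabled
	s.TenantModules[200]["billing"] = true                         // tenant subscribes
	fmt.Println(s.CheckUserPermission(admin, 200, "billing.view")) // no role change needed
}
```

Flipping one TenantModules entry changes the outcome for every holder of the permission, which is the property the gate exists to provide.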
Permission catalog shape

    // 11 feature modules; each permission belongs to exactly one.
    const (
        ModuleCore       = "core"
        ModuleBilling    = "billing"
        ModuleReporting  = "reporting"
        ModuleAssessment = "assessment"
        ModuleAttendance = "attendance"
        // + 6 more
    )

    type Definition struct {
        Code      string
        Name      string
        ModuleKey string // ← gates the permission at check time
    }

    func Definitions() []Definition {
        return []Definition{
            {Code: RECORD_VIEW, ModuleKey: ModuleCore},
            {Code: BILLING_VIEW, ModuleKey: ModuleBilling},
            {Code: ATTENDANCE_MARK, ModuleKey: ModuleAttendance},
            // ... 300+ entries total
        }
    }

Tenant Context Resolution
Before any permission check runs in Design B, the middleware must decide which tenant governs the request. The Resolver implements a 5-step ladder, centralised so no handler re-implements tenant selection.
| Step | Condition | Outcome |
|---|---|---|
| 1 | Request specifies tenant + user may access it | Use requested tenant |
| 2 | Request specifies tenant + user may NOT access | Return Denied=true (403) |
| 3 | No request tenant + JWT carries tenant_id | Use JWT tenant |
| 4 | No JWT tenant + user is single_tenant | Use MIN active membership |
| 5 | None of the above | Resolved=false (caller handles) |

    func (r *Resolver) Resolve(
        ctx context.Context,
        usr *ent.User,
        requestedTenantID, jwtTenantID NullableTenantID,
    ) (EffectiveTenantScope, error) {
        // Steps 1-2: an explicit request tenant wins, but only if the user may access it.
        if requestedTenantID.IsSet {
            ok, err := r.CanAccessTenant(ctx, usr, requestedTenantID.Value)
            if err != nil {
                return EffectiveTenantScope{}, err // a failed lookup is not a denial
            }
            if !ok {
                return EffectiveTenantScope{Denied: true}, nil
            }
            return EffectiveTenantScope{TenantID: requestedTenantID.Value, Resolved: true}, nil
        }
        // Step 3: fall back to the tenant carried in the JWT.
        if jwtTenantID.IsSet {
            return EffectiveTenantScope{TenantID: jwtTenantID.Value, Resolved: true}, nil
        }
        // Step 4: single_tenant users fall back to their MIN active membership.
        if usr.DataAccess == DataAccessSingleTenant {
            minID, err := r.minActiveMembership(ctx, usr.ID)
            if err != nil {
                return EffectiveTenantScope{}, err // never report Resolved with a zero tenant
            }
            return EffectiveTenantScope{TenantID: minID, Resolved: true}, nil
        }
        // Step 5: nothing matched; the caller handles Resolved=false.
        return EffectiveTenantScope{}, nil
    }

Access enforcement is DataAccess-aware: all_tenant users may access any active tenant; single_tenant users may only access tenants where they have an active TenantMember row. This keeps the access boundary consistent between the tenant-selection step and the permission-resolution step, giving a single source of truth for 'can this user see this tenant at all?'.
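The access rule in the paragraph above reduces to a small switch. The sketch below uses a hypothetical in-memory AccessStore and a simplified signature (no context, no error return), so it should be read as an illustration of the rule rather than the real CanAccessTenant.

```go
package main

import "fmt"

type DataAccess string

const (
	SingleTenant DataAccess = "single_tenant"
	AllTenant    DataAccess = "all_tenant"
)

type User struct {
	ID         int
	DataAccess DataAccess
}

// AccessStore is a hypothetical in-memory stand-in for the tenant tables.
type AccessStore struct {
	ActiveTenants     map[int]bool         // tenantID -> tenant is active
	ActiveMemberships map[int]map[int]bool // userID -> tenantID -> active TenantMember row
}

// CanAccessTenant applies the DataAccess-aware rule described in the text.
func (s *AccessStore) CanAccessTenant(u User, tenantID int) bool {
	switch u.DataAccess {
	case AllTenant:
		// Platform users may access any active tenant.
		return s.ActiveTenants[tenantID]
	case SingleTenant:
		// Regular users need an active membership row for this tenant.
		return s.ActiveMemberships[u.ID][tenantID]
	default:
		// Unknown values fail closed.
		return false
	}
}

func newDemoAccess() *AccessStore {
	return &AccessStore{
		ActiveTenants:     map[int]bool{100: true, 200: true, 300: false},
		ActiveMemberships: map[int]map[int]bool{1: {100: true, 200: false}},
	}
}

func main() {
	s := newDemoAccess()
	member := User{ID: 1, DataAccess: SingleTenant}
	admin := User{ID: 2, DataAccess: AllTenant}
	fmt.Println(s.CanAccessTenant(member, 100), s.CanAccessTenant(member, 200))
	fmt.Println(s.CanAccessTenant(admin, 200), s.CanAccessTenant(admin, 300))
}
```

Keeping this rule in one function is what lets both tenant selection and permission resolution agree on the same boundary.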
Concurrent Role-Assignment Validation
Assigning multiple roles in one request requires validating that the user and every target role exist and are active. These are independent DB lookups, so they run in parallel: provided the connection pool can serve them concurrently, the worst-case latency is the slowest individual query, not the sum.

    func (s *rbacService) AssignRoles(ctx context.Context, userID uuid.UUID, roleIDs []uuid.UUID) error {
        var (
            wg   sync.WaitGroup
            mu   sync.Mutex // guards errs across goroutines
            errs []string
        )
        wg.Add(1 + len(roleIDs))

        // One goroutine validates the user...
        go func() {
            defer wg.Done()
            if _, err := s.domain.GetActiveUser(ctx, userID); err != nil {
                mu.Lock()
                errs = append(errs, err.Error())
                mu.Unlock()
            }
        }()
        // ...and one per role validates that the role exists and is active.
        for _, id := range roleIDs {
            go func(id uuid.UUID) {
                defer wg.Done()
                if _, err := s.domain.GetActiveRole(ctx, id); err != nil {
                    mu.Lock()
                    errs = append(errs, err.Error())
                    mu.Unlock()
                }
            }(id)
        }
        wg.Wait()

        // Report every failure at once; the order of messages is not deterministic.
        if len(errs) > 0 {
            return fmt.Errorf("validation failed: %s", strings.Join(errs, "; "))
        }
        return s.revokeUserAccess(ctx, userID, "rbac_role_change")
    }

Key Design Decisions
Soft-delete join rows (is_active=false) in both designs
Why: A revoked row stays in the table with is_active=false, which keeps the assignment recoverable for audit. Because that row still occupies the UNIQUE (user_id, role_id) or (role_id, permission_id) key, a regrant must reactivate it rather than insert a new row; a plain INSERT would violate the index.
Alternative: Hard delete: a cleaner table with no collision on regrant, but it destroys the audit trail.
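To make the collision concrete: once a revoked row stays in the table, a plain INSERT on regrant hits the UNIQUE index, so the regrant has to reactivate the existing row. The sketch below models that with an in-memory map standing in for user_roles. UserRoles and its methods are hypothetical, and grantSQL shows the PostgreSQL upsert that Grant stands in for (it is not executed here).

```go
package main

import (
	"errors"
	"fmt"
)

// ErrUniqueViolation mimics the database error for a duplicate key.
var ErrUniqueViolation = errors.New("unique violation on (user_id, role_id)")

type key struct{ UserID, RoleID string }

// UserRoles models the user_roles table: one row per key, revoked rows remain.
type UserRoles struct {
	rows map[key]bool // value is is_active
}

func NewUserRoles() *UserRoles { return &UserRoles{rows: map[key]bool{}} }

// Insert mimics a plain INSERT: it fails if any row, active or not, holds the key.
func (t *UserRoles) Insert(userID, roleID string) error {
	k := key{userID, roleID}
	if _, exists := t.rows[k]; exists {
		return ErrUniqueViolation
	}
	t.rows[k] = true
	return nil
}

// Revoke soft-deletes: the row stays, only is_active flips to false.
func (t *UserRoles) Revoke(userID, roleID string) {
	k := key{userID, roleID}
	if _, exists := t.rows[k]; exists {
		t.rows[k] = false
	}
}

// Grant is the upsert: insert a new row, or reactivate the existing one.
func (t *UserRoles) Grant(userID, roleID string) {
	t.rows[key{userID, roleID}] = true
}

// Active reports whether the assignment is currently live.
func (t *UserRoles) Active(userID, roleID string) bool {
	return t.rows[key{userID, roleID}]
}

// Len returns the number of rows, active or revoked.
func (t *UserRoles) Len() int { return len(t.rows) }

// grantSQL is the PostgreSQL statement that Grant stands in for.
const grantSQL = `
INSERT INTO user_roles (user_id, role_id, is_active)
VALUES ($1, $2, true)
ON CONFLICT (user_id, role_id) DO UPDATE SET is_active = true`

func main() {
	tbl := NewUserRoles()
	tbl.Grant("u1", "r1")
	tbl.Revoke("u1", "r1")
	fmt.Println(tbl.Insert("u1", "r1")) // the revoked row still holds the key
	tbl.Grant("u1", "r1")
	fmt.Println(tbl.Active("u1", "r1"), tbl.Len())
}
```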
Flat single-tier roles for Design A, two-tier for Design B
Why: Design A has one operational context; roles don't need tenant-level scoping. Adding a second tier would only introduce join complexity. Design B has strict multi-tenant isolation where the same user account can serve different roles per tenant — exactly what TenantMemberRole solves.
Alternative: One unified model for both: either over-engineers A or under-engineers B.
TenantModule gating at check time (Design B)
Why: Tenants subscribe to feature modules; a tenant without Billing should not have billing permissions regardless of their roles. Gating at check time means activating a module unlocks its permissions automatically — no role reassignment, no migration.
Alternative: Revoke all permissions in deactivated modules explicitly: requires migration on every module change and is error-prone.
Per-request DB resolution, no permission cache
Why: Both designs resolve permissions on every request. Combined with Redis session revocation (Design A) and DataAccess-aware tenant resolution (Design B), a revoked permission is denied on the user's very next request. Compliance-critical systems can't tolerate stale ALLOW decisions.
Alternative: Permission cache with TTL: faster per-request, but revoked permissions stay live until expiry.
DataAccess enum (single_tenant vs all_tenant) on users in Design B
Why: Platform admins managing many tenants shouldn't be enrolled as TenantMember in every tenant. DataAccess=all_tenant bypasses the membership requirement and routes to system roles only — avoiding a proliferation of TenantMember rows while keeping the check path consistent.
Alternative: Enrol all_tenant users in every tenant: uniform path, but requires a data migration every time a new tenant is created.
Superadmin escape — name-based (A) vs flag-based (B)
Why: Both designs need an emergency-access path that survives catalog corruption. Design A uses a hardcoded 'Super Admin' role name. Design B adds an is_superadmin boolean on User for a faster bypass, gated by a separate MANAGE_PRIVILEGE permission to prevent escalation. The flag is more robust because it survives role renames.
Alternative: Superadmin as just another role with all permissions: correct normally, but breaks if a permission catalog migration partially fails.
Tradeoffs Summary
- Single-tier vs two-tier roles: flat is simpler and correct when there is one tenancy axis; the second tier is justified the moment 'role per tenant' becomes a real requirement, not a hypothetical.
- Per-request DB resolution vs cached permissions: chose correctness — revoked access propagates within one request. A targeted per-user-per-tenant cache with explicit invalidation is the right next step under load.
- Module gating at check vs at grant: check-time gating decouples module subscriptions from role assignments. The cost is one extra index lookup per request; the saving is zero migrations on module changes.
- Scope context (A) vs tenant context (B): both encode 'what data may this user see', but on different axes. A injects a per-request ScopeContext for row-level filtering; B routes through tenant membership and module enablement.
- Soft delete vs hard delete on joins: soft delete preserves the audit trail, and regrant reactivates the existing row so the UNIQUE index is never violated. The cost is permanent table growth, which is acceptable here because join tables stay small relative to the data tables they govern.
Lessons & What I'd Change
The hardcoded 'Super Admin' string in Design A is the most fragile part. A role rename silently breaks emergency access. I'd replace it with an is_super_admin boolean column on the roles table — same bypass semantics, survives renames, and is auditable as data rather than as a string match in code. Design B already handles this better via the is_superadmin user flag.
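A small sketch of the difference, assuming a Role struct with the proposed flag (the Go field name IsSuperAdmin and both helper functions are hypothetical):

```go
package main

import "fmt"

// Role carries the proposed is_super_admin column as a field.
type Role struct {
	Name         string
	IsActive     bool
	IsSuperAdmin bool
}

const superAdminName = "Super Admin"

// bypassByName is the string-match check: it depends on the role never being renamed.
func bypassByName(roles []Role) bool {
	for _, r := range roles {
		if r.IsActive && r.Name == superAdminName {
			return true
		}
	}
	return false
}

// bypassByFlag is the proposed check: the bypass is data on the role row.
func bypassByFlag(roles []Role) bool {
	for _, r := range roles {
		if r.IsActive && r.IsSuperAdmin {
			return true
		}
	}
	return false
}

func main() {
	// The same role after an admin renamed it in the UI.
	renamed := []Role{{Name: "Platform Owner", IsActive: true, IsSuperAdmin: true}}
	fmt.Println(bypassByName(renamed), bypassByFlag(renamed))
}
```

After the rename the name-based check silently stops matching, while the flag-based check is unaffected.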
Design A injects ScopeContext unconditionally on every request, even for endpoints that only need permission checking and don't filter by scope. A lazy pattern, computing scope only on first access, would eliminate the unnecessary DB round-trips on those endpoints.
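One way to sketch the lazy pattern is a sync.Once wrapper that the middleware would place on the request context in place of the resolved value. LazyScope and its loader signature are assumptions for illustration, as are the ScopeContext field types.

```go
package main

import (
	"fmt"
	"sync"
)

// ScopeContext mirrors the fields named in the text; the field types are assumed.
type ScopeContext struct {
	ScopeID    string
	DeptIDs    []string
	DataAccess string
}

// LazyScope defers the scope lookup until a handler first asks for it.
type LazyScope struct {
	once sync.Once
	load func() (ScopeContext, error)
	sc   ScopeContext
	err  error
}

// NewLazyScope wraps a loader; nothing runs until Get is called.
func NewLazyScope(load func() (ScopeContext, error)) *LazyScope {
	return &LazyScope{load: load}
}

// Get runs the loader at most once, even with concurrent callers,
// and returns the same result on every later call.
func (l *LazyScope) Get() (ScopeContext, error) {
	l.once.Do(func() { l.sc, l.err = l.load() })
	return l.sc, l.err
}

// loadsAfter reports how many times the loader ran after n calls to Get.
func loadsAfter(n int) int {
	loads := 0
	ls := NewLazyScope(func() (ScopeContext, error) {
		loads++ // stands in for the DB round-trip
		return ScopeContext{ScopeID: "unit-7"}, nil
	})
	for i := 0; i < n; i++ {
		if _, err := ls.Get(); err != nil {
			return -1
		}
	}
	return loads
}

func main() {
	fmt.Println(loadsAfter(0), loadsAfter(3)) // prints "0 1"
}
```

Endpoints that never call Get pay nothing, and endpoints that call it repeatedly still pay for a single lookup.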
Design B's permission check is 3–4 DB round-trips per request: TenantMember, TenantMemberRole, RolePermission, TenantModule. At low request volume this is fine. At scale, a short-TTL per-user-per-tenant permission cache (key = userID + tenantID + permissionCode) with explicit invalidation on RBAC mutations would collapse the hot path to one Redis lookup. The invalidation set is bounded — only the affected user and tenant need eviction — so cache coherence stays tractable.
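A rough in-memory sketch of such a cache follows. It groups decisions under a (userID, tenantID) key so that eviction is a single delete, which is an assumption about layout (in Redis this could be one hash per user and tenant). PermCache and its methods are hypothetical names, and resolve stands in for the existing per-request DB path.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type userTenant struct{ UserID, TenantID int }

// decision is one cached allow/deny result with its expiry.
type decision struct {
	allowed bool
	expires time.Time
}

// PermCache caches decisions per user and tenant, keyed inside by permission code.
type PermCache struct {
	mu  sync.Mutex
	ttl time.Duration
	m   map[userTenant]map[string]decision
}

func NewPermCache(ttl time.Duration) *PermCache {
	return &PermCache{ttl: ttl, m: make(map[userTenant]map[string]decision)}
}

// Check returns the cached decision, or calls resolve on a miss or after expiry.
func (c *PermCache) Check(userID, tenantID int, code string, resolve func() bool) bool {
	k := userTenant{userID, tenantID}
	c.mu.Lock()
	d, ok := c.m[k][code]
	c.mu.Unlock()
	if ok && time.Now().Before(d.expires) {
		return d.allowed
	}
	allowed := resolve() // resolved outside the lock so slow lookups do not block others
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.m[k] == nil {
		c.m[k] = make(map[string]decision)
	}
	c.m[k][code] = decision{allowed: allowed, expires: time.Now().Add(c.ttl)}
	return allowed
}

// Invalidate evicts every cached decision for one user in one tenant.
func (c *PermCache) Invalidate(userID, tenantID int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.m, userTenant{userID, tenantID})
}

// newDemoCache builds a cache with a one-minute TTL for the demo below.
func newDemoCache() *PermCache { return NewPermCache(time.Minute) }

func main() {
	cache := newDemoCache()
	calls := 0
	resolve := func() bool { calls++; return true }
	cache.Check(1, 100, "billing.view", resolve)
	cache.Check(1, 100, "billing.view", resolve) // served from cache
	cache.Invalidate(1, 100)                     // e.g. after a role change
	cache.Check(1, 100, "billing.view", resolve) // resolved again
	fmt.Println("resolver calls:", calls)
}
```

A production version would also need to handle a resolve that races with an invalidation, since the sketch can store a decision computed just before the eviction.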
Building both designs in sequence was clarifying. Design A confirmed that a flat model is the right default. Design B made it clear which specific requirements justify each added layer: multi-tenancy requires TenantMemberRole, module subscriptions require gating at check time, and per-tenant permission isolation requires DataAccess as an explicit user attribute rather than something inferred at query time. The shape of an access-control system should be derived from its requirements, not transplanted from another codebase.