Web Ingress and Admin UI Model

Status: target architecture and implementation contract.

This document defines how HTTP/HTTPS web entry, Admin UI, dynamic page composition, and cluster configuration responsibilities are separated in the Secure Access Fabric.

The fabric node-to-node transport remains QUIC-only. HTTP/HTTPS is allowed only as an external client-facing service edge.

Purpose

The platform needs a clear distinction between:

  • Web Service as the HTTP/HTTPS entry layer
  • Control Plane as the owner of cluster configuration and policy
  • Admin UI as a safe, scoped user interface over Control Plane APIs
  • Fabric Transport as the internal QUIC-only node-to-node substrate

The Web layer must never become the owner of cluster state, policy, topology, secrets, node identity, or routing authority.

Layer Ownership

Public HTTPS Ingress

Public HTTPS Ingress is an edge service. It may run on a public Internet node, including a small/slow node intended only to accept browser traffic and pass it into the fabric.

Role names:

  • public-ingress
  • admin-ingress

Responsibilities:

  • listen on TCP 80 only for ACME challenges, health checks, and HTTPS redirects
  • listen on TCP 443 for browser/API HTTPS
  • terminate TLS or sit behind the approved TLS terminator
  • serve only approved static UI shells and safe public metadata
  • validate SNI/Host, request size, rate limits, and edge policy
  • map the request to an allowed platform, cluster, organization, or user portal scope
  • forward accepted traffic into the fabric through an authorized fabric service channel
  • apply edge controls such as headers, rate limits, request size limits, and future WAF rules
  • expose only approved public/admin endpoints

Public HTTPS Ingress must not:

  • own cluster configuration
  • directly mutate PostgreSQL
  • store durable topology or policy
  • store secrets
  • store node identity or certificates as source of truth
  • expose internal mesh topology to browser clients
  • execute cluster decisions locally
  • hold platform/global admin authority keys
  • infer authorization from the fact that it accepted TCP 443
  • become a general relay for arbitrary HTTP inside the fabric

Accepting HTTPS does not make a node the owner or executor of admin logic. The accepting node is only a service edge.

Fabric Transport

Fabric Transport is the internal node-to-node layer.

Rules:

  • node-to-node traffic uses QUIC only
  • no HTTP fallback between fabric nodes
  • STUN/ICE/rendezvous/relay are fabric transport mechanisms, not browser/API protocols
  • any service traffic accepted on 443 is converted into a scoped fabric service channel before it crosses the mesh
  • direct links, relay links, and route-health observations must remain separate in diagnostics
  • a fabric route proves reachability, not administrative authority

If a public ingress receives a request for an admin surface, the request flow is:

Browser HTTPS
  -> public/admin ingress on 443
  -> tenant/cluster/platform scope selection
  -> signed fabric service channel over QUIC
  -> authorized admin/runtime service node
  -> Control Plane authorization and policy

Control Plane

Control Plane owns all durable cluster configuration and policy.

Responsibilities:

  • clusters
  • nodes
  • node enrollment and approval
  • role assignments
  • organization and tenant policy
  • service desired state
  • service endpoint visibility
  • signed scoped snapshots
  • config distribution rules
  • audit
  • high-risk action authorization
  • step-up authentication requirements

PostgreSQL remains the durable source of truth. Redis remains live coordination only.

Cluster configuration is changed only through Control Plane services and APIs. The Web layer is a presentation and ingress layer over those APIs.

Admin UI Runtime

Admin UI Runtime is the service that serves and executes the admin surface. It may run on any node explicitly assigned the matching runtime role.

Role names:

  • global-admin-runtime
  • cluster-admin-runtime
  • organization-portal-runtime
  • user-portal-runtime
  • identity-runtime
  • policy-authority
  • audit-sink

Admin UI is a client application served through Public HTTPS Ingress or Admin UI Runtime according to deployment policy.

It renders safe Control Plane projections and submits user actions to Control Plane APIs.

Admin UI must not:

  • contain embedded internal topology
  • contain secrets
  • contain raw credential references beyond safe indicators
  • contain peer cache data
  • contain route cache data
  • contain private node-to-node endpoints unless explicitly authorized for the viewer
  • contain executable cluster logic

Admin Endpoint Placement And Trust

Admin UI endpoint placement is explicit and must not be inferred from storage.

Scopes:

  • Platform Owner Console: global platform-owner scope. It may aggregate multiple clusters through Control Plane APIs according to platform policy and audit.
  • Cluster Admin Endpoint: cluster-local admin/web ingress endpoint for a single cluster. It is hosted only by nodes explicitly assigned an approved admin/web ingress role.
  • Organization Admin Panel: tenant-safe projection for one organization. It must expose only allowed resources, service endpoints, sessions, policies, and safe status.
  • User Portal: personal/account scope. It must expose only the authenticated user's resources, sessions, devices, and profile actions.

Rules:

  • Fabric Storage / Config Storage nodes do not automatically host Admin UI.
  • Adding a storage node to a new cluster does not move the cluster panel.
  • Storage nodes distribute/cache scoped configuration and snapshots only.
  • Admin/web ingress is a separate service role and requires explicit Control Plane assignment.
  • Public Internet ingress is not enough to run a global panel.
  • global-admin-runtime, policy-authority, and audit-sink may run only on platform-owner trusted nodes.
  • cluster-admin-runtime may run only on nodes authorized for that cluster.
  • organization-portal-runtime and user-portal-runtime may run on broader infrastructure, but they receive only scoped projections.
  • Cluster-local admin endpoints require valid TLS/cert policy, signed scoped snapshots, current node health, and sufficient role coverage.
  • Platform Owner Console remains the owner-level view even when cluster-local admin endpoints exist.
  • Organization Admin Panel must never expose intermediate mesh topology, storage shards, peer caches, route caches, or unrelated cluster data.
  • A request entering through an organization-bound ingress must be rejected if it asks for another organization, another cluster outside its contract, global topology, or platform-owner data.
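The placement rules above can be read as a predicate over a node's explicit role assignments and trust attributes. The following is a minimal sketch under stated assumptions: the `Node` fields and the `may_host` helper are hypothetical names introduced for illustration, not part of the Control Plane schema.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

# Roles that the rules above restrict to platform-owner trusted nodes.
PLATFORM_TRUSTED_ONLY = {"global-admin-runtime", "policy-authority", "audit-sink"}


@dataclass
class Node:
    # All field names are illustrative assumptions.
    node_id: str
    platform_owner_trusted: bool = False
    authorized_clusters: Set[str] = field(default_factory=set)
    assigned_roles: Set[str] = field(default_factory=set)  # explicit assignments only


def may_host(node: Node, role: str, cluster_id: Optional[str] = None) -> bool:
    """Return True only if the node may host the given runtime role."""
    # Placement is never inferred from storage or reachability:
    # the role must have been explicitly assigned.
    if role not in node.assigned_roles:
        return False
    if role in PLATFORM_TRUSTED_ONLY:
        return node.platform_owner_trusted
    if role == "cluster-admin-runtime":
        return cluster_id is not None and cluster_id in node.authorized_clusters
    # Other roles: explicit assignment is the only check modelled in this sketch.
    return True
```

With this shape, a storage node that was never assigned an admin role fails the first check, and a public ingress node that is not platform-owner trusted cannot host `global-admin-runtime` even if the role was assigned to it by mistake.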

Request Flow

Admin Browser
  -> Public/Admin HTTPS Ingress
  -> Fabric Service Channel over QUIC
  -> Admin UI Runtime / Control API
  -> PostgreSQL source of truth
  -> signed scoped snapshots / config distribution
  -> rap-node-agent

Web Ingress may cache static assets and safe UI manifests, but it must not become a second source of truth.

Dynamic Admin Pages

Admin pages may be dynamically composed, but they must be generated from safe metadata and scoped projections.

The recommended model is:

Admin Web Shell
  -> UI Manifest / Page Definition endpoint
  -> Scoped Control API endpoints

Dynamic pages are allowed for:

  • platform admin sections
  • cluster admin sections
  • node detail sections
  • service adapter safe configuration sections
  • future organization admin sections

Dynamic pages must be declarative. They must not inject arbitrary executable code from the backend into the browser.

UI Manifest Model

The Control Plane may provide a ui_manifest or page definition for a specific viewer context.

Viewer context includes:

  • user id
  • platform role
  • organization memberships
  • cluster access scope
  • device trust state
  • MFA / step-up state
  • feature flags
  • service availability

The manifest may include:

  • visible navigation sections
  • page ids
  • component ids from an approved component registry
  • form schemas
  • table schemas
  • safe field labels and message keys
  • allowed actions
  • action risk level
  • API route references
  • required permissions
  • required step-up authentication flags
  • audit event category
  • refresh hints

The manifest must not include:

  • secrets
  • raw credentials
  • private keys
  • full mesh topology
  • full peer cache
  • route cache
  • unrelated organization data
  • unrelated cluster data
  • internal node-to-node route details
  • arbitrary JavaScript or executable code
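As an illustration of the two lists above, the sketch below shows one possible manifest shape and a recursive scan that flags forbidden keys before a manifest leaves the Control Plane. The manifest field names and the contents of `FORBIDDEN_KEYS` are assumptions made for this example; the actual `ui_manifest` schema is deferred to WEB-2.

```python
from typing import Any, List

# Hypothetical key names standing in for the "must not include" list above.
FORBIDDEN_KEYS = {
    "secret", "credentials", "private_key", "mesh_topology",
    "peer_cache", "route_cache", "script", "javascript",
}


def forbidden_paths(value: Any, path: str = "$") -> List[str]:
    """Walk a manifest and return the paths of any forbidden keys."""
    found: List[str] = []
    if isinstance(value, dict):
        for key, child in value.items():
            child_path = f"{path}.{key}"
            if str(key).lower() in FORBIDDEN_KEYS:
                found.append(child_path)
            found.extend(forbidden_paths(child, child_path))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            found.extend(forbidden_paths(child, f"{path}[{index}]"))
    return found


# Illustrative manifest shape; not the real ui_manifest schema.
manifest = {
    "sections": [
        {
            "page_id": "cluster.nodes",
            "component_id": "table",
            "actions": [
                {
                    "id": "node.approve",
                    "risk_level": "high",
                    "requires_step_up": True,
                    "api_route": "control.nodes.approve",
                    "audit_category": "node_enrollment",
                }
            ],
        }
    ],
    "refresh_hint_seconds": 30,
}
```

A key scan of this kind is only a defence-in-depth check. The primary control remains building the manifest from scoped projections that never contain the forbidden data in the first place.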

Page Definition Safety Rules

Dynamic pages are schema-driven views over safe data.

Rules:

  • page definitions are data, not code
  • page definitions must use an approved component registry
  • fields must be explicitly typed
  • actions must map to known Control Plane operations
  • every action must be permission checked server-side
  • high-risk actions must declare step-up requirements
  • all mutations must be audited
  • UI labels should use localization message keys with English fallback text
  • sensitive responses should use Cache-Control: no-store

Client-side hiding is not authorization. The Control Plane must enforce all permissions and policies even if a browser crafts a request manually.
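The rules above lend themselves to a server-side validator that runs before a page definition is served. This is a minimal sketch: the registries, the field type names, and the operation ids are hypothetical placeholders, and the page definition layout is assumed rather than taken from a defined schema.

```python
from typing import Dict, List

# Hypothetical registries; the real component registry is defined in WEB-2.
APPROVED_COMPONENTS = {"table", "form", "status_badge"}
ALLOWED_FIELD_TYPES = {"string", "integer", "boolean", "enum", "timestamp"}
# Known Control Plane operations mapped to an assumed risk level.
KNOWN_OPERATIONS = {"nodes.list": "low", "nodes.approve": "high"}


def validate_page_definition(page: Dict) -> List[str]:
    """Return a list of rule violations; an empty list means the page is acceptable."""
    errors: List[str] = []
    if page.get("component_id") not in APPROVED_COMPONENTS:
        errors.append("component not in approved registry")
    for field_def in page.get("fields", []):
        if field_def.get("type") not in ALLOWED_FIELD_TYPES:
            errors.append(f"field {field_def.get('name')!r} is not explicitly typed")
    for action in page.get("actions", []):
        operation = action.get("operation")
        if operation not in KNOWN_OPERATIONS:
            errors.append(f"action {operation!r} does not map to a known operation")
        elif KNOWN_OPERATIONS[operation] == "high" and not action.get("requires_step_up"):
            errors.append(f"high-risk action {operation!r} must declare step-up")
    return errors
```

Validating the definition does not replace the per-request permission check. It only ensures that what the browser is told about a page is consistent with what the Control Plane is prepared to enforce.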

Safe Data Projection

The Control Plane should expose different projections for different audiences.

Platform owner/admin may see:

  • clusters
  • nodes
  • join requests
  • role assignments
  • safe topology summaries
  • service placement
  • health and audit
  • partition/recovery status
  • active node for cluster-managed services where allowed

Organization admin may see only:

  • organization resources
  • organization users/groups where authorized
  • organization policies
  • active sessions
  • allowed ingress endpoints
  • allowed egress/service endpoints
  • safe VPN/connector status
  • organization audit

Organization admin must not see:

  • intermediate core mesh topology
  • other organizations
  • peer caches
  • route caches
  • unrelated nodes
  • platform trust roots
  • raw node certificates
  • secrets
  • unrelated cluster internals

Ingress-bound projections:

  • A platform-owner ingress may expose platform navigation only after platform authorization, MFA/step-up, and policy checks.
  • A cluster-bound ingress may expose only that cluster's admin surface and cluster-scoped safe diagnostics.
  • An organization-bound ingress may expose only the organization projection and organization-safe service endpoints.
  • A user portal ingress may expose only the user's personal/account projection.
  • Host/SNI alone is not authorization; it only selects the maximum possible projection before server-side authorization narrows it further.
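The last rule can be sketched as a two-step narrowing: the Host binding sets a ceiling, and server-side authorization can only lower it. The host names, the `INGRESS_BINDINGS` table, and the treatment of projection levels as a simple linear order are all assumptions for illustration. A real check would also compare scope identities (which organization, which cluster), which this sketch omits.

```python
from typing import Dict, Optional

# Projection levels ordered from narrowest to broadest (assumed linear order).
LEVELS = ["user", "organization", "cluster", "platform"]

# Hypothetical host bindings; real bindings come from Control Plane assignment.
INGRESS_BINDINGS: Dict[str, str] = {
    "admin.example.test": "platform",
    "cluster-a.example.test": "cluster",
    "acme.portal.example.test": "organization",
    "me.example.test": "user",
}


def effective_projection(host: str, authorized_level: Optional[str]) -> Optional[str]:
    """Return the projection level to serve, or None if the request is rejected."""
    ceiling = INGRESS_BINDINGS.get(host.lower())
    # Unknown host or no server-side authorization: nothing is exposed.
    if ceiling is None or authorized_level not in LEVELS:
        return None
    # Serve the narrower of the ingress ceiling and the authorized level.
    return LEVELS[min(LEVELS.index(ceiling), LEVELS.index(authorized_level))]
```

Under these assumptions, a platform owner who arrives through an organization-bound host still receives only the organization projection, and an unauthenticated request to the platform host receives nothing.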

Service Adapter UI Extensions

Service adapters may need configuration UI.

Examples:

  • RDP resource settings
  • VNC resource settings
  • SSH resource settings
  • VPN/IP tunnel connection settings
  • file policy settings
  • video/audio policy settings

Adapter UI extensions must be registered as safe schema descriptors through the Control Plane. Adapters must not directly publish arbitrary browser code.

Allowed extension content:

  • field schema
  • validation hints
  • policy options
  • message keys
  • safe help text
  • action ids mapped to Control Plane APIs

Disallowed extension content:

  • executable code
  • protocol secrets
  • internal adapter memory/state
  • raw target credentials
  • unrestricted backend endpoints

Cluster Configuration Ownership

Cluster configuration belongs to Control Plane.

Examples:

  • cluster creation and disablement
  • node approval
  • node role assignment
  • service desired state
  • VPN connection desired state
  • allowed node policy
  • route policy
  • QoS policy
  • signed snapshot generation
  • storage/config distribution scope

Admin UI may present these controls, but it does not own the decisions.

The authoritative path is:

Admin action
  -> Control API authorization
  -> policy validation
  -> PostgreSQL mutation
  -> audit event
  -> snapshot/config distribution update
  -> node-agent consumption
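The ordering of the authoritative path can be illustrated with a small in-memory sketch. Every name here (`apply_admin_action`, `Rejected`, the callback parameters, the action fields) is hypothetical, and a plain dictionary stands in for PostgreSQL. A real implementation would presumably commit the mutation and its audit event together.

```python
from typing import Any, Callable, Dict, List


class Rejected(Exception):
    """Raised when authorization or policy validation fails."""


def apply_admin_action(
    action: Dict[str, Any],
    viewer: Dict[str, Any],
    authorize: Callable[[Dict[str, Any], Dict[str, Any]], bool],
    validate_policy: Callable[[Dict[str, Any]], bool],
    store: Dict[str, Any],  # stand-in for PostgreSQL
    audit_log: List[Dict[str, Any]],
    publish: Callable[[Dict[str, Any]], None],
) -> None:
    # 1. Control API authorization, enforced server-side.
    if not authorize(viewer, action):
        raise Rejected("not authorized")
    # 2. Policy validation.
    if not validate_policy(action):
        raise Rejected("policy violation")
    # 3. Durable mutation (a dict here; PostgreSQL in the real system).
    store[action["key"]] = action["value"]
    # 4. Audit event for the mutation.
    audit_log.append(
        {"actor": viewer["user_id"], "action": action["name"], "key": action["key"]}
    )
    # 5. Snapshot/config distribution update, later consumed by node-agent.
    publish(dict(store))
```

The point of the sketch is the order: nothing is written, audited, or distributed until both authorization and policy validation have passed, and the Admin UI never appears in the path as a decision maker.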

Security Requirements

Web/Admin security requirements:

  • TLS for all browser traffic
  • secure cookies or approved token storage model
  • CSRF protection where cookie auth is used
  • CSP for Admin UI
  • no secrets in HTML or JavaScript bundles
  • no internal topology embedded in static assets
  • no arbitrary backend-provided JavaScript
  • strict server-side authorization
  • risk-based admin access
  • MFA/2FA and step-up for high-risk actions
  • audit every mutation
  • short-lived UI manifests where sensitive
  • no-store cache headers for sensitive API responses

High-risk actions include:

  • node approval
  • role assignment
  • cluster trust changes
  • cross-cluster trust changes
  • partition promotion
  • secrets access
  • update policy changes
  • VPN credential/config resolver access
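A step-up gate for the high-risk list above might look like the following sketch. The action ids are hypothetical names for the listed categories, and the 300-second freshness window is an assumed value that does not come from this document.

```python
import time
from typing import Optional

# Hypothetical action ids for the high-risk categories listed above.
HIGH_RISK_ACTIONS = {
    "node.approve", "role.assign", "cluster.trust.change",
    "cross_cluster.trust.change", "partition.promote", "secrets.access",
    "update_policy.change", "vpn.credential_resolver.access",
}

STEP_UP_MAX_AGE_SECONDS = 300  # assumed freshness window, not from the source


def step_up_satisfied(
    action_id: str,
    last_step_up_at: Optional[float],
    now: Optional[float] = None,
) -> bool:
    """Return True if the action may proceed with the viewer's current step-up state."""
    if action_id not in HIGH_RISK_ACTIONS:
        return True  # ordinary actions need no step-up in this sketch
    if last_step_up_at is None:
        return False  # high-risk action without any step-up on record
    current = time.time() if now is None else now
    # Reject stale step-ups and timestamps that lie in the future.
    return 0 <= current - last_step_up_at <= STEP_UP_MAX_AGE_SECONDS
```

A check like this belongs in the Control Plane, after ordinary permission checks and before the mutation, so that a crafted browser request cannot skip it.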

Deployment Model

Current Test Entry

The current shared Docker test stand exposes the Platform Owner Control Panel at http://docker-test.cin.su:18080/ (http://192.168.200.61:18080/). This is a temporary lab HTTP edge served by rap_web_admin from /tmp/rap-web-admin/html on test-docker.

This entry is not the production authority model. It is allowed only for the shared test stand while the HTTPS admin-ingress runtime is being completed. The target production entry is:

Browser HTTPS on 443
  -> node with explicit admin-ingress/public-ingress role
  -> signed web-ingress envelope
  -> QUIC fabric service channel
  -> authorized admin/portal runtime node
  -> Control API projection/authorization

The browser-facing ingress may be a small public node, but it must not become the management authority. Platform/global admin runtime remains limited to platform-owner trusted nodes. Cluster, organization, and user panels receive only their scoped projections.

The legacy Fabric map with separate inputs, cluster nodes, and egress zones is retired for the transport-layer view. The Fabric panel must show actual direct/fresh QUIC neighbor links, one-way/passive direction, stale/problem state, relay/route-health annotations, and web-ingress runtime readiness. It must not render old entry/egress zone columns as if they were transport topology.

Possible deployment modes:

  • Public/Admin HTTPS Ingress and Control API in the same deployment for small/test installs
  • Web Ingress separated from Control API for production
  • multiple Web Ingress nodes for regional/admin access
  • Web Ingress behind Caddy/Nginx/enterprise ingress
  • Admin UI shell served from Web Ingress while APIs remain on Control API
  • Internet ingress on a low-capacity node that forwards scoped channels to a trusted admin runtime elsewhere in the fabric
  • global admin runtime only on platform-owner controlled nodes
  • cluster admin runtime on cluster-authorized nodes
  • organization/user portal runtime on tenant-safe nodes with scoped data

Even when deployed together, ownership remains separate:

  • Public/Admin HTTPS Ingress is entry/presentation
  • Fabric Transport is QUIC-only service-channel delivery
  • Control API is authorization/domain logic
  • PostgreSQL is source of truth
  • Fabric Storage/Config Storage is scoped distribution/cache
  • node-agent consumes scoped desired state

Required Roles

The platform recognizes these web/admin placement roles:

| Role | Scope | Purpose |
| --- | --- | --- |
| public-ingress | cluster or organization | Listen on 80/443, terminate/validate HTTPS, forward scoped service channels. |
| admin-ingress | platform or cluster | HTTPS edge for admin surfaces. It does not own authority. |
| global-admin-runtime | platform trusted nodes only | Platform-owner console/runtime. |
| cluster-admin-runtime | cluster | Cluster admin console/runtime for one cluster. |
| organization-portal-runtime | organization | Tenant-safe organization administration. |
| user-portal-runtime | user/organization | Personal account/resource portal. |
| identity-runtime | platform/cluster | Authentication, session, MFA, step-up and token issuance. |
| policy-authority | platform trusted nodes only | Authorization/policy decisions and signed claims. |
| audit-sink | platform trusted nodes only | Durable mutation/security audit ingestion. |

Legacy entry-node remains a generic client ingress/service edge role for non-admin product services. It must not imply admin authority.

Fabric Service Classes

Admin and portal traffic uses explicit fabric service classes. This prevents admin traffic from being disguised as VPN/RDP/file/video traffic and gives the routing layer clear QoS, role, and audit semantics.

| Service class | Required runtime roles | Projection |
| --- | --- | --- |
| platform_admin | admin-ingress, global-admin-runtime, identity-runtime, policy-authority, audit-sink | Platform-owner console. |
| cluster_admin | admin-ingress, cluster-admin-runtime, identity-runtime, policy-authority, audit-sink | One cluster. |
| organization_portal | public-ingress, organization-portal-runtime, identity-runtime, policy-authority, audit-sink | One organization. |
| user_portal | public-ingress, user-portal-runtime, identity-runtime, policy-authority, audit-sink | One authenticated user/account scope. |

Default channels for these classes are control, interactive, and reliable. These classes carry latency-sensitive control-plane/service traffic, not bulk data transfer.
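The required-role sets for each service class can be checked mechanically before a route is offered. The sketch below restates the role sets from this section as data; the `missing_roles` helper and its signature are assumptions for illustration.

```python
from typing import Dict, FrozenSet, Set

# Roles required by every admin/portal service class.
COMMON_ROLES = frozenset({"identity-runtime", "policy-authority", "audit-sink"})

REQUIRED_ROLES: Dict[str, FrozenSet[str]] = {
    "platform_admin": COMMON_ROLES | {"admin-ingress", "global-admin-runtime"},
    "cluster_admin": COMMON_ROLES | {"admin-ingress", "cluster-admin-runtime"},
    "organization_portal": COMMON_ROLES | {"public-ingress", "organization-portal-runtime"},
    "user_portal": COMMON_ROLES | {"public-ingress", "user-portal-runtime"},
}


def missing_roles(service_class: str, available_roles: Set[str]) -> Set[str]:
    """Return the roles still needed before the service class has full coverage."""
    if service_class not in REQUIRED_ROLES:
        raise ValueError(f"unknown service class: {service_class}")
    return set(REQUIRED_ROLES[service_class] - available_roles)
```

An empty result means role coverage is sufficient for that class. It says nothing about authorization, which remains a Control Plane decision.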

Desired Workload Contract

Ingress nodes are configured through normal node desired workloads. The first runtime stage is a contract probe: node-agent validates the policy and reports a workload status, but it does not open 80/443 until the real ingress runtime stage is enabled.

Example platform/cluster admin ingress workload:

{
  "service_type": "admin-ingress",
  "desired_state": "enabled",
  "runtime_mode": "native",
  "config": {
    "listen_http_port": 80,
    "listen_https_port": 443,
    "tls_mode": "terminate",
    "scope": "platform",
    "service_classes": ["platform_admin", "cluster_admin"]
  }
}

Example organization/user public ingress workload:

{
  "service_type": "public-ingress",
  "desired_state": "enabled",
  "runtime_mode": "native",
  "config": {
    "listen_http_port": 80,
    "listen_https_port": 443,
    "tls_mode": "terminate",
    "scope": "organization",
    "service_classes": ["organization_portal", "user_portal"]
  }
}
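A contract probe over workloads like the two examples above could be sketched as follows. It encodes the fixed status fields and the degraded and gate-disabled conditions stated in this section. The function name, the `status` value `ok`, the `reason` field, and the reason strings `invalid_service_class` and `invalid_port` are assumptions. The process gate is passed in as a parameter here rather than read from `RAP_WEB_INGRESS_RUNTIME_ENABLED`.

```python
from typing import Any, Dict

VALID_SERVICE_CLASSES = {
    "platform_admin", "cluster_admin", "organization_portal", "user_portal",
}


def contract_probe(workload: Dict[str, Any], process_gate_enabled: bool = False) -> Dict[str, Any]:
    """Validate an ingress workload without opening any port."""
    config = workload.get("config", {})
    report: Dict[str, Any] = {
        "fabric_transport": "quic_only",
        "http_between_fabric_nodes": False,
        "authority_service": False,
        "fabric_service_channel_required": True,
        "ports_opened_by_stub": False,
        "status": "ok",   # assumed value for a valid contract
        "reason": "",     # assumed field name
    }
    classes = set(config.get("service_classes", []))
    if not classes <= VALID_SERVICE_CLASSES:
        report.update(status="degraded", reason="invalid_service_class")
    elif config.get("listen_http_port") != 80 or config.get("listen_https_port") != 443:
        report.update(status="degraded", reason="invalid_port")
    elif config.get("real_listener_enabled") and not process_gate_enabled:
        # Workload asks for a real listener but the node-agent process gate is off.
        report["reason"] = "web_ingress_real_listener_gate_disabled"
    return report
```

The probe never opens a socket, so `ports_opened_by_stub` stays false on every path.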

Contract-probe status requirements:

  • fabric_transport is quic_only
  • http_between_fabric_nodes is false
  • authority_service is false
  • fabric_service_channel_required is true
  • ports_opened_by_stub is false
  • invalid service classes or non-80/443 ports report degraded
  • real listener startup requires both workload config real_listener_enabled=true and node-agent process gate RAP_WEB_INGRESS_RUNTIME_ENABLED=true
  • without the process gate, a real-listener request reports web_ingress_real_listener_gate_disabled
  • the first handler stage returns schema rap.web_ingress.runtime_response.v1; it redirects HTTP to HTTPS, exposes health, validates service class/scope, and blocks payload forwarding with fabric_service_channel_binding_not_implemented until the QUIC service channel binding is implemented
  • node-agent owns a web-ingress listener lifecycle manager. When the real listener gate is enabled, it starts the HTTP redirect listener and starts HTTPS only when tls_cert_file and tls_key_file are present in workload config. Without TLS files the listener status is partial and service payload remains blocked.
  • The HTTPS handler has a FabricBinder boundary. Valid requests become rap.web_ingress.fabric_request.v1 records with method, path, query, host, derived scope, service class, safe headers, bounded body, and observed timestamp. The runtime derives fabric scope from service class (platform_admin -> platform, cluster_admin -> cluster, organization_portal -> organization, user_portal -> user) before signing/forwarding the request. Dangerous browser headers such as Authorization, Cookie, Set-Cookie, and service-channel tokens are not forwarded as ordinary proxy headers. The binder must convert the request into a signed/scoped fabric service channel envelope; if no binder is present, ingress returns fabric_service_channel_binding_not_implemented.
  • The first concrete binder emits rap.web_ingress.fabric_service_channel_envelope.v1. The envelope contains the safe request projection, base64-encoded body, scope, service class, observed timestamp, and envelope timestamp. It is serialized as canonical JSON for signing, then passed to an EnvelopeSigner and EnvelopeSender. EnvelopeSigner owns node/service-channel signature policy. EnvelopeSender owns delivery into the QUIC fabric service channel and route selection. This keeps HTTP edge handling separated from mesh internals while making the security boundary explicit and testable.
  • The initial signer implementation is Ed25519 over the canonical envelope bytes. The signer can derive key_id from the public key fingerprint or use an explicitly configured key id. Production deployment must bind this key to the node identity/service-channel authority policy before enabling real browser traffic.
  • The initial mesh sender adapter can submit the signed envelope through the existing reliable fabric channel runtime using control traffic class and a configured route set to an admin/portal runtime node or pool. At this stage it returns a delivery-accepted response with route/channel metrics. Full request/response admin API streaming remains a later runtime step and must stay on the same QUIC fabric channel model.
  • The fabric channel runtime now also has a request/response path for web ingress: it opens a QUIC stream, sends the signed envelope as FrameData, and waits for a FrameData response on the same stream and sequence. Route failures or response timeouts use the same latency-aware reroute path as reliable delivery. Runtime HTTP responses use rap.web_ingress.fabric_runtime_response.v1 with status code, safe headers, and body/body_b64. If a runtime response is not in that schema, ingress reports delivery-accepted metrics instead of treating arbitrary payload as an HTTP response.
  • The QUIC fabric server reserves WebIngressForwardQUICStreamID for web ingress request/response forwarding. The server invokes a web-ingress forward handler with the signed envelope payload and returns a wrapper containing either the runtime payload or an error on the same stream/sequence.
  • Admin/portal runtime nodes have a signed-envelope receiver contract. The receiver verifies rap.web_ingress.signed_fabric_service_channel_envelope.v1, Ed25519 signature, trusted key id, scope, service class, and timestamp skew before calling the local runtime handler. The local handler returns rap.web_ingress.fabric_runtime_response.v1; unsafe response headers are filtered before the payload is returned to the ingress edge.
  • Node-agent exposes explicit runtime key policy inputs while the final signed config-snapshot distribution is being wired: RAP_WEB_INGRESS_SIGNING_PRIVATE_KEY, RAP_WEB_INGRESS_SIGNING_KEY_ID, and RAP_WEB_INGRESS_TRUSTED_KEYS_JSON. Trusted keys JSON may be either {"key_id":"public_key_b64"} or an array of {"key_id":"...","public_key":"..."} objects. Without trusted keys the web-ingress receiver handler is not installed. Runtime receiver placement can be narrowed with RAP_WEB_INGRESS_RUNTIME_SERVICE_CLASSES, a comma-separated allow-list of platform_admin, cluster_admin, organization_portal, and user_portal; this is a temporary explicit node-local policy until signed role snapshots drive receiver placement.
  • Heartbeat metadata includes web_ingress_runtime_receiver_report when QUIC fabric or web-ingress key policy is configured. The report exposes the signed-envelope schema, QUIC stream id, trusted key count, receiver service-class allow-list, handler installation state, status/reason (ready, degraded, or blocked), and QUIC endpoint readiness so the fabric panel can show whether a node can currently receive admin/portal runtime traffic and, if not, why.
  • QUIC listener/reverse-transport handler configuration is sensitive to the web-ingress trusted key policy and runtime service-class allow-list. If either policy changes, node-agent restarts or refreshes the QUIC fabric handler binding so stale key trust or stale receiver placement is not kept in memory.
  • The first local admin runtime dispatcher is intentionally read-only. It handles /healthz, /readyz, and */ui-manifest requests after signed envelope verification. It returns rap.web_ingress.admin_runtime_response.v1 with a safe rap.web_ingress.ui_manifest.v1 projection that lists sections and read-only actions for the requested service class. It rejects invalid scope/service_class pairs before using either the local fallback or the Control API projection client. Mutations return control_api_mutation_binding_not_implemented; unknown read projections return control_api_projection_binding_not_implemented until the dispatcher is wired to the real Control API authorization/projection layer.
  • The dispatcher now has a ControlAPIProjectionClient boundary. When bound, read-only GET/HEAD requests are sent to the Control API projection endpoint and returned as rap.web_ingress.control_api_projection_response.v1. The backend exposes the first read-only projection endpoint at /api/v1/clusters/{cluster_id}/nodes/{node_id}/admin-runtime/projection. It returns safe manifest/projection payloads, marks audit as required, and rejects mutation methods and invalid scope/service_class combinations. Requests must use schema rap.web_ingress.control_api_projection_request.v1; the agent accepts responses only with schema rap.web_ingress.control_api_projection_response.v1. This is the first Control API binding slice; it is not yet a full authorization/session/audit implementation.
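The binder step described in the list above (scope derivation, header filtering, and canonical serialization of the envelope before signing) can be sketched as follows. This is not the node-agent implementation: the envelope field names, the header name `x-service-channel-token`, and the canonical form (sorted keys, compact separators, UTF-8) are assumptions. Ed25519 signing is left out because it is not available in the Python standard library; the returned bytes are what an EnvelopeSigner would sign.

```python
import base64
import json
import time
from typing import Dict, Optional

# Scope derivation as stated in the text.
SCOPE_BY_SERVICE_CLASS = {
    "platform_admin": "platform",
    "cluster_admin": "cluster",
    "organization_portal": "organization",
    "user_portal": "user",
}

# Headers that must not be forwarded as ordinary proxy headers.
# "x-service-channel-token" is a hypothetical header name.
BLOCKED_HEADERS = {"authorization", "cookie", "set-cookie", "x-service-channel-token"}


def build_envelope(
    method: str,
    path: str,
    query: str,
    host: str,
    headers: Dict[str, str],
    body: bytes,
    service_class: str,
    observed_at: float,
    now: Optional[float] = None,
) -> bytes:
    """Return canonical envelope bytes ready to be handed to a signer."""
    scope = SCOPE_BY_SERVICE_CLASS[service_class]  # KeyError on an unknown class
    safe_headers = {
        name.lower(): value
        for name, value in headers.items()
        if name.lower() not in BLOCKED_HEADERS
    }
    envelope = {
        "schema": "rap.web_ingress.fabric_service_channel_envelope.v1",
        "request": {
            "method": method, "path": path, "query": query,
            "host": host, "headers": safe_headers,
        },
        "body_b64": base64.b64encode(body).decode("ascii"),
        "scope": scope,
        "service_class": service_class,
        "observed_at": observed_at,
        "envelope_at": time.time() if now is None else now,
    }
    # Assumed canonical form: sorted keys, no insignificant whitespace, UTF-8.
    return json.dumps(envelope, sort_keys=True, separators=(",", ":")).encode("utf-8")
```

The sketch filters with a block-list for brevity. Since the text speaks of "safe headers", an allow-list of known-safe header names may be closer to the intended behavior.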

Future Stages

Suggested staged work:

WEB-1: Document Web Ingress and Admin UI ownership model.

WEB-2: Define ui_manifest schema and approved component registry.

WEB-3: Add platform-admin Admin Web Shell that consumes scoped manifests. Initial Platform Owner Control Panel is implemented and build-verified in web-admin. Report: artifacts/web-admin-platform-owner-control-panel-report.md.

WEB-4: Add cluster admin pages using Control Plane projections.

WEB-5: Add organization admin pages using tenant-safe projections.

WEB-6: Add high-risk action step-up and device-trust UI flows.

WEB-7: Add service-adapter UI extension registry.

WEB-8: Add signed/versioned UI manifest distribution if needed for offline or edge-served admin shells.

Non-Goals

This document does not authorize:

  • implementation of new UI pages
  • changing existing Windows client behavior
  • changing RDP runtime
  • changing mesh runtime
  • changing VPN runtime
  • node-agent service execution changes
  • storing cluster configuration inside Web Service
  • exposing internal topology to organizations

Result / Decision

The Web layer is an ingress and presentation layer, not a cluster configuration owner. Fabric remains QUIC-only internally; HTTP/HTTPS exists only at the external client edge. Cluster configuration belongs to the Control Plane and is persisted in PostgreSQL. Dynamic admin pages are allowed only as safe, scoped, schema-driven projections over Control Plane APIs. They must not embed secrets, internal topology, peer caches, route caches, or arbitrary executable code.

Admin endpoint placement is explicit. A Fabric Storage / Config Storage node does not automatically become a cluster panel. Platform Owner Console remains global platform-owner scope; Cluster Admin Endpoint is a separate cluster-local admin/web ingress role; Organization Admin Panel remains a tenant-safe projection.