Backend Foundation
Production-oriented Go backend skeleton for the remote access platform.
Scope included
- configuration loading from environment
- HTTP server bootstrap with graceful shutdown
- PostgreSQL and Redis connectivity wiring
- migrations scaffold
- auth foundation with access/refresh tokens, hashed refresh rotation, trusted devices, and persisted auth sessions
- persistent session storage foundation for remote sessions, attachments, resource policies, and audit events
- session broker orchestration for start, attach, detach, takeover, terminate, failure, and detached-session recovery
- Redis-backed live session state, controller binding, attach tokens, heartbeat keys, worker routing, and reconnect support
- Redis-backed worker registration, lease lifecycle, heartbeat tracking, stale lease recovery, and routing queues
- worker assignment queueing and worker event ingestion for the minimal real RDP worker runtime
- websocket live plane with attach handshake, ping/pong heartbeat, state messages, takeover detection, and transport reconnect flow
- module boundaries for auth, resources, session broker, and websocket gateway
- worker registry scaffold to prepare later RDP worker integration
- per-resource certificate verification policy for RDP connections with strict default and explicit ignore override
- platform-core v2 foundations for organizations, memberships, identity sources, nodes, and node-agent control plane
- Data Plane v1 contract scaffolding for optional session response candidates/tokens, with current backend gateway behavior preserved as fallback
- production resource secret-readiness guard for rejecting plaintext credential-like metadata and requiring secret_ref for RDP/VNC/SSH resources in production mode
- encrypted resource secret storage/resolver MVP for production secret_ref usage
Entry point
Run the API from cmd/api.
Local dev
- backend: pwsh -File scripts/smoke/run-backend.ps1
- infra: pwsh -File scripts/smoke/start-infra.ps1
- migrations: pwsh -File scripts/smoke/apply-migrations.ps1
- worker image build: docker build --tag rap-rdp-worker:dev --file workers/rdp-worker/Dockerfile workers/rdp-worker
- end-to-end smoke path: scripts/smoke/README.md
Configuration
Use configs/api.example.env as the starting point for local environment variables.
Resource secret-readiness is controlled by APP_ENV:
- in APP_ENV=production or APP_ENV=prod, RDP/VNC/SSH resources must carry secret_ref and must not include plaintext credential-like fields in metadata
- in development and smoke environments, plaintext metadata remains allowed until the encrypted secret resolver is implemented
- the production guard is enforced both on resource create/update and on session start, so compat plaintext resources cannot be started in production accidentally
- SECRET_ENCRYPTION_KEY_B64 or SECRET_ENCRYPTION_KEY_FILE supplies the AES-256-GCM master key for the MVP encrypted store; production mode refuses to start without one
- SECRET_ENCRYPTION_KEY_ID labels the active key version in stored records
- PUT /api/v1/resources/{resourceID}/secret creates or rotates a resource secret and updates resources.secret_ref; plaintext is never returned by the API
- session assignment keeps PostgreSQL metadata safe: remote_sessions.metadata stores secret_ref, while resolved credentials are merged only into the transient worker assignment after session/worker/lease checks
See docs/architecture/SECURITY_SECRETS_READINESS.md for the target
secret-reference model and remaining resolver/PKI gaps.
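The production guard described above can be sketched roughly as follows. This is a minimal illustration, not the backend's actual code: the function name checkSecretReadiness, the marker list in credentialLikeMarkers, and the secret://res-1 reference format are all hypothetical; only the APP_ENV values, the RDP/VNC/SSH scope, and the secret_ref requirement come from this README.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// credentialLikeMarkers is an illustrative list; the real guard's key set
// is not documented in this README.
var credentialLikeMarkers = []string{"password", "passwd", "private_key"}

// isProduction mirrors the documented APP_ENV values.
func isProduction(appEnv string) bool {
	return appEnv == "production" || appEnv == "prod"
}

// checkSecretReadiness rejects RDP/VNC/SSH resources that lack secret_ref or
// carry plaintext credential-like metadata keys, but only in production.
func checkSecretReadiness(appEnv, protocol, secretRef string, metadata map[string]any) error {
	if !isProduction(appEnv) {
		return nil // dev and smoke environments still allow plaintext metadata
	}
	switch strings.ToLower(protocol) {
	case "rdp", "vnc", "ssh":
	default:
		return nil // other protocols are outside the guard's scope
	}
	if strings.TrimSpace(secretRef) == "" {
		return errors.New("secret_ref is required in production")
	}
	for key := range metadata {
		lower := strings.ToLower(key)
		for _, marker := range credentialLikeMarkers {
			if strings.Contains(lower, marker) {
				return fmt.Errorf("metadata key %q looks like a plaintext credential", key)
			}
		}
	}
	return nil
}

func main() {
	meta := map[string]any{"host": "10.0.0.5", "password": "hunter2"}
	fmt.Println(checkSecretReadiness("dev", "rdp", "", meta))
	fmt.Println(checkSecretReadiness("prod", "rdp", "", meta))
	fmt.Println(checkSecretReadiness("prod", "rdp", "secret://res-1", meta))
}
```

Calling the same check from both the resource create/update path and the session start path is what keeps a compat plaintext resource from being started in production.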
Data Plane v1 contract scaffolding is controlled by:
- DATA_PLANE_TOKEN_TTL, default 1m
- DATA_PLANE_TOKEN_PRIVATE_KEY_FILE, optional path to an RSA private key PEM used to sign RS256 data-plane tokens
- DATA_PLANE_TOKEN_PRIVATE_KEY_PEM, optional inline RSA private key PEM; used when file path is not configured
- DATA_PLANE_BACKEND_GATEWAY_URL, default /api/v1/gateway/ws
- DATA_PLANE_DIRECT_WORKER_WSS_URL_TEMPLATE, optional; supports {worker_id} replacement
- DATA_PLANE_DIRECT_WORKER_JSON_RUNTIME, default false; advertises runtime_transport=json_v1 only after the worker direct JSON bridge is deployed and verified
- DATA_PLANE_DIRECT_WORKER_BINARY_RENDER, default false; when the direct JSON runtime is enabled, advertises render_transport=binary_v1 so DP-2 clients can request binary render frames over direct worker WSS. Binary render candidates also advertise supported_color_modes=["full_color","grayscale"] and default_color_mode="full_color" for the DP-3A grayscale foundation.
- DATA_PLANE_DIRECT_WORKER_TLS_TRUST_MODE, default smoke_insecure; allowed values are smoke_insecure, public_ca, and platform_ca.
- DATA_PLANE_DIRECT_WORKER_TLS_CA_REF, optional label for the platform CA or trust bundle version advertised to clients.
Data-plane tokens are RS256-signed. Only the backend holds the private key;
workers receive only the matching public key for validation. If no private key
is configured, the backend omits the optional data_plane offer and the
backend gateway fallback remains unchanged.
If no direct worker WSS URL template is configured, session responses still include the backend gateway fallback candidate only.
If the URL template is configured but DATA_PLANE_DIRECT_WORKER_JSON_RUNTIME
is false, the direct candidate is still present for contract visibility but is
not marked data-capable; DP-1D Windows clients will skip it and use the backend
gateway fallback.
If DATA_PLANE_DIRECT_WORKER_BINARY_RENDER is false, direct worker WSS
remains JSON/base64 for render. If it is true, only direct worker WSS render
is binary; backend gateway fallback remains JSON/base64.
In production, the backend does not advertise direct worker WSS when
DATA_PLANE_DIRECT_WORKER_TLS_TRUST_MODE=smoke_insecure; it keeps the backend
gateway fallback instead. Trusted direct candidates include tls_trust_mode,
production_trusted, smoke_only, and optional tls_ca_ref metadata. See
docs/architecture/DIRECT_WORKER_TLS_PKI.md.
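The gating rules in this section can be condensed into one decision function. The sketch below is an assumption-laden illustration: the Config and Candidate types, the Kind strings, and the candidate ordering are hypothetical, while the rules themselves (no offer without a signing key, gateway fallback always present, direct candidate data-capable only with the JSON runtime, binary render only on the direct path, no smoke_insecure direct candidate in production) follow the text above.

```go
package main

import (
	"fmt"
	"strings"
)

// Config mirrors the DATA_PLANE_* variables; the struct itself is hypothetical.
type Config struct {
	Production     bool
	HasSigningKey  bool
	GatewayURL     string
	DirectTemplate string
	DirectJSON     bool
	DirectBinary   bool
	TLSTrustMode   string
}

// Candidate is a simplified stand-in for a data_plane candidate entry.
type Candidate struct {
	Kind         string
	URL          string
	DataCapable  bool
	BinaryRender bool
}

// buildCandidates returns nil when no offer should be made at all.
func buildCandidates(cfg Config, workerID string) []Candidate {
	if !cfg.HasSigningKey {
		return nil // no private key: omit the optional data_plane offer
	}
	var out []Candidate
	directAllowed := cfg.DirectTemplate != "" &&
		!(cfg.Production && cfg.TLSTrustMode == "smoke_insecure")
	if directAllowed {
		out = append(out, Candidate{
			Kind:        "direct_worker_wss",
			URL:         strings.ReplaceAll(cfg.DirectTemplate, "{worker_id}", workerID),
			DataCapable: cfg.DirectJSON,
			// Binary render is advertised only when the JSON runtime is on.
			BinaryRender: cfg.DirectJSON && cfg.DirectBinary,
		})
	}
	// The backend gateway fallback stays JSON/base64 and is always offered.
	out = append(out, Candidate{Kind: "backend_gateway", URL: cfg.GatewayURL, DataCapable: true})
	return out
}

func main() {
	cfg := Config{
		HasSigningKey:  true,
		GatewayURL:     "/api/v1/gateway/ws",
		DirectTemplate: "wss://{worker_id}.workers.example.test/ws", // placeholder host
		DirectJSON:     true,
		TLSTrustMode:   "platform_ca",
	}
	for _, c := range buildCandidates(cfg, "w-1") {
		fmt.Printf("%+v\n", c)
	}
}
```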
Module layout
- internal/platform: shared runtime, config, infra, and bootstrap concerns
- internal/modules/auth: auth and trusted-device boundary
- internal/modules/organization: organization model, org roles, and memberships
- internal/modules/identitysource: local/LDAP/OIDC identity source model and future mapping foundations
- internal/modules/resource: remote resource inventory boundary
- internal/modules/sessionbroker: persistent session lifecycle, orchestration, audit, and Redis live-state boundary
- internal/modules/sessiongateway: websocket attach/reconnect/takeover transport boundary
- internal/modules/worker: worker registration, lease coordination, and control-plane routing boundary for future C++ RDP workers
- internal/modules/node: node inventory, capabilities, enabled services, update policy, and partition state
- internal/modules/nodeagent: node-agent registration, health, service status, and update/rollback control interface
- pkg/contracts: cross-module contracts for sessions and worker control
Backend responsibilities
- PostgreSQL remains the source of truth for auth sessions, devices, remote sessions, attachments, resource policies, and audit events
- Redis is used only for live routing and coordination: attach tokens, controller bindings, live session cache, worker registration, worker leases, heartbeats, and routing queues
- worker:control:<worker_id> carries worker assignments, worker:queue:<session_id> carries live control/input envelopes, and worker:events carries worker-reported lifecycle events back into broker processing
- Session broker owns state transitions and orchestration rules; websocket handlers call broker services instead of talking to PostgreSQL repositories directly
- Worker runtime stays behind interfaces and Redis coordination so the backend remains isolated from FreeRDP implementation details while the minimal real RDP worker plugs into the control plane
- RDP certificate verification is configured per resource through certificate_verification_mode
- resources are now org-scoped in PostgreSQL and remote sessions persist their owning organization without changing the proven worker/session runtime contracts
- session start/attach/takeover responses may include optional data_plane candidates and a short-lived signed data-plane token for DP-1 direct worker WSS migration; existing clients continue to use the current gateway path, and direct realtime use remains gated by explicit candidate metadata
Authorization model
- platform_admin and platform_recovery_admin have global access across organizations, resources, and sessions
- in INSTALLATION_AUTHORITY_MODE=strict, platform-admin power is effective only when the user also has a valid signed row in platform_role_grants; changing users.platform_role in PostgreSQL alone no longer grants owner access
- first-owner bootstrap is available at POST /api/v1/installation/bootstrap-owner and requires a Product Root Ed25519 signature over an activation manifest in strict mode
- production (APP_ENV=production or prod) requires strict installation authority plus INSTALLATION_PRODUCT_ROOT_PUBLIC_KEY_B64 or INSTALLATION_PRODUCT_ROOT_PUBLIC_KEY_FILE
- compat/dev installs can keep database-role behavior, and insecure first-owner bootstrap is available only when INSTALLATION_INSECURE_BOOTSTRAP_ENABLED=true
- org_owner and org_admin can create and update resources inside their organization and can manage any remote session inside that organization
- active non-admin memberships such as org_operator, org_member, and org_viewer are deny-by-default for admin actions; they can only access org-scoped reads and operate on their own session flows where the session broker explicitly allows it
- session start always authorizes the actor against the resource organization before worker reservation
- attach, detach, takeover, and terminate authorize against the owning remote session organization before any state transition is written
- worker-facing events do not bypass this model for user-originated commands; internal worker failure and heartbeat paths remain broker-internal control-plane operations
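As a rough illustration of how these rules compose for session management, the sketch below evaluates them in order. The Actor and Session types and both function names are hypothetical. Treating platform_recovery_admin the same as platform_admin under strict mode is an assumption, and the real broker may restrict non-admin roles further than shown.

```go
package main

import "fmt"

// Actor is a hypothetical view of the authenticated user.
type Actor struct {
	UserID         string
	PlatformRole   string            // "platform_admin", "platform_recovery_admin", or ""
	HasSignedGrant bool              // valid signed row in platform_role_grants
	OrgRoles       map[string]string // organization ID -> active org role
}

// Session carries only the fields the check needs.
type Session struct {
	OrganizationID string
	OwnerUserID    string
}

// isPlatformAdmin applies the strict-mode rule: the database role alone is
// not enough when a signed grant is required.
func isPlatformAdmin(a Actor, strictAuthority bool) bool {
	if a.PlatformRole != "platform_admin" && a.PlatformRole != "platform_recovery_admin" {
		return false
	}
	return !strictAuthority || a.HasSignedGrant
}

// canManageSession authorizes against the session's owning organization.
func canManageSession(a Actor, s Session, strictAuthority bool) bool {
	if isPlatformAdmin(a, strictAuthority) {
		return true
	}
	switch a.OrgRoles[s.OrganizationID] {
	case "org_owner", "org_admin":
		return true // any session inside their organization
	case "org_operator", "org_member", "org_viewer":
		return s.OwnerUserID == a.UserID // own session flows only
	default:
		return false // no active membership: deny by default
	}
}

func main() {
	member := Actor{UserID: "u-1", OrgRoles: map[string]string{"org-a": "org_member"}}
	fmt.Println(canManageSession(member, Session{OrganizationID: "org-a", OwnerUserID: "u-1"}, true))
	fmt.Println(canManageSession(member, Session{OrganizationID: "org-a", OwnerUserID: "u-2"}, true))
}
```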
Migration safety
- 000005_platform_core_v2 bootstraps a single default organization and backfills existing resources.organization_id and remote_sessions.organization_id into that organization before setting NOT NULL
- 000006_default_org_memberships_backfill safely restores access continuity by inserting missing active memberships for existing users into the default organization
- the backfill is idempotent because it only inserts rows missing under the (organization_id, user_id) uniqueness constraint
- platform administrators are backfilled as org_owner in the default organization, while other existing users are backfilled as org_member
- if 000005 fails before the NOT NULL step, PostgreSQL rolls back the transaction and leaves pre-v2 rows untouched; if 000006 is rerun, it skips already-created memberships rather than duplicating them
Platform-Core V2 Notes
- organizations, organization_memberships, and organization_roles establish multi-tenant ownership and basic org-scoped authorization boundaries
- identity_sources and identity_mappings are foundation-only in this phase; full LDAP/OIDC sync and claim/group ingestion are intentionally deferred
- nodes, node_capabilities, node_services, node_update_policies, node_partition_states, and node_agent_update_runs provide the first control-plane model for node and node-agent lifecycle
- current proven RDP session lifecycle remains preserved: the session broker still orchestrates the same worker/session behavior, but it now records organization ownership via org-scoped resources
- PostgreSQL remains the source of truth for organizations, memberships, org-scoped resources, identity sources, nodes, node-agent state, and session lifecycle state
Resource Certificate Verification
- strict is the default and keeps normal certificate validation enabled in the worker runtime
- ignore must be explicitly stored on the resource and allows that one RDP connection to skip certificate validation
- the backend passes this policy through session assignment data; it is not a global backend toggle
Messaging Model
- HTTP errors now use a structured envelope:
  - error.code
  - error.message_key
  - error.fallback_message
  - error.details
  - error.trace_id
- internal/platform/httpx owns error normalization and trace-id generation so handlers can keep calling WriteError(...) without changing business logic.
- For 5xx responses, user-facing payloads are normalized to an English generic fallback message while logs and diagnostics can still keep raw internal details elsewhere.
- For 4xx responses, stable code and message_key are derived from the current fallback message, so clients can localize without depending on raw English text as the primary contract.
WebSocket Messaging
- Session gateway envelopes keep the existing type and payload contract.
- User-facing websocket events now also include event with:
  - code
  - message_key
  - fallback_message
  - details
  - trace_id
- session.taken_over, terminal session.state, transport.closed, and protocol-level errors now carry this structured event object.
- Existing payload semantics remain intact for compatibility with the already proven session lifecycle.
Message Rules
- Keep English as the only development language for fallback_message, logs, and diagnostics.
- New HTTP handlers should prefer httpx.WriteError(...) for user-facing failures instead of hand-building "error": "..." JSON.
- New websocket user-facing notifications should populate TransportEnvelope.Event with a stable code and message_key.
- Do not use raw human-readable English text as the primary client contract; it should only remain as fallback text.
- This messaging layer is now runtime-proven against the live Windows smoke flow for invalid-login errors, websocket takeover delivery, websocket state fallback rendering, and worker-death failure handling.
Clipboard Policy
RDP text clipboard is controlled per resource through resource_policies.clipboard_mode.
Allowed values are disabled, client_to_server, server_to_client, and
bidirectional; the default is disabled. The compat clipboard_enabled
column is retained only for compatibility and migration/backfill, while new
runtime decisions use clipboard_mode.
Clipboard enforcement happens in the real data path:
- sessionbroker.ResourcePolicy.ClipboardMode is loaded from PostgreSQL and embedded into the session assignment metadata sent to the worker.
- sessiongateway.Module.handleEnvelope blocks client-to-server clipboard envelopes unless the session is active and the policy allows that direction.
- worker.EventProcessor sends worker-originated clipboard text through sessionbroker.Service.UpdateWorkerClipboardText, which applies the same active-state and server-to-client policy checks before updating live state.
- Clipboard messages carry sequence_id, origin, and content_hash so clients and workers can avoid feedback loops across reattach/takeover paths.
- Redis stores clipboard text only as transient live state for routing to the active controller; PostgreSQL remains authoritative for policy/session state.
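The direction check and the loop-avoidance idea can be sketched as below. Everything named here (Direction, clipboardAllowed, contentHash, isEcho, ClipboardMessage) is hypothetical, and using SHA-256 for content_hash is an assumption; the README does not state which hash the real implementation uses.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Direction identifies which way clipboard text is flowing.
type Direction int

const (
	ClientToServer Direction = iota
	ServerToClient
)

// ClipboardMessage carries the documented loop-avoidance fields.
type ClipboardMessage struct {
	SequenceID  uint64
	Origin      string
	ContentHash string
	Text        string
}

// clipboardAllowed applies the active-state and direction checks that both
// the gateway and the broker are described as enforcing.
func clipboardAllowed(mode string, dir Direction, sessionState string) bool {
	if sessionState != "active" {
		return false
	}
	switch mode {
	case "bidirectional":
		return true
	case "client_to_server":
		return dir == ClientToServer
	case "server_to_client":
		return dir == ServerToClient
	default:
		return false // "disabled" and any unknown value
	}
}

// contentHash returns a hex digest of the clipboard text (SHA-256 assumed).
func contentHash(text string) string {
	sum := sha256.Sum256([]byte(text))
	return hex.EncodeToString(sum[:])
}

// isEcho reports whether msg repeats the content this side applied last,
// which is how a receiver can break a client/worker feedback loop.
func isEcho(msg ClipboardMessage, lastAppliedHash string) bool {
	return msg.ContentHash == lastAppliedHash
}

func main() {
	msg := ClipboardMessage{SequenceID: 1, Origin: "client", Text: "hello", ContentHash: contentHash("hello")}
	fmt.Println(clipboardAllowed("client_to_server", ClientToServer, "active"))
	fmt.Println(clipboardAllowed("client_to_server", ServerToClient, "active"))
	fmt.Println(isEcho(msg, contentHash("hello")))
}
```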
File Upload Policy
Stage 5.1 introduces client-to-server file upload as a policy-gated RDP
feature. The authoritative policy field is
resource_policies.file_transfer_mode; allowed values are disabled,
client_to_server, server_to_client, and bidirectional, but only
client_to_server behavior is implemented in this stage. The default is
disabled. The compat file_transfer_enabled column is retained only as a
derived compatibility flag and must not be treated as the primary policy.
Enforcement is deliberately duplicated in the real data path:
- resource.Module exposes file_transfer_mode in resource create, update, list, and read payloads.
- sessionbroker.Service.StartRemoteSession embeds file_transfer_mode into assignment metadata and requests the worker file-transfer capability only when client-to-server upload is allowed.
- sessiongateway.Module.handleFileUploadStart and handleFileUploadChunk require an active session, current controller, allowed policy mode, valid UUID transfer_id, safe file name, 25 MiB max file size, and 256 KiB max chunk size before routing chunks to the worker.
- Redis is used only to route bounded upload envelopes to the worker. The file itself is written by the worker to controlled worker storage; PostgreSQL remains authoritative for policy and session state.
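The gateway-side preconditions listed above might look roughly like this. It is a sketch under assumptions: the function names, the exact safe-file-name rules, and the choice to let bidirectional permit upload are not confirmed by this README; the 25 MiB and 256 KiB limits and the UUID requirement are.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"strings"
)

const (
	maxFileSize  = 25 << 20  // 25 MiB
	maxChunkSize = 256 << 10 // 256 KiB
)

var uuidPattern = regexp.MustCompile(
	`^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$`)

// uploadAllowed assumes bidirectional also permits client-to-server upload.
func uploadAllowed(mode string) bool {
	return mode == "client_to_server" || mode == "bidirectional"
}

// safeFileName rejects empty names, path components, and traversal.
// These rules are illustrative, not the gateway's actual definition.
func safeFileName(name string) bool {
	if name == "" || name == "." || name == ".." || len(name) > 255 {
		return false
	}
	return !strings.ContainsAny(name, "/\\\x00")
}

// validateUploadStart runs every precondition before anything is routed.
func validateUploadStart(state string, isController bool, mode, transferID, fileName string, size int64) error {
	switch {
	case state != "active":
		return errors.New("session is not active")
	case !isController:
		return errors.New("caller is not the current controller")
	case !uploadAllowed(mode):
		return errors.New("file_transfer_mode does not allow upload")
	case !uuidPattern.MatchString(transferID):
		return errors.New("transfer_id is not a valid UUID")
	case !safeFileName(fileName):
		return errors.New("unsafe file name")
	case size < 0 || size > maxFileSize:
		return errors.New("file size out of bounds")
	}
	return nil
}

// validateChunk bounds each chunk so Redis only carries small envelopes.
func validateChunk(chunkLen int) error {
	if chunkLen <= 0 || chunkLen > maxChunkSize {
		return errors.New("chunk size out of bounds")
	}
	return nil
}

func main() {
	id := "123e4567-e89b-12d3-a456-426614174000"
	fmt.Println(validateUploadStart("active", true, "client_to_server", id, "report.pdf", 1<<20))
	fmt.Println(validateUploadStart("active", true, "client_to_server", id, "../etc/passwd", 1<<20))
	fmt.Println(validateChunk(512 << 10))
}
```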
File Download Policy
Stage 5.2 adds a runtime-proven server-to-client download path for RDP. The
policy field remains resource_policies.file_transfer_mode; server_to_client
and bidirectional allow download, while disabled and client_to_server
block it. The default remains disabled.
The v1 download model uses only the restricted RAP_Transfers\ToClient
drop-zone inside the existing per-session visible transfer directory. Backend
gateway accepts only file_download.start, file_download.ack, and
file_download.cancel from the current controller of an active session and
routes them to the worker after policy validation. Worker-origin
file_download.* events are stored only as transient live state for
backend-gateway fallback delivery; PostgreSQL remains authoritative for
session/resource/policy state and must not store file contents.
The direct worker WSS path is also lifecycle-gated: detach returns
file_download.blocked, old-controller takeover returns session.taken_over,
and worker failure closes the direct transport after PostgreSQL transitions the
session to failed.
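For the backend-gateway side of the download path, the acceptance rule can be sketched as an allowlist plus the same controller, state, and policy checks. The function and variable names are hypothetical, and the order of the checks is an assumption; the three accepted message types and the mode semantics come from the text above.

```go
package main

import (
	"errors"
	"fmt"
)

// clientDownloadMessages lists the only file_download.* types the gateway
// accepts from a client.
var clientDownloadMessages = map[string]bool{
	"file_download.start":  true,
	"file_download.ack":    true,
	"file_download.cancel": true,
}

// downloadAllowed maps file_transfer_mode to the download decision.
func downloadAllowed(mode string) bool {
	return mode == "server_to_client" || mode == "bidirectional"
}

// acceptDownloadMessage decides whether a client message may be routed to
// the worker.
func acceptDownloadMessage(msgType, sessionState string, isController bool, mode string) error {
	switch {
	case !clientDownloadMessages[msgType]:
		return fmt.Errorf("message type %q is not accepted from clients", msgType)
	case sessionState != "active":
		return errors.New("session is not active")
	case !isController:
		return errors.New("caller is not the current controller")
	case !downloadAllowed(mode):
		return errors.New("file_transfer_mode does not allow download")
	}
	return nil
}

func main() {
	fmt.Println(acceptDownloadMessage("file_download.start", "active", true, "server_to_client"))
	fmt.Println(acceptDownloadMessage("file_download.start", "active", true, "client_to_server"))
	fmt.Println(acceptDownloadMessage("file_download.chunk", "active", true, "bidirectional"))
}
```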