m/rdp-proxy

m 8ba0561f4f Initial project snapshot

2026-04-28 22:29:50 +03:00

18 KiB

Backend Foundation

Production-oriented Go backend skeleton for the remote access platform.

Scope included

configuration loading from environment
HTTP server bootstrap with graceful shutdown
PostgreSQL and Redis connectivity wiring
migrations scaffold
auth foundation with access/refresh tokens, hashed refresh rotation, trusted devices, and persisted auth sessions
persistent session storage foundation for remote sessions, attachments, resource policies, and audit events
session broker orchestration for start, attach, detach, takeover, terminate, failure, and detached-session recovery
Redis-backed live session state, controller binding, attach tokens, heartbeat keys, worker routing, and reconnect support
Redis-backed worker registration, lease lifecycle, heartbeat tracking, stale lease recovery, and routing queues
worker assignment queueing and worker event ingestion for the minimal real RDP worker runtime
websocket live plane with attach handshake, ping/pong heartbeat, state messages, takeover detection, and transport reconnect flow
module boundaries for auth, resources, session broker, and websocket gateway
worker registry scaffold to prepare later RDP worker integration
per-resource certificate verification policy for RDP connections with strict default and explicit ignore override
platform-core v2 foundations for organizations, memberships, identity sources, nodes, and node-agent control plane
Data Plane v1 contract scaffolding for optional session response candidates/tokens, with current backend gateway behavior preserved as fallback
production resource secret-readiness guard for rejecting plaintext credential-like metadata and requiring secret_ref for RDP/VNC/SSH resources in production mode
encrypted resource secret storage/resolver MVP for production secret_ref usage

Entry point

Run the API from cmd/api.

Local dev

backend: pwsh -File scripts/smoke/run-backend.ps1
infra: pwsh -File scripts/smoke/start-infra.ps1
migrations: pwsh -File scripts/smoke/apply-migrations.ps1
worker image build: docker build --tag rap-rdp-worker:dev --file workers/rdp-worker/Dockerfile workers/rdp-worker
end-to-end smoke path: scripts/smoke/README.md

Configuration

Use configs/api.example.env as the starting point for local environment variables.

Resource secret-readiness is controlled by APP_ENV:

in APP_ENV=production or APP_ENV=prod, RDP/VNC/SSH resources must carry secret_ref and must not include plaintext credential-like fields in metadata
in development and smoke environments, plaintext metadata remains allowed until the encrypted secret resolver is implemented
the production guard is enforced both on resource create/update and on session start, so legacy plaintext resources cannot be started in production accidentally
SECRET_ENCRYPTION_KEY_B64 or SECRET_ENCRYPTION_KEY_FILE supplies the AES-256-GCM master key for the MVP encrypted store; production mode refuses to start without one
SECRET_ENCRYPTION_KEY_ID labels the active key version in stored records
PUT /api/v1/resources/{resourceID}/secret creates or rotates a resource secret and updates resources.secret_ref; plaintext is never returned by the API
session assignment keeps PostgreSQL metadata safe: remote_sessions.metadata stores secret_ref, while resolved credentials are merged only into the transient worker assignment after session/worker/lease checks
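
The production guard described above can be sketched as a single check that runs on resource create/update and again on session start. This is a minimal illustration, not the repository's implementation: the function name, the argument shape, and the list of credential-like key markers are all assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// credentialLikeMarkers is a hypothetical list; the real guard may use different rules.
var credentialLikeMarkers = []string{"password", "passwd", "secret", "private_key", "token"}

// isProduction mirrors the two APP_ENV values named in this README.
func isProduction(appEnv string) bool {
	return appEnv == "production" || appEnv == "prod"
}

// requiresSecretRef reports whether the guard covers the given protocol.
func requiresSecretRef(protocol string) bool {
	switch strings.ToLower(protocol) {
	case "rdp", "vnc", "ssh":
		return true
	}
	return false
}

// CheckSecretReadiness returns an error when a resource would violate the
// production guard: missing secret_ref or plaintext credential-like metadata.
func CheckSecretReadiness(appEnv, protocol, secretRef string, metadata map[string]string) error {
	if !isProduction(appEnv) || !requiresSecretRef(protocol) {
		return nil // development and smoke environments are not guarded
	}
	if strings.TrimSpace(secretRef) == "" {
		return errors.New("secret_ref is required in production")
	}
	for key := range metadata {
		lower := strings.ToLower(key)
		for _, marker := range credentialLikeMarkers {
			if strings.Contains(lower, marker) {
				return fmt.Errorf("metadata key %q looks like a plaintext credential", key)
			}
		}
	}
	return nil
}

func main() {
	err := CheckSecretReadiness("production", "rdp", "", map[string]string{"password": "x"})
	fmt.Println(err)
}
```

Calling the same function from both the resource write path and the session start path is what keeps legacy plaintext resources from starting in production.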

See docs/architecture/SECURITY_SECRETS_READINESS.md for the target secret-reference model and remaining resolver/PKI gaps.
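
As a rough illustration of how an AES-256-GCM store keyed by SECRET_ENCRYPTION_KEY_B64 might seal and open a secret, the sketch below uses only the Go standard library. The SealedSecret record shape and the choice to bind the key ID as additional authenticated data are assumptions, not details confirmed by this README.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/base64"
	"errors"
	"fmt"
)

// SealedSecret is a hypothetical stored record shape.
type SealedSecret struct {
	KeyID      string // value of SECRET_ENCRYPTION_KEY_ID at seal time
	Nonce      []byte
	Ciphertext []byte
}

// loadMasterKey decodes a base64 master key and enforces the AES-256 key size.
func loadMasterKey(keyB64 string) ([]byte, error) {
	key, err := base64.StdEncoding.DecodeString(keyB64)
	if err != nil {
		return nil, fmt.Errorf("decode master key: %w", err)
	}
	if len(key) != 32 {
		return nil, errors.New("master key must be 32 bytes for AES-256")
	}
	return key, nil
}

func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// Seal encrypts plaintext with a fresh random nonce. The key ID is passed as
// additional data (an assumption) so a record cannot be relabeled silently.
func Seal(key []byte, keyID string, plaintext []byte) (SealedSecret, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return SealedSecret{}, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return SealedSecret{}, err
	}
	ct := gcm.Seal(nil, nonce, plaintext, []byte(keyID))
	return SealedSecret{KeyID: keyID, Nonce: nonce, Ciphertext: ct}, nil
}

// Open decrypts a record and fails if the ciphertext or key ID was altered.
func Open(key []byte, s SealedSecret) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	return gcm.Open(nil, s.Nonce, s.Ciphertext, []byte(s.KeyID))
}

func main() {
	raw := make([]byte, 32)
	if _, err := rand.Read(raw); err != nil {
		panic(err)
	}
	key, err := loadMasterKey(base64.StdEncoding.EncodeToString(raw))
	if err != nil {
		panic(err)
	}
	sealed, err := Seal(key, "key-v1", []byte("s3cret"))
	if err != nil {
		panic(err)
	}
	plain, err := Open(key, sealed)
	fmt.Println(string(plain), err)
}
```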

Data Plane v1 contract scaffolding is controlled by:

DATA_PLANE_TOKEN_TTL, default 1m
DATA_PLANE_TOKEN_PRIVATE_KEY_FILE, optional path to an RSA private key PEM used to sign RS256 data-plane tokens
DATA_PLANE_TOKEN_PRIVATE_KEY_PEM, optional inline RSA private key PEM; used when file path is not configured
DATA_PLANE_BACKEND_GATEWAY_URL, default /api/v1/gateway/ws
DATA_PLANE_DIRECT_WORKER_WSS_URL_TEMPLATE, optional; supports {worker_id} replacement
DATA_PLANE_DIRECT_WORKER_JSON_RUNTIME, default false; advertises runtime_transport=json_v1 only after the worker direct JSON bridge is deployed and verified
DATA_PLANE_DIRECT_WORKER_BINARY_RENDER, default false; when the direct JSON runtime is enabled, advertises render_transport=binary_v1 so DP-2 clients can request binary render frames over direct worker WSS. Binary render candidates also advertise supported_color_modes=["full_color","grayscale"] and default_color_mode="full_color" for the DP-3A grayscale foundation.
DATA_PLANE_DIRECT_WORKER_TLS_TRUST_MODE, default smoke_insecure; allowed values are smoke_insecure, public_ca, and platform_ca.
DATA_PLANE_DIRECT_WORKER_TLS_CA_REF, optional label for the platform CA or trust bundle version advertised to clients.

Data-plane tokens are RS256-signed. Only the backend holds the private key; workers receive only the matching public key for validation. If no private key is configured, the backend omits the optional data_plane offer and the backend gateway fallback remains unchanged.
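
To make the key split concrete, here is a minimal RS256 sign/verify sketch using only the Go standard library. The claim names (session_id, worker_id, exp), the function names, and the in-memory demo key are assumptions; the real token layout is not specified in this README.

```go
package main

import (
	"crypto"
	"crypto/rand"
	"crypto/rsa"
	"crypto/sha256"
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
	"time"
)

// Claims is a hypothetical claim set for a data-plane token.
type Claims struct {
	SessionID string `json:"session_id"`
	WorkerID  string `json:"worker_id"`
	ExpiresAt int64  `json:"exp"` // Unix seconds
}

var b64 = base64.RawURLEncoding

// SignToken runs on the backend, which is the only holder of the private key.
func SignToken(priv *rsa.PrivateKey, c Claims) (string, error) {
	header := b64.EncodeToString([]byte(`{"alg":"RS256","typ":"JWT"}`))
	body, err := json.Marshal(c)
	if err != nil {
		return "", err
	}
	signingInput := header + "." + b64.EncodeToString(body)
	digest := sha256.Sum256([]byte(signingInput))
	sig, err := rsa.SignPKCS1v15(rand.Reader, priv, crypto.SHA256, digest[:])
	if err != nil {
		return "", err
	}
	return signingInput + "." + b64.EncodeToString(sig), nil
}

// VerifyToken runs on a worker with only the public key. It always verifies
// as RS256 and never trusts the algorithm named in the token header.
func VerifyToken(pub *rsa.PublicKey, token string, nowUnix int64) (Claims, error) {
	var c Claims
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return c, errors.New("malformed token")
	}
	sig, err := b64.DecodeString(parts[2])
	if err != nil {
		return c, err
	}
	digest := sha256.Sum256([]byte(parts[0] + "." + parts[1]))
	if err := rsa.VerifyPKCS1v15(pub, crypto.SHA256, digest[:], sig); err != nil {
		return c, err
	}
	body, err := b64.DecodeString(parts[1])
	if err != nil {
		return c, err
	}
	if err := json.Unmarshal(body, &c); err != nil {
		return c, err
	}
	if nowUnix >= c.ExpiresAt {
		return c, errors.New("token expired")
	}
	return c, nil
}

// mustDemoKey generates a throwaway key; a deployment would load the PEM instead.
func mustDemoKey() *rsa.PrivateKey {
	key, err := rsa.GenerateKey(rand.Reader, 2048)
	if err != nil {
		panic(err)
	}
	return key
}

func main() {
	priv := mustDemoKey()
	now := time.Now().Unix()
	// 60 seconds matches the documented DATA_PLANE_TOKEN_TTL default of 1m.
	token, err := SignToken(priv, Claims{SessionID: "session-1", WorkerID: "worker-1", ExpiresAt: now + 60})
	if err != nil {
		panic(err)
	}
	claims, err := VerifyToken(&priv.PublicKey, token, now)
	fmt.Println(claims.WorkerID, err)
}
```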

If no direct worker WSS URL template is configured, session responses still include the backend gateway fallback candidate only.
If the URL template is configured but DATA_PLANE_DIRECT_WORKER_JSON_RUNTIME is false, the direct candidate is still present for contract visibility but is not marked data-capable; DP-1D Windows clients will skip it and use the backend gateway fallback.
If DATA_PLANE_DIRECT_WORKER_BINARY_RENDER is false, direct worker WSS remains JSON/base64 for render.
If it is true, only direct worker WSS render is binary; backend gateway fallback remains JSON/base64.
In production, the backend does not advertise direct worker WSS when DATA_PLANE_DIRECT_WORKER_TLS_TRUST_MODE=smoke_insecure; it keeps the backend gateway fallback instead.
Trusted direct candidates include tls_trust_mode, production_trusted, smoke_only, and optional tls_ca_ref metadata.
See docs/architecture/DIRECT_WORKER_TLS_PKI.md.
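
The candidate rules above can be read as a small decision function. The sketch below is one possible encoding under stated assumptions: the Config and Candidate types, the Kind strings, the candidate ordering, and treating every non-smoke trust mode as production-trusted are all hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// Config holds the DATA_PLANE_* settings relevant to candidate building.
type Config struct {
	Production              bool
	BackendGatewayURL       string
	DirectWorkerURLTemplate string
	DirectJSONRuntime       bool
	DirectBinaryRender      bool
	TLSTrustMode            string
	TLSCARef                string
}

// Candidate is a hypothetical shape for one data_plane candidate.
type Candidate struct {
	Kind              string
	URL               string
	DataCapable       bool
	RuntimeTransport  string
	RenderTransport   string
	TLSTrustMode      string
	TLSCARef          string
	ProductionTrusted bool
	SmokeOnly         bool
}

// BuildCandidates always offers the backend gateway fallback and adds a
// direct worker candidate only when configuration and trust mode allow it.
func BuildCandidates(cfg Config, workerID string) []Candidate {
	fallback := Candidate{Kind: "backend_gateway", URL: cfg.BackendGatewayURL, DataCapable: true}
	if cfg.DirectWorkerURLTemplate == "" {
		return []Candidate{fallback}
	}
	smoke := cfg.TLSTrustMode == "smoke_insecure"
	if cfg.Production && smoke {
		return []Candidate{fallback} // never advertise insecure direct WSS in production
	}
	direct := Candidate{
		Kind:              "direct_worker_wss",
		URL:               strings.ReplaceAll(cfg.DirectWorkerURLTemplate, "{worker_id}", workerID),
		TLSTrustMode:      cfg.TLSTrustMode,
		TLSCARef:          cfg.TLSCARef,
		ProductionTrusted: !smoke,
		SmokeOnly:         smoke,
	}
	if cfg.DirectJSONRuntime {
		direct.DataCapable = true
		direct.RuntimeTransport = "json_v1"
		if cfg.DirectBinaryRender {
			direct.RenderTransport = "binary_v1"
		}
	}
	return []Candidate{direct, fallback}
}

func main() {
	cfg := Config{
		BackendGatewayURL:       "/api/v1/gateway/ws",
		DirectWorkerURLTemplate: "wss://{worker_id}.example.test/ws",
		DirectJSONRuntime:       true,
		TLSTrustMode:            "platform_ca",
	}
	for _, c := range BuildCandidates(cfg, "worker-1") {
		fmt.Printf("%+v\n", c)
	}
}
```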

Module layout

internal/platform shared runtime, config, infra, and bootstrap concerns
internal/modules/auth auth and trusted-device boundary
internal/modules/organization organization model, org roles, and memberships
internal/modules/identitysource local/LDAP/OIDC identity source model and future mapping foundations
internal/modules/resource remote resource inventory boundary
internal/modules/sessionbroker persistent session lifecycle, orchestration, audit, and Redis live-state boundary
internal/modules/sessiongateway websocket attach/reconnect/takeover transport boundary
internal/modules/worker worker registration, lease coordination, and control-plane routing boundary for future C++ RDP workers
internal/modules/node node inventory, capabilities, enabled services, update policy, and partition state
internal/modules/nodeagent node-agent registration, health, service status, and update/rollback control interface
pkg/contracts cross-module contracts for sessions and worker control

Backend responsibilities

PostgreSQL remains the source of truth for auth sessions, devices, remote sessions, attachments, resource policies, and audit events
Redis is used only for live routing and coordination: attach tokens, controller bindings, live session cache, worker registration, worker leases, heartbeats, and routing queues
worker:control:<worker_id> carries worker assignments, worker:queue:<session_id> carries live control/input envelopes, and worker:events carries worker-reported lifecycle events back into broker processing
Session broker owns state transitions and orchestration rules; websocket handlers call broker services instead of talking to PostgreSQL repositories directly
Worker runtime stays behind interfaces and Redis coordination so the backend remains isolated from FreeRDP implementation details while the minimal real RDP worker plugs into the control plane
RDP certificate verification is configured per resource through certificate_verification_mode
resources are now org-scoped in PostgreSQL and remote sessions persist their owning organization without changing the proven worker/session runtime contracts
session start/attach/takeover responses may include optional data_plane candidates and a short-lived signed data-plane token for DP-1 direct worker WSS migration; existing clients continue to use the current gateway path, and direct realtime use remains gated by explicit candidate metadata
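
The three Redis channels named in the list above (worker:control:<worker_id>, worker:queue:<session_id>, and worker:events) could be wrapped in small key helpers so that callers never format keys by hand. The helper names and the MessageKind type are assumptions made for this sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// workerEventsKey is the shared stream for worker-reported lifecycle events.
const workerEventsKey = "worker:events"

func workerControlKey(workerID string) string { return "worker:control:" + workerID }
func workerQueueKey(sessionID string) string  { return "worker:queue:" + sessionID }

// MessageKind is a hypothetical classification of control-plane traffic.
type MessageKind int

const (
	Assignment MessageKind = iota // backend to worker: new session assignment
	LiveEnvelope                  // backend to worker: live control/input envelope
	WorkerEvent                   // worker to backend: lifecycle event
)

// RouteKey picks the Redis key for a message and rejects missing identifiers.
func RouteKey(kind MessageKind, workerID, sessionID string) (string, error) {
	switch kind {
	case Assignment:
		if workerID == "" {
			return "", errors.New("assignment requires a worker ID")
		}
		return workerControlKey(workerID), nil
	case LiveEnvelope:
		if sessionID == "" {
			return "", errors.New("live envelope requires a session ID")
		}
		return workerQueueKey(sessionID), nil
	case WorkerEvent:
		return workerEventsKey, nil
	}
	return "", errors.New("unknown message kind")
}

func main() {
	key, err := RouteKey(Assignment, "worker-1", "")
	fmt.Println(key, err)
}
```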

Authorization model

platform_admin and platform_recovery_admin have global access across organizations, resources, and sessions
in INSTALLATION_AUTHORITY_MODE=strict, platform-admin power is effective only when the user also has a valid signed row in platform_role_grants; changing users.platform_role in PostgreSQL alone no longer grants owner access
first-owner bootstrap is available at POST /api/v1/installation/bootstrap-owner and requires a Product Root Ed25519 signature over an activation manifest in strict mode
production (APP_ENV=production or prod) requires strict installation authority plus INSTALLATION_PRODUCT_ROOT_PUBLIC_KEY_B64 or INSTALLATION_PRODUCT_ROOT_PUBLIC_KEY_FILE
legacy/dev installs can keep database-role behavior, and insecure first-owner bootstrap is available only when INSTALLATION_INSECURE_BOOTSTRAP_ENABLED=true
org_owner and org_admin can create and update resources inside their organization and can manage any remote session inside that organization
active non-admin memberships such as org_operator, org_member, and org_viewer are deny-by-default for admin actions; they can only access org-scoped reads and operate on their own session flows where the session broker explicitly allows it
session start always authorizes the actor against the resource organization before worker reservation
attach, detach, takeover, and terminate authorize against the owning remote session organization before any state transition is written
worker-facing events do not bypass this model for user-originated commands; internal worker failure and heartbeat paths remain broker-internal control-plane operations
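
The role rules above can be summarized in two predicates. This is a simplified sketch: the Actor type and function names are hypothetical, both platform roles are assumed to need a signed grant in strict mode, and the "own session" rule stands in for the finer-grained checks the session broker applies.

```go
package main

import "fmt"

// Actor is a hypothetical view of the authenticated caller.
type Actor struct {
	UserID        string
	PlatformRole  string            // users.platform_role
	HasValidGrant bool              // a verified signed row in platform_role_grants
	OrgRoles      map[string]string // organization ID -> active membership role
}

// isPlatformAdmin applies the strict-mode rule: the database role alone is
// not enough when INSTALLATION_AUTHORITY_MODE=strict.
func isPlatformAdmin(a Actor, strict bool) bool {
	if a.PlatformRole != "platform_admin" && a.PlatformRole != "platform_recovery_admin" {
		return false
	}
	return !strict || a.HasValidGrant
}

// CanManageResource covers resource create/update inside an organization.
func CanManageResource(a Actor, orgID string, strict bool) bool {
	if isPlatformAdmin(a, strict) {
		return true
	}
	switch a.OrgRoles[orgID] {
	case "org_owner", "org_admin":
		return true
	}
	return false // deny by default for operator, member, and viewer roles
}

// CanControlSession covers attach, detach, takeover, and terminate. It is
// evaluated against the session's owning organization before any state change.
func CanControlSession(a Actor, sessionOrgID, sessionOwnerID string, strict bool) bool {
	if CanManageResource(a, sessionOrgID, strict) {
		return true
	}
	_, isMember := a.OrgRoles[sessionOrgID]
	return isMember && a.UserID == sessionOwnerID
}

func main() {
	admin := Actor{UserID: "u1", PlatformRole: "platform_admin"}
	fmt.Println(CanManageResource(admin, "org-a", true))
}
```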

Migration safety

000005_platform_core_v2 bootstraps a single default organization and backfills existing resources.organization_id and remote_sessions.organization_id into that organization before setting NOT NULL
000006_default_org_memberships_backfill safely restores access continuity by inserting missing active memberships for existing users into the default organization
the backfill is idempotent because it only inserts rows missing under the (organization_id, user_id) uniqueness constraint
platform administrators are backfilled as org_owner in the default organization, while other existing users are backfilled as org_member
if 000005 fails before the NOT NULL step, PostgreSQL rolls back the transaction and leaves pre-v2 rows untouched; if 000006 is rerun, it skips already-created memberships rather than duplicating them

Platform-Core V2 Notes

organizations, organization_memberships, and organization_roles establish multi-tenant ownership and basic org-scoped authorization boundaries
identity_sources and identity_mappings are foundation-only in this phase; full LDAP/OIDC sync and claim/group ingestion are intentionally deferred
nodes, node_capabilities, node_services, node_update_policies, node_partition_states, and node_agent_update_runs provide the first control-plane model for node and node-agent lifecycle
the current proven RDP session lifecycle is preserved: the session broker still orchestrates the same worker/session behavior, but it now records organization ownership via org-scoped resources
PostgreSQL remains the source of truth for organizations, memberships, org-scoped resources, identity sources, nodes, node-agent state, and session lifecycle state

Resource Certificate Verification

strict is the default and keeps normal certificate validation enabled in the worker runtime
ignore must be explicitly stored on the resource and allows that one RDP connection to skip certificate validation
the backend passes this policy through session assignment data; it is not a global backend toggle
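
A small sketch of the strict-by-default rule follows. The function names and the metadata map are assumptions; only the certificate_verification_mode field name and the strict and ignore values come from this README.

```go
package main

import "fmt"

// NormalizeCertMode maps a stored resource value to an effective mode.
// An empty value falls back to strict; unknown values are rejected rather
// than silently weakening validation.
func NormalizeCertMode(stored string) (string, error) {
	switch stored {
	case "", "strict":
		return "strict", nil
	case "ignore":
		return "ignore", nil
	default:
		return "", fmt.Errorf("unsupported certificate_verification_mode %q", stored)
	}
}

// AssignmentMetadata shows the policy travelling with one session assignment
// instead of being read from a global backend setting.
func AssignmentMetadata(storedMode string) (map[string]string, error) {
	mode, err := NormalizeCertMode(storedMode)
	if err != nil {
		return nil, err
	}
	return map[string]string{"certificate_verification_mode": mode}, nil
}

func main() {
	meta, err := AssignmentMetadata("")
	fmt.Println(meta, err)
}
```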

Messaging Model

HTTP errors now use a structured envelope:
- error.code
- error.message_key
- error.fallback_message
- error.details
- error.trace_id
internal/platform/httpx owns error normalization and trace-id generation so handlers can keep calling WriteError(...) without changing business logic.
For 5xx responses, user-facing payloads are normalized to a generic English fallback message, while logs and diagnostics can still retain the raw internal details.
For 4xx responses, stable code and message_key are derived from the current fallback message, so clients can localize without depending on raw English text as the primary contract.
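
The envelope fields listed above map naturally onto a pair of Go structs. The sketch below is illustrative only: BuildError and writeError are hypothetical names (the real httpx.WriteError signature is not shown in this README), the generic 5xx code and key are invented, and the code and message_key are passed in explicitly rather than derived from the fallback message.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// ErrorBody mirrors the documented error.* fields.
type ErrorBody struct {
	Code            string         `json:"code"`
	MessageKey      string         `json:"message_key"`
	FallbackMessage string         `json:"fallback_message"`
	Details         map[string]any `json:"details,omitempty"`
	TraceID         string         `json:"trace_id"`
}

// ErrorEnvelope is the top-level JSON object returned to clients.
type ErrorEnvelope struct {
	Error ErrorBody `json:"error"`
}

// newTraceID returns 16 random bytes as a 32-character hex string.
func newTraceID() string {
	buf := make([]byte, 16)
	if _, err := rand.Read(buf); err != nil {
		panic(err)
	}
	return hex.EncodeToString(buf)
}

// BuildError hides internal details for 5xx and passes 4xx values through.
func BuildError(status int, code, messageKey, fallback, traceID string) ErrorEnvelope {
	if status >= 500 {
		// Hypothetical generic values; the raw message belongs in logs only.
		code, messageKey, fallback = "internal_error", "errors.internal", "Internal server error"
	}
	return ErrorEnvelope{Error: ErrorBody{
		Code: code, MessageKey: messageKey, FallbackMessage: fallback, TraceID: traceID,
	}}
}

// writeError serializes the envelope with the given HTTP status.
func writeError(w http.ResponseWriter, status int, code, messageKey, fallback string) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(status)
	_ = json.NewEncoder(w).Encode(BuildError(status, code, messageKey, fallback, newTraceID()))
}

func main() {
	rec := httptest.NewRecorder()
	writeError(rec, http.StatusInternalServerError, "db_timeout", "errors.db_timeout", "pq: connection reset")
	fmt.Print(rec.Body.String())
}
```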

WebSocket Messaging

Session gateway envelopes keep the existing type and payload contract.
User-facing websocket events now also include an event object with:
- code
- message_key
- fallback_message
- details
- trace_id
session.taken_over, terminal session.state, transport.closed, and protocol-level errors now carry this structured event object.
Existing payload semantics remain intact for compatibility with the already proven session lifecycle.
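
One way to picture the additive contract is an envelope struct where event is optional, so older messages serialize exactly as before. Only the type, payload, and event field names and the TransportEnvelope.Event name come from this README; the remaining names and the example code and message_key values are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Event carries the structured, localizable part of a user-facing message.
type Event struct {
	Code            string         `json:"code"`
	MessageKey      string         `json:"message_key"`
	FallbackMessage string         `json:"fallback_message"`
	Details         map[string]any `json:"details,omitempty"`
	TraceID         string         `json:"trace_id"`
}

// TransportEnvelope keeps type and payload and adds an optional event.
type TransportEnvelope struct {
	Type    string          `json:"type"`
	Payload json.RawMessage `json:"payload,omitempty"`
	Event   *Event          `json:"event,omitempty"`
}

// TakenOverEnvelope builds a session.taken_over notification. The code and
// message key strings are hypothetical examples.
func TakenOverEnvelope(traceID string) TransportEnvelope {
	return TransportEnvelope{
		Type: "session.taken_over",
		Event: &Event{
			Code:            "session.taken_over",
			MessageKey:      "session.taken_over",
			FallbackMessage: "This session was taken over by another client.",
			TraceID:         traceID,
		},
	}
}

// Encode marshals an envelope and returns an empty string on failure.
func Encode(env TransportEnvelope) string {
	raw, err := json.Marshal(env)
	if err != nil {
		return ""
	}
	return string(raw)
}

func main() {
	fmt.Println(Encode(TakenOverEnvelope("trace-1")))
	// An envelope without Event omits the field entirely.
	fmt.Println(Encode(TransportEnvelope{Type: "session.state"}))
}
```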

Message Rules

Keep English as the only development language for fallback_message, logs, and diagnostics.
New HTTP handlers should prefer httpx.WriteError(...) for user-facing failures instead of hand-building "error": "..." JSON.
New websocket user-facing notifications should populate TransportEnvelope.Event with a stable code and message_key.
Do not use raw human-readable English text as the primary client contract; it should only remain as fallback text.
This messaging layer is now runtime-proven against the live Windows smoke flow for invalid-login errors, websocket takeover delivery, websocket state fallback rendering, and worker-death failure handling.

Clipboard Policy

RDP text clipboard is controlled per resource through resource_policies.clipboard_mode. Allowed values are disabled, client_to_server, server_to_client, and bidirectional; the default is disabled. The legacy clipboard_enabled column is retained only for compatibility and migration/backfill, while new runtime decisions use clipboard_mode.

Clipboard enforcement happens in the real data path:

sessionbroker.ResourcePolicy.ClipboardMode is loaded from PostgreSQL and embedded into the session assignment metadata sent to the worker.
sessiongateway.Module.handleEnvelope blocks client-to-server clipboard envelopes unless the session is active and the policy allows that direction.
worker.EventProcessor sends worker-originated clipboard text through sessionbroker.Service.UpdateWorkerClipboardText, which applies the same active-state and server-to-client policy checks before updating live state.
Clipboard messages carry sequence_id, origin, and content_hash so clients and workers can avoid feedback loops across reattach/takeover paths.
Redis stores clipboard text only as transient live state for routing to the active controller; PostgreSQL remains authoritative for policy/session state.
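
The direction check and the feedback-loop guard can be sketched as follows. The function and type names are hypothetical, and SHA-256 is assumed for content_hash because the README does not name the algorithm.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// AllowsClipboard applies the active-state and direction checks. Any mode
// outside the four documented values is treated as disabled.
func AllowsClipboard(mode, direction string, sessionActive bool) bool {
	if !sessionActive {
		return false
	}
	switch mode {
	case "bidirectional":
		return direction == "client_to_server" || direction == "server_to_client"
	case "client_to_server", "server_to_client":
		return mode == direction
	default:
		return false
	}
}

// ContentHash returns a hex digest of the clipboard text (SHA-256 assumed).
func ContentHash(text string) string {
	sum := sha256.Sum256([]byte(text))
	return hex.EncodeToString(sum[:])
}

// LoopGuard remembers the last forwarded hash so an echoed clipboard value
// is not bounced back to the side that produced it.
type LoopGuard struct {
	lastHash string
}

// ShouldForward reports whether text differs from the last forwarded value
// and records it when it does.
func (g *LoopGuard) ShouldForward(text string) bool {
	h := ContentHash(text)
	if h == g.lastHash {
		return false
	}
	g.lastHash = h
	return true
}

func main() {
	guard := &LoopGuard{}
	fmt.Println(AllowsClipboard("client_to_server", "client_to_server", true))
	fmt.Println(guard.ShouldForward("hello"), guard.ShouldForward("hello"))
}
```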

File Upload Policy

Stage 5.1 introduces client-to-server file upload as a policy-gated RDP feature. The authoritative policy field is resource_policies.file_transfer_mode; allowed values are disabled, client_to_server, server_to_client, and bidirectional, but only client_to_server behavior is implemented in this stage. The default is disabled. The legacy file_transfer_enabled column is retained only as a derived compatibility flag and must not be treated as the primary policy.

Enforcement is deliberately duplicated in the real data path:

resource.Module exposes file_transfer_mode in resource create, update, list, and read payloads.
sessionbroker.Service.StartRemoteSession embeds file_transfer_mode into assignment metadata and requests the worker file-transfer capability only when client-to-server upload is allowed.
sessiongateway.Module.handleFileUploadStart and handleFileUploadChunk require an active session, current controller, allowed policy mode, valid UUID transfer_id, safe file name, 25 MiB max file size, and 256 KiB max chunk size before routing chunks to the worker.
Redis is used only to route bounded upload envelopes to the worker. The file itself is written by the worker to controlled worker storage; PostgreSQL remains authoritative for policy and session state.
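
The gateway-side upload limits lend themselves to a pair of validators. The sketch below uses the documented 25 MiB and 256 KiB limits; the function names, the exact safe-file-name rules, and rejecting empty chunks are assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"strings"
)

const (
	maxFileSize  = 25 << 20  // 25 MiB
	maxChunkSize = 256 << 10 // 256 KiB
)

// uuidPattern accepts the canonical 8-4-4-4-12 hexadecimal form.
var uuidPattern = regexp.MustCompile(`^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$`)

// isSafeFileName rejects empty names, dot entries, path separators, and NUL.
// The real rules may be stricter; this is an illustrative baseline.
func isSafeFileName(name string) bool {
	if name == "" || name == "." || name == ".." {
		return false
	}
	return !strings.ContainsAny(name, "/\\\x00")
}

// ValidateUploadStart checks the metadata sent when a transfer begins.
func ValidateUploadStart(transferID, fileName string, size int64) error {
	if !uuidPattern.MatchString(transferID) {
		return errors.New("transfer_id must be a UUID")
	}
	if !isSafeFileName(fileName) {
		return fmt.Errorf("unsafe file name %q", fileName)
	}
	if size < 0 || size > maxFileSize {
		return fmt.Errorf("file size %d is outside the 0..%d byte range", size, maxFileSize)
	}
	return nil
}

// ValidateUploadChunk checks one chunk before it is routed to the worker.
func ValidateUploadChunk(transferID string, chunkLen int) error {
	if !uuidPattern.MatchString(transferID) {
		return errors.New("transfer_id must be a UUID")
	}
	if chunkLen <= 0 || chunkLen > maxChunkSize {
		return fmt.Errorf("chunk length %d is outside the 1..%d byte range", chunkLen, maxChunkSize)
	}
	return nil
}

func main() {
	id := "123e4567-e89b-12d3-a456-426614174000"
	fmt.Println(ValidateUploadStart(id, "report.pdf", 1024))
	fmt.Println(ValidateUploadStart(id, "../etc/passwd", 1024))
}
```

These checks cover only the request shape; the active-session, current-controller, and policy-mode checks described above would run before them in the handler.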

File Download Policy

Stage 5.2 adds a runtime-proven server-to-client download path for RDP. The policy field remains resource_policies.file_transfer_mode; server_to_client and bidirectional allow download, while disabled and client_to_server block it. The default remains disabled.

The v1 download model uses only the restricted RAP_Transfers\ToClient drop-zone inside the existing per-session visible transfer directory. The backend gateway accepts only file_download.start, file_download.ack, and file_download.cancel from the current controller of an active session and routes them to the worker after policy validation. Worker-origin file_download.* events are stored only as transient live state for backend-gateway fallback delivery; PostgreSQL remains authoritative for session/resource/policy state and must not store file contents.

The direct worker WSS path is also lifecycle-gated: detach returns file_download.blocked, old-controller takeover returns session.taken_over, and worker failure closes the direct transport after PostgreSQL transitions the session to failed.
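
The gateway acceptance rule for client download messages can be sketched as a single gate. The function names and error texts are hypothetical; the accepted message types and the policy values come from the description above.

```go
package main

import (
	"errors"
	"fmt"
)

// AllowsDownload reports whether file_transfer_mode permits server-to-client
// transfer. Unknown or empty values are treated as disabled.
func AllowsDownload(mode string) bool {
	return mode == "server_to_client" || mode == "bidirectional"
}

// isClientDownloadMessage lists the only types the gateway takes from a client.
func isClientDownloadMessage(msgType string) bool {
	switch msgType {
	case "file_download.start", "file_download.ack", "file_download.cancel":
		return true
	}
	return false
}

// AcceptDownloadMessage decides whether a client message may be routed to
// the worker. Checks run from cheapest to most specific.
func AcceptDownloadMessage(msgType, mode string, sessionActive, isController bool) error {
	if !isClientDownloadMessage(msgType) {
		return fmt.Errorf("message type %q is not accepted from clients", msgType)
	}
	if !sessionActive {
		return errors.New("session is not active")
	}
	if !isController {
		return errors.New("sender is not the current controller")
	}
	if !AllowsDownload(mode) {
		return errors.New("file download is blocked by resource policy")
	}
	return nil
}

func main() {
	fmt.Println(AcceptDownloadMessage("file_download.start", "server_to_client", true, true))
	fmt.Println(AcceptDownloadMessage("file_download.start", "client_to_server", true, true))
}
```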