Data Plane v1 for RDP
Archived status: this document is a historical RDP/WebSocket stage record, not
the current runtime source of truth for transport architecture. The active
fabric transport model is QUIC-only between nodes; see
docs/architecture/DISTRIBUTED_FABRIC_NODE_PROTOCOL_PLAN.md,
docs/architecture/FABRIC_FIRST_TRANSPORT_AND_STRESS_PLAN.md, and
docs/architecture/SECURE_ACCESS_FABRIC_TARGET.md.
Status: DP-3A grayscale full-frame binary render foundation is implemented and smoke-proven on the test Docker environment as of 2026-04-25. DP-3B adaptive quality policy/selection is intentionally paused. The accepted C++ RDP Adapter baseline is the ordered-region path. RDP-Perf-6 makes direct dirty-region binary render explicit with render.frame.full / render.frame.region RAP2 message types and is build/probe/live-smoke-proven on the test Docker environment as of 2026-04-26. The current test Docker deployment for the RDP Adapter performance path is rap-rdp-worker:rdp-perf6-dirty-region. The Stage 5.2 core download data path remains runtime-proven for direct worker WSS and backend gateway fallback. Data-plane and RDP work are paused; the next active focus is Stage C10 Fabric Core / cluster foundation, not another data-plane feature.
This document defines the first staged data-plane evolution for the RDP MVP. It does not implement direct worker WebSocket runtime, mesh routing, VPN, QUIC, UDP, WebRTC, relay nodes, or multi-cluster behavior.
The long-term platform target is defined in docs/architecture/SECURE_ACCESS_FABRIC_TARGET.md. This document narrows that target to DP-1: direct client-to-worker WSS for RDP realtime traffic, with the current backend gateway retained as fallback/debug.
1. Current Problem
The current RDP MVP routes realtime input/render through the backend WebSocket gateway and Redis-backed coordination. This is acceptable for fallback, debugging, lifecycle proof, and early MVP validation.
It is not acceptable as the production realtime path because:
- render frames are high-rate and high-volume
- base64/JSON render payloads add CPU and payload overhead
- backend gateway can become a bottleneck under concurrent sessions
- input can compete with render/frame processing
- backend API capacity should be reserved for control-plane work
- Redis must not become a frame transport or durable render store
The current implementation remains valid as fallback while DP-1 is introduced in stages.
2. Target DP-1 Path
Target DP-1 path:
Windows client
-> direct WSS data-plane connection
-> rdp-worker realtime endpoint
-> existing RDP session runtime
-> FreeRDP
Control-plane path remains:
Windows client
-> backend API
-> auth / org / policy / session broker
-> worker selection
-> short-lived data-plane token issuance
Fallback path remains:
Windows client
-> backend WebSocket gateway
-> current gateway/Redis/worker coordination path
DP-1 does not replace the session broker. It only moves realtime session traffic away from backend relay when direct worker WSS is available and authorized.
3. Responsibilities
Backend as Control Plane
Backend remains responsible for:
- authentication
- organization selection and isolation
- resource authorization
- resource policy evaluation
- session lifecycle
- worker selection
- attachment ownership
- takeover semantics
- audit
- short-lived data-plane token issuing
- returning data-plane candidates
- retaining backend gateway fallback
Backend must not become the production high-rate render relay.
Worker as Direct Realtime Endpoint
The RDP worker becomes responsible for:
- exposing an authorized direct WSS endpoint
- validating data_plane_token
- binding a WSS connection to an existing session runtime
- enforcing session, attachment, user, organization, and channel scope
- carrying realtime logical channels directly:
  - input
  - render
  - clipboard
  - file_upload
  - control / heartbeat
  - telemetry
- preserving existing FreeRDP runtime boundaries
- preserving policy enforcement already present in worker runtime
The worker must not create a new RDP session just because a direct WSS connection attaches. It must bind to the existing broker-created session runtime.
Windows Client
The Windows client will eventually:
- read data-plane candidates from session start/attach responses
- prefer direct_worker_wss when available
- fall back to backend_gateway when direct worker WSS is unavailable
- keep existing lifecycle behavior unchanged
- keep backend gateway support for debug/fallback
No client behavior changes are required for DP-1A.
4. Backend Contract Proposal
On session start, attach, and takeover, backend should extend the response with data-plane candidates.
Example:
{
  "session_id": "session-123",
  "attachment_id": "attachment-456",
  "gateway_url": "wss://backend.example.com/api/v1/gateway/ws",
  "data_plane": {
    "preferred": "direct_worker_wss",
    "token": "short-lived-data-plane-token",
    "expires_at": "2026-04-25T13:00:00Z",
    "candidates": [
      {
        "type": "direct_worker_wss",
        "url": "wss://worker-node.example.com/rap/v1/data-plane"
      },
      {
        "type": "backend_gateway",
        "url": "wss://backend.example.com/api/v1/gateway/ws"
      }
    ]
  }
}
Compatibility rules:
- Existing fields must remain valid.
- Existing clients may ignore data_plane.
- gateway_url remains available for fallback/debug.
- The backend must not return direct worker candidates unless the worker is live and route policy permits it.
- Token TTL must be short.
Proposed DTO Shape
Names are proposals only.
SessionControlResult
  session
  attachment
  attach_token
  gateway_url
  data_plane?
DataPlaneOffer
  preferred
  token
  expires_at
  candidates[]
DataPlaneCandidate
  type
  url
  worker_id?
  node_id?
  cluster_id?
  priority?
  metadata?
5. Data Plane Token Model
data_plane_token must be short-lived and scoped. It is not a general API token.
Required claims:
- session_id
- attachment_id
- user_id
- organization_id
- cluster_id if available
- worker_id
- resource_id
- allowed_channels
- expires_at
- nonce / jti
- issued_at
- issuer
- audience
Allowed channel values:
- input
- render
- clipboard
- file_upload
- control
- telemetry
Validation rules:
- token must be signed by the backend with an RS256 private key
- worker must validate with public key only and must not hold a signing secret
- token must be short-lived
- token must match the worker receiving it
- token must match an active session runtime
- token must match current attachment/controller where required
- token must not grant channels not allowed by resource policy
- token must not survive session termination
- token replay must be rejected or bounded by a jti / nonce cache
Token refresh is not part of DP-1A. Future stages may either reissue tokens through the control plane or renew direct connections through a controlled flow.
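The validation rules above can be sketched as a claims check that runs after the RS256 signature has already been verified. This is a minimal illustration, not the worker's actual code: the runtime dictionary shape, the rejection reason strings, the numeric unix-seconds expires_at, and the JtiCache class are all assumptions made for the example.

```python
import time
from collections import OrderedDict

CHANNELS = {"input", "render", "clipboard", "file_upload", "control", "telemetry"}


class JtiCache:
    """Bounded in-memory replay cache (hypothetical helper): keeps each jti until expiry."""

    def __init__(self, max_entries=10000):
        self.max_entries = max_entries
        self._seen = OrderedDict()  # jti -> expiry in unix seconds

    def check_and_store(self, jti, expires_at, now):
        # Forget entries whose tokens can no longer be replayed anyway.
        for key in [k for k, exp in self._seen.items() if exp <= now]:
            del self._seen[key]
        if jti in self._seen:
            return False  # replay
        if len(self._seen) >= self.max_entries:
            self._seen.popitem(last=False)  # evict the oldest entry to stay bounded
        self._seen[jti] = expires_at
        return True


def validate_claims(claims, *, worker_id, runtime, cache, now=None):
    """Return None when the claims are acceptable, otherwise a rejection reason."""
    now = time.time() if now is None else now
    if claims.get("expires_at", 0) <= now:
        return "expired"
    if not claims.get("jti"):
        return "missing_jti"
    if claims.get("worker_id") != worker_id:
        return "wrong_worker"
    # Bind only to an existing runtime; never create one here.
    if runtime is None or runtime.get("state") != "active":
        return "no_active_runtime"
    for field in ("session_id", "attachment_id", "organization_id", "resource_id"):
        if claims.get(field) != runtime.get(field):
            return "mismatch:" + field
    allowed = set(claims.get("allowed_channels", []))
    if not allowed <= CHANNELS or not allowed <= set(runtime.get("policy_channels", ())):
        return "channels_too_broad"
    if not cache.check_and_store(claims["jti"], claims["expires_at"], now):
        return "replayed_jti"
    return None
```

The replay check runs last so that a token rejected for another reason does not consume a cache slot.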
6. Direct WSS Channel Model
DP-1 uses a single WSS connection with logical channels. Later stages may split transports, but DP-1 must keep the model simple and bounded.
control
Reliable channel.
Used for:
- attach handshake
- heartbeat
- session state messages
- detach notification
- takeover notification
- terminate notification
- protocol errors
input
Highest-priority channel.
Rules:
- input never waits behind render
- key down/up and mouse button/wheel events must be ordered
- mouse move may be coalesced to latest
- input queues must be bounded
- stale mouse move may be dropped
- click/key/wheel must not be dropped under normal operation
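A minimal sketch of how the input rules could be honored by a bounded queue, assuming hypothetical message_type values such as mouse_move and mouse_down (the document only fixes the channel name, not these identifiers). Consecutive mouse moves collapse to the latest one, while clicks, keys, and wheel events keep their order and are refused only when the queue holds nothing droppable.

```python
from collections import deque


class InputQueue:
    """Bounded FIFO that coalesces consecutive mouse moves to the latest one."""

    def __init__(self, max_len=256):
        self.max_len = max_len
        self._events = deque()

    def push(self, event):
        """Queue an event; return False only when a reliable event had to be refused."""
        is_move = event["message_type"] == "mouse_move"
        if is_move and self._events and self._events[-1]["message_type"] == "mouse_move":
            self._events[-1] = event  # replace the stale move; other events keep their order
            return True
        if len(self._events) >= self.max_len:
            if is_move:
                return True  # a move is droppable under pressure
            # Make room for the reliable event by discarding the oldest queued move.
            for i, queued in enumerate(self._events):
                if queued["message_type"] == "mouse_move":
                    del self._events[i]
                    break
            else:
                return False  # only reliable events are queued: signal backpressure
        self._events.append(event)
        return True

    def drain(self):
        """Return all queued events in order and empty the queue."""
        events = list(self._events)
        self._events.clear()
        return events
```

A caller that receives False would be expected to apply backpressure rather than silently discard the event.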
render
Droppable/latest-frame channel.
Rules:
- stale render frames must be dropped
- latest frame wins
- render must not block input/control
- binary payloads should be used on direct data plane
- backend fallback may continue existing JSON/base64 behavior during migration
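The latest-frame-wins rule can be illustrated with a single-slot mailbox between the capture side and the WSS writer. This is a sketch under the assumption of one producer and one consumer thread; the class name and the dropped counter are hypothetical.

```python
import threading


class LatestFrameSlot:
    """Single-slot mailbox: a newer frame overwrites an older frame that was never sent."""

    def __init__(self):
        self._lock = threading.Lock()
        self._frame = None
        self.dropped = 0  # count of stale frames that were overwritten

    def publish(self, frame):
        # The producer never waits for the consumer; it only overwrites the slot.
        with self._lock:
            if self._frame is not None:
                self.dropped += 1
            self._frame = frame

    def take(self):
        """Return the newest pending frame, or None when nothing is pending."""
        with self._lock:
            frame, self._frame = self._frame, None
            return frame
```

Because the slot holds at most one frame, render memory stays bounded and a slow writer cannot delay the input or control channels.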
clipboard
Reliable policy-gated channel.
Rules:
- existing clipboard_mode applies
- text-only behavior remains until richer formats are explicitly designed
- blocked behavior must remain localized in clients
- worker must enforce policy again
file_upload
Reliable chunked channel.
Rules:
- existing file_transfer_mode applies
- bounded chunk size
- content hash
- transfer id
- no arbitrary path exposure
- file upload must not block input
telemetry
Low-priority channel.
Rules:
- sampled or lossy telemetry is acceptable
- telemetry must not block user traffic
- useful metrics include input RTT, frame FPS, dropped frames, queue length, decode time, render apply time
7. Message Framing
DP-1 uses:
- JSON control messages for small envelopes
- binary WebSocket frames for render payloads
- no base64 for direct data-plane render frames
Backend fallback keeps the current JSON/base64 frame path for debug/fallback. Direct worker WSS uses binary render frames when the backend advertises render_transport=binary_v1 and the client requests render_transport=binary_v1.
JSON Envelope
Small control/reliable messages may use JSON:
{
"protocol_version": 1,
"session_id": "session-123",
"attachment_id": "attachment-456",
"channel": "input",
"message_type": "mouse",
"sequence": 1024,
"timestamp": "2026-04-25T13:00:00.000Z",
"flags": {},
"payload": {}
}
Binary Frame Header
DP-2 uses a fixed 16-byte preamble followed by a UTF-8 JSON header and a raw binary payload:
offset  size  field
0       4     magic = "RAP2"
4       2     protocol_version, little-endian uint16, currently 1
6       2     flags, little-endian uint16
8       4     header_length, little-endian uint32
12      4     payload_length, little-endian uint32
16      n     UTF-8 JSON header
16+n    m     raw render payload bytes
The DP-2 JSON header contains:
- protocol_version
- session_id
- channel, currently render
- message_type, currently render.frame.full or render.frame.region on direct worker WSS; session.frame remains accepted as the legacy DP-2 binary message type for compatibility.
- sequence
- timestamp
- flags
- payload_length
- frame_width
- frame_height
- frame_stride
- frame_format
- optional region fields when message_type=render.frame.region: region_x, region_y, region_width, region_height, region_stride, region_format=BGRA32
- optional color_mode, currently full_color or grayscale
- optional quality_profile
- optional original_frame_format
- optional output_frame_format
- optional raw_frame_bytes
- optional binary_direct_bytes
- optional diagnostics: full_frame_bytes, region_bytes, region_savings_percent, diff_time_ms, render_update_reason, fallback_to_full_frame_reason
- optional input_correlation_id
- optional worker_frame_captured_at
Binary frames must include a fixed or clearly parseable header before payload.
Required header fields:
- protocol_version
- session_id
- channel
- message_type
- sequence
- timestamp
- flags
- payload_length
Render payload must not be base64 encoded on direct data plane.
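The 16-byte preamble, JSON header, and raw payload described above can be packed and parsed as follows. This is a sketch written from the layout in this section, not the worker or client implementation; the function names and error messages are assumptions.

```python
import json
import struct

MAGIC = b"RAP2"
# Little-endian: magic, protocol_version, flags, header_length, payload_length.
PREAMBLE = struct.Struct("<4sHHII")


def encode_frame(header, payload, flags=0, version=1):
    """Build preamble + UTF-8 JSON header + raw payload bytes (no base64)."""
    header = dict(header, payload_length=len(payload))
    header_bytes = json.dumps(header, separators=(",", ":")).encode("utf-8")
    preamble = PREAMBLE.pack(MAGIC, version, flags, len(header_bytes), len(payload))
    return preamble + header_bytes + payload


def decode_frame(data):
    """Parse one binary frame; raise ValueError on any inconsistency."""
    if len(data) < PREAMBLE.size:
        raise ValueError("short preamble")
    magic, version, flags, header_len, payload_len = PREAMBLE.unpack_from(data, 0)
    if magic != MAGIC:
        raise ValueError("bad magic")
    if version != 1:
        raise ValueError("unsupported protocol_version")
    end_header = PREAMBLE.size + header_len
    if len(data) != end_header + payload_len:
        raise ValueError("length mismatch")
    header = json.loads(data[PREAMBLE.size:end_header].decode("utf-8"))
    if header.get("payload_length") != payload_len:
        raise ValueError("header and preamble payload_length disagree")
    return version, flags, header, data[end_header:]
```

Checking the lengths before touching the payload lets a receiver reject truncated or oversized frames without allocating for them.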
Suggested render message types:
- render.frame.full
- render.frame.region
- render.cursor
- render.resize
- render.quality.changed
Suggested flags:
- keyframe
- droppable
- latest_only
- compressed
- interactive
- grayscale
8. Quality Profile Foundation
DP-1A defines quality profiles only. It does not implement adaptive rendering.
Profiles:
- emergency_grayscale
- low_bandwidth
- text_priority
- balanced
- high_quality
Color modes:
- full_color
- 256_colors
- 64_colors
- 16_colors
- grayscale
Rules:
- quality profile must affect real render behavior in later stages
- input priority remains absolute
- render quality must degrade before input latency increases
- lower profiles may reduce FPS, color depth, region size, or compression settings
- higher profiles may increase FPS and color fidelity only when queues remain healthy
- profile selection must be policy-aware and observable
9. Security Model
DP-1 security boundaries:
- backend authorizes session access
- backend issues short-lived data-plane token
- worker validates token before accepting direct WSS
- worker binds token to existing session runtime
- worker enforces channel permissions
- worker rejects mismatched session, attachment, organization, resource, worker, or expired token
- backend gateway fallback keeps existing auth path
Transport:
- direct worker WSS must use TLS
- future node-to-node traffic uses mTLS as defined in the Secure Access Fabric target
- DP-1 direct WSS may start with worker server TLS plus signed token validation
- P3.2 direct worker WSS trust metadata distinguishes smoke_insecure, public_ca, and platform_ca
- production backend must not advertise smoke-only direct candidates
- production clients must not use insecure TLS bypass and must fall back to backend gateway if direct worker trust is unavailable
- production deployments should avoid long-lived static worker secrets
Audit:
- backend audits token issuance
- backend audits session lifecycle
- worker should report direct attach/detach/failure events back to control plane
- direct data-plane traffic does not require auditing every input/render event
- high-risk events such as takeover, failed token validation, policy denial, and file transfer should be auditable
10. Fallback Backend Gateway Path
The current backend WebSocket gateway remains:
- fallback path
- debug path
- compatibility path for older clients
- smoke-test path while DP-1 is staged
Fallback activation cases:
- no direct worker candidate returned
- direct WSS connect fails
- token validation fails due to stale route
- worker endpoint unavailable
- policy forces backend gateway
- client version does not support direct WSS
Fallback rules:
- fallback must preserve existing lifecycle behavior
- fallback must not silently weaken policy
- fallback should be visible in logs/telemetry
- fallback should be measurable against direct path latency
11. Migration Stages
Stage DP-1A: Spec Only
Create architecture/spec documentation.
No runtime behavior changes.
Stage DP-1B: Backend Offers Data Plane Candidates
Status: completed.
Backend extends session start/attach/takeover responses with data_plane.
Client still uses fallback backend gateway.
Implementation status:
- backend response DTO can include optional gateway_url and data_plane
- data_plane.token is a short-lived signed token with session, attachment, user, organization, worker, resource, allowed-channel, expiry, and jti scope
- backend_gateway candidate is always returned when configured
- direct_worker_wss candidate is returned only when a direct worker WSS URL template is configured
- current clients may ignore data_plane safely
- no worker direct WSS runtime is implemented in this stage
- no client routing behavior changes in this stage
Verification:
- old clients still work
- responses include valid candidate shape
- token is short-lived
- token is scoped
Stage DP-1C: Worker Direct WSS Endpoint
Status: completed.
Worker exposes direct WSS endpoint and validates data_plane_token.
Windows client still uses fallback backend gateway.
Implementation status:
- worker has optional /rap/v1/data-plane WSS endpoint
- endpoint is disabled by default and requires TLS certificate/key paths
- worker validates signed RS256 data_plane_token with a public key only
- worker keeps no data-plane signing secret
- worker rejects reused jti values with a bounded in-memory TTL cache
- token validation checks session, attachment, user, organization, worker, resource, allowed channels, expiry, audience, and jti
- endpoint binds only to existing SessionRuntime
- bind checks reject old attachment after takeover, wrong attachment, wrong worker, wrong organization, wrong resource, missing runtime, failed/terminated runtime state, and channels broader than runtime policy
- invalid token, wrong worker, expired token, replayed jti, and missing runtime are rejected
- rdp-worker-dataplane-token-probe validates token behavior in the worker image
- rdp-worker-dataplane-bind-probe validates attachment/state/channel bind policy without starting RDP
- backend gateway remains active fallback
- no Windows client routing change is included in this stage
Verification:
- token validation works
- runtime binding rejects missing runtime without creating a new RDP session
- replayed jti values are rejected
- wrong attachment and over-broad channels are rejected
- no new RDP session is created
- invalid tokens are rejected
Stage DP-1D: Windows Client Prefers Direct WSS
Status: completed as hardened client transport selection.
Windows client uses direct worker WSS only when the candidate is explicitly marked data-capable. Current DP-1C worker endpoint validates and binds but does not yet carry production render/input traffic, so unmarked candidates fall back to the backend gateway immediately.
Implementation status:
- Windows session DTOs understand optional data_plane offers and candidates
- transport selection remains behind ISessionGatewayClient
- direct worker WSS candidates are considered only when metadata contains runtime_transport=json_v1 or traffic_ready=true
- direct WSS attach attempts use short bounded timeout and never block the UI
- failed/unavailable/not-ready direct path automatically uses backend gateway
- existing backend gateway behavior remains unchanged
- no worker runtime changes are included in DP-1D
- no binary render frames or mesh/relay/VPN behavior is included
Verification:
- Windows client build succeeds
- fallback works and remains the default runtime path for current DP-1C endpoint
- direct candidate selection is capability-gated to avoid losing render/input
- lifecycle behavior remains stable
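The capability-gated selection described for DP-1D can be sketched as below. This is an illustration only: the real client is a Windows application behind ISessionGatewayClient, the try_connect callback is a hypothetical stand-in for a bounded-timeout attach attempt, and accepting both boolean and string forms of traffic_ready is an assumption about how metadata values are encoded.

```python
def is_traffic_ready(candidate):
    """A direct candidate counts only when its metadata marks it data-capable."""
    meta = candidate.get("metadata") or {}
    return (meta.get("runtime_transport") == "json_v1"
            or meta.get("traffic_ready") in (True, "true"))


def select_transport(offer, try_connect, prefer_direct=True):
    """Return the candidate to use, or None if not even a fallback was offered.

    try_connect(candidate) -> bool is caller-supplied and is expected to apply
    its own short connect timeout so the UI is never blocked.
    """
    candidates = offer.get("candidates", [])
    fallback = next((c for c in candidates if c["type"] == "backend_gateway"), None)
    if prefer_direct:
        for cand in candidates:
            if cand["type"] != "direct_worker_wss" or not is_traffic_ready(cand):
                continue  # unmarked direct candidates are skipped immediately
            if try_connect(cand):
                return cand
    return fallback
```

With the DP-1C endpoint, which advertises no readiness metadata, this logic never calls try_connect and returns the backend gateway candidate directly.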
Stage DP-1D.1: Worker Direct JSON Realtime Bridge
Status: runtime-proven on the test Docker environment as of 2026-04-25.
Worker direct WSS now carries the same JSON realtime envelopes already used by the backend gateway. This is intentionally a bridge stage, not the final production data-plane protocol.
Implementation status:
- worker direct WSS accepts existing JSON input, control, clipboard, and file_upload envelopes
- worker direct WSS emits existing JSON session.state, session.frame, session.taken_over, clipboard.text, and file_upload.progress events
- direct WSS binds only to an existing SessionRuntime; it never creates a new RDP runtime
- direct inbound envelopes are bounded and drained before Redis fallback input
- mouse move can be coalesced, but click, wheel, keyboard, clipboard, and file upload envelopes remain reliable within bounded queues
- direct render is latest-frame-only and droppable in the worker WSS writer
- direct inbound envelopes are tagged with token-bound session, attachment, user, organization, worker, and resource claims before they enter runtime
- runtime rejects direct envelopes whose attachment_id no longer matches the current active controller attachment
- takeover updates emit session.taken_over to the previous direct attachment while normal frame/state events continue only to the current attachment
- backend advertises direct metadata only when DATA_PLANE_DIRECT_WORKER_JSON_RUNTIME=true
- backend gateway fallback remains active and unchanged
- Windows client behavior remains gated by DP-1D metadata selection
Verification performed:
- backend go test ./... passes
- Windows client build passes with no routing behavior change required
- worker canonical Docker image builds with the direct JSON bridge
- DP-1C endpoint smoke still proves malformed token rejection, valid-token-without-runtime rejection, and jti replay rejection
- backend tests prove runtime_transport=json_v1 and traffic_ready=true are emitted only when the explicit runtime flag is enabled
- live runtime proof was run on test Docker 192.168.200.61 with DATA_PLANE_DIRECT_WORKER_JSON_RUNTIME=true
- backend session start returns direct_worker_wss candidate metadata: runtime_transport=json_v1 and traffic_ready=true
- Windows desktop smoke selected direct_worker_wss and connected to wss://192.168.200.61:18443/rap/v1/data-plane
- worker direct WSS validated the token and bound to the existing runtime
- direct WSS accepted input envelopes and applied mouse/keyboard events through FreeRDP
- direct WSS emitted JSON render/state events and the Windows client rendered a real desktop frame
- direct WSS carried text clipboard client-to-server through the existing clipboard envelope and worker policy/cliprdr boundary
- direct WSS carried chunked file upload through existing file_upload.start / file_upload.chunk envelopes and emitted file_upload.progress
- fallback was proven by advertising an unavailable direct worker URL; the Windows client timed out direct WSS and selected backend_gateway
- detach, reattach, takeover, session.taken_over, input, and render remained stable in direct and fallback smoke runs
- no new RDP runtime was created by direct WSS attach; worker logs showed one "started new runtime" for the session and later "updated assignment for existing session" on reattach/takeover
Known limitations after DP-1D.1:
- direct render still uses JSON/base64 full-frame payloads; binary render frames remain DP-2
- direct server-to-client clipboard was not re-matrixed in this DP proof because Stage 4.1 already proved FreeRDP cliprdr behavior; DP-1D.1 proved that the direct bridge carries clipboard envelopes and preserves worker enforcement
- file upload direct proof lands in the existing restricted worker visible transfer directory; broader file-transfer UX remains outside DP-1D.1
- the Windows smoke script reports rendering=false when compact layout hides telemetry controls, even though frame receipt/rendering is proven by logs and UIA event text
Stage DP-1E: Latency Comparison
Status: measurement-complete on the test Docker environment as of 2026-04-25.
Compare direct path vs fallback before starting DP-2 binary render frames.
Metrics:
- input capture to worker apply
- worker frame capture to client render
- frame queue length
- dropped stale frames
- close/dispose latency
- fallback activation count
Smoke commands used:
pwsh -ExecutionPolicy Bypass -File scripts/windows-smoke/desktop-smoke.ps1 `
-PreferDirectDataPlane:$true `
-AllowInsecureDirectDataPlaneTlsForSmoke:$true `
-DirectDataPlaneConnectTimeoutMs 2500 `
-SkipOrgSwitchAndTokenRefresh
pwsh -ExecutionPolicy Bypass -File scripts/windows-smoke/desktop-smoke.ps1 `
-PreferDirectDataPlane:$false `
-AllowInsecureDirectDataPlaneTlsForSmoke:$true `
-DirectDataPlaneConnectTimeoutMs 750 `
-SkipOrgSwitchAndTokenRefresh
Measured sessions:
- direct worker WSS: 59af4b37-3708-4cff-8e9d-054869946250
- backend gateway fallback baseline: 673b7540-6276-4d73-824b-e5b2ea96182a
- additional fallback-activation proof: direct candidate unavailable/not-ready logs on 8d89dd5c-fb14-4f70-a4e4-01ebb2a37da4 and 673b7540-6276-4d73-824b-e5b2ea96182a
Verification summary:
- direct smoke passed login, resource list, start, input, detach, reattach, takeover, session.taken_over, and logout
- fallback smoke passed login, resource list, start, input, detach, reattach, takeover, session.taken_over, and logout
- smoke rendering=false is a compact-layout harness artifact; session event log contained "Desktop frame received" and client logs contained "SessionWindow rendered frame"
- cleanup probe against /api/v1/sessions/active returned 404 because that endpoint does not exist; the implemented list endpoint is /api/v1/sessions?user_id=...
- Redis worker queues were empty after the measured runs: worker:queue:59af... = 0, worker:queue:673b... = 0
Latency matrix:
| Metric | Direct worker WSS | Backend gateway fallback |
|---|---|---|
| Client transport selection | selected=direct_worker_wss in desktop logs | selected=backend_gateway in desktop logs |
| Client capture/send to worker apply | direct smoke retained worker-side receive/apply timestamps; client capture timestamp was not retained in the compact smoke log | sampled fallback activation: about 205ms from WPF capture to worker apply for mouse down |
| Backend gateway input hop | bypassed for direct realtime input | backend receive to route typically <1ms |
| Worker receive to FreeRDP apply, mouse down | 0ms to 24ms observed | 0ms to 29ms observed |
| Worker receive to FreeRDP apply, mouse up | 25ms to 26ms observed | 28ms to 29ms observed |
| Worker receive to FreeRDP apply, key down | about 25ms observed | about 33ms observed |
| Worker receive to FreeRDP apply, key up | about 26ms observed | about 30ms observed |
| Backend route to worker receive | not applicable | about 31ms for key down, about 0ms to 31ms for sampled mouse/key events |
| FreeRDP apply to next captured frame | 0ms to 40ms observed | 0ms to 43ms observed |
| Worker frame capture to backend receive | backend still receives worker frame telemetry; observed same-second receive | same-second receive |
| Backend frame receive to client write | not on direct render path | sampled 486ms and 753ms on full-frame JSON/base64 gateway writes |
| Client render proof | session.frame received and frame rendered in direct smoke | SessionWindow rendered frame seq=19 size=1280x720 |
| SessionGatewayConnection dispose | about 1ms in sampled close traces | about 1ms in sampled close traces |
| SessionWindow closed handler | below 1ms in sampled close traces | below 1ms in sampled close traces |
Queue and backpressure observations:
- direct inbound drained bounded batches before fallback Redis input
- direct mouse move coalescing was active while preserving click/key ordering
- direct outbound reported frames_queued_per_second matching frames_sent_per_second; reliable_dropped=0
- worker render pending remained 0 for both paths
- fallback Redis append queue length stayed bounded in sampled logs, usually 1 to 3, and returned to 0 after the run
Render observations:
- direct render is already latest-frame-only/droppable at the worker WSS writer
- worker render rates during interaction were approximately:
  - direct: ~3.0 to ~5.7 frames/sec sent/published, pending 0
  - fallback: ~2.0 to ~5.0 frames/sec published, pending 0
- current frames are still JSON/base64 full-frame payloads
- measured frame payload size remains about 4,915,200 bytes per JSON/base64 frame, so DP-1D.1 improves routing but does not remove the render payload bottleneck
Fallback activation proof:
- fallback was explicitly selected when the client was configured with PreferDirectDataPlane=false
- fallback was also visible when direct WSS was unavailable or not runtime ready, with client logs:
  - data_plane.transport direct_worker_wss failed; falling back to backend_gateway
  - data_plane.transport direct_worker_wss unavailable_or_not_runtime_ready; using backend_gateway
  - data_plane.transport selected=backend_gateway
DP-1E conclusion:
- direct worker WSS removes backend/Redis from the realtime input path
- fallback backend gateway remains functional and observable
- neither path showed unbounded input queue growth during smoke
- close/dispose traces remained fast in sampled logs
- the dominant remaining bottleneck is render payload format and size, not worker input scheduling
- DP-2 should focus on binary render frames and avoiding base64/JSON render payloads on the direct data plane
Stage DP-2: Binary Render Frames
Status: implemented and smoke-proven on the test Docker environment as of 2026-04-25.
Direct worker WSS now sends render payloads as binary WebSocket frames when the backend candidate metadata advertises render_transport=binary_v1 and the Windows client requests that transport. Backend gateway fallback continues to use the existing JSON/base64 frame path.
Goals:
- remove base64 overhead from the direct worker WSS wire path
- reduce direct render payload size
- keep backend gateway JSON/base64 fallback intact
- keep direct render latest-frame-only and droppable
- keep input/control ahead of render
Implementation notes:
- Backend advertises binary direct render only when DATA_PLANE_DIRECT_WORKER_BINARY_RENDER=true.
- Direct candidate metadata includes runtime_transport=json_v1, traffic_ready=true, and render_transport=binary_v1.
- Worker direct WSS accepts existing JSON envelopes for control/input/clipboard/file_upload and emits binary WebSocket frames for session.frame.
- Windows client enables binary parsing only for direct candidates that advertise render_transport=binary_v1 or binary_render=true.
- Backend gateway fallback remains unchanged and continues to deliver session.frame as JSON/base64.
Smoke proof:
- direct session id: 824c0057-c8a0-4366-b5c2-805597ae2d61
- fallback session id: 28e4b198-2c27-4971-951a-7b187c11f96d
- direct client selected direct_worker_wss with render_transport=binary_v1
- direct worker bind succeeded with render_transport=binary_v1
- client received binary frames with raw payload size 3,686,400 bytes
- client rendered binary frames, including frame sequences 1, 2, 4, 7, 9, 12, 14, 15, 17, 18, and 19
- fallback client selected backend_gateway
- fallback rendered JSON/base64 frames through the existing backend gateway path
Payload comparison:
- DP-1E JSON/base64 frame payload: about 4,915,200 bytes for 1280x720 BGRA
- DP-2 direct binary frame payload: 3,686,400 bytes for the same 1280x720 BGRA frame, plus a small binary preamble and JSON header
- Direct wire payload reduction is about 25 percent compared with base64.
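The payload figures follow directly from the frame geometry and the base64 expansion ratio. The short calculation below reproduces them; the helper names are illustrative, and the only assumption is 4 bytes per pixel for BGRA with no row padding.

```python
import math


def bgra_frame_bytes(width, height):
    """Raw size of an unpadded BGRA32 frame: 4 bytes per pixel."""
    return width * height * 4


def base64_len(n):
    """Length of the padded base64 encoding of n bytes: 4 output chars per 3 input bytes."""
    return 4 * math.ceil(n / 3)


raw = bgra_frame_bytes(1280, 720)   # 3,686,400 bytes on the binary path
b64 = base64_len(raw)               # 4,915,200 bytes on the JSON/base64 path
reduction = (b64 - raw) / b64       # 0.25, i.e. about 25 percent fewer payload bytes
```

The 25 percent figure is the inverse view of base64's familiar one-third size increase: encoding 3 raw bytes as 4 characters means the raw form is three quarters of the encoded form.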
Latency and queue observations from smoke:
- direct click frame render sample: worker captured frame at 1777141091937, WPF rendered it at 2026-04-25T21:18:11.6628382+03:00, about 226 ms later
- direct key-down frame render sample: worker captured frame at 1777141093434, WPF rendered it at 2026-04-25T21:18:13.1614990+03:00, about 727 ms later
- direct worker render rate sample: seen_per_second=4.953283, published_per_second=3.962626, dropped_per_second=0.990657, pending=0
- direct data-plane outbound sample: frames_queued_per_second=5.404927, frames_sent_per_second=5.404927, binary_render_bytes_per_second=19926577.806299, json_render_bytes_per_second=0.000000, reliable_dropped=0
- fallback worker render rate sample: seen_per_second=4.871576, published_per_second=3.897260, dropped_per_second=0.974315, pending=0
Known limitations:
- DP-2.1 removed the internal base64 encode/decode hop from the direct render path. The direct worker WSS sender now receives raw captured frame bytes and writes them into RAP2 binary frames without decoding a compatibility session_frame.
- The worker still builds compatibility session_frame events with base64 for backend gateway/live-state fallback. That compatibility conversion is intentionally isolated to the fallback boundary and is not used by the direct binary render sink.
- Backend still receives compatibility worker frame events for fallback/debug. Binary render frames are not routed through Redis or backend gateway.
- At the DP-2.1 point, dirty regions, tile encoding, adaptive quality, compression/codecs, and color-mode reduction remained later work.
- Smoke rendering=false remains a compact-layout harness artifact; UIA output and client logs prove "Desktop frame received" and "SessionWindow rendered frame".
Stage DP-2.1: Worker Raw-Frame Split
Status: implemented and smoke-proven on the test Docker environment as of 2026-04-25.
DP-2.1 keeps the DP-2 RAP2 binary frame contract and removes the remaining worker-internal base64 encode/decode hop from the direct render path.
Implementation notes:
- FreeRDP frame capture now produces raw BGRA frame bytes for worker runtime render notifications.
- SessionRuntime splits render publication into two outputs:
  - direct binary render sink receives raw frame bytes
  - compatibility fallback sink builds JSON/base64 session_frame only for backend gateway/live-state fallback
- Worker direct WSS sends raw captured frame bytes as RAP2 binary WebSocket frames when render_transport=binary_v1 is active.
- Backend gateway fallback remains unchanged and still receives JSON/base64 session.frame compatibility events.
- Direct render remains latest-frame-only and droppable; input/control scheduling is unchanged.
Smoke proof:
- direct session id: b4720057-db61-4c72-bb4c-bccfd7e30008
- fallback session id: 65d0667b-aaef-4042-ae30-4c34d151e5aa
- direct client selected direct_worker_wss with render_transport=binary_v1
- fallback client selected backend_gateway
- direct client received binary_frame_received payloads of 3,686,400 bytes for 1280x720 BGRA
- direct client rendered frame sequences including 1, 2, 4, 7, 9, 13, 14, 15, 16, 18, 19, and 20
- fallback client rendered JSON/base64 session.frame through backend gateway
- worker logs show raw_frame_bytes=3686400, binary_direct_bytes=3686400, base64_compat_bytes=4915200, encode_skipped_for_direct=true, and fallback_compat_frame_built=true
- worker direct outbound logs show binary_render_bytes_per_second non-zero and json_render_bytes_per_second=0.000000
Payload and conversion proof:
- direct raw frame payload remains 3,686,400 bytes plus the small RAP2 preamble/header
- fallback compatibility payload remains about 4,915,200 base64 bytes for the same frame
- direct render no longer decodes frame_data from compatibility base64 before binary send
- base64 is still generated for fallback/debug because the backend gateway path intentionally remains JSON/base64
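The two payload sizes above follow directly from the frame geometry. The sketch below is only an arithmetic check, not worker code: it assumes 4 bytes per pixel for BGRA32 and standard padded base64, and the helper name base64_length is illustrative.

```python
import base64

WIDTH, HEIGHT, BYTES_PER_PIXEL = 1280, 720, 4  # BGRA32, as in the smoke runs

raw_frame_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL


def base64_length(n: int) -> int:
    """Length of padded base64 text for n input bytes (4 chars per 3 bytes)."""
    return 4 * ((n + 2) // 3)


base64_compat_bytes = base64_length(raw_frame_bytes)
overhead_percent = 100.0 * (base64_compat_bytes - raw_frame_bytes) / raw_frame_bytes

# Cross-check the formula against the standard library on a small buffer.
assert len(base64.b64encode(bytes(300))) == base64_length(300)

print(raw_frame_bytes, base64_compat_bytes, round(overhead_percent, 2))
```

The roughly one-third overhead is inherent to base64, which is why the direct path skips the encode step while the fallback path keeps it.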
Known limitations:
- DP-2.1 is an internal worker render plumbing cleanup only.
- Full-frame BGRA payloads are still heavy.
- At the DP-2.1 point, dirty regions, tiles, adaptive quality, compression/codecs, and color-mode reduction remained future work.
- Backend gateway fallback remains JSON/base64 by design.
Stage DP-3A: Grayscale Full-Frame Binary Render
Status: implemented and smoke-proven on the test Docker environment as of 2026-04-25.
DP-3A adds the first conservative quality foundation for the direct binary render path. DP-3A itself did not implement tiles, compression, codecs, or adaptive profile switching. Dirty-region direct binary rendering is handled by the later RDP Adapter RDP-Perf-6 path.
Contract changes:
- backend direct binary render candidates advertise render_transport=binary_v1
- backend direct binary render candidates advertise supported_color_modes=["full_color","grayscale"]
- backend direct binary render candidates advertise default_color_mode="full_color"
- Windows client requests full_color by default
- Windows smoke can request grayscale through -DirectDataPlaneColorMode grayscale
- RAP2 binary frame headers carry color_mode, quality_profile, original_frame_format, output_frame_format, raw_frame_bytes, and binary_direct_bytes
Implementation notes:
- full_color direct render sends the existing raw BGRA frame unchanged.
- grayscale direct render converts BGRA bytes in the worker direct binary sink before WSS send.
- The grayscale path preserves BGRA32 output format so the Windows presenter can reuse the existing render path.
- Backend gateway fallback remains JSON/base64 and is not affected by direct grayscale mode.
- Direct render remains latest-frame-only and droppable.
- Input/control scheduling is unchanged and remains higher priority than render.
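A grayscale conversion that keeps BGRA32 output can be sketched as follows. This is an illustration only, not the worker implementation: the function name is hypothetical, and the integer BT.601-style luma weights are an assumption, since this document does not state which weights the worker uses.

```python
def bgra_to_grayscale_bgra(frame: bytes) -> bytes:
    """Return a BGRA32 buffer whose B, G and R channels all hold the pixel luma."""
    if len(frame) % 4 != 0:
        raise ValueError("BGRA32 buffer length must be a multiple of 4")
    out = bytearray(frame)  # same length as the input; alpha is copied through
    for i in range(0, len(out), 4):
        b, g, r = out[i], out[i + 1], out[i + 2]
        # Assumed weights: integer approximation of 0.114 B + 0.587 G + 0.299 R.
        # The weights sum to 256, so the shift keeps the result within 0..255.
        y = (29 * b + 150 * g + 77 * r) >> 8
        out[i] = out[i + 1] = out[i + 2] = y
    return bytes(out)


# One opaque red pixel followed by one opaque white pixel, in B, G, R, A order.
sample = bytes([0, 0, 255, 255, 255, 255, 255, 255])
converted = bgra_to_grayscale_bgra(sample)
print(len(sample), len(converted), list(converted))
```

Because the output keeps four bytes per pixel, the buffer length is unchanged, which matches the raw_frame_bytes_before and raw_frame_bytes_after values being equal in the smoke logs.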
Smoke proof:
- direct full-color session id: 74a0e5c6-02e0-487f-a1a1-c2850a13881c
- direct grayscale session id: 3b616bd7-1179-4ec5-879f-7cd270f92a0a
- fallback backend-gateway session id: e5724cac-7f09-4931-9ad9-156a3f33d0b1
- direct full-color client selected direct_worker_wss with render_transport=binary_v1, requested_color_mode=full_color, and applied_color_mode=full_color
- direct grayscale client selected direct_worker_wss with render_transport=binary_v1, requested_color_mode=grayscale, and applied_color_mode=grayscale
- fallback smoke selected backend_gateway and continued to render JSON/base64 session.frame events
- direct full-color frames rendered with color_mode=full_color and bytes=3686400
- direct grayscale frames rendered with color_mode=grayscale and bytes=3686400
- worker logs show grayscale_conversion_applied=false for full color and grayscale_conversion_applied=true for grayscale
- worker logs show raw_frame_bytes_before=3686400, raw_frame_bytes_after=3686400, and binary_direct_bytes=3686400
- worker grayscale conversion time was observed around 1-2 ms per sampled 1280x720 BGRA frame
- worker direct outbound logs show binary render traffic and json_render_bytes_per_second=0.000000 on the direct binary path
Verification commands:
pwsh -ExecutionPolicy Bypass -File scripts/windows-smoke/desktop-smoke.ps1 -PreferDirectDataPlane:$true -AllowInsecureDirectDataPlaneTlsForSmoke:$true -DirectDataPlaneConnectTimeoutMs 2500 -DirectDataPlaneColorMode full_color -SkipOrgSwitchAndTokenRefresh
pwsh -ExecutionPolicy Bypass -File scripts/windows-smoke/desktop-smoke.ps1 -PreferDirectDataPlane:$true -AllowInsecureDirectDataPlaneTlsForSmoke:$true -DirectDataPlaneConnectTimeoutMs 2500 -DirectDataPlaneColorMode grayscale -SkipOrgSwitchAndTokenRefresh
pwsh -ExecutionPolicy Bypass -File scripts/windows-smoke/desktop-smoke.ps1 -PreferDirectDataPlane:$false -AllowInsecureDirectDataPlaneTlsForSmoke:$true -DirectDataPlaneConnectTimeoutMs 2500 -DirectDataPlaneColorMode grayscale -SkipOrgSwitchAndTokenRefresh
Known limitations after DP-3A:
- grayscale currently reduces color fidelity but not wire byte size because the output format remains BGRA32.
- 256_colors, 64_colors, 16_colors, and palette modes are not implemented.
- Tiles, compression/codecs, and adaptive profile switching remain future work.
- Backend gateway fallback remains JSON/base64 by design.
- Smoke rendering=false remains a compact-layout harness artifact in some runs; client logs prove Desktop frame received and SessionWindow rendered frame.
RDP-Perf-6: Direct Dirty-Region Binary Render Contract
Status: implemented and build/probe/live-smoke-proven on the test Docker
environment as of 2026-04-26 using P3.3 Secret RDP Resource, direct worker
WSS, and rap-rdp-worker:rdp-perf6-dirty-region.
RDP-Perf-6 keeps the existing RAP2 binary WebSocket transport and adds
explicit direct render message types:
- render.frame.full
- render.frame.region
Compatibility:
- Windows client direct transport still accepts legacy binary message_type=session.frame.
- Inside the Windows application pipeline, direct binary frames are normalized back into the existing session.frame envelope so UI, lifecycle, input, clipboard, and file transfer behavior remain unchanged.
- Backend gateway fallback remains JSON/base64 and is not removed.
Dirty-region frame metadata:
- frame_width, frame_height, frame_stride, frame_format
- desktop_width, desktop_height
- region_x, region_y, region_width, region_height
- region_stride, region_format=BGRA32
- payload_length
- input_correlation_id and worker_frame_captured_at when available
Diagnostics added for payload and latency analysis:
- full_frame_sent
- region_frame_sent
- full_frame_bytes
- region_bytes
- region_savings_percent
- diff_time_ms
- render_update_reason
- fallback_to_full_frame_reason
Implementation notes:
- Worker direct WSS emits render.frame.full for baseline/recovery frames and render.frame.region for dirty-region patches.
- Worker direct render logs include payload savings and diff/capture timing.
- Windows direct transport accepts the explicit render message types.
- Windows DesktopFramePresenter maintains a session framebuffer and patches BGRA32 region payloads into it before presenting the updated surface.
- Full-frame fallback remains available for first frame, attach/reattach, resize, region-loss repair, and debug/fallback paths.
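Patching a region payload into a session framebuffer can be sketched as a row-by-row copy. This is a minimal illustration under assumptions, not the DesktopFramePresenter code: the parameter names mirror the dirty-region metadata fields listed earlier, while the function name and the bounds checks are hypothetical.

```python
def patch_region(framebuffer: bytearray, frame_stride: int, region: bytes,
                 region_x: int, region_y: int, region_width: int,
                 region_height: int, region_stride: int) -> None:
    """Copy a BGRA32 region payload into the framebuffer in place."""
    bpp = 4  # BGRA32
    row_bytes = region_width * bpp
    if region_width <= 0 or region_height <= 0 or region_x < 0 or region_y < 0:
        raise ValueError("invalid region geometry")
    if region_stride < row_bytes:
        raise ValueError("region_stride is smaller than one region row")
    if len(region) < region_stride * (region_height - 1) + row_bytes:
        raise ValueError("region payload is shorter than its metadata claims")
    if (region_x + region_width) * bpp > frame_stride:
        raise ValueError("region exceeds the framebuffer width")
    if (region_y + region_height) * frame_stride > len(framebuffer):
        raise ValueError("region exceeds the framebuffer height")
    for row in range(region_height):
        src = row * region_stride
        dst = (region_y + row) * frame_stride + region_x * bpp
        framebuffer[dst:dst + row_bytes] = region[src:src + row_bytes]


# 4x4 black framebuffer, patched with a 2x2 all-0xFF region at (1, 1).
fb = bytearray(4 * 4 * 4)
patch_region(fb, 16, b"\xff" * 16, 1, 1, 2, 2, 8)
print(fb[20:28].hex(), sum(fb))
```

A payload that fails these checks is the kind of case where a receiver would likely need a full-frame repair rather than a partial patch.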
Observed runtime proof:
- Direct transport selected direct_worker_wss with render_transport=binary_v1.
- Baseline frame used render.frame.full, 1280x720, 3,686,400 bytes.
- Dirty-region examples used render.frame.region: 64x64 = 16,384 bytes (99.56% savings), 1280x128 = 655,360 bytes (82.22% savings), and 640x64 = 163,840 bytes (95.56% savings).
- Direct-only binary region frames logged fallback_compat_frame_built=false while backend gateway fallback compatibility remained available separately.
- Input, detach, reattach, takeover, and takeover event handling remained smoke-proven in the same run.
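The savings figures above can be reproduced from the region geometry alone. The sketch assumes savings are measured against the 1280x720 BGRA32 full frame and ignores the RAP2 header bytes; the helper name is illustrative and is not the worker's diagnostic code.

```python
def region_savings(region_width: int, region_height: int,
                   frame_width: int = 1280, frame_height: int = 720,
                   bytes_per_pixel: int = 4):
    """Return (region_bytes, savings_percent) relative to a full BGRA32 frame."""
    full_frame_bytes = frame_width * frame_height * bytes_per_pixel
    region_bytes = region_width * region_height * bytes_per_pixel
    savings = 100.0 * (1.0 - region_bytes / full_frame_bytes)
    return region_bytes, round(savings, 2)


for width, height in [(64, 64), (1280, 128), (640, 64)]:
    print(f"{width}x{height}", *region_savings(width, height))
```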
Stage DP-3B: Adaptive Quality
Implement quality profiles and adaptive render behavior.
Goals:
- lower latency under load
- bounded queues
- real profile behavior
- color mode adaptation
12. Risks
Token Leakage
Risk:
- direct worker token could be reused.
Mitigation:
- short TTL
- jti / nonce
- worker-scoped audience
- attachment/session binding
- TLS
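How these mitigations combine on the worker side can be sketched as a claims check that runs after signature verification. Everything here is an assumption for illustration: the claim names (aud, exp, iat, jti, session_id, attachment_id), the 60-second TTL ceiling, the reason strings, and the class itself are hypothetical and are not taken from the actual token contract.

```python
import time
from typing import Dict, Optional, Tuple


class DirectTokenValidator:
    """Checks already-signature-verified claims for a direct worker connection."""

    def __init__(self, worker_id: str, max_ttl_seconds: int = 60) -> None:
        self.worker_id = worker_id
        self.max_ttl_seconds = max_ttl_seconds
        self._seen_jti: Dict[str, float] = {}  # jti -> expiry, for replay checks

    def validate(self, claims: dict, session_id: str, attachment_id: str,
                 now: Optional[float] = None) -> Tuple[bool, str]:
        now = time.time() if now is None else now
        # Forget jti values whose tokens have expired anyway.
        self._seen_jti = {j: e for j, e in self._seen_jti.items() if e > now}
        if claims.get("aud") != self.worker_id:
            return False, "wrong_worker"
        exp, iat = claims.get("exp"), claims.get("iat")
        if not isinstance(exp, (int, float)) or not isinstance(iat, (int, float)):
            return False, "missing_time_claims"
        if now >= exp:
            return False, "expired"
        if exp - iat > self.max_ttl_seconds:
            return False, "ttl_too_long"
        if (claims.get("session_id") != session_id
                or claims.get("attachment_id") != attachment_id):
            return False, "binding_mismatch"
        jti = claims.get("jti")
        if not jti:
            return False, "missing_jti"
        if jti in self._seen_jti:
            return False, "replayed"
        self._seen_jti[jti] = exp
        return True, "ok"
```

Recording the jti only after every other check passes keeps rejected attempts from consuming a legitimate token's single use.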
Worker Endpoint Exposure
Risk:
- worker direct endpoint becomes an attack surface.
Mitigation:
- token validation before bind
- rate limits
- TLS
- no unauthenticated session enumeration
- minimal endpoint surface
Policy Drift
Risk:
- backend and worker disagree on allowed channels.
Mitigation:
- token claims include allowed channels
- worker receives policy snapshot in assignment
- worker enforces policy again
- policy changes trigger session update or reconnect where required
Fallback Masking Production Problems
Risk:
- clients silently fall back and hide direct data-plane failure.
Mitigation:
- log fallback reason
- expose telemetry
- smoke tests verify both direct and fallback paths
Render Still Too Heavy
Risk:
- direct WSS improves routing but full-frame render remains expensive.
Mitigation:
- DP-2 binary frames
- DP-3 adaptive quality
- dirty regions / tiles
- latest-frame-only semantics
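Latest-frame-only semantics can be sketched as a single-slot mailbox in which a newer frame overwrites an unsent older one. This is a minimal illustration, not the worker's render sink; the class name and the dropped counter are hypothetical.

```python
import threading
from typing import Optional, Tuple


class LatestFrameSlot:
    """Holds at most one pending render frame; older unsent frames are dropped."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._frame: Optional[Tuple[int, bytes]] = None
        self.dropped = 0

    def publish(self, sequence: int, payload: bytes) -> None:
        with self._lock:
            if self._frame is not None:
                self.dropped += 1  # the stale frame is replaced, never queued
            self._frame = (sequence, payload)

    def take(self) -> Optional[Tuple[int, bytes]]:
        with self._lock:
            frame, self._frame = self._frame, None
            return frame


slot = LatestFrameSlot()
for seq, data in [(1, b"a"), (2, b"b"), (3, b"c")]:
    slot.publish(seq, data)
latest = slot.take()
print(latest, slot.dropped)
```

With a slot like this the pending render backlog is bounded at one frame, so a slow sender produces gaps in the frame sequence rather than growing latency.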
File Upload Starving Input
Risk:
- reliable file chunks can fill send queues.
Mitigation:
- channel priority
- bounded file queues
- chunk pacing
- input preemption
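Channel priority with a bounded file queue can be sketched as follows. This is a simplified illustration under assumptions: the class, the method names, and the queue limit are hypothetical, and chunk pacing is not modeled.

```python
from collections import deque
from typing import Deque, Optional, Tuple


class OutboundScheduler:
    """Always drains input before file chunks; the file queue is bounded."""

    def __init__(self, max_file_chunks: int = 8) -> None:
        self.input_queue: Deque[bytes] = deque()
        self.file_queue: Deque[bytes] = deque()
        self.max_file_chunks = max_file_chunks

    def enqueue_input(self, message: bytes) -> None:
        self.input_queue.append(message)

    def enqueue_file_chunk(self, chunk: bytes) -> bool:
        # Refusing the chunk signals backpressure to the uploader.
        if len(self.file_queue) >= self.max_file_chunks:
            return False
        self.file_queue.append(chunk)
        return True

    def next_message(self) -> Optional[Tuple[str, bytes]]:
        if self.input_queue:  # input preempts any pending file chunks
            return "input", self.input_queue.popleft()
        if self.file_queue:
            return "file", self.file_queue.popleft()
        return None


scheduler = OutboundScheduler(max_file_chunks=2)
accepted = [scheduler.enqueue_file_chunk(b"chunk") for _ in range(3)]
scheduler.enqueue_input(b"key-down")
first = scheduler.next_message()
print(accepted, first)
```

Because the sender asks the scheduler for one message at a time, an input event waits behind at most the single file chunk already being written.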
13. Future Verification Plan
Future DP-1 implementation must prove:
- backend gateway fallback still works
- direct worker WSS connects
- token validation works
- invalid/expired/wrong-worker tokens are rejected
- direct WSS binds to existing session runtime
- direct WSS does not recreate remote RDP session
- input works over direct WSS
- rendering works over direct WSS
- clipboard still works
- file upload still works
- fallback activates if direct worker path is unavailable
- input latency improves compared with fallback
- render backlog does not grow
- stale render frames are dropped
- close/dispose is immediate
- org/session/attachment/channel scope is enforced
14. Next Implementation Prompt
Data-plane and RDP work are paused by product decision.
DP-3B, Stage 5.2 remaining RDP desktop proof, and further RDP performance work must not start without a new explicit RDP/data-plane stage prompt.
The next active project work is Stage C10 in the lower Secure Access Fabric foundation:
Proceed with Stage C10 only.
Goal:
Consolidate Fabric Core architecture and prepare scoped cluster configuration
distribution design.
Scope:
- define signed scoped cluster snapshot model
- define node-local state boundaries
- define peer directory/cache boundaries
- define Fabric Storage / Config Storage role
- define source-of-truth vs distribution/cache boundaries
- define multi-cluster isolation boundaries
- define future implementation stages C11-C18
Do NOT:
- implement mesh runtime
- implement VPN
- implement RDP work
- implement service workloads
- change backend/runtime code
15. Non-Goals
DP-1 does not implement:
- full mesh
- VPN
- QUIC
- UDP transport
- WebRTC
- relay nodes
- multi-cluster routing
- adaptive quality beyond DP-3A grayscale full-frame foundation
- binary render frames for fallback backend gateway
- adaptive profile switching beyond DP-3A and dirty regions
- removal of current backend WebSocket gateway
- RDP MVP rewrite