# RDP Service C++ Performance Target

Paused/archival note: this document is an RDP performance track record, not the current source of truth for node-to-node transport. Fabric transport is now QUIC-only between nodes; use `docs/architecture/DISTRIBUTED_FABRIC_NODE_PROTOCOL_PLAN.md`, `docs/architecture/FABRIC_FIRST_TRANSPORT_AND_STRESS_PLAN.md`, and `docs/architecture/SECURE_ACCESS_FABRIC_TARGET.md` for the active transport model.

## Status

This is the paused RDP service performance direction.

The implementation name is `RDP Adapter`: a concrete `Service Adapter` that translates Microsoft RDP into the platform session/data-plane protocol. The common adapter contract is defined in `docs/architecture/SERVICE_ADAPTER_PROTOCOL.md`; the RDP-specific runtime plan is defined in `docs/architecture/RDP_ADAPTER_RUNTIME.md`.

Current implementation status:

- RDP-A1 / RDP-Perf-1 is build-proven.
- RDP-A2 adapter boundary is live-smoke-proven on the test Docker environment as of 2026-04-26: runtime code now goes through `RdpAdapterRuntime`.
- RDP-Perf-2 runtime instrumentation is build-proven and live-smoke-proven on the test Docker environment as of 2026-04-26.
- RDP-Perf-3 region-first BGRA fallback is build-proven and live-smoke-proven on the test Docker environment as of 2026-04-26.
- RDP-Perf-4 gated RDPGFX foundation is build-proven and default-path smoke-proven on the test Docker environment as of 2026-04-26. The current live RDP target resets the connection when RDPGFX is advertised, so RDPGFX remains disabled by default.
- RDP-A4 CursorAdapter is build-proven and live-smoke-proven on the test Docker environment as of 2026-04-26. Cursor events now flow as latest-only adapter-origin `cursor.update` events over direct worker WSS and remain compatible with backend gateway fallback.
- RDP-Perf-5A GDI repaint cadence hardening is build-proven and smoke-proven on the test Docker environment as of 2026-04-26.
  Region/interactive frames now publish on a 33 ms cadence, hot-loop lease renewal was removed, and backend gateway fallback remains compatible.
- RDP-Perf-6 dirty-region direct binary contract is build/probe/live-smoke-proven on the test Docker environment as of 2026-04-26. Direct `RAP2` render frames now distinguish full frames from dirty-region patches and carry payload savings diagnostics; observed runtime dirty regions reduced payloads from the `3,686,400` byte full frame to examples such as `16,384`, `163,840`, and `655,360` bytes.
- Current accepted baseline is `rap-rdp-worker:rdp-p1-region-order2`: dirty-region delivery is preserved in order through `SessionRuntime`, worker direct WSS, Windows transport, and WPF presenter queues. Manual visual smoke accepted idle repaint, Start menu/hover, keyboard, mouse, and session close.
- Remaining visual limitation is quality/performance rather than correctness: window drag behaves like older/slow-link RDP clients by moving a frame, and repaint after release is usable but not polished.
- FreeRDP remains the internal substrate behind the adapter boundary until region-first/event-driven replacement paths are live-proven.
- RDP performance work is paused by product decision. When RDP work explicitly resumes, the next RDP step should continue from the stable GDI region-first path unless an RDPGFX-compatible target is added for gated testing.

The C++ worker remains the primary RDP runtime. The goal is not to rewrite the worker in another language. The goal is to replace the slow parts of the RDP service internals while preserving the proven backend/session/cluster/data-plane contracts.

The C# RDP service skeleton is superseded as a runtime direction and must not be used for implementation unless explicitly re-approved later.
## Current Problem

The current MVP proved the hard lifecycle behavior:

- connect
- active state
- detach without killing the remote session
- reattach
- takeover
- terminate
- clipboard text
- file upload to worker storage
- direct worker WSS data-plane

However, the render/input experience is not acceptable.

Root cause:

- the worker uses FreeRDP successfully for the RDP connection
- but the production render path still behaves like framebuffer capture
- the worker copies large BGRA buffers and publishes them as RAP frames
- input is fast enough in parts of the path, but visual feedback depends on slow snapshot/frame delivery

On a >1 Gbit LAN this should not be slow. The bottleneck is the RDP service render algorithm, not the network.

## Non-Negotiable Boundaries

Do not change:

- backend control plane
- organization/session lifecycle
- PostgreSQL source of truth
- Redis live coordination model
- worker leases and assignment contracts
- data_plane_token contracts
- direct worker WSS transport
- backend gateway fallback
- clipboard/file-transfer policy semantics

Only the RDP service adapter internals may change.

## Target Design

Keep the worker in C++.

Use C++ to own the RDP service internals:

- input adapter
- graphics adapter
- cursor adapter
- virtual channel adapters
- quality/adaptive controller
- render sink to existing RAP data-plane

FreeRDP may remain temporarily as a connection/security/channel substrate, but the target production render path must not be FreeRDP GDI framebuffer snapshots.

If a FreeRDP layer blocks access to the needed RDP graphics primitives, replace that narrow layer with project-owned C++ code rather than rewriting the full service in another language.

## High-Performance RDP Model

Fast RDP clients do not repeatedly send full desktop images.
They use protocol updates:

- dirty rectangles
- surface commands
- cursor updates
- bitmap/cache updates
- RDPGFX dynamic virtual channel
- RemoteFX Progressive / ClearCodec / H.264 AVC420 / AVC444 / HEVC where negotiated
- adaptive graphics and quality selection

References:

- https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-rdpegfx/da5c75f9-cd99-450c-98c4-014a496942b0
- https://learn.microsoft.com/en-us/azure/virtual-desktop/graphics-encoding
- https://freerdp-freerdp.mintlify.app/concepts/codecs

## New Internal Layers

```mermaid
flowchart LR
  Target["Windows RDP Server"]
  RdpCore["C++ RDP Core / FreeRDP Substrate"]
  Graphics["Graphics Adapter"]
  Input["Input Adapter"]
  Channels["Virtual Channel Adapters"]
  DataPlane["Existing Direct Worker WSS"]
  Client["RAP Windows Client"]

  Target <--> RdpCore
  RdpCore --> Graphics
  Input --> RdpCore
  RdpCore <--> Channels
  Graphics --> DataPlane
  Channels --> DataPlane
  DataPlane <--> Client
```

### Graphics Adapter

The graphics adapter converts RDP graphics primitives into RAP render updates.

Supported update classes:

- `frame_full_bgra` only for baseline/debug/fallback
- `region_bgra` for dirty regions
- `surface_create`
- `surface_delete`
- `surface_map`
- `surface_bits`
- `encoded_frame`
- `cursor_update`

Rules:

- full-frame BGRA is fallback, not the target production path
- direct render remains binary
- backend gateway fallback may keep JSON/base64 compatibility
- stale render updates are droppable
- input never waits behind render

### Input Adapter

Input stays separate from render.

Rules:

- keyboard down/up is reliable and ordered
- mouse button down/up and wheel are reliable and ordered
- mouse move is latest-only/coalesced
- button down must include or be preceded by pointer position
- no RAP focus message may consume the first remote click
- input must not trigger full-frame capture loops

### Virtual Channel Adapters

Clipboard/file/drive redirection remain isolated:

- clipboard stays text-only until explicitly expanded
- restricted drive mapping remains policy-bound
- file upload/download policies stay enforced in the real data path

## Weak Network Strategy

Weak-channel performance must degrade render before input.

Priority order:

1. input
2. control
3. clipboard
4. render key updates
5. file transfer
6. telemetry

Render adaptation:

- drop stale render updates
- prefer dirty regions over full frames
- reduce FPS before increasing input latency
- reduce color mode where useful
- use text-priority mode for office/admin workloads
- use encoded/compressed graphics payloads where negotiated
- never let file transfer or VPN-like bulk traffic starve RDP input/control

Quality profiles:

- `emergency_grayscale`
- `low_bandwidth`
- `text_priority`
- `balanced`
- `high_quality`

Color modes:

- full color
- 256 colors
- 64 colors
- 16 colors
- grayscale

## Migration Stages

### RDP-A1 / RDP-Perf-1: Boundary And Audit

Create C++ graphics/input adapter boundaries and document the replacement path. Do not change runtime behavior yet.

Deliver:

- common `Service Adapter` channel contract
- RDP Adapter runtime plan
- `graphics_adapter` interface
- render update model
- compile-safe probe
- docs update

Status: completed and build-proven.

### RDP-Perf-2: Runtime Instrumentation And Source Selection

Measure existing FreeRDP update callbacks separately from frame publishing.

Deliver:

- update callback rate
- dirty region dimensions
- framebuffer copy time
- binary send time
- client render time
- first-click trace without RAP focus interference

Status: completed and live-smoke-proven on the test Docker environment as of 2026-04-26.

Smoke command:

```powershell
pwsh -ExecutionPolicy Bypass -File scripts/windows-smoke/desktop-smoke.ps1 `
  -PreferDirectDataPlane:$true `
  -AllowInsecureDirectDataPlaneTlsForSmoke:$true `
  -DirectDataPlaneConnectTimeoutMs 2500 `
  -DirectDataPlaneColorMode full_color `
  -SkipOrgSwitchAndTokenRefresh
```

Smoke evidence:

- worker image: `rap-rdp-worker:rdp-perf2-instrumented`
- session id: `1328b0dd-c5f9-4b15-b2ca-6d196ead5823`
- direct data plane selected by the Windows client
- login/resource/start/input/detach/attach/takeover/taken_over/logout passed
- one RDP runtime was created for the session
- artifacts:
  - `artifacts/rdp-perf2-worker-final.log`
  - `artifacts/rdp-perf2-client-final.log`
  - `artifacts/rdp-perf2-report.md`

Measured callback sources:

| Source | Count / behavior |
| --- | --- |
| `BeginPaint` | observed |
| `EndPaint` | observed |
| `BitmapUpdate` | observed and produced dirty region information |
| `RefreshRect` | not observed in smoke |
| `SurfaceBits` | not observed in smoke |
| `SurfaceFrameMarker` | not observed in smoke |
| `SurfaceFrameBits` | not observed in smoke |
| pointer callbacks | not observed in smoke |

Measured conclusions:

- The RDP server/FreeRDP path does emit server-origin graphics callbacks in stable GDI mode.
- Idle or server-origin screen changes can be detected without relying on local mouse/keyboard activity.
- Full framebuffer copy time is not the main bottleneck in the measured smoke run.
- The current render path duplicates work by capturing around both `BitmapUpdate` and `EndPaint`.
- `EndPaint` should become a flush/safety marker rather than a second normal capture producer.
- RDP-Perf-3 should make `BitmapUpdate` dirty regions the default normal render path and reserve full frames for connect/resize/attach/recovery.

### RDP-Perf-3: Region-First BGRA Fallback

Use true dirty regions as the default fallback path.

Deliver:

- no full-frame copy for small dirty updates
- baseline full frame only on connect/resize/attach
- region payloads only for normal UI changes

Status: completed and live-smoke-proven on the test Docker environment as of 2026-04-26.

Smoke evidence:

- worker image: `rap-rdp-worker:rdp-perf3-region-first`
- direct smoke session id: `abc11233-34c4-45a6-a55b-0571a09332a1`
- fallback smoke session id: `ee756839-6a82-49d4-9619-54acf69e1efd`
- direct worker WSS selected and backend gateway fallback separately verified
- login/resource/start/input/detach/attach/takeover/taken_over/logout passed in both direct and fallback smoke
- direct session cleanup state: `terminated`
- fallback session cleanup state: `terminated`
- report: `artifacts/rdp-perf3-report.md`

Measured direct-path results:

| Metric | Result |
| --- | --- |
| new RDP runtime count | 1 |
| direct data-plane binds | 6 |
| worker input apply events | 6 |
| deferred `BitmapUpdate` callbacks | 104 |
| `bitmap_update_flush` captures | 104 |
| region flush captures | 93 |
| full flush captures | 11 |
| periodic duplicate changes | 0 |
| client rendered region frames | 19 |
| client skipped region frames | 0 |

Implementation notes:

- `BitmapUpdate` is now deferred during a paint cycle.
- `EndPaint` flushes the accumulated `BitmapUpdate` dirty region once.
- `EndPaint` no longer performs a second normal change-detect capture when a bitmap update was already flushed.
- The periodic change detector snapshot is synchronized after callback-driven frame capture, avoiding rediscovery of the same changed pixels.
- Direct binary frame metadata now preserves full desktop dimensions separately from region payload dimensions, so the Windows client can patch regions into its framebuffer.
- Backend gateway fallback remains compatible with the existing JSON/base64 path.

### RDP-Perf-4: RDPGFX Channel Foundation

Capture and parse RDPGFX surface updates where available.

Deliver:

- surface lifecycle
- surface bits updates
- cursor updates
- fallback to region BGRA when RDPGFX unavailable

Status: build-proven and default-path smoke-proven on the test Docker environment as of 2026-04-26.

Implementation:

- RDPGFX stays disabled by default.
- `RDP_WORKER_RDPGFX_ENABLED=true` is the only gated runtime switch.
- Worker diagnostics now log RDPGFX configuration, channel subscription, channel connection, GDI graphics pipeline initialization, fallback reasons, and normalized FreeRDP surface update callbacks.
- Callback summaries include RDPGFX counters.
- The default classic GDI region-first path remains the active safe path.

Default smoke evidence:

- worker image: `rap-rdp-worker:rdp-perf4-rdpgfx-gated`
- final default smoke session id: `30e80d99-e3b5-428b-aa18-fea65b8db499`
- direct worker WSS selected
- login/resource/start/input/detach/attach/takeover/taken_over/logout passed
- session cleanup state: `terminated`
- worker log: `rdp.gfx config requested=false mode=classic_gdi_region_first`
- worker log: `rdp.perf callback_summary ... rdpgfx_requested=false ... frame_capture_region=...`

Gated RDPGFX target compatibility result:

- gated session id: `aa69f606-9217-4579-b438-b7d3ec5e01d0`
- environment: `RDP_WORKER_RDPGFX_ENABLED=true`
- result: failed on the current live RDP target
- observed: `BIO_read returned a system error 104: Connection reset by peer`
- observed: `freerdp_post_connect failed`
- no `rdp.gfx channel_connected` or surface callbacks were observed before reset
- conclusion: the current target must use the default GDI region-first path

Report: `artifacts/rdp-perf4-report.md`

### RDP-Perf-5: Encoded Graphics Payloads

Support encoded graphics payloads over RAP direct data-plane.

Deliver:

- binary encoded payload message type
- client decode strategy
- fallback to region BGRA

### RDP-A4: CursorAdapter

Move cursor handling into the RDP Adapter boundary and keep cursor events independent from display frame cadence.

Status: completed and live-smoke-proven on the test Docker environment as of 2026-04-26.

Implementation:

- `CursorAdapter` normalizes FreeRDP pointer callbacks into cursor position, visibility, shape, cache, and mask metadata.
- FreeRDP pointer callbacks are installed and restored inside the RDP runtime hook boundary.
- Original FreeRDP pointer callbacks are invoked before platform normalization, preserving FreeRDP internal state.
- `session_cursor_updated` worker events are mapped to platform `cursor.update` envelopes.
- Direct worker WSS treats cursor as latest-only/droppable and schedules it separately from binary render frames.
- Backend gateway fallback remains compatible with the same `session_cursor_updated` event payload.
- Windows client accepts `cursor.update` through the existing render payload bridge without changing UI layout.
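
The latest-only/droppable cursor handling described in the implementation list above can be pictured as a single-slot mailbox. The following is a minimal sketch, not the worker's actual code: `CursorUpdate`, `LatestOnlyCursorSlot`, and their members are hypothetical names introduced for illustration, and a real `cursor.update` payload would carry more metadata (shape, cache, mask) than shown here.

```cpp
#include <cstdint>
#include <mutex>
#include <optional>
#include <utility>

// Hypothetical cursor payload; field names are illustrative assumptions.
struct CursorUpdate {
    std::int32_t x = 0;
    std::int32_t y = 0;
    bool visible = true;
    std::uint64_t sequence = 0;
};

// Single-slot mailbox: a newer update overwrites an older unsent one,
// so the sender only ever sees the most recent cursor state.
class LatestOnlyCursorSlot {
public:
    // Producer side (for example, a pointer callback).
    // Returns true when an unsent update was overwritten, i.e. dropped as stale.
    bool publish(const CursorUpdate& update) {
        std::lock_guard<std::mutex> lock(mutex_);
        const bool dropped = pending_.has_value();
        pending_ = update;
        return dropped;
    }

    // Consumer side (for example, the direct data-plane writer).
    // Returns the newest pending update, if any, and empties the slot.
    std::optional<CursorUpdate> take() {
        std::lock_guard<std::mutex> lock(mutex_);
        std::optional<CursorUpdate> out = std::move(pending_);
        pending_.reset();  // a moved-from optional still holds a value
        return out;
    }

private:
    std::mutex mutex_;
    std::optional<CursorUpdate> pending_;
};
```

A sender loop would call `take()` on its own schedule, independent of binary render frames, so a burst of pointer callbacks collapses to the newest position instead of building a queue.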
Smoke evidence:

- worker image: `rap-rdp-worker:rdp-a4-cursor-adapter`
- direct smoke session id: `549806aa-c9db-48a9-917e-cf817cf236b5`
- fallback smoke session id: `dee3a856-bee1-4eba-9c10-f62edaf56547`
- direct worker WSS selected in direct smoke
- backend gateway selected in fallback smoke
- login/resource/start/input/detach/attach/takeover/taken_over/logout passed in both direct and fallback smoke
- direct session cleanup state: `terminated`
- fallback session cleanup state: `terminated`
- worker log: `cursor.adapter hooks installed pointer_callbacks=true`
- worker log: `adapter_event channel=cursor type=cursor.update origin=adapter`
- worker log: `rdp.perf callback_summary ... cursor_updates_enqueued=...`
- client log: `SessionWindowViewModel.HandleEnvelopeAsync ... cursor.update`
- report: `artifacts/rdp-a4-cursor-adapter-report.md`

Known limitation:

- Cursor event separation does not by itself fix delayed hover/menu repaint. The next safe step is a GDI repaint cadence and server-origin update audit on the stable region-first path.

### RDP-Perf-5A: GDI Repaint Cadence And Hover Feedback Hardening

Fix the first proven stable-path repaint cadence bottlenecks without changing backend, session lifecycle, data-plane contracts, clipboard/file transfer, or UI layout.

Status: build-proven and smoke-proven on the test Docker environment as of 2026-04-26.

Implementation:

- FreeRDP event pump performs a bounded immediate drain after a signaled handle check so already-queued server events are not delayed by the next wait cycle.
- Periodic no-change detection logging is rate-limited to avoid hot-loop log pressure while the remote screen is idle.
- Worker session runtime renews the worker lease every 5 seconds instead of performing Redis lease I/O on every render/input loop iteration.
- Region and interactive render notifications use a 33 ms publish interval.
- Full-frame fallback remains at 100 ms.
- Direct worker WSS binary writer uses the same 33 ms interval for region/interactive frames.

Smoke evidence:

- worker image: `rap-rdp-worker:rdp-perf5a-repaint-cadence`
- direct smoke session id: `0cca4974-2a82-48dc-a0f6-1036ea8e98f0`
- fallback smoke session id: `16deb09e-1c44-4e9d-8448-93b42ac66ed0`
- direct worker WSS selected in direct smoke
- backend gateway selected in fallback smoke
- login/resource/start/input/detach/attach/takeover/taken_over/logout passed in both direct and fallback smoke
- direct session cleanup state: `terminated`
- fallback session cleanup state: `terminated`
- report: `artifacts/rdp-perf5a-report.md`

Measured direct-path results:

| Metric | Result |
| --- | --- |
| client rendered frames observed | 65 |
| client binary frames observed | 66 |
| direct region publishes at 33 ms | 54 |
| direct outbound FPS max | 9.705640 |
| render seen FPS max | 26.386542 |
| render published FPS max | 9.459327 |
| direct backpressure frame drops | 0 |
| render pending max | 0 |

Measured conclusion:

- Region/interactive frames now leave the worker promptly when server-origin changes arrive.
- The direct smoke did not show queued FreeRDP event-handle bursts after the new immediate drain path: `event_pump_drained_checks=0`.
- The current live target still emits idle/server-origin region changes at roughly 1 FPS in observed stable GDI mode.
- Manual UX validation is still required before claiming hover/menu responsiveness accepted by a human operator.

### RDP-Perf-6: Dirty-Region Direct Binary Render Contract

Replace full-frame-only direct binary render updates with explicit dirty-region direct binary render updates while preserving full-frame fallback.

Deliver:

- direct `RAP2` `message_type=render.frame.full`
- direct `RAP2` `message_type=render.frame.region`
- one bounding-rectangle dirty-region BGRA payload for normal UI changes
- full-frame fallback for first frame, attach/reattach, resize, recovery, invalid region state, and debug/fallback mode
- worker diagnostics for `full_frame_sent`, `region_frame_sent`, `region_bytes`, `full_frame_bytes`, `region_savings_percent`, `diff_time_ms`, `render_update_reason`, and `fallback_to_full_frame_reason`
- Windows direct receiver support for explicit full/region message types
- Windows framebuffer-backed region patching
- backend gateway JSON/base64 fallback unchanged

Status: implemented and build/probe/live-smoke-proven on the test Docker environment as of 2026-04-26 using the current RDP target.

Build/probe evidence:

- worker image build: `rap-rdp-worker:rdp-perf6-dirty-region`
- Windows client build: PASS
- worker graphics adapter probe: PASS
- worker direct data-plane bind valid probe: PASS
- worker service adapter protocol probe: PASS
- direct worker WSS smoke: PASS
- backend gateway fallback smoke: PASS

Implementation notes:

- The current classic GDI region-first display path remains the source of dirty-region payloads.
- The direct worker WSS sender no longer labels all binary render payloads as `session.frame`; it uses `render.frame.full` and `render.frame.region`.
- The Windows transport still normalizes direct render frames into the existing application-level `session.frame` pipeline, so session lifecycle, input, clipboard, and file-transfer behavior are unchanged.
- The Windows presenter keeps a session framebuffer and applies region patches into it before presenting the updated surface.
- Backend gateway fallback remains JSON/base64 and is not used as the production high-rate render relay.
- Runtime payload examples: full baseline `3,686,400` bytes; dirty regions `16,384`, `163,840`, `327,680`, `655,360`, and `737,280` bytes.
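
The framebuffer-backed region patching and the `region_savings_percent` diagnostic listed above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the Windows presenter or worker code: `Framebuffer`, `RegionPatch`, `apply_region`, and the free function `region_savings_percent` are hypothetical names, the patch is assumed to be tightly packed BGRA at 4 bytes per pixel, and the savings formula is assumed to be `100 * (1 - region_bytes / full_frame_bytes)`.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr std::size_t kBytesPerPixel = 4;  // BGRA, assumed 8 bits per channel

// Hypothetical region descriptor; not the actual RAP2 header layout.
struct RegionPatch {
    std::uint32_t x = 0;
    std::uint32_t y = 0;
    std::uint32_t width = 0;
    std::uint32_t height = 0;
    std::vector<std::uint8_t> bgra;  // tightly packed: width * height * 4 bytes
};

// Hypothetical client-side session framebuffer.
struct Framebuffer {
    std::uint32_t width;
    std::uint32_t height;
    std::vector<std::uint8_t> bgra;

    Framebuffer(std::uint32_t w, std::uint32_t h)
        : width(w), height(h),
          bgra(static_cast<std::size_t>(w) * h * kBytesPerPixel, 0) {}
};

// Copies the patch into the framebuffer row by row.
// Returns false for an invalid region so the caller can fall back to a full frame.
bool apply_region(Framebuffer& fb, const RegionPatch& patch) {
    if (patch.width == 0 || patch.height == 0) return false;
    if (patch.x > fb.width || patch.width > fb.width - patch.x) return false;
    if (patch.y > fb.height || patch.height > fb.height - patch.y) return false;

    const std::size_t row_bytes = static_cast<std::size_t>(patch.width) * kBytesPerPixel;
    if (patch.bgra.size() != row_bytes * patch.height) return false;

    const std::size_t fb_stride = static_cast<std::size_t>(fb.width) * kBytesPerPixel;
    for (std::uint32_t row = 0; row < patch.height; ++row) {
        std::uint8_t* dst = fb.bgra.data()
            + (static_cast<std::size_t>(patch.y) + row) * fb_stride
            + static_cast<std::size_t>(patch.x) * kBytesPerPixel;
        const std::uint8_t* src = patch.bgra.data() + static_cast<std::size_t>(row) * row_bytes;
        std::memcpy(dst, src, row_bytes);
    }
    return true;
}

// Assumed formula for the savings diagnostic; clamps to 0 when nothing is saved.
double region_savings_percent(std::size_t region_bytes, std::size_t full_frame_bytes) {
    if (full_frame_bytes == 0 || region_bytes >= full_frame_bytes) return 0.0;
    return 100.0 * (1.0 - static_cast<double>(region_bytes)
                              / static_cast<double>(full_frame_bytes));
}
```

Under that assumed formula, a `16,384` byte region against the `3,686,400` byte baseline is a saving of about 99.6%, and a `655,360` byte region about 82.2%. Returning `false` for an invalid region mirrors the documented rule that invalid region state falls back to a full frame.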
### RDP-Perf-7: Adaptive Quality Controller

Add channel-aware adaptive render quality.

Deliver:

- latency-aware profile switching
- bandwidth-aware profile switching
- latest-only render backpressure
- stable input under load

## Acceptance Targets

LAN targets:

- first frame: under 2 seconds after successful RDP login
- click to visible response: under 150 ms for common UI
- keypress to visible response: under 150 ms for text input
- pointer hover response: under 100 ms where the target emits hover changes
- one click activates remote buttons correctly
- no unbounded frame/input queues

Weak-channel targets:

- input remains usable even when render quality degrades
- render drops stale updates instead of building backlog
- file transfer never starves interactive input

## RDP Performance Work Paused

RDP performance work is paused. Next active work is Fabric Core / cluster foundation.

RDP-Perf-6 remains accepted and smoke-proven. Future RDP roadmap items such as RDP-Perf-7, adaptive quality, encoded payloads, additional RDPGFX testing, tiles, codecs, or further renderer optimization must not start without a new explicit RDP-stage prompt.

The preserved RDP baseline remains:

- C++ RDP Adapter runtime
- direct worker WSS
- backend gateway fallback
- dirty-region direct binary render from RDP-Perf-6
- proven session lifecycle
- existing clipboard and file-transfer semantics