rdp-proxy/docs/codex/CURRENT_STATUS.md
2026-04-28 22:29:50 +03:00


# Current Implementation Status
Date: 2026-04-28

This is the current operational status for Codex continuation. It supersedes
older prompt fragments that still describe platform-core v2, WebSocket takeover
proof, or basic Windows client work as future tasks.
## Project Identity
The project is a Secure Access Fabric platform.

Current implementation focus:
- RDP work is paused by product decision
- preserve RDP as the first proven service baseline
- keep the C++ RDP Adapter as the active runtime when RDP work resumes
- RDP-Perf-6 dirty-region direct binary rendering is completed and
build/probe/live-smoke-proven
- Product model clarified: Remote Server/Desktop Access is one managed product
service. RDP/VNC/SSH are internal adapters selected by organization resource
protocol, not separate organization-facing cluster services.
- preserve the proven backend/session/worker lifecycle
- preserve direct worker WSS and backend gateway fallback
- Stage C10 Fabric Core documentation consolidation and scoped cluster
configuration distribution design is completed.
- Stage C11 signed scoped cluster snapshot model is completed.
- Stage C12 node local state store is completed.
- Stage C13 Fabric Storage / Config Storage service foundation is completed.
- Stage C14 peer directory and cache model is completed.
- Stage C15 Fabric Routing Engine skeleton is completed.
- Stage C16 secure node-to-node channel lifecycle is completed.
- Stage C17 planning is completed.
- Stage C17A synthetic mesh runtime skeleton is implemented and test-proven.
- Stage C17B route health and failover probe skeleton is implemented and
test-proven.
- Stage C17C relay semantic hardening skeleton is implemented and test-proven.
- Stage C17D non-production synthetic test-service path experiment is
implemented and test-proven.
- Stage C17E live node-to-node synthetic HTTP transport skeleton is
implemented, build-proven, and smoke-proven. It remains disabled by default
and carries synthetic traffic only.
- Stage C17F scoped synthetic peer/route config loading and route-health
reporting is implemented, build-proven, and smoke-proven. It remains
synthetic-only and does not enable production mesh traffic.
- Stage C17G Control Plane scoped synthetic config read boundary is
implemented and backend/node-agent-test-proven. It remains synthetic-only and
does not enable production mesh traffic.
- Stage C17H deployed multi-agent synthetic config smoke is implemented and
runtime-proven on `docker-test`. Five running `rap-node-agent` containers
consume backend-issued node-scoped synthetic config, direct and single-relay
synthetic route-health observations return to the Control Plane, and
production forwarding remains disabled.
- Stage C17I production forwarding gate foundation is implemented and
test-proven. `rap-node-agent` now has an explicit
`RAP_MESH_PRODUCTION_FORWARDING_ENABLED` gate, but `/mesh/v1/forward` still
does not forward production payloads without a later approved runtime stage.
- Stage C17J production envelope contract is implemented and test-proven.
`/mesh/v1/forward` validates route-bound production envelopes when the gate
is enabled, accepts only `fabric_control` / `fabric.control` for this stage,
rejects service channels, and still does not forward payloads.
- Stage C17K production envelope observation is implemented and test-proven.
Valid production envelopes can be observed locally as metadata-only records
after validation; rejected envelopes are not observed, observation failure
fails closed, and payloads are still not forwarded.
- Stage C17L bounded production observation sink is implemented and
test-proven. Accepted metadata-only production envelope observations can be
retained locally with fixed capacity and oldest-entry drop behavior; payload
bodies are not stored and payloads are still not forwarded.
- Stage C17M production observation sink wiring is implemented and
test-proven. `rap-node-agent` can wire the bounded local metadata-only sink
when `RAP_MESH_PRODUCTION_OBSERVATION_SINK_CAPACITY` is explicitly greater
than zero; the wiring is disabled by default, exposes no read API, stores no
payload bodies, and payloads are still not forwarded.
- Stage C17N production observation sink metrics are implemented and
test-proven. Local sink metrics expose only capacity, current depth, accepted
total, and dropped-oldest total; they expose no observation records, route
IDs, message IDs, hashes, payload metadata, or payload bodies.
- Stage C17O production observation sink local metrics logging is implemented
and test-proven. `rap-node-agent` logs aggregate sink metrics locally when
the sink is explicitly enabled; this adds no read API, no Control Plane
reporting, no payload storage, and no forwarding.
- Stage C17P production observation sink change-driven metrics logging is
implemented and test-proven. `rap-node-agent` suppresses repeated identical
local sink metrics logs; this adds no read API, no Control Plane reporting,
no payload storage, and no forwarding.
- Stage C17Q production forwarding gate/runtime log boundary is implemented
and test-proven. `rap-node-agent` logs production forwarding gate state
separately from production forwarding runtime state; runtime state remains
false and forwarding remains unavailable.
- Stage C17R production observation sink capacity guard is implemented and
test-proven. `RAP_MESH_PRODUCTION_OBSERVATION_SINK_CAPACITY` remains
disabled by default, rejects negative values, and rejects values above
`10000`.
- Stage C17S production observation panic fail-closed hardening is implemented
and test-proven. Observer errors and observer panics both fail closed as
observation failure; forwarding remains unavailable.
- Stage C17T production envelope payload boundary is implemented and
test-proven. Validated production `fabric.control` envelope payloads are
bounded to `4096` bytes, and oversized envelopes are rejected before
observation.
- Stage C17U production envelope created-at skew boundary is implemented and
test-proven. Validated production `fabric.control` envelopes whose
`created_at` is more than one minute in the future are rejected before
observation.
- Stage C17V peer endpoint candidate model and NAT/connectivity hints are
implemented and test-proven. Node-scoped synthetic mesh config now carries
route-scoped endpoint candidates with transport, address, reachability, NAT
type, connectivity mode, priority, policy tags, verification time, and
metadata. This does not implement production route scoring, NAT traversal,
shortcut routing, or forwarding runtime.
- Stage C17W peer endpoint candidate scoring model is implemented and
test-proven. `rap-node-agent` can deterministically rank already-scoped
endpoint candidates using soft inputs, but this does not open connections,
choose production routes, or forward payloads.
- Stage C17X health-aware endpoint candidate scoring overlay is implemented
and test-proven. Candidate scoring can optionally use local health
observations keyed by `endpoint_id`, including latency, success/failure
history, recent failure reason, reliability score, and observation freshness.
This remains advisory scoring only.
- Stage C17Y Platform Owner synthetic mesh visibility is implemented and
build/test-proven. `web-admin` now reads node-scoped synthetic mesh config
and shows config enabled state, route counts, peer endpoints, endpoint
candidates, C17X advisory scoring boundary, and `production_forwarding`.
This remains platform-owner visibility only and does not enable production
forwarding.
- Stage C17Z production fabric-control direct forwarding boundary is
implemented and test-proven. `/mesh/v1/forward` no longer unconditionally
returns unavailable after validation: when the explicit production gate is enabled,
it can deliver valid route-bound `fabric.control` envelopes at the local
destination or forward them to a direct next hop from explicit peer endpoint
config. Service channels, arbitrary relay forwarding, multi-hop production
route execution, and RDP/VPN/file/video/service workload traffic remain out
of scope.
- Stage C17Z1 production fabric-control multi-hop route-path boundary is
implemented and test-proven. Production `fabric.control` envelopes can carry
`route_path` and `visited_node_ids`; relay nodes validate path position,
forward only to the next path node, update TTL/hop/visited metadata, and
reject loops. Service payloads remain unavailable.
- Stage C17Z2 production fabric-control forwarding observability boundary is
implemented and test-proven. Node-agent now emits local
`mesh_production_forward_event` logs for accepted, forwarded, delivered, and
rejected production `fabric.control` envelopes. Logs are metadata-only and
include no payload bodies; this stage adds no read API and no Control Plane
reporting.
- Stage C17Z3 production fabric-control route-config boundary is implemented
and test-proven. When scoped/control-plane mesh routes are available locally,
production `fabric.control` envelopes must match configured route_id,
cluster, source, destination, route path, next hop, allowed channel, expiry,
max TTL, and max hop count before forwarding. Service payloads remain
unavailable.
- Stage C17Z4 scoped peer directory and recovery seeds boundary is implemented
and test/build-proven. Node-scoped mesh config now carries scoped
`peer_directory` and explicit bounded `recovery_seeds`; node-agent parses and
validates them, and Platform Owner Control Panel shows peer-directory/recovery
seed counts. This does not implement connection management, NAT traversal,
dynamic endpoint reporting, or service traffic.
- Stage C17Z5 node-agent peer cache runtime boundary is implemented and
test-proven. Node-agent now builds a node-local `PeerCache` from scoped peer
directory, recovery seeds, endpoints, endpoint candidates, and routes; selects
a bounded warm peer set; probes warm peers through `/mesh/v1/health` when
synthetic mesh testing is enabled; and reports metadata-only mesh-link
observations. This does not implement a persistent connection manager,
NAT traversal, dynamic endpoint reporting, or service payload forwarding.
- Stage C17Z6 dynamic endpoint reporting boundary is implemented and
test-proven. Node-agent can report an explicit advertised mesh endpoint in
heartbeat metadata, and Control Plane projects latest reported peer endpoints
and candidates into node-scoped synthetic mesh config. This does not implement
automatic public IP discovery, STUN/TURN/ICE NAT classification, persistent
connection management, or service payload forwarding.
- Stage C17Z7 private/corporate endpoint candidate boundary is implemented and
test-proven. Node-agent can report multiple advertised endpoint candidates,
including private/corporate LAN candidates; scoring rewards `private-lan`,
`corp-lan`, and `same-site`; and peer cache can select the best candidate
address for warm-peer health. This does not implement automatic subnet
discovery, persistent connection management, or service payload forwarding.
- Stage C17Z8 peer connection state machine boundary is implemented and
test-proven. Node-agent now tracks warm-peer connection states
(`disconnected`, `connecting`, `ready`, `degraded`, `backoff`), transitions
on warm-peer health probes, applies bounded backoff after repeated failures,
and reports metadata-only connection state in mesh-link observations. This
does not implement persistent data-plane sockets or service payload
forwarding.
- Stage C17Z9 peer recovery planner boundary is implemented and test-proven.
Node-agent now plans a bounded stable ready-peer set, enters recovery mode
when ready peers fall below target, selects bounded recovery probe
candidates from warm peers, recovery seeds, and other connectable scoped
peers, and reports metadata-only recovery state in heartbeat and mesh-link
observations. This does not implement persistent data-plane sockets, NAT
traversal, relay/rendezvous runtime, or service payload forwarding.
- Stage C17Z10 peer connection intent planner boundary is implemented and
test-proven. Node-agent now classifies bounded peer work as
maintain/probe/recover and classifies transport readiness as direct, private
LAN, corporate LAN, outbound-only, or relay-required, and it reports
rendezvous-required metadata in heartbeat and mesh-link observations. This
does not implement
persistent data-plane sockets, STUN/TURN/ICE, NAT traversal,
relay/rendezvous runtime, or service payload forwarding.
- Stage C17Z11 peer connection manager runtime boundary is implemented and
test-proven. Node-agent now uses a reusable HTTP keep-alive client to perform
real control-plane health probes for direct/private/corporate peers selected
by connection intents, updates shared peer connection state, records
`waiting_rendezvous` for outbound-only/relay-required peers, and reports
metadata-only manager cycle state. This does not implement STUN/TURN/ICE,
relay/rendezvous runtime, route leases, VPN runtime, or service payload
forwarding.
- Stage C17Z12 rendezvous/relay control-plane contract is implemented and
docker-test-runtime-proven. Backend now issues node-scoped
`rendezvous_leases` in synthetic mesh config, including explicit
route-policy leases and derived leases for outbound-only or relay-required
candidates when a route has a reachable HTTP relay control endpoint.
Node-agent consumes those leases, resolves matching `waiting_rendezvous`
intents into `relay_control`, probes relay `/mesh/v1/health`, records and
maintains `relay_ready` for the peer control path, and reports manager
metadata. This remains control-plane health only and does not enable
RDP/VPN/service payload forwarding, arbitrary relay packet forwarding,
STUN/TURN/ICE, or host network changes.
- Stage C17Z13 rendezvous lease telemetry is implemented and
docker-test-runtime-proven. Node-agent heartbeat now emits
`mesh_rendezvous_lease_report` with schema
`c17z13.mesh_rendezvous_lease_report.v1`, local role
(`relay`, `peer`, or `entry_or_observer`), relay admission, peer admission,
TTL/renewal posture, `relay_ready`, and explicit no-payload boundary flags.
Web-admin recent heartbeat tables show `rv leases`. This remains
control-plane telemetry only and does not enable RDP/VPN/service payload
forwarding, arbitrary relay packet forwarding, STUN/TURN/ICE, or host
network changes.
- Stage C17Z14 rendezvous lease refresh contract is implemented and
docker-test-runtime-proven. Node-agent refreshes renewal-needed, expired,
invalid, or stale relay leases through the existing node-scoped synthetic
config endpoint, reloads peer cache/leases/routes into the running synthetic
mesh runtime, and reports `c17z14.mesh_rendezvous_lease_report.v1` with
refresh counters plus stale relay withdrawal/reselection telemetry. This
remains control-plane health only and does not enable RDP/VPN/service
payload forwarding, arbitrary relay packet forwarding, STUN/TURN/ICE, or
host network changes.
- Stage C17Z15 backend relay replacement policy is implemented and
docker-test-runtime-proven. Backend reads recent
`mesh_rendezvous_lease_report` stale-relay feedback, withdraws stale
explicit rendezvous leases from node-scoped synthetic config, scores
alternate relay candidates using route adjacency, endpoint priority,
policy tags, and recent mesh-link health, and returns a replacement lease
plus `rendezvous_relay_policy` decisions in `c17z15.synthetic.v1`.
Node-agent reports `c17z15.mesh_rendezvous_lease_report.v1`, advertises the
relay replacement contract capability, and keeps stale state bound to the
exact lease/relay instead of smearing it across alternate leases for the same
peer. This remains control-plane health only and does not enable
RDP/VPN/service payload forwarding, arbitrary relay packet forwarding,
STUN/TURN/ICE, or host network changes.
- Stage C17Z16 route/path decision artifact is implemented and
docker-test-runtime-proven. Backend synthetic config now uses
`c17z16.synthetic.v1` and includes `route_path_decisions` with original hops,
effective hops, local previous/next hop, selected replacement relay,
generation, score reasons, and explicit control-plane/no-payload flags.
Node-agent stores the control-plane route generation and reports
`c17z16.mesh_route_path_decision_report.v1` alongside
`c17z16.mesh_rendezvous_lease_report.v1`. This remains route metadata only
and does not enable RDP/VPN/service payload forwarding, arbitrary relay
packet forwarding, STUN/TURN/ICE, or host network changes.
- Stage C17Z17 node-side route generation tracker is implemented and
docker-test-runtime-proven. Backend synthetic config now uses
`c17z17.synthetic.v1`; node-agent tracks Control Plane
`route_path_decisions` apply/unchanged/withdraw transitions and reports
`c17z17.mesh_route_generation_report.v1` alongside
`c17z17.mesh_route_path_decision_report.v1` and
`c17z17.mesh_rendezvous_lease_report.v1`. A first-observed
`stale_relay_replacement` still emits a `withdrawn_by_replacement` record
for the old stale relay path. This remains route metadata/control-plane
health only and does not enable RDP/VPN/service payload forwarding,
arbitrary relay packet forwarding, STUN/TURN/ICE, or host network changes.
- Stage C17Z18 synthetic route-health effective path runtime is implemented
and docker-test-runtime-proven. Backend synthetic config now uses
`c17z18.synthetic.v1`; node-agent refreshes Control Plane route decisions
into a separate route-health route config, probes the selected effective
path through the replacement relay, and reports
`c17z18.mesh_route_health_config_report.v1` plus route-health observations
with expected/observed hops and drift state. Backend latest mesh links now
preserve `synthetic_route_health` separately from peer connection-manager
observations, and web-admin shows route-health rows. This remains synthetic
route-health/control-plane only and does not enable RDP/VPN/service payload
forwarding, arbitrary relay packet forwarding, STUN/TURN/ICE, or host
network changes.
- Stage C17Z19 synthetic route-health feedback scoring is implemented and
docker-test-runtime-proven. Backend now consumes recent
`synthetic_route_health` observations in the relay scoring loop: drift,
unreachable status, or failure metadata can mark the exact selected relay
stale and trigger replacement, while healthy low-latency route-health boosts
alternate relay scoring. Node-agent route-health observations include
rendezvous peer/lease metadata, migration `000022` adds the `synthetic`
mesh service class, and web-admin marks relay policy `rh feedback`. This
remains synthetic/control-plane only and does not enable RDP/VPN/service
payload forwarding, arbitrary relay packet forwarding, STUN/TURN/ICE, or
host network changes.
- Stage C17Z20 node-side route-health feedback refresh is implemented and
docker-test-runtime-proven. After node-agent reports synthetic route-health
drift, unreachable status, or failure metadata, it schedules a bounded
node-scoped synthetic-config refresh, applies returned replacement route
decisions to route-health config immediately, and reports
`c17z20.mesh_route_health_feedback_refresh_report.v1` with attempt,
success, failure, and suppressed-repeat counters. Web-admin route-health
heartbeat summary now shows feedback refresh counters. This remains
synthetic/control-plane only and does not enable RDP/VPN/service payload
forwarding, arbitrary relay packet forwarding, STUN/TURN/ICE, or host
network changes.
- Installation Authority foundation is implemented and backend/web-build
verified. Production config now requires `INSTALLATION_AUTHORITY_MODE=strict`
with a Product Root Ed25519 public key. First-owner bootstrap accepts signed
activation manifests, stores installation authority and signed
`platform_role_grants`, and strict platform-admin checks ignore direct
PostgreSQL `users.platform_role` edits unless a valid grant exists. Web-admin
shows installation status and first-owner bootstrap; dev/legacy SQL seed
compatibility remains explicit and gated by
`INSTALLATION_INSECURE_BOOTSTRAP_ENABLED`.
- Cluster Authority foundation is implemented and backend/agent/web-build plus
docker-test lifecycle-smoke verified. Clusters now have Ed25519 authority
keys, join-token scope material is signed, node approval/bootstrap material
is signed, and Control Plane synthetic mesh config snapshots include a
signed hash envelope with `authority_required=true`. Cluster authority
private keys are encrypted at rest when `SECRET_ENCRYPTION_KEY_B64` or its
file-based variant is configured. `rap-node-agent` verifies signed Control
Plane synthetic config
and supports pinned authority public key/fingerprint through env or identity
state. Web-admin shows cluster authority fingerprints in summaries,
join-token output, approval rows, and synthetic config visibility. The
docker-test run `dev-bootstrap-20260428-201430` proved fresh dev cluster
creation, signed join token, real node-agent enrollment, platform-owner
approval, automatic signed bootstrap polling, authority pin persistence,
heartbeat, and signed synthetic-config verification. This is a control-plane
trust contract only; it does not enable RDP/VPN/service payload forwarding or
production relay packet forwarding.
- Node enrollment bootstrap polling is implemented and backend/agent-test plus
docker-test lifecycle-smoke verified. After enrollment, `rap-node-agent`
stores `pending_join_request_id`, polls
`/node-agents/enrollments/{requestID}/bootstrap`, verifies the signed
approval/bootstrap contract, and persists the approved `node_id`,
`identity_status`, and cluster authority pin into `identity.json`. Polling is
controlled by `RAP_ENROLLMENT_POLL_INTERVAL_SECONDS` and
`RAP_ENROLLMENT_POLL_TIMEOUT_SECONDS`.
- Migration `000021_cluster_authority_keys` was hardened after the fresh
docker-test replay found that PostgreSQL cannot change the
`cluster_admin_summaries` view layout through `CREATE OR REPLACE VIEW`; the
migration now drops and recreates the view in both up/down paths.
- `rap-node-agent` desired-workload polling/status reporting is now gated by
`RAP_WORKLOAD_SUPERVISION_ENABLED`, which defaults to `false`. This avoids
repeated admin-only workload-status `403` logs while service runtime
supervision is still a stub.
- Stage C18 VPN/IP tunnel service target design is completed as
documentation/planning only.
- Stage C18A VPN/IP tunnel control-plane data model foundation is implemented
and backend-test-proven.
- Stage C18B VPN/IP tunnel lease/fencing hardening is implemented and
backend-test-proven.
- Stage C18C VPN/IP tunnel node-agent desired-state consumption/reporting is
implemented and backend-test-proven.
- Version Storage / Update Repository is documented as a future Fabric Core
service for signed releases, OS/arch artifacts, stable/current/candidate
channels, update-cache mirroring, node-agent update supervision, rollback,
and explicit data-structure migration bundles. Runtime updater behavior is
not implemented.
- Web Ingress and Admin UI ownership model is documented:
`docs/architecture/WEB_INGRESS_AND_ADMIN_UI_MODEL.md`.
- Admin endpoint placement decision is documented: storage/config-storage nodes
do not automatically become cluster panels; Platform Owner Console remains
global, Cluster Admin Endpoint requires explicit admin/web ingress role
assignment, and Organization Admin Panel remains tenant-safe.
- Platform Owner Control Panel is implemented in `web-admin` and
build-verified. Report:
`artifacts/web-admin-platform-owner-control-panel-report.md`.
- Fabric service endpoint control-plane foundation is implemented:
cluster-scoped `fabric_entry_points` and `fabric_egress_pools` are durable
PostgreSQL objects for logical client ingress and logical egress zones. This
is not production mesh routing yet.
- Fabric endpoint node assignment is implemented for the Platform Owner
Control Panel: entry points and egress pools can show and assign active
cluster nodes through control-plane APIs. This remains placement intent only;
it does not start mesh routing or service traffic.
- Platform Owner Control Panel Fabric map now visualizes logical ingress,
active cluster nodes, logical egress pools, endpoint-node placement intent,
observed peer links, and node telemetry/service summaries in one cluster
diagram. This remains a platform-owner topology view and must not be exposed
to organization panels.
- Platform Owner Control Panel Fabric page also shows current node-scoped
synthetic mesh config/candidate/scoring/route-health feedback visibility
after C17Z20.
- Stage C17H deployed multi-agent synthetic config smoke on `docker-test` is
complete; next mesh/Fabric work requires an explicit new staged prompt.
- prepare the Secure Access Fabric platform-core foundation:
clusters, node enrollment, native node-agent identity, role assignment,
platform admin console, scoped configuration distribution, node-local state,
Fabric Storage/Config Storage, and future multi-cluster administration
- Stage C1 backend cluster/node model foundation is implemented and verified.
- Stage C2 node enrollment hardening is implemented and verified.
- Stage C3 native `rap-node-agent` MVP scaffold is implemented and verified.
- Stage C4 Platform Admin Console MVP is implemented and build-verified.
- Stage C5 service workload supervision contract is implemented and verified.
- Stage C6 mesh control-plane preparation is implemented and verified.
- Stage C7 Mesh MVP skeleton is implemented and verified.
- Stage C8 multi-cluster/partition hardening is implemented and verified.
- Stage C9 organization admin foundation is implemented and verified.
- Current focus: Fabric Core / mesh transport foundation. RDP remains paused.
C17Z20 is complete. The work through C17Z20 remains limited to
peer/rendezvous control-plane health management, stale-relay replacement
policy, route/path decision metadata, and synthetic route-health
effective-path feedback scoring/refresh; it is not production service mesh
traffic.
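
The production `fabric.control` envelope boundaries described in stages C17J, C17T, C17U, and C17Z1 above can be illustrated with a minimal Go sketch. It assumes a simplified envelope shape: the `Envelope` struct, its field names, the error values, the `service.rdp` channel name, and the `Validate` function are hypothetical and are not taken from `rap-node-agent`. Only the limits come from this status (the `fabric.control` channel, the `4096`-byte payload bound, the one-minute future skew, and loop rejection through `visited_node_ids`). Checks run in order and the first violation rejects the envelope. An envelope stamped exactly one minute ahead is still accepted, because only values more than one minute in the future are rejected.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Limits taken from the status text; the constant names are hypothetical.
const (
	allowedChannel  = "fabric.control" // only channel accepted at this stage
	maxPayloadBytes = 4096             // C17T payload boundary
	maxFutureSkew   = time.Minute      // C17U created_at boundary
)

// Envelope is a hypothetical, simplified production envelope.
type Envelope struct {
	Channel        string
	Payload        []byte
	CreatedAt      time.Time
	VisitedNodeIDs []string // nodes that already handled this envelope
}

var (
	ErrChannel = errors.New("channel not allowed")
	ErrPayload = errors.New("payload exceeds 4096 bytes")
	ErrSkew    = errors.New("created_at more than one minute in the future")
	ErrLoop    = errors.New("local node already visited")
)

// Validate returns the first boundary violation, or nil if the envelope
// passes every check. Nothing is observed or forwarded here.
func Validate(e Envelope, localNodeID string, now time.Time) error {
	if e.Channel != allowedChannel {
		return ErrChannel
	}
	if len(e.Payload) > maxPayloadBytes {
		return ErrPayload
	}
	if e.CreatedAt.After(now.Add(maxFutureSkew)) {
		return ErrSkew
	}
	for _, id := range e.VisitedNodeIDs {
		if id == localNodeID {
			return ErrLoop
		}
	}
	return nil
}

func main() {
	now := time.Date(2026, 4, 28, 12, 0, 0, 0, time.UTC)
	cases := []Envelope{
		{Channel: "fabric.control", Payload: make([]byte, 4096), CreatedAt: now},
		{Channel: "service.rdp", CreatedAt: now}, // hypothetical service channel
		{Channel: "fabric.control", Payload: make([]byte, 4097), CreatedAt: now},
		{Channel: "fabric.control", CreatedAt: now.Add(2 * time.Minute)},
		{Channel: "fabric.control", CreatedAt: now, VisitedNodeIDs: []string{"node-b"}},
	}
	for i, e := range cases {
		fmt.Printf("case %d: %v\n", i, Validate(e, "node-b", now))
	}
}
```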

Not current scope:
- Web/Admin UI implementation beyond documented ownership/model boundaries
- production mesh runtime
- VPN/IP tunnel runtime
- multi-cluster runtime
- node-agent updater runtime
- production Version Storage / Update Repository runtime
- automatic PostgreSQL migration execution by node-agent
- SSH/VNC adapters
- Linux/mobile clients
- DP-3B adaptive quality expansion
- RDP performance work, including RDP-Perf-7 or further RDP-Perf stages
- Stage 5.2 remaining RDP desktop UI proof
- organization admin UI implementation
- production authentication/session hardening for Web Admin
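
As an illustration of the warm-peer connection states listed under Stage C17Z8 in the implementation list above (`disconnected`, `connecting`, `ready`, `degraded`, `backoff`), the following Go sketch shows one way a probe-driven state machine with bounded backoff could look. The state names come from this status. Everything else is an assumption made for the example: the `PeerConn` type, the failure threshold of three, the two-second base delay, the sixty-second cap, and the exact transition rules are hypothetical and do not describe the real node-agent.

```go
package main

import (
	"fmt"
	"time"
)

type PeerState string

// State names as listed for Stage C17Z8.
const (
	Disconnected PeerState = "disconnected"
	Connecting   PeerState = "connecting"
	Ready        PeerState = "ready"
	Degraded     PeerState = "degraded"
	Backoff      PeerState = "backoff"
)

// Hypothetical tuning values; the status text does not specify them.
const (
	failuresBeforeBackoff = 3
	baseBackoff           = 2 * time.Second
	maxBackoff            = 60 * time.Second
)

// PeerConn is a hypothetical per-peer record driven by health probes.
type PeerConn struct {
	State      PeerState
	Failures   int           // consecutive failed probes
	RetryAfter time.Duration // delay before the next probe while in backoff
}

func NewPeerConn() *PeerConn { return &PeerConn{State: Disconnected} }

// BeginProbe marks an idle or backed-off peer as connecting.
func (p *PeerConn) BeginProbe() {
	if p.State == Disconnected || p.State == Backoff {
		p.State = Connecting
	}
}

// OnProbe applies one health probe result to the state machine.
func (p *PeerConn) OnProbe(success bool) {
	if success {
		p.State, p.Failures, p.RetryAfter = Ready, 0, 0
		return
	}
	p.Failures++
	if p.Failures >= failuresBeforeBackoff {
		p.State = Backoff
		p.RetryAfter = backoffFor(p.Failures)
		return
	}
	if p.State == Ready || p.State == Degraded {
		p.State = Degraded // previously healthy peer is failing
	} else {
		p.State = Disconnected
	}
}

// backoffFor doubles the delay per extra failure and caps it at maxBackoff.
func backoffFor(failures int) time.Duration {
	d := baseBackoff
	for i := failuresBeforeBackoff; i < failures && d < maxBackoff; i++ {
		d *= 2
	}
	if d > maxBackoff {
		d = maxBackoff
	}
	return d
}

func main() {
	p := NewPeerConn()
	p.BeginProbe()
	for _, ok := range []bool{true, false, false, false, false} {
		p.OnProbe(ok)
		fmt.Println(p.State, p.Failures, p.RetryAfter)
	}
}
```

Capping the delay keeps a long-failing peer probeable at a predictable rate, which is the sense in which the backoff is bounded in this sketch.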
## Canonical Test Environment
- Docker test host: `192.168.200.61`
- SSH alias: `docker-test`
- Docker endpoint: `ssh://docker-test`
- Backend API: `http://192.168.200.61:8080/api/v1`
- Backend gateway: `ws://192.168.200.61:8080/api/v1/gateway/ws`
- C17A required no active runtime deployment. It is implemented in
`rap-node-agent` tests only, behind a disabled-by-default feature flag, and
carries synthetic `fabric.probe` / `fabric.probe_ack` messages only.
- C17B required no active runtime deployment. It is implemented in
`rap-node-agent` tests only, behind the same disabled-by-default feature
flag, and carries synthetic route health probes only.
- C17C required no active runtime deployment. It is implemented in
`rap-node-agent` tests only, behind the same disabled-by-default feature
flag, and models synthetic relay queues/QoS only.
- C17D required no active runtime deployment. It is implemented in
`rap-node-agent` tests only, behind the same disabled-by-default feature
flag, and carries only bounded `synthetic.echo` test-service payloads.
- C17E adds a live node-to-node synthetic HTTP transport skeleton and smoke
harness. It remains behind `RAP_MESH_SYNTHETIC_RUNTIME_ENABLED`, which
defaults to `false`, and does not authorize production mesh, RDP, VPN, file,
video, or service workload traffic.
- C17F adds a scoped synthetic mesh config file boundary, prefers it over
debug JSON, and reports synthetic route-health observations to the existing
mesh links control-plane endpoint when testing flags allow synthetic links.
- C17G adds backend
`/clusters/{clusterID}/nodes/{nodeID}/mesh/synthetic-config` and node-agent
consumption of that config when no local scoped config file is set.
- C17H proves the C17G boundary in a deployed multi-agent `docker-test` smoke
with synthetic traffic only.
- C17I adds an explicit node-agent production forwarding gate while keeping
production forwarding unavailable.
- C17J adds route-bound production envelope validation on `/mesh/v1/forward`
while keeping production forwarding unavailable.
- C17K adds local metadata-only accepted-envelope observation while keeping
production forwarding unavailable.
- C17L adds a bounded local in-memory sink for accepted metadata-only
observations while keeping production forwarding unavailable.
- C17M adds disabled-by-default node-agent wiring for the bounded local
metadata-only observation sink while keeping production forwarding
unavailable.
- C17N adds local metrics for the bounded observation sink while keeping
production forwarding unavailable.
- C17O adds local node-agent logging for bounded observation sink metrics while
keeping production forwarding unavailable.
- C17P adds change-driven suppression for unchanged local bounded observation
sink metrics logs while keeping production forwarding unavailable.
- C17Q adds explicit local log separation for production forwarding gate state
versus runtime state while keeping production forwarding unavailable.
- C17R adds a maximum capacity guard for the local production observation sink
while keeping production forwarding unavailable.
- C17S adds panic-safe fail-closed observation handling while keeping
production forwarding unavailable.
- C17T adds an explicit payload boundary for validated production
`fabric.control` envelopes while keeping production forwarding unavailable.
- C17U adds an explicit future-skew boundary for validated production
`fabric.control` envelope `created_at` while keeping production forwarding
unavailable.
- C17V adds scoped peer endpoint candidates and NAT/connectivity hints to
synthetic mesh config while keeping production forwarding unavailable.
- C17W adds deterministic local scoring for scoped peer endpoint candidates
while keeping production forwarding unavailable.
- C17X adds an optional local health observation overlay for endpoint
candidate scoring while keeping production forwarding unavailable.
- C17Y updates the Platform Owner Control Panel with node-scoped synthetic
mesh config visibility while keeping production forwarding unavailable.
- C17Z adds gate-controlled production `fabric.control` local delivery and
direct next-hop forwarding while keeping service channels unavailable.
- C17Z1 adds route-path-bound production `fabric.control` multi-hop forwarding
while keeping service channels unavailable.
- C17Z2 adds local metadata-only production `fabric.control` forwarding event
logs while keeping service channels unavailable.
- C17Z3 binds production `fabric.control` forwarding to local route config
when configured routes are available while keeping service channels
unavailable.
- C17Z4 adds scoped peer directory and bounded recovery seeds to node-scoped
mesh config while keeping service channels unavailable.
- C17Z5 adds node-local peer cache and warm-peer health probes while keeping
service channels unavailable.
- C17Z6 adds explicit advertised endpoint reporting and scoped config
projection while keeping service channels unavailable.
- C17Z7 adds multiple public/private/corporate endpoint candidates and
same-site scoring while keeping service channels unavailable.
- C17Z8 adds node-local warm-peer connection states and backoff while keeping
service channels unavailable.
- C17Z9 adds bounded node-local peer recovery planning while keeping service
channels unavailable.
- C17Z10 adds node-local peer connection intent and transport readiness
metadata while keeping service channels unavailable.
- C17Z11 adds a real node-local peer connection manager for control-plane
health while keeping service channels unavailable.
- C17Z12 adds node-scoped rendezvous/relay control-plane leases while keeping
service channels unavailable.
- C17Z13 adds rendezvous lease telemetry while keeping service channels
unavailable.
- C17Z14 adds node-scoped lease refresh/reload and stale relay telemetry while
keeping service channels unavailable.
- C17Z15 adds backend stale-relay replacement policy and alternate relay
scoring while keeping service channels unavailable.
- C17Z16 adds Control Plane route/path decision metadata while keeping service
channels unavailable.
- C17Z17 adds node-side route generation apply/withdraw metadata while keeping
service channels unavailable.
- C18 completed design/planning only and did not implement VPN/IP tunnel
runtime. C18A completed control-plane data model foundation only. C18B
completed lease/fencing control-plane hardening only. C18C completed
node-agent desired-state consumption/reporting only. C18D, if accepted, must
remain credential/config resolver boundary work and must not implement real
VPN/IP tunnel runtime without a separate explicit prompt.
- Latest RDP performance reference image:
`rap-rdp-worker:rdp-perf6-dirty-region`
- Stage 5.2 file-download runtime artifacts remain preserved for when RDP work
resumes, but they are not the active next task.
## Proven Baseline
### Backend
- Go backend builds and tests pass.
- PostgreSQL is the source of truth.
- Redis is live coordination/routing only.
- Auth foundation exists.
- Refresh rotation, auth sessions, devices, and trusted devices exist.
- Multi-tenant organization foundation exists.
- Resources and remote sessions are organization-scoped.
- Platform-core v2 models exist.
- Identity source foundation exists.
- Node and node-agent control-plane foundation exists.
- Session broker lifecycle is implemented.
- Worker coordination and stale worker monitoring are implemented.
- Structured localization-ready messaging exists.
- Per-resource `certificate_verification_mode = strict | ignore` exists.
- `strict` remains default.
- Clipboard policy mode exists.
- File transfer policy mode exists.
- Data-plane token/candidate generation exists.
- Production resource secret-readiness guard exists:
- `APP_ENV=production` rejects plaintext credential-like resource metadata.
- RDP/VNC/SSH resources require `secret_ref` in production.
- development and smoke paths may still use plaintext metadata explicitly.
- Encrypted PostgreSQL-backed resource secret storage/resolver MVP exists:
- `resource_secrets` stores ciphertext, nonce, key id, algorithm, version,
safe metadata, and `payload_sha256`.
- `PUT /api/v1/resources/{resourceID}/secret` creates/rotates a resource
secret without returning plaintext.
- session assignment resolves `secret_ref` only after organization, resource,
session, worker, and lease checks.
- Production direct worker WSS TLS/PKI guard exists:
- backend direct candidates advertise `tls_trust_mode`, `production_trusted`,
`smoke_only`, and optional `tls_ca_ref` metadata.
- production backend omits smoke-only direct candidates and keeps backend
gateway fallback.
- Windows client skips untrusted/smoke-only direct candidates in production.
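
The production direct worker WSS guard above can be illustrated with a small candidate filter. This is a minimal sketch, not the project's code: the JSON field names `tls_trust_mode`, `production_trusted`, `smoke_only`, and `tls_ca_ref` come from the notes above, while the `Candidate` struct, the `Kind` values, the example URLs, and `FilterCandidates` are hypothetical names introduced for illustration.

```go
package main

import "fmt"

// Candidate is a hypothetical shape for a data-plane candidate. Only the
// TLS-related JSON field names are taken from the status notes.
type Candidate struct {
	Kind              string `json:"kind"` // assumed values: "direct_worker_wss", "backend_gateway"
	URL               string `json:"url"`
	TLSTrustMode      string `json:"tls_trust_mode"`
	ProductionTrusted bool   `json:"production_trusted"`
	SmokeOnly         bool   `json:"smoke_only"`
	TLSCARef          string `json:"tls_ca_ref,omitempty"`
}

// FilterCandidates drops smoke-only or untrusted direct candidates when
// production is true. The backend gateway fallback is always kept.
func FilterCandidates(in []Candidate, production bool) []Candidate {
	out := make([]Candidate, 0, len(in))
	for _, c := range in {
		if c.Kind == "backend_gateway" {
			out = append(out, c)
			continue
		}
		if production && (c.SmokeOnly || !c.ProductionTrusted) {
			continue
		}
		out = append(out, c)
	}
	return out
}

func main() {
	cands := []Candidate{
		{Kind: "direct_worker_wss", URL: "wss://worker.example/attach", SmokeOnly: true},
		{Kind: "backend_gateway", URL: "wss://backend.example/gateway"},
	}
	for _, c := range FilterCandidates(cands, true) {
		fmt.Println(c.Kind, c.URL)
	}
}
```

Keeping the gateway candidate unconditionally reflects the stated rule that backend gateway fallback remains available even when every direct candidate is filtered out.
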
### Worker / RDP Adapter
- Worker Docker build is reproducible.
- C++ worker is the active RDP runtime.
- FreeRDP is behind the RDP Adapter boundary.
- Worker registration, assignment consumption, heartbeat, leases, and worker
events are implemented.
- Real RDP connection works.
- Detach/reattach/takeover/terminate/failure flows are proven.
- Reattach/takeover do not recreate the remote RDP session.
- Worker death/orphan active-session recovery is proven.
- Direct worker WSS endpoint exists.
- RS256 data-plane token validation exists.
- Current attachment/controller binding is enforced.
- Backend gateway fallback remains available.
- Direct binary `RAP2` render frames exist.
- Region-first BGRA rendering is the current stable path.
- Direct attach baseline full-frame repair exists.
- Region-loss full-frame repair exists.
- Ordered dirty-region delivery is accepted through `SessionRuntime`, worker
direct WSS, Windows transport, and WPF presenter queues.
- Cursor adapter boundary exists.
- Text clipboard through FreeRDP `cliprdr` is accepted.
- Client-to-server file upload to controlled worker storage is accepted.
- Restricted transfer-drive visibility through FreeRDP RDPDR is runtime-proven:
uploaded files are visible and openable inside the remote Windows session via
`RAP_Transfers`.
- Backend gateway fallback/debug frame state reconstructs a full framebuffer by
patching accepted region updates into Redis live state, so fallback screenshots
are not left with region-sized payloads after region-first rendering.
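
The framebuffer reconstruction described in the last item can be sketched as a row-by-row copy of a dirty region into a full BGRA frame. This is an illustrative sketch only: the `Region` type and `PatchRegion` function are hypothetical, tightly packed 4-byte BGRA pixels are assumed, and the Redis storage of the live state is left out.

```go
package main

import (
	"errors"
	"fmt"
)

const bytesPerPixel = 4 // BGRA

// Region is a hypothetical dirty-region update: a rectangle of tightly
// packed BGRA pixels positioned inside the full frame.
type Region struct {
	X, Y, W, H int
	Pixels     []byte // expected length: W*H*bytesPerPixel
}

// PatchRegion copies one region into a full BGRA framebuffer of size
// frameW x frameH and rejects updates that do not fit.
func PatchRegion(frame []byte, frameW, frameH int, r Region) error {
	if len(frame) != frameW*frameH*bytesPerPixel {
		return errors.New("framebuffer size mismatch")
	}
	if r.W <= 0 || r.H <= 0 || r.X < 0 || r.Y < 0 || r.X+r.W > frameW || r.Y+r.H > frameH {
		return errors.New("region out of bounds")
	}
	if len(r.Pixels) != r.W*r.H*bytesPerPixel {
		return errors.New("region payload size mismatch")
	}
	rowBytes := r.W * bytesPerPixel
	for row := 0; row < r.H; row++ {
		dst := ((r.Y+row)*frameW + r.X) * bytesPerPixel
		src := row * rowBytes
		copy(frame[dst:dst+rowBytes], r.Pixels[src:src+rowBytes])
	}
	return nil
}

func main() {
	frame := make([]byte, 4*4*bytesPerPixel)
	region := Region{X: 1, Y: 1, W: 2, H: 2, Pixels: make([]byte, 2*2*bytesPerPixel)}
	for i := range region.Pixels {
		region.Pixels[i] = 0xFF
	}
	if err := PatchRegion(frame, 4, 4, region); err != nil {
		fmt.Println("patch failed:", err)
		return
	}
	fmt.Println("region patched into full frame")
}
```

Because the full frame is patched in place, a fallback screenshot taken after any number of region updates still has full-frame dimensions.
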
### Windows Client
- Windows WPF client builds.
- Login, refresh, logout, organization selection, resource list, active sessions,
and session window exist.
- Direct worker WSS selection exists with automatic backend gateway fallback.
- Binary direct render receive path exists.
- Real remote desktop is visible.
- Keyboard and mouse input are usable after RDP adapter hardening.
- Session window lifecycle is stable enough for current smoke work.
- Localization-ready resources and structured backend message resolution exist.
- Text clipboard UI/path exists.
- File upload UI/path exists.
## Current Known Gaps
- C1 cluster/node backend foundation is implemented. Remaining platform-core
work continued through C2: production enrollment hardening and node-agent
enrollment API. Runtime reports for C1 and later stages:
- `artifacts/c1-cluster-node-foundation-report.md`.
- `artifacts/c2-node-enrollment-hardening-report.md`.
- `artifacts/c3-rap-node-agent-mvp-report.md`.
- `artifacts/c4-platform-admin-console-report.md`.
- `artifacts/c5-service-workload-supervision-contract-report.md`.
- `artifacts/c6-mesh-control-plane-preparation-report.md`.
- `artifacts/c7-mesh-mvp-skeleton-report.md`.
- `artifacts/c8-multi-cluster-hardening-report.md`.
- `artifacts/c9-organization-admin-foundation-report.md`.
- `artifacts/c10-fabric-core-config-distribution-design-report.md`.
- `artifacts/c11-signed-scoped-cluster-snapshot-model-report.md`.
- `artifacts/c12-node-local-state-store-report.md`.
- `artifacts/c13-fabric-storage-config-service-report.md`.
- `artifacts/c14-peer-directory-cache-model-report.md`.
- `artifacts/c15-fabric-routing-engine-skeleton-report.md`.
- `artifacts/c16-secure-node-to-node-channel-lifecycle-report.md`.
- `artifacts/c17-mesh-routing-runtime-implementation-plan-report.md`.
- `artifacts/c17a-synthetic-mesh-runtime-skeleton-report.md`.
- `artifacts/c17b-route-health-failover-probes-report.md`.
- `artifacts/c17c-relay-semantic-hardening-report.md`.
- `artifacts/c17d-non-production-test-service-path-report.md`.
- `artifacts/c17e-live-node-to-node-synthetic-transport-report.md`.
- `artifacts/c17f-scoped-synthetic-route-config-report.md`.
- `artifacts/c17g-control-plane-scoped-synthetic-config-report.md`.
- `artifacts/c17v-peer-endpoint-candidate-model-report.md`.
- `artifacts/c17w-peer-endpoint-candidate-scoring-report.md`.
- `artifacts/c17x-health-aware-endpoint-candidate-scoring-report.md`.
- `artifacts/c17y-platform-owner-synthetic-mesh-visibility-report.md`.
- `artifacts/c18-vpn-ip-tunnel-service-target-design-report.md`.
- `artifacts/c18a-vpn-control-plane-data-model-report.md`.
- `artifacts/c18b-vpn-lease-fencing-hardening-report.md`.
- `artifacts/c18c-vpn-node-agent-desired-state-report.md`.
- RDP correctness baseline is accepted. Remaining visual/performance limitation:
window drag behaves like older/slow-link RDP clients by showing a drag frame,
and repaint after releasing a moved window is workable but not yet polished.
- RDPGFX is gated and disabled by default because the current live target resets
the connection when RDPGFX is advertised.
- Encoded graphics/codecs/tiles are not production-accepted.
- Server-to-client file download core data path is runtime-proven through both
direct worker WSS and backend gateway fallback. Stage 5.2 lifecycle blocking
is runtime-proven for detach, old-client takeover, and worker failure. Stage
5.2 still needs manual Windows-client UI proof before full runtime
acceptance.
- External KMS/Vault integration and master-key rotation are not implemented.
- Worker production assignment currently receives resolved credentials through
transient assignment metadata; a future resolver pull/token flow should reduce
Redis control-queue exposure.
- The current dev/smoke RDP proof path can still resolve credentials from
resource metadata outside production.
- Production direct-worker certificate issuance/rotation and platform CA
distribution are not automated yet.
- Backend test coverage is thin outside `sessionbroker`.
- Windows client automated tests are missing.
- Worker probes exist, but a full automated adapter conformance suite does not.
- Several documents were updated during P0, but future work must keep them in
sync after every accepted stage.
## Current Verification Snapshot
Additional C17A synthetic mesh runtime skeleton verification:
```powershell
go test ./...
```
Run from:
```powershell
agents\rap-node-agent
```
Result:
- direct synthetic `fabric.probe` / `fabric.probe_ack`: PASS
- single-relay synthetic `fabric.probe` / `fabric.probe_ack`: PASS
- feature flag / kill-switch disabled path: PASS
- wrong cluster rejection: PASS
- wrong node rejection: PASS
- unauthorized channel rejection: PASS
- expired route rejection: PASS
- TTL exhaustion rejection: PASS
- loop rejection: PASS
- unavailable peer rejection: PASS
- existing mesh health and production-forwarding-disabled behavior: PASS
- RDP/runtime/data-plane behavior changed: no
- production service traffic over mesh: no
Additional C17B route health and failover probe verification:
```powershell
go test ./...
```
Run from:
```powershell
agents\rap-node-agent
```
Result:
- C17A direct/single-relay synthetic probes remain intact: PASS
- route health `fabric.route_health` / `fabric.route_health_ack`: PASS
- local route success observation: PASS
- local route failure observation: PASS
- preferred route failure with fallback route use: PASS
- warm fallback route promotion metric/log boundary: PASS
- route cache invalidation on policy version change: PASS
- same-version route cache preservation: PASS
- feature flag / kill-switch disabled path: PASS
- RDP/runtime/data-plane behavior changed: no
- production service traffic over mesh: no
Additional C17C relay semantic hardening verification:
```powershell
go test ./...
```
Run from:
```powershell
agents\rap-node-agent
```
Result:
- C17A direct/single-relay synthetic probes remain intact: PASS
- C17B route health/failover probes remain intact: PASS
- synthetic relay envelope validation: PASS
- QoS dequeue order `fabric_control` > `route_control` > `telemetry`: PASS
- telemetry backpressure drops oldest stale telemetry only: PASS
- reliable fabric/control queue full rejects instead of dropping: PASS
- relay rejects wrong cluster, wrong node, unauthorized channel, unsupported
message, TTL exhaustion, and loop: PASS
- relay disabled / kill-switch path: PASS
- relay queue depth metrics: PASS
- RDP/runtime/data-plane behavior changed: no
- production service traffic over mesh: no
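
The relay QoS and backpressure rules verified above can be sketched as one bounded FIFO per traffic class. This is a simplified illustration, not the node-agent implementation: `RelayQueue`, `Class`, and the per-class capacity are hypothetical, and the sketch drops the oldest telemetry entry without checking whether it is stale.

```go
package main

import (
	"errors"
	"fmt"
)

// Class orders traffic by dequeue priority: lower value is served first.
type Class int

const (
	FabricControl Class = iota // highest priority, reliable
	RouteControl               // reliable
	Telemetry                  // lowest priority, droppable
)

var ErrQueueFull = errors.New("reliable queue full")

// RelayQueue keeps one bounded FIFO per class.
type RelayQueue struct {
	capacity int
	queues   [3][]string
	Dropped  int // telemetry entries dropped under backpressure
}

// NewRelayQueue clamps capacity to at least one entry per class.
func NewRelayQueue(capacity int) *RelayQueue {
	if capacity < 1 {
		capacity = 1
	}
	return &RelayQueue{capacity: capacity}
}

// Enqueue rejects when a reliable class is full and drops the oldest
// entry when the telemetry class is full.
func (q *RelayQueue) Enqueue(c Class, msg string) error {
	if len(q.queues[c]) >= q.capacity {
		if c != Telemetry {
			return ErrQueueFull
		}
		q.queues[c] = q.queues[c][1:]
		q.Dropped++
	}
	q.queues[c] = append(q.queues[c], msg)
	return nil
}

// Dequeue returns the oldest message of the highest-priority non-empty class.
func (q *RelayQueue) Dequeue() (string, bool) {
	for c := range q.queues {
		if len(q.queues[c]) > 0 {
			msg := q.queues[c][0]
			q.queues[c] = q.queues[c][1:]
			return msg, true
		}
	}
	return "", false
}

func main() {
	q := NewRelayQueue(2)
	_ = q.Enqueue(Telemetry, "t1")
	_ = q.Enqueue(RouteControl, "r1")
	_ = q.Enqueue(FabricControl, "f1")
	for {
		msg, ok := q.Dequeue()
		if !ok {
			break
		}
		fmt.Println(msg)
	}
}
```
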
Additional C17D non-production test-service path verification:
```powershell
go test ./...
```
Run from:
```powershell
agents\rap-node-agent
```
Result:
- C17A direct/single-relay synthetic probes remain intact: PASS
- C17B route health/failover probes remain intact: PASS
- C17C relay validation/QoS/backpressure remains intact: PASS
- direct `synthetic.echo` service-path test: PASS
- single-relay `synthetic.echo` service-path test: PASS
- forced fallback `synthetic.echo` service-path test: PASS
- bounded payload max-size behavior: PASS
- wrong organization rejected: PASS
- unsupported service type rejected: PASS
- oversized payload rejected: PASS
- unauthorized channel rejected: PASS
- missing request id rejected: PASS
- runtime disabled / kill-switch path: PASS
- RDP/runtime/data-plane behavior changed: no
- production service traffic over mesh: no
Additional C17E live node-to-node synthetic transport verification:
```powershell
go test ./...
go run ./cmd/mesh-live-smoke
go build -o bin/rap-node-agent.exe ./cmd/rap-node-agent
go build -o bin/mesh-live-smoke.exe ./cmd/mesh-live-smoke
```
Run from:
```powershell
agents\rap-node-agent
```
Result:
- live HTTP peer transport for synthetic envelopes: PASS
- direct `node-a -> node-b` synthetic probe over HTTP endpoints: PASS
- single-relay `node-a -> node-r -> node-b` synthetic probe over HTTP endpoints:
PASS
- bounded `synthetic.echo` test-service over relay HTTP path: PASS
- disabled-by-default `rap-node-agent` synthetic mesh endpoint: PASS
- production `/mesh/v1/forward` remains disabled: PASS
- RDP/runtime/data-plane behavior changed: no
- production service traffic over mesh: no
Additional C17F scoped synthetic route config verification:
```powershell
go test ./...
go run ./cmd/mesh-live-smoke
go build -o bin/rap-node-agent.exe ./cmd/rap-node-agent
go build -o bin/mesh-live-smoke.exe ./cmd/mesh-live-smoke
```
Run from:
```powershell
agents\rap-node-agent
```
Result:
- scoped synthetic config file load: PASS
- wrong cluster rejected: PASS
- wrong node rejected: PASS
- expired route rejected: PASS
- scoped config preferred over debug JSON: PASS
- synthetic route-health reporting boundary added: PASS
- C17E live direct/relay smoke remains intact: PASS
- production `/mesh/v1/forward` remains disabled: PASS
- RDP/runtime/data-plane behavior changed: no
- production service traffic over mesh: no
Additional C17G Control Plane scoped synthetic config verification:
```powershell
go test ./...
```
Run from:
```powershell
backend
agents\rap-node-agent
```
Result:
- backend node-scoped synthetic config endpoint/service: PASS
- disabled testing flag returns no routes and no peer endpoints: PASS
- unrelated route intent does not leak to requesting node: PASS
- production forwarding remains false in config: PASS
- node-agent consumes Control Plane config when local scoped config file is not
set: PASS
- local `RAP_MESH_SYNTHETIC_CONFIG` remains preferred debug fallback: PASS
- C17F live direct/relay smoke remains intact: PASS
- RDP/runtime/data-plane behavior changed: no
- production service traffic over mesh: no
Additional C17H deployed multi-agent synthetic config smoke verification:
```powershell
powershell -NoProfile -ExecutionPolicy Bypass -File scripts\fabric\c17h-multi-agent-synthetic-smoke-ssh.ps1 -KeepRunning
go test ./...
```
Run from:
```powershell
backend
agents\rap-node-agent
```
Result:
- deployed backend on `docker-test` stayed ready at `http://192.168.200.61:18080/api/v1`: PASS
- five running `rap-node-agent` containers loaded `source=control_plane`
synthetic config: PASS
- scoped config route counts: node-a=2, node-r=1, node-b=1, node-c=1,
node-idle=0: PASS
- direct route-health observation reported reachable to Control Plane: PASS
- single-relay route-health observation reported reachable to Control Plane:
PASS
- Platform Owner cluster summary showed 5 nodes and 5 healthy nodes: PASS
- all scoped configs kept `production_forwarding=false`: PASS
- backend `go test ./...`: PASS
- node-agent `go test ./...`: PASS
- RDP/runtime/data-plane behavior changed: no
- production service traffic over mesh: no
C17H runtime report:
- `artifacts/c17h-deployed-multi-agent-synthetic-config-smoke-report.md`
Additional C17I production forwarding gate verification:
```powershell
go test ./...
```
Run from:
```powershell
agents\rap-node-agent
backend
```
Result:
- `RAP_MESH_PRODUCTION_FORWARDING_ENABLED` config gate: PASS
- `/mesh/v1/forward` remains disabled by default: PASS
- explicit gate enabled still reports unavailable production runtime: PASS
- backend tests: PASS
- RDP/runtime/data-plane behavior changed: no
- production service traffic over mesh: no
C17I report:
- `artifacts/c17i-production-forwarding-gate-report.md`
Additional C17J production envelope contract verification:
```powershell
go test ./...
```
Run from:
```powershell
agents\rap-node-agent
backend
```
Result:
- route-bound production envelope contract validation: PASS
- invalid payload hash rejected: PASS
- service channel rejected: PASS
- gate-enabled `/mesh/v1/forward` still returns unavailable runtime after
successful validation: PASS
- backend tests: PASS
- RDP/runtime/data-plane behavior changed: no
- production service traffic over mesh: no
C17J report:
- `artifacts/c17j-production-envelope-contract-report.md`
Additional C17K production envelope observation verification:
```powershell
go test ./...
```
Run from:
```powershell
agents\rap-node-agent
backend
```
Result:
- valid production envelope triggers metadata-only observation: PASS
- rejected production envelope does not trigger observation: PASS
- observation failure fails closed: PASS
- gate-enabled `/mesh/v1/forward` still returns unavailable runtime after
validation/observation: PASS
- backend tests: PASS
- RDP/runtime/data-plane behavior changed: no
- production service traffic over mesh: no
C17K report:
- `artifacts/c17k-production-envelope-observation-report.md`
Additional C17L bounded production observation sink verification:
```powershell
go test ./...
```
Run from:
```powershell
agents\rap-node-agent
```
Result:
- bounded metadata-only sink stores accepted observations: PASS
- oldest observation is dropped when capacity is exceeded: PASS
- payload hash/length metadata is preserved: PASS
- payload body is not stored by the sink: PASS
- gate-enabled `/mesh/v1/forward` still returns unavailable runtime after
validation/observation/sink storage: PASS
- RDP/runtime/data-plane behavior changed: no
- production service traffic over mesh: no
C17L report:
- `artifacts/c17l-bounded-production-observation-sink-report.md`
Additional C17M production observation sink wiring verification:
```powershell
go test ./...
```
Run from:
```powershell
agents\rap-node-agent
```
Result:
- `RAP_MESH_PRODUCTION_OBSERVATION_SINK_CAPACITY` config loading: PASS
- negative sink capacity rejected: PASS
- observer wiring disabled by default: PASS
- observer wiring created only when capacity is greater than zero: PASS
- existing mesh/forwarding tests remain green: PASS
- RDP/runtime/data-plane behavior changed: no
- production service traffic over mesh: no
C17M report:
- `artifacts/c17m-production-observation-sink-wiring-report.md`
Additional C17N production observation sink metrics verification:
```powershell
go test ./...
```
Run from:
```powershell
agents\rap-node-agent
```
Result:
- empty sink metrics start at zero depth/accepted/dropped: PASS
- bounded sink metrics track capacity/current depth: PASS
- accepted observations increment accepted total: PASS
- oldest-entry drop increments dropped total: PASS
- existing mesh/forwarding tests remain green: PASS
- RDP/runtime/data-plane behavior changed: no
- production service traffic over mesh: no
C17N report:
- `artifacts/c17n-production-observation-sink-metrics-report.md`
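
The bounded metadata-only sink (C17L) and its counters (C17N) can be sketched together. The type and field names below are hypothetical; the behavior shown is only what the results above state: metadata without payload body, oldest entry dropped at capacity, and accepted/dropped totals plus capacity and depth.

```go
package main

import (
	"fmt"
	"sync"
)

// Observation holds metadata only. There is deliberately no payload body field.
type Observation struct {
	EnvelopeID    string
	PayloadSHA256 string
	PayloadLength int
}

// SinkMetrics is a point-in-time snapshot of the sink counters.
type SinkMetrics struct {
	Capacity      int
	Depth         int
	AcceptedTotal uint64
	DroppedTotal  uint64
}

// BoundedSink stores at most capacity observations.
type BoundedSink struct {
	mu       sync.Mutex
	capacity int
	items    []Observation
	accepted uint64
	dropped  uint64
}

// NewBoundedSink returns nil when capacity is not positive, mirroring the
// rule that the observer is wired only when capacity is greater than zero.
func NewBoundedSink(capacity int) *BoundedSink {
	if capacity <= 0 {
		return nil
	}
	return &BoundedSink{capacity: capacity, items: make([]Observation, 0, capacity)}
}

// Observe stores one observation, dropping the oldest entry when full.
func (s *BoundedSink) Observe(o Observation) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if len(s.items) >= s.capacity {
		copy(s.items, s.items[1:])
		s.items = s.items[:len(s.items)-1]
		s.dropped++
	}
	s.items = append(s.items, o)
	s.accepted++
}

// Metrics returns the current counters.
func (s *BoundedSink) Metrics() SinkMetrics {
	s.mu.Lock()
	defer s.mu.Unlock()
	return SinkMetrics{Capacity: s.capacity, Depth: len(s.items), AcceptedTotal: s.accepted, DroppedTotal: s.dropped}
}

func main() {
	sink := NewBoundedSink(2)
	for _, id := range []string{"e1", "e2", "e3"} {
		sink.Observe(Observation{EnvelopeID: id, PayloadLength: 16})
	}
	fmt.Printf("%+v\n", sink.Metrics())
}
```
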
Additional C17O production observation sink local metrics log verification:
```powershell
go test ./...
```
Run from:
```powershell
agents\rap-node-agent
```
Result:
- sink wiring remains disabled by default: PASS
- enabled sink keeps configured capacity: PASS
- nil mesh state metrics logging is safe: PASS
- existing mesh/forwarding tests remain green: PASS
- RDP/runtime/data-plane behavior changed: no
- production service traffic over mesh: no
C17O report:
- `artifacts/c17o-production-observation-sink-local-metrics-log-report.md`
Additional C17P production observation sink change-driven metrics log
verification:
```powershell
go test ./...
```
Run from:
```powershell
agents\rap-node-agent
```
Result:
- first metrics snapshot logs locally: PASS
- unchanged metrics do not log again: PASS
- changed metrics log again: PASS
- metrics equality helper detects identical/different snapshots: PASS
- existing mesh/forwarding tests remain green: PASS
- RDP/runtime/data-plane behavior changed: no
- production service traffic over mesh: no
C17P report:
- `artifacts/c17p-production-observation-sink-change-driven-metrics-log-report.md`
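
The change-driven logging behavior above can be sketched with a comparable snapshot struct and a logger that remembers the last emitted value. `Metrics`, `ChangeLogger`, and the `emit` callback are hypothetical names; the equality helper here is simply Go struct comparison.

```go
package main

import "fmt"

// Metrics is a comparable snapshot, so == acts as the equality helper.
type Metrics struct {
	Capacity      int
	Depth         int
	AcceptedTotal uint64
	DroppedTotal  uint64
}

// ChangeLogger emits a snapshot only when it differs from the last one.
type ChangeLogger struct {
	last   Metrics
	logged bool
	emit   func(Metrics)
}

// Log reports whether the snapshot was emitted.
func (l *ChangeLogger) Log(m Metrics) bool {
	if l.logged && l.last == m {
		return false
	}
	l.last, l.logged = m, true
	if l.emit != nil {
		l.emit(m)
	}
	return true
}

func main() {
	logger := &ChangeLogger{emit: func(m Metrics) {
		fmt.Printf("sink metrics: %+v\n", m)
	}}
	logger.Log(Metrics{Capacity: 8})                             // first snapshot: emitted
	logger.Log(Metrics{Capacity: 8})                             // unchanged: suppressed
	logger.Log(Metrics{Capacity: 8, Depth: 1, AcceptedTotal: 1}) // changed: emitted
}
```
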
Additional C17Q production forwarding gate/runtime log boundary verification:
```powershell
go test ./...
```
Run from:
```powershell
agents\rap-node-agent
```
Result:
- production forwarding gate state is logged separately from runtime state:
PASS
- default gate/runtime log state remains false/false: PASS
- gate-enabled log state remains true/false: PASS
- existing mesh/forwarding tests remain green: PASS
- RDP/runtime/data-plane behavior changed: no
- production service traffic over mesh: no
C17Q report:
- `artifacts/c17q-production-forwarding-gate-runtime-log-boundary-report.md`
Additional C17R production observation sink capacity guard verification:
```powershell
go test ./...
```
Run from:
```powershell
agents\rap-node-agent
```
Result:
- normal sink capacity config still loads: PASS
- negative sink capacity rejected: PASS
- too-large sink capacity rejected: PASS
- existing mesh/forwarding tests remain green: PASS
- RDP/runtime/data-plane behavior changed: no
- production service traffic over mesh: no
C17R report:
- `artifacts/c17r-production-observation-sink-capacity-guard-report.md`
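
A capacity guard of this kind might look like the sketch below. The environment variable name comes from the C17M notes; `ParseSinkCapacity` and the upper bound `maxSinkCapacity` are assumptions, since the real limit is not stated here.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// maxSinkCapacity is an assumed upper bound for illustration only.
const maxSinkCapacity = 4096

// ParseSinkCapacity validates a raw capacity value. An empty value means
// the sink stays disabled (capacity zero).
func ParseSinkCapacity(raw string) (int, error) {
	if raw == "" {
		return 0, nil
	}
	n, err := strconv.Atoi(raw)
	if err != nil {
		return 0, fmt.Errorf("invalid sink capacity %q: %w", raw, err)
	}
	if n < 0 {
		return 0, fmt.Errorf("sink capacity %d must not be negative", n)
	}
	if n > maxSinkCapacity {
		return 0, fmt.Errorf("sink capacity %d exceeds maximum %d", n, maxSinkCapacity)
	}
	return n, nil
}

func main() {
	capacity, err := ParseSinkCapacity(os.Getenv("RAP_MESH_PRODUCTION_OBSERVATION_SINK_CAPACITY"))
	if err != nil {
		fmt.Println("config rejected:", err)
		return
	}
	fmt.Println("sink capacity:", capacity)
}
```
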
Additional C17S production observation panic fail-closed verification:
```powershell
go test ./...
```
Run from:
```powershell
agents\rap-node-agent
```
Result:
- observer error fails closed: PASS
- observer panic fails closed: PASS
- nil observer is allowed: PASS
- rejected envelopes are not observed: PASS
- existing mesh/forwarding tests remain green: PASS
- RDP/runtime/data-plane behavior changed: no
- production service traffic over mesh: no
C17S report:
- `artifacts/c17s-production-observation-panic-fail-closed-report.md`
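
The fail-closed handling of observer errors and panics can be sketched with a deferred `recover` that converts a panic into an error. `Observer` and `ObserveFailClosed` are hypothetical names; the caller is assumed to refuse forwarding whenever the returned error is non-nil.

```go
package main

import (
	"errors"
	"fmt"
)

// Observer receives metadata about an already validated envelope.
type Observer func(envelopeID string) error

// ObserveFailClosed runs the observer and turns a panic into an error so
// that the caller can reject the envelope instead of crashing or forwarding.
// A nil observer is allowed and reports success.
func ObserveFailClosed(obs Observer, envelopeID string) (err error) {
	if obs == nil {
		return nil
	}
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("observer panic: %v", r)
		}
	}()
	return obs(envelopeID)
}

func main() {
	failing := func(string) error { return errors.New("sink unavailable") }
	panicking := func(string) error { panic("boom") }

	fmt.Println("nil observer allowed:", ObserveFailClosed(nil, "env-1") == nil)
	fmt.Println("error fails closed:", ObserveFailClosed(failing, "env-1") != nil)
	fmt.Println("panic fails closed:", ObserveFailClosed(panicking, "env-1") != nil)
}
```
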
Additional C17T production envelope payload boundary verification:
```powershell
go test ./...
```
Run from:
```powershell
agents\rap-node-agent
```
Result:
- valid fabric-control envelope still passes validation/observation: PASS
- oversized fabric-control envelope rejected: PASS
- oversized rejected envelope does not call observer: PASS
- invalid payload hash still rejected: PASS
- service channel still rejected: PASS
- existing mesh/forwarding tests remain green: PASS
- RDP/runtime/data-plane behavior changed: no
- production service traffic over mesh: no
C17T report:
- `artifacts/c17t-production-envelope-payload-boundary-report.md`
Additional C17U production envelope created-at skew verification:
```powershell
go test ./...
```
Run from:
```powershell
agents\rap-node-agent
```
Result:
- valid fabric-control envelope still passes validation/observation: PASS
- future-created fabric-control envelope rejected: PASS
- future-created rejected envelope does not call observer: PASS
- existing payload/hash/channel/time validation remains green: PASS
- RDP/runtime/data-plane behavior changed: no
- production service traffic over mesh: no
C17U report:
- `artifacts/c17u-production-envelope-created-at-skew-report.md`
Additional C17V peer endpoint candidate model verification:
```powershell
go test ./...
```
Run from:
```powershell
backend
agents\rap-node-agent
```
Result:
- backend synthetic config includes route-scoped peer endpoint candidates:
PASS
- unrelated peer endpoints and endpoint candidates do not leak across route
paths: PASS
- candidate validation rejects unknown transport/NAT, route-path mismatch,
node mismatch, and invalid metadata: PASS
- node-agent scoped config loads valid peer endpoint candidates: PASS
- node-agent scoped config rejects invalid peer endpoint candidates: PASS
- production forwarding remained unavailable: PASS
- production service traffic over mesh: no
C17V report:
- `artifacts/c17v-peer-endpoint-candidate-model-report.md`
Additional C17W peer endpoint candidate scoring verification:
```powershell
go test ./...
```
Run from:
```powershell
agents\rap-node-agent
backend
```
Result:
- direct/fresh/public endpoint candidate ranks above relay and stale
candidates: PASS
- tie-breaking by priority/node/endpoint is deterministic: PASS
- relay and outbound fallback candidates are retained instead of dropped:
PASS
- production forwarding remained unavailable: PASS
- production service traffic over mesh: no
C17W report:
- `artifacts/c17w-peer-endpoint-candidate-scoring-report.md`
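
Deterministic candidate ranking with explicit tie-breaking can be sketched as a stable sort over a copied slice. The `EndpointCandidate` fields, the score weights, and the assumption that a lower `Priority` value ranks first are all illustrative; only the ordering rules (direct, fresh, public first; ties broken by priority, node, endpoint; nothing dropped) come from the results above.

```go
package main

import (
	"fmt"
	"sort"
)

// EndpointCandidate is a hypothetical candidate shape.
type EndpointCandidate struct {
	NodeID   string
	Endpoint string
	Priority int // assumed: lower value ranks first
	Direct   bool
	Public   bool
	Stale    bool
}

// score uses assumed weights: direct and public add, stale subtracts.
func score(c EndpointCandidate) int {
	s := 0
	if c.Direct {
		s += 100
	}
	if c.Public {
		s += 20
	}
	if c.Stale {
		s -= 50
	}
	return s
}

// RankCandidates returns a sorted copy. Relay and stale candidates are
// ranked lower but retained as fallbacks.
func RankCandidates(in []EndpointCandidate) []EndpointCandidate {
	out := append([]EndpointCandidate(nil), in...)
	sort.SliceStable(out, func(i, j int) bool {
		a, b := out[i], out[j]
		if sa, sb := score(a), score(b); sa != sb {
			return sa > sb
		}
		if a.Priority != b.Priority {
			return a.Priority < b.Priority
		}
		if a.NodeID != b.NodeID {
			return a.NodeID < b.NodeID
		}
		return a.Endpoint < b.Endpoint
	})
	return out
}

func main() {
	ranked := RankCandidates([]EndpointCandidate{
		{NodeID: "node-r", Endpoint: "10.0.0.9:7443"},
		{NodeID: "node-b", Endpoint: "203.0.113.7:7443", Direct: true, Public: true, Stale: true},
		{NodeID: "node-b", Endpoint: "203.0.113.8:7443", Direct: true, Public: true},
	})
	for _, c := range ranked {
		fmt.Println(c.NodeID, c.Endpoint, score(c))
	}
}
```
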
Additional C17X health-aware endpoint candidate scoring verification:
```powershell
go test ./...
```
Run from:
```powershell
agents\rap-node-agent
backend
```
Result:
- local health observations can promote lower-latency, high-reliability
candidates: PASS
- failure history and recent failure reasons penalize candidates: PASS
- stale observations do not contribute fresh latency benefits: PASS
- production forwarding remained unavailable: PASS
- production service traffic over mesh: no
C17X report:
- `artifacts/c17x-health-aware-endpoint-candidate-scoring-report.md`
Additional C17Y Platform Owner synthetic mesh visibility verification:
```powershell
cmd /c "pushd \\nas\MST\codex\rdp-proxy\web-admin && npm run build && popd"
go test ./...
```
Run from:
```powershell
web-admin
backend
agents\rap-node-agent
```
Result:
- web-admin TypeScript/Vite build: PASS
- backend tests: PASS
- node-agent tests: PASS
- Platform Owner Fabric page reads node-scoped synthetic mesh config: PASS
- Fabric page shows route/endpoint/candidate counts and production forwarding
state: PASS
- production forwarding remained unavailable: PASS
- RDP/runtime/data-plane behavior changed: no
C17Y report:
- `artifacts/c17y-platform-owner-synthetic-mesh-visibility-report.md`
Additional C17Z production fabric-control direct forwarding verification:
```powershell
go test ./...
cmd /c "pushd \\nas\MST\codex\rdp-proxy\web-admin && npm run build && popd"
```
Run from:
```powershell
agents\rap-node-agent
backend
web-admin
```
Result:
- local destination delivery for valid `fabric.control`: PASS
- direct next-hop forwarding for valid `fabric.control`: PASS
- no-transport path still returns runtime unavailable: PASS
- invalid/hash/channel/time/payload boundaries remain enforced: PASS
- service channels remain rejected: PASS
- web-admin build with updated C17Z boundary wording: PASS
- RDP/runtime service payload behavior changed: no
C17Z report:
- `artifacts/c17z-production-fabric-control-direct-forwarding-report.md`
Additional C17Z1 production fabric-control multi-hop route-path verification:
```powershell
go test ./...
cmd /c "pushd \\nas\MST\codex\rdp-proxy\web-admin && npm run build && popd"
```
Run from:
```powershell
agents\rap-node-agent
backend
web-admin
```
Result:
- route-path-bound multi-hop `fabric.control` forwarding: PASS
- wrong next hop rejected: PASS
- duplicate route path loop rejected: PASS
- visited node metadata propagated to the destination: PASS
- service channels remain rejected: PASS
- RDP/runtime service payload behavior changed: no
C17Z1 report:
- `artifacts/c17z1-production-fabric-control-multihop-route-path-report.md`
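
Route-path-bound forwarding can be sketched as a per-hop decision that checks the declared path before choosing the next hop. `PlanHop`, `Hop`, and the error values are hypothetical names; the sketch covers only the rules listed above: reject a wrong next hop, reject a path with duplicate nodes, and carry visited-node metadata to the destination.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	ErrLoop         = errors.New("route path loop")
	ErrNotOnPath    = errors.New("node is not on the route path")
	ErrWrongNextHop = errors.New("next hop does not match route path")
)

// Hop is the forwarding decision for the local node.
type Hop struct {
	Deliver bool     // true when the local node is the destination
	NextHop string   // set when the envelope must be forwarded
	Visited []string // visited nodes including the local node
}

// PlanHop validates the route path and decides what the node self does.
func PlanHop(routePath, visited []string, self, requestedNextHop string) (Hop, error) {
	seen := make(map[string]bool, len(routePath))
	idx := -1
	for i, n := range routePath {
		if seen[n] {
			return Hop{}, ErrLoop
		}
		seen[n] = true
		if n == self {
			idx = i
		}
	}
	if idx < 0 {
		return Hop{}, ErrNotOnPath
	}
	for _, v := range visited {
		if v == self {
			return Hop{}, ErrLoop
		}
	}
	out := Hop{Visited: append(append([]string(nil), visited...), self)}
	if idx == len(routePath)-1 {
		out.Deliver = true
		return out, nil
	}
	if requestedNextHop != routePath[idx+1] {
		return Hop{}, ErrWrongNextHop
	}
	out.NextHop = routePath[idx+1]
	return out, nil
}

func main() {
	path := []string{"node-a", "node-r", "node-b"}
	hop, err := PlanHop(path, []string{"node-a"}, "node-r", "node-b")
	fmt.Println(hop.NextHop, hop.Visited, err)
}
```
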
Additional C17Z2 production fabric-control forwarding observability
verification:
```powershell
go test ./...
cmd /c "pushd \\nas\MST\codex\rdp-proxy\web-admin && npm run build && popd"
```
Run from:
```powershell
agents\rap-node-agent
backend
web-admin
```
Result:
- accepted production `fabric.control` events logged: PASS
- forwarded production `fabric.control` events logged: PASS
- delivered production `fabric.control` events logged: PASS
- rejected production `fabric.control` events logged: PASS
- payload bodies are not logged: PASS
- service channels remain rejected: PASS
- RDP/runtime service payload behavior changed: no
C17Z2 report:
- `artifacts/c17z2-production-fabric-control-forwarding-observability-report.md`
Additional C17Z3 production fabric-control route-config verification:
```powershell
go test ./...
```
Run from:
```powershell
agents\rap-node-agent
```
Result:
- configured production `fabric.control` route forwarding: PASS
- unknown configured route rejection: PASS
- wrong configured next-hop rejection: PASS
- existing direct and route-path forwarding tests: PASS
- service channels remain rejected: PASS
- RDP/runtime service payload behavior changed: no
C17Z3 report:
- `artifacts/c17z3-production-fabric-control-route-config-boundary-report.md`
Additional C17Z4 scoped peer directory/recovery seed verification:
```powershell
go test ./...
cmd /c "pushd \\nas\MST\codex\rdp-proxy\web-admin && npm run build && popd"
```
Run from:
```powershell
backend
agents\rap-node-agent
web-admin
```
Result:
- backend scoped peer-directory projection: PASS
- backend recovery seed projection/ordering: PASS
- node-agent scoped config validation: PASS
- web-admin peer directory/recovery seed counters: PASS
- RDP/runtime service payload behavior changed: no
C17Z4 report:
- `artifacts/c17z4-scoped-peer-directory-recovery-seeds-report.md`
Additional C17Z5 node-agent peer cache runtime verification:
```powershell
go test ./...
```
Run from:
```powershell
agents\rap-node-agent
```
Result:
- peer cache warm selection: PASS
- recovery seed warm promotion: PASS
- endpoint candidate scoring integration: PASS
- node-agent config loading with peer runtime config: PASS
- warm-peer health probe code compiles in node-agent: PASS
- RDP/runtime service payload behavior changed: no
C17Z5 report:
- `artifacts/c17z5-node-agent-peer-cache-runtime-report.md`
Additional C17Z6 dynamic endpoint reporting verification:
```powershell
go test ./...
```
Run from:
```powershell
agents\rap-node-agent
backend
```
Result:
- node-agent advertised endpoint config loading: PASS
- heartbeat endpoint report payload: PASS
- backend projection of reported endpoint into scoped config: PASS
- backend projection of reported endpoint candidate into scoped config: PASS
- peer directory counts include reported endpoint/candidate: PASS
- RDP/runtime service payload behavior changed: no
C17Z6 report:
- `artifacts/c17z6-dynamic-endpoint-reporting-report.md`
Additional C17Z7 private/corporate endpoint candidate verification:
```powershell
go test ./...
```
Run from:
```powershell
agents\rap-node-agent
backend
```
Result:
- multiple advertised endpoint heartbeat payload: PASS
- private/corporate endpoint candidate preservation: PASS
- corporate/private endpoint scoring preference: PASS
- peer cache selects corporate LAN address for warm health: PASS
- backend tests remain green: PASS
- RDP/runtime service payload behavior changed: no
C17Z7 report:
- `artifacts/c17z7-private-corporate-endpoint-candidates-report.md`
Additional C17Z8 peer connection state-machine verification:
```powershell
go test ./...
```
Run from:
```powershell
agents\rap-node-agent
```
Result:
- ready/degraded transitions: PASS
- repeated-failure backoff transition: PASS
- backoff probe suppression/recovery: PASS
- snapshot state counters: PASS
- node-agent warm-peer health metadata compiles with connection states: PASS
- RDP/runtime service payload behavior changed: no
C17Z8 report:
- `artifacts/c17z8-peer-connection-state-machine-report.md`
Additional C17Z9 peer recovery planner verification:
```powershell
go test ./...
```
Run from:
```powershell
agents\rap-node-agent
```
Result:
- bounded steady ready-peer selection: PASS
- recovery seed candidate selection during ready deficit: PASS
- active backoff candidate suppression: PASS
- target capped by connectable peer count: PASS
- node-agent recovery report metadata compiles: PASS
- RDP/runtime service payload behavior changed: no
C17Z9 report:
- `artifacts/c17z9-peer-recovery-planner-report.md`
Additional C17Z10 peer connection intent planner verification:
```powershell
go test ./...
```
Run from:
```powershell
agents\rap-node-agent
```
Result:
- corporate/private direct intent classification: PASS
- outbound-only rendezvous-required classification: PASS
- relay-required rendezvous-required classification: PASS
- private endpoint classification without explicit candidate hints: PASS
- heartbeat intent report without advertised endpoint: PASS
- RDP/runtime service payload behavior changed: no
C17Z10 report:
- `artifacts/c17z10-peer-connection-intent-planner-report.md`
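
As a rough illustration of the C17Z10 classification cases above, the sketch below maps peer attributes to a connection intent. The field names, the intent strings, and the rule that a peer without an advertised endpoint requires rendezvous are assumptions; only the general categories (direct for corporate/private endpoints, rendezvous-required for outbound-only and relay-required peers) come from the results listed here.

```go
package main

import "net/netip"

// Intent is a hypothetical connection intent label.
type Intent string

const (
	IntentDirect     Intent = "direct"
	IntentRendezvous Intent = "rendezvous_required"
)

// PeerInfo is an illustrative subset of peer directory data.
type PeerInfo struct {
	Endpoint      string // assumed "ip:port" form; may be empty
	OutboundOnly  bool
	RelayRequired bool
	ScopeHint     string // explicit candidate hint such as "corporate"; may be empty
}

// ClassifyIntent returns the intent and an endpoint scope. When no explicit
// hint is present, the scope is derived from the address itself.
func ClassifyIntent(p PeerInfo) (Intent, string) {
	if p.OutboundOnly || p.RelayRequired || p.Endpoint == "" {
		return IntentRendezvous, ""
	}
	scope := p.ScopeHint
	if scope == "" {
		scope = "public"
		// IsPrivate covers RFC 1918 IPv4 ranges and IPv6 unique local addresses.
		if ap, err := netip.ParseAddrPort(p.Endpoint); err == nil && ap.Addr().IsPrivate() {
			scope = "private"
		}
	}
	return IntentDirect, scope
}
```
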
Additional C17Z11 peer connection manager runtime verification:
```powershell
go test ./...
```
Run from:
```powershell
agents\rap-node-agent
```
Result:
- direct control-plane health probe through manager: PASS
- relay/rendezvous-required peer deferred: PASS
- repeated failure enters backoff and is suppressed: PASS
- heartbeat manager report compiles: PASS
- RDP/runtime service payload behavior changed: no
C17Z11 report:
- `artifacts/c17z11-peer-connection-manager-runtime-report.md`
Additional C17Z12 rendezvous/relay control-plane contract verification:
```powershell
go test ./...
cmd /c "pushd \\nas\MST\codex\rdp-proxy\web-admin && npm run build && popd"
```
Run from:
```powershell
backend
agents\rap-node-agent
```
Result:
- backend node-scoped `rendezvous_leases` contract and no unrelated leak: PASS
- scoped config rendezvous lease validation: PASS
- rendezvous-required intent resolved to relay_control with lease: PASS
- peer connection manager probes relay health and records `relay_ready`: PASS
- peer recovery planner maintains `relay_ready` peers in steady mode: PASS
- web-admin synthetic config visibility for rendezvous leases: PASS
- RDP/VPN/service payload behavior changed: no
Docker-test C17Z12 runtime smoke:
```powershell
.\scripts\fabric\c17z12-rendezvous-relay-smoke-ssh.ps1 -KeepRunning
```
Result from run `c17z12-20260428-142108`:
- backend API on `http://192.168.200.61:18120/api/v1`: PASS
- C17Z12 web-admin view on `http://192.168.200.61:5174/`: PASS
- C17Z12 containers left running on `docker-test`: PASS
- backend auto-derived `rendezvous_leases` for outbound-only node C via relay
node R: PASS
- entry node A resolved `waiting_rendezvous` to `relay_control`: PASS
- A -> C manager observation through relay was `reachable`: PASS
- A -> C connection state was `relay_ready`: PASS
- direct baseline synthetic route delivery: PASS
- production forwarding remained disabled: PASS
C17Z12 report:
- `artifacts/c17z12-rendezvous-relay-control-plane-contract-report.md`
Additional C17Z13 rendezvous lease telemetry verification:
```powershell
go test ./...
cmd /c "pushd \\nas\MST\codex\rdp-proxy\web-admin && npm run build && popd"
.\scripts\fabric\c17z12-rendezvous-relay-smoke-ssh.ps1 -KeepRunning
```
Run from:
```powershell
agents\rap-node-agent
backend
```
Result from docker-test run `c17z13-20260428-145133`:
- backend API on `http://192.168.200.61:18120/api/v1`: PASS
- web-admin on `http://192.168.200.61:5174/`: PASS
- entry node A heartbeat reports
`c17z13.mesh_rendezvous_lease_report.v1`: PASS
- entry node A reports `entry_observer_count=1` and
`relay_control_ready_count=1`: PASS
- relay node R reports `admitted_as_relay_count=1`: PASS
- outbound-only node C reports `admitted_as_peer_count=1`: PASS
- expired lease skip and active lease reselection test: PASS
- lease telemetry boundary flags keep payload forwarding disabled: PASS
- A -> C relay-control manager observation remains `reachable` and
`relay_ready`: PASS
C17Z13 report:
- `artifacts/c17z13-rendezvous-lease-telemetry-report.md`
Additional C17Z14 rendezvous lease refresh verification:
```powershell
go test ./...
cmd /c "pushd \\nas\MST\codex\rdp-proxy\web-admin && npm run build"
.\scripts\fabric\c17z12-rendezvous-relay-smoke-ssh.ps1 -KeepRunning
```
Run from:
```powershell
agents\rap-node-agent
backend
```
Result from docker-test run `c17z14-20260428-151435`:
- backend API on `http://192.168.200.61:18120/api/v1`: PASS
- web-admin on `http://192.168.200.61:5174/`: PASS
- entry node A heartbeat reports
`c17z14.mesh_rendezvous_lease_report.v1`: PASS
- entry node A reports `refresh_contract=node_scoped_synthetic_config_get`,
`refresh_needed_count=1`, and `refresh_success_count=2`: PASS
- relay node R reports `admitted_as_relay_count=2` and refresh success: PASS
- outbound-only node C reports `admitted_as_peer_count=2` and refresh
success: PASS
- stale relay withdrawal/reselection fields are present and zero in the
healthy smoke path: PASS
- refresh telemetry boundary flags keep payload forwarding disabled: PASS
- direct baseline synthetic route delivery: PASS
- A -> C relay-control manager observation remains `reachable` and
`relay_ready`: PASS
C17Z14 report:
- `artifacts/c17z14-rendezvous-lease-refresh-report.md`
Additional C17Z15 rendezvous relay replacement verification:
```powershell
go test ./...
cmd /c "pushd \\nas\MST\codex\rdp-proxy\web-admin && npm run build"
.\scripts\fabric\c17z12-rendezvous-relay-smoke-ssh.ps1 -KeepRunning
```
Run from:
```powershell
agents\rap-node-agent
backend
```
Result from docker-test run `c17z15-20260428-153917`:
- backend API on `http://192.168.200.61:18120/api/v1`: PASS
- web-admin on `http://192.168.200.61:5174/`: PASS
- backend synthetic config schema `c17z15.synthetic.v1`: PASS
- entry node A heartbeat reports
`c17z15.mesh_rendezvous_lease_report.v1`: PASS
- entry node A initially receives an explicit stale relay lease through old
relay R with bad relay endpoint `http://127.0.0.1:19210`: PASS
- Control Plane withdraws the stale old-relay lease and issues a replacement
`stale_relay_replacement` lease through alternate relay S: PASS
- replacement lease metadata includes
`relay_replacement_contract=stale_relay_feedback_policy` and
`replacement_for_stale_relay=true`: PASS
- relay S reports `admitted_as_relay_count=1`: PASS
- R or S reports `last_refresh_reason=stale_relay` and refresh success: PASS
- direct baseline synthetic route delivery remains available: PASS
- payload forwarding boundary flags keep RDP/VPN/service forwarding disabled:
PASS
C17Z15 report:
- `artifacts/c17z15-rendezvous-relay-replacement-report.md`
Additional C17Z16 route/path decision artifact verification:
```powershell
go test ./...
cmd /c "pushd \\nas\MST\codex\rdp-proxy\web-admin && npm run build"
.\scripts\fabric\c17z12-rendezvous-relay-smoke-ssh.ps1 -KeepRunning
```
Run from:
```powershell
agents\rap-node-agent
backend
```
Result from docker-test run `c17z16-20260428-160621`:
- backend API on `http://192.168.200.61:18120/api/v1`: PASS
- web-admin on `http://192.168.200.61:5174/`: PASS
- backend synthetic config schema `c17z16.synthetic.v1`: PASS
- entry node A heartbeat reports
`c17z16.mesh_rendezvous_lease_report.v1`: PASS
- entry node A heartbeat reports
`c17z16.mesh_route_path_decision_report.v1`: PASS
- Control Plane route/path decision removes stale relay R from effective hops
and selects alternate relay S as next hop for A -> C: PASS
- route/path decision report includes generation, score reasons,
`control_plane_only=true`, `route_path_forwarding_runtime=false`, and
`production_payload_forwarding=false`: PASS
- replacement lease through alternate relay S remains `relay_ready`: PASS
- direct baseline synthetic route delivery remains available: PASS
- payload forwarding boundary flags keep RDP/VPN/service forwarding disabled:
PASS
C17Z16 report:
- `artifacts/c17z16-route-path-decision-report.md`
Additional C17Z17 route generation tracker verification:
```powershell
go test ./...
cmd /c "pushd \\nas\MST\codex\rdp-proxy\web-admin && npm run build"
.\scripts\fabric\c17z12-rendezvous-relay-smoke-ssh.ps1 -KeepRunning
```
Run from:
```powershell
agents\rap-node-agent
backend
```
Result from docker-test run `c17z17-20260428-165118`:
- backend API on `http://192.168.200.61:18120/api/v1`: PASS
- web-admin on `http://192.168.200.61:5174/`: PASS
- backend synthetic config schema `c17z17.synthetic.v1`: PASS
- entry node A heartbeat reports
`c17z17.mesh_rendezvous_lease_report.v1`: PASS
- entry node A heartbeat reports
`c17z17.mesh_route_path_decision_report.v1`: PASS
- entry node A heartbeat reports
`c17z17.mesh_route_generation_report.v1`: PASS
- route generation tracker reports active decisions, applied decisions,
withdrawn decisions, total withdrawn count, and `generation_changed=true`:
PASS
- first-observed replacement records old relay path withdrawal as
`withdrawn_by_replacement`: PASS
- route generation boundary flags keep `control_plane_only=true`,
`route_path_forwarding_runtime=false`, `service_workload_traffic=false`, and
`production_payload_forwarding=false`: PASS
- replacement lease through alternate relay S remains `relay_ready`: PASS
- direct baseline synthetic route delivery remains available: PASS
- payload forwarding boundary flags keep RDP/VPN/service forwarding disabled:
PASS
C17Z17 report:
- `artifacts/c17z17-route-generation-tracker-report.md`
Additional C17Z18 route-health effective path verification:
```powershell
go test ./...
cmd /c "pushd \\nas\MST\codex\rdp-proxy\web-admin && npm run build"
.\scripts\fabric\c17z12-rendezvous-relay-smoke-ssh.ps1 -KeepRunning
```
Run from:
```powershell
backend
agents\rap-node-agent
```
Result from docker-test run `c17z18-20260428-174559`:
- backend API on `http://192.168.200.61:18120/api/v1`: PASS
- web-admin on `http://192.168.200.61:5174/`: PASS
- backend synthetic config schema `c17z18.synthetic.v1`: PASS
- entry node A heartbeat reports
`c17z18.mesh_rendezvous_lease_report.v1`: PASS
- entry node A heartbeat reports
`c17z18.mesh_route_path_decision_report.v1`: PASS
- entry node A heartbeat reports
`c17z18.mesh_route_generation_report.v1`: PASS
- entry node A heartbeat reports
`c17z18.mesh_route_health_config_report.v1`: PASS
- synthetic route-health runtime uses the replacement effective path
A -> alternate relay S -> outbound-only C: PASS
- route-health drift detection stays false for the selected effective path:
PASS
- backend latest mesh links preserve `synthetic_route_health` separately from
`peer_connection_manager`: PASS
- web-admin Fabric links show route-health observation type, selected relay,
and effective/observed path: PASS
- payload forwarding boundary flags keep RDP/VPN/service forwarding disabled:
PASS
C17Z18 report:
- `artifacts/c17z18-route-health-effective-path-report.md`
Additional C17Z19 route-health feedback scoring verification:
```powershell
go test ./...
cmd /c "pushd \\nas\MST\codex\rdp-proxy\web-admin && npm run build"
pwsh -NoProfile -ExecutionPolicy Bypass -File scripts\fabric\c17z19-route-health-feedback-smoke-ssh.ps1 -KeepRunning
```
Run from:
```powershell
backend
agents\rap-node-agent
web-admin
```
Result from docker-test run `c17z19-20260428-214427`:
- isolated backend API on `http://192.168.200.61:18122/api/v1`: PASS
- fresh migration replay through `000022_synthetic_mesh_service_class`: PASS
- `synthetic` mesh route intent service class accepted by PostgreSQL: PASS
- initial fast-path relay selection prefers `node-s`: PASS
- injected synthetic route-health drift for selected relay `node-s` causes
stale relay replacement through `node-t`: PASS
- route path decision records `node-t` as selected relay and `node-s` as stale
relay: PASS
- healthy low-latency route-health for `node-t` keeps `node-t` selected with
`route_health_reachable`, `route_health_no_drift`,
`route_health_quality`, and `route_health_latency` score reasons: PASS
- signed synthetic config is still required and present: PASS
- payload forwarding boundary flags keep RDP/VPN/service forwarding disabled:
PASS
C17Z19 report:
- `artifacts/c17z19-route-health-feedback-report.md`
- `artifacts/c17z19-route-health-feedback-smoke-result.json`
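
The C17Z19 feedback loop above (drift on the selected relay forces replacement, healthy low-latency relays keep their selection with recorded score reasons) can be sketched as a scoring pass over relay candidates. The weights, the 50 ms latency cutoff, and the struct names are assumptions. The `route_health_quality` reason is omitted because this document does not define how quality is measured.

```go
package main

// RouteHealth is an illustrative route-health observation for one relay.
type RouteHealth struct {
	Relay     string
	Reachable bool
	Drift     bool
	LatencyMs int
}

// Scored is the scoring result for one relay candidate.
type Scored struct {
	Relay   string
	Score   int
	Reasons []string
	Stale   bool
}

// ScoreRelay marks unreachable or drifting relays as stale and otherwise
// accumulates a score together with the reasons that contributed to it.
func ScoreRelay(h RouteHealth) Scored {
	s := Scored{Relay: h.Relay}
	if !h.Reachable || h.Drift {
		s.Stale = true
		return s
	}
	s.Score += 100 // assumed base weight
	s.Reasons = append(s.Reasons, "route_health_reachable", "route_health_no_drift")
	if h.LatencyMs <= 50 { // assumed low-latency cutoff
		s.Score += 20
		s.Reasons = append(s.Reasons, "route_health_latency")
	}
	return s
}

// SelectRelay returns the highest-scoring non-stale relay, if any.
func SelectRelay(candidates []RouteHealth) (Scored, bool) {
	var best Scored
	found := false
	for _, h := range candidates {
		s := ScoreRelay(h)
		if s.Stale {
			continue
		}
		if !found || s.Score > best.Score {
			best, found = s, true
		}
	}
	return best, found
}
```
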
Additional C17Z20 route-health feedback refresh verification:
```powershell
go test ./...
cmd /c "pushd \\nas\MST\codex\rdp-proxy\web-admin && npm run build"
pwsh -NoProfile -ExecutionPolicy Bypass -File scripts\fabric\c17z12-rendezvous-relay-smoke-ssh.ps1 -KeepRunning
```
Run from:
```powershell
backend
agents\rap-node-agent
web-admin
```
Result from docker-test run `c17z18-20260428-221601`:
- multi-agent backend API on `http://192.168.200.61:18120/api/v1`: PASS
- backend/node-agent images rebuilt on docker-test: PASS
- node A reports `c17z20.mesh_route_health_config_report.v1`: PASS
- node A reports
`c17z20.mesh_route_health_feedback_refresh_report.v1`: PASS
- route-health failure triggers immediate config refresh before normal
periodic interval: PASS
- heartbeat reports feedback refresh attempts/successes/failures/suppressed:
PASS
- replacement route-health effective path through alternate relay remains
active: PASS
- payload forwarding boundary flags keep RDP/VPN/service forwarding disabled:
PASS
C17Z20 report:
- `artifacts/c17z20-route-health-feedback-refresh-report.md`
Dev cluster enrollment/bootstrap lifecycle verification:
```powershell
pwsh -NoProfile -ExecutionPolicy Bypass -File scripts\fabric\dev-cluster-enrollment-bootstrap-smoke-ssh.ps1 -KeepRunning
```
Result from docker-test run `dev-bootstrap-20260428-201430`:
- isolated backend API on `http://192.168.200.61:18121/api/v1`: PASS
- fresh migration replay through `000021_cluster_authority_keys`: PASS
- first-owner dev bootstrap through `/installation/bootstrap-owner`: PASS
- signed join token, real node-agent enrollment, pending join request, and
platform-owner approval: PASS
- node-agent automatic bootstrap polling verified signed approval and
persisted cluster authority pin: PASS
- node heartbeat after bootstrap: PASS
- signed `c17z18.synthetic.v1` Control Plane synthetic config verified and
loaded by node-agent: PASS
- workload supervision stub is disabled by default, so no repeated admin-only
desired-workload `403` loop is produced: PASS
- production/service payload forwarding remains disabled: PASS
Dev enrollment/bootstrap report:
- `artifacts/dev-cluster-enrollment-bootstrap-smoke-report.md`
Commands run during P0 baseline freeze:
```powershell
go test ./...
dotnet build .\clients\windows\RemoteAccessPlatform.Windows.slnx
docker -H ssh://docker-test run --rm rap-rdp-worker:rdp-region-repair rdp-worker-graphics-adapter-probe
docker -H ssh://docker-test run --rm rap-rdp-worker:rdp-region-repair rdp-worker-cursor-adapter-probe
docker -H ssh://docker-test run --rm rap-rdp-worker:rdp-region-repair rdp-worker-service-adapter-protocol-probe
docker -H ssh://docker-test run --rm rap-rdp-worker:rdp-region-repair rdp-worker-dataplane-bind-probe --scenario valid
```
Additional accepted P1 baseline verification:
```powershell
go test ./...
dotnet build .\clients\windows\RemoteAccessPlatform.Windows.slnx
docker -H ssh://docker-test build --tag rap-rdp-worker:rdp-p1-region-order2 --file workers/rdp-worker/Dockerfile workers/rdp-worker
docker -H ssh://docker-test run --rm rap-rdp-worker:rdp-p1-region-order2 rdp-worker-graphics-adapter-probe
docker -H ssh://docker-test run --rm rap-rdp-worker:rdp-p1-region-order2 rdp-worker-cursor-adapter-probe
docker -H ssh://docker-test run --rm rap-rdp-worker:rdp-p1-region-order2 rdp-worker-service-adapter-protocol-probe
docker -H ssh://docker-test run --rm rap-rdp-worker:rdp-p1-region-order2 rdp-worker-dataplane-bind-probe --scenario valid
pwsh -ExecutionPolicy Bypass -File scripts\smoke\drive-visibility-smoke.ps1 -WorkerImage rap-rdp-worker:rdp-p1-region-order2 -OutputFrame artifacts\stage5-drive-visibility-frame-p1-rerun.bmp
```
Result:
- backend tests: PASS
- Windows build: PASS, 0 warnings, 0 errors
- worker probes: PASS
- P1 worker image build: PASS
- P1 smoke-worker deployment: PASS, `worker:registration:rdp-worker-1`
reports `status=online`
- P1 manual visual smoke: PASS, idle Task Manager updates, Start menu/hover,
  mouse, keyboard, and session close work; window drag is usable, with
  old-client-style frame-only movement and imperfect repaint on release
- Stage 5.1.1 restricted drive visibility smoke: PASS, remote Notepad opened
`stage5-upload-text.txt` from the redirected `RAP_Transfers` drive
Additional P3 security-readiness verification:
```powershell
go test ./...
docker -H ssh://docker-test build --tag rap-rdp-worker:p3-security-probes --file workers/rdp-worker/Dockerfile workers/rdp-worker
$scenarios = @('valid','starting','wrong-worker','wrong-attachment','wrong-user','wrong-organization','wrong-resource','channels-too-broad','failed-state','terminated-state')
foreach ($scenario in $scenarios) {
docker -H ssh://docker-test run --rm rap-rdp-worker:p3-security-probes rdp-worker-dataplane-bind-probe --scenario $scenario
}
```
Result:
- backend tests: PASS, including production secret-readiness guard tests
- sessionbroker data-plane policy test: PASS
- worker P3 probe image build: PASS
- direct bind denial probes: PASS for valid/starting/wrong-worker/
wrong-attachment/wrong-user/wrong-organization/wrong-resource/
channels-too-broad/failed-state/terminated-state
Additional P3.1 secret resolver verification:
```powershell
go test ./...
```
Result:
- encrypted secret AES-256-GCM round trip: PASS
- wrong AAD decrypt rejection: PASS
- assignment-time resolved secret merge: PASS
- session metadata plaintext mutation prevention: PASS
- production missing resolver denial: PASS
- development metadata fallback compatibility: PASS
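
The first two P3.1 results (AES-256-GCM round trip and wrong-AAD rejection) rely on standard AEAD behavior: the additional authenticated data is not encrypted, but decryption fails if it differs from what was supplied at encryption time. The sketch below uses only the Go standard library. The function names, the `nonce || ciphertext` layout, and the idea of using a resource identifier as AAD are assumptions, not the backend's actual resolver code.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
)

// newGCM builds an AES-256-GCM AEAD from a 32-byte key.
func newGCM(key []byte) (cipher.AEAD, error) {
	if len(key) != 32 {
		return nil, errors.New("key must be 32 bytes for AES-256")
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// Seal encrypts plaintext and binds it to aad.
// Assumed output layout: nonce || ciphertext+tag.
func Seal(key, plaintext, aad []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	// Passing nonce as dst prepends it to the sealed output.
	return gcm.Seal(nonce, nonce, plaintext, aad), nil
}

// Open decrypts a value produced by Seal. It returns an error if the key,
// the ciphertext, or the aad does not match what was used to seal.
func Open(key, sealed, aad []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(sealed) < gcm.NonceSize() {
		return nil, errors.New("sealed value too short")
	}
	nonce, ct := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ct, aad)
}
```

Binding a secret to an identifier through AAD means a ciphertext copied onto a different record cannot be decrypted there, even with the correct key.
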
Additional P3.2 direct worker TLS/PKI guard verification:
```powershell
go test ./...
dotnet build clients\windows\RemoteAccessPlatform.Windows.slnx
```
Result:
- production backend omits smoke-only direct worker WSS candidates: PASS
- production-trusted direct candidate metadata: PASS
- Windows client build with production/smoke direct TLS guard: PASS
Additional P3.3 production secret/TLS test-stand smoke:
```powershell
docker -H ssh://docker-test build --tag rap-backend-smoke:p3-3 --file - backend
Get-Content -Raw backend\migrations\000009_resource_secrets.up.sql |
docker -H ssh://docker-test exec -i rap_postgres psql -U rap_user -d remote_access_platform -v ON_ERROR_STOP=1 -f -
pwsh -ExecutionPolicy Bypass -File scripts\smoke\drive-visibility-smoke.ps1 `
-WorkerImage rap-rdp-worker:rdp-p1-region-order2 `
-ResourceName "P3.3 Secret RDP Resource" `
-OutputFrame artifacts\p3-3-secret-backed-drive-frame.bmp
```
Result:
- backend image `rap-backend-smoke:p3-3`: PASS
- backend production-like start with `SECRET_ENCRYPTION_KEY_FILE`: PASS
- secret-backed RDP resource through `PUT /api/v1/resources/{id}/secret`: PASS
- real RDP session through resolver-backed assignment: PASS
- resource/session metadata plaintext credential checks: PASS
- audit plaintext credential checks: PASS
- production backend omits smoke-only direct worker WSS candidate: PASS
- development/smoke backend advertises explicit smoke-only direct worker WSS
candidate: PASS
- backend gateway fallback smoke with rendering/input/clipboard/file upload:
PASS
- secret-backed detach/reattach/takeover API lifecycle regression: PASS
Additional P3.4 production direct-worker WSS trust design/prep:
- production worker WSS certificate model: documented
- platform CA vs public CA recommendation: documented
- worker certificate SAN and identity binding rules: documented
- app-local Windows client trust approach: documented
- rotation/revocation/fallback behavior: documented
- future `platform_ca` smoke plan: documented
- runtime behavior changed: no
Additional P3.5 app-local platform CA trust smoke:
```powershell
dotnet build clients/windows/src/RemoteAccessPlatform.Windows.App/RemoteAccessPlatform.Windows.App.csproj
pwsh -NoProfile -ExecutionPolicy Bypass -File scripts/smoke/prepare-platform-ca-direct-worker.ps1 `
-DockerSshAlias docker-test `
-LocalCaOutputPath artifacts/p3-5-platform-ca.crt `
-WorkerHost 192.168.200.61 `
-WorkerId rdp-worker-1 `
-ClusterId default
pwsh -NoProfile -ExecutionPolicy Bypass -File scripts/windows-smoke/desktop-smoke.ps1 `
-DefaultResourceName "P3.3 Secret RDP Resource" `
-PreferDirectDataPlane:$true `
-AllowInsecureDirectDataPlaneTlsForSmoke:$false `
-DirectDataPlaneConnectTimeoutMs 2500 `
-DirectDataPlaneColorMode full_color `
-DirectDataPlanePlatformCaBundle "\\192.168.220.200\mst\codex\rdp-proxy\artifacts\p3-5-platform-ca.crt" `
-BackendEnvironment production `
-SkipOrgSwitchAndTokenRefresh `
-DockerSshAlias docker-test
```
Result:
- Windows client app-local platform CA bundle support: PASS
- worker WSS test cert with IP SAN and URI SAN: PASS
- backend `platform_ca` candidate metadata: PASS
- production client direct worker WSS selected without insecure TLS bypass:
PASS
- direct binary render over trusted WSS: PASS
- input/lifecycle smoke over trusted WSS: PASS
- unknown CA rejected and backend gateway fallback activated: PASS
- `smoke_insecure` production case used backend gateway fallback: PASS
- backend gateway fallback remained usable: PASS
P3.5 runtime report:
- `artifacts/p3-5-app-local-platform-ca-smoke-report.md`
New P3.6 hardening finding:
- stale Redis live/worker events after backend restart can crash the backend
with `invalid session state transition: terminated -> active`
- Redis was safely cleared for the test stand because PostgreSQL is the source
of truth
- next step should make stale worker events idempotent, not add product
features
P3.6 stale worker event / restart idempotency hardening:
```powershell
go test ./...
docker -H ssh://docker-test build --tag rap-backend-smoke:p3-6 --file - backend
```
Runtime smoke:
- start real secret-backed RDP session
- wait for PostgreSQL state `active`
- terminate session
- stop backend
- push stale `session_connected` to Redis `worker:events`
- restart backend
- verify backend stays up
- verify PostgreSQL state remains `terminated`
- verify new normal RDP session still reaches `active`
Result:
- stale `session_connected` for terminal session ignored: PASS
- stale render telemetry for terminal session does not recreate live state:
PASS
- backend restart survives stale Redis worker event: PASS
- terminal PostgreSQL session is not reopened: PASS
- normal new RDP session after restart: PASS
P3.6 runtime report:
- `artifacts/p3-6-stale-worker-event-idempotency-report.md`
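
The P3.6 fix described above amounts to treating worker events for terminal sessions as no-ops instead of invalid transitions. A minimal sketch of that idea follows. The state set, the treatment of `failed` as terminal, and the function shape are assumptions; only the `session_connected` and `session_failed` names and the `terminated`/`active`/`failed` states come from this document.

```go
package main

import "fmt"

// SessionState is an illustrative session state label.
type SessionState string

const (
	Starting   SessionState = "starting"
	Active     SessionState = "active"
	Failed     SessionState = "failed"
	Terminated SessionState = "terminated"
)

// isTerminal reports whether no further worker event may change the state.
// Treating "failed" as terminal is an assumption of this sketch.
func isTerminal(s SessionState) bool {
	return s == Failed || s == Terminated
}

// ApplyWorkerEvent returns the next state and whether the event was applied.
// Events that arrive for a terminal session are ignored without error, so a
// stale queue entry replayed after a restart cannot reopen the session or
// crash the caller.
func ApplyWorkerEvent(current SessionState, event string) (SessionState, bool, error) {
	if isTerminal(current) {
		return current, false, nil
	}
	switch event {
	case "session_connected":
		return Active, true, nil
	case "session_failed":
		return Failed, true, nil
	default:
		return current, false, fmt.Errorf("unknown worker event %q", event)
	}
}
```

The caller is expected to read the current state from the source of truth (PostgreSQL in this project) before applying the event.
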
Stage 5.2 server-to-client file download design pass:
- safest v1 model selected: restricted `RAP_Transfers\ToClient` outbound
drop zone
- no Windows agent, SMB/WebDAV, remote filesystem browser, arbitrary path
download, or expanded drive mapping
- download remains policy-gated by `file_transfer_mode`
- direct worker WSS remains preferred; backend gateway remains fallback
- implementation prompt is documented for the next step
Design document:
- `docs/architecture/RDP_FILE_DOWNLOAD_STAGE_5_2.md`
Stage 5.2 server-to-client file download implementation:
```powershell
go test ./...
dotnet build clients/windows/src/RemoteAccessPlatform.Windows.App/RemoteAccessPlatform.Windows.App.csproj
docker -H ssh://docker-test build --tag rap-rdp-worker:stage5-2-download --file workers/rdp-worker/Dockerfile workers/rdp-worker
docker -H ssh://docker-test run --rm rap-rdp-worker:stage5-2-download rdp-worker-graphics-adapter-probe
docker -H ssh://docker-test run --rm rap-rdp-worker:stage5-2-download rdp-worker-cursor-adapter-probe
docker -H ssh://docker-test run --rm rap-rdp-worker:stage5-2-download rdp-worker-service-adapter-protocol-probe
docker -H ssh://docker-test run --rm rap-rdp-worker:stage5-2-download rdp-worker-dataplane-bind-probe --scenario valid
docker -H ssh://docker-test build --tag rap-backend-smoke:stage5-2-download --file - backend
```
Result:
- backend tests: PASS
- Windows build: PASS, 0 warnings, 0 errors
- worker image build: PASS
- worker probes: PASS
- backend smoke image build: PASS
- live runtime proof: pending
Build report:
- `artifacts/stage5-2-file-download-build-report.md`
Stage 5.2 server-to-client file download runtime proof, core data path:
```powershell
docker -H ssh://docker-test build --tag rap-rdp-worker:stage5-2-download-direct-block --file workers/rdp-worker/Dockerfile workers/rdp-worker
docker -H ssh://docker-test tag rap-rdp-worker:stage5-2-download-direct-block rap-rdp-worker:stage5-2-download
pwsh -NoProfile -ExecutionPolicy Bypass -File scripts\smoke\file-download-smoke.ps1 -AllowMode server_to_client -Transport direct_worker_wss -OutputDirectory artifacts/stage5-2-download-smoke-direct-fixed2
pwsh -NoProfile -ExecutionPolicy Bypass -File scripts\smoke\file-download-smoke.ps1 -AllowMode bidirectional -Transport direct_worker_wss -OutputDirectory artifacts/stage5-2-download-smoke-direct-bidirectional
pwsh -NoProfile -ExecutionPolicy Bypass -File scripts\smoke\file-download-smoke.ps1 -AllowMode client_to_server -Transport direct_worker_wss -ExpectBlocked -OutputDirectory artifacts/stage5-2-download-smoke-direct-client-to-server-block-fixed
pwsh -NoProfile -ExecutionPolicy Bypass -File scripts\smoke\file-download-smoke.ps1 -AllowMode disabled -Transport direct_worker_wss -ExpectBlocked -OutputDirectory artifacts/stage5-2-download-smoke-direct-disabled-fixed
pwsh -NoProfile -ExecutionPolicy Bypass -File scripts\smoke\file-download-smoke.ps1 -AllowMode server_to_client -Transport backend_gateway -OutputDirectory artifacts/stage5-2-download-smoke-backend-regression-after-direct-block
```
Result:
- direct worker WSS `server_to_client`: PASS, text and binary size/hash match
- direct worker WSS `bidirectional`: PASS, text and binary size match
- direct worker WSS `client_to_server`: PASS, download blocked with
`access denied`
- direct worker WSS `disabled`: PASS, download blocked with `access denied`
- backend gateway fallback `server_to_client`: PASS, text and binary size/hash
match
- direct WSS smoke harness bug fixed: PowerShell TLS callback now uses a static
.NET delegate and URL query construction now uses `?` correctly
- direct WSS policy feedback bug fixed: disallowed file download now returns
`file_download.blocked` instead of silently dropping the request
Stage 5.2 server-to-client file download lifecycle proof:
```powershell
docker -H ssh://docker-test build --tag rap-rdp-worker:stage5-2-download --file workers/rdp-worker/Dockerfile workers/rdp-worker
pwsh -NoProfile -ExecutionPolicy Bypass -File scripts\smoke\file-download-smoke.ps1 -AllowMode server_to_client -Transport direct_worker_wss -LifecycleScenario detach -OutputDirectory artifacts/stage5-2-download-lifecycle-detach-fixed
pwsh -NoProfile -ExecutionPolicy Bypass -File scripts\smoke\file-download-smoke.ps1 -AllowMode server_to_client -Transport direct_worker_wss -LifecycleScenario takeover_old_controller -OutputDirectory artifacts/stage5-2-download-lifecycle-takeover-fixed
pwsh -NoProfile -ExecutionPolicy Bypass -File scripts\smoke\file-download-smoke.ps1 -AllowMode server_to_client -Transport direct_worker_wss -LifecycleScenario worker_failure -OutputDirectory artifacts/stage5-2-download-lifecycle-worker-failure
pwsh -NoProfile -ExecutionPolicy Bypass -File scripts\smoke\file-download-smoke.ps1 -AllowMode server_to_client -Transport direct_worker_wss -OutputDirectory artifacts/stage5-2-download-smoke-direct-after-lifecycle-fix
```
Result:
- detach: PASS, PostgreSQL state `detached`, outcome `file_download.blocked`,
reason `session is not active`
- old-client takeover: PASS, stale attachment receives `session.taken_over`
and cannot continue download
- worker failure: PASS, PostgreSQL state `failed`, audit `session_failed`,
direct WebSocket closes and download cannot continue
- post-fix direct download regression: PASS, text and binary size match
- direct WSS stale attachment feedback bug fixed: stale attachment now receives
`session.taken_over` instead of observing silence
Runtime report:
- `artifacts/stage5-2-file-download-runtime-report.md`
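
The Stage 5.2 policy and lifecycle results above can be summarized as one fail-closed decision: a download proceeds only for an active session whose `file_transfer_mode` permits the server-to-client direction. The sketch below is illustrative; the struct, the function name, and the order of the two checks are assumptions, while the mode names, the `file_download.blocked` outcome, and the two reason strings are taken from the results above.

```go
package main

// DownloadDecision is an illustrative policy result.
type DownloadDecision struct {
	Allowed bool
	Outcome string // "file_download.blocked" when refused
	Reason  string
}

// DecideDownload gates a server-to-client download on session state and
// file_transfer_mode. Unknown modes are refused.
func DecideDownload(mode, sessionState string) DownloadDecision {
	if sessionState != "active" {
		return DownloadDecision{Outcome: "file_download.blocked", Reason: "session is not active"}
	}
	switch mode {
	case "server_to_client", "bidirectional":
		return DownloadDecision{Allowed: true}
	default: // "client_to_server", "disabled", or anything unrecognized
		return DownloadDecision{Outcome: "file_download.blocked", Reason: "access denied"}
	}
}
```

Returning an explicit blocked outcome instead of dropping the request matches the policy feedback fix noted in the results.
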
Coverage warning:
- this is not a full live RDP smoke pass
- most confidence for RDP UX still comes from manual/live smoke history
- automated regression coverage must be expanded before production readiness
## Correct Next Step
C17Z20 is complete. Do not automatically continue into VPN runtime, RDP work, or
service workload traffic.
The next step must be chosen as a new explicit staged prompt. Until then, keep
the proven C17A-C17Z20 mesh set preserved: proof, gate, contract, observation,
config, scoring, visibility, fabric-control forwarding, forwarding
observability, route-config boundary, scoped peer directory, peer-cache
runtime, dynamic endpoint reporting, private/corporate endpoint candidates,
peer connection state, recovery planner, connection intent, connection
manager, rendezvous lease telemetry, lease refresh, relay replacement policy,
route path decision, route generation tracker, synthetic route-health
effective path, route-health feedback scoring, and route-health feedback
refresh. Production forwarding remains explicitly gate-controlled and limited
to `fabric.control`; service payload forwarding remains unavailable.
Do not start:
- RDP performance work
- Stage 5.2 RDP download UI proof
- production service mesh runtime traffic
- VPN/IP tunnel runtime implementation
- C18D VPN credential/config resolver work
- TUN/TAP, host route, or firewall manipulation
- general relay packet routing
- service workload traffic over mesh
- RDP/VNC/SSH/file/video traffic over mesh
- QUIC/WebRTC
- service workload execution
- backend/session lifecycle changes
- Windows client changes
RDP status:
- RDP is paused by product decision.
- RDP-Perf-6 is completed and smoke-proven.
- The C++ RDP Adapter remains the preserved runtime baseline for when RDP work
explicitly resumes.
- C10-C17 planning stages are completed as documentation/planning. C17A through
  C17Z and C17Z1 through C17Z20 are implemented/proven with synthetic traffic,
  explicit production-forwarding gate checks, envelope contract validation,
  metadata-only observation, or bounded local observation
  retention/wiring/metrics/local logging/capacity guard/fail-closed hardening
  and payload/time-boundary validation only.
- C18 planning is completed as documentation only. C18A control-plane data
  model foundation is implemented and backend-test-proven. C18B lease/fencing
  hardening is implemented and backend-test-proven. C18C node-agent
  desired-state consumption/reporting is implemented and backend-test-proven.
  C18 runtime is not authorized.