rdp-proxy/CODEX_CONTEXT.md
2026-04-28 22:29:50 +03:00

34 KiB

CODEX CONTEXT

Project identity

This project is a production-grade distributed secure access platform.

It started as a custom RDP proxy with persistent server-side sessions, but the final target architecture is broader:

  • distributed secure access fabric
  • multi-tenant platform
  • session broker for GUI and future non-GUI protocols
  • cluster mesh of nodes
  • connector/VPN layer
  • customer-managed and platform-managed nodes
  • node-agent based self-update / rollback / health supervision

Current proven foundation

The current codebase already proved the most risky low-level lifecycle assumptions for RDP:

  • real FreeRDP connect works
  • session state transition to active works
  • terminate works
  • detach works without killing the remote session
  • reattach works without recreating the remote session
  • takeover works without recreating the remote session
  • per-resource certificate verification policy exists
  • certificate_verification_mode = strict | ignore
  • strict is default
  • ignore works on a per-resource basis
  • worker build is reproducible
  • backend build is reproducible

This proven lifecycle must NOT be broken by future architecture work.

Current architecture baseline

Current audit and baseline snapshot:

  • docs/audits/PROJECT_AUDIT_2026-04-26.md
  • docs/audits/CURRENT_BASELINE_MATRIX.md

Test environment

  • Canonical test Docker host: 192.168.200.61
  • Canonical Docker context: test-ubuntu
  • Canonical SSH alias: docker-test
  • Backend API for local/client smoke runs: http://192.168.200.61:8080/api/v1
  • WebSocket gateway for local/client smoke runs: ws://192.168.200.61:8080/api/v1/gateway/ws
  • Stage C17 planning is completed.
  • C17A synthetic mesh runtime skeleton is implemented and test-proven in rap-node-agent only. It is disabled by default and carries synthetic fabric.probe / fabric.probe_ack messages only.
  • C17B route health and failover probes are implemented and test-proven in rap-node-agent only. They are disabled by default and carry synthetic fabric.route_health / fabric.route_health_ack messages only.
  • C17C relay semantic hardening is implemented and test-proven in rap-node-agent only. It is disabled by default and models synthetic per-channel queues/QoS/backpressure only.
  • C17D non-production test-service path is implemented and test-proven in rap-node-agent only. It is disabled by default and carries only bounded synthetic.echo test payloads.
  • C17E/C17F/C17G are implemented and proven for live synthetic HTTP transport, scoped synthetic route config, and Control Plane scoped synthetic config consumption.
  • C17H deployed multi-agent synthetic config smoke is runtime-proven on docker-test: five running rap-node-agent containers consume backend-issued node-scoped synthetic config, direct and single-relay synthetic route-health observations return to the Control Plane, and production forwarding remains disabled.
  • C17I production forwarding gate foundation is implemented and test-proven: rap-node-agent has an explicit production-forwarding gate, while /mesh/v1/forward still refuses production payload forwarding until a later approved runtime stage.
  • C17J production envelope contract is implemented and test-proven: /mesh/v1/forward validates route-bound production envelopes for fabric_control / fabric.control only when the gate is enabled, rejects service channels, and still refuses production forwarding.
  • C17K production envelope observation is implemented and test-proven: valid accepted envelopes can be observed locally as metadata-only records after validation; rejected envelopes are not observed, observation failure fails closed, and production forwarding remains unavailable.
  • C17L bounded production observation sink is implemented and test-proven: accepted metadata-only observations can be retained locally with fixed capacity, oldest-entry drop behavior, and no payload body storage.
  • C17M production observation sink wiring is implemented and test-proven: node-agent can wire the bounded local metadata-only sink when RAP_MESH_PRODUCTION_OBSERVATION_SINK_CAPACITY is explicitly greater than zero; the wiring is disabled by default and exposes no read API.
  • C17N production observation sink metrics are implemented and test-proven: local sink metrics expose only capacity, current depth, accepted total, and dropped-oldest total; they expose no observation records or payload metadata.
  • C17O production observation sink local metrics logging is implemented and test-proven: node-agent logs aggregate sink metrics locally when the sink is explicitly enabled; no read API or Control Plane reporting is added.
  • C17P production observation sink change-driven metrics logging is implemented and test-proven: node-agent suppresses repeated identical local sink metrics logs; no read API or Control Plane reporting is added.
  • C17Q production forwarding gate/runtime log boundary is implemented and test-proven: node-agent logs production forwarding gate state separately from production forwarding runtime state. Runtime state remained false until C17Z introduced gate-controlled fabric.control direct forwarding.
  • C17R production observation sink capacity guard is implemented and test-proven: RAP_MESH_PRODUCTION_OBSERVATION_SINK_CAPACITY is rejected above 10000.
  • C17S production observation panic fail-closed hardening is implemented and test-proven: observer errors and observer panics both fail closed as observation failure.
  • C17T production envelope payload boundary is implemented and test-proven: validated production fabric.control envelope payloads are bounded to 4096 bytes and oversized envelopes are rejected before observation.
  • C17U production envelope created-at skew boundary is implemented and test-proven: validated production fabric.control envelopes whose created_at is more than one minute in the future are rejected before observation.
  • C17V peer endpoint candidate model is implemented and test-proven: node-scoped synthetic mesh config now carries route-scoped endpoint candidates with transport, address, reachability, NAT type, connectivity mode, priority, policy tags, verification time, and metadata. This is a model/config boundary only; no production route scoring, NAT traversal, shortcut routing, or forwarding runtime is implemented.
  • C17W peer endpoint candidate scoring model is implemented and test-proven: rap-node-agent can rank already-scoped endpoint candidates using soft inputs such as transport, reachability, connectivity mode, NAT type, priority, region, policy tags, channel class, and verification age. This is a scoring helper only; it does not open connections, choose production routes, or forward payloads.
  • C17X health-aware endpoint candidate scoring overlay is implemented and test-proven: endpoint candidate scoring can optionally use local health observations keyed by endpoint_id, including latency, success/failure history, recent failure reason, reliability score, and observation freshness. This remains advisory scoring only and is not wired into production route execution.
  • C17Y Platform Owner synthetic mesh visibility is implemented and build/test-proven: web-admin reads node-scoped synthetic mesh config and shows config enabled state, route counts, peer endpoints, endpoint candidates, C17X advisory scoring boundary, and production_forwarding. This remains platform-owner visibility only and does not enable production forwarding.
  • C17Z production fabric-control direct forwarding boundary is implemented and test-proven: when RAP_MESH_PRODUCTION_FORWARDING_ENABLED=true, /mesh/v1/forward can deliver valid route-bound fabric.control envelopes at the local destination or forward them to a direct next hop from explicit peer endpoint config. Service channels, arbitrary relay forwarding, multi-hop production route execution, and RDP/VPN/file/video/service payloads remain unavailable.
  • C17Z1 production fabric-control multi-hop route-path boundary is implemented and test-proven: production fabric.control envelopes can carry route_path and visited_node_ids; relay nodes validate path position, forward only to the next path node, update TTL/hop/visited metadata, and reject loops. Service payloads remain unavailable.
  • C17Z2 production fabric-control forwarding observability boundary is implemented and test-proven: node-agent emits local mesh_production_forward_event logs for accepted, forwarded, delivered, and rejected production fabric.control envelopes. Logs are metadata-only and include no payload bodies or read API.
  • C17Z3 production fabric-control route-config boundary is implemented and test-proven: when scoped/control-plane mesh routes are available locally, production fabric.control envelopes must match configured route_id/path/next-hop/channel/expiry/TTL/hop limits before forwarding.
  • C17Z4 scoped peer directory and recovery seeds boundary is implemented and test/build-proven: node-scoped mesh config carries scoped peer_directory and explicit bounded recovery_seeds; node-agent parses/validates them and web-admin shows counts.
  • C17Z5 node-agent peer cache runtime boundary is implemented and test-proven: node-agent builds a local PeerCache, selects bounded warm peers, probes warm peers with /mesh/v1/health, and reports metadata-only mesh-link observations when synthetic mesh testing is enabled.
  • C17Z6 dynamic endpoint reporting boundary is implemented and test-proven: node-agent reports explicit advertised mesh endpoint metadata in heartbeat, and Control Plane projects latest reported endpoints/candidates into node-scoped synthetic mesh config.
  • C17Z7 private/corporate endpoint candidate boundary is implemented and test-proven: node-agent reports multiple advertised endpoint candidates, scoring rewards private/corporate same-site candidates, and peer cache can use the best candidate address for warm health.
  • C17Z8 peer connection state machine boundary is implemented and test-proven: node-agent tracks warm-peer states disconnected, connecting, ready, degraded, and backoff, with bounded backoff after repeated health probe failures.
  • C17Z9 peer recovery planner boundary is implemented and test-proven: node-agent targets a bounded stable ready-peer set, enters recovery when ready peers fall below target, and selects bounded recovery probes from warm peers, recovery seeds, and other connectable scoped peers.
  • C17Z10 peer connection intent planner boundary is implemented and test-proven: node-agent classifies bounded peer work as maintain/probe/recover and classifies transport readiness as direct/private_lan/corporate_lan/outbound_only/relay_required, with rendezvous-required metadata only.
  • C17Z11 peer connection manager runtime boundary is implemented and test-proven: node-agent uses a reusable HTTP keep-alive client for real control-plane health probes of direct/private/corporate peers and records waiting_rendezvous for outbound-only/relay-required peers.
  • C17Z12 rendezvous/relay control-plane contract is implemented and docker-test-runtime-proven: backend issues node-scoped rendezvous_leases, node-agent resolves matching waiting_rendezvous intents into relay_control, probes relay /mesh/v1/health, records and maintains relay_ready, and keeps service payload forwarding disabled.
  • C17Z13 rendezvous lease telemetry is implemented and docker-test-runtime-proven: node-agent reports mesh_rendezvous_lease_report with relay admission, peer admission, TTL/renewal posture, relay_ready, and explicit no-payload boundary flags; web-admin shows rv leases in recent heartbeat tables.
  • C17Z14 rendezvous lease refresh contract is implemented and docker-test-runtime-proven: node-agent refreshes renewal-needed/stale rendezvous leases through node-scoped synthetic config reload, updates the running peer cache/route/lease state, and reports refresh plus stale relay withdrawal/reselection telemetry. Service payload forwarding remains unavailable.
  • C17Z15 backend relay replacement policy is implemented and docker-test-runtime-proven: backend consumes recent stale-relay heartbeat feedback, withdraws stale explicit rendezvous leases, scores alternate relay candidates from route adjacency, endpoint priority, policy tags, and recent mesh-link health, and returns replacement leases plus rendezvous_relay_policy decisions in node-scoped synthetic config. Node-agent reports c17z15.mesh_rendezvous_lease_report.v1 and keeps stale state scoped to the exact lease/relay, so replacement leases for the same peer are not marked stale by association. Service payload forwarding remains unavailable.
  • C17Z16 route/path decision artifact is implemented and docker-test-runtime-proven: backend c17z16.synthetic.v1 config includes route_path_decisions with original hops, effective hops, local previous/next hop, selected replacement relay, generation, score reasons, and no-payload boundary flags. Node-agent stores the control-plane route generation and reports c17z16.mesh_route_path_decision_report.v1 plus c17z16.mesh_rendezvous_lease_report.v1. Service payload forwarding remains unavailable.
  • C17Z17 node-side route generation tracker is implemented and docker-test-runtime-proven: backend c17z17.synthetic.v1 config and node-agent mesh_route_generation_report track active/applied/unchanged/withdrawn route decisions, generation changes, total counters, and withdrawn_by_replacement records for stale relay paths when replacement is first observed. Service payload forwarding remains unavailable.
  • C17Z18 synthetic route-health effective path runtime is implemented and docker-test-runtime-proven: backend c17z18.synthetic.v1 config and node-agent mesh_route_health_config_report apply Control Plane route_path_decisions to synthetic route-health route config only. The synthetic runtime probes selected effective paths through replacement relays, reports expected/observed hops and drift state, and backend latest mesh links preserve route-health observations separately from connection-manager observations. Service payload forwarding remains unavailable.
  • C17Z19 synthetic route-health feedback scoring is implemented and docker-test-runtime-proven: backend consumes recent synthetic_route_health observations in relay scoring, uses drift/unreachable/failure metadata to mark the exact selected relay stale, boosts healthy low-latency relay candidates, and returns replacement leases/route decisions through the existing synthetic config contract. Migration 000022 adds the synthetic mesh service class. Service payload forwarding remains unavailable.
  • C17Z20 node-side route-health feedback refresh is implemented and docker-test-runtime-proven: after reporting synthetic route-health drift/unreachable/failure, node-agent performs a bounded node-scoped synthetic-config refresh, applies returned replacement route decisions to route-health config immediately, and reports c17z20.mesh_route_health_feedback_refresh_report.v1. Service payload forwarding remains unavailable.
  • Installation Authority foundation is implemented: production requires strict Product Root public key config, first-owner bootstrap uses signed Ed25519 activation manifests, installation_authority and signed platform_role_grants are persisted, and strict platform-admin checks ignore direct users.platform_role database edits without a valid signed grant. Web-admin exposes installation status/first-owner bootstrap, and scripts/installation/product-root-tool.go generates keys/manifests for offline product-root operations.
  • Cluster Authority and node enrollment bootstrap are docker-test lifecycle smoke-proven in run dev-bootstrap-20260428-201430: a fresh dev install bootstrapped the first owner, created a cluster, issued a signed join token, accepted real rap-node-agent enrollment, owner-approved the join request, agent-polled signed bootstrap, persisted cluster authority pin, heartbeated, and verified signed c17z18.synthetic.v1 Control Plane config. Production service payload forwarding remains unavailable.
  • Migration 000021_cluster_authority_keys drops/recreates cluster_admin_summaries because fresh replay proved PostgreSQL cannot change that view layout via CREATE OR REPLACE VIEW.
  • rap-node-agent desired-workload polling/status reporting is gated by RAP_WORKLOAD_SUPERVISION_ENABLED, which defaults to false while service runtime supervision remains a stub.
  • C18 VPN/IP tunnel service target design is completed as documentation only.
  • C18A VPN/IP tunnel control-plane data model foundation is implemented and backend-test-proven.
  • C18B VPN/IP tunnel lease/fencing hardening is implemented and backend-test-proven.
  • C18C VPN/IP tunnel node-agent desired-state consumption/reporting is implemented and backend-test-proven.
  • No next platform-core implementation step is automatically authorized after C17Z20. The next mesh layer should stay limited to route-health feedback refresh dampening/no-change cooldown unless the user explicitly chooses another staged task.
  • Latest RDP performance reference image: rap-rdp-worker:rdp-perf6-dirty-region
  • Stage 5.2 file-download runtime artifacts remain preserved for when RDP work resumes, but they are not the active next task.
  • Do not use docker.cin.su for this project unless explicitly requested for a separate one-off check.
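The bounded production observation sink described in the C17L, C17N, and C17R items above can be sketched as a fixed-capacity ring buffer. This is a minimal illustration, not the node-agent implementation: the names BoundedSink, Observation, and SinkMetrics and the record fields are hypothetical. The drop-oldest behavior, the four aggregate metrics, and the 10000 capacity ceiling come from this document. A capacity of zero is treated here as "do not construct a sink", matching the disabled-by-default wiring.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// MaxSinkCapacity reflects the C17R guard: capacities above 10000 are rejected.
const MaxSinkCapacity = 10000

// Observation is a hypothetical metadata-only record. It has no payload field.
type Observation struct {
	RouteID string
	Channel string
}

// SinkMetrics carries only the four aggregate values named in C17N.
type SinkMetrics struct {
	Capacity           int
	Depth              int
	AcceptedTotal      uint64
	DroppedOldestTotal uint64
}

// BoundedSink is a fixed-capacity ring buffer that drops the oldest entry when full.
type BoundedSink struct {
	mu       sync.Mutex
	buf      []Observation
	head     int // index of the oldest retained entry
	depth    int // number of retained entries
	accepted uint64
	dropped  uint64
}

// NewBoundedSink rejects capacities outside 1..MaxSinkCapacity.
func NewBoundedSink(capacity int) (*BoundedSink, error) {
	if capacity < 1 || capacity > MaxSinkCapacity {
		return nil, errors.New("sink capacity must be between 1 and 10000")
	}
	return &BoundedSink{buf: make([]Observation, capacity)}, nil
}

// Observe retains o, overwriting the oldest entry if the sink is full.
func (s *BoundedSink) Observe(o Observation) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.depth == len(s.buf) {
		s.buf[s.head] = o
		s.head = (s.head + 1) % len(s.buf)
		s.dropped++
	} else {
		s.buf[(s.head+s.depth)%len(s.buf)] = o
		s.depth++
	}
	s.accepted++
}

// Metrics exposes aggregates only; no records leave the sink.
func (s *BoundedSink) Metrics() SinkMetrics {
	s.mu.Lock()
	defer s.mu.Unlock()
	return SinkMetrics{
		Capacity:           len(s.buf),
		Depth:              s.depth,
		AcceptedTotal:      s.accepted,
		DroppedOldestTotal: s.dropped,
	}
}

func main() {
	sink, err := NewBoundedSink(2)
	if err != nil {
		panic(err)
	}
	for _, id := range []string{"r1", "r2", "r3"} {
		sink.Observe(Observation{RouteID: id, Channel: "fabric.control"})
	}
	// Prints: {Capacity:2 Depth:2 AcceptedTotal:3 DroppedOldestTotal:1}
	fmt.Printf("%+v\n", sink.Metrics())
}
```

The sketch deliberately has no method that returns stored records, which mirrors the "no read API" boundary stated for C17M through C17P.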

Backend

  • Go
  • PostgreSQL = source of truth
  • Redis = live coordination / routing only
  • REST for control plane
  • WebSocket for live session channel

Worker

  • C++ worker
  • FreeRDP integration
  • worker runtime hides FreeRDP details from backend
  • The C++ worker remains the primary RDP runtime.
  • Target RDP performance direction: docs/architecture/RDP_SERVICE_CPP_PERFORMANCE_TARGET.md.
  • The RDP performance rewrite scope is limited to C++ RDP service adapter internals. It must not redesign backend control plane, cluster transport, organizations, leases, or session lifecycle.
  • The C# RDP service skeleton is inactive research scaffolding and is not the current runtime direction.
  • Current RDP Adapter baseline: RDP-Perf-6 dirty-region direct binary rendering is completed and smoke-proven on docker-test. RDP work is paused by product decision; next active work is Fabric Core / cluster foundation.
  • P3/P3.1 security-readiness foundation exists: production mode rejects plaintext credential-like resource metadata, requires secret_ref for RDP/VNC/SSH resources, and has an encrypted PostgreSQL-backed resource secret storage/resolver MVP. P3.2 direct-worker TLS/PKI guard exists.
  • P3.3 production-like test-stand smoke is complete on docker-test: backend runs in APP_ENV=production with a test-only secret key file, a secret-backed RDP resource starts real sessions through the resolver path, metadata/audit do not contain plaintext credentials, and backend gateway fallback remains available when direct worker WSS trust is smoke_insecure.
  • P3.4 production direct-worker WSS trust model is documented in docs/architecture/PRODUCTION_DIRECT_WORKER_WSS_TRUST.md; it defines platform CA/public CA behavior, worker certificate SAN/identity requirements, app-local Windows trust direction, rotation/revocation, and the future platform_ca smoke plan. No RDP runtime behavior changed in P3.4.
  • P3.5 app-local platform CA trust is implemented and runtime-proven on docker-test: Windows client validates direct worker WSS with an app-local platform CA bundle, keeps hostname/SAN validation enabled, selects direct_worker_wss without insecure TLS bypass, and falls back to backend gateway for unknown CA / smoke-only production cases.
  • P3.6 stale Redis worker/live event idempotency is implemented and runtime-proven: stale worker events for terminal PostgreSQL sessions are ignored, backend restart survives stale Redis events, and terminal sessions are not reopened.
  • Stage 5.2 server-to-client file download core data path is runtime-proven: direct worker WSS and backend gateway fallback both download text/binary files from RAP_Transfers\ToClient with matching size/hash, and direct policy blocking is proven for disabled and client_to_server. Lifecycle blocking is also runtime-proven for detach, old-client takeover, and worker failure. Runtime report: artifacts/stage5-2-file-download-runtime-report.md.
  • Stage 5.2 is not fully accepted yet. Remaining proof: Windows desktop UI download path and regression matrix for rendering/input/clipboard/upload/reconnect/takeover.
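The P3.6 idempotency rule above (stale worker events for terminal PostgreSQL sessions are ignored, and terminal sessions are not reopened) can be sketched as a small guard. The state names, the WorkerEvent type, and ApplyWorkerEvent are hypothetical and chosen for illustration only; the sketch assumes the persisted state has already been read from PostgreSQL and the event arrived through Redis.

```go
package main

import "fmt"

// SessionState is a hypothetical set of persisted session states.
type SessionState string

const (
	StateActive     SessionState = "active"
	StateDetached   SessionState = "detached"
	StateTerminated SessionState = "terminated"
	StateFailed     SessionState = "failed"
)

// IsTerminal reports whether a session can never be reopened.
func (s SessionState) IsTerminal() bool {
	return s == StateTerminated || s == StateFailed
}

// WorkerEvent is a hypothetical live event delivered through Redis.
type WorkerEvent struct {
	SessionID string
	NewState  SessionState
}

// ApplyWorkerEvent returns the state to persist and whether the event was applied.
// The persisted (PostgreSQL) state is authoritative: if it is terminal,
// the event is treated as stale and dropped without changing anything.
func ApplyWorkerEvent(persisted SessionState, ev WorkerEvent) (SessionState, bool) {
	if persisted.IsTerminal() {
		return persisted, false
	}
	return ev.NewState, true
}

func main() {
	stale := WorkerEvent{SessionID: "s1", NewState: StateActive}
	state, applied := ApplyWorkerEvent(StateTerminated, stale)
	fmt.Println(state, applied) // terminated false

	state, applied = ApplyWorkerEvent(StateDetached, stale)
	fmt.Println(state, applied) // active true
}
```

Applying the same stale event any number of times leaves a terminal session unchanged, which is the property that lets a backend restart survive leftover Redis events.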

Clients

  • future native clients:
    • Windows: native desktop client first
    • Linux: native desktop client later
  • web UI is admin/control plane, not the primary power-user client

Final architecture direction

The long-term target architecture is documented in:

  • docs/architecture/SECURE_ACCESS_FABRIC_TARGET.md
  • docs/architecture/CLUSTER_NODE_ADMIN_FOUNDATION.md
  • docs/architecture/WEB_INGRESS_AND_ADMIN_UI_MODEL.md

SECURE_ACCESS_FABRIC_TARGET.md defines the target Secure Access Fabric architecture only. It is not the current implementation scope and must not be used as permission to start mesh, VPN, multi-cluster, updater, or realtime data-plane migration work without an explicit staged prompt.

CLUSTER_NODE_ADMIN_FOUNDATION.md defines the next platform-core planning baseline for clusters, node enrollment, native node-agent identity, platform admin console, multi-cluster administration, and future organization admin visibility. It is a staged foundation document, not permission to implement mesh packet routing or VPN runtime.

WEB_INGRESS_AND_ADMIN_UI_MODEL.md defines WEB as HTTP/HTTPS ingress and Admin UI presentation only. Cluster configuration remains Control Plane ownership through scoped APIs, PostgreSQL source-of-truth mutations, and audit. Dynamic pages must be safe schema-driven projections and must not embed internal topology, peer caches, route caches, secrets, raw credentials, or arbitrary executable code.

Admin endpoint placement is explicit. Fabric Storage / Config Storage nodes do not automatically host or move the cluster panel. Platform Owner Console remains global platform-owner scope. Cluster Admin Endpoint requires explicit admin/web ingress role assignment, cluster health/trust readiness, and Control Plane authorization. Organization Admin Panel remains a tenant-safe projection.
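The placement rule for the Cluster Admin Endpoint can be expressed as a conjunction of explicit conditions. The following Go sketch assumes a hypothetical NodeAdminStatus struct; none of these field names come from the codebase. It only illustrates that hosting Fabric Storage / Config Storage contributes nothing to eligibility.

```go
package main

import "fmt"

// NodeAdminStatus is a hypothetical view of the inputs to the placement decision.
type NodeAdminStatus struct {
	HostsFabricStorage     bool // deliberately not used by the decision
	AdminIngressRole       bool // explicit admin/web ingress role assignment
	ClusterHealthReady     bool
	ClusterTrustReady      bool
	ControlPlaneAuthorized bool
}

// CanHostClusterAdminEndpoint requires every explicit condition to hold.
// HostsFabricStorage is ignored on purpose: storage nodes do not
// automatically host or move the cluster panel.
func CanHostClusterAdminEndpoint(n NodeAdminStatus) bool {
	return n.AdminIngressRole &&
		n.ClusterHealthReady &&
		n.ClusterTrustReady &&
		n.ControlPlaneAuthorized
}

func main() {
	storageOnly := NodeAdminStatus{HostsFabricStorage: true, ClusterHealthReady: true, ClusterTrustReady: true}
	assigned := NodeAdminStatus{AdminIngressRole: true, ClusterHealthReady: true, ClusterTrustReady: true, ControlPlaneAuthorized: true}
	fmt.Println(CanHostClusterAdminEndpoint(storageOnly)) // false
	fmt.Println(CanHostClusterAdminEndpoint(assigned))    // true
}
```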

The final platform must support:

  1. Multi-tenancy / Organizations
  • platform has many organizations
  • each organization has isolated users, groups, resources, policies, audit, connectors
  • users may belong to multiple organizations
  • organization admins only see their organization
  • platform admins see platform scope
  2. Identity federation
  • local users
  • LDAP / Active Directory
  • OIDC
  • future extensibility for more identity sources
  • access mappings based on external groups / claims
  3. Cluster of nodes
  • no mandatory single central node
  • many nodes across many sites
  • nodes can be platform-managed or customer-managed
  • customer-managed nodes are sandboxed cluster participants, not full cluster owners
  4. Node agent
  • small stable always-running agent on every node
  • supervises services
  • downloads updates
  • verifies signed artifacts
  • can rollback to previous version
  • can restart crashed services
  • can work on thin or thick nodes
  5. Service-based node model. Each node is not monolithic. A node has:
  • capabilities: what it can do physically/technically
  • enabled services: what it is allowed/assigned to do

Possible services include:

  • ingress-gateway
  • mesh-router
  • relay
  • connector-host
  • vpn-adapter
  • session-worker
  • media-relay
  • file-relay
  • update-cache
  • config-replica
  • audit-sink
  • metrics-exporter
  6. Cluster mesh and routing
  • encrypted inter-node communication
  • dynamic topology
  • no need for full mesh
  • multi-hop routing allowed
  • route failover
  • client failover between ingress nodes
  • connector failover between nodes
  7. Split-brain prevention
  • quorum-based cluster behavior
  • minority partition must not become a second authoritative cluster
  • degraded / recovery / isolated modes
  • manual recovery / promote decision by platform recovery admin
  8. Connector / VPN layer
  • connectors are reusable network access methods
  • one connector may be used by multiple resources
  • connector placement and failover are controlled by policy
  • nodes may be allowed or disallowed to host connectors
  • direct access, VPN, relay and future egress modes must fit this model
  9. Future exit mode
  • split tunnel
  • full tunnel
  • internet access through cluster
  • not first implementation priority
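The split-brain prevention requirement in the list above (a minority partition must not become a second authoritative cluster) can be illustrated with a strict-majority check. This document does not specify the quorum rule, so strict majority of voting nodes is an assumption of this sketch, and HasQuorum is a hypothetical helper rather than an existing function.

```go
package main

import "fmt"

// HasQuorum reports whether a partition that can reach reachableVoters out of
// totalVoters may act as the authoritative cluster.
// Assumption: quorum means a strict majority of voting nodes.
func HasQuorum(reachableVoters, totalVoters int) bool {
	if totalVoters <= 0 || reachableVoters < 0 || reachableVoters > totalVoters {
		return false
	}
	return reachableVoters > totalVoters/2
}

func main() {
	// Five voters split 3/2: only the majority side stays authoritative.
	fmt.Println(HasQuorum(3, 5), HasQuorum(2, 5)) // true false

	// Four voters split 2/2: neither side has quorum, so neither may
	// promote itself without a manual recovery decision.
	fmt.Println(HasQuorum(2, 4)) // false
}
```

With strict majority, two disjoint partitions can never both pass the check, which is why the even 2/2 split leaves both sides non-authoritative and waiting for the platform recovery admin.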

Non-negotiable design rules

  • Do not rewrite proven session lifecycle carelessly.
  • Do not turn Redis into a source of truth.
  • Do not make certificate-ignore a global worker setting.
  • Do not make customer-managed nodes platform-wide trusted by default.
  • Do not create a separate cluster per organization.
  • Do not assume a single permanently reachable central node.
  • Do not rely on “secret protocol with no docs” as security.
  • Security must come from crypto, auth, isolation, policy and observability.
  • Prefer incremental evolution from current proven system.
  • Do not collapse platform control plane and data plane into one vague layer.

Implementation strategy

The codebase must evolve in phases.

Current implementation focus remains:

  • RDP work is paused by product decision
  • preserve the accepted RDP Adapter baseline and Stage 5.x file-transfer work
  • do not delete or rewrite the current RDP MVP while platform-core work starts
  • C1-C9 platform-core foundations are implemented and verified: clusters, node enrollment, node-agent scaffold, platform admin console, workload supervision contract, mesh control-plane prep, mesh skeleton, multi-cluster hardening, and organization admin foundation
  • C10 Fabric Core configuration distribution design is completed
  • C11 signed scoped cluster snapshot model is completed
  • C12 node local state store is completed
  • C13 Fabric Storage / Config Storage service foundation is completed
  • C14 peer directory and cache model is completed
  • C15 Fabric Routing Engine skeleton is completed
  • C16 secure node-to-node channel lifecycle is completed
  • C17 mesh routing runtime implementation plan is completed
  • C17A synthetic mesh runtime skeleton is implemented and test-proven with synthetic fabric messages only, no RDP/VPN/production service traffic
  • C17B route health and failover probes are implemented and test-proven with synthetic traffic only, no RDP/VPN/production service traffic
  • C17C relay semantic hardening is implemented and test-proven with synthetic channel classes only, no RDP/VPN/production service traffic
  • C17D non-production test-service path is implemented and test-proven with bounded synthetic.echo traffic only, no RDP/VPN/production service traffic
  • C17E live node-to-node synthetic HTTP transport is implemented and smoke-proven with synthetic traffic only
  • C17F scoped synthetic route config loading and route-health reporting is implemented and smoke-proven with synthetic traffic only
  • C17G Control Plane scoped synthetic config read/consume is implemented and test-proven with synthetic traffic only
  • C17H deployed multi-agent synthetic config smoke is implemented and runtime-proven on docker-test with synthetic traffic only
  • C17I production forwarding gate foundation is implemented and test-proven; production forwarding remains unavailable
  • C17J production envelope contract validation is implemented and test-proven; production forwarding remains unavailable
  • C17K production envelope observation is implemented and test-proven; production forwarding remains unavailable
  • C17L bounded production observation sink is implemented and test-proven; production forwarding remains unavailable
  • C17M production observation sink wiring is implemented and test-proven; production forwarding remains unavailable
  • C17N production observation sink metrics are implemented and test-proven; production forwarding remains unavailable
  • C17O production observation sink local metrics logging is implemented and test-proven; production forwarding remains unavailable
  • C17P production observation sink change-driven metrics logging is implemented and test-proven; production forwarding remains unavailable
  • C17Q production forwarding gate/runtime log boundary is implemented and test-proven; production forwarding remains unavailable
  • C17R production observation sink capacity guard is implemented and test-proven; production forwarding remains unavailable
  • C17S production observation panic fail-closed hardening is implemented and test-proven; production forwarding remains unavailable
  • C17T production envelope payload boundary is implemented and test-proven; production forwarding remains unavailable
  • C17U production envelope created-at skew boundary is implemented and test-proven; production forwarding remains unavailable
  • C17V peer endpoint candidate model and NAT/connectivity hints are implemented and test-proven; production forwarding remains unavailable
  • C17W peer endpoint candidate scoring model is implemented and test-proven; production forwarding remains unavailable
  • C17X health-aware endpoint candidate scoring overlay is implemented and test-proven; production forwarding remains unavailable
  • C17Y Platform Owner synthetic mesh visibility is implemented and build/test-proven; production forwarding remains unavailable
  • C17Z production fabric-control direct forwarding is implemented and test-proven; production service traffic remains unavailable
  • C17Z1 production fabric-control multi-hop route-path forwarding is implemented and test-proven; production service traffic remains unavailable
  • C17Z2 production fabric-control forwarding observability is implemented and test-proven; production service traffic remains unavailable
  • C17Z3 production fabric-control route-config boundary is implemented and test-proven; production service traffic remains unavailable
  • C17Z4 scoped peer directory/recovery seed boundary is implemented and test/build-proven; production service traffic remains unavailable
  • C17Z5 node-agent peer cache runtime boundary is implemented and test-proven; production service traffic remains unavailable
  • C17Z6 dynamic endpoint reporting boundary is implemented and test-proven; production service traffic remains unavailable
  • C17Z7 private/corporate endpoint candidate boundary is implemented and test-proven; production service traffic remains unavailable
  • C17Z8 peer connection state machine boundary is implemented and test-proven; production service traffic remains unavailable
  • C17Z9 peer recovery planner boundary is implemented and test-proven; production service traffic remains unavailable
  • C17Z10 peer connection intent planner boundary is implemented and test-proven; production service traffic remains unavailable
  • C17Z11 peer connection manager runtime boundary is implemented and test-proven; production service traffic remains unavailable
  • C17Z12 rendezvous/relay control-plane contract is implemented and docker-test-runtime-proven; production service traffic remains unavailable
  • C17Z13 rendezvous lease telemetry is implemented and docker-test-runtime-proven; production service traffic remains unavailable
  • C17Z14 rendezvous lease refresh contract is implemented and docker-test-runtime-proven; production service traffic remains unavailable
  • C17Z15 backend relay replacement policy is implemented and docker-test-runtime-proven; production service traffic remains unavailable
  • C17Z16 route/path decision artifact is implemented and docker-test-runtime-proven; production service traffic remains unavailable
  • C17Z17 node-side route generation tracker is implemented and docker-test-runtime-proven; production service traffic remains unavailable
  • C17Z18 synthetic route-health effective path runtime is implemented and docker-test-runtime-proven; production service traffic remains unavailable
  • C17Z19 synthetic route-health feedback scoring is implemented and docker-test-runtime-proven; production service traffic remains unavailable
  • C17Z20 node-side route-health feedback refresh is implemented and docker-test-runtime-proven; production service traffic remains unavailable
  • Cluster Authority plus node enrollment bootstrap polling are docker-test lifecycle-smoke-proven; fresh install migration replay is fixed for cluster_admin_summaries
  • C18 VPN/IP tunnel service target design is completed as documentation only
  • C18A VPN/IP tunnel control-plane data model foundation is implemented and backend-test-proven
  • C18B VPN/IP tunnel lease/fencing hardening is implemented and backend-test-proven
  • C18C VPN/IP tunnel node-agent desired-state consumption/reporting is implemented and backend-test-proven
  • Version Storage / Update Repository is documented as a future Fabric Core service for signed release manifests, OS/arch artifacts, stable/current/candidate channels, update-cache mirroring, node-agent update supervision, rollback, and explicit data-structure migration bundles. Runtime updater behavior is not implemented.
  • no next platform-core implementation step is automatically authorized after C17Z20; choose the next narrow staged prompt explicitly before continuing
  • preserve the proven RDP lifecycle behavior
  • keep the current backend gateway available as the active/fallback implementation path

The current phase is NOT:

  • full mesh routing implementation
  • full VPN orchestration
  • multi-cluster runtime traffic handling
  • production data-plane migration
  • updater runtime
  • video meetings
  • final native client UI redesign

Future mesh, VPN, multi-cluster, node-agent updater, and production realtime data-plane work must be introduced only through explicit, narrow, staged implementation prompts.

Always keep the project production-oriented. Do not simplify it into a toy app.