m/rdp-proxy

m 8ba0561f4f Initial project snapshot

2026-04-28 22:29:50 +03:00

Fabric Routing Engine Skeleton

Status: Stage C15 result. Documentation and architecture only.

This document defines the Fabric Routing Engine skeleton boundary. It does not introduce code, migrations, or APIs, and it does not implement mesh runtime traffic, VPN/IP tunnel runtime, relay packet routing, RDP work, or service workload execution.

1. Purpose

The Fabric Routing Engine is the logical Fabric layer responsible for choosing authorized paths between ingress, core, egress, service, storage, and future VPN/IP-tunnel components.

C15 defines the route decision boundary before runtime mesh routing exists.

The purpose is to ensure that future routing:

is policy-aware
is QoS-aware
is channel-aware
respects cluster and organization boundaries
uses scoped local state and peer cache
does not depend on live backend availability for realtime decisions
is not implemented independently by Service Adapters

2. Non-Goals

C15 does not:

carry production mesh traffic
implement node-to-node transport
implement relay forwarding
implement VPN/IP tunnel packets
implement QUIC/WebRTC
implement route execution
implement service workloads
change RDP runtime
change backend session lifecycle
change Windows client behavior

It defines contracts and responsibilities only.

3. Routing Engine Responsibilities

The Fabric Routing Engine owns:

route request validation
peer candidate filtering
route scoring
channel-aware path selection
QoS class selection
route cache lookup/update policy
failover decision boundaries
shortcut recommendation boundaries
topology hiding
policy and cluster-boundary enforcement
service adapter routing integration boundary

The Routing Engine does not own:

PostgreSQL source-of-truth mutation
service protocol translation
RDP/VNC/SSH/VPN implementation details
raw packet forwarding
direct secret resolution
organization admin visibility
node enrollment authority

4. Inputs

Routing decisions may consume:

signed scoped cluster snapshot
node-local peer cache
route cache
peer directory
route policy
QoS policy
service assignment cache
cluster membership
organization scope
service/resource scope
channel class
current health/degraded state
partition/authority state
failure history
load and latency observations

Routing decisions must not require a live backend call in the realtime path.

5. Route Request Contract

A route request is a logical request for a path. It is not a packet.

Required fields:

request_id
cluster_id
organization_id where applicable
source_node_id
source_role
destination_kind
destination_ref
service_type
channel_class
priority_class
policy_refs
requested_at

Destination kinds:

node
egress_pool
service_instance
resource_target
vpn_connection
storage_scope
control_plane_endpoint

Optional fields:

session_id
attachment_id
resource_id
user_id
device_id
region_preference
required_capabilities
forbidden_nodes
preferred_nodes
max_latency_ms
min_bandwidth_hint
stickiness_key
previous_route_id
failure_context

Service adapters may create route requests through an adapter-facing boundary, but they must not select peers or paths themselves.
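
As an illustration only, the request contract could be modeled as an immutable record with a structural validation step. This is a minimal sketch, not part of C15: the class name RouteRequest, the helper validate_request, the choice to model organization_id as optional, and the subset of optional fields shown are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Destination kinds listed in the contract.
DESTINATION_KINDS = frozenset({
    "node", "egress_pool", "service_instance", "resource_target",
    "vpn_connection", "storage_scope", "control_plane_endpoint",
})

# Required text fields that must be non-empty (hypothetical grouping).
REQUIRED_TEXT_FIELDS = (
    "request_id", "cluster_id", "source_node_id", "source_role",
    "destination_ref", "service_type", "channel_class", "priority_class",
)


@dataclass(frozen=True)
class RouteRequest:
    """A logical request for a path. It is not a packet."""
    request_id: str
    cluster_id: str
    source_node_id: str
    source_role: str
    destination_kind: str
    destination_ref: str
    service_type: str
    channel_class: str
    priority_class: str
    policy_refs: tuple[str, ...]
    requested_at: datetime
    # "organization_id where applicable" is modeled as optional (assumption).
    organization_id: Optional[str] = None
    # A few optional fields; the remaining ones are omitted for brevity.
    session_id: Optional[str] = None
    max_latency_ms: Optional[int] = None
    forbidden_nodes: frozenset[str] = frozenset()
    preferred_nodes: frozenset[str] = frozenset()


def validate_request(req: RouteRequest) -> list[str]:
    """Return structural problems; an empty list means the request is well formed."""
    problems = [f"missing {name}" for name in REQUIRED_TEXT_FIELDS
                if not getattr(req, name)]
    if req.destination_kind not in DESTINATION_KINDS:
        problems.append(f"unknown destination_kind: {req.destination_kind}")
    if req.forbidden_nodes & req.preferred_nodes:
        problems.append("node listed as both forbidden and preferred")
    return problems


example = RouteRequest(
    request_id="req-1", cluster_id="cluster-a", source_node_id="node-1",
    source_role="ingress", destination_kind="egress_pool",
    destination_ref="pool-a", service_type="rdp", channel_class="input",
    priority_class="interactive", policy_refs=("policy-1",),
    requested_at=datetime.now(timezone.utc),
)
print(validate_request(example))  # an empty list: no structural problems
```

The record carries no peer or path choice, which matches the rule that adapters describe what they need and leave selection to the Routing Engine.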

6. Route Result Contract

A route result is a signed or locally verifiable decision artifact that is valid only for a bounded time.

Required fields:

route_id
request_id
cluster_id
organization_id where applicable
route_class
channel_class
selected_path
selected_qos_class
score
valid_from
expires_at
route_epoch
policy_version
decision_reason

Selected path contains ordered logical hops:

source node
optional ingress node
zero or more core/relay nodes
optional egress/service node
target/service endpoint

Optional fields:

fallback_paths
shortcut_candidate
stickiness_key
drain_after
degraded_mode
constraints_applied
rejection_reason

Route results must be bounded by expiry, policy version, route epoch, and cluster authority state.
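
The four bounds above can be read as a single usability check that a consumer runs before acting on a result. The sketch below is hypothetical: the names RouteResult and is_usable, the integer types for route_epoch and policy_version, and the boolean authority_allows_use standing in for cluster authority state are assumptions, not C15 definitions.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class RouteResult:
    route_id: str
    request_id: str
    cluster_id: str
    route_class: str
    channel_class: str
    selected_path: tuple[str, ...]  # ordered logical hops
    selected_qos_class: str
    score: float
    valid_from: datetime
    expires_at: datetime
    route_epoch: int
    policy_version: int
    decision_reason: str


def is_usable(result: RouteResult, now: datetime, current_epoch: int,
              current_policy_version: int, authority_allows_use: bool) -> bool:
    """True only while every bound on the route result still holds."""
    if result.route_class == "unavailable":
        return False  # an explicit denial is never usable
    if not authority_allows_use:
        return False  # cluster authority state forbids use
    if result.route_epoch != current_epoch:
        return False  # route epoch moved on
    if result.policy_version != current_policy_version:
        return False  # policy changed since the decision
    return result.valid_from <= now < result.expires_at
```

Treating any single failed bound as "not usable" keeps the check conservative: a stale result is discarded and a fresh decision is requested.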

7. Channel Classes

Routing is channel-aware.

Initial channel classes:

control
input
render
cursor
clipboard
file_transfer
telemetry
vpn_packet
storage_fetch
update_fetch

Rules:

input and critical control prefer lowest latency and lowest jitter.
render prefers bandwidth and bounded jitter; stale render may be dropped.
cursor is latest-only and should use low-latency paths.
clipboard is reliable and bounded.
file_transfer prefers throughput but must not starve input/control/render.
telemetry is low priority and may be sampled or dropped.
vpn_packet uses adaptive QoS and bulk protection.
storage_fetch and update_fetch should not consume interactive reserves.
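
One way to keep these rules in a single place is a per-channel profile table. The following is a sketch under assumptions: ChannelProfile, its field names, and the two helper sets are hypothetical, and only properties stated in the rules above are encoded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChannelProfile:
    prefers: tuple[str, ...]   # what path selection optimizes for
    may_drop: bool = False     # stale or sampled data may be dropped
    latest_only: bool = False  # only the newest update matters


CHANNEL_PROFILES = {
    "control": ChannelProfile(("latency", "jitter")),
    "input": ChannelProfile(("latency", "jitter")),
    "render": ChannelProfile(("bandwidth", "bounded_jitter"), may_drop=True),
    "cursor": ChannelProfile(("latency",), latest_only=True),
    "clipboard": ChannelProfile(("reliability",)),
    "file_transfer": ChannelProfile(("throughput",)),
    "telemetry": ChannelProfile((), may_drop=True),
    "vpn_packet": ChannelProfile(("adaptive_qos",)),
    "storage_fetch": ChannelProfile(("throughput",)),  # preference is an assumption
    "update_fetch": ChannelProfile(("throughput",)),   # preference is an assumption
}

# Classes that file_transfer must not starve.
PROTECTED_FROM_BULK = frozenset({"input", "control", "render"})

# Classes that should not consume interactive reserves.
NO_INTERACTIVE_RESERVE = frozenset({"storage_fetch", "update_fetch"})


def may_use_interactive_reserve(channel_class: str) -> bool:
    """False for the fetch classes that must stay out of interactive reserves."""
    if channel_class not in CHANNEL_PROFILES:
        raise KeyError(f"unknown channel class: {channel_class}")
    return channel_class not in NO_INTERACTIVE_RESERVE
```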

8. Route Classes

Initial route classes:

direct
single_relay
multi_hop
storage_local
storage_remote
vpn_chained
degraded_existing
unavailable

direct:

selected when source can safely reach destination directly
trust and policy must allow it

single_relay:

selected when one relay improves connectivity or policy requires relay

multi_hop:

selected when neither a direct nor a single-relay route is available, or when policy or region constraints require multiple hops

storage_local / storage_remote:

used for config/snapshot/artifact fetch decisions

vpn_chained:

used when a managed service or IP tunnel depends on a logical vpn_connection

degraded_existing:

keeps an already-authorized existing path alive while policy permits

unavailable:

explicit denial or no valid route

9. Hard Policy Checks

Hard checks run before scoring.

Reject route when:

source node is not trusted
source node is not a member of the cluster
destination is outside cluster scope
cross-cluster trust is missing
organization scope does not match
role assignment does not permit the route
peer certificate is invalid or revoked
required channel is not authorized
partition/authority state forbids new route
destination node is draining or disabled and policy forbids placement
route would leak topology or tenant data

No score can override hard policy rejection.
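
The ordering rule (hard checks first, scoring second) can be sketched as a small gate in front of the scorer. Everything named here is hypothetical: RouteContext and its boolean fields stand in for a subset of the checks listed above, and decide is an illustrative function, not a C15 API.

```python
from dataclasses import dataclass
from typing import Callable, NamedTuple, Optional, Sequence


@dataclass(frozen=True)
class RouteContext:
    source_trusted: bool
    source_in_cluster: bool
    destination_in_scope: bool
    organization_matches: bool
    peer_cert_valid: bool
    channel_authorized: bool
    authority_allows_new_routes: bool


# (rejection reason, predicate that must hold); evaluated in order.
HARD_CHECKS: list[tuple[str, Callable[[RouteContext], bool]]] = [
    ("source_not_trusted", lambda c: c.source_trusted),
    ("source_not_cluster_member", lambda c: c.source_in_cluster),
    ("destination_outside_cluster_scope", lambda c: c.destination_in_scope),
    ("organization_scope_mismatch", lambda c: c.organization_matches),
    ("peer_certificate_invalid_or_revoked", lambda c: c.peer_cert_valid),
    ("channel_not_authorized", lambda c: c.channel_authorized),
    ("authority_state_forbids_new_route", lambda c: c.authority_allows_new_routes),
]


class Decision(NamedTuple):
    selected: Optional[str]  # chosen candidate, or None when unavailable
    reason: str


def first_rejection(ctx: RouteContext) -> Optional[str]:
    """Return the first failed hard check, or None when all pass."""
    for reason, holds in HARD_CHECKS:
        if not holds(ctx):
            return reason
    return None


def decide(ctx: RouteContext, candidates: Sequence[str],
           score: Callable[[str], float]) -> Decision:
    """Hard checks run first; the scorer is consulted only if they all pass."""
    reason = first_rejection(ctx)
    if reason is not None:
        return Decision(None, reason)
    if not candidates:
        return Decision(None, "no_valid_route")
    return Decision(max(candidates, key=score), "highest_score")
```

Because decide returns before the scorer is ever called, a candidate with an arbitrarily high score cannot override a rejection.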

10. Scoring Inputs

Soft scoring inputs:

latency
jitter
packet loss
reliability
recent failure history
region distance
load
available bandwidth
role suitability
route length
service co-location
stickiness preference
cost preference
policy preference
health score

Scoring weights are policy-driven and may differ by channel class.

Examples:

input/control heavily weight latency and jitter
file transfer heavily weights throughput and reliability
VPN bulk considers QoS impact on interactive routes
storage fetch considers locality and replica freshness
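
To make the idea of channel-specific weights concrete, here is a minimal weighted-sum sketch. The weight values, the metric names, and the convention that every metric is pre-normalized to the range 0 to 1 with higher meaning better are all assumptions for illustration; C15 does not fix a scoring formula.

```python
# Hypothetical per-channel weights; in practice these would come from policy.
WEIGHTS: dict[str, dict[str, float]] = {
    "input": {"latency": 0.5, "jitter": 0.4, "reliability": 0.1},
    "file_transfer": {"throughput": 0.6, "reliability": 0.3, "latency": 0.1},
}


def score_path(channel_class: str, metrics: dict[str, float]) -> float:
    """Weighted sum of normalized metrics (0..1, higher is better)."""
    weights = WEIGHTS[channel_class]
    return sum(w * metrics.get(name, 0.0) for name, w in weights.items())


def best_path(channel_class: str,
              candidates: dict[str, dict[str, float]]) -> str:
    """Name of the candidate with the highest score for this channel class."""
    return max(candidates,
               key=lambda name: score_path(channel_class, candidates[name]))


# Two hypothetical candidates: one quick but narrow, one wide but slower.
candidates = {
    "short_relay": {"latency": 0.9, "jitter": 0.9,
                    "throughput": 0.2, "reliability": 0.8},
    "fat_pipe": {"latency": 0.3, "jitter": 0.4,
                 "throughput": 0.95, "reliability": 0.9},
}
print(best_path("input", candidates))          # short_relay
print(best_path("file_transfer", candidates))  # fat_pipe
```

The same two candidates rank differently per channel class, which is the behavior the examples above describe. Scoring runs only on candidates that have already passed the hard policy checks.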

11. Route Cache Relationship

Route cache is local and bounded.

Cache key inputs:

cluster id
organization id
source node
destination kind/ref
service type
channel class
policy version
route epoch
stickiness key

Cache entries contain:

route result
expiry
score
last success/failure
backoff state
fallback candidates

Cache invalidation triggers:

policy version change
peer directory version change
trust/revocation update
route epoch change
health state change
repeated route failure
expiry

Route cache is a performance aid, not route authority.
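
A bounded local cache along these lines might look like the sketch below. It is an assumption-heavy illustration: CacheKey and RouteCache are hypothetical names, least-recently-used eviction is one possible way to stay bounded, time is represented as float seconds, and the entry is reduced to a route id plus expiry.

```python
from collections import OrderedDict
from typing import Callable, NamedTuple, Optional


class CacheKey(NamedTuple):
    cluster_id: str
    organization_id: Optional[str]
    source_node: str
    destination_kind: str
    destination_ref: str
    service_type: str
    channel_class: str
    policy_version: int
    route_epoch: int
    stickiness_key: Optional[str] = None


class RouteCache:
    """Local, bounded, non-authoritative cache of route decisions."""

    def __init__(self, max_entries: int = 1024) -> None:
        self._max = max_entries
        self._entries: OrderedDict[CacheKey, tuple[str, float]] = OrderedDict()

    def put(self, key: CacheKey, route_id: str, expires_at: float) -> None:
        self._entries[key] = (route_id, expires_at)
        self._entries.move_to_end(key)
        while len(self._entries) > self._max:
            self._entries.popitem(last=False)  # evict least recently used

    def get(self, key: CacheKey, now: float) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        route_id, expires_at = entry
        if now >= expires_at:
            del self._entries[key]  # expiry is an invalidation trigger
            return None
        self._entries.move_to_end(key)
        return route_id

    def invalidate(self, predicate: Callable[[CacheKey], bool]) -> int:
        """Drop every entry whose key matches; returns how many were dropped."""
        stale = [k for k in self._entries if predicate(k)]
        for k in stale:
            del self._entries[k]
        return len(stale)

    def __len__(self) -> int:
        return len(self._entries)
```

Because policy version and route epoch are part of the key, a change to either one produces a natural cache miss. Triggers that are not part of the key, such as trust or health updates, would go through invalidate.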

12. Failover Boundaries

Failover decisions may:

switch from failed active path to fallback path
promote warm peer path
retry through bootstrap route for recovery
mark route unavailable
request control-plane/config refresh when reachable
keep degraded existing path alive if policy permits

Failover decisions must not:

create new cluster authority
bypass policy
add nodes
approve role changes
cross cluster boundaries without explicit trust
expose topology to organizations
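
The failover boundary can be illustrated with a selector that only ever chooses among paths it was already given. This is a hypothetical sketch: fail_over, its parameters, and the string outcomes are assumptions, and it covers only the "switch to fallback" and "mark route unavailable" cases.

```python
from typing import Optional, Sequence

Path = tuple[str, ...]  # ordered logical hops, identified by node id


def fail_over(fallback_paths: Sequence[Path],
              authorized_nodes: frozenset[str],
              failed_nodes: frozenset[str]) -> tuple[str, Optional[Path]]:
    """Pick the first fallback path that is fully authorized and avoids failed nodes.

    The function never adds nodes or widens authorization: it selects from
    the fallback paths supplied with the route result, or reports unavailable.
    """
    for path in fallback_paths:
        all_authorized = all(hop in authorized_nodes for hop in path)
        touches_failed = any(hop in failed_nodes for hop in path)
        if all_authorized and not touches_failed:
            return ("fallback", path)
    return ("unavailable", None)
```

For example, with fallbacks `[("n1", "r2", "n9"), ("n1", "r3", "n9")]`, a failed `r2`, and all of `n1`, `r3`, `n9` authorized, the second path is selected. If `r3` were not authorized, the outcome would be unavailable instead of a policy bypass.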

13. Shortcut Decision Boundary

Shortcut connections are optional optimization recommendations.

A shortcut may be recommended when:

long-lived flow exists
current path latency/jitter is high
direct connectivity appears possible
trust validation succeeds
policy allows shortcut
shortcut improves latency, jitter, or bandwidth
fallback path remains available

Shortcut recommendation output:

source node
destination node
channel classes affected
expected improvement
required validation
expiry
fallback route id

C15 does not implement shortcut connections. It only defines when a future Routing Engine may recommend them.
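
As a sketch of how the recommendation conditions might combine, the predicate below treats the listed conditions as jointly required, which is one reading of the list. FlowObservation, its fields, and the threshold defaults are hypothetical values chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FlowObservation:
    flow_age_s: float
    current_latency_ms: float
    # Estimated latency over a direct link; None when direct
    # connectivity does not appear possible.
    direct_latency_estimate_ms: Optional[float]
    trust_validated: bool
    policy_allows_shortcut: bool
    fallback_route_id: Optional[str]


def should_recommend_shortcut(obs: FlowObservation,
                              min_flow_age_s: float = 60.0,
                              high_latency_ms: float = 80.0) -> bool:
    """True only when every condition for a shortcut recommendation holds."""
    if obs.flow_age_s < min_flow_age_s:
        return False  # not a long-lived flow
    if obs.current_latency_ms < high_latency_ms:
        return False  # current path is already good enough
    if obs.direct_latency_estimate_ms is None:
        return False  # direct connectivity does not appear possible
    if not (obs.trust_validated and obs.policy_allows_shortcut):
        return False  # trust and policy are mandatory
    if obs.fallback_route_id is None:
        return False  # a fallback path must remain available
    # The shortcut must actually improve on the current path.
    return obs.direct_latency_estimate_ms < obs.current_latency_ms
```

The sketch compares latency only; jitter and bandwidth improvement would need their own observations.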

14. Service Adapter Integration

Service Adapters may ask for routes using service-neutral metadata.

Examples:

RDP Adapter requests route to RDP service/egress node or resource target.
VNC Adapter requests route to VNC target zone.
SSH Adapter requests route to SSH target.
VPN/IP tunnel service requests route through vpn_connection.
Storage fetch requests route to config/storage scope.

Service Adapters must not:

enumerate peers
select mesh paths
create relay chains
create shortcuts
implement failover policy
implement partition recovery
implement cross-cluster routing trust

The adapter consumes a route result and sends/receives through the approved data-plane boundary when runtime exists.

15. Topology Hiding

Organizations see:

allowed service endpoints
safe ingress/egress status
safe session/resource status
policy-visible route dependency names where allowed

Organizations must not see:

intermediate core mesh nodes
full peer directory
route cache
shortcut candidates
other organizations' route data
storage shard placement

Platform owners may inspect routing internals according to audited platform policy.
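
Topology hiding is easiest to reason about as an allowlist: a tenant view is built from fields known to be safe, and everything else is left out by default. The sketch below is hypothetical. The dictionary layout of the route, the field names in TENANT_VISIBLE_FIELDS, and the assumption that the last hop of selected_path is the target or service endpoint are illustrative choices.

```python
# Fields assumed safe to show to an organization (hypothetical allowlist).
TENANT_VISIBLE_FIELDS = ("request_id", "channel_class", "expires_at")


def tenant_view(route: dict) -> dict:
    """Build an organization-facing view that omits routing internals."""
    view = {name: route[name] for name in TENANT_VISIBLE_FIELDS if name in route}
    # Only the final hop (the target/service endpoint) is exposed;
    # intermediate ingress, core, and relay hops are never copied.
    view["service_endpoint"] = route["selected_path"][-1]["ref"]
    return view


internal_route = {
    "request_id": "req-1",
    "channel_class": "input",
    "expires_at": "2026-01-01T00:00:30Z",
    "selected_path": [
        {"role": "ingress", "ref": "ingress-2"},
        {"role": "core", "ref": "core-7"},
        {"role": "service_endpoint", "ref": "rdp-desktop-pool"},
    ],
    "shortcut_candidate": {"source": "ingress-2", "destination": "egress-4"},
    "fallback_paths": [["ingress-2", "core-9", "rdp-desktop-pool"]],
}
print(tenant_view(internal_route))
```

An allowlist fails closed: a field added to the internal route later stays hidden from tenants until someone explicitly marks it visible.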

16. Degraded and Partition Behavior

In degraded mode, Routing Engine may:

keep existing authorized routes alive until TTL
use last signed snapshot for recovery
select fallback among already-authorized peers
mark route unavailable when safety cannot be proven

In degraded mode, Routing Engine must not:

authorize new high-risk routes
mutate cluster trust
approve nodes
assign roles
promote partition authority automatically
create cross-cluster trust
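
The two lists above can be expressed as a default-deny gate. The action names below are hypothetical labels for the listed behaviors, and permitted_in_degraded_mode is an illustrative helper, not a defined interface.

```python
# Actions the Routing Engine may take in degraded mode.
DEGRADED_ALLOWED = frozenset({
    "keep_existing_route_until_ttl",
    "recover_from_last_signed_snapshot",
    "select_fallback_among_authorized_peers",
    "mark_route_unavailable",
})

# Actions explicitly forbidden in degraded mode (kept for audit messages).
DEGRADED_FORBIDDEN = frozenset({
    "authorize_new_high_risk_route",
    "mutate_cluster_trust",
    "approve_node",
    "assign_role",
    "promote_partition_authority",
    "create_cross_cluster_trust",
})


def permitted_in_degraded_mode(action: str) -> bool:
    """Default deny: only explicitly allowed actions pass, unknown ones do not."""
    return action in DEGRADED_ALLOWED
```

Default deny means an action missing from both sets is refused, which fits the rule that a route is marked unavailable when safety cannot be proven.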

17. Observability

Routing decisions should emit safe telemetry:

route selected
route rejected
rejection reason
route class
channel class
score bucket
latency/jitter/packet loss summary
failover count
fallback used
shortcut recommended
policy version
peer directory version
route epoch

Tenant-visible telemetry must hide topology.
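
A safe telemetry event could carry the fields listed above while reporting a coarse score bucket instead of the raw score. This is a sketch under assumptions: the bucket thresholds, the event layout, and the function names are hypothetical.

```python
def score_bucket(score: float) -> str:
    """Coarse bucket so that exact scores are not exported (thresholds assumed)."""
    if score >= 0.75:
        return "high"
    if score >= 0.4:
        return "medium"
    return "low"


def route_selected_event(route_class: str, channel_class: str, score: float,
                         policy_version: int, peer_directory_version: int,
                         route_epoch: int, fallback_used: bool) -> dict:
    """Build a route-selected event that contains no node ids or hop lists."""
    return {
        "event": "route_selected",
        "route_class": route_class,
        "channel_class": channel_class,
        "score_bucket": score_bucket(score),
        "fallback_used": fallback_used,
        "policy_version": policy_version,
        "peer_directory_version": peer_directory_version,
        "route_epoch": route_epoch,
    }
```

The event takes no path argument at all, so hop identities cannot leak through it by accident.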

18. Future Validation Tests

Future implementation tests must prove:

route request rejects wrong cluster
route request rejects wrong organization
revoked peer is not selected
unavailable route returns explicit result
cache invalidates on policy version change
cache invalidates on peer directory version change
input route prefers latency over throughput
file transfer route does not starve input class
service adapter cannot bypass routing engine
shortcut recommendation requires fallback path
degraded mode does not authorize new forbidden routes

19. C16 Preparation

C16 must define the secure node-to-node channel lifecycle that can later carry route-selected traffic.

C16 must preserve:

routing results are bounded and policy-scoped
channels are authenticated and authorized
trust/revocation affects active channels
Service Adapters remain above Fabric routing
no mesh packet routing starts before an explicit C17 stage

20. Result / Decision

Stage C15 defines Fabric Routing Engine as a skeleton boundary for route requests, route results, scoring, cache relationship, failover, shortcut recommendations, topology hiding, and Service Adapter integration.

Decisions:

Routing belongs to Fabric, not Service Adapters.
Route requests/results are logical contracts, not packet forwarding.
Hard policy checks precede scoring.
Route cache is local, bounded, and non-authoritative.
Routing is channel-aware and QoS-aware.
Shortcut connections are future optional recommendations, not C15 runtime.
C16 must define secure node-to-node channels before mesh routing runtime.

No code, migration, API, runtime, RDP, data-plane, mesh, VPN, relay, or service workload behavior is changed by C15.
