Define distributed fabric node protocol plan
# Distributed Fabric Node Protocol Plan

This document sets the target direction for the Secure Access Fabric after the
VPN performance investigation. The platform must not be treated as a VPN
server, RDP gateway, or web console. It is a distributed overlay transport where
every participating device is a fabric node, and VPN/RDP/HTTP/admin/storage are
services running over that fabric.

## Core Position

Every device is a node.

A phone, home server, cloud server, relay, admin-console host, storage host, and
update-cache host share the same base identity model. They differ by roles,
capabilities, policy, trust level, and current health.

```text
Node = identity + roles + capabilities + policy + health + local state
```

The Android VPN app is therefore not only a client. It is a mobile fabric node.
It may carry VPN traffic, participate in route discovery, relay traffic when
policy allows, host limited control/storage roles when approved, and report
mobile-specific capacity signals such as battery, network type, NAT behavior,
foreground/background state, and metered network policy.

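
The node formula above could map onto a Go type roughly as follows. This is a minimal sketch, not the project's actual data model: the names `Node`, `Role`, `Health`, and `HasRole`, and every field, are assumptions made for illustration.

```go
package main

// Role is one entry from the fabric role vocabulary, for example "relay".
type Role string

// Health is a hypothetical summary of a node's current health.
type Health struct {
	Reachable  bool
	QueueDepth int
}

// Node mirrors: identity + roles + capabilities + policy + health + local state.
type Node struct {
	ID            string            // cryptographic identity, e.g. a public key fingerprint
	Roles         []Role            // roles granted by policy, not by binary build
	Capabilities  map[string]string // advertised capability facts
	PolicyVersion uint64            // version of the policy the node currently applies
	Health        Health            // current health snapshot
	LocalState    map[string][]byte // opaque node-local state
}

// HasRole reports whether policy has granted role r to this node.
func (n Node) HasRole(r Role) bool {
	for _, have := range n.Roles {
		if have == r {
			return true
		}
	}
	return false
}
```

Under this shape a phone and a cloud server are the same type; only the contents of `Roles`, `Capabilities`, and `Health` differ.
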
## What Was Missing

The current implementation proves route leases and production VPN forwarding,
but it still has a data-plane shape that cannot scale to high throughput:

- too much payload traffic is carried as small request/response HTTP forwarding
  calls;
- JSON/base64 payload envelopes add overhead and CPU cost;
- one overloaded stream can delay unrelated traffic;
- route health is visible, but the transport does not yet provide enough
  low-latency per-stream feedback;
- the phone behaves mostly as a service client, not as a full fabric node;
- service discovery and route execution are not yet separated cleanly enough;
- fallback paths can keep traffic alive, but can also hide architecture
  bottlenecks if used as the primary data plane.

To sustain 100 Mbps per active device across fleets that grow from 1000+ devices
to millions, the fabric must move to a persistent, binary, multiplexed data
plane with explicit route and stream semantics.

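
The envelope overhead listed above can be estimated with a small calculation. The sketch below compares a hypothetical JSON envelope that carries a base64 payload with a hypothetical binary frame that uses a 16-byte header. The field names (`route_id`, `payload`) and the header size are assumptions for illustration, not the current wire format.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
)

// envelope is a hypothetical JSON wrapper for one forwarded packet.
type envelope struct {
	RouteID string `json:"route_id"`
	Payload string `json:"payload"` // base64 of the raw packet
}

// binaryHeaderSize is an assumed fixed header size for a binary frame.
const binaryHeaderSize = 16

// JSONEnvelopeSize returns the encoded size of a packet wrapped in the
// hypothetical JSON/base64 envelope (HTTP headers are not counted).
func JSONEnvelopeSize(routeID string, packet []byte) (int, error) {
	b, err := json.Marshal(envelope{
		RouteID: routeID,
		Payload: base64.StdEncoding.EncodeToString(packet),
	})
	if err != nil {
		return 0, err
	}
	return len(b), nil
}

// BinaryFrameSize returns the size of the same packet behind a fixed header.
func BinaryFrameSize(packet []byte) int {
	return binaryHeaderSize + len(packet)
}
```

For a 1200-byte packet, base64 alone produces 1600 bytes before JSON field names and HTTP request/response headers are added, while the assumed binary frame is 1216 bytes.
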
## Non-Negotiable Principles

1. Fabric is the lower transport layer. VPN, RDP, HTTP, admin console, storage,
   and update delivery are services above it.
2. Service adapters must not discover topology, own route selection, or invent
   failover logic. They request transport from the fabric.
3. Control plane and data plane are separate. API/console traffic must not be
   the packet transport mechanism.
4. Every data session carries many independent streams. A blocked bulk download
   must not stall RDP, DNS, control, or telemetry.
5. Routes are leased and replaceable. Route selection uses quality, policy,
   locality, role eligibility, cost, trust, and current load.
6. The fabric is distributed. Central control can coordinate, but the runtime
   must keep working through cached policy, peer directories, route leases, and
   local health when central components are degraded.
7. Mobile nodes are first-class nodes with stricter capability scoring.
8. HTTP forwarding remains a compatibility and emergency fallback, not the
   primary high-speed data plane.

## Node Roles

Initial role vocabulary:

- `mobile-edge`: mobile Android/iOS fabric node.
- `entry`: accepts external sessions.
- `relay`: forwards fabric traffic between nodes.
- `exit`: terminates routes into a target network or service zone.
- `service-host`: runs service adapters such as admin console, VPN exit, RDP,
  HTTP ingress, storage, or update-cache.
- `control-plane`: participates in control authority, policy decisions, route
  authority, or quorum work.
- `route-coordinator`: calculates or assists route candidates for a partition,
  region, or service class.
- `storage`: stores approved replicated fabric state.
- `observer`: collects telemetry and health without carrying user traffic.
- `update-cache`: mirrors signed artifacts close to nodes.

Roles are policy decisions, not binary builds. A phone can theoretically receive
any role, but scheduler scoring must account for battery, OS restrictions, NAT,
uplink stability, foreground state, and user cost policy.

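
To make the scoring idea concrete, the sketch below shows one way a scheduler might score a phone for the `relay` role. The `MobileSignals` type, the thresholds (50% battery), and the weights (0.5, 0.25) are all invented for illustration; the document does not define them.

```go
package main

// MobileSignals holds hypothetical mobile-specific inputs to role scoring.
type MobileSignals struct {
	BatteryPercent int  // 0..100
	Charging       bool // device is on external power
	Metered        bool // current network is metered
	Foreground     bool // app is in the foreground
	SymmetricNAT   bool // NAT type that makes inbound paths unreliable
}

// RelayScore returns 0 when the node must not relay, otherwise a value in
// (0, 1] where higher is better. Thresholds and weights are assumptions.
func RelayScore(s MobileSignals, policyAllowsRelay bool) float64 {
	if !policyAllowsRelay || s.Metered {
		return 0 // policy and user cost come first
	}
	if !s.Charging && s.BatteryPercent < 50 {
		return 0 // do not drain a phone that is already low
	}
	score := 1.0
	if !s.Charging {
		score *= float64(s.BatteryPercent) / 100
	}
	if !s.Foreground {
		score *= 0.5 // background work may be suspended by the OS
	}
	if s.SymmetricNAT {
		score *= 0.25 // reachable only through another relay
	}
	return score
}
```

The hard gates (policy, metered network, low battery) are checked before any weighting, so a high score on one signal cannot override a policy refusal.
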
## Capability Model

Nodes must advertise capability facts in heartbeats and peer updates:

- supported fabric protocol versions;
- supported transports: UDP/QUIC, TCP, WebSocket, HTTPS fallback;
- NAT type and reachability;
- measured RTT/loss/jitter/bandwidth to peers and entry candidates;
- CPU, memory, queue depth, file descriptor/socket pressure;
- battery state, charging state, mobile/Wi-Fi network type, metered policy;
- max relay bandwidth and allowed traffic classes;
- service roles and service capacity;
- trust tier and allowed tenant/organization scopes;
- local policy version, peer directory version, route cache version.

## Fabric Data Session V1

The first practical protocol step is a persistent binary data session. It may
initially run over WebSocket/TCP for faster delivery, but the framing must be
transport-neutral so the same protocol can move to QUIC/UDP.

Minimum frame set:

```text
HELLO          node identity, protocol version, capabilities
AUTH           signed session token or mTLS-bound proof
SESSION_READY  accepted limits, route epoch, peer epoch
OPEN_STREAM    stream id, service id, traffic class, route id
DATA           stream id, sequence, flags, payload
ACK            stream id, received sequence/window
PING/PONG      RTT and liveness
ROUTE_UPDATE   new route lease or alternate route set
STREAM_CREDIT  per-stream backpressure window
NODE_PRESSURE  queue/cpu/memory/network pressure signal
CLOSE_STREAM   normal stream close
RESET_STREAM   failed stream, other streams remain alive
GOAWAY         draining or protocol shutdown
```

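
A frame codec for this set could look like the sketch below. The header layout (1-byte type, 1-byte flags, 4-byte stream id, 4-byte payload length, big-endian), the numeric type values, and the 64 KiB payload limit are assumptions chosen for illustration; the document does not fix a wire layout. Per-frame fields such as the DATA sequence number are assumed to live inside the payload.

```go
package main

import (
	"encoding/binary"
	"errors"
)

// FrameType enumerates the minimum frame set; numeric values are assumed.
type FrameType uint8

const (
	FrameHello FrameType = iota + 1
	FrameAuth
	FrameSessionReady
	FrameOpenStream
	FrameData
	FrameAck
	FramePing
	FramePong
	FrameRouteUpdate
	FrameStreamCredit
	FrameNodePressure
	FrameCloseStream
	FrameResetStream
	FrameGoAway
)

const (
	headerSize = 10        // type(1) + flags(1) + stream id(4) + length(4)
	MaxPayload = 64 * 1024 // assumed per-frame payload limit
)

var (
	ErrTruncated   = errors.New("frame: truncated input")
	ErrTooLarge    = errors.New("frame: payload exceeds limit")
	ErrUnknownType = errors.New("frame: unknown frame type")
)

// Frame is one decoded frame.
type Frame struct {
	Type     FrameType
	Flags    uint8
	StreamID uint32
	Payload  []byte
}

func validType(t FrameType) bool { return t >= FrameHello && t <= FrameGoAway }

// Encode serializes a frame, rejecting unknown types and oversized payloads.
func Encode(f Frame) ([]byte, error) {
	if !validType(f.Type) {
		return nil, ErrUnknownType
	}
	if len(f.Payload) > MaxPayload {
		return nil, ErrTooLarge
	}
	buf := make([]byte, headerSize+len(f.Payload))
	buf[0] = byte(f.Type)
	buf[1] = f.Flags
	binary.BigEndian.PutUint32(buf[2:6], f.StreamID)
	binary.BigEndian.PutUint32(buf[6:10], uint32(len(f.Payload)))
	copy(buf[headerSize:], f.Payload)
	return buf, nil
}

// Decode parses one frame from b and returns it with the bytes consumed.
// The declared length is validated before any payload is copied.
func Decode(b []byte) (Frame, int, error) {
	if len(b) < headerSize {
		return Frame{}, 0, ErrTruncated
	}
	t := FrameType(b[0])
	if !validType(t) {
		return Frame{}, 0, ErrUnknownType
	}
	n := binary.BigEndian.Uint32(b[6:10])
	if n > MaxPayload {
		return Frame{}, 0, ErrTooLarge
	}
	end := headerSize + int(n)
	if len(b) < end {
		return Frame{}, 0, ErrTruncated
	}
	payload := append([]byte(nil), b[headerSize:end]...)
	return Frame{Type: t, Flags: b[1], StreamID: binary.BigEndian.Uint32(b[2:6]), Payload: payload}, end, nil
}
```

Because the codec only works on byte slices, the same functions can sit behind a WebSocket message, a TCP stream reader, or a QUIC stream without change.
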
Traffic classes:

- `control`: authorization, route updates, attach/detach, liveness.
- `dns`: small, latency-sensitive name resolution.
- `interactive`: RDP input, SSH interactive, UI control.
- `reliable`: normal web/API traffic.
- `bulk`: downloads, uploads, sync, large media.
- `droppable`: telemetry samples, optional probes, low-value background data.

Each stream has independent flow control and backpressure. Bulk can be slowed or
moved to another route without blocking control or interactive streams.

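
The independence property can be illustrated with a per-stream credit window and a class-aware send picker. This is a simplified in-memory sketch: the `Stream` type, the strict priority order of classes, and the byte-based credit accounting are assumptions, not the defined behavior of `STREAM_CREDIT`.

```go
package main

// Class orders traffic classes; a lower value is sent first (assumed order).
type Class int

const (
	Control Class = iota
	DNS
	Interactive
	Reliable
	Bulk
	Droppable
)

// Stream holds a send queue and the credit (in bytes) granted by the peer.
type Stream struct {
	ID     uint32
	Class  Class
	Credit int
	Queue  [][]byte
}

// Next returns the highest-priority stream whose next chunk fits within its
// own credit. A stream with no credit is skipped and never blocks the others.
func Next(streams []*Stream) *Stream {
	var best *Stream
	for _, s := range streams {
		if len(s.Queue) == 0 || len(s.Queue[0]) > s.Credit {
			continue
		}
		if best == nil || s.Class < best.Class {
			best = s
		}
	}
	return best
}

// Send removes the next chunk from a stream returned by Next and consumes
// credit for it.
func Send(s *Stream) []byte {
	chunk := s.Queue[0]
	s.Queue = s.Queue[1:]
	s.Credit -= len(chunk)
	return chunk
}

// Grant adds credit, as a received STREAM_CREDIT frame would.
func Grant(s *Stream, n int) { s.Credit += n }
```

With a bulk stream at zero credit and an interactive stream holding credit, `Next` returns the interactive stream; the bulk stream resumes only after `Grant` is called for it.
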
## Route Model

The fabric must maintain multiple candidate routes for an active session:

```text
phone-a -> entry-1 -> home-1
phone-a -> phone-b -> relay-2 -> home-1
phone-a -> entry-2 -> relay-4 -> service-host-7
```

Route scoring inputs:

- policy and role eligibility;
- route length and failure domains;
- RTT, jitter, packet loss, bandwidth estimate;
- queue depth and retransmit pressure;
- current node CPU/memory/socket pressure;
- mobile battery/charging/metered status;
- historical reliability;
- service locality;
- tenant/organization isolation;
- cost and operator preference.

Routes are issued as short leases with route id, epoch, allowed channels,
allowed service classes, hop list or next-hop policy, expiry, and fencing rules.

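
A lease check combined with a simple cost function might look like the sketch below. The `Lease` and `Candidate` types, the fencing rule (a lease is rejected when its epoch is at or below the fenced epoch), and the cost weights are assumptions for illustration. Only a few of the scoring inputs listed above are modeled.

```go
package main

// Lease is a hypothetical short-lived route grant.
type Lease struct {
	RouteID        string
	Epoch          uint64
	Hops           []string
	AllowedClasses map[string]bool
	ExpiryUnix     int64 // seconds since the Unix epoch
}

// Usable reports whether the lease may carry a stream of the given class.
// Leases at or below fencedEpoch are treated as revoked (assumed rule).
func (l Lease) Usable(now int64, class string, fencedEpoch uint64) bool {
	return now < l.ExpiryUnix && l.Epoch > fencedEpoch && l.AllowedClasses[class]
}

// Candidate pairs a lease with measured quality.
type Candidate struct {
	Lease       Lease
	RTTMillis   float64
	LossPercent float64
	Load        float64 // 0..1, pressure on the busiest hop
}

// Cost combines quality inputs; lower is better. Weights are illustrative.
func Cost(c Candidate) float64 {
	return c.RTTMillis + 50*c.LossPercent + 100*c.Load + 5*float64(len(c.Lease.Hops))
}

// Pick returns the cheapest candidate whose lease is usable right now.
func Pick(cands []Candidate, now int64, class string, fencedEpoch uint64) (Candidate, bool) {
	var best Candidate
	found := false
	for _, c := range cands {
		if !c.Lease.Usable(now, class, fencedEpoch) {
			continue
		}
		if !found || Cost(c) < Cost(best) {
			best, found = c, true
		}
	}
	return best, found
}
```

Eligibility is checked before cost, so an expired or fenced route is never chosen even when its measured quality is the best.
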
## Service Discovery

Services are logical names, not fixed hosts:

```text
service: admin-console
  replicas: home-1, node-2, node-9
  policy: active-active or leader/follower
  ingress: vpn.cin.su / admin.cin.su / internal name
```

`vpn.cin.su` as an HTTP/HTTPS entry is a service endpoint. It can be hosted on
any eligible service-host node. If one replica fails, another replica can accept
the service lease and traffic can be routed to it.

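
Resolution of a logical service name to a live replica could be sketched as follows. The `Registry` and `Replica` types and the "first healthy replica with an unexpired service lease" rule are assumptions; the document does not specify the selection policy.

```go
package main

// Replica is a hypothetical record of one service instance.
type Replica struct {
	Node            string
	Healthy         bool
	LeaseExpiryUnix int64 // service lease expiry, seconds since the Unix epoch
}

// Registry maps a logical service name to its replicas in preference order.
type Registry map[string][]Replica

// Resolve returns the first replica that is healthy and holds a valid lease.
func (r Registry) Resolve(service string, now int64) (string, bool) {
	for _, rep := range r[service] {
		if rep.Healthy && now < rep.LeaseExpiryUnix {
			return rep.Node, true
		}
	}
	return "", false
}
```

If `home-1` becomes unhealthy, the same call for `admin-console` returns the next eligible replica, so callers keep using the service name and never a fixed host.
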
## Scale Model

For 1000 devices, the platform needs entry pools, exit pools, route leases,
session placement, and overload protection.

For millions of devices, the platform additionally needs regional route
coordinators, distributed peer directories, local control partitions, telemetry
sampling, policy sharding, and resource accounting.

Every device joining the system increases potential edge capacity, but only if
the scheduler can safely decide when that node is allowed to relay, store, serve,
or only consume.

## Security And Abuse Controls

The distributed model increases both capability and risk. The following controls
are required before mobile relay/control/storage roles are broadly enabled:

- node identity is cryptographic; IP address is never identity;
- all route leases are signed or locally verifiable;
- roles are scoped by organization, tenant, service, and time;
- mobile relay is opt-in by policy and user/device state;
- storage uses encrypted shards and explicit retention policy;
- control-plane participation requires trust tier and quorum policy;
- nodes never receive more topology or secret data than their role requires;
- abuse controls rate-limit relay use, route churn, and failed authentication;
- traffic accounting records who relayed what class and how much, without
  exposing payload contents.

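
Signed route leases could be verified locally along the lines of the sketch below. Ed25519 is used here as an assumed signature scheme, and the `SignedLease` fields and byte layout are hypothetical; the document only requires that leases be signed or locally verifiable.

```go
package main

import (
	"crypto/ed25519"
	"encoding/binary"
)

// SignedLease is a hypothetical lease with a detached signature.
type SignedLease struct {
	RouteID    string
	Epoch      uint64
	ExpiryUnix int64
	Sig        []byte
}

// leaseBytes builds the signed message. Fixed-width fields come first so the
// variable-length route id cannot be confused with them.
func leaseBytes(l SignedLease) []byte {
	buf := make([]byte, 16, 16+len(l.RouteID))
	binary.BigEndian.PutUint64(buf[0:8], l.Epoch)
	binary.BigEndian.PutUint64(buf[8:16], uint64(l.ExpiryUnix))
	return append(buf, l.RouteID...)
}

// NewKeyPair creates a route-authority key pair; passing nil makes
// GenerateKey read from crypto/rand.
func NewKeyPair() (ed25519.PublicKey, ed25519.PrivateKey, error) {
	return ed25519.GenerateKey(nil)
}

// Sign returns a copy of the lease carrying the authority's signature.
func Sign(priv ed25519.PrivateKey, l SignedLease) SignedLease {
	l.Sig = ed25519.Sign(priv, leaseBytes(l))
	return l
}

// Verify checks expiry and signature using only local data.
func Verify(pub ed25519.PublicKey, l SignedLease, now int64) bool {
	return now < l.ExpiryUnix && ed25519.Verify(pub, leaseBytes(l), l.Sig)
}
```

A node that holds only the authority's public key can reject a tampered or expired lease without contacting the control plane, which fits the requirement that the runtime keep working when central components are degraded.
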
## Observability

The current tests show why an aggregate "VPN works" signal is not enough. The
fabric needs per-node, per-route, and per-stream metrics:

- throughput by direction and traffic class;
- RTT, jitter, loss, retransmits, queue depth;
- frame encode/decode errors;
- stream resets and close reasons;
- route switch reason and time to recovery;
- node pressure and scheduler decisions;
- service discovery failover events;
- Android foreground/background and network transition events.

## Work Plan

### Stage FNP-0: Architecture Lock

Status: this document.

Deliverables:

- establish "every device is a node" as the model;
- separate fabric, services, control, and data plane;
- define missing protocol, route, scale, security, and observability pieces.

### Stage FNP-1: Binary Frame Contract

Deliverables:

- add a transport-neutral Go package for Fabric Data Session V1 frame types;
- encode/decode binary frames with size limits and validation;
- add tests for malformed frames, max frame size, stream ids, and frame type
  compatibility;
- do not connect it to production traffic yet.

### Stage FNP-2: Persistent Session Runtime Skeleton

Deliverables:

- implement in-memory session runtime with streams, sequence numbers, ACK,
  stream credit, reset, and close;
- prove that a blocked bulk stream does not block control/interactive streams;
- expose per-stream metrics.

### Stage FNP-3: WebSocket/TCP Compatibility Transport

Deliverables:

- carry binary frames over one persistent WebSocket/TCP connection;
- replace high-frequency `/mesh/v1/forward` packet POST usage for VPN routes in
  a gated mode;
- keep HTTP forwarding as fallback.

### Stage FNP-4: Android As Mobile Fabric Node

Deliverables:

- Android advertises node capabilities, network state, battery state, and
  supported transports;
- Android opens Fabric Data Session V1 to entry;
- VPN packets map to independent streams/classes;
- diagnostics can run per-stream and per-route tests.

### Stage FNP-5: Route Leases And Multipath

Deliverables:

- route result includes primary and alternate routes;
- runtime can switch new streams to a better route;
- interactive streams can recover quickly after route fencing;
- route health uses dataplane metrics, not only HTTP request success.

### Stage FNP-6: QUIC/UDP Transport

Deliverables:

- implement QUIC transport for Fabric Data Session V1;
- preserve WebSocket/TCP as fallback;
- test 4G/Wi-Fi transition and NAT behavior;
- benchmark throughput, latency, and recovery against current HTTP forwarding.

### Stage FNP-7: Distributed Service Discovery

Deliverables:

- service names map to eligible service replicas;
- admin console and VPN service can move between service-host nodes;
- service failover is expressed as leases and route updates.

### Stage FNP-8: Mobile Relay And Distributed Capacity

Deliverables:

- mobile nodes can opt into relay under strict policy;
- scheduler scores battery, metered network, NAT, trust, and load;
- route planner can use mobile nodes where they are closer/faster;
- accounting and abuse controls are enforced.

### Stage FNP-9: Scale To Large Fleets

Deliverables:

- entry and route coordinator pools;
- peer directory sharding;
- telemetry sampling and aggregation;
- per-tenant quotas and fairness;
- load tests for 1000 simulated devices, then larger synthetic fleets.

## Immediate Next Action

Start Stage FNP-1 in `rap-node-agent` as a non-production protocol package. The
goal is to create the binary frame contract and tests without disturbing the
current VPN path. After that, wire it into a gated persistent session runtime and
only then move Android/VPN traffic onto it.