Working version, but speed is 10 Mbit/s
build / backend (push) Has been cancelled
build / node-agent (push) Has been cancelled
build / worker (push) Has been cancelled

2026-05-22 21:46:49 +03:00
parent 469fa0e860
commit 20d361a886
280 changed files with 954890 additions and 18524 deletions
@@ -344,7 +344,7 @@ The first backend contract slice is implemented:
- Fenced routes are not returned as primary or alternate route candidates in a
service-channel lease. If every route for the selected entry/exit pair is
fenced by service-channel feedback, the lease enters explicit degraded
backend fallback with reason
compat fallback with reason
`fabric_routes_fenced_by_service_channel_feedback`.
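The fenced-route rule above can be sketched as a small candidate filter. This is a minimal illustration, not the backend's implementation: the `Route` and `Lease` types, the `build_lease` function, and the `"active"` status string are hypothetical names. Only `degraded_fallback` and the reason string come from the text.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Reason string quoted from the document.
FENCED_REASON = "fabric_routes_fenced_by_service_channel_feedback"


@dataclass
class Route:
    # Hypothetical route record; `fenced` stands in for service-channel feedback.
    route_id: str
    fenced: bool = False


@dataclass
class Lease:
    # Hypothetical lease shape; "active" is an assumed name for the healthy state.
    status: str
    primary: Optional[Route] = None
    alternates: List[Route] = field(default_factory=list)
    reason: Optional[str] = None


def build_lease(routes: List[Route]) -> Lease:
    """Pick primary/alternate candidates for one entry/exit pair."""
    if not routes:
        # The "no routes at all" case is outside the rule sketched here.
        raise ValueError("no routes for the selected entry/exit pair")
    # Fenced routes are never offered as primary or alternate candidates.
    usable = [r for r in routes if not r.fenced]
    if not usable:
        # Every route is fenced: explicit degraded compat fallback.
        return Lease(status="degraded_fallback", reason=FENCED_REASON)
    return Lease(status="active", primary=usable[0], alternates=usable[1:])
```

In this sketch the fallback is always an explicit lease state with a reason attached, so a consumer never has to infer it from an empty candidate list.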
- A live smoke on 2026-05-07 created two short-lived `test-1 -> test-2`
`vpn_packets` route intents, injected fresh service-channel flow feedback
@@ -507,18 +507,18 @@ The first backend contract slice is implemented:
post-restart exit inbox depth from `0` to `88` with zero inbox drops.
- C18Z3 adds the matching entry-side resilience and degraded-fallback contract.
Node-agent `0.2.183` validates the signed service-channel lease authority and
forces backend fallback when Control Plane has signed
forces compat fallback when Control Plane has signed
`status=degraded_fallback` or `primary_route.status=missing_route_intent`.
This prevents a node from ignoring the lease decision and accidentally using
older generic route candidates for the same VPN resource. The rule applies to
both HTTP packet ingress and WebSocket packet ingress. The live smoke
`scripts/fabric/c18z3-live-service-channel-entry-ws-fallback-smoke.ps1`
proves HTTP warm delivery, WebSocket ingress parity, entry-node restart and
recovery while a lease exists, explicit backend fallback when no authorized
recovery while a lease exists, explicit compat fallback when no authorized
fabric route exists, and route-intent expiry. The passing artifact is
`artifacts/c18z3-live-service-channel-entry-ws-fallback-smoke-result.json`;
run `c18z3-20260507-211402` accepted warm `4/4`, WebSocket `8` packets,
recovery `4/4`, and moved the degraded backend fallback queue from `0` to
recovery `4/4`, and moved the degraded compat fallback queue from `0` to
`8`.
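The C18Z3 entry-side rule can be illustrated with a short decision function. This is a sketch under assumptions: the lease is modelled as a plain dict that has already passed signature validation, and `must_use_compat_fallback`, `select_path`, and the `"fabric"` / `"compat_fallback"` return values are hypothetical names, not node-agent identifiers. The two trigger conditions are the ones named in the text.

```python
def must_use_compat_fallback(lease: dict) -> bool:
    """Return True when the signed lease decision requires compat fallback."""
    # Condition 1: Control Plane signed status=degraded_fallback.
    if lease.get("status") == "degraded_fallback":
        return True
    # Condition 2: the primary route has no authorized route intent.
    primary = lease.get("primary_route") or {}
    return primary.get("status") == "missing_route_intent"


def select_path(lease: dict, transport: str) -> str:
    """Choose the delivery path for a packet arriving over HTTP or WebSocket."""
    if transport not in ("http", "websocket"):
        raise ValueError(f"unknown transport: {transport}")
    # The same rule applies to both ingress transports, so the node never
    # falls back to older generic route candidates for this resource.
    if must_use_compat_fallback(lease):
        return "compat_fallback"
    return "fabric"
```

Keeping the check in a single function shared by both transports is one way to get the HTTP/WebSocket parity that the smoke test verifies.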
- C18Z4 adds live long-session pressure coverage without another runtime
release. The script
@@ -529,7 +529,7 @@ The first backend contract slice is implemented:
alternate route. The passing artifact is
`artifacts/c18z4-live-service-channel-session-pressure-smoke-result.json`;
run `c18z4-20260507-212748` grew the exit inbox from `0` to `384`, kept
route failure delta `0`, flow drop delta `0`, and backend fallback queue
route failure delta `0`, flow drop delta `0`, and compat fallback queue
`0 -> 0`. This proves route-policy churn can be absorbed by the shared
fabric runtime while a service WebSocket remains active.
- C18Z5 adds live exit-node failure coverage while the same kind of service
@@ -540,7 +540,7 @@ The first backend contract slice is implemented:
the same signed WebSocket. The passing artifact is
`artifacts/c18z5-live-service-channel-exit-restart-smoke-result.json`; run
`c18z5-20260507-213745` sent 480 packets total, observed route failure delta
`48`, backend fallback queue `0 -> 192`, flow drop delta `0`, and recovery
`48`, compat fallback queue `0 -> 192`, flow drop delta `0`, and recovery
exit inbox `0 -> 192`. This proves exit failure is surfaced as explicit
degraded/fallback telemetry and fabric delivery resumes after runtime
recovery without requiring the service connection to be rebuilt.
@@ -554,7 +554,7 @@ The first backend contract slice is implemented:
`artifacts/c18z6-live-service-channel-active-rebuild-smoke-result.json`; run
`c18z6-20260507-214900` sent 384 packets, delivered all of them to the exit
inbox, selected the replacement route, kept route failure delta `0`, flow
drop delta `0`, and backend fallback queue `0 -> 0`. This proves route-manager
drop delta `0`, and compat fallback queue `0 -> 0`. This proves route-manager
replacement can be applied under an active service session without requiring
the service connection to be recreated.
- C18Z7 adds concurrent service-session isolation coverage. The script
@@ -565,7 +565,7 @@ The first backend contract slice is implemented:
`applied_rebuild`, then continues all sessions. The passing artifact is
`artifacts/c18z7-live-service-channel-concurrent-isolation-smoke-result.json`;
run `c18z7-20260507-215727` delivered 864 packets total, 288 packets per
session, with total backend fallback delta `0`, route failure delta `0`, and
session, with total compat fallback delta `0`, route failure delta `0`, and
flow drop delta `0`. This proves concurrent service sessions keep separate
resource queues and are not starved or poisoned by a shared route-manager
rebuild.
@@ -579,7 +579,7 @@ The first backend contract slice is implemented:
run `c18z8-20260507-221347` delivered 192 packets per interactive session,
hit flow scheduler high watermark `1024`, scheduled `1030` packets on the
hottest channel, dropped `282` packets on that overloaded channel, and kept
backend fallback delta `0` and route failure delta `0`. This proves bounded
compat fallback delta `0` and route failure delta `0`. This proves bounded
queue pressure is service-neutral, observable, and isolated to the overloaded
logical flow without starving other active sessions.
- C18Z9 adds route-pool replacement preference coverage. Node-agent `0.2.184`
@@ -593,7 +593,7 @@ The first backend contract slice is implemented:
node-agent `applied_rebuild`, and verifies the same service session continues
over the fast route. The passing artifact is
`artifacts/c18z9-live-service-channel-route-pool-smoke-result.json`; run
`c18z9-20260507-224901` kept backend fallback delta `0`, route failure delta
`c18z9-20260507-224901` kept compat fallback delta `0`, route failure delta
`0`, and flow drop delta `0`.
- C18Z10 adds service-channel exit-pool failover coverage. Backend/node-agent
`0.2.185` binds signed entry/exit pools into the service-channel lease
@@ -610,7 +610,7 @@ The first backend contract slice is implemented:
`applied_rebuild`, and verifies 288 packets land on the alternate exit. The
passing artifact is
`artifacts/c18z10-live-service-channel-exit-pool-smoke-result.json`; run
`c18z10-20260507-232645` kept backend fallback `0`, route failure delta `0`,
`c18z10-20260507-232645` kept compat fallback `0`, route failure delta `0`,
and flow drop delta `0`.
- C18Z11 adds service-channel entry-pool failover contract coverage. Backend
`rap-backend:fabric-service-channel-0.2.186` keeps
@@ -675,7 +675,7 @@ The first backend contract slice is implemented:
continues on the learned fast route. The passing artifact is
`artifacts/c18z14-live-service-channel-active-quality-shift-smoke-result.json`;
run `c18z14-20260508-071644` sent 60 batches / 480 packets, delivered all
packets to the exit, kept backend fallback `0`, flow drops `0`, and expired
packets to the exit, kept compat fallback `0`, flow drops `0`, and expired
temporary route intents.
- C18Z15 exposes and hardens effective route-quality preference telemetry.
Backend `rap-backend:fabric-service-channel-0.2.191` reports both raw
@@ -690,7 +690,7 @@ The first backend contract slice is implemented:
passing artifact is
`artifacts/c18z15-live-service-channel-effective-quality-smoke-result.json`;
run `c18z14-20260508-073538` sent 60 batches / 480 packets, delivered all
packets to the exit, kept backend fallback `0`, flow drops `0`, and exposed
packets to the exit, kept compat fallback `0`, flow drops `0`, and exposed
decayed effective scores in node telemetry.
- C18Z16 adds per-channel route-quality preference telemetry and fairness
guardrails. Node-agent `0.2.191` records the applied
@@ -704,7 +704,7 @@ The first backend contract slice is implemented:
`artifacts/c18z16-live-service-channel-quality-fairness-smoke-result.json`;
run `c18z14-20260508-074943` sent 60 batches / 480 packets, served 32
logical channels, applied quality preference telemetry to all 32 served
channels, kept backend fallback `0`, and flow drops `0`.
channels, kept compat fallback `0`, and flow drops `0`.
- C18Z17 clears stale per-channel route-quality markers. Node-agent `0.2.192`
removes channel-level quality preference diagnostics when the preference is no
longer present in the current effective preference set or when the preferred
@@ -712,10 +712,10 @@ The first backend contract slice is implemented:
`scripts/fabric/c18z17-live-service-channel-quality-cleanup-smoke.ps1`
verifies that active channel markers reference visible preferences, stale
markers are absent, expired route intents are not active, and the session
completes without backend fallback. The passing artifact is
completes without compat fallback. The passing artifact is
`artifacts/c18z17-live-service-channel-quality-cleanup-smoke-result.json`;
run `c18z14-20260508-075750` sent 60 batches / 480 packets, kept 32 active
quality markers, found `0` stale markers, kept backend fallback `0`, and
quality markers, found `0` stale markers, kept compat fallback `0`, and
flow drops `0`.
- C18Z18 scopes flow-scheduler channel memory by service session. Node-agent
`0.2.193` now keys runtime-sent logical channels as
@@ -728,11 +728,11 @@ The first backend contract slice is implemented:
`scripts/fabric/c18z18-service-channel-session-scoped-fairness-smoke.ps1`
wraps the live C18Z17 route-quality/fairness path, verifies served live
channel names are session-scoped and no unscoped served `flow-NN` channels
remain, and keeps backend fallback and flow drops at zero. The passing
remain, and keeps compat fallback and flow drops at zero. The passing
artifact is
`artifacts/c18z18-service-channel-session-scoped-fairness-smoke-result.json`;
run `c18z14-20260508-082520` served 32 session-scoped channels, applied
quality markers to all 32, kept backend fallback `0`, and flow drops `0`.
quality markers to all 32, kept compat fallback `0`, and flow drops `0`.
- C18Z19 adds the first bounded parallel send window for independent
service-channel logical flows. Node-agent `0.2.194` can send scheduled
logical channels concurrently with `MaxParallelFlowSends=4` in the live
@@ -769,7 +769,7 @@ The first backend contract slice is implemented:
run `c18z14-20260508-085635` delivered 480 packets, observed
`max_parallel_flow_sends=4`, `recommended_parallel_flow_sends=4`,
`scheduler_max_in_flight=4`, attempt/success/latency telemetry on all 32
served channels, backend fallback `0`, and flow drops `0`.
served channels, compat fallback `0`, and flow drops `0`.
- C18Z21 adds rolling per-channel/session quality windows. Node-agent `0.2.196`
keeps the lifetime counters for audit visibility, but adaptive send-window
pressure now comes from the bounded recent quality window, so old drops and
@@ -785,7 +785,7 @@ The first backend contract slice is implemented:
run `c18z14-20260508-091952` delivered 480 packets, observed
`scheduler_quality_window_sample_count=480`, rolling failures `0`, rolling
drops `0`, rolling samples/success/latency on all 32 served channels,
`recommended_parallel_flow_sends=4`, backend fallback `0`, and flow drops `0`.
`recommended_parallel_flow_sends=4`, compat fallback `0`, and flow drops `0`.
- C18Z22 connects the rolling window to backend durable route feedback. Backend
`rap-backend:fabric-service-channel-0.2.197` reads `quality_window_*` fields
from node-agent heartbeat metadata and uses fresh rolling failure/drop/slow
@@ -799,7 +799,7 @@ The first backend contract slice is implemented:
fields. The passing artifact is
`artifacts/c18z22-service-channel-rolling-feedback-smoke-result.json`; run
`c18z14-20260508-093100` delivered 480 packets, observed one persisted
healthy rolling feedback item with rolling payload, backend fallback `0`, and
healthy rolling feedback item with rolling payload, compat fallback `0`, and
flow drops `0`.
- C18Z23 adds route recovery hysteresis. Backend
`rap-backend:fabric-service-channel-0.2.198` re-admits routes that have
@@ -812,7 +812,7 @@ The first backend contract slice is implemented:
the live C18Z22 path and verifies backend `0.2.198`, rolling feedback, clean
forwarding, and the unit hysteresis contract. The passing artifact is
`artifacts/c18z23-service-channel-recovery-hysteresis-smoke-result.json`; run
`c18z14-20260508-094111` delivered 480 packets with backend fallback `0` and
`c18z14-20260508-094111` delivered 480 packets with compat fallback `0` and
flow drops `0`.
- C18Z24 exposes that recovery state to operators and API consumers. Backend
`rap-backend:fabric-service-channel-0.2.199` enriches route feedback API
@@ -925,7 +925,7 @@ The first backend contract slice is implemented:
C18X; route-intent lifecycle cleanup and synthetic-config expired-route
filtering landed in C18Y; bounded multi-channel load/rebuild/drop telemetry
coverage landed in C18Z; live signed service-channel ingress through the
running mesh listener landed in C18Z1; sustained live ingress with exit-node
running fabric listener landed in C18Z1; sustained live ingress with exit-node
restart/recovery coverage landed in C18Z2; signed degraded fallback
enforcement plus entry restart/WebSocket parity landed in C18Z3; long-lived
WebSocket pressure with mid-session route-policy churn landed in C18Z4; live
@@ -988,7 +988,7 @@ The first backend contract slice is implemented:
from explicit degraded compatibility requests; C18Z56 adds active-channel remediation
diagnostics (`none`, `rebuild_route`, `prefer_alternate_route`,
`hold_degraded_route_state`) to make the next runtime action explicit, and its
alternate-route branch is live-smoke-proven with backend fallback kept off.
alternate-route branch is live-smoke-proven with compat fallback kept off.
C18Z57 adds the bounded machine-readable `remediation_command` contract to
active access telemetry rows so route-manager can consume a short-lived
`prefer_alternate_route` command with primary/replacement route ids and TTL.
@@ -996,7 +996,7 @@ The first backend contract slice is implemented:
node-agent route-manager consumes them as explicit applied replacement
decisions sourced from `service_channel_remediation_command`. C18Z59 proves
post-remediation service-channel traffic actually selects the replacement
route in runtime/flow telemetry without local/backend fallback. C18Z60 proves
route in runtime/flow telemetry without local/compat fallback. C18Z60 proves
the same remediation path for multiple independent VPN flow channels in one
packet batch, with replacement-route flow stats, no flow drops, no route
failures, and no degraded fallback. C18Z61 proves the remediation replacement
@@ -1024,7 +1024,7 @@ The first backend contract slice is implemented:
0. C18Z68 turns this evidence into backend/admin flow-health diagnostics:
access telemetry now reports `flow_health_status` and `flow_health_reason` at
cluster, node, and active-channel levels using traffic-class pressure, queue
pressure, flow drops, backend fallback, route-quality failures/drops/slow
pressure, flow drops, compat fallback, route-quality failures/drops/slow
samples, and route send latency. C18Z69 adds node-side adaptive response:
runtime heartbeat flow-scheduler snapshots now include per-class
`recommended_parallel_windows` and adaptive backpressure reason, and the send
@@ -1039,7 +1039,7 @@ The first backend contract slice is implemented:
tune shared fabric backpressure without changing VPN/RDP-specific code.
C18Z72 adds an audited pool/failover policy contract for entry/exit pool
constraints, preferred entry/exit, selection strategy, failover modes,
backend fallback allowance, and sticky session mode. Lease issuance applies
compat fallback allowance, and sticky session mode. Lease issuance applies
that policy before route selection and signs the effective `pool_policy`
provenance into the service-channel lease authority payload. C18Z73 projects
that signed pool-policy fingerprint into active access telemetry and guards
@@ -1080,7 +1080,7 @@ The first backend contract slice is implemented:
existing rebuild command to a replacement route, the entry node reports a
route-manager decision for the same `rebuild_request_id`, the transition is
`applied_rebuild`, and live service-channel packet ingress selects the
replacement route with no local/backend fallback, route failures, or flow
replacement route with no local/compat fallback, route failures, or flow
drops. C18Z80 extends that into sustained post-rebuild pressure: five mixed
service-channel packet bursts remain on the replacement route, no stale
primary route is reselected, and fallback, route-failure, flow-drop, and