Working version, but throughput is 10 Mbit/s
@@ -306,12 +306,12 @@ Current implementation focus:
 activation manifests, stores installation authority and signed
 `platform_role_grants`, and strict platform-admin checks ignore direct
 PostgreSQL `users.platform_role` edits unless a valid grant exists. Web-admin
-shows installation status and first-owner bootstrap; dev/legacy SQL seed
+shows installation status and first-owner bootstrap; dev/compat SQL seed
 compatibility remains explicit and gated by
 `INSTALLATION_INSECURE_BOOTSTRAP_ENABLED`.
 - Cluster Authority foundation is implemented and backend/agent/web-build plus
 docker-test lifecycle-smoke verified. Clusters now have Ed25519 authority
-keys, join-token scope material is signed, node approval/bootstrap material
+keys, join-token scope material is signed, node approval/join material
 is signed, and Control Plane synthetic mesh config snapshots include a
 signed hash envelope with `authority_required=true`. Cluster authority
 private keys are encrypted at rest when `SECRET_ENCRYPTION_KEY_B64`/file is
@@ -321,15 +321,15 @@ Current implementation focus:
 join-token output, approval rows, and synthetic config visibility. The
 docker-test run `dev-bootstrap-20260428-201430` proved fresh dev cluster
 creation, signed join token, real node-agent enrollment, platform-owner
-approval, automatic signed bootstrap polling, authority pin persistence,
+approval, automatic signed join polling, authority pin persistence,
 heartbeat, and signed synthetic-config verification. This is a control-plane
 trust contract only; it does not enable RDP/VPN/service payload forwarding or
 production relay packet forwarding.
-- Node enrollment bootstrap polling is implemented and backend/agent-test plus
+- Node enrollment join polling is implemented and backend/agent-test plus
 docker-test lifecycle-smoke verified. After enrollment, `rap-node-agent`
 stores `pending_join_request_id`, polls
-`/node-agents/enrollments/{requestID}/bootstrap`, verifies the signed
-approval/bootstrap contract, and persists the approved `node_id`,
+`/node-agents/enrollments/{requestID}/join`, verifies the signed
+approval/join contract, and persists the approved `node_id`,
 `identity_status`, and cluster authority pin into `identity.json`. Polling is
 controlled by `RAP_ENROLLMENT_POLL_INTERVAL_SECONDS` and
 `RAP_ENROLLMENT_POLL_TIMEOUT_SECONDS`.
@@ -437,12 +437,12 @@ Not current scope:
 `rap-node-agent` tests only, behind the same disabled-by-default feature
 flag, and carries only bounded `synthetic.echo` test-service payloads.
 - C17E adds a live node-to-node synthetic HTTP transport skeleton and smoke
-harness. It remains behind `RAP_MESH_SYNTHETIC_RUNTIME_ENABLED=false` by
+harness. It remains behind `RAP_FABRIC_RUNTIME_ENABLED=false` by
 default and does not authorize production mesh, RDP, VPN, file, video, or
 service workload traffic.
 - C17F adds a scoped synthetic mesh config file boundary, prefers it over
 debug JSON, and reports synthetic route-health observations to the existing
-mesh links control-plane endpoint when testing flags allow synthetic links.
+mesh links fabric control endpoint when testing flags allow synthetic links.
 - C17G adds backend
 `/clusters/{clusterID}/nodes/{nodeID}/mesh/synthetic-config` and node-agent
 consumption of that config when no local scoped config file is set.
@@ -876,7 +876,7 @@ Result:
 Additional C17H deployed multi-agent synthetic config smoke verification:

 ```powershell
-powershell -NoProfile -ExecutionPolicy Bypass -File scripts\fabric\c17h-multi-agent-synthetic-smoke-ssh.ps1 -KeepRunning
+removed multi-agent smoke script is not part of the active tree
 go test ./...
 ```

@@ -1705,7 +1705,7 @@ Result:
 Docker-test C17Z12 runtime smoke:

 ```powershell
-.\scripts\fabric\c17z12-rendezvous-relay-smoke-ssh.ps1 -KeepRunning
+removed docker-test smoke script is not part of the active tree
 ```

 Result from run `c17z12-20260428-142108`:
@@ -1730,7 +1730,7 @@ Additional C17Z13 rendezvous lease telemetry verification:
 ```powershell
 go test ./...
 cmd /c "pushd \\nas\MST\codex\rdp-proxy\web-admin && npm run build && popd"
-.\scripts\fabric\c17z12-rendezvous-relay-smoke-ssh.ps1 -KeepRunning
+removed docker-test smoke script is not part of the active tree
 ```

 Run from:
@@ -1764,7 +1764,7 @@ Additional C17Z14 rendezvous lease refresh verification:
 ```powershell
 go test ./...
 cmd /c "pushd \\nas\MST\codex\rdp-proxy\web-admin && npm run build"
-.\scripts\fabric\c17z12-rendezvous-relay-smoke-ssh.ps1 -KeepRunning
+removed docker-test smoke script is not part of the active tree
 ```

 Run from:
@@ -1801,7 +1801,7 @@ Additional C17Z15 rendezvous relay replacement verification:
 ```powershell
 go test ./...
 cmd /c "pushd \\nas\MST\codex\rdp-proxy\web-admin && npm run build"
-.\scripts\fabric\c17z12-rendezvous-relay-smoke-ssh.ps1 -KeepRunning
+removed docker-test smoke script is not part of the active tree
 ```

 Run from:
@@ -1840,7 +1840,7 @@ Additional C17Z16 route/path decision artifact verification:
 ```powershell
 go test ./...
 cmd /c "pushd \\nas\MST\codex\rdp-proxy\web-admin && npm run build"
-.\scripts\fabric\c17z12-rendezvous-relay-smoke-ssh.ps1 -KeepRunning
+removed docker-test smoke script is not part of the active tree
 ```

 Run from:
@@ -1878,7 +1878,7 @@ Additional C17Z17 route generation tracker verification:
 ```powershell
 go test ./...
 cmd /c "pushd \\nas\MST\codex\rdp-proxy\web-admin && npm run build"
-.\scripts\fabric\c17z12-rendezvous-relay-smoke-ssh.ps1 -KeepRunning
+removed docker-test smoke script is not part of the active tree
 ```

 Run from:
@@ -1921,7 +1921,7 @@ Additional C17Z18 route-health effective path verification:
 ```powershell
 go test ./...
 cmd /c "pushd \\nas\MST\codex\rdp-proxy\web-admin && npm run build"
-.\scripts\fabric\c17z12-rendezvous-relay-smoke-ssh.ps1 -KeepRunning
+removed docker-test smoke script is not part of the active tree
 ```

 Run from:
@@ -2002,7 +2002,13 @@ Additional C17Z20 route-health feedback refresh verification:
 ```powershell
 go test ./...
 cmd /c "pushd \\nas\MST\codex\rdp-proxy\web-admin && npm run build"
-pwsh -NoProfile -ExecutionPolicy Bypass -File scripts\fabric\c17z12-rendezvous-relay-smoke-ssh.ps1 -KeepRunning
+pwsh -NoProfile -ExecutionPolicy Bypass -File scripts\check-fabric-standard-boundary.ps1
+```
+
+Removed smoke record:
+
+```powershell
+removed docker-test smoke script is not part of the active tree
 ```

 Run from:
@@ -2033,10 +2039,10 @@ C17Z20 report:

 - `artifacts/c17z20-route-health-feedback-refresh-report.md`

-Dev cluster enrollment/bootstrap lifecycle verification:
+Archived dev cluster enrollment/bootstrap lifecycle verification:

 ```powershell
-pwsh -NoProfile -ExecutionPolicy Bypass -File scripts\fabric\dev-cluster-enrollment-bootstrap-smoke-ssh.ps1 -KeepRunning
+removed dev lifecycle smoke script is not part of the active tree
 ```

 Result from docker-test run `dev-bootstrap-20260428-201430`:
@@ -62,7 +62,7 @@ Cluster Authority foundation is now also complete:
 - cluster authority private keys are encrypted at rest when
 `SECRET_ENCRYPTION_KEY_B64`/file is configured; production already requires
 a secret encryption key
-- legacy/default clusters are backfilled lazily through `EnsureClusterAuthority`
+- compat/default clusters are backfilled lazily through `EnsureClusterAuthority`
 - backend signs join-token scope material, node approval/bootstrap material,
 and node-scoped synthetic mesh config snapshots
 - node-agent verifies signed Control Plane synthetic config when
@@ -80,14 +80,14 @@ Cluster Authority foundation is now also complete:
 `RAP_WORKLOAD_SUPERVISION_ENABLED=false` by default while service runtime
 supervision remains a stub

-Node enrollment bootstrap polling is also complete:
-
-- backend exposes `/node-agents/enrollments/{requestID}/bootstrap`
-- pending agents prove `cluster_id`, `node_fingerprint`, and `public_key`
-before receiving status/bootstrap material
-- `rap-node-agent` stores `pending_join_request_id`, polls approval, verifies
-the signed bootstrap contract, then persists `node_id`, `identity_status`,
-and cluster authority pin into `identity.json`
+Node enrollment join polling is also complete:
+
+- backend exposes `/node-agents/enrollments/{requestID}/join`
+- pending agents prove `cluster_id`, `node_fingerprint`, and `public_key`
+before receiving status/join material
+- `rap-node-agent` stores `pending_join_request_id`, polls approval, verifies
+the signed join contract, then persists `node_id`, `identity_status`,
+and cluster authority pin into `identity.json`
 - polling is controlled by `RAP_ENROLLMENT_POLL_INTERVAL_SECONDS` and
 `RAP_ENROLLMENT_POLL_TIMEOUT_SECONDS`

@@ -157,14 +157,16 @@ Runtime report:
 - `artifacts/c18z15-live-service-channel-effective-quality-smoke-result.json`
 - `artifacts/c18z16-live-service-channel-quality-fairness-smoke-result.json`
 - `artifacts/c18z17-live-service-channel-quality-cleanup-smoke-result.json`
-- `artifacts/c18z18-service-channel-session-scoped-fairness-smoke-result.json`
-- `artifacts/c18z19-service-channel-parallel-flow-window-smoke-result.json`
-- `artifacts/c18z20-service-channel-adaptive-window-telemetry-smoke-result.json`
-- Docker-test smoke command:
-`pwsh -NoProfile -ExecutionPolicy Bypass -File scripts\fabric\c17z12-rendezvous-relay-smoke-ssh.ps1 -KeepRunning`
-- Dev lifecycle smoke command:
-`pwsh -NoProfile -ExecutionPolicy Bypass -File scripts\fabric\dev-cluster-enrollment-bootstrap-smoke-ssh.ps1 -KeepRunning`
-- Last proven runtime run: `c17z18-20260428-221601` (legacy smoke script name,
+- `artifacts/c18z18-service-channel-session-scoped-fairness-smoke-result.json`
+- `artifacts/c18z19-service-channel-parallel-flow-window-smoke-result.json`
+- `artifacts/c18z20-service-channel-adaptive-window-telemetry-smoke-result.json`
+- Active fabric standard check:
+`pwsh -NoProfile -ExecutionPolicy Bypass -File scripts\check-fabric-standard-boundary.ps1`
+- Removed docker-test smoke record:
+`removed docker-test smoke script is not part of the active tree`
+- Removed dev lifecycle smoke record:
+`removed dev lifecycle smoke script is not part of the active tree`
+- Last proven runtime run: `c17z18-20260428-221601` (compat smoke script name,
 current C17Z20 node-agent code)
 - Last proven dev lifecycle run: `dev-bootstrap-20260428-201430`
 - Admin: `http://192.168.200.61:18080/`
@@ -193,30 +195,30 @@ Node-agent image `rap-node-agent:0.2.270-c18z95` is built and deployed on
 `test-1/2/3`; web-admin is rebuilt and deployed to `rap_web_admin`.
 All three test nodes run the C18Z92 image, healthy, and current after policy
 update. Node-agent still requires signed service-channel lease authority when
-cluster authority is pinned, but if legacy clients cannot send signed lease
+cluster authority is pinned, but if compat clients cannot send signed lease
 headers it now calls backend introspection before accepting the unsigned token.
-Accepted ingress is visible as `accepted_by=signed|introspection|legacy_unsigned`
+Accepted ingress is visible as `accepted_by=signed|introspection|compat_unsigned`
 in structured node logs and via `X-RAP-Service-Channel-Accepted-By` on HTTP
 packet ingress. Durable introspection stores only `token_hash` plus a scrubbed
 lease payload, so backend restarts no longer break compatibility clients. Live
 lease maintenance now lists active/expired durable compatibility leases and runs
 bounded cleanup through the admin API/panel. Durable access telemetry now
 aggregates node-reported accepted ingress counters by signed/introspection/
-legacy path, with heartbeat metadata fallback and admin-panel visibility.
+compat path, with heartbeat metadata fallback and admin-panel visibility.
 Access telemetry now also correlates active durable service-channel leases with
-entry/exit nodes, primary route status, backend fallback, and latest
+entry/exit nodes, primary route status, compat fallback, and latest
 route-quality feedback when a route exists. Normal-route access diagnostics are
 smoke-proven with a temporary direct `vpn_packets` route and healthy rolling
 quality window. Degraded normal-route diagnostics are also smoke-proven: the
-active channel stays on a normal primary route with `force_backend_fallback=false`
+active channel stays on a normal primary route with `force_compat_fallback=false`
 while route feedback becomes `fenced` and rolling failure/drop/slow counters are
 visible. Active-channel remediation diagnostics now expose
 `remediation_action`, reason, optional alternate route id/status, and operator
-hint, with unit coverage for healthy/noop, rebuild, backend fallback, and
+hint, with unit coverage for healthy/noop, rebuild, compat fallback, and
 authorized alternate decisions. The alternate-route remediation branch is now
 live-smoke-proven: a selected primary route is degraded after lease issuance and
 access telemetry recommends `prefer_alternate_route` while keeping
-`force_backend_fallback=false`. C18Z57 turns that recommendation into a bounded
+`force_compat_fallback=false`. C18Z57 turns that recommendation into a bounded
 machine-readable `remediation_command` on the active channel row, including the
 primary route, replacement route, issued time, and command TTL capped to the
 lease lifetime. C18Z58 projects those commands into node-scoped synthetic mesh
@@ -225,10 +227,10 @@ route-manager `applied` decision with source
 `service_channel_remediation_command`. C18Z59 proves active traffic follows the
 replacement route after remediation: runtime heartbeat evidence shows
 `last_selected_route_id` and flow-scheduler `last_route_id` on the replacement
-route, with no local/backend fallback and no route send failures. C18Z60 proves
+route, with no local/compat fallback and no route send failures. C18Z60 proves
 the same replacement path under multiple independent VPN flow channels: a
 twelve-packet batch is classified across multiple flow-scheduler channels, all
-observed replacement-route sends avoid local/backend fallback, flow drops, and
+observed replacement-route sends avoid local/compat fallback, flow drops, and
 route failures. C18Z61 raises that to a pressure batch of 128 IPv4/TCP-like
 packets; runtime evidence shows 32 replacement-route flow stats, scheduler
 high-watermark 5, max-in-flight 4, no fallback, no drops, and no route failures.
@@ -260,7 +262,7 @@ fallback, route failures, flow drops, and scheduler drops stayed at 0. C18Z68
 adds backend/admin flow-health guard diagnostics over that telemetry:
 `flow_health_status` and `flow_health_reason` are projected at cluster, node,
 and active-channel levels from traffic-class pressure, queue pressure, flow
-drops, backend fallback, route-quality failures/drops/slow samples, and route
+drops, compat fallback, route-quality failures/drops/slow samples, and route
 send latency. Web-admin now shows flow-health chips beside flow QoS.
 C18Z69 adds node-side adaptive response: heartbeat flow-scheduler snapshots now
 report per-class `recommended_parallel_windows` plus
@@ -319,7 +321,7 @@ C18Z79 closes the planner-to-runtime proof loop for that branch: after planner
 resolution, the entry node reports a route-manager decision with the same
 `rebuild_request_id`, the transition is `applied_rebuild`, and live
 service-channel packet traffic selects the replacement route without
-local/backend fallback, route failures, or flow drops. C18Z80 hardens that
+local/compat fallback, route failures, or flow drops. C18Z80 hardens that
 same path under sustained pressure: after planner-applied rebuild, five
 post-rebuild bursts of mixed `interactive`, `bulk`, and `reliable` VPN packet
 batches stay on the replacement route, the stale primary is not reselected, and
@@ -396,7 +398,7 @@ C18Z91 makes node-agent consume that signed/introspected data-plane contract.
 Service-channel packet ingress validates the contract, applies the preferred
 fabric route, emits data-plane mode/transport/fallback/logical-flow fields in
 access logs, and reports contract adoption in heartbeat access telemetry.
-C18Z92 enforces disabled backend fallback policy at node-agent runtime: when a
+C18Z92 enforces disabled compat fallback policy at node-agent runtime: when a
 signed lease says `backend_relay_policy=disabled`, route failure or missing
 fabric route returns a visible 503 instead of silently proxying working data
 through backend relay.
@@ -414,13 +416,13 @@ can now surface a recommended action such as restoring the fabric route instead
 of treating backend relay as normal service traffic.
 C18Z95 adds node-agent blocked-fallback telemetry. When a signed data-plane
 contract disables backend relay and the entry runtime cannot use a fabric
-route, node-agent reports `backend_fallback_blocked`, the last data-plane
+route, node-agent reports `compat_fallback_blocked`, the last data-plane
 violation status/reason, and backend/admin project those fields to cluster,
 node, channel, and `data_plane_contract` incident diagnostics. Disabled-policy
 refusal is now separate from real backend relay usage.
 C18Z96 wires normal-route send failure with disabled backend relay into the
 existing route feedback and rebuild planner path. When heartbeat access
-telemetry reports `fabric_route_send_failed_backend_fallback_blocked`, backend
+telemetry reports `fabric_route_send_failed_compat_fallback_blocked`, compat
 correlates the entry node's active service-channel leases, records fenced
 `fabric_service_channel_route_feedback` for the selected primary route, and the
 existing planner can select an alternate/replacement route. This keeps blocked
@@ -570,7 +572,7 @@ artifacts:
 `artifacts/c18z89-service-channel-access-decision-resurface-action-smoke-result.json`, and
 `artifacts/c18z90-service-channel-data-plane-contract-smoke-result.json`, and
 `artifacts/c18z91-node-agent-data-plane-contract-enforcement-smoke-result.json`, and
-`artifacts/c18z92-node-agent-disabled-backend-fallback-smoke-result.json`, and
+`artifacts/c18z92-node-agent-disabled-compat-fallback-smoke-result.json`, and
 `artifacts/c18z93-access-telemetry-data-plane-contract-smoke-result.json`, and
 `artifacts/c18z94-data-plane-contract-incident-smoke-result.json`, and
 `artifacts/c18z95-node-agent-blocked-fallback-telemetry-smoke-result.json`, and