Refactor RDP proxy handling and update related tests
@@ -0,0 +1,96 @@

# Distributed Authority Audit 2026-05-16

Status: target architecture is distributed, but the live test cluster still has
bootstrap central authority pieces that must be removed before production trust.

## Fixed Requirements

- No single management/API/storage/update service is allowed to own cluster
  truth.
- Control, storage, update, route authority, observer, and update-cache are node
  roles in the fabric.
- A service endpoint can serve signed state, but cannot create trusted state by
  itself.
- Node identity is cryptographic. IP addresses, DNS names, and NAT addresses are
  endpoint candidates only.
- Nodes must publish real signed candidates for reachable interfaces,
  STUN/ICE-reflexive addresses, passive reverse channels, and relay fallback.
- Nodes must verify signed control data locally before applying it.
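
The last requirement, local verification before apply, can be sketched as a fail-closed gate. This is a minimal illustration, not the node-agent code: `should_apply`, the `Verifier` callable, and the choice to refuse when no authority is pinned are all assumptions. `toy_verify` is a stand-in with no cryptographic value; a real verifier would wrap a signature check against the pinned key (presumably Ed25519, judging by the authority ID prefix).

```python
from typing import Callable, Optional

# Hypothetical verifier shape: (public_key, payload, signature) -> bool.
Verifier = Callable[[bytes, bytes, bytes], bool]


def should_apply(
    payload: bytes,
    signature: Optional[bytes],
    pinned_key: Optional[bytes],
    verify: Verifier,
) -> bool:
    """Return True only if the payload is signed by the pinned authority."""
    if pinned_key is None:
        # Assumption: with nothing pinned there is nothing to trust, so refuse.
        return False
    if not signature:
        # Unsigned control data is never applied.
        return False
    return verify(pinned_key, payload, signature)


def toy_verify(key: bytes, payload: bytes, sig: bytes) -> bool:
    # Stand-in for a real signature check; NOT cryptography.
    return sig == key + b":" + payload
```

With the toy verifier, `should_apply(b"plan", b"k:plan", b"k", toy_verify)` is accepted, while the same payload with a missing signature, a missing pinned key, or a signature under another key is refused.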

## Live Cluster Findings

- The live cluster has one active `cluster_authorities` row:
  `rap-ca-ed25519-09877466aa9b6b58b0f312b0b313ea33`.
- Its metadata says `storage=database_signer` and
  `production_target=external_cluster_signer_or_hsm`.
- Release metadata for recent node-agent versions is signed, but by the same
  database-backed authority.
- Synthetic mesh configs are signed, and node-agent verifies them against the
  pinned cluster authority.
- Node enrollment pins the cluster authority into `identity.json`.
- Before this audit, host-agent update plans carried signatures, but host-agent
  did not locally reject unsigned plans when a pinned authority was present.

## Changes Made In This Audit

- The fabric docs now declare distributed authority and quorum as mandatory.
- Node/fabric endpoints must be explicit `host:port`; DNS-only service names are
  rejected as fabric endpoints.
- `home-1` no longer advertises `smoke.cin.su` as a fabric endpoint. It now
  advertises its real interface candidate `quic://192.168.200.85:18080`.
- Host-agent now verifies `node_update_plan` authority signatures when
  `identity.json` contains a pinned cluster authority public key.
- Unsigned update plans are rejected in that pinned-authority mode.
- Added `rap.cluster_authority.quorum.v1` and
  `rap.cluster_authority.quorum_envelope.v1` contracts to both agent and
  backend authority packages.
- Host-agent can now verify quorum-signed update plans when `identity.json`
  contains a pinned quorum descriptor.
- Backend update plans now include an `authority_quorum` envelope when the
  cluster authority metadata contains a quorum descriptor. If that configured
  quorum cannot be satisfied, the update plan is not issued.
- Node bootstrap now carries `cluster_authority_quorum`; the approval authority
  payload signs the quorum descriptor hash, and node-agent persists the
  descriptor into `identity.json` after verifying the signed hash.
- Published `rap-node-agent` and `rap-host-agent` release
  `0.2.284-quorumauthority`.
- Canaried `home-1` to `rap-node-agent 0.2.284-quorumauthority` and
  `rap-host-agent 0.2.284-quorumauthority`; both reported healthy/noop after
  update.
- Published `rap-node-agent` and `rap-host-agent` release
  `0.2.285-quorumbootstrap`.
- Canaried `home-1` to `rap-node-agent 0.2.285-quorumbootstrap` and
  `rap-host-agent 0.2.285-quorumbootstrap`; both reported current=target/noop.
  `ifcm-rufms-s-mo1cr` was intentionally not updated because it is behind NAT
  and still needs fabric/update-cache artifact reachability before further
  rollout.
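
The explicit `host:port` rule for fabric endpoints can be illustrated with a small check. This is a sketch under assumptions, not the validator shipped in the agents: the function name `is_explicit_endpoint` is hypothetical, and it assumes the rule only requires a host plus an explicit numeric port, with or without a scheme such as `quic://`.

```python
from urllib.parse import urlsplit


def is_explicit_endpoint(candidate: str) -> bool:
    """Accept only candidates that carry an explicit host and port."""
    # urlsplit needs "//" to recognise the authority part of a bare host:port.
    text = candidate if "://" in candidate else "//" + candidate
    try:
        parts = urlsplit(text)
        port = parts.port  # raises ValueError on a malformed port
    except ValueError:
        return False
    return bool(parts.hostname) and port is not None and port > 0
```

Under these assumptions `quic://192.168.200.85:18080` passes, while the bare service name `smoke.cin.su` is rejected because it carries no port.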

## Remaining Production Blockers

- Replace `database_signer` with quorum authority:
  M-of-N signatures from nodes or hardware/offline keys with
  `control-authority` / `update-authority` roles.
- Store authority descriptors and role certificates as replicated signed state,
  not only database rows.
- Require quorum envelopes for the remaining high-risk mutations: role
  mutation, release creation, update policy mutation, route lease issuance,
  relay/rendezvous lease issuance, storage placement, and authority rotation.
  Node update plans and bootstrap quorum pinning now have the first contract
  hooks, but production still needs real M-of-N signers.
- Add node-side verification of release metadata in addition to update-plan
  verification; update-plan verification is now enforced by host-agent when a
  pinned authority or pinned quorum descriptor exists.
- Add update-cache mirror selection through fabric endpoint candidates instead
  of a single HTTP origin.
- Add signed endpoint-candidate epochs so peer directory gossip can survive API
  replica loss.
- Add revocation/fencing epochs for compromised authority keys, nodes, and
  update artifacts.
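
The M-of-N requirement in the first blocker can be sketched as a threshold count over distinct authorised signers. The field names below (`threshold`, `signers`) and the `Verifier` callable are hypothetical; they do not reflect the actual `rap.cluster_authority.quorum.v1` schema, and the per-signer signature check is injected rather than implemented.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet

# Hypothetical verifier shape: (signer_id, payload, signature) -> bool.
Verifier = Callable[[str, bytes, bytes], bool]


@dataclass(frozen=True)
class QuorumDescriptor:
    threshold: int            # M: signatures required
    signers: FrozenSet[str]   # the N authorised signer ids


def quorum_satisfied(
    desc: QuorumDescriptor,
    payload: bytes,
    signatures: Dict[str, bytes],
    verify: Verifier,
) -> bool:
    """True when at least M distinct authorised signers validly signed payload."""
    if not 1 <= desc.threshold <= len(desc.signers):
        return False  # malformed descriptor: fail closed
    valid = {
        signer_id
        for signer_id, sig in signatures.items()
        if signer_id in desc.signers and verify(signer_id, payload, sig)
    }
    return len(valid) >= desc.threshold


def toy_verify(signer_id: str, payload: bytes, sig: bytes) -> bool:
    # Stand-in for a real signature check; NOT cryptography.
    return sig == signer_id.encode() + b"|" + payload
```

Keying signatures by signer id means a signer cannot be counted twice, and signatures from ids outside the descriptor are ignored rather than counted toward the threshold.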

## Acceptance Rule

The cluster is not production-trust-ready while a single `database_signer` can
create authoritative cluster mutations. It may remain as a development bootstrap
signer only when every signed payload clearly identifies it as bootstrap and
nodes can be configured to reject it in production mode.
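
A node-side form of this rule might look like the sketch below. It assumes the signed payload exposes the authority's `storage` value (the audit reports `storage=database_signer` in the authority metadata) and that the node has a production-mode switch; `accept_signer` and the `production_mode` flag are hypothetical names, and `"quorum"` in the usage note is only a placeholder storage value.

```python
BOOTSTRAP_STORAGE = "database_signer"


def accept_signer(authority_meta: dict, production_mode: bool) -> bool:
    """Decide whether a payload's signer is acceptable in the current mode."""
    if not production_mode:
        # Development: the bootstrap signer is tolerated.
        return True
    storage = authority_meta.get("storage")
    if storage is None:
        # Assumption: a payload that does not identify its signer kind
        # is treated as untrusted in production (fail closed).
        return False
    return storage != BOOTSTRAP_STORAGE
```

In development mode `accept_signer({"storage": "database_signer"}, False)` passes; in production mode the same metadata is refused, as is metadata with no `storage` field, while `{"storage": "quorum"}` is accepted.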