rdp-proxy/scripts/smoke/README.md
2026-04-28 22:29:50 +03:00

# Local Smoke Test
This smoke path proves the minimal end-to-end server-side session lifecycle without any UI.
## Verification matrix
### Locally proven in this repository
- backend `go build ./...` succeeds
- worker build environment files exist and are aligned across devcontainer, Docker, and CI
- worker Docker image contract now has a deterministic runtime binary path: `/usr/local/bin/rdp-worker`
- worker source had minimal compile fixes applied for missing declarations/includes needed by the reproducible build environment
### Container-proven
- the canonical worker build environment is [workers/rdp-worker/Dockerfile](/\\?\UNC\192.168.220.200\mst\codex\rdp-proxy\workers\rdp-worker\Dockerfile)
- a successful `docker build` in that environment proves CMake configure + compile + install for the worker
- Data Plane v1 Stage DP-1C builds the optional worker direct WSS endpoint and
installs `/usr/local/bin/rdp-worker-dataplane-token-probe`
- Data Plane v1 Stage DP-1D.1 builds the worker direct JSON realtime bridge
for the same JSON envelopes used by the backend gateway
- DP-1C endpoint validation is proven for malformed-token rejection,
valid-token-without-runtime rejection, and replayed-`jti` rejection;
successful direct attach to a live runtime and direct JSON traffic proof are
still live smoke targets
### CI-defined but not yet executed in this verification pass
- [`.github/workflows/build.yml`](/\\?\UNC\192.168.220.200\mst\codex\rdp-proxy\.github\workflows\build.yml) builds the backend
- the same workflow builds the worker Docker image
- the workflow verifies that `/usr/local/bin/rdp-worker` exists inside the image
### Still not proven automatically
- behavior on host reboot during an active real RDP session
- automated assertion of actual viewer-side rendering correctness
- Stage 5.1 file upload is build-proven in this pass, but live upload proof
requires the RAP stack to be running on the current test Docker host
`192.168.200.61`
## Data Plane v1C worker WSS validation smoke
Build the worker image on the test Docker host:
```powershell
$env:DOCKER_BUILDKIT='0'
docker --context test-ubuntu build --tag rap-rdp-worker:dp1c-hardened --file workers/rdp-worker/Dockerfile workers/rdp-worker
```
Run the narrow endpoint smoke:
```powershell
pwsh -ExecutionPolicy Bypass -File scripts/smoke/data-plane-v1c-smoke.ps1
```
Expected evidence:
- worker logs `direct data-plane WSS endpoint listening`
- malformed token receives `401 Unauthorized`
- valid token with no existing session runtime receives `404 Not Found` with
`missing_runtime`
- replaying the same token receives `401 Unauthorized` with
`jti_replay_rejected`
- worker logs `event=token_validation_failed reason=malformed_token`
- worker logs `event=data_plane_bind_failed ... reason=missing_runtime`
- worker logs `event=jti_replay_rejected`
This smoke intentionally does not route Windows client traffic through the
direct worker WSS endpoint. The backend gateway remains the runtime path until
DP-1D.
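The expected evidence above implies an order of checks: a malformed token is rejected first, a replayed `jti` is rejected before the runtime lookup, and the `jti` is consumed even when the bind fails with `missing_runtime`. The following is a minimal Python model of that order, not the worker's actual code. The class name, the pre-parsed token dict, and the `101` success status are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DirectEndpointModel:
    """Toy model of the validation order implied by the expected evidence."""

    live_runtimes: set = field(default_factory=set)  # session ids with a running runtime
    seen_jti: set = field(default_factory=set)       # token ids already presented

    def attach(self, token):
        # `token` is a hypothetical pre-parsed dict; real tokens are signed strings.
        if not isinstance(token, dict) or "jti" not in token or "session_id" not in token:
            return 401, "malformed_token"
        if token["jti"] in self.seen_jti:
            return 401, "jti_replay_rejected"
        # The jti is recorded even when the bind fails, which is why replaying a
        # valid token is rejected as a replay rather than as missing_runtime.
        self.seen_jti.add(token["jti"])
        if token["session_id"] not in self.live_runtimes:
            return 404, "missing_runtime"
        return 101, "attached"  # assumed success status for a WebSocket upgrade
```

Calling `attach` twice with the same valid token on a model with no live runtimes returns `missing_runtime` and then `jti_replay_rejected`, matching the smoke evidence.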
## Data Plane v1D.1 direct JSON bridge smoke status
Build the DP-1D.1 worker image on the test Docker host:
```powershell
$env:DOCKER_BUILDKIT='0'
docker --context test-ubuntu build --tag rap-rdp-worker:dp1d1 --file workers/rdp-worker/Dockerfile workers/rdp-worker
```
Backend metadata must remain explicitly gated:
```text
DATA_PLANE_DIRECT_WORKER_JSON_RUNTIME=false # default, client falls back
DATA_PLANE_DIRECT_WORKER_JSON_RUNTIME=true # advertise runtime_transport=json_v1
```
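As an illustration of the gate, the sketch below returns direct-candidate metadata only when the flag is set to `true`. It is a Python model rather than the backend's Go code. The function name, the case-insensitive parsing, and the shape of the returned dict are assumptions; only the variable name and `runtime_transport=json_v1` come from the settings above.

```python
import os

FLAG = "DATA_PLANE_DIRECT_WORKER_JSON_RUNTIME"


def direct_candidate_metadata(env=None):
    """Return direct-candidate metadata only when the flag is set to 'true'."""
    env = os.environ if env is None else env
    # Assumed parsing: trimmed, case-insensitive comparison against "true".
    enabled = env.get(FLAG, "false").strip().lower() == "true"
    if not enabled:
        return {}  # nothing advertised, so the client falls back
    return {"runtime_transport": "json_v1"}
```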
Locally/container-proven in DP-1D.1:
- backend tests prove direct candidate metadata appears only when
`DATA_PLANE_DIRECT_WORKER_JSON_RUNTIME=true`
- Windows client build still succeeds and remains capability-gated
- worker Docker image builds with direct JSON envelope bridge
- endpoint/token smoke still proves invalid token, missing runtime, and replay
rejection
Still requiring live RDP smoke:
- direct WSS connects to an already-running runtime
- direct WSS carries input/render/clipboard/file_upload JSON envelopes
- direct WSS does not recreate the RDP runtime
- backend gateway fallback activates when direct WSS is unavailable
- direct vs fallback latency comparison on the same RDP target
## Prerequisites
- Docker Desktop or Docker Engine with `docker compose`
- Go `1.23.x` for local backend runs
- a reachable RDP host for the seeded resource
- a machine where the worker Docker image can actually be built and started
## 0. Verify raw TCP reachability to the target first
Run this from the same machine or container host that will run the worker:
```powershell
python scripts/smoke/check-rdp-target.py --host 192.168.60.210 --port 60210
```
Expected result:
- `tcp_connect=ok ...`
If this step fails, FreeRDP connect proof is blocked by target reachability and the later lifecycle steps cannot be considered proven.
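For hosts where the repository script is not at hand, a raw TCP connect check can be sketched as follows. This is not the content of `check-rdp-target.py`; the function name and the timeout default are assumptions.

```python
import socket


def tcp_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, `tcp_reachable("192.168.60.210", 60210)` checks the same target as the command above. A `True` result only proves the port accepts connections, not that an RDP handshake will succeed.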
## Canonical build environments
- worker devcontainer: [`.devcontainer/devcontainer.json`](/\\?\UNC\192.168.220.200\mst\codex\rdp-proxy\.devcontainer\devcontainer.json)
- worker Docker image: [workers/rdp-worker/Dockerfile](/\\?\UNC\192.168.220.200\mst\codex\rdp-proxy\workers\rdp-worker\Dockerfile)
- worker CMake preset: [workers/rdp-worker/CMakePresets.json](/\\?\UNC\192.168.220.200\mst\codex\rdp-proxy\workers\rdp-worker\CMakePresets.json)
## 1. Start infra
```powershell
pwsh -File scripts/smoke/start-infra.ps1
```
Expected result:
- PostgreSQL is reachable on `127.0.0.1:5432`
- Redis is reachable on `127.0.0.1:6379`
## 2. Apply backend migrations
```powershell
pwsh -File scripts/smoke/apply-migrations.ps1
```
Expected result:
- backend tables exist in `remote_access_platform`
## 3. Seed a smoke-test user, trusted device, resource, and policy
Edit the connection parameters to point to a reachable RDP host:
```powershell
pwsh -File scripts/smoke/seed-resource.ps1 `
-RdpHost 10.0.0.10 `
-RdpPort 3389 `
-RdpUsername Administrator `
-RdpPassword secret `
-RdpDomain "" `
-CertificateVerificationMode strict
```
Expected result:
- the script creates or reuses the `default` organization created by the v2 migrations
- the script creates an active default-organization membership for the seeded smoke user
- the script prints `user_id`
- the script prints `device_id`
- the script prints `resource_id`
## 4. Start backend
```powershell
pwsh -File scripts/smoke/run-backend.ps1
```
Expected result:
- backend is reachable at `http://192.168.200.61:8080` from the Windows client and at `127.0.0.1:8080` on the Docker host network
Containerized fallback when the smoke host does not have `go` installed:
```sh
docker run -d --name rap_backend_smoke --network host \
-v /absolute/path/to/repo/backend:/workspace/backend \
-w /workspace/backend \
-e APP_NAME=rap-api \
-e APP_ENV=development \
-e HTTP_HOST=0.0.0.0 \
-e HTTP_PORT=8080 \
-e 'POSTGRES_DSN=postgres://rap_user:rap_password@127.0.0.1:5432/remote_access_platform?sslmode=disable' \
-e REDIS_ADDR=127.0.0.1:6379 \
-e AUTH_ACCESS_TOKEN_SECRET=smoke-access-secret \
-e AUTH_REFRESH_HASH_SECRET=smoke-refresh-secret \
golang:1.23.8-bookworm /bin/sh -lc '/usr/local/go/bin/go run ./cmd/api'
```
## 5. Build the worker image
```powershell
pwsh -File scripts/smoke/build-worker-image.ps1
docker run --rm --entrypoint /bin/sh rap-rdp-worker:dev -lc "test -x /usr/local/bin/rdp-worker"
```
Expected result:
- image build succeeds
- the binary exists at `/usr/local/bin/rdp-worker`
## 6. Run the worker container
```powershell
pwsh -File scripts/smoke/run-worker-container.ps1
```
Expected result:
- the worker process starts
- Redis contains `worker:registration:rdp-worker-1`
If the test RDP host uses a self-signed or mismatched certificate and smoke verification needs to continue, set:
```text
RDP_WORKER_INSECURE_SKIP_VERIFY=true
```
This override is worker-runtime-only and is intended strictly for smoke verification.
## 7. Start a session
```powershell
$body = @{
resource_id = "<resource_id>"
user_id = "<user_id>"
device_id = "<device_id>"
} | ConvertTo-Json
Invoke-RestMethod `
-Method Post `
-Uri http://192.168.200.61:8080/api/v1/sessions `
-ContentType 'application/json' `
-Body $body
```
Expected result:
- the response includes `session.id`
- the response includes `attachment.id`
- the response includes `attach_token`
- the response session payload carries `organization_id`
- backend logs a session start and assignment path
- worker logs a new assignment and FreeRDP connect attempt
- Redis `worker:events` emits `session_connected` and then periodic `session_heartbeat`
## 8. Detach
```powershell
$body = @{
attachment_id = "<attachment_id>"
user_id = "<user_id>"
reason = "manual_smoke_detach"
} | ConvertTo-Json
Invoke-RestMethod `
-Method Post `
-Uri "http://192.168.200.61:8080/api/v1/sessions/<session_id>/detach" `
-ContentType 'application/json' `
-Body $body
```
Expected result:
- PostgreSQL session state becomes `detached`
- worker keeps the RDP connection alive
- worker does not emit `session_terminated`
## 9. Reattach
```powershell
$body = @{
user_id = "<user_id>"
device_id = "<device_id>"
} | ConvertTo-Json
Invoke-RestMethod `
-Method Post `
-Uri "http://192.168.200.61:8080/api/v1/sessions/<session_id>/attach" `
-ContentType 'application/json' `
-Body $body
```
Expected result:
- backend returns a new short-lived `attach_token`
- worker does not recreate the remote RDP session
- worker continues heartbeating the same `session_id`
## 10. Takeover
Create or seed a second trusted device for the same user, then:
```powershell
$body = @{
user_id = "<user_id>"
device_id = "<second_device_id>"
reason = "manual_smoke_takeover"
} | ConvertTo-Json
Invoke-RestMethod `
-Method Post `
-Uri "http://192.168.200.61:8080/api/v1/sessions/<session_id>/takeover" `
-ContentType 'application/json' `
-Body $body
```
Expected result:
- backend atomically supersedes the previous attachment
- previous controller WebSocket session receives `session.taken_over` if connected
- worker stays on the same remote RDP session and only updates controller ownership
## 10A. Prove WebSocket takeover delivery
Use the real smoke client built into the backend module:
```sh
cd backend
go run ./cmd/ws-smoke-client \
-attach-token "<controller_a_attach_token>" \
-duration 120s
```
Expected result:
- controller A first receives `session.state`
- after takeover from controller B, controller A receives `session.taken_over`
- PostgreSQL shows the new attachment for controller B as `active`
- worker logs only `updated assignment for existing session ...` and does not log a new runtime start
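The ordering requirement for controller A can be checked mechanically once the received event types are collected into a list. The helper below is a hypothetical sketch; how `ws-smoke-client` prints events is not specified here, so extracting the event names from its output is left to the caller.

```python
def takeover_delivered(events):
    """Check that `session.state` arrives first and `session.taken_over` follows it."""
    if not events or events[0] != "session.state":
        return False
    # Any later position is acceptable; other events may arrive in between.
    return "session.taken_over" in events[1:]
```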
## 11. Terminate
```powershell
$body = @{
user_id = "<user_id>"
reason = "manual_smoke_terminate"
} | ConvertTo-Json
Invoke-RestMethod `
-Method Post `
-Uri "http://192.168.200.61:8080/api/v1/sessions/<session_id>/terminate" `
-ContentType 'application/json' `
-Body $body
```
Expected result:
- backend marks the session `terminated`
- worker receives a `terminate` control envelope
- worker disconnects the FreeRDP session and emits `session_terminated`
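The lifecycle calls in steps 7 to 11 share one shape: a JSON `POST` to `/api/v1/sessions` or to `/api/v1/sessions/<session_id>/<action>`. For scripted runs, that shape can be sketched in Python as below. The helper name is hypothetical, and the function only builds the request; sending it (for example with `urllib.request.urlopen`) is left to the caller.

```python
import json
import urllib.request

BASE_URL = "http://192.168.200.61:8080/api/v1"  # backend address used in the steps above


def lifecycle_request(action, payload, session_id=None):
    """Build (but do not send) the POST request for one lifecycle step."""
    if action == "start":
        url = f"{BASE_URL}/sessions"
    elif action in {"detach", "attach", "takeover", "terminate"}:
        if not session_id:
            raise ValueError(f"{action} requires a session_id")
        url = f"{BASE_URL}/sessions/{session_id}/{action}"
    else:
        raise ValueError(f"unknown action: {action}")
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

The payload keys are the same as in the PowerShell bodies, for example `{"user_id": "...", "reason": "manual_smoke_terminate"}` for `terminate`.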
## 12. Prove stale lease and worker-death recovery
With a live active session still running:
```sh
docker rm -f rap_worker_smoke
```
Wait at least:
- `WORKER_HEARTBEAT_TTL`
- plus `WORKER_STALE_LEASE_GRACE_PERIOD`
- plus one lease-monitor interval
With current defaults from [backend/configs/api.example.env](/\\?\UNC\192.168.220.200\mst\codex\rdp-proxy\backend\configs\api.example.env), waiting about `90s` is sufficient for a manual smoke pass.
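If the defaults have been changed, the minimum wait is the sum of the three intervals listed above. A small sketch of that arithmetic follows; the function name is hypothetical, and the numbers in the usage note are placeholders, not the values from `api.example.env`.

```python
def min_recovery_wait(heartbeat_ttl_s, stale_grace_s, monitor_interval_s):
    """Lower bound, in seconds, before a dead worker's session can be marked failed."""
    return heartbeat_ttl_s + stale_grace_s + monitor_interval_s
```

With placeholder values of 30, 30, and 15 seconds, `min_recovery_wait(30, 30, 15)` gives 75, so a 90 second wait would leave some margin.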
Expected result:
- `worker:registration:<worker_id>` disappears from Redis
- `worker:session-lease:<session_id>` is released
- `live:session:<session_id>`, `live:binding:<session_id>`, and `live:route:<session_id>` are cleared
- PostgreSQL moves the session to `failed`
- non-superseded attachments become `closed`
- `audit_events` contains `session_failed` with reason `worker_lease_stale_or_worker_missing`
## Troubleshooting
- If the worker image build fails, run `docker build --tag rap-rdp-worker:dev --file workers/rdp-worker/Dockerfile workers/rdp-worker` directly and inspect the compiler output.
- If raw TCP reachability to the RDP target fails, stop there and fix host/network/firewall/port access before evaluating FreeRDP behavior.
- If the worker starts but never receives assignments, verify `worker:registration:<worker_id>` and `worker:control:<worker_id>` in Redis.
- If session start returns `access denied` after the v2 migrations, verify that the seeded user has an active membership in the resource organization.
- If takeover does not produce `session.taken_over`, confirm that controller A is attached through `/api/v1/gateway/ws` using a still-valid attach token and that the broker binding changed.
- If worker death does not transition the session quickly enough, verify `WORKER_HEARTBEAT_TTL`, `WORKER_STALE_LEASE_GRACE_PERIOD`, and the lease monitor interval before treating the session as stuck.
- If FreeRDP cannot connect, verify the seeded host, port, username, password, domain, certificate verification mode, and network reachability from Docker to the target host.
- If attach tokens expire during manual testing, repeat the attach or takeover call to mint a new token.
## Stage 5.1 / 5.1.1 File Upload Smoke
Current target Docker host for this project is `192.168.200.61` via Docker
context `test-ubuntu`. Verify before running:
```powershell
docker context use test-ubuntu
docker ps
```
Policy setup:
```sql
UPDATE resource_policies
SET file_transfer_mode = 'client_to_server',
file_transfer_enabled = TRUE,
updated_at = now()
WHERE resource_id = '<resource_id>';
```
Disabled-policy regression:
```sql
UPDATE resource_policies
SET file_transfer_mode = 'disabled',
file_transfer_enabled = FALSE,
updated_at = now()
WHERE resource_id = '<resource_id>';
```
Manual upload proof from the Windows client:
- start or attach an active RDP session
- open the session window
- click `Upload File`
- choose a small text file, then a small binary file
- verify the UI progress reaches 100%
- inspect backend logs for `session gateway file upload start accepted` and
`session gateway file upload chunk accepted`
- inspect worker logs for `file upload completed`
- verify the file and hash inside the worker container:
```powershell
docker exec rap_worker_smoke sh -lc `
"find /tmp/rap-rdp-worker-transfers -maxdepth 4 -path '*/visible/*' -type f -print -exec wc -c {} \;"
```
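The command above prints paths and byte counts only. To compare hashes as well, the same digest has to be computed on the source file and on the stored copy. The sketch below assumes SHA-256, which is a choice made here for illustration; the hash algorithm is not named in this document.

```python
import hashlib
import os


def size_and_sha256(path, chunk_size=1024 * 1024):
    """Return (size_in_bytes, sha256_hex) for a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return os.path.getsize(path), digest.hexdigest()
```

Run it against the local file before upload and against a copy of the stored file; the upload is intact when both tuples are equal.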
Stage 5.1.1 visibility proof:
Automated smoke command used for the accepted proof:
```powershell
pwsh -ExecutionPolicy Bypass -File scripts\smoke\drive-visibility-smoke.ps1 `
-WorkerImage rap-rdp-worker:rdp-p1-region-order2 `
-OutputFrame artifacts\stage5-drive-visibility-frame-p1-rerun.bmp
```
- worker logs must show `visible transfer directory ready`
- worker logs must show `FreeRDP restricted transfer drive configured name=RAP_Transfers`
- inside the remote Windows session, open File Explorer and verify the
redirected drive `RAP_Transfers` is present
- verify uploaded files are visible under `RAP_Transfers`
- open the uploaded text file and verify the content matches
- for binary files, verify size and hash through worker storage evidence unless
a remote-side hash tool is available
- after detach, old-client takeover, and worker failure, upload must be blocked
and the visible transfer directory must be cleaned up
Required PASS cases for accepting Stage 5.1:
- `disabled` blocks upload
- `client_to_server` allows upload
- small text file hash matches
- small binary file hash matches
- file larger than 25 MiB is blocked by client/gateway policy
- path traversal names are blocked by gateway/worker validation
- upload is blocked after detach, old-client takeover, and worker failure
- rendering, mouse input, keyboard input, clipboard, reconnect, and takeover
still work
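The size and name checks in the PASS cases can be illustrated with a small validator. This is a sketch under stated assumptions, not the gateway or worker validation code: the function name and the rejection reason strings are hypothetical, and the name rule (plain file names only) is one conservative way to block path traversal.

```python
MAX_UPLOAD_BYTES = 25 * 1024 * 1024  # 25 MiB limit from the PASS cases


def upload_rejection_reason(file_name, size_bytes):
    """Return a rejection reason string, or None when the upload may proceed."""
    if size_bytes > MAX_UPLOAD_BYTES:
        return "file_too_large"
    # Accept plain file names only: no separators, parent references,
    # drive or stream markers, NUL bytes, or empty names.
    if (
        not file_name
        or file_name in {".", ".."}
        or "/" in file_name
        or "\\" in file_name
        or ":" in file_name
        or "\x00" in file_name
    ):
        return "invalid_file_name"
    return None
```

Useful negative test inputs include `../secret.txt`, `..\secret.txt`, `C:\temp\x.txt`, and a file of exactly 25 MiB plus one byte.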
Accepted Stage 5.1.1 proof artifact:
- `artifacts/stage5-drive-visibility-frame-p1-rerun.bmp` shows the uploaded
`stage5-upload-text.txt` opened inside remote Windows from the restricted
`RAP_Transfers` drive.
Important limitation for Stage 5.1.1: it intentionally exposes only the
restricted per-session `visible` directory as `RAP_Transfers`. It must not be
expanded to arbitrary paths, full shared folders, SMB/WebDAV, or Windows agent
delivery.
## Stage 5.2 File Download Smoke
Stage 5.2 server-to-client download has a runtime-proven core data path and
lifecycle blocking proof. Manual desktop UI proof is still required before full
acceptance.
Build-proven images:
```text
rap-backend-smoke:stage5-2-download
rap-rdp-worker:stage5-2-download
```
Headless core data-path proof:
```powershell
pwsh -NoProfile -ExecutionPolicy Bypass `
-File scripts\smoke\file-download-smoke.ps1 `
-AllowMode server_to_client `
-Transport direct_worker_wss `
-OutputDirectory artifacts/stage5-2-download-smoke-direct-fixed2
pwsh -NoProfile -ExecutionPolicy Bypass `
-File scripts\smoke\file-download-smoke.ps1 `
-AllowMode bidirectional `
-Transport direct_worker_wss `
-OutputDirectory artifacts/stage5-2-download-smoke-direct-bidirectional
pwsh -NoProfile -ExecutionPolicy Bypass `
-File scripts\smoke\file-download-smoke.ps1 `
-AllowMode client_to_server `
-Transport direct_worker_wss `
-ExpectBlocked `
-OutputDirectory artifacts/stage5-2-download-smoke-direct-client-to-server-block-fixed
pwsh -NoProfile -ExecutionPolicy Bypass `
-File scripts\smoke\file-download-smoke.ps1 `
-AllowMode disabled `
-Transport direct_worker_wss `
-ExpectBlocked `
-OutputDirectory artifacts/stage5-2-download-smoke-direct-disabled-fixed
pwsh -NoProfile -ExecutionPolicy Bypass `
-File scripts\smoke\file-download-smoke.ps1 `
-AllowMode server_to_client `
-Transport backend_gateway `
-OutputDirectory artifacts/stage5-2-download-smoke-backend-regression-after-direct-block
```
Lifecycle proof:
```powershell
pwsh -NoProfile -ExecutionPolicy Bypass `
-File scripts\smoke\file-download-smoke.ps1 `
-AllowMode server_to_client `
-Transport direct_worker_wss `
-LifecycleScenario detach `
-OutputDirectory artifacts/stage5-2-download-lifecycle-detach-fixed
pwsh -NoProfile -ExecutionPolicy Bypass `
-File scripts\smoke\file-download-smoke.ps1 `
-AllowMode server_to_client `
-Transport direct_worker_wss `
-LifecycleScenario takeover_old_controller `
-OutputDirectory artifacts/stage5-2-download-lifecycle-takeover-fixed
pwsh -NoProfile -ExecutionPolicy Bypass `
-File scripts\smoke\file-download-smoke.ps1 `
-AllowMode server_to_client `
-Transport direct_worker_wss `
-LifecycleScenario worker_failure `
-OutputDirectory artifacts/stage5-2-download-lifecycle-worker-failure
```
Accepted core evidence:
- direct worker WSS `server_to_client`: text and binary size/hash match
- direct worker WSS `bidirectional`: text and binary downloads succeed
- direct worker WSS `client_to_server`: download blocked with `access denied`
- direct worker WSS `disabled`: download blocked with `access denied`
- backend gateway fallback `server_to_client`: text and binary size/hash match
- detach blocks download with `file_download.blocked`
- old controller after takeover receives `session.taken_over` and cannot
continue download
- worker failure transitions PostgreSQL state to `failed`; direct WebSocket
closes and download cannot continue
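The policy outcomes in the evidence above can be summarized as a mode-to-direction table. The sketch below is illustrative only. Two points are assumptions that this document does not state: `server_to_client` is treated as blocking upload, and unknown modes are treated as disabled.

```python
ALLOWED_DIRECTIONS = {
    "disabled": set(),
    "client_to_server": {"upload"},
    "server_to_client": {"download"},  # upload assumed blocked in this mode
    "bidirectional": {"upload", "download"},
}


def transfer_allowed(mode, direction):
    """Return True when `direction` ('upload' or 'download') is permitted by `mode`."""
    if direction not in {"upload", "download"}:
        raise ValueError(f"unknown direction: {direction}")
    # Unknown modes fail closed (assumption).
    return direction in ALLOWED_DIRECTIONS.get(mode, set())
```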
Report:
- `artifacts/stage5-2-file-download-runtime-report.md`
Remaining manual live proof:
- keep the Stage 5.2 backend and worker images on `docker-test`
- set `resource_policies.file_transfer_mode = 'server_to_client'`
- start or attach a real RDP session
- inside remote Windows, copy a small text file to `RAP_Transfers\ToClient`
- verify the Windows client shows `file_download.available`
- click `Download File`, choose a local save path, and verify completion
- compare size and hash with worker evidence
- repeat with a small binary file
- verify `disabled` and `client_to_server` block download
- verify `bidirectional` allows upload and download
- verify rendering, mouse, keyboard, clipboard, upload, reconnect, takeover,
and backend gateway fallback do not regress