rdp-proxy/docs/ops/TEST_DOCKER_DISK_GUARD.md
2026-05-14 23:30:34 +03:00


# Test Docker Disk Guard
`test-docker` is a shared build and runtime host. If `/` fills up, Postgres can
restart-loop with `No space left on device`, which breaks VPN diagnostics and
cluster tests. The disk guard is the first operational guardrail for that host.
## What It Does
- Checks `/` usage on every run.
- At `>= 85%`, removes data that is safe to reclaim:
  - Docker build cache.
  - Dangling Docker images.
  - Old RAP temporary build directories under `/tmp`.
- If usage is still `>= 85%` after cleanup, publishes a warning status.
- If usage is `>= 95%` after cleanup, publishes a critical status and exits with code `2`.
- Writes machine-readable status to:
  - `http://docker-test.cin.su:18080/downloads/ops/test-docker-disk-guard-status.json`
- Writes the host log to:
  - `/tmp/rap-ops/test-docker-disk-guard.log`
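
The threshold logic above can be sketched roughly as follows. This is an illustration, not the contents of `test-docker-disk-guard.sh`: the function names, the `rap-build-*` directory pattern, and the one-day age cutoff are assumptions, while the `85`/`95` thresholds come from the list above.

```shell
# Thresholds from this document.
WARN_PCT=85
CRIT_PCT=95

# Print the current usage of / as a bare integer, e.g. "87" (GNU df).
root_usage_pct() {
  df --output=pcent / | tail -n 1 | tr -dc '0-9'
}

# Reclaim space. The /tmp pattern and the age cutoff are hypothetical.
cleanup() {
  docker builder prune -f        # Docker build cache
  docker image prune -f          # dangling images only
  find /tmp -maxdepth 1 -type d -name 'rap-build-*' -mtime +1 -exec rm -rf {} +
}

# Map a post-cleanup usage percentage to a status level.
classify_usage() {
  pct="$1"
  if [ "$pct" -ge "$CRIT_PCT" ]; then
    echo critical
  elif [ "$pct" -ge "$WARN_PCT" ]; then
    echo warning
  else
    echo ok
  fi
}
```

A run would then call `cleanup` only when `root_usage_pct` reports `85` or more, re-read the usage, and exit with code `2` when `classify_usage` prints `critical`.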
## Install Or Refresh Schedule
Run from the repo root on the Windows workstation:
```powershell
pwsh -ExecutionPolicy Bypass -File scripts/ops/test-docker-disk-guard.ps1 -InstallCron -RunOnce
```
The wrapper uploads `scripts/ops/test-docker-disk-guard.sh` to
`/home/test/bin/rap-test-docker-disk-guard` on `test-docker`. It installs cron
when `crontab` exists; otherwise it installs a user systemd timer named
`rap-test-docker-disk-guard.timer`.
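
The cron-or-timer choice described above could look something like this on the host. The function name `pick_scheduler` is hypothetical, and the real wrapper may detect `crontab` differently.

```shell
# Print "cron" when the given command (default: crontab) is on PATH,
# otherwise "systemd-timer". The optional argument exists only to make
# the check easy to exercise.
pick_scheduler() {
  if command -v "${1:-crontab}" >/dev/null 2>&1; then
    echo cron
  else
    echo systemd-timer
  fi
}
```

To see which schedule is actually in place, `crontab -l` lists the user's cron entries and `systemctl --user list-timers rap-test-docker-disk-guard.timer` shows the timer if it was installed.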
## Manual Check
```powershell
pwsh -ExecutionPolicy Bypass -File scripts/ops/test-docker-disk-guard.ps1 -RunOnce
Invoke-RestMethod http://docker-test.cin.su:18080/downloads/ops/test-docker-disk-guard-status.json
```
## Expansion Approach
Cleanup is only a pressure valve. If the status remains `warning` or `critical`
after cleanup, expand the host disk.
The host's root filesystem is expected to be on LVM. If the volume group (VG)
already has free space, the guard status will recommend:
```bash
sudo lvextend -r -l +100%FREE /dev/mapper/ubuntu--vg-ubuntu--lv
```
If there is no VG free space, first expand the VM disk in the hypervisor, then
run `pvresize` for the physical volume and finally `lvextend -r` for the root
logical volume.
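
As a rough sketch of that decision, the helper below prints the steps for a given number of free extents in the VG. The function name is hypothetical, `<physical-volume>` is a placeholder that must be filled in for the actual host, and the VG name `ubuntu-vg` is inferred from the device-mapper path above.

```shell
# Print the expansion steps for a given count of free physical extents.
expansion_plan() {
  free_extents="$1"
  if [ "$free_extents" -gt 0 ]; then
    # The VG has room: grow the LV and its filesystem in one step.
    echo "sudo lvextend -r -l +100%FREE /dev/mapper/ubuntu--vg-ubuntu--lv"
  else
    # No room: the disk and the PV have to grow first.
    echo "expand the VM disk in the hypervisor"
    echo "sudo pvresize <physical-volume>"
    echo "sudo lvextend -r -l +100%FREE /dev/mapper/ubuntu--vg-ubuntu--lv"
  fi
}

# Usage on the host:
#   expansion_plan "$(sudo vgs --noheadings -o vg_free_count ubuntu-vg | tr -d ' ')"
```

If the physical volume sits on a partition rather than a whole disk, the partition usually has to be grown before `pvresize` can see the new space.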
## Optional Webhook
The shell guard supports a `WEBHOOK_URL` environment variable. If it is set in
the crontab or the environment, warning and critical states are posted as JSON:
```json
{"level":"warning","message":"...","host":"...","observed_at":"..."}
```
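
A minimal sketch of how such a post might be built with `printf` and `curl` is shown below. The function names are hypothetical, the ISO 8601 UTC timestamp format is an assumption, and the payload builder does no JSON escaping, so it is only safe for values without quotes or backslashes.

```shell
# Build the JSON body from level, message, host, and timestamp.
# No escaping is done: values must not contain quotes or backslashes.
build_payload() {
  printf '{"level":"%s","message":"%s","host":"%s","observed_at":"%s"}\n' \
    "$1" "$2" "$3" "$4"
}

# POST the payload only when WEBHOOK_URL is set; otherwise do nothing.
post_status() {
  [ -n "${WEBHOOK_URL:-}" ] || return 0
  build_payload "$1" "$2" "$(hostname)" "$(date -u +%Y-%m-%dT%H:%M:%SZ)" |
    curl -fsS -X POST -H 'Content-Type: application/json' \
      --data-binary @- "$WEBHOOK_URL"
}
```

With this shape, `post_status warning "root at 90%"` is a no-op on hosts where `WEBHOOK_URL` is unset, which keeps the webhook optional.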