Health Checks
Health checks are available in config-driven mode only. Each [[upstream]] that includes a [upstream.health_check] section gets a dedicated daemon thread (health-{upstream_name}) that probes every backend on a regular interval and updates the live backend list without any restart.
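As a rough illustration of the per-upstream thread described above, the sketch below spawns a named thread using only the standard library. The function name `spawn_checker`, the `tick` callback, and the stop-on-`false` convention are assumptions made for this example, not the project's actual API.

```rust
use std::thread;
use std::time::Duration;

/// Hypothetical spawner (not the project's API): runs `tick` repeatedly on a
/// thread named health-{upstream_name}, sleeping `interval` between runs.
fn spawn_checker<F>(
    upstream_name: &str,
    interval: Duration,
    mut tick: F,
) -> std::io::Result<thread::JoinHandle<()>>
where
    F: FnMut() -> bool + Send + 'static,
{
    thread::Builder::new()
        .name(format!("health-{}", upstream_name))
        .spawn(move || {
            // Assumption: `tick` returns false to stop, which keeps the sketch
            // testable. A long-running checker would simply loop forever.
            while tick() {
                thread::sleep(interval);
            }
        })
}
```

Dropping the returned `JoinHandle` detaches the thread, which is the closest standard-library equivalent of a daemon thread in Rust.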
Configuration
    [[upstream]]
    name = "api"
    backends = ["api-1:3000", "api-2:3000", "api-3:3000"]

    [upstream.health_check]
    path = "/healthz"          # GET path (default: "/health")
    interval_secs = 15         # probe interval in seconds (default: 30)
    timeout_ms = 3000          # connect + read timeout per probe (default: 5000)
    healthy_threshold = 2      # consecutive successes to restore (default: 2)
    unhealthy_threshold = 3    # consecutive failures to remove (default: 3)

How it works
Startup state
All backends start as live. The health checker assumes backends are healthy until proven otherwise.
Probe request
Every interval_secs seconds the checker sends a minimal HTTP/1.1 request to each backend:
    GET /healthz HTTP/1.1
    Host: api-1
    Connection: close

Both the TCP connect and the response read are bounded by timeout_ms. A backend is considered healthy if it replies with a 2xx status code (the checker reads only the first 16 bytes of the response, just enough for HTTP/1.1 2).
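A minimal sketch of such a probe, using only the Rust standard library, might look like the following. The names `probe_backend` and `is_2xx` are hypothetical and stand in for whatever the real checker uses. The sketch also assumes that any connect, write, or read error counts as a failed probe.

```rust
use std::io::{Read, Write};
use std::net::{TcpStream, ToSocketAddrs};
use std::time::Duration;

/// Returns true when the response prefix starts an HTTP/1.1 2xx status line.
fn is_2xx(prefix: &[u8]) -> bool {
    prefix.starts_with(b"HTTP/1.1 2")
}

/// Hypothetical single probe: connect, send the GET, read at most 16 bytes.
fn probe_backend(backend: &str, path: &str, timeout: Duration) -> bool {
    // Resolve "host:port"; a resolution failure counts as a failed probe.
    let addr = match backend.to_socket_addrs().ok().and_then(|mut addrs| addrs.next()) {
        Some(addr) => addr,
        None => return false,
    };
    // The connect is bounded by the timeout.
    let mut stream = match TcpStream::connect_timeout(&addr, timeout) {
        Ok(stream) => stream,
        Err(_) => return false,
    };
    // Each read call is bounded by the same timeout.
    if stream.set_read_timeout(Some(timeout)).is_err() {
        return false;
    }
    // The Host header carries only the host part of "host:port".
    let host = backend.rsplit_once(':').map_or(backend, |(host, _)| host);
    let request = format!(
        "GET {} HTTP/1.1\r\nHost: {}\r\nConnection: close\r\n\r\n",
        path, host
    );
    if stream.write_all(request.as_bytes()).is_err() {
        return false;
    }
    // Read up to 16 bytes; stop early if the peer closes the connection.
    let mut buf = [0u8; 16];
    let mut filled = 0;
    while filled < buf.len() {
        match stream.read(&mut buf[filled..]) {
            Ok(0) => break,
            Ok(n) => filled += n,
            Err(_) => return false,
        }
    }
    is_2xx(&buf[..filled])
}
```

A call such as `probe_backend("api-1:3000", "/healthz", Duration::from_millis(3000))` would correspond to one probe with the configuration shown earlier.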
Failure tracking
Per-backend counters track consecutive successes and failures independently:
    backend api-2:
      consecutive failures = 1 → still live
      consecutive failures = 2 → still live
      consecutive failures = 3 → REMOVED from live list (unhealthy_threshold reached)

    backend api-2 later:
      consecutive successes = 1 → still dead
      consecutive successes = 2 → RESTORED to live list (healthy_threshold reached)

Each outcome resets the opposite counter: a success resets the failure counter to 0, and a failure resets the success counter to 0.
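The counter logic can be sketched as a small state machine. The type and method names below (`BackendState`, `Transition`, `record`) are invented for this illustration. Only the threshold behavior comes from the description above.

```rust
/// What a single probe result did to the backend's live/dead state.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Transition {
    Unchanged,
    Removed,
    Restored,
}

/// Hypothetical per-backend record; backends start live with zeroed counters.
struct BackendState {
    live: bool,
    successes: u32,
    failures: u32,
}

impl BackendState {
    fn new() -> Self {
        Self { live: true, successes: 0, failures: 0 }
    }

    /// Records one probe result and reports whether the live state flipped.
    fn record(&mut self, ok: bool, healthy_threshold: u32, unhealthy_threshold: u32) -> Transition {
        if ok {
            // A success clears the failure streak.
            self.failures = 0;
            self.successes = self.successes.saturating_add(1);
            if !self.live && self.successes >= healthy_threshold {
                self.live = true;
                return Transition::Restored;
            }
        } else {
            // A failure clears the success streak.
            self.successes = 0;
            self.failures = self.failures.saturating_add(1);
            if self.live && self.failures >= unhealthy_threshold {
                self.live = false;
                return Transition::Removed;
            }
        }
        Transition::Unchanged
    }
}
```

With `healthy_threshold = 2` and `unhealthy_threshold = 3`, three failed probes in a row yield `Transition::Removed` on the third call, and two successful probes afterwards yield `Transition::Restored` on the second.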
Live list update
After probing all backends, the checker atomically replaces the shared live list:
    Arc<RwLock<Vec<String>>>   // written by health checker; read by DynamicProxy

DynamicProxy acquires a read lock on every request, so concurrent requests can read the list in parallel. The health checker acquires a write lock only when publishing the new list.
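The publish and read paths around that shared type could look roughly like this. The alias `LiveList` and the helpers `publish` and `snapshot` are illustrative names. Recovering from a poisoned lock instead of panicking is also an assumption of this sketch.

```rust
use std::sync::{Arc, RwLock};

/// Shared live list, matching the type shown above.
type LiveList = Arc<RwLock<Vec<String>>>;

/// Writer side: swap in the freshly computed list under a short write lock.
fn publish(live: &LiveList, healthy: Vec<String>) {
    // Assumption: a poisoned lock is recovered rather than treated as fatal.
    let mut guard = live.write().unwrap_or_else(|e| e.into_inner());
    *guard = healthy;
}

/// Reader side: take a read lock and copy out the current list.
fn snapshot(live: &LiveList) -> Vec<String> {
    live.read().unwrap_or_else(|e| e.into_inner()).clone()
}
```

Because the whole vector is replaced in one assignment while the write lock is held, a reader sees either the old list or the new one, never a partially updated list.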
Log output
State changes are logged to stderr:
    [health] upstream=api backend=api-2:3000 removed (3x fail)
    [health] upstream=api backend=api-2:3000 restored (2x ok)

All backends unhealthy
If all backends fail their health checks, the live list becomes empty. DynamicProxy returns 502 Bad Gateway for every request until at least one backend recovers.
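The empty-list case can be illustrated with a small selection helper. The function name `pick_backend` and the round-robin counter are assumptions of this sketch, since the selection strategy used by DynamicProxy is not described here. Only the 502 result for an empty live list comes from the text above.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::RwLock;

/// Hypothetical selection step: Ok(backend), or Err(502) when nothing is live.
fn pick_backend(live: &RwLock<Vec<String>>, next: &AtomicUsize) -> Result<String, u16> {
    let list = live.read().unwrap_or_else(|e| e.into_inner());
    if list.is_empty() {
        // No live backend: the caller would answer 502 Bad Gateway.
        return Err(502);
    }
    // Assumption: simple round-robin over the live list.
    let i = next.fetch_add(1, Ordering::Relaxed) % list.len();
    Ok(list[i].clone())
}
```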
Implementation reference
The health checker lives in src/proxy_config/health.rs:
- start_health_checker(upstream_name, backends, live, config): spawns the daemon thread.
- check_backend(backend, path, timeout): sends a single probe, returns true on 2xx.
- parse_host_port(backend): strips URL prefixes and returns (host, port).
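To make the last helper concrete, here is one way the host and port could be split. This is a sketch, not the code in health.rs. The name `parse_host_port_sketch`, the `Option<(String, u16)>` return type, the set of stripped prefixes, and the choice to reject a missing port are all assumptions.

```rust
/// Hypothetical parser: drops an "http://" or "https://" prefix and any path,
/// then splits "host:port". Returns None when the port is missing or invalid.
/// Bracketed IPv6 literals are not handled in this sketch.
fn parse_host_port_sketch(backend: &str) -> Option<(String, u16)> {
    let rest = backend
        .strip_prefix("http://")
        .or_else(|| backend.strip_prefix("https://"))
        .unwrap_or(backend);
    // Ignore anything after the authority, such as a trailing "/" or a path.
    let authority = rest.split('/').next().unwrap_or(rest);
    let (host, port) = authority.rsplit_once(':')?;
    if host.is_empty() {
        return None;
    }
    let port: u16 = port.parse().ok()?;
    Some((host.to_string(), port))
}
```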