Health Checks
A health check is an HTTP request Clank sends to your container after it starts. If the container responds, the deployment is marked active and traffic is routed to it. If all retries fail, the deployment is marked failed and the container is stopped.
Health checks prevent broken deployments from receiving traffic. Combined with blue-green deploys, they enable zero-downtime deployments — the old container keeps serving until the new one is verified healthy.
How it works
After a container starts, Clank waits for a startup grace period to give the application time to boot, then begins probing:
Container starts
  ↓
Wait 30s (startup grace)
  ↓
GET http://container:port/health → responds? → ACTIVE ✓
  ↓ (no response)
Wait 10s
  ↓
GET http://container:port/health → responds? → ACTIVE ✓
  ↓ (no response)
Wait 10s
  ↓
GET http://container:port/health → responds? → ACTIVE ✓
  ↓ (no response)
FAILED ✗ (container stopped)

Configuration
| Setting | Default | Description |
|---|---|---|
| Path | /health | The HTTP path to probe. |
| Retries | 3 | Attempts before marking the deployment as failed. |
| Interval | 10s | Seconds between retry attempts. |
| Timeout | 5s | Max seconds to wait for a response per attempt. |
| Startup grace | 30s | Seconds to wait before the first probe. |
Configure these in the service settings in the dashboard or via the CLI.
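The probing sequence and the settings above can be sketched as a small loop. This is an illustration under stated assumptions, not Clank's actual implementation: the `HealthCheckSettings` class, the `run_health_check` and `worst_case_seconds` functions, and the injectable `probe` and `sleep` callables are all hypothetical names introduced here. Only the default values come from the table.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class HealthCheckSettings:
    # Defaults copied from the settings table; the field names are hypothetical.
    path: str = "/health"
    retries: int = 3
    interval_s: float = 10.0
    timeout_s: float = 5.0
    startup_grace_s: float = 30.0


def run_health_check(
    probe: Callable[[str, float], bool],
    settings: HealthCheckSettings,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Return "ACTIVE" on the first successful probe, "FAILED" once every attempt fails."""
    if not settings.path:
        # An empty path means health checking is skipped.
        return "ACTIVE"
    sleep(settings.startup_grace_s)
    for attempt in range(settings.retries):
        if probe(settings.path, settings.timeout_s):
            return "ACTIVE"
        if attempt < settings.retries - 1:
            # Wait only between attempts, not after the last one.
            sleep(settings.interval_s)
    return "FAILED"


def worst_case_seconds(s: HealthCheckSettings) -> float:
    """Upper bound on time to FAILED, assuming every attempt runs into the timeout."""
    waits = max(s.retries - 1, 0)
    return s.startup_grace_s + s.retries * s.timeout_s + waits * s.interval_s
```

Under these assumptions, the defaults give a worst case of 30 + 3 × 5 + 2 × 10 = 65 seconds before a deployment is marked failed. Raising the startup grace or the retry count extends that budget for slow-starting applications.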
What counts as healthy
Any HTTP response counts as healthy, including 2xx, 3xx, 4xx, and 5xx. The health check verifies that the process is listening and responding, not that the response code is “good.”
What counts as unhealthy:
- Connection refused (process not listening)
- Connection timeout (process not responding within the timeout)
- DNS resolution failure
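The healthy/unhealthy rule above can be reproduced locally to test an application before deploying it. The following is a minimal sketch using Python's standard `http.client`, which does not follow redirects. The `probe` function and its parameters are hypothetical, and treating a malformed (non-HTTP) reply as unhealthy is an assumption not stated in this page.

```python
import http.client


def probe(host: str, port: int, path: str = "/health", timeout_s: float = 5.0) -> bool:
    """Return True if the target sends back any HTTP response, whatever the status code."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout_s)
    try:
        conn.request("GET", path)
        conn.getresponse()  # 2xx, 3xx, 4xx, and 5xx all count; redirects are not followed
        return True
    except OSError:
        # Covers connection refused, timeouts, and DNS resolution failures.
        return False
    except http.client.HTTPException:
        # Assumption: a reply that is not valid HTTP is treated as unhealthy.
        return False
    finally:
        conn.close()
```

For example, `probe("localhost", 8080)` returns `True` even if the application answers `/health` with a 404 or a 302, and `False` if nothing is listening on the port.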
When to skip health checks
Leave the health check path empty to skip health checking. Clank then marks the deployment as ACTIVE immediately after the container starts.
Skip health checks for:
- Databases (PostgreSQL, MySQL, MongoDB, Redis) — they don’t serve HTTP
- Background workers — no HTTP interface
- Message queues — not HTTP-based
All database and cache templates have health checks disabled by default.
Redirects
Some applications redirect their root path (e.g., WordPress redirects / to https://...). Clank handles this correctly: a 301 or 302 response counts as healthy. Both the agent’s health check client and Traefik’s load balancer are configured not to follow redirects, so the redirect response itself is what gets evaluated.
Two layers of health checking
Clank runs health checks at two levels:
- Deployment health check — Runs once during deployment. Determines whether the deployment succeeds or fails.
- Traefik health check — Runs continuously (every 5 seconds). Traefik only routes traffic to healthy containers, preventing routing to containers that crash after deployment.
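To make the second layer more concrete, the sketch below builds the Docker labels that would express an equivalent Traefik health check. How Clank actually configures Traefik is not documented here, so treat this as an assumption: the `traefik_healthcheck_labels` function, the service name, and the 3s timeout are hypothetical. The label keys follow Traefik's load balancer health check options (`path`, `interval`, `timeout`, `followRedirects`).

```python
def traefik_healthcheck_labels(
    service: str,
    path: str = "/health",
    interval: str = "5s",   # matches the 5-second interval described above
    timeout: str = "3s",    # hypothetical; Traefik expects the timeout to be shorter than the interval
) -> dict[str, str]:
    """Build Docker labels for a Traefik load balancer health check (illustrative only)."""
    prefix = f"traefik.http.services.{service}.loadbalancer.healthcheck"
    return {
        f"{prefix}.path": path,
        f"{prefix}.interval": interval,
        f"{prefix}.timeout": timeout,
        # Do not follow redirects, so a 301/302 is evaluated as-is.
        f"{prefix}.followredirects": "false",
    }
```

Note that Traefik by default treats only 2xx and 3xx responses as healthy, which is stricter than the any-response rule of the deployment health check. Whether Clank changes that default is not stated on this page.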
Common issues
| Symptom | Likely cause | Fix |
|---|---|---|
| Stuck at HEALTH_CHECKING | App starts slowly | Increase startup grace or retries |
| Fails immediately | Container crashing | Check deployment logs for errors |
| Every probe is refused (wrong port) | App listens on a different port than the one configured | Update port in service settings |
| Path returns 404 | Health check path doesn’t exist | Change path to / or a valid endpoint |
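If the application has no suitable endpoint, adding a dedicated one is often the simplest fix. The following is a minimal sketch of an application-side `/health` endpoint using Python's standard `http.server`. The `HealthHandler` class, the `make_server` helper, and port 8080 are hypothetical choices for illustration; use whatever port and framework the application already has.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        if self.path == "/health":
            status, body = 200, b"ok"
        else:
            status, body = 404, b"not found"
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format: str, *args) -> None:
        pass  # keep periodic probe requests out of the application logs


def make_server(port: int = 8080) -> ThreadingHTTPServer:
    # Bind to all interfaces so the probe can reach the process from outside the container.
    return ThreadingHTTPServer(("0.0.0.0", port), HealthHandler)
```

Calling `make_server().serve_forever()` starts the server. The port it listens on must match the port configured in the service settings, otherwise every probe is refused.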
Next steps
- Deployments — Understand the full deployment lifecycle including blue-green.
- Troubleshooting: Health Check Failures — Step-by-step debugging guide.