Traces & SLOs
Multi-backend distributed tracing, synthetic uptime checks, and Service Level Objectives to track reliability targets.
Choose your tracing backend
KeelPilot supports three distributed tracing backends, selectable per cluster. Your choice is saved automatically, and responses from every backend are normalized to a common schema, so the UI looks the same whichever backend you pick.
| Backend | When to use it |
|---|---|
| AWS X-Ray | Native AWS services (Lambda, ECS, API Gateway). Nothing to install in-cluster — uses IAM. |
| Grafana Tempo | Clusters with an existing Grafana stack and high trace volume (object storage). |
| Jaeger | Clusters that already run Jaeger or prefer its UI and operator. |
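To illustrate what normalizing responses to a common schema can look like, here is a minimal sketch. It is not KeelPilot's actual implementation: the `TraceSummary` type, the `normalize` function, and the input field names are all assumptions chosen for illustration, loosely modeled on the fact that backends report durations in different units (seconds, milliseconds, microseconds).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceSummary:
    """Hypothetical common schema shared by all backends."""
    trace_id: str
    duration_ms: float
    has_error: bool


def from_xray(raw: dict) -> TraceSummary:
    # Assumed input shape: duration in seconds, separate error and fault flags.
    return TraceSummary(
        trace_id=raw["Id"],
        duration_ms=raw["Duration"] * 1000.0,
        has_error=bool(raw.get("HasError") or raw.get("HasFault")),
    )


def from_tempo(raw: dict) -> TraceSummary:
    # Assumed input shape: duration already in milliseconds.
    return TraceSummary(
        trace_id=raw["traceID"],
        duration_ms=float(raw["durationMs"]),
        has_error=bool(raw.get("error", False)),  # hypothetical field
    )


def from_jaeger(raw: dict) -> TraceSummary:
    # Assumed (flattened, hypothetical) input shape: duration in microseconds.
    return TraceSummary(
        trace_id=raw["traceID"],
        duration_ms=raw["durationMicros"] / 1000.0,
        has_error=bool(raw.get("error", False)),  # hypothetical field
    )


NORMALIZERS = {"xray": from_xray, "tempo": from_tempo, "jaeger": from_jaeger}


def normalize(backend: str, raw: dict) -> TraceSummary:
    """Convert one backend-specific record into the common schema."""
    return NORMALIZERS[backend](raw)
```

Because every record ends up as the same type, code that renders a trace list or a service map does not need to know which backend produced it.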
Service maps & search
- Visual service map with latencies between services.
- Search traces by service, status, and duration.
- Inspect segment and subsegment detail.
- Select the AWS connection, region, and cluster, then pick your trace backend.
Tempo and Jaeger can be auto-installed from the prerequisites banner (see Cluster Prerequisites). X-Ray requires the ADOT addon and IAM (IRSA) setup. Java, Node.js, and Python support auto-instrumentation via OpenTelemetry.
Synthetic monitoring
Periodic HTTP checks with SSL certificate validation and alerting. Make sure the expected status codes include what the endpoint really returns (for example, 301 or 403); otherwise a healthy endpoint is reported as failing and triggers false alerts.
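The effect of the expected status codes can be sketched as follows. This is an illustrative model only, not KeelPilot's check logic: `CheckResult`, `evaluate_check`, and the `min_cert_days` threshold are hypothetical names and values.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class CheckResult:
    healthy: bool
    reason: str


def evaluate_check(
    status_code: int,
    expected_codes: Iterable[int],
    cert_days_left: Optional[int] = None,
    min_cert_days: int = 14,  # hypothetical warning threshold
) -> CheckResult:
    """Decide whether a single synthetic check passed."""
    # A response outside the expected set fails the check,
    # even if the endpoint is behaving as designed.
    if status_code not in set(expected_codes):
        return CheckResult(False, f"unexpected status {status_code}")
    # Only evaluate the certificate when its remaining lifetime is known.
    if cert_days_left is not None and cert_days_left < min_cert_days:
        return CheckResult(False, f"certificate expires in {cert_days_left} days")
    return CheckResult(True, "ok")
```

In this model, an endpoint that redirects with 301 fails a check that expects only 200, and passes once 301 is added to the expected set.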
SLOs
Define Service Level Objectives bound to real metrics, with burn-rate tracking so you know when you're at risk of breaching your error budget.
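Burn rate is commonly defined as the observed error rate divided by the error budget, where the error budget is 1 minus the SLO target. The sketch below follows that common definition; how KeelPilot computes it internally is not described here, and the function names are assumptions.

```python
def error_budget(slo_target: float) -> float:
    """Fraction of requests allowed to fail, e.g. about 0.001 for a 99.9% target."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    return 1.0 - slo_target


def burn_rate(failed: int, total: int, slo_target: float) -> float:
    """Observed error rate divided by the error budget.

    A value of 1.0 means the budget is consumed exactly as fast as the SLO
    allows; values above 1.0 mean it runs out before the SLO window ends.
    """
    if total <= 0:
        return 0.0  # no traffic, nothing burned
    return (failed / total) / error_budget(slo_target)


def hours_to_exhaustion(window_hours: float, rate: float) -> float:
    """Time until a full budget is spent, assuming the burn rate stays constant."""
    if rate <= 0.0:
        return float("inf")
    return window_hours / rate
```

For example, 5 failures in 1,000 requests against a 99.9% target gives a burn rate of roughly 5, so a 30-day (720-hour) budget would last about 144 hours if that rate held.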