Traces & SLOs

Multi-backend distributed tracing, synthetic uptime checks, and Service Level Objectives to track reliability targets.

Choose your tracing backend

KeelPilot supports three distributed tracing backends that you can select per cluster. Your choice is saved automatically and responses are normalized to common schemas, so the UI looks the same regardless of backend.

Backend         When to use it
AWS X-Ray       Native AWS services (Lambda, ECS, API Gateway). Nothing to install in-cluster; uses IAM.
Grafana Tempo   Clusters with an existing Grafana stack and high trace volume (object storage).
Jaeger          Clusters that already run Jaeger or prefer its UI and operator.
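
To illustrate what normalizing responses to a common schema can look like, here is a minimal Python sketch that maps a Jaeger span and an X-Ray segment onto one shape. The Span class and both function names are hypothetical and are not KeelPilot's actual schema or API. The input field names follow the Jaeger JSON span format and the X-Ray segment document format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Hypothetical common shape; not KeelPilot's actual schema."""
    trace_id: str
    span_id: str
    service: str
    name: str
    start_s: float      # start time, epoch seconds
    duration_ms: float


def from_jaeger(span: dict, processes: dict) -> Span:
    # Jaeger JSON reports startTime and duration in microseconds and
    # resolves the service name through the trace-level "processes" map.
    return Span(
        trace_id=span["traceID"],
        span_id=span["spanID"],
        service=processes[span["processID"]]["serviceName"],
        name=span["operationName"],
        start_s=span["startTime"] / 1_000_000,
        duration_ms=span["duration"] / 1_000,
    )


def from_xray(segment: dict) -> Span:
    # X-Ray segment documents use epoch seconds for start_time and end_time.
    # Assumption for this sketch: the segment name doubles as the service name.
    return Span(
        trace_id=segment["trace_id"],
        span_id=segment["id"],
        service=segment["name"],
        name=segment["name"],
        start_s=segment["start_time"],
        duration_ms=(segment["end_time"] - segment["start_time"]) * 1000,
    )
```

Once every backend is converted to the same shape, the code that renders the UI only needs to understand one set of units and field names.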

Service maps & search

  • Visual service map with latencies between services.
  • Search traces by service, status, and duration.
  • Inspect segment and subsegment detail.
  • Select the AWS connection, region, and cluster, then pick your trace backend.

Tempo and Jaeger can be auto-installed from the prerequisites banner (see Cluster Prerequisites). X-Ray requires the AWS Distro for OpenTelemetry (ADOT) addon and IAM Roles for Service Accounts (IRSA) setup. Applications written in Java, Node.js, and Python can be auto-instrumented via OpenTelemetry.
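
As a rough illustration of how a service map with latencies can be derived from trace data, the Python sketch below walks parent/child span links and records an edge wherever a call crosses a service boundary. The span keys (id, parent_id, service, duration_ms) and the function name are assumptions made for this example; this is not a description of how KeelPilot builds its map.

```python
from collections import defaultdict
from statistics import mean


def service_edges(spans):
    """Return {(caller, callee): mean callee duration in ms}.

    Each span is a dict with hypothetical keys: id, parent_id (or None),
    service, duration_ms.
    """
    by_id = {s["id"]: s for s in spans}
    samples = defaultdict(list)
    for span in spans:
        parent = by_id.get(span["parent_id"])
        # An edge exists only where a call crosses a service boundary.
        if parent is not None and parent["service"] != span["service"]:
            edge = (parent["service"], span["service"])
            samples[edge].append(span["duration_ms"])
    return {edge: mean(values) for edge, values in samples.items()}
```

The latency here is the callee's span duration, which excludes network time between the two services. A map that needs caller-observed latency would measure the client-side span instead.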

Synthetic monitoring

Periodic HTTP checks with SSL certificate validation and alerting. Make sure the expected status codes include the code the endpoint actually returns when healthy (e.g. 301 for a redirect, 403 for a protected route); otherwise a healthy endpoint is reported as down.
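
The sketch below shows one way such a check could work, using only the Python standard library. It is an illustration under stated assumptions, not KeelPilot's implementation: the function names, the 14-day certificate warning threshold, and the reason strings are all hypothetical.

```python
import http.client
import ssl


def evaluate(status, expected_codes, cert_expires_at, now, min_days_left=14):
    """Pure decision logic: returns (healthy, reason).

    cert_expires_at and now are epoch seconds; min_days_left is an
    assumed threshold chosen for this example.
    """
    if status not in expected_codes:
        return False, f"unexpected status {status}"
    days_left = (cert_expires_at - now) / 86400
    if days_left < min_days_left:
        return False, f"certificate expires in {days_left:.0f} days"
    return True, "ok"


def probe(host, path="/", timeout=5.0):
    """Return (status code, certificate expiry in epoch seconds) for an HTTPS endpoint."""
    context = ssl.create_default_context()  # verifies the chain and hostname
    conn = http.client.HTTPSConnection(host, timeout=timeout, context=context)
    try:
        conn.connect()
        # Read the certificate right after the handshake, before the
        # connection can be closed by the response handling.
        cert = conn.sock.getpeercert()
        conn.request("GET", path)
        response = conn.getresponse()  # http.client does not follow redirects
        return response.status, ssl.cert_time_to_seconds(cert["notAfter"])
    finally:
        conn.close()
```

Because the client does not follow redirects, an endpoint that answers with 301 is seen as 301. Calling evaluate with expected codes of {200} would flag it as unhealthy, while {200, 301} would pass it, which is the situation the advice above is meant to prevent.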

SLOs

Define Service Level Objectives bound to real metrics, with burn-rate tracking so you know when you're at risk of breaching your error budget.
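
For readers new to burn rates, here is a small Python sketch of the commonly used definition: the observed error rate divided by the error rate the SLO allows. The function names and the event-count inputs are assumptions for illustration; KeelPilot's exact calculation is not documented here.

```python
def _check_target(slo_target):
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be between 0 and 1, exclusive")


def burn_rate(bad_events, total_events, slo_target):
    """Observed error rate divided by the error rate the SLO allows."""
    _check_target(slo_target)
    if total_events == 0:
        return 0.0
    allowed_error_rate = 1.0 - slo_target
    return (bad_events / total_events) / allowed_error_rate


def budget_remaining(bad_events, total_events, slo_target):
    """Fraction of the error budget left over the measured window."""
    _check_target(slo_target)
    if total_events == 0:
        return 1.0
    allowed_bad = (1.0 - slo_target) * total_events
    return 1.0 - bad_events / allowed_bad
```

With a 99.9% target, 50 failures out of 10,000 requests is an error rate of 0.5% against an allowance of 0.1%, so the burn rate is about 5. A burn rate of 1 spends the budget exactly over the SLO window; sustained values above 1 mean the budget runs out early.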