Changelog

What's new in KeelPilot

New features, improvements, and fixes. Follow along as we ship.

v2.2.0June 15, 2026

Zero-config observability — automatic prerequisite detection with one-click Helm installs, plus multi-backend distributed tracing.

New

New Features

  • Prerequisite Detection & Auto-Install — a blocking banner detects missing cluster components and installs them with one click via Helm, polling every 30s and auto-dismissing on completion.
  • Multi-Backend Tracing — choose Grafana Tempo, Jaeger, or AWS X-Ray per cluster, with selection persisted and responses normalized to common schemas (TraceSummary, TraceDetail, ServiceMap).
  • New Helm presets — preconfigured installs for Tempo, Jaeger, Metrics Server, and the OpenTelemetry Collector.
  • usePrerequisites React hook — polls component status every 30s and triggers Helm installs directly from the UI.
  • Cluster Prerequisites documentation — a complete install guide for Prometheus, Loki, Tempo, Jaeger, X-Ray, Kubecost, OpenTelemetry Collector, and Metrics Server.
Improved

Architecture & Hardening

  • New Route Registry extracts 80+ routes, reducing the app bootstrap from 605 to ~320 lines.
  • Improved graceful shutdown with connection tracking, ordered worker cleanup, and a 10s force timeout.
  • New type-safe input validation framework with schemas for login, register, invites, access profiles, and runbooks (22 tests).
  • TypeScript errors reduced from 71 to 1 (remaining is a library issue), with 33+ missing API methods restored.
Improved

Testing

  • Test suite grew to 670 tests across 73 files (up from 508 in 63).
  • 24 tests for trace normalizers (X-Ray, Tempo, Jaeger) and 22 for the validation framework.
  • 116 HTTP integration tests.
Infra

Infrastructure

  • Migration v39 for per-cluster trace backend selection.
v2.1.1June 14, 2026

Quality & hardening release — 646 automated tests and a security-focused architecture overhaul.

Improved

Testing & Quality

  • Code review score raised to 8.5/10 (up from 6/10 in May 2026).
  • 116 new integration tests covering auth protection, CSRF enforcement, input validation, health checks, rate limiting, and API versioning.
  • Total test suite now at 646 tests across 72 files (up from 508 in 63 files).
  • Coverage added for Change Intelligence, Cost Guardrails, Runbooks, Infra Map, Helm, Security Posture, Access Profiles, Alerting, Incidents, and Notifications.
Improved

Architecture & Hardening

  • New Route Registry centralizes 80+ route mountings, reducing the app bootstrap from 605 to ~320 lines.
  • Improved graceful shutdown with connection tracking, ordered worker cleanup, and a 10s force timeout.
  • New type-safe input validation framework with pre-built schemas for login, register, invites, access profiles, and runbooks.
  • TypeScript errors reduced from 71 to 48, with 8 new Change Intelligence API methods added.
v2.1.0June 10, 2026

Major feature release — Infrastructure Map, Cost Guardrails, executable Runbooks, and more.

New

New Features

  • Infrastructure & Dependency Map — interactive SVG graph with force-directed layout, 16 AWS services, 9 edge types, and blast-radius analysis.
  • Proactive Cost Guardrails — 9 cost-deviation detectors with savings estimates, configurable thresholds, and preventive alerts.
  • Executable Runbooks / Auto-Remediation — 6 triggers, 8 actions, safety guardrails, and approval gates.
  • Quick Terminal — direct pod terminal access with cascading dropdowns (cluster → namespace → pod → container).
  • Metric Explorer (PromQL) — direct Prometheus queries with metric autocomplete and namespace/pod filters.
Improved

Change Intelligence Improvements

  • Adaptive lookback window by signal type (7d anomalies, 4h alerts, 2h incidents).
  • AWS service matching via ARN parsing and a static service dictionary.
  • Operational noise suppression for SSM/Teleport heartbeats and autoscaling alarms.
  • Relevance damping (proximity × 0.3) for candidates without a service relationship.
Fixed

Fixes

  • JSON pretty-print in the K8s Explorer and recursive Terraform state listing.
  • Helm install made async to avoid ALB timeouts, with auto-uninstall of failed releases before reinstalling.
  • Prometheus service path auto-discovery and Metric Explorer loading clusters on mount.
  • SLO form now binds to real metrics; professional sidebar reorganization.
Security

Security

  • Fixed a cross-tenant leak in the metrics dashboard status endpoint.
  • Closed a privilege escalation path in team members.
  • Added requireMutate() to missing routes and ElastiCache write assertions.
Infra

Infrastructure

  • Backfilled aws_service in change events (migration v33).
  • Applied migrations v31–v37.