RetroDash // OVERVIEW

GUIDE: Overview dashboard with a summary of the 4 Golden Signals

Theme

Requirements

For this dashboard to work correctly you need:

Compatibility Notes

Grafana 12+ Import

These dashboards are tested with Grafana 12.3 and kube-prometheus-stack. To import: go to Dashboards → Import, paste the JSON, then select your Prometheus datasource from the dropdown.

Panel Layout

┌────────────────────────────────────────┐
│    OVERVIEW — 4 GOLDEN SIGNALS         │
├────────────────────────────────────────┤
│  Health  │ Avg      │  Error  │ CPU    │
│  Score   │ Latency  │  Rate   │ Satur. │
├──────────┼──────────┼─────────┼────────┤
│  Signals Summary (4 KPIs) [6 cols]     │
├────────────────────────────────────────┤
│  Node Status Grid    │  Pod Status     │
│  (table)             │  (table)        │
├────────────────────────────────────────┤
│  Top Resource Consumers [12 cols]      │
└────────────────────────────────────────┘

Panel Customization

Health Score

Shows the percentage of "up" targets in Prometheus.

100 * count(up == 1) / count(up)

To change: If you use a custom exporter, replace up with your metric:

100 * count(my_service_healthy == 1) / count(my_service_healthy)

Thresholds: Edit them in the panel's Thresholds section (defaults: yellow at 90, red at 75)
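As a quick check of the arithmetic behind the Health Score expression, here is a minimal Python sketch. The function name health_score and the sample values are illustrative assumptions, not part of the dashboard.

```python
def health_score(up_values):
    """Percentage of targets whose `up` sample equals 1.

    Same arithmetic as 100 * count(up == 1) / count(up).
    Returns None for an empty input to avoid dividing by zero.
    """
    if not up_values:
        return None
    healthy = sum(1 for value in up_values if value == 1)
    return 100 * healthy / len(up_values)


# Hypothetical scrape results: 9 targets up, 1 target down.
samples = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
print(health_score(samples))  # 90.0
```

With the default thresholds, a score of 90 sits right at the yellow boundary.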

Avg Latency

Average request latency in milliseconds.

avg(rate(http_request_duration_seconds_sum[5m]))
/ avg(rate(http_request_duration_seconds_count[5m])) * 1000

Change metric: If you don't use Prometheus HTTP conventions, use:

avg(my_request_latency_ms)
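The latency query divides the rate of the _sum counter by the rate of the _count counter. Both rates cover the same window, so the window length cancels out and the result is total time spent divided by number of requests. The sketch below reproduces that with raw counter values; the function name and the numbers are hypothetical.

```python
def avg_latency_ms(sum_start, sum_end, count_start, count_end):
    """Mean request latency in milliseconds over one window.

    sum_*   : http_request_duration_seconds_sum at window start/end (seconds)
    count_* : http_request_duration_seconds_count at window start/end
    """
    requests = count_end - count_start
    if requests <= 0:
        return None  # no requests in the window, so no meaningful average
    return (sum_end - sum_start) * 1000 / requests


# Hypothetical counters: 200 requests took 30 s in total during the window.
print(avg_latency_ms(120.0, 150.0, 1000, 1200))  # 150.0
```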

Error Rate

Percentage of requests with a 5xx status code.

100 * sum(rate(http_requests_total{status=~"5.."}[5m]))
/ sum(rate(http_requests_total[5m]))

Customization: To include 4xx errors:

100 * sum(rate(http_requests_total{status=~"[45].."}[5m]))
/ sum(rate(http_requests_total[5m]))
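To see which status codes each matcher selects, the following Python sketch applies the two patterns to a hand-picked list of codes. PromQL regex matchers are anchored at both ends, which re.fullmatch approximates here. The helper name matching_codes and the code list are assumptions for illustration.

```python
import re


def matching_codes(pattern, codes):
    """Return the codes that a fully anchored regex matches."""
    compiled = re.compile(pattern)
    return [code for code in codes if compiled.fullmatch(code)]


codes = ["200", "301", "404", "429", "500", "503"]
print(matching_codes("5..", codes))     # ['500', '503']
print(matching_codes("[45]..", codes))  # ['404', '429', '500', '503']
```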

CPU Saturation

CPU usage as a percentage.

100 - (avg(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

Filter by specific instance:

100 - (avg(rate(node_cpu_seconds_total{mode="idle",instance="server1:9100"}[5m])) * 100)

Golden Signals Summary

Table with 4 main KPIs. To customize:

Node Status Grid

Dynamic table with node status.

max by (instance) (up{job="node"})

Change columns: In panel → Columns, add:

Pod Status

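Table of pods with their current phase.

max by (pod, namespace, phase) (kube_pod_status_phase == 1)

kube_pod_status_phase exposes one series per pod and phase, with value 1 only for the pod's current phase; the == 1 filter and the phase label keep that phase visible in the table.

Filter by namespace: In the dashboard variables, create a query variable named namespace:

label_values(kube_pod_info, namespace)

Then use it in the query:

max by (pod, namespace, phase) (kube_pod_status_phase{namespace="$namespace"} == 1)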

Top Resource Consumers

Ranking of pods by CPU/memory.

topk(10, sum by (pod, namespace)
(rate(container_cpu_usage_seconds_total[5m])) * 100)

Change top N: Replace topk(10, ...) with topk(20, ...) to show the top 20.

Change Color Theme

This dashboard includes 3 predefined themes. Use the buttons above or edit the JSON:

Theme   Primary   Glow                   Background
GREEN   #33FF00   rgba(51,255,0,0.5)     #0A1A0A
AMBER   #FFB000   rgba(255,176,0,0.5)    #0D1117
BLUE    #00BFFF   rgba(0,191,255,0.5)    #0A0A1A
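If you prefer to edit the JSON rather than use the buttons, one possible approach is a plain text substitution of the color values. This is a sketch that assumes the colors appear in the exported JSON exactly as written in the table above (same case, no extra spaces). The function name switch_theme and the sample fragment are hypothetical.

```python
# Theme values copied from the table above.
THEMES = {
    "GREEN": {"primary": "#33FF00", "glow": "rgba(51,255,0,0.5)", "background": "#0A1A0A"},
    "AMBER": {"primary": "#FFB000", "glow": "rgba(255,176,0,0.5)", "background": "#0D1117"},
    "BLUE": {"primary": "#00BFFF", "glow": "rgba(0,191,255,0.5)", "background": "#0A0A1A"},
}


def switch_theme(dashboard_json, old, new):
    """Replace every color of theme `old` with the matching color of `new`.

    Works on the raw JSON text; strings that do not match a theme value
    exactly are left untouched.
    """
    for role, old_value in THEMES[old].items():
        dashboard_json = dashboard_json.replace(old_value, THEMES[new][role])
    return dashboard_json


# Hypothetical fragment standing in for the exported dashboard JSON.
snippet = '{"color": "#33FF00", "shadow": "rgba(51,255,0,0.5)"}'
print(switch_theme(snippet, "GREEN", "AMBER"))
# {"color": "#FFB000", "shadow": "rgba(255,176,0,0.5)"}
```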

Adapt to Your Tablet Resolution

Device                Resolution       GridPos Height
iPad Pro 12.9         2048x2732        12-14
iPad Air 10.9         1640x2360        10-12
Samsung Tab S9        1752x2800        11-13
Google Pixel Tablet   1600x2560        10-12
Desktop               1920x1080 Full   8-10

To adjust: In Grafana, resize each panel in dashboard edit mode, or open the panel menu → Inspect → Panel JSON and modify gridPos.h
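Changing gridPos.h panel by panel gets tedious, so the height can also be patched in the dashboard JSON before importing. The sketch below assumes the usual Grafana layout in which each panel carries a gridPos object with h, w, x and y, and collapsed rows hold their children in a nested panels list. The function name set_panel_height and the sample dashboard are hypothetical.

```python
import json


def set_panel_height(dashboard, height):
    """Set gridPos.h on every non-row panel and return how many were changed."""
    changed = 0
    for panel in dashboard.get("panels", []):
        if "gridPos" in panel and panel.get("type") != "row":
            panel["gridPos"]["h"] = height
            changed += 1
        # Collapsed rows keep their children in a nested "panels" list.
        changed += set_panel_height(panel, height)
    return changed


# Hypothetical dashboard fragment with one row and two panels.
dashboard = json.loads("""
{"panels": [
  {"type": "row", "gridPos": {"h": 1, "w": 24, "x": 0, "y": 0}},
  {"type": "stat", "gridPos": {"h": 8, "w": 6, "x": 0, "y": 1}},
  {"type": "table", "gridPos": {"h": 8, "w": 12, "x": 0, "y": 9}}
]}
""")
print(set_panel_height(dashboard, 12))  # 2
```

The y offsets are left as they are, so check the resulting layout after import.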

Import in Grafana

  1. In Grafana, go to Dashboards → Import
  2. Copy the dashboard JSON (available in the repo or via export)
  3. Paste it into the "Import via dashboard JSON model" field
  4. Select Prometheus datasource
  5. Click Import
  6. Verify that all metrics load correctly
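For scripted setups, the same import can be approximated with Grafana's HTTP API instead of the UI steps above. The sketch below only builds a POST request for /api/dashboards/db and does not send it. The URL, token and dashboard title are placeholders. Unlike the UI import, this endpoint does not prompt for a datasource, so it assumes the JSON already references a datasource that exists in your Grafana.

```python
import json
import urllib.request


def build_import_request(grafana_url, token, dashboard, overwrite=False):
    """Build (but do not send) a POST request for /api/dashboards/db."""
    # The id must be null when creating a new dashboard.
    dashboard = dict(dashboard, id=None)
    body = json.dumps({"dashboard": dashboard, "overwrite": overwrite}).encode("utf-8")
    return urllib.request.Request(
        grafana_url.rstrip("/") + "/api/dashboards/db",
        data=body,
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical URL, token and dashboard; nothing is sent here.
request = build_import_request(
    "http://localhost:3000", "YOUR_TOKEN", {"title": "RetroDash Overview"}
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request)
```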