RetroDash // OVERVIEW — Customization Guide

Requirements

For this dashboard to work correctly you need:

Prometheus datasource configured in Grafana
node_exporter for node metrics (CPU, memory, disk)
kube-state-metrics for pod status and K8s resources
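up metric for the health check (Prometheus generates it automatically for every scrape target, alongside scrape_duration_seconds)
Application HTTP metrics (http_requests_total, http_request_duration_seconds) for the latency and error rate panels
cAdvisor metrics from the kubelet (container_cpu_usage_seconds_total) for Top Resource Consumers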

Compatibility Notes

Grafana 12+ Import

These dashboards are tested with Grafana 12.3 and kube-prometheus-stack. To import: go to Dashboards → Import, paste the JSON, then select your Prometheus datasource from the dropdown.

job="node-exporter" is the default job label set by kube-prometheus-stack. Your setup may use job="node" or another value — adjust accordingly.
The $job_node variable (in the dashboard JSON) lets you switch between node-exporter and node without editing individual panel queries.
If the Node Status Grid shows no data, check the $job_node variable value in Dashboard Settings → Variables.
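
To find out which job label your node_exporter targets actually carry, you can run a query like the one below in Explore. This is a minimal sketch that assumes node_exporter exposes node_uname_info, which it does in a default Linux setup; use the job value it returns for $job_node.

```promql
# One result per job label under which node_exporter targets are scraped.
# The value of each result is the number of nodes reporting under that job.
count by (job) (node_uname_info)
```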

Panel Layout

┌────────────────────────────────────────┐
│    OVERVIEW — 4 GOLDEN SIGNALS         │
├──────────┬──────────┬─────────┬────────┤
│  Health  │ Avg      │  Error  │ CPU    │
│  Score   │ Latency  │  Rate   │ Satur. │
├──────────┴──────────┴─────────┴────────┤
│  Signals Summary (4 KPIs) [6 cols]     │
├──────────────────────┬─────────────────┤
│  Node Status Grid    │  Pod Status     │
│  (table)             │  (table)        │
├──────────────────────┴─────────────────┤
│  Top Resource Consumers [12 cols]      │
└────────────────────────────────────────┘

Panel Customization

Health Score

Shows the percentage of "up" targets in Prometheus.

Default query:

100 * count(up == 1) / count(up)

To change: If you use a custom exporter, replace up with your metric:

100 * count(my_service_healthy == 1) / count(my_service_healthy)

Thresholds: edit them in the panel editor under Thresholds (default: yellow below 90, red below 75)
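
One caveat with the default query: when every target is down, count(up == 1) matches no series, so the panel shows "No data" instead of 0. A possible alternative, sketched below, averages the 0/1 values of up directly. The job value in the second query is only an example; use one query per panel.

```promql
# Percentage of healthy targets; evaluates to 0 when every target is down.
100 * avg(up)

# Same idea, restricted to a single job (example job value, adjust to your setup).
100 * avg(up{job="node-exporter"})
```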

Avg Latency

Average request latency in milliseconds.

avg(rate(http_request_duration_seconds_sum[5m]))
/ avg(rate(http_request_duration_seconds_count[5m])) * 1000

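Change metric: if your application exposes latency under a different name, substitute it. For example, for a gauge that already reports milliseconds (my_request_latency_ms is a placeholder):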

avg(my_request_latency_ms)
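
An average can hide slow outliers. If http_request_duration_seconds is exported as a Prometheus histogram (an assumption: the _bucket series must exist), a percentile can be computed instead. A minimal sketch for the 95th percentile:

```promql
# 95th-percentile request latency in milliseconds over the last 5 minutes.
# Buckets are summed across all series, keeping only the "le" bucket label.
histogram_quantile(
  0.95,
  sum by (le) (rate(http_request_duration_seconds_bucket[5m]))
) * 1000
```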

Error Rate

Percentage of requests with 5xx status code.

100 * sum(rate(http_requests_total{status=~"5.."}[5m]))
/ sum(rate(http_requests_total[5m]))

Customization: To include 4xx errors:

100 * sum(rate(http_requests_total{status=~"[45].."}[5m]))
/ sum(rate(http_requests_total[5m]))
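
If no 5xx series exists yet (for example, right after a deployment), the numerator matches nothing and the panel shows "No data" rather than 0. One common workaround, shown here as a sketch, is to fall back to a constant zero:

```promql
# 5xx percentage that evaluates to 0 when no 5xx series exists.
# "or vector(0)" supplies a zero only if the left-hand side is empty.
100 * (sum(rate(http_requests_total{status=~"5.."}[5m])) or vector(0))
  / sum(rate(http_requests_total[5m]))
```

Some client libraries name the label code instead of status; adjust the matcher if that applies to your metrics.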

CPU Saturation

CPU usage as a percentage.

100 - (avg(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

Filter by specific instance:

100 - (avg(rate(node_cpu_seconds_total{mode="idle",instance="server1:9100"}[5m])) * 100)
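
To see every node separately instead of one cluster-wide average, the same expression can be grouped by instance. This sketch also filters on the $job_node dashboard variable described in the compatibility notes:

```promql
# CPU usage percentage per node (one series per instance).
100 - (
  avg by (instance) (
    rate(node_cpu_seconds_total{mode="idle", job=~"$job_node"}[5m])
  ) * 100
)
```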

Golden Signals Summary

Table with 4 main KPIs. To customize:

Latency: Change http_request_duration_seconds to your metric
Traffic: Change the rate window ([5m] → [1m] for more sensitivity); keep it several scrape intervals long so rate() always has enough samples
Errors: Adjust status codes based on your app
Saturation: Change metric if monitoring something other than CPU
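
The traffic query itself is not listed in this guide. Assuming it is built on http_requests_total like the error rate panel, shortening the window might look like the sketch below. $__rate_interval is a built-in Grafana variable that derives a window from the scrape interval.

```promql
# Requests per second over a 1-minute window (more sensitive, but noisier).
sum(rate(http_requests_total[1m]))

# Alternative: let Grafana choose a window based on the scrape interval.
sum(rate(http_requests_total[$__rate_interval]))
```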

Node Status Grid

Dynamic table with node status.

max by (instance) (up{job=~"$job_node"})

Change columns: in the panel editor, add one query per column, for example:

node_cpu_seconds_total → CPU time (a counter; wrap it in rate() to get usage)
node_memory_MemAvailable_bytes → Available memory
node_filesystem_avail_bytes → Disk space
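
The raw metrics above are easier to read once they are aggregated to one value per instance. A possible set of column queries is sketched below; the mountpoint filter and the GiB conversion are assumptions, so adapt them to your nodes.

```promql
# CPU usage percentage per node.
100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

# Available memory per node, in GiB.
max by (instance) (node_memory_MemAvailable_bytes) / (1024 ^ 3)

# Free space on the root filesystem per node, in GiB.
max by (instance) (node_filesystem_avail_bytes{mountpoint="/"}) / (1024 ^ 3)
```

Enter each expression as its own query in the panel so that every one becomes a separate column keyed by instance.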

Pod Status

Table of pods with status.

max by (pod, namespace) (kube_pod_status_phase)

Filter by namespace: In Dashboard Settings → Variables, create a query variable named namespace with this query:

label_values(kube_pod_info, namespace)

Then use in the query:

max by (pod, namespace) (kube_pod_status_phase{namespace="$namespace"})
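
kube-state-metrics exposes kube_pod_status_phase as one series per pod and phase, with the value 1 on the pod's current phase. Grouping only by pod and namespace therefore drops the phase name. A variant that keeps it, sketched here, also uses a regex matcher so a multi-value $namespace variable works:

```promql
# One row per pod, with its current phase (Pending, Running, Succeeded, ...).
# "== 1" keeps only the phase the pod is currently in.
max by (pod, namespace, phase) (
  kube_pod_status_phase{namespace=~"$namespace"}
) == 1
```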

Top Resource Consumers

Ranking of pods by CPU/memory.

topk(10, sum by (pod, namespace)
(rate(container_cpu_usage_seconds_total[5m])) * 100)

Change top N: Replace topk(10, ...) with topk(20, ...) to show the top 20
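
To rank by memory instead of CPU, a query along these lines may work. It is a sketch that assumes the cAdvisor metric container_memory_working_set_bytes is being scraped; the container!="" matcher skips the pod-level aggregate series so usage is not counted twice.

```promql
# Top 10 pods by working-set memory, in bytes.
topk(10,
  sum by (pod, namespace) (
    container_memory_working_set_bytes{container!=""}
  )
)
```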

Change Color Theme

This dashboard includes 3 predefined themes. Use the buttons above or edit JSON:

Theme	Primary	Glow	Background
GREEN	#33FF00	rgba(51,255,0,0.5)	#0A1A0A
AMBER	#FFB000	rgba(255,176,0,0.5)	#0D1117
BLUE	#00BFFF	rgba(0,191,255,0.5)	#0A0A1A

Adapt to Your Tablet Resolution

Device	Resolution	gridPos.h (grid units)
iPad Pro 12.9	2048x2732	12-14
iPad Air 10.9	1640x2360	10-12
Samsung Tab S9+	1752x2800	11-13
Google Pixel Tablet	1600x2560	10-12
Desktop (Full HD)	1920x1080	8-10

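To adjust: In Grafana, drag a panel's bottom edge while the dashboard is in edit mode, or open the panel menu → Inspect → Panel JSON and change gridPos.h. Each unit of gridPos.h is 30 pixels tall.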

Import in Grafana

In Grafana, go to Dashboards → Import
Copy the dashboard JSON (from the repo or from an export)
Paste it into the "Import via dashboard JSON model" box and click Load
Select your Prometheus datasource
Click Import
Verify that all metrics load correctly
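
One way to verify the metrics is to run absent() checks in Explore, one expression at a time. The list below is a sketch covering the metrics used by the queries in this guide; extend it with your own metric names if you swapped any of them.

```promql
# Each expression returns 1 only if the metric is missing.
# An empty result means the metric exists.
absent(up)
absent(node_cpu_seconds_total)
absent(kube_pod_status_phase)
absent(container_cpu_usage_seconds_total)
absent(http_requests_total)
absent(http_request_duration_seconds_count)
```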