# Latency Details: Percentiles, Distribution and SLOs
For latency metrics you need Prometheus histograms:

- `http_request_duration_seconds` (standard OpenMetrics), or a custom metric such as `api_latency_milliseconds`
- Labels: `method`, `endpoint`, `status`

The `$rate_interval` variable controls the time window used in `rate()` and `histogram_quantile()` queries. The default is `5m`. If your data is sparse, raise it to `15m` or `30m` to get stable results instead of spiky gaps; to change it, open the `rate_interval` variable and edit the default value.

"No data" in latency panels is expected behavior when there is no active HTTP traffic hitting your services; it is not a configuration error. The panels will populate as soon as requests start flowing.
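As background, here is a minimal Python sketch (hypothetical bucket bounds and durations) of how a client library fills the `_bucket`, `_sum`, and `_count` series that all of the queries below rely on:

```python
import math

# Hypothetical latency buckets in seconds (upper bounds, as in
# http_request_duration_seconds_bucket{le="..."}).
BUCKETS = [0.01, 0.05, 0.1, 0.5, 1.0, math.inf]

def observe(hist, seconds):
    """Record one request duration the way a Prometheus histogram does."""
    hist["count"] += 1
    hist["sum"] += seconds
    for le in BUCKETS:
        if seconds <= le:
            hist["bucket"][le] += 1  # buckets are cumulative: every le >= value counts

hist = {"count": 0, "sum": 0.0, "bucket": {le: 0 for le in BUCKETS}}
for d in [0.004, 0.02, 0.03, 0.2, 0.7, 2.5]:
    observe(hist, d)

print(hist["bucket"][0.05])      # 3  (0.004, 0.02, 0.03)
print(hist["bucket"][math.inf])  # 6  (the +Inf bucket always equals _count)
```

Because the buckets are cumulative, the `+Inf` bucket always matches `_count`; this is why `histogram_quantile()` can work from the `_bucket` series alone.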
```
┌─────────────────────────────────────────────┐
│        LATENCY — PERCENTILES & SLOs         │
├─────────────────────────────────────────────┤
│     P50      │     P95      │     P99       │
│    (stat)    │    (stat)    │    (stat)     │
├─────────────────────────────────────────────┤
│  Distribution Time Series        [12 cols]  │
│  (lines: P50, P95, P99)                     │
├─────────────────────────────────────────────┤
│  Heatmap        [6 cols] │ Table by Endpoint│
│  (buckets × time)        │ (top handlers)   │
├─────────────────────────────────────────────┤
│  SLO Alert — P99 < 500ms         [12 cols]  │
└─────────────────────────────────────────────┘
```
Three stat panels showing percentiles in milliseconds.
Query P50:

```
histogram_quantile(0.50, rate(http_request_duration_seconds_bucket[5m])) * 1000
```

Query P95:

```
histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m])) * 1000
```

Query P99:

```
histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[5m])) * 1000
```
For a custom metric (e.g., one already in ms):

```
histogram_quantile(0.95, rate(my_api_latency_ms_bucket[5m]))
```

Change thresholds: Panel → Thresholds → adjust the limits (e.g., 100 yellow, 200 red)
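For intuition about what the stat queries return, here is a small Python sketch (assumed bucket data) of the linear interpolation that `histogram_quantile()` performs over cumulative buckets:

```python
import math

def histogram_quantile(q, buckets):
    """Estimate quantile q from cumulative buckets [(le, count), ...],
    interpolating linearly within a bucket, like PromQL's histogram_quantile."""
    buckets = sorted(buckets)
    total = buckets[-1][1]  # the +Inf bucket holds the total count
    rank = q * total
    prev_le, prev_count = 0.0, 0
    for le, count in buckets:
        if count >= rank:
            if math.isinf(le):
                return prev_le  # falls in +Inf bucket: return last finite bound
            # Linear interpolation between the bucket's lower and upper bounds.
            return prev_le + (le - prev_le) * (rank - prev_count) / (count - prev_count)
        prev_le, prev_count = le, count
    return prev_le

# Hypothetical cumulative counts over one rate window.
b = [(0.1, 50), (0.5, 90), (1.0, 100), (math.inf, 100)]
print(histogram_quantile(0.95, b) * 1000)  # 750.0 (ms)
```

This is also why the estimate can only be as precise as the bucket layout: a P95 that lands between `le="0.5"` and `le="1"` is interpolated, not measured.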
Line chart showing evolution of P50, P95, P99 over time.
Add percentiles: In Legend/Aliases, customize labels:
| Percentile | Query | Label |
|---|---|---|
| P25 | histogram_quantile(0.25, ...) | P25 |
| P50 | histogram_quantile(0.50, ...) | P50 (Median) |
| P90 | histogram_quantile(0.90, ...) | P90 |
| P95 | histogram_quantile(0.95, ...) | P95 |
| P99 | histogram_quantile(0.99, ...) | P99 |
Change rate window: Edit [5m] → [1m] for more sensitivity or [15m] to smooth
Visualization of latency distribution by buckets over time.
Default query (all buckets):

```
rate(http_request_duration_seconds_bucket[5m])
```
To show only specific buckets, filter by `le` values:

```
rate(http_request_duration_seconds_bucket{le=~"0.01|0.05|0.1|0.5|1"}[5m])
```
Adjust resolution: Panel → Heatmap options → Cell size, change Bucket to discrete vs. continuous
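A note on what each heatmap cell shows: `_bucket` series are cumulative, so the cell needs the per-bucket difference (Grafana derives this when the Prometheus query's format is set to Heatmap). A minimal sketch of that de-accumulation, with made-up counts:

```python
import math

def decumulate(buckets):
    """Convert cumulative le-buckets [(le, count), ...] into per-bucket
    counts, which is what a heatmap cell actually displays."""
    out, prev = [], 0
    for le, count in sorted(buckets):
        out.append((le, count - prev))
        prev = count
    return out

cumulative = [(0.01, 12), (0.05, 30), (0.1, 41), (0.5, 50), (1.0, 52), (math.inf, 52)]
print(decumulate(cumulative))
# [(0.01, 12), (0.05, 18), (0.1, 11), (0.5, 9), (1.0, 2), (inf, 0)]
```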
Dynamic table with latency by endpoint/handler.
Base query (dividing two separate `topk` results can return mismatched series, so rank the ratio instead):

```
topk(10,
  sum by (endpoint) (rate(http_request_duration_seconds_sum[5m]))
    / sum by (endpoint) (rate(http_request_duration_seconds_count[5m]))
) * 1000
```
Change top N endpoints: Replace topk(10 with topk(20 or topk(5
Filter by method:
```
topk(10,
  sum by (endpoint, method) (rate(http_request_duration_seconds_sum{method="GET"}[5m]))
    / sum by (endpoint, method) (rate(http_request_duration_seconds_count{method="GET"}[5m]))
) * 1000
```
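The table query is just rate(`_sum`) / rate(`_count`) per endpoint, converted to ms and ranked. A Python sketch with hypothetical per-endpoint samples:

```python
# Hypothetical per-endpoint rate() samples, as the two halves of the
# table query would return them.
rate_sum = {"/api/users": 4.2, "/api/orders": 1.5, "/health": 0.05}      # seconds/sec
rate_count = {"/api/users": 60.0, "/api/orders": 10.0, "/health": 50.0}  # req/sec

def top_avg_latency_ms(n):
    """Average latency per endpoint in ms (sum/count * 1000), top n slowest."""
    avg = {ep: rate_sum[ep] / rate_count[ep] * 1000 for ep in rate_sum}
    return sorted(avg.items(), key=lambda kv: kv[1], reverse=True)[:n]

print(top_avg_latency_ms(2))
```

Note this yields the *mean* latency per endpoint, not a percentile; it is cheap and good for ranking, but a slow tail can hide behind a low mean.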
Stat or gauge that alerts if P99 exceeds the SLO.
Default SLO (500ms):

```
histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[5m])) * 1000 < 500
```

Change SLO to 300ms:

```
histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[5m])) * 1000 < 300
```

For a per-endpoint alert (the `le` label must survive the aggregation, or `histogram_quantile()` returns nothing):

```
histogram_quantile(0.99, sum by (endpoint, le) (rate(http_request_duration_seconds_bucket[5m]))) * 1000 < 250
```
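An alternative way to express the same SLO is as a ratio of good requests, reading the `le="0.5"` bucket directly (in PromQL, roughly `sum(rate(..._bucket{le="0.5"}[5m])) / sum(rate(..._count[5m])) >= 0.99`). A sketch with made-up counts:

```python
# Hypothetical counts over the alert window.
under_500ms = 9_920   # e.g. increase(..._bucket{le="0.5"}[5m])
total = 10_000        # e.g. increase(..._count[5m])

def slo_ok(good, total, objective=0.99):
    """True if at least `objective` of requests met the latency target."""
    return total == 0 or good / total >= objective

print(slo_ok(under_500ms, total))  # True (99.2% of requests under 500ms)
```

The ratio form is exact (no bucket interpolation) but only works if a bucket boundary happens to sit at your SLO threshold.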
Alert color: In Thresholds, configure:
| Theme | Primary | Secondary | OK Color |
|---|---|---|---|
| GREEN | #33FF00 | #22BB00 | #33FF00 |
| AMBER | #FFB000 | #CC8C00 | #FFB000 |
| BLUE | #00BFFF | #0099CC | #00BFFF |
Suggested panel sizing by device:

| Type | Width (cols) | Height (rows) |
|---|---|---|
| Mobile (< 768px) | 6 (stack vertical) | 8 |
| Tablet 10" | 12 (full width) | 10 |
| Tablet 12.9" | 12 (full width) | 12 |
| Desktop 1920x1080 | 24 (2 columns) | 8 |
Dashboard API endpoint: `$GRAFANA_URL/api/dashboards/db/latency`

```
histogram_quantile(0.95, rate(http_request_duration_seconds_bucket{status!~"5.."}[5m])) * 1000
```
This shows P95 only for successful requests (excluding 5xx).
```
histogram_quantile(0.95, sum by (instance, le) (rate(http_request_duration_seconds_bucket[5m]))) * 1000
```

Useful if you have multiple servers and want to detect per-instance anomalies. Note that `le` must be kept in the `sum by` clause for `histogram_quantile()` to work.
```
avg_over_time(rate(http_request_duration_seconds_bucket{le="1"}[5m])[10m:1m])
```

Shows the trend of the sub-1s request rate over the last 10 minutes at 1m resolution, using a PromQL subquery (`rate()` cannot wrap `increase()` directly, since both take a range vector and return an instant vector).
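The subquery evaluates the inner `rate()` once per minute across the 10-minute window; the `avg_over_time`-style smoothing of those samples can be sketched as (hypothetical samples):

```python
# One rate sample per minute over the last 10 minutes (hypothetical req/sec
# of requests completing under 1s), as the subquery would evaluate them.
samples = [40, 42, 41, 39, 20, 18, 17, 19, 21, 22]

def trend(values, window=3):
    """Simple moving average over the last `window` samples,
    mimicking avg_over_time applied to a subquery."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

smoothed = trend(samples)
print(round(smoothed[-1], 1))  # 20.7, the average of the last 3 samples
```

The step change around minute 5 stays visible but the minute-to-minute noise is damped, which is exactly what a longer outer window buys you.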