Prometheus scraping

Exposing operational metrics and counters through authenticated HTTP endpoints for Prometheus monitoring

Radiator exposes operational counters, timing aggregations and info gauges over authenticated HTTP management endpoints in the Prometheus 0.0.4 text exposition format. Metrics are held in memory, so reads are inexpensive and non‑blocking, and all values reset when Radiator restarts.

Access control & authentication

Scraping requires an HTTP management user with at least the monitor privilege. Define the user in the management.http.credentials block:

credentials {
    user "monitor" {
        password "monitorpassword";
        privilege monitor;
    }
}

Use HTTP Basic Auth when scraping:

curl -u monitor:monitorpassword --basic \
  http://localhost:8080/api/v1/metrics/prometheus
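
The same authenticated request can be made from a script. The following is a minimal sketch using only the Python standard library; the URL and credentials are the example values from above, and the helper names build_scrape_request and scrape are hypothetical, not part of Radiator.

```python
import base64
import urllib.request

# Example values copied from the curl command above; adjust for your deployment.
METRICS_URL = "http://localhost:8080/api/v1/metrics/prometheus"


def build_scrape_request(url, username, password):
    """Build a GET request that carries an HTTP Basic Auth header."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def scrape(url, username, password, timeout=5.0):
    """Fetch the metrics text. urlopen raises urllib.error.HTTPError on a 401."""
    request = build_scrape_request(url, username, password)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling scrape(METRICS_URL, "monitor", "monitorpassword") against a running instance should return the exposition text as a single string.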

Endpoints

The scrape endpoint currently provides only the Prometheus 0.0.4 text exposition format. The same counters are also available through the API endpoints listed in the table below, but those responses are JSON.

| Purpose | Path | Format |
|---|---|---|
| Primary Prometheus scrape | /api/v1/metrics/prometheus | Prometheus 0.0.4 text |
| Targeted counter time series | /api/v1/statistics/counters/time-series/<group>/<policy>/<name> | JSON time series (internal debug / drill‑down) |

Example Prometheus server configuration (prometheus.yml snippet):

scrape_configs:
  - job_name: 'radiator-server'
    metrics_path: '/api/v1/metrics/prometheus'
    basic_auth:
      username: 'monitor'
      password: 'monitorpassword'
    static_configs:
      - targets: ['127.0.0.1:8080']

Example scrape output (excerpt)

From an integration test run after 999 PAP authentications against an internal users file:

# TYPE radiator_counter_total counter
radiator_counter_total{group="policy",policy="BASIC_AUTH",name="Sessions"} 999
radiator_counter_total{group="handler",policy="BASIC_AUTH",handler="PAP",name="Sessions"} 999
radiator_counter_total{group="backend",policy="USERS_INTERNAL_FILE",name="Requests"} 999
# ... additional counters elided ...
# TYPE radiator_counter_microseconds_total counter
radiator_counter_microseconds_total{group="handler",policy="BASIC_AUTH",handler="PAP",name="TimeSpent"} 32705
# TYPE radiator_logging_info gauge
radiator_logging_info{log_level="DEBUG"} 1
# TYPE radiator_build_info gauge
radiator_build_info{app="radiator",version="10.30.0",build_kind="dirty",timestamp="2025-09-20T10:54:37Z",cpu_target="aarch64-apple-darwin"} 1
# TYPE radiator_ha_instance_info gauge
radiator_ha_instance_info{instance_id="R00"} 1

Key points:

  • radiator_counter_total: multi‑dimensional counters grouped by logical group (server / policy / handler / backend) with associated labels (policy, handler).
  • radiator_counter_microseconds_total: cumulative time spent; pair with a rate / delta function to derive latency per handler/policy.
  • radiator_logging_info: current effective log level (info gauge with value 1).
  • radiator_build_info: build metadata (no HA labels currently).
  • radiator_ha_instance_info: presence gauge with instance_id and optional cluster_id.
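
For ad hoc inspection outside Prometheus, the exposition text can be split into metric name, labels and value. The sketch below is a deliberately simplified parser, not a full implementation of the 0.0.4 format: it skips comment lines, ignores optional timestamps and does not unescape label values. The name parse_samples is hypothetical.

```python
import re

# Matches lines such as: metric_name{label="value",...} 123
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
# Matches one label="value" pair; escaped characters are kept as written.
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Return a list of (metric name, labels dict, float value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# TYPE" / "# HELP" / other comment lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples
```

Filtering the result on name == "radiator_counter_total" and a labels entry such as group == "handler" gives the per-handler counters described above.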

High availability labels

See the high availability identifiers article for semantics of instance / cluster IDs. Only the HA presence gauge includes those today.

Add recording rules later, once the metric set stabilises: for example, converting *_microseconds_total to seconds and deriving the latency of individual steps. This is particularly useful for identifying bottlenecks when Radiator depends on an external source for AAA.
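
The arithmetic behind such a latency derivation can be sketched as follows. This is an illustration only: it assumes that a TimeSpent value from radiator_counter_microseconds_total can be paired with the Sessions counter carrying the same labels, which the metric names suggest but this page does not confirm. The function name mean_latency_seconds is hypothetical.

```python
def mean_latency_seconds(time_before_us, time_after_us, count_before, count_after):
    """Mean time per operation, in seconds, between two scrapes.

    Returns None when no new operations were counted or when a counter
    went backwards (for example after a restart, since counters reset).
    """
    delta_time_us = time_after_us - time_before_us
    delta_count = count_after - count_before
    if delta_count <= 0 or delta_time_us < 0:
        return None
    # Convert microseconds to seconds, then divide by the number of operations.
    return (delta_time_us / 1_000_000) / delta_count
```

With the excerpt above taken as the second scrape and zeros as the first, mean_latency_seconds(0, 32705, 0, 999) works out to roughly 33 microseconds per PAP session.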

Usage guidelines

  • Scraping should not affect overall performance as long as scrapes occur less often than once per second.
  • Do not expose metrics endpoints without authentication; they reveal operational structure.

Troubleshooting

| Symptom | Likely cause | Resolution |
|---|---|---|
| 401 Unauthorized | Missing / wrong credentials | Create user with monitor privilege |
| Empty or truncated output | Network proxy / auth failure mid-stream | Retry with curl -v, inspect server logs |
| Counters not incrementing | No traffic hitting relevant handler/policy | Generate test load (e.g. integration script) |

Push gateway (optional)

For cases where Prometheus cannot scrape Radiator instances directly, prompush is verified to work against Radiator.