Prometheus scraping
Exposing operational metrics and counters through authenticated HTTP endpoints for Prometheus monitoring
Radiator exposes operational counters, timing aggregations, and info gauges over authenticated HTTP management endpoints in the Prometheus text exposition format (version 0.0.4). Metrics are held in memory, so reads are inexpensive and non-blocking, and all values reset when the server restarts.
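Because the counters restart from zero with the server, a client that computes increases itself has to treat a drop in value as a reset (PromQL functions such as rate() and increase() already do this). A minimal sketch of that logic follows; the helper name counter_increase is hypothetical and not part of Radiator:

```python
def counter_increase(samples):
    """Return the total increase over a series of counter samples.

    A sample lower than its predecessor is treated as a counter reset
    (for example a server restart), so the new value counts in full.
    """
    total = 0.0
    previous = None
    for value in samples:
        if previous is None:
            pass  # the first sample only establishes the baseline
        elif value >= previous:
            total += value - previous
        else:
            total += value  # reset: the counter restarted from zero
        previous = value
    return total


# Counter climbs to 1500, the server restarts, then the counter climbs to 40.
print(counter_increase([1000, 1200, 1500, 10, 40]))
```

Increments that happen between the last scrape before a restart and the restart itself are lost, so shorter scrape intervals reduce the undercount.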
Access control & authentication
Scraping requires an HTTP management user with at least the monitor privilege. Define the user in the management.http.credentials block:
credentials {
    user "monitor" {
        password "monitorpassword";
        privilege monitor;
    }
}
Use HTTP Basic Auth when scraping:
curl -u monitor:monitorpassword --basic \
    http://localhost:8080/api/v1/metrics/prometheus
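The same request can be made from a script. The sketch below uses only the Python standard library and assumes the URL and credentials from the curl example; the function names build_scrape_request and scrape are illustrative, not a Radiator API:

```python
import base64
import urllib.request


def build_scrape_request(url, username, password):
    """Build a GET request carrying an HTTP Basic Authorization header."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    return request


def scrape(url, username, password, timeout=5.0):
    """Fetch the metrics text.

    urlopen raises urllib.error.HTTPError for responses such as
    401 Unauthorized, so bad credentials surface as an exception.
    """
    request = build_scrape_request(url, username, password)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example call (requires a running server):
# text = scrape("http://localhost:8080/api/v1/metrics/prometheus",
#               "monitor", "monitorpassword")
```

Basic Auth sends the password in an easily decoded form, so plain HTTP as shown here is best limited to localhost or a trusted network.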
Example Prometheus configuration
Example snippet for the Prometheus server configuration file prometheus.yml:
scrape_configs:
  - job_name: "radiator-server"
    metrics_path: "/api/v1/metrics/prometheus"
    basic_auth:
      username: "monitor"
      password: "monitorpassword"
    static_configs:
      - targets: ["127.0.0.1:8080"]
Example scrape output (excerpt)
The following excerpt comes from an integration test run and shows log-based counters after a series of authentication attempts:
# TYPE radiator_build_info gauge
radiator_build_info{app="radiator",version="10.31.0",kind="development",timestamp="2025-12-12T06:00:47Z",cpu_target="aarch64-apple-darwin",branch="main",commit="abc123"} 1
# TYPE radiator_uptime_seconds gauge
radiator_uptime_seconds{service_ok="true"} 3600
# TYPE radiator_instance_info gauge
radiator_instance_info{instance_id="R01",pid="12345"} 1
# TYPE radiator_log_total counter
radiator_log_total{namespace="RADIUS UDP::AUTH_UDP",message="Radius UDP packet from unknown client"} 5
radiator_log_total{namespace="policy::BASIC_AUTH::handler::PAP",message="AAA accept"} 999
radiator_log_total{namespace="policy::BASIC_AUTH::handler::PAP::backend call::USERS_FILE",message="Backend query accepted"} 999
# TYPE radiator_total counter
radiator_total{namespace="policy::BASIC_AUTH"} 1500
...
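For ad hoc inspection outside Prometheus, output like the excerpt above can be split into metric name, labels, and value. The following is a minimal sketch rather than a complete parser for the exposition format: it skips comment lines, ignores optional timestamps, and does not decode escape sequences in label values. The name parse_samples is hypothetical.

```python
import re

# Metric name, optional {labels}, then the sample value.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
# One label pair: name="value", allowing backslash escapes inside the value.
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Yield (metric_name, labels_dict, value) for each sample line."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or a # TYPE / # HELP comment
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # not a sample, e.g. the "..." elision marker
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        yield name, labels, float(value)


excerpt = '''# TYPE radiator_log_total counter
radiator_log_total{namespace="policy::BASIC_AUTH::handler::PAP",message="AAA accept"} 999
'''
for name, labels, value in parse_samples(excerpt):
    print(name, labels["namespace"], value)
```

For production use, an established client library parser is a safer choice than a hand-written regular expression.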
Key metrics:
Build and Instance Information
- radiator_build_info: build metadata gauge with labels for version, branch, commit, etc.
- radiator_uptime_seconds: time since the server started, in seconds.
- radiator_instance_info: instance identification with instance_id, optional cluster_id, and process ID.
Process Statistics
- radiator_process_memory_rss_bytes: resident set size (RSS) memory usage in bytes.
- radiator_process_cpu_milliseconds_total: total CPU time used by the process in milliseconds.
System Statistics
- radiator_system_memory_total_bytes: total system memory in bytes.
- radiator_system_memory_available_bytes: available system memory in bytes (Linux only).
- radiator_system_swap_total_bytes: total swap space in bytes (Linux only).
- radiator_system_swap_used_bytes: used swap space in bytes (Linux only).
- radiator_system_swap_pages_in_total: total pages swapped in from disk (Linux only).
- radiator_system_swap_pages_out_total: total pages swapped out to disk (Linux only).
- radiator_system_cpu_count: number of CPUs available.
- radiator_system_cpu_active_milliseconds_total: total active (non-idle) CPU time across all CPUs in milliseconds (Linux only).
- radiator_system_cpu_total_milliseconds_total: total CPU time across all CPUs in milliseconds (Linux only).
- radiator_system_load_1m_x100: 1-minute load average multiplied by 100.
- radiator_system_load_5m_x100: 5-minute load average multiplied by 100.
- radiator_system_load_15m_x100: 15-minute load average multiplied by 100.
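Two of these metric families need a small conversion before they read naturally: the load gauges are scaled by 100, and system CPU utilization comes from the difference between two scrapes of the active and total millisecond counters. A minimal sketch of both calculations, with hypothetical function names:

```python
def load_average(x100_value):
    """Convert a *_x100 load gauge back to the conventional load average."""
    return x100_value / 100.0


def cpu_utilization(active_before, total_before, active_after, total_after):
    """Fraction of CPU time spent non-idle between two scrapes.

    The inputs are radiator_system_cpu_active_milliseconds_total and
    radiator_system_cpu_total_milliseconds_total read at each scrape.
    Returns None when no CPU time elapsed or a counter went backwards.
    """
    active_delta = active_after - active_before
    total_delta = total_after - total_before
    if total_delta <= 0 or active_delta < 0:
        return None
    return active_delta / total_delta


print(load_average(125))                          # load gauge value 125
print(cpu_utilization(1000, 4000, 1500, 6000))    # 500 ms active of 2000 ms
```

In PromQL the same ratio would typically be written as the rate() of the active counter divided by the rate() of the total counter.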
Log Counters
- radiator_log_total: log message counters with hierarchical namespace (using :: separator) and message labels. The namespace represents the logging context hierarchy (e.g., server, policy, handler, backend).
- radiator_total: aggregate counters per namespace prefix.
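Since the namespace label is a ::-separated hierarchy, radiator_log_total samples can be grouped client-side by truncating the namespace to a chosen depth. The sketch below illustrates that grouping only; it makes no claim about how Radiator itself computes radiator_total, and the function names are hypothetical:

```python
from collections import defaultdict


def namespace_prefix(namespace, depth, separator="::"):
    """Return the first `depth` components of a hierarchical namespace."""
    return separator.join(namespace.split(separator)[:depth])


def sum_by_prefix(samples, depth):
    """Sum (namespace, value) pairs grouped by namespace prefix."""
    totals = defaultdict(float)
    for namespace, value in samples:
        totals[namespace_prefix(namespace, depth)] += value
    return dict(totals)


# Values taken from the radiator_log_total lines in the excerpt above.
samples = [
    ("RADIUS UDP::AUTH_UDP", 5),
    ("policy::BASIC_AUTH::handler::PAP", 999),
    ("policy::BASIC_AUTH::handler::PAP::backend call::USERS_FILE", 999),
]
print(sum_by_prefix(samples, depth=2))
```

A depth of 1 groups by the top-level context (here "RADIUS UDP" and "policy"), while larger depths separate individual handlers and backends.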
Note: System statistics are OS-dependent; metrics marked "Linux only" are available only on Linux.
High availability labels
See the high availability identifiers article for the semantics of instance and cluster IDs. The radiator_instance_info gauge includes instance_id and, optionally, cluster_id.
Usage guidelines
- Most metrics are read directly from memory. Some OS-level statistics require system calls, which have a small performance cost. In general, polling as often as once per second should have no noticeable impact.
- Do not expose metrics endpoints without authentication; they reveal operational structure.
Troubleshooting
| Symptom | Likely cause | Resolution |
|---|---|---|
| 401 Unauthorized | Missing / wrong credentials | Verify the username and password; create a user with monitor privilege if none exists |
| Empty or truncated output | Network proxy / auth failure mid-stream | Retry with curl -v, inspect server logs |
| Counters not incrementing | No traffic hitting relevant handler/policy | Generate test load (e.g. integration script) |
Push gateway (optional)
For cases where Prometheus cannot scrape Radiator instances directly, prompush has been verified to work against Radiator.