Rate Limiting Algorithms
Comparing counter-based and GCRA rate limiting strategies
Radiator provides two rate limiting algorithms with different characteristics.
Both algorithms have equal memory and CPU requirements. Counters are stored per-server in memory and are not shared between servers or persisted across restarts.
Choosing the Right Algorithm
Counter-based: Use when you want to allow bursty traffic within a quota.
- Good for longer periods (≥1 hour) like daily/hourly quotas
- Example: 10,000 API calls per 24 hours (use anytime within the day)
GCRA: Use when you need to space requests evenly to protect backend systems.
- Good for short periods (≤60 seconds) with high request rates
- Example: 1000 requests per 10 seconds (spaced 10ms apart) to prevent backend overload
- Also good for slowing down wrong-password attempts: for example, with a period of 300 seconds and 5 allowed failures in that time, brute force attacks are slowed efficiently. The randomization variant (rate_limit_gcra_rnd) makes attacks even harder by preventing attackers from predicting when the next attempt will be allowed.
Counter-Based Rate Limiting
Counter-based rate limiting uses fixed time windows to track request counts. Each tracked key (e.g., user, IP address, or endpoint) has a counter that increments with each request. When the counter exceeds the limit, additional requests can be rejected until the window boundary is reached and the counter resets to zero.
Pattern: bursts of up to 100 requests are accepted, then all further requests are rejected until the window resets.
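As an illustration, the fixed-window logic can be sketched in standalone Python (a simplified model, not Radiator's API; the class and method names are hypothetical):

```python
import time

class FixedWindowLimiter:
    """Allow up to `limit` requests per key within each fixed window."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.state = {}  # key -> (window_start, count)

    def allow(self, key, now=None):
        now = time.monotonic() if now is None else now
        start, count = self.state.get(key, (now, 0))
        if now - start >= self.window:
            # Hard reset at the window boundary
            start, count = now, 0
        count += 1
        self.state[key] = (start, count)
        return count <= self.limit

limiter = FixedWindowLimiter(limit=3, window_seconds=60)
print([limiter.allow("user:alice", now=t) for t in (0, 1, 2, 3, 61)])
# → [True, True, True, False, True]
```

This also makes the boundary edge case visible: a full burst just before a window ends plus another just after it resets can briefly reach 2× the configured rate.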
Characteristics
| Aspect | Behavior |
|---|---|
| Burst handling | Uncontrolled (full quota may be consumed at once) |
| Rate smoothing | No (bursty within window) |
| Window boundary | Edge case (up to 2× rate at boundary) |
| Reset behavior | Hard reset at window boundary |
| Randomization | Not supported |
| Complexity | Simple (counter + timer) |
GCRA Rate Limiting
GCRA (Generic Cell Rate Algorithm) enforces smooth request spacing by tracking a Theoretical Arrival Time (TAT) for each key. Unlike counter-based limiting, GCRA spreads requests evenly over time instead of letting the full quota be consumed in bursts anywhere within a window. It is an implementation of ITU-T Recommendation I.371, also known as the 'leaky bucket' algorithm.
Pattern: an initial burst of 100 requests is accepted, then acceptance continues at a steady 100 req/s; requests arriving before their emission interval has elapsed are rejected.
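The TAT bookkeeping can be sketched as a standalone Python model (illustrative only, not Radiator's implementation; the class and method names are hypothetical):

```python
class GCRALimiter:
    """Allow `limit` requests per `period_seconds`, spaced by the emission interval."""

    def __init__(self, limit, period_seconds):
        self.interval = period_seconds / limit           # emission interval
        self.tolerance = period_seconds - self.interval  # burst allowance
        self.tat = {}  # key -> Theoretical Arrival Time

    def allow(self, key, now):
        # Soft reset: after an idle period the stored TAT falls back to `now`
        tat = max(self.tat.get(key, now), now)
        if tat - now > self.tolerance:
            return False  # request arrived too early: reject
        self.tat[key] = tat + self.interval  # push the TAT forward
        return True

g = GCRALimiter(limit=5, period_seconds=10)  # emission interval: 2 s
print([g.allow("user:alice", now=0.0) for _ in range(6)])
# → [True, True, True, True, True, False]  (burst of 5 accepted, then rejection)
print(g.allow("user:alice", now=2.0))
# → True  (the 2 s emission interval has elapsed)
```

Because only one timestamp is tracked per key, there is no repeating window boundary: conformance is evaluated continuously against the TAT.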
Characteristics
| Aspect | Behavior |
|---|---|
| Burst handling | Controlled by period parameter |
| Rate smoothing | Yes - enforces emission interval |
| Window boundary | No repeating edge case - continuous tracking |
| Reset behavior | Soft reset after idle > period |
| Randomization | Optional (rate_limit_gcra_rnd variant) |
| Complexity | TAT calculation with timestamps |
Using Rate Limiting in Lua Scripts
Both algorithms are accessible through Radiator's cache API in Lua scripts.
Because the counter value is returned to the script, you can react differently at different counter levels.
Counter-Based Example
Configuration with tiered rate limiting:
caches {
    cache "rate_limit" {
        timeout 60s; # Window duration (counter resets every 60s)
    }
}

scripts {
    lua "tiered_rate_limit" {
        file "tiered_rate_limit.lua";
    }
}

aaa {
    policy "DEFAULT" {
        handler "AUTHENTICATION" {
            conditions all {
                radius.request.code == radius.ACCESS_REQUEST;
                radius.request.attr.User-Name == any;
            }
            @pre-execute {
                script "tiered_rate_limit";
            }
            authentication {
                backend {
                    name "USERS";
                }
            }
        }
    }
}
The Lua script file (tiered_rate_limit.lua):
-- tiered_rate_limit.lua
-- Tiered rate limiting with soft and hard limits
local context, previous = ...
local cache = context.cache

local count = cache:increment("rate_limit", "requests", 1)

if count > 200 then
    -- Hard limit: always reject
    context.aaa.message = "Rate limit exceeded"
    return result.REJECT
elseif count > 100 then
    -- Soft limit: reject odd requests (50% throttle)
    if count % 2 == 1 then
        context.aaa.message = "Rate limit: throttled"
        return result.REJECT
    end
end

return previous
This example implements tiered rate limiting: full access up to 100 requests per window, 50% throttling between 101 and 200, and complete blocking above 200.
For daily quotas:
caches {
    cache "daily_quota" {
        timeout 86400s; # 24 hours = 86400 seconds
    }
}
GCRA Example
GCRA rate limiting is typically used to protect backend systems from overload by applying rate limits before forwarding requests. This example shows how to implement per-user rate limiting using the @pre-execute block, which runs before backends are called.
Create a Lua script file (rate_limiter.lua):
-- rate_limiter.lua
-- GCRA rate limiting to protect backends from flooding
-- Parameters passed: context, previous
-- Returns: result.REJECT to reject the request, previous to continue processing
local context, previous = ...
local cache = context.cache
local user = context.aaa.identity

-- Validate that a user identity exists
if not user then
    context.aaa.message = "Identity required for rate limiting"
    return result.REJECT
end

-- Apply GCRA rate limit: 100 requests per second per user
-- limit=100, period=1000ms (1 second)
local allowed = cache:rate_limit_gcra(
    "backend_protection",
    "user:" .. user,
    100,
    1000
)

if not allowed then
    context.aaa.message = "Rate limit exceeded, please slow down"
    return result.REJECT
end

-- Rate limit passed, continue processing
return previous
Configure Radiator to use the rate limiter:
caches {
    cache "backend_protection" {
        # Timeout auto-calculated as 2× period (2 seconds)
    }
}

scripts {
    lua "rate_limiter" {
        file "rate_limiter.lua";
    }
}

aaa {
    policy "DEFAULT" {
        handler "AUTHENTICATION" {
            conditions all {
                radius.request.code == radius.ACCESS_REQUEST;
                radius.request.attr.User-Name == any;
            }
            # Rate limit BEFORE calling backends to protect them from flooding
            @pre-execute {
                script "rate_limiter";
            }
            authentication {
                # Backend is only called if rate limit passed
                backend {
                    name "USERS";
                }
            }
        }
    }
}
How it works:
- Lua parameters: the script receives two parameters:
  - context: provides access to the AAA context (context.aaa), the cache (context.cache), and variables
  - previous: the result from the previous script in the chain (if any)
- Return values:
  - result.REJECT: rejects the request immediately and stops further processing
  - previous: continues to the next step in the pipeline
- Execution order: the @pre-execute block runs before authentication backends, protecting them from excessive requests
- Identity validation: the script checks that a user identity exists before applying rate limiting, to avoid creating a shared rate limit bucket
- Cache timeout: the cache timeout is automatically calculated as 2× the period, allowing idle keys to be cleaned up
GCRA with Randomization
The rate_limit_gcra_rnd variant adds random jitter to emission intervals to prevent thundering herd effects. This is useful when multiple clients might synchronize their requests (e.g., all retrying at exactly the same time after a failure).
When to use randomization:
- High-concurrency scenarios where many clients might synchronize
- Preventing cascading failures in distributed systems
- Load distribution across time to smooth traffic peaks
Example:
local cache = context.cache
local user = context.aaa.identity

if not user then
    context.aaa.message = "Identity required for rate limiting"
    return result.REJECT
end

-- Prevent brute force attacks: 5 login attempts per 5 minutes
-- Applies to both successful and failed logins
-- limit=5, period=300000ms (5 minutes), variation=±30000ms (30 seconds)
local allowed = cache:rate_limit_gcra_rnd(
    "login_limit",
    "user:" .. user,
    5,
    300000,
    30000
)

if not allowed then
    context.aaa.message = "Too many login attempts, please try again later"
    return result.REJECT
end
With variation=30000 (±30 seconds), the emission interval is randomized. The base emission interval is 60 seconds (300000ms ÷ 5 attempts), so actual intervals will range from 30 seconds to 90 seconds, averaging 60 seconds over time. This prevents attackers from predicting exactly when the next attempt will be allowed.
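The interval arithmetic can be checked with a short simulation (illustrative Python; `randomized_interval` is a hypothetical helper, not a Radiator function):

```python
import random

def randomized_interval(limit, period_ms, variation_ms, rng):
    """Base emission interval plus uniform jitter in ±variation_ms,
    clamped to a small positive minimum (Radiator clamps at 1 ns)."""
    base = period_ms / limit
    return max(base + rng.uniform(-variation_ms, variation_ms), 1e-6)

rng = random.Random(42)
intervals = [randomized_interval(5, 300000, 30000, rng) for _ in range(1000)]
print(min(intervals) >= 30000, max(intervals) <= 90000)  # → True True
print(abs(sum(intervals) / len(intervals) - 60000) < 2000)  # → True (averages near 60 s)
```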
Complete configuration example:
Create a Lua script file (brute_force_protection.lua):
-- brute_force_protection.lua
-- Prevents brute force attacks by rate limiting login attempts
local context, previous = ...
local cache = context.cache
local user = context.aaa.identity

if not user then
    context.aaa.message = "Identity required for rate limiting"
    return result.REJECT
end

-- 5 login attempts per 5 minutes with ±30s variation
local allowed = cache:rate_limit_gcra_rnd(
    "login_limit",
    "user:" .. user,
    5,
    300000,
    30000
)

if not allowed then
    context.aaa.message = "Too many login attempts, please try again later"
    return result.REJECT
end

return previous
Configure Radiator to use the script:
caches {
    cache "login_limit" {
        # Timeout auto-calculated as 2× period (10 minutes)
    }
}

scripts {
    lua "brute_force_protection" {
        file "brute_force_protection.lua";
    }
}

aaa {
    policy "DEFAULT" {
        handler "AUTHENTICATION" {
            conditions all {
                radius.request.code == radius.ACCESS_REQUEST;
                radius.request.attr.User-Name == any;
                radius.request.attr.User-Password == any;
            }
            authentication {
                # Check rate limit before authentication
                script "brute_force_protection";
                # Authenticate if rate limit passed
                backend {
                    name "USERS";
                }
                # Verify password
                pap;
            }
        }
    }
}
In this configuration:
- The handler conditions ensure this only applies to Access-Request packets with a username and password (PAP authentication)
- Rate limiting applies to all login attempts (both successes and failures)
Note: For other authentication methods (CHAP, EAP), replace the User-Password condition check with the appropriate attribute (e.g., CHAP-Password, EAP-Message).
Note: The emission interval is always kept at a minimum of 1 nanosecond, even if randomization would otherwise make it negative.
Interchangeable Script Examples
The Lua script files shown above can be easily adapted for different use cases. Key modifications:
To change the cache or key:
-- Change the cache name and key pattern
local allowed = cache:rate_limit_gcra_rnd(
    "api_limit",             -- different cache name
    "endpoint:" .. endpoint, -- key by endpoint instead of user
    100, 10000, 1000
)
To adjust rate limits:
-- Modify limit, period, and variation values
local allowed = cache:rate_limit_gcra_rnd(
    "login_limit",
    "user:" .. user,
    10,     -- 10 attempts instead of 5
    600000, -- 10 minutes instead of 5
    60000   -- ±60 seconds instead of ±30
)
To switch between GCRA variants:
-- Without randomization
local allowed = cache:rate_limit_gcra("login_limit", "user:" .. user, 5, 300000)
-- With randomization
local allowed = cache:rate_limit_gcra_rnd("login_limit", "user:" .. user, 5, 300000, 30000)
The radconf configuration remains the same; just update the file path if you rename the Lua script.
For more examples and configuration details, see Cache Context Reference.