Rate Limiting Algorithms

Comparing counter-based and GCRA rate limiting strategies

Radiator provides two rate limiting algorithms with different characteristics.

Both algorithms have equal memory and CPU requirements. Counters are stored per-server in memory and are not shared between servers or persisted across restarts.

Choosing the Right Algorithm

Counter-based: Use when you want to allow bursty traffic within a quota.

  • Good for longer periods (≥1 hour) like daily/hourly quotas
  • Example: 10,000 API calls per 24 hours (use anytime within the day)

GCRA: Use when you need to space requests evenly to protect backend systems.

  • Good for short periods (≤60 seconds) with high request rates
  • Example: 1000 requests per 10 seconds (spaced 10ms apart) to prevent backend overload
  • Also good for slowing down failed password attempts: for example, a period of 300 seconds with a limit of 5 failures spaces attempts out and efficiently slows brute force attacks. The randomization variant (rate_limit_gcra_rnd) makes attacks even harder by preventing attackers from predicting when the next attempt will be allowed.

Counter-Based Rate Limiting

Counter-based rate limiting uses fixed time windows to track request counts. Each tracked key (e.g., user, IP address, or endpoint) has a counter that increments with each request. When the counter exceeds the limit, additional requests can be rejected until the window boundary is reached and the counter resets to zero.

Pattern: Bursts of 100 accepted (green), then complete blocking while rejections pile up (red).
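The windowing logic can be sketched in a few lines of plain Lua. This is a simplified model for illustration only (the limits table, the allow function, and the integer timestamps are invented here); it is not Radiator's internal implementation:

```lua
-- Simplified fixed-window counter (illustration only, not Radiator's code).
-- Windows are aligned to multiples of `window` seconds; the counter resets
-- to zero whenever a request falls into a new window.
local limits = {}  -- key -> { window_start, count }

local function allow(key, limit, window, now)
    local window_start = now - (now % window)
    local entry = limits[key]
    if not entry or entry.window_start ~= window_start then
        entry = { window_start = window_start, count = 0 }
        limits[key] = entry
    end
    entry.count = entry.count + 1
    return entry.count <= limit
end

-- 3 requests per 60-second window:
assert(allow("user:alice", 3, 60, 100) == true)
assert(allow("user:alice", 3, 60, 101) == true)
assert(allow("user:alice", 3, 60, 102) == true)
assert(allow("user:alice", 3, 60, 103) == false)  -- over limit, rejected
assert(allow("user:alice", 3, 60, 120) == true)   -- new window, counter reset
```

Note how a client that sends a full quota just before a window boundary and another full quota just after it achieves up to 2× the nominal rate around the boundary; this is the edge case listed in the characteristics.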

Characteristics

Aspect            Behavior
Burst handling    None (all-or-nothing per window)
Rate smoothing    No (bursty within window)
Window boundary   Edge case (up to 2× rate at boundary)
Reset behavior    Hard reset at window boundary
Randomization     Not supported
Complexity        Simple (counter + timer)

GCRA Rate Limiting

GCRA (Generic Cell Rate Algorithm) enforces smooth request spacing by tracking a Theoretical Arrival Time (TAT) for each key. Unlike counter-based limiting, GCRA distributes requests evenly over time rather than allowing bursts. The algorithm is defined in ITU-T Recommendation I.371 and is equivalent to the 'leaky bucket' algorithm.

Pattern: Initial burst of 100 accepted at t=0.33s (green), then steady 100 req/s acceptance. Rejections (red) from requests arriving too early.
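The TAT bookkeeping can be sketched in plain Lua. This is a simplified model for illustration only (the tats table and the gcra_allow function are invented here), not Radiator's implementation; it also models the soft reset for idle keys:

```lua
-- Simplified GCRA (illustration only, not Radiator's code).
-- Each key tracks a Theoretical Arrival Time (TAT). A request is allowed
-- unless it arrives more than (period - interval) ahead of the TAT; each
-- allowed request advances the TAT by one emission interval (period / limit).
local tats = {}  -- key -> TAT in milliseconds

local function gcra_allow(key, limit, period, now)
    local interval = period / limit          -- emission interval
    local tat = tats[key] or now
    if tat < now then tat = now end          -- idle key catches up (soft reset)
    if tat - now > period - interval then
        return false                         -- arrived too early
    end
    tats[key] = tat + interval
    return true
end

-- 5 requests per 1000 ms => one request every 200 ms on average:
assert(gcra_allow("user:bob", 5, 1000, 0) == true)   -- burst up to the limit
assert(gcra_allow("user:bob", 5, 1000, 0) == true)
assert(gcra_allow("user:bob", 5, 1000, 0) == true)
assert(gcra_allow("user:bob", 5, 1000, 0) == true)
assert(gcra_allow("user:bob", 5, 1000, 0) == true)
assert(gcra_allow("user:bob", 5, 1000, 0) == false)  -- sixth arrives too early
assert(gcra_allow("user:bob", 5, 1000, 200) == true) -- spacing restored
```

Because the TAT advances continuously instead of resetting at window boundaries, there is no repeating boundary edge case: excess requests are rejected the moment they arrive ahead of schedule.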

Characteristics

Aspect            Behavior
Burst handling    Controlled by period parameter
Rate smoothing    Yes - enforces emission interval
Window boundary   No repeating edge case - continuous tracking
Reset behavior    Soft reset after idle > period
Randomization     Optional (rate_limit_gcra_rnd variant)
Complexity        TAT calculation with timestamps

Using Rate Limiting in Lua Scripts

Both algorithms are accessible through Radiator's cache API in Lua scripts.

Since the counter value is returned to the script, it is possible to react differently at different counter levels.

Counter-Based Example

Configuration with tiered rate limiting:

caches {
    cache "rate_limit" {
        timeout 60s;  # Window duration (counter resets every 60s)
    }
}

scripts {
    lua "tiered_rate_limit" {
        file "tiered_rate_limit.lua";
    }
}

aaa {
    policy "DEFAULT" {
        handler "AUTHENTICATION" {
            conditions all {
                radius.request.code == radius.ACCESS_REQUEST;
                radius.request.attr.User-Name == any;
            }

            @pre-execute {
                script "tiered_rate_limit";
            }

            authentication {
                backend {
                    name "USERS";
                }
            }
        }
    }
}

The Lua script file (tiered_rate_limit.lua):

-- tiered_rate_limit.lua
-- Tiered rate limiting with soft and hard limits
local context, previous = ...
local cache = context.cache
local count = cache:increment("rate_limit", "requests", 1)

if count > 200 then
    -- Hard limit: always reject
    context.aaa.message = "Rate limit exceeded"
    return result.REJECT
elseif count > 100 then
    -- Soft limit: reject odd requests (50% throttle)
    if count % 2 == 1 then
        context.aaa.message = "Rate limit: throttled"
        return result.REJECT
    end
end

return previous

This example shows tiered rate limiting: full access up to 100 requests, 50% throttling between 101 and 200, and complete blocking above 200.

For daily quotas:

caches {
    cache "daily_quota" {
        timeout 86400s;  # 24 hours = 86400 seconds
    }
}

GCRA Example

GCRA rate limiting is typically used to protect backend systems from overload by applying rate limits before forwarding requests. This example shows how to implement per-user rate limiting using the @pre-execute block, which runs before backends are called.

Create a Lua script file (rate_limiter.lua):

-- rate_limiter.lua
-- GCRA rate limiting to protect backends from flooding
-- Parameters passed: context, previous
-- Returns: result.REJECT to reject request, previous to continue processing

local context, previous = ...
local cache = context.cache
local user = context.aaa.identity

-- Validate that user identity exists
if not user then
    context.aaa.message = "Identity required for rate limiting"
    return result.REJECT
end

-- Apply GCRA rate limit: 100 requests per second per user
-- limit=100, period=1000ms (1 second)
local allowed = cache:rate_limit_gcra(
    "backend_protection",
    "user:" .. user,
    100,
    1000
)

if not allowed then
    context.aaa.message = "Rate limit exceeded, please slow down"
    return result.REJECT
end

-- Rate limit passed, continue processing
return previous

Configure Radiator to use the rate limiter:

caches {
    cache "backend_protection" {
        # Timeout auto-calculated as 2× period (2 seconds)
    }
}

scripts {
    lua "rate_limiter" {
        file "rate_limiter.lua";
    }
}

aaa {
    policy "DEFAULT" {
        handler "AUTHENTICATION" {
            conditions all {
                radius.request.code == radius.ACCESS_REQUEST;
                radius.request.attr.User-Name == any;
            }

            # Rate limit BEFORE calling backends to protect them from flooding
            @pre-execute {
                script "rate_limiter";
            }

            authentication {
                # Backend is only called if rate limit passed
                backend {
                    name "USERS";
                }
            }
        }
    }
}

How it works:

  1. Lua parameters: The script receives two parameters:

    • context: Provides access to AAA context (context.aaa), cache (context.cache), and variables
    • previous: The result from the previous script in the chain (if any)
  2. Return values:

    • result.REJECT: Rejects the request immediately, stops further processing
    • previous: Continues to the next step in the pipeline
  3. Execution order: The @pre-execute block runs before authentication backends, protecting them from excessive requests

  4. Identity validation: The script validates that a user identity exists before applying rate limiting to avoid creating a shared rate limit bucket

  5. Cache timeout: The cache timeout is automatically calculated as 2× the period, allowing idle keys to be cleaned up

GCRA with Randomization

The rate_limit_gcra_rnd variant adds random jitter to emission intervals to prevent thundering herd effects. This is useful when multiple clients might synchronize their requests (e.g., all retrying at exactly the same time after a failure).

When to use randomization:

  • High-concurrency scenarios where many clients might synchronize
  • Preventing cascading failures in distributed systems
  • Load distribution across time to smooth traffic peaks

Example:

local cache = context.cache
local user = context.aaa.identity

if not user then
    context.aaa.message = "Identity required for rate limiting"
    return result.REJECT
end

-- Prevent brute force attacks: 5 login attempts per 5 minutes
-- Applies to both successful and failed logins
-- limit=5, period=300000ms (5 minutes), variation=±30000ms (30 seconds)
local allowed = cache:rate_limit_gcra_rnd(
    "login_limit",
    "user:" .. user,
    5,
    300000,
    30000
)

if not allowed then
    context.aaa.message = "Too many login attempts, please try again later"
    return result.REJECT
end

With variation=30000 (±30 seconds), the emission interval is randomized. The base emission interval is 60 seconds (300000ms ÷ 5 attempts), so actual intervals will range from 30 seconds to 90 seconds, averaging 60 seconds over time. This prevents attackers from predicting exactly when the next attempt will be allowed.
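The interval arithmetic can be checked in a few lines of Lua (values taken from the example above):

```lua
-- Randomized emission interval bounds for limit=5, period=300000 ms,
-- variation=±30000 ms.
local limit, period, variation = 5, 300000, 30000
local base = period / limit        -- base emission interval
assert(base == 60000)              -- 60 seconds on average
assert(base - variation == 30000)  -- shortest possible interval: 30 seconds
assert(base + variation == 90000)  -- longest possible interval: 90 seconds
```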

Complete configuration example:

Create a Lua script file (brute_force_protection.lua):

-- brute_force_protection.lua
-- Prevents brute force attacks by rate limiting login attempts
local context, previous = ...
local cache = context.cache
local user = context.aaa.identity

if not user then
    context.aaa.message = "Identity required for rate limiting"
    return result.REJECT
end

-- 5 login attempts per 5 minutes with ±30s variation
local allowed = cache:rate_limit_gcra_rnd(
    "login_limit",
    "user:" .. user,
    5,
    300000,
    30000
)

if not allowed then
    context.aaa.message = "Too many login attempts, please try again later"
    return result.REJECT
end

return previous

Configure Radiator to use the script:

caches {
    cache "login_limit" {
        # Timeout auto-calculated as 2× period (10 minutes)
    }
}

scripts {
    lua "brute_force_protection" {
        file "brute_force_protection.lua";
    }
}

aaa {
    policy "DEFAULT" {
        handler "AUTHENTICATION" {
            conditions all {
                radius.request.code == radius.ACCESS_REQUEST;
                radius.request.attr.User-Name == any;
                radius.request.attr.User-Password == any;
            }

            authentication {
                # Check rate limit before authentication
                script "brute_force_protection";

                # Authenticate if rate limit passed
                backend {
                    name "USERS";
                }

                # Verify password
                pap;
            }
        }
    }
}

In this configuration:

  1. The handler conditions ensure this only applies to Access-Request packets with a username and password (PAP authentication)
  2. Rate limiting applies to all login attempts (both successes and failures)

Note: For other authentication methods (CHAP, EAP), replace the User-Password condition check with the appropriate attribute (e.g., CHAP-Password, EAP-Message).

Note: The emission interval is clamped to a minimum of 1 nanosecond even if the randomization would otherwise make it negative.

Interchangeable Script Examples

The Lua script files shown above can be easily adapted for different use cases. Key modifications:

To change the cache or key:

-- Change the cache name and key pattern
local allowed = cache:rate_limit_gcra_rnd(
    "api_limit",        -- different cache name
    "endpoint:" .. endpoint,  -- key by endpoint instead of user
    100, 10000, 1000
)

To adjust rate limits:

-- Modify limit, period, and variation values
local allowed = cache:rate_limit_gcra_rnd(
    "login_limit",
    "user:" .. user,
    10,      -- 10 attempts instead of 5
    600000,  -- 10 minutes instead of 5
    60000    -- ±60 seconds instead of ±30
)

To switch between GCRA variants:

-- Without randomization
local allowed = cache:rate_limit_gcra("login_limit", "user:" .. user, 5, 300000)

-- With randomization
local allowed = cache:rate_limit_gcra_rnd("login_limit", "user:" .. user, 5, 300000, 30000)

The radconf configuration remains the same; just update the file path if you rename the Lua script.

For more examples and configuration details, see Cache Context Reference.