Hardening Your API Gateway: Distributed Rate Limiting with Nginx and Redis

Security tutorial - IT technology blog

The 2 AM Wake-up Call: Why Default Configs Fail

My phone started vibrating off the nightstand at 2:14 AM. Our production API gateway was bleeding. A botnet had targeted an unauthenticated search endpoint, hammering it with over 5,000 requests per second. The backend database was drowning in complex joins. CPU usage on the API nodes spiked to 95%, and legitimate users were staring at 504 Gateway Timeouts.

This is the reality of public-facing APIs. If you don’t throttle traffic and identify your callers, you aren’t running a production system—you’re managing a ticking time bomb. I spent that night manually blacklisting IP ranges in the firewall. The lesson was clear: we needed a distributed rate-limiting system tied to unique API keys.

Comparing Approaches: Local vs. Distributed Rate Limiting

I initially relied on Nginx’s built-in limit_req_zone. It’s simple, but it fails in modern, autoscaled environments. If you have five Nginx instances behind a load balancer, each instance maintains its own isolated counter. An attacker could theoretically hit your system with five times the allowed limit before any single node triggers a block.

Here is how the common strategies compare in a real-world environment:

  • Standard Nginx (ngx_http_limit_req_module): Fast and simple with sub-millisecond overhead. It uses shared memory zones. Best for: Single-server setups or basic DDoS mitigation at the very edge.
  • App-Level Validation: Writing logic in Python or Node.js allows for complex business rules. However, it’s expensive. By the time the code executes, the request has already consumed a worker thread and significant memory.
  • Nginx + Lua + Redis: This is the industry gold standard. Nginx handles the connection, Lua executes the logic, and Redis stores the global state. Best for: High-traffic clusters where you need consistent limits across 10 or 100 nodes.
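
For contrast, the single-node approach from the first bullet takes only two directives. A minimal sketch (the zone name, size, and rate here are illustrative):

limit_req_zone $binary_remote_addr zone=per_ip:10m rate=10r/s;

server {
    location /v1/ {
        limit_req zone=per_ip burst=20 nodelay;
    }
}

All workers on one machine share the per_ip shared-memory zone, but the zone is still local to that machine, which is exactly the weakness described above.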

The Trade-offs of Redis-Backed Security

Engineering is about managing trade-offs. Moving your state to Redis introduces new variables into your infrastructure.

The Benefits

  • Absolute Consistency: If a user is capped at 1,000 requests per hour, they get exactly 1,000. It doesn’t matter which gateway node they hit.
  • Live Updates: You can modify limits in Redis instantly without reloading Nginx configs or restarting services.
  • Tiered Access: You can easily assign a 10 req/sec limit to “Free Tier” keys and 500 req/sec to “Enterprise” keys.
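
One way to implement that tiered access is to store each key's limit in Redis next to the key itself. A sketch using redis-cli (the hash name and member values are illustrative, not part of the gateway code in this guide):

HSET "key_limits" "usr_99" 10
HSET "key_limits" "ent_42" 500

The Lua layer can then HGET the caller's limit instead of hard-coding one, and a plan upgrade becomes a single HSET with no Nginx reload.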

The Challenges

  • Latency Overhead: Every request now requires a network round-trip to Redis. While Redis is fast, this usually adds 0.5ms to 2ms of latency per request.
  • Operational Complexity: You now have a Redis cluster to monitor. If Redis goes down, your gateway needs a fail-open or fail-closed strategy.
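
The failure strategy you choose comes down to a few lines of Lua in the connect path. A fail-open sketch, trading enforcement for availability:

local ok, err = red:connect("127.0.0.1", 6379)
if not ok then
    -- Fail-open: log the outage and let traffic through unthrottled
    ngx.log(ngx.ERR, "Redis down, rate limiting disabled: ", err)
    return
end

Fail-open keeps the API up during a Redis outage but leaves it unprotected; fail-closed (returning a 500, as the implementation in this guide does) does the reverse. Pick based on which incident hurts more.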

The Architecture: OpenResty + Redis

I prefer using OpenResty for this. It is a distribution of Nginx bundled with LuaJIT, which lets us hook Lua code into the access phase of the request lifecycle. When a request hits the gateway, we extract the API key, check the quota in Redis, and either allow the request or reject it with a 429 Too Many Requests status.

Security starts with the basics. When I set up the Redis instance for this gateway, I generated the administrative passwords using the tool at toolcraft.app/en/tools/security/password-generator. It runs entirely in the browser, ensuring the keys never touch a remote server before they are deployed to your config.

Implementation Guide: Protecting Your API

We will implement a “Fixed Window” algorithm. It’s the most performant approach and the easiest to debug under pressure. Its one known weakness: a client can spend a full quota at the end of one window and another at the start of the next, briefly doubling the effective rate. For most gateways that trade-off is acceptable.

Step 1: Preparing the Redis Backend

We will store counters using the pattern limit:API_KEY:TIMESTAMP, where the timestamp has minute granularity. If a user with the key usr_99 calls the API at 14:05 on October 27th, we increment the key limit:usr_99:202310271405 and set it to expire after 60 seconds. This automatically cleans up old data.

# Manual verification via redis-cli
INCR "limit:usr_99:202310271405"
EXPIRE "limit:usr_99:202310271405" 60

Step 2: The Nginx Lua Logic

In your OpenResty configuration, define the logic within the access_by_lua_block. This ensures the check happens before the request is proxied to your expensive backend services.

http {
    lua_package_path "/usr/local/openresty/lualib/?.lua;;";

    server {
        listen 80;
        server_name api.example.com;

        location /v1/ {
            access_by_lua_block {
                -- Reject key-less requests before paying for a Redis round-trip
                local api_key = ngx.req.get_headers()["X-API-Key"]
                if not api_key then
                    ngx.status = ngx.HTTP_UNAUTHORIZED
                    ngx.header.content_type = "application/json"
                    ngx.say("{\"error\": \"Missing API Key\"}")
                    return ngx.exit(ngx.HTTP_UNAUTHORIZED)
                end

                local redis = require "resty.redis"
                local red = redis:new()
                red:set_timeout(1000) -- 1 second timeout

                local ok, err = red:connect("127.0.0.1", 6379)
                if not ok then
                    -- Fail-closed: if Redis is unreachable, reject the request
                    ngx.log(ngx.ERR, "Redis connection failed: ", err)
                    return ngx.exit(500)
                end

                -- Limit: 100 requests per minute
                local limit = 100
                local current_time = os.date("%Y%m%d%H%M")
                local key = "limit:" .. api_key .. ":" .. current_time

                local count, err = red:incr(key)
                if not count then
                    ngx.log(ngx.ERR, "Incr failed: ", err)
                    return ngx.exit(500)
                end

                if tonumber(count) == 1 then
                    red:expire(key, 60)
                end

                -- Important: put the connection back in the pool on every
                -- path, including the 429 branch, or it will simply be closed
                red:set_keepalive(10000, 100)

                if tonumber(count) > limit then
                    ngx.status = 429
                    ngx.header.content_type = "application/json"
                    ngx.say("{\"error\": \"Rate limit exceeded. Try again in the next minute.\"}")
                    return ngx.exit(429)
                end
            }

            proxy_pass http://backend_cluster;
        }
    }
}

Step 3: Validating the Key

Rate limiting is useless if the key itself is fake. Before checking the increment, query a Redis SET to ensure the X-API-Key actually exists in your system. This prevents attackers from filling your Redis memory with random, non-existent keys.

local is_valid, err = red:sismember("active_keys", api_key)
if not is_valid then
    ngx.log(ngx.ERR, "sismember failed: ", err)
    return ngx.exit(500)
end
if is_valid == 0 then
    ngx.status = ngx.HTTP_FORBIDDEN
    ngx.header.content_type = "application/json"
    ngx.say("{\"error\": \"Invalid API Key\"}")
    return ngx.exit(ngx.HTTP_FORBIDDEN)
end
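
The active_keys set has to be populated when keys are issued. A minimal provisioning sketch via redis-cli (the member values are examples):

SADD "active_keys" "usr_99"
SREM "active_keys" "usr_99"

Because the Lua check reads the set on every request, an SREM revokes a key immediately, with no deploy or reload.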

Testing and Monitoring

Don’t wait for the next attack to see if this works. Use a tool like hey to simulate a burst of 200 requests. Watch your Redis instance in real-time using redis-cli monitor. You should see the keys incrementing and, crucially, the 429 errors appearing once the limit is breached.
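
A concrete burst with hey might look like this (the host matches the config in this guide; the path and key are illustrative, so adjust them to your environment):

hey -n 200 -c 20 -H "X-API-Key: usr_99" http://api.example.com/v1/search

With the 100 req/min limit configured earlier, roughly the first 100 requests should come back 200 and the rest 429, which hey summarizes in its status code distribution.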

Pay close attention to your connection pool. Opening a new TCP connection to Redis for every single HTTP request will kill your performance. Always use set_keepalive to reuse existing connections. This optimization alone can reduce your latency overhead by 70%.

Implementing this architecture saved my sleep. The next time a botnet tried to scrape our data, Nginx handled it without breaking a sweat. It served thousands of 429 responses in milliseconds, and our backend remained completely silent. That is the power of a proactive security layer.
