What is a rate-limit bypass attack?

A rate-limit bypass exploits weaknesses in how an API or web application identifies and counts requests from a single source. Most rate-limit implementations track requests by IP address. If the server reads the client IP address from client-controlled headers (such as X-Forwarded-For or X-Real-IP) rather than from the actual network connection, an attacker can supply arbitrary IP values in those headers, causing the server to think each request comes from a different address. Each spoofed IP gets its own fresh counter, making the rate limit completely ineffective. A second common bypass is endpoint variation: rate limits applied to /api/v2/endpoint are frequently not retroactively applied to legacy /api/v1/endpoint routes. Either path renders rate limiting useless as a security control.

Why do APIs read IP addresses from headers instead of the connection?

When an API runs behind a reverse proxy, load balancer, or CDN, the server only sees the proxy's IP address on the TCP connection — not the original client's IP. To make the real client IP available to the application, proxies add headers like X-Forwarded-For or X-Real-IP containing the original IP. This is a legitimate and necessary pattern for logging, geolocation, and rate limiting. The problem arises when developers apply rate limiting using these headers without verifying that the header originates from a trusted proxy. If any client can set X-Forwarded-For to an arbitrary value and the server trusts it, rate limiting by IP becomes trivially bypassable. The correct implementation reads the header only when the request arrives from a known, trusted proxy IP.

What is the difference between a sliding-window and a fixed-window rate limiter?

A fixed-window rate limiter counts requests in a fixed time interval, for example 5 requests per 60-second window. The counter resets at the start of each new window. The weakness: an attacker can burst up to 10 requests across a window boundary (5 at the end of one window, 5 at the start of the next). A sliding-window rate limiter tracks the number of requests in the last N seconds from the current moment; the window slides forward in real time, eliminating the boundary burst. A token-bucket algorithm grants a budget of tokens that refill at a fixed rate; requests consume tokens and are rejected when the bucket is empty, allowing short bursts up to the bucket size. In this demo, the target used a fixed-window counter. The timing technique tested (placing requests near the window boundary) did not produce a bypass: the counter resets only when a new window begins, so request timing can at most yield the boundary burst described above, never an unlimited request rate.
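The boundary burst is easier to see in code. The following is a minimal in-memory sketch, not the demo target's implementation: the class names, the 5-per-60-seconds settings, and the timestamps are illustrative assumptions. Time is passed in explicitly so the behaviour is reproducible.

```python
from collections import deque


class FixedWindowLimiter:
    """Allow at most `limit` requests per aligned window of `window` seconds."""

    def __init__(self, limit: int, window: int) -> None:
        self.limit = limit
        self.window = window
        self.current_window = None  # index of the window being counted
        self.count = 0

    def allow(self, now: float) -> bool:
        window_index = int(now // self.window)
        if window_index != self.current_window:
            # A new window started: the counter resets to zero.
            self.current_window = window_index
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False


class SlidingWindowLimiter:
    """Allow at most `limit` requests in any trailing `window` seconds."""

    def __init__(self, limit: int, window: int) -> None:
        self.limit = limit
        self.window = window
        self.accepted = deque()  # timestamps of accepted requests

    def allow(self, now: float) -> bool:
        # Forget accepted requests that have left the trailing window.
        while self.accepted and self.accepted[0] <= now - self.window:
            self.accepted.popleft()
        if len(self.accepted) < self.limit:
            self.accepted.append(now)
            return True
        return False


# Ten requests straddling the boundary at t=60:
# five at 59.0..59.4 seconds, five at 60.0..60.4 seconds.
burst = [59.0 + 0.1 * i for i in range(5)] + [60.0 + 0.1 * i for i in range(5)]

fixed = FixedWindowLimiter(limit=5, window=60)
sliding = SlidingWindowLimiter(limit=5, window=60)

fixed_allowed = sum(fixed.allow(t) for t in burst)
sliding_allowed = sum(sliding.allow(t) for t in burst)

print(fixed_allowed, sliding_allowed)  # 10 5
```

The fixed-window limiter accepts all ten requests because they fall into two different windows; the sliding-window limiter accepts only the first five. Either way the burst is bounded, which is why boundary timing alone never amounts to a full bypass.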

What is the impact of a rate-limit bypass on user enumeration?

User enumeration APIs — search endpoints, password reset forms, username availability checkers — are protected by rate limiting to prevent attackers from systematically probing the user database. A rate-limit bypass removes that protection entirely. With unlimited requests to /api/v1/users/search, an attacker can extract the full user database: every username, email address, or account identifier the endpoint returns. This list then feeds directly into credential attacks: Hydra (D15) can target exactly those usernames with a password wordlist, turning a vague attack into a targeted one. Rate limit bypass is not the end goal — it is the unlock that makes every downstream credential attack viable.

How should rate limiting be implemented to prevent bypass?

Robust rate limiting requires multiple controls layered together. First, implement rate limiting at the infrastructure layer (nginx, AWS API Gateway, Cloudflare, or a WAF), not in application code. Infrastructure-layer rate limiting runs before application code and can key on the connection's source address instead of client-supplied headers. Second, if you must rate-limit by IP in application code, only trust IP-forwarding headers when the request originates from a verified trusted proxy IP (check request.remote_addr against your load balancer's known IP range first). Third, apply rate limits to all versions of every endpoint: use middleware or gateway rules, not per-route decorators. Fourth, for authenticated endpoints, rate-limit by session or user identity rather than IP, because an authenticated attacker cannot trivially rotate identities. Fifth, consider hard lockout after N failures for sensitive endpoints rather than soft throttling.

40% of APIs Trust a Header Attackers Control — We Proved It in 9 Seconds

March 22, 2026 · SecurityClaw Demo D18 · API Security

AcmeCorp's API blocks you after 5 requests per minute. Add one header and you get a fresh counter. Change the URL path and there's no counter at all. SecurityClaw's rate-limit-bypass skill confirmed two independent bypass paths in 8.7 seconds — either one is enough to make the rate limit useless. Here's the exact campaign output, the honest misses, and the attack chain this unlocks.

Date: March 22, 2026

Tool: SecurityClaw rate-limit-bypass

Category: API Security

Result: ✅ Pass — 2/4 bypass techniques confirmed (8.7s)

Target: AcmeCorp User API (Flask, controlled test environment)

Methodology note: This is a controlled demo against a sandboxed AcmeCorp User API (Flask, port 5060). The target was built by Peng's SecurityClaw research team with deliberate rate-limit vulnerabilities representative of real-world API configurations. We tested 9 forwarding headers and 4 bypass techniques. Running rate-limit bypass attacks against APIs you don't own is illegal. This demo exists to show defenders what these bypasses look like and what configuration actually prevents them. Demo D18 — SecurityClaw's API Security category.

The Results at a Glance

Bypass Technique	Result	Detail
Header manipulation	BYPASS ✅	6/9 headers accepted as IP source — fresh counter per spoofed IP
Endpoint variation	BYPASS ✅	/api/v1/users/search — no rate limit at all, 10/10 requests returned 200
Case manipulation	BLOCKED ❌	/API/USERS/SEARCH → 404 on Flask (honest miss — documented below)
Timing delay	BLOCKED ❌	Fixed-window counter doesn't reset from delays (honest miss — documented below)

Two HIGH findings. Two honest misses. Total campaign time: 8.7 seconds. SecurityClaw tested everything. It found what works and — importantly — documented what doesn't. The bypass score (2/4) is accurate. We don't inflate numbers.

Finding 1 (HIGH): Header Manipulation — 6 Bypass Vectors

Rate limiting by IP address sounds simple. In practice, it requires the server to know the client's real IP — and when your API runs behind a load balancer, that's where the problem starts.

The AcmeCorp API reads the client's IP address from six forwarding headers: X-Forwarded-For, X-Real-IP, X-Originating-IP, X-Remote-IP, X-Client-IP, and True-Client-IP. This is a common pattern: load balancers inject these headers so the application can log the real client IP. The problem is that a client can set these headers just as easily as a proxy can. The API trusts whatever value arrives in the header without verifying it came from a known load balancer.

# Rate limited after 5 requests from 127.0.0.1
$ curl /api/users/search?q=admin
HTTP 429: Rate limit exceeded. Retry after 60s

# Add X-Forwarded-For — server sees a new IP, fresh counter
$ curl -H "X-Forwarded-For: 10.0.0.1" /api/users/search?q=admin
HTTP 200: {"results": [...], "ip_seen_as": "10.0.0.1"}

# Rotate IPs: 10.0.0.2, 10.0.0.3 ... 10.0.0.254
# Each "IP" gets 5 requests before it's limited
# 50 spoofed IPs = 250 requests. Practical limit: unlimited.

Time to confirm the first bypass: under 1 second. SecurityClaw tried all 9 headers. Six of them worked. Three returned an unmodified 429 because this application's IP-parsing logic does not read them. In practice, 6 working bypass headers means an attacker has 6 independent unlimited-request paths to the same endpoint.
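The arithmetic in the transcript above (50 spoofed IPs, 250 requests) can be reproduced with a small in-memory simulation. This is a sketch of the vulnerable pattern, not the AcmeCorp code: the function names are hypothetical, and only X-Forwarded-For is modelled. Nothing here touches the network.

```python
from collections import Counter

LIMIT = 5  # requests allowed per key per window (the demo's 5 per minute)


def naive_client_ip(remote_addr: str, headers: dict) -> str:
    """Vulnerable pattern: trust X-Forwarded-For from any sender."""
    return headers.get("X-Forwarded-For", remote_addr)


def simulate(requests: list) -> int:
    """Return how many requests pass a per-IP counter within one window."""
    counts = Counter()
    allowed = 0
    for remote_addr, headers in requests:
        key = naive_client_ip(remote_addr, headers)
        if counts[key] < LIMIT:
            counts[key] += 1
            allowed += 1
    return allowed


# 250 requests from one real address, no spoofing.
honest = [("127.0.0.1", {})] * 250

# The same 250 requests, rotating 50 spoofed addresses (10.0.0.1 .. 10.0.0.50).
spoofed = [
    ("127.0.0.1", {"X-Forwarded-For": f"10.0.0.{1 + i % 50}"})
    for i in range(250)
]

honest_allowed = simulate(honest)
spoofed_allowed = simulate(spoofed)
print(honest_allowed, spoofed_allowed)  # 5 250
```

Every request comes from the same connection address, yet the counter never fills because the key is whatever the client chose to send.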

The Headers That Work

Header	Bypass	Notes
`X-Forwarded-For`	✅ YES	Most common — standard proxy header
`X-Real-IP`	✅ YES	nginx convention
`X-Originating-IP`	✅ YES	IBM/older proxy convention
`X-Remote-IP`	✅ YES	Custom/legacy applications
`X-Client-IP`	✅ YES	Used by various proxies and CDNs
`True-Client-IP`	✅ YES	Cloudflare convention
`X-Remote-Addr`	❌ NO	Not parsed by this implementation
`X-Cluster-Client-IP`	❌ NO	Not parsed by this implementation
`Forwarded`	❌ NO	RFC 7239 standard format — not checked

The three headers that failed aren't safe — they simply weren't in this application's IP-parsing logic. A different API could parse all nine. The appropriate control is to stop trusting any of them unless the request comes from a verified trusted proxy address.

Finding 2 (HIGH): Endpoint Variation — The Forgotten v1 Route

The second bypass requires no headers at all. It requires knowing that APIs have version history.

# The protected endpoint (rate-limited)
GET /api/users/search?q=admin     → 5/min limit, 429 after attempt #5

# The forgotten endpoint (no protection)
GET /api/v1/users/search?q=admin  → HTTP 200, same data, zero rate limiting

Ten consecutive requests to /api/v1/users/search. Ten HTTP 200 responses. No 429. The rate-limit decorator was added to the current API version and never retrofitted to the legacy route — a pattern SecurityClaw has found in every API with a version bump in its history.

This is not a rare edge case. API versioning is standard practice. Rate limiting is often added reactively — after an incident, after a security review, after a pentest finding. When it's added to /api/v2/, /api/v1/ stays as-is. The old routes still respond. They still serve production data. They just don't have any of the new security controls.

The practical consequence: an attacker who runs SecurityClaw's ffuf campaign first (D16) will discover both endpoint versions automatically. The rate-limited current path (/api/users/search) is useless as a brute-force surface. The unprotected v1 path is wide open.

The Honest Misses

SecurityClaw's rate-limit-bypass skill tested 4 techniques. Two didn't work. We're documenting them here because the reasons they failed are operationally useful: they tell you what kind of rate limiter you're dealing with.

Case Manipulation: FAILED

Sending /API/USERS/SEARCH (uppercase) returned a 404. Flask is case-sensitive for route matching. This technique works better against Windows/IIS-hosted APIs where URL routing is case-insensitive by default. If you hit a 404 on case manipulation, the target most likely sits behind a case-sensitive routing layer, as is typical for Linux-hosted frameworks such as Flask. Mark it, move on.

Timing Delay: FAILED

Attempting to time requests near the 60-second window boundary had no effect. The rate limiter uses a fixed-window counter: it counts requests in discrete 60-second windows and resets only at the window boundary, regardless of when in the window you sent them. Timing attacks work best against implementations whose window or counter state shifts with request timing, or against token-bucket systems with misconfigured refill rates. Against the fixed-window counter here, careful timing can at most squeeze two windows' worth of requests around a boundary; it cannot reset the counter or lift the limit. This is actually the correct behaviour.

Bypass score: 2/4 techniques worked. The 6-header bypass and the v1 endpoint are independently sufficient. The two misses tell defenders exactly where their implementation is solid and where it isn't.

The Attack Chain: What This Unlocks

Rate limiting is a defence against enumeration and credential attacks. Bypass it and the entire protection chain collapses:

1. User enumeration (D18, this campaign). Fire unlimited requests at /api/v1/users/search. No rate limit. Extract the full user database: every username, email, and account identifier the search endpoint returns.
2. Credential attack (D15, Hydra). Feed the enumerated usernames to SecurityClaw's Hydra campaign. Target usernames are now known; no more guessing. The rate limit that was supposed to stop this was bypassed in step 1.
3. Account takeover. admin:password123 on attempt #43, in 4.8 seconds. The rate-limit bypass turned a 5-attempt-per-minute constraint into unlimited attempts. The Hydra campaign didn't need more than 43.

D15 + D18 is the complete account takeover chain. The rate limit was the only control between enumeration and credential compromise. It took 8.7 seconds to remove it.

What Actually Stops This

1. Never Trust Client-Controlled Headers for Security Decisions

If you must use forwarding headers for rate limiting, first verify the request comes from a trusted proxy. In Flask/Python: check request.remote_addr against your known load balancer IP range before trusting any X-Forwarded-For value. Better: move IP-based rate limiting to the infrastructure layer (nginx, API gateway, WAF), where the IP value comes from the actual network connection, not application-layer headers.
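One way to express that check is sketched below using only the standard library. The `TRUSTED_PROXIES` range and the function name `resolve_client_ip` are assumptions for illustration, and the sketch assumes a single proxy hop that appends the address it saw to X-Forwarded-For. In a Flask view, `remote_addr` would be `request.remote_addr` and `forwarded_for` would be `request.headers.get("X-Forwarded-For")`.

```python
import ipaddress

# Assumption: the load balancer's address range. Replace with your own.
TRUSTED_PROXIES = [ipaddress.ip_network("10.0.0.0/24")]


def resolve_client_ip(remote_addr: str, forwarded_for) -> str:
    """Return the address to rate-limit on.

    remote_addr   -- peer address of the TCP connection
    forwarded_for -- raw X-Forwarded-For header value, or None if absent
    """
    peer = ipaddress.ip_address(remote_addr)
    from_trusted_proxy = any(peer in net for net in TRUSTED_PROXIES)
    if not from_trusted_proxy or not forwarded_for:
        # Direct connection, or no header: ignore the header entirely.
        return remote_addr

    # Proxies append to the list, so read the rightmost entry: it is the
    # address our own proxy saw. Entries to its left are client-supplied.
    candidate = forwarded_for.split(",")[-1].strip()
    try:
        return str(ipaddress.ip_address(candidate))
    except ValueError:
        # Malformed header value: fall back to the connection address.
        return remote_addr


# A direct client spoofing the header is still counted by its real address.
print(resolve_client_ip("203.0.113.9", "10.0.0.1"))  # 203.0.113.9
# A request relayed by the trusted proxy uses the address the proxy appended.
print(resolve_client_ip("10.0.0.5", "1.2.3.4, 198.51.100.7"))  # 198.51.100.7
```

Reading the rightmost entry matters: a client can prepend any values it likes, but it cannot control what the trusted proxy appends. With more than one trusted hop, the same idea applies by walking from the right and skipping trusted addresses.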

2. Apply Rate Limits at the Infrastructure Layer

nginx, AWS API Gateway, Cloudflare Workers, or a WAF enforce rate limits before application code runs. They can key the limit on the connection's source address rather than on client-supplied headers. IP spoofing via headers doesn't work because the infrastructure reads the TCP connection's source IP, which a client cannot forge on an established connection.

3. Rate-Limit ALL Versions of Every Endpoint

Use middleware or gateway-level rules applied across all route prefixes — not per-route decorators. A single rule at the gateway: "all paths matching /api/*/users/search — 5 requests per minute per IP" covers v1, v2, v3, and any future versions simultaneously. Per-route decorators require manual application to every new version and every legacy path. They will be missed.
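If the limit has to live in application middleware instead of a gateway, the same effect can be approximated by normalizing the path before it becomes a counter key. The sketch below is an assumption-laden illustration: the function name `rate_limit_bucket` and the `/api/v<number>/` versioning scheme are hypothetical, and `path` is expected without a query string. It also maps the unversioned /api/users/search path from this demo to the same bucket as its versioned siblings.

```python
import re

# Optional version segment directly after /api, e.g. /api/v1/... or /api/v2/...
_VERSION_SEGMENT = re.compile(r"^/api/v\d+(?=/)", re.IGNORECASE)


def rate_limit_bucket(path: str, client_ip: str) -> str:
    """Build one counter key per (client, logical endpoint), ignoring the API version."""
    normalized = _VERSION_SEGMENT.sub("/api", path)
    return f"{client_ip}:{normalized}"


# All three routes share a single counter for the same client.
for route in ("/api/users/search", "/api/v1/users/search", "/api/v2/users/search"):
    print(rate_limit_bucket(route, "198.51.100.7"))
```

Because the key is derived from the logical endpoint, a request to a legacy route consumes the same budget as a request to the current one, and a future v3 is covered without anyone remembering to add a decorator.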

4. For Authenticated Endpoints, Rate-Limit by Identity, Not IP

A logged-in attacker with a valid session can rotate IP addresses freely. Rate-limiting by session token or user ID rather than IP means IP rotation provides no benefit — the identity is what's counted, not the source address. For unauthenticated endpoints (login forms, user search), combine infrastructure-layer IP limiting with bot detection (CAPTCHA, device fingerprinting, behavioural analysis).
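In code, identity-based limiting comes down to the choice of counter key. A minimal sketch, assuming a hypothetical `rate_limit_key` helper and a `user_id` that was taken from a server-verified session (never from a client-supplied field):

```python
from typing import Optional


def rate_limit_key(user_id: Optional[str], client_ip: str) -> str:
    """Count authenticated traffic per user and anonymous traffic per IP."""
    if user_id:
        # Assumption: user_id comes from a verified session or token.
        return f"user:{user_id}"
    return f"ip:{client_ip}"


# The same user rotating addresses still lands in one bucket.
print(rate_limit_key("alice", "10.0.0.1"))  # user:alice
print(rate_limit_key("alice", "10.0.0.2"))  # user:alice
# Anonymous traffic falls back to the connection address.
print(rate_limit_key(None, "10.0.0.1"))     # ip:10.0.0.1
```

The `user:` and `ip:` prefixes keep the two key spaces separate, so a username that happens to look like an IP address cannot collide with an anonymous bucket.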

5. Rate Limiting Alone Is Not Sufficient

Even a correctly implemented, unbypassable rate limit is a speed bump, not a wall. Against the top 500 most-common passwords at 5 requests per minute, an attacker needs 100 minutes — not a deterrent for an unmonitored endpoint. Hard lockout after N failures and MFA on privileged accounts are the controls that break the attack chain. Rate limiting buys you detection time; the other controls are what deny access.
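The 100-minute figure and the effect of a hard lockout can both be checked with a few lines. This is a sketch under stated assumptions: `minutes_to_exhaust` and `LockoutTracker` are hypothetical names, the threshold of 10 failures is an arbitrary example value for N, and a real deployment would persist the counters and define an unlock path.

```python
import math


def minutes_to_exhaust(wordlist_size: int, requests_per_minute: int) -> int:
    """Minutes needed to try every password at the rate limit's ceiling."""
    return math.ceil(wordlist_size / requests_per_minute)


class LockoutTracker:
    """Lock an account after `max_failures` consecutive failed logins."""

    def __init__(self, max_failures: int) -> None:
        self.max_failures = max_failures
        self.failures = {}

    def record_failure(self, username: str) -> None:
        self.failures[username] = self.failures.get(username, 0) + 1

    def record_success(self, username: str) -> None:
        # Callers should check is_locked() before verifying credentials.
        self.failures.pop(username, None)

    def is_locked(self, username: str) -> bool:
        return self.failures.get(username, 0) >= self.max_failures


print(minutes_to_exhaust(500, 5))  # 100

# Count how many guesses get evaluated before the account locks.
tracker = LockoutTracker(max_failures=10)
attempts_evaluated = 0
for _ in range(500):
    if tracker.is_locked("admin"):
        break
    attempts_evaluated += 1
    tracker.record_failure("admin")
print(attempts_evaluated)  # 10
```

With the rate limit alone, the whole 500-entry list is exhausted in 100 minutes. With a lockout at 10 failures, only 10 guesses are ever evaluated, no matter how fast they arrive.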

For the deep-dive on API authentication attack surfaces, Hacking APIs by Corey Ball covers rate-limit bypass, BOLA, BFLA, and mass assignment attacks with working examples. It's the most comprehensive API security testing reference available.

SecurityClaw Scorecard: D18

Metric	Value
Tool	SecurityClaw rate-limit-bypass
Headers tested	9
Headers that bypass	6 / 9 (67%)
Bypass techniques tested	4
Bypass techniques confirmed	2 / 4 (50%)
Endpoint variation bypass	✅ YES — /api/v1/ unprotected (10/10 requests)
Case manipulation	❌ NO — Flask case-sensitive routing (honest miss)
Timing delay	❌ NO — fixed-window counter immune (honest miss)
Total campaign time	8.7 seconds
Tests passing	10/10
Severity findings	2 × HIGH
Campaign result	PASS
Overall scorecard after D18	27/30 campaigns = 90.00%

The headline is 8.7 seconds. The more important number is 6 independent header bypass paths — any one of which is sufficient to grant unlimited requests to a rate-limited endpoint. The endpoint variation bypass is even simpler: it requires no headers at all, just knowledge of the API's history. Together, they mean the rate limit that was supposed to stop brute-force attacks provides no meaningful protection whatsoever.

This is D18 in the SecurityClaw demo series. It connects directly to D15 (Hydra) — where SecurityClaw cracked admin:password123 in 4.8 seconds — and to D16 (ffuf), which discovers the endpoint versions this bypass relies on. For additional reading on credential security and attack chain analysis, The Hacker Playbook 3 covers API enumeration and credential attack chains with real penetration test case studies.