Rate limiting that won't take your API down
Set your sustained load and how hard it spikes. Rate limiting has to hold at peak — it's the thing meant to catch the spike — so we size the Redis fleet you'd run for that peak, the engineer who babysits it, and the rewrite it needs past ~50,000 rps. Then show what Ratelimitly does instead: absorb spikes to 1M rps at sub-millisecond p99, isolated so it can't take your service down, with nothing to integrate.
DIY rate limiting shares fate with your stack. A bad Redis deploy, a noisy neighbor, an OOM — and either the limiter fails (overload storms in) or your API stalls waiting on it.
Ratelimitly runs outside that blast radius: isolated, in-kernel, answered over UDP. It can't take your service down.
And it absorbs spikes: a Redis fleet sized for your average falls over exactly when the burst hits. The per-request price is a rounding error next to surviving that.
And at 385 rps it's $3,324/mo cheaper than DIY.
Same workload, three ways to run it (lower bar = cheaper):
DIY Redis / ValKey
$8,301/mo- 2 nodes, provisioned for 3,850 rps peak
- $1,800
- Cross-AZ bandwidth (149 GB/mo)
- $1
- Engineer time
- $6,500
- ⚠ A TCP round-trip per check — by Little's Law in-flight = λ·W, so as load nears capacity W (and p99) blows up.
- ⚠ Share it with another feature and one failure cascades into the other — teams end up paying for dedicated instances.
Ratelimitly
$4,976/mo Pro · HA (2 instances) included- ✓ Handles 385 rps today — up to 1M rps per key with no change. A single key is serialized, so beyond 1M you partition the keyspace across queues/instances (each key ≤ 1M) — linear scale, no rewrite.
- ✓ Sub-millisecond p99 over UDP, in the kernel via XDP — even at 1M rps.
- ✓ Isolated. It can't take your API down, and your API can't take it down.
- ✓ Drops in at the reverse proxy (nginx & friends) — often zero app code. Retrofit rate limiting onto anything, even closed source.
- ✓ HA included. Every plan runs 2 instances — high availability is built into the price, not a paid add-on.
Ratelimitly On-prem
$1,966/mo Starter · up to 100k rps- 2 servers (2/region · your hardware)
- $1,800
- License (Starter, ≤ 100k rps)
- $166
- ✓ On your hardware — no egress, lowest latency, traffic never leaves your network.
- ✓ No on-call, no rewrite — same XDP engine we run, supported under the license.
- ✓ Rate-capped from $2,000/server/yr — bump the cap with a license key, up to 64M rps/box.
Never interrupts your service
Isolated by design. A shared Redis can fail and take rate limiting with it — or the reverse. Ratelimitly runs outside your stack's blast radius — with HA (2 instances) included on every plan.
Scales to 1M rps per key
Redis tops out near 50k rps (then a C rewrite); past ~500k you're into accelerated-networking R&D — kernel bypass, specialized knowledge, a year-plus of work. We did that with XDP: a million rps on one key, day one — partition the keyspace to go beyond.
Sub-millisecond p99
~1 µs of in-kernel service over UDP keeps p99 under a millisecond even at full tilt (Little's Law: tiny W ⇒ tiny in-flight). Redis adds a TCP round-trip to every check.
No developer required
Deploy at the reverse proxy and limit traffic to systems that don't support it — including closed-source products you can't change.
Correct by design
No read-modify-write races, no lost counters, no double-spend — the failure modes a homemade Redis limiter is full of.
Built for spikes
Absorbs bursts to 1M rps with nothing to pre-provision. A Redis fleet sized for your average falls over when the spike hits — and the spike is when rate limiting matters most.
Method — the laws behind the numbers
Little's Law (L = λ·W). A serialized resource's max throughput is 1 ÷ service time. A Redis check ≈ 20 µs ⇒ ~50,000 rps/key; our in-kernel XDP check ≈ 1 µs ⇒ 1M rps/key — the edge is just less service time (no syscall, TCP, or copy). In-flight N = λ·W, so sub-ms W also means far less concurrency to hold the same rate.
Universal Scalability Law X(N) = N ÷ (1 + α(N−1) + βN(N−1)). One exact counter is serialized (α→1) and a distributed one adds coherency (β>0) ⇒ X(N) ≤ 1 and retrograde: more Redis nodes cannot raise a single key's throughput. Independent keys have α,β ≈ 0 ⇒ X(N) ≈ N, so aggregate scales by partitioning the keyspace — one pipeline per NIC queue for us.
Two physical walls. ~50,000 rps/key (serialization) and ~500,000 rps, where a conventional kernel network stack saturates on packet rate and needs kernel bypass (XDP/DPDK).
Per-key ceiling from Little's Law: 1e6 ÷ 20 µs Redis service ≈ 50,000 rps; our XDP path ≈ 1 µs ⇒ 1M. DIY = (1 region × (1 primary + 1 replica)) at $1/hr, cross-AZ egress at $0/GB on a Redis payload ~5× our compact binary, 0.5 engineer FTE at $13,000/mo, plus a 4-month C rewrite (amortized) once a region's peak (10× sustained) crosses the per-key ceiling — or ~12 months of accelerated networking past ~500,000 rps. Ratelimitly priced from the public plans, HA (2 instances) included; DIY pays for its replicas separately. Estimates for comparison, not a quote.