Rate limiting, or restricting the number of requests a given client can make to your system over a particular time period, is most commonly thought of as a way to limit the load a client can place on your system. If your product is an API, this can be quite important, because you don't want one client (probably another business) making 100,000 requests a second and thereby bringing your system down for everyone. For consumer-facing applications like Felt, rate limiting can be just as important, if not more so (for reasons we will get into).
The needs of a client-facing application are quite different from an application where your API is the product. For starters, your clients are generally humans, rather than other scripts or other servers. That means the maximum amount of traffic that's reasonable for an individual client is probably measured in single digits of requests per second, rather than hundreds or thousands.
From an implementation perspective, rate limiting a client app diverges even further from rate limiting an API. Rather than keying on an API token (linked to things like billing or account limits), you'll instead need to rate limit based on IP address, session cookies, or something more creative. You might rate limit all requests to your site, or you may only care about requests to particularly sensitive routes (like the login or signup form).
What should you rate limit?
There are different use cases for rate limiting. The most obvious is to control load to your server: you wouldn’t want a single client able to make 100,000 requests a second and thereby create a denial-of-service attack that prevents the rest of your userbase from accessing the app. (Calling it a denial-of-service attack implies malevolence; it’s also possible for clients to put an undue strain on your server accidentally, usually by hitting routes that are computationally expensive for you. Of course, this happens way more frequently when your product is an API.)
The other big use case—and the reason rate limiting might be even more important for client-facing apps than APIs—is to prevent potentially abusive requests. Many requests to your login endpoint, for instance, probably indicate an attacker trying to gain unauthorized access. Likewise, many accounts being created by a single IP address almost certainly represents a bot with bad intentions.
Credential stuffing is an automated attack whereby a malicious actor tries username and password combinations leaked from across the internet in an attempt to find users who reused a compromised password on your site. This is closely related to brute force attacks, whereby an attacker targets a specific account and just starts guessing passwords.
For potentially sensitive endpoints like these, you could implement something like a CAPTCHA. These have fallen out of favor over the years, though, due to the poor user experience.
Instead, a better option is to aggressively rate limit those endpoints.
Along similar lines of preventing abuse, you might want to rate limit actions that can send an email to an arbitrary user. “Share” functionality falls into this category—it might be totally normal for someone to trigger 10 emails to their 10 team members to collaborate on a particular document, but if they’ve triggered, say, 500 emails, you can almost guarantee it’s spam. Over the years, this type of attack has been used by spammers against services like Google Docs and iCal to send messages to huge numbers of people while leveraging the reputation and resources of mega corporations. For smaller companies, there is serious risk of ruining your email-sending server’s IP reputation, and thereby getting all your emails routed to people’s spam folder. (This might also qualify as a “cash overflow” attack, where your company foots the bill for spam messages until you hit a spending limit.)
Rate limiting policies in action
When we rate limit client requests, there are two primary policies we want to use: throttling and banning.
Throttling
With throttling, you set a cap on the number of actions the client can take (say, 10 login requests per minute, or 1,000 requests an hour to all endpoints combined). If the client exceeds those limits, you reject the request immediately, before doing any real work, which saves load on your server. You might deliver an HTTP 429 (Too Many Requests) response, or (depending on how cryptic you want to be) a 400, 401, or 403. You continue refusing the client’s requests to that endpoint (or the whole system) until the time period is up, after which your tracker resets and you allow them back in.
Thus, in a throttling scenario, if your rate limit was 5 login attempts per minute, throttling only that endpoint, the client might see this:
- POST <p-inline>/login<p-inline>: 403 ❌ (wrong password)
- POST <p-inline>/login<p-inline>: 403 ❌ (still the wrong password)
- POST <p-inline>/login<p-inline>: 403 ❌
- POST <p-inline>/login<p-inline>: 403 ❌
- POST <p-inline>/login<p-inline>: 403 ❌
- POST <p-inline>/login<p-inline>: 429 ⏰ (rate limit exceeded, regardless of whether the password was correct)
- POST <p-inline>/login<p-inline>: 429 ⏰ (still rate limited)
- GET <p-inline>/<p-inline>: 200 ✅ (different endpoint; rate limit doesn’t apply)
- POST <p-inline>/login<p-inline>: 429 ⏰
- (Wait 60 seconds)
- POST <p-inline>/login<p-inline>: 403 ❌ (wrong password)
- POST <p-inline>/login<p-inline>: 200 ✅ (finally the correct password)
As you can see, the throttling policy is potentially useful, but it probably doesn't make sense for security-sensitive routes—giving someone running a credential-stuffing attack 5 requests/minute amounts to 7,200 attempts per day.
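To make the throttling policy concrete, here is a minimal sketch of a fixed-window throttle. It is written in Python purely for brevity and is not Felt's implementation; the `Throttle` class, the `login_status` helper, and the status codes it returns are hypothetical names chosen to mirror the request sequence above. It assumes a single process with in-memory state, so a real deployment would also need shared storage and cleanup of idle keys.

```python
import time


class Throttle:
    """Fixed-window throttle: at most `limit` hits per `period` seconds per key."""

    def __init__(self, limit, period, clock=time.monotonic):
        self.limit = limit
        self.period = period
        self.clock = clock  # injectable so the sketch can be exercised without sleeping
        self.hits = {}      # key -> (window index, hits seen in that window)

    def allow(self, key):
        window = int(self.clock() // self.period)
        seen_window, count = self.hits.get(key, (window, 0))
        if seen_window != window:
            count = 0  # a new window has started, so the tracker resets
        count += 1
        self.hits[key] = (window, count)
        return count <= self.limit


def login_status(throttle, ip, password_ok):
    """Hypothetical handler: consult the throttle before checking the password."""
    if not throttle.allow((ip, "/login")):
        return 429  # rate limited, regardless of whether the password was correct
    return 200 if password_ok else 403
```

Because the key is the pair of IP address and route, requests to other endpoints are unaffected. Note that the windows here are aligned to the clock, so a blocked client waits at most one full period rather than exactly one.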
There's another policy, though, more appropriate to sensitive requests.
Fail-to-ban
Modeled after the extremely popular fail2ban command line utility, the idea behind this policy is that after exceeding the rate limit, the client is prevented from making more requests for a significantly longer period.
For instance, your policy might be something like: allow up to 10 login attempts from a given IP address in a minute; if that IP tries to log in an 11th time, prevent all logins from it for 24 hours.
This is quite a bit more aggressive, but for security-sensitive endpoints, you can probably tweak the thresholds to be something you’re comfortable with.
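As a rough illustration of the difference from throttling, the following Python sketch implements the example policy above: 10 attempts per minute, then a 24-hour ban. The `FailToBan` class and its parameter names are hypothetical, state is kept in memory for a single process, and this is not how the fail2ban utility itself is implemented.

```python
import time


class FailToBan:
    """Allow `limit` attempts per `period` seconds; one more bans the key for `ban_for`."""

    def __init__(self, limit, period, ban_for, clock=time.monotonic):
        self.limit = limit
        self.period = period
        self.ban_for = ban_for
        self.clock = clock      # injectable so the sketch can be exercised without sleeping
        self.attempts = {}      # key -> timestamps of recent attempts
        self.banned_until = {}  # key -> time at which the ban lifts

    def allow(self, key):
        now = self.clock()
        until = self.banned_until.get(key)
        if until is not None:
            if now < until:
                return False  # still banned; attempts during a ban are not counted
            del self.banned_until[key]
        # Keep only the attempts that fall inside the sliding period.
        recent = [t for t in self.attempts.get(key, []) if now - t < self.period]
        recent.append(now)
        if len(recent) > self.limit:
            self.banned_until[key] = now + self.ban_for
            self.attempts.pop(key, None)
            return False
        self.attempts[key] = recent
        return True


# The example policy from the text: 10 logins per minute, then a 24-hour ban.
login_policy = FailToBan(limit=10, period=60, ban_for=24 * 60 * 60)
```

The important design choice is that the ban outlives the counting period: once tripped, waiting out the minute no longer helps the client.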
Getting the rate limits right without impacting customers
If you’re unsure whether your rate limits are too strict, you can test them in a couple of different ways.
First, you might do a historical analysis. Working from your server’s past access logs, you can analyze how often clients would have tripped a rate limiter. If there are outliers, you can look at the traffic patterns and use your best judgement about whether that’s usage you’d like to support, or whether it’d be okay to cut them off.
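A historical analysis along these lines can be quite small. The Python sketch below assumes the access logs have already been parsed into `(unix_timestamp, client_ip)` pairs, which is an assumption about your log pipeline, and it counts fixed windows rather than sliding ones. The function name `would_have_tripped` is hypothetical.

```python
from collections import Counter


def would_have_tripped(requests, limit, period):
    """Return {client: number of windows in which it made more than `limit` requests}.

    `requests` is an iterable of (unix_timestamp, client_ip) pairs.
    """
    # Count requests per client per fixed window of `period` seconds.
    per_window = Counter((client, int(ts // period)) for ts, client in requests)
    tripped = Counter()
    for (client, _window), hits in per_window.items():
        if hits > limit:
            tripped[client] += 1
    return dict(tripped)
```

Clients that show up in the result are the outliers worth inspecting by hand before you commit to a limit.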
If that’s not an option (either because working with the log data would be tedious, or because it’s totally infeasible at your scale), you can roll out a “dry run” rate limiter into production. The idea here is that you log (but don’t actually block) traffic based on the same ruleset you’re considering implementing for real. Let it run in production for a couple weeks and see how many requests (or IPs) would have been blocked, and tweak the rules from there to gain confidence that they’ll be safe to roll out for real.
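One way to structure a dry run is to wrap whatever limiter you are evaluating so that it records would-be blocks but lets the traffic through. The Python sketch below is only an illustration: `DryRun`, its `enforce` flag, and the `FirstN` stand-in limiter are hypothetical names, and the wrapper assumes the wrapped limiter exposes an `allow(key)` method that returns a boolean.

```python
import logging

logger = logging.getLogger("rate_limit")


class FirstN:
    """Stand-in limiter for the example: allows only the first `n` calls per key."""

    def __init__(self, n):
        self.n = n
        self.seen = {}

    def allow(self, key):
        self.seen[key] = self.seen.get(key, 0) + 1
        return self.seen[key] <= self.n


class DryRun:
    """Wrap a limiter; log its refusals, and only act on them when `enforce` is set."""

    def __init__(self, limiter, enforce=False):
        self.limiter = limiter
        self.enforce = enforce
        self.would_block = 0  # how many requests the ruleset refused

    def allow(self, key):
        if self.limiter.allow(key):
            return True
        self.would_block += 1
        logger.warning("rate limit exceeded for %r (enforced=%s)", key, self.enforce)
        return not self.enforce  # in a dry run, the request still goes through
```

Flipping `enforce` to true turns the same ruleset into a real limiter, so the rules you observed are exactly the rules you ship.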
Doing this in Elixir
Here at Felt, our backend is written in Elixir. We have the benefit of the outstanding PlugAttack library, created by Michał Muskała, and of an excellent guide to configuring it by Michael Lubas of the Elixir-focused security company Paraxial.io. We followed that guide more or less to a T. Ultimately, our implementation ended up looking like this:
We create a Plug that implements our rule set (powered by runtime-configurable limits). Security-sensitive routes use the <p-inline>fail2ban<p-inline> policy, while everything else gets a simple rate limit. Our custom plug gets stuck into our default <p-inline>:browser<p-inline> pipeline, which then gets built upon for all our routes.
Go forth and rate limit!
Rate limiting is important both for keeping your service healthy and for preventing abuse. You don’t want to get paged at 3 am because your app was brought down by a rogue script kiddie, but you especially don’t want to be responsible for a breach of user data.
Comments? Questions? You can give me a shout on Twitter at @TylerAYoung, or tweet at us at @felt.
Best of luck, and may the rate limits be ever in your favor.