Using rate limiters to build a reliable API

Kevin Chiu, Software Engineer at Front

17 April 2020

How do you ensure all API users get a reliable experience? Front Software Engineer Kevin Chiu explains how the team built rate limiters for stability.

At Front, we value transparency — in our company and our product. In this series, Front Engineering Stories, our engineering team will share insights into our engineering philosophy, our unique challenges, and visions for the future of our product.

Front Software Engineer Kevin Chiu explains how the team created a more reliable API by extending the existing rate limiter.

Ensuring a reliable API experience with rate limiters

Front’s API lets users customize Front and build programmatic workflows around their inboxes. Businesses use our API to connect Front with third party apps, handle messages, trigger actions in other tools, and more, all from their inboxes.

API users also partner with Front to augment their product’s offering with Front’s core competencies. Some examples include seamlessly switching between bot and human users, synchronizing contacts, and adding collaboration to channels.

Behind the scenes, Front’s API depends on rate limiters, which prevent APIs from overuse by limiting the API usage per consumer. Our rate limiters ensure every user gets a highly available and reliable API experience with Front.

Challenges with the existing system

Front’s API has an egalitarian rate limiter that treats all requests as if they impact our system equally. This is our primary, top-level rate limiter, which blocks users from making additional requests once they have consumed all their allocated API requests. In reality, some requests create a much heavier load on the system than others.

For example, updating 100 conversations in 1 minute does not have the same implications as updating one conversation 100 times in 1 minute.

Iterating to keep a reliable API

As Front scales and more users adopt our API, we expose our API to intentional and unintentional abuse. Additionally, we want to proactively prevent the potential for a single bad actor to adversely affect the availability and reliability of our API for other users.

Given that the top-level rate limiter could be susceptible to abuse, we needed to explore different avenues for keeping our API reliable.

Building a supplementary rate limiter

We’ve seen cases in which users have errant scripts repeatedly performing the same request in a short period of time. We also have legitimate use cases in which API users maintain bots that send bursts of messages in a short time period. Simple endpoints such as fetching a list of users or tags are relatively cheap (as they’re cached), so they do not need to be limited.

How can we design a system that can accommodate these scenarios? Our team wanted a supplementary rate limiter that would be flexible enough to:

  • Prevent intentional and unintentional malicious use

  • Protect our infrastructure and users from spam requests and attacks

  • Be implemented without affecting existing users’ key workflows

  • Easily add rate limiting protection to any endpoint, new or existing

  • Support different rate limiting rules for different endpoints

      ◦ Updating the same conversation repeatedly is unacceptable

      ◦ Updating different conversations is acceptable

  • Allow for bursts of requests

Our process

  1. Review all API endpoints to determine how they affect Front’s infrastructure

  2. Monitor existing API traffic to determine what was an acceptable upper bound to put on the limiter

  3. Categorize endpoints and determine the limit for each category

  4. Implement based on requirements above

  5. Ship it! 🚀

What we built

We decided to supplement our primary rate limiter with additional protection for resource-intensive endpoints. We ended up building a drop-in, customizable rate limiter that accounts for both customer use cases and Front’s system load.

This was implemented as a new piece of middleware that would accept as parameters:

1. The category the endpoint falls in

2. The dimensions for determining if a request should be rate limited
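As a sketch, the middleware factory described above might look like the following TypeScript, reduced to the key-building step. The names, types, and key format here are assumptions for illustration, not Front’s actual implementation:

```typescript
// Hypothetical sketch: a factory that takes the endpoint's category and a
// function extracting the dimensions used to scope the rate limit.
type Category = "X" | "Y";

interface RateLimitedRequest {
  userId: string;
  params: Record<string, string>;
}

function rateLimit(
  category: Category,
  dimensions: (req: RateLimitedRequest) => string[]
) {
  // Returns a function that builds the rate-limit key for a request.
  // Requests that produce the same key share the same budget.
  return (req: RateLimitedRequest): string =>
    [category, req.userId, ...dimensions(req)].join(":");
}

// Example: limit "update conversation" per user, per conversation.
const updateConversationLimit = rateLimit("X", (req) => [
  req.params.conversation_id,
]);

const key = updateConversationLimit({
  userId: "u1",
  params: { conversation_id: "cnv_123" },
});
// key === "X:u1:cnv_123"
```

Because the key includes the conversation ID, spam-updating one conversation exhausts that key’s budget without touching the budget of any other conversation.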

Take a simple example, the “update conversation” endpoint:

PATCH /conversation/:conversation_id

Step 1. We could categorize it as Category X, which supports a burst of Y requests. Different categories can support different burst sizes.

Step 2. For this example endpoint, if we wanted to prevent spam updating the same conversation and allow updating different conversations, the dimensions passed into the middleware would be the conversation ID of the incoming request.

This means that for the same conversation, requests after request number Y will be rate limited, but requests to different conversations are free to be processed.

Under the hood, our middleware uses an open source node package to manage rate limiting (fun fact: an engineer from our team contributed back to the project 🎉). The package implements a sliding window rate limiting algorithm to manage a sorted set of request timestamps. We rely on Redis to maintain the underlying sorted set for the sliding window algorithm. To keep Redis memory usage low, we use a small time window and evict keys upon expiration.
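To illustrate the sliding window approach, here is a minimal in-memory sketch in TypeScript. The production version described above keeps these timestamps in a Redis sorted set (eviction there would be a range deletion by score); the class name, limit, and window size below are invented for the example:

```typescript
// In-memory stand-in for the Redis sorted set of request timestamps.
class SlidingWindowLimiter {
  private hits = new Map<string, number[]>(); // key -> timestamps (ms)

  constructor(
    private readonly limit: number, // max requests per window (burst Y)
    private readonly windowMs: number // window size in milliseconds
  ) {}

  // Returns true if the request is allowed, false if rate limited.
  allow(key: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Evict timestamps that slid out of the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.limit) {
      this.hits.set(key, recent);
      return false;
    }
    recent.push(now); // record this request
    this.hits.set(key, recent);
    return true;
  }
}

// Allow a burst of 3 updates per conversation per second.
const limiter = new SlidingWindowLimiter(3, 1000);
const results = [1, 2, 3, 4].map(() => limiter.allow("cnv_123", 1000));
// results: [true, true, true, false] — the 4th request in the window is limited
const other = limiter.allow("cnv_456", 1000); // different conversation: allowed
```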

Like most rate limiters, ours will also send a Retry-After header to notify users when it is safe to retry their request.
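One plausible way to derive the Retry-After value from a sliding window (the exact formula is an assumption, not from the post): the client can safely retry once the oldest request in the window slides out of it.

```typescript
// Hypothetical: seconds until the oldest timestamp leaves the window,
// i.e. until one slot of capacity frees up.
function retryAfterSeconds(
  oldestHitMs: number,
  windowMs: number,
  nowMs: number
): number {
  const waitMs = oldestHitMs + windowMs - nowMs;
  return Math.max(0, Math.ceil(waitMs / 1000));
}

// If the oldest request in a 60s window happened 45s ago, retry in 15s.
const retryAfter = retryAfterSeconds(0, 60_000, 45_000);
// retryAfter === 15
```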

The results & the future

Thanks to our Infrastructure team, we were able to easily canary this change to ensure we were not breaking user workflows. We implemented this without user friction, so API users could continue to use Front’s API to support their business’s key workflows. Behind the scenes, we added the extra security and protections needed to help continue to deliver an enjoyable API experience!

Over the past 30 days, we’ve been able to track spikes in rate-limited traffic (accounting for 0.3% of requests), helping us keep our API service available and reliable.

As we continue to grow, we’ll need to find more ways to scale and improve our API for all users. The next improvement of the rate limiter could be implementing load shedding to drop non-critical requests, or implementing shadow-banning so rate limited bad actors are not immediately aware they have been rate limited. We’ll continue to re-evaluate our rate limiter as our API usage grows.

🇺🇸 We’re hiring! 🇫🇷

Thank you for taking the time to read this. I hope you have a better idea of how we identify challenges, solve problems, and design new systems at Front!

Looking to work on challenging engineering problems with a creative and driven team? We’re hiring in our San Francisco and Paris offices!

