Load Testing - How much traffic can I handle?

A simple guide to basic load testing, with examples. No new tools required, just some simple commands!

Zayd Simjee
Jun 22, 2021

4 min read

This is the first post in a series on getting services and apps production ready.

Why do I need to load test?

You had an awesome private beta running on one Digital Ocean or Heroku or AWS EC2 box. Your customers are impressed by your product and how quickly you're able to deliver the features they need. But now, you're trying to take your beta public. How do you know if you’re able to deal with 200 customers instead of 20?

Usually, you don't. You have a hunch that you can probably handle all 200 customers at once, but you don't have real confidence in that answer, so you figure you'll find out as you go. But why risk an outage when you can answer that question within a few minutes?

The best way to know is by testing! We've all heard of load testing, but it sounds like a big investment to onboard an open source tool like JMeter, Gatling, or Locust. These tools are great to have and make it easy to fire millions of requests at your endpoints, but at our scale that's overkill. Instead, we can use some simple command line utilities to run lighter load tests.

Before testing

Figure out which environments you're going to run the load tests against. It's usually a good idea to spin up an identical parallel environment so that you don't cut off service for your customers when something goes wrong. As for where you run the tests from, the strategy outlined below can be run directly from your laptop. If your computer can't handle the number of parallel requests you need for any reason, you can always have a teammate or friend pitch in and run the commands at the same time.

What to test

I recommend testing three load cases in particular.

  1. Think of how many requests you expect at once, and multiply that number by 10. Then run that many parallelized curl commands against your endpoints. You'll know pretty quickly how much load your servers can handle once you start getting HTTP 500 errors or timeouts back from your service. If the APIs are authenticated, try to use different users.

  2. Make requests using large request bodies. This is relevant for APIs that take inputs. I've seen cases where large requests have seriously bogged down or totally imploded a service. I recommend sending requests with bodies that double in size each round until you find how big a request you can actually handle. Then, limit request sizes on your server, giving yourself some headroom. Most server frameworks have a native way to do this; below is an example from express:

    const bodyParser = require('body-parser')
    app.use(bodyParser.json({ limit: '[max request size]mb' }))
  3. If you have throttling limits in place, make sure they work! Throttling limits are frequently misconfigured in subtle ways, and you won't find out until your service goes down. Run the same parallelized curl commands, but shape the request pattern so that it should trigger your throttling rules. If throttling isn't triggered, fix the rules and try again.
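The doubling strategy from step 2 can be sketched as a small shell loop. Everything here is illustrative: `[endpoint]` is a placeholder, and the curl line is commented out so you can adapt it to your own API.

```shell
# Double the payload size each round until the service starts failing.
size_kb=1
for round in 1 2 3 4 5; do
  # Build a JSON payload with roughly ${size_kb}KB of filler data.
  filler=$(head -c "$((size_kb * 1024))" /dev/zero | tr '\0' 'x')
  payload="{\"data\":\"$filler\"}"
  echo "round $round: payload is ${#payload} bytes"
  # Placeholder request -- swap in your endpoint and headers:
  # curl -s -o /dev/null -w '%{http_code}\n' -X POST \
  #   -H 'Content-Type: application/json' --data "$payload" [endpoint]
  size_kb=$((size_kb * 2))
done
```

Watch the returned status codes: the first round that produces 500s or timeouts tells you roughly where to set your server-side limit, minus some headroom.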


A few ways to run parallelized curl commands - stuff you need to change is in square brackets [].

# 1. single-line command in one step
curl --parallel --parallel-immediate --parallel-max [number of requests] [endpoint pasted the number of times you want to make a request to it]

# Example:
curl --parallel --parallel-immediate --parallel-max 2 google.com google.com

# 2. Using a config file
# first, create a config file with entries that look like
# url = [endpoint]
# Optionally, you can write the output of each request to a file and analyze those files afterwards by adding a line after each url entry formatted as 
# output = [filename]
# then run your curl commands
curl --parallel --parallel-immediate --parallel-max 2 --config [config file]

# Example:
touch urls
echo 'url = google.com' >> urls
echo 'output = req1.txt' >> urls
echo 'url = google.com' >> urls
echo 'output = req2.txt' >> urls
curl --parallel --parallel-immediate --parallel-max 2 --config urls
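To verify throttling (step 3), collect one status code per request and count how many came back as HTTP 429 (Too Many Requests). This is a sketch: the burst itself is shown in comments with a placeholder endpoint, and a small sample file stands in for real results so the counting logic runs as-is.

```shell
# Fire the burst, writing one status code per line (placeholder endpoint):
#   for i in $(seq 1 20); do
#     curl -s -o /dev/null -w '%{http_code}\n' [endpoint] >> codes.txt &
#   done
#   wait
# Sample results stand in for a real run:
printf '200\n429\n429\n200\n429\n' > codes.txt
throttled=$(grep -c '^429$' codes.txt)
total=$(grep -c '' codes.txt)
echo "$throttled of $total requests were throttled"
```

If the throttled count is zero at a request rate that should exceed your limits, the throttle config isn't doing what you think it is.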

Keep track of all of these lightweight load tests in scripts. They can take you to a pretty decent scale, and when you do decide to onboard a tool later on, these scripts will be a useful starting point for the initial tests in those frameworks.
