For many companies, performance is the main reason to go with GraphQL. But is that a valid argument? Developers often compare GraphQL to REST APIs and cite the N+1 problem (under-fetching) and over-fetching as important reasons to choose GraphQL. Let's put that to the test and explore whether GraphQL APIs can actually outperform existing REST APIs. We'll take a GraphQL-ized REST API, test its performance with the popular performance testing tool k6, and compare the results to the REST approach.

Explore a GraphQL API

Before exploring a GraphQL API, let's learn more about the technology. GraphQL is a query language for APIs, designed by Facebook (now Meta) in 2012 to handle API requests on low-bandwidth networks. After being used internally, GraphQL was open-sourced in 2015. Since 2019 its trademark has been owned by the GraphQL Foundation, securing its future outside Meta. GraphQL has seen huge adoption both in the open-source community and among enterprises.

The GraphQL query language depends on a schema that defines all the operations you can use to request or mutate data, along with the response types of those operations. The schema for a GraphQL API that returns data about fictional posts could look like this:

type Post {
  id: ID!
  userId: ID!
  title: String
  body: String
}

type Query {
  getPosts: [Post]
  getPost(id: ID!): Post
}

Two operations of the type Query are defined in this schema, both with the response type Post: you can either query a list of all posts, or pass the identifier id to get a specific post. To get a single post, you send a request to this GraphQL API with a body that includes a query like this:

query {
  getPost(id: "1") {
    id
    title
    body
  }
}

The response of the GraphQL API follows the shape of the type Post and includes only the fields that are requested in the query. The query above doesn't include the field userId, so it will not be part of the response. The post that matches the value you pass for the parameter id is returned in JSON format. If you change the query, for example by adding more fields that you want to retrieve, these fields will be added to the response.
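To make the field selection concrete, here is a minimal sketch in plain JavaScript. It is not how a GraphQL server is implemented: the selectFields helper and the sample values are made up for illustration, and only mimic how the fields in the query determine which fields end up in the response.

```javascript
// Hypothetical post record; the values are placeholders, not real API data.
const post = {
  id: '1',
  userId: '1',
  title: 'Example title',
  body: 'Example body',
};

// Keep only the requested fields, the way a selection set shapes a response.
function selectFields(record, fields) {
  return Object.fromEntries(fields.map((field) => [field, record[field]]));
}

// Same selection as the query above: id, title and body, but no userId.
const response = {
  data: { getPost: selectFields(post, ['id', 'title', 'body']) },
};

console.log(JSON.stringify(response, null, 2));
```

The printed object has the same nesting as the query, wrapped in a top-level data key, and userId is absent because it was never requested.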

Next to Query, an operation can also be a Mutation (for mutating data) or a Subscription (for real-time or stream data).

To try out this query, you can use StepZen to convert a REST API to a GraphQL API using the CLI, as I've done for JSONPlaceholder. This free REST API has mocked data for posts, users, and comments, which you can now query using GraphQL!

Most GraphQL APIs, like this one, come with GraphiQL, an open-source IDE to interact with GraphQL APIs. A sample query is already added for you, but you can make any change. The GraphiQL interface looks like the following:

Query a StepZen API in GraphiQL

You can try out this query on this deployed demo endpoint. The query you're sending to the GraphQL API is shown on the left-hand side of the screen, while the right-hand side shows the response. The query is the same as we've described before, but this time the query is named GetPost. Naming queries is a recommended pattern and helps GraphQL APIs with, for example, caching. Also, the response has the same shape as the query.

But you're not limited to using something like GraphiQL to interact with a GraphQL API. GraphQL is a transport-agnostic query language, but most implementations use GraphQL-over-HTTP. This means that you can send a request to a GraphQL API using HTTP(S), similar to REST APIs. The requests are formatted differently from requests to REST APIs, though: you typically use the HTTP method POST, both when retrieving and when mutating data, and always include a request body.

This will translate to the following if you're using JavaScript to send HTTP requests and want to send a request to the GraphQL API.

fetch('https://public3b47822a17c9dda6.stepzen.net/api/with-jsonplaceholder/__graphql', {
  method: 'POST',
  mode: 'cors', // allow the cross-origin request
  headers: {
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    query: `
      query GetPost {
        getPost(id: "1") {
          id
          title
          body
        }
      }
    `,
  }),
});

Note that the content type is set to application/json, as GraphQL relies on JSON both for the request body and for the response.

After exploring this GraphQL API, let's set up k6 so we can use it to test GraphQL in the next section.

Set up k6 for GraphQL

The GraphQL API that we explored in the previous section is implemented using GraphQL-over-HTTP, so you can send requests to it as you would to any REST API. The JavaScript snippet to fetch the data already showed how to do this. The snippet needs to be altered a little to work with the http.post function from k6:

import http from 'k6/http';

const query = `
  query GetPost {
    getPost(id: "1") {
      id
      title
      body
    }
  }
`;

const headers = {
  'Content-Type': 'application/json',
};

export default function () {
  http.post(
    'https://public3b47822a17c9dda6.stepzen.net/api/with-jsonplaceholder/__graphql',
    JSON.stringify({ query }),
    { headers }
  );
}

This k6 script can now hit the GraphQL API with the query to test its performance. But querying the same post on every request doesn't tell us much about the performance of the GraphQL API. Instead, we can use variables in the GraphQL query and pass a random post id on every request (the REST API contains 100 posts, with ids 1 through 100). To do this, we first need to add a variable to the query. The same query with a dynamic value for id looks like this:

query GetPost($id: ID!) {
  getPost(id: $id) {
    id
    title
    body
  }
}

When sending the request, you need to include a JSON object containing a value for id alongside your GraphQL query. If you visit the GraphiQL interface, you can use the "Query Variables" tab for this:

Query a StepZen API with Arguments in GraphiQL
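Outside GraphiQL, the query and its variables travel together in a single JSON request body. The sketch below shows what that body could look like. The buildGraphQLBody helper is hypothetical, and the operationName key is an optional part of the usual GraphQL-over-HTTP request format that is only needed when a document contains more than one operation.

```javascript
const query = `
  query GetPost($id: ID!) {
    getPost(id: $id) {
      id
      title
      body
    }
  }
`;

// Hypothetical helper: serialize a GraphQL request body.
// variables holds the values for the $-prefixed variables in the query.
function buildGraphQLBody(query, variables = {}, operationName) {
  const payload = { query, variables };
  if (operationName) payload.operationName = operationName;
  return JSON.stringify(payload);
}

const body = buildGraphQLBody(query, { id: '1' }, 'GetPost');
console.log(body);
```

Because the query text stays constant and only the variables object changes, the same query string can be reused for every request in a load test.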

The updated version of the k6 script, which tests the GraphQL API with a dynamic value for the query variable, can be seen below. As suggested earlier, it passes a random id on every request, picked to cover the 100 posts present in the GraphQL-ized REST API:

import http from 'k6/http';

const query = `
  query GetPost($id: ID!) {
    getPost(id: $id) {
      id
      title
      body
    }
  }
`;

const headers = {
  'Content-Type': 'application/json',
};

export default function () {
  http.post(
    'https://public3b47822a17c9dda6.stepzen.net/api/with-jsonplaceholder/__graphql',
    JSON.stringify({
      query,
      // Random post id from 1 to 100 (JSONPlaceholder has no post with id 0)
      variables: { id: Math.floor(Math.random() * 100) + 1 },
    }),
    { headers }
  );
}
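Random ranges are easy to get off by one, and JSONPlaceholder post ids run from 1 through 100. A small helper makes both bounds explicit. This is only a sketch: randomId is a made-up name and not part of k6.

```javascript
// Hypothetical helper: random integer between min and max, both included.
function randomId(min, max) {
  return Math.floor(Math.random() * (max - min + 1)) + min;
}

// Post ids in JSONPlaceholder run from 1 through 100.
const variables = { id: randomId(1, 100) };
console.log(variables);
```

Inside the default function of a k6 script, the result could be passed as the variables object of the request body.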

Now that we've got the first k6 script set up, we can run a performance test on the GraphQL API with k6 in the next section.

Test GraphQL Performance using k6

Performance testing a GraphQL API with k6 is very similar to testing a REST API. With the script we've set up in the previous section, we can not only test the performance of the GraphQL API, but also test whether including or excluding fields in a GraphQL query impacts the performance. In the results of the performance tests we'll compare the amount of data received and the number of iterations, but not response times. Response times of REST and GraphQL APIs are harder to compare, as the caching mechanisms of the APIs can be very different. Running these tests only requires you to have k6 installed on your local machine.

To run the first test, you need to save the script in a file called single.js so you can run the test with:

k6 run --vus 10 --duration 30s single.js

This command will run k6 with 10 VUs (Virtual Users) for 30 seconds.

          /\      |‾‾| /‾‾/   /‾‾/   
     /\  /  \     |  |/  /   /  /    
    /  \/    \    |     (   /   ‾‾\  
   /          \   |  |\  \ |  (‾)  | 
  / __________ \  |__| \__\ \_____/ .io

  execution: local
     script: single.js
     output: -

  scenarios: (100.00%) 1 scenario, 10 max VUs, 1m0s max duration (incl. graceful stop):
           * default: 10 looping VUs for 30s (gracefulStop: 30s)


running (0m30.1s), 00/10 VUs, 1859 complete and 0 interrupted iterations
default ✓ [======================================] 10 VUs  30s

     data_received..................: 932 kB 31 kB/s
     data_sent......................: 459 kB 15 kB/s
     http_req_blocked...............: avg=1.1ms    min=0s       med=1µs      max=206.02ms p(90)=1µs      p(95)=1µs     
     http_req_connecting............: avg=159.26µs min=0s       med=0s       max=31.23ms  p(90)=0s       p(95)=0s      
     http_req_duration..............: avg=160.46ms min=112.28ms med=139.78ms max=404.8ms  p(90)=258.63ms p(95)=326.02ms
       { expected_response:true }...: avg=160.46ms min=112.28ms med=139.78ms max=404.8ms  p(90)=258.63ms p(95)=326.02ms
     http_req_failed................: 0.00%  ✓ 0         ✗ 1859
     http_req_receiving.............: avg=254.95µs min=26µs     med=142µs    max=89.41ms  p(90)=289µs    p(95)=528.69µs
     http_req_sending...............: avg=172.3µs  min=33µs     med=138µs    max=8.94ms   p(90)=271.4µs  p(95)=364µs   
     http_req_tls_handshaking.......: avg=924.91µs min=0s       med=0s       max=173.97ms p(90)=0s       p(95)=0s      
     http_req_waiting...............: avg=160.03ms min=112.13ms med=139.46ms max=403.45ms p(90)=258.03ms p(95)=325.8ms 
     http_reqs......................: 1859   61.753961/s
     iteration_duration.............: avg=161.72ms min=112.38ms med=139.94ms max=609.5ms  p(90)=258.79ms p(95)=327.48ms
     iterations.....................: 1859   61.753961/s
     vus............................: 10     min=10      max=10
     vus_max........................: 10     min=10      max=10

The results show that the GraphQL API was hit 1,859 times in 30 seconds, with an average request duration of 160 ms. We can also see that 932 kB of data was received and 459 kB of data was sent.
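One caveat when reading http_req_failed: a GraphQL API commonly answers with HTTP 200 even when the query itself fails, and reports the problem in an errors array in the response body. The sketch below is a plain JavaScript predicate (the name isSuccessfulGraphQLResponse is made up) that assumes a response object with status and body properties, as k6 responses have.

```javascript
// True only for an HTTP 200 response whose JSON body has data and no errors.
function isSuccessfulGraphQLResponse(res) {
  if (res.status !== 200) return false;
  let payload;
  try {
    payload = JSON.parse(res.body);
  } catch (err) {
    return false; // body was not valid JSON
  }
  if (payload === null || typeof payload !== 'object') return false;
  const hasErrors = Array.isArray(payload.errors) && payload.errors.length > 0;
  return !hasErrors && payload.data != null;
}

// Stand-in response objects for illustration; in k6 they come from http.post.
const ok = {
  status: 200,
  body: JSON.stringify({ data: { getPost: { id: '1' } } }),
};
const failed = {
  status: 200,
  body: JSON.stringify({ errors: [{ message: 'Not found' }], data: null }),
};

console.log(isSuccessfulGraphQLResponse(ok)); // true
console.log(isSuccessfulGraphQLResponse(failed)); // false
```

In a k6 script, a predicate like this could be passed to the check function from the k6 module, for example check(res, { 'no GraphQL errors': isSuccessfulGraphQLResponse }), so that failed queries show up in the test summary.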

One of the things GraphQL can be good at is limiting the amount of data you receive. Unlike with REST APIs, you don't receive a static response; you determine which fields are returned. Let's rerun this script, but this time limit the fields that are returned by the GraphQL API:

const query = `
  query GetPost($id: ID!) {
    getPost(id: $id) {
      title
    }
  }
`;

Instead of returning the id, title and body, this time only the title of the post will be returned. If you rerun the k6 script with this change, the amount of data received should be less than in the first run:

| GETPOST       | FIRST RUN        | FEWER FIELDS     |
|---------------|------------------|------------------|
| data_received | 932 kB (31 kB/s) | 594 kB (20 kB/s) |

In my test run, the amount of data sent hasn't changed much, as the only change on the sending side was deleting two fields from the GraphQL query. But the amount of data received has decreased by about 36%.
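The relative reduction follows directly from the two data_received values in the table. A short sketch of the arithmetic, with percentReduction as a made-up helper name:

```javascript
// Relative reduction between two measurements, as a percentage.
function percentReduction(before, after) {
  return ((before - after) / before) * 100;
}

// data_received in kB: first run versus the run with fewer fields.
const reduction = percentReduction(932, 594);
console.log(`${reduction.toFixed(1)}%`); // 36.3%
```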

What would happen if we involved more data? The REST API would always return a fixed response, containing all the fields for every post it returns, while with GraphQL we can receive only the title for every post. Can we get even more convincing results when we compare getting all fields for all posts to getting just the title for all posts? Let's try it out by creating a new script called all.js:

import http from 'k6/http';

const query = `
  query GetPosts {
    getPosts {
      id
      title
      body
      userId
    }
  }
`;

const headers = { 'Content-Type': 'application/json' };

export default function () {
  http.post(
    'https://public3b47822a17c9dda6.stepzen.net/api/with-jsonplaceholder/__graphql',
    JSON.stringify({ query }),
    { headers }
  );
}

This new k6 script gets all posts with all their fields, this time including the userId. If you called the JSONPlaceholder API directly, it would return the same data. Running this script with k6 under the same conditions as before results in:

| GETPOSTS      | FIRST RUN        |
|---------------|------------------|
| data_received | 39 MB (1.3 MB/s) |

When we make a change to the k6 script to only get the title field for every post, we should be able to reduce this number by a lot. To try this out update the query in the script all.js to the following:

const query = `
  query GetPosts {
    getPosts {
      title
    }
  }
`;

When you run the k6 performance test a second time, the new results will be visible in your terminal. As expected, the amount of data received has decreased a lot, from 39 MB to "only" 13 MB. The number of iterations has also increased, meaning that with the same number of iterations the amount of data received would be even less.

| GETPOSTS      | FIRST RUN        | FEWER FIELDS      |
|---------------|------------------|-------------------|
| data_received | 39 MB (1.3 MB/s) | 13 MB (417 kB/s)  |

You can imagine this makes a huge difference if you have thousands of users requesting lists of posts every day. Compared to the response the REST API would generate, your users have to load roughly a third of the data with GraphQL, as you have control over the data that is being returned.
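The effect of dropping fields can also be illustrated without a network, by comparing the serialized size of a full response with a titles-only response. The sketch below uses made-up placeholder posts, so the exact numbers will differ from the k6 measurements, which also include HTTP headers and other transfer overhead.

```javascript
// Hypothetical list of 100 posts; titles and bodies are placeholder text.
const posts = Array.from({ length: 100 }, (_, i) => ({
  id: String(i + 1),
  userId: String((i % 10) + 1),
  title: `Post title ${i + 1}`,
  body: 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. '.repeat(3),
}));

// Size in bytes of a value once serialized as UTF-8 encoded JSON.
function jsonByteSize(value) {
  return new TextEncoder().encode(JSON.stringify(value)).length;
}

const full = { data: { getPosts: posts } };
const titlesOnly = {
  data: { getPosts: posts.map(({ title }) => ({ title })) },
};

console.log('all fields:', jsonByteSize(full), 'bytes');
console.log('titles only:', jsonByteSize(titlesOnly), 'bytes');
```

How much smaller the second payload is depends entirely on how long the omitted fields are. With long body texts, as in JSONPlaceholder, the difference is large.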

Retrieving too much data due to the setup of REST APIs is what we call "overfetching" in GraphQL. From the performance test we ran, you can see the impact overfetching can have on your application. Now that we know k6 can performance test GraphQL APIs, and the GraphQL API is already showing its value, let's proceed by testing a heavier GraphQL query in the next section.

Performance For Nested GraphQL Queries

The ability to determine the shape of the data isn't the only reason developers choose to adopt GraphQL as the query language for their APIs. GraphQL APIs have just one endpoint, and the queries (or other operations) can also handle nested data. You can request data from different database tables (like with SQL joins) or even various data sources in one request. This is different from REST APIs, where you typically have to hit multiple endpoints to get data from different sources. That problem is known as "underfetching" (or the N+1 problem), and GraphQL solves it by letting you query multiple data structures at once.

In the GraphiQL interface for the GraphQL API we're testing, you can explore what other queries are available. One of those queries will combine the data from the REST API endpoints to get posts and users. In the REST API these are separate requests, but in the GraphQL API they are combined in one GraphQL query as you can try out on the deployed demo endpoint here or see in the screenshot below:

Query a StepZen API with Nested Data in GraphiQL

To get this data, the GraphQL API will do the following:

  1. Send a request to the underlying JSONPlaceholder REST API to get the post with the specified value for id to https://jsonplaceholder.typicode.com/posts/[id].
  2. Based on the returned value for userId by the first REST API request, it will request the user information in a second REST API request to https://jsonplaceholder.typicode.com/users/[id].
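The two steps above can be sketched as a small resolver-style function. This is an illustration of the request chain only, not StepZen's actual implementation: resolvePostWithUser is a made-up name, and fetchJson is injected so the sketch can run against stub data instead of the real API.

```javascript
const BASE_URL = 'https://jsonplaceholder.typicode.com';

// Step 1: fetch the post. Step 2: fetch its author using the post's userId.
// fetchJson is any async function that takes a URL and returns parsed JSON.
async function resolvePostWithUser(id, fetchJson) {
  const post = await fetchJson(`${BASE_URL}/posts/${id}`);
  const user = await fetchJson(`${BASE_URL}/users/${post.userId}`);
  return { ...post, user };
}

// Stub data standing in for the REST API; the values are made up.
const stubResponses = {
  [`${BASE_URL}/posts/1`]: { id: 1, userId: 7, title: 'Example title' },
  [`${BASE_URL}/users/7`]: { id: 7, name: 'Example User' },
};
const stubFetchJson = async (url) => stubResponses[url];

resolvePostWithUser(1, stubFetchJson).then((result) => {
  console.log(result.user.name); // Example User
});
```

The second request can only start once the first has returned, because it needs the userId. With a GraphQL layer in between, that round trip happens on the server instead of in the client.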

As the GraphQL API also has queries to get the post (called getPost) and user (called getUser) information separately, we can test the performance of requests to both of these queries against a request for the single, combined query. First, let's test the separate queries in a k6 script to do another performance test of the GraphQL API. You can put the following k6 script in a new file called batch.js:

import http from 'k6/http';
import { group } from 'k6';

const post = `
  query GetPost($id: ID!) {
    getPost(id: $id) {
      id
      title
      body
      userId
    }
  }
`;

const user = `
  query GetUser($id: ID!) {
    getUser(id: $id) {
      id
      name
      username
      email
      phone
      website
      address {
        street
        suite
        city
        zipcode
        latitude
        longitude
      }
      company {
        name
        catchPhrase
        bs
      }
    }
  }
`;

const headers = {
  'Content-Type': 'application/json',
};

export default function () {
  http.batch([
    [
      'POST',
      'https://public3b47822a17c9dda6.stepzen.net/api/with-jsonplaceholder/__graphql',
      JSON.stringify({
        query: post,
        variables: {
          id: Math.floor(Math.random() * 100) + 1, // random post id, 1 to 100
        },
      }),
      { headers },
    ],
    [
      'POST',
      'https://public3b47822a17c9dda6.stepzen.net/api/with-jsonplaceholder/__graphql',
      JSON.stringify({
        query: user,
        variables: {
          id: Math.floor(Math.random() * 10) + 1, // random user id, 1 to 10
        },
      }),
      { headers },
    ],
  ]);
}

Running this k6 script will send a batch of requests to the GraphQL API, mimicking the scenario where you need to send two REST API requests to get both the post and user data. The batched requests are run in parallel, giving you a more realistic scenario than sending just one request. Run it under the same circumstances as the previous tests:

k6 run --vus 10 --duration 30s batch.js

In the results of the performance test you can see that every iteration performed two requests. The response times are similar to earlier tests, but before interpreting the amount of data received, let's first create a new script called nested.js to run against the single, nested query. This query gets the information for the user that wrote the post, nested in the response for the single post you're retrieving. As we've done before, we can limit the fields that are returned, as we only request the fields we want to use:

import http from 'k6/http';

const query = `
  query GetPost($id: ID!) {
    getPost(id: $id) {
      title
      user {
        name
        phone
        email
      }
    }
  }
`;

const headers = {
  'Content-Type': 'application/json',
};

export default function () {
  http.post(
    'https://public3b47822a17c9dda6.stepzen.net/api/with-jsonplaceholder/__graphql',
    JSON.stringify({
      query,
      variables: {
        id: Math.floor(Math.random() * 100) + 1, // random post id, 1 to 100
      },
    }),
    { headers }
  );
}

And run it under the same conditions so we can compare the results.

| GETPOST       | BATCH.JS         | NESTED.JS        |
|---------------|------------------|------------------|
| data_received | 2.0 MB (65 kB/s) | 759 kB (25 kB/s) |
| data_sent     | 1.2 MB (38 kB/s) | 514 kB (17 kB/s) |
| http_reqs     | 3222             | 1758             |
| iterations    | 1611             | 1758             |

The nested query reached more iterations while making fewer HTTP requests. The two REST API calls to the underlying JSONPlaceholder API are performed by the GraphQL layer and are therefore not visible in this result. The amount of data received is also drastically reduced: instead of 2.0 MB, only 759 kB is received, and that for a larger number of iterations. This is because not all fields are requested from the GraphQL API.
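Because the two runs completed a different number of iterations, it helps to normalize the table per iteration. The sketch below does that arithmetic. It assumes k6 reports decimal units (1 MB = 1000 kB) and uses the rounded figures from the table, so the results are approximate; perIteration is a made-up helper name.

```javascript
// Figures from the results table; sizes in kB, assuming 1 MB = 1000 kB.
const runs = {
  batch: { dataReceivedKB: 2000, httpReqs: 3222, iterations: 1611 },
  nested: { dataReceivedKB: 759, httpReqs: 1758, iterations: 1758 },
};

// Requests and received data per iteration for a single run.
function perIteration(run) {
  return {
    requests: run.httpReqs / run.iterations,
    receivedKB: run.dataReceivedKB / run.iterations,
  };
}

for (const [name, run] of Object.entries(runs)) {
  const { requests, receivedKB } = perIteration(run);
  console.log(`${name}: ${requests} request(s), ${receivedKB.toFixed(2)} kB received per iteration`);
}
// batch: 2 request(s), 1.24 kB received per iteration
// nested: 1 request(s), 0.43 kB received per iteration
```

Per iteration, the nested query needs half the requests and receives roughly a third of the data of the batched approach.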

The test we've just run shows that GraphQL is perfectly able to combine your data in one request while still being performant. It also greatly reduces the amount of data received, as it can solve both the N+1 problem and overfetching at the same time.

Conclusion

In this post, we've explored how to performance test a GraphQL API using k6. This GraphQL API is a GraphQL-ized version of the open-source JSONPlaceholder REST API, converted using StepZen. We've tested two problems that lead developers to adopt GraphQL: overfetching and the N+1 problem. In the tests we've seen how GraphQL solves overfetching by letting you limit the amount of data received through the fields you request. Especially when requesting large amounts of data, the difference between GraphQL and REST was enormous. The N+1 problem, needing a second API request to get all your data, is solved by GraphQL by allowing you to combine data from multiple sources in one request. This became most evident in the number of HTTP requests needed to get your data. GraphQL was able to do more iterations in the same time frame, while returning far less data than the REST approach.

Want to continue building or performance testing with GraphQL? You can find the code for the GraphQL-ized REST API here and more information on performance testing in the k6 documentation.


This article was originally published on the k6 blog.