You can test your GraphQL API with graphql-cop which is a command-line tool to check for common security vulnerabilities.
Let’s explore some of the tests it can run and how to prevent them.
Types of attacks
There are different types of attacks that the graphql-cop tool can help you prevent.
Cross-Site Request Forgery (CSRF)
A Cross-Site Request Forgery (CSRF) is a type of attack where a user is tricked and has unwanted actions being performed unbeknownst to them on an application they are authenticated to. For example, in GraphQL, allowing a mutation to be performed via a GET request.
Denial of Service (DoS)
A Denial of Service (DoS) is a type of attack that aims to make a server or network resource unavailable to users. It usually tries to overload the server with malicious or superfluous requests to prevent it from responding to legitimate requests. For DDoS, the first D is for Distributed, meaning that it is the same type of attack but coming from multiple sources.
We’ll be talking about those in this article.
Information Leakage
An information leakage is a type of attack that occurs when an attacker gains unauthorized access to some resources or sensitive information from the server.
For example, with the introspection query or via unwanted and very descriptive error messages from your backend exposing the internals of the database or application.
DOS GraphQL Attacks
Find on the OWASP website a cheat sheet of DOS prevention, or follow along for some examples.
Directive Overloading
Example
Directive Overloading occurs when a user can send a query with many consecutive directives and overload the engine handling those directives.
Here is an example from the graphql-cop tool:
query {
book(id: 1) {
__typename
@aa@aa@aa@aa@aa@aa@aa@aa@aa@aa
}
}
The issue can be spotted when the returned error (because obviously @aa
is not a valid directive) will be composed of
an array of one error per directive. This can scale quickly.
Why allow directives at all in the query?
In some case the @includes
or @skip
directives are useful to further fine-tune the query and could be used by a client.
Like we’ve seen in this previous article about advanced resolvers.
Protection
The recommendation for this one is to implement a limit on the number of directives allowed in a query. To measure how many directives are needed, you can check the current usage of directive in your existing queries.
This should be handled before or by the GraphQL engine while parsing the document, otherwise, this can lead to a heap overflow.
By default, in Apollo GraphQL tries to resolve as many fields as possible and will go through each directive and generate a new error for each one.
Field Duplication
Example
Field Duplication occurs when a user can send a query with excessively duplicated fields hoping that the API will have inefficient processing of those fields.
book(id: 1) {
__typename __typename __typename __typename __typename __typename __typename
}
It can also happen with fragments:
fragment book on Book {
__typename
__typename
}
query {
book(id: 1) {
__typename
__typename
# FragmentSpread
...book
# InlineFragment
... on Book {
__typename
__typename
}
}
}
By default, some popular frameworks like Apollo GraphQL will not allow duplicated fields in the response, but it will not consider as a bad request multiple fields in the query.
Protection
The suggestion for this one is to implement a middleware that will clean out the duplicated fields before it reaches the graphql engine. However, de-duplicating means an additional parsing and traversing of the GraphQL query tree which can impact performance. Find out how to transform the received GraphQL with Apollo GraphQL.
Alias Overloading
Alias Overloading occurs when a user can send a query with many consecutive aliases, hoping that the API will have inefficient processing of those aliases (try it out on your API and find the resolving difference with 10 and 100 aliases). Constructed with field duplication and batch queries (multiple queries in one using aliases), you can craft resource heavy queries.
Using aliases for multiple fields:
query {
book(id: 1) {
__typename
alias: __typename
alias2: __typename
alias3: __typename
alias4: __typename
}
}
Using aliases for multiple queries within one:
query {
book(id: 1) {
__typename
}
book2: book(id: 1) {
__typename
}
}
Aliases are useful to rename fields in the response or to query the same field multiple times with different arguments. As we’ve seen in this article about how to bulk mutate with multiple mutation in one call in GraphQL. This is a plus which can save some post-processing of the response on the client side.
Protection
The suggestion to avoid this type of attack is to implement a limit on the number of aliases allowed in a query. Same as for the directive, the number of aliases should be measured and checked against the current usage in your existing queries. The goal is to prevent attacks while allowing your client to use aliases for their queries as usual.
Another one is applying a rate limit to the client requests, limiting the number of requests per client in a given time frame. The rate limit usually works with attributed points during a time frame, and when the limit is reached, the client is blocked. Find out how GitHub is handling rate limits for their GraphQL API based on the type of operation and total number of nodes in a call.
Query Depth Attack
Example
We talked about the problem of having circular dependencies in the GraphQL schema in a previous article. The Query Depth Attack will use those to create a query that never ends. It can snowball ☃️ as the deeper the query goes the more resources it will likely consume.
Here is how the depth is calculated for a query that can be vulnerable to this attack:
query { # Depth: 0
books { # Depth: 1
title
author { # Depth: 2
title
books { # Depth: 3
title
author { # Depth: 4
... # Depth: ...
}
}
}
}
}
The books
resolver here is not paginated, so there’s no limit on the number of books to fetch!
That means for each iteration we return all the books and their authors and so on!
This problem can be mitigated with some performance improvements on the resolver as we’ve discussed in this article.
But even the best API can avoid the risks of a perfectly tailored malicious query.
Protection
To protect your API against this type of attack, you can implement a limit on the depth of the query. This can be done by setting a maximum depth allowed for a query, and if the query exceeds that depth, it will be rejected. The depth should be calculated before it reaches the GraphQL engine, so it doesn’t even try to resolve it.
Query batching
Example
Array Batching in Apollo GraphQL is usually not enabled by default. When disabled, your API is safe from this type of attack.
Array-based query batching occurs when a user sends an array of queries (a batch) to the service in a single request. If there are no limits on the batch side, you could overload the server, which will have a hard time processing all the queries.
An array-based query:
[
{ query: 'query { book(id: 1) { __typename } }' },
{ query: 'query { book(id: 1) { __typename } }' },
]
Why enable batching at all?
Array-based query batching can be useful when you have multiple queries that are not dependent on each other. This often occurs when working in UI, and you’d rather make one call to the backend instead of multiple. Either to fetch all the data you need or start multiple asynchronous processes at once. Check out this article about how to batch queries in a React component.
Protection
The suggestion to prevent this type of attack is to implement restrictions on the number of allowed queries in a single batch request. This means that for example, if you have a limit of 10 queries per batch, the service will either only process the first 10 queries or reject the whole batch with a specific error.
This suggestion with the implementation of a rate limiter based on the query per batch should be enough to protect the API from a Denial of Service attack.
Implementation
Now that we’ve seen the different vector of attacks and the recommendation, let’s dive a bit into how to implement them in your GraphQL API.
Overload Protection
From the documentation of Apollo GraphQL, you can find a Overload protection middleware that can be used to protect your application from heap overflow.
The package overload-protection
will return a 503 and a Retry-After
header when key metrics set within the
middleware for the server are being exceeded.
Those metrics are:
maxHeapUsedBytes
: The amount of bytes used by the application’s heap.maxRssBytes
: The amount of bytes used by Resident Set Size (RSS) which encompass the heap and all allocated memory for the process execution.maxEventLoopDelay
: The maximum delay in milliseconds for the event loop to process the request.
You can find those metrics in the process.memoryUsage()
object in Node.js.
This approach doesn’t prevent the Denial of Service because the request is still being processed, but it should recover
faster since the server will return a 503 when it starts to be overloaded (instead of when it’s too late).
Custom protection via GraphQL AST traversal
In this custom approach, we will traverse the GraphQL AST and check for the number of directives, aliases, fields, and depth. That way, we can implement a middleware to reject the query if deemed malicious or heavy before we even try to resolve it.
For that we will use some out-of-the-box methods from the graphql
package to parse and traverse the AST (Abstract Syntax Tree).
The AST is a tree representation of the query that can be traversed to extract information about the query.
You can find it in the resolver as the info
object:
import { visit, parse } from 'graphql';
import { ASTNode, DirectiveNode, DocumentNode } from 'graphql/index';
import { ASTVisitor } from 'graphql/language/visitor';
function traverse(query: string): void {
const ast: DocumentNode = parse(query);
let directiveCount = 0;
let aliasCount = 0;
let maxDepth = 0;
let depth = 0;
const visitor: ASTVisitor = {
Directive(node: DirectiveNode) {
directiveCount++;
},
enter(node: ASTNode) {
if (node.kind === 'Field') {
depth++;
if(depth > maxDepth) maxDepth = depth;
if (node.alias) aliasCount++;
}
},
leave(node: ASTNode) {
if (node.kind === 'Field') {
depth--;
}
},
};
const resultingDocument: DocumentNode = visit(ast, visitor);
}
This is a simple example of how you can traverse the AST and count the number of directives, aliases, and depth. Let’s break it down a bit:
- We parse the query into an AST with
parse(query)
asDocumentNode
which is a usable representation of the query. - We define a
visitor
which is aASTVisitor
where you can define custom behaviour that will be executed when traversing the AST.- The
Directive
method will be called each time a directive is found in the query. - The
enter
method will be called each time a node is entered- We look for nodes of type
Field
which are the fields in the query and increment the alias count if one, or calculate the current depth.
- We look for nodes of type
- The
leave
method will be called each time a node is left- We decrement the depth when leaving a
Field
node.
- We decrement the depth when leaving a
- The
- The result of the
visit
method will be a newDocumentNode
with the visitor applied to it.
By using this method, you can also modify the tree by returning null
on a field to remove it from the tree if it is a
duplicate (you would need to check in the SelectionSet
the fields that are already present and remove duplicates as you
go in the Fields
).
Rate Limiter
To implement a rate limiter, there are many options available.
You can use rate-limiter-flexible
with ioredis
to store the rate limit information in a Redis database.
Other options than redis are available, including an in-memory store to test it out (not meant for production).
Here is an example:
import { Request, Response, NextFunction } from 'express';
import { RateLimiterRedis } from 'rate-limiter-flexible';
import * as Redis from 'ioredis';
class RateLimiter {
private readonly limiter: RateLimiterRedis;
constructor() {
this.limiter = new RateLimiterRedis({
storeClient: new Redis({ options: { /* Redis options */ } }),
points: 100, // allowed max points per duration
duration: 10, // point reset frequency in seconds
blockDuration: 60, // block nest requests for 60 seconds if all points consumed in duration
});
}
middleware() {
return async (req: Request, res: Response, next: NextFunction) => {
try {
await this.limiter.consume(req.ip, 1); // Consume 1 point for the request for this IP
next();
} catch (rejRes) {
res.status(429).send('Too Many Requests'); // too many requests -> send HTTP 429 status code
}
};
}
}
In this example, we have a simple rate limiter class, we create a RateLimiterRedis
with a ioredis
client (assuming
you have run and configured through the redis options, otherwise use the RateLimiterMemory
without redis).
Then I set up the rate limiter options, I left some comments, so you know what each option does.
When a graphql call is issued to the API, as a middleware, it will consume
a point for the request’s IP.
Which means that if the rate limit is reached for one IP, it won’t impact another IP.
Better yet, if you have authenticated users using your API, you can apply the rate limit per user instead of IP.
The rate limit throws an exception when the limit is reached, letting you know the number of points consumed, and the
time remaining before the blocking period expires.
To use it in an express app
, you could do for the graphql API on /graphql
route:
const rateLimiter = new RateLimiter();
app.use('/graphql', rateLimiter.middleware());
Or you can find other examples on the rate-limiter-flexible documentation. They have a wiki with examples of the rate limiter usage within different frameworks or situations.
Now you should be all set to defend your GraphQL API against evil DoS threats …if we omit other potential attack vectors (that were not mentioned) and until the next vulnerabilities are discovered. 🙃