Test security rules without breaking production: Arcjet's DRY_RUN mode

Arcjet is designed to run everywhere - locally, in CI/CD, in staging, and in production. Write unit tests for your security rules and avoid breaking production.

Picture this: it’s well into the evening in the office, and you sit at your computer, moments away from altering the security configurations of your company’s critical software. You were urgently asked to tighten some things up, but right now the only thing on your mind is receiving an emergency call as soon as you’re about to go to bed. Who knew making changes could be this stressful?

With the right tools, security changes don’t have to be this intimidating. It’s fear of the unknown that is the biggest cause for hesitation. The best thing you can do is to build confidence in your changes through a data-driven approach to the implementation - an approach that uses real environments, detailed context, and live activity to build evidence that your change will not be disruptive.

In this article, we’ll cover the challenges of configuring legacy WAFs and CDN security services, and how Arcjet’s different approach takes the nerves out of security rule changes.

The Old Approach: Cumbersome logging

When it comes to building confidence in our changes, traditional CDN security tools have limited options. Typically, you can turn on WAF logging and evaluate potential rule changes against those logs, but there are a few downsides to this approach:

  • The WAF is a separate system, meaning different configuration and unusual log formats. This level of logging usually has to be enabled explicitly and may not even be included in the product by default.
  • Using a separate system means finding where the logs are and figuring out how to query them. Then when you get a result, you need to understand how to correlate it to the requests that triggered the log entries.
  • Logging is often billed separately and by volume, so if you can only test in production, your log volume may suddenly explode.

With so much context switching, it’s easy for something to slip through the cracks.

The Old Approach: Cross-environment limitations

Another way to build confidence in a change is to test it in a non-production environment first, but this is still a challenge for traditional web application security tools. Traditional WAFs operate as a reverse proxy and are usually only deployed to production and never in the development environment.

The software development lifecycle begins on a developer’s workstation, but there isn’t a realistic way to test WAF rules on a workstation. You typically need to wait until you have deployed your application to a dedicated testing environment.

And even if you do have security set up in your testing environment, it can be cumbersome to manage as an external system and your testing won’t include normal user traffic.

The Arcjet Way: Test everywhere

At this point, it’s clear that there’s some toil and doubt involved when making security rule changes with traditional WAFs. Now, let’s explore how Arcjet’s architecture allows you to take an approach that removes doubt through simplicity.

Test locally

In “the old approach”, we discussed how it’s nearly impossible to evaluate your security rules while developing on your workstation because the security engine lives on another system.

A benefit of Arcjet’s architecture is that your security functionality lives inside your application. That means you can build your application on your development workstation, and your security rules will act the same locally as if you were running them in production.

Let’s say you are developing a Next.js application and want to add Arcjet Shield WAF to one of your routes. Once you have Arcjet added to your route.ts file with a Shield rule, you start your application locally.

import arcjet, { shield } from "@arcjet/next";
import { NextResponse } from "next/server";

const aj = arcjet({
  key: process.env.ARCJET_KEY!,
  rules: [
    shield({
      mode: "LIVE",
    }),
  ],
});

export async function GET(req: Request) {
  const decision = await aj.protect(req);

  for (const result of decision.results) {
    console.log("Rule Result", result);
  }

  console.log("Conclusion", decision.conclusion);

  if (decision.isDenied() && decision.reason.isShield()) {
    return NextResponse.json(
      {
        error: "You are suspicious!",
      },
      { status: 403 },
    );
  }

  return NextResponse.json({
    message: "Hello world",
  });
}

/app/api/arcjet/route.ts

To confirm the rule is working, send five curl requests with the special x-arcjet-suspicious header, which is enough to cross the test threshold for malicious activity. Because the handler lives in /app/api/arcjet/route.ts, the requests go to the /api/arcjet path:

for i in {1..5}; do curl -v -H "x-arcjet-suspicious: true" http://localhost:3000/api/arcjet; done

After running the 5th curl request, you should receive a 403 error and see a blocked request in your Arcjet logs.

Since the security engine is part of your application, you can do these simple sanity-checks for your rules anywhere your application can run.

Test in CI/CD

Another area where it’s traditionally hard to test security rules is your CI pipeline. Arcjet’s security-as-code architecture makes automated testing straightforward, for example with Newman, the command-line runner for Postman collections. The following Express app example from GitHub illustrates how this works:

In our Express app, we have an API endpoint that is very sensitive to performance issues, so we’ll add an Arcjet rate limit rule to only allow 1 request per second.

import express from "express";
import arcjet, { detectBot, fixedWindow } from "@arcjet/node";

const aj = arcjet({
  key: process.env.ARCJET_KEY,
  rules: [],
});

const app = express();

app.get("/api/low-rate-limit", async (req, res) => {
  const decision = await aj
    // Only inline to self-contain the sample code.
    // Static rules should be defined outside the handler for performance.
    .withRule(fixedWindow({ mode: "LIVE", window: "1s", max: 1 }))
    .protect(req);

  if (decision.isDenied()) {
    res.status(429).json({ error: "rate limited" });
  } else {
    res.json({ hello: "world" });
  }
});

//...

const server = app.listen(8080);

// Export the server close function so we can shut it down in our tests
export const close = server.close.bind(server);

index.js

Now, we’ll define our test collection for Newman. To test the rate limiting, we will have Newman send two requests. We will expect the first request to succeed, and the second request should be denied by our Arcjet rate limit rule.

{
  "variable": [{ "key": "baseUrl", "value": "http://localhost:8080" }],
  "item": [
    {
      "name": "/api/low-rate-limit",
      "item": [
        {
          "name": "Allowed",
          "request": {
            "url": "{{baseUrl}}/api/low-rate-limit",
            "header": [
              {
                "key": "Accept",
                "value": "application/json"
              }
            ],
            "method": "GET",
            "body": {},
            "auth": null
          },
          "event": [
            {
              "listen": "test",
              "script": {
                "type": "text/javascript",
                "exec": [
                  "pm.test('should be allowed', () => pm.response.to.have.status(200))"
                ]
              }
            }
          ]
        },
        {
          "name": "Denied",
          "request": {
            "url": "{{baseUrl}}/api/low-rate-limit",
            "header": [
              {
                "key": "Accept",
                "value": "application/json"
              }
            ],
            "method": "GET",
            "body": {},
            "auth": null
          },
          "event": [
            {
              "listen": "test",
              "script": {
                "type": "text/javascript",
                "exec": [
                  "pm.test('should be rate limited', () => pm.response.to.have.status(429))"
                ]
              }
            }
          ]
        }
      ]
    }
  ],
  "event": []
}

tests/low-rate-limit.json
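To see why the collection expects a 200 followed by a 429, it can help to picture a fixed window as a simple counter. The sketch below is a toy in-memory model, not Arcjet’s implementation: the FixedWindowCounter class and its allow method are hypothetical names, and it starts the window at the first request rather than aligning it to the clock. It only illustrates the idea behind window: "1s", max: 1.

```typescript
// Toy fixed-window counter: a conceptual model only, NOT Arcjet's implementation.
// `FixedWindowCounter` and `allow` are hypothetical names for illustration.
class FixedWindowCounter {
  private readonly windowMs: number;
  private readonly max: number;
  private windowStart = -Infinity; // no window has been opened yet
  private count = 0;

  constructor(windowMs: number, max: number) {
    this.windowMs = windowMs;
    this.max = max;
  }

  // Returns true if a request arriving at `nowMs` fits in the current window.
  allow(nowMs: number): boolean {
    // Open a fresh window once the current one has fully elapsed.
    if (nowMs - this.windowStart >= this.windowMs) {
      this.windowStart = nowMs;
      this.count = 0;
    }
    this.count += 1;
    return this.count <= this.max;
  }
}

// Mirrors the two Newman requests: both land inside the same 1s window.
const limiter = new FixedWindowCounter(1000, 1);
console.log(limiter.allow(0)); // first request: within the limit
console.log(limiter.allow(200)); // second request in the same window: over the limit
```

Because Newman sends the two requests back to back, they almost always fall into the same one-second window, which is what makes the "Allowed" then "Denied" ordering a reasonable expectation in the test.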

Lastly, we’ll create the JavaScript file to run our test:

import { after, before, describe, test } from "node:test";
import assert from "node:assert";
import { fileURLToPath } from "node:url";
import { promisify } from "node:util";

import { run } from "newman";

// Promisify the `newman.run` API as `newmanRun` in the tests
const newmanRun = promisify(run);

describe("API Tests", async () => {
  // Importing the server also starts it listening on port 8080
  const server = await import("../index.js");

  after((done) => server.close(done));

  test("/api/low-rate-limit", async () => {
    const summary = await newmanRun({
      collection: fileURLToPath(
        new URL("./low-rate-limit.json", import.meta.url),
      ),
    });

    assert.strictEqual(
      summary.run.failures.length,
      0,
      "expected suite to run without error",
    );
  });

//...

});

tests/api.test.js

Now, you just need to set up a workflow to execute the automated tests within your CI environment and you can run the security rules as part of your test suite.

Test in staging or preview environments

With traditional WAFs, staging environments are typically set up to simulate production and run integration tests that behave the way real users would. Arcjet simulates production in dedicated environments much as a WAF does, but it still offers two major benefits at this stage.

The first benefit is that you’ve already simulated production before you even got to the staging environment. This means you may have run most of your security-specific tests in CI already and caught issues earlier in the development lifecycle.

The second benefit is that setting up additional environments is less work with Arcjet. You don’t need to configure a reverse proxy, and all of your security rules were configured when you wrote the code.

When it comes to dedicated environments, you can stick to the old ways and run tests in staging or preview. However, it’s even more effective to test Arcjet rules in production.

Test in production

Admit it - you read the words “test in production” and cringed a little. With Arcjet rules, testing in production isn’t a bad thing. Any rule you create in Arcjet can be run in DRY_RUN mode without affecting your users. Let’s break down what that looks like.

When you are defining Arcjet security rules, each rule is deployed in either LIVE or DRY_RUN mode. LIVE rules will actively block a request that matches the security rule, but DRY_RUN rules will simply log the would-be block action in Arcjet. Here’s an example:

Let’s say you have an existing Next.js application with Arcjet Shield for blocking attacks like SQL injection, but you’d also like to start blocking automated bots. You simply add the detectBot rule to your Arcjet object with mode: "DRY_RUN":

import arcjet, { createMiddleware, detectBot, shield } from "@arcjet/next";
export const config = {
  // Matcher tells Next.js which routes to run the middleware on.
  // This runs the middleware on all routes except for static assets.
  matcher: ["/((?!_next/static|_next/image|favicon.ico).*)"],
};
const aj = arcjet({
  key: process.env.ARCJET_KEY!,
  rules: [
    shield({
      mode: "LIVE", // Will block requests.
    }),
    detectBot({
      mode: "DRY_RUN", // New rule, log only for evaluation.
      allow: [
        "CATEGORY:SEARCH_ENGINE", // Google, Bing, etc.
      ],
    }),
  ],
});
// Pass any existing middleware with the optional existingMiddleware prop.
export default createMiddleware(aj);

After deploying the new rule, your application’s traffic keeps flowing the same way it did before. After some time, you can check Arcjet’s logs and notice that your uptime monitor’s requests are being logged as would-be blocks. You add another category to your detectBot rule and redeploy:

const aj = arcjet({
  key: process.env.ARCJET_KEY!,
  rules: [
    shield({
      mode: "LIVE",
    }),
    detectBot({
      mode: "DRY_RUN",
      allow: [
        "CATEGORY:SEARCH_ENGINE",
        "CATEGORY:MONITOR", // Uptime monitoring services.
      ],
    }),
  ],
});

Since the DRY_RUN capability is built into the rule definition, the process of evaluating rule changes is as easy as actually making the change.
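If you’d rather surface would-be blocks in your own application logs as well, one option is to inspect the per-rule results that the first example printed from decision.results. The sketch below is a minimal illustration under assumptions: it uses a local RuleResultLike type with state and conclusion fields as a stand-in for the SDK’s result objects, and wouldBeDenied is a hypothetical helper. Check the SDK’s own types before relying on these field names.

```typescript
// Assumed, simplified shape of one entry in `decision.results`.
// The field names and values here are assumptions for illustration only.
type RuleState = "RUN" | "DRY_RUN" | "NOT_RUN" | "CACHED";
type RuleConclusion = "ALLOW" | "DENY" | "CHALLENGE" | "ERROR";

interface RuleResultLike {
  state: RuleState;
  conclusion: RuleConclusion;
}

// Hypothetical helper: keeps only the results from dry-run rules that
// concluded DENY, i.e. the requests that would have been blocked in LIVE mode.
function wouldBeDenied<T extends RuleResultLike>(results: readonly T[]): T[] {
  return results.filter(
    (result) => result.state === "DRY_RUN" && result.conclusion === "DENY",
  );
}

// Hand-written sample results, not real Arcjet output.
const sampleResults: RuleResultLike[] = [
  { state: "RUN", conclusion: "ALLOW" }, // e.g. a LIVE rule that passed
  { state: "DRY_RUN", conclusion: "DENY" }, // e.g. a new rule under evaluation
];

const flagged = wouldBeDenied(sampleResults);
console.log(`${flagged.length} rule(s) would have denied this request`);
```

If the shapes match, the same helper could be called with decision.results right after aj.protect(...) to log a warning whenever a dry-run rule would have blocked the request.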

With Arcjet, all your rules are just code, so you can do things like selectively sampling requests and applying rules to a subset of traffic. For example, if you wanted to run the Arcjet Shield and bot detection rules in live mode on 10% of your traffic, you could write a sampling function like this:

import arcjet, { detectBot, shield } from "@arcjet/next";
import { NextRequest, NextResponse } from "next/server";

export const config = {
  // matcher tells Next.js which routes to run the middleware on. This runs
  // the middleware on all routes except for static assets.
  matcher: ["/((?!_next/static|_next/image|favicon.ico).*)"],
};

const sampleRate = 0.1; // 10% of requests

const aj = arcjet({
  key: process.env.ARCJET_KEY!,
  // You could include one or more base rules to apply to all requests
  rules: [],
});

function shouldSampleRequest(sampleRate: number) {
  // sampleRate should be between 0 and 1, e.g., 0.1 for 10%, 0.5 for 50%
  return Math.random() < sampleRate;
}

// Shield and bot rules will be configured with live mode if the request is
// sampled, otherwise only Shield will be configured with dry run mode
function sampleSecurity() {
  if (shouldSampleRequest(sampleRate)) {
    console.log("Rule is LIVE");
    return aj
      .withRule(
        shield(
          { mode: "LIVE" }, // will block requests if triggered
        ),
      )
      .withRule(
        detectBot({
          mode: "LIVE",
          allow: [], // "allow none" will block all detected bots
        }),
      );
  } else {
    console.log("Rule is DRY_RUN");
    return aj.withRule(
      shield({
        mode: "DRY_RUN", // Only logs the result
      }),
    );
  }
}

export default async function middleware(request: NextRequest) {
  const decision = await sampleSecurity().protect(request);

  if (decision.isDenied()) {
    if (decision.reason.isBot()) {
      return NextResponse.json({ error: "You are a bot" }, { status: 403 });
    } else if (decision.reason.isShield()) {
      return NextResponse.json({ error: "Shields up!" }, { status: 403 });
    } else {
      return NextResponse.json({ error: "Forbidden" }, { status: 403 });
    }
  } else {
    return NextResponse.next();
  }
}

Next.js middleware.ts
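The Math.random() check above picks a fresh 10% on every request, so the same client may be sampled on one request and skipped on the next. If you want each client to get consistent treatment, one variation is to derive the decision from a stable key instead. The sketch below is an assumption-laden alternative, not part of the original example: hashKey and shouldSampleClient are hypothetical helpers, the hash is an FNV-1a style hash computed over UTF-16 code units, and what you use as the key (a client IP, a user ID) is up to you.

```typescript
// FNV-1a style 32-bit hash over the string's UTF-16 code units.
// Hypothetical helper for illustration; any stable hash would do.
function hashKey(input: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, 32-bit multiply
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// Deterministic sampling: the same key always yields the same answer.
// sampleRate should be between 0 and 1, e.g. 0.1 for roughly 10% of keys.
function shouldSampleClient(clientKey: string, sampleRate: number): boolean {
  // Map the hash into [0, 1) and compare it against the sample rate.
  return hashKey(clientKey) / 0x100000000 < sampleRate;
}

// The key below is a documentation-range IP used purely as an example.
const exampleKey = "203.0.113.7";
console.log(shouldSampleClient(exampleKey, 0.1) === shouldSampleClient(exampleKey, 0.1));
```

The trade-off is that a deterministic sample is only as evenly spread as the hash and the keys you feed it, so the live share of traffic may not be exactly 10%. In exchange, a client that is in the live group stays there, which makes it easier to reason about any reports you receive.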

Maybe “evaluate in production” is the more accurate term for what Arcjet allows you to do, but the benefits are clear: you can push a new rule to production with no worries and see what happens.

Conclusion

Making changes to security rules for your application doesn’t have to be intimidating or uncertain. We covered some of the ways traditional web application security tools fall short for change management, and how Arcjet provides a solution.

Arcjet’s architecture delivers a lot of benefits for developer experience, and testing changes is one of them. By making use of DRY_RUN mode, you can build confidence in your changes with no added complexity. With early sanity-checks and simple evidence from real traffic, you will have no fear of breaking production when using Arcjet to protect your application.
