Introducing Arcjet AI prompt injection protection
Introducing Arcjet prompt injection detection. Catch hostile instructions before inference. Works with Next.js, Node.js, Flask, FastAPI, and any JavaScript / TypeScript or Python application.
Arcjet Guards runs security rules inside agent tool handlers, queue consumers, and workflow steps - where proxies and WAFs can't see.
Developer infrastructure is reshaping around agents. Coding agents are now the primary integrators of most developer tools, so those tools must be redesigned on the assumption that the agent is the primary user. But a new architectural model has also appeared - agentic systems.
These are applications built around long-running workflows that execute inside sandboxed environments and are not triggered by an HTTP request. Picture a customer support agent that reads an incoming ticket, looks up the customer in your CRM, summarizes their account history with an LLM, and drafts a reply - kicked off by a queue message, with no human click and no HTTP request anywhere in the chain.
Today we're announcing Arcjet Guards: security rules that run inside AI agent tool handlers, queue consumers, workflow steps, and any other code that processes untrusted input without a Request object.
The enforcement point is wherever untrusted input is - not only where HTTP is.
Guards is the natural extension of how Arcjet has always worked: security as building blocks the developer composes in code, alongside the feature being built. This post explains why that shape matters more than ever, and how Guards fits alongside the Arcjet MCP server and CLI to cover the full surface of how software now gets built.
Being agent-friendly isn't the same thing as being agent-first. Shipping a CLI or an MCP server on top of an existing control plane or dashboard makes that control plane accessible to an agent, but that's only half of the work. Agents like to write code. That's what they're best at, and it's where most of the surface area of a modern coding agent actually sits - inside an editor, in a repository, producing diffs that a developer reviews and merges. A tool that lets an agent configure it by writing code meets the agent where it already is.
The agent writing your chat handler can see the prompt injection rule protecting it when that code is three lines above in the same file. The code review covers both. The pull request that adds the feature adds the protection. The reviewer is looking at one diff, not two systems. Explicit code beats remote configuration, because the code is where the agent is already working and where the developer is already reviewing.
Your app used to have one front door. An HTTP request came in, hit your router, passed through middleware that checked authentication, rate limits, and bot signals, then reached your code. A web application firewall (WAF) or proxy sat in front of that door, and because every request came through it, enforcement could live there.
Agentic systems don't have a front door. An agent tool handler receives untrusted input as a function argument, not a request body. A queue consumer pulls a message off a broker, never touching a router. A multi-agent pipeline passes state from one step to the next through shared memory or a workflow engine. These are the contexts where a growing share of application code now runs, and they're invisible to every security tool built around the assumption of a request boundary. Proxies can't see them. WAFs can't see them. HTTP middleware can't see them. The boundary an agent's tool call crosses is a function call boundary, and enforcement has to be there too.
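To make the queue consumer case concrete, here is a minimal sketch of a guard check at a function call boundary. Everything in it is a hypothetical stand-in rather than the Arcjet API: `TicketGuard`, `toyGuard`, `handleTicket`, and the `queue.ticket` label are invented for illustration, and the toy guard uses a single regular expression where a real guard would call a detection service.

```typescript
// Hypothetical stand-ins for illustration; none of these names come from the Arcjet SDK.
type TicketDecision = { conclusion: "ALLOW" | "DENY"; reason?: string };
type TicketGuard = (label: string, input: string) => Promise<TicketDecision>;

// Toy guard: flags one obvious injection phrase. A real guard would be far more thorough.
const toyGuard: TicketGuard = async (_label, input) =>
  /ignore (all )?previous instructions/i.test(input)
    ? { conclusion: "DENY", reason: "suspected prompt injection" }
    : { conclusion: "ALLOW" };

type TicketMessage = { ticketId: string; body: string };
type TicketResult = {
  ticketId: string;
  status: "processed" | "quarantined";
  summary?: string;
};

// Queue consumer: the message arrives from a broker, so no router or middleware ever sees it.
async function handleTicket(
  msg: TicketMessage,
  guard: TicketGuard,
  summarize: (text: string) => Promise<string>,
): Promise<TicketResult> {
  const decision = await guard("queue.ticket", msg.body);
  if (decision.conclusion === "DENY") {
    // Flagged text is never forwarded to the model; it is parked for human review.
    return { ticketId: msg.ticketId, status: "quarantined" };
  }
  const summary = await summarize(msg.body);
  return { ticketId: msg.ticketId, status: "processed", summary };
}
```

The guard is passed in as an argument, so the same handler can be exercised in tests with a stub and wired to a real guard in production.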
Enforcement also needs context. An enforcement decision needs the same information the application has: who is the authenticated user, what route is this, what's this user's spend, what's normal for this session, what did the last tool call return. A proxy sitting in front of the application can see the request. It can't see the identity, the session, the business logic, or the budget. That information lives inside the application, and any enforcement that doesn't live there too is working with a reduced picture. Inside an agentic system, the reduction is even worse - a proxy can't see a tool call at all, because a tool call isn't a request.
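As a rough illustration of why context matters, the sketch below makes a decision from information that only the application holds: the authenticated user and their session budget. The `ToolCallContext` shape, the `decideWithinBudget` function, and the cent-based budget fields are assumptions made up for this example, not part of any Arcjet SDK.

```typescript
// Hypothetical types for illustration; not the Arcjet API.
interface ToolCallContext {
  userId: string;             // authenticated user, known only inside the app
  tool: string;               // which tool the agent wants to call
  spentCents: number;         // what this user has spent so far in the session
  budgetCents: number;        // per-session budget set by business logic
  estimatedCostCents: number; // expected cost of the pending call
}

type BudgetDecision =
  | { conclusion: "ALLOW" }
  | { conclusion: "DENY"; reason: string };

// Deny the call if it would push the user's session spend over budget.
function decideWithinBudget(ctx: ToolCallContext): BudgetDecision {
  const projected = ctx.spentCents + ctx.estimatedCostCents;
  if (projected > ctx.budgetCents) {
    return {
      conclusion: "DENY",
      reason: `${ctx.tool} would exceed the session budget for user ${ctx.userId}`,
    };
  }
  return { conclusion: "ALLOW" };
}
```

None of these fields appear in a network request, which is why a check like this cannot be expressed at a proxy.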
Arcjet's SDK already protects HTTP handlers - the agent adds protection in code it's already writing, rules live in the same file as the handler, and the code review covers both. Guards extends that same model into the places HTTP middleware can’t reach. The agent calls a tool, the tool passes the input to a guard, the guard returns a decision.
Here's what that looks like inside an agent's fetch tool - prompt injection detection on the page content before it goes back to the model. This is TypeScript, but it's also supported in our Python SDK:
```typescript
import { launchArcjet, detectPromptInjection } from "@arcjet/guard";

const arcjet = launchArcjet({
  key: process.env.ARCJET_KEY!,
});

const promptInjection = detectPromptInjection({
  mode: "LIVE",
});

// Inside an agent's fetch tool
export async function fetchTool({ url }: { url: string }) {
  const content = await fetch(url).then((r) => r.text());

  const decision = await arcjet.guard({
    label: "tool.fetch", // identifier for analytics
    rules: [promptInjection(content)],
  });

  if (decision.conclusion === "DENY") {
    return { content: "[Blocked: prompt injection detected]" };
  }

  return { content };
}
```

No request object, no middleware, no proxy. The guard runs where the untrusted input arrived.
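A rule is often trialled in a dry-run mode before it is allowed to block anything. The sketch below shows one way a tool could treat the two modes differently. It is an assumption-laden illustration: the `"DRY_RUN"` mode name, the `applyDecision` helper, and the `wouldBlock` flag are hypothetical and are not taken from the `@arcjet/guard` package, whose handling of modes may differ.

```typescript
// Hypothetical helper for illustration; not part of @arcjet/guard.
type RuleMode = "LIVE" | "DRY_RUN"; // "DRY_RUN" is an assumed name
type Conclusion = "ALLOW" | "DENY";

interface ToolOutcome {
  content: string;     // what is returned to the model
  wouldBlock: boolean; // true whenever the rule matched, in either mode
}

function applyDecision(
  mode: RuleMode,
  conclusion: Conclusion,
  content: string,
  log: (msg: string) => void,
): ToolOutcome {
  if (conclusion === "DENY") {
    if (mode === "LIVE") {
      // Live mode: replace the content so the model never sees it.
      return { content: "[Blocked: prompt injection detected]", wouldBlock: true };
    }
    // Dry-run mode: record the match but let the content through unchanged.
    log("dry-run: tool.fetch content would have been blocked");
    return { content, wouldBlock: true };
  }
  return { content, wouldBlock: false };
}
```

Logging the would-be block gives you evidence about false positives before the rule is promoted to live enforcement.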
When the agent needs live context - what's being blocked right now, which patterns are emerging, whether a dry-run rule is safe to promote - Arcjet's MCP server and CLI expose it in the same tooling the agent already operates in. Analyze traffic and tool decisions, investigate threats, promote rules, get a security briefing: all of it as tool calls the agent can make and reason over, without leaving the editor.
Proxies and WAFs exist because, for a long time, they were the only way to put enforcement between the internet and an application without involving the developers. That worked because the perimeter was real. It’s now dissolving. Tool calls happen inside the application. Pipeline steps fire without an HTTP request. There’s no network firewall boundary.
Application security has to move closer to the code, which means it has to live as code - written alongside the feature, reviewed in the same pull request.
An agent writing code has different needs than a human clicking a dashboard, and the products that fit those needs will be the ones that build the right shape from the start. We're building for how software actually gets built now.
Install the Arcjet skill into your AI coding agent:
```
npx skill add arcjet/skills --skill add-guard-protection
```

Then open your AI coding agent and describe what you want to protect, for example: “rate limit my tool calls per user and block prompt injection” or “secure my MCP server” or “protect my queue worker”.