Research
Threat model · Updated May 1, 2026 · 9 min read

AI Agent Runtime Threat Model

The security boundary moved from chat to action. Useful controls evaluate the exact file, command, MCP call, destination, identity, and policy before the agent completes the operation.

Thesis

AI agent security needs an execution-layer model because intent, data access, and tool use diverge after the first prompt.

runtime security · workstations · policy

Technical readout

Runtime event shape

A serious evaluation should inspect the event envelope, not just whether the product has an allow/block toggle.

actor + surface

Which user, group, host, repository, and agent produced the action?

Bind the tool call to stable user, host, session, and repo identifiers before policy evaluation.

operation payload

Is this a file read, write, command, web fetch, prompt, or MCP call, and what exact arguments matter?

Normalize tool-specific payloads into a common action schema with path, argv, URL, server, tool, and risk flags.
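A minimal sketch of such a common action schema, in Python. The field names (kind, path, argv, url, server, tool, risk_flags) mirror the list above but are illustrative assumptions, not a published spec.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class AgentAction:
    """One normalized agent operation, regardless of which agent produced it."""
    kind: str                                # "file_read" | "file_write" | "command" | "web_fetch" | "prompt" | "mcp_call"
    path: Optional[str] = None               # file operations
    argv: Optional[list] = None              # command execution
    url: Optional[str] = None                # web fetches
    server: Optional[str] = None             # MCP server name
    tool: Optional[str] = None               # MCP tool name
    risk_flags: list = field(default_factory=list)

# Example: a shell command proposed by an agent, flagged for network egress
action = AgentAction(kind="command",
                     argv=["curl", "-s", "https://internal.example"],
                     risk_flags=["network_egress"])
```

Once every surface produces this shape, policy rules can match on kind, path, or risk_flags without caring which agent originated the event.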

policy resolution

Which pack and rule matched, and did a stricter inherited rule win?

Store pack ID, rule ID, assignment source, enforcement mode, verdict, and explanation on every event.
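One way the "stricter inherited rule wins" resolution could look, as a hedged sketch. The strictness ordering and field names are assumptions for illustration.

```python
# Stricter enforcement modes win when multiple packs match an action.
STRICTNESS = {"audit": 0, "warn": 1, "block": 2}

def resolve(matches):
    """matches: list of matched rules, each with pack_id, rule_id, source, mode."""
    winner = max(matches, key=lambda m: STRICTNESS[m["mode"]])
    return {**winner,
            "verdict": winner["mode"],
            "explanation": f"{winner['rule_id']} from {winner['source']} is strictest"}

matches = [
    {"pack_id": "team-pack", "rule_id": "allow-read",    "source": "team",          "mode": "audit"},
    {"pack_id": "org-pack",  "rule_id": "block-secrets", "source": "org-inherited", "mode": "block"},
]
verdict = resolve(matches)   # the inherited org rule wins
```

Persisting the full winning record, not just the verdict, is what lets an analyst later answer "which rule fired and why."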

evidence handling

Can analysts review enough context without exposing raw secrets broadly?

Capture prompts and outputs behind redaction, retention, and role-aware display boundaries.
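A redaction pass over captured text might look like the following sketch. The two patterns are examples only; a real deployment would maintain a broader, tested pattern set plus role-aware display controls.

```python
import re

# Illustrative secret patterns, not an exhaustive set.
SECRET_PATTERNS = [
    re.compile(r"(?i)(api[_-]?key|token|password)\s*[=:]\s*\S+"),
    re.compile(r"AKIA[0-9A-Z]{16}"),   # AWS access key ID shape
]

def redact(text: str) -> str:
    """Replace secret-shaped substrings before text reaches analyst views."""
    for pat in SECRET_PATTERNS:
        text = pat.sub("[REDACTED]", text)
    return text

print(redact("export API_KEY=abc123 before running"))
# -> export [REDACTED] before running
```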

The boundary moved from chat to action

A chat transcript can explain what a user asked for, but it does not fully describe what the agent did next. The risky moment is often a command, a file read, a code edit, a web fetch, or an MCP call that happens after the model has gathered local context.

That makes AI agent security different from traditional prompt filtering. The control point has to follow the agent into the runtime surface where actions become concrete operations with paths, destinations, repositories, identities, and arguments.

Agent actions have multiple owners

The developer starts the session, the model chooses the next step, the local workstation provides files and credentials, and connected tools expand reach into SaaS and internal systems. No single log source explains that chain by itself.

A useful threat model treats each action as a decision point. Who initiated it, which agent surfaced it, what tool executed it, what data was touched, which policy applied, and what verdict was returned?

Policy should run on normalized actions

Different agents expose different hooks. Some send file paths, some send command lines, some send MCP server and tool names, and some only expose a request envelope. A runtime security layer has to normalize those into one policy vocabulary before teams can reason about them.

That normalized shape is also what makes reporting useful. A blocked shell command, a warned MCP export, and a passed prompt action should be comparable without forcing analysts to learn every agent's native event format.

The useful controls are boring

The goal is not to block AI adoption. The goal is to make agent behavior legible enough that security can start in audit mode, learn real usage, then move specific risky behaviors to warning or blocking.

That means normalized events, deterministic policy outcomes, and investigation trails that do not require analysts to reconstruct sessions from screenshots, shell history, IDE logs, and SaaS audit exports.

Technical model

Runtime decision point

The useful control point is not the chat box. It is the moment the agent turns context into an operation.

Signals in the model

Prompt and context

User request, retrieved files, tool output, model state

Action proposal

Read, write, command, web fetch, prompt, MCP tool call

Policy evaluation

Identity, host, repo, path, destination, tool, arguments

Verdict and evidence

Pass, warn, block, redacted output, session timeline

A runtime event should explain what happened and why it was allowed.
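The four stages above can be sketched end to end: actor context in, action proposal, policy evaluation, verdict plus evidence out. The single rule here is a placeholder assumption to show the event shape.

```python
def evaluate(actor: dict, action: dict) -> dict:
    """Evaluate one proposed action and return a self-explaining event."""
    verdict, reason = "pass", "no rule matched"
    # Placeholder rule: reading credential-bearing env files is blocked.
    if action["kind"] == "file_read" and action.get("path", "").endswith(".env"):
        verdict, reason = "block", "env files are credential-bearing"
    return {
        "actor": actor,          # who: user, host, session, repo
        "action": action,        # what: the normalized operation
        "verdict": verdict,      # pass | warn | block
        "explanation": reason,   # why the verdict was returned
    }

event = evaluate({"user": "dev1", "host": "wks-42"},
                 {"kind": "file_read", "path": "/repo/.env"})
assert event["verdict"] == "block"
```

Every field an analyst needs travels with the event itself, so no session reconstruction from scattered logs is required.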