AI Agent Runtime Threat Model
The security boundary moved from chat to action. Useful controls evaluate the exact file, command, MCP call, destination, identity, and policy before the agent completes the operation.
Thesis
AI agent security needs an execution-layer model because intent, data access, and tool use diverge after the first prompt.
Technical readout
Runtime event shape
A serious evaluation should inspect the event envelope, not just whether the product has an allow/block toggle.
actor + surface
Which user, group, host, repository, and agent produced the action?
Bind the tool call to stable user, host, session, and repo identifiers before policy evaluation.
operation payload
Is this a file read, write, command, web fetch, prompt, or MCP call, and what exact arguments matter?
Normalize tool-specific payloads into a common action schema with path, argv, URL, server, tool, and risk flags.
policy resolution
Which pack and rule matched, and did a stricter inherited rule win?
Store pack ID, rule ID, assignment source, enforcement mode, verdict, and explanation on every event.
evidence handling
Can analysts review enough context without exposing raw secrets broadly?
Capture prompts and outputs behind redaction, retention, and role-aware display boundaries.
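The four envelope fields above can be sketched as one normalized record. This is a minimal sketch; every field name here is an illustrative assumption, not a vendor schema.

```python
from dataclasses import dataclass, field
from typing import Optional

# Illustrative event envelope; field names are assumptions, not a product schema.
@dataclass
class AgentEvent:
    # actor + surface: stable identifiers bound before policy evaluation
    user: str
    host: str
    session: str
    repo: Optional[str]
    agent: str
    # operation payload: one common action vocabulary across agents
    action: str                      # "file_read" | "file_write" | "command" | "web_fetch" | "prompt" | "mcp_call"
    path: Optional[str] = None
    argv: Optional[list] = None
    url: Optional[str] = None
    mcp_server: Optional[str] = None
    mcp_tool: Optional[str] = None
    risk_flags: list = field(default_factory=list)
    # policy resolution: stored on every event
    pack_id: Optional[str] = None
    rule_id: Optional[str] = None
    assignment_source: Optional[str] = None
    enforcement_mode: Optional[str] = None
    verdict: Optional[str] = None    # "PASS" | "WARN" | "BLOCK"
    explanation: Optional[str] = None
    # evidence handling: a redacted, role-gated reference, never raw secrets
    evidence_ref: Optional[str] = None

# A command action bound to identity and surface before any policy runs.
event = AgentEvent(
    user="dev@example.com", host="wks-42", session="s-123", repo="payments",
    agent="cli-agent", action="command", argv=["npm", "test"],
)
```

The point of the shape is that policy fields start empty: identity and payload are bound first, and resolution fills in the rest.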
The boundary moved from chat to action
A chat transcript can explain what a user asked for, but it does not fully describe what the agent did next. The risky moment is often a command, a file read, a code edit, a web fetch, or an MCP call that happens after the model has gathered local context.
That makes AI agent security different from traditional prompt filtering. The control point has to follow the agent into the runtime surface where actions become concrete operations with paths, destinations, repositories, identities, and arguments.
Agent actions have multiple owners
The developer starts the session, the model chooses the next step, the local workstation provides files and credentials, and connected tools expand reach into SaaS and internal systems. No single log source explains that chain by itself.
A useful threat model treats each action as a decision point. Who initiated it, which agent surfaced it, what tool executed it, what data was touched, which policy applied, and what verdict was returned?
Policy should run on normalized actions
Different agents expose different hooks. Some send file paths, some send command lines, some send MCP server and tool names, and some only expose a request envelope. A runtime security layer has to normalize those into one policy vocabulary before teams can reason about them.
That normalized shape is also what makes reporting useful. A blocked shell command, a warned MCP export, and a passed prompt action should be comparable without forcing analysts to learn every agent's native event format.
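A normalization step can be sketched as a small mapping function. The native payload shapes below are hypothetical, invented for illustration; they are not real agent APIs.

```python
# Hypothetical native payloads from three different agent hooks; the input
# shapes are assumptions for illustration, not real agent event formats.
def normalize(agent: str, native: dict) -> dict:
    """Map an agent-native payload onto one common policy vocabulary."""
    if agent == "ide_agent" and "filePath" in native:
        return {"action": "file_read", "path": native["filePath"]}
    if agent == "cli_agent" and "cmdline" in native:
        return {"action": "command", "argv": native["cmdline"].split()}
    if agent == "mcp_client" and "server" in native:
        return {"action": "mcp_call",
                "mcp_server": native["server"], "mcp_tool": native["tool"]}
    # Unknown hooks are preserved rather than dropped, so audit mode sees them.
    return {"action": "unknown", "raw": native}

# Three native events become comparable records in one vocabulary.
events = [
    normalize("ide_agent", {"filePath": "/repo/.env"}),
    normalize("cli_agent", {"cmdline": "npm test"}),
    normalize("mcp_client", {"server": "github", "tool": "search"}),
]
```

After this step, a blocked shell command and a warned MCP export can be counted, filtered, and compared in the same report.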
The useful controls are boring
The goal is not to block AI adoption. The goal is to make agent behavior legible enough that security can start in audit mode, learn real usage, then move specific risky behaviors to warning or blocking.
That means normalized events, deterministic policy outcomes, and investigation trails that do not require analysts to reconstruct sessions from screenshots, shell history, IDE logs, and SaaS audit exports.
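The audit-to-blocking progression can be sketched as a per-rule enforcement mode that changes the verdict without changing the rule. The mode names and mapping are assumptions consistent with the PASS/WARN/BLOCK verdicts used in this document.

```python
# Sketch of per-rule enforcement modes: a team starts everything in "audit",
# learns real usage, then promotes specific risky behaviors to "warn" or
# "block". Mode names are illustrative assumptions.
def effective_verdict(rule_matched: bool, mode: str) -> str:
    """Deterministic outcome: same event, same rule, same mode -> same verdict."""
    if not rule_matched:
        return "PASS"
    return {"audit": "PASS", "warn": "WARN", "block": "BLOCK"}[mode]

# A secret-file read rule: logged only in audit, stopped after promotion.
before = effective_verdict(rule_matched=True, mode="audit")   # logged, not stopped
after = effective_verdict(rule_matched=True, mode="block")    # stopped with evidence
```

Determinism is the property that matters: analysts can replay a session and get the same verdicts, which screenshots and shell history cannot guarantee.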
Technical model
Runtime decision point
The useful control point is not the chat box. It is the moment the agent turns context into an operation.
Decision lattice
Axes: identity, host, repo, tool
Operations: read, write, command, web fetch, prompt, MCP tool call
Attributes: identity, host, repo, path, destination, tool, arguments
Verdicts: PASS, WARN, BLOCK with evidence
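When several policy packs match one action across that lattice (for example an org-wide pack and a repo-level pack), the stricter inherited rule should win. A minimal severity merge, with hypothetical pack and rule names:

```python
# Strictest-verdict resolution across matching packs. The severity ordering
# PASS < WARN < BLOCK and the pack/rule names are illustrative assumptions.
SEVERITY = {"PASS": 0, "WARN": 1, "BLOCK": 2}

def resolve(verdicts: list) -> tuple:
    """verdicts: (rule_id, verdict) pairs from every matching pack."""
    return max(verdicts, key=lambda rv: SEVERITY[rv[1]])

# An org-wide WARN loses to a repo-level BLOCK on the same action.
rule_id, verdict = resolve([("org.default", "WARN"), ("repo.secrets", "BLOCK")])
```

Recording which rule won (not just the final verdict) is what lets the event answer "did a stricter inherited rule win?"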
Example: path=/repo/.env, argv=npm test, mcp=github.search
Signals in the model
Prompt and context
User request, retrieved files, tool output, model state
Action proposal
Read, write, command, web fetch, prompt, MCP tool call
Policy evaluation
Identity, host, repo, path, destination, tool, arguments
Verdict and evidence
Pass, warn, block, redacted output, session timeline
A runtime event should explain what happened and why that verdict was returned.
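A minimal self-explaining record covering the stages above might look like the following. Every field name and value is an illustrative assumption, not a fixed schema.

```python
import json

# Minimal sketch of a runtime event that answers both "what happened" and
# "why this verdict". All names and values are illustrative assumptions.
event = {
    "actor": {"user": "dev@example.com", "host": "wks-42", "repo": "payments"},
    "operation": {"action": "mcp_call", "mcp_server": "github", "mcp_tool": "search"},
    "policy": {
        "pack_id": "org.default",
        "rule_id": "mcp.readonly",
        "assignment_source": "org",
        "enforcement_mode": "warn",
    },
    "verdict": "WARN",
    "explanation": "MCP call matched warn-mode rule mcp.readonly",
    # Evidence is a redacted, role-gated reference, never the raw prompt text.
    "evidence_ref": "redacted://session/s-123/event/77",
}
print(json.dumps(event, indent=2))
```

An analyst reading this record needs no screenshots or shell history: the actor, the operation, the matched rule, and the enforcement mode are all on the event itself.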