# AI Agent Governance
What it is, why it's emerging now, and how it differs from AI security and AI governance. A practitioner's reference for the emerging category.
A working reference for security leaders and platform engineers who have started seeing the term everywhere and want to understand what it actually means.
## What it is, in one paragraph
AI agent governance is the layer of controls that determines what your AI agents can do, on whose behalf, and with what accountability. It sits between the agent runtime (Claude Code, Cursor, LangChain, OpenAI Agents SDK, an in-house orchestrator) and the systems those agents act on (your repos, your cloud, your SaaS, your databases). It includes identity, access control, policy enforcement, observability, and audit — applied specifically to the runtime behavior of autonomous software that takes actions on real systems.
It is not a product category most security teams had on their roadmap two years ago. It is now one of the fastest-moving areas in security, because agents broke a set of assumptions that most of the existing security stack was built on.
## Why "agent governance" and not "AI security" or "AI governance"?
These three terms get used interchangeably, and that's a problem — they describe different things, solve different problems, and are built by different vendors. Getting the distinction right is the first step.
| Term | Scope | Primary concern | Typical buyer |
|---|---|---|---|
| AI governance | Model lifecycle, training data, regulatory compliance | Is this model allowed to exist? Was it trained appropriately? | GRC, Legal, Chief AI Officer |
| AI security | The model as a target: prompt injection, jailbreaks, model theft, adversarial inputs | Can attackers make the model misbehave? | AppSec, ML security |
| AI agent governance | Runtime behavior of agents acting on systems: identity, access, audit | What is this agent allowed to do, and can we prove what it did? | Security engineering, platform security, CISO |
The boundaries aren't perfectly clean. Prompt injection is a real attack on agents, and the first-order defense is often at the model layer. But the consequences of a successful prompt injection are governed at the agent layer — what credentials the agent holds, what systems it can reach, what it is allowed to execute. A jailbroken model that has access to nothing is a curiosity. A well-behaved model with unrestricted access to your production database is a breach waiting to happen.
Agent governance starts from that second premise: assume the agent will occasionally do the wrong thing, for any reason — compromised tool, prompt injection, model error, supply chain attack, operator mistake — and build the controls that limit the damage.
## The five dimensions
Every mature agent governance program covers five things. A program missing any one of them has a known, structural gap.
### 1. Visibility
You cannot govern what you cannot see. The baseline question: what agents are running in my environment, what tools are they connected to, and what upstream systems are they reaching?
In practice, the honest answer for most organizations today is "we don't know." Agents get installed by developers, connected to MCP servers by developers, and given scoped OAuth tokens by developers — most of it outside any security review. The first goal of any governance program is to build a real inventory: every agent, every tool it has, every upstream it calls, every principal it acts as.
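As a concrete starting point, the inventory can be as simple as one structured record per agent. The schema below is illustrative, not a standard; the field names and agent names are invented for the sketch:

```python
from dataclasses import dataclass, field

@dataclass
class AgentRecord:
    """One row in the agent inventory. All field names are illustrative."""
    name: str                    # e.g. "ci-triage-bot"
    runtime: str                 # e.g. "claude-code", "langchain"
    tools: list[str] = field(default_factory=list)      # tools / MCP servers attached
    upstreams: list[str] = field(default_factory=list)  # systems those tools reach
    acts_as: str = "unknown"     # the human principal the agent acts for
    reviewed: bool = False       # has security reviewed this agent?

def unreviewed_with_production_access(inventory: list[AgentRecord]) -> list[str]:
    """A first triage cut: agents nobody reviewed that can reach production."""
    return [a.name for a in inventory
            if not a.reviewed and any("prod" in u for u in a.upstreams)]

inventory = [
    AgentRecord("ci-triage-bot", "claude-code", ["github"], ["github.com"], "alice", True),
    AgentRecord("db-migrator", "langchain", ["postgres-mcp"], ["prod-db"]),
]
print(unreviewed_with_production_access(inventory))  # -> ['db-migrator']
```

Even a spreadsheet with these six columns is a large improvement over "we don't know"; the point is the shape of the record, not the tooling.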
### 2. Identity
An agent needs a verifiable answer to "who are you?" that is distinct from the human who deployed it. Bearer tokens — the default on most stacks today — don't answer that question. A bearer token says "whoever holds this bill can spend it." It doesn't prove the holder is the entity the token was issued to.
The industry is converging on proof-of-possession. RFC 9449 (DPoP) is the deployed form. The IETF's Agent Identity and Management working group is standardizing the agent-specific variants. (We covered the full argument here.)
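The bearer-vs-possession distinction can be shown in a toy sketch. Real DPoP (RFC 9449) signs a JWT carrying the request method, URI, and a unique `jti` with an asymmetric key; stdlib HMAC stands in here purely to keep the sketch self-contained, and every name below is invented:

```python
import hashlib
import hmac
import time

def make_proof(key: bytes, method: str, uri: str) -> dict:
    """Bind this specific request to the key holder. Real DPoP uses an
    asymmetric signature over a JWT; HMAC is a stand-in for illustration."""
    msg = f"{method} {uri} {int(time.time())}"
    return {"msg": msg, "sig": hmac.new(key, msg.encode(), hashlib.sha256).hexdigest()}

def verify_bearer(presented: str, expected: str) -> bool:
    # Bearer semantics: possession of the string IS the proof.
    # A stolen token verifies exactly as well as the legitimate one.
    return presented == expected

def verify_pop(proof: dict, key: bytes) -> bool:
    # Proof-of-possession semantics: the caller must hold the key,
    # not merely a string that once crossed the wire.
    want = hmac.new(key, proof["msg"].encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(want, proof["sig"])

agent_key = b"agent-private-key"       # never leaves the agent's host
stolen_token = "eyJhbGciOi..."         # an exfiltrated bearer token
print(verify_bearer(stolen_token, stolen_token))                        # attacker wins
print(verify_pop(make_proof(agent_key, "GET", "/repos"), agent_key))    # True
print(verify_pop(make_proof(b"attacker", "GET", "/repos"), agent_key))  # False
```

The last line is the whole argument: a stolen transcript of past proofs does not let an attacker mint a new one.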
### 3. Access control
Once you know which agent is acting, you need to decide what it is allowed to do. This is the policy layer. It answers questions like: can this agent call this tool? on behalf of this user? for this action? against this resource?
The important shift: the unit of policy is not "the agent" (a static principal). It is "this agent, acting as this human, in this session, for this action." Policy is four-dimensional, not one-dimensional. Most IAM systems are not built for this.
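The shift is easier to see in code. A minimal sketch of a four-dimensional check, with an entirely hypothetical rule set and agent names:

```python
# A policy is a set of allowed (agent, acting-as principal, action, resource)
# tuples, with "*" as a principal wildcard. The rules are purely illustrative.
POLICY = [
    ("deploy-bot", "alice", "read",  "repo:payments"),
    ("deploy-bot", "alice", "write", "repo:payments"),
    ("deploy-bot", "*",     "read",  "repo:docs"),
]

def is_allowed(agent: str, principal: str, action: str, resource: str) -> bool:
    """Four-dimensional check: all four inputs must match a rule."""
    for a, p, act, r in POLICY:
        if agent == a and p in ("*", principal) and action == act and resource == r:
            return True
    return False

print(is_allowed("deploy-bot", "alice", "write", "repo:payments"))  # True
print(is_allowed("deploy-bot", "bob", "write", "repo:payments"))    # False: same agent,
                                                                    # different principal
```

The second call is the case static IAM cannot express: the agent is identical, the permission differs because the human behind the session differs.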
### 4. Audit
If the agent takes an action you cannot explain, you don't have governance — you have hope. Audit means: for every consequential action an agent took, you can answer who initiated it, through which agent, using which tool, against which resource, with what result — and you can answer it fast enough to matter during an incident.
The current state of audit logs on most agent stacks is "the chat transcript" and "scattered tool-side logs." Those are not the same thing as a principal-chained audit record.
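A principal-chained record can be sketched as a shape. The field names below are illustrative, not a standard schema; the point is that every consequential action carries the whole chain, not just the model's text:

```python
import json
import time
from dataclasses import asdict, dataclass

@dataclass(frozen=True)
class AuditRecord:
    """The chain a transcript cannot give you: human -> agent -> tool ->
    resource, plus the outcome, in one immutable record."""
    initiator: str   # human principal who started the session
    agent: str       # agent identity that acted
    tool: str        # tool / MCP server invoked
    resource: str    # upstream resource touched
    action: str
    result: str      # "allowed" / "denied" / error class
    ts: float        # when it happened

record = AuditRecord("alice", "deploy-bot", "github-mcp",
                     "repo:payments", "write", "allowed", time.time())
# Serializable, so it can land in whatever log store you already run.
print(json.dumps(asdict(record), sort_keys=True))
```

During an incident, the query you need is "every record where `resource` is the affected system," answered in seconds; a pile of chat transcripts cannot answer it at all.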
### 5. Runtime enforcement
Policy in a config file that nothing enforces is not policy. Enforcement has to happen at the call site — the moment the agent actually reaches for a tool or sends a request upstream — not at the boundary of the agent runtime, and not after the fact in a SIEM.
This is the architectural point most ignored by existing tooling. The agent runtime is not a trustworthy enforcement point. It can be manipulated by inputs it processes. Enforcement belongs in a separate layer that the agent has to pass through, not a policy the agent checks itself.
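A minimal sketch of that separation, with hypothetical names throughout: the agent holds a handle to the enforcement point, never to the tool itself, so decision, log, and denial happen in one place the agent cannot route around:

```python
class EnforcementPoint:
    """Sits between the agent and its tools. The agent never receives a
    direct tool handle, only this wrapper, so it cannot skip the check."""

    def __init__(self, policy: set, audit_log: list, tools: dict):
        self.policy = policy        # allowed (agent, principal, action, resource)
        self.audit_log = audit_log  # identity + decision recorded as one event
        self.tools = tools          # resource -> callable

    def call(self, agent: str, principal: str, action: str, resource: str):
        allowed = (agent, principal, action, resource) in self.policy
        self.audit_log.append((agent, principal, action, resource,
                               "allowed" if allowed else "denied"))
        if not allowed:
            raise PermissionError(f"{agent} denied {action} on {resource}")
        return self.tools[resource](action)

log = []
ep = EnforcementPoint(
    policy={("deploy-bot", "alice", "read", "repo:payments")},
    audit_log=log,
    tools={"repo:payments": lambda action: f"{action} ok"},
)
print(ep.call("deploy-bot", "alice", "read", "repo:payments"))  # read ok
try:
    ep.call("deploy-bot", "alice", "drop", "prod-db")           # injected instruction
except PermissionError as e:
    print(e)                                                    # denied, and logged
```

Note what the structure buys you: the denied call still produced an audit entry, because the decision and the record are the same event at the call site.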
## Why it's emerging now
Three public incidents in 12 months mapped directly onto the three weakest assumptions in the pre-agent security stack. Each incident showed an assumption breaking, and each breakage created demand for a governance primitive that hadn't existed.
The identity assumption broke first. The Salesforce Drift breach disclosed in 2025 — attributed to the ShinyHunters group — saw OAuth bearer tokens stolen from a conversational AI integration and used to access Salesforce environments across more than 700 customer organizations. No authentication alerts fired. The tokens were valid. The system had no way to distinguish "the entity that legitimately holds this token" from "whoever holds this token." That is a category-defining incident for agent identity.
The access assumption broke next. GitHub Copilot's CamoLeak vulnerability, disclosed by Legit Security in October 2025 at CVSS 9.6, showed that an AI agent operating with a developer's full repository access could be manipulated via hidden instructions in a pull request comment to exfiltrate AWS credentials through GitHub's own image proxy. The agent did exactly what its permissions allowed. The failure was not authentication. It was that "developer-level access" and "agent-level access" were treated as the same thing when they are not.
The runtime assumption broke third. The LiteLLM supply chain compromise in March 2026 — a malicious release pushed to a package with tens of millions of monthly downloads — showed that a single upstream dependency could silently harvest credentials from every machine running an agent, anywhere in the stack. The agent did not misbehave. The infrastructure under the agent did. Runtime enforcement at the call site would have detected the anomalous egress. Enforcement inside the agent runtime would not have.
Each incident answered a structural question the industry had been putting off: do agents need their own identity? their own access model? their own runtime controls? The answer in each case was yes.
Market response has been consistent with that answer. Every major identity vendor shipped agent-specific products at RSAC 2026. Several large security acquisitions in the last year centered on AI and agent runtime protection. Regulatory frameworks — NIST's AI RMF updates, the IETF AIMS working group, OWASP's Top 10 for LLM Applications — are codifying the controls. This is the same pattern seen when cloud workloads outgrew endpoint security (and CSPM emerged) and when SaaS adoption outgrew network DLP (and CASB emerged). New compute layer, new governance layer.
## Reference architecture
A working agent governance architecture has four components, deployed as a system:
- A discovery surface that knows every agent, every tool, every MCP server, every upstream connection in the environment — continuously, not as a point-in-time scan.
- An identity layer that issues non-bearer credentials to agents, binds them to proof-of-possession keys, and can answer "who is acting right now" for every call.
- A policy engine whose inputs are four-dimensional — agent, principal-chain, action, resource — and whose output is an allow/deny decision at the moment of the call.
- An enforcement point between the agent and the upstream system. Every call the agent makes passes through it. It decides, logs, and records — so that identity, policy, and audit are the same event, not three eventually consistent systems.
The key architectural constraint: enforcement must be out of band of the agent runtime. The agent cannot be trusted to enforce controls on itself, because the agent's reasoning is driven by inputs that may be adversarial. Controls live in a layer the agent has to pass through.
## Common questions
**Is this different from IAM?** Yes. IAM assigns static identities and permissions to entities — users, service accounts, roles. Agent governance is four-dimensional and session-scoped: the question isn't "what can this agent do" but "what can this agent, acting as this person, in this session, for this action, do right now." Existing IAM systems are an input to agent governance, not a replacement for it.

**Is this different from an API gateway?** Yes. API gateways enforce policy at the network boundary of a service. Agent governance enforces policy at the boundary of an agent's action — which may cross many services, many tools, many tokens in a single logical operation. It is also bidirectional: governance applies both to what the agent calls upstream and to what the agent does with the results.

**Is this different from CASB?** Related, not the same. CASB was built for humans using SaaS through browsers. Agents use SaaS through APIs, often with different credentials, across many tools per session. The inventory problem is similar. The identity and enforcement problems are different.

**Do I need this if agents are only internal?** Most of the public incidents above involved internal-facing agents. Internal does not mean low-risk when the agent has production access. The most common incident shape in practice is not an external attack — it is an internal agent taking an action nobody authorized.

**How does this relate to the IETF AIMS drafts?** AIMS (Agent Identity and Management) is standardizing the identity layer — specifically, how agents prove their identity in a way that bearer tokens cannot. It addresses dimension 2 of agent governance. Access control, audit, and runtime enforcement are separate problems that AIMS deliberately leaves out of scope.
## The landscape
The category is forming, not formed. Four rough vendor archetypes exist today, each covering part of the five dimensions:
- Identity-first vendors are building agent-specific identity and proof-of-possession credentials. Strong on dimension 2, thin on the others.
- Observability-first vendors are building agent traces, tool call logs, and model monitoring. Strong on dimensions 1 and 4, weak on enforcement.
- Runtime-protection vendors are building model-layer defenses — prompt injection filters, jailbreak detection, output scanning. Adjacent to dimension 5, but operating on the model's text, not the agent's actions.
- Governance-platform vendors are trying to cover all five dimensions as a single control plane. This is the newest archetype and the thinnest today.
A mature stack will probably involve pieces from multiple archetypes. An honest self-assessment starts with mapping your current coverage against the five dimensions and seeing which ones are blank.
## A starter playbook for security leaders
You do not need to buy anything to start. A useful 30-day baseline:
- **Inventory.** For each business unit, ask: what agents are running? what tools do they connect to? whose credentials are they using? Most organizations cannot answer this. Building the list is the first piece of governance work.
- **Identity audit.** For every agent in the inventory, identify the credential type. Every bearer token is a risk. Prioritize the agents with access to production systems.
- **Enforcement gap analysis.** For each agent, answer: if this agent were compromised today, what is the blast radius? The agents where the answer is "unlimited" are where runtime enforcement needs to land first.
The objective in the first 30 days is not to solve the problem. It is to make the gaps concrete enough that the next budget cycle can address them.
## Where FirstOps fits
FirstOps is building the governance-platform archetype — a single control plane covering visibility, identity, access, audit, and runtime enforcement across the agent runtimes a team actually uses (Claude Code, Cursor, MCP servers, LangChain, OpenAI Agents SDK, and in-house orchestrators). The goal is to make the work described in the playbook above something a security team can do in hours instead of quarters.
If that is the shape of problem you are working on, we would be glad to walk through it with you.
## Further reading
- Your Agent Passed OAuth. Now What? — why bearer tokens are an antipattern for agent identity, and what the IETF is doing about it.
- A Security Scanner Walked Into a Supply Chain — what the LiteLLM compromise means for every stack that runs agents.
- Your Coding Agent Has Your Keys — a trust boundary analysis of coding agents and the credentials they inherit.
- Prompt Injection Is Not the Incident — why stopping prompt injection at the model layer is not enough.