
Your Agent Passed OAuth. Now What?

OAuth was designed for humans clicking 'Authorize' in a browser. AI agents don't click anything. The protocol's core assumptions (human presence, static scopes, one-time consent, bearer semantics) break in ways that have already caused real breaches. The industry is converging on proof-of-possession. Here's why, and what comes after.

Anshal Dwivedi · 16 min read

In 2025, a threat group called ShinyHunters stole OAuth bearer tokens from Drift, Salesforce's conversational AI chatbot platform. With those tokens, they accessed Salesforce environments across more than 700 organizations. No alarms went off. No authentication alerts triggered. The identity system worked exactly as designed.

That's the problem. The tokens were valid. The access was authorized. And the system had no way to know that the entity presenting the token wasn't the entity the token was issued to.

Bearer tokens are like cash. Whoever holds the bill can spend it. That property was an acceptable trade-off when the holder was a human sitting at a browser, and the token lived in a secure HTTP-only cookie for 15 minutes. It is not an acceptable trade-off when the holder is an autonomous agent passing credentials through multi-hop tool chains, and the token is stored in a config file on disk alongside 14 other secrets.

OAuth has been the backbone of authorization on the internet for 14 years. It is a well-designed protocol for the problem it was designed to solve. But the problem has changed, and OAuth's core assumptions (human presence, static scopes, one-time consent, bearer semantics) now describe a world that no longer exists.

This is not speculation. The IETF published a draft in March 2026 that says it plainly: static bearer credentials are an antipattern for agent identity. Every major identity vendor shipped agent-specific products at RSAC 2026. The industry is converging. The question is no longer whether OAuth is enough. It's what comes after.

The OAuth Social Contract#

OAuth 2.0 (RFC 6749, October 2012) codified a social contract between three parties: a user, an application, and a resource server. The user grants the application limited access to their resources. The application receives a token. The resource server accepts the token.

Every assumption in the protocol reflects a specific interaction model: a human being, sitting at a computer, making a conscious decision to grant access.

The authorization code flow, the most secure OAuth flow, works like this:

  1. The application redirects the user to the authorization server
  2. The user authenticates (enters password, completes MFA)
  3. The user reviews the requested permissions ("This app wants to read your email")
  4. The user clicks "Authorize"
  5. The authorization server issues a code
  6. The application exchanges the code for an access token
  7. The application uses the token to access the user's resources

Steps 2 through 4 require a human. A human with a browser, a human who can read a consent screen, a human who can make a judgment call about whether this application should have access to their email. The entire security model rests on that moment of informed consent.

Now consider an AI agent. It has no browser. It has no eyes to read a consent screen. It doesn't make judgment calls about authorization. It makes API calls. It runs at 3 AM on a server in us-east-1, headless, unsupervised, executing a task chain that might involve 40 tool calls across 6 different services.

The protocol has no concept of this entity.

Five Assumptions That Break#

1. Human presence#

OAuth assumes someone is there to authenticate. The authorization code flow with PKCE, the gold standard for OAuth security, requires an interactive redirect. Someone clicks a link, logs in, reviews permissions, and clicks "Authorize."

This is the flow OAuth was designed for:

  1. Human User → Browser: click "Connect with GitHub"
  2. Browser → Application: GET /connect/github
  3. Application → Browser: HTTP 302 → authorization server
  4. Browser → Authorization Server: GET /authorize?response_type=code&scope=repo
  5. Authorization Server → Browser: show login screen
  6. Browser → Human User: consent screen: "This app wants to access your repos"
  7. Human User → Browser: enter credentials + click "Authorize"
  8. Browser → Authorization Server: POST credentials + approval
  9. Authorization Server → Browser: HTTP 302 → app callback + authorization code
  10. Browser → Application: GET /callback?code=abc123
  11. Application → Authorization Server: POST /token (code + PKCE verifier)
  12. Authorization Server → Application: access token (bearer)
  13. Application → Resource Server: API call + Bearer token
  14. Resource Server → Application: protected resource

Steps 6 and 7 are where the security lives. The browser shows the user a consent screen. The user reads it, makes a judgment call, and explicitly grants permission. The entire trust model depends on that moment: a human, in real time, deciding "yes, this application should have access."

Now here is what happens when an agent needs access:

  1. AI Agent (headless) → Authorization Server: POST /token (client_id + client_secret); response: access token (bearer)
  2. AI Agent → Resource Server: API call + Bearer token; response: protected resource

Two steps. No consent screen. No scope review. No human in the loop.

The client credentials grant was designed for server-to-server communication between known, trusted services: a payment service calling an inventory service, both deployed and configured by the same team. The trust model is implicit: if you have the client secret, you are the service. The authorization server doesn't ask "should this service have access?" because the answer was decided at deployment time by the engineer who configured the credentials.

Agents are not known, trusted services. They are autonomous entities that discover tools dynamically, call APIs they weren't explicitly configured to use, and operate across trust boundaries that shift with every task. The client credentials grant gives them a token with no consent, no review, and no mechanism for the authorization server to distinguish "research agent summarizing Notion pages" from "compromised agent exfiltrating Notion pages." Both present the same client ID, the same secret, and receive the same token.

The client credentials grant trades all of OAuth's safety properties for the one property that agents need: no human required.
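The asymmetry is easy to see in code. Below is a minimal sketch of a client credentials request using only the standard library; the token endpoint URL and scope string are placeholder values. The point is what the request does not contain: no user, no consent, no redirect.

```python
from urllib.parse import urlencode

def client_credentials_request(client_id: str, client_secret: str, scope: str) -> dict:
    """Build the entire client credentials grant: a single form-encoded POST.
    Hypothetical endpoint; note the absence of any user or consent parameter."""
    return {
        "url": "https://auth.example.com/token",  # placeholder authorization server
        "body": urlencode({
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            "scope": scope,
        }),
    }

# A legitimate agent and a compromised one holding the same secret
# produce byte-identical requests; the server cannot tell them apart.
legit = client_credentials_request("research-agent", "s3cret", "notion.read")
stolen = client_credentials_request("research-agent", "s3cret", "notion.read")
```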

2. Static scopes#

OAuth scopes are coarse and fixed at authorization time. An agent requesting the repo scope on GitHub gets read and write access to all repositories. An agent requesting https://www.googleapis.com/auth/drive gets access to the entire Google Drive. There is no mechanism in OAuth to say "read the README from one specific repository" or "edit only this one spreadsheet for the next 10 minutes."

Agent tasks are the opposite of coarse and fixed. They are narrow and dynamic. An agent might need to read a Notion page, then write to a Jira ticket, then query a database, then send a Slack message. Four different resources, four different permission levels, all within a single task that took 8 seconds. OAuth's scope model cannot represent this.

The IETF AIMS draft proposes transaction tokens as a fix: tokens bound to a specific transaction that cannot be reused for other purposes. This is an acknowledgment that OAuth's scope model was not designed for entities that perform dozens of distinct operations per minute.
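To make the contrast concrete, here is a sketch of what a transaction-scoped token payload could look like. The claim names (`txn`, `scope`, `exp`) are illustrative, not the draft's normative claim set:

```python
import time
import uuid

def mint_transaction_token(agent_id: str, action: str, resource: str, ttl_s: int = 30) -> dict:
    """Illustrative transaction token payload: one action, one resource,
    a unique transaction ID, and a lifetime measured in seconds."""
    now = int(time.time())
    return {
        "sub": agent_id,
        "txn": str(uuid.uuid4()),         # unique per transaction, never reusable
        "scope": f"{action}:{resource}",  # narrow: one verb on one object
        "iat": now,
        "exp": now + ttl_s,               # expires with the task, not the session
    }

t1 = mint_transaction_token("agent://research-1", "read", "notion/page/abc")
t2 = mint_transaction_token("agent://research-1", "read", "notion/page/abc")
```

Even two identical requests mint distinct tokens: the `txn` claim makes each one single-purpose, which is exactly what OAuth's reusable scoped tokens cannot express.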

3. One-time consent#

When a human authorizes an application via OAuth, that decision is a point-in-time event. The user reviews the requested permissions once. The resulting token carries that authorization for its entire lifetime, which could be hours, days, or indefinitely if the refresh token has no expiration.

There is no mechanism for re-evaluation. If the agent's behavior changes (it starts calling tools it wasn't expected to call), if the risk context changes (the agent's host is compromised), if the task scope changes (the agent pivots from reading to writing), the token doesn't know and doesn't care. It was authorized once, and it remains authorized until it expires or is explicitly revoked.

ISACA framed this as a "looming authorization crisis": traditional IAM systems assume users authenticate occasionally, maintain stable roles, and operate during predictable hours. Agents authenticate continuously, assume different permission sets per task, operate around the clock, and communicate with other systems autonomously. The one-time consent model collapses under this workload.

4. Bearer semantics#

This is the most dangerous assumption, and the one with the most documented real-world consequences.

An OAuth access token is a bearer token. Possession equals authorization. The resource server validates the token, not the presenter. If you have the token, you are authorized. If someone else has the token, they are also authorized. The token doesn't know the difference.

For agents, this creates a specific and exploitable attack surface. Agent tokens are stored in config files, passed through environment variables, logged in debug output, cached by proxy servers, embedded in MCP configuration files, and shared across multiple agent instances. Every one of these locations is a potential point of theft.

GitGuardian's 2025 analysis found 24,008 unique secrets in MCP configuration files on public GitHub repositories. Of those, 2,117 were confirmed valid. These aren't passwords that might be rotated. They're bearer tokens that grant immediate access to whatever service the agent was configured to reach.

Obsidian Security documented the specific failure mode in the Salesloft-Drift breach: "The logs could only verify the token. They couldn't verify the system that generated the request." When every entity presents the same token format with no binding to the presenter, there is no forensic trail to follow after a breach.

The ephemeral entity paradox makes it worse. An agent might live for 500 milliseconds: spin up, execute a task, terminate. But the token it was issued lives for hours or days. The gap between the agent's lifespan and the token's lifespan is a window of exploitation that grows wider with every agent that is created and destroyed.

5. No delegation chains#

When Agent A calls Agent B, which calls Tool C on behalf of User D, OAuth has no standard way to represent this chain. The act claim in RFC 8693 (Token Exchange) provides partial support for delegation, but it has critical limitations.

The IETF OAuth Working Group noted in March 2026 that "the may_act claim validates the actor's identity, not that the actor credential was acquired within the same delegation context. It is also optional. There is no normative requirement that subject tokens carry may_act or that the STS enforce it."

In practice, this means that an agent-to-agent delegation chain is invisible to the resource server. Tool C sees a valid token. It has no way to know whether that token arrived through a legitimate delegation chain or was stolen from a log file three hops back.

The AIMS draft proposes scope reduction at each hop, where each delegation step narrows permissions, never widens them, combined with transaction tokens that bind to a specific transaction context. OAuth has no native support for either mechanism.
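Scope reduction is the simpler of the two mechanisms, and a sketch fits in a few lines: each hop receives the intersection of what it requests and what its caller holds. The scope strings are illustrative.

```python
def delegate(parent_scopes: set, requested: set) -> set:
    """Attenuating delegation: a delegate gets the intersection of what it
    asks for and what its caller already holds. Never a superset."""
    return parent_scopes & requested

user_grant = {"notion:read", "notion:write", "jira:write"}
agent_a = delegate(user_grant, {"notion:read", "notion:write"})  # narrows
agent_b = delegate(agent_a, {"notion:read", "slack:write"})      # slack:write is silently dropped
```

The invariant is that every set down the chain is a subset of the one above it, so a compromised agent three hops deep can never hold more than the original grant.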

The Breach Pattern#

These aren't theoretical concerns. There is a consistent pattern in agent-related security incidents, and it starts with bearer tokens.

The Salesloft-Drift breach (2025). ShinyHunters stole OAuth bearer tokens from Drift's platform. 700+ organizations compromised. No authentication alerts. The tokens were valid. The access was authorized. The identity system had no way to distinguish legitimate use from theft.

The LiteLLM supply chain attack (March 2026). The credential harvester in the malicious LiteLLM release swept ~/.aws/credentials, .env files, API keys, and every other bearer artifact on the machine. It didn't need to break encryption or exploit a vulnerability. The credentials were sitting on the filesystem, readable by any process running as the user. If 15 agents share the same OpenAI API key through LiteLLM, you have to rotate the key and disrupt all 15 agents, because bearer tokens have no per-agent binding.

The Meta Sev 1 (March 2026). An agent passed all identity checks, received valid tokens, called approved tools, and still performed unauthorized actions. The post-mortem finding: identity verification (who is this?) was working. Behavioral enforcement (should this specific action be allowed right now?) was not.

The Claude Code Terraform incident (February 2026). An agent with full AWS credentials ran terraform destroy, wiping 2.5 years of data. The credentials were valid. The agent was authenticated. There was no mechanism to evaluate the blast radius of a specific action before it executed.

The common thread: authentication succeeded and security failed anyway. The identity layer confirmed who the agent was. Nothing confirmed whether what the agent was doing was safe.

What the Industry Is Building#

The response has been fast. As of March 2026, there are at least seven major tracks of standards work and five major vendor products addressing agent identity.

IETF AIMS (draft-klrc-aiagent-auth-00)#

Published March 2, 2026. Authors from AWS, Zscaler, Ping Identity, and Defakto Security. This is the most comprehensive proposal. Rather than inventing a new protocol, it composes existing standards:

  • Identity: WIMSE workload identifiers (URI-based, compatible with SPIFFE IDs)
  • Credentials: Short-lived, cryptographically bound tokens, not bearer artifacts
  • Authentication: WIMSE Proof Tokens (signed JWTs proving key possession) and HTTP Message Signatures for end-to-end authentication through intermediaries
  • Scope management: Transaction tokens bound to specific transactions, not reusable
  • Human approval: OpenID CIBA for out-of-band confirmation when needed
  • Observability: Mandatory logging of agent identifier, delegated subject, accessed resource, action, authorization decision, and attestation state
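As a sketch, the mandated observability fields map naturally onto one structured log line per action. The field names below are illustrative, not the draft's schema:

```python
import json
import time

def audit_record(agent_id: str, subject: str, resource: str,
                 action: str, decision: str, attested: bool) -> str:
    """One JSON log line per agent action, covering each field the AIMS
    draft mandates (illustrative names, not a normative schema)."""
    return json.dumps({
        "ts": int(time.time()),
        "agent": agent_id,        # agent identifier
        "on_behalf_of": subject,  # delegated subject
        "resource": resource,     # accessed resource
        "action": action,
        "decision": decision,     # authorization decision
        "attested": attested,     # attestation state
    })

line = audit_record("agent://research-1", "user:alice",
                    "notion/page/abc", "read", "allow", True)
```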

The draft's position on bearer tokens is explicit: they are an antipattern. The replacement is proof-of-possession. Every request must prove that the sender holds a private key, not just a token.

WIMSE (Workload Identity in Multi-System Environments)#

The IETF working group providing the identity substrate for AIMS. The latest draft (v07, late 2025) added AI agent use cases explicitly, recognizing that "autonomous AI agents are increasingly calling various workloads on behalf of users and delegating tasks to other AI agents."

Dual-Identity Credentials (draft-ni-wimse-ai-agent-identity-02)#

A Huawei-authored draft that cryptographically binds an agent's identity to its owner's identity. Defines three issuance models: agent-mediated (offline owner signing), owner-mediated (gateway supervision), and server-mediated (real-time challenge-response). The design ensures that an agent's actions are always traceable to the human or organization that authorized its creation.

Vendor Responses#

At RSAC 2026, five vendors simultaneously shipped agent identity products:

| Vendor | Product | Approach |
| --- | --- | --- |
| Microsoft | Entra Agent ID | Conditional Access + ID Protection + Governance for agent identities |
| Okta | Cross App Access (XAA) | Identity Assertion Authorization Grant. Centralized admin control over app-to-app connections. |
| Google | Agent2Agent (A2A) | Agent Cards with signed security declarations; OAuth + mTLS + JWT |
| CrowdStrike | Falcon AI Security | Acquired SGNL for $740M. Real-time access revocation for agents. |
| Cisco | Duo Agentic Identity | Agent identity as an extension of Duo's zero-trust platform |

VentureBeat's analysis of the RSAC announcements was blunt: "Every vendor verified who the agent was. None tracked what the agent did."

That observation matters. It points to the gap between identity (authentication) and governance (enforcement), which is where the real problem lives.

Proof-of-Possession: The Minimum Viable Fix#

Every track of standards work converges on one mechanism: proof-of-possession. Instead of a bearer token that anyone can use, the agent proves it holds a private key on every request.

The most practical implementation today is DPoP (Demonstrating Proof of Possession), published as RFC 9449 in September 2023. It works at the application layer, requires no infrastructure changes, and passes through proxies and load balancers unmodified. It is already mandated by the Financial-grade API (FAPI 2.0) specification.

The mechanism:

  1. The agent generates an asymmetric key pair (ES256). The private key never leaves the agent's environment.
  2. The public key thumbprint (a SHA-256 hash of the public key, per RFC 7638) is registered with the server. The server stores this thumbprint as the binding for the agent's identity.
  3. On every request, the agent creates a DPoP proof: a signed JWT that includes the full public key in its header, plus the HTTP method, target URL, a unique identifier (jti), and a timestamp.
  4. The server verifies in two steps. First, it extracts the public key from the proof's header and verifies the signature, confirming the sender holds the corresponding private key. Second, it computes the thumbprint of that public key and compares it to the stored thumbprint, confirming this is the registered key and not just any valid key pair. Then it checks that the method and URL match the actual request, the jti hasn't been seen before, and the timestamp is within a clock-skew window.
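Step 2, the RFC 7638 thumbprint, is small enough to show in full. This sketch uses only the standard library; the P-256 coordinates below are sample values, not a real key:

```python
import base64
import hashlib
import json

def jwk_thumbprint(jwk: dict) -> str:
    """RFC 7638: SHA-256 over the canonical JSON of only the required members
    (sorted keys, no whitespace), base64url-encoded without padding."""
    required = {"EC": ("crv", "kty", "x", "y"),
                "RSA": ("e", "kty", "n"),
                "OKP": ("crv", "kty", "x")}
    members = {k: jwk[k] for k in required[jwk["kty"]]}
    canonical = json.dumps(members, separators=(",", ":"), sort_keys=True)
    digest = hashlib.sha256(canonical.encode()).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode()

# Sample (not real) P-256 public key coordinates:
key = {"kty": "EC", "crv": "P-256",
       "x": "f83OJ3D2xF1Bg8vub9tLe1gHMzV76e8Tus9uPHvRVEU",
       "y": "x_FEzRu9m36HLN_tue659LNpXW6pCyStikYjKIWI5a0"}
thumbprint = jwk_thumbprint(key)
```

Optional members like key IDs are deliberately excluded from the hash, so any serialization of the same key produces the same thumbprint.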

What this changes:

| Attack | Bearer token | With DPoP |
| --- | --- | --- |
| Token stolen from config file | Full access | Useless. Attacker can't generate proofs. |
| Token intercepted on network | Full access until expiry | Proof is bound to one request. Replay fails. |
| Token found in logs | Full access | Token alone is insufficient without the private key |
| Connection ID guessed | If token is UUID-based, full access | No valid proof, rejected |
| CI/CD pipeline compromised | Leaked token = permanent access | Token without key = no access |

DPoP is not a silver bullet. It does not prevent compromise of the private key itself. If an attacker gains root access to the machine where the key is stored, they can extract it. That's a problem for hardware-backed key storage (TPM, Secure Enclave) and runtime attestation (SPIFFE/WIMSE), the next layer of the stack, not the current one.

But DPoP eliminates the most common and most exploitable failure mode: the token that sits on disk, gets logged, gets shared, gets stolen, and works for anyone who picks it up. That alone changes the economics of attacking agent infrastructure.
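The server-side checks from step 4 can be sketched as plain claim validation. Signature verification itself needs a crypto library, so it is injected here as a boolean, and the proof's key thumbprint (`jkt`) is assumed to be precomputed from the public key in the proof header; everything else is standard library:

```python
import time

def verify_dpop_claims(proof: dict, *, method: str, url: str,
                       registered_thumbprint: str, seen_jti: set,
                       signature_ok: bool, max_skew_s: int = 60) -> bool:
    """DPoP checks beyond the signature: registered key, bound request,
    single use, fresh timestamp. `jkt` here stands in for the thumbprint
    computed from the key in the proof header (a simplification)."""
    if not signature_ok:                        # sender holds the private key?
        return False
    if proof["jkt"] != registered_thumbprint:   # the registered key, not just any key
        return False
    if proof["htm"] != method or proof["htu"] != url:  # bound to this exact request
        return False
    if proof["jti"] in seen_jti:                # replayed proof?
        return False
    if abs(time.time() - proof["iat"]) > max_skew_s:   # clock-skew window
        return False
    seen_jti.add(proof["jti"])
    return True
```

A stolen token fails at the first two checks: without the private key there is no valid signature, and without the registered key there is no matching thumbprint.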

Identity Is Necessary but Not Sufficient#

Here is the part that most agent identity proposals miss, and that the VentureBeat observation gets exactly right.

Proof-of-possession solves the authentication problem: is this agent who it claims to be? It does not solve the authorization problem: should this agent be doing what it's doing right now?

The Meta Sev 1 agent passed all identity checks. The Claude Code Terraform agent had valid credentials. The financial institution's AI wire fraud agent was fully authenticated. In every case, the agent was correctly identified, correctly authorized, and still caused damage.

The gap is between identity and enforcement. Specifically:

Identity tells you who. This is Agent X, owned by Team Y, running in environment Z. DPoP/WIMSE/SPIFFE all solve this well.

Authorization tells you what. Agent X has access to Notion and Jira, scoped to the engineering workspace. OAuth scopes and access policies solve this partially.

Enforcement tells you whether. This specific tool call, with these specific arguments, at this specific moment: should it go through? This is where everything falls apart. A write_file call to /tmp/report.txt and a write_file call to /etc/passwd look identical to the identity layer. The distinction is pure policy.

An agent calling notion.pages.create with a summary of a meeting is routine. The same agent calling notion.pages.create with the contents of /etc/shadow injected via prompt manipulation is a data exfiltration event. The identity is the same. The authorization is the same. The action is different. Only a system that inspects the actual content and context of each request can distinguish between the two.

The AIMS draft recognizes this. It mandates observability, logging every action, not just every authentication event. But observability after the fact is forensics, not prevention. The missing piece is inline enforcement: evaluating every tool call against policy before it reaches the upstream service, and blocking or transforming calls that violate policy in real time.
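A minimal sketch of that inline check, with illustrative paths and policy rules (not any particular product's API):

```python
def evaluate_tool_call(tool: str, args: dict) -> str:
    """Content-aware policy: same tool, different arguments, different verdicts.
    The rules below are illustrative, not a real policy engine."""
    if tool == "write_file":
        path = args.get("path", "")
        if path.startswith("/etc/") or path.startswith("/root/"):
            return "block"   # system paths: hard stop before the call executes
        if path.startswith("/tmp/"):
            return "allow"
        return "flag"        # allowed, but surfaced for review
    return "allow"

routine = evaluate_tool_call("write_file", {"path": "/tmp/report.txt"})
exfil = evaluate_tool_call("write_file", {"path": "/etc/passwd"})
```

The identity layer sees two identical `write_file` calls; only an argument-inspecting policy layer can return different verdicts for them.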

Where This Is Heading#

The industry is converging on an architecture that has three layers, not one:

Layer 1: Identity (who is this agent?)
  → DPoP / WIMSE / SPIFFE
  → Proof-of-possession, not bearer tokens
  → Per-agent identity, not shared credentials
  → Lifecycle management (creation, rotation, revocation)

Layer 2: Authorization (what is this agent allowed to access?)
  → Scoped tokens, transaction tokens
  → Access groups and policy assignment
  → Scope reduction across delegation chains

Layer 3: Enforcement (should this specific action proceed?)
  → Inline inspection of tool calls and responses
  → Content-aware policy evaluation
  → Real-time blocking, flagging, and transformation
  → Continuous monitoring, not one-time consent

OAuth lives in layers 1 and 2. It provides a framework for identity and coarse-grained authorization. But it was never designed for layer 3, and layer 3 is where agents cause damage.

Most of the standards work happening right now (AIMS, WIMSE, DPoP, XAA) focuses on layers 1 and 2. That work is critical. Without a solid identity foundation, enforcement has nothing to anchor to. But identity without enforcement is a lock on the front door with no walls.

The next generation of agent infrastructure will be defined by systems that can answer all three questions: who is this, what can it access, and should this specific action proceed. In real time, on every request, at every hop in the chain.

OAuth got the internet through 14 years of "who is this user and what did they consent to?" It was the right answer for that question. The question has changed.

