supply-chain · security · litellm · ai-infrastructure · incident-analysis

A Security Scanner Walked Into a Supply Chain: What the LiteLLM Compromise Means for AI Agents

On March 24, 2026, a bug in malware crashed a developer's machine, uncovering a 24-day supply chain attack that turned a security scanner into a weapon against AI infrastructure.

Anshal Dwivedi
·13 min read

A developer was working in Cursor when their machine ran out of RAM. Not gradually. All at once. The culprit was a Python process spawning copies of itself in an infinite loop, eating memory until the system crashed. A fork bomb, but not a deliberate one. A bug.

The bug was in malware. Specifically, in a .pth file that had been injected into a new release of LiteLLM, a Python library with 95 million monthly downloads. The malware was designed to silently harvest every credential on the machine (SSH keys, cloud tokens, Kubernetes secrets, crypto wallets), encrypt it all, and send it to an attacker-controlled server. It would have worked, too, except the author made a mistake in the subprocess spawning logic that turned the payload into an accidental fork bomb.

That crash is the only reason the compromise was caught as fast as it was.

What Is LiteLLM, and Why Does It Matter?

LiteLLM is the plumbing behind a large portion of the AI agent ecosystem. It provides a unified Python interface for calling LLM providers (OpenAI, Anthropic, Cohere, and dozens of others) through a single API. Instead of writing integration code for each provider, developers route everything through LiteLLM.

That makes it useful. It also makes it a high-value target. LiteLLM handles API keys for every LLM provider an organization uses. It sits in CI/CD pipelines, on developer machines, inside agent frameworks, and on production servers. With roughly 40,000 GitHub stars and 95 million monthly downloads, it is embedded as both a direct and transitive dependency across hundreds of AI tools, MCP servers, and orchestration libraries.

The developer who discovered the compromise wasn't even using LiteLLM directly. An MCP plugin in Cursor pulled it as a transitive dependency. Three layers of indirection between the developer's intent and the compromised package.

The Origin: Trivy and the Unlocked Door

The LiteLLM compromise didn't start with LiteLLM. It started 24 days earlier with Trivy, a security scanner.

What Trivy Is

Trivy is one of the most popular vulnerability scanners in the open-source ecosystem, maintained by Aqua Security. Developers embed it in their automated build pipelines to scan code, containers, and configurations for known security issues. Thousands of projects on GitHub use Trivy as a GitHub Action, an automated step that runs whenever code is pushed.

On February 28, 2026, an automated bot account called hackerbot-claw submitted a pull request to Trivy's repository. Under normal circumstances, code from an outside contributor runs with limited permissions. It cannot access the project's secrets. But Trivy's CI/CD pipeline used a trigger called pull_request_target, which is designed for tasks that require write access (labeling PRs, posting comments). The problem: if the pipeline also checks out and executes the contributor's code, that untrusted code inherits all of the project's secrets.

Trivy's pipeline had this misconfiguration. The bot's code ran with full access and stole a Personal Access Token (PAT) with write permissions to the entire Trivy repository.
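The risky combination described above (the privileged trigger plus a checkout of the contributor's code) can be searched for in your own repositories. The following is a rough text heuristic, not a YAML-aware audit; the function names are illustrative, and it only looks for the common pattern where a workflow references github.event.pull_request.head.sha or .ref.

```python
import re
from pathlib import Path

# Heuristic only: flags workflows that combine the privileged trigger with a
# reference to the pull request's head commit or branch.
TRIGGER = re.compile(r"\bpull_request_target\b")
HEAD_CHECKOUT = re.compile(r"github\.event\.pull_request\.head\.(sha|ref)")


def is_risky_workflow(workflow_text: str) -> bool:
    """Return True if the text uses pull_request_target and references the PR head."""
    return bool(TRIGGER.search(workflow_text) and HEAD_CHECKOUT.search(workflow_text))


def find_risky_workflows(repo_root: str) -> list[Path]:
    """Scan .github/workflows under repo_root for the risky combination."""
    workflows = Path(repo_root) / ".github" / "workflows"
    if not workflows.is_dir():
        return []
    return sorted(
        path
        for path in workflows.iterdir()
        if path.suffix in {".yml", ".yaml"}
        and is_risky_workflow(path.read_text(encoding="utf-8", errors="replace"))
    )
```

A hit is a prompt for manual review rather than proof of a vulnerability, and a clean result does not rule out other ways of running untrusted code in a privileged job.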

The Failed Remediation

Aqua Security noticed the breach and rotated credentials. But they missed some. The attacker retained residual access.

The Tag Hijack (March 19)

On March 19, the attacker (a group calling themselves TeamPCP) used that residual access to execute a tag hijack.

Here is how GitHub Action versioning works: when a project includes Trivy in its pipeline, it references a version tag, something like aquasecurity/trivy-action@v0.69. That tag is a pointer to a specific commit. But tags in Git are mutable. Anyone with write access to the repository can change where a tag points. Move the tag, and every project referencing it pulls the new code on its next run, without changing a single line in their own configuration.
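To make the difference between a mutable tag and a pinned commit concrete, here is a small sketch that lists every action reference in a workflow file that is not a full 40-character commit SHA. It is a line-based heuristic with an illustrative function name; it does not parse YAML, and it will also flag references such as docker:// images that use a different pinning scheme.

```python
import re

# Matches "uses: owner/repo@ref" or "uses: owner/repo/path@ref" lines in workflow YAML.
USES_LINE = re.compile(r"^\s*-?\s*uses:\s*['\"]?([^\s@'\"]+)@([^\s'\"#]+)")
FULL_SHA = re.compile(r"^[0-9a-f]{40}$")


def unpinned_actions(workflow_text: str) -> list[tuple[str, str]]:
    """Return (action, ref) pairs whose ref is not a full 40-character commit SHA."""
    findings = []
    for line in workflow_text.splitlines():
        match = USES_LINE.match(line)
        if match and not FULL_SHA.match(match.group(2)):
            findings.append((match.group(1), match.group(2)))
    return findings
```

Given a workflow that contains aquasecurity/trivy-action@v0.69, the function reports that pair, because v0.69 is a tag that can be moved. A reference pinned to a full commit SHA is left out of the report, since a SHA identifies one specific commit.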

TeamPCP force-pushed nearly all version tags in Trivy's GitHub Action repository. All of them now pointed to malicious code that looked and behaved like a normal Trivy scan, but with one addition: before running the scanner, it silently collected every secret it could find in the CI/CD environment (SSH keys, cloud credentials, API tokens), encrypted them, and exfiltrated them to an attacker-controlled server.

Think of it this way: imagine a library catalog where the card for a popular book now points to a different shelf. The book on that shelf has the same cover, the same title, the same first chapter. But hidden in the middle are extra pages that photograph everything in your bag while you're reading.

Every project that referenced Trivy by tag was now running the attacker's code.

The Cascade: From Trivy to LiteLLM

LiteLLM's CI/CD pipeline used Trivy to scan for vulnerabilities. Their workflow configuration referenced aquasecurity/trivy-action@v0.69, a tag, not a pinned commit SHA. After the tag hijack, LiteLLM's pipeline started running the malicious version of Trivy without anyone at BerriAI (LiteLLM's maintainer) changing anything or noticing anything wrong.

Inside LiteLLM's CI/CD environment, the compromised Trivy action found what it was looking for: a PYPI_PUBLISH token. This is the credential that authorizes publishing new versions of LiteLLM to PyPI, the Python package registry. The attacker exfiltrated it.

What happened next was fast.

At 10:39 UTC on March 24, the attacker published LiteLLM version 1.82.7 to PyPI. The malicious code was injected into litellm/proxy/proxy_server.py at line 128: a 12-line insertion containing a base64-encoded payload, placed between legitimate code blocks. The file actually contained three iterations of the payload, with earlier versions commented out. Development artifacts left in the production release, an operational security failure by the attacker.

Thirteen minutes later, at 10:52 UTC, they published version 1.82.8. This version was more aggressive. Instead of hiding in a Python source file (which only executes when imported), it used a .pth file. That changes everything.

Neither version had a corresponding release on GitHub. No code review, no CI/CD run, no approval process. The attacker published directly to PyPI with a valid credential. The registry accepted it because the token was legitimate.
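A release that exists on the registry but not in the source repository is a signal that can be checked mechanically. The sketch below compares two lists you would collect yourself (published versions and repository tags); the function name, the sample inputs in the usage note, and the assumption that tags look like the version with an optional "v" prefix are all illustrative.

```python
def versions_without_source_tag(pypi_versions, git_tags):
    """Return published versions that have no matching Git tag.

    Assumes tags are the bare version or the version with a leading "v".
    """
    def normalized(tag):
        return tag[1:] if tag.startswith("v") else tag

    tagged = {normalized(tag) for tag in git_tags}
    return sorted(version for version in pypi_versions if version not in tagged)
```

With hypothetical inputs ["1.82.6", "1.82.7", "1.82.8"] and ["v1.82.5", "v1.82.6"], the function returns ["1.82.7", "1.82.8"]. Projects with other tagging conventions would need a different normalization step.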

The .pth Mechanism: A Trap in the Bookshelf

Python has a feature that most developers don't know about. Files with a .pth extension, placed in the site-packages directory, are processed by Python's site.py module at interpreter startup, before any of your own code or imports run. Most lines in a .pth file are treated as paths to add to sys.path, but any line that begins with import is executed as code. It's a startup hook built into the language.

Version 1.82.8 included a file called litellm_init.pth (34,628 bytes) that exploited this mechanism. A single line in the file imports subprocess and launches a detached Python process to decode and execute the base64 payload. The result: the malware runs every time a Python interpreter starts in an environment where the package is installed.

Not just when you import LiteLLM. Every Python process:

  • python anything.py
  • pip install something_else
  • Your IDE's language server
  • Your linter
  • Your test runner
  • Any script in any Python environment where LiteLLM happens to be installed

Most supply chain malware is a trap inside a book. It triggers when you open the book. This was a trap in the bookshelf. It triggers every time you walk into the room, regardless of which book you're reaching for.
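Because the hook lives in ordinary files, it can be inspected. The sketch below lists the lines in each .pth file that site.py would execute rather than treat as a path; the function names are illustrative, and the check mirrors site.py's rule that executable lines start with "import" followed by a space or tab.

```python
import site
from pathlib import Path


def executable_pth_lines(directory: str) -> dict[str, list[str]]:
    """Map each .pth file in directory to the lines site.py would execute."""
    findings = {}
    for pth in sorted(Path(directory).glob("*.pth")):
        text = pth.read_text(encoding="utf-8", errors="replace")
        hits = [
            line
            for line in text.splitlines()
            if line.startswith(("import ", "import\t"))
        ]
        if hits:
            findings[str(pth)] = hits
    return findings


def scan_current_environment() -> dict[str, list[str]]:
    """Scan the site-packages directories of the running interpreter."""
    results = {}
    for directory in site.getsitepackages() + [site.getusersitepackages()]:
        if Path(directory).is_dir():
            results.update(executable_pth_lines(directory))
    return results
```

Some legitimate packages ship executable .pth lines, so a hit is not proof of compromise. What deserves attention is an unfamiliar file, a very long line, or a line that spawns a subprocess or decodes base64.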

Coding Agents Make This Worse

A human developer might hesitate before running pip install on a package they didn't choose. They might check when the version was published, whether it has a corresponding GitHub release, or whether anything looks off. A coding agent doesn't hesitate.

Cursor, Claude Code, Windsurf, and similar coding agents discover tools and libraries from internet context: documentation, READMEs, Stack Overflow answers, blog posts. The agent reads pip install litellm in a tutorial, and it runs the command. It doesn't check the publish date. It doesn't compare the PyPI release to the GitHub source. It doesn't notice that two versions were published 13 minutes apart with no code review. It trusts the context it finds, executes with the developer's full filesystem and network permissions, and moves on.

This is exactly what happened in this incident. The developer who discovered the compromise wasn't using LiteLLM directly. A coding agent in Cursor installed an MCP plugin that pulled LiteLLM as a transitive dependency. The agent made the installation decision based on available context, the developer had no reason to audit it, and the .pth mechanism ensured the malware ran on every subsequent Python process, including the agent's own.

For a supply chain attacker, coding agents are the ideal distribution mechanism. They install without scrutiny, execute with broad permissions, and run on machines where credentials are stored in all the usual places. The attack surface isn't just the package registry anymore. It's every piece of context on the internet that a coding agent might act on.

The Payload: What the Malware Did

The payload operated in three stages.

Stage 1: Credential Harvester

A 332-line script systematically swept the compromised machine:

| Category | What It Targeted |
| --- | --- |
| SSH | id_rsa, id_ed25519, id_ecdsa, authorized_keys, known_hosts |
| AWS | IMDS metadata endpoint (including IMDSv2 signed requests), ~/.aws/credentials, ~/.aws/config |
| GCP | application_default_credentials.json |
| Azure | ~/.azure/ directory |
| Kubernetes | Service account tokens, full secret dump across all namespaces |
| Environment files | .env, .env.production, searched recursively up to 6 directories deep |
| Databases | .pgpass, my.cnf, redis.conf |
| Cryptocurrency | Bitcoin wallet.dat, Ethereum keystores, Solana keypairs, Monero, Cardano |
| TLS/SSL | .pem, .key, .p12, .pfx files |
| CI/CD | terraform.tfvars, terraform.tfstate, Jenkinsfile |
| Git | .gitconfig, stored credentials |
| System | /etc/passwd, /etc/shadow, shell histories, running processes, routing tables |

Stage 2: Exfiltration

The harvested data was bundled into an archive named tpcp.tar.gz (the name references the group). A 32-byte AES-256-CBC session key was generated via openssl rand, encrypted with a hardcoded RSA-4096 public key, and the archive plus encrypted key were POSTed to https://models.litellm[.]cloud/, a domain designed to look like legitimate LiteLLM infrastructure.

The same RSA public key appeared in all three TeamPCP operations (Trivy, KICS, LiteLLM), providing strong attribution across the campaign.

Stage 3: Persistence and Lateral Movement

The malware installed a backdoor at ~/.config/sysmon/sysmon.py, registered as a systemd user service called "System Telemetry Service." After a 5-minute initial delay, it polled https://checkmarx[.]zone/raw every 50 minutes for new payloads.

The kill switch is worth noting. If the C2 server's response contained "youtube.com", execution stopped. At the time of analysis, the C2 URL redirected to Queen's "The Show Must Go On" on YouTube.

If the malware found a Kubernetes service account token, it escalated. It read every secret in every namespace in the connected cluster, then deployed privileged pods named node-setup-{node_name} to every cluster node. These pods ran in the kube-system namespace with privileged: true, hostPID, hostNetwork, full host filesystem mounted at /host, and tolerations for all taints. Each pod executed chroot /host to access the host filesystem, giving the attacker root access to every node in the cluster.
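The pod characteristics described above translate into a simple check over a pod listing. The sketch below assumes you have saved the output of `kubectl get pods -n kube-system -o json` and pass it in as a string; the function name and the exact flagging rule (name prefix, or privileged together with hostPID and hostNetwork) are illustrative choices, not a complete detection.

```python
import json


def suspicious_pods(pod_list_json: str) -> list[str]:
    """Return names of pods that match the indicators described above."""
    flagged = []
    for pod in json.loads(pod_list_json).get("items", []):
        name = pod.get("metadata", {}).get("name", "")
        spec = pod.get("spec", {})
        # A pod counts as privileged if any of its containers requests it.
        privileged = any(
            (container.get("securityContext") or {}).get("privileged") is True
            for container in spec.get("containers", [])
        )
        host_access = bool(spec.get("hostPID") and spec.get("hostNetwork"))
        if name.startswith("node-setup-") or (privileged and host_access):
            flagged.append(name)
    return flagged
```

Legitimate system components can also run privileged with host access, so the second condition will produce names that need manual review. The node-setup- prefix is the more specific indicator.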

The Domino Map

Feb 28  Bot exploits misconfigured Trivy pipeline, steals PAT
          ↓
Mar 19  TeamPCP re-enters Trivy, hijacks 75 version tags
          ↓
Mar 20  45+ npm packages compromised via stolen CI/CD tokens
          ↓
Mar 22  Malicious Trivy images published to Docker Hub
          ↓
Mar 23  KICS (another security tool) compromised, 35 tags hijacked
          ↓
Mar 24  LiteLLM's pipeline runs poisoned Trivy
  10:39   → PYPI_PUBLISH token stolen
  10:39   → Malicious v1.82.7 published to PyPI
  10:52   → Malicious v1.82.8 published to PyPI
          → Credential harvesting begins on every machine that installs/updates
          → Persistent backdoors deployed
          → Kubernetes clusters compromised

Five ecosystems (GitHub Actions, Docker Hub, npm, Open VSX, and PyPI) compromised in 24 days. One misconfigured pipeline in a security scanner was the entry point for all of it.

What This Means for AI Infrastructure

This is not a hypothetical scenario for AI infrastructure. This happened.

LiteLLM is agent infrastructure. It is the layer that brokers API keys between AI agents and LLM providers. The harvester that ran on compromised machines was generic: it swept SSH keys, cloud credentials, database passwords, and everything else it could find. It didn't target agents specifically. But LiteLLM tends to be installed on exactly the kind of machines that concentrate high-value credentials: LLM provider keys, cloud tokens, SaaS API secrets. The overlap between "machines running LiteLLM" and "machines with agent credentials worth stealing" is nearly 100%.

Three structural problems amplified the damage, and none of them have been resolved.

Nobody knows what their agents are connected to. The developer who discovered the compromise didn't choose to install LiteLLM. An MCP plugin pulled it as a transitive dependency. Three layers of indirection between the developer's intent and the compromised package. In agent ecosystems, dependency graphs are deep and opaque. Most organizations cannot even inventory which agents depend on LiteLLM, let alone monitor those connections for compromise. You can't govern what you can't see.

Agent credentials sit on disk, in the clear. The harvester's target list reads like a map of how the industry manages agent secrets: files in ~/.aws/, files in ~/.ssh/, files in .env, API keys in config files. The malware didn't need to break any encryption or exploit any vulnerability to read them. They were sitting on the filesystem, readable by any process running as the user. Every agent on every compromised machine just had its credentials, and the credentials of every service it connects to, exfiltrated. If those credentials were brokered through a secure channel and never written to disk, a harvester finds nothing to harvest.

There is no identity layer for agents. Per-agent identity wouldn't have prevented this breach. The .pth file ran as the user process, not as an agent. But the absence of an identity layer made the response far harder. Once the credentials were stolen, which agents were affected? If 15 agents share the same OpenAI API key (because that's how LiteLLM works, one key per provider), you have to rotate the key and disrupt all 15 agents. There's no way to revoke access for one compromised agent without breaking every agent that shares the same credential.

The supply chain vulnerability itself is well-understood. That's a problem for package registries and build systems to solve. The less visible problem is what happens after the supply chain is breached: agent credentials are exposed, agent connections are invisible, and there is no governance layer to detect, attribute, or contain the damage. That's the gap.

What To Do Right Now

If you use LiteLLM in any environment, take these steps.

Check your installed version. If you have litellm==1.82.7 or litellm==1.82.8, you are affected. The last known clean version is 1.82.6.
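For inventorying many environments, the version check can be scripted with the standard library. This is a minimal sketch with illustrative function names; it only knows about the two versions named in this article.

```python
from importlib import metadata

COMPROMISED_VERSIONS = {"1.82.7", "1.82.8"}  # versions named in this article


def is_compromised(version: str) -> bool:
    """Return True if the version string is one of the known-bad releases."""
    return version.strip() in COMPROMISED_VERSIONS


def check_installed(package: str = "litellm") -> str:
    """Describe the state of the package in the current Python environment."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        return f"{package} is not installed in this environment"
    status = "COMPROMISED" if is_compromised(version) else "not a known-bad version"
    return f"{package} {version}: {status}"
```

One caveat follows from the .pth mechanism: starting an interpreter inside an affected environment is enough to trigger the payload. On a machine you already suspect, inspect the site-packages directory with a non-Python tool instead of running this there.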

Search for persistence artifacts:

  • ~/.config/sysmon/sysmon.py
  • ~/.config/systemd/user/sysmon.service
  • /tmp/pglog
  • /tmp/.pg_state
  • Pods named node-setup-* in the kube-system namespace
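The filesystem entries in the list above can be checked in one pass. In this sketch the function name and the home and tmp parameters are illustrative (they exist so the check can be pointed at another user's home directory or a mounted disk image); the pod indicator is not covered, since it requires cluster access.

```python
from pathlib import Path

# Filesystem indicators from the list above, split by base directory.
HOME_ARTIFACTS = [".config/sysmon/sysmon.py", ".config/systemd/user/sysmon.service"]
TMP_ARTIFACTS = ["pglog", ".pg_state"]


def find_artifacts(home=None, tmp="/tmp"):
    """Return the indicator paths that exist under the given home and tmp dirs."""
    home_dir = Path(home) if home is not None else Path.home()
    candidates = [home_dir / relative for relative in HOME_ARTIFACTS]
    candidates += [Path(tmp) / relative for relative in TMP_ARTIFACTS]
    return [path for path in candidates if path.exists()]
```

Calling find_artifacts() with no arguments checks the current user only. An empty result means these specific files are absent, not that the machine is clean.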

Verify package hashes:

| Package | SHA-256 (compromised) |
| --- | --- |
| litellm-1.82.7 | 8395c3268d5c5dbae1c7c6d4bb3c318c752ba4608cfcd90eb97ffb94a910eac2 |
| litellm-1.82.8 | d2a0d5f564628773b6af7b9c11f6b86531a875bd2d186d7081ab62748a800ebb |
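If you have package files in a pip cache, an artifact store, or an internal mirror, they can be compared against these hashes. The sketch below uses illustrative function names. The table does not say which artifact type (wheel or source archive) each hash refers to, so a non-match on one file type should not be read as an all-clear.

```python
import hashlib

# SHA-256 values from the table above, mapped to the package they identify.
COMPROMISED_SHA256 = {
    "8395c3268d5c5dbae1c7c6d4bb3c318c752ba4608cfcd90eb97ffb94a910eac2": "litellm-1.82.7",
    "d2a0d5f564628773b6af7b9c11f6b86531a875bd2d186d7081ab62748a800ebb": "litellm-1.82.8",
}


def sha256_of(path: str) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def match_compromised(path: str):
    """Return the matching package label, or None if the file's hash is not listed."""
    return COMPROMISED_SHA256.get(sha256_of(path))
```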

Check for C2 communication:

  • models.litellm[.]cloud (exfiltration endpoint)
  • checkmarx[.]zone (persistence C2)
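Proxy, DNS, or firewall logs can be searched for these two domains. The sketch below works on any iterable of text lines, so it makes no assumption about log format; the function names are illustrative. The domains stay defanged in the source and are converted only at runtime.

```python
def refang(text: str) -> str:
    """Turn a defanged indicator such as example[.]com back into example.com."""
    return text.replace("[.]", ".")


# Indicators from the list above, kept defanged in the source.
C2_DOMAINS = tuple(refang(d) for d in ("models.litellm[.]cloud", "checkmarx[.]zone"))


def lines_with_c2(lines):
    """Return (line_number, line) pairs that mention either C2 domain."""
    hits = []
    for number, line in enumerate(lines, start=1):
        lowered = refang(line).lower()
        if any(domain in lowered for domain in C2_DOMAINS):
            hits.append((number, line.rstrip("\n")))
    return hits
```

Passing an open file object works directly, for example lines_with_c2(open(path, encoding="utf-8", errors="replace")), where path points at whatever log export you have.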

If you are affected:

  1. Rotate every credential on the machine: SSH keys, cloud credentials, API keys, database passwords, TLS certificates
  2. Rotate every secret accessible from any Kubernetes cluster the machine could reach
  3. Audit CI/CD pipelines for unauthorized runs or artifact publications
  4. Pin all GitHub Actions to commit SHAs, not tags
  5. Pin Python dependencies to exact versions with hash verification (pip install --require-hashes)
  6. Audit your transitive dependency tree. pip show litellm will tell you if it's installed, but pipdeptree will tell you why
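For step 6, when pipdeptree is not available, the standard library can answer a narrower version of the "why" question: which installed distributions declare the package as a requirement. This is a sketch with illustrative function names. It reports direct dependents only, and it counts requirements that apply only under optional extras.

```python
import re
from importlib import metadata


def requirement_name(requirement: str) -> str:
    """Extract the normalized project name from a requirement string."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    name = match.group(1) if match else ""
    # Normalize runs of "-", "_" and "." to a single "-" and lowercase the name.
    return re.sub(r"[-_.]+", "-", name).lower()


def direct_dependents(target: str = "litellm") -> list[str]:
    """List installed distributions that declare the target as a requirement."""
    wanted = requirement_name(target)
    dependents = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if not name:
            continue  # skip distributions with broken metadata
        for requirement in dist.requires or []:
            if requirement_name(requirement) == wanted:
                dependents.add(name)
    return sorted(dependents)
```

Running direct_dependents() for each name it returns walks one level further up the tree, which is usually enough to find the tool or plugin that pulled the package in.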

For ongoing protection:

  • Use PyPI Trusted Publishers (OIDC-based publishing) instead of stored tokens
  • Isolate publishing credentials from CI/CD runners that execute third-party code
  • Monitor for packages published outside normal CI/CD workflows

The Structural Condition Hasn't Changed

TeamPCP posted a message on Telegram: "Many of your favourite security tools and open-source projects will be targeted in the months to come."

There is no reason to doubt them. The structural conditions that made this attack possible (mutable version tags, credentials stored on filesystems, overly broad CI/CD permissions, and the absence of source-to-artifact verification) are still present across the ecosystem. The same technique that compromised Trivy, then LiteLLM, works against any project that references a GitHub Action by tag and stores publishing credentials in CI/CD secrets.

The next attack won't require any new techniques. It won't require a zero-day. It will require a valid credential and a mutable pointer, same as this one.

The question isn't whether it will happen again. It's whether the next malware author will remember to test their subprocess spawning logic.

