Veracode tested over 100 large language models on security-sensitive coding tasks and found that 45% of the code they produced introduced OWASP Top 10 vulnerabilities, a failure rate that did not improve across multiple testing cycles. That single number explains why AI in 2026 did not add one attack surface to the cloud. It stacked three, and they compound. Wiz Research reports 80% of organizations now ship AI-generated code, 68% inherit models through third-party software they never deployed, and 80% run Model Context Protocol servers, which often end up as overprivileged control planes. Each layer is governed by a different team, and each one hands the next its entry point.
The Vibe Coding Defect Layer
The first surface is the one engineers touch every day. With 80% of organizations now using AI IDE extensions, “vibe coding” is no longer an experiment; it is the default development mode. The Wiz report found that roughly one in five organizations using these platforms carries systemic security weaknesses traceable to insecure AI-generated defaults. Systemic is the operative word: these are not isolated bugs but repeated, pattern-level mistakes baked into how code gets written.
The defect profile is well-documented and unglamorous. Veracode’s testing, summarized by the Cloud Security Alliance, shows the failures cluster in injection flaws, broken authentication, and insecure dependencies. Arnica’s analysis places the flawed-code figure at 38% across sampled repositories. The most common artifacts are hardcoded credentials, overly permissive IAM bindings, open security groups, and unverified transitive dependencies that the model pulled in because they looked plausible.
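The artifact classes listed above are regular enough that even a crude pattern scan catches some of them. The following is a minimal sketch, not a replacement for a real secret scanner or SAST tool: the rule names and regular expressions are illustrative assumptions, and production rule sets are far larger and tuned for false positives.

```python
import re

# Illustrative rules only (assumed for this sketch); real scanners ship
# much larger, carefully tuned rule sets.
RULES = {
    "hardcoded-secret": re.compile(
        r"""(?i)(password|secret|api_key|token)\w*\s*[:=]\s*["'][^"']{8,}["']"""
    ),
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "open-cidr": re.compile(r"\b0\.0\.0\.0/0\b"),
    "wildcard-iam-action": re.compile(r"""["']Action["']\s*:\s*["']\*["']"""),
}


def scan_text(text):
    """Return a (line_number, rule_name) pair for every rule match in text."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in RULES.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings


if __name__ == "__main__":
    sample = 'db_password = "hunter2hunter2"\ncidr = "0.0.0.0/0"\n'
    for lineno, rule in scan_text(sample):
        print(f"line {lineno}: {rule}")
```

The point of the sketch is the shape of the defect: these mistakes are textual patterns that repeat, which is why they are cheap to detect automatically and expensive to catch by eye at volume.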
What makes this layer structurally dangerous is throughput. A developer shipping one insecure default per quarter is a code-review problem. Five thousand developers shipping AI-assisted code at 3x velocity produce the same defect class at a volume that no human review queue can absorb. In March 2026 alone, at least 35 new CVEs were disclosed as direct results of AI-generated code, according to Infosecurity Magazine. The volume is the vulnerability.
Transitive AI Inherits Model Risk
The second surface sits below the application layer, where most platform teams have no instrumentation. Wiz found that 90% of organizations run self-hosted models, but 68% of those ingest the models through third-party software rather than deploying them directly. Wiz calls this “transitive AI”: your security team inherits model weights, inference pipelines, and supply-chain dependencies that nobody in your organization explicitly chose, approved, or scanned.
The engineering consequence is a blind dependency graph. A SaaS tool your data team adopted six months ago bundles a quantized model pulled from a public registry. That model’s tokenizer has a known prompt-injection path. Your cloud-native application security tooling was never told this model exists, because it never passed through your artifact registry — it arrived embedded in a container you pulled for an unrelated reason. The attack surface is real, the telemetry is absent, and ownership is diffuse enough that no single team is responsible for closing it.
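One way to start closing that telemetry gap is to look for model files inside images you have already pulled. The sketch below assumes the container filesystem has been unpacked to a local directory by some other step, and it treats a short list of common weight-file extensions as the signal. Both are assumptions of this example; a model stored under another name or inside an archive would be missed.

```python
import hashlib
from pathlib import Path

# Common model weight formats. The list is an assumption of this sketch
# and is not exhaustive.
MODEL_SUFFIXES = {".gguf", ".safetensors", ".onnx", ".pt", ".pth", ".tflite"}


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so multi-gigabyte weights do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_model_artifacts(root):
    """List model-like files under an unpacked image directory."""
    root = Path(root)
    found = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in MODEL_SUFFIXES:
            found.append({
                "path": path.relative_to(root).as_posix(),
                "sha256": sha256_of(path),
                "bytes": path.stat().st_size,
            })
    return found
```

Each record gives the security team a path and a digest to look up, which is the minimum needed to ask whether anyone approved the model in question.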
This is distinct from ordinary open-source supply-chain risk. A vulnerable npm package is deterministic code: its behavior can be read, scanned, and matched against known advisories. A transitive model is a probabilistic system that ingests untrusted input at inference time and can be coerced into emitting tool calls, exfiltrating prompts, or returning manipulated results. Traditional SCA tools do not model this, and most cloud security posture products do not inventory model artifacts as first-class resources. We unpacked the scope of this unapproved-software inheritance problem separately; here the concern is how it chains into the other two layers.
Agents Become Lateral-Movement Bridges
The third surface is the one attackers actually want, because it connects the first two to your data. Wiz reports 57% of organizations now deploy self-hosted AI agents and 80% have adopted Model Context Protocol servers. MCP servers are designed to hand agents authenticated access to databases, file stores, APIs, and internal tools. Deployed without authentication, with plaintext credentials, or bound to default network interfaces, they become exactly what an attacker needs after the first breach: a pre-authenticated bridge to sensitive systems.
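The three misconfigurations named above (no authentication, plaintext credentials, default network bindings) can be checked mechanically. The sketch below audits a hypothetical configuration dictionary. The `host`, `auth`, and `env` keys are invented for illustration and do not correspond to any particular MCP server's configuration format, and treating a missing `host` as "all interfaces" is likewise an assumption.

```python
import re

# Environment variable names that usually hold credentials (heuristic).
SECRET_KEY_HINT = re.compile(r"(?i)(password|secret|token|api_?key)")


def audit_mcp_server(cfg):
    """Return a list of findings for one server config.

    The cfg schema (host / auth / env) is hypothetical, for illustration only.
    """
    issues = []
    # Assumption: a missing host means the server listens on every interface.
    if cfg.get("host", "0.0.0.0") in ("0.0.0.0", "::", ""):
        issues.append("binds to all interfaces")
    if not cfg.get("auth"):
        issues.append("no authentication configured")
    for key, value in cfg.get("env", {}).items():
        looks_secret = SECRET_KEY_HINT.search(key)
        # Values like "${vault:...}" are treated as references, not literals.
        is_literal = isinstance(value, str) and value and not value.startswith("${")
        if looks_secret and is_literal:
            issues.append(f"plaintext credential in env var {key}")
    return issues
```

A server that returns an empty list here is not necessarily safe, since scope of access still matters, but one that returns all three findings is the pre-authenticated bridge described above.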
The threat model is concrete. An overprivileged MCP server “can act as a bridge for lateral movement, allowing an attacker to reach sensitive enterprise data through the agent’s legitimate, authenticated connections,” explains Gopher Security’s threat-detection analysis. Aembit’s vulnerability guide is blunter: once a breached tool or agent leverages its initial access, “privilege escalation happens… moving from limited access to system-wide control. Lateral movement becomes trivial in loosely secured environments.”
Crucially, a compromised agent operates inside the bounds of its own legitimate identity. Its API calls carry valid tokens, hit expected endpoints, and fire during normal business hours. Perimeter defenses and anomaly detectors built for human-paced intrusions frequently miss them, because nothing about the traffic looks anomalous — it looks like the agent doing its job. Hardening this layer is ultimately an identity, isolation, and budget problem, not a firewall problem.
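Because the authentication looks normal, detection has to key on what the agent does rather than who it claims to be. As a rough sketch, assume each agent's tool calls are logged as destination names grouped into hourly windows (a hypothetical log shape); a baseline of known destinations and typical volume then lets you flag windows that deviate. The threshold `k` and the floor on the standard deviation are arbitrary choices for illustration.

```python
from statistics import mean, pstdev


def build_baseline(hourly_windows):
    """hourly_windows: list of lists of destination names, one list per hour."""
    counts = [len(window) for window in hourly_windows]
    return {
        "destinations": {d for window in hourly_windows for d in window},
        "mean": mean(counts),
        "stdev": pstdev(counts),
    }


def check_window(baseline, window, k=3.0):
    """Flag destinations never seen before and call volume above the norm."""
    alerts = []
    new_destinations = sorted(set(window) - baseline["destinations"])
    if new_destinations:
        alerts.append(("new-destination", new_destinations))
    # Floor the deviation at 1.0 so a very steady agent does not alert on noise.
    limit = baseline["mean"] + k * max(baseline["stdev"], 1.0)
    if len(window) > limit:
        alerts.append(("volume", len(window)))
    return alerts
```

For example, an agent that has only ever called a CRM and a wiki and suddenly touches a payroll database produces a `new-destination` alert even though every call carried a valid token.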
Why the Three Layers Compound
Each surface is manageable in isolation. The structural problem is that they connect, and the connection points are where defenders have the least visibility. A defect introduced by vibe coding (Layer 1) gives an attacker a foothold in an application that talks to a transitive model (Layer 2), whose prompt-injection vulnerability pivots into an agent (Layer 3) holding valid credentials to your production database. No single layer was catastrophic. The chain is.
| Layer | Defect class | Owner in most orgs | Adoption (Wiz 2026) |
|---|---|---|---|
| AI-generated code | OWASP Top 10 defaults, hardcoded secrets | AppSec / dev team | 80% use AI IDEs |
| Transitive AI models | Unvetted model supply chain | Unclear / procurement | 68% via third-party software |
| Agent & MCP control planes | Overprivileged, lateral movement | Platform / ML ops | 80% run MCP servers |
Notice the ownership column. The three layers are governed by different teams with different tooling stacks and different threat models. There is no shared inventory that says: this application (owned by AppSec) calls this transitive model (owned by nobody) which is wrapped by this agent (owned by ML ops). Attackers do not respect these org boundaries. Defenders, structurally, are trapped inside them.
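The missing shared inventory is, at its simplest, a directed graph with owners attached to nodes. The sketch below is one hypothetical shape for it: edges are `(source, destination)` pairs, and a depth-first walk enumerates every path from an exposed entry point to a sensitive target. All node and team names are invented for the example.

```python
def attack_paths(edges, entry_points, targets):
    """Enumerate simple paths from any entry point to any target node."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)

    paths = []

    def walk(node, path):
        if node in targets:
            paths.append(path)
            return
        for nxt in graph.get(node, []):
            if nxt not in path:  # skip cycles
                walk(nxt, path + [nxt])

    for entry in entry_points:
        walk(entry, [entry])
    return paths


def owners_on_path(path, owners):
    """Map each node on a path to its owning team, or 'unowned'."""
    return [owners.get(node, "unowned") for node in path]


if __name__ == "__main__":
    # Hypothetical inventory: app -> transitive model -> agent -> database.
    edges = [
        ("checkout-app", "embedded-llm"),
        ("embedded-llm", "support-agent"),
        ("support-agent", "prod-db"),
    ]
    owners = {"checkout-app": "AppSec", "support-agent": "ML ops", "prod-db": "Platform"}
    for path in attack_paths(edges, ["checkout-app"], {"prod-db"}):
        print(" -> ".join(path), "|", owners_on_path(path, owners))
```

Any path whose owner list contains several teams plus an `unowned` node is a chain that no single team's dashboard will show end to end.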
The Governance Gap Is the Bug
This is the part vendors underplay. Buying a model-scanning tool, an MCP gateway, or an AI-aware SAST does not close the chain — it closes one link. The compounding risk lives in the seams between tools, and seams are an organizational failure mode, not a product category. Wiz’s own conclusion is that “security is no longer about protecting isolated models; it is about governing a distributed ecosystem of autonomous agents, embedded dependencies, and AI-driven code.”
The teams that handle this well share one trait: they built a unified inventory that treats model artifacts, agent identities, MCP servers, and AI-generated code paths as related resources in the same graph, not as tickets in three different queues. Without that graph, you are hardening three doors in a building that has no walls between the rooms.
Engineering Controls That Actually Work
The controls that matter are unsexy and incremental, but they break the chain at predictable points:
- Treat model artifacts as first-class registry objects. Every self-hosted and transitive model should pass through an artifact registry with provenance, hash pinning, and a scan step. If a model did not enter through the registry, it does not run. This closes the largest transitive-AI blind spot.
- Enforce least-privilege on every MCP server. Agents should hold scoped, short-lived credentials, never long-lived service-account tokens. Bind each server to an explicit network interface rather than accepting the default binding. If you cannot answer “what could this agent reach if compromised?” in one sentence, it is overprivileged.
- Gate AI-generated code with static analysis in the pipeline. SAST must run on every AI-assisted commit, not as a quarterly sweep. The 45% defect rate means manual review is already insufficient; the pipeline is the only control fast enough to keep up.
- Build one inventory graph, not three dashboards. Map the edges: which application calls which model, which agent wraps which model, which agent holds which credentials. The attack chain follows these edges; your defenses must too.
- Add agent behavioral baselines. Log every tool call, every data store touched, every token consumed. Lateral movement through a legitimate agent looks like anomalous volume or destination, not anomalous authentication. You need the baseline to see the deviation. Most teams build observability but never ship it — only 11% get it to production, which means this control is theoretical for the majority.
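The first control in the list, "if a model did not enter through the registry, it does not run", reduces to a digest check at load time. The following sketch assumes the registry exports a mapping from artifact name to pinned SHA-256 digest; the mapping, the artifact name, and the `verify_model` helper are hypothetical names for illustration, not a real registry API.

```python
import hashlib


def verify_model(name, payload, pinned_digests):
    """Admit a model only if its bytes match the digest pinned in the registry.

    pinned_digests is a hypothetical {artifact_name: sha256_hex} export.
    Returns (admitted, reason).
    """
    expected = pinned_digests.get(name)
    if expected is None:
        return False, "not in registry"
    actual = hashlib.sha256(payload).hexdigest()
    if actual != expected:
        return False, "digest mismatch"
    return True, "ok"


if __name__ == "__main__":
    weights = b"example-weights"
    pins = {"summarizer-v2": hashlib.sha256(weights).hexdigest()}
    print(verify_model("summarizer-v2", weights, pins))
    print(verify_model("summarizer-v2", weights + b"tampered", pins))
    print(verify_model("unknown-model", weights, pins))
```

In practice the check would stream the file from disk rather than hold the payload in memory, and it belongs in the loader or admission step so that an unpinned model fails closed.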
None of this is exotic. The reason most organizations have not done it is that the work spans three teams and a budget line that does not exist yet. That organizational gap — not any single vulnerability — is the actual attack surface, and it is widening every quarter that AI adoption outruns governance.
References
- Wiz Research — State of AI in the Cloud 2026
- Cloud Security Alliance — AI-Generated Code Vulnerability Surge (Veracode data)
- Infosecurity Magazine — Vulnerabilities in AI-Generated Code (March 2026 CVEs)
- Arnica — Vibe Coding Security Risks
- Gopher Security — Securing MCP: Threat Detection and Policy
- Aembit — MCP Security Vulnerabilities: Complete Guide for 2026
- RKON — MCP Server Security: Navigating the New AI Attack Surface