Skip to main content
TopAIThreats home TOP AI THREATS
Technical Attack

Prompt Injection

An attack that inserts adversarial instructions into an AI model's input to override its intended behaviour, bypass safety constraints, or extract restricted information.

Definition

Prompt injection is an attack technique in which adversarial instructions are embedded within the input to a large language model or AI agent, causing the system to deviate from its intended behaviour. The attack exploits the fact that current LLM architectures cannot reliably distinguish between legitimate user instructions and injected adversarial content. Prompt injection can be direct, where the attacker crafts the input themselves, or indirect, where malicious instructions are embedded in external data sources that the model retrieves and processes during operation. Indirect prompt injection is particularly concerning in agentic systems that retrieve context from websites, documents, or databases, as the attacker need not interact with the model directly.

How It Relates to AI Threats

Prompt injection is a foundational vulnerability within the Security & Cyber domain, as it undermines the integrity of AI system behaviour at the input layer. Within the Agentic & Autonomous domain, indirect prompt injection poses escalating risks as AI agents gain access to external tools, APIs, and data sources. An injected instruction can cause an agent to exfiltrate data, execute unauthorised actions, or propagate compromised outputs to downstream systems. The attack is considered one of the most significant unresolved security challenges in LLM deployment.

Why It Occurs

  • LLM architectures process all input tokens in a shared context window without a reliable boundary between instructions and data
  • No current technique fully separates system-level directives from user-provided or retrieved content
  • Agentic systems that retrieve and process external content expand the attack surface to any data source the agent accesses
  • Safety alignment through fine-tuning and reinforcement learning provides probabilistic rather than deterministic protection against adversarial inputs
  • The rapid deployment of LLM-based applications has outpaced the development of robust input sanitisation frameworks

Real-World Context

Prompt injection has been demonstrated across all major commercial LLM deployments. Security researchers have shown that indirect prompt injection can compromise AI agents by embedding instructions in web pages, emails, or documents that agents process during retrieval-augmented generation. The OWASP Top 10 for LLM Applications lists prompt injection as the highest-priority vulnerability. The AI-orchestrated cyber espionage campaign documented in INC-25-0001 leveraged prompt-like manipulation techniques as part of its multi-stage autonomous attack chain. Organisations deploying LLM-based systems increasingly treat prompt injection as a core security concern requiring defence-in-depth strategies.

Related Incidents

INC-25-0001 critical 2025-09

AI-Orchestrated Cyber Espionage Campaign Against Critical Infrastructure

INC-26-0087 medium 2026-03

Context Hub Documentation Poisoning — AI Coding Assistants Write Malicious Code 100% of Time from Poisoned Docs

INC-26-0089 high 2026-03

Claude Code 'Claudy Day' Vulnerability Chain — Silent Data Exfiltration via Prompt Injection

INC-26-0006 high 2026-02

AI Recommendation Poisoning via 'Summarize with AI' Buttons (31 Companies)

INC-26-0007 medium 2026-02

Unit 42 Demonstrates Persistent Memory Injection in Amazon Bedrock Agents

INC-26-0016 critical 2026-02

Clinejection: Prompt Injection in Cline AI Bot Enables npm Supply Chain Attack

INC-26-0021 high 2026-02

ModelScope MS-Agent Shell Tool Command Injection Vulnerability

INC-26-0017 high 2026-01

Claude Code Remote Code Execution and API Key Exfiltration Vulnerabilities

INC-26-0022 high 2026-01

Cursor AI Code Editor Shell Built-In Allowlist Bypass Enables Zero-Click RCE

INC-26-0055 high 2026-01

Perplexity Comet AI Browser Enables Zero-Click Credential Theft via Prompt Injection

INC-25-0033 critical 2025-12

Jailbroken Claude AI Used to Breach Mexican Government Agencies

INC-25-0010 medium 2025-11

Unit 42 Demonstrates Agent Session Smuggling in A2A Multi-Agent Systems

INC-25-0007 critical 2025-08

GitHub Copilot Remote Code Execution via Prompt Injection (CVE-2025-53773)

INC-25-0008 high 2025-08

Cursor IDE MCP Vulnerabilities Enable Remote Code Execution (CurXecute & MCPoison)

INC-25-0005 medium 2025-07

ChatGPT Jailbreak Reveals Windows Product Keys via Game Prompt

INC-25-0004 critical 2025-06

EchoLeak: Zero-Click Prompt Injection in Microsoft 365 Copilot (CVE-2025-32711)

INC-25-0035 high 2025-06

Three Chained Prompt Injection Vulnerabilities in Anthropic MCP Git Server

INC-25-0031 medium 2025-03

MINJA: Memory Injection Attack Against RAG-Augmented LLM Agents

INC-25-0028 high 2025-02

Google Gemini Long-Term Memory Corruption via Prompt Injection

INC-24-0020 high 2024-08

Slack AI Indirect Prompt Injection Data Exfiltration Vulnerability

INC-24-0012 high 2024-03

Morris II — First Self-Replicating AI Worm Demonstrated

INC-24-0007 high 2024-01

Indirect Prompt Injection: How Attackers Hijack LLM Apps

INC-23-0016 high 2023-02

Bing Chat (Sydney) System Prompt Exposure via Prompt Injection

Last updated: 2026-02-14