
2025 Annual AI Threat Report

In 2025, TopAIThreats documented 46 AI-enabled threat incidents spanning 7 of the 8 threat domains in our taxonomy. Security & Cyber was the most active domain, accounting for 37% of documented incidents. 76% of incidents were rated critical or high severity. 43% have reached resolution.

This report provides a quantitative overview and interpretive analysis of the year's documented AI threats, grounded entirely in the incident database and classified using the 8-domain taxonomy.

All figures computed at build time (2026-04-17). Incidents may appear in multiple domains via secondary patterns.

46 incidents · 7 domains · 43% resolved · 14 critical

Domain Analysis

Activity was distributed across 7 domains, led by Security & Cyber (17 incidents, 37%) and Human-AI Control (8 incidents). This spread suggests AI threats are materializing across multiple fronts rather than concentrating in a single area.

Severity & Failure Stages

A majority (76%) of 2025 incidents were rated critical or high severity, indicating that the incidents reaching public documentation tend to involve substantial harm rather than minor disruptions. 65% of incidents reached the "harm" failure stage — meaning measurable damage was documented, not just capability demonstrations or near-misses.

Severity Breakdown

critical   14   (30%)
high       21   (46%)
medium     11   (24%)
low         0    (0%)
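The percentages above are simple shares of the 46-incident total. A minimal sketch of the computation, using the counts from the table:

```python
# Severity counts from the 2025 incident database (46 incidents total).
severity_counts = {"critical": 14, "high": 21, "medium": 11, "low": 0}

total = sum(severity_counts.values())  # 46

# Share of each severity level, rounded to the nearest whole percent.
shares = {sev: round(100 * n / total) for sev, n in severity_counts.items()}

print(shares)  # {'critical': 30, 'high': 46, 'medium': 24, 'low': 0}
```

Rounding each share independently is why the printed percentages may not sum to exactly 100.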

Failure Stage Distribution

Signal          10
Near Miss        5
Harm            30
Systemic Risk    1

Failure stages represent an escalation ladder: signal (capability demonstrated) → near miss (harm avoided) → harm (measurable damage) → systemic risk (structural threat pattern).
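Because the stages form an ordered ladder, they can be modeled as an ordered enumeration. A sketch under that assumption (the `FailureStage` class is hypothetical; the stage names and 2025 counts come from the distribution above):

```python
from enum import IntEnum

class FailureStage(IntEnum):
    """Escalation ladder: a higher value means further up the ladder."""
    SIGNAL = 1         # capability demonstrated
    NEAR_MISS = 2      # harm avoided
    HARM = 3           # measurable damage
    SYSTEMIC_RISK = 4  # structural threat pattern

# 2025 distribution from the section above.
counts = {
    FailureStage.SIGNAL: 10,
    FailureStage.NEAR_MISS: 5,
    FailureStage.HARM: 30,
    FailureStage.SYSTEMIC_RISK: 1,
}

total = sum(counts.values())  # 46

# Share of incidents at the "harm" stage, rounded to a whole percent.
harm_share = round(100 * counts[FailureStage.HARM] / total)
print(harm_share)  # 65
```

Using `IntEnum` makes the escalation ordering explicit, so stages can be compared directly (e.g. `FailureStage.SYSTEMIC_RISK > FailureStage.HARM` is `True`).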

Top Threat Patterns

Adversarial Evasion was the most frequently referenced threat pattern in 2025 (7 incidents), followed by Automation Bias (6) and Tool Misuse & Privilege Escalation (6). The concentration at the top of this ranking highlights where AI-enabled threats are most actively manifesting in documented incidents.

Sectors Affected

AI-enabled threats affected at least 10 distinct sectors in 2025. Technology was the most impacted sector (25 incidents), followed by Corporate (11) and Government (8).

Resolution Status

20 of the 46 incidents from 2025 (43%) are resolved, while 26 remain open. The significant proportion of unresolved incidents reflects the ongoing nature of many AI-related threats, where structural causes persist beyond individual incident remediation.

Resolved   20
Open       26

All 2025 Incidents

46 incidents that occurred in 2025, sorted by date (most recent first).

INC-25-0048 medium

Australia Scraps AI Advisory Body After 15 Months and $188K, Drops Mandatory AI Guardrails

The Australian government scrapped its planned AI Advisory Body in late 2025 after a 15-month, $188,000 AUD recruitment process that identified 270 experts and shortlisted 12 nominees, none of whom were appointed. The December 2025 National AI Plan also dropped 10 mandatory guardrails for high-risk AI proposed in September 2024, relying instead on existing laws and a new advisory-only AI Safety Institute ($29.9 million AUD). The rollback removes governance mechanisms that would have applied to algorithmic decision-making in welfare, policing, credit, and other high-risk domains. Though coded as a 2025 incident, the full scope of the decision, including the $188,000 cost, was first reported publicly in February 2026.

INC-25-0016 medium

Heber City AI Police Report Generates Fictional Content from Background Audio

During a pilot of AI-assisted police report writing tools in Heber City, Utah, an AI system generated a report stating that an officer had 'turned into a frog.' The system had picked up background audio from the Disney film 'The Princess and the Frog' playing nearby and incorporated fictional dialogue into the official report. The incident was caught during review and the report was corrected.

Developer: Unknown vendor
INC-25-0020 medium

Instacart AI-Driven Algorithmic Price Discrimination

A joint investigation by Consumer Reports, Groundwork Collaborative, and More Perfect Union revealed that Instacart's AI-powered Eversight pricing platform displayed different prices for identical grocery items to different customers, with variations reaching up to 23% per item and approximately 7% per basket. The investigation, based on 437 volunteer shoppers across four cities, estimated an annual cost impact of approximately $1,200 per affected household. Instacart halted all item price tests in December 2025 following public backlash, an FTC probe, and scrutiny from the New York Attorney General.

Developer: Instacart
INC-25-0026 medium

CrimeRadar AI App Sends False Crime Alerts Across U.S. Communities

In December 2025, the CrimeRadar app — an AI-powered tool developed by Scoopz Inc. that monitors U.S. police radio and pushes local crime alerts to over 2 million users — sent waves of false notifications about shootings and violent crimes across multiple cities. The AI misinterpreted routine police radio chatter: a fire alarm pull at an Ohio elementary school became 'firearms discharged,' and a 'Shop With the Cop' charity event in Oregon became a report of an officer being shot. A BBC Verify investigation documented the pattern. CrimeRadar apologized and promised model improvements.

Developer: Scoopz Inc.
INC-25-0033 critical

Jailbroken Claude AI Used to Breach Mexican Government Agencies

A hacker jailbroke Anthropic's Claude AI through a month-long campaign using Spanish-language prompts and role-playing scenarios, then used the compromised model to generate vulnerability scanning scripts, SQL injection exploits, and credential-stuffing tools. The resulting attacks compromised 10 Mexican government agencies and one financial institution, exfiltrating approximately 150 GB of data including 195 million taxpayer records.

Developer: Anthropic
INC-25-0036 high

State-Backed Hackers from Four Nations Weaponize Google Gemini for Cyberattack Operations

Google's Threat Intelligence Group (GTIG) reported that state-backed hacking groups from North Korea (UNC2970), Iran (APT42), China, and Russia used Google Gemini for reconnaissance, target profiling, phishing message generation, malware coding, and vulnerability research, with one group developing HONESTCUE malware that outsources code generation to Gemini's API.

Developer: Google
INC-25-0038 critical

Grok AI Generates 3 Million Sexualized Images Including Approximately 23,000 Depicting Children

xAI's Grok image generation system produced approximately 3 million sexualized images in 11 days, with roughly 23,000 depicting children. Tennessee teenagers filed a class-action lawsuit, Baltimore became the first city to sue, a Dutch court imposed a ban with EUR 100,000/day penalties, 35 state attorneys general sent a demand letter, and investigations were opened in the UK, Ireland, and Canada.

Developer: xAI
INC-25-0010 medium

Unit 42 Demonstrates Agent Session Smuggling in A2A Multi-Agent Systems

Palo Alto Networks Unit 42 researchers demonstrated 'agent session smuggling,' a technique in which a malicious AI agent exploits stateful sessions in the Agent2Agent (A2A) protocol to inject covert instructions into a victim agent. Two proof-of-concept attacks using Google's Agent Development Kit showed escalation from information exfiltration to unauthorized financial transactions.

Developer: Google