AI Threat Glossary | TopAIThreats

A

Accountability — The principle that identifiable individuals or organisations must be answerable for AI system outcomes, including harms caused by automated decisions.
Adversarial Attack — A deliberate manipulation of inputs to a machine learning model designed to cause incorrect outputs, misclassifications, or security bypasses. Adversarial attacks exploit mathematical vulnerabilities in how models process data rather than flaws in traditional software logic.
Adversarial Perturbation — A carefully calculated modification to an input — often imperceptible to humans — that causes a machine learning model to produce an incorrect or attacker-chosen output. Adversarial perturbations exploit the mathematical properties of neural network decision boundaries rather than flaws in traditional software logic.
Adversarial Training — A machine learning defense technique in which a model is trained on adversarial examples — inputs specifically crafted to cause misclassification or incorrect outputs — alongside normal training data, with the goal of improving the model's robustness against adversarial attacks at inference time.
Agent Framework — A software library or platform that provides the infrastructure for building AI agents — autonomous systems that use large language models to reason, plan, and execute multi-step tasks by invoking tools, managing memory, and coordinating with other agents. Common examples include LangChain, AutoGen, CrewAI, and the OpenAI Agents SDK.
Agent Propagation — The spread of errors, hallucinations, or adversarial inputs from one AI agent to others in connected multi-agent systems, potentially causing cascading failures.
Agent Safety — The field of ensuring AI agents operate within intended boundaries and do not cause unintended harm through autonomous actions, tool use, or goal pursuit.
Agentic AI — AI systems that autonomously plan and execute multi-step actions with minimal human oversight.
AI Risk Management Framework — A structured methodology published by the US National Institute of Standards and Technology (NIST) that provides organisations with a systematic approach to identifying, assessing, and mitigating risks associated with AI systems throughout their lifecycle. The NIST AI RMF (AI 100-1) is a voluntary, non-sector-specific framework applicable to all AI technologies.
AI Safety — The field of research and practice dedicated to ensuring that artificial intelligence systems operate reliably within intended boundaries and do not cause unintended harm to humans, society, or the environment.
AI-Generated Code — Code produced by AI systems, which can be used for both legitimate software development and malicious purposes including malware creation and vulnerability exploitation.
Alert Fatigue — Desensitisation of human operators to system warnings due to excessive or poorly calibrated alerts, reducing the effectiveness of human oversight over AI systems.
Algorithmic Amplification — The process by which recommendation algorithms and content curation systems disproportionately promote certain content, amplifying its reach and societal impact beyond organic levels.
Algorithmic Bias — Systematic errors in AI systems that produce unfair outcomes, often favouring one group over another.
Algorithmic Trading — The use of AI algorithms to execute financial trades at speeds and volumes exceeding human capability, introducing systemic risks including flash crashes and market manipulation.
Alignment — The property of an AI system whose objectives, decision-making processes, and behaviours remain consistent with human values, intentions, and safety requirements. Alignment is a foundational challenge in AI safety research.
Allocational Harm — Unfair distribution of resources, opportunities, or services when AI systems systematically disadvantage certain groups in consequential decisions such as hiring, lending, or housing.
Anonymization — The process of removing or obscuring personally identifiable information from datasets to protect individual privacy, which AI techniques can increasingly defeat through re-identification attacks.
Artificial General Intelligence (AGI) — A hypothetical AI system capable of performing any intellectual task that a human can, with the ability to transfer learning across domains without task-specific programming.
Attack Surface — The totality of entry points, interfaces, and pathways through which an adversary can attempt to interact with, extract data from, or inject inputs into an AI system. In machine learning contexts, the attack surface extends beyond traditional software boundaries to include training pipelines, model APIs, prompt interfaces, tool integrations, and data ingestion channels.
Attribute Inference — Using AI to deduce sensitive personal characteristics such as health status, political affiliation, or sexual orientation from seemingly innocuous data patterns.
Authority Transfer — The gradual, often unrecognised shift of decision-making power from humans to AI systems, eroding meaningful human control over consequential outcomes.
Automated Decision-Making — Using algorithms or AI to make decisions affecting individuals with limited human review.
Automated Exploit — AI-driven tools that automatically discover and exploit software vulnerabilities without human intervention, accelerating the pace and scale of cyber attacks.
Automated Vulnerability Discovery — Using AI to autonomously identify security weaknesses in software, networks, or systems.
Automation — The use of AI to perform tasks previously requiring human labour, spanning physical, cognitive, and creative work, with implications for employment and economic structures.
Automation Bias — The tendency to favour automated system outputs over independent human judgement, even when incorrect.
Autonomous Vehicle — A vehicle using AI to navigate and operate without direct human control.
Autonomous Weapons — Weapon systems that use artificial intelligence to select and engage targets without meaningful human control over the critical functions of target identification, tracking, and engagement.
Autonomy — The capacity of individuals to make self-directed decisions free from undue external influence or automated override, which AI systems can undermine through manipulation or substitution.

B

C

D

E

F

G

H

I

J

L

M

N

Non-Consensual Intimate Imagery — Sexually explicit images or videos created or distributed without the depicted person's consent, increasingly generated using AI deepfake tools.

O

P

R

S

T

V

Z

Zero-Day — A software vulnerability unknown to the vendor and without an available patch, whose discovery by AI dramatically accelerates exploitation timelines and increases the threat to unprotected systems.