AI Defensive Methods
Reference pages explaining how detection, prevention, and enterprise monitoring methods work, with documented limitations and evidence. For step-by-step practitioner workflows, see the Guides.
17 methods across 3 categories
Detection Methods
Adversarial Input Detection
Techniques for identifying inputs crafted to cause AI model misclassification or misbehavior, including perturbation analysis, input validation, certified defenses, and adversarial example detection.
2 threat patterns
AI Bias & Fairness Auditing
Frameworks and tools for evaluating AI systems for discriminatory outcomes, including statistical parity testing, disparate impact analysis, intersectional auditing, and algorithmic accountability methodologies.
5 threat patterns
AI Phishing Detection Methods
Technical approaches for detecting AI-generated phishing campaigns, including LLM-output classifiers, behavioral email analysis, AI-enhanced threat intelligence, and organizational controls.
2 threat patterns
AI-Generated Text Detection Methods
Technical approaches for identifying text produced by large language models, including statistical classifiers, watermark detection, and stylometric analysis, along with their documented limitations.
2 threat patterns
Data Poisoning Detection Methods
Technical approaches for identifying malicious modifications to AI training data, including statistical outlier detection, provenance tracking, dataset integrity verification, and model behavior analysis.
2 threat patterns
Deepfake Detection: How It Works, Tools & Limitations
How deepfake detection works, covering forensic analysis, AI classifiers, and C2PA provenance. Includes real-world incident outcomes and tool comparisons.
3 threat patterns
Voice Cloning Detection: Technical Methods & Tools
Real-time and forensic methods for identifying AI-generated or cloned speech audio, including spectral analysis, liveness detection, enterprise deployment, and procedural verification against voice cloning threats.
2 threat patterns
Prevention Methods
AI Supply Chain Security
Practices for securing the AI model supply chain, including model provenance verification, dependency scanning, trusted model registries, and third-party component validation.
2 threat patterns
Content Provenance & Watermarking
Standards and techniques for establishing content authenticity and origin, including C2PA cryptographic provenance, invisible watermarking, and content authentication infrastructure.
3 threat patterns
Deepfake Social Engineering Prevention
Organizational and technical controls for preventing deepfake-enabled social engineering attacks, including verification protocols, multi-channel authorization, employee training, and incident response procedures.
2 threat patterns
Privacy-Preserving Machine Learning
Techniques for training and deploying AI models while protecting individual privacy, including differential privacy, federated learning, secure computation, and data minimization strategies.
5 threat patterns
Prompt Injection Defense Methods
Why prompt injection cannot be fully solved, what structural constraints limit every defense, and how to select defenses by deployment scenario. Companion reference to the implementation guide.
4 threat patterns
Enterprise Methods
AI Audit & Logging Systems
Infrastructure for recording AI system decisions, inputs, outputs, and actions to support accountability, compliance, forensic analysis, and continuous improvement.
4 threat patterns
AI Risk Monitoring Systems
Enterprise platforms and methodologies for continuous monitoring of AI system behavior, including drift detection, performance degradation alerts, fairness monitoring, and risk dashboards.
4 threat patterns
Human Oversight Design for AI Systems
Enterprise method for maintaining meaningful human control over AI systems. Covers human-in-the-loop design, automation bias countermeasures, escalation mechanisms, and automation level frameworks for high-stakes AI deployments.
5 threat patterns
Model Governance Controls
Organizational frameworks for managing AI model lifecycles, including model registries, approval workflows, version control, access management, and decommissioning procedures.
5 threat patterns
Red Teaming AI Systems
Structured adversarial testing methodologies for evaluating AI system safety and security, including prompt injection testing, bias probing, capability elicitation, and organizational red team operations.
4 threat patterns