How-To Guide

How to Detect AI-Generated Text: Practitioner Checklist (2026)

6-step workflow to detect AI-generated text. Includes manual indicators, Python code for stylometric analysis, detection tool comparison, and decision framework.

Last updated: 2026-04-13

To detect AI-generated text, apply a six-step multi-signal evaluation: (1) establish context and consequences, (2) inspect for stylistic uniformity, hedging language, and factual fabrication, (3) run automated detection tools as supplementary signals, (4) compare stylometric fingerprints against author baselines, (5) verify factual claims against primary sources, and (6) assess the totality before making any consequential decision. No single signal is reliable — convergence across multiple indicators is the standard.

Who this is for: Educators evaluating student submissions, editors reviewing contributed content, compliance teams assessing documentation, and anyone who needs to determine whether specific text was AI-generated.

Critical caveat: No AI text detection method is reliable enough for high-stakes decisions in isolation. False positives disproportionately affect non-native English speakers and formal writers. This guide provides a multi-signal evaluation framework — never base consequential decisions on a single indicator or tool score.

What AI-Generated Text Is and Why Detection Matters

AI-generated text is content produced by large language models (LLMs) — systems like GPT-4, Claude, Gemini, and their derivatives. The detection challenge arises because LLM output is grammatically correct, topically relevant, and stylistically variable — it does not contain the “tells” of earlier machine-generated text.

Detection matters in specific contexts:

  • Academic integrity — evaluating whether student work is original
  • Content authenticity — verifying that published content was written by the attributed author
  • Scientific publishing — identifying AI-generated manuscripts that bypass peer review. The ‘vegetative electron microscopy’ incident demonstrated how AI-generated content contaminated at least 22 scientific papers
  • Disinformation — detecting AI-generated content in coordinated manipulation campaigns

For the underlying science — how detection methods work and where they fail — see the AI-Generated Text Detection Methods reference page.

Threat patterns this guide addresses

Step 1: Establish the Context

Before analyzing the text, understand what question you are actually asking:

Step 2: Manual Inspection Checklist

Examine the text for indicators of AI generation. Each is suggestive, not conclusive.

Stylistic indicators

  • Uniform sentence length and rhythm across paragraphs (low variance)
  • Formulaic hedging and transitions ("it is important to note", "furthermore", "in conclusion")
  • Low vocabulary diversity relative to the text's length
  • Complete absence of contractions in otherwise informal prose

Content indicators

Contextual mismatch indicators

Automate the manual checklist with Python

The stylistic and content indicators above can be partially automated. The following script computes measurable proxies for several manual inspection signals — sentence length uniformity, vocabulary diversity, hedging frequency, and contraction absence:

import re

HEDGING_PHRASES = [
    "it is important to note", "it is worth noting",
    "furthermore", "moreover", "in conclusion",
    "while there are various", "this is a complex issue",
    "it should be noted", "generally speaking",
    "there are several", "it is essential to",
]

def analyze_text_signals(text: str) -> dict:
    """Compute statistical signals that may indicate AI-generated text.
    These are suggestive indicators, not conclusive evidence."""
    sentences = [s.strip() for s in re.split(r'[.!?]+', text) if s.strip()]
    words = text.lower().split()
    unique_words = set(words)

    # Sentence length variance — low variance suggests uniform AI structure
    lengths = [len(s.split()) for s in sentences]
    mean_len = sum(lengths) / len(lengths) if lengths else 0
    variance = (
        sum((l - mean_len) ** 2 for l in lengths) / len(lengths)
        if lengths else 0
    )

    hedging_count = sum(1 for p in HEDGING_PHRASES if p in text.lower())

    return {
        "sentence_count": len(sentences),
        "avg_sentence_length": round(mean_len, 1),
        "sentence_length_variance": round(variance, 1),
        "type_token_ratio": round(len(unique_words) / len(words), 3) if words else 0,
        "hedging_phrase_count": hedging_count,
        "contraction_count": len(re.findall(r"\b\w+'\w+\b", text)),
        "word_count": len(words),
    }

# Usage
with open("suspect_document.txt") as f:
    signals = analyze_text_signals(f.read())
print(signals)
# Low variance + low TTR + high hedging + zero contractions → investigate further

Interpreting the output: No single metric is diagnostic. Low sentence length variance (below ~15) combined with zero contractions and 3+ hedging phrases warrants further investigation, but each can also appear in legitimate formal writing.
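The thresholds in that note can be wrapped into a small triage helper. This is a sketch: the cutoffs (variance below 15, type-token ratio below 0.4, three or more hedging phrases) are illustrative assumptions, not validated thresholds, and should be calibrated against known-human writing from your own population before use:

```python
def triage(signals: dict) -> str:
    """Combine analyze_text_signals() output into a rough triage label.
    Cutoffs are illustrative assumptions; look for convergence of several
    weak indicators, never one alone."""
    flags = 0
    if signals["sentence_length_variance"] < 15:   # unusually uniform rhythm
        flags += 1
    if signals["type_token_ratio"] < 0.4:          # repetitive vocabulary
        flags += 1
    if signals["hedging_phrase_count"] >= 3:       # boilerplate hedging
        flags += 1
    if signals["contraction_count"] == 0 and signals["word_count"] > 300:
        flags += 1                                 # no contractions in long text
    if flags >= 3:
        return "investigate further"
    if flags == 2:
        return "weak signal - gather more evidence"
    return "no statistical signal"
```

Even an "investigate further" result only justifies moving to Steps 3–5, never a conclusion on its own.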

Step 3: Run Automated Detection Tools

Use one or more AI text detection tools as a supplementary signal. Never treat a tool score as a verdict.

Tool | Approach | Best for
GPTZero | Multi-feature (perplexity, burstiness) | Academic integrity
Originality.ai | Neural classifier + plagiarism | Content publishing
Turnitin AI Detection | Integrated with plagiarism infrastructure | Academic institutions
Copyleaks | Multi-lingual detection | Enterprise compliance
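When running more than one tool, combine the scores conservatively. The sketch below assumes a normalized 0.0–1.0 score per tool (real tools report on different scales, so normalization is your responsibility); disagreement between tools is treated as inconclusive rather than averaged away:

```python
def aggregate_tool_scores(scores: dict[str, float]) -> str:
    """Treat detector scores (normalized 0.0-1.0, illustrative scale) as one
    supplementary signal. Disagreement between tools is itself informative:
    it means the statistical evidence is weak."""
    if not scores:
        return "no tool evidence"
    high = [t for t, s in scores.items() if s >= 0.8]  # strong AI signal
    low = [t for t, s in scores.items() if s <= 0.2]   # strong human signal
    if len(high) == len(scores):
        return "all tools flag the text - corroborate with Steps 4-5"
    if len(low) == len(scores):
        return "no tool flags the text - still not proof of human authorship"
    return "tools disagree - treat as inconclusive"

print(aggregate_tool_scores({"gptzero": 0.91, "originality": 0.88}))
```

Note that even unanimous agreement only upgrades the tool signal to "worth corroborating" — it never substitutes for the stylometric and factual checks in Steps 4 and 5.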

Step 4: Stylometric Comparison (When Baseline Exists)

If you have authenticated writing samples from the purported author, compare:

Stylometric comparison is the most reliable detection method when a baseline exists. It is the least reliable when no baseline exists or when the author has limited prior writing.

Automate stylometric comparison with Python

When you have authenticated writing samples from the same author, use the following script to build a stylometric fingerprint and flag deviations:

import re
import statistics

def stylometric_profile(text: str) -> dict:
    """Build a stylometric fingerprint from a text sample."""
    sentences = [s.strip() for s in re.split(r'[.!?]+', text) if s.strip()]
    words = text.lower().split()
    lengths = [len(s.split()) for s in sentences]

    return {
        "mean_sentence_length": round(statistics.mean(lengths), 2) if lengths else 0,
        "stdev_sentence_length": round(
            statistics.stdev(lengths) if len(lengths) > 1 else 0, 2
        ),
        "type_token_ratio": round(
            len(set(words)) / len(words) if words else 0, 3
        ),
        "contraction_rate": round(
            len(re.findall(r"\b\w+'\w+\b", text)) / len(words) if words else 0, 4
        ),
        "semicolon_rate": round(
            text.count(";") / len(sentences) if sentences else 0, 4
        ),
        "question_rate": round(
            text.count("?") / len(sentences) if sentences else 0, 4
        ),
    }

def compare_profiles(baseline: dict, suspect: dict) -> list[str]:
    """Flag metrics where suspect deviates from baseline by more than 50%."""
    flags = []
    for key in baseline:
        if baseline[key] == 0:
            continue
        relative_diff = abs(baseline[key] - suspect[key]) / baseline[key]
        if relative_diff > 0.5:
            flags.append(
                f"{key}: baseline={baseline[key]}, suspect={suspect[key]} "
                f"(diff={relative_diff:.0%})"
            )
    return flags

# Compare known author writing against suspect document
with open("known_author_samples.txt") as f:
    baseline = stylometric_profile(f.read())
with open("suspect_document.txt") as f:
    suspect = stylometric_profile(f.read())

flags = compare_profiles(baseline, suspect)
if flags:
    print("Stylometric deviations detected:")
    for f in flags:
        print(f"  - {f}")
else:
    print("No significant deviations from author baseline.")

Requirements: Python 3.10+ with only standard library modules. For more robust analysis, consider NLTK for tokenization or spaCy for part-of-speech distributions, which provide stronger stylometric features than word-level statistics.

Step 5: Verify Factual Claims

AI-generated text frequently contains fabricated facts that sound plausible:

Step 6: Make a Responsible Decision

After gathering evidence from Steps 2–5, assess the totality:
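One way to keep the totality assessment honest is to record each indicator with its provenance and require convergence across independent sources before acting. The structure and thresholds below are an illustrative policy sketch, not a validated decision rule; adapt the required number of sources to your own stakes and appeals process:

```python
from dataclasses import dataclass

@dataclass
class Evidence:
    """One indicator gathered in Steps 2-5, with a short provenance note."""
    source: str       # e.g. "manual checklist", "stylometry", "fact check"
    supports_ai: bool
    note: str

def assess(evidence: list[Evidence], high_stakes: bool) -> str:
    """Totality assessment: require convergence across independent sources.
    The three-source threshold is an illustrative policy choice."""
    sources_for = {e.source for e in evidence if e.supports_ai}
    if high_stakes and len(sources_for) < 3:
        return "insufficient evidence for a consequential decision"
    if len(sources_for) >= 3:
        return "multiple independent signals converge - proceed per policy"
    if sources_for:
        return "isolated signal - do not act; gather more evidence"
    return "no signal of AI generation"
```

Counting distinct sources rather than raw indicators matters: five stylometric flags are still one line of evidence, not five.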

What This Guide Does Not Cover