Most takes on "how to rank in ChatGPT" are vibes. There's now a decent body of public research that's a lot more concrete. I've spent the last few weeks pulling together what the academic and industry studies actually say, and the picture is clearer than the hot-take economy would suggest.

Here's what the data shows about how AI engines decide which brands to name in their answers.

Key takeaways

AI engines pick favorites. In OpenAI's models, the top 20 news sources account for 67.3% of all news citations. If a brand isn't already on the list, getting onto it is the job.
Some on-page changes measurably lift visibility: adding statistics, direct quotations, and external citations (the last by up to 115% for lower-ranked content). Keyword stuffing produces little to no improvement.
Entity clarity unlocks citations. Brands with a consistent identity across LinkedIn, Crunchbase, Wikipedia, and their own site outperform those that scatter it.
Engines differ: ChatGPT leans on training-data and entity signals, Gemini on live SEO carryover, Perplexity on a strict quality gate, Claude on enterprise-grade sources.

The starting fact: AI engines are not neutral

The Lantern AI Citation Visibility Report (February 2026) analyzed over 200 million citations pulled directly from the interfaces of ChatGPT, Perplexity, Gemini, and Claude. Their headline finding: a small number of domains capture a disproportionate share of mentions, and the mix differs sharply across engines.

A separate arXiv study by Kai-Cheng Yang ("News Source Citing Patterns in AI Search Systems", July 2025) looked at 65,000 responses across OpenAI, Perplexity, and Google. In OpenAI's models, the top 20 news sources accounted for 67.3% of all news citations. Google and Perplexity were less concentrated, but still skewed heavily toward the same few authoritative outlets.
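To make "the top 20 sources account for 67.3%" concrete, here is a minimal sketch of how a concentration figure like that can be computed from a citation log. The function name, the input shape (one domain string per citation), and the toy data are my own assumptions for illustration, not the study's code or data.

```python
from collections import Counter

def top_k_share(cited_domains, k=20):
    """Fraction of all citations that go to the k most-cited domains."""
    counts = Counter(cited_domains)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(n for _, n in counts.most_common(k))
    return top / total

# Hypothetical toy log: one entry per citation.
citations = ["a.com"] * 6 + ["b.com"] * 3 + ["c.com"] * 1
share = top_k_share(citations, k=2)
print(f"top-2 share: {share:.1%}")  # prints "top-2 share: 90.0%"
```

Run the same calculation on a brand's own category prompts and it becomes clear how many domains are really competing for the citations.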

The takeaway is uncomfortable for most brands: AI engines are picking favorites. If a brand isn't already on the list, getting onto it is the entire job.

What actually drives mentions

1. Citation-worthy structural signals

The most-cited piece of GEO research is the Princeton and IIT Delhi paper "GEO: Generative Engine Optimization" (Aggarwal et al., KDD 2024). It tested specific on-page changes against a 10,000-query benchmark and measured how they affected visibility in generative responses.

The optimizations that actually moved the needle:

Adding statistics to existing content: roughly 30 to 40% lift in position-adjusted visibility
Adding direct quotations from authoritative sources: a similar lift, roughly 30 to 40%
Citing external sources within the content: up to 115% improvement for content that started in lower-ranked positions
Keyword stuffing: little to no improvement over the unoptimized baseline

That last one matters. The old SEO instinct to pack in keywords does little to nothing for AI search. The engines reward content that reads like a credible explanation, not content that reads like it was written for a crawler.
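"Position-adjusted visibility" is easier to reason about with a small example. The sketch below is a simplified reading of the idea: a source earns credit for the words in sentences that cite it, and sentences later in the answer are discounted. The exponential decay, the input format, and every name here are assumptions of mine; the paper's exact formula may differ.

```python
import math

def position_adjusted_visibility(sentences):
    """sentences: list of (word_count, cited_source_or_None), in answer order.

    Returns {source: score}, where each cited sentence contributes its
    word count, discounted by how late it appears, over total words.
    """
    n = len(sentences)
    total_words = sum(wc for wc, _ in sentences)
    scores = {}
    if n == 0 or total_words == 0:
        return scores
    for pos, (wc, src) in enumerate(sentences):
        if src is None:
            continue
        weight = math.exp(-pos / n)  # earlier sentences weigh more
        scores[src] = scores.get(src, 0.0) + wc * weight / total_words
    return scores

def relative_lift(before, after):
    """Relative improvement, e.g. 0.10 -> 0.215 is a 115% lift."""
    return (after - before) / before

# Hypothetical three-sentence answer.
answer = [(20, "brand.com"), (10, None), (10, "rival.com")]
scores = position_adjusted_visibility(answer)
print(scores["brand.com"])  # 0.5: 20 of 40 words, cited first, no discount
print(f"{relative_lift(0.10, 0.215):.0%}")
```

The point of the discount is that being cited in the opening sentence is worth more than the same citation buried at the end of the answer.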

2. Entity clarity

AI engines model the world as entities (organizations, products, people) with defined attributes. When the entity definition is ambiguous, the brand gets suppressed even when it should be a strong match for a query.

The Lantern report found that YouTube is the single most-cited domain across all four major engines, with more than twice the citation share of the second-place domain. Why YouTube? It pairs structured metadata (titles, descriptions, transcripts) with a clear entity model that AI systems can resolve confidently. The lesson isn't "post on YouTube". It's that entity clarity is what unlocks citations, and YouTube happens to enforce it by design.

Brands that scatter their identity across inconsistent profiles (different names on LinkedIn, Crunchbase, Wikipedia, their own site) measurably underperform brands that maintain a clean, consistent entity definition across the same surfaces.
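A consistency audit like this can be roughed out in a few lines. The sketch below normalizes the brand name found on each surface and flags the ones that disagree with the majority. The suffix list, the normalization rules, and the "Acme" profiles are all hypothetical; a real audit would also compare descriptions, URLs, and founding details.

```python
import re
from collections import Counter

# Assumed list of legal suffixes to ignore when comparing names.
LEGAL_SUFFIXES = {"inc", "llc", "ltd", "corp", "co", "gmbh"}

def normalize_name(name):
    """Lowercase, strip punctuation, and drop legal suffixes."""
    text = re.sub(r"[^\w\s]", " ", name.lower())
    tokens = [t for t in text.split() if t not in LEGAL_SUFFIXES]
    return " ".join(tokens)

def inconsistent_profiles(profiles):
    """Return surfaces whose normalized name differs from the most common one."""
    if not profiles:
        return []
    normalized = {surface: normalize_name(n) for surface, n in profiles.items()}
    canonical = Counter(normalized.values()).most_common(1)[0][0]
    return sorted(s for s, n in normalized.items() if n != canonical)

# Hypothetical brand profiles.
profiles = {
    "website": "Acme Analytics",
    "linkedin": "Acme Analytics, Inc.",
    "crunchbase": "ACME Analytics",
    "wikipedia": "Acme Data",
}
print(inconsistent_profiles(profiles))  # ['wikipedia']
```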

3. Source quality thresholds

Yang's study also found that all major engines lean heavily toward high-quality sources. OpenAI's models cited high-quality outlets 96.2% of the time. Google's cited high-quality outlets 92.2% of the time. The bar is set, and it's not low.

Perplexity goes further with an explicit quality gate. According to published analyses of its retrieval pipeline, Perplexity uses a multi-stage reranking system, with a late-stage XGBoost quality classifier reportedly filtering out roughly 70% of candidate sources before they're allowed to appear in citations.

If a domain doesn't meet that threshold, the page is invisible regardless of how good it is.
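The practical effect of a late-stage quality gate is easy to see in a toy pipeline. This is a minimal sketch, not Perplexity's actual system: the `quality` field is a stand-in for whatever a learned classifier would output, and the threshold, field names, and URLs are all assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    url: str
    relevance: float  # hypothetical retrieval score, 0..1
    quality: float    # stand-in for a learned quality classifier's score, 0..1

def select_citations(candidates, quality_threshold=0.5, max_citations=3):
    """Rank by relevance, then drop anything below the quality gate."""
    # Stage 1: order candidates by how well they match the query.
    ranked = sorted(candidates, key=lambda c: c.relevance, reverse=True)
    # Stage 2: the gate removes low-quality sources regardless of relevance.
    passed = [c for c in ranked if c.quality >= quality_threshold]
    return [c.url for c in passed[:max_citations]]

candidates = [
    Candidate("https://example.com/great-but-unknown", relevance=0.95, quality=0.40),
    Candidate("https://example.org/solid-guide", relevance=0.80, quality=0.85),
    Candidate("https://example.net/reference", relevance=0.60, quality=0.90),
]
cited = select_citations(candidates)
print(cited)
```

The most relevant page in this example never appears, because the gate runs after ranking and does not trade quality against relevance.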

4. Recency and the live-retrieval gap

Gemini and Perplexity both actively retrieve live web content. ChatGPT and Claude increasingly do as well, but with different defaults and different weight given to freshness signals.

BrightEdge's twelve-month analysis (February 2025 to February 2026) of Google AI Overviews (AIO) found that AIO presence grew from roughly 30% to 48% of tracked queries. Over the same period, the overlap between AIO citations and the organic top 100 edged up from about 49% to 53%, while overlap with the organic top 10 stayed roughly flat, near 17%. For Gemini and AIO specifically, recently updated pages that rank well in classic search are still disproportionately likely to be cited.

This is the most actionable finding for SEO teams: existing SEO work has measurable carryover into AIO and Gemini visibility. It does not carry over evenly to ChatGPT or Perplexity, which depend more on entity, citation, and training-data signals than on real-time ranking position.

Engine-specific differences

The engines aren't interchangeable. The public data points to clear personalities:

ChatGPT weighs aggregate training-data signals most heavily: how often a brand appears across the broader web, how strongly it co-occurs with category terms, and how clean its entity definition is across authoritative third-party sources.

Gemini is the most sensitive to live web signals. Of all the engines, it's the one where good SEO carries over most directly. Recent, well-ranked pages on authoritative domains tend to surface.

Perplexity has the most explicit citation logic and the highest quality threshold for what it'll cite. AuthorityTech's domain audit (April 2026) found that ChatGPT and Perplexity share only 11% of their cited domains, which signals that these systems are pulling from very different source sets.
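An overlap figure like this can be reproduced on any two sets of cited domains. The sketch below uses Jaccard overlap (shared domains divided by all distinct domains either engine cited). That is one reasonable reading; the audit's exact definition isn't stated here, so treat the formula, the function name, and the sample domains as assumptions.

```python
def shared_domain_rate(domains_a, domains_b):
    """Jaccard overlap: domains cited by both, over domains cited by either."""
    a, b = set(domains_a), set(domains_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical cited-domain lists for two engines.
engine_one = ["a.com", "b.com", "c.com", "d.com", "e.com"]
engine_two = ["e.com", "f.com", "g.com", "h.com", "i.com"]
rate = shared_domain_rate(engine_one, engine_two)
print(f"{rate:.1%}")  # prints "11.1%": one shared domain out of nine
```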

Claude leans more on enterprise-grade and structured sources. Its user base is disproportionately enterprise (the May 2026 Ramp AI Index showed Anthropic at 34.4% of US business AI adoption versus OpenAI at 32.3%), which matters for B2B brands more than its overall consumer market share would suggest.

What this means for GEO strategy

The common thread across every engine is the same: authoritative, specific, citable content from an entity the AI can resolve cleanly. That's the foundation. Without it, engine-specific tactics don't compound.

On top of the foundation, the engine-specific work:

For ChatGPT: invest in third-party entity signals. Wikipedia, Crunchbase, named-author bylines in industry publications.
For Gemini: treat GEO as a layer on top of strong SEO. The overlap with classic search is the highest of any engine.
For Perplexity: structure for extraction. Tables, comparisons, named entities, direct quotes. The Princeton paper data points directly at this.
For Claude: prioritize depth, original analysis, and credibility signals. Enterprise readers and enterprise-tilted engines reward substance over volume.

None of this replaces the need for ongoing monitoring. AI engines change. Last quarter's findings won't all hold next quarter. The brands that stay ahead are the ones tracking their visibility continuously, not auditing it once a year.

Zumi measures how all nine AI engines, from ChatGPT to Perplexity, rank and cite brands, tracking mention rate, share of voice, position, and citation share so the ranking becomes something a team can act on.
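For teams that want a feel for these metrics before adopting a tool, here is a minimal sketch of three of them computed from a batch of AI answers. The definitions are generic assumptions of mine, not any vendor's: mention rate is the share of answers naming the brand, share of voice is the brand's share of all brand mentions, and position is where the brand first appears in an answer. The input format and brand names are hypothetical.

```python
def visibility_metrics(answers, brand):
    """answers: one list per prompt, holding brand names in order of mention."""
    total = len(answers)
    mentioned = [a for a in answers if brand in a]
    all_mentions = sum(len(a) for a in answers)
    brand_mentions = sum(a.count(brand) for a in answers)
    # 1-based position of the brand's first mention in each answer naming it.
    positions = [a.index(brand) + 1 for a in mentioned]
    return {
        "mention_rate": len(mentioned) / total if total else 0.0,
        "share_of_voice": brand_mentions / all_mentions if all_mentions else 0.0,
        "avg_position": sum(positions) / len(positions) if positions else None,
    }

# Hypothetical results from four prompts.
answers = [
    ["Acme", "Globex"],
    ["Globex"],
    ["Initech", "Acme", "Globex"],
    [],
]
metrics = visibility_metrics(answers, "Acme")
print(metrics)
```

Tracked per engine and per week, numbers like these are what turn "AI engines change" into something a team can actually see.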

Sources

Aggarwal, P. et al. "GEO: Generative Engine Optimization." KDD 2024 / arXiv:2311.09735
Yang, K-C. "News Source Citing Patterns in AI Search Systems." arXiv:2507.05301
Lantern. "AI Citation Content Visibility Report" (February 2026). asklantern.com
BrightEdge. "AI Overviews at the One-Year Mark" (February 2026). brightedge.com
Ramp AI Index, May 2026, reported via TechCrunch (May 13, 2026). techcrunch.com
Averi / AuthorityTech. "Only 11% of domains cited by both ChatGPT and Perplexity" (early 2026). authoritytech.io
ZipTie. "How Perplexity AI answers work: retrieval, ranking and citation pipeline" (2026). ziptie.dev