Most takes on "how to rank in ChatGPT" are vibes. There's now a decent body of public research that's a lot more concrete. I've spent the last few weeks pulling together what the academic and industry studies actually say, and the picture is clearer than the hot-take economy would suggest.
Here's what the data shows about how AI engines decide which brands to name in their answers.
The starting fact: AI engines are not neutral
The Lantern AI Citation Visibility Report (February 2026) analyzed over 200 million citations pulled directly from the interfaces of ChatGPT, Perplexity, Gemini, and Claude. Their headline finding: a small number of domains capture a disproportionate share of mentions, and the mix differs sharply across engines.
A separate arXiv study by Kai-Cheng Yang ("News Source Citing Patterns in AI Search Systems", July 2025) looked at 65,000 responses across OpenAI, Perplexity, and Google. In OpenAI's models, the top 20 news sources accounted for 67.3% of all news citations. Google and Perplexity were less concentrated, but still skewed heavily toward the same few authoritative outlets.
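If you want to run the same kind of concentration check on your own category, the arithmetic is simple. Here is a minimal sketch, assuming you already have a flat list with one domain per observed citation. The function name and the sample data are made up for illustration and are not taken from Yang's study.

```python
from collections import Counter


def top_k_share(cited_domains, k=20):
    """Fraction of all citations that go to the k most-cited domains."""
    counts = Counter(cited_domains)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(n for _, n in counts.most_common(k))
    return top / total


# Hypothetical citation log: one entry per citation seen in a response.
sample = (
    ["reuters.com"] * 5
    + ["apnews.com"] * 3
    + ["example-blog.com", "smallsite.org"]
)
print(f"top-2 share: {top_k_share(sample, k=2):.1%}")
```

A share close to 1.0 for a small k means a handful of domains own the category, which is the pattern the study reports for news.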
The takeaway is uncomfortable for most brands: AI engines are picking favorites. If your brand isn't already on the list, getting onto it is the entire job.
What actually drives mentions
1. Citation-worthy structural signals
The most-cited piece of GEO research is the Princeton and IIT Delhi paper "GEO: Generative Engine Optimization" (Aggarwal et al., KDD 2024). It tested specific on-page changes against a 10,000-query benchmark and measured how they affected visibility in generative responses.
How the tested changes performed:
- Adding statistics to existing content: roughly 30 to 40% lift in position-adjusted visibility
- Adding direct quotations from authoritative sources: similar range
- Citing external sources within the content: up to 115% improvement for content that started in lower-ranked positions
- Keyword stuffing: about 10% worse than the unoptimized baseline
That last one matters. The old SEO instinct to pack in keywords actively hurts in AI search. The engines reward content that reads like a credible explanation, not content that reads like it was written for a crawler.
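"Position-adjusted visibility" sounds abstract, so here is a rough sketch of the idea: count the words in each sentence that cites a source, and weight sentences near the top of the answer more heavily. This is a simplified illustration in the spirit of the paper's metric, not a reproduction of it. The exponential decay, the input format, and the domain names are my assumptions.

```python
import math


def position_adjusted_share(sentences, source_id):
    """Position-weighted share of a response attributed to one source.

    sentences: list of (word_count, set_of_cited_source_ids), in answer order.
    The decay weight exp(-pos / n) is an assumption made for illustration.
    """
    n = len(sentences)
    total_words = sum(wc for wc, _ in sentences)
    if n == 0 or total_words == 0:
        return 0.0
    weighted = 0.0
    for pos, (wc, cited) in enumerate(sentences):
        if source_id in cited:
            # Earlier sentences get a weight closer to 1.
            weighted += wc * math.exp(-pos / n)
    return weighted / total_words


def relative_lift(before, after):
    """Relative change in visibility; before must be non-zero."""
    return (after - before) / before


# Hypothetical answers before and after adding statistics to a page.
before = [(12, {"rival.com"}), (15, {"rival.com"}), (8, {"yourbrand.com"})]
after = [(14, {"yourbrand.com"}), (15, {"rival.com"}), (8, {"yourbrand.com"})]

score_before = position_adjusted_share(before, "yourbrand.com")
score_after = position_adjusted_share(after, "yourbrand.com")
print(f"lift: {relative_lift(score_before, score_after):.0%}")
```

The point of the weighting is that moving from the last sentence of an answer to the first counts as a gain even if the number of cited words stays the same.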
2. Entity clarity
AI engines model the world as entities (organizations, products, people) with defined attributes. When the entity definition is ambiguous, the brand gets suppressed even when it should be a strong match for a query.
The Lantern report found that YouTube is the single most-cited domain across all four major engines, with more than twice the citation share of the second-place domain. Why YouTube? The likely reason is that it pairs structured metadata (titles, descriptions, transcripts) with a clear entity model that AI systems can resolve confidently. The lesson isn't "post on YouTube". It's that entity clarity is what unlocks citations, and YouTube happens to enforce it by design.
Brands that scatter their identity across inconsistent profiles (different names on LinkedIn, Crunchbase, Wikipedia, their own site) measurably underperform brands that maintain a clean, consistent entity definition across the same surfaces.
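A quick way to spot this kind of drift is to normalize the brand name from each profile and flag the ones that disagree with the majority. The sketch below assumes you have collected the names by hand. The suffix list, the function names, and the "Acme" profiles are all hypothetical.

```python
import re
import unicodedata
from collections import Counter

# Assumed list of legal suffixes to ignore; extend it for your market.
LEGAL_SUFFIXES = {"inc", "llc", "ltd", "corp", "co", "gmbh"}


def normalize_name(name):
    """Lowercase, strip accents and punctuation, drop legal suffixes."""
    ascii_text = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    tokens = re.findall(r"[a-z0-9]+", ascii_text.lower())
    return " ".join(t for t in tokens if t not in LEGAL_SUFFIXES)


def inconsistent_profiles(profiles):
    """Return the surfaces whose normalized name differs from the majority."""
    if not profiles:
        return {}
    normalized = {surface: normalize_name(n) for surface, n in profiles.items()}
    canonical, _ = Counter(normalized.values()).most_common(1)[0]
    return {s: n for s, n in profiles.items() if normalized[s] != canonical}


# Hypothetical brand profiles gathered manually.
profiles = {
    "website": "Acme Analytics",
    "linkedin": "Acme Analytics, Inc.",
    "crunchbase": "ACME Analytics",
    "wikipedia": "Acme Data",
}
print(inconsistent_profiles(profiles))
```

Here only the Wikipedia entry is flagged, since the other three collapse to the same normalized name. Real entity resolution is fuzzier than this, but a mismatch this check catches is one an engine may struggle with too.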
3. Source quality thresholds
Yang's study also found that all major engines lean heavily toward high-quality sources. OpenAI's models cited high-quality outlets 96.2% of the time. Google's cited high-quality outlets 92.2% of the time. The bar is set, and it's not low.
Perplexity goes further with an explicit quality gate. According to public documentation of its retrieval pipeline, Perplexity uses a three-stage reranking system, with the final stage (an XGBoost quality classifier) filtering out roughly 70% of candidate sources before they're allowed to appear in citations.
If your domain doesn't meet that threshold, you're invisible regardless of how good the page is.
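To make the gate idea concrete, here is a toy version: filter candidates by a domain-level quality score first, then rank whatever survives by relevance. This is not Perplexity's pipeline. The scores, the 0.6 threshold, and the field names are all invented for illustration.

```python
def quality_gate(candidates, threshold=0.6):
    """Drop candidates below the quality threshold, then sort by relevance.

    Each candidate is a dict with hypothetical 'url', 'relevance', and
    'quality' fields, both scores in the range 0 to 1.
    """
    passed = [c for c in candidates if c["quality"] >= threshold]
    return sorted(passed, key=lambda c: c["relevance"], reverse=True)


# Hypothetical candidates: the most relevant page sits on a low-trust domain.
candidates = [
    {"url": "https://low-trust.example/great-guide", "relevance": 0.95, "quality": 0.35},
    {"url": "https://trusted.example/ok-guide", "relevance": 0.70, "quality": 0.88},
    {"url": "https://trusted.example/other", "relevance": 0.80, "quality": 0.75},
]

for c in quality_gate(candidates):
    print(c["url"])
```

The most relevant page never appears in the output because the gate runs before relevance is considered.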
4. Recency and the live-retrieval gap
Gemini and Perplexity both actively retrieve live web content. ChatGPT and Claude increasingly do as well, but with different defaults and different weight given to freshness signals.
BrightEdge's twelve-month analysis (February 2025 to February 2026) of Google AI Overviews (AIO) found that AIO presence grew from roughly 30% to 48% of tracked queries, and that the overlap between AIO citations and traditional organic top-10 rankings grew from 32.3% to 54.5%. For Gemini and AIO specifically, recently updated pages that rank well in classic search are disproportionately likely to be cited.
This is the most actionable finding for SEO teams: the work you're already doing has measurable carryover into AIO and Gemini visibility. It does not carry over evenly to ChatGPT or Perplexity, which depend more on entity, citation, and training-data signals than on real-time ranking position.
Engine-specific differences
The engines aren't interchangeable. The public data points to clear personalities:
ChatGPT weighs aggregate training-data signals most heavily: how often your brand appears across the broader web, how strongly you co-occur with category terms, and how clean your entity definition is across authoritative third-party sources.
Gemini is the most sensitive to live web signals. Of all the engines, it's the one where good SEO carries over most directly. Recent, well-ranked pages on authoritative domains tend to surface.
Perplexity has the most explicit citation logic and the highest quality threshold for what it'll cite. AuthorityTech's domain audit (April 2026) found that ChatGPT and Perplexity share only 11% of their cited domains, which tells you these systems are pulling from very different source sets.
Claude leans more toward enterprise-grade and structured sources. Its user base is disproportionately enterprise (the May 2026 Ramp AI Index showed Claude at 34.4% of US business AI spend versus OpenAI at 32.3%), which matters for B2B brands more than its overall consumer market share would suggest.
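You can measure how far two engines diverge for your own queries by comparing the sets of domains each one cites. The sketch below uses Jaccard overlap (shared domains divided by all distinct domains). That choice is an assumption on my part, and I don't know whether AuthorityTech's 11% figure was computed the same way. The domain sets are made up.

```python
def domain_overlap(domains_a, domains_b):
    """Jaccard overlap between two collections of cited domains."""
    a, b = set(domains_a), set(domains_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical cited domains for the same query set on two engines.
chatgpt_domains = {"wikipedia.org", "reuters.com", "forbes.com"}
perplexity_domains = {"reddit.com", "reuters.com", "g2.com"}

print(f"overlap: {domain_overlap(chatgpt_domains, perplexity_domains):.0%}")
```

A low overlap for your category is a sign that winning citations on one engine won't automatically carry over to the other.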
What this means for your GEO strategy
The common thread across every engine: authoritative, specific, citable content from an entity the AI can resolve cleanly. That's the foundation. Without it, engine-specific tactics don't compound.
On top of the foundation, the engine-specific work:
- For ChatGPT: invest in third-party entity signals. Wikipedia, Crunchbase, named-author bylines in industry publications.
- For Gemini: treat GEO as a layer on top of strong SEO. The overlap with classic search is the highest of any engine.
- For Perplexity: structure for extraction. Tables, comparisons, named entities, direct quotes. The Princeton paper's results on statistics and quotations point in this direction.
- For Claude: prioritize depth, original analysis, and credibility signals. Enterprise readers and enterprise-tilted engines reward substance over volume.
None of this replaces the need for ongoing monitoring. AI engines change. Last quarter's findings won't all hold next quarter. The brands that stay ahead are the ones tracking their visibility continuously, not auditing it once a year.
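Continuous tracking doesn't have to be elaborate. A bare-bones sketch, assuming you log each sampled answer with the week and engine it came from: compute the share of answers that mention your brand, per week and per engine. The record format and the "Acme" brand are hypothetical, and the plain substring match is a deliberate simplification.

```python
from collections import defaultdict


def mention_rates(observations, brand):
    """Share of answers mentioning the brand, keyed by (week, engine).

    observations: iterable of (week, engine, answer_text) tuples.
    Uses a case-insensitive substring match, which can over-count brands
    whose names are common words.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    needle = brand.lower()
    for week, engine, text in observations:
        key = (week, engine)
        totals[key] += 1
        if needle in text.lower():
            hits[key] += 1
    return {key: hits[key] / totals[key] for key in totals}


# Hypothetical log of sampled answers.
observations = [
    ("2026-W01", "chatgpt", "Top picks: Acme and Globex."),
    ("2026-W01", "chatgpt", "Try Globex."),
    ("2026-W02", "chatgpt", "Acme is a solid choice."),
]

for key, rate in sorted(mention_rates(observations, "Acme").items()):
    print(key, f"{rate:.0%}")
```

Run the same prompts on a fixed schedule so a week-over-week change reflects the engine and not a change in what you asked.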
Sources
- Aggarwal, P. et al. "GEO: Generative Engine Optimization." KDD 2024 / arXiv:2311.09735
- Yang, K-C. "News Source Citing Patterns in AI Search Systems." arXiv:2507.05301
- Lantern. "AI Citation Content Visibility Report" (February 2026)
- BrightEdge. "AI Overviews at the One-Year Mark" (February 2026)
- Ramp AI Index (May 2026)
- AuthorityTech. "ChatGPT and Perplexity Share Only 11% of Cited Domains" (April 2026)