Mention rate was the first generative engine optimization (GEO) metric because it was the easiest to observe. A marketer asks an AI engine a question, checks whether the brand appears, and counts. Simple. It is also the least useful metric to optimize in isolation.

A brand mentioned seventh in a list of eight, with a sentence noting that "some teams find their pricing opaque," is in a worse position than a brand not mentioned at all. Mention rate says a brand is in the answer. It does not say whether that brand is winning.

The full GEO measurement framework has four core metrics, a supporting set of diagnostic signals, and an approach to revenue attribution that differs from how SEO is typically measured.

Key takeaways

The four core GEO metrics are: mention rate (presence), average position (prominence), share of voice (relative competitive standing), and sentiment (quality of the mention).
Each metric tells a different part of the story. Optimizing for one without tracking the others produces misleading progress signals.
Attribution is the hardest GEO measurement challenge because AI answers rarely generate a direct click. The methods that work: AI referral traffic segmentation and pipeline survey data from discovery calls.
The north star metric is not score improvement on a dashboard. It is appearing in the answers that precede purchase decisions in the specific category, for the specific buyer type the brand targets.

The four core metrics

Mention rate. The percentage of target queries for which the brand appears in the AI answer. A useful mention rate requires a defined query set, not a single test query. The query set should cover the category queries, use-case queries, and comparison queries that represent how the target buyer actually researches the category. Mention rate across twenty to thirty consistent queries is a reliable baseline. A single query is an anecdote.
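As a minimal sketch of the calculation, assuming answers have already been collected for a fixed query set: the `Answer` record, the brand names, and the sample answers below are all made up for illustration, and the case-insensitive substring match is a simplification that would need alias handling in practice.

```python
from dataclasses import dataclass

@dataclass
class Answer:
    query: str   # the prompt sent to the engine
    engine: str  # e.g. "perplexity"
    text: str    # raw answer text returned by the engine

def mention_rate(answers, brand):
    """Fraction of answers whose text mentions the brand (case-insensitive)."""
    if not answers:
        return 0.0
    hits = sum(1 for a in answers if brand.lower() in a.text.lower())
    return hits / len(answers)

# Hypothetical query set and answers, for illustration only.
answers = [
    Answer("best crm for startups", "perplexity", "Top picks: Acme, Globex, Initech."),
    Answer("crm with good reporting", "perplexity", "Globex and Initech stand out."),
    Answer("acme vs globex", "perplexity", "Acme is cheaper; Globex has more integrations."),
    Answer("crm for agencies", "perplexity", "Consider Initech or Umbrella."),
]

print(mention_rate(answers, "Acme"))  # 0.5 (mentioned in 2 of 4 answers)
```

Four queries is only enough to show the arithmetic. The same function applies unchanged to the twenty-to-thirty-query set described above.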

Average position. When the brand appears, where in the answer does it appear? AI answers are consumed like prose. The brand named first in a synthesized list of recommendations captures more attention and more implicit endorsement than the brand named fourth. Position tracking requires consistent methodology: record the brand's position in each answer where it appears, across the full query set, and average those positions.

Average position is particularly useful for detecting improvement before mention rate changes. A brand moving from average position 3.2 to average position 1.8 across the same query set is building competitive strength that tends to show up in mention rate improvements over the following months.
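One way to make position tracking repeatable is to define position as the brand's rank among a fixed list of tracked brands, ordered by where each first appears in the answer. That definition is an assumption of this sketch, not a standard. The brand names and answer texts are hypothetical.

```python
def brand_position(text, brand, tracked_brands):
    """1-based rank of `brand` among tracked brands by first appearance.

    Returns None when the brand does not appear in the answer.
    """
    lowered = text.lower()
    first_seen = {b: lowered.find(b.lower()) for b in tracked_brands}
    # Keep only brands that appear, ordered by character offset.
    present = sorted((idx, b) for b, idx in first_seen.items() if idx >= 0)
    order = [b for _, b in present]
    return order.index(brand) + 1 if brand in order else None

def average_position(texts, brand, tracked_brands):
    """Mean position over the answers where the brand appears, else None."""
    positions = [brand_position(t, brand, tracked_brands) for t in texts]
    positions = [p for p in positions if p is not None]
    return sum(positions) / len(positions) if positions else None

# Hypothetical tracked brands and answers, for illustration only.
tracked = ["Acme", "Globex", "Initech"]
texts = [
    "Top picks: Acme, Globex, Initech.",               # Acme is 1st
    "Globex and Initech stand out.",                   # Acme absent
    "Globex has more integrations; Acme is cheaper.",  # Acme is 2nd
    "Consider Initech, Globex, or Acme.",              # Acme is 3rd
]

print(average_position(texts, "Acme", tracked))  # 2.0
```

Answers where the brand is absent are excluded from the average rather than counted as a low position, so average position and mention rate stay separate signals.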

Share of voice per engine. How often does the brand appear relative to competitors, across the same query set, on the same engine? Share of voice makes the competitive picture concrete. A brand that appears in a third of answers while the leading competitor appears in two thirds faces a different strategic situation than a brand with the same mention rate in a category where no competitor appears in more than one answer in five.

Share of voice should be tracked separately for each engine. Perplexity, ChatGPT, and Gemini have different citation patterns, and a brand that leads on one engine may lag significantly on another. The per-engine breakdown reveals where the competitive gap is widest and where improvement investment will produce the most impact.
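A rough sketch of a per-engine breakdown follows. It assumes one common definition of share of voice: the brand's mentions divided by all tracked-brand mentions on that engine. Other definitions exist, and the brands, engines, and answers here are invented for the example.

```python
from collections import defaultdict

def share_of_voice(records, tracked_brands):
    """records: iterable of (engine, answer_text) pairs.

    Returns {engine: {brand: share}}, where share is the brand's mentions
    divided by all tracked-brand mentions on that engine.
    """
    counts = defaultdict(lambda: dict.fromkeys(tracked_brands, 0))
    for engine, text in records:
        per_brand = counts[engine]  # creates the engine entry even with no mentions
        lowered = text.lower()
        for brand in tracked_brands:
            if brand.lower() in lowered:
                per_brand[brand] += 1
    result = {}
    for engine, per_brand in counts.items():
        total = sum(per_brand.values())
        result[engine] = {b: (n / total if total else 0.0) for b, n in per_brand.items()}
    return result

# Hypothetical data, for illustration only.
tracked = ["Acme", "Globex", "Initech"]
records = [
    ("perplexity", "Top picks: Acme, Globex, Initech."),
    ("perplexity", "Globex and Initech stand out."),
    ("chatgpt", "Acme is a common choice."),
    ("chatgpt", "Acme or Globex, depending on budget."),
]

sov = share_of_voice(records, tracked)
print(sov["perplexity"]["Acme"])  # 0.2 (1 of 5 tracked mentions)
print(sov["chatgpt"]["Acme"])     # 2 of 3 tracked mentions
```

In this toy data the same brand holds a fifth of the voice on one engine and two thirds on the other, which is the kind of gap the per-engine view is meant to surface.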

Sentiment score. Not all mentions are equivalent. A brand mentioned as "the recommended platform for growth-stage teams" has a different competitive impact than a brand mentioned as "a common option, though some users report a steep learning curve." Sentiment in AI answers is a leading indicator of how the competitive narrative is forming.

Sentiment scoring can be done manually on small query sets: positive, neutral, or negative per mention, per engine. At scale, this is where monitoring tooling provides real value. The sentiment dimension is what turns mention rate from a binary measurement into a meaningful signal about competitive positioning.
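For manual scoring, a simple tally per engine is enough to start. The sketch below assumes each mention has already been hand-labeled as positive, neutral, or negative. The net score (positive counts as +1, neutral as 0, negative as -1, then averaged) is one illustrative convention, not an established formula, and the labels are sample data.

```python
from collections import Counter

# Assumed scoring convention for this sketch.
SCORE = {"positive": 1, "neutral": 0, "negative": -1}

def sentiment_summary(labels):
    """labels: iterable of (engine, label) pairs from manual review.

    Returns {engine: {"counts": {...}, "net": float}} with net in [-1, 1].
    """
    by_engine = {}
    for engine, label in labels:
        if label not in SCORE:
            raise ValueError(f"unknown sentiment label: {label!r}")
        by_engine.setdefault(engine, Counter())[label] += 1
    summary = {}
    for engine, counts in by_engine.items():
        total = sum(counts.values())
        net = sum(SCORE[label] * n for label, n in counts.items()) / total
        summary[engine] = {"counts": dict(counts), "net": net}
    return summary

# Hypothetical hand-labeled mentions, for illustration only.
labels = [
    ("perplexity", "positive"),
    ("perplexity", "positive"),
    ("perplexity", "neutral"),
    ("perplexity", "negative"),
    ("chatgpt", "neutral"),
    ("chatgpt", "negative"),
]

summary = sentiment_summary(labels)
print(summary["perplexity"]["net"])  # 0.25
print(summary["chatgpt"]["net"])     # -0.5
```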

Diagnostic signals

Beyond the four core metrics, several diagnostic signals help explain why the core metrics are moving or not moving.

Citation sources. Which third-party sources is the engine drawing on when it recommends the brand? If the engine is primarily citing the brand's own content, the entity signal is weak. If the engine is citing analyst firms, review platforms, and high-authority publications, the entity signal is strong. A shift in citation sources toward higher-authority third parties is a leading indicator of future mention rate improvement.

Engine distribution. On which engines does the brand appear most and least consistently? Significant variation across engines points to specific signal deficiencies. Strong Perplexity presence but weak ChatGPT presence typically means strong SEO signals but weak entity and training-data signals. Strong ChatGPT presence but weak Perplexity presence typically means strong historical training-data presence but aging or thin current content.

Query type breakdown. Does the brand appear more reliably in category queries, use-case queries, or comparison queries? A brand that appears only in comparison queries is known but not positioned as a category default. A brand that appears in category queries but not in use-case queries has a positioning gap in specific applications.
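The engine distribution and query type breakdowns are the same mention rate, grouped by a different field. A minimal sketch, assuming each result has been recorded with its engine, query type, and whether the brand was mentioned; the field names and rows are hypothetical.

```python
from collections import defaultdict

def rate_by(records, key):
    """Mention rate grouped by `key` (e.g. "engine" or "query_type")."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for record in records:
        group = record[key]
        totals[group] += 1
        hits[group] += int(record["mentioned"])
    return {group: hits[group] / totals[group] for group in totals}

# Hypothetical results, for illustration only.
records = [
    {"engine": "perplexity", "query_type": "category", "mentioned": True},
    {"engine": "perplexity", "query_type": "use_case", "mentioned": False},
    {"engine": "perplexity", "query_type": "comparison", "mentioned": True},
    {"engine": "chatgpt", "query_type": "category", "mentioned": False},
    {"engine": "chatgpt", "query_type": "use_case", "mentioned": False},
    {"engine": "chatgpt", "query_type": "comparison", "mentioned": True},
]

by_engine = rate_by(records, "engine")
by_type = rate_by(records, "query_type")
print(by_type)  # {'category': 0.5, 'use_case': 0.0, 'comparison': 1.0}
```

In this sample the brand shows up in every comparison query and no use-case query, the "known but not positioned" pattern described above.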

Measuring revenue attribution

Attribution is the hardest GEO measurement challenge because AI answers rarely generate a direct, trackable click to the brand's website. A buyer reads a Perplexity answer that names three vendors, then independently types one of those vendors into Google. The referral source is Google, not Perplexity. The AI influence is invisible in the standard analytics.

Two methods produce usable attribution data:

AI referral traffic segmentation. AI engines do send some direct referral traffic, measurable in GA4 or equivalent tools. Segment visitors from known AI referral sources (perplexity.ai, chatgpt.com, gemini.google.com) and compare their conversion rates with those of other traffic sources. AI referral visitors often convert at higher rates than visitors from other sources because they arrive with category intent already formed.
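The segmentation logic can be sketched outside any analytics tool. The example below assumes session data has been exported as (referrer URL, converted) pairs. That export shape is hypothetical and this is not GA4 API code. The domain list comes from the sources named above, and the sessions are invented.

```python
from urllib.parse import urlparse

# AI referral domains named in the text; extend as new sources appear.
AI_REFERRERS = ("perplexity.ai", "chatgpt.com", "gemini.google.com")

def segment(referrer_url):
    """Classify a referrer URL as "ai" or "other" by hostname."""
    host = (urlparse(referrer_url).hostname or "").lower()
    # Match the domain itself or any subdomain of it (e.g. www.perplexity.ai).
    if any(host == d or host.endswith("." + d) for d in AI_REFERRERS):
        return "ai"
    return "other"

def conversion_rates(sessions):
    """sessions: iterable of (referrer_url, converted) pairs."""
    totals, conversions = {}, {}
    for referrer, converted in sessions:
        seg = segment(referrer)
        totals[seg] = totals.get(seg, 0) + 1
        conversions[seg] = conversions.get(seg, 0) + int(converted)
    return {seg: conversions[seg] / totals[seg] for seg in totals}

# Hypothetical exported sessions, for illustration only.
sessions = [
    ("https://www.perplexity.ai/search?q=crm", True),
    ("https://chatgpt.com/", False),
    ("https://www.google.com/", False),
    ("https://gemini.google.com/app", True),
    ("", False),  # direct visit, no referrer
    ("https://news.example.com/post", True),
]

rates = conversion_rates(sessions)
print(rates)
```

Matching on the exact hostname or a subdomain keeps www.google.com out of the AI segment even though gemini.google.com is in it.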

Discovery call survey data. Add a single question to the discovery call intake or post-demo survey: "Where did this brand first come up?" Include AI-specific options ("ChatGPT recommendation", "Perplexity answer", "AI search"). Even with imperfect response rates, the data builds a picture of AI's influence on pipeline over time, and it captures the AI-influenced buyers that referral analytics never see.
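Tallying the survey answers is straightforward. In this sketch the AI-specific options are the ones quoted above. The non-AI options and the responses themselves are made up, and skipped questions are represented as `None`.

```python
from collections import Counter

# AI-specific answer options from the survey question.
AI_OPTIONS = {"ChatGPT recommendation", "Perplexity answer", "AI search"}

def ai_influenced_share(responses):
    """Return (counts per option, share of answered responses naming an AI source)."""
    answered = [r for r in responses if r]  # drop skipped questions
    if not answered:
        return Counter(), 0.0
    counts = Counter(answered)
    ai_total = sum(n for option, n in counts.items() if option in AI_OPTIONS)
    return counts, ai_total / len(answered)

# Hypothetical survey responses, for illustration only.
responses = [
    "ChatGPT recommendation",
    "Google search",
    None,  # question skipped
    "Perplexity answer",
    "Colleague referral",
    "Google search",
    "Conference",
    "Review site",
    "Google search",
]

counts, share = ai_influenced_share(responses)
print(share)  # 0.25 (2 of 8 answered responses)
```

Tracking this share quarter over quarter, rather than reading any single value, is what builds the picture over time.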

For a CMO-level framing, the CMO's guide to justifying AI visibility spend covers how to build the business case using the metrics above.