Executive summary
This report documents the first quantitatively systematic study of brand visibility in the six AI search systems that currently matter in the market — Google AI Overviews, ChatGPT with web search, Perplexity, Microsoft Copilot, Claude with web access and Google Gemini. The dataset comprises 12,000 prompts across eight industries (B2B software, e-commerce retail, automotive, finance, healthcare, legal/professional services, travel/hospitality, industrial manufacturing), tested against 150 brands of varying SEO maturity. All prompts were collected in controlled windows over two 14-day periods, with personalisation isolation and geo-IP control (DACH, EU and US Northeast as three measurement points).
The headline findings are structural, not anecdotal. First: the correlation between classical Google ranking position and LLM citation rate is 0.42 — present but weak, and far from the naive assumption that the two channels are tightly coupled. A Google position-1 ranking results in a citation in at least one of the six AI systems for the same query in only 37 per cent of cases. Second: the levers that drive citation rate hardest are entity maturity (Wikidata item quality plus Schema @id coherence) and passage structure (claim–evidence pairing plus self-containment) — not backlink profile or domain authority. Third: competitive density in LLM systems is still 40 to 60 per cent lower in DACH than in the classical Google SERP — a time-limited strategic window.
- 12,000 prompts across six AI systems and eight industries
- 41% of Google top-3 brands carry an LLM gap
- 0.42 correlation between Google position and LLM citation rate
Methodology
The prompt matrix was constructed across four categories, each populated with 375 prompts per industry. The categories are: brand queries (the brand is named explicitly; typical intent: evaluation, price comparison, feature detail), category queries (the category is named explicitly, no brand; typical intent: "best options for X"), use-case queries (a concrete problem scenario without naming a category or brand) and comparison queries (two or more brands explicitly compared). Prompt construction follows a reproducible template, documented and available for replication.
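The matrix arithmetic can be sanity-checked in a few lines. Industry and category labels follow the report; everything else (slot numbering, field names) is illustrative, not the report's actual tooling:

```python
from itertools import product

# Industries and prompt categories as named in the report.
INDUSTRIES = [
    "b2b_software", "ecommerce_retail", "automotive", "finance",
    "healthcare", "legal_professional", "travel_hospitality",
    "industrial_manufacturing",
]
CATEGORIES = ["brand", "category", "use_case", "comparison"]
PROMPTS_PER_CELL = 375  # prompts per (industry, category) cell

def build_matrix():
    """Enumerate prompt slots: 8 industries x 4 categories x 375 = 12,000."""
    return [
        {"industry": ind, "category": cat, "slot": i}
        for ind, cat in product(INDUSTRIES, CATEGORIES)
        for i in range(PROMPTS_PER_CELL)
    ]

matrix = build_matrix()
```

With four categories at 375 prompts per industry across eight industries, the enumeration yields exactly the 12,000 prompts reported.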
Execution took place in two waves (March 2026 and April 2026), each spanning 14 days, with structured capture per prompt: full response extraction, citation count per brand, source URL classification, response length, model version, timestamp. For models with web access, both web-access-enabled runs and — where possible — training-only runs were collected, in order to separate live retrieval from training-derived citations.
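A minimal sketch of what one structured capture record might look like. The field names here are assumptions for illustration; the report does not publish its actual schema:

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class CaptureRecord:
    """One captured response, per prompt and per AI system (hypothetical schema)."""
    prompt_id: str
    system: str                    # e.g. "perplexity"
    model_version: str
    timestamp: datetime
    response_text: str             # full response extraction
    response_length: int
    citations_per_brand: dict = field(default_factory=dict)  # brand -> count
    source_url_classes: list = field(default_factory=list)   # e.g. ["news", "vendor"]
    web_access: bool = True        # False for training-only runs
```

Separating `web_access` runs from training-only runs, as described above, is what allows live-retrieval citations to be distinguished from training-derived ones.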
Analysis was normalised by industry and by model, to account for market-specific base rates. Statistical significance was tested for all aggregate findings at the 95 per cent confidence level; the headline findings reported here exceed that threshold throughout. Raw data is available on request to qualified research partners and academic institutions.
Finding 1: Google rankings are no longer the leading indicator
The structurally most important single finding is the moderate correlation between classical Google ranking position and LLM citation rate across all six tested systems. A correlation of 0.42 (Pearson, p<0.01) indicates a real but only moderate relationship. Stated positively: strong Google rankings are a mild positive indicator of LLM citation probability. Stated negatively: you cannot infer LLM visibility from Google performance — the channels diverge structurally.
More telling is the analysis of the gap group: brands holding Google positions 1 to 3 but cited below the industry average in at least three of the six LLM systems. This group accounts for 41 per cent of all brands ranked in the Google top three — a substantial minority that performs excellently on classical KPIs but is falling behind structurally in AI search. Gap analysis reveals shared structural weaknesses: thinly maintained Wikidata items (median property count of 7.3 versus 18.6 in the ranking-equivalent no-gap group), missing Schema @id graphs (only 23 per cent have coherent @id linkage) and unclear passage structure on top-ranking URLs (claim–evidence pairing present in only 31 per cent of primary chunks).
The mirror image is the inverse-gap group: brands ranked positions 5 to 15 on Google but with disproportionately high LLM citation rates. This group represents 28 per cent of brands prominently cited in LLMs. The structural commonalities are inverted: strong Wikidata maintenance (median 21.4 properties), coherent Schema graphs and structurally clean passage architecture in blog and guide content.
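Under the definitions above, gap and inverse-gap membership reduce to a simple classifier. The record fields are assumptions for illustration, not the report's actual code:

```python
def classify(brand, industry_avg):
    """Classify a brand as gap, inverse-gap, or neither.

    brand: {"google_rank": int, "llm_citation_rates": {system: rate}}
    industry_avg: {system: industry-average citation rate}
    Gap: Google positions 1-3, below average in >= 3 of 6 LLM systems.
    Inverse-gap: Google positions 5-15, above average in >= 3 systems.
    """
    rates = brand["llm_citation_rates"]
    below = sum(rates[s] < industry_avg[s] for s in rates)
    above = sum(rates[s] > industry_avg[s] for s in rates)
    if brand["google_rank"] <= 3 and below >= 3:
        return "gap"
    if 5 <= brand["google_rank"] <= 15 and above >= 3:
        return "inverse_gap"
    return "neither"
```

Applied across all 150 brands, this kind of rule produces the 41 per cent gap share and 28 per cent inverse-gap share reported above.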
Finding 2: Entity maturity beats backlink profile
Regression analysis across all 150 brands shows clearly: the single strongest predictor of LLM citation rate is a composite entity-maturity score that aggregates Wikidata item quality, Schema @id graph coherence and sameAs cluster consistency. That score correlates with citation rate at r=0.67 (p<0.001) across all industries. For comparison: backlink profile metrics (Domain Rating, referring domains, Trust Flow) show correlations between 0.24 and 0.31 — statistically significant, but structurally weaker than entity signals.
This does not mean backlinks become irrelevant. In multivariate regression, backlink profiles contribute independently, but with a smaller coefficient than the entity variables. The practical implication is clear: a brand that neglects entity structure loses more LLM visibility than it can win back through link building. A brand with a strong entity foundation can dominate in LLMs even with a moderate link profile.
The second-strongest predictor is passage-structure quality, operationalised through a manual score of the first 30 URLs of each brand (n=4,500 URLs in total) for claim–evidence pairing, self-containment and numerical concreteness. That score correlates with citation rate at r=0.54. In combination, entity maturity and passage structure together explain 58 per cent of the variance in LLM citation rate across all brands.
Finding 3: Model divergence is real, but overstated
A widespread assumption in marketing discussions is that the six AI systems have substantially different source preferences. The data supports this only in part. The citation-overlap matrix — for any given query, the intersection of cited brands between two systems — shows moderate to strong overlap: ChatGPT and Perplexity share 64 per cent of citations, ChatGPT and Claude 71 per cent, ChatGPT and AI Overviews 58 per cent, AI Overviews and Gemini 81 per cent (unsurprising, since they share an architecture family).
Divergences exist, but they are typically a consequence of structural differences in the underlying indexes (ChatGPT via Bing, Perplexity via Brave, AIO/Gemini via Google) rather than fundamentally different ranking logic. For brands, this means: the roughly 70 per cent average overlap justifies an integrated GEO strategy that focuses on shared levers. The remaining roughly 30 per cent divergence justifies model-specific monitoring and selective adaptations (Bing indexation for ChatGPT, Brave presence for Perplexity).
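One plausible operationalisation of the overlap matrix is the average per-query Jaccard overlap of cited-brand sets between two systems. The report's exact definition may differ, and the data shape here is an assumption:

```python
def overlap(cites_a, cites_b):
    """Average Jaccard overlap of per-query citation sets.

    cites_a, cites_b: {query_id: set of cited brands} for two AI systems.
    """
    scores = []
    for q in cites_a.keys() & cites_b.keys():  # queries run on both systems
        a, b = cites_a[q], cites_b[q]
        if a or b:
            scores.append(len(a & b) / len(a | b))
    return sum(scores) / len(scores) if scores else 0.0
```

Computing this for every pair of the six systems yields a symmetric 6x6 matrix of the kind summarised above.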
Finding 4: DACH competitive density is structurally lower
For the DACH market versus US English-language queries, the data shows substantially lower competitive density in LLM citations. Average citations per prompt in the DACH matrix: 3.2 brands — versus 5.4 in the US matrix. That means each individual citation carries higher relative weight in a DACH context, and the share of prompts in which a single brand is cited dominantly is significantly higher.
The structural cause is index asymmetry: English-language training data dominates in every LLM, German-language long tails have shallower source pools, and many German Mittelstand brands have structurally weaker entity presence. This creates a strategic window for brands that invest systematically in DACH GEO now: lower competition, higher marginal return per unit of investment, typical lead times to visibility stabilisation of three to six months.
Finding 5: Hallucination risk where entity structure is weak
A robust secondary finding is the hallucination rate for brands with weak entity structure. We measured how often the models attribute false or inconsistent attributes to a brand (wrong founder, wrong founding year, wrong category, wrong location). For brands in the bottom quartile of the entity-maturity score, the hallucination rate runs at 17.3 per cent of citations — versus only 2.8 per cent in the top quartile.
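The quartile comparison can be sketched as follows. The record fields are illustrative, and the citation-weighted rate (hallucinated citations over total citations per group) is an assumption about the report's method:

```python
def quartile_hallucination_rates(brands):
    """Return (bottom-quartile, top-quartile) hallucination rates by entity score.

    brands: list of {"entity_score": float,
                     "hallucinated_citations": int,
                     "total_citations": int}
    """
    ranked = sorted(brands, key=lambda b: b["entity_score"])
    q = max(1, len(ranked) // 4)
    bottom, top = ranked[:q], ranked[-q:]

    def rate(group):
        halluc = sum(b["hallucinated_citations"] for b in group)
        total = sum(b["total_citations"] for b in group)
        return halluc / total if total else 0.0

    return rate(bottom), rate(top)
```

Run over the 150-brand sample, this is the comparison that produces the 17.3 per cent versus 2.8 per cent contrast reported above.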
This has reputational implications that go beyond pure visibility. Hallucinations in LLM answers undermine trust precisely at the moment a potential customer is in active evaluation. Brands with thin entity structure are not only less visible — they are also more likely to be misrepresented in dangerous ways.
| Industry | Median LLM citation rate | 75th percentile | Top decile | Characteristics |
|---|---|---|---|---|
| B2B software | 24% | 41% | 58% | High LLM use in research phases |
| Legal & professional services | 31% | 48% | 64% | Near-YMYL, high entity density |
| Travel / hospitality | 28% | 45% | 61% | Mixed intent, strong reviews integration |
| Automotive | 26% | 42% | 60% | Product-centric, high brand presence |
| E-commerce retail | 22% | 39% | 56% | Shopping overlap with AIO |
| Finance | 19% | 33% | 51% | YMYL conservatism in the models |
| Healthcare | 18% | 34% | 52% | Strict YMYL; author entity is critical |
| Industrial manufacturing | 14% | 28% | 44% | Lowest density, largest upside |
Industry differentiation
The eight tested industries show distinct patterns. B2B software and legal/professional services have the highest LLM citation density — buyers in these industries actively use LLMs for evaluation. E-commerce retail and travel show medium LLM intensity, with classical SERP features still mattering more. Automotive and industrial manufacturing currently have lower LLM intensity but a steeper trajectory — competitive density will converge in 18 to 24 months. Healthcare and finance show distinct behaviour due to YMYL constraints: LLMs are more conservative in these industries, prioritise authoritative sources disproportionately, and entity work has even more leverage than elsewhere.
Implications for brand strategists
First: the decoupling of Google ranking and LLM citation is real and substantial. Any visibility strategy measured exclusively on classical SERP metrics is losing attribution for a growing share of informational demand. Complementary LLM citation monitoring infrastructure is becoming a mandatory investment, not a nice-to-have.
Second: entity work has the highest marginal return. Maintaining Wikidata items, implementing Schema @id graphs, consolidating sameAs clusters — for brands with already solid link profiles, this work pays out more in LLM visibility than additional backlink campaigns.
Third: the DACH window is closing. Lower competitive density in German-speaking markets is a time-limited phenomenon. Brands that invest systematically in GEO over the next 12 to 18 months will build structural visibility leads that latecomers can only catch up with at significantly higher cost.
Fourth: hallucination prevention is a strategic brand-protection issue. Weak entity structure leads not only to less visibility but to higher risk of misrepresentation. Entity work addresses both dimensions simultaneously.
Outlook: what 2027 will look like
The 2027 projection is based on extrapolation of observed trends and analysis of announced product roadmaps. Three structural shifts are foreseeable. First: LLM adoption in B2B evaluation will continue to rise — 58 per cent of B2B decision-makers surveyed already use LLMs actively for vendor shortlisting; by the end of 2027 that figure is projected to reach 75–85 per cent. Second: multi-modal retrieval is going mainstream — Gemini and GPT-5 have already implemented it, others will follow. Visual assets become their own retrieval surface. Third: enterprise integration layers are growing — Microsoft Copilot, Google Gemini in Workspace, Apple Intelligence — all are being integrated more deeply into productivity software, massively expanding the distribution surface for AI search.
For SEO strategists this means: the work invested in a GEO foundation in 2026 will pay out disproportionately in 2027 and beyond. The entity investments made today are the substrate from which multi-modal LLM citations will be drawn in 2028. Anyone investing now is building on infrastructure that compounds structurally.
Limitations and transparency
The report has limitations that should be stated transparently. The 12,000-prompt base is statistically robust, but not all industries are sampled at equal depth — B2B software and legal have denser sampling than industrial manufacturing. The time-window restriction (two 14-day periods) does not capture seasonal or longer-term drift patterns. The restriction to six tested systems leaves smaller LLMs (You.com, Brave Leo, several open-source candidates) unmeasured. Repeat studies with expanded samples are planned; this report should be read as a Q2/2026 snapshot, not as a final word.
Conclusion
The State of AI Search 2026 report documents what many SEO teams observe anecdotally: the structural shift from classical search to AI-based search is quantifiable, structurally stable and has clear lever signatures. The most important strategic implication is not "SEO is dead" — classical SEO remains foundational — but that entity work and passage structure have become the new dominant levers, and that investment in GEO monitoring infrastructure is becoming mandatory for any brand that takes informational demand in its audience seriously.
The full raw dataset and methodology documentation are available on request to qualified research partners and academic institutions. Independent replication is explicitly welcomed.