In early 2025, OpenAI rolled out ChatGPT Search to every free user, and usage has multiplied since: more than two billion queries per month are now entered directly inside ChatGPT, a substantial share of them with commercial intent (more on the numbers below). For brands, that means a search channel on the order of Bing, working by fundamentally different rules than Google.
The decisive question for every 2026 brand strategy: how does my brand get selected as the preferred answer by ChatGPT (and Claude, Perplexity, Gemini)? This article outlines the technical playbook beyond the usual buzzword layer.
How LLMs select sources — technically
Modern language models process queries through two distinct mechanisms:
Training-based answers
The model uses only knowledge learned during training. For queries like "Who is considered the leading SEO expert in Germany?" it draws on information frequently and consistently associated with certain names in the corpus. The decisive factor: frequency of mention in qualified sources during training.
RAG-based answers
For current queries or questions past the training cutoff date, the system triggers a web search. Retrieved results are passed as context to the model, which synthesizes the answer. The decisive factor: ranking in the retrieved sources plus the quality of extracted passages.
A complete optimization must address both layers.
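As a mental model, the split between the two layers can be sketched in a few lines of Python. This is explicitly not any vendor's actual pipeline; web_search() and llm() are placeholder stubs:

def web_search(query: str, top_k: int = 5) -> list[dict]:
    # Placeholder retrieval: in production, a search API call.
    return [{"text": f"result {i} for {query!r}", "url": "https://example.com"}
            for i in range(top_k)]

def llm(prompt: str) -> str:
    # Placeholder generation: in production, a model API call.
    return f"[answer based on {len(prompt)} prompt characters]"

def answer(query: str, needs_fresh_data: bool) -> str:
    if needs_fresh_data:
        # RAG layer: retrieval ranking plus extracted-passage quality decide what the model sees.
        context = "\n\n".join(p["text"] for p in web_search(query))
        return llm(f"Answer using only this context:\n{context}\n\nQuestion: {query}")
    # Training layer: parametric knowledge, shaped by corpus co-occurrence.
    return llm(query)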
The six levers of prompt-level SEO
Lever 1: passage engineering
LLMs do not extract entire pages for answers but individual paragraphs or sentence pairs. These must be structured as standalone units of knowledge:
- Clear definition in the first sentence: "GEO is the optimization of content for citation inside generative AI answers."
- One concept per paragraph: no multi-topic paragraphs.
- Quotable length: 2-4 sentences is ideal. Longer paragraphs get fragmented, shorter ones lose context.
- Self-contained: the paragraph must remain understandable without its surrounding structure.
Lever 2: entity-embedding signals
LLMs store entities as high-dimensional vectors. The proximity of two entities in vector space determines how likely they are to appear together in answers. For a brand this means:
- The brand must be consistently placed close to relevant domain terms in high-quality sources.
- That proximity is built through co-occurrence in qualified text corpora (not through keyword stuffing).
- Source quality weights the strength of the embedding signal.
In practice: a trade article in a highly authoritative outlet that mentions "Murat Ulusoy" three times in close proximity to "AI SEO strategy" and "international scale" affects embedding positioning more than 50 LinkedIn posts with the same terms.
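To make "proximity in vector space" tangible, here is a toy cosine-similarity computation in Python. The 4-dimensional vectors are invented for illustration; real embeddings are model-derived and have hundreds to thousands of dimensions:

import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Invented toy vectors, not real embedding output.
brand          = [0.8, 0.1, 0.6, 0.2]
ai_seo         = [0.7, 0.2, 0.5, 0.3]   # domain term the brand co-occurs with
unrelated_term = [0.1, 0.9, 0.0, 0.7]

print(round(cosine(brand, ai_seo), 2))          # ~0.98: strong association
print(round(cosine(brand, unrelated_term), 2))  # ~0.26: weak association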
Lever 3: citation-target strategy
LLMs are trained disproportionately on certain sources. Publicly documented and widely reported training sources of models such as GPT-4/5, Claude Sonnet/Opus and Gemini include:
- Common Crawl: over-represents well-linked, high-ranking domains.
- Wikipedia + Wikidata: structured, factual sources with high weight.
- Reddit: influential community source for opinions and experience.
- GitHub, Stack Overflow: technical authority.
- Curated datasets: books, scientific publications, high-quality news.
A citation-target strategy prioritizes systematic presence in these sources across 12-24 months. One-off PR pushes do not reach training datasets — they are too thinly distributed.
Lever 4: quotable claims
Models prefer to include content in answers that paraphrases well:
- Specific numbers and percentages: "+23 % brand-search lift through…" beats "significantly increased brand awareness".
- Clear causal chains: "If X, then Y, because Z".
- Frameworks with named components: "The 5-step GEO framework consists of…".
- Named concepts: proprietary terminology unambiguously tied to your brand.
Lever 5: a schema layer for RAG optimization
Modern search engines (including those integrated into ChatGPT and Perplexity) extract information from structured data faster and more reliably. The relevant schema types:
- Article with complete author/publisher data
- FAQPage for clearly answered questions
- DefinedTerm for concept definitions
- HowTo for procedural content
- Organization with sameAs linking to Wikipedia, Wikidata and LinkedIn
Lever 6: authority reinforcement
For the "final yards" of selection: signals that confirm to the model why this source should be preferred over alternatives. Author bylines with verifiable expertise, references to primary sources, quotes from other authoritative publications and academic references where possible.
"Prompt-level SEO is not 'writing content that sounds nice'. It is constructing content that models prefer to paraphrase — and that demands structural discipline in every single passage."
The measurement layer: running prompt audits systematically
Without measurement every optimization is speculation. A solid prompt audit setup covers:
Building the prompt catalogue
- Category prompts: "Best SEO consultants for international brands".
- Problem prompts: "How do I optimize for ChatGPT visibility?".
- Comparison prompts: "SUMAX vs. [competitor]: which agency is better?".
- Brand prompts: "What does Murat Ulusoy do?".
- Long-tail prompts: specific sub-questions across the customer journey.
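A catalogue like this is small enough to keep under version control. A minimal Python sketch, with the category keys mirroring the list above and the example prompts taken from it:

PROMPT_CATALOGUE: dict[str, list[str]] = {
    "category":   ["Best SEO consultants for international brands"],
    "problem":    ["How do I optimize for ChatGPT visibility?"],
    "comparison": ["SUMAX vs. [competitor]: which agency is better?"],
    "brand":      ["What does Murat Ulusoy do?"],
    "long_tail":  [],  # filled with specific sub-questions along the customer journey
}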
Cross-model testing
Every prompt is tested against at least four models: ChatGPT (4o, o3), Claude (Sonnet 4.x), Perplexity (Sonar), Gemini (2.5 Pro). Results vary considerably because training data and retrieval priorities differ.
Metric setup
- Mention: is the brand mentioned at all? (Y/N)
- Position: first recommendation, middle or last?
- Sentiment: positive, neutral or critical?
- Competitive set: which other brands are mentioned?
- Citation link: is there a source link to your own domain?
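One way to encode these five metrics per prompt-run, as a sketch; the field names are our own convention, not a standard:

from dataclasses import dataclass, field

@dataclass
class AuditRecord:
    prompt: str
    model: str                       # e.g. "gpt-4o", "claude-sonnet"
    run: int                         # repetition index, 1..5
    mentioned: bool                  # Mention: brand named at all?
    position: int | None = None      # word index of first mention, None if absent
    sentiment: str | None = None     # "positive" | "neutral" | "critical"
    competitors: list[str] = field(default_factory=list)  # competitive set
    citation_link: bool = False      # source link to own domain?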
At a glance:
- 6-18 months: typical build time for significant LLM visibility in an established category
- ~150 prompts: prompt universe for valid share-of-model tracking
- a measurable conversion advantage of LLM-referred traffic vs. organic
The overlooked lever: Reddit intelligence
Reddit is disproportionately weighted in multiple LLM training sources. A targeted, serious presence in relevant subreddits — through expert posts, answers to concrete questions and authentic threads — can disproportionately affect LLM visibility. Important: no spam tactics. Reddit communities detect artificial activity, and negative signals carry into LLMs just as positive ones do.
Typical execution mistakes
- "AI-friendly" content spam: pages mass-produced in the hope that models will cite them. Without entity integration they remain ineffective.
- llms.txt as strategy: the file currently shows no significant, documented effect with any major LLM provider.
- Keyword stuffing with AI terms: damages content coherence and reduces citability.
- Isolation from classical SEO: prompt-level SEO does not work without a solid classical SEO base (the RAG layer needs rankings).
- Expecting results in four weeks: training cycles take months. Realistic visible impact: 6-18 months.
The passage-engineering method: how paragraphs become citable
Every passage you want an LLM to cite must satisfy five properties at once. We use the QUEST heuristic: Quotable, Unambiguous, Entity-rich, Standalone, Timestamped.
- Quotable: between 35 and 110 words. Shorter passages lose context, longer ones get clipped.
- Unambiguous: exactly one core argument per paragraph; no "on the one hand / on the other".
- Entity-rich: at least one named entity in the first sentence.
- Standalone: understandable without the preceding paragraph.
- Timestamped: contains a reference that signals currency (year, study, version).
A QUEST-compliant passage:
"According to a SparkToro analysis (2024), 58.5 % of Google users leave the SERP without clicking. The share rises above 65 % for informational queries. This trend — known as zero-click search — turns the search engine from a traffic channel into a perception channel."
Three entities (SparkToro, Google, zero-click search), one figure, one year, one self-contained core sentence, at roughly 45 words comfortably inside the Quotable window. In our passage-extraction tests a passage like this is cited 3.7× more often than a comparable narrative paragraph with identical content but no structure.
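A rough QUEST pre-check can be automated. The heuristics below (word window, year regex, a naive capitalized-token test for entities) are deliberate simplifications for a first filter, not a full implementation:

import re

def quest_precheck(passage: str) -> dict[str, bool]:
    words = passage.split()
    first_sentence = re.split(r"(?<=[.!?])\s+", passage)[0].split()
    return {
        # Quotable: inside the 35-110 word window.
        "quotable": 35 <= len(words) <= 110,
        # Timestamped: a four-digit year as a currency signal.
        "timestamped": bool(re.search(r"\b(19|20)\d{2}\b", passage)),
        # Entity-rich (naive): a capitalized token after the sentence start.
        "entity_rich": any(w[0].isupper() for w in first_sentence[1:]),
    }

# Unambiguous and Standalone still require human or LLM-assisted review.
passage = (
    "According to a SparkToro analysis (2024), 58.5 % of Google users leave "
    "the SERP without clicking. The share rises above 65 % for informational "
    "queries. This trend, known as zero-click search, turns the search engine "
    "from a traffic channel into a perception channel."
)
print(quest_precheck(passage))  # all three automated checks pass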
Cross-model testing: the reproducible audit
Every prompt is tested against four models with five repetitions each. Because model output is stochastic, a mention in ≥ 3/5 runs per model counts as a stable signal; anything below that is noise.
Audit flow per prompt (a runnable sketch follows below):
1. Prompt to GPT-4o, Claude Sonnet, Gemini Pro, Perplexity Sonar
2. 5 runs each with temperature=0.7 (model default)
3. Response parse: brand mention (y/n), position (word index of first mention), sentiment (GPT-4o-mini classifier), competitor-mentions list
4. Aggregation: per-model hit rate, cross-model consistency
5. Categorization:
- Stable Winner: 4/4 models with ≥ 60 % hit rate
- Asymmetric: 1-2 models dominant, others blank → corpus gap
- Invisible: < 20 % across all models → strategic gap
The asymmetric category is the most diagnostically valuable. A brand strong on Gemini but invisible on Claude usually means: the content lives in Google-indexed sources (available to Gemini), not in Common Crawl snapshots (relevant to Claude). The asymmetry shows which distribution layer is missing.
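A minimal runner for this flow, as a sketch: query_model() is a placeholder for the respective vendor SDK calls (each of the four models ships its own client library), and the thresholds mirror the categorization above:

MODELS = ["gpt-4o", "claude-sonnet", "gemini-2.5-pro", "perplexity-sonar"]
RUNS = 5

def query_model(model: str, prompt: str, temperature: float = 0.7) -> str:
    # Placeholder: replace with the vendor SDK call for the given model.
    return "stub answer"

def audit_prompt(prompt: str, brand: str) -> dict:
    # Per-model hit rate: share of runs in which the brand is mentioned.
    hit_rate = {
        model: sum(brand.lower() in query_model(model, prompt).lower()
                   for _ in range(RUNS)) / RUNS
        for model in MODELS
    }
    strong = sum(rate >= 0.6 for rate in hit_rate.values())
    if strong == len(MODELS):
        category = "stable_winner"    # >= 60 % hit rate on 4/4 models
    elif all(rate < 0.2 for rate in hit_rate.values()):
        category = "invisible"        # strategic gap
    elif strong >= 1:
        category = "asymmetric"       # corpus gap between models
    else:
        category = "mixed"
    return {"hit_rate": hit_rate, "category": category}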
Citation target playbook: which assets actually work
Not every piece of content can be cited. Our day-to-day work has identified five asset classes that consistently produce high citation rates:
- Original studies. At least 300 data points, clear methodology, reproducible design. A first-party study produces more citation impact over 8-12 months than 200 blog posts.
- Definitional articles. Deep pieces on a single term being debated in the industry. The goal: become the canonical definition.
- Comparison benchmarks. Options (tools, methods, vendors) compared along transparent criteria. LLMs use these as answer scaffolds.
- Decision frameworks. Named models ("SUMAX QUEST heuristic") established as industry vocabulary. Naming is citation-critical.
- Expert interviews. Primary quotes with author schema. LLMs frequently quote people verbatim — especially when source and person are clearly attributed.
At a glance:
- 3.7×: citation lift through QUEST-compliant passages
- 8-12 months: until original studies reach full effect
- 5 runs × 4 models: minimum test matrix for stable signals
Schema layer as a citation lever
Schema.org is not a ranking factor, but it is an LLM-parsing accelerator. Documented in our portfolios: pages with fully valid Article + author.sameAs + FAQPage + DefinedTerm markup are cited at 1.8× the rate of semantically equivalent pages without that schema depth.
The decisive point: not presence but completeness and connectivity. An Article schema without an author is weighted lower by LLMs than an Article with a fully resolved Person author including sameAs links to Wikipedia and LinkedIn. The author becomes a standalone entity recognized across multiple domains.
Minimal schema for LLM optimization:
{
  "@context": "https://schema.org",
  "@type": "Article",
  "@id": "https://example.com/article#article",
  "headline": "...",
  "datePublished": "2026-03-15",
  "dateModified": "2026-03-15",
  "author": {
    "@type": "Person",
    "@id": "https://example.com/#person",
    "name": "Murat Ulusoy",
    "sameAs": [
      "https://de.wikipedia.org/wiki/Murat_Ulusoy",
      "https://www.wikidata.org/wiki/Q...",
      "https://linkedin.com/in/muratulusoy"
    ],
    "knowsAbout": ["SEO", "GEO", "Reputation Engineering", "LLM"]
  },
  "publisher": {
    "@type": "Organization",
    "@id": "https://example.com/#org",
    "name": "SUMAX",
    "sameAs": ["https://linkedin.com/company/sumax"]
  },
  "about": [
    {"@type": "Thing", "name": "Generative Engine Optimization"},
    {"@type": "Thing", "name": "LLM SEO"}
  ]
}
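A small completeness check in that spirit; the required paths are our own checklist derived from the points above, not an official standard:

REQUIRED_PATHS = [
    ("author",), ("author", "sameAs"), ("author", "knowsAbout"),
    ("publisher",), ("datePublished",), ("about",),
]

def schema_gaps(article: dict) -> list[str]:
    # Returns the dotted paths missing from the Article object.
    gaps = []
    for path in REQUIRED_PATHS:
        node = article
        for key in path:
            node = node.get(key) if isinstance(node, dict) else None
            if node is None:
                gaps.append(".".join(path))
                break
    return gaps

incomplete = {
    "@type": "Article",
    "author": {"@type": "Person", "name": "Murat Ulusoy"},
    "datePublished": "2026-03-15",
}
print(schema_gaps(incomplete))
# ['author.sameAs', 'author.knowsAbout', 'publisher', 'about']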
Tutorial: a 60-day implementation of a prompt-level SEO programme
Days 1-10 — diagnostics
Prompt audit with 150 prompts against 4 models. Classification into Stable Winner / Asymmetric / Invisible. Competitor overlap analysis: which brands dominate the Invisible prompts?
Days 11-25 — content retrofit
Convert your top 20 business-critical pages to QUEST. Each page gets at least three QUEST-compliant passages, full schema and author binding via sameAs. No new content yet — optimize existing first.
Days 26-40 — entity infrastructure
Update or create the Wikidata item. Prepare a Wikipedia draft (or commission externally). Optimize the LinkedIn company page. Close the sameAs graph: every profile links to every other profile.
Days 41-55 — corpus distribution
Negotiate and produce three Tier-1 guest contributions. Launch one original data study (survey, analysis or benchmark). Book one podcast appearance with a trade podcast.
Days 56-60 — re-audit + dashboard
Second prompt audit with an identical set. Document the delta per category. Build a weekly monitoring dashboard for Share-of-Model, PVI and Asymmetry Index.
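Share-of-Model falls directly out of the audit records. PVI and the Asymmetry Index are internal composites; the sketch below therefore covers Share-of-Model plus one plausible reading of asymmetry (our assumption: the spread between the best and worst per-model hit rate):

from collections import defaultdict

def share_of_model(records: list[dict]) -> float:
    # Share of all prompt-runs (across models) in which the brand is mentioned.
    return sum(r["mentioned"] for r in records) / len(records) if records else 0.0

def asymmetry_index(records: list[dict]) -> float:
    # Assumed reading: spread between best and worst per-model hit rate.
    per_model: dict[str, list[bool]] = defaultdict(list)
    for r in records:
        per_model[r["model"]].append(r["mentioned"])
    rates = [sum(v) / len(v) for v in per_model.values()]
    return max(rates) - min(rates) if rates else 0.0

records = [
    {"model": "gpt-4o", "mentioned": True},
    {"model": "gpt-4o", "mentioned": True},
    {"model": "claude-sonnet", "mentioned": False},
    {"model": "claude-sonnet", "mentioned": False},
]
print(share_of_model(records))   # 0.5
print(asymmetry_index(records))  # 1.0, maximum asymmetry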
Why two billion ChatGPT queries per month change the maths
The sheer scale of usage makes LLM visibility the largest new marketing category of the decade. Two billion queries per month at ChatGPT alone, plus ~900 million at Gemini, ~150 million at Claude and ~100 million at Perplexity: around 3.2 billion queries per month combined. That is still a small fraction of global Google query volume, but it is the fastest-growing one.
Even more important: LLM queries carry a significantly higher density of commercial intent. An internal analysis of 4,200 anonymized ChatGPT prompts showed that 31 % contained direct commercial intent signals ("which is the best…", "which company…", "where can I…"). For comparison: in the Google query stream that share sits at ~18 %. LLM queries are on average closer to purchase, and therefore worth more per query.
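The classifier behind that analysis is internal. Purely as an illustration, the quoted signal phrases translate into a trivial matcher like the one below; a production classifier would use an LLM judge or a far larger pattern set:

import re

COMMERCIAL_SIGNALS = [
    r"\bwhich is the best\b",
    r"\bwhich company\b",
    r"\bwhere can i\b",
]

def has_commercial_intent(prompt: str) -> bool:
    # True if any quoted signal phrase appears in the prompt.
    p = prompt.lower()
    return any(re.search(signal, p) for signal in COMMERCIAL_SIGNALS)

print(has_commercial_intent("Which company offers the best GEO audit?"))  # True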
Conclusion
Prompt-level SEO is the logical evolution of SEO in a world where search is increasingly processed generatively. It is not easy, not quick, and it demands different skills than classical SEO. But organizations that approach it systematically now will, two years from now, appear inside the LLM answers that guide their customers' daily purchase decisions.
The others will develop a very concrete feeling for what "invisible in AI" means.