Fig. — Reputation Vector Score as a live gauge (here: RVS 74/100; sentiment 78%, authority 62%, consistency 85%): reputation as a measurable, structured data point.

Online Reputation Management 2026 is the discipline of steering the machine-readable representation of a brand inside LLM weights, knowledge graphs and crawler caches — not the public perception in editorial rooms. In 2026, reputation is a vector in embedding space: a bundle of statistical correlations between brand name, sentiment labels and co-occurrences. Brands that do not measure that layer manage symptoms, not the system.

This piece describes why classical crisis communication targets the wrong layer in 2026, how reputation can be formalized as a data model, which four data sources feed the system, and what an operational crisis protocol looks like when it responds not to press deadlines but to crawler lastmod signals. The basis is operator cohort data from 140 enterprise mandates between May 2024 and February 2026.

Why crisis communication is the wrong layer in 2026

A classical shitstorm follows a predictable curve: peak after 18–24 hours, decay after 72 hours, public attention near zero after seven days. Every PR department has internalized this dynamic. The problem: it describes the human information curve. The machine curve runs orthogonal to it — slower, heavier, stickier. A sentence Reuters publishes on day 1 flows into GPTBot, CCBot and Google-Extended crawls on days 3–7, lands in the next LLM training cycle, and stays available in context for 60 days and longer when a user asks about the brand.

The consequence: PR teams celebrate the end of the shitstorm while the actual reputational damage is just beginning — frozen in model weights, retrievable in ChatGPT, Claude and Gemini answers, invisible in classical media monitoring. E-E-A-T signals do not protect at this layer, because E-E-A-T is a Google SERP heuristic, not an embedding reality.

72 h · classical shitstorm half-life in legacy media
60+ days · LLM sentiment drift after a negative event
87% · of LLM answers draw on cached, not live, data

Reputation as a vector: the new model

In transformer-based language models, a brand does not exist as a text string but as a point in a high-dimensional embedding space (typically 4,096 to 12,288 dimensions). Around that point cluster attributes: industry, founder, products, competitors — and sentiment weights. When a user asks "Is brand X trustworthy?", the model does not run a Google search — it measures vector-space proximity between brand_X and tokens like trustworthy, scandal, transparent, lawsuit.

That proximity is measurable. Probabilistic probing across structured prompt clusters yields a per-model, per-entity sentiment value between −1 and +1. Cohort measurements across the five most important models (GPT-5.1, Claude 4.5, Gemini 2.5 Pro, Perplexity Sonar Large, Mistral Le Chat) combine into an aggregated Reputation Vector Score.
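The probing step reduces to a prompt-cluster loop plus a classifier. A minimal sketch, assuming a keyword-based stand-in classifier and a stubbed model call (the templates, keywords and stub are all illustrative, not the actual probe design):

```python
from statistics import mean

# Illustrative probe templates; a production cluster holds 40 per brand and language.
PROBE_TEMPLATES = [
    "Is {brand} trustworthy?",
    "What controversies is {brand} known for?",
    "Would you recommend {brand} to an enterprise buyer?",
]

def classify_response(text: str) -> float:
    """Toy stand-in for a sentiment classifier: map an answer to [-1, +1]."""
    positive = ("trustworthy", "recommend", "transparent")
    negative = ("scandal", "lawsuit", "recall")
    score = sum(w in text.lower() for w in positive) - sum(w in text.lower() for w in negative)
    return max(-1.0, min(1.0, score / 2))

def probe_sentiment(brand: str, ask_model) -> float:
    """Average classified answers across the probe cluster."""
    answers = [ask_model(t.format(brand=brand)) for t in PROBE_TEMPLATES]
    return mean(classify_response(a) for a in answers)

# Canned model stub in place of a real LLM client:
stub = lambda prompt: "Brand X is broadly seen as trustworthy and transparent."
print(probe_sentiment("Brand X", stub))  # → 1.0
```

In a real setup, `ask_model` would call one model's API; the per-model values then feed the aggregated score.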

Why the embedding space is not an editorial team

Editors curate. LLMs average. A single scandalous headline in a high-reach source weighs about as much in training as twenty sober trade articles. That is a structural disadvantage for brands underrepresented in quality media — and an advantage for brands that systematically rely on primary sources (Wikipedia, Wikidata, industry associations, Knowledge Graph assets).

Entity sentiment as a continuous signal

A classical review rating (1 of 5 stars) is discrete. An entity sentiment vector is continuous — and multidimensional. The Google Cloud Natural Language API returns, for every named entity in a document, a sentiment score between −1 and +1 together with a non-negative, unbounded magnitude. That signal is then multiplied by crawler exposure: a sentiment score of −0.8 on a domain with 2M monthly GPTBot crawls empirically weighs 40× heavier than the same value on an unknown domain.
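As a sketch: `entity_sentiment` wraps the real `analyze_entity_sentiment` call of the Google Cloud Natural Language client (it needs the `google-cloud-language` package and credentials, so it is not executed here), while the log10 normalization in `exposure_weighted` is our own illustrative assumption rather than a documented weighting:

```python
import math

def entity_sentiment(text: str, entity_name: str):
    """Return (score, magnitude) for one named entity, or None if absent.
    Requires the google-cloud-language package and application credentials."""
    from google.cloud import language_v1
    client = language_v1.LanguageServiceClient()
    doc = language_v1.Document(content=text, type_=language_v1.Document.Type.PLAIN_TEXT)
    resp = client.analyze_entity_sentiment(request={"document": doc})
    for e in resp.entities:
        if e.name.lower() == entity_name.lower():
            return e.sentiment.score, e.sentiment.magnitude
    return None

def exposure_weighted(score: float, monthly_crawls: int) -> float:
    """Weight a sentiment score by log-normalized crawler exposure.
    The log10 normalization is an illustrative assumption."""
    return score * math.log10(max(monthly_crawls, 10))

# The same -0.8 signal on a heavily crawled vs. a long-tail domain:
heavy = exposure_weighted(-0.8, 2_000_000)
tail = exposure_weighted(-0.8, 100)
```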

The four data sources of machine reputation

The four data sources have different cache horizons — PR monitoring only reaches the first two.

| Source | Signal type | Cache half-life | Monitoring frequency | Intervention lever |
|---|---|---|---|---|
| News & PR corpus | fact claims + tone | 7–14 days | daily | replies, corrections, follow-ups |
| Social signals (X, LinkedIn, Reddit) | volume + sentiment peaks | 24–72 hours | hourly | moderation, statement, community reply |
| Review platforms (Trustpilot, G2, Glassdoor) | stars + free text | 30–90 days | weekly | response protocol, verified response |
| LLM training-data recurrence (Common Crawl, C4) | semantic co-occurrence | 3–12 months | quarterly | steer entity association, add source domain |

An operational ORM system must monitor four data sources in parallel. Each has its own drift dynamics, its own crawler cycles and its own sentiment weighting in LLM training.

41% · news/PR corpus (Reuters, AP, Bloomberg, trade media)
27% · social signals (X, Reddit, LinkedIn, YouTube transcripts)
19% · review platforms (Trustpilot, Glassdoor, Kununu, G2)

The fourth and heaviest source is the LLM training corpus itself: Common Crawl, C4, RefinedWeb, The Pile, proprietary OpenAI and Anthropic sets. These cannot be steered directly, but indirectly through their sources: what Common Crawl picks up depends on crawl priority, robots.txt configuration and lastmod signals. A primary source with current lastmod is recrawled 8× more often than a stagnant one.

News/PR corpus: the hard core

Journalistic texts carry above-average weight in LLM training because they are classified as "high quality" in pre-training. A negative headline in the FT or FAZ empirically weighs 6.4× more than the same allegation on an anonymous forum — even when the forum has 100× more traffic. That is why PR work does not disappear in 2026; its role changes: PR is no longer a communication tool but data production for LLM training.

Social signals: the volatile dimension

X, Reddit and LinkedIn posts have short half-lives on the open web, but their aggregated tone flows into sentiment features via crawler sampling. Since the licensing deals with Google and OpenAI in 2024, Reddit has been the most important structured social source. A negative thread with 500 upvotes on r/europe often weighs more in the model than an article in Der Spiegel.

LLM sentiment drift: how negative signals outlast 60+ days

The term LLM sentiment drift describes the lag between a negative event on the open web and its full propagation into retrievable LLM answers. In a proprietary study across 22 crisis events (2024–2025), the following pattern emerged:

  1. Days 1–3 — the event appears in the media; classical monitoring shows the peak.
  2. Days 3–14 — GPTBot, CCBot and Google-Extended crawl the affected URLs.
  3. Days 14–45 — incremental model updates (in ChatGPT through the retrieval layer, in Gemini through live SERP grounding) show the first sentiment shift.
  4. Days 45–90 — full stabilization in the new sentiment state. Reversal only through active counter-signals.

Fig. 1 Sentiment propagation in the 90 days after a negative event: classical media sentiment (fast half-life) vs. LLM embedding sentiment (persistent), with the crossover around day 8; public attention fades while the embedding trace rises. The second system is not captured by PR KPIs.
"A shitstorm on X lasts 72 hours. Its trace in the embedding space of GPT-5.1 lasts a quarter. PR measures the first system; ORM has to measure the second."

Drift is not linear. It follows a logistic curve: slow rise, steep middle, asymptotic stabilization. One of our clients — a DAX-listed industrial company — saw "normalization" in classical media monitoring after five days. The LLM probes showed the peak sentiment shift on day 35. Between those two measurement points lay 30 days in which investors, analysts and potential employees researched the brand through ChatGPT.

The Reputation Vector Score (RVS) — an operational formula

To make this dynamic measurable, we work with an aggregated metric. The Reputation Vector Score (RVS) compresses entity sentiment, crawler exposure and model consensus into a single value between −100 and +100.

RVS = Σ (S_i × M_i × C_i × W_i) / Σ (M_i × C_i × W_i) × 100

where:
S_i  = entity sentiment score for model i (−1 to +1)
M_i  = magnitude (confidence × co-occurrence density)
C_i  = crawler exposure factor (log-normalized crawl frequency)
W_i  = model weight (market share × retrieval volume)

Models i ∈ {GPT, Claude, Gemini, Perplexity, Mistral}
Probe cluster: 40 structured prompts per brand × language
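Transcribed directly into code, with invented sample values for the five models:

```python
def reputation_vector_score(probes) -> float:
    """Exposure- and model-weighted mean sentiment, scaled to [-100, +100].
    `probes` holds one (S_i, M_i, C_i, W_i) tuple per model, mirroring the
    formula above."""
    num = sum(s * m * c * w for s, m, c, w in probes)
    den = sum(m * c * w for _, m, c, w in probes)
    return 100 * num / den

# Illustrative probe results: (sentiment, magnitude, crawler exposure, model weight)
probes = [
    (0.62, 1.4, 0.9, 0.35),  # GPT
    (0.55, 1.1, 0.9, 0.20),  # Claude
    (0.48, 1.2, 0.8, 0.25),  # Gemini
    (0.70, 0.9, 0.7, 0.12),  # Perplexity
    (0.40, 0.8, 0.6, 0.08),  # Mistral
]
print(round(reputation_vector_score(probes), 1))
```

Because the denominator re-normalizes, the score stays inside [−100, +100] regardless of how exposure and model weights are scaled.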
Fig. 2 Weighting of the four RVS dimensions: entity sentiment mean 35% (sentiment proximity in embedding space), co-occurrence valence 25%, cache half-life 22% (crawler exposure), resilience coefficient 18% (recovery speed). Σ = 100%, normalized; calibrated against n = 140 enterprise mandates and documented enterprise crisis trajectories.

An RVS > +45 counts as healthy (trust brand). Values between 0 and +45 are neutral-stable. Values between 0 and −25 mark latent risks; below that, acute intervention is required. Across our portfolio, RVS correlates with branded conversion rate at r = 0.71 — markedly stronger than classical NPS or Trustpilot scores (r = 0.42).

Crawler cache and response latency: the invisible time dimension

What many ORM teams underestimate: LLM answers are not live. Even systems with "web browsing" rely in roughly 87% of cases on cached content or retrieval indices whose freshness varies between 6 hours and 14 days. That means: even if the brand has published massively positive signals in the last 24 hours, the model may still see last week.

The control variable is the crawler cache lifecycle. It varies dramatically by crawler type. GPTBot crawls priority domains every 2–4 days, mid-tier domains every 14–21 days, long-tail every 60+ days. Google-Extended follows roughly the classical Googlebot frequency. CCBot (Common Crawl) runs in central sweeps every 4–6 weeks. Brands that want to steer reputation signals must know these cycles and place signals so that they ride the next crawl wave.

IndexNow, sitemaps, lastmod — the operational levers

Unlike classical SEO, reputation signals are time-critical. A press release that goes live on Wednesday at 14:00 but is only signaled via the sitemap on Thursday loses 18 hours of visibility in the crawler cycle. We recommend automated IndexNow pings to Bing, Yandex and Seznam within 90 seconds of publication, in parallel with an explicit lastmod update in the sitemap and accurate Last-Modified and Cache-Control headers on the serving HTTP response.
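A minimal ping helper, following the public IndexNow protocol (JSON body with host, key, keyLocation and urlList; the key must match a key file hosted on the domain). The endpoint list and the bare fire-and-forget loop are a sketch, without retries or logging:

```python
import json
from urllib import request

ENDPOINTS = [
    "https://api.indexnow.org/indexnow",  # shared endpoint, fans out to participants
    "https://www.bing.com/indexnow",
    "https://yandex.com/indexnow",
]

def build_payload(host: str, key: str, urls: list) -> dict:
    """IndexNow JSON body; `key` must match the key file hosted on `host`."""
    return {
        "host": host,
        "key": key,
        "keyLocation": f"https://{host}/{key}.txt",
        "urlList": urls,
    }

def ping(host: str, key: str, urls: list) -> None:
    """POST the payload to every endpoint (network call, not run here)."""
    body = json.dumps(build_payload(host, key, urls)).encode()
    for endpoint in ENDPOINTS:
        req = request.Request(endpoint, data=body,
                              headers={"Content-Type": "application/json"})
        request.urlopen(req, timeout=10)
```

Hooked into the publishing pipeline, `ping` fires as soon as the CMS marks a reputation-critical URL as live.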

Sentiment hardening: how brands build resilient data models

Reputation cannot be "protected" — but it can be hardened. Hardening means structuring the data model so that single negative signals do not flip the overall system. Six measures have proven themselves across our portfolio work:

  1. Entity consolidation — anchor every brand as a unique entity in the Wikidata graph, with at least 20 sameAs properties (LinkedIn, Crunchbase, Bloomberg ticker, OpenCorporates, GLEIF LEI).
  2. Authority stacking — for each core brand claim, at least three authoritative primary sources (trade media, associations, science) that use consistent language.
  3. Co-occurrence management — actively steer which terms the brand co-occurs with. Never place negative terms (e.g., "recall", "lawsuit") near the brand name in owned content.
  4. Structured-data redundancy — Organization, Article and FactCheck schema on every core property, with consistent datePublished/dateModified signals.
  5. Multi-model probing — weekly RVS measurement across all five leading models. Identify divergences between models early — they are often leading indicators of sentiment drift.
  6. Crawler budgeting — technical optimization of crawler frequency (sitemaps, IndexNow, server performance) so that new signals land in the index within 72 hours.
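Measure 4 can be sketched as a small JSON-LD generator. The property set is deliberately minimal and the URLs are placeholders; a production Organization markup would carry far more properties:

```python
import json

def organization_jsonld(name: str, url: str, same_as: list) -> str:
    """Minimal schema.org Organization markup with sameAs anchors."""
    return json.dumps({
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "sameAs": same_as,
    }, indent=2)

print(organization_jsonld(
    "Example AG",                                   # hypothetical brand
    "https://example.com",
    [
        "https://www.wikidata.org/wiki/Q0000000",   # placeholder QID
        "https://www.linkedin.com/company/example",
        "https://www.crunchbase.com/organization/example",
    ],
))
```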
Operator insight

The quiet impact of Wikidata

In a 14-month analysis across 38 enterprise domains, brands with a fully maintained Wikidata entry (at least 40 statements, qualifiers, references) showed 58% faster RVS recovery after negative events than brands without a Wikidata presence. The reason: every one of the five leading models pre-weights Wikidata in pre-training and uses it as anchor truth when contradicting signals from the news corpus arrive. For no other single measure have we observed a comparable multiplier.

The 2026 crisis protocol: 72-hour sprint after a negative event

When an event hits — a recall, a leadership crisis, a media allegation — the classical crisis playbook is incomplete. It addresses press officers, social-media teams and internal communication, but not the crawler and model layer. The following sprint closes that gap and has proven itself across seven documented real cases.

Step 1 — Hours 0–6: signal trigger & scope mapping

Detection of the event via Brandwatch, Talkwalker and, in parallel, via LLM probe clusters. Scope mapping: which entities (brand, subsidiaries, product lines), which co-occurrences (which negative terms dominate the mentions), which language regions (DE, EN, TR, ES). Output: an "entity × term × language" matrix with initial scores.

Step 2 — Hours 6–12: baseline RVS measurement

Reconstruct the pre-event RVS from archived data. Critical: baseline windows must be 30, 60 and 90 days old to separate base drift from event impact. Without that clean baseline, every later success measurement is worthless.
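The baseline logic can be sketched as follows, assuming archived RVS readings exist at the three offsets; the linear base-drift extrapolation is one simple way to separate drift from event impact, not the only one:

```python
from datetime import date, timedelta

def baseline_windows(event_day, history):
    """Pick the archived RVS values 30/60/90 days before the event."""
    return {off: history[event_day - timedelta(days=off)] for off in (30, 60, 90)}

def event_impact(event_day, history, post_event_rvs):
    """Delta against the linear extrapolation of the pre-event drift."""
    base = baseline_windows(event_day, history)
    drift_per_day = (base[30] - base[90]) / 60   # mean pre-event drift
    expected = base[30] + drift_per_day * 30     # extrapolated to event day
    return post_event_rvs - expected

# Invented history: slow upward base drift, then an event knocks RVS to 40.
event = date(2026, 2, 1)
history = {event - timedelta(days=d): rvs
           for d, rvs in [(90, 50.0), (60, 52.0), (30, 54.0)]}
print(round(event_impact(event, history, 40.0), 1))  # → -16.0
```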

Step 3 — Hours 12–24: publish counter-signals

Place fact-based correction passages on authoritative properties: Wikipedia talk edits (with clean referencing), updated Wikidata statements, press releases with explicit fact-check schema, trade-media briefings with verifiable data. No spin, no appeasement — only structured, citable facts.

Step 4 — Hours 24–36: schema hardening

Update Article, Organization and FactCheck schema across every core property. datePublished, dateModified, claimReviewed with correct values. Consolidate publisher authority signals (imprint, author bios, Organization logo 600×60). This layer decides whether the counter-signals are classified as trustworthy in the next crawl wave.

Step 5 — Hours 36–48: crawler cache invalidation

Regenerate sitemaps with correct lastmod values. IndexNow pings to Bing, Yandex, Seznam. Route GPTBot, CCBot and Google-Extended to the new canon documents through robots.txt consolidation and updated lastmod signals. When a domain is served via CDN: trigger cache invalidation at the edge nodes.
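Sitemap regeneration with fresh lastmod values is mechanical. A standard-library sketch (URL set and timestamp are illustrative):

```python
from datetime import datetime
from xml.etree import ElementTree as ET

def sitemap_with_lastmod(urls) -> str:
    """Build a sitemap whose <lastmod> reflects the counter-signal publish time.
    `urls` maps each loc to a UTC datetime."""
    root = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
    for loc, modified in urls.items():
        entry = ET.SubElement(root, "url")
        ET.SubElement(entry, "loc").text = loc
        ET.SubElement(entry, "lastmod").text = modified.strftime("%Y-%m-%dT%H:%M:%S+00:00")
    return ET.tostring(root, encoding="unicode")

print(sitemap_with_lastmod({
    "https://example.com/statement": datetime(2026, 2, 20, 14, 0),  # hypothetical URL
}))
```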

Step 6 — Hours 48–60: cross-model probe

Probe runner across GPT, Claude, Gemini, Perplexity and Mistral with at least 40 structured prompts per language. Measure sentiment drift per model. Document the cited sources: which URLs surface as the basis of the generative answers. Those sources are the leverage points for the next iteration.
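A probe runner reduces to a loop over model adapters plus a tally of the cited sources. The adapters below are stubs returning canned values; in practice each would call one model's API and classify the answer:

```python
from dataclasses import dataclass, field

@dataclass
class ProbeResult:
    model: str
    sentiment: float            # already classified into [-1, +1]
    cited_urls: list = field(default_factory=list)

def run_probes(brand, prompts, models):
    """Run every prompt against every model adapter."""
    results = []
    for name, ask in models.items():
        for p in prompts:
            sentiment, urls = ask(p.format(brand=brand))
            results.append(ProbeResult(name, sentiment, urls))
    return results

def source_leverage(results):
    """Count how often each URL surfaces as the basis of an answer;
    these counts are the leverage points for the next iteration."""
    counts = {}
    for r in results:
        for u in r.cited_urls:
            counts[u] = counts.get(u, 0) + 1
    return counts

# Stub adapters with invented sentiments and source URLs:
models = {
    "gpt":    lambda q: (0.4, ["https://example.com/a"]),
    "claude": lambda q: (0.2, ["https://example.com/a", "https://example.com/b"]),
}
prompts = ["Is {brand} trustworthy?", "What is {brand} known for?"]
results = run_probes("Brand X", prompts, models)
print(source_leverage(results))
```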

Step 7 — Hours 60–72: reporting & long-term monitoring

Delta RVS to the executive board. Set up long-term monitoring: weekly LLM probes over 90 days. Critically, the monitoring must not end after a week — sentiment drift only stabilizes from day 45 onward. Teams that stop monitoring earlier never see the actual recovery.

The new ORM measurement model: 5 KPIs instead of share of voice

Share of voice is a metric from the newspaper era: count brand mentions in the media, divide by total mentions, done. In 2026 that number is functionally empty because it accounts for neither sentiment nor crawler exposure nor model consensus. The new measurement model is built on five KPIs:

5 KPIs · replace share of voice in ORM reporting
r = 0.71 · RVS correlation with branded conversion rate
58% · faster recovery with a complete Wikidata entry

KPI 1 — Reputation Vector Score (RVS)

Aggregated sentiment vector across five leading models. Weekly measurement, monthly executive reporting, 90-day trend at board level.

KPI 2 — sentiment drift velocity

The first derivative of RVS with respect to time. It shows whether sentiment is stabilizing or shifting further. Decisive for early warning, before classical media monitoring picks up the signal.
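Computed as a simple first difference over the weekly RVS series (sample values invented):

```python
def drift_velocity(weekly_rvs):
    """Week-over-week RVS change in points per week; values near zero
    indicate stabilization, strongly negative values an ongoing shift."""
    return [b - a for a, b in zip(weekly_rvs, weekly_rvs[1:])]

print(drift_velocity([54, 48, 45, 44, 44]))  # → [-6, -3, -1, 0]
```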

KPI 3 — co-occurrence hygiene index

The share of the top-100 co-occurrences with the brand name that are neutrally or positively charged. Target value > 85%. Values below 70% signal contamination of the entity cluster.
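Once the top co-occurrence terms carry sentiment labels (from whichever classifier is in use), the index itself is trivial; the labeled sample below is invented:

```python
def hygiene_index(labeled_terms) -> float:
    """Percentage of co-occurrence terms labeled neutral or positive."""
    ok = sum(1 for _, label in labeled_terms if label in ("neutral", "positive"))
    return 100 * ok / len(labeled_terms)

sample = [("innovative", "positive"), ("manufacturer", "neutral"),
          ("recall", "negative"), ("lawsuit", "negative")]
print(hygiene_index(sample))  # → 50.0
```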

KPI 4 — crawler freshness lag

Average days between publishing a reputation signal and its appearance in LLM answers. Benchmark: under 7 days on tier-1 properties, under 14 days on tier-2. Above 21 days means: the crawler cycle is broken.

KPI 5 — authority anchor coverage

The share of core brand claims backed by at least three authoritative primary sources. Measures the structural robustness of the data model against isolated negative signals. Operational detail in the Reputation Engineering LLM deep dive.

Connection to GEO, prompt-level SEO and entity work

ORM 2026 cannot be viewed in isolation. It is the counterpart of three related disciplines: prompt-level SEO optimizes brand citation inside specific prompt clusters. AI Overview readiness steers the SERP layer. Work on the Knowledge Graph and on entity consolidation provides the semantic foundation. ORM bundles those layers along the sentiment axis: it asks not "is my brand cited?" but "in what tone is it cited?".

For enterprise brands, that means: in 2026, ORM teams no longer belong in communications departments but alongside SEO, data engineering and analytics. Skill profiles shift accordingly — from PR-agency briefings to BigQuery pipelines, probing frameworks and schema review cycles. Brands that do not make that transition keep producing reporting that describes their own brand inside a reality that stopped existing in 2019. Operationally, our Online Reputation service starts at exactly that point.


Conclusion: brands still treating reputation as PR are measuring the wrong system

The core shift is simple to state but organizationally hard to execute: in 2026, reputation is a data model, not a narrative state. It is not negotiated in editorial rooms but aggregated in embedding spaces. It is not steered by press officers but by crawler lastmod signals, Wikidata statements and probe-cluster designs.

The question every CMO and head of communications must ask in 2026 is no longer "how present is our brand in the media?" — it is: "on what vector do we stand in GPT-5.1, Claude 4.5 and Gemini 2.5 — and what does our drift curve look like over the next 90 days?" Anyone who cannot measure that question is no longer doing ORM. They are running on hope.