ChatGPT won't mention your brand? The 8 reasons it ignores you — ranked by likelihood — each with a diagnose/fix/speed table, a decision flowchart, and how to prove the fix worked in revenue, not vibes.
A founder messaged me in March, frustrated. He had spent three weeks "doing GEO" — schema, llms.txt, the whole checklist from every blog post he could find — and ChatGPT still would not name his product when he asked it to recommend tools in his category. It listed five competitors. It did not list him. He wanted to know what he was doing wrong.
Nothing, as it turned out. His product launched in late 2025. The model he was testing against had a training cutoff in mid-2024. ChatGPT had never seen his company because it did not exist when the corpus was assembled, and he was testing with browsing turned off. No schema bundle on earth fixes that. The fix was a different model, browsing mode, and patience for the next retrain — plus a few retrieval-side moves to make sure that when browsing did fire, he was findable.
That conversation is why this article exists. "ChatGPT isn't recommending my product" is a single symptom with at least eight distinct root causes, and the fixes range from "ship it this afternoon" to "wait two quarters." If you misdiagnose, you spend effort on the wrong layer and conclude GEO does not work. It does work. You just have to know which of the eight you are dealing with.
This is the troubleshooting companion to two strategy pieces: how to get cited by AI engines, which covers the structural moves, and how to rank in ChatGPT, which covers the offensive playbook. This one is diagnostic. We rank the eight reasons by how often I actually see them, give each a diagnose/fix/speed table, draw the decision tree, and finish on the part nobody else writes: how to verify the fix paid off in revenue, using revenue attribution rather than feel.
Quick Facts
| Metric | Value | Source |
| --- | --- | --- |
| ChatGPT weekly active users (Q4 2025) | ~400 million | OpenAI investor update [1] |
| Most common cause of "ChatGPT ignores my brand" | Launched after training cutoff | Attrifast diagnostic, n=60+ audits |
| OpenAI documented user-agents | 3 (GPTBot, ChatGPT-User, OAI-SearchBot) | OpenAI bot docs [2] |
| Share of top sites that blocked GPTBot (2023-2024 peak) | ~26-35% of top-1000 news/SaaS | Originality.ai / press tracking [3] |
| Citation density per ChatGPT search answer | 3-5 sources typical | OpenAI search docs [4] |
| Reddit's share of cited domains in AI answers (2025) | Among the most-cited domains | Citation-source studies [5] |
| Wikipedia's role in AI answer grounding | Heavily over-represented in citations | Citation-source studies [6] |
| FAQ schema items on AI-cited pages (median) | 4+ | Ahrefs / Semrush GEO research [7] |
| GA4 default attribution for ChatGPT clicks | Direct/(none); no AI rule | Google Analytics docs [8] |
| ChatGPT RPV vs Google organic (B2B SaaS) | 1.4-2.1x, n=24 | Attrifast aggregate, Q1 2026 |
| Retrieval-side fix latency | Days to weeks | Live index refresh, est. |
| Corpus-side fix latency | Quarters (next retrain) | Model release cadence, est. |
Two of those rows frame the entire piece. The corpus-versus-retrieval distinction (rows 11 and 12) is the axis everything else rotates around: corpus fixes are slow and out of your hands, retrieval fixes are fast and in your hands. And the GA4 row (row 9) is why the last section exists — you can do everything right and still not be able to prove it, because the tool most people measure with is structurally blind to AI traffic.
How ChatGPT decides what to recommend (the quick mechanic)
When ChatGPT recommends a product, it is doing one of two very different things, and which one determines every fix below. Without browsing, it recommends from its training corpus — a frozen snapshot of the web, weighted toward what was mentioned often and authoritatively. With browsing, it runs a live search, retrieves a handful of pages, and synthesizes an answer with inline citations. Corpus is memory; browsing is retrieval. Most "why doesn't it mention me" problems live in exactly one of those two modes.
The practical consequence is that the same query can produce two different answers depending on whether browsing fires, and you need to test both. A brand absent from the corpus but present and well-structured on the live web will appear only in browsing-mode answers. A brand baked deep into the corpus (think Stripe, Notion, HubSpot) appears even with browsing off, because the model "remembers" it. Your goal, if you are not a household name, is to win retrieval first — it is faster — while you slowly earn your way into the corpus.
Here is the mechanic laid out as a comparison, because the rest of the article keeps referring back to it:
| Dimension | Corpus mode (no browsing) | Retrieval mode (browsing on) |
| --- | --- | --- |
| Source of truth | Frozen training snapshot | Live web search at query time |
| What gets you in | Frequent, authoritative mentions before cutoff | Crawlable, well-structured, retrievable pages now |
| Latency to influence | Quarters (next model retrain) | Days to weeks (index refresh) |
| You control it? | Indirectly, slowly | Mostly, quickly |
| Killer for new brands | Training cutoff predates you | Blocked crawler or thin content |
| Citations shown? | Rarely; recalled from memory | Yes, inline footnotes |
The two columns also explain why honest GEO advice always hedges on timing. Anyone promising ChatGPT will recommend you "next week" is implicitly promising a retrieval win and quietly ignoring the corpus, which is the layer that produces the unprompted, browsing-off recommendations that actually feel like the model "knows" you. Both matter. They just move on different clocks.
One more mechanic worth internalizing: a recommendation is not a citation is not a crawl. The model crawling your page (GPTBot) does not mean it retrieves you live for an answer; being retrieved does not mean it cites you; citing you in a browsing answer does not mean it recommends you unprompted from memory; recommending you from memory does not mean a user clicked. Keep those five states separate (crawled, retrieved, cited, recommended, clicked) or your diagnosis will blur.
| State | What it means | How you detect it |
| --- | --- | --- |
| Crawled | GPTBot fetched your page for training | Server logs show GPTBot user-agent |
| Retrieved | ChatGPT-User fetched your page live for an answer | Server logs show ChatGPT-User user-agent |
| Cited | Your URL appears as a footnote in a browsing answer | Run the query with browsing; read the sources |
| Recommended | Model names you unprompted, browsing off | Run the query with browsing off |
| Clicked | A human followed the citation to your site | First-party attribution (GA4 hides it) |
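The first two states can be checked straight from your access logs. Here is a minimal sketch, assuming logs in Combined Log Format (user-agent as the last quoted field); the function name and the sample lines are hypothetical, and a real log may need a different parser.

```python
import re
from collections import Counter

# The three user-agent tokens OpenAI documents for its bots.
OPENAI_BOTS = ("GPTBot", "ChatGPT-User", "OAI-SearchBot")

# Assumption: Combined Log Format, where the user-agent is the last quoted field.
_UA_FIELD = re.compile(r'"([^"]*)"\s*$')


def count_openai_bot_hits(log_lines):
    """Count requests per OpenAI bot token found in the user-agent field."""
    hits = Counter()
    for line in log_lines:
        match = _UA_FIELD.search(line)
        if not match:
            continue  # line does not look like Combined Log Format
        user_agent = match.group(1)
        for bot in OPENAI_BOTS:
            if bot in user_agent:
                hits[bot] += 1
                break
    return hits


# Hypothetical sample lines for illustration only.
sample_lines = [
    '203.0.113.5 - - [10/Mar/2026:10:00:00 +0000] "GET /pricing HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.1)"',
    '203.0.113.6 - - [10/Mar/2026:10:01:00 +0000] "GET /docs HTTP/1.1" 200 900 "-" "Mozilla/5.0 (compatible; ChatGPT-User/1.0)"',
    '198.51.100.7 - - [10/Mar/2026:10:02:00 +0000] "GET / HTTP/1.1" 200 300 "-" "Mozilla/5.0 (Windows NT 10.0)"',
]

print(count_openai_bot_hits(sample_lines))
```

A zero count for GPTBot over several weeks is a prompt to check Reason 4; a non-zero count only tells you that you were crawled, not that you were cited.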
Now the eight reasons, ranked by how often they are the actual culprit in the audits I run.
Reason 1: The training cutoff predates your company
The most common reason ChatGPT does not recommend your product is the most deflating: your company did not exist when the model was trained. Every ChatGPT model has a training cutoff — a date after which it learned nothing — and if you launched after it, the model has zero memory of you. With browsing off, it cannot recommend what it never saw. This is not a quality problem or a schema problem; it is a calendar problem, and it is the first thing to rule out.
I rule it out first because it changes the entire strategy. If you are corpus-absent, on-page optimization buys you nothing in browsing-off mode — it only helps the live-retrieval path. Founders routinely burn weeks polishing schema while the real issue is that the model simply has not been retrained since they launched. Diagnose this before anything else.
| Item | Detail |
| --- | --- |
| Diagnose | Ask ChatGPT (browsing OFF) to "describe [exact company name]." If it hallucinates, says it does not know, or describes a different company, you are corpus-absent. Cross-check the model's known training cutoff against your launch date. |
| Fix | (1) Test in browsing-ON mode and in newer models; you may already appear via retrieval. (2) Start accumulating the durable signals (Reasons 2, 3, 8) that get you into the next corpus. (3) Lean on retrieval-side wins (Reasons 4, 5, 7) in the meantime. |
| Speed | Corpus inclusion: SLOW (next retrain, OpenAI's schedule, typically quarters). Retrieval workaround: FAST (days). |
| You control timing? | No for corpus; yes for retrieval. |
The honest caveat here is the one I most often have to deliver in person: there is no button that forces OpenAI to retrain on your existence sooner. You cannot pay for it. You cannot submit a form. The only thing you can do is be maximally present and authoritative across the web now, so that whenever the next training pass happens, you are unmissable. Everything in Reasons 2, 3, and 8 is really "earn your way into the next corpus." Everything in Reasons 4, 5, and 7 is "win retrieval while you wait."
A diagnostic test matrix I use to separate corpus absence from everything else:
| Test (run both) | Browsing OFF result | Browsing ON result | Interpretation |
| --- | --- | --- | --- |
| "Describe [your company]" | Knows you | Knows you | In corpus; not your problem |
| "Describe [your company]" | Hallucinates / unknown | Knows you | Corpus-absent, retrieval works: Reason 1, mitigated |
| "Describe [your company]" | Hallucinates / unknown | Hallucinates / unknown | Corpus-absent AND retrieval-blind: Reason 1 + Reasons 4/2 |
| "Best tools for [category]" | Omits you | Omits you | Either corpus-absent or out-cited (Reason 6) |
| "Best tools for [category]" | Omits you | Includes you | Retrieval is carrying you; corpus is the gap |
If the second row is your result, you are in the best-case version of Reason 1: the model does not remember you but can find you. Your job is to keep retrieval reliable (Reasons 4, 5, 7) and grind on corpus signals (Reasons 2, 3, 8). If the third row is your result, you have two problems stacked, and you should jump straight to Reason 4 to check whether you accidentally blocked the crawler.
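If you log these tests over time, the matrix can be encoded as a small lookup. The sketch below is just one way to write it down; the keys ("describe", "category") and the function name are hypothetical labels, and the interpretations are copied from the matrix rather than derived.

```python
# Keys: (test, present with browsing OFF, present with browsing ON).
# "describe" = "Describe [your company]"; "category" = "Best tools for [category]".
INTERPRETATIONS = {
    ("describe", True, True): "In corpus; not your problem",
    ("describe", False, True): "Corpus-absent, retrieval works: Reason 1, mitigated",
    ("describe", False, False): "Corpus-absent and retrieval-blind: Reason 1 plus Reasons 4/2",
    ("category", False, False): "Either corpus-absent or out-cited (Reason 6)",
    ("category", False, True): "Retrieval is carrying you; corpus is the gap",
}


def interpret(test, present_browsing_off, present_browsing_on):
    """Map one test result pair to the matrix's interpretation."""
    key = (test, present_browsing_off, present_browsing_on)
    # Combinations the matrix does not list fall through to a neutral message.
    return INTERPRETATIONS.get(key, "Not covered by the matrix; re-run the test")


print(interpret("describe", False, True))
print(interpret("category", False, False))
```

Run each prompt several times before recording a result; a single answer is one sample, and the lookup is only as good as the inputs.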
Reason 2: No authoritative third-party mentions
The second most common reason is that nothing credible on the web talks about you. LLMs learn what to recommend from how often, and how authoritatively, a brand is mentioned across the corpus. If the only place your product name appears is your own website, the model has no third-party signal to ground a recommendation in. Self-published claims are weak training signal; independent mentions are strong. This is the "you are talking, nobody else is" problem.
This matters because recommendations are fundamentally a trust judgment the model is approximating from data. A brand named in TechCrunch, in a Reddit thread, in three "best tools" roundups, and on a competitor's comparison page has a dense web of corroborating mentions. A brand named only on its own blog has a single, self-interested node. The model treats the second case the way you would treat a restaurant that only reviews itself.
| Item | Detail |
| --- | --- |
| Diagnose | Run a site: exclusion search on Google: search your brand name minus your own domain ("YourBrand" -site:yourbrand.com). Count independent results. Fewer than ~10 quality third-party mentions is a thin entity. Also check whether any "best [category] tools" articles name you. |
| Fix | Earn mentions: launch on Product Hunt, get into category roundups and listicles (Reason 5), do podcast or newsletter guest spots, publish original data others cite, get reviewed on G2/Capterra. Each independent mention is corpus fuel and retrieval fuel. |
| Speed | SLOW to MEDIUM. Individual mentions land fast; the aggregate density that moves recommendations builds over months. |
| You control timing? | Partly: you can pursue mentions, but you cannot force coverage. |
The honest caveat: this is the slow, unglamorous PR-and-relationships work that founders hoping for a technical fix do not want to hear. There is no schema tag for "be famous." But it is also the most durable lever, because third-party mention density feeds both the corpus (slow) and live retrieval (fast — the model can browse those same third-party pages today).
A rough hierarchy of which third-party mentions carry the most weight, from what I have seen correlate with recommendation presence:
| Mention type | Corpus weight | Retrieval weight | Difficulty to earn |
| --- | --- | --- | --- |
| Wikipedia entry | Very high | Very high | Very high (notability bar) |
| Major press (TechCrunch, etc.) | High | High | High |
| "Best [category] tools" listicles | High | Very high | Medium |
| Reddit recommendation threads | Medium-high | High | Medium (organic only) |
| G2 / Capterra reviews | Medium | Medium | Low-medium |
| Competitor comparison pages | Medium | High | Low (you publish them) |
| Podcast / newsletter mentions | Low-medium | Low | Medium |
| Your own blog | Low (self-interested) | Low-medium | Trivial |
Notice the bottom row. Your own content is the easiest to produce and the weakest signal for recommendation. It is necessary — it gives retrieval something to cite — but it is not sufficient. The leverage is in the rows above it, which is exactly why so few founders do them.
Reason 3: Thin or absent Wikipedia and Wikidata entity
If your brand has no Wikipedia page and no Wikidata entry, the model is missing the single most over-represented source in its grounding data. Wikipedia and Wikidata are disproportionately present in LLM training corpora and in live retrieval, and they function as the canonical "entity card" the model uses to disambiguate and trust a brand. No entity card means the model is unsure you are a real, distinct, notable thing — and uncertainty suppresses recommendation.
This is closely related to Reason 2 but distinct enough to rank separately, because Wikidata in particular is achievable without the notability bar of a full Wikipedia article, and because the sameAs entity graph is a structural fix you control. Most founders skip it entirely. The brands that show up reliably in AI answers tend to have a clean entity graph; the ones that do not, do not. I cover the mechanism in depth in the Wikipedia effect on AI visibility; here is the troubleshooting version.
| Item | Detail |
| --- | --- |
| Diagnose | Search Wikidata for your brand; search Wikipedia. Then check your own Organization JSON-LD sameAs array: does it list LinkedIn, X, GitHub, Crunchbase, and (if it exists) your Wikidata ID? Fewer than 4 matched sameAs surfaces is a thin entity. |
| Fix | (1) Create a Wikidata item (lower bar than Wikipedia; you can do it today). (2) Build a clean sameAs array across LinkedIn, X, GitHub, Crunchbase, Product Hunt, G2. (3) Pursue Wikipedia only once you have 5+ independent reliable-source citations; do not attempt it before then, because it will be deleted. |
| Speed | Wikidata + sameAs: FAST (hours to days). Wikipedia: SLOW (needs notability, can take quarters). |
| You control timing? | Wikidata/sameAs: yes. Wikipedia: no (community-gated). |
The honest caveat: do not pay anyone to "get you a Wikipedia page." Paid Wikipedia creation violates the platform's terms, the articles get flagged and deleted, and a deleted article is worse than none. Earn it through genuine notability, or skip it and lean on Wikidata, which you can legitimately create yourself. The sameAs array, by contrast, is pure upside and takes an afternoon — it is the highest-ROI structural move on this list.
A sameAs audit checklist, mapped to whether each surface is realistic to claim quickly:
| Surface | Effort to claim | Entity weight |
| --- | --- | --- |
| LinkedIn company page | Trivial | High |
| X / Twitter handle | Trivial | Medium |
| GitHub organization | Trivial (even if empty) | Medium |
| Crunchbase profile | Low | Medium-high |
| Product Hunt page | Low | Medium |
| G2 / Capterra listing | Low-medium | Medium |
| Wikidata item | Low (self-create) | High |
| Wikipedia article | Very high (gated) | Very high |
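For reference, here is roughly what an Organization JSON-LD block with a sameAs array looks like, generated from Python so the "fewer than 4 surfaces" check from the diagnose row can run against it. This is a sketch: the brand name, every URL, and the Wikidata ID are placeholders, and the helper names are hypothetical.

```python
import json


def build_organization_jsonld(name, url, same_as):
    """Build a schema.org Organization object with a sameAs array."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "sameAs": list(same_as),
    }


def is_thin_entity(jsonld, minimum=4):
    """Apply the article's rule of thumb: fewer than 4 sameAs surfaces is thin."""
    return len(jsonld.get("sameAs", [])) < minimum


# Placeholder profile URLs; substitute your real ones.
org = build_organization_jsonld(
    name="ExampleBrand",
    url="https://example.com",
    same_as=[
        "https://www.linkedin.com/company/examplebrand",
        "https://x.com/examplebrand",
        "https://github.com/examplebrand",
        "https://www.crunchbase.com/organization/examplebrand",
        "https://www.wikidata.org/wiki/Q00000000",  # placeholder ID
    ],
)

# Paste the output into a <script type="application/ld+json"> tag.
print(json.dumps(org, indent=2))
print("thin entity:", is_thin_entity(org))
```

Only list profiles that actually exist and point back to your domain; a sameAs entry that resolves to an empty or unrelated page adds nothing.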
Reason 4: You blocked GPTBot in robots.txt
A surprising number of "ChatGPT ignores me" cases trace to a robots.txt line the founder forgot about — or inherited. In 2023 and 2024, blocking GPTBot was a widely-recommended reflex to protect content from AI training. If you (or a previous dev, or a CMS default, or a security plugin) disallowed GPTBot, you removed yourself from OpenAI's training crawl, which slowly erodes how often the model recommends you from memory. This is the fastest reason to diagnose and, if it is the culprit, the fastest to fix.
It matters because it is invisible until you look. The site works fine for humans and for Googlebot; nothing breaks. But OpenAI's crawler hits the disallow rule, never ingests your pages, and you quietly fall out of the corpus over successive retrains. I check this on every audit because it costs thirty seconds and occasionally explains the entire problem.
| Item | Detail |
| --- | --- |
| Diagnose | Open https://yourdomain.com/robots.txt. Search for GPTBot, ChatGPT-User, OAI-SearchBot, and any blanket "User-agent: *" group followed by "Disallow: /". Also check your CDN/WAF (Cloudflare's "block AI bots" toggle) and any security plugin defaults. |
| Fix | Remove the disallow lines for GPTBot, ChatGPT-User, and OAI-SearchBot. Confirm your CDN is not blocking them at the edge. Re-test robots.txt with a validator. Then wait for re-crawl. |
| Speed | Removing the block: FAST (minutes). Re-crawl and retrieval recovery: FAST-MEDIUM (days to a few weeks). Corpus recovery: SLOW (next retrain). |
| You control timing? | Yes for the block removal and re-crawl; no for corpus. |
The honest caveat is the tradeoff itself. Blocking GPTBot is a legitimate choice if you have a genuine content-rights reason — and some publishers do. But understand the cost: you are trading future recommendation presence for control over training inclusion. Blocking GPTBot does not block ChatGPT-User (the live-fetch agent), so you can still be retrieved on demand if a user explicitly asks ChatGPT to read your URL. But you lose the unprompted, baked-in recommendation. For most SMB SaaS and e-commerce brands trying to get recommended, the right call is allow all three bots and instrument the crawl.
The three OpenAI bots and what blocking each costs you:
| Bot | Purpose | Blocking it costs you |
| --- | --- | --- |
| GPTBot | Training corpus crawler | Future corpus inclusion → fewer unprompted recommendations |
| ChatGPT-User | Live on-demand fetch | The ability to be retrieved when a user asks ChatGPT to read your page |
| OAI-SearchBot | ChatGPT search index crawler | Presence in ChatGPT search results |
If you find a block, fix it and move on — but do not expect overnight recovery. The crawler has to come back, re-ingest, and (for corpus) wait for a retrain. Removing the block is necessary but not instantaneous.
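The thirty-second robots.txt check can also be scripted. The sketch below uses Python's standard-library robots.txt parser on text you have already downloaded; the function name and sample file are hypothetical, and it only evaluates robots.txt rules, so a CDN or WAF block at the edge will not show up here.

```python
from urllib.robotparser import RobotFileParser

OPENAI_BOTS = ("GPTBot", "ChatGPT-User", "OAI-SearchBot")


def blocked_openai_bots(robots_txt, url="https://example.com/"):
    """Return the OpenAI bots that robots.txt disallows from fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in OPENAI_BOTS if not parser.can_fetch(bot, url)]


# Hypothetical robots.txt with a leftover GPTBot block.
sample_robots = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

print(blocked_openai_bots(sample_robots))
```

An empty list means robots.txt is not the blocker; move on to the CDN and plugin checks from the diagnose row before ruling out Reason 4.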
Reason 5: No comparison or listicle content cites you
ChatGPT loves a list. When a user asks "what are the best X tools," the model leans heavily on existing "best X tools" articles, comparison posts, and roundups — because those are pre-structured recommendation lists it can lift directly. If your product appears in zero of the listicles that own your category, you are invisible to the most common recommendation-shaped query there is. The model recommends what the web has already recommended.
This is distinct from Reason 2 (general mentions) because listicles are a specific, high-leverage mention type: they are literally formatted as the answer ChatGPT wants to give. A single placement in a well-ranked "best [category] tools" post can do more for recommendation presence than ten passing brand mentions, because it directly matches the query intent and the answer structure. It is also a type of content you can partly manufacture by publishing your own honest comparison pages.
| Item | Detail |
| --- | --- |
| Diagnose | Ask ChatGPT (browsing ON) "best [your category] tools" and "[competitor] alternatives." Read the cited sources. Are you in any of them? Then Google the same queries and check the top 10 listicles for your name. |
| Fix | (1) Get into existing roundups: pitch the authors, offer to be added, earn it via reviews. (2) Publish your own comparison content: "[You] vs [Competitor]," "[Competitor] alternatives," "best [category] tools" with honest inclusion of rivals. (3) Encourage customers to mention you in community roundups. |
| Speed | Your own comparison pages: FAST (publish today, retrievable in days). Getting into third-party listicles: MEDIUM (outreach cycle). Corpus effect: SLOW. |
| You control timing? | Your pages: yes. Third-party: partly. |
The honest caveat: publishing your own "best tools" page that conveniently ranks you first fools no one, including the model, which cross-references. The comparison content that works is genuinely useful and honest about where competitors win — that is what gets cited and shared, which is what feeds retrieval and corpus. Self-serving comparison pages are low-effort and the model discounts them accordingly.
Which comparison formats earn the most retrieval pickup, ranked:
| Format | Retrieval value | Why |
| --- | --- | --- |
| Third-party "best [category] tools" listicle including you | Highest | Matches query intent exactly, independent source |
| Reddit thread recommending you in a roundup | Very high | Independent + high-trust domain (Reason 8) |
| Your honest "[Competitor] alternatives" page | High | Captures competitor-name queries |
| Your "[You] vs [Competitor]" page | High | Captures comparison queries |
| Your "best [category] tools" page (honest) | Medium | Self-published, discounted but still indexed |
| Spammy self-ranking comparison page | Low | Discounted as self-interested |
Reason 6: Competitors dominate the citation graph
Sometimes the problem is not that ChatGPT does not know you — it is that it knows your competitors so much better that you never make the cut. In every category there is a citation graph: the dense web of mentions, links, listicles, and threads that the model has learned associates certain brands with certain queries. If two or three incumbents own that graph, the model recommends them by default, and you are crowded out even if you are technically present. This is a relative-authority problem, not an absence problem.
It matters because the fix is different from the earlier reasons. You are not trying to go from zero to one; you are trying to break into an established set. That means you cannot just optimize one page — you have to insert yourself into the same sources that name the incumbents. The model recommends a cluster of brands it has seen recommended together; your goal is to become a member of that cluster in the eyes of the corpus and the retriever.
| Item | Detail |
| --- | --- |
| Diagnose | Ask "best [category] tools" and note which 3-5 brands always appear. Search those brands' names and see which domains mention them (listicles, Reddit, review sites, Wikipedia). That set of domains is the citation graph. Check how many of them mention you. |
| Fix | Get into the same sources that name the incumbents: the listicles they are in (Reason 5), the subreddits where they are recommended (Reason 8), the review sites where they are listed, the comparison pages. Aim to co-occur with them, not to beat one page. |
| Speed | MEDIUM to SLOW. Co-occurrence builds as you appear in more of the shared sources; no single fast lever. |
| You control timing? | Partly: you can pursue the sources, but displacing incumbents is gradual. |
The honest caveat: you will probably not displace a category-defining incumbent in ChatGPT's recommendations through GEO alone. If you compete with Stripe, ChatGPT will name Stripe. The realistic win is to be one of the recommended options — to make the list, even if you are not first — and to own the long-tail and comparison queries ("[incumbent] alternatives," "[incumbent] for [specific use case]") where the incumbent's dominance is weaker and intent is higher. That is a winnable game; head-to-head displacement usually is not.
Where incumbent dominance is strong versus weak, and where to aim:
| Query type | Incumbent dominance | Your realistic shot |
| --- | --- | --- |
| "best [category] tools" | Very strong | Make the list, not the top |
| "[incumbent] alternatives" | Weak | High: this is your query |
| "[category] tool for [niche use case]" | Weak | High: long-tail intent |
| "cheapest [category] tool" | Medium | Good if you are genuinely cheaper |
| "[category] tool for [SMB / specific segment]" | Weak | High: segment specificity |
| "is [incumbent] worth it" | Medium | Medium: position as the lighter option |
Reason 7: Your category language doesn't match how users ask
You might be present and well-cited, yet still missed, because you describe yourself in language nobody uses when they ask ChatGPT for a recommendation. If you call yourself a "revenue intelligence platform" but users ask for "a tool to track which marketing channel drives sales," the model may never connect the two. LLMs match on meaning, but the match is far stronger when your on-page language mirrors the actual phrasing of the query. A vocabulary mismatch makes you semantically invisible for the queries that matter.
This matters because it is a self-inflicted wound that founders are especially prone to. We fall in love with our own positioning language — the clever category we invented, the aspirational framing — and forget that buyers do not search in our vocabulary. The model recommends against the user's phrasing, not yours. If there is a gap, you lose, and the fix is fully within your control and fast.
| Item | Detail |
| --- | --- |
| Diagnose | List the 10 ways your buyers actually phrase the problem (check your support tickets, sales calls, Reddit, the "People also ask" box). Now grep your own site for those phrases. If your pages use only your invented category language and never the buyer's words, you have a mismatch. |
| Fix | Add the buyer's actual phrasing to your pages, in H2s, FAQ questions, and direct-answer blocks. Use question-shaped headers that match real queries. Keep your brand language, but layer the vernacular alongside it so the semantic match is unambiguous. |
| Speed | FAST. Publish today; retrievable within days. Pure on-page change. |
| You control timing? | Fully. |
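The "grep your own site" step can be approximated with a few lines of Python. This is a rough sketch: it does exact, case-insensitive phrase matching on text you have already extracted from a page, which is stricter than how a model matches meaning, and the function name and sample copy are hypothetical.

```python
def missing_buyer_phrases(page_text, buyer_phrases):
    """Return the buyer phrases that never appear verbatim in the page text."""
    # Collapse whitespace and lowercase so line breaks and casing do not matter.
    normalized_page = " ".join(page_text.lower().split())
    missing = []
    for phrase in buyer_phrases:
        normalized_phrase = " ".join(phrase.lower().split())
        if normalized_phrase not in normalized_page:
            missing.append(phrase)
    return missing


# Hypothetical page copy and buyer phrasing.
page = """
Revenue Intelligence Platform.
See which marketing channel drives revenue, with cookieless analytics.
"""
phrases = [
    "which marketing channel drives revenue",
    "cookieless analytics",
    "track ChatGPT traffic",
]

print(missing_buyer_phrases(page, phrases))
```

Treat the output as a to-do list for headers and FAQ questions, not a score; a phrase that is missing verbatim may still be covered by a close paraphrase.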
The honest caveat: this is not keyword stuffing, and the model will discount you if it reads as such. The goal is genuine alignment — describing what you do in the words your buyers use, because that is also just clearer writing. The fix overlaps heavily with classic on-page SEO and the structural moves in how to get cited by AI engines; the GEO twist is that you are matching conversational query phrasing (full questions) rather than keyword phrases.
A before/after of category-language alignment, using a generic example:
| Buyer's actual question | Your invented-language page | Aligned page |
| --- | --- | --- |
| "tool to see which channel drives sales" | "Revenue Intelligence Platform" | "See which marketing channel drives revenue" |
| "cookieless analytics" | "Privacy-First Attribution Suite" | "Cookieless revenue analytics, no consent banner" |
| "track ChatGPT traffic" | "AI Discovery Attribution Layer" | "Track ChatGPT traffic and join it to Stripe" |
| "Google Analytics alternative for revenue" | "Next-Gen Measurement Stack" | "A GA4 alternative that shows revenue by channel" |
Reason 8: No Reddit or forum presence
Reddit and community forums are among the most-cited domains in AI answers, and if your brand never appears in them, you are missing a source class the model trusts disproportionately. When a user asks ChatGPT for a recommendation, the model often grounds its answer in the same place a savvy human would — community threads where real users debate real tools. No presence in those threads means no inclusion in the grounded answer, and Reddit specifically punches far above its weight in citation studies.
This ranks last not because it is unimportant — it is genuinely high-leverage — but because it is the slowest and least directly controllable, and because the prior seven usually bite first. You cannot spam your way in; Reddit communities are aggressively hostile to self-promotion, and a ban is worse than absence. But organic, earned presence in the right subreddits is one of the strongest recommendation signals available, precisely because it is hard to fake. I dig into the revenue side of this in Reddit AI citations and revenue.
| Item | Detail |
| --- | --- |
| Diagnose | Search Reddit for your brand name and your category ("best [category] tool reddit"). Are you mentioned, recommended, or discussed? Then ask ChatGPT (browsing ON) your category query and check whether Reddit threads appear in the cited sources (they usually do). |
| Fix | Build genuine community presence: answer questions in relevant subreddits without pitching, contribute useful data, let satisfied customers mention you organically, do an AMA if you have standing. Never astroturf: detection is harsh and the backlash is durable. |
| Speed | SLOW. Organic community standing takes months; there is no fast, safe shortcut. |
| You control timing? | Barely: you can participate, but recommendations must be earned organically. |
The honest caveat is blunt: the fast version of this (paying for fake recommendations, sockpuppet threads) is both against platform rules and self-defeating, because the model and the community both detect and discount manufactured enthusiasm, and a public astroturfing bust is a brand liability that outlasts any short-term gain. The only durable play is to be genuinely useful in communities where your buyers already are. It is slow. It is also, per the citation studies, one of the most powerful signals there is — which is exactly why it is hard.
| Thread signal | Effect |
| --- | --- |
| Perceived independence (real users, not marketing) | Strong positive |
| High engagement (upvotes, replies) | Positive; signals consensus |
| Recency (active recent threads) | Positive in retrieval mode |
| Direct recommendation phrasing ("I use X for Y") | Strong positive; matches answer shape |
| Detectable astroturfing | Strong negative; discounted or penalized |
The diagnostic flowchart: which of the 8 is it?
The eight reasons are not equally likely, and they are not independent — corpus absence (Reason 1) often co-occurs with thin entity (Reasons 2, 3) and crawler blocks (Reason 4). The fastest way to narrow it down is a sequence of cheap tests, ordered so you rule out the highest-likelihood and cheapest-to-check causes first. Here is the decision tree I run, top to bottom.
The tree front-loads the two cheapest, highest-yield checks: corpus knowledge (browsing off) and the robots.txt block. Roughly two-thirds of the audits I run resolve in those first two branches. If you clear both, you are into the relative-authority and structural reasons, which are slower to fix but also the ones where sustained effort compounds.
A second view of the same logic — the order to run the tests and what each costs you:
The point of running all eight even after you find one cause is that the causes stack. A post-cutoff launch (Reason 1) with a blocked crawler (Reason 4) and a thin entity (Reason 3) needs all three fixed; fixing one and re-testing immediately will look like the fix did not work. Diagnose comprehensively, then sequence the fixes by speed.
Fast fixes vs slow fixes: the speed reality table
Before you spend a week on anything, sort your diagnosed reasons by how fast the fix can possibly land. The corpus-versus-retrieval split is the deciding factor: anything that depends on the next model retrain is measured in quarters and is not under your control, while anything that flows through live retrieval can move in days. Pulling the fast levers first buys you visible progress while the slow ones bake.
Here is every reason sorted by realistic time-to-effect, with the lever type:
| Reason | Primary lever | Type | Realistic time-to-effect | You control timing? |
| --- | --- | --- | --- | --- |
| 4. Blocked GPTBot | Unblock crawler | Retrieval + corpus | Minutes to fix; days to re-crawl | Mostly |
| 7. Category language mismatch | On-page rewrite | Retrieval | Days | Fully |
| 5. No comparison content (your own) | Publish comparison pages | Retrieval | Days | Fully |
| 3. Thin entity (Wikidata + sameAs) | Build entity graph | Both | Hours to days | Mostly |
| 5. No comparison content (third-party) | Earn listicle placements | Both | Weeks (outreach) | Partly |
| 6. Competitors dominate | Insert into citation graph | Both | Weeks to months | Partly |
| 2. No third-party mentions | Earn press/reviews | Both | Months | Partly |
| 8. No Reddit presence | Build community standing | Both | Months | Barely |
| 3. Thin entity (Wikipedia) | Earn a Wikipedia article | Both | Quarters | No (gated) |
| 1. Training cutoff | Wait for retrain | Corpus | Quarters | No |
Two columns matter most. The "Type" column tells you whether a fix helps live retrieval (fast, browsing-on answers), the corpus (slow, browsing-off recommendations), or both. The "You control timing?" column tells you where to spend energy for predictable results. Notice that the only "Fully" controllable, "Days" fixes are Reasons 5 (your own pages) and 7 (language), so start there.
Sequencing the work this way front-loads everything fast and controllable, runs the slow earned-media work in parallel, and treats the corpus (Reason 1) as a quarterly re-test rather than a task, because it is not a task you can complete, only a condition you wait for and feed.
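If it helps to make the sequencing mechanical, here is a minimal sketch that orders a set of diagnosed reasons by time-to-effect, then by how much of the timing you control. The `Fix` class, the two rank tables, and the bucket names are illustrative assumptions layered on the table above, not part of any tool.

```python
from dataclasses import dataclass

# Assumed ordering of the time-to-effect buckets, fastest first.
SPEED_RANK = {"hours": 0, "days": 1, "weeks": 2, "months": 3, "quarters": 4}
# Assumed ordering of the "you control timing?" answers, best first.
CONTROL_RANK = {"fully": 0, "mostly": 1, "partly": 2, "barely": 3, "no": 4}


@dataclass(frozen=True)
class Fix:
    reason: str
    speed: str    # a key of SPEED_RANK
    control: str  # a key of CONTROL_RANK


def sequence_fixes(diagnosed):
    """Order fixes fastest first; ties go to the fix whose timing you control more."""
    return sorted(
        diagnosed,
        key=lambda f: (SPEED_RANK[f.speed], CONTROL_RANK[f.control]),
    )


# Hypothetical diagnosis: four of the eight reasons apply.
diagnosed = [
    Fix("1. Training cutoff", "quarters", "no"),
    Fix("3. Thin entity (Wikidata + sameAs)", "days", "mostly"),
    Fix("4. Blocked GPTBot", "days", "mostly"),
    Fix("7. Category language mismatch", "days", "fully"),
]

for fix in sequence_fixes(diagnosed):
    print(f"{fix.reason}: {fix.speed}, control={fix.control}")
```

With this input the language rewrite comes first and the training cutoff last, which matches the "start with Reasons 5 and 7" advice.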
How to verify the fix actually worked: revenue, not vibes
Here is the step everyone skips: proving the fix worked. The trap is measuring GEO success by "I asked ChatGPT and it mentioned me," which is a sample size of one, on your IP, at one moment, in one mode. The signal that actually matters is whether AI-attributed traffic and revenue moved after your change — and that is precisely what default GA4 cannot show you, because ChatGPT strips the Referer header and the clicks land in Direct/(none).
This matters because GEO has a brutal feedback-loop problem. The fixes are slow, the corpus is opaque, and the temptation to declare victory off a single anecdotal mention is enormous. Without a revenue signal, you cannot tell a real win from confirmation bias. The discipline is to instrument first, change second, measure third — in revenue, the only number that survives a board meeting.
| Verification method | What it tells you | Reliability |
| --- | --- | --- |
| "I asked ChatGPT and it named me" | Presence, one sample, one moment | Low: anecdotal, IP/session-dependent |
| Citation-monitoring tool (Profound, etc.) | How often you are mentioned across tracked prompts | Medium: presence at scale, no traffic/revenue |
| Server-log GPTBot/ChatGPT-User crawl rate | Whether you are being crawled/retrieved | Medium: crawl is not citation is not click |
| First-party AI-referrer attribution | AI-sourced sessions arriving on your site | High: actual traffic |
| AI-attributed revenue joined to Stripe | Whether AI traffic converted and paid | Highest: the number that matters |
The bottom row is the only one that closes the loop, and it is the one GA4 structurally cannot produce. The mechanic is the same one I detail in the ChatGPT referral analytics guide and the AI search ranking factors breakdown: detect the AI source server-side at the edge, persist a first-party session row, and join it to revenue on the Stripe checkout.session.completed webhook. No third-party cookie, no consent banner, no dependence on the engine passing a referer.
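To make the detect/persist/join mechanic concrete, here is a rough sketch of its two halves. Several things in it are assumptions made for illustration: the hostname map, the idea that an AI engine tags the landing URL with a `utm_source` value equal to its hostname, the in-memory `sessions` dict standing in for a first-party session table, and the choice of Stripe's `client_reference_id` field to carry the session id into checkout. It is not Attrifast's implementation.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumed hostname -> label map; extend it from what you see in your own logs.
AI_HOSTS = {
    "chatgpt.com": "chatgpt",
    "chat.openai.com": "chatgpt",
    "perplexity.ai": "perplexity",
    "www.perplexity.ai": "perplexity",
    "claude.ai": "claude",
    "gemini.google.com": "gemini",
    "copilot.microsoft.com": "copilot",
}


def classify_ai_source(landing_url: str, referer: Optional[str]) -> Optional[str]:
    """Return an AI source label, or None if the hit does not look AI-sourced."""
    # 1) A utm_source on the landing URL survives even when the Referer is stripped.
    query = parse_qs(urlparse(landing_url).query)
    utm = query.get("utm_source", [""])[0].lower()
    if utm in AI_HOSTS:
        return AI_HOSTS[utm]
    # 2) Otherwise fall back to the Referer hostname, if one was sent.
    if referer:
        host = (urlparse(referer).hostname or "").lower()
        return AI_HOSTS.get(host)
    return None


def attribute_revenue(sessions: dict, event: dict) -> Optional[tuple]:
    """Join one Stripe event to a stored session; returns (source, amount) or None."""
    if event.get("type") != "checkout.session.completed":
        return None
    checkout = event["data"]["object"]
    # Assumption: the session id was passed to Checkout as client_reference_id.
    source = sessions.get(checkout.get("client_reference_id"))
    if source is None:
        return None
    return source, checkout["amount_total"]  # amount in the smallest currency unit


# Hypothetical session store and webhook payload.
sessions = {
    "sess_123": classify_ai_source(
        "https://example.com/pricing?utm_source=chatgpt.com", None
    )
}
event = {
    "type": "checkout.session.completed",
    "data": {"object": {"client_reference_id": "sess_123", "amount_total": 2900}},
}
print(attribute_revenue(sessions, event))
```

A production webhook handler would also verify the Stripe signature before trusting the payload; that step is left out here to keep the join visible.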
The before/after framework I give every customer testing a GEO change:
| Question | Wrong way to answer | Right way to answer |
| --- | --- | --- |
| Did ChatGPT start recommending me? | "I asked it once and it did" | Citation-monitoring share across tracked prompts over time |
| Is ChatGPT sending traffic? | "GA4 Direct went up" (could be anything) | First-party AI-source sessions, broken out from Direct |
| Did it drive revenue? | "We grew last month" | AI-attributed revenue joined to Stripe, before vs after |
| Which fix mattered? | "We did a bunch of stuff" | Time-correlate each fix to the AI-revenue line |
| Should I keep investing? | Gut feel | AI-channel RPV vs other channels' RPV |
The honest caveat: even with first-party attribution, GEO has a long and noisy feedback loop. Corpus fixes will not show for quarters, retrieval fixes show in weeks, and AI traffic is lumpy: one viral citation can skew a small site's chart. Measure trends over months, not single weeks, and always pair the traffic number with the revenue join. Across the sites I measure, ChatGPT-attributed sessions convert at 1.4-2.1x the rate of equivalent Google organic on B2B SaaS, with median RPV (revenue per visitor) near $0.84 versus $0.51, but that is a directional benchmark, not a promise. Your number is the one that matters, and you can only get it by instrumenting.
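For the before/after comparison, the arithmetic is simple enough to sketch. RPV here is attributed revenue divided by attributed visitors for the same channel and window; the function names and the sample figures are hypothetical.

```python
def rpv(revenue: float, visitors: int) -> float:
    """Revenue per visitor; 0.0 when there were no visitors."""
    return revenue / visitors if visitors else 0.0


def rpv_lift(before: tuple, after: tuple) -> float:
    """Ratio of RPV after a change to RPV before it; each arg is (revenue, visitors)."""
    base = rpv(*before)
    if base == 0:
        raise ValueError("no baseline RPV to compare against")
    return rpv(*after) / base


# Ratio implied by the two directional medians quoted above ($0.84 vs $0.51).
print(round(0.84 / 0.51, 2))  # 1.65

# Hypothetical AI-channel windows: (attributed revenue, attributed visitors).
before_fix = (420.0, 1000)
after_fix = (630.0, 1200)
print(round(rpv_lift(before_fix, after_fix), 2))  # 1.25
```

Compare windows of equal length, and wide enough that one lumpy citation does not dominate either side.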
Common mistakes operators make diagnosing this
The diagnosis goes wrong in predictable ways. Eight mistakes I see often enough to name, each with the correction.
Mistake 1: Testing only in browsing-off mode (or only browsing-on). The two modes answer differently, and conflating them hides the corpus-versus-retrieval split. Fix: always test both, on the same query, and compare. The gap between them is the diagnosis.
Mistake 2: Assuming it is a content-quality problem. Founders default to "I need better content," when the actual cause is usually structural (cutoff, blocked crawler, thin entity). Fix: run the flowchart before rewriting anything. Quality is necessary but rarely the binding constraint here.
Mistake 3: Ignoring robots.txt. It is a thirty-second check that occasionally explains the entire problem, and it is the one founders never think to look at. Fix: open robots.txt first, every time.
Mistake 4: Treating one anecdotal mention as proof. "I asked and it named me" is a sample of one, on your session, at one moment. Fix: measure presence at scale (citation monitoring) and impact in revenue (first-party attribution), not single anecdotes.
Mistake 5: Expecting corpus fixes to land fast. Earning a Wikipedia page or accumulating mentions does nothing for browsing-off recall until the next retrain. Fix: separate fast retrieval levers from slow corpus levers and set timing expectations accordingly.
Mistake 6: Using your own invented category language everywhere. If your buyers say "track ChatGPT traffic" and you say "AI discovery attribution layer," the model may never connect you to the query. Fix: layer the buyer's vernacular alongside your brand language.
Mistake 7: Trying to displace a category-defining incumbent head-on. You will not beat Stripe in "best payment tools" through GEO. Fix: aim to make the list and own the alternatives/long-tail queries where intent is higher and dominance is weaker.
Mistake 8: Measuring GEO in GA4. GA4 buckets ChatGPT clicks as Direct, so any "did it work" read off GA4 is structurally wrong. Fix: use first-party server-side attribution joined to Stripe; the revenue attribution feature and the track ChatGPT traffic overview walk through the mechanic.
A quick mistake-to-correction reference:
| Mistake | Symptom | Correction |
| --- | --- | --- |
| One mode only | Confused, contradictory results | Test browsing on AND off |
| Assume content quality | Endless rewrites, no movement | Run the flowchart first |
| Ignore robots.txt | Missed the obvious block | Check robots.txt first |
| Anecdotal proof | False confidence | Citation monitoring + revenue join |
| Expect fast corpus fixes | Frustration at "no results" | Separate fast vs slow levers |
| Invented language | Semantically invisible | Match buyer phrasing |
| Head-on with incumbent | Wasted effort | Own alternatives/long-tail |
| Measure in GA4 | Wrong conclusions | First-party + Stripe join |
What this looks like inside Attrifast
A short, honest note on the product, because the article should not pretend the author is disinterested. Attrifast does not "do GEO" — it does not write your schema, build your Wikidata entry, or post in subreddits for you. What it does is the measurement layer underneath all eight fixes: it detects AI-sourced sessions server-side (ChatGPT, Perplexity, Claude, Gemini, Copilot), persists them first-party without a cookie or consent banner, and joins each session to revenue on the Stripe checkout.session.completed webhook.
The practical value for this specific problem is that it turns "did my GEO fix work" from a vibe into a line on a chart. You ship a fix in week one, and over the following weeks you watch whether AI-attributed sessions and AI-attributed revenue actually move, broken out from the Direct/(none) junk drawer where GA4 hides them. Cost is $29/mo, the tracking script is 4 KB and cookieless, and the Stripe connection is OAuth, not an API key. The pricing and setup live at attrifast.com, and the track ChatGPT traffic page covers the detection mechanic end to end.
The first-person reason I built it: I was the founder staring at a Direct bucket climbing past 30%, unable to tell whether my AI-citation work was paying off or whether I was fooling myself with a single lucky ChatGPT mention. I had no revenue signal, so I had no idea. The product is the signal I wished I had.
Limitations
Five things this article does not claim, so you do not over-extrapolate.
No fix forces OpenAI to retrain on your existence. The corpus levers (Reasons 1, 2, 3-Wikipedia, 8) influence the next training pass on OpenAI's schedule. There is no button, no form, and no vendor who can make it happen sooner. Anyone claiming otherwise is selling something.
The corpus-versus-retrieval split is a useful model, not the literal internal architecture. OpenAI does not publish how recommendation, recall, and live retrieval interact. The two-mode framing is a practical approximation that holds up in testing; it is not a leaked spec.
The RPV and conversion numbers are Q1 2026 aggregates, not guarantees. The 1.4-2.1x ChatGPT-vs-Google-organic multiplier and the ~$0.84 median RPV are measured across a specific set of mostly-SaaS sites and will drift as ChatGPT's user base broadens. Treat them as directional.
Some reasons interact in ways this linear ranking flattens. A post-cutoff launch with a blocked crawler and a thin entity is one situation, not three independent ones; the flowchart helps, but real diagnoses are often "several at once."
This is consumer-surface ChatGPT. ChatGPT Enterprise, custom GPTs, and API-based deployments behave differently and are out of scope. The advice targets the public chat and search surfaces most SMB buyers use.
FAQ
Why doesn't ChatGPT mention my brand when people ask for recommendations?
Usually one of three things, in order of likelihood: your company launched after the model's training cutoff, so it was never in the corpus; you have no authoritative third-party mentions (press, listicles, Reddit, Wikipedia) for the model to learn from; or you blocked GPTBot in robots.txt and removed yourself from the training crawl. The fast fixes are retrieval-side (schema, allowing crawlers, comparison content). The slow fix is corpus-side and only lands with the next model retrain, which you do not control.
How do I check whether ChatGPT even knows my company exists?
Run three prompts with browsing turned off: ask it to describe your company by exact name, ask it to list tools in your category, and ask it who competes with a named competitor of yours. If it hallucinates your description, omits you from the category list, and never names you against competitors, you have a training-corpus absence problem, not a ranking problem. Then repeat with browsing on; if you appear only with browsing on, the issue is corpus presence, and retrieval is carrying you.
Is blocking GPTBot the reason ChatGPT ignores my product?
It can be a major reason. GPTBot is OpenAI's training crawler and it respects robots.txt. If you disallowed it (many sites did in 2023-2024 over content-rights concerns), you removed yourself from future training corpora, which slowly erodes how often the model recommends you without browsing. Blocking it does not block ChatGPT-User, the live-fetch agent, so you can still be retrieved on demand, but you lose the baked-in recommendation that drives unprompted mentions. Check robots.txt first; it is a five-minute diagnosis.
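The robots.txt check can be scripted with Python's standard-library parser. This is a small sketch: the `crawler_access` helper and the sample file are made up for illustration, while the two user-agent tokens are the ones named above.

```python
from urllib.robotparser import RobotFileParser


def crawler_access(robots_txt: str, url: str, agents=("GPTBot", "ChatGPT-User")) -> dict:
    """Report, per user agent, whether robots.txt allows fetching the given URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# Hypothetical robots.txt that blocks the training crawler but nothing else.
sample = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

print(crawler_access(sample, "https://example.com/pricing"))
```

For this sample, GPTBot is reported as blocked and ChatGPT-User as allowed, which is the "retrievable on demand but absent from future training crawls" situation described above. To test a live site, call `set_url()` and `read()` on the parser instead of `parse()`.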
How long does it take to get ChatGPT to start recommending my product?
It depends entirely on which lever you pull. Retrieval-side fixes — allowing GPTBot and ChatGPT-User, shipping FAQ and Organization schema, publishing comparison content, fixing your category language — can influence browsing-mode answers within days to a few weeks as the live index refreshes. Corpus-side fixes — getting into Wikipedia, accumulating Reddit and press mentions, earning listicle placements — only fully land when OpenAI retrains and ships a new model, which happens on their schedule, not yours. Plan for weeks on retrieval, quarters on corpus.
Does getting cited by ChatGPT actually drive revenue, or is it a vanity metric?
It drives revenue, but you cannot see it in default GA4 because ChatGPT strips the Referer header and the clicks land in Direct/(none). Across the sites I measure, ChatGPT-attributed sessions convert at 1.4-2.1x the rate of equivalent Google organic on B2B SaaS, median revenue per visitor around $0.84 versus $0.51. The only way to prove a GEO fix worked is server-side first-party attribution joined to Stripe, so you can watch AI-attributed revenue move after the change. Citation-monitoring tools tell you that you are mentioned; they do not tell you it paid.
Why does ChatGPT recommend my competitors but not me?
Because they dominate the citation graph your category is built on. Competitors that appear in the listicles, Reddit threads, Wikipedia entries, and review-site roundups for your category get learned as the canonical answers. The model recommends what it has seen recommended. The fix is not to out-optimize a single page; it is to get yourself into the same third-party sources that name them — the comparison posts, the "best X tools" roundups, the subreddit recommendation threads — so the next training pass and the live retrieval both see you alongside them.
ChatGPT describes my company incorrectly. How do I fix the hallucination?
A hallucinated description almost always means a weak entity: the model is filling gaps because it lacks a clean, authoritative source for what you are. The fix is entity-side — build your Wikidata item, tighten your Organization JSON-LD with an accurate description and a complete sameAs array, and make sure your homepage and About page state plainly what you do in the buyer's language. As those signals enter retrieval and the next corpus, the description stabilizes. There is no direct "correct this fact" channel to OpenAI; you fix it by improving the source data.
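As a rough illustration of the Organization JSON-LD piece, the sketch below builds the object and prints it wrapped in a script tag. The company name, URLs, description, and the Wikidata identifier are placeholders; `name`, `url`, `description`, and `sameAs` are standard schema.org Organization properties.

```python
import json


def organization_jsonld(name: str, url: str, description: str, same_as: list) -> dict:
    """Build a schema.org Organization object with a sameAs array."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "description": description,
        "sameAs": list(same_as),
    }


# Placeholder values only; substitute your real profiles and Wikidata item.
doc = organization_jsonld(
    name="ExampleCo",
    url="https://example.com",
    description="ExampleCo tracks ChatGPT traffic and ties it to revenue.",
    same_as=[
        "https://www.wikidata.org/wiki/Q00000000",  # placeholder item id
        "https://www.linkedin.com/company/exampleco",
    ],
)

print(f'<script type="application/ld+json">\n{json.dumps(doc, indent=2)}\n</script>')
```

Keep the description in the buyer's language and identical in substance to what the homepage and About page say, so every source tells the same story.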
Should I create FAQ and Organization schema to get recommended?
Yes — it is a fast, free, retrieval-side move with no downside, and it directly helps Reasons 3 and 7. FAQ schema with 4+ question-answer pairs matching your visible H2s gives the model pre-extracted, query-shaped answers; Organization schema with a clean sameAs array is the entity bridge. Schema will not overcome a training-cutoff absence (Reason 1) or a blocked crawler (Reason 4), so ship it as part of the stack, not as a silver bullet. The full schema bundle is in how to get cited by AI engines.
Does llms.txt help ChatGPT recommend me?
Marginally, and not the way most people hope. llms.txt is a curated index of your most relevant pages that some AI crawlers read; it is a low-cost, low-certainty bet that helps retrieval find your best pages, not a ranking or recommendation lever. Adoption among crawlers is low, so the upside is small, but the cost is close to zero. Spend 30 minutes on it and move on; do not let it distract from the higher-leverage fixes (unblocking crawlers, comparison content, entity graph) that move recommendation more.
How do I know if it is a corpus problem or a retrieval problem?
Run the same query in browsing-off and browsing-on modes. If the model knows and recommends you in both, you do not have a visibility problem for that query. If it knows you only with browsing on, you have a corpus gap and retrieval is carrying you: keep retrieval reliable and grind corpus signals. If it fails in both modes, you likely have a stacked problem: corpus absence plus a retrieval blocker (blocked crawler or thin/uncrawlable content). The two-mode test is the single most informative diagnostic on this whole list.
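The two-mode test reduces to a small decision function. A minimal sketch follows; the function name and the wording of the results are illustrative, and the "named with browsing off but not on" branch is an added assumption, since the answer above does not cover that case.

```python
def diagnose_two_mode(named_browsing_off: bool, named_browsing_on: bool) -> str:
    """Map the two-mode test result for one query to a likely diagnosis."""
    if named_browsing_off and named_browsing_on:
        return "no visibility gap for this query"
    if named_browsing_on:
        return "corpus gap: retrieval is carrying you"
    if named_browsing_off:
        # Assumption: not covered in the text; treated here as a retrieval-side gap.
        return "retrieval gap: known to the model but not surfaced by live search"
    return "stacked: corpus absence plus a retrieval blocker"


for off, on in [(True, True), (False, True), (False, False)]:
    print(off, on, "->", diagnose_two_mode(off, on))
```

Run it per tracked query instead of once per brand, because the answer can differ between a head term and a long-tail alternative query.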
My traffic from ChatGPT went up but revenue didn't. What's wrong?
Two likely causes. First, the AI traffic may be low-intent for your offer — informational queries that bring readers, not buyers — in which case the fix is landing-page and offer-fit, not more citations. Second, your attribution may be misjoining: confirm the AI-sourced sessions are actually being tied to Stripe checkouts via the webhook, not lost at the join. The diagnostic is to segment AI-attributed sessions by landing page and look at conversion rate per page; a uniformly low rate points to intent/offer fit, while a broken join shows up as sessions with no downstream revenue at all.
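The segmentation step can be sketched as follows, assuming each stored AI-attributed session is a record with a `landing_page` and a `paid` flag set by the Stripe join. Those field names, the sample data, and the two-way verdict are illustrative simplifications.

```python
from collections import defaultdict


def conversion_by_landing_page(sessions) -> dict:
    """Return {landing_page: conversion rate} for AI-attributed sessions."""
    totals = defaultdict(lambda: [0, 0])  # page -> [session count, paid count]
    for s in sessions:
        totals[s["landing_page"]][0] += 1
        totals[s["landing_page"]][1] += int(s["paid"])
    return {page: paid / count for page, (count, paid) in totals.items()}


def likely_cause(rates: dict) -> str:
    """Zero revenue on every page suggests a broken join, not weak intent."""
    if rates and all(rate == 0 for rate in rates.values()):
        return "check the join: no AI session has any downstream revenue"
    return "check intent/offer fit on the low-converting pages"


# Hypothetical AI-attributed sessions.
sample_sessions = [
    {"landing_page": "/blog/what-is-geo", "paid": False},
    {"landing_page": "/blog/what-is-geo", "paid": False},
    {"landing_page": "/pricing", "paid": True},
    {"landing_page": "/pricing", "paid": False},
]

rates = conversion_by_landing_page(sample_sessions)
print(rates)
print(likely_cause(rates))
```

In this sample the blog page converts at zero and the pricing page at one in two, so the join is working and the question becomes intent and offer fit on the blog traffic.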
Can I pay to get ChatGPT to recommend my product?
No, not in the organic recommendation surface, and you should be wary of anyone who says otherwise. There is no pay-to-rank channel for ChatGPT's organic recommendations, no way to buy corpus inclusion, and paid Wikipedia creation violates the platform's rules and gets deleted. OpenAI has experimented with separate ad/commerce surfaces, but those are distinct from the model's organic recommendations. The durable path is the unglamorous one: earn authoritative mentions, build a clean entity, get into honest comparison content, and be genuinely useful in communities.
How often should I re-run this diagnosis?
Quarterly for the corpus reasons, monthly for the retrieval reasons. Corpus presence (Reason 1) only changes when OpenAI ships a new model, so re-test browsing-off recall each quarter and whenever a major model release lands. Retrieval reasons (4, 5, 7) can move within weeks of a fix, so re-check those monthly and pair every check with your first-party AI-revenue line so you are measuring impact, not just presence. Set a recurring 30-minute slot; the eight tests in the flowchart take half an hour.