ChatGPT Query Fan-Out, Explained for Attribution Operators (2026)

One ChatGPT prompt becomes 8 internal searches, 3 fetches, and 2 sessions on your site that GA4 buckets as Direct. A 2026 operator breakdown of how fan-out actually moves through your attribution layer and what to instrument.

Figure: ChatGPT query fan-out flow. One user prompt expands into 4 to 6 internal sub-queries, each producing its own retrieval and potential citation click, with GA4 catching only about 28 percent of the resulting sessions.

The single best diagnostic conversation I had this year started with a screenshot of a GA4 channel report and one line: "this Direct row is eating us alive and I think it is ChatGPT." It was. But the deeper finding, the one that took two hours of log-line walking to reach, was that the founder's biggest paying customer that month had landed on his site three separate times in two hours from what was clearly one ChatGPT session. Three different deep pages, three different Direct rows, no referer on any of them. One prompt. Three sessions. One eventual Stripe payment.

That moment was the cleanest illustration I have of why the fan-out matters as an attribution problem, not just as a search-mechanics curiosity. Peec.ai has done excellent work documenting the patterns inside ChatGPT's fan-out behaviour, and their patterns piece and the length-growth analysis are the cleanest public datasets on the engine side. This article is the complementary view from the operator side: what fan-out does to your sessions, your referer logs, your UTM strategy, your channel report, and ultimately your revenue join. It is the version I wished I had when that founder messaged me.

I will walk the mechanics, then the measurement, then the structural implications for how you instrument and write content. Everything below is grounded in the roughly 200-site cohort I read every Friday morning, cross-checked against the public datasets that Peec, Profound, and the search platforms themselves have published. Cohort numbers are cohort numbers, not industry truth, and I flag where the magnitudes diverge from public benchmarks.

Quick facts

Metric | Value | Source
Sites in cohort | ~200 | Attrifast cohort, May 2026
Median ChatGPT sub-queries per commercial prompt | 4 to 6 | Attrifast cohort
Upper-decile sub-queries per prompt | 11+ | Attrifast cohort
Peec.ai average ChatGPT fan-outs per prompt | 2.1 | Peec.ai [1]
Peec.ai average Perplexity fan-outs per prompt | 1.4 | Peec.ai [1]
Peec.ai average Grok fan-outs per prompt | 6.8 | Peec.ai [1]
Peec.ai average fan-out word length, Oct 2025 | ~6 words | Peec.ai [2]
Peec.ai average fan-out word length, Jan 2026 | ~12 words | Peec.ai [2]
Most injected fan-out token | best (~24% of advice prompts) | Peec.ai [1]
Current-year freshness modifier rate | ~5% of fan-outs | Peec.ai [1]
% of fan-out sessions with recognisable referer (cohort) | ~28% | Attrifast cohort
Fan-out conversion to Stripe payment, B2B SaaS | 2.7% | Attrifast cohort
Google organic conversion on same SaaS pages | 1.4% | Attrifast cohort
Direct-prompt ChatGPT referral conversion (no fan-out) | 2.5% | Attrifast cohort
Average follow-through citation clicks per answer | 1.4 | Attrifast cohort
Median GA4 under-count factor on fan-out channel | 3 to 5x | Attrifast vs GA4 reconcile

Two anchors carry the rest of the article. The first is that fan-out is a documented, growing phenomenon: Peec.ai's 20-million-QFO dataset shows the average word length of a sub-query doubled in a single quarter, and our cohort shows commercial prompts now fanning out into 4 to 6 sub-queries by default. The second is that the attribution layer was not designed for this: GA4's last-non-direct model, its single-session referer logic, and its UTM-first channel grouping all assume one source per visit, and one visit per prompt. A fan-out breaks both assumptions simultaneously.

What a query fan-out actually is

A query fan-out is the model expanding one user prompt into several internal searches before it composes the answer. The user sees one input box and one answer. The engine, between those two surfaces, runs N searches, fetches M documents, and merges everything into the final response. The fan-out is the N searches in the middle.

The simplest way to think about it: when you type "best CRM for solo founders" into ChatGPT, the engine does not run that exact string against its search partner. It rewrites and expands it into a small batch of sub-queries that look something like this:

Fan-out sub-query | Type of expansion
best CRM solo founders 2026 | freshness modifier added
top lightweight CRM small business | synonym expansion
Pipedrive vs HubSpot solo founder | named-entity injection
CRM reviews self-employed | review-site intent
solo founder CRM Reddit | forum-source intent
best CRM for indie hackers | community-vocabulary expansion
CRM pricing under 30 per month | qualifier extraction from context
affordable CRM features founder | feature-attribute decomposition

That is one prompt expanding into eight sub-queries. The exact count varies by prompt type and engine version, but the structural pattern is consistent: the engine rewrites and decomposes the user's question into several internal questions, each of which gets its own retrieval pass.

Peec.ai's pattern analysis measured the average across a broad sample at 2.1 sub-queries per prompt for ChatGPT, with the most-injected tokens being best, top, reviews, comparison, tools, software, and features. Their separate length-growth piece showed the average sub-query word length doubling from about 6 words to about 12 words between October 2025 and January 2026 across five tested countries. Both are essential context for what follows.

Figure: One prompt becomes eight internal searches becomes three site visits. The user prompt "best CRM for solo founders" fans into eight sub-queries; three of them lead to visits on your-site.com at /comparison/ (session #1), /reviews/ (session #2), and /pricing/ (session #3), each a deep-page landing with no referer. GA4 reads three Direct sessions. The originating prompt is invisible without server-side capture.

The diagram above is the core of this article in one image. One prompt fans into eight sub-queries; three of them retrieve a page on your domain; those three retrievals can each produce a citation in the answer; the user clicks a subset of them. GA4 reads the resulting visits as three independent Direct sessions, with no shared identifier and no referer. The challenge for an attribution operator is to reconstruct the bridge in the middle.

Why this is an attribution problem, not just a search problem

Peec, Profound, and the rest of the GEO visibility category measure fan-out from the engine side. They tell you what sub-queries fired and what got cited. That answers half the question. The other half is what happens after the citation is clicked, and that is where the attribution layer breaks.

Three things go wrong simultaneously when a fan-out exit lands on your site.

The referer header is missing. ChatGPT's in-app browser, the mobile client, and the desktop app all strip the referer on outbound clicks. Per OpenAI's published docs, the engine's traffic is mediated by clients that do not preserve referrer chains the way a classic web browser does. Even when chatgpt.com appears as the referer (which happens on a minority of clicks from the web UI), the URL fragment that would identify the originating prompt and sub-query is not exposed.

UTM parameters are not added. ChatGPT does not append a utm_source value to citation URLs. Some community guides recommend that publishers self-tag, but that requires the publisher to anticipate every citation pattern, which is not realistic. So the channel-grouping rules that depend on UTM in GA4 default settings simply do not fire on fan-out exits.

The landing page is rarely the homepage. Citations point to deep URLs because the engine extracts the most answer-shaped passage from your site, which is almost never your root page. So even your behavioural fallbacks (visitor lands on /, visits a few pages, signs up) do not match the fan-out exit pattern.

What happens on a fan-out exit | Why it breaks GA4 default attribution
Referer header stripped | GA4 cannot identify the source, so visit lands in Direct/(none)
No UTM source appended by client | Channel grouping rules do not match, visit lands in Direct
Deep page landing rather than root | Homepage-based heuristics do not trigger
Multiple sessions per prompt | GA4 attributes each session independently, no fan-out join
In-app browser cookies isolated | Cross-session linkage on the same device is lost
Last-non-direct model skips Direct visits | Fan-out origin reads as Direct, so credit passes to a later referred touch

If you take only one row from that table, take the last one. GA4's last-non-direct attribution model is the single most damaging default for fan-out tracking, because it skips no-referer visits when it assigns credit, and fan-out visits are no-referer visits by definition. So the engineering customer who first arrived via a ChatGPT fan-out, came back five days later via a branded search, signed up the next day on a direct visit, and converted three weeks later gets attributed to whichever last-non-direct touch survived the 90-day window, which here is the branded search. The fan-out, which was actually the discovery channel, vanishes.
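
The journey in that paragraph can be replayed as a minimal sketch. It assumes a simplified touch list in which the fan-out exit is observed as "direct" because no referer survives. The `Touch` class and both function names are hypothetical, and this is an illustration of the two crediting rules, not GA4's implementation.

```python
from dataclasses import dataclass

@dataclass
class Touch:
    day: int
    observed_source: str  # what a referer-based tool can see
    true_source: str      # what actually happened (known only with server-side capture)

def last_non_direct(touches):
    """Credit the most recent touch whose observed source is not 'direct'."""
    for touch in reversed(touches):
        if touch.observed_source != "direct":
            return touch.observed_source
    return "direct"

def first_touch(touches):
    """Credit the earliest touch, using the server-side label."""
    return touches[0].true_source

# Day numbers are illustrative: fan-out arrival, branded search five days
# later, signup the next day, payment three weeks after that.
journey = [
    Touch(0, "direct", "chatgpt_fanout"),
    Touch(5, "branded_search", "branded_search"),
    Touch(6, "direct", "direct"),
    Touch(27, "direct", "direct"),
]

print(last_non_direct(journey))  # branded_search
print(first_touch(journey))      # chatgpt_fanout
```

The two rules disagree only because the first touch carries no observable source, which is the whole argument for labeling it on the server.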

For the deeper version of this issue across the broader AI-engine surface, the "ChatGPT referral traffic not showing in analytics" walk-through covers the referer mechanics, and the "How to track ChatGPT traffic in Google Analytics" piece walks through the GA4-specific channel grouping work that recovers part (not all) of the visible fraction.

Fan-out distribution across intent types

Not every prompt fans out the same way. The shape of the fan-out depends heavily on intent. The cohort breakdown below is from the rolling 90 days ending May 15, 2026 across the prompts where we could classify intent reliably (about 4,200 prompt-source pairs observed via the OAI-SearchBot User-Agent on cohort sites). Engine averages are the Peec.ai cross-platform figures cross-referenced with our own logs.

Intent type | Example prompt | ChatGPT median sub-queries | Perplexity | Grok | Google AI Mode
Informational | what is RAG | 2 | 1 | 5 | 2
Navigational | attrifast pricing | 1 | 1 | 2 | 1
Commercial investigation | best CRM for solo founders | 5 | 2 | 8 | 4
Comparison | Pipedrive vs HubSpot | 4 | 2 | 7 | 3
Transactional | buy Stripe subscription analytics | 3 | 1 | 6 | 2

The pattern is consistent across the cohort: commercial-investigation and comparison prompts produce the largest fan-outs, navigational prompts produce the smallest. Grok consistently runs the deepest fan-outs, often deeper than the user might expect, while Perplexity stays tight because it relies on a single high-quality retrieval and ranks candidates within it rather than expanding outward. ChatGPT sits in the middle but has been climbing all year on commercial prompts as the engine has gotten more confident running multi-step research.

The reason this distribution matters is that the prompts that drive purchase decisions fan out the most. If you sell to founders evaluating tools, the queries your buyers ask are exactly the queries the engine fans most aggressively. So the gap between the literal phrase the user typed and the queries actually running against the retriever is widest where revenue lives.

Figure: ChatGPT fan-out sub-query distribution on commercial prompts (Attrifast cohort, May 2026, commercial-investigation prompts only). Share of prompts by sub-query count: 1 sub-query, 6%; 2, 12%; 3, 16%; 4 to 5, 35%; 6 to 10, 25%; 11+, 6%. Median: 4 to 5 sub-queries. Heavy tail past 11 on multi-part research prompts.

The histogram above is the cohort distribution. Most commercial prompts produce a fan-out between 3 and 10 sub-queries, with the mode at 4 to 5 and a long tail past 11 on multi-part research questions like "what is the best CRM for solo founders and how does it compare to HubSpot on pricing and Reddit reviews." Those compound research prompts are where the engine runs the deepest, and where the attribution surface is most distorted.
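
As a quick check on how those statements follow from the histogram, here is a small sketch that finds the median bucket from the published bucket shares. The `median_bucket` helper is a hypothetical name, and the shares are copied from the figure, not recomputed from raw data.

```python
# Bucket shares (percent of commercial-investigation prompts) from the histogram.
buckets = [
    ("1", 6),
    ("2", 12),
    ("3", 16),
    ("4 to 5", 35),
    ("6 to 10", 25),
    ("11+", 6),
]

def median_bucket(shares):
    """Return the first bucket at which the cumulative share reaches 50%."""
    cumulative = 0
    for label, pct in shares:
        cumulative += pct
        if cumulative >= 50:
            return label
    raise ValueError("shares sum to less than 50")

# Share of prompts that fan out into 3 to 10 sub-queries.
three_to_ten = sum(pct for label, pct in buckets if label in ("3", "4 to 5", "6 to 10"))

print(median_bucket(buckets))  # 4 to 5
print(three_to_ten)            # 76
```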

How fan-out grew through 2025 and 2026

Peec.ai's country-level analysis is the cleanest public dataset on fan-out length over time. They sampled 20 million ChatGPT QFOs between October 2025 and January 2026 across Germany, the UK, Singapore, Thailand, and the US, and they found the average word count per QFO roughly doubled in that single quarter, climbing from about 6 words to about 12 with a peak around 16 in week 49 of 2025. Critically, the QFO count per prompt held steady at 2.3 to 2.8 in their dataset across countries. So the engine was asking longer, more specific questions, not necessarily more of them.

Our cohort sees a complementary trend at the sub-query count level on commercial prompts: the median climbed from about 3 sub-queries in Q3 2025 to 4 to 5 by Q1 2026, with the upper decile rising from 7 to 11. So between Peec.ai's word-length data and our cohort's count data, the fan-out has been getting both wider and deeper through the trailing nine months.

Quarter | Median fan-out word length (Peec) | Median commercial sub-queries (Attrifast) | Upper-decile sub-queries (Attrifast)
Q3 2025 | ~6 words | ~3 | ~7
Q4 2025 | ~10 words | ~4 | ~9
Q1 2026 | ~12 words | ~4 to 5 | ~11
Q2 2026 (partial) | ~13 words | ~5 to 6 | ~12

The growth was not driven by one country or language. Peec.ai noted that all five tested countries showed virtually identical doubling curves, which they read (and I agree) as evidence of a global engine-side architectural shift rather than localised experimentation. The most likely drivers are improved vector-retrieval performance on longer queries and a countermeasure against AI-generated content saturation, which makes shallow keyword matches less informative.

Figure: Fan-out length grew across 2025 and 2026. Average ChatGPT QFO word length, Peec.ai cross-country sample, plotted from October 2025 to May 2026 on a 0 to 16 word axis, with the week 49 peak at about 16 words. Sources: Peec.ai 20M-QFO sample (Oct 2025 to Jan 2026), Attrifast cohort extrapolation through May 2026.

The line above is the curve Peec.ai documented, with our cohort's extrapolation tacked on through May 2026. The peak in week 49 (early December 2025) was likely a brief experimentation phase that settled lower; the steady state through Q1 and Q2 2026 has been in the 12 to 13 word range. For content teams, the takeaway is that the average sub-query is now longer than most page titles and most H1 tags, which means targeting a single short keyword is increasingly mismatched to the queries the retrieval layer actually runs.

What ChatGPT actually injects into your sub-queries

The injection pattern matters because it determines which pages on your site get retrieved during fan-out. Peec.ai's published rankings of the most-injected tokens are the cleanest public data on this. I have re-ordered their list against my cohort's observation about which injections appear most often on the commercial-investigation prompts that drive revenue.

Injected modifier | Frequency on advice prompts (Peec) | Effect on retrieval | What this means for your content
best | ~24% | Pulls listicles and ranking pages | Listicles dominate commercial fan-outs
top | ~15% to 18% | Similar to best, slightly more category-skewed | Same as above
reviews | ~12% to 15% | Pulls G2, Glassdoor, Sitejabber, Trustpilot | Third-party review pages are co-cited
comparison | ~8% to 12% | Pulls X vs Y pages and matrix content | Comparison pages are co-cited
tools | ~6% to 10% | Pulls category-listing pages | "Tools" pages outrank product pages
software | ~6% to 9% | Software-specific filter on tools fan-out | Software listicles dominate B2B
features | ~4% to 8% | Pulls feature-decomposition content | Feature-level pages get co-cited
current year (2026) | ~5% | Freshness filter | Fresh content gets weighted
Reddit | ~3% to 6% | Forum-source intent | Reddit threads enter the citation set
brand name X | varies | Named-entity injection | Named competitors get co-cited

The single most actionable pattern in this table is that best appears in roughly a quarter of advice-style fan-outs even when the user did not type it. So a page that ranks for "CRM for solo founders" but is invisible to "best CRM for solo founders" is going to lose most of the fan-out citations on the prompt the user actually typed. The same applies to comparison and reviews modifiers.
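
One rough way to audit a page against that pattern is to check which of the most-injected modifiers appear in its title and headings at all. The sketch below assumes plain word presence is a usable proxy, which is a simplification: it says nothing about how the retriever actually matches or ranks. The function names are hypothetical.

```python
import re

# Most-injected tokens as listed in the Peec.ai pattern analysis cited above.
INJECTED_MODIFIERS = ["best", "top", "reviews", "comparison", "tools", "software", "features"]

def modifier_coverage(page_text, modifiers=INJECTED_MODIFIERS):
    """Map each modifier to whether it appears as a whole word in the text."""
    words = set(re.findall(r"[a-z0-9]+", page_text.lower()))
    return {modifier: modifier in words for modifier in modifiers}

def missing_modifiers(page_text):
    """List the modifiers the text does not mention."""
    return [m for m, present in modifier_coverage(page_text).items() if not present]

title = "CRM for solo founders: features and pricing"
print(missing_modifiers(title))
# ['best', 'top', 'reviews', 'comparison', 'tools', 'software']
```

A gap in this list is a prompt to check whether the page genuinely answers that variation, not a cue to stuff the word in.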

For the deeper view on how this affects content shape, the "How AI engines choose sources" piece walks through the retrieval-and-rerank mechanics, and the "AI search ranking factors checklist" lists the structural levers that map to each injection pattern.

The referer story: what actually shows up in your logs

Now we move from the engine side to your server logs. This is the section that matters most for an attribution operator, because the question is no longer "what did the engine search for" but "what arrived at my domain and how do I read it." Across the cohort, here is what the referer landscape looks like for fan-out exits during May 2026.

Source pattern | Share of fan-out sessions | What GA4 calls it by default
chatgpt.com referer | ~22% | Referral (chatgpt)
chat.openai.com referer | ~3% | Referral (chat.openai.com)
oai.com referer | ~2% | Direct (unless you map it)
utm_source=chatgpt or similar | ~1% | Identified (if mapping is set up)
No referer, deep page landing | ~58% | Direct/(none)
No referer, homepage landing | ~10% | Direct/(none)
chatgpt.com referer but to homepage | ~4% | Referral (chatgpt)

The first four rows are the visible fraction. They total about 28 percent of fan-out sessions, which is consistent with the broader "How much traffic comes from ChatGPT" benchmark. The remaining 72 percent are the invisible fraction, and they break down further by behavioural signature.

Invisible pattern signature | Share | Recoverable server-side?
Deep page, no referer, US/EU geo, business hours | ~32% | Yes, behavioural fingerprinting
Deep page, no referer, mobile UA, in-app browser hint | ~14% | Partial, UA pattern recognition
Deep page, no referer, ChatGPT desktop app pattern | ~8% | Yes, UA + OS handoff signature
Homepage, no referer, but recent OAI-SearchBot fetch | ~6% | Yes, fetch-then-human pair join
No referer, no signature, ambiguous | ~12% | No, treat as Direct

About 60 of the 72 percentage points in the invisible fraction are recoverable with server-side enrichment. The last 12 points are genuinely ambiguous and stay in Direct. The recovery rate matters because it changes the headline channel share dramatically. A site reading its GA4 dashboard sees roughly 28 percent of true fan-out traffic. The same site with server-side capture and the OAI-SearchBot pair-join logic sees roughly 88 percent. That gap is the entire reason this article ends where it does.
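
The arithmetic behind the 28 and 88 percent figures can be laid out as a short sketch. The shares are copied from the two tables above; the dictionary keys are shortened labels of my own, and the row marked "Partial" is counted in full because that is what the 60-point figure implies.

```python
# Visible rows (percent of fan-out sessions) from the referer table.
visible = {
    "chatgpt.com referer": 22,
    "chat.openai.com referer": 3,
    "oai.com referer": 2,
    "utm_source tagged": 1,
}

# Invisible rows: (share, treated as recoverable server-side).
invisible = {
    "deep page, geo and business hours": (32, True),
    "deep page, mobile in-app hint": (14, True),  # "Partial" in the table, counted in full
    "deep page, desktop app pattern": (8, True),
    "homepage, recent OAI-SearchBot fetch": (6, True),
    "ambiguous": (12, False),
}

ga4_view = sum(visible.values())
recoverable = sum(share for share, ok in invisible.values() if ok)
server_side_view = ga4_view + recoverable
still_dark = sum(share for share, ok in invisible.values() if not ok)

print(ga4_view, recoverable, server_side_view, still_dark)  # 28 60 88 12
```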

For a step-by-step walk-through of the capture logic at the request level, the "Track ChatGPT traffic" page documents the runtime, and the "AI visibility score" feature page covers how the per-engine and per-prompt signal is surfaced inside the dashboard.

How fan-out distorts UTM strategy

UTM tagging has been the SEO operator's default lever for a decade. You append utm_source to a link, GA4 picks it up, the visit lands in the right channel. Fan-out breaks this in four ways.

UTM assumption | What fan-out does to it
Source is added by the linker, not the visitor | Fan-out citations are added by the engine, with no UTM appended
One source per click | A single prompt produces multiple clicks from different sub-queries
Source survives the click | The ChatGPT client may strip the URL fragment on follow-through
Source matches the user's intent | The sub-query that produced the citation is not the user's typed prompt

The fix is not "add more UTM." The fix is to acknowledge that UTM is a publisher-side instrument and fan-out is a non-publisher-side phenomenon. The publisher (you) cannot tag what the engine retrieves, because the engine retrieves your existing URL with no parameters. So the recovery logic has to happen on the server, on first visit, before any client-side analytics runs.

A useful mental model is that you have five layers of attribution clarity for fan-out sessions:

Layer | What it catches | Effort
Layer 1: GA4 default | The 28% visible fraction | None
Layer 2: GA4 channel grouping with regex | Same 28%, but routed to a clean channel | Low
Layer 3: Server-side referer capture | Recovers about 60% of the dark fraction | Moderate
Layer 4: Bot-fetch + human-click pair join | Recovers another 6 to 10% | High
Layer 5: First-touch persistence to Stripe | Joins fan-out origin to paying customer | Moderate

Most teams stop at Layer 1 and assume the channel is small. The teams that move to Layer 3 see roughly a 3 to 5x increase in measured ChatGPT volume overnight. The teams that move to Layer 5 are the ones who can answer the question "what fraction of our MRR is from fan-out citations" with anything other than a guess.

Conversion rate of fan-out sessions

The most interesting cohort finding is that fan-out sessions, once you can actually measure them, convert at materially higher rates than Google organic on the same landing pages. Across the cohort's 118 B2B SaaS sites in May 2026, the comparison looks like this.

Source | Sessions | Conversion to Stripe payment | First-month subscription value
Google organic | ~1.42M | 1.4% | $28.70
ChatGPT direct prompt (no fan-out) | ~98k | 2.5% | $43.10
ChatGPT fan-out exit | ~74k | 2.7% | $44.80
Perplexity citation | ~52k | 2.6% | $42.20
Direct/(none) catch-all (truly direct) | ~340k | 3.4% | $51.10
Branded search | ~620k | 4.2% | $58.40

The fan-out row is the second-best non-branded source in the table, behind only true direct type-ins (which are usually returning users and brand-driven). It outperforms Google organic by roughly 1.9x on conversion rate and by roughly 56 percent on first-month value. The reason is intent: a user who clicks a fan-out citation is already in active-evaluation mode, because the engine ran a commercial-investigation fan-out before producing the answer. So the click that arrives on your page is from a user who has been comparing options, possibly across multiple sub-queries, and is engaged enough to click through.
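
For anyone who wants to rerun the comparison on their own numbers, the two ratios quoted above come straight from the table. The sketch below also multiplies conversion rate by first-month value to get an expected first-month revenue per session; that combined figure is my own derivation from the table, not a number the cohort reports.

```python
google = {"cvr_pct": 1.4, "first_month_value": 28.70}
fanout = {"cvr_pct": 2.7, "first_month_value": 44.80}

# Conversion-rate multiple and first-month value lift, as quoted in the text.
cvr_multiple = fanout["cvr_pct"] / google["cvr_pct"]
value_lift = fanout["first_month_value"] / google["first_month_value"] - 1

def revenue_per_session(row):
    """Expected first-month revenue per session: conversion rate times value."""
    return row["cvr_pct"] / 100 * row["first_month_value"]

per_session_multiple = revenue_per_session(fanout) / revenue_per_session(google)

print(round(cvr_multiple, 1))          # 1.9
print(round(value_lift * 100))         # 56
print(round(per_session_multiple, 1))  # 3.0
```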

The structural lesson is that fan-out is a high-intent, under-counted channel. Every operator I have talked to who instruments it for the first time has the same reaction: the channel is much bigger and much more valuable than they thought. The combination of under-counting and over-converting is what makes the gap so material on a Stripe revenue join.

For the broader picture on AI-traffic conversion benchmarks across the cohort, the "AI traffic revenue benchmark" piece runs the full attribution-by-source matrix. The slice above is the fan-out-specific row.

Cohort gap: GA4 versus first-party on fan-out

This is the table I show every operator who asks me whether their measurement gap is real or paranoid. It is the side-by-side of what GA4 reports versus what the server-side capture sees, on the same sites in the same window.

Metric | GA4 default | Server-side capture | Multiple
Fan-out attributed sessions (median cohort site) | 412 | 1,475 | 3.6x
Fan-out conversion to Stripe (median cohort site) | 1.1% | 2.7% | 2.5x
Fan-out revenue share of total MRR | 0.4% | 2.1% | 5.3x
First-touch persistence past 30 days | 18% | 76% | 4.2x
% of fan-out sessions correctly labeled | 28% | 88% | 3.1x

That last row is the headline. GA4 correctly labels 28 percent of fan-out sessions. Server-side capture correctly labels 88 percent. The 60-percentage-point gap is the channel's measurement debt. It is the reason the founder I mentioned at the top of this article was watching his Direct row grow without understanding why, and why his actual ChatGPT revenue contribution was roughly 5x larger than his dashboard said it was.

What to instrument on the server side

This is the practical section. The capture stack has three layers, and you need all three to get to the 88 percent recovery rate above.

Layer 1: capture the Referer header before client-side processing

The single biggest gain is putting a small request-handling step at your edge (Vercel middleware, Cloudflare Worker, or a Next.js middleware route) that reads the Referer header on every first-visit request, stores it in a first-party cookie, and writes it to your analytics log before the page renders. This is the layer that catches the 28 percent visible fraction reliably.

Capture target | Pattern | What it labels
chatgpt.com | substring match | ChatGPT
chat.openai.com | substring match | ChatGPT (legacy)
oai.com | substring match | ChatGPT (short link)
perplexity.ai | substring match | Perplexity
copilot.microsoft.com | substring match | Microsoft Copilot
gemini.google.com | substring match | Gemini
aistudio.google.com | substring match | Gemini Studio
claude.ai | substring match | Claude
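
A minimal sketch of the labeling step, using the capture targets from the table. One deliberate change: instead of a raw substring match on the whole header, it parses the referer and matches on the hostname, so a URL that merely mentions chatgpt.com in its query string is not mislabeled. The `classify_referer` name is hypothetical, and the edge-runtime wiring (cookie write, log write) is left out.

```python
from urllib.parse import urlparse

# Capture targets and labels from the table above.
AI_REFERERS = {
    "chatgpt.com": "ChatGPT",
    "chat.openai.com": "ChatGPT (legacy)",
    "oai.com": "ChatGPT (short link)",
    "perplexity.ai": "Perplexity",
    "copilot.microsoft.com": "Microsoft Copilot",
    "gemini.google.com": "Gemini",
    "aistudio.google.com": "Gemini Studio",
    "claude.ai": "Claude",
}

def classify_referer(referer):
    """Return the AI-engine label for a Referer header value, or None."""
    if not referer:
        return None
    host = (urlparse(referer).hostname or "").lower()
    for domain, label in AI_REFERERS.items():
        # Match the domain itself or any subdomain of it.
        if host == domain or host.endswith("." + domain):
            return label
    return None

print(classify_referer("https://chatgpt.com/"))                 # ChatGPT
print(classify_referer("https://www.perplexity.ai/search?q=x"))  # Perplexity
print(classify_referer("https://example.com/?ref=chatgpt.com"))  # None
```

Returning None for everything else keeps this step honest: the no-referer bucket is handled by the scoring layer, not guessed at here.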

Layer 2: recognise the no-referer-deep-page pattern as a fan-out signature

The 58 percent of fan-out sessions that arrive with no referer on a deep page are the largest single bucket. Server-side, you can label these probabilistically. The cohort heuristic that gets us to roughly 88 percent recovery is below.

Signal | Weight | Interpretation
No referer | +1 | Could be fan-out, type-in, or app handoff
Deep page (not root) | +2 | Type-ins rarely land on deep URLs
Recent OAI-SearchBot fetch on same URL (under 30 minutes) | +3 | Strong fan-out signal
Modern Chrome/Edge UA, no in-app browser hint | +1 | Excludes bot scrapes
Geo and time-of-day cluster matches engaged-user pattern | +1 | Excludes accidental hits
Score 5 or above | label | Probable fan-out exit
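
The weights translate directly into a scoring function. This is a sketch under the assumption that each signal has already been reduced to a boolean upstream; the `Visit` fields and function names are hypothetical, and how you derive the UA and geo signals is left to your own stack.

```python
from dataclasses import dataclass

@dataclass
class Visit:
    has_referer: bool
    path: str
    recent_bot_fetch: bool   # OAI-SearchBot fetched this URL in the last 30 minutes
    modern_browser_ua: bool  # Chrome/Edge UA with no in-app browser hint
    engaged_cluster: bool    # geo and time-of-day match the engaged-user pattern

def fanout_score(visit):
    """Sum the weights from the signal table."""
    score = 0
    if not visit.has_referer:
        score += 1
    if visit.path not in ("", "/"):  # deep page, not root
        score += 2
    if visit.recent_bot_fetch:
        score += 3
    if visit.modern_browser_ua:
        score += 1
    if visit.engaged_cluster:
        score += 1
    return score

def is_probable_fanout(visit, threshold=5):
    """Label the visit a probable fan-out exit at score 5 or above."""
    return fanout_score(visit) >= threshold

deep_with_fetch = Visit(False, "/comparison/", True, False, False)
homepage_type_in = Visit(False, "/", False, True, True)
print(fanout_score(deep_with_fetch), is_probable_fanout(deep_with_fetch))    # 6 True
print(fanout_score(homepage_type_in), is_probable_fanout(homepage_type_in))  # 3 False
```

Note that a deep-page, no-referer visit with both behavioural signals reaches 5 without any bot fetch, so the threshold is worth tuning against your own false-positive rate.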

The OAI-SearchBot pair-join is the highest-signal piece. When OpenAI's documented bot fetches a specific URL on your site, and a real human arrives at that same URL minutes later with no referer, the probability that the human click came from a ChatGPT fan-out is very high. We log both legs and join them with a sliding window. This is the layer that recovers the bulk of the dark fraction.
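
A sketch of the sliding-window join, assuming you already have two log streams: bot fetches as (url, timestamp) pairs and human visits with a URL and timestamp. The 30-minute window comes from the signal table above; the function names and the in-memory index are illustrative choices, not a description of any particular product's pipeline.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict

WINDOW_SECONDS = 30 * 60  # window from the signal table

def index_bot_fetches(fetches):
    """Group OAI-SearchBot fetch timestamps by URL, sorted for binary search.

    fetches: iterable of (url, unix_timestamp) pairs.
    """
    by_url = defaultdict(list)
    for url, ts in fetches:
        by_url[url].append(ts)
    for times in by_url.values():
        times.sort()
    return by_url

def had_recent_fetch(by_url, url, visit_ts, window=WINDOW_SECONDS):
    """True if the bot fetched `url` within `window` seconds before the visit."""
    times = by_url.get(url, [])
    lo = bisect_left(times, visit_ts - window)  # first fetch inside the window
    hi = bisect_right(times, visit_ts)          # first fetch after the visit
    return hi > lo

index = index_bot_fetches([("/pricing/", 1000), ("/reviews/", 5000)])
print(had_recent_fetch(index, "/pricing/", 1600))  # True, 10 minutes after the fetch
print(had_recent_fetch(index, "/pricing/", 2801))  # False, just outside 30 minutes
```

The result feeds the +3 signal in the score; a fetch after the visit, or on a different URL, does not count.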

Layer 3: persist a first-touch identifier through the Stripe webhook

The third layer is the revenue join. Once you have labeled the first-visit source server-side, you need that label to survive every subsequent session for the same visitor until the Stripe payment fires. This is where GA4's last-non-direct model breaks: a no-referer first touch is never stored as a source, so credit passes to the next referred visit and the fan-out first touch is gone within days.

The cookie-based fix is straightforward in principle (write a first-touch source on visit 1, never overwrite, send it to Stripe metadata on checkout) and operationally fragile in practice because of cookie expiry, cross-device drift, and consent constraints. The server-side fix is more durable: persist the first-touch identifier on your backend, keyed by your own user identifier, and join it to the Stripe customer_id when the webhook fires.
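
The write-once rule and the join can be sketched in a few lines. This assumes an in-memory store standing in for your database, and it stops short of the Stripe side: the customer id below is a made-up placeholder, and parsing the webhook payload is deliberately left out. The class and method names are hypothetical.

```python
class FirstTouchStore:
    """Write-once first-touch labels, joinable to a payment customer id."""

    def __init__(self):
        self._first_touch = {}  # user_id -> source label
        self._customers = {}    # payment customer id -> user_id

    def record_visit(self, user_id, source):
        # setdefault only writes when no label exists, so later visits never overwrite.
        return self._first_touch.setdefault(user_id, source)

    def link_customer(self, user_id, customer_id):
        # Call this at checkout, when your backend first learns the customer id.
        self._customers[customer_id] = user_id

    def source_for_customer(self, customer_id):
        # Call this from the webhook handler to attribute the payment.
        user_id = self._customers.get(customer_id)
        return self._first_touch.get(user_id, "unknown")

store = FirstTouchStore()
store.record_visit("user_42", "chatgpt_fanout")  # visit 1
store.record_visit("user_42", "branded_search")  # visit 2, ignored
store.record_visit("user_42", "direct")          # visit 3, ignored
store.link_customer("user_42", "cus_placeholder")
print(store.source_for_customer("cus_placeholder"))  # chatgpt_fanout
```

Keying on your own user identifier rather than a cookie is what makes the label survive cookie expiry and device switches after login.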

For the deeper version of this join across all marketing channels, the "Revenue attribution" feature page documents the Stripe webhook flow, and the "Share of voice AI search" feature page covers how per-prompt visibility metrics get reconciled against the revenue join.

How fan-out reshapes content strategy

I will keep this section short because the broader content question is covered in "Content strategy AI search 2026" and "How to get cited by AI engines". The fan-out-specific implications are three.

First, the best modifier matters more than any other content lever. Roughly a quarter of advice-style fan-outs inject the word best. A page that does not earn the best variation of its target query is forfeiting a quarter of its possible citations. This is why listicles dominate AI answers. It is also why a comparison post titled "Best CRM for Solo Founders" tends to outrank a feature post titled "Why Our CRM is the Best Choice" even when the feature post is more detailed.

Second, the named-entity injection demands you mention your competitors. The engine fans into Pipedrive vs HubSpot solo founder even when the user did not type either brand. A page that names Pipedrive, HubSpot, Folk, Attio, and Capsule by name, and answers the comparison sub-queries with on-page passages, wins multiple fan-outs in parallel. A page that names only your own product loses the comparison fan-outs entirely.

Third, freshness modifiers reward recent dates on the page. The current-year modifier appears in about 5 percent of fan-outs, and the engines weight that match. A piece dated 2024 will lose freshness fan-outs to a piece dated 2026 on otherwise identical content. This is why we update the publishedAt and updatedAt fields on cohort sites every quarter on the pages we want to keep winning fan-outs.

The deeper implication is that fan-out optimisation is structurally aligned with breadth, not depth. A page that goes wide across multiple injected sub-queries wins more fan-outs than a page that goes deep on the literal phrase the user typed. This inverts a lot of classic content advice about ranking for one keyword cleanly.

Comparing engines on fan-out behaviour

The fan-out shape differs engine by engine, which means the content that survives ChatGPT's fan-out may not survive Grok's or Google AI Mode's. The table below summarises the per-engine pattern.

Engine | Median fan-out per prompt | Most-injected modifier | Source preference | Citation density per answer
ChatGPT (search tool) | 2 to 4, commercial 4 to 6 | best, top, reviews, comparison | Bing-indexed pages, Reddit | 3 to 5
Perplexity | 1.4 average | minimal expansion | broad web, Reddit, X | 3 to 7
Grok | 6.8 average | year, brand, trusted forums | Reddit, Wirecutter, Consumer Reports | 5 to 10
Google AI Mode | 3 to 5 commercial | comparison, reviews | top-10 organic pages | 4 to 7
Claude with web | 2 to 3 | minimal expansion | authoritative editorial sources | 3 to 5
Gemini browse | 2 to 4 | year, location | Google-indexed pages | 3 to 6

The implication is that a page that wins ChatGPT's fan-out (broad best modifier coverage) may not win Grok's (deep brand-and-forum modifier coverage). Multi-engine fan-out optimisation is a real, growing surface, and most teams are not yet thinking about it as a per-engine portfolio question. For the cross-engine view, the "Share of voice in AI search" feature page documents how we track per-engine visibility across the cohort.

Figure: Average fan-out sub-queries per prompt, by engine. Cross-source: Peec.ai average + Attrifast cohort medians, May 2026. Perplexity 1.4; Claude 2.5; Gemini 3.0; ChatGPT 4.0; AI Mode 4.0; Grok 6.8. Grok runs the deepest fan-outs by a wide margin. Perplexity stays tight on a single high-quality retrieval.

The operator playbook for fan-out attribution

Putting the article together: here is the sequence I run for any cohort site that wants to take fan-out seriously as an attribution channel.

The first step is the GA4 channel grouping work. Add custom regex rules for chatgpt.com, chat.openai.com, oai.com, perplexity.ai, copilot.microsoft.com, gemini.google.com, aistudio.google.com, and claude.ai, and route them to clean named channels. This is the "How to track ChatGPT traffic in Google Analytics" walk-through, which recovers the visible 28 percent fraction cleanly. It is necessary but not sufficient.
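
If you would rather generate the pattern than hand-type it, here is a small sketch that builds one anchored regex from the domain list and checks it locally. The helper name is hypothetical. Whether your GA4 property accepts this exact pattern in a source condition is an assumption to verify in your own setup, since its regex dialect is not Python's.

```python
import re

AI_SOURCE_DOMAINS = [
    "chatgpt.com",
    "chat.openai.com",
    "oai.com",
    "perplexity.ai",
    "copilot.microsoft.com",
    "gemini.google.com",
    "aistudio.google.com",
    "claude.ai",
]

def build_source_regex(domains):
    """Build one anchored pattern matching each domain and its subdomains."""
    alternatives = "|".join(re.escape(domain) for domain in domains)
    return rf"^(.*\.)?({alternatives})$"

pattern = build_source_regex(AI_SOURCE_DOMAINS)
compiled = re.compile(pattern)

print(bool(compiled.match("chatgpt.com")))        # True
print(bool(compiled.match("www.perplexity.ai")))  # True
print(bool(compiled.match("notchatgpt.com")))     # False
```

Escaping the dots matters: an unescaped `.` would let lookalike hostnames slip into the channel.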

The second step is the server-side capture stack. Add a request-handling layer (middleware, edge function, or a tool that does this for you) that reads the Referer header on first visit, stores it server-side, and labels the session. Add the OAI-SearchBot pair-join logic for the no-referer-deep-page bucket. This is where you go from 28 percent recovery to roughly 88 percent recovery.

The third step is the first-touch persistence. Write the labeled source to your backend on visit 1, keyed by your user identifier, and never overwrite it. Send it to Stripe metadata on checkout, or join it to the Stripe customer_id when the webhook fires. This is the revenue-join layer, and it is the one that finally lets you answer "what fraction of our MRR is from ChatGPT fan-out citations."

The fourth step is the multi-engine visibility tracking. Pair the attribution stack with a GEO visibility tool (Peec, Profound, Otterly, or our own AI visibility score) so you can see, per prompt, which sub-queries surfaced your domain. That pairing is the only way to correlate engine-side citation share with site-side revenue, which is the only complete picture of the fan-out channel.

The fifth step is the content side. Write pages that win across multiple injected modifiers, name your competitors by brand, keep dates fresh, and structure passages so they extract cleanly. The AI citations versus backlinks piece walks the structural levers in depth.

The whole sequence is roughly two weeks of engineering work plus a recurring content cadence. The payoff is moving from a 28 percent view of your fan-out channel to an 88 percent view, with a defensible revenue join on top.

Three case studies from the cohort

To make the abstract claims concrete, here are three cohort sites where I have permission to share the shape (not the names) of what fan-out did to their attribution layer through Q1 and Q2 2026.

The first is a developer-tools company selling a CI/CD observability product at $79 per seat per month. Their GA4 dashboard read 0.6 percent of monthly sessions as ChatGPT through April 2026 and they had largely written off the channel. After we put the server-side stack in front of their analytics, the same month read 2.9 percent of sessions as ChatGPT, with 71 percent of those classified as fan-out exits (no referer, deep page, OAI-SearchBot fetch within the prior 30 minutes). The Stripe revenue join surfaced that fan-out exits had produced $14,400 of new MRR in Q1 2026 versus the $2,200 GA4 had attributed to chatgpt.com referrals. The ratio was 6.5x, which is on the high end of the cohort distribution but not unique.

The second is a DTC supplements brand. Their pattern was the opposite shape: GA4 read 1.8 percent of sessions as ChatGPT, and the server-side stack read 3.1 percent, a 1.7x correction rather than a 5x correction. The reason was that their buyers tended to type branded queries (the product name) rather than commercial-investigation queries (best supplement for X). Branded queries fan out less aggressively, so the gap between GA4 and server-side was smaller. The lesson is that fan-out under-counting is intent-dependent, not site-dependent: a brand whose buyers ask commercial-investigation questions has a bigger gap than a brand whose buyers type the product name directly.
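The correction factors quoted in these two cases are simple ratios of the server-side reading to the GA4 reading. A short check using only the figures stated above (the function and variable names are illustrative):

```python
def correction_factor(server_side, ga4):
    """How many times larger the server-side reading is than the GA4 reading."""
    return server_side / ga4


dev_tools_sessions = correction_factor(2.9, 0.6)    # share of sessions, percent
dev_tools_mrr = correction_factor(14_400, 2_200)    # new MRR, dollars
supplements_sessions = correction_factor(3.1, 1.8)  # share of sessions, percent

print(f"{dev_tools_sessions:.1f}x, {dev_tools_mrr:.1f}x, {supplements_sessions:.1f}x")
# prints: 4.8x, 6.5x, 1.7x
```

The developer-tools session correction works out to about 4.8x, which is the "5x" shape the supplements brand is being contrasted against.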

The third is a B2B SaaS in the data-infrastructure category. Their dashboard story was the most striking. GA4 said ChatGPT was their seventh-largest channel by sessions and twelfth-largest by attributed revenue. The server-side stack said it was their second-largest channel by sessions and fourth-largest by attributed revenue. The reclassification was driven entirely by the no-referer-deep-page bucket, which had been bleeding into a Direct row that the founder had assumed was returning users. Once the fan-out exits were pulled out of Direct, the true Direct row dropped by about a third and the Direct conversion rate per session went up because the remaining Direct visits were genuinely returning, branded users with much higher purchase intent. So the recovery did not just reclassify volume; it also revealed that the residual Direct channel was a different, cleaner cohort than the dashboard had suggested.

All three cases share a structural property worth naming. The fan-out channel does not feel like a single channel to the operator. It feels like a slow drip of unrelated Direct visits arriving on deep pages, which the operator instinctively reads as either bot traffic or returning-user noise. The act of pulling those visits out, labeling them, and running the revenue join is what turns the drip into a recognisable, sized, optimisable channel. Until that happens, fan-out exists as a measurement debt that grows quietly every quarter as the engines fan out more aggressively.

What I expect to change through 2026 and 2027

A few predictions I am willing to put my name behind, with the standard caveat that AI search is moving fast enough that any prediction past 12 months is a guess. First, fan-out sub-query counts will keep climbing on commercial prompts, but at a slower rate than 2025-to-2026. Most engines are settling into the 3 to 8 range on commercial prompts and I expect that to be the steady state through 2026. Second, fan-out sub-query word length will keep growing as retrievers get better at vector matching on longer queries. Peec.ai's data suggests the curve is slowing but not flattening. Third, more engines will adopt the OAI-SearchBot-style approach of fetching the citation page before the user clicks, which actually helps attribution operators because it provides a clean second-leg signal to pair with the human click. Fourth, the gap between the prompt a user types and the queries the retriever runs will widen, which makes single-keyword content optimisation increasingly mis-targeted relative to the fan-out shape.

The longer-term implication is that the attribution stack you build for fan-out becomes the attribution stack for AI search more broadly. The same referer-capture, pair-join, and first-touch-persistence layers that recover fan-out exits also recover Perplexity citations, Gemini AI Mode clicks, Claude with web search visits, and the next surface that has not been built yet. The fan-out problem is the canary in the coal mine for the broader measurement problem, and the fix is the same fix.

FAQ

What is a ChatGPT query fan-out?

A query fan-out is the set of internal searches the model expands a single user prompt into before it composes the answer. When someone types best CRM for solo founders into ChatGPT, the engine does not run that exact string against its search index. It rewrites and expands it into several sub-queries like best CRM solo founders 2026, top lightweight CRM small business, Pipedrive vs HubSpot solo founder, CRM reviews self-employed, and so on. Each sub-query becomes its own retrieval, and the result sets are merged before the model writes the answer. Across the roughly 200 sites in the Attrifast cohort, the median ChatGPT search-tool prompt now expands into 4 to 6 sub-queries, with the upper decile fanning out past 11.

Why does fan-out matter for attribution and revenue tracking?

Because each sub-query can produce its own retrieval, its own document fetch, and in some cases its own click to your site, but GA4 sees those clicks as unrelated events with no referer. One user prompt can generate 2 or 3 separate sessions on your domain across a few hours as the engine cites different pages on different sub-queries. The standard attribution layer treats those sessions as independent Direct visits because the referer header is stripped by the ChatGPT client, no UTM is appended, and the sessions arrive on deep pages. In our cohort the median site under-counts fan-out-sourced traffic by roughly 70 to 80 percent under default GA4 settings.

How many sub-queries does ChatGPT actually run per prompt?

It depends on intent type and on engine version. Peec.ai measured an average of 2.1 fan-outs per ChatGPT prompt across a broad sample, with Perplexity at 1.4 and Grok at 6.8. Our Attrifast cohort sees a higher median because we sample heavily on commercial-investigation prompts where the engine fans out more aggressively. On commercial queries like best X for Y or alternatives to Z, our median is 4 to 6 sub-queries, with a long tail past 11. Peec also documented that the average word count per fan-out roughly doubled between October 2025 and January 2026, climbing from about 6 words to 12.

What does ChatGPT actually search for when it fans out?

It injects modifiers the user did not type. Peec.ai's pattern analysis shows the most-injected tokens are best, top, reviews, comparison, tools, software, and features, with best alone appearing in roughly 24 percent of advice-style fan-outs. The engine also adds the current year as a freshness modifier in around 5 percent of queries, and it routinely appends review-site and forum names like Reddit, G2, Wirecutter, Glassdoor, and Consumer Reports. The mechanical result is that ChatGPT often searches for terms a user would never type, which means your page can rank perfectly for the prompt the user typed and be invisible to the fan-out that actually retrieved the citation.

Do fan-out sub-query visits show up in GA4?

Some of them. Across our cohort, roughly 28 percent of fan-out-sourced sessions arrive with a recognisable referer (chatgpt.com, chat.openai.com, oai.com, or a utm_source value the user-side app added). The remaining 72 percent arrive as Direct or as no-referer landings on a deep URL. The split is worse on mobile, where the in-app browser strips referers more aggressively, and worse still on the desktop ChatGPT app, which routes the click through an OS handoff that drops the header entirely. GA4 cannot distinguish a fan-out-sourced Direct visit from a true Direct type-in without server-side enrichment.

How does fan-out break UTM strategy?

UTM tagging assumes one source per visit. A fan-out prompt produces several searches that can each cite a different page on your site, and the user can click multiple citations in the same answer. One prompt session can therefore produce sessions on 2 or 3 different landing paths when the user clicks more than one citation, and in most cases none of those clicks carry a utm_source value because ChatGPT does not append UTM parameters to citation URLs. The practical implication is that you cannot lean on UTM to disambiguate fan-out traffic. You need a server-side referer-capture layer that recognises the chatgpt.com referer (when present) and the no-referer-deep-page entry pattern.

What is a fan-out exit and how do I detect one server-side?

A fan-out exit is the click event when a user follows a citation from inside the ChatGPT answer to a third-party page. Server-side, it usually has three properties: no Referer header (or chatgpt.com when present), a landing URL that is not your homepage (citations point to deep pages), and a User-Agent that matches a real human browser rather than a crawler. The OAI-SearchBot User-Agent fetches the citation page before the user clicks, so the first request is a bot fetch and the second request, seconds or minutes later, is the human follow-through. By logging both legs and joining them on URL and IP-window heuristics, you can attribute the human session back to a fan-out source even without a referer.
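The two-leg join described above can be sketched as follows. This is a minimal illustration, assuming access-log rows have already been parsed into `Hit` records; the 30-minute window mirrors the classification window used in the cohort case study, the crawler check is deliberately crude, and the join here is on URL path and time only, with the IP-window heuristic left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from urllib.parse import urlparse

PAIR_WINDOW = timedelta(minutes=30)  # assumption: tune per site


@dataclass
class Hit:
    ts: datetime
    path: str
    user_agent: str
    referer: str = ""


def is_bot_fetch(hit):
    return "OAI-SearchBot" in hit.user_agent


def looks_like_exit(hit):
    """The three properties: no (or chatgpt.com) referer, deep page, human UA."""
    host = urlparse(hit.referer).hostname or ""
    referer_ok = hit.referer == "" or host == "chatgpt.com" or host.endswith(".chatgpt.com")
    deep_page = hit.path not in ("", "/")
    # Crude crawler check; a real UA parser would be more reliable.
    human_ua = not is_bot_fetch(hit) and "bot" not in hit.user_agent.lower()
    return referer_ok and deep_page and human_ua


def pair_join(hits):
    """Return (human_hit, bot_hit) pairs where a bot fetch of the same path came first."""
    last_bot_fetch = {}
    pairs = []
    for hit in sorted(hits, key=lambda h: h.ts):
        if is_bot_fetch(hit):
            last_bot_fetch[hit.path] = hit
        elif looks_like_exit(hit):
            bot = last_bot_fetch.get(hit.path)
            if bot is not None and hit.ts - bot.ts <= PAIR_WINDOW:
                pairs.append((hit, bot))
    return pairs
```

A pair is evidence, not proof: an unrelated visitor can land on the same deep page inside the window, so the joined sessions are best labeled as probable fan-out exits.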

Why do fan-outs produce more sessions than the user might expect?

Because a single answer often surfaces 3 to 7 citations, and engaged users click more than one. Our cohort logs an average of 1.4 follow-through clicks per ChatGPT answer where the engine fanned out into 4 or more sub-queries. If two of those clicks land on different domains, each site records one session, but both sessions originated from one prompt. If two of them land on different deep pages of the same domain, you see two sessions but they should be joined as one fan-out cohort visit. That join is not something GA4 will do for you.
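One way to approximate that join is to merge sessions from the same visitor when they start close together in time. A minimal sketch, assuming each session is a `(visitor_id, started_at, landing_path)` tuple and that the visitor identifier is already stable across sessions; the three-hour window is an assumption standing in for "a few hours", not a measured value.

```python
from datetime import datetime, timedelta
from itertools import groupby

JOIN_WINDOW = timedelta(hours=3)  # assumption, not a cohort measurement


def group_cohort_visits(sessions, window=JOIN_WINDOW):
    """Group (visitor_id, started_at, landing_path) tuples into cohort visits.

    Sessions from the same visitor are merged when each one starts within
    `window` of the previous session in the group.
    """
    visits = []
    ordered = sorted(sessions, key=lambda s: (s[0], s[1]))
    for _visitor_id, visitor_sessions in groupby(ordered, key=lambda s: s[0]):
        current = []
        for session in visitor_sessions:
            if current and session[1] - current[-1][1] > window:
                visits.append(current)
                current = []
            current.append(session)
        visits.append(current)
    return visits
```

Three deep-page sessions inside two hours from one visitor collapse into a single cohort visit, while a return two days later stays separate.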

Do fan-out sessions convert?

Yes, and at materially higher rates than Google organic on research-heavy verticals. Across the cohort's B2B SaaS sites in May 2026, fan-out-sourced sessions converted to a Stripe payment at 2.7 percent, versus 1.4 percent for Google organic and 2.5 percent for direct-prompt ChatGPT referrals where the user typed the brand name. The premium reflects intent: fan-out citations surface during commercial-investigation prompts where the user is already comparing options, which is later in the buyer journey than a top-of-funnel Google query. On ecommerce the pattern is more muted because impulse purchases do not match the fan-out intent profile.

How is fan-out different between ChatGPT, Perplexity, Claude, and Gemini?

All four fan out, but the shape is different. Perplexity runs the smallest fan-out (around 1.4 sub-queries per prompt per Peec.ai data) because it leans on a tightly tuned search step. ChatGPT sits at 2 to 4 sub-queries per prompt as a baseline, with commercial prompts pushing past 6. Grok runs the largest fan-out at around 6.8 sub-queries, narrowing aggressively by year, brand, and trusted sources like Reddit and Wirecutter. Claude with web search and Gemini in browse mode sit in the middle. The practical consequence: a page that survives ChatGPT's fan-out shape may be invisible to Grok's, because the injected modifiers differ engine by engine.

What content shape survives multiple fan-out queries?

Pages that answer the prompt at three altitudes: the literal phrase the user typed, the most likely injected modifier (best, top, reviews, comparison), and the named alternatives the user did not type. A best CRM for solo founders article that only ranks for the literal phrase wins one fan-out and loses the rest. The same article that names Pipedrive, HubSpot, Folk, Attio, and Capsule, answers reviews and comparison sub-queries with on-page passages, and carries a recent-year date, wins 4 or 5 of the fan-outs the engine runs in parallel. This is the operator-level reason listicles and comparison posts dominate AI answers.

How do I attribute a single Stripe payment back to a fan-out origin?

Three steps. One, capture the first-visit referer server-side and label it before any client-side processing strips it. Two, recognise the no-referer-deep-page entry pattern as a probable AI fan-out exit and bucket it accordingly. Three, persist a first-touch identifier through every subsequent session for the same visitor and join that identifier to the Stripe customer_id at webhook time. The third step is the one default analytics tools do not do, because GA4 attributes on a last-non-direct basis within a 90-day window that frequently drops the fan-out origin altogether.

Will fan-out length keep growing through 2026 and 2027?

On the trajectory Peec.ai measured (word count per fan-out doubling between October 2025 and January 2026), and with engines moving toward longer-context retrieval, the safest prediction is that fan-out depth keeps rising but at a slower rate. We expect fan-out sub-query counts to stabilise in the 3 to 8 range on commercial prompts through 2026, with sub-query word length continuing to climb as the retrievers improve at vector matching on longer queries. The implication for content teams is that the gap between the prompt a user types and the queries the engine actually runs will keep widening.

Does Attrifast measure fan-out attribution specifically?

Attrifast captures the server-side referer on the first visit, persists a first-touch identifier across sessions, and joins that identifier to the Stripe payment by webhook. That means a fan-out-sourced session that lands on a deep URL with no referer can still be tied back to its eventual paying customer, and the AI-engine origin label is preserved through a sales cycle that GA4 would otherwise reset. Attrifast is not a fan-out visibility tracker (for that, pair it with Peec, Profound, or Otterly), but on the revenue side, the first-party Stripe join is the only durable way to answer the question fan-out raises: did the engine's internal search of my brand actually produce a paying customer.

See your real ChatGPT fan-out revenue, not the 28 percent GA4 reads

Server-side referer capture, OAI-SearchBot pair-join, and a clean Stripe webhook attribution stack. Set up in under 4 minutes. $29 per month.

Start free trial →

5-day free trial · $29/mo · cancel anytime
