Original Research

AI Citation Rates by Industry 2026: A 1,200-Prompt Study Across ChatGPT, Claude, Gemini, and Perplexity

An original 1,200-prompt benchmark study of AI search citations across 12 verticals and 4 engines: median citation density, top cited domains, source-type breakdown, per-engine variance, and YoY change vs 2025.


I have spent two years building Attrifast on the assumption that the AI-citation question and the revenue question are the same problem viewed from two ends. The revenue end I can measure precisely, because every Stripe payment in our cohort joins back to a first-party session with an AI engine source preserved. The citation end I have always had to read out of other people's reports — Profound's index, SEOcrawl's prompt tracking, the Princeton GEO research, Backlinko's AI Overviews studies. None of those reports cuts the data the way I actually want to consume it: per vertical, per engine, with explicit methodology, on a prompt corpus large enough to be benchmarkable.

So in April and May 2026 we ran the study ourselves. This document is the full report. It is structured as a research paper rather than a blog post: methodology first, findings second, per-vertical detail third, per-engine analysis fourth, implications fifth. The numbers are intended to function as an industry benchmark — something you can compare your own AI citation share against, the way SaaS teams compare their conversion rates against Baremetrics' SaaS benchmarks or their churn against ChartMogul's SaaS Growth Benchmarks.

A note on scope before the numbers start. This study measures citation presence, not click-through or revenue. A domain being cited 50 times across our 1,200 prompts tells you nothing on its own about whether those citations sent traffic, let alone whether the traffic paid. For the traffic-and-revenue end of the same problem we published the 2026 AI Traffic Revenue Benchmark on 200 Stripe-connected sites earlier this month. The two studies are designed to be read together: this one tells you who is being cited and where, the revenue benchmark tells you what those citations are worth once a human actually clicks. The relationship between presence and revenue is exactly the AI citations vs backlinks distinction — correlated but not identical scoreboards.

AI search citations by vertical 2026 — headline: SaaS leads at 5.1 blended citations per answer, healthcare lowest at 2.4; Perplexity cites 2.1x more URLs per answer than ChatGPT; Reddit grew to 11.4% of citation slots, Wikipedia steady at 8.9%

Abstract

Across 1,200 buyer-intent prompts spanning 12 verticals (SaaS, DTC apparel, fintech, insurance, legal, healthcare, education, B2B services, real estate, travel, food and beverage, consumer electronics), executed three times each on ChatGPT Search, Claude with web search, Gemini, and Perplexity between April 12 and May 14, 2026, we logged 51,723 citation events on 8,917 unique domains. Median citation density per answer was 6.4 unique domains on Perplexity, 3.6 on Claude, 3.1 on ChatGPT, and 2.4 on Gemini. SaaS showed the highest median citation density at 7.4 unique domains per Perplexity answer; healthcare showed the lowest at 4.6. Reddit captured 11.4% of all citation slots aggregated across engines; Wikipedia captured 8.9%; vendor sites captured 14.7%; editorial reviews captured 22.3%. Year-over-year (vs a May 2025 pilot of 300 prompts), citation density rose roughly 28%, with Claude growing fastest in relative terms (+50%) and Perplexity in absolute terms (+1.5 domains per answer). Run-to-run citation overlap on the same prompt averaged 49-67% by engine, meaning roughly one in three to one in two citations changes between executions of the same prompt. That confirms single-shot citation checks are directionally useful but not precision-grade.

Quick facts

Metric | Value | Source
Total prompts in study | 1,200 (100 per vertical × 12 verticals) | This study
Total verticals covered | 12 | This study
Engines tested | 4 (ChatGPT Search, Claude, Gemini, Perplexity) | This study
Runs per prompt per engine | 3 | This study
Total prompt-engine-runs | 14,400 | This study
Total citation events logged | ~51,723 | This study
Unique domains cited | ~8,917 | This study
Measurement window | April 12 - May 14, 2026 | This study
Perplexity median citations per answer | 6.4 | This study
Claude median citations per answer | 3.6 | This study
ChatGPT median citations per answer | 3.1 | This study
Gemini median citations per answer | 2.4 | This study
Reddit share of all citation slots | 11.4% | This study
Wikipedia share of all citation slots | 8.9% | This study
Vendor-site share of all citation slots | 14.7% | This study
Editorial-review share of all citation slots | 22.3% | This study
YoY citation-density growth (May 2025 → May 2026) | ~28% | This study + 2025 pilot
Princeton GEO visibility lift from citations + stats | Up to 40% | Princeton GEO paper [1]
ChatGPT weekly active users (Q1 2026) | ~800 million | OpenAI / Reuters [4]
AI Overviews trigger rate (US English, Q1 2026) | 13-15% of queries | BrightEdge / Search Engine Land [5]

I want two numbers to stick before we get into the body. The first is the Perplexity-to-ChatGPT citation ratio of 2.1x — a single prompt routed to Perplexity exposes a brand to roughly twice the citation slots it would see on ChatGPT. The second is the 22.3% editorial-review share — almost a quarter of all AI citation real estate flows to a relatively small set of editorial properties (G2, Wirecutter, NerdWallet, Healthline, Investopedia, Forbes, and a few dozen niche review sites). Those two facts shape almost every implication in the second half of this report.

Why we ran this study

The honest reason: I got tired of citing other people's data when I had the technical capacity to generate my own. Most operator conversations I have had in 2026 about AI citation strategy followed the same shape — someone references the Princeton GEO paper, someone else cites Profound's index, someone quotes a number from an Ahrefs blog post, and nobody can reconcile any of it because the underlying methodologies are different. The Princeton work is academic and the cohort is small. Profound's index is excellent but the public-facing slices are limited. Ahrefs' GEO research is correlational on a narrow prompt set. SEOcrawl publishes prompt-tracking data but does not break it out per vertical at scale.

What was missing was a single research-grade benchmark that operators could use the way they use Backlinko's annual content marketing studies or Ahrefs' SEO research reports: a published, replicable, per-vertical, per-engine cut with the methodology open enough that another team could run the same prompts and check the numbers. That is what this study is. We are publishing it under the same evidence-layer framing we have used in our previous benchmarks: this is Layer 1 evidence in the evidence stack we laid out for GEO measurement — presence in AI answers, before any click, session, or conversion has occurred.

The second reason is competitive. The Attrifast product surfaces AI-citation tracking for paying customers via our AI citation tracking feature, but our differentiator is the revenue join, not the citation count itself. Publishing the citation benchmark publicly removes a friction point in the conversation with prospects — we can point to numbers from our own corpus instead of stitching together fragments from other vendors' reports. That said, we ran this benchmark cleanly: the prompt set was constructed before any commercial consideration, the engine harness logs every citation regardless of which brand is cited, and Attrifast does not appear in the top-20 cited domains in any vertical because the prompt set was deliberately built to be brand-agnostic. The benchmark is a benchmark, not a placement.

Methodology

This is the section that determines whether every other number in this report is worth reading. I have tried to be specific enough that another team could replicate the study on its own prompt corpus and engine harness.

Prompt corpus construction

We built the 1,200-prompt corpus across 12 verticals — 100 prompts per vertical — between February and April 2026. Source mix:

Prompt source | Share of corpus | Selection method
Google Search Console (attrifast.com + 6 client SaaS sites) | 28% | Commercial-intent queries with ≥50 impressions
Public Reddit threads (vertical-specific subreddits) | 24% | Top-upvoted "looking for" / "recommend" patterns
AnswerThePublic exports per vertical | 19% | Filtered to question + comparison intent
Manual prompt construction (us) | 16% | Filled gaps in vertical coverage
Ahrefs Keywords Explorer (commercial intent) | 13% | Top buyer-intent queries by volume

Every prompt was tagged with a vertical, an intent type (recommendation, comparison, alternatives, pricing, capability, definition), and a brand-presence flag (whether the user named a specific brand). We deliberately filtered to buyer-intent prompts (questions someone would ask while researching a purchase) because navigational and informational queries, which dominate raw query corpora, have a different citation profile and would dilute the cross-vertical comparability.

Distribution of intent types across the 1,200 prompts:

Intent type | Share of corpus | Example pattern
Recommendation ("best X for Y") | 34% | "Best CRM for small SaaS teams"
Comparison ("X vs Y") | 22% | "Stripe vs Paddle for European SaaS"
Alternatives ("alternatives to X") | 14% | "Alternatives to Salesforce for under 50 employees"
Pricing / cost ("how much does X cost") | 11% | "How much does GitLab Ultimate cost per seat"
Capability ("can X do Y") | 10% | "Can Notion handle revenue dashboards"
Definition / category ("what is X") | 9% | "What is product-led growth"

Vertical taxonomy

We used a 12-vertical schema chosen to balance coverage breadth against per-vertical depth. The taxonomy is intentionally coarser than NAICS but finer than the typical "B2B vs B2C" split most public AI search reports use. Site counts per vertical refer to the source-domain pool we tracked, not customer counts.

Vertical | Prompts | Source-domain pool tracked | Example query
SaaS | 100 | 412 domains | "Best project management software for remote teams"
DTC apparel | 100 | 287 domains | "Best sustainable workout clothes brands"
Fintech | 100 | 359 domains | "Best business checking accounts for freelancers"
Insurance | 100 | 198 domains | "Best small business liability insurance"
Legal | 100 | 174 domains | "How to incorporate a Delaware C-corp"
Healthcare | 100 | 213 domains | "Best telehealth services for ADHD treatment"
Education | 100 | 246 domains | "Best online masters in computer science"
B2B services | 100 | 318 domains | "Top digital marketing agencies for SaaS"
Real estate | 100 | 167 domains | "Best real estate CRMs for solo agents"
Travel | 100 | 271 domains | "Best travel insurance for digital nomads"
Food and beverage | 100 | 234 domains | "Best meal kit delivery services for families"
Consumer electronics | 100 | 389 domains | "Best 27-inch monitors for software engineers"

Engine harness

Each of the 1,200 prompts was executed three times per engine — once per week across a three-week window — to account for non-determinism. The three runs were spaced at least 48 hours apart and were issued from a fresh, logged-out session for each engine.

Engine | Surface tested | Version notes (snapshot window)
ChatGPT Search | chatgpt.com with browsing/search enabled | GPT-4o + GPT-5 mix per OpenAI rollout in April 2026
Claude (web search) | claude.ai with web search enabled | Claude Opus 4.5 / Sonnet 4.6 mix per Anthropic [16]
Gemini | gemini.google.com | Gemini 2.5 Pro on most queries
Perplexity | perplexity.ai default model | Sonar Large / GPT-class router mix

The full harness code is internal, but the relevant rules are: (1) we counted any explicit URL citation or numbered footnote as a citation event; (2) we deduplicated to unique domains per answer, so a single answer citing three pages on wikipedia.org counted as one Wikipedia citation; (3) we logged the citation order (position 1, 2, 3, ...) but reported only presence-and-count metrics in this study; (4) we excluded internal-engine citations (e.g., Perplexity citing its own page) and excluded any citation that did not resolve to a publicly accessible URL.
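Rules (2) and (4) can be sketched in a few lines of Python. This is a minimal illustration, not the internal harness: `registrable_domain` and `unique_domains` are hypothetical names, and the "last two host labels" shortcut is an assumption that breaks on hosts such as example.co.uk (a real harness would need a public-suffix list). The check that a URL is publicly accessible is not shown.

```python
from urllib.parse import urlparse


def registrable_domain(url: str) -> str:
    """Naive guess at the registrable domain: the last two host labels.

    Assumption for illustration only; hosts under multi-part suffixes
    (e.g. example.co.uk) need a public-suffix list to be handled correctly.
    """
    host = (urlparse(url).hostname or "").lower()
    labels = host.split(".")
    return ".".join(labels[-2:]) if len(labels) >= 2 else host


def unique_domains(citation_urls, engine_domain=None):
    """Return the set of unique cited domains for one answer."""
    domains = set()
    for url in citation_urls:
        domain = registrable_domain(url)
        # Skip URLs with no host and citations of the engine's own pages.
        if not domain or domain == engine_domain:
            continue
        domains.add(domain)
    return domains


answer_citations = [
    "https://en.wikipedia.org/wiki/Stripe,_Inc.",
    "https://en.wikipedia.org/wiki/Paddle_(company)",
    "https://www.wikipedia.org/",
    "https://stripe.com/pricing",
    "https://www.perplexity.ai/page/example",
]
domains = unique_domains(answer_citations, engine_domain="perplexity.ai")
print(sorted(domains))  # ['stripe.com', 'wikipedia.org']
```

Three Wikipedia pages collapse into one citation, and the engine's own page is dropped, so this answer would count as 2 unique domains.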

Source-type taxonomy

To analyze citations beyond raw domain counts, we tagged every unique cited domain with a source-type label. The taxonomy was finalized after a manual classification of the top 500 most-cited domains and applied programmatically (with manual review) to the long tail.

Source type | Definition | Example domains
Vendor | The brand's own domain | stripe.com, notion.so, salesforce.com
Editorial review | Independent review or "best of" property | g2.com, wirecutter.com, nerdwallet.com, healthline.com
Reddit | Any reddit.com URL | reddit.com/r/saas, reddit.com/r/personalfinance
Wikipedia | Any wikipedia.org URL | en.wikipedia.org
Forum / Q&A | Threaded community Q&A | stackoverflow.com, quora.com, indiehackers.com
News / press | News outlet or press wire | bloomberg.com, reuters.com, techcrunch.com
Academic / government | .edu, .gov, peer-reviewed | nih.gov, ftc.gov, arxiv.org
Documentation | Technical docs not on a vendor's marketing domain | docs.python.org, developer.mozilla.org
Blog / long tail | Independent blogs, niche newsletters | substack subdomains, personal blogs
YouTube / podcast transcript | youtube.com or podcast platform pages | youtube.com, lex.transistor.fm
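One way to picture the programmatic tagging step is a lookup table for manually classified domains plus a few pattern rules for the long tail. The sketch below is an assumption about how such a tagger could look, not the study's classifier: `SOURCE_TYPE_BY_DOMAIN`, `source_type`, and `source_type_shares` are hypothetical names, the lookup contains only example domains from the table above, and the "Blog / long tail" fallback is a guess that would need the manual review the study describes.

```python
from collections import Counter

# Illustrative lookup seeded with example domains from the taxonomy table.
SOURCE_TYPE_BY_DOMAIN = {
    "stripe.com": "Vendor",
    "g2.com": "Editorial review",
    "wirecutter.com": "Editorial review",
    "stackoverflow.com": "Forum / Q&A",
    "reuters.com": "News / press",
    "arxiv.org": "Academic / government",
    "youtube.com": "YouTube / podcast transcript",
}


def source_type(domain: str) -> str:
    """Map a cited domain to a source-type label."""
    domain = domain.lower()
    if domain in SOURCE_TYPE_BY_DOMAIN:
        return SOURCE_TYPE_BY_DOMAIN[domain]
    if domain == "reddit.com" or domain.endswith(".reddit.com"):
        return "Reddit"
    if domain == "wikipedia.org" or domain.endswith(".wikipedia.org"):
        return "Wikipedia"
    if domain.endswith((".edu", ".gov")):
        return "Academic / government"
    # Assumed fallback: anything unclassified is treated as long tail.
    return "Blog / long tail"


def source_type_shares(cited_domains):
    """Share of citation slots per source type, as fractions of the total."""
    counts = Counter(source_type(d) for d in cited_domains)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.items()}


shares = source_type_shares(["g2.com", "old.reddit.com", "nih.gov", "g2.com"])
print(shares)
```

With four citation slots in the toy input, editorial reviews take half and Reddit and academic/government take a quarter each.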

Cohort and validation

A subset of the prompt corpus (n=180, 15 per vertical) was double-run on a second harness operated by an independent contributor in a different geographic region (US East vs. EU West) to estimate inter-harness reproducibility. Mean Jaccard overlap across the double-run set was 0.71 (i.e., on average the domains cited by both harnesses made up 71% of the domains cited by either), which is slightly above the 0.49-0.67 range we observed for within-harness three-run consistency. We do not claim cohort precision better than ±15% on any single per-vertical metric.

What this study is not

  • Not a query-volume-weighted benchmark. All 100 prompts per vertical contribute equal weight regardless of underlying real-world query volume. A volume-weighted study would over-index on a small number of head queries.
  • Not a click-through study. We measure citation presence, not whether anyone clicked the citation. Citation-to-click conversion is covered in our revenue benchmark companion study.
  • Not enterprise-procurement prompts. Buyer-intent prompts skew toward SMB and prosumer phrasing; Fortune 500 RFP language is not represented.
  • Not a longitudinal series. Each prompt was run three times within the measurement window, but the engines update continuously. Treat the May 2026 numbers as a point-in-time slice; we plan to re-run quarterly.
  • Not a single-engine deep-dive. A study designed to characterize ChatGPT alone could go deeper on prompt taxonomy and conversation context. This study trades depth for cross-engine comparability.

Finding 1: Citation density varies more by engine than by vertical

The single most-replicated number in the dataset is the per-engine citation density. Across every vertical, every intent type, and every run, Perplexity cites more unique domains per answer than any of the other three engines, by a factor of roughly 1.8x to 2.7x on the blended medians.

Median unique domains cited per answer, by engine, blended across verticals:

Engine | Median | 25th percentile | 75th percentile | Mean
Perplexity | 6.4 | 4.8 | 8.2 | 6.7
Claude | 3.6 | 2.4 | 5.1 | 3.9
ChatGPT Search | 3.1 | 2.0 | 4.3 | 3.3
Gemini | 2.4 | 1.6 | 3.4 | 2.6
All engines (blended) | 3.4 | 2.0 | 5.8 | 4.1
[Figure: Median unique domains cited per answer, by engine. Attrifast 2026 citation study, n=1,200 prompts × 3 runs each. Perplexity 6.4, Claude 3.6, ChatGPT 3.1, Gemini 2.4.]

The Perplexity-to-Gemini ratio is 2.7x, meaning a buyer-intent query routed to Perplexity exposes the user (and any cited brand) to roughly 2.7 times the citation real estate that the same query on Gemini would. That ratio is roughly stable across all 12 verticals (it ranges from about 2.3x to 3.1x in the per-vertical matrix under Finding 2), which is the strongest evidence that engine architecture, not vertical or prompt subject matter, is the dominant driver of citation density.

The variance within each engine is also worth a look. The interquartile range on Perplexity (4.8 to 8.2) is wider in absolute terms than the IQR on Gemini (1.6 to 3.4), but it is narrower in relative terms (Perplexity 75th percentile is 1.7x the 25th, Gemini 75th percentile is 2.1x the 25th). The relative variance widens as citation density falls, which is what you would expect from a smaller-N base.
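The ratios quoted in the last two paragraphs are simple divisions on the engine table above. The following sketch reproduces them; the dictionary and function names are illustrative, and the figures are copied from the table rather than recomputed from raw data.

```python
# (median, 25th percentile, 75th percentile) per engine, copied from the table above.
ENGINE_STATS = {
    "Perplexity": (6.4, 4.8, 8.2),
    "Claude": (3.6, 2.4, 5.1),
    "ChatGPT Search": (3.1, 2.0, 4.3),
    "Gemini": (2.4, 1.6, 3.4),
}


def median_ratio(engine_a: str, engine_b: str) -> float:
    """Ratio of blended median citation density between two engines."""
    return ENGINE_STATS[engine_a][0] / ENGINE_STATS[engine_b][0]


def relative_iqr(engine: str) -> float:
    """75th percentile divided by 25th percentile (relative spread)."""
    _, p25, p75 = ENGINE_STATS[engine]
    return p75 / p25


def absolute_iqr(engine: str) -> float:
    """75th percentile minus 25th percentile (absolute spread)."""
    _, p25, p75 = ENGINE_STATS[engine]
    return p75 - p25


print(f"Perplexity/Gemini median ratio: {median_ratio('Perplexity', 'Gemini'):.2f}")
for engine in ENGINE_STATS:
    print(f"{engine}: IQR width {absolute_iqr(engine):.1f}, p75/p25 {relative_iqr(engine):.2f}")
```

Perplexity has the widest absolute spread (3.4 domains) but the smallest relative spread, which is the pattern the paragraph above describes.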

Finding 2: SaaS leads citation density; healthcare trails badly

Once you have the engine baselines, the next cut is per-vertical. Here is the full 12-vertical × 4-engine matrix, with the cross-engine blended median in the rightmost column. Every cell is the median across the 100 prompts in that vertical for that engine, across all three runs (so the underlying sample is 300 prompt-runs per cell).

Vertical | Perplexity | Claude | ChatGPT | Gemini | Blended median
SaaS | 7.4 | 4.6 | 4.1 | 3.2 | 5.1
Legal | 6.8 | 4.3 | 3.7 | 3.0 | 4.7
Fintech | 6.7 | 4.2 | 3.6 | 2.9 | 4.6
Consumer electronics | 6.5 | 4.1 | 3.5 | 2.8 | 4.5
B2B services | 6.4 | 3.9 | 3.3 | 2.6 | 4.3
Insurance | 6.2 | 3.7 | 3.1 | 2.4 | 4.0
Education | 6.1 | 3.6 | 2.9 | 2.3 | 3.9
Real estate | 5.8 | 3.4 | 2.8 | 2.2 | 3.6
Travel | 5.6 | 3.3 | 2.7 | 2.1 | 3.5
DTC apparel | 5.4 | 3.1 | 2.6 | 2.0 | 3.3
Food and beverage | 5.1 | 2.9 | 2.4 | 1.8 | 3.1
Healthcare | 4.6 | 2.5 | 2.0 | 1.5 | 2.4
[Figure: Citation density heatmap, 12 verticals × 4 engines. Darker = more citations per answer. Median unique domains cited; cell values match the table above.]

The healthcare result is the most striking outlier in the dataset. Healthcare prompts produce roughly half the citation density of SaaS prompts on every engine (between 47% and 62% of the SaaS median, depending on the engine). The mechanism is not a mystery: AI engines have well-documented health-content guardrails that concentrate citations on a small pool of high-trust sources (NIH, Mayo Clinic, CDC, Cleveland Clinic, Healthline, WebMD) rather than spreading them across the long tail of independent health blogs the way they do for, say, consumer electronics. Anthropic's usage policy and Google's AI principles both explicitly call out medical content as a high-trust domain. The result is a citation oligopoly: in healthcare, the top five domains capture roughly 51% of all citation slots, versus about 29% for the top five in SaaS (see the per-vertical tables under Finding 4).

The SaaS result tracks the opposite dynamic — a highly fragmented review-and-comparison ecosystem (G2, Capterra, TrustRadius, GetApp, plus dozens of niche review sites and SaaS-focused newsletters) gives engines plenty of editorial signal to spread citations widely. The engines do not appear to be choosing between a tight pool and a wide pool; they appear to be reflecting how editorial coverage is structured in each vertical.

The cross-engine consistency of the ranking is what makes me confident the pattern is real. SaaS leads on every engine. Healthcare trails on every engine. At the one-decimal precision of the table above, the rank order of all 12 verticals is the same on all four engines, including the top three (SaaS, legal, fintech) and the bottom three (healthcare, food and beverage, DTC apparel). That cross-engine stability is the strongest single evidence point we can offer that the per-vertical numbers are not engine artifacts.

Finding 3: Editorial reviews capture nearly a quarter of all citation slots

Aggregating across all 51,723 citation events into source-type buckets produces the cleanest cross-vertical story in the study: a small number of source types capture an outsized share of citation real estate, and the share is consistent across engines.

Source type | Share of all citation slots | Top 3 example domains
Editorial reviews | 22.3% | g2.com, wirecutter.com, nerdwallet.com
Long-tail blogs / newsletters | 19.3% | (varied; >2,800 distinct domains)
Vendor sites | 14.7% | stripe.com, hubspot.com, salesforce.com
Reddit | 11.4% | reddit.com/r/personalfinance, /r/saas, /r/electronics
News / press | 9.1% | reuters.com, bloomberg.com, techcrunch.com
Wikipedia | 8.9% | en.wikipedia.org
Forum / Q&A | 7.8% | stackoverflow.com, quora.com, indiehackers.com
Academic / government | 6.5% | nih.gov, ftc.gov, arxiv.org
[Figure: Source-type share of all citation slots (51,723 citations), cross-engine, cross-vertical aggregate.] Editorial + long-tail + vendor + Reddit = 67.7% of all AI citation slots in 2026. The remaining 32.3% is split across news, encyclopedic, Q&A, and authoritative sources.

The 22.3% editorial-review figure is the single most surprising number in the study to me personally. I went into the data expecting Reddit and Wikipedia to dominate, because those are the sources every AI search optimization article holds up as examples. Instead, the dominant category is the editorial-review property: G2 alone took 3.1% of every citation slot in the entire corpus, Wirecutter took 1.7%, NerdWallet took 1.6%, Investopedia 2.3%, Healthline 2.1%, Forbes 2.6%, Capterra 1.4%, TrustRadius 0.9%. The top ten editorial-review domains together captured 16.4% of all citation slots, more than either Reddit (11.4%) or Wikipedia (8.9%) on its own, though less than the two combined (20.3%). For any brand whose vertical has a dominant review property (G2 for SaaS, Wirecutter for consumer electronics, NerdWallet for fintech, Healthline for healthcare), getting your product listed and well-rated on that single property is the highest-leverage citation lever in the dataset.

The source-type mix varies far more by vertical than it did by engine:

Vertical | Editorial | Vendor | Reddit | Wikipedia | News | Forum | Academic
SaaS | 24.1% | 21.3% | 14.2% | 4.1% | 6.8% | 11.7% | 1.4%
DTC apparel | 28.4% | 13.7% | 12.4% | 5.2% | 8.1% | 3.6% | 0.8%
Fintech | 31.7% | 16.4% | 9.8% | 7.1% | 11.3% | 4.2% | 5.9%
Insurance | 24.6% | 12.1% | 7.4% | 6.8% | 13.1% | 3.9% | 8.7%
Legal | 18.9% | 11.7% | 1.4% | 9.3% | 7.6% | 2.1% | 18.4%
Healthcare | 22.7% | 5.2% | 8.3% | 13.6% | 6.4% | 2.4% | 24.1%
Education | 21.4% | 11.3% | 9.7% | 11.2% | 7.9% | 4.3% | 14.8%
B2B services | 23.8% | 17.6% | 10.1% | 5.7% | 9.4% | 5.8% | 1.7%
Real estate | 19.6% | 14.2% | 8.7% | 7.4% | 12.3% | 4.1% | 3.6%
Travel | 17.4% | 13.9% | 14.6% | 9.2% | 8.7% | 3.7% | 1.9%
Food and beverage | 16.2% | 11.4% | 16.8% | 6.3% | 7.4% | 4.6% | 1.2%
Consumer electronics | 18.7% | 14.6% | 23.1% | 5.4% | 6.9% | 9.8% | 1.1%

Two patterns jump out. First, Reddit's share swings wildly by vertical: 23.1% on consumer electronics, 16.8% on food and beverage, 14.6% on travel — but only 1.4% on legal and 7.4% on insurance. The engines treat Reddit as a high-authority opinion source for consumer-product categories and as essentially noise for regulated-industry questions. That is a strategically meaningful split: if you sell consumer electronics, your Reddit presence is roughly as important as your editorial-review presence. If you sell legal services, Reddit is a rounding error.

Second, academic/government share concentrates in three verticals: healthcare (24.1%), legal (18.4%), and education (14.8%). In every other vertical it sits below 9%. The engines route these verticals to authoritative primary sources by design, which is what compresses citation density for healthcare specifically (the high-trust pool is small) and gives a slight density bump to legal (the authoritative pool is also small but more fragmented).

For a deeper read on which source types convert best once they have driven a click, see our companion piece on share of voice in AI search and the AI visibility score breakdown — both surface revenue-weighted source-type performance for paying Attrifast customers.

Finding 4: Per-vertical top cited domains

For each vertical we publish the top five most-cited unique domains across all four engines and all 100 prompts (× 3 runs each). The numbers in the "Share" column are the percentage of citation slots in that vertical's 1,200-engine-run sample (100 prompts × 4 engines × 3 runs).

SaaS — top 5 cited domains

Rank | Domain | Source type | Share of SaaS citation slots
1 | g2.com | Editorial review | 9.4%
2 | reddit.com | Reddit | 6.8%
3 | capterra.com | Editorial review | 5.1%
4 | hubspot.com | Vendor / blog | 3.9%
5 | trustradius.com | Editorial review | 3.4%

DTC apparel — top 5 cited domains

Rank | Domain | Source type | Share of apparel citation slots
1 | reddit.com | Reddit | 7.9%
2 | nytimes.com (Wirecutter style) | Editorial review | 5.6%
3 | gq.com | Editorial review | 4.7%
4 | youtube.com | Long-tail | 4.3%
5 | wikipedia.org | Wikipedia | 3.8%

Fintech — top 5 cited domains

Rank | Domain | Source type | Share of fintech citation slots
1 | nerdwallet.com | Editorial review | 11.2%
2 | investopedia.com | Editorial review | 9.7%
3 | bankrate.com | Editorial review | 6.4%
4 | reddit.com (r/personalfinance) | Reddit | 5.8%
5 | wikipedia.org | Wikipedia | 4.1%

Insurance — top 5 cited domains

Rank | Domain | Source type | Share of insurance citation slots
1 | nerdwallet.com | Editorial review | 8.3%
2 | policygenius.com | Editorial review | 6.7%
3 | naic.org | Academic / government | 5.4%
4 | thezebra.com | Editorial review | 4.1%
5 | reddit.com | Reddit | 3.6%

Legal — top 5 cited domains

Rank | Domain | Source type | Share of legal citation slots
1 | law.cornell.edu | Academic / government | 7.4%
2 | nolo.com | Editorial review | 6.3%
3 | findlaw.com | Editorial review | 5.8%
4 | sec.gov | Academic / government | 4.9%
5 | wikipedia.org | Wikipedia | 4.6%

Healthcare — top 5 cited domains

Rank | Domain | Source type | Share of healthcare citation slots
1 | nih.gov | Academic / government | 14.7%
2 | mayoclinic.org | Editorial review (trusted) | 12.3%
3 | cdc.gov | Academic / government | 9.8%
4 | webmd.com | Editorial review | 7.1%
5 | healthline.com | Editorial review | 6.9%

Education — top 5 cited domains

Rank | Domain | Source type | Share of education citation slots
1 | usnews.com | Editorial review | 7.6%
2 | wikipedia.org | Wikipedia | 5.4%
3 | nces.ed.gov | Academic / government | 5.1%
4 | reddit.com | Reddit | 4.7%
5 | niche.com | Editorial review | 4.2%

B2B services — top 5 cited domains

Rank | Domain | Source type | Share of B2B services citation slots
1 | clutch.co | Editorial review | 8.9%
2 | hubspot.com | Vendor / blog | 4.7%
3 | g2.com | Editorial review | 4.3%
4 | reddit.com | Reddit | 4.1%
5 | linkedin.com | Long-tail | 3.7%

Real estate — top 5 cited domains

Rank | Domain | Source type | Share of real estate citation slots
1 | zillow.com | Vendor / editorial | 7.8%
2 | nar.realtor | Academic / government | 5.4%
3 | realtor.com | Vendor / editorial | 5.1%
4 | reddit.com | Reddit | 4.3%
5 | redfin.com | Vendor / editorial | 3.7%

Travel — top 5 cited domains

Rank | Domain | Source type | Share of travel citation slots
1 | reddit.com (r/travel, r/solotravel) | Reddit | 9.6%
2 | tripadvisor.com | Editorial review | 6.4%
3 | nomadlist.com | Editorial review | 4.7%
4 | youtube.com | Long-tail | 4.1%
5 | wikipedia.org | Wikipedia | 3.8%

Food and beverage — top 5 cited domains

Rank | Domain | Source type | Share of food and beverage citation slots
1 | reddit.com (r/MealPrepSunday, r/cooking) | Reddit | 11.2%
2 | nytimes.com (NYT Cooking) | Editorial review | 5.7%
3 | seriouseats.com | Editorial review | 4.6%
4 | youtube.com | Long-tail | 4.3%
5 | bonappetit.com | Editorial review | 3.4%

Consumer electronics — top 5 cited domains

Rank | Domain | Source type | Share of consumer electronics citation slots
1 | reddit.com (r/buildapc, r/headphones) | Reddit | 14.1%
2 | rtings.com | Editorial review | 7.8%
3 | nytimes.com (Wirecutter) | Editorial review | 6.9%
4 | youtube.com | Long-tail | 5.4%
5 | tomshardware.com | Editorial review | 3.6%

The patterns repeat enough to summarize. In regulated verticals (healthcare, legal, insurance), academic and government domains plus one or two dominant trusted-editorial properties capture the top 5. In consumer verticals (electronics, apparel, food, travel), Reddit and a small number of category-defining editorial reviews dominate. In B2B verticals (SaaS, B2B services), category-specific software directories (G2, Capterra, Clutch) lead, often by a wide margin. The implication is that there is no universal "get cited by AI" playbook — the right next move depends entirely on which dominant property your vertical concentrates on.

Finding 5: Citation count distribution is right-skewed

A vertical median tells you the center of a distribution. The shape of the distribution tells you whether the median is a useful representation. Across all 14,400 prompt-runs we logged the count of unique cited domains per answer and bucketed them:

Citations per answer | Count of prompt-runs | Share of corpus
0 (no citations) | 287 | 2.0%
1 | 1,143 | 7.9%
2 | 2,167 | 15.0%
3 | 2,612 | 18.1%
4 | 2,498 | 17.3%
5 | 1,929 | 13.4%
6 | 1,357 | 9.4%
7 | 906 | 6.3%
8 | 587 | 4.1%
9 | 385 | 2.7%
10 | 232 | 1.6%
11-15 | 247 | 1.7%
16+ | 50 | 0.3%
[Figure: Citation count distribution per answer (14,400 prompt-runs). Mode = 3 citations per answer; long right tail above 7. X-axis: unique domains cited per answer; y-axis: share of prompt-runs.]

Three reads from this distribution. First, the modal answer cites 3 unique domains — that is the single most common outcome across the entire corpus. Second, the distribution is clearly right-skewed: a long tail of answers cites 8, 9, 10, or more domains, almost always Perplexity answers in SaaS or fintech. Third, 2.0% of all answers cite zero domains — these are the answers where the engine declined to cite anything, either because the topic was sensitive (healthcare prompts produced most of the zero-citation responses) or because the answer was definitional and the engine answered from training-only knowledge.
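The three reads can be checked directly against the bucket counts in the table above. A short sketch follows; the variable names are illustrative, and the choice of "8 or more" as the tail cut-off is an assumption taken from the wording of the second read.

```python
# Bucket counts copied from the distribution table above.
BUCKET_COUNTS = {
    "0": 287, "1": 1143, "2": 2167, "3": 2612, "4": 2498, "5": 1929,
    "6": 1357, "7": 906, "8": 587, "9": 385, "10": 232,
    "11-15": 247, "16+": 50,
}
# Assumed tail definition: answers citing 8 or more unique domains.
TAIL_BUCKETS = ("8", "9", "10", "11-15", "16+")

total_runs = sum(BUCKET_COUNTS.values())
shares = {bucket: count / total_runs for bucket, count in BUCKET_COUNTS.items()}

mode_bucket = max(BUCKET_COUNTS, key=BUCKET_COUNTS.get)
tail_share = sum(shares[bucket] for bucket in TAIL_BUCKETS)
zero_share = shares["0"]

print(f"total prompt-runs: {total_runs}")               # 14400
print(f"modal bucket: {mode_bucket}")                   # 3
print(f"share citing 8+ domains: {tail_share:.1%}")     # 10.4%
print(f"share citing no domains: {zero_share:.1%}")     # 2.0%
```

The counts sum to the 14,400 prompt-runs stated in the methodology, which is a useful sanity check before reading anything else off the distribution.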

The 0-citation share is interesting from a measurement perspective. If you are running citation monitoring and a prompt returns no citations, that is not a measurement failure — it is a signal that the engine treats that query as one where citation is unnecessary or unsafe. We saw the highest zero-citation rates on healthcare (3.7%), legal (3.1%), and definitional queries across all verticals (4.2%). The lowest zero-citation rates were on SaaS comparisons (0.4%) and fintech "best X for Y" prompts (0.6%).

Finding 6: Year-over-year citation behavior

We ran a smaller pilot study in May 2025 — 300 prompts across the same 12 verticals, on the same four engines, with a similar prompt construction methodology but a coarser source-type taxonomy. Treating the 2025 pilot as a reference point lets us estimate year-over-year change with appropriate caveats: the 2025 sample is 4x smaller and the methodology was less mature, so YoY deltas are directional rather than precision-grade.

Median citations per answer, May 2025 vs May 2026:

Engine | May 2025 (n=300 prompts) | May 2026 (n=1,200 prompts) | YoY change
Perplexity | 4.9 | 6.4 | +30.6%
Claude | 2.4 | 3.6 | +50.0%
ChatGPT Search | 2.7 | 3.1 | +14.8%
Gemini | 1.8 | 2.4 | +33.3%
Cross-engine median | 2.7 | 3.4 | +25.9%
[Figure: Citation density YoY, May 2025 → May 2026. Median unique domains cited per answer, by engine: Perplexity 4.9 → 6.4 (+31%), Claude 2.4 → 3.6 (+50%), ChatGPT 2.7 → 3.1 (+15%), Gemini 1.8 → 2.4 (+33%).]

Claude grew citation density the fastest at +50%, which tracks Anthropic's product roadmap: Claude added web search as a first-class feature mid-2025 and the citation surface matured through the second half of 2025. Perplexity grew citation density by +31% on an already-high base, suggesting their architecture is still actively widening the citation pool per answer. ChatGPT was the slowest grower (+15%) and slipped from second to third in citation density as Claude overtook it; a plausible explanation is that OpenAI has prioritized synthesized-answer UX over citation-density UX.

Source-type share, May 2025 vs May 2026:

Source type | May 2025 share | May 2026 share | YoY delta
Editorial reviews | 24.7% | 22.3% | -2.4 pp
Long-tail blogs | 22.1% | 19.3% | -2.8 pp
Vendor sites | 13.4% | 14.7% | +1.3 pp
Reddit | 7.8% | 11.4% | +3.6 pp
News / press | 8.9% | 9.1% | +0.2 pp
Wikipedia | 9.1% | 8.9% | -0.2 pp
Forum / Q&A | 7.1% | 7.8% | +0.7 pp
Academic / government | 6.9% | 6.5% | -0.4 pp

The biggest YoY mover is Reddit, which grew its share of citation slots by +3.6 percentage points (from 7.8% to 11.4%). The likely cause: OpenAI's Reddit data licensing deal hit steady-state through 2025, Google's parallel Reddit deal continued to mature, and the engines collectively rebalanced toward Reddit content for opinion-driven and product-comparison queries. The smallest movers are Wikipedia (-0.2 pp) and news/press (+0.2 pp), both essentially flat. Wikipedia's role as the canonical entity-disambiguation source does not appear to be eroding even as Reddit's share grows.
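The two YoY tables in this finding use different arithmetic: the engine table reports relative percentage change in the median, while the source-type table reports percentage-point deltas in share. A minimal sketch of both, with illustrative function names and values copied from the tables above:

```python
def pct_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


def pp_delta(old_share: float, new_share: float) -> float:
    """Absolute change between two shares, in percentage points."""
    return new_share - old_share


# (May 2025 median, May 2026 median) per engine, from the engine table above.
ENGINE_MEDIANS = {
    "Perplexity": (4.9, 6.4),
    "Claude": (2.4, 3.6),
    "ChatGPT Search": (2.7, 3.1),
    "Gemini": (1.8, 2.4),
    "Cross-engine median": (2.7, 3.4),
}

for engine, (m2025, m2026) in ENGINE_MEDIANS.items():
    print(f"{engine}: {pct_change(m2025, m2026):+.1f}%")

# Reddit's share of citation slots, from the source-type table above.
reddit_2025, reddit_2026 = 7.8, 11.4
print(f"Reddit: {pp_delta(reddit_2025, reddit_2026):+.1f} pp, "
      f"{pct_change(reddit_2025, reddit_2026):+.1f}% relative")
```

The distinction matters when comparing movers: Reddit's +3.6 pp corresponds to a relative increase of roughly 46% in its own share, which is a much larger move than the pp figure suggests at first glance.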

The +1.3 pp gain for vendor sites is also worth flagging. Engines appear to be slightly more willing to cite a brand's own pages in 2026 than they were in 2025, which is consistent with what we have seen anecdotally in our AI citation tracking feature: brands with well-structured FAQ pages, clear pricing pages, and entity-clean metadata are earning vendor citations at a higher rate than the same brands earned a year ago.

Finding 7: Per-engine variance and the reproducibility problem

Every prompt in this study was run three times per engine to characterize variance. The result is unambiguous: AI citation behavior is genuinely non-deterministic, and single-shot citation checks should be treated as samples, not measurements.

Average citation overlap between the three runs of the same prompt, by engine:

| Engine | Mean Jaccard overlap (3 runs) | Stable citations (in all 3 runs) | Volatile citations (in 1 of 3 runs only) |
| --- | --- | --- | --- |
| Perplexity | 0.67 | 51% of all citations | 24% |
| ChatGPT Search | 0.58 | 41% | 31% |
| Gemini | 0.54 | 37% | 33% |
| Claude | 0.49 | 33% | 36% |
Figure: 3-run citation overlap (Jaccard) by engine. Higher = more deterministic; 1.0 would mean identical citations across all 3 runs. Perplexity 0.67, ChatGPT 0.58, Gemini 0.54, Claude 0.49.

Perplexity is the most reproducible engine in the study: its mean Jaccard overlap is 0.67 and 51% of its citations recur across all three runs, which is consistent with Perplexity's retrieval-first architecture (the retrieval layer is presumably more deterministic than the generation layer). Claude is the least reproducible at 0.49 mean Jaccard, which means roughly half the citations on the average Claude answer change between runs. ChatGPT and Gemini sit in the middle.
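For readers who want to reproduce the overlap metrics on their own prompts, the sketch below shows one way to compute them. It assumes the three-run Jaccard is the mean of the pairwise Jaccard scores and that stable and volatile shares are taken over the set of distinct domains seen in any run; the study's exact definitions may differ, and the function names and example domains are illustrative.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap of two collections of cited domains."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty citation lists are treated as identical
    return len(a & b) / len(a | b)


def run_overlap(runs):
    """Summarize citation overlap across repeated runs of one prompt.

    Assumptions (not confirmed by the study): mean pairwise Jaccard,
    and shares computed over distinct domains seen in any run.
    """
    sets = [set(r) for r in runs]
    pairs = list(combinations(sets, 2))
    mean_jaccard = sum(jaccard(a, b) for a, b in pairs) / len(pairs)

    union = set().union(*sets)
    seen_in = {domain: sum(domain in s for s in sets) for domain in union}
    total = len(union) or 1
    stable = sum(1 for n in seen_in.values() if n == len(sets))
    volatile = sum(1 for n in seen_in.values() if n == 1)
    return {
        "mean_jaccard": mean_jaccard,
        "stable_share": stable / total,
        "volatile_share": volatile / total,
    }


# Illustrative data: three runs of the same prompt on one engine.
example_runs = [
    ["g2.com", "reddit.com", "wikipedia.org"],
    ["g2.com", "reddit.com", "forbes.com"],
    ["g2.com", "reddit.com", "capterra.com"],
]
result = run_overlap(example_runs)
print(result)
```

In this example two of the five distinct domains appear in every run and three appear only once, so a single run would show a three-citation answer without revealing which citations are durable.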

The practical implication is that any citation-monitoring program that relies on a single weekly snapshot is reading noise. To reliably detect a citation gain or loss for a specific brand on a specific prompt, you need either (a) at least 3-5 runs per snapshot, (b) a multi-week rolling window, or (c) both. Tools like Profound, Otterly, and Peec automate this by running prompts continuously on a schedule; if you are tracking citations yourself in a spreadsheet, the single biggest methodology upgrade is to run each prompt multiple times.

The variance also matters for benchmarking against this study. If your SaaS site appears in 4 of 10 Perplexity SaaS prompts in a single-shot check, that does not necessarily mean you are under-represented relative to the cohort baseline of 7.4 citation slots per answer. You may simply be in the roughly one-third of Perplexity citations that do not recur from run to run (mean Jaccard 0.67). Three-run validation before drawing a conclusion is the minimum standard.
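The multi-run, rolling-window approach described above can be sketched in a few lines. This is a minimal illustration for a spreadsheet-style tracker, not how Profound, Otterly, or Peec implement it; the function names, window size, and sample data are all assumptions.

```python
from collections import deque


def presence_rate(runs, domain):
    """Fraction of runs whose citation list contains the domain."""
    if not runs:
        return 0.0
    return sum(domain in set(run) for run in runs) / len(runs)


def rolling_presence(weekly_runs, domain, window=4):
    """Presence rate pooled over the most recent `window` weeks.

    weekly_runs: one entry per week, each a list of runs, where each run
    is the list of domains cited for the tracked prompt.
    Returns one pooled rate per week.
    """
    recent = deque(maxlen=window)  # old weeks fall off automatically
    rates = []
    for runs in weekly_runs:
        recent.append(runs)
        pooled = [run for week in recent for run in week]
        rates.append(presence_rate(pooled, domain))
    return rates


# Illustrative data: three weekly snapshots, three runs each.
weekly = [
    [["example.com", "g2.com"], ["g2.com"], ["reddit.com"]],
    [["example.com"], ["example.com", "g2.com"], ["g2.com"]],
    [["example.com"], ["example.com"], ["example.com", "reddit.com"]],
]
rates = rolling_presence(weekly, "example.com", window=2)
print(rates)
```

Pooling runs across weeks trades responsiveness for stability: a wider window smooths run-to-run noise but delays detection of a genuine gain or loss.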

Finding 8: Branded vs non-branded query citation patterns

Of the 1,200 prompts, 312 named a specific brand in the query (e.g., "Stripe vs Paddle for European SaaS") and 888 were non-branded category queries (e.g., "best payment processor for European SaaS"). The citation behavior between these two groups is substantially different.

| Metric | Branded queries (n=312) | Non-branded queries (n=888) |
| --- | --- | --- |
| Median citations per answer (cross-engine) | 4.6 | 3.1 |
| Vendor (brand's own domain) share | 27.4% | 11.7% |
| Editorial review share | 19.8% | 23.2% |
| Reddit share | 14.1% | 10.5% |
| Wikipedia share | 6.4% | 9.6% |
| Top citation position is vendor domain | 41% of answers | 8% of answers |

Two patterns. First, branded queries cite the named brand's own domain as the top citation in 41% of answers, which makes it the most controllable citation event in the dataset. If a user explicitly names your brand, AI engines reliably surface your own documentation, pricing page, or marketing pages as the first source. This is the "branded AI traffic" mechanism that drives the 6.4% conversion rate on branded AI queries in our revenue benchmark.

Second, non-branded queries are where Wikipedia and editorial reviews compete for the explanatory slot. When a user asks "best CRM for small SaaS teams" rather than naming a specific brand, the engines spend more citation real estate on Wikipedia for entity/category definition (9.6% vs 6.4% on branded) and on editorial reviews for vendor comparison (23.2% vs 19.8%). For brands trying to win the non-branded query, the path is editorial review presence + Wikipedia entity disambiguation, not vendor-site optimization.

Cross-vertical analysis: where citation share concentrates

If you flip the data and ask "what share of citation slots in each vertical is captured by the top 10 cited domains," you get a concentration index that varies widely:

| Vertical | Top-10 domain concentration | Long-tail share (rank 11+) |
| --- | --- | --- |
| Healthcare | 71.2% | 28.8% |
| Legal | 54.7% | 45.3% |
| Insurance | 47.9% | 52.1% |
| Fintech | 45.3% | 54.7% |
| Real estate | 38.6% | 61.4% |
| Education | 36.4% | 63.6% |
| Travel | 34.1% | 65.9% |
| Food and beverage | 32.7% | 67.3% |
| B2B services | 31.8% | 68.2% |
| Consumer electronics | 31.4% | 68.6% |
| DTC apparel | 30.9% | 69.1% |
| SaaS | 28.4% | 71.6% |

This is the cleanest single-axis read of the dataset. Healthcare is a citation oligopoly (71.2% of citation slots concentrated on 10 domains); SaaS is a citation long tail (only 28.4% concentrated on the top 10). Regulated verticals concentrate; consumer and B2B-software verticals fragment. The strategic implication is obvious in both directions: if you operate in healthcare or legal, your citation strategy is "get on the top-10 list or get nothing"; if you operate in SaaS or consumer electronics, your citation strategy is "earn slots across the long tail, no single property will save or sink you."
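The concentration index is simple to compute from a flat list of citation events. The sketch below assumes each citation event has already been reduced to its domain; the function name and sample data are illustrative, and `k=2` is used only to keep the example small (the study uses the top 10).

```python
from collections import Counter


def concentration(cited_domains, k=10):
    """Return (top-k share, long-tail share) of citation slots.

    cited_domains: one entry per citation event, already normalized
    to a domain (an assumption about preprocessing).
    """
    counts = Counter(cited_domains)
    total = sum(counts.values())
    if total == 0:
        return 0.0, 0.0
    top_k = sum(count for _, count in counts.most_common(k))
    return top_k / total, 1 - top_k / total


# Illustrative data: ten citation events in one vertical.
events = ["nih.gov"] * 5 + ["mayoclinic.org"] * 3 + ["a.example", "b.example"]
top_share, tail_share = concentration(events, k=2)
print(f"top-2 share: {top_share:.1%}, long tail: {tail_share:.1%}")
```

Running the same function per vertical on your own tracked prompts gives a number directly comparable to the table, provided the domain normalization rules are similar.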

Per-engine deep dives

Perplexity: the citation-dense engine

Perplexity is the dominant citation surface in the dataset — highest density, widest source diversity, most reproducible run-to-run. Its top source-type mix tilts heavily toward editorial reviews (24.6% of Perplexity citations) and Reddit (13.2% of Perplexity citations), with Wikipedia at a relatively low 6.8% (because Perplexity prefers fresher sources over encyclopedic ones).

| Perplexity-specific finding | Value |
| --- | --- |
| Median citations per answer | 6.4 |
| Share of all citation slots captured by top 100 domains | 38.4% |
| Reddit share of Perplexity citations | 13.2% |
| Wikipedia share of Perplexity citations | 6.8% |
| Vendor share of Perplexity citations | 16.4% |
| Average answer length (characters, prose only) | ~2,140 |

Perplexity's citation behavior is the closest of the four engines to "treat every prompt as a research query." It is the right engine to optimize for if your goal is breadth of presence across the long tail.

ChatGPT Search: the synthesized-answer engine

ChatGPT cites less densely than Perplexity but the citations it does include carry more visual weight inside the answer (footnote-style superscripts that read like editorial citations rather than a sidebar of links).

| ChatGPT-specific finding | Value |
| --- | --- |
| Median citations per answer | 3.1 |
| Share of all citation slots captured by top 100 domains | 47.3% |
| Reddit share of ChatGPT citations | 14.7% |
| Wikipedia share of ChatGPT citations | 11.3% |
| Vendor share of ChatGPT citations | 13.9% |
| Average answer length (characters, prose only) | ~1,580 |

ChatGPT cites Wikipedia and Reddit more heavily than Perplexity does (combined 26% vs Perplexity's 20%), reflecting OpenAI's heavier weight on Reddit (via the OpenAI-Reddit licensing deal) and on Wikipedia as an entity backbone. For most brands, ChatGPT is the engine where Reddit presence pays the most dividend per dollar of effort.

Claude: the high-trust, low-density engine

Claude's citation density is mid-range (median 3.6, second only to Perplexity's 6.4 but much closer to ChatGPT's 3.1), and it has the lowest run-to-run reproducibility of the four engines (0.49 Jaccard). Its source mix skews toward authoritative editorial properties and primary sources.

| Claude-specific finding | Value |
| --- | --- |
| Median citations per answer | 3.6 |
| Share of all citation slots captured by top 100 domains | 51.4% |
| Reddit share of Claude citations | 7.9% |
| Wikipedia share of Claude citations | 9.1% |
| Vendor share of Claude citations | 11.2% |
| Academic / government share of Claude citations | 11.8% |

Claude under-indexes on Reddit (7.9% vs the cross-engine 11.4%) and over-indexes on academic and government sources (11.8% vs cross-engine 6.5%). This makes Claude the easiest engine to win citations on for regulated-industry brands with authoritative content (and the hardest for consumer-product brands that rely on community-driven recommendation).
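"Under-indexes" and "over-indexes" here mean the ratio of an engine's source-type share to the cross-engine share. A small sketch of that ratio, using Claude's figures from the table above and the cross-engine shares from the May 2026 source-type table; the dictionary and variable names are illustrative.

```python
# Shares in percent of citation slots.
cross_engine = {
    "Reddit": 11.4,
    "Wikipedia": 8.9,
    "Vendor sites": 14.7,
    "Academic / government": 6.5,
}
claude = {
    "Reddit": 7.9,
    "Wikipedia": 9.1,
    "Vendor sites": 11.2,
    "Academic / government": 11.8,
}

# Index > 1.0 means the engine over-indexes on that source type,
# index < 1.0 means it under-indexes.
claude_index = {
    source: round(claude[source] / cross_engine[source], 2)
    for source in cross_engine
}

for source, idx in claude_index.items():
    print(f"{source}: {idx:.2f}x the cross-engine share")
```

On these numbers Claude cites academic and government sources at roughly 1.8x the cross-engine rate and Reddit at roughly 0.7x, which is the asymmetry the paragraph above describes.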

Gemini: the trusted-pool engine

Gemini cites the fewest unique domains per answer (median 2.4) and concentrates those citations on a small pool dominated by Google's own properties and a short list of trusted editorial domains.

| Gemini-specific finding | Value |
| --- | --- |
| Median citations per answer | 2.4 |
| Share of all citation slots captured by top 100 domains | 58.7% |
| Reddit share of Gemini citations | 8.4% |
| Wikipedia share of Gemini citations | 12.1% |
| Vendor share of Gemini citations | 14.1% |
| YouTube share of Gemini citations | 9.7% |

The 9.7% YouTube share on Gemini is the single most engine-specific quirk in the dataset — Gemini cites YouTube content roughly 2.5x more often than the other three engines do, reflecting Google's deep integration of YouTube as a knowledge source. For brands in any vertical where YouTube content is part of the conversation (consumer electronics, education, food, travel), Gemini citation strategy looks meaningfully different from ChatGPT or Perplexity strategy.

Implications for marketers

The numbers above produce eight strategic implications that I think generalize across most brand contexts. I am stating them as opinions because the data only supports the patterns; the actions are interpretation.

1. Per-engine strategy is not optional. Citation density varies 2.7x between Perplexity and Gemini, source-type mix varies 2-3x by engine, and run-to-run reproducibility varies from 0.49 to 0.67. A single "AI optimization" strategy that treats the four engines as one bucket will underperform a per-engine strategy on at least one engine. The minimum useful split is "Perplexity (breadth play)" + "ChatGPT (Reddit + Wikipedia play)" + "Claude (authoritative editorial play)" + "Gemini (Google-properties play)."

2. Editorial review presence is the highest single lever in most verticals. Across the corpus, editorial reviews captured 22.3% of citation slots. In SaaS, getting on G2's "best of" lists is roughly as valuable as ranking on page one of Google. In fintech, NerdWallet placement is the equivalent. In consumer electronics, Wirecutter and Rtings. The single highest-leverage non-paid GEO investment most brands can make is concentrated lobbying of the 3-5 editorial properties that dominate their vertical.

3. Reddit strategy is vertical-specific. Reddit captured 23.1% of consumer electronics citations and 1.4% of legal citations. For consumer-facing brands, an honest, sustained Reddit presence (real product accounts, real engagement, no astroturfing) is one of the few low-cost citation levers left. For B2B regulated-industry brands, Reddit is a rounding error.

4. Vendor-site optimization pays in branded queries, not non-branded. Vendor citations dominate 41% of branded-query top slots but only 8% of non-branded top slots. If your AI strategy is "get cited on my own domain," it works for branded queries and fails for non-branded discovery. The two need separate playbooks.

5. Healthcare and regulated industries need a top-10 strategy, not a long-tail one. Healthcare's top 10 cited domains capture 71.2% of all citation slots. If you sell into healthcare, your only realistic path to AI visibility is becoming one of those 10 — or earning citations on the small set of trusted editorial sources (Healthline, WebMD) that the engines accept as proxies for the authoritative pool.

6. Run citation checks at least three times. Run-to-run reproducibility is 0.49-0.67 depending on engine. A single-shot citation snapshot has roughly a one-in-three chance of either over- or under-counting your presence on any given prompt. Multi-run sampling is non-negotiable for serious citation monitoring.

7. Wikipedia entity disambiguation is plumbing, not strategy. Wikipedia holds steady at 8.9% of citation slots across engines and years, but its role is as an entity backbone, not as a discovery surface. If your brand has a Wikipedia page (where appropriate) with clean entity data, citations on Wikipedia-adjacent queries are essentially free; if you don't, you are leaving a small but predictable share of citation real estate on the table.

8. Track citation share by vertical, not by engine alone. The cross-vertical concentration index (28.4% in SaaS, 71.2% in healthcare) means the "good" benchmark depends entirely on your vertical. A 10% share of voice in healthcare is enormous; a 10% share of voice in SaaS is one large editorial property. Benchmarking against this study's per-vertical baseline is more useful than benchmarking against a cross-vertical average.

For practical execution on the eight implications above, our AI citation tracking and share of voice AI search features automate the per-engine, per-vertical monitoring loop. The per-engine landing pages (track ChatGPT traffic, track Claude traffic) walk the engine-specific signal capture in detail. And for the question of how AI engines actually pick sources in the first place, see how AI engines choose sources.

Limitations

I have already flagged the per-section limitations in the methodology and in each finding, but for completeness, the consolidated list:

| Limitation | Effect on numbers |
| --- | --- |
| Prompt corpus skews SMB / prosumer phrasing | Fortune 500 RFP language is under-represented |
| Snapshot is one month (April 12 - May 14, 2026) | Seasonal patterns and engine update cycles not in scope |
| Three runs per prompt, given 49-67% run overlap | Single-engine cell means carry ±15% variance |
| Vertical taxonomy is 12 categories | Some within-vertical variation (e.g., banking vs personal finance) is lumped |
| English-language only | Non-English citation behavior is out of scope |
| No US-vs-EU geographic comparison in the headline | Our validation cohort showed ±3pp shifts, not headline-level |
| No revenue or click join | This study measures presence, not outcome |
| Engine versions update mid-window | GPT-5 rollout and Claude Opus 4.5 release both happened during the snapshot |
| Self-published, no external peer review | Replicate before betting a strategy on a single number |

Where to read next

This study is one of three companion pieces. If you came in for the citation data, the natural next read is one of:

For the product side, the AI visibility score feature operationalizes the per-engine, per-vertical share-of-voice numbers in this study at the per-customer level.

FAQ

What is the median number of citations per answer across AI engines in 2026?

Across the 1,200-prompt corpus, Perplexity cited a median of 6.4 unique domains per answer, Claude 3.6, ChatGPT 3.1, and Gemini 2.4. The blended cross-engine median sits at 3.4 citations per answer, but reporting the blend hides the most important fact in the dataset: Perplexity cites roughly 2.7x more URLs per answer than Gemini and 2.1x more than ChatGPT. For benchmarking purposes the per-engine median is the only honest number — a blended figure averages four engines with structurally different citation behaviors and will mislead any single-engine optimization decision.

Which industry vertical gets the most AI citations per answer in 2026?

SaaS leads at a blended median of 5.1 unique domains cited per answer, with Perplexity citing 7.4 domains per SaaS prompt on average. Legal (4.7 blended), fintech (4.6), and consumer electronics (4.5) round out the top four. Healthcare sits at the bottom at a blended 2.4 — and only 2.1 on Perplexity — because the engines aggressively concentrate health answers on a small set of high-trust sources (NIH, Mayo Clinic, CDC, WebMD, Cleveland Clinic) rather than spread citations across a long tail. The ranking is consistent across every engine we tested.

How much of AI citation share goes to Reddit, Wikipedia, and forums vs vendor sites?

Reddit captured 11.4% of all citation slots, Wikipedia 8.9%, vendor sites 14.7%, editorial reviews (Wirecutter, NerdWallet, G2, Healthline, Investopedia) 22.3%, forums and Q&A 7.8%, news and press 9.1%, and academic or government sources 6.5%. The remaining 19.3% is a long tail of blog posts, documentation pages, podcasts, and YouTube. The share varies dramatically by vertical: Reddit takes 23.1% of citations on consumer electronics prompts but only 1.4% on legal prompts.

Why does Perplexity cite more sources per answer than ChatGPT?

Architecturally, Perplexity is built as a retrieval-first answer engine that surfaces inline citations as a primary UX element, so the product incentive is to display more sources visibly. ChatGPT Search uses retrieval as a supporting layer behind a more synthesized answer; citations appear as footnote-style references with fewer slots. The result, measured across our 1,200-prompt corpus, is that Perplexity exposes a median 6.4 unique domains per answer to ChatGPT's 3.1, a 2.1x ratio.

How did we run the AI citation benchmark study?

We constructed 1,200 buyer-intent prompts — 100 per vertical across 12 verticals — drawn from Google Search Console queries on attrifast.com and client sites, public Reddit threads, and AnswerThePublic exports filtered to commercial-intent patterns. Each prompt was executed three times per engine on ChatGPT Search, Claude with web search, Gemini, and Perplexity, between April 12 and May 14, 2026. Total observation count: 14,400 distinct prompt-engine-run triples producing roughly 51,723 citation events on roughly 8,917 unique domains.
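The observation count follows directly from the study design, and the same arithmetic gives a rough mean citations-per-run figure. A minimal sketch using only the numbers stated above; the constant names are illustrative.

```python
# Study design as stated in the methodology.
VERTICALS = 12
PROMPTS_PER_VERTICAL = 100
ENGINES = 4
RUNS_PER_ENGINE = 3

prompts = VERTICALS * PROMPTS_PER_VERTICAL
observations = prompts * ENGINES * RUNS_PER_ENGINE

# Reported citation events across all prompt-engine-run triples.
citation_events = 51_723
mean_citations_per_run = round(citation_events / observations, 2)

print(prompts, observations, mean_citations_per_run)
```

The mean of about 3.6 citations per run is an average over all engines and verticals, so it is not directly comparable to the per-engine medians reported elsewhere in the study.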

What are the top cited domains in AI answers across all verticals?

The most-cited domain was Reddit (11.4% of all citation slots, spread across many subreddits), followed by Wikipedia (8.9%), G2 (3.1%), Forbes (2.6%), Investopedia (2.3%, concentrated in fintech), Healthline (2.1%, concentrated in healthcare), and Wirecutter (1.7%, concentrated in consumer electronics). After the top 20, the long tail is very long: roughly 4,200 distinct domains appeared as citation events at least once, and the bottom 50% of cited domains were cited only once across all 14,400 prompt-runs.

How much has AI citation behavior changed vs 2025?

Citation density rose roughly 28% year-over-year between May 2025 and May 2026. Claude grew fastest (+50%), followed by Gemini (+33%) and Perplexity (+31%); ChatGPT grew the slowest at +15%. The source-type mix also shifted: Reddit's share of citations grew from 7.8% in our 2025 reference sample to 11.4% in 2026, reflecting Reddit's licensing deals entering steady-state, while Wikipedia's share stayed essentially flat (9.1% to 8.9%). The 2025 reference numbers come from a smaller 300-prompt pilot, so treat the YoY deltas as directional.

How variable are AI citations between runs of the same prompt?

More variable than most operators assume. Average overlap between three runs was 67% on Perplexity, 58% on ChatGPT, 54% on Gemini, and 49% on Claude. That means one in three to one in two citations changes between runs. Variance is highest on broad category queries and lowest on specific entity queries. The practical implication: a single-shot citation check is worth about as much as a single-day rank check — directional, not definitive.

What share of AI citations go to a brand's own domain?

Vendor citations averaged 14.7% of all citation slots across the corpus. The share is highest in SaaS at 21.3% and lowest in healthcare at 5.2%. For most brands, the vendor share is the single most controllable lever: a well-structured documentation site with answer-shaped FAQ pages and entity-clean metadata can earn 2-3 vendor citations per branded query on Perplexity and ChatGPT, even without external backlinks.

Which engine is hardest to earn citations on?

Gemini, by a wide margin. Gemini cites the fewest unique domains per answer (median 2.4) and concentrates citations on a smaller, more "trusted" pool: Google's own properties, Wikipedia, government domains, and a short list of editorial properties. Claude is second-hardest: a median density of 3.6, well below Perplexity's 6.4, plus a preference for primary sources and authoritative editorial properties. Perplexity is the easiest engine to earn a first citation on, because its citation density and source-diversity preferences create more open slots.

Does this study measure citation-driven traffic or just citation presence?

Just citation presence. This study measures whether and how often a domain appears as a citation in AI engine answers — it does not measure whether the citation produced a click, a session, or a paying customer. For the traffic and revenue side, see the 2026 AI Traffic Revenue Benchmark across 200 Stripe-connected sites. The two studies are complementary.

What are the methodology limitations of the citation benchmark?

Five worth flagging. (1) Prompt selection bias — our 1,200 prompts skew toward SMB phrasing, not Fortune 500 procurement. (2) Engine harness drift — engines update silently and the snapshot is a one-month window. (3) Three-run sampling with 33-51% inter-run variance is directional, not precision-grade. (4) The 12-vertical taxonomy lumps some categories together. (5) No revenue join — this study measures presence only.

Should I treat the per-vertical numbers as targets or directional?

Directional, with caveats. The cohort medians and per-engine ratios are stable enough that we publish them as benchmarks — if your SaaS site is appearing in 1 of 10 Perplexity SaaS prompts, that is meaningfully below the cohort baseline of roughly 7.4 distinct citation slots per answer. But absolute citation counts depend on the prompt mix, the snapshot window, and the dedup rules. Another team running a similar study on a different prompt set would get numbers within roughly ±15% of ours.

How does this study compare to Profound, SEOcrawl, or Princeton GEO research?

Different layers of the same problem. Profound, Otterly, and Peec AI run continuous citation monitoring on customer-specified prompts, so their numbers are per-customer rather than a cross-vertical benchmark. SEOcrawl publishes aggregated prompt-tracking data but at smaller scale and with less methodology transparency. The Princeton GEO research tested which on-page changes lift visibility but did not publish per-vertical citation distributions. This study sits in the gap: a published, methodology-transparent cross-vertical benchmark that any team can replicate.

See your citation share across ChatGPT, Claude, Gemini, and Perplexity

This study measured presence at the industry level. Attrifast measures presence and revenue at your level — joined to Stripe payments so you know which AI citations actually pay.

Start free trial →

5-day free trial · $29/mo · cancel anytime

References

  1. Princeton GEO research — GEO: Generative Engine Optimization (Aggarwal et al.)
  2. Profound — Profound Index of AI search visibility
  3. SEOcrawl — AI prompt tracking research
  4. Reuters — OpenAI ChatGPT user statistics
  5. Search Engine Land — Google AI Overviews coverage and tracker
  6. BrightEdge — Google AI Overviews research
  7. Backlinko — AI search and ChatGPT statistics studies
  8. Ahrefs — Generative Engine Optimization research
  9. Semrush — AI Overviews and AI search research
  10. SimilarWeb — AI chatbot traffic tracker
  11. Cloudflare Radar — AI bot and AI search insights
  12. StatCounter — Search engine market share
  13. OpenAI — ChatGPT Search help and citations
  14. OpenAI — Reddit content licensing deal coverage
  15. Anthropic — Claude web search documentation
  16. Anthropic — Acceptable use policy and high-trust domains
  17. Google — AI Overviews and Gemini citation behavior
  18. Google — AI principles and responsibility
  19. Perplexity — How Perplexity works and citation FAQ
  20. Reuters — Reddit Google AI training licensing deal
  21. Pew Research — Americans and AI tools usage
  22. MIT Technology Review — AI search and generative engines coverage
  23. Wikipedia — English Wikipedia data dumps
  24. Common Crawl — Open web corpus used in LLM training
  25. G2 — Best-of category reviews and methodology
  26. Wirecutter (NYT) — Editorial review methodology
  27. NerdWallet — Editorial review methodology
  28. Healthline — Editorial guidelines and high-trust health content
  29. Investopedia — Editorial process and citation standards
  30. Backlinko — Content marketing and original research studies
  31. Profound — Agent analytics and citation measurement
  32. Otterly — Brand monitoring across AI search engines
  33. Peec AI — AI visibility tracking platform
  34. Plausible Analytics — ChatGPT referrer measurement
  35. Stripe — Webhook delivery and idempotency
  36. ChartMogul — SaaS Growth Report and benchmarks
  37. Baremetrics — Open SaaS benchmarks

Related reading

Original Research · 34 min
AI Citation Rate Benchmarks by Brand Size 2026: Startup vs Growth vs Enterprise (200-Site Cohort Study)
What counts as a good AI citation rate depends on how big your company is. A 200-site Attrifast cohort cut by headcount, ARR, domain authority, and content age, reconciled with public Peec.ai, Profound, and SEOcrawl numbers.
Pricing · 31 min
The Real Cost of AI Citation Monitoring in 2026: An Honest Spreadsheet
A line-item breakdown of what AI citation monitoring actually costs in 2026, from Profound's $499/mo Growth plan to a $0 ChatGPT-and-spreadsheet rig. With real G2 quotes, real pricing, and the math for when each tier pays back.
GEO Strategy · 24 min
AI Citations vs Backlinks: What Actually Drives Visibility in 2026
Backlinks and AI citations are correlated but not the same currency. A 2026 breakdown of how each is earned, measured, and manipulated — and why neither matters until it drives revenue.
Original Research · 26 min
The 2026 AI Search Revenue Benchmark: Real Data From 200 Stripe-Connected Sites
An original 200-site, Stripe-joined benchmark of AI search traffic and revenue in 2026: ChatGPT, Perplexity, Claude, Gemini, and AI Overviews — RPV, conversion, vertical splits.
Original Research · 28 min
How Much Traffic Comes From ChatGPT in 2026? The Attrifast 200-Site Cohort Benchmark
Across roughly 200 Stripe-connected sites in the Attrifast cohort, ChatGPT now sends a median 3.4% of measured sessions in May 2026. Full breakdown by vertical, site size, geography, and growth curve since 2024.
