
How to Analyze Your Competitors' AI Visibility (and Beat Them in 2026)

29 min read · Updated May 2026

Vincent Ruan, Founder, Attrifast · May 27, 2026 · 29 min read

A step-by-step method to analyze why ChatGPT, Perplexity, Claude and Gemini recommend your competitors over you: build a buying-query prompt set, tally per-competitor share of voice, tear down their citation sources, then close the gaps that actually drive your revenue.

Part of the generative engine optimization guide.

TL;DR

The useful version of "why does ChatGPT recommend my competitor" is not a feeling — it is a measured, per-competitor share-of-voice table built from a fixed prompt set of your category's buying queries, run across ChatGPT, Perplexity, Claude and Gemini, with every cited URL recorded.
Once you know who out-cites you, the analysis that matters is the citation-source teardown: for a winning competitor, which source class feeds their citations — their own pages, third-party listicles, Reddit, Wikipedia, G2, or news — because that tells you the one structural thing to match, not their entire ten-year footprint.
Most "competitor AI analysis" stops at "they're cited more." That is the vanity half. The useful half identifies the specific structural reason (entity, source graph, content format, organic rank) and asks whether closing it is worth the revenue it would win back.
Gaps split cleanly into fast and slow. Publishing an honest comparison page or shipping a Wikidata entity and sameAs graph is days. Earning a Wikipedia article, accumulating Reddit standing, or out-ranking an incumbent organically is quarters. Sequence the fast, revenue-relevant ones first.
Beating a category-defining incumbent head-on in "best [category] tools" is usually not winnable through GEO. Owning the "[incumbent] alternatives" and "[category] for [segment]" prompts — where intent is higher and the incumbent's dominance is weaker — usually is.
Knowing a competitor out-cites you is half the battle. The other half is whether winning the slot back actually pays. Measure cited→clicked→paid per query with Attrifast.

Figure: Competitor AI-visibility analysis in 3 layers: measurement (tally citation share per competitor across engines), forensics (reverse-engineer why the winner is cited), and the revenue wedge.

A founder emailed me in April with a screenshot. He had asked ChatGPT "what's the best privacy-first analytics tool for a small SaaS," and the answer named three competitors and not him. His message was two words: "fix this." My reply was a question: "fix what, exactly?" Because "ChatGPT recommends my competitor" is not a diagnosis. It is a symptom, and the symptom has a specific, findable cause — usually a source the competitor sits in that you do not. Until you know which source, "fix this" has no target.

This is the offensive, competitive-intelligence companion to two pieces I have already written. "ChatGPT isn't recommending your product" is the inward-facing diagnostic: the eight reasons the model ignores you. "AI share of voice" is the metrics piece: what the resulting mention-count number means and why revenue share of voice matters more. This article is the third leg: how to look outward, measure why the engines recommend your competitors, reverse-engineer the structural reason, and decide which gap to close based on whether it moves your revenue. It is a method, not a pep talk. By the end you will have run a real audit on your own category and have a ranked list of gaps with a fast/slow tag and a revenue read on each.

A note on honesty up front, because this topic attracts snake oil. There is no button that drops your competitor from ChatGPT's answer, no algorithm to reverse-engineer, and no guarantee that out-citing a rival pays. What there is: a repeatable way to see the gap precisely, a way to find the source that causes it, and a way to measure whether closing it shipped you money. That is the whole article.

Quick Facts

Metric	Value	Source
ChatGPT weekly active users (Q4 2025)	~400 million	OpenAI investor update [1]
Brands named per ChatGPT "best of" answer (typical)	3-5	OpenAI search docs [2]
Brands cited per Perplexity answer (typical)	4-7	Perplexity docs [3]
Brands cited per Claude web-search answer (typical)	1-3	Anthropic docs [4]
Minimum prompts for a stable per-competitor SOV ranking	~30 per engine	Attrifast methodology
Recommended samples per prompt	3-5	Stochastic-sampling correction
Reddit's share of cited domains in AI answers (2025)	Among the most-cited domains	Search Engine Land / citation studies [5]
Reddit content-licensing deal that boosted its AI presence	Google ~$60M/yr; OpenAI partnership	Reuters [6]
GEO methods lifting citation share in controlled tests	Up to ~40% visibility lift	Princeton GEO paper [7]
GA4 default attribution for ChatGPT clicks	Direct/(none); no AI rule	Google Analytics docs [8]
Trackers that compute competitor share of voice	Profound, Otterly, Peec, SE Ranking	Vendor docs [9][10][11]
Trackers that join competitor SOV to your Stripe revenue	0 of the major ones	Vendor docs
ChatGPT RPV vs Google organic (B2B SaaS)	1.4-2.1x, n=24	Attrifast aggregate, Q1 2026

Two of those rows frame the whole piece. The Princeton GEO finding [7] — that specific structural moves lift citation share by up to ~40% in controlled tests — is why competitive gaps are closable at all: citation presence is engineered, not innate. And the "0 trackers that join competitor SOV to your Stripe revenue" row is the gap this article keeps circling back to. The trackers can tell you a competitor out-cites you. None of them can tell you whether closing that gap put money in your account.

What competitive AI visibility analysis actually is

Competitive AI visibility analysis is measuring how often your competitors appear inside AI-generated answers, reverse-engineering the sources that feed those citations, and isolating the structural gaps that — if closed — would win you revenue, not just mentions. It is the AI-era version of the competitor SEO audit, but the unit of analysis moved from "who ranks for the keyword" to "who gets named in the synthesized answer," and the answer names three to five brands, not ten.

It has three distinct layers, and the value rises sharply as you go down them. Most teams do the first and stop, which is exactly why most competitive AI analysis is useless.

Layer	Question it answers	Output	Most teams
1. Measurement	Who gets cited, how often, on which engine, for which prompts?	Per-competitor share-of-voice table	Do this and stop
2. Forensics	For a cited competitor, what source feeds the citation?	Citation-source teardown per competitor	Skip
3. Wedge	Which gaps, if closed, would move my revenue?	Ranked, revenue-weighted gap list	Almost nobody does

Layer one tells you the score. It feels like progress — you now have a chart showing your competitor at 38% and you at 11%. But a score with no cause is not actionable; you cannot "fix 11%." Layer two finds the cause: the competitor is at 38% because they appear in seven listicles, have a Wikipedia entity, and own three Reddit threads, while you have two listicles and no entity. Now you have targets. Layer three is the one that separates useful analysis from busywork: of those targets, which one is realistic to close, feeds a prompt that actually converts, and would plausibly move your revenue? Closing a gap that wins you a definitional citation nobody clicks is motion without progress.

The discipline of this article is to push all the way to layer three. A competitor out-citing you is a fact. Whether it costs you money — and whether closing it earns you money — is the only question that survives a board meeting. I detail the metrics half of this argument in AI share of voice vs revenue share of voice; here I am applying it to the competitive case specifically.

Why ChatGPT recommends your competitor: the mechanic

ChatGPT recommends your competitor because the competitor sits in more of the third-party sources the model grounds its answer in — the model recommends what the web has already recommended, and your competitor has been recommended more. It is rarely about product quality, which the model cannot assess directly. It is about the density and authority of the citation graph around each brand.

When a user asks for a recommendation, the engine is doing one of two things. Without browsing it recalls from its training corpus — a frozen snapshot weighted toward brands mentioned often and authoritatively before the cutoff. With browsing it runs a live search, retrieves a handful of pages, and synthesizes an answer with inline citations. Your competitor can win in either mode, and the source of their advantage differs:

Mode	Why the competitor wins	What feeds it	Your lever speed
Corpus (no browsing)	Mentioned more, more authoritatively, before the cutoff	Accumulated press, Wikipedia, Reddit, listicles	Slow (next retrain)
Retrieval (browsing on)	More crawlable, better-structured, higher-ranked pages now	Live listicles, comparison pages, organic rank, schema	Fast (days to weeks)

This matters for your analysis because you must test both modes per competitor or you will misread the gap. A competitor who beats you only in corpus mode has an accumulated-authority advantage you close slowly; one who beats you in retrieval mode has a structural-content advantage you can often match this quarter. I unpack the corpus-versus-retrieval split in detail in ChatGPT isn't recommending your product — for the competitive case, the key is that the two modes point you at two different sets of gaps with two different clocks.

One more mechanic that prevents the most common analysis error: a mention is not a citation is not a click is not a sale. Your competitor being named in an answer (a mention) is different from their URL being cited (a clickable path), which is different from a user clicking it, which is different from that click converting. Keep these four states separate or your competitive read will blur exactly where it matters.

State	What it means for a competitor	How you detect it
Mentioned	Model named the brand, no link	Read the answer text
Cited	Their URL appears as a footnote	Read the answer's sources
Clicked	A user followed the citation to them	You cannot see a competitor's clicks
Converted	That click paid them	You cannot see a competitor's revenue

The last two rows are why competitive analysis has a hard ceiling: you can measure a competitor's presence precisely, but you can never see their revenue. You can only measure the revenue impact on your own side after you close a gap. That asymmetry shapes the entire method.

The step-by-step competitive AI-visibility audit

Here is the full audit, the one I run for my own properties and walk customers through. It is seven steps, roughly two to three hours of manual work for a 30-prompt set across four engines, or an afternoon with a prompt-replay tool. Do it manually the first time even if you will automate later, because the manual pass teaches you what the automated dashboards are abstracting away.

Step 1: define the competitive set

Write down the explicit list of brands you compete with for the prompts you care about. Five to twelve is typical. This is your denominator, and it silently determines every share-of-voice number downstream. Include the brands you actually lose deals to and the ones that show up when you ask the engines your category question — not just the ones you think about. If you omit a strong competitor, your relative position looks better than it is; if you pad the list with brands nobody considers, the numbers get diluted.

Competitive-set sizing	What happens to the analysis
2-3 brands	Share swings wildly; treat as directional only
4-8 brands	Stable, the usual sweet spot
9-15 brands	Realistic for crowded categories; shares run smaller
16+ brands	Denominator so large signal washes out; segment the category instead

A useful trick: run one "who competes with [your category leader]" prompt across the engines before you finalize the set. The brands the model itself clusters together are the brands it will choose between in recommendation answers — that cluster is your real competitive set in the model's eyes, which is sometimes broader or narrower than your sales team's list.

Step 2: build the buying-query prompt set

This is where most competitive audits quietly break. Keyword tools return Google-shaped phrases ("best CRM small business"); AI users type conversational, buying-stage questions ("what's a good CRM for a 5-person team that's already on Stripe"). Build the set from the queries a buyer actually asks an AI when they are choosing, blended from three sources:

Prompt source	Contributes	Share of set
Keyword research (Ahrefs, Semrush)	Breadth of category topics	35%
Conversational question expansion	Real AI phrasing	30%
Observed buyer language (sales calls, support, Reddit)	Highest-intent prompts	35%

Then weight the set itself toward the buying-stage intent classes, because those are the answers that produce buyers. A 30-prompt audit set I would defend:

Intent class	Example	Count	Commercial value
Best-of / category	"best [category] tools for SMB SaaS"	8	Very high
Alternatives	"[incumbent] alternatives"	6	Highest — your query
Versus	"[you] vs [competitor]"	4	Highest
For-segment	"[category] tool for [niche use case]"	6	High — winnable
Recommendation	"what should I use to [job to be done]"	4	High
Definitional	"what is [category]"	2	Low — sanity check only

Notice definitional prompts are capped at two. They are easy to win and rarely convert; their only job here is to confirm the engines know your category exists. The audit's weight goes where the buyers are. Tag each prompt with its intent class now — you will need the tag in step seven to separate gaps that feed converting prompts from gaps that feed vanity ones.
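
A minimal Python sketch of the tagging step, assuming you keep the prompt set in code rather than a spreadsheet. The `Prompt` class, the class labels, and `count_gaps` are hypothetical names for illustration; the target counts are the ones in the 30-prompt table above.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Prompt:
    text: str
    intent_class: str  # one of the keys in TARGET_COUNTS


# Target counts per intent class, copied from the 30-prompt table.
TARGET_COUNTS = {
    "best_of": 8,
    "alternatives": 6,
    "versus": 4,
    "for_segment": 6,
    "recommendation": 4,
    "definitional": 2,
}


def count_gaps(prompts: list[Prompt]) -> dict[str, int]:
    """Return actual minus target per class; all zeros means the set matches."""
    actual = Counter(p.intent_class for p in prompts)
    return {cls: actual.get(cls, 0) - target for cls, target in TARGET_COUNTS.items()}
```

A negative value means the class is under-filled, a positive value means it is over-weighted. Running the check before step three is cheap insurance against a set that drifted toward easy definitional prompts.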

Step 3: run, sample, record

Run each prompt against each engine, browsing on, three to five times, because the engines sample stochastically and a single run is a roll of the dice, not a reading. For the corpus check, also run your best-of and versus prompts once with browsing off to see who the model recommends from memory alone. Record per the schema below — a spreadsheet is fine for 30 prompts.

Field	Type	Used for
prompt	text	Grouping
intent_class	enum	Gap-to-revenue mapping
engine	enum	Per-engine SOV
browsing	bool	Corpus vs retrieval split
run_index	int	Stochastic dedup
brands_named	text[]	Mention SOV numerator
brands_cited_url	text[]	Citation SOV numerator
cited_urls	text[]	The teardown raw material
your_position	int	Where you placed, if at all

The cited_urls column is the most valuable thing you will collect and the one manual auditors skip because it is tedious. Those URLs are the raw material for the entire layer-two teardown. Do not just record that a competitor was cited — record what page was cited. The difference between "Competitor X appeared" and "Competitor X appeared via a G2 listicle and a Reddit thread" is the difference between a score and a strategy.
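
If you would rather log runs in code than type them into a sheet, the schema above maps onto a small record type. This is a sketch under assumptions: `AuditRecord` and `to_csv` are hypothetical names, and joining list fields with `|` is just one convention that keeps the CSV spreadsheet-friendly.

```python
import csv
import io
from dataclasses import asdict, dataclass, field, fields
from typing import Optional


@dataclass
class AuditRecord:
    prompt: str
    intent_class: str
    engine: str
    browsing: bool
    run_index: int
    brands_named: list[str] = field(default_factory=list)
    brands_cited_url: list[str] = field(default_factory=list)
    cited_urls: list[str] = field(default_factory=list)
    your_position: Optional[int] = None  # None when you did not place


LIST_FIELDS = ("brands_named", "brands_cited_url", "cited_urls")


def to_csv(records: list[AuditRecord]) -> str:
    """Serialize records to CSV text, one row per run, list fields joined with '|'."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=[f.name for f in fields(AuditRecord)])
    writer.writeheader()
    for rec in records:
        row = asdict(rec)
        for name in LIST_FIELDS:
            row[name] = "|".join(row[name])
        writer.writerow(row)
    return buf.getvalue()
```

One row per prompt, engine, and run keeps the stochastic samples separate, so you can still see run-to-run variation when you tally later.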

Steps 4 and 5: tally per-competitor share of voice

For each engine, count how many answer slots each brand won (named, or cited with a URL — track both), divide by the total brand slots, and you have per-competitor share of voice. Then weight a blended number by where your buyers actually are, not by the engines' aggregate user counts.

SOV variant	Numerator	Tells you
Mention SOV	Times the brand was named	Awareness footprint
Citation SOV	Times the brand's URL was cited	Click-capable visibility
Position-weighted SOV	Citations weighted by their position in the source list	Approximate click share
Win rate	Prompts where the brand placed first	Recommendation dominance
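
The tally itself is simple arithmetic. A minimal Python sketch, assuming each recorded answer has already been reduced to a list of brands (pass `brands_named` for mention SOV or `brands_cited_url` for citation SOV). The function names and the buyer weights are hypothetical; the weights are yours to estimate.

```python
from collections import Counter


def share_of_voice(answers: list[list[str]]) -> dict[str, float]:
    """Each brand's share of all brand slots across the recorded answers for one engine."""
    slots = Counter(brand for answer in answers for brand in answer)
    total = sum(slots.values())
    if total == 0:
        return {}
    return {brand: n / total for brand, n in slots.items()}


def blended_sov(per_engine: dict[str, float], buyer_weights: dict[str, float]) -> float:
    """Blend one brand's per-engine SOV, weighted by where your buyers are.

    Weights are normalized, so they do not need to sum to 1.
    An engine missing from per_engine counts as 0 share.
    """
    total_weight = sum(buyer_weights.values())
    weighted = sum(per_engine.get(engine, 0.0) * w for engine, w in buyer_weights.items())
    return weighted / total_weight
```

Keep the per-engine dictionaries around after blending. The blended figure is a summary, and the per-engine spread is usually where the actionable signal sits.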

A worked tally from a real audit (anonymized category, my own 30-prompt run across four engines, four runs each):

Brand	ChatGPT SOV	Perplexity SOV	Claude SOV	Gemini SOV	Blended	Win rate
Competitor A (the winner)	41%	37%	22%	33%	35%	19/30
Competitor B	22%	29%	11%	24%	23%	6/30
Competitor C	18%	14%	8%	19%	15%	3/30
You	9%	16%	4%	11%	11%	2/30
Competitor D	6%	4%	2%	8%	5%	0/30

Two things jump out before any forensics. First, Competitor A wins 19 of 30 prompts — they are the dominant recommendation, not just a frequent mention. Second, your own best engine is Perplexity (16%) and your worst is Claude (4%), which already tells you Perplexity is where a gap is most closable and Claude is where the corpus barrier is highest. The blended single number (11%) would have hidden both facts. This is exactly the per-engine divergence I argue against flattening in the share of voice piece.

The competitor citation-source teardown

This is layer two, and it is where competitive AI analysis stops being a scoreboard and becomes a plan. A citation-source teardown takes one cited competitor and identifies which source class is actually feeding their citations — their own pages, third-party listicles, Reddit, Wikipedia, G2, or news — so you know the one structural thing to match rather than guessing. The raw material is the cited_urls column you collected in step three plus a few targeted searches.

Six source classes account for nearly all AI citations. For each, here is what it is, how to find whether a competitor has it, and roughly how much weight it carries:

Source class	What it is	How to find it for a competitor	Citation weight
Own structured pages	Their comparison / best-of / docs pages the model lifts	Read the cited URLs; note their own-domain pages	Medium-high (retrieval)
Third-party listicles	"Best [category] tools" roundups naming them	Google the buying queries; read top-10 listicles for their name	Very high
Reddit / forums	Recommendation threads naming them	Search `"Competitor" reddit` and `best [category] reddit`	High (over-weighted)
Wikipedia / Wikidata	Their entity card	Search Wikipedia and Wikidata for the brand	Very high (over-weighted)
G2 / Capterra	Review profile and review count	Visit their G2 / Capterra page; note review count	Medium
News / press	TechCrunch, trade press, funding coverage	`"Competitor" -site:competitor.com` news search	Medium-high

Run that against your winning competitor and you get a teardown like this one, the actual shape for Competitor A above:

Source class	Competitor A has it?	Evidence	You have it?
Own structured pages	Yes — 6 comparison/alternatives pages	Cited 11x across the audit	Partial — 1 page
Third-party listicles	Yes — named in 7 of top-10 roundups	Cited 14x	Named in 2
Reddit / forums	Yes — recommended in 4 threads	Cited 5x	Named in 0
Wikipedia / Wikidata	Yes — full Wikipedia entity	Grounds corpus-mode recall	No entity at all
G2 / Capterra	Yes — 340 reviews	Cited in G2's own listicle	12 reviews
News / press	Yes — funding + product coverage	Several cited	Minimal

Now the analysis has teeth. Competitor A is not winning because of one thing; they are winning because they sit in every source class while you sit in two. But — and this is the layer-three move — you do not have to match all six. You have to find the subset that is realistic to close and feeds the prompts that convert.

A useful framing is to compute, per source class, how many of the competitor's citations it accounts for, because that ranks the sources by leverage rather than by your gut. In the teardown above, third-party listicles (14 citations) and own comparison pages (11) are doing the heavy lifting; Reddit (5) and news matter less to the live-retrieval answers, even though Reddit punches above its weight in the corpus. That ranking tells you where to aim first: the listicles and your own comparison content, both of which are faster to influence than a Wikipedia entity or 300 G2 reviews.
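
Counting citations per source class can be sketched in a few lines of Python, assuming you have the `cited_urls` for one competitor in a list. The fixed host map covers only the classes that are identifiable by domain; listicles and news outlets have no common domain, so this sketch assumes you pass a hand-built `hand_labels` map for those. All names here are hypothetical.

```python
from collections import Counter
from typing import Optional
from urllib.parse import urlparse

# Source classes that can be recognized from the domain alone.
FIXED_HOSTS = {
    "reddit.com": "reddit",
    "wikipedia.org": "wikipedia_wikidata",
    "wikidata.org": "wikipedia_wikidata",
    "g2.com": "g2_capterra",
    "capterra.com": "g2_capterra",
}


def _matches(host: str, domain: str) -> bool:
    """True for the domain itself or any subdomain of it."""
    return host == domain or host.endswith("." + domain)


def source_class(url: str, competitor_domain: str,
                 hand_labels: Optional[dict[str, str]] = None) -> str:
    host = (urlparse(url).hostname or "").lower()
    if _matches(host, competitor_domain):
        return "own_pages"
    # Fixed hosts first, then the auditor's hand-labelled listicle and news hosts.
    for mapping in (FIXED_HOSTS, hand_labels or {}):
        for domain, cls in mapping.items():
            if _matches(host, domain):
                return cls
    return "unclassified"


def tally_sources(urls: list[str], competitor_domain: str,
                  hand_labels: Optional[dict[str, str]] = None) -> Counter:
    """Count cited URLs per source class for one competitor."""
    return Counter(source_class(u, competitor_domain, hand_labels) for u in urls)
```

Calling `tally_sources(urls, "competitor-a.example").most_common()` gives the leverage ranking directly. Anything left as `unclassified` is a URL you still need to read by hand.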

A worked example: 30 "best [category]" prompts, mapped

Let me make the whole method concrete with the run I keep referencing, because an abstract method is easy to nod along to and hard to actually execute. I ran 30 buying-stage prompts in my own category — privacy-first / cookieless analytics and attribution for small SaaS — across ChatGPT, Perplexity, Claude and Gemini, four runs each, browsing on, over a single week in early May 2026. Aggregate patterns are real; I have anonymized the specific competitor names to A through D.

The headline tally was the table two sections up: Competitor A won 19 of 30 prompts at 35% blended SOV; I won 2 at 11%. The interesting part was not the score. It was mapping which competitor won which prompt, because the pattern was not uniform.

Prompt cluster (count)	Winner	Why (from the teardown)
Best-of / category (8)	A in 7, B in 1	A's listicle and G2 dominance; pure authority play
Alternatives (6)	Split: A 2, B 2, you 2	Lower barrier; whoever published the comparison page won
Versus (4)	Whoever wrote the page (you 1, A 2, B 1)	The "[X] vs [Y]" page author gets cited for their own comparison
For-segment (6)	Mixed; you 1, C 2, A 2, B 1	Segment specificity beat raw authority
Recommendation (4)	A 3, B 1	Defaulted to the most-cited brand
Definitional (2)	A 2	Wikipedia entity carried the corpus answer

Reading that table changed my entire plan. Competitor A owned the high-authority prompts (best-of, recommendation, definitional) through accumulated structural advantages I could not replicate quickly — Wikipedia entity, 340 G2 reviews, seven listicle placements. Chasing those was a quarters-long money pit. But the alternatives and versus and for-segment clusters were a different story: those were won by whoever published the relevant comparison page, and the barrier was low. Those 16 prompts were winnable this quarter with content I controlled.

Then I did the thing the title promises: I found the three things the winner had that the losers (including me) did not.

What Competitor A had	What the losers lacked	Closability
1. A Wikipedia + Wikidata entity grounding corpus-mode recall	No entity; model unsure they were a distinct, notable brand	Wikidata fast, Wikipedia slow
2. Comparison pages for every major rival ("A vs B," "A vs C," "B alternatives")	One or zero comparison pages each	Fast — publish in days
3. Presence in 7 of the top-10 third-party listicles	Presence in 0-2 listicles	Medium — outreach cycle

The single most decisive one was the second. Competitor A had a comparison page targeting every alternatives and versus query in the category, so they got cited for their own honest comparison even when a rival was the subject of the query. That is the cheapest, fastest, most replicable advantage on the list — and it directly feeds the highest-intent prompts. So that is where I aimed first. Not at the Wikipedia entity (slow), not at the listicles (medium), but at publishing the comparison pages that win the converting prompts, and only then working the slower entity and listicle gaps in parallel.

The number that mattered most was not the SOV. It was this: the alternatives and versus clusters, where the gap was closable, were also the clusters with the highest commercial intent. The gaps I could close were the gaps worth closing. That alignment does not always hold — sometimes the closable gap feeds a worthless prompt — which is precisely why you map prompt-cluster to winner to closability before you do anything.
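
That mapping can be written down as a filter. A minimal Python sketch, where the cluster counts come from the tables above and the two boolean flags are my own encoding of the "commercial value" column and the teardown's barrier read. The `Cluster` type and `worth_closing` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    name: str
    prompts: int
    high_intent: bool  # commercial value of High or above in the intent table
    low_barrier: bool  # won by published content, not accumulated authority


CLUSTERS = [
    Cluster("best_of", 8, high_intent=True, low_barrier=False),
    Cluster("alternatives", 6, high_intent=True, low_barrier=True),
    Cluster("versus", 4, high_intent=True, low_barrier=True),
    Cluster("for_segment", 6, high_intent=True, low_barrier=True),
    Cluster("recommendation", 4, high_intent=True, low_barrier=False),
    Cluster("definitional", 2, high_intent=False, low_barrier=False),
]


def worth_closing(clusters: list[Cluster]) -> list[Cluster]:
    """Keep only clusters that are both closable soon and commercially relevant."""
    return [c for c in clusters if c.high_intent and c.low_barrier]


targets = worth_closing(CLUSTERS)
winnable_prompts = sum(c.prompts for c in targets)  # 6 + 4 + 6
```

With these flags the filter keeps the alternatives, versus, and for-segment clusters, which is the 16-prompt subset described earlier. In a category where the flags disagree, the same filter returns a shorter list, and that is the useful warning.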

The gap-closing playbook

Once the teardown names your gaps, each one has a known fix and a known speed. Here is the playbook table — for each gap class, the fix and how fast it can realistically land. The fast/slow split tracks the corpus-versus-retrieval mechanic: anything that flows through live retrieval moves in days, anything that depends on a model retrain or earned authority moves in quarters.

Gap	Fix	Speed	You control timing?
No Wikipedia entity	Earn it via 5+ independent reliable-source citations; do not pay for it	Quarters (notability-gated)	No
No Wikidata entity	Create a Wikidata item yourself today	Hours to days	Yes
Thin sameAs entity graph	Build clean `sameAs` across LinkedIn, X, GitHub, Crunchbase, Product Hunt, G2	Hours	Yes
Thin comparison content	Publish honest "[you] vs [competitor]" and "[competitor] alternatives" pages	Days	Yes
Not in third-party listicles	Pitch authors, earn placement, get reviewed	Weeks (outreach)	Partly
Blocked GPTBot / AI crawlers	Remove disallow lines in robots.txt and CDN	Minutes to fix; days to recrawl	Mostly
Low organic rank on buying query	Standard SEO: content depth, links, intent match	Months	Partly
Low G2 / Capterra review count	Systematic review-request flow to happy customers	Weeks to months	Partly
No Reddit / forum presence	Genuine community participation; never astroturf	Months	Barely
Category-language mismatch	Match buyer phrasing in H2s, FAQ, answer blocks	Days	Fully

The sequencing I recommend mirrors the worked example: fast and controllable first, slow and earned in parallel.

Window	Focus	Gaps
Week 1	Unblock crawlers, fix category language, ship Wikidata + sameAs, publish first comparison pages	Crawler, language, Wikidata, sameAs, comparison
Weeks 2-4	Fill out the comparison-page matrix for every rival and segment	Comparison content
Weeks 2-8	Outreach for listicle placements; systematic review requests	Listicles, G2/Capterra
Ongoing	Organic community presence, press, organic rank	Reddit, news, SEO
Quarterly	Re-test browsing-off recall and pursue Wikipedia once notable	Wikipedia, corpus
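
The crawler check in week 1 is the easiest item to verify in code. A minimal sketch using Python's standard-library `urllib.robotparser`, assuming you paste in the text of your robots.txt. The user-agent tokens listed are ones the vendors have published, but treat the list as an assumption and confirm it against each vendor's current documentation. This checks robots.txt only; a block at the CDN or firewall level will not show up here.

```python
from urllib.robotparser import RobotFileParser

# Assumed AI crawler tokens; verify against each vendor's current docs.
AI_USER_AGENTS = ["GPTBot", "PerplexityBot", "ClaudeBot", "Google-Extended"]


def blocked_agents(robots_txt: str, url: str, agents=AI_USER_AGENTS) -> list[str]:
    """Return the agents that this robots.txt text forbids from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


EXAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

# The URL is a placeholder; use one of your own comparison pages.
blocked = blocked_agents(EXAMPLE_ROBOTS, "https://example.com/compare")
```

For the example file above, `blocked` contains only `GPTBot`. Run the check against the pages you most want cited, not just the homepage, because disallow rules are often path-specific.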

The honest caveat that belongs on this playbook: closing a gap moves your citation presence, which is necessary but not sufficient for revenue. A comparison page that wins you a citation slot only pays if the click lands somewhere that converts. That is why the playbook ends not at "gap closed" but at the measurement step — the last section of this article. Do not treat a closed gap as a win until the revenue line confirms it.

There is one gap class worth singling out because it is uniquely high-leverage and uniquely fast: the comparison-page matrix. If a competitor out-cites you because they have a page for every "[rival] alternatives" and "[X] vs [Y]" query and you have one or none, you can close most of that gap in a week of honest writing. The catch is the word honest: a self-serving comparison that ranks you first on every axis fools no one. The model cross-references, and the page gets discounted. The comparison content that earns citations is genuinely useful about where competitors win, which is also just better content. I cover the citation-earning structure in depth in how to get cited by AI engines.

Where each source comes from, and how to match it

The gap-closing playbook is the what-to-do. This section is the deeper why behind the two source classes that punch hardest above their weight — Reddit and Wikipedia — because matching them is both the highest-leverage and the most-misunderstood part of competitive GEO.

Reddit is among the most-cited domains in AI answers, and the reason is structural, not coincidental. The major engines have content-licensing and access arrangements with Reddit — Google's reported ~$60M/year deal and OpenAI's partnership being the prominent ones [6] — and beyond the licensing, community recommendation threads are exactly the answer shape the model wants: real users naming real tools for real use cases, with upvote signal standing in for consensus. If your competitor is recommended in four subreddit threads and you are in none, the model has a high-trust, independent source naming them and nothing equivalent for you. I go deep on the revenue side of this in Reddit AI citations and revenue.

The way to match it is the slow, unspoofable way: genuine participation, useful answers, letting satisfied customers mention you organically. The fast version — sockpuppet threads, paid recommendations — is detectable by both the community and increasingly the models, and a public astroturfing bust is a durable brand liability that outlasts any short-term gain. There is no safe shortcut here, which is precisely why Reddit presence is such a strong signal: it is hard to fake.

Wikipedia and Wikidata are disproportionately present in both training corpora and live retrieval, and they function as the entity card the model uses to decide you are a real, distinct, notable thing. A competitor with a Wikipedia entity has a canonical, authoritative source the model trusts for who they are; a brand without one leaves the model unsure, and uncertainty suppresses recommendation. This is why the Wikipedia gap shows up most starkly in corpus-mode (browsing-off) recall — it is the source the model leans on when recalling from memory.

The way to match it splits in two. Wikidata you can create yourself today, legitimately, at a far lower bar than Wikipedia — do that immediately, and build a clean sameAs graph alongside it, which is an afternoon of work and pure upside. Wikipedia you must earn through genuine notability (5+ independent reliable-source citations) and you must never pay for, because paid creation violates the platform's rules, gets flagged and deleted, and a deleted article is worse than none. The full mechanism is in the Wikipedia effect on AI visibility.
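
For the sameAs half, one common way to express the graph is schema.org `Organization` markup in JSON-LD on your site. A minimal Python sketch that generates it, assuming placeholder profile URLs; every URL and the brand name below are hypothetical, and `organization_jsonld` is an illustrative helper, not a library function.

```python
import json


def organization_jsonld(name: str, url: str, same_as: list[str]) -> str:
    """Build schema.org Organization JSON-LD with a de-duplicated sameAs list."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "sameAs": sorted(set(same_as)),
    }
    return json.dumps(data, indent=2)


# Placeholder profiles; replace with your real, canonical profile URLs.
markup = organization_jsonld(
    name="Example Brand",
    url="https://example.com",
    same_as=[
        "https://www.linkedin.com/company/example-brand",
        "https://github.com/example-brand",
        "https://github.com/example-brand",  # duplicates are dropped
    ],
)
```

The output goes inside a `<script type="application/ld+json">` tag on the page. Use the same canonical URLs on the Wikidata item so the two sources agree with each other.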

Source	Why it over-weights	Fast match	Slow match
Reddit	Licensing deals + community-recommendation answer shape	None (no safe shortcut)	Genuine participation over months
Wikipedia	Canonical entity card, over-represented in corpus	None (notability-gated)	Earn via independent citations
Wikidata	Structured entity, self-creatable	Create item + sameAs today	n/a

Tools: manual prompt-testing vs Profound, Otterly, Peec

You can run the entire audit above by hand, and you should the first time. For ongoing competitive monitoring you will want a prompt-replay tool that automates the runs, parses the answers, and tracks competitor share of voice over time. Here is the honest landscape, with the standard caveat that I build a tool in an adjacent category, so I will be explicit about where mine does and does not belong.

Manual prompt-testing is free and teaches you the most, but it does not scale past about 30 prompts and one engine before the labor gets painful. Roughly 30 seconds per query means 30 prompts times four engines times four runs is 480 queries, or about four hours per pass, and you are doing it manually every week. It is the right starting point and the wrong steady state.

Tool	What it does for competitive analysis	Engines	Entry price	Computes revenue impact?
Manual + spreadsheet	Full audit, full control, full labor	All (you run them)	$0	No
Profound	Enterprise citation analytics, competitor SOV + sentiment	ChatGPT, Perplexity, Claude, Gemini, Copilot	$499+/mo	No
Otterly	Daily-check AI monitoring, SOV + sentiment, competitor tracking	ChatGPT, Perplexity, AIO	$29+/mo	No
Peec AI	Competitor-focused AI visibility tracking, share-of-voice over time	ChatGPT, Perplexity, Gemini	~$120+/mo	No
SE Ranking AI Tracker	SERP-adjacent AI visibility, competitor visibility score	Gemini, AIO, ChatGPT	$52+/mo add-on	No
Attrifast	First-party AI-engine revenue attribution (the wedge layer)	All, via referral + Stripe	$15/mo	Yes (your side only)

The categorical split is the thing to internalize, because vendor demos blur it constantly:

Job to be done	Tool category	Example
"Who out-cites me, on which engine, for which prompts?"	Citation / prompt tracker	Profound, Otterly, Peec, SE Ranking
"What source feeds my competitor's citations?"	The teardown (mostly manual)	Any tracker + your reading of cited URLs
"Did closing the gap drive MY revenue?"	First-party revenue attribution	Attrifast

The trackers are good at layer one. Profound is the most thorough for enterprise; Otterly and Peec are the SMB-friendly options with honest competitor SOV at a fraction of the price; SE Ranking is sensible if you already pay for it. I am not knocking any of them — they solve the measurement problem cleanly. The point is narrower and it is the recurring theme of this article: every one of these tools stops at the competitor's citation count. None of them join that count to your Stripe revenue, because they scrape AI answers from the outside and have no view into your billing system. They can tell you a competitor out-cites you 35% to 11%. They cannot tell you whether closing that gap put a dollar in your account.

That is the layer I built Attrifast for, and I want to be precise about the boundary: Attrifast does not replace a citation tracker. It does not compute a competitor's share of voice — it cannot, because it does not scrape AI answers. What it does is the half the trackers structurally cannot reach: detect AI-sourced sessions arriving on your site server-side, persist them first-party without a cookie or consent banner, and join each to revenue on the Stripe webhook. You bring the competitor SOV from your tracker; Attrifast tells you whether the prompts you won back actually converted. The two together close the loop. Neither alone does.

The revenue wedge: is closing the gap worth it?

This is layer three, and it is the part that separates competitive AI analysis that ships money from competitive AI analysis that ships charts. Knowing a competitor out-cites you is half the battle; the other half is whether winning the slot back drives your revenue — and the only way to know is to measure cited-to-clicked-to-paid for each query you win. A competitor losing a citation slot to you only matters if the users who would have clicked them now click you, land on a page that fits, and pay.

The trap is declaring victory at the share-of-voice line. You close a gap, your tracker shows your SOV climbing from 11% to 18%, and the temptation is to call it a win. But SOV is an input, not an outcome. The link from SOV to revenue is severed by everything between a citation and a sale — citation position, click-through, landing-page fit, buyer intent — none of which the SOV number can see. I make this argument in full in the share of voice piece; here is the competitive-specific version.

The per-query measurement that closes the loop:

Step	What you measure	How
1. Win the slot	Your SOV rises for the prompt you targeted	Tracker, before vs after
2. Cited	Your URL now appears in the answer	Read the answer's sources
3. Clicked	AI-attributed sessions to that landing page rise	First-party AI-source detection
4. Converted	Those sessions paid	Stripe webhook join
5. Worth it	Revenue won back > cost of closing the gap	The actual decision

Steps three and four require special machinery for the reason I document at length in the ChatGPT referral analytics guide: ChatGPT strips the referer, so the click lands in GA4's Direct/(none) bucket, and you cannot see the AI-attributed session at all in default analytics. You need first-party server-side detection plus a Stripe join: detect the AI source at the edge, persist a first-party session row, and join it to revenue on checkout.session.completed. No third-party cookie, no consent banner, no dependence on the engine passing a referer. That is the only way the cited-to-clicked-to-paid chain becomes visible.
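The detect, persist, join sequence can be sketched in a few functions. This is an illustration under stated assumptions, not Attrifast's implementation: the hostname map, the utm_source fallback, the in-memory SESSIONS dict, and the choice to pass your own session id to Stripe as client_reference_id are all hypothetical, and amount_total is read as cents.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import parse_qs, urlparse

# Assumed hostname -> engine map; extend it from your own server logs.
AI_HOSTS = {
    "chatgpt.com": "chatgpt",
    "chat.openai.com": "chatgpt",
    "perplexity.ai": "perplexity",
    "www.perplexity.ai": "perplexity",
    "claude.ai": "claude",
    "gemini.google.com": "gemini",
    "copilot.microsoft.com": "copilot",
}


def detect_ai_source(landing_url: str, referer: Optional[str] = None) -> Optional[str]:
    """Return the AI engine behind a request, or None if no signal is present."""
    # 1. Use the Referer header when the engine passes one.
    if referer:
        host = (urlparse(referer).hostname or "").lower()
        if host in AI_HOSTS:
            return AI_HOSTS[host]
    # 2. Otherwise look for a utm_source tag on the landing URL (assumption).
    params = parse_qs(urlparse(landing_url).query)
    for value in params.get("utm_source", []):
        if value.lower() in AI_HOSTS:
            return AI_HOSTS[value.lower()]
    return None


# Stand-in for a first-party sessions table, keyed by your own session id.
SESSIONS: Dict[str, dict] = {}


def record_session(session_id: str, landing_url: str, referer: Optional[str] = None) -> None:
    """Persist one first-party session row with its detected AI source."""
    SESSIONS[session_id] = {
        "path": urlparse(landing_url).path,
        "ai_source": detect_ai_source(landing_url, referer),
    }


def handle_stripe_event(event: dict, revenue: Dict[Tuple[str, str], float]) -> None:
    """Join a paid checkout back to the AI-sourced session that started it."""
    if event.get("type") != "checkout.session.completed":
        return
    checkout = event["data"]["object"]
    session = SESSIONS.get(checkout.get("client_reference_id"))
    if session and session["ai_source"]:
        key = (session["ai_source"], session["path"])
        # amount_total is in the smallest currency unit (cents for USD).
        revenue[key] = revenue.get(key, 0.0) + checkout["amount_total"] / 100
```

A production handler would also verify the webhook signature before trusting the event (the Stripe Python library provides stripe.Webhook.construct_event for this) and would write to a real database rather than a dict.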

Here is a worked revenue read on the alternatives cluster I targeted in the worked example. After publishing the comparison pages, I tracked the four alternatives-style prompts I was newly winning:

Prompt won back	SOV before → after	AI sessions/mo to page	Converted	Revenue/mo
"Competitor A alternatives"	0% → 22%	31	2	$30
"Competitor B alternatives"	0% → 18%	19	1	$15
"Competitor C alternatives"	0% → 14%	11	0	$0
"cheaper [category] tool than [A]"	0% → 27%	24	2	$30

The read is honest and useful. Three of the four pages won the slot and shipped revenue; the "Competitor C alternatives" page won the slot, drew clicks, and converted nobody — which told me either the page did not fit those buyers or Competitor C's users are not in-market for a switch. The cost of all four pages was a week of writing. The revenue was modest in absolute terms (first-year SMB SaaS, small numbers) but the structure is the point: I now know which gap classes pay and which do not, and I can reweight the next quarter's effort toward the ones that do. Without the Stripe join, all four pages would have looked identically "successful" on the SOV chart, and I would have happily written more Competitor-C-style pages that win citations and convert no one.
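The same read can be computed rather than eyeballed. A small sketch using the figures from the table above; the class and field names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class PromptRead:
    prompt: str
    ai_sessions: int   # AI-attributed sessions per month to the landing page
    conversions: int   # how many of those sessions paid
    revenue: float     # USD per month

    @property
    def conversion_rate(self) -> float:
        return self.conversions / self.ai_sessions if self.ai_sessions else 0.0

    @property
    def revenue_per_session(self) -> float:
        return self.revenue / self.ai_sessions if self.ai_sessions else 0.0

    @property
    def verdict(self) -> str:
        return "pays" if self.revenue > 0 else "wins the slot, no revenue"


# Figures copied from the table above.
READS = [
    PromptRead("Competitor A alternatives", 31, 2, 30.0),
    PromptRead("Competitor B alternatives", 19, 1, 15.0),
    PromptRead("Competitor C alternatives", 11, 0, 0.0),
    PromptRead("cheaper [category] tool than [A]", 24, 2, 30.0),
]

for r in READS:
    print(f"{r.prompt:35} {r.conversion_rate:6.1%} "
          f"${r.revenue_per_session:.2f}/session  {r.verdict}")

TOTAL_SESSIONS = sum(r.ai_sessions for r in READS)
TOTAL_REVENUE = sum(r.revenue for r in READS)
print(f"Cluster total: {TOTAL_SESSIONS} sessions, ${TOTAL_REVENUE:.0f}/mo")
```

With session counts this small, the per-page conversion rates are noisy, so the verdict column is more trustworthy than the exact percentages.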

The general principle, stated cleanly: a competitive gap is worth closing only if the prompt it feeds converts for you specifically. The same closed gap can be a win for one brand and a waste for another, depending on landing-page fit and buyer intent. Share of voice cannot tell the two apart. Cited-to-clicked-to-paid can, and it is the only number that survives the question "did beating them on this query make us money."

Common mistakes in competitive AI analysis

Eight failure modes I see often enough to name, each with the correction.

Mistake 1: Stopping at "they're cited more." The most common one, and the whole reason this article exists. Knowing a competitor out-cites you is a score, not a plan. Fix: always run the teardown to find the source of their advantage, then map it to revenue.

Mistake 2: Testing one engine. A competitor can dominate ChatGPT and be beatable on Perplexity for the identical queries. Fix: run all four engines and read the per-engine spread — the engine where your gap is most closable is often not the one with the most users.

Mistake 3: Testing once. A single run is a roll of the dice given the engines' stochastic sampling. Fix: 3-5 runs per prompt, averaged or unioned.

Mistake 4: A definitional-heavy prompt set. Stuffing the set with "what is [category]" prompts inflates everyone's apparent presence and measures nothing about buying behavior. Fix: weight toward best-of, alternatives, versus, and for-segment prompts where buyers actually decide.

Mistake 5: Trying to match the incumbent's entire footprint. A category leader's Wikipedia entity, 2,000 reviews, and decade of press cannot be replicated this year, and chasing all of it is a money pit. Fix: isolate the one or two closable gaps that feed converting prompts.

Mistake 6: Going head-on in "best [category] tools." Where a category-defining incumbent is strongest, displacement through GEO alone usually is not winnable. Fix: own the alternatives, versus, and segment queries where the incumbent's dominance is weaker and intent is higher.

Mistake 7: Astroturfing the fast sources. Paying for Reddit recommendations or fake reviews to match a competitor's community presence is detectable and self-defeating. Fix: earn the slow sources genuinely; their difficulty is exactly what makes them strong signals.

Mistake 8: Measuring the win in share of voice, not revenue. The biggest one. You close a gap, SOV rises, you declare victory, but GA4 buckets the AI clicks as Direct and you never see whether they paid. Fix: measure cited-to-clicked-to-paid with first-party attribution joined to Stripe; the revenue attribution feature and the track ChatGPT traffic page cover the mechanic.

Mistake	Symptom	Correction
Stop at "they're cited more"	A score with no action	Run the source teardown
One engine	Misread the gap	Test all four, read the spread
One run	False volatility	3-5 runs per prompt
Definitional-heavy set	Meaningless presence numbers	Weight toward buying prompts
Match entire footprint	Quarters-long money pit	Isolate closable, revenue-relevant gaps
Head-on with incumbent	Wasted effort	Own alternatives + segment queries
Astroturf fast sources	Detection + brand liability	Earn slow sources genuinely
Measure in SOV not revenue	Vanity wins	Cited→clicked→paid via Stripe join

How competitive GEO analysis fits the broader picture

Competitive AI analysis is one input in a stack, and it is most powerful next to its neighbors. Here is where it sits relative to the inward-facing and metrics work.

Analysis	Question	This article's role
Diagnostic (inward)	Why doesn't the model recommend me?	Companion piece
Competitive (outward)	Why does it recommend them, and what's the gap?	This article
Metrics	What does the SOV number mean and is it revenue?	Share of voice piece
Content performance	Which of my pages does AI actually cite?	Which pages AI cites
Source-class deep dives	How do Reddit and Wikipedia drive citations?	Reddit, Wikipedia

The diagnostic piece and this competitive piece are two sides of one coin: the diagnostic asks why you are absent, the competitive asks why your rival is present, and the answers overlap heavily — both come down to source-graph density. The difference in method is the direction you point the analysis. Run the diagnostic on yourself, run the competitive analysis on your category, and the union of the two gives you a complete map of where you stand and exactly which gaps to close.

The thread tying all of them together is the revenue layer. Every one of these analyses produces a presence number — a citation, a mention, a share of voice — and presence is an input, not an outcome. The piece that converts presence into a defensible business decision is the first-party-to-Stripe join, which is the layer the citation trackers structurally cannot ship. That is the consistent argument across this whole series, and it is the one I staked a product on.

What this looks like inside Attrifast

A short, honest note on the product, because the article should not pretend the author is disinterested. Attrifast does not run prompt-replay audits, does not compute your competitor's share of voice, and does not scrape AI answers — those are jobs for Profound, Otterly, Peec, and the manual method above, and I will happily point you to them. What Attrifast does is the wedge layer this whole article builds toward: it detects AI-sourced sessions server-side (ChatGPT, Perplexity, Claude, Gemini, Copilot), persists them first-party without a cookie or consent banner, and joins each session to revenue on the Stripe checkout.session.completed webhook.

The practical value for competitive analysis specifically is steps three through five of the revenue wedge. You close a gap against a competitor, your tracker shows your SOV rising for the target prompt, and then Attrifast tells you the part the tracker cannot: did AI-attributed sessions to that landing page actually rise, and did they convert in Stripe? That turns "we won the citation slot" into "we won the citation slot and it shipped $X," which is the only version of the sentence that survives a budget review. Cost is $15/mo, the tracking script is 4 KB and cookieless, and the Stripe connection is OAuth, not an API key. The detection mechanic is documented end to end on the track ChatGPT traffic page.

The first-person reason I built it: I was the founder who could see a competitor out-citing me, did the work to close the gap, watched my Direct bucket climb, and had no idea whether any of it converted — because GA4 hid the AI clicks in Direct and I had no revenue signal per query. The product is the signal I wished I had when I was guessing.

Limitations

Five things this article does not claim, so you do not over-extrapolate.

You can never see a competitor's revenue. The entire method measures a competitor's presence precisely and infers their advantage, but the cited-to-clicked-to-paid chain is only observable on your own side after you close a gap. Treat competitor "wins" as presence wins, not revenue wins.
Share of voice is an input, not an outcome. Out-citing a competitor 18% to 11% is a leading indicator at best. The only outcome metric is your AI-attributed revenue, which requires the first-party-to-Stripe join GA4 cannot produce.
The corpus-versus-retrieval split is a useful model, not the literal internal architecture. The engines do not publish how recall, recommendation, and live retrieval interact. The two-mode framing holds up in testing but is an approximation, not a leaked spec.
GEO does not let you displace a category-defining incumbent head-on. Where a brand owns the category in the corpus, the realistic win is to make the list and own the long-tail and alternatives queries, not to dethrone them in "best [category] tools." Anyone promising head-on displacement is overselling.
The numbers are aggregates and snapshots. The per-competitor SOV figures, the worked revenue read, and the 1.4-2.1x RPV multiplier are from a specific set of mostly-SaaS properties in early-to-mid 2026 and will drift as the engines' user bases broaden. Treat them as directional, and measure your own.

FAQ

How do I find out which competitors ChatGPT recommends over me?

Build a prompt set of 30-100 buying-stage queries your category actually gets asked — "best [category] tools," "[incumbent] alternatives," "[category] for [segment]" — and run each one 3-5 times across ChatGPT, Perplexity, Claude and Gemini with browsing on. Record every brand named and every URL cited per answer, then tally how many of the answer slots each competitor wins. The competitor who appears in the most answers, in the highest positions, with the most cited supporting URLs is the one out-citing you. The whole audit is about 30 prompts times four engines times four runs, which is roughly two to three hours of manual work or an afternoon with a prompt-replay tool. The output is a per-competitor share-of-voice table, not a vibe.
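The tally itself is a few lines once the answers are logged. A minimal sketch, assuming share of voice is defined as the fraction of logged answers that name a brand (one of several reasonable definitions); the record layout and the sample rows are made up for illustration.

```python
from collections import Counter
from typing import Dict, List


def share_of_voice(answers: List[dict]) -> Dict[str, float]:
    """Return {brand: fraction of answers naming it}, highest first."""
    counts: Counter = Counter()
    for answer in answers:
        # Count each brand at most once per answer.
        counts.update(set(answer["brands"]))
    total = len(answers)
    if total == 0:
        return {}
    return {brand: n / total for brand, n in counts.most_common()}


# Hypothetical log rows: one per prompt, engine, and run.
ANSWERS = [
    {"prompt": "best [category] tools", "engine": "chatgpt", "run": 1,
     "brands": ["Competitor A", "Competitor B"]},
    {"prompt": "best [category] tools", "engine": "perplexity", "run": 1,
     "brands": ["Competitor A", "You"]},
    {"prompt": "[incumbent] alternatives", "engine": "claude", "run": 1,
     "brands": ["Competitor B"]},
    {"prompt": "[incumbent] alternatives", "engine": "gemini", "run": 1,
     "brands": ["Competitor A", "Competitor A"]},
]

SOV = share_of_voice(ANSWERS)
for brand, share in SOV.items():
    print(f"{brand}: {share:.0%}")
```

Filtering ANSWERS by engine or by prompt before calling the function gives the per-engine and per-prompt spread from the same log.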

Why does ChatGPT recommend my competitor instead of me?

Almost always because the competitor sits in more of the third-party sources the model grounds its answer in, not because their product is better. The model recommends what the web has already recommended. If your competitor appears in eight "best [category]" listicles, has a Wikipedia entity, gets named in Reddit recommendation threads, holds 200+ G2 reviews, and ranks in the organic top 5 for the buying query — and you appear in two listicles with no Wikipedia entity and a thin review profile — the model has far more corroborating signal to recommend them. The useful analysis is not "they win"; it is identifying the specific structural source they have and you do not, then deciding whether closing that one gap is worth the revenue it would win back.

What is competitive GEO analysis?

Competitive GEO (generative engine optimization) analysis is the practice of measuring how visible your competitors are inside AI-generated answers, reverse-engineering the sources that feed their citations, and finding the structural gaps between their footprint and yours. It has three layers. Layer one is measurement: who gets cited, how often, on which engine, for which prompts (per-competitor share of voice). Layer two is forensics: for a cited competitor, what feeds the citation — their own pages, Reddit, Wikipedia, G2, third-party listicles, news. Layer three is the wedge: which gaps, if closed, would actually move your revenue rather than just your mention count. Most teams do layer one and stop. Layers two and three are where the money is.

Can I see my competitor's AI share of voice for free?

Partially, and it is the most useful free analysis in GEO. You can manually run a 20-30 prompt set across the four engines weekly, log every brand mention and cited URL in a spreadsheet, and compute each competitor's share of voice with a pivot table. That costs roughly two to three hours a week of labor and breaks down past about 30 prompts. Paid tools (Profound, Otterly, Peec, SE Ranking) automate the prompt replay and competitor tracking from $29 to $499+ per month, which is worth it once you are tracking 50+ prompts on a daily or weekly cadence. None of those tools tell you whether out-citing a competitor actually drove your revenue — that requires first-party attribution joined to Stripe, which is a separate layer.

How many prompts do I need to analyze competitor AI visibility?

Thirty buying-stage prompts per engine is the minimum that produces a stable per-competitor ranking; 50-100 is materially better and lets you segment by intent class. Below about 30 prompts the model's stochastic sampling plus its continuously-updating retrieval index produce enough run-to-run variance that a single competitor can swing 10-15 percentage points of apparent share between two measurement passes, and you will chase noise. Sample each prompt 3-5 times per pass and average or union the results rather than trusting one roll. Weight the set toward the high-intent prompts ("best," "alternatives," "vs," "for [segment]") because those are the answers that actually drive buyers, not the definitional prompts that are easy to win but rarely convert.
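Averaging and unioning answer slightly different questions, which a short sketch makes concrete. The run data below is hypothetical, and the function names are illustrative.

```python
from typing import List, Set


def mention_rate(runs: List[Set[str]], brand: str) -> float:
    """Averaged view: fraction of runs in which the brand was named."""
    return sum(brand in run for run in runs) / len(runs)


def mentioned_in_any(runs: List[Set[str]], brand: str) -> bool:
    """Unioned view: True if at least one run named the brand."""
    return any(brand in run for run in runs)


# Four hypothetical runs of the same prompt on one engine.
RUNS = [
    {"Competitor A", "Competitor B"},
    {"Competitor A"},
    {"Competitor A", "You"},
    {"Competitor A", "Competitor B"},
]

for brand in ("Competitor A", "Competitor B", "You"):
    print(brand, mention_rate(RUNS, brand), mentioned_in_any(RUNS, brand))
```

The averaged rate is the better input for a share-of-voice table because it keeps the difference between a brand named every time and one named once; the union is useful for spotting who can appear at all.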

What sources feed my competitor's AI citations?

Six source classes account for nearly all of them, in rough order of how much weight they carry: their own well-structured pages (comparison and "best of" content the model can lift directly), third-party "best [category]" listicles, Reddit and forum recommendation threads, their Wikipedia and Wikidata entity, G2 and Capterra review profiles, and news or press coverage. To find each, read the cited URLs in the AI answers directly, then run a brand-name-minus-own-domain search ("Competitor" -site:competitor.com) to map their third-party footprint, check Wikipedia and Wikidata, search Reddit for their name plus "reddit," and look at their G2 review count. The teardown tells you which source class is doing the heavy lifting, which is the one you have to match.
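A first pass of the teardown can be automated by bucketing cited URLs by hostname. A rough sketch: the hostname rules are assumptions, and listicles cannot be told apart from news by domain alone, so that bucket still needs a manual read.

```python
from collections import Counter
from typing import Dict, List
from urllib.parse import urlparse


def footprint_query(brand: str, own_domain: str) -> str:
    """Build the brand-name-minus-own-domain search string."""
    return f'"{brand}" -site:{own_domain}'


def _on_domain(host: str, domain: str) -> bool:
    return host == domain or host.endswith("." + domain)


def source_class(url: str, own_domain: str) -> str:
    """Bucket one cited URL into a source class by its hostname."""
    host = (urlparse(url).hostname or "").lower()
    if _on_domain(host, own_domain):
        return "own pages"
    if _on_domain(host, "reddit.com"):
        return "reddit/forums"
    if _on_domain(host, "wikipedia.org") or _on_domain(host, "wikidata.org"):
        return "wikipedia/wikidata"
    if _on_domain(host, "g2.com") or _on_domain(host, "capterra.com"):
        return "review sites"
    return "third-party (listicle or news)"


def tally_sources(urls: List[str], own_domain: str) -> Dict[str, int]:
    """Count cited URLs per source class, largest class first."""
    return dict(Counter(source_class(u, own_domain) for u in urls).most_common())


print(footprint_query("Competitor", "competitor.com"))
```

Feeding every cited URL for one competitor through tally_sources shows at a glance which class carries most of their citations.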

Should I copy whatever my top competitor is doing for AI visibility?

No — copy only the structural gaps that map to revenue-driving prompts, not their whole footprint. A category incumbent may have a Wikipedia entity, 2,000 G2 reviews, and a decade of press you cannot replicate, and chasing all of it is a quarters-long money pit with diminishing returns. The useful move is to isolate the one or two gaps that (a) are realistic for you to close, (b) feed the high-intent prompts you actually want to win, and (c) would plausibly move your revenue if closed. Often that is a single honest comparison page targeting an "alternatives" query, or a Wikidata entity plus a clean sameAs graph — fast, cheap, and aimed at the prompts that convert. Beating an incumbent head-on is usually not the game; owning the alternatives and segment queries is.

Does out-citing a competitor in ChatGPT actually drive my revenue?

Not automatically, and this is the half of competitive GEO analysis nobody measures. Winning a citation slot a competitor used to own only matters if the users who would have clicked them now click you, land on a page that converts, and pay. The link from citation to revenue is mediated by citation position, click-through, landing-page fit, and buyer intent — all of which a share-of-voice tally ignores. The discipline is to measure cited-to-clicked-to-paid per query you win back: did AI-attributed sessions for that prompt's landing page rise after you closed the gap, and did those sessions convert in Stripe? Because ChatGPT strips the referer and GA4 buckets the clicks as Direct, you need first-party server-side attribution joined to Stripe to see it. Otherwise you are optimizing a vanity number against a competitor.

How is competitor AI analysis different from a normal SEO competitor analysis?

A traditional SEO competitor analysis maps keyword overlap, backlink profiles, and SERP rankings — the inputs to a ranked list of ten blue links. Competitor AI analysis maps citation presence inside a synthesized answer that names only three to five brands, has no stable click-through curve, and differs across four non-overlapping engines. The overlap is real (organic rank still feeds AI citations, so Ahrefs and Semrush competitive research are still useful inputs), but the unit of analysis changed from "who ranks for the keyword" to "who gets named in the answer," and the sources that drive it expanded beyond backlinks to include Reddit, Wikipedia, review sites and listicles. You still run the SEO analysis; you layer the AI-citation analysis on top of it.

Which engine should I prioritize when analyzing competitors?

Prioritize the engine where your buyers actually are, then the engine where your competitive gap is most closable, not the engine with the most aggregate users. The four engines diverge hard: Perplexity is citation-forward and lower-barrier for topical specificity, ChatGPT prefers high domain authority and a slower corpus, Claude cites sparingly and prefers primary sources, and Gemini rides existing Google organic rankings. A competitor can dominate ChatGPT and be beatable on Perplexity for the identical queries. Run the audit across all four, but weight your gap-closing effort toward the engine that both reaches your buyers and shows a gap you can realistically close — and confirm the win in revenue, since per-engine conversion varies enormously.

How often should I re-run a competitive AI visibility analysis?

Monthly for the retrieval-driven layer (live citations, listicle presence, organic rank) and quarterly for the corpus-driven layer (training-baked recommendations that only change when an engine ships a new model). The retrieval layer moves within weeks of a competitor publishing new comparison content or earning a listicle placement, so a monthly re-run catches their moves and yours. The corpus layer only shifts on the engines' model-release schedule, so a quarterly browsing-off re-test is enough. Pair every re-run with your first-party AI-revenue line so you are measuring whether the gaps you closed moved money, not just whether your share-of-voice chart moved. A 30-prompt monthly pass is about two to three hours.

What's the single fastest gap I can close to compete in AI answers?

The honest comparison-page matrix. If a competitor out-cites you because they have a page for every "[rival] alternatives" and "[X] vs [Y]" query and you have one or none, you can close most of that gap in a week of genuine writing, and those pages feed the highest-intent prompts in the category. The catch is the word honest — a self-serving comparison that ranks you first on every axis gets cross-referenced and discounted by the model, so the page has to be genuinely useful about where competitors win, which is also just better content. Pair it with a same-day Wikidata entity and a clean sameAs graph, and you have closed the two fastest, highest-leverage gaps in days rather than quarters.

How do I know if my competitor's lead is in the corpus or in live retrieval?

Run your best-of and versus prompts in both browsing-off and browsing-on modes and compare. If the competitor wins in browsing-off mode, their lead is corpus-baked — accumulated press, Wikipedia, Reddit, listicle authority — which you close slowly over quarters as the engines retrain. If they win only in browsing-on mode, their lead is in live retrieval — better-structured pages, more listicle placements, higher organic rank — which you can often match this quarter with content you control. The two modes point at two different gap sets with two different clocks, so testing both is what tells you whether to expect a fast win or a slow grind.
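The two-mode comparison reduces to a small decision rule per prompt. A sketch with hypothetical results; the labels and the function name are illustrative.

```python
def classify_lead(wins_browsing_off: bool, wins_browsing_on: bool) -> str:
    """Label a competitor's lead on one prompt from the two test modes."""
    if wins_browsing_off:
        return "corpus"      # slow clock: closes over quarters
    if wins_browsing_on:
        return "retrieval"   # fast clock: often matchable this quarter
    return "no lead"


# Hypothetical results: prompt -> (wins browsing-off, wins browsing-on).
RESULTS = {
    "best [category] tools": (True, True),
    "[X] vs [Y]": (False, True),
    "[category] for [segment]": (False, False),
}

LEADS = {prompt: classify_lead(off, on) for prompt, (off, on) in RESULTS.items()}
for prompt, lead in LEADS.items():
    print(f"{prompt}: {lead}")
```

Grouping prompts by the resulting label gives the two gap sets and their two clocks.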

Can a small brand realistically beat a bigger competitor in AI recommendations?

On the right queries, yes; head-on, usually not. You will probably not displace a category-defining incumbent in "best [category] tools" through GEO alone, because their corpus authority is too dense to overcome quickly. But the alternatives, versus, and segment-specific queries are a different game — there the barrier is whoever published the relevant comparison page and matched the buyer's segment language, and a small brand can win those this quarter. The realistic strategy is to concede the high-authority prompts, own the high-intent long-tail ones where the incumbent is weak, and measure whether winning them ships you revenue. Modest share of voice on converting prompts beats large share of voice on prompts nobody clicks.

For the inward-facing version of this analysis — the eight reasons the model ignores you — see ChatGPT isn't recommending your product. For what the share-of-voice number actually means and why revenue share of voice matters more, the AI share of voice breakdown is the metrics companion. For the structural moves that earn citations, how to get cited by AI engines is the offensive playbook, and which pages AI actually cites shows how to find your own citation-earning pages. The two source classes that punch hardest get their own deep dives in Reddit AI citations and revenue and the Wikipedia effect on AI visibility. And for the measurement layer that proves any of it won you money — the cited-to-clicked-to-paid join no citation tracker can ship — the revenue attribution feature and the track ChatGPT traffic overview walk the first-party-to-Stripe mechanic end to end.
