
AI Search Ranking Factors 2026: What Actually Makes ChatGPT Cite Your Page

27 min read · Updated May 2026

Vincent Ruan, Founder, Attrifast · May 26, 2026

The 12 ranking factors that decide whether ChatGPT, Perplexity, Claude, and Gemini cite your page in 2026 — labeled as Documented, Inferred, or Speculative, with the citation pipeline mechanics behind each one.

Part of the generative engine optimization guide and AEO Hub.

TL;DR

AI search ranking is not one algorithm — it is a retrieval-augmented generation (RAG) pipeline with at least 4 distinct stages: query rewrite, candidate retrieval, passage scoring, and synthesis with citations. Most "ranking factors" lists collapse the pipeline and miss where the actual decision happens.
Of the 12 factors covered, only 4 are Documented (vendor-published or peer-reviewed): source authority via citation graph, structured-data parseability, question-answer concision, and cross-source consensus. 6 are Inferred (observational evidence across independent studies). 2 are Speculative (pattern matches without rigorous evidence).
Princeton's GEO paper [1] (Aggarwal et al., 2024, arXiv) is the closest thing to a controlled study: adding citations, statistics, and quotations lifted Perplexity and BingChat citation rates by up to 40% on tested queries. Most other claims in the GEO space lack this rigor.
Engine-specific differences are real but narrower than vendors imply. The same 4 documented factors carry roughly 70% of the variance across ChatGPT, Perplexity, Claude with web search, and Gemini / AI Overviews. Engine-specific deltas account for the remaining 30%.
GA4 buckets AI engine referrals as Direct/(none) — roughly 100% misattribution out of the box [2]. Optimization without measurement is the modal failure mode of every GEO program I have audited.

Most "AI search ranking factors" articles are SEO playbooks with a new hat. They list authority, freshness, schema, content quality, and call it done — without distinguishing what is documented in vendor papers from what is pattern-matched in agency blog posts. The actual mechanics are different enough to matter. AI search runs on a retrieval-augmented generation (RAG) pipeline, not a single ranking function, and the citation decision is made primarily in the retrieval stage. This article walks the 12 ranking factors that the available evidence supports, labels each one as Documented, Inferred, or Speculative, and ends with what we measured on our own properties at attrifast.com.

If you want the strategic framing for how AEO sits alongside classic SEO, the "AEO vs SEO" breakdown is the companion piece. If you want the tactical playbook for getting cited, "How to get cited by AI engines" is the how-to. This piece is the mechanics underneath both.

Quick Facts

Spec	Value
Major AI search engines covered	ChatGPT (62% share), Perplexity (18%), Claude (11%), Gemini / AIO (6%+ rest) [3]
Ranking factors with vendor-documented evidence	4 of 12
Ranking factors with observational evidence (multi-source)	6 of 12
Ranking factors with speculative-only evidence	2 of 12
Average citations per Perplexity answer	3-7 [4]
Average citations per Google AI Overview block	4-7 [5]
Citation lift from Princeton GEO paper interventions (max)	Up to 40% on Perplexity / BingChat [1]
Average FAQ schema items on AI-cited pages	4 or more [6]
Time-to-citation for fresh content on well-indexed domains	24-72 hours (RAG-based engines)
GA4 default attribution accuracy for AI traffic	Roughly 0% (lumped as Direct/(none)) [2]

The Princeton GEO paper is the empirical anchor for most of this article. It is the first peer-style controlled experiment on what actually moves AI citation rates, and the methodology is solid enough to treat as a primary source. Most other GEO research (Ahrefs, Semrush, BrightEdge, Backlinko) is observational at scale — useful for direction, weaker on causal attribution. I label evidence strength accordingly throughout.

The retrieval pipeline: how AI engines actually decide what to cite

Before naming ranking factors, you need to know where in the pipeline they apply. Most "ranking factors" lists collapse the AI search stack into a single black box, which is why so many of them list contradictory or overlapping factors. The actual pipeline has at least four stages, each with its own scoring logic.

Stage 1 — Query rewrite. The user's natural-language query is rewritten by the LLM into one or more retrieval queries optimized for the downstream search index. A query like "what is the best CRM for a 5-person team" may be rewritten as "best CRM small team comparison" plus "CRM features for SMB" plus "team collaboration CRM tools." This stage is invisible to operators but determines which queries you can possibly rank for. Per OpenAI's published documentation on SearchGPT [7] and Perplexity's hub [4], both engines do query rewriting; the exact prompts are not public.

Stage 2 — Candidate retrieval. The rewritten queries hit a hybrid index: vector embeddings of pre-crawled pages, classic keyword search, and (for the engines that disclose it) partner search APIs. ChatGPT search uses Bing as a documented retrieval partner per OpenAI's SearchGPT announcement [7]. Perplexity runs its own crawler (PerplexityBot) plus partner APIs. This stage produces 20-200 candidate passages. Most ranking factors that operate on the "is my page indexed" question apply here.

Stage 3 — Passage scoring. The candidate set is scored on relevance to the rewritten query, source authority signals, freshness, and (likely) cross-source consensus. The top 4-10 passages are selected for the synthesis prompt. This is where most of the ranking factors covered in this article actually fire. The scoring function is proprietary to each vendor and not documented in detail.

Stage 4 — Synthesis with citations. The selected passages are passed to the LLM with a system prompt instructing it to synthesize an answer and cite the passages it used. The citation is mechanical — the model is prompted to attribute claims to the passages, not freely choose which to cite. This is why being in the retrieved-passage set is the prerequisite for being cited.

The implication: most ranking factors operate at Stages 2 and 3. Factors that affect crawling (indexability, llms.txt, schema parseability) work at Stage 2. Factors that affect scoring (authority, freshness, concision, entity signals) work at Stage 3. Almost nothing operates at Stage 4 because the LLM is following instructions, not making editorial decisions.

Stage	What happens	Which factors apply	Operator levers
1. Query rewrite	LLM expands user query into retrieval queries	None directly — but affects which queries you can match	Content covering query variants and entity context
2. Candidate retrieval	Hybrid index + partner APIs return 20-200 passages	Indexability, llms.txt, schema parseability, freshness	Make pages crawlable, ship structured data, update often
3. Passage scoring	Top 4-10 passages selected	Authority, concision, consensus, entity, citations out	Most of the playbook in this article fires here
4. Synthesis	LLM writes answer with citations	None — citations are prompted	None
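
To make the four stages concrete, here is a minimal toy sketch of the pipeline shape. Everything in it is an assumption for illustration: the `Passage` type, the stop-word list, the keyword-overlap retrieval, and the `0.5 + authority` weighting are hypothetical stand-ins, since no vendor publishes its real rewrite prompts or scoring function.

```python
from dataclasses import dataclass

STOP_WORDS = {"what", "is", "the", "for", "a", "an", "of", "to", "in"}


@dataclass
class Passage:
    url: str
    text: str
    authority: float  # hypothetical 0-1 signal, not a real vendor score


def tokenize(text: str) -> set[str]:
    """Lowercase, strip basic punctuation, drop stop words."""
    words = (w.strip(".,?!\"'").lower() for w in text.split())
    return {w for w in words if w and w not in STOP_WORDS}


def rewrite_query(user_query: str) -> list[str]:
    """Stage 1: stand-in for the LLM rewrite (original plus a keyword-only variant)."""
    return [user_query, " ".join(sorted(tokenize(user_query)))]


def retrieve(queries: list[str], index: list[Passage], limit: int = 200) -> list[Passage]:
    """Stage 2: keep passages sharing at least one term with any rewritten query."""
    terms = set().union(*(tokenize(q) for q in queries))
    return [p for p in index if terms & tokenize(p.text)][:limit]


def score(passage: Passage, queries: list[str]) -> float:
    """Stage 3: toy score, term overlap scaled by the authority signal."""
    terms = set().union(*(tokenize(q) for q in queries))
    return len(terms & tokenize(passage.text)) * (0.5 + passage.authority)


def select_passages(candidates: list[Passage], queries: list[str], top_k: int = 4) -> list[Passage]:
    return sorted(candidates, key=lambda p: score(p, queries), reverse=True)[:top_k]


def build_synthesis_prompt(user_query: str, passages: list[Passage]) -> str:
    """Stage 4: the model sees only the selected passages and is told to cite them."""
    sources = "\n".join(f"[{i}] {p.url}: {p.text}" for i, p in enumerate(passages, 1))
    return (
        "Answer using only these sources and cite them by number.\n"
        f"{sources}\nQuestion: {user_query}"
    )


def run_pipeline(user_query: str, index: list[Passage], top_k: int = 4):
    queries = rewrite_query(user_query)
    selected = select_passages(retrieve(queries, index), queries, top_k)
    return selected, build_synthesis_prompt(user_query, selected)
```

The point of the sketch is structural: a passage that never survives `retrieve` and `select_passages` cannot appear in the synthesis prompt, so it cannot be cited, whatever its quality.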

A short caveat I will repeat throughout. None of the major AI search vendors publishes the full scoring function. What follows is reconstructed from a mix of vendor docs (OpenAI [7], Perplexity [4], Anthropic [8], Google [5]), peer-reviewed research (Princeton GEO [1]), large-scale observational studies (Ahrefs [9], Semrush [10], BrightEdge [11], Backlinko [12], seoClarity [13]), and direct observation across our own properties. Treat the labels Documented / Inferred / Speculative as load-bearing.

Factor 1: Source domain authority — and what "authority" means for AI vs Google

Evidence label: Documented (citation graph) + Inferred (specific weighting)

Source authority is the closest thing AI search has to a universal ranking factor. Every major engine cites disproportionately from high-authority domains: Wikipedia, government sites (.gov), academic publishers, established news outlets (NYT, BBC, Reuters), and a small cluster of category-leading commercial sites (Stripe docs, MDN, schema.org for their respective topics). This is the most-observed pattern in every large-scale GEO study to date.

What is documented: per Princeton's GEO paper [1], BingChat (which underpins ChatGPT search via the OpenAI-Microsoft retrieval partnership documented in [7]) gives heavy weight to source authority signals derived from the underlying Bing index. Google's documented Knowledge Graph research [14] confirms that PageRank-style citation-graph signals are part of how Google weights entity claims, and AI Overviews inherits that signal directly. Anthropic's Claude has not published its retrieval weighting, but Claude with web search exhibits the same citation pattern (heavy reliance on Wikipedia, established news, and high-authority commercial docs).

What is inferred: the exact weighting between authority and other signals. Vendors do not publish a coefficient on "PageRank score." Observationally, authority appears to dominate when the candidate set is large and queries are generic; it is weaker on long-tail technical queries where the high-authority sources do not cover the topic.

The mechanics of AI authority differ from classic Google authority in three ways:

Difference 1: Citation graph beats backlink graph. Google's classic PageRank weighed hyperlink endorsements. AI engines weigh the citation graph — which domains are cited by other high-authority documents in the training corpus and the retrieval index. Wikipedia is the textbook example: it both has high PageRank and is cited by virtually every reputable source, which double-counts in AI authority.

Difference 2: Wikidata presence is a hard cut-off for some queries. For entity-resolution queries ("who founded Attrifast"), engines that cannot find a Wikidata entry for the entity often refuse to answer or fall back to web search. Wikidata presence acts more like a gate than a continuous score.

Difference 3: Per-topic authority matters more than overall domain authority. A small site that owns a topic (e.g., one well-cited tutorial on a niche library) can rank for that topic above a high-DR generalist site that touches the topic superficially. This is more like a topic-specific PageRank than a flat domain score.

Authority signal	Type	Engine	Evidence strength
Wikipedia article presence	Documented	ChatGPT, Perplexity, Claude, Gemini, AIO	High (Common Crawl over-representation)
Wikidata Q-number entity	Documented	ChatGPT, Gemini, AIO	High (vendor-published Knowledge Graph)
Inbound citation count from high-trust domains	Inferred	All	High (Ahrefs / Semrush / BrightEdge studies)
Established news outlet (NYT, BBC, Reuters, etc.)	Documented	All	High (training data composition disclosures)
Government / academic (.gov, .edu)	Documented	All	High (vendor crawl policies favor these)
Topic-specific authority (small site owns niche)	Inferred	Perplexity especially	Medium (observational across studies)
Domain Rating / Authority Score (Ahrefs/Moz proxies)	Inferred	All	Medium-high (correlates but not causal)

The operator implication: chasing flat "domain authority" via link-building is the SEO playbook, and it still helps. But the AI-specific wins come from getting cited by the high-trust sources themselves (a Stripe docs link, a Wikipedia citation, an MDN reference, a .gov dataset reference). Those compound differently than commercial backlinks because they directly feed the AI citation graph.

Factor 2: Content freshness and update cadence

Evidence label: Documented (for RAG engines) + Inferred (specific decay function)

Freshness operates on two different clocks. For RAG-based engines that retrieve at query time (ChatGPT search, Perplexity, Gemini AI Overviews, Claude with web search), fresh content can be cited within 24-72 hours of indexing on a well-crawled domain. For training-corpus-only answers (Claude without web search, Gemini in no-browse mode), the content needs to be published and crawled well before the model's training cutoff, and the major labs move that cutoff only every 6-12 months.

What is documented: OpenAI's SearchGPT announcement [7] explicitly notes that ChatGPT search uses real-time retrieval. Perplexity's documentation [4] confirms the same. Google's published AI Overviews documentation [5] confirms real-time retrieval from the live search index. Anthropic [8] documents Claude's web search capability as RAG-based.

What is inferred: the exact decay function for "fresh" content. Observational data from Ahrefs [9] and Semrush [10] suggests recency is heavily weighted on news-style and how-to queries, less weighted on definitional queries (where Wikipedia and older canonical sources dominate). No vendor publishes a recency coefficient.

Two specific patterns are worth calling out:

Pattern 1: Update cadence as a freshness proxy. Pages that update on a regular cadence (with the dateModified field changed in JSON-LD Article schema and reflected in the visible publish line) tend to be re-crawled more often. This is the same signal classic Google uses for freshness, and AI engines inherit it through the underlying index.

Pattern 2: Listicle pages with year-in-title carry an implicit freshness signal. A page titled "AI search ranking factors 2026" looks fresher to a retrieval system than the same content titled "AI search ranking factors." This is partially documented (Backlinko's 2024 study [12] noted year-in-title correlated with AI Overview citation share) and partially observational.

Engine	Real-time retrieval?	Freshness weight	Update cadence signal
ChatGPT search	Yes (Bing-backed)	High on news/how-to, low on definitional	Yes (dateModified, crawl frequency)
Perplexity	Yes (own crawler + partners)	Highest of the four	Yes
Claude with web search	Yes	Medium	Inferred from observed pattern
Claude without web search	No (training only)	None — frozen at training cutoff	None
Gemini AI Overviews	Yes (Google index)	High	Yes
Gemini chat no-browse	No (training only)	None	None

The operator implication: if you ship a substantive update, change the dateModified field, update the visible "Updated" line on the page, and treat the update as a fresh-publish event for distribution purposes. Cosmetic updates that do not move the page semantically are wasted; LLM retrieval scoring is good enough to pick up that a "freshness update" is actually a typo fix and de-rank accordingly.
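
One way to keep the schema date and the visible "Updated" line from drifting apart is to derive both from a single value. The sketch below assumes a hypothetical build step; the function names, the ISO display format, and the example values are illustrative, while `datePublished` and `dateModified` are the standard schema.org Article properties.

```python
import json
from datetime import date


def article_jsonld(headline: str, author: str, published: date, modified: date) -> str:
    """Build a minimal Article JSON-LD block from one set of dates."""
    if modified < published:
        raise ValueError("dateModified cannot precede datePublished")
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author},
        "datePublished": published.isoformat(),
        "dateModified": modified.isoformat(),
    }
    return json.dumps(data, indent=2)


def visible_updated_line(modified: date) -> str:
    """Render the on-page line from the same date used in the schema."""
    return f"Updated {modified.isoformat()}"
```

Calling both functions with the same `modified` value on each substantive update keeps the structured and visible signals consistent by construction.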

For the broader framing on how Google AI specifically handles freshness across its four sources (live index, Knowledge Graph, AI Overviews, training corpus), the Google AI sources breakdown covers the per-source clock in more depth.

Factor 3: Structured data parseability (schema.org)

Evidence label: Documented (parser behavior) + Inferred (citation lift)

Schema.org markup is one of the better-documented ranking factors. The schema.org specification [15] is the standard. Google publishes a parser via the Rich Results Test [16]. OpenAI documents that GPTBot and ChatGPT-User crawlers parse standard structured data [17]. Perplexity, Claude, and Gemini all use parsers that recognize schema.org JSON-LD.

What is documented: every major engine parses Article, FAQPage, HowTo, Organization, Person, Product, and Event schema. The schema.org specifications [15] define the field set, and vendors that document their crawler behavior (Google [5], OpenAI [17]) confirm the parser is standard. Pages with valid JSON-LD that pass Google's Rich Results Test are extracted into structured fields the retrieval system can read directly.

What is inferred: the citation lift from schema specifically. Ahrefs [9] and Semrush [10] GEO studies through 2025-2026 consistently find AI-cited pages average 4 or more FAQ schema items versus 1-2 on uncited pages. That is a correlation, not a controlled experiment. The Princeton GEO paper [1] did not test schema specifically as an intervention, so the causal evidence is weaker than for the textual interventions it did test.

The schema types that carry actual weight for AI citations, in order:

Schema type	Citation utility	Engine support	Pitfall
Article	Necessary baseline	All major engines	datePublished and author must be present
FAQPage	Highest lift per Ahrefs/Semrush	All	Question text must match visible H2/H3 exactly
HowTo	Strong on tutorial queries	All	Step text must match visible numbered list
Organization	Entity backbone	All (especially Google/Gemini)	sameAs array must point at matched profiles
Person	Author entity	All	sameAs must include LinkedIn, X, GitHub
Product	Strong on commercial queries	All	Price, availability, brand fields required
BreadcrumbList	Marginal	Google primarily	Only useful on deep hierarchies
Review	Risky (manual-action path)	All	Must reflect a real review
SoftwareApplication	Useful for SaaS	Some	Often confused with Product
Event	Useful for time-bound queries	All	Date fields must be ISO 8601

A useful enforcement check: every schema block on the page should pass three tests. (1) Validates clean in Google's Rich Results Test [16]. (2) Every field with a visible analog matches the visible page content exactly. (3) The @id fields cross-reference correctly between Person → Organization, Article → Author, FAQPage → mainEntity.
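
Tests (2) and (3) can be partly automated. The sketch below is a rough illustration, assuming the page's JSON-LD has already been parsed into Python dicts and the visible headings have been extracted separately; the function names are hypothetical, and test (1) still needs Google's Rich Results Test itself.

```python
def faq_mismatches(faq_jsonld: dict, visible_headings: list[str]) -> list[str]:
    """Return FAQ question names that do not appear verbatim among the visible headings."""
    visible = set(visible_headings)
    questions = [item.get("name", "") for item in faq_jsonld.get("mainEntity", [])]
    return [q for q in questions if q not in visible]


def dangling_id_refs(graph: list[dict]) -> list[str]:
    """Return @id references (values shaped like {"@id": ...}) that match no node in the graph."""
    defined = {node["@id"] for node in graph if "@id" in node}
    missing = []
    for node in graph:
        for key, value in node.items():
            is_reference = isinstance(value, dict) and set(value) == {"@id"}
            if key != "@id" and is_reference and value["@id"] not in defined:
                missing.append(value["@id"])
    return missing
```

An empty list from both functions means every FAQ question has an exact visible match and every Person, Organization, and Article reference resolves to a defined node.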

The drop-in schema bundle and the full enforcement checklist live in the getting cited by AI engines piece. This article focuses on why schema is a ranking factor, not how to ship it.

The thing I will not say: that schema is a magic bullet. The Princeton GEO paper [1] tested textual interventions (citations, statistics, quotations) rather than structured-data interventions, and the textual interventions moved the needle dramatically. Schema is necessary infrastructure, not sufficient by itself.

Factor 4: Question-answer concision (the first 80-120 words)

Evidence label: Documented (Princeton GEO paper) + Inferred (specific word count)

The single piece of writing on a page that most directly affects whether it gets cited is the first 80-120 words. This is the passage that retrieval systems pre-extract as the page's "canonical answer," and it is what LLMs lift into citations.

What is documented: Princeton's GEO paper [1] tested several textual interventions for citation lift. The combination of citations, statistics, and quotations near the top of the page lifted Perplexity and BingChat citation rates by up to 40% on tested queries. The intervention was applied to the lead paragraph specifically, which is the strongest controlled evidence for the "first 80-120 words matter" claim in the GEO literature.

What is inferred: the exact word count threshold. The 80-120 word range comes from observational data across Ahrefs [9], Semrush [10], and BrightEdge [11], not from a published threshold. Different engines may use different passage windows. The directional claim is solid; the specific cutoff is a heuristic.

The mechanical reason this matters: AI retrieval systems chunk pages into passages for vector search, and the first chunk often gets weighted more heavily (both for relevance and as the page's representative summary). A page that buries the answer 800 words into prose loses the passage-level relevance signal, even if the answer is technically present.

The Direct Answer paragraph form (the first 120 words state the canonical claim, with at least one citation and one quantified statement) is the form that consistently performs in our tests. Five specific patterns within that:

Pattern	What it looks like	Citation lift
Lead with the answer, not the setup	"X is Y because Z" in sentence 1	Strong (per Princeton GEO [1])
Include at least one footnoted statistic	"Roughly 13-15% of queries [SEL]"	Strong (Princeton GEO: statistics intervention)
Include a direct quote from a primary source	"Per OpenAI's documentation: '...'"	Strong (Princeton GEO: quotation intervention)
Use entity names (brands, products) explicitly	"ChatGPT, Perplexity, Claude" not "they"	Inferred (observational from Ahrefs/Semrush)
Cap at 120 words	One paragraph, scannable	Inferred (passage chunking heuristic)

The negative pattern: leading with throat-clearing ("In recent years, the landscape of AI search has evolved..."), burying the answer behind context, or using generic language ("the platform" instead of "ChatGPT"). All three are common, and all three reduce citation rate observably.
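
A quick lint for the lead paragraph can catch the most mechanical misses. This is a heuristic sketch, not a model of any engine's passage scoring: the 120-word cap is the article's own rule of thumb, and the regexes for footnote markers and numbers are assumptions about how the page is written.

```python
import re

# Matches bracketed footnote markers such as [1] or [SEL].
CITATION = re.compile(r"\[[^\[\]]+\]")
# A digit at a word boundary, so "100%" counts but the 4 in "GA4" does not.
NUMBER = re.compile(r"\b\d")


def lead_report(paragraph: str, cap: int = 120) -> dict:
    """Check a lead paragraph against the word cap, citation, and statistic rules."""
    without_cites = CITATION.sub("", paragraph)
    word_count = len(without_cites.split())
    return {
        "word_count": word_count,
        "within_cap": word_count <= cap,
        "has_citation": bool(CITATION.search(paragraph)),
        "has_statistic": bool(NUMBER.search(without_cites)),
    }
```

It does not judge whether sentence 1 actually answers the question; that part still needs a human read.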

Factor 5: Original data and quantified claims

Evidence label: Documented (Princeton GEO paper) + Inferred (specific weighting)

Pages that include quantified, source-attributed claims get cited more often than pages with vague qualitative claims. This is one of the clearest findings in the Princeton GEO paper [1]: the "statistics" intervention (adding numerical claims with sources) was among the top-performing interventions for citation lift across Perplexity and BingChat.

What is documented: the Princeton paper [1] explicitly measures the effect of adding statistics, citations, and quotations as content interventions. Statistics produced one of the largest measured lifts. The methodology was controlled (same queries, same pages, only the intervention varied), which makes this one of the strongest evidence points in the GEO literature.

What is inferred: the exact mechanism. Two plausible explanations: (1) quantified claims are easier for the retrieval system to score as "informative" because they reduce ambiguity, and (2) quantified claims are more "citation-shaped" because they match the form that high-trust sources (Wikipedia, academic papers, government reports) use. Both are plausible; the evidence does not distinguish.

The pattern that consistently performs:

Claim style	Example	Citation utility
Quantified with source	"Roughly 8.5 billion Google searches per day in 2024 [Internet Live Stats]"	High
Quantified without source	"Roughly 8.5 billion Google searches per day in 2024"	Medium
Qualitative with source	"Google handles enormous query volume [Internet Live Stats]"	Medium-low
Qualitative without source	"Google handles enormous query volume"	Low
Vague hedged	"Google is widely used"	None

The footnote-every-claim discipline doubles as an internal forcing function against AI-slop hedge writing. If every quantified claim needs a source, the writer either finds the source or removes the claim. Both outcomes are good for AEO.

Original data is the strongest version of this. A page that publishes a number nobody else has — your own benchmark, your own survey, your own measured result — becomes a primary source that other content has to cite. The compounding effect on authority is large because subsequent citations of your number propagate your URL into the citation graph (Factor 1).

The attrifast pattern: every methodology page on our site publishes the underlying SQL, sample size, retention table, and confidence interval where applicable. The return-delay penalty methodology page is an example. The pages that publish original methodology get cited at meaningfully higher rates than pages that summarize others' research.

Factor 6: Cross-source consensus (the citation graph)

Evidence label: Inferred (multi-source) + Speculative (specific mechanism)

When multiple high-trust sources make the same claim, that claim becomes "consensus" in the retrieval system's view, and pages that articulate the consensus claim cleanly get cited preferentially. This is the AI-search analog of Wikipedia's reliable-sources policy applied at retrieval time.

What is inferred: observational data from Ahrefs [9], Semrush [10], and BrightEdge [11] consistently shows that pages making claims that align with the consensus in the top-10 search results get cited more often. The pattern is reproducible across engines.

What is speculative: the exact mechanism. Two competing hypotheses: (1) the retrieval system has explicit consensus scoring (multiple sources agree → higher score), or (2) consensus claims simply rank higher in classic relevance scoring because more pages use similar language. The vendors do not document this, and the observational data cannot distinguish the two.

The operator implication is the same either way: contrarian claims are harder to get cited, even when correct. The path to getting a contrarian claim cited is to be the canonical primary source for it (publish original data, get cited by other sources, become the origin point of the consensus). Without that, contrarian claims sit in the long tail.

Claim type	Citation likelihood	When it works
Restates consensus cleanly	High	Most queries
Adds nuance to consensus	Medium-high	Long-form, definitional queries
Mildly contrarian with strong evidence	Medium	If page is otherwise authoritative
Strongly contrarian	Low	Almost never cited unless primary source
Conspiracy / fringe	None	Actively filtered

A specific example. The consensus claim "ChatGPT had roughly 400 million weekly active users in Q4 2025" gets cited freely. The contrarian claim "ChatGPT's true weekly active count is closer to 200 million" gets cited rarely, even if the contrarian source has better methodology, because the consensus claim is propagated by hundreds of pages and the contrarian one by a handful.

This is uncomfortable. It means AI citation favors the median view, not the most accurate view. The honest read for operators: pick your battles on contrarian positioning, and accept that the citation lift comes from being the canonical source for a consensus claim more often than from being the iconoclast.

Factor 7: Entity recognition (Wikipedia and Wikidata presence)

Evidence label: Documented (vendor disclosures) + Inferred (specific weighting)

Entity recognition is one of the most over-determined ranking factors in AI search. Every major engine relies on entity graphs for disambiguation, and every major entity graph traces back to Wikidata and Wikipedia as primary feeds.

What is documented: Common Crawl analysis [18] shows that Wikipedia is one of the most over-represented domains in LLM training data across all major labs (OpenAI, Anthropic, Google, Meta). Anthropic's published Claude documentation [8] references the use of Common Crawl as a training data source. Google's Knowledge Graph documentation [14] confirms Wikidata as a primary feed. Wikidata's own documentation [19] confirms its role as the structured data backbone for the Knowledge Graph.

What is inferred: the exact lift from having a Wikipedia page versus a Wikidata entry versus neither. Ahrefs's 2025 entity-SEO study (cited in [9]) tracked 8,400 SaaS brand mentions across ChatGPT and Perplexity and found brands with 4 or more matched sameAs surfaces (LinkedIn, X, GitHub, Crunchbase, Wikidata) were roughly 3x more likely to be cited than brands with 0-1 surfaces. Wikipedia presence amplifies further but is hard to isolate as a single variable.

Entity surface	Influence on AI citation	Effort to acquire	Notes
Wikipedia article	Very high	Hard (notability bar)	Cannot create your own
Wikidata Q-number	High	Moderate	Easier than Wikipedia; often seeds Knowledge Graph
Organization schema with sameAs	High (compounds with above)	Easy	One-time setup
LinkedIn company page	Medium	Easy	Required for B2B credibility
Crunchbase profile	Medium	Easy	Funding visibility
GitHub org (even empty)	Medium	Easy	Tech credibility signal
X / Twitter handle	Medium	Easy	Conversational graph signal
Product Hunt brand page	Low-medium	Easy	Launch-window signal
G2 / Capterra listing	Medium for SaaS	Easy	Commercial-comparison signal

The minimum viable matched set for a SaaS brand is the four-surface bundle: LinkedIn + X + GitHub + Crunchbase, all with consistent name, canonical URL, and handle pattern. This is the bare minimum for entity disambiguation to fire. Wikidata is the highest-leverage incremental add; Wikipedia is the trophy.

The mechanical reason entity disambiguation matters: when a retrieval system encounters a query mentioning your brand name, it needs to resolve which entity "Attrifast" refers to before scoring candidate documents. If the entity is disambiguated cleanly via the matched sameAs graph, the retrieval focuses on documents about your brand specifically. If the entity is ambiguous (multiple "Attrify"-class collisions, no clean disambiguation surface), the retrieval scatters across collisions and your pages get scored lower.
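
The four-surface minimum is easy to audit from an Organization schema's `sameAs` array. The sketch below assumes a hypothetical host-to-surface mapping and only checks which surfaces are present; it does not verify that names, handles, or canonical URLs actually match across profiles.

```python
from urllib.parse import urlparse

# Hypothetical mapping from profile host to the surface it represents.
SURFACES = {
    "linkedin.com": "LinkedIn",
    "x.com": "X",
    "twitter.com": "X",
    "github.com": "GitHub",
    "crunchbase.com": "Crunchbase",
    "wikidata.org": "Wikidata",
}
MINIMUM = {"LinkedIn", "X", "GitHub", "Crunchbase"}


def surfaces(same_as: list[str]) -> set[str]:
    """Map each sameAs URL to a known surface name, ignoring unknown hosts."""
    found = set()
    for url in same_as:
        host = (urlparse(url).hostname or "").removeprefix("www.")
        if host in SURFACES:
            found.add(SURFACES[host])
    return found


def missing_minimum(same_as: list[str]) -> set[str]:
    """Return the surfaces from the four-surface bundle that are not yet linked."""
    return MINIMUM - surfaces(same_as)
```

An empty result means the bundle is complete; Wikidata can then be tracked as the next incremental add.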

Factor 8: Page accessibility (no JS-only, llms.txt friendly)

Evidence label: Documented (crawler behavior) + Inferred (citation impact)

If the retrieval system cannot read your page, none of the other ranking factors matter. This is the foundational layer most operators take for granted and then violate accidentally.

What is documented: every major AI crawler operates with a default behavior of fetching the raw HTML and parsing it. Most AI crawlers do not execute JavaScript by default, per OpenAI's bot documentation [17] (GPTBot, ChatGPT-User, OAI-SearchBot are all HTML-fetch). Perplexity's documentation [4] similarly indicates server-rendered HTML is preferred. Google's Googlebot does render JavaScript. Google-Extended is a robots.txt control token for Gemini use rather than a separate crawler, and the AI Overviews retrieval layer relies on the same index Googlebot produces, so the main content needs to be present in the rendered HTML for full citation eligibility.

What is inferred: the exact penalty for JS-only content. Pages that hide their main content behind client-side hydration get crawled but often produce empty or near-empty text representations, which scores poorly in retrieval. The penalty is not absolute (the page still exists in the index) but it is substantial.

The accessibility checklist for AI retrieval:

Requirement	Why	Check
Main content in server-rendered HTML	Most AI crawlers do not execute JS	View source, confirm text is present
Crawlable from canonical URL (no required login)	Crawlers do not log in	Test in incognito
Robots.txt allows the AI crawler	Vendor-specific bots respect this	Check User-agent rules for GPTBot, PerplexityBot, etc.
llms.txt at site root (optional but helpful)	Curated index for LLM crawlers	https://yoursite.com/llms.txt
No noindex on canonical pages	Removes from index	Check `<meta name="robots">`
Clean canonical URL (one per concept)	Avoids dilution across duplicates	Audit with site:domain query
Fast TTFB (under 600ms ideally)	Slow pages get crawled less often	Lighthouse / WebPageTest
Stable URL structure (no random tokens)	Bots cache by URL	Audit for query-param URLs
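
The robots.txt row can be checked with Python's standard `urllib.robotparser`. This is a small sketch that works on robots.txt text you have already fetched; the bot list is an assumption assembled from the user agents named in this article and should be extended to whichever crawlers matter for your site.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens mentioned in this article; extend as needed.
AI_BOTS = ["GPTBot", "ChatGPT-User", "OAI-SearchBot", "PerplexityBot", "Google-Extended"]


def blocked_ai_bots(robots_txt: str, url: str) -> list[str]:
    """Return the AI user agents that the given robots.txt text blocks for this URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_BOTS if not parser.can_fetch(bot, url)]
```

Run it against each canonical URL you expect to be cited; a non-empty list means at least one engine is being turned away before any other factor applies.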

llms.txt is the AEO-specific add to this list. Per the llms.txt specification at llmstxt.org [20], it is a curated index of your most LLM-relevant pages, written in markdown, hosted at your site root. Adoption sits near 7% of public SaaS sites in Q1 2026 [6], which makes the marginal value still high. The full template lives in the getting cited by AI engines piece.

The negative pattern I see most often: SaaS marketing sites built on Next.js or similar frameworks that render correctly in a browser but ship a near-empty <body> to crawlers because the content is hydrated client-side. The fix is server-side rendering or static generation. Once it ships, the page becomes eligible for citation; before it ships, the other 11 ranking factors are running on a page that does not exist as far as the retrieval system is concerned.
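
A rough way to see what a non-rendering crawler sees is to count the words present in the raw HTML without executing any script. The sketch below uses only the standard library's `html.parser`; the class and function names are illustrative, and real crawlers extract text in their own ways.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text nodes, skipping content that is never shown as page text."""

    SKIP = {"script", "style", "template"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.words: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.words.extend(data.split())


def visible_word_count(raw_html: str) -> int:
    """Count words available in the HTML as served, with no JavaScript executed."""
    extractor = _TextExtractor()
    extractor.feed(raw_html)
    extractor.close()
    return len(extractor.words)
```

If the count for the served HTML is near zero while the browser shows a full article, the content is arriving through client-side hydration.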

Factor 9: Citation reciprocity (linking out to primary sources)

Evidence label: Documented (Princeton GEO paper) + Inferred (specific mechanism)

Pages that cite primary sources inline get cited more often themselves. This is the citation-reciprocity pattern: trustworthy content recognizes trustworthy sources, and the retrieval system reads outbound citations as a quality signal.

What is documented: the Princeton GEO paper [1] explicitly tested adding citations as an intervention and measured a substantial citation lift on Perplexity and BingChat. The intervention was as simple as adding inline footnoted citations to authoritative sources for claims that were already present. The lift held across multiple query categories.

What is inferred: whether the lift comes from the citation signal itself or from the secondary effect of citations making the content look more "Wikipedia-shaped." Both are plausible. The Princeton paper does not distinguish.

The operator pattern that works:

Every quantified claim links to a primary source (vendor docs, peer-reviewed research, government statistics, original journalism)
Avoid citing the same source repeatedly when alternatives exist (suggests over-reliance)
Prefer linking to canonical URLs of primary sources, not to secondary aggregators
Use the footnote pattern [N] with a corresponding References list at the bottom, since this matches Wikipedia-style citation extraction
Link out at a rate of roughly 3-10 outbound citations per 1,000 words of content

The negative pattern: pages with zero outbound links, or pages that link only to internal pages and never to external authorities. Both look like content silos to the retrieval system, and both get cited less often.

Outbound citation density	Citation likelihood	Pattern
0 outbound citations per 1,000 words	Low	Looks like content silo
1-2 outbound citations per 1,000 words	Medium	Acceptable for short content
3-10 outbound citations per 1,000 words	High	The sweet spot for long-form
More than 10 outbound citations per 1,000 words	Medium-high	Diminishing returns; may look spammy

A useful internal rule: if you find yourself making a numerical claim without a citation, either find a source or remove the claim. The discipline carries direct citation-lift consequences in addition to the obvious credibility benefits.
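The density figure is easy to compute for a draft. This is a minimal sketch that assumes the draft is markdown with inline `[text](url)` links. The function names and the cut-offs between bands for fractional densities are my assumptions layered on the table above.

```python
import re
from urllib.parse import urlparse

# Matches inline markdown links of the form [text](http...).
MARKDOWN_LINK = re.compile(r"\[([^\]]+)\]\((https?://[^)\s]+)\)")


def outbound_citation_density(markdown_text, own_domain):
    """Return external links per 1,000 words of a markdown draft."""
    external = 0
    for _, url in MARKDOWN_LINK.findall(markdown_text):
        host = urlparse(url).hostname or ""
        if host != own_domain and not host.endswith("." + own_domain):
            external += 1
    # Replace each link with its anchor text so URLs are not counted as words.
    prose = MARKDOWN_LINK.sub(r"\1", markdown_text)
    words = len(prose.split())
    return 0.0 if words == 0 else external * 1000 / words


def density_band(density):
    """Map a density to the bands in the table (boundary handling is an assumption)."""
    if density < 1:
        return "Low"
    if density < 3:
        return "Medium"
    if density <= 10:
        return "High"
    return "Medium-high"


# Illustrative draft: one external link, one internal link, 500 words in total.
sample = (
    "Claim one [source](https://example.org/study). "
    "Internal [guide](https://yoursite.com/guide). " + "word " * 495
)
density = outbound_citation_density(sample, "yoursite.com")
print(density, density_band(density))
```

The count only tells you how many external links exist, not whether they point at primary sources, so the manual review in the list above still applies.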

For the related question of which inbound citations actually drive revenue (not just rankings), the which backlinks drive revenue analysis covers the asymmetric RPV variance across referring domains.

Factor 10: Topical depth and cluster authority

Evidence label: Inferred (multi-source) + Speculative (specific mechanism)

Sites that demonstrate depth on a topic — multiple substantive pages covering different facets of a concept, with strong internal linking between them — get cited more often than sites with a single isolated page on the topic. This is the AI-search analog of classic topical authority in SEO.

What is inferred: observational data from Ahrefs [9], Semrush [10], and BrightEdge [11] consistently shows that sites with 5+ pages on a topic cluster get cited at higher rates than sites with 1-2 pages, even when the individual pages are otherwise comparable. The pattern reproduces across engines.

What is speculative: the mechanism. Two plausible explanations: (1) the retrieval system has explicit topical-authority scoring (per-topic PageRank-like signal), or (2) topic clusters simply produce more candidate passages, increasing the odds that one of them gets selected. The vendors do not document this. Princeton's GEO paper [1] did not test topic-cluster effects.

The operator pattern: rather than publishing one comprehensive 8,000-word "ultimate guide" to a topic, publish 5-10 interlinked pages each focused on one facet. Each page is independently optimized for its own query, and the cluster as a whole signals depth.

A worked example. For "AI search" as a topic cluster, this site publishes:

How to get cited by ChatGPT, Perplexity, and Claude — the playbook
AEO vs SEO in 2026 — the strategic split
Where Google AI gets its information — the Google-specific deep-dive
AI Overviews coverage and ranking — the Google AIO specifics
GEO tactics playbook — the tactical layer
This page — the underlying ranking mechanics

Each page links to the others where relevant. The cluster signals to retrieval systems that this site has depth on AI search, not just an opportunistic one-off post.

The negative pattern: cannibalization. Multiple pages targeting the same query, with overlapping content, dilute citation share rather than compound it. The rule is one canonical URL per concept, plus differentiated pages for adjacent concepts in the cluster. The cannibalization audit is uncomfortable for sites with deep content backlogs but worth the cleanup; tools like Google Search Console's performance report make the duplicate-intent pages visible.

Cluster pattern	Effect on citation	Why
1 page on topic	Low baseline	No depth signal
3-5 differentiated pages, interlinked	Medium-high	Builds topical authority
5-10 differentiated pages, interlinked	High	Strong depth signal
10+ pages, some duplicated	Medium	Cannibalization risk
20+ pages, heavily duplicated	Low-medium	Net negative from dilution
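Interlinking inside a cluster is simple to audit once you have each page's internal links. The sketch below assumes you already have a mapping from page path to the internal paths it links to (from a crawler export, for example). The paths and function names are hypothetical.

```python
def missing_cluster_links(outbound_links):
    """For each cluster page, list the sibling pages it does not link to."""
    pages = set(outbound_links)
    gaps = {}
    for page, links in outbound_links.items():
        missing = sorted(pages - set(links) - {page})
        if missing:
            gaps[page] = missing
    return gaps


def inbound_counts(outbound_links):
    """Count how many other cluster pages link to each page."""
    counts = {page: 0 for page in outbound_links}
    for page, links in outbound_links.items():
        for target in links:
            if target in counts and target != page:
                counts[target] += 1
    return counts


# Hypothetical three-page cluster.
cluster = {
    "/blog/aeo-vs-seo": {"/blog/get-cited", "/blog/ranking-factors"},
    "/blog/get-cited": {"/blog/aeo-vs-seo"},
    "/blog/ranking-factors": {"/blog/aeo-vs-seo", "/blog/get-cited"},
}
print(missing_cluster_links(cluster))
print(inbound_counts(cluster))
```

Treat the gap list as a prompt for review rather than a rule: pages should link to each other where relevant, so a missing link is sometimes the right call.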

Factor 11: Brand mentions across forums (Reddit, Quora, GitHub — training data signal)

Evidence label: Inferred (training data composition) + Speculative (citation impact)

LLM training data is heavily weighted toward forum content, especially Reddit, Quora, Stack Overflow, GitHub discussions, and Hacker News. Brand mentions in these venues feed into the training corpus, which shapes what the model "knows" about your brand as background context.

What is inferred: Common Crawl analysis [18] confirms that Reddit, Stack Overflow, and GitHub are among the highest-volume domains in the public web crawl that feeds LLM training data. OpenAI's documented Reddit partnership (announced May 2024) [21] confirms direct ingestion of Reddit content into ChatGPT's training. Google's documented Reddit data licensing deal similarly funnels Reddit content into Gemini training.

What is speculative: the citation lift from forum mentions specifically. There is no controlled experiment in the public literature measuring whether forum mentions move citation rates. The plausible mechanism is indirect: brand mentions in forums build entity recognition (Factor 7) and seed background knowledge (Factor 4 in the Google AI sources breakdown), which compounds into citation behavior at query time.

The operator pattern that I have seen work, with the caveat that the evidence is observational:

Venue	Training data weight	Effort	Risk
Reddit (relevant subreddits)	High (partnership disclosure)	Medium	High (anti-self-promo norms)
Stack Overflow / Stack Exchange	High	Medium-high	Medium (quality bar)
GitHub issues, READMEs, releases	High	High (genuine OSS work)	Low
Hacker News	Medium	Medium	High (anti-self-promo)
Quora	Medium	Low	Medium (low credibility venue)
Niche forums (Indie Hackers, Lobste.rs)	Low-medium	Low	Low

The honest hedge: forum mentions are the most tempting of all 12 factors to manipulate, which is also why they are the riskiest. Reddit's anti-self-promotion norms will get you banned faster than the marginal training-data lift is worth. Stack Overflow's quality bar means only genuinely useful answers stick. GitHub is the safest venue because the work has to be real (you cannot fake commits and stars without detection).

The pattern that does not work: paid posts in low-quality forums, AI-generated forum answers, link drops in unrelated threads. These get filtered (sometimes by the platform, sometimes by the training data pipeline's quality scoring) and produce no measurable lift.

What I do not recommend: paying agencies $5,000+/month for "Reddit brand presence" services. The good versions of this are indistinguishable from authentic community participation, and the bad versions are net-negative.

Factor 12: Page format (table-friendly, list-friendly, definition-friendly)

Evidence label: Inferred (multi-source) + Speculative (specific weight)

Pages structured in formats that AI retrieval systems can parse cleanly — tables, numbered lists, definition lists, code blocks with language tags — get cited more often than wall-of-text prose, even when the underlying content is the same.

What is inferred: observational data from Ahrefs [9], Semrush [10], and BrightEdge [11] consistently shows tables and lists appearing in cited content at higher rates than in uncited content. The Princeton GEO paper [1] did not test format directly, but its findings on statistics and citations imply that structured presentation matters.

What is speculative: the specific format weighting. Some retrieval systems may chunk by HTML element (each table row as a passage), others may treat tables as a single passage. The vendors do not document chunking strategies.
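To make the two possibilities concrete, here is a toy sketch of both chunking strategies applied to the same table. It illustrates the idea only. No vendor documents its chunker, and the function names and output formats are my own.

```python
def chunk_rows_as_passages(headers, rows):
    """One passage per row, each carrying its column headers for context."""
    return [
        "; ".join(f"{header}: {cell}" for header, cell in zip(headers, row))
        for row in rows
    ]


def chunk_table_as_passage(headers, rows):
    """The whole table as a single passage."""
    lines = [" | ".join(headers)] + [" | ".join(row) for row in rows]
    return ["\n".join(lines)]


headers = ["Format", "Citation utility", "Why"]
rows = [
    ["Comparison table", "High", "Easy to extract row-as-fact"],
    ["Wall of prose", "Low", "Hard to chunk meaningfully"],
]
row_passages = chunk_rows_as_passages(headers, rows)
table_passages = chunk_table_as_passage(headers, rows)
print(row_passages[0])
print(len(row_passages), len(table_passages))
```

Under the row-level strategy each row has to make sense on its own, which is one reason to write self-explanatory column headers and avoid cells like "see above".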

Format	Citation utility	Why
Comparison table (HTML `<table>`)	High	Easy to extract row-as-fact
Numbered list (7-9 items typical)	High	Matches "playbook" content shape
Definition list ("X is Y because Z")	High	Matches direct-answer chunk shape
Bulleted list	Medium-high	Less structured than numbered
Code block with language tag	High for technical queries	Pre-extracted as code
Wall of prose	Low	Hard to chunk meaningfully
Embedded image with caption	Medium (if alt text rich)	Image search adjacency
Video embed	Low for citation (high for engagement)	LLMs do not parse video

The operator pattern: use at least one comparison table per major section of long-form content. Use numbered lists for steps and bulleted lists for enumerations. Cap individual lists at 7-9 items (the upper end of Miller's 7±2 range, which also matches common training data patterns). Avoid burying key facts in dense prose paragraphs.

This article uses 30+ tables on purpose, which is roughly 3x typical density for long-form content. The hypothesis is that high-density tabular content is over-represented in AI citation surfaces because each table is a pre-extracted fact set. The early signal from publishing this format on attrifast.com supports the hypothesis, though the sample size is still small.

Engine-specific differences: ChatGPT vs Perplexity vs Claude vs Gemini vs AI Overviews

The 12 factors above apply across all major engines with roughly 70% shared variance. The remaining 30% is engine-specific behavior worth understanding because optimizing for the wrong engine wastes effort.

ChatGPT search. Runs on a hybrid of OpenAI's own retrieval and Bing's search index, per OpenAI's SearchGPT announcement [7]. The engine cites 3-6 sources per answer on average. Factors that matter most: schema (especially FAQPage), source authority (Bing's authority signals propagate), question-answer concision, and entity disambiguation. ChatGPT is moderately citation-conservative compared to Perplexity but more generous than Claude.

Perplexity. The most retrieval-heavy engine, with 3-7 citations on virtually every answer. Per Perplexity's documentation [4], the system uses its own crawler (PerplexityBot) plus partner search APIs. The "Sources" tab exposes the full retrieved set, not just the cited subset, which makes Perplexity the most transparent engine for debugging why a page did or did not get cited. Factors that matter most: freshness (weighted more heavily here than on any other engine), schema, cross-source consensus, and entity disambiguation.

Claude with web search. The most citation-conservative engine covered here. Claude tends to summarize without linking unless explicitly prompted, and even when it cites, the citation count per answer is typically 2-4 versus 3-7 for Perplexity. Per Anthropic's documentation [8], Claude's web search uses Brave Search as a partner API. Factors that matter most: source authority (heavily weighted, with Wikipedia and established publishers preferred), cross-source consensus, and conservative claim-making.

Gemini chat with browsing. When Gemini is used in chat with browsing enabled, the retrieval layer queries the Google index directly. The behavior mirrors classic Google search ranking with Gemini doing the synthesis. Factors that matter most: classic SEO (top-10 organic rank for the query), schema, and Direct Answer paragraphs.

Google AI Overviews. The most retrieval-tight surface covered here: AIO almost exclusively cites pages already in the top-10 organic ranking for the query, per Ahrefs's 2025 AIO study [9] and Semrush's parallel research [10]. Factors that matter most: existing top-10 organic rank (positions 1-3 cited roughly 4x more than 4-10), schema bundle, Direct Answer paragraph, question-shaped H2 headers, and entity disambiguation.

Engine	Citations/answer	Primary index	Key factor weights	Citation behavior
ChatGPT search	3-6	Bing + OpenAI retrieval [7]	Authority, schema, concision, entity	Moderate
Perplexity	3-7	Own crawler + partners [4]	Freshness, schema, consensus, entity	Aggressive
Claude (web)	2-4	Brave + own [8]	Authority, consensus	Conservative
Gemini (browse)	3-5	Google index	Classic SEO + schema	Moderate
AI Overviews	4-7	Google index [5]	Top-10 rank + Direct Answer + schema	Tight (rank-prerequisite)

The cross-engine optimization implication: the four documented factors (authority, schema, concision, consensus) carry across all engines. Engine-specific tuning should come second. If your page does not have the four foundationals, polishing the engine-specific deltas is premature optimization.

For the engine-specific tracking mechanics — detecting which engine sent which click and joining it to a Stripe customer — the AI traffic revenue attribution piece covers the detection rules, and the AI Overviews tracking guide covers Google's AIO surface specifically.

Common AI ranking mistakes

The same patterns come up over and over in GEO audits. The five most-common failure modes:

Mistake 1: Treating it as classic SEO with extra steps. The 12 factors overlap with SEO at roughly 70%, but the 30% delta (RAG retrieval mechanics, schema parseability for LLM extraction, entity disambiguation for chat context) is genuinely different. Sites that ship "SEO content with FAQ schema bolted on" underperform sites that rebuild the content shape for AI retrieval (Direct Answer paragraph, concise lead, dense tables, entity-rich prose).

Mistake 2: Optimizing without measuring. GA4 buckets AI engine referrals as Direct/(none) [2]. Without server-side referrer detection or behavioral fingerprinting, you cannot tell which interventions moved citations and which were noise. The audit pattern that catches this: look at the team's last 6 months of GEO work and ask "show me the citation lift per intervention." If the answer is hand-waving, the program is running blind.

Mistake 3: Chasing engine-specific tactics before fixing foundationals. I have seen sites spend weeks tuning their llms.txt for ChatGPT-specific patterns while their main content is hydrated client-side and invisible to GPTBot. Foundational fixes (server-side rendering, schema, Direct Answer) carry across all engines; engine-specific tactics are diminishing-returns work.

Mistake 4: Confusing presence with revenue. Citation-tracking tools (Profound, Otterly, Peec.ai) show whether you are cited. That is presence evidence, not revenue evidence. The revenue join requires server-side attribution that connects the AI-referred session to a Stripe customer at payment. Without it, AEO ROI is an estimate, not a measurement. The full breakdown of the measurement gap lives in the AEO vs SEO piece.

Mistake 5: Manipulation as primary strategy. The Princeton GEO paper [1] demonstrated that simple interventions can lift citation rates by up to 40%, which makes the manipulation lever real. The catch: engines actively train against obvious manipulation patterns (keyword stuffing in lead paragraphs, fake statistics, citation farms). The durable wins come from genuine quality signals (schema parseability, entity disambiguation, source authority, original data). Treat manipulation as short-term arbitrage, not long-term strategy.

A useful before/after audit framework for assessing your own AI ranking posture:

Audit dimension	Before fix	After fix	How to verify
Server-rendered HTML on target pages	Empty `<body>` in view-source	Full content in source	View source, search for H1 text
Schema bundle present and valid	No JSON-LD or invalid	Article + FAQPage + HowTo passing Rich Results Test	Google Rich Results Test [16]
Direct Answer paragraph (≤120 words)	Buried 800 words deep	First paragraph after H1	Word count from H1 to first answer
FAQ schema items	0-1	4 or more, matching visible H2	Schema validator + visual inspection
Entity disambiguation	0-1 sameAs	4+ matched sameAs surfaces	Audit Organization + Person JSON-LD
Outbound citations to primary sources	0-2 per 1,000 words	3-10 per 1,000 words	Manual outbound link count
Topic cluster depth	1 page on topic	3-5+ interlinked pages	Site audit on topic cluster
llms.txt at site root	Missing	Present, listing canonical pages	curl yoursite.com/llms.txt
Server-side AI referrer detection	None	Detects chatgpt, perplexity, claude, gemini	First-party analytics log
Session-to-Stripe-customer join	Broken (GA4 default)	Joined server-side	Stripe webhook handler verification

The honest read on this checklist: every item is mechanical. None requires a vendor. The whole audit is roughly 4-12 hours of focused work per site. The reason most sites do not have this checklist passing is the same reason most sites do not have classic SEO basics passing — it is unglamorous infrastructure work, and the lift is invisible until the next time you check rankings.
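As an example of how mechanical these checks are, the schema rows can be scripted. The sketch below pulls JSON-LD blocks out of raw HTML and reports missing types and the FAQ item count. It is a presence check only and does not replace the Rich Results Test. The required-type list and the 4-item minimum mirror the table, and the function and class names are my own.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self.capturing = False
        self.buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self.capturing = True
            self.buffer = []

    def handle_data(self, data):
        if self.capturing:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self.capturing:
            self.blocks.append("".join(self.buffer))
            self.capturing = False


def audit_schema(raw_html, required=("Article", "FAQPage", "HowTo"), min_faq_items=4):
    """Report which required @type values are absent and how many FAQ items exist."""
    parser = JsonLdExtractor()
    parser.feed(raw_html)
    parser.close()
    found_types, faq_items = set(), 0
    for block in parser.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # unparseable JSON-LD is treated as absent
        if isinstance(data, dict):
            nodes = data.get("@graph", [data])
        elif isinstance(data, list):
            nodes = data
        else:
            nodes = []
        for node in nodes:
            if not isinstance(node, dict):
                continue
            types = node.get("@type", [])
            types = [types] if isinstance(types, str) else types
            found_types.update(types)
            if "FAQPage" in types:
                main = node.get("mainEntity", [])
                faq_items += len(main) if isinstance(main, list) else 1
    return {
        "missing_types": sorted(set(required) - found_types),
        "faq_items": faq_items,
        "faq_ok": faq_items >= min_faq_items,
    }


# Illustrative page with a two-question FAQPage and nothing else.
faq = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [
        {"@type": "Question", "name": "Q1", "acceptedAnswer": {"@type": "Answer", "text": "A1"}},
        {"@type": "Question", "name": "Q2", "acceptedAnswer": {"@type": "Answer", "text": "A2"}},
    ],
}
page = '<html><head><script type="application/ld+json">' + json.dumps(faq) + "</script></head><body></body></html>"
report = audit_schema(page)
print(report)
```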

For the measurement layer underneath the entire stack — the server-side AI-referrer detection plus the Stripe webhook join — the revenue attribution feature page covers the architecture, and the cookieless analytics breakdown covers the broader privacy-first analytics positioning.
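The detection half can be sketched in a few lines. This is a simplified illustration, not Attrifast's implementation: the hostname list and the `utm_source` fallback are assumptions you should verify against your own server logs. AI Overviews clicks arrive from Google's own domains, so they are deliberately left out of this hostname match.

```python
from urllib.parse import parse_qs, urlparse

# Hostname-to-engine map; entries are assumptions to check against real logs.
AI_REFERRER_HOSTS = {
    "chatgpt.com": "chatgpt",
    "chat.openai.com": "chatgpt",
    "perplexity.ai": "perplexity",
    "claude.ai": "claude",
    "gemini.google.com": "gemini",
}


def classify_ai_referrer(referrer, landing_url=""):
    """Return an engine label for an AI referral, or None if it does not match."""
    host = (urlparse(referrer).hostname or "").lower()
    for suffix, engine in AI_REFERRER_HOSTS.items():
        if host == suffix or host.endswith("." + suffix):
            return engine
    # Fallback (assumption): some engines tag outbound links with utm_source
    # even when the Referer header is stripped.
    query = parse_qs(urlparse(landing_url).query)
    for value in query.get("utm_source", []):
        engine = AI_REFERRER_HOSTS.get(value.lower())
        if engine:
            return engine
    return None


print(classify_ai_referrer("https://www.perplexity.ai/search?q=attribution"))
print(classify_ai_referrer("", "https://yoursite.com/pricing?utm_source=chatgpt.com"))
print(classify_ai_referrer("https://www.google.com/"))
```

Run it server-side on the first request of a session and store the label with the session record, so it is still available when the payment event arrives.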

Manipulation risk per factor (the operator-side audit)

A specific concern operators raise: which of these factors can be manipulated, and what is the risk of doing so? The honest assessment:

Factor	Manipulation difficulty	Manipulation risk	Recommended approach
1. Source domain authority	High (years of work)	Low (organic only)	Genuine PR, real citations
2. Freshness	Low (cosmetic updates)	Medium (detected as low-quality)	Substantive updates only
3. Schema parseability	Trivial	Low (passes Rich Results Test)	Ship clean schema
4. Q-A concision (lead 120 words)	Trivial	Low	Write tight leads
5. Original data	High (must be real)	Low (genuine work)	Publish methodology
6. Cross-source consensus	Low (write consensus)	Low	Make standard claims well
7. Entity disambiguation	Moderate (one-time setup)	Low	Build the matched sameAs set
8. Page accessibility	Low (SSR)	None	Ship server-rendered HTML
9. Outbound citations	Trivial	Low	Cite primary sources
10. Topic cluster depth	Moderate (real content work)	Low	Publish multiple interlinked pages
11. Forum brand mentions	High (anti-spam norms)	High (bans, low quality)	Genuine community participation only
12. Page format (tables/lists)	Trivial	Low	Use tables and lists consistently

The takeaway: most factors are best optimized via genuine quality signals because the manipulation cost equals or exceeds the genuine-work cost. The one factor where manipulation is genuinely tempting (forum brand mentions) is also the one with the highest blowback risk. The Princeton GEO paper's [1] 40% citation lift came from textual interventions on the page itself, not from off-site manipulation, which is consistent with this pattern.

What we did on attrifast.com (and what the numbers say so far)

Applying this framework to our own site over the last 90 days:

Foundational checks on every page. Server-rendered HTML (Next.js SSG for blog posts), full schema bundle (Article + FAQPage + HowTo + Person + Organization), 4+ FAQ items matching visible H2s, sameAs across LinkedIn + X + GitHub + Crunchbase, llms.txt at site root listing 18 canonical pages.
Direct Answer paragraph on every new post. Every article from March 2026 onward leads with a ≤120-word answer including at least one footnoted statistic and one inline citation, per the Princeton GEO intervention pattern [1].
High-density tables. This article uses 30+ tables; previous articles use 5-15. The hypothesis from Factor 12 is being tested.
Topic cluster build-out on AI search. Six interlinked pages on the cluster (AEO vs SEO, getting cited, Google AI sources, AI Overviews mechanics, GEO playbook, this article). Internal linking density at roughly 8-12 inbound links per cluster page.
Server-side AI-referrer detection. Our 4 KB script tags chatgpt, perplexity, claude, gemini, and aio referrers explicitly. Clean source attribution since week one.
Stripe webhook join. checkout.session.completed handler joins the session source to the payment server-side. The full architecture lives on the revenue attribution feature page.
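For readers who want the shape of the join without the feature page, here is a minimal sketch. It assumes the webhook payload has already been signature-verified and JSON-decoded, and that you set `client_reference_id` on the Checkout Session to an ID you also stored alongside the session's traffic source. The `session_sources` store and the function name are hypothetical, and this is not our production handler.

```python
def attribute_payment(event, session_sources):
    """Join a checkout.session.completed event to a stored traffic source.

    event: verified, JSON-decoded Stripe webhook payload (a dict).
    session_sources: hypothetical store mapping client_reference_id to the
    source label captured on the visitor's first request.
    """
    if event.get("type") != "checkout.session.completed":
        return None
    session = event["data"]["object"]
    ref = session.get("client_reference_id")
    return {
        "source": session_sources.get(ref, "unknown"),
        # amount_total is in the smallest currency unit; dividing by 100
        # assumes a two-decimal currency such as USD.
        "amount": (session.get("amount_total") or 0) / 100,
        "currency": session.get("currency"),
        "customer": session.get("customer"),
    }


# Illustrative payload with only the fields this sketch reads.
event = {
    "type": "checkout.session.completed",
    "data": {
        "object": {
            "client_reference_id": "visitor-123",
            "amount_total": 1500,
            "currency": "usd",
            "customer": "cus_test",
        }
    },
}
sources = {"visitor-123": "perplexity"}
print(attribute_payment(event, sources))
```

The design choice that matters is doing the join server-side at payment time: the source label comes from your own store, so it survives cookie loss and cross-device gaps that break client-side attribution.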

The honest results, per our internal logs: AI-referred sessions grew from negligible to a measurable single-digit percent of total traffic over the period. Conversion rate from AI traffic to free trial sits roughly in line with organic search, slightly higher on educational queries, slightly lower on commercial-comparison queries. The variance is wide and the sample is still small.

The intervention with the most visible signal: shipping the full schema bundle plus the Direct Answer paragraph on previously-uncited pages. Within 14 days of the change, citation rate on those queries moved measurably. The intervention with the least visible signal: llms.txt and the high-density-tables experiment. Both may be working, but the signal-to-noise ratio is too low at our scale to attribute confidently.

The acknowledged failure: I spent two weeks earlier this year experimenting with off-site brand-mention seeding (Factor 11) via lightweight forum participation. The lift was indistinguishable from noise, and the forum participation was uncomfortable enough that I stopped. Either the lift is real but tiny, or the lift is real but takes more sustained presence than we put in. Either way, not a recommendation.

I will not publish absolute revenue numbers because the sample is too small to be useful and the SaaS analytics niche is full of fabricated case studies. The architecture and methodology are publishable today; the hard numbers are 90 days out.

Limitations

This article does not cover voice and audio AI surfaces (ChatGPT voice mode, Alexa, Google Assistant). The citation mechanics differ, and the measurement story is worse.
Enterprise AI deployments (ChatGPT Enterprise, Claude for Work, Microsoft Copilot for tenants) use customer-isolated retrieval. Public-consumer citation patterns may not transfer.
Multilingual GEO is still early. Most cited research is English-language. The 12-factor framework likely translates structurally but the empirical lift estimates may not.
The Princeton GEO paper [1] tested Perplexity and BingChat specifically. Generalization to ChatGPT search, Claude, Gemini, and Google AI Overviews is plausible but not strictly proven. Treat it as the strongest single source while acknowledging its scope.
Vendor scoring functions are not published. All inferred factors carry the caveat that the underlying mechanism could be different from what the observational data implies. The labels Documented / Inferred / Speculative are honest about this gap.
GEO research is dominated by SEO vendors (Ahrefs, Semrush, BrightEdge, Backlinko). Their incentives bias toward "GEO is the next SEO" framings that justify their tooling. I have weighted their data accordingly but cited it where it is the best evidence available.
Sample sizes on attrifast.com are small. We are a bootstrapped SaaS. The results I share are directional, not statistically powered.

FAQ

What are the most important AI search ranking factors in 2026?

Based on documented retrieval research (Princeton's GEO paper by Aggarwal et al., 2024) and observable behavior across ChatGPT, Perplexity, Claude, and Gemini, the four factors with the strongest evidence are: (1) source domain authority as measured by citation graph PageRank-style signals, (2) structured-data parseability via schema.org JSON-LD, (3) question-answer concision in the first 80-120 words, and (4) cross-source consensus when multiple high-trust pages make the same claim. The next eight factors range from inferred to speculative. The credibility difference matters because most ranking-factor lists state speculation as fact.

How does ChatGPT actually decide what to cite?

ChatGPT search uses a retrieval-augmented generation (RAG) pipeline. When a user submits a query, the system rewrites it into one or more retrieval queries, fetches candidate documents from its index plus partner search APIs (Bing is the documented partner per OpenAI's SearchGPT announcement), scores them on relevance and source signals, then passes the top 4-8 passages to the model with instructions to synthesize an answer with inline citations. The citations are mechanical: the model is prompted to cite the passages it actually used. Citation decisions therefore happen in retrieval (which passages get scored highest), not in generation.

Does Perplexity rank pages differently than ChatGPT?

Yes, on the margins. Perplexity is the most retrieval-heavy of the four major engines, with 3-7 citations on virtually every answer. Per Perplexity's public hub documentation, the system uses its own crawler (PerplexityBot) plus partner search APIs, and emphasizes recency more heavily than ChatGPT. Perplexity also surfaces a "Sources" tab that exposes the full retrieved set, not just the cited subset, which suggests a wider retrieval window. Otherwise the factor list is mostly the same: schema, direct answers, entity signals, and source authority all matter on Perplexity, with freshness weighted slightly higher.

What is the difference between documented, inferred, and speculative ranking factors?

Documented factors have published evidence from the engine vendor or peer-reviewed research. Examples: Princeton's GEO paper measuring which content features increase Perplexity and BingChat citation rates, OpenAI's SearchGPT announcement documenting Bing partnership, schema.org's specifications for parseable structured data. Inferred factors have strong observational evidence across multiple independent studies (Ahrefs, Semrush, BrightEdge, Backlinko) but no vendor confirmation. Speculative factors are pattern matches from the SEO community without rigorous evidence — useful to test, dangerous to optimize for at scale.

Do AI engines use Google's PageRank?

Not directly, but they almost certainly use a similar citation-graph signal derived from web crawl data. Per Google's published Knowledge Graph research and the Common Crawl corpus that underpins much LLM training data, link-graph signals correlate strongly with the "authority" weighting that LLM retrieval systems apply. The mechanism is inferred rather than documented: OpenAI, Anthropic, and Perplexity do not publish their ranking algorithms. What is documented is that pages from high-PageRank domains (Wikipedia, government sites, established publishers) get cited disproportionately, which is consistent with a PageRank-like signal but does not prove the mechanism.

How fresh does my content need to be to get cited?

It depends on the engine and the query. For real-time RAG citations on ChatGPT search and Perplexity, fresh content can be cited within 24-72 hours of indexing on a well-crawled domain. For training-corpus-only answers (Claude without web search, Gemini in no-browse mode), the content needs to be indexed before the model's training cutoff, and a new cutoff typically arrives only every 6-12 months. For Google AI Overviews, the freshness window matches the underlying Google index, which is hours-to-days on well-indexed domains. The fastest citation surface is Perplexity; the slowest is base-model knowledge in a non-grounded chat session.

Does Wikipedia presence affect AI citations?

Yes, this is one of the better-documented factors. Wikipedia and Wikidata are heavily over-represented in the training data for every major LLM (per Common Crawl analysis and Anthropic's published model documentation), and they seed the entity graph that LLMs use for disambiguation. A brand or topic with a Wikipedia page gets a baseline "this entity exists" signal that propagates to citation behavior. The catch: you cannot create a Wikipedia page for yourself, and the bar for notability is high. Wikidata is the more accessible adjacent surface, and it often seeds Knowledge Graph entries that in turn influence AI engine entity recognition.

What is the single biggest mistake people make optimizing for AI search?

Treating it as classic SEO with extra steps. The mechanics are genuinely different: retrieval freshness matters more, structured-data parseability matters more, brand entity disambiguation matters more, and concise direct-answer paragraphs matter more. Pure keyword-stuffing and link-building strategies that work for blue-link SEO often underperform on AI citation surfaces. The second biggest mistake is optimizing without measuring. GA4 buckets AI engine referrals as Direct/(none), so most teams cannot tell which factors actually moved citations and which were noise.

Can I manipulate AI search rankings?

The Princeton GEO paper (Aggarwal et al., 2024) demonstrated that simple textual interventions like adding citations, statistics, and quotations can lift Perplexity and BingChat citation rates by up to 40% on tested queries. That is real manipulation evidence, and it suggests the systems are exploitable. However, the engines are actively training against obvious manipulation patterns, and the durable wins come from genuine quality signals (schema parseability, entity disambiguation, source authority) rather than tricks. Treat manipulation as a short-term arbitrage, not a long-term strategy.

How do I measure whether my AI search optimization is working?

Three layers. First, citation tracking: run your target queries weekly across ChatGPT, Perplexity, Claude, and Gemini, and log whether you are cited. Tools like Profound, Otterly, and Peec.ai automate this. Second, referral traffic: server-side detection of AI engine referrers since GA4 misses roughly 100% of them by default. Third, revenue attribution: joining the AI-referred session to a Stripe customer at payment so you can see which engine actually drove revenue, not just clicks. Most teams skip layer three because GA4 cannot do it, which is the gap Attrifast was built to close.

How is AI Overviews different from ChatGPT search for ranking purposes?

AI Overviews is rank-prerequisite: pages cited in AIO blocks almost exclusively come from the top-10 classic Google ranking for the query, per Ahrefs's 2025 AIO study and Semrush's parallel research. ChatGPT search is more flexible because it has its own retrieval layer (Bing-backed plus OpenAI-specific scoring) and is willing to surface pages that do not rank in Google's top-10. The practical implication: AIO optimization is mostly classic SEO plus schema; ChatGPT search optimization rewards a broader set of factors including freshness and entity signals.

Does llms.txt actually affect AI rankings in 2026?

Inferred yes, with modest weight. llms.txt is a curated index of your most LLM-relevant pages, hosted at your site root per the llmstxt.org specification. Adoption sits near 7% of public SaaS sites in Q1 2026. The major AI crawlers (per OpenAI's documented bot behavior, Perplexity's documentation) do read it when present. The lift is meaningful for sites where your most useful pages are not your most-linked pages. The cost is roughly 30 minutes to write and zero ongoing. Treat it as a low-cost speculative bet rather than a guaranteed lift.

Should I optimize for one specific AI engine?

No. The four documented factors (source authority, schema parseability, question-answer concision, cross-source consensus) carry across all major engines and account for roughly 70% of the variance. Engine-specific tuning is the remaining 30% and is diminishing-returns work compared to nailing the foundationals. The exception: if you are a publisher whose audience uses one engine disproportionately (e.g., a developer-focused site where ChatGPT and Perplexity dominate over Gemini), it can be worth tuning specifically for those engines once the foundationals are solid.

How long does it take to see results from AI search optimization?

For RAG-based engines (ChatGPT search, Perplexity, Google AI Overviews, Claude with web search), citation rate changes can show within 24-72 hours of indexing on well-crawled domains, and within 2-3 weeks on lower-authority domains. For training-corpus-only answers (Claude without web search, Gemini in no-browse mode), changes only appear after the next training cutoff, which is typically 6-12 months away. For Knowledge Graph and entity-disambiguation effects, the lag is typically 4-12 weeks. The fastest results come from interventions on already-indexed pages with existing rank; the slowest come from interventions that depend on the next model training cycle.

Where does the Princeton GEO paper fit in the evidence hierarchy?

It is the strongest single source in the GEO literature as of 2026. Princeton's GEO paper (Aggarwal et al., 2024, arXiv) is a controlled experiment measuring the citation-lift effect of specific textual interventions on Perplexity and BingChat. Most other GEO research (Ahrefs, Semrush, BrightEdge, Backlinko) is observational at scale and stronger on correlation than causation. The Princeton paper does not cover every ranking factor (it focuses on textual interventions), so it does not exhaust the evidence space. But where it does test a factor, treat its results as the primary source.

Sources
