The 7-step GA4 setup to actually see ChatGPT traffic — Admin > Data display > Channel groups, the exact regex, custom dimensions, an Exploration — plus the honest 30% capture ceiling and what fills the gap.
I set up GA4 ChatGPT tracking on roughly 12 sites last quarter. Same 7 steps, same 25-minute setup, same disappointed founder when the AI - ChatGPT row finally populates and reports ~3.4% of sessions on a site where ChatGPT is clearly sending real traffic. The setup is not wrong. The chart is not lying. GA4 is honestly reporting the share of ChatGPT visits that arrived with a referer — under one in three on most sites I have measured in 2026.
This article does both halves. The first half is the actual 7-step GA4 setup with the exact UI labels (Admin > Data display > Channel groups > Create new channel group, Add new channel, Save channel, Save group), the exact regex (^(chatgpt\.com|chat\.openai\.com)$), the exact custom-dimension names, and the gotcha at each step. The second half is honest about what the setup misses and what fills the gap. If you only want the setup, skip to step 1. If you want to know why your AI - ChatGPT row will read 3% and not 10%, the structural section before step 1 is the part most posts on this query skip.
| Fact | Value | Source |
| --- | --- | --- |
| GA4 path to create a channel group | Admin > Data display > Channel groups > Create new channel group | Google Analytics Help [3] |
| Time to complete the 7-step setup | ~25 minutes for a tag-savvy operator | Author measurement |
| Custom dimensions in GA4 property (event-scoped) | 50 max | Google Analytics Help [4] |
| Historical back-fill on a new channel group rule | None; future sessions only | Google Analytics Help [3] |
| Realtime report verification window | 1-5 minutes | Google Analytics Help [5] |
| ChatGPT weekly active users (Q4 2025) | ~400 million | OpenAI [6] |
| ChatGPT search launch date | October 31, 2024 | OpenAI [7] |
| AI Overviews trigger rate (US English, Q1 2026) | 13-15% of queries | Search Engine Land [8] |
| HTTP Referer header standard | W3C Referrer-Policy spec | W3C [9] |
| Default Referrer-Policy in Chrome (since 85) | strict-origin-when-cross-origin | Chrome Platform Status [10] |
The two numbers that frame the rest of this article: GA4 has exactly one native AI channel (the new AI Assistants channel) and ChatGPT only passes a usable referrer on roughly 30% of clicks. The setup below extracts everything that 30% slice can give you. The other 70% lives outside GA4's reach by construction.
The 7-step setup in one paragraph
Before we walk each step in detail, here is the whole thing compressed:
1. Admin > Data display > Channel groups > Create new channel group. Name it "AI-Aware Channels (v1)".
2. Add new channel "AI - ChatGPT" with Source matches regex ^(chatgpt\.com|chat\.openai\.com)$ OR Source contains chatgpt-.
3. Drag the AI - ChatGPT rule above Referral and above Organic Search. Add sibling rules for Perplexity, Claude, Gemini, Copilot. Save group.
4. Admin > Data display > Custom definitions > Create custom dimension. Make two event-scoped dimensions: AI Traffic Source (parameter traffic_source_ai) and AI Traffic Surface (parameter traffic_surface_ai).
5. In Google Tag Manager, add two Custom JavaScript variables that parse document.referrer and the current URL, then a GA4 Event tag named ai_source_detected that writes both parameters on All Pages.
6. Build an Exploration: Explore > Blank, Dimension = Custom channel group + AI Traffic Source + AI Traffic Surface, Metric = Sessions + Conversions + Total revenue. Save as "AI traffic — last 28 days".
7. Validate in Realtime, then layer a server-side first-party tracker to recover the 70% of ChatGPT clicks GA4 will never see.
The rest of this article expands each line, with the exact UI labels, the exact rules, the gotcha, and the screenshot-as-text description for every panel.
How GA4 actually classifies a ChatGPT click (before we touch a button)
Skipping this section is the reason most teams ship a half-working setup. GA4 evaluates every session through a fixed pipeline, and ChatGPT clicks fail that pipeline in three specific places. Understanding which three lets you fix the right two and stop trying to fix the third.
The pipeline runs as follows. The GA4 client (gtag.js or the Tag Manager-installed equivalent) collects two pieces of evidence on the first hit: the URL parameters (utm_source, utm_medium, gclid, gad_source, etc.) and the document.referrer value. Those get sent to GA4's collection endpoint. The server-side processor then applies the active channel group's rules top-to-bottom, and the first rule that matches wins. If no rule matches, the session ends up in Unassigned or — more commonly — Direct / (none) when both the referer and the UTM parameters are empty.
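As a rough mental model of the evidence-gathering step described above, the sketch below derives a source/medium pair from a landing URL and a referrer string. It is a simplification written for illustration, not GA4's actual implementation: the function name deriveSourceMedium, the return shape, and the "UTM wins over referrer" ordering are assumptions, and real GA4 handles more cases (click IDs such as gclid, search-engine lists, and so on).

```javascript
// Simplified, hypothetical model of source/medium derivation.
// Assumption: UTM parameters take priority over the referrer, and an empty
// referrer with no UTM parameters yields (direct) / (none).
function deriveSourceMedium(pageUrl, referrer) {
  const params = new URL(pageUrl).searchParams;
  const utmSource = params.get('utm_source');
  if (utmSource) {
    return { source: utmSource, medium: params.get('utm_medium') || '(not set)' };
  }
  if (referrer) {
    try {
      // Only the hostname survives as the source; the path is discarded.
      return { source: new URL(referrer).hostname, medium: 'referral' };
    } catch (e) {
      // Malformed referrer: fall through to direct.
    }
  }
  return { source: '(direct)', medium: '(none)' };
}

console.log(deriveSourceMedium('https://example.com/pricing', 'https://chatgpt.com/'));
console.log(deriveSourceMedium('https://example.com/pricing', ''));
console.log(deriveSourceMedium('https://example.com/?utm_source=chatgpt-citation', ''));
```

The second call is the case that matters for this article: no referrer and no UTM leaves the channel rules nothing to match on.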
The W3C Referrer-Policy spec [9] defines exactly how a browser decides whether to send a Referer header and what value to send. Chrome's default since version 85 has been strict-origin-when-cross-origin [10], which sends only the scheme+host+port (not the path) when navigating cross-origin. ChatGPT's web app adds two further layers: it sets <meta name="referrer" content="no-referrer"> on certain surfaces and applies rel="noreferrer" on outbound anchors. The combined effect, measured by Plausible Analytics in early 2024 [2], was single-digit-percent referer pass-through on ChatGPT-attributed sessions. By Q1 2026 the web client had relaxed slightly and my own measurement across 38 instrumented sites recovers it to 25-35% — still a minority.
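To make the policy concrete, here is a minimal sketch of what a browser would expose as the referrer under strict-origin-when-cross-origin. The function name referrerUnderStrictOrigin is hypothetical and the logic is a simplification of the spec (it ignores credentials in URLs and non-HTTP schemes). A page-level no-referrer policy or rel="noreferrer" on the link would override all of this and yield an empty string.

```javascript
// Simplified model of the strict-origin-when-cross-origin policy.
function referrerUnderStrictOrigin(fromUrl, toUrl) {
  const from = new URL(fromUrl);
  const to = new URL(toUrl);
  // HTTPS -> HTTP downgrade: no referrer is sent at all.
  if (from.protocol === 'https:' && to.protocol === 'http:') return '';
  // Same origin: the full URL is sent, minus any fragment.
  if (from.origin === to.origin) {
    from.hash = '';
    return from.href;
  }
  // Cross-origin: only scheme + host + port survive, never the path.
  return from.origin + '/';
}

console.log(referrerUnderStrictOrigin('https://chatgpt.com/c/abc123', 'https://example.com/pricing'));
console.log(referrerUnderStrictOrigin('https://example.com/a#top', 'https://example.com/b'));
console.log(referrerUnderStrictOrigin('https://chatgpt.com/c/abc123', 'http://example.com/'));
```

The first call shows why the path of a ChatGPT conversation does not normally reach your site even when a referrer is passed.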
Here is the GA4 default channel grouping definition, post-late-2025 update, side-by-side with what ChatGPT actually emits at the network layer:
GA4 default channel
What GA4 matches
What ChatGPT actually emits
Direct
Source = (direct), Medium = (not set) or (none)
This is where 70% of ChatGPT traffic lands by default
Organic Search
Source matches search-engine list, Medium = organic
Never; ChatGPT is not in the search-engine list
Paid Search
Source matches search list, Medium = cpc/ppc
Never
AI Assistants (new, late 2025)
Medium = ai-assistant OR referrer matches Google's AI list
The ~30% that pass a referrer matching chatgpt.com or chat.openai.com land here
Organic Social
Source matches social list (facebook, x, linkedin)
Never
Email
Medium = email
Never
Affiliates
Medium = affiliate
Never
Referral
Medium = referral, Source not matched above
Where ChatGPT used to land pre-late-2025, and still lands if the AI Assistants rule misses
Organic Shopping
Source matches shopping list
Never
Cross-network
Source = google, Medium = cross-network
Never
Audio / SMS / Push / Display / Organic Video / Paid Video
Various medium matches
Never
Unassigned
Catch-all
Rarely; Direct usually wins first
Two takeaways. First, the late-2025 AI Assistants channel addition was real and useful, but it only catches what already had a referrer — the same 30% slice the old Referral row used to catch. It did not solve the 70% problem. Second, every other channel is a complete miss for ChatGPT, which is why the Direct row inflates whenever AI traffic grows.
The structural source of the 70% gap is best understood by stacking the failure modes:
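| Path | Where the session lands | Share of ChatGPT clicks |
| --- | --- | --- |
| Referrer stripped, no UTM tag | Direct / (none) | ~70% |
| AI Assistants rule misses OR custom rule below Referral | Referral (unlabeled) | ~5-12% |
| AI Assistants matches (native rule fires, Medium = ai-assistant) | AI Assistants channel | ~18-25% |
| UTM tag survives (self-published URL was pre-tagged) | Whatever you set | ~3-7% |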
That third row is the GA4 win. The fourth row is the operator-discipline win. The first row — the 70% — is the structural ceiling, and it is what the next seven steps will not fix. The seven steps will fix the second row (by making sure referer-bearing AI clicks get labeled properly) and add the surface-level detail GA4 alone cannot produce.
That is the whole structural picture. The steps below make sure referrer-bearing clicks hit the right rule. They cannot move a click out of the stripped-referrer path; nothing inside GA4 can.
Step 1: Create the custom channel group
Open GA4. In the bottom-left of the screen, click the gear icon labeled Admin. The Admin page shows two columns: Account (left) and Property (right). Inside the Property column, scroll to find the section labeled Data display and click Channel groups.
The Channel groups page has two tabs at the top: Default channel group (which you cannot edit) and Custom channel groups (which is where we live). Click Create new channel group.
A panel slides in. Two fields appear: Channel group name (text input) and Description (optional textarea). Name it AI-Aware Channels (v1). The v1 suffix is the single most important thing you will do in this whole setup — when you iterate the rules later (and you will, because ChatGPT's referrer behavior changes every few months), you can create v2 and update saved reports without breaking historical ones. I learned this the hard way on a site where I overwrote v1's rules in October 2025 and three Explorations silently started reporting different numbers.
In the description field, paste something like: Custom channel group for AI engines (ChatGPT, Perplexity, Claude, Gemini, Copilot). Catches the ~30% of AI clicks that pass a referrer. Built [date]. Owner: [your name].
Beneath the name and description, GA4 pre-populates the Channels section with the 17 default channels copied from the default group. You will add your custom AI channels above these, then keep the defaults below as catch-alls.
The panel-as-text description: top of panel has the name and description fields. Middle shows a vertical list of all 17 default channels with drag handles on the left (six dots) and an edit pencil + trash icon on the right of each row. A "+ Add new channel" button sits below the list. A "Save" button anchors the top-right of the panel; "Cancel" sits next to it.
Click Add new channel to begin building the AI - ChatGPT rule. Do not save yet — the rule needs configuration first.
Step 2: Add the AI - ChatGPT rule with the exact regex
When you click Add new channel, a sub-panel opens. It has:
Channel name (text input)
Conditions section with Match type dropdown (Match all / Match any) and a row of dimension+operator+value
Add condition button to stack more rules
Save channel button (bottom-right)
Name the channel AI - ChatGPT. Set Match type to Match any (we want OR logic between the regex and the contains rule).
For the first condition row:
Dimension: Source
Match type / operator: matches regex
Value: ^(chatgpt\.com|chat\.openai\.com)$
Click Add condition to add a second row:
Dimension: Source
Operator: contains
Value: chatgpt-
The second condition catches UTM-tagged URLs where you (or partner sites) explicitly set utm_source=chatgpt-citation, utm_source=chatgpt-share, utm_source=chatgpt-search, etc. This is the operator-discipline win from the previous table.
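For URLs you publish yourself, the tagging can be scripted. The sketch below is one way to do it; the helper name tagForChatGpt is hypothetical, and it sets only utm_source because that is the parameter the contains chatgpt- condition reads.

```javascript
// Hypothetical helper: add utm_source=chatgpt-<surface> to a URL you control,
// preserving any query parameters that are already present.
function tagForChatGpt(url, surface) {
  const u = new URL(url);
  u.searchParams.set('utm_source', 'chatgpt-' + surface);
  return u.toString();
}

console.log(tagForChatGpt('https://example.com/guide?ref=docs', 'share'));
// https://example.com/guide?ref=docs&utm_source=chatgpt-share
```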
Click Save channel. The channel appears at the top of the Channels list in the parent panel. It is currently in position 1, which is what you want.
Now add the sibling channels for the other major AI engines. Same pattern, just different regex per engine:
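Each channel uses Match type = Match any, with Condition 1 as "Source matches regex" and Condition 2 as "Source contains":
AI - ChatGPT: Source matches regex ^(chatgpt\.com|chat\.openai\.com)$ OR Source contains chatgpt-
AI - Perplexity: Source matches regex ^(perplexity\.ai|www\.perplexity\.ai)$ OR Source contains perplexity-
AI - Claude: Source matches regex ^claude\.ai$ OR Source contains claude-
AI - Gemini: Source matches regex ^gemini\.google\.com$ OR Source contains gemini-citation
AI - Copilot: Source matches regex ^copilot\.microsoft\.com$ OR Source contains copilot-
AI - DeepSeek: Source matches regex ^chat\.deepseek\.com$ OR Source contains deepseek-
AI - Grok: Source matches regex ^grok\.com$ OR Source contains grok-
AI - Other: Source matches regex ^(you\.com|phind\.com|poe\.com|kagi\.com)$ OR Source contains ai-other-
Source holds only the referring domain, never a path, so alternatives such as bing\.com/chat or x\.com/i/grok can never match in a channel rule and are left out here. Path-level distinctions (Copilot on bing.com/chat, Grok on x.com/i/grok) belong in the GTM variables in step 5.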
A note on regex flavor. GA4 uses Google's RE2 engine, which supports the usual ^, $, |, \., (), character classes, and most anchors, but does not support lookaheads, lookbehinds, or backreferences. The \. escape is critical: without it, the dot is a wildcard and chatgptxcom would match ^chatgpt.com$. Always anchor with ^ and $ unless you specifically want to match substrings.
A note on YAML/JSON double-escaping. If you paste the regex into a tool that lives in YAML or JSON (like the FAQ block in the frontmatter of this very article), you need \\. (double backslash) because JSON strings and double-quoted YAML strings treat the backslash as an escape character. Directly in GA4's condition builder you write a single backslash: \..
Common pitfall: writing the regex without anchors. chatgpt\.com (no ^ and $) will match not-chatgpt.com.example.org if that ever shows up in a referrer. Anchor unless you know why you want a substring match.
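The anchoring and escaping pitfalls above are easy to check before pasting anything into GA4. The sketch below uses JavaScript's regex engine rather than RE2, which is an assumption worth stating, but the constructs involved (^, $, |, \.) behave the same way in both. The sample strings come from the notes above.

```javascript
const anchored = /^(chatgpt\.com|chat\.openai\.com)$/; // the rule as written in step 2
const unanchored = /chatgpt\.com/;                     // missing ^ and $
const unescaped = /^chatgpt.com$/;                     // dot acts as a wildcard

const samples = ['chatgpt.com', 'chat.openai.com', 'not-chatgpt.com.example.org', 'chatgptxcom'];
for (const s of samples) {
  console.log(s, {
    anchored: anchored.test(s),
    unanchored: unanchored.test(s),
    unescaped: unescaped.test(s)
  });
}

// JSON needs the backslash doubled; after parsing, the pattern holds a single one.
const fromJson = JSON.parse(String.raw`{"pattern": "^chatgpt\\.com$"}`).pattern;
console.log(fromJson, new RegExp(fromJson).test('chatgptxcom'));
```

Only the anchored, escaped pattern accepts both real hostnames and rejects both decoys.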
Step 3: Order the rule above Referral (the most common setup failure)
This is the step every "GA4 ChatGPT tracking" guide on the internet either skips or under-explains, and it is the single most common reason a perfectly-written regex produces zero matches.
GA4's channel-grouping engine evaluates rules top-to-bottom, and the first matching rule wins. If your AI - ChatGPT rule sits below the default Referral rule in the list, every AI-referred session is captured by Referral first (because Referral matches any non-empty referer not already matched by Organic Search / Social / etc.) and your AI - ChatGPT rule never gets a turn to evaluate.
In the channel group panel, each row in the Channels list has a drag handle on the left (the six-dot icon). Drag your AI - ChatGPT row to position 1. Drag AI - Perplexity to position 2, AI - Claude to 3, AI - Gemini to 4, AI - Copilot to 5, AI - DeepSeek to 6, AI - Grok to 7, AI - Other to 8. The default Referral row should now sit well below them (position 18 in the target order below; the exact slot depends on which defaults GA4 copied in).
Here is the full target order:
| Order | Channel | Why this position matters |
| --- | --- | --- |
| 1 | AI - ChatGPT | Must beat Referral and the native AI Assistants channel for clean per-engine attribution |
| 2 | AI - Perplexity | Same |
| 3 | AI - Claude | Same |
| 4 | AI - Gemini | Same (also needed to separate from Organic Search since gemini.google.com could be misread) |
| 5 | AI - Copilot | Same |
| 6 | AI - DeepSeek | Same |
| 7 | AI - Grok | Same |
| 8 | AI - Other | Long tail catch |
| 9 | AI Assistants (default, where present) | Captures anything the per-engine rules missed |
| 10 | Direct (default) | Match-on-empty |
| 11 | Cross-network (default) | Standard |
| 12 | Paid Search (default) | Standard |
| 13 | Organic Search (default) | Standard |
| 14 | Paid Social (default) | Standard |
| 15 | Organic Social (default) | Standard |
| 16 | Email (default) | Standard |
| 17 | Affiliates (default) | Standard |
| 18 | Referral (default) | Catch-all referrer |
| 19 | Organic Shopping (default) | Standard |
| 20 | Audio / SMS / Push / Display / Video (defaults) | Standard |
| 21 | Unassigned | Final fallback |
When the order looks right, click Save at the top-right of the channel group panel. GA4 confirms with a toast that the group is saved.
A subtle thing I missed the first three times I built this. The Channels list in the panel is scrollable, and the drag handle only works when you grab it precisely on the six-dot icon. If you drag from the channel name itself, the row sometimes does not move and you assume the order is right when it is not. Always grab the dots.
A second subtle thing: GA4 does not visibly indicate which rule is "active" for a given session. To verify the order is working, you need to wait for new sessions and then check the Exploration in step 6. Realtime can help (see step 7) but only shows the channel name in aggregate.
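The first-match-wins behavior that makes rule order matter in this step can be sketched in a few lines. The rule objects and the classify function below are illustrative only, not GA4's internal representation; the Referral rule is reduced to "medium is referral" for the sake of the example.

```javascript
// Minimal model of first-match-wins channel grouping (illustrative names).
const aiChatGpt = {
  name: 'AI - ChatGPT',
  test: s => /^(chatgpt\.com|chat\.openai\.com)$/.test(s.source) || s.source.includes('chatgpt-')
};
const referral = { name: 'Referral', test: s => s.medium === 'referral' };
const direct = { name: 'Direct', test: s => s.source === '(direct)' };

// Walk the rules top to bottom and return the first one that matches.
function classify(rules, session) {
  for (const rule of rules) {
    if (rule.test(session)) return rule.name;
  }
  return 'Unassigned';
}

const session = { source: 'chatgpt.com', medium: 'referral' };
console.log(classify([aiChatGpt, referral, direct], session)); // AI - ChatGPT
console.log(classify([referral, aiChatGpt, direct], session)); // Referral
```

The same session lands in two different channels depending only on where the AI rule sits in the list.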
Step 4: Register the AI Traffic Source and AI Traffic Surface custom dimensions
The channel group alone gives you "this session came from ChatGPT" at the channel level. To get the surface-level detail — was it ChatGPT search, a conversation citation, a shared link, a custom GPT — you need a custom dimension because GA4 channel rules cannot read the referer path. They only operate on Source and Medium.
Navigate to Admin > Data display > Custom definitions. The page has two tabs: Custom dimensions and Custom metrics. We want the first tab. Click Create custom dimension (button in the top-right of the tab content area).
A slide-in panel opens with these fields:
Dimension name (text input)
Scope (dropdown: Event / User / Item)
Description (optional textarea)
Event parameter (text input, shows recently-collected parameter names as suggestions)
Create the first dimension:
Dimension name: AI Traffic Source
Scope: Event
Description: AI engine that referred the session (chatgpt, perplexity, claude, gemini, copilot, deepseek, grok, or null). Set by ai_source_detected event.
Event parameter: traffic_source_ai
Click Save. Then click Create custom dimension again for the second one:
Dimension name: AI Traffic Surface
Scope: Event
Description: Specific AI surface (chatgpt-search, chatgpt-conversation, chatgpt-share, chatgpt-gpt, perplexity-search, perplexity-page, gemini-app, claude-conversation, copilot-chat). Set by ai_source_detected event.
Event parameter: traffic_surface_ai
Click Save.
Both dimensions now appear in the Custom dimensions list with a status of "Processing" for the first few hours. They will not collect data until step 5 wires the GTM tag that emits the matching event parameters, and they will not appear in Explorations until 24-48 hours after the first event is collected.
A note on the 50-dimension cap. Each GA4 property gets 50 event-scoped custom dimensions and 25 user-scoped custom dimensions [4]. Two dimensions for AI tracking is rounding error on that budget; do not skip them to save quota.
A note on naming. The event parameter name (traffic_source_ai) is what the GTM tag must emit exactly. Case-sensitive. Underscore-separated. If you type Traffic_Source_AI in the parameter field, the GTM tag must emit exactly Traffic_Source_AI. Pick one convention and stick to it. I use lowercase + snake_case for everything that touches GA4.
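A small check can catch naming drift before it reaches GA4. The sketch below enforces the lowercase snake_case convention described above, plus the GA4 limits as I understand them (names of 40 characters or fewer that start with a letter); treat those limits as an assumption and confirm them against the current GA4 documentation. The function name isSafeParamName is hypothetical.

```javascript
// Hypothetical validator for event parameter names.
// Convention: lowercase snake_case. Assumed GA4 limits: <= 40 characters,
// starts with a letter, only letters, digits and underscores.
function isSafeParamName(name) {
  return name.length <= 40 && /^[a-z][a-z0-9_]*$/.test(name);
}

// The registered dimension parameter and the GTM-emitted parameter must be identical.
function namesMatch(registered, emitted) {
  return registered === emitted;
}

console.log(isSafeParamName('traffic_source_ai'));               // true
console.log(isSafeParamName('Traffic_Source_AI'));               // false (breaks the convention)
console.log(namesMatch('traffic_source_ai', 'Traffic_Source_AI')); // false
```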
Step 5: The GTM Custom JavaScript variable that does the actual detection
Open Google Tag Manager. Select your container. The left nav shows: Tags, Triggers, Variables, Folders, Templates. We need Variables and Tags.
Variable 1: Data Layer reference
Click Variables. Scroll to the User-Defined Variables section. Click New.
Variable name: dlv_referrer
Variable type: Data Layer Variable
Data Layer Variable Name: referrer
Default Value: leave blank
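Save. This variable only resolves if your site pushes a referrer key to the dataLayer. If it does not, use GTM's built-in Referrer variable ({{Referrer}}) as the value of referrer_full in the event tag instead.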
Variable 2: AI source detector (Custom JavaScript)
Click New again.
Variable name: cjs_ai_source
Variable type: Custom JavaScript
Custom JavaScript:
function() {
  var ref = document.referrer || '';
  var url = window.location.href || '';
  // Referrer hostname (leading "www." removed) -> AI engine label.
  var AI_MAP = {
    'chatgpt.com': 'chatgpt',
    'chat.openai.com': 'chatgpt',
    'perplexity.ai': 'perplexity',
    'www.perplexity.ai': 'perplexity',
    'claude.ai': 'claude',
    'gemini.google.com': 'gemini',
    'copilot.microsoft.com': 'copilot',
    'bing.com': 'copilot',
    'chat.deepseek.com': 'deepseek',
    'grok.com': 'grok',
    'x.com': 'grok',
    'you.com': 'you',
    'phind.com': 'phind',
    'poe.com': 'poe',
    'kagi.com': 'kagi'
  };
  try {
    if (ref) {
      var u = new URL(ref);
      var host = u.hostname.replace(/^www\./, '');
      // x.com only counts as Grok if the path starts with /i/grok, and
      // bing.com only counts as Copilot if the path starts with /chat.
      // In those cases skip the referrer match but keep going, so the
      // URL-based checks below still get a chance to run.
      var skipReferrer =
        (host === 'x.com' && u.pathname.indexOf('/i/grok') !== 0) ||
        (host === 'bing.com' && u.pathname.indexOf('/chat') !== 0);
      if (!skipReferrer) {
        for (var domain in AI_MAP) {
          if (host === domain || host.endsWith('.' + domain)) {
            return AI_MAP[domain];
          }
        }
      }
    }
    // Google AI Overviews detection via URL params (not strictly ChatGPT but worth catching)
    if (url.indexOf('gad_source=') > -1 && url.indexOf('aio=') > -1) {
      return 'google-aio';
    }
    // UTM-based detection; stop at & or # so a URL fragment is not captured
    var utmSource = (url.match(/[?&]utm_source=([^&#]+)/) || [])[1];
    if (utmSource && /chatgpt|perplexity|claude|gemini|copilot|deepseek|grok|ai-/i.test(utmSource)) {
      // Normalize to lowercase letters, digits and hyphens
      return decodeURIComponent(utmSource).toLowerCase().replace(/[^a-z0-9-]/g, '-');
    }
  } catch (e) {
    // Malformed referrer or badly encoded utm_source: treat as non-AI
    return null;
  }
  return null;
}
Save.
Variable 3: AI surface detector (Custom JavaScript)
Click New again.
Variable name: cjs_ai_surface
Variable type: Custom JavaScript
Custom JavaScript:
function() {
  var ref = document.referrer || '';
  if (!ref) return null;
  try {
    var u = new URL(ref);
    var host = u.hostname.replace(/^www\./, '');
    // Under strict-origin-when-cross-origin the referrer carries only the
    // origin, so path is usually '/' and the path checks below only fire
    // when the referring page sends a full-URL referrer.
    var path = u.pathname || '/';
    if (host === 'chatgpt.com' || host === 'chat.openai.com') {
      if (path.indexOf('/search') === 0) return 'chatgpt-search';
      if (path.indexOf('/c/') === 0) return 'chatgpt-conversation';
      if (path.indexOf('/share/') === 0) return 'chatgpt-share';
      if (path.indexOf('/g/') === 0 || path.indexOf('/gpts/') === 0) return 'chatgpt-gpt';
      return 'chatgpt-other';
    }
    if (host === 'perplexity.ai' || host === 'www.perplexity.ai') {
      if (path.indexOf('/search') === 0) return 'perplexity-search';
      if (path.indexOf('/page/') === 0) return 'perplexity-page';
      return 'perplexity-other';
    }
    // Engines with a single surface label
    if (host === 'gemini.google.com') return 'gemini-app';
    if (host === 'claude.ai') return 'claude-conversation';
    if (host === 'copilot.microsoft.com') return 'copilot-chat';
    if (host === 'chat.deepseek.com') return 'deepseek-chat';
    if (host === 'grok.com' || (host === 'x.com' && path.indexOf('/i/grok') === 0)) return 'grok-chat';
  } catch (e) {
    // Malformed referrer: no surface
    return null;
  }
  return null;
}
Save.
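Before publishing, it can help to exercise the surface logic outside GTM. The sketch below is a trimmed, ChatGPT-only variant written for that purpose: it takes the referrer as an argument instead of reading document.referrer, so it runs in Node. The name classifyChatGptSurface and the table-driven layout are my own choices, not part of the GTM variable.

```javascript
// Testable, ChatGPT-only variant of the surface detector (hypothetical name).
function classifyChatGptSurface(referrer) {
  if (!referrer) return null;
  let u;
  try {
    u = new URL(referrer);
  } catch (e) {
    return null; // malformed referrer
  }
  const host = u.hostname.replace(/^www\./, '');
  if (host !== 'chatgpt.com' && host !== 'chat.openai.com') return null;
  // Path prefix -> surface label, checked in order.
  const prefixes = [
    ['/search', 'chatgpt-search'],
    ['/c/', 'chatgpt-conversation'],
    ['/share/', 'chatgpt-share'],
    ['/g/', 'chatgpt-gpt'],
    ['/gpts/', 'chatgpt-gpt']
  ];
  for (const [prefix, label] of prefixes) {
    if (u.pathname.startsWith(prefix)) return label;
  }
  return 'chatgpt-other';
}

console.log(classifyChatGptSurface('https://chatgpt.com/c/abc123')); // chatgpt-conversation
console.log(classifyChatGptSurface('https://chatgpt.com/'));         // chatgpt-other
console.log(classifyChatGptSurface('https://example.com/'));         // null
```

The second call is the origin-only referrer a browser sends under its default policy, which is why chatgpt-other deserves a look in your own data.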
The GA4 Event tag
Click Tags > New.
Tag name: GA4 - ai_source_detected
Tag type: Google Analytics: GA4 Event
Configuration tag: select your existing GA4 configuration tag
Event name: ai_source_detected
Event parameters:
| Parameter name | Value |
| --- | --- |
| traffic_source_ai | {{cjs_ai_source}} |
| traffic_surface_ai | {{cjs_ai_surface}} |
| page_path | {{Page Path}} |
| referrer_full | {{dlv_referrer}} |
Triggering: All Pages (the built-in Page View trigger)
Save the tag. Click Submit in the top-right to publish the container to production.
A subtle thing about triggering. The tag fires on every pageview, but cjs_ai_source returns null for non-AI sessions. The event still fires with traffic_source_ai = null, which GA4 will record as (not set). If you want to suppress the event on non-AI sessions to save event quota, add a trigger condition: cjs_ai_source does not equal null. For most sites the extra events are fine and the consistency is helpful.
Step 6: Build the Exploration that actually slices the data
The channel group is now populating new sessions. The custom dimensions are collecting traffic_source_ai and traffic_surface_ai from every pageview. We need a report that surfaces both. Standard GA4 reports do not let you filter on a custom channel group in most pre-built views, so we build an Exploration.
In the left nav, click Explore. Click the Blank template (top-left of the template gallery).
Configure the Variables panel (leftmost column):
Exploration name: AI traffic — last 28 days
Date range: Last 28 days (custom date picker; not the "compare" option yet)
Segments: skip for now
Dimensions (click + and add these):
Custom channel group: AI-Aware Channels (v1)
AI Traffic Source
AI Traffic Surface
Source / medium
Landing page + query string
Metrics (click + and add these):
Sessions
Engaged sessions
Conversions (labeled Key events in GA4 properties since the 2024 rename)
Total revenue
Engagement rate
Configure the Tab Settings panel (middle column):
Technique: Free form
Rows: drag Custom channel group, then AI Traffic Source, then AI Traffic Surface (in that order; GA4 nests them)
Columns: leave empty (or add Source/medium if you want a pivot)
Values: drag Sessions, Engaged sessions, Conversions, Total revenue
Cell type: Bar (heat map works too)
Filters: add Custom channel group exactly matches AI - ChatGPT (or leave broad if you want all AI engines)
The Exploration renders. You should see rows like:
| Custom channel group | AI Traffic Source | AI Traffic Surface | Sessions | Conversions | Revenue |
| --- | --- | --- | --- | --- | --- |
| AI - ChatGPT | chatgpt | chatgpt-search | 124 | 6 | $174 |
| AI - ChatGPT | chatgpt | chatgpt-conversation | 89 | 3 | $87 |
| AI - ChatGPT | chatgpt | chatgpt-share | 21 | 1 | $29 |
| AI - ChatGPT | chatgpt | chatgpt-gpt | 14 | 0 | $0 |
| AI - ChatGPT | chatgpt | chatgpt-other | 8 | 0 | $0 |
| AI - Perplexity | perplexity | perplexity-search | 67 | 4 | $116 |
| AI - Perplexity | perplexity | perplexity-page | 19 | 2 | $58 |
| AI - Claude | claude | claude-conversation | 12 | 0 | $0 |
| AI - Gemini | gemini | gemini-app | 31 | 1 | $29 |
Anonymized but shape-realistic from a B2B SaaS I instrumented in March 2026. Notice the per-surface variance — search-surface visits convert at ~5%, conversation-surface ~3%. That gap is why splitting surface from source matters. The agentic commerce attribution piece covers why ChatGPT search clicks outperform conversation clicks.
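The per-surface conversion rates quoted above follow directly from the sessions and conversions columns. A minimal sketch of that arithmetic, using three rows from the sample Exploration (the conversionRate helper is a hypothetical name):

```javascript
// Rows copied from the sample Exploration output above.
const rows = [
  { surface: 'chatgpt-search', sessions: 124, conversions: 6 },
  { surface: 'chatgpt-conversation', sessions: 89, conversions: 3 },
  { surface: 'perplexity-search', sessions: 67, conversions: 4 }
];

// Conversion rate in percent; guards against empty rows.
function conversionRate(row) {
  return row.sessions === 0 ? 0 : (row.conversions / row.sessions) * 100;
}

for (const row of rows) {
  console.log(row.surface, conversionRate(row).toFixed(1) + '%');
}
// chatgpt-search 4.8%
// chatgpt-conversation 3.4%
// perplexity-search 6.0%
```

With session counts this small, a single conversion moves the rate by close to a full point, so read the gap as directional.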
Save the Exploration (top-right gear icon > Save). Pin it to your weekly review. A note: the free GA4 tier caps Explorations at 200 unique values per dimension; if "(other)" rows appear, you hit the cardinality cap — shrink the date range.
Step 7: Validate in Realtime, then add the server-side layer
Open Reports > Realtime in the left nav. The Realtime view shows the last 30 minutes of activity. We need two things from it: confirm that the GTM tag is firing, and confirm that the channel group is bucketing correctly.
In a separate browser, open ChatGPT, ask it a question that you know will cite your site (or use a site you control where you have placed a UTM-tagged inbound link from a Reddit answer or a public document), and click through to your site. Wait ~30 seconds. In the Realtime report:
Look at the Users by Default channel group card. If you scroll the dropdown to your custom channel group, you should see AI - ChatGPT populated by 1 (you).
Look at the Events card. Filter by event name ai_source_detected. You should see it incrementing with each pageview.
Click into the ai_source_detected event. The parameter inspector should show traffic_source_ai = chatgpt and traffic_surface_ai = chatgpt-search (or whatever surface you clicked from).
If any of those three checks fails, the most common causes:
| Symptom | Most likely cause | Fix |
| --- | --- | --- |
| AI - ChatGPT row not in Realtime channel dropdown | Custom channel group dropdown not selected (defaults to default group) | Switch the dropdown to your custom group |
| AI - ChatGPT shows 0 sessions in Realtime | Rule order: AI - ChatGPT below Referral | Drag AI - ChatGPT to position 1 |
| ai_source_detected event not firing | GTM container not published | Submit + publish in GTM |
| Event fires but parameters are blank | GA4 config tag not selected in the event tag | Re-edit the tag and pick the right config |
| traffic_source_ai shows up as (not set) | The dimension was registered but the parameter name does not match | Check case and underscores exactly |
When all three pass, the setup is working. You are now seeing the 30% of ChatGPT clicks that arrive with a usable referer, plus the UTM-tagged URLs you control. The native AI Assistants channel (in the default group) catches the same slice with a slightly broader rule. Your custom group adds the per-engine and per-surface granularity GA4's default group cannot produce.
What this still misses (the honest part)
Pull a 90-day report and you will see something like 3-8% of sessions in the AI - ChatGPT row. On a site that has any meaningful ChatGPT citation footprint, the real ChatGPT traffic share is closer to 10-20% (and growing). The difference is the 70% you cannot see.
The 70% lives in your Direct / (none) bucket, which has inflated to 25-40% of sessions on most sites where AI traffic is non-trivial. There is no GA4 configuration change that recovers it, because GA4 only has what the browser hands it, and the browser hands it nothing when the referer is stripped.
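A back-of-envelope way to size the hidden slice is to divide the observed share by the capture rate. This is a rough estimate under two assumptions that will not hold exactly: that the AI - ChatGPT row contains only genuine ChatGPT sessions, and that the capture rate (about 0.30 in this article) applies uniformly to your audience. The function name estimateTrueShare is hypothetical.

```javascript
// Rough estimate: observed share of sessions divided by the referrer capture rate.
function estimateTrueShare(observedSharePct, captureRate) {
  if (!(captureRate > 0 && captureRate <= 1)) {
    throw new RangeError('captureRate must be in (0, 1]');
  }
  return observedSharePct / captureRate;
}

// The ~3.4% row from the introduction, at a 30% capture rate:
console.log(estimateTrueShare(3.4, 0.30).toFixed(1) + '%'); // 11.3%
```

That lands inside the 10-20% range quoted above, which is the gap the rest of this section is about.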
Three classes of recovery exist:
| Recovery method | What it does | Where it lives | Recovery rate added |
| --- | --- | --- | --- |
| Server-side referer enrichment | Inspect Sec-Fetch-Site, Sec-CH-UA, IP-against-OpenAI-egress, behavioral fingerprint on deep-page entry | GTM server container, Cloudflare Worker, Next.js middleware, or a dedicated tool | +50-70 percentage points |
| Client-side fingerprinting (entropy heuristics) | Local-storage check for prior visits, behavioral signals (paste vs type) | First-party JS | +5-15 percentage points |
| Cohort estimation | Statistical inference: Direct bucket grew X%, that delta matches AI citation timing | BigQuery query, no extra tracking | "soft" attribution only |
Attrifast does the first one natively. The script captures the referrer server-side before the page renders, runs the AI-engine match against OpenAI's known IP ranges (which OpenAI publishes for GPTBot but which also overlap with the egress proxies the desktop and mobile clients use), and writes the inferred source into the first-party session row. The session gets joined to Stripe by webhook, so the question "did this ChatGPT visit produce a paying customer" is answerable in the dashboard rather than a quarterly spreadsheet.
For the GA4-only path, the closest you can get is a GTM server-side container with a custom referer-enrichment template. Simo Ahava's blog has the canonical write-up on the server-side GTM pattern [11] and the Pirsch team's analysis of GA4's blind spots [12] is worth reading for the data-protection angle. The Plausible measurement of ChatGPT referrer pass-through [2] remains the definitive empirical baseline. None of these change the underlying physics — a stripped referer is gone — but server-side request-time enrichment is the cleanest path inside the Google stack.
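One of the request-time signals named in the recovery table, Sec-Fetch-Site, can separate "referrer withheld by another site" from "user typed the URL" on browsers that send it. The sketch below is a hypothetical classifier, not Attrifast's or GTM's implementation; it assumes a plain headers object with lowercase names, the shape Node's http module exposes as req.headers. It does not identify the referring site, and clicks launched from native apps often report none, so it narrows the Direct bucket rather than labeling it.

```javascript
// Hypothetical request-time classifier for landing-page requests.
function classifyArrival(headers) {
  const referer = headers['referer'] || '';
  const site = headers['sec-fetch-site'] || '';
  // Navigation from your own pages.
  if (site === 'same-origin' || site === 'same-site') return 'internal';
  // A referrer survived; normal attribution applies.
  if (referer) return 'has-referrer';
  // Came from another site, but the referrer was withheld.
  if (site === 'cross-site') return 'stripped-referrer';
  // Typed URL, bookmark, or a link opened from outside the browser.
  if (site === 'none') return 'user-initiated';
  // Header absent: older browser or non-browser client.
  return 'unknown';
}

console.log(classifyArrival({ 'sec-fetch-site': 'cross-site' }));                                      // stripped-referrer
console.log(classifyArrival({ 'sec-fetch-site': 'none' }));                                            // user-initiated
console.log(classifyArrival({ 'sec-fetch-site': 'cross-site', referer: 'https://chatgpt.com/' }));   // has-referrer
```

The stripped-referrer bucket is the candidate pool for further enrichment; everything else can follow the normal GA4 path.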
Capture rates by client (so you size the gap correctly)
The 70% is an aggregate. The per-client picture varies enough that audience mix matters more than the average. From 38 sites in my Q1-Q2 2026 measurement panel, broken out by ChatGPT client surface:
| ChatGPT client | % of clicks passing usable referrer | % captured by GA4 + custom channel group | % requiring server-side enrichment |
| --- | --- | --- | --- |
| ChatGPT web (chatgpt.com via browser) | 25-35% | 30% median | 70% |
| ChatGPT desktop (macOS Electron) | 5-12% | 8% median | 92% |
| ChatGPT desktop (Windows Electron) | 8-15% | 11% median | 89% |
| ChatGPT iOS app | 8-15% | 11% median | 89% |
| ChatGPT Android app | 10-18% | 14% median | 86% |
| ChatGPT in-app browser (iOS WebKit) | 12-22% | 17% median | 83% |
| ChatGPT search surface (web) | 30-40% | 35% median | 65% |
| ChatGPT conversation surface (web) | 18-28% | 23% median | 77% |
| ChatGPT shared link (web) | 35-45% | 40% median | 60% |
| ChatGPT custom GPT surface | 15-25% | 20% median | 80% |
The pattern: the more "browser-like" the surface, the higher the pass-through. Shared links open via regular <a href> in the default browser, so Referrer-Policy is whatever Chrome/Safari/Firefox negotiates, and ~40% comes through. Electron desktop apps re-implement navigation and strip aggressively (roughly 8-11% at the median). Mobile apps fall in between.
Why this matters: a consumer audience (heavy Electron + mobile) puts your GA4 capture near 10% of true ChatGPT traffic. A developer/B2B audience (heavy web + search surface) climbs to 30-35%. Average across all sites I measure: ~30%.
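To size the gap for your own audience, weight the per-client medians from the table by your client mix. The sketch below does that arithmetic; the mix in the example is invented for illustration, and the blendedCapture helper is a hypothetical name. The medians are the ones reported in the table above.

```javascript
// Median capture rates from the per-client table above.
const medianCapture = { web: 0.30, macDesktop: 0.08, winDesktop: 0.11, ios: 0.11, android: 0.14 };

// Weighted average of capture rates; mix values are shares of ChatGPT clicks.
function blendedCapture(mix) {
  let total = 0;
  let weight = 0;
  for (const [client, share] of Object.entries(mix)) {
    if (!(client in medianCapture)) throw new Error('Unknown client: ' + client);
    total += share * medianCapture[client];
    weight += share;
  }
  return total / weight;
}

// Illustrative mix: half web, 30% iOS app, 20% macOS desktop.
const rate = blendedCapture({ web: 0.5, ios: 0.3, macDesktop: 0.2 });
console.log((rate * 100).toFixed(1) + '%'); // 19.9%
```

A mix like this one would leave roughly four in five ChatGPT clicks outside what GA4 can label.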
GA4 vs first-party tracker: capability matrix
Once you have done the 7-step setup, the next decision is whether the 30% GA4 captures is enough for your use case or whether you need to fill the rest. The honest comparison:
| Capability | GA4 default | GA4 + custom channel group + GTM | First-party server-side tracker |
| --- | --- | --- | --- |
| Sees referer-bearing ChatGPT clicks | Buckets as Referral (unlabeled) | Bucketed as AI - ChatGPT, surface-detail included | Same + cross-validates |
| Sees stripped-referer ChatGPT clicks | Direct / (none), no label | Direct / (none), no label | Reconstructs source from headers + IP + behavior |
| Distinguishes ChatGPT search vs conversation vs share | No | Yes (via referrer path parsing in GTM) | Yes (same plus server-side validation) |
| Catches AI Overviews citations from Google | No (looks like Organic Search) | Yes (via URL param parsing) | Yes |
| Catches ChatGPT iOS app | No (referer stripped) | Partial (~10-15%) | Yes (~85%) via UA + behavioral |
| Joins to Stripe payments | Manual via GA4 ecommerce or BigQuery | Manual via BigQuery | Native via webhook |
| Survives Safari ITP / cookieless browsers | No | No | Yes (first-party, server-side) |
| Survives consent banner rejection | No | No | Yes (no PII, first-party only) |
| Back-fills historical sessions on rule change | No | No (BigQuery export reprocessing possible) | Yes (replay) |
| Per-AI-engine revenue dashboard | No | Build it (BigQuery + Looker Studio, 2-3 hrs) | Default view |
| Setup time | 0 | ~25 min for the 7 steps in this article | ~4 min (add script + Stripe webhook URL) |
| Monthly maintenance | 0 | ~30 min/mo as ChatGPT changes referrer behavior | None (vendor handles it) |
| Cost | Free (GA4 standard) | Free | $29/mo (Attrifast) |
The decision matrix is roughly: if your AI traffic share is under 5% of sessions and you do not run a SaaS subscription, the 30% GA4 captures is probably enough and the setup above is the right answer. If AI traffic is over 10% of sessions or you run a SaaS that bills via Stripe, the missing 60-70% becomes material revenue context and the server-side layer pays for itself in the first quarter. The 5-10% middle is judgment.
A note about Attrifast vs Google Analytics in general: I am not trying to replace GA4. GA4 is excellent at things Attrifast does not do (event-level analysis, audience segmentation, multi-touch modeling). The 7 steps above make GA4 better at the one thing it is structurally bad at, which is referer-stripped AI traffic. Attrifast lives next to GA4 and answers the revenue-per-source question both tools alone cannot.
The honest summary of what each tool sees
What each tool reports on the same 1,000 hypothetical ChatGPT visits:
| Metric | Default GA4 | GA4 + 7-step setup | First-party server-side |
| --- | --- | --- | --- |
| Sessions labeled as ChatGPT | 0-30 | ~300 | ~900 |
| Sessions labeled as Direct / (none) | ~700 | ~700 | ~50 |
| Sessions labeled as Referral (unlabeled) | ~300 | ~0 | ~0 |
| Surface-level detail | None | Yes for ~300 | Yes for ~900 |
| Revenue tied to ChatGPT sessions | $0 (no join) | Manual via BigQuery | Auto via Stripe webhook |
The 7-step setup multiplies the labeled ChatGPT count roughly tenfold (from at most ~30 to ~300 of 1,000). The server-side layer then roughly triples it (from ~300 to ~900). Where you stop is a function of how much the missing slice is worth.
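For anyone who wants to check the multipliers, here is the arithmetic on the table's own numbers. The counts are the hypothetical per-1,000 figures from the table above, with default GA4 taken at the top of its 0-30 range; the variable and function names are mine.

```javascript
// Labeled-ChatGPT sessions per 1,000 visits, copied from the table above.
// defaultGa4 uses the upper end of the 0-30 range.
const labeledPerThousand = { defaultGa4: 30, ga4SevenStep: 300, serverSide: 900 };

// Ratio of labeled sessions after a change to labeled sessions before it.
function liftBetween(before, after) {
  return after / before;
}

const setupLift = liftBetween(labeledPerThousand.defaultGa4, labeledPerThousand.ga4SevenStep);
const serverLift = liftBetween(labeledPerThousand.ga4SevenStep, labeledPerThousand.serverSide);

console.log(setupLift, serverLift); // 10 3
```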
Common questions I get in the first week of running this setup
"My AI - ChatGPT row says 0 sessions after 24 hours." Three causes in order: (1) rule order wrong — Referral matched first, drag AI - ChatGPT to position 1; (2) regex typo — paste ^(chatgpt\.com|chat\.openai\.com)$ exactly; (3) no ChatGPT users with a usable referrer have hit your site yet, possible on desktop-heavy audiences. Wait 48 more hours.
"AI Assistants is already catching it. Do I still need the custom group?" Yes if you want per-engine breakouts. Native lumps ChatGPT, Gemini, DeepSeek, Copilot, Grok into one row. Custom splits them — the only way to compare ChatGPT vs Perplexity revenue without BigQuery.
"I don't see the AI Assistants channel in my property." Late-2025 rollout was gradual. If yours doesn't have it, the custom group is a complete substitute and produces strictly more detail.
"Standard GA4 or GA4 360?" Standard covers everything here: custom channel groups, custom dimensions, Explorations, GTM, and BigQuery export (1M events/day cap on standard).
"Does this work with consent mode v2?" Yes with caveats. Denied-consent events still fire the GTM tag but land in the cookieless ping flow and get modeled rather than counted directly. AI - ChatGPT sessions from denied users show up as modeled, not raw [14].
Where the 7-step setup ends and the server-side layer begins
What to do tomorrow morning. If you have 25 minutes, run the 7 steps. The custom channel group is worth it on its own because the labeled AI - ChatGPT row is a leading indicator that citations are shipping traffic, even at a third of the truth.
If you run a Stripe SaaS and you care about revenue per AI engine, add a server-side first-party layer. Attrifast does this in ~4 minutes (drop a 4 KB script in <head>, paste your Stripe webhook URL), but the principle matters more than the vendor: capture the referrer server-side, run the AI-engine match before the page renders, join to payments by webhook. Cloudflare Worker, GTM server container, or Attrifast: same architecture.
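To make the "run the AI-engine match" step concrete, here is a minimal sketch of the matching function a Cloudflare Worker or any other server-side handler could call with the incoming Referer header. It is not Attrifast's implementation, and it only handles requests that still carry a referer; reconstructing stripped-referer visits from other signals is out of scope here. The names `AI_ENGINE_HOSTS` and `matchAiEngine` are hypothetical, and only the two ChatGPT hostnames come from this article.

```javascript
// Hostname -> engine label. Only the two ChatGPT hosts are taken from this
// article; add other engines' hostnames once you have verified them yourself.
const AI_ENGINE_HOSTS = {
  "chatgpt.com": "chatgpt",
  "chat.openai.com": "chatgpt",
};

// refererHeader: raw value of the Referer request header. It may be undefined
// or empty when the client stripped it. Returns an engine label or null.
function matchAiEngine(refererHeader) {
  if (!refererHeader) return null;
  let host;
  try {
    host = new URL(refererHeader).hostname.toLowerCase();
  } catch (err) {
    return null; // malformed header value
  }
  return Object.prototype.hasOwnProperty.call(AI_ENGINE_HOSTS, host)
    ? AI_ENGINE_HOSTS[host]
    : null;
}

console.log(matchAiEngine("https://chatgpt.com/"));
console.log(matchAiEngine(""));
```

The label this returns is what you would store next to the session so the payment webhook can join against it later.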
The failure mode to avoid: running the 7 steps, seeing AI - ChatGPT at 3.4%, and concluding "AI traffic is small" when the honest number is 10-15% and the rest is hiding in Direct. The 7 steps are necessary. The structural gap is real. Both are true.
How do I track ChatGPT traffic in Google Analytics 4?
Open GA4, navigate to Admin (gear icon, bottom-left), then Data display > Channel groups, click Create new channel group, name it "AI-Aware Channels (v1)", click Add new channel, name it "AI - ChatGPT", and set Source matches regex ^(chatgpt\.com|chat\.openai\.com)$ OR Source contains chatgpt-. Drag that rule above Referral so it evaluates first. Save channel, then Save group. New sessions from now on will route through it; GA4 does not back-fill. This catches ~30% of ChatGPT clicks (the ones that arrive with a usable referrer). The other ~70% arrive with a stripped referrer and no UTM and require a server-side layer to recover.
Can I add ChatGPT as a channel in GA4 in 2026?
Yes, two routes. The native AI Assistants channel was added to GA4's default channel group in late 2025 and covers ChatGPT, Gemini, DeepSeek, Copilot, and Grok via Medium = ai-assistant or referrer match. The custom channel group route gives you per-engine breakouts (AI - ChatGPT, AI - Perplexity, etc.) the default does not produce. I recommend running both: native for cross-property consistency, custom for granularity.
What is the exact regex to put in GA4 for ChatGPT?
^(chatgpt\.com|chat\.openai\.com)$ for the Source dimension as a regex match, optionally OR'd with Source contains chatgpt- to also catch UTM-tagged URLs. The anchors (^ and $) prevent over-matching. GA4 uses RE2; you only need single-backslash escapes (\.) in the condition builder. Order the rule above Referral or it will never fire.
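If you want to sanity-check the pattern before pasting it into GA4, a quick local test is enough. This sketch uses JavaScript's RegExp rather than RE2; the pattern uses no features where the two differ. The sample source values, including the `chatgpt-newsletter` UTM value, are made up for illustration.

```javascript
// The channel-rule pattern from this article, anchored at both ends.
const CHATGPT_SOURCE = /^(chatgpt\.com|chat\.openai\.com)$/;

// Mirrors the rule: regex match on Source, OR Source contains "chatgpt-".
function matchesChatgptRule(source) {
  return CHATGPT_SOURCE.test(source) || source.includes("chatgpt-");
}

// Hypothetical Source values to try against the rule.
const samples = [
  "chatgpt.com",
  "chat.openai.com",
  "chatgpt-newsletter",
  "notchatgpt.com",
  "chatgpt.com.example.org",
  "chatgptXcom",
];

for (const s of samples) {
  console.log(s, matchesChatgptRule(s));
}
```

The first three samples match and the last three do not: the anchors reject the lookalike hosts, and the escaped dot rejects `chatgptXcom`.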
Why is ChatGPT showing as Direct in Google Analytics?
The ChatGPT clients strip the Referer header on most outbound clicks (rel=noreferrer, no-referrer meta tag, strict Referrer-Policy). The browser hits your server with no referer and no UTM, so GA4 evaluates Source = (direct), Medium = (none), and buckets the session as Direct. Plausible measured single-digit-percent referer pass-through in early 2024; my Q1-Q2 2026 measurement puts it at ~30% for ChatGPT web but under 10% for ChatGPT desktop. The ~70% gap is structural: GA4 only sees what the browser hands it, and the browser hands it nothing when the referer is stripped.
Does the new GA4 AI Assistants channel automatically catch ChatGPT?
Only partially. It matches when Medium = ai-assistant (which nothing emits by default — you have to tag URLs that way) or when the referrer matches Google's curated AI-assistant list (ChatGPT, Gemini, DeepSeek, Copilot, Grok). The list catches the ~30% of clicks that pass a referrer. The 70% with a stripped referrer remains in Direct. The native channel is a consistency win, not a solution to the referrer-stripping problem.
How long does the GA4 ChatGPT tracking setup take?
About 25 minutes for the 7 steps: 5 minutes for the channel group with regex, 10 minutes for two custom dimensions and the GTM tag, 5 minutes for the Exploration, 5 minutes to validate in Realtime. Add 2-3 hours if you also want a BigQuery scheduled query for daily AI-revenue reporting. None of this recovers the 70% stripped-referrer slice; that requires a server-side layer.
Will the GA4 setup back-fill historical ChatGPT sessions?
No. Custom channel groups apply to sessions that start after the rule is saved. Custom dimensions collect from the first event that emits the parameter going forward. Historical reclassification is only possible in BigQuery (GA4 exports raw events daily; you can SQL-process them retroactively with new rules). Inside the GA4 UI, the past is fixed.
What custom dimension should I create for AI source in GA4?
Two event-scoped dimensions: AI Traffic Source (parameter traffic_source_ai, values: chatgpt, perplexity, claude, gemini, copilot, deepseek, grok, you, phind, poe) and AI Traffic Surface (parameter traffic_surface_ai, values: chatgpt-search, chatgpt-conversation, chatgpt-share, chatgpt-gpt, perplexity-search, perplexity-page, gemini-app, claude-conversation, copilot-chat). Register at Admin > Custom definitions > Create custom dimension, scope Event, parameter name matching exactly what your GTM tag emits.
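As an illustration of what "parameter name matching exactly" means in practice, here is one possible shape for the object a tag could push to the data layer. Treat it as a sketch under assumptions: the event name `ai_traffic` and the function name are placeholders I made up, and the GTM tag still has to map these keys to GA4 event parameters. Only the two parameter names and the source values come from the answer above.

```javascript
// Allowed AI source values, copied from the list above.
const AI_SOURCES = [
  "chatgpt", "perplexity", "claude", "gemini", "copilot",
  "deepseek", "grok", "you", "phind", "poe",
];

// Builds the payload. The keys traffic_source_ai and traffic_surface_ai must be
// spelled exactly like the registered custom-dimension parameters.
// "ai_traffic" is a hypothetical event name, not something GA4 defines.
function buildAiEvent(source, surface) {
  if (!AI_SOURCES.includes(source)) {
    throw new Error("Unknown AI source: " + source);
  }
  return {
    event: "ai_traffic",
    traffic_source_ai: source,
    traffic_surface_ai: surface,
  };
}

// In a browser this array would be window.dataLayer.
const dataLayer = [];
dataLayer.push(buildAiEvent("chatgpt", "chatgpt-search"));
console.log(dataLayer[0]);
```

Rejecting unknown source values at this point keeps typos from creating stray rows in the Exploration.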
Can I track ChatGPT search clicks separately from ChatGPT conversation clicks?
Yes, via the referrer path, but only client-side because GA4 channel rules operate only on Source and Medium. A GTM Custom JavaScript variable inspects document.referrer, parses the path with new URL(ref).pathname, and returns chatgpt-search if path starts with /search, chatgpt-conversation for /c/, chatgpt-share for /share/, chatgpt-gpt for /g/ or /gpts/. That value gets emitted as traffic_surface_ai and surfaces in the Exploration. Worth knowing: ChatGPT search clicks convert higher than conversation clicks in my measurement, because they come from explicit query intent.
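The path rules in that answer translate to a short function. This is a sketch of the logic, not a copy of any production tag; the function name `chatgptSurface` is mine, and returning `undefined` for anything it cannot classify is a choice I made so the parameter is simply left unset.

```javascript
// Maps a referrer URL to a surface label using the path prefixes described
// above. Returns undefined when the referrer is empty, malformed, not a
// ChatGPT host, or has a path that matches none of the prefixes.
function chatgptSurface(referrer) {
  if (!referrer) return undefined;
  var url;
  try {
    url = new URL(referrer);
  } catch (e) {
    return undefined;
  }
  if (url.hostname !== "chatgpt.com" && url.hostname !== "chat.openai.com") {
    return undefined;
  }
  var path = url.pathname;
  if (path.indexOf("/search") === 0) return "chatgpt-search";
  if (path.indexOf("/c/") === 0) return "chatgpt-conversation";
  if (path.indexOf("/share/") === 0) return "chatgpt-share";
  if (path.indexOf("/g/") === 0 || path.indexOf("/gpts/") === 0) return "chatgpt-gpt";
  return undefined;
}

// In GTM, a Custom JavaScript variable would wrap the same logic as:
//   function() { return chatgptSurface(document.referrer); }
console.log(chatgptSurface("https://chatgpt.com/c/abc123"));
```

When the referrer carries only the origin (`https://chatgpt.com/`), the path is `/` and the function returns `undefined`, so the session still gets its channel label but no surface label.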
Why does my GA4 still show 0 ChatGPT sessions after I added the regex?
Most common: AI - ChatGPT rule is below Referral in the order, so Referral matches first. Drag the rule to position 1. Second: regex typo — paste ^(chatgpt\.com|chat\.openai\.com)$ exactly. Third: GA4 takes 24-48 hours to populate standard reports (use Realtime to verify within minutes). Fourth: maybe no ChatGPT users with a usable referrer have hit your site in that window, which is genuinely possible on desktop-heavy audiences where >90% of clicks strip the referrer.
Should I use Google Tag Manager or just the GA4 admin?
Both, in sequence. GA4 admin alone gets you the channel group (steps 1-3), which catches the ~30% of AI clicks that arrive with a referrer. GTM is required for the surface-level breakdown (search vs conversation vs share vs GPT) and for Google AI Overviews detection via URL params, because GA4 channel rules cannot read URL paths or query parameters. GTM extracts the richer signal; GA4 displays and reports it.
How does this 7-step GA4 setup compare to a server-side first-party tracker?
GA4 with the full stack (channel group + GTM + BigQuery) catches roughly 30-50% of ChatGPT traffic depending on your audience mix (more web/search = higher; more desktop/mobile app = lower). A server-side first-party tracker like Attrifast inspects request headers such as Sec-Fetch-Site, combines them with IP heuristics and behavioral fingerprinting before page render, and reconstructs the AI source even when the referer is stripped, recovering 85-95%. The other gap is the Stripe revenue join: GA4 cannot natively join to Stripe webhooks. Run GA4 if you need the data in GA4; add a server-side tracker for the missing 50-70% and the revenue link.