The ghost citation problem
New data from 3,981 domains across 115 prompts, 14 countries, and 4 AI search engines
This Memo was sent to 26,187 subscribers. Welcome to +95 new readers! Subscribe to get the free memo weekly or upgrade to Premium for the full archive, research, frameworks, and templates.
When an AI answers a question using your content, it usually cites you with a source link. What it doesn’t do, 62% of the time, is say your name. The link is there. The brand mention is not. This is what I like to call a ghost citation: the AI uses your content but never mentions you in the answer.
This week, I’m sharing:
Why being cited and being mentioned are 2 different outcomes that require different strategies
Which LLMs name brands vs. which treat them as anonymous source material
The query format and content type that produce 30x more brand mentions
A note from Kevin: I’m a big fan of HubSpot’s Marketing Against the Grain. I had Kieran, one of the co-hosts, on my Tech Bound podcast back in 2023. Now, they launched a newsletter with smart experiments, fresh perspectives, and practical lessons on what’s working right now. So, I thought I would give a friendly shoutout: Check it out.
Hertz drove +50% more bookings from AI search. Here’s the blueprint.
Leading brands are using Semrush for Enterprise to control their brand visibility and convert it into customers.
In a few months, Hertz Iceland secured executive buy-in, delivered a complete seasonal strategy, and grew bookings from AI search by 50%.
Their team has a brand visibility blueprint backed by the market’s most complete digital visibility data. And now they’re scaling it year-round.
This analysis draws on 3,981 domains across 115 prompts, 14 countries, and 4 AI search engines (ChatGPT, Google AI Overviews, Gemini, AI Mode), using data from the Semrush AI Toolkit. Every appearance is tagged as ‘cited’ (source link present) and/or ‘mentioned’ (brand name appears in the answer text). The gap between those 2 states is the ghost citation problem.
1. 62% of your brand’s LLM citations are functionally invisible.
Most brands assume being cited means being seen. The data says otherwise.
74.9% of domains were cited, and 38.3% mentioned. 61.7% of citations are ghost citations: the domain gets a source link but zero name recognition in the answer text.
Only 13.2% of appearances convert into both a citation and a mention. And not a single appearance went without at least one of the two: every domain that showed up was cited, mentioned, or both.
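The arithmetic behind that breakdown checks out via inclusion-exclusion. A quick sketch in Python, using the percentages from this finding:

```python
# Sanity check on the breakdown reported above (figures from the article).
cited = 74.9      # % of appearances with a source link
mentioned = 38.3  # % of appearances naming the brand in the answer text
both = 13.2       # % of appearances with both

# Inclusion-exclusion: share of appearances with at least one of the two.
at_least_one = cited + mentioned - both
# Cited-but-never-named appearances: the ghost citations.
ghost = cited - both

print(f"cited or mentioned: {at_least_one:.1f}%")  # 100.0% -> nothing slips through
print(f"ghost citations: {ghost:.1f}% of appearances")
```

The three reported percentages sum to exactly 100% once the overlap is removed, which is why no appearance can be both uncited and unmentioned.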
2. Every LLM shows a different behavior
The 4 AI engines treat citations and mentions in fundamentally different ways:
Gemini names brands in 83.7% of appearances, but only generates a citation link 21.4% of the time. It operates more like a conversationalist drawing on brand knowledge.
ChatGPT is the opposite: It cites 87.0% of the time but mentions brands in only 20.7% of answers, functioning more like an academic paper with footnotes.
Google AI Overviews (AIOs) sit in the middle but lean toward citation.
Google’s AI Mode offers about 17% more brand mentions than ChatGPT in its outputs, but also functions closer to an academic paper than its Gemini sibling.
For brands, this means Gemini visibility and ChatGPT visibility are not the same thing. (This data set showed clear evidence that there was little overlap between ChatGPT citations/mentions and Gemini citations/mentions for the same prompts.) Optimizing for one does not help with the other. There is no single “AI visibility metric.” There are at least 4 different behavioral systems running in parallel.
3. Strong brands get named in the text
A clear pattern emerges among domains appearing 3 or more times: Content aggregators and academic sources are cited repeatedly but almost never mentioned.
Medium.com was cited 16 times for the same prompts across 3 different engines and named zero times.
Wikipedia.org was cited 27 times and mentioned in only 2 answers, both times for the same conversational query (“what is the most dangerous creature in the world?”).
Wired.com, sciencedirect.com, harvard.edu: same pattern.
Consumer brands with a strong public identity get mentioned in the output at near-100% rates.* The AI doesn’t feel the need to cite them. Instead, it names consumer brands outright: it knows the data about the brands came from somewhere but doesn’t feel obligated to tell users where. For publishers whose value proposition is information authority, this is a structural problem.
* A mention rate above 100% means the brand is named in the answer text even when it isn’t cited as a source link: the engine references the brand by name without linking to it. The rate is mentions divided by citations, so a brand cited 10 times and mentioned 10 times sits at 100%, while one mentioned 12 times against 10 citations hits 120%.
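The footnote’s math reduces to mentions divided by citations. A sketch (the helper below is illustrative, not a Semrush formula):

```python
def mention_rate(mentions: int, citations: int) -> float:
    """Mentions expressed as a percentage of citations.

    Because an engine can name a brand without linking to it,
    mentions can exceed citations, so the rate can top 100%.
    """
    return mentions / citations * 100

print(mention_rate(10, 10))  # 100.0
print(mention_rate(12, 10))  # 120.0
```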
4. LLMs disagree on the same brand 22% of the time
454 prompt+domain combinations were tested across multiple engines. In 22% of those combinations (100 total), the LLMs disagreed on whether to mention the brand:
Instagram.com was mentioned by ChatGPT and Gemini but only cited (not named) by Google.
Facebook.com was mentioned by Gemini in 3 out of 3 appearances.
Google AI cited Facebook 9 out of 9 times, but named it in only 1.
The same brand, the same query, but different engines and different outcomes. This matters for measurement: A brand can appear “visible” in one engine’s data while being completely anonymous in another. Aggregate AI visibility metrics mask this divergence.
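Spotting that divergence in your own tracking data is straightforward. A minimal sketch with hypothetical rows (the schema here is mine, not the Semrush AI Toolkit’s):

```python
from collections import defaultdict

# Hypothetical rows: (prompt, domain, engine, mentioned-in-answer-text?)
rows = [
    ("best social apps", "instagram.com", "ChatGPT", True),
    ("best social apps", "instagram.com", "Gemini", True),
    ("best social apps", "instagram.com", "Google AIO", False),
]

# Group appearances by prompt+domain; a combo where engines produced both
# True and False for "mentioned" is a cross-engine disagreement.
flags_by_combo = defaultdict(set)
for prompt, domain, engine, mentioned in rows:
    flags_by_combo[(prompt, domain)].add(mentioned)

disagreements = [combo for combo, flags in flags_by_combo.items() if len(flags) > 1]
print(disagreements)  # [('best social apps', 'instagram.com')]
```

Any aggregate metric that averages over engines hides exactly these combinations, which is why per-engine reporting matters.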
5. In-text brand mention rates vary by geography
Controlling for the LLM, country-level differences in mention rates are meaningful:
India and Sweden show the highest mention rates (50%), suggesting more conversational or brand-forward query patterns in those markets.
Italy, Brazil, and the Netherlands show the lowest mention rates (18–22%), with very high citation rates (82–94%).
The UK and Canada are mid-range but above the global average.
* Note: the dataset uses localized prompts confirmed by Semrush, so language is not a confound.
Being cited and being named are not the same thing, and they require different approaches
From this analysis, 4 takeaways stood out to me the most for brands and their content strategies:
1. Being cited means an AI is drawing on your content. Being mentioned means it is naming you. We don’t yet know enough about the implications of mentions and citations, but we can say for sure that there’s a system that decides when you’re cited vs. mentioned.
2. Your strategy must be LLM-specific. A Gemini-first strategy is different from a ChatGPT-first strategy. Any AI visibility report that aggregates across LLMs is misleading.
3. Comparative content gets brands named. Informational content feeds the machine anonymously. If the goal is brand mentions, not just citations, focus your content strategy toward evaluation, comparison, and recommendation.
4. Prompt format matters. Brands should map not just which topics they want to appear in, but specifically which phrasing patterns produce mentions vs. ghost citations. Short conversational queries and long structured queries behave like different products.
Methodology
Data source: the Semrush AI Toolkit, covering 3,981 domain appearances across 115 prompts, 14 countries, and 4 AI search engines (ChatGPT, Google AI Overviews, Gemini, AI Mode).
Every row in the dataset represents a domain that appeared in an AI answer. Each appearance is tagged as “cited” (the domain appears as a source link) and/or “mentioned” (the brand name appears in the answer text). The gap between those 2 states is what this analysis calls a ghost citation: the AI used your content but did not say your name.
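The tagging logic can be sketched as a small classifier over the two boolean flags described above (the label strings are mine, for illustration):

```python
def classify(cited: bool, mentioned: bool) -> str:
    """Tag a single AI-answer appearance per the definitions above."""
    if cited and mentioned:
        return "cited + mentioned"
    if cited:
        return "ghost citation"  # source link present, brand never named
    if mentioned:
        return "mention only"    # brand named, no source link
    return "absent"

print(classify(cited=True, mentioned=False))  # ghost citation
```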
For premium subscribers: What creates a 30x difference in brand mentions
The ghost citation problem has a solution. In fact, 2 findings in this analysis point directly to it: one about how you phrase queries, one about what type of content you produce. Together, they explain a 30x difference in mention rates on the same topic.
Premium subscribers get both findings, plus the content strategy implications.
The first 5 findings describe the problem. These 2 explain what drives it, and what to do differently.