Growth Memo

Shorter, Focused Content Wins in ChatGPT

We analyzed 815,000 query-page pairs. The "ultimate guide" strategy produces worse citation results than a focused shorter page.

Kevin Indig
Apr 13, 2026
∙ Paid

This Memo was sent to 26,074 subscribers. Welcome to +105 new readers! Subscribe to get the free memo weekly or upgrade to Premium for the full archive, research, frameworks, and templates.

For years, SEOs have operated on a simple assumption: The more ground your content covers, the more likely it is to surface in AI-generated answers. In fact, every “best practice” in classic SEO content pushes you toward more: more subtopics, more sections, more words. Build the “ultimate guide.”

An analysis of 815,000 query-page pairs across 16,851 queries and 353,799 pages says otherwise:

  • Fanout coverage is nearly irrelevant to citation rates

  • 2 signals actually predict whether ChatGPT cites your page

  • 6 concrete changes you can make to your existing content library

Premium subscribers get 3 additional findings: whether the density-vs-query match tradeoff shifts by vertical, heading patterns from 6.8 million H2-H4s, and what the data says about “best X” searches.

ChatGPT judges a page by its cover.

Pages with headlines that directly answer the question get cited 41% of the time. Pages with loosely related headlines drop to 29%.

I partnered with AirOps on a study of 16,851 ChatGPT queries and 353,799 pages across 10 industries. Several more findings should change how you approach AI visibility:

  1. Retrieval rank is the #1 signal: The first retrieved result (position 0) has a 58% chance of being cited. By position 10, that drops to 14%.

  2. Do comprehensive guides still win? Not exactly. Pages covering 26-50% of ChatGPT’s fanout sub-queries get cited more than pages covering 100%.

  3. Domain authority predicts nothing: Always-cited pages have lower DA than never-cited pages. Content quality is what counts.

The full report covers 20+ signals, with controlled comparisons across each. Get a head start on winning visibility.

Read the full report

1/ The study

AirOps ran 16,851 queries through the ChatGPT UI 3 times each, capturing every fanout sub-query, every URL searched, every citation made, and every page scraped. Oshen Davidson built the pipeline. I analyzed the data.

Each query generates an average of 2 fanout queries. ChatGPT retrieves roughly 10 URLs per sub-search, reads through them, then selects which ones to cite. We scored how well each page’s H2-H4 subheadings matched those fanout queries using cosine similarity on bge-base-en-v1.5 embeddings. That score is what we call fanout coverage: the share of subtopics a page addresses at a 0.80 similarity threshold. (The 0.80 threshold decides whether a subheading counts as a match to a fanout query. Think of it as a relevance bar.)
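To make the metric concrete, here is a minimal sketch of how fanout coverage can be computed. The study used bge-base-en-v1.5 embeddings; the toy 2-D vectors and the function names below are my own illustration, not AirOps’ pipeline.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def fanout_coverage(heading_vecs, fanout_vecs, threshold=0.80):
    """Share of fanout sub-queries matched by at least one
    H2-H4 subheading at or above the similarity threshold."""
    if not fanout_vecs:
        return 0.0
    matched = 0
    for fv in fanout_vecs:
        best = max(cosine_sim(hv, fv) for hv in heading_vecs)
        if best >= threshold:
            matched += 1
    return matched / len(fanout_vecs)

# Toy 2-D "embeddings" stand in for real bge-base-en-v1.5 vectors.
headings = [np.array([1.0, 0.0]), np.array([0.7, 0.7])]
fanouts = [np.array([1.0, 0.1]),   # close to heading 1 -> counts as covered
           np.array([0.0, 1.0])]   # below 0.80 vs. both headings -> not covered
print(fanout_coverage(headings, fanouts))  # 0.5
```

A page covering 1 of 2 fanout sub-queries scores 0.5, i.e. 50% fanout coverage.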

The question: Do pages with higher fanout coverage get cited more?

You’ll find even more information in the co-written AirOps report.

2/ Density barely moves the needle

Across 815,484 rows, the relationship between fanout coverage and citation is weak.

Covering 100% of subtopics adds 4.6 percentage points over covering none. That gap shrinks further when you control for query match (how well the page’s best heading matches the original query). Among pages with strong query match (>= 0.80 cosine similarity):

Moderate coverage (26-50%) outperforms exhaustive coverage. Pages that cover everything score lower than pages that cover a quarter of the subtopics. The “ultimate guide” strategy produces worse results than a focused article that covers 2-3 related angles well.

3/ What actually predicts citation

These 2 signals dominate: retrieval rank and query match.

1/ Retrieval rank is the strongest predictor by a wide margin. A page at position 0 in ChatGPT’s web search results (the first URL returned by its search tool) has a 58% citation rate. By position 10, that drops to 14%. We ran each prompt 3 times consecutively for this analysis, and pages cited in all 3 runs have a median retrieval rank of 2.5. Pages never cited: median rank 13.

2/ Query match (cosine similarity between the query and the page’s best heading) is the strongest content signal. Pages with a 0.90+ heading match have a 41% citation rate compared to the 30% rate for pages below 0.50. Even among top-ranked pages (position 0-2), higher query match adds 19 percentage points.
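Query match is the same machinery pointed at the original query instead of the fanout sub-queries: take the best cosine similarity between the query and any heading. A small sketch, again with toy vectors standing in for real embeddings and with a function name I made up:

```python
import numpy as np

def query_match(query_vec, heading_vecs):
    """Best cosine similarity between the query and any H2-H4
    heading -- the 'query match' content signal from the study."""
    q = query_vec / np.linalg.norm(query_vec)
    best = 0.0
    for h in heading_vecs:
        best = max(best, float(np.dot(q, h / np.linalg.norm(h))))
    return best

query = np.array([0.6, 0.8])
headings = [np.array([0.6, 0.8]),  # heading that directly answers the query
            np.array([1.0, 0.0])]  # loosely related heading
print(round(query_match(query, headings), 2))  # 1.0
```

A page with one exact-match heading scores 1.0 even if its other headings are off-topic; only the best heading counts.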

Fanout coverage, word count, heading count, domain authority: all secondary. Some are flat. Some are inversely correlated.

4/ The Wikipedia exception

One site type breaks the pattern. Wikipedia has the worst retrieval rank in the dataset (median 24) and the lowest query match score (0.576). It still achieves the highest citation rate: 59%.

Wikipedia pages average 4,383 words, 31 lists, and 6.6 tables. They are encyclopedic in the literal sense. ChatGPT cites Wikipedia from deep in the search results where every other site type gets ignored.

This is density working as a signal, but at a scale no publisher can replicate. Wikipedia’s content is exhaustive, richly structured, and cross-linked across millions of topics. A 3,000-word corporate blog post with 15 subheadings is not the same thing.

5/ The bimodal reality

58% of pages retrieved by ChatGPT in this dataset are never cited. 25% are always cited when they appear. Only 17% fall in between.

The always-cited and never-cited groups look nearly identical on most content metrics: similar word counts (~2,200), similar heading counts (~20), similar readability scores (~12 FK grade), similar domain authority (~54). The on-page signals we can measure do not separate winners from losers.

What separates them is retrieval rank. Always-cited pages rank near the top when they surface. Never-cited pages rank in the bottom half. The retrieval system, whatever signals it uses internally, is the gatekeeper. Everything else is a tiebreaker.
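The always/never/mixed split falls out of the 3-runs-per-prompt design: for each retrieved page, record whether it was cited in each run it appeared in, then bucket. A minimal sketch of that bucketing (the data shapes and names here are illustrative, not the study’s actual schema):

```python
def classify_pages(citation_runs):
    """Bucket each retrieved page as 'always', 'never', or 'mixed'
    cited, given per-run citation flags across the 3 prompt runs."""
    buckets = {"always": [], "never": [], "mixed": []}
    for page, flags in citation_runs.items():
        if all(flags):
            buckets["always"].append(page)
        elif not any(flags):
            buckets["never"].append(page)
        else:
            buckets["mixed"].append(page)
    return buckets

runs = {
    "example.com/a": [True, True, True],     # cited in all 3 runs
    "example.com/b": [False, False, False],  # retrieved, never cited
    "example.com/c": [True, False, True],    # the unreliable middle
}
buckets = classify_pages(runs)
print({k: len(v) for k, v in buckets.items()})  # {'always': 1, 'never': 1, 'mixed': 1}
```

In the study’s data, 58% of pages land in the never bucket, 25% in always, and only 17% in mixed.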

6/ What this means for your content

Conventional SEO content writing wisdom says cover more subtopics, add more sections, build density. The data says the conventional approach produces “mixed” pages, the 17% in the middle that get cited sometimes and ignored other times.

Mixed pages have the highest word counts, the most headings, and the highest domain authority in the dataset. They are the “ultimate guides.” They are also the least reliable performers in ChatGPT.

The pages that win consistently are focused. They:

  • Match the query directly in their headings,

  • Tend to be shorter (the citation sweet spot is 500-2,000 words), and

  • Have enough structure (7-20 subheadings) to organize the content without diluting it.

Build the page that is the best answer to one question. Not the page that adequately answers twenty.

Premium subscribers receive 3 more findings:

  1. Whether the density-vs-query-match tradeoff shifts by vertical,

  2. Common successful heading patterns for AI citation (drawn from 6.8 million H2-H4 headings analyzed), and

  3. What you need to know about “best X” searches.
