Growth Memo

What Our AI Mode User Behavior Study Reveals about the Future of Search

Here’s what 250 sessions of user behavior in AI Mode tell us about the future of search: It’s quietly rewriting all the rules.

Kevin Indig and Amanda Johnson

Oct 06, 2025

This Memo was sent to 21,964 subscribers. Welcome to +134 new readers! You’re reading the free version of Growth Memo. Premium subscribers get deeper strategy breakdowns, original research, and exclusive content drops — including today’s AI Mode plays and next week’s stakeholder-ready slide deck.

Our new usability study of 37 participants across 7 specific search tasks clearly shows that people:

  1. Read AI Mode

  2. Rarely click out, and

  3. Only leave when they are ready to transact.

From what we know, there isn’t another independent usability study that has explored AI Mode to this depth.

In May, I published an extensive two-part study of AI Overviews (AIOs) with Amanda, Eric Van Buskirk, and his team. Eric and I also collaborated on Propellic’s travel industry AI Mode study.

We worked together again to bring you this week’s Growth Memo: a study that provides crucial insights and validation into the behaviors of people as they interact with Google’s AI Mode.

Since neither Google nor OpenAI (or anyone else) provides user data for their AI (Search) products, we’re filling a crucial gap.

We captured screen recordings and think-aloud sessions via remote study. The 250 unique tasks collected provide a robust data set for our analysis. (The complete methodology is provided at the end of this memo, including details about the 7 search tasks.)

And you might be surprised by some of the findings. We were.

This is a longer post, so grab a drink and settle in.

Executive summary

Our new usability study of Google’s AI Mode reveals how profoundly this feature changes user behavior.

  • AI Mode holds attention and keeps users inside. In roughly three-quarters of the total user sessions, users never left the AI Mode pane, and 88% of users’ first interactions were with the AI-generated text. Engagement was high: The median time by task type was roughly 52-77 seconds.

  • Clicks are rare and mostly transactional. The median number of external clicks per task was zero. Yep. You read that right. Ze-ro. And 77.6% of sessions had zero external visits.

  • People skim but still make decisions in AI Mode. Over half of tasks were classified as “skimmed quickly,” where users glance at the AI-generated summary, form an opinion, and move on.

  • AI Mode delivers “site types” that match intent. It’s not just about meeting search query or prompt intents; AI Mode is citing sources that fit specific site categories (like marketplaces vs review sites vs brands).

  • Visibility, not traffic, is the emerging currency. Participants made their brand judgments directly from AI Mode outputs.

TLDR? These are the core findings from this study:

  • AI Mode is sticky

  • Clicks are reserved for transactions

  • AI Mode matches site type with intent

  • Product previews act like mini product detail pages (aka PDPs)

Acknowledgements

But before we dig in, a quick shout out here to the team behind this study.

Together with Eric Van Buskirk’s team at Clickstream Solutions, I conducted the first broad usability study of Google’s AI Mode that uncovers not only crucial insights into how people interact with the hybrid search/AI chat engine, but also what kinds of branded sites AI Mode surfaces and when.

I want to highlight that Eric Van Buskirk was the research director. While we collaborated closely on shaping the research questions, areas of focus, and methodology, Eric managed the team, oversaw the study execution, and delivered the findings. Afterward, we worked side by side to interpret the data.

Click data is a great first pass for analyzing what’s happening in AI Mode, but with this usability study specifically, we essentially looked “over the shoulder” of real-life users as they completed tasks, which resulted in a robust collection of data to pull insights from.

Our testing platform was UXtweak.

Make your sales pages stand out in LLMs

Powered by Citation Labs, (X)OFU.com helps you map, measure, and grow bottom-of-funnel visibility in AI answers.

Our team:

  • Models buyer prompts to reveal what LLMs cite

  • Audits where you do (and don’t) show up in ChatGPT, Claude, Google AIOs, Gemini, and AI Mode to expose your BOFU gaps

  • Tracks and engages cited sources to uncover missed opportunities

  • Makes your sales pages worth citing

Buyer decisions are already shaping your LLM presence.

Take control.

Build your next LLM citation strategy today with (X)OFU.com.

Ignore this AI Mode data at your own risk

Google’s own Sundar Pichai has been crystal clear: AI Mode isn’t a toy; it’s a proving ground for what the core search experience will look like in the future.

On the Lex Fridman podcast, Pichai said (bolding mine):

“Our current plan is AI Mode is going to be there as a separate tab for people who really want to experience that… But as features work, we’ll keep migrating it to the main page…” [1]

Google has argued these new AI-focused features are designed to point users to the web; but in practice, our data shows that users stick around and make decisions without clicking out. In theory, this could not only impact click-outs to organic results and citations, but also reduce external clicks to ads.

In August, I explored the reality behind Google’s own product cannibalization with AI Mode and AIOs:

Right now, according to Similarweb data, usage of the AI Mode tab on Google.com in the US has slightly dipped and now sits at just over 1%.

Google AIOs are now seen by more than 1.5 billion searchers every month, and they sit front and center. But engagement is falling. Users are spending less time on Google and clicking on fewer pages.

But as Google rolls AI Mode out more broadly, it will bring the biggest shift ever to Search, the biggest customer acquisition channel there is.

Traditional SEO is highly effective in the new AI world, but if AI Mode really becomes the default, there is a chance we need to rethink our arsenal of tactics.

Preparing for the future of search means treating AI Mode as the destination (not the doorway), and figuring out how to show up there in ways that actually matter to real user behavior.

With this study, I set out to discover and validate actual user behaviors within the AI Mode experience as participants undertook a variety of tasks with differing search intents.

1/ AI Mode is sticky

📊 Key stats:

People read first and usually stay inside the AI Mode experience. Here’s what we found:

  • The majority of sessions had zero external visits, meaning participants didn’t leave AI Mode at all.

  • ~88% of users’ first interaction* within the feature was with the AI Mode text.

  • Typical user engagement within AI Mode is roughly 50 to 80 seconds per task.

These 3 stats define the AI Mode search surface: It holds attention and resolves many tasks without sending traffic.

* Here’s what I mean by “interaction:”

  • An “interaction” within the user tasks = the participant meaningfully engaged with AI Mode after it loaded.

  • What counts as an interaction: Reading or scrolling the AI Mode body for more than a quick glance, including scanning a result block like the Shopping Pack or Right Pane, opening a merchant card, clicking an inline link, link icon, or image pack.

  • What doesn’t count as an interaction: Brief eye flicks, cursor passes, or hesitation before engaging.

Users are in AI Mode to read - not necessarily to browse or search - with ~88% of sessions interacting with the output’s text first and spending 1 minute or more within the AI Mode experience.

Plus, it’s interesting to see that users spend more than double the time in AI Mode compared to AIOs.

The overall engagement is much stronger.

🧠 Why it matters:

Treat the AI Mode panel like the primary reading surface, not a teaser for blue links.

AI Mode is a contained experience where sending clicks to websites is a low priority and giving users the best answer is the highest one.

As a result, it completely changes the value chain for content creators, companies and publishers.

💡Insight:

Why do other sources and/or AI Mode research analyses say that users don’t return to the AI Mode feature very often?

My theory here is that, because AI Mode is a separate search experience (at least for now), it’s not as visible as AIOs.

As AI Mode adoption increases with Google bringing Gemini (and AI Mode) into the browser, I expect our study findings to scale.

2/ Clicks are reserved for transactions

While clicks are scarce, purchase intent is not.

Participants in the study only clicked out when the task demanded it (e.g. “put an item in your shopping cart”) or if they browsed around a bit.

However, the browsing clicks were so few that we can safely assume AI Mode only leads to click-outs when users want to purchase.

Even prompts with comparison or informational intent tend to keep users inside the feature.

  • Shopping prompts like [canvas bag] and [tidy desk cables] drive the highest AI Mode exit share.

  • Comparison prompts like [Oura vs Apple Watch] show the lowest exit share of the tasks.

When participants were encouraged to take action (“put an item in your shopping cart” or “find a product”) the majority of clicks went to shopping features like Shopping Packs or Merchant Cards.

18% of exits were caused by users exiting AI Mode and going directly to another site, making it much harder to reverse engineer what drove these visits in the first place.

Study transcripts confirm that participants often share out loud that they’ll “go to the seller’s page,” or “find the product on Amazon/eBay” for product searches.

Even when comparing products, whether software or physical goods, users barely click out.

In plain terms, AI Mode eats up all TOFU and MOFU clicks. Users discover products and form opinions about them in AI Mode.

📊 Key stats:

  • Out of 250 valid tasks, the median number of external clicks was zero!

  • The prompt task of [canvas bag] had 44 external clicks and [tidy desk cables] had 31 clicks, accounting for 2/3 of all external clicks in this study.

  • Comparison tasks like [Oura Ring vs Apple Watch] or [Ramp vs Brex] had very few clicks (≤6 total across all tasks).

🧠 Why it matters:

Here’s what’s interesting…

In the AI Overviews usability study, we found desktop users click out ~10.6% of the time compared to practically 0% in AI Mode.

However, AIOs have organic search results and SERP Features below them. (People click out less in AIOs but they click on organic results and SERP features more often.)

Zero-clicks

  • AI Overviews: 93%*

  • AI Mode: ~100%

*Keep in mind that participants of the AIO usability study clicked on regular organic search results. The 93% relates to zero clicks within the AI Overview.

On desktop, AI Mode produces roughly double the in-panel clickouts compared to the AIO panel. On AIO SERPs, total clickouts can still happen via organic results below the panel, so the page-level rate will sit between the AIO-panel figure and the classic baseline.

An important note here from Eric Van Buskirk, the director of this study: When comparing the AI Mode and AI Overview studies, we’re not exactly comparing apples to apples. In this study, participants were given tasks that would prompt them to leave AI Mode in 2 of the 7 questions, and that accounts for the majority of outbound clicks (which was fewer than 3 external clicks). On the other hand, for the AIO study, the most transactional question was “Find a portable charger for phones under $15. Search as you typically would.” They were not told to “put it in a shopping cart.” However, the insights gathered regarding user behavior from this AI Mode study - and the pattern that users don’t feel the need to click out of AI Mode to make additional decisions - still stand as solid findings.

💡Insight:

The bigger picture here is that AIOs are like a fact sheet that steers users to sites eventually, but AI Mode is a closed experience that rarely has users clicking out.

What makes AI Mode (and ChatGPT, by the way) tricky is when users abandon the experience and go directly to websites. It messes with attribution models and our ability to understand what influences conversions.


3/ AI Mode matches site type with intent

In the study, we assessed what types of sites AI Mode showed for our 7 tasks.

The types are:

  • Brands: Sellers / vendors

  • Marketplaces: amazon.com, ebay.com, walmart.com, homedepot.com, bestbuy.com, target.com, rei.com

  • Review sites: nerdwallet.com, pcmag.com, zdnet.com, nymag.com, usatoday.com, businessinsider.com

  • Publishers: nytimes.com, nbcnews.com, youtube.com, thespruce.com

  • Platform: Google

📊 Key stats:

Shopping prompts route to product pages:

  • Canvas Bag: 93% of exits go to Brand + Marketplace.

  • Tidy desk cables: 68% go to Brand + Marketplace, with a visible Publisher slice.

Comparisons route to reviews:

  • Ramp vs Brex: 83% Review.

  • Oura vs Apple Watch: split 50% Brand and 50% Marketplace.

When the user has to perform a reputation check, the result is split brand and publishers:

  • Liquid Death: 56% Brand, 44% Publisher.

Google itself shows up on shopping tasks:

  • Store lookups to business.google.com appear on Canvas Bag (7%) and Tidy desk cables (11%).

Check out the top clicked domains by task:

  • Canvas Bag: llbean.com, ebay.com, rticoutdoors.com, business.google.com

  • Tidy desk cables: walmart.com, amazon.com, homedepot.com

  • Subscription language apps vs free: pcmag.com, nytimes.com, usatoday.com

  • Bottled Water (Liquid Death): reddit.com, liquiddeath.com, youtube.com

  • Ramp vs Brex: nerdwallet.com, kruzeconsulting.com, airwallex.com

  • Oura Ring 3 vs Apple Watch 9: ouraring.com, zdnet.com

  • VR arcade or smart home: sandboxvr.com, business.google.com, yodobashi.com

🧠 Why it matters:

Companies need to understand the playing field. While classic SEO allowed basically any site to be visible for any user intent, AI Mode has strict rules:

  • Brands beat marketplaces when users know what product they want.

  • Marketplaces are preferred when options are broad or generic.

  • Review sites appear for comparisons.

  • Opinions highlight Reddit and publishers.

  • Google itself is most visible for local intent, and sometimes shopping.

💡Insight:

As SEOs, we need to consider how Google classifies our site based on its page templates, reputation, and user engagement. But most importantly, we need to monitor prompts in AI Mode and look at the site mix to understand where we can play.

Sites can’t and won’t be visible for all types of queries in a topic anymore; you’ll need to filter your strategy by the intent that aligns with your site type because AI Mode only shows certain sites (like review sites or brands) for specific types of intent.

4/ Product previews act like mini PDPs

📊 Key stats:

Product previews show up in about 25% of the AI Mode sessions, get ~9 seconds of attention, and people usually open only one.

Then? 45% stop there. Many opens are quick spec checks, not a clickout.

You can easily see how some product recommendations by AI Mode and on-site experiences are quite frustrating to users.

The post-click experience is critical: classic best practices like reviews have a big impact on making the most out of the few clicks we still get.

See this example:

“It looks like it has a lot of positive reviews. That’s one thing I would look at if I was going to buy this bag. So this would be the one I would choose.”

🧠 Why it matters:

In shopping tasks, we found that brand sites take the majority of exits.

In comparison tasks, we discovered that review sites dominate. For reputation checks (like a prompt for [Liquid Death]), exits to brands and publishers were split.

  • For transactional intent prompts: Brands absorb most exits when the task is to buy one item now. [Canvas Bag] shows a strong tilt to brand PDPs.

  • For reputation intent prompts: Brand sites appear alongside publishers. A prompt for [Liquid Death] splits between liquiddeath.com and Reddit/YouTube/Eater.

  • For comparison prompts: Brands take a back seat. [Ramp vs Brex] exits go mostly to review sites like NerdWallet and Kruze.

💡Insight:

Given users can now directly check out on ChatGPT and AI Mode, shopping-related tasks might send even fewer clicks out. [2, 3]

Therefore, AI Mode becomes a completely closed experience where even shopping intent is fulfilled right in the app.

AI Mode’s story isn’t finished yet: Where to go from here

Clicks are scarce. Influence is plentiful.

The data gives us a reality check: If users continue to adopt the new way of Googling, AI Mode will reshape search behavior in ways SEOs can’t afford to ignore.

  • Strategy shifts from “get the click” to “earn the citation.”

  • Comparisons are for trust, not traffic. They reduce exits because users feel informed inside the panel.

  • Merchants should optimize for decisive exits. Give prices, availability, and proof above the fold to convert the few exits you do get.

You’ll need to earn citations that answer the task, then win the few, high-intent exits that remain.

But our study doesn’t end here.

Today’s results reveal core insights into how people interact with AI Mode. We’ll unpack more to consider with Part 2 dropping next week.

👇👇👇But for those who love to dig into details, the methodology of the study is included below. 👇👇👇

And for premium subscribers, I’m including my AI Mode plays and practical takeaways that I developed after completing and reviewing the study. (You can find this under the methodology section - and it’ll save you that extra thinking work of tying all this new information to tactics.)

Next week, you’ll get a ready-for-you slide deck template to present findings to your stakeholders during your next meeting.

Methodology

Study design and objective

We conducted a mixed-methods usability study to quantify how Google’s new AI Mode changes searcher behavior. Each participant completed 7 live Google search prompts via the AI Mode feature. This design allows us to observe both the mechanics of interaction (scrolls, clicks, dwell, trust) and the qualitative reasoning participants voiced while completing tasks.

The tasks:

  1. What do people say about Liquid Death, the beverage company? Do their drinks appeal to you?

  2. Imagine you’re going to buy a sleep tracker and the only two available are the Oura Ring 3 or the Apple Watch 9. Which would you choose, and why?

  3. You’re getting insights about the perks of a Ramp credit card vs. a Brex Card for small businesses. Which one seems better? What would make a business switch from another card: fee detail, eligibility fine print, or rewards?

  4. In the “Ask anything” box in AI Mode, enter “Help me purchase a waterproof canvas bag.” Select one that best fits your needs and that you would buy (for example, a camera bag, tote bag, duffel bag, etc.).

    1. Proceed to the seller’s page. Click to add to the shopping cart and complete this task without going further.

  5. Compare subscription language apps to free language apps. Would you pay, and in what situation? Which product would you choose?

  6. Suppose you are visiting a friend in a large city and want to go to either: 1. A virtual reality arcade OR 2. A smart home showroom. What’s the name of the city you’re visiting?

  7. 1. Suppose you work at a small desk and your cables are a mess. 2. In the “Ask anything” box in AI Mode, enter: “The device cables are cluttering up my desk space. What can I buy today to help?” 3. Then choose the one product you think would be the best solution. Put it in the shopping cart on the external website and end this task.

Participants and recruitment

37 English-speaking U.S. adults were recruited via Prolific between August 20 and September 1, 2025 (including participants in a small group who did pilot studies).*

Eligibility required a ≥ 95% Prolific approval rate, a Chromium-based browser, and a functioning microphone. Participants visited AI Mode and performed tasks remotely via their desktop computer; invalid sessions were excluded for technical failure or non-compliance. The final dataset contains over 250 valid task records across 37 participants.

*Pilot studies are conducted first in remote usability testing to identify and fix technical issues—like screen-sharing, task setup, or recording problems—before the main study begins. They help refine task wording, timing, and instructions to ensure participants interpret them correctly. Most importantly, pilot sessions confirm that the data collected will actually answer the research questions and that the methodology works smoothly in a real-world remote setting.

Task protocol

Sessions ran in UXtweak’s Remote unmoderated mode. Participants read a task prompt, clicked to Google.com/aimode, prompted AI Mode, and spoke their thoughts aloud while interacting with AI Mode. They were given the following directions: “Think aloud and briefly explain what draws your attention as you review the information. Speak aloud and hover your mouse to indicate where you find the information you are looking for.” Each participant completed 7 task types designed to cover diverse intent categories, including comparison, transactional, and informational scenarios.

Capture stack

UXtweak recorded full-screen video, cursor paths, scroll events, and audio. Sessions averaged 20–25 minutes. Incentives were competitive. Raw recordings, transcripts, and event logs were exported for coding and analysis.

Annotation procedure

Three trained coders reviewed each video in parallel. A row was logged for UI elements that held attention for ~5 seconds or longer. Variables captured included:

  • Structural: Fields describing the setup, metadata, or structure of the study — not user behavior. These include data like participant ID, task ID, device, query, the order of UI elements clicked or visited during the task, the type of site clicked (e.g., social, community, brand, platform), the domain name of the external site visited, and more.

  • Feature: Fields describing UI elements or interface components that appeared or were available to the participant. Examples include UI element type, such as shopping carousels, merchant cards, right panel, link icons, map embed, local pack, GMB card, and merchant packs.

  • Engagement: Fields that capture active user interaction, attention, or time investment. Includes reading and attention, chat and question behavior, along with click and interaction behavior.

  • Outcome: Fields representing user results, annotator evaluations, or interpretation of behavior. Examples include annotator comments, effort rating, and where info was found.

Coders also marked qualitative themes (e.g., “speed,” “skepticism,” “trust in citations”) to support RAG-based retrieval. The research director spot-checked ~10% of videos to validate consistency.
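
To make that schema concrete, here’s a hypothetical illustration of what a single coded task row could look like. The field names and values are assumptions for readability, not our exact annotation schema.

```python
# Hypothetical example of one coded task row (illustrative only).
annotation = {
    # Structural
    "participant_id": "P014",
    "task_id": "canvas_bag",
    "device": "desktop",
    "external_domain": "llbean.com",
    "site_type": "brand",
    # Feature
    "ui_element": "merchant_card",
    # Engagement
    "dwell_seconds": 62,
    "external_clicks": 1,
    "scroll_depth_pct": 70,
    # Outcome
    "effort_rating": 2,
    "annotator_comment": "Checked reviews in the preview before clicking out.",
    "themes": ["trust in citations", "speed"],
}
```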

Data processing and metrics

Annotations were exported to Python/pandas 2.2. Placeholder codes (‘999=Not Applicable’, ‘998=Not Observable’) were removed, and categorical variables (e.g., appearances, clicks, sentiment) were normalized. Dwell times and other time metrics were trimmed for extreme outliers. After cleaning, ~250 valid task-level rows remained.
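
For the technically curious, here’s a minimal sketch of what that cleaning step looks like in pandas, assuming an exported CSV and illustrative column names (not our exact schema):

```python
import pandas as pd

# Load the exported annotations (file and column names are illustrative).
df = pd.read_csv("ai_mode_annotations.csv")

# Placeholder codes marked fields that were not applicable / not observable.
df = df.replace({999: pd.NA, 998: pd.NA})

# Normalize a categorical field such as sentiment labels.
df["sentiment"] = df["sentiment"].str.strip().str.lower()

# Trim extreme dwell-time outliers (here: anything beyond the 99th percentile).
cap = df["dwell_seconds"].quantile(0.99)
df = df[df["dwell_seconds"].le(cap) | df["dwell_seconds"].isna()]
```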

Our retrieval-augmented generation (RAG) pipeline enabled three stages of analysis:

  • Data readiness (ingestion): We flattened every participant’s seven tasks into individual rows, cleaned coded values, and standardized time, click, and other metrics. Transcripts were retained so that structured data (such as dwell time) could be associated with what users actually said. Goal: create a clean, unified dataset that connects behavior with reasoning.

  • Relevance filtering (retrieval): We used structured fields and annotations to isolate patterns, such as users who left AI Mode, clicked a merchant card, or showed hesitation. We then searched the transcripts for themes such as trust, convenience, or frustration. Goal: combine behavior and sentiment to reveal real user intent.

  • Interpretation (quant + qual synthesis): For each group, we calculated descriptive stats (dwell, clicks, trust) and paired them with transcript evidence. That’s how we surfaced insights like: “external-site tasks showed higher satisfaction but more CTA confusion.” Goal: link what people did with what they felt inside AI Mode.

This pipeline allowed us to query the dataset hyperspecifically — e.g., “all participants who scrolled >50% in AI Mode but expressed distrust” — and link quantitative outcomes with qualitative reasoning.

In plain terms: We can pull up just the right group of participants or moments, like “all the people who didn’t trust AIO” or “everyone who scrolled more than 50%.”
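
As a rough sketch of that relevance-filtering step, the query above boils down to pairing a structured behavioral filter with a transcript keyword match. The column names and distrust terms below are assumptions for illustration:

```python
# Combine structured behavior with transcript keywords (illustrative names).
distrust_terms = ["don't trust", "not sure", "skeptical", "sketchy"]
pattern = "|".join(distrust_terms)

scrolled_deep = df["scroll_depth_pct"] > 50
voiced_distrust = df["transcript"].str.lower().str.contains(pattern, na=False)

segment = df[scrolled_deep & voiced_distrust]
print(len(segment), "tasks where users scrolled >50% but expressed distrust")
```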

Statistical analysis

We summarized user behavior using descriptive and inferential statistics across 250 valid task records. Each metric included the count, mean, median, standard deviation, standard error, and 95% confidence interval. Categorical outcomes, such as whether participants left AI Mode or clicked a merchant card, were reported as proportions.

Analyses covered more than 50 structured and behavioral fields — from device type and dwell time to UI interactions and sentiment. Confidence measures were derived from a structured (JSON) analysis of user sentiment across all participant transcripts.
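
Here’s a minimal sketch of the per-metric summary described above (count, mean, median, standard deviation, standard error, and a 95% confidence interval), using an assumed dwell-time column:

```python
import numpy as np

# Summarize one continuous metric ("dwell_seconds" is an assumed column name).
x = df["dwell_seconds"].dropna()
n = len(x)
mean, median, sd = x.mean(), x.median(), x.std(ddof=1)
se = sd / np.sqrt(n)
ci_low, ci_high = mean - 1.96 * se, mean + 1.96 * se

print(f"n={n}, mean={mean:.1f}s, median={median:.1f}s, "
      f"95% CI [{ci_low:.1f}, {ci_high:.1f}]")
```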

Reliability and power

Each task was annotated by a trained coder and spot-checked for consistency across annotators. Coder-level distributions were compared to confirm stable labeling patterns and internal consistency.

Thirty-seven participants completed seven tasks each, resulting in approximately 250 valid tasks. At that scale, proportions around fifty percent carry a margin of error of about six percentage points, giving the dataset enough precision to detect meaningful directional differences.
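
As a back-of-the-envelope check on that figure, the standard margin-of-error formula for a proportion near 50% over roughly 250 tasks works out to about six percentage points:

```python
import math

# Margin of error for a proportion near 50% with ~250 valid tasks.
p, n = 0.5, 250
moe = 1.96 * math.sqrt(p * (1 - p) / n)
print(f"±{moe:.1%}")  # ≈ ±6.2 percentage points
```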

Limitations

The sample size is smaller than in our AI Overviews study (37 vs. 69 participants), and the study is meant to reflect U.S.-based users (all participants were living in the U.S.). All queries took place within AI Mode, meaning we did not directly compare AI vs. non-AI conditions. Think-aloud protocols may inflate dwell times slightly. RAG-driven coding is only as strong as its annotation inputs, though heavy spot-checks confirmed reliability.

Ethical compliance

Participants gave informed consent. Recordings were encrypted and anonymized; no personally identifying data were retained. The study conforms to Prolific’s ethics policy and UXtweak TOS.

More knowledge, more power: AI Mode plays and practical takeaways

Our usability study gives us rich insights into how we can influence buyers in AI Mode.

There are 3 things you need to focus on to create influential content:

  1. Get ready to abandon clicks and know your place of influence

  2. Be ready for a quick glimpse + bounce: Aggressively optimize your landing pages

  3. Don’t expect to dominate across all intent types (unfortunate, I know)

Below, I’ll dig into 4 core plays based on the findings above.

🔎 Practical takeaway #1: Get ready to abandon clicks - AI Mode is a game of influence

Influence is harder to measure than clicks, but we can track our presence rate, citation quality, and long-term sign-ups.
