We tracked 70 users across 8 tasks to deeply understand their behavior, thoughts, and emotions when engaging with Google's AI Overviews and other SERP features. The findings paint a new picture of SEO.
Masterclass of a post.
Seems like a must-read for any growth marketer. Thank you, Kevin, for this gold mine of insights.
I watched the video and skimmed through, but I'm definitely booking a good hour tomorrow to go through it in detail. More comments to follow.
Thanks, mate!
Damn, I love this. It’s highly relevant to the Google engineer’s comment on the DOJ case: “When search results are worse, people attempt fewer tasks. When they're better, they attempt more.”
To me, this highlights the need for a framework that differentiates task completion vs task expansion queries.
This approach can better align brand perception (EEAT), content type, format, site architecture, SEO best practices, UX, CRO, a new flavor of off-site/barnacle SEO (third-party and social validation paths), and metrics (including brand sentiment and customer happiness scores).
It’s not a new concept, but the application and medium are evolving as user behavior shifts alongside AI-driven experiences.
Secondly, are we building a generation of answer-dependent users, or simply refining the art of finding truth?
Hopefully, it’s the latter. With thoughtful design and user-centric experiences, the ecosystem can still make the web a better place, even as the landscape shifts and traditional visibility declines. I love where SEO is going, thank you and the team for sharing this.
On point!
The Kevin Indig value!!! 🔥
And all for free?!?! He never stops 🎉
🙏🙏🙏
This is excellent. Thanks for putting it together. Loads of insights to shamelessly steal (I mean, strategically summarise) for clients.
If you’re open to it, I had a few follow-up questions:
1. In my experience, one challenge with remote usability testing is that participants often engage more deeply than they would in an organic setting (demand characteristics, and all that).
Do you think your testers might have paid more attention to AIOs than typical users would in the wild, possibly making the CTR drop-offs here lower than what might happen naturally? Or does this roughly align with the larger-scale traffic patterns you’ve seen?
2. How much variance did you notice in scroll depth or click-out rates across categories? Did user behaviour tend to cluster, or did it vary widely by query type? And did you categorise queries in any more granular ways that could be used to stratify the data, beyond health/DIY (e.g., by query structure, industry, etc.)?
Essentially I'm wondering how likely it is for scroll depth to vary significantly across industries and query types.
3. That section on users clicking through to social platforms to validate AIOs is fascinating—30% on desktop is higher than I’d have guessed. When you say a user clicked through to Reddit or other forums, was that specifically through the “Discussions and forums” SERP feature, or did it also include clicks on organic listings or search refinements?
Sorry if any of those are answered in the write-up and I missed it. Really appreciate the work you’ve done here.
Haha, my pleasure!
Regarding your questions:
1/ No, we accounted for that by mixing the questions with queries that don't return AIOs.
2/ User behavior varied by query type. For example, finding a coupon and comparing drug side effects were very different: for the former, users basically skim; for the latter, they really take their time. The variance is there for sure (see mean vs. median; a quick toy illustration below).
3/ It varied, but users sought out Reddit either way. I will say the 30% is also an average that varies by query stakes and context.
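To make the mean-vs-median point a bit more concrete, here is a toy sketch in Python; the session times are made-up numbers for illustration, not figures from the study.

```python
# Toy illustration of the mean-vs-median gap: when a handful of long, careful
# sessions pull the average well above the median, that gap is the skew.
# These numbers are made up for illustration, not figures from the study.
from statistics import mean, median

# Hypothetical time-on-SERP in seconds for two query types
coupon_sessions = [6, 8, 7, 9, 5, 10, 8]            # quick skims
side_effect_sessions = [40, 55, 35, 240, 50, 300]   # a few very long, careful sessions

for label, times in [("coupon", coupon_sessions),
                     ("drug side effects", side_effect_sessions)]:
    print(f"{label}: mean={mean(times):.0f}s, median={median(times):.0f}s")
```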
All good! Happy to clarify, of course :).
I remember we chatted about these things a few months back, last year. Great to see what you have done here. This piece of work made me a paying subscriber to the Pro plan.
🙇🏻♂️🙇🏻♂️🙇🏻♂️
Fantastic, Kevin. This one made me a subscriber.
Aw, huge! Thank you, Ralf :)
Absolutely fascinating. Thanks for putting this together.
It's always good to see research back up your intuition. I so rarely read the full AI response, whether an AIO or a chatbot: I read the first few lines and maybe scan some more of the content.
It is also interesting to see how few clicks the AIO citations get, and that more often than not users were still clicking on organic results.
I realise this is qualitative research, but I wonder how many of these metrics could be measured at scale.
Thanks! Check out my previous posts with lots of aggregated quant data.
You're right that people still click organic results, but I will also say that AIOs decrease clicks significantly. Both are true at the same time.
Outstanding, Kevin. Unparalleled insights. Insanely valuable for businesses and SEOs, IMO.
Thank you so much!
Well, this is an amazing analysis/write-up, Kevin. Upgrading to paid 👏
Thanks so much, Tyler!
Finally some real user insight behind all the speculation around AIOs. That 2/3 drop in desktop clicks is wild. Question - with visibility becoming the new currency (instead of clicks), how should we start thinking about measuring content performance going forward?
Thanks!
I think the new measurement models still need to be built. My take is that we need to measure visibility, for example as share of voice, and find a way to track quality impressions. I'll publish more on that shortly :).
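A minimal sketch of what that could look like in practice, assuming you already track AIO citations and organic positions per query; the field names, weighting scheme, and sample numbers are illustrative assumptions, not a finished measurement model:

```python
# A minimal sketch of one way to compute a volume-weighted share-of-voice metric.
# The field names, the weighting scheme, and the sample data are illustrative
# assumptions, not a definitive measurement model.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class SerpSnapshot:
    query: str
    monthly_volume: int               # search volume used as the weight
    cited_in_aio: bool                # is the brand cited in the AI Overview?
    organic_position: Optional[int]   # brand's organic rank, None if not ranking

def visibility_score(snap: SerpSnapshot) -> float:
    """Crude per-query visibility: full credit for an AIO citation,
    decaying credit for organic positions 1-10, capped at 1.0."""
    score = 1.0 if snap.cited_in_aio else 0.0
    if snap.organic_position is not None and snap.organic_position <= 10:
        score += (11 - snap.organic_position) / 10  # 1.0 at #1 down to 0.1 at #10
    return min(score, 1.0)

def share_of_voice(snapshots: List[SerpSnapshot]) -> float:
    """Volume-weighted share of voice across a tracked query set (0.0-1.0)."""
    total = sum(s.monthly_volume for s in snapshots)
    if total == 0:
        return 0.0
    weighted = sum(visibility_score(s) * s.monthly_volume for s in snapshots)
    return weighted / total

if __name__ == "__main__":
    sample = [
        SerpSnapshot("crm software", 12000, cited_in_aio=True, organic_position=3),
        SerpSnapshot("best crm for startups", 4000, cited_in_aio=False, organic_position=7),
        SerpSnapshot("crm pricing comparison", 2500, cited_in_aio=False, organic_position=None),
    ]
    print(f"Share of voice: {share_of_voice(sample):.1%}")
```

Share of voice here is simply visibility weighted by demand; quality impressions would need an additional signal that this sketch deliberately does not attempt to model.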