Chapters
00:00 Introduction to AI and E-Commerce
02:19 Josh Silverbauer's Background in Analytics
04:51 The Concept of the New Data Layer
07:13 Shifts in Analytics with AI
13:59 Understanding User Sessions in AI Commerce
22:42 Measuring AI-Driven Traffic and Conversions
24:47 Microsoft Clarity and AI Insights
27:27 The Disconnect Between GA4 and AI
27:51 Optimizing Data Layers for E-Commerce
32:33 Server-Side Tracking: Benefits and Challenges
37:17 The Importance of Proper Web Structure
43:05 Lessons from GA4 Migration
46:29 Continuous Learning in Analytics
Katrin (00:00)
Welcome to Knowledge Distillation, where we explore the rise of the AI analyst. I'm your host, Katrin Ribant, CEO and founder of Ask-Y. This is episode nine. We're continuing our deep dive into the agentic e-commerce era, and specifically today, we'll be talking about the new data layer. In episode eight, Sani Manic showed us
how to make websites readable by AI agents: the accessibility trees, the semantic HTML, the structured data that lets agents actually see and navigate your site. Basically, the way he coined it, making boring sexy again. So making things right, just having proper structure, calling a button a button, all of these good things. Obviously, there's a little bit more to it than that, but that's really the basics.
Today we're tackling the next question. Once a human initiates a product search in an LLM, how do you know if the LLM found your site, read it,
if your product made it into the context window? By the way, there's a filthy video about that that Sani and I did, also with Kelly Wortham. It's the Hollywood special, for those of you who want to watch it. It basically explains that whole mechanics and how to structure your data in order to get into the context window and be caught by the attention mechanisms. So we're not going to go into that now. But the question is, how do you know if the LLM found your site and read it?
If your product made it into the context window, if an agent buys on your site, how do you know that it's not a human? How do you attribute it? How do you even define a conversion? So basically, how do you actually go about this new data layer that we have to create for this new era of commerce? So to talk about this today, I am very, very happy to welcome fellow analyst, singer, dancer,
and MeasureCamp New York organizer. We have a lot in common. Josh Silverbauer, Head of Analytics and CRO and partner at From the Future. Josh, I usually make the intros, but with the amount of things you do, can you please tell us about all of it and what exactly you focus on in analytics?
Josh (02:19)
I'm mostly, well, hi everybody, I'm Josh Silverbauer. I have a broken hand, so you get to see three of my fingers and not two. But I can still type and I can still use the phone, so I'm still A-OK when it comes to work. Completely functional.
Katrin (02:32)
You're completely functional.
Josh (02:37)
So yes, like Katrin said, I am Josh Silverbauer. I mainly focus on the analytics space in terms of implementation. I'm in the Google Tag Manager space, technical implementation, working with companies to
architect their data and make sure that they're getting value out of what is happening on a website: understanding what people are doing on websites and now, you know, understanding what activity is happening on their websites, right? And then, like you said, I weirdly combine that stuff with entertainment to make the other side of my brain happy. I started doing that a few years ago,
combining music, comedy, and whatever else I could do, because I also really enjoy creativity and I always want to make that part of whatever I'm doing. So combining them has been a really interesting outlet for me.
Katrin (03:43)
And so for the people in the audience who don't know about this, you actually wrote an opera, a rock opera about the cookie.
Josh (03:52)
Yes, well, so it's loosely based on Universal Analytics Sunsetting. When that happened, I thought it would be really funny to write a song called Universal Sunset, which was about, you know, yeah, it was about a person at the time. It was not about an alien. It was just about a person who was losing their universe and they had to go find a new universe. And that was like the concept of that song because we were all losing our universes at that time. And
Katrin (04:06)
It's good title.
Yeah.
Josh (04:22)
I wanted to kind of pull from my experience of being not sure where my career is going to go, not sure what I'm doing next, right? I have to learn this whole new thing. So I wrote that song and I got a very good reception on it. And so I made it into a full album, and it is about an alien named Cookie who has to go off and find a new universe after his universe is sunset, essentially. And it's about the adventures that he goes on to try to find his new universe.
Katrin (04:51)
I highly recommend it, it's very funny. I mean, I don't know about if you're not in data, but if you're in data, it's really very funny.
Josh (04:57)
Yeah, and if you're not in data, you will still
enjoy it. It's just, you might not get some of the jokes.
Katrin (05:02)
Yeah, probably.
So after talking to Sani in episode 8, it sort of dawned on me. It really should have dawned on me earlier. I don't know why, but I was like, oh, wow, big discovery: OpenAI's Instant Checkout isn't just a new feature.
It's actually forcing a workstream cycle that's comparable in scale, in my opinion, to three major previous shifts combined. I see it as combining part of the mobile shift, the mobile re-platforming, because it's really about users accessing the information very differently, with different behavior.
We have humans and non-humans. I mean, that feels like a significant segmentation. The other part is kind of the Amazon channel shift in business model. When Amazon became a big thing, brands had to come down and contend with that new distribution channel, which forces specific constraints in terms of information accessibility,
selection obviously, but also relationship ownership. I feel like that's a big deal in this one. And then, as you just mentioned, the Universal sunset, so that migration, at least in scale. Well, the thing is,
it feels like the same thing, but with no playbook. The metrics, the models, the infrastructure, all of that needs to be invented. It's not like we have a goal. We're not retagging with something that is already there. We have to think about all of this data layer differently and anew, for a type of user that we have really no experience with.
GA4 really was a migration; agentic measurement is a construction. So there's no playbook. And I'm wondering, since you are really anchored in the creation of the data layer, does that resonate with what you're seeing today?
Josh (07:13)
Yeah, I mean, it's interesting, right? You kind of hit the nail on the head: there have been these shifts over time, right? Where, you know,
say you live in SEO, right? To a degree, you've always had this dichotomy between optimizing for a human and optimizing for a bot, right? And then how to measure that, right? How are you going to determine that the bots are essentially responding to your website, for instance? And so that, mixed with the mentality around GA4, where
it was very easy before with Universal Analytics to get away with not having super-structured data. You really could just throw... Yeah, yeah, sure. Yeah, so Universal Analytics was laid out in a way, like the UI was easy enough, where you were able to get
Katrin (08:02)
Can you explain why, for the people who are not so familiar with it?
Josh (08:20)
access to reports and session-level data in a way that was accessible enough. You could build essentially, like,
you know, the way that the architecture worked is you had three different event fields: event category, action, and label. You could drill down easily into those three. It was very simple and contained, right? And you would have your main goals. And then everything was in Universal Analytics and you could get all the data in Universal Analytics. It allowed you to see most of what was going on. It didn't, like, harbor, it didn't...
Katrin (08:53)
Mm-hmm
Josh (09:02)
You know, there was little segmentation. There were few nuances to it, right? Everything was last touch. So no one had to ask questions, right? Even though you should be asking questions. It's like, you know, everybody's used to living in their own little house, right?
But if you start to say, well, your house is actually in this city, and the city's in a state, and the state's in a country, and this country is in a universe, and you have to understand the entire universe, because where you're living right now is only one very tiny bit. Look how big it actually is. And you're like, I can't compute, because this is my reality. That was Universal Analytics to GA4, right? GA4 was like, okay, we are going to give you first touch, last touch, and data-driven: three different
Katrin (09:45)
Mm-hmm.
Josh (09:52)
attribution models. Most people didn't even understand what an attribution model was. They just understood, these are where my purchases came from. Right. And then on top of that, here's all of this... I used to explain it, when we were doing the transitions, like this: Universal Analytics was like an automatic car that went 60 miles per hour. Right. And people were like, cool.
I don't really have to learn that much. I just learn how to drive and then I get to where I'm going. It's not the fastest thing in the world, but fine. But then when they went to GA4, they're like, this is an amazing car. This goes 90 miles per hour. You're going to get there so fast. And then you would say, that sounds great. And then you get a box in the mail that says "car", right? And you're like, wait, I have to build this? And then they're like, yeah, you have to build it, but it goes 90 miles per hour. And then when you build it,
you still have to learn stick shift, right? So yes, it has the ability to get further, but you have to learn all these different types of tools. You have to structure your data in a way that makes sense. And then, like, you're not going to get...
Katrin (11:01)
And can
you give a simple example of something that you need to structure in GA4 in a very specific way that you could get away with in Universal Analytics?
Josh (11:14)
Sure, so an example of this is, say you want to send, you know, like,
say you want to send, let's go with click data, for instance, right? In Universal Analytics, you could just send click: the event category is click, the event action is, let's say, the click text, and then the event
Katrin (11:36)
Okay?
Josh (11:55)
label is one more drill-down, which would be click URL, right? And you can just easily use a hierarchy: okay, I'm going to drill down into this, and then I'm going to drill down into this, to see what they clicked and where they clicked to, right? With GA4, everything is kind of one-dimensional, centered on the specific event that occurred, right? And so you basically have just click, right? It's just click.
Katrin (12:11)
Mm-hmm.
Josh (12:23)
Right. And then you have the ability to layer on custom dimensions, but nothing in GA4 ties the custom dimension directly to the event that you're sending.
Right? So you have this click event, and then you build these custom dimensions. But you have to have some kind of architecture documentation that says these custom dimensions go with these events. Right? And then on top of that, Google Analytics, like GA4,
Katrin (12:35)
Mm-hmm.
I understand.
Josh (12:56)
like, heavily segments your data if you try to get any type of raw data from GA4 itself. And so then you're kind of expected to bring that data into something like BigQuery, which has even more architecture that you have to learn, which makes it even more important to have some kind of documentation around where everything is and what everything is connected to, because you're going to have to pull that data in SQL, right? So.
Katrin (13:21)
So you really
have to maintain your own semantic layer about your implementation separately, right?
Josh (13:25)
Yeah,
which is a good thing. It's like having a blueprint for the house that you're architecting. You don't want to just build a house without a measurement plan, right? It forced people to ask, what do I care about, as opposed to, let me track everything, because you can't really track everything and then just walk backwards from that now. You have to be very specific about
Katrin (13:47)
Yeah, I mean,
quite frankly, you never really could, right? You maybe technically could, but you really shouldn't have ever, ever have done that, right?
Josh (13:51)
Yeah, you should. Right. It was like auditing 101, right?
Of Universal Analytics, you know?
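To make the click example concrete, here is a minimal sketch of the contrast Josh describes: the fixed category/action/label hierarchy of Universal Analytics versus a flat GA4 event whose custom parameters are only tied to the event by documentation. The `MEASUREMENT_PLAN` dictionary, the parameter names `click_text` and `click_url`, and the helper names are assumptions made for illustration, not part of either product.

```python
# Hypothetical measurement plan: which custom parameters belong to which event.
# GA4 does not enforce this link, so it has to live in documentation or code.
MEASUREMENT_PLAN = {
    "click": {"click_text", "click_url"},
}


def ua_click_event(click_text: str, click_url: str) -> dict:
    """Universal Analytics style: a fixed category > action > label hierarchy."""
    return {
        "event_category": "click",
        "event_action": click_text,
        "event_label": click_url,
    }


def ga4_click_event(click_text: str, click_url: str) -> dict:
    """GA4 style: one event name plus a flat set of parameters."""
    return {
        "name": "click",
        "params": {"click_text": click_text, "click_url": click_url},
    }


def undocumented_params(event: dict) -> set:
    """Return parameters sent with an event that the plan does not list for it."""
    allowed = MEASUREMENT_PLAN.get(event["name"], set())
    return set(event["params"]) - allowed
```

A check like `undocumented_params` is one way to keep the separately maintained semantic layer honest: any parameter that ships without an entry in the plan is flagged before it reaches reporting or BigQuery.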
Katrin (13:59)
So let's start with what I think is the fundamental break in this shift to agentic commerce. Traditional analytics assumes user visits site, browses, converts, roughly. We all know this is not how it works, but basically that's what we assume, more or less. With Instant Checkout, discovery happens in ChatGPT.
Let's just take OpenAI commerce as an example. I know we will talk about Google Universal Protocol after, et cetera. Let's just take OpenAI Instant Checkout. So discovery happens in ChatGPT. The purchase is confirmed off domain, so off domain from the brand. Not always, but it can be. And then the order just appears
in the brand's back end, in the merchant's back end, so in GA4 or Shopify, whatever. So what does that do to the concept of a session? In that semantic layer, I had a certain definition of what a session is. This is a completely different definition of what a session is. And specifically, if I'm an analyst looking at my GA4 data and a customer asked ChatGPT, find me a running shoe for, I don't know, $150, and bought from my store without ever
Josh (15:05)
Yeah.
Katrin (15:16)
How does that appear in my GA4 data today? Does it appear? No, nothing.
Josh (15:22)
No, it does not appear right
now. I mean, you can configure some stuff right now, but everything is server side ⁓ within like the...
Katrin (15:26)
Mm-hmm.
Can you
explain that for people who aren't familiar with that concept?
Josh (15:33)
Yes.
there is no bot that is going and triggering JavaScript on your site itself. Say the first part of that happens, and somebody clicks a link from ChatGPT to your website.
Katrin (15:49)
Mm-hmm.
Josh (16:00)
There are scenarios where you will see utm_source equals ChatGPT. Not all scenarios, some scenarios. And same with Google: there are some scenarios where you can identify that this came from an AI Overview, for instance. It's not all scenarios, though. And so you might see, most likely in your unassigned traffic right now, ChatGPT,
Katrin (16:06)
But not all scenarios.
Mm-hmm.
Josh (16:29)
not set, for instance, right? And if you see that, it's usually somebody who has been served information from ChatGPT, and there's a link there, and they click one of the links, and it has utm_source equals ChatGPT. And so now you have some idea: this is still a human going to my website,
Katrin (16:32)
Mm-hmm.
Josh (16:52)
and they are now purchasing or exploring the website. So that's one. But with Instant Checkout, there is no JavaScript call right now that is triggering any type of actual transaction on your site, right? This is all communicated back and forth via server-side posts
saying that a transaction is occurring. And there are certain server-side actions that you can tie onto. When a transaction happens, you can essentially post back to GA4, if you want to, with: this is a ChatGPT referral, Instant Checkout.
Katrin (17:45)
But you have
to actively build that loop.
Josh (17:48)
Yes. You would have to actively
build that loop. And you wouldn't really get any type of attribution with that. It would just be: this event happened from this. There's no user in that scenario. It's literally just the number of times that happened.
Right? So it'd just be event-based logic: here's an event that happened from this channel. But you can't identify a user. So the alternative is, when that type of action happens, you could send essentially an order confirmation with all of their order details to an email,
Katrin (18:05)
Yes.
Josh (18:32)
for instance, right, because you would be receiving that. And then when they click through to the website, you are picking up the browser activity. So you're forcing that person back into some part of the website journey so that you can get more information than just "this happened", right? Yeah.
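The loop that has to be actively built can be sketched with the GA4 Measurement Protocol, which accepts server-side POSTs of JSON events. This is a minimal sketch under stated assumptions: the custom parameter `checkout_channel` and its value are hypothetical names, and the random `client_id` is only a placeholder, since, as Josh notes, there is no browser and therefore no real user identity in this scenario. The code builds the request but deliberately does not send it.

```python
import json
import uuid
from urllib.parse import urlencode

# GA4 Measurement Protocol collection endpoint.
MP_ENDPOINT = "https://www.google-analytics.com/mp/collect"


def mp_url(measurement_id: str, api_secret: str) -> str:
    """Build the collection URL; both values come from the GA4 property settings."""
    query = urlencode({"measurement_id": measurement_id, "api_secret": api_secret})
    return f"{MP_ENDPOINT}?{query}"


def instant_checkout_hit(order_id: str, value: float, currency: str) -> dict:
    """Build a purchase event for an order that never touched the browser."""
    return {
        # Placeholder identity (assumption): a fresh ID per order means GA4
        # counts events, not journeys, which matches "there's no user".
        "client_id": str(uuid.uuid4()),
        "events": [
            {
                "name": "purchase",
                "params": {
                    "transaction_id": order_id,
                    "value": value,
                    "currency": currency,
                    # Hypothetical custom parameter used to separate this channel.
                    "checkout_channel": "chatgpt_instant_checkout",
                },
            }
        ],
    }


if __name__ == "__main__":
    body = json.dumps(instant_checkout_hit("ORDER-1001", 149.99, "USD"))
    print(mp_url("G-XXXXXXX", "your-api-secret"))
    print(body)
```

In practice the order webhook handler would POST `body` to the URL. Reporting on `checkout_channel` then gives the event count per channel that Josh describes, without any attribution path.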
Katrin (18:53)
Yes.
So that's actually worse than I thought it was. So we have these two distinct kinds of traffic, human and non-human. I mean, we always had bots, right? But these are bots we kind of want. They're sort of gray bots, right? We don't want to just filter them out. We want to actually know about this traffic.
I don't know if you've been confronted with that already or not, but do you think there's any way to identify, through the behavior of the pseudo user ID, that this is not what human behavior would be, because they would, I don't know, be browsing too fast, be looking at too much content, something?
Josh (19:38)
I mean, yeah. So generally, if you look at, for instance, traffic that's in Shopify or traffic that is in GA4, there are usually these specific locations that are servers, essentially, like
Katrin (19:55)
Mm-hmm.
Josh (20:07)
AWS servers are running, yeah, where you can tell that the traffic coming from this specific place is generally bot traffic, right? So that's one way to identify it. You can also look at user agent and see if there's, or like,
Katrin (20:08)
How interesting.
Josh (20:35)
determine if there are certain specific screen sizes or something like that, which you can use to identify specific bot traffic. But yeah, I mean, also engagement, right? And this is where BigQuery comes in, where you have the ability to see timestamps. If something is just going super duper fast through different pages, that's something that you can identify potentially as bot traffic.
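The two signals Josh mentions, user agent and page-to-page speed, can be combined in a small heuristic. This is a sketch, not a detection standard: the thresholds are invented for illustration, and the user-agent markers are the tokens OpenAI publishes for its own crawlers and agents at the time of writing, which should be checked against current vendor documentation. Agents that do not identify themselves will only show up through behavior.

```python
from statistics import median

# Self-declared AI agent tokens (assumption: verify against vendor docs).
AI_AGENT_MARKERS = ("GPTBot", "ChatGPT-User", "OAI-SearchBot")


def is_declared_ai_agent(user_agent: str) -> bool:
    """True when the user-agent string names a known AI agent."""
    return any(marker in user_agent for marker in AI_AGENT_MARKERS)


def looks_automated(page_view_times, min_pages=4, max_median_gap_s=1.0):
    """Flag a session whose page views follow each other implausibly fast.

    page_view_times: event timestamps in seconds, e.g. taken from the
    BigQuery export. Thresholds are illustrative, not calibrated.
    """
    if len(page_view_times) < min_pages:
        return False  # too few events to judge
    ordered = sorted(page_view_times)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return median(gaps) <= max_median_gap_s
```

Using the median gap rather than the minimum keeps one accidental double-click from flagging an otherwise human session.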
Katrin (21:01)
I kind of love this in a way because it feels like being at the beginning of web analytics again, where, you know, we were looking at things and trying to like guess and maybe if I do this and if I squinted it that way, I can recognize this and that kind of this detective work. I like it.
Josh (21:17)
Yeah, it's
weird because we are at a time where there's so much more information, but there's so much less information at the same time, right? We have come around a corner where, if we look at a system like GA4, right, we have segmentation, or we have
canonicalization, we have consent issues, we have all this stuff, so we're getting very partial amounts of information. And then we have all the bot stuff as well. But we also now have AI as a resource to help sift through information. But that doesn't mean that we ourselves feel empowered. We are now relying on machines to analyze machines.
And so us as analysts are kind of, you know, like there is an inherent distrust or not distrust, like a feeling of disempowerment by being in the situation where we can't really identify truth ourselves, you know.
Katrin (22:28)
Yeah,
that's true. So, technically, if a brand came to you tomorrow and said, we want to know how to assess the impact of agentic e-commerce on our online visits, conversions, whatever, how do
we use what we have today, infrastructure-wise, to get an idea of the scale of traffic? And let's imagine, for the sake of argument, this brand is reasonably well implemented in GA4, there are no glaring issues. They have a reasonably documented semantic layer and they know what is what, more or less. Where would you start?
Josh (23:05)
Yeah, I mean, a lot of this is trend-based, right? So you can look over time to determine, with the people who are clicking over to your site from these channels, is that going up? Is it going down? Because you do have that, right? You do have the ability to
Katrin (23:10)
Mm-hmm.
Josh (23:29)
have some level of referrer data or UTM source, you know, AI traffic data, which you can channel-group together to say, is this going up or is this going down? Now, I know that there are some tools that are
working on the impact of querying or citation, let's say, how many times your website is cited in a response from an LLM. And so when it is returned, they're able to pick up that
this website was returned in these results in the LLM. I will not reveal yet which brand is doing that, but that is happening now.
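The channel grouping Josh describes can be approximated with a small classifier over the landing URL and referrer. This is a sketch under assumptions: the list of source fragments is illustrative and incomplete, and the channel names are made up for the example. As the conversation stresses, only some AI-originated clicks carry a UTM parameter or referrer at all, so the rest will still land in direct or unassigned.

```python
from urllib.parse import parse_qs, urlparse

# Illustrative fragments of AI assistant sources (assumption, not exhaustive).
AI_SOURCES = ("chatgpt", "openai", "perplexity", "gemini", "copilot")


def channel_for(landing_url: str, referrer: str = "") -> str:
    """Group a visit into a coarse channel based on utm_source and referrer."""
    query = parse_qs(urlparse(landing_url).query)
    utm_source = query.get("utm_source", [""])[0].lower()
    ref_host = (urlparse(referrer).hostname or "").lower()

    if any(src in utm_source or src in ref_host for src in AI_SOURCES):
        return "ai_referral"
    if utm_source or ref_host:
        return "other"
    # No signal at all: clicks from some AI surfaces end up here too.
    return "direct_or_unassigned"
```

Counting `ai_referral` sessions per week gives the trend line discussed here; it is a lower bound on AI-driven visits, not a complete measure.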
Katrin (24:26)
No, but could you talk about what new tools you are seeing at that data layer level where you operate, tools that are focused either on SEO, GEO, the discoverability, et cetera, or on the specific measurement of this type of traffic? Do you see anything interesting cropping up?
Josh (24:47)
Yeah, I mean, I guess I don't think it's knowledge that I'm not allowed to talk about, but Microsoft Clarity just showed me, I was on an ambassadors' call with them, and because a lot of the
LLMs use Microsoft as essentially a base that they pull from, they're able to get actual data around the actual websites that are returned from queries, even if they're not clicked on, right? Just the citation. So within Clarity, in a few months, we are going to have dashboards that are AI traffic focused or
Katrin (25:07)
Mm-hmm.
Josh (25:33)
AI citation focused, which are supposed to give you a better sense of how many times your website is showing up versus competitors. You can put in competitors and it will show yours against competitors for certain keywords and citations. I think there are companies out there who are trying to do what they can, but at the end of the day, a lot of this stuff is still that black box.
Katrin (25:51)
Mm-hmm.
But I would imagine, conversely to what Clarity is doing, on the Google side they're also going to do something, right? Obviously.
Josh (26:09)
Yeah, Google's gonna do. Yeah, demonize. Right. I mean, but that's
the thing with Google, which has been really weird, and, you know, we'll give them the benefit of the doubt, is that so much of what is happening with
Gemini and data and all that stuff is not yet in GA4. It feels so disconnected: GA4 feels like, this is website data, here's your website data, and then there's all the AI stuff, right? And they're really not yet doing anything that gives us any information. I mean, it's almost the opposite, where with things like PMax, they're just like, trust us, you know, you put in
your budget and we'll make it work, versus having more information. Not to say they won't get there.
But it has been kind of fragmented. You would think that with all the resources and all the money and everything that they have that's being poured into AI development, they would give users a little bit more information around what's happening, right?
Katrin (27:27)
So yeah, I mean, it's actually really interesting to see that Microsoft is on top of that, that Clarity really is on top of that. So, you know, let's go one step further. Brand ABC realizes, after listening to these podcasts and doing all of these good analytics, there is an impact.
Josh (27:36)
Yeah.
Katrin (27:51)
And the impact is significant enough in that agent traffic that they want to kick off a project.
Let's call it something grandiose like agentic analytics, something, something optimization, you know, a cross-department project. They're going to fix the website, they call Sani, they get all of their structure right, they call a button a button, they call a form a form, they've got all of that right. And they're like, okay, well, let's now think about the data layer. So what are
Josh (28:23)
Hmm?
Katrin (28:27)
examples of new questions they would need to answer, and how would they have to change the GA4 or Clarity implementations, or both, that they have?
Josh (28:39)
Yeah, I think the data layer, what is the best information in the data layer, is treating it kind of like a CDP in a way, right? Where you have
customer information, you have segments that that customer is a part of.
The more contextual information that you're putting into a data layer, the more likely it's going to be able to determine that this is essentially relevant for that type of
level data around what is on that page, what categories are on that. Say you have a product, right? What categories do those products fit into? It's really important, if you're thinking about e-commerce, which is kind of what we're talking about, to have as much context around the e-commerce data as possible. So when that view item loads, you have not just, this is this product, but, this product fits into camping,
this product fits into tents, you know, not just, this is a specific type of tent or a tent pole, right? You want to think about, like I was saying before, house into city, into state, into region, into universe, right?
Katrin (30:03)
And you want to
build that, suppose, as a nice tree that's nicely readable by the LLM, right? Yeah.
Josh (30:07)
Yeah, right,
right, right. And like you like the more context that you can provide around, you know, relevancy, the more it will be able to say, this is, you know, this fits this category really, really well. Right. So if I'm looking for, you know, something that's like, hey, I'm going on a trip.
and I'm going to be camping, and I'm trying to understand what I should bring with me, right? Having some kind of architecture that supports or data layer that supports, this isn't camping, this isn't accessory, this is this type of accessory, right? You're much more likely to be ⁓ fed as a result than if you were just like tenting pole, right?
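The "house into city into state" idea maps naturally onto the nested category fields that GA4 e-commerce items support (`item_category` through `item_category5`). Below is a minimal sketch of a `view_item` data layer push built from a category path. The product, SKU, and category names are invented for illustration, and the helper function is hypothetical.

```python
def view_item_event(item_id, item_name, price, currency, category_path):
    """Build a view_item data layer payload with a broad-to-narrow category path.

    category_path: e.g. ["Camping", "Tents", "Accessories", "Tent poles"].
    GA4 items accept up to five category levels.
    """
    if len(category_path) > 5:
        raise ValueError("GA4 items support at most five category levels")

    item = {"item_id": item_id, "item_name": item_name, "price": price}
    for depth, category in enumerate(category_path, start=1):
        # First level is item_category, deeper ones item_category2..5.
        key = "item_category" if depth == 1 else f"item_category{depth}"
        item[key] = category

    return {
        "event": "view_item",
        "ecommerce": {"currency": currency, "value": price, "items": [item]},
    }


if __name__ == "__main__":
    event = view_item_event(
        "SKU-123", "Aluminium tent pole", 19.99, "USD",
        ["Camping", "Tents", "Accessories", "Tent poles"],
    )
    print(event["ecommerce"]["items"][0])
```

Whether an LLM actually reads the data layer is outside what this sketch can show. Its point is only that the same hierarchy can be emitted consistently for analytics and for any structured data published on the page.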
Katrin (30:56)
And I imagine that would also help you then identify some of those agent behaviors, because if you have that type of tree and you can identify whether the tree has been read or not, you can see how fast it's been read. And so that might give you an idea that this session was an agent session as opposed to a human session. Because ultimately, I would imagine the main KPI
you want to know about is how many of my sales come from humans, and how many of my sales come from commerce agents or are related to commerce agents.
Josh (31:35)
Yeah. What you,
what you should be able to do by specifying, within a webhook back, you know, that...
The data is there from a server-side perspective. You should be able to webhook back into something like GA4 how many actual purchases occurred via Instant Checkout. And that will look different than your purchases that happen via the browser. This is this count. This is that count. Now, you won't have a ton of context
or user journey information around Instant Checkout, but the sheer numbers you should be able to see.
Katrin (32:16)
Yeah.
And so that's kind of like a quick fix for that. Would you actually think that moving the whole tracking server side would be sort of the way to go at this point?
Josh (32:33)
I still think that there is value on
Moving everything server-side is
challenging, I think, for a lot of companies to maintain, which is something that you want to consider with any type of setup. If you have a bunch of technical architects who are able to support that, then sure. If you have a company that
is on the smaller side, a mom-and-pop shop trying to get their bearings and selling stuff online, Instant Checkout has the capability to be used, but you're not going to be able to move your entire storefront server-side. It's just, you know.
Katrin (33:23)
No,
that would clearly not be feasible, right? But I'm talking about a larger brand, obviously, right? Otherwise, it just really wouldn't make sense. I would actually argue that the larger you are as a brand, the more this is a serious endeavor, right?
Josh (33:41)
Yeah. mean, it's as you own.
Yeah. I mean, you own the data. You have the ability to store the data separately if you want to. Right. It's more secure. It's less likely to, you know, be...
It's essentially more accurate too, right? It doesn't get blocked as much by arbitrary ad blocker stuff, right? You should still respect consent, but there are some tools out there that just automatically block client-side tags and stuff. So I would definitely suggest, yeah, if you have the architecture and the strength to do that, making everything server-side. I think that's a
valuable thing to do.
Katrin (34:39)
And so let's assume, you know, a brand decides to do that. I hate this subject as much as I think all of us do, but I kind of have to say the word attribution. Yeah. It's hard. Do you think the server-side move would actually have an impact on attribution? Something significant? Yeah.
Josh (34:50)
Yeah.
Yeah, it's hard. I you...
I mean, I think it
We've seen that if you move completely server-side, it does have some weird impact on how much direct you see. I think one of the things is how you are moving it server-side. Are you using GTM to then send to server-side GTM, which will then send it on? Is it being powered server-side from a CDN? The big thing is, like,
Katrin (35:14)
Hmm?
Josh (35:30)
you want to
make sure that you're capturing all the browser details, all the cookie information, because the browser is what sets the cookie. That's where the client ID comes from, and it's what is able to identify a user initially and where they came from. And if you aren't careful when you move to server-side, if you have a homegrown system and you're not doing it through something like server-side
GTM, you want to make sure that you're sending all of the data that the platform needs to be able to identify that this is one user, this is who that user is, and this is where they originally came from. All that cookie data is what's storing where that user came from. So it's really important to also send through the client ID and the session ID. All those things are really important for
it to be able to determine attribution. And if you go completely server-side and you don't send that information, then it just looks like random users are all coming from, essentially, direct.
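To make the point about forwarding identifiers concrete, here is a minimal sketch, not something shown in the episode, of a homegrown server-side sender building an event that carries the browser's client ID and session ID. It assumes the GA4 Measurement Protocol payload shape (`client_id` at the top level, `session_id` as an event parameter) and a `_ga` cookie of the form `GA1.1.<id>.<timestamp>`. The function names and sample values are hypothetical, and nothing is actually sent.

```python
import json
from typing import Optional


def client_id_from_ga_cookie(ga_cookie: str) -> Optional[str]:
    """Return the client ID from a `_ga` cookie value.

    Assumes the common 'GA1.1.1234567890.1700000000' layout, where the
    client ID is the last two dot-separated fields.
    """
    parts = ga_cookie.split(".")
    if len(parts) < 4:
        return None
    return ".".join(parts[-2:])


def build_event_payload(ga_cookie: str, session_id: str,
                        event_name: str, params: Optional[dict] = None) -> dict:
    """Build a Measurement Protocol style body that keeps the user's identity."""
    client_id = client_id_from_ga_cookie(ga_cookie)
    if client_id is None:
        # Without the browser's client ID the hit would look like a brand-new
        # user with no known origin, i.e. it would land in direct.
        raise ValueError("missing or malformed _ga cookie")
    event_params = dict(params or {})
    event_params["session_id"] = session_id            # ties the hit to the browser session
    event_params.setdefault("engagement_time_msec", 1)
    return {"client_id": client_id,
            "events": [{"name": event_name, "params": event_params}]}


# Hypothetical values, as they might be read from the incoming request.
payload = build_event_payload("GA1.1.1234567890.1700000000", "1700000123",
                              "purchase", {"currency": "USD", "value": 49.0})
print(json.dumps(payload, indent=2))
```

The design choice here is to fail loudly when the cookie is missing rather than invent a new ID, since a silently generated ID is exactly what makes traffic collapse into direct.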
Katrin (36:48)
Right. So actually, to tie that together with what we were talking about previously: in my previous scenario, we were just assuming everybody's got a good implementation and did the right work on the website. Obviously, a completely illusory scenario. Let's go back to real life. Look, that's why we have jobs, right? So from an analytics point of view,
Josh (37:11)
Yeah.
Katrin (37:17)
how important is this aspect of proper web structure and a proper data layer, specifically when it comes to identifying agentic commerce? And can you give us something very concrete, an example where a website is badly structured, a button is not identified as a button (I always come back to that, but a better example than that), and somehow that means the data layer cannot be implemented correctly,
whether it's server-side or client-side, doesn't matter, and hence we cannot measure something that we could otherwise measure about agentic commerce? I think, like, simply, how can we make sure that we are grouping agentic traffic into source/medium correctly? I suppose that would probably be a good example.
Josh (38:11)
Yeah. I mean, it's like, just by pulling data in, just by ChatGPT taking, you know, a product, for instance, and showing it to somebody in the chat, just ChatGPT taking that product and showing it, you're going to have very little data around source/medium, right? If you get any at all, right? Unless they click. And if they click,
sometimes you will be able to see that they have referral sources or UTM parameters that
Katrin (38:45)
So
why sometimes yes and sometimes not?
Josh (38:49)
It depends on where it shows up in the chat. If it's part, I think, and I'm not 100% on this, so don't quote me 100% on this.
Katrin (38:57)
Don't
worry. In three weeks, we're going to listen to this episode and we're going to go, we knew nothing at the time. Now we know all of these things. Never mind. Just say what we know today.
Josh (39:04)
Well, I think it's like
when they show you the actual boxes of, like, here's this, this and this, and you click those and they have images and stuff. I believe those are the ones that have a UTM parameter, like UTM source, ChatGPT. And then any link that is just in the text in there does not, right? Like, it's either that or it's flipped. I forget exactly which one.
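Since which links carry UTM parameters is uncertain, one way to check is to look at what actually arrives on the landing page. The following is a small sketch, under the assumption that you can see the landing URL and the referrer for each visit; the function name, the example URLs, and the `utm_source` value are illustrative, not confirmed behavior of any particular LLM.

```python
from urllib.parse import urlparse, parse_qs


def traffic_source(landing_url: str, referrer: str = "") -> tuple:
    """Return (how_identified, source) for a single landing-page hit.

    Prefers an explicit utm_source in the landing URL, then falls back to
    the referrer hostname, and finally to direct when neither is present.
    """
    query = parse_qs(urlparse(landing_url).query)
    utm_source = query.get("utm_source", [""])[0]
    if utm_source:
        return ("utm", utm_source.lower())
    host = urlparse(referrer).hostname or ""
    if host:
        return ("referrer", host.lower())
    return ("none", "(direct)")


# Hypothetical hits: a tagged link, an untagged link with a referrer, and neither.
print(traffic_source("https://shop.example/p/1?utm_source=chatgpt.com"))
print(traffic_source("https://shop.example/p/1", "https://www.perplexity.ai/search"))
print(traffic_source("https://shop.example/p/1"))
```

Logging the first element of the tuple over time would show how often visits arrive tagged versus untagged, which is the "sometimes yes, sometimes not" question.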
Katrin (39:26)
Yeah. Interesting.
Uh-huh. Okay. So something
like that. Yeah.
Josh (39:34)
Yeah,
but like, so it depends on like where it shows up and how it's integrated. And then, um,
If somebody clicks, and they have that, you will be able to see in your source/mediums that they came from ChatGPT, not set. That's what it says. And because it says not set, it gets lumped into unassigned traffic. You'll also see Perplexity, not set. Or you'll see Copilot, not set. And so these LLMs are slightly using these UTM parameters to be like, are people clicking or not?
And when they do that and they click, you do have some level of ability to see, is this happening more and more and more, or is this happening less? And you can take that out of unassigned with channel groupings and put it into something like an AI bucket, and create your own channel grouping that's like, here's AI traffic. But when it's just taking information and showing it, with no click, there's not going to be any type of source/medium there that lets you identify anything.
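The regrouping described above can be sketched outside of any analytics tool. This is a minimal illustration, assuming session rows exported with a source, a default channel, and a session count; the source patterns, the "AI" bucket name, and the numbers are made up for the example and would need to match what actually appears in your own reports.

```python
import re
from collections import Counter

# Hypothetical pattern for sources that should move into an AI bucket.
AI_SOURCE_PATTERN = re.compile(r"chatgpt|perplexity|copilot", re.IGNORECASE)


def channel_for(source: str, default_channel: str) -> str:
    """Reassign a row to 'AI' when its source looks like an LLM, else keep its channel."""
    if AI_SOURCE_PATTERN.search(source or ""):
        return "AI"
    return default_channel


# Made-up export rows; medium is '(not set)' for the LLM sources, so they
# would otherwise sit in Unassigned.
rows = [
    {"source": "chatgpt.com", "medium": "(not set)", "channel": "Unassigned", "sessions": 40},
    {"source": "perplexity", "medium": "(not set)", "channel": "Unassigned", "sessions": 12},
    {"source": "google", "medium": "organic", "channel": "Organic Search", "sessions": 900},
    {"source": "(direct)", "medium": "(none)", "channel": "Direct", "sessions": 300},
]

totals = Counter()
for row in rows:
    totals[channel_for(row["source"], row["channel"])] += row["sessions"]

for channel, sessions in totals.most_common():
    print(channel, sessions)
```

Running the same grouping per week would give the trend line: whether clicks from these sources are growing or shrinking, which is about as much as the data supports today.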
Katrin (40:39)
So this is obviously complex as a work stream for a brand. If you are sort of like, you know, asked to work on a project like that in your ideal world, who on the brand side is involved?
Josh (40:42)
you
Katrin (40:56)
in this. What conversation should they have had before they reach you and they get to the point where they work with you? How does a good process work in your opinion? Let's dream a little bit.
Josh (41:07)
Yeah, I think like the
perfect company is the type of company who knows that right now a lot of it is the unknown and like...
what they're coming into is not necessarily to understand it all, but to understand how to measure something. It's not going to be the complete picture because the complete picture is very much still a black box. However, there are many methods, let's call them, within this to at least determine trends. Ultimately, that's what we're trying to do as analysts.
We're not necessarily trying to know everything. We're trying to determine, are things going in the direction that we feel is a positive direction, and do more of that, right? Or are things not going in a positive direction, right? And so, you know, I think a good company understands that they're not
going to get complete answers, but comes to us as experts who are able to figure out how to provide information around what is happening, what we can do, and what we can show to help them understand whether the efforts that they're making are working. Say, you know, they've implemented an instant checkout. Is that having an impact? Right. And that's really the question. Is that having an impact? Not necessarily, like,
you know, do I understand everything around it, right? Because you're just not going to be able to understand everything around it.
Katrin (42:49)
And to that effect, obviously you've done a fair amount of migration work around GA4 and the Universal Analytics sunset. Any lessons, any experiences you can share, like what not to do, anything that you think would apply to the current situation?
Josh (43:05)
Yeah, I mean, I think what not to do is
confuse every single person by basically giving limited information about, you know, what the platform is. And I mean, that's what Google did. But, you know, what I would say is, so much of the migration process back in that day was confused by the auto migration.
Right? Like, so many people were like, you know, Universal Analytics is just going to transfer over to GA4. And so people took their shit from one place and just moved their shit into a worse place, you know, and it was just so messy. And then people had to pay, you know, so much money to redo it. I mean, the most important thing with any platform migration, just like if you were moving, is to downsize and figure out what you actually
Katrin (43:35)
Oh, yeah.
Yeah.
Josh (44:04)
need, right? What do I actually want in the new house that I'm going to? So how can I plan out what is actually coming with me here, right? And that's, I think, what I would suggest for anybody in any scenario, is really determining that. And same with agentic instant checkout.
What do you need to know? What is valuable? Right? It all comes back to value. What is valuable to understand here? And then how can we best put a pathway to actually getting value? And that's where like, when we were talking about data layer and all this stuff is like, what actually matters? Right? Like don't spend thousands of hours creating the perfect data layer only to realize in six months that LLMs are smart enough to not even
care, you know, so like
Katrin (45:02)
Yeah,
I love that house analogy because this is really like remodeling your house for a new type of user, right? You now have a new type of user and you want to accommodate the house for them. And you kind of need to think about what that user needs and wants, what you want to provide and remodel the house specifically for that, as opposed to just go like, yeah, let's just, let's just.
Josh (45:08)
Yeah. Yeah.
Katrin (45:30)
put a new tool on top of it and fix it with some new tagging or something, right? Yeah. And I imagine that, the same way the Universal Analytics sunset created, you know, a lot of work across the industry, this is also probably going to create a lot of work across the industry. And for analysts specifically, there are...
Josh (45:34)
Yeah, exactly.
Katrin (46:00)
obviously a whole number of things in our skill sets that are applicable to this, the way they're applicable to everything. There are some new things that analysts are going to have to learn. If you were to sort of, you know, advise somebody on what to pay attention to in terms of what to learn to be able to tackle these work streams correctly, you know, in the near future, what would you point at?
Josh (46:29)
Like learning resources, you mean?
Katrin (46:31)
Yeah,
well, you know, like learning things like what you explained about how medium really gets captured, what KPIs are emerging that tend to be best practices. Like, we all learn from each other, right? That's kind of the reason why we're talking today. It's really for me to learn from you. And so, like,
Josh (46:49)
Yeah.
Katrin (46:54)
We are all learning from each other in this. And that's how best practices typically emerge in our industry.
Josh (47:01)
Yeah. I mean, that's
what I would say. Like, keep your feet close to the ground, you know, like, be on LinkedIn, connect with people who are, you know... I did my own research around some of the AI traffic in GA4, but I learned a lot from Dana DiTomaso.
Right? Like, who's posting about it. Right. And, you know, I learned everything that I know, not everything I know, but a lot of what I know, from Simo and from Julian and Julius. Right. Right. Yeah. So, you know, I think the most important thing is just to have your ear to the ground.
Katrin (47:35)
Everybody learned from Simo and Julius, for sure. Yeah.
Josh (47:44)
Like, go to conferences, go and learn from people. And, you know, there's people posting about this stuff that's changing every day. And a lot of it is, you know, they post one day and then it's irrelevant in three, you know, so you can't just be like, cool, I know what it is. Right. That's our life. Which is, you know, what you sign up for. Yeah.
Katrin (48:00)
That's our life. Yeah, that's our life, right? So that's nothing new. Yeah, yeah. I mean, that's
also, in my opinion, in my case anyway, what keeps us in it. It's interesting. It changes. If it didn't change, I wouldn't find it interesting.
Josh (48:14)
Yeah, yep, exactly.
Katrin (48:18)
So yeah, I definitely learned a lot. Thank you. For the analysts and e-commerce teams listening, what's the one thing you would like them to think about and take away, the one takeaway?
Josh (48:21)
Yeah.
Oh, the one takeaway, I would say... Yeah, I guess my one takeaway is, and this is kind of like the genie wishing for more wishes, right? It's kind of a cop-out, but I'm just going to go with, you know, what we said today, which is, like,
Katrin (48:35)
About agentic commerce traffic.
Josh (48:55)
I feel like right now there's very limited information, but people don't like not having information. And also, there is a need at some point, like we're seeing in Gemini, for people to start making money with LLMs. And when people start spending money,
they will need to have information around what that money is doing. And so just because the data is really limited right now does not mean that it's going to be super limited tomorrow.
Katrin (49:25)
Yes.
Josh (49:33)
Because we're really just scratching the surface of how to use these tools to make money. And as that develops and as more people do it and as we see that shift, just like in the time where people moved to mobile, just like in the time where people realized that they could optimize for bots, people started to realize that, we need data around this to determine if we're doing things right or wrong.
Katrin (49:42)
Mm-hmm.
Josh (50:01)
And so I do think that even though it's black boxy right now, we are going to be demanding a shift soon once people start spending a lot of money on it, right?
Katrin (50:14)
Yeah, and I think it'll come fast, way faster than for the other changes, because this one is going way faster than anything we've had before. So, no, I don't think it's a cop-out, it's a great one, thank you. So for people who want to learn more, follow your work, or get help with their analytics implementation or their data layer, where should they go?
Josh (50:17)
It will. Yeah.
I'm gonna have, like, my 18 shout-outs right now. So shout-out number one is following me on LinkedIn. That's where I post a lot of stuff and that's where I'm pretty active. Shout-out number two is the third party show, which is my talk show. It's a comedy musical Tonight Show-style version for digital marketing. So you can hang out with me there. Number three, analytical, my YouTube channel that I do weird stuff on.
And then, of course, my company, From the Future. You can go to that after you go to all the other fun stuff that I do.
Katrin (51:15)
I
will put all the links in the show notes, including episode eight with Sani and the fluffy episode about commerce for agents. So thank you for that. It was great.
And that's it for episode nine of Knowledge Distillation. If today's conversation made you think about how AI is changing data and analytics, visit us at ask-y.ai and try Prism, our platform helping analysts navigate complexity with context. Thanks for listening, and remember: bots won't win, AI analysts will.
Josh (51:28)
Yeah.