
About this episode:

In this episode of Behind the Data, Matthew Stibbe interviews Will Critchlow, founder and CEO of SearchPilot. They discuss the impact of AI on technology and data, the workings of SearchPilot in optimizing SEO for large websites, and the challenges of understanding Google's algorithms. Will shares insights on choosing the right data for SEO analysis, the importance of statistical validity, and the need for effective data storytelling in business. The conversation emphasizes the evolving landscape of data analytics and the critical role of intellectual humility in decision-making.

AI-generated transcript

Matthew Stibbe (00:01.463) Hello and welcome to Behind the Data with CloverDX. This is the first recording of 2025, so happy new year. I am your host Matthew Stibbe from Articulate Marketing. And today I'm talking to Will Critchlow who is founder and CEO at SearchPilot. Great to have you on the show, Will.

Will Critchlow (00:19.556) Thanks for having me on, Matthew, and happy new year to you and, I guess, anyone listening.

Matthew Stibbe (00:23.341) And I was thinking about when I was preparing my notes for this and I was writing CEO at SearchPilot, I accidentally but perhaps deliberately wrote SEO at SearchPilot. And Will, as you will discover as we listen to this, is Mr. SEO. And also in full disclosure, he is also a client of Articulate Marketing. So welcome for that as well.

But before we dive into your world and explore what SearchPilot is and does, let's start with a really practical question. What emerging technology trends or thought processes in the world of data are you particularly excited about as we head into 2025?

Will Critchlow (01:10.276) It's hard to go anywhere other than AI, I think, with this question right now. But I guess to pick a specific element of it, the thing that I'm most excited by is the qualitative change in what computers are capable of. I heard it described the other day as that essentially, if we had magic in the real world, we wouldn't call it magic, we'd call it technology.

Matthew Stibbe (01:39.831) Mm-hmm.

Will Critchlow (01:40.062) And we kind of have magic, right? We make these things out of sand that turn out to kind of think, and without getting into the philosophical side of things, it's certainly kind of magical what they're capable of. And my favorite example of this is the XKCD comic; hopefully we've got a lot of geeks listening to the podcast, and if folks are not familiar with it, it's a top recommendation.

Matthew Stibbe (02:03.703) Big fan.

Will Critchlow (02:09.252) There's one particular comic, it might even just be a single-panel or very short comic from quite a few years ago, that interestingly has not aged particularly well, which talks about how hard it is to tell whether a particular problem is going to be easy or hard for a computer to do if you're a layperson, you know, if you're not a software engineer. It gives a bunch of examples of things that sound really hard but that computers are in fact incredibly good at, and it gives one example of something that at the time was unbelievably beyond the bounds of what computers were capable of, which was: is there a bird in this photograph?

And that was kind of unimaginably hard for a software engineer at the time of writing. And of course, it's the kind of thing that AI is now doing. And so it's that kind of qualitative change. I think the most tangible example of it in a work context is that we've started using meeting recording technology. We use Fathom, but other software is available. And it is capable of not only transcribing the audio to text, but summarizing it into really good bullet-pointed notes and action items. And if you use a framework like MEDDICC in your sales process or whatever, it can even score a conversation against your favorite framework: did you have a deep conversation about the economic buyer or decision criteria or whatever else it might be? And that's making a huge difference, along with the ability to search across those things and ask questions about multiple conversations and so forth.

Matthew Stibbe (03:36.841) I'm seeing all kinds of really interesting applications of it with data. I use UniFi networking from Ubiquiti at home, and they've got a ChatGPT-driven system. You can ask it questions about UniFi, like how much storage capacity do I need for this many cameras? And it kind of works that out, which is almost getting to the point where I'd rather talk to a ChatGPT-driven support system than a human being.

Will Critchlow (04:02.318) Yeah, hard to tell when you are and are not sometimes, but yeah.

Matthew Stibbe (04:06.061) Yeah, well, there is that. And here's a prompt that I came across the other day that I was just fascinated by. If you use ChatGPT a lot and it starts to build a little bit of a knowledge base about you: given what you know about me, tell me some things I might not know about myself. The response to that is slightly spooky, depending on how much it knows about you. Anyway, so.

Will Critchlow (04:28.258) Yeah, a little bit creepy.

Matthew Stibbe (04:32.725) Now let's get on to SearchPilot talking about spooky and exciting. Tell me what SearchPilot does. I know what it does, but for the listeners, what does SearchPilot do? And tell me a bit about your role there.

Will Critchlow (04:44.982) So we're a software platform. We help very large websites, typically e-commerce, retail, those kinds of businesses that make money online. We help them run experiments to figure out what Google will prefer about their website as they make changes. Many people are probably familiar with the kind of experiments that you can run on a website to do user testing. So you can run experiments to see if your users are more likely to buy from you or spend more money or...

whatever else it might be, by splitting those users into cohorts and running different versions of your website for different users. Ours is an analogous approach to see what Google will prefer, and when Google will show you more highly, more visibly, in the organic search results in particular, which is the lion's share of where most online businesses get their new customers in particular and where they make their money.
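For readers who want to picture the mechanics, here is a minimal sketch of the page-bucketing idea described above, assuming a made-up list of product page URLs. It is an illustration, not SearchPilot's actual implementation: pages, rather than users, are randomly assigned to control and variant groups, the change is applied only to the variant group, and organic traffic to the two groups is compared over time.

```python
import random

# Hypothetical list of product page URLs; not SearchPilot's actual data or tooling.
pages = [f"https://example.com/products/{i}" for i in range(1000)]

# Randomly split the *pages* (not the users) into control and variant buckets.
random.seed(42)
shuffled = pages[:]
random.shuffle(shuffled)
midpoint = len(shuffled) // 2
control, variant = shuffled[:midpoint], shuffled[midpoint:]

# The SEO change (say, a new title template) is served only on the variant pages,
# while control pages stay as they are. Organic search visits to each bucket are
# then compared over the test period to estimate the effect of the change.
print(len(control), "control pages;", len(variant), "variant pages")
```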

And this came out of... I've been around the industry for a while. Back in 2005, my co-founder and I started a business called Distilled, which was an SEO agency, and it was over the next 10 or 15 years of SEO consulting... I'm throwing around the word SEO, which you did already, but for anybody who's not familiar with digital marketing, this is the practice of thinking about how to get more business from organic search visibility. So not paying to be in the Google search results, but showing up highly for the kinds of things that your customers and potential customers are searching for.

So we did that as a professional service business for many years. And over the course of that, we found that, especially at large organizations, they were really struggling to prove the value of the work that they were doing and to figure out the best things to do to move forward. And that trend only accelerated with the advent of artificial intelligence and deep learning and so forth on Google's side, making their algorithm increasingly impenetrable to just thinking hard and trying to decide what would be good to do to your website. And so we help folks run experiments to figure that out.

Matthew Stibbe (07:01.505) And that must involve a ton of data, right? Where does that data come from? How do you process it? What are the data challenges of that?

Will Critchlow (07:09.314) So yeah, the biggest source of data that we pull in is from web analytics. A lot of our customers, being kind of big enterprise-type folks, are Adobe Analytics users, but it could be that, could be Google Analytics, GA4, could be more proprietary technology. So we sometimes find it coming from a Snowflake or an AWS repository of some kind.

That's the primary data source, because that's the source not only of measuring how many people found your website, but also of what they did. So did they go ahead and buy? Were they kind of good visitors in that sense? So that's the biggest data source. We also incorporate a couple of others. Obviously, we have a lot of our own first-party data: we sit in the web stack for our customers in most of our deployment models, and so we get to see Googlebot visits in particular. So when Googlebot visits your website to crawl and understand what content you have, we see at least some of those, depending on your caching and CDN setup and so forth. So we can use that data to understand the progress of our tests. And then the third major source of data is direct from Google themselves, which is from what they call Search Console, which is information about your visibility in search results. They provide ranking, keyword, and click-through rate data that you essentially can't get anywhere else. And that data is unfortunately sampled and obfuscated in various annoying ways. Obviously it's proprietary.

Matthew Stibbe (08:55.277) presumably to make you pay for something.

Will Critchlow: Interestingly, well, I mean, to make you pay for Google Ads, maybe. You can't buy better organic data off them. You can buy adverts, obviously.

They claim that a lot of this stuff is obfuscated for privacy reasons. And there's been various regulatory oversight and constraints on this, as you can imagine; Europe and California seem to be the entities pushing that most. But they still give most of that data to their paid advertising customers. So I'm not completely sold that it's entirely a privacy thing.

But nonetheless, that data source is limited in a variety of ways. So we don't use that so much as a primary source as we use it for explaining and understanding more deeply the statistical results that we're getting from analyzing the other data.

Matthew Stibbe (09:49.271) Right, your analysis, because you're in the tech stack and you've got the traffic analysis on the sites, you've got some direct insight into what people are doing that you can analyze. OK. And one of the things you just said caught my attention, which is something I suppose everybody sort of intuits, which is Google is a bit of a black box. And there's a lot of Kremlinology around trying to figure out what they pay attention to in my world for different reasons and in yours.

Will Critchlow (09:58.308) Yep.

Matthew Stibbe (10:20.077) And I'm reading a lot about cybernetics at the moment. I'm very interested in black boxes. When you're thinking about the work that you're trying to do to understand what Google's black box is doing, using the data that you have, how do you go about sort of opening that up? Or can you? Are there any methodologies or techniques or approaches to trying to understand a black box like that?

Will Critchlow (10:44.814) So we are these days, I would say, not actually particularly trying to open the box. That possibly was a bit different in the past, and I'll explain why, but these days we're, I guess, if you think of it as being the difference between running conversion rate experiments and doing, I guess, human psychology or something, we're not trying to understand why Google is the way it is or how it was treated by its parents or the kind of the Freudian analysis of Google. So much as just saying, well, we've discovered that, especially for this particular website, if we make this particular change, it turns out that that's a good thing. So we're kind of self-help rather than Freud, I guess, is maybe the way of thinking about it. The thing that's interesting about how that's changed over time is, if you'd asked me that question before we actually had SearchPilot, when I was a consultant, back in say 2009, in fact, I did contribute to a number of studies and they were more survey based, but kind of essentially asking a bunch of experts, what do you think the top ranking factors are? What makes the most difference? Why does Google think this website is better than that website or whatever it might be? And that felt like a valuable endeavor in 2009. And there was a lot of reading of the academic stuff on information retrieval and the cutting edge of how those algorithms were designed.

The changing world that's made that different, I think, is... so I think it was 2016, a guy called Amit Singhal left Google. He was the head of search; I forget his exact job title. He actually went to Apple. And, well, I think I might be getting him confused; there's another guy called John Giannandrea who left at a similar time. Point being, Amit in particular was famous within Google for not wanting to put deep learning into the core organic ranking algorithm. He wanted to be able to explain why website A deserved to rank above website B for query X. He thought that kind of human explainability was very important: you had knobs and dials and you tweaked them all to get the best possible search results.

But software engineers within Google had essentially a debugging dashboard that could say, well, it's because this website mentions these things and doesn't mention those things, and has these links and doesn't have those links. You know, they had that kind of debug information. He was very against that. I don't know the cause-and-effect direction, but he left, and they moved on from that philosophical belief. And these days there's a lot of deep learning in those algorithms, and so it becomes much more a case of 'the computer likes this one'. And, you know, I'm sure they have more debugging information than that.

But a key feature of these kind of massive deep learning models is that they are too big to explain to a human. By definition, they're doing things a different way to the way that humans think. And so these days, yeah, as I say, we're not necessarily trying to ask the kind of philosophical why, so much as the more practical, how do we do better?

Matthew Stibbe (13:51.372) Yes.

Matthew Stibbe (14:01.311) There's a lot in there, isn't there? Certainly explainable AI, or explainable decision-making versus not, I think is going to be one of the challenges, complexities, nuances. You know, when an AI-driven car drives into a pedestrian, whose fault is it? That's an extreme case.

And I remember, yeah, I remember sort of doing keyword stuffing and all kinds of things, thinking that you could sort of reverse engineer the algorithm and optimize it. And now what? I don't know. Well, I actually do know, but that's another conversation. So when you're doing this, examining the output of the black box in a good cybernetic way, and not trying to reverse engineer it or figure out how it works, how do you choose the data that you use or the analytical methods that you use? Because that's clearly an important thing, right? Choosing your data is as important as choosing the decisions you make. How do you do that with SEO data?

Will Critchlow (15:03.384) Yes. Well, so we chose to treat our primary metric as being visits from organic search. So clicks, essentially. And the reason for that: there are a couple of different directions you might potentially go, upstream and downstream. In one direction, when people hear that we're doing SEO testing, a lot of people's immediate thought goes to rankings, to saying, well, why don't we just change a thing and see if we move from position five to position three, or something like that. There's a lot of kind of pop culture around ranking number one. You hear it a lot from senior executives and so forth: well, I want to rank number one for the keyword in my industry. The key thing that misses is quite how long the long tail of search queries is. There isn't in fact a single keyword that is probably even that important in the grand scheme of things. It's the portfolio of thousands, hundreds of thousands, millions of keywords. And some of them haven't even been searched for yet, which is a kind of a hard thing to get your head around.

Matthew Stibbe (16:15.381) It's a surprising proportion of Google searches, isn't it?

Will Critchlow: They say... you hear numbers ranging from 15 to 25% of all Google searches being brand new, never seen before in the however many trillion searches they've seen at this point.

Will Critchlow (16:29.828) Some of that is novelty, right? That new things happen in the world that have never happened before, whether it's product launches, whether it's rebranding new names, news events. There's some combination of events that means that people are searching for things that they couldn't have searched for yesterday. And some of it is just human ingenuity. It turns out that the sample space, the universe is big. There's a lot of ways of expressing the same concept. So you can't just say, am I ranking better for this keyword or even this set of keywords?

Matthew Stibbe (16:53.313) Yes.

Will Critchlow (16:59.656) And even if you could somehow come up with the universe of all things that people might search for, for your brand, you'd need to weight them by the likelihood of people searching for them and the likelihood of them clicking. And essentially you get back to clicks. So that's a big part of why we chose that. The second reason is when you think about a Google search results page. Imagine you've done a kind of quite commercial search (I just recently got a new fitness tracker, so imagine you were doing a search for heart rate monitor or something) and think about what could show up on that page. You don't just automatically click on the first thing. No one does. You're going to be looking around different options, you're going to click on multiple things on the page for a start, you're going to be drawn in to some and not to others, you're going to be attracted in different directions. And a lot of those features are things that we can influence. So we can influence how you show up on that page, we can influence the images that show up, we can influence the language that's used, the title of the page, the meta description, you know, getting into the technical weeds.

All those things mean that even if you stay ranking in the same position for a particular keyword, you could see a 10x range in how much traffic you get, depending on how compelling your kind of organic advert is. And there's a complex feedback loop where Google is looking at that stuff to figure out what your ranking should be in the future.

All of which means you can't just optimize for click-through rate, you can't just optimize for rankings; you have to look at the total amount of traffic that you end up getting. So that's how we've made that kind of decision. As I said, we use data from Search Console, which gives us information about click-through rate and rankings, mainly to try and explain our experiments: not necessarily in the Freudian sense of why did Google think this, but in what way was this a winner?

So imagine that you've got a statistically significant result: we had a hypothesis, we made a change, and now we're getting 12% more organic traffic to the pages in question. A lot of marketers, and particularly their bosses, want to be able to answer the question: well, is it because we're ranking better on average? Is it because we're attacking a new market, in other words, new keywords? Is it because we're in fact more compelling and we're getting more clicks for the things that we always used to rank for? That's the kind of explainability that we try and answer with Search Console data.
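To make that kind of explanation concrete, here is a rough sketch of how one might slice Search Console-style data (query, clicks, impressions, average position) to see whether an uplift came from new queries, better rankings, or better click-through rates. The queries and numbers are invented for illustration, and this is a simplification rather than SearchPilot's actual analysis.

```python
# Hypothetical Search Console-style rows for the period before and after a change.
before = {
    "heart rate monitor":       {"clicks": 800, "impressions": 20000, "position": 4.2},
    "best fitness tracker":     {"clicks": 350, "impressions": 12000, "position": 6.1},
}
after = {
    "heart rate monitor":       {"clicks": 950, "impressions": 20500, "position": 3.8},
    "best fitness tracker":     {"clicks": 420, "impressions": 12100, "position": 5.5},
    "chest strap hrm accuracy": {"clicks": 60,  "impressions": 1500,  "position": 7.0},  # new query
}

total_uplift = sum(r["clicks"] for r in after.values()) - sum(r["clicks"] for r in before.values())

# Clicks from queries we didn't appear for before ("attacking a new market").
new_query_clicks = sum(r["clicks"] for q, r in after.items() if q not in before)

# For queries we already ranked for, look at ranking movement and CTR movement separately.
for q in before:
    b, a = before[q], after[q]
    ctr_b, ctr_a = b["clicks"] / b["impressions"], a["clicks"] / a["impressions"]
    print(f"{q}: position {b['position']} -> {a['position']}, CTR {ctr_b:.1%} -> {ctr_a:.1%}")

print(f"total uplift: {total_uplift} clicks, of which {new_query_clicks} from new queries")
```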

Will Critchlow (19:24.036) And the final piece of the puzzle is that it's very important that these visitors are good for your business. Anybody can write an article about a totally unrelated topic and, if it's a kind of pop culture thing, potentially get more traffic than there might be for data conversion pipelines or something. And that doesn't mean it's a good idea. So there is an element of making sure that the visitors from Google go on to convert, and that they're relevant to the business and high quality.

We do work in that area around what we call full funnel testing, which is combining what I've been talking about up to this point, the SEO testing, with conversion rate testing. So doing some element of: are these visitors converting as well, better, or less well than the visitors that we used to get? But even just in pure SEO testing, for most of our customers, when they get more traffic it tends to be very highly relevant, because we're not creating, you know, totally off-topic pages here. We're taking your product pages and making them better. So mainly Google does a good job and sends...

Matthew Stibbe (20:29.133) You're not doing celebrity gossip on the blog and then selling data integration software on the website.

Will Critchlow: No, exactly. This is like improving the visibility of the reviews or having more compelling descriptions of the products. And so those kinds of things, they tend to align.

Matthew Stibbe (20:45.591) And this, to me, naively as a non-mathematician, implies quite high volumes of data. I mean, how can you optimize, you know, this page for fitness trackers if you're John Lewis or Marks and Spencer or something like that? How do you know you've got enough data for your decisions to be statistically valid, defensible, meaningful?

I mean, and is this sort of technique available if you don't have that data?

Will Critchlow (21:16.752) So if you don't have that data, if you're working on a smaller website... I mean, ironically, we're not in our own target market. So when I'm talking about the searchpilot.com website, we have to make kind of best-practice-driven decisions, and we can't run these kinds of scientific statistical tests that we run for our customers. So in the smaller cases, broadly speaking, you have to look at whether you can see a trend before and after, but there are many confounding things that can be going on, and there's a lot of intuition you have to follow in those cases. When we're talking about a large-scale test on a large website like the ones you mentioned, we built into the statistical analysis the capability for it to come with its own confidence. For statisticians in the audience, we're talking about bootstrapping confidence intervals here. So we're essentially saying: run lots of simulations of the analysis and look at how many of them turn out to be statistically significant. And from that, you can build a kind of confidence interval that says, well, we are 92% sure that this is in fact an actual uplift, and the middle estimate of how much of an uplift it is comes out at 11%, or whatever those numbers might be. So yeah, we can use the analysis techniques on themselves to answer the question: are we confident in the output of the analysis technique?
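For anyone curious what "bootstrapping confidence intervals" looks like in practice, here is a deliberately simplified sketch in Python. It assumes a made-up series of daily uplift figures rather than SearchPilot's actual model or data: the observed days are resampled with replacement many times, and the spread of the resampled means gives both a confidence level and a central estimate of the uplift.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical daily uplift of variant vs. control pages, as a fraction
# (illustrative numbers only, not real test data).
daily_uplift = rng.normal(loc=0.11, scale=0.60, size=60)  # 60 noisy days

# Bootstrap: resample the days with replacement and recompute the mean uplift many times.
n_boot = 10_000
boot_means = np.array([
    rng.choice(daily_uplift, size=daily_uplift.size, replace=True).mean()
    for _ in range(n_boot)
])

confidence_positive = (boot_means > 0).mean()        # share of resamples showing an uplift
central_estimate = np.median(boot_means)             # "middle estimate" of the uplift size
low, high = np.percentile(boot_means, [2.5, 97.5])   # 95% bootstrap interval

print(f"{confidence_positive:.0%} of resamples show a positive uplift")
print(f"central estimate: {central_estimate:.1%}, 95% interval: [{low:.1%}, {high:.1%}]")
```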

Matthew Stibbe (22:48.077) And I had a sneaky peek at your LinkedIn bio before the conversation today, and I see that you studied maths (math, for our American listeners) at Cambridge. And full disclosure, I studied history at Oxford, so we are at opposite ends of the two cultures, as it were. What can you bring into your work with data and analysis from that? I mean, what are useful techniques?

And I ask this very naively as a complete non-mathematician. Perhaps a different way of asking the same question is, what should people who didn't have a mathematical background know that would be useful that they might not know?

Will Critchlow (23:30.734) Great question. I think those are actually two quite different questions. So, with the caveat that this was 25 years ago and I've definitely forgotten more than I still remember...

Matthew Stibbe (23:46.285) People keep asking me history questions and I'm like, I don't know, I didn't do dates. Victorian culture and intellect, that was me.

Will Critchlow (23:52.028) Right, exactly. I think the equivalent of asking you dates is asking me, like, mental arithmetic. You know, I didn't really do numbers. The big thing, the thing that possibly our experiences will have had in common, is that you go from big fish, small pond to small fish, big pond. And I don't know about your experience of school and whatever else, but most people who end up at somewhere like Oxford or Cambridge have the experience of doing well in exams and those kinds of things. You go from, certainly in your specialist area, finding certain kinds of things reasonably straightforward, to all of a sudden working very, very hard just not to be the stupidest person in the room, never mind being in the top quartile or whatever. And so I think the...

A big part of what it taught me actually was a lot of intellectual humility in terms of trying to find people who actually really, really, really do understand the specific question that we're trying to answer, not just, you know, it's a mathematician or a statistician, they must be good at this stuff. Like these questions are quite specific, very narrow. So we've worked with some very specialist people over the years.

What has specifically been useful to me, I think, is... so I ended up with my weird element of specialization. I loved probability theory, but probability theory gets really esoteric. I mean, all these things get really esoteric and eventually get too hard. And I actually did do a lot of statistics and operations research type work. And I think those areas gave me enough, gave me the vocabulary, to be able to speak to the real experts who are doing the work. And I think the lasting benefit has been the ability to think rigorously about problems. And it's a different kind of rigor; I've not studied history to the level you have, but the kind of rigor that might get torn apart in an essay answer is different from mathematical rigor in terms of proving a theorem. That concept of proof is probably the thing that is most different. Anybody who's done high school maths gets very good at computation, gets very good at solving exam questions. The thing that's new at university is, broadly speaking, this concept of proof.

And I love the idea of the pyramid of mathematical constructs that you get to build. Without getting too philosophical about it, I think that even in areas where we're talking about probabilities, so we're saying there's a 92% chance of this being an 11% uplift, it's really interesting to look with some rigor at how we got to those numbers and what level of confidence we have in them. So yeah, that kind of probability end of things is my happy place.

The other side of the question: what should somebody who didn't end up doing a postgrad degree in statistics know, and what's valuable to them in the modern world? I do think there's something about basic statistics, you know, kind of what in the UK we call A-level statistics, so end of high school, that I think is incredibly valuable in the modern world, and it overlaps with probability in a lot of ways. I think this is useful for decision-making, and it's also useful for understanding the language of people like me who are talking about confidence intervals and probabilities of uplifts and so on and so forth. I think it's useful in our day-to-day life.

I saw a shocking statistic, one of the most memorable... without getting anywhere near any kind of party politics, it was a political thing. They gave a load of British MPs (I have no idea why people agreed to do this) a basic probability question. They asked: you toss two coins, how many times out of 100 would you expect to get two heads? Or, asked another way, what's the probability of getting two heads when you toss two fair coins? It's one of those things where, for anybody listening who says, I have no idea how to answer that question, getting to the point where you definitely know how to answer that question is, I think, a very, very valuable skill in the modern world, because there are so many things that are like that, so many things that are talking about probabilities. And a shocking number of MPs got this wrong. Without going... I don't want to...

Matthew Stibbe (28:57.431) I would probably get it wrong. Almost certainly actually.

Will Critchlow (29:03.15) And that's fine; there's something very valuable about getting to the point of not getting it wrong. And the answer is a quarter, by the way, because there are four equally likely outcomes: tails-tails, heads-tails, tails-heads and heads-heads. They're all equally likely, and one in four of those is heads-heads. The most shocking part of this was the level of confidence that people attributed to their answers. They asked, so what's your answer and how confident are you in your answer? I don't remember the numbers, but it was a really shocking proportion of people who were confident and wrong about that. And I think that level of self-awareness is actually probably the most important part.
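As a quick illustration of the two-coins question (a sketch added for readers, not part of the conversation), the four outcomes can be enumerated and the one-in-four answer checked by simulation:

```python
from itertools import product
import random

# Enumerate the four equally likely outcomes of tossing two fair coins.
outcomes = list(product("HT", repeat=2))          # HH, HT, TH, TT
exact = sum(o == ("H", "H") for o in outcomes) / len(outcomes)
print("exact probability of two heads:", exact)   # 0.25, i.e. one in four

# Sanity check by simulation: roughly a quarter of double tosses come up heads-heads.
random.seed(1)
trials = 100_000
hits = sum(
    (random.choice("HT"), random.choice("HT")) == ("H", "H")
    for _ in range(trials)
)
print("simulated proportion:", hits / trials)
```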

Matthew Stibbe (29:45.805) That goes back to what you said at the beginning of this discussion about intellectual humility, which I think is critical. I think the biggest crisis that we face as a civilization is the Dunning-Kruger effect, i.e. being wrong but not knowing that you might be wrong. That's probably a simplification of it. It's terrifying. The thing that I'm also hearing and thinking about here is the misappropriation of vocabulary. Now, I'm interested in words, but, you know, we're talking about using statistical language without understanding what it means. I've been reading a book recently which I think is very interesting, called The Unaccountability Machine by Dan Davies, recommended by another guest. And he's sort of unpicking this idea of shareholder value, right? If your mission in life is maximizing shareholder value, that's a different thing from increasing profitability, although they sound very similar. That language has come from somewhere and means some things and implies some stuff and sounds very good, but it's a misappropriation of some language. The book is full of repeated examples of how people do this.

Anyway, let's come to something a little bit less philosophical because as interesting as this is, we're in danger of diving down a rabbit hole. Let's look at a specific project. And I'm always interested in talking to my guests about something that they did with data, what they tried to do, and what they learned. So perhaps if you could suggest something from your experience, and we can dig into that.

Will Critchlow (31:26.978) Yeah, so I think there's an example that we can talk about, which is a project that we undertook at SearchPilot to improve our data analysis in a very specific way. What we were looking to do here, what success for this project would mean, is to detect smaller uplifts with greater confidence: so, to be more confident that we've detected an uplift even when it's very small.

Matthew Stibbe (31:48.813) It could be commercially meaningful but hard to detect, right?

Will Critchlow (31:51.544) That's absolutely right. I mean, sometimes a fraction of a percentage point of uplift could be a lot of money to the kind of very large organizations that we're working with. The challenge is that even though it's a lot of money, it's hard to detect in the noise of all the fluctuations that are going on for many, many other reasons, right? Whether it's seasonality, time of year, what your competitors are up to, macroeconomic trends, all kinds of things could be bouncing these numbers around all over the place.

You've generated a 3% uplift in a particular area of your business; that could be dwarfed by the 10% swings that are going on in other areas. So it can be quite hard to pick out the signal from the noise. But compound that over time, repeatedly add these single-digit percentage improvements, and you get to a good place. In particular, you get to a much better place than you would have been had you not done it. You obviously can't do anything about the macro economy or the fact that it's raining, but you can do better than you would have otherwise done, and that's kind of the name of the game.

And so we do a lot of work to try to improve the algorithms that we're using to analyze this data. We also do a lot of work on the kind of self-introspection piece: the algorithm saying how confident it is, and then us validating that it's actually correct about that. And this is a kind of two-pronged big data problem. Every single one of these analyses is a big data problem, right? In the ways we talked about, you're pulling in this analytics data and you're doing a lot of time series analysis to figure out the statistical features of that data. But what we wanted to do was run some kind of experiments across lots of these experiments, if that makes sense.

Matthew Stibbe (33:42.807) for meta experiments

Will Critchlow (33:49.102) Exactly. So if you imagine that at this point, we've got thousands of these tests that we've run historically, we've got a lot of historical data that we can rerun new analysis models on to benchmark and compare and contrast. And that's all well and good. There's even another element of it, which is for those historical tests, we don't have an oracle. We don't actually know for certain what the right answer was. We know what the old algorithm said, we know what the new algorithm says. We don't know which one is righter.

And so there's another element to this, which is running on synthetic data. And synthetic data often starts as real data. So you might take the noisy time series data from an actual website, but then you add in a known uplift. You say, okay, imagine we'd run an experiment here, and imagine it was a 2% uplift that faded in over two weeks in this kind of way.

So now we know there's a 2% uplift buried in this data. Then you can run your analysis on the synthetic data as well and see: how often do you get the right answer? How often is your algorithm misled? And so forth. We ran one very specific example of this, and we can put links in the show notes to our public write-up. From a product marketing perspective, what this meant was that we essentially doubled the accuracy, the sensitivity, of the algorithms that we're running, which is hugely commercially valuable to our customers: it means that they can detect a greater number of those small or uncertain uplifts than they could before.
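Here is a minimal sketch of that synthetic-data idea, with all numbers invented for illustration: start from a noisy daily traffic series, bake in a known 2% uplift that fades in over two weeks, then check whether a (deliberately naive) analysis recovers it. The before/after comparison below stands in for a real analysis model and is not SearchPilot's method.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical baseline: 120 days of noisy daily organic sessions for a group of pages.
# In practice this would start from a real website's historical analytics data.
days = 120
baseline = 10_000 + 500 * np.sin(np.arange(days) * 2 * np.pi / 7) + rng.normal(0, 400, days)

# Inject a known 2% uplift that fades in linearly over two weeks, starting on day 60.
test_start, ramp_days, true_uplift = 60, 14, 0.02
ramp = np.clip((np.arange(days) - test_start) / ramp_days, 0, 1)
synthetic = baseline * (1 + true_uplift * ramp)

# Stand-in analysis: compare the fully ramped post-test average with the pre-test average.
estimated = synthetic[test_start + ramp_days:].mean() / synthetic[:test_start].mean() - 1
print(f"true uplift: {true_uplift:.1%}, estimated uplift: {estimated:.1%}")

# Repeating this over many synthetic datasets shows how often an analysis recovers
# the uplift it is known to contain, and how far off its estimates tend to be.
```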

Matthew Stibbe (35:35.437) This is a little bit, if I can paraphrase, of self-analysis, self-regulation: stepping outside individual models and looking at how you do the work you do and whether you can optimize it. What prompted this investigation? I mean, what thought process or what management mechanisms did you have that said, we should be doing this kind of higher-level thinking about it?

Will Critchlow (36:03.844) So the way we think about product development is quite a commercial approach. We're very much starting from: what do our customers or our prospective customers most value, most need? Where are the gaps? What are they not getting right now? And I think the need that we were identifying here, the positive way of framing it, is that the big value is when our customers get a decisive result. They don't just get a kind of inconclusive shrug of the shoulders from the algorithm; they get an answer. With a sufficient degree of confidence, we can say this change was positive or negative, and quantify that.

And we know from many sales conversations, conversations on renewal, and conversations with existing customers that that win rate is very important to them, the number of insights per quarter. So our most sophisticated customers, who've been running experimentation programs for a long period of time, value test cadence (how many tests did we run this quarter?), win rate (how many winners did we get?), learning rate (how many were positive or negative, but at least we learned something?), and the size of uplifts, quantification of the size of the uplifts.

So we know that anything we can do to improve win rate or improve learning rate is commercially valuable. I guess the negative framing is that we know that on occasion a customer who doesn't renew, part of the reason might be, well, we didn't get enough winners or we couldn't be sure enough that we'd in fact got the scale of uplift that we were hoping for. So we know there's something of commercial value in there.

And then in terms of how to actually go from knowing we want to improve it to having a chance of improving it: this was quite a pure R&D project, actually. So we took a bit of a portfolio approach. I won't go deep into the details of what ended up working, but there were a number of approaches that we took, some of which worked and some of which didn't. So it was a little bit of, you know, trying out a number of things, initially in a fairly ad hoc way, just hunting for ideas or hunting for things that could be powerful, and then homing in on the most promising of those.

And the thing that we ended up launching actually was a second or third iteration of a particular idea, where the initial spark of the idea had been valuable in itself, but the one we ended up releasing was two or three times better than that. Because once we kind of found a vector, if you like, found a direction to go in, we could kind of iterate on that itself and kind of say, well, this is the best version that we can come up with of that approach.

Matthew Stibbe (38:57.237) The thing I'm taking away from this conversation is that it's not only a case of choosing the data that you want to analyze; choosing the metric or metrics that you want to improve and optimize is really important too. And that's my layperson's interpretation, but we're almost out of time. So I'd like to give you an opportunity to close out your thinking, and I'd like to ask you a question. Given the conversation we've had, what one piece of advice would you give to a data team or a business embarking on an analytics or data analytics project?

Will Critchlow (39:34.404) So I think there are probably two big things that I feel we've internalized. I don't know whether, in our particular case, one of them we're just getting to now, and maybe that's the right time; it could possibly have been a distraction to try and do it sooner. That one is really having a test harness and a test framework to be able to run those meta experiments, as you described them: the idea that we're iterating on our algorithm and we want to rerun that algorithm across all historical tests to be able to see how it performs. We've improved our internal tooling for that recently in a way that I think is really going to pay off. Should we have done that three years ago? I don't know; maybe it would have been a distraction. You have to do these things in a manual way first sometimes to prove the value of automating them.

The one that I think I probably need to internalize more is... so I mentioned the Search Console data, which is limited in a variety of ways and not sufficient, from our research, to be a source of statistical confidence for many of our tests. And I think that led to me in particular, and ultimately to SearchPilot, discounting that data for too long. So we have finally integrated that data this year.

And this is maybe where the maths background leads us astray: I think I underrated the value of the storytelling. We get our statistical confidence from one place and we get the storytelling from the other place, and the value of the story is actually quite significant. We're storytelling machines, aren't we? And I think giving our points of contact at our customers the tools to tell the story better to their teams, to their bosses, to their decision makers is, I think, hopefully something that's really going to pay off.

Matthew Stibbe (41:43.671) Very interesting, and thank you. That brings this episode, a fascinating conversation, Will, to a close. If you at home would like to get more practical data insights and learn more about CloverDX, please visit cloverdx.com/behind-the-data. Thank you for listening. Will, thank you very much for being with us today.

Will Critchlow (42:02.98) Been a pleasure, thanks for having me on.

Matthew Stibbe (42:04.791) Bye everybody.
