How to run AB tests with personalization
Do you consider yourself an analytical person? Being an analytical professional goes far beyond using analytics tools and spreadsheets, or even knowing how to perform an AB test. The real question is: how do you use these tools to understand and solve business problems? That's what reveals your analytics skills.
SEO teams usually focus on keyword performance and bounce rates, while performance teams' main metrics are impressions, costs, and click rates. Content teams, in turn, watch session metrics and page views, while the product team eyes site speed and performance.
These teams use the same database to obtain completely different information, each one running its own experiments. That's why you need to adopt a specific focal point when answering the questions behind your AB tests.
We will discuss three AB test steps in this post (planning, implementation, and analysis) and show you how to use data analysis to support you through each of them. We're going to dive into the following:
- How to use data to generate and validate hypotheses
- How to analyze experiments to obtain insights
- How to foster future experiments
The role of data in creating AB tests
Firstly, it's essential to define what a data analysis platform is and how you can benefit from using it. Some people see it as a safe containing the secret to generating more traffic, revenue, and success for their business. In reality, it's no more than a tool; it won't do much until you ask the right questions. If you're aimlessly navigating through its dashboards looking for information, you're not going to find anything valuable.
Instead, take a step back and consider the problems you're trying to solve. It could be just a hunch or even a series of questions about your site, product, or customers. Use your data analysis platform to check whether your assumptions hold. If a hypothesis turns out to be false, maybe it's time to take a closer look at your funnels. If it holds, maybe there's room for optimization, and an AB test would be welcome.
Asking yourself the right questions might sound tricky, and so might creating hypotheses for AB tests. But you're not alone. That's the main reason why some data analytics platforms have an Insights section, where AI-guided insights give you an accurate reading of your data. Here are some tips for asking the right questions:
- Put yourself in your users' shoes
- Track events from your website or app
- Segment your user base
Put yourself in your users' shoes
One of the best ways to identify a good idea for an AB test is to put yourself in your users' shoes. Many companies do that by surveying their users. You can also start by browsing your website or application as if you were a new user and look for points of friction as optimization opportunities.
Another way is to ask someone you know, a family member or a friend, to browse your website and complete a specific task. Then, ask that person about possible obstacles they ran into. If you work for an e-commerce company, you may discover that shipping prices are too high for specific regions, or that the expected delivery date isn't displayed until the user finishes the purchase. That can be particularly frustrating for users.
Tracking relevant events
If you have been working on optimization for some time now, this suggestion may even sound too obvious. However, we are constantly surprised by the number of companies not harnessing the information derived from events.
Suppose you create a new AB test or implement a new feature: set aside some time to map the important actions or events in the user's browsing journey. Engagement with the header or the footer, clicks on CTAs, scroll depth, video plays, and many other valuable events can help you discover useful information about users' behavior.
Although many CRO professionals know that, it's important to remember that simply looking at heat maps won't give you the same inputs as analyzing tracked events. That's because heat maps don't always let you segment users by their clicks for further analysis. We'll explain.
Let's say there is a product page on your website with a video on it, and the heat map shows that many of your users are playing it. Based on that, the next logical step would be to add more videos to your website, right? Not necessarily: you can't prove that video views are related to conversions.
To do that, you should find out if users who play the video are actually converting more than those who don't. Maybe you would even identify that those who don't watch the video are more likely to convert.
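As a minimal sketch of that check (assuming you can export one row per user, with hypothetical `played_video` and `converted` flags), the comparison boils down to a conversion rate per group:

```python
import pandas as pd

# Hypothetical export from your analytics platform: one row per user,
# flagging whether they played the video and whether they converted.
users = pd.DataFrame({
    "user_id":      [1, 2, 3, 4, 5, 6],
    "played_video": [True, True, False, False, True, False],
    "converted":    [True, False, False, True, True, False],
})

# Conversion rate per group: do video watchers actually convert more?
rates = users.groupby("played_video")["converted"].mean()
print(rates)
```

If the two rates are close, or the non-watchers convert more, adding videos everywhere is probably not the optimization you're looking for.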
Segmenting your user base
All marketing professionals use segmentation on a daily basis and know this concept goes far beyond demographics. User segments can surface new, untapped insights because they add granularity. One of the most valuable ways to use segments is to improve your understanding of the users who browse your website.
Data analysis platforms usually report the number of sessions, page views, users, etc., in standardized reports. However, many testing tools operate at the session level. Creating user-based segments lets you estimate more precisely the size of your target audience, its conversion rate, and the duration of your AB test, for example.
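To illustrate the duration estimate, here is a back-of-the-envelope sketch using a standard two-proportion sample-size calculation (the baseline rate, target lift, and weekly traffic below are made-up figures):

```python
from math import ceil

from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

# Made-up inputs: a 4% baseline conversion rate, hoping to detect a lift to 5%.
baseline, target = 0.04, 0.05
effect = proportion_effectsize(baseline, target)

# Users needed per variation at 5% significance and 80% power.
per_variation = NormalIndPower().solve_power(effect, alpha=0.05, power=0.8)

# With, say, 10,000 eligible users per week split 50/50 between two groups,
# the duration follows directly.
weekly_users = 10_000
weeks = ceil(2 * per_variation / weekly_users)
print(f"~{per_variation:.0f} users per variation, roughly {weeks} week(s)")
```

Counting unique users rather than sessions is exactly where the user-based segments pay off: the estimate above is meaningless if the same person is counted three times.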
Here we'll talk about two different ways of segmenting your user base, one of which applies to standard AB tests and the other to personalization tests.
Standard AB tests
When we create an experiment, we usually define a portion of the audience to view the current content without any modifications: the control group. Then, we split the remaining traffic among the different variations we want to test.
To ensure a consistent experience, the variation a user is assigned to must be chosen at random and, once determined, it must remain the same in the following sessions.
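A common way to get both randomness and consistency is to hash a stable user ID into a bucket. The sketch below is an illustration, not any specific tool's implementation; the experiment name and traffic split are made up:

```python
import hashlib

def assign_variation(user_id: str, experiment: str, split=(50, 25, 25)) -> int:
    """Deterministically map a user to a bucket: same user, same answer."""
    # Hashing user_id together with the experiment name keeps assignments
    # independent across experiments.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100  # a number in 0..99, roughly uniform
    threshold = 0
    for variation, share in enumerate(split):
        threshold += share
        if bucket < threshold:
            return variation  # 0 = control, 1..n = variations
    return 0

# The assignment never changes across sessions for the same user.
print(assign_variation("user-42", "checkout-cta-test"))
```

Because the assignment is a pure function of the user ID, no state needs to be stored to guarantee the same experience on every visit.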
Personalization tests
When introducing personalization, the main change happens at the beginning of the segmentation flow. Let's suppose we have two active personalizations on a website: one for users browsing from the southern region and another for those from the northern region. In this case, we already have three segments in the general audience: southern users, northern users, and everyone else.
Now let's imagine we want to AB test the personalized content displayed in each segment. We wish to discover (1) whether the personalization actually brings better results and (2) what the most appropriate content is.
To do that, we should split each segment into three sub-segments: a control group without any personalization and two groups seeing personalized variations of the content. In the southern region, for example, that yields S-control, S1, and S2; in the northern region, N-control, N1, and N2.
Can you tell how it differs from the standard AB test? The first segmentation is predictable in the personalization test, as the user's location is always pre-defined. The second segmentation, on the other hand, follows the same rule as the standard AB test: it is random, with the guarantee of consistency in future visits.
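Building on the bucketing sketch above, the two-step flow could look like this (the region values and variation labels are hypothetical):

```python
def assign_personalization(user_id: str, region: str) -> str:
    """First segment deterministically by region, then randomize within it."""
    if region not in ("south", "north"):
        return "default"  # everyone else keeps the original content
    # Reuse assign_variation() from the earlier sketch: one control group
    # plus two personalized variations per region.
    variation = assign_variation(user_id, f"personalization-{region}",
                                 split=(34, 33, 33))
    prefix = region[0].upper()  # "S" or "N"
    return f"{prefix}-control" if variation == 0 else f"{prefix}{variation}"

print(assign_personalization("user-42", "south"))  # e.g. "S1", stable per user
```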
Implementing an AB test
Using data analysis to segment your audience is only half of the process. The other half is defining and monitoring the appropriate metrics, which will enable you to evaluate the success of your experiments and obtain insights about your users.
There is a lot of information available online on this subject, so we'll cover just a few of the most recommended practices. The most important thing to keep in mind is that the quality of an AB test depends directly on the metrics attached to it.
Define your metrics beforehand
You'll probably monitor two kinds of metrics in an AB test: a primary metric and secondary metrics.
Primary metrics
Primary metrics must be consistent across many, if not all, of your experiments. If you work in e-commerce, your primary metric will probably be the average revenue per user (ARPU) or the order conversion rate. If the main goal is generating leads, it may be the number of leads or the lead conversion rate. In a SaaS, for example, it can be demo requests or sign-up rates. Your primary metric must always relate to your business results.
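As a quick illustration (the four-week totals below are made up), the usual e-commerce candidates all derive from the same raw numbers:

```python
# Made-up four-week totals for an e-commerce store.
users, orders, revenue = 40_000, 1_600, 96_000.00

conversion_rate = orders / users   # 4% of users place an order
arpu = revenue / users             # average revenue per user
aov = revenue / orders             # average order value (the "ticket")
print(f"conversion rate: {conversion_rate:.1%}, ARPU: {arpu:.2f}, AOV: {aov:.2f}")
```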
Secondary metrics
Your primary metric is your guide for decision-making, but sometimes it doesn't help you tell the entire story behind an AB test. Usually, we use secondary metrics to help explain why the primary metric increased or decreased.
They matter most when an AB test negatively impacts the primary metric. In that case, supporting metrics help you understand what happened, since they paint a clearer picture of why users behaved in a specific way.
That's why secondary metrics are usually tied to specific parts of the funnel and of the experiment, so defining them depends on what you are testing. You may want to look at engagement metrics, such as the number of clicks on a CTA or visits to a specific page.
Integrating your AB testing tool with your data analysis tool
Our assumptions about how users will interact with an AB test often turn out to be very different from reality. For instance, we may expect the conversion rate to change, yet it remains stable while the average ticket increases considerably. If you aren't tracking the average ticket, the AB test won't appear to show any positive result.
Therefore, it's very important to think carefully about your hypotheses and to integrate your data analysis and AB testing platforms.
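A minimal sketch of such an integration (the event name and `track` helper are hypothetical, standing in for whatever your analytics SDK provides) is to stamp every event with the user's experiment assignment:

```python
import json

def track(user_id: str, event: str, properties: dict) -> dict:
    """Build an analytics event stamped with the user's test assignment."""
    return {
        "user_id": user_id,
        "event": event,
        "properties": {
            **properties,
            # Reusing assign_variation() from the earlier sketch, so the
            # analytics platform can break every metric down by variation.
            "experiment": "checkout-cta-test",
            "variation": assign_variation(user_id, "checkout-cta-test"),
        },
    }

# In practice you'd hand this payload to your analytics SDK; here we print it.
print(json.dumps(track("user-42", "purchase", {"revenue": 59.90}), indent=2))
```

With the variation attached to every event, unexpected effects like the average-ticket shift above show up in your standard reports instead of going unnoticed.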
Personalization AB test analysis
Personalization has already proven to be a strong ally in building unique experiences that achieve high ROI. For example, a McKinsey study shows that personalization can reduce CAC by up to 50%, increase revenue by 5% to 15%, and improve marketing-spend efficiency by 10% to 30%. However, the goal should always be to measure how much personalization impacts your business results.
Before talking about personalization test analysis, we need to remind ourselves what the purpose of such tests is. We want to discover which personalized content is the most suitable for a given segment. In our previous example, comparing the S1 metrics to the N1 metrics wouldn't make sense. Instead, we should compare the sub-segments within each region (southern or northern) to answer the two previous questions:
- Is the personalization beneficial? Does the control group perform better or worse than the personalized variations?
- And, if it is beneficial, what content is the most appropriate for each segment? Which variation achieved the best results?
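As a sketch of that per-segment comparison (the counts below are made up; `proportions_ztest` comes from the statsmodels library), each region is analyzed against its own control:

```python
from statsmodels.stats.proportion import proportions_ztest

# Made-up southern-segment results: (conversions, users) per sub-segment.
south = {"control": (120, 4000), "S1": (155, 4000), "S2": (130, 4000)}

control_conv, control_n = south["control"]
for name, (conv, n) in south.items():
    if name == "control":
        continue
    # Two-proportion z-test of the variation against the segment's control.
    stat, p_value = proportions_ztest([conv, control_conv], [n, control_n])
    print(f"{name}: {conv / n:.2%} vs control {control_conv / control_n:.2%}, "
          f"p = {p_value:.3f}")
```

The same loop would run separately over the northern sub-segments; cross-region comparisons such as S1 vs. N1 never enter the analysis.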
Determining the success of an experiment
Determining whether the variations in your AB test performed better than the control group depends a great deal on:
- the system you're using to analyze results
- the metrics you set before starting the experiment
Although identifying the best result may sound simple, in the real world it can be hard to obtain conclusive results from the primary metric. It's not uncommon to face situations where the primary metric remains stable while the secondary metrics show variations.
Estimating the mid-term impact
Forecasting the impact of your AB tests is essential to winning and keeping stakeholders' buy-in; it demonstrates the value of your experiments and ensures your initiatives keep getting the budget they need.
If you're responsible for some KPIs, you need to show that your AB tests are affecting those KPIs and the overall results of your business. The purpose of forecasting the impact of your experiments is to prove the return on investment (ROI) of your initiatives and demonstrate their value and long-term outcomes.
The typical approach for these forecasts is to take the uplift observed in your test and show how much additional revenue would come from rolling out the winning variation to the rest of the audience.
Let's take a look at a simple example to clarify this concept. Say you run a standard AB test with one variation and a control group for four weeks. If the variation causes a 5% increase in the conversion rate, that's impressive, isn't it? However, you should consider the possible impact of seasonality before concluding that it would actually mean a 5% increase in your overall business results.
On the other hand, if you run a personalization AB test with two different segments, like the southern and northern regions, a 5% increase in a segment's conversion rate does not mean a 5% increase in the overall business results, because each personalization only reaches a fraction of your audience.
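A back-of-the-envelope sketch (all figures below are made up) shows how a segment's share of traffic dilutes the overall impact:

```python
# Made-up traffic shares and conversion rates for the personalization test.
segments = {
    # name: (share of audience, baseline conv. rate, conv. rate with winner)
    "south": (0.30, 0.040, 0.042),   # +5% relative lift found in this segment
    "north": (0.25, 0.040, 0.040),   # no winning variation here
    "rest":  (0.45, 0.040, 0.040),   # personalization doesn't reach them
}

baseline = sum(share * base for share, base, _ in segments.values())
projected = sum(share * new for share, _, new in segments.values())
overall_lift = projected / baseline - 1
print(f"overall lift: {overall_lift:.2%}")  # ~1.5%, not 5%
```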
In any case, these numbers are projections rather than guarantees, so it's crucial to label them as forecasts and estimates.
Discovering new insights
Discovering new insights is, after all, essential to any business. The primary purpose of AB tests is clearly to find the best routes to higher revenue and conversion rates, but the insights they produce are a crucial success factor in the long run.
We hope you found this blog post useful. If you want to put AB testing and personalization into practice, you should try our platform. It was designed to help you find the right message for each customer and boost your results faster and more autonomously. Create your free account and start now!