What Is A CRO Test? Definition, Types & Examples
You only have to do a quick Google search (probably why you’re here!) to get a myriad of answers to what a CRO test is, which in itself is a bit of a problem. There are lots of answers and opinions on what exactly a CRO test can be but sadly, as with many questions relating to conversion rate optimisation, the answer is “it depends”.
CRO tests can take on many forms depending on the hypothesis they are trying to substantiate, some of which we’ll describe here to show you the types of test that can be run and the pitfalls and advantages of each.
Before we start, however, there is one fundamental premise to understand: the term CRO is outdated and limiting, and doesn’t even begin to describe the plethora of other things beyond conversion rate that can be improved and measured when carrying out any sort of testing programme.
What is a CRO test?
A conversion rate optimisation test, also known as a CRO test, is a controlled experiment that measures how a change to your website affects its conversion rate, so you can tell which version performs best. A/B tests are the most common form of CRO testing, but there are others, such as multivariate tests, that can produce more in-depth results.
Because of the wealth of CRO tests available, industry professionals have long debated whether CRO as a name is even fit for purpose anymore. Many prefer to refer to the science of improvement under the general umbrella term of experimentation and optimisation. This opens up a much broader scope of testing activities that can be carried out and also gives us a clue as to how a test should be performed in order to pass scientific muster.
CRO tests as a whole should be thought of as controlled experiments where your audience members are randomly assigned to a control group or an alternative test group. All variables in the experiment are held constant except, of course, the change you’re investigating. By running experiments in such controlled conditions, you can attribute any observable differences between the groups directly to the change that you made.
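To make the idea of random assignment concrete, here is a minimal sketch of one common way to do it: hashing a visitor ID so that the same person always lands in the same group. The function name, the experiment ID and the visitor IDs are all made up for illustration; your testing platform will have its own mechanism.

```python
import hashlib

def assign_variation(user_id: str, experiment_id: str,
                     variations=("control", "variant")) -> str:
    """Deterministically map a visitor to a variation with roughly equal odds.

    Hypothetical helper: hashing (experiment_id, user_id) means a returning
    visitor always sees the same variation, without storing any state.
    """
    key = f"{experiment_id}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    bucket = int(digest, 16) % len(variations)
    return variations[bucket]

# Assign 10,000 made-up visitors and count how many land in each group.
counts = {"control": 0, "variant": 0}
for i in range(10_000):
    counts[assign_variation(f"user-{i}", "signup-form-test")] += 1
print(counts)
```

The split should come out close to 50/50, and including the experiment ID in the hash means a visitor’s group in one experiment tells you nothing about their group in another.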
Much like any experiment you might run, they are not designed to give binary answers, but rather provide data and insights that can undergo data analysis to lead to better understanding of a problem statement and how better to solve it. Experiments don’t usually have a definitive end point, but evolve to exploit a particular goal that you’re aiming for.
There are many types of experiment that can be run, and they all share the same process and steps. Before we get into the details, though, you should always ask one question first:
“Should we even be testing?”
Seems a bit of an odd opener, but it really is a question you should ask before you head down the long path of experimentation overall. There are a couple of key ingredients that are needed, call them prerequisites if you like, before you open your experimentation tool box.
“Do we have enough traffic?”
In order to run any sort of conversion rate optimisation experiment, you must have sufficient traffic on your own website to ensure a couple of things: one, that you have enough people to expose to the experiment, and two, that the number is sufficiently high that experiments don’t end up running for an eternity before reaching any sort of statistical significance (something that is generally needed to validate the result).
A finger-in-the-air estimate suggests that a minimum of 5,000 visitors per week is a good starting point, but traffic is not the only factor that determines whether or not you can run A/B tests.
“Are there enough conversions?”
If you run any sort of experiment, you must decide what you’re trying to find out and whether you have enough data to reach an end result. Let’s say, for example, that your experiment is to understand how to increase the number of people who complete a sign-up form. If only a few people ordinarily do this every week, an experiment to improve the sign-up rate will take longer for any A/B testing tool to reach a conclusion, because the baseline number of conversions (or sign-ups) is low.
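To see why a low baseline hurts, it can help to put rough numbers on it. The sketch below uses the standard normal-approximation formula for comparing two proportions. The function name and the defaults (two-sided 5% significance, 80% power) are assumptions for illustration, and real calculators may use slightly different formulas.

```python
from math import ceil, sqrt
from statistics import NormalDist

def sample_size_per_variation(baseline_rate, relative_uplift,
                              alpha=0.05, power=0.80):
    """Approximate visitors needed in EACH variation to detect the uplift.

    Assumptions: two-sided test, equal traffic split, normal approximation.
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_uplift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_beta = NormalDist().inv_cdf(power)           # power threshold
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p2 - p1) ** 2)

# Same 10% relative uplift, two made-up baseline sign-up rates.
for baseline in (0.02, 0.10):
    needed = sample_size_per_variation(baseline, 0.10)
    print(f"baseline {baseline:.0%}: about {needed} visitors per variation")
```

The lower the baseline rate, the more visitors each variation needs to detect the same relative improvement, which is exactly why low-conversion pages take so long to test.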
“Are we cut out for this?”
Or perhaps a better question to ask would be:
“How willing are you to be confronted every day by how wrong you are?”
David Vismans, Chief Product Officer at Booking.com
It’s really important to understand that A/B testing by its very definition is a mechanism used to enable better decision making through data collection and uncovering insights. A/B tests that are researched, implemented and analysed well have little regard for opinion or conjecture, because the result is the result, the proof is in the so-called pudding, statistically speaking.
It can both open a world of undiscovered possibilities and challenge hard-and-fast opinions and assumptions that have been left undisputed, for one reason or another, for years. However, the brutality of an honest and unbiased answer to a question can be a hard pill to swallow for some, and can lead to resentment and all sorts of other negative reactions that can massively impact any A/B testing or experimentation strategy.
This is why it’s important to figure out if you, your company and your colleagues are willing to embrace A/B testing and the culture and mindset that go with it. Your A/B testing programme will only ever be as good as the people involved in running it. As the saying goes, you can lead a horse to water, but you can’t make it drink.
The same applies here. In order to soak up the benefits, you must first have the thirst to do so.
The 7 steps to successful CRO tests
So, let’s say you have answered yes to the 3 questions above, then it’s likely dipping a toe into a/b testing might be a good fit for you. But what is the process to get an a/b test out into the wild? It might sound straightforward, but actually there are a lot of steps to consider, and since we want the truest and most accurate results, a/b test ideation follows a process like any well-designed scientific experiment.
1. Ask a question
The easy part. Much as it suggests, any test idea usually comes from someone asking a question about something. As we mentioned, the great thing about a/b testing is that you will get an answer to it, but first that question needs to have clarity and purpose. Let’s give an example:
“Why has the number of people completing our sign-up form fallen for 2 months consecutively?”
2. Do your research
Once the question has been defined, a raft of research is undertaken to discover data that can help us come up with a way to answer it. The research you do should be the most applicable to what you’re trying to find out. Oftentimes, this will be a combination of both qualitative and quantitative research. For our example question in step 1 above, the research we might want to look at or carry out could be the following:
|Quantitative (what, when, how)|Qualitative (why)|
|Site analytics|Open-ended surveys|
|Click / scroll maps|Usability testing|
|Form analytics|Form heuristic (cross device)|
||User testing videos|
3. Formulate a hypothesis
If appropriate research has been carried out, it should put you on a good footing to formulate your hypothesis based on what that research has shown. It merits mention here that the quality of your CRO tests will be hindered considerably if your hypothesis is poorly stated and defined, because it forms the backbone of your experiment: it defines what you’re testing, why you’re testing it and what you’re expecting the impact to be.
A good hypothesis clearly states exactly what to measure, identifies the causal link that you’ll explore and sets you up for learning. Whatever the outcome, you’ll know something new about the effect of X on Y. Booking.com have a great method for writing hypotheses. Check it out:
Based on (data/research) we believe that (change) for (population) will cause (impact). We will know this when we see (metric) or (feedback). This will be good for customers, partners and business (because).
So, for our initial question, our hypothesis may look like this:
Based on (form analytics and open-ended surveys) we believe that (removing the telephone number input field) for (all customers) will cause (more users to complete the form). We will know this when we see (an increase in form completions). This will be good for customers, partners and business (users will have their privacy concerns met, which will bolster trust in our company and our partners).
4. Run the experiment
The experiment step itself has several parts that usually need to be completed before you can start running it. The following are common components:
First, set out the scope of the CRO test. This should include information such as any functional and design requirements, which pages the test is to run on and under what conditions (particular audience segments, triggers to activate the experiment, traffic allocation to variations, devices, etc.).
Goals come in three flavours: primary, secondary and guardrail. Primary goals are usually entirely guided by your hypothesis and will determine what constitutes a “win”. They should be aligned with your business objectives.
Secondary goals are not your primary objective, but they still add value to your business if they can be improved, and they help to explain the change in behaviour that has been observed.
Guardrail goals serve as your early warning system if something is going pear-shaped. Ecommerce conversion rate might be an example of a guardrail goal, and its trigger point would be a decrease of a certain amount.
If you run an online shoe shop, for example, the primary goal may be checkout completions (conversions). A secondary goal might be the number of product views within a particular category. Your guardrail goals might watch for a drop in conversions or revenue per customer.
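As a rough illustration of how a guardrail might be expressed, the sketch below flags any guardrail metric that has dropped by more than a set percentage against the control. The class, the function, the metric names, the thresholds and the figures are all hypothetical. Real platforms typically also account for statistical noise before raising an alert, which this simple comparison does not.

```python
from dataclasses import dataclass

@dataclass
class Guardrail:
    metric: str
    max_relative_drop: float  # e.g. 0.10 = alert if variant is >10% below control

def breached_guardrails(guardrails, control_metrics, variant_metrics):
    """Return the names of guardrail metrics that fell past their trigger point."""
    breached = []
    for guardrail in guardrails:
        control = control_metrics[guardrail.metric]
        variant = variant_metrics[guardrail.metric]
        if control > 0 and (control - variant) / control > guardrail.max_relative_drop:
            breached.append(guardrail.metric)
    return breached

# Made-up numbers for the shoe shop example.
guardrails = [
    Guardrail("conversion_rate", max_relative_drop=0.10),
    Guardrail("revenue_per_customer", max_relative_drop=0.05),
]
control = {"conversion_rate": 0.030, "revenue_per_customer": 42.0}
variant = {"conversion_rate": 0.026, "revenue_per_customer": 41.5}
print(breached_guardrails(guardrails, control, variant))  # ['conversion_rate']
```

Here the conversion rate has fallen by roughly 13% against a 10% trigger, so it is flagged, while revenue per customer is down by about 1% and stays within its 5% limit.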
It’s always good to have designs worked up for an experiment for easy comparison later and reporting purposes. Designs will also help clarify to stakeholders exactly what to expect, and more importantly allow developers and quality assurance teams to build and test the experiment before it goes live. For more complicated experiments, starting with interactive prototypes and/or wireframes may be the best bet before you put the metaphorical design icing on the cake.
Quality assurance (QA) is crucial. Evaluating whether an experiment is fit for purpose against the original scope could mean the difference between a win, a fail and a catastrophe. QA should check that the experiment behaves as expected, looks the way it was designed and doesn’t have a negative impact on usability, accessibility, performance or functionality.
QA will also ensure that goals, triggers, audiences and more are set up and working correctly and also check any guardrails have been put in place that alert you when something looks iffy (such as a sudden drop in conversions or other important metrics). The QA should also ensure that any integrations (such as analytics) have been set up correctly, so you have another source of data outside the testing platform as to how the test performed.
Once all checks and measures have been double checked and measured, the experiment can go live. The duration that it needs to run for is dependent on many factors and there are many online calculators that can help predict the amount of time an experiment will need to be active before any firm conclusions can be made.
Most A/B tests should run for a minimum of 1 week, but anything longer than 6 weeks is pushing the limits of what is reasonable without degradation of the data used to analyse the results. Consideration should also be given to when a test is put live, so that business cycles or other activities don’t unduly skew the results.
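A back-of-the-envelope version of what those duration calculators do might look like the sketch below. It assumes you already have a required sample size per variation from a calculator. The function name and the numbers are illustrative only.

```python
from math import ceil

def estimated_weeks(sample_size_per_variation, num_variations,
                    weekly_visitors, traffic_share=1.0):
    """Estimate how many whole weeks a test needs to reach its sample size.

    traffic_share is the fraction of site traffic entered into the test.
    Rounding up to whole weeks keeps every weekday equally represented.
    """
    total_needed = sample_size_per_variation * num_variations
    weekly_in_test = weekly_visitors * traffic_share
    return ceil(total_needed / weekly_in_test)

# Made-up inputs: 12,000 visitors per variation, control plus one variation,
# and 5,000 visitors per week all entering the test.
weeks = estimated_weeks(12_000, 2, 5_000)
print(weeks)                                              # 5
print("within the 1 to 6 week window:", 1 <= weeks <= 6)  # True
```

If the estimate comes out well beyond six weeks, that is usually a sign to test a bolder change, use fewer variations or pick a page with more traffic.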
5. Analyse the results
When a test has run for sufficient time, you can start to analyse the results. Whilst many A/B testing platforms will generate a delightful visual representation of how your experiment is doing, there are some things to look out for beyond the veil of assumption, and some questions that should form part of your analysis:
Is the sample size for each variation high enough?
Is there any sample ratio mismatch (SRM) going on and, if so, why?
Is the number of conversions high enough?
How long did the test run?
Did the experiment reach the required confidence level?
What’s the margin of error?
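Several of the questions above can be checked with a few lines of arithmetic. The sketch below shows one conventional approach: a chi-square test for sample ratio mismatch, then a two-proportion z-test with a margin of error for the difference in conversion rates. The function names and the numbers are made up, and your testing platform may use different statistics (Bayesian or sequential methods, for example).

```python
from math import erfc, sqrt
from statistics import NormalDist

def srm_p_value(visitors_a, visitors_b, expected_share_a=0.5):
    """Chi-square test (1 degree of freedom) that traffic matches the planned split.

    A very small p-value (commonly below 0.01) suggests a sample ratio mismatch.
    """
    total = visitors_a + visitors_b
    expected_a = total * expected_share_a
    expected_b = total - expected_a
    chi2 = ((visitors_a - expected_a) ** 2 / expected_a
            + (visitors_b - expected_b) ** 2 / expected_b)
    return erfc(sqrt(chi2 / 2))  # survival function of chi-square with 1 d.o.f.

def compare_rates(conv_a, n_a, conv_b, n_b, confidence=0.95):
    """Two-sided z-test for B vs A, plus the margin of error on the difference."""
    rate_a, rate_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se_pooled = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / se_pooled
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    se_diff = sqrt(rate_a * (1 - rate_a) / n_a + rate_b * (1 - rate_b) / n_b)
    margin = NormalDist().inv_cdf(1 - (1 - confidence) / 2) * se_diff
    return {"difference": rate_b - rate_a, "p_value": p_value,
            "margin_of_error": margin}

# Made-up results: 5,000 visitors per variation, 200 vs 260 conversions.
print("SRM p-value:", srm_p_value(5_000, 5_000))
result = compare_rates(200, 5_000, 260, 5_000)
print(f"difference: {result['difference']:.4f} "
      f"+/- {result['margin_of_error']:.4f}, p = {result['p_value']:.4f}")
```

If the SRM check fails, it is usually worth finding out why before trusting anything else in the report, because an uneven split often points to a tracking or targeting problem.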
6. Communicate the results
It’s important to understand that just getting results isn’t enough. There is a fine art in analysing results correctly and then communicating what it all means to an audience that invariably has a broad range of understanding of what happened, how it happened, what the result was and what the next steps should be, let alone of statistics and the usual tsunami of data associated with them.
Just showing data tables, percentage increases or losses against a given metric isn’t sufficient, and it’s certainly not going to ignite the spark of genuine curiosity and interest in an a/b test that has been carefully thought out, crafted together and run to gain insights that hopefully help move towards a goal that has been set.
What’s even worse than poor presentation of data insights is when the results you’ve garnered are left to gather digital dust in a shared folder somewhere that everyone then forgets about.
Results that you gain from an experiment need to be shared widely, and access to the test (and all its predecessors and successors) should be held in an easily accessible place which should be the go-to repository to revisit, reiterate or reimagine possible a/b testing ideas and generate new ones. If you want to embed experimentation as a mindset into your company, then easy access to understandable information is critical.
7. Iterate
Depending on the results of your experiment and the variations within it, there are a few courses of action that you can take. It might be that what you learnt from your experiment wasn’t what you were expecting, in which case understanding WHY is important.
The analysis of an experiment should go some way to explaining those reasons, and once they’re understood you can revise the test and run it again (i.e. iterate upon it). The same applies to a test that passed the finish line with exceptional results. That doesn’t mean the end of the road for that particular test, however: you should be looking to see how it can be improved further.
The whole premise of A/B testing is not just to get a win (or a loss). It is about getting insights, and it is the insights each test produces that continually feed into your A/B testing strategy and road map.
As you can see, simply getting a CRO test up and running can be complicated and time consuming in its own right. A/B testing is an involved process and takes commitment from those participating. Even more so when you consider the number of different types of a/b test that can be run and the major benefits of each. Let’s take a look.
Types of Experiments
A/A test – validate
In an A/A test, a web page is tested against an identical variation of itself. Why would you do that, you might ask. Well, the simple answer is that an A/A test is both a sanity and health check focusing on your A/B testing set up, your site performance and your experiments.
Advantages:
1. Helps you understand if you’ve got your testing platform set up correctly
2. Establishes if your experiment has been set up, configured and run correctly
3. Can rule out novelty effect
Pitfalls:
1. Can take up valuable resources (design, development, run time and traffic!)
2. Detracts from running experiments that may provide more valuable insights, learnings and revenue
3. Randomness means that, even with a large sample size, it cannot guarantee absolute certainty
4. May not be possible to carry out in some A/B testing platforms
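The point about randomness is easy to demonstrate with a small simulation. The sketch below runs a few hundred imaginary A/A tests, where both groups share the same true conversion rate, and counts how often a standard significance test still declares a “winner”. The function names, the 10% rate and the sample sizes are made up for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist

def two_sided_p_value(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test p-value (normal approximation)."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (conv_b / n_b - conv_a / n_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

def simulate_aa_false_positive_rate(true_rate=0.10, visitors_per_arm=500,
                                    runs=400, alpha=0.05, seed=1):
    """Share of simulated A/A tests that look 'significant' purely by chance."""
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(runs):
        # Both arms draw conversions from the SAME true rate.
        conv_a = sum(rng.random() < true_rate for _ in range(visitors_per_arm))
        conv_b = sum(rng.random() < true_rate for _ in range(visitors_per_arm))
        if two_sided_p_value(conv_a, visitors_per_arm,
                             conv_b, visitors_per_arm) < alpha:
            false_positives += 1
    return false_positives / runs

print(simulate_aa_false_positive_rate())
```

At a 5% significance level you would expect a “significant” difference in roughly one A/A test in twenty, so a single failed A/A test does not by itself prove that your setup is broken.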
A/B test (split test) – explore & learn
An A/B test compares how well your existing page (or part thereof) performs against a variation of it. Half of your audience will see version A (the control) and the other half will see version B (the variation). Ideally, you make only a single change in the variation, so that the results of the experiment are easier to understand and interpret.
Advantages:
1. Better clarity in post-test analysis as the change is limited
2. Relatively straightforward to set up and get running
Pitfalls:
1. Only looks at one variable at a time
2. Slower velocity of testing
A/B/n test – explore and learn a bit faster
A/B/n testing allows you to test more than one variation against the control at the same time. Each variation receives its own share of your total audience.
Advantages:
1. Can help to explain interactional relationships between variables on a page
2. Allows you to get insights faster
Pitfalls:
1. Reduced sample size per variation (if there are too many variations)
2. Longer to run (if there are too many variations)
3. Might not be easy to understand what change caused a shift in metrics
4. Increased rate of false positives
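The increased rate of false positives follows from simple probability. If every comparison against the control carries a 5% chance of a false positive, the chance of at least one across several comparisons grows quickly. The sketch below assumes the comparisons are independent, which is a simplification because they share a control. The Bonferroni correction shown is one common, conservative remedy rather than the only one, and the function names are illustrative.

```python
def family_wise_error_rate(num_comparisons, alpha=0.05):
    """Chance of at least one false positive across independent comparisons."""
    return 1 - (1 - alpha) ** num_comparisons

def bonferroni_alpha(num_comparisons, alpha=0.05):
    """Stricter per-comparison threshold that keeps the overall rate near alpha."""
    return alpha / num_comparisons

for variations in (1, 3, 5):
    print(f"{variations} variation(s) vs control: "
          f"{family_wise_error_rate(variations):.1%} chance of a false positive, "
          f"Bonferroni threshold {bonferroni_alpha(variations):.4f}")
```

With three variations the chance of at least one false positive is already around 14%, which is one reason to keep the number of variations in check.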
Multivariate testing (MVT) – explore, learn more in one go
A full factorial multivariate test (MVT) allows you to test multiple designs of multiple elements on a page (and every combination of those changes) to try and understand which combination works best.
Let’s take an example. For this MVT, we want to try different combinations of the image (left or right aligned), heading font (Arial or Roboto) and CTA colour (solid purple or solid yellow). With two options for each of three elements, that means 2 × 2 × 2 = 8 variations in total to build out for this particular MVT.
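The eight combinations in this example can be enumerated in a few lines. The dictionary keys are just illustrative labels for the three elements described above.

```python
from itertools import product

# The three page elements from the example, each with two options.
elements = {
    "image_alignment": ["left", "right"],
    "heading_font": ["Arial", "Roboto"],
    "cta_colour": ["solid purple", "solid yellow"],
}

# Full factorial design: every option of every element combined with every other.
variations = [dict(zip(elements, combo)) for combo in product(*elements.values())]

print(len(variations))  # 2 x 2 x 2 = 8
for number, variation in enumerate(variations, start=1):
    print(number, variation)
```

Adding a fourth element with two options would double the count to 16, which is why full factorial tests get hungry for traffic so quickly.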
Advantages:
1. Allows you to test lots of combinations of elements on a page to see which works best
2. Saves time as you’re testing simultaneously
3. A great way to identify conversion levers, which can then be explored with further A/B testing
Pitfalls:
1. The number of variations increases substantially the more elements that you test at the same time, which means…
2. Potentially you need a lot of traffic to run a single MVT
3. Can result in tests having to run longer if you’ve too many variations and insufficient traffic to distribute between them
4. Higher number of variations can also lead to more false positive results
5. Scattergun effect, may affect scientific rigour due to poor hypothesis
6. Complex to analyse, which may slow down your ability to iterate correctly
Multi-armed bandit (MAB) – explore, exploit
Almost as interesting as the name suggests, multi-armed bandit testing aims to find the best-performing variation in an experiment through a period of exploration, and then shifts the traffic allocation towards the better-performing variations whilst the experiment is still running (exploitation).
During the lifespan of the experiment, the conversion rates of the control and all variations are monitored, and the traffic split between them is determined by an algorithm that decides what the next step should be (i.e. whether to continue with or kill a variation, or arm, of the experiment).
Think of it a bit like earn while you learn.
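To give a feel for how “earn while you learn” can work, here is a minimal sketch of Thompson sampling, one well-known bandit algorithm. It is an illustration only, not necessarily the algorithm your testing tool uses, and the function name, the conversion rates and the visitor numbers are made up.

```python
import random

def run_thompson_sampling(true_rates, visitors=3000, seed=7):
    """Simulate a bandit test and return how many visitors each arm received.

    true_rates are the hidden conversion rates of each arm. In a real test
    these are unknown, and only the observed conversions are available.
    """
    rng = random.Random(seed)
    successes = [0] * len(true_rates)
    failures = [0] * len(true_rates)
    for _ in range(visitors):
        # Exploration: draw a plausible conversion rate for each arm from a
        # Beta distribution shaped by what has been observed so far.
        samples = [rng.betavariate(successes[i] + 1, failures[i] + 1)
                   for i in range(len(true_rates))]
        # Exploitation: send this visitor to the arm with the highest draw.
        arm = samples.index(max(samples))
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return [s + f for s, f in zip(successes, failures)]

# Control plus two variations with made-up conversion rates.
traffic = run_thompson_sampling([0.04, 0.05, 0.08])
print(traffic)
```

Early on, traffic is spread fairly evenly. As evidence builds, arms that look weaker receive fewer and fewer visitors without ever being ruled out completely.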
Advantages:
1. Faster & more efficient
2. Reasonably accurate test results
3. Great when all you care about is one given metric (usually conversion goals)
4. Suitable when you’ve a limited window (e.g. a time sensitive event such as Black Friday) for testing something
5. Can work on lower traffic sites better than standard A/B tests
Pitfalls:
1. Gives more of an approximation than absolute proof (statistically speaking), so there is a greater chance of a false positive or of gravitating towards a sub-optimal solution
2. Those that enjoy statistical rigour might find this approach less than palatable
3. Slower to reach statistical significance
4. Difficult to run in-depth post experiment analysis
Contextual Multi-armed bandit (cMAB) – context, apply, exploit
A contextual multi-armed bandit test is much like a standard MAB, but in this instance, we add context to the algorithm so it can find the best strategy for a given user in the experiment. One of the best (easily digestible) descriptions of cMABs is this (and I couldn’t put it any better myself):
“Contextual bandits are like the Sorting Hat from Harry Potter – you put it on the head of the customer, and it tells you which variant is right.”
Robert Lacok | Product Manager at Exponea
“Context” could include a myriad of historical data about the user (such as search history, past purchases, etc.) that is then used to serve up the best-fit variation for that particular user’s session. The outcome is then fed back into the algorithm, which learns which combinations of context and variation lead to more or fewer conversions.
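A heavily simplified sketch of that feedback loop is shown below. It keeps a separate conversion tally for every combination of context and variation, and uses an epsilon-greedy rule (mostly show the current best, occasionally try something else). Real contextual bandits use far richer context and models. The class, the two contexts and the exaggerated conversion rates are purely illustrative.

```python
import random
from collections import defaultdict

class ContextualEpsilonGreedy:
    """Toy contextual bandit: one running conversion rate per (context, variation)."""

    def __init__(self, variations, epsilon=0.1, seed=3):
        self.variations = list(variations)
        self.epsilon = epsilon                # share of traffic used to explore
        self.rng = random.Random(seed)
        self.shown = defaultdict(int)         # (context, variation) -> impressions
        self.converted = defaultdict(int)     # (context, variation) -> conversions

    def _rate(self, context, variation):
        shown = self.shown[(context, variation)]
        return self.converted[(context, variation)] / shown if shown else 0.0

    def best(self, context):
        return max(self.variations, key=lambda v: self._rate(context, v))

    def choose(self, context):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.variations)  # explore
        return self.best(context)                    # exploit

    def record(self, context, variation, did_convert):
        self.shown[(context, variation)] += 1
        self.converted[(context, variation)] += int(did_convert)

def simulate(policy, true_rates, visitors=4000, seed=11):
    """Feed simulated visitors through the policy and return the learnt winners."""
    rng = random.Random(seed)
    contexts = sorted({context for context, _ in true_rates})
    for _ in range(visitors):
        context = rng.choice(contexts)
        variation = policy.choose(context)
        did_convert = rng.random() < true_rates[(context, variation)]
        policy.record(context, variation, did_convert)  # the feedback step
    return {context: policy.best(context) for context in contexts}

# Exaggerated, made-up rates: new visitors prefer A, returning visitors prefer B.
true_rates = {
    ("new", "A"): 0.30, ("new", "B"): 0.05,
    ("returning", "A"): 0.05, ("returning", "B"): 0.30,
}
winners = simulate(ContextualEpsilonGreedy(["A", "B"]), true_rates)
print(winners)
```

A standard bandit would be forced to pick one overall winner here, whereas the contextual version can learn that different segments respond best to different variations.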
Advantages:
1. Faster, more efficient, learns through feedback
2. Context allows better variation exposure to segments of users
Pitfalls:
1. Greater computational complexity
2. Requires more resources
Split URL – jump, explore, learn
Split URL tests are much as the name suggests. Variation(s) of a webpage are hosted on different URLs and measured to see which has the greatest impact compared with the control. Traffic is split randomly between the variations and metrics are tracked to see which performs best. Split URL testing is generally used for overall design-based testing.
Advantages:
1. Useful for when you want to make significant design changes
2. Great for understanding the impact of backend changes (alternative to server side testing)
3. Can use A/B testing afterwards to fine tune the page
4. Good if you just want to know which variation is better based on a single metric
5. Requires less traffic to reach statistical significance, so good for low traffic websites
Pitfalls:
1. Can take a lot of time and resource to set up
2. Won’t tell you WHY a variation is better than its competitors and WHAT caused the improvement
In summary, the simple question of “what is a CRO test” turns out not to have such a simple answer after all. Sorry for any illusions that have been shattered, but one of the biggest misconceptions around CRO and A/B testing is that it’s easy.
Sure, it can be, but not if you want to do it the right way, and it certainly isn’t a silver bullet that’s going to solve everything. A lack of understanding of the entire process of producing an A/B test (from conception right through to birth) is one of the biggest reasons A/B testing roadmaps and strategies fail.
If you can embrace that process, get buy-in from your colleagues and embed a culture of experimentation within your company, then A/B testing will help you succeed with your business goals, discover hundreds of millions of new opportunities and open an encyclopaedic-sized knowledge base of insights and learnings that you were previously unaware of.
So, the final question really is “are you up for the challenge?” Indeed, are you ready to be challenged on your assumptions? Do you want to shed light on what’s really going on and make better business decisions using data rather than gut feel? Do you want proof rather than hearsay?
Our guess is that the answer should be yes. See how we can help start your CRO journey.