If you want to improve your website, you probably need to do A/B testing, otherwise known as split testing.
Instead of guessing, A/B testing allows you to experiment more scientifically. You introduce a change, observe how users respond, and let real behaviour inform your decisions.
In this guide, we’ll explain what an A/B test is, how A/B testing works in practice, and why understanding the psychology behind experimentation is key to improving conversions.
Although we’re talking about A/B testing websites in this guide, the principles are the same for A/B testing ads, emails, or anything else (if you’re specifically testing email campaigns, you might also like our beginner’s guide to email A/B testing).
What Is an A/B Test?
An A/B test is a method of comparing two or more versions of a webpage, design element, or user experience to determine which performs better. In A/B testing, visitors are split between two variations – Version A (the original) and Version B (the modified version) – and their behaviour is measured against a specific goal, such as clicks, purchases, or sign-ups.
A/B testing works like a psychology experiment. You change one stimulus – for example, a headline, button colour, or layout – and observe how user behaviour changes in response.
Rather than relying on opinions or assumptions, an A/B test uses data to identify what actually works. If you’re up to scratch with A/B testing and want a wider overview of other approaches, see our post on using variant testing on your website.
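To make the mechanics a little more concrete, here’s a minimal sketch of the kind of bucketing a testing tool does behind the scenes. The visitor ID and experiment name are made up for illustration – in practice your testing platform handles this for you.

```python
import hashlib

def assign_variant(visitor_id: str, experiment: str = "cta-colour") -> str:
    """Deterministically bucket a visitor into version 'A' or 'B'.

    Hashing the visitor ID together with the experiment name keeps the
    split roughly 50/50 and shows each visitor the same version on
    every visit.
    """
    digest = hashlib.md5(f"{experiment}:{visitor_id}".encode()).hexdigest()
    return "A" if int(digest, 16) % 2 == 0 else "B"

print(assign_variant("visitor-123"))  # stable across repeat visits
```

The important property is consistency: a visitor who saw Version B yesterday should see Version B today, otherwise their behaviour gets spread across both versions and muddies the measurement.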
A/B Testing as a Psychology Experiment
One of the easiest ways to understand A/B testing is through behavioural psychology.
In traditional psychology experiments:
- Researchers introduce a stimulus (e.g. tarantula).
- Participants react (e.g. scream).
- Behavioural changes are measured (e.g. fear goes up).
Website A/B testing follows the exact same logic.
- The stimulus is the change you introduce.
- The participants are your website visitors.
- The response is measurable behaviour – clicks, conversions, engagement, or other actions.
This mindset shifts optimisation away from personal preference. Instead of asking “What do we like?”, you ask:
“How do users actually behave when we change this variable?”
Thinking this way encourages structured experimentation and reduces bias. If you’re running an A/B test on your website, you can officially call yourself a psychologist (with no formal, relevant qualifications).
If you’re into the psychology side of optimisation, you might enjoy how cognitive psychology can make you a better usability tester.
Why A/B Testing Matters for Modern Websites
Attracting traffic takes time and money. Whether through SEO, paid advertising, or social media campaigns, gaining visitors isn’t easy, so maximising conversion rate becomes essential.
A/B testing is a core part of conversion rate optimisation (CRO) because it helps you improve performance without increasing traffic.
Seemingly small improvements in conversion rate can produce significant outcomes, as the quick calculation after this list shows:
- Increasing conversion rate from 2% to 3% represents a 50% improvement.
- The same audience delivers more results.
- Marketing budgets go further.
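Here’s that 2% to 3% jump spelled out, using made-up visitor numbers purely for illustration:

```python
visitors = 10_000

conversions_before = visitors * 0.02  # 200 conversions at a 2% rate
conversions_after = visitors * 0.03   # 300 conversions at a 3% rate

relative_uplift = (0.03 - 0.02) / 0.02
print(f"Relative uplift: {relative_uplift:.0%}")  # 50%
print(f"Extra conversions from the same traffic: {conversions_after - conversions_before:.0f}")  # 100
```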
For developers, agencies, and business owners, A/B testing also reduces risk. Instead of implementing major changes blindly, you validate ideas through controlled experimentation.
You should be aware, however, that there are difficulties and pitfalls to A/B testing. And we’ll get to those soon.
What Should You A/B Test First?
Not every change produces meaningful results, so it’s always worth diving into your website’s statistics to identify the areas that can produce the most benefit – the order of items in your footer is probably not going to affect revenue. Instead, prioritise areas where user decisions are most influenced.
Calls-to-Action (CTAs)
CTAs are often ideal starting points for an A/B test because they directly affect conversion behaviour.
Consider testing:
- Button colour
- Button text
- Placement on page
- Size and visual prominence
Headlines and Messaging
Your headline sets the tone for the page and influences whether users continue reading or leave.
Examples of testing approaches include:
- Benefit-focused vs feature-focused messaging
- Short headlines vs longer explanatory versions
- Emotional positioning vs factual descriptions
Page Structure and Layout
Sometimes the most effective A/B testing experiments involve simply rearranging and repositioning elements on a page, rather than adding new ones.
Try testing:
- Removing distractions
- Reordering sections
- Improving visual hierarchy
When A/B Testing Results Don’t Give Clear Answers: A Real World Example
After running A/B testing experiments across multiple product pages and conversion funnels in a previous role, one experience in particular stood out as a reminder that testing isn’t always straightforward.
We ran an A/B test on product page CTA button colours to improve conversion rates.
The variations included:
- Black buttons with white text
- Green buttons with white text
- White buttons with black text
- Pink buttons with white text (the original)
Green buttons won. Statistical significance was achieved and our testing platform confirmed green was the winner.
Conversions increased compared to the other variations, and from a purely data-driven perspective, the decision looked obvious.
However, my boss didn’t like green, and asked me to run the test again.
This time, white buttons came out on top.
When we combined the results from both tests, something interesting happened:
There was no definitive winner.
What initially appeared to be a clear success turned into an inconclusive result when analysed over a longer timeframe.
This experience highlighted a critical lesson about A/B testing:
Context, timing, and external variables can significantly influence outcomes, and often there’s no way to account for them.
Common Pitfalls in A/B Testing
Many guides make A/B testing sound simple: run a test, find a winner, implement the change. Repeat in perpetuity until your business is the most successful ever. Annoyingly, experimentation is more nuanced.
Multiple Variables Can Influence Behaviour
Even if you change only one element during an A/B test, your experiment is never truly isolated. Unlike a laboratory psychology study, real-world websites operate in constantly changing environments, and many external factors can influence how users behave.
Understanding these variables helps prevent misinterpreting results.
Some common influences include:
Changes in Traffic Sources
Where visitors come from can dramatically affect how they behave.
For example:
- Visitors arriving from organic search may be researching and comparing options.
- Paid ad traffic may have higher intent because they clicked a specific offer.
- Social media users might be browsing casually rather than actively buying.
If your traffic mix shifts during an A/B test – for example, if a paid campaign launches midway through – results may reflect audience differences rather than the change you introduced.
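One practical sanity check is to break your results down by traffic source before trusting the headline numbers. Here’s a rough sketch using pandas – the column names and figures are purely illustrative:

```python
import pandas as pd

# Hypothetical export of raw test data: one row per visitor.
df = pd.DataFrame({
    "variant":   ["A", "A", "B", "B", "A", "B"],
    "source":    ["organic", "paid", "organic", "paid", "paid", "organic"],
    "converted": [0, 1, 1, 0, 1, 1],
})

# Visitor count and conversion rate per traffic source and variation –
# a big swing in the source mix mid-test is a red flag.
print(df.groupby(["source", "variant"])["converted"].agg(["count", "mean"]))
```

If the mix of sources looks very different across the test period, or between variations, treat the headline result with extra caution.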
Marketing Campaigns and Promotions
Running promotions alongside testing can skew results.
Imagine testing a new product page layout while launching a limited-time discount. Increased conversions might be caused by urgency from the promotion rather than the design change itself.
Whenever possible, try to:
- Keep campaigns consistent during tests.
- Note external activities when analysing results.
Seasonality and Timing Effects
User behaviour changes depending on timing.
Examples include:
- Weekend vs weekday browsing habits.
- Seasonal shopping patterns (holiday sales vs quiet periods).
- Monthly pay cycles affecting purchasing behaviour.
A test run during a high-intent period may produce different results than the same test during slower traffic periods.
Device Usage Patterns
Mobile and desktop users often behave differently.
Mobile users may:
- Scroll more quickly.
- Prefer simpler layouts.
- Abandon long forms.
If your device traffic mix changes – such as a spike in mobile visitors – this can influence outcomes independently of your A/B test variation.
External Factors (Yes, Even Weather)
It might sound surprising, but external events like weather can influence behaviour.
For example:
- Poor weather may increase online browsing time.
- Seasonal weather patterns affect travel, retail, and lifestyle purchases.
While you can’t control these factors, being aware of them helps interpret results more realistically.
Shifts in User Intent
User motivations evolve.
Changes in:
- Market trends
- News events
- Industry developments
- Competitor activity
can all influence why people visit your site and how they interact with it.
If user intent shifts during testing, behaviour changes might not be caused solely by your experimental variation.
Your A/B Test Doesn’t Exist in Isolation
The key takeaway is that A/B testing happens in a dynamic environment. You are a scientist looking through a microscope made of frosted glass.
This doesn’t mean testing isn’t valuable – far from it. But it does mean that interpreting results requires context and critical thinking.
Understanding Statistical Significance (Without the Maths Headache)
One term you’ll often hear in A/B testing is statistical significance. While it sounds technical, the core idea is simple.
Statistical significance helps answer the question:
“Are these results likely to be real, or could they just be down to chance?”
Imagine flipping a coin ten times and getting seven heads. Does that mean the coin is biased? Probably not. Small sample sizes can produce unusual patterns just by coincidence.
A/B testing works the same way.
If only a small number of users see your variations, one version might appear to perform better simply because of randomness. Statistical significance measures whether the difference between variations is large and consistent enough to be considered reliable.
In simple terms:
- Low statistical significance = results are less reliable.
- High statistical significance = results are more reliable.
Most testing tools calculate this automatically, but understanding the concept helps you avoid common mistakes like ending tests too early or declaring a winner based on limited data.
A useful way to think about it is that statistical significance is confidence that your experiment is measuring real effects on behaviour, rather than coincidence.
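If you’d like to see the idea in numbers, here’s a small sketch using SciPy and statsmodels. The visitor and conversion counts are invented, and your testing tool will normally run an equivalent calculation for you:

```python
from scipy.stats import binomtest
from statsmodels.stats.proportion import proportions_ztest

# The coin example: 7 heads out of 10 flips, tested against a fair coin.
coin = binomtest(7, 10, 0.5)
print(f"Coin p-value: {coin.pvalue:.2f}")  # ~0.34 – easily explained by chance

# A hypothetical A/B test: conversions and visitors per variation.
conversions = [230, 270]       # Version A, Version B
visitors = [10_000, 10_000]
stat, p_value = proportions_ztest(conversions, visitors)
print(f"A/B p-value: {p_value:.2f}")  # ~0.07 – suggestive, but most tools wouldn't call a winner yet
```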
Sample Size Matters
One of the most common mistakes in A/B testing is drawing conclusions from insufficient data.
Small sample sizes can produce misleading outcomes, making random fluctuations appear meaningful.
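To get a feel for how much traffic “enough” actually is, here’s a rough power calculation using statsmodels. The baseline rate, target uplift, and thresholds are assumptions you’d swap for your own numbers:

```python
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

# How many visitors per variation to reliably detect a 2% -> 3% lift?
effect = proportion_effectsize(0.03, 0.02)  # Cohen's h for the change
n_per_variant = NormalIndPower().solve_power(
    effect_size=effect,
    alpha=0.05,  # accepted false-positive rate
    power=0.8,   # 80% chance of detecting a real lift of this size
)
print(f"Roughly {n_per_variant:,.0f} visitors per variation")  # ~3,800
```

Even a modest-sounding improvement to a low baseline rate can need thousands of visitors per variation, which is why the points below matter.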
To improve reliability:
- Allow tests to run long enough.
- Ensure adequate traffic volume.
- Avoid declaring winners prematurely.
The Importance of Repeating Tests
This isn’t a common view amongst those who work in CRO or marketing in general, but repeating an A/B test can be extremely valuable.
Obviously, testing takes time, and it can carry a cost if your variants perform worse than your original, so you’ll have to weigh up whether a test is worth repeating. In particular, if the results don’t match with logic, the findings were close, or the value of the change is potentially high, it might be worth retesting.
Given the huge number of variables that can affect your A/B tests’ outcomes – most of which you can’t control for, can’t measure the effect of, and can’t model into your results – it’s worth taking your test results with a sprinkle of salt. Although scientific, this is not a hard science.
Running experiments multiple times helps:
- Confirm consistency.
- Reduce false positives.
- Reveal whether results were situational.
As seen in the CTA button example, repeating tests can expose insights that single experiments might miss.
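As a rough sketch of what combining two runs can look like, here’s one way to pool the counts and re-check significance. The figures are invented to mirror the button-colour story, not the actual data from that test:

```python
from statsmodels.stats.proportion import proportions_ztest

# Invented counts for two runs of the same test: (conversions, visitors).
run_1 = {"green": (265, 10_000), "white": (240, 10_000)}
run_2 = {"green": (238, 10_000), "white": (262, 10_000)}

# Pool each variation's numbers across both runs.
pooled = {
    variant: (run_1[variant][0] + run_2[variant][0],
              run_1[variant][1] + run_2[variant][1])
    for variant in run_1
}

conversions = [pooled["green"][0], pooled["white"][0]]
visitors = [pooled["green"][1], pooled["white"][1]]
stat, p_value = proportions_ztest(conversions, visitors)
print(f"Pooled p-value: {p_value:.2f}")  # large p-value = no definitive winner
```

In this invented example, each run points in a different direction, but the pooled data doesn’t favour either variation – which is roughly what happened with the CTA buttons.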
Final Thoughts
A/B testing isn’t about endlessly tweaking button colours and hoping conversions magically improve. It’s about adopting a structured way of learning from real user behaviour instead of relying on opinions or assumptions.
If you think back to the psychology experiment comparison earlier, every A/B test is essentially a small behavioural study. You introduce a stimulus, observe how people respond, and try to understand what that response actually means. Sometimes you’ll get a clear winner. Other times, as we saw with the button colour test, results can be messy, inconsistent, or downright confusing.
And that’s normal.
Real-world testing happens in dynamic environments where traffic sources change, user intent shifts, and external factors can influence behaviour. The goal isn’t perfection, it’s learning. Over time, consistent experimentation helps you build a deeper understanding of what your audience responds to and where meaningful improvements actually come from.
In our next A/B testing guide, we’ll move beyond theory and into practice. We’ll cover how to set up A/B tests properly, what tools and frameworks you can use, and real-world examples of what you should be testing first to generate useful insights without wasting time.
Bonus tip: if you’re improving conversion rates, don’t ignore performance – a faster site often converts better. Here are 13 easy ways to optimise your website speed and performance.
