What is statistical significance?
Statistical significance, also called the reliability rate, is derived from the z-score. The z-score is a statistical measure used to assess whether the results of a variation are valid and whether the variation performs better or worse than the reference. For more information about the z-score, see the Standard score article on Wikipedia.
For example, if the statistical significance of your experiment is 90%, the variation has a 90% chance of beating the original, but there is also a 10% risk that this conclusion is wrong.
Let’s take the example below:
- Reference variation: 142,000 unique visitors tested, 3.52% conversion rate (5,000 sales)
- Variation 1 – tested: 216,000 unique visitors tested, 3.64% conversion rate (about 7,860 sales)
In this example, variation 1 converts better than the reference variation, but is the difference significant enough? Calculating the statistical significance confirms that variation 1 outperforms the reference variation with a significance of 96%: your A/B experiment is a success! You can now use variation 1 as the reference.
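As a rough illustration of how such a figure can be obtained, the sketch below applies a standard pooled two-proportion z-test to the example. This is an assumption: the exact formula Kameleoon uses is not described here, and the function names are hypothetical. The conversion counts are derived from the visitor counts and conversion rates rather than taken from the sales figures in parentheses.

```python
from math import erf, sqrt


def normal_cdf(z: float) -> float:
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def significance(visitors_ref, conversions_ref, visitors_var, conversions_var):
    """Return (z_score, one_sided_confidence) for a pooled two-proportion z-test.

    Assumption: a textbook test, not necessarily the one Kameleoon runs.
    """
    rate_ref = conversions_ref / visitors_ref
    rate_var = conversions_var / visitors_var
    # Pooled conversion rate under the hypothesis that both variations perform equally.
    pooled = (conversions_ref + conversions_var) / (visitors_ref + visitors_var)
    std_err = sqrt(pooled * (1.0 - pooled) * (1.0 / visitors_ref + 1.0 / visitors_var))
    z = (rate_var - rate_ref) / std_err
    return z, normal_cdf(z)


# Conversions derived from the visitor counts and conversion rates in the example.
ref_visitors, var_visitors = 142_000, 216_000
ref_conversions = round(ref_visitors * 0.0352)
var_conversions = round(var_visitors * 0.0364)

z, confidence = significance(ref_visitors, ref_conversions, var_visitors, var_conversions)
print(f"z-score: {z:.2f}, confidence: {confidence:.1%}")
```

Under these assumptions the result lands within about a point of the 96% quoted above. Small differences are expected, because the input rates are rounded and the tool may use a different test.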
Checking statistical significance
Whether you use Google Analytics or Kameleoon as your reporting tool, Kameleoon automatically calculates the significance of your experiment. Google Analytics only shows the conversion rate for each goal, which makes it hard to tell which variation performs best. Kameleoon automatically calculates the z-score, so you can check whether your results are significant and whether your variation performs better or worse than its reference.
To know how to access the results page of a campaign, please read this article.
The results page allows you to see the performance of your campaigns. For each variation and each goal, the results tables (at the bottom of the page) show the reliability rate, or statistical significance.
In the example above, the reliability rate is >99%.
Evolution and reliability
A campaign is only reliable if it has been applied to a sufficiently large number of visitors. If the number of tested visitors is too low, the campaign loses its value.
Your campaign is not considered complete until this reliability rate stabilizes over time. This rate is considered stable if it remains within a range of +/- 5 points. A visual indicator reflects the stabilization of the rate: if the 3 boxes light up, your rate is stable!
- Not stable: 0/3 boxes
- Stable for 1 to 3 days: 1/3 boxes
- Stable for 4 to 6 days: 2/3 boxes
- Stable for 7 or more days: 3/3 boxes
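The mapping from stable days to boxes can be sketched as follows. The `boxes` function follows the thresholds listed above. The `stable_days` function is only one possible reading of "within a range of +/- 5 points": it counts how many consecutive earlier days stayed within 5 points of the latest rate. Both function names and this interpretation are assumptions, not Kameleoon's documented logic.

```python
def stable_days(daily_rates, tolerance=5.0):
    """Count consecutive earlier days whose rate is within +/- tolerance of the latest rate.

    Assumption: stability is measured against the most recent daily rate.
    """
    if len(daily_rates) < 2:
        return 0
    latest = daily_rates[-1]
    days = 0
    # Walk backwards from the day before the latest one.
    for rate in reversed(daily_rates[:-1]):
        if abs(rate - latest) > tolerance:
            break
        days += 1
    return days


def boxes(days):
    """Map a number of stable days to the 0-3 boxes of the visual indicator."""
    if days >= 7:
        return 3
    if days >= 4:
        return 2
    if days >= 1:
        return 1
    return 0


# Hypothetical daily reliability rates (in %), oldest first.
rates = [62, 81, 90, 93, 94, 96, 95, 96, 97, 96]
days = stable_days(rates)
print(f"stable for {days} days: {boxes(days)}/3 boxes")
```

With this made-up series, the rate has stayed within 5 points of its latest value for 6 days, which corresponds to 2/3 boxes.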
In this example, the variation has three full boxes: its confidence level is stabilized, its results are reliable.
This variation, on the other hand, has no full box. This means that you have to wait until your reliability level stabilizes.
You can also view a graph of how the reliability evolves over time by selecting “Confidence rate” in the “Graphs” section of the block.
This graph is available in the “Graphs” section of the page, or you can display it in a specific goal table.
When the curve flattens, it means that the results of your campaign are stabilized and that you can use this data with confidence.
What can you do if sufficient reliability is not achieved?
If your reliability rate is stable but insufficient, several factors may be responsible. Among them:
- the traffic on the page is not sufficient;
- the difference between the performance of the original and the variation is too small to draw conclusions (for example, the modification you have made has a very small impact on the behavior of your visitors).
However, you can already draw conclusions for your website once the reliability rate has stabilized at 75% or above. If the traffic on the page is not sufficient, your reliability rate will probably not reach 95%. Even so, the Kameleoon results page offers a wide variety of data and indicators that will help you better understand your audience.
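To illustrate the two factors listed above, the sketch below shows how traffic and the size of the difference affect the result. It assumes a textbook pooled two-proportion z-test with traffic split equally between the original and the variation. The `confidence` helper and the input figures are hypothetical, not Kameleoon's actual calculation.

```python
from math import erf, sqrt


def confidence(visitors_per_variation, rate_ref, rate_var):
    """One-sided confidence that the variation beats the reference.

    Assumption: pooled two-proportion z-test, equal traffic on both sides.
    """
    pooled = (rate_ref + rate_var) / 2.0
    std_err = sqrt(pooled * (1.0 - pooled) * 2.0 / visitors_per_variation)
    z = (rate_var - rate_ref) / std_err
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


# Same conversion rates, increasing traffic.
for visitors in (5_000, 50_000, 500_000):
    print(f"{visitors:>7} visitors: {confidence(visitors, 0.0352, 0.0364):.1%}")

# Same traffic, smaller difference between original and variation.
print(f"small uplift: {confidence(50_000, 0.0352, 0.0354):.1%}")
```

With identical conversion rates, the confidence rises as the number of visitors grows. With identical traffic, a smaller difference between the two rates produces a lower confidence.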
Note: You can change the minimum reliability rate required for Kameleoon to consider that a variation is winning. To do so, go to the “Administrate” > “Sites” page in the Kameleoon App. In the tab dedicated to Experiments, you will be able to modify this minimum reliability rate.