What is statistical significance?
Statistical significance is derived from the z-score, a statistical measure used to determine whether the results of a variation are valid and whether that variation performs better or worse than the reference. For more information about the z-score, see the Standard score article on Wikipedia.
For example, if the statistical significance of your test is 90%, the variation has a 90% chance of beating the original, and there is a 10% risk that this conclusion is wrong.
Let’s take the example below:
- Reference variation: 142,000 unique visitors tested, 3.52% conversion rate (5,000 sales)
- Variation 1 – tested: 216,000 unique visitors tested, 3.64% conversion rate (about 7,860 sales)
In this example, variation 1 converts better than the reference variation, but is the difference significant? Calculating the statistical significance gives 96%, which confirms that variation 1 outperforms the reference: your A/B test is a success! You can now use variation 1 as the reference.
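To make the calculation concrete, here is a minimal sketch of how a z-score and the corresponding significance could be computed from the visitor counts and conversion rates above. It assumes a pooled two-proportion z-test read one-sided through the normal distribution; this is a common textbook approach, not necessarily the exact method Kameleoon applies, and the function names are illustrative.

```python
from math import erf, sqrt


def normal_cdf(z: float) -> float:
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def significance(n_ref: int, rate_ref: float, n_var: int, rate_var: float):
    """Return (z_score, one_sided_significance) for a variation vs. its reference.

    Assumption: pooled two-proportion z-test, one-sided.
    """
    # Conversion rate of both groups combined, weighted by visitors
    pooled = (n_ref * rate_ref + n_var * rate_var) / (n_ref + n_var)
    # Standard error of the difference between the two conversion rates
    std_err = sqrt(pooled * (1.0 - pooled) * (1.0 / n_ref + 1.0 / n_var))
    z = (rate_var - rate_ref) / std_err
    return z, normal_cdf(z)


# Figures from the example: visitors and conversion rate of each variation
z, sig = significance(142_000, 0.0352, 216_000, 0.0364)
print(f"z-score: {z:.2f}, significance: {sig:.1%}")
```

Under these assumptions the result comes out at roughly 97%, in the same range as the 96% quoted above. A two-sided test or different rounding of the inputs would shift the figure slightly.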
Checking statistical significance
If you use Google Analytics or Kameleoon as your reporting tool, Kameleoon automatically calculates the significance of your test. In Google Analytics, you only see the conversion rate for each goal, so you cannot easily tell which variation performs best. Kameleoon automatically calculates the z-score, which lets you check whether your results are significant and whether your variation performs better or worse than its reference.
You can see these measurements in your personal space. To do this, you must be logged in to your personal space.
Once you are logged in, click on the “All tests” button.
Then, hover over the test card and click on “View results”.
The results page allows you to see the performance of your tests. For each variation and each goal, the results tables (at the bottom of the page) show:
- The number of visits/visitors
- The number of converted visits/visitors
- The number of conversions
- The conversion rate
- The improvement rate
- The reliability rate, or statistical significance
In the example below, the reliability rate is 99%.
Evolution and reliability
The test is only reliable if it has been applied to a sufficiently large number of visitors. If the number of tested visitors is too low, the test loses its value.
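The effect of sample size can be illustrated with a short sketch. It reuses the conversion rates from the earlier example (3.52% and 3.64%) and only changes the number of visitors per variation. The visitor counts are hypothetical, and the pooled one-sided z-test is an assumption, not necessarily Kameleoon's exact calculation.

```python
from math import erf, sqrt


def one_sided_significance(n_ref: int, rate_ref: float,
                           n_var: int, rate_var: float) -> float:
    """Significance of 'variation beats reference' (pooled z-test, assumed)."""
    pooled = (n_ref * rate_ref + n_var * rate_var) / (n_ref + n_var)
    std_err = sqrt(pooled * (1.0 - pooled) * (1.0 / n_ref + 1.0 / n_var))
    z = (rate_var - rate_ref) / std_err
    # Standard normal CDF evaluated at z
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


# Same conversion rates, increasingly large (hypothetical) visitor counts
for visitors in (10_000, 100_000, 1_000_000):
    sig = one_sided_significance(visitors, 0.0352, visitors, 0.0364)
    print(f"{visitors:>9,} visitors per variation -> significance {sig:.1%}")
```

With identical conversion rates, the computed significance rises as the number of visitors grows, which is why a difference measured on a small sample cannot yet be trusted.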
Your test is not considered complete until this reliability rate stabilizes over time. A visual indicator reflects the stabilization of the rate: if the 3 boxes light up, your rate is stable and your test is over!
Note: This indicator lets you see the evolution of the reliability at a glance. Just watch the boxes.
In this example, the first and third variations have their three boxes illuminated: their confidence level has stabilized and their results are reliable. The second variation, on the other hand, has only one illuminated box. This means that you have to wait until its reliability level stabilizes.
You can also consult the graph showing the evolution of the reliability over time. To do so, select “Reliability” in the “Graph” section of the page, or display the graph in a specific goal table and choose to display the reliability rate.
When the curve flattens, the results of your test have stabilized and you can use this data with confidence.
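If you export the reliability values and want to check for a flat curve yourself, one simple approach is sketched below. It is not the logic behind Kameleoon's three-box indicator, which is not described here; the `is_stable` function, the 7-day window, the 1-point tolerance, and the daily values are all hypothetical and chosen for illustration only.

```python
def is_stable(reliability_history, window=7, tolerance=0.01):
    """Return True if the last `window` values stay within `tolerance` of each other.

    Hypothetical rule of thumb: the curve counts as flat when the spread
    (max minus min) over the most recent values is small enough.
    """
    if len(reliability_history) < window:
        return False  # not enough data points to judge yet
    recent = reliability_history[-window:]
    return max(recent) - min(recent) <= tolerance


# Made-up daily reliability values: volatile at first, then flattening
daily_reliability = [0.62, 0.81, 0.74, 0.90, 0.95, 0.97, 0.985,
                     0.99, 0.99, 0.991, 0.989, 0.99, 0.992, 0.99]

print(is_stable(daily_reliability[:7]))  # first week only: still moving
print(is_stable(daily_reliability))      # last week: spread within tolerance
```

A wider window or a tighter tolerance makes the check more conservative, at the cost of waiting longer before the test is declared stable.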