Checking for potential SRM (Sample Ratio Mismatch)

Written by Julie Trenque

Updated on 10/10/2024

2 min

Advanced

What is Sample Ratio Mismatch (SRM)?

Sample Ratio Mismatch (SRM) means that there is a significant difference between the expected traffic split across the experiment's variations and the observed one. You are faced with SRM when one variant in your test receives notably more, or less, traffic than expected. In other words, traffic reaches your variation(s) in proportions that differ from the allocation you configured, even if you split your traffic evenly (e.g., 50-50).

Let’s look at an example: You launched a campaign with a traffic allocation of 50% toward the original and 50% toward the variation. Your data shows you have 50,000 visitors on the original and 48,900 on the variation. Kameleoon’s native, in-app SRM detection test will flag these results as positive for Sample Ratio Mismatch.

At Kameleoon, we detect SRM by running a Chi-Square Goodness of Fit test, which assesses whether observed counts are consistent with a specified distribution (here, the configured traffic allocation). We use a threshold of 0.001 for the p-value of the test: an experiment is flagged as positive for SRM if the test’s p-value is lower than this threshold.
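To make the example concrete, here is a minimal sketch of a Chi-Square Goodness of Fit test for two variations. It is an illustration only, not Kameleoon's actual implementation: the function names `erfc` and `srmCheck` are ours, the p-value formula is only valid for two variations (1 degree of freedom), and the error function uses a standard numerical approximation.

```javascript
// Complementary error function, Abramowitz & Stegun formula 7.1.26
// (absolute error of about 1.5e-7). Only used here for x >= 0.
function erfc(x) {
  const t = 1 / (1 + 0.3275911 * x);
  const poly =
    t * (0.254829592 +
    t * (-0.284496736 +
    t * (1.421413741 +
    t * (-1.453152027 +
    t * 1.061405429))));
  return poly * Math.exp(-x * x);
}

// Chi-square goodness-of-fit test for two variations (1 degree of freedom).
// expectedShareA is the configured traffic share of the first variation.
function srmCheck(visitorsA, visitorsB, expectedShareA = 0.5, threshold = 0.001) {
  const total = visitorsA + visitorsB;
  const expectedA = total * expectedShareA;
  const expectedB = total * (1 - expectedShareA);
  const chiSquare =
    (visitorsA - expectedA) ** 2 / expectedA +
    (visitorsB - expectedB) ** 2 / expectedB;
  // With 1 degree of freedom, the p-value equals erfc(sqrt(chiSquare / 2)).
  const pValue = erfc(Math.sqrt(chiSquare / 2));
  return { chiSquare, pValue, srmDetected: pValue < threshold };
}

const result = srmCheck(50000, 48900);
console.log(result.chiSquare.toFixed(2), result.srmDetected); // 12.23 true
```

With 50,000 and 48,900 visitors, the statistic is about 12.23 and the p-value falls below 0.001, which is why this split is flagged even though the gap looks small.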

Kameleoon’s native, in-app SRM detection alert on the Results page

Why is SRM relevant?

SRM implies that there is bias in the experiment’s traffic assignment mechanism. This bias can make the experiment’s results invalid because one of the core assumptions on which the statistical computations for A/B tests are based is violated.

This problem is especially relevant if you have high traffic on your sites, because even small discrepancies can have major effects on experiment results and lead to poor decision-making.

Kameleoon’s in-app SRM alerting helps you make decisions based on an actual statistical test, rather than on intuition.

What causes SRM and how to address it?

The root causes of SRM can vary greatly between experiments and the organizations or users running them.

Based on our experience, we have created a checklist for troubleshooting when your experiment shows signs of SRM.

Important: If you configure your experiment with the original at 0%, a control at 50%, and a redirection at 50%, make sure you define the new control as the reference variation on the Results page.

Check the history of changes in the experiment

Ensure there has been no traffic reallocation, no change in targeting, and no change in the number of variations after the experiment has started. Use the graph view on the results page to check the number of visits for each variation. This can reveal if the gap occurs only on certain dates.

⇒ If any of these changed after the experiment started, we recommend launching a new experiment.

Break down the visits and check if the gap is localized

Review device type, browser, operating system, new/returning visitors, and any custom data.

⇒ If the gap is limited to one segment, you can filter out the variable causing the SRM.
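The breakdown can be sketched as a per-segment check. This is a hypothetical helper, not a Kameleoon feature: it assumes you have exported visit counts per segment for two variations split 50/50, and the names `findImbalancedSegments`, `segment`, `original`, and `variation` are ours. It compares each segment's chi-square statistic with 10.828, the critical value for 1 degree of freedom at p = 0.001.

```javascript
// Critical value of the chi-square distribution with 1 degree of freedom
// at p = 0.001: a statistic above it corresponds to a p-value below 0.001.
const CHI_SQUARE_CRITICAL = 10.828;

// rows: hypothetical export, one object per segment,
// with visit counts for two variations split 50/50.
function findImbalancedSegments(rows) {
  return rows
    .map(({ segment, original, variation }) => {
      const expected = (original + variation) / 2;
      const chiSquare =
        (original - expected) ** 2 / expected +
        (variation - expected) ** 2 / expected;
      return { segment, chiSquare };
    })
    .filter(({ chiSquare }) => chiSquare > CHI_SQUARE_CRITICAL)
    .map(({ segment }) => segment);
}

const byBrowser = [
  { segment: "Chrome", original: 30000, variation: 29950 },
  { segment: "Safari", original: 15000, variation: 13950 },
  { segment: "Firefox", original: 5000, variation: 5000 },
];
console.log(findImbalancedSegments(byBrowser)); // only "Safari" is flagged
```

Keep in mind that testing many segments increases the chance that one of them is flagged by coincidence, so treat a flagged segment as a lead to investigate rather than as proof.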

Check if there is a custom data (CD) for cross-device reconciliation

⇒ If so, ensure that the ID is uniquely generated for each visitor.

The Kameleoon assignment algorithm uses a hash function to randomize users, and the ID is part of this process. If two visitors share the same ID, they receive the same assignment, which can create an imbalance and cause a Sample Ratio Mismatch. Ensuring that each visitor has a unique and correctly generated ID is crucial for maintaining the integrity of the randomization process and achieving a balanced sample distribution.
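To illustrate why shared IDs skew the split, here is a simplified sketch of hash-based assignment. It is not Kameleoon's algorithm: the FNV-1a hash, the `assign` function, and the way the experiment ID is combined with the visitor ID are all assumptions chosen for the example.

```javascript
// FNV-1a 32-bit hash of a string (illustrative choice, not Kameleoon's).
// Uses UTF-16 code units, which match bytes for plain ASCII IDs.
function fnv1a(str) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // convert to an unsigned 32-bit integer
}

// Maps a visitor to a variation deterministically from their ID.
function assign(visitorId, experimentId, shareOriginal = 0.5) {
  const bucket = fnv1a(`${experimentId}:${visitorId}`) / 2 ** 32; // in [0, 1)
  return bucket < shareOriginal ? "original" : "variation";
}

// 1,000 visitors who wrongly share one ID all get the same assignment.
const shared = Array.from({ length: 1000 }, () => assign("shared-id", "exp-1"));
console.log(new Set(shared).size); // 1
```

Because the assignment depends only on the ID, every visitor carrying the shared ID lands in the same variation, which inflates that variation's traffic.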

Check if assignVariation() is used in the Global Script (GS) or in an experiment

⇒ If so, the custom assignment may not align with the experiment configuration. You can then either accept the data as it is or launch a new experiment whose configuration is aligned with assignVariation().

Check if your experiment is using a custom variation assignment script

(in the experiment configuration, under the Global Script)

Using a custom variation assignment script can cause SRM, as it bypasses our assignment algorithm.

⇒ If so, the custom assignment may not align with the experiment configuration. You can then either accept the data as it is or launch a new experiment whose configuration is aligned with the custom variation assignment script.

Check if the site is an SPA

Some visitors may not be targeted if they navigate to the targeted page without Kameleoon reloading. They will only be targeted if they access that page directly.

⇒ Handle the SPA correctly using enableSinglePageSupport(), or reload Kameleoon with Kameleoon.API.Core.load() whenever you have information confirming that the page has changed.
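The manual reload option can be sketched as follows. This is a minimal sketch under assumptions: `createRouteChangeHandler` and `getUrl` are names we introduce, and how you detect a route change depends on your SPA framework. Only Kameleoon.API.Core.load() comes from the recommendation above.

```javascript
// Returns a function to call after each client-side navigation.
// `kameleoonApi` is expected to be window.Kameleoon.API on a real page;
// `getUrl` returns the current URL (for example () => location.href).
function createRouteChangeHandler(kameleoonApi, getUrl) {
  let lastUrl = getUrl();
  return function onRouteChange() {
    const currentUrl = getUrl();
    if (currentUrl === lastUrl) return false; // page unchanged: no reload
    lastUrl = currentUrl;
    kameleoonApi.Core.load(); // reload Kameleoon for the new page
    return true;
  };
}

// Hypothetical wiring in the browser (`myRouter` stands for your SPA router):
// const onRouteChange = createRouteChangeHandler(Kameleoon.API, () => location.href);
// myRouter.afterEach(onRouteChange);
```

Comparing URLs before reloading avoids running Kameleoon twice for the same page when the router fires several events for a single navigation.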

Check that consent is correctly configured

If you display experiments before the visitor has given consent, the visitor is allocated to a variation, and when they accept the cookies, the Kameleoon consent needs to be set to true. Otherwise, data for that variation won’t be collected.

⇒ Correctly handle consent using enableLegalConsent() and disableLegalConsent() in the GS or through the kameleoonQueue.
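Passing consent through the kameleoonQueue could look like the sketch below. It rests on assumptions you should check against the Kameleoon API reference: that both methods live under Kameleoon.API.Core, and that the queue accepts functions to run once Kameleoon is loaded. The helper name `forwardConsent` is ours.

```javascript
// Forwards the visitor's consent choice to Kameleoon through the command queue.
// `queue` is expected to be window.kameleoonQueue on a real page.
// Assumption: queued functions are executed once Kameleoon has loaded.
function forwardConsent(queue, hasConsented) {
  queue.push(function () {
    if (hasConsented) {
      Kameleoon.API.Core.enableLegalConsent();
    } else {
      Kameleoon.API.Core.disableLegalConsent();
    }
  });
}

// Hypothetical wiring in the browser (`myCmp` stands for your consent tool):
// window.kameleoonQueue = window.kameleoonQueue || [];
// myCmp.onChange((accepted) => forwardConsent(window.kameleoonQueue, accepted));
```

Calling the helper on every consent change, including refusals, keeps Kameleoon's consent state aligned with what the visitor actually chose.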

Check if the experiment is based on a redirection

This issue arises when some visitors redirected to variant B fail to see the page, or when data collection only occurs after page B loads. This results in data loss for variant B that wouldn’t happen on the original page. To limit this issue as much as possible, Kameleoon also automatically redirects visitors assigned to the original, but this is sometimes not enough.

⇒ To address this, we recommend:

  • Target only consented visitors using the following code in the targeting:

return Kameleoon.API.Visitor.experimentLegalConsent || false

  • Use the native redirection (under the variation options) OR Kameleoon.API.Core.processRedirect('newURL') instead of document.location.href = "newURL"
  • Don’t target the newURL where visitors are redirected, and ensure Kameleoon is installed on that newURL.

Follow our guidelines to avoid a negative impact on SEO from redirection experiments.

Check that there is no delay affecting any variation execution that could cause discrepancies

E.g., if your variation is redirecting to a different page where Kameleoon is not installed.

⇒ Install Kameleoon on all pages.

For help diagnosing potential causes, you can check out this extensive taxonomy of causes.

Check if your technical team is using any internal bots/crawlers that are not listed in the IAB/ABC International Spiders and Bots List

(For example, tools like Ekara, previously ip-label.)

⇒ If so, please send us the User-Agent, and we will exclude it from the results.

⇒ If you’re unsure, you can create a custom data field “userAgent” to capture the User-Agents, export the data from your experiment, and verify if any User-Agent is generating a significant number of visits.

You can set the custom data value using the code below in the Global Script:

Kameleoon.API.Data.setCustomData("userAgent", window.navigator.userAgent);
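Once the data is exported, the verification step can be sketched like this. The export format is an assumption (one object per visit with a `userAgent` field), and `findDominantUserAgents` and its `minShare` parameter are names we introduce for the example.

```javascript
// rows: hypothetical export, one object per visit with a `userAgent` field.
// Returns the User-Agents whose share of visits exceeds `minShare`,
// sorted from most to least frequent.
function findDominantUserAgents(rows, minShare = 0.05) {
  const counts = new Map();
  for (const { userAgent } of rows) {
    counts.set(userAgent, (counts.get(userAgent) || 0) + 1);
  }
  return [...counts.entries()]
    .map(([userAgent, visits]) => ({ userAgent, visits, share: visits / rows.length }))
    .filter(({ share }) => share > minShare)
    .sort((a, b) => b.visits - a.visits);
}

const visits = [
  { userAgent: "InternalMonitor/1.0" },
  { userAgent: "InternalMonitor/1.0" },
  { userAgent: "InternalMonitor/1.0" },
  { userAgent: "Mozilla/5.0 (example browser)" },
];
console.log(findDominantUserAgents(visits, 0.5).map((r) => r.userAgent));
```

Popular browser versions can legitimately account for a large share of visits, so review the returned list manually and look for User-Agents that match your internal monitoring or crawling tools.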

How frequent is SRM?

Our SRM test is a powerful tool because it offers a single test for detecting a broad range of potential issues. However, because this range is so wide, SRM tests may come out positive more often than expected. On average, 4.5% of experiments run on the Kameleoon platform test positive for SRM.

To give you more insight, Microsoft shared the following about SRM in September 2020:

Recent research contributions from companies such as LinkedIn and Yahoo, as well as our own research confirm that SRMs happen relatively frequently. How frequently? At LinkedIn, about 10% of their zoomed-in A/B tests (A/B tests that trigger users in the analysis if they satisfy some condition) used to suffer from this bias. At Microsoft, a recent analysis showed that about 6% of A/B tests have an SRM.
