Deterministic A/B tests via the hashing trick

In principle A/B testing is really simple. To do it you need to define two separate user experiences, and then randomly allocate users between them:

So far this seems pretty simple. But then you think about edge cases:

  • Shouldn’t the same user get the same experience if they do this twice?

  • After the test is complete, how can I compare group A and b?

It’s not hard to track this data, but it certainly makes your code a bit uglier:

This is doable, but the code is a lot longer and more annoying. Are there race conditions? Should everything live in a single transaction, potentially skewing things?

Fortunately there’s a better way: the hashing trick.:

Usingdeterministic_random_choice instead of random.choice will ensure that the same user is always assigned to the same variation. This is done without any database access.

It also makes it very easy to run analytics and compare the two groups, even though we never stored group membership in any database table:

(This is not SQL that any real DB will actually run, but it’s illustrative.)

Whatever you currently do for analytics, you can take the exact same queries and either GROUP BY the deterministic_random_choice or else run the query once for each variation and put deterministic_random_choice(user.id, “experience_test”, 2) = 0,1 into the WHERE clause.

It’s just a nice simple trick that makes it easy to start A/B testing today. No database migration in sight!

This post was first published on the Simpl company blog.