A/B Test Simulator v1

A few weeks ago I wrote about how I was working on an A/B test simulator and asked if anyone was interested in working on it. A few of you reached out (thank you!) but the discussions quickly stalled because I realized that I didn’t have a good plan where to take it from there.

Rather than let it linger on my Macbook forever more, I made a push ship the v1 and am happy to say you can check it out on GitHub here:


How it works

Here’s the idea:

Let’s say you’re running a big test on your homepage which has a conversion rate of 10% and you think your test will either do really well (+20%) or fail terribly (-20%). You configure this in the script:


Also, you want to run your A/B test until you’ve either had more than 10,000 participants or until the test has reached 99% significance. You configure this in the evaluate method:

return "pass" if total_participants >= 10000
if gaussian?(participants_a, conversions_a) && gaussian?(participants_b, conversions_b)
if p_value(participants_a, conversions_a, participants_b, conversions_b) >= 0.99
return "yes"

When you run the script (ruby abtest-simulator.rb) it then simulates 1,000 A/B tests, where for each A/B test we assign visitors one of the variations and continue until we declare a winner or pass if a winner is never decided on:

Passes: 74
Correct: 908
Incorrect: 18
Correct Decisions: 908/926: 98.06%

908 times out of 1,000 our criteria made the “correct” decision: we choose the winning +20% variation or didn’t chose the -20% variation. In 18 tests we incorrectly chose the -20% variation or didn’t choose the +20% variation. And in 74 tests out of 1,000 we never reached significance.

The idea with this project is that you can play around with the numbers to see what impact they have. For example:

  • What is the impact of 99% signifiance vs 95% signifiance?
  • What if you just wait until there are 50 conversions and pick the best performer?
  • What if you don’t expect the test to result in a big change, but only smaller ones? (Hint: A/B testing small changes is a mess.)

Next steps

If anyone is interested in helping on this project, now’s a good time to get involved.

Specifically, I’d love it for folks to go through the script and verify that I haven’t made any logical mistakes. I don’t think I have, but also wouldn’t bet my house on it. That’s also why I’m not including any general “lessons learned” from this simulator just yet – I don’t want to report on results until others have verified that all is well with the script. I also wouldn’t rule out someone saying “Matt, the way you’ve done this doesn’t make any sense”. If I figure out any mistakes on my own or from others, I’ll write posts about them so others can learn as well.

If you can’t find any bugs, just play around with it. How does the original conversion rate impact the results? How does the distribution of results impact it? How does the criteria for ending the test impact it? Eventually we can publish our findings – the more people that contribute, the better the writeup will be.

Analyzing an A/B Test’s Impact Using Funnel Segmentation

If you decide to roll your own in-house A/B testing solution, you’re going to need a way to measure how each variation in each test influences user behavior.

In my experience the best way to do this is to take advantage of a third party analytics tool and piggyback on its funnel segmentation features. This post is about how to do that.

Funnel Segmentation 101

Consider this funnel from Lean Domain Search:

  1. A user performs a search
  2. Then clicks on a search result
  3. Then clicks on a registration link

In Mixpanel, the funnel looks like this:

Screen Shot 2016-08-04 at 9.29.19 AM.png

Of the 35K people who performed a search, 9K (26%) of them clicked on a search result, then 900 (10% who clicked, 2.5% overall) clicked a registration link.

We can then use Mixpanel’s segmentation feature to segment on various properties to see how they impact the funnel. For example, here’s what segmenting on Browser looks like:

Screen Shot 2016-08-04 at 9.32.16 AM.png

We can see that 27% of Chrome searchers click on a search result compared to only 18% of iOS Mobile visitors. We could also segment on other properties that Mixpanel’s tracking client automatically collects such as the visitor’s country, which search engine he or she came from, and most importantly for our purposes here, custom event properties.

Passing Variations as Custom Event Properties

Segmenting on a property like the visitor’s country is very similar conceptually to segmenting on which A/B test variation a user sees. In both cases we’re breaking down the funnel to see what impact the property value (each country or each variation) has on the rest of the funnel.

Consider a toy A/B test where we’re running an A/B test to measure the impact of the homepage’s background color on sign ups.

When the visitor lands on the homepage, we fire a Visited Homepage event with a abtest_variation property set to the name of the variation the user sees:

mixpanel.track( 'Visited Homepage', { abtest_variation: 'white' } );

view raw


hosted with ❤ by GitHub

With this in place, you can then set up a funnel such as:

  1. Visited  Homepage
  2. Signed Up

Then segment on abtest_variation to see what impact each variation has on the rest of the funnel.

In the real world, you’re not going to have white hardcoded like it is in the code snippet above. You’ll want to make sure that whatever A/B test variation the user is assigned to gets passed as the variation property’s value on that tracking event.

Further improvements

The setup above should work fine for your v1, but there are several ways you can improve the setup for long term testing.

Pass the test name as an event property

I recommend also passing an abtest_name property on the event:

mixpanel.track( 'Visited Homepage', { abtest_name: 'homepage test 3', abtest_variation: 'white' } );

The advantage of this is that if you’re running back to back tests, you’ll be able to set up your funnel to ensure you’re only looking at the results of a specific test without worrying that identically-named variations from earlier tests are impacting the results (which would happen if you started a test the same day a previous test ended). The funnel would look like this:

  1. Visited Homepage where abtest_name = homepage test 3
  2. Signed Up

Then segment on abtest_variation like before to see just the results of this A/B test.

Generalize the event name

In the examples above, we’re passing the A/B test details as properties on the Visited Homepage event. If we’re running multiple tests on the site, we’d have to pass the A/B test properties on all of the relevant events.

A better way to do it is to fire a generic A/B test event name with those properties instead:

mixpanel.track( 'Assigned Variation', { abtest_name: 'homepage test 3', abtest_variation: 'white' } );

Now the funnel would look like this:

  1. Assigned Variation where abtest_name = homepage test 3
  2. Signed Up

Then segment on abtest_variation again.

To see this in action, check out Calypso’s A/B test module (more on that module in this post). When a user is assigned an A/B test variation, we fire a calypso_abtest_start event with the name and variation:

analytics.tracks.recordEvent( 'calypso_abtest_start', { abtest_name: this.experimentId, abtest_variation: variation } );

view raw


hosted with ❤ by GitHub

We can then analyze the test’s impact on other events using Tracks, our internal analytics platform.


The nice thing about using an analytics tool to analyze an A/B test is that you can measure the test’s impact on any event even after the test has finished. For example, at first you might decide you want to measure the test’s impact on sign ups, but later decide you also want to measure the test’s impact on users visiting your support page. Doing that is as easy as setting up a new funnel. You can event measure your test’s impact on multiple steps of your funnel because that’s just another funnel.

Also, you don’t have to litter your code with lots of conversion events specific to your A/B test (like how A/Bingo does it) because you’ll probably already have analytics events set up for the core parts of your funnel.

Lastly, if your analytics provider provides an API like Mixpanel you can pull in the results of your A/B tests into an internal report where you can also add significance results and other details about the test.

If you have any questions about any of this, don’t hesitate to drop me a note.

Transferring ROMs to RetroPie


I recently bought a Raspberry Pi and configured it to play some of my favorite oldschool SNES video games. Transferring the video game ROMs over to the Raspberry Pi was one of the more confusing aspects of the setup so in this post I’ll share the steps I took to do it.

Obtaining ROMs

There are two main ways for obtaining ROMs:

  1. The legal way: buy a device that lets you create ROMs from your physical game cartridges. More on how to do that in this ArsTechnica article.
  2. The not-so-legal-way: go on ThePirateBay and download a torrent containing a library of ROMs that others have created.

Regardless of which way you go, in the end you should end up with one or more ROMs on your computer:

Screen Shot 2016-08-02 at 9.40.10 AM.png

Transferring the ROMs to your Raspberry Pi

There are a bunch of ways to do this: USB, SFTP, scp, and more.

I have a great Mac app called Transmit that provides SFTP functionality which made it my go-to choice for performing the transfer.

Simply set up a new favorite with your Raspberry Pi’s credentials:

Screen Shot 2016-08-02 at 9.42.48 AM.png

Then connect and transfer the ROMs from your computer to the appropriate subdirectory in the Raspberry Pi’s RetroPie/roms directory. For example, this Contra III ROM is an SNES ROM so I transferred it into the RetroPie/roms/snes directory:

Screen Shot 2016-08-02 at 9.46.05 AM.png

After the ROM is transferred, restart the RetroPie (Menu > Quit > Restart System), select the appropriate gaming system (Super Nintendo in this case), find the ROM in the game list, and you’re ready to play.