This version of Vestibule is read-only. It is an archive of the community effort to produce content for Ruby Manor 4.

A/B testing got you elected, Mr. President

This proposal by Sam Phippen has been chosen by the community to be given at Ruby Manor 4.

updated about 4 years ago; latest suggestion about 4 years ago

Proposal

The Obama campaign this year ran what could be described as an insurgency. What most people don't know is that it used a simple statistical technique, A/B testing, to drive up donations and volunteer sign-ups, which ultimately led to more votes. I'd like to give a talk explaining how A/B testing works, how to do it in Ruby (Rails and Sinatra), and some pro tips, illustrated by real-world examples.

The structure of my talk will be roughly as follows:

  • Examples of A/B testing
  • How to do it (a demo of adding A/B testing to a simple Sinatra or Rails app; a rough sketch follows this list)
    • How to interpret the results (warning: this will involve statistics)
    • What not to do
    • What to measure
    • Changes suitable for A/B testing
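
To give a flavour of the "How to do it" step, here is a minimal hand-rolled sketch in Sinatra: each visitor is assigned a variant via a cookie and conversions are counted per variant. The routes, variant names and button copy are invented for illustration, and the talk itself would more likely lean on a dedicated gem (split, vanity or similar) and persistent storage rather than in-memory counters.

```ruby
# Minimal hand-rolled A/B test in Sinatra: assign each visitor to a variant
# via a cookie, then count how many visitors in each variant convert.
# Variant names, routes and copy are invented for illustration.
require 'sinatra'

VARIANTS = %w[signup_now join_the_campaign]

# In-memory counters; a real app would persist these (Redis, the database, ...).
COUNTS = Hash.new { |h, k| h[k] = { trials: 0, conversions: 0 } }

helpers do
  # Return this visitor's variant, assigning one at random on the first visit.
  def variant
    request.cookies['ab_variant'] ||
      VARIANTS.sample.tap { |v| response.set_cookie('ab_variant', value: v) }
  end
end

get '/' do
  v = variant
  COUNTS[v][:trials] += 1
  label = v == 'signup_now' ? 'Sign up now' : 'Join the campaign'
  "<form method='post' action='/donate'><button>#{label}</button></form>"
end

post '/donate' do
  COUNTS[variant][:conversions] += 1
  'Thanks for donating!'
end
```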

Changelog

  • Updated the outline based on recommendations from James Adam

Suggestions

  • Tim Cowlishaw suggests about 4 years ago

    This sounds great - I'd like to echo James' suggestion that you should definitely include something on how to interpret the results of A/B tests though - even if it's just a really high-level look at the principles of statistical hypothesis testing, and some of the more common errors to avoid.
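
As a taste of the interpretation Tim is asking for, here is a rough two-proportion z-test in Ruby; the visitor and conversion counts are invented, and a fuller treatment would also cover the test's assumptions and the common errors (peeking, multiple comparisons) mentioned above.

```ruby
# Rough two-proportion z-test: is variant B's conversion rate different from
# variant A's? The counts below are invented for illustration.
def z_score(conv_a, n_a, conv_b, n_b)
  p_a = conv_a.to_f / n_a
  p_b = conv_b.to_f / n_b
  pooled = (conv_a + conv_b).to_f / (n_a + n_b)
  se = Math.sqrt(pooled * (1 - pooled) * (1.0 / n_a + 1.0 / n_b))
  (p_b - p_a) / se
end

z = z_score(120, 2400, 156, 2400)
# For a two-sided test, |z| > 1.96 corresponds to p < 0.05.
puts "z = #{z.round(2)}; significant at the 95% level? #{z.abs > 1.96}"
```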

  • eastmad suggests about 4 years ago

    Please cover caching errors! Oh, these hurt so much...

  • jbsf suggests about 4 years ago

    As an obsessed follower of the 2012 election, the title seems an unnecessary stretch. Willing to let it go if "47%" finds its way into the statistical examples.

    Agreed with the others that touching on interpretation would be good. Also helpful would be discussion of the costs of managing tests (in time, effort, and attention), and consideration of when one should and should not A/B.

  • Andrew France suggests about 4 years ago

    The statistics side of things would be pretty useful. I suspect a lot of developers would have difficulty with calculating the duration of the test from the sample size needed to reach a certain confidence level.
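
A rough sketch of the calculation Andrew describes, using the standard approximate two-proportion sample-size formula at 95% confidence and 80% power; the baseline rate, target lift and traffic figure are invented.

```ruby
# Approximate sample size per variant needed to detect a lift from a baseline
# conversion rate, then the test duration implied by daily traffic.
# n ≈ (z_alpha/2 + z_beta)^2 * (p1(1 - p1) + p2(1 - p2)) / (p2 - p1)^2
def sample_size_per_variant(p1, p2, z_alpha: 1.96, z_beta: 0.84)
  ((z_alpha + z_beta)**2 * (p1 * (1 - p1) + p2 * (1 - p2)) / (p2 - p1)**2).ceil
end

baseline = 0.05                    # current conversion rate (invented)
target   = 0.06                    # smallest lift worth detecting (invented)
n = sample_size_per_variant(baseline, target)

daily_visitors_per_variant = 400   # invented traffic figure
days = (n.to_f / daily_visitors_per_variant).ceil
puts "Need ~#{n} visitors per variant, i.e. roughly #{days} days."
```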

  • Glenn Gillen suggests about 4 years ago

    Would be glad to hear discussion on how to decide when to interpret results. Lots of frameworks/tools I've used show results when they are in progress. Should sample size be pre-determined? Why not? How? What value can be gleaned from experiments that are in-progress? How does this play into the use of bandit algorithms?
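
On the bandit point, a toy epsilon-greedy sketch (the conversion rates and the 10% exploration figure are invented) shows the basic idea: traffic shifts toward the better-performing variant while the experiment is still running, rather than waiting for a pre-determined sample size.

```ruby
# Toy epsilon-greedy bandit: usually serve the variant with the best observed
# conversion rate, but explore a random variant 10% of the time.
# All numbers are invented for illustration.
EPSILON = 0.1
stats = {
  'a' => { trials: 0, conversions: 0 },
  'b' => { trials: 0, conversions: 0 }
}

pick_variant = lambda do
  if rand < EPSILON || stats.values.any? { |s| s[:trials].zero? }
    stats.keys.sample                                                # explore
  else
    stats.max_by { |_, s| s[:conversions].to_f / s[:trials] }.first  # exploit
  end
end

# Simulate 10,000 visitors against hidden "true" conversion rates.
true_rates = { 'a' => 0.05, 'b' => 0.07 }
10_000.times do
  v = pick_variant.call
  stats[v][:trials] += 1
  stats[v][:conversions] += 1 if rand < true_rates[v]
end

stats.each { |v, s| puts "#{v}: #{s[:trials]} trials, #{s[:conversions]} conversions" }
```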

  • James Adam suggests about 4 years ago

    Would this presentation talk about how to gather and interpret the results of A/B testing as well? Does that involve/require particular analytics systems? I would definitely be more interested if there was at least one concrete example of how to interpret the results for an A/B test.

    Also, are there any "wrong" ways of using A/B testing? Are there situations where A/B tests aren't suitable, or where the results can be misleading or inconclusive? I think this would be really useful guidance too.

    Thanks!