Precision and metrics

I spend more time than I’d like to admit investigating why metrics that should be identical vary from one tracking source to another. Sometimes the difference comes from bugs, but often it’s simply a consequence of how each tracking system works. The work has also given me an appreciation for just how difficult it is to get truly precise metrics.

Consider this simple example:

You run a site where people sign up for an account and then create a post. What percentage of accounts create a post?

Sounds pretty simple, right? Here are two possible approaches and where they can lead your metrics astray:

Using your database to figure out the answer

With this approach, we look at how many accounts there are, figure out how many of them have a post, and calculate the percentage that way. If there were 1,000 accounts created in August and 200 of them have a post, 20% have created a post (yay math).

But… what about users who create a post then delete the post? It’s possible that 20 people wound up creating a post then deleting it, so the “true” number of accounts that created a post is actually 220, but your database only reflects 200 posts so you’ll wind up reporting 20% instead of 22%.
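
To make the arithmetic concrete, here is a small sketch using the numbers above. The function and variable names are made up for illustration; they are not part of any real system.

```javascript
// Hypothetical helper: percentage of accounts that created a post.
function postCreationRate(accountsWithPost, totalAccounts) {
  if (totalAccounts === 0) return 0;
  return (accountsWithPost * 100) / totalAccounts;
}

const accounts = 1000;          // accounts created in August
const withSurvivingPost = 200;  // accounts that still have a post in the database
const deletedTheirPost = 20;    // accounts that posted, then deleted the post

// What the database tells you versus what actually happened.
const reportedRate = postCreationRate(withSurvivingPost, accounts);
const trueRate = postCreationRate(withSurvivingPost + deletedTheirPost, accounts);

console.log(reportedRate); // 20
console.log(trueRate);     // 22
```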

What about users who deleted their account? Instead of 1,000 accounts there were actually 1,100 accounts, but 100 of their owners wound up deleting their account. Unless you set up some special tracking for this, you’re now in a spot where you don’t know exactly how many accounts were created or how many of them created posts.

If your account records have a unique id that increments by 1 each time a new one is created, you could maybe work out how many accounts were created from the ids.
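
A rough sketch of that idea, assuming you know the ids of the first and last accounts created in the month (the ids below are invented). The "maybe" matters: auto-incrementing ids can have gaps for reasons other than deletion, such as failed inserts in some databases, so treat the result as an estimate.

```javascript
// Estimate how many accounts were created from an inclusive id range.
// Assumes ids increase by exactly 1 per account, which may not hold in practice.
function estimateAccountsCreated(firstId, lastId) {
  if (lastId < firstId) return 0;
  return lastId - firstId + 1;
}

// Hypothetical ids for the first and last accounts created in August.
const estimatedCreated = estimateAccountsCreated(5001, 6100);
const stillInDatabase = 1000;
const estimatedDeleted = estimatedCreated - stillInDatabase;

console.log(estimatedCreated); // 1100
console.log(estimatedDeleted); // 100
```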

One way around this is to not let people delete their posts or accounts. Maybe you tell them their post was deleted, but don’t actually delete it from your database. Just set a “deleted” flag on the database record and don’t show it to the user. Same thing for accounts. But what happens if they think they deleted their account then try to sign up again with the same email? Will you tell them that their account already exists even though supposedly you deleted it? There are technical solutions around this, but things are getting pretty complicated already.
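
Here is a minimal sketch of the “deleted” flag idea, using an in-memory array as a stand-in for a posts table. The record shape and function names are assumptions for illustration; a real app would do this in its database layer.

```javascript
// In-memory stand-in for a posts table.
const posts = [
  { id: 1, accountId: 10, deleted: false },
  { id: 2, accountId: 11, deleted: false },
];

// "Delete" a post by flagging it instead of removing the record.
function softDeletePost(postId) {
  const post = posts.find((p) => p.id === postId);
  if (post) post.deleted = true;
}

// What the user sees: only posts that are not flagged.
function visiblePosts() {
  return posts.filter((p) => !p.deleted);
}

// What the metric uses: every account that ever created a post.
function accountsThatEverPosted() {
  return new Set(posts.map((p) => p.accountId)).size;
}

softDeletePost(2);
console.log(visiblePosts().length);    // 1
console.log(accountsThatEverPosted()); // 2
```

The user sees one post, but the metric still counts both accounts.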

Using an analytics tool to figure out the answer

Another approach is to use a tool like Mixpanel or KISSmetrics to try to answer the question. Set up a funnel to measure the conversion rate from your Sign Up event to the Publish Post event.

The good news here is that if users delete a post, it won’t impact the conversion rate that the funnel reports.

The bad news is that if a user creates additional accounts, they’ll only count once in the funnel because funnels try to measure the actions of people, not accounts.
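
A toy illustration of the difference, assuming the analytics tool has already tied both accounts to the same person (whether it does depends on how identity is set up). The event shape and ids are invented.

```javascript
// Hypothetical sign up events after the tool has resolved identities.
const signUps = [
  { personId: 'p1', accountId: 'acct-1' },
  { personId: 'p1', accountId: 'acct-2' }, // same person, second account
  { personId: 'p2', accountId: 'acct-3' },
];

// Your database counts accounts; a funnel counts people.
const distinctAccounts = new Set(signUps.map((e) => e.accountId)).size;
const distinctPeople = new Set(signUps.map((e) => e.personId)).size;

console.log(distinctAccounts); // 3
console.log(distinctPeople);   // 2
```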

One possible solution

If I really, really wanted to answer this question precisely, I’d set up a new database table that keeps track of accounts and whether they’ve created a post. These records wouldn’t be impacted if the post or account gets deleted so we can trust that the data is accurate.
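
As a sketch of what that table might look like, here it is modeled as a Map keyed by account id. The names are hypothetical, and in practice this would be a real table written to whenever an account or post is created.

```javascript
// Hypothetical stand-in for the tracking table: one row per account ever created.
const accountStats = new Map();

function recordAccountCreated(accountId) {
  if (!accountStats.has(accountId)) {
    accountStats.set(accountId, { createdPost: false });
  }
}

function recordPostCreated(accountId) {
  const stats = accountStats.get(accountId);
  if (stats) stats.createdPost = true;
}

// Deleting a post or an account deliberately never touches accountStats.

function percentOfAccountsThatPosted() {
  if (accountStats.size === 0) return 0;
  let posted = 0;
  for (const stats of accountStats.values()) {
    if (stats.createdPost) posted += 1;
  }
  return (posted * 100) / accountStats.size;
}

[1, 2, 3, 4].forEach(recordAccountCreated);
recordPostCreated(2);
console.log(percentOfAccountsThatPosted()); // 25
```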

Alternatively, you can simply redefine your metric so that you can be precise: instead of “what % of accounts created a post?” you ask “what % of non-deleted accounts have a non-deleted post?”. Not very elegant, but far easier to answer than the original metric.

Does perfect precision matter?

I’d argue that for most metrics (with the exception of revenue metrics), being perfectly precise is not critical. The metrics are probably fine as long as they’re close to the true value and the way you calculate them is consistent over time.

tl;dr: data analysis is fun :).

Only 2537 Sundays Remain

Paras Chopra, CEO of Wingify, the company behind Visual Website Optimizer (VWO), created a nifty little Chrome extension as a weekend project. Whenever you open a new tab, it shows you how many Sundays you have left in your life (assuming an 80-year life expectancy).

Here’s what it looks like:

[Screenshot of the extension]

It’s a handy reminder that the clock is ticking.

My guess is that it was probably inspired by some of Tim Urban’s writing on the subject.

You can download the extension from the Chrome Web Store.

Measuring How Far Down Your Homepage Visitors Scroll

As optimistic web developers, we’d like to imagine that most people visiting our site for the first time will scroll down our homepage to check out all of the content we’ve painstakingly laid out for them.

But how many actually scroll down?

With a little bit of JavaScript and the help of an analytics service, we can figure it out:

// This code assumes that each section of your homepage uses a "section" tag
// and each one has an id corresponding to its purpose: "testimonials", "pricing", etc
( function() {
    var viewedSections = [];
    $( window ).on( 'scroll', function() {
        $( 'section' ).each( function() {
            var sectionName = $( this ).attr( 'id' );
            var distanceToTopOfViewport = this.getBoundingClientRect().top;
            if ( sectionName && distanceToTopOfViewport < 600 ) {
                // indexOf returns -1 when the section has not been recorded yet
                var hasAlreadyBeenTracked = viewedSections.indexOf( sectionName ) !== -1;
                if ( ! hasAlreadyBeenTracked ) {
                    viewedSections.push( sectionName );
                    analytics.record( 'Viewed Homepage Section', {
                        name: sectionName
                    } );
                }
            }
        } );
    } );
} )();

This fires a Viewed Homepage Section analytics event where the name property is set to the id attribute of each section tag that the visitor scrolls by (where “scrolls by” means that the top of the section is less than 600px below the top of the viewport). Each section is only recorded once per page load.

You can then set up a funnel to measure which sections visitors see:

  1. Viewed Homepage (which should be fired when anyone hits the homepage regardless of how far down they’ve scrolled)
  2. Viewed Homepage Section where the name property is features
  3. Viewed Homepage Section where the name property is pricing
  4. Etc etc
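
If your analytics service lets you export raw events, you could also compute the same funnel yourself. Here is a rough sketch; the event shape, the step names, and the sample data are all invented for illustration, and it ignores event order within a visit.

```javascript
// Hypothetical exported events: one row per visitor per step.
const events = [
  { visitor: 'a', step: 'homepage' },
  { visitor: 'a', step: 'features' },
  { visitor: 'a', step: 'pricing' },
  { visitor: 'b', step: 'homepage' },
  { visitor: 'b', step: 'features' },
  { visitor: 'c', step: 'homepage' },
];

// For each funnel step, count visitors who hit that step and every step before it.
function funnelCounts(events, steps) {
  const seen = new Map(); // visitor -> Set of steps they triggered
  for (const { visitor, step } of events) {
    if (!seen.has(visitor)) seen.set(visitor, new Set());
    seen.get(visitor).add(step);
  }
  return steps.map((_, i) => {
    const required = steps.slice(0, i + 1);
    let count = 0;
    for (const visitorSteps of seen.values()) {
      if (required.every((s) => visitorSteps.has(s))) count += 1;
    }
    return count;
  });
}

const counts = funnelCounts(events, ['homepage', 'features', 'pricing']);
console.log(counts.join(' > ')); // 3 > 2 > 1
```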

My guess is that you’re going to be very surprised by how few people scroll down your homepage. The results should also underscore just how important the content above the fold is because every one of your visitors will see that; the content below the fold, not so much. And if you’re A/B testing your homepage, you should focus on the content above the fold because that’s what most people see and where your tests will have the biggest impact.

You can get more sophisticated with this too by tracking how many people see a certain section when signing up. For example, you can measure what % of people who sign up actually saw the pricing section on your homepage:

// Add this below the analytics tracking:
if ( sectionName === 'pricing' ) {
    setCookie( 'viewed_homepage_pricing', 'true' );
}

// Then in your sign up event:
analytics.record( 'Signed Up', {
    viewed_pricing: getCookie( 'viewed_homepage_pricing' ) === 'true'
} );
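
The snippet above calls setCookie and getCookie without defining them. If you don’t already have cookie helpers, here is one possible sketch. The 30-day default and the path are my assumptions, and the string handling is split into its own functions so it can be tested outside a browser.

```javascript
// Build a cookie string such as "name=value; max-age=86400; path=/".
function serializeCookie(name, value, days) {
  const maxAge = Math.round(days * 24 * 60 * 60);
  return encodeURIComponent(name) + '=' + encodeURIComponent(value) +
    '; max-age=' + maxAge + '; path=/';
}

// Look up one cookie in an "a=1; b=2" style string such as document.cookie.
function readCookie(cookieString, name) {
  const prefix = encodeURIComponent(name) + '=';
  const parts = cookieString ? cookieString.split('; ') : [];
  for (const part of parts) {
    if (part.indexOf(prefix) === 0) {
      return decodeURIComponent(part.slice(prefix.length));
    }
  }
  return null;
}

// Thin browser wrappers matching the names used in the snippet above.
function setCookie(name, value, days = 30) {
  document.cookie = serializeCookie(name, value, days);
}

function getCookie(name) {
  return readCookie(document.cookie, name);
}
```

Since getCookie returns null when the cookie is missing, the viewed_pricing property ends up false for anyone who never scrolled to the pricing section.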

You’ll then be able to set up a funnel to measure what % saw the pricing, what impact that has on their post-sign up behavior, and more. All of this will help you understand your users better and help you make better decisions about the future of your product.

If you’re tracking other types of homepage analytics events like this that you find useful I’d love to hear about it – drop a comment below or shoot me a note on Twitter.