Analyzing a Conversion Funnel in BigQuery Using Fivetran Powered Mixpanel Data


In a recent post I outlined how to use Fivetran to sync Mixpanel data to BigQuery for analysis in Looker. Today we’ll walk through how to write a SQL query to analyze a funnel using the Mixpanel data in BigQuery.

For this analysis, we’re going to create a three-step funnel showing how many visitors who start on Help Scout’s pricing page click through to the sign up page and then sign up.

One Step

To begin, lets just look at visitors who viewed the pricing page:

You might wonder why we need the % at the end of the URL; that’s simply to make sure the results include pages with URL parameters such as

Two Steps

Next, we’ll join these results on sign up page data, making sure that the sign up page views occurred after the pricing page views:

The reason we LEFT JOIN is because not all visitors will make it to the next step of the funnel and we want the data to reflect that.

Note that we join on both the distinct_id (to ensure each result is for a single visitor) and on the time the events occurred.

Three Steps

Extending this to the third and final step of the funnel, the Signed Up event, we get:

Determining the First Event Occurrence for Each Step

The query above will return every combination of pricing page views, sign up page views, and sign up events for each visitor. For our funnel though, we don’t care whether they loaded the pricing page or sign up page multiple times, we only care that they did it at all. So we modify the query to only return the first instance of each event for each visitor:

Measuring the Funnel

Finally, we count how many visitors made it to each step to get our funnel:


From here, you can take it a step further and segment the results on a property in the first step of the funnel or add a funnel duration to limit how long visitors have to complete the funnel, etc.

If you have any questions about this, don’t hesitate to reach out.

Querying Mixpanel using JQL from the Terminal


Mixpanel provides a way to query the data you send to it using something they call JavaScript Query Language (JQL).

To experiment with it, you can navigate to the JQL app for your project:

Screen Shot 2018-04-09 at 4.07.17 PM.png

Which takes you to the JQL editor:

Screen Shot 2018-04-09 at 4.07.50 PM.png

Alternatively, you query Mixpanel’s JQL endpoint directly from a local script:

First, you need to create a file with the JQL. I called mine testing.js:

Next, grab your project’s API Secret from its settings.

Finally, execute the JQL:

$ curl --silent -u YOUR_API_KEY: --data-urlencode script@testing.js | python -m json.tool

The last part that pipes to python simply formats the JSON response when it’s displayed in the terminal for easy viewing:

This is a trivial example obviously, but there’s a lot more you can do with JQL. You can use JQL to perform any type of analysis that you can in Mixpanel itself, and more.


Tracking What Pages Your Visitors View Prior to Signing Up Using Mixpanel, Fivetran, BigQuery, and Looker


One of the things we’ve been analyzing at Help Scout recently is what paths individual companies take before signing up for a trial. For example, looking at a company that signed up last month, did they start their journey with Help Scout on our homepage? Or did they find us via our blog? Or one of our marketing landing pages? And once we know that, what other pages did they visit before signing up? Did that company wind up becoming a customer? How much did we make from them?

This post is about how we’ve wrangled the data using Mixpanel, Fivetran, BigQuery, and Looker to help us answer these questions.

Big picture, we’re tracking page view and sign up events in Mixpanel, syncing that data to BigQuery using Fivetran, then tying it to our internal company data in Looker for easy analysis.

Step 1: Tracking Page Views and Sign Ups in Mixpanel

If you’ve used an event-based analytics service, this step will be pretty straightforward. We load the Mixpanel script on every page, then fire a Viewed Page event:

And then if the company signs up for a trial, we fire a Signed Up event with that company’s id as a property:

These are the only two Mixpanel events we track. If we wanted to track actions in-app, we could also fire custom events for those, but we don’t at Help Scout because we tie this Mixpanel data together with our internal data about what companies have done in-app, eliminating the need for additional Mixpanel events.

Here’s what the Viewed Page event looks like in Mixpanel’s Live View:

Screen Shot 2018-04-06 at 1.42.07 PM.png

Also as we’ll see later, Mixpanel automatically tracks a lot of details about the visitor: things like the browser, country, URL, referrer, OS, etc, all of which we can use use in our analyses.

Step 2: Syncing Mixpanel data to BigQuery with Fivetran

Fivetran is this amazing service that specializes in helping you centralize all of your data in a data warehouse. For example, we have Fivetran connectors set up for MySQL (which we use internally at Help Scout), Salesforce, HubSpot, Google Sheets, and now Mixpanel:

Screen Shot 2018-04-06 at 1.46.42 PM.png

Screen Shot 2018-04-06 at 1.51.58 PM.png

Taking Mixpanel as an example, we provide Fivetran out Mixpanel API credentials, then Fivetran queries Mixpanel’s API periodically, cleans up the results, and throws it all in BigQuery:

Screen Shot 2018-04-06 at 1.50.21 PM.png

This lets us analyze our Mixpanel data just like we would other SQL data:

We can also access the custom event properties that we’re tracking for the Signed Up event using Standard SQL’s json_extract_scalar function:

Once we have the company id, it’s just a matter of querying for a specific company id to view their event history:

Step 3: Modeling the data in Looker

If you have access to a Business Intelligence tool like Looker, you can model this Mixpanel data and how to join it with your other data.

First, create a view to model the Mixpanel event data:

Then create a view with a derived table to model the relationship between a distinct_id in Mixpanel with a company id:

Finally, connect those two views and any others you want to be able to analyze together:

Step 4: Analyzing the data in Looker

Once you’ve modeled the data, you can analyze the event history for specific companies, track trends like page views and unique visitors, and more:

Screen Shot 2018-04-06 at 2.28.57 PM.png

In future posts I’ll walk through some of the other interesting analyses you can perform with this data.

If you have any questions about this setup, don’t hesitate to reach out.