Analyzing a Conversion Funnel in BigQuery Using Fivetran Powered Mixpanel Data

mixpanel-funnels_fqmumg.png

In a recent post I outlined how to use Fivetran to sync Mixpanel data to BigQuery for analysis in Looker. Today we’ll walk through how to write a SQL query to analyze a funnel using the Mixpanel data in BigQuery.

For this analysis, we’re going to create a three-step funnel showing how many visitors who start on Help Scout’s pricing page click through to the sign up page and then sign up.

One Step

To begin, lets just look at visitors who viewed the pricing page:

SELECT
distinct_id,
time AS viewed_pricing_at
FROM mp.event
WHERE
name = "Viewed Page" AND
current_url LIKE "https://www.helpscout.net/pricing/%"
+————————————————–+——————————–+
| distinct_id | viewed_pricing_at |
+————————————————–+——————————–+
| 1628dfc9f7e00184dcc4e71615c5a49a101628dfc9f800 | 20180403 17:09:27.000000 UTC |
| 16291bd9645601abc2a0f87a6c722c88016291bd9648ae | 20180404 10:41:32.000000 UTC |
| 162936a4e6f3701526b616b4b327d25800162936a4e738 | 20180404 18:27:46.000000 UTC |
| 162a0381d4810b3137c5860031d3c3d10d162a0381d496 | 20180407 06:10:35.000000 UTC |
+————————————————–+——————————–+

view raw
pricing-page.sql
hosted with ❤ by GitHub

You might wonder why we need the % at the end of the URL; that’s simply to make sure the results include pages with URL parameters such as https://www.helpscout.net/pricing/?utm_source=adwords.

Two Steps

Next, we’ll join these results on sign up page data, making sure that the sign up page views occurred after the pricing page views:

SELECT
pricing.distinct_id,
viewed_pricing_at,
viewed_sign_up_at
FROM (
SELECT
distinct_id,
time AS viewed_pricing_at
FROM mp.event
WHERE
name = "Viewed Page" AND
current_url LIKE "https://www.helpscout.net/pricing/%"
) pricing
LEFT JOIN (
SELECT
distinct_id,
time AS viewed_sign_up_at
FROM mp.event
WHERE
name = "Viewed Page" AND
current_url LIKE "https://secure.helpscout.net/members/register/%"
) sign_up_page ON sign_up_page.distinct_id = pricing.distinct_id AND sign_up_page.viewed_sign_up_at > viewed_pricing_at
+———————————————————+——————————–+——————————–+
| distinct_id | viewed_pricing_at | viewed_sign_up_at |
+———————————————————+——————————–+——————————–+
| 1628be3f8ab2080678714f9f0ecb34356b1fa4001628be3f8b825 | 20180403 07:31:26.000000 UTC | 20180409 06:08:08.000000 UTC |
| 1628be3f8ab2080678714f9f0ecb34356b1fa4001628be3f8b825 | 20180405 12:48:37.000000 UTC | 20180409 06:08:08.000000 UTC |
| 1628be3f8ab2080678714f9f0ecb34356b1fa4001628be3f8b825 | 20180403 07:28:42.000000 UTC | 20180409 06:08:08.000000 UTC |
| 1628be3f8ab2080678714f9f0ecb34356b1fa4001628be3f8b825 | 20180403 07:27:49.000000 UTC | 20180409 06:08:08.000000 UTC |
| 1628be3f8ab2080678714f9f0ecb34356b1fa4001628be3f8b825 | 20180403 07:35:51.000000 UTC | 20180409 06:08:08.000000 UTC |
| 162914788c5fb09337b4c437dc4b34356b144000162914788c628 | 20180404 17:30:18.000000 UTC | |
+———————————————————+——————————–+——————————–+

view raw
two-step-funnel.sql
hosted with ❤ by GitHub

The reason we LEFT JOIN is because not all visitors will make it to the next step of the funnel and we want the data to reflect that.

Note that we join on both the distinct_id (to ensure each result is for a single visitor) and on the time the events occurred.

Three Steps

Extending this to the third and final step of the funnel, the Signed Up event, we get:

SELECT
pricing.distinct_id,
viewed_pricing_at,
viewed_sign_up_at,
signed_up_at
FROM (
SELECT
distinct_id,
time AS viewed_pricing_at
FROM mp.event
WHERE
name = "Viewed Page" AND
current_url LIKE "https://www.helpscout.net/pricing/%"
) pricing
LEFT JOIN (
SELECT
distinct_id,
time AS viewed_sign_up_at
FROM mp.event
WHERE
name = "Viewed Page" AND
current_url LIKE "https://secure.helpscout.net/members/register/%"
) sign_up_page ON sign_up_page.distinct_id = pricing.distinct_id AND sign_up_page.viewed_sign_up_at > viewed_pricing_at
LEFT JOIN (
SELECT
distinct_id,
time AS signed_up_at
FROM mp.event
WHERE
name = "Signed Up"
) signed_up ON signed_up.distinct_id = pricing.distinct_id AND signed_up.signed_up_at > viewed_sign_up_at
+———————————————————+——————————–+——————————–+——————————–+
| distinct_id | viewed_pricing_at | viewed_sign_up_at | signed_up_at |
+———————————————————+——————————–+——————————–+——————————–+
| 1628be3f8ab2080678714f9f0ecb34356b1fa4001628be3f8b825 | 20180403 07:31:26.000000 UTC | 20180409 06:08:08.000000 UTC | |
| 1628be3f8ab2080678714f9f0ecb34356b1fa4001628be3f8b825 | 20180405 12:48:37.000000 UTC | 20180409 06:08:08.000000 UTC | |
| 1628be3f8ab2080678714f9f0ecb34356b1fa4001628be3f8b825 | 20180403 07:28:42.000000 UTC | 20180409 06:08:08.000000 UTC | |
| 1628be3f8ab2080678714f9f0ecb34356b1fa4001628be3f8b825 | 20180403 07:27:49.000000 UTC | 20180409 06:08:08.000000 UTC | |
| 1628be3f8ab2080678714f9f0ecb34356b1fa4001628be3f8b825 | 20180403 07:35:51.000000 UTC | 20180409 06:08:08.000000 UTC | |
| 162914788c5fb09337b4c437dc4b34356b144000162914788c628 | 20180404 17:30:18.000000 UTC | 20180404 17:31:13.000000 UTC | 20180404 17:34:51.000000 UTC |
| 162914788c5fb09337b4c437dc4b34356b144000162914788c628 | 20180404 08:31:16.000000 UTC | 20180404 17:24:52.000000 UTC | 20180404 17:34:51.000000 UTC |
+———————————————————+——————————–+——————————–+——————————–+

view raw
three-steps.sql
hosted with ❤ by GitHub

Determining the First Event Occurrence for Each Step

The query above will return every combination of pricing page views, sign up page views, and sign up events for each visitor. For our funnel though, we don’t care whether they loaded the pricing page or sign up page multiple times, we only care that they did it at all. So we modify the query to only return the first instance of each event for each visitor:

SELECT
pricing.distinct_id,
MIN(viewed_pricing_at) AS viewed_pricing_at,
MIN(viewed_sign_up_at) AS viewed_sign_up_at,
MIN(signed_up_at) AS signed_up_at
FROM (
SELECT
distinct_id,
time AS viewed_pricing_at
FROM mp.event
WHERE
name = "Viewed Page" AND
current_url LIKE "https://www.helpscout.net/pricing/%"
) pricing
LEFT JOIN (
SELECT
distinct_id,
time AS viewed_sign_up_at
FROM mp.event
WHERE
name = "Viewed Page" AND
current_url LIKE "https://secure.helpscout.net/members/register/%"
) sign_up_page ON sign_up_page.distinct_id = pricing.distinct_id AND sign_up_page.viewed_sign_up_at > viewed_pricing_at
LEFT JOIN (
SELECT
distinct_id,
time AS signed_up_at
FROM mp.event
WHERE
name = "Signed Up"
) signed_up ON signed_up.distinct_id = pricing.distinct_id AND signed_up.signed_up_at > viewed_sign_up_at
GROUP BY 1
+————————————————————+——————————–+——————————–+——————————–+
| distinct_id | viewed_pricing_at | viewed_sign_up_at | signed_up_at |
+————————————————————+——————————–+——————————–+——————————–+
| 162931933729b20b3ba365e6d865336c7b053840001629319337333b | 20180404 16:58:45.000000 UTC | 20180411 20:50:13.000000 UTC | 20180411 20:51:46.000000 UTC |
| 162b770945247a0145ce1cbf076d84958601fa400162b77094541286 | 20180411 19:56:18.000000 UTC | 20180411 20:02:54.000000 UTC | |
| 162899a5ec58950fc657f6c6ebde33697b07384000162899a5ec68c6 | 20180409 19:20:01.000000 UTC | 20180411 19:38:59.000000 UTC | |
| 162b73858df1f80625fc7a476d418763131183d10d162b73858e2237 | 20180411 17:19:53.000000 UTC | 20180411 17:23:22.000000 UTC | |
| 162b6d32dd11a50bbf34346ce9393553629231015162b6d32dd2646 | 20180411 15:30:09.000000 UTC | 20180411 16:40:08.000000 UTC | |
| 162875b83f63e20870a441baadb633697b07384000162875b83f714f | 20180402 10:15:44.000000 UTC | 20180411 14:56:37.000000 UTC | 20180411 14:57:27.000000 UTC |
+————————————————————+——————————–+——————————–+——————————–+

Measuring the Funnel

Finally, we count how many visitors made it to each step to get our funnel:

SELECT
COUNT(viewed_pricing_at) AS viewed_pricing,
COUNT(viewed_sign_up_at) AS viewed_sign_up,
COUNT(signed_up_at) AS signed_up
FROM (
SELECT
pricing.distinct_id,
MIN(viewed_pricing_at) AS viewed_pricing_at,
MIN(viewed_sign_up_at) AS viewed_sign_up_at,
MIN(signed_up_at) AS signed_up_at
FROM (
SELECT
distinct_id,
time AS viewed_pricing_at
FROM mp.event
WHERE
name = "Viewed Page" AND
current_url LIKE "https://www.helpscout.net/pricing/%"
) pricing
LEFT JOIN (
SELECT
distinct_id,
time AS viewed_sign_up_at
FROM mp.event
WHERE
name = "Viewed Page" AND
current_url LIKE "https://secure.helpscout.net/members/register/%"
) sign_up_page ON sign_up_page.distinct_id = pricing.distinct_id AND sign_up_page.viewed_sign_up_at > viewed_pricing_at
LEFT JOIN (
SELECT
distinct_id,
time AS signed_up_at
FROM mp.event
WHERE
name = "Signed Up"
) signed_up ON signed_up.distinct_id = pricing.distinct_id AND signed_up.signed_up_at > viewed_sign_up_at
GROUP BY 1
) visitor_data
+————–+—————-+———–+
| viewed_pricing | viewed_sign_up | signed_up |
+————–+—————-+———–+
| 1234 | 567 | 89 |
+————–+—————-+———–+

Voila!

From here, you can take it a step further and segment the results on a property in the first step of the funnel or add a funnel duration to limit how long visitors have to complete the funnel, etc.

If you have any questions about this, don’t hesitate to reach out.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s