Tracking Daily Unique Visitors to Recently Published Blog Posts with Looker, Fivetran, Mixpanel, and BigQuery

If you work at a company that publishes a lot of content, it’s important to understand how well that content is performing. Not just in terms of page views and unique visitors, but whether it converts visitors into trials, etc.

At Help Scout we have a Looker dashboard to help us track all of these things and more. In this post, I’ll walk you through how we track the daily unique visitors to our recently published blog posts. For example, on May 2nd we published April’s Release Notes – how many people viewed that post on the day it was published? How about the day after? And how does that compare to our other recently published content?


Big picture, we fire a Viewed Page Mixpanel event on every marketing page. We then use Fivetran to get that event data into BigQuery, where we analyze it in Looker. You can read more about the setup here: Tracking What Pages Your Visitors View Prior to Signing Up Using Mixpanel, Fivetran, BigQuery, and Looker.

Querying for Recently Published Posts

With this data in hand, we need to figure out a way to determine what the recent blog posts were so that we can limit our analysis to them.

Here’s the query we use:

MIN(TIMESTAMP_TRUNC(time, DAY)) AS published_at
FROM mp.event
current_url like ""
COUNT(*) > 200 AND
MIN(time) >= "2018-04-04"

For Help Scout blog URLs (ie, URLs that begin with, we need to determine when the post was published. That’s the same as the first day it was viewed. However, because we launched Mixpanel page view tracking on April 3rd, this would make it look like every one of our posts was published on April 3rd or sometime after that. That’s why we limit the results to April 4th or later. Also, we want to limit it to posts that received at least a certain number of visitors that first day, otherwise the results will wind up including low traffic posts that were first viewed after April 4th.

This query gets us a list of those recently published posts:

| current_url | published_at |
| | 2018-05-03 00:00:00.000000 UTC |
| | 2018-05-02 00:00:00.000000 UTC |
| | 2018-04-26 00:00:00.000000 UTC |
| | 2018-04-19 00:00:00.000000 UTC |
| | 2018-04-05 00:00:00.000000 UTC |
| | 2018-04-04 00:00:00.000000 UTC |

Modeling the Data in Looker

Over in Looker, we’re going to create a derived table with these results so that we can determine whether a given Mixpanel event URL is a recently published blog post:

view: mp_new_blog_content {
derived_table: {
MIN(TIMESTAMP_TRUNC(time, DAY)) AS published_at
FROM mp.event
current_url like ""
COUNT(*) > 200 AND
MIN(time) >= "2018-04-04"
sql_trigger_value: FORMAT_DATETIME("%Y-%m-%d-%H", CURRENT_DATETIME()) ;;
dimension: current_url {
type: string
sql: ${TABLE}.current_url ;;
hidden: yes
dimension: is_new_blog_content {
label: "Is New Blog Content?"
type: yesno
sql: ${blog_post_publish_date} IS NOT NULL ;;
dimension_group: blog_post_publish {
type: time
timeframes: [
sql: ${TABLE}.published_at ;;

The reason we have the is_new_blog_content dimension here is because we’re going to LEFT JOIN all Mixpanel events on this derived table by the URL. Not all URLs will have a match in this table, so this dimension will let us limit the analysis to just events that were a recently published blog post.

Here’s how we model the relationship between our main Mixpanel events model and this derived table:

explore: mp_events {
view_label: "Mixpanel"
label: "Mixpanel"
join: mp_new_blog_content {
view_label: "Mixpanel"
type: left_outer
relationship: many_to_one
sql_on: ${mp_new_blog_content.current_url} = ${mp_events.current_url} ;;

One other key piece of this is that we model how to calculate unique visitors in the main Mixpanel events view:

view: mp_events {
sql_table_name: mp.event ;;
measure: unique_visitors {
type: count_distinct
sql: ${distinct_id} ;;

Creating the Chart in Looker

With these foundations in place, we can then create the chart we set out to.

We want to use that Is New Blog Content dimension to limit the results to recently published posts, then pivot the daily unique visitor count on the URL:


Then it’s just a matter of setting up the chart in Looker and voila, there we have it:

Screen Shot 2018-05-11 at 1.51.34 PM.png

Going forward, without anyone having to log into Google Analyics, we’ll be able to track the popularity of our new blog posts and track the trends over time.

By the way, that spike on April 19th is from our CEO’s Beacon 2.0 Preview: The User Interface post, all about Help Scout’s soon-to-be-released live chat & improved self service tool. If you’re interested in getting notified when it launches, you can sign up here.

Happy querying!

4 thoughts on “Tracking Daily Unique Visitors to Recently Published Blog Posts with Looker, Fivetran, Mixpanel, and BigQuery

  1. So you use 4 different tools and a data analyst, and this is somehow more efficient than having people log in to Google Analytics and get the same report in 3 clicks? Classic “business intelligence” logic: spend inconceivable amounts of time and money to get the same data you have elsewhere in a fancier chart. Except the data is probably wrong with all the transformations and duplication you had to do to get there.

    • Hey E, thanks for this comment, others may have similar questions.

      For our team, there are a lot of benefits to using this setup over GA:

      * All of our top of funnel reporting lives in one place: our team doesn’t have to load multiple reports in GA to understand trends. Everything is in a single dashboard that they can access with one click.
      * We don’t have to share GA segments across the team, ensuring everyone is looking at a single source of truth for our metrics.
      * Unlike GA, we can track individual visitors (thanks to Mixpanel) and join this data with other data sources (thanks to Fivetran). For example, in addition to conversion rates like what % of visitors convert into trials from specific content, we can see each company’s account status, total revenue earned, whether or not they’re working with sales, whether they have any support interactions, etc.

      For simple top of funnel metrics, GA gets the job done just fine. But for more advanced reporting, these tools are invaluable to us.

  2. Hi Matt,

    Your response to Matt is where the ‘value’ is in this post, as you’re doing much more than what the post alludes to :)

    A question – do you use Fivetran to capture data from multiple sources AND reconcile it before passing it on to Looker? Or does this happen in Looker?

    I’m hoping to achieve much the same, but my budget is less than $100 a month – do you know of any other tools I can look at?

    (Both platforms mentioned don’t publish their pricing which usually means they are way too pricey for smaller cos.)

    • Hey Lawrence!

      There are two different approaches you can take to working with your data: ETL and ELT. With ETL, you manipulate your data before analyzing it. With ELT, you load everything, then analyze it. That’s the route we go at Help Scout: load all of the data (from MySQL, Salesforce, HubSpot, Mixpanel, etc) into BigQuery using Fivetran, then analyze it in Looker. More on ETL vs ELT here:

      As far your budget though, both Looker and Fivetran will be way more expensive than $100/month.

      One free option is to use Metabase, an open source BI tool: If you go that route though, you’re going to need to figure out how to make your data available to it for analysis. They have some solid docs on their site that should help.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s