Using Fivetran’s Google Sheets Integration to Get Monthly Unique Visitors Into BigQuery and Looker

fivetran-google-sheets-bigquery-looker.png

At Help Scout we use Fivetran to get data from various data sources (like our internal MySQL database, HubSpot, and Salesforce) into BigQuery so that we can analyze it all using Looker.

One of the challenges I’ve run into is that even though Fivetran supports Google Analytics, the data is limited to what’s available via the Google Analytics API. One of the main metrics we want – Unique Visitors – is available, but only on a 1-day, 7-day, 14-day, or 30-day basis:

Screen Shot 2017-12-05 at 9.41.27 AM.png

This means that if we want to know how many Unique Visitors we had each month, there’s no way to get that out of the Google Analytics via its API, which means Fivetran can’t grab it and automatically put it into BigQuery, which means we can’t have it in Looker.

Enter Fivetran’s Google Sheets integration

Fortunately, there is a hacky way to do it by taking advantage of Fivetran’s Google Sheets integration. Big picture, we’re going to manually export monthly Unique Visitors from the Google Analytics UI, put that data into a Google Sheet, then configure Fivetran to pull that data out of the Google Sheet and put it into BigQuery so we can analyze it with Looker. Here’s how the setup works:

In our Google Sheet, we select the range of data we want then go to Data > Named ranges… to give it a name:

Screen Shot 2017-12-05 at 9.48.53 AM.png

We give it a name that begins with fivetran_ such as fivetran_monthly_unique_visitors_test:

Screen Shot 2017-12-05 at 9.51.44 AM.png

Next, we grab the Google Sheet ID from the URL:

Screen Shot 2017-12-05 at 9.54.01 AM.png

And paste that into Fivetran:

Screen Shot 2017-12-05 at 9.54.52 AM.png

Authorize Fivetran to access Google Sheets:

Screen Shot 2017-12-05 at 9.55.39 AM.png

Fivetran will then kick off the initial sync:

Screen Shot 2017-12-05 at 9.56.14 AM.png

And then shoot you an email when it’s done:

Screen Shot 2017-12-05 at 10.04.31 AM.png

Now over in BigQuery, we can check to verify the table is synced:

Screen Shot 2017-12-05 at 10.05.29 AM.png

Step 1, complete.

Next, lets create this chart in Looker

We’ll tell Looker we want to create a view from a table:

Screen Shot 2017-12-05 at 10.07.07 AM.png

I had to click the Refresh button to get the Google Sheets table to appear:

Screen Shot 2017-12-05 at 10.07.56 AM.png

Before tinkering with the view, here’s what the Google Sheet model might look like:

And here’s the view:

There is some hacky stuff going on here so let me explain what’s what:

  • In the Google Sheet, our dates are formatted like “1-Jan-16″. Looker has no idea what to do with that out of the box, so we need to use Standard SQL’s PARSE_DATE function to tell it how to parse it. The dimension expects a timestamp so we then need to cast the date to a timestamp. And because we’re only ever going to be using months here, we can get rid of the other date/time options leaving just months.
  • Similarly, the spreadsheet has numbers like “743,353″. This string isn’t a valid number, so we use REPLACE to eliminate the commas before casting it to an integer.
  • You might be wondering about the measure: in order to use this value in a chart, it has to be a measure, not a dimension. A hacky way to do that is just to use the measure type of max (though min or average or others will do too) which will just grab the maximum value for that date range – and since this value is the only value in the date range, it just returns the number of monthly uniques.

Back in Looker we can now create the chart:

Screen Shot 2017-12-05 at 10.24.28 AM.png

The main downside to this is of course that you have to manually update the Google Sheet if you want the chart to update, but for something like Monthly Unique Visitors, updating it once a month isn’t a big deal. Though if you can think of an automated way to do it, I’d be interested to learn how.

Cheers!

Automatically Updating HubSpot Contact Properties Using Data in Looker

At Help Scout we rely heavily on HubSpot to communicate with our customers.

In order to set up workflows and personalize those emails, we need to get data about our customers into HubSpot. In the past this usually required lots of manual CSV wrangling or involving one of our engineers who would update some backend code to push the relevant data over to HubSpot using their API.

Several months ago we started using Looker, a mind-blowingly powerful Business Intelligence (BI) tool, to analyze and report on data.

As part of a project this week, I did some research to figure out whether or not it’s possible to set HubSpot contact properties using the data we have in Looker. It turns out that it is possible, but there are some gotchas that took some time to figure out. This post is about exactly how to do it.

Big picture, we’re going to schedule a Look to send data to Zapier via a webhook, then use Zapier to update the contact properties in HubSpot. This guide assumes some familiarity with all three of these services.

Scheduling a Webhook in Looker

For this guide, lets imagine that we want to set a Plan Name contact property in HubSpot. We’ll need two data points: the contact’s email address and the plan name.

Here’s what the Look might look like:

Screen Shot 2017-12-01 at 2.12.29 PM.png

Note that I’ll just be setting the plan name for a single user in this guide, but this would work fine if the Look contained a list of users/plan names as well.

Looker lets you set up a set up a Schedule for a Look. You can do things like email yourself data on a regular basis, upload data to an FTP, or for our purposes here, post the data to a third-party webhook.

Before we can set this up though, we’re going to need a Zapier webhook URL to send the data to.

Setting up the Trigger in Zapier

Create a new Zap and select Webhooks by Zapier as the trigger app.

The trigger should be Catch Hook (Wait for a new POST, PUT, or GET to a Zapier URL):

Screen Shot 2017-12-01 at 2.17.53 PM.png

For the Set Up Options, it’s going to ask you whether you want to pick off a child key from the payload. Enter data there. This will tell Zapier to ignore the rest of the payload that Looker sends it and just grab the data (the emails and plan names):

Screen Shot 2017-12-01 at 2.19.37 PM.png

Next, copy the webhook URL that Zapier provides you:

Screen Shot 2017-12-01 at 2.19.58 PM.png

Head back to the Looker Schedule. Make these changes to the defaults:

  • Destination: Webhook
  • Address: paste the URL that Zapier provided
  • Format: Change it to JSON – Simple, Inline
  • Uncheck formatted data values
  • You can also change the other options like the schedule based on your requirements.

It should look something like this:

Screen Shot 2017-12-01 at 2.23.47 PM.png

To test it out, click the Send Test button at the top.

Back in Zapier, it should tell you that your test was successful:

Screen Shot 2017-12-01 at 2.24.57 PM.png

You can click on the view your hook link to view the data:

Screen Shot 2017-12-01 at 2.25.28 PM.png

It has the email and plan, just what we expected.

Next, lets tell Zapier what we want it to do with this data.

For the Step 2 action, choose HubSpot as the desination app. For the action, choose Create or Update Contact:

Screen Shot 2017-12-01 at 2.26.57 PM.png

Now here’s the neat part. For the Set Up Template step, we’re going to map the data from Looker over to HubSpot contact properties. For this example, I’ll map the Contact Email HubSpot property to the email address from the Looker data:

Screen Shot 2017-12-01 at 2.28.55 PM.png

Screen Shot 2017-12-01 at 2.29.41 PM.png

And then map the plan name HubSpot property to our data as well:

Screen Shot 2017-12-01 at 2.30.04 PM.png

Click Continue at the bottom of the page to move onto the testing step. Zapier will show you a summary of the mapping:

Screen Shot 2017-12-01 at 2.31.02 PM.png

Before you run the test, I’d recommend loading the contact’s details in HubSpot to confirm the property is not set (that way you can be sure if it’s set after you test it that it was a result of your test and not because it was previously set):

empty-plan-name.png

The Plan Name is blank, so now lets run the test.

Zapier will tell you it was sent:

Screen Shot 2017-12-01 at 2.36.19 PM.png

Reload the contact page, cross your fingers, and…

plan-name-set.png

Voila! We successfully updated a HubSpot contact property by taking advantage of Looker’s webhooks and Zapier’s integrations.

From here, you could turn on your Zap to make it permanent and then just make sure to save the schedule for the Look so it happens automatically going forward.

Screen Shot 2017-12-01 at 2.40.37 PM.png

Troubleshooting Tips

If the HubSpot property isn’t set like you expect, there are two things to check out. First is the Task History in Zapier:

Screen Shot 2017-12-01 at 2.41.01 PM.png

Here you’ll be able to see whether Zapier received the data from Looker and whether it ran into any problems:

Screen Shot 2017-12-01 at 2.42.01 PM.png

If you don’t see the request there, it probably means that Looker didn’t send it over to Zapier. To check, head to your Settings page in Looker and go to the Schedule History section. You should see the details:

Screen Shot 2017-12-01 at 2.43.20 PM.png

I ran into an issue while testing this where Zapier wasn’t receiving the data because I had the Schedule option toggled that told Looker not to send the data if nothing had changed. Because I had already tested it, Looker didn’t fire the request. This page will tell you if something like that happened.

That should do it! If you run into any trouble setting this up for your company, don’t hesitate to drop me an email.

Companies with large sales teams tend to negotiate more on pricing

Another insight from Tim Thyne around pricing:

If a company has a large sales team, they have more capacity to work 1:1 with potential customers to close deals, even small ones. That means that they will tend to negotiate more on pricing compared to a company without a large sales team. Remember that when you see the pricing they display on their website: the actual price that their customers pay is likely less than that.

Growth Tip: Email newsletter subscribers who haven’t tried your product to understand why not

If you have a newsletter that includes people who haven’t tried your product, you can learn a lot by emailing those people and asking why they haven’t tried your product.

It helps you understand not just about where they may be from a sales opportunity perspective, but also to learn about their perception of your product. For example, maybe they don’t know about your product, or thought it worked in a different way, or thought it was comparable to a product they already use, or didn’t realize it could help them, or thought it did something completely different than what it actually does. You can then use what you learn to optimize your sales and marketing efforts.

Hat-tip Tim Thyne for the insight.

Edit: And also hat-tip to John Bonini, Director of Marketing at Databox, for sharing the idea with Tim.

On the Benefits to Customers of Paying Annually

Highlights from a discussion at Help Scout about the benefits to customers of paying annually:

  • Finance teams will plan or budget for their teams web apps cost for the year.
  • Similarly, most managers prefer to know their budgeted spend and to make that spend upfront. It helps them avoid mid-year “I need more money to pay for our software” surprises.
  • This also helps managers secure their budget and get their needs resourced upfront because later on they may find their budget is cut and anything that isn’t already committed can be subject to those cuts.
  • During the planning stages for purchasing software, there is often an ambitious vision for the roll out. By securing the resources/budget for an entire year, it helps the manager realize that vision.
  • The ideal behavior is letting companies prepay, but then drawing down on that credit based on usage/engagement. It helps companies avoid paying for services they aren’t using and also lets the business earn more for companies that grow during the year. (We do this at Help Scout and I’ll write a detailed post about the mechanics of it in the future.)

Hat-tip David McFarlane for these insights!

On Making it Easy for Customers to Quickly Choose the Right SaaS Plan

One of the benefits of working with so many brilliant people at Help Scout is that I’m exposed to a lot of wisdom about how to build and grow a SaaS company.

I’m going to try to start sharing more of that wisdom here, not just for my own records but to help anybody else looking to build or grow their own software business. To avoid sharing anything competetive I won’t always be able to provide context, but hopefully the quotes mostly stand for themselves.

On SaaS plans:

 

In self-service models, it’s really important that customers have no hesitation about which plan to choose. They need to immediately know when they look at the plans which one is right for them.

And on pricing pages:

Pricing pages are the revenue powerhouse in a self-serve model.

Counting in MySQL When Joins are Involved

There’s a MySQL pattern that I use fairly frequently that I want to share, both to help anyone else who might find it useful and also to find out if there’s a beter way.

Here’s an example: you have one table for users, another for posts, another for pages:

And we want to know how many posts and pages each user has:

Attempt 1: COUNT with JOIN

For the moment, lets focus on just getting the post count by user.

We might be tempted to try JOINing the two tables and using COUNT:

The problem is that when we JOIN the two tables, the results will only include users who have posts. In this case, there’s no result for Jen (user id 3) because she doesn’t have any records in the posts table.

Attempt 2: COUNT with LEFT JOIN

Ok, so if JOIN doesn’t work, lets try LEFT JOIN:

Now Jen has a count, but it’s reflecting 1 post, not 0 like we expect.

We can see why it’s broken by looking at the LEFT JOIN results without aggregating them:

Jen doesn’t have any posts, but because we’re LEFT JOINing, her users record is still included. When we then aggregate the results with GROUP BY and COUNT, MySQL sees that the results have one record so returns a count of 1.

A better way to do this (as suggested by Tom Davies) is instead of counting all records, only count post ids:

Attempt 3: SUM/IF, and LEFT JOIN

Another way we can achieve what we want (for a single table join) is to use SUM/IF:

Here we’re saying “When the post id is null, give it a 0, otherwise a 1, then sum the results” which gives us the correct count. This is actually I trick I learned from the real Jen while at Automattic :).

The problem arises when we need to join multiple tables:

Now instead of Simon (user id 2) having 2 posts, he has 4 – what happened? Again, we can look at the un-aggregated results:

The problem is that each post gets joined with each page result. Simon has 2 posts, each of which gets joined with the 2 pages, so when we use COUNT it sees 4 results and returns that amount.

The solution: Subqueries and COALESCE

Here’s how I would solve it:

To understand how it works, lets focus on counting posts. The first subquery counts how many posts each user has if they have any posts:

We can then LEFT JOIN users on this derived table:

For the users with posts, the result has the post count. For the users without posts, the result is NULL. This is where COALESCE comes into play. COALESCE takes any number of arguments and returns the first non-NULL result:

So we’re saying “If the user has a posts count, use that, otherwise use 0”.

We can then LEFT JOIN again on the pages table and do the same thing with the pages count to get the posts and pages count per user. Because each subquery only returns a max of one result per user, we don’t run into the issue we did earlier where posts get joined with pages to return the incorrect count.

Somewhat complicated, but the only way I know how to do it. If you know of a better way, please drop a comment below. Thanks!

Edit:

This simpler method also works:

By counting distinct post ids and page ids, we avoid counting NULLs and also avoid counting duplicates due to joining on posts and pages.