Wrangling dbt Database Permissions

One of the more time consuming aspects of getting dbt working for me has been figuring out how to set the correct database permissions. I’m comfortable analyzing data once it’s in a data warehouse, but haven’t have a ton of experience actually setting one up. This post is for anyone else in a similar spot looking to set up dbt.

For some context going into this, I’m using Amazon RDS for Postgres with Heroku data made available by Stitch.

Big picture, to summarize Martin Melin who gave me some tips on this on dbt Slack, is for each person to have a user with full access to its own dedicated schema and read-only access to the source schemas.

For a quick key on the different roles in my setup:

  • mhmazur: The superuser
  • mazur: The dbt user
  • stitch: For Stitch to use
  • mode: For Mode to use

1) Create a user for dbt to use to connect to your data warehouse

With the exception of step 4, all of these need to be run from a superuser account:

-- As mhmazur
create role mazur with login password '...';

These are the credentials you’ll set in your profile.yml file for dbt to use to connect to your data warehouse.

2) Create an analytics schema

-- As mhmazur
create schema analytics;

This is where the production views and tables that dbt creates will exist. Your BI tools and analyses will use this data.

You could also set one up for development (and adjust the commands below accordingly) by changing the schema name to something like “dbt_mazur”.

If you tried to run dbt at this point, you’d get an error like “permission denied for schema analytics” because the dbt user doesn’t have access to it yet.

3) Give the dbt user full access to the analytics schema

-- As mhmazur
grant all on schema analytics to mazur;

Almost there! If you tried to run dbt now, you would get a different error: “permission denied for schema preceden_heroku” because the user doesn’t have access to the sources tables yet.

4) Make the dbt user the owner of the schema

-- As mhmazur
alter schema analytics owner to mazur;

This is necessary because when you created the schema, your superuser role became its owner. We need our dbt user to be the owner so that it can grant usage permissions from dbt hooks (see down below).

5) Finally, give the dbt user read-only access to the source tables

In order to transform the data, dbt needs to be able to query the source tables. In my case, the source tables live in a “preceden_heroku” schema owned by a “stitch” user. Therefore, to grant read-only access to the dbt user, I have to log in as the “stitch” user and run the following:

-- As stitch
grant usage on schema preceden_heroku to mazur;
grant select on all tables in schema preceden_heroku to mazur;

The last command is necessary even if you don’t materialize any models as tables because it also grants select permissions on views as well (even though it just says “tables” in the command). Per the docs: “note that ALL TABLES is considered to include views”.

Undoing these changes

You may run into issues and need to reverse these changes. Here’s how:

From the “stitch” user, revoke the read-only rights you gave to the user:

-- As stitch
revoke usage on schema preceden_heroku from mazur;
revoke all privileges on all tables in schema preceden_heroku from mazur;

Then from the dbt user (“mazur” in my case) change the owner back to your superuser:

-- As mazur
alter schema analytics owner to mhmazur;

Then back from your super user account (“mhmazur” for me), revoke permissions to the analytics schema:

-- As mhmazur
revoke all on schema analytics from mazur;

And then remove the analytics schema:

-- As mhmazur
drop schema analytics cascade;

And finally remove the user:

-- As mhmazur 
drop role mazur;

Granting read-only access to the analytics schema

One last thing: you’ll want to grant read-only access to the analytics schema so that your BI tool can execute queries. As the dbt user, run:

-- As mazur
grant usage on schema analytics to mode;
grant select on all tables in schema analytics to mode;

You’ll want to set this up as a hook so that whenever dbt changes the views and tables your BI maintains the correct permissions:

on-run-end:
 - 'grant usage on schema "{{ target.schema }}" to mode'
 - 'grant select on all tables in schema "{{ target.schema }}" to mode'

If you don’t do this, you’ll wind up getting a “permission denied for schema ______” or “permission denied for relation ______” error in your BI tool.

Happy querying!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s