Friday Updates: Scheduling 200k Preceden Users for Deletion, Yuval Noah Harari Interview, AlphaGo

Photo by Gary Chan on Unsplash

What I’m working on at Preceden

As you might recall from previous updates, I’m working on a big data retention project for Preceden. For example, imagine someone who signed up for Preceden in 2015 and never did anything. This account brings up some interesting questions IMO:

  • Should I keep that user’s account around forever?
  • If the answer is no, do I need to notify the person that their account will be deleted?
  • How long does a user need to be inactive before being subject to deletion?
  • If I do notify them, how much time do I give them to log back in? How many reminder emails do I send?
  • If the user does have content, does it affect the decision about notifying them or not?

This is what I’ve been figuring out and implementing in Preceden recently.

Here’s where I’ve landed so far:

  • If a user hasn’t been active in 1 year, has never paid, and has no content in Preceden, I’ll have Preceden delete their account without any email notifications.
  • If a user hasn’t been active in 2 years, has never paid, and has little content in Preceden (1-4 events), I’ll also have Preceden delete their account without any email notifications. There’s an argument for notifying them, but it’s not clear cut: if someone barely used the product and hasn’t used it in years, do I really need to email them? The vast majority won’t care, many would find the email annoying, and some would report it as spam. And for the handful that do care, it’s easy enough for them to create a new account and recreate the small amount of content they previously had.
  • If a user gets deleted and then the user attempts to log in, I’ll display a message letting them know what happened.
  • For users with 5+ events who haven’t been active in 2 years and aren’t paying, Preceden will send an email notification and give them some time (maybe 60 days) to log back in, which would stop the deletion. Probably will send 2 emails about this, one at 60 days and another at like 10 days. Will make some exceptions for users who have popular publicly shared timelines since there’s value keeping them around. This process should also help reactivate a bunch of inactive users, hopefully leading to additional revenue.
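
To make the rules above concrete, here’s a rough Python sketch of how they could be expressed as a single decision function. This is not Preceden’s actual code: the `User` fields, the `Action` names, and the 365/730-day approximations of "1 year" and "2 years" are all assumptions for illustration, and treating the popular-public-timeline exception as "keep" is my guess at how that exception would work.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Action(Enum):
    KEEP = "keep"
    DELETE_SILENTLY = "delete_silently"
    NOTIFY_THEN_DELETE = "notify_then_delete"


@dataclass
class User:
    # All field names here are hypothetical, not Preceden's schema.
    last_active: date
    has_ever_paid: bool
    is_paying: bool
    event_count: int
    has_popular_public_timeline: bool = False


ONE_YEAR_DAYS = 365   # assumption: "1 year" approximated in days
TWO_YEARS_DAYS = 730  # assumption: "2 years" approximated in days


def retention_action(user: User, today: date) -> Action:
    """Map a user to the retention rule that applies to them."""
    inactive_days = (today - user.last_active).days

    # No content: delete silently after 1 year if they never paid.
    if user.event_count == 0:
        if not user.has_ever_paid and inactive_days >= ONE_YEAR_DAYS:
            return Action.DELETE_SILENTLY
        return Action.KEEP

    # Every remaining rule requires 2 years of inactivity.
    if inactive_days < TWO_YEARS_DAYS:
        return Action.KEEP

    # Little content (1-4 events): delete silently if they never paid.
    if user.event_count <= 4:
        return Action.KEEP if user.has_ever_paid else Action.DELETE_SILENTLY

    # 5+ events: notify first, unless paying or covered by the exception.
    if user.is_paying or user.has_popular_public_timeline:
        return Action.KEEP
    return Action.NOTIFY_THEN_DELETE
```

Keeping the whole policy in one pure function like this makes it easy to test each rule in isolation, which matters a lot when the consequence of a bug is a deleted account.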

Last two weeks I worked on the first three parts of this dealing with inactive users that don’t require notification. All in all, there are now 200k users scheduled for deletion over the next 60 days. The reason for 60 days is so that if there are any problems, I can stop the process and investigate before Preceden deletes too many users. I’ve got internal notifications set up so that if someone who is scheduled for deletion logs in, I’ll get an email so I can investigate. In theory there shouldn’t be too many of these, so if there are a lot it indicates something is amiss.
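
A minimal sketch of the "spread it out and watch for logins" idea, assuming deletions are distributed evenly across the window. The function names and the round-robin assignment are hypothetical, and the real scheduling logic may well differ.

```python
from datetime import date, timedelta


def schedule_deletions(user_ids, start, days=60):
    """Assign each user a deletion date, spread evenly over `days` days.

    Round-robin assignment (an assumption): user 0 goes on day 1,
    user 1 on day 2, and so on, wrapping around after `days` users.
    """
    return {
        user_id: start + timedelta(days=(index % days) + 1)
        for index, user_id in enumerate(user_ids)
    }


def should_alert_on_login(user_id, schedule):
    """True if a user who just logged in was scheduled for deletion."""
    return user_id in schedule
```

With 200k users and a 60-day window, this works out to roughly 3,300 deletions a day, which leaves plenty of room to pause the process if the login alerts start piling up.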

The code for all of this is probably the most thoroughly tested code I’ve ever written, hah. Want to be super careful and not accidentally delete an account that shouldn’t be deleted. It’s coming along.

What I’m working on at Help Scout

The usual: working on a job description for a data lead, project managing the setup of Heap in the app, data requests, Looker training videos, and using ML to improve our monthly KPI forecasts.

What I’m watching

This documentary on AlphaGo, the software DeepMind created to play the game of Go:

What I’m listening to

Tim Ferriss’s recent interview with Yuval Noah Harari. The discussion around 1 hr 25 minutes into it about the major global problems, especially about technology, was very thought provoking.

Be well my friends 👋

Friday Updates: Data Retention Progress, Better Call Saul, Doomscrolling

Photo by the Creative Exchange on Unsplash

What I’m working on at Preceden

I’ve been thinking a lot about data retention at Preceden. After people are done using the service, unless they manually delete their timelines and account, that data stays around forever. And since Preceden has been around 11 years now, that means a lot of accounts and a lot of timeline data that’s sitting there that’s serving no purpose. By figuring out a way to clean it up, the database will perform better, there’s less impact in the event of a data breach, it serves as a way to retain paying customers (“we’re not just going to hold onto your data forever if you stop paying”) and in general it feels like the responsible thing to do.

Easier said than done though.

For a v1, I’m attempting to automate a solution to delete accounts that have no timeline data and haven’t been active in at least 2 years. There are actually a lot of accounts that fit these criteria: about 25% of users that sign up never create a single event. Spread that out over 11 years, and the number is pretty high.

Preceden isn’t set up to handle this type of thing well. Users have timelines, timelines have layers, layers have events. And some timelines have collaborators, some users have teams, and some users are set up as teachers with student sub-accounts. Iterating over each user and performing the necessary queries puts a ton of strain on the database, causing performance issues for people actively using it. And I haven’t tracked last active date in the past, adding further complication.

And for v2, I want to handle situations where users have timeline data, and that’s going to require notifying them, giving them some time to log in and indicate they want to keep using it, or to export their data if interested, etc. Lots and lots of complexity here and it all has to be done super carefully: don’t want to ever delete any accounts accidentally.

This is all solvable, just taking some time to work through it one issue at a time.

What I’m watching

Better Call Saul on Netflix:

I loved Breaking Bad back in the day, but never got into this spinoff. Finally started watching it and am really enjoying it.

What I’m (not) doomscrolling

Twitter.

I got into a bad habit the last few years of keeping Tweetbot open during the day and scrolling through tweets every 10-15 minutes throughout the day. It was largely driven by following politics on Twitter, ugh.

I recently got a new Macbook and decided not to install Tweetbot. It’s a hard habit to break, but I’ve definitely noticed my anxiety level decrease recently and I think not having Twitter open all day has contributed to that. It definitely helps that Biden won the presidency and Twitter politics is a lot more boring now.

Thank goodness.

Friday Updates: Preceden Pricing, Policy Page Design, Time Series Prediction, and Embracing Minimalism

What I’m working on at Preceden

New Pricing

Before:

After:

Big changes are:

  • Renaming the higher end plans to better match the type of people who those plans are targeting. For example, if you’re a project manager coming to Preceden to create a roadmap, these new plans should make it clearer that the Business plan is for you, whereas before the Basic and Pro plan names were nebulous.
  • Nixed the mid-range $69/year price point and introduced the higher $199/year one both as a way to make the $29/year and $99/year seem like better deals, and also to try to make as much money as possible from business and high-usage (6+ timeline) users.

In the past I would have A/B tested these plans together for a month to compare their performance side by side, but Preceden really does not have enough conversions to draw conclusions from these tests. I normally would justify that by saying that some A/B test data is better than no data, but I think I’ve made a lot of wrong decisions in the past based on inconclusive A/B test data, so I’m going to try to do better and just not run pricing tests going forward.

Policy Page Redesign

Before:

After:

Hat-tip to Milan, the front-end developer I’m working with, for these lovely improvements.

These policy pages (Terms of Service, Privacy Policy, Cookie Policy) don’t get much traffic, but I think it’s a nice touch putting in the extra effort to make them look really polished.

Misc

  • Chatted with a lawyer about revising the Terms of Service and Privacy documents.
  • Researched different types of insurance for SaaS businesses.
  • Moved Drip tracking from front-end to back-end tracking to both fix some bugs (the Sign Up event wasn’t getting fired for all new sign ups, for example, probably due to ad blockers) and also to have one less cookie on the site (back-end tracking doesn’t require a cookie whereas front-end does).
  • More Drip improvements: if a user updates their email, make sure to update it in Drip. If a user deletes their account, make sure to delete it from Drip.
  • Making sure if a user deletes their account, there’s no trace of their email in the database after that (with some exceptions like for people who pay).
  • Adding admin functionality so that if we refund a user, it automatically cancels their subscription by default as well, so they don’t accidentally get charged again down the road.

I do miss product work, but it feels good putting the time into these professionalization, maintenance, and bug fix tasks right now.

What I’m working on at Help Scout

Mostly the same things I’ve mentioned recently: PM’ing a project to evaluate Heap, getting the ball rolling on hiring a data team this year, answering data requests, etc.

One new thing though: We send out an automated email to a lot of people at the company each day reporting on various KPIs. I gave a talk about this Daily Metrics Email at Looker’s JOIN conference a few years ago. One of the things we report on is what we’re projecting each KPI to be by the end of the ongoing month. For example, how many new trials are we forecasting for January?

Historically, most of these projections are simply extrapolations based on the MTD performance. If 5 days of a 30 day month have passed and we’ve had 1,000 new trials, that’s 200 trials/day so we project 6,000 new trials for the month.
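
Here’s that extrapolation as a small Python sketch. The function name is made up for illustration (the real email is generated by a Ruby script), and it assumes the `as_of` date is the last fully completed day of the month so far.

```python
import calendar
from datetime import date


def project_month_total(mtd_total, as_of):
    """Extrapolate a month-to-date total to a full-month projection.

    Assumes `as_of` is the last completed day, so `as_of.day` days
    of data have been collected.
    """
    days_elapsed = as_of.day
    days_in_month = calendar.monthrange(as_of.year, as_of.month)[1]
    daily_average = mtd_total / days_elapsed
    return daily_average * days_in_month


# 1,000 trials after 5 days of a 30-day month (April has 30 days).
projection = project_month_total(1000, date(2021, 4, 5))
```

For the example above, `projection` comes out to 6,000: 1,000 trials over 5 days is 200 per day, times 30 days.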

This works alright, but has some issues, especially at the beginning of the month when we’re extrapolating based on only a few days of data. This is made worse when the month starts on a weekend, which brings the daily average (and therefore the projection for the month) down.

I’ve been playing around with alternative approaches to doing the projections, all in a Jupyter Notebook for now. Some time series machine learning solution is an obvious thing to try, but it’s unclear to me whether I could take a scikit-learn model and use it in the Ruby script that generates the email. Probably. But I think there’s also a strong argument for keeping the projection method simple which will make it easier to explain and debug, even if it winds up not being as accurate as a fancy blackbox ML model. We will see.
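
As one example of a simple, explainable alternative (not necessarily what I’ll end up using), here’s a sketch of a day-of-week-aware projection. It computes the historical average for each weekday, then scales the expected remainder of the month by how the month is tracking so far. Everything here is an assumption for illustration: the function names, the dict-based inputs, and the requirements that the history covers all seven weekdays and that the month-to-date data has no missing days.

```python
import calendar
from collections import defaultdict
from datetime import date


def weekday_averages(history):
    """Average value per weekday (0=Monday) from {date: value} history."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for day, value in history.items():
        totals[day.weekday()] += value
        counts[day.weekday()] += 1
    return {weekday: totals[weekday] / counts[weekday] for weekday in totals}


def project_month_total_weekday_aware(mtd, history):
    """Project a month total from {date: value} month-to-date data.

    Assumes `mtd` covers every day from the 1st through its latest date
    and that `history` includes at least one of each weekday.
    """
    averages = weekday_averages(history)
    last_day = max(mtd)
    days_in_month = calendar.monthrange(last_day.year, last_day.month)[1]
    month_days = [
        date(last_day.year, last_day.month, d)
        for d in range(1, days_in_month + 1)
    ]

    # What history says we should have seen so far vs. for the rest of the month.
    expected_so_far = sum(averages[d.weekday()] for d in month_days if d <= last_day)
    expected_rest = sum(averages[d.weekday()] for d in month_days if d > last_day)

    actual_so_far = sum(mtd.values())
    # Scale the remaining expectation by how this month is tracking so far.
    scale = actual_so_far / expected_so_far if expected_so_far else 1.0
    return actual_so_far + scale * expected_rest
```

The appeal of something like this is that a slow weekend at the start of the month no longer drags the whole projection down, and the math is still easy to explain: "we’re tracking at X% of a typical month, so we expect X% of a typical remainder."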

What I’m watching

Most things I watch don’t lead to any actual changes in my life, but I recently watched this documentary on minimalism on Netflix and it has inspired some change:

I started going through our house and figuring out whether we really need to keep each thing or not. For example, cleaning out junk drawers and going through boxes in our attic that we haven’t opened since we moved into this house in 2019. I got the kids involved too and have had them going through their old toys and picking out things they want to donate.

I started going through Preceden’s code and cleaning up obsolete methods and rake tasks that aren’t needed anymore. I also bought a new Macbook (which I’ve been meaning to do for a while anyway) and have been setting it up from scratch so it doesn’t have all the files and whatnot that have accumulated over the past 5 years on my old Macbook. It’s been very freeing getting rid of stuff that no longer serves a purpose.

If this resonates with you, I’d encourage you to check out that documentary.

Be well my friends 👋

Friday Updates: GDPR, Heap, TED

Photo courtesy of Unsplash

What I’m working on at Preceden

In my ongoing effort to professionalize Preceden, I’ve been focused on making Preceden more GDPR compliant recently. This has had me reading lots of articles about GDPR and trying to make sense of exactly what’s required. Problem is, there’s very little agreement.

Consider questions like:

  • Can a cookie banner have only an Accept button or is a Decline button also required? (Pretty sure you need a Decline button, but lots of websites don’t include one.)
  • Can you track visitors in analytics tools before they’ve opted into tracking? (Pretty sure the answer is no, but lots of websites do it anyway.)
  • Are cookie banners required for people in the US? What about an EU resident who is visiting the US? (Pretty sure GDPR applies for the EU resident visiting the US, which means you have to display the banner to all website visitors.)
  • If you turn off advertising features in Google Analytics and configure it to anonymize IP addresses, is a cookie banner still required? 🤷‍♂️
  • What privacy policy updates are required? (Lots of websites, even those trying to be GDPR compliant, include different sections.)

And this is really just the tip of the iceberg.

Even though there’s a lot of ambiguity, I’ve been making lots of positive improvements to Preceden, and imagine by the time I’m done with this round of updates it will be in the top 1% of websites in terms of GDPR compliance and that should minimize my risk for the foreseeable future.

These updates include:

  • Removed Ezoic, a service I used to optimize ads displayed on pages with public timelines, because it loads a large number of ad tracking scripts and sets dozens of cookies. This will result in a few grand of lost revenue each year going forward, but I feel good nixing it, not just to make Preceden more GDPR compliant but because loading all those ad trackers for visitors is nasty.
  • Removed Mixpanel, because most of the business intelligence reporting I care about these days relies on backend data, so no need to track all these front-end events that I haven’t looked at in years.
  • Updated Preceden’s Privacy Policy and moved it from an internal CMS over to the codebase so I have a version history of it going forward. GDPRStart.com has a great customizable template for $99 that I used as a starting point for the content.
  • Created a Cookie Policy that lists all of the cookies Preceden sets and their purpose.
  • When people visit public timelines, Preceden records a backend analytics event that in the past has included the visitor’s IP address. Nixed that from the database since an IP address is considered PII and visitors haven’t opted into that tracking.

Probably going to spend next week on this work too.

What I’m working on at Help Scout

Speaking of analytics tracking, we’re evaluating Heap for use at Help Scout currently. Unlike tools like Mixpanel, Heap automatically tracks a user’s entire click stream, which has a number of benefits like saving on engineering time (because engineers don’t need to manually implement Mixpanel events). I’m leading this evaluation, so it’s had me on demo calls, reading documentation, writing a 3-Pager (a document we use to propose an idea or project to others at the company), and discussing use cases with people.

What I’m watching

A few weeks back I got my Concept 2 rowing machine out of the attic and have been trying to make rowing a daily habit. Usually I do it right after I put the kids to bed and it takes about 25 minutes to row 5k. The rowing machine is in front of a TV and I listen via Apple TV with Airpods so the high volume (which is necessary when rowing) doesn’t wake up the kids.

After finishing up Tiger King, I’ve gotten into TED talks. Most are like 10-15 minutes, so I can usually watch two in the time it takes to do a 5k.

One I watched this week (though not a normal short talk) is this interview with Elon Musk from a few years back:

And with that, now I’m off to make my little timeline maker tool more GDPR compliant 😄