Going Full Time on My SaaS After 13 Years

In January 2010 I soft-launched Preceden, a web-based timeline maker, followed a few weeks later by a larger launch on Hacker News.

Today – almost 13 years to the day since the initial launch – I’m going full time on it and I couldn’t be more excited.

A brief history of Preceden

At the time of Preceden’s launch, I was serving as a first lieutenant in the US Air Force and about halfway through a 5-year service commitment I incurred by attending the Air Force Academy, a military college. I knew I didn’t want to make the Air Force a career, so I decided to start learning web development with the hope of eventually working full time on a startup after my service commitment ended in 2012.

The first web app I built during this time period was Domain Pigeon (a domain search tool), followed by Preceden, followed by Lean Designs (a WYSIWYG web design tool), followed by Lean Domain Search (another domain search tool I built while deployed to Iraq), plus a few smaller ones not worth mentioning.

By the time I left the Air Force, I had shut down all of them except Preceden and Lean Domain Search. I did go full time for a few months, but focused entirely on Lean Domain Search. That tool was eventually acquired by Automattic in 2013 as part of an acquihire, and I joined the company full time as a software engineer helping with the domain name experience on WordPress.com.

With Lean Domain Search in Automattic’s hands, I was left with just Preceden, which at that point was about 3 years old. It didn’t make much money at the time, but I decided to continue working on it as a side project and see where it went.

Four years later, in 2017, I left Automattic to join Help Scout as their first data team hire (during my time at Automattic, I had gradually shifted away from software engineering toward more of a data analyst/analytics engineer role). I continued to work on Preceden (then 7 years old), and in 2018 I switched to a contractor role so I could put more time into Preceden.

And now, after 4 years of contracting, I’m finally going full time on Preceden.

Here was my announcement at Help Scout from a few weeks ago:

[Screenshot: my announcement at Help Scout]

Why not sooner?

It was a combination of things:

  • I made a lot of rookie mistakes over the years that limited Preceden’s growth, including not focusing on a specific niche, not spending enough time on marketing, not talking to enough customers, trying to do too much myself, and, in general, picking a difficult product and business to build (something I gave no thought to initially).
  • I was learning a ton, doing a lot of interesting work, and enjoying the camaraderie I had with my teammates at Automattic and later Help Scout.
  • Financially it made more sense to keep Preceden as a side project.

On the last point – it’s much easier to launch a SaaS than it is to grow it to the point where it can replace your income. As the sole breadwinner in our household with 4 young kids, I was not comfortable going full time and merely being ramen profitable or anything close to it. I wanted to replace or mostly replace my other income, and with Preceden’s SaaS metrics being what they were, it just took a really, really long time to do that. The long slow SaaS ramp of death is something I now have a lot of experience with šŸ˜‚.

But here I am, finally.

[Screenshots: Preceden in 2010, Preceden in 2019, and Preceden today in 2023]

What’s next?

I plan to focus mostly on Preceden, but will spend some percentage of my time on other pursuits. I recently launched LearnGPT.com, a fledgling GPT education site, and will likely work more and more on AI projects, including integrating AI into Preceden itself.

Also, it’s been a busy few years, and I’m very much looking forward to relaxing more and spending more time with my family, including my two younger kids who aren’t in school yet.

I don’t know what my future holds long term. Preceden’s finances are good enough for now, but not at a point where I can just stop working on it and coast for years. With a little luck, Preceden will continue to grow and will keep supporting me full time, whether I focus on it or on other pursuits. There’s also some chance I get bored with it or stumble across some promising new startup and wind up going back full time somewhere else. We will see!

I do hope to blog more frequently, so if you’re interested in following along, you can subscribe via email, RSS, or just follow me on Twitter at @mhmazur.

Thanks for reading šŸ‘‹.

Comments on Hacker News

Generating High Quality Available .com Domain Names for a Specific Industry

In my last post I detailed how to extract all of the available .com domain names from the .com zone file. In this post I’m going to show you how to do something very useful with the result: finding a great available domain name for a business in a specific industry.

For example, we’re going to find great business names that can fill in the blanks for the industry of your choosing:

  • ____________Marketing.com
  • ____________Consulting.com
  • ____________SEO.com
  • ____________Data.com
  • ____________Media.com
  • ____________Systems.com
  • ____________Law.com

The big idea: Check for keywords that are registered for other industries, but not registered for yours

Consider this: what if we looked at all of the registered domains that end with advertising.com, figured out the keyword, and then checked whether the corresponding marketing.com domain is available? For example, imagine we check and see that the domain HightowerAdvertising.com is registered (we’ll refer to Hightower as the keyword here). We can then check whether HightowerMarketing.com is registered. Because someone already registered the keyword for the advertising industry, there’s a good chance the keyword is meaningful and worth checking for the marketing industry as well.

We can take this a step further by checking for common keywords in multiple industries. For example, we check all the domains that end in advertising.com, all that end in media.com, see which keywords they have in common, then check which of those are not registered for marketing.com domains.

The fewer industries we check for common keywords, the more results we’ll have, but the lower their quality. The more industries we check, the fewer the results, but the higher the quality.

Getting your command line on

If you went through my last post, you should have wound up with a domains.txt file that has about 108M registered .com domain names:

$ wc -l domains.txt 
 108894538 domains.txt

With a little bit of command line magic, we can extract all of the domains that end in ADVERTISING (like HIGHTOWERADVERTISING), then remove the trailing ADVERTISING to get just HIGHTOWER, then sort and dedupe those results and save them to a list:

$ LC_ALL=C grep ADVERTISING$ domains.txt | sed 's/.\{11\}$//' | sort -u > tmp/advertising.txt

This will generate a list such as:

$ head tmp/advertising.txt
A
AA
AAA
AAB
AAC
AADIT
AADS
AAGNEYA
AAHAA
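
By the way, the 11 in the sed command is just the length of the word ADVERTISING, so it strips exactly that many characters from the end of each line. If you’d rather not count characters by hand, here’s a small variation of the same command (a sketch that leans on bash’s ${#var} length expansion; SUFFIX is just a variable name I’m introducing):

$ SUFFIX=ADVERTISING
$ LC_ALL=C grep "${SUFFIX}$" domains.txt | sed "s/.\{${#SUFFIX}\}$//" | sort -u > tmp/advertising.txt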

Then we do the same for MARKETING domains:

$ LC_ALL=C grep MARKETING$ domains.txt | sed 's/.\{9\}$//' | sort -u > tmp/marketing.txt

And finally, we figure out which domains are in the advertising list but not in the marketing list:

$ comm -23 tmp/advertising.txt tmp/marketing.txt > results/marketing.txt

If we want to find common keywords registered in multiple industries, we need to add an extra step to generate that list of common keywords before figuring out which ones are available in ours:

$ comm -12 tmp/advertising.txt tmp/media.txt | comm -12 - tmp/design.txt | sort -u > tmp/common.txt
$ comm -23 tmp/common.txt tmp/marketing.txt > results/marketing.txt
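
(If comm is new to you: it compares two sorted files and prints three columns – lines only in the first file, lines only in the second, and lines in both. The numeric flags suppress columns, so -23 keeps only lines unique to the first file and -12 keeps only lines common to both. A quick demo with made-up files:)

$ printf 'ACME\nBRAVO\nCOBALT\n' > one.txt
$ printf 'BRAVO\nCOBALT\nDELTA\n' > two.txt
$ comm -23 one.txt two.txt
ACME
$ comm -12 one.txt two.txt
BRAVO
COBALT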

The resulting marketing.txt list will have the common keywords in the other industries that are likely not registered in yours:


AANDG
AAS
ABILENE
ABRASIVE
ACCOMPLICE
ACENTO
ACTIONSPORTS
ADAGE
ADAIR
ADAY
ADCOM
ADDO
ADITHYA
ADJACENT
ADJECTIVE
ADLIB
ADOBE
ADONAI
ADONE
ADSPACE

The way to interpret this is that for a keyword like Adspace, those domains are registered in the other industries (AdspaceAdvertising.com, AdspaceMedia.com) but not in ours (AdspaceMarketing.com). Again, the more similar industries you check for common keywords, the higher the quality of the results. We could add three or four more industries to get a short, very high quality list.

By the way, the reason I say likely not registered is that once a domain loses its name servers – for example, if it’s way past its expiration date – it will drop out of the zone file even though the name isn’t yet available to register. Some of the results might therefore actually be registered, but a quick WHOIS check will confirm whether a given name is available:

$ whois blueheronmarketing.com

No match for domain "BLUEHERONMARKETING.COM".
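
To check a whole results file at once, you could loop over it with something like this (a rough sketch: it greps for the "No match" string shown above, which is what Verisign’s WHOIS server returns for available .com domains, and it sleeps between queries since WHOIS servers rate limit aggressive clients):

$ while read -r keyword; do
    if whois "${keyword}marketing.com" | grep -q 'No match'; then
      echo "${keyword}marketing.com appears to be available"
    fi
    sleep 2  # avoid hammering the WHOIS server
  done < results/marketing.txt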

Or you could just use this Ruby script

Because it’s a pain to run all of these commands while searching for available domains in an industry, I put together this small Ruby script to help:

https://github.com/mattm/industry-domain-name-generator

There are instructions in the README explaining how to set the industry and similar industries in the script. If all goes well, it will run all of the necessary commands to generate the list of results:

$ ruby generator.rb 
Finding available domains for marketing...
Generating industry name lists...
Searching for domains that end with 'advertising'...
  LC_ALL=C grep ADVERTISING$ domains.txt | sed 's/.\{11\}$//' | sort -u > tmp/advertising.txt
Searching for domains that end with 'media'...
  LC_ALL=C grep MEDIA$ domains.txt | sed 's/.\{5\}$//' | sort -u > tmp/media.txt
Searching for domains that end with 'design'...
  LC_ALL=C grep DESIGN$ domains.txt | sed 's/.\{6\}$//' | sort -u > tmp/design.txt
Searching for domains that end with 'marketing'...
  LC_ALL=C grep MARKETING$ domains.txt | sed 's/.\{9\}$//' | sort -u > tmp/marketing.txt
Finding common names in industries...
  comm -12 tmp/advertising.txt tmp/media.txt | comm -12 - tmp/design.txt | sort -u > tmp/common.txt
Finding names not registered for marketing...
  comm -23 tmp/common.txt tmp/marketing.txt > results/marketing.txt
Done, results available in results/marketing.txt

And with a little luck, you’ll find a great domain in the list to use for your new business.

Extracting a List of All Registered .com Domains from the Verisign Zone File

Back in the day when I worked on Lean Domain Search, I got a lot of experience working with Verisign’s .com zone file because that’s what Lean Domain Search uses behind the scenes to check whether a given domain is available to register.

I still get a lot of emails asking for details about how it worked, so over a series of posts I’m going to walk through how to work with the zone file and eventually explain exactly how Lean Domain Search works.

What’s a zone file?

A zone file lists all registered domains for a given top-level domain (like .com or .net) along with the name servers associated with each domain. For example, because this blog is hosted on WordPress.com, the zone file lists the WordPress.com name servers for it:

MATTMAZUR NS NS1.WORDPRESS
MATTMAZUR NS NS2.WORDPRESS
MATTMAZUR NS NS3.WORDPRESS

(The names are relative: the zone file’s $ORIGIN COM. directive, which you’ll see below, appends the trailing .COM to each of them, so MATTMAZUR means MATTMAZUR.COM and NS1.WORDPRESS means NS1.WORDPRESS.COM.)

How do I get access to the zone file?

Anyone can fill out a form, apply, and get access. There are details on this page. In this old post on Lean Domain Search I detailed how I filled out the form, though the form has changed since then, so you’ll need to make some adjustments.

What happens after I apply for access?

Verisign will provide you with credentials to log into their FTP server and download the zone file:

[Screenshot: Verisign FTP directory listing showing com.zone.gz]

The zone file is the 2.91 GB com.zone.gz, which is currently 11.47 GB unzipped.
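
To unzip it while keeping the original archive around, you can use gzip’s -k (keep) flag, assuming a reasonably recent version of gzip:

$ gunzip -k com.zone.gz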

What’s in the zone file?

It begins with some administrative details, then lists domains and their associated name servers. Note that registered domains without name servers (such as ones that are close to expiring) are not included in this list.


; The use of the Data contained in Verisign Inc.'s aggregated
; .com, and .net top-level domain zone files (including the checksum
; files) is subject to the restrictions described in the access Agreement
; with Verisign Inc.
$ORIGIN COM.
$TTL 900
@ IN SOA a.gtld-servers.net. nstld.verisign-grs.com. (
1526140941 ;serial
1800 ;refresh every 30 min
900 ;retry every 15 min
604800 ;expire after a week
86400 ;minimum of a day
)
$TTL 172800
NS A.GTLD-SERVERS.NET.
NS G.GTLD-SERVERS.NET.
NS H.GTLD-SERVERS.NET.
NS C.GTLD-SERVERS.NET.
NS I.GTLD-SERVERS.NET.
NS B.GTLD-SERVERS.NET.
NS D.GTLD-SERVERS.NET.
NS L.GTLD-SERVERS.NET.
NS F.GTLD-SERVERS.NET.
NS J.GTLD-SERVERS.NET.
NS K.GTLD-SERVERS.NET.
NS E.GTLD-SERVERS.NET.
NS M.GTLD-SERVERS.NET.
COM. 86400 DNSKEY 257 3 8 AQPDzldNmMvZFX4NcNJ0uEnKDg7tmv/F3MyQR0lpBmVcNcsIszxNFxsBfKNW9JYCYqpik8366LE7VbIcNRzfp2h9OO8HRl+H+E08zauK8k7evWEmu/6od+2boggPoiEfGNyvNPaSI7FOIroDsnw/taggzHRX1Z7SOiOiPWPNIwSUyWOZ79VmcQ1GLkC6NlYvG3HwYmynQv6oFwGv/KELSw7ZSdrbTQ0HXvZbqMUI7BaMskmvgm1G7oKZ1YiF7O9ioVNc0+7ASbqmZN7Z98EGU/Qh2K/BgUe8Hs0XVcdPKrtyYnoQHd2ynKPcMMlTEih2/2HDHjRPJ2aywIpKNnv4oPo/
COM. 86400 DNSKEY 256 3 8 AQOz+iBqxZtCKBBqKsO/i9JVchZ2Z1pFCWnj+pFHJi3uPWiYWsAMvtMpInRPfV1Ot9m+8nHPxSkvOL2+bttj4jEK6uUfTarET4wAMSh2k9rX2h+9kVQDjcuRwfFXV5bAmFd3j57hic7FEYVSxXtNUVU7BPaFRHuFr3OrQHQXaR4IeQ==
COM. 86400 NSEC3PARAM 1 0 0 –
COM. 900 RRSIG SOA 8 1 900 20180519160221 20180512145221 36707 COM. Jh63KZtaFJwB86dM+r65iDaGDNWbLsMi7tP/Kf9dYHdILkGpPfO4HOVkKKvMKQpGcrIyl7LPwwfA2VfvISFsWszcqD7SNfP82rHCf8Y1U6JXRS4v23x+0zeaq4LLAaHsejursS8b5W/PsufbXoWgs6oTuCdNEhzit5ql2s2JtUY=
COM. 86400 RRSIG NSEC3PARAM 8 1 86400 20180516044717 20180509033717 36707 COM. eHouT12OKthPi++n+0hgvafEopsN3Q6iCBNVpyvckt3+29ReGd3XugZrx9qASl0Z+sYd8icxHHG2JIMs/ZqrknQIngP24hkmQrRYBAEkNggUzbjxp1CRqdnyeaJ8c8X8WjiFzLk2y7ic4fpxvHcB2MCAIkIRDWlDYjznNaIbsNI=
COM. RRSIG NS 8 1 172800 20180516044717 20180509033717 36707 COM. nmXBe6F4losM2dmCryGopjjJLlhQmYscNgHqvIQ3zbHm59UHe87T6FmHTdtdujmh3D8rW6g2vx2rzWPxLQigd7xh1KyIfCGZODaUB4TPAxadtGCfvu1h00dieCIf/+UIumg5iJBPjlQdCdpAweh1Zw9KUvbWlkRrXLz03jmJ/xg=
COM. 86400 RRSIG DNSKEY 8 1 86400 20180522182533 20180507182033 30909 COM. F7/1eje9GeHOQcuokqfTHeLYxVznTnkF10YAlMTKi7aJiCySWMVwC/0I/om/EvE+Z4AMG+3B/gFy94PpGnOjpaZcimW1syTKJOPPsGXdQD6F1bxnKCD1r+r9HrSIKTe+lzXI7kzakHNZx3zsdYO4aFifr/hiR/YV/wirJjiXxgOFCtUquSlIOeZ7rv8wTf34onLrf2mYk447ByqUWrXJvqJ16pW+ISUFzyroHqgXFluzrMUqlWVJl8mtnQ5ChCk98zZTGCQJc60HDeYWSY3Mbpji2VZS2uQVDTzO3AeEv5GIoLF8jC+UCAeYDiQhZ5HaEn5HSLh/jYe3TOuIm0tOiw==
KITCHENEROKTOBERFEST NS NS1.UNIREGISTRYMARKET.LINK.
KITCHENEROKTOBERFEST NS NS2.UNIREGISTRYMARKET.LINK.
KITCHENFLOORTILE NS NS1.UNIREGISTRYMARKET.LINK.
KITCHENFLOORTILE NS NS2.UNIREGISTRYMARKET.LINK.
KITCHENTABLESET NS NS1.UNIREGISTRYMARKET.LINK.
KITCHENTABLESET NS NS2.UNIREGISTRYMARKET.LINK.

How can I extract a list of just the domains?

Glad you asked! It takes a little bit of command line fu.

If you’d like to follow along, here are the first 1,000 lines of the zone file. You can download this and use the terminal commands below just like you would if you were working with the entire 317,338,073-line zone file.
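
(If you already have the full zone file, you can generate a similar sample yourself; com-sample.zone is just a name I’m making up:)

$ head -n 1000 com.zone > com-sample.zone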

1) First, we’ll grab a list of just the domains:

$ awk '{print $1}' com.zone > domains-only.txt

For a line like this:

KITCHENEROKTOBERFEST NS NS1.UNIREGISTRYMARKET.LINK.

This command will return just KITCHENEROKTOBERFEST.

This will also return some non-domains from the administrative section at the top of the zone file, but we’ll filter those out later.

Here’s what domains-only.txt should look like.

2) Next, we’ll sort the results and remove duplicates:

$ sort -u domains-only.txt --output domains-unique.txt

This is necessary because most domains will have multiple name servers, but we don’t want the domain to appear multiple times in our final list of domains.

Here’s what domains-unique.txt should look like.

3) Last but not least, we’ll ensure the results include only domains:

$ LC_ALL=C grep '^[A-Z0-9\-]*$' domains-unique.txt > domains.txt

There are a few things to note here.

First, make sure to use GNU grep, which is not the default on Macs; it’s significantly faster than the BSD grep that ships with macOS.

The LC_ALL=C forces grep to use the C locale, which tells grep this is an ASCII file, not a UTF-8 file. More details here. While it’s not important for this 1,000-line file, it significantly reduces how long grep takes on the full 300M+ line zone file.
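
You can see the difference yourself by timing the same search with and without it (a sketch; this assumes your default locale is UTF-8, and the exact numbers will vary by machine):

$ time LC_ALL=C grep '^[A-Z0-9\-]*$' domains-unique.txt > domains.txt
$ time grep '^[A-Z0-9\-]*$' domains-unique.txt > /dev/null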

The ^[A-Z0-9\-]*$ regular expression looks for lines made up of only letters, numbers, and dashes. The reason we use * (zero or more) rather than + (one or more) is simply that grep’s default basic regular expression syntax doesn’t support +.

Technically this regex will match strings that are longer than a domain can actually be (the max is 63 characters) as well as strings that start or end with a dash (which isn’t valid for a domain), but there aren’t any of those in the zone file, so it’s not a big deal, and grep will run faster this way. If you really wanted to get fancy, you could match only properly formed domains, though it will take longer to run: ^[A-Z0-9]([A-Z0-9\-]{0,61}[A-Z0-9])?$
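
Note that this stricter pattern uses ? and {m,n}, which grep’s basic syntax doesn’t support either, so you’d run it with the -E (extended regex) flag – something like this, with the dash moved to the end of the bracket expression so it doesn’t need escaping:

$ LC_ALL=C grep -E '^[A-Z0-9]([A-Z0-9-]{0,61}[A-Z0-9])?$' domains-unique.txt > domains.txt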

Here’s what domains.txt should look like.

Note that this does include some domain-like strings from the administrative section, like 1526140941, which isn’t actually a domain. Depending on what you’re using the zone file for, you could remove these lines, but it’s never been a big deal for my use case. Because Lean Domain Search is limited to letters-only domains, it actually just uses ^[A-Z]*$ for the regex.

Here’s some actual code from Lean Domain Search with these steps above:

[Screenshot: code from Lean Domain Search implementing the steps above]
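
For reference, here’s one way to chain the three steps above into a single pipeline (a sketch of the same approach, not the actual Lean Domain Search code):

$ awk '{print $1}' com.zone | sort -u | LC_ALL=C grep '^[A-Z0-9\-]*$' > domains.txt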

If you run into any trouble or have suggestions on how to improve any of these commands, don’t hesitate to reach out. Cheers!