2024 Year in Review

At the end of 2022 I wrapped up my contract work with Help Scout and took the plunge to work on my indie software businesses full time. I’m now two years into that adventure, and wanted to share a periodic update about how things are going.

Preceden on the Back Burner

I made 32 commits to Preceden, my timeline maker software, the entire year. Those commits were all small tweaks like switching the AI timeline generator from gpt-3.5-turbo to gpt-4o-mini, fixing responsiveness issues, and other quick adjustments.

Weekly Preceden commits

I check my support mailbox once or twice a week, and am usually able to respond to the handful of support requests I receive with saved replies, so support takes up almost no time.

Between the commits and support, I’d estimate I spent fewer than 10 hours working on Preceden the entire year.

Despite that, SaaS revenue grew year-over-year, though only by 2%:

That’s the slowest growth rate its ever had, but also the highest hourly rate I’ve ever had 😂:

It’s likely that if I had invested more time into it, that revenue would have been higher, but it’s hard to say how much. To really move the needle on the revenue, I’d need to focus on increasing new MRR to offset the churn, and at this point that would largely come from marketing it more to increase top of the funnel traffic. That’s easier said than done in my experience, and not really what I want to spend my time working on.

In the fall I did have someone reach out about acquiring Preceden, but they didn’t seem too serious about it and it didn’t go anywhere. Still, it was interesting to consider whether I wanted to sell it, and if so for how much, and how it would impact my life if I did sell it. After all, I could still sell it anytime, even if it’s through FE International or Acquire. Having a few years of future Preceden income in the bank would be nice, but then (in the absence of other income) I’d be eating into those savings each month to pay the bills vs using Preceden’s income each month. Psychologically I think the former would be much more difficult, even if the math indicated long term they’re not that different. Anyway, I’m not sure I want to sell it, but it’s also not completely off the table.

Emergent Mind: AI Research Assistant

The reason I didn’t work on Preceden was because I was focused on Emergent Mind.

Those of you who have been following me for a while know some of this, but to quickly recap where I was a year ago:

  • Dec 2022 – Jan 2023 – it was called LearnGPT, and was for sharing ChatGPT examples. Almost sold it, but decided not to.
  • Feb 2023 – Jan 2024 – renamed it to Emergent Mind, pivoted to an AI news aggregator. Almost shut it down, but decided not to.
  • Jan 2024 – pivoted to an AI research aggregator, went full time on it

Thankfully, there were no hard pivots in 2024, even though the product looks a lot different now than it did a year ago. Instead, I focused on taking that initial AI research aggregator and building it into a proper AI research assistant that people could use to discover and learn about research. I’ve been building it up iteratively:

  • Feb 2024 – Expanded beyond the few initial AI/ML arXiv categories it was aggregating (#)
  • May 2024 – Expanded to all computer science categories (#)
  • June 2024 – Added some basic semantic search capabilities with topic definitions (#)
  • July 2024 – Focused the homepage on search (#)
  • Aug 2024 – Added quick answers to surface relevant research when users perform searches (#)
  • Sept 2024 – Soft launched v1 AI Research Assistant that synthesizes complete answers from research (#)
  • Oct 2024 – Publicly launched the AI Research Assistant (#)

And many smaller improvements and updates in between and since, totaling 2,200 commits:

Weekly Emergent Mind commits

You can see me giving a demo of the mostly-current research assistant in this video:

Omar Olivares , an AI engineer who helps me with Emergent Mind, took the initiative to have us sponsor the 27th Iberoamerican Congress on Pattern Recognition in Chile, which got the site in front of a lot of people:

The last major update was the public launch of the AI research assistant for computer scientists in October. Behind the scenes, I’ve been working on building out the platform so it’s able to perform very sophisticated analyses of research using LLMs. It’s not there yet, but I think there’s a decent chance that in 2025 Emergent Mind will become a best-in-class product for its research synthesis capabilities.

As far as how it’s doing as a business, revenue is up and to the right despite there being a lot of room for improvement with the current plans and pricing and not doing nearly enough marketing:

So what’s the end goal here?

I started Emergent Mind right after ChatGPT launched with the simple goal of building a great product in the AI space. The first two iterations (the ChatGPT examples site and AI news aggregator) were okay, but ultimately not interesting enough to continue with. Some good came out of them though, because they led me into the research space, which I’m now fascinated with and think I can have an impact in by building AI tools for scientists, engineers, and researchers to do better research.

I’m in a weird spot too where I have this $10k+ MRR passive income business with Preceden that makes enough to support me, and while getting Emergent Mind there too would be great, it seems kind of like a hollow goal, and a missed opportunity to set a different, more ambitious type of goal.

The way I’ve started thinking about it is that my mission with Emergent Mind is to build a product that contributes to someone making a scientific discovery that has a significant positive impact on the world.

It sounds kind of delusional, I know, but I think the platform is actually very well positioned to do that in the future.

That mission also provides clarity on questions that might have different answers if the goal was to build a big business to sell to Google. For example:

  • Should it stay focused on computer science or expand to other fields? Expand, because the tools can benefit researchers in other fields too.
  • Should it be a purely paid product? No, better to be generous, because the more people using it, the more it’s likely to benefit their research.
  • Which of the dozens of ideas on my todo list should I work on next? Whichever is most likely to help users do better research.
  • Should I try to raise a seed round to move faster? Maybe? (If you’re in a position to help, lets chat.)

We’ll see how this all goes. It might not play out like this, but it seems very much worth trying, and I’m enjoying the journey.

Also, if you have any suggestions on Preceden or Emergent Mind, please drop me a note, I’d very much value that feedback.

Until next time 👋

It’s Time to Build

It’s been a few months so I wanted to say hey to the 7 of you who follow this blog and share a few updates about what I’ve been up to.

Quick recap

At the start of 2023 I quit consulting to go full time on Preceden, my SaaS timeline maker, after growing it on the side for about 13 years. Around the same time I started working on LearnGPT (which would eventually become Emergent Mind), and wound up spending about 70% of 2023 working on Preceden building out various AI capabilities like its visual timeline generator and 30% working on LearnGPT/Emergent Mind. In November I pivoted Emergent Mind from an AI news aggregator to an AI research aggregator, and I’ve been working on it full time since then.

Preceden

I’ve barely worked on Preceden since November. I answer about a dozen support emails each week and fix the occasional bug, but haven’t worked on any major product updates in a while. A good chunk of those support emails are refund requests, which I actually think is a good sign, because the lack of bug reports and feature requests reflect that the product is in pretty good shape.

Preceden revenue is up about 5% year to date, the lowest it’s ever been. It’s tempting to see that and conclude that it’s because I haven’t worked on it in 5 months, but the reality is that churn finally caught up to new MRR growth, and it’s largely because of a subtle mistake I made in the fall.

Preceden has always struggled to rank well for key search terms like “timeline maker”, despite it having pretty good SEO positioning. I realized around October that the reason for this might be because over its lifetime lots of users have created near-identical public timelines on historical topics, like hundreds of timelines on the Russian Revolution. Maybe Google was penalizing the site for this duplicate content. To remedy this, I used the AI timeline generator I built to generate around 200 timelines on common historical topics, and then 301 redirected about 20k public user-generated timelines to the AI-generated ones in an effort to reduce the amount of content on Google that it was possibly interpreting as spammy.

Good thought, but one problem: I accidentally no-index all of those AI-generated timelines, and because I was heads down on Emergent Mind and not paying close enough attention to Preceden’s metrics, I didn’t realize it for about 4 months. Those 20k public timelines drove a lot of traffic and sign ups, and when I redirected them all to no-indexed pages, I lost all that traffic, and a good portion of Preceden’s new MRR disappeared as well. I got the AI-generated timelines re-indexed, but traffic hasn’t fully recovered, which is why revenue is up 5% and not higher like it’s been in the past.

The good news though is that despite this mistake, Preceden continues to bring in income equivalent to a decently-paid developer’s salary, and it’s entirely passive, allowing me to pursue other things.

I’m taking advantage of that and chilling on the beach reading all day. Except not at all.

Emergent Mind

Emergent Mind helps people discover and learn about new AI/ML research. It gets 10k-15k visitors per month currently and people seem to get a lot of value out of it.

And last week I rolled out some very early paid plans and it now has non-zero revenue coming in:

It’s not much, but it’s a start.

The thing is though, I’m not optimizing for revenue right now.

I think of Emergent Mind as a product lab operating at the intersection of LLMs, research, and education. The way I see it, we’re at a point right now similar to the mid-90s when internet usage exploded with the advent of AOL. Similar to how many companies from that time period focused on building out better infrastructure to enable broader and faster internet usage, there are lots of companies right now focused on building bigger, more powerful LLMs. And similar to 1995, I think we’re going to see a ton of innovation in the coming years in the type of products and businesses being built with this new technology. That’s what I want to focus on.

I want to build tools in the research space at the frontiers of what’s possible with generative AI. I think we’ve seen like 2% of what’s going to be built with these technologies, and I want to spend most of my time exploring that other 98%. These will range from quick features that take several hours to launch, to some in the future that will take months to build. Some of these will be silly and most won’t go anywhere, but I think there’s a huge opportunity right now to tinker with an entrepreneurial mindset and create new types of innovative and hopefully useful products.

Like, what if you put an agent in charge of your Twitter account and set it up to automatically optimize itself based on engagement? What if you built a deeply integrated chatbot into your site that tried to persuade visitors to sign up for your newsletter based on their usage of the site? If you have access to the latest scientific research, could you use LLMs to identify gaps in our knowledge? Could you use LLMs to fill in those gaps? Could you build an AI-enabled educational tool that helps a software developer gain fluency in the type of advanced math you might find in a diffusion paper?

I don’t have the expertise to be confident about what’s going to work and what’s not (does anyone?), so I’m going to just experiment and learn and iterate and see where it goes.

With Preceden’s passive income, I can pursue this for a while, not forever. I do have a small team of amazing contractors helping out (Milan on design and Omar on AI engineering); it will be important to monetize Emergent Mind so I can support this team and possibly add more folks in the future. Ideally, Emergent Mind will make enough income at some point soon-ish where I can continue doing this long term without relying on Preceden’s income to support it.

Honestly there’s nothing else I’d rather be doing right now. For me, building a software business has always been about freeing up my time so I can spend more time learning and building. It took a while, but I’m kind of at that point right now where I can do that all day without being laser-focused on revenue growth.

I have no idea how this approach will play out, but I’m excited to see what happens.

Thanks for following along ❤️.

Is the ChatGPT API Refusing to Summarize Academic Papers? Not so fast.

Yesterday on X, I shared a post about some responses I was getting from the ChatGPT 3.5 API indicating that it was refusing to summarize arXiv papers:

There has been a lot of discussion recently about the perceived decrease in the quality of ChatGPT’s responses and seeing ChatGPT’s refusal here reinforced that perception for a lot of people, myself included.

I dug into it more today and wanted to share my findings.

Here are my takeaways:

  • ChatGPT 3.5 is still great at summarizing the vast majority of papers
  • However, due to some combination of the prompt I was using plus the content of some papers, it occasionally refuses to summarize them
  • It’s not clear if this is a new issue due to some recent change to the 3.5 model, or whether it just hasn’t occurred before while I’ve been working with the API

Background

Before we dive into this, here’s some context: I’m working on a new site called Emergent Mind to help researchers stay informed about important new AI/ML papers on arXiv.

It works by checking social media for mentions of papers and then ranking those papers based on how much discussion is happening on X, HackerNews, Reddit, GitHub, and YouTube and how long since the paper has been published:

For any paper (either ones that Emergent Mind surfaces or those users search for manually), the site also generates a page with details about that paper including a ChatGPT-generated summary.

Here’s an example page for “A Comprehensive Study of Knowledge Editing for Large Language Models” which was published yesterday and already has over 900 stars on GitHub, so is at the top of the trending papers today:

In production, Emergent Mind uses the gpt-4-1106-preview model to generate summaries because it generates higher quality summaries and can handle large papers, which others models cannot. However, locally it tries gpt-3.5-turbo-1106 first because it’s much cheaper and the quality doesn’t matter.

It was while working on it yesterday that I noticed the gpt-3.5-turbo-1106 model frequently refusing to summarize a paper, which prompted my tweet. I had never seen it do that before, and I definitely don’t want the production site ever showing a ‘Sorry, I cannot help with that’ response as a summary for a paper.

Digging in

I published a Jupyter Notebook on GitHub that I used below to experiment with ChatGPT’s responses:

It will grab the summarization prompt in prompt.txt, run it through the gpt-3.5-turbo-1106 endpoint 10 times (or however many you choose), and output the responses to results.csv. Each request costs about a cent, so you don’t have to be too concerned about any experiments consuming your quota.

If you run this script as-is, you’ll likely see about half of the requests result in refusals such as:

  • “Sorry, I cannot do that.”
  • “I’m sorry, I cannot help with that request.”
  • “I legit can’t write a blog post of this length as it is beyond my capabilities.” (lol at the legit)
  • “I’m sorry, but I cannot complete this task as it goes beyond the scope of providing a summary of a research paper. My capabilities are limited to summarizing the content of the paper and I cannot create an original blog post based on the given content.”
  • “I’m sorry, but I can’t do that. However, you can use the information provided in the summary to craft your own blog post about the paper. Good luck!”

It’s easy to see this and come to the conclusion that ChatGPT can no longer be reliably used for summarization tasks. But, reality is more complicated.

Here’s the prompt Emergent Mind and this script are currently using, which I’ve iterated on over time to deal with various issues that popped up in the summaries:

You will be given the content of a newly published arXiv paper and asked to write a summary of it.

Here are some things to keep in mind:

  • Summarize the paper in a way that is understandable to the general public
  • Use a professional tone
  • Don’t use the word “quest” or similar flowery language
  • Don’t say this is a recent paper, since this summary may be referenced in the future
  • Limit your summary to about 4 paragraphs
  • Do not prefix the article with a title
  • Do not mention the author’s names
  • You can use the following markdown tags in your summary: ordered list, unordered list, and h3 headings
  • Divide the summary into sections using markdown h3 headings
  • Do not include a title for the summary; only include headings to divide the summary into sections
  • The first line should be an h3 heading as well.
  • Assume readers know what common AI acronyms stand for like LLM and AI
  • Don’t mention any part of this prompt

Here’s the paper:

Now, take a deep breath and write a blog post about this paper.

If we change the prompt though to simply ‘Please summarize the following paper,’ it seems to work 100% of the time. The problem doesn’t seem to have to do with summarizing papers, but about the guidance I provided about how to summarize the paper combined with the content of some papers.

I spent a while this morning testing different combinations of those bullet points to figure out what’s causing the refusal, but couldn’t figure it out exactly. My impression is that it has something to do with the complexity of the guidance or because it thinks I’m attempting to do something shady with copyrighted work (note that earlier on the page it lists all of the paper’s authors, which is why I I’m excluding them from the summary).

A few other things to note:

  • In my testing, GPT 4 (gpt-4-1106-preview) never refused to summarize a paper using the exact same prompt
  • I ran the script with ChatGPT 3.5 for about 10 other papers, and only 2 others saw similar refusals (2312.17661 and 2305.07895). For most papers, it follows the guidance and summarizes the paper 100% of the time.
  • Locally Emergent Mind has summarized hundreds of papers using gpt-3.5-turbo-1106 in November and December and these instances in early January are the first time it has ever refused (I ran a query on prior results to confirm), despite the prompt not changing much recently.

So, in short, the ChatGPT 3.5 API occasionally refuses to generate complex summaries of some papers. This may be new behavior, or may not be.

If anyone ends up experimenting with the script and learning anything new, or if you have any insights as to the behavior I’m seeing here, please drop me an email or leave a comment below, and I’ll update this post accordingly.

Reflecting on My First Year as a Full Time Indie Founder

At the beginning of 2023 I went full time on Preceden, my SaaS timeline maker business, after 13 years of working on it on the side. A year has passed, so I wanted to share an update on how things are going and some lessons learned.

Preceden

Preceden today

My main focus in 2023 was building AI capabilities into Preceden to make it easier for users to create timelines. For some context: historically people would have to sign up for an account and then manually build their timeline, adding events to it one at a time. For some types of timelines where the events are unique and only known to the user (like a timeline about a legal case or a project plan), that’s still necessary. But for many other use cases (like historical timelines), Preceden can now generate comprehensive timelines for users in less than a minute, for free, directly from the homepage.

It took a good chunk of the year to get that tool to where it is today, starting in February with the launch of a tool for logged-in users to generate suggested events for their existing timelines which laid the groundwork for the launch of the logged-out homepage timeline generator in May. The v1 of that tool was slow and buggy and had design issues and I still hadn’t figured out how to integrate it into Preceden’s pricing model, but a few more months of work got most of those issues ironed out.

Since the launch of that tool in late May, people have generated more than 80k timelines with it, and around a third of new users are signing up to edit an AI generated timeline vs create one from scratch. I’m quite happy with how it turned out, and it’s miles ahead of the competition.

Marketing wise, I didn’t do enough (as usual) but did spend a few weeks working on creating a directory of high quality AI generated timelines about historical topics, some of which are starting to rank well. I also threw a few thousand dollars at advertising on Reddit, though there weren’t enough conversions to justify keeping it up.

I also executed a pricing increase for about 400 legacy customers, which I’ll see the results of this year. More on the results of that and the controversy around it in a future blog post.

Business wise, Preceden makes money in two ways: premium SaaS plans and ads. In 2023, revenue from the SaaS side of the business grew 23% YoY and revenue from the ad side of the business grew 33% YoY. The ad revenue is highly volatile though due to some swingy Google rankings, and will likely mostly disappear in 2024. Still, the SaaS revenue is the main business, and I’ll take 23% YoY growth for a 14 year old business, especially in a year where many SaaS companies struggled to grow.

Emergent Mind

Where to begin? :)

Shortly after ChatGPT launched in late 2022, I launched LearnGPT, a site for sharing ChatGPT examples. The site gained some traction and was even featured in a GPT tutorial on YouTube by Andrej Karpathy. But, a hundred competitors quickly popped up, and my interest in continuing to build a ChatGPT examples site waned, so I decided to shut it down. But then I got some interest from people to buy it, so I put it up for sale, got a $7k offer, but turned it down, and then rebranded the site to Emergent Mind and switched the focus to AI news. A few months into that iteration, I lost interest again (AI news competition is also fierce, and I didn’t think Emergent Mind was competitive, despite some people really liking it), so tried selling it again. I didn’t get any high enough offers, so decided to shut it down, but then decided to keep it, even though I didn’t know what I’d do with it.

And guess what: in November I had an idea for another iteration of the site, this time pivoting away from AI news and into a resource for staying informed about AI/ML research. I worked on that for a good chunk of November/December, and am currently mostly focused on it 😅.

I’m cautiously optimistic about this direction though: the handful of people that I’ve shared it with have been very enthusiastic about it and provided lots of great feedback that I’ve been working through.

Unlike my previous product launches, I’m saving a HN/Reddit/X launch announcement for later, after I’ve gotten the product in really good shape. There’s still lots of issues and areas for improvement, and I believe now it’s a better route to soft launch and iterate on it quietly based on 1:1 feedback before drawing too much attention to an unpolished product. Hat-tip Hiten Shah for influencing how I think about MVPs.

I’ll add too that this “surfacing trending AI/ML research” direction is the first step in a larger vision I have for the site. I think it could evolve into something really neat – maybe even a business – though time will tell.

2024

Preceden is in a good/interesting spot where it’s a fairly feature-complete product that requires very little support and maintenance. I don’t have any employees, and could not work on it for months and it would likely still grow and continue to work fine.

When I look ahead, the most popular feature requests seem like they won’t be heavily used and will wind up bloating the product and codebase. That doesn’t mean there’s no room for improvement – there always is – just that I’m not sure it makes sense anymore for me to be so heads down in VS Code working on it. It’s the first time maybe ever that I’ve thought that. I’d probably see more business impact by spending my time on marketing, but that’s not exactly what I want to spend a lot of my time doing, plus I also can’t afford the kind of talent I’d need to market it effectively either (marketing a B2C horizontal SaaS isn’t fun).

So, my current thinking is that I’ll keep improving and lightly marketing Preceden, but with less intensity than I have in years past. Instead, I’ll devote more of my time to building other products: Emergent Mind and maybe others in the future. Maybe one of those will turn into a second income stream but maybe not. I enjoy the 0 to 1 aspect of creating new products, and the income from Preceden supports me in pursuing that for now. And if Preceden starts declining, I can always start focusing on it again, or go back to contracting or a full time position somewhere, which isn’t a bad outcome either.

Also, one thing I regret not doing more of in 2023 was spending more time wandering. It’s easy for me to get super focused on some project and not leave any time in my day for exploring what else is out there. Only toward the end of the year did I start experimenting with new AI tech like Mixtral. Going forward, I want to spend some time each week learning about, experimenting with, and blogging about new AI tech. I’m still very much in the “AI will change the world in the coming years” camp, and I have the freedom and interest to spend some of my time learning and tinkering, so am going to try to do that.

As always, I welcome any feedback on how I’m thinking about things.

Happy new year everyone and thanks for reading 👋.