Indexing Followup

Some guy that is relieved
Good news on most fronts.

Paul Graham said that my indexing didn’t slow down the server, which was a big relief. The slowness I experienced was a result of him throttling my IP address, which means he set a limit on how much data I could download. The limits are in place to discourage people from crawling the site, which, if done en masse, could have a significant impact on Hacker News.

The post received mostly positive feedback other than the occasional “lol you idiot”. I expected a lot more of these. It reflects really well on the Hacker News community that there were so few of them. On any other popular news aggregator, there probably would have been a lot more trollish commentary. So, thank you HN.

I hesitated at first about writing anything. Maybe it’d be better just to ignore it and move on. I decided to consult the authority on these things: my lovely wife. I explained to her what happened. She laughed, did this wave motion with her hands and said “Smooooth”. She thought it was hilarious and told me to post it. “But it’s Hacker News!” I told her. She rolled her eyes. Women.

Anyway, I started analyzing the data from the 73% of submissions that I was able to log before I got throttled (sounds terrible every time I write it). As expected, there is a lot of interesting information to be gleaned, which I’ll share shortly.

Thanks for reading –

The Wrong Way to Get Noticed by YC

Doh
So, I’m sitting at my kitchen table the other night thinking about startup type things when an idea pops into my head: Create an index for Hacker News.

Now, this isn’t the first time this occurred to me. A few weeks ago I emailed Paul Graham asking whether I could create a searchable database of Hacker News. He said he’d rather I didn’t. Plus, I found out later about searchyc.com, which does exactly that.

But an index… that would have a different purpose. You could do all sorts of interesting analysis on it… top posts, top contributors, posting frequency, etc. No, I wouldn’t save the content, just the relevant information for the submissions only (no comments): title, URL, points, number of comments, and date.
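To make the idea concrete, here is a minimal sketch of what one row of such an index and two of the analyses might look like. The `Submission` record and its field names (`item_id`, `comment_count`, `posted_on`) are assumptions for illustration only; the original program was a small VB app and its actual layout isn’t shown here.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Submission:
    """One index row: metadata only, no article or comment text."""
    item_id: int        # hypothetical name for the sequential item number
    title: str
    url: str
    points: int
    comment_count: int
    posted_on: date


def top_posts(index, n=10):
    """Return the n highest-scoring submissions, best first."""
    return sorted(index, key=lambda s: s.points, reverse=True)[:n]


def posts_per_day(index):
    """Count submissions per calendar day (posting frequency)."""
    return Counter(s.posted_on for s in index)
```

Because the rows hold only metadata, questions like “top posts” reduce to a sort and “posting frequency” to a count, without storing any page content.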

The software wasn’t hard to write. The submissions are sequentially numbered from 1 to about 270K and it’s easy to differentiate between submissions and comments by searching the HTML. After about an hour of work and a little testing, I set off my small VB program to crawl the site.
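For illustration, the crawl loop described above can be sketched roughly as follows, with one thing the original lacked: a pause between requests. This is not the VB program, and `fetch` and `is_submission` are hypothetical stand-ins supplied by the caller, not real Hacker News calls. The two-second default delay is an arbitrary assumption.

```python
import time


def crawl_politely(first_id, last_id, fetch, is_submission,
                   delay_seconds=2.0, sleep=time.sleep):
    """Visit item IDs in order, pausing between requests.

    fetch(item_id) -> str and is_submission(html) -> bool are
    hypothetical callables provided by the caller.
    """
    kept = []
    for item_id in range(first_id, last_id + 1):
        html = fetch(item_id)
        if is_submission(html):
            kept.append(item_id)
        if item_id != last_id:
            # One request at a time, with a gap before the next one.
            sleep(delay_seconds)
    return kept
```

At two seconds per request, roughly 270K items would take about six days, which is a hint that a full crawl is a big ask of someone else’s server and worth clearing with the site owner first.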

This was Tuesday night. I went to sleep, eager to analyze the results the next day.

Wednesday morning I woke up and checked its status. 30% or something low like that. I couldn’t do any analysis then anyway — so off to work. I got home that evening and it was still chugging along. 55%. Getting there…

That night, around 9, I checked the status. 73.64%. Stupid slow connection. I came back half an hour later. 73.65%. Man, my connection is really terrible, I thought to myself. I loaded up Amazon to see if it would load. No problem. I restarted my computer, thinking it was some connection problem. When it rebooted, I checked YC again … it took about 20 seconds and finally loaded. Hmm. Then it hit me. Wait a minute. Oh no. No no no no. What if the indexing caused Hacker News to go down?

This is not good. Not good at all.

So I shut the program down and went to bed. Next morning, Thursday morning, I checked my email before heading out, half expecting to see some sort of email. Nothing. Phew. YC was still somewhat slow at that point, but was improving.

I checked Hacker News throughout the day at work. It seemed to be just about back to normal. Sometime in the afternoon I checked Gmail. I had an email from Paul Graham titled “please stop”. It said:

Would you please not do that to the server again?

“Shit,” I said. My coworker shot a puzzled look at me. “Nothing,” I told him. “It’s a long story.”

I wrote a response, apologizing profusely. Unfortunately, I realized later that night that the response didn’t go through… only a blank email. So, I rewrote the email and sent it off.

I’d like to take this opportunity again to say sorry to Paul and any other member of the Hacker News community that was affected by this. I didn’t think through what effect the indexing would have, and would never have done it if I had realized it would unintentionally result in a denial of service attack on my favorite news site. I don’t know how much time it took to fix, and I apologize for any time YC lost correcting it.

If you’re considering doing something like this, you should rethink your plans. It’s not exactly the best way to make an impression.

Early Adopters

I wandered over to Hacker News with the intent of asking how important it was to come up with an original idea vs building on other people’s ideas and products. Sitting in the top 10 was an article by Alexander van Elsas titled Early adopters and Silicon Valley Are The Easy Way To Failure. His main points are:

  • Most startups fail because they don’t solve normal people’s problems
  • Instead, they get a lot of hype from within Silicon Valley, but that’s not what’s really important
  • When they do this they get stuck in the Silicon Valley vacuum, so called because their product never goes mainstream
  • Stay away from Silicon Valley — develop a service that has an impact on mainstream users
  • If you can do that, Silicon Valley will notice

There isn’t a single mainstream user problem or value being addressed.

You are better off with early adopters that aren’t asking for cool new features, but instead tell you about their experience to try and integrate your service into their daily patterns.

Money, Startups, Ideas

Reaching the $5 Million Club Takes an Open Mind

Create opportunities for yourself by being bold:

One might think that good fortune would play a role, but even luck is largely a matter of one’s own making. Psychologist Richard Wiseman has found that people who describe themselves as lucky share common habits that account for their success: They’re friendly and fond of new experiences, traits that put them on a collision course with new opportunities. In addition, “lucky” folks simply have higher expectations of success — they’re too pigheadedly optimistic to heed the long odds and call it quits.

Start a business, make something people want, and put lots of work into it:

The vast majority — 80% — either started their own business or worked for a small company that saw explosive growth. And almost all of them made their fortune in a big lump sum after many years of effort.

…rich folks often make their fortunes after they make up their minds to solve a problem or do something better than it’s been done before.

Couldn’t have said it better:

Being rich means freedom: to spend your time as you please, to pursue your real interests and to take a chance without courting utter ruin. Paradoxically, the road to riches often means acting as if you already have that freedom.

Let’s see what else

mibbit – cool, online IRC application – very impressed

10 Tips for Budding Web Programmers – and you’ll note that this link now has a title associated with it

How to Get Startup Ideas – I thought this was pretty interesting. He says that if you want to have a great startup idea you should move to the Bay Area because you’ll have a lot of conversations with a lot of smart people and you’ll stumble upon great ideas by accident. Without geeky conversations, it’s not impossible, but it’s more difficult.

Relating to ideas… I found out earlier this week that an idea I had for a startup has already been done. In fact, it has been done multiple times, and a large portion of the design and functionality that I had planned for it has already been implemented. One site, which has been up for a few months, seems to be doing well judging by the few thousand users they have (which could be a misleading stat). I have mixed feelings: part of me says, “Damn, wish I had started working on it first” and another says “Hey, it actually is a good idea because someone else is making it work.” A third, more ambitious side of me says “These sites are missing a lot of important functionality. Go get ’em.” I might do that. Or I might not. Part of me wants to find a large, untouched market. I want a BIG idea. Unfortunately, wanting is easy; doing is hard.