Back in September 2009 I launched a small web app called HNTrends.com, a tool for visualizing the movement of stories on HackerNews’s front page over time.
I haven’t worked on the site much since then, but the script that logs the data has been diligently recording the front page submissions every 15 minutes since it started.
It occurred to me that a detailed analysis of the data might yield some interesting results, such as how the site has grown since then, the best time to post a new submission, user participation rates, or some insight that changes the way we see the site. I offer it to you today so that you may analyze it to your heart’s content.
You can download it here (CSV, 13.4 MB zipped, 169 MB unzipped).
In total, the database contains 514,478 records spanning from August 31, 2009 to March 7, 2010.
A single line looks like this:
"1","http://paulgraham.com/kate.html","What Kate saw in Silicon Valley","129","albertcardona","2009-08-31 20:15:15","63","1","2009-08-31 23:15:15","796573","HackerNews","c18577"
Removing the quotes and splitting by comma, here is what each item represents:
1 – Primary key
http://paulgraham.com/kate.html – Destination URL
What Kate saw in Silicon Valley – Title
129 – Points
albertcardona – Submitter
2009-08-31 20:15:15 – Approximate UTC submission time, calculated based on the time minus the age of the submission
63 – Comments
1 – Rank
2009-08-31 23:15:15 – UTC time record was created
796573 – HackerNews ID
HackerNews – Always “HackerNews”
c18577 – Color for display purposes
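If you would rather not split the line by hand, here is a minimal Python sketch that reads the sample line above into named fields. It uses the standard csv module, which also copes with titles that happen to contain commas. The Record type, its field names, and parse_row are my own labels for illustration, not anything stored in the file, and the sketch assumes every numeric field is filled in.

```python
import csv
from datetime import datetime
from typing import NamedTuple

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


class Record(NamedTuple):
    """One front page snapshot row (field names are illustrative)."""
    key: int
    url: str
    title: str
    points: int
    submitter: str
    submitted_at: datetime  # approximate UTC submission time
    comments: int
    rank: int
    recorded_at: datetime   # UTC time the record was created
    hn_id: int
    source: str             # always "HackerNews"
    color: str


def parse_row(row):
    """Convert a list of 12 strings, in file order, into a Record."""
    return Record(
        key=int(row[0]),
        url=row[1],
        title=row[2],
        points=int(row[3]),
        submitter=row[4],
        submitted_at=datetime.strptime(row[5], TIME_FORMAT),
        comments=int(row[6]),
        rank=int(row[7]),
        recorded_at=datetime.strptime(row[8], TIME_FORMAT),
        hn_id=int(row[9]),
        source=row[10],
        color=row[11],
    )


# The sample line from this post.
SAMPLE = (
    '"1","http://paulgraham.com/kate.html","What Kate saw in Silicon Valley",'
    '"129","albertcardona","2009-08-31 20:15:15","63","1",'
    '"2009-08-31 23:15:15","796573","HackerNews","c18577"'
)

# csv.reader accepts any iterable of lines, so a one-item list works here.
record = parse_row(next(csv.reader([SAMPLE])))
print(record.title, record.points, record.rank)
```

To read the whole download, passing an open file object to csv.reader in place of the one-item list should work the same way, provided every line follows the layout above.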
One final note: this database covers roughly 99% of the time period since it started. For a while the script broke whenever an article didn’t contain a comment link, and every so often it goes down for miscellaneous reasons.
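If those holes matter for your analysis, one way to locate them is to look for neighboring snapshots that are more than about 15 minutes apart. The sketch below is only a rough approach: find_gaps and the 5 minute slack are my own choices, and apart from the first one, the timestamps are made up to show a missing stretch.

```python
from datetime import datetime, timedelta


def find_gaps(snapshot_times,
              expected=timedelta(minutes=15),
              slack=timedelta(minutes=5)):
    """Return (earlier, later) pairs of neighboring snapshots that are
    further apart than the expected logging interval plus some slack."""
    # Every story in a snapshot shares its creation time, so de-duplicate.
    ordered = sorted(set(snapshot_times))
    gaps = []
    for earlier, later in zip(ordered, ordered[1:]):
        if later - earlier > expected + slack:
            gaps.append((earlier, later))
    return gaps


# Illustrative creation times: the logger misses three runs after 23:30.
times = [
    datetime(2009, 8, 31, 23, 15, 15),
    datetime(2009, 8, 31, 23, 30, 15),
    datetime(2009, 9, 1, 0, 30, 15),
    datetime(2009, 9, 1, 0, 45, 15),
]

gaps = find_gaps(times)
for earlier, later in gaps:
    print("no snapshots between", earlier, "and", later)
```

In practice you would feed it the record creation time (the ninth field) from every row in the file.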


