So I got my hands on this 1-gigabyte file which has all of the old /prog/ up to some date in August 2013:
1.4G prog-20130813130608.db
(it's 55 MB when compressed)
Big thanks to the anonymous person who published this (though the original url is gone now)
I wonder if it should be shared somehow for posterity. Maybe some other people here also have this file? Somehow I doubt archive.org would want it, what with all the racist posts.
Also I wrote a little python program to browse that file via a local HTTP interface
Anyway also big thanks to whoever started /prog/rider as a resurrection of the old /prog/
BTW I wrote a little python/flask app to browse the "prog.db" file via local HTTP. It has a feature to search for a string in all posts and sort results by date, so you can find out when a meme first appeared. Optionally, you can include the original javascript and CSS from dis.4chan.org to get spoiler-tags to work and have the original colours etc. (Yes, flask probably sucks, but this little program is useful for browsing the archive.) Hope this is useful to other people.
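For anyone wondering what the search-and-sort-by-date feature boils down to, here is a minimal sketch using Python's sqlite3 module. This is not the actual app: the layout of prog.db isn't described anywhere in this thread, so the `posts` table and its columns (`thread_id`, `post_num`, `posted_at`, `body`) are made-up names, and `make_demo_db` just builds a throwaway in-memory database to run the query against.

```python
import sqlite3

# Hypothetical schema: the real prog.db layout is not described in this thread.
SCHEMA = """
CREATE TABLE posts (
    thread_id INTEGER,
    post_num  INTEGER,
    posted_at INTEGER,  -- Unix timestamp (assumed)
    body      TEXT
)
"""


def make_demo_db():
    """Build a small in-memory database with a few made-up posts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO posts VALUES (?, ?, ?, ?)",
        [
            (1, 1, 300, "Read SICP"),
            (2, 1, 100, "have you read your sicp today"),
            (3, 1, 200, "100% unrelated"),
        ],
    )
    return conn


def search_posts(conn, needle):
    """Return (posted_at, thread_id, post_num, body) rows containing needle, oldest first."""
    # Escape LIKE wildcards so the needle is matched literally.
    escaped = needle.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    sql = (
        "SELECT posted_at, thread_id, post_num, body FROM posts "
        "WHERE body LIKE ? ESCAPE '\\' "
        "ORDER BY posted_at ASC"
    )
    # SQLite's LIKE is case-insensitive for ASCII by default.
    return conn.execute(sql, (f"%{escaped}%",)).fetchall()


if __name__ == "__main__":
    for row in search_posts(make_demo_db(), "sicp"):
        print(row)
```

The first row returned is the earliest post containing the string, which is the "when did this meme first appear" lookup. Against the real file you would swap in whatever table and column names it actually uses.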
"What does it mean to be normal?" rhetorically asked the doctor. The doctor pounded his chest and said, "I am normal ! I am the doctor ! I am eternal !". An Indian man 4000 years ago wrote down that I am normal ! I am the doctor! he who installs .NET installs the truth. the wheel spun and several thousand dollars were contributed towards the most prestigious institution prized for its cliques of suburban children. who needs god when you can install .NET ?
What does s/something/something mean? I always see it on /prog/ and Hacker News and I've been bugged by this for months but I have no idea how to Google it.
>>35 I already know that, but I don't care much about giving out some IP that changes every day. Moreover, we're not sharing anything illegal, are we?
No, this is not a ``nothing to hide, nothing to fear'' argument, I just don't see why I should go to such lengths just to download 1GB of ANUS and chinkspam. I can't pay for a good VPS/seedbox because I don't have a credit card and I'm not sure if there are any free VPSs you can trust.
>>36 Then use I2P like I said. Unless you are willing to host your own GNUnet ECRS or Freenet Address Resolution Keys after you have downloaded it (even though I have FrozenVoid's copy).
Dear eNumbered client, the website will be offline for one week while we upgrade security and add additional functionality to the site. Full functionality will be ...
That's Business With .NET That's Business With Smoked Meat The Doctor Is Eternal Let's take it to the next level Let's all go to the hotel pool as we finish the bottle It's very foolish to think reality is normal Let's get this party started right
>>44 You are as bad as >>42 What the fuck is wrong with SQL queries? Your stupid script can be replaced by one SQL line.
Name: >>44 2013-12-04 0:44
>>46 I'm sorry. I'm a noob when it comes to SQL queries. I don't really know anything about them, in fact. I just wanted to browse my /prog/ and I coded this up as quickly as I could. The good thing about it is that it works and I can easily use it with the web interface. Again, I'm not an SQL expert at all. I'm sorry.
Name: >>44 2013-12-04 0:46
>>46 Also much of what the script does is display the posts in a browseable HTML format close to the original site, and make sure stuff like spoilers work. Again, I'm sorry my noob-tier quickly-put-together code offended you and I appreciate your advice.
>>51 Is there anything worth scraping after 2013-08-13? That's what >>50 uses. I don't want to soil the database with shit like `le pedophile sage' and all the /g/ shitposting.
>>52 Use regex to filter all that shit out too. You are going to remove a lot of "X{+}D{+}", even "8={+}D". If you want to be careful, just place them in the review queue.
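A rough sketch of the regex-plus-review-queue idea, in Python. Everything here is an assumption: it reads "X{+}D{+}" as the regex X+D+ and "8={+}D" as 8=+D, treats posts as plain strings, and `triage` is a made-up helper name. Matching posts are set aside for a human to look at instead of being deleted.

```python
import re

# Assumed reading of the patterns quoted above:
# "X{+}D{+}" as X+D+ and "8={+}D" as 8=+D.
SPAM_PATTERNS = [re.compile(r"X+D+"), re.compile(r"8=+D")]


def triage(posts, patterns=SPAM_PATTERNS):
    """Split posts into (kept, review_queue); nothing is deleted outright."""
    kept, review_queue = [], []
    for post in posts:
        if any(p.search(post) for p in patterns):
            # A match only flags the post for manual review.
            review_queue.append(post)
        else:
            kept.append(post)
    return kept, review_queue


if __name__ == "__main__":
    kept, review_queue = triage(["XDDDD lol", "read SICP", "8====D"])
    print("kept:", kept)
    print("review:", review_queue)
```

Since the patterns are case-sensitive and unanchored, something like "XD" inside a longer word gets flagged too, which is exactly why a review queue is safer than deleting on match.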
Name: Anonymous 2013-12-04 19:41
>>52 there have been maybe a few more lambda arthur calculus appearances, and there was a thread the other day from an oldfag who hadn't visited in two years saying to spread love and not trolling which was kind of amusing.
a few of my threads about steam's drm, and about cloud bullshit, predate 2013-08 but didn't make it into the scrape
>>52 There's nothing worth scraping before 2013-08-13 either. If you're going to have a [spoiler]/prog/[/spoiler] index, you just have to take it as given that almost all of it is going to be shitposts.
thanks a whole lot admin :) thanks to your http hosting /prog/ is now archived in the internet archive for posterity. (and it's also easy to access via http from here)