Get the URLs of all 4chan.org textboards (dis.) and scrape and archive them. Use archive.org
Name: Anonymous 2014-06-27 18:00
Compressing the markup reduced the 1.5 GB prog.db to around 390 MB. I could host the old prog on heliohost now, but I want to fit all of world4ch. Any recommendations for using data compression in a database are welcome. Right now I'm thinking of serializing each thread into a flat file and then gzipping it.
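
A rough sketch of the flat-file idea in Python, in case it helps. Everything here is an assumption for illustration, not the actual prog.db layout: the thread is taken to be a dict with an `id` and a list of `posts`, JSON is used as the serialization format, and the names `save_thread`, `load_thread` and the `<id>.json.gz` file naming are made up.

```python
import gzip
import json
from pathlib import Path


def save_thread(thread: dict, out_dir: Path) -> Path:
    """Serialize one thread to compact JSON and write it as a gzipped flat file.

    The dict shape (an "id" key plus arbitrary post data) is hypothetical.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{thread['id']}.json.gz"
    # Compact separators drop the whitespace JSON would otherwise add.
    raw = json.dumps(thread, ensure_ascii=False, separators=(",", ":")).encode("utf-8")
    with gzip.open(path, "wb") as f:
        f.write(raw)
    return path


def load_thread(path: Path) -> dict:
    """Read a gzipped flat file back into the original dict."""
    with gzip.open(path, "rb") as f:
        return json.loads(f.read().decode("utf-8"))
```

Usage would look something like `save_thread({"id": 1, "subject": "test", "posts": [...]}, Path("archive/prog"))`. One thing to keep in mind: gzipping each thread separately means the compressor never sees repetition shared across threads, and every file carries its own gzip header, so lots of very short threads may compress worse than one big archive would.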