Data Analysis

January 23 / Data Analysis

Last year, my brother and I began a project that required collecting lots and lots of tweets to analyze. So far, we’ve collected over 9.5 million geo-located tweets from roughly 20 US cities. Here’s how we did it.

Read More: Persistent Tweet Collection in Python

October 2 / Data Analysis

A few months ago, I decided it would be fun to do some predictive modeling of the quality of upcoming Hollywood films. There’s tons of data out there, but some of it can be hard to find. As part of that project, I wrote a short R script to scrape some data from Rotten Tomatoes. Feed the function an actor’s name, and it will return all of their film and TV work along with the corresponding Tomatometer scores, release years, and a few other things. So we can use this to find out whether anybody’s career is in real decline (or on an upswing): Charlie Sheen, for example, or M. Night Shyamalan.

Read More: Rotten Tomatoes Data in R