Python Twitter Mining: Handling Rate Limits with TweetPony
Problem: I’m using Python to mine tweets from Twitter, with TweetPony handling my API requests. The requests themselves work, but I’m not sure how to deal with the rate limit without my system crashing. How can I keep my program running continuously while still abiding by the rate limit rules?
Solution: I’ve implemented much of this in my own Twitter Miner. Feel free to look at the code and grab what you need (attribution to me would be awesome!).
Or, if you just want a small example of the code, check out my gist: https://gist.github.com/SIRHAMY/237bc3928e535e55ab49dabfa30acdb0
The main thing to keep in mind when dealing with the rate limit is that Twitter replies with an error telling you you’ve reached the limit if you over-request. As long as you don’t keep requesting at a high rate after receiving it, you shouldn’t have any problems using that error as the signal to wait for the next request allotment.
TweetPony raises an exception if the rate limit is hit. By searching the exception string for '88' (Twitter’s error code for rate limiting), you can branch on whether you just need to wait or whether a more serious issue occurred.
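The check itself can be sketched in a few lines. This is a minimal illustration, not the code from my miner: the helper name is_rate_limit_error is made up for this example, and it assumes the code 88 shows up as a standalone number somewhere in the exception text. Matching on a word boundary avoids false hits on numbers like 188.

```python
import re

# Twitter's error code for "rate limit exceeded".
RATE_LIMIT_CODE = "88"


def is_rate_limit_error(exc):
    """Return True if the exception text contains the code 88 as a whole number.

    Hypothetical helper: it assumes the error code appears in str(exc),
    which is how the approach described above detects it.
    """
    pattern = r"\b" + RATE_LIMIT_CODE + r"\b"
    return re.search(pattern, str(exc)) is not None
```

If the library exposes the error code as an attribute on the exception, comparing that attribute directly would be more robust than searching the string.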
In my code, I sleep for three minutes and then try again. The number itself is arbitrary, but I wanted to keep superfluous requests to a minimum while getting back to work ASAP. I chose a factor of 15 because Twitter’s rate limit window is 15 minutes, so the program will hopefully restart as soon as the limit is lifted. I haven’t actually tested this, so feel free to use whatever timeout you wish.
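Here is a rough sketch of that sleep-and-retry loop. The function name call_with_rate_limit_retry and its parameters are invented for this example, and request stands in for whatever TweetPony call you are making. It catches a plain Exception to stay library-agnostic; in real code you would probably narrow that to the library’s own error class.

```python
import time


def call_with_rate_limit_retry(request, wait_seconds=180, sleep=time.sleep):
    """Call request() until it succeeds, sleeping whenever the rate limit is hit.

    request: any zero-argument callable that performs one API call
             (a stand-in for your TweetPony request).
    wait_seconds: how long to pause after a rate limit error (three minutes here).
    sleep: injectable so the loop can be exercised without really waiting.
    """
    while True:
        try:
            return request()
        except Exception as exc:
            # '88' is Twitter's rate limit code; anything else is a real
            # problem, so re-raise it instead of silently retrying.
            if "88" not in str(exc):
                raise
            sleep(wait_seconds)
```

You would wrap each API call in a lambda or small function and pass it in. The loop retries forever on rate limit errors, which is what a continuously running miner wants, but you could add a maximum attempt count if you’d rather give up eventually.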
To see an example of a full-blown Twitter-mining system, check out Twitter Cabinet. It utilizes Twitter Miner to grab the raw tweets and provides functionality to classify tweets by topic and run sentiment analysis on them.