Have you ever wondered how many people could see a tweet that has been retweeted a lot? Could you find out using Python and the Twitter API? Let's do it!
My latest tweet received a ton more attention than any of my previous ones. It was retweeted by 45 people, including a high-profile tweeter. If you haven't seen it or can't remember, it was the one from which I got this gif:
We are going to use tweepy, "An easy-to-use Python library for accessing the Twitter API".
import tweepy
You thought I would show you my secret keys? You have to create your own app at apps.twitter.com and get yours.
from secret_tweepy import consumer_key, consumer_secret, access_token, token_secret
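In case it helps, here is a minimal sketch of what a `secret_tweepy.py` file could look like so that the import above works. The four variable names come from the import line; the values are placeholders I made up, and you would replace them with the credentials from your own app.

```python
# secret_tweepy.py
# Keep this file out of version control.
# Placeholder values (assumptions): replace each with the credentials
# generated for your own app.
consumer_key = 'YOUR_CONSUMER_KEY'
consumer_secret = 'YOUR_CONSUMER_SECRET'
access_token = 'YOUR_ACCESS_TOKEN'
token_secret = 'YOUR_ACCESS_TOKEN_SECRET'
```

Keeping the keys in a separate module means the notebook itself can be shared without leaking them.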
With your keys at hand, you can get access to the Twitter API.
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, token_secret)
api = tweepy.API(auth)
We'll get our tweet by its id. It's the same number that appears at the end of the tweet's URL: twitter.com/carneiroblogbr/status/693201831216943104
TWEET_TO_ANALYZE = 693201831216943104
tweet = api.get_status(TWEET_TO_ANALYZE)
Just to be sure, let's take a look at that tweet's text:
print(tweet.text)
The Twitter API, and therefore tweepy, returns at most a hundred retweets of a tweet. Luckily (?) we didn't get that much attention.
# tweepy 3.x method name; in tweepy 4 and later it is called get_retweets
retweets = api.retweets(TWEET_TO_ANALYZE, count=100)
print('Retweets count: %d' % len(retweets))
Now we can build a dictionary to hold the usernames of the people who retweeted us. Each retweeter's screen_name will be a key, and their follower count will be the value.
retweeters = {}
for retweet in retweets:
    retweeters[retweet.user.screen_name] = retweet.user.followers_count
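If you want to try the dictionary step without calling the API, here is a small sketch using stand-in objects. The `User` and `Status` namedtuples and the sample accounts are made up for illustration; they only mimic the two attributes the loop above reads.

```python
from collections import namedtuple

# Hypothetical stand-ins for the objects tweepy returns (no API call here).
User = namedtuple('User', ['screen_name', 'followers_count'])
Status = namedtuple('Status', ['user'])

sample_retweets = [
    Status(User('alice_example', 1200)),
    Status(User('bob_example', 85)),
    Status(User('alice_example', 1200)),  # the same account listed twice
]

# Same idea as the loop above, written as a dict comprehension.
# Using screen_name as the key means a repeated account is stored only once.
sample_retweeters = {rt.user.screen_name: rt.user.followers_count
                     for rt in sample_retweets}
print(sample_retweeters)
```

Because dictionary keys are unique, an account can never be counted twice in the sum.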
The moment of truth! Let's sum up all of the follower counts and see how many people could have seen our tweet (if they were paying attention).
print('Tweet reach: %d' % sum(retweeters.values()))
Wow! That's great! Over 200k!! Please consider that I only have 121 followers...
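One caveat: summing follower counts gives an upper bound, since someone who follows two retweeters is counted twice. Here is a toy sketch of the difference. The follower id sets are invented; getting the real ones would mean fetching each retweeter's follower list, which this post doesn't do.

```python
# Hypothetical follower ids for three made-up retweeters.
followers_by_retweeter = {
    'alice_example': {1, 2, 3, 4},
    'bob_example': {3, 4, 5},
    'carol_example': {6},
}

# What this post computes: add up the follower counts.
naive_reach = sum(len(ids) for ids in followers_by_retweeter.values())

# Counting each follower once: size of the union of all the sets.
unique_reach = len(set().union(*followers_by_retweeter.values()))

print('Summed reach: %d' % naive_reach)
print('Unique reach: %d' % unique_reach)
```

In this toy data followers 3 and 4 follow both alice_example and bob_example, so the summed figure is larger than the unique one.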
Some magic ahead: a lambda function! Calling retweeters.items() gives us a sequence of tuples with two elements: the key (screen_name) and the value (followers_count). If we want to see the most followed user first, we have to sort by the second element (the value, or [1]) in descending order. And let's keep only the top 10 to save some space.
most_influence = sorted(retweeters.items(),
                        key=lambda rt: rt[1],
                        reverse=True)[:10]
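To see what the lambda does on its own, here is a small sketch with invented follower counts. It also shows `operator.itemgetter(1)`, a standard library alternative that picks the same element; this is just another way to write the key, not what the original code uses.

```python
from operator import itemgetter

# Made-up screen names and follower counts, for illustration only.
sample = {
    'alice_example': 1200,
    'bob_example': 85,
    'carol_example': 40000,
    'dave_example': 310,
}

# Sort the (screen_name, followers_count) tuples by the count, largest first,
# and keep the top two.
top_two = sorted(sample.items(), key=lambda rt: rt[1], reverse=True)[:2]

# Same result, using itemgetter(1) instead of the lambda.
top_two_alt = sorted(sample.items(), key=itemgetter(1), reverse=True)[:2]
```

Both versions give a list of tuples, with carol_example first and alice_example second.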
Instead of printing it all in one line, let's print it pretty!
from pprint import pprint
pprint(most_influence)
Thank you, @codinghorror!
If you want to try it yourself, feel free to get this code (or even the whole IPython notebook) from my GitHub repo: https://github.com/ocarneiro/twitter-reach