I'm fairly new to R; I use it for a course on network analysis at my university.
As part of a research project, I want to analyse tweets by Donald Trump and Hillary Clinton. I successfully managed to grant RStudio access to my Twitter account, but every time I try to download tweets, I get a fairly meager selection, ranging from 1,100 tweets at best to just 800-900 tweets at worst. I do not understand this, as I do not get any error message either. Am I missing something? I thought the limit on downloading tweets was 3,200?
This is my code:
#load twitteR package and the tools needed for login
library(twitteR)
library(ROAuth)
#load login data
api_key <- "blah"
api_secret <- "blah"
access_token <- "blah"
access_token_secret <- "blah"
#login
setup_twitter_oauth(api_key,api_secret,access_token,access_token_secret)
#retrieve tweets by Donald Trump, maximum number is 3200
tweetsTrump <- userTimeline("realDonaldTrump", n=3200)
#convert those tweets to a dataframe
Trump.df <- twListToDF(tweetsTrump)
I am eternally grateful for every useful tip!
Have a look at the Twitter API documentation where it says:
The Twitter Search API searches against a sampling of recent Tweets
published in the past 7 days.
Before getting involved, it’s important to know that the Search API is
focused on relevance and not completeness. This means that some Tweets
and users may be missing from search results. If you want to match for
completeness you should consider using a Streaming API instead.
Thus, the results from the Search API are limited by design. If you want more, use the Streaming API or a service like Gnip.
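If you do want to go the streaming route, here is a minimal sketch using streamR's filterStream(). my_oauth is assumed to be an OAuth credential object you created earlier with ROAuth, and the user IDs passed to follow are placeholders you would replace with the real numeric IDs of the accounts:
library(streamR)
library(ROAuth)
# my_oauth: OAuth credential created beforehand with ROAuth (assumption)
# follow takes numeric user IDs as strings; the IDs below are placeholders
filterStream(file.name = "candidate_tweets.json",
             follow = c("12345678", "87654321"),
             timeout = 600,          # keep the connection open for 10 minutes
             oauth = my_oauth)
streamed.df <- parseTweets("candidate_tweets.json")
Note that the Streaming API only captures tweets posted while the connection is open, so it cannot recover a user's past timeline.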
Related
I'm currently using the Standard Twitter API for my bachelor's degree, and I want to analyze the timelines of certain users.
My problem is that I want more than 3,200 tweets with the get_timeline command, so I set up the following code in R:
df1 <- get_timeline("user1", n = 3200)
df2 <- get_timeline("user1", n = 3200, max_id = tail(df1$status_id, 1))  # max_id = ID of the last tweet from df1
The first call gives me the intended 3,200 tweets, but the second one only returns 40-50 tweets. It varies, and I don't know why. I have seen some posts with the same question, but most of them are a bit outdated.
So, does anyone know whether the Twitter API is restricting my request, or is my problem elsewhere?
The user timeline API has a maximum of the most recent 3,200 results, so anything beyond that number will not work. The only way you would be able to do this is by using the full-archive search API to attempt to pull all the tweets posted by user1.
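If you have premium access, rtweet exposes that endpoint as search_fullarchive(). A rough sketch follows; the environment label and token are assumptions you would replace with your own, and the endpoint requires a premium/full-archive dev environment registered on developer.twitter.com:
library(rtweet)
# "fullArchiveDev" is a placeholder for your dev environment label (assumption)
older <- search_fullarchive(q = "from:user1",
                            n = 1000,
                            fromDate = "201501010000",  # YYYYMMDDHHMM
                            toDate = "201801010000",
                            env_name = "fullArchiveDev",
                            token = my_token)           # assumption: your existing rtweet token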
I need around 10k tweets from Twitter but I am not able to extract them.
I am getting the warning message below:
In doRppAPICall("search/tweets", n, params = params, retryOnRateLimit
= retryOnRateLimit, : 10000 tweets were requested but the API can only return 476
Is there any way to extract 10k tweets?
See the Twitter Search API documentation: with a standard account you can only request tweets from the last 7 days, or 180 tweets in a 15-minute window with user auth (450 with app auth).
Edit 1: It seems that I misunderstood the API description. Being able to make 180/450 requests per 15-minute window does not mean you get 180/450 tweets, but that you can make 180/450 different API calls. The explanation for the behaviour you are describing is also given at the link mentioned above:
Please note that Twitter’s search service and, by extension, the Search API is not meant to be an exhaustive source of Tweets. Not all Tweets will be indexed or made available via the search interface.
For one keyword, Twitter may consider only a few hundred tweets important, whereas for another keyword a few thousand may be interesting enough.
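Within those limits, twitteR's retryOnRateLimit argument at least lets a large request sleep through rate-limit resets instead of stopping early. A small sketch, where the keyword is a placeholder and setup_twitter_oauth() is assumed to have been run already:
library(twitteR)
tweets <- searchTwitter("yourkeyword",          # placeholder search term
                        n = 10000,
                        retryOnRateLimit = 120)  # sleep and retry when a rate limit is hit
tweets.df <- twListToDF(tweets)
Even with retries, you remain bound by the 7-day window and the relevance sampling described above, so 10k tweets may simply not exist for your query.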
I started to learn R, but now I am stuck.
I want to analyse the followers of a specific Twitter account. The problem is that that profile has a lot of followers, so getting all of them would take a long time, and I am only interested in the followers from Switzerland.
So I wonder whether it is possible to load only the data of followers who come from Switzerland?
This is what I already have:
library("twitteR")
consumer_key <- "my_key"
consumer_secret <- "my_secret"
access_token <- "my_token"
access_secret <- "my_secret"
options(httr_oauth_cache=T) #This will enable the use of a local file to cache OAuth access credentials between R sessions.
setup_twitter_oauth(consumer_key,
consumer_secret,
access_token,
access_secret)
[1] "Using direct authentication"
trump <- getUser("RealDonaldTrump")
follower <- trump$getFollowers(retryOnRateLimit=180)
The last line of code would obviously take hours, so I need a better solution. Thanks :)
Could you elaborate on what information you want about each follower? Do you want a count of how many followers list "Switzerland" as their home country? Or do you want more information about each user?
My understanding is that the API doesn't permit the filtering of followers' output on a field such as country. Thus, it seems to me, that one would need to collect all users' information, then filter, after the fact, on the country.
I collected the user ID numbers for all of Donald Trump's followers in June 2016 (when he had fewer than 10 million followers, I think). It took some time, but, with the use of the smappR package & smappR::getFriends function, it was easy to do. I'm sure that it will take longer now that he has many more followers, but the procedure with smappR::getFriends should work.
It will, however, require some additional time to download the user information for each user ID. I think that you'll need to make a distinct query to a twitter API to get the user information, as smappR::getFriends will give only the user IDs (and maybe the user names). You would then need to query an API with a function like smappR::getUsers to get their user information, including country of residence. I admit that my understanding of Twitter APIs is incomplete, but I hope that this response helps.
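Once the follower information has been collected (by whichever route), the filtering step itself is straightforward. A minimal sketch with twitteR, assuming followers is the list of user objects returned by getFollowers():
# followers: list of user objects, e.g. from trump$getFollowers()  (assumption)
followers.df <- twListToDF(followers)
# "location" is a free-text profile field, so match loosely on likely spellings
swiss <- subset(followers.df,
                grepl("switzerland|schweiz|suisse|svizzera", location, ignore.case = TRUE))
nrow(swiss)  # count of followers whose profile location mentions Switzerland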
I am using the twitteR package for R to download tweets from particular timelines, including retweets. The thing is, some retweets are cut short.
For example, I am downloading Donald Trump's tweets and he has retweeted this tweet: https://twitter.com/SecretarySonny/status/906666266320146432
I get this when downloading from Trump's timeline:
"RT #SecretarySonny: Serious #Cabinet meeting today, called by #POTUS at Camp David. Reports on #Irma's track, potential impact, fed & state…"
When the full text is:
"Serious #Cabinet meeting today, called by #POTUS at Camp David. Reports on #Irma's track, potential impact, fed & state preparedness."
It seems that the tweets are cut short by the number of characters required to spell the name of the original Twitter account (SecretarySonny in this case).
Is there any way I could get the full retweets? I checked twitteR documentation but I was not able to find anything that could help.
You need to include the ?tweet_mode=extended parameter when you call the Twitter API. This is covered in the extended Tweets documentation on Twitter's developer site. What code are you currently using?
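twitteR does not expose that switch directly, but you can call the endpoint yourself. A minimal sketch with httr, where twitter_token is assumed to be an OAuth 1.0 token you created earlier (e.g. via httr::oauth1.0_token()):
library(httr)
resp <- GET("https://api.twitter.com/1.1/statuses/user_timeline.json",
            query = list(screen_name = "realDonaldTrump",
                         count = 200,
                         tweet_mode = "extended"),  # ask for untruncated text
            config(token = twitter_token))          # assumption: existing OAuth token
tweets <- content(resp, "parsed")
# for retweets, the untruncated original text sits in retweeted_status$full_text
texts <- vapply(tweets, function(t) {
  if (!is.null(t$retweeted_status)) t$retweeted_status$full_text else t$full_text
}, character(1))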
I'm a novice R and twitteR package user but I wasn't able to find a strong recommendation on how to accomplish the following.
I'd like to mine a small number of Twitter accounts and examine their output for keyword usage (i.e. I don't know what the keywords are yet).
Assumptions:
I have a small number of Twitter accounts (<6) I want to mine, with a maximum of 7,000 tweets if you aggregate the various accounts' statuses
Those accounts are not generating new tweets at a fast rate (a few a day)
The accounts all have less than 3200 tweets according to the profile data returned by lookupUsers()
When I use the twitteR function userTimeline("accountname", n=3200) I get between 40 and 600 observations returned, i.e. nowhere near the 3,200. I know there are API limits, but if it were an issue of limits I would expect to get the same number of observations back, or to get the notice that I need to wait 15 minutes.
How do I get all the text I need while still playing nice?
By using a combination of CRAN and GitHub packages it was possible to get all the tweets for a user.
The packages used were streamR, available on CRAN, and https://github.com/SMAPPNYU/smappR/ to help with getting and analysing the tweets.
The basic steps are (a sketch follows below):
Authenticate to Twitter using OAuth and your Twitter keys, tokens, and secrets
Use the smappR function getTimeline(), which saves the tweets to a JSON file you specify
Use parseTweets(jsonfile) to read the JSON contents into a data frame
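A rough sketch of those steps; note that the exact argument names of smappR's getTimeline() (especially the OAuth argument) have changed between versions, so treat this as an outline and check the package help:
library(streamR)
library(smappR)  # install with devtools::install_github("SMAPPNYU/smappR")
# my_oauth_folder: folder of OAuth credential files set up beforehand (assumption)
getTimeline(filename = "account1_tweets.json",  # tweets are appended to this JSON file
            screen_name = "accountname",         # placeholder account
            n = 3200,
            oauth_folder = my_oauth_folder)
tweets.df <- parseTweets("account1_tweets.json")  # streamR reads the JSON into a data frame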
This can be accomplished with the rtweet package, which is still supported. First you need to be approved as a developer and create an app. (As a note, Twitter has now changed its policies, and approval can take a while. It took me almost a week.)
After that, just use get_timeline() to get all of the tweets from a timeline, up to 3200.
library(rtweet)
djt <- get_timeline("adamgreatkind", n = 3200)
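If you need several timelines (as in the use case above), get_timeline() also accepts a vector of screen names; the account names here are placeholders:
accounts <- c("account_one", "account_two", "account_three")  # placeholder screen names
all_tweets <- get_timeline(accounts, n = 3200)                 # n applies per account
table(all_tweets$screen_name)  # tweets returned per account (column name may vary by rtweet version)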