I'm currently using the standard Twitter API for my bachelor's degree, and I want to analyze the timelines of certain users.
My problem is that I want more than 3,200 tweets from the get_timeline command. So I set up the following code in R:
library(rtweet)
df1 <- get_timeline("user1", n = 3200)
# oldest ID in df1 (tweets arrive newest-first)
df2 <- get_timeline("user1", n = 3200, max_id = df1$status_id[nrow(df1)])
The first call gives me the intended 3,200 tweets. The second one only provides 40-50 tweets, and the number varies; I don't know why. I have seen several posts with the same question, but most are a bit outdated.
Does anyone know whether the Twitter API is restricting my requests, or is my problem elsewhere?
The user timeline API has a hard limit of the most recent 3,200 results, so anything beyond that number will not work. The only way to get older tweets is the full archive search API, with which you can attempt to pull all the tweets posted by user1.
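In rtweet the full archive endpoint is exposed as search_fullarchive(), which requires a paid premium environment configured in the developer dashboard. A minimal sketch, where "research" is a hypothetical environment name and the dates are placeholders:

library(rtweet)

# requires a premium developer environment; "research" is a placeholder
older <- search_fullarchive(q = "from:user1", n = 500,
                            fromDate = "201501010000",  # YYYYMMDDHHmm
                            toDate   = "201812312359",
                            env_name = "research")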
I'm using rtweet's get_timeline function to download tweets. However, some of the users I'm interested in have far more than the 3,200 tweets you are allowed to download (some have around 47,000). There is a retryonratelimit argument for downloading tweets by words or hashtags, so I'm wondering whether there is a similar way to get more than 3,200 tweets from one user.
The documentation - see ?get_timeline - includes a link to the Twitter developer documentation for GET statuses/user_timeline. The R function is just a wrapper for this.
If you then follow the link to Working with timelines, you'll find an explanation of the max_id parameter.
The basic approach then is:
get the first 3200 tweets
get the earliest status ID using something like min(as.numeric(zanetti$status_id))
run get_timeline again setting max_id = ID, where ID is the ID from step 2 (a sketch of the full loop follows this list)
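Putting those steps together - a minimal sketch, assuming a configured rtweet token and the hypothetical handle "user1":

library(rtweet)

all_tweets <- get_timeline("user1", n = 3200)
repeat {
  # tweets arrive newest-first, so the last row holds the oldest ID
  oldest_id <- all_tweets$status_id[nrow(all_tweets)]
  older <- get_timeline("user1", n = 3200, max_id = oldest_id)
  # max_id is inclusive, so drop anything already collected
  older <- older[!older$status_id %in% all_tweets$status_id, ]
  if (nrow(older) == 0) break   # nothing earlier was returned
  all_tweets <- rbind(all_tweets, older)
  Sys.sleep(60)                 # back off between pages to respect rate limits
}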
Note: I just tried this using my own timeline and only 40 tweets were returned by step 3. So you may also have to wait an appropriate amount of time to avoid rate limits. And be aware that Twitter basically does all it can to prevent you from requesting large amounts of data via the API - at the end of the day, what you want may not be possible.
I need around 10k tweets from Twitter but I am not able to extract them. I'm getting the warning message below:
In doRppAPICall("search/tweets", n, params = params, retryOnRateLimit = retryOnRateLimit, :
  10000 tweets were requested but the API can only return 476
Is there any way to extract 10k tweets?
See the Twitter search API: with a standard account you can only request tweets from the last 7 days, or 180 tweets in a 15-minute window with user auth (450 with app auth).
Edit 1: It seems that I misunderstood the API description. "180/450 requests per 15-minute window" does not mean you get 180/450 tweets, but that you can make 180/450 separate API calls. The explanation for the phenomenon you are describing is also given in the link mentioned above:
Please note that Twitter’s search service and, by extension, the Search API is not meant to be an exhaustive source of Tweets. Not all Tweets will be indexed or made available via the search interface.
For one keyword, Twitter may index only a few hundred tweets as important, whereas for other keywords a few thousand may be interesting enough.
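For completeness, since the warning above comes from twitteR, this is roughly what a capped search with retries looks like; the keyword and credentials are placeholders:

library(twitteR)

# placeholder credentials - use your own app's keys and tokens
setup_twitter_oauth(consumer_key, consumer_secret, access_token, access_secret)

# retryOnRateLimit waits out 15-minute windows and retries, but it
# cannot return tweets the search index never contained
tweets <- searchTwitter("keyword", n = 10000, retryOnRateLimit = 120)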
I'm a novice R and twitteR package user but I wasn't able to find a strong recommendation on how to accomplish the following.
I'd like to mine a small number of twitter accounts to identify their output for keyword usage. (i.e. I don't know what the keywords are yet)
Assumptions:
I have a small number of Twitter accounts (<6) I want to mine, with a maximum of 7,000 tweets if you aggregate the various accounts' statuses
Those accounts are not generating new tweets at a fast rate (a few a day)
The accounts all have less than 3200 tweets according to the profile data returned by lookupUsers()
When I use the twitteR function userTimeline("accountname", n=3200) I get between 40 and 600 observations returned, i.e. nowhere near the 3,200. I know there are API limits, but if it were an issue of limits I would expect to get the same number of observations back, or a notice that I need to wait 15 minutes.
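For what it's worth, twitteR can report the current rate-limit state, which would rule limits in or out; a minimal check, assuming an authenticated session:

library(twitteR)

# shows the limit, remaining calls and reset time per endpoint
limits <- getCurRateLimitInfo("statuses")
subset(limits, resource == "/statuses/user_timeline")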
How do I get all the text I need while still playing nice?
By using a combination of CRAN and GitHub packages it was possible to get all the tweets for a user.
The packages used were streamR, available on CRAN, and https://github.com/SMAPPNYU/smappR/ to help with getting the tweets and the analysis.
The basic steps are (a code sketch follows the list):
Authenticate to Twitter using OAuth and your Twitter keys, tokens and secrets
Use the smappR function getTimeline(), which saves the tweets to a JSON file you specify
Use parseTweets(jsonfile) to read the JSON contents into a data frame
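A rough sketch of those steps; the file name and credentials folder are placeholders, and smappR's argument names may differ between versions:

library(streamR)   # CRAN; provides parseTweets()
library(smappR)    # install from github.com/SMAPPNYU/smappR

# argument names are from memory and may vary across smappR versions
getTimeline(filename = "user1_tweets.json",   # tweets are written here
            screen_name = "user1", n = 5000,
            oauth_folder = "~/credentials")   # folder of saved OAuth tokens
tweets <- parseTweets("user1_tweets.json")    # JSON -> data frame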
This can be accomplished with the rtweet package, which is still supported. First you need to be approved as a developer and create an app. (As a note, Twitter has now changed its policies, and approval can take a while - it took me almost a week.)
After that, just use get_timeline() to get all of the tweets from a timeline, up to 3200.
library(rtweet)
djt <- get_timeline("adamgreatkind", n = 3200)
How can I use the twitteR package for R to get more than 100 search results?
Although I would prefer an example in R (since my current code uses R), I could just as easily use Java, so an example that searches Twitter in Java to get 200 search results may suffice.
I don't know if this is even possible. I don't remember seeing a "page number" you can specify when searching (the Google API supports that). I think that with Twitter you can specify the minimum Tweet ID, which lets you get only newer tweets, but I don't expect that to work as desired in this case (unless the search results are somehow ordered by date/time rather than relevance to the search term).
I'm getting tweets this way.
library(twitteR)
someTweets <- searchTwitter("#EroticBroadway", n = 500)
The n argument tells it how many tweets to cap it at. If there aren't that many tweets it won't return 500 though.
From the docs:
n The maximum number of tweets to return
There is also a time limit on the Twitter API search.
The Search API is not complete index of all Tweets, but instead an index of recent Tweets. At the moment that index includes between 6-9 days of Tweets.
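If one call is not enough, the search endpoint also accepts a maxID cap, so you can page backwards through results; a rough sketch, assuming an authenticated session:

library(twitteR)

batch1 <- searchTwitter("#EroticBroadway", n = 100)
ids1   <- sapply(batch1, function(s) s$id)
# pick the oldest ID; as.numeric can lose precision on 64-bit IDs,
# but it is close enough to order a single batch
oldest <- ids1[which.min(as.numeric(ids1))]
# maxID is inclusive, so filter out tweets already collected
batch2 <- searchTwitter("#EroticBroadway", n = 100, maxID = oldest)
batch2 <- Filter(function(s) !(s$id %in% ids1), batch2)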
The t gem (a Ruby command-line Twitter client) doesn't have that.
You have to jump through more hoops, which has some (a lot of) disadvantages, but I've used TAGS (Twitter Archiving Google Sheet) to collect hundreds of tweets (even thousands over time) and then read them into R as a CSV.
I'm trying to use searchTwitter() to find a certain topic on Twitter. For example:
searchTwitter("#Fast and Furious 7", n = 10000)
can only give me a few thousand results. I have also done some research on other topics; judging by the dates in the results, it seems the API can only return tweets from about 9 days back. (There are arguments called since and until which are meant to specify a time range, but they don't seem to work.)
So I'm wondering: is there a way to get all tweets on this topic, or at least to control the date range?
Apart from this, can I use XML in R to achieve the same purpose?
Twitter provides search for the last few days only.
The cost of keeping the data indexed is too high, given how few users are interested. Twitter's business model is live information.
If you want historical data, you will have to buy it from third-party providers. I don't remember the name, but a company offering such data was linked from the Twitter web page where they explained this limitation of their search API.
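Within that small window, the since and until arguments do take effect; dates older than the index simply return nothing, which is why they appear broken. A brief sketch, assuming an authenticated twitteR session:

library(twitteR)

# dates inside the rolling ~7-9 day index return results; older dates return none
recent <- searchTwitter("#Fast and Furious 7", n = 1000,
                        since = as.character(Sys.Date() - 7),
                        until = as.character(Sys.Date()))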