How to get latest news posted on twitter by a website - r

I am using R and I need to retrieve the few most recent posts from a Twitter user (#ExpressNewsPK) using twitteR api. I have created an account and have an access token, etc. I have used the following command to extract the tweets:
setup_twitter_oauth(consumerkey,consumersecret,accesstoken,accesssecret)
express_news_tweets <- searchTwitter("#ExpressNewsPK", n = 10, lang = "en" )
However, the posts that are returned aren't the most recent ones from this user. Where have I made a mistake?

I think searchTwitter would search with the search string provided (here #ExpressNewsPK). So instead of giving tweets by #ExpressNewsPK it would give tweets which are directed to #ExpressNewsPK.
To get tweets from #ExpressNewsPK, you have a function named userTimeline which would give tweets from a particular user.
So after you are done with setup_twitter_oauth, you can try
userTimeline("ExpressNewsPK")
read more about it at ?userTimeline

When you use searchTwitter(), you call the Twitter Search API. Search API only returns a sample history of the tweets.
What you really need to do is to call Twitter Streaming API. Using it you'll be able to download tweets in near real time. You can read more about the Streaming API here: https://dev.twitter.com/streaming/overview

Related

Reaching unexpected tweet limit in twitteR

I have paid premium access to Twitter so can access historical data. I am trying to retrieve tweets on a combined search for the terms 'delivery' and 'accident', ie as if I go to a Twitter search and type 'delivery accident' or 'delivery+accident' in the 'latest' tab. My code for doing so is the following:
deliv_list <- searchTwitter("delivery+accident", n=500)
I am getting results which correspond to my search, but far fewer than I had expected. R returns the error message:
Warning message:
In doRppAPICall("search/tweets", n, params = params, retryOnRateLimit = retryOnRateLimit, :
500 tweets were requested but the API can only return 248
I subsequently checked and I am able to return 500 records for a different single search, so it is not that I have reached a download limit. There are multiple tweets every day on this theme so I'm not sure why so few results n=248 are returned. Appreciate any help.
It seems twitteR cannot handle full archive premium accounts. search-tweets-python seems to be the way to go. Tantalizingly, this may be able to work inside R via R's reticulate package. See: https://dev.to/twitterdev/running-search-tweets-python-in-r-45eo

How can I get tweet having only tweet ID without using twitter API?

I have a large number of Tweet IDs that have been collected by other people (https://github.com/echen102/us-pres-elections-2020), and I now want to get these tweets from those IDs. What should I do without the Twitter API?
Do you want the url ? It is : https://twitter.com/user/status/<tweet_id>
If you want the text of the tweet withou using the api , you have to render the page, and then scrape it.
You can do it with one module, requests-html:
from requests_html import HTMLSession
session = HTMLSession()
url = "https://twitter.com/user/status/1414963866304458758"
r = session.get(url)
r.html.render(sleep=2)
tweet_text = r.html.find('.css-1dbjc4n.r-1s2bzr4', first=True)
print(tweet_text.text)
Output:
Here’s a serious national security question: Why does the Biden administration want to protect COMMUNISM from blame for the Cuban Uprising? They attribute it to vaccines. Even if the Big Guy can’t comprehend it, Hunter could draw a picture for him.

twitteR R API getUser with a username of only numbers

I am working in R with the twitteR package, which is meant to get information from Twitter through their API. After getting authentificated, I can download information about any user with the getUser function. However, I am not able to do so with usernames which are only numbers (for example, 1234). With the line getUser("1234") I get the following error message:
Error in twInterfaceObj$doAPICall(paste("users", "show", sep = "/"),
params = params, : Not Found (HTTP 404).
Is there any way to get user information when the username is made completely of numbers? The function tries to search by ID instead of screenname when it finds only numbers.
Thanks in advance!
First of all, twitteR is deprecated in favour of rtweet, so you might want to look into that.
The specific user ID you've provided is a protected account, so unless your account follows it / has access to it, you will not be able to query it anyway.
Using rtweet and some random attempts to find a valid numerical user ID, I succeeded with this:
library(rtweet)
users <- c("989", "andypiper")
usr_df <- lookup_users(users)
usr_df
rtweet also has some useful coercion functions to force the use of screenname or ID (as_screenname and as_userid respectively)

Rfacebook Packages getpage() command only retrieving a few posts from Facebook Pages

I recently tried Rfacebook package by pablobarbera, which works quite well. I am having this slight issue, for which I am sharing the code.
install.packages("Rfacebook") # from CRAN
library(devtools)
install_github("Rfacebook", "pablobarbera", subdir = "Rfacebook")
library(Rfacebook)
# token generated here: https://developers.facebook.com/tools/explorer
token <- "**********"
page <- getPage("DarazOnlineShopping", token, n = 1000)
getPage command works, but it only retrieves 14 records from the Facebook page I used in the command. In the example used by pablobarbera in the original post he retreived all the posts from "Humans of New York", but when I tried the same command, facebook asked me to reduce the number of posts, and I hardly managed to get 20 posts. This is the command used by Pablo bera:
page <- getPage("humansofnewyork", token, n = 5000)
I thought I was using temporary token access that why Facebook is not giving me the required data, but I completed the wholo Facebook Oauth Process, and the same result.
Can somebody look into this, and tell why this is happening.
The getPage() command looks fine to me, I manually counted 14 posts (including photos) on the main page. It could be that Daraz Online Shopping has multiple pages and that the page name you are using only returns results from the main page, when (I assume) you want results from all of them.
getPage() also accepts page IDs. You might want to collect a list of IDs associated with Daraz Online Shopping, loop through and call each of them and combine the outputs to get the results you need.
To find this out these IDs you could write a scraper (or manually search for them all) that views the page source and searches for the unique page ID. Searching for content="fb://page/?id= will highlight the location of the page ID in the source code.

Obtaining topic description in search api

I seem to be having problem pulling out the text content of the following query without making another call:
http://tinyurl.com/mgsewz2 via the mqlread api
{
"id": "/en/alcatraz_island",
"/common/topic/description": [{}],
"/common/topic/topic_equivalent_webpage": [],
"/common/topic/official_website": null
}
I can't retrieve the following
description
equivalent webpage (I'm looking for the en wiki page)
, but I can obtain the official_website url.
It looks like I can get it via the search api via output= but I can't walk through the entire set that I'm looking for without getting search request is too large error.
http://markmail.org/message/hd6pcveta7bhuz25#query:+page:1+mid:u7xegxlhtmhwiqbl+state:results
Thanks!
It you want to download large subsets of Freebase data, your best bet is to use the Freebase RDF Dumps. They contain all the properties that you listed above.

Resources