Stop Rfacebook for loop outputting while still running code - r

The point of the code is to gather posts from a Facebook page and store them in my_page; however, I am unfamiliar with the code, as it is for a Uni project. The problem I have is that it has to be used in a .rpres presentation created using RStudio, so I don't want the output displayed but still need to run the code.
This is the code whose output I don't want displayed:
```{r, echo = FALSE}
#install.packages("Rfacebook")
library(Rfacebook)
token <- "Facebook dev auth token goes here"
page_name <- "BuzzFeed"
my_page <- getPage(page_name, token, n = 2,reactions = TRUE,api = "v2.10")
number_required <- 50
dates <- seq(as.Date("2017/07/14"), Sys.Date(), by = "day")
#
n <- length(dates) - 1
df_daily <- list()
for (i in 1:n){
cat(as.character(dates[i]), " ")
try(df_daily[[i]] <- getPage(page_name, token,
n = number_required,reactions = TRUE,api = "v2.10",
since = dates[i],
until = dates[i+1]))
cat("\n")
}
```

Your problem is simply that Rfacebook::getPage prints to the console when it runs. That's because it calls cat(), which writes straight to standard output, much as print() does. Fortunately the package provides a switch to turn that off: all you need to do is add the verbose = FALSE argument to your call and it will stop printing:
getPage(...)                   # before: prints progress to the console
getPage(..., verbose = FALSE)  # after: progress output suppressed
It's pretty bad practice for a package to call cat or print - they should use message and warning instead - so I have raised an issue with the package maintainer to ask for this to be changed, which you can watch here if you like:
https://github.com/pablobarbera/Rfacebook/issues/145
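For functions that offer no such switch, the distinction between cat() and message() decides which base R tool silences them. The following is only a minimal sketch: chatty_cat and chatty_msg are hypothetical stand-ins, not Rfacebook functions.

```r
# Hypothetical stand-ins for a package function that reports progress.
chatty_cat <- function() {
  cat("25 posts\n")      # written straight to standard output
  1
}
chatty_msg <- function() {
  message("25 posts")    # signalled as a message condition
  1
}

# Output from cat() has to be captured; suppressMessages() would not catch it.
out <- capture.output(x <- chatty_cat())

# Output from message() can simply be suppressed.
y <- suppressMessages(chatty_msg())
```

Here out holds the text that would otherwise have been printed, while x and y hold the return values. In a knitr chunk, the results = "hide" chunk option is another way to keep printed output off the slide.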

Related

Using callr to display an (estimated) progress bar without stopping the script

I would like to run a very simple script concurrently or asynchronously, displaying an estimated progress bar.
This works well enough when using system2() like this:
path <- '../Desktop/.../My_Skript_Dir/'
system2(command = "cmd.exe",
input = paste('"./R-4.2.1/bin/Rscript.exe"',
paste0(path, '/Progress_Bar.R')), wait = FALSE)
If possible I would like to avoid using system2 though and I recently found out that callr might do the trick. It almost works, using the function from the "Progress_Bar" script:
estimated_progress <- function(df = NULL, add_time = FALSE){
require(tcltk)
require(callr)
pred <- round(nrow(df)*0.6) # prediction
callr::r_bg(func = function(pred){ # open background r session
pb1 <- tcltk::tkProgressBar(title='PB', label='PB', min=0, max=pred, initial=0)
for (index in seq(pred)){
tcltk::setTkProgressBar(pb=pb1, value=index)
Sys.sleep(1)
}
}, args = list(pred))
}
df <- data.frame(matrix(nrow = 200, ncol = 3)) # dummy data
estimated_progress(df = df, add_time = FALSE)
When I do this, the progress bar opens in a new window as expected.
It keeps going for the next 1-3 function(s) (for example invisible(pbapply::pblapply(1:200000, function(x) x**3))), but any more than that and estimated_progress() aborts.
What am I missing here? I am sure it's quite obvious, and I have read that callr can work asynchronously, but I can't make it work.

R - `try` in conjunction with capturing ALL console output?

Here's a piece of code I'm working with:
install.packages('BiocManager'); BiocManager::install('UniProt.ws')
requireNamespace('UniProt.ws')
uniprot_object <- UniProt.ws::UniProt.ws(
UniProt.ws::availableUniprotSpecies(
pattern = '^Homo sapiens$')$`taxon ID`)
query_results <- try(
UniProt.ws::select(
x = uniprot_object,
keys = 'BAA08084.1',
keytype = 'EMBL/GENBANK/DDBJ',
columns = c('ENSEMBL','UNIPROTKB')))
This particular key/keytype combination is non-productive and produces the following output:
Getting mapping data for BAA08084.1 ... and ACC
error while trying to retrieve data in chunk 1:
no lines available in input
continuing to try
Error in `colnames<-`(`*tmp*`, value = `*vtmp*`) :
attempt to set 'colnames' on an object with less than two dimensions
Of the two errors reported, only the second is a 'proper' R error object and, given the use of try, it is accordingly captured in the variable query_results.
I am, however, desperate to capture the other error text (no lines available in input) to inform downstream programmatic processes.
After playing with a plethora of capture.output, sink, purrr::quietly, etc. options found by startpaging (googling), I still fail to capture that bit. How can I do that?
As @Csd suggested, you could use tryCatch. The message that you are after is printed by the message() function in R, not stop(), so try() will ignore it. To capture output from message(), use code like this:
query_results <- tryCatch(
UniProt.ws::select(
x = uniprot_object,
keys = 'BAA08084.1',
keytype = 'EMBL/GENBANK/DDBJ',
columns = c('ENSEMBL','UNIPROTKB')),
message = function(e) conditionMessage(e))
This will abort evaluation when it gets any message, and return the message in query_results. If you are doing more than debugging, you probably want the message saved, but evaluation to continue. In that case, use withCallingHandlers instead. For example,
saveMessages <- c()
query_results <- withCallingHandlers(
UniProt.ws::select(
x = uniprot_object,
keys = 'BAA08084.1',
keytype = 'EMBL/GENBANK/DDBJ',
columns = c('ENSEMBL','UNIPROTKB')),
message = function(e)
saveMessages <<- c(saveMessages, conditionMessage(e)))
When I run this version, query_results is unchanged (because the later error aborted execution), but the messages are saved:
saveMessages
[1] "Getting mapping data for BAA08084.1 ... and ACC\n"
[2] "error while trying to retrieve data in chunk 1:\n no lines available in input\ncontinuing to try\n"
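The difference between the two handlers can be reproduced without UniProt.ws. This is a minimal sketch in base R, assuming a hypothetical function noisy() that emits two messages and then fails; it also shows one way to keep the messages off the console with the muffleMessage restart.

```r
# Hypothetical stand-in for a call that emits messages and then errors.
noisy <- function() {
  message("step 1")
  message("step 2")
  stop("boom")
}

# tryCatch unwinds at the first message, so only that message is returned.
first_only <- tryCatch(noisy(), message = function(m) conditionMessage(m))

# withCallingHandlers records each message and lets evaluation continue;
# the outer tryCatch then deals with the error that follows.
saved <- character()
result <- tryCatch(
  withCallingHandlers(
    noisy(),
    message = function(m) {
      saved <<- c(saved, conditionMessage(m))
      invokeRestart("muffleMessage")   # do not print the message
    }
  ),
  error = function(e) conditionMessage(e)
)
```

After this runs, first_only contains just the first message, saved contains both messages, and result contains the error text.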
Based on @user2554330's most excellent answer, I constructed an ugly thing that does exactly what I want:
try to execute the statement
don't fail fatally
leave no ugly messages
allow me access to errors and messages
So here it is in all its despicable glory:
saveMessages <- c()
query_results <- suppressMessages(
withCallingHandlers(
try(
UniProt.ws::select(
x = uniprot_object,
keys = 'BAA08084.1',
keytype = 'EMBL/GENBANK/DDBJ',
columns = c('ENSEMBL','UNIPROTKB')),
silent = TRUE),
message = function(e)
saveMessages <<- c(saveMessages, conditionMessage(e))))

R - Trycatch is saving warning instead of returning function output

I am trying to download records from Twitter using rtweet. One issue is that the Twitter server requires a 15-minute wait after every 18000 records. So, after record number 18000, I receive a data frame with all the records and a nice warning telling me to wait for a bit. search_tweets has an argument, retryonratelimit, for downloading more than 18000 records. However, this isn't working, so I am exploring other options.
I have written a function incorporating tryCatch to address this. However, when the warning at 18000 records pops up, tryCatch saves the warning rather than the data frame that should be returned before the warning, which would not happen if 17999 records were downloaded.
library(rtweet)
library(RDCOMClient)
library(profvis)
TwitScrape = function(SearchTerm){
ReturnDF = tryCatch({
TempList=NULL
Temp = search_tweets(SearchTerm,n=18000)
TempList = list(as.data.frame(Temp), SearchTerm)
return(TempList)
},
warning = function(TempList){
Comb=NULL
MAXID = min(TempList[[1]]$status_id)
message("Delay for 15 minutes to accommodate server download limits")
pause(901)
TempWarn = search_tweets(TempList[[2]],n=18000, max_id=MAXID)
TempWarn = as.data.frame(TempWarn)
Comb = rbind(TempList[[1]], TempWarn)
CombList = list(Comb, TempList[[2]])
return(CombList)
}
)
}
Searches = c("#MUFC","#LFC", "#MCFC")
TestExpandList=NULL
TestExpand=NULL
TestExpand2=NULL
for (i in seq_along(Searches)){
TestExpandList = TwitScrape(SearchTerm = Searches[i])
TestExpand = TestExpandList[[1]]
TestExpand$Cat = Searches[i]
TestExpand$DownloadDate = Sys.Date()
TestExpand2 = rbind(TestExpand2, TestExpand)
}
I hope this makes sense. If I can offer any more information please let me know. In summary, why is tryCatch saving my warning rather than the data frame I want?
I am not 100% sure what you would like to achieve, but it seems you are using tryCatch based on a misunderstanding.
The argument of the warning handler, warning = function(TempList), is the warning itself. Naming it TempList does not make it your TempList variable; the handler still just receives the warning condition.
Your function TwitScrape is returning ReturnDF by convention, as you are not properly returning anything; I guess that is still what you want, and that is OK.
I would try to restructure your solution without tryCatch.
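To illustrate the point about the handler argument, here is a minimal base R sketch. It assumes a hypothetical function fetch() that warns and then returns a data frame, standing in for search_tweets; the warning text is copied from the question. One possible restructuring uses withCallingHandlers, which notes the warning and still lets the call return its value.

```r
# Hypothetical stand-in for a call that warns but still returns data.
fetch <- function() {
  warning("Rate limit exceeded - 88")
  data.frame(id = 1:3)
}

# With tryCatch the warning interrupts fetch(); the handler receives the
# warning condition, so the data frame is never produced.
caught <- tryCatch(fetch(), warning = function(w) w)

# With withCallingHandlers the warning is recorded and muffled, and
# fetch() carries on to return its data frame.
hit_limit <- FALSE
df <- withCallingHandlers(
  fetch(),
  warning = function(w) {
    if (grepl("Rate limit", conditionMessage(w))) hit_limit <<- TRUE
    invokeRestart("muffleWarning")
  }
)
```

Afterwards caught is a warning object, df is the data frame, and hit_limit records whether a pause and a second request are needed.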
Thanks for your comments. RolandASc, you were right. I went back to the drawing board. See the working TwitScrape function below:
TwitScrape = function(SearchTerm){
DF=NULL
DF = search_tweets(SearchTerm,n=18001)
Warn = warnings()
if (names(Warn[1]) == "Rate limit exceeded - 88"){
message("paused")
pause(910)
DF2 = search_tweets(SearchTerm,n=18000, max_id = min(DF$status_id))
DF3 = rbind(DF, DF2)
return(DF3)
}
else {
return(DF)
}}

Unknown error on Facebook API through R.

I'm trying to download all the posts from a Facebook page through Rfacebook, but when the page has a high number of posts (over 400 or so), the script stops, returning the error
"Error in callAPI(url = url, token = token) : An unknown error has occurred." at the line where I call getPage.
library(Rfacebook)
library(stringr)
load("fb_oauth")
token=fb_oauth
page<-getPage("bicocca", token, n = 100000, since = NULL, until = NULL, feed = TRUE)
noSpaceMsg<-str_replace_all(page$message, "[\r\n]" , "")
output<-as.data.frame(cbind(page$from_name,page$id, noSpaceMsg, page$created_time, page$type, page$link, page$likes_count, page$comments_count, page$shares_count))
colnames(output)<-c("username","msgid", "message", "created_time", "type", "link", "likes", "comments", "shares")
write.csv(output, "bicocca.csv", row.names=FALSE)
Where is the problem? How can I fix it?
It seems to be a problem with the API, not with the R package. When I try the query in the Graph API Explorer, I get an error too. No idea why.
One way around this is to query month by month, wrapping the getPage function in a try command:
page <- 'bicocca'
dates <- seq(as.Date("2010/10/01"), as.Date("2015/04/20"), by="month")
n <- length(dates)-1
df <- list()
for (i in 1:n){
cat(as.character(dates[i]), " ")
try(df[[i]] <- getPage(page, token, since=dates[i], until=dates[i+1]))
cat("\n")
}
df <- do.call(rbind, df)
This will not give you all the posts, but probably most of them.
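To see which months were lost, the same chunked pattern can record failures explicitly. This is only a sketch under stated assumptions: fetch_month is a hypothetical stand-in for getPage that fails for March, so it runs without a token or network access.

```r
# Hypothetical stand-in for getPage(): errors for one month, returns a row otherwise.
fetch_month <- function(since, until) {
  if (format(since, "%m") == "03") stop("An unknown error has occurred.")
  data.frame(since = since, until = until)
}

dates <- seq(as.Date("2015-01-01"), as.Date("2015-06-01"), by = "month")
starts <- dates[-length(dates)]

# lapply keeps a NULL placeholder for each month that failed.
chunks <- lapply(seq_along(starts), function(i) {
  tryCatch(fetch_month(dates[i], dates[i + 1]), error = function(e) NULL)
})

# Months whose request failed, for a later retry with smaller intervals.
failed <- starts[vapply(chunks, is.null, logical(1))]

# Combine only the successful chunks.
posts <- do.call(rbind, Filter(Negate(is.null), chunks))
```

The failed vector makes the gaps visible, so those intervals can be retried, for example week by week.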

How to periodically send text to a "log file" while printing normal output to console?

I'm creating R code for a Monte Carlo simulation of a professional sport. Because the game dynamics are very complicated and to make the debugging process simpler, I'd like to have R send a line of text for every action that happens in the game to a "log file." The log file would be a comprehensive, play by play description of what's happening in the simulation, and would look something like this…
"GAME BEGINS"
POSSESSION ASSIGNED TO X TEAM
PLAYER Y GETS BALL
PLAYER Y SCORES
FOUL BY PLAYER Z OCCURS
SUBSTITUTION OCCURS (PLAYER W <-> PLAYER Q)
…
"GAME ENDS"
I can't just use the sink() function because, while the simulation is running, I set up a progress bar (with the setTxtProgressBar function) and real-time scores to be printed to the console. If I used sink(), I couldn't see any of the progress indicators or scores on the R console. Does this make sense? In other words, I need to periodically send text to a log file in a cumulative fashion. Here is some example code to give you something to work with…
Thanks
for (i in 1:100)
{**SOMEHOW NEED TO PRINT LINE "START LOOP" TO LOG FILE**;
a <- rnorm(n = 100, mean = i, sd = 5);
print(mean(a)); #PRINT THIS MEAN TO THE CONSOLE
**SOMEHOW PRINT "LOOP 'i' COMPLETE" TO LOG FILE**}
See ?cat. You can open a file connection to your log file and specify that in your cat call. When you don't specify a file name or connection it will print to the console.
As you say, don't use sink() as it will make the log file the default connection. Rather, open a named connection with file().
> log_con <- file("test.log")
> cat("write to log", file = log_con) # creates file and writes to it
> cat("write to console") # prints to console
write to console
The above results in a log file with the line "write to log" and "write to console" printed on the console.
If you need to append to your log file, set append = TRUE and use the file name instead of the file() connection.
> cat("add to log", file = "test.log", append = TRUE)
To open the log file in "append" mode:
log_con <- file("test.log",open="a")
Figured it out, thanks to shujaa and BigFinger. To summarize, here is how you would do it with my example code:
log_con <- file("/filepath/log.txt", open="a")
for (i in 1:100)
{
cat("loop begins", file = log_con, sep="\n")
a <- rnorm(n = 100, mean = i, sd = 5)
print(mean(a))
cat("single loop completed", file = log_con, sep="\n")
}
close(log_con)
The library log4r seems to be more complete than a homemade one: https://github.com/johnmyleswhite/log4r
write("this message", file=stderr())
Same example, including values written to the log.txt file:
log_con <- file("log.txt", open="a")
for (i in 1:10){
cat("loop begins; i =", i, '\n', file = log_con)
a <- rnorm(n = 100, mean = i, sd = 5)
print(mean(a))
cat("single loop completed; mean(a) = ", mean(a), '\n', file =
log_con)
}
close(log_con)
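The answers above can be folded into a small helper. This is a minimal sketch, assuming a hypothetical helper name log_line and a temporary file as the log location; it appends by file name, so no connection has to be kept open or closed.

```r
# Hypothetical log location; replace with a real path as needed.
log_path <- tempfile(fileext = ".log")

# Hypothetical helper: appends one timestamped line to the log file.
log_line <- function(...) {
  cat(paste(format(Sys.time(), "%H:%M:%S"), ...), "\n",
      sep = "", file = log_path, append = TRUE)
}

for (i in 1:3) {
  log_line("loop", i, "begins")
  a <- rnorm(n = 100, mean = i, sd = 5)
  print(mean(a))                       # still goes to the console
  log_line("loop", i, "complete")
}

log_lines <- readLines(log_path)       # inspect what was logged
```

Opening the file on every call is slower than holding a connection, which may matter in a long simulation; the connection-based versions above avoid that cost.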
