Try Catch when looping in R - r

I am trying to incorporate a tryCatch function in my R code to prevent the loop from breaking whenever I get an error.
I've looked through other examples but can't get tryCatch to work.
Does anyone know how to add tryCatch to the following loop so that an error does not stop the loop from continuing?
library(httr)        # GET(), http_error()
library(data.table)  # %like%
for (i in seq_len(nrow(pagedata))) {
  u <- pagedata[i, "id"]
  url <- paste0("https://www.google.com/", u)
  r <- GET(url)
  print(url)
  if (!http_error(r)) {
    web_page_read_follows <- read.csv(url)
    colnames(web_page_read_follows) <- "follows"
    # Keep only the rows that mention "Followers"
    web_page_collect_follows <- web_page_read_follows[web_page_read_follows$follows %like% "Followers", ]
    web_page_collect_follows <- as.data.frame(web_page_collect_follows)
    colnames(web_page_collect_follows) <- "follows"
    # Strip everything from "Followers" onwards and everything up to the last "="
    web_page_collect_follows$follows <- gsub("Followers.*", "", web_page_collect_follows$follows)
    web_page_collect_follows$follows <- gsub(".*=", "", web_page_collect_follows$follows)
    # Keep only the last row
    web_page_collect_follows <- tail(web_page_collect_follows, -(nrow(web_page_collect_follows) - 1))
    if (length(web_page_collect_follows$follows) > 0) {
      pagedata[i, "followers"] <- web_page_collect_follows$follows
      print(i)
      Sys.sleep(1)
    }
  }
}
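One way to keep a loop like this running is to wrap the risky step in tryCatch and let the error handler return a placeholder value. The following is only a minimal sketch: fetch_followers() is a hypothetical stand-in for the download-and-parse step (it deliberately fails for even ids so that no network access is needed), and the small pagedata frame is made up for illustration.

```r
# Hypothetical stand-in for the download-and-parse step; fails for even ids.
fetch_followers <- function(id) {
  if (id %% 2 == 0) stop("could not read page for id ", id)
  id * 100
}

# Made-up example data
pagedata <- data.frame(id = 1:4, followers = NA_real_)

for (i in seq_len(nrow(pagedata))) {
  result <- tryCatch(
    fetch_followers(pagedata[i, "id"]),
    error = function(e) {
      # Report the problem and fall back to NA instead of stopping the loop
      message("Skipping row ", i, ": ", conditionMessage(e))
      NA_real_
    }
  )
  pagedata[i, "followers"] <- result
}

print(pagedata)
```

The loop visits every row; rows whose step failed simply keep NA in the followers column.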

Related

How to use tryCatch() to ignore the error in while loop in R

I have code that reads each line of the first column of my data frame, visits the website, and then downloads the photo of each deputy. But it doesn't work properly because some deputies don't have a photo yet.
That's why my code breaks and stops working. I tried to use "next" and if clauses, but it still didn't work. So a friend recommended that I use tryCatch(). I couldn't find enough information online, and the code still doesn't work.
The file is here:
https://gist.github.com/gabrielacaesar/940f3ef14eaf29d18c3780a66053bbee
deputados <- fread("dep-legislatura56-14jan2019.csv")
i <- 1
while(i <= 514) {
this.could.go.wrong <- tryCatch(
attemptsomething(),
error=function(e) next
)
url <- deputados$uri[i]
api_content <- rawToChar(GET(url)$content)
pessoa_info <- jsonlite::fromJSON(api_content)
pessoa_foto <- pessoa_info$dados$ultimoStatus$urlFoto
download.file(pessoa_foto, basename(pessoa_foto), mode = "wb")
Sys.sleep(0.5)
i <- i + 1
}
Here is a solution using purrr:
library(purrr)
download_picture <- function(url){
api_content <- rawToChar(httr::GET(url)$content)
pessoa_info <- jsonlite::fromJSON(api_content)
pessoa_foto <- pessoa_info$dados$ultimoStatus$urlFoto
download.file(pessoa_foto, basename(pessoa_foto), mode = "wb")
}
walk(deputados$uri, possibly(download_picture, NULL))
Simply wrap tryCatch around the lines that can raise errors and have the error handler return NULL or NA:
i <- 1
while(i <= 514) {
  tryCatch({
    url <- deputados$uri[i]
    api_content <- rawToChar(GET(url)$content)
    pessoa_info <- jsonlite::fromJSON(api_content)
    pessoa_foto <- pessoa_info$dados$ultimoStatus$urlFoto
    download.file(pessoa_foto, basename(pessoa_foto), mode = "wb")
    Sys.sleep(0.5)
  }, error = function(e) return(NULL)
  )
  # Increment outside tryCatch so the loop advances even after an error
  i <- i + 1
}
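If you also want to skip the rest of an iteration and remember which items failed, one option is to have the handler return a sentinel value and call next from the loop body itself rather than from inside the handler. This is a sketch under assumptions: download_photo() is a hypothetical stand-in that fails when no photo URL is available, and the example.org URLs are placeholders.

```r
# Hypothetical stand-in for the download step; fails when there is no photo URL.
download_photo <- function(photo_url) {
  if (is.na(photo_url)) stop("no photo available")
  paste("saved", basename(photo_url))
}

# Placeholder URLs; the second deputy has no photo
photo_urls <- c("http://example.org/a.jpg", NA, "http://example.org/c.jpg")
saved <- character(0)
failed <- integer(0)

for (i in seq_along(photo_urls)) {
  # The handler returns NULL as a sentinel for "this item failed"
  outcome <- tryCatch(download_photo(photo_urls[i]), error = function(e) NULL)
  if (is.null(outcome)) {
    failed <- c(failed, i)
    next  # skip the rest of this iteration
  }
  saved <- c(saved, outcome)
}

print(saved)
print(failed)
```

Afterwards failed holds the positions that raised an error, which makes it easy to retry or inspect them later.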

R allow error in lapply

This is related to this question. I wanted to build a simple lapply function that outputs NULL if an error occurs.
My first thought was to do something like
lapply_with_error <- function(X,FUN,...){
lapply(X,tryCatch({FUN},error=function(e) NULL))
}
tmpfun <- function(x){
if (x==9){
stop("There is something strange in the neiborhood")
} else {
paste0("This is number", x)
}
}
tmp <- lapply_with_error(1:10,tmpfun )
But tryCatch does not capture the error it seems. Any ideas?
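You need to provide lapply with a function. In your version, tryCatch({FUN}, ...) is evaluated once, before lapply runs, and simply returns FUN itself, so the error handler is no longer active when FUN is called on each element. Wrap the call in an anonymous function instead (and forward ... so that extra arguments reach FUN):
lapply_with_error <- function(X,FUN,...){
  lapply(X, function(x, ...) tryCatch(FUN(x, ...),
                                      error=function(e) NULL), ...)
}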
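As a quick check, here is a self-contained sketch that applies such a wrapper to the test function from the question. The wrapper is restated so the snippet runs on its own; forwarding ... to lapply is an addition for illustration.

```r
# Wrapper: each call to FUN is protected by its own tryCatch
lapply_with_error <- function(X, FUN, ...) {
  lapply(X, function(x, ...) tryCatch(FUN(x, ...), error = function(e) NULL), ...)
}

# Test function from the question: fails for x == 9
tmpfun <- function(x) {
  if (x == 9) {
    stop("There is something strange in the neighborhood")
  } else {
    paste0("This is number", x)
  }
}

tmp <- lapply_with_error(1:10, tmpfun)

# Which elements failed (are NULL)?
which(vapply(tmp, is.null, logical(1)))
```

The result still has one element per input, with NULL in the position where the function raised an error.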

Trying to call a function in a for loop and getting unused argument error

readStateData <- function() {
infile <- paste("state",i,".txt",sep="")
state <- readLines(infile,n=1)
statedata <- read.table(infile,header=FALSE,sep=",",skip=1,col.names=c("Rank","City","Population"))
statename <- list(state,statedata)
statename
}
# Start loop
for(i in 1:50) {
readStateData()
# Add function to big.list
big.list[[i]] <- readStateData(statename)
}
The assignment for class is to bring in 50 files, all named state#.txt, get the state via readLines, get the data via read.table, and ultimately put it all into big.list that'll have all of the data through a for loop.
The problem I'm having is calling the function inside the for loop. I get the error:
Error in readStateData(statename) : unused argument (statename)
I'm either not calling in the function properly or I've written the function wrong. Both are likely.
Thank you for your help.
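You have several issues here. The error itself arises because readStateData() is defined without parameters but is called with an argument (statename).
First, avoid referring inside a function to a variable that is defined outside it. That is, instead of reading the externally defined i inside the function: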
i <- 1
fct <- function() {
a <- i + 1
return(a)
}
fct()
Pass the variable as an argument to the function:
i <- 1
fct <- function(x) {
a <- x + 1
return(a)
}
fct(i)
Second, your function has no explicit return statement. R returns the value of the last evaluated expression, so the function still returns statename, but an explicit return() makes the intent clearer.
Ergo your code should look like this
readStateData <- function(x) {
  infile <- paste("state",x,".txt",sep="")
  state <- readLines(infile,n=1)
  statedata <- read.table(infile,header=FALSE,sep=",",skip=1,col.names=c("Rank","City","Population"))
  statename <- list(state,statedata)
  return(statename)
}
# Create the result list (skip this line if big.list already exists)
big.list <- list()
# Start loop
for(i in 1:50) {
  j <- readStateData(i)
  # Add the function's result to big.list
  big.list[[i]] <- j
}
If all your files match the pattern state[number].txt, you can simplify your code to:
# Get all files named state<number>.txt
fls <- dir(pattern='^state[0-9]+\\.txt$')
readStateData <- function(x) {
  state <- readLines(x, n=1)
  statedata <- read.table(x, header=FALSE,sep=",",skip=1,col.names=c("Rank","City","Population"))
  statename <- list(state,statedata)
  return(statename)
}
# Create the result list (skip this line if big.list already exists)
big.list <- list()
# Start loop
for(i in seq_along(fls)) {
  j <- readStateData(fls[i])
  # Add the function's result to big.list
  big.list[[i]] <- j
}
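The same idea can also be written with lapply, which builds the list directly. The sketch below is illustrative only: it writes two tiny made-up files (StateA, StateB and their cities are invented placeholders) to a temporary directory so that it can run anywhere, and read_state_file() is a hypothetical name for the reader function.

```r
# Write two small made-up files to a temporary directory
tmp_dir <- tempfile("states")
dir.create(tmp_dir)
writeLines(c("StateA", "1,CityA,100", "2,CityB,50"),
           file.path(tmp_dir, "state1.txt"))
writeLines(c("StateB", "1,CityC,70"),
           file.path(tmp_dir, "state2.txt"))

# Hypothetical reader: first line is the state name, the rest is the table
read_state_file <- function(path) {
  list(
    state = readLines(path, n = 1),
    data  = read.table(path, header = FALSE, sep = ",", skip = 1,
                       col.names = c("Rank", "City", "Population"))
  )
}

# Build the paths explicitly so the files stay in numeric order
files <- file.path(tmp_dir, paste0("state", 1:2, ".txt"))
big.list <- lapply(files, read_state_file)

str(big.list, max.level = 2)
```

Building the file names from the numbers keeps state2 before state10, which a purely alphabetical directory listing would not.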

call a function from a vector of given functions in R

I have the following function:
setTypes <- function(df2, ...) {
fns <- as.list(substitute(list(...)))
for(i in 1:length(df2)) {
if(fns[i] == '') {
next
}
df2[i,] <- fns[i](df2[i,])
}
return(df2)
}
I want to do this:
test<-setTypes(sls,c('','as.Date','','','as.numeric','as.numeric'))
The idea is to change the types of the fields in a data frame without having to do sls$field <- as.numeric(sls$field) for every field.
I had written a function like this that worked:
fn <- function(t) {
return(t("55.55000"))
}
and the output is this:
> fn(as.numeric)
[1] 55.55
However, I can't figure out why neither of these approaches works: taking a variable-length argument as a list and calling it as list[index](input), or passing a vector of functions like c(as.Date, as.numeric, as.character) and calling c[1]('2015-10-10') # as.Date('2015-10-10')
I am receiving the error 'attempt to apply non-function'. I've also tried using call, but to no avail. Help?
The problem is that c[1] is a list (single brackets return a sub-list, not the element); use c[[1]] to extract the function itself.
Example code
v <- c(as.numeric,as.character)
v[[1]]("1")
v[[2]](1)
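To see the difference between the two kinds of brackets directly, here is a small sketch; the names fns, single and double are only illustrative.

```r
# c() applied to functions produces a list of functions
fns <- c(as.numeric, as.character)

single <- fns[1]    # a list of length 1 that contains the function
double <- fns[[1]]  # the function itself

is.list(single)      # TRUE
is.function(single)  # FALSE, so single("1.5") fails with "attempt to apply non-function"
is.function(double)  # TRUE

double("1.5")
```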
EDIT
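Your example should be:
setTypes <- function(df2, ...) {
  fns <- list(...)
  # Only loop over columns that have a corresponding argument
  for(i in seq_len(min(NCOL(df2), length(fns)))) {
    # Non-function placeholders such as '' leave the column unchanged
    if(is.function(fns[[i]])) {
      df2[,i] <- fns[[i]](df2[,i])
    }
  }
  return(df2)
}
df <- data.frame(v1 = c(1,2), v2 = c("a","b"), v3 = c("1","2"))
setTypes(df,as.character,'',as.numeric)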

R - Subsetting subsets of variable names in loops

Q: How do I subset a subset of a changing variable inside a function or loop?
Assume I have the following code for determining regression stats for multiple data sets :
dat1 <- data.frame(col1=1:5,col2=6:10)
dat2 <- data.frame(col1=11:15,col2=16:20)
func <- function(data,col.no){
get(data)[,col.no]
}
for(i in c('dat1','dat2')) {
mod.name <- paste0('fit.',i)
assign(mod.name,lm(get(i)[,1]~get(i)[,2]),envir = .GlobalEnv)
}
mod.pvals <- NULL
p.func <- function(attr.name) {
for(i in c('dat1','dat2')) {
mod.name <- paste0('fit.',i)
p.val <- summary(get(mod.name))[attr.name]
mod.pvals <- c(mod.pvals,p.val)
}
mod.pvals
}
r.vals <- p.func('r.squared')
adj.r.vals <- p.func('adj.r.squared')
coef.vals <- p.func('coefficients')
This works just fine with 'r.squared','adj.r.squared',etc.
But I want to access the p-value of the model in the same function.
Outside of a function I'd choose:
summary(fit)$coefficients[2,4]
But how do I do this inside of the function??
I unsuccessfully tried:
summary(get(mod.name))['coefficients'][2,4]
Error in summary(get(mod.name))["coefficients"][[2, 4]] :
incorrect number of subscripts
So then I thought about just changing my code for p.val in the function above:
p.val <- paste0('summary(',mod.name ,')$',attr.name)
get(p.val)
But when I run the code I get the following error:
p.vals <- p.func('coefficients[2,4]')
Error in get(p.val) :
object 'summary(fit.dat1)$coefficients[2,4]' not found
I guess get() doesn't work like this. Is there a function that I can replace get() with?
Other thoughts on how I could make this work??
One solution is to use eval() and parse():
p.val <- eval(parse(text=paste0("summary(",mod.name,")[",attr.name,"]")))
So the function would be:
mod.pvals <- NULL
p.func <- function(attr.name) {
  for(i in c('dat1','dat2')) {
    mod.name <- paste0('fit.',i)
    p.val <- eval(parse(text=paste0("summary(",mod.name,")[",attr.name,"]")))
    mod.pvals <- c(mod.pvals,p.val)
  }
  mod.pvals
}
Because attr.name is pasted verbatim between [ and ], the caller has to supply the quotes and, for nested access, the extra brackets:
r.vals <- p.func(attr.name = '\'r.squared\'')
p.vals <- p.func(attr.name = '[\'coefficients\']][2,4')
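An alternative that avoids eval(parse()) and get() is to keep the fitted models in a named list and pass a small extractor function. This is only a sketch under assumptions: the built-in mtcars data stands in for dat1 and dat2, and fits and extract_stat() are hypothetical names.

```r
# Keep the models in a named list instead of assigning them by name
fits <- list(
  wt = lm(mpg ~ wt, data = mtcars),
  hp = lm(mpg ~ hp, data = mtcars)
)

# Apply an extractor function to the summary of every model
extract_stat <- function(models, extractor) {
  vapply(models, function(m) extractor(summary(m)), numeric(1))
}

r.vals     <- extract_stat(fits, function(s) s$r.squared)
adj.r.vals <- extract_stat(fits, function(s) s$adj.r.squared)
# p-value of the slope: row 2, column 4 of the coefficient table
p.vals     <- extract_stat(fits, function(s) s$coefficients[2, 4])

print(p.vals)
```

Because the extractor is ordinary R code, nested access such as coefficients[2, 4] needs no string pasting.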

Resources