Try statement issue R - r

Having an issue with this simple try() statement. All I would like it to do is if the number is not there or if an error comes up move to the next. New to R and I have some info in certain folders but missing some numbers between the range.
library(readr)
season <- c(2014:2014)
gamenumbers <- c(20300:21271)
#############################################
# TEAM NULL DF's
season_teamstatsadj5v5 <- NULL
print('NUll DFs Created')
##############################################
for(game in gamenumbers){
try(
print('Start Team')
print(as.character(game))
###################################################################################################################
# team_stats_adj_5v5_df Bind
teamstatsadj5v5<-paste0('//LVS_DB/Users/Mike/Desktop/NHL_PBP/', season,'/', game, '/', game, '_teamstatsadj5v5.csv')
teamstatsadj5v5_df <- read_delim(teamstatsadj5v5, delim = ',')
season_teamstatsadj5v5 <- rbind(season_teamstatsadj5v5, teamstatsadj5v5_df)
)
}

Please see the corrected code you shared. error argument, which will handle the exception thrown should be indicated in tryCatch call. Please see as below:
library(readr)
season <- c(2014:2014)
gamenumbers <- c(20300:21271)
#############################################
# TEAM NULL DF's
season_teamstatsadj5v5 <- NULL
print('NUll DFs Created')
##############################################
for(game in gamenumbers){
tryCatch({
print('Start Team')
print(as.character(game))
###################################################################################################################
# team_stats_adj_5v5_df Bind
teamstatsadj5v5<-paste0('//LVS_DB/Users/Mike/Desktop/NHL_PBP/', season,'/', game, '/', game, '_teamstatsadj5v5.csv')
teamstatsadj5v5_df <- read_delim(teamstatsadj5v5, delim = ',')
season_teamstatsadj5v5 <- rbind(season_teamstatsadj5v5, teamstatsadj5v5_df)
}, error = function(e) {message(paste0(e, "\n"))})
}

Related

Choose command order in a function based on an error [R]

I have three files in a folder with the following names:
./multiqc_data$ ls
file1.json
file2.json
file3.json
When I open the files with the TidyMultiqc package existing NA values in the files might lead to the following error:
files <- dir(path,pattern = "*.json") #locate files
files %>%
map(~ load_multiqc(file.path(path, .))) #parse them
## the error
Error in parse_con(txt, bigint_as_char) :
lexical error: invalid char in json text.
"mapped_failed_pct": NaN, "paired in
(right here) ------^
I want to create a function to handle this error.
I want every time this error pops up to be able to apply this sed function in all files of the folder.
system(paste("gsed -i 's/NaN/null/g'",paste0(path,"*.json")))
Any ideas how can I achieve this
You could use this wrapper :
safe_load_multiqc <- function(path, file) {
tryCatch(load_multiqc(file.path(path, file)), error = function(e) {
system(paste("gsed -i 's/NaN/null/g'",paste0(path,"*.json")))
# retry
load_multiqc(path, file)
})
}
A good way to handle errors in work pipelines like that is using restarts and withCallingHandlers and withRestarts.
You establish the condition handlers and the recovery protocols (restarts) then you can choose what protocols to use and in which order. Calling handlers allows a much finer control on error conditions than common try-catch.
In the example, I wrote two handlers: removeNaNs (works at folder level) and skipFile (works at file level), if the first fails, the second is executed (simply skipping the file). Of course is an example
I think in your case you can simply run sed in every case, nevertheless, I hope this answer meet your looking for a canonical way
Inspiration and Extra lecture: Beyond Exception Handling: Conditions and Restarts
path <- "../your_path"
# function that does the error_prone task
do_task <- function(path){
files <- dir(path,pattern = "*.json") #locate files
files %>%
map(~ withRestart( # set an alternative restart
load_multiqc(file.path(path, .)), # parsing
skipFile = function() { # if fails, skip only this file
message(paste("skipping ", file.path(path, .)))
return(NULL)
}))
}
# error handler that invokes "removeNaN"
removeNaNHandler <- function(e) tryInvokeRestart("removeNaN")
# error handler that invokes "skipFile"
skipFileHandler <- function(e) tryInvokeRestart("skipFile")
# run the task with handlers in case of error
withCallingHandlers(
condition = removeNaNHandler, # call handler (on generic error)
# condition = skipFileHandler, # if previous fails skips file
{
# run with recovery protocols (can define more than one)
withRestarts({
do_task(path)},
removeNaN = function() # protocol "removeNaN"
{
system(paste("gsed -i 's/NaN/null/g'",paste0(path,"*.json")))
do_task(path) # try again
}
)
}
)
Based on this open github issue, a potential solution provided by Peter Diakumis is to use RJSONIO::fromJSON() in place of jsonlite::read_json(). You could adapt this solution to your use-case by e.g. creating your own load_multiqc() function:
library(RJSONIO)
load_multiqc_bugfix <- function(paths,
plots = NULL,
find_metadata = function(...) {
list()
},
plot_parsers = list(),
sections = "general") {
assertthat::assert_that(all(sections %in% c(
"general", "plot", "raw"
)), msg = "Only 'general', 'plot' and 'raw' (and combinations of those) are valid items for the sections parameter")
# Vectorised over paths
paths %>%
purrr::map_dfr(function(path) {
parsed <- RJSONIO::fromJSON(path)
# The main data is plots/general/raw
main_data <- sections %>%
purrr::map(~ switch(.,
general = parse_general(parsed),
raw = parse_raw(parsed),
plot = parse_plots(parsed, plots = plots, plot_parsers = plot_parsers)
)) %>%
purrr::reduce(~ purrr::list_merge(.x, !!!.y), .init = list()) %>%
purrr::imap(~ purrr::list_merge(.x, metadata.sample_id = .y))
# Metadata is defined by a user function
metadata <- parse_metadata(parsed = parsed, samples = names(main_data), find_metadata = find_metadata)
purrr::list_merge(metadata, !!!main_data) %>%
dplyr::bind_rows()
}) %>%
# Only arrange the columns if we have at least 1 column
`if`(
# Move the columns into the order: metadata, general, plot, raw
ncol(.) > 0,
(.) %>%
dplyr::relocate(dplyr::starts_with("raw")) %>%
dplyr::relocate(dplyr::starts_with("plot")) %>%
dplyr::relocate(dplyr::starts_with("general")) %>%
dplyr::relocate(dplyr::starts_with("metadata")) %>%
# Always put the sample ID at the start
dplyr::relocate(metadata.sample_id),
.
)
}

showing error in _ga and _gid of header cookie

I am new to R, I want to import 5 minute data of stock into excel sheetbut I find a unexpected error in my code, error show in 16, 17 and 75,76 row which
actuly is Header cookies like:-
_ga and _gid
it showing- Error: unexpected input in "_"
dont now why its showing that
Below is the source code:
library(httr)
library(curl)
library(mime)
library(openssl)
library(R6)
library(jsonlite)
###### Input to the code ######
###### Right click on the zerodha chart page and go to inspect element, go to network, find the chart link and extract below values
symbol_list<-read.csv("D://Trading/Stocks IDs.csv")
path_to_store_data<-("D:/Trading/R Experiement/")
_ga="GA1.2.416497127.1623159701"
_gid="GA1.2.4254895.1626415653"
enctoken="ib1rlX0e8dvp3qsnGvoL4aH3zsBgJA4kxtDPkUd+AMe7p6M6X26mn5w7EapJ+dMbZDtxZif59a2IhLLOwy5xEJAEnZ8iFYohilpfckj1bp8Mf9kX9RoHSA=="
kf_session="g665FGdUPvLaikeQQTomiejLqJYcUPg6"
public_token="bOrpvYIZJ4rhR8ytnAPccc9VFsTZjvSP"
user_id="ZE6166"
from="2021-07-01"
to="2021-07-16"
dir.create(paste0(path_to_store_data,from,"to",to))
path_to_store_data=paste0(path_to_store_data,from,"to",to,"/")
st <- as.Date(from)
en <- as.Date(to)
###### Specify Interval
###### minute,2minute,3minute,4minute,10minute,15minute,60minute,day
interval="5minute"
data_pull<-function(symbol_list){
i=1
for (i in i:nrow(symbol_list)){
ID=symbol_list[i,2]
symbol_name=symbol_list[i,1]
print(ID)
print(symbol_name)
theDate <-st
datalist = list()
big_data<-data.frame()
print(theDate)
print(en)
while (theDate<=en)
{
NextDate<- as.Date(theDate+30)
if (NextDate > as.Date(Sys.Date())){
NextDate<-en
}
dt_range=paste0(theDate,"&to=",NextDate)
print(dt_range)
url<- paste0("https://kite.zerodha.com/oms/instruments/historical/",ID,"/",interval)
httr::GET(
url = url,
add_headers(authorization=enctoken),
query = list(
user_id = user_id,
oi = "1",
from = theDate,
to = NextDate,
_ga= _ga,
_gid= _gid,
kf_session= kf_session,
public_token=public_token,
user_id= user_id,
enctoken= enctoken
)
) -> res
dat <- httr::content(res)
jsonRespText<-content(res,as="text")
#print(jsonRespText)
document<-fromJSON(txt=jsonRespText)
x<-document[["data"]]
y<-x[["candles"]]
if (length(y) <5){
theDate<-as.Date(theDate)+30
next
print("hi")
print(theDate)
}
dt<-as.data.frame(document)
dt<-dt[-1]
colnames(dt)[1]<-"TIME"
colnames(dt)[2]<-"Open"
colnames(dt)[3]<-"High"
colnames(dt)[4]<-"Low"
colnames(dt)[5]<-"CLOSE"
colnames(dt)[6]<-"VOLUME"
colnames(dt)[7]<-"SYMBOL"
dt$SYMBOL<-symbol_name
dt$TIME<-gsub("\\+0530","",dt$TIME)
dt$TIME<-gsub("T"," ",dt$TIME)
dt$Date <- as.Date(dt$TIME) #already got this one from the answers above
dt$TIME1 <- format(as.POSIXct(dt$TIME) ,format = "%H:%M:%S")
datalist[[i]] <- dt
#print("4")
theDate<-as.Date(theDate)+30
big_data = rbind(big_data,dt)
print(theDate)
}
file= paste0(path_to_store_data,symbol_name,".csv",sep="")
print(file)
write.csv(big_data,file,row.names = F)
print(theDate)
}
}
undebug(data_pull)
data_pull(symbol_list)
here is a error image

Why do I get this error using biomod2:response.plot2, and is it important? Error in ncol(dat_) : could not find function "ncol"

When I run the example for the response.plot2 function (biomod2 package) I get the above error. The code produces some plots but does not save an object
Here's the example (including the code that I ran): https://www.rdocumentation.org/packages/biomod2/versions/3.3-7.1/topics/response.plot2
)
[edit:]
The source code for the function response.plot2 is here:
https://r-forge.r-project.org/scm/viewvc.php/checkout/pkg/biomod2/R/response.plot.R?revision=728&root=biomod
It includes these lines:
.as.ggdat.1D <-
function (rp.dat)
{
# requireNamespace('dplyr')
out_ <- bind_rows(lapply(rp.dat, function(dat_) {
dat_$id <- rownames(dat_)
id.col.id <- which(colnames(dat_) == "id")
expl.dat_ <- dat_ %>% dplyr::select(1, id.col.id) %>%
tidyr::gather("expl.name", "expl.val", 1)
pred.dat_ <- dat_ %>% dplyr::select(-1, id.col.id) %>%
tidyr::gather("pred.name", "pred.val", (1:(ncol(dat_)-2)))
out.dat_ <- dplyr::full_join(expl.dat_, pred.dat_)
out.dat_$expl.name <- as.character(out.dat_$expl.name)
out.dat_$pred.name <- as.character(out.dat_$pred.name)
return(out.dat_)
}))
out_$expl.name <- factor(out_$expl.name, levels = unique(out_$expl.name))
return(out_)
}
I tried changing ncol(dat_) to base::ncol(dat_) and then running the whole lot to redefine the function response.plot2 for my R session, but I got a different error message:
Error in base::ncol : could not find function "::"

Error in file(con, "rb") : cannot open the connection External Hard Drive R

I have a code block of the following:
# Obtain records from all patients
patientDir <- sort(list.dirs(path = "sample_images", full.names = TRUE, recursive = FALSE))
dataframes <- list()
i = 1
while(i<19){
# Strip the patient out
patient <- coreHist(patientDir[i])
print("1")
setwd("/Volumes/HUGE storage drive/")
exists<- file.exists(patientDir[i])
print(exists)
# Extract the relevant information from the patient
dicom <- readDICOM(patientDir[i])
dicomdf <- dicomTable(dicom$hdr)
patient_id <- dicomdf$`0010-0020-PatientID`[1]
print("2")
# Normalize their VX's
sum<- sum(patient$histData$finalFreq)
print("3")
# Create the new VX's
patient$histData$finalFreq_scaled <- (patient$histData$finalFreq/sum)
print("4")
# Add their ID
patient$histData$patientid <- patient_id
print("5")
# Keep only the important columns
patient$histData <- patient$histData[c("patientid", "Var1", "finalFreq_scaled")]
print("6")
# Add these dataframes to a list for better recall afterwards
dataframes[[i]] <- patient$histData
print("7")
# Additional code to transpose and merge dataframes
if(i == 1){
wide_df <- patient$histData
}else{
wide_df <- rbind(wide_df,patient$histData )
}
print("8")
print(paste(c("Patient", i), sep ="", collapse = "-"))
i = i+1
}
However, after a (seemingly random) number of iterations, the code fails right after the line "print("1")" with the following error:
Error in file(con, "rb") : cannot open the connection
The working directory is set to an external hard drive as the "sample_images" folder is 62GB large. I thought perhaps there was a timeout connection with R studio and my external hard drive so I tried to "remain active" on my computer, I've also tried resetting the working directory after each iteration to make sure it can find the file.
When it fails on a certain patient, I check manually to see if that file does indeed exist, and it does. Any thoughts?
I'm actually not sure why the error was happening, but to fix it I simply added a "try" statement:
attempt <- 1
while(is.null(dicom) && attempt <= 3){
attempt <- attempt + 1
try(
dicom <- readDICOM(patientDir[i])
)
}
This did indeed work.

How to trace where the Error occurs when executing a `mapply()`?

I have a data.table of 600,000 rows and execute the following command on it:
ranges <- mapply(function(mi, ma) {seq(from=mi, to=ma, by="days")}, mi=Moves$Start, ma=Moves$End)
I get the following error message after a while:
Error in seq.int(0, to0 - from, by) : wrong sign in 'by' argument
I have tested my code with a smaller dataset and that seems to be working fine. This leads me to think that the error message is the result of the values in the dataset. Can anybody recommend an efficient way to trace the problem row(s) in the data.table? Needless to say, manually checking 600k rows is a bit too much.
Your suggestions for finding the problem rows in the data.table are appreciated!
The obvious solution is to turn the anonymous function into a first class, fully named function, and then you can debug the function. Or turn on the recover option and then you can step into the evaluation frames for the current stack and see the state of the variables at the point the error was raised.
myFun <- function(mi, ma) {
seq(from=mi, to=ma, by="days")
}
gets you a named function, which you can debug via
debug(myFun)
or
debugonce(myFun)
To turn on error recovery do
op <- options(error = recover)
(you can rest that then with: options(op) or options(error = stop)
In this case I suspect that mi is greater than ma:
> myFun(Sys.Date(), Sys.Date()-1)
Error in seq.int(0, to0 - from, by) : wrong sign in 'by' argument
so you could alter myFun to see if that is the case:
myFun <- function(mi, ma) {
if(mi > ma)
stop("`mi` is > than `ma`")
seq(from=mi, to=ma, by="days")
}
That way you get a more informative error message.
If that fails I'd use options(error = recover) and then drop into the evaluation call corresponding to the function and see what the values of mi and ma are.
Overview
seq.Date()'s error message is trying to tell you that a date in Moves$End (i.e. June 23, 2017) occurs before Moves$Start (i.e. April 17, 2017). Because seq.Date() assumes all dates in from occur before the dates in to, the error stops the function from proceeding.
To identify where this occurs, use which() to identify which dates in Moves$End are less than Moves$Start. From there, update those dates so that they occur after Moves$Start.
# load necessary data
Moves <- data.frame( Start = as.Date( x = c("2017-04-17", "2018-03-01", "2019-04-01") )
, End = as.Date( x = c("2017-06-23", "2018-02-14", "2018-04-24") )
, stringsAsFactors = FALSE )
# try to create a sequence of dates
date.ranges <-
mapply( FUN = function( mi, ma )
seq.Date( from = mi
, to = ma
, by = "day" )
, Moves$Start
, Moves$End
, SIMPLIFY = FALSE )
# identify the instance
# where the End date occurs
# before the Start date
wrong.end.date <-
which( Moves$End < Moves$Start )
# view results
wrong.end.date
# [1] 2 3
# correct those End Dates
# so that they occur
# after the Start date
Moves$End[ wrong.end.date ] <-
as.Date( x = c("2019-02-14", "2019-04-24") )
# rerun the mapply() function
date.ranges <-
mapply( FUN = function( mi, ma )
seq.Date( from = mi
, to = ma
, by = "day" )
, Moves$Start
, Moves$End
, SIMPLIFY = FALSE )
# end of script #

Resources