I have a function readTochens which is to read csv files. I want to loop through a few more files. the following is the code and it doesn't work. Error message is that No Such file or direction. Any suggestions?
dir <- c("allfac12","allfac13")
cpas <- c("cpas12","cpas13")
mainloop <- function(){
for(i in 1:length(dir)){
readTochens(as.name(dir[i]),as.name(cpas[i]))
}
}
mainloop()
Related
I'm new to functions and I would like to load my data with a function.
The function appears to be correct but the file does not save as a dataframe to the environment, while this does happen when it's not within the function.
This is my script:
read_testdata <- function(file) {
Dataset_test <- read_rds(here("foldername", file))
}
read_testdata("filename")
Can someone spot my error?
After some thinking I spotted my problem, the correct code should be this:
read_testdata <- function(file) {
read_rds(here("foldername", file))
}
Dataset_test <- read_testdata("filename.rds")
I am trying to read 23 excel files, store each in a list, and then rbind them to one csv. Some of these file are csv and some of them are xlsx. However, I got the following message:
Error: Can't establish that the input is either xls or xlsx.
So I want to identify which ones are giving error and then append it manually.
My function is the following:
make_df<-function(filename){
library(readxl)
library(foreign)
if (str_sub(filename,-3,-1) == "csv"){
df<-read.csv(filename,fileEncoding="latin1")
}
else{
df<-read_excel(filename)
}
return(df)
}
filenames_vector<-list.files(# directory)
datalist = list()
for (i in 1:23){
datalist[[i]] <- make_df(filenames_vector[i])
}
mega_data = do.call(rbind,datalist)
How can I add something in make_df to print out the names of files that are causing the error message? Also, is there another work around, when the the error message is on not being able to distinguish xlsx from xls?
This can be done with a tryCatch block. Without example data it's a little hard to recreate. I'm not sure what you mean in your second question.
Try the code below to catch errors and print out the filename if there's an error, otherwise return the df object.
make_df<-function(filename){
library(readxl)
library(foreign)
df = tryCatch(
{ # try block
if (str_sub(filename,-3,-1) == "csv"){
df<-read.csv(filename,fileEncoding="latin1")
}
else{
df<-read_excel(filename)
}
},
error=function(cond){return(filename)} # grab the filename if there was an error
)
if (class(df) == 'character') {
print(df)
} else{return(df)}
}
I am trying to run something on a very large dataset. Basically, I want to loop through all files in a folder and run the function fromJSON on it. However, I want it to skip over files that produce an error. I have built a function using tryCatch however, that only works when i use the function lappy and not parLapply.
Here is my code for my exception handling function:
readJson <- function (file) {
require(jsonlite)
dat <- tryCatch(
{
fromJSON(file, flatten=TRUE)
},
error = function(cond) {
message(cond)
return(NA)
},
warning = function(cond) {
message(cond)
return(NULL)
}
)
return(dat)
}
and then I call parLapply on a character vector files which contains the full paths to the JSON files:
dat<- parLapply(cl,files,readJson)
that produces an error when it reaches a file that doesn't end properly and does not create the list 'dat' by skipping over the problematic file. Which is what the readJson function was supposed to mitigate.
When I use regular lapply, however it works perfectly fine. It generates the errors, however, it still creates the list by skipping over the erroneous file.
any ideas on how I could use exception handling with parLappy parallel such that it will skip over the problematic files and generate the list?
In your error handler function cond is an error condition. message(cond) signals this condition, which is caught on the workers and transmitted as an error to the master. Either remove the message calls or replace them with something like
message(conditionMessage(cond))
You won't see anything on the master though, so removing is probably best.
What you could do is something like this (with another example, reproducible):
test1 <- function(i) {
dat <- NA
try({
if (runif(1) < 0.8) {
dat <- rnorm(i)
} else {
stop("Error!")
}
})
return(dat)
}
cl <- parallel::makeCluster(3)
dat <- parallel::parLapply(cl, 1:100, test1)
See this related question for other solutions. I think using foreach with .errorhandling = "pass" would be another good solution.
I need to create a function in R that reads all the files in a folder (let's assume that all files are tables in tab delimited format) and create objects with same names in global environment. I did something similar to this (see code below); I was able to write a function that reads all the files in the folder, makes some changes in the first column of each file and writes it back in to the folder. But the I couldn't find how to assign the read files in to an object that will stay in the global environment.
changeCol1 <- function () {
filesInfolder <- list.files()
for (i in 1:length(filesInfolder)){
wrkngFile <- read.table(filesInfolder[i])
wrkngFile[,1] <- gsub(0,1,wrkngFile[,1])
write.table(wrkngFile, file = filesInfolder[i], quote = F, sep = "\t")
}
}
You are much better off assigning them all to elements of a named list (and it's pretty easy to do, too):
changeCol1 <- function () {
filesInfolder <- list.files()
lapply(filesInfolder, function(fname) {
wrkngFile <- read.table(fname)
wrkngFile[,1] <- gsub(0, 1, wrkngFile[,1])
write.table(wrkngFile, file=fname, quote=FALSE, sep="\t")
wrkngFile
}) -> data
names(data) <- filesInfolder
data
}
a_list_full_of_data <- changeCol1()
Also, F will come back to haunt you some day (it's not protected where FALSE and TRUE are).
add this to your loop after making the changes:
assign(filesInfolder[i], wrkngFile, envir=globalenv())
If you want to put them into a list, one way would be, outside your loop, declare a list:
mylist = list()
Then, within your loop, do like so:
mylist[[filesInfolder[i] = wrkngFile]]
And then you can access each object by looking at:
mylist[[filename]]
from the global env.
The following code works:
Name<-"s1521r0000_rd2.txt"
OneFile<-read.table(file=Name, sep="", skip=35, fill=TRUE )
But, I am trying to write a function that will load one .txt file so that I can load whatever .txt file I want. I wrote the following function which is not working:
ReadOneFile<-function(Name="s1521r0000_rd2.txt"){
OneFile<-read.table(file=Name, sep="", skip=35, fill=TRUE )
}
It would be great if you could help me.
You need to return() the file from the function into an object. For instance:
func.readonefile <- function(Name) {
thefile <- read.table(file=Name,sep="",skip=35,fill=TRUE)
return(thefile)
}
a_file <- func.readonefile(Name="s1521r0000_rd2.txt")