I am trying to set a few options (globally) before running an R script that calls a bunch of functions in custom packages. These options are to be read from a text file (config) that looks like
param1=value1
I read in this file like so:
configs_df <- read.csv("configfile", sep="=", strip.white=TRUE, header=FALSE, comment.char="#", stringsAsFactors=FALSE, blank.lines.skip=TRUE)
I've attempted several approaches, including combinations of eval and sprintf, but to no avail:
options("eval(configs_df$V1[1])"="eval(configs_df$V2[1])")
do.call(options, list(configs_df$V1[1], configs_df$V2[1]))
params <- c(configs_df$V1[1], configs_df$V2[1])
exp <- "options(%s=%s)"
toeval <- splat(sprintf)(c(exp, params))
eval(toeval)
I would really appreciate a few pointers.
After you read in your data, try:
options(as.list(setNames(configs_df$V2, nm=configs_df$V1)))
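For example, reading the config above and setting everything in one go (a minimal sketch; note that values read this way arrive as character strings, so you may want type.convert to coerce numerics and logicals):
configs_df <- read.csv("configfile", sep="=", strip.white=TRUE, header=FALSE, comment.char="#", stringsAsFactors=FALSE, blank.lines.skip=TRUE)
vals <- type.convert(configs_df$V2, as.is=TRUE)  # "4" becomes 4, "TRUE" becomes TRUE
options(as.list(setNames(vals, nm=configs_df$V1)))
getOption(configs_df$V1[1])  # check the first option took effect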
I have an R function from a package to which I need to pass a file path as an argument, but it's expecting a csv and my file is an xlsx.
I've looked at the code for the function and it is using read.csv to load the file, but unfortunately I can't make any changes to the package.
Is there a good way to read in the xlsx and pass it to the function without writing it to a csv and having the function read it back in?
I came across the text argument for read.csv here:
Is there a way to use read.csv to read from a string value rather than a file in R?
This seems like it might be part way there, but as I said, I am unable to alter the function.
Maybe you could wrap your function in one that checks whether the file is an xlsx and, if so, creates a temporary csv file, feeds it to your function, and deletes it afterwards. Something like:
yourfunction = function(path){
  # stand-in for the package function you cannot modify
  df <- read.csv(path)
  head(df)
}
library(readxl)
modified_function = function(path){
  if(grepl("\\.xlsx$", path)){
    # read the xlsx and write it out as a temporary csv
    tmp <- read_xlsx(path)
    tmp_path <- paste0(gsub("\\.xlsx$", "", path), "_tmp.csv")
    write.csv(tmp, file = tmp_path, row.names = FALSE)  # row.names=FALSE avoids a spurious first column
    output <- yourfunction(tmp_path)
    file.remove(tmp_path)  # clean up the temporary file
  }else{
    output <- yourfunction(path)
  }
  return(output)
}
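Usage is then transparent for both file types (the paths here are hypothetical):
modified_function("data/report.xlsx")  # converted to a temporary csv first
modified_function("data/report.csv")   # passed straight through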
If it is of help, here you can see how to modify only one function of a package: How to modify a function of a library in a module
I am using this answer to load in a folder of Excel Files:
# Get the list of files
#----------------------------#
folder <- "path/to/files"
fileList <- dir(folder, recursive=TRUE) # grep through these, if you are not loading them all
# use platform appropriate separator
files <- paste(folder, fileList, sep=.Platform$file.sep)
So far, so good.
# Load them in
#----------------------------#
# Method 1:
invisible(sapply(files, source, local=TRUE))
#-- OR --#
# Method 2:
sapply(files, function(f) eval(parse(text=f)))
But the source function (Method 1) gives me the error:
Error in source("C:/Users/Username/filename.xlsx") :
C:/Users/filename :1:3: unexpected input
1: PK
^
For method 2 I get the error:
Error in parse(text = f) : <text>:1:3: unexpected '/'
1: C:/
^
EDIT: I tried circumventing the issue by setting the working directory to the directory of the folder, but that did not help.
Any ideas why this happens?
EDIT 2: It works when doing the following:
How can I read multiple (excel) files into R?
setwd("...")
library(readxl)
file.list <- list.files(pattern='\\.xlsx$')
df.list <- lapply(file.list, read_excel)
Just to provide a proper answer outside of the comment section...
If your target is to read many Excel files, you shouldn't use source.
source is dedicated to running external R code.
If you need to read many Excel files, you can use the following code with the support of one of these libraries: readxl, openxlsx, or tidyxl (with unpivotr).
filelist <- dir(folder, recursive = TRUE, full.names = TRUE, pattern = ".xlsx$|.xls$", ignore.case = TRUE)
l_df <- lapply(filelist, readxl::read_excel)
Note that we are using dir to list the full paths (full.names = TRUE) of all the files that end with .xlsx or .xls (pattern = ".xlsx$|.xls$"), in any case, so .XLSX and .XLS too (ignore.case = TRUE), in the folder folder and all its subfolders (recursive = TRUE).
readxl is integrated with tidyverse. It is pretty easy to use. It is most likely what you're looking for.
Personally, I advise using openxlsx if you need to write (rather than read) customized Excel files with many specific features.
tidyxl is the best package I've seen for reading Excel files, but it can be rather complicated to use. However, it is really careful about type preservation.
With the support of unpivotr it lets you handle complicated Excel structures,
for example sheets with multiple header rows and multiple left index columns.
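A rough sketch of that workflow (the file name and the behead directions here are hypothetical and depend on your sheet layout):
library(tidyxl)
library(unpivotr)
cells <- xlsx_cells("report.xlsx")  # one row per cell, with types preserved
tidy <- cells |>
  behead("up-left", group) |>   # outer header row
  behead("up", measure) |>      # inner header row
  behead("left", id)            # left index column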
This code works; however, I wonder if there is a more efficient way. I have a CSV file that has a single column of ticker symbols. I then read this csv into R and apply functions to each ticker using a for loop.
I read in the csv, and then go into the data frame and pull out the character vector that the for loop needs to run properly.
SymbolListDataFrame = read.csv("DJIA.csv", header = FALSE, stringsAsFactors=F)
SymbolList = SymbolListDataFrame[[1]]
for (Symbol in SymbolList){...}
Is there a way to combine the first two lines I have written into one? Maybe read.csv is not the best command for this?
Thank you.
UPDATE
I am using the readLines method suggested by Jake and Bartek. There is a warning, "incomplete final line found on" the csv file, because the file lacks a terminating newline; I ignore it since the data is read correctly.
SymbolList <- readLines("DJIA.csv")
# or, equivalently, as a single read.csv call:
SymbolList <- read.csv("DJIA.csv", header=FALSE, stringsAsFactors=FALSE)[[1]]
The readLines function is the best solution here.
Please note that the read.csv function is not only for reading files with a csv extension. It is simply the read.table function with parameters like header and sep given different defaults. Check the documentation for more info.
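For instance, these two calls produce identical data frames, since read.csv just fills in different defaults:
df1 <- read.csv("DJIA.csv", header=FALSE)
df2 <- read.table("DJIA.csv", header=FALSE, sep=",", quote="\"", dec=".", fill=TRUE, comment.char="")
identical(df1, df2)  # TRUE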
I have a tab-delimited text file that I am trying to load into R with the read.table function. The first few lines of the script look like this
#!/usr/bin/env Rscript
args <- commandArgs(trailingOnly=TRUE)
data <- read.table(args[1], header=TRUE, sep="\t", quote="")
# process the data
This works. I had originally tried to get R to read the data from standard input, but was unsuccessful. My first approach...
#!/usr/bin/env Rscript
data <- read.table(stdin(), header=TRUE, sep="\t", quote="")
# process the data
...didn't seem to work at all. My second approach...
#!/usr/bin/env Rscript
data <- read.table("/dev/stdin", header=TRUE, sep="\t", quote="")
# process the data
...read the data file but (for some reason I don't understand) the first 20 or so lines get mangled, which is a big problem (especially since those lines contain the header information). Is there any way to get read.table to read from standard input? Am I missing something completely obvious?
?stdin says:
stdin() refers to the ‘console’ and not to the C-level ‘stdin’
of the process. The distinction matters in GUI consoles (which
may not have an active ‘stdin’, and if they do it may not be
connected to console input), and also in embedded applications.
If you want access to the C-level file stream ‘stdin’, use
file("stdin").
And:
When R is reading a script from a file, the file is the
‘console’: this is traditional usage to allow in-line data …
That’s the probable reason for the observed behaviour. In principle you can read.table from standard input – but in most (almost all?) cases you’ll want to do this via file('stdin').
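So a minimal working version of the script is:
#!/usr/bin/env Rscript
# file("stdin") is the process-level standard input, unlike stdin()
data <- read.table(file("stdin"), header=TRUE, sep="\t", quote="")
# process the data
It can then be invoked as, say, ./script.R < input.tsv.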
I am trying to automate some data exporting, and I would like to add a header to each file such as "please cite Bob and Jane 2008" ... or even a few lines of specific instructions depending on the context.
I have looked at the write.csv and write.table documentation, but do not see any such feature.
What is the easiest way to achieve this?
Here are two possible approaches; the solution under EDIT, which uses connections, is more flexible and efficient.
Using write.table(..., append = TRUE) and cat
Use append = TRUE within a call to write.table, having written the header beforehand with cat,
wrapped in its own function:
write.table_with_header <- function(x, file, header, ...){
  # write the header line first, then append the table below it
  cat(header, '\n', file = file)
  write.table(x, file, append = TRUE, ...)
}
Note that append is ignored in a write.csv call, so you simply need to call
write.table_with_header(x, file, header, sep = ',')
and that will result in a csv file.
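For example, a small usage sketch with built-in data (the file name is arbitrary):
write.table_with_header(head(mtcars), "mtcars.csv", header="please cite Bob and Jane 2008", sep=",")
# write.table may warn about appending column names; the file is still written as expected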
EDIT
Using connections
(Thanks to @flodel, whose suggestion this is.)
my.write <- function(x, file, header, f = write.csv, ...){
  # create and open the file connection
  datafile <- file(file, open = 'wt')
  # close on exit
  on.exit(close(datafile))
  # if a header is defined, write it to the file (@CarlWitthoft's suggestion)
  if(!missing(header)) writeLines(header, con = datafile)
  # write the file using the supplied function and any additional arguments
  f(x, datafile, ...)
}
Note that this version allows you to use write.csv or write.table or any similar function, and uses a file connection which
(as @flodel points out in the comments)
is opened and closed only once, with later writes appended automatically. Therefore it is more efficient!
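A quick usage sketch with built-in data (the file name is arbitrary):
my.write(head(mtcars), "mtcars.csv", header="please cite Bob and Jane 2008", f=write.csv, row.names=FALSE)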