filepath from concatenated string in R - r

I'm getting a "no such file or directory" error from the file() function in R when I pass a concatenated string as the path argument.
folder <- "Trades"
account <- "333000"
symbol <- "EURUSD"
date <- "2016.09.09"
filepath <- sprintf("%s/%s %s %s alpha count.bin",folder, account, symbol, date)
count <- file('filepath', 'rb')
If I simply write out the full file path as the argument I get no such errors:
count <- file('Trades/333000 EURUSD 2016.09.09 alpha count.bin', 'rb')
I inspected the filepath in the first code example and the output is the same by comparison:
countstring <- "Trades/333000 EURUSD 2016.09.09 alpha count.bin"
countstring == filepath
output: TRUE
I can see that if I use the dplyr library and pipe the concatenated string to the file(), then it works.
library(dplyr)
folder <- "Trades"
account <- "333000"
symbol <- "EURUSD"
date <- "2016.09.09"
filepath <- sprintf("%s/%s %s %s alpha",folder, account, symbol, date)
count <- paste(filepath, "count.bin") %>% file('rb')
I feel like I am misunderstanding a fundamental concept in R with regard to string manipulation.
I am new to R and just learning. Please help me understand, thank you!!!
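A likely explanation, offered as a sketch rather than a confirmed diagnosis: in file('filepath', 'rb') the quotes make 'filepath' a literal string, so R looks for a file that is literally named filepath instead of using the value stored in the variable. The pipe version works because the string value itself is passed to file(). The example below uses a temporary directory and a made-up binary file (both assumptions, so that it can run anywhere) in place of the real Trades folder.

```r
# Build the path in a temporary directory (hypothetical stand-in for "Trades").
folder <- tempdir()
account <- "333000"
symbol <- "EURUSD"
date <- "2016.09.09"
filepath <- sprintf("%s/%s %s %s alpha count.bin", folder, account, symbol, date)

# Create a small binary file so there is something to open.
writeBin(as.raw(1:4), filepath)

# Unquoted: R passes the value of the variable, i.e. the real path.
count <- file(filepath, "rb")
bytes <- readBin(count, what = "raw", n = 4)
close(count)

# Quoted: this is just the eight-character string "filepath", not the path.
same <- identical("filepath", filepath)  # FALSE
```

So the original call would be expected to work once the quotes around filepath are removed.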

Related

how to automate the process of opening a file and performing some action in R?

I am very new to R and trying to set up some automation. I have some 10-20 JSON files in a folder, and I want to run the R script for each JSON file so that I can extract data from each one and keep appending the extracted data to a single dataframe.
In the code below, df is the dataframe that will store the data.
In my case I was able to extract data from one JSON file and store it in df. How do I do this for all the JSON files and append the extracted data to df?
json_file <- "path_to_file/file.json"
json_data <- fromJSON(json_file)
df <- data.frame(str_split(json_data$data$summary$bullet, pattern = " - ")) %>%
row_to_names(row_number = 1)
My output should be a dataframe that contains all the extracted data from each file in a sequence.
I would really appreciate any help.
Something like the following might do what the question asks for. Untested, since there are no data.
The JSON processing package is just a guess, there are alternatives on CRAN. Change the call to library() at will.
library(jsonlite)
read_and_process_json <- function(x, path) {
  json_file <- file.path(path, x)
  json_data <- fromJSON(json_file)
  json_bullet <- stringr::str_split(json_data$data$summary$bullet, pattern = " - ")
  data.frame(json_bullet) |>
    janitor::row_to_names(row_number = 1)
}
base_path <- "path_to_file"
json_files <- list.files(path = base_path, pattern = "\\.json$")
df_list <- lapply(json_files, read_and_process_json, path = base_path)
df_all <- do.call(rbind, df_list)
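The same list-then-combine pattern can be tried without any JSON files. The sketch below swaps in small CSV files written to a temporary folder (the folder name, file names, and columns are invented for illustration) and uses only base R, so it shows the list.files() / lapply() / do.call(rbind, ...) steps rather than the JSON parsing itself.

```r
# Hypothetical input: three small CSV files in a temporary folder.
base_path <- file.path(tempdir(), "combine_demo")
dir.create(base_path, showWarnings = FALSE)
for (i in 1:3) {
  write.csv(data.frame(id = i, value = i * 10),
            file.path(base_path, sprintf("part%d.csv", i)),
            row.names = FALSE)
}

# Read one file; `path` is supplied through lapply's extra arguments.
read_one <- function(x, path) {
  read.csv(file.path(path, x))
}

csv_files <- list.files(path = base_path, pattern = "\\.csv$")
df_list <- lapply(csv_files, read_one, path = base_path)

# Stack the per-file data frames into one data frame.
df_all <- do.call(rbind, df_list)
```

Note that rbind() requires every per-file data frame to have the same column names, which is an assumption about the real JSON data as well.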

How to assign name of variable to a file name?

I have an object period which holds the current month id.
Now, I have different files with the period as a suffix, and I want R to read and work on those files at many places in a program.
Example
Period="202105"
file 1=SG202105
file 2=MN202105
How can I create an object period and use it at various places in the program?
You can use the Period object as the pattern argument in list.files to get the names of the files that have that value in them.
Period="202105"
list.files(pattern = Period)
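As a quick check of how the pattern argument behaves, here is a small sketch that creates empty files in a temporary folder. The folder name, the .csv extension, and the extra SG202104 file are assumptions added for illustration; the question does not state an extension.

```r
# Hypothetical folder with three empty files, one from a different period.
demo_dir <- file.path(tempdir(), "period_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, c("SG202105.csv", "MN202105.csv", "SG202104.csv")))

Period <- "202105"

# pattern is treated as a regular expression; only names containing it are returned.
matches <- list.files(path = demo_dir, pattern = Period)

# full.names = TRUE returns paths that can be passed straight to a reader such as read.csv().
match_paths <- list.files(path = demo_dir, pattern = Period, full.names = TRUE)
```

Here matches holds "MN202105.csv" and "SG202105.csv", and the file for the other period is left out.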
To create an object with the current year and month in the format yyyymm, use this function. It accepts any R object of class "Date" or "POSIXt".
new_period <- function(date = Sys.Date()){
  d <- format(date, "%Y%m")
  d
}
Period <- new_period()
Period
#[1] "202105"
I think you need paste0 and assign, try this example:
Period = "202105"
file1 <- paste0("SG", Period, ".csv")
file2 <- paste0("MN", Period)
# read file
myFile <- read.csv(file1)
# create a data.frame
assign("file2", data.frame(x = 1, y = 2))
ls()
# [1] "file1" "file2" "myFile" "Period"
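An alternative to assign(), offered only as a sketch: build the file names with paste0 and keep the data frames in a named list, so each one can be looked up by its name later in the program. The temporary folder, the .csv extension, and the toy contents are assumptions, and the date is fixed so the result is reproducible.

```r
# Fixed date for reproducibility; in real use this could be Sys.Date().
Period <- format(as.Date("2021-05-17"), "%Y%m")  # "202105"
prefixes <- c("SG", "MN")
file_names <- paste0(prefixes, Period, ".csv")

# Hypothetical files so that the example can run on its own.
demo_dir <- file.path(tempdir(), "paste0_demo")
dir.create(demo_dir, showWarnings = FALSE)
for (f in file_names) {
  write.csv(data.frame(x = 1, y = 2), file.path(demo_dir, f), row.names = FALSE)
}

# Read every file and name each list element after its file (without extension).
tables <- lapply(file.path(demo_dir, file_names), read.csv)
names(tables) <- paste0(prefixes, Period)

sg <- tables[["SG202105"]]
```

A named list avoids creating variables whose names are only known at run time, which tends to make later code easier to follow.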

Reading zip files using fread

I tried to read a zip file using fread like this:
data<-("www/608.zip")
test<- fread('gunzip -cq data')
It showed the error "does not exist or is non-readable".
But it will work if I call
test<- fread('gunzip -cq www/608.zip')
In my script the value of data changes each time, so I used an if statement to choose data, like this:
data<-reactive({
if (input$list == 'all')
{
"www/6.zip"
}
else{
if (input$list == 'hkj')
{
"www/6.zip"
}
I think it should work as follows:
data <- "www/608.zip"
test <- fread(cmd = paste("gunzip -cq", data))
i.e. you have to create a command string with paste() first and then pass it as cmd argument to fread().
If the file path is stored in a variable, you can use paste0 to build the command string:
data <- "www/608.zip"
test <- fread(cmd = paste0("gunzip -cq ", data))
fread suggests using the cmd argument for security reasons.
We can also use glue
data <- "www/608.zip"
fread(cmd = glue::glue("gunzip -cq {data}"))
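To see why the original call failed, it can help to look at the command strings themselves. In 'gunzip -cq data' the word data is part of a literal string, so the shell is asked to unzip a file called data. The sketch below only builds and compares strings with base R; it does not call fread or gunzip. The path with a space is a hypothetical example, and shQuote() is a suggestion beyond the answers above.

```r
data <- "www/608.zip"

# Literal string: the variable is never consulted.
cmd_literal <- "gunzip -cq data"

# Pasted string: the value of the variable becomes part of the command.
cmd_plain <- paste("gunzip -cq", data)

# For paths containing spaces, shQuote() wraps the path in shell quotes.
# type = "sh" is set explicitly because the default differs on Windows.
cmd_quoted <- paste("gunzip -cq", shQuote("www/my file.zip", type = "sh"))
```

Either cmd_plain or cmd_quoted could then be passed as fread(cmd = ...), assuming data.table is installed and gunzip is available on the system. Inside Shiny, a reactive such as data would also need to be called as data() to get its value.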

For loop with file names in R

I have a list of files like:
nE_pT_sbj01_e2_2.csv,
nE_pT_sbj02_e2_2.csv,
nE_pT_sbj04_e2_2.csv,
nE_pT_sbj05_e2_2.csv,
nE_pT_sbj09_e2_2.csv,
nE_pT_sbj10_e2_2.csv
As you can see, the name of the files is the same with the exception of 'sbj' (the number of the subject) which is not consecutive.
I need to run a for loop, but I would like to retain the original number of the subject. How to do this?
I assume I need to replace length(file) with something that keeps the original number of the subject, but not sure how to do it.
setwd("/path")
file = list.files(pattern="\\.csv$")
for(i in 1:length(file)){
  data=read.table(file[i],header=TRUE,sep=",",row.names=NULL)
  source("functionE.R")
  Output = paste("e_sbj", i, "_e2.Rdata")
  save.image(Output)
}
The code above gives me as output:
e_sbj1_e2.Rdata,e_sbj2_e2.Rdata,e_sbj3_e2.Rdata,
e_sbj4_e2.Rdata,e_sbj5_e2.Rdata,e_sbj6_e2.Rdata.
Instead, I would like to obtain:
e_sbj01_e2.Rdata,e_sbj02_e2.Rdata,e_sbj04_e2.Rdata,
e_sbj05_e2.Rdata,e_sbj09_e2.Rdata,e_sbj10_e2.Rdata.
Drop the extension "csv", then add "Rdata", and use filenames in the loop, for example:
myFiles <- list.files(pattern = "\\.csv$")
for(i in myFiles){
  myDf <- read.csv(i)
  outputFile <- paste0(tools::file_path_sans_ext(i), ".Rdata")
  outputFile <- gsub("nE_pT_", "e_", outputFile, fixed = TRUE)
  save(myDf, file = outputFile)
}
Note: I changed your variable names, try to avoid using function names as a variable name.
If you use regular expressions and sprintf (or paste0), you can do it easily without a loop:
fls <- c('nE_pT_sbj01_e2_2.csv', 'nE_pT_sbj02_e2_2.csv', 'nE_pT_sbj04_e2_2.csv', 'nE_pT_sbj05_e2_2.csv', 'nE_pT_sbj09_e2_2.csv', 'nE_pT_sbj10_e2_2.csv')
sprintf('e_%s_e2.Rdata',regmatches(fls,regexpr('sbj\\d{2}',fls)))
[1] "e_sbj01_e2.Rdata" "e_sbj02_e2.Rdata" "e_sbj04_e2.Rdata" "e_sbj05_e2.Rdata" "e_sbj09_e2.Rdata" "e_sbj10_e2.Rdata"
You can easily feed the vector to a function (if possible) or feed the function to the vector with sapply or lapply
fls_new <- sprintf('e_%s_e2.Rdata',regmatches(fls,regexpr('sbj\\d{2}',fls)))
res <- lapply(fls_new,function(x) yourfunction(x))
If I understood correctly, you only change extension from .csv to .Rdata, remove last "_2" and change prefix from "nE_pT" to "e". If yes, this should work:
Output = sub("_2\\.csv$", ".Rdata", sub("nE_pT", "e", file[i]))
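If a loop is preferred, one option (a sketch, not the only way) is to pull the subject label out of each file name inside the loop instead of using the loop counter. The file names are the ones listed in the question; the vector out_names is introduced here only to collect the results.

```r
files <- c("nE_pT_sbj01_e2_2.csv", "nE_pT_sbj02_e2_2.csv", "nE_pT_sbj04_e2_2.csv",
           "nE_pT_sbj05_e2_2.csv", "nE_pT_sbj09_e2_2.csv", "nE_pT_sbj10_e2_2.csv")

out_names <- character(length(files))
for (i in seq_along(files)) {
  # Extract "sbj" followed by two digits from the current file name.
  subject <- regmatches(files[i], regexpr("sbj[0-9]{2}", files[i]))
  # paste0() adds no separators, unlike paste() with its default sep = " ".
  out_names[i] <- paste0("e_", subject, "_e2.Rdata")
}
```

Each element of out_names could then be used as the file argument of save() or save.image() in the original loop.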

R : name of an object stored in a variable

I have this little problem in R : I loaded a dataset, modified it and stored it in the variable "mean". Then I used another variable "dataset" that also contains this dataset
data<-read.table()
[...modification on data...]
mean<-data
dataset<-mean
I used the variable "dataset" in some other functions of my script, etc. and at the end I want to store in a file with the name "table_mean.csv"
Of course, neither the command write.csv(tabCorr,file=paste("table_",dataset,".csv",sep=""))
nor the one with ...,quote(dataset)... does what I want...
Does anyone know how I can retrieve "mean" (as string) from "dataset" ?
(The aim would be that I could use this script for other purposes simply changing e.g. dataset<-variance)
Thank you in advance !
I think you are trying to do something like the following code does:
data1 <- 1:4
data2 <- 4:8
## Configuration ###
useThisDataSet <- "data2" # Change to "data1" to use other dataset.
currentDataSet <- get(x = useThisDataSet)
## Your data analysis.
result <- fivenum(currentDataSet)
## Save results.
write.csv(x = result, file = paste0("table_", useThisDataSet, ".csv"))
However, a better alternative would be to wrap your code into a function and pass in your data:
doAnalysis <- function(data, name) {
  result <- fivenum(data)
  write.csv(x = result, file = paste0("table_", name, ".csv"))
}
doAnalysis(data1, "data1")
If you always want to use the name of the object passed into the function as part of the filename, we can use non-standard evaluation to save some typing:
doAnalysisShort <- function(data) {
  result <- fivenum(data)
  write.csv(x = result, file = paste0("table_", substitute(data), ".csv"))
}
doAnalysisShort(data1)
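A common variant of the same idea, shown here as a sketch, converts the captured argument to a string explicitly with deparse(). The helper name table_file_name is hypothetical, and it only builds the file name without writing anything.

```r
# Build a file name from the name of the object passed in.
table_file_name <- function(data) {
  # substitute() captures the unevaluated argument; deparse() turns it into text.
  name <- deparse(substitute(data))
  paste0("table_", name, ".csv")
}

data1 <- 1:4
file_name <- table_file_name(data1)  # "table_data1.csv"
```

This relies on the caller passing a bare object name. If an expression such as data1[1:2] is passed instead, the deparsed text becomes part of the file name, which is probably not what is wanted.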

Resources