Save .RData in a different directory - r

I load my files (.RData) from a particular folder, and i created a subfolder to save some samples and subsets. So, i want to save these elements in the subfolder, and they don't have the same name structure because i have multiple datasets (for example it cannot be sub1, sub2 etc, i have to write try1, full_sample, sub_2021 and so on).
I tried the following :
subsets_samples <- file.path <-("/Volumes/WD_BLACK/Merge/SAMPLES_SUBSETS")
fname <- file.path(subsets_samples, ".RData")
save(mydata, file=fname)
But obviously there is a problem with the saving part. My goal is to have something like :
save(mydata, file = "newname")
With the .RData format from fname that is put automatically.
I saw some answers with loops and so on but i don't really understand the process i'm sorry.
Thanks !

The problem with file.path is that it will place a separator (e.g., /ΒΈ) between each of the elements. So you would have to use paste0 in addition for the actual file name:
# If I understand you correctly, you want the iteration, like try1, full_sample, sub_2021 and so on in your file name. define them somewhere in your loop/script
iteration <- "full_sample"
fname <- file.path("Volumes", "WD_BLACK", "Merge", "SAMPLES_SUBSETS", paste0(iteration, ".Rds"))
Additionally, I would suggest to use saveRDS instead of save, since it is the appropriate function if you want to save just one object.
saveRDS(mydata, file = fname)

Related

Creating objects from all .xlsx documents in working directory

I am trying to create objects from all files in working directory with name of the original file. I tried to go the following way, but couldn't solve appearing problems.
# - SETTING WD
getwd()
setwd("PATH TO THE FILE")
library(readxl)
# - CREATING OBJECTS
file_objects <- list.files()
xlsx_objects <- unlist(grep(".xlsx",file_objects,value = T))
for (i in xlsx_objects) {
xlsx_objects[i] <- read_xlsx(xlsx_objects[i], header = T)
}
I tried to paste [i]item from "xlsx_objects" with path to WD but it only created a list of files names from docs in WD.
I also find information, that read.csv can read only one file at the time, but I guess that it should be the case with for loop, right? It is reading only one file at the time.
Using lapply (as described in this forum) I was able to get the data in the environment, but argument header didn't work, I lost names of my docs in that object which does not have desired structure. I am though looking for having these files in separated objects without calling every document exclusively.
IIUC, you could do something like:
files = list.files("PATH TO THE FILE", full.names = T, pattern = 'xlsx')
list_files = map(files, readxl::read_excel)
(You can't use read.csv to read excel files)
Also I recommend reading about R Projects so you don't have to use setwd() ever again, which makes your code harder to reproduce down the pipeline

Why does "write.dat" (R) save data files within folders?

In order to conduct some analysis using a particular software, I am required to have separate ".dat" files for each participant, with each file named as the participant number, all saved in one directory.
I have tried to do this using the "write.dat" function in R (from the 'multiplex' package).
I have written a loop that outputs a ".dat" file for each participant in a dataset. I would like each file that is outputted to be named the participant number, and for them all to be stored in the same folder.
## Using write.dat
participants_ID <- unique(newdata$SJNB)
for (i in 1:length(participants_ID)) {
data_list[[i]] <- newdata %>%
filter(SJNB == participants_ID[i])
write.dat(data_list[[i]], paste0("/Filepath/Directory/", participants_ID[i], ".dat"))
}
## Using write_csv this works perfectly:
participants_ID <- unique(newdata$SJNB)
for (i in 1:length(participants_ID)) {
newdata %>%
filter(SJNB == participants_ID[i]) %>%
write_csv(paste0("/Filepath/Directory/", participants_ID[i], ".csv"), append = FALSE)
}
If I use the function "write_csv", this works perfectly (saving .csv files for each participant). However, if I use the function "write.dat" each participant file is saved inside a separate folder - the folder name is the participant number, and the file inside the folder is called "data_list[[i]]". In order to get all of the data_list files into the same directory, I then have to rename them which is time consuming.
I could theoretically output the files to .csv and then convert them to .dat, but I'm just intrigued to know if there's anything I could do differently to get the write.dat function to work the way I'm trying it :)
The documentation on write.dat is subminimal, but it would appear that you have confused a directory path with a file name . You have deliberately created a directory named "/Filepath/Directory/[participants_ID[i]].dat" and that's where each output file is placed. That you cannot assing a name to the x.dat file itself appears to be a defect in the package as supplied.
However, not all is lost. Inside your loop, replace your write.dat line with the following lines, or something similar (not tested):
edit
It occurs to me that there's a smoother solution, albeit using the dreaded eval:
Again inside the loop, (assuming participants_ID[i] is a char string)
eval(paste0(participants_ID[i],'<- dataList[[i]]'))
write.dat(participants_ID[i], "/Filepath/Directory/")
previous answer
write.dat(data_list[[i]], "/Filepath/Directory/")
thecommand = paste0('mv /Filepath/Directory/dataList[[i]] /Filepath/Directory/',[participants_ID[i]],'.dat',collapse="")
system(thecommand)

Importing files with almost similar path and name

I have many txt files that I want to import into R. These files are imported one by one, I do the operations that I want, and then I import the next file.
All these files are located in a database system where all the folders have almost the same names, e.g.
database\type4\system50
database\type6\system50
database\type4\system30
database\type4\system50
Similarly, the names of the files are also almost the same, referring to the folder where they are positioned, e.g..
type4.system50.txt
type6.system50.txt
type4.system30.txt
type4.system50.txt
I have heard that there should be a easier way of importing these many files one by one, than simply multiple setwd and read.csv2 commands. As far as I understand this is possible by the macro import function in SAS, where you specify an overall path and then for each time you want to import a file you specify what is specific about this file name/folder name.
Is there a similar function in R? I tried to look at
Importing Data in R like SAS macro
, but this question did not really show me how to specify the folder name/file name.
Thank you for your help.
If you want to specify folder name / file name, try this
databasepath="path/to/database"
## list all files
list.files(getwd(),recursive = T,full.names = T,include.dirs = T) -> tmp
## filter files you want to read
readmyfile <- function(foldername,filename){
tmp[which(grepl(foldername,tmp) & grepl(filename,tmp))]
}
files_to_read <- readmyfile("type4", "system50")
some_files <- lapply(files_to_read, read.csv2)
## Or you can read all of them (if memory is large enough to hold them)
all_files <- lapply(tmp,read.csv2)
Instead of using setwd continuously, you could specify the absolute path for each file, save all of the paths to a vector, loop through the vector of paths and load the files into a list
library(data.table)
file_dir <- "path/to/files/"
file_vec <- list.files(path = file_dir, pattern = "*.txt")
file_list <- list()
for (n in 1:length(file_list)){
file_list[[n]] <- fread(input = paste0(file_dir, file_vec[n]))
}

How to read a csv file or load an excel workbook by ignoring some characters in the file path?

I'm writing a loop script which involves reading a file from a workbook (using the package XLConnect). The challenge is that the file names contain characters (representing time) that I want to ignore.
For example, here are 3 paths to those files:
G://User//Documents//daily_data//Op_Schedule_20160520_132025.xlsx
G://User//Documents//daily_data//Op_Schedule_20160521_142805.xlsx
G://User//Documents//daily_data//Op_Schedule_20160522_103052.xlsx
I need to import hundreds of those files. I can easily account for the character string representing the date (e.g. 20160522), but not the time.
Is there a way to tell R to ignore some characters located in the file path? Here is how I was thinking of writing my script (the "???" is where i need help). I know a loop is probably not the most efficient way, but i'm open to suggestions, should you have any:
require(XLConnect)
path= "G://User//Documents//daily_data//Op_Schedule_"
wd.seq = format(seq(as.Date("2014-01-01"),as.Date("2016-12-31"),"days"),format="%Y%m%d")
scheduleList = rep(list(matrix(1,1,1)),length(wd.seq))
for(i in 1:length(wd.seq)) {
wb = loadWorkbook(file= paste0(path,wd.seq[i],"???",".xlxs"))
scheduleList[[i]] = readWorksheet(wb,sheet='=SCHEDULE', header = TRUE)
}
`
Thanks for reading and suggestions, if any.
Mathieu
I don't know if this is helpful, but if you want to read all the files in a certain directory (which it seems to me is what you're after), you can read all the filenames into a list using the list.files() function, for example
fileList <- list.files(""G://User//Documents//daily_data//")
And then load the xlsx files looping through the list with a for loop
for(i in fileList) {
loadWorkbook(file = i)
}
I haven't used the XLConnect function before so that exact code probably doesn't work, but the loop will iterate through all the files in that directory and so you can construct your loading call using the i variable for the filename (it won't be an absolute path though, so you might need to use paste to add the first part of the filepath)
I realize there might be other files in the directory that are not excel files, you could use grepl to select only files containg "OP_Schedule_"
fileListClean <- fileList[grepl("Op_Schedule_",fileList)]
or perhaps only selecting .xlsx files in the directory:
fileListClean <- fileList[grepl(".xlsx",fileList)]
Edit to fit your reply:
Since you need to fit it to a sequence, you can do it as you did earlier:
wd.seq = format(seq(as.Date("2014-01-01"),as.Date("2016-12-31"),"days"),format="%Y%m%d")
wd.seq2 <- paste("Op_Schedule_", wd.seq, sep = "")
And then use grepl to only pick files starting with that extensions:
fileListClean <- fileList[grepl(paste(wd.seq2, collapse = "|"), fileList)]
Full disclosure: The last part i got from this SO answer: grep using a character vector with multiple patterns

Reading a file in, then writing out a file with similar name in R

In R, I would like to read in data from a file, then do a bunch of stuff, then write out data to another file. I can do that. But I'd like to have the two files have similar names automatically.
e.g. if I create a file params1.R I can read it in with
source("c:\\personal\\consults\\ElwinWu\\params1.R")
then do a lot of stuff
then write out a resulting table with write.table and a filename similar to above, except with output1 instead of params1.
But I will be doing this with many different params files, and I can foresee making careless mistakes of not changing the output file to match the params file. Is there a way to automate this?
That is, set the number for output to match the number for params?
thanks
Peter
If your source file always contains "params" which you want to change to "output" then you can easilly do this with gsub:
source(file <- "c:\\personal\\consults\\ElwinWu\\params1.R")
### some stuff
write.table(youroutput, gsub("params","output",file) )
# Will write in "c:\\personal\\consults\\ElwinWu\\output1.R"
Edit:
Or to get .txt as filetype:
write.table(youroutput, gsub(".R",".txt",gsub("params","output",file)))
# Will output in c:\\personal\\consults\\ElwinWu\\output1.txt"
Edit2:
And a loop for 20 param files then would be:
n <- 20 # number of files
for (i in 1:n)
{
source(file <- paste("c:\\personal\\consults\\ElwinWu\\params",i,".R",sep=""))
### some stuff
write(youroutput, gsub(".R",".txt",gsub("params","output",file)))
}
If the idea is just to make sure that all the outputs go in the same directory as the input then try this:
source(file <- "c:\\personal\\consults\\ElwinWu\\params1.R")
old.dir <- setwd(dirname(file))
write.table(...whatever..., file = "output1.dat")
write.table(...whatever..., file = "output2.dat")
setwd(old.dir)
If you don't need to preserve the initial directory you can omit the last line.

Resources