Iterative naming for a list created in a loop - r

I wrote a loop:
for(a in 1:100){
Code
list <- list("test1"=test1,"test2"=test2)
save(list, file = paste(paste("test",a,sep="_"),".RData",sep=""))
}
The iterative naming of the saved file works well, but I have not figured out a way to do the same for the list. The problem is that when I load the files into R, the objects are all called list, so they clash.
I have tried mv(from = "list" , to = paste(paste("test",a,sep="_")) but it does not work.
Can anybody help me with this?

Indeed, this is a tricky point. save(eval(parse(text=paste0("list", a))), file = paste("test",a,".RData",sep="")) does not work because save() expects the names of objects, not their values. Your best bet, IMO, would be to save one file only, which might be more convenient anyway, and access the objects by name in a list of lists:
test1 <- 1
test2 <- 2
mylist <- list()
for(a in 1:100){
#assign(paste0("list",a), list("test1"=test1,"test2"=test2), environment())
mylist[[a]] <- list("test1"=test1,"test2"=test2)
}
save(mylist, file = "mylist.RData")
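If separate files are still wanted, one option is to build the object name as a string, create the object with assign(), and pass that name to save() through its list argument, which takes a character vector of object names. A minimal sketch, assuming three iterations and a temporary output directory (out_dir and obj_name are illustrative names, not from the question):

```r
test1 <- 1
test2 <- 2

# Assumption: write into a fresh temporary directory instead of the working directory
out_dir <- tempfile("rdata_demo_")
dir.create(out_dir)

for (a in 1:3) {
  obj_name <- paste0("test_", a)
  # Create an object whose name is the string held in obj_name
  assign(obj_name, list(test1 = test1, test2 = test2))
  # save(list = ...) looks the object up by name
  save(list = obj_name, file = file.path(out_dir, paste0(obj_name, ".RData")))
}

# Remove the objects, then load one file back: it restores test_2, not "list"
rm(list = paste0("test_", 1:3))
load(file.path(out_dir, "test_2.RData"))
```

Each file then restores an object with its own name, so loading several files does not overwrite anything.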

Related

Saving data frames as .Rda files and loading them using loops

I have three data frames: sets, themes, and parts. I want to save each as a .Rda file, and then (to prove that they saved correctly) clear my workspace and load each of them.
Without a loop, this obviously works:
save(sets, file = "sets.Rda")
save(themes, file = "themes.Rda")
save(parts, file = "parts.Rda")
rm(list=ls())
load("sets.Rda")
load("themes.Rda")
load("parts.Rda")
Looping through this SEEMS like it should be straightforward, but I can't get it to work. I have a few ideas about what's the issue, but I can't work my way around them.
My thought is this:
DFs <- list("sets", "themes", "parts")
for(x in 1:length(DFs)){
dx <- paste(DFs[[x]], ".Rda", sep = "")
save(x, file = dx)
}
rm(list=ls())
DFs <- list("sets.Rda", "themes.Rda", "parts.Rda")
for(DF in DFs) {
load(DF)
}
I know that the loading loop can work because when I save the files using the first (non-looping) bit of code, it loads them all properly. But something about saving them using the above loop makes it so that when I run the loading loop, I don't get what I want. I get one object, named "x", with a value of 3L. I don't get it.
Please help me out. I think the problem rests in the arguments of my save() function, but I am not sure what's up.
Here's a minimal reproducible example showing how to write data.frames as RDS files in a loop, and then read them back into the R environment in a loop:
# Make 3 dataframes as an example
iris1 <- iris2 <- iris3 <- iris
df_names <- c("iris1", "iris2", "iris3")
# Saving them
for (i in seq_along(df_names)) {
saveRDS(get(df_names[i]), paste0(df_names[i], ".RDS"))
}
# Confirm they were written
dir()
# [1] "iris1.RDS" "iris2.RDS" "iris3.RDS"
# Remove them
rm(iris1, iris2, iris3)
# Load them
for (i in seq_along(df_names)) {
assign(df_names[i], readRDS(paste0(df_names[i], ".RDS")))
}
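As for why the original loop produced a single object called x: save(x, file = dx) saves the object named x, which is the loop counter, not the data frame whose name is stored in DFs[[x]]. If you prefer to stay with save() and load(), a sketch along these lines should work. The small stand-in data frames and the temporary directory out_dir are assumptions made so the example runs on its own:

```r
# Stand-in data frames (assumption: the real ones already exist in the workspace)
sets   <- data.frame(id = 1:2)
themes <- data.frame(id = 3:4)
parts  <- data.frame(id = 5:6)

df_names <- c("sets", "themes", "parts")
out_dir <- tempfile("rda_demo_")
dir.create(out_dir)

# Save each data frame under its own name by passing the name as a string
for (nm in df_names) {
  save(list = nm, file = file.path(out_dir, paste0(nm, ".Rda")))
}

rm(list = df_names)

# load() restores each object under the name it was saved with
for (nm in df_names) {
  load(file.path(out_dir, paste0(nm, ".Rda")))
}
```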

problem with loops: creating names based on increasing values for reading in CSV file

Okay, so I needed to split a larger file into a bunch of CSVs to run through a non-R program. I used this loop to do it:
for(k in 1:14){
inner_edge = 2000000L*(k-1) + 1
outter_edge = 2000000L*(k)
part <- slice(nc_tu_CLEAN, inner_edge:outter_edge)
out_name = paste0("geo/geo_CLEAN",(k),".csv")
write_csv(part,out_name)
Sys.time()
}
which worked great. Except I'm having a problem in this other program, and need to read a bunch of these back in to troubleshoot. I tried to write this loop for it, and get the following error:
for(k in 1:6){
csv_name <- paste0("geo_CLEAN",(k),".csv")
geo_CLEAN_(k) <- fread(file= csv_name)
}
|--------------------------------------------------|
|==================================================|
Error in geo_CLEAN_(k) <- fread(file = csv_name) :
could not find function "geo_CLEAN_<-"
I know I could do this line by line, but I'd like to have that be a loop if possible. What I want is for geo_CLEAN_1 to relate to fread geoCLEAN1.csv; geo_CLEAN_2 to relate to fread geoCLEAN2.csv, etc.
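R parses geo_CLEAN_(k) <- ... as a call to a replacement function named geo_CLEAN_<-, which is why it reports that the function cannot be found. We need assign() if we want to create objects whose names are built inside the loop:
library(data.table)
for (k in 1:6) {
  csv_name <- paste0("geo_CLEAN", k, ".csv")
  # Creates geo_CLEAN_1, geo_CLEAN_2, ... in the current environment
  assign(paste0("geo_CLEAN_", k), fread(file = csv_name))
}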
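An alternative that avoids creating many separate objects is to keep the pieces in one named list. The sketch below is only an illustration: it writes three small hypothetical CSV files to a temporary directory and reads them with base read.csv() rather than fread(), so it runs without extra packages.

```r
# Assumption: small demo files in a temporary directory stand in for the real CSVs
tmp_dir <- tempfile("geo_demo_")
dir.create(tmp_dir)
for (k in 1:3) {
  write.csv(data.frame(k = k, value = k * 10),
            file.path(tmp_dir, paste0("geo_CLEAN", k, ".csv")),
            row.names = FALSE)
}

csv_names <- file.path(tmp_dir, paste0("geo_CLEAN", 1:3, ".csv"))

# Read every file into one list and name the elements geo_CLEAN_1, geo_CLEAN_2, ...
geo_CLEAN <- lapply(csv_names, read.csv)
names(geo_CLEAN) <- paste0("geo_CLEAN_", 1:3)

# Individual pieces are then available by name
second_piece <- geo_CLEAN$geo_CLEAN_2
```

With the list, the same troubleshooting step can be applied to every piece with lapply() instead of being repeated per object.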

Can't name loaded files - R

So I have a folder with a bunch of CSV files. I set the working directory to that folder and extracted the file names:
data_dir <- "~/Desktop/All Waves Data/csv"
setwd(data_dir)
vecFiles <- list.files(data_dir)
All good. Now the problem comes when I try to load all of the files using a loop over vecFiles:
for(fl in vecFiles) {
fl <- read.csv(vecFiles[i], header = T, fill = T)
}
The loop treats 'fl' as a plain string when it comes to the naming, resulting in only the last file being saved under 'fl' (each iteration overwrites the previous one).
I was trying to figure out why this happens but failed.
Any explanation?
Edit: Trying to achieve the following: assume you have a folder with data1.csv, data2.csv ... datan.csv, I want to load them into separate data frames named data1, data2 ..... datan
You want to read in all CSV files from your working directory, and you have the names of those files saved in vecFiles.
Why your attempt doesn't work
What you are currently doing doesn't work because you overwrite the object fl with the newly loaded CSV file in every iteration. After all iterations have run, you are left with only the last value assigned to fl.
Another example to clarify why fl only contains the contents of the last CSV file: if you declare fl <- "abc" on line 1 and then say fl <- "def" on line 2 (i.e., you overwrite fl from line 1), you will obviously have the value "def" saved in fl after line 2, right?
fl <- "abc"
fl <- "def"
fl
#[1] "def"
Solutions
There are two prominent ways to solve this: 1) stick with a slightly altered for-loop. 2) Use sapply().
1) The altered for loop: create an empty list called fn, and assign each loaded CSV file to the i-th element of that list in every iteration:
fn <- list()
for (i in seq_along(vecFiles)) {
  # Store each data frame in its own list slot instead of overwriting one object
  fn[[i]] <- read.csv(vecFiles[i], header = TRUE, fill = TRUE)
}
names(fn) <- vecFiles
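2) Use sapply(): sapply() is a function that R users often use instead of for loops. Pass simplify = FALSE so that the result stays a list of data frames; because vecFiles is a character vector, the elements are named after the files automatically.
fn <- sapply(vecFiles, read.csv, header = TRUE, fill = TRUE, simplify = FALSE)
Note that you can also use lapply() instead of sapply(). lapply() always returns a list, but it does not name the elements for you, so set names(fn) <- vecFiles afterwards.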
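To see why the choice between sapply() and lapply() matters for data frames, here is a small sketch. make_df is a hypothetical stand-in for read.csv that returns data frames with the same number of columns, which is the case where sapply() tries to simplify its result:

```r
# Hypothetical stand-in for read.csv(): returns a two-column data frame
make_df <- function(k) data.frame(id = k, value = k * 10)
ids <- 1:3

# Default sapply() tries to simplify; equally shaped data frames collapse into a matrix of columns
simplified <- sapply(ids, make_df)

# simplify = FALSE and lapply() both keep one data frame per element
as_list    <- sapply(ids, make_df, simplify = FALSE)
via_lapply <- lapply(ids, make_df)

is.matrix(simplified)
is.data.frame(as_list[[1]])
```

If every file has the same number of columns, the default sapply() result is therefore not a list of data frames, which is easy to miss.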
You're not creating anything new when you load a file. Each iteration loads into fl, so you only see the last file in vecFiles.
A couple of potential solutions.
First lapply:
fl <- lapply(vecFiles, function(x) read.csv(x, header = TRUE, fill = TRUE))
names(fl) <- vecFiles
This will create a list of elements within fl.
Second 'rbind':
Under the assumption your data has all the same columns:
fl <- read.csv(vecFiles[1], header = TRUE, fill = TRUE)
# Loop over the remaining file names and append each file's rows
for (f in vecFiles[-1]) {
  fl <- rbind(fl, read.csv(f, header = TRUE, fill = TRUE))
}
Hopefully that is helpful!
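If separate data frames named data1, data2, ... are really what is wanted (as in the edit to the question), one possible approach is to build a named list first and then copy its elements into the workspace with list2env(). This is a sketch under assumptions: the demo files and the temporary folder are made up, and the object names are taken from the file names without their extension.

```r
# Assumption: a temporary folder with three demo files stands in for the real one
data_dir <- tempfile("csv_demo_")
dir.create(data_dir)
for (k in 1:3) {
  write.csv(data.frame(id = k),
            file.path(data_dir, paste0("data", k, ".csv")),
            row.names = FALSE)
}

vecFiles <- list.files(data_dir, pattern = "\\.csv$", full.names = TRUE)

# Read all files into a list named after the files, minus the ".csv" extension
dfs <- lapply(vecFiles, read.csv, header = TRUE, fill = TRUE)
names(dfs) <- tools::file_path_sans_ext(basename(vecFiles))

# Copy each list element into the global environment as its own object
list2env(dfs, envir = globalenv())
```

After this runs, data1, data2 and data3 exist as separate data frames, and the list dfs is still available for looping.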

How to read multiple .xlsx files and generate multiple data frames in R?

I want to read three different files in xlsx and save them in three different dataframes called excel1, excel2 and excel3. How can I do that? I think it should be something like this:
files = list.files(pattern='[.]xlsx') #There are three files.
for (i in 1:files){
"excel" + i =read.xlsx(files[i])
}
I suggest using a list instead of creating three variables in the current workspace:
dfList <- list()
for (i in seq_along(files)) {
  dfList[[paste0("excel", i)]] <- read.xlsx(files[i])
}
Then you can access them this way:
dfList$excel1
dfList$excel2
dfList$excel3
or:
dfList[[1]]
dfList[[2]]
dfList[[3]]
But if you really want to create new variables, you can use the assign function:
for (i in seq_along(files)) {
  assign(paste0("excel", i), read.xlsx(files[i]))
}
# now excel1, excel2, excel3 variables exist...
You can also use plyr, and it is good practice to specify the environment in which you want to create the variable:
library(plyr)
l_ply(seq_along(files), function(i) assign(paste0('excel', i), read.xlsx(files[i]), envir = globalenv()))
If someone tries to use this code, these parameters are really helpful:
library(xlsx)
files <- list.files(pattern = '[.]xlsx')
dfList <- list()
for (i in seq_along(files)) {
  dfList[[paste0("excel", i)]] <- read.xlsx(files[i], header = TRUE, stringsAsFactors = FALSE, sheetIndex = 1)
}
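A note on the loop bounds used in this thread: 1:files cannot work because files is a character vector of file names, not a count, and 1:length(files) misbehaves when no files match. The sketch below simulates the empty case with a hypothetical empty vector and shows why seq_along() is the safer choice:

```r
# Assumption: simulate a folder in which no .xlsx files were found
files <- character(0)

# 1:length(files) counts down from 1 to 0, so a loop over it would run twice
bad_index <- 1:length(files)

# seq_along(files) is empty, so a loop over it does not run at all
good_index <- seq_along(files)

visited <- 0
for (i in good_index) visited <- visited + 1
```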

For loop question in R

Hope I can explain my question well enough to obtain an answer - any help will be appreciated.
I have a number of data files which I need to merge into one. I use a for loop to do this and add a column which indicates which file it is.
In this case there are 6 files with up to 100 data entries in each.
When there are 6 files I have no problem in getting this to run.
But when there are fewer I have a problem.
What I would like to do is use the for loop to test for the files and use the for loop variable to assemble a vector which references the files that exist.
I can't seem to get the new variable to combine the new value of the for loop variable as it goes through the loop.
Here is the sample code I have written so far.
for ( rloop1 in 1 : 6) {
ReadFile=paste(rloop1,SampleName,"_",FileName,"_Stats.csv", sep="")
if (file.exists(ReadFile))
**files_found <- c(rloop1)**
}
What I am looking for is that files_found will contain those files where 1...6 are valid for the files found.
Regards
Steve
It would probably be better to list the files you want to load, and then loop over that list to load them. list.files is your friend here. We can use a regular expression to list only those files that end in "_Stats.csv". For example, in my current working directory I have the following files:
$ ls | grep Stats
bar_Stats.csv
foobar_Stats.csv
foobar_Stats.csv.txt
foo_Stats.csv
Only three of them are csv files I want to load (the .txt file doesn't match the pattern you showed). We can get these file names using list.files():
> list.files(pattern = "_Stats.csv$")
[1] "bar_Stats.csv" "foo_Stats.csv" "foobar_Stats.csv"
You can then loop over that and read the files in. Something like:
fnames <- list.files(pattern = "_Stats.csv$")
for(i in seq_along(fnames)) {
assign(paste("file_", i, sep = ""), read.csv(fnames[i]))
}
That will create a series of objects file_1, file_2, file_3 etc in the global workspace. If you want the files in a list, you could instead lapply over the fnames:
lapply(fnames, read.csv)
and if suitable, do.call might help combine the files from the list:
do.call(rbind, lapply(fnames, read.csv))
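Since the question also asks for a column that records which file each row came from, the do.call() approach can be extended with a small wrapper around read.csv(). This is a sketch only: the demo files, the temporary directory, the helper read_with_source and the column name source_file are all assumptions made for illustration.

```r
# Assumption: three of the six possible files exist, in a temporary directory
tmp_dir <- tempfile("stats_demo_")
dir.create(tmp_dir)
for (k in c(1, 2, 4)) {
  write.csv(data.frame(value = k * 1:2),
            file.path(tmp_dir, paste0(k, "S_F_Stats.csv")),
            row.names = FALSE)
}

fnames <- list.files(tmp_dir, pattern = "_Stats\\.csv$", full.names = TRUE)

# Hypothetical helper: read one file and tag its rows with the file name
read_with_source <- function(path) {
  df <- read.csv(path)
  df$source_file <- basename(path)
  df
}

# Only files that actually exist are listed, so missing ones need no special handling
merged <- do.call(rbind, lapply(fnames, read_with_source))
```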
There's a much shorter way to do this using list.files(), as Henrik showed. In case you're not familiar with regular expressions (see ?regex), you could do:
n <- 6
Fnames <- paste(1:n, SampleName, "_", FileName, "_Stats.csv", sep = "")
Filelist <- Fnames[file.exists(Fnames)]
which is equivalent for these file names. Both paste and file.exists are vectorized functions, so you had better make use of that. There's no need for a for loop whatsoever.
To get the leading number of each file name (assuming SampleName does not itself start with a digit), you can do:
as.integer(sub("^([[:digit:]]+).*$", "\\1", Filelist))
See also ?regex
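The vectorized check can be tried out end to end as follows. This is a sketch with made-up values: SampleName and FileName are set to "S" and "F", and only files 1, 3 and 4 are created in a temporary directory.

```r
# Assumptions: placeholder values for SampleName and FileName, files in a temp directory
tmp_dir <- tempfile("exists_demo_")
dir.create(tmp_dir)
SampleName <- "S"
FileName <- "F"
for (k in c(1, 3, 4)) {
  file.create(file.path(tmp_dir, paste0(k, SampleName, "_", FileName, "_Stats.csv")))
}

# Build all six candidate names at once and test them in a single call
Fnames <- paste0(1:6, SampleName, "_", FileName, "_Stats.csv")
present <- file.exists(file.path(tmp_dir, Fnames))

# Indices of the files that exist, which is what files_found was meant to hold
files_found <- (1:6)[present]
```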
I think there are better solutions (e.g., you could use list.files() to scan the folder and then loop over the returned object), but this should do the trick (I didn't try it), using your sample code:
files_found <- c()
for (rloop1 in 1:6) {
  ReadFile <- paste(rloop1, SampleName, "_", FileName, "_Stats.csv", sep = "")
  # Append the index of each file that exists
  if (file.exists(ReadFile)) files_found <- c(files_found, rloop1)
}
Alternatively, you could collect the file names (rather than their indices) via:
files_found <- c()
for (rloop1 in 1:6) {
  ReadFile <- paste(rloop1, SampleName, "_", FileName, "_Stats.csv", sep = "")
  # Append the name of each file that exists
  if (file.exists(ReadFile)) files_found <- c(files_found, ReadFile)
}
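Finally, in your case list.files could look something like this (the pattern is built from the variables, since writing SampleName and FileName inside the string would match those literal words):
files_found <- list.files(pattern = paste0("^[[:digit:]]", SampleName, "_", FileName, "_Stats\\.csv$"))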

Resources