How to fix: loop with dataframe name - r

I want to create some plots with random numbers in a loop and save the numbers from each iteration in separate data frames, for example df1, df2 and df3, but my code overwrites df on every pass.
How can I use i in the data frame names?
x1 <- c(1:9)
for (i in 1:3)
{
name = paste("Pic_", i, ".png", sep="")
png(name)
x2 <- rnorm(9,2,2)
plot(x1,x2)
df <- data.frame(x1,x2)
dev.off()
}

Try this
for (i in 1:3){
x1<-1:9
assign(paste("df",i,sep = ""), rnorm(9,2,2))
png(paste("Pic_", i, ".png", sep=""))
plot(x1,get(paste("df",i,sep = "")),ylab=paste("df",i,sep = ""))
dev.off()
}
The assign() and get() functions are the important part here. assign() creates an object under a name that is built at run time, which is what lets you use "i" in the name. get() looks the object up again by the same constructed name. Both rely on paste() so that the name changes with each iteration of the loop. Note that this code stores the random numbers as plain numeric vectors named df1 to df3; wrap them in data.frame() if you need actual data frames.
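A minimal sketch of the same assign()/get() idea that stores actual data frames rather than vectors. The plotting calls are left out to keep it short; the set.seed() call and the name collected are additions for illustration only.

```r
set.seed(1)   # assumption: fixed seed so the run is repeatable
x1 <- 1:9
for (i in 1:3) {
  x2 <- rnorm(9, 2, 2)
  # creates df1, df2 and df3 in the current environment
  assign(paste0("df", i), data.frame(x1, x2))
}

# look one object up again by its constructed name
str(get("df2"))

# mget() gathers several objects by name into one named list
collected <- mget(paste0("df", 1:3))
names(collected)
```

Collecting the objects with mget() afterwards is often a sign that a list would have been the simpler container from the start.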

This should work - you end up with a list of three data frames.
By using df.list[[i]] you're addressing the index i.
x1 <- c(1:9)
df.list <- list()
for (i in 1:3) {
name = paste("Pic_", i, ".png", sep="")
png(name)
x2 <- rnorm(9, 2, 2)
plot(x1, x2)
df.list[[i]] <- data.frame(x1, x2)
dev.off()
}
Each item of the list is a data frame, accessed like you would any other list object:
> is.data.frame(df.list)
[1] FALSE
> is.data.frame(df.list[[1]])
[1] TRUE
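If the df1, df2, df3 labels matter, the list elements can carry them as names. The following is a sketch under the same setup as above, with the plotting left out; the column name id and the object all_df are made up for the example.

```r
set.seed(42)  # assumption: fixed seed so the run is repeatable
x1 <- 1:9
df.list <- lapply(1:3, function(i) data.frame(x1, x2 = rnorm(9, 2, 2)))
names(df.list) <- paste0("df", 1:3)

# access by name instead of by position
head(df.list$df2)
df.list[["df3"]]$x2

# stack everything into one data frame, tagging each row with its source
all_df <- do.call(
  rbind,
  Map(function(d, id) cbind(id = id, d), df.list, names(df.list))
)
dim(all_df)
```

Keeping the frames in one list means later steps can use lapply() over all of them instead of repeating code per object.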

Related

List elements getting overwritten in for loop R?

I have a bunch of csv files that I'm trying to read into R all at once, with each data frame from a csv becoming an element of a list. The loops largely work, but they keep overriding the list elements. So, for example, if I loop over the first 2 files, both data frames in list[[1]] and list[[2]] will contain the data frame for the second file.
#function to open one group of files named with "cores"
open_csv_core<- function(year, orgtype){
file<- paste(year, "/coreco.core", year, orgtype, ".csv", sep = "")
df <- read.csv(file)
names(df) <- tolower(names(df))
df <- df[df$ntee1 %in% c("C","D"),]
df<- df[!(df$nteecc %in% c("D20","D40", "D50", "D60", "D61")),]
return(df)
}
#function to open one group of files named with "nccs"
open_csv_nccs<- function(year, orgtype){
file2<- paste(year, "/nccs.core", year, orgtype, ".csv", sep="")
df2 <- read.csv(file2)
names(df2) <- tolower(names(df2))
df2 <- df2[df2$ntee1 %in% c("C","D"),]
df2<- df2[!(df2$nteecc %in% c("D20","D40", "D50", "D60", "D61")),]
return(df2)
}
#############################################################################
yrpc<- list()
yrpf<- list()
yrco<- list()
fname<- vector()
file_yrs<- as.character(c(1989:2019))
for(i in 1:length(file_yrs)){
fname<- list.files(path = file_yrs[i], pattern = NULL)
#accessing files in a folder and assigning to the proper function to open them based on how the file is named
for(j in 1:length(fname)){
if(grepl("pc.csv", fname[j])==T) {
if(grepl("nccs", fname[j])==T){
a <- open_csv_nccs(file_yrs[j], "pc")
yrpc[[paste0(file_yrs[i], "pc")]] <- a
} else {
b<- open_csv_core(file_yrs[j], "pc")
yrpc[[paste0(file_yrs[i], "pc")]] <- b
}
} else if (grepl("pf.csv", fname[j])==T){
if(grepl("nccs", fname[j])==T){
c <- open_csv_nccs(file_yrs[j], "pf")
yrpf[[paste0(file_yrs[i], "pf")]] <- c
} else {
d<- open_csv_core(file_yrs[j], "pf")
yrpf[[paste0(file_yrs[i], "pf")]] <- d
}
} else {
if(grepl("nccs", fname[j])==T){
e<- open_csv_nccs(file_yrs[j], "co")
yrco[[paste0(file_yrs[i], "co")]] <- e
} else {
f<- open_csv_core(file_yrs[j], "co")
yrco[[paste0(file_yrs[i], "co")]] <- f
}
}
}
}
Actually, both of your CSV reading functions do exactly the same thing, except that the paths are different.
If you list your files with full paths instead of bare file names, you don't need to reconstruct the paths the way you do. This is possible with full.names = TRUE in list.files().
The second point: it seems there is never a "nccs.core" file and a "coreco.core" file for the same year and the same type, so the two are mutually exclusive. Then no logic is needed to distinguish those cases, which simplifies the code.
The third point: you just want to separate the data frames by file type ("pc", "pf", "co") and by year.
Instead of creating three lists, one per type, I would create one results list that contains an inner list for each type.
I would solve it like this:
years <- 1989:2019
# pull "pc", "pf" or "co" out of a path that ends in e.g. "...pc.csv"
path_to_type <- function(path) gsub(".*(pc|pf|co)\\.csv$", "\\1", path)
res <- list("pc" = list(),
            "pf" = list(),
            "co" = list())
invisible(lapply(as.character(years), function(year) {
  # list.files() expects a character path, hence as.character(years) above
  files <- list.files(path = year, pattern = "\\.csv$", full.names = TRUE)
  lapply(files, function(path) {
    print(path) # just to signal that the path is getting processed
    df <- read.csv(path)
    file_type <- path_to_type(path)
    names(df) <- tolower(names(df))
    df <- df[df$ntee1 %in% c("C", "D"), ]
    df <- df[!(df$nteecc %in% c("D20", "D40", "D50", "D60", "D61")), ]
    # <<- updates the outer res; a plain <- would only change a local copy.
    # The year is a character key, so the inner list is not padded with empty slots.
    res[[file_type]][[year]] <<- df
  })
}))
Now you can pull a data frame from the results list by file type and year, e.g.:
res[["co"]][["1995"]]
res[["pf"]][["2018"]]
And so on.
The return values of the lapply() calls are not interesting in this case, only the content of res (the results list).
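It seems that in your for(j in 1:length(fname)){... loop you are creating one of six variables, a to f, and reusing those names on every iteration, so they are overwritten each time. Note also that the open_csv_* calls index file_yrs with j while the list key uses file_yrs[i]; that mismatch looks like the reason different list elements end up holding the same file.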
The "correct" way to do this is to use lapply in place of the for loop. Pass the list of files, and the required function (i.e. open_csv_core, etc) to lapply, and the return value that you get back is a list of the results.
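A minimal sketch of the lapply() approach described above. The demo folder, the file names and the helper read_one are hypothetical and only there to make the example runnable; the filtering mirrors the functions in the question.

```r
# hypothetical demo folder with two small CSV files
demo_dir <- file.path(tempdir(), "csv_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(NTEE1 = c("C", "X"), NTEECC = c("C30", "X20")),
          file.path(demo_dir, "core1989pc.csv"), row.names = FALSE)
write.csv(data.frame(NTEE1 = c("D", "D"), NTEECC = c("D20", "D30")),
          file.path(demo_dir, "core1990pc.csv"), row.names = FALSE)

# one reader for every file; the full path is passed in, not rebuilt
read_one <- function(path) {
  df <- read.csv(path)
  names(df) <- tolower(names(df))
  df <- df[df$ntee1 %in% c("C", "D"), ]
  df[!(df$nteecc %in% c("D20", "D40", "D50", "D60", "D61")), ]
}

files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)
dfs <- lapply(files, read_one)                      # one list element per file
names(dfs) <- sub("\\.csv$", "", basename(files))   # label elements by file name
names(dfs)
```

Because each list element comes from exactly one call to read_one(), there is no loop index to mix up and nothing gets overwritten.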

Looping over lists, extracting certain elements and delete the list?

I am trying to create an efficient code that opens data files containing a list, extracts one element within the list, stores it in a data frame and then deletes this object before opening the next one.
My idea is to do this with loops. Unfortunately, I am quite new to loops and don't know how to write the code.
I have managed to open the data-sets using the following code:
for(i in 1995:2015){
objects = paste("C:/Users/...",i,"agg.rda", sep=" ")
load(objects)
}
The problem is that each data-set is extremely large and R cannot open all of them at once. Therefore, I am now trying to extract an element within each list called: tab_<<i value >>_agg[["A"]] (for example tab_1995_agg[["A"]]), then delete the object and iterate over each i (which are different years).
I have tried using the following code but it does not work
for(i in unique(1995:2015)){
objects = paste("C:/Users/...",i,"agg.rda", sep=" ")
load(objects)
tmp = cat("tab",i,"_agg[[\"A\"]]" , sep = "")
y <- rbind(y, tmp)
rm(list=objects)
}
I apologize for any silly mistake (or question) and greatly appreciate any help.
Here’s a possible solution using a function to rename the object you’re loading in. I got loadRData from here. The loadRData function makes this a bit more approachable because you can load in the object with a different name.
Create some data for a reproducible example.
tab2000_agg <-
list(
A = 1:5,
b = 6:10
)
tab2001_agg <-
list(
A = 1:5,
d = 6:10
)
save(tab2000_agg, file = "2000_agg.rda")
save(tab2001_agg, file = "2001_agg.rda")
rm(tab2000_agg, tab2001_agg)
Using your loop idea.
loadRData <- function(fileName){
load(fileName)
get(ls()[ls() != "fileName"])
}
y <- list()
for(i in 2000:2001){
objects <- paste("", i, "_agg.rda", sep="")
data_list <- loadRData(objects)
tmp <- data_list[["A"]]
# index by name: y[[i]] with i = 2000 would pad the list with 1999 NULL entries
y[[as.character(i)]] <- tmp
rm(data_list)
}
y <- do.call(rbind, y)
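You could also turn it into a function rather than use a loop. (The function is called get_element_A here so that it does not mask base R's getElement().)
get_element_A <- function(year){
  objects <- paste0(year, "_agg.rda")
  data_list <- loadRData(objects)
  data_list[["A"]]  # the last value is returned; data_list is discarded on exit
}
y <- lapply(2000:2001, get_element_A)
y <- do.call(rbind, y)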
Created on 2022-01-14 by the reprex package (v2.0.1)
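An alternative sketch that avoids the ls() lookup: load() accepts an envir argument and returns the names of the objects it restored, so the file can be loaded into a throwaway environment. The file name 2000_agg_demo.rda and the helper load_A are made up for this example.

```r
# hypothetical demo file, written to a temporary folder
demo_file <- file.path(tempdir(), "2000_agg_demo.rda")
tab2000_agg <- list(A = 1:5, b = 6:10)
save(tab2000_agg, file = demo_file)
rm(tab2000_agg)

load_A <- function(path) {
  e <- new.env()                      # scratch environment for this file only
  obj_names <- load(path, envir = e)  # load() returns the restored object names
  e[[obj_names[1]]][["A"]]            # keep element "A"; e is dropped on exit
}

a <- load_A(demo_file)
a
```

Since the large object only ever lives inside the function's scratch environment, no explicit rm() is needed before the next file is read.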

Saving dataframe after every iteration [r]

I'm trying to save a data frame after every iteration of this loop, while appending the data frame with the loop number. So, I'll be left with 5 data frames all with different names.
In my actual code, all the data frames will be different but for simplicity I've just shown one data frame here.
I've supplied some test code below.
testFunction <- function() {
for (i in 1:5) {
x <- data.frame(c(1:10), c(1,2,3,4,5,6,7,8,9,10), c(10:19))
name <- paste("name", i, sep = "_")
name <- x
}
}
The example data frames created would be named:
testFunction()
name_1
name_2
name_3
name_4
name_5
However, I'm only getting the final data frame "name_5" to save when the loop completes. My issue is I don't know how to save the ith version of the data frame without escaping from the loop.
Any suggestions on how I can solve this?
***** EDIT *****
I have my for loop inside a function, which might be why assign() is not working. I've appended my example above to show this.
Inside your loop, use assign():
for (i in 1:5) {
x <- data.frame(c(1:10), c(1,2,3,4,5,6,7,8,9,10), c(10:19))
assign( paste("name", i, sep = "_") , x)
}
Edit:
As you now want to do this in a function, you would have to specify the environment to assign to. I suspect you want the global environment:
testFunction <- function() {
for (i in 1:5) {
x <- data.frame(c(1:10), c(1,2,3,4,5,6,7,8,9,10), c(10:19))
assign( paste("name", i, sep = "_") , x , envir = globalenv() )
}
}
Please be warned that it is not good practice to write a function that modifies the global environment as a side effect. You'd be better off just returning a named list of your data frames, e.g. like so:
testFunction_2 <- function() {
out_list <- vector(mode = "list", length = 5)
for (i in 1:5) {
x <- data.frame(c(1:10), c(1,2,3,4,5,6,7,8,9,10), c(10:19))
out_list[[i]] <- x
names(out_list)[i] <- paste("name", i, sep = "_")
}
return(out_list)
}
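A short usage sketch for the named-list approach. make_frames is a hypothetical compact variant of the function above, and the column names a, b, c are invented so the columns are easier to read.

```r
make_frames <- function(n = 5) {
  out <- lapply(seq_len(n), function(i) {
    data.frame(a = 1:10, b = 1:10, c = 10:19)
  })
  # name the elements name_1 ... name_n
  setNames(out, paste("name", seq_len(n), sep = "_"))
}

frames <- make_frames()
names(frames)
head(frames$name_3)

# if separate objects are really needed, list2env() can unpack the list;
# a fresh environment is used here so the workspace stays untouched
e <- list2env(frames, envir = new.env())
exists("name_5", envir = e, inherits = FALSE)
```

The caller then decides where the data frames end up, which keeps the function itself free of side effects.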

How to save results from for loop on list into a new list under "i" vector name?

I have the following code:
final_results <- list()
myfunc <- function(v1) {
deparse(substitute(v1))
}
for (i in mylist) {
...calculations...
tmp_results <- as.data.frame(cbind(effcrs,weights))
colnames(tmp_results) <- c('efficiency',names(inputs),
names(outputs)) # header
rownames(tmp_results) <- namesDMU[,1]
#Save to list
name_in_list <- myfunc(i)
dea_results[[name_in_list]] <- tmp_results
}
The above code loops through a list of data frames. I would like each result yielded from the loop to be stored in a separate list under the same name as the original file obtained from mylist or i
I tried using deparse(substitute()). When I apply it to an individual item in mylist it looks like this:
myfunc(standard_DEA$'2010-11-11')
[1] "standard_DEA$\"2010-11-11\""
I don't know what the issue is. At the moment it saves everything under the name "i" and replaces all vectors so the end result is a list of 1.
Thank you in advance
This looks like a job for dplyr's do().
library(dplyr)
function_which_returns_dataframe = function(i) {
...calculations...
tmp_results <- as.data.frame(cbind(effcrs,weights))
colnames(tmp_results) <- c('efficiency',names(inputs),
names(outputs)) # header
rownames(tmp_results) <- namesDMU[,1]
tmp_results
}
data_frame(mylist = mylist,
name = names(mylist)) %>%
group_by(mylist, name) %>%
do(function_which_returns_dataframe(.$mylist[[1]]))
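A base R sketch of the same idea, assuming mylist is a named list. Inside for (i in mylist) the variable i holds the element's value, so deparse(substitute(i)) can only ever return "i"; looping over names(mylist) gives you the name directly. The data and the summarise_one function below are stand-ins for the real calculations.

```r
# stand-in data and calculation
mylist <- list(
  "2010-11-11" = data.frame(x = 1:3),
  "2010-11-12" = data.frame(x = 4:6)
)
summarise_one <- function(df) data.frame(total = sum(df$x))

# loop over the names, not the values
dea_results <- list()
for (nm in names(mylist)) {
  dea_results[[nm]] <- summarise_one(mylist[[nm]])
}
names(dea_results)

# lapply() keeps the names of its input, so this is equivalent
dea_results2 <- lapply(mylist, summarise_one)
```

Either form stores each result under the name of the element it came from, so nothing is overwritten.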

extract from list and save to csv

I am struggling to do something that I know should be simple.
I have a list of dataframes like so:
a <- rep(1, 10)
b <- rep(3.6, 10)
foo1 <- cbind(a, b)
d <- rep(2, 8)
b <- rep(4.9, 8)
foo2 <- cbind(d, b)
data <- list(foo1, foo2)
I want to extract the 2nd column from each dataframe, either by indexing or by column name, and save to a csv file using write.table and with the same name as the dataframe. I have tried a lot of things---for loops and lapply and sapply.
I get a variety of error messages, but mostly the following:
In if (file == "") file <- stdout() else if (is.character(file)) { :
the condition has length > 1 and only the first element will be used
which I can't resolve.
I know I'm not indexing properly. Help me please!
You can use a loop to iterate over the elements of data:
for (i in 1:length(data)) {
col <- data[[i]][,2]
fname <- paste("foo", i, ".csv", sep="")
write.table(col,fname)
}
The write.table command will likely need a bit of tweaking, until you get the data in the format you want.
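A sketch of a variant that keeps the original object names by naming the list elements. The warning quoted in the question typically appears when the file argument receives more than one value, so the loop below builds exactly one path per element. Writing to tempdir() and using write.csv() instead of write.table() are choices made for this example.

```r
foo1 <- cbind(a = rep(1, 10), b = rep(3.6, 10))
foo2 <- cbind(d = rep(2, 8), b = rep(4.9, 8))
data <- list(foo1 = foo1, foo2 = foo2)   # named list keeps the labels

out_dir <- tempdir()  # assumption: replace with your own output folder
for (nm in names(data)) {
  col <- data[[nm]][, "b"]                      # second column, selected by name
  out_file <- file.path(out_dir, paste0(nm, ".csv"))
  write.csv(data.frame(b = col), out_file, row.names = FALSE)
}
list.files(out_dir, pattern = "^foo[12]\\.csv$")
```

Selecting by column name works here because both objects have a column called b; use [, 2] instead if the names differ.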
