How do I make dataframes in a for loop in R?

I want to create dataframes in a for loop where every dataframe gets a value specified in a vector. It seems very simple but for some reason I cannot find the answer.
So what I want is something like this:
x <- c(1,2,3)
for (i in x) {
df_{{i}} <- ""
return df_i
}
The result I want is:
df_1
df_2
df_3
So df_{{i}} should be something else but I don't know what.
EDIT: I have solved my problem by creating a list of lists like this:
function_that_creates_model_output <- function(var) {
output_function <- list()
output_function$a <- df_a %>% something(var)
output_function$b <- df_b %>% something(var)
return(output_function)
}
meta_output <- list()
for (i in x) {
meta_output[[i]] <- function_that_creates_model_output(var = i)
}

One solution is to use the function assign, which creates a variable from a name built as a string:
x <- c(1,2,3)
for (i in x) {
assign(x = paste0("df_",i),value = NULL)
}
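A common alternative to assign is to keep the data frames in a named list, which avoids scattering variables across the global environment. A minimal sketch, assuming each data frame simply holds its value in a hypothetical column called value:

```r
x <- c(1, 2, 3)

# Build one data frame per element of x; the column name `value` is illustrative
dfs <- lapply(x, function(i) data.frame(value = i))

# Name the list elements df_1, df_2, df_3
names(dfs) <- paste0("df_", x)

# Access a single data frame by name
dfs[["df_2"]]
```

If separate variables are really needed afterwards, list2env(dfs, .GlobalEnv) would create them from the list names.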

Related

How to rewrite as a loop when I have identical frames for different years and the year is in the name?

I am new, so this question is a bit basic, but it might help others get a good start as well...
How to rewrite the below as a loop and have it include the years in the new names, as below...
DFNUM2011 = DF2011[,!(names(DF2011) %in% mydummies)]
DFNUM2012 = DF2012[,!(names(DF2012) %in% mydummies)]
DFNUM2013 = DF2013[,!(names(DF2013) %in% mydummies)]
I tried
df.list<-list("2011","2012","2013")
> for (i in df.list){
+ DFNUM[[i]] = DF[[i]][,!(names(DF2011) %in% mydummies)]
+ }
Error in DF : object 'DF' not found
This can work:
#List
List <- list(DF2011,DF2012,DF2013)
#Loop
for (i in seq_along(List))
{
List[[i]] = List[[i]][,!(names(List[[i]]) %in% mydummies)]
}
A working example can be:
#Example
List <- list(iris,mtcars)
mydummies <- c('Species','mpg')
#Loop
for (i in seq_along(List))
{
List[[i]] = List[[i]][,!(names(List[[i]]) %in% mydummies)]
}
And a more compact way without loops:
#Code
List <- lapply(List, function(x) x[,!(names(x) %in% mydummies)])
You can use purrr:
library(purrr)
n <- 2011:2013
result <- map(mget(paste0('DF', n)), ~keep(.x, !(names(.x) %in% mydummies)))
If you want to create new dataframes with different names in your global environment:
names(result) <- paste0('DFNUM', n)
list2env(result, .GlobalEnv)
This should create DFNUM2011, DFNUM2012 and DFNUM2013 dataframes.
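The same idea can be sketched in base R without purrr. The three small data frames and the dummy column name below are made-up stand-ins for the real data, used only so the example runs on its own:

```r
# Toy stand-ins for the yearly data frames (assumed structure, illustration only)
DF2011 <- data.frame(id = 1:2, dummy1 = c(0, 1), income = c(10, 20))
DF2012 <- data.frame(id = 3:4, dummy1 = c(1, 1), income = c(30, 40))
DF2013 <- data.frame(id = 5:6, dummy1 = c(0, 0), income = c(50, 60))
mydummies <- c("dummy1")

years <- 2011:2013

# Fetch the data frames by name into a named list
dfs <- mget(paste0("DF", years))

# Drop the dummy columns; drop = FALSE keeps a data frame even if one column remains
result <- lapply(dfs, function(d) d[, !(names(d) %in% mydummies), drop = FALSE])

# Rename the list elements with the DFNUM prefix
names(result) <- paste0("DFNUM", years)
names(result)
```

Keeping the results in the named list (result$DFNUM2012 and so on) is usually easier to loop over later than creating separate variables.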

How can I make a loop that calls dataframes

I have written the code below to transform the rows of a dataframe into columns:
RowsToColums <- function(df)
{
model = list()
for(i in seq_along(df))
{
if(i>4)
{
dataf <- data.frame(names = df[1], Year=colnames(df[i]), index = df[,i:i])
names(dataf)[3]<- toString(df[[3]][2])
names(dataf)[1]<- "Country"
model[[i]] <- dataf
}
}
df <- do.call(rbind, model)
df <- arrange(df, Country)
}
EC_Pop <- RowsToColums(EC_Pop)
EC_GDP <- RowsToColums(EC_GDP)
EC_Inflation <- RowsToColums(EC_Inflation)
ST_Tech_Exp <- RowsToColums(ST_Tech_Exp)
ST_Res_Jour <- RowsToColums(ST_Res_Jour)
ST_Res_Exp <- RowsToColums(ST_Res_Exp)
ST_Res_Pop <- RowsToColums(ST_Res_Pop)
ED_Unempl <- RowsToColums(ED_Unempl)
ED_Edu_Exp <- RowsToColums(ED_Edu_Exp)
But as you can see, I call the same function many times.
I tried to put all these dataframes in a list like this
list_a = list(EC_Pop,EC_GDP,EC_Inflation,ST_Tech_Exp,ST_Res_Exp)
for (i in seq_along(list_a))
{
list_a[i] <- RowsToColums(list_a[i])
}
and to write a loop that takes each dataframe in turn, but it fails with an error:
UseMethod ("arrange_") error:
Inapplicable method for 'arrange_' applied to object of class "NULL"
Does anybody know how to fix this case?
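One likely cause, offered as a sketch and not a confirmed diagnosis: list_a[i] returns a one-element list, not the dataframe itself. Inside the function seq_along(df) is then just 1, the i>4 branch never runs, model stays empty, and do.call(rbind, model) returns NULL. Double brackets extract the dataframe. The function count_cols below is hypothetical and only stands in for RowsToColums:

```r
# Hypothetical stand-in for RowsToColums: reports how many columns it sees
count_cols <- function(df) length(seq_along(df))

list_a <- list(iris, mtcars)

# Single brackets return a list of length 1, so the function sees one "column"
count_cols(list_a[1])

# Double brackets return the data frame itself, with all of its columns
count_cols(list_a[[1]])

# lapply passes each data frame (not a sub-list) to the function
list_a <- lapply(list_a, head, n = 3)
```

With the original function, the loop body would then read list_a[[i]] <- RowsToColums(list_a[[i]]), or the whole loop could become list_a <- lapply(list_a, RowsToColums).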

R loop to create data frames with 2 counters

What I want is to create 60 data frames with 500 rows in each. I tried the below code and, while I get no errors, I am not getting the data frames. However, when I do a View on the as.data.frame, I get the view, but no data frame in my environment. I've been trying for three days with various versions of this code:
getDS <- function(x){
for(i in 1:3){
for(j in 1:30000){
ID_i <- data.table(x$ID[j: (j+500)])
}
}
as.data.frame(ID_i)
}
getDS(DATASETNAME)
We can use outer (on a small example)
out1 <- c(outer(1:3, 1:3, Vectorize(function(i, j) list(x$ID[j:(j + 5)]))))
lapply(out1, as.data.table)
--
The issue in the OP's function is that ID_i is overwritten on every iteration of the loop, so earlier results are never stored. To keep them, we can initialize a list and store each result in it:
getDS <- function(x) {
ID_i <- vector('list', 3)
for(i in 1:3) {
for(j in 1:3) {
ID_i[[i]][[j]] <- data.table(x$ID[j:(j + 5)])
}
}
ID_i
}
do.call(c, getDS(x))
data
x <- data.table(ID = 1:50)
I'm not sure the description matches the code, so I'm a little unsure what the desired result is. That said, it is usually not helpful to split a data.table because the built-in by-processing makes it unnecessary. If for some reason you do want to split into a list of data.tables you might consider something along the lines of
getDS <- function(x, n=5, size = nrow(x)/n, column = "ID", reps = 3) {
x <- x[1:(n*size), ..column]
index <- rep(1:n, each = size)
replicate(reps, split(x, index),
simplify = FALSE)
}
getDS(data.table(ID = 1:20), n = 5)
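If the goal really is 60 consecutive, non-overlapping data frames of 500 rows each, base R's split can do it in one step. This is a sketch under that assumption; the 30000-row toy data is invented so that 30000 / 500 = 60:

```r
# Toy data: 30000 IDs (assumed size, chosen so that 30000 / 500 = 60)
x <- data.frame(ID = 1:30000)

chunk_size <- 500

# Chunk index for each row: 1 for rows 1-500, 2 for rows 501-1000, ...
chunk_id <- ceiling(seq_len(nrow(x)) / chunk_size)

# Split into a list of data frames, one per chunk
chunks <- split(x, chunk_id)

length(chunks)
nrow(chunks[[1]])
```

The result is a list, so an individual chunk is reached with chunks[[k]] instead of a separately named variable.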

creating functions with arguments as column names

I'm trying to create a function that will do a pairwise comparison between the values of one column and another and create a new vector depending on those values. I cannot work out how to make two of the arguments column names, so that they can be changed and the function reused on another set of columns.
The specific situation is there are four columns of coloured band labels for a parent bird (pbc1...pbc4) and another four for its chick(obc1...obc4). The band columns are columns of characters such as 'G' 'PG' 'B' etc.
This is the code for the first part of my function, which I will extend to include all pairwise comparisons once I get this running:
colourdistance1 <- function(df, refcoldistdf, pbc, obc){
n <- length(pbc)
coldist1 <- rep(NA,n)
for(i in 1:n){
if(pbc[i]==obc[i]){
coldist1[i] <- 0
} else if(pbc[i]=='M'|obc[i]=='M'){
coldist1[i] <- NA
} else if(pbc[i]=='G'& obc[i]=='PG'| obc[i]=='G'& pbc[i]=='PG'){
coldist1[i] <- refcoldistdf[2,2]
} else {
coldist1[i] <- NA
}
}
}
p1o1 <- colourdistance1(bd_df, refcoldistdf,pbc = pbc1, obc = obc1)
This call just returns the object p1o1 as being NULL
I have also tried:
colourdistance1 <- function(df, refcoldistdf, pbc, obc){
n <- length(pbc)
coldist1 <- rep(NA,n)
for(i in 1:n){
if(df$pbc[i]==df$obc[i]){
coldist1[i] <- 0
} else if(df$pbc[i]=='M'|df$obc[i]=='M'){
coldist1[i] <- NA
} else if(df$pbc[i]=='G'& df$obc[i]=='PG'| df$obc[i]=='G'& df$pbc[i]=='PG') {
coldist1[i] <- refcoldistdf[2,2]
} else {
coldist1[i] <- NA
}
}
}
But that just gives this error:
Error in if (df$pbc[i] == df$obc[i]) { : argument is of length zero
I have tried all the code outside the function, inserting the column names and index number and df name and it all works. This makes me think I have an issue with the function arguments not connecting to the function code as I intended.
Any help will be appreciated!!
Reproducible test data:
pbc1 <- c('B','W','G','R')
obc1 <- c('Y','W','PG','FP')
pbc2 <- c('W','W','W','M')
obc2 <- c('M','W','R','R')
pbc3 <- c('W','K','FP','K')
obc3 <- c('G','PG','B','PB')
pbc4 <- c('K','K','B','M')
obc4 <- c('K','PG','W','M')
testbanddf <- cbind(pbc1,obc1,pbc2,obc2,pbc3,obc3,pbc4,obc4)
testrefcoldist <- diag(11)
So there are quite a few comments to make, but first, you might try this:
pbc1 <- c('B','W','G','R')
obc1 <- c('Y','W','PG','FP')
pbc2 <- c('W','W','W','M')
obc2 <- c('M','W','R','R')
pbc3 <- c('W','K','FP','K')
obc3 <- c('G','PG','B','PB')
pbc4 <- c('K','K','B','M')
obc4 <- c('K','PG','W','M')
testbanddf <- data.frame(pbc1,obc1,pbc2,obc2,pbc3,obc3,pbc4,obc4)
testrefcoldist <- diag(11)
colourdistance1 <- function(df, refcoldistdf, pbc, obc){
n <- nrow(df)
coldist1 <- rep(NA,n)
pbc <- df[[pbc]]
obc <- df[[obc]]
for(i in 1:n){
if(pbc[i]==obc[i]){
coldist1[i] <- 0
} else if(pbc[i]=='M'|obc[i]=='M'){
coldist1[i] <- NA
} else if(pbc[i]=='G'& obc[i]=='PG'| obc[i]=='G'& pbc[i]=='PG'){
coldist1[i] <- refcoldistdf[2,2]
} else {
coldist1[i] <- NA
}
}
coldist1
}
colourdistance1(testbanddf, testrefcoldist,pbc = "pbc1", obc = "obc1")
cbind() creates a matrix, not a data frame. You create data frames with the function data.frame().
The simplest way forward is to make the arguments pbc and obc be characters representing the column names.
Referring to data frame columns using $ is useful when working interactively, but isn't so useful (as you discovered) when writing functions where you don't know the column names in advance. In that case, use [[, which can select columns by name or position.
Your function as written didn't explicitly return coldist1.
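The difference between $ and [[ can be seen in a short sketch. The two-row data frame and the variable name col are illustrative only:

```r
df <- data.frame(pbc1 = c("B", "W"), obc1 = c("Y", "W"),
                 stringsAsFactors = FALSE)

col <- "pbc1"

# `$` looks for a column literally named "col", which does not exist
df$col        # NULL

# `[[` evaluates the variable and looks up the column it names
df[[col]]     # "B" "W"
```

Because df$col is NULL, indexing it gives a zero-length value, which is what produces the "argument is of length zero" error inside if().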

How to input data into data frame using nested for loop in R

Using the following code, I can print the values iterating each for loop.
for(i in 5:12)
{
for(j in 5:12)
{
for(k in 5:12)
{
for(l in 5:12)
{
cat(i,j,k,l,'\n')
}
}
}
}
Now I want to store the output in a data frame df with 4 numeric columns (a,b,c,d). The only approach I know is the following code, which has just a single 'for' in it.
f3 <- function(n){
df <- data.frame(x = numeric(n), y = numeric(n))
for(i in 1:n){
df$x[i] <- i
df$y[i] <- i
}
df
}
How can I fill a data frame while using nested for loops? Thank you.
You should try expand.grid:
a <- 5:12
df <- expand.grid(a,a,a,a)
names(df) <- c("a","b","c","d")
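One detail worth checking: expand.grid varies its first argument fastest, whereas in the nested loops the innermost variable (l, i.e. column d) changes fastest. If the row order should match the printed loop output, a sketch along these lines reverses the arguments and then reorders the columns:

```r
a <- 5:12

# List the variables in reverse so that d varies fastest, as the innermost loop does
df <- expand.grid(d = a, c = a, b = a, a = a)

# Put the columns back in a, b, c, d order
df <- df[, c("a", "b", "c", "d")]

head(df, 3)
nrow(df)
```

Both versions contain the same 8^4 = 4096 combinations; only the row order differs.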
