How to loop local functions? - r

I wrote a function in R that I would like to run in a loop. The function works for a single case, but when I call it inside a loop I can't get back the vectors of numbers it produces.
vec_fun5 <- function(x,y){
  Vec <- c(round(mean(x[[y]],na.rm=TRUE),2),
           nrow(na.omit(x[,y])),
           length(which(x[,y]==1)),
           length(which(x[,y]==2)),
           length(which(x[,y]==3)),
           length(which(x[,y]==4)),
           length(which(x[,y]==5)))
  return(Vec)
}
for(i in 20:24){
  vec_fun5(x,i)
}
I would like to produce a data frame with all of the vectors produced by the loop.

Your loop calls the function but never stores what it returns, so each result is discarded. Try saving the vectors in a list:
vec_save <- list()
ii <- 1
for(i in 20:24){
  vec_save[[ii]] <- vec_fun5(x,i)
  ii <- ii+1
}
After that, if you would like to cbind or rbind the vectors into a single data frame, you can run the following. Note that cbind on plain vectors returns a matrix, so it is wrapped in as.data.frame:
df <- as.data.frame(do.call("cbind", vec_save)) # binds by column; use "rbind" for one row per vector
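As an alternative sketch, sapply() can collect the vectors without a manual counter. The data frame x below is made-up sample data (1 to 5 ratings with some NAs), and vec_fun5 is rewritten slightly so that it runs on a plain data.frame. Both are assumptions for illustration, not the asker's real data or exact function.

```r
# Hypothetical sample data: 24 columns of 1-5 ratings with a few NAs
set.seed(1)
x <- as.data.frame(replicate(24, sample(c(1:5, NA), 50, replace = TRUE)))

# Same summary idea as vec_fun5: mean, non-missing count, then counts of 1..5
vec_fun5 <- function(x, y) {
  col <- x[[y]]
  counts <- sapply(1:5, function(k) sum(col == k, na.rm = TRUE))
  c(round(mean(col, na.rm = TRUE), 2), sum(!is.na(col)), counts)
}

# sapply returns one column per call; t() flips that to one row per call
res <- as.data.frame(t(sapply(20:24, function(i) vec_fun5(x, i))))
names(res) <- c("mean", "n", paste0("count_", 1:5))
res$column <- names(x)[20:24]
print(res)
```

Each row of res then describes one of the columns 20 to 24, which is usually easier to read than one column per vector.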

Related

In R, how do I save space when running a function on 192 dataframes?

I have around 192 CSVs that I have converted to dataframes. I would like to be able to put the names of each dataframe in a vector and then run a for loop through the vector like so:
for (i in seq_along(vector)){
  vector[i] <- f1(vector[i])
}
or just pass through the vector into the function like so: f1(vector).
If the vector is full of integers or strings, I can put the vector through a function and it will work fine. For example:
squared <- function(x) {
return(x*x)
}
This will work with a vector c(1,2,3,4,5) and return c(1,4,9,16,25). Otherwise, I have to write 124 lines of code for each function I want to run.
Your advice would be greatly appreciated, please.
I think the most Rtistic way of doing it would be to have all your dataframes in a list to start with. For instance,
df1 <- mtcars
df2 <- mtcars
df3 <- mtcars
frames <- grep('^df[0-9]+$', ls(), value = TRUE) # anchored so only df1, df2, ... match
frame_list <- lapply(frames, get)
gets you there. Now you can apply whatever function you want to each dataframe in a lapply call. So, for instance, if you wanted to get all the squared values of mpg, you could write
frame_adj <- lapply(frame_list, function(x) x$mpg * x$mpg )
The above gives you all the squared values of mpg from the original dataframes, but does not keep the other columns. If you prefer to keep the other values, simply adjust your function to return the dataframe, e.g.
frames_with_squared_mpg <-
  lapply(frame_list, function(x) {
    x$mpg_sq <- x$mpg * x$mpg
    return(x)
  })
will get you there.
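If the data frames come from CSV files anyway, a sketch like the following reads them straight into a list and skips the step of collecting variable names. The folder, the file names and the function f1 are hypothetical stand-ins. The example writes three small files to a temporary folder first so that it runs on its own.

```r
# Hypothetical setup: write three small CSV files to a temporary folder
csv_dir <- file.path(tempdir(), "csv_demo")
dir.create(csv_dir, showWarnings = FALSE)
for (nm in c("a", "b", "c")) {
  write.csv(head(mtcars), file.path(csv_dir, paste0(nm, ".csv")), row.names = FALSE)
}

# Read every CSV in the folder into one named list of data frames
files <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)
frame_list <- lapply(files, read.csv)
names(frame_list) <- tools::file_path_sans_ext(basename(files))

# Hypothetical f1: adds a squared mpg column and returns the data frame
f1 <- function(d) {
  d$mpg_sq <- d$mpg^2
  d
}

# One call applies f1 to all data frames, however many there are
results <- lapply(frame_list, f1)
str(results$a$mpg_sq)
```

Because the list is named after the files, results$a or results[["a"]] retrieves the processed version of a.csv.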

How do I alias a column name in a for loop?

I'm making a function and I'd like to call a column in a particular way.
Initialize data
a <- c(1,2,3,4,5)
b <- c(6,7,8,9,10)
c <- c(1,2,3,4,5)
d <- c(6,7,8,9,10)
df <- data.frame(a,b,c,d)
Call column for the table function
Func <- function(df){
  X <- df
  Y <- names(X)
  for(i in 1:2){
    table(X$___,X$___)
  }
}
The trouble is I don't know how to call the columns.
I'd like it to be the equivalent to table(X$a, X$b) as it iterates through the loop.
I tried this and it didn't work
for(i in 1:2){
Q <- Y[i]
W <- Y[j]
table(X$Q,X$W)
}}
It is necessary for a function I'm using that I make a table with the form table(X$a, X$b) and I don't know quite how to achieve that in a for loop?
Instead of referring to the columns by name, you can select them by index, so you don't have to worry about how to call them.
Replace your for loop with
table(df[1:2])
which gives the same result as table(df$a, df$b).
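Use double brackets [[ ]] to get the content of a column. Inside a for loop nothing is printed automatically, so wrap the expression in print() to see it:
df <- datasets::mtcars
for (i in 1:2) print(df[[i]])
This also works with column names:
for (i in names(df)) print(df[[i]])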
Not sure what you are trying to achieve though. You can also just do:
lapply(df[1:2], table)
You can also loop through the columns by index. The following code loops through the columns of the iris dataset:
for(i in seq_len(ncol(iris) - 1)){ # stop one short so that i+1 stays in range
  print(iris[,i]) # to get a single column
  print(iris[,c(i,i+1)]) # to get two adjacent columns
}
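To get the equivalent of table(X$a, X$b) inside a loop, a minimal sketch could look like this. The helper name pair_tables and the choice to pair each column with the next one are assumptions, since the question does not say which pairs are needed.

```r
df <- data.frame(a = c(1, 2, 3, 4, 5),
                 b = c(6, 7, 8, 9, 10),
                 c = c(1, 2, 3, 4, 5),
                 d = c(6, 7, 8, 9, 10))

# Hypothetical helper: cross-tabulate each column with the next one
pair_tables <- function(X) {
  Y <- names(X)
  out <- list()
  for (i in seq_len(length(Y) - 1)) {
    # X[[name]] looks the column up by the string stored in Y[i];
    # X$Q would instead look for a column literally called "Q"
    out[[paste(Y[i], Y[i + 1], sep = "_")]] <-
      table(X[[Y[i]]], X[[Y[i + 1]]], dnn = c(Y[i], Y[i + 1]))
  }
  out
}

tabs <- pair_tables(df)
names(tabs)
tabs$a_b
```

The tables are collected in a list because a for loop does not print or return intermediate results on its own.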

Stepwise reducing the input dataframe in a loop

I need to iteratively evaluate the variance of a dataset while I reduce the data.frame row by row in each step. As an example:
data <- matrix(runif(100),10,10)
perc <- list()
sums <- sum(data)
for (i in 1:nrow(data)) {
  data <- data[-1,]
  perc[[i]] <- sum(data)/sums # in reality, here are ~8 additional lines of code
}
I don't like that data is overwritten in every step, and that the loop breaks with an error when data is emptied.
So the questions are:
1. How can I express data <- data[-1,] in an incrementing way (something like tmp <- data[-c(1:i),], which doesn't work)?
2. Is there a way to stop the loop before the last row is removed from data?
You could try
set.seed(123)
data <- matrix(runif(100),10,10)
sums <- sum(data)
perc <- lapply(2:nrow(data),function(x) sum(data[x:nrow(data),,drop=FALSE])/sums)
The above code yields the same values as your original loop, but without the error message and without modifying data. To check this, run the original logic on a copy of data and stop before the last row is removed (drop=FALSE keeps a single remaining row as a matrix):
tmp <- data
perc1 <- list()
for (i in 1:(nrow(data)-1)) {
  tmp <- tmp[-1,,drop=FALSE]
  perc1[[i]] <- sum(tmp)/sums
}
identical(perc,perc1)
#[1] TRUE
If you wish to preserve the for loop in order to perform other calculations within the loop, you could try:
for (i in 2:nrow(data)) {
  perc[[i-1]] <- sum(data[i:nrow(data),])/sums
  # do more stuff here
}
identical(perc,perc1)
#[1] TRUE
If you are using the loop index i for other calculations within the loop, you will most probably need to replace it with i-1. It may depend on what is calculated.
You can use lapply
res <- lapply(2:nrow(data), function(i)sum(data[i:nrow(data),])/sums)
You can write the loop part like this:
for (i in 2:nrow(data)) {
  perc[[i - 1]] <- sum(data[i:nrow(data),])/sums # in reality, here are ~8 additional lines of code
}
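If the quantity of interest really is a sum over the remaining rows, a vectorized sketch with cumsum() avoids both the loop and the repeated subsetting. This only helps when the other ~8 lines can also be expressed cumulatively, which is an assumption here. The names tail_sums, perc_vec and perc_loop are made up for the example.

```r
set.seed(123)
data <- matrix(runif(100), 10, 10)
sums <- sum(data)

# Sum of rows i..n for every i, computed once: reverse, accumulate, reverse back
tail_sums <- rev(cumsum(rev(rowSums(data))))

# Drop the first entry (all rows) to match the loop, which starts after removing row 1
perc_vec <- tail_sums[-1] / sums

# Reference values from explicit subsetting, for comparison
perc_loop <- vapply(2:nrow(data),
                    function(i) sum(data[i:nrow(data), , drop = FALSE]) / sums,
                    numeric(1))
all.equal(perc_vec, perc_loop)
```

all.equal() is used instead of identical() because the two routes add the numbers in a different order, so tiny floating point differences are possible.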

Loop used to create multiple vectors from data frame columns

I would like to create a vector from each column of the mtcars data frame. I need two solutions: the first should use a loop and the second, if possible, should work without one.
The desired output should look like this:
vec_1 <- mtcars[,1]
vec_2 <- mtcars[,2]
etc...
I tried to write a loop but failed. Can you tell me what is wrong with this loop?
vec <- c()
for (i in 1:2){
assign(paste("vec",i,sep="_" <- mtcars[,i][!is.na(mtcars[,i])]
}
I need to remove possible NAs from my data that's why I put it in the example.
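Your loop is missing the closing parentheses of paste() and assign(), and assign() takes the value as its second argument rather than through <-. You should also assign the vector to the global environment of your R session, like so:
for (i in 1:2) {
  assign(sprintf("vec_%d", i), mtcars[!is.na(mtcars[[i]]), i], envir = .GlobalEnv)
}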
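Strictly speaking, every approach iterates over the columns somehow, but you can avoid an explicit for loop by building a named list with lapply() and copying it into the environment with list2env().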
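A minimal sketch of a lapply() plus list2env() route, which avoids an explicit for loop (lapply still iterates internally). The names vec_list and target are made up for the example, and a fresh environment is used so the sketch does not touch your workspace.

```r
# Build a named list with one NA-free vector per column
vec_list <- lapply(mtcars, function(col) col[!is.na(col)])
names(vec_list) <- paste0("vec_", seq_along(vec_list))

# Optional: copy the list elements into an environment as separate variables.
# Pass .GlobalEnv instead of target to mirror the assign() loop.
target <- new.env()
invisible(list2env(vec_list, envir = target))

# mtcars has no missing values, so vec_1 equals the first column
identical(get("vec_1", envir = target), mtcars[, 1])
```

In many cases it is simpler to stop after the first two lines and work with vec_list$vec_1, vec_list$vec_2 and so on, since a list can be passed to lapply() again later.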

How to create and add new columns to a dataframe in R within a loop?

I want to create a new dataframe and keep adding variables to it within a for loop in R. Here is pseudocode for what I want:
X <- 100
for(i in 1:X)
{
  #do some processing and store the result in a variable named "temp_var"
  temp_var[,i] <- z
  #make it a dataframe and keep adding the new variables until the loop completes
}
After the loop completes, "temp_var" should be a dataframe containing 100 variables.
I would avoid for loops in R where possible. If you know your columns beforehand, lapply is what you want to use.
However, if your program constrains you to a loop, you can do something like this (your_input_column stands for whatever vector you compute in iteration i):
X <- 100
tmp_list <- list()
for (i in 1:X) {
  tmp_list[[i]] <- your_input_column
}
data <- data.frame(tmp_list)
names(data) <- paste0("col", 1:X)
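For comparison, here is a runnable sketch of the lapply route. The function make_column and the row count n_rows are hypothetical placeholders for whatever processing produces each column.

```r
X <- 100
n_rows <- 5  # hypothetical number of rows

# Hypothetical per-column computation: column i holds i * (1..n_rows)
make_column <- function(i) i * seq_len(n_rows)

# Build all columns in one pass, name them, then combine into a data frame
cols <- lapply(seq_len(X), make_column)
names(cols) <- paste0("col", seq_len(X))
temp_var <- as.data.frame(cols)

dim(temp_var)
```

Building the list first and converting once at the end avoids growing a data frame column by column, which copies the whole object on every iteration.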
