Loop for on dataset name in R - r

This topic has been covered numerous times I see but I can't really get the answer I'm looking for. Thus, here I go.
I am trying to do a loop to create variables in 5 data sets that have similar names as such:
Ech_repondants_nom_1
Ech_repondants_nom_2
Ech_repondants_nom_3
Ech_repondants_nom_4
Ech_repondants_nom_5
Below if the code that I have tried:
list <- c(1:5)
for (i in list) {
Ech_repondants_nom_[[i]]$sec = as.numeric(Ech_repondants_nom_[[i]]$interviewtime)
Ech_repondants_nom_[[i]]$min = round(Ech_repondants_nom_[[i]]$sec/60,1)
Ech_repondants_nom_[[i]]$heure = round(Ech_repondants_nom_[[i]]$min/60,1)
}
Any clues why this does not work?
cheers!

These are object names and not list elements to subset as Ech_repondants_nom_[[i]]. We may need to get the object by paste i.e.
get(paste0("Ech_repondants_nom_", i)$sec
but, then if we need to update the original object, have to call assign. Instead of all this, it can be done more easily if we load the datasets into a list and loop over the list with lapply
lst1 <- lapply(mget(paste0("Ech_repondants_nom_", 1:5)), function(dat)
within(dat, {sec <- as.numeric(interviewtime);
min <- round(sec/60, 1);
heure <- round(min/60, 1)}))
It may be better to keep it as a list, but if we need to update the original object, use list2env
list2env(lst1, .GlobalEnv)

Ech_repondants_nom_[[i]]
Isn't actually selcting that dataframe because you can't call objects like that. Try creating a function that takes a dataframe as an argument then iterating through the dataframes
changing_time_stamp<-function(df){
df$sec = as.numeric(df$interviewtime)
df$min = round(df$sec/60,1)
df$heure = round(df$min/60,1)
for (i in list) {
changing_time_stamp(i)
}
EDIT: I fixed some of the variable names in the function

Related

Defining multiple dataframes with for loop from list object

Very green R user here. Sorry if this is asked and answered somewhere else, I haven't been able to find anything myself.
I can't figure out why I can't get a for loop to work define multiple new dataframes but looping through a predefined list.
My list is defined from a subset of variable names from an existing dataframe:
varnames <- colnames(dplyr::select(df_response, -1:-4))
Then I want to loop through the list to create a new dataframe for each variable name on the list containing the results of summary function:
for (i in varnames){
paste0("df_",i) <- summary(paste0("df$",i))
}
I've tried variations with and without paste function but I can't get anything to work.
paste0 returns a string. Both <- and $ require names, but you can use assign and [[ instead. This should work:
varnames <- colnames(dplyr::select(df_response, -1:-4))
for (i in varnames){
assign(
paste0("df_", i),
summary(df[[i]])
)
}
Anyway, I'd suggest working with a list and accessing the summaries with $:
summaries <- df_response |>
dplyr::select(-(1:4)) |>
purrr::imap(summary)
summaries$firstvarname

get() not working for column in a data frame in a list in R (phew)

I have a list of data frames. I want to use lapply on a specific column for each of those data frames, but I keep throwing errors when I tried methods from similar answers:
The setup is something like this:
a <- list(*a series of data frames that each have a column named DIM*)
dim_loc <- lapply(1:length(a), function(x){paste0("a[[", x, "]]$DIM")}
Eventually, I'll want to write something like results <- lapply(dim_loc, *some function on the DIMs*)
However, when I try get(dim_loc[[1]]), say, I get an error: Error in get(dim_loc[[1]]) : object 'a[[1]]$DIM' not found
But I can return values from function(a[[1]]$DIM) all day long. It's there.
I've tried working around this by using as.name() in the dim_loc assignment, but that doesn't seem to do the trick either.
I'm curious 1. what's up with get(), and 2. if there's a better solution. I'm constraining myself to the apply family of functions because I want to try to get out of the for-loop habit, and this name-as-list method seems to be preferred based on something like R- how to dynamically name data frames?, but I'd be interested in other, more elegant solutions, too.
I'd say that if you want to modify an object in place you are better off using a for loop since lapply would require the <<- assignment symbol (<- doesn't work on lapply`). Like so:
set.seed(1)
aList <- list(cars = mtcars, iris = iris)
for(i in seq_along(aList)){
aList[[i]][["newcol"]] <- runif(nrow(aList[[i]]))
}
As opposed to...
invisible(
lapply(seq_along(aList), function(x){
aList[[x]][["newcol"]] <<- runif(nrow(aList[[x]]))
})
)
You have to use invisible() otherwise lapply would print the output on the console. The <<- assigns the vector runif(...) to the new created column.
If you want to produce another set of data.frames using lapply then you do:
lapply(seq_along(aList), function(x){
aList[[x]][["newcol"]] <- runif(nrow(aList[[x]]))
return(aList[[x]])
})
Also, may I suggest the use of seq_along(list) in lapply and for loops as opposed to 1:length(list) since it avoids unexpected behavior such as:
# no length list
seq_along(list()) # prints integer(0)
1:length(list()) # prints 1 0.

Append in a for loop

I am trying to create such list with value from different data frame called kc2 to kc10. anyone provide me some advice how to formulate this for loop?
sum_square=append(sum_square,weighted.mean(x=kc2$withinss,w=kc2$size, na.rm=TRUE))
I tried something like this but didnt work:
for (i in 2:10){
nam1 = paste0("kc",i,"$withinss")
nam2 = paste0("kc",i,"$size")
sum_square = append(sum_square, lapply(c(as.numeric(nam1),as.numeric(nam2)), weighted.mean))
}
There are a lot of problems with the code you posted, so I'll just cut right to the point. In R, when you want to apply a function to multiple objects and collect the result, you should be thinking of using lapply. lapply loops through a list of objects (you can put your data frames into a list), applies the chosen function to each, and then returns the result of each as a list. The below code is in the form of what you want:
# Add data frames to list by name
list_of_data_frames <- list(kc2, kc3, kc4, kc5, kc6, kc7, kc8, kc9, kc10)
# OR add them programatically
list_of_data_frames <- mget(paste0('kc', seq.int(from = 2, to = 10)))
result <- lapply(list_of_data_frames,
function(x) weighted.mean(x = x$withiniss, w = x$size, na.rm=TRUE))

Apply custom function to any dataset with common name

I have a custom function that I want to apply to any dataset that shares a common name.
common_funct=function(rank_p=5){
df = ANY_DATAFRAME_HERE[ANY_DATAFRAME_HERE$rank <rank_p,]
return(df)
}
I know with common functions I could do something like below to get the value of each.
apply(mtcars,1,mean)
But what if I wanted to do :
apply(any_dataset, 1, common_funct(anyvalue))
How would I pass that along?
library(dplyr)
mtcars$rank = dense_rank(mtcars$mpg)
iris$rank = dense_rank(iris$Sepal.Length)
Now how would I go about applying my same function to both values?
If I understand you question, I would suggest putting you data frames into a list and apply over it. So
## Your example function
common_funct=function(df, rank_p=5){
df[df$rank <rank_p,]
}
## Sanity check
common_funct(mtcars)
common_funct(iris)
Next create a list of the data frames
l = list(mtcars, iris)
and use lapply
lapply(l, common_funct)

How can I assign a function to a variable that its name should be made by paste function? [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
How to name variables on the fly in R?
I have 10 objects of type list named: list1, list2, ..., list10. And I wanted to use the output from my function FUN to replace the first index of these list variables through a for loop like this:
for (i in 1:10){
eval(parse(test=(paste("list",i,sep=""))))[[1]] <- FUN()
}
but this does not work. I also used lapply in this way and again it is wrong:
lapply(eval(parse(text=(paste("list",i,sep=""))))[[1]], FUN)
Any opinion would be appreciated.
This is FAQ 7.21. The most important part of that FAQ is the last part that says not to do it that way, but to put everything into a list and operate on the list.
You can put your objects into a list using code like:
mylists <- lapply( 1:10, function(i) get( paste0('list',i) ) )
Then you can do the replacement with code like:
mylists <- lapply( mylists, function(x) { x[[1]] <- FUN()
x} )
Now if you want to save all the lists, or delete all the lists, you just have a single object that you need to work with instead of looping through the names again. If you want to do something else to each list then you just use lapply or sapply on the overall list in a single easy step without worrying about the loop. You can name the elements of the list to match the original names if you want and access them that way as well. Keeping everything of interest in a single list will also make your code safer, much less likely to accidentilly overwrite or delete another object.
You probably want something like
for (i in 1:10) {
nam <- paste0("list", i) ## built the name of the object
tmp <- get(nam) ## fetch the list by name, store in tmp
tmp[[1]] <- FUN() ## alter 1st component of tmp using FUN()
assign(nam, value = tmp) ## assign tmp back to the current list
}
as the loop.

Resources