How to pass the i index value to a command in R?

Fairly new to using R; would appreciate any help. I want to pass the i value from a for loop to an R command. This is what I am stuck with.
for i in colnames(df) table(df$i)
The i value is not being used the way I hoped.
If I run table(df$col1), where col1 is the name of a column in df, I get a result. i holds "col1".
Why doesn't it work and how can I fix this?

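We need to use [[ instead of $, because $ does not evaluate its argument: df$i looks for a column literally named i, whereas df[[i]] looks up the column whose name is stored in i. The for loop header also needs parentheses, as in for(i in colnames(df)). Finally, values computed inside a for loop are not auto-printed, so wrap the call in print() to see the result in the console.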
for(i in colnames(df)) print(table(df[[i]]))
Or, if we want to store the output, initialize a list and assign each result into it:
lst1 <- vector('list', ncol(df))
names(lst1) <- colnames(df)
for(i in colnames(df)) lst1[[i]] <- table(df[[i]])
The more idiomatic R approach is to use one of the *apply functions:
lapply(df, table)
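To make the difference concrete, here is a minimal sketch on a made-up two-column data frame. The data and the names i and tabs are assumptions for illustration, not part of the question.

```r
# Toy data frame (hypothetical data, for illustration only)
df <- data.frame(col1 = c("a", "b", "a"), col2 = c("x", "x", "y"))

i <- "col1"

# `$` takes its argument literally: this looks for a column named "i"
dollar_result <- df$i              # NULL, because no such column exists

# `[[` evaluates i first, so this extracts the column named "col1"
bracket_result <- df[[i]]

# One table per column, returned as a list named after the columns
tabs <- lapply(df, table)
tabs$col1
```

Keeping the tables in a named list such as tabs means each result can be retrieved later by column name.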

Related

colnames and mutate on multiple dataframes

I have a problem with cleaning up my code. I understand I could type this all out but we don't want that obviously.
I have only dataframes in my global environment. They are all "data.frame".
I want to check the dimensions of all of them and put that in a tibble. I managed that somehow. I also would like to change their colnames() tolower() which works easy if I just type the name of the data.frame, but there's more than 2 and I want it done automatically. Then I also want to mutate all data.frames in the same way.
Small example of my code:
library(tidyverse)
x <- data.frame(letters[1:2]) #To create the data
y <- data.frame(letters[3:4])
dfs <- as.list(ls()) #I take whatever is in my environment
I managed below to get a tibble of the dimensions:
z <- as_tibble(lapply(seq_along(dfs),
function(j) dim(get(dfs[[j]]))), .name_repair = "unique")
colnames(z) <- dfs
Now for the colnames of all the data.frames stored in my list I basically want to perform this code:
colnames(dfs[[1]]) <- tolower(colnames(dfs[[1]]))
but that returns NULL as I found out earlier. So I used get() in there to make it work for the dimensions. But if I use get() to assign colnames it says it can't find function "get<-".
Since all colnames for all dataframes are the same (just different nrows()) I could save the lowercase colnames as a value and use that, but that doesn't change the fact that it can't find the get<- function.
names <- tolower(colnames(x))
sapply(seq_along(dfs),
function(j) colnames(get(dfs[[j]])) <- names)
Error in colnames(get(dfs[[j]])) <- names :
could not find function "get<-"
As for the mutating part, I tried a for loop:
for(i in seq_along(dfs)){
get(dfs[[i]]) <- get(dfs[[i]]) %>% mutate(cd = ab)
}
But it's the same issue.
Could anyone help clearing this problem for me? (and if a cleaner code for the dimensions is available that would be highly appreciated)
I am just trying to up my coding skills. I would have been long done if I just typed it all out but that defeats the purpose.
Thanks!
-JK
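Using base R, assuming dfs is a list of the data frames themselves (for example, from mget(ls())) rather than a list of their names: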
lapply(dfs, function(x) transform(setNames(x, tolower(names(x))), X = c('a', 'b')))
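A minimal sketch of that idea, assuming two small data frames x and y with a hypothetical column called Letter: mget() collects the objects by name into a named list, after which both the dimensions and the renaming can be handled with sapply and lapply rather than with get().

```r
# Two small example data frames (the column name "Letter" is an assumption)
x <- data.frame(Letter = letters[1:2])
y <- data.frame(Letter = letters[3:4])

# mget() returns the objects themselves in a list named after them
dfs <- mget(c("x", "y"))

# Dimensions of every data frame: one column per data frame,
# first row is the number of rows, second row the number of columns
dims <- sapply(dfs, dim)

# Lower-case the column names of every data frame in the list
dfs <- lapply(dfs, function(d) setNames(d, tolower(names(d))))
names(dfs$x)
```

The results live in the list dfs; the original objects x and y are left untouched unless the list is written back explicitly.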

Create a variable in Multiple Dataframes in R

I want to create a ranked variable that will appear in multiple data frames.
I'm having trouble getting the ranked variable into the data frames.
Simple code. Can't make it happen.
dfList <- list(df1,df2,df3)
for (df in dfList){
rAchievement <- rank(df["Achievement"])
df[[rAchievement]]<-rAchievement
}
The result I want is for df1, df2 and df3 to each gain a new variable called rAchievement.
I'm struggling!! And my apologies. I know there are similar questions out there. I have reviewed them all. None seem to work and accepted answers are rare.
Any help would be MUCH appreciated. Thank you!
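The loop in the question modifies df, which is only a copy of each list element, so df1, df2 and df3 never change; in addition, df[[rAchievement]] indexes by the rank values rather than by the name "rAchievement". We can use lapply with transform in a single line: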
dfList <- lapply(dfList, transform, rAchievement = rank(Achievement))
If we need to update the objects 'df1', 'df2', 'df3', set the names of the 'dfList' with the object names and use list2env (not recommended though)
names(dfList) <- paste0("df", 1:3)
list2env(dfList, .GlobalEnv)
Or, using a for loop, we loop over the sequence of the list, extract each list element, and assign a new column based on the rank of 'Achievement':
for(i in seq_along(dfList)) {
dfList[[i]][['rAchievement']] <- rank(dfList[[i]]$Achievement)
}
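The following sketch, on made-up data, illustrates why looping over the list elements directly leaves the list unchanged while lapply with transform does add the column. The values in df1 and df2 and the name unchanged are assumptions for illustration.

```r
# Hypothetical example data
df1 <- data.frame(Achievement = c(10, 30, 20))
df2 <- data.frame(Achievement = c(5, 1))
dfList <- list(df1 = df1, df2 = df2)

# Looping over the elements only changes the loop variable df, a copy
for (df in dfList) {
  df[["rAchievement"]] <- rank(df[["Achievement"]])
}
unchanged <- !("rAchievement" %in% names(dfList$df1))

# Assigning the result of lapply back to dfList keeps the new column
dfList <- lapply(dfList, transform, rAchievement = rank(Achievement))
dfList$df1
```

Note that even after the lapply call, the standalone objects df1 and df2 are unaffected; only the copies stored in dfList gain the new column.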

How do I alias a column name in a for loop?

I'm making a function and I'd like to call a column in a particular way.
Initialize data
a <- c(1,2,3,4,5)
b <- c(6,7,8,9,10)
c <- c(1,2,3,4,5)
d <- c(6,7,8,9,10)
df <- as.data.frame(cbind(a,b,c,d))
Call column for the table function
Func <- function(df){
X <- df
Y <- names(M)
for(i in 1:2){
table(X$___,X$___)
}}
The trouble is I don't know how to call the columns.
I'd like it to be the equivalent to table(X$a, X$b) as it iterates through the loop.
I tried this and it didn't work
for(i in 1:2){
Q <- Y[i]
W <- Y[j]
table(X$Q,X$W)
}}
It is necessary for a function I'm using that I make a table with the form table(X$a, X$b) and I don't know quite how to achieve that in a for loop?
Instead of referring to the columns by name, you could select them by index, so you don't have to worry about how to call them.
Replace your for loop with
table(df[1:2])
which would give you the expected result.
You need double brackets, [[, to get the content of a column. Inside a for loop, wrap the expression in print() to see the result:
df <- datasets::mtcars
for (i in 1:2) print(df[[i]])
This also works with column names:
for (i in names(df)) print(df[[i]])
Not sure what you are trying to achieve though. You can also just do:
lapply(df[1:2], table)
You can also loop through the columns by index. The following code loops through the columns of the iris dataset:
for(i in seq_len(ncol(iris))){
print(iris[,i]) # to get single column
if (i < ncol(iris)) print(iris[,c(i,i+1)]) # two adjacent columns; skipped at the last column, where i+1 is out of range
}
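For the original goal of an equivalent to table(X$a, X$b) with the column names held in variables, a minimal sketch could look like this. The helper name pair_table and its arguments first and second are hypothetical, introduced only for illustration.

```r
# Data along the lines of the question
a <- c(1, 2, 3, 4, 5)
b <- c(6, 7, 8, 9, 10)
df <- data.frame(a, b, c = a, d = b)

# Hypothetical helper: cross-tabulate two columns given by name
pair_table <- function(X, first, second) {
  # X[[first]] evaluates `first`, so this matches X$a when first is "a"
  table(X[[first]], X[[second]])
}

cols <- names(df)
tab <- pair_table(df, cols[1], cols[2])   # same counts as table(df$a, df$b)
tab
```

Because [[ accepts either a name or a position, the same helper can also be called with column indices.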

Loop used to create multiple vectors from data frame columns

I would like to create a vector from each column of mtcars data frame. I need two solutions. First one should be done in loop and if it's possible the other one without a loop.
The desired output should be like that:
vec_1 <- mtcars[,1]
vec_2 <- mtcars[,2]
etc...
I tried to create a loop but I failed. Can you tell me what is wrong with that loop ?
vec <- c()
for (i in 1:2){
assign(paste("vec",i,sep="_" <- mtcars[,i][!is.na(mtcars[,i])]
}
I need to remove possible NAs from my data that's why I put it in the example.
Your loop is missing a few brackets and you should assign the vector to the global environment of your R session like so:
for (i in 1:2) {
assign(sprintf("vec_%d", i), mtcars[!is.na(mtcars[[i]]), i], envir = .GlobalEnv)
}
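The same result can also be obtained without an explicit loop, for example by building a named list of the columns with lapply and copying it into an environment with list2env. Keeping the vectors together in the list is usually cleaner than creating separate objects.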
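A loop-free variant can be sketched with lapply, since a data frame is a list of columns. The names vecs and env are assumptions for illustration, and the vectors are copied into a scratch environment rather than the global one to keep the example side-effect free.

```r
# One vector per column of mtcars, with any NA values dropped
vecs <- lapply(mtcars, function(col) col[!is.na(col)])

# Name the elements vec_1, vec_2, ...
names(vecs) <- paste0("vec_", seq_along(vecs))

# Optionally turn the list elements into separate objects;
# use .GlobalEnv instead of env to create them in the workspace
env <- new.env()
list2env(vecs, envir = env)

head(vecs$vec_1)
```

In many cases the list itself is all that is needed, since vecs[["vec_3"]] or vecs[[3]] gives direct access to each vector.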

Using R: colnames() in for loop to change sequential datasets

I have multiple data frames named y1 to y13 - one column each. They all have a column name that I would like to change to "Date.Code". I've tried the following in a for loop:
for(i in 1:13){
colnames(get(paste("y", i, sep=""))) <- c("Date.Code")
}
That didn't work.
I also tried:
for(i in 1:13){
assign(("Date.Code"), colnames(get(paste("y", i, sep=""))))
}
Also with no luck.
Any suggestions?
Thanks,
E
The difficulty here is that you cannot use get with an assignment operator directly; for example, get(nm) <- value will not work. You can use assign, as you're trying, but in a slightly different fashion.
Assuming cn is the column number whose name you would like to change:
for(i in 1:13){
nm <- paste0("y", i)
tmp <- get(nm)
colnames(tmp)[cn] <- c("Date.Code")
assign(nm, tmp)
}
That being said, a cleaner way of approaching this would be to collect all of your data frames into a single list; then you can easily use lapply to operate on them. For example:
# collect all of your data.frames into a single list.
df.list <- lapply(seq(13), function(i) get(paste0("y", i)))
# now you can easily change column names. Note the `x` at the end of the function which serves
# as a return of the value. It then gets assigned back to an object `df.list`
df.list <-
lapply(df.list, function(x) {colnames(x)[cn] <- "Date.Code"; x})
Lastly, search these boards for [r] data.table and you will see many options for changing values by reference and setting attributes more directly.
Here is a one-liner solution:
list2env(lapply(mget(ls(pattern='^y[0-9]+$')),
function(x) setNames(x,"Date.Code")),.GlobalEnv)
Of course, it is better to keep all your variables in the same list.
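As a self-contained illustration, the sketch below uses two tiny data frames stored in a scratch environment. The names env, nm, failed and frames are assumptions for illustration. It first shows that assigning through get() fails, then applies the collect, modify and write-back pattern.

```r
# Scratch environment holding two one-column data frames (made-up data)
env <- new.env()
assign("y1", data.frame(V1 = 1:3), envir = env)
assign("y2", data.frame(V1 = 4:6), envir = env)

# get() has no replacement form, so assigning through it raises an error
nm <- "y1"
failed <- inherits(
  try(colnames(get(nm, envir = env)) <- "Date.Code", silent = TRUE),
  "try-error"
)

# Collect the objects into a named list, rename, then write them back
frames <- mget(c("y1", "y2"), envir = env)
frames <- lapply(frames, setNames, "Date.Code")
list2env(frames, envir = env)

names(get("y1", envir = env))
```

The write-back step with list2env is only needed if the separate objects must be updated; otherwise the list frames can be used directly.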
