Using R: colnames() in a for loop to change sequential datasets

I have multiple data frames named y1 to y13 - one column each. They all have a column name that I would like to change to "Date.Code". I've tried the following in a for loop:
for(i in 1:13){
colnames(get(paste("y", i, sep=""))) <- c("Date.Code")
}
That didn't work.
I also tried:
for(i in 1:13){
assign(("Date.Code"), colnames(get(paste("y", i, sep=""))))
}
Also with no luck.
Any suggestions?
Thanks,
E

The difficulty here is that you cannot use get() directly on the left-hand side of an assignment; for example, get(nm) <- value will not work. You can use assign, as you're trying, but in a slightly different fashion.
Assuming cn is the column number whose name you would like to change (cn <- 1 here, since each data frame has a single column):
for(i in 1:13){
nm <- paste0("y", i)
tmp <- get(nm)
colnames(tmp)[cn] <- c("Date.Code")
assign(nm, tmp)
}
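To see why the first attempt in the question fails: colnames(x) <- value is shorthand for x <- `colnames<-`(x, value), so the target must be a variable name rather than a call such as get(...). A minimal sketch, assuming a made-up one-column data frame standing in for y1:

```r
# Hypothetical one-column data frame standing in for y1
y1 <- data.frame(V1 = 1:3)

# colnames(y1) <- "Date.Code" is evaluated roughly as:
y1 <- `colnames<-`(y1, "Date.Code")

# With get() there is no variable to assign the result back to, so R signals an error
res <- try(eval(parse(text = 'colnames(get("y1")) <- "Other"')), silent = TRUE)

inherits(res, "try-error")  # TRUE
names(y1)                   # still "Date.Code"
```

This is why the get/modify/assign pattern above goes through a temporary variable.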
That being said, a cleaner approach is to collect all of your data frames into a single list; you can then easily use lapply to operate on them. For example:
# collect all of your data.frames into a single list.
df.list <- lapply(seq(13), function(i) get(paste0("y", i)))
# now you can easily change column names. Note the `x` at the end of the function which serves
# as a return of the value. It then gets assigned back to an object `df.list`
df.list <-
lapply(df.list, function(x) {colnames(x)[cn] <- "Date.Code"; x})
Lastly, search these boards for [r] data.table and you will see many options for changing values by reference and setting attributes more directly.

Here is a one-liner solution:
# the anchored pattern matches only names such as y1, y2, ...
list2env(lapply(mget(ls(pattern='^y[0-9]+$')),
function(x) setNames(x,"Date.Code")),.GlobalEnv)
Of course, it is better to keep all your variables in a single list.

Related

How to pass the i index value to a command in R?

Fairly new to using R; would appreciate any help. I want to pass the i value from a for loop to an R command. This is what I am stuck with.
for i in colnames(df) table(df$i)
The i value is not being used the way I hoped.
If I did table(df$col1) where col1 is the name of column in df I get a result. i holds "col1".
Why doesn't it work and how can I fix this?
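We need to use [[ instead of $, because df$i looks for a column literally named "i" rather than the column whose name is stored in i. The for syntax also requires parentheses around the loop header. Finally, results are not auto-printed inside a for loop, so we need to call print explicitly to see them in the console.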
for(i in colnames(df)) print(table(df[[i]]))
Or if we want to store the output, initialize a list and assign the output into the list
lst1 <- vector('list', ncol(df))
names(lst1) <- colnames(df)
for(i in colnames(df)) lst1[[i]] <- table(df[[i]])
The R way would be to use the *apply functions:
lapply(df, table)
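The difference between $ and [[ can be checked directly. A minimal sketch, assuming a small made-up data frame (the column names and values are illustrative only):

```r
# Small made-up data frame for illustration
df <- data.frame(col1 = c("a", "b", "a"), col2 = c(1, 1, 2))

i <- "col1"

df$i      # NULL: `$` looks for a column literally named "i"
df[[i]]   # the contents of col1, because `[[` evaluates i first

# One frequency table per column, collected in a named list
tabs <- lapply(df, table)
tabs$col1
```

Because lapply returns a named list, each table can be retrieved later by column name.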

How do I alias a column name in a for loop?

I'm making a function and I'd like to call a column in a particular way.
Initialize data
a <- c(1,2,3,4,5)
b <- c(6,7,8,9,10)
c <- c(1,2,3,4,5)
d <- c(6,7,8,9,10)
df <- as.data.frame(cbind(a,b,c,d))
Call column for the table function
Func <- function(df){
X <- df
Y <- names(M)
for(i in 1:2){
table(X$___,X$___)
}}
The trouble is I don't know how to call the columns.
I'd like it to be the equivalent to table(X$a, X$b) as it iterates through the loop.
I tried this and it didn't work
for(i in 1:2){
Q <- Y[i]
W <- Y[j]
table(X$Q,X$W)
}}
It is necessary for a function I'm using that I make a table with the form table(X$a, X$b) and I don't know quite how to achieve that in a for loop?
Instead of referring to the columns by name when calling table, you could use column indices, so you don't have to worry about how to call the columns.
Replace your for loop with
table(df[1:2])
which would give you the expected result.
You need to use double brackets, [[ ]], to get the contents of a column (wrap the expression in print() to see it inside a loop):
df <- datasets::mtcars
for (i in 1:2) print(df[[i]])
This also works with column names:
for (i in names(df)) print(df[[i]])
I'm not sure what you are trying to achieve, though. You can also just do:
lapply(df[1:2], table)
You can also loop through the columns by index. The following code loops through the columns of the iris dataset:
for(i in seq_len(ncol(iris))){
print(iris[,i]) # a single column
# two adjacent columns; the guard keeps i+1 from running past the last column
if(i < ncol(iris)) print(iris[,c(i,i+1)])
}
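Putting the [[ approach into the function from the question could look like the sketch below. It assumes that consecutive column pairs (a with b, b with c, and so on) are what is wanted, which the question does not state; pair_tables is a hypothetical name.

```r
# Same toy data as in the question
df <- data.frame(a = c(1, 2, 3, 4, 5), b = c(6, 7, 8, 9, 10),
                 c = c(1, 2, 3, 4, 5), d = c(6, 7, 8, 9, 10))

# Hypothetical helper: cross-tabulate column i against column i + 1
pair_tables <- function(X) {
  Y <- names(X)
  out <- list()
  for (i in seq_len(length(Y) - 1)) {
    # X[[Y[i]]] is the programmatic equivalent of X$a, X$b, ...
    out[[paste(Y[i], Y[i + 1], sep = "_")]] <- table(X[[Y[i]]], X[[Y[i + 1]]])
  }
  out
}

tabs <- pair_tables(df)
names(tabs)
```

Returning the tables in a list, rather than calling table() and discarding the result, keeps them available after the loop finishes.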

Loop used to create multiple vectors from data frame columns

I would like to create a vector from each column of mtcars data frame. I need two solutions. First one should be done in loop and if it's possible the other one without a loop.
The desired output should be like that:
vec_1 <- mtcars[,1]
vec_2 <- mtcars[,2]
etc...
I tried to create a loop but I failed. Can you tell me what is wrong with that loop ?
vec <- c()
for (i in 1:2){
assign(paste("vec",i,sep="_" <- mtcars[,i][!is.na(mtcars[,i])]
}
I need to remove possible NAs from my data that's why I put it in the example.
Your loop is missing a few closing parentheses, and assign() takes the value as its second argument rather than through <-. You can also assign the vector to the global environment of your R session explicitly, like so:
for (i in 1:2) {
assign(sprintf("vec_%d", i), mtcars[!is.na(mtcars[[i]]), i], envir = .GlobalEnv)
}
Without an explicit for loop, the same result can be obtained with lapply and list2env, although these functions still iterate over the columns internally.
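As a sketch of avoiding an explicit for loop, lapply can build all the vectors at once and list2env can copy them into the global environment. The name vecs is made up for this example:

```r
# Drop NAs from each column; lapply over a data frame visits its columns
vecs <- lapply(mtcars, function(col) col[!is.na(col)])

# Rename the list elements to vec_1, vec_2, ...
names(vecs) <- paste0("vec_", seq_along(vecs))

# Optional: copy every list element into the global environment
invisible(list2env(vecs, envir = .GlobalEnv))

identical(vec_1, mtcars[, 1])
```

In many cases the list vecs is all that is needed, and the list2env step can be left out.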

How to generalize union() to take N arguments?

How can I append/ push data into union dynamically?
For instance, I have 4 data sets to merge,
mydata <- union(data1, data2, data3, data4)
But sometimes I have less than 4 while sometimes more than that.
Any ideas how can I solve this problem?
Make some reproducible data:
#dummy data
data1 <- data.frame(x=letters[1:3])
data2 <- data.frame(x=letters[2:4])
data3 <- data.frame(x=letters[5:7])
We can build a command string that combines rbind with unique, then evaluate it:
#get list of data frames to merge, update pattern as needed
data_names <- ls()[grepl("data\\d",ls())]
data_names <- paste(data_names,collapse=",")
#make command string
myUnion <- paste0("unique(rbind(",data_names,"))")
#evaluate
eval(parse(text=myUnion))
EDIT:
Here is a better, simpler way, using do.call:
unique(do.call("rbind",lapply(objects(pattern="data\\d"),get)))
You could roll your own function like vunion, defined below. I'm not sure if this actually works; my R has gotten a bit stale ;)
Basically, you accept any number of arguments (hence ...) and treat them as if they were packed in a list. Take the first two items off the list, compute their union, append the result to the list, and repeat until only two items remain. As written, it expects at least two arguments.
vunion <- function(...){
data <- list(...)
n <- length(data)
if(n > 2){
u <- list(t(union(data[[1]], data[[2]])))
return(do.call(vunion, as.list(c(tail(data, -2), u))))
} else {
return(union(data[[1]], data[[2]]))
}
}
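The same fold can be written without explicit recursion using Reduce. The sketch below assumes the inputs are data frames with identical columns and that "union" means the distinct rows across all of them; union_all_unique is a hypothetical name.

```r
# Dummy data with one shared column
data1 <- data.frame(x = letters[1:3])
data2 <- data.frame(x = letters[2:4])
data3 <- data.frame(x = letters[5:7])

# Hypothetical helper: row-wise union of any number of data frames
# that share the same columns
union_all_unique <- function(...) {
  dfs <- list(...)
  # Fold rbind over the list, then drop duplicated rows
  out <- unique(Reduce(rbind, dfs))
  rownames(out) <- NULL
  out
}

mydata <- union_all_unique(data1, data2, data3)
mydata$x

# For plain vectors, base union() can be folded in the same way
v <- Reduce(union, list(1:3, 2:5, 7))
```

Reduce also handles a single input by returning it unchanged, so the helper works for one argument as well as for many.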

Renaming headers in R

I want to rename headers of four dataframes (dw,ds,dmw,dne). All of them have six columns.
regions <- c("dw","ds","dmw","dne")
for (i in regions){
names(i)=c("lon","lat","area","fd","tp","rt")
}
But I am getting this error:
Error in names(i) = c("lon", "lat", "area", "fd", "tp", "rt") :
'names' attribute [6] must be the same length as the vector [1]
Where am I going wrong?
I'd say the more idiomatic R approach is to use lists. Storing your data in a list makes your life easier (most of the time), particularly when you want to do repeated manipulations of individual elements (in this case data.frames). Here I use lapply because you want to change the names in a consistent manner, but with mapply you could give each data.frame its own set of names.
First create some data like you should have done - I assigned to the global environment as I believe you have.
dw <- mtcars[1:4, 1:6]
ds <- mtcars[1:4, 1:6]
dmw <- mtcars[1:4, 1:6]
dne <- mtcars[1:4, 1:6]
Now wrap all that goodness up as a list (or better yet read it in/create as a list if you can)
lst <- list(dw, ds, dmw, dne)
## name the list
names(lst) <- c("dw","ds","dmw","dne")
## Now we can use lapply to add the column names
(out <- lapply(lst, function(x) {
setNames(x, nm = c("lon","lat","area","fd","tp","rt"))
}))
I'd continue to operate out of the list and manipulate individual elements/objects in the list using indexing (see what out[["dw"]] gives you). If you really want to reassign to the global environment use list2env:
list2env(out, envir = .GlobalEnv)
dne
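For the case where each data frame should get its own set of names, Map (a thin wrapper around mapply) can pair every list element with a name vector. A minimal sketch; the two small data frames and the names in new.names are made up for illustration:

```r
# Toy data frames, only two and with three columns each for brevity
lst <- list(dw = mtcars[1:4, 1:3], ds = mtcars[1:4, 1:3])

# Hypothetical per-data-frame column names, one character vector per element
new.names <- list(
  dw = c("lon", "lat", "area"),
  ds = c("x", "y", "z")
)

# Map() calls setNames(lst[[k]], new.names[[k]]) for each element k
out2 <- Map(setNames, lst, new.names[names(lst)])

names(out2$ds)
```

Indexing new.names by names(lst) keeps the two lists aligned even if they were created in a different order.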
Use colnames instead and use get to refer to the variable:
for (i in regions){
dat <- get(i)
colnames(dat) <- c("lon","lat","area","fd","tp","rt")
assign(i, dat)
}
