I have a problem that I am sure has an easy answer, but I can't figure it out. I have many matrices of the same format, and would like to assign the same column and row names to all of them. I am trying to do this in a loop that looks up each matrix by name and then assigns the names.
Here is my reproducible example.
mnames <- letters[1:10] # The names to be assigned
mat1 <- matrix(rnorm(100),10,10)
mat2 <- matrix(rnorm(100),10,10)
mat3 <- matrix(rnorm(100),10,10)
obs <- c("mat1", "mat2", "mat3")
for(i in obs){
rownames(as.name(i)) <- mnames
colnames(as.name(i)) <- mnames
}
It seems like the loop does not call the object, but I do not understand why? Would be grateful for any help, I have tons of matrices and doing all the assigning one by one would be tedious. Thanks!
You can get the matrix inside the for loop, but I think it would be better to collect the matrices in a list with mget, change the dimnames, and then, if needed, assign them back to the global environment.
list_mat <- lapply(mget(obs), function(x) {dimnames(x) <- list(mnames, mnames);x})
list2env(list_mat, .GlobalEnv)
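As for why the loop in the question fails: as.name(i) only builds a symbol, it never fetches the matrix, and a replacement call such as rownames(...) <- needs a variable it can assign the result back to. A minimal sketch of a loop that does work, using get and assign (the matrix names come from the question, the temporary variable m is just for illustration):

```r
mnames <- letters[1:10]
mat1 <- matrix(rnorm(100), 10, 10)
mat2 <- matrix(rnorm(100), 10, 10)
mat3 <- matrix(rnorm(100), 10, 10)
obs <- c("mat1", "mat2", "mat3")

for (nm in obs) {
  m <- get(nm)                         # fetch the matrix by its name
  dimnames(m) <- list(mnames, mnames)  # set row and column names on the copy
  assign(nm, m)                        # write the copy back under the same name
}

rownames(mat1)
```

This keeps the matrices as separate variables; the list-based version is still the easier one to extend.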
I have a 18-by-48 matrix.
Is there a way to save each of the 18 rows automatically in a separate variable (e.g., from r1 to r18) ?
I'd definitely advise against splitting a data.frame or matrix into its constituent rows. If I absolutely had to split the rows up, I'd put them in a list and then operate from there.
If you desperately had to split it up, you could do something like this:
toy <- matrix(1:(18*48),18,48)
variables <- list()
for(i in 1:nrow(toy)){
variables[[paste0("variable", i)]] <- toy[i,]
}
list2env(variables, envir = .GlobalEnv)
I'd be inclined to stop after the for loop and avoid the list2env. But I think this should give you your result.
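If you do stop at the list, the explicit loop is not strictly needed either. A minimal sketch, assuming the same toy matrix and the r1 to r18 naming from the question:

```r
toy <- matrix(1:(18 * 48), 18, 48)

# One list element per row, built without an explicit for loop
rows <- lapply(seq_len(nrow(toy)), function(i) toy[i, ])
names(rows) <- paste0("r", seq_len(nrow(toy)))

rows$r1        # first row as a plain vector
rows[["r18"]]  # last row
```

Each row is then reachable by name without creating 18 separate variables in the workspace.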
I believe you can select row r from your data frame d by indexing without specifying a column:
var <- d[r,]
Thus you can extract all of the rows into one variable by using
var <- d[1:nrow(d),]
where var[1,] is the first row, var[2,] the second, and so on. I'm not sure if this is exactly what you are looking for. Why would you want 18 different variables, one for each row?
result <- data.frame(t(mat))
colnames(result) <- paste("r", 1:18, sep="")
attach(result)
Here mat is your matrix.
I have found a way to do this using reshape2, but it is quite slow and doesn't give me exactly what I want. I have a data.frame that looks like this:
df<-data.frame(expand.grid(1:10,1:10))
colnames(df) <- c("x","y")
for(i in 3:10){
df[i] <- runif(100,10,100)
}
I run:
require(reshape2)
matrices<-lapply(colnames(df)[-c(1:2)],function(x){
mat<-acast(df, y~x, value.var=x, fill= 0,fun.aggregate = mean)
return(mat)
})
Now I have a list of matrices, one for each value column in my data, which I can transform into an array with one 10 x 10 slice per value column. However, I am looking for a faster way to do this, as my datasets can contain many value columns and this process can take a long time. I can't seem to find a more efficient way of doing it.
Thanks for any help.
If your data.frame is stored regularly as you say, you could accomplish this in a for loop, which may actually be faster than casting:
# preallocate array
myArray <- array(0, dim=c(10,10,10))
# loop through:
for(i in 1:10) {
myArray[,,i] <- as.matrix(df[df$y==i,])
}
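The question asks for one matrix per value column, so here is a sketch of the same preallocate-and-fill idea with one slice per value column. It assumes a complete, regular 10 x 10 grid; the column names v3 to v10 and the names value_cols and arr are made up for illustration:

```r
# Rebuild data shaped like the question's: a 10 x 10 grid plus 8 value columns
df <- expand.grid(x = 1:10, y = 1:10)
for (i in 3:10) {
  df[[paste0("v", i)]] <- runif(100, 10, 100)
}

value_cols <- setdiff(names(df), c("x", "y"))

# Sort so that x varies fastest within each y; this makes the fill order explicit
df <- df[order(df$y, df$x), ]

arr <- array(0, dim = c(10, 10, length(value_cols)))
for (k in seq_along(value_cols)) {
  # byrow = TRUE puts y on the rows and x on the columns, like acast(df, y ~ x)
  arr[, , k] <- matrix(df[[value_cols[k]]], nrow = 10, ncol = 10, byrow = TRUE)
}
dimnames(arr) <- list(y = as.character(1:10),
                      x = as.character(1:10),
                      value = value_cols)
```

Because this only reshapes each column, it does no aggregation; if the grid has duplicate or missing (x, y) pairs, the casting approach is still needed.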
I'm wanting to fill a list with many different matrices which are created by selecting a variety of different samples from an original matrix. Then repeat this process 10 times. I managed to do it (after much fighting/painful learning process). I would be so grateful if someone could point me in the right direction to get rid of my redundant code and improve the functions I'm using (maybe even get rid of the loops which I gather are rather frowned upon).
My problem hinged on getting the different sized matrices out of the loop.
Here's the code I used, one day I aspire to write R code that is not ugly:
##defining a matrix called allmat
allmat <- matrix(rnorm(100), nrow=50, ncol=2)
##sampling different sizes of the allmat matrix from 0.1*allmat to 10*allmat
for(i in seq(0,9,by=1)) {
for(j in seq(0.1,10,by=0.05)) {
nam <- paste("powermatrix_",j,"_",i,sep="")
assign(nam, allmat[sample(nrow(allmat),replace=T,size=j*nrow(allmat)),])
}
}
##then using apropos to pick out the names of the matrices from file
##eventually converting matrix list into a list to then use lapply
matrixlist <- data.frame(apropos("powermatrix_"), stringsAsFactors = FALSE)
##then rather horribly trying somehow to get my dataframe into a
## list which eventually I do below (but although inelegant this bit is
## not crucial)
colnames(matrixlist) <- "col1"
matrixlist_split <- strsplit(matrixlist$col1, "_")
library("plyr")
df <- ldply(matrixlist_split)
colnames(df) <- c("V1", "V2", "V3")
vector_sample <- as.numeric(df$V2)
##pairing each matrix name with its sample size
mynewdf <- data.frame(col1 = matrixlist$col1, vector_sample, stringsAsFactors = FALSE)
##creating the list before lapply
mylist <- as.list(mget(mynewdf$col1))
##then with the list I can use lapply (but there has to be a much
## much better way!)
Many thanks for all your input. This is now working much better with the following two lines. I didn't know you could use seq_along or seq with lapply. These two in combination are very helpful.
# this vector sets the sizes and repetitions of the matrix samples
seq_vector <- c(rep(seq(0.1,1,by=0.1),each=10))
# this samples the matrix for all of the sizes and repeats defined by the sequence vector
myotherlist <- lapply(seq_along(seq_vector), function(i) allmat[sample(1:nrow(allmat), replace=TRUE, size=round(seq_vector[i]*nrow(allmat))),])
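lapply can also iterate over the fractions themselves rather than over their indices, and the row counts give a quick check that every sample has the intended size. A sketch, assuming the 50 x 2 allmat from the question; sample_fraction is a hypothetical helper name:

```r
allmat <- matrix(rnorm(100), nrow = 50, ncol = 2)
seq_vector <- rep(seq(0.1, 1, by = 0.1), each = 10)

# Hypothetical helper: draw round(frac * nrow(m)) rows with replacement
sample_fraction <- function(m, frac) {
  n <- round(frac * nrow(m))
  m[sample(nrow(m), size = n, replace = TRUE), , drop = FALSE]
}

samples <- lapply(seq_vector, function(frac) sample_fraction(allmat, frac))

# Row counts per sample: ten samples each of 5, 10, ..., 50 rows
table(sapply(samples, nrow))
```

The drop = FALSE keeps every sample a matrix, even if a sample ever ends up with a single row.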
I have multiple data frames named y1 to y13 - one column each. They all have a column name that I would like to change to "Date.Code". I've tried the following in a for loop:
for(i in 1:13){
colnames(get(paste("y", i, sep=""))) <- c("Date.Code")
}
That didn't work.
I also tried:
for(i in 1:13){
assign(("Date.Code"), colnames(get(paste("y", i, sep=""))))
}
Also with no luck.
Any suggestions?
Thanks,
E
The difficulty here is that you cannot use get directly on the left-hand side of an assignment;
e.g., get(nm) <- value will not work. You can use assign, as you're trying to, but in a slightly different fashion.
Assuming cn is the number of the column whose name you would like to change:
for(i in 1:13){
nm <- paste0("y", i)
tmp <- get(nm)
colnames(tmp)[cn] <- c("Date.Code")
assign(nm, tmp)
}
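The reason get(nm) <- value fails is that R rewrites a replacement call f(x) <- v as x <- `f<-`(x, v), so x has to be a plain variable name. A sketch that calls the replacement function `colnames<-` directly instead; the three single-column data frames are stand-ins made up for illustration:

```r
# Stand-ins for y1..y13: three one-column data frames
for (i in 1:3) {
  assign(paste0("y", i), data.frame(old.name = rnorm(5)))
}

for (i in 1:3) {
  nm <- paste0("y", i)
  # Call the replacement function explicitly and store its result
  assign(nm, `colnames<-`(get(nm), "Date.Code"))
}

colnames(y2)
```

This does the same work as the tmp version, just without the intermediate variable.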
That being said, a cleaner way of approaching this would be to collect all of your data frames into a single list; then you can easily use lapply to operate on them. For example:
# collect all of your data.frames into a single list.
df.list <- lapply(seq(13), function(i) get(paste0("y", i)))
# now you can easily change column names. Note the `x` at the end of the function which serves
# as a return of the value. It then gets assigned back to an object `df.list`
df.list <-
lapply(df.list, function(x) {colnames(x)[cn] <- "Date.Code"; x})
Lastly, search these boards for [r] data.table and you will see many options for changing values by reference and setting attributes more directly.
Here is a one-line solution:
list2env(lapply(mget(ls(pattern='^y[0-9]+$')),
function(x) setNames(x,"Date.Code")),.GlobalEnv)
Of course, it is better to keep all your variables in the same list.
I have 1000 matrices named A1, A2, A3,...A1000.
In a for loop I would like to simply take the colMeans() of each matrix:
for (i in 1:1000){
means[i,]<-colMeans(A1)
}
I would like to do this for each matrix Ax. Is there a way to put Ai instead of A1 in the for loop?
So, one way is:
for (i in 1:1000){
means[i,]<-colMeans(get(paste('A', i, sep = '')))
}
but I think that misses the point of some of the comments, i.e., you probably read the matrices in from files with something like this:
csvs = lapply(list.files('.', pattern = '^A[0-9]+\\.csv$'), function(fname) {
read.csv(fname)
})
Then the answer to your question is:
means = lapply(csvs, colMeans)
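Note that the pattern argument of list.files is a regular expression, not a shell glob. A self-contained sketch of the read-into-a-list idea, which writes three small example files to a temporary directory first; the file names and the csv_demo folder are made up for illustration:

```r
# Write three small example files into a temporary directory
demo_dir <- file.path(tempdir(), "csv_demo")
dir.create(demo_dir, showWarnings = FALSE)
for (i in 1:3) {
  write.csv(matrix(rnorm(9), ncol = 3),
            file.path(demo_dir, paste0("A", i, ".csv")),
            row.names = FALSE)
}

# Regular expression: "A", one or more digits, then a literal ".csv"
files <- list.files(demo_dir, pattern = "^A[0-9]+\\.csv$", full.names = TRUE)
csvs <- lapply(files, read.csv)
names(csvs) <- basename(files)

means <- lapply(csvs, colMeans)
```

Naming the list elements after the files makes it easy to trace each set of means back to its source.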
I don't completely understand, but maybe you have assigned each matrix to a different variable name? That is not the best structure, but you can recover from it:
# Simulate the awful data structure.
matrix.names<-paste0('A',1:1000)
for (name in matrix.names) assign(name,matrix(rnorm(9),ncol=3))
# Pull it into an appropriate list
list.of.matrices<-lapply(matrix.names,get)
# Calculate the column means
column.mean.by.matrix<-sapply(list.of.matrices,colMeans)
Your initial question asks for a 'for loop' solution. However, there is an easy way to get the desired result if we use an 'apply' function.
Perhaps putting the matrices into a list, and then applying a function would prove worthwhile.
### Create matrices
A1 <- matrix(1:4, nrow = 2, ncol = 2)
A2 <- matrix(5:8, nrow = 2, ncol = 2)
A3 <- matrix(11:14, nrow = 2, ncol = 2)
### Create a vector of names
mat.names <- paste0('A', 1:3)
### Create a list of matrices, and assign names
mat.list <- lapply(mat.names, get)
names(mat.list) <- mat.names
### Apply the function 'colMeans' to every matrix in our list
sapply(mat.list, colMeans)
I hope this was useful!
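One detail worth knowing: sapply returns the column means with one column per matrix, whereas the means[i,] layout in the question has one row per matrix. A small sketch that transposes the result, using vapply so the result shape is checked (the name mats is made up for illustration):

```r
A1 <- matrix(1:4, nrow = 2, ncol = 2)
A2 <- matrix(5:8, nrow = 2, ncol = 2)
A3 <- matrix(11:14, nrow = 2, ncol = 2)

# Named list of the matrices, fetched by name
mats <- mget(paste0("A", 1:3))

# vapply insists that every result is a numeric vector of length 2
means <- t(vapply(mats, colMeans, numeric(2)))

means["A2", ]  # 5.5 7.5
```

After the transpose, row i holds the column means of the i-th matrix.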
As others wrote already, using a list is perhaps your best option. First you'll need to place your 1000 matrices in a list, most easily accomplished using a for-loop (see several posts above). Your next step is more important: using another for-loop to calculate the summary statistics (colMeans).
To loop over an R object with a for-loop, you can generally do one of two things:
Loop over the indices, for example:
for(i in 1:10){head(mat[i])} #simplistic example
Loop over the elements directly:
for(i in mat){print(i)} #simplistic example
When looping through R lists, the FIRST option is much easier to set up. Here is the idea adapted to your example:
column_means <- vector("list", length(list_of_matrices)) #empty list to store column means
for (i in seq_along(list_of_matrices)){
mat <- list_of_matrices[[i]] #temporarily store individual matrices
##be sure also to use double brackets!
column_means[[i]] <- colMeans(mat) #store the result in slot i instead of appending