I thought this would be easy, but it is not working for me. I am trying to follow this example to change the column names of each matrix in a list of matrices that I created:
Assign column names to list of dataframes
When I run the code below, I get a strange result: it looks like I set a name on individual elements of each matrix instead of setting the column names.
#create a list of matrices containing random numbers
randoms<-lapply(1:1000, function(x) matrix(rnorm(1440), ncol=10))
trial<-lapply(randoms, setNames , nm = letters[1:10])
head(trial[[1]])
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0.89032453 1.02459736 0.7141343 -0.47405630 -2.0719943 -1.5087669
[2,] -0.74866047 0.44086093 -1.7540066 -2.04227094 -0.4875453 1.4207707
[3,] -0.04565454 -1.52336294 -0.1941370 -1.36252338 1.7338307 -1.3536725
[4,] 0.13242099 -0.09157545 -0.6156536 -1.34546174 -0.3279853 0.9663668
[5,] 2.09173141 0.41592339 0.7422889 -0.05991624 0.5319697 0.6413341
[6,] -0.32129540 2.11206231 0.1722047 -0.54404820 1.2685971 -0.0784607
[,7] [,8] [,9] [,10]
[1,] -0.4849624 -1.2590439 -1.5066718 -0.6758746
[2,] -2.5010320 -2.3469163 0.5221117 0.9186142
[3,] -1.3763468 -0.5551194 -0.2304872 -1.6087508
[4,] -2.0282231 -0.1949064 0.9329241 1.0196325
[5,] 1.6429999 1.8176161 -0.6549447 -1.8833887
[6,] 1.0044023 1.5895154 0.3660308 -0.1883819
head(attr(trial[[1]], "names"))
[1] "a" "b" "c" "d" "e" "f"
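setNames sets the "names" attribute, which for a matrix labels individual elements rather than columns. We can use a for loop with colnames<- instead: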
for(i in seq_along(randoms)) {
colnames(randoms[[i]]) <- letters[1:10]
}
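If you would rather keep the lapply style from the question, a minimal sketch could look like the following. It uses a smaller stand-in list (3 matrices instead of 1000) purely for illustration, and returns a new list instead of modifying randoms in place.

```r
# Small stand-in for the list in the question (assumption: 3 matrices, 4 x 10)
set.seed(1)
randoms <- lapply(1:3, function(x) matrix(rnorm(40), ncol = 10))

# Set the column names inside an anonymous function and return the matrix
trial <- lapply(randoms, function(m) {
  colnames(m) <- letters[1:10]
  m  # the modified copy is what lapply collects
})

colnames(trial[[1]])
```

The explicit `m` on the last line matters: without it the function would return the value of the assignment (the vector of names) rather than the matrix.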
I need to write a function in R that has no input but randomly selects a set of 13 pairs of letters.
And the output of such function has to be a 2 x 13 matrix. But the letters can appear only once, meaning they cannot be repeated within a row or amongst rows.
So far, I've come up with this:
f <- function(){
x <- letters[1:26]
return(matrix(sample(x,13, replace = F), 2, 13))
}
I've managed to make sure letters do not repeat within a row (with replace = F), but I don't know how to make sure letters from one row do not appear again in the other row.
Any ideas?
You don't need to generate two vectors. Sample all 26 letters once, without replacement, and fill the 2 x 13 matrix with them:
x <- letters[1:26]
matrix(sample(x,26,replace = F),2,13)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
[1,] "s" "m" "h" "z" "q" "y" "w" "x" "p" "n" "e" "o" "j"
[2,] "r" "b" "d" "v" "u" "a" "k" "i" "f" "l" "g" "c" "t"
Here is the shorthand version
x <- letters
matrix(sample(x),2)
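Since the question asks for a function with no input, the shorthand can be wrapped as below. This is only a sketch; the name random_pairs is made up for illustration.

```r
# Hypothetical wrapper: takes no arguments, returns a 2 x 13 character matrix
random_pairs <- function() {
  # sample(letters) is a random permutation of all 26 letters, so no letter
  # can appear twice anywhere in the matrix
  matrix(sample(letters), nrow = 2)
}

pairs <- random_pairs()
dim(pairs)

# Check on the flattened vector; anyDuplicated() on a matrix compares rows
anyDuplicated(as.vector(pairs)) == 0
```

Each column of the result is one pair of letters.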
I'm importing data with values and currencies, among other characteristics. The currencies are strings, and I need to replace each of them with a numeric id. I have a matrix with all currencies and ids that serves as a catalog.
My matrices look like
main <- cbind(c("toys", "food"),c(345, 45), c("USD", "EUR"))
cat<-cbind(c("USD", "EUR"), c(1, 2))
The outcome I want for main is
[toys, 345, 1
food, 45, 2]
Perhaps:
main[,3] <- cat[,2][match(main[,3], cat[,1])]
Output:
[,1] [,2] [,3]
[1,] "toys" "345" "1"
[2,] "food" "45" "2"
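To see what match is doing here, the sketch below uses a slightly larger, made-up main (an extra "books" row and currencies that are not in catalog order) so that the lookup positions differ from 1, 2. The catalog is the one from the question.

```r
# Hypothetical main with three rows; the currencies are not in catalog order
main <- cbind(c("toys", "food", "books"), c(345, 45, 12), c("EUR", "USD", "EUR"))
cat  <- cbind(c("USD", "EUR"), c(1, 2))   # catalog as in the question

# For each currency in main, match() gives its row position in the catalog
idx <- match(main[, 3], cat[, 1])
idx

# Use those positions to pull the ids from the second catalog column
main[, 3] <- cat[idx, 2]
main
```

A currency missing from the catalog would produce NA in idx and therefore NA in the third column.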
Here is another option using left_join from the dplyr package:
res <- unname(as.matrix(dplyr::left_join(data.frame(main),
data.frame(cat),
by = c("X3"="X1"))[-3]))
such that
> res
[,1] [,2] [,3]
[1,] "toys" "345" "1"
[2,] "food" "45" "2"
Assuming that the OP wanted to match all the elements in 'main' and not just one column, we can use match and then replace each matched value with its corresponding id
i1 <- match(main, cat[,1])
replace(main, !is.na(i1), cat[i1[!is.na(i1)], 2])
# [,1] [,2] [,3]
#[1,] "toys" "345" "1"
#[2,] "food" "45" "2"
Or another option is to create a named vector and use that for replacement
v1 <- setNames(cat[,2], cat[,1])[main]
main[!is.na(v1)] <- v1[!is.na(v1)]
main
# [,1] [,2] [,3]
#[1,] "toys" "345" "1"
#[2,] "food" "45" "2"
Let's assume I have a list of four data.frames containing some NA values:
my.list<-replicate(4,data.frame())
names(my.list)<-paste0("Frame.Number", c(1:4))
for (i in 1:4){
my.list[[i]]<-mapply(rnorm,10,c(1:4))
my.list[[i]][i+1,3]<-NA
my.list[[i]][c(i,i*2),4]<-NA
}
For each of the data.frames, I want to select those rows which don't contain NAs in the 4th column. I can, for example, create a list of vectors (?) containing information about the completeness of the cases in each data.frame:
selector <- lapply(my.list,"[",, 4)
selector <- lapply(selector,complete.cases)
Now this is where I am stuck: how do I apply the selector list to the my.list list in order to select only the complete cases? I thought I could use lapply again, but I cannot figure out some meaningful syntax.
We can lapply over the list, find the rows where the 4th column is not NA, and subset each matrix accordingly.
lapply(my.list,function(x) x[!is.na(x[,4]), ])
#$Frame.Number1
# [,1] [,2] [,3] [,4]
#[1,] 0.3668229 2.0688573 2.466580 4.339755
#[2,] -0.6391422 3.2635271 2.011809 3.296089
#[3,] 0.8662670 2.2797301 4.838563 4.443876
#[4,] -0.8972108 2.9305257 3.461650 5.525453
#[5,] -0.3452349 -0.2211153 2.570717 3.915671
#[6,] 0.6464616 2.3472838 4.009406 3.436188
#[7,] 0.9341354 2.3092428 2.338770 4.359324
#[8,] -0.5652311 3.2143472 1.944220 4.042566
#$Frame.Number2
# [,1] [,2] [,3] [,4]
#[1,] 0.22304364 2.6085569 3.459335 2.575920
#[2,] -0.08987518 2.9515099 NA 3.775579
#[3,] 2.03265254 0.9405609 3.266783 4.009509
....
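If you do want to apply a separately built selector list, as the question describes, Map can walk both lists in parallel. The following is a sketch under the assumption that each list element is a 10 x 4 matrix built the same way as in the question; the list is rebuilt here with lapply so the example is self-contained.

```r
set.seed(42)
# Rebuild a list like the one in the question (each element is a 10 x 4 matrix)
my.list <- lapply(1:4, function(i) {
  m <- mapply(rnorm, 10, 1:4)
  m[i + 1, 3] <- NA
  m[c(i, i * 2), 4] <- NA
  m
})
names(my.list) <- paste0("Frame.Number", 1:4)

# One logical vector per matrix: TRUE where column 4 is not NA
selector <- lapply(my.list, function(m) !is.na(m[, 4]))

# Map() pairs each matrix with its selector and keeps only the TRUE rows
result <- Map(function(m, keep) m[keep, , drop = FALSE], my.list, selector)
sapply(result, nrow)
```

The drop = FALSE keeps the result a matrix even if only one row survives.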
I created a matrix in R
C<-matrix(c(0),nrow=6,ncol=6,byrow = FALSE)
Now I would like to replace the first column of the matrix with the value 1, the second and third columns with standard normal random variables, and the last three columns with the values of another matrix.
C<-matrix(c(0),nrow=6,ncol=6,byrow = FALSE)
other.matrix<-matrix(runif(18), nrow = 6, ncol = 3)
C[,1]<-1
C[,2]<-rnorm(6)
C[,3]<-rnorm(6)
C[,4:6]<-other.matrix
To access the rows and columns of matrices (and, for that matter, data.frames) in R, you can use [] brackets with i,j notation, where i is the row and j is the column. For example, the element in the 3rd row and 2nd column of your matrix C can be addressed with
C[3,2]
#[1] 0
Use <- to assign new values to the rows/columns you have selected.
For the first three columns, you can use
C<-matrix(c(0),nrow=6,ncol=6,byrow = FALSE)
C[ ,1] <- 1; C[ ,2] <- rnorm(6); C[ ,3] <- rnorm(6)
Let's now say your other matrix is called D and looks like
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] 0.6527716 0.81793644 0.67138209 0.3175264 0.1067119 0.5907180 0.4619992
[2,] 0.2268516 0.90893913 0.62917211 0.1768426 0.3659889 0.0339911 0.2322981
[3,] 0.9264116 0.81693835 0.59555163 0.6960895 0.1667125 0.6631861 0.9718530
[4,] 0.2613363 0.06515864 0.04971742 0.7277188 0.2580444 0.3718222 0.8028141
[5,] 0.2526979 0.49294947 0.97502566 0.7962410 0.8321882 0.2981480 0.7098733
[6,] 0.4245959 0.95951112 0.45632856 0.8227812 0.3542232 0.2680804 0.7042317
Now let's say you want columns 3, 4, and 5 from D as the last three columns of C. Then you can simply write
C[ ,4:6] <- D[ ,3:5]
And your result is
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 -1.76111875 0.4621061 0.67138209 0.3175264 0.1067119
[2,] 1 0.40036245 0.9054436 0.62917211 0.1768426 0.3659889
[3,] 1 -1.03238266 -0.6705829 0.59555163 0.6960895 0.1667125
[4,] 1 -0.47064774 0.3119684 0.04971742 0.7277188 0.2580444
[5,] 1 -0.01436411 -0.4688032 0.97502566 0.7962410 0.8321882
[6,] 1 -1.18711832 0.8227810 0.45632856 0.8227812 0.3542232
One thing to note is that this requires C and D to have the same number of rows.
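Putting the steps together, a minimal sketch with an explicit row check might look like this. D here is a random 6 x 7 stand-in, not the exact matrix printed above.

```r
set.seed(1)
C <- matrix(0, nrow = 6, ncol = 6)
D <- matrix(runif(42), nrow = 6, ncol = 7)  # stand-in for the 6 x 7 matrix D

# Guard: the block being copied must have as many rows as C
stopifnot(nrow(C) == nrow(D))

C[, 1]   <- 1          # constant first column
C[, 2:3] <- rnorm(12)  # 12 standard normal draws fill columns 2 and 3
C[, 4:6] <- D[, 3:5]   # columns 3 to 5 of D become columns 4 to 6 of C
C
```

Assigning to C[, 2:3] in one step is equivalent to the two separate rnorm(6) assignments, since the matrix is filled column by column.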
I have a list of three data frames, each with n columns (6 in this case) and r rows (3 in this case). I want to create a matrix with the same dimensions (n*r) of the average of the three data frames within the list. So, for instance, [1,1] of the output matrix should be the average of 0.2470748, 0.2558439 and 0.2439057. Any ideas on how to do this?
We can use Reduce
Reduce(`+`, lis)/length(lis)
# [,1] [,2] [,3] [,4] [,5] [,6]
#[1,] 0.2489415 0.2825572 0.3033121 0.3011313 0.3560603 0.5091391
#[2,] 0.2033602 0.2516646 0.2805718 0.2855458 0.3428526 0.4959503
#[3,] 0.1841235 0.2362422 0.2771326 0.2821553 0.3382137 0.4888071
Or another option is apply (as mentioned in the comments by @Ananda Mahto)
apply(simplify2array(lis), c(1,2), mean)
# [,1] [,2] [,3] [,4] [,5] [,6]
#[1,] 0.2489415 0.2825572 0.3033121 0.3011313 0.3560603 0.5091391
#[2,] 0.2033602 0.2516646 0.2805718 0.2855458 0.3428526 0.4959503
#[3,] 0.1841235 0.2362422 0.2771326 0.2821553 0.3382137 0.4888071
The advantage of the mean-based version is that, if there are NA values, we can pass na.rm=TRUE as an extra argument so they are ignored.
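To illustrate the difference, here is a sketch with made-up data: three random 3 x 6 matrices stand in for lis (the original data is not shown), and one value is set to NA.

```r
# Hypothetical data: three 3 x 6 matrices, one of them with a missing value
set.seed(1)
lis <- replicate(3, matrix(runif(18), nrow = 3), simplify = FALSE)
lis[[2]][1, 1] <- NA

# Reduce(`+`) propagates the NA into the averaged cell
sum_based <- Reduce(`+`, lis) / length(lis)
sum_based[1, 1]

# apply() over the stacked 3 x 6 x 3 array can skip it with na.rm = TRUE
avg <- apply(simplify2array(lis), c(1, 2), mean, na.rm = TRUE)
avg[1, 1]   # mean of the two non-missing values
```

Note that with na.rm = TRUE the affected cell is averaged over two values instead of three, which may or may not be what you want.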