I think my example is something special. Since I am not advanced in the use of lapply, I am stuck on the following calculation. Here is a short reproducible example. Assume I have a list containing three matrices:
list <- list(est1=matrix(sample(c(0,0.4,0.2,1), replace=TRUE, size=10), ncol=2), est2=matrix(sample(c(0,0.4,0.2,1), replace=TRUE, size=10), ncol=2),
est3=matrix(sample(c(0,0.4,0.2,1), replace=TRUE, size=10), ncol=2))
$`est1`
[,1] [,2]
[1,] 0.4 1.0
[2,] 0.0 0.4
[3,] 0.0 0.0
[4,] 0.0 0.4
[5,] 0.0 1.0
$est2
[,1] [,2]
[1,] 0.0 0.2
[2,] 0.4 0.4
[3,] 1.0 0.0
[4,] 0.2 1.0
[5,] 0.4 0.4
$est3
[,1] [,2]
[1,] 1.0 0.2
[2,] 0.4 1.0
[3,] 1.0 0.0
[4,] 1.0 0.2
[5,] 0.4 0.4
Each matrix contains coefficient estimates for different iterations. Each element inside one matrix belongs to one coefficient. For each coefficient, I want to calculate the share of the three matrices in which it is different from zero.
Expected Output:
[,1] [,2]
0.67 1
0.67 1
0.67 0
0.67 1
0.67 1
Please do not call your list list. In the following, it will be called z.
z <- list(est1=matrix(sample(c(0,0.4,0.2,1), replace=TRUE, size=10), ncol=2), est2=matrix(sample(c(0,0.4,0.2,1), replace=TRUE, size=10), ncol=2),
est3=matrix(sample(c(0,0.4,0.2,1), replace=TRUE, size=10), ncol=2))
For the kind of problems that you describe, I like to use arrays, so the first step is to transform your list into an array.
library(abind)
A <- abind(z, along=3)
Then, you can apply a function along the third dimension:
apply(A, 1:2, function(x) 100 * sum(x!=0) / length(x))
[,1] [,2]
[1,] 100.0 100.0
[2,] 100.0 66.7
[3,] 100.0 66.7
[4,] 100.0 66.7
[5,] 66.7 66.7
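For readers who would rather avoid the abind dependency, the same idea can be sketched in base R. This is only an illustration: it re-enters the three matrices printed in the question by hand, and the names nonzero and share_nonzero are made up here.

```r
# The three matrices printed in the question, typed in column by column.
z <- list(
  est1 = matrix(c(0.4, 0, 0, 0, 0,      1, 0.4, 0, 0.4, 1), ncol = 2),
  est2 = matrix(c(0, 0.4, 1, 0.2, 0.4,  0.2, 0.4, 0, 1, 0.4), ncol = 2),
  est3 = matrix(c(1, 0.4, 1, 1, 0.4,    0.2, 1, 0, 0.2, 0.4), ncol = 2)
)

# One logical matrix per estimate: TRUE where the coefficient is non-zero.
nonzero <- lapply(z, function(m) m != 0)

# Adding logical matrices counts the TRUEs cell by cell;
# dividing by the number of matrices turns the counts into shares.
share_nonzero <- Reduce(`+`, nonzero) / length(z)

round(share_nonzero, 2)
```

For this hand-typed data the rounded result should agree with the expected output given in the question (0.67 throughout the first column; 1, 1, 0, 1, 1 in the second).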
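Maybe the following does what you want. It computes the share of zeros for each coefficient, so subtract the result from 1 to get the share of non-zero values.
I start by setting the RNG seed to make the results reproducible.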
set.seed(2081) # Make the results reproducible
list <- list(est1 = matrix(sample(c(0,0.4,0.2,1), replace=TRUE, size=10), ncol=2),
est2 = matrix(sample(c(0,0.4,0.2,1), replace=TRUE, size=10), ncol=2),
est3 = matrix(sample(c(0,0.4,0.2,1), replace=TRUE, size=10), ncol=2))
zeros <- sapply(list, `==`, 0)
res <- rowSums(zeros) / ncol(zeros)
matrix(res, ncol = 2)
# [,1] [,2]
#[1,] 0.3333333 0.3333333
#[2,] 0.0000000 0.6666667
#[3,] 0.0000000 0.3333333
#[4,] 0.3333333 0.3333333
#[5,] 0.6666667 0.3333333
EDIT.
The following uses rowMeans and is simpler. The result is identical() to res above.
res2 <- rowMeans(zeros)
identical(res, res2)
#[1] TRUE
matrix(res2, ncol = 2)
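A variant of the same rowMeans idea that counts non-zero values directly and takes the output shape from the input instead of hard-coding ncol = 2 might look like this. It is a sketch on a made-up toy list (z_toy is not from the question), not a drop-in replacement.

```r
# Toy input (hypothetical): two 2 x 2 matrices of estimates.
z_toy <- list(
  a = matrix(c(0, 1, 2, 0), nrow = 2),
  b = matrix(c(0, 0, 3, 4), nrow = 2)
)

# Flatten each matrix to a logical vector; sapply binds them as columns,
# giving one row per coefficient and one column per matrix.
nonzero <- sapply(z_toy, function(m) as.vector(m != 0))

# Row means of a logical matrix are the shares of TRUE values.
share <- rowMeans(nonzero)

# Restore the shape of the original matrices.
dim(share) <- dim(z_toy[[1]])
share
```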
Say I have the matrix d, which holds two different realizations (rows) of a sampling procedure in two dimensions (columns). I want to develop a function that creates the fully antithetic draws from this original matrix.
c1 <- c(0.1, 0.6);c2 <- c(0.3, 0.8);d <- rbind(c1,c2)
# [,1] [,2]
# c1 0.1 0.6
# c2 0.3 0.8
That is to say, for the first realization (c(0.1, 0.6)) I want to obtain the mirror images of this random draw in two dimensions, which generates 4 (2^2) possible combinations as follows:
d1_anthi = matrix(
c( d[1,1] , d[1,2],
1 - d[1,1], d[1,2],
d[1,1] , 1 - d[1,2],
1 - d[1,1], 1 - d[1,2]), nrow=2,ncol=4)
t(d1_anthi)
# [,1] [,2]
# [1,] 0.1 0.6
# [2,] 0.9 0.6
# [3,] 0.1 0.4
# [4,] 0.9 0.4
Analogously, for the second realization, the result is the following:
d2_anthi = matrix(
c( d[2,1] , d[2,2],
1 - d[2,1], d[2,2],
d[2,1] , 1 - d[2,2],
1 - d[2,1], 1 - d[2,2]), nrow=2, ncol=4)
t(d2_anthi)
# [,1] [,2]
# [1,] 0.3 0.8
# [2,] 0.7 0.8
# [3,] 0.3 0.2
# [4,] 0.7 0.2
Accordingly, my desired object looks like this:
anthi_draws <- rbind(t(d1_anthi),t(d2_anthi))
# [,1] [,2]
# [1,] 0.1 0.6 <- original first realization
# [2,] 0.9 0.6
# [3,] 0.1 0.4
# [4,] 0.9 0.4
# [5,] 0.3 0.8 <- original second realization
# [6,] 0.7 0.8
# [7,] 0.3 0.2
# [8,] 0.7 0.2
Finally, I would like to create a function that, given a matrix of random numbers, is able to create this expanded matrix of antithetic draws. For example, in the picture below I have a sampling in three dimensions, then the total number of draws per original draw is 2^3 = 8.
In particular, I am having problems creating the full set of combinations, whose size depends on the dimension of the original sampling (the columns of the matrix). I was planning on using expand.grid() but I couldn't create the full combinations using it. Any hints or help in order to create such a function are welcome. Thank you in advance.
You can try this
do.call(
rbind,
apply(
d,
1,
function(x) {
expand.grid(data.frame(rbind(x, 1 - x)))
}
)
)
which gives
X1 X2
c1.1 0.1 0.6
c1.2 0.9 0.6
c1.3 0.1 0.4
c1.4 0.9 0.4
c2.1 0.3 0.8
c2.2 0.7 0.8
c2.3 0.3 0.2
c2.4 0.7 0.2
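Since the question asks for a function that works for any number of columns, the approach above could be wrapped roughly as follows. The name antithetic_draws and its internals are assumptions made for this sketch; it returns a plain numeric matrix rather than a data frame.

```r
# Hypothetical helper: expand each row of d into its 2^k antithetic draws,
# where k is the number of columns of d.
antithetic_draws <- function(d) {
  d <- as.matrix(d)
  k <- ncol(d)
  # All 2^k TRUE/FALSE patterns; TRUE means "mirror this coordinate".
  # expand.grid varies the first column fastest, as in the question.
  flips <- as.matrix(expand.grid(rep(list(c(FALSE, TRUE)), k)))
  blocks <- lapply(seq_len(nrow(d)), function(i) {
    # Repeat the i-th draw once per pattern, then mirror where requested.
    x <- matrix(d[i, ], nrow = nrow(flips), ncol = k, byrow = TRUE)
    ifelse(flips, 1 - x, x)
  })
  unname(do.call(rbind, blocks))
}

d <- rbind(c1 = c(0.1, 0.6), c2 = c(0.3, 0.8))
anti <- antithetic_draws(d)
anti
```

With three columns the same call should return 2^3 = 8 rows per original draw.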
I have a matrix with 50 rows and 50 columns:
[,1] [,2] [,3]...[,50]
[1,] 1 0.8 0.7
[2,] 0.8 1 0.5
[3,] 0.7 0.5 1
...
[50,]
And I want to add 0.02 to the values above the diagonal to obtain something like this:
[,1] [,2] [,3]...[,50]
[1,] 1 0.82 0.72
[2,] 0.8 1 0.52
[3,] 0.7 0.5 1
...
[50,]
Does anyone know how to add this amount only to the values that are above the diagonal of the matrix using R?
Example of matrix code:
matrix <- as.matrix(data.frame(A = c(1, 0.8, 0.7), B = c(0.8, 1, 0.5), C = c(0.7, 0.5, 1)), nrow=3, ncol=3)
Try upper.tri like below
matrix[upper.tri(matrix)] <- matrix[upper.tri(matrix)] + 0.02
You can use the lower.tri(m) or upper.tri(m) functions in R, where m is your matrix.
m = matrix(1:36, 6, 6)
m[upper.tri(m)] = m[upper.tri(m)] + 0.02
m
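To see what upper.tri selects, here is a small sketch on the 3 x 3 matrix from the question. The variable names above and shifted are illustrative only.

```r
# The 3 x 3 example from the question.
m <- matrix(c(1, 0.8, 0.7,
              0.8, 1, 0.5,
              0.7, 0.5, 1), nrow = 3)

# Logical mask: TRUE strictly above the diagonal (diag = FALSE is the default).
above <- upper.tri(m)

# Work on a copy so the original stays available for comparison.
shifted <- m
shifted[above] <- shifted[above] + 0.02

shifted
```

Only the three cells above the diagonal change; the diagonal and the lower triangle keep their original values.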
Let's say I have two list-of-lists, one being solely binary and the other one being quantitative. The order in the lists matters. I would like to map the binary matrices onto their quantitative counterparts, creating a new list-of-lists with the same number of nested matrices with the same dimensions. These matrices will be subsets of their quantitative counterparts: values are kept where there are 1s in the binary matrices.
# dummy data
dat1 <- c(0,1,0,1,1,0,0,0,1,0,0,0,1,1,0,1)
mat1 <- matrix(dat1, ncol=4, nrow=4, byrow=T)
dat2 <- c(1,1,0,1,0,0,1,1,0,1,0,1,0,1,0,0)
mat2 <- matrix(dat1, ncol=4, nrow=4, byrow=T)
lsMat1 <- list(mat1, mat2)
dat3 <- c(0.3,0.1,0.6,0.3,0.9,0.1,0.1,0.3,0.6,0.2,0.7,0.8,0.4,0.1,0.4,0.5)
mat3 <- matrix(dat3, ncol=4, nrow=4, byrow=T)
dat4 <- c(0.5,0.3,0.6,0.8,0.1,0.4,0.5,0.1,0.5,0.1,0.0,0.1,0.4,0.6,0.0,0.8)
mat4 <- matrix(dat4, ncol=4, nrow=4, byrow=T)
lsMat2 <- list(mat3, mat4)
Desired new nested list
[[1]]
[,1] [,2] [,3] [,4]
[1,] 0.0 0.1 0 0.3
[2,] 0.9 0.0 0 0.0
[3,] 0.6 0.0 0 0.0
[4,] 0.4 0.1 0 0.5
[[2]]
[,1] [,2] [,3] [,4]
[1,] 0.0 0.3 0 0.8
[2,] 0.1 0.0 0 0.0
[3,] 0.5 0.0 0 0.0
[4,] 0.4 0.6 0 0.8
Any pointers would be highly appreciated, thanks!
I'm going to assume the output you supplied above is incorrect. Since you have 0's and 1's in your binary matrix and you only want to keep the 1's values, you can use simple elementwise multiplication. You can do that for each item in the list with
Map(`*`, lsMat1, lsMat2)
which returns
[[1]]
[,1] [,2] [,3] [,4]
[1,] 0.0 0.1 0 0.3
[2,] 0.9 0.0 0 0.0
[3,] 0.6 0.0 0 0.0
[4,] 0.4 0.1 0 0.5
[[2]]
[,1] [,2] [,3] [,4]
[1,] 0.0 0.3 0 0.8
[2,] 0.1 0.0 0 0.0
[3,] 0.5 0.0 0 0.0
[4,] 0.4 0.6 0 0.8
Given that column three in both matrices in lsMat1 is all 0, this seems more correct.
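An alternative to multiplying is to zero out the masked cells by logical indexing, which also lets you check that each pair of matrices has matching dimensions. The sketch below uses small made-up lists (mask_list and value_list are not the question's data).

```r
# Toy inputs (hypothetical): a list of 0/1 masks and a list of values.
mask_list  <- list(matrix(c(0, 1, 1, 0), nrow = 2),
                   matrix(c(1, 1, 0, 0), nrow = 2))
value_list <- list(matrix(c(0.3, 0.9, 0.1, 0.4), nrow = 2),
                   matrix(c(0.5, 0.1, 0.6, 0.2), nrow = 2))

# Map walks both lists in parallel, pairing the i-th mask with the i-th values.
masked <- Map(function(mask, values) {
  # Fail early if a pair does not line up.
  stopifnot(identical(dim(mask), dim(values)))
  # Keep values where the mask is 1, set everything else to 0.
  values[mask != 1] <- 0
  values
}, mask_list, value_list)

masked
```

For clean 0/1 masks this gives the same result as Map(`*`, ...). One difference to be aware of: 0 * NA is NA in R, so the two versions diverge if the value matrices contain missing values under a 0 in the mask.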
If I understood the question, I would do an element-wise matrix multiplication. I'm not familiar with the syntax you posted, but in MATLAB:
mat1 .* mat3
Now all elements that are zero in your binary matrix will stay zero, and all that are one will become the value from your quantitative matrix.
Hope it helps!
I have a correlation matrix:
a <- matrix(c(1, .8, .8, .8, 1, .8, .8, .8, 1), 3)
## [,1] [,2] [,3]
## [1,] 1.0 0.8 0.8
## [2,] 0.8 1.0 0.8
## [3,] 0.8 0.8 1.0
I would now like to create a covariance matrix from the correlation matrix. How can this be done in R?
I tried:
e1.sd <- 3
e2.sd <- 10
e3.sd <- 3
e.cov <- a * as.matrix(c, e1.sd, e2.sd, e3.sd) %*% t(as.matrix(c(e1.sd, e2.sd, e3.sd)))
But I get the error:
Error in a * as.matrix(c, e1.sd, e2.sd, e3.sd) %*% t(as.matrix(c(e1.sd, :
non-conformable arrays
What am I doing wrong?
If you know the standard deviations of your individual variables, you can:
stdevs <- c(e1.sd, e2.sd, e3.sd)
#stdevs is the vector that contains the standard deviations of your variables
b <- stdevs %*% t(stdevs)
# b is an n*n matrix whose generic term is stdev[i]*stdev[j] (n is your number of variables)
a_covariance <- b * a #your covariance matrix
On the other hand, if you don't know the standard deviations, it's impossible.
require(MBESS)
a <- matrix(c(1,.8,.8,.8,1,.8,.8,.8,1),3)
> cor2cov(a,c(3,10,3))
[,1] [,2] [,3]
[1,] 9.0 24 7.2
[2,] 24.0 100 24.0
[3,] 7.2 24 9.0
Building on S4M's answer, in base R, I would write this function:
cor2cov <- function(V, sd) {
V * tcrossprod(sd)
}
tcrossprod calculates the product of each pair of elements of the sd vector (equivalent to sd %*% t(sd)), which we then multiply elementwise by the correlation matrix.
Here's a quick check that the function is correct using the built-in mtcars data set:
all.equal(
cor2cov(cor(mtcars), sapply(mtcars, sd)),
cov(mtcars)
)
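Another way to sanity-check the function is a round trip through stats::cov2cor, using the correlation matrix and standard deviations from the question. The function is renamed cor2cov_sketch here only to avoid clashing with the MBESS function of the same name.

```r
# Same body as the function above, under a different (hypothetical) name.
cor2cov_sketch <- function(V, sd) {
  V * tcrossprod(sd)
}

a   <- matrix(c(1, .8, .8, .8, 1, .8, .8, .8, 1), 3)
sds <- c(3, 10, 3)

S <- cor2cov_sketch(a, sds)

# Converting back should recover the correlation matrix ...
all.equal(cov2cor(S), a)
# ... and the diagonal should hold the variances.
all.equal(sqrt(diag(S)), sds)
```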
The answer marked as correct is wrong.
The correct solution seems to be the one provided by MBESS package, so see the post from dayne.
> a
[,1] [,2] [,3]
[1,] 1.0 0.8 0.8
[2,] 0.8 1.0 0.8
[3,] 0.8 0.8 1.0
> b <- c(3,10,3)
> b %*% t(b)
[,1] [,2] [,3]
[1,] 9 30 9
[2,] 30 100 30
[3,] 9 30 9
> c <- b %*% t(b)
> c %*% a
[,1] [,2] [,3]
[1,] 40.2 44.4 40.2
[2,] 134.0 148.0 134.0
[3,] 40.2 44.4 40.2
> cor2cov(cor.mat=a, b )
[,1] [,2] [,3]
[1,] 9.0 24 7.2
[2,] 24.0 100 24.0
[3,] 7.2 24 9.0
> a %*% c
[,1] [,2] [,3]
[1,] 40.2 134 40.2
[2,] 44.4 148 44.4
[3,] 40.2 134 40.2
>
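Note that the transcript above combines the matrices with %*% (matrix product), whereas the earlier answer uses * (elementwise product). A small check, assuming the same a and the same standard deviations, suggests that the elementwise version is the one that reproduces the cor2cov output shown. The names outer_sd, elementwise and matprod are illustrative.

```r
a <- matrix(c(1, .8, .8, .8, 1, .8, .8, .8, 1), 3)
b <- c(3, 10, 3)

# Outer product of the standard deviations: entry [i, j] is b[i] * b[j].
outer_sd <- b %*% t(b)

# Elementwise product, as in the earlier answer (b * a there).
elementwise <- outer_sd * a

# Matrix product, as typed in the transcript (c %*% a there).
matprod <- outer_sd %*% a

elementwise
matprod
```

Here elementwise has 9, 100, 9 on the diagonal and 24 and 7.2 off the diagonal, while matprod starts with 40.2 as in the transcript.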
I have a large dataset (202k points). I know that there are 8 values over 0.5. I want to subset on those rows.
How do I find/return a list of the row numbers where the values are > 0.5?
If the dataset is a vector named x:
(1:length(x))[x > 0.5]
If the dataset is a data.frame or matrix named x and the variable of interest is in column j:
(1:nrow(x))[x[,j] > 0.5]
But if you just want to find the subset and don't really need the row numbers, use
subset(x, x > 0.5)
for a vector and
subset(x, x[,j] > 0.5)
for a matrix or data.frame.
which(x > 0.5)
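One practical difference between which() and indexing with the logical vector directly shows up when the data contain missing values. A small sketch on made-up data:

```r
# Toy vector (hypothetical) with one missing value.
x <- c(0.1, NA, 0.7, 0.9)

# which() returns only the positions where the condition is TRUE;
# positions where the comparison is NA are dropped.
idx_which <- which(x > 0.5)

# Indexing 1:length(x) with the logical vector keeps an NA
# for every position where the comparison is NA.
idx_logical <- seq_along(x)[x > 0.5]

idx_which
idx_logical
```

So if the 202k points might include NAs, which() is probably the safer way to get clean row numbers.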
Here's some dummy data:
D<-matrix(c(0.6,0.1,0.1,0.2,0.1,0.1,0.23,0.1,0.8,0.2,0.2,0.2),nrow=3)
Which looks like:
> D
[,1] [,2] [,3] [,4]
[1,] 0.6 0.2 0.23 0.2
[2,] 0.1 0.1 0.10 0.2
[3,] 0.1 0.1 0.80 0.2
And here's the logical row index,
index <- (rowSums(D>0.5))>=1
You can use it to extract the rows you want:
PeakRows <- D[index,]
Which looks like this:
> PeakRows
[,1] [,2] [,3] [,4]
[1,] 0.6 0.2 0.23 0.2
[2,] 0.1 0.1 0.80 0.2
Using the argument arr.ind=TRUE with which is a great way to find the row (or column) numbers where a condition is TRUE.
df <- matrix(c(0.6,0.2,0.1,0.25,0.11,0.13,0.23,0.18,0.21,0.29,0.23,0.51), nrow=4)
# [,1] [,2] [,3]
# [1,] 0.60 0.11 0.21
# [2,] 0.20 0.13 0.29
# [3,] 0.10 0.23 0.23
# [4,] 0.25 0.18 0.51
which with arr.ind=TRUE returns the array indices where the condition is TRUE
which(df > 0.5, arr.ind=TRUE)
row col
[1,] 1 1
[2,] 4 3
so the subset that excludes those rows becomes
df[-which(df > 0.5, arr.ind=TRUE)[, "row"], ]
# [,1] [,2] [,3]
# [1,] 0.2 0.13 0.29
# [2,] 0.1 0.23 0.23
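If the goal is to keep the rows that contain a value above 0.5 rather than drop them, the same row indices can be used positively. The sketch below reuses the matrix from this answer under the name m; hit_rows, keep and rest are names invented for the example.

```r
m <- matrix(c(0.6, 0.2, 0.1, 0.25, 0.11, 0.13, 0.23, 0.18,
              0.21, 0.29, 0.23, 0.51), nrow = 4)

# Row numbers with at least one value above 0.5; unique() guards against
# a row being listed once per matching column.
hit_rows <- unique(which(m > 0.5, arr.ind = TRUE)[, "row"])

# Rows that contain a value above 0.5 (drop = FALSE keeps a matrix
# even when only one row matches).
keep <- m[hit_rows, , drop = FALSE]

# The remaining rows, selected with setdiff instead of negative indexing.
rest <- m[setdiff(seq_len(nrow(m)), hit_rows), , drop = FALSE]

keep
rest
```

Using setdiff for the complement is a deliberate choice: negative indexing with an empty index vector selects nothing, so m[-hit_rows, ] would return zero rows when no value exceeds the threshold.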