zeros in c() after round() - r

I'm trying to print a vector with:
total_nb_visits <- 20302079
total_nb_visits_y1 <- 19299803
total_nb_visitors <- 17707555
total_nb_visitors_y1 <- 17196674
CVR <- 0.02274954
CVR_y1 <- 0.02293334
When I do:
exportable <- rbind(c(total_nb_visits,total_nb_visits_y1,((total_nb_visits-total_nb_visits_y1)/total_nb_visits_y1)*100),
c(total_nb_visitors,total_nb_visitors_y1,((total_nb_visitors-total_nb_visitors_y1)/total_nb_visitors_y1)*100),
c(CVR,CVR_y1,((CVR-CVR_y1)/CVR_y1)))
I'm getting:
exportable
[,1] [,2] [,3]
[1,] 20302079.00000000 19299803.00000000 5.193192905
[2,] 17707555.00000000 17196674.00000000 2.970812844
[3,] 0.02274954 0.02293334 -0.008014533
And I'd like to get:
[,1] [,2] [,3]
[1,] 20302079 19299803 5.193192905
[2,] 17707555 17196674 2.970812844
[3,] 0.02274954 0.02293334 -0.008014533
And round() is not working...
I should add that I've been also playing with options(scipen=999) to delete the scientific notation...
Can anyone help me?
Thanks !

You can use prettyNum like this (as.data.table comes from the data.table package):
library(data.table)
exportable <- as.data.table(
rbind(
c(prettyNum(total_nb_visits),
total_nb_visits_y1, round(((total_nb_visits - total_nb_visits_y1) / total_nb_visits_y1) * 100, 9)),
c(total_nb_visitors,
total_nb_visitors_y1,
round(((total_nb_visitors - total_nb_visitors_y1) / total_nb_visitors_y1) * 100, 9)),
c(CVR,CVR_y1, round((CVR-CVR_y1) / CVR_y1, 9))))
The only downside is that the values are converted to character, but you get the pretty output.
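An alternative sketch, assuming a character copy is acceptable for display only: keep the numeric matrix as it is and build a separate formatted copy by calling format() on each cell individually, so no cell inherits the decimal places another cell needs. The names vals and shown are hypothetical, and digits = 10 is an arbitrary choice.

```r
# Values copied from the question.
total_nb_visits <- 20302079
total_nb_visits_y1 <- 19299803
total_nb_visitors <- 17707555
total_nb_visitors_y1 <- 17196674
CVR <- 0.02274954
CVR_y1 <- 0.02293334

# Numeric matrix, built the same way as in the question.
vals <- rbind(
  c(total_nb_visits, total_nb_visits_y1,
    (total_nb_visits - total_nb_visits_y1) / total_nb_visits_y1 * 100),
  c(total_nb_visitors, total_nb_visitors_y1,
    (total_nb_visitors - total_nb_visitors_y1) / total_nb_visitors_y1 * 100),
  c(CVR, CVR_y1, (CVR - CVR_y1) / CVR_y1)
)

# Format every cell on its own. vapply() walks the matrix in column-major
# order, so matrix() with the same number of rows restores the layout.
shown <- matrix(vapply(vals, format, character(1), digits = 10),
                nrow = nrow(vals))

# Print without quotes and right-aligned, like a numeric matrix.
print(shown, quote = FALSE, right = TRUE)
```

The numeric matrix vals stays available for further calculations; only shown is character.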

Related

Efficiently finding minimum cells values from a set of matrices in R

I have a list of matrices (size n*n), and I need to create a new matrix giving the minimum value observed for each cell, based on my list.
For instance, with the following matrices list:
> a = list(matrix(rexp(9), 3), matrix(rexp(9), 3), matrix(rexp(9), 3))
> a
[[1]]
[,1] [,2] [,3]
[1,] 0.5220069 0.39643016 0.04255687
[2,] 0.4464044 0.66029350 0.34116609
[3,] 2.2495949 0.01705576 0.08861866
[[2]]
[,1] [,2] [,3]
[1,] 0.3823704 0.271399 0.7388449
[2,] 0.1227819 1.160775 1.2131681
[3,] 0.1914548 1.004209 0.7628437
[[3]]
[,1] [,2] [,3]
[1,] 0.2125612 0.45379057 1.5987420
[2,] 0.3242311 0.02736743 0.4372894
[3,] 0.6634098 1.15401347 0.9008529
The output should be:
[,1] [,2] [,3]
[1,] 0.2125612 0.271399 0.04255687
[2,] 0.1227819 0.02736743 0.34116609
[3,] 0.1914548 0.01705576 0.08861866
I tried using apply loop with the following code (using melt and dcast from reshape2 library):
library(reshape2)
all = melt(a)
allComps = unique(all[,c(1:2)])
allComps$min=apply(allComps, 1, function(x){
g1 = x[1]
g2 = x[2]
b = unlist(lapply(a, function(y){
return(y[g1,g2])
}))
return(b[which(b==min(b))])
})
dcast(allComps, Var1~Var2)
It works but it is taking a very long time to run when applied on large matrices (6000*6000). I am looking for a faster way to do this.
Use Reduce with pmin (the output below is for the seeded data given afterwards):
Reduce(pmin, a)
# [,1] [,2] [,3]
#[1,] 0.02915345 0.03157736 0.3142273
#[2,] 0.57661027 0.05621098 0.1452668
#[3,] 0.48021473 0.18828404 0.4787604
data
set.seed(123)
a = list(matrix(rexp(9), 3), matrix(rexp(9), 3), matrix(rexp(9), 3))
Consider storing the matrices in an array instead of a list, which can be done with simplify2array. In an array, the minimum over specific dimensions can be found by using min in apply.
A <- simplify2array(a)
apply(A, 1:2, min)
We can also build the array directly, here with the dimensions hard-coded for three 3 x 3 matrices:
apply(array(unlist(a), c(3, 3, 3)), 1:2, min)
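As a quick sanity check, the three approaches above can be compared on the same seeded data. This is only a sketch: the variable names are hypothetical, and the array dimensions are taken from the list rather than hard-coded.

```r
set.seed(123)
a <- list(matrix(rexp(9), 3), matrix(rexp(9), 3), matrix(rexp(9), 3))

# 1. Fold the list with the element-wise minimum.
by_reduce <- Reduce(pmin, a)

# 2. Stack the matrices into a 3-d array, then take min over the third dimension.
by_array <- apply(simplify2array(a), 1:2, min)

# 3. Same idea, building the array by hand from the flattened list.
by_unlist <- apply(array(unlist(a), c(dim(a[[1]]), length(a))), 1:2, min)

stopifnot(
  isTRUE(all.equal(by_reduce, by_array)),
  isTRUE(all.equal(by_reduce, by_unlist))
)
```

Reduce(pmin, a) never builds the full stacked array and makes one vectorised pass per matrix, so it is probably the better fit for 6000 x 6000 inputs; no timings were measured here.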

Conditional function in R which returns a matrix

I am sorry in advance if that's a silly question, but I am a bit new to it.
I would like to write a for loop where the input is a time sequence. Based on the time conditions I would like to select either mat1, mat2, or mat3 to substitute the "mat" parameter and multiply it by 2.
output <- mat*2 #general function
For each time point, I need to have an output.
time=seq(0,10, by=1)
mat1 <- matrix(data = rexp(9, rate = 10), nrow = 3, ncol = 3)
mat2 <- matrix(data = rexp(9, rate = 10), nrow = 3, ncol = 3)
mat3 <- matrix(data = rexp(9, rate = 10), nrow = 3, ncol = 3)
I would like when the time <= 3 the "mat1" to be selected
when the time>3 & time<=6 the "mat2" to be selected
and when the time >6 the "mat3" to be selected and then multiplied by 2.
I know that all this is a bit sketchy but any help would be highly appreciated.
By the way, if you want a list of consecutive integers you can simply use time <- 0:10
Here is one method
lapply(as.character(cut(time,c(-1,3.1,6.1,10),labels=c('mat1','mat2','mat3'))), function(x) get(x)*2)
[[1]]
[,1] [,2] [,3]
[1,] 0.4013379 1.2690301 0.142831401
[2,] 0.1536697 0.1132762 0.040964909
[3,] 0.1412248 0.2209273 0.007446217
[[2]]
[,1] [,2] [,3]
[1,] 0.4013379 1.2690301 0.142831401
[2,] 0.1536697 0.1132762 0.040964909
[3,] 0.1412248 0.2209273 0.007446217
...
[[10]]
[,1] [,2] [,3]
[1,] 0.16712782 0.06451693 0.06554605
[2,] 0.03614116 0.18526124 0.46443236
[3,] 0.53055007 0.01203971 0.16585931
[[11]]
[,1] [,2] [,3]
[1,] 0.16712782 0.06451693 0.06554605
[2,] 0.03614116 0.18526124 0.46443236
[3,] 0.53055007 0.01203971 0.16585931
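A variation on the same idea, sketched under the assumption that the three matrices can be kept in a named list: cut() with labels = FALSE returns the interval number directly, which avoids looking variables up by name with get(). The names mats, idx and output are hypothetical.

```r
set.seed(1)
mats <- list(
  mat1 = matrix(rexp(9, rate = 10), nrow = 3, ncol = 3),
  mat2 = matrix(rexp(9, rate = 10), nrow = 3, ncol = 3),
  mat3 = matrix(rexp(9, rate = 10), nrow = 3, ncol = 3)
)
time <- 0:10

# Intervals are (-Inf, 3], (3, 6] and (6, Inf], matching the conditions
# time <= 3, 3 < time <= 6 and time > 6 from the question.
idx <- cut(time, breaks = c(-Inf, 3, 6, Inf), labels = FALSE)

# One doubled matrix per time point.
output <- lapply(idx, function(i) mats[[i]] * 2)
names(output) <- paste0("t", time)
```

Using -Inf and Inf as the outer breaks means the code keeps working if the time sequence is later extended beyond 10.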

Fastest R equivalent to MATLAB's reshape() method?

I am converting a MATLAB script into R and regretting it so far, as it is slower at the moment. I'm trying to use "vectorized functions" as much as possible, but I'm relatively new to R and do not know what is meant by this. From my research for loops are only slower than the apply() method in R if you use loads of operators (including the parenthesis). Otherwise, I don't see what R could have done to slow down it further. Here is code that works that I want to speed up.
somPEs <- 9;
inputPEs <- 6;
initial_w <- matrix(1, nrow=somPEs, ncol=inputPEs)
w <- apply(initial_w, 1, function(i) runif(i));
# Reshape w to a 3D matrix of dimension: c(sqrt(somPEs), sqrt(somPEs), inputPEs)
nw <- array(0, dim=c(sqrt(somPEs), sqrt(somPEs), inputPEs))
for (i in 1:inputPEs) {
nw[,,i] <- matrix(w[i,], nrow=sqrt(somPEs), ncol=sqrt(somPEs), byrow=TRUE)
}
w <- nw;
In MATLAB, this is done by a built-in function called "reshape", as shown below:
w = reshape(w,[sqrt(somPEs) sqrt(somPEs) inputPEs]);
I timed my current R code and it's actually super fast, but I'd still like to learn about vectorization and how to convert my code to apply() for readability's sake.
user system elapsed
0.003 0.000 0.002
The first step is to convert your array w from 6x9 to 3x3x6 size, which in your case can be done by transposing and then changing the dimension:
neww <- t(w)
dim(neww) <- c(sqrt(somPEs), sqrt(somPEs), inputPEs)
This is almost what we want, except that the first two dimensions are flipped. You can use the aperm function to transpose them:
neww <- aperm(neww, c(2, 1, 3))
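To see why both steps are needed, here is a tiny sketch with a hand-checkable matrix (2 rows of 4 values, so each row becomes one 2 x 2 slice). Setting dim fills each slice column by column, and aperm then swaps rows and columns within each slice, which reproduces the byrow=TRUE fill of the original loop. The variable side is a hypothetical helper name.

```r
w <- matrix(1:8, nrow = 2, byrow = TRUE)  # rows: 1 2 3 4 / 5 6 7 8
side <- sqrt(ncol(w))

neww <- t(w)                          # values now stored as 1, 2, ..., 8
dim(neww) <- c(side, side, nrow(w))   # slices are filled column by column
neww[, , 1]
#      [,1] [,2]
# [1,]    1    3
# [2,]    2    4

neww <- aperm(neww, c(2, 1, 3))       # transpose every slice
neww[, , 1]
#      [,1] [,2]
# [1,]    1    2
# [2,]    3    4
```

MATLAB's reshape and R's dim<- both fill in column-major order, so the extra aperm is only needed here because the loop in the question fills each slice by row.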
This should be a good deal quicker than looping through the matrix and copying the data over row by row. To see this, let's look at a larger example with 10,000 rows and 100 columns (which will be mapped to a 10 x 10 x 10,000 array):
josilber <- function(w) {
neww <- t(w)
dim(neww) <- c(sqrt(dim(w)[2]), sqrt(dim(w)[2]), dim(w)[1])
aperm(neww, c(2, 1, 3))
}
OP <- function(w) {
nw <- array(0, dim=c(sqrt(dim(w)[2]), sqrt(dim(w)[2]), dim(w)[1]))
for (i in 1:(dim(w)[1])) {
nw[,,i] <- matrix(w[i,], nrow=sqrt(dim(w)[2]), ncol=sqrt(dim(w)[2]), byrow=TRUE)
}
nw
}
bigw <- matrix(runif(1000000), nrow=10000, ncol=100)
all.equal(josilber(bigw), OP(bigw))
# [1] TRUE
library(microbenchmark)
microbenchmark(josilber(bigw), OP(bigw))
# Unit: milliseconds
# expr min lq mean median uq max neval
# josilber(bigw) 8.483245 9.08430 14.46876 9.431534 11.76744 135.7204 100
# OP(bigw) 83.379053 97.07395 133.86606 117.223236 129.28317 1553.4381 100
The approach using t, dim, and aperm is more than 10x faster in median runtime than the looping approach.
I did not test the speed, but you could try
nw1 <- aperm(`dim<-`(t(w), list(3, 3, 6)), c(2, 1, 3))
> nw1
, , 1
[,1] [,2] [,3]
[1,] 0.8257185 0.5475478 0.4157915
[2,] 0.8436991 0.3310513 0.1546463
[3,] 0.1794918 0.1836032 0.2675192
, , 2
[,1] [,2] [,3]
[1,] 0.6914582 0.1674163 0.2921129
[2,] 0.2558240 0.4269716 0.7335542
[3,] 0.6416367 0.8771934 0.6553210
, , 3
[,1] [,2] [,3]
[1,] 0.9761232 0.05223183 0.6651574
[2,] 0.5740032 0.80621864 0.2295017
[3,] 0.1138926 0.76009870 0.6932736
, , 4
[,1] [,2] [,3]
[1,] 0.437871558 0.5172516 0.1145181
[2,] 0.006923583 0.3235762 0.3751655
[3,] 0.823235642 0.4586850 0.6013853
, , 5
[,1] [,2] [,3]
[1,] 0.7425735 0.1665975 0.8659373
[2,] 0.1418979 0.1878132 0.2357267
[3,] 0.6963537 0.5391961 0.1112467
, , 6
[,1] [,2] [,3]
[1,] 0.7246276 0.02896792 0.04692648
[2,] 0.7563403 0.22027518 0.41138672
[3,] 0.8303413 0.31908307 0.25180560

How to generate symmetric random matrix?

I want to generate a random matrix which should be symmetric.
I have tried this:
matrix(sample(0:1, 25, TRUE), 5, 5)
but it is not necessarily symmetric.
How can I do that?
Another interesting option is based on the following mathematical fact: if A is any matrix, then A multiplied by its transpose is always symmetric.
> A <- matrix(runif(25), 5, 5)
> A %*% t(A)
[,1] [,2] [,3] [,4] [,5]
[1,] 1.727769 1.0337816 1.2195505 1.4661507 1.1041355
[2,] 1.033782 1.0037048 0.7368944 0.9073632 0.7643080
[3,] 1.219551 0.7368944 1.8383986 1.3309980 0.9867812
[4,] 1.466151 0.9073632 1.3309980 1.3845322 1.0034140
[5,] 1.104135 0.7643080 0.9867812 1.0034140 0.9376534
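The fact follows from (A A^T)^T = (A^T)^T A^T = A A^T. A short numerical check, as a sketch only; it also shows that the entries of the product no longer follow the distribution of the entries of A, which matters if specific values such as 0/1 are required.

```r
set.seed(1)
A <- matrix(runif(25), 5, 5)
S <- A %*% t(A)

isSymmetric(S)               # symmetric up to a small numerical tolerance
range(A)                     # entries of A lie in (0, 1)
range(S)                     # entries of S are sums of products, so they differ
all.equal(S, tcrossprod(A))  # tcrossprod(A) computes the same product
```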
Try this from the Matrix package
library(Matrix)
x<-Matrix(rnorm(9),3)
x
3 x 3 Matrix of class "dgeMatrix"
[,1] [,2] [,3]
[1,] -0.9873338 0.8965887 -0.6041742
[2,] -0.3729662 -0.5882091 -0.2383262
[3,] 2.1263985 -0.3550972 0.1067264
X<-forceSymmetric(x)
X
3 x 3 Matrix of class "dsyMatrix"
[,1] [,2] [,3]
[1,] -0.9873338 0.8965887 -0.6041742
[2,] 0.8965887 -0.5882091 -0.2383262
[3,] -0.6041742 -0.2383262 0.1067264
If you don't want to use a package:
n=3
x <- matrix(rnorm(n*n), n)
ind <- lower.tri(x)
x[ind] <- t(x)[ind]
x
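The same trick carries over to the 0/1 matrix from the question. A minimal sketch, assuming the entries should stay in 0:1 and that mirroring the upper triangle is acceptable:

```r
set.seed(1)
n <- 5
x <- matrix(sample(0:1, n * n, replace = TRUE), n, n)

ind <- lower.tri(x)
x[ind] <- t(x)[ind]  # overwrite the lower triangle with the mirrored upper triangle

isSymmetric(x)
```

Only the upper triangle and the diagonal are kept from the random draw; the lower triangle is discarded.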
I like this one:
n <- 3
aux <- matrix(NA, nrow = n, ncol = n)
for(i in c(1:n)){
for(j in c(i:n)){
aux[i,j] <- sample(c(1:n), 1)
aux[j,i] <- aux[i,j]
}
}

Choleski Decomposition in R to get the inverse when pivot = TRUE

I am using the Choleski decomposition to compute the inverse of a positive semidefinite matrix. However, when the matrix becomes extremely large and contains zeros, it is no longer positive definite numerically (from the computer's point of view). To get around this problem I use the pivot = TRUE option of chol in R. However, as you will see below, the two versions return the same values but with the rows and columns of the matrix rearranged. Is there a way (or a transformation) to make them the same? Here is my code:
X = matrix(rnorm(9),nrow=3)
A = X%*%t(X)
inv1 = function(A){
Q = chol(A)
L = t(Q)
inverse = solve(Q)%*%solve(L)
return(inverse)
}
inv2 = function(A){
Q = chol(A,pivot=TRUE)
L = t(Q)
inverse = solve(Q)%*%solve(L)
return(inverse)
}
Which when run results in:
> inv1(A)
[,1] [,2] [,3]
[1,] 9.956119 -8.187262 -4.320911
[2,] -8.187262 7.469862 3.756087
[3,] -4.320911 3.756087 3.813175
>
> inv2(A)
[,1] [,2] [,3]
[1,] 7.469862 3.756087 -8.187262
[2,] 3.756087 3.813175 -4.320911
[3,] -8.187262 -4.320911 9.956119
Is there a way to get the two answers to match? I want inv2() to return the answer from inv1().
That is explained in ?chol: the column permutation is returned as an attribute.
inv2 <- function(A){
Q <- chol(A,pivot=TRUE)
Q <- Q[, order(attr(Q,"pivot"))]
Qi <- solve(Q)
Qi %*% t(Qi)
}
inv2(A)
solve(A) # same result, up to floating-point error
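The same un-pivoting can also be combined with chol2inv(), which inverts a matrix from its Choleski factor. This is a sketch under the assumption that A has full rank; the names oo and inv_pivot are hypothetical. Per ?chol, crossprod(Q) equals A[pivot, pivot], so the inverse has to be permuted back afterwards.

```r
set.seed(42)
X <- matrix(rnorm(9), nrow = 3)
A <- X %*% t(X)

Q <- chol(A, pivot = TRUE)
oo <- order(attr(Q, "pivot"))  # inverse of the pivot permutation

# Undoing the column permutation of Q recovers A itself.
all.equal(crossprod(Q[, oo]), A)

# chol2inv(Q) is the inverse of the permuted matrix A[pivot, pivot];
# reordering rows and columns with oo gives the inverse of A.
inv_pivot <- chol2inv(Q)[oo, oo]
all.equal(inv_pivot, solve(A))
```

For a rank-deficient matrix chol(pivot = TRUE) issues a warning and only the leading block of Q is meaningful, so this sketch would not apply as written.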
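Typically, a column permutation like this is expressed with a permutation matrix, for example via the Matrix package: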
M = matrix(rnorm(9),3)
M
[,1] [,2] [,3]
[1,] 1.2109251 -0.58668426 -0.4311855
[2,] -0.8574944 0.07003322 -0.6112794
[3,] 0.4660271 -0.47364400 -1.6554356
library(Matrix)
pm1 <- as(as.integer(c(2,3,1)), "pMatrix")
M %*% pm1
[,1] [,2] [,3]
[1,] -0.4311855 1.2109251 -0.58668426
[2,] -0.6112794 -0.8574944 0.07003322
[3,] -1.6554356 0.4660271 -0.47364400
