Finding dot product in R

I am trying to find the dot product of two matrices in R. In the q matrix, which must be transposed, I have three different q values that I randomly generated earlier, and in the z matrix three randomly generated z values that serve as coordinates of a random point i. I have:
z0= NULL
for (i in 1:100){
z0[i]= 1
}
z1= runif(100, min=0, max= 20)
z2= runif(100, min=0, max=20)
q0= runif(1, 0, 1)
q1= runif(1, 0, 1)
q2= runif(1, 0, 1)
i= runif(1, 1, 101)
i= ceiling(i-1)
q= matrix(c(q0,q1,q2), ncol=3)
z= matrix(c(z0[i],z1[i],z2[i]), ncol=3)
s[i]= t(q)*z
However, when I try to calculate s[i], I get Error in t(q) * z : non-conformable arrays. I am not sure why this would be, as they both seem to have the same length.
This is my first time using R so I am not really sure what is going on.
Thanks!

Without using matrices or any special libraries:
The dot product of two vectors can be calculated by multiplying them element-wise with * and then summing the result.
a <- c(1,2,3)
b <- c(4,5,6)
sum(a*b)

As Pascal says, the dot product in R is computed with %*%, the matrix multiplication operator. I am able to use this successfully on your sample data:
> z0= NULL
> for (i in 1:100){
+ z0[i]= 1
+ }
> z1= runif(100, min=0, max= 20)
> z2= runif(100, min=0, max=20)
> q0= runif(1, 0, 1)
> q1= runif(1, 0, 1)
> q2= runif(1, 0, 1)
> i= runif(1, 1, 101)
> i= ceiling(i-1)
> q= matrix(c(q0,q1,q2), ncol=3)
> z= matrix(c(z0[i],z1[i],z2[i]), ncol=3)
> t(q)%*%z
[,1] [,2] [,3]
[1,] 0.3597998 3.227388 2.960053
[2,] 0.3544622 3.179510 2.916141
[3,] 0.3550781 3.185035 2.921208
> z%*%t(q)
[,1]
[1,] 4.340265
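To spell out why the original t(q) * z fails: * is element-wise and needs both operands to have the same dimensions, whereas t(q) is 3 x 1 and z is 1 x 3. In the output above, t(q) %*% z is the 3 x 3 outer product, while z %*% t(q) is the 1 x 1 dot product. A minimal sketch with made-up values (the numbers are illustrative, not taken from the question) showing three equivalent ways to get the scalar:

```r
# Illustrative 1 x 3 row matrices (values are assumptions, not the question's data)
q <- matrix(c(0.2, 0.5, 0.9), ncol = 3)
z <- matrix(c(1, 4, 7), ncol = 3)

dim(t(q))  # 3 1
dim(z)     # 1 3, so t(q) * z is non-conformable

# Three equivalent ways to get the scalar dot product
elementwise   <- sum(q * z)               # same dims, so * works element-wise
via_matmul    <- drop(q %*% t(z))         # (1 x 3) %*% (3 x 1) gives 1 x 1
via_crossprod <- drop(tcrossprod(q, z))   # tcrossprod(q, z) is q %*% t(z)
```

drop() removes the 1 x 1 matrix wrapper so the result can be stored directly in a numeric vector such as s[i].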

Sample Answer:
library(geometry)
dot(A,B)

Since it seems like others have tackled your issue, I'll just add on to say that if you want a special dot product function, you can write one yourself:
dot <- function(x, y){   # x and y can be vectors or matrices
  result <- t(x) %*% y   # t(x) is the transpose of x; %*% is the matrix multiplication operator
  print(result)
}
Or, as #user3503711 says in his answer, you can just use the dot() function from the geometry library.

Related

Creating many pseudo-random matrices at the same time in R? Comparing the signs at matching positions?

I can make one pseudo-random matrix with the following :
nc=14
nr=14
set.seed(111)
M=matrix(sample(
c(runif(58,min=-1,max=0),runif(71, min=0,max=0),
runif(nr*nc-129,min=0,max=+1))), nrow=nr, nc=nc)
The more important question: I need 1000 matrices with the same number of negative, positive and zero values; only their locations in the matrices need to vary.
I can make matrices one by one, but I want to do this task faster.
The less important question: once I have the 1000 matrices, I need to count, for every position in the matrices, how many positive, negative or zero values occurred there. For example:
MATRIX_A
[,1]
[9,] -0,2
MATRIX_B
[,1]
[9,] -0,5
MATRIX_C
[,1]
[9,] 0,1
MATRIX_D
[,1]
[9,] 0,0
MATRIX_E
[,1]
[9,] 0,9
What I need:
FINAL_MATRIX_positive
[,1]
[9,] (2/5*100)=40% or 0,4 or 2
because out of the 5 matrices, 2 had a positive value at this position. I need the same for negative and zero values too.
If it isn't possible to do this in R, I can compare them "manually" in Excel.
Thank you for your help!
Actually you are almost there!
You can try the code below, where replicate generates the random matrix 1000 times and Reduce computes the statistics for each position:
nc <- 14
nr <- 14
N <- 1000
lst <- replicate(
N,
matrix(sample(
c(
runif(58, min = -1, max = 0),
runif(71, min = 0, max = 0),
runif(nr * nc - 129, min = 0, max = +1)
)
), nrow = nr, nc = nc),
simplify = FALSE
)
pos <- Reduce(`+`,lapply(lst,function(M) M > 0))/N
neg <- Reduce(`+`,lapply(lst,function(M) M < 0))/N
zero <- Reduce(`+`,lapply(lst,function(M) M == 0))/N
I use a function for your simulation scheme:
my_sim <- function(n_neg = 58, n_0 = 71, n_pos = 67){
res <- c(runif(n_neg, min=-1, max=0),
rep(0, n_0),
runif(n_pos, min=0, max=+1))
return(sample(res))
}
Then, I simulate your matrices (I store them in a list):
N <- 1000
nr <- 14
nc <- nr
set.seed(111)
my_matrices <- list()
for(i in 1:N){
my_matrices[[i]] <- matrix(my_sim(), nrow = nr, ncol = nc)
}
Finally, I compute the proportion of positive numbers for the position row 1 and column 9:
sum(sapply(my_matrices, function(x) x[1,9]) > 0)/N
# [1] 0.366
However, if you are interested in all the positions, these lines will do the job:
aux <- lapply(my_matrices, function(x) x > 0)
FINAL_MATRIX_positive <- 0
for(i in 1:N){
FINAL_MATRIX_positive <- FINAL_MATRIX_positive + aux[[i]]
}
FINAL_MATRIX_positive <- FINAL_MATRIX_positive/N
# row 1, column 9
FINAL_MATRIX_positive[1, 9]
# [1] 0.366
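As an alternative to accumulating with a loop or Reduce, the matrices can be stacked into a 3-D array and averaged over the third dimension. This is only a sketch: sim_matrix is a hypothetical helper that mirrors the simulation scheme above, and N is kept small here for speed.

```r
set.seed(111)
nr <- 14
nc <- 14
N  <- 200   # smaller than 1000 just to keep the sketch fast

# Hypothetical helper: one shuffled matrix with 58 negatives, 71 zeros, 67 positives
sim_matrix <- function() {
  vals <- c(runif(58, min = -1, max = 0),
            rep(0, 71),
            runif(nr * nc - 129, min = 0, max = 1))
  matrix(sample(vals), nrow = nr, ncol = nc)
}

# replicate() with simplify = "array" stacks the matrices into an nr x nc x N array
arr <- replicate(N, sim_matrix(), simplify = "array")

# rowMeans(..., dims = 2) averages over the third dimension, position by position
pos  <- rowMeans(arr > 0,  dims = 2)
neg  <- rowMeans(arr < 0,  dims = 2)
zero <- rowMeans(arr == 0, dims = 2)
```

Each of pos, neg and zero is a 14 x 14 matrix of proportions, and the three add up to 1 at every position.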

Generate random numbers in R satisfying constraints

I need help with a code to generate random numbers according to constraints.
Specifically, I am trying to simulate random numbers ALFA and BETA from, respectively, a Normal and a Gamma distribution such that ALFA - BETA < 1.
Here is what I have written but it does not work at all.
set.seed(42)
n <- 0
repeat {
n <- n + 1
a <- rnorm(1, 10, 2)
b <- rgamma(1, 8, 1)
d <- a - b
if (d < 1)
alfa[n] <- a
beta[n] <- b
l = length(alfa)
if (l == 10000) break
}
Due to vectorization, it will be faster to generate the numbers "all at once" rather than in a loop:
set.seed(42)
N = 1e5
a = rnorm(N, 10, 2)
b = rgamma(N, 8, 1)
d = a - b
alfa = a[d < 1]
beta = b[d < 1]
length(alfa)
# [1] 36436
This generated 100,000 candidates, 36,436 of which met your criteria. If you want n samples, try setting N = 4 * n, which will probably generate more than enough, and keep the first n.
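If "probably more than enough" is not good enough, the vectorized approach can be wrapped in a loop over batches that stops once n pairs have been accepted. A minimal sketch, where sample_constrained and its batch argument are hypothetical names introduced for illustration:

```r
# Hypothetical helper: draw candidates in batches until n pairs satisfy a - b < 1
sample_constrained <- function(n, batch = 4 * n) {
  alfa <- numeric(0)
  beta <- numeric(0)
  while (length(alfa) < n) {
    a <- rnorm(batch, 10, 2)
    b <- rgamma(batch, 8, 1)
    keep <- (a - b) < 1          # accept only pairs meeting the constraint
    alfa <- c(alfa, a[keep])
    beta <- c(beta, b[keep])
  }
  # Keep exactly the first n accepted pairs
  data.frame(alfa = alfa[seq_len(n)], beta = beta[seq_len(n)])
}

set.seed(42)
res <- sample_constrained(10000)
```

The vectors still grow, but only once per batch rather than once per draw, so the cost of growing them stays small.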
Your loop has 2 problems: (a) you need curly braces to enclose multiple lines after an if statement. (b) you are using n as an attempt counter, but it should be a success counter. As written, your loop will only stop if the 10000th attempt is a success. Move n <- n + 1 inside the if statement to fix:
set.seed(42)
n <- 0
alfa = numeric(0)
beta = numeric(0)
repeat {
a <- rnorm(1, 10, 2)
b <- rgamma(1, 8, 1)
d <- a - b
if (d < 1) {
n <- n + 1
alfa[n] <- a
beta[n] <- b
l = length(alfa)
if (l == 500) break
}
}
But the first way is better... due to "growing" alfa and beta in the loop, and generating numbers one at a time, this method takes longer to generate 500 numbers than the code above takes to generate 30,000.
As commented by #Gregor Thomas, your attempt fails because the curly braces around the body of the if statement are missing. If you would like to skip {} for the if control, you can try the code below:
set.seed(42)
r <- list()
repeat {
a <- rnorm(1, 10, 2)
b <- rgamma(1, 8, 1)
d <- a - b
if (d < 1) r[[length(r)+1]] <- cbind(alfa = a, beta = b)
if (length(r) == 100000) break
}
r <- do.call(rbind,r)
such that
> head(r)
alfa beta
[1,] 9.787751 12.210648
[2,] 9.810682 14.046190
[3,] 9.874572 11.499204
[4,] 6.473674 8.812951
[5,] 8.720010 8.799160
[6,] 11.409675 10.602608

R solver optimization

I am new to R solver and I want to have a simple example in R for the below problem:
I have four columns which I calculate the individual sums as the illustrated sample example below:
The problem I want to solve in R:
Find the optimal lines (rows) that simultaneously satisfy the statements below:
For the first two columns (a, b), the individual sums should be as close as possible to 0
The sums of (c, d) should be as close as possible to 5
I do not have restrictions of which package solver to use. It could be helpful to have an example of R code for this!
EDIT
For the same solution I would like to apply some rules:
I want the sum(c) > sum(d) AND sum(d) < (static number, like 5)
Also, if I want the sums to fall into a range of numbers rather than hit static numbers, how could the solution be written?
Using M defined reproducibly in the Note at the end we find the b which minimizes the following objective where b is a 0/1 vector:
sum((b %*% M - c(0, 0, 5, 5))^2)
1) CVXR Using the CVXR package we get a solution c(1, 0, 0, 1, 1) which means choose rows 1, 4 and 5.
library(CVXR)
n <- nrow(M)
b <- Variable(n, boolean = TRUE)
pred <- t(b) %*% M
y <- c(0, 0, 5, 5)
objective <- Minimize(sum((t(y) - pred)^2))
problem <- Problem(objective)
soln <- solve(problem)
bval <- soln$getValue(b)
zapsmall(c(bval))
## [1] 1 0 0 1 1
2) Brute Force Alternately since there are only 5 rows there are only 2^5 possible solutions so we can try them all and pick the one which minimizes the objective. First we compute a matrix solns with 2^5 columns such that each column is one possible solution. Then we compute the objective function for each column and take the one which minimizes it.
n <- nrow(M)
inverse.which <- function(ix, n) replace(integer(n), ix, 1)
L <- lapply(0:n, function(i) apply(combn(n, i), 2, inverse.which, n))
solns <- do.call(cbind, L)
pred <- t(t(solns) %*% M)
obj <- colSums((pred - c(0, 0, 5, 5))^2)
solns[, which.min(obj)]
## [1] 1 0 0 1 1
Note
M <- matrix(c(.38, -.25, .78, .83, -.65,
.24, -.35, .44, -.88, .15,
3, 5, 13, -15, 18,
18, -7, 23, -19, 7), 5)
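The rules from the EDIT (sum(c) > sum(d) and sum(d) < 5) are not covered by the two solutions above. With only 2^5 candidates, one way to handle them is to filter the brute-force enumeration before minimizing. This is a sketch under the assumption that the constraints are hard requirements; it builds the candidates with expand.grid rather than combn, and the names grid, sums and feasible are introduced here for illustration.

```r
# M as defined in the Note
M <- matrix(c(.38, -.25, .78, .83, -.65,
              .24, -.35, .44, -.88, .15,
              3, 5, 13, -15, 18,
              18, -7, 23, -19, 7), 5)
n <- nrow(M)
target <- c(0, 0, 5, 5)

# All 2^n 0/1 selection vectors, one per row of grid
grid <- as.matrix(expand.grid(rep(list(0:1), n)))
sums <- grid %*% M                          # sums of a, b, c, d for each selection
obj  <- rowSums(sweep(sums, 2, target)^2)   # squared distance to the target sums

# Unconstrained minimizer, for comparison with the solutions above
best_unconstrained <- unname(grid[which.min(obj), ])

# Constraints from the EDIT: sum(c) > sum(d) and sum(d) < 5
feasible <- sums[, 3] > sums[, 4] & sums[, 4] < 5
idx <- which(feasible)
best_constrained <- unname(grid[idx[which.min(obj[idx])], ])
```

A range requirement can be expressed the same way, by adding further logical conditions on the columns of sums to feasible (for example a lower and an upper bound on sums[, 3]). For larger problems the enumeration grows as 2^n, so a solver-based formulation would be needed instead.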

Function to multiply every matrix by every vector?

I am implementing a technique whereby I take two matrices (M1 and M2) and multiply them each by the same "skewer" vector (B), producing results vectors R1 and R2, then taking a correlation of these vectors, as so:
P1 <- data.frame(split(rnorm(5*16, 1, 1), 1:16))
M1 <- matrix(unlist(P1[1,]), nrow = 4)
M1[upper.tri(M1)] <- t(M1)[upper.tri(M1)]
P2 <- data.frame(split(rnorm(5*16, 1, 1), 1:16))
M2 <- matrix(unlist(P2[1,]), nrow = 4)
M2[upper.tri(M2)] <- t(M2)[upper.tri(M2)]
B <- rnorm(4, 0, 1)
R1 <- M1 %*% B
R2 <- M2 %*% B
cor(R1, R2)
However, I need to extend this in two ways: i) I need to do this for n (4000, but showing here for 2) vectors of B, which I have achieved using a function as below, and ii) performing this for each iteration of a posterior distribution of the matrices (1000, using 5 here in the example), which I have achieved using a for loop inside the function. This returns a data frame with one row per iteration, and 1 column per skewer, and each cell giving the correlation of response vectors. While this works, the for loop is slow -
com_rsk_p2 <- function(m1, m2, n = 2){
nitt <- length(m1[,1])
k <- sqrt(length(m1))
B <- split(rnorm(n*k, 0, 1), 1:n)
rv_cor <- split(rep(NA, times = n*nitt), 1:nitt)
for(i in 1:nitt){
R1 <- sapply(B, function(x) x %*% matrix(unlist(m1[i,]), ncol = k))
R2 <- sapply(B, function(x) x %*% matrix(unlist(m2[i,]), ncol = k))
rv_cor[[i]] <- diag(matrix(mapply(cor, list(R1), list(R2)), ncol = n))
}
return(t(data.frame(rv_cor)))
}
I've been working on this for a couple of days, but coming up short - is it possible to use a non-looping/apply approach so that each iteration of M1 and M2 are multiplied by each skewer, storing the result vector correlations for each case? I'm sure there must be some trick that I am missing!
> out <- com_rsk_p2(P1, P2)
> out
[,1] [,2]
X1 0.7622732 0.8156658
X2 0.4414054 0.4266134
X3 0.4388098 -0.1248999
X4 0.5438046 0.7723585
X5 -0.5833943 -0.5294521
Ideally I'd like to have the objects R1 and R2 remain within the function because I will use these for some other things later on when I add to this function (calculating angles between vectors etc.).
Updated 26/04/2018 I have created a list of the matrices, and matrix of the B vectors, and I can multiply a single matrix by each vector of B as below - the key I am looking for is to extend this to an efficient approach that multiplies each matrix in the list by each vector of B:
P1 <- data.frame(split(round(rnorm(5*16, 1, 1),2), 1:16))
P2 <- data.frame(split(round(rnorm(5*16, 1, 1),2), 1:16))
nitt <- length(P1[,1])
k <- sqrt(length(P1))
M1L <- list(rep(NA, times = nitt))
M2L <- list(rep(NA, times = nitt))
for(i in 1:nitt){
M <- matrix(P1[i,], byrow = T, ncol = k)
M[lower.tri(M)] <- t(M)[lower.tri(M)]
M1L[[i]] <- M
M <- matrix(P2[i,], byrow = T, ncol = k)
M[lower.tri(M)] <- t(M)[lower.tri(M)]
M2L[[i]] <- M
}
B <- matrix(round(rnorm(2*4, 0, 1),2), ncol = 2)
matrix(unlist(M2L[[1]]), ncol = 4) %*% B
> matrix(unlist(M2L[[1]]), ncol = 4) %*% B
[,1] [,2]
[1,] 0.1620 -0.3203
[2,] 0.6027 0.8148
[3,] 0.9763 -1.3177
[4,] -0.5369 0.5605
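One possible direction, sketched here under the assumption that the matrices are stored as plain numeric k x k matrices in two lists and the skewers as the columns of B: multiply each matrix by all skewers at once with a single %*%, then correlate matching columns of the two results. make_sym and col_cor are hypothetical helpers, and the data is simulated rather than taken from P1 and P2.

```r
set.seed(1)
k    <- 4   # matrix dimension
nitt <- 5   # number of posterior iterations
n    <- 2   # number of skewers

# Hypothetical helper: one symmetric k x k matrix
make_sym <- function() {
  M <- matrix(rnorm(k * k, 1, 1), ncol = k)
  M[lower.tri(M)] <- t(M)[lower.tri(M)]
  M
}
M1L <- replicate(nitt, make_sym(), simplify = FALSE)
M2L <- replicate(nitt, make_sym(), simplify = FALSE)
B   <- matrix(rnorm(k * n), ncol = n)   # one skewer per column

# Response vectors: each list element is k x n (one column per skewer)
R1 <- lapply(M1L, function(M) M %*% B)
R2 <- lapply(M2L, function(M) M %*% B)

# Hypothetical helper: correlation between matching columns of X and Y
col_cor <- function(X, Y) {
  Xc <- scale(X, scale = FALSE)   # center each column
  Yc <- scale(Y, scale = FALSE)
  colSums(Xc * Yc) / sqrt(colSums(Xc^2) * colSums(Yc^2))
}

# nitt x n matrix: one row per iteration, one column per skewer
out <- t(mapply(col_cor, R1, R2))
```

R1 and R2 stay available as lists for later steps such as angles between response vectors. The remaining lapply and mapply calls loop only over iterations, while the work over skewers is done by matrix operations.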

What is wrong with my starting values

I am using the nleqslv package in R to solve a nonlinear system of equations. The R code is given below:
require(nleqslv)
x <- c(6,12,18,24,30)
NMfun1 <- function(k,n) {
y <- rep(NA, length(k))
y[1] <- -(5/k[1])+sum(x^k[2]*exp(k[3]*x))+2*sum(k[4]*x^k[2]*exp(-k[1]*x^k[2]*exp(k[3]*x)+k[3]*x)/(1-k[4]*exp(-k[1]*x^k[2]*exp(k[3]*x))))
y[2] <- -sum(log(x))-sum(1/(k[2]+k[3]*x))+sum(k[1]*x^k[2]*exp(k[3]*x)*log(x))+2*sum(k[1]*k[4]*exp(-k[1]*x^k[2]*exp(k[3]*x)+k[3]*x)*log(x)/(1-k[4]*exp(-k[1]*x^k[2]*exp(k[3]*x))))
y[3] <- -sum(x/(k[2]+k[3]*x))+sum(k[1]*x^(k[2]+1)*exp(k[3]*x))-sum(x)+2*sum(k[4]*x^k[2]*exp(-k[1]*x^k[2]*exp(k[3]*x)+k[3]*x)/(1-k[4]*exp(-k[1]*x^k[2]*exp(k[3]*x))))
y[4] <- -(5/(1-k[4]))+2*sum(exp(-k[1]*x^k[2]*exp(k[3]*x))/(1-k[4]*exp(-k[1]*x^k[2]*exp(k[3]*x))))
return(y)
}
kstart <- c(0.05, 0, 0.35, 0.9)
NMfun1(kstart)
nleqslv(kstart, NMfun1, control=list(btol=.0001),method="Newton")
The estimated values for k obtained are: 0.04223362 -0.08360564 0.14216026 0.37854908
But the estimated values of k are supposed to be greater than zero.
OK. So you want real solutions larger than 0, if they exist of course.
Make a new function which squares the input argument before passing it to NMfun1, and then use the searchZeros function in the nleqslv package to search for solutions. Like this:
NMfun1.alt <- function(k0,n) NMfun1(k0^2,n)
# use set.seed for reproducibility
set.seed(413)
# generate 100 random starting values
xstart <- matrix(runif(4*100,min=0,max=1), nrow=100,ncol=4)
z <- searchZeros(xstart,NMfun1.alt)
z
ksol <- z$x^2
ksol
# in this case there are two solutions
NMfun1(ksol[1,])
NMfun1(ksol[2,])
The output of the last 4 non-comment lines of this code is
> ksol <- z$x^2
> ksol
[,1] [,2] [,3] [,4]
[1,] 0.002951051 1.669142 0.03589502 0.001167185
[2,] 0.002951051 1.669142 0.03589502 0.001167185
> NMfun1(ksol[1,])
[1] 3.231138e-11 3.602561e-13 -4.665268e-12 -1.119105e-13
> NMfun1(ksol[2,])
[1] 1.532663e-12 1.085046e-14 6.894485e-14 -2.664535e-15
You will see that the solution contained in object z has a negative element, which is then squared, so both rows of ksol come out identical.
From this experiment it appears that your system has a single positive solution.
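The squaring trick is a general reparameterization: solve for an unconstrained u and report k = u^2, which cannot be negative. A toy one-dimensional sketch in base R, where f is a made-up function (not the system from the question) with one negative and one positive root:

```r
# Made-up example: f has roots at k = -1 and k = 3
f <- function(k) k^2 - 2 * k - 3

# Reparameterize k = u^2 so that any root found corresponds to k >= 0
g <- function(u) f(u^2)

# g(0) = -3 and g(3) = 60, so the interval brackets a root of g
sol   <- uniroot(g, interval = c(0, 3), tol = 1e-10)
k_pos <- sol$root^2   # maps back to the positive root of f
```

The root of g can be positive or negative in u, which is why a negative element in the solver's raw output is harmless once it is squared.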

Resources