Sum of arrays of different dimension in R

I have this code:
k = 20
rho = 0.5
pi_greco = array(rep(1/k, k), c(1, 20, 50))
pi_greco_x <- function(k, rho, pi_greco){
  E = array(diag(k), c(20, 20, 50))
  E[k, k, ] = 0
  prob = (1 - rho)*pi_greco + rho*E
  return(prob)
}
and in prob I need to sum each 1x20 slice of pi_greco (multiplied by 1-rho) with the corresponding 20x20 slice of E (multiplied by rho), in order to get 50 different matrices in prob. But how can I do it?
It is like saying that I would like to do
prob = (1-rho)*pi_greco[,,1] + rho*E[,,1]
But for all 50 slices, without using a for loop.
Thanks in advance.

Consider this:
k = 3
rho = 0.5
pi_greco = array(rep(1/k, k), c(1, 20, 50))
library(abind)  # for abind()
pi_greco_x <- function(k, rho, pi_greco){
  E = array(diag(k), c(k, 20, 50))
  E[k, k, ] = 0
  p_rows <- Reduce(
    f = function(a, b){ abind(a, (1 - rho)*pi_greco, along = 1) },
    x = 1:k,
    init = NULL
  )
  prob <- p_rows + rho*E
  return(prob)
}
pi_greco_x(k, rho, pi_greco)
I replicate the 1x20x50 array to k x 20 x 50 by binding it k times with abind (20 times with your original k = 20).
This means they can now safely be added together. Working with more than two dimensions can be problematic. The typical human brain isn't very used to it.
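An alternative that avoids abind entirely (a sketch, not from the original answer, reusing k, rho, pi_greco and E as defined above): repeating the first subscript expands the 1x20x50 array to k x 20 x 50, after which plain element-wise addition works.
pi_big <- pi_greco[rep(1, k), , ] # k x 20 x 50: the single row repeated k times
prob <- (1 - rho)*pi_big + rho*E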

Related

Optimize the weight of vectors given the similarity matrix of mean vectors

I want to solve an optimization problem to find the best weights for groups of vectors. Could you give some suggestions on how to solve it in R? Thanks very much.
The problem is as follows.
Given N groups, we know the similarity matrix S among these N groups. The dimension of S is N*N.
Each group contains K vectors, and each vector has M elements whose values are 0 or 1.
We can fit a (weighted) average vector based on these K vectors.
Based on the average vectors of each group, we can calculate the correlation matrix among them.
The objective is to minimize the difference between this correlation matrix C and the known similarity matrix S.
Because you didn't provide any data, I will generate random data and demonstrate a way you can approach your problem.
Similarity matrix:
N <- 6
S <- matrix(runif(N^2, -1, 1), ncol = N, nrow = N)
similarity_matrix <- (S + t(S)) / 2
N is the number of groups. Each value of the similarity matrix is between -1 and 1, and the matrix is symmetric (because you want to compare it to a covariance matrix, this makes sense).
group vectors:
M <- 10
K <- 8
group_vectors <- replicate(N, replicate(K, sample(c(0, 1), M, replace = TRUE)), simplify = FALSE)
M is the dimension of each vector and K is the number of binary vectors in each group.
fitness function
fitness <- function(W, group_vectors, similarity_matrix){
  # K and N are taken from the enclosing environment
  W <- as.data.frame(matrix(W, nrow = K, ncol = N))
  SS <- cov(
    mapply(function(x, y) rowSums(sweep(x, 2, y, "*")), group_vectors, W)
  )
  sum(abs(SS - similarity_matrix))
}
For given weights, fitness calculates the described covariance matrix and its distance from similarity_matrix.
differential evolution approach
res <- DEoptim::DEoptim(
  fn = fitness,
  lower = rep(-1, K*N),
  upper = rep(1, K*N),
  group_vectors = group_vectors,
  similarity_matrix = similarity_matrix,
  control = DEoptim::DEoptim.control(VTR = 0, itermax = 1000, trace = 50, NP = 100)
)
W <- matrix(res$optim$bestmem, nrow = K, ncol = N)
genetic algorithm approach
res <- GA::ga(
  type = "real-valued",
  fitness = function(W, ...) -fitness(W, ...), # GA maximizes, so negate the loss
  lower = rep(-1, K*N),
  upper = rep(1, K*N),
  group_vectors = group_vectors,
  similarity_matrix = similarity_matrix,
  maxiter = 10000,
  run = 200
)
W <- matrix(res@solution[1, ], nrow = K, ncol = N) # ga() returns an S4 object; use @ to access slots
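Either way, you can sanity-check the recovered weights (a small sketch reusing the objects above, not part of the original answer): rebuild the covariance implied by W and compare it to similarity_matrix.
# recompute the covariance of the weighted group sums under the fitted W
SS <- cov(mapply(function(x, y) rowSums(sweep(x, 2, y, "*")), group_vectors, as.data.frame(W)))
max(abs(SS - similarity_matrix)) # small values indicate a good fit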

MCMC Output does not read full vector

I have a Bayesian MCMC in R, and I have the code below:
RWM = function(Niter, Y, X){
  p = ncol(X)
  alpha = 0.7
  beta = matrix(0, ncol = 1, nrow = 3)
  beta = as.matrix(beta)
  sig_p = 0
  mu_p = beta
  C = diag(p)
  R = t(chol(C))
  lpi = posterior(beta, Y, X) # posterior() is the user's log-posterior, defined elsewhere
  OUT = matrix(NA, ncol = 3, nrow = Niter)
  for (j in 1:Niter){
    rr = rnorm(p)
    beta_p = beta + exp(sig_p) * as.vector(R %*% rr)
    lpi_p = posterior(beta_p, Y, X)
    A = exp(lpi_p - lpi)
    Acc = min(1, A)
    if (runif(1) <= Acc){
      beta = beta_p
      lpi = lpi_p
    }
    OUT[j,] = beta
    sig_p = sig_p + (1/j^alpha)*(Acc - 0.3)
    mu_p = mu_p + (1/j)*(as.matrix(beta) - mu_p)
    bmu = as.matrix(beta - mu_p)
    C = C + (1/j)*(as.matrix(t(bmu) %*% bmu) - C)
  }
  return(OUT)
}
It looks like the vector beta should update so that its three elements differ, due to the rnorm draws. However, this is not the case: within each row of the output, the 3 columns (one per element of beta) are exactly the same. I have stepped through this function in the console several times, and in no case did the elements of beta come out identical.
For example: beta = [1, 2, 3] but the output = [1, 1, 1]
The MCMC iterates and does not get stuck, as the histogram shows a wide range of values in the output. It is just the sampled betas that are giving me the issue.
I'm just not understanding what is wrong with my code that prevents my vector beta from being added directly to the matrix OUT.
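One thing worth checking (a diagnostic sketch, not a confirmed fix): if p = ncol(X) is 1 rather than 3, then R %*% rr is a single number, and the proposal step adds the same scalar increment to every element of beta through recycling, which would make all three output columns identical.
# quick checks to run inside RWM(), e.g. via browser() or print():
print(p)                 # expected 3; if 1, all elements of beta move together
str(beta)                # should be a 3x1 numeric matrix
str(as.vector(R %*% rr)) # should have length 3, one increment per element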

R reverse function to solve for parameter when output is a fixed constant

If I have a function
estimator <- function(A, B) {
  A*(B + 23)
}
How can I reverse this function to find the value of A for B as a sequence between 0 and 120 (B=1,2,3,4,...,120) that would give a fixed result, say C = 20?
I would use it to map the values of A that satisfy the equation A*(B+23) = C = 20, with B being a list b.list between 0 and 120, for a list c.list of different values of C.
b.list <- seq(0,120,by=1)
c.list <- tibble(seq(10,32,by=2))
In the end, I would like to plot the curves of the function for different C, using purrr or similar.
I.e.: given that the height of a tree in metres at age 100 follows the function C = A*(B+23), solve for the A that gives the result C = 10 when B (age) is a list of years between 0 and 120.
Here's a link showing what I'm trying to make!
Here's another one
Many thanks!
For the inverse, it is a quick algebraic inversion:
A = C/(B + 23)
One answer could be:
B <- seq(0, 120)
C <- seq(10, 32, 2)
A <- matrix(0,
            nrow = length(B),
            ncol = length(C))
for(i in 1:ncol(A)){
  A[,i] <- C[i] / (B + 23)
}
matplot(B, A, type = "l", col = "black")
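For this simple closed form, the loop can also be replaced with a one-liner (a sketch using base R's outer(); same result as the loop above):
A <- outer(1/(B + 23), C) # A[b, c] = C[c] / (B[b] + 23)
matplot(B, A, type = "l", col = "black")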
In the case of a more complex function, you indeed need an automatic solver. One way is to treat it as an optimisation problem where you want to minimise the distance from C:
B <- seq(1, 120)
C <- seq(10, 32, 2)
A <- matrix(0,
            nrow = length(B),
            ncol = length(C))
fct <- function(A, B, C){
  paramasi  <- 25
  parambeta <- 7395.6
  paramb2   <- -1.7829
  refB      <- 100
  d <- parambeta*(paramasi^paramb2)
  r <- (((A - d)^2) + (4*parambeta*A*(B^paramb2)))^0.5
  si_est <- (A + d + r) / (2 + (4*parambeta*(refB^paramb2)) / (A - d + r))
  return(sum((si_est - C)^2)) # squared distance from the target C
}
for(c in 1:length(C)){
  for(b in 1:length(B)){
    # fixed parameters + optimisation
    # (one-dimensional optim; method = "Brent" with bounds would avoid the Nelder-Mead warning)
    res <- optim(par = 1, fn = fct, B = B[b], C = C[c])
    A[b, c] <- res$par
  }
}
matplot(B, A, type = "l", col = "black")
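If si_est is monotone in A over the region of interest, a root finder can be more direct than optim() for this inversion (a sketch under that assumption; the bracketing interval below is a guess you would need to adapt):
root_fct <- function(A, B, C){
  paramasi  <- 25
  parambeta <- 7395.6
  paramb2   <- -1.7829
  refB      <- 100
  d <- parambeta*(paramasi^paramb2)
  r <- (((A - d)^2) + (4*parambeta*A*(B^paramb2)))^0.5
  si_est <- (A + d + r) / (2 + (4*parambeta*(refB^paramb2)) / (A - d + r))
  si_est - C # zero exactly when si_est hits the target C
}
# solve si_est(A, B) = C for a single (B, C) pair
res <- uniroot(root_fct, interval = c(1e-3, 1e3), B = 50, C = 20)
res$root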
You need to be careful, because in your case I think you could find an analytical formula for the inverse, which would be better.
Good luck!

constrOptim function: Optimize R code

I am trying to solve an optimization problem.
Below is the mathematical explanation of the problem and the code I used:
F = {f_1, f_2, ..., f_n}
S = {s_1, s_2, ..., s_m}
Here m is always greater than n, and sum(S) is always greater than sum(F).
If ST = transpose(S):
Find a matrix P (n x m) = {p_ij} such that P %*% ST = F, where %*% is matrix multiplication, subject to the following constraints:
p_ij >= 0, for all i and j
sum_i(p_ij) <= 1 for each j, where i varies from 1 to n.
Since an exact solution may not exist, I am trying to minimize the error by minimizing (P %*% ST - F) . (P %*% ST - F), where . is the dot product.
So the problem is that of constrained optimization where I use the following code.
F = c(10, 10, 5)
S = c(8, 8, 9, 8, 4)
loss_fun <- function(P){
  T = matrix(S*P, nrow = n, ncol = m, byrow = TRUE) # byrow = TRUE spelled out, since the matrix is also named T
  F2 = rowSums(T) # predicted values of F
  E = F - F2      # error
  return(sum(E*E))
}
n = length(F)
m = length(S)
P_init = c(rep(0.0001, n*m)) # initial solution (theta)
# Creating the constraint matrix
ui_1 = matrix(0, ncol = n*m, nrow = m)
for (i in 1:m){
  for (j in 1:(n*m)) {
    if (i %% m == j %% m) ui_1[i, j] = -1
  }
}
ui_2 = diag(1, ncol = n*m, nrow = m*n)
my_ui <- rbind(ui_1, ui_2)
# Creating the constraint vector
my_ci = c(rep(-1, m), rep(0, n*m))
z = constrOptim(P_init, loss_fun, NULL, ui = my_ui, ci = my_ci)
# result
P_final = matrix(z$par, nrow = n, byrow = TRUE)
# verification of result
T = t(S*t(P_final)) # proportion matrix * S; transposed so the multiplication happens by row
F2 = rowSums(T)     # predicted values of F
E = F - F2          # error
sum(E*E)
The above code works fine and runs in less than 0.5 seconds on my machine which has i5 CPU, 4 cores, 8 GB RAM, 64 bit windows 7, and 64 bit R 3.1.1.
However, when I used the F and S from my real problem, it ran for around 15 hours without producing any output. F has 39 elements; S has 196.
F = c(212,359,186,396,460,449,206,180,383,264,294,179,256,294,173,415,363,323,389,219,298,338,287,434,195,450,120,460,164,395,198,108,72,345,54,450,420,488,262)
S = c(233.81,0,1.13,59.68,0,768.18,12.33,147.56,115.2,537.32,0,144.35,93.63,13.43,48.58,60,78.26,1280,369.62,8.11,0,342.96,452.99,521.72,4995.58,0,0,10.59,8.1,38.89,161.67,186.14,0,83.22,13.89,37.35,2370,0,0,8.61,4.95,6.31,0,1.53,3600,0,12.48,444.26,0,8490,615.25,27.11,402.95,393.46,1.26,0,44.36,728.85,37.61,159.06,103.63,145.38,0.51,0,0,18.6,3.24,44.5,17.46,210,128.03,19.48,340.79,54.79,54.42,48.48,0,44.76,0,0,0,43.19,102.03,0,0,470,0,101,0,9060,6.09,8.33,49.09,0,19.72,170,57.54,128.78,636.01,10.93,38.79,0,0,49.65,173.58,101.96,21.84,2.55,14.55,770,7419.13,216.21,238.15,582.95,57.93,26.97,71.88,4.63,0,31,103.37,570.58,45.79,540,348.9,151.82,207.41,29.56,51.73,92.25,0,0,51.39,25.14,0,0,95.21,298.94,5.77,154.29,280,1666.59,40.19,0,9.37,119.76,0,0,9.17,28.19,67.5,129.62,85.41,24.59,3607.98,0,130.28,99.57,0,0,0,36.23,1140,328.87,0,0,0,40,22.77,0,2.08,0,0,0,14.66,0,102.86,50.06,13.22,62.25,1410,860,930,646.15,0,0,0,0,890,0,0,12.61,86.4,95.35,19.31,87.74
)
The rbind itself takes 3 to 4 seconds but the real problem is time taken by constrOptim.
Because your constraint is simple, you can avoid the big-matrix calculations in the constraint part by using a package that accepts a function as its constraint argument, such as alabama.
loss_fun <- function(P){
  T = matrix(S*P, nrow = n, ncol = m, byrow = TRUE)
  F2 = rowSums(T) # predicted values of F
  E = F - F2      # error
  return(sum(E*E))
}
n = length(F)
m = length(S)
P_init = c(rep(0.0001, n*m)) # initial solution (theta)
# Creating an inequality constraint function (much faster than my_ui %*% P - my_ci)
hin <- function(P){
  P_mat <- matrix(P, nrow = m)
  c(rowSums(P_mat) * -1 + 1, P) # every returned value must be >= 0 at a feasible point
}
library(alabama)
aug_res <- auglag(P_init, loss_fun, hin = hin, control.outer = list(kkt2.check = FALSE))
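To inspect the result the same way as in the constrOptim version (a short sketch; alabama's auglag returns the solution vector in $par):
P_final <- matrix(aug_res$par, nrow = n, byrow = TRUE)
F2 <- rowSums(t(S * t(P_final))) # predicted values of F
sum((F - F2)^2)                  # squared error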

R: How to add jitter only on singular matrices within a function?

I have the following function that I need to (m)apply to a list of more than 1500 large matrices (Z) and a list of vectors (p) of the same length. However, I get the error that some matrices are singular, as I already posted here. Here is my function:
kastner <- function(item, p) { print(item)
  imp <- rowSums(Z[[item]])
  exp <- colSums(Z[[item]])
  x  = p + imp
  ac = p + imp - exp
  einsdurchx = 1/as.vector(x)
  einsdurchx[is.infinite(einsdurchx)] <- 0
  A = Z[[item]] %*% diag(einsdurchx)
  R = solve(diag(length(p)) - A) %*% diag(p)
  C = ac * einsdurchx
  R_bar = diag(as.vector(C)) %*% R
  rR_bar = round(R_bar)
  return(rR_bar)
}
and my mapply command, which also prints the name of the matrix currently being processed:
KASTNER <- mapply(kastner, names(Z), p, SIMPLIFY = FALSE)
In order to overcome the singularity problem, I want to add a small amount of jitter to the singular matrices. The problem starts in line 9 of the function, R = solve(diag(length(p))-A) %*% diag(p), as the term (diag(length(p))-A) becomes singular and can't be solved. I tried to add jitter to all Z matrices in the first line of the function using Z <- lapply(Z, function(x) jitter(x, factor = 0.0001, amount = NULL)), but this perturbation is very small and errors still occur.
Therefore my idea is to check, with if/else or something similar, whether the matrix diag(length(p))-A is singular (maybe using eigenvalues to check for rank deficiency), and to add jitter only to those matrices; otherwise the solve command should be performed as is. Any ideas how to implement this in the function? Thanks
Here is some example data, although it does not exhibit the singularity problem; I was not able to reproduce the error in line 9 with it:
Z <- list("111.2012"= matrix(c(0,0,100,200,0,0,0,0,50,350,0,50,50,200,200,0),
nrow = 4, ncol = 4, byrow = T),
"112.2012"= matrix(c(10,90,0,30,10,90,0,10,200,50,10,350,150,100,200,10),
nrow = 4, ncol = 4, byrow = T))
p <- list("111.2012"=c(200, 1000, 100, 10), "112.2012"=c(300, 900, 50, 100))
Edit: a small amount of jitter shouldn't be problematic in my data, as probably more than 80% of the entries in my matrices are zeros, alongside some large values. I am only interested in those large values, but the large number of 0s is probably the reason for the singularity; they are needed, though.
Since you didn't provide a working example I couldn't test this easily, so the burden of proof is on you. :) In any case, it should be a starting point for further tinkering. Comments in the code.
kastner <- function(item, p) { print(item)
  imp <- rowSums(Z[[item]])
  exp <- colSums(Z[[item]])
  x  = p + imp
  ac = p + imp - exp
  einsdurchx = 1/as.vector(x)
  einsdurchx[is.infinite(einsdurchx)] <- 0
  # start a chunk that repeats until you get a valid result
  do.jitter <- TRUE # bureaucracy
  while (do.jitter == TRUE) {
    # run the code as usual
    A = Z[[item]] %*% diag(einsdurchx)
    # catch any possible errors; you can even catch the "singularity" error here by
    # specifying error = function(e) e
    R <- tryCatch(solve(diag(length(p)) - A) %*% diag(p), error = function(e) "jitterme")
    # if you were able to solve(), and the result is a matrix (careful if it's a vector!)...
    if (is.matrix(R)) {
      # ... turn the while loop off
      do.jitter <- FALSE
    } else {
      # ... else apply some jitter and repeat by constructing A from a jittered Z[[item]]
      Z[[item]] <- jitter(Z[[item]])
    }
  }
  C = ac * einsdurchx
  R_bar = diag(as.vector(C)) %*% R
  rR_bar = round(R_bar)
  return(rR_bar)
}
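A variation on the same idea (a sketch, untested on your data): instead of waiting for solve() to throw an error, test near-singularity up front with the reciprocal condition number from base R's rcond(), jittering until the matrix is well conditioned. The threshold below is an assumption you may need to tune.
M <- diag(length(p)) - A
while (rcond(M) < .Machine$double.eps) {
  # matrix is numerically singular: perturb Z and rebuild A and M
  Z[[item]] <- jitter(Z[[item]])
  A <- Z[[item]] %*% diag(einsdurchx)
  M <- diag(length(p)) - A
}
R <- solve(M) %*% diag(p)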
