The basic operation I have is an operation on two probability vectors of the same length.
Let's call them A and B. In R the formula is:
t = 1-prod(1-A*B)
That is, the result is a scalar: (1-A*B) is a point-wise operation whose result is a vector whose i-th element is 1-a_i*b_i, and the prod operator gives the product of the elements of that vector.
The meaning of this (as you could guess) is the following: suppose A holds, for each of N sources, the probability of having a certain disease (or of carrying some other signal), and B is the vector of probabilities for each of the sources to transmit the disease, if they have it, to the target. The outcome t is the probability of the target acquiring the disease from at least one of the sources.
OK, so now I have many types of signals, so I have many "A" vectors, and for each type of signal I have many targets, each with a different probability of transmission (so many "B" vectors), and I want to compute the "t" outcome for each pair.
Ideally, a matrix multiplication could do the trick if the operation were an "inner product" of the vectors, but my operation is not one (I think).
What I am looking for is some kind of transformation of the vectors A and B that would let me use matrix multiplication. Any other suggestion to simplify my computation is welcome.
Here is an example (code in R)
A = rbind(c(0.9,0.1,0.3),c(0.7,0.2,0.1))
A
# that is, the probability of source 2 to have disease/signal 1 is 0.1 (A[1,2])
# neither rows nor columns need to sum to 1.
B = cbind(c(0,0.3,0.9),c(0.9,0.6,0.3),c(0.3,0.8,0.3),c(0.4,0.5,1))
B
# that is, the probability of target 4 to acquire a disease from source 2 is 0.5 (B[2,4])
# again, nothing needs to sum to 1 here
# the outcome should be:
C = t(apply(A,1,function(x) apply(B,2,function(y) 1-prod(1-x*y))))
# which basically loops on every row in A and every column in B and
# computes the required formula
C
# while this is quite elegant, it is not very efficient, and I look for transformations
# on my A,B matrices so I could write, in principle
# C = f(A)%*%g(B), where f(A) is my transformed A, g(B) is my transformed(B),
# and %*% is matrix multiplication
# note that if we replace (1-prod(1-x*y)) in the formula above with sum(x*y), the result
# is exactly matrix multiplication, which is why I think I'm not too far from that
# and want to enjoy the benefits of already implemented optimizations of matrix
# multiplications.
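A note on why an exact C = f(A) %*% g(B) seems out of reach, plus a cheap alternative: taking logs turns the product into a sum, log(1 - t) = sum_i log(1 - a_i*b_i), but as far as I can tell the summand log(1 - a_i*b_i) does not factor into a term depending only on a_i times a term depending only on b_i, so no per-element transformation turns the operation into an inner product. When all the products a_i*b_i are small, however, log(1 - a_i*b_i) is approximately -a_i*b_i, which gives the approximation t = 1 - exp(-A %*% B) using one genuine matrix multiplication. A sketch of this check, in Python/NumPy for illustration, using the example matrices above:

import numpy as np

A = np.array([[0.9, 0.1, 0.3],
              [0.7, 0.2, 0.1]])
B = np.array([[0.0, 0.9, 0.3, 0.4],
              [0.3, 0.6, 0.8, 0.5],
              [0.9, 0.3, 0.3, 1.0]])

# Exact: 1 - prod_k (1 - A[i,k]*B[k,j]) for every (row i of A, column j of B)
exact = 1 - np.prod(1 - A[:, None, :] * B.T[None, :, :], axis=2)

# Approximation via a plain matrix product; accurate only when the a_i*b_i are small
approx = 1 - np.exp(-(A @ B))

print(np.max(np.abs(exact - approx)))  # noticeable here, since some a_i*b_i are large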
This is a job where Rcpp excels. Nested loops are straightforward to implement and you don't need much C++ experience. (I like RcppEigen, but you don't really need it for this; you could use "pure" Rcpp.)
library(RcppEigen)
library(inline)
incl <- '
using Eigen::Map;
using Eigen::MatrixXd;
typedef Map<MatrixXd> MapMatd;
'
body <- '
const MapMatd A(as<MapMatd>(AA)), B(as<MapMatd>(BB));
const int nA(A.rows()), mA(A.cols()), mB(B.cols());
MatrixXd R = MatrixXd::Ones(nA,mB);
for (int i = 0; i < nA; ++i)
{
    for (int j = 0; j < mB; ++j)
    {
        for (int k = 0; k < mA; ++k)
        {
            R(i,j) *= (1 - A(i,k) * B(k,j));
        }
        R(i,j) = 1 - R(i,j);
    }
}
return wrap(R);
'
funRcpp <- cxxfunction(signature(AA = "matrix", BB ="matrix"),
body, "RcppEigen", incl)
Now, let's put your code in an R function:
doupleApply <- function(A, B) t(apply(A,1,
function(x) apply(B,2,function(y) 1-prod(1-x*y))))
Compare the results:
all.equal(doupleApply(A,B), funRcpp(A,B))
#[1] TRUE
Benchmarks:
library(microbenchmark)
microbenchmark(doupleApply(A,B), funRcpp(A,B))
# Unit: microseconds
# expr min lq median uq max neval
#doupleApply(A, B) 169.699 179.2165 184.4785 194.9290 280.011 100
# funRcpp(A, B) 1.738 2.3560 4.6885 4.9055 11.293 100
set.seed(42)
A <- matrix(rnorm(3*1e3), ncol=3)
B <- matrix(rnorm(3*1e3), nrow=3)
all.equal(doupleApply(A,B), funRcpp(A,B))
#[1] TRUE
microbenchmark(doupleApply(A,B), funRcpp(A,B), times=5)
# Unit: milliseconds
# expr min lq median uq max neval
# doupleApply(A, B) 4483.46298 4585.18196 4587.71539 4672.01518 4712.92597 5
# funRcpp(A, B) 24.05247 24.08028 24.48494 26.32971 28.38075 5
First I should note that the R code might be misleading to some Matlab users, because A*B in R is equivalent to A.*B in Matlab (element-wise multiplication). I used symbolic variables in my calculations so that the operations that take place are clearer.
syms a11 a12 a21 a22 b11 b12 b21 b22
syms a13 a31 a23 a32 a33
syms b13 b31 b23 b32 b33
First consider the easiest case, where we have only one vector A and one vector B:
A1 = [a11;a21] ;
B1 = [b11;b21] ;
The result you want is
1 - prod(1-A1.*B1)
=
1 - (a11*b11 - 1)*(a21*b21 - 1)
Now assume we have 3 vectors A and 2 vectors B, stacked next to each other as columns:
A3 = [a11 a12 a13;a21 a22 a23; a31 a32 a33];
B2 = [b11 b12 ;b21 b22 ; b31 b32];
In order to get the indices of all possible pairings of the column vectors of A3 with the column vectors of B2, you can do the following:
[indA indB] = meshgrid(1:3,1:2);
Now, since for the pairwise product of two vectors a, b it holds that a.*b = b.*a, we can keep just the unique pairs of indices. You can do that as follows:
indA = triu(indA); indB = triu(indB);
indA = reshape(indA(indA>0),[],1); indB = reshape(indB(indB>0),[],1);
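For this 3-by-2 example the result is indA = [1; 2; 2; 3; 3] and indB = [1; 1; 2; 1; 2], i.e., the five retained (A-column, B-column) pairs are (1,1), (2,1), (2,2), (3,1) and (3,2), in exactly the order of the rows of the result below.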
Now the result that you want can be calculated:
result = 1 - prod(1-A3(:,indA).*B2(:,indB))
Just for better readability:
pretty(result.')
=
+- -+
| (a11 b11 - 1) (a21 b21 - 1) (a31 b31 - 1) + 1 |
| |
| (a12 b11 - 1) (a22 b21 - 1) (a32 b31 - 1) + 1 |
| |
| (a12 b12 - 1) (a22 b22 - 1) (a32 b32 - 1) + 1 |
| |
| (a13 b11 - 1) (a23 b21 - 1) (a33 b31 - 1) + 1 |
| |
| (a13 b12 - 1) (a23 b22 - 1) (a33 b32 - 1) + 1 |
+- -+
If I understand amit's question, what you can do in Matlab is the following:
Data:
M = 4e3; % M different cases
N = 5e2; % N sources
K = 5e1; % K targets
A = rand(M, N); % M-by-N matrix of random numbers
A = A ./ repmat(sum(A, 2), 1, N); % M-by-N matrix of probabilities (?)
B = rand(N, K); % N-by-K matrix of random numbers
B = B ./ repmat(sum(B), N, 1); % N-by-K matrix of probabilities (?)
First solution
% One-liner solution:
tic
C = squeeze(1 - prod(1 - repmat(A, [1 1 K]) .* permute(repmat(B, [1 1 M]), [3 1 2]), 2));
toc
% Elapsed time is 6.695364 seconds.
Second solution
% Partial vectorization 1
tic
D = zeros(M, K);
for hh = 1:M
    tmp = repmat(A(hh, :)', 1, K);
    D(hh, :) = 1 - prod((1 - tmp .* B), 1);
end
toc
% Elapsed time is 0.686487 seconds.
Third solution
% Partial vectorization 2
tic
E = zeros(M, K);
for hh = 1:M
    for ii = 1:K
        E(hh, ii) = 1 - prod(1 - A(hh, :)' .* B(:, ii));
    end
end
toc
% Elapsed time is 2.003891 seconds.
Fourth solution
% No vectorization at all
tic
F = ones(M, K);
for hh = 1:M
    for ii = 1:K
        for jj = 1:N
            F(hh, ii) = F(hh, ii) * (1 - A(hh, jj) * B(jj, ii));
        end
        F(hh, ii) = 1 - F(hh, ii);
    end
end
toc
% Elapsed time is 19.201042 seconds.
The solutions are equivalent …
chck1 = C - D;
chck2 = C - E;
chck3 = C - F;
figure
plot(sort(chck1(:)))
figure
plot(sort(chck2(:)))
figure
plot(sort(chck3(:)))
… but apparently the approaches with partial vectorization, without repmat and permute, are more efficient in terms of memory and execution time.
I have:
a set of N locations, each of which can be a workplace or a residence
a vector of observed workers L_i, with i in N
a vector of observed residents R_n, with n in N
a matrix of observed distances between every pair of residence n and workplace i
a shape parameter epsilon
Setting N=3, epsilon=5, and
d = [1 1.5 3 ; 1.5 1 1.5 ; 3 1.5 1] #distance matrix
L_i = [13; 69; 18] #vector of workers in each workplace
R_n = [27; 63; 10] #vector of residents in each location
I want to find the vector of wages (size N) that solves this system of N equations (one per workplace i):
L_i = sum_n [ (w_i / d_ni)^epsilon / sum_l (w_l / d_nl)^epsilon ] * R_n,
where l runs over all the workplaces.
Do I need to implement an iterative algorithm on the vectors of workers and wages? Or is it possible to directly solve this system ?
I tried this,
w_i = [1.0; 1.0; 1.0]
er = 1.0
n = 1
while er > 1e-3
    L_hat = ( (w_i ./ d).^epsilon ./ sum( (w_i ./ d).^epsilon, dims=1 ) ) * R_n
    er = maximum(abs.(L_i .- L_hat))
    w_i = 0.7 .* w_i .+ 0.3 .* w_i .* ((L_i .- L_hat) ./ L_hat)
    n = n + 1
end
If L and R are given (i.e., they do not depend on w_i), you should set up a non-linear search to get the (vector of) wages from that gravity equation (subject to normalising one w_i, of course).
Here's a minimal example. I hope it helps.
# Call Packages
using Random, NLsolve, LinearAlgebra
# Set seeds
Random.seed!(1704)
# Variables and parameters
N = 10
R = rand(N)
L = rand(N) * 0.5
d = ones(N, N) .+ Symmetric(rand(N, N)) / 10.0
d[diagind(d)] .= 1.0
ε = -3.0
# Define objective function
function obj_fun(x, d, R, L, ε)
    # Normalize the last wage first (wages are identified only up to this normalization)
    x[end] = 1.0
    # Find shares
    S_mat = (x ./ d).^ε
    den = sum(S_mat, dims = 1)
    s = S_mat ./ den
    # Define loss function
    loss = L .- s * R
    # Return
    return loss
end
# Run optimization
x₀ = ones(N)
res = nlsolve(x -> obj_fun(x, d, R, L, ε), x₀, show_trace = true)
# Equilibrium vector of wages
w = res.zero
Trying to understand the Gram-Schmidt process from this explanation:
http://mlwiki.org/index.php/Gram-Schmidt_Process
The steps of the calculation make sense to me, but the Python implementation included in the same article doesn't seem to align with them.
import numpy as np

def normalize(v):
    return v / np.sqrt(v.dot(v))

n = len(A)  # A: a square float array with the vectors in its columns
A[:, 0] = normalize(A[:, 0])
for i in range(1, n):
    Ai = A[:, i]
    for j in range(0, i):
        Aj = A[:, j]
        t = Ai.dot(Aj)
        Ai = Ai - t * Aj
    A[:, i] = normalize(Ai)
From the above code, we see it takes the dot product of V1 and b; however, the (V1,V1) part is not applied as the denominator, whereas the equation in the article reads v2 = b - ((b.V1)/(V1.V1)) * V1. I wonder how that equation is translated into the code inside the for loop?
Here is what the code does, exactly: it normalizes each previous vector (column of A), projects the current vector onto it, and subtracts the projection from the current vector.
Normalization happens with every vector, which keeps the calculation neat: once the previous vector is a unit vector, the denominator (V1.V1) equals 1 and can be dropped.
The V2 equation above doesn't normalize the previous vector, hence the difference.
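To see the equivalence concretely, here is a small numeric sketch (NumPy, with arbitrary vectors of my own choosing): once the previous vector is normalized, the textbook projection with the denominator and the article's shorter form coincide.

import numpy as np

# Sketch: for a unit vector u1, the projection (x.v1 / v1.v1) * v1
# reduces to (x.u1) * u1, so the article's code can skip the denominator.
rng = np.random.default_rng(0)
v1 = rng.standard_normal(4)           # an arbitrary previous vector
x = rng.standard_normal(4)            # the current vector
u1 = v1 / np.linalg.norm(v1)          # normalized previous vector

full = x - (x @ v1) / (v1 @ v1) * v1  # textbook formula with denominator
short = x - (x @ u1) * u1             # what the article's code computes
print(np.allclose(full, short))       # True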
Try this vectorized implementation.
I would also suggest going through David C. Lay's book for the theory.
def replace_zero(array):
    for i in range(len(array)):
        if array[i] == 0:
            array[i] = 1
    return array
def gram_schmidt(A, norm=True, row_vect=False):
    """Orthonormalizes vectors by the Gram-Schmidt process

    Parameters
    ----------
    A : ndarray,
        Matrix having vectors in its columns
    norm : bool,
        Do you need normalized vectors?
    row_vect : bool,
        Does matrix A have the vectors in its rows?

    Returns
    -------
    G : ndarray,
        Matrix of orthogonal vectors

    Gram-Schmidt Process
    --------------------
    The Gram-Schmidt process is a simple algorithm for
    producing an orthogonal or orthonormal basis for any
    nonzero subspace of Rn.
    Given a basis {x1,....,xp} for a nonzero subspace W of Rn,
    define
        v1 = x1
        v2 = x2 - (x2.v1/v1.v1) * v1
        v3 = x3 - (x3.v1/v1.v1) * v1 - (x3.v2/v2.v2) * v2
        .
        .
        vp = xp - (xp.v1/v1.v1) * v1 - (xp.v2/v2.v2) * v2 - .......
             .... - (xp.v(p-1) / v(p-1).v(p-1) ) * v(p-1)
    Then {v1,.....,vp} is an orthogonal basis for W.
    In addition,
        Span {v1,.....,vk} = Span {x1,.....,xk} for 1 <= k <= p

    References
    ----------
    Linear Algebra and Its Applications - By David C. Lay
    """
    if row_vect:
        # if True, transpose to make a column-vector matrix
        A = A.T
    no_of_vectors = A.shape[1]
    G = A[:,0:1].copy()  # copy the first vector in the matrix
    # 0:1 is done to be consistent with dimensions - [[1,2,3]]
    # iterate from the 2nd vector to the number of vectors
    for i in range(1, no_of_vectors):
        # calculate weights (coefficients) for every vector in G
        numerator = A[:,i].dot(G)
        denominator = np.diag(np.dot(G.T, G))  # elements on the diagonal
        weights = np.squeeze(numerator / denominator)
        # projection of the current vector onto the subspace G
        projected_vector = np.sum(weights * G, axis=1, keepdims=True)
        # component orthogonal to the subspace G
        orthogonalized_vector = A[:,i:i+1] - projected_vector
        # add the orthogonal vector to our set
        G = np.hstack((G, orthogonalized_vector))
    if norm:
        # to get orthonormal vectors (unit orthogonal vectors),
        # replace zeros with ones to avoid division by 0 when the
        # matrix has a zero vector or a normalization value is zero
        G = G / replace_zero(np.linalg.norm(G, axis=0))
    if row_vect:
        return G.T
    return G
G = np.array([[1,0,0],[1,1,0],[1,1,1],[1,1,1]])
gram_schmidt(G)
# output:
array([[ 0.5 , -0.8660254 , 0. ],
[ 0.5 , 0.28867513, -0.81649658],
[ 0.5 , 0.28867513, 0.40824829],
[ 0.5 , 0.28867513, 0.40824829]])
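As a quick check (a sketch, using the function as fixed above), the columns of the returned matrix should be orthonormal:

import numpy as np

Q = gram_schmidt(np.array([[1,0,0],[1,1,0],[1,1,1],[1,1,1]], dtype=float))
print(np.allclose(Q.T @ Q, np.eye(3)))  # True: the columns are orthonormal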
Suppose that A is an n_1-by-n_1 symmetric matrix and B is an n_2-by-n_2 symmetric matrix, where typically n_1 > n_2 and n_1 is from 10^3 to 10^5. I would like to get the following (n_1*n_2)-by-(n_1*n_2) matrix C, whose (i,j) block (each block being n_1-by-n_1) is C_ij = exp(A.^2 / B(i,j)^1.5) / B(i,j), with i = 1, ..., n_2 and j = 1, ..., n_2.
I have two ways to compute this in MATLAB, but neither gives me satisfactory timing. In the following I will give a minimal example in MATLAB code.
n1 = 400; n2 = 15;
A = randn(n1); A = A + A' + 10*eye(n1);
B = randn(n2); B = B + B' + 5*eye(n2);
One way:
tic;
Atemp = repmat(A, n2, n2);
Btemp = kron(B, ones(n1));
C1 = exp(Atemp.^2./Btemp.^1.5)./Btemp;
toc;
Elapsed time is 2.402167 seconds.
Another way:
tic;
Btemp = reshape(B, [1 1 n2*n2]);
Ctemp = bsxfun(@(x,y) exp(x.^2/y.^1.5)/y, A, Btemp);
[a, b, c] = size(Ctemp);
Ctemp = reshape(mat2cell(Ctemp, a, b, ones(c,1)), sqrt(c), sqrt(c));
C2 = cell2mat(Ctemp);
toc;
Elapsed time is 2.923428 seconds.
I am wondering whether there is a more efficient way to get the matrix C in MATLAB? The resulting matrix C is required for a Cholesky decomposition.
BTW, in the second approach there must be a more efficient way to convert the 3-dimensional array Ctemp into the 2-dimensional array C2 (i.e., one that avoids converting Ctemp to a cell array and then the cell array to C2), but I cannot figure it out right now.
Thank you very much!
bsxfun is usually not as fast as you might expect, and the cell/mat reshaping also takes some time, so your first approach is better. But Atemp.^2./Btemp.^1.5 can be simplified to kron(1./(B.^1.5), A.^2), which avoids some of the large temporary matrices and increases the speed.
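As a quick sanity check of that identity (sketched here in NumPy for brevity; illustrative sizes, and B kept positive so that the 1.5 power stays real):

import numpy as np

# The block matrix whose (i,j) block is A.^2 / B(i,j)^1.5
# equals kron(1./B.^1.5, A.^2).
rng = np.random.default_rng(1)
A = rng.random((3, 3)); A = A + A.T + 10 * np.eye(3)
B = rng.random((2, 2)); B = B + B.T + 5 * np.eye(2)

blocks = [[A**2 / B[i, j]**1.5 for j in range(2)] for i in range(2)]
print(np.allclose(np.block(blocks), np.kron(1.0 / B**1.5, A**2)))  # True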
Here's the modified code and the timing on my machine.
n1 = 400; n2 = 15;
A = randn(n1); A = A + A' + 10*eye(n1);
B = randn(n2); B = B + B' + 5*eye(n2);
tic;
Btemp = kron(B, ones(n1));
C0 = exp(kron(1./(B.^1.5), A.^2))./Btemp;
toc;
tic;
Atemp = repmat(A, n2, n2);
Btemp = kron(B, ones(n1));
C1 = exp(Atemp.^2./Btemp.^1.5)./Btemp;
toc;
tic;
Btemp = reshape(B, [1 1 n2*n2]);
Ctemp = bsxfun(@(x,y) exp(x.^2/y.^1.5)/y, A, Btemp);
toc;
tic;
[a, b, c] = size(Ctemp);
Ctemp = reshape(mat2cell(Ctemp, a, b, ones(c,1)), sqrt(c), sqrt(c));
C2 = cell2mat(Ctemp);
toc;
Elapsed time is 0.426900 seconds.
Elapsed time is 0.900966 seconds.
Elapsed time is 2.850293 seconds.
Elapsed time is 0.706957 seconds.
This may be quite a basic question for someone who knows linear programming.
Most of the LP problems I have seen have a format similar to the following:
max 3x+4y
subject to 4x-5y = -34
3x-5y = 10 (and similar other constraints)
So, in other words, the same unknowns appear in both the objective and the constraint functions.
My problem is that I have one unknown variable in objective function and 3 unknowns in constraint functions.
The problem is like this
Objective function: min w1
subject to:
w1 + 0.1676x + 0.1692y >= 0.1666
w1 - 0.1676x - 0.1692y >= -0.1666
w1 + 0.3039x + 0.3058y >= 0.3
w1 - 0.3039x - 0.3058y >= -0.3
x + y = 1
x >= 0
y >= 0
As can be seen, the objective function has only one unknown, i.e. w1, and the constraint functions have 3 (or let's say 2) unknowns, i.e. w1, x and y.
Can somebody please guide me on how to solve this problem, especially using the linear programming tools of R or MATLAB?
Your objective only involves w1, but you can still view it as a function of w1, x, y, where the coefficient of w1 is 1 and the coefficients of x and y are zero:
min w1*1 + x*0 + y*0
Once you see this, you can formulate it in the usual way as a "standard" LP.
Prasad is correct. The number of unknowns in the objective function does not matter. You can view unknowns that are not present as having a zero coefficient.
This LP is easily solved using Matlab's linprog function. For more details on linprog, see the MATLAB documentation.
% We lay out the variables as X = [w1; x; y]
c = [1; 0; 0]; % The objective is w1 = c'*X
% Construct the constraint matrix
% Inequality constraints will be written as Ain*X <= bin
% w1 x y
Ain = [ -1 -0.1676 -0.1692;
-1 0.1676 0.1692;
-1 -0.3039 -0.3058;
-1 0.3039 0.3058;
];
bin = [ -0.1666; 0.1666; -0.3; 0.3];
% Construct equality constraints Aeq*X == beq
Aeq = [ 0 1 1];
beq = 1;
%Construct lower and upper bounds l <= X <= u
l = [ -inf; 0; 0];
u = inf(3,1);
% Solve the LP using linprog
[X, optval] = linprog(c,Ain,bin,Aeq,beq,l,u);
% Extract the solution
w1 = X(1);
x = X(2);
y = X(3);
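For completeness, the same formulation carries over directly to SciPy's linprog in Python; here is a sketch with the same matrices (my own variable names):

from scipy.optimize import linprog

# Variables laid out as X = [w1, x, y], as in the MATLAB version above
c = [1.0, 0.0, 0.0]                      # minimize w1
A_ub = [[-1, -0.1676, -0.1692],          # inequality constraints A_ub @ X <= b_ub
        [-1,  0.1676,  0.1692],
        [-1, -0.3039, -0.3058],
        [-1,  0.3039,  0.3058]]
b_ub = [-0.1666, 0.1666, -0.3, 0.3]
A_eq = [[0.0, 1.0, 1.0]]                 # equality constraint x + y == 1
b_eq = [1.0]
bounds = [(None, None), (0, None), (0, None)]  # w1 free, x >= 0, y >= 0

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
w1, x, y = res.x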
I have a small question about vectors and matrices.
Suppose a vector V = {v1, v2, ..., vn}. I generate an n-by-n distance matrix M defined as:
M_ij = | v_i - v_j |, for i, j in [1, n].
That is, each element M_ij in the square matrix is the absolute distance between two elements of V.
For example, I have a vector V = {1, 3, 3, 5}, the distance matrix will be
M=[
0 2 2 4;
2 0 0 2;
2 0 0 2;
4 2 2 0; ]
It seems pretty simple. Now comes the question: given such a matrix M, how can one obtain the initial V?
Thank you.
Based on some answers to this question, it seems that the answer is not unique. So now suppose that the initial vector has been normalized to mean 0 and variance 1. The question is: given such a symmetric distance matrix M, how can one determine the initial normalized vector?
You can't. To give you an idea why, consider these two cases:
V1 = {1,2,3}
M1 = [ 0 1 2 ; 1 0 1 ; 2 1 0 ]
V2 = {3,4,5}
M2 = [ 0 1 2 ; 1 0 1 ; 2 1 0 ]
As you can see, a single M could be the result of more than one V. Therefore, you can't map backwards.
There is no way to determine the answer uniquely, since the distance matrix is invariant to adding a constant to all elements and to multiplying all the values by -1. Assuming that element 1 is equal to 0, and that the first nonzero element is positive, however, you can find an answer. Here is the pseudocode:
# Assume v[1] is 0
v[1] = 0
# e is value of first non-zero vector element
e = 0
# ei is index of first non-zero vector element
ei = 0
for i = 2...n:
    # if all vector elements have been 0 so far
    if e == 0:
        # get the current distance from element 1 and its index
        # this new element may still be 0
        e = d[1,i]
        ei = i
        v[i] = e
    elseif d[ei,i] == d[1,i] + v[ei]:
        # v[i] is to the left of v[1] = 0 (assuming v[ei] > 0):
        # the distance from element ei to i is then v[ei] + d[1,i]
        v[i] = -d[1,i]
    else:
        # otherwise v[i] is to the right of v[1]
        v[i] = d[1,i]
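Here is a runnable transcription of that pseudocode in Python/NumPy (a sketch; it assumes M really is the distance matrix of some vector):

import numpy as np

def reconstruct(M):
    # Recover v with v[0] == 0 and the first nonzero element positive,
    # such that |v[i] - v[j]| == M[i, j].
    n = M.shape[0]
    v = np.zeros(n)
    ei = None                          # index of the first nonzero element
    for i in range(1, n):
        if ei is None:
            v[i] = M[0, i]             # its sign is fixed to be positive
            if v[i] > 0:
                ei = i
        elif np.isclose(M[ei, i], M[0, i] + v[ei]):
            v[i] = -M[0, i]            # i lies on the other side of v[0]
        else:
            v[i] = M[0, i]
    return v

V = np.array([1.0, 3.0, 3.0, 5.0])
M = np.abs(V[:, None] - V[None, :])
print(reconstruct(M))                  # [0. 2. 2. 4.], i.e. V - V[0]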
I don't think it is possible to find the original vector, but you can find a translation of the vector by taking the first row of the matrix.
If you let M_ij = | v_i - v_j | and you translate every v_k, k in [1,n], by the same constant (say 1), you will get
M_ij = | (v_i + 1) - (v_j + 1) |
     = | v_i - v_j |
Hence, just take the first row as the vector and find one initial point to translate the vector to.
Correction:
Let v_1 = 0, let l_k = | v_k | for k in [2,n], and let p_k be the sign (+1 or -1) of v_k.
Let p_1 = 1
for(int i = 2; i < n; i++)
    if( | l_i - l_(i+1) | != M_i(i+1) )
        p_(i+1) = - p_i
    else
        p_(i+1) = p_i
Doing this for all v_k, k in [2,n], in order gives the sign of each v_k relative to the others.
Then you can find a translation of the original vector with the same or opposite direction
Update (for the normalized vector):
Let d = sqrt(v_1^2 + v_2^2 + ... + v_n^2); then the normalized vector is
{v_1 / d, v_2 / d, ... , v_n / d}
or
{-v_1 / d, -v_2 / d, ... , -v_n / d}
(recall v_1 = 0, so the first component is 0 in both cases).
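Since the question actually asks for normalization to mean 0 and variance 1 (not unit norm), here is a hedged follow-up sketch in Python: standardizing the reconstructed vector pins down both translation and scale, leaving only the overall sign ambiguous.

import numpy as np

# Follow-up sketch: if the true vector was standardized (mean 0, variance 1),
# any reconstruction differs from it only by a translation and possibly a sign
# flip, so standardizing the reconstruction recovers it up to sign.
v = np.array([0.0, 2.0, 2.0, 4.0])  # e.g., output of reconstruct() above
v_std = (v - v.mean()) / v.std()    # mean 0, variance 1 (population variance)
# v_std and -v_std are then the only standardized vectors with this distance matrix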