I think we should all be familiar with the arithmetic swap algorithm, which swaps two variables without using a third variable. I have found that there are two variations of the arithmetic swap. Please consider the following:
Variation 1.
int a = 2;
int b = 3;
a = a + b;
b = a - b;
a = a - b;
Variation 2.
int a = 2;
int b = 3;
b = b - a;
a = a + b;
b = a - b;
I want to know why there are two distinct variations of the arithmetic swap and why they work. Are there other variations of the arithmetic swap that achieve the same result? How are they related? Is there an elegant mathematical formula that justifies why the arithmetic swap works the way it does, for all variations? Is there anything that relates these two variations of the arithmetic swap, like an underlying truth?
Break each variable out as what it represents:
a = 2
b = 3
a1 = a + b
b1 = a1 - b = (a + b) - b = a
a2 = a1 - b1 = (a + b) - a = b
The second variation works out the same way:
a = 2
b = 3
b1 = b - a
a1 = a + b1 = a + (b - a) = b
b2 = a1 - b1 = b - (b - a) = a
There's no underlying truth other than the fact that the math works out. Remember that each time you do an assignment, the result is effectively a new "variable" from the math side.
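The renaming argument above can also be checked mechanically. Here is a minimal Python sketch that gives every assignment its own name, as in the derivation, and compares the outcome with a plain swap. The function names swap_v1, swap_v2 and check are made up for this illustration. Python integers do not overflow, so this says nothing about how int behaves near its limits in C or Java.

```python
def swap_v1(a, b):
    """Variation 1: add first, then subtract twice."""
    a1 = a + b
    b1 = a1 - b    # (a + b) - b = a
    a2 = a1 - b1   # (a + b) - a = b
    return a2, b1


def swap_v2(a, b):
    """Variation 2: subtract first, then add, then subtract."""
    b1 = b - a
    a1 = a + b1    # a + (b - a) = b
    b2 = a1 - b1   # b - (b - a) = a
    return a1, b2


def check(limit=20):
    """Return True if both variations swap every pair in [-limit, limit]."""
    values = range(-limit, limit + 1)
    return all(
        swap_v1(a, b) == (b, a) and swap_v2(a, b) == (b, a)
        for a in values
        for b in values
    )
```

Calling check() exercises both variations on a small grid of integers, including negative values and a == b.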
I have two numbers that are samples of two different quantities (it doesn't really matter what they are). Both fluctuate with time. I have samples of these values from two different points in time. Call them a0, a1, b0, b1. I can use the differences (a1-a0, b1-b0), as well as the difference and the sum of the differences: (a1-a0)-(b1-b0) and (a1-a0)+(b1-b0).
My question is how to determine when both of them are descending, in a fashion that doesn't hard-code any constants. Let me explain.
I want to detect when both of these quantities have decreased by a certain amount, but that amount may change if I change the quantities I'm sampling, so I can't hard-code a constant.
I'm sorry if this is vague but that's really all the information I have. I was just wondering if this is even solvable.
if ( a1 - a0 < 0)
if( b1 - b0 < 0) {
//... descending
}
or:
if ( a1 - a0 + b1 - b0 < a1 - a0) // b1 - b0 is negative
if( a1 - a0 + b1 - b0 < b1 - b0) { // a1 - a0 is negative
//... descending
}
To add a threshold is simple:
if ( a1 - a0 < -K)
if( b1 - b0 < -K) {
//... descending, more than K
}
or:
if ( a1 - a0 + b1 - b0 < a1 - a0 - K) // b1 - b0 is less than -K
if( a1 - a0 + b1 - b0 < b1 - b0 - K) { // a1 - a0 is less than -K
//... descending more than K
}
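As a compact restatement of the first form of the test above, here is a small Python sketch. The function name both_descending and the parameter name k are assumptions made for illustration. The threshold is passed in by the caller rather than hard-coded, but choosing a sensible value for each pair of quantities is still left open.

```python
def both_descending(a0, a1, b0, b1, k=0.0):
    """Return True if both quantities dropped by more than k.

    a0, b0 are the earlier samples and a1, b1 the later ones.
    With k = 0 this only asks whether both differences are negative.
    """
    return (a1 - a0 < -k) and (b1 - b0 < -k)
```

For example, both_descending(5, 3, 10, 7) holds with the default threshold, while the same samples fail with k=2.5 because the first quantity only dropped by 2.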
I have 8 __m128 registers, and each register needs to be multiplied by a single entry of another __m256 register.
One solution that jumps to my mind would be:
INPUT: __m128 a[8]; __m256 b;
__m128 tmp = _mm256_extractf128_ps(b,0);
a[0] = _mm_mul_ps(a[0],_mm_shuffle_ps(tmp,tmp,0));
a[1] = _mm_mul_ps(a[1],_mm_shuffle_ps(tmp,tmp,0x55));
a[2] = _mm_mul_ps(a[2],_mm_shuffle_ps(tmp,tmp,0xAA));
a[3] = _mm_mul_ps(a[3],_mm_shuffle_ps(tmp,tmp,0xFF));
tmp = _mm256_extractf128_ps(b,1);
a[4] = _mm_mul_ps(a[4],_mm_shuffle_ps(tmp,tmp,0));
a[5] = _mm_mul_ps(a[5],_mm_shuffle_ps(tmp,tmp,0x55));
a[6] = _mm_mul_ps(a[6],_mm_shuffle_ps(tmp,tmp,0xAA));
a[7] = _mm_mul_ps(a[7],_mm_shuffle_ps(tmp,tmp,0xFF));
What would be the best way to achieve this? Thank you.
I think your solution is about as good as it's going to get, except that I would use explicit variables rather than an array, so that everything stays in registers as far as possible:
__m128 a0, a1, a2, a3, a4, a5, a6, a7;
__m256 b;
__m128 tmp = _mm256_extractf128_ps(b,0);
a0 = _mm_mul_ps(a0, _mm_shuffle_ps(tmp,tmp,0));
a1 = _mm_mul_ps(a1, _mm_shuffle_ps(tmp,tmp,0x55));
a2 = _mm_mul_ps(a2, _mm_shuffle_ps(tmp,tmp,0xAA));
a3 = _mm_mul_ps(a3, _mm_shuffle_ps(tmp,tmp,0xFF));
tmp = _mm256_extractf128_ps(b,1);
a4 = _mm_mul_ps(a4, _mm_shuffle_ps(tmp,tmp,0));
a5 = _mm_mul_ps(a5, _mm_shuffle_ps(tmp,tmp,0x55));
a6 = _mm_mul_ps(a6, _mm_shuffle_ps(tmp,tmp,0xAA));
a7 = _mm_mul_ps(a7, _mm_shuffle_ps(tmp,tmp,0xFF));
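To make the shuffle masks easier to follow, here is a plain Python model of what the intrinsics above compute. It is only a reference for the semantics, not for performance. The functions shuffle_ps, mul_ps and scale_rows are illustrative stand-ins written for this sketch, with each register modelled as a list of floats.

```python
def shuffle_ps(a, b, imm8):
    """Model of _mm_shuffle_ps: the two low lanes come from a, the two
    high lanes from b, each selected by a 2-bit field of imm8."""
    return [
        a[imm8 & 3],
        a[(imm8 >> 2) & 3],
        b[(imm8 >> 4) & 3],
        b[(imm8 >> 6) & 3],
    ]


def mul_ps(x, y):
    """Model of _mm_mul_ps: lane-wise product of two 4-lane registers."""
    return [xi * yi for xi, yi in zip(x, y)]


def scale_rows(a, b):
    """a: 8 lists of 4 floats (the __m128 values); b: 8 floats (the __m256).

    Returns a new list in which a[i] is multiplied by b[i] in every lane."""
    masks = (0x00, 0x55, 0xAA, 0xFF)  # broadcast lane 0, 1, 2, 3
    out = []
    for half in range(2):
        tmp = b[4 * half:4 * half + 4]  # models _mm256_extractf128_ps(b, half)
        for lane, mask in enumerate(masks):
            out.append(mul_ps(a[4 * half + lane], shuffle_ps(tmp, tmp, mask)))
    return out
```

With both shuffle operands equal to tmp, the masks 0x00, 0x55, 0xAA and 0xFF each repeat one 2-bit lane index four times, which is why they broadcast lanes 0 to 3.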
The basic operation I have is an operation on two probability vectors of the same length.
Let's call them A and B. In R the formula is:
t = 1-prod(1-A*B)
That is, the result is a scalar. The (1-A*B) is a point-wise operation whose result is a vector whose i'th element is 1-a_i*b_i. The prod operator gives the product of the elements of the vector.
The meaning of this (as you could guess) is this: suppose A is the vector of probabilities that each of N sources has a certain disease (or other signal). B is the vector of probabilities that each source transmits the disease, if it has it, to the target. The outcome is the probability that the target acquires the disease from at least one of the sources.
Ok, so now I have many types of signals, so I have many "A" vectors. For each type of signal I have many targets, each with a different probability of transmission (that is, many "B" vectors), and I want to compute the "t" outcome for each pair.
Ideally, matrix multiplication would do the trick if the operation were an inner product of the vectors, but my operation is not one (I think).
What I am looking for is some kind of transformation of the vectors A and B, so that I could use matrix multiplication. Any other suggestion to simplify my computation is welcome.
Here is an example (code in R)
A = rbind(c(0.9,0.1,0.3),c(0.7,0.2,0.1))
A
# that is, the probability of source 2 to have disease/signal 1 is 0.1 (A[1,2])
# neither rows nor columns need to sum to 1.
B = cbind(c(0,0.3,0.9),c(0.9,0.6,0.3),c(0.3,0.8,0.3),c(0.4,0.5,1))
B
# that is, the probability of target 4 to acquire a disease from source 2 is 0.5 (B[2,4])
# again, nothing needs to sum to 1 here
# the outcome should be:
C = t(apply(A,1,function(x) apply(B,2,function(y) 1-prod(1-x*y))))
# which basically loops on every row in A and every column in B and
# computes the required formula
C
# while this is quite elegant, it is not very efficient, and I look for transformations
# on my A,B matrices so I could write, in principle
# C = f(A)%*%g(B), where f(A) is my transformed A, g(B) is my transformed(B),
# and %*% is matrix multiplication
# note that if I replace 1-prod(1-x*y) in the formula above with sum(x*y), the result
# is exactly matrix multiplication, which is why I think I'm not too far from that
# and want to enjoy the benefits of already implemented optimizations of matrix
# multiplications.
This is a job where Rcpp excels. Nested loops are straightforward to implement and you don't need much C++ experience. (I like RcppEigen, but you don't really need it for this. You could use "pure" Rcpp.)
library(RcppEigen)
library(inline)
incl <- '
using Eigen::Map;
using Eigen::MatrixXd;
typedef Map<MatrixXd> MapMatd;
'
body <- '
const MapMatd A(as<MapMatd>(AA)), B(as<MapMatd>(BB));
const int nA(A.rows()), mA(A.cols()), mB(B.cols());
MatrixXd R = MatrixXd::Ones(nA,mB);
for (int i = 0; i < nA; ++i)
{
for (int j = 0; j < mB; ++j)
{
for (int k = 0; k < mA; ++k)
{
R(i,j) *= (1 - A(i,k) * B(k,j));
}
R(i,j) = 1 - R(i,j);
}
}
return wrap(R);
'
funRcpp <- cxxfunction(signature(AA = "matrix", BB ="matrix"),
body, "RcppEigen", incl)
Now, let's put your code in an R function:
doupleApply <- function(A, B) t(apply(A,1,
function(x) apply(B,2,function(y) 1-prod(1-x*y))))
Compare the results:
all.equal(doupleApply(A,B), funRcpp(A,B))
#[1] TRUE
Benchmarks:
library(microbenchmark)
microbenchmark(doupleApply(A,B), funRcpp(A,B))
# Unit: microseconds
# expr min lq median uq max neval
#doupleApply(A, B) 169.699 179.2165 184.4785 194.9290 280.011 100
# funRcpp(A, B) 1.738 2.3560 4.6885 4.9055 11.293 100
set.seed(42)
A <- matrix(rnorm(3*1e3), ncol=3)
B <- matrix(rnorm(3*1e3), nrow=3)
all.equal(doupleApply(A,B), funRcpp(A,B))
#[1] TRUE
microbenchmark(doupleApply(A,B), funRcpp(A,B), times=5)
# Unit: milliseconds
# expr min lq median uq max neval
# doupleApply(A, B) 4483.46298 4585.18196 4587.71539 4672.01518 4712.92597 5
# funRcpp(A, B) 24.05247 24.08028 24.48494 26.32971 28.38075 5
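For readers who want the operation spelled out without R or C++, the same triple loop can be sketched in plain Python. The function name acquire_probability is made up for this illustration, and the matrices are the small example values from the question. This is a reference for the arithmetic only and makes no claim about speed.

```python
from math import prod

# Example data from the question: A is 2 signals x 3 sources,
# B is 3 sources x 4 targets.
A = [[0.9, 0.1, 0.3],
     [0.7, 0.2, 0.1]]
B = [[0.0, 0.9, 0.3, 0.4],
     [0.3, 0.6, 0.8, 0.5],
     [0.9, 0.3, 0.3, 1.0]]


def acquire_probability(A, B):
    """For every row x of A and column y of B compute 1 - prod(1 - x*y)."""
    n_sources = len(B)
    n_targets = len(B[0])
    return [
        [
            1 - prod(1 - row[k] * B[k][j] for k in range(n_sources))
            for j in range(n_targets)
        ]
        for row in A
    ]
```

The result has one row per row of A and one column per column of B, the same shape a matrix product A %*% B would have.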
First I should note that the R code might be misleading to some Matlab users because A*B in R is equivalent to A.*B in Matlab (element-wise multiplication). I used symbolic variables in my calculations so that the operations that take place are clearer.
syms a11 a12 a21 a22 b11 b12 b21 b22
syms a13 a31 a23 a32 a33
syms b13 b31 b23 b32 b33
First consider the easiest case, where we have only one vector A and one vector B:
A1 = [a11;a21] ;
B1 = [b11;b21] ;
The result you want is
1 - prod(1-A1.*B1)
=
1 - (a11*b11 - 1)*(a21*b21 - 1)
Now assume we have 3 vectors A and 2 vectors B stacked one next to the other in columns:
A3 = [a11 a12 a13;a21 a22 a23; a31 a32 a33];
B2 = [b11 b12 ;b21 b22 ; b31 b32];
In order to get the indices that pair every column vector of A3 with every column vector of B2 you can do the following:
[indA indB] = meshgrid(1:3,1:2);
Every column of A3 has to be paired with every column of B2, so all 3*2 = 6 index pairs are needed. Note that the two indices refer to columns of different matrices, so the pair (1,2) is distinct from the pair (2,1) and no pair can be dropped as a duplicate. Flatten the index grids into column vectors:
indA = indA(:); indB = indB(:);
Now the result that you want could be calculated:
result = 1 - prod(1-A3(:,indA).*B2(:,indB))
Just for better readability:
pretty(result.')
=
+- -+
| (a11 b11 - 1) (a21 b21 - 1) (a31 b31 - 1) + 1 |
| |
| (a11 b12 - 1) (a21 b22 - 1) (a31 b32 - 1) + 1 |
| |
| (a12 b11 - 1) (a22 b21 - 1) (a32 b31 - 1) + 1 |
| |
| (a12 b12 - 1) (a22 b22 - 1) (a32 b32 - 1) + 1 |
| |
| (a13 b11 - 1) (a23 b21 - 1) (a33 b31 - 1) + 1 |
| |
| (a13 b12 - 1) (a23 b22 - 1) (a33 b32 - 1) + 1 |
+- -+
If I understand amit's question, what you can do in Matlab is the following:
Data:
M = 4e3; % M different cases
N = 5e2; % N sources
K = 5e1; % K targets
A = rand(M, N); % M-by-N matrix of random numbers
A = A ./ repmat(sum(A, 2), 1, N); % M-by-N matrix of probabilities (?)
B = rand(N, K); % N-by-K matrix of random numbers
B = B ./ repmat(sum(B), N, 1); % N-by-K matrix of probabilities (?)
First solution
% One-liner solution:
tic
C = squeeze(1 - prod(1 - repmat(A, [1 1 K]) .* permute(repmat(B, [1 1 M]), [3 1 2]), 2));
toc
% Elapsed time is 6.695364 seconds.
Second solution
% Partial vectorization 1
tic
D = zeros(M, K);
for hh = 1:M
tmp = repmat(A(hh, :)', 1, K);
D(hh, :) = 1 - prod((1 - tmp .* B), 1);
end
toc
% Elapsed time is 0.686487 seconds.
Third solution
% Partial vectorization 2
tic
E = zeros(M, K);
for hh = 1:M
for ii = 1:K
E(hh, ii) = 1 - prod(1 - A(hh, :)' .* B(:, ii));
end
end
toc
% Elapsed time is 2.003891 seconds.
Fourth solution
% No vectorization at all
tic
F = ones(M, K);
for hh = 1:M
for ii = 1:K
for jj = 1:N
F(hh, ii) = F(hh, ii) * prod(1 - A(hh, jj) .* B(jj, ii));
end
F(hh, ii) = 1 - F(hh, ii);
end
end
toc
% Elapsed time is 19.201042 seconds.
The solutions are equivalent …
chck1 = C - D;
chck2 = C - E;
chck3 = C - F;
figure
plot(sort(chck1(:)))
figure
plot(sort(chck2(:)))
figure
plot(sort(chck3(:)))
… but apparently the approaches with partial vectorization, without repmat and permute, are more efficient in terms of memory and execution time.
I have to solve a lambda calculus problem. I reached a certain point and I don't know how to continue:
h f x = \g -> g (f x g)
(h::a1 f::a2 x::a3)::a4 = (\g -> g::a5 (f::a2 x::a3 g::a5)::a6)::a4
a1 = a2 -> a3 -> a4
a2 = a3 -> a5 -> a6
a5 = a6 -> a4
a1 = (a3 -> a5 -> a4) -> a3 -> a4
a1 = (a3 -> (a6->a4) -> a4) -> a3 -> a4
Is there any way of finishing? I use "a1, a2, a3, ..." to represent the type of an element or function. For example, 1::Int, 2.4::Float, f::a1, x::a3 and so on. I don't know if that is clear enough...
Thank you so much!!
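You've made a mistake: g has type a5, but a5 is not a6 -> a4. Your brackets are wrong on line 2: the application g (f x g) needs a result type of its own (a7 below), and a4 is the type of the whole lambda.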
h f x = \g -> g (f x g)
(h::a1 f::a2 x::a3)::a4 = (\g -> (g::a5 (f::a2 x::a3 g::a5)::a6)::a7)::a4
a1 = a2 -> a3 -> a4
a2 = a3 -> a5 -> a6
a5 = a6 -> a7
a4 = a5 -> a7
a1 = (a3 -> a5 -> a6) -> a3 -> a4
a1 = (a3 -> (a6->a7) -> a6) -> a3 -> a5 -> a7
a1 = (a3 -> (a6->a7) -> a6) -> a3 -> (a6 -> a7) -> a7
That is therefore the correct type for h (you can check if you're paranoid just by typing fun h f x = (fn g => g (f x g) ) into an SML prompt and getting the exact same result; same goes for Haskell with appropriate syntax). h is a polymorphic function, so all the a's are arbitrary, but express the relationship between the types of h's argument and the argument of the result of applying h and so on.
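If no SML or Haskell prompt is at hand, the shape of that type can also be mirrored with Python type hints. This is only a sketch: Python does not infer the type the way SML or Haskell do, so the annotation below is the derived type written out by hand, with the type variables named A3, A6 and A7 after the a's that survive in the final line. The curried style (one argument per call) is an assumption made to match the lambda-calculus notation.

```python
from typing import Callable, TypeVar

A3 = TypeVar("A3")
A6 = TypeVar("A6")
A7 = TypeVar("A7")


# h :: (a3 -> (a6 -> a7) -> a6) -> a3 -> (a6 -> a7) -> a7
def h(
    f: Callable[[A3], Callable[[Callable[[A6], A7]], A6]],
) -> Callable[[A3], Callable[[Callable[[A6], A7]], A7]]:
    def with_x(x: A3) -> Callable[[Callable[[A6], A7]], A7]:
        def with_g(g: Callable[[A6], A7]) -> A7:
            # g is applied to the result of f x g, exactly as in the definition
            return g(f(x)(g))
        return with_g
    return with_x


def successor_ignoring_g(x: int) -> Callable[[Callable[[int], str]], int]:
    """A sample f with a3 = int, a6 = int, a7 = str; it ignores g."""
    return lambda g: x + 1
```

With this sample f, the call h(successor_ignoring_g)(3)(str) applies str to f 3 str, so the final result has type a7 = str.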