Constraining the solutions for linear equations - linear-algebra

I'm searching for a way to solve a system of linear equations. Specifically 8 equations with a total of 16 unknown values.
Each unknown value (w[0...15]) is a 32-bit binary value which corresponds to 4 ascii characters written over 8 bits. For example:
For :
I've tried writing this system of linear equations as a single matrix equation. Which gives:
Right now, using the Eigen linear algebra library, I get my 16 solutions (w[0...15]), but all of them are either decimal (non-integer) or null values, which is not what I need. Each of the 16 solutions needs to be the binary representation of 4 alphanumeric ASCII characters, meaning every byte is an integer between 48 and 57 (ascii for '0' to '9'), 65 and 90 (ascii for 'A' to 'Z'), or 97 and 122 (ascii for 'a' to 'z').
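To make the constraint concrete, here is a minimal C sketch that checks whether a candidate 32-bit value decodes to four allowed characters. The helper names is_allowed_char and is_allowed_word are hypothetical (they belong to no library), and the sketch only validates a candidate; it does not solve the system.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if c is an ASCII digit or an ASCII letter (hypothetical helper). */
bool is_allowed_char(uint8_t c)
{
    return (c >= '0' && c <= '9') ||
           (c >= 'A' && c <= 'Z') ||
           (c >= 'a' && c <= 'z');
}

/* True if each of the four bytes of w is an allowed character. */
bool is_allowed_word(uint32_t w)
{
    for (int k = 0; k < 4; ++k) {
        if (!is_allowed_char((uint8_t)(w >> (8 * k))))
            return false;
    }
    return true;
}
```

For instance, 0x41623039 decodes to 'A', 'b', '0', '9' and passes, while any word containing a zero byte is rejected.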
Current 16 solutions:
I've found a solution to this problem using something called box-constraints. An example is shown here using python's lsq_linear function which allows the user to specify bounds. It seems Eigen does not let the user specify bounds in its decomposition methods.
Therefore, my question is, how do you get a similar result in C++ using a linear algebra library? Or is there a better way to solve such systems of equations without writing it under a single matrix equation?
Thanks in advance.

Since you're working with linear equations over Z/2^32Z, integer linear programming (as you tagged the question) may be a solution, and algorithms that are inherently floating point are not appropriate. Box constraints are not enough: they won't force the variables to take on integer values. Also, the model shown in the question does not take into account that multiplying and adding in Z/2^32Z can wrap, which excludes many potential solutions (or perhaps that is intended?) and may make the instance accidentally infeasible when it was intended to be solvable.
ILP can model equations over Z/2^32Z relatively directly (using integer variables between 0 and 2^32 and some unconstrained additional variables scaled by 2^32 to "absorb" the wraparound), but it tends to really struggle with that kind of formulation. I would say it's one of the worst cases for an ILP solver without getting into the "intentionally difficult" cases. A more indirect model with 32x boolean variables is also possible, but this leads to constraints with very large constants, and ILP solvers tend to struggle with them too. Overall I do not recommend using ILP for this problem.
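As a rough illustration of the wraparound, the following C sketch evaluates one equation's left-hand side modulo 2^32 and shows the "absorbing" multiple of 2^32 for a single product. The function names wrapped_sum and split_wrap are made up for this example, and it assumes uint32_t is the platform's unsigned int.

```c
#include <stdint.h>

/* Left-hand side of one equation, evaluated modulo 2^32.
   Unsigned arithmetic in C wraps, which matches Z/2^32Z. */
uint32_t wrapped_sum(const uint32_t *c, const uint32_t *w, int n)
{
    uint32_t acc = 0;
    for (int i = 0; i < n; ++i)
        acc += c[i] * w[i];
    return acc;
}

/* For one product, recover the value k that an ILP model would need:
   c * w == low + k * 2^32 over the integers. */
void split_wrap(uint32_t c, uint32_t w, uint32_t *low, uint32_t *k)
{
    uint64_t exact = (uint64_t)c * w;   /* exact product, no wrap */
    *low = (uint32_t)exact;             /* what 32-bit arithmetic sees */
    *k = (uint32_t)(exact >> 32);       /* how many times it wrapped */
}
```

A floating-point least-squares solver works with the exact product, so it never sees the solutions where k is non-zero.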
What I would recommend for this is an SMT solver that offers the bitvector theory, or alternatively, a pseudo-boolean solver or plain SAT solver (which would leave the grunt work of implementing boolean circuits and converting them to CNF to you instead of having them built into the solver).

If you have more unknowns than equations, your system will certainly be underdetermined: the rank of an 8 x 16 matrix is at most 8, thus you have at least 16 - 8 = 8 degrees of freedom.
Furthermore, if you have bounds on your variables, i.e. mixed equalities and inequalities, then your problem is better posed as a linear program. You can set a dummy objective function c[i] = 0. You could use GLPK, but that is a very generic solution. If you want a small code snippet, you can probably find a toy implementation of the Simplex method that will satisfy your needs.

I went for an SMT solver as suggested by @harold, specifically the CVC4 SMT solver. Here is the code I've written in C++ answering my question about finding the 16 solutions (w[0...15]) for a system of 8 equations, constrained to be ascii characters. I have one last question though. What are pushing and popping for? (slv.push() and slv.pop())
#include <iostream>
#include <string>
#include <vector>
#include <cvc4/cvc4.h>
using namespace std;
using namespace CVC4;
// Note: size_8, size_32, m_unknowns and globalSums are not defined in this excerpt.
int main() {
    // 1. initialize a CVC4 BitVector SMT solver
    ExprManager em;
    SmtEngine slv(&em);
    slv.setOption("incremental", true); // enable incremental solving
    slv.setOption("produce-models", true); // enable models
    slv.setLogic("QF_BV"); // set the bitvector theory logic
    Type bitvector8 = em.mkBitVectorType(size_8); // create an 8-bit wide bit-vector type (4 x 8-bit = 32-bit)
    // 2. create the SMT solver variables
    Expr w[16][4]; // w[0...15] where each w corresponds to 4 ascii characters
    for (int i = 0; i < 16; ++i) {
        for (int j = 0; j < 4; ++j) {
            // a. define w[i] (four ascii characters per w[i])
            w[i][j] = em.mkVar("w" + to_string(i) + to_string(j), bitvector8);
            // b. constrain w[i][0...3] to be an ascii character
            // - digit (0-9) constraint
            // ascii lower bound digit constraint (bit-vector unsigned greater than or equal)
            Expr digit_lower = em.mkExpr(kind::BITVECTOR_UGE, w[i][j], em.mkConst(BitVector(size_8, Integer(48))));
            // ascii upper bound digit constraint (bit-vector unsigned less than or equal); '9' is 57
            Expr digit_upper = em.mkExpr(kind::BITVECTOR_ULE, w[i][j], em.mkConst(BitVector(size_8, Integer(57))));
            Expr digit_constraint = em.mkExpr(kind::AND, digit_lower, digit_upper);
            // - lower alphanumeric character (a-z) constraint
            // ascii lower bound alpha constraint (bit-vector unsigned greater than or equal)
            Expr alpha_lower = em.mkExpr(kind::BITVECTOR_UGE, w[i][j], em.mkConst(BitVector(size_8, Integer(97))));
            // ascii upper bound alpha constraint (bit-vector unsigned less than or equal)
            Expr alpha_upper = em.mkExpr(kind::BITVECTOR_ULE, w[i][j], em.mkConst(BitVector(size_8, Integer(122))));
            Expr alpha_constraint = em.mkExpr(kind::AND, alpha_lower, alpha_upper);
            Expr ascii_constraint = em.mkExpr(kind::OR, digit_constraint, alpha_constraint);
            slv.assertFormula(ascii_constraint);
        }
    }
    // 3. encode the 8 equations
    for (int i = 0; i < 8; ++i) {
        // a. build the multiplication part (index * w[j])
        vector<Expr> left_mult_hand;
        for (int j = 0; j < 16; ++j) {
            vector<Expr> inner_wj;
            for (int k = 0; k < 4; ++k) inner_wj.push_back(w[j][k]);
            Expr wj = em.mkExpr(kind::BITVECTOR_CONCAT, inner_wj);
            Expr index = em.mkConst(BitVector(size_32, Integer(m_unknowns[j])));
            left_mult_hand.push_back(em.mkExpr(kind::BITVECTOR_MULT, index, wj));
        }
        // b. sum each index * w[j]
        slv.push();
        Expr left_hand = em.mkExpr(kind::BITVECTOR_PLUS, left_mult_hand);
        Expr result = em.mkConst(BitVector(size_32, Integer(globalSums.to_ulong())));
        Expr assumption = em.mkExpr(kind::EQUAL, left_hand, result);
        slv.assertFormula(assumption);
        // c. check for satisfiability
        cout << "Result from CVC4 is: " << slv.checkSat(em.mkConst(true)) << endl << endl;
        slv.pop();
    }
    return 0;
}

Related

Mathematical conversion linear input curve to "logarithm-like" output curve in C on limited resources

I need to make mathematical conversion of input value (linear curve) to output value ("logarithm or exponential-like" curve) on MCU with limited resources (memory, clock speed).
But this is more of a general C programming question.
Let's say, we have two variables: uint8_t input and uint8_t output. Input values are 0-255. Output values should be also 0-255, but changing not linear with input.
For example (input ==> output):
0 ==> 0
1 ==> 1
10 ==> 1
100 ==> 10
200 ==> 75
250 ==> 225
255 ==> 255
Let's say, I can do mathematical conversion, something like (in general) " 2^(input/32)-1 " and I can achieve something close to what I need. I tried:
output = (pow(2, (input/32)))-1
But, because of the limits of uint8_t arithmetic (input/32 is an integer division), the output only takes the values of powers of 2 minus 1 (0, 1, 3, 7, ... up to 127), with no smooth transitions between them:
0 ==> 0
1 ==> 0
10 ==> 0
100 ==> 7
200 ==> 63
250 ==> 127
255 ==> 127
Also I prefer not to use pow(), because of limited MCU resources.
I can achieve the required result with a table in memory, but (again) it will cost 256 bytes of ROM, which is not acceptable. I could also use a smaller table and approximate, but...
Maybe there are other ways to achieve this result mathematically, staying within uint8-uint32 arithmetic on a low-resource MCU?
EDIT: A "correct" formula and set of numbers doesn't exist. Output numbers could vary. At "input=0" ==> "output=0", at "input=255" ==> "output=255". The requirement: smallest and fastest compiled code. The expected result should be close to this graph (in uint8_t):
Let's say, I can do mathematical conversion, something like (in general) " 2^(input/32)-1 " and I can achieve something close to what I need.
You can get an approximation of 2^(input/32) by using the remainder to do a linear interpolation. Assuming input is an integer, you can calculate the equivalent of (1 + (input%32)/32.0) * pow(2, input/32) using simple 8-bit calculations:
uint8_t f(uint8_t input) {
    uint8_t t = (input & 31) + 32;
    uint8_t u = input >> 5;
    return u < 5 ? t >> (5 - u) : t << (u - 5);
}
This will give you a linear interpolation between the powers of two, that looks like this:
Update (math background)
We want to approximate the function f(input) = pow(2.0, input/32.0), which requires floating point operations, using only integer operations. The obvious approach is to split the range 0-255 into sub-ranges of size 32, use the values at 0, 32, 64, 96, 128, 160, 192, 224 (which can be calculated using only integers), and then do simple linear interpolation for the values in between. Now assume that we want to approximate the function value for some input such that 32*u <= input < 32*(u+1), where u is an integer. We know the values at the ends: f(32*u) = 2^u and f(32*(u+1)) = 2^(u+1). Using linear interpolation we get the following approximation:
f_approx(input) = [f(32*(u+1))*(input - 32*u) + f(32*u)*(32*(u+1) - input)]/32
Now let's simplify it:
f_approx(input) = [2^(u+1)*(input - 32*u) + 2^u*(32 - (input - 32*u))]/32
f_approx(input) = 2^u*[2*(input - 32*u) + 32 - (input - 32*u)]/32
f_approx(input) = 2^u*[(input - 32*u) + 32]/2^5
f_approx(input) = [(input - 32*u) + 32]*2^(u-5)
Now if we notice that 32 is a power of 2 we can see that (input - 32*u) which is the same as input % 32 can also be calculated as input & 31. Thus
f_approx(input) = t*2^(u-5)
and the ternary if in the return just handles the sign of u-5 to calculate power of 2 using shifts.
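As a quick sanity check of the derivation, here is a small C sketch that restates the function under the hypothetical name approx_pow2 and verifies two properties that follow from the formula: it hits 2^u exactly at input = 32*u, and it never decreases. The checker functions are illustrative additions, not part of the original answer.

```c
#include <stdint.h>

/* Same computation as f() above: t * 2^(u-5) using shifts. */
uint8_t approx_pow2(uint8_t input)
{
    uint8_t t = (input & 31) + 32;   /* (input % 32) + 32, in 32..63 */
    uint8_t u = input >> 5;          /* input / 32, in 0..7 */
    return u < 5 ? t >> (5 - u) : t << (u - 5);
}

/* Returns 1 if the approximation never decreases over 0..255. */
int is_monotonic(void)
{
    for (int i = 0; i < 255; ++i)
        if (approx_pow2((uint8_t)i) > approx_pow2((uint8_t)(i + 1)))
            return 0;
    return 1;
}

/* Returns 1 if approx_pow2(32*u) == 2^u for u = 0..7. */
int hits_powers_of_two(void)
{
    for (int u = 0; u < 8; ++u)
        if (approx_pow2((uint8_t)(32 * u)) != (1 << u))
            return 0;
    return 1;
}
```

Subtracting 1 from the result gives the 2^(input/32)-1 shape from the question, mapping 0 to 0 and 255 to 251; the top end would need a small tweak if exactly 255 is required.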

How to refine the result of a floating point division result?

I have an algorithm for calculating floating point square root and division using the Newton-Raphson algorithm. My results are not fully accurate and are sometimes off by 1 ulp.
I was wondering if there is a refinement algorithm for floating point division to get the final bits of accuracy. I use the Tuckerman test for square root, but is there a similar algorithm for division? Or can the Tuckerman test be adapted for division?
I tried using this algorithm too but didn't get full accuracy results:
z= divisor
r_temp = divisor*q
r = dividend - r_temp
result_temp = r*z
q + result_temp
One practical way of correctly rounding the result of iterative division is to produce a preliminary quotient to within one ulp of the mathematical result, then use the exactly-computed residual to compute the final result.
The tool of choice for the exact computation of residuals is the fused-multiply add (FMA) operation. Much of the foundational work of this approach (both in terms of the mathematics and of practical implementations) is due to Peter Markstein and was later refined by other researchers. Markstein's results are nicely summarized in his book:
Peter Markstein, IA-64 and Elementary Functions: Speed and Precision. Prentice-Hall 2000.
A straightforward approach to correctly-rounded division using Markstein's approach is to first compute a correctly-rounded reciprocal, then compute the correctly-rounded quotient by multiplying it with the dividend, followed by the final residual-based rounding step.
The residual can be used to compute the final rounded result directly, which is the technique used by Markstein. An earlier version of the code below did this for the quotient rounding, but I noticed that this code sequence resulted in an incorrectly rounded result in one out of 10^11 divisions, and replaced it with another instance of the comparison-and-select idiom. Alternatively, the residual may be used as part of a two-sided comparison-and-select process somewhat akin to Tuckerman rounding, which is shown for the reciprocal rounding in the code below.
There is one caveat with regard to the reciprocal computation. Many commonly used iterative approaches (including the one I used below), when combined with Markstein's rounding technique, deliver an incorrect result if the mantissa of the divisor consists entirely of 1-bits.
One way of getting around this is to treat this case specially. In the code below I instead opted for a two-sided comparison-and-select approach, which also allows errors slightly larger than one ulp prior to rounding and thus eliminates the need to use FMA in the reciprocal iteration itself.
Please note that I omitted the handling of sub-normal results in the C code below to keep the code concise and easy to follow. I have limited myself to standard C library functions for tasks like extracting parts of floating-point operands, assembling floating-point numbers, and applying one-ulp increments and decrements. Most platforms will offer machine-specific options with higher performance for these.
#include <math.h>
float my_divf (float a, float b)
{
    float q, r, ma, mb, e, s, t;
    int ia, ib;
    if (!isnan (a+b) && !isinf (a) && !isinf (b) && (b != 0.0f)) {
        /* normal cases: remove sign, split args into exponent and mantissa */
        ma = frexpf (fabsf (a), &ia);
        mb = frexpf (fabsf (b), &ib);
        /* minimax polynomial approximation to 1/mb for mb in [0.5,1) */
        r = - 3.54939341e+0f;
        r = r * mb + 1.06481802e+1f;
        r = r * mb - 1.17573657e+1f;
        r = r * mb + 5.65684575e+0f;
        /* apply one iteration with cubic convergence */
        e = 1.0f - mb * r;
        e = e * e + e;
        r = e * r + r;
        /* round reciprocal to nearest-or-even */
        e = fmaf (-mb, r, 1.0f); // residual of 1st candidate
        s = nextafterf (r, copysignf (2.0f, e)); // bump or dent
        t = fmaf (-mb, s, 1.0f); // residual of 2nd candidate
        r = (fabsf (e) < fabsf (t)) ? r : s; // candidate with smaller residual
        /* compute preliminary quotient from correctly-rounded reciprocal */
        q = ma * r;
        /* round quotient to nearest-or-even */
        e = fmaf (-mb, q, ma); // residual of 1st candidate
        s = nextafterf (q, copysignf (2.0f, e)); // bump or dent
        t = fmaf (-mb, s, ma); // residual of 2nd candidate
        q = (fabsf (e) < fabsf (t)) ? q : s; // candidate with smaller residual
        /* scale back into result range */
        r = ldexpf (q, ia - ib);
        if (r < 1.17549435e-38f) {
            /* sub-normal result, left as an exercise for the reader */
        }
        /* merge in sign of quotient */
        r = copysignf (r, a * b);
    } else {
        /* handle special cases (isnan/isinf are the standard C99 macros) */
        if (isnan (a) || isnan (b)) {
            r = a + b;
        } else if (b == 0.0f) {
            r = (a == 0.0f) ? (0.0f / 0.0f) : copysignf (1.0f / 0.0f, a * b);
        } else if (isinf (b)) {
            r = (isinf (a)) ? (0.0f / 0.0f) : copysignf (0.0f, a * b);
        } else {
            r = a * b;
        }
    }
    return r;
}
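The comparison-and-select rounding used in my_divf can also be sketched in isolation. The version below is a variation rather than the code above: instead of FMA it evaluates the residual in double precision (the product of two floats is exact in a double), and it steps by one ulp through the bit pattern. It assumes IEEE-754 binary32 floats, positive finite normal a, b and q, and a preliminary quotient q within one ulp of a/b; the names step_ulp, residual and round_quotient are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Step a positive, finite, normal float one ulp up (dir > 0) or down.
   Assumes the IEEE-754 binary32 layout. */
float step_ulp(float x, int dir)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    bits = (dir > 0) ? bits + 1u : bits - 1u;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Residual a - b*q. The product of two 24-bit significands fits into
   the 53-bit significand of a double, so b*q is computed exactly. */
double residual(float a, float b, float q)
{
    return (double)a - (double)b * (double)q;
}

/* Pick whichever of q and its neighbour has the smaller residual. */
float round_quotient(float a, float b, float q)
{
    double e = residual(a, b, q);              /* 1st candidate */
    float s = step_ulp(q, (e > 0.0) ? 1 : -1); /* bump or dent  */
    double t = residual(a, b, s);              /* 2nd candidate */
    double ae = (e < 0.0) ? -e : e;
    double at = (t < 0.0) ? -t : t;
    return (ae < at) ? q : s;
}
```

Feeding it 1.0f/3.0f perturbed by one ulp in either direction returns the correctly rounded quotient again. On hardware with FMA the fmaf-based residual stays in single precision, which is usually preferable.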

A numerical library which uses a paralleled algorithm to do one dimensional integration?

Is there a numerical library which can use a paralleled algorithm to do one dimensional integration (global adaptive method)? The infrastructure of my code decides that I cannot do multiple numerical integrations in parallel, but I have to use a paralleled algorithm to speed up.
Thanks!
The NAG C numerical library does have a parallel version of adaptive quadrature (link here). Their trick is to ask the user to supply the following function
void (*f)(const double x[], Integer nx, double fv[], Integer *iflag, Nag_Comm *comm)
Here the function "f" evaluates the integrand at nx abscissa points given by the vector x[]. This is where parallelization comes in, because you can use parallel_for (implemented in OpenMP, for example) to evaluate f at those points concurrently. The integrator itself is single threaded.
NAG is a very expensive library, but if you code the integrator yourself using, for example, Numerical Recipes, it is not difficult to modify serial implementations to create parallel adaptive integrators using NAG's idea.
I can't reproduce the Numerical Recipes book to show where modifications are necessary, due to license restrictions. So let's take the simplest example, the trapezoidal rule, where the implementation is quite simple and well known. The simplest way to create an adaptive method using the trapezoidal rule is to calculate the integral on a grid of points, then double the number of abscissa points and compare the results. If the result changes by less than the requested accuracy, then there is convergence.
At each step, the trapezoidal rule can be computed using the following generic implementation
double trapezoidal( double (*f)(double x), double a, double b, int n)
{
    double h = (b - a)/n;
    double s = 0.5 * h * (f(a) + f(b));
    for( int i = 1; i < n; ++i ) s += h * f(a + i*h);
    return s;
}
Now you can make the following changes to implement NAG's idea
double trapezoidal( void (*f)( double x[], int nx, double fv[] ), double a, double b, int n)
{
    double h = (b - a)/n;
    double x[n+1];
    double fv[n+1];
    for( int i = 0; i < n; ++i ) x[i] = a + i * h; // grid points x[0]..x[n-1]
    x[n] = b;
    f(x, n + 1, fv); // inside f, use parallel_for to evaluate the integrand at x[i], i=0..n
    double s = 0.5 * h * ( fv[0] + fv[n] );
    for( int i = 1; i < n; ++i ) s += h * fv[i];
    return s;
}
This procedure, however, will only speed-up your code if the integrand is very expensive to compute. Otherwise, you should parallelize your code at higher loops and not inside the integrator.
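To show how the pieces fit together, here is a minimal C sketch of the doubling strategy described above, built on a batch integrand callback. The names square_batch, trapezoid_batch and adaptive_trapezoid are hypothetical, the integrand x*x is only a placeholder, and the OpenMP pragma takes effect only when the compiler is invoked with OpenMP enabled (otherwise the loop simply runs serially).

```c
#include <stdlib.h>

/* Placeholder batch integrand f(x) = x*x; this is where the
   parallel evaluation happens. */
void square_batch(const double x[], int nx, double fv[])
{
    #pragma omp parallel for
    for (int i = 0; i < nx; ++i)
        fv[i] = x[i] * x[i];
}

/* Trapezoidal rule on n intervals using one batch call for all n+1 points. */
double trapezoid_batch(void (*f)(const double[], int, double[]),
                       double a, double b, int n)
{
    double h = (b - a) / n;
    double *x = malloc((size_t)(n + 1) * sizeof *x);
    double *fv = malloc((size_t)(n + 1) * sizeof *fv);
    if (!x || !fv)
        abort();                      /* out of memory */
    for (int i = 0; i < n; ++i)
        x[i] = a + i * h;
    x[n] = b;
    f(x, n + 1, fv);
    double s = 0.5 * h * (fv[0] + fv[n]);
    for (int i = 1; i < n; ++i)
        s += h * fv[i];
    free(x);
    free(fv);
    return s;
}

/* Double the number of intervals until two successive estimates agree
   to within tol (or a fixed cap on n is reached). */
double adaptive_trapezoid(void (*f)(const double[], int, double[]),
                          double a, double b, double tol)
{
    int n = 1;
    double prev = trapezoid_batch(f, a, b, n);
    while (n < (1 << 20)) {
        n *= 2;
        double cur = trapezoid_batch(f, a, b, n);
        double diff = cur - prev;
        if (diff < 0.0)
            diff = -diff;
        if (diff < tol)
            return cur;
        prev = cur;
    }
    return prev;
}
```

This sketch re-evaluates every grid point after each doubling; a real implementation would reuse the previous function values and only evaluate the new midpoints.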
Why not simply implement a wrapper around a single threaded algorithm that dispatches integrals of subdivisions of the bounds to different threads and then adds them together at the end? e.g.
thread 0: i0 = integral(x0, (x0+x1)/2)
thread 1: i1 = integral((x0+x1)/2, x1)
i = i0 + i1

Mathematical (Arithmetic) representation of XOR

I have spent the last 5 hours searching for an answer. Even though I have found many answers they have not helped in any way.
What I am basically looking for is a mathematical, arithmetic only representation of the bitwise XOR operator for any 32bit unsigned integers.
Even though this sounds really simple, nobody (at least it seems so) has managed to find an answer to this question.
I hope we can brainstorm, and find a solution together.
Thanks.
XOR any numerical input between 0 and 1 including both ends
a + b - ab(1 + a + b - ab)
XOR binary input
a + b - 2ab or (a-b)²
Derivation
Basic Logical Operators
NOT = (1-x)
AND = x*y
From those operators we can get...
OR = (1-(1-a)(1-b)) = a + b - ab
Note: If a and b are mutually exclusive then their and condition will always be zero - from a Venn diagram perspective, this means there is no overlap. In that case, we could write OR = a + b, since a*b = 0 for all values of a & b.
2-Factor XOR
Defining XOR as (a OR b) AND (NOT (a AND b)):
(a OR b) --> (a + b - ab)
(NOT (a AND b)) --> (1 - ab)
AND these conditions together to get...
(a + b - ab)(1 - ab) = a + b - ab(1 + a + b - ab)
Computational Alternatives
If the input values are binary, then power terms can be ignored to arrive at simplified, computationally equivalent forms.
a + b - ab(1 + a + b - ab) = a + b - ab - a²b - ab² + a²b²
If x is binary (either 1 or 0), then we can disregard powers since 1² = 1 and 0² = 0...
a + b - ab - a²b - ab² + a²b² -- remove powers --> a + b - 2ab
XOR (binary) = a + b - 2ab
Binary also allows other equations to be computationally equivalent to the one above. For instance...
Given (a-b)² = a² + b² - 2ab
If input is binary we can ignore powers, so...
a² + b² - 2ab -- remove powers --> a + b - 2ab
Allowing us to write...
XOR (binary) = (a-b)²
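The three forms derived so far can be checked with a few lines of C. This is only an illustrative sketch; the function names are made up for the example.

```c
/* XOR for inputs restricted to 0 or 1: a + b - 2ab. */
int xor_linear(int a, int b)
{
    return a + b - 2 * a * b;
}

/* XOR for inputs restricted to 0 or 1: (a - b)^2. */
int xor_square(int a, int b)
{
    return (a - b) * (a - b);
}

/* General form a + b - ab(1 + a + b - ab), also defined for
   fractional inputs between 0 and 1. */
double xor_general(double a, double b)
{
    return a + b - a * b * (1.0 + a + b - a * b);
}
```

All three agree on the four binary input pairs. They differ in between: xor_general(0.5, 0.5) evaluates to 0.5625, whereas a + b - 2ab gives 0.5 and (a-b)² gives 0 for the same inputs.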
Multi-Factor XOR
XOR = (1 - A*B*C...)(1 - (1-A)(1-B)(1-C)...)
Excel VBA example...
Function ArithmeticXOR(R As Range, Optional EvaluateEquation = True)
    Dim AndOfNots As String
    Dim AndGate As String
    For Each c In R
        AndOfNots = AndOfNots & "*(1-" & c.Address & ")"
        AndGate = AndGate & "*" & c.Address
    Next
    AndOfNots = Mid(AndOfNots, 2)
    AndGate = Mid(AndGate, 2)
    'Now all we want is (Not(AndGate) AND Not(AndOfNots))
    ArithmeticXOR = "(1 - " & AndOfNots & ")*(1 - " & AndGate & ")"
    If EvaluateEquation Then
        ArithmeticXOR = Application.Evaluate(ArithmeticXOR)
    End If
End Function
Any n of k
These same methods can be extended to allow for any n number out of k conditions to qualify as true.
For instance, out of three variables a, b, and c, if you're willing to accept any two conditions, then you want a&b or a&c or b&c. This can be arithmetically modeled from the composite logic...
(a && b) || (a && c) || (b && c) ...
and applying our translations...
1 - (1-ab)(1-ac)(1-bc)...
This can be extended to any n number out of k conditions. There is a pattern of variable and exponent combinations, but this gets very long; however, you can simplify by ignoring powers for a binary context. The exact pattern is dependent on how n relates to k. For n = k-1, where k is the total number of conditions being tested, the result is as follows:
c1 + c2 + c3 ... ck - n*∏
Where c1 through ck are all n-variable combinations.
For instance, true if 3 of 4 conditions met would be
abc + abe + ace + bce - 3abce
This makes perfect logical sense since what we have is the additive OR of AND conditions minus the overlapping AND condition.
If you begin looking at n = k-2, k-3, etc. The pattern becomes more complicated because we have more overlaps to subtract out. If this is fully extended to the smallest value of n = 1, then we arrive at nothing more than a regular OR condition.
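The two worked cases above (2 of 3, and 3 of 4) can be written down directly and compared against a plain count of true inputs. This is a small C sketch with hypothetical function names, valid for binary inputs only.

```c
/* "At least 2 of 3": 1 - (1-ab)(1-ac)(1-bc), for inputs 0 or 1. */
int two_of_three(int a, int b, int c)
{
    return 1 - (1 - a * b) * (1 - a * c) * (1 - b * c);
}

/* "At least 3 of 4": abc + abe + ace + bce - 3abce, for inputs 0 or 1. */
int three_of_four(int a, int b, int c, int e)
{
    return a * b * c + a * b * e + a * c * e + b * c * e
           - 3 * a * b * c * e;
}
```

With all four inputs set, the four triple products sum to 4 and the correction term subtracts 3, leaving 1.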
Thinking about Non-Binary Values and Fuzzy Region
The actual algebraic XOR equation a + b - ab(1 + a + b - ab) is much more complicated than the computationally equivalent binary equations like x + y - 2xy and (x-y)². Does this mean anything, and is there any value to this added complexity?
Obviously, for this to matter, you'd have to care about the decimal values outside of the discrete points (0,0), (0,1), (1,0), and (1,1). Why would this ever matter? Sometimes you want to relax the integer constraint for a discrete problem. In that case, you have to look at the premises used to convert logical operators to equations.
When it comes to translating Boolean logic into arithmetic, your basic building blocks are the AND and NOT operators, with which you can build both OR and XOR.
OR = (1-(1-a)(1-b)(1-c)...)
XOR = (1 - a*b*c...)(1 - (1-a)(1-b)(1-c)...)
So if you're thinking about the decimal region, then it's worth thinking about how we defined these operators and how they behave in that region.
Non-Binary Meaning of NOT
We expressed NOT as 1-x. Obviously, this simple equation works for binary values of 0 and 1, but the thing that's really cool about it is that it also provides the fractional or percent-wise complement for values between 0 and 1. This is useful since NOT is also known as the Complement in Boolean logic, and when it comes to sets, NOT refers to everything outside of the current set.
Non-Binary Meaning of AND
We expressed AND as x*y. Once again, obviously it works for 0 and 1, but its effect is a little more arbitrary for values between 0 and 1, where multiplication results in partial truths (decimal values) diminishing each other. It's possible to imagine that you would want to model truth as being averaged or accumulative in this region. For instance, if two conditions are hypothetically half true, is the AND condition only a quarter true (0.5 * 0.5), or is it entirely true (0.5 + 0.5 = 1), or does it remain half true ((0.5 + 0.5) / 2)?
As it turns out, the quarter truth is actually true for conditions that are entirely discrete, where the partial truth represents probability. For instance, will you flip tails (binary condition, 50% probability) both now AND again a second time? The answer is 0.5 * 0.5 = 0.25, or 25% true. Accumulation doesn't really make sense because it's basically modeling an OR condition (remember OR can be modeled by + when the AND condition is not present, so summation is characteristically OR). The average makes sense if you're looking at agreement and measurements, but it's really modeling a hybrid of AND and OR. For instance, ask 2 people to say on a scale of 1 to 10 how much they agree with the statement "It is cold outside". If they both say 5, then the truth of the statement "It is cold outside" is 50%.
Non-Binary Values in Summary
The takeaway from this look at non-binary values is that we can capture actual logic in our choice of operators and construct equations from the ground up, but we have to keep in mind numerical behavior. We are used to thinking about logic as discrete (binary) and computer processing as discrete, but non-binary logic is becoming more and more common and can help make problems that are difficult with discrete logic easier or possible to solve. You'll need to give thought to how values interact in this region and how to translate them into something meaningful.
"mathematical, arithmetic only representation" are not correct terms anyway. What you are looking for is a function which goes from IxI to I (domain of integer numbers).
Which restrictions would you like to have on this function? Only linear algebra? (+ , - , * , /) then it's impossible to emulate the XOR operator.
If instead you accept some non-linear operators like Max() Sgn() etc, you can emulate the XOR operator with some "simpler" operators.
Given that (a-b)(a-b) quite obviously computes xor for a single bit, you could construct a function with the floor or mod arithmetic operators to split the bits out, then xor them, then sum to recombine. (a-b)(a-b) = a² - 2·a·b + b², so one bit of xor gives a polynomial with 3 terms.
Without floor or mod, the different bits interfere with each other, so you're stuck with looking at a solution which is a polynomial interpolation treating the input a,b as a single value: a xor b = g(a · 2^32 + b)
The polynomial has 2^64 - 1 terms, though it will be symmetric in a and b as xor is commutative, so you only have to calculate half of the coefficients. I don't have the space to write it out for you.
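The first variant (split the bits with mod and integer division, apply (a-b)² per bit, then recombine by summing) might look like the following C sketch. The function name xor_arith is hypothetical; only +, -, *, / and % are used on the operands.

```c
#include <stdint.h>

/* XOR of two 32-bit values using only +, -, *, / and %. */
uint32_t xor_arith(uint32_t a, uint32_t b)
{
    uint32_t result = 0;
    uint32_t weight = 1;                 /* 2^i for the current bit */
    for (int i = 0; i < 32; ++i) {
        int32_t d = (int32_t)(a % 2) - (int32_t)(b % 2);
        result += weight * (uint32_t)(d * d);   /* (a_i - b_i)^2 */
        a /= 2;                          /* floor division drops bit i */
        b /= 2;
        weight *= 2;                     /* wraps to 0 after the last bit */
    }
    return result;
}
```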
I wasn't able to find any solution for 32-bit unsigned integers but I've found some solutions for 2-bit integers which I was trying to use in my Prolog program.
One of my solutions (which uses exponentiation and modulo) is described in this StackOverflow question and the others (some without exponentiation, pure algebra) can be found in this code repository on Github: see different xor0 and o_xor0 implementations.
The nicest xor representation for 2-bit uints seems to be: xor(A,B) = (A + B*((-1)^A)) mod 4.
Solution with +,-,*,/ expressed as an Excel formula (where cells from A2 to A5 and cells from B1 to E1 contain the numbers 0-3) to be inserted in cells from B2 to E5:
(1-$A2)*(2-$A2)*(3-$A2)*($A2+B$1)/6 - $A2*(1-$A2)*(3-$A2)*($A2+B$1)/2 + $A2*(1-$A2)*(2-$A2)*($A2-B$1)/6 + $A2*(2-$A2)*(3-$A2)*($A2-B$1)/2 - B$1*(1-B$1)*(3-B$1)*$A2*(3-$A2)*(6-4*$A2)/2 + B$1*(1-B$1)*(2-B$1)*$A2*($A2-3)*(6-4*$A2)/6
It may be possible to adapt and optimize this solution for 32-bit unsigned integers. It's complicated and it uses logarithms, but seems to be the most universal one as it can be used on any integer number. Additionally, you'll have to check if it really works for all number combinations.
I do realize that this is sort of an old topic, but the question is worth answering and yes, this is possible using an algorithm. And rather than go into great detail about how it works, I'll just demonstrate with a simple example (written in C):
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <time.h>
typedef unsigned long number;
number XOR(number a, number b)
{
    number
        result = 0,
        /*
        The following calculation just gives us the highest power of
        two (and thus the most significant bit) for this data type.
        */
        power = pow(2, (sizeof(number) * 8) - 1);
    /*
    Loop until no more bits are left to test...
    */
    while(power != 0)
    {
        result *= 2;
        /*
        The != comparison works just like the XOR operation.
        */
        if((power > a) != (power > b))
            result += 1;
        a %= power;
        b %= power;
        power /= 2;
    }
    return result;
}
int main()
{
    srand(time(0));
    for(;;)
    {
        number
            a = rand(),
            b = rand();
        printf("a = %lu\n", a);
        printf("b = %lu\n", b);
        printf("a ^ b = %lu\n", a ^ b);
        printf("XOR(a, b) = %lu\n", XOR(a, b));
        getchar();
    }
}
I think this relation might help in answering your question
A + B = (A XOR B) + 2*(A AND B)
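Rearranged, the relation gives A XOR B = A + B - 2*(A AND B). A minimal C sketch follows, with the made-up name xor_via_and; note that it still relies on a bitwise AND, so it only trades one bitwise operator for another.

```c
#include <stdint.h>

/* A XOR B recovered from A + B = (A XOR B) + 2*(A AND B).
   Both a + b and 2*(a & b) may wrap, but unsigned arithmetic is
   modulo 2^32, so the difference is still correct. */
uint32_t xor_via_and(uint32_t a, uint32_t b)
{
    return a + b - 2u * (a & b);
}
```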
(a-b)*(a-b) is the right answer. the only one? I guess so!

Interview: random3 function implementation using random2

In a recent interview I was asked the following question. There is a function random2(), which returns 0 or 1 with equal probability (0.5). Write implementations of random4() and random3() using random2().
It was easy to implement random4() like this
if(random2())
    return random2();
return random2() + 2;
But I had difficulties with random3(). The only implementation I could come up with:
uint32_t sum = 0;
for (uint32_t i = 0; i != N; ++i)
    sum += random2();
return sum % 3;
This implementation of random3() is based only on my intuition. I'm not sure if it is actually correct, because I can't mathematically prove its correctness. Can somebody help me with this question, please?
random3:
Not sure if this is the most efficient way, but here's my take:
x = random2 + 2*random2
What can happen:
0 + 0 = 0
0 + 2 = 2
1 + 0 = 1
1 + 2 = 3
The above are all the possibilities of what can happen, thus each has equal probability, so...
(p(x=c) is the probability that x = c)
p(x=0) = 0.25
p(x=1) = 0.25
p(x=2) = 0.25
p(x=3) = 0.25
Now while x = 3, we just keep generating another number, thus giving equal probability to 0,1,2. More technically, you would distribute the probability from x=3 across all of them repeatedly such that p(x=3) tends to 0, thus the probability of the others will tend to 0.33 each.
Code:
do
    val = random2() + 2*random2();
while (val == 3);
return val;
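A self-contained C version of the rejection idea is sketched below. The random2() here is only a stand-in built on rand() so that the example compiles; in the interview setting it is assumed to be given.

```c
#include <stdlib.h>

/* Stand-in for the given random2(): one bit taken from rand().
   Placeholder only; the real source of fair bits is assumed. */
int random2(void)
{
    return (rand() >> 7) & 1;
}

/* Uniform value in {0, 1, 2}: draw two bits, reject the outcome 3. */
int random3(void)
{
    int val;
    do {
        val = random2() + 2 * random2();
    } while (val == 3);
    return val;
}
```

Each attempt succeeds with probability 3/4, so the loop uses on average 8/3 calls to random2() per result.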
random4:
Let's run through your code:
if(random2())
    return random2();
return random2() + 2;
First call has 50% chance of 1 (true) => returns either 0 or 1 with 50% * 50% probability, thus 25% each
First call has 50% chance of 0 (false) => returns either 2 or 3 with 50% * 50% probability, thus 25% each
Thus your code generates 0,1,2,3 with equal probability.
Update inspired by e4e5f4's answer:
For a more deterministic answer than the one I provided above...
Generate some large number by calling random2 a bunch of times and mod the result by the desired number.
This won't be exactly the right probability for each, but it will be close.
So, for a 32-bit integer by calling random2 32 times, target = 3:
Total numbers: 4294967296
Number of x's such that x%3 = 1 or 2: 1431655765
Number of x's such that x%3 = 0: 1431655766
Probability of 1 or 2 (each): 0.33333333325572311878204345703125
Probability of 0: 0.3333333334885537624359130859375
So within 0.00000002% of the correct probability, seems pretty close.
Code:
sum = 0;
for (int i = 0; i < 32; i++)
    sum = 2*sum + random2();
return sum % N;
Note:
As pjr pointed out, this is, in general, far less efficient than the rejection method above. The probability of getting to the same number of calls of random2 (i.e. 32) (assuming this is the slowest operation) with the rejection method is 0.25^(32/2) = 0.0000000002 = 0.00000002%. This, together with the fact that this method isn't exact, strongly favors the rejection method. Lowering the number of calls decreases the running time but increases the error, and it would probably need to be lowered quite a bit (thus reaching a high error) to approach the average running time of the rejection method.
It is useful to note the above algorithm has a maximum running time. The rejection method does not. If your random number generator is totally broken for some reason, it could keep generating the rejected number and run for quite a while or forever with the rejection method, but the for-loop above will run 32 times, regardless of what happens.
Using modulo(%) is not recommended because it introduces bias. Mapping will be nice only if n is power of 2. Otherwise some kind of rejection is involved as suggested by other answer.
Another generic approach would be to emulate built-in PRNGs by -
Generate 32 random2() and map it to a 32-bit integer
Get random number in range (0,1) by dividing it by max integer value
Simply multiply this number by n (=3,4...73 so on) and floor to get desired output
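The three steps above might be sketched in C as follows. Two things here are assumptions of the sketch rather than part of the answer: random2() is a stand-in built on rand(), and the divide-then-multiply step is done with an exact 64-bit integer product, floor(x * n / 2^32), instead of floating point.

```c
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for the given random2(); placeholder only. */
int random2(void)
{
    return (rand() >> 7) & 1;
}

/* Step 1: assemble a 32-bit integer from 32 calls to random2(). */
uint32_t random_bits32(void)
{
    uint32_t x = 0;
    for (int i = 0; i < 32; ++i)
        x = 2 * x + (uint32_t)random2();
    return x;
}

/* Steps 2 and 3: scale x / 2^32 by n and take the floor, computed
   exactly as (x * n) >> 32. The result is in 0..n-1. */
uint32_t random_n(uint32_t n)
{
    return (uint32_t)(((uint64_t)random_bits32() * n) >> 32);
}
```

Like the modulo version, this cannot be exactly uniform when n is not a power of two, because 2^32 equally likely inputs cannot be split evenly into n groups; it only spreads the small excess differently.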

Resources