Simplify boolean algebra expression - math

Starting from the truth table of a subtractor with three inputs and two outputs, I've obtained the following boolean formula for the second output z2 (the "borrow" from the bit to the left):
z2 = x0'x1'x2 + x0'x1x2' + x0'x1x2 + x0x1x2
(where x0' means NOT x0).
Simplifying it : z2 = x2(x0 ⊕ x1)' + x1(x0 ⊕ x2)' + x1x2
which means : z2 = x2(x0 XNOR x1) + x1(x0 XNOR x2) + x1x2
Can I simplify more?
I've tried with z2 = (x0 XNOR x1) + (x0 XNOR x2) + x1x2 but it doesn't do the trick.

Your original expression has 12 variable references and 16 operators. Your simplification has 8 variable references and 7 operators. Here is an expression with 6 variable references and 6 operators:
z2 = x0x1x2 + x0'(x1+x2)
I do not know if that is minimal in any sense.
You ask how I found that expression. I did not start from your simplification, I started from the truth table that you referenced in a comment. I reproduce it here:
As I looked at the table for patterns, I saw that it looks like a chiasm or skew-symmetric matrix: if I flip the last column upside-down and then take the complements of all items, I end up with the original column. (I don't know the proper term for this kind of symmetry; those are the terms that came to my mind.) I tried to encapsulate that symmetry into a logical expression but failed.
That led me to look at the upper and lower halves of that last column. The upper half is mostly ones and the lower half is mostly zeros. Then it struck me that the upper half looks like the truth table for the binary OR operation and the lower half looks like the binary AND operation. The upper half is for x0' and the lower half is for x0, of course. Putting those facts together gave me my expression.
I confirmed that expression by seeing if I could manipulate the original expression into mine. I could, by doing
z2 = x0'x1'x2 + x0'x1x2' + x0'x1x2 + x0x1x2
= x0'(x1'x2 + x1x2' + x1x2) + x0x1x2
= x0'(x1 + x2) + x0x1x2
= x0x1x2 + x0'(x1 + x2)
The transition from the second to third line is, of course, equivalent to recognizing the truth table for binary OR, so this is not much different from my actual discovery method.
That latter method may be more transferable to other problems: factor out a common factor from multiple terms. My actual method was more fun but less transferable. My favorite definition of mathematics is "the study of patterns" which explains why the method was fun.
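As a cross-check, the three expressions in this thread can be compared over all eight input combinations. The following is a minimal Python sketch; the function names are made up for illustration, and the table it prints is regenerated from the sum-of-products formula for z2 rather than copied from the original post.

```python
from itertools import product

def original(x0, x1, x2):
    # z2 = x0'x1'x2 + x0'x1x2' + x0'x1x2 + x0x1x2
    return bool((not x0 and not x1 and x2) or (not x0 and x1 and not x2)
                or (not x0 and x1 and x2) or (x0 and x1 and x2))

def xnor_form(x0, x1, x2):
    # z2 = x2(x0 XNOR x1) + x1(x0 XNOR x2) + x1x2
    return bool((x2 and x0 == x1) or (x1 and x0 == x2) or (x1 and x2))

def short_form(x0, x1, x2):
    # z2 = x0x1x2 + x0'(x1 + x2)
    return bool((x0 and x1 and x2) or (not x0 and (x1 or x2)))

inputs = list(product((0, 1), repeat=3))          # rows ordered 000 .. 111
column = [int(original(*row)) for row in inputs]  # the z2 column
all_agree = all(original(*r) == xnor_form(*r) == short_form(*r) for r in inputs)

print("x0 x1 x2 | z2")
for row, z2 in zip(inputs, column):
    print(*row, "|", z2)
print("all three forms agree:", all_agree)
```

The upper half of the printed column (x0 = 0) is the OR table for x1, x2 and the lower half (x0 = 1) is the AND table, which is the pattern described above.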

Related

Trying to fit a response surface with R's formula notation

I have a set of a couple of dozen numeric variables and am trying to figure out how to compactly express a quadratic form in those variables. I also want to include the variables themselves. The idea here is that we are fitting a response surface, rather than interacting a group of treatments, as the standard R formula notation seems to assume. I am trying to get appropriate expressions turned into an R formula, suitable for estimation by different techniques, with different data sets, or over different periods.
If there is an explicit statement of how R's formula notation works, anywhere, I have not been able to find it. There is an ancient paper from which R supposedly copied the notation, but it is by no means identical to current R usage. Every other description I have found just gives examples, that do not cover every case -- not even close to every case.
So, just as an example, here I try to construct a quadratic form in three variables, without writing out all the pairs by hand with an I() around each pair.
library(tidyverse)
A <- B <- C <- 1:10
LHS <- 1:10 * 600
tb <- tibble(LHS, A, B, C)
my_eq <- as.formula(LHS ~ I(A + B + C)*I(A + B + C))
I have not found any way to tell if I have succeeded.
Neither my_eq nor terms(my_eq) seems at all enlightening.
For example, can one predict whether
identical(as.formula(LHS ~ I(A + B + C)*I(A + B + C)), as.formula(LHS ~ I((A + B + C)*(A + B + C))))
is true or false? I cannot even guess. Or to take an even simpler case, is ~ A * I(A) equal to A, I(A^2), or something else? And how would you know?
To restate my question, I would like either a full statement of how R's formula notation works, adequate to cover every case and predict what each would mean, or, failing that, a straightforward way of producing an expansion of any existing formula into all the atomic terms for which coefficients will be estimated.
This may not answer your question, but I'll post this anyway since I think it may help a little.
The I function inhibits the interpretation of operators such as "+", so your formula is probably not going to do what you expect it to do. For example, the results of lm(my_eq) will be the same as the results of doing the following:
D <- A + B + C
lm(LHS ~ D * D)
And then you may as well just do lm(LHS ~ D).
For your question, I believe John Maindonald wrote a good book that explains R formulas for many situations. But it's in my office and today is a Sunday.
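Edit: For the expansion, you can fit the model and then look at the call or the terms (terms() can also be applied to the formula itself, e.g. attr(terms(my_eq), "term.labels"), without fitting anything):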
> my_eq <- as.formula(LHS ~ (A + B + C) * (A + B + C))
> my_formula <- lm(my_eq)
> attr(terms(my_formula), "term.labels")
[1] "A" "B" "C" "A:B" "A:C" "B:C"
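To make the expansion rule behind that output concrete, here is a rough Python sketch (not R) of the crossing operator only. It assumes the simplified rules that a*b means a + b + a:b and that a term interacting with itself collapses to itself; the helper names `cross` and `labels` are hypothetical, and R's actual formula parser handles many more operators than this.

```python
from itertools import product

def cross(left, right):
    """Expand 'left * right' as left + right + all pairwise interactions."""
    terms = list(left)
    for t in right:
        if t not in terms:
            terms.append(t)
    for a, b in product(left, right):
        inter = a | b            # union of variables, so A:A collapses to A
        if inter not in terms:
            terms.append(inter)
    # order by interaction degree, then alphabetically
    return sorted(terms, key=lambda t: (len(t), sorted(t)))

def labels(terms):
    return [":".join(sorted(t)) for t in terms]

# (A + B + C) * (A + B + C)
abc = [frozenset({"A"}), frozenset({"B"}), frozenset({"C"})]
quadratic = labels(cross(abc, abc))

# I(A + B + C) * I(A + B + C): I() makes the whole sum one opaque term
d = [frozenset({"I(A + B + C)"})]
collapsed = labels(cross(d, d))

print(quadratic)   # main effects and pairwise interactions, no squares
print(collapsed)   # a single term
```

Under these rules no squared terms appear, which is why a quadratic response surface still needs explicit I(A^2)-style terms.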

Optimizing for global minimum

I am attempting to use optimize() to find the minimum value of n for the following function (Clopper-Pearson lower bound):
f <- function (n, p=0.5)
(1 + (n - p*n + 1) /
(p*n*qf(p= .025, df1= 2*p, df2= 2*(n - p + 1))))^-1
And here is how I attempted to optimize it:
n_clop <- optimize(f, c(300,400), maximum = FALSE, p=0.5)
n_clop
I did this over the interval [300,400] because I suspect the value lies within it, but ultimately I would like to do the optimization between 0 and infinity. It seems that this command is producing a local minimum, because no matter the interval it returns the lower bound of that interval as the minimum, which is not what I expect from Clopper-Pearson. So, my two questions are how to properly find a global minimum in R and how to do so over any interval?
I've very briefly looked over the Wikipedia page you linked and don't see any obvious typos in your formula (although I feel like it should be 0.975=1-alpha/2 rather than 0.025=alpha/2?). However, evaluating the function you've coded over a very broad scale suggests that there are no local minima that are messing you up. My strong guess would be that either your logic is wrong (i.e., n->0 is really the right answer) or that you haven't coded what you think you're coding, due to a typo (possibly in the Wikipedia article, although that seems unlikely) or a thinko.
f <- function (n, p=0.5)
(1 + (n - p*n + 1) /
(p*n*qf(p= .025, df1= 2*p, df2= 2*(n - p + 1))))^-1
Confirm that you're getting the right answer for the interval you chose:
curve(f(x), from=300, to=400)
Evaluating over a broad range (n=0.00001 to 1000000):
curve(f(10^x), from=-5, to=7)
As @MrFlick suggests, global optimization is hard. You could start with optim(...method="SANN") but the best answer is definitely case-specific.
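The "always returns the lower bound" symptom is what any bracketing minimizer does on a function that is increasing over the whole interval. The Python sketch below shows this with a plain golden-section search; R's optimize() combines golden-section search with parabolic interpolation, so this is only an approximation of its behaviour, and `increasing` is a made-up stand-in, not the Clopper-Pearson bound.

```python
import math

def golden_section_min(f, lo, hi, tol=1e-8):
    """Shrink [lo, hi] around a minimum of f using golden-section search."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            b = d          # minimum lies in [a, d]
        else:
            a = c          # minimum lies in [c, b]
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return (a + b) / 2

def increasing(n):
    # hypothetical stand-in: strictly increasing for n > 0
    return n / (n + 1.0)

n_min = golden_section_min(increasing, 300.0, 400.0)
print(n_min)   # lands at the lower end of the interval
```

If the result moves with the lower bound whenever the interval changes, the function is monotone there and the optimizer is working as designed; the thing to check is the function itself.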

Calculating log(sum of exp(terms) ) when "terms" are very small

I would like to compute log( exp(A1) + exp(A2) ).
The formula below
log(exp(A1) + exp(A2) ) = log[exp(A1)(1 + exp(A2)/exp(A1))] = A1 + log(1+exp(A2-A1))
is useful when A1 and A2 are large and numerically exp(A1)=Inf (or exp(A2)=Inf).
(this formula is discussed in this thread ->
How to calculate log(sum of terms) from its component log-terms). The formula also holds with the roles of A1 and A2 swapped.
My concern with this formula is the case where A1 and A2 are very small (large and negative). For example, when A1 and A2 are:
A1 <- -40000
A2 <- -45000
then the direct calculation of log(exp(A1) + exp(A2) ) is:
log(exp(A1) + exp(A2))
[1] -Inf
Using the formula above gives:
A1 + log(1 + exp(A2-A1))
[1] -40000
which is the value of A1.
Using the formula above with the roles of A1 and A2 flipped gives:
A2 + log(1 + exp(A1-A2))
[1] Inf
Which of the three values is closest to the true value of log(exp(A1) + exp(A2))? Is there a robust way to compute log(exp(A1) + exp(A2)) that works both when A1 and A2 are small and when they are large?
Thank you in advance
You should use something with more accuracy to do the direct calculation.
It’s not “useful when [they’re] large”. It’s useful when the difference is very negative.
When x is near 0, then log(1+x) is approximately x. So if A1>A2, we can take your first formula:
log(exp(A1) + exp(A2)) = A1 + log(1+exp(A2-A1))
and approximate it by A1 + exp(A2-A1) (and the approximation will get better as A2-A1 is more negative). Since A2-A1=-5000, this is more than negative enough to make the approximation sufficient.
Regardless, if y is too far from zero (either way) exp(y) will (over|under)flow a double and result in 0 or infinity (this is a double, right? what language are you using?). This explains your answers. But since exp(A2-A1)=exp(-5000) is close to zero, your answer is approximately -40000+exp(-5000), which is indistinguishable from -40000, so that one is correct.
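Putting this together: always factor out the larger of the two values, so that the exponent passed to exp() is never positive. A minimal Python sketch follows; the function name `log_sum_exp` is chosen here for illustration, and it uses log1p for accuracy when the exponential is tiny.

```python
import math

def log_sum_exp(a1, a2):
    """Compute log(exp(a1) + exp(a2)) without overflow or spurious -inf."""
    hi, lo = max(a1, a2), min(a1, a2)
    if hi == -math.inf:
        return -math.inf        # both terms are exp(-inf) = 0
    # lo - hi <= 0, so exp() can underflow to 0 but never overflow
    return hi + math.log1p(math.exp(lo - hi))

small = log_sum_exp(-40000, -45000)   # the example from the question
large = log_sum_exp(1000, 1000)       # exp(1000) alone would overflow
print(small, large)
```

The argument order does not matter because the function picks the larger value itself, which avoids the Inf seen when the roles of A1 and A2 are flipped by hand.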
With such a huge difference in exponents, the safest thing you can do without arbitrary precision is:
choose the bigger exponent, Am = max(A1,A2)
so: log(exp(A1)+exp(A2)) -> log(exp(Am)) = Am
That is the closest you can get in such a case,
so in your example the result is -40000+delta,
where delta is something very small.
If you want to use the second formula, then it all comes down to computing log(1+exp(A)):
if A is large and positive, exp(A) overflows and the result is far from the real value;
if A is large and negative, exp(A) underflows to 0 and the term truncates to log(1)=0, so you get the same result as above.
[Notes]
Your exponent difference is base^5000.
Single precision (32-bit float) can store magnitudes up to about 2^(+/-128).
Double precision (64-bit float) can store magnitudes up to about 2^(+/-1024).
So when your base is 10 or e, this is nowhere near what you need.
If you have quadruple precision, that should be enough, but if you change the exponent difference again you will quickly get to the same point as now.
[PS] If you need more range without arbitrary precision,
you can try to create your own number class
that stores numbers internally as number=a^b,
where a,b are floats,
but for that you would need to code all the basic operations:
*,/ is easy;
+,- is a nightmare, but there may be approaches/algorithms out there even for this.

Mathematical (Arithmetic) representation of XOR

I have spent the last 5 hours searching for an answer. Even though I have found many answers they have not helped in any way.
What I am basically looking for is a mathematical, arithmetic only representation of the bitwise XOR operator for any 32bit unsigned integers.
Even though this sounds really simple, nobody (at least it seems so) has managed to find an answer to this question.
I hope we can brainstorm, and find a solution together.
Thanks.
XOR any numerical input between 0 and 1 including both ends
a + b - ab(1 + a + b - ab)
XOR binary input
a + b - 2ab or (a-b)²
Derivation
Basic Logical Operators
NOT = (1-x)
AND = x*y
From those operators we can get...
OR = (1-(1-a)(1-b)) = a + b - ab
Note: If a and b are mutually exclusive then their AND condition will always be zero. From a Venn diagram perspective, this means there is no overlap. In that case, we could write OR = a + b, since a*b = 0 for all values of a and b.
2-Factor XOR
Defining XOR as (a OR b) AND (NOT (a AND b)):
(a OR b) --> (a + b - ab)
(NOT (a AND b)) --> (1 - ab)
AND these conditions together to get...
(a + b - ab)(1 - ab) = a + b - ab(1 + a + b - ab)
Computational Alternatives
If the input values are binary, then power terms can be ignored to arrive at simplified, computationally equivalent forms.
a + b - ab(1 + a + b - ab) = a + b - ab - a²b - ab² + a²b²
If x is binary (either 1 or 0), then we can disregard powers since 1² = 1 and 0² = 0...
a + b - ab - a²b - ab² + a²b² -- remove powers --> a + b - 2ab
XOR (binary) = a + b - 2ab
Binary also allows other equations to be computationally equivalent to the one above. For instance...
Given (a-b)² = a² + b² - 2ab
If input is binary we can ignore powers, so...
a² + b² - 2ab -- remove powers --> a + b - 2ab
Allowing us to write...
XOR (binary) = (a-b)²
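The three forms derived above can be compared directly. This short Python sketch (function names are illustrative only) checks that they agree on binary inputs and shows that they diverge once the inputs are fractional.

```python
def xor_general(a, b):
    # (a OR b) AND NOT (a AND b), built from 1-x and x*y
    return a + b - a * b * (1 + a + b - a * b)

def xor_linear(a, b):
    # general form with powers dropped (valid for binary inputs only)
    return a + b - 2 * a * b

def xor_square(a, b):
    return (a - b) ** 2

binary_ok = all(
    xor_general(a, b) == xor_linear(a, b) == xor_square(a, b) == (a ^ b)
    for a in (0, 1) for b in (0, 1)
)

# Off the binary grid the three forms no longer agree
half = (xor_general(0.5, 0.5), xor_linear(0.5, 0.5), xor_square(0.5, 0.5))
print(binary_ok, half)
```

At a = b = 0.5 the general form gives 0.75 * 0.75 = 0.5625, the linear form gives 0.5, and the squared form gives 0, so the choice only matters when values between 0 and 1 carry meaning.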
Multi-Factor XOR
XOR = (1 - A*B*C...)(1 - (1-A)(1-B)(1-C)...)
Excel VBA example...
Function ArithmeticXOR(R As Range, Optional EvaluateEquation = True)
    Dim c As Range
    Dim AndOfNots As String
    Dim AndGate As String
    'Build "(1-A)*(1-B)*..." and "A*B*..." from the cell addresses
    For Each c In R
        AndOfNots = AndOfNots & "*(1-" & c.Address & ")"
        AndGate = AndGate & "*" & c.Address
    Next
    'Drop the leading "*" from each product
    AndOfNots = Mid(AndOfNots, 2)
    AndGate = Mid(AndGate, 2)
    'Now all we want is (Not(AndGate) AND Not(AndOfNots))
    ArithmeticXOR = "(1 - " & AndOfNots & ")*(1 - " & AndGate & ")"
    If EvaluateEquation Then
        ArithmeticXOR = Application.Evaluate(ArithmeticXOR)
    End If
End Function
Any n of k
These same methods can be extended to allow for any n number out of k conditions to qualify as true.
For instance, out of three variables a, b, and c, if you're willing to accept any two conditions, then you want a&b or a&c or b&c. This can be arithmetically modeled from the composite logic...
(a && b) || (a && c) || (b && c) ...
and applying our translations...
1 - (1-ab)(1-ac)(1-bc)...
This can be extended to any n number out of k conditions. There is a pattern of variable and exponent combinations, but this gets very long; however, you can simplify by ignoring powers for a binary context. The exact pattern is dependent on how n relates to k. For n = k-1, where k is the total number of conditions being tested, the result is as follows:
c1 + c2 + c3 ... ck - n*∏
Where c1 through ck are all n-variable combinations.
For instance, true if 3 of 4 conditions met would be
abc + abe + ace + bce - 3abce
This makes perfect logical sense since what we have is the additive OR of AND conditions minus the overlapping AND condition.
If you begin looking at n = k-2, k-3, etc. The pattern becomes more complicated because we have more overlaps to subtract out. If this is fully extended to the smallest value of n = 1, then we arrive at nothing more than a regular OR condition.
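Both "n of k" expressions above can be checked by brute force over binary inputs. A small Python sketch, with hypothetical function names and the same variable letters as the text:

```python
from itertools import product

def two_of_three(a, b, c):
    # OR of the pairwise ANDs: 1 - (1-ab)(1-ac)(1-bc)
    return 1 - (1 - a * b) * (1 - a * c) * (1 - b * c)

def three_of_four(a, b, c, e):
    # sum of the 3-variable products minus n = 3 times the full product
    return a*b*c + a*b*e + a*c*e + b*c*e - 3 * a*b*c*e

ok_2_of_3 = all(
    two_of_three(*v) == int(sum(v) >= 2) for v in product((0, 1), repeat=3)
)
ok_3_of_4 = all(
    three_of_four(*v) == int(sum(v) >= 3) for v in product((0, 1), repeat=4)
)
print(ok_2_of_3, ok_3_of_4)
```

When all four inputs are 1, the four products sum to 4 and the subtraction of 3 brings the result back to 1, which is the overlap correction described above.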
Thinking about Non-Binary Values and Fuzzy Region
The actual algebraic XOR equation a + b - ab(1 + a + b - ab) is much more complicated than the computationally equivalent binary equations like x + y - 2xy and (x-y)². Does this mean anything, and is there any value to this added complexity?
Obviously, for this to matter, you'd have to care about the decimal values outside of the discrete points (0,0), (0,1), (1,0), and (1,1). Why would this ever matter? Sometimes you want to relax the integer constraint for a discrete problem. In that case, you have to look at the premises used to convert logical operators to equations.
When it comes to translating Boolean logic into arithmetic, your basic building blocks are the AND and NOT operators, with which you can build both OR and XOR.
OR = (1-(1-a)(1-b)(1-c)...)
XOR = (1 - a*b*c...)(1 - (1-a)(1-b)(1-c)...)
So if you're thinking about the decimal region, then it's worth thinking about how we defined these operators and how they behave in that region.
Non-Binary Meaning of NOT
We expressed NOT as 1-x. Obviously, this simple equation works for binary values of 0 and 1, but the thing that's really cool about it is that it also provides the fractional or percent-wise complement for values between 0 and 1. This is useful since NOT is also known as the complement in Boolean logic, and when it comes to sets, NOT refers to everything outside of the current set.
Non-Binary Meaning of AND
We expressed AND as x*y. Once again, obviously it works for 0 and 1, but its effect is a little more arbitrary for values between 0 and 1, where multiplication results in partial truths (decimal values) diminishing each other. It's possible to imagine that you would want to model truth as being averaged or accumulative in this region. For instance, if two conditions are hypothetically half true, is the AND condition only a quarter true (0.5 * 0.5), or is it entirely true (0.5 + 0.5 = 1), or does it remain half true ((0.5 + 0.5) / 2)? As it turns out, the quarter truth is actually true for conditions that are entirely discrete, where the partial truth represents probability. For instance, will you flip tails (binary condition, 50% probability) both now AND again a second time? The answer is 0.5 * 0.5 = 0.25, or 25% true. Accumulation doesn't really make sense because it's basically modeling an OR condition (remember OR can be modeled by + when the AND condition is not present, so summation is characteristically OR). The average makes sense if you're looking at agreement and measurements, but it's really modeling a hybrid of AND and OR. For instance, ask 2 people to say on a scale of 1 to 10 how much they agree with the statement "It is cold outside". If they both say 5, then the truth of the statement "It is cold outside" is 50%.
Non-Binary Values in Summary
The take away from this look at non-binary values is that we can capture actual logic in our choice of operators and construct equations from the ground up, but we have to keep in mind numerical behavior. We are used to thinking about logic as discrete (binary) and computer processing as discrete, but non-binary logic is becoming more and more common and can help make problems that are difficult with discrete logic easier/possible to solve. You'll need to give thought to how values interact in this region and how to translate them into something meaningful.
"mathematical, arithmetic only representation" are not correct terms anyway. What you are looking for is a function which goes from IxI to I (domain of integer numbers).
Which restrictions would you like to have on this function? Only linear algebra? (+ , - , * , /) then it's impossible to emulate the XOR operator.
If instead you accept some non-linear operators like Max() Sgn() etc, you can emulate the XOR operator with some "simpler" operators.
Given that (a-b)(a-b) quite obviously computes xor for a single bit, you could construct a function with the floor or mod arithmetic operators to split the bits out, then xor them, then sum to recombine. (a-b)(a-b) = a² - 2·a·b + b², so one bit of xor gives a polynomial with 3 terms.
Without floor or mod, the different bits interfere with each other, so you're stuck with looking at a solution which is a polynomial interpolation treating the input a,b as a single value: a xor b = g(a · 2³² + b)
The polynomial has 2⁶⁴-1 terms, though it will be symmetric in a and b as xor is commutative, so you only have to calculate half of the coefficients. I don't have the space to write it out for you.
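The floor/mod route described above is short enough to write out. Here is a minimal Python sketch, assuming 32-bit unsigned inputs; the name `arithmetic_xor` and the `bits` parameter are illustrative. It uses only +, -, *, integer division and remainder, and never the ^ operator.

```python
def arithmetic_xor(a, b, bits=32):
    """XOR of two unsigned integers using only arithmetic operators."""
    result = 0
    weight = 1                      # 2**i for the bit being processed
    for _ in range(bits):
        bit_a = a % 2               # split out the lowest bit
        bit_b = b % 2
        result += weight * (bit_a - bit_b) ** 2   # one-bit xor is (a-b)^2
        a //= 2                     # floor division drops that bit
        b //= 2
        weight *= 2
    return result

samples = [(0, 0), (1, 3), (12345, 67890), (0xFFFFFFFF, 0x0F0F0F0F),
           (0xDEADBEEF, 0x12345678)]
matches = all(arithmetic_xor(a, b) == (a ^ b) for a, b in samples)
print(matches)
```

The built-in ^ appears only in the comparison used to check the result.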
I wasn't able to find any solution for 32-bit unsigned integers but I've found some solutions for 2-bit integers which I was trying to use in my Prolog program.
One of my solutions (which uses exponentiation and modulo) is described in this StackOverflow question and the others (some without exponentiation, pure algebra) can be found in this code repository on Github: see different xor0 and o_xor0 implementations.
The nicest xor representation for 2-bit uints seems to be: xor(A,B) = (A + B*((-1)^A)) mod 4.
Solution with +,-,*,/ expressed as an Excel formula (where cells A2 to A5 and cells B1 to E1 contain the numbers 0-3), to be inserted in cells B2 to E5:
(1-$A2)*(2-$A2)*(3-$A2)*($A2+B$1)/6 - $A2*(1-$A2)*(3-$A2)*($A2+B$1)/2 + $A2*(1-$A2)*(2-$A2)*($A2-B$1)/6 + $A2*(2-$A2)*(3-$A2)*($A2-B$1)/2 - B$1*(1-B$1)*(3-B$1)*$A2*(3-$A2)*(6-4*$A2)/2 + B$1*(1-B$1)*(2-B$1)*$A2*($A2-3)*(6-4*$A2)/6
It may be possible to adapt and optimize this solution for 32-bit unsigned integers. It's complicated and it uses logarithms but seems to be the most universal one as it can be used on any integer number. Additionally, you'll have to check if it really works for all number combinations.
I do realize that this is sort of an old topic, but the question is worth answering and yes, this is possible using an algorithm. And rather than go into great detail about how it works, I'll just demonstrate with a simple example (written in C):
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <time.h>

typedef unsigned long number;

number XOR(number a, number b)
{
    number
        result = 0,
        /*
        The following calculation just gives us the highest power of
        two (and thus the most significant bit) for this data type.
        */
        power = pow(2, (sizeof(number) * 8) - 1);
    /*
    Loop until no more bits are left to test...
    */
    while(power != 0)
    {
        result *= 2;
        /*
        The != comparison works just like the XOR operation.
        */
        if((power > a) != (power > b))
            result += 1;
        a %= power;
        b %= power;
        power /= 2;
    }
    return result;
}

int main()
{
    srand(time(0));
    for(;;)
    {
        number
            a = rand(),
            b = rand();
        printf("a = %lu\n", a);
        printf("b = %lu\n", b);
        printf("a ^ b = %lu\n", a ^ b);
        printf("XOR(a, b) = %lu\n", XOR(a, b));
        getchar();
    }
}
I think this relation might help in answering your question
A + B = (A XOR B) + 2*(A AND B)
(a-b)*(a-b) is the right answer for single bits. The only one? I guess so!

Minimization with constraint on all parameters in R

I want to fit a simple linear model Y = x1 + x2 + x3 + x4 + x5 using ordinary least squares with the constraint that the sum of all coefficients has to equal 5. How can I accomplish this in R? All of the packages I've seen seem to allow for constraints on individual coefficients, but I can't figure out how to set a single constraint affecting all coefficients. I'm not tied to OLS; if this requires an iterative approach, that's fine as well.
The basic math is as follows: we start with
mu = a0 + a1*x1 + a2*x2 + a3*x3 + a4*x4
and we want to find a0-a4 to minimize the SSQ between mu and our response variable y.
if we replace the last parameter (say a4) with (say) C-a1-a2-a3 to honour the constraint, we end up with a new set of linear equations
mu = a0 + a1*x1 + a2*x2 + a3*x3 + (C-a1-a2-a3)*x4
= a0 + a1*(x1-x4) + a2*(x2-x4) + a3*(x3-x4) + C*x4
(note that a4 has disappeared ...)
Something like this (untested!) implements it in R.
Original data frame:
d <- data.frame(y=runif(20),
                x1=runif(20),
                x2=runif(20),
                x3=runif(20),
                x4=runif(20))
Create a transformed version where all but the last column have the last column "swept out", e.g. x1 -> x1-x4; x2 -> x2-x4; ...
dtrans <- data.frame(y=d$y,
                     sweep(d[,2:4],
                           1,
                           d[,5],
                           "-"),
                     x4=d$x4)
Rename to tx1, tx2, ... to minimize confusion:
names(dtrans)[2:4] <- paste("t",names(dtrans[2:4]),sep="")
Sum-of-coefficients constraint:
constr <- 5
Now fit the model with an offset:
lm(y~tx1+tx2+tx3,offset=constr*x4,data=dtrans)
It wouldn't be too hard to make this more general.
This requires a little more thought and manipulation than simply specifying a constraint to a canned optimization program. On the other hand, (1) it could easily be wrapped in a convenience function; (2) it's much more efficient than calling a general-purpose optimizer, since the problem is still linear (and in fact one dimension smaller than the one you started with). It could even be done with big data (e.g. biglm). (Actually, it occurs to me that if this is a linear model, you don't even need the offset, although using the offset means you don't have to compute a0=intercept-C*x4 after you finish.)
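The same substitution can be spelled out without R. The Python sketch below is only an illustration under stated assumptions: it solves the normal equations with a hand-written Gaussian elimination (fine for a handful of well-conditioned predictors, not a substitute for lm()), the helper names are hypothetical, and the data are noise-free and generated from made-up coefficients whose slopes sum to 5.

```python
import random

def solve_linear(M, v):
    """Solve M z = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (aug[r][n] - s) / aug[r][r]
    return z

def least_squares(X, y):
    """Ordinary least squares via the normal equations X'X b = X'y."""
    p = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve_linear(XtX, Xty)

def constrained_fit(xs, y, total):
    """Fit y ~ a0 + sum(a_i * x_i) subject to sum(a_i) == total."""
    # sweep the last predictor out of the others: x_i -> x_i - x_last
    design = [[1.0] + [xi - row[-1] for xi in row[:-1]] for row in xs]
    # move the offset total * x_last to the left-hand side
    shifted = [yi - total * row[-1] for row, yi in zip(xs, y)]
    coef = least_squares(design, shifted)
    slopes = coef[1:]
    return [coef[0]] + slopes + [total - sum(slopes)]

random.seed(0)
true = [0.5, 1.0, 2.0, 0.5, 1.5]           # a0, a1..a4 with a1+...+a4 = 5
xs = [[random.random() for _ in range(4)] for _ in range(20)]
y = [true[0] + sum(a * x for a, x in zip(true[1:], row)) for row in xs]
fit = constrained_fit(xs, y, total=5.0)
print(fit)
```

The last coefficient is never estimated directly; it is recovered as the constraint value minus the other slopes, which is why the fitted problem has one fewer free parameter.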
Since you said you are open to other approaches, this can also be solved in terms of a quadratic programming (QP):
Minimize a quadratic objective: the sum of the squared errors,
subject to a linear constraint: your weights must sum to 5.
Assuming X is your n-by-5 matrix and Y is a vector of length n, this would solve for your optimal weights:
library(limSolve)
lsei(A = X,
B = Y,
E = matrix(1, nrow = 1, ncol = 5),
F = 5)
