Does AC-3 solve path consistency?

When I read the pseudocode of AC-3 in Artificial Intelligence: A Modern Approach, I thought it enforced path consistency as well as arc consistency. But the book says path consistency is handled by a separate algorithm, PC-2. Did I miss something?
Why is AC-3 not sufficient for enforcing path consistency?
Here's the pseudocode for AC-3:
function AC-3(csp) returns false if an inconsistency is found and true otherwise
  inputs: csp, a binary CSP with components (X, D, C)
  local variables: queue, a queue of arcs, initially all the arcs in csp

  while queue is not empty do
    (Xi, Xj) ← REMOVE-FIRST(queue)
    if REVISE(csp, Xi, Xj) then
      if size of Di = 0 then return false
      for each Xk in Xi.NEIGHBORS - {Xj} do
        add (Xk, Xi) to queue
  return true

function REVISE(csp, Xi, Xj) returns true iff we revise the domain of Xi
  revised ← false
  for each x in Di do
    if no value y in Dj allows (x, y) to satisfy the constraint between Xi and Xj then
      delete x from Di
      revised ← true
  return revised
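For readers who want to run it, here is a rough Python transliteration of that pseudocode (my own sketch, not from the book). It assumes a hypothetical dict-based CSP with 'domains' (variable -> set of values), 'neighbors' (variable -> set of neighboring variables), and a binary predicate 'constraint(Xi, x, Xj, y)':

from collections import deque

def revise(csp, Xi, Xj):
    # Delete values of Xi that have no supporting value in Xj; report whether Di changed.
    revised = False
    for x in list(csp['domains'][Xi]):
        if not any(csp['constraint'](Xi, x, Xj, y) for y in csp['domains'][Xj]):
            csp['domains'][Xi].remove(x)
            revised = True
    return revised

def ac3(csp):
    # Returns False if an inconsistency is found and True otherwise.
    queue = deque((Xi, Xj) for Xi in csp['neighbors'] for Xj in csp['neighbors'][Xi])
    while queue:
        Xi, Xj = queue.popleft()
        if revise(csp, Xi, Xj):
            if not csp['domains'][Xi]:
                return False
            for Xk in csp['neighbors'][Xi] - {Xj}:
                queue.append((Xk, Xi))
    return True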
Thanks in advance:)

I think I've figured out where the problem is. I misunderstood the meaning of path consistency.
I thought
(1) {Xi, Xj} is path-consistent with Xk
is equivalent to
(2) Xi is arc-consistent with Xj, Xi is arc-consistent with Xk, and Xj is arc-consistent with Xk.
That's why I thought AC-3 was sufficient for path consistency. But it turns out it isn't.
Here is what (1) and (2) actually mean:
(1) means that, for every pair of assignments {a, b} consistent with the constraint on {Xi, Xj}, there is a value c in the domain of Xk such that {a, c} and {b, c} satisfy the constraints on {Xi, Xk} and {Xj, Xk}.
(2) can be explained this way (which makes it easier to see the difference): for every pair of assignments {a, b} consistent with the constraint on {Xi, Xj} (Xi is arc-consistent with Xj; this reading isn't exact, but it will do), there is a c in the domain of Xk such that {a, c} satisfies the constraint on {Xi, Xk} (Xi is arc-consistent with Xk), and there is a d in the domain of Xk such that {b, d} satisfies the constraint on {Xj, Xk} (Xj is arc-consistent with Xk).
It's easy to see the difference now: in the explanation of (2), c and d can be different values in the domain of Xk. Only when c equals d is (2) equivalent to (1).
So AC-3 is only sufficient for enforcing (2); it is too weak to enforce path consistency.
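Here is a minimal Python sketch (my own illustration, not from the book) of exactly this gap: three variables with domains {0, 1} and a "not equal" constraint on every pair. Every arc is consistent, so AC-3 deletes nothing, yet the consistent pair {Xi = 0, Xj = 1} cannot be extended to any value of Xk, so {Xi, Xj} is not path-consistent with Xk:

from itertools import product

domains = {'Xi': {0, 1}, 'Xj': {0, 1}, 'Xk': {0, 1}}

def constraint(a, b):
    # the same "not equal" constraint on every pair of variables
    return a != b

def arc_consistent(xi, xj):
    # every value of xi has some supporting value in xj
    return all(any(constraint(a, b) for b in domains[xj]) for a in domains[xi])

def path_consistent(xi, xj, xk):
    # every consistent pair (a, b) for {xi, xj} extends to some c in xk
    return all(any(constraint(a, c) and constraint(b, c) for c in domains[xk])
               for a, b in product(domains[xi], domains[xj]) if constraint(a, b))

print(arc_consistent('Xi', 'Xj'), arc_consistent('Xi', 'Xk'), arc_consistent('Xj', 'Xk'))  # True True True
print(path_consistent('Xi', 'Xj', 'Xk'))  # False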
Can anyone tell me whether my understanding is right this time? Thanks :)

It should be {b, d} satisfies the constraint on {Xj, Xk} (Xj is arc-consistent with Xk).


Dart - Find all roots in a function

I am trying to find all roots [f(x) = 0] of a function. My current solution only works if they are spaced out enough and don't interfere with each other. (e.g. it works for x^2 - 2)
bool numberIsCloseToZero(num number){
  return (num.parse(number.abs().toStringAsFixed(1)) == 0.0) ? true : false;
}

List<num> calculateRoots(String function){
  num eval = 0.0;
  List<num> roots = [];
  for (num x = -10; x < 10; x += 0.1){
    eval = calculateYOfX(function, x);
    if (numberIsCloseToZero(num.parse(eval.toStringAsFixed(2)))){
      roots.add(x);
    }
  }
  return roots;
}
Obviously, this is due to my rounding (e.g. the values surrounding the root of x^2 are so close to zero that they get treated as roots as well). Do you think I should actually solve the equation instead of "brute forcing" the roots?
Thanks
If you can find an analytical solution, use it. That is possible for low-degree polynomial equations (like the mentioned x^2 - 2).
In the general case you will have to turn to numerical methods, in this case root finding.
Start with the bisection method or Newton's method. Both refine the position of a root at every step.
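As a rough illustration, here is a minimal bisection sketch (written in Python rather than Dart, but the idea carries over directly). It assumes f is continuous and changes sign on [a, b]:

def bisect(f, a, b, tol=1e-9):
    # Find one root of f in [a, b], assuming f(a) and f(b) have opposite signs.
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    while b - a > tol:
        m = (a + b) / 2
        fm = f(m)
        if fa * fm <= 0:
            b, fb = m, fm   # the root lies in the left half
        else:
            a, fa = m, fm   # the root lies in the right half
    return (a + b) / 2

print(bisect(lambda x: x * x - 2, 0, 10))   # ~1.41421356

Newton's method converges faster but needs the derivative (or a numerical approximation of it) and a reasonable starting point.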
You'll need to put some restrictions on what is an allowable function, otherwise you have no hope.
For example, without any restrictions you have no guarantee that there is only a finite number of zeros (consider f(x) = sin(x)), or even a finite number of zeros in a given interval (consider f(x) = x sin(1/x)), or that the zeros are isolated at all (f(x) = max(0, x) has an infinity of connected zeros).
And these cases are not even considered particularly pathological mathematical functions.
If you're willing to go down the path of requiring your function to be non-zero almost-everywhere, smooth, continuous and with bounded first and second derivatives then I think you may be able to come up with a relatively simple algorithm that guarantees you get all zeros in a given finite region.
(I'd look for a subdivision based algorithm which recursively splits the region and determines strict bounds on each interval.)
We can derive an example algorithm for when the derivative is bounded by a known constant i.e. |f'(x)| < D. Note that if we evaluate f at some point p then for any other point p+d we can show that f(p) - |d| D < f(p+d) < f(p) + |d| D.
Using this we can consider root finding in an interval [A,B], which we can write as [p-d, p+d] where p=(A+B)/2, d=(B-A)/2. Sample f at the mid-point to get f(p). The minimum value f could take on the interval is f(p) - d D and the maximum value is f(p) + d D. We can only have a root in this interval if f(p) - d D <= 0 <= f(p) + d D, which is equivalent to |f(p)| <= d D.
If there can be no root in [A,B] we're done, otherwise we repeat on the two halves [A,p] and [p,B]. (some care needs to be taken in the case f(p)=0 )
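Here is a rough Python sketch of that subdivision idea (my own illustration, assuming D is a known bound on |f'(x)| supplied by the caller):

def possible_root_intervals(f, A, B, D, tol=1e-6):
    # Return subintervals of [A, B] of width <= tol that may contain a root,
    # assuming |f'(x)| < D on [A, B].
    p, d = (A + B) / 2, (B - A) / 2
    if abs(f(p)) > d * D:
        return []               # f cannot reach zero anywhere in [A, B]
    if 2 * d <= tol:
        return [(A, B)]         # small enough to report
    return (possible_root_intervals(f, A, p, D, tol) +
            possible_root_intervals(f, p, B, D, tol))

# x^2 - 2 on [-10, 10]: |f'(x)| = |2x| < 21 there, so D = 21 is a valid bound
print(possible_root_intervals(lambda x: x * x - 2, -10, 10, 21, tol=1e-4))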

Algorithm for finding an equidistributed solution to a linear congruence system

I face the following problem in a cryptographical application: I have given a set of linear congruences
a[1]*x[1]+a[2]*x[2]+a[3]*x[3] == d[1] (mod p)
b[1]*x[1]+b[2]*x[2]+b[3]*x[3] == d[2] (mod p)
c[1]*x[1]+c[2]*x[2]+c[3]*x[3] == d[3] (mod p)
Here, x is unknown and a, b, c, d are given.
The system is most likely underdetermined, so I have a large solution space. I need an algorithm that finds an equidistributed solution (that means equidistributed in the solution space) to that problem using a pseudo-random number generator (or fails).
Most standard algorithms for linear equation systems that I know from my linear algebra courses are not directly applicable to congruences as far as I can see...
My current, "safe" algorithm works as follows: Find all variable that appear in only one equation, and assign a random value. Now if in each row, only one variable is unassigned, assign the value according to the congruence. Otherwise fail.
Can anyone give me a clue how to solve this problem in general?
You can use Gaussian elimination and similar algorithms just as you learned them in your linear algebra courses, but with all arithmetic performed mod p (p is a prime). The one important difference is the definition of "division": to compute a / b you instead compute a * (1/b) (in words, "a times the inverse of b"). Consider the following changes to the math operations normally used:
addition: a+b becomes a+b mod p
subtraction: a-b becomes a-b mod p
multiplication: a*b becomes a*b mod p
division: a/b becomes: if p divides b, then "error: divide by zero", else a * (1/b) mod p
To compute the inverse of b mod p you can use the extended Euclidean algorithm, or alternatively compute b**(p-2) mod p (by Fermat's little theorem).
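Here is a rough Python sketch of this approach (my own illustration, not a hardened implementation): row-reduce the augmented matrix mod p, give each free variable an independent uniformly random value, and back-substitute. Since the solution set is an affine subspace of (Z/p)^n parametrized by the free variables, uniform free values give an equidistributed solution:

import random

def solve_mod_p_random(A, d, p, rng=random):
    # Solve A x == d (mod p) for one uniformly random solution, or return None
    # if the system is inconsistent. p must be prime so nonzero elements are invertible.
    n_rows, n_cols = len(A), len(A[0])
    M = [[A[i][j] % p for j in range(n_cols)] + [d[i] % p] for i in range(n_rows)]

    pivot_cols, row = [], 0
    for col in range(n_cols):
        piv = next((r for r in range(row, n_rows) if M[r][col] != 0), None)
        if piv is None:
            continue                          # no pivot in this column: free variable
        M[row], M[piv] = M[piv], M[row]
        inv = pow(M[row][col], p - 2, p)      # modular inverse (Fermat's little theorem)
        M[row] = [(v * inv) % p for v in M[row]]
        for r in range(n_rows):
            if r != row and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [(M[r][j] - factor * M[row][j]) % p for j in range(n_cols + 1)]
        pivot_cols.append(col)
        row += 1

    if any(M[r][n_cols] != 0 for r in range(row, n_rows)):
        return None                           # a row 0 ... 0 | nonzero: no solution

    x = [0] * n_cols
    free_cols = [c for c in range(n_cols) if c not in pivot_cols]
    for c in free_cols:
        x[c] = rng.randrange(p)               # free variables: uniform in [0, p)
    for r, c in enumerate(pivot_cols):        # back-substitute the pivot variables
        x[c] = (M[r][n_cols] - sum(M[r][j] * x[j] for j in free_cols)) % p
    return x

# Example: an underdetermined 2x3 system mod the prime 11
print(solve_mod_p_random([[1, 2, 3], [2, 4, 6]], [5, 10], 11))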
Rather than trying to roll this yourself, look for an existing library or package. I think maybe Sage can do this, and certainly Mathematica, and Maple, and similar commercial math tools can.

Get branch points of equation

If I have a general function f(z, a), where z and a are both real, and f takes on real values for all z except in some interval (z1, z2), where it becomes complex: how do I determine z1 and z2 (which will be in terms of a) using Mathematica, and is this possible at all? What are the limitations?
For a test example, consider the function f[z_,a_]=Sqrt[(z-a)(z-2a)]. For real z and a, this takes on real values except in the interval (a,2a), where it becomes imaginary. How do I find this interval in Mathematica?
In general, I'd like to know how one would go about finding it mathematically for a general case. For a function with just two variables like this, it'd probably be straightforward to do a contour plot of the Riemann surface and observe the branch cuts. But what if it is a multivariate function? Is there a general approach that one can take?
What you have appears to be a Riemann surface parametrized by 'a'. Consider the algebraic (or analytic) relation g(a,z)=0 that would be spawned from this branch of a parametrized Riemann surface. In this case it is simply g^2 - (z - a)*(z - 2*a) == 0. More generally it might be obtained using GroebnerBasis, as below (no guarantee this will always work without some amount of user intervention).
grelation = First[GroebnerBasis[g - Sqrt[(z - a)*(z - 2*a)], {x, a, g}]]
Out[472]= 2 a^2 - g^2 - 3 a z + z^2
A necessary condition for the branch points, as functions of the parameter 'a', is that the zero set for 'g' not give a (single valued) function in a neighborhood of such points. This in turn means that the partial derivative of this relation with respect to g vanishes (this is from the implicit function theorem of multivariable calculus). So we find where grelation and its derivative both vanish, and solve for 'z' as a function of 'a'.
Solve[Eliminate[{grelation == 0, D[grelation, g] == 0}, g], z]
Out[481]= {{z -> a}, {z -> 2 a}}
Daniel Lichtblau
Wolfram Research
For polynomial systems (and some class of others), Reduce can do the job.
E.g.
In[1]:= Reduce[Element[{a, z}, Reals]
&& !Element[Sqrt[(z - a) (z - 2 a)], Reals], z]
Out[1]= (a < 0 && 2a < z < a) || (a > 0 && a < z < 2a)
This type of approach also works (often giving very complicated solutions for functions with many branch cuts) for other combinations of elementary functions I checked.
To find the branch cuts (as opposed to the simple class of branch points you're interested in) in general, I don't know of a good approach. The best place to find the detailed conventions that Mathematica uses is at the functions.wolfram site.
I do remember reading a good paper on this a while back... I'll try to find it....
That's right! The easiest approach I've seen for branch cut analysis uses the unwinding number. There's a paper "Reasoning about the elementary functions of complex analysis" about this in the journal "Artificial Intelligence and Symbolic Computation". It and similar papers can be found on one of the authors' homepages: http://www.apmaths.uwo.ca/~djeffrey/offprints.html.
For general functions you cannot make Mathematica calculate it.
Even for polynomials, finding an exact answer takes time.
I believe Mathematica uses some sort of quantifier elimination when it uses Reduce,
which takes time.
Without any restrictions on your functions (are they polynomials, continuous, smooth?)
one can easily construct functions which Mathematica cannot simplify further:
f[x_,y_] := Abs[Zeta[y+0.5+x*I]]*I
If this function is real for some x and some y with -0.5 < y < 0 or 0 < y < 0.5,
then you will have found a counterexample to the Riemann hypothesis,
and I'm sure Mathematica cannot give a correct answer.

Asymptotic complexity constant, why the constant?

Big-O notation says that f(n) is an element of O(g(n)) if f(n) <= c * g(n) for some constant c.
I have always wondered, and never really understood, why we need this arbitrary constant multiplying the bounding function g(n) to get our bounds.
Also, how does one decide what number this constant should be?
The constant itself doesn't characterize the limiting behavior of f(n) compared to g(n).
It is used in the mathematical definition, which requires the existence of a constant M such that |f(x)| <= M * |g(x)| for all x beyond some point.
If such a constant exists then you can state that f(x) is O(g(x)). This is the usual notation when analyzing algorithms: you don't care about the particular constant, just the order of growth of the operations themselves. The constant makes the inequality hold by ensuring that M * |g(x)| is an upper bound of f(x).
How to find that constant depends on f(x) and g(x), and finding it is the mathematical step that must be carried out to prove that f(x) is O(g(x)), so there is no general rule. Look at this example.
Consider the function
f(n) = 4 * n
Doesn't it make sense to call this function O(n), since it grows "as fast" as g(n) = n?
But without the constant in the definition of O you can't find an n0 such that for all n > n0, f(n) <= n. That's why you need the constant, and indeed from the condition
4 * n <= c * n for all n > n0
you can get n0 == 0, c == 4.

Is Big O(logn) log base e?

For binary-search-tree-type data structures, I see the Big-O notation typically written as O(log n). With a lowercase 'l' in log, does this imply log base e, i.e. the natural logarithm? Sorry for the simple question, but I've always had trouble distinguishing between the different implied logarithms.
Once expressed in big-O() notation, both are correct. However, during the derivation of the run-time expression, in the case of binary search, only log2 is correct. I assume this distinction was the intuitive inspiration for your question to begin with.
Also, as a matter of my opinion, writing O(log2 N) is better for your example, because it better communicates the derivation of the algorithm's run-time.
In big-O() notation, constant factors are removed. Converting from one logarithm base to another involves multiplying by a constant factor.
So O(log N) is equivalent to O(log2 N) due to a constant factor.
However, if you can easily typeset log2 N in your answer, doing so is more pedagogical. In the case of binary tree searching, you are correct that log2 N is introduced during the derivation of the big-O() runtime.
Before expressing the result in big-O() notation, the difference is very important. When deriving the expression to be communicated via big-O notation, it would be incorrect for this example to use a logarithm other than log2 N prior to applying the O() notation. As soon as the expression is used to communicate a worst-case runtime via big-O() notation, it doesn't matter which logarithm is used.
Big-O notation is not affected by the logarithmic base, because all logarithms in different bases are related by a constant factor; O(ln n) is equivalent to O(log n).
Both are correct. Think about this:
log2(n) = log(n)/log(2) = O(log(n))
log10(n) = log(n)/log(10) = O(log(n))
logE(n) = log(n)/log(E) = O(log(n))
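A quick numerical illustration of the same point (a Python check of my own): the ratio between logarithms in two different bases is a constant that does not depend on n.

import math

for n in (10, 1000, 10**6, 10**9):
    print(math.log2(n) / math.log(n))   # always 1 / ln(2) ≈ 1.4427, regardless of n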
It doesn't really matter what base it is, since big-O notation is usually written showing only the asymptotically highest order of n, so constant coefficients will drop away. Since a different logarithm base is equivalent to a constant coefficient, it is superfluous.
That said, I would probably assume log base 2.
Yes, when talking about big-O notation, the base does not matter. However, computationally, when faced with a real search problem, it does matter.
When developing an intuition about tree structures, it's helpful to understand that a binary search tree can be searched in O(log n) time, because that is the height of the tree: in a balanced binary tree with n nodes, the depth is O(log n) (base 2). If each node has three children, the tree can still be searched in O(log n) time, but with a base-3 logarithm. Computationally, the number of children each node has can have a big impact on performance.
Enjoy!
Paul
First you must understand what it means for a function f(n) to be O(g(n)).
The formal definition is: a function f(n) is said to be O(g(n)) iff |f(n)| <= C * |g(n)| whenever n > k, where C and k are constants.
So let f(n) = log base a of n, where a > 1, and g(n) = log base b of n, where b > 1.
NOTE: This means the values a and b could be any values greater than 1, for example a = 100 and b = 3.
Now we get the following: log base a of n is said to be O(log base b of n) iff |log base a of n| <= C * |log base b of n| whenever n > k.
Choose k = 0 and C = log base a of b.
Now our inequality looks like the following: |log base a of n| <= log base a of b * |log base b of n| whenever n > 0.
Notice the right-hand side; we can manipulate it: log base a of b * |log base b of n| = |log base b of n| * log base a of b = |log base a of b^(log base b of n)| = |log base a of n|.
Now our inequality looks like the following: |log base a of n| <= |log base a of n| whenever n > 0.
This is always true no matter what the values of n, b, or a are, other than their restrictions a, b > 1 and n > 0.
So log base a of n is O(log base b of n), and since a and b don't matter we can simply omit them.
You can see a YouTube video on it here: https://www.youtube.com/watch?v=MY-VCrQCaVw
You can read an article on it here: https://medium.com/#randerson112358/omitting-bases-in-logs-in-big-o-a619a46740ca
Technically the base doesn't matter, but you can generally think of it as base-2.
