Polynomial reduction: expressing a polynomial in terms of other polynomials?

Consider the polynomials f, f2, f3 and f4 below, together with the basis I. How can we express each f as f = \sum_i a_i I_i with each a_i \geq 0?
Example
We work through the polynomials below with Macaulay2 (M2) and Mathematica.
Macaulay2:
i1 : R=RR[x1,x2,x3,MonomialOrder=>Lex];
f=x3-x1*x2;
f2=x3*x2-x1;
f3=x1-0.2;
f4=x1-x3+0.8;
i5 : I=ideal(x1-0.2,-x1+0.5,x2,-x2+1,x3-1,-x3+1); G=gb(I);
We can express f3 with elements of I, namely with the zeroth generator:
i11 : I_0==f3
o11 = true
We can express f4 with I_5 and I_0:
i17 : I_5+I_0==f4
o17 = true
Can we express f and f2 with I?
Mathematica: f and f-2 cannot be expressed in terms of I, while f-1 can be expressed in terms of I but only with negative coefficients, so Handelman's theorem cannot be applied to it.
Checking non-negativity directly:
f-2 is not non-negative (choose x3=1, x2=0, so 1-0-2 = -1 < 0),
f is non-negative (x3=1, so 1-x1*x2 > 0), and
f-1 is not non-negative (x3=1, x2>0, so -x1*x2 < 0).
By Handelman's theorem all computations are inconclusive, because the third term, -x1, is negative.
How can we express a polynomial in terms of other polynomials, as PolynomialReduce does in Mathematica, but with each quotient term positive?

Note that in this answer I am using your terminology, in which R is the polynomial ring and RR is the ring of real numbers. I should also say that you should almost never use the ring RR, since computations in Macaulay2 over the real numbers are not always reliable; always use the ring of rationals QQ or a positive characteristic field like ZZ/101.
Your f and f2 polynomials are not linear, so you cannot even write them as a linear combination of I_0,...,I_5 (i.e., the generators of I).
Furthermore, the ideal I as you defined it contains a nonzero scalar, for instance (x1-0.2) + (-x1+0.5) = 0.3, so it is what mathematicians call the unit ideal. It means I = R, that is, the whole polynomial ring.
So you can write f and f2 as a combination of I_0,...,I_5, but not a linear one.
That is, f = \sum g_i I_i with the g_i polynomials, at least one of which is not a number.
Remark. For an arbitrary ring R the elements are usually called scalars, but when R is a polynomial ring, say R = RR[x_1,...,x_n], usually only the constant polynomials (which are exactly the real numbers, i.e. elements of RR) are called scalars. This is common, and of course confusing, terminology.
Here is an example,
i2 : R=QQ[x_1,x_2]
o2 = R
o2 : PolynomialRing
i3 : I=ideal(x_1-1,x_2,x_1+1)
o3 = ideal (x_1 - 1, x_2, x_1 + 1)
o3 : Ideal of R
i4 : I == R
o4 = true
i5 : J = ideal(x_1,x_2)
o5 = ideal (x_1, x_2)
o5 : Ideal of R
i6 : J == R
o6 = false
You see that the ideal I contains x_1-1, x_2, and x_1+1, so the element (x_1+1)-(x_1-1) = 2 also belongs to I. Thus I contains a constant polynomial, which is a unit element (a unit element in a ring is an element that has a multiplicative inverse), and this implies that I = R. For a proof of this fact see https://math.stackexchange.com/questions/552173/if-an-ideal-contains-the-unit-then-it-is-the-whole-ring
On the other hand, J does not contain any constant polynomial, so J is not the whole ring R.
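As a quick cross-check of all of this in Python with sympy (a stand-in for the M2 session, not part of the original answer; note the constants 0.2 and 0.5 are entered as exact rationals, following the advice above to avoid RR):

from sympy import symbols, groebner, reduced, Rational

x1, x2, x3 = symbols('x1 x2 x3')
gens = [x1 - Rational(1, 5), -x1 + Rational(1, 2), x2, -x2 + 1, x3 - 1, -x3 + 1]

# The Groebner basis collapses to [1]: the ideal is the unit ideal, i.e. I = R
G = groebner(gens, x1, x2, x3, order='lex')
print(G.exprs)  # [1]

# reduced() divides f by the basis, returning quotients q and remainder r
# with f = sum(q_i * g_i) + r; here r == 0 since f lies in the unit ideal
f = x3 - x1*x2
q, r = reduced(f, G.exprs, x1, x2, x3, order='lex')
print(q, r)     # quotients w.r.t. the Groebner basis; remainder 0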


Questions about AES irreducible polynomials

For the Galois field GF(2^8), an element has the format a7*x^7 + a6*x^6 + ... + a0.
For AES, the irreducible polynomial is x^8+x^4+x^3+x+1.
Apparently the max power in GF(2^8) is x^7, so why is the max power of the irreducible polynomial x^8?
How does the max power of the irreducible polynomial affect the inverse result in GF?
Can I set the max power of the irreducible polynomial to x^9?
To understand why the modulus of GF(2⁸) must be order 8 (that is, have 8 as its largest exponent), you must know how to perform polynomial division with coefficients in GF(2), which means you must know how to perform polynomial division in general. I will assume you know how to do those things. If you don't know how, there are many tutorials on the web from which you can learn.
Remember that if r = a mod m, it means that there is a q such that a = q m + r. To make a working GF(2⁸) arithmetic, we need to guarantee that r is an element of GF(2⁸) for any a and q (even though a and q do not need to be elements of GF(2⁸)). Furthermore, we need to ensure that r can be any element of GF(2⁸), if we pick the right a from GF(2⁸).
So we must pick a modulus (the m) that makes these guarantees. We do this by picking an m of exactly order 8.
If the numerator of the division (the a in a = q m + r) is order 8 or higher, we can find something to put in the quotient (the q) that, when multiplied by x⁸, cancels out that higher order. But there's nothing we can put in the quotient that can be multiplied by x⁸ to give a term with order less than 8, so the remainder (the r) can be any order up to and including 7.
Let's try a few examples of polynomial division with a modulus (or divisor) of x⁸+x⁴+x³+x+1 to see what I mean. First let's compute x⁸+1 mod x⁸+x⁴+x³+x+1:
1 <- quotient
┌──────────────
x⁸+x⁴+x³+x+1 │ x⁸ +1
-(x⁸+x⁴+x³+x+1)
───────────────
x⁴+x³+x <- remainder
So x⁸+1 mod x⁸+x⁴+x³+x+1 = x⁴+x³+x.
Next let's compute x¹²+x⁹+x⁷+x⁵+x² mod x⁸+x⁴+x³+x+1.
x⁴ +x +1 <- quotient
┌──────────────────────────────
x⁸+x⁴+x³+x+1 │ x¹²+x⁹ +x⁷+x⁵ +x²
-(x¹² +x⁸+x⁷+x⁵+x⁴ )
───────────────────────────
x⁹+x⁸ +x⁴ +x²
-(x⁹ +x⁵+x⁴ +x²+x)
─────────────────────────
x⁸ +x⁵ +x
-(x⁸ +x⁴+x³ +x+1)
────────────────────────
x⁵+x⁴+x³ +1 <- remainder
So x¹²+x⁹+x⁷+x⁵+x² mod x⁸+x⁴+x³+x+1 = x⁵+x⁴+x³+1, which has order < 8.
Finally, let's try a substantially higher order: how about x¹⁰⁰+x⁹⁶+x⁹⁵+x⁹³+x⁸⁸+x⁸⁷+x⁸⁵+x⁸⁴+x mod x⁸+x⁴+x³+x+1?
x⁹² +x⁸⁴ <- quotient
┌────────────────────────────────────────
x⁸+x⁴+x³+x+1 │ x¹⁰⁰+x⁹⁶+x⁹⁵+x⁹³ +x⁸⁸+x⁸⁷+x⁸⁵+x⁸⁴+x
-(x¹⁰⁰+x⁹⁶+x⁹⁵+x⁹³+x⁹² )
─────────────────────────────────────────
x⁹²+x⁸⁸+x⁸⁷+x⁸⁵+x⁸⁴+x
-(x⁹²+x⁸⁸+x⁸⁷+x⁸⁵+x⁸⁴ )
────────────────────────
x <- remainder
So x¹⁰⁰+x⁹⁶+x⁹⁵+x⁹³+x⁸⁸+x⁸⁷+x⁸⁵+x⁸⁴+x mod x⁸+x⁴+x³+x+1 = x. Note that I carefully chose the numerator so that it wouldn't be a long computation. If you want some pain, try doing x¹⁰⁰ mod x⁸+x⁴+x³+x+1 by hand.
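To experiment without doing the long division by hand, here is a small Python sketch (my own illustration, not from the original post): polynomials over GF(2) are encoded as integer bit masks, bit k set meaning the term x^k is present, and subtraction of coefficients is XOR.

def gf2_divmod(a, m):
    """Polynomial division over GF(2): returns (q, r) with a = q*m + r
    and deg(r) < deg(m). Bit k of each int is the coefficient of x^k."""
    dm = m.bit_length() - 1            # degree of the modulus
    q = 0
    while a.bit_length() - 1 >= dm:    # while deg(a) >= deg(m)
        shift = (a.bit_length() - 1) - dm
        q ^= 1 << shift                # record the quotient term x^shift
        a ^= m << shift                # cancel a's leading term (XOR = subtract)
    return q, a

AES_MOD = 0b100011011                  # x^8 + x^4 + x^3 + x + 1

# the three worked examples above:
assert gf2_divmod(0b100000001, AES_MOD) == (1, 0b11010)     # x^8+1        -> x^4+x^3+x
assert gf2_divmod(0b1001010100100, AES_MOD)[1] == 0b111001  # x^12+...+x^2 -> x^5+x^4+x^3+1
a = sum(1 << k for k in (100, 96, 95, 93, 88, 87, 85, 84, 1))
assert gf2_divmod(a, AES_MOD)[1] == 0b10                    # x^100+...+x  -> x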

Yun's algorithm

I would like to try to implement Yun's algorithm for square-free factorization of polynomials. From Wikipedia (f is the polynomial):
a_0 = gcd(f, f'); b_1 = f/a_0; c_1 = f'/a_0; d_1 = c_1 - b_1'; i = 1
repeat
    a_i = gcd(b_i, d_i); b_{i+1} = b_i/a_i; c_{i+1} = d_i/a_i; i = i + 1; d_i = c_i - b_i'
until b_i = 1
However, I'm not sure about the second step. I would like to use it for polynomials with integer coefficients (not necessarily monic or primitive). Is it possible to realize the division b_1 = f/a_0 using just integers?
I found the code for synthetic division:
def extended_synthetic_division(dividend, divisor):
    '''Fast polynomial division by using Extended Synthetic Division.
    Also works with non-monic polynomials.'''
    # dividend and divisor are both polynomials, here simply lists of
    # coefficients, e.g. x^2 + 3x + 5 is represented as [1, 3, 5]
    out = list(dividend)  # copy the dividend
    normalizer = divisor[0]
    for i in range(len(dividend) - (len(divisor) - 1)):
        # for general polynomial division (when polynomials are non-monic),
        # we need to normalize by dividing the coefficient by the divisor's
        # first coefficient
        out[i] /= normalizer
        coef = out[i]
        if coef != 0:  # useless to multiply if coef is 0
            # in synthetic division we always skip the first coefficient of the
            # divisor, because it is only used to normalize the dividend coefficients
            for j in range(1, len(divisor)):
                out[i + j] += -divisor[j] * coef
    # `out` now contains both the quotient and the remainder; the remainder is
    # the last len(divisor)-1 coefficients (its degree is strictly less than the
    # divisor's), so we compute the index where this separation is and return
    # the quotient and remainder
    separator = -(len(divisor) - 1)
    return out[:separator], out[separator:]  # return quotient, remainder
The problem for me is the out[i] /= normalizer step. Would it always work with integer (floor) division for Yun's b_1 = f/a_0? Is it always possible to divide f by gcd(f, f')? Is the out[separator:] (remainder) always going to be zero?
The fact that the division in p/GCD(p, p') will always work (i.e. be "exact", with no remainder in Z) follows from the definition of the GCD. For any polynomials p and q, their GCD(p, q) divides both p and q exactly. That's why it is called the GCD, i.e. the Greatest Common Divisor:
A greatest common divisor of p and q is a polynomial d that divides p and q and such that every common divisor of p and q also divides d.
P.S. it makes more sense to ask such purely mathematical questions at the more specialized https://math.stackexchange.com/
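To make the loop concrete, here is a rough Python sketch of Yun's algorithm on top of sympy (sympy's gcd and the exact-division method Poly.quo are stand-ins for whatever integer-coefficient division routine you end up with; the test polynomial is a made-up example):

from sympy import Poly, gcd, symbols

x = symbols('x')

def yun(f):
    """Square-free factorization of Poly f by Yun's algorithm.
    Returns [f1, f2, ...] with f = f1 * f2^2 * f3^3 * ...
    (up to a constant factor for non-monic input)."""
    fp = f.diff()
    a = gcd(f, fp)
    b, c = f.quo(a), fp.quo(a)   # exact divisions: a divides both f and f'
    factors = []
    while b.degree() > 0:
        d = c - b.diff()
        a = gcd(b, d)            # the next square-free factor
        factors.append(a)
        b, c = b.quo(a), d.quo(a)
    return factors

f = Poly((x + 1)**2 * (x - 2)**3, x)
print(yun(f))  # multiplicity 1: 1, multiplicity 2: x + 1, multiplicity 3: x - 2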

Finding the upper bound of a mathematical function (function analysis)

I am trying to understand Big-O notation through a book I have, and it covers Big-O using functions, although I am a bit confused. The book says that f(n) is O(g(n)) where g(n) is an upper bound of f(n). So I understand that means that g(n) gives the max rate of growth for f(n) at larger values of n,
and that there exists an n_0 beyond which c·g(n) (where c is some constant) and f(n) have the same rate of growth.
But what confuses me is the examples of finding Big-O for mathematical functions.
The book says: find the upper bound for f(n) = n^4 + 100n^2 + 50.
They then state that n^4 + 100n^2 + 50 <= 2n^4 (I'm unsure why the 2n^4),
and then they somehow find n_0 = 11 and c = 2. I understand why the big O is O(n^4), but I am just confused about the rest.
This is all discouraging as I don't understand it, but I feel like this is an important topic that I must understand.
If anyone is curious, the book is Data Structures and Algorithms Made Easy by Narasimha Karumanchi.
Not sure if this post belongs here or on the math board.
Preparations
First, let's state, loosely, the definition of f being in O(g(n)) (note: O(g(n)) is a set of functions, so to be picky, we say that f is in O(...), rather than f(n) being in O(...)).
If a function f(n) is in O(g(n)), then c · g(n) is an upper bound on
f(n), for some constant c such that f(n) is always ≤ c · g(n),
for large enough n (i.e. , n ≥ n0 for some constant n0).
Hence, to show that f(n) is in O(g(n)), we need to find a set of constants (c, n0) that fulfils
f(n) ≤ c · g(n), for all n ≥ n0, (+)
but this set is not unique. I.e., the problem of finding the constants (c, n0) such that (+) holds is degenerate. In fact, if any such pair of constants exists, there will exist an infinite number of different such pairs.
Showing that f ∈ O(n^4)
Now, let's proceed and look at the example that confused you:
Find an upper asymptotic bound for the function
f(n) = n^4 + 100n^2 + 50 (*)
One straightforward approach is to bound the lower-order terms in (*) by the highest-order term.
Hence, we see if we can find a lower bound on n such that the following holds
100n^2 + 50 ≤ n^4, for all n ≥ ???, (i)
We can easily find when equality holds in (i) by solving the equation
m = n^2, m > 0
m^2 - 100m - 50 = 0
(m - 50)^2 - 50^2 - 50 = 0
(m - 50)^2 = 2550
m = 50 ± sqrt(2550) = { m > 0, single root } ≈ 100.5
=> n ≈ { n > 0 } ≈ 10.025
Hence, (i) holds for n ≳ 10.025, but we'd much rather present this bound on n with a neat integer value, hence rounding up to 11:
100n^2 + 50 ≤ n^4, for all n ≥ 11, (ii)
From (ii) it's apparent that the following holds
f(n) = n^4 + 100n^2 + 50 ≤ n^4 + n^4 = 2 · n^4, for all n ≥ 11, (iii)
And this relation is exactly (+) with c = 2, n0 = 11 and g(n) = n^4; hence we've shown that f ∈ O(n^4). Note again, however, that the choice of constants c and n0 is one of convenience and is not unique. Since we've shown that (+) holds for one set of constants (c, n0), it holds for an infinite number of different such choices (e.g., it naturally holds for c = 10 and n0 = 20, and so on).
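As a finite sanity check of (iii) (my own illustration; the inequality itself is proved above for all n ≥ 11), a few lines of Python confirm that the bound fails at n = 10, holds from n0 = 11 on, and that other constant pairs work too:

def f(n):
    return n**4 + 100*n**2 + 50

def g(n):
    return n**4

c, n0 = 2, 11
assert f(n0 - 1) > c * g(n0 - 1)                        # 20050 > 20000: fails at n = 10
assert all(f(n) <= c * g(n) for n in range(n0, 10000))  # holds for all tested n >= 11

# the pair (c, n0) is not unique, e.g. c = 10 with n0 = 20 also works:
assert all(f(n) <= 10 * g(n) for n in range(20, 10000))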

How to compute limiting value

I have to compute the value of this expression
(e1*cos(θ) - e2*sin(θ)) / ((cos(θ))^2 - (sin(θ))^2)
Here e1 and e2 are some complex expressions.
When θ approaches PI/4, the denominator approaches zero. But in that case e1 and e2 also approach the same value, so at PI/4 the value of the expression is E/sqrt(2), where e1 = e2 = E.
I can do special handling for θ = PI/4, but what about values of θ very, very near PI/4? What is the general strategy for computing such expressions?
This is a bit tricky. You need to know more about how e1 and e2 behave.
First, some notation: I'll use the variable a = theta - pi/4, so that we are interested in a -> 0, and write e1 = E + d1, e2 = E + d2. Let F = your expression times sqrt(2).
In terms of these we have
F = ((E+d1)*(cos(a) - sin(a)) - (E+d2)*(cos(a) + sin(a))) / (-sin(2*a))
  = (-(2*E+d1+d2)*sin(a) + (d1-d2)*cos(a)) / (-2*sin(a)*cos(a))
  = (E+(d1+d2)/2)/cos(a) - (d1-d2)/(2*sin(a))
Assuming that d1->0 and d2->0 as a->0 the first term will tend to E.
However the second term could tend to anything, or blow up -- for example if d1 - d2 = sqrt(a).
We need to assume more, for example that d1-d2 has derivative D at a=0.
In this case we will have
F -> E - D/2 as a -> 0
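As a sanity check of this limit, here is a small sympy computation with made-up perturbations d1 = a and d2 = 2*a, so that D = (d1 - d2)' = -1 and we expect F -> E + 1/2:

import sympy as sp

a, theta, E = sp.symbols('a theta E')

# made-up perturbations: d1 - d2 = -a, so D = -1 at a = 0
e1, e2 = E + a, E + 2*a
expr = (e1*sp.cos(theta) - e2*sp.sin(theta)) / (sp.cos(theta)**2 - sp.sin(theta)**2)
F = sp.sqrt(2) * expr.subs(theta, sp.pi/4 + a)

print(sp.limit(F, a, 0))  # E + 1/2, i.e. E - D/2 with D = -1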
To be able to compute F for values of a close to 0 we need to know even more.
One approach is to have code like this:
if ( fabs(a) < small) { F = E-D/2 + C*a; } else { F = // normal code }
So we need to figure out what 'small' and C should be. In part this depends on what (relative) accuracy you require. The most stringent requirement would be that at a = ±small, the difference between the approximation and the normal code should be too small to represent in a double (if that's what you're using). But note that we mustn't make 'small' too small, or there is a danger that, as doubles, the normal code will evaluate to 0/0 at a = ±small. One approach would be to expand the numerator and denominator of F (or just the second term) as power series (to, say, second order), divide each by a, and then divide these series, keeping terms up to second order; the first term of this gives you C above, and the second term will allow you to estimate the error in this approximation, and hence estimate 'small'.

Implementing additional constraints in R's nnls

I am using the R interface to the Lawson-Hanson NNLS implementation of an algorithm for non-negative linear least squares, which minimizes ||A x - b||^2 subject to the constraint that all elements of the vector x are ≥ 0. This works fine, but I would like to add further constraints. Of interest to me are:
Also minimize "energy" of x:
||A x - b||^2 + m*||x||^2
Minimize "energy in the x derivative"
||A x - b||^2 + m ||H x||^2, where H is the sum of identity and a matrix with -1 on the first off-diagonal
Most generally, minimize ||A x - b||^2 + m ||H x - f||^2.
Is there a way to coax nnls to do this by some clever restatement of problems 1.-3. above? The reason I have hope for such a thing is a little throw-away comment in a paper by Whitall et al (sorry for the paywall) that claims that "fortunately, NNLS can be adapted from the original form above to accommodate something in problem 3".
I take it m is a scalar, right? Consider the simple case m = 1; you can generalize to other values of m by letting H* = sqrt(m) H and f* = sqrt(m) f and using the solution method given below.
So now you're trying to minimise ||A x - b||^2 + ||H x - f||^2.
Let A* = [A' | H']' and let b* = [b' | f']' (i.e. stack A on top of H, and b on top of f), and solve the original problem of
non-negative linear least squares on ||A* x - b*||^2 with the constraint that all elements of the vector x are ≥ 0.
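In code, the stacking trick looks like the following sketch (using Python's scipy.optimize.nnls as a stand-in for R's nnls; A, b, H, f and m are made-up example data, and the side of the off-diagonal in H is an assumption):

import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))   # made-up design matrix
b = rng.standard_normal(20)        # made-up observations

n = A.shape[1]
H = np.eye(n) - np.eye(n, k=1)     # identity plus -1 on the first off-diagonal
f = np.zeros(n)                    # target for the penalty term (problem 3)
m = 0.1                            # penalty weight

# ||A* x - b*||^2 = ||A x - b||^2 + m ||H x - f||^2, so one ordinary
# NNLS solve on the stacked system handles the regularized problem:
A_star = np.vstack([A, np.sqrt(m) * H])
b_star = np.concatenate([b, np.sqrt(m) * f])
x, rnorm = nnls(A_star, b_star)
print(x)                           # all entries are >= 0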
