Algorithm for a Pow2() function for fractions?

Does anyone know an algorithm for a Pow2() function for fractions?
For a function of the following form, where BigRational stands for any rational number type, with arbitrarily large integers for numerator and denominator:
BigRational Pow2(BigRational exponent, int maxdigits)
For the inverse function, BigRational Log2(BigRational exponent, int maxdigits), I have already found a very nice algorithm that uses several identities, converges quickly, and is many times faster than corresponding Log (Ln) or Log10 functions.
Of course, Pow2(x) can be computed as Exp(x * Log(2)), but the point is to avoid Exp, since that function is relatively slow in arbitrary-precision arithmetic.
Currently I am working on a library for arbitrary-precision arithmetic based on some new foundations:
https://github.com/c-ohle/RationalNumerics
An efficient algorithm for such a Pow2 function that is more powerful than Exp (based on Taylor series) could improve the performance of many other functions and algorithms for arbitrary-precision arithmetic.
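For illustration, here is a minimal sketch (in Python, not from the library above) of one possible approach: split the exponent into its integer and fractional parts, handle the integer part as an exact power of two, and extract the q-th root of 2^p for the fractional part p/q with a Newton iteration. The function name, the tolerance, and the limit_denominator truncation are illustrative choices, and the approach is only practical for moderate denominators q; doing better than this is exactly what the question asks for.

from fractions import Fraction

def pow2(exponent, maxdigits):
    # 2**exponent as a Fraction, accurate to roughly maxdigits decimal digits.
    # Split exponent = n + f with integer n and 0 <= f < 1, so 2^x = 2^n * 2^f.
    n = exponent.numerator // exponent.denominator
    f = exponent - n
    result = Fraction(2) ** n              # exact power of two (or 1/2^|n| for n < 0)
    if f:
        p, q = f.numerator, f.denominator  # 2^f = (2^p)^(1/q) with 0 < p < q
        target = Fraction(2) ** p
        tol = Fraction(1, 10 ** (maxdigits + 2))
        r = Fraction(2)                    # the root lies in (1, 2); start above it
        while True:
            r_new = ((q - 1) * r + target / r ** (q - 1)) / q   # Newton step for r^q = target
            r_new = r_new.limit_denominator(10 ** (maxdigits + 4))  # keep the fraction small
            converged = abs(r_new - r) < tol
            r = r_new
            if converged:
                break
        result *= r
    return result.limit_denominator(10 ** maxdigits)

# e.g. pow2(Fraction(5, 3), 20) approximates 2**(5/3) ~= 3.1748...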

Related

How many arithmetic operations should it take to calculate trig functions?

I'm trying to assess the expected performance of calculating trigonometry functions as a function of the required precision. Obviously the wall-clock time depends on the speed of the underlying arithmetic, so let's factor that out by just counting the number of operations:
Using state-of-the-art algorithms, how many arithmetic operations (add, subtract, multiply, divide) should it take to calculate sin(x), as a function of the number of bits (or decimal digits) of precision required in the output?
... to assess the expected performance of calculating trigonometry functions as a function of the required precision.
Look at the first omitted term of the Taylor series for sine at x = π/4 as the order of the error.
Details: sin(x) usually has these phases:
Handling special cases: NaN, infinities.
Argument reduction to a primary range, say [-π/4 ... +π/4]. Really good reduction is hard because π is irrational, and the reduction code can account for about 50% of sin()'s time, much of it spent emulating the extended precision needed. (See K.C. Ng's "ARGUMENT REDUCTION FOR HUGE ARGUMENTS: Good to the Last Bit".)
Low-quality reduction needs much less: a divide, a truncate, a subtract, a multiply.
Calculation over a limited range. This is the part many people only consider. If done with a Taylor series and 53 bits are needed, about 10-11 terms are required. Yet quality code often uses a pair of crafted polynomials, each of about 4-5 terms, to form the quotient p(x)/q(x). (A small sketch of this phase follows this list.)
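As a small illustration of that last phase (my own sketch, not taken from any particular library), here is a Horner evaluation of a truncated Taylor polynomial for sin(x) on the reduced range; production code would substitute crafted minimax coefficients for the plain Taylor ones used here.

import math

# Taylor coefficients 1/1!, -1/3!, 1/5!, ... up to the x^15 term
COEFFS = [(-1) ** n / math.factorial(2 * n + 1) for n in range(8)]

def sin_reduced(x):
    # sin(x) for |x| <= pi/4, via Horner's rule in x*x
    x2 = x * x
    acc = 0.0
    for c in reversed(COEFFS):
        acc = acc * x2 + c
    return x * acc

# e.g. sin_reduced(math.pi / 6) is close to 0.5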
Of course dedicated hardware support in any of these steps greatly increases performance.
Note: code for sin() is often paired with cos() code as extensive use of trig identities simplify the calculation.
I'd expect a software solution for sin() to cost on the order of 25x the cost of a common multiply. This is a rough estimate.
To achieve a very low error in ULPs, code typically uses a tad more; a quick-and-dirty sine_crap() could get by with only a few terms. So when assessing time performance, there is a trade-off with correctness: how good a sin() do you want?
assess the expected performance of calculating trigonometry functions as a function of the required precision
Using the Taylor series as a predictor of the number of ops, with worst case x = π/4 (45°) and the error on the order of the first omitted term of the series:
For 32-bit float, on the order of 6 floating-point ops are needed.
For 64-bit double, on the order of 9 floating-point ops are needed.
So if time scales with the square of the FP width, double is predicted to take 9/6 * 2 * 2, or about 6 times as long.
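As a rough, illustrative check of those figures (my own sketch, not from the answer), one can count Taylor terms at x = π/4 until the first omitted term falls below the target precision:

import math

def terms_needed(bits):
    # count Taylor terms of sin(x) at x = pi/4 until the first omitted
    # term drops below 2**-bits
    x = math.pi / 4
    term, n, count = x, 1, 0
    while abs(term) > 2.0 ** -bits:
        count += 1
        n += 2
        term *= x * x / ((n - 1) * n)   # next odd-order term x^n / n!
    return count

print(terms_needed(24), terms_needed(53))   # term counts for single and double precision targets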
We can calculate any trigonometric function using a simple right-angled triangle or using the Maclaurin/Taylor series, so it really depends on which one you choose to implement. If you only pass an angle as an argument and wish to calculate the sin of that particular angle, it would take about 4 to 6 steps to calculate it using a unit circle.

Calculating pi without arbitrary precision and only basic arithmetic

I want to calculate pi. But I have quite a few limits: variables can only hold up to 5 decimal places, and I only have the following operators:
Addition
Subtraction
Multiplication
Division
Exponents
Square roots
Sin
Cos
Basic Loops, Conditionals, and relational operators.
The BBP algorithm seems useless here, because even though it would not need arbitrary precision, I cannot convert between bases. I'm not aware of any other formulas that can find the nth digit of pi in base 10.
Would it even be possible to calculate pi using these constraints?
BBP can be modified to give π in base 10. There's a Java implementation on GitHub. (I believe that the screenshot of the algorithm description is taken from Pi - Unleashed by Arndt/Haenel.)
You'll need the modulo operation and a means to calculate the closest integer to the logarithm of a number, but you can build both from the operations you have plus loops, as sketched below.
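As an illustrative sketch of that last point (names are mine, not from the answer), floor division and modulo can be built from subtraction, comparison, and a loop alone:

def floor_div_mod(a, b):
    # for a >= 0 and b > 0: return (q, r) with a = q*b + r and 0 <= r < b,
    # using only subtraction, comparison and a loop
    q = 0
    while a >= b:
        a = a - b
        q = q + 1
    return q, a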

What is the inverse of O(log n) time?

I'm doing some math, and today I learned that the inverse of a^n is log_a(n). I'm wondering if this applies to complexity. Is the inverse of superpolynomial time logarithmic time, and vice versa?
I would think that the inverse of logarithmic time would be O(n^2) time.
Can you characterize the inverses of the common time complexities?
Cheers!
First, you have to define what you mean by inverse here. If you mean an inverse under composition of functions, with the linear function f(x) = x as the identity, then the inverse of f(x) = log x would be f(x) = 10^x. However, one could also define a multiplicative inverse, where the constant function f(x) = 1 is the identity; then the inverse of f(x) = x would be f(x) = 1/x. While this is a bit complicated, it isn't that different from asking "What is the inverse of 2?" without stating an operation: the additive inverse would be -2 while the multiplicative inverse would be 1/2, so there are different answers depending on which operation you want to use.
In composing functions, the key question becomes what the desired end result is: O(n) or O(1)? The latter may be much more challenging, as I'm not sure that composing O(log n) with O(1) gives you a constant in the end, or that it negates the initial count. For example, consider a binary search with O(log n) time complexity together with a basic print statement with O(1) time complexity: put these together and you'd still get O(log n), as there would still be log n calls within the composed function, printing a number each time through the search.
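A toy sketch of that binary-search-plus-print example (illustrative code, not from the post): the print is O(1) work per probe, so the composition still does O(log n) work overall.

def binary_search_with_print(arr, target):
    # O(log n) probes; the print is O(1) work per probe
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        print(arr[mid])
        if arr[mid] == target:
            return mid
        if arr[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1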
If you take two different complexity functions and put one inside the other, the overall complexity is likely to be the product of the two. Consider a double for loop where each loop is O(n): the overall complexity is O(n) x O(n) = O(n^2). That means finding something that cancels out the log n would be challenging, as you'd have to find something with O(1/log n) complexity, which I'm not sure exists in reality.

How to numerically compute nonlinear polynomials efficiently and accurately?

(I'm not sure whether I should post this problem on this site or on the math site. Please feel free to migrate this post if necessary.)
My problem at hand is that, given a value of k, I'd like to numerically compute a rational function of nonlinear polynomials in k of the following form:
f(k) = (a_0 + a_1 e^{i u_1 k} k + a_2 e^{i u_2 k} k^2 + ... + a_N e^{i u_N k} k^N) / (b_0 + b_1 e^{i v_1 k} k + b_2 e^{i v_2 k} k^2 + ... + b_N e^{i v_N k} k^N)
where {a_0, ..., a_N; b_0, ..., b_N} are complex constants, {u_0, ..., u_N; v_0, ..., v_N} are real constants, and i is the imaginary unit. I learned from Numerical Recipes that there are a whole bunch of ways to compute polynomials quickly while keeping the rounding error small enough, if all coefficients are constant. But I do not think those ideas are useful in my case, since the exponential prefactors also depend on k.
Currently I calculate it in a brute force way in C with complex.h (this is just a pseudo code):
#include <complex.h>

double complex function(double k)
{
    /* a_n, b_n: complex constants; u_n, v_n: real constants */
    return (a_0 + a_1*cexp(I*u_1*k)*k + a_2*cexp(I*u_2*k)*k*k + ...)
         / (b_0 + b_1*cexp(I*v_1*k)*k + b_2*cexp(I*v_2*k)*k*k + ...);
}
However, when the number of calls to function increases (because this is just a part of my real calculation), it is very slow and inaccurate (only 6 valid digits). I appreciate any comments and/or suggestions.
I trust that this isn't a homework assignment!
Normally the trick (Horner's method) is to use a loop: add the next coefficient to the running sum, then multiply by k. However, in your case, I think the "e" term in the coefficient is going to overwhelm any savings from factoring out k. You can still do it, but the savings will probably be small.
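For illustration, here is roughly what that loop looks like in Python (the coefficient arrays a, u, b, v are placeholders standing in for the constants in the question, and double precision is assumed): each exponential prefactor is computed once per term and the powers of k are folded in Horner style.

import cmath

def rational(k, a, u, b, v):
    # evaluate (sum a_m e^{i u_m k} k^m) / (sum b_m e^{i v_m k} k^m)
    # in Horner form, accumulating from the highest-order term downward;
    # with u[0] = v[0] = 0 the m = 0 terms reduce to the plain constants a_0 and b_0
    num = 0j
    den = 0j
    for m in reversed(range(len(a))):
        num = num * k + a[m] * cmath.exp(1j * u[m] * k)
        den = den * k + b[m] * cmath.exp(1j * v[m] * k)
    return num / den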
Is u_i a constant? Depending on how many times you need to run this formula, maybe you could premultiply u_i * k (unless k changes each run). It's been so many decades since I took a Numerical Analysis course that I have only vague recollections of the tricks of the trade. Let's see... is e^(i*u_i*k) the same as (e^(i*u_i))^k? I don't remember the rules on imaginary numbers, or whether you'll save anything since you've got a real^real (assuming k is real) anyway (internally done using e^power).
If you're getting only 6 digits, that suggests that your math, and maybe your library, is working in single precision (32 bit) reals. Check your library and check your declarations that you are using at least double precision (64 bit) reals everywhere.

Multiplication using FFT in integer rings

I need to multiply long integers with an arbitrary BASE for the digits, using an FFT over integer rings. Operands are always of length n = 2^k for some k, and the convolution vector has 2n components, so I need a primitive 2n-th root of unity.
I'm not particularly concerned with efficiency issues, so I don't want to use the Schönhage-Strassen algorithm - just the basic convolution, then some carries, and nothing else.
Even though it seems simple to many mathematicians, my understanding of algebra is really bad, so I have lots of questions:
What are the essential differences or nuances between performing the FFT in integer rings modulo 2^n + 1 (perhaps composite) and in integer FIELDS modulo some prime p?
I ask this because 2 is a primitive (2n)-th root of unity in such a ring, since 2^n == -1 (mod 2^n+1). In contrast, in an integer field I would have to search for such a primitive root.
But maybe there are other nuances which will prevent me from using rings of such a form for the FFT.
If I pick integer rings, what are sufficient conditions for the existence of a 2^n-th root of unity in such a ring?
All other 2^k-th roots of unity of smaller order could be obtained by squaring this root, right?..
What essential restrictions does the modulus of the ring impose on the multiplication? Maybe on the operands' length, maybe on the numeric base, maybe even on the numeric types used for multiplication.
I suspect that there may be some loss of information if the coefficients of the convolution are reduced by the modulo operation. Is it true and why?.. What are general conditions that will allow me to avoid this?
Is there any possibility that just primitive-typed dynamic lists (i.e. long) will suffice for FFT vectors, their product and the convolution vector? Or should I transform the coefficients to BigInteger just in case (and what is the "case" when I really should)?
If a general answer to these questions takes too long, I would be perfectly satisfied with an answer under the following conditions. I've found a table of primitive roots of unity of order up to 2^30 in the field Z_70383776563201:
http://people.cis.ksu.edu/~rhowell/calculator/roots.html
So if I use a 2^30-th root of unity to multiply numbers of length 2^29, what precision/algorithmic/efficiency nuances should I consider?
Thank you so much in advance!
I am going to award a bounty to the best answer - please consider helping out with some examples.
First, an arithmetic clue about your modulus: 70383776563201 = 1 + 65550 * 2^30, and that long number is prime. There's a lot of insight into your modulus on the page "How the FFT constants were found".
Here's a fact of group theory you should know. The multiplicative group of integers modulo N is a product of cyclic groups whose orders are determined by the prime factorization of N. When N is prime, there is a single cyclic group. The orders of the elements in such a cyclic group, however, are related to the prime factors of N - 1: here 70383776563201 - 1 = 2^31 * 3 * 5^2 * 19 * 23, and the divisors of this number give the possible orders of elements.
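That arithmetic is easy to double-check; for example, with Python's sympy package (an illustrative choice, not something the answer uses):

from sympy import isprime, factorint

p = 70383776563201
print(p == 1 + 65550 * 2 ** 30)   # True
print(isprime(p))                  # True
print(factorint(p - 1))            # {2: 31, 3: 1, 5: 2, 19: 1, 23: 1}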
(1) You don't necessarily need a primitive root; you need an element whose order is large enough. There are probabilistic algorithms for finding elements of "high" order; they're used in cryptography to ensure you have strong parameters for keying material. Numbers of the form 2^n + 1 specifically have received a lot of factoring attention, and you can look up the results.
(2) The sufficient (and necessary) condition for an element of order 2^n is illustrated by the example modulus: some prime factor p of the modulus must have the property that 2^n | p - 1.
(3) Loss of information only happens when elements aren't multiplicatively invertible, which isn't the case in the cyclic multiplicative group of a prime modulus. If you work in a modular ring with a composite modulus, some elements are not invertible.
(4) If you want to use arrays of long, you'll be essentially rewriting your big-integer library.
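To make points (1)-(3) concrete, here is a small self-contained sketch of multiplication via a number-theoretic transform over a prime field. It uses the well-known NTT prime 998244353 = 119 * 2^23 + 1 with primitive root 3 rather than the 70383776563201 modulus from the question, and base-10 digits; the names, the base, and the prime are illustrative choices, not the asker's setup.

P, G = 998244353, 3   # prime 119 * 2^23 + 1 and a primitive root

def ntt(a, invert=False):
    # iterative radix-2 number-theoretic transform; len(a) must be a power of two (<= 2^23)
    n = len(a)
    a = a[:]
    j = 0
    for i in range(1, n):            # bit-reversal permutation
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    length = 2
    while length <= n:
        wlen = pow(G, (P - 1) // length, P)
        if invert:
            wlen = pow(wlen, P - 2, P)
        for start in range(0, n, length):
            w, half = 1, length // 2
            for k in range(start, start + half):
                u, v = a[k], a[k + half] * w % P
                a[k], a[k + half] = (u + v) % P, (u - v) % P
                w = w * wlen % P
        length <<= 1
    if invert:
        n_inv = pow(n, P - 2, P)
        a = [x * n_inv % P for x in a]
    return a

def multiply(x, y):
    # multiply non-negative integers via NTT convolution of their base-10 digits
    dx = [int(d) for d in str(x)[::-1]]
    dy = [int(d) for d in str(y)[::-1]]
    n = 1
    while n < len(dx) + len(dy):
        n <<= 1
    fa = ntt(dx + [0] * (n - len(dx)))
    fb = ntt(dy + [0] * (n - len(dy)))
    conv = ntt([s * t % P for s, t in zip(fa, fb)], invert=True)
    carry, digits = 0, []
    for c in conv:                   # carry propagation back to base 10
        carry += c
        digits.append(carry % 10)
        carry //= 10
    while carry:
        digits.append(carry % 10)
        carry //= 10
    while len(digits) > 1 and digits[-1] == 0:
        digits.pop()
    return int("".join(map(str, digits[::-1])))

print(multiply(123456789, 987654321) == 123456789 * 987654321)   # True

The same structure carries over to a larger modulus such as Z_70383776563201 and a bigger digit base; the usual requirement, echoing point (3), is that every exact convolution coefficient stays below the modulus, so nothing is lost in the reduction.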
Suppose we need to multiply two n-bit integers, where n = 2^30, m = 2n, and p = 2^n + 1. Take w = 2 and x = [w^0, w^1, ..., w^{m-1}] (mod p). The issue is that each x[i] is itself huge, so we cannot compute w * a_i in O(1) time.
