Nth square-free semiprime

I am trying to find the nth (n <= 2000000) square-free semiprime. I have the following code to do so:
int k = 0;
for(int i = 0; i <= 1000; i++)
{
    for(int j = i + 1; j <= 2500; j++)
    {
        semiprimes[k++] = primes[i]*primes[j];
    }
}
sort(semiprimes,semiprimes+k);
primes[] is a list of primes.
My problem is that I get different values for n = 2000000 with different limits on the for loops. Could someone tell me how to calculate these limits correctly?
Thanks in advance.

You want the first n square-free semiprimes; "first" means that you have to generate all of them under a certain value. Your method consists of generating many of those numbers, sorting them, and extracting the first n values.
This can be a good approach but you must have all the numbers generated. Having two different limits in your nested loops is a good way to miss some of them (in your example, you are not calculating primes[1001]*primes[1002] which should be in semiprimes).
To avoid this problem, you have to compute all the semi-prime numbers in a square, say [1,L]*[1,L], where L is your limit for both loops.
To determine L, all you need to do is count.
Let N be the number of semi-prime square-free numbers under primes[L-1]*primes[L-1].
N = (L * L - L) / 2
L*L is the total number of pairwise products, and L of them are squares. The difference has to be divided by two to get the right number (because primes[i]*primes[j] = primes[j]*primes[i]).
You want to pick L such that n <= N. So for n = 2000000:
int L = 2001, k = 0;
for(int i = 0; i < L; i++)
{
    for(int j = i+1; j < L; j++)
    {
        semiprimes[k++] = primes[i]*primes[j];
    }
}
sort(semiprimes,semiprimes+k);
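If you don't want to hard-code L = 2001, the bound from the counting argument above can be computed directly; a minimal sketch (the helper name smallestL is my own):

```cpp
// Smallest L such that N = (L*L - L)/2 >= n, i.e. the first L whose
// triangle of index pairs i < j yields at least n products.
int smallestL(long long n) {
    int L = 1;
    while ((long long)L * (L - 1) / 2 < n) ++L;
    return L;
}
```

For n = 2000000 this returns 2001, matching the hand-picked limit (2001*2000/2 = 2001000 >= 2000000, while 2000*1999/2 = 1999000 falls short).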

I don't believe an approach that works by computing all semiprimes inside a box will work in any reasonable amount of time. Say we graph the factors (p,q) of the first 2 million semiprimes. To make the graph more symmetric, let's plot a point for both (p,q) and (q,p). The graph does not form a nice rectangular region, but instead looks more like the hyperbola y=1/x. This hyperbola stretches out quite far, and iterating over the entire rectangle containing these will be a lot of wasted computation.
You may want to consider first solving the problem "how many semiprimes are there below N?" and then using a binary search. Each query can be done in about sqrt(N) steps (hint: binary search strikes again). You will need a fairly large table of primes, certainly at least up to 1 million, probably more. Although this can be trimmed down by an arbitrarily large constant factor with some precomputation.
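The counting query suggested here can be sketched as follows; this is an illustrative implementation under my own naming (sievePrimes, countSemiprimesBelow), where the primes table must extend to N/2 so that every cofactor q <= N/p is covered:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Plain sieve of Eratosthenes up to 'limit'.
std::vector<int64_t> sievePrimes(int64_t limit) {
    std::vector<bool> composite(limit + 1, false);
    std::vector<int64_t> primes;
    for (int64_t i = 2; i <= limit; ++i) {
        if (!composite[i]) {
            primes.push_back(i);
            for (int64_t j = i * i; j <= limit; j += i) composite[j] = true;
        }
    }
    return primes;
}

// Count square-free semiprimes p*q (p < q, both prime) with p*q <= N.
// For each p below sqrt(N), binary-search for the primes q in (p, N/p].
int64_t countSemiprimesBelow(int64_t N, const std::vector<int64_t>& primes) {
    int64_t count = 0;
    for (size_t i = 0; i < primes.size(); ++i) {
        int64_t p = primes[i];
        if (p * p >= N) break;                    // any q > p pushes p*q past N
        auto hi = std::upper_bound(primes.begin(), primes.end(), N / p);
        int64_t k = (hi - primes.begin()) - (int64_t)(i + 1);
        if (k > 0) count += k;
    }
    return count;
}
```

Wrapping this in an outer binary search on N then finds the smallest N whose count reaches n, which is the nth square-free semiprime.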

Related

Statistical probability of N contiguous true-bits in a sequence of bits?

Let's assume I have an N-bit stream of generated bits (in my case, 64 kilobits).
What's the probability of finding a sequence of X "all true" bits within a stream of N bits, where X = 2 to 16, N = 16 to 1000000, and X < N?
For example:
If N=16 and X=5, what's the likelihood of finding 11111 within a 16-bit number?
Like this pseudo-code:
int N = 1<<16; // 65536 (64 kilobits)
int X = 5;
int Count = 0;
for (int i = 0; i < N; i++) {
    int ThisCount = ContiguousBitsDiscovered(i, X);
    Count += ThisCount;
}
return Count;
That is, if we ran an integer in a loop from 0 to 64K-1... how many times would 11111 appear within those numbers.
Extra rule: 1111110000000000 doesn't count, because it has 6 true values in a row, not 5. So:
1111110000000000 = 0x // because it's 6 contiguous true bits, not 5.
1111100000000000 = 1x
0111110000000000 = 1x
0011111000000000 = 1x
1111101111100000 = 2x
I'm trying to do some work involving physically-based random-number generation, and detecting "how random" the numbers are. That's what this is for.
...
This would be easy to solve if N were less than 32 or so: I could just "run a loop" from 0 to 4G-1, count how many contiguous runs were detected once the loop completed, then store the number and use it later.
Considering that X ranges from 2 to 16, I'd literally only need to store 15 numbers, each less than 32 bits (if N=32)!
BUT in my case N = 65,536, so I'd need to run a loop for 2^65,536 iterations. Basically impossible :)
There's no way to experimentally calculate the values for a given X if N = 65,536, so I need maths, basically.
Fix X and N, obviously with X < N. You have 2^N possible combinations of 0s and 1s in your bit number, and N-X+1 possible positions for a sequence of X 1's (in this part I'm only looking at 1's together) inside it. Consider for example N = 5 and X = 2: a possible valid bit number is 01011, so with the last two characters (the last two 1's) fixed you have 2^2 possible combinations for that sequence of X 1's. Then you have two cases:
Border case: your sequence of X 1's is at the border; then you have (2^(N-X-1))*2 possible combinations.
Inner case: you have (2^(N-X-2))*(N-X-1) possible combinations.
So the probability is (border + inner)/2^N.
Examples:
1) N = 3, X = 2: the probability is 2/2^3.
2) N = 4, X = 2: the probability is 5/16.
A bit brute force, but I'd do something like this to avoid getting mired in statistics theory:
Multiply the probabilities (1 bit = 0.5, 2 bits = 0.5*0.5, etc.) while looping
Keep track of each X and, when you have the product of X bits, flip it and continue
Start with a small example (N = 5, X = 1..5) to make sure you get the edge cases right, and compare to the brute-force approach.
This can probably be expressed as something like Sum(Sum 0.5^x, x = 1 -> 16) for n = 1 -> 65536, but the edge cases need to be taken into account (i.e. 7 bits doesn't fit, discard that probability), which gives me a bit of a headache. :-)
@Andrex's answer is plain wrong, as it counts some combinations several times.
For example, consider the case N=3, X=1. The combination 101 occurs only 1/2^3 of the time, but the border calculation counts it twice: once as the sequence starting with 10 and once as the sequence ending with 01.
His calculation gives a probability of (1+4)/8, whereas there are only 4 unique sequences that contain a run of exactly one 1 (as opposed to cases such as 011):
001
010
100
101
and so the probability is 4/8.
To count the number of unique sequences you need to account for sequences that can appear multiple times; as long as X is smaller than N/2 this can happen. I'm not sure how to count them, though.
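For small N these competing counts are easy to settle by exhaustive enumeration; a brute-force sketch (countExactRuns is my own name) that counts the values containing at least one maximal run of exactly X ones, respecting the question's extra rule:

```cpp
#include <cstdint>

// Count values v in [0, 2^N) containing at least one maximal run of exactly
// X consecutive 1-bits (longer runs don't count, per the question's rule).
long long countExactRuns(int N, int X) {
    long long count = 0;
    for (uint64_t v = 0; v < (1ull << N); ++v) {
        int run = 0;
        bool found = false;
        for (int i = 0; i <= N; ++i) {           // i == N flushes the final run
            if (i < N && ((v >> i) & 1)) ++run;
            else {
                if (run == X) found = true;
                run = 0;
            }
        }
        if (found) ++count;
    }
    return count;
}
```

countExactRuns(3, 1) gives 4, agreeing with the 4/8 probability above, and countExactRuns(4, 2) gives 5 (so 5/16 happens to match for N=4, X=2, even though the border/inner derivation double-counts in general).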

Counting the number of restricted Integer partitions

Original problem:
Let N be a positive integer (actually, N <= 2000) and P the set of all partitions of N whose parts all lie within a window [a, 2a-1] for some a (i.e., the largest part is at most twice the smallest part minus one). Let A be the number of such partitions. Find A.
Input: N. Output: A - the number of such partitions.
What have I tried:
I think this problem can be solved by a dynamic-programming algorithm. Let p(n,a,b) be the function which returns the number of partitions of n using only the numbers a..b. Then we can compute A with code like:
int Ans = 2;                       // the 1+1+...+1=N & N=N partitions
for(int a = 2; a <= N/2; a += 1){  // a - from 2 to N/2
    int b = a*2-1;
    Ans += p[N][a][b];             // add all partitions using a..b to Answer
    if(a < (a-1)*2-1){             // if a < previous b [ (a-1)*2-1 ]
        Ans -= p[N][a][(a-1)*2-1]; // then we counted the partitions
    }                              // using numbers a..prev_b twice
}
Next I tried to find a dynamic-programming algorithm computing p(n,a,b) for any integers a <= b <= n. This paper (.pdf) provides the following algorithm:
, where I(n<=b) = 1 if n<=b and = 0 otherwise.
Question(s):
How should I implement the algorithm from the paper? I'm new to DP problems and, as far as I can see, this problem has 3 dimensions (n, a and b), which is quite tricky for me.
How does that algorithm actually work? I know how the algorithms for computing p(n,0,b) or p(n,a,n) work, but a little explanation of p(n,a,b) would be very helpful.
Does the original problem have a simpler solution? I'm quite sure there's another clean solution, but I haven't found it.
I calculated all of A(1)-A(600) in 23 seconds with a memoization approach (top-down dynamic programming). The 3D table requires 1.7 GB of memory.
For reference: A(50) = 278, A(200) = 465202, A(600) = 38860513616.
N=2000 requires too large a table for a 32-bit environment, and a map-based approach worked too slowly.
I can make a 2D table of reasonable size, but this approach requires zeroing the table at every iteration of the external loop - slow again.
A(1000) = 107292471486730 in 131 sec. And I think that long arithmetic might be needed for larger values to avoid Int64 overflow.
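For the missing p(n,a,b), a standard partition recurrence works: a partition of n into parts a..b either uses no part equal to b, or removes one copy of b and recurses. Below is a sketch of the questioner's loop with a memoized p, workable for small N though far from the N=2000 scale (the names p, A, and the std::map memo are my own):

```cpp
#include <cstdint>
#include <map>
#include <tuple>

// p(n,a,b): number of partitions of n into parts from a..b, memoized.
static std::map<std::tuple<int, int, int>, int64_t> memo;

int64_t p(int n, int a, int b) {
    if (n == 0) return 1;
    if (a > b) return 0;
    auto key = std::make_tuple(n, a, b);
    auto it = memo.find(key);
    if (it != memo.end()) return it->second;
    // Either use no part equal to b, or remove one b and recurse.
    int64_t r = p(n, a, b - 1) + (b <= n ? p(n - b, a, b) : 0);
    memo[key] = r;
    return r;
}

// The questioner's inclusion-exclusion loop over windows [a, 2a-1].
int64_t A(int N) {
    int64_t ans = 2;                         // 1+1+...+1 = N and N = N
    for (int a = 2; a <= N / 2; ++a) {
        ans += p(N, a, 2 * a - 1);
        if (a < (a - 1) * 2 - 1)
            ans -= p(N, a, (a - 1) * 2 - 1); // window overlap counted twice
    }
    return ans;
}
```

A(50) evaluates to 278, matching the reference value quoted above.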

Efficient program to check whether a number can be expressed as sum of two cubes

I am trying to write a program to check whether a number N can be expressed as the sum of two cubes i.e. N = a^3 + b^3
This is my code, which does O(N^(1/3)) work per test case:
#include <iostream>
#include <cmath>
#define ll unsigned long long
using namespace std;
int main()
{
    ios_base::sync_with_stdio(false);
    bool flag = false;
    ll t, N;
    cin >> t;
    while(t--)
    {
        cin >> N;
        flag = false;
        for(ll i = 1; i*i*i <= N/2; i++)  // ll, not int: i*i*i overflows int
        {
            ll r = N - i*i*i;
            ll c = (ll)llroundl(cbrtl((long double)r)); // round, then verify exactly
            if(c*c*c == r) { flag = true; break; }
        }
        if(flag) cout << "Yes\n"; else cout << "No\n";
    }
    return 0;
}
As the time limit is 2 s, this program is giving TLE. Can anyone suggest a faster approach?
I also posted this on StackExchange, so sorry if you consider it a duplicate, but I really don't know whether these are the same or different boards (Exchange and Overflow). My profile appears different here.
==========================
There is a faster algorithm to check whether a given integer is a sum (or difference) of two cubes n = a^3 + b^3.
I don't know if this algorithm is already known (probably yes, but I can't find it in books or on the internet). I discovered it and use it to compute integers up to n < 10^18.
This process uses a single trick
4(a^3+b^3)/(a+b) = (a+b)^2 + 3(a-b)^2
We don't know in advance what "a" and "b" would be, and so also not "(a+b)", but we do know that "(a+b)" must divide (a^3+b^3). So if you have a fast prime-factorization routine, you can quickly compute each of the divisors of (a^3+b^3) and then check whether
(4(a^3+b^3)/divisor - divisor^2)/3 = square
When (and if) a square is found, you have divisor = (a+b) and sqrt(square) = (a-b), so you have a and b.
If no square is found, the number is not a sum of two cubes.
We know divisor <= (4(a^3+b^3))^(1/3), and this limit speeds up the task: when assembling the divisors of (a^3+b^3), you can immediately discard those greater than the limit.
Now some comparisons with other algorithms: for n = 10^18, using brute force you would have to test all numbers below 10^6 to know the answer. On the other hand, to build all the divisors of a number near 10^18 you need its primes up to 10^9.
The largest number of distinct primes whose product fits into 10^9 is 9 (2*3*5*7*11*13*17*19*23 is about 2.2*10^8, and including 29 pushes the product past 6*10^9), so we have at most 2^9 - 1 different combinations of primes (which assemble the divisors) to check in the worst case, many of them discarded because of the limit.
To compute prime factors I use a table of the first 60,000,000 primes, which works very well in this range.
Miguel Velilla
To find all pairs of integers x and y whose cubes sum to n: set x to the largest integer not exceeding the cube root of n, and set y to 0. Then repeatedly: add 1 to y if the sum of the cubes is less than n, subtract 1 from x if the sum of the cubes is greater than n, and output the pair otherwise; stop when x and y cross. If you only want to know whether or not such a pair exists, you can stop as soon as you find one.
Let us know if you have trouble coding this algorithm.
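A sketch of that pointer walk (the function name is mine; start y at 1 instead of 0 if zero cubes shouldn't count):

```cpp
#include <cmath>
#include <cstdint>

// Two-pointer check: x starts at the integer cube root of n, y at 0; the
// pointers move toward each other until the sum of cubes hits n or they cross.
bool isSumOfTwoCubes(int64_t n) {
    int64_t x = (int64_t)cbrtl((long double)n);
    while ((x + 1) * (x + 1) * (x + 1) <= n) ++x;  // guard against cbrt rounding
    int64_t y = 0;
    while (y <= x) {
        int64_t s = x * x * x + y * y * y;
        if (s == n) return true;
        if (s < n) ++y;
        else --x;
    }
    return false;
}
```

Each step moves one pointer, so the whole scan takes O(n^(1/3)) iterations rather than the O(n^(1/3)) *per candidate* floating-point checks of the original loop.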

Decompose integer into two bytes

I'm working on an embedded project where I have to write a time-out value into two byte registers of some micro-chip.
The time-out is defined as:
timeout = REG_a * (REG_b +1)
I want to program these registers using an integer in the range of 256 to, let's say, 60000. I am looking for an algorithm which, given a timeout value, calculates REG_a and REG_b.
If an exact solution is impossible, I'd like to get the next larger possible timeout value.
What have I done so far:
My current solution calculates:
temp = integer_square_root (timeout) +1;
REG_a = temp;
REG_b = temp-1;
This results in values that work well in practice. However I'd like to see if you guys could come up with a more optimal solution.
Oh, and I am memory constrained, so large tables are out of question. Also the running time is important, so I can't simply brute-force the solution.
You could use the code from the answer Algorithm to find the factors of a given Number.. Shortest Method? to find a factor of timeout.
n = timeout
initial_n = n
num_factors = 1;
for (i = 2; i * i <= initial_n; ++i) // for each number i up to the square root of the given number
{
    power = 0;          // suppose the power i appears at is 0
    while (n % i == 0)  // while we can divide n by i
    {
        n = n / i   // divide it, thus ensuring we'll only check prime factors
        ++power     // increase the power i appears at
    }
    num_factors = num_factors * (power + 1) // apply the formula
}
if (n > 1) // will happen for example for 14 = 2 * 7
{
    num_factors = num_factors * 2 // n is prime, and its power can only be 1, so multiply the number of factors by 2
}
REG_A = num_factors
The first factor will be your REG_A, so then you need to find another value which, multiplied by it, equals timeout.
for (i = 2; i * num_factors != timeout; i++);
REG_B = i - 1
Interesting problem, Nils!
Suppose you start by fixing one of the values, say Reg_a, then compute Reg_b by division with roundup: Reg_b = ((timeout + Reg_a-1) / Reg_a) -1.
Then you know you're close, but how close? Well the upper bound on the error would be Reg_a, right? Because the error is the remainder of the division.
If you make one of factors as small as possible, then compute the other factor, you'd be making that upper bound on the error as small as possible.
On the other hand, by making the two factors close to the square root, you're making the divisor as large as possible, and therefore making the error as large as possible!
So:
First, what is the minimum value for Reg_a? (timeout + 255) / 256;
Then compute Reg_b as above.
This won't be the absolute minimum combination in all cases, but it should be better than using the square root, and faster, too.
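Putting the two steps together as a sketch (splitTimeout is my name; this assumes 8-bit registers and a timeout in [256, 60000] as in the question, with the hardware computing REG_a * (REG_b + 1)):

```cpp
#include <cstdint>

// Pick the smallest REG_a such that REG_b + 1 still fits in 256, then round
// REG_b up so that REG_a * (REG_b + 1) >= timeout with error < REG_a.
void splitTimeout(uint32_t timeout, uint8_t& reg_a, uint8_t& reg_b) {
    uint32_t a = (timeout + 255) / 256;   // minimum REG_a
    uint32_t b = (timeout + a - 1) / a;   // ceil(timeout / a) = REG_b + 1
    reg_a = (uint8_t)a;
    reg_b = (uint8_t)(b - 1);
}
```

For timeout = 60000 this yields REG_a = 235, REG_b = 255 (235 * 256 = 60160), and the overshoot is always below REG_a.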

Efficiently calculating the total number of divisors of integers in a range

Given the range [1, 2 Million], for each number in this range I need to generate
and store the number of the divisors of each integer in an array.
So if x=p1^(a1)*p2^a2*p3^a3, where p1, p2, p3 are primes,
the total number of divisors of x is given by (p1+1)(p2+1)(p3+1). I generated all
the primes below 2000 and for each integer in the range, I did trial division
to get the power of each prime factor and then used the formula above to calculate
the number of divisors and stored in an array.
But doing this is quite slow and takes around 5 seconds to generate the number of divisors for all the numbers in the given range.
Can we do this sum in some other efficient way, may be without factorizing each
of the numbers?
Below is the code that I use now.
typedef unsigned long long ull;
void countDivisors(){
    ull PF_idx=0, PF=0, ans=1, N=0, power;
    for(ull i=2; i<MAX; ++i){
        if (i<SIEVE_SIZE and isPrime[i]) factors[i]=2;
        else{
            PF_idx=0;
            PF=primes[PF_idx];
            ans=1;
            N=i;
            while(N!=1 and (PF*PF<=N)){
                power = 0;
                while(N%PF==0){ N/=PF; ++power; }
                ans*=(power+1);
                PF = primes[++PF_idx];
            }
            if (N!=1) ans*=2;
            factors[i] = ans;
        }
    }
}
First of all your formula is wrong. According to your formula, the sum of the divisors of 12 should be 12. In fact it is 28. The correct formula is (p1^(a1+1) - 1) * (p2^(a2+1) - 1) * ... * (pk^(ak+1) - 1) / ( (p1 - 1) * (p2 - 1) * ... * (pk - 1) ).
That said, the easiest approach is probably just to do a sieve. One can get clever with offsets, but for simplicity just make an array of 2,000,001 integers, from 0 to 2 million. Initialize it to 0s. Then:
for (ull i = 1; i < MAX; ++i) {
    for (ull j = i; j < MAX; j += i) {
        factors[j] += i;
    }
}
This may feel inefficient, but it is not that bad. The total work taken for the numbers up to N is N + N/2 + N/3 + ... + N/N = O(N log(N)) which is orders of magnitude less than trial division. And the operations are all addition and comparison, which are fast for integers.
If you want to proceed with your original idea and formula, you can make that more efficient by using a modified sieve of Eratosthenes to create an array from 1 to 2 million listing a prime factor of each number. Building that array is fairly fast, and you can take any number and factorize it much, much more quickly than you could with trial division.
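That modified sieve can be sketched like this (buildSpf / numDivisors are my own names): spf[x] stores a prime factor of x, after which each number factorizes in O(log x) steps and the (a1+1)(a2+1)... divisor-count formula applies directly:

```cpp
#include <vector>

// spf[x] = smallest prime factor of x, for 2 <= x <= maxN.
std::vector<int> buildSpf(int maxN) {
    std::vector<int> spf(maxN + 1, 0);
    for (int i = 2; i <= maxN; ++i) {
        if (spf[i] == 0) {                   // i is prime
            for (int j = i; j <= maxN; j += i)
                if (spf[j] == 0) spf[j] = i;
        }
    }
    return spf;
}

// Number of divisors of x via its prime factorization.
int numDivisors(int x, const std::vector<int>& spf) {
    int result = 1;
    while (x > 1) {
        int p = spf[x], power = 0;
        while (x % p == 0) { x /= p; ++power; }
        result *= power + 1;
    }
    return result;
}
```

For example, numDivisors(5040, spf) gives 60, since 5040 = 2^4 * 3^2 * 5 * 7 and 5 * 3 * 2 * 2 = 60.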
