recover index in triangular for loops - math

Is there a simple way to recover an index in nested for loops? For example, in for loops which construct Pascal's triangle:
int index = 0;
for (int i = 0; i < N; ++i)
    for (int j = 0; j < N-i; ++j)
        index++;
Is there a way to recover i and j given only index?

I am adding this as a second answer since it is in a different language (now C) and takes a more direct approach. I am keeping the original answer since the following code is almost inexplicable without it. I combined my two functions into a single one to cut down on function call overhead. Also, to be 100% sure that it answers the original question, I used the loops from that question verbatim. In the driver function I show explicitly that the output is correct for N = 4 and then stress-test it for N = 10000. The inner loop then runs N(N+1)/2 = 50,005,000 times; the final message prints N*N, which overstates the number of cases tested. I don't have any formal timing code, but it takes about 1 second on my machine to run through and test all of those cases. My code assumes a 32-bit int. Change to long if needed:
#include <stdio.h>
#include <math.h>
void from_index(int n, int index, int *i, int *j);
int main(void){
    int N;
    int ri,rj; //recovered i,j
    N = 4;
    int index = 0;
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N-i; ++j){
            from_index(N,index,&ri,&rj);
            printf("i = %d, j = %d, index = %d, ",i,j,index);
            printf("recovered i = %d, recovered j = %d\n",ri,rj);
            index++;
        }
    //stress test:
    N = 10000;
    index = 0;
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N-i; ++j){
            from_index(N,index,&ri,&rj);
            if(i != ri || j != rj){
                printf("Don't post buggy code to Stack Overflow!\n");
                printf("(i,j) = (%d,%d) but recovered indices are (%d,%d)\n",i,j,ri,rj);
                return 0;
            }
            index++;
        }
    printf("\nAll %d tests passed!\n",N*N);
    return 0;
}
void from_index(int n, int index, int *i, int *j){
    double d;
    d = 4*n*(n+1) - 7 - 8 * index;
    *i = floor((-1 + sqrt(d))/2);
    *j = *i * (*i + 1)/2;
    *j = n*(n+1)/2 - 1 - index - *j;
    *j = *i - *j;
    *i = n - *i - 1;
}
Output:
i = 0, j = 0, index = 0, recovered i = 0, recovered j = 0
i = 0, j = 1, index = 1, recovered i = 0, recovered j = 1
i = 0, j = 2, index = 2, recovered i = 0, recovered j = 2
i = 0, j = 3, index = 3, recovered i = 0, recovered j = 3
i = 1, j = 0, index = 4, recovered i = 1, recovered j = 0
i = 1, j = 1, index = 5, recovered i = 1, recovered j = 1
i = 1, j = 2, index = 6, recovered i = 1, recovered j = 2
i = 2, j = 0, index = 7, recovered i = 2, recovered j = 0
i = 2, j = 1, index = 8, recovered i = 2, recovered j = 1
i = 3, j = 0, index = 9, recovered i = 3, recovered j = 0
All 100000000 tests passed!

In this particular case we have (counting index from 1, i.e. the question's index plus one, and with j counted from 0 as in the loops)
index = N+(N-1)+...+(N-i+1) + (j+1) = i(2N-i+1)/2 + (j+1) = -i^2/2 + (2N+1)i/2 + (j+1)
with j+1 in the interval [1,N-i].
We neglect j and regard this as a quadratic equation in i. Thus we solve
-i^2/2 + (2N+1)i/2 + (1-index) = 0.
We approximate i by the smaller of the two resulting solutions (the larger one exceeds N) and take the floor of this value, since neglecting j has the effect of raising the estimate of i.
We then come back to the complete version of the equation and substitute the approximation of the value of i. If j+1 is outside the interval [1,N-i] we increase/decrease the value of i and re-substitute until we get a value in this interval. This loop repeats at most a small constant number of times: in exact arithmetic the floor already gives the right i, so the search only has to absorb rounding error in the square root. So this should be doable in a constant number of steps.
As an alternative, we could approximate j to be N/3, instead of zero. This is approximately the expected value of j (over all possible cases), thus the method will probably converge 'faster' at the local search step.
In the general case, you do something very similar, i.e. you solve a fake equation and you perform a local search around the solution.
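The "solve a quadratic, then search locally" idea can be sketched in Python as follows. This is only an illustration under stated assumptions: it uses the question's 0-based index (so the row starting at (i, 0) has index i(2N-i+1)/2), and the names row_start and from_index are made up for this sketch rather than taken from any of the answers.

```python
from math import sqrt

def row_start(n, i):
    # Index of (i, 0): rows 0..i-1 hold n, n-1, ..., n-i+1 entries.
    return i * (2 * n - i + 1) // 2

def from_index(n, index):
    # Smaller root of i^2 - (2n+1)i + 2*index = 0, rounded down.
    m = 2 * n + 1
    i = int((m - sqrt(m * m - 8 * index)) / 2)
    # Local search: absorbs any floating-point error in the estimate.
    while i > 0 and row_start(n, i) > index:
        i -= 1
    while i + 1 < n and row_start(n, i + 1) <= index:
        i += 1
    return i, index - row_start(n, i)

# Check against the loops from the question for N = 4.
index = 0
for i in range(4):
    for j in range(4 - i):
        assert from_index(4, index) == (i, j)
        index += 1
```

The two while loops are the local search step; with an accurate square root they do not execute at all.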

I found it easier to find i,j from the index in the following number pattern:
0
1 2
3 4 5
6 7 8 9
This is easier because the indices going down the left edge are the triangular numbers k*(k+1)/2. By solving an appropriate quadratic equation I was able to recover the row and the column from the index. But your loops give something like this:
0 1 2 3
4 5 6
7 8
9
which is trickier. It might be possible to solve this problem directly, but note that if you subtract each of these numbers from 9 you get
9 8 7 6
5 4 3
2 1
0
This is my original triangle turned upside down and reflected horizontally. Thus I can reduce the problem of your triangle to my triangle. The following Python code shows how it works (the only thing not quite obvious is that in Python 3 // is integer division). The function fromIndexHelper is my solution to my original triangle problem and fromIndex is how I shift it to your triangle. To test it I first printed the index pattern for n = 4 and then the corresponding indices recovered by my function fromIndex:
from math import floor, sqrt
def fromIndexHelper(n,index):
    i = floor((-1+sqrt(1+8*index))/2)
    j = index - i*(i+1)//2
    return i,j
def fromIndex(n,index):
    shift = n*(n+1)//2 - 1
    i,j = fromIndexHelper(n,shift-index)
    return n-i-1,i - j
#test
index = 0
for i in range(4):
    for j in range(4-i):
        print(index,end = ' ')
        index +=1
    print('')
print(' ')
index = 0
for i in range(4):
    for j in range(4-i):
        print(fromIndex(4,index),end = ' ')
        index +=1
    print('')
Output:
0 1 2 3
4 5 6
7 8
9
(0, 0) (0, 1) (0, 2) (0, 3)
(1, 0) (1, 1) (1, 2)
(2, 0) (2, 1)
(3, 0)

Related

Find the number of possible sums which add to N using (1,...,K)

I have the following problem to solve: given a number N and 1<=k<=N, count the number of possible sums of (1,...,k) which add to N. Terms may repeat (e.g. if N=3 and k=2, (1,1,1) is a valid sum), but permutations must not be counted separately (e.g., if N=3 and k=2, count (1,2) and (2,1) as a single solution). I have implemented the recursive Python code below but I'd like to find a better solution (maybe with dynamic programming?). It seems similar to the triple step problem, but with the extra constraint of not counting permutations.
def find_num_sums_aux(n, min_k, max_k):
    # base case
    if n == 0:
        return 1
    count = 0
    # due to lower bound min_k, we evaluate only ordered solutions and prevent permutations
    for i in range(min_k, max_k+1):
        if n-i>=0:
            count += find_num_sums_aux(n-i, i, max_k)
    return count
def find_num_sums(n, k):
    count = find_num_sums_aux(n,1,k)
    return count
This is a standard problem in dynamic programming (a variant of the subset sum problem).
Let's define the function f(i,j) as the number of ways you can get the sum j using a subset of the numbers (1...i); the answer to your problem is then f(k,n).
The number i might or might not be part of the sum j, so we need to count both possibilities: f(i,j) = f(i-1,j) + f(i-1,j-i), where the second term applies only when j >= i.
Note: f(i,0) = 1 for any i, which means that you can get the sum 0 in exactly one way, namely by not taking any number from the range (1...i).
Here is the code written in C++:
#include <iostream>
using namespace std;
int main() {
    int n = 10;
    int k = 7;
    int f[8][11];
    //initializing the array with zeroes
    for (int i = 0; i <= k; i++)
        for (int j = 0; j <= n; j++)
            f[i][j] = 0;
    f[0][0] = 1;
    for (int i = 1; i <= k; i++) {
        for (int j = 0; j <= n; j++) {
            if (j == 0)
                f[i][j] = 1;
            else {
                f[i][j] = f[i - 1][j];//without adding i to the sum j
                if (j - i >= 0)
                    f[i][j] = f[i][j] + f[i - 1][j - i];//adding i to the sum j
            }
        }
    }
    cout << f[k][n] << endl;//print f(k,n)
    return 0;
}
Update
To handle the case where elements can repeat (so that (1,1,1) gives the sum 3), you just need to allow picking the same element multiple times by changing the following line of code:
f[i][j] = f[i][j] + f[i - 1][j - i];//adding i to the sum
To this:
f[i][j] = f[i][j] + f[i][j - i];
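Since the question was asked in Python, here is a minimal sketch of the repeated-elements version in that language. It assumes the same recurrence as the updated line above, but keeps only a single row of the table; the name count_sums is made up for this sketch.

```python
def count_sums(n, k):
    """Count the ways to write n as a sum of numbers from 1..k,
    with repetition allowed and order ignored."""
    ways = [0] * (n + 1)
    ways[0] = 1  # the empty sum
    for part in range(1, k + 1):
        # Going upwards lets the same part be used several times,
        # which mirrors f[i][j] = f[i-1][j] + f[i][j-i].
        for total in range(part, n + 1):
            ways[total] += ways[total - part]
    return ways[n]

print(count_sums(3, 2))  # (1,1,1) and (1,2)
```

Processing one part at a time is what prevents permutations such as (1,2) and (2,1) from being counted twice.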

canberra distance - inconsistent results

I'm trying to understand what's going on with my calculation of Canberra distance. I wrote my own simple canberra.distance function, but its results are not consistent with the dist function. I added the option na.rm = T to my function so that the sum can still be calculated when a denominator is zero. From ?dist I understand that they use a similar approach: Terms with zero numerator and denominator are omitted from the sum and treated as if the values were missing.
canberra.distance <- function(a, b){
  sum( (abs(a - b)) / (abs(a) + abs(b)), na.rm = T )
}
a <- c(0, 1, 0, 0, 1)
b <- c(1, 0, 1, 0, 1)
canberra.distance(a, b)
> 3
# the result that I expected
dist(rbind(a, b), method = "canberra")
> 3.75
a <- c(0, 1, 0, 0)
b <- c(1, 0, 1, 0)
canberra.distance(a, b)
> 3
# the result that I expected
dist(rbind(a, b), method = "canberra")
> 4
a <- c(0, 1, 0)
b <- c(1, 0, 1)
canberra.distance(a, b)
> 3
dist(rbind(a, b), method = "canberra")
> 3
# now the results are the same
Pairs 0-0 and 1-1 seem to be problematic. In the first case (0-0) both numerator and denominator are equal to zero and this pair should be omitted. In the second case (1-1) numerator is 0 but denominator is not and the term is then also 0 and the sum should not change.
What am I missing here?
EDIT:
To be in line with the R definition, the function canberra.distance can be modified as follows:
canberra.distance <- function(a, b){
  sum( abs(a - b) / abs(a + b), na.rm = T )
}
However, the results are the same as before.
This might shed some light on the difference. As far as I can see, this is the actual code being run for computing the distance:
static double R_canberra(double *x, int nr, int nc, int i1, int i2)
{
    double dev, dist, sum, diff;
    int count, j;
    count = 0;
    dist = 0;
    for(j = 0 ; j < nc ; j++) {
        if(both_non_NA(x[i1], x[i2])) {
            sum = fabs(x[i1] + x[i2]);
            diff = fabs(x[i1] - x[i2]);
            if (sum > DBL_MIN || diff > DBL_MIN) {
                dev = diff/sum;
                if(!ISNAN(dev) ||
                   (!R_FINITE(diff) && diff == sum &&
                    /* use Inf = lim x -> oo */ (int) (dev = 1.))) {
                    dist += dev;
                    count++;
                }
            }
        }
        i1 += nr;
        i2 += nr;
    }
    if(count == 0) return NA_REAL;
    if(count != nc) dist /= ((double)count/nc);
    return dist;
}
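I think the culprit is this line
if(count != nc) dist /= ((double)count/nc);
which scales the sum up by nc/count whenever some terms are omitted. In your first example the 0-0 pair is skipped, so count = 4 while nc = 5, and the result is 3 / (4/5) = 3.75. In the second example count = 3 and nc = 4, giving 3 / (3/4) = 4. In the third example nothing is skipped, so the result stays 3. The 1-1 pair is not skipped: it contributes 0 to the sum and is still counted. ?dist does mention this rescaling: if some columns are excluded, the sum is scaled up proportionally to the number of columns used.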
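To make the arithmetic in the C code above concrete, here is a small Python sketch of the same bookkeeping: omit 0/0 terms, count the terms that were used, and rescale the sum by nc/count at the end. It is a simplification that assumes finite inputs without missing values, and the name canberra_like_dist is invented for this sketch.

```python
import math

def canberra_like_dist(a, b):
    total = 0.0
    count = 0
    for x, y in zip(a, b):
        diff = abs(x - y)
        s = abs(x + y)
        if diff == 0 and s == 0:
            continue  # 0/0 term: omitted, as if the values were missing
        total += diff / s if s > 0 else math.inf
        count += 1
    if count == 0:
        return math.nan
    # Rescale when terms were omitted (count != nc in the C code).
    return total * len(a) / count

print(canberra_like_dist([0, 1, 0, 0, 1], [1, 0, 1, 0, 1]))
print(canberra_like_dist([0, 1, 0, 0], [1, 0, 1, 0]))
print(canberra_like_dist([0, 1, 0], [1, 0, 1]))
```

The three calls use the vectors from the question; only the first two contain a 0-0 pair, so only those two are rescaled.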

Shuffle an array in Arduino software

I have a problem with shuffling this array in the Arduino software:
int questionNumberArray[10]={0,1,2,3,4,5,6,7,8,9};
Does anyone know a built-in function or a way to shuffle the values in the array without any repeats?
The simplest way would be this little for loop:
int questionNumberArray[] = {0,1,2,3,4,5,6,7,8,9};
const size_t n = sizeof(questionNumberArray) / sizeof(questionNumberArray[0]);
for (size_t i = 0; i < n - 1; i++)
{
    size_t j = random(i, n);
    int t = questionNumberArray[i];
    questionNumberArray[i] = questionNumberArray[j];
    questionNumberArray[j] = t;
}
Let's break it down line by line, shall we?
int questionNumberArray[] = {0,1,2,3,4,5,6,7,8,9};
You don't need to specify the number of cells if you initialize an array like that. Just leave the brackets empty like I did.
const size_t n = sizeof(questionNumberArray) / sizeof(questionNumberArray[0]);
I decided to store the number of cells in the constant n. The sizeof operator gives you the number of bytes taken by your array and the number of bytes taken by one cell. Divide the first number by the second and you have the size of your array.
for (size_t i = 0; i < n - 1; i++)
Please note that the loop stops at n - 2. We don't want i to ever take the value of the last index.
size_t j = random(i, n);
We declare a variable j that points to a random cell with index i or greater. Arduino's random(min, max) returns a value from min to max - 1, so j always stays within the array. That is also why i never needs to reach n - 1: by then only one cell is left, and there is nothing to swap it with. We get the random number with Arduino's random function: https://www.arduino.cc/en/Reference/Random
int t = questionNumberArray[i];
questionNumberArray[i] = questionNumberArray[j];
questionNumberArray[j] = t;
A simple swap of two values. It's possible to do it without the temporary variable t, but the code is less readable then.
In my case the result was as follows:
questionNumberArray[0] = 0
questionNumberArray[1] = 9
questionNumberArray[2] = 7
questionNumberArray[3] = 4
questionNumberArray[4] = 6
questionNumberArray[5] = 5
questionNumberArray[6] = 1
questionNumberArray[7] = 8
questionNumberArray[8] = 2
questionNumberArray[9] = 3
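A forward Fisher-Yates shuffle, where position i is swapped with a position drawn uniformly from i..n-1, produces every permutation equally often. The Python sketch below checks this by brute force for a small array: it runs the shuffle once for every possible sequence of random picks and counts the outcomes. The function names are made up for this sketch, and random.randrange(i, n) stands in for a generator that returns values from i to n-1.

```python
import random
from collections import Counter
from itertools import product

def fisher_yates(items, pick=random.randrange):
    a = list(items)
    n = len(a)
    for i in range(n - 1):
        j = pick(i, n)  # uniform over i..n-1
        a[i], a[j] = a[j], a[i]
    return a

def outcome_counts(n):
    """Run the shuffle once for every possible sequence of picks."""
    counts = Counter()
    for offsets in product(*(range(n - i) for i in range(n - 1))):
        it = iter(offsets)
        result = fisher_yates(range(n), pick=lambda lo, hi: lo + next(it))
        counts[tuple(result)] += 1
    return counts

counts = outcome_counts(4)
print(len(counts), set(counts.values()))  # 24 permutations, each seen once
```

Drawing j from a range that does not start at i breaks this property: some permutations then show up more often than others, and some never appear.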

Challenge with vector: how to split a vector based on max/min conditions

I've recently come across the following problem:
Let's say I have a vector of arbitrary length (L) of 0s and 1s randomly distributed (for example [0,1,1,1,0,0,1,0]). I need to split the vector into two sub-vectors at index K so that the following conditions hold:
the left sub-vector must contain the maximum number of elements, taken from K in reverse order, such that the number of zeros is greater than or equal to the number of 1s
the right sub-vector must contain the maximum number of elements, starting from K+1, such that the number of 1s is greater than or equal to the number of zeros
For example, for [1,1,1,1,1,1,1,1,1,0,1,0,0,0,0,0,0,0,0] the split is at index 9, the left vector is [1,0] and the right vector is [0,1]
I wrote the following solution but its complexity is O(L^2). I think there could be a solution with worst-case complexity O(L) but I cannot find anything that can help me. Any ideas? Thanks
var max = 0;
var kMax = -1;
var firstZeroFound = false;
for (var i = 0; i < testVector.Length - 1; i++)
{
    if (!firstZeroFound)
    {
        if (testVector[i]) continue;
        firstZeroFound = true;
    }
    var maxZero = FindMax(testVector, i, -1, -1, false);
    if (maxZero == 0) continue;
    var maxOne = FindMax(testVector, i + 1, testVector.Length, 1, true);
    if (maxOne == 0) continue;
    if ((maxZero + maxOne) <= max)
        continue;
    max = maxOne + maxZero;
    kMax = i;
    if (max == testVector.Length)
        break;
}
Console.Write("The result is {0}", kMax);
int FindMax(bool[] v, int start, int end, int increment, bool maximize)
{
    var max = 0;
    var sum = 0;
    var count = 0;
    var i = start;
    while (i != end)
    {
        count++;
        if (v[i])
            sum++;
        if (maximize)
        {
            if (sum * 2 >= count)
                max = count;
        }
        else if (sum * 2 <= count)
        {
            max = count;
        }
        i += increment;
    }
    return max;
}
I think you should look at rle.
y <- c(1,1,1,1,1,1,1,1,1,0,1,0,0,0,0,0,0,0,0)
z <- rle(y)
d <- cbind(z$values, z$lengths)
d
     [,1] [,2]
[1,]    1    9
[2,]    0    1
[3,]    1    1
[4,]    0    8
Basically, rle calculates the lengths of the runs of consecutive 0's and 1's.
From here things may be easier for you.
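For readers not using R, the same run-length encoding can be sketched in a few lines of Python with itertools.groupby. The function name rle is borrowed from R here; it is not a Python built-in.

```python
from itertools import groupby

def rle(values):
    """Return (value, run length) pairs for consecutive equal values."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]

y = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
print(rle(y))  # [(1, 9), (0, 1), (1, 1), (0, 8)]
```

The encoding takes a single pass over the vector, so any further processing works on the runs rather than on individual elements.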

How to find the number of binary numbers with the following constraints:

Given a binary digit count of n, and a maximum consecutive occurrence count of m, find the number of different possible binary numbers. Also, the leftmost and rightmost bit must be 1.
For example n = 5, and m = 3.
The count is 7:
10001
10011
10101
10111
11001
11011
11101
Notice we excluded 11111 because too many consecutive 1's exist in it.
This was an interview question I had recently, and it has been bothering me. I don't want to brute-force check each number for legitimacy because n can be > 32.
Let's call a binary sequence almost valid if it starts with "1" and has at most m consecutive "1" digits.
For i = 1, ..., n and j = 0, ..., m let a(i, j) be the number of almost valid sequences with length i that end with exactly j consecutive "1" digits.
Then
a(1, 1) = 1 and a(1, j) = 0 for j != 1, because "1" is the only almost valid sequence of length one.
For i >= 2 and j = 0 we have a(i, 0) = a(i-1, 0) + a(i-1, 1) + ... + a(i-1, m), because appending "0" to any almost valid sequence of length i-1 gives an almost valid sequence of length i ending with "0".
For i >= 2 and j > 0 we have a(i, j) = a(i-1, j-1) because appending "1" to an almost valid sequence of length i-1 with j-1 trailing ones gives an almost valid sequence of length i with j trailing ones.
Finally, the wanted number is the number of almost valid sequences with length n that have a trailing "1", so this is
f(n, m) = a(n, 1) + a(n, 2) + ... + a(n, m)
Written as a C function:
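#define NMAX 64 // assumed upper bound on n; adjust as needed
#define MMAX 64 // assumed upper bound on m; adjust as needed
int a[NMAX+1][MMAX+1]; // counts can overflow int for large n; use a wider type if needed
int f(int n, int m)
{
    int i, j, s;
    // compute a(1, j):
    for (j = 0; j <= m; j++)
        a[1][j] = (j == 1);
    for (i = 2; i <= n; i++) {
        // compute a(i, 0):
        s = 0;
        for (j = 0; j <= m; j++)
            s += a[i-1][j];
        a[i][0] = s;
        // compute a(i, j):
        for (j = 1; j <= m; j++)
            a[i][j] = a[i-1][j-1];
    }
    // final result:
    s = 0;
    for (j = 1; j <= m; j++)
        s += a[n][j];
    return s;
}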
The storage requirement could even be improved, because only the previous row of the matrix a (the values a(i-1, j)) is needed to compute row i. The runtime complexity is O(n*m).
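The recurrence can also be sketched in Python while keeping only one row of the table at a time. Python integers do not overflow, which matters here because n can exceed 32. The name count_binary is made up for this sketch.

```python
def count_binary(n, m):
    """Count n-digit binary strings that start and end with '1'
    and contain no run of more than m consecutive '1' digits."""
    if n < 1 or m < 1:
        return 0
    # prev[j] = number of almost valid strings of the current length
    # that end with exactly j consecutive ones.
    prev = [0] * (m + 1)
    prev[1] = 1  # the single string "1"
    for _ in range(2, n + 1):
        # Appending '0' resets the run; appending '1' extends it by one.
        prev = [sum(prev)] + prev[:m]
    return sum(prev[1:])

print(count_binary(5, 3))  # the example from the question: 7
```

Each step builds the new row from the previous one only, so the memory use is O(m) instead of O(n*m).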
Without too much combinatorial insight you can tackle this with DP. Let's write left #(n,m) right for the number of binary strings that consist of the prefix left, then n further digits, then the suffix right, and that contain no run of consecutive 1's longer than m. Clearly, we want to find '1' #(n-2,m) '1'.
The key observation is simply that left #(n,m) right = left+'1' #(n-1,m) right + left+'0' #(n-1,m) right
A simplistic implementation in js (not sure if it works for small m, and in general untested):
function hash(n,m) {
    return _('1',n-2);
    function _(left,n){
        if (m+1 <= left.length && left.lastIndexOf('0') <= left.length-m-2)
            return 0;
        if (n==0)
            return (m <= left.length &&
                left.lastIndexOf('0') <= left.length-m-1 ? 0:1);
        return _(left+'1',n-1) + _(left+'0',n-1);
    }
}
hash(5,3); // 7
Of course this is more efficient than brute force, however the runtime complexity is still exponential, so it isn't practical for large values of n.

Resources