I'm trying to understand what's going on with my calculation of the Canberra distance. I wrote my own simple canberra.distance function; however, the results are not consistent with the dist function. I added the option na.rm = TRUE to my function to be able to calculate the sum when there is a zero denominator. From ?dist I understand that they use a similar approach: "Terms with zero numerator and denominator are omitted from the sum and treated as if the values were missing."
canberra.distance <- function(a, b){
  sum( abs(a - b) / (abs(a) + abs(b)), na.rm = TRUE )
}
a <- c(0, 1, 0, 0, 1)
b <- c(1, 0, 1, 0, 1)
canberra.distance(a, b)
> 3
# the result that I expected
dist(rbind(a, b), method = "canberra")
> 3.75
a <- c(0, 1, 0, 0)
b <- c(1, 0, 1, 0)
canberra.distance(a, b)
> 3
# the result that I expected
dist(rbind(a, b), method = "canberra")
> 4
a <- c(0, 1, 0)
b <- c(1, 0, 1)
canberra.distance(a, b)
> 3
dist(rbind(a, b), method = "canberra")
> 3
# now the results are the same
Pairs 0-0 and 1-1 seem to be problematic. In the first case (0-0), both numerator and denominator are equal to zero and the pair should be omitted. In the second case (1-1), the numerator is 0 but the denominator is not, so the term is 0 and the sum should not change.
What am I missing here?
EDIT:
To be in line with the R definition, the canberra.distance function can be modified as follows:
canberra.distance <- function(a, b){
  sum( abs(a - b) / abs(a + b), na.rm = TRUE )
}
However, the results are the same as before.
This might shed some light on the difference. As far as I can see, this is the actual code being run to compute the distance:
static double R_canberra(double *x, int nr, int nc, int i1, int i2)
{
    double dev, dist, sum, diff;
    int count, j;

    count = 0;
    dist = 0;
    for(j = 0 ; j < nc ; j++) {
        if(both_non_NA(x[i1], x[i2])) {
            sum = fabs(x[i1] + x[i2]);
            diff = fabs(x[i1] - x[i2]);
            if (sum > DBL_MIN || diff > DBL_MIN) {
                dev = diff/sum;
                if(!ISNAN(dev) ||
                   (!R_FINITE(diff) && diff == sum &&
                    /* use Inf = lim x -> oo */ (int) (dev = 1.))) {
                    dist += dev;
                    count++;
                }
            }
        }
        i1 += nr;
        i2 += nr;
    }
    if(count == 0) return NA_REAL;
    if(count != nc) dist /= ((double)count/nc);
    return dist;
}
I think the culprit is this line:
if(count != nc) dist /= ((double)count/nc);
When 0-0 pairs are dropped (both sum and diff fall below DBL_MIN), the accumulated distance is not returned as-is but rescaled by nc/count, i.e. multiplied by (number of columns)/(number of terms actually summed). That gives 3 * 5/4 = 3.75 and 3 * 4/3 = 4 in the first two examples, and an unchanged 3 in the third, exactly matching dist. This rescaling may not be documented. (The !ISNAN(dev) condition with the (int)(dev = 1.) trick only handles the special case of infinite inputs, where Inf/Inf is replaced by its limit 1.)
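A minimal R sketch of that behaviour (my own illustration, not R's code; it assumes inputs without NAs):
canberra.distance.r <- function(a, b){
  term <- abs(a - b) / (abs(a) + abs(b))
  kept <- !is.nan(term)                    # 0/0 pairs are dropped, as documented...
  sum(term[kept]) * length(a) / sum(kept)  # ...but the sum is rescaled by nc/count
}
canberra.distance.r(c(0, 1, 0, 0, 1), c(1, 0, 1, 0, 1))
# 3.75, matching dist()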
Related
Do if statements not work with integrate? I have to do something much more complicated than this, but I am supplying this example because it isolates the problem.
Kernel = function(x){
  if(abs(x) < 1){
    w = 1 - abs(x)
  } else{
    w = 0
  }
  return(w)
}
integrate(Kernel, 0, 1)
The error message:
the condition has length > 1 and only the first element will be used
integrate() evaluates the integrand on a whole vector of abscissae at once, but if() only inspects the first element of its condition. Rewrite the kernel with vectorized functions:
Kernel = function(x){
  pmax(1 - abs(x), 0)
}
integrate(Kernel, 0, 1)
0.5 with absolute error < 5.6e-15
or even:
Kernel1 = function(x){
  ifelse(abs(x) < 1, 1 - abs(x), 0)
}
integrate(Kernel1, 0, 1)
0.5 with absolute error < 5.6e-15
If you want to maintain the way you have written your function, you have to vectorize it:
Kernel2 = function(x){
  if(abs(x) < 1){
    w = 1 - abs(x)
  } else{
    w = 0
  }
  return(w)
}
integrate(Vectorize(Kernel2), 0, 1)
0.5 with absolute error < 5.6e-15
I would like to generate, in an efficient way, a list of integers (preferably ordered) with the following defining properties:
All integers have the same number of set bits, N.
All integers have the same sum of bit indices, K.
To be definite, for an integer I, its binary representation is:
$I=\sum_{j=0}^M c_j 2^j$ where $c_j=0$ or $1$
The number of set bits is:
$N(I)=\sum_{j=0}^M c_j$
The sum of bit indices is:
$K(I)=\sum_{j=0}^M j c_j$
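For example, $I = 11 = 1011_2$ has $N(I) = 3$ set bits and index sum $K(I) = 0 + 1 + 3 = 4$.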
I have an inefficient way to generate the list as follows: make a do/for loop over integers, incrementing by use of a "snoob" function (smallest next integer with the same number of set bits), and at each increment checking whether it has the correct value of K. This is grossly inefficient because, in general, starting from an integer with the correct N and K values, the snoob successor of I does not have the correct K, and one has to make many snoob calculations to get the next integer with both N and K equal to the chosen values. Using snoob gives an ordered list, which is handy for dichotomic search but not absolutely compulsory.
Counting the number of elements in this list is easily done by recursion when viewed as a partition-number counting problem. Here is a recursive function in Fortran 90 doing that job:
=======================================================================
recursive function BoundedPartitionNumberQ(N, M, D) result (res)
  implicit none
  ! number of partitions of N into M distinct integers, bounded by D
  ! appropriate for Fermi counting rules
  integer(8) :: N, M, D, Nmin
  integer(8) :: res
  Nmin = M*(M+1)/2 ! the Fermi sea
  if(N < Nmin) then
     res = 0
  else if((N == Nmin) .and. (D >= M)) then
     res = 1
  else if(D < M) then
     res = 0
  else if(D == M) then
     if(N == Nmin) then
        res = 1
     else
        res = 0
     endif
  else if(M == 0) then
     res = 0
  else
     res = BoundedPartitionNumberQ(N-M,M-1,D-1) + BoundedPartitionNumberQ(N-M,M,D-1)
  endif
end function BoundedPartitionNumberQ
========================================================================================
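As an aside, under the conventions of the program below (bit indices 0..Nphi, so each set index j corresponds to the distinct positive integer j+1), the length of the list should be given by a call like the following; this mapping is my own reading of the code, so treat it as an assumption:
! hypothetical usage: number of integers with Nparticles set bits,
! bit indices in [0, Nphi], and index sum CheckL
counter = BoundedPartitionNumberQ(CheckL + Nparticles, Nparticles, Nphi + 1)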
My present solution is inefficient when I want to generate lists with several $10^7$ elements. Ultimately I want to stay within the realm of C/C++/Fortran and reach lists of lengths up to a few $10^9$.
My present f90 code is the following:
program test
  implicit none
  integer(8) :: Nparticles
  integer(8) :: Nmax, TmpL, CheckL, Nphi
  integer(8) :: i, k, counter
  integer(8) :: NextOne
  Nphi = 31       ! word size is Nphi+1
  Nparticles = 16 ! number of set bits
  print*,Nparticles,Nphi
  Nmax = ishft(1_8, Nphi + 1) - ishft(1_8, Nphi + 1 - Nparticles)
  i = ishft(1_8, Nparticles) - 1 ! 1_8 so the shift is done in 64 bits
  counter = 0
  ! integer CheckL is the sum of bit indices
  CheckL = Nparticles*Nphi/2 ! the value of the sum giving the largest list
  do while(i .le. Nmax) ! we increment the integer
     TmpL = 0
     do k=0,Nphi
        if (btest(i,k)) TmpL = TmpL + k
     end do
     if (TmpL == CheckL) then ! we check whether the sum of bit indices is OK
        counter = counter + 1
     end if
     i = NextOne(i) ! a version of "snoob" described below
  end do
  print*,counter
end program
!==========================================================================
function NextOne (state)
  implicit none
  integer(8) :: bit
  integer(8) :: counter
  integer(8) :: NextOne,state,pstate
  bit = 1
  counter = -1
  ! find first one bit
  do while (iand(bit,state) == 0)
     bit = ishft(bit,1)
  end do
  ! find next zero bit
  do while (iand(bit,state) /= 0)
     counter = counter + 1
     bit = ishft(bit,1)
  end do
  if (bit == 0) then
     print*,'overflow in NextOne'
     NextOne = not(0)
  else
     state = iand(state,not(bit-1)) ! clear lower bits i &= (~(bit-1));
     pstate = ishft(1_8,counter)-1  ! needed by IBM/Zahir compiler
     ! state = ior(state,ior(bit,ishft(1,counter)-1)) ! short version OK with gcc
     state = ior(state,ior(bit,pstate))
     NextOne = state
  end if
end function NextOne
Since you mentioned C/C++/Fortran, I've tried to keep this relatively language-agnostic and easily transferable, but I have also included faster builtin alternatives where applicable.
All integers have the same number of set bits N
Then we can also say: all valid integers will be permutations of N set bits.
First, we must generate the initial/min permutation:
uint32_t firstPermutation(uint32_t n){
    // Fill the lowest n bits (on the right); assumes 0 < n < 32
    return (1u << n) - 1;
}
Next, we must set the final/max permutation - indicating the 'stop point':
uint32_t lastPermutation(uint32_t n){
    // Fill the highest n bits (on the left)
    return (0xFFFFFFFF >> n) ^ 0xFFFFFFFF;
}
Finally, we need a way to get the next permutation.
uint32_t nextPermutation(uint32_t n){
    uint32_t t = (n | (n - 1)) + 1;
    return t | ((((t & -t) / (n & -n)) >> 1) - 1);
}

// or with builtins:
uint32_t nextPermutation(uint32_t &p){
    uint32_t t = (p | (p - 1));
    return (t + 1) | (((~t & -~t) - 1) >> (__builtin_ctz(p) + 1));
}
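For n = 2, for instance, this enumerates the permutations in increasing order (a quick check of the bit hack above):
// n = 2: 0011 -> 0101 -> 0110 -> 1001 -> 1010 -> 1100
uint32_t p = firstPermutation(2);  // 0b0011 = 3
p = nextPermutation(p);            // 0b0101 = 5
p = nextPermutation(p);            // 0b0110 = 6
p = nextPermutation(p);            // 0b1001 = 9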
All integers have the same sum of bit indices K
Assuming these are 32-bit integers, you can use this De Bruijn sequence to quickly identify the index of the first set bit (fsb). Similar sequences exist for other types/bit counts; for example, this one could be adapted for use. By stripping the current fsb, we can apply the same technique to identify the index of the next fsb, and so on.
int sumIndices(uint32_t n){
    const int MultiplyDeBruijnBitPosition[32] = {
        0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,
        31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
    };
    int sum = 0;
    // Get fsb index
    do sum += MultiplyDeBruijnBitPosition[((uint32_t)((n & -n) * 0x077CB531U)) >> 27];
    // strip fsb
    while (n &= n - 1);
    return sum;
}
// or with builtin
int sumIndices(uint32_t n){
    int sum = 0;
    do sum += __builtin_ctz(n);
    while (n &= n - 1);
    return sum;
}
Finally, we can iterate over each permutation, checking if the sum of all indices matches the specified K value.
uint32_t p = firstPermutation(n);
uint32_t lp = lastPermutation(n);
for (;;) {
    if (sumIndices(p) == k){
        std::cout << "p:" << p << std::endl;
    }
    if (p == lp) break;
    p = nextPermutation(p); // advance only after testing, so the first and last permutations are both checked
}
You could easily change the 'handler' code to do something similar, starting at a given integer and using its N & K values.
A basic recursive implementation could be:
void listIntegersWithWeight(int currentBitCount, int currentWeight, uint32_t pattern, int index, int n, int k, std::vector<uint32_t> &res)
{
    if (currentBitCount > n ||
        currentWeight > k)
        return;
    if (index < 0)
    {
        if (currentBitCount == n && currentWeight == k)
            res.push_back(pattern);
    }
    else
    {
        listIntegersWithWeight(currentBitCount, currentWeight, pattern, index - 1, n, k, res);
        listIntegersWithWeight(currentBitCount + 1, currentWeight + index, pattern | (1u << index), index - 1, n, k, res);
    }
}
That is not my suggestion, just the starting point. On my PC, for n = 16, k = 248, both this version and the iterative version take almost (but not quite) 9 seconds, almost exactly the same amount of time; but that's just a coincidence. More pruning can be done:
currentBitCount + index + 1 < n: if the number of set bits cannot reach n with the number of unfilled positions that are left, continuing is pointless.
currentWeight + (index * (index + 1) / 2) < k: if the sum of positions cannot reach k, continuing is pointless.
Together:
void listIntegersWithWeight(int currentBitCount, int currentWeight, uint32_t pattern, int index, int n, int k, std::vector<uint32_t> &res)
{
    if (currentBitCount > n ||
        currentWeight > k ||
        currentBitCount + index + 1 < n ||
        currentWeight + (index * (index + 1) / 2) < k)
        return;
    if (index < 0)
    {
        if (currentBitCount == n && currentWeight == k)
            res.push_back(pattern);
    }
    else
    {
        listIntegersWithWeight(currentBitCount, currentWeight, pattern, index - 1, n, k, res);
        listIntegersWithWeight(currentBitCount + 1, currentWeight + index, pattern | (1u << index), index - 1, n, k, res);
    }
}
On my PC with the same parameters, this only takes half a second. It can probably be improved further.
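A minimal driver for the pruned function above (my own sketch; it assumes listIntegersWithWeight is in scope, and uses the n and k values from the timing test, starting at the top bit index 31):
#include <cstdint>
#include <iostream>
#include <vector>

int main()
{
    std::vector<uint32_t> res;
    // start with an empty pattern and consider bit indices 31 down to 0
    listIntegersWithWeight(0, 0, 0u, 31, 16, 248, res);
    std::cout << res.size() << " integers found\n";
}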
I would like to create my own Heapsort algorithm in R.
This is my code:
heapify <- function(array, n, i)
{
  parent <- i
  leftChild <- 2 * i + 1
  rightChild <- 2 * i + 2
  if ((leftChild < n) & (array[parent] < array[leftChild]))
  {
    parent = leftChild
  }
  if ((rightChild < n) & (array[parent] < array[rightChild]))
  {
    parent = rightChild
  }
  if (parent != i)
  {
    array = replace(array, c(i, parent), c(array[parent], array[i]))
    heapify(array, n, parent)
  }
}
heapSort <- function(array)
{
  n <- length(array)
  for (i in (n+1):1)
  {
    heapify(array, n, i)
  }
  for (i in n:1)
  {
    array = replace(array, c(i, 0), c(array[0], array[i]))
    heapify(array, i, 1)
  }
  print(array)
}
However, that implementation seems to be incorrect. Here is an example of an input and its output:
array <- c(5, 14, 3, 70, 64)
heapSort(array)
Output: [1] 5 14 3 70 64
I have spent quite a while and I have no idea where the problem is. I would appreciate any hints or tips.
My guess is that you were trying to convert the algorithm posted on GeeksforGeeks, where they implement this in many zero-based languages. This is one of the sources of your problem (R starts indexing at 1 instead of 0).
Base Zero Indexing Issues:
Example 1:
## We also need to swap these indices
array = replace(array, c(i, 0), c(array[0], array[i]))
heapify(array, i, 1)
Should be:
array <- replace(array, c(i, 1), array[c(1, i)])
array <- heapify(array, i - 1, 1)
Example 2:
leftChild <- 2 * i + 1
rightChild <- 2 * i + 2
Should be:
leftChild <- 2 * i
rightChild <- 2 * i + 1
These are the 1-based children of node i, so the bounds checks become leftChild <= n and rightChild <= n; otherwise the last element of the heap is never compared.
Pass By Reference Assumption
In R, you cannot pass an object by reference (see this question and its answers: Can you pass-by-reference in R?). This means that when we call a recursive function, we must assign its result, and the recursive function must return something.
In heapify we must return array, and every call to heapify must assign array to its output.
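A quick toy illustration of R's copy-on-modify semantics (my own example):
f <- function(x) {
  x[1] <- 99  # modifies a local copy only
  x
}
v <- c(1, 2, 3)
f(v)  # [1] 99  2  3
v     # [1]  1  2  3 -- unchanged; the caller must capture the return value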
Here is the amended code:
heapify <- function(array, n, i)
{
  parent <- i
  leftChild <- 2 * i
  rightChild <- 2 * i + 1
  if ((leftChild <= n) && (array[parent] < array[leftChild]))
  {
    parent <- leftChild
  }
  if ((rightChild <= n) && (array[parent] < array[rightChild]))
  {
    parent <- rightChild
  }
  if (parent != i) {
    array <- replace(array, c(i, parent), array[c(parent, i)])
    array <- heapify(array, n, parent)
  }
  array
}
heapSort <- function(array)
{
  n <- length(array)
  for (i in floor(n / 2):1) {
    array <- heapify(array, n, i)
  }
  for (i in n:1) {
    array <- replace(array, c(i, 1), array[c(1, i)])
    array <- heapify(array, i - 1, 1) # the sorted tail starts at i, so the heap is 1..(i-1)
  }
  array
}
Here are some tests (note this algorithm is extremely inefficient in R; do not try it on vectors much larger than below):
array <- c(5, 14, 3, 70, 64)
heapSort(array)
[1] 3 5 14 64 70
set.seed(11)
largerExample <- sample(1e3)
head(largerExample)
[1] 278 1 510 15 65 951
identical(heapSort(largerExample), 1:1e3)
[1] TRUE
Modular inverses can be computed as follows (from Rosetta Code):
#include <stdio.h>
int mul_inv(int a, int b)
{
    int b0 = b, t, q;
    int x0 = 0, x1 = 1;
    if (b == 1) return 1;
    while (a > 1) {
        q = a / b;
        t = b, b = a % b, a = t;
        t = x0, x0 = x1 - q * x0, x1 = t;
    }
    if (x1 < 0) x1 += b0;
    return x1;
}
However, the inputs are ints, as you can see. Would the above code work for unsigned integers (e.g. uint64_t) as well? I mean, would it be OK to replace all int with uint64_t? I could try a few inputs, but it is not feasible to try all 64-bit combinations.
I'm specifically interested in two aspects:
for values in [0, 2^64) of both a and b, would all calculations avoid overflow/underflow (or overflow with no harm)?
how would (x1 < 0) look in the unsigned case?
First of all, how does this algorithm work? It is based on the extended Euclidean algorithm for computing the GCD. In short, the idea is the following: if we can find some integer coefficients m and n such that
a*m + b*n = 1
then m will be the answer for the modular inverse problem. It is easy to see because
a*m + b*n ≡ a*m (mod b)
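For example, 7*13 + 45*(-2) = 91 - 90 = 1, so the inverse of 7 modulo 45 is m = 13: indeed (7 * 13) % 45 == 1.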
Luckily the extended Euclidean algorithm does exactly that: if a and b are co-prime, it finds such m and n. It works in the following way: each iteration tracks two triplets (ai, xai, yai) and (bi, xbi, ybi) such that at every step
ai = a0*xai + b0*yai
bi = a0*xbi + b0*ybi
so when the algorithm finally stops at the state ai = 0 and bi = GCD(a0, b0), then
1 = GCD(a0,b0) = a0*xbi + b0*ybi
It is done using a more explicit way to calculate the modulo: if
q = a / b
r = a % b
then
r = a - q * b
Another important thing is that it can be proven that for positive a and b, at every step |xai|, |xbi| <= b and |yai|, |ybi| <= a. This means there can be no overflow during the calculation of those coefficients. Unfortunately, negative values are possible; moreover, at every step after the first, in each equation one coefficient is positive and the other is negative.
What the code in your question does is a reduced version of the same algorithm: since all we are interested in is the x[a/b] coefficients, it tracks only them and ignores the y[a/b] ones. The simplest way to make that code work for uint64_t is to track the sign explicitly in a separate field, like this:
typedef struct tag_uint64AndSign {
    uint64_t value;
    bool isNegative;
} uint64AndSign;

uint64_t mul_inv(uint64_t a, uint64_t b)
{
    if (b <= 1)
        return 0;

    uint64_t b0 = b;
    uint64AndSign x0 = { 0, false }; // b = 1*b + 0*a
    uint64AndSign x1 = { 1, false }; // a = 0*b + 1*a
    while (a > 1)
    {
        if (b == 0) // means original A and B were not co-prime, so there is no answer
            return 0;

        uint64_t q = a / b;
        // (b, a) := (a % b, b)
        // which is the same as
        // (b, a) := (a - q * b, b)
        uint64_t t = b; b = a % b; a = t;

        // (x0, x1) := (x1 - q * x0, x0)
        uint64AndSign t2 = x0;
        uint64_t qx0 = q * x0.value;
        if (x0.isNegative != x1.isNegative)
        {
            x0.value = x1.value + qx0;
            x0.isNegative = x1.isNegative;
        }
        else
        {
            x0.value = (x1.value > qx0) ? x1.value - qx0 : qx0 - x1.value;
            x0.isNegative = (x1.value > qx0) ? x1.isNegative : !x0.isNegative;
        }
        x1 = t2;
    }
    return x1.isNegative ? (b0 - x1.value) : x1.value;
}
Note that if a and b are not co-prime, or when b is 0 or 1, this problem has no solution. In all those cases my code returns 0, which is an impossible value for any real solution.
Note also that although the calculated value is really the modular inverse, a simple multiplication will not always produce 1, because of overflow in the multiplication over uint64_t. For example, for a = 688231346938900684 and b = 2499104367272547425 the result is inv = 1080632715106266389:
a * inv = 688231346938900684 * 1080632715106266389 =
= 743725309063827045302080239318310076 =
= 2499104367272547425 * 297596738576991899 + 1 =
= b * 297596738576991899 + 1
But if you do a naive multiplication of those a and inv as uint64_t, you'll get 4042520075082636476, so (a*inv)%b will be 1543415707810089051 rather than the expected 1.
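A minimal overflow-free check of that example (my own sketch; it assumes a compiler providing the unsigned __int128 extension, such as GCC or Clang):
#include <stdint.h>
#include <stdio.h>

// verify a * inv == 1 (mod b) using a 128-bit intermediate product
int is_mod_inverse(uint64_t a, uint64_t inv, uint64_t b)
{
    return (uint64_t)(((unsigned __int128)a * inv) % b) == 1;
}

int main(void)
{
    uint64_t a   = 688231346938900684u;
    uint64_t b   = 2499104367272547425u;
    uint64_t inv = 1080632715106266389u;
    printf("%d\n", is_mod_inverse(a, inv, b)); // prints 1
}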
The mod_inv C function below returns a modular multiplicative inverse of n with respect to the modulus mod, or 0 if the linear congruence has no solution:
unsigned mod_inv(unsigned n, const unsigned mod) {
    unsigned a = mod, b = a, c = 0, d = 0, e = 1, f, g;
    for (n *= a > 1; n > 1 && (n *= a > 0); e = g, c = (c & 3) | (c & 1) << 2) {
        g = d, d *= n / (f = a);
        a = n % a, n = f;
        c = (c & 6) | (c & 2) >> 1;
        f = c > 1 && c < 6;
        c = (c & 5) | (f || e > d ? (c & 4) >> 1 : ~c & 2);
        d = f ? d + e : e > d ? e - d : d - e;
    }
    return n ? c & 4 ? b - e : e : 0;
}
Examples
n = 7 and mod = 45 then res = 13 so 1 == ( 13 * 7 ) % 45
n = 52 and mod = 107 then res = 35 so 1 == ( 35 * 52 ) % 107
n = 213 and mod = 155 then res = 147 so 1 == ( 147 * 213 ) % 155
n = 392 and mod = 45 then res = 38 so 1 == ( 38 * 392 ) % 45
n = 3708141711 and mod = 4280761040 it still works...
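A small driver to reproduce those examples (my own harness, assuming the mod_inv function above):
#include <stdio.h>

int main(void)
{
    unsigned tests[][2] = { {7, 45}, {52, 107}, {213, 155}, {392, 45} };
    for (int i = 0; i < 4; ++i) {
        unsigned n = tests[i][0], mod = tests[i][1];
        unsigned res = mod_inv(n, mod);
        // e.g. n = 7, mod = 45 gives res = 13, and (13 * 7) % 45 == 1
        printf("n=%u mod=%u res=%u check=%u\n", n, mod, res, res * n % mod);
    }
}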
I would like to have an implementation in PARI/GP for the calculation of
a_1 ^ a_2 ^ ... ^ a_n (mod m)
which manages all cases, especially the cases where high powers appear in the phi-chain.
Does anyone know of such an implementation?
Here's a possibility using Chinese remainders to make sure the modulus is a prime power. This simplifies the computation of x^n mod m in the painful case where gcd(x,m) is not 1. The code assumes the a_i are > 1; most of it checks whether x[1]^x[2]^...^x[#x] is 0 mod p^e for a prime divisor p of the modulus, while avoiding huge intermediate towers.
\\ x[1]^x[2]^ ...^ x[#x] mod m, assuming x[i] > 1 for all i
tower(x, m) =
{ my(f = factor(m), P = f[,1], E = f[,2]);
  chinese(vector(#P, i, towerp(x, P[i], E[i])));
}

towerp(x, p, e) =
{ my(q = p^e, i, t, v);
  if (#x == 0, return (Mod(1, q)));
  if (#x == 1, return (Mod(x[1], q)));
  if (v = valuation(x[1], p),
    t = x[#x]; i = #x - 1; \\ t accumulates the exponent tower x[2]^...^x[#x]
    while (i > 1,
      if (t >= e, return (Mod(0, q)));
      t = x[i]^t; i--);
    if (t * v >= e, return (Mod(0, q)));
    return (Mod(x[1], q)^t);
  );
  Mod(x[1], q)^lift(tower(x[^1], (p-1)*p^e));
}
For instance
? 5^(4^(3^2)) % 163 \\ direct computation, wouldn't scale
%1 = 158
? tower([5,4,3,2], 163)
%2 = Mod(158, 163)
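And a further sanity check of my own (not from the original answer), exercising the gcd(x,m) > 1 branch; by hand, 2^(3^4) = 2^81 ≡ 32 (mod 48):
? tower([2,3,4], 48)
%3 = Mod(32, 48)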