runif with .Machine$double.xmax as boundaries - r

I wanted to generate a random real (I guess rational) number.
To do this I wanted to use runif(1, min = m, max = M), and my thought was to make m and M as large as possible in absolute value, in order to make the interval as large as possible. Which brings me to my question:
M <- .Machine$double.xmax
m <- -M
runif(1, m, M)
## which returns
[1] Inf
Why is it not returning a number? Is the chosen interval simply too large?
PS
> .Machine$double.xmax
[1] 1.797693e+308

As hinted by mt1022, the reason is in the C source of runif:
double runif(double a, double b)
{
    if (!R_FINITE(a) || !R_FINITE(b) || b < a) ML_ERR_return_NAN;

    if (a == b)
        return a;
    else {
        double u;
        /* This is true of all builtin generators, but protect against
           user-supplied ones */
        do { u = unif_rand(); } while (u <= 0 || u >= 1);
        return a + (b - a) * u;
    }
}
In the return value you can see the formula a + (b - a) * u, which transforms the random value u, generated uniformly on [0, 1], into the user-supplied interval [a, b]. In your case that is -M + (M - (-M)) * u. The term M - (-M), i.e. 1.79e308 + 1.79e308, overflows to Inf, and finite + Inf * finite = Inf:
m + (M - m) * runif(1, 0, 1)
# Inf
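If the goal is simply to draw from as wide a finite interval as possible, one minimal workaround (my suggestion, not part of the original answer) is to halve both endpoints so that b - a stays finite:
M <- .Machine$double.xmax
## b - a is now exactly M, which is finite, so a + (b - a) * u cannot overflow
runif(1, -M/2, M/2)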

Related

Converting matlab code to R code

I was wondering how I can convert this code from Matlab to R. It seems this is the code for the midpoint method. Any help would be highly appreciated.
% Usage: [y t] = midpoint(f,a,b,ya,n) or y = midpoint(f,a,b,ya,n)
% Midpoint method for initial value problems
%
% Input:
% f - Matlab inline function f(t,y)
% a,b - interval
% ya - initial condition
% n - number of subintervals (panels)
%
% Output:
% y - computed solution
% t - time steps
%
% Examples:
% [y t]=midpoint(@myfunc,0,1,1,10); here 'myfunc' is a user-defined function in an M-file
% y=midpoint(inline('sin(y*t)','t','y'),0,1,1,10);
% f=inline('sin(y(1))-cos(y(2))','t','y');
% y=midpoint(f,0,1,1,10);
function [y t] = midpoint(f,a,b,ya,n)
h = (b - a) / n;
halfh = h / 2;
y(1,:) = ya;
t(1) = a;
for i = 1 : n
    t(i+1) = t(i) + h;
    z = y(i,:) + halfh * f(t(i),y(i,:));
    y(i+1,:) = y(i,:) + h * f(t(i)+halfh,z);
end;
I have the R code for Euler's method, which is:
euler <- function(f, h = 1e-7, x0, y0, xfinal) {
  N = (xfinal - x0) / h
  x = y = numeric(N + 1)
  x[1] = x0; y[1] = y0
  i = 1
  while (i <= N) {
    x[i + 1] = x[i] + h
    y[i + 1] = y[i] + h * f(x[i], y[i])
    i = i + 1
  }
  return(data.frame(X = x, Y = y))
}
So, based on the Matlab code, do I need to change h in the Euler method (R code) to (b - a) / n in order to turn the Euler code into the midpoint method?
Note
Broadly speaking, I agree with the comments (now deleted); however, I decided to vote this question up because the matconv package exists and facilitates this process.
Answer
Given your code, we could use matconv in the following manner:
pacman::p_load(matconv)
out <- mat2r(inMat = "input.m")
The created out object holds an attempted translation of the Matlab code into R; however, the job is far from finished. If you inspect out you will see that it requires further work. Simple statements are usually translated correctly, with Matlab comments % replaced by # and so forth, but more complex statements may require closer investigation. You can then inspect the respective lines and attempt to evaluate them to see where further work is required. For example:
eval(parse(text=out$rCode[1]))
NULL
(first line is a comment so the output is NULL)
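To address the original question directly: changing h to (b - a) / n is not enough, because the midpoint method also needs the extra half-step evaluation z inside the loop (visible in the Matlab code above). Below is a hand-translated sketch in the style of the euler function; the name midpoint follows the Matlab original, and scalar y is assumed (the Matlab version also handles vector-valued y):
midpoint <- function(f, a, b, ya, n) {
  h <- (b - a) / n                    # step size, as in the Matlab code
  halfh <- h / 2
  t <- numeric(n + 1)
  y <- numeric(n + 1)
  t[1] <- a
  y[1] <- ya
  for (i in 1:n) {
    t[i + 1] <- t[i] + h
    z <- y[i] + halfh * f(t[i], y[i])           # half step (predictor)
    y[i + 1] <- y[i] + h * f(t[i] + halfh, z)   # full step using the midpoint slope
  }
  data.frame(T = t, Y = y)
}
## e.g. midpoint(function(t, y) sin(y * t), 0, 1, 1, 10)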

Is it safe to replace "a/(b*c)" with "a/b/c" when using integer-division?

Is it safe to replace a/(b*c) with a/b/c when using integer-division on positive integers a,b,c, or am I at risk losing information?
I did some random tests and couldn't find an example of a/(b*c) != a/b/c, so I'm pretty sure it's safe but not quite sure how to prove it.
Thank you.
Mathematics
As mathematical expressions, ⌊a/(bc)⌋ and ⌊⌊a/b⌋/c⌋ are equivalent whenever b is nonzero and c is a positive integer (and in particular for positive integers a, b, c). The standard reference for these sorts of things is the delightful book Concrete Mathematics: A Foundation for Computer Science by Graham, Knuth and Patashnik. In it, Chapter 3 is mostly on floors and ceilings, and this is proved on page 71 as part of a far more general result, (3.10): if f is a continuous, monotonically increasing function with the property that f(x) being an integer implies x is an integer, then ⌊f(x)⌋ = ⌊f(⌊x⌋)⌋ and ⌈f(x)⌉ = ⌈f(⌈x⌉)⌉.
In (3.10) above, you can define x = a/b (mathematical, i.e. real division) and f(x) = x/c (again exact division), and plug those into the left-hand result ⌊f(x)⌋ = ⌊f(⌊x⌋)⌋ (after verifying that the conditions on f hold here) to get ⌊a/(bc)⌋ on the LHS equal to ⌊⌊a/b⌋/c⌋ on the RHS.
If we don't want to rely on a reference in a book, we can prove ⌊a/(bc)⌋ = ⌊⌊a/b⌋/c⌋ directly using their methods. Note that with x = a/b (the real number), what we're trying to prove is that ⌊x/c⌋ = ⌊⌊x⌋/c⌋. So:
if x is an integer, then there is nothing to prove, as x = ⌊x⌋.
Otherwise, ⌊x⌋ < x, so ⌊x⌋/c < x/c which means that ⌊⌊x⌋/c⌋ ≤ ⌊x/c⌋. (We want to show it's equal.) Suppose, for the sake of contradiction, that ⌊⌊x⌋/c⌋ < ⌊x/c⌋ then there must be a number y such that ⌊x⌋ < y ≤ x and y/c = ⌊x/c⌋. (As we increase a number from ⌊x⌋ to x and consider division by c, somewhere we must hit the exact value ⌊x/c⌋.) But this means that y = c*⌊x/c⌋ is an integer between ⌊x⌋ and x, which is a contradiction!
This proves the result.
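For reassurance, a quick brute-force check in R (a sanity check over small positive integers, not a proof):
ok <- TRUE
for (a in 1:200) for (b in 1:12) for (c in 1:12)
  if (a %/% (b * c) != (a %/% b) %/% c) ok <- FALSE
ok
## [1] TRUE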
Programming
#include <stdio.h>

int main() {
    unsigned int a = 142857;
    unsigned int b = 65537;
    unsigned int c = 65537;
    /* b*c wraps around 2^32: 65537 * 65537 mod 2^32 = 131073 */
    printf("a/(b*c) = %u\n", a / (b * c));
    printf("a/b/c = %u\n", a / b / c);
    return 0;
}
prints (with 32-bit integers),
a/(b*c) = 1
a/b/c = 0
(I used unsigned integers as overflow behaviour for them is well-defined, so the above output is guaranteed. With signed integers, overflow is undefined behaviour, so the program can in fact print (or do) anything, which only reinforces the point that the results can be different.)
But if you don't have overflow, then the values you get in your program are equal to their mathematical values (that is, a/(b*c) in your code is equal to the mathematical value ⌊a/(bc)⌋, and a/b/c in code is equal to the mathematical value ⌊⌊a/b⌋/c⌋), which we've proved are equal. So it is safe to replace a/(b*c) in code by a/b/c when b*c is small enough not to overflow.
While b*c could overflow (in C) for the original computation, a/b/c can't overflow, so we don't need to worry about overflow for the forward replacement a/(b*c) -> a/b/c. We would need to worry about it the other way around, though.
Let x = a/b/c (integer division). Then a/b == x*c + y for some 0 <= y < c, and a == (x*c + y)*b + z for some 0 <= z < b.
Thus, a == x*b*c + y*b + z. Since y*b + z is at most b*c - 1, we get x*b*c <= a < (x+1)*b*c, and so a/(b*c) == x.
Thus, a/b/c == a/(b*c), and replacing a/(b*c) by a/b/c is safe.
Nested floor division can be reordered as long as you keep track of your divisors and dividends.
#python3.x
x // m // n = x // (m * n)
#python2.x
x / m / n = x / (m * n)
Proof (sucks without LaTeX :( ) in python3.x:
Let k = x // m and q = k // n; we want to show that x // (m * n) = q.
From k = x // m: k <= x / m < k + 1, i.e. k * m <= x <= k * m + m - 1.
From q = k // n: q * n <= k <= q * n + n - 1.
Combining the lower bounds: q * m * n <= k * m <= x.
Combining the upper bounds: x <= k * m + m - 1 <= (q * n + n - 1) * m + m - 1 = (q + 1) * m * n - 1.
So q * m * n <= x < (q + 1) * m * n, which is exactly the statement that
x // (m * n) = q = (x // m) // n
https://en.wikipedia.org/wiki/Floor_and_ceiling_functions#Nested_divisions

why 1 is subtracted from mod where mod =1000000007 in calculation

Link to the question:
http://codeforces.com/contest/615/problem/D
Link to the solution:
http://codeforces.com/contest/615/submission/15260890
In the code below I am not able to understand why 1 is subtracted from MOD, where MOD = 1000000007.
ll d = 1;
ll ans = 1;
for (auto x : cnt) {
    ll cnt = x.se;
    ll p = x.fi;
    ll fp = binPow(p, (cnt + 1) * cnt / 2, MOD);
    ans = binPow(ans, (cnt + 1), MOD) * binPow(fp, d, MOD) % MOD;
    d = d * (x.se + 1) % (MOD - 1); // why ??
}
Apart from the fact that the code does not make much sense out of context, there is Fermat's little theorem:
Whenever MOD is a prime number, as 10^9 + 7 is, one can reduce exponents by multiples of (MOD - 1), since for any a that is not a multiple of MOD,
a ^ (MOD-1) == 1 mod MOD,
which means that
a^b == a ^ (b mod (MOD-1)) mod MOD.
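A tiny sanity check of this reduction in R, using a small prime p = 7 in place of MOD so that the exact powers stay within double precision:
p <- 7                      # a small prime standing in for MOD
a <- 3; b <- 10
(a^b) %% p                  # [1] 4
(a^(b %% (p - 1))) %% p     # [1] 4, exponent reduced mod p - 1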
As to the code, which is efficient for its task, consider n = m * p^e where m is composed of primes smaller than p.
Then for each factor f of m there are factors 1*f, p*f, p^2*f, ..., p^e*f of n. The product over all factors of n is thus the product of
p^(0+1+2+...+e) * f^(e+1) = p^( e*(e+1)/2 ) * f^(e+1)
over all factors f of m. Writing the number of factors of m as d and the product of the factors of m as ans gives the combined update
ans = ans^( e+1 ) * p^( d*e*(e+1)/2 )
d = d*(e+1)
which can be applied iteratively over the list of prime factors and their multiplicities. Since d only ever appears as an exponent (in binPow(fp, d, MOD)), Fermat's little theorem allows keeping it reduced modulo MOD - 1, which is why the code computes d = d * (x.se + 1) % (MOD - 1).
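As a worked check of the update rule (in R, without any modulus since the numbers are tiny), take n = 12 = 2^2 * 3^1, whose divisors 1, 2, 3, 4, 6, 12 multiply to 1728:
d <- 1; ans <- 1
for (pe in list(c(2, 2), c(3, 1))) {  # (prime p, multiplicity e) pairs for n = 12
  p <- pe[1]; e <- pe[2]
  ans <- ans^(e + 1) * p^(d * e * (e + 1) / 2)
  d <- d * (e + 1)
}
ans   # [1] 1728, the product of the divisors of 12
d     # [1] 6, the number of divisors of 12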

Normalizing constant for beta distribution with discrete prior : R code query

I am currently going through Bayesian Thinking with R by Jim Albert. I have a query about his code for his example with a beta likelihood and discrete prior. His code for calculating the posterior is:
pdisc <- function(p, prior, data) {
  s = data[1] # successes
  f = data[2] # failures
  #############
  p1 = p + 0.5 * (p == 0) - 0.5 * (p == 1)
  like = s * log(p1) + f * log(1 - p1)
  like = like * (p > 0) * (p < 1) - 999 * ((p == 0) * (s > 0) + (p == 1) * (f > 0))
  like = exp(like - max(like))
  #############
  product = like * prior
  post = product / sum(product)
  return(post)
}
My query is about the highlighted bit of code for calculating the likelihood and what the logic behind it is (not explained in the book). I'm aware of the pdf for the beta distribution, and that the log likelihood will be proportional to s * log(p1) + f * log(1 - p1) but it is not clear what the following 2 lines are doing - I imagine it's something to do with the normalizing constant, but again there isn't an explanation for this in the book.
The line
like = like * (p > 0) * (p < 1) - 999 * ((p == 0) * (s > 0) + (p == 1) * (f > 0))
takes care of the edge cases when you have prior probability at 0 or 1. Basically, if p=0 and any successes are observed then like=-999 and if p=1 and any failures are observed then like=-999. I would have preferred to use -Inf rather than -999 as that is what the log likelihood is in those cases.
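In most cases the distinction hardly matters numerically, since exp() underflows to exactly 0 long before -999:
exp(-999)   # [1] 0
exp(-Inf)   # [1] 0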
The second line
like = exp(like - max(like))
is a numerically stable way to exponentiate when only the relative differences in the logged values are important. If like were really small, e.g. you had lots of successes and failures, then it is possible that exp(like) would be represented as a 0 vector in a computer. Only the relative differences are important here because you renormalize the product to sum to 1 when constructing the posterior probabilities.
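A minimal usage sketch (the grid and data here are made up, not from the book): a uniform discrete prior on a grid of p values, updated with 6 successes and 4 failures:
p <- seq(0.1, 0.9, by = 0.1)             # grid of candidate success probabilities
prior <- rep(1 / length(p), length(p))   # uniform discrete prior
post <- pdisc(p, prior, c(6, 4))         # data = c(successes, failures)
round(cbind(p, prior, post), 3)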

how to calculate 2^n modulo 1000000007 , n = 10^9

What is the fastest method to calculate this? I saw some people using matrices, and when I searched on the internet they talked about eigenvalues and eigenvectors (no idea about this stuff). There was a question which reduced to the recursive equation
f(n) = 2*f(n-1) + 2, with f(1) = 1,
and n could be up to 10^9.
I already tried using DP (storing up to 1000000 values) and the common fast exponentiation method; it all timed out.
I am generally weak in these modulo questions, which require computing large values.
f(n) = 2*f(n-1) + 2 with f(1) = 1
is equivalent to
f(n) + 2 = 2 * (f(n-1) + 2)
         = ...
         = 2^(n-1) * (f(1) + 2) = 3 * 2^(n-1)
so that finally
f(n) = 3 * 2^(n-1) - 2
where you can then apply fast modular power methods.
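A quick check of the closed form against the recurrence for small n (in R):
f <- function(n) if (n == 1) 1 else 2 * f(n - 1) + 2   # the original recurrence
sapply(1:8, f)     # [1]   1   4  10  22  46  94 190 382
3 * 2^(0:7) - 2    # the closed form gives the same values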
Modular exponentiation by the square-and-multiply method:
function powerMod(b, e, m)
    x := 1
    while e > 0
        if e%2 == 1
            x, e := (x*b)%m, e-1
        else
            b, e := (b*b)%m, e//2
    return x
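Here is a direct R transcription of that pseudocode. Note that plain R doubles stay exact only while intermediate products are below 2^53, so for m = 10^9 + 7 a 64-bit integer type (as in the C code below) is the safer choice; the demo therefore uses a small modulus:
powerMod <- function(b, e, m) {
  x <- 1
  while (e > 0) {
    if (e %% 2 == 1) { x <- (x * b) %% m; e <- e - 1 }
    else             { b <- (b * b) %% m; e <- e %/% 2 }
  }
  x
}
powerMod(2, 50, 12345)   # matches 2^50 %% 12345, which is still exact in doubles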
C code for calculating 2^n mod 1000000007 (res needs a 64-bit type: res * res can reach roughly 10^18, which overflows a 32-bit int):
const long long mod = 1e9 + 7;
// Here the base is assumed to be 2
long long cal_pow(long long x) {
    long long res;
    if (x == 0) res = 1;
    else if (x == 1) res = 2;
    else {
        res = cal_pow(x / 2);   // 2^x = (2^(x/2))^2, times an extra 2 when x is odd
        if (x % 2 == 0)
            res = (res * res) % mod;
        else
            res = (((res * res) % mod) * 2) % mod;
    }
    return res;
}
