Scope of Random Seed setting - julia

I want to make a function that will always return the same numbers if I input a parameter asking for a deterministic response and will give a requested number of pseudorandom numbers otherwise. Unfortunately the only way I can figure out how to do it resets the global random seed which is not desirable.
Is there a way I can set the random number seed for one draw of pseudorandom numbers without affecting the global seed or the existing progression along that seed's pseudorandom number sequence?
Example Case
using Random
function get_random(n::Int, deterministic::Bool)
if deterministic
Random.seed!(1234)
return rand(n)
else
return rand(n)
end
end
Random.seed!(4321)
# This and the next get_random(5,false) should give the same response
# if the Random.seed!(1234) were confined to the function scope.
get_random(5,false)
Random.seed!(4321)
get_random(5,true)
get_random(5,false)

The simplest solution is to use a newly allocated RNG, like this:
using Random
function get_random(n::Int, deterministic::Bool)
if deterministic
m = MersenneTwister(1234)
return rand(m, n)
else
return rand(n)
end
end
In general, I tend not to use the global RNG in simulations at all, as a dedicated generator gives me better control of the process.
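The same pattern carries over to other languages. As a rough sketch in Python rather than Julia, assuming only the standard-library random module, a private random.Random object plays the role of MersenneTwister(1234). The function name and seed values simply mirror the question.

```python
import random

def get_random(n, deterministic):
    """Return n floats in [0, 1); reproducible when deterministic is True."""
    if deterministic:
        # A private generator: seeding it leaves the module-level RNG untouched.
        local_rng = random.Random(1234)
        return [local_rng.random() for _ in range(n)]
    # Otherwise draw from the shared module-level generator.
    return [random.random() for _ in range(n)]

random.seed(4321)
first = get_random(5, False)

random.seed(4321)
fixed = get_random(5, True)    # does not advance the global sequence
second = get_random(5, False)  # so this repeats `first`
```

Because the deterministic branch never touches the shared generator, the two non-deterministic calls made after the same global seed return identical values.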

Related

Correct way to generate Poisson-distributed random numbers in Julia GPU code?

For a stochastic solver that will run on a GPU, I'm currently trying to draw Poisson-distributed random numbers. I will need one number for each entry of a large array. The array lives in device memory and will also be deterministically updated afterwards. The problem I'm facing is that the mean of the distribution depends on the old value of the entry. Naively, I would therefore have to do something like:
CUDA.rand_poisson!(lambda=array*constant)
or:
array = CUDA.rand_poisson(lambda=array*constant)
Neither of these works, which does not really surprise me, but maybe I just need a better understanding of broadcasting?
Then I tried writing a kernel which looks like this:
function cu_draw_rho!(rho::CuDeviceVector{FloatType}, λ::FloatType)
idx = (blockIdx().x - 1i32) * blockDim().x + threadIdx().x
stride = gridDim().x * blockDim().x
@inbounds for i=idx:stride:length(rho)
l = rho[i]*λ
# 1. variant
rho[i] > 0.f0 && (rho[i] = FloatType(CUDA.rand_poisson(UInt32,1;lambda=l)))
# 2. variant
rho[i] > 0.f0 && (rho[i] = FloatType(rand(Poisson(lambda=l))))
end
return
end
I also tried many slight variations of the above. I get tons of errors about dynamic function calls, which I attribute to the fact that I'm calling functions meant for arrays from my kernels. The second variant, using rand(), works only without the Poisson argument (which comes from the Distributions package, I guess?).
What is the correct way to do this?
You may want CURAND.jl, which provides curand_poisson.
using CURAND
n = 10
lambda = .5
curand_poisson(n, lambda)

Generate non-repeating random number using timestamp in Marklogic (XQuery)?

I want to generate a non-repeating random number that contains a timestamp. What could the code for this look like?
I've tried using the sem:uuid-string() function, but it generates a 36-character string, which is too long.
I'd suggest taking a look at the ml-unique library. It provides 3 different methods for generating unique ids in MarkLogic, and explains the pros and cons of each. Maybe one of those fits your needs, or you can copy the code and adapt it as needed.
Note that a timestamp alone is not enough to guarantee uniqueness, particularly if generating multiple ids in one request, or when processing data in parallel.
The length of uuid string makes the chance of collisions very small by the way.
HTH!
It is not possible to generate a non-repeating random number and have the results fit into a finite size. If 36 bytes is too large, that further limits the theoretical maximum. The server itself uses 64-bit random numbers (effectively xdmp:random) for unique IDs. Attempting to do better, with respect to collision probability, is futile: no matter what URI you use or how long it is, internal references will be created as a 64-bit random number or as a hash value. The recommended methods will not produce an effectively colliding URI with lower probability than the server itself will, given non-colliding URIs of any size. Most likely, attempts at more complex 'random' URI generation will give much worse results due to the subtlety of pseudorandom number algorithms.
The code below generates (with arbitrarily high probability) 10 different random numbers. Every iteration of the for loop inserts a newly generated random number into the MarkLogic database. The exception error((), 'BREAK') is thrown once 10 different numbers have been generated.
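To put rough numbers on the collision argument, the standard birthday approximation p ≈ 1 - exp(-n(n-1)/(2N)) can be evaluated for n random ids drawn from a space of N = 2^bits values. The sketch below is plain Python, not MarkLogic code; the function name is made up for illustration, and it assumes the ids are uniformly random.

```python
import math

def collision_probability(n_ids, bits):
    """Birthday-bound estimate of at least one collision among n_ids random values."""
    space = 2.0 ** bits
    # 1 - exp(-n(n-1)/(2N)); expm1 keeps precision when the result is tiny.
    return -math.expm1(-n_ids * (n_ids - 1) / (2.0 * space))

# One million ids in a 64-bit space versus a 32-bit space.
p64 = collision_probability(10**6, 64)
p32 = collision_probability(10**6, 32)
```

Under these assumptions a million 64-bit ids collide with a probability of a few in a hundred million, while the same number of 32-bit ids are almost certain to collide, which is why shortening an id has a real cost.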
xquery version "1.0-ml";
xdmp:document-insert("/doc/random.xml",<root><a>{xdmp:random(100)}</a></root>);
try {
for $i in (1 to 200) (: 200 can be replaced with a larger number to reduce the probability that 10 different random numbers are never selected :)
return xdmp:invoke-function( function() as item()?
{ let $myrandom:= xdmp:random(100), $last:= count(doc("/doc/random.xml")/root/*)
return
if ($last lt 10) then (
if (doc("/doc/random.xml")/root/a/text() = $myrandom) then () else (xdmp:node-insert-after(doc("/doc/random.xml")/root/a[last()], <a>{$myrandom}</a>)))
else (if ($last eq 10) then (error((), 'BREAK')) else ())},
<options xmlns="xdmp:eval">
<transaction-mode>update</transaction-mode>
<transaction-mode>update-auto-commit</transaction-mode>
</options>)}
catch ($ex) {
if ($ex/error:code eq 'BREAK') then ("10 different random numbers were generated") else xdmp:rethrow() };

Fix seed via srand() produces different results from rand()

In my Julia 0.5 script I use srand(1234) to get the same results from rand() each time I re-run the script. However, I get different results. What am I doing wrong?
As @Dan Getz mentioned in the comments, this is likely because you have some code that calls random functions without you knowing about it.
If you call the same rand() function with the same seed set, you get the same results as expected:
julia> for i in 1:3
srand(1)
println(rand())
end
0.23603334566204692
0.23603334566204692
0.23603334566204692
However, if you have another call in your script to rand that may or may not be called, then your random number generator will be at different stages when you get to the investigated rand() call. Here's an example to illustrate this:
julia> for i in 1:3
srand(1)
if i == 2
rand()
end
println(rand())
end
0.23603334566204692
0.34651701419196046
0.23603334566204692
Notice how in the second iteration of the loop there's an extra rand() call that offsets the random number generator and results in a different value.
In addition to the answer given by @niczky12, I would recommend that you define your own generator and use it for better reproducibility. That way you always keep control of "your" generator, and calls to other functions (perhaps not in your control) that use the global one will not affect the random numbers you obtain.
For example, creating a MersenneTwister with seed 1234:
rng = MersenneTwister(1234)
Then you simply pass this generator to your rand calls:
julia> rng = MersenneTwister(1234);
julia> rand(rng)
0.5908446386657102
julia> rand(rng, 2, 3)
2×3 Array{Float64,2}:
0.766797 0.460085 0.854147
0.566237 0.794026 0.200586
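The isolation this buys can be checked directly. The following is a small Python sketch (standard-library random only, with a made-up simulate helper) in which calls to the global generator are interleaved with draws from a private one.

```python
import random

def simulate(rng, noisy):
    """Draw three numbers from rng, optionally interleaving global-RNG calls."""
    out = []
    for _ in range(3):
        if noisy:
            random.random()  # stands in for other code using the global RNG
        out.append(rng.random())
    return out

quiet = simulate(random.Random(1234), noisy=False)
disturbed = simulate(random.Random(1234), noisy=True)
```

Both runs yield the same three numbers, since the extra global draws never advance the private generator.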

How to get the seed of the current state of the random generator in order to pass it to set.seed()

I have to repeat a statistical procedure based on pseudorandom numbers many times (about 100 000); the procedure is written in pure R. After each step (call it an iteration) I would like to get the current state of the random generator (getting the seed would be the proper way, I suppose). From each iteration I collect only part of the output, because the whole output is too large to store (I keep the value of the optimized goal function and a few other statistics). After inspecting the collected output (100 000 entries long), I would like to pick the best solution and run the corresponding procedure again; for this I need to set the state of the random generator to the one that corresponds to the chosen solution. There is set.seed, but getting the seed is not straightforward. There is .Random.seed, but how could it help with the above problem?
Call set.seed(x) at the beginning of each iteration. Make sure you can identify the seed that was used before you started the process, so that you can use it later. For example:
for (seed in seeds) {
set.seed(seed)
print(sprintf('using seed = %d\n', seed))
do_your_stuff(...)
}
In a comment you asked:
how to choose seed in proper manner - shouldn't it be some "random" prime numbers not the simple series of integers (if we talk about vector of containing seeds) ?
I'm not sure how it matters if seeds is simply a sequence (like 1:100) or random prime numbers. As far as I know, any seed number X is just as good as any other Y. But if that's important to you, then you can grab a list of prime numbers from somewhere (for example here) and use sample to randomize them, for example:
seeds <- sample(c(7, 17, 19, 23, 1019, 1021))
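The whole workflow (seed each iteration, keep only a summary, then rerun the best one) might look like the following Python sketch. The procedure function is a hypothetical stand-in for the real statistical procedure, and the seed range is arbitrary.

```python
import random

def procedure(rng):
    """Hypothetical stand-in for one stochastic run; returns (score, details)."""
    sample = [rng.gauss(0.0, 1.0) for _ in range(100)]
    return max(sample), sample

seeds = range(1, 51)
scores = {}
for seed in seeds:
    # A fresh generator per iteration, seeded with a value we record.
    score, _ = procedure(random.Random(seed))
    scores[seed] = score  # keep only the small summary

# Pick the best iteration and reproduce it in full from its seed.
best_seed = max(scores, key=scores.get)
best_score, best_details = procedure(random.Random(best_seed))
```

Only the seed and the score are stored per iteration, yet the rerun reproduces the chosen iteration exactly because it starts from the same seed.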

Convergence using for loop in R

I have the following code:
for(n in 1:1000){
..............
}
This will run ............ 1000 times. I haven't included the full code because it's extremely long and not relevant to the answer.
My question is: is there any way I can get the code to run until it reaches a specified convergence value, to four decimal places? Initial values are fed into this equation, which generates new values, and the process iterates until convergence is attained (as specified above).
EDIT
I have a set of 4 values at the end of my code with different labels (A, B, C, D). Within my code there are two separate functions, which each calculate different values and feed each other. So when I say convergence, I mean that function 1 passes specific values to function 2, which calculates new values for A, B, C and D, and the cycle continues until these values are the same as those calculated by function 2 the previous time.
The key question I'm asking here is what form the code should take (the answers below suggest that repeat is preferable) and how to code the convergence criterion correctly, as the assignment notation for successive iterations will be the same.
Just making an answer out of my comment, I think often repeat will be the best here. It doesn't require you to evaluate the condition at the start and doesn't stop after a finite number of iterations (unless of course that is what you want):
repeat
{
# Do stuff
if (condition) break
}
If you are just looking for a way of exiting for loops you can just use break.
for (n in 1:1000)
{
...
if (condition)
break;
}
You could always just use a while loop if you don't know how many iterations it will take. The general form could look something like this:
while(insert_convergence_check_here){
insert_your_code_here
}
Edit: In response to nico's comment I should add that you could also follow this pattern to essentially create a do/while loop in case you need the loop to run at least once before you can check the convergence criteria.
continue_indicator <- TRUE
while(continue_indicator){
insert_your_code_here
continue_indicator <- convergence_check_here
}
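None of the patterns above spells out the convergence check itself. One way to write it is to keep the previous values, compute the new ones, and stop when every value has changed by less than the tolerance. The sketch below is in Python rather than R; the update function is a made-up example standing in for the two functions that feed each other, and 1e-4 is assumed to be what "four decimal places" means.

```python
def iterate_until_converged(update, start, tol=1e-4, max_iter=1000):
    """Apply update repeatedly until no component changes by more than tol."""
    current = start
    for n in range(1, max_iter + 1):
        new = update(current)
        # Compare the new values with the ones from the previous iteration.
        if all(abs(a - b) < tol for a, b in zip(new, current)):
            return new, n
        current = new
    raise RuntimeError("no convergence within max_iter iterations")

# Hypothetical update: each of A, B, C, D moves halfway towards 1.0.
def halve_gap(values):
    return tuple(v + 0.5 * (1.0 - v) for v in values)

result, steps = iterate_until_converged(halve_gap, (0.0, 2.0, 5.0, -3.0))
```

Storing the previous values under a separate name (current versus new) resolves the concern that successive iterations assign to the same variables, and max_iter guards against a loop that never converges.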
