I had some Python code which I tried to port to Julia in order to learn this lovely language. The Python code used generators. After porting, it seems to me (at this moment) that Julia is really slow in this area!
I simplified part of my code to this exercise:
Think of a 4x4 chess board. Find every path N moves long that a chess king could make. In this exercise, the king is not allowed to visit the same position twice in one path. Don't waste memory: make a generator of all paths.
The algorithm is pretty simple:
If we label every position with a number:
0 1 2 3
4 5 6 7
8 9 10 11
12 13 14 15
Point 0 has 3 neighbors (1, 4, 5). We can build a table of the neighbors of every point:
NEIG = [[1, 4, 5], [0, 2, 4, 5, 6], [1, 3, 5, 6, 7], [2, 6, 7], [0, 1, 5, 8, 9], [0, 1, 2, 4, 6, 8, 9, 10], [1, 2, 3, 5, 7, 9, 10, 11], [2, 3, 6, 10, 11], [4, 5, 9, 12, 13], [4, 5, 6, 8, 10, 12, 13, 14], [5, 6, 7, 9, 11, 13, 14, 15], [6, 7, 10, 14, 15], [8, 9, 13], [8, 9, 10, 12, 14], [9, 10, 11, 13, 15], [10, 11, 14]]
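The table does not have to be typed by hand. Here is a minimal sketch that derives it from the board geometry, assuming the row-major, 0-based numbering shown above. The names king_neighbors and NEIG_GENERATED are only illustrative and not part of the original code.

```python
def king_neighbors(size=4):
    """Build the king's neighbor table for a size x size board (0-based labels)."""
    table = []
    for pos in range(size * size):
        row, col = divmod(pos, size)
        neighbors = []
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if dr == 0 and dc == 0:
                    continue  # staying in place is not a move
                r, c = row + dr, col + dc
                if 0 <= r < size and 0 <= c < size:
                    neighbors.append(r * size + c)
        table.append(neighbors)
    return table


# Hypothetical name; should hold the same content as the hand-written NEIG.
NEIG_GENERATED = king_neighbors()
```

Comparing NEIG_GENERATED with the literal NEIG is a cheap way to catch a typo in the hand-written table.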
PYTHON
A recursive function (generator) which extends a given path, taken either from a list of points or from a generator of (generators of ...) paths:
def enlarge(path):
    if isinstance(path, list):
        for i in NEIG[path[-1]]:
            if i not in path:
                yield path[:] + [i]
    else:
        for i in path:
            yield from enlarge(i)
A function (generator) which yields every path of a given length:
def paths(length):
    steps = ([i] for i in range(16))  # first steps on every point on board
    for _ in range(length-1):
        nsteps = enlarge(steps)
        steps = nsteps
    yield from steps
We can see that there are 905776 paths of length 10:
sum(1 for i in paths(10))
Out[89]: 905776
JULIA
(this code was created by @gggg during our discussion here)
const NEIG_py = [[1, 4, 5], [0, 2, 4, 5, 6], [1, 3, 5, 6, 7], [2, 6, 7], [0, 1, 5, 8, 9], [0, 1, 2, 4, 6, 8, 9, 10], [1, 2, 3, 5, 7, 9, 10, 11], [2, 3, 6, 10, 11], [4, 5, 9, 12, 13], [4, 5, 6, 8, 10, 12, 13, 14], [5, 6, 7, 9, 11, 13, 14, 15], [6, 7, 10, 14, 15], [8, 9, 13], [8, 9, 10, 12, 14], [9, 10, 11, 13, 15], [10, 11, 14]];
const NEIG = [n.+1 for n in NEIG_py]
function enlarge(path::Vector{Int})
    (push!(copy(path),loc) for loc in NEIG[path[end]] if !(loc in path))
end
collect(enlarge([1]))
function enlargepaths(paths)
    Iterators.Flatten(enlarge(path) for path in paths)
end
collect(enlargepaths([[1],[2]]))
function paths(targetlen)
    paths = ([i] for i=1:16)
    for newlen in 2:targetlen
        paths = enlargepaths(paths)
    end
    paths
end
p = sum(1 for path in paths(10))
BENCHMARK
In IPython we can time it:
python 3.6.3:
%timeit sum(1 for i in paths(10))
1.25 s ± 15.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
julia 0.6.0
julia> @time sum(1 for path in paths(10))
2.690630 seconds (41.91 M allocations: 1.635 GiB, 11.39% gc time)
905776
Julia 0.7.0-DEV.0
julia> @time sum(1 for path in paths(10))
4.951745 seconds (35.69 M allocations: 1.504 GiB, 4.31% gc time)
905776
Question(s):
We Julians are saying this: It is important to note that the benchmark codes are not written for absolute maximal performance (the fastest code to compute recursion_fibonacci(20) is the constant literal 6765). Instead, the benchmarks are written to test the performance of identical algorithms and code patterns implemented in each language.
This benchmark follows the same idea: just simple for loops over arrays, wrapped in generators. (Nothing from numpy, numba, pandas or other C-written and compiled Python packages.)
Is the assumption that Julia's generators are terribly slow correct?
What could we do to make it really fast?
const NEIG_py = [[1, 4, 5], [0, 2, 4, 5, 6], [1, 3, 5, 6, 7], [2, 6, 7], [0, 1, 5, 8, 9], [0, 1, 2, 4, 6, 8, 9, 10], [1, 2, 3, 5, 7, 9, 10, 11], [2, 3, 6, 10, 11], [4, 5, 9, 12, 13], [4, 5, 6, 8, 10, 12, 13, 14], [5, 6, 7, 9, 11, 13, 14, 15], [6, 7, 10, 14, 15], [8, 9, 13], [8, 9, 10, 12, 14], [9, 10, 11, 13, 15], [10, 11, 14]];
const NEIG = [n.+1 for n in NEIG_py];
function expandto(n, path, targetlen)
    length(path) >= targetlen && return n+1
    for loc in NEIG[path[end]]
        loc in path && continue
        n = expandto(n, (path..., loc), targetlen)
    end
    n
end
function npaths(targetlen)
    n = 0
    for i = 1:16
        path = (i,)
        n = expandto(n, path, targetlen)
    end
    n
end
Benchmark (after executing once for JIT-compilation):
julia> @time npaths(10)
0.069531 seconds (5 allocations: 176 bytes)
905776
which is considerably faster.
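For comparison with the question's Python generators, the same counting recursion can be sketched in Python as well. This is only a rough translation of the Julia functions above, not something that was benchmarked in this thread. It uses 0-based labels, and the neighbor table is rebuilt from the board geometry so that the snippet is self-contained.

```python
SIZE = 4

# Neighbor table for the 4x4 board (0-based), rebuilt here so the sketch
# stands alone; it should match the NEIG literal from the question.
NEIG = [
    [r * SIZE + c
     for r in range(p // SIZE - 1, p // SIZE + 2)
     for c in range(p % SIZE - 1, p % SIZE + 2)
     if 0 <= r < SIZE and 0 <= c < SIZE and r * SIZE + c != p]
    for p in range(SIZE * SIZE)
]


def expandto(path, targetlen):
    """Count self-avoiding king paths of targetlen cells that start with path."""
    if len(path) >= targetlen:
        return 1
    total = 0
    for loc in NEIG[path[-1]]:
        if loc in path:
            continue  # the king may not revisit a cell
        total += expandto(path + (loc,), targetlen)
    return total


def npaths(targetlen):
    """Count all paths of targetlen cells over every starting cell."""
    return sum(expandto((start,), targetlen) for start in range(SIZE * SIZE))
```

Like the Julia version, it never materializes the paths, so it only answers "how many", not "which ones".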
Julia's "better performance" than Python isn't magical. Most of it stems directly from the fact that Julia can figure out what each variable's type within a function will be, and then compile highly specialized code for those specific types. This even applies to the elements in many containers and iterables like generators; Julia often knows ahead of time what type the elements will be. Python isn't able to do this analysis nearly as easily (or at all, in many cases), so its optimizations have focused on improving the dynamic behaviors.
In order for Julia's generators to know ahead of time what kinds of types they might produce, they encapsulate information about both the operation they perform and the object they iterate over in the type:
julia> (1 for i in 1:16)
Base.Generator{UnitRange{Int64},getfield(Main, Symbol("##27#28"))}(getfield(Main, Symbol("##27#28"))(), 1:16)
That weird ##27#28 thing is the type of an anonymous function that simply returns 1. By the time the generator gets to LLVM, it knows enough to perform quite a large number of optimizations:
julia> function naive_sum(c)
           s = 0
           for elt in c
               s += elt
           end
           s
       end
       @code_llvm naive_sum(1 for i in 1:16)
; Function naive_sum
; Location: REPL[1]:2
define i64 @julia_naive_sum_62385({ { i64, i64 } } addrspace(11)* nocapture nonnull readonly dereferenceable(16)) {
top:
; Location: REPL[1]:3
%1 = getelementptr inbounds { { i64, i64 } }, { { i64, i64 } } addrspace(11)* %0, i64 0, i32 0, i32 0
%2 = load i64, i64 addrspace(11)* %1, align 8
%3 = getelementptr inbounds { { i64, i64 } }, { { i64, i64 } } addrspace(11)* %0, i64 0, i32 0, i32 1
%4 = load i64, i64 addrspace(11)* %3, align 8
%5 = add i64 %4, 1
%6 = sub i64 %5, %2
; Location: REPL[1]:6
ret i64 %6
}
It may take a minute to parse through the LLVM IR there, but you should be able to see that it's just extracting the endpoints of the UnitRange (getelementptr and load), subtracting them from each other (sub) and adding one to compute the sum without a single loop.
In this case, though, it works against Julia: paths(10) has a ridiculously complicated type! You're iteratively wrapping that one generator in filters and flattens and yet more generators. It becomes so complicated, in fact, that Julia just gives up trying to figure it out and decides to live with the dynamic behavior. And at this point, it no longer has an inherent advantage over Python; in fact, specializing on so many different types as it recursively walks through the object would be a distinct handicap. You can see this in action by looking at @code_warntype start(1 for i in paths(10)).
My rule of thumb for Julia's performance is that type-stable, devectorized code that avoids allocations is typically within a factor of 2 of C, and dynamic, unstable, or vectorized code is within an order of magnitude of Python/MATLAB/other higher level languages. Often it's a bit slower simply because the other higher level languages have pushed very hard to optimize their case, whereas the majority of Julia's optimizations have been focused on the type-stable side of things. This deeply nested construct puts you squarely in the dynamic camp.
So are Julia's generators terribly slow? Not inherently so; it's just when they become so deeply nested like this that you hit this bad case.
This does not follow the same algorithm (and I don't know how fast Python would be doing it like this), but with the following code Julia is basically on par with Python for paths of length 10, and much faster for paths of length 16:
In [48]: %timeit sum(1 for path in paths(10))
1.52 s ± 11.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
julia> @time sum(1 for path in pathsr(10))
1.566964 seconds (5.54 M allocations: 693.729 MiB, 16.24% gc time)
905776
In [49]: %timeit sum(1 for path in paths(16))
19.3 s ± 15.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
julia> @time sum(1 for path in pathsr(16))
6.491803 seconds (57.36 M allocations: 9.734 GiB, 33.79% gc time)
343184
Here is the code. I just learnt about tasks/channels yesterday, so probably it can be done better:
const NEIG = [[1, 4, 5], [0, 2, 4, 5, 6], [1, 3, 5, 6, 7], [2, 6, 7], [0, 1, 5, 8, 9], [0, 1, 2, 4, 6, 8, 9, 10], [1, 2, 3, 5, 7, 9, 10, 11], [2, 3, 6, 10, 11], [4, 5, 9, 12, 13], [4, 5, 6, 8, 10, 12, 13, 14],
              [5, 6, 7, 9, 11, 13, 14, 15], [6, 7, 10, 14, 15], [8, 9, 13], [8, 9, 10, 12, 14], [9, 10, 11, 13, 15], [10, 11, 14]];
function enlarger(num::Int,len::Int,pos::Int,sol::Array{Int64,1},c::Channel)
    if pos == len
        put!(c,copy(sol))
    elseif pos == 0
        for j=0:num
            sol[1]=j
            enlarger(num,len,pos+1,sol,c)
        end
        close(c)
    else
        for i in NEIG[sol[pos]+1]
            if !in(i,sol[1:pos])
                sol[pos+1]=i
                enlarger(num,len,pos+1,sol,c)
            end
        end
    end
end
function pathsr(len)
    c=Channel(0)
    sol = [0 for i=1:len]
    @schedule enlarger(15,len,0,sol,c)
    (i for i in c)
end
Following tholy's answer, I tried tuples, since they seem to be very fast. This is like my previous code, but using tuples, and it gets substantially better results:
julia> @time sum(1 for i in pathst(10))
1.155639 seconds (1.83 M allocations: 97.632 MiB, 0.75% gc time)
905776
julia> @time sum(1 for i in pathst(16))
1.963470 seconds (1.39 M allocations: 147.555 MiB, 0.35% gc time)
343184
The code:
const NEIG = [[1, 4, 5], [0, 2, 4, 5, 6], [1, 3, 5, 6, 7], [2, 6, 7], [0, 1, 5, 8, 9], [0, 1, 2, 4, 6, 8, 9, 10], [1, 2, 3, 5, 7, 9, 10, 11], [2, 3, 6, 10, 11], [4, 5, 9, 12, 13], [4, 5, 6, 8, 10, 12, 13, 14], [5, 6, 7, 9, 11, 13, 14, 15], [6, 7, 10, 14, 15], [8, 9, 13], [8, 9, 10, 12, 14], [9, 10, 11, 13, 15], [10, 11, 14]];
function enlarget(path,len,c::Channel)
    if length(path) >= len
        put!(c,path)
    else
        for loc in NEIG[path[end]+1]
            loc in path && continue
            enlarget((path..., loc), len,c)
        end
        if length(path) == 1
            path[1] == 15 ? close(c) : enlarget((path[1]+1,),len,c)
        end
    end
end
function pathst(len)
    c=Channel(0)
    path=(0,)
    @schedule enlarget(path,len,c)
    (i for i in c)
end
Since everybody is writing an answer... here is another version, this time using Iterators, which are kind-of more idiomatic than generators in current Julia (0.6.1). Iterators offer many of the benefits generators have. The iterator definition is in the following lines:
import Base.Iterators: start, next, done, eltype, iteratoreltype, iteratorsize
struct SAWsIterator
neigh::Vector{Vector{Int}}
pathlen::Int
pos::Int
end
SAWs(neigh, pathlen, pos) = SAWsIterator(neigh, pathlen, pos)
start(itr::SAWsIterator) =
    ([itr.pos ; zeros(Int, itr.pathlen-1)], Vector{Int}(itr.pathlen-1),
     2, Ref{Bool}(false), Ref{Bool}(false))
@inline next(itr::SAWsIterator, s) =
    ( s[4][] ? s[4][] = false : calc_next!(itr, s) ;
      (s[1], (s[1], s[2], itr.pathlen, s[4], s[5])) )
@inline done(itr::SAWsIterator, s) = ( s[4][] || calc_next!(itr, s) ; s[5][] )
function calc_next!(itr::SAWsIterator, s)
    s[4][] = true ; s[5][] = false
    curindex = s[3]
    pathlength = itr.pathlen
    path, options = s[1], s[2]
    neigh = itr.neigh  # use the iterator's own table rather than a global
    @inbounds while curindex<=pathlength
        curindex == 1 && ( s[5][] = true ; break )
        startindex = path[curindex] == 0 ? 1 : options[curindex-1]+1
        path[curindex] = 0
        i = findnext(x->!(x in path), neigh[path[curindex-1]], startindex)
        if i==0
            path[curindex] = 0 ; options[curindex-1] = 0 ; curindex -= 1
        else
            path[curindex] = neigh[path[curindex-1]][i]
            options[curindex-1] = i ; curindex += 1
        end
    end
    return nothing
end
eltype(::Type{SAWsIterator}) = Vector{Int}
iteratoreltype(::Type{SAWsIterator}) = Base.HasEltype()
iteratorsize(::Type{SAWsIterator}) = Base.SizeUnknown()
Cut-and-pasting the definition above works. SAW is an acronym for Self-Avoiding Walk, a term sometimes used in mathematics for such a path.
Now, to use/test this iterator, the following code can be executed:
allSAWs(neigh, pathlen) =
    Base.Flatten(SAWs(neigh,pathlen,k) for k in eachindex(neigh))
iterlength(itr) = mapfoldl(x->1, +, 0, itr)
using Base.Test
const neigh = [[2, 5, 6], [1, 3, 5, 6, 7], [2, 4, 6, 7, 8], [3, 7, 8],
    [1, 2, 6, 9, 10], [1, 2, 3, 5, 7, 9, 10, 11], [2, 3, 4, 6, 8, 10, 11, 12],
    [3, 4, 7, 11, 12], [5, 6, 10, 13, 14], [5, 6, 7, 9, 11, 13, 14, 15],
    [6, 7, 8, 10, 12, 14, 15, 16], [7, 8, 11, 15, 16], [9, 10, 14],
    [9, 10, 11, 13, 15], [10, 11, 12, 14, 16], [11, 12, 15]]
@test iterlength(allSAWs(neigh, 10)) == 905776
for (i,path) in enumerate(allSAWs(neigh, 10))
    if i % 100_000 == 0
        @show i,path
    end
end
@time iterlength(allSAWs(neigh, 10))
It is relatively readable, and the output looks like this:
(i, path) = (100000, [2, 5, 10, 14, 9, 6, 7, 12, 15, 11])
(i, path) = (200000, [4, 3, 8, 7, 6, 10, 14, 11, 16, 15])
(i, path) = (300000, [5, 10, 11, 16, 15, 14, 9, 6, 7, 3])
(i, path) = (400000, [8, 3, 6, 5, 2, 7, 11, 14, 15, 10])
(i, path) = (500000, [9, 14, 10, 5, 2, 3, 8, 11, 6, 7])
(i, path) = (600000, [11, 16, 15, 14, 10, 6, 3, 8, 7, 12])
(i, path) = (700000, [13, 10, 15, 16, 11, 6, 2, 1, 5, 9])
(i, path) = (800000, [15, 11, 12, 7, 2, 3, 6, 1, 5, 9])
(i, path) = (900000, [16, 15, 14, 9, 5, 10, 7, 8, 12, 11])
0.130755 seconds (4.16 M allocations: 104.947 MiB, 11.37% gc time)
905776
0.13s is not too bad considering this is not as optimized as @tholy's answer, or some others'. Some tricks used in the other answers are deliberately not used here, specifically:
Recursion, which basically uses the stack as a quick way to allocate.
Tuples combined with specialization, which hide some run-time complexity in the first compilation of methods for each tuple signature.
An optimization not yet seen in the answers, which could be important, is using an efficient Bool array or Dict to speed up the check of whether a vertex was already used in the path. In this answer, the findnext triggers an allocation, which can be avoided, and then this answer would be closer to the minimal memory allocation count.
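To illustrate the last point, here is a small Python sketch of the "already visited" check done with a bitmask instead of a scan over the path. It is not the iterator from this answer, just the idea in isolation. It uses 0-based labels, and the name count_paths_bitmask is made up for this example.

```python
SIZE = 4

# Neighbor table for the 4x4 board (0-based), rebuilt so the sketch stands alone.
NEIG = [
    [r * SIZE + c
     for r in range(p // SIZE - 1, p // SIZE + 2)
     for c in range(p % SIZE - 1, p % SIZE + 2)
     if 0 <= r < SIZE and 0 <= c < SIZE and r * SIZE + c != p]
    for p in range(SIZE * SIZE)
]


def count_paths_bitmask(targetlen):
    """Count self-avoiding king paths of targetlen cells (targetlen >= 1).

    Bit k of `visited` is set when cell k is already on the current path,
    so the membership test is a single AND instead of a linear search.
    """
    def walk(pos, visited, length):
        if length == targetlen:
            return 1
        total = 0
        for loc in NEIG[pos]:
            bit = 1 << loc
            if visited & bit:
                continue  # cell already used on this path
            total += walk(loc, visited | bit, length + 1)
        return total

    return sum(walk(start, 1 << start, 1) for start in range(SIZE * SIZE))
```

With 16 cells the whole visited set fits in one machine integer, which is why a bitmask is a natural fit here; a Bool array would serve the same purpose on larger boards.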
This is my quick and dirty cheating experiment (I promised in a comment to add it here), where I am trying to speed up Angel's code:
const NEIG_py = [[1, 4, 5], [0, 2, 4, 5, 6], [1, 3, 5, 6, 7], [2, 6, 7], [0, 1, 5, 8, 9], [0, 1, 2, 4, 6, 8, 9, 10], [1, 2, 3, 5, 7, 9, 10, 11], [2, 3, 6, 10, 11], [4, 5, 9, 12, 13], [4, 5, 6, 8, 10, 12, 13, 14], [5, 6, 7, 9, 11, 13, 14, 15], [6, 7, 10, 14, 15], [8, 9, 13], [8, 9, 10, 12, 14], [9, 10, 11, 13, 15], [10, 11, 14]];
const NEIG = [n.+1 for n in NEIG_py]
function enlargetc(path,len,c::Function)
    if length(path) >= len
        c(path)
    else
        for loc in NEIG[path[end]]
            loc in path && continue
            enlargetc((path..., loc), len,c)
        end
        if length(path) == 1
            if path[1] == 16
                return
            else
                enlargetc((path[1]+1,),len,c)
            end
        end
    end
end
function get_counter()
    let helper = 0
        function f(a)
            helper += 1
            return helper
        end
        return f
    end
end
counter = get_counter()
@time enlargetc((1,), 10, counter) # 0.481986 seconds (2.62 M allocations: 154.576 MiB, 5.12% gc time)
counter.helper.contents # 905776
EDIT: time in comment is without recompilation! After recompilation it was 0.201669 seconds (2.53 M allocations: 150.036 MiB, 10.77% gc time).
Related
How am I able to input this into Sage?
A = {x ∈ ℕ | x ≤ 10}
B = {a, b}
I've looked at the docs and they are not clear. Afterwards, I am trying to find the union of and the difference between them.
This might be what you want.
sage: A = set(0 .. 10)
sage: B = {4, 12}
sage: A.union(B)
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12}
sage: A.intersection(B)
{4}
sage: A.difference(B)
{0, 1, 2, 3, 5, 6, 7, 8, 9, 10}
sage: A.symmetric_difference(B)
{0, 1, 2, 3, 5, 6, 7, 8, 9, 10, 12}
I have two distance vectors, distance and distance_1norm,
and I am trying to get the 5 smallest distances from each of the two vectors and compare whether they are the same.
E.g.
distance = [ 1, 5, 6, 7, 8, 9, 10, 2, 3, 4]
distance_1norm = [1, 8, 6, 7, 8, 9, 10, 2, 3, 4]
After sorting (I can only get this far):
distance = [1,2,3,4,5]
distance_1norm = [1,2,3,4,6]
How can I get their indices, which are
distance = 1st, 8th, 9th, 10th, 2nd
distance_1norm = 1st, 8th, 9th, 10th, 3rd
function smallestn(distance, n)
    sort(distance; alg=Sort.PartialQuickSort(n))[1:n]
end
Use partialsortperm:
julia> partialsortperm([1, 5, 6, 7, 8, 9, 10, 2, 3, 4], 1:5)
5-element view(::Vector{Int64}, 1:5) with eltype Int64:
1
8
9
10
2
julia> partialsortperm([1, 8, 6, 7, 8, 9, 10, 2, 3, 4], 1:5)
5-element view(::Vector{Int64}, 1:5) with eltype Int64:
1
8
9
10
3
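For readers coming from Python, roughly the same result can be sketched with the standard library. This is an analogue written for illustration, not part of the Julia answer. The function name smallest_indices is made up, and the indices are shifted to be 1-based so that they match the Julia output.

```python
import heapq


def smallest_indices(values, n):
    """Return the 1-based indices of the n smallest entries, ordered by value."""
    # nsmallest over the index range, comparing by the value at each index.
    order = heapq.nsmallest(n, range(len(values)), key=values.__getitem__)
    return [i + 1 for i in order]


distance = [1, 5, 6, 7, 8, 9, 10, 2, 3, 4]
distance_1norm = [1, 8, 6, 7, 8, 9, 10, 2, 3, 4]

idx = smallest_indices(distance, 5)              # [1, 8, 9, 10, 2]
idx_1norm = smallest_indices(distance_1norm, 5)  # [1, 8, 9, 10, 3]
same = idx == idx_1norm                          # False: the last index differs
```

Comparing the two index lists directly answers the "are they the same" part of the question.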
Suppose I have a dataset with two columns. I have built a linear regression model on my dataset. Now my question is: how do I check the accuracy of my model?
I found that answer to my question is applying K-fold on my dataset. I know how K-fold works but I have no idea how to implement K-fold in my Julia program.
#suppose I have two columns x and y in my dataset
x= [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]
y=[2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21]
# now how do I use K-fold to split dataset and also evaluate my algorithm?
As mentioned in the comment, it is easier to set up some code once some base source is given. For example, in this case, K-fold cross-validation might need some preparation like the following:
julia> x= [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20];
julia> y=[2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21];
julia> K = 5 # number of folds in validation
5
julia> N = length(x) # number of samples in dataset
20
julia> stops = round.(Int,linspace(1,N,K+1))
6-element Array{Int64,1}:
1
5
9
12
16
20
julia> vsets = [s:e-(e<N)*1 for (s,e) in zip(stops[1:end-1],stops[2:end])]
5-element Array{UnitRange{Int64},1}:
1:4
5:8
9:11
12:15
16:20
julia> tsets1 = [1:s-1 for (s,e) in zip(stops[1:end-1],stops[2:end])]
5-element Array{UnitRange{Int64},1}:
1:0
1:4
1:8
1:11
1:15
julia> tsets2 = [e+(e==N)*1:N for (s,e) in zip(stops[1:end-1],stops[2:end])]
5-element Array{UnitRange{Int64},1}:
5:20
9:20
12:20
16:20
21:20
julia> σ = randperm(N);
julia> [x[σ[vsets[i]]] for i=1:K] # validation sets
5-element Array{Array{Int64,1},1}:
[5, 13, 6, 10]
[16, 4, 2, 3]
[9, 19, 20]
[17, 12, 14, 11]
[8, 1, 18, 7, 15]
julia> [x[vcat(σ[tsets1[i]],σ[tsets2[i]])] for i=1:K] # training sets
5-element Array{Array{Int64,1},1}:
[16, 4, 2, 3, 9, 19, 20, 17, 12, 14, 11, 8, 1, 18, 7, 15]
[5, 13, 6, 10, 9, 19, 20, 17, 12, 14, 11, 8, 1, 18, 7, 15]
[5, 13, 6, 10, 16, 4, 2, 3, 17, 12, 14, 11, 8, 1, 18, 7, 15]
[5, 13, 6, 10, 16, 4, 2, 3, 9, 19, 20, 8, 1, 18, 7, 15]
[5, 13, 6, 10, 16, 4, 2, 3, 9, 19, 20, 17, 12, 14, 11]
This may be satisfactory. For more details regarding K-fold cross validation here is a link to Wikipedia: https://en.wikipedia.org/wiki/Cross-validation_(statistics)#k-fold_cross-validation
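The same splitting idea, written as a compact Python sketch for comparison. It is not taken from any package. The name kfold_indices and the seed parameter are assumptions made for this example, and the indices are 0-based.

```python
import random


def kfold_indices(n, k, seed=None):
    """Shuffle indices 0..n-1 and yield (train, validation) index lists for k folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    # Fold sizes differ by at most one element.
    base, extra = divmod(n, k)
    start = 0
    for fold in range(k):
        stop = start + base + (1 if fold < extra else 0)
        validation = indices[start:stop]
        train = indices[:start] + indices[stop:]  # everything outside the fold
        yield train, validation
        start = stop


folds = list(kfold_indices(20, 5, seed=0))
```

For each pair one would fit the regression on the train indices and measure the error on the validation indices, then average the K errors.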
You can use kfolds from MLDataUtils.jl.
kfolds([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20],5)
How do I find the intersection of two lists, including duplicates, in Mathematica?
So, if I have this:
list1 = {1, 1, 3, 4, 5, 6, 6, 6, 7, 7, 10, 11, 11};
list2 = {1, 1, 4, 5, 5, 6, 6, 7, 7, 8, 11, 11, 13, 14};
I'd want it to return this:
IntersectionIncludingDuplicates[list1, list2] = {1, 1, 4, 5, 6, 6, 7, 7, 11, 11}
Thanks for any and all help!
Here's one way:
Catenate@KeyValueMap[ConstantArray]@
  MapThread[Min, KeyIntersection[Counts /@ {list1, list2}]]
Breaking it down:
Count how many times each element occurs in each list (Counts)
Retain only those elements which occur in both (KeyIntersection)
Take the smaller number of occurrences (MapThread, Min) and replicate the given element that many times (ConstantArray)
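The three steps above amount to a multiset intersection. As a cross-check in another language, here is a short Python sketch of the same idea using collections.Counter, whose & operator keeps the minimum count per element. The function name is simply borrowed from the question.

```python
from collections import Counter


def intersection_including_duplicates(list1, list2):
    """Multiset intersection: each element appears min(count1, count2) times."""
    common = Counter(list1) & Counter(list2)  # per-element minimum of the counts
    return sorted(common.elements())


list1 = [1, 1, 3, 4, 5, 6, 6, 6, 7, 7, 10, 11, 11]
list2 = [1, 1, 4, 5, 5, 6, 6, 7, 7, 8, 11, 11, 13, 14]
result = intersection_including_duplicates(list1, list2)
# result == [1, 1, 4, 5, 6, 6, 7, 7, 11, 11]
```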
Edit: fixed and tested.
You might use the regular Intersection, then Count, something like
ConstantArray[#, Min[Count[list1, #], Count[list2, #]]] & /@
  Intersection[list1, list2] // Flatten
{1, 1, 4, 5, 6, 6, 7, 7, 11, 11}
in functional form generalised to take an arbitrary number of lists:
IntersectionIncludingDuplicates[lists__List] :=
 ConstantArray[#,
    Function[{v}, Min @@ (Count[#, v] & /@ {lists})]@#] & /@
  Intersection[lists] // Flatten
IntersectionIncludingDuplicates[list1, list2]
{1, 1, 4, 5, 6, 6, 7, 7, 11, 11}
Simple code to understand. Assumes input lists are sorted.
list1 = {1, 1, 3, 4, 5, 6, 6, 6, 7, 7, 10, 11, 11};
list2 = {1, 1, 4, 5, 5, 6, 6, 7, 7, 8, 11, 11, 13, 14};
IntersectionIncludingDuplicates[list1_, list2_] := Module[
  {out = {}, i = 1, j = 1},
  While[i <= Length[list1] && j <= Length[list2],
   If[list1[[i]] == list2[[j]],
    AppendTo[out, list1[[i]]];
    i++; j++,
    If[list1[[i]] < list2[[j]], i++, j++]]];
  out]
IntersectionIncludingDuplicates[list1, list2]
{1, 1, 4, 5, 6, 6, 7, 7, 11, 11}
Given two random integer generators one that generates between 1 and 7 and another that generates between 1 and 5, how do you make a random integer generator that generates between 1 and 13? I have tried solving this question in various ways but I have not been able to come up with a solution that generates numbers from 1 to 13 with equal or near equal probability.
Using the top two answers for Expand a random range from 1–5 to 1–7, I've come up with the following. There's probably a more efficient way to do this (maybe using the 1-5 generator?) but this seems to work.
Optimized for Compactness
var j;
do {
  j = 7 * (rand7() - 1) + rand7(); // uniformly random between 1 and 49
} while (j > 39);
// j is now uniformly random between 1 and 39 (39 = 3 * 13, an exact multiple of 13)
j = j % 13 + 1;
Optimized for understandability
var v = [
  [1, 2, 3, 4, 5, 6, 7],
  [8, 9, 10, 11, 12, 13, 1],
  [2, 3, 4, 5, 6, 7, 8],
  [9, 10, 11, 12, 13, 1, 2],
  [3, 4, 5, 6, 7, 8, 9],
  [10, 11, 12, 13, 0, 0, 0],
  [0, 0, 0, 0, 0, 0, 0]
];
var j = 0;
while (j == 0) {
  j = v[rand7() - 1][rand7() - 1];
}
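To see why the rejection step gives a uniform result, here is a Python sketch of the compact variant with the mapping pulled out into its own function, so that it can be checked exhaustively over all 49 pairs. rand7 is simulated with the standard random module, and the helper name pair_to_13 is made up for this example.

```python
import random


def rand7():
    """Stand-in for the given generator: uniform integer in 1..7."""
    return random.randint(1, 7)


def pair_to_13(a, b):
    """Map two rand7 draws to 1..13, or None if the pair must be rejected."""
    j = 7 * (a - 1) + b  # uniform over 1..49
    if j > 39:           # 39 = 3 * 13, so only 1..39 splits evenly
        return None
    return j % 13 + 1


def rand13():
    """Uniform integer in 1..13 built from rand7 by rejection sampling."""
    while True:
        value = pair_to_13(rand7(), rand7())
        if value is not None:
            return value
```

Enumerating all pairs (a, b) shows that each value from 1 to 13 is produced by exactly 3 of the 39 accepted pairs. On average 49/39 rounds, about 2.5 calls to rand7, are needed per result.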