While studying the kl_divergence implementation for MultivariateNormal, I could not understand torch._C._infer_size(), so I am asking for help. Thanks!
Location: torch/distributions/kl.py
combined_batch_shape = torch._C._infer_size(q._unbroadcasted_scale_tril.shape[:-2],
p._unbroadcasted_scale_tril.shape[:-2])
Full code:
@register_kl(MultivariateNormal, MultivariateNormal)
def _kl_multivariatenormal_multivariatenormal(p, q):
    # From https://en.wikipedia.org/wiki/Multivariate_normal_distribution#Kullback%E2%80%93Leibler_divergence
    if p.event_shape != q.event_shape:
        raise ValueError("KL-divergence between two Multivariate Normals with\
                          different event shapes cannot be computed")
    half_term1 = (q._unbroadcasted_scale_tril.diagonal(dim1=-2, dim2=-1).log().sum(-1) -
                  p._unbroadcasted_scale_tril.diagonal(dim1=-2, dim2=-1).log().sum(-1))
    combined_batch_shape = torch._C._infer_size(q._unbroadcasted_scale_tril.shape[:-2],
                                                p._unbroadcasted_scale_tril.shape[:-2])
    n = p.event_shape[0]
    q_scale_tril = q._unbroadcasted_scale_tril.expand(combined_batch_shape + (n, n))
    p_scale_tril = p._unbroadcasted_scale_tril.expand(combined_batch_shape + (n, n))
    term2 = _batch_trace_XXT(torch.linalg.solve_triangular(q_scale_tril, p_scale_tril, upper=False))
    term3 = _batch_mahalanobis(q._unbroadcasted_scale_tril, (q.loc - p.loc))
    return half_term1 + 0.5 * (term2 + term3 - n)
I hope to understand this method.
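As far as I can tell, torch._C._infer_size is a private helper that returns the broadcast of two shapes, which is why its result can be used to expand both scale_tril factors to a common batch shape. The sketch below is a pure-Python illustration of that broadcasting rule, not the PyTorch source; the name infer_size is made up for this example. In recent PyTorch versions, torch.broadcast_shapes is the public function with this behavior.

```python
def infer_size(shape_a, shape_b):
    """Broadcast two shapes together, aligning them from the right.

    Illustrative re-implementation of the broadcasting rule; the function
    name is hypothetical and this is not PyTorch's own code.
    """
    ndim = max(len(shape_a), len(shape_b))
    # Left-pad the shorter shape with 1s so both have the same length.
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    result = []
    for size_a, size_b in zip(a, b):
        if size_a == size_b or size_b == 1:
            result.append(size_a)
        elif size_a == 1:
            result.append(size_b)
        else:
            raise ValueError(f"shapes {shape_a} and {shape_b} do not broadcast")
    return tuple(result)


# Batch shapes (3, 1) and (4,) combine to (3, 4).
print(infer_size((3, 1), (4,)))
# An empty batch shape broadcasts to the other shape.
print(infer_size((), (5, 2)))
```

In the KL code, the two inputs are the batch dimensions of each distribution's scale_tril (everything except the trailing n x n matrix), so the combined shape plus (n, n) is what both factors are expanded to.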
I am trying to solve the following problem using Julia:
Determine the frequency of students cheating during an exam. If we let N be the total number of students who took the exam, and assuming each student is interviewed post-exam using the following privacy retaining algorithm:
"In the interview process for each student, the student flips a coin, hidden from the interviewer. The student agrees to answer honestly if the coin comes up heads. Otherwise, if the coin comes up tails, the student (secretly) flips the coin again, and answers “Yes, I did cheat” if the coin flip lands heads, and “No, I did not cheat”, if the coin flip lands tails. This way, the interviewer does not know if a “Yes” was the result of a guilty plea, or a Heads on a second coin toss. Thus privacy is preserved and the researchers receive honest answers."
I have written the following code to determine the posterior distribution of cheating students:
const N = 100
const YES_ANSWERS = 35
const SAMPLES = 1000000
using DataStructures
using Distributions
function simulate(N, YES_ANSWERS, SAMPLES)
    d = DefaultDict{Int64, Int64}(0)
    for i in 1:SAMPLES
        cheating = rand(0:N)
        not_cheating = N - cheating
        cheating_honest_yes = rand(Binomial(cheating, 0.5))
        cheating_second_toss = cheating - cheating_honest_yes
        cheating_random_yes = rand(Binomial(cheating_second_toss, 0.5))
        cheating_yes_answers = cheating_honest_yes + cheating_random_yes
        not_cheating_second_toss = rand(Binomial(not_cheating, 0.5))
        not_cheating_random_yes = rand(Binomial(not_cheating_second_toss, 0.5))
        total_yes = cheating_yes_answers + not_cheating_random_yes
        if total_yes == YES_ANSWERS
            d[cheating] += 1
        end
    end
    d
end
d = simulate(N, YES_ANSWERS, SAMPLES)
using Plots
cheaters = [k/N for k in keys(d)]
probability = [v/sum(values(d)) for v in values(d)]
g = scatter(cheaters, probability)
gui(g)
Essentially, I draw samples, keep those that match the observed number of yes answers, and compute the probabilities from the counts.
I then tried to use Turing.jl but got stuck:
using DataStructures
using Distributions
using Turing
@model function CheatingDistribution(N)
    cheating ~ DiscreteUniform(0, N)
    not_cheating = N - cheating
    cheating_honest_yes ~ Binomial(cheating, 0.5)
    cheating_second_toss = cheating - cheating_honest_yes
    cheating_random_yes ~ Binomial(cheating_second_toss, 0.5)
    cheating_yes_answers = cheating_honest_yes + cheating_random_yes
    not_cheating_second_toss ~ Binomial(not_cheating, 0.5)
    not_cheating_random_yes ~ Binomial(not_cheating_second_toss, 0.5)
    total_yes = cheating_yes_answers + not_cheating_random_yes
    total_yes
end
I cannot infer cheating from total_yes.
That said, I could use this model to reproduce the results that I got with the previous code:
using DataStructures
using Distributions
using Turing
@model function CheatingDistribution(N)
    cheating ~ DiscreteUniform(0, N)
    not_cheating = N - cheating
    cheating_honest_yes ~ Binomial(cheating, 0.5)
    cheating_second_toss = cheating - cheating_honest_yes
    cheating_random_yes ~ Binomial(cheating_second_toss, 0.5)
    cheating_yes_answers = cheating_honest_yes + cheating_random_yes
    not_cheating_second_toss ~ Binomial(not_cheating, 0.5)
    not_cheating_random_yes ~ Binomial(not_cheating_second_toss, 0.5)
    total_yes = cheating_yes_answers + not_cheating_random_yes
    total_yes
end
function simulate(N, actual_yes_answers, n_samples)
    d = DefaultDict{Int64, Int64}(0)
    cheatDist = CheatingDistribution(N)
    for _ in 1:n_samples
        result = rand(cheatDist)
        total_yes = result.cheating_honest_yes + result.cheating_random_yes + result.not_cheating_random_yes
        if total_yes == actual_yes_answers
            d[result.cheating] += 1
        end
    end
    cheaters = [k/N for k in keys(d)]
    probability = [v/sum(values(d)) for v in values(d)]
    return cheaters, probability
end
const N = 100 # Number of students
const YES_ANSWERS = 35 # Number of students that answered yes (both cheating and honest students)
const SAMPLES = 1000000
cheaters, probability = simulate(N, YES_ANSWERS, SAMPLES)
using Plots
g = scatter(cheaters, probability)
gui(g)
I then wrote a new version that sums the probabilities instead of sampling. It gives the same result but is more accurate, especially when the number of yes answers is low (for example, when only 3 students answered "Yes, I did cheat"). The following is the code using probabilities:
using DataStructures
using Distributions
function simulate(N, YES_ANSWERS)
    d = DefaultOrderedDict{Int64, Float64}(0)
    for cheating in 0:N
        p1 = 1.0 / (N + 1)  # uniform prior over the N + 1 values 0:N
        not_cheating = N - cheating
        for cheating_honest_yes in 0:cheating
            p2 = p1 * pdf(Binomial(cheating, 0.5), cheating_honest_yes)
            cheating_second_toss = cheating - cheating_honest_yes
            for cheating_random_yes in 0:cheating_second_toss
                p3 = p2 * pdf(Binomial(cheating_second_toss, 0.5), cheating_random_yes)
                cheating_yes_answers = cheating_honest_yes + cheating_random_yes
                for not_cheating_second_toss in 0:not_cheating
                    p4 = p3 * pdf(Binomial(not_cheating, 0.5), not_cheating_second_toss)
                    for not_cheating_random_yes in 0:not_cheating_second_toss
                        p5 = p4 * pdf(Binomial(not_cheating_second_toss, 0.5), not_cheating_random_yes)
                        total_yes = cheating_yes_answers + not_cheating_random_yes
                        if total_yes == YES_ANSWERS
                            d[cheating] += p5
                        end
                    end
                end
            end
        end
    end
    d
end
const N = 100
const YES_ANSWERS = 35
d = simulate(N, YES_ANSWERS)
using Plots
cheaters = [k/N for k in keys(d)]
probability = [v/sum(values(d)) for v in values(d)]
g = bar(cheaters, probability)
gui(g)
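The nested loops above can be collapsed by noting that, under the interview protocol, a cheater answers yes with probability 0.5 + 0.5 * 0.5 = 0.75 and a non-cheater with probability 0.5 * 0.5 = 0.25. The total number of yes answers is then the sum of two binomials. The sketch below is in Python rather than Julia and uses made-up function names; it assumes the same flat prior on the number of cheaters, which cancels on normalization.

```python
from math import comb


def binom_pmf(n, k, p):
    """Probability of k successes in n Bernoulli(p) trials."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def cheating_posterior(n_students, yes_answers):
    """Posterior P(cheating = c | yes_answers) for c = 0..n_students.

    Assumes a flat prior over c; the function name is hypothetical.
    """
    weights = []
    for cheating in range(n_students + 1):
        honest = n_students - cheating
        # Convolve yes answers from cheaters (p = 0.75) and non-cheaters (p = 0.25).
        likelihood = sum(
            binom_pmf(cheating, k, 0.75) * binom_pmf(honest, yes_answers - k, 0.25)
            for k in range(yes_answers + 1)
        )
        weights.append(likelihood)
    total = sum(weights)
    return [w / total for w in weights]


posterior = cheating_posterior(100, 35)
mode = max(range(len(posterior)), key=posterior.__getitem__)
print("most probable number of cheaters:", mode)
```

This needs two nested loops instead of five, so it stays cheap even for larger N.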
I got this toy problem from the book "Bayesian Methods for Hackers". The solution in the book looks like this:
Note that the solution uses a bin size of 10, while I am showing the PMF, so my values are one tenth of those in the solution above.
My question: How do I solve this problem using MCMCChain.jl?
I am struggling to plot the evaluated function and its Chebyshev approximation.
I am using Julia 1.2.0.
EDIT: Sorry, added the complete code.
using Plots
using Printf  # provides @printf
pyplot()
mutable struct Cheb_struct
    c::Vector{Float64}
    min::Float64
    max::Float64
end
function cheb_coeff(min::Float64, max::Float64, n::Int, fn::Function)::Cheb_struct
    struc = Cheb_struct(Vector{Float64}(undef,n), min, max)
    f = Vector{Float64}(undef,n)
    p = Vector{Float64}(undef,n)
    max_plus_min = (max + min) / 2
    max_minus_min = (max - min) / 2
    # Evaluate fn at the Chebyshev nodes mapped onto [min, max]
    for k in 0:n-1
        p[k+1] = pi * ((k+1) - 0.5) / n
        f[k+1] = fn(max_plus_min + cos(p[k+1])*max_minus_min)
    end
    n2 = 2 / n
    for j in 0:n-1
        s = 0.0
        for i in 0:n-1
            s += f[i+1]*cos(j*p[i+1])
        end
        struc.c[j+1] = s * n2  # store once the inner sum is complete
    end
    return struc
end
function approximate(struc::Cheb_struct, x::Float64)::Float64
    x1 = (2*x - struc.max - struc.min) / (struc.max - struc.min)
    x2 = 2*x1
    t = s = 0.0
    # Clenshaw recurrence, highest coefficient first
    for j in length(struc.c):-1:2
        pom = s
        s = x2 * s - t + struc.c[j]
        t = pom
    end
    return (x1 * s - t + struc.c[1] / 2)
end
fn = sin
struc = cheb_coeff(0.0, 1.0, 10, fn)
println("coeff:")
for x in struc.c
    @printf("% .15f\n", x)
end
println("\n x eval approx eval-approx")
for x in struc.min:0.1:struc.max
    eval = fn(x)
    approx = approximate(struc, x)
    @printf("%11.8f %12.8f %12.8f % .3e\n", x,eval, approx, eval - approx)
    display(plot(x=eval,y=approx))
end
I am getting an empty plot window.
I would be very grateful if someone could show me how to plot these two functions.
You should provide working code as an example.
However, the code below shows how to plot:
using Plots
pyplot()
fn = sin
approxf(x) = sin(x)+rand()/10
x = 0:0.1:1
evalv = fn.(x)
approxv = approxf.(x)
p = plot(evalv,approxv)
using PyPlot
PyPlot.display_figs() #needed when running in IDE such as Atom
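The same idea applies to the Chebyshev code from the question: compute both series over the whole grid first, then make a single plot call, instead of plotting one scalar pair per loop iteration. The sketch below is a Python transcription of the question's two functions (the struct is replaced by plain arguments, which is my own simplification). It only builds the two lists and leaves the plotting call to whatever library is in use.

```python
import math


def cheb_coeff(lo, hi, n, fn):
    """Chebyshev coefficients of fn on [lo, hi], sampled at n Chebyshev nodes."""
    mid, half = (hi + lo) / 2, (hi - lo) / 2
    nodes = [math.pi * (k + 0.5) / n for k in range(n)]
    f = [fn(mid + math.cos(p) * half) for p in nodes]
    return [
        2.0 / n * sum(f[i] * math.cos(j * nodes[i]) for i in range(n))
        for j in range(n)
    ]


def approximate(c, lo, hi, x):
    """Evaluate the Chebyshev series at x with the Clenshaw recurrence."""
    x1 = (2 * x - hi - lo) / (hi - lo)  # map x to [-1, 1]
    t = s = 0.0
    for cj in reversed(c[1:]):
        s, t = 2 * x1 * s - t + cj, s
    return x1 * s - t + c[0] / 2


coeffs = cheb_coeff(0.0, 1.0, 10, math.sin)
xs = [i / 10 for i in range(11)]
evalv = [math.sin(x) for x in xs]
approxv = [approximate(coeffs, 0.0, 1.0, x) for x in xs]
max_err = max(abs(e - a) for e, a in zip(evalv, approxv))
print("largest |eval - approx| on the grid:", max_err)
```

With xs, evalv and approxv in hand, both curves can be drawn against xs in one call.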
I was wondering how I can convert this code from Matlab to R. It appears to be code for the midpoint method. Any help would be highly appreciated.
% Usage: [y t] = midpoint(f,a,b,ya,n) or y = midpoint(f,a,b,ya,n)
% Midpoint method for initial value problems
%
% Input:
% f - Matlab inline function f(t,y)
% a,b - interval
% ya - initial condition
% n - number of subintervals (panels)
%
% Output:
% y - computed solution
% t - time steps
%
% Examples:
% [y t]=midpoint(@myfunc,0,1,1,10); here 'myfunc' is a user-defined function in M-file
% y=midpoint(inline('sin(y*t)','t','y'),0,1,1,10);
% f=inline('sin(y(1))-cos(y(2))','t','y');
% y=midpoint(f,0,1,1,10);
function [y t] = midpoint(f,a,b,ya,n)
    h = (b - a) / n;
    halfh = h / 2;
    y(1,:) = ya;
    t(1) = a;
    for i = 1 : n
        t(i+1) = t(i) + h;
        z = y(i,:) + halfh * f(t(i),y(i,:));
        y(i+1,:) = y(i,:) + h * f(t(i)+halfh,z);
    end;
I have R code for the Euler method:
euler <- function(f, h = 1e-7, x0, y0, xfinal) {
  N = (xfinal - x0) / h
  x = y = numeric(N + 1)
  x[1] = x0; y[1] = y0
  i = 1
  while (i <= N) {
    x[i + 1] = x[i] + h
    y[i + 1] = y[i] + h * f(x[i], y[i])
    i = i + 1
  }
  return (data.frame(X = x, Y = y))
}
So, based on the Matlab code, do I need to change h in the Euler method (R code) to (b - a) / n to turn the Euler code into the midpoint method?
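Changing h alone would not be enough: in the Matlab function the update itself differs from Euler's, because the slope is evaluated at the middle of the step using a half-step Euler predictor z. The sketch below restates that loop in Python for a scalar ODE only (the Matlab version also handles row vectors); the function name mirrors the Matlab one, and the test equation y' = y is my own choice.

```python
import math


def midpoint(f, a, b, ya, n):
    """Midpoint method for y' = f(t, y) on [a, b] with n steps (scalar y)."""
    h = (b - a) / n
    t = [a]
    y = [ya]
    for i in range(n):
        # Half-step Euler predictor, as in the Matlab variable z.
        z = y[i] + (h / 2) * f(t[i], y[i])
        # Full step using the slope at the midpoint.
        y.append(y[i] + h * f(t[i] + h / 2, z))
        t.append(t[i] + h)
    return t, y


# Test problem (an assumption for illustration): y' = y, y(0) = 1, so y(1) = e.
t, y = midpoint(lambda t, y: y, 0.0, 1.0, 1.0, 100)
print(y[-1], math.e)
```

In the Euler code, the corresponding change is to replace the single line that updates y with these two lines (predictor, then midpoint slope); the step size can stay a parameter.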
Note
Broadly speaking, I agree with the comments (now deleted); however, I decided to upvote this question because the matconv package exists and facilitates this process.
Answer
Given your code, we could use matconv in the following manner:
pacman::p_load(matconv)
out <- mat2r(inMat = "input.m")
The resulting out object contains an attempted translation of the Matlab code into R; however, the job is far from finished. If you inspect the out object you will see that it requires further work. Simple statements are usually translated correctly, with Matlab comments % replaced by # and so forth, but more complex statements may require closer investigation. You can then inspect the respective lines and attempt to evaluate them to see where further work is required, for example:
eval(parse(text=out$rCode[1]))
NULL
(first line is a comment so the output is NULL)
I'm trying to model a prey-prey-predator system using differential equations based on the LV model. For the sake of precision, I need to use the fourth-order Runge-Kutta (rk4) method.
But given the equations, some of the populations quickly become negative.
So I tried to use the events/root system of ode, but it seems that rk4 and rootfun are not compatible...
eventFunc <- function(t, y, p){
if (y["N1"] < 0) { y["N1"] = 0 }
if (y["N2"] < 0) { y["N2"] = 0 }
if (y["P"] < 0) { y["P"] = 0 }
return(y)
}
rootFunction <- function(t, y, p){
if (y["P"] < 0) {y["P"] = 0}
if (y["N1"] < 0) {y["N1"] = 0}
if (y["N2"] < 0) {y["N2"] = 0}
return(y)
}
out <- ode(func=Model_T2.2,
method="rk4",
y=state,
parms=parameters,
times=times,
events = list(func = eventFunc,
root = TRUE),
rootfun = rootFunction
)
This code gives me the following error:
Error in checkevents(events, times, Ynames, dllname) :
either 'events$time' should be given and contain the times of the events, if 'events$func' is specified and no root function or your solver does not support root functions
Is there any way to use rk4 and prevent the populations from going below 0?
Thanks in advance.
For those who might ask, here is what works :
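if(!require(ggplot2)) {
install.packages("ggplot2"); require(ggplot2)}
if(!require(deSolve)) {
install.packages("deSolve"); require(deSolve)}
if(!require(reshape2)) {  # provides melt(), used below
install.packages("reshape2"); require(reshape2)}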
Model_T2.2 <- function(t, state, par){
with(as.list(c(state, par)), {
response1 <- (a1 * N1)/(1+(a1*h1*N1)+(a2*h2*N2))
response2 <- (a2 * N2)/(1+(a1*h1*N1)+(a2*h2*N2))
dN1 = r1*N1 * (1 - ((N1 + A12 * N2)/K1)) - response1 * P
dN2 = r2*N2 * (1 - ((N1 + A21 * N2)/K2)) - response2 * P
dP = ((E1 * response1) + (E2 * response2)) * P - Mp
return(list(c(dN1, dN2, dP)))
})
}
parameters<-c(
r1=1.42, r2=0.9,
A12=0.6, A21=0.5,
K1=50, K2=50,
a1=0.77, a2=0.77,
b1 = 1, b2=1,
h1=1.04, h2=1.04,
o1=0, o2=0,
Mp=0.22,
E1=0.36, E2=0.36
)
## initial states
state<-c(
P=10,
N1=30,
N2=30
)
times <- seq(0, 30, by=0.5)
out <- ode(func=Model_T2.2,
method="rk4",
y=state,
parms=parameters,
times=times,
events = list(func = eventFunc,
root = TRUE),
rootfun = rootFunction
)
md <- melt(as.data.frame(out), id.vars=1, measure.vars = c("N1", "N2", "P"))
pl <- ggplot(md, aes(x=time, y=value, colour=variable))
pl <- pl + geom_line() + geom_point() + scale_color_discrete(name="Population")
pl
And the result in a graph:
Evolution of prey1, prey2 and predator populations
As you can see, the predator population becomes negative, which is clearly impossible in the real world.
Edit: added the missing variables, sorry about that.
This is a problem you will have with all explicit solvers like rk4. Reducing the time step will help, up to a point. It is better to use a solver with an implicit method; lsoda seems universally available in one form or another.
Another way to explicitly force positive values is to parametrize them as exponentials. Set N1=exp(U1), N2=exp(U2) then the ODE function code translates to (as dN = exp(U)*dU = N*dU)
N1 <- exp(U1)
N2 <- exp(U2)
response1 <- (a1)/(1+(a1*h1*N1)+(a2*h2*N2))
response2 <- (a2)/(1+(a1*h1*N1)+(a2*h2*N2))
dU1 = r1 * (1 - ((N1 + A12 * N2)/K1)) - response1 * P
dU2 = r2 * (1 - ((N1 + A21 * N2)/K2)) - response2 * P
dP = ((E1 * response1*N1) + (E2 * response2*N2)) * P - Mp
For the output you then of course have to reconstruct N1, N2 from the solutions U1, U2.
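To see the effect of the exponential parametrization in isolation, here is a small Python sketch on a toy logistic equation dN/dt = r N (1 - N/K), not the three-species model from the question. The parameter values, the step size, and the hand-written rk4_step helper are all assumptions chosen so that a plain RK4 step overshoots below zero, while the same step applied to U = log(N) cannot produce a negative population.

```python
import math


def rk4_step(f, y, h):
    """One classical fourth-order Runge-Kutta step for the autonomous ODE y' = f(y)."""
    k1 = f(y)
    k2 = f(y + 0.5 * h * k1)
    k3 = f(y + 0.5 * h * k2)
    k4 = f(y + h * k3)
    return y + (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)


R, K = 1.0, 1.0  # hypothetical growth rate and carrying capacity


def dN(N):
    # Logistic growth in the original variable.
    return R * N * (1 - N / K)


def dU(U):
    # Same equation for U = log(N): dU/dt = (dN/dt) / N.
    return R * (1 - math.exp(U) / K)


h = 0.5     # deliberately coarse step
N0 = 30.0   # start far above the carrying capacity

# One direct RK4 step on N overshoots to a negative value.
N_direct = rk4_step(dN, N0, h)

# RK4 on U stays positive after transforming back with exp().
U = math.log(N0)
N_log = []
for _ in range(20):
    U = rk4_step(dU, U, h)
    N_log.append(math.exp(U))

print("direct RK4 after one step:", N_direct)
print("smallest N from the log form:", min(N_log))
```

Positivity here is guaranteed by construction, since exp(U) is positive for any finite U; accuracy still depends on the step size.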
Thanks to J_F, I am now able to run my L-V model.
The radau (not randau, as you mentioned) function does indeed accept root functions and events, and it implements an implicit Runge-Kutta method.
Thanks again, I hope this will help someone in the future.
Suppose you have a recurrence defined by T(n) = T(n/2) + 1. How does one evaluate this without the master method? What I have so far:
T(n) = T(n/2) + 1
T(n/2) = T(n/4) + 1
T(n/4) = T(n/8) + 1
...
T(1) = 1
It looks like this would be O(log n). Is this the only way to solve these problems when the master theorem does not apply?
How did you get T(1) = 1?
Let's see:
T(0) = T(0/2) + 1 => 0 = 1!
So the function T(x) has an asymptote at x = 0.
Please notice that we can rewrite it as:
T(2^(x+1)) = T(2^x) + 1
=>
f(x+1) = f(x) + 1
=>
f(x) = x + a
=>
T(n) = Log2(n) + a(n)
where a(n) is periodic in Log2(n) with period 1 (in the simplest case, a constant)
What makes you think the Master Method doesn't apply? We have:
T(n) = a T(n/b) + f(n)
with
a = 1, b = 2, f(n) = 1
We can see that
c = log_b a = log_2 1 = 0
f(n) = 1 = Theta(1) = Theta(n^0 log^0 n) = Theta(n^c log^k n)
So we can use the result that
T(n) = Theta(n^c log^(k+1) n) = Theta(n^0 log^(0+1) n) = Theta(log n)
Which is your result.
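The Theta(log n) result can also be checked numerically by unrolling the recurrence. The sketch below assumes integer halving (n // 2) and the base case T(1) = 1 from the question; both are choices made for this illustration, since the recurrence as stated does not fix them.

```python
import math


def T(n):
    """Unroll T(n) = T(n // 2) + 1 with the assumed base case T(1) = 1."""
    steps = 1
    while n > 1:
        n //= 2
        steps += 1
    return steps


# Compare the unrolled value with floor(log2(n)) + 1.
for n in (1, 2, 8, 1024, 10**6):
    print(n, T(n), math.floor(math.log2(n)) + 1)
```

Each halving contributes one unit of work, so the count grows by one every time n doubles, which is the logarithmic growth the master method predicts.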