I'm having trouble accessing an exported function from a module. The code for the module is
module SteinDistributions
export
# Abstract types
SteinDistribution,
SteinPosterior,
# Specific distributions
SteinDiscrete,
SteinGMMPosterior,
SteinGaussian,
SteinLogisticRegressionPosterior,
SteinLogisticRegressionGaussianPrior,
SteinScaleLocationStudentT,
SteinUniform,
# Common functions operating on distributions
# CDF of a distribution
cdf,
# Get Stein factors
getC1,
getC2,
getC3,
# Lower bound on range of coordinate
supportlowerbound,
# Upper bound on range of coordinate
supportupperbound,
# Number of dimensions of target variable
numdimensions,
# get number of samples when distribution is posterior of data
numdatapoints,
# the log prior for posterior distributions
logprior,
# the likelihood [without a prior for posterior distributions]
loglikelihood,
# Log density [will include prior for posterior distributions]
logdensity,
# Gradient of the prior
gradlogprior,
# Gradient of the log density
gradlogdensity,
# Gradient of the log likelihood
gradloglikelihood,
# Random samples drawn from distribution
rand,
# random GMM samples
randgmm,
# evaluating joint and conditional probability mass functions for
# discrete distributions
jointdistribution,
condlogodds,
conddistribution
# Include abstract types first
include("SteinDistribution.jl")
include("SteinPosterior.jl")
# Include specific distributions
include("SteinDiscrete.jl")
include("SteinGMMPosterior.jl")
include("SteinGaussian.jl")
include("SteinLogisticRegressionPosterior.jl")
include("SteinLogisticRegressionGaussianPrior.jl")
include("SteinScaleLocationStudentT.jl")
include("SteinUniform.jl")
end
yet I encounter the following error
julia> using SteinDistributions: SteinUniform
ERROR: SteinUniform not defined
The output of whos() only lists some of the functions in the module, and I'm unsure what determines which functions are exported:
julia> whos(SteinDistributions)
SteinDistribution DataType
SteinDistributions Module
SteinPosterior DataType
getC1 Function
getC2 Function
getC3 Function
gradlogdensity Function
gradloglikelihood Function
logdensity Function
numdatapoints Function
rand Function
supportlowerbound Function
supportupperbound Function
I was hoping someone could help me with this.
Thanks,
Dar
I am following the book Statistical Rethinking, whose code is in R, and I want to reproduce it in Julia. In the book, they compute the likelihood of six successes out of nine trials, where each success has a probability of 0.5. They do this with the following R code.
#R Code
dbinom(6, size = 9, prob=0.5)
#Out > 0.1640625
I am wondering how to do the same in Julia,
#Julia
using Distributions
b = Binomial(9,0.5)
# It's possible to draw a random value,
rand(b)
#Out > 5
But how do I look at a specific value such as six successes?
I'm sure you know this, but just to be sure: R's dbinom function is the probability mass function of the binomial distribution.
Julia's Distributions package uses multiple dispatch to provide one generic pdf function that can be called with any type of Distribution as the first argument, rather than defining a separate function per distribution, such as dbinom or dnorm (for the normal distribution). So you can do:
julia> using Distributions
julia> b = Binomial(9, 0.5)
Binomial{Float64}(n=9, p=0.5)
julia> pdf(b, 6)
0.1640625000000001
There is also cdf, which works the same way and computes (maybe unsurprisingly) the cumulative distribution function.
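As a language-independent sanity check, the value can be reproduced from the binomial formula itself. This is only a sketch in plain Python; binom_pmf and binom_cdf are names made up for illustration and are not part of Distributions.jl or R.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(k, n, p):
    """P(X <= k): the running sum of the mass function."""
    return sum(binom_pmf(j, n, p) for j in range(k + 1))

print(binom_pmf(6, 9, 0.5))  # 84 / 512 = 0.1640625
print(binom_cdf(6, 9, 0.5))  # 466 / 512
```

The first number matches dbinom(6, size = 9, prob = 0.5) from the book, up to the floating-point rounding visible in the Julia output.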
With the code below I'm calculating the density of a bivariate normal distribution. I use two formulas which should return the same result.
The first uses dmvnorm from the mvtnorm package and the second uses the formula from Wikipedia (https://en.wikipedia.org/wiki/Multivariate_normal_distribution).
When the standard deviation of both distributions equals one (the covariance matrix has only ones on the main diagonal), the results are the same. But when I change the two diagonal entries of the covariance matrix to two or one third, the results no longer agree.
I hope I have read the help properly, and also this document (https://cran.r-project.org/web/packages/mvtnorm/vignettes/MVT_Rnews.pdf).
I also looked at the Stack Overflow question "How to calculate multivariate normal distribution function in R", in case my covariance matrix is defined incorrectly.
But so far I couldn't find an answer.
So my question: why does my code return different results when the standard deviation is not equal to one?
I hope I gave enough information. If something is missing, please comment and I will edit my question.
Many thanks in advance!
And now my code:
library(mvtnorm) # for loading the package if necessary
mu=c(0,0)
rho=0
sigma=c(1,1) # the standard deviation which should be changed to two or one third or… to see the different results
S=matrix(c(sigma[1],0,0,sigma[2]),ncol=2,byrow=TRUE)
x=rmvnorm(n=100,mean=mu,sigma=S)
dim(x) # for control
x[1:5,] # for visualization
# defining a function
Comparison=function(Points=x,mean=mu,sigma=S,quantity=4) {
for (i in 1:quantity) {
print(paste0("The ",i," random point"))
print(Points[i,])
print("The following two results should be the same")
print("Result from the function 'dmvnorm' out of package 'mvtnorm'")
print(dmvnorm(Points[i,],mean=mu,sigma=sigma,log=FALSE))
print("Result from equation out of wikipedia")
print(1/(2*pi*S[1,1]*S[2,2]*(1-rho^2)^(1/2))*exp((-1)/(2*(1-rho^2))*(Points[i,1]^2/S[1,1]^2+Points[i,2]^2/S[2,2]^2-(2*rho*Points[i,1]*Points[i,2])/(S[1,1]*S[2,2]))))
print("----")
print("----")
} # end for-loop
} # end function
# execute the function and compare the results
Comparison(Points=x,mean=mu,sigma=S,quantity=4)
Remember that S is the variance-covariance matrix. The formula you use from Wikipedia is written in terms of the standard deviations, not the variances. Hence you need to plug the square roots of the diagonal entries into the formula. This is also why it works when you choose 1 for the diagonal entries (both the variance and the SD are then 1).
See your modified code below:
library(mvtnorm) # for loading the package if necessary
mu=c(0,0)
rho=0
sigma=c(2,1) # these are the variances (the diagonal of S), not the standard deviations
S=matrix(c(sigma[1],0,0,sigma[2]),ncol=2,byrow=TRUE)
x=rmvnorm(n=100,mean=mu,sigma=S)
dim(x) # for control
x[1:5,] # for visualization
# defining a function
Comparison=function(Points=x,mean=mu,sigma=S,quantity=4) {
for (i in 1:quantity) {
print(paste0("The ",i," random point"))
print(Points[i,])
print("The following two results should be the same")
print("Result from the function 'dmvnorm' out of package 'mvtnorm'")
print(dmvnorm(Points[i,],mean=mu,sigma=sigma,log=FALSE))
print("Result from equation out of wikipedia")
SS <- sqrt(S)
print(1/(2*pi*SS[1,1]*SS[2,2]*(1-rho^2)^(1/2))*exp((-1)/(2*(1-rho^2))*(Points[i,1]^2/SS[1,1]^2+Points[i,2]^2/SS[2,2]^2-(2*rho*Points[i,1]*Points[i,2])/(SS[1,1]*SS[2,2]))))
print("----")
print("----")
} # end for-loop
} # end function
# execute the function and compare the results
Comparison(Points=x,mean=mu,sigma=S,quantity=4)
So your comment when you define sigma is not correct. In your code, sigma is the variances, not the standard deviations if you judge by how you construct S.
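To see the size of the effect without mvtnorm, here is a small sketch in plain Python of the same Wikipedia formula with zero means. Both function names are hypothetical. The first converts variances to standard deviations before applying the formula, and the second reproduces the mistake of plugging the variances in directly.

```python
from math import exp, pi, sqrt

def bvn_density(x, y, var_x, var_y, rho=0.0):
    """Zero-mean bivariate normal density, given VARIANCES and the correlation."""
    sx, sy = sqrt(var_x), sqrt(var_y)  # the formula is written in standard deviations
    z = x**2 / sx**2 + y**2 / sy**2 - 2 * rho * x * y / (sx * sy)
    return exp(-z / (2 * (1 - rho**2))) / (2 * pi * sx * sy * sqrt(1 - rho**2))

def bvn_density_wrong(x, y, var_x, var_y, rho=0.0):
    """Same formula, but with the variances used where the SDs belong."""
    z = x**2 / var_x**2 + y**2 / var_y**2 - 2 * rho * x * y / (var_x * var_y)
    return exp(-z / (2 * (1 - rho**2))) / (2 * pi * var_x * var_y * sqrt(1 - rho**2))

# Variances (2, 1): the two versions disagree.
print(bvn_density(0.5, -1.0, 2, 1), bvn_density_wrong(0.5, -1.0, 2, 1))
# Variances (1, 1): variance and SD coincide, so they agree.
print(bvn_density(0.5, -1.0, 1, 1), bvn_density_wrong(0.5, -1.0, 1, 1))
```

At the origin with variances (2, 1), the correct density is 1/(2*pi*sqrt(2)) while the mistaken version gives 1/(4*pi), which is the kind of discrepancy described in the question.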
If we generate a random vector from the exponential distribution:
exp.seq = rexp(1000, rate=0.10) # mean = 10
Now we want to use the previously generated vector exp.seq to re-estimate lambda.
So we define the log-likelihood function:
fn <- function(lambda){
length(exp.seq)*log(lambda)-lambda*sum(exp.seq)
}
Now with optim or nlm I'm getting very different values for lambda:
optim(lambda, fn) # I get here 3.877233e-67
nlm(fn, lambda) # I get here 9e-07
I used the same technique for the normal distribution and it works fine. So where is the mistake here?
I'm using my own definition for the exponential distribution because I will need to change it later.
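One thing that may be worth checking: both optim and nlm minimize their objective by default, whereas fn above is a log-likelihood that should be maximized, so the optimizers would be pushed towards lambda = 0. The sketch below, in plain Python rather than R, illustrates the idea under that assumption. It minimizes the negative log-likelihood with a simple ternary search (a stand-in, not what optim or nlm do internally) and compares the result with the closed-form estimate n / sum(x). All names are made up for illustration.

```python
import math
import random

random.seed(0)
samples = [random.expovariate(0.10) for _ in range(1000)]  # true rate 0.10, mean 10
n, total = len(samples), sum(samples)

def loglik(lam):
    """Exponential log-likelihood, the same expression as fn in the question."""
    return n * math.log(lam) - lam * total

def negloglik(lam):
    """What a minimizer should be given instead."""
    return -loglik(lam)

def ternary_min(f, lo, hi, iters=200):
    """Minimize a convex function on [lo, hi] by ternary search."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

lam_closed = n / total                            # analytic maximizer of loglik
lam_numeric = ternary_min(negloglik, 1e-6, 1.0)   # numeric minimizer of -loglik
print(lam_closed, lam_numeric)
```

In R, the analogous change would be to pass a function returning the negative of fn, or to use optim's control = list(fnscale = -1), with a starting value for lambda that is actually defined.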
I started using the package boot in R and I am having some trouble understanding the meaning of t and t* on the plots.
A basic code is the following:
library(boot)
mydata <- c(0.461, 3.243, 8.822, 3.442)
meanFunc <- function(mydata, i){mean(mydata[i])}
bootMean <- boot(mydata, meanFunc, 250)
plot(bootMean)
When using the command plot.boot I obtain this graphic:
What does t* represent? Why does the title say "Histogram of t" while the x-axis is labelled t*?
As an added question: how can I modify the properties of this graphic, for example the color, the title or the axes?
Thanks
In the output of boot (bootMean in your case) one can find two types of ts: t0 and t.
From the documentation of ?boot:
t0
The observed value of statistic applied to data.
This is the value of your meanFunc function on the original data set i.e.:
> mean(mydata)
[1] 3.992
This is the value shown in the original column of the t1* row in boot's output:
> bootMean
ORDINARY NONPARAMETRIC BOOTSTRAP
Call:
boot(data = mydata, statistic = meanFunc, R = 250)
Bootstrap Statistics :
original bias std. error
t1* 3.992 0.165301 1.512914
And then you have
t
A matrix with sum(R) rows each of which is a bootstrap replicate of the result of calling statistic
t here represents the matrix (vector in your case) of all the statistics produced according to your R argument i.e. 250 in your case.
So the difference between t and t* is this: t is the matrix of all the bootstrap statistics, i.e. what we would call the random variable in statistics, whereas the t* are the individual estimates of that random variable. In your case you get 250 estimates t*, as determined by the R argument. In other words, t is the matrix and the t* are its elements.
The plot therefore makes sense as well: it is the histogram of the random variable t, and the x-axis shows its estimates, i.e. the t*s.
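The relationship between t0 and t can be sketched without the boot package. The following plain-Python version is only an illustration of ordinary nonparametric resampling, not of boot's internals, and the variable names merely mirror the components described above.

```python
import random
from statistics import mean

random.seed(1)
mydata = [0.461, 3.243, 8.822, 3.442]
R = 250  # number of bootstrap replicates

# t0: the statistic evaluated on the original data.
t0 = mean(mydata)

# t: one value of the statistic per resample drawn with replacement.
t = [mean(random.choices(mydata, k=len(mydata))) for _ in range(R)]

bias = mean(t) - t0  # what boot reports in the "bias" column
print(t0, len(t), bias)
```

A histogram of the 250 values in t is what plot(bootMean) draws in its left panel. The exact replicates differ from boot's because the random number streams are not the same.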
Is there a density function for the two-piece normal distribution on CRAN? I thought I would check before I code one. I have checked the distribution task view, and it is not listed there. I have also looked in a couple of likely packages, but to no avail.
Update: I have added dsplitnorm, psplitnorm, qsplitnorm and rsplitnorm functions to the fanplot package.
If you choose to construct your own version of the distribution, you might be interested in distr. It (and the related packages distrEx, distrSim, distrTEst, distrTeach and distrDoc) have been written to provide a unified interface for constructing new distributions from existing ones. (I constructed this example with the help of the wonderful vignette that accompanies the distrDoc package and which can be gotten by typing vignette("distr").)
This implements the split normal distribution, which may not be exactly what you are after. Using the distr toolset, though, it shouldn't be too hard to adjust this to fit your exact needs.
library(distr)
## Construct the distribution object.
## Here, it's a split normal distribution with mode=0, and lower- and
## upper-half standard deviations of 2 and 1, respectively.
splitNorm <- UnivarMixingDistribution(Truncate(Norm(0,2), upper=0),
Truncate(Norm(0,1), lower=0),
mixCoeff=c(0.5, 0.5))
## Construct its density function ...
dsplitNorm <- d(splitNorm)
## ... and a function for sampling random variates from it
rsplitNorm <- r(splitNorm)
## Compare the density it returns to that from dnorm()
dsplitNorm(-1)
# [1] 0.1760327
dnorm(-1, sd=2)
# [1] 0.1760327
## Sample and plot a million random variates from the distribution
x <- rsplitNorm(1e6)
hist(x, breaks=100, col="grey")
## Plot the distribution's continuous density
plot(splitNorm, to.draw.arg="d")
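For anyone coding the density by hand, here is a rough sketch in plain Python with hypothetical function names. mixture_density mirrors the equal-weight mixture built above, where each half keeps its own normal height and the density therefore jumps at the mode. two_piece_density is the variant with a single shared constant sqrt(2/pi)/(sd_lower + sd_upper), which is continuous at the mode. Which of the two the question is after is an assumption left to the reader.

```python
from math import exp, pi, sqrt

def norm_pdf(x, mean=0.0, sd=1.0):
    """Ordinary normal density."""
    return exp(-((x - mean) ** 2) / (2 * sd**2)) / (sd * sqrt(2 * pi))

def mixture_density(x, mode=0.0, sd_lower=2.0, sd_upper=1.0):
    """Equal-weight mixture of two truncated normals, as in the distr example."""
    # Each truncated half has density 2 * norm_pdf on its side of the mode;
    # the mixing weight 0.5 cancels that factor of 2.
    sd = sd_lower if x < mode else sd_upper
    return norm_pdf(x, mode, sd)

def two_piece_density(x, mode=0.0, sd_lower=2.0, sd_upper=1.0):
    """Two-piece normal with one shared constant, continuous at the mode."""
    sd = sd_lower if x < mode else sd_upper
    const = sqrt(2 / pi) / (sd_lower + sd_upper)
    return const * exp(-((x - mode) ** 2) / (2 * sd**2))

print(mixture_density(-1.0))    # same as norm_pdf(-1, sd=2), about 0.1760327
print(two_piece_density(-1.0))
```

The first printed value agrees with dsplitNorm(-1) above. The second differs because the shared constant gives the wider half more than half of the total probability.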