How can I generate random numbers from a bivariate gamma distribution? The density is:
$f_{X,Y}(x, y) = \dfrac{\alpha^{p+q}\, x^{p-1} (y-x)^{q-1} e^{-\alpha y}}{\Gamma(p)\,\Gamma(q)}\, \mathbb{1}_{0 \le x \le y}$
with $y > x > 0$, $\alpha > 0$, $p > 0$ and $q > 0$.
I did not find any R package that does this, and nothing in the literature.
This is straightforward:
Generate X~ Gamma(p,alpha) (alpha being the rate parameter in your formulation)
Generate W~ Gamma(q,alpha), independent of X
Calculate Y=X+W
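(X, Y) then has the required bivariate distribution: X and W are independent and the map (x, w) -> (x, x + w) has unit Jacobian, so the joint density of (X, Y) is f_X(x) f_W(y - x), which is exactly the density above.
In R (assuming p, q, alpha and n are already defined):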
x <- rgamma(n,p,alpha)
y <- x + rgamma(n,q,alpha)
This generates n values from the bivariate distribution with parameters p, q and alpha.
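As a quick sanity check of this construction, the sketch below wraps the two lines in a helper and compares sample moments with their theoretical values. The helper name rbigamma and the parameter values are illustrative assumptions, not part of any package. Because Y = X + W with W independent of X, we expect E[X] = p/alpha, E[Y] = (p + q)/alpha and Cov(X, Y) = Var(X) = p/alpha^2.

```r
# Hypothetical helper (not from any package): draws n pairs (X, Y)
rbigamma <- function(n, p, q, alpha) {
  x <- rgamma(n, shape = p, rate = alpha)
  y <- x + rgamma(n, shape = q, rate = alpha)
  cbind(x = x, y = y)
}

set.seed(42)
p <- 2; q <- 3; alpha <- 1.5          # illustrative values only
xy <- rbigamma(1e5, p, q, alpha)

# Sample moments next to their theoretical counterparts
moments <- rbind(
  sample = c(mean(xy[, "x"]), mean(xy[, "y"]), cov(xy[, "x"], xy[, "y"])),
  theory = c(p / alpha, (p + q) / alpha, p / alpha^2)
)
colnames(moments) <- c("E[X]", "E[Y]", "Cov(X,Y)")
print(round(moments, 3))
```

The two rows should agree up to Monte Carlo error, and every simulated pair satisfies y > x, as the indicator in the density requires.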
Related
I need to plot the CDF of a generalized pareto distribution when x is greater than 100,000,000 with location parameter = 100,000,000, scale parameter = 49,761,000 and shape parameter = 0.10. The CDF starts at prob. 0.946844, the values below 100,000,000 are modeled by a uniform distribution. I only need to plot the CDF of the GPD.
library(DescTools)
x <- c(100000001:210580000)
pareto_distribution <- dGenPareto(x, 100000000, 49761000, 0.10)
graph <- data.frame(loss = x, probability = pareto_distribution)
plot(graph)
When I try the code above, the probabilities start at 0. I know that dGenPareto gives the pdf rather than the CDF, but I was starting with the pdf and then going to calculate the CDF from it. How do I restrict the GPD so that its CDF starts at 0.946844 instead of zero?
I am expecting the CDF of GPD to start at 0.946844 when x = 100,000,000. The x values are discrete.
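One way to read the requirement is as a spliced distribution: the body below the threshold carries probability p0 = 0.946844, and the GPD describes the remaining 1 - p0 of mass above it. Under that assumption the overall CDF for x at or above the threshold is p0 + (1 - p0) * G(x), where G is the GPD CDF. The sketch below writes G out in base R using the standard closed form for a nonzero shape parameter; the variable names and the coarser plotting grid are illustrative choices.

```r
mu    <- 100000000   # location (threshold)
sigma <- 49761000    # scale
xi    <- 0.10        # shape
p0    <- 0.946844    # probability mass below the threshold (from the question)

# Standard GPD CDF for xi != 0, valid for x >= mu
gpd_cdf <- function(x) 1 - (1 + xi * (x - mu) / sigma)^(-1 / xi)

# Assumed splice: the GPD only models the upper 1 - p0 of the probability
spliced_cdf <- function(x) p0 + (1 - p0) * gpd_cdf(x)

# A coarse grid is enough for plotting; 110 million integer points are not needed
x <- seq(mu, 210580000, length.out = 1000)
plot(x, spliced_cdf(x), type = "l", xlab = "loss", ylab = "cumulative probability")
```

With this scaling the curve starts at 0.946844 when x = 100,000,000 and approaches 1 as x grows.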
I'm trying to calculate the cumulative distribution function of the skewed generalized error distribution, using the probability density function from Theodossiou (http://www.mfsociety.org/modules/modDashboard/uploadFiles/journals/MJ~0~p1a4fjq38m1k2p45t6481fob7rp4.pdf):
And in R it looks like this:
psi <- -0.09547862
m <- 0.1811856
g <- -0.1288893
d <- 0.8029088
c <- (2/(1+exp(-g)))-1
p <- exp(psi)
y <- function(x) ((d**(1-(1/d)))/(2*p))*gamma(1/d)**(-1)*exp(-(1/d)*((abs(x-m)**d)/((1+sign(x-m)*c)**(d)*p**(d))))
The whole reason I do this is to fit the skewed generalized error distribution to my data and assess the fit by creating a QQ-plot. So now I need to calculate the cumulative distribution function and then the inverse CDF. For the inverse CDF I plan to use the inverse function from the GoFKernel package, but for this I need the CDF. Is there any way to calculate it with numerical integration in R?
To get a cumulative function via integration, you can pass the x-values to a function that integrates from a suitably extreme low value up to an upper limit of x:
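# Define a grid of x-values, then look at the density function
x <- seq(-10, 10, 0.1)
plot( y(x) ~ x )
# Integrate the density from -10 up to each grid point
cum <- sapply(x, function(u) integrate(y, -10, u)$value )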
plot( cum ~ x)
# So the inverse is just `x` as a function of `cum`
plot( x ~ cum)
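For the QQ-plot an explicit quantile function is more convenient than reading the inverse off a plot. The following is a minimal sketch of one way to get it: integrate the density numerically for the CDF, then invert it with uniroot. The names dens, cdf and quant, the renaming of c to c0, and the search interval are assumptions made for this illustration.

```r
# Parameters and density as in the question (c renamed to c0 to avoid masking base::c)
psi <- -0.09547862; m <- 0.1811856; g <- -0.1288893; d <- 0.8029088
c0 <- 2 / (1 + exp(-g)) - 1
p  <- exp(psi)
dens <- function(x) {
  d^(1 - 1/d) / (2 * p) / gamma(1/d) *
    exp(-(1/d) * abs(x - m)^d / ((1 + sign(x - m) * c0)^d * p^d))
}

# CDF by numerical integration, split at the kink of the density at x = m
cdf <- function(q) {
  left_mass <- integrate(dens, -Inf, m)$value
  sapply(q, function(u) {
    if (u <= m) integrate(dens, -Inf, u)$value
    else left_mass + integrate(dens, m, u)$value
  })
}

# Quantile function by root finding on the CDF (the search interval is an assumption)
quant <- function(prob, lower = -50, upper = 50) {
  sapply(prob, function(pr) uniroot(function(u) cdf(u) - pr, lower, upper)$root)
}

probs <- c(0.05, 0.25, 0.5, 0.75, 0.95)
print(cbind(prob = probs, quantile = quant(probs)))
```

Given observations in a vector dat (hypothetical), plot(quant(ppoints(length(dat))), sort(dat)) would then draw theoretical against empirical quantiles.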
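In general, if you want to estimate the cumulative distribution function from a sample of observations, use the function ecdf. Note that in the call below it is applied to the density values y(x), so it returns the empirical CDF of those values rather than the CDF of the distribution itself: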
x <- seq(-10,10,0.1)
Fn <- ecdf(y(x))
plot(Fn)
If you want to visualize how two data sets are similar, use qqplot as follows:
y1 <- y(x) # from your function
y2 <- rnorm(100) # some generic data
qqplot(y1, y2) # if the two data sets are from the same
# distribution, you should see a straight line
The input I'm giving to the GLM function is:
glm(family=fam,data=regFrame1,start=starter1,formula=as.formula(paste(yvar,"~.+0")),na.action=na.exclude,y=T)
Where the family is Gamma and the link function is identity.
I'm trying to manually reproduce the coefficients from my model where one of them is for example:
              Estimate Std. Error t value Pr(>|t|)
coefficient A 480.6062   195.2952   2.461 0.013902 *
I know the equation I need for coefficient A is:
$\hat{\beta} = (X^T X)^{-1} X^T Y$
Where y is my dependent variable and x is my independent variable.
In R I write this to produce βA:
# x transposed multiplied by x when both are matrices
xtx <- t(x) %*% x
# x transposed multiplied by y when both are matrices
xty <- t(x) %*% y
# we need to inverse xtx
xtxinv <- solve(xtx, tol=0)
# finally we multiply the inverse of xtx by xty to get betaHat
betaHat <- xtxinv %*% xty
betaHat = 148
When I complete this calculation manually, I get the coefficient that glm produces with its default Gaussian family, i.e. when no family is specified. That call looks like this:
glm(data=regFrame1,formula=as.formula(paste(yvar,"~.+0")),na.action=na.exclude,y=T)
So the question is: how do I tailor my manual calculation to the Gamma family with identity link, instead of the Gaussian identity default used by the glm.fit function in R?
The only two differences between my two runs of the glm function are:
providing the family (Gamma identity)
giving the model starting values (100 for each column in the dataframe)
I tried to recreate the glm.fit function manually to get the coefficient (beta). When I didn't provide a family or starting values I got the correct answer, but when I gave Gamma as the family and identity as the link, with starting values, I got a very different coefficient.
For linear regression, which is fit by least squares, $\hat{\beta}$ is indeed $(X^T X)^{-1} X^T Y$. A generalized linear model, however, is fit by iteratively reweighted least squares, which is an iterative algorithm, so there is no closed-form expression for $\hat{\beta}$. We can still compute the analogue of the hat matrix $H$ from linear regression. In linear regression the hat matrix is $H = X (X^T X)^{-1} X^T$. In a generalized linear model the analogue is $H = W X (X^T W X)^{-1} X^T$, where $W = \mathrm{diag}(\mu'(X\beta))$. In both cases $Hy$ gives the fitted values $\hat{y}$. Here is code to demonstrate.
#' Test that the two parameterizations of Gamma are the same
curve(dgamma(x, 3, scale=3), xlim=c(0, 10))
grid <- seq(0, 10, length=1000)
d <- 1/grid/gamma(3)*(grid/(1/3)/9)^3*exp(-grid/3)
plot(grid, d, type='l')
#' Generate random variates according to GLM with
#' Y_i ~ Gamma(mean=mu,
#' squared coefficient of variation (variance over squared mean) = phi)
#' Y_i ~ Gamma(shape=alpha, scale=beta)
#' mu = alpha*beta
#' phi= 1/alpha
#' Let Beta = (3, 4)
set.seed(123)
X <- data.frame(x1=runif(1000, 0, 10))
mu = (3+4*X$x1)^(-1)
y=NULL
for (i in 1:1000) {
alpha = 1/3
beta = mu[i] * 3
y[i]=rgamma(1, alpha, scale=beta)
}
#' Fit the model and compute the hat matrix, then the fitted values manually
mod <- glm(y ~ ., family=Gamma(), data=X)
x <- as.matrix(cbind(1, X))
W=diag(c(-(x%*%c(3, 4))^(-2)))
H=W%*%x%*%solve(t(x)%*%W%*%x)%*%t(x)
#Manual fitted values
head(H%*%y)
#Fitted values from model
head(mod$fitted.values)
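For the specific case asked about (Gamma family, identity link), the iteratively reweighted least squares loop can also be written out by hand. The following is a minimal sketch on simulated data, not a reimplementation of glm.fit: with the identity link the working response reduces to y, and with the Gamma variance function V(mu) = mu^2 the working weights are 1/mu^2, so each step is a weighted least squares solve. The data, true coefficients and starting values are assumptions chosen for illustration.

```r
set.seed(1)
n  <- 500
x1 <- runif(n, 1, 10)
X  <- cbind(1, x1)                       # design matrix with an intercept column
mu_true <- drop(X %*% c(5, 2))           # identity link: the mean is linear in x1
shape   <- 4
y  <- rgamma(n, shape = shape, scale = mu_true / shape)

start <- c(1, 1)                         # starting values, also passed to glm below
beta  <- start
for (iter in 1:100) {
  mu <- drop(X %*% beta)                 # identity link: mu equals the linear predictor
  w  <- 1 / mu^2                         # working weights 1 / V(mu), with V(mu) = mu^2
  z  <- y                                # working response: eta + (y - mu) * 1 = y
  beta_new <- drop(solve(t(X) %*% (w * X), t(X) %*% (w * z)))
  converged <- max(abs(beta_new - beta)) < 1e-10
  beta <- beta_new
  if (converged) break
}

fit <- glm(y ~ x1, family = Gamma(link = "identity"), start = start,
           control = glm.control(epsilon = 1e-12, maxit = 100))
print(cbind(manual = beta, glm = unname(coef(fit))))
```

The one-shot formula $(X^T X)^{-1} X^T Y$ corresponds to a single step with all weights equal, which is why it reproduces the Gaussian fit but not the Gamma one.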
definition of the bivariate distribution
If I have the following joint probability density function:
The joint PMF
i.e., the joint probability mass function (pmf) is built from the following pmf and cumulative distribution function
The marginal pmf
The R code of the marginal distribution (i.e. the pmf of the DIW distribution) is as follows:
ddiw <- function(x, t, eta) {  # t is the theta parameter
  stopifnot(eta > 0, x >= 0)
  # ^ is right-associative in R; the parentheses only make that explicit
  pmf <- t^((x + 1)^(-eta)) - t^(x^(-eta))
  return(pmf)
}
Its cumulative distribution function in R is as follows
pdiw <- function(x, t, eta) {
  stopifnot(eta > 0)
  cdf <- t^((x + 1)^(-eta))
  return(cdf)
}
I want to write the joint pmf in R as in equation (4).
I tried to write the joint pmf in equation (4) in R, but I did not succeed.
Also, I want to plot the joint pmf in R as in the following figure:
3D plot of bivariate distribution
Could you help me write the joint pmf in R and plot it as in the given figure?
Thanks in advance.
Edit
To be clearer, I write the joint pmf as follows:
The joint pmf
The joint pmf when substituting the pmf of DIW and substituting cdf of DIW
Not sure if this is helpful, but you could write the function along the lines:
joint_pmf <- function(x1, x2, eta) {
stopifnot(
all(x1>0), all(x2>0),
all(is.finite(x1)), all(is.finite(x2)),
length(x1)==length(x2)
)
# the result vector
n <- length(x1)
pmf <- rep(NA, n)
# check for the first condition
idx <- which(x1<x2)
pmf[idx] <- NA # here you need to fill in
# check for the second condition
idx <- which(x1>x2)
pmf[idx] <- NA # here you need to fill in
# and so on..
# return the result vector
pmf
}
This function accepts vectors for the x1, x2 arguments. Since the pmf is defined piecewise, the result is computed for subsets only.
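For the plotting part, once a vectorized joint pmf exists it can be evaluated on a grid with outer and drawn with persp. The sketch below only shows that mechanism. Since equation (4) is not reproduced here, it uses a placeholder joint pmf (the product of two DIW marginals, i.e. independence), which is an assumption for illustration and not the pmf from the question. The parameter values and the name joint_pmf_placeholder are likewise hypothetical.

```r
# Marginal DIW pmf as defined in the question (t is the theta parameter)
ddiw <- function(x, t, eta) t^((x + 1)^(-eta)) - t^(x^(-eta))

# PLACEHOLDER joint pmf: product of two DIW marginals (independence).
# This is an assumption for illustration only, NOT equation (4).
joint_pmf_placeholder <- function(x1, x2, t, eta) {
  ddiw(x1, t, eta) * ddiw(x2, t, eta)
}

theta <- 0.5; eta <- 1.5        # illustrative parameter values
x1 <- 0:10
x2 <- 0:10

# outer() evaluates the pmf at every (x1, x2) combination on the grid
z <- outer(x1, x2, joint_pmf_placeholder, t = theta, eta = eta)

persp(x1, x2, z, theta = 40, phi = 25, ticktype = "detailed",
      xlab = "x1", ylab = "x2", zlab = "pmf")
```

Replacing joint_pmf_placeholder with the completed piecewise function leaves the outer and persp calls unchanged, provided that function is vectorized over x1 and x2.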
I'd like to compute the density of a multivariate Dirichlet distribution and to generate random realizations from it, like what the function dmvnorm does for the multivariate normal distribution. I found this for the normal distribution, and I would like to know if there is a function that could do the same for the Dirichlet and Gamma distributions:
library(mvtnorm)  # provides dmvnorm
g <- expand.grid(x = seq(-2,2,0.05), y = seq(-2,2,0.05)) ## x and y are the 2 normal distributions.
g$z <- dmvnorm(x=cbind(g$x,g$y),mean = c(0,0),sigma = diag(2),log = FALSE)
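For the Dirichlet case, both pieces can be sketched in base R. Sampling uses the standard construction of normalizing independent Gamma(alpha_i, 1) draws, and the density follows directly from the Dirichlet formula. The function names rdirichlet_sketch and ddirichlet_sketch are hypothetical and chosen for this example.

```r
# Draw n Dirichlet(alpha) vectors by normalizing independent gamma variates
rdirichlet_sketch <- function(n, alpha) {
  k <- length(alpha)
  # column j holds n draws from Gamma(shape = alpha[j], rate = 1)
  g <- matrix(rgamma(n * k, shape = rep(alpha, each = n)), nrow = n, ncol = k)
  g / rowSums(g)
}

# Dirichlet density; x is a vector of length k or a matrix with k columns,
# and each row is assumed to lie on the simplex (positive, summing to 1)
ddirichlet_sketch <- function(x, alpha, log = FALSE) {
  x <- matrix(x, ncol = length(alpha))
  logdens <- lgamma(sum(alpha)) - sum(lgamma(alpha)) +
    drop(log(x) %*% (alpha - 1))
  if (log) logdens else exp(logdens)
}

set.seed(1)
alpha <- c(2, 3, 5)                      # illustrative concentration parameters
s <- rdirichlet_sketch(1000, alpha)
print(head(cbind(s, density = ddirichlet_sketch(s, alpha))))
print(rbind(sample_mean = colMeans(s), theory = alpha / sum(alpha)))
```

The sample means should be close to alpha / sum(alpha). For a gamma analogue of the plotting code above, dgamma can be evaluated on a grid in the same way as dmvnorm.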