Recognizing that this may be as much a statistical question as a coding question, let's say I have a normal distribution created using Distributions.jl:
using Distributions
mydist = Normal(0, 0.2)
Is there a good, straightforward way that I should go about discretizing such a distribution in order to get a PMF as opposed to a PDF?
In R, I found that the actuar package contains a function to discretize a continuous distribution. I failed to find anything similar for Julia, but thought I'd check here before rolling my own.
There isn't a built-in function for this, but you can combine a range with the cdf and diff functions to compute the probability of each bin:
using Distributions
mydist = Normal(0, 0.2)
r = -3:0.1:3
# cdf takes a scalar, so broadcast it over the range with the dot syntax
d = diff(cdf.(mydist, r))
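The same idea (bin probability = difference of CDF values at the bin edges) can be sketched in Python using only the standard library. This is a minimal illustration, not a port of actuar's discretization; the names `edges` and `pmf` are chosen here for the example, and the grid mirrors the -3:0.1:3 range above.

```python
from statistics import NormalDist

# Same distribution as in the question: mean 0, standard deviation 0.2.
dist = NormalDist(mu=0.0, sigma=0.2)

# Bin edges from -3 to 3 in steps of 0.1 (built from integers to avoid float drift).
edges = [i / 10 for i in range(-30, 31)]

# The mass of each bin (lo, hi] is the difference of the CDF at its two edges.
pmf = [dist.cdf(hi) - dist.cdf(lo) for lo, hi in zip(edges, edges[1:])]

# Mass outside [-3, 3] is negligible here (15 standard deviations),
# so the bin probabilities sum to approximately 1.
total = sum(pmf)
```

If the grid does not cover essentially all of the mass, the leftover tail probability has to be assigned somewhere (for example to the first and last bins) or the PMF renormalized; which choice is right depends on the application.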
Related
I am new to R and am working with copulas.
I have read the R documentation, and so far I understand how to create a copula and how to calculate its PDF and CDF.
#Generate Normal Copula
coef_ <- 0.7
mycopula <- normalCopula(coef_, dim = 2)
v <- rCopula(4000, mycopula)
# Compute the density
pdf_ <- dCopula(v, mycopula)
# Compute the CDF
cdf <- pCopula(v, mycopula)
However, I need a function that returns the inverse CDF of the multivariate normal distribution, because I need to find the 99th percentile.
Does anyone know how to do that? Thanks!
I am not sure whether you are still interested, but you can just use the qCopula function, or simply qnorm(v). This transforms your data from the copula scale (uniform margins) to data with standard normal margins.
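To illustrate what applying the normal quantile function margin by margin does, here is a small Python sketch using the standard library's `NormalDist.inv_cdf`, which plays the role of R's qnorm for a single value. The list `u` holds made-up copula-scale observations purely for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Quantile function (inverse CDF): the value below which 99% of the mass lies.
q99 = std_normal.inv_cdf(0.99)

# Hypothetical copula-scale observations (each margin is uniform on (0, 1)).
u = [(0.10, 0.25), (0.50, 0.50), (0.99, 0.90)]

# Apply the inverse CDF to each margin separately, elementwise.
z = [(std_normal.inv_cdf(a), std_normal.inv_cdf(b)) for a, b in u]
```

Note that this gives marginal quantiles only; a single "99th percentile" of a multivariate distribution is not uniquely defined without further conventions.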
Is there any way to print the PDF/CDF formula for a distribution? E.g. for the normal distribution, I would like to run a command and see a formula printed, such as f(x) = 1/sqrt(.....)...
I want to translate R's implementation of distributions like hyperbolic and EGB2 into Python and hope there is a way to fetch the formula from R elegantly rather than looking into the source code.
I want to compute the cumulative distribution function in R for data that follows a gamma distribution. I understood how to do this with a lognormal distribution using the equation from Wikipedia; however, the gamma equation seems more complicated and I decided to use the pgamma() function.
I'm new to this and don't understand the following:
Why do I get three different values out of pgamma, and how does it make sense that they are negative?
Am I supposed to take the log of all the quantiles, just as I used log(mean) and log(standard deviation) when doing calculations with a lognorm distribution?
How do I conceptually understand the CDF calculated by pgamma? It made sense for lognorm that I was calculating the probability that X would take a value <= x, but there is no "x" in this pgamma function.
Really appreciate the help in understanding this.
shape <- 1.35721347
scale <- 1/0.01395087
quantiles <- c(3.376354, 3.929347, 4.462594)
pgamma(quantiles, shape = shape, scale = scale, log.p = TRUE)
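As a cross-check of what the call above computes, here is a hedged Python sketch that evaluates the gamma CDF directly. It shows two things: the function returns one probability per entry of `quantiles` (each entry is an "x" in P(X <= x)), and with log.p = TRUE R reports the natural log of each probability, which is negative because a probability is below 1. The helper `gamma_cdf` is written for this example (a plain series for the regularized lower incomplete gamma function) and is not R's algorithm.

```python
import math

def gamma_cdf(x, shape, scale):
    """P(X <= x) for a gamma distribution, via the power series for the
    regularized lower incomplete gamma function (fine for small x / scale)."""
    z = x / scale
    if z <= 0:
        return 0.0
    term = 1.0 / shape
    total = term
    for n in range(1, 500):
        term *= z / (shape + n)
        total += term
        if term < total * 1e-16:
            break
    return total * math.exp(-z + shape * math.log(z) - math.lgamma(shape))

shape = 1.35721347
scale = 1 / 0.01395087
quantiles = [3.376354, 3.929347, 4.462594]

# One probability per quantile: P(X <= q) for each q.
probs = [gamma_cdf(q, shape, scale) for q in quantiles]

# What log.p = TRUE corresponds to: the natural log of each probability.
log_probs = [math.log(p) for p in probs]
```

Dropping log.p = TRUE (or exponentiating the result) gives the probabilities themselves; the quantiles are passed on their original scale, not logged.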
I am using the lowess function to fit a regression between two variables x and y. Now I want to know the fitted value at a new value of x. For example, how do I find the fitted value at x = 2.5 in the following example? I know loess can do that, but I want to reproduce someone's plot and he used lowess.
set.seed(1)
x <- 1:10
y <- x + rnorm(x)
fit <- lowess(x, y)
plot(x, y)
lines(fit)
Local regression (lowess) is a non-parametric statistical method. It's not like linear regression, where you can use the model directly to estimate new values.
You'll need to take the fitted values from the function (that's why it only returns a list to you) and choose your own interpolation scheme, then use that scheme to predict at your new points.
A common technique is spline interpolation (but there are others):
https://www.r-bloggers.com/interpolation-and-smoothing-functions-in-base-r/
EDIT: I'm pretty sure the predict function (for loess fits) does the interpolation for you. I also can't find any information about what exactly predict uses, so I've tried to trace the source code.
https://github.com/wch/r-source/blob/af7f52f70101960861e5d995d3a4bec010bc89e6/src/library/stats/R/loess.R
else { ## interpolate
## need to eliminate points outside original range - not in pred_
I'm sure the R code calls the underlying C implementation, but it's not well documented so I don't know what algorithm it uses.
My suggestion: either trust the predict function or roll your own interpolation algorithm.
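Rolling your own interpolation can be as simple as linear interpolation between the fitted points. In R, `approx(fit, xout = 2.5)` does this on the list returned by lowess. The Python sketch below shows the same logic; `interpolate` is a helper written for this example, and `fit_x` / `fit_y` are placeholder numbers standing in for the fitted points, not the output of the R code above.

```python
from bisect import bisect_right

def interpolate(xs, ys, x_new):
    """Linearly interpolate the points (xs, ys) at x_new; xs must be increasing.
    Values outside [xs[0], xs[-1]] are rejected rather than extrapolated."""
    if not xs[0] <= x_new <= xs[-1]:
        raise ValueError("x_new lies outside the fitted range")
    # Index of the right end of the segment that contains x_new.
    i = min(bisect_right(xs, x_new), len(xs) - 1)
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x_new - x0) / (x1 - x0)

# Placeholder values standing in for fit$x and fit$y from lowess(x, y).
fit_x = [1, 2, 3, 4, 5]
fit_y = [0.9, 2.1, 2.9, 4.2, 5.0]

value_at_2_5 = interpolate(fit_x, fit_y, 2.5)
```

Linear interpolation matches what `lines(fit)` draws on the plot, so it reproduces the plotted curve exactly; a spline would give a smoother but slightly different value.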
With the poweRlaw library, once alpha and xmin have been computed with estimate_xmin, which formula does the script use to compute the fitted values?
I mean, assuming that y = C·x^(-alpha), how does the script compute the normalization constant C from alpha and xmin?
The normalising constant is fairly easy to calculate. See Clauset et al.'s power-law paper (in particular Table 2.1). For the continuous case, C = (alpha-1) xmin^(alpha-1); the discrete case involves calculating the Hurwitz zeta function, C = 1/zeta(alpha, xmin).
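As a rough numerical illustration of these constants, the Python sketch below computes both cases. It follows the definitions in Clauset et al., where the discrete power law is normalised by the Hurwitz zeta function, zeta(alpha, xmin) = sum over k >= 0 of (k + xmin)^(-alpha). The function names are invented for this example, and the zeta approximation (a partial sum plus an Euler-Maclaurin tail correction) is an assumption of the sketch, not necessarily how poweRlaw evaluates it.

```python
import math

def continuous_constant(alpha, xmin):
    """C for p(x) = C * x**(-alpha) on x >= xmin (continuous case, alpha > 1)."""
    return (alpha - 1) * xmin ** (alpha - 1)

def hurwitz_zeta(s, q, n_terms=1000):
    """Approximate sum_{k>=0} (k + q)**(-s) for s > 1, q > 0: a direct partial
    sum plus the leading Euler-Maclaurin terms for the remaining tail."""
    head = sum((k + q) ** (-s) for k in range(n_terms))
    a = n_terms + q
    tail = a ** (1 - s) / (s - 1) + 0.5 * a ** (-s) + s * a ** (-s - 1) / 12
    return head + tail

def discrete_constant(alpha, xmin):
    """C for p(x) = C * x**(-alpha) on integers x >= xmin (discrete case)."""
    return 1.0 / hurwitz_zeta(alpha, xmin)
```

For example, `continuous_constant(2.5, 1.0)` is 1.5, and `discrete_constant(2.0, 1)` is close to 6/pi^2, since zeta(2, 1) is the Riemann zeta value pi^2/6.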
You can also examine the R code:
Discrete
Continuous