I am trying to sample from a multivariate distribution given by a (quite complex, but continuous) density function in R. For the univariate case I used AbscontDistribution from the distr package, but I cannot make it work for the multivariate case.
I searched online for an appropriate package for this problem, but could not find one.
Any ideas?
Thanks! :)
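One package-free route is a random-walk Metropolis sampler, which only needs the density up to a constant. The following is a minimal sketch, not a tuned implementation: log_dens is a hypothetical stand-in (a correlated bivariate normal) for the real density, and the step size, chain length and burn-in are assumptions that would need adjusting for a real target.

```r
set.seed(1)

# Hypothetical target: unnormalised log-density of a bivariate normal, rho = 0.8.
# Replace this with the log of your own density.
log_dens <- function(x) {
  rho <- 0.8
  -(x[1]^2 - 2 * rho * x[1] * x[2] + x[2]^2) / (2 * (1 - rho^2))
}

# Random-walk Metropolis: propose a Gaussian step, accept with
# probability min(1, density ratio).
metropolis <- function(log_dens, init, n_iter = 20000, step = 1) {
  d <- length(init)
  draws <- matrix(NA_real_, nrow = n_iter, ncol = d)
  current <- init
  current_lp <- log_dens(current)
  for (i in seq_len(n_iter)) {
    proposal <- current + rnorm(d, sd = step)
    proposal_lp <- log_dens(proposal)
    if (log(runif(1)) < proposal_lp - current_lp) {
      current <- proposal
      current_lp <- proposal_lp
    }
    draws[i, ] <- current  # a rejected proposal repeats the current point
  }
  draws
}

draws <- metropolis(log_dens, init = c(0, 0))
draws <- draws[-(1:2000), ]  # discard burn-in (length is an assumption)
colMeans(draws)
cor(draws[, 1], draws[, 2])
```

The draws are autocorrelated, so they are not independent samples; thinning or a longer chain may be needed depending on what they are used for.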
I'm wondering whether there is an R package that can help me find the parameters of a distribution of my choice so that it matches intervals of my choice.
For instance, here Betancourt is looking at the inverse gamma and wants to learn which set of parameters gives >1% below 2 and >1% above 20 (like the graph below). Stan's solver returns the inv-gamma parameters that produce the intervals of interest. Is there a solution that works directly in R?
Or in other words,
I have the distribution
I have the intervals
Can I learn the correct parameters?
Thanks
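In base R this can be set up as a small optimisation problem. The sketch below assumes the target is 1% of the mass below 2 and 1% above 20, and uses the fact that if X is inverse gamma with shape a and scale b, then 1/X is gamma with shape a and rate b, so pgamma can be used without an extra package. The function name tail_error and the starting values are my own choices.

```r
# Squared error (on the log scale) between achieved and target tail masses.
tail_error <- function(log_par, lower = 2, upper = 20, p = 0.01) {
  shape <- exp(log_par[1])  # optimise on the log scale to keep both positive
  rate  <- exp(log_par[2])
  # X < lower  <=>  1/X > 1/lower, where 1/X ~ Gamma(shape, rate)
  log_below <- pgamma(1 / lower, shape, rate = rate,
                      lower.tail = FALSE, log.p = TRUE)
  # X > upper  <=>  1/X < 1/upper
  log_above <- pgamma(1 / upper, shape, rate = rate, log.p = TRUE)
  (log_below - log(p))^2 + (log_above - log(p))^2
}

# Nelder-Mead (the optim default) from a rough starting guess.
fit <- optim(c(log(3), log(10)), tail_error,
             control = list(reltol = 1e-12, maxit = 2000))
par_hat <- exp(fit$par)  # inverse-gamma shape and scale

# Check the tail masses actually achieved.
p_below <- pgamma(1 / 2, par_hat[1], rate = par_hat[2], lower.tail = FALSE)
p_above <- pgamma(1 / 20, par_hat[1], rate = par_hat[2])
c(shape = par_hat[1], scale = par_hat[2], below = p_below, above = p_above)
```

The same pattern should carry over to other distributions by swapping pgamma for the relevant p-function, as long as the distribution has as many free parameters as there are constraints.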
I have a data set with more than 10000 distributions that look like the ones in red. I want to compare each of them with a reference distribution like the one in blue. Because some are unimodal and some are multimodal, I cannot use a t-test for all of them. So I am trying to detect multimodal distributions in order to apply a conditional test (t-test for a normal distribution, Mann-Whitney for a multimodal distribution; if you have any other idea, please let me know). Is there any way to detect a multimodal distribution?
I am also thinking about splitting the modes when I have a multimodal distribution and comparing each mode to the reference. Is this possible? I found this SO link, Calculate the modes in a multimodal distribution in R, but didn't find anything more recent.
I tried mclust to find how many modes there are, but it doesn't work well, as it finds 2 modes even when the distribution looks unimodal.
library(mclust)
clust <- Mclust(data$sample_frequency)
I also tried dip.test
library(diptest)
dip.test(b$sample_frequency)
but again the p-value does not always match what I see in the plots (for example, plot 77 is significant at p=0.001, whereas plot 79 gives p=0.076).
Any help/thought is welcome!
Thanks!
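One rough alternative that needs no extra package is to count the local maxima of a kernel density estimate. This is only a sketch: count_modes is a made-up helper, and the bandwidth multiplier (adjust) and the minimum peak height (min_height) are assumptions that strongly influence the answer, so they should be checked against a handful of plots before being run on all 10000 distributions.

```r
# Count the peaks of a kernel density estimate.
# adjust > 1 smooths more, which suppresses spurious small bumps.
count_modes <- function(x, adjust = 2, min_height = 0.1) {
  d <- density(x, adjust = adjust)
  y <- d$y
  # Interior local maxima: the slope changes from positive to negative.
  peaks <- which(diff(sign(diff(y))) == -2) + 1
  # Ignore peaks lower than min_height times the highest peak.
  peaks <- peaks[y[peaks] >= min_height * max(y)]
  list(n = length(peaks), location = d$x[peaks])
}

set.seed(42)
uni <- rnorm(500)                          # simulated unimodal sample
bi  <- c(rnorm(250, -3), rnorm(250, 3))    # simulated bimodal sample

count_modes(uni)$n
count_modes(bi)
```

The location element gives the estimated position of each mode, which could serve as a starting point for splitting a multimodal sample at the valley between two peaks.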
I am interested in frequency distributions that are not normally distributed.
If I have a frequency distribution table that is not normally distributed, is there a function or package that will identify the type of distribution for me?
You can use the fitdistr function (from the MASS package) and check for yourself whether you find a 'fitting' distribution. However, I suggest that you plot the distribution first and see what it looks like. This approach is generally not recommended on its own, because you can always choose different parameters to fit a distribution and thus confuse one distribution with another. Once you have found a suitable distribution, you should test it against the data.
Edit: For instance, a normal distribution may look like a Poisson distribution. Fitting is, in my opinion, only useful if you have enough data points. Otherwise, just draw values from your data if you need to.
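To make the fitdistr suggestion concrete, here is a small sketch that fits a few candidate distributions and compares them by AIC. The data are simulated (gamma) purely for illustration, and the list of candidates is an assumption; fitdistr may print warnings while it optimises.

```r
library(MASS)  # provides fitdistr()

set.seed(1)
x <- rgamma(500, shape = 2, rate = 0.5)  # simulated stand-in for real data

# Fit several candidates by maximum likelihood.
fits <- list(
  normal    = fitdistr(x, "normal"),
  gamma     = fitdistr(x, "gamma"),
  lognormal = fitdistr(x, "lognormal")
)

# Lower AIC means a better trade-off between fit and number of parameters.
aic <- sapply(fits, AIC)
sort(aic)

# Estimated parameters of the candidate with the lowest AIC.
fits[[names(which.min(aic))]]$estimate
```

A low AIC only says that a candidate is better than the others tried, not that it is correct, which is why a visual check against the data is still worthwhile.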
You can always check whether a distribution is adequate for your data with a QQ plot. If your data keeps growing, I would suggest that you use the ECDF (Empirical Cumulative Distribution Function), which becomes more precise as you collect more data. You can compute the ECDF in R with the ecdf() function.
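Both ideas fit in a few lines of base R. In this sketch the data are simulated and the exponential is just an assumed candidate distribution; substitute the q- and p-functions of whichever distribution you want to check.

```r
set.seed(1)
x <- rexp(200, rate = 1)  # simulated stand-in for real data

# ecdf() returns a step function: Fn(q) is the share of observations <= q.
Fn <- ecdf(x)
Fn(1)

# QQ coordinates against an exponential with the rate estimated from the data.
rate_hat <- 1 / mean(x)
qq <- data.frame(
  theoretical = qexp(ppoints(length(x)), rate = rate_hat),
  sample      = sort(x)
)

# Points close to the line y = x suggest the candidate is adequate.
plot(qq$theoretical, qq$sample,
     xlab = "Theoretical quantiles", ylab = "Sample quantiles")
abline(0, 1)

# Largest vertical gap between the ECDF and the candidate CDF.
max(abs(Fn(qq$sample) - pexp(qq$sample, rate = rate_hat)))
```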
I have two histograms.
int Hist1[10] = {1,4,3,5,2,5,4,6,3,2};
int Hist2[10] = {1,4,3,15,12,15,4,6,3,2};
Hist1's distribution is of type multi-modal;
Hist2's distribution is of type uni-modal with single prominent peak.
My questions are
Is there any way that I could determine the type of distribution programmatically?
How to quantify whether these two histograms are similar/dissimilar?
Thanks
Raj,
I posted a C function in your other question (automatically compare two series -Dissimilarity test) that will compute the divergence between two sets of similar data. It's actually intended to tell you how closely real data matches predicted data, but I suspect you could use it for your purpose.
Basically, the smaller the error, the more similar the two sets are.
These are just guesses, but I would try fitting each distribution as a gaussian distribution and use something like the R-squared value to determine if the distribution is uni-modal or not.
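The Gaussian-fit idea can be tried quickly on the two histograms from the question. This is a sketch in R rather than C, under my own assumptions: the bins are numbered 1 to 10, the Gaussian is fitted by the weighted mean and standard deviation of the bin numbers, and gaussian_r2 is a made-up helper name.

```r
hist1 <- c(1, 4, 3, 5, 2, 5, 4, 6, 3, 2)
hist2 <- c(1, 4, 3, 15, 12, 15, 4, 6, 3, 2)

# R-squared of a moment-matched Gaussian curve against the bin counts.
gaussian_r2 <- function(counts) {
  bins  <- seq_along(counts)
  n     <- sum(counts)
  mu    <- sum(bins * counts) / n                   # weighted mean bin
  sigma <- sqrt(sum(counts * (bins - mu)^2) / n)    # weighted spread
  fitted <- n * dnorm(bins, mean = mu, sd = sigma)  # expected count per bin
  1 - sum((counts - fitted)^2) / sum((counts - mean(counts))^2)
}

r2 <- c(hist1 = gaussian_r2(hist1), hist2 = gaussian_r2(hist2))
r2
```

On these two histograms the second one scores clearly higher than the first, which agrees with the single-peak description, but any cut-off between "unimodal" and "not unimodal" would have to be chosen by hand.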
As to the similarity between the two distributions, I would try doing an autocorrelation and using the peak positive value in the autocorrelation as a similarity measure. These ideas are pretty rough, but hopefully they give you some ideas.
For #2, you could calculate their cross-correlation (so long as the buckets themselves can be sorted). That would give you a rough estimate of their "similarity".
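As a sketch of the cross-correlation idea, R's ccf() computes the normalised cross-correlation at a range of shifts. The maximum shift of 3 bins is an assumption, and the histograms are the two from the question.

```r
hist1 <- c(1, 4, 3, 5, 2, 5, 4, 6, 3, 2)
hist2 <- c(1, 4, 3, 15, 12, 15, 4, 6, 3, 2)

# Cross-correlation for shifts of -3 to +3 bins, without plotting.
cc   <- ccf(hist1, hist2, lag.max = 3, plot = FALSE)
lags <- drop(cc$lag)
vals <- drop(cc$acf)

# The value at lag 0 is the ordinary Pearson correlation of the counts.
vals[lags == 0]

# Peak correlation and the shift at which it occurs.
best <- which.max(vals)
c(lag = lags[best], correlation = vals[best])
```

Values near 1 mean the two histograms have a similar shape (possibly after shifting), and values near 0 mean they do not; with only 10 bins the estimate is noisy.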
Comparison of Histograms (For Use in Cloud Modeling).
(That's an MS .doc file.)
There are a variety of software packages that will "fit" your distributions to known discrete distributions for you - Minitab, STATA, R, etc. A reference to fitting distributions in R is here. I wouldn't advise programming this from scratch.
Regarding distribution comparisons, if neither distribution fits a known distribution (Poisson, Binomial, etc.), then you need to use non-parametric methods described here.
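As one example of a distribution-free comparison, a chi-square test of homogeneity can be run directly on the two histograms from the question. This sketch assumes the entries are counts of independent observations falling into the same 10 bins; because several counts are small, the p-value is obtained by simulation (the number of replicates, B = 5000, is an arbitrary choice).

```r
hist1 <- c(1, 4, 3, 5, 2, 5, 4, 6, 3, 2)
hist2 <- c(1, 4, 3, 15, 12, 15, 4, 6, 3, 2)

set.seed(1)  # the simulated p-value depends on the random seed

# Rows are the two histograms, columns are the shared bins.
counts <- rbind(hist1, hist2)

# Null hypothesis: both histograms spread their counts over the bins
# in the same proportions.
res <- chisq.test(counts, simulate.p.value = TRUE, B = 5000)
res$statistic
res$p.value
```

A small p-value suggests the two histograms do not share the same underlying bin proportions.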