Gaussian deconvolution of a density curve - r

I have a vector for which I have computed a density function. The curve has multiple peaks, and I want to perform a Gaussian deconvolution so that I can represent it as a sum of several Gaussian curves. I am hoping there is a package that will let me do this, but nothing I have found so far has worked. I tried to follow the example given here (https://www.r-bloggers.com/fitting-mixture-distributions-with-the-r-package-mixtools/), but the vector I am working with is ~400 MB and won't play nicely with the mixtools package tools. I need something that either copes with datasets of this size or works on the density curve directly. Any help appreciated, thank you!
I do not have sample data, but I am happy to have it explained with the faithful dataset in R. I know I can find the density of faithful$waiting with density(faithful$waiting), and it plots just fine, but the deconvolution has me stuck.
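One possible approach that works on the density curve directly, rather than on the raw vector, is a nonlinear least-squares fit of a sum of Gaussians to the output of density(). The sketch below uses faithful$waiting and assumes two components; the starting values and bounds are guesses read off the plot, not values from the question. Because the fit only sees the grid points that density() returns (512 by default), the size of the original vector should matter much less. Note that the kernel smoothing widens the peaks, so the fitted standard deviations will tend to be somewhat larger than those of the underlying data.

```r
# Kernel density estimate; density() returns 512 (x, y) grid points by default
d <- density(faithful$waiting)
curve_df <- data.frame(x = d$x, y = d$y)

# Model the curve as p * N(m1, s1) + (1 - p) * N(m2, s2).
# Starting values and bounds are assumptions eyeballed from the plot.
fit <- nls(
  y ~ p * dnorm(x, m1, s1) + (1 - p) * dnorm(x, m2, s2),
  data      = curve_df,
  start     = list(p = 0.4, m1 = 55, s1 = 6, m2 = 80, s2 = 6),
  algorithm = "port",
  lower     = c(0, 40, 1, 65, 1),     # same order as `start`
  upper     = c(1, 65, 20, 100, 20)
)
est <- coef(fit)
print(est)

# Overlay the two fitted components on the density curve
plot(d, main = "Two-Gaussian fit to density(faithful$waiting)")
lines(d$x, est[["p"]] * dnorm(d$x, est[["m1"]], est[["s1"]]), col = "red")
lines(d$x, (1 - est[["p"]]) * dnorm(d$x, est[["m2"]], est[["s2"]]), col = "blue")
```

For more peaks, the same formula can be extended with one weight, mean and standard deviation per additional component, at the cost of needing reasonable starting values for each.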

Related

how to make plot to compare rpkm values

I have a fair amount of experience analyzing RNA-Seq data, but I am looking for new ways to visualize the data. I typically use heat maps and volcano plots, but I'd like to make this plot which is from this paper. I can make this type of plot with rlog transformed data before doing DEG analysis, but I want to color dots based on statistically significant expression differences.
I've searched online and have not been able to find a good way to create this plot. Thanks in advance for any advice.
This question is more about bioinformatics, so it may be better to post it on Biostars.
In any case, you could draw a scatter plot with ggscatter (from the ggpubr package) or with ggplot2 and colour the statistically significant genes using an if else statement.
Please provide a sample of your data.
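In the absence of real data, the ggplot2 suggestion above can be sketched with simulated values. Everything about the data frame here is hypothetical: the column names (cond_a, cond_b, padj), the simulated numbers, and the 0.05 cutoff are assumptions for illustration and would be replaced by the rlog values and adjusted p-values from the actual analysis.

```r
library(ggplot2)

# Simulated stand-in for real results; column names are hypothetical
set.seed(42)
n <- 1000
res <- data.frame(
  cond_a = rnorm(n, mean = 5, sd = 2),  # e.g. transformed expression, condition A
  padj   = runif(n)                     # stand-in for adjusted p-values
)
res$cond_b <- res$cond_a + rnorm(n, sd = 0.5)

# Flag genes with ifelse(); 0.05 is an assumed significance threshold
res$status <- ifelse(res$padj < 0.05, "significant", "not significant")

p <- ggplot(res, aes(x = cond_a, y = cond_b, colour = status)) +
  geom_point(alpha = 0.6) +
  scale_colour_manual(values = c("significant" = "red",
                                 "not significant" = "grey60")) +
  labs(x = "Condition A", y = "Condition B", colour = NULL)
print(p)
```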

Confidence ellipse formula in JS or R

What I have: A scatter chart (plot) of a PCA, plotted in JS. I have Rtools that I've used to push the PCA data to the client side.
What I'm trying to do: Plot a confidence ellipse formula.
I can't seem to find a straightforward formula for the CI ellipse. I came across a lot of theory and a lot of examples in R that give you the end result, an ellipse (one can use ggplot or other packages from CRAN to plot it).
But I'm looking for a formula that I could use on the client side, plugging in my scatter chart points to calculate the ellipse, or, even better, a function in R that would give me a formula for the ellipse.
I have the covariance matrix and eigenvectors as well (calculated in R).
All suggestions much appreciated.
I haven't found a formula, but using Momocs:::conf_ell I managed to get the vertices and the x,y points of an ellipse.
I will update this answer once I find the second part of my answer: a straightforward formula.
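For reference, one common parametric form can be written directly from the covariance matrix and its eigen decomposition: each point on the ellipse is center + r * V * diag(sqrt(lambda)) * (cos t, sin t), where V holds the eigenvectors, lambda the eigenvalues, t runs from 0 to 2*pi, and r = sqrt(qchisq(level, 2)). This is a sketch under the assumption that the scores are roughly bivariate normal and that the chi-square scaling is acceptable; it is not what Momocs does internally, and the function name conf_ellipse is made up for illustration. The arithmetic is simple enough to port to JS.

```r
# Parametric confidence ellipse from a centre and a 2 x 2 covariance matrix
conf_ellipse <- function(center, cov_mat, level = 0.95, n = 100) {
  eig    <- eigen(cov_mat, symmetric = TRUE)
  radius <- sqrt(qchisq(level, df = 2))     # chi-square scaling for 2 dimensions
  theta  <- seq(0, 2 * pi, length.out = n)
  unit   <- rbind(cos(theta), sin(theta))   # unit circle, 2 x n
  # Stretch rows by sqrt(eigenvalues), rotate by eigenvectors, shift to the centre
  pts <- eig$vectors %*% (sqrt(eig$values) * unit) * radius + center
  t(pts)                                    # n x 2 matrix of (x, y) points
}

# Usage on the first two principal components of iris (example data only)
scores <- prcomp(iris[, 1:4])$x[, 1:2]
ell    <- conf_ellipse(colMeans(scores), cov(scores))
plot(scores, asp = 1)
lines(ell, col = "red")
```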

Make density cloud from point cloud

My question consists of two sub questions.
I have a graphical illustration presenting (some virtual) worst case scenarios sampled from history organized based on two parameters.
Image:
At this moment I have a point cloud. I would like to create a nicely splined density cloud of my results. I would like the 3D spline to take the density of points into account when approximating (so it approximates more loosely where fewer samples are available and more exactly in denser regions of space).
Then, having that density cloud, I would be able to scale the density along each vertical line specified by the two input parameters, and that would make it a likelihood function of each outcome (the worst case scenario).
The second part is that I would like to plot it, ideally as semi-transparent 3D regions that form something like a fog around the densest region.
Uh, wow... that wasn't easy to explain. Sigh. :)
Thanks for reading that far.
So here is a way to generate 3D density plots using the ks package. Since you provided no data, this example is taken directly from the documentation for plot(...) in the ks package.
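library(MASS)
library(ks)
x <- iris[,1:3]                            # three numeric columns, so the density is 3D
H.pi <- Hpi(x, pilot="samse")              # plug-in bandwidth matrix
fhat <- kde(x, H=H.pi, compute.cont=TRUE)  # kernel density estimate with contour levels
plot(fhat, drawpoints=TRUE)                # density contours with the data points drawn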

How to implement histfit in r?

Matlab has a histfit function that plots a histogram and fits a distribution to the bin values.
The distribution's parameters have to be estimated.
How can I implement histfit in R? I searched for a long time, but had no luck.
This post has mentioned this before, but there is no preferable solution. The sn package seems to support several distributions, but not many.
I explored the data with the hist function, and the histogram shows a gamma distribution in general.
But if I increase the number of bins and plot it again, the graph shows more detail, and the gamma distribution no longer fits.
fitdistr also fails to find the parameters.
So I want to fit the data using only the coarse data from the histogram. This is the question, thank you for your help.
The fitdistr function in the MASS package can be used to find parameters for a given distribution (including gamma). The function density and the logspline package (and others) can be used to estimate the density function of the data without assuming a specific distribution.
The lines and curve functions can be used to add an estimated density curve to a plotted histogram (use prob=TRUE when creating the histogram).
If you want to compare your data to a specific distribution then tools like qqplots (qqplot function or others) or visual tests (vis.test in the TeachingDemos package) will probably be better than a histogram and density plot.
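The suggestions above can be sketched as follows. The data are simulated from a gamma distribution purely for illustration (the shape and rate values are assumptions, not taken from the question); with real data only the vector x would change.

```r
library(MASS)  # provides fitdistr()

set.seed(1)
x <- rgamma(500, shape = 2, rate = 0.5)  # simulated stand-in data

# Maximum-likelihood estimates of the gamma parameters
fit   <- fitdistr(x, "gamma")
shape <- fit$estimate[["shape"]]
rate  <- fit$estimate[["rate"]]

# Histogram on the density scale with the fitted gamma curve and a
# distribution-free kernel density estimate for comparison
hist(x, prob = TRUE, breaks = 30, main = "Histogram with fitted densities")
curve(dgamma(x, shape = shape, rate = rate), add = TRUE, col = "red", lwd = 2)
lines(density(x), col = "blue", lty = 2)

# Q-Q plot against the fitted gamma distribution
qqplot(qgamma(ppoints(length(x)), shape = shape, rate = rate), x,
       xlab = "Fitted gamma quantiles", ylab = "Sample quantiles")
abline(0, 1)
```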
To answer my own question: the 'bda' package can fit several distributions to binned data; however, it can only bin data by rounding.
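As an alternative that needs only base R, the bin counts from hist can be fitted directly by maximizing the likelihood of grouped data. This is a hand-rolled sketch, not the bda API: it assumes a gamma distribution, uses simulated data, and the helper name negloglik is made up for illustration.

```r
set.seed(1)
x <- rgamma(2000, shape = 3, rate = 0.5)  # simulated stand-in data
h <- hist(x, breaks = 15, plot = FALSE)   # keep only breaks and counts

# Negative log-likelihood of the bin counts under a gamma distribution.
# Parameters are optimised on the log scale so they stay positive.
negloglik <- function(par) {
  shape <- exp(par[1])
  rate  <- exp(par[2])
  p <- diff(pgamma(h$breaks, shape = shape, rate = rate))  # bin probabilities
  -sum(h$counts * log(pmax(p, 1e-300)))                    # guard against log(0)
}

opt <- optim(c(0, 0), negloglik)  # start at shape = 1, rate = 1
est <- exp(opt$par)
names(est) <- c("shape", "rate")
print(est)

# Compare the fitted curve with the coarse histogram
plot(h, freq = FALSE, main = "Gamma fit from bin counts only")
curve(dgamma(x, shape = est[["shape"]], rate = est[["rate"]]),
      add = TRUE, col = "red", lwd = 2)
```

Only h$breaks and h$counts enter the fit, so the same code applies when the raw observations are not available.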

General questions about Principal Component Analysis (PCA) in R

I would like to produce some nice PCA plots in R. As usual in R, there are several ways to perform a principal component analysis. So far I have found 3 different ways to calculate the components and 3 ways of plotting them. I was wondering whether people who are familiar with these functions can give me some advice on the best combination of functions to produce the following plots:
Scores Plot
Loadings Plot
Histogram / Bar chart of the variances explained by each principal component
My research on functions and plots used for PCA in R resulted in:
Functions:
pca.xzy()
prcomp()
princomp()
dudi.pca()
Plot:
plot.pca (this one seems to belong to the function pca.xzy())
ggplot2
plot
biplot
I also found the following webpage:
http://pbil.univ-lyon1.fr/ade4/ade4-html/dudi.pca.html
I was also wondering whether those circles, with lines starting from each circle's center, can be drawn with one of the other functions mentioned above, as dudi.pca from the ade4 package seems to be the most complicated one.
One question per question, please! There's the psych package by William Revelle, see this and this. There's also a good tutorial here. Anyway...
for scores/loadings plot see pairs
histogram: see hist
So once again, what's your question actually? =)
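For what it's worth, the three plots asked for can be produced with prcomp and base graphics alone. The sketch below uses the iris data as a stand-in; it takes the columns of pca$rotation as the loadings and uses a bar chart of the variance proportions rather than hist, both of which are choices made here for illustration.

```r
# PCA on the four numeric iris columns, centred and scaled
pca <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)

# Scores plot: observations in the space of the first two components
plot(pca$x[, 1:2], col = as.integer(iris$Species), pch = 19,
     main = "Scores (PC1 vs PC2)")

# Loadings plot: contribution of each variable to the first two components
plot(pca$rotation[, 1:2], type = "n", main = "Loadings (PC1 vs PC2)")
text(pca$rotation[, 1:2], labels = rownames(pca$rotation))

# Bar chart of the proportion of variance explained by each component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
barplot(var_explained, names.arg = colnames(pca$x),
        ylab = "Proportion of variance explained")
```

biplot(pca) overlays scores and loadings in a single figure if a combined view is preferred.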
