Layering density plots in R without using density()

I've computed and plotted Gaussian kernel density estimates using the KernSmooth package as follows:
x <- MyData$MyNumericVector
h <- dpik(x)
est <- bkde(x, bandwidth=h)
plot(est, type='l')
This is the method described in KernSmooth's documentation. Note that dpik() finds the optimal bandwidth and bkde() uses this bandwidth to fit the kernel density estimate. It's important that I use this method instead of the basic density() function.
How do I layer these plots on top of one another?
I cannot use the basic density() function that geom_density() from ggplot2 relies upon, as bandwidths and kernel density estimates are best optimized using the KernSmooth package (see Deng & Wickham, 2011 here: http://vita.had.co.nz/papers/density-estimation.pdf). Since Wickham wrote ggplot2 and the above review of kernel density estimation packages, it would make sense that there's a way to use ggplot2 to layer densities that aren't reliant on the basic density() function, but I'm not sure.
Can I use ggplot2 for this even if I don't wish to use the basic density() function? What about lattice?

You could do it with geom_line:
# Compute the estimate once; bkde() returns a list with components x and y
est <- as.data.frame(bkde(movies$votes))
m <- ggplot(est, aes(x = x, y = y)) + geom_line()
print(m)
If you were doing it with lattice::densityplot, you could probably add some of the values to the darg list:
darg
list of arguments to be passed to the density function. Typically, this should be a list with zero or more of the following components : bw, adjust, kernel, window, width, give.Rkern, n, from, to, cut, na.rm (see density for details)
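To layer several KernSmooth estimates in one plot, one option is to stack the bkde() results into a single long data frame and let geom_line() colour by group. The sketch below is only an illustration: the two simulated samples, the helper name est_one and the column names are assumptions, not part of the original question.

```r
library(KernSmooth)

set.seed(42)
# Hypothetical example data: two numeric samples to compare
groups <- list(a = rnorm(200), b = rnorm(200, mean = 2, sd = 1.5))

# Estimate one density with a dpik() bandwidth and return it in long format
est_one <- function(x, name) {
  h <- dpik(x)                   # plug-in bandwidth
  est <- bkde(x, bandwidth = h)  # binned kernel density estimate
  data.frame(group = name, x = est$x, y = est$y)
}

dens <- do.call(rbind, Map(est_one, groups, names(groups)))

# Draw one line per group if ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(dens, ggplot2::aes(x = x, y = y, colour = group)) +
    ggplot2::geom_line()
  print(p)
}
```

The same data frame also works with base graphics: call plot() on the first group with type = "l" and add the others with lines().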

Related

In R - How to plot a PDF and CDF for exponential distribution given lambda=0.2 or theta=5 and N=10?

I don't know how to plot the probability density function (PDF) and the cumulative distribution function (CDF) given λ = 1/5 for N = 10 observations.
The functions you are looking for are dexp(), pexp() and rexp(). Use the ? operator to get the help page for them.
Each distribution in R has an equivalent set of functions: d for the density, p for the cumulative distribution, q for the quantiles and r for drawing a random data set.
R uses plot(x, y) notation to make a plot. If you are using RStudio, you can use the GUI to save the image; otherwise you can do:
png("name.png")
plot(x, y)
dev.off()
I'm not wholly confident about Stack Overflow's policy on answering homework questions, but I think this isn't overstepping. I think figuring out what goes on the x and y axes constitutes what you're supposed to figure out.
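For reference, a minimal sketch of how the d/p/r functions fit together for rate λ = 0.2. The grid of x values (0 to 30) and the seed are arbitrary choices made for illustration.

```r
lambda <- 0.2                       # rate; the mean is 1 / lambda = 5
x <- seq(0, 30, by = 0.1)           # arbitrary grid for the curves

pdf_vals <- dexp(x, rate = lambda)  # density (PDF)
cdf_vals <- pexp(x, rate = lambda)  # cumulative distribution (CDF)

set.seed(1)
obs <- rexp(10, rate = lambda)      # N = 10 simulated observations

op <- par(mfrow = c(1, 2))
plot(x, pdf_vals, type = "l", xlab = "x", ylab = "f(x)", main = "PDF")
rug(obs)                            # mark the simulated observations
plot(x, cdf_vals, type = "l", xlab = "x", ylab = "F(x)", main = "CDF")
par(op)
```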

Is there a way to rescale the axes of a plot produced by plot.clusterlm (R)?

I have run cluster analysis on some time series data using permuco in R. (Permutes the labels of control/treatment conditions and calculates the F statistic as to how likely it is that these time clusters of significant differences occurred by chance.)
So far so good.
I have produced a number of plots using the inbuilt function plot.clusterlm that comes with this package. However, the data come from different groups, and the F values on the y axis get rescaled in each plot, i.e. the values and ticks are reset depending on how strong the effects are.
This is problematic, because the different plots based on different cluster analyses are not visually comparable.
I would like to rescale the y axis, so that all clusters are visualised along the same F values (0-10 for example).
I haven't been able to do that, and I was wondering whether there is a way to pass additional arguments to plot.clusterlm to do this.
This is the usage of the function, but I don't see a way to rescale the y axis. (Although rescaling the x axis is possible by manipulating the nbbaselinepts & nbptsperunit, but that's not what I want...)
plot(x, effect = "all", type = "statistic",
multcomp = "clustermass", alternative = "two.sided",
enhanced_stat = FALSE, nbbaselinepts = 0, nbptsperunit = 1, ...)
If you have any ideas on this, please let me know.
Thank you!
Thanks for using permuco! I opened an issue on GitHub to have a solution for implementing these features. You can expect changes in further releases of permuco.
However, the plot() method shows the F statistic, which is not a good measure of effect size. A better measure is partial eta squared, which is implemented in the afex package.
In base R graphics, axis limits and scales are set like this:
x <- 1:10; y <- x * x
# Simple graph
plot(x, y)
# Enlarge the scale
plot(x, y, xlim=c(1,15), ylim=c(1,150))
# Log scale
plot(x, y, log="y")
This is an example from STHDA where you can find many helpful tutorials.
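As a general base R illustration of making several plots comparable, the sketch below computes one shared y range and passes it as ylim to every plot. The simulated vectors stat_a and stat_b are stand-ins, and whether plot.clusterlm forwards ylim through its ... argument is not confirmed here.

```r
set.seed(7)
# Hypothetical stand-ins for the statistics of two separate analyses
stat_a <- abs(rnorm(50, sd = 1))
stat_b <- abs(rnorm(50, sd = 3))

# One y range that covers both series
y_range <- c(0, max(stat_a, stat_b))

op <- par(mfrow = c(1, 2))
plot(stat_a, type = "l", ylim = y_range, ylab = "statistic", main = "Group A")
plot(stat_b, type = "l", ylim = y_range, ylab = "statistic", main = "Group B")
par(op)
```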

How to plot multiple curves on the same plot in Julia?

I am using Julia to build a linear regression model from scratch. Having done all the mathematical calculations, I need to plot the regression line.
I have a scatter plot and a linear fit (a straight line) ready as separate plots. How do I combine them, i.e. draw the fitted line on top of the scatter plot?
Basically, how do I draw multiple plots on a single plot in Julia?
Note: I know neither Python nor R.
x = [1,2,3,4,5]
y = [2,3,4,5,6]
plot1 = scatter(x,y)
plot2 = plot(x,y) #line plot
#plot3 = plot1+plot2 (how?)
Julia doesn't come with one built-in plotting package, so you need to choose one. Popular plotting packages are Plots, Gadfly, PyPlot, GR, PlotlyJS and others. You need to install them first, and with Plots you'll also need to install a "backend" package (e.g. one of the last three mentioned above).
With Plots, e.g., you'd do
using Plots; gr() # if GR is the plotting "backend" you've chosen
scatter(point_xs, point_ys) # the points
plot!(line_xs, line_ys) # the line
The key here is the plot! command (as opposed to plot), which modifies an existing plot rather than creating a new one.
More simply you can do
scatter(x,y, smooth = true) # fits the trendline automatically
See also http://docs.juliaplots.org/latest/
(disclaimer: I'm associated with Plots - others may give you different advice)

Different lowess curves in plot and qplot in R

I am comparing two graphs with a non-parametric lo(w)ess curve superimposed in each case. The problem is that the curves look very different, despite the fact that their arguments, such as span, are identical.
y<-rnorm(100)
x<-rgamma(100,2,2)
qplot(x,y)+stat_smooth(span=2/3,se=F)+theme_bw()
plot(x,y)
lines(lowess(y~x))
There seems to be a lot more curvature in the graph generated by qplot(). As you know, detecting curvature is very important in the diagnostics of regression analysis, and I fear that if I use ggplot2, I will reach erroneous conclusions.
Could you please tell me how I could produce the same curve in ggplot2?
Thank you
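The curves differ because stat_smooth() fits loess(), which defaults to locally quadratic fits (degree=2), whereas lowess() fits local lines. You can use loess(..., degree=1) instead. This produces a very similar, but not quite identical, result to lowess(...).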
set.seed(1) # for reproducibility
y<-rnorm(100)
x<-rgamma(100,2,2)
plot(x,y)
points(x,loess(y~x,data.frame(x,y),degree=1)$fitted,pch=20,col="red")
lines(lowess(y~x))
With ggplot
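# current ggplot2 passes loess() arguments such as degree via method.args
qplot(x,y)+stat_smooth(method="loess",se=FALSE,method.args=list(degree=1))+
theme_bw()+
geom_point(data=as.data.frame(lowess(y~x)),aes(x,y),col="red")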
Here is a new stat function for use with ggplot2 that uses lowess(): https://github.com/harrelfe/Hmisc/blob/master/R/stat-plsmo.r. You need to load the proto package for this to work. I like using lowess because it is fast for any sample size and allows outlier detection to be turned off for binary Y. But it doesn't provide confidence bands.
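To see how much the degree setting matters, the fits can be compared numerically rather than by eye. This is a rough sketch using the same simulated data as above. The span of 2/3 matches the default f of lowess(), and the names d1 and d2 are introduced here for illustration.

```r
set.seed(1)
y <- rnorm(100)
x <- rgamma(100, 2, 2)

lw <- lowess(x, y, f = 2/3)                    # returned sorted by x
fit1 <- loess(y ~ x, span = 2/3, degree = 1)   # locally linear
fit2 <- loess(y ~ x, span = 2/3, degree = 2)   # locally quadratic (loess default)

ord <- order(x)                                # align loess fits with lowess order
d1 <- max(abs(fit1$fitted[ord] - lw$y))        # largest gap, degree 1 vs lowess
d2 <- max(abs(fit2$fitted[ord] - lw$y))        # largest gap, degree 2 vs lowess
print(c(degree1 = d1, degree2 = d2))
```

Some difference remains even with degree = 1, because lowess() also runs robustness iterations by default while loess() uses the "gaussian" family unless told otherwise.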

How to scale/transform graphics::plot() axes with any transformation, not just logarithmic (for Weibull plots)?

I am building an R package to display Weibull plots (using graphics::plot) in R. The plot has a log-transformed x-axis and a Weibull-transformed y-axis (for lack of a better description). The two-parameter Weibull distribution can thus be represented as a straight line on this plot.
The logarithmic transformation of the x-axis is as simple as adding the log="x" parameter to plot() or curve(). How can I supply the y-axis transformation in an elegant way, so that all graphics-related plotting will work on my axis-transformed plot? To demonstrate what I need, run the following example code:
## initialisation ##
beta <- 2;eta <- 1000
ticks <- c(seq(0.01,0.09,0.01),(1:9)/10,seq(0.91,0.99,0.01))
F0inv <- function (p) log(qweibull(p, 1, 1))
# this is the transformation function
F0 <- function (q) 1 - exp(-exp(q))
# this is the inverse of the transformation function
weibull <- function(x)pweibull(x,beta,eta)
# the curve of this function represents the weibull distribution
# as a straight line on weibull paper
weibull2 <- function(x)F0inv(weibull(x))
First an example of a Weibull distribution with beta=2 and eta=1000 on a regular, untransformed plot:
## untransformed axes ##
curve(weibull ,xlim=c(100,1e4),ylim=c(0.01,0.99))
abline(h=ticks,col="lightgray")
This plot is useless for Weibull analysis. Here is my currently implemented solution that transforms the data with function F0inv() and modifies the y-axis of the plot. Notice that I have to use F0inv() on all y-axis related data.
## transformed axis with F0inv() ##
curve(weibull2,xlim=c(100,1e4),ylim=F0inv(c(0.01,0.99)),log="x",axes=F)
axis(1);axis(2,at=F0inv(ticks),labels=ticks)
abline(h=F0inv(ticks),col="lightgray")
This works, but it is not very user-friendly: whenever the user wants to add annotations, they must always use F0inv():
text(300,F0inv(0.4),"at 40%")
I found that you can achieve a solution to my problem using ggplot2 and scales, but I don't want to change to a graphics package unless absolutely necessary since a lot of other code needs to be rewritten.
## with ggplot2 and scales ##
library(ggplot2)
library(scales)
weibull_trans <- function()trans_new("weibull", F0inv, F0)
qplot(c(100,1e4),xlim=c(100,1e4),ylim=c(0.01,0.99),
stat="function",geom="line",fun=weibull) +
coord_trans(x="log10",y = "weibull")
I think that if I could dynamically replace the code for applying the logarithmic transformation with my own, my problem would be solved.
I tried to find more information by Googling "R axis transformation", "R user coordinates", "R axis scaling" without useful results. Almost everything I have found dealt with logarithmic scales.
I tried to look into plot() at how the log="x" parameter works, but the relevant code for plot.window is written in C – not my strongest point at all.
While it doesn't appear to be possible in base graphics, you can make this function do what you want so that you can call it more simply:
F0inv <- function(p) log(qweibull(p, 1, 1))
## this is the transformation function
F0 <- function(q) 1 - exp(-exp(q))
## this is its inverse
weibullplot <- function(eta, beta,
                        ticks = c(seq(0.01, 0.09, 0.01), (1:9)/10, seq(0.91, 0.99, 0.01)),
                        ...) {
  ## the curve of this function represents the weibull distribution
  ## as a straight line on weibull paper
  weibull2 <- function(x) F0inv(pweibull(x, beta, eta))
  ## extra arguments in ... are forwarded to curve()
  curve(weibull2, xlim = c(100, 1e4), ylim = F0inv(c(0.01, 0.99)), log = "x", axes = FALSE, ...)
  axis(1)
  axis(2, at = F0inv(ticks), labels = ticks)
  abline(h = F0inv(ticks), col = "lightgray")
}
weibullplot(eta = 1000, beta = 2)
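One way to spare users the manual transformation when annotating is to wrap the base graphics calls so that they accept probabilities directly. The sketch below is an assumption about how that could look: to_weibull, from_weibull and wb_text are hypothetical names, with to_weibull computing the same quantity as F0inv() above, log(-log(1 - p)), whose inverse is 1 - exp(-exp(q)).

```r
# Hypothetical helper names; to_weibull() matches log(qweibull(p, 1, 1))
to_weibull <- function(p) log(-log(1 - p))
from_weibull <- function(q) 1 - exp(-exp(q))

# Wrapper around text() that takes y as a probability
wb_text <- function(x, p, labels, ...) text(x, to_weibull(p), labels, ...)

ticks <- c(0.01, 0.1, 0.5, 0.9, 0.99)
curve(to_weibull(pweibull(x, 2, 1000)), xlim = c(100, 1e4),
      ylim = to_weibull(c(0.01, 0.99)), log = "x", axes = FALSE,
      xlab = "x", ylab = "probability")
axis(1)
axis(2, at = to_weibull(ticks), labels = ticks)
wb_text(300, 0.4, "at 40%")   # no explicit transformation needed by the caller

# Round trip: transforming and back-transforming recovers the probabilities
p <- c(0.01, 0.4, 0.99)
round_trip <- from_weibull(to_weibull(p))
```

The same pattern extends to points(), lines() and abline(), at the cost of one small wrapper per function.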
