This code
library(ggplot2)
library(MASS)
# Generate gamma rvs
x <- rgamma(100000, shape = 2, rate = 0.2)
den <- density(x)
dat <- data.frame(x = den$x, y = den$y)
ggplot(data = dat, aes(x = x, y = y)) +
geom_point(size = 3) +
theme_classic()
# Fit parameters (to avoid errors, set lower bounds to zero)
fit.params <- fitdistr(x, "gamma", lower = c(0, 0))
# Plot using density points
ggplot(data = dat, aes(x = x,y = y)) +
geom_point(size = 3) +
geom_line(aes(y = dgamma(x, fit.params$estimate["shape"], fit.params$estimate["rate"])),
color = "red", size = 1) +
theme_classic()
fits a gamma distribution to the series x and plots the fitted density over the kernel density estimate.
The stats and MASS packages do not seem to support the Rayleigh distribution. How can I extend the code above to fit a Rayleigh distribution?
In the code below I start by recreating the vector x, this time setting the RNG seed to make the results reproducible. Then a data.frame dat containing only that vector is created.
The Gamma and Rayleigh densities are then overlaid on the histogram of x: their parameters are estimated first, and the fitted curves are drawn with stat_function.
library(ggplot2)
library(MASS)
library(extraDistr) # for the Rayleigh distribution functions
# Generate gamma rvs
set.seed(2020)
x <- rgamma(100000, shape = 2, rate = 0.2)
dat <- data.frame(x)
# Fit parameters (to avoid errors, set lower bounds to zero)
fit.params <- fitdistr(dat$x, "gamma", lower = c(0, 0))
ggplot(data = dat, aes(x = x)) +
geom_histogram(aes(y = ..density..), bins = nclass.Sturges(x)) +
stat_function(fun = dgamma,
args = list(shape = fit.params$estimate["shape"],
rate = fit.params$estimate["rate"]),
color = "red", size = 1) +
ggtitle("Gamma density") +
theme_classic()
fit.params.2 <- fitdistrplus::fitdist(dat$x, "rayleigh", start = list(sigma = 1))
fit.params.2$estimate
ggplot(data = dat, aes(x = x)) +
geom_histogram(aes(y = ..density..), bins = nclass.Sturges(x)) +
stat_function(fun = drayleigh,
args = list(sigma = fit.params.2$estimate),
color = "blue", size = 1) +
ggtitle("Rayleigh density") +
theme_classic()
To plot points and lines as in the question, rather than histograms, use the code below.
den <- density(x)
orig <- data.frame(x = den$x, y = den$y)
ggplot(data = orig, aes(x = x)) +
geom_point(aes(y = y), size = 3) +
geom_line(aes(y = dgamma(x, fit.params$estimate["shape"], fit.params$estimate["rate"])),
color="red", size = 1) +
geom_line(aes(y = drayleigh(x, fit.params.2$estimate)),
color="blue", size = 1) +
theme_classic()
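As a cross-check that does not depend on extraDistr or fitdistrplus, the Rayleigh scale has a closed-form maximum-likelihood estimate, sigma_hat = sqrt(sum(x^2) / (2 * n)). The sketch below uses base R only. The helper names drayleigh_manual, rrayleigh_manual and fit_rayleigh_mle are made up for illustration and are not part of any package.

```r
# Rayleigh density: f(x; sigma) = x / sigma^2 * exp(-x^2 / (2 * sigma^2)), x >= 0
# (hypothetical helper, written out by hand)
drayleigh_manual <- function(x, sigma) {
  ifelse(x >= 0, x / sigma^2 * exp(-x^2 / (2 * sigma^2)), 0)
}

# Inverse-CDF sampler, used here only to check the estimator
rrayleigh_manual <- function(n, sigma) {
  sigma * sqrt(-2 * log(runif(n)))
}

# Closed-form maximum-likelihood estimate of sigma
fit_rayleigh_mle <- function(x) {
  sqrt(sum(x^2) / (2 * length(x)))
}

set.seed(2020)
r <- rrayleigh_manual(100000, sigma = 3)
sigma_hat <- fit_rayleigh_mle(r)
sigma_hat  # should be close to the true value of 3

# The same estimator applied to a gamma sample like the one in the answer
x <- rgamma(100000, shape = 2, rate = 0.2)
sigma_gamma <- fit_rayleigh_mle(x)
sigma_gamma
```

Since fitdist() maximizes the same likelihood numerically, fit.params.2$estimate should agree closely with the closed-form value for the same x.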
I am trying to follow this tutorial: https://rc2e.com/timeseriesanalysis (bottom of the page) and plot a smoothed time series and the original time series on the same plot. I have simulated some data below, smoothed it, and then tried to plot it.
library(dplyr)
library(KernSmooth)
library(ggplot2)
a = rnorm(2000,10,10)
y = ts(a, frequency = 12)
gridsize <- length(y)
bw <- dpill(t, y, gridsize = gridsize)
lp <- locpoly(x = t, y = y, bandwidth = bw, gridsize = gridsize)
smooth <- lp$y
ggplot() +
geom_line(aes(x = t, y = y)) +
geom_line(aes(x = t, y = smooth), linetype = 2)
However, there seems to be a problem. The first error that appears is: 'x' must be atomic for 'sort.list', method "shell" and "quick"
Could someone please tell me what I am doing wrong?
Thanks
You can fit a smoothed curve to a time series directly in ggplot. Here's an example using gam inside geom_smooth:
library(ggplot2)
set.seed(1)
a <- cumsum(rnorm(2000, 0.1, 10))
t <- seq(as.Date("1854-06-01"), by = "1 month", length.out = 2000)
ggplot(data.frame(t, a), aes(t, a)) +
geom_point(size = 0.1, color = "orange2", alpha = 0.5) +
geom_smooth(method = 'gam', formula = y ~ s(x, k = 30, bs = "cs"),
fill = "orange", color = "orange4", linetype = 2) +
theme_bw()
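If you would rather keep the KernSmooth approach from the question, the error most likely arises because t is never defined there, so R falls back to the base function t() (matrix transpose), which cannot be sorted. Below is a minimal sketch of a fix, assuming the time index can be taken from time(y). The names y_num, original and smoothed are illustrative choices, not from the question.

```r
library(KernSmooth)
library(ggplot2)

set.seed(1)
a <- rnorm(2000, 10, 10)
y <- ts(a, frequency = 12)

# Define the time index explicitly; in the question `t` was the base function t()
t <- as.numeric(time(y))
y_num <- as.numeric(y)

gridsize <- length(y_num)
bw <- dpill(t, y_num, gridsize = gridsize)  # plug-in bandwidth
lp <- locpoly(x = t, y = y_num, bandwidth = bw, gridsize = gridsize)

# locpoly() returns its own evaluation grid in lp$x, so plot against that
original <- data.frame(t = t, y = y_num)
smoothed <- data.frame(t = lp$x, y = lp$y)

p <- ggplot() +
  geom_line(data = original, aes(x = t, y = y), colour = "grey60") +
  geom_line(data = smoothed, aes(x = t, y = y), linetype = 2)
p
```

Putting each series in its own data frame avoids relying on free-floating vectors inside aes(), which is what made the original error hard to spot.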
I am drawing several plots in my Shiny app.
Using geom_smooth(), I am fitting a smoothing curve to a scatterplot.
I build these plots with ggplot() and render them with ggplotly().
Is there any way I can exclude a particular data profile from geom_smooth()?
For example:
As the fit shows, the curve is distorted by that profile, which is not desirable. I have tried plotly_click(), plotly_brush(), and plotly_select(), but I don't want user interference when plotting this fit, because it makes the process much slower and less accurate.
Here is my code to plot this:
#plot
g <- ggplot(data = d_f4, aes_string(x = d_f4$x, y = d_f4$y)) + theme_bw() +
geom_point(colour = "blue", size = 0.1)+
geom_smooth(formula = y ~ splines::bs(x, df = 10), method = "lm", color = "green3", level = 1, size = 1)
Unfortunately, I cannot include my dataset in my question because it is quite big.
You can make an extra data.frame without the "outliers" and use this as the input for geom_smooth:
set.seed(8)
test_data <- data.frame(x = 1:100)
test_data$y <- sin(test_data$x / 10) + rnorm(100, sd = 0.1)
test_data[60:65, "y"] <- test_data[60:65, "y"] + 1
data_plot <- test_data[-c(60:65), ]
library(ggplot2)
ggplot(data = test_data, aes(x = x, y = y)) + theme_bw() +
geom_point(colour = "blue", size = 0.1) +
geom_smooth(formula = y ~ splines::bs(x, df = 10), method = "lm", color = "green3", level = 1, size = 1)
ggplot(data = test_data, aes(x = x, y = y)) + theme_bw() +
geom_point(colour = "blue", size = 0.1) +
geom_smooth(data = data_plot, formula = y ~ splines::bs(x, df = 10), method = "lm", color = "green3", level = 1, size = 1)
Created on 2020-11-27 by the reprex package (v0.3.0)
BTW: you don't need aes_string() (which is deprecated) or d_f4$x; you can just use aes(x = x, y = y).
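A variation on the same idea, in case the unwanted profile can be identified by a column: the data argument of a layer also accepts a function, which ggplot2 calls with the plot data, so the filtering can live inside geom_smooth() without a second data.frame. This is only a sketch. The profile column and the keep_main helper are hypothetical, since the real dataset is not shown.

```r
library(ggplot2)

set.seed(8)
test_data <- data.frame(x = 1:100)
test_data$y <- sin(test_data$x / 10) + rnorm(100, sd = 0.1)
test_data[60:65, "y"] <- test_data[60:65, "y"] + 1

# Hypothetical label column marking which profile each row belongs to
test_data$profile <- "main"
test_data$profile[60:65] <- "disturbed"

# Filter applied by the layer itself; ggplot2 passes it the plot data
keep_main <- function(d) d[d$profile != "disturbed", ]

p <- ggplot(test_data, aes(x = x, y = y)) + theme_bw() +
  geom_point(colour = "blue", size = 0.1) +
  geom_smooth(data = keep_main, formula = y ~ splines::bs(x, df = 10),
              method = "lm", color = "green3")
p
```

All points are still drawn, but only the rows returned by keep_main() reach the smoother, so no user interaction is needed.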
I am trying to make a histogram of density values and overlay that with the curve of a density function (not the density estimate).
Using a simple standard normal example, here is some data:
x <- rnorm(1000)
I can do:
q <- qplot( x, geom="histogram")
q + stat_function( fun = dnorm )
but this gives the scale of the histogram in frequencies and not densities. With ..density.. I can get the proper scale on the histogram:
q <- qplot( x,..density.., geom="histogram")
q
But now this gives an error:
q + stat_function( fun = dnorm )
Is there something I am not seeing?
Another question: is there a way to plot the curve of a function, like curve(), but not as a layer?
Here you go!
# create some data to work with
x = rnorm(1000);
# overlay histogram, empirical density and normal density
p0 = qplot(x, geom = 'blank') +
geom_line(aes(y = ..density.., colour = 'Empirical'), stat = 'density') +
stat_function(fun = dnorm, aes(colour = 'Normal')) +
geom_histogram(aes(y = ..density..), alpha = 0.4) +
scale_colour_manual(name = 'Density', values = c('red', 'blue')) +
theme(legend.position = c(0.85, 0.85))
print(p0)
A more bare-bones alternative to Ramnath's answer, passing the observed mean and standard deviation, and using ggplot instead of qplot:
df <- data.frame(x = rnorm(1000, 2, 2))
# overlay histogram and normal density
ggplot(df, aes(x)) +
geom_histogram(aes(y = after_stat(density))) +
stat_function(
fun = dnorm,
args = list(mean = mean(df$x), sd = sd(df$x)),
lwd = 2,
col = 'red'
)
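The mismatch in the question's first attempt is one of scale: for equal-width bins, a bar's density equals count / (n * binwidth), so a density curve only lines up with a count histogram if it is multiplied by n * binwidth. Below is a small base R check of that identity, followed by a sketch of the count-scale overlay. The binwidth of 0.25 is an arbitrary choice.

```r
library(ggplot2)

set.seed(1)
x <- rnorm(1000)
n <- length(x)
binwidth <- 0.25

# Base R check: density = count / (n * binwidth) for equal-width bins
breaks <- seq(floor(min(x)), ceiling(max(x)), by = binwidth)
h <- hist(x, breaks = breaks, plot = FALSE)
scaled <- h$counts / (n * binwidth)
all.equal(h$density, scaled)

# Count-scale histogram with the normal curve rescaled to counts
p <- ggplot(data.frame(x = x), aes(x)) +
  geom_histogram(binwidth = binwidth) +
  stat_function(fun = function(z) dnorm(z) * n * binwidth, colour = "red")
p
```

This keeps the y axis in counts, which can be easier to read than densities when the audience cares about how many observations fall in each bin.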
What about using geom_density() from ggplot2? Like so:
df <- data.frame(x = rnorm(1000, 2, 2))
ggplot(df, aes(x)) +
geom_histogram(aes(y=..density..)) + # scale histogram y
geom_density(col = "red")
This also works for multimodal distributions, for example:
df <- data.frame(x = c(rnorm(1000, 2, 2), rnorm(1000, 12, 2), rnorm(500, -8, 2)))
ggplot(df, aes(x)) +
geom_histogram(aes(y=..density..)) + # scale histogram y
geom_density(col = "red")
Here is an example with the iris data set. This simple code should produce the graph you need:
library(ggplot2)
ker_graph <- ggplot(iris, aes(x = Sepal.Length)) +
geom_histogram(aes(y = ..density..),
colour = 1, fill = "white") +
geom_density(lwd = 1.2,
linetype = 2,
colour = 2)
ker_graph # print the plot
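On the last part of the question (plotting a function on its own, like curve()): stat_function() only needs an x range, so a two-row data frame holding the limits is enough. A minimal sketch; the range of -4 to 4 and the name limits are arbitrary choices.

```r
library(ggplot2)

# Only the x limits are needed; stat_function() evaluates dnorm on a grid between them
limits <- data.frame(x = c(-4, 4))

p <- ggplot(limits, aes(x = x)) +
  stat_function(fun = dnorm, n = 201) +
  ylab("dnorm(x)")
p

# The points that were actually drawn can be inspected with layer_data()
curve_points <- layer_data(p, 1)
head(curve_points)
```

The function curve is still technically a layer, but it is the only one, so the result is equivalent to calling curve(dnorm, -4, 4) in base graphics.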