I am drawing a histogram using ggplot2 and overlaying a density plot (in black). I then overlay a normal density plot (in red).
library(ggplot2)
set.seed(1234)
dat <- data.frame(cond = factor(rep(c("A","B"), each=200)), rating = c(rnorm(200),rnorm(200, mean=.8)))
plot <- ggplot(dat, aes(x = rating))
plot <- plot + geom_histogram(aes(y=..density..), color="black", fill = "steelblue", binwidth = 0.5, alpha = 0.2)
plot <- plot + geom_density()
plot <- plot + stat_function(fun = dnorm, colour = "red", args = list(mean = 0.3, sd = 1))
plot
The plot looks the way I want, but it is missing a legend explaining the black and red density curves, and I have not been able to figure out how to add one.
I am learning R, and any help would be greatly appreciated.
One option: first map the colour aesthetic to a label inside aes(), e.g. aes(color = "Name you want"), so that each curve gets a legend entry, then assign the actual colours with scale_colour_manual.
plot <- ggplot(dat, aes(x = rating))
plot <- plot + geom_histogram(aes(y = ..density..), color = "black", fill = "steelblue", binwidth = 0.5, alpha = 0.2)
plot <- plot + geom_density(aes(color = "Density"))
plot <- plot + stat_function(aes(colour = "Normal"), fun = dnorm, args = list(mean = 0.3, sd = 1)) +
  scale_colour_manual("Legend title", values = c("black", "red"))
plot
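With unnamed values, scale_colour_manual pairs the colours with the labels in the order of the scale's levels (alphabetical here), which is why "Density" ends up black and "Normal" red. A minimal sketch that names the values instead, so the pairing does not depend on ordering. It assumes ggplot2 3.3.0 or later for after_stat(), and the legend title "Curve" is only a placeholder:

```r
library(ggplot2)

set.seed(1234)
dat <- data.frame(rating = c(rnorm(200), rnorm(200, mean = 0.8)))

p <- ggplot(dat, aes(x = rating)) +
  # histogram on the density scale so the curves are comparable
  geom_histogram(aes(y = after_stat(density)), color = "black",
                 fill = "steelblue", binwidth = 0.5, alpha = 0.2) +
  # a constant inside aes() creates a legend entry with that label
  geom_density(aes(colour = "Density")) +
  stat_function(aes(colour = "Normal"), fun = dnorm,
                args = list(mean = 0.3, sd = 1)) +
  # named vector: label on the left, colour on the right
  scale_colour_manual("Curve", values = c(Density = "black", Normal = "red"))
p
```

Adding or renaming a curve later then only requires a matching entry in the named vector.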
Related
Here is an example of overlaid bar plots:
library(data.table)
library(ggplot2)
set.seed(100)
dat <- data.frame(Axis=letters[1:10],V1=1:10, V2=runif(10, 1,10), V3=10:1)
ggplot(dat, aes(x = Axis)) + theme_classic() +
geom_col(aes(y = V1), fill = "darkred", alpha = .5) +
geom_col(aes(y = V2), fill = "blue", alpha = .5,
position = position_nudge(x = 0.2))
I only want the smoothed contours with shading below them, so that it looks like the example below. How can I do that for a discrete x-axis?
I wish to draw different density functions in the same histogram. This is one example:
ggplot(mtcars, aes(mpg)) +
geom_histogram(aes(y = ..count../40),
fill = "gray70", color = "gray50") +
geom_density(aes(color = "default")) +
geom_density(adjust = 2, aes(color = "longer")) +
geom_density(adjust = 1/2, aes(color = "shorter")) +
geom_density(kernel = "epanechnikov", aes(color = "epanechnikov")) +
geom_density(kernel = "rectangular", aes(color = "rectangular")) +
geom_density(kernel = "cosine", aes(color = "cosine"))
And here is a solution with base R graphics: use plot (or hist) for the first plot and lines for the subsequent ones.
Be sure to use freq = FALSE; otherwise the histogram shows counts and its area is not normalized to one, so the density curves will not be on the same scale.
x <- rnorm(50)
hist(x, freq = FALSE)
xx <- seq(min(x)-0.5, max(x)+0.5, 0.01)
lines(xx, dnorm(xx), col="red")
lines(density(x), col="blue")
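The normalization point can be checked numerically. A small sketch (the seed and variable names are arbitrary choices for illustration) that compares the total bar area on the density scale with the area on the count scale:

```r
set.seed(1)
x <- rnorm(50)

# plot = FALSE returns the histogram object without drawing it
h <- hist(x, plot = FALSE)

# bar height times bar width, summed over all bins
area_density <- sum(h$density * diff(h$breaks))  # heights used by freq = FALSE
area_counts  <- sum(h$counts  * diff(h$breaks))  # heights used by freq = TRUE

area_density
area_counts
```

On the density scale the bars integrate to one, which matches dnorm() and density(); on the count scale the area grows with the sample size and the bin width.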
I'm looking for a way to create a scatterplot with marginal histograms.
I found a solution using ggplot2 + cowplot (thanks #crsh):
library(ggplot2)
library(cowplot)
# Set up scatterplot
scatterplot <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
geom_point(size = 3, alpha = 0.6) +
guides(color = "none") + # "none" replaces the deprecated FALSE
theme(plot.margin = margin())
# Define marginal histogram
marginal_distribution <- function(data, var, group) {
ggplot(data, aes_string(x = var, fill = group)) +
geom_histogram(bins = 30, alpha = 0.4, position = "identity") +
# geom_density(alpha = 0.4, size = 0.1) +
guides(fill = "none") + # "none" replaces the deprecated FALSE
#theme_void() +
theme(plot.margin = margin())
}
# Set up marginal histograms
x_hist <- marginal_distribution(iris, "Sepal.Length", "Species")
y_hist <- marginal_distribution(iris, "Sepal.Width", "Species") +
coord_flip()
# Align histograms with scatterplot
aligned_x_hist <- align_plots(x_hist, scatterplot, align = "v")[[1]]
aligned_y_hist <- align_plots(y_hist, scatterplot, align = "h")[[1]]
# Arrange plots
plot_grid(
aligned_x_hist
, NULL
, scatterplot
, aligned_y_hist
, ncol = 2
, nrow = 2
, rel_heights = c(0.2, 1)
, rel_widths = c(1, 0.2)
)
I got:
Now, I have some questions:
a) How can I add a legend and a title without breaking the plot_grid?
b) The axes of the scatterplot do not match the axes of the histograms. How can I solve this?
Regards!
OK, I solved the axes problem:
I set xlim and ylim for the scatterplot based on the limits of the histograms:
scatterplot <- scatterplot +
xlim(layer_scales(x_hist)$x$range$range) +
ylim(layer_scales(y_hist)$x$range$range)
But I still don't know how to add a title and a legend.
I am trying to make a histogram of density values and overlay that with the curve of a density function (not the density estimate).
Using a simple standard normal example, here is some data:
x <- rnorm(1000)
I can do:
q <- qplot( x, geom="histogram")
q + stat_function( fun = dnorm )
but this puts the histogram on the frequency (count) scale rather than the density scale. With ..density.. I can get the proper scale on the histogram:
q <- qplot( x,..density.., geom="histogram")
q
But now this gives an error:
q + stat_function( fun = dnorm )
Is there something I am not seeing?
Another question: is there a way to plot the curve of a function, like curve(), but on its own rather than as a layer on top of data?
Here you go!
# create some data to work with
x = rnorm(1000);
# overlay histogram, empirical density and normal density
p0 = qplot(x, geom = 'blank') +
geom_line(aes(y = ..density.., colour = 'Empirical'), stat = 'density') +
stat_function(fun = dnorm, aes(colour = 'Normal')) +
geom_histogram(aes(y = ..density..), alpha = 0.4) +
scale_colour_manual(name = 'Density', values = c('red', 'blue')) +
theme(legend.position = c(0.85, 0.85))
print(p0)
A more bare-bones alternative to Ramnath's answer, passing the observed mean and standard deviation, and using ggplot instead of qplot:
df <- data.frame(x = rnorm(1000, 2, 2))
# overlay histogram and normal density
ggplot(df, aes(x)) +
geom_histogram(aes(y = after_stat(density))) +
stat_function(
fun = dnorm,
args = list(mean = mean(df$x), sd = sd(df$x)),
lwd = 2,
col = 'red'
)
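On the second part of the question (drawing a function curve on its own, like curve()), one possible approach is to give ggplot only an x-range and let stat_function evaluate the function over it. This is a sketch, not the only way to do it; the range of -4 to 4, the value n = 201 and the name p_curve are arbitrary choices for illustration:

```r
library(ggplot2)

# The data frame only supplies the x-range; no observations are drawn
p_curve <- ggplot(data.frame(x = c(-4, 4)), aes(x)) +
  # n sets how many points the function is evaluated at
  stat_function(fun = dnorm, n = 201)
p_curve
```

Because the curve is the only layer, the y-axis is determined by the function values alone.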
What about using geom_density() from ggplot2? Like so:
df <- data.frame(x = rnorm(1000, 2, 2))
ggplot(df, aes(x)) +
geom_histogram(aes(y=..density..)) + # scale histogram y
geom_density(col = "red")
This also works for multimodal distributions, for example:
df <- data.frame(x = c(rnorm(1000, 2, 2), rnorm(1000, 12, 2), rnorm(500, -8, 2)))
ggplot(df, aes(x)) +
geom_histogram(aes(y=..density..)) + # scale histogram y
geom_density(col = "red")
I tried this with the iris data set. This simple code should produce the graph you need:
ker_graph <- ggplot(iris, aes(x = Sepal.Length)) +
geom_histogram(aes(y = ..density..),
colour = 1, fill = "white") +
geom_density(lwd = 1.2,
linetype = 2,
colour = 2)