I would like to use ggplot2 to draw a lattice plot of densities produced by different methods, in which the same y-axis scale is used throughout.
I would like to set the upper limit of the y-axis to a value below the highest density value for any one method. However, ggplot by default removes the sections of the geom that fall outside the plotted region.
For example:
# Toy example of problem
library(ggplot2)
xval <- rnorm(10000)
#Base1
plot(density(xval))
#Base2
plot(density(xval), ylim=c(0, 0.3)) # densities > 0.3 not removed from plot
xval <- as.data.frame(xval)
ggplot(xval, aes(x=xval)) + geom_density() #gg1 - looks like Base1
ggplot(xval, aes(x=xval)) + geom_density() + ylim(0, 0.3)
#gg2: does not look like Base2 due to removal of density values > 0.3
These produce the images below:
How can I make the ggplot image not have the missing section?
Setting limits with xlim() or ylim() changes the scale, and every value outside the scale limits is treated as missing and dropped. Here that includes the computed density values above 0.3, which is why the density curve is cut off. Use coord_cartesian() instead to zoom in without losing any data points.
ggplot(xval, aes(x=xval)) +
geom_density() +
coord_cartesian(ylim = c(0, 0.3))
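To see the difference concretely, here is a minimal sketch that builds both versions of the plot and inspects the computed layer data with layer_data(). The seed and the object names (p_lim, p_zoom, d_lim, d_zoom) are illustrative assumptions, not part of the original question.

```r
library(ggplot2)

set.seed(1)  # arbitrary seed, only for reproducibility
df <- data.frame(xval = rnorm(10000))

# Scale limits: density values above 0.3 fall outside the scale and become NA
p_lim <- ggplot(df, aes(x = xval)) +
  geom_density() +
  ylim(0, 0.3)

# Coordinate zoom: all density values are kept, only the view is restricted
p_zoom <- ggplot(df, aes(x = xval)) +
  geom_density() +
  coord_cartesian(ylim = c(0, 0.3))

d_lim  <- layer_data(p_lim)
d_zoom <- layer_data(p_zoom)

anyNA(d_lim$y)        # were values censored under ylim()?
max(d_zoom$y) > 0.3   # is the peak still present under coord_cartesian()?
```

The missing values in the first plot are what produce the gap in the curve when it is drawn.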
I'm new to PCA. I'm plotting the scores using autoplot from ggfortify and using ggplot. Both plots have the same shape but different values on the x- and y-axes. E.g., autoplot goes from -0.2 to 0.2 on the y-axis, while ggplot goes from -0.6 to 0.6. The points on the graphs look exactly the same; only the axis values changed. Why is that?
Edit:
I can't really give the full data here as it's very long. I tried these two:
library(ggfortify)
pca.data <- prcomp(my_data)
autoplot(pca.data)
and
my_dataframe <- data.frame(Sample = rownames(pca.data$x),
X = pca.data$x[,1],
Y = pca.data$x[,2])
ggplot(data = my_dataframe, aes(x=X, y=Y, label=Sample)) +
geom_point() +
xlab("PC1") +
ylab("PC2") +
ggtitle("PCA Graph")
According to the vignette, autoplot scales in the same way as the biplot() function. If you don't want it to, you can instead use:
autoplot(pca.data, scale=0)
which (except for axis labels) gives the same result as the ggplot command that you used.
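As an illustration of that scaling using only base R, the sketch below reproduces what biplot.prcomp() does with its default scale = 1: each score column is divided by its standard deviation times sqrt(n). That ggfortify applies exactly the same factor is taken from the answer above rather than verified here, and the USArrests data set is only a stand-in for my_data.

```r
# Built-in data set used only for illustration
pca <- prcomp(USArrests, scale. = TRUE)

n   <- nrow(pca$x)
lam <- pca$sdev[1:2] * sqrt(n)   # scaling factor used by biplot.prcomp() when scale = 1

raw_scores    <- pca$x[, 1:2]              # what the manual ggplot() call plots
scaled_scores <- t(t(raw_scores) / lam)    # what biplot() plots by default

range(raw_scores[, 2])
range(scaled_scores[, 2])

# Same point pattern, different units: the two versions are perfectly correlated
cor(raw_scores[, 1], scaled_scores[, 1])
```

Because the rescaling is a per-axis division by a constant, the shape of the point cloud is unchanged and only the axis values differ.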
I am trying to add a fitted distribution to the histogram, but after I run it, it is just a straight line. How can I get a density line?
hist(data$price)
lines(density(data$price), lwd = 2, col = "red")
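You are using the graphics function hist, which plots counts by default, while density() is on the density scale, so the curve ends up flat along the bottom of the plot. Use the MASS function truehist instead, which plots the histogram on the density scale:
MASS::truehist(data$price)
lines(density(data$price), lwd = 2, col = "red")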
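If you would rather stay with base graphics, hist() can also draw on the density scale via freq = FALSE. A minimal sketch, assuming a made-up price vector in place of data$price (the seed and the log-normal parameters are arbitrary):

```r
set.seed(42)                                     # arbitrary seed
price <- rlnorm(500, meanlog = 3, sdlog = 0.5)   # made-up stand-in for data$price

# freq = FALSE draws the histogram on the density scale (bar areas sum to 1),
# which is the same scale density() uses
h <- hist(price, freq = FALSE, main = "Price", xlab = "price")
lines(density(price), lwd = 2, col = "red")

sum(h$density * diff(h$breaks))   # total area of the bars
```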
@Chriss gave a good solution: it does produce a density curve on top of the histogram. However, it changes the y-axis so that you only see the density values (losing the count values).
Here is an alternate solution that will place the frequency counts on the left-side y-axis and add density as a right-side y-axis. Tweak code as needed for things like bins, color, etc. I'm using the mtcars data as an example since there was no code or data provided in the question to replicate. In addition to the two libraries used here (ggpubr and cowplot), you may need to use some ggplot functions to better customize these plot options.
Code for this solution was modified from https://www.datanovia.com/en/blog/ggplot-histogram-with-density-curve-in-r-using-secondary-y-axis/
# packages needed
library(ggpubr)
library(cowplot)
# load data (none provided in the original question)
data("mtcars")
# create histogram (I have 10 bins here, but you may need a different amount)
phist <- gghistogram(mtcars, x="hp", bins=10, fill="blue", ylab="Count (blue)") + ggtitle("Car Horsepower Histogram")
# create density plot, removing many plot elements
pdens <- ggdensity(mtcars, x="hp", col="red", size=2, alpha = 0, ylab="Density (red)") +
scale_y_continuous(expand = expansion(mult = c(0, 0.05)), position = "right") +
theme_half_open(11, rel_small = 1) +
rremove("x.axis")+
rremove("xlab") +
rremove("x.text") +
rremove("x.ticks") +
rremove("legend")
# overlay and display the plots
aligned_plots <- align_plots(phist, pdens, align="hv", axis="tblr")
ggdraw(aligned_plots[[1]]) + draw_plot(aligned_plots[[2]])
I am trying to plot both geom_histogram and geom_density in one figure. When I plot the two separately, I get the output I want for each (histogram and density plot), but when I combine them, only the histogram is shown (regardless of the order of the histogram and density layers in the code).
My code looks like this:
ggplot(data=Stack_time, aes(x=values))+geom_density(alpha=0.2, fill="#FF6666")+
geom_histogram(binwidth = 50, colour="black", fill="#009454")
I do not receive any error message, but the geom_density is never shown in combination with the geom_histogram.
Since you did not provide any data here a solution based on mtcars:
Your code is nearly correct. You need to add an alpha value to your histogram so that you can see the density through it. You also need to rescale, because the histogram's y-axis shows counts while the density curve is on the density scale (the area under it is 1). If your data values are much larger than 1, the density curve can be so small that you can't see it. With the function scale_data, defined as follows, I scale my data to the range 0-1:
library(ggplot2)
df <- mtcars
# rescale a numeric vector to the range 0-1
scale_data <- function(x){(x-min(x))/(max(x)-min(x))}
df$mpg2 <- scale_data(df$mpg)
ggplot(data=df, aes(x=mpg2))+geom_density(alpha=0.2, fill="#FF6666")+
geom_histogram(binwidth = 0.05, colour="black", fill="#009454", alpha = 0.1) # binwidth chosen to fit the 0-1 range
This gives the expected output:
You can adjust this solution to your needs: either scale the data, or scale the density curve to the data.
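For the second option, scaling the density curve to the data, one common approach is to multiply the computed density by the number of observations and the bin width so that it lands on the count scale. A minimal sketch, assuming ggplot2 3.3.0 or later for after_stat(); the name bin_width and its value are illustrative choices:

```r
library(ggplot2)

bin_width <- 2   # histogram bin width in mpg units (illustrative choice)

p <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(binwidth = bin_width, colour = "black",
                 fill = "#009454", alpha = 0.3) +
  # after_stat(count) is density * n; multiplying by the bin width
  # puts the curve on the same scale as the histogram counts
  geom_density(aes(y = after_stat(count) * bin_width),
               alpha = 0.2, fill = "#FF6666")

p
```

This keeps the original x values and the count axis, at the cost of the curve no longer integrating to 1.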
This should do the job, approximately:
library(ggplot2)
library(magrittr) # provides %>%
data.frame(x=rnorm(1000)) %>%
ggplot(aes(x, ..density..)) +
geom_histogram(binwidth = 0.2, alpha=0.5) +
geom_density(fill="red", alpha=0.2)
I have data from 2 populations.
I'd like to get the histogram and density plot of both on the same graphic.
With one color for one population and another color for the other one.
I've tried this (example):
library(ggplot2)
AA <- rnorm(100000, 70,20)
BB <- rnorm(100000,120,20)
valores <- c(AA,BB)
grupo <- c(rep("AA", 100000),c(rep("BB", 100000)))
todo <- data.frame(valores, grupo)
ggplot(todo, aes(x=valores, fill=grupo, color=grupo)) +
geom_histogram(aes(y=..density..), binwidth=3)+ geom_density(aes(color=grupo))
But I'm just getting a graphic with a single line and a single color.
I would like to have different colors for the two density lines, and if possible for the histograms as well.
I've done it with ggplot2, but base R would also be OK.
Or, after changing something (I'm not sure what), I now get this:
ggplot(todo, aes(x=valores, fill=grupo, color=grupo)) +
geom_histogram( position="identity", binwidth=3, alpha=0.5)+
geom_density(aes(color=grupo))
but the density lines were not plotted.
or even strange things like
I suggest this ggplot2 solution:
ggplot(todo, aes(valores, color=grupo)) +
geom_histogram(position="identity", binwidth=3, aes(y=..density.., fill=grupo), alpha=0.5) +
geom_density()
@skan: Your attempt was close, but you plotted the frequencies instead of the density values in the histogram.
A base R solution could be:
hist(AA, probability = TRUE, col = rgb(1,0,0,0.5), border = rgb(1,0,0,1),
xlim=range(AA,BB), breaks= 50, ylim=c(0,0.025), main="AA and BB", xlab = "")
hist(BB, probability = TRUE, col = rgb(0,0,1,0.5), border = rgb(0,0,1,1), add=TRUE)
lines(density(AA))
lines(density(BB), lty=2)
For the transparency I used the alpha argument of rgb(), but there are other ways to set it; see alpha() in the scales package, for instance. I also added the breaks parameter to the plot of the AA group so that it uses narrower bins than the BB group.
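As one more alternative for the transparency, base R's adjustcolor() (from grDevices) takes a named colour and an alpha factor. A small sketch; the variable names are illustrative:

```r
# Two base R ways to write a half-transparent red
col_rgb    <- rgb(1, 0, 0, alpha = 0.5)
col_adjust <- adjustcolor("red", alpha.f = 0.5)

col_rgb
col_adjust

# Either value can be passed to hist(..., col = ...)
```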
I have some code that plots a histogram of some values, along with a few horizontal lines to represent reference points to compare against. However, ggplot is not generating a legend for the lines.
library(ggplot2)
library(dplyr)
## Simulate an equal mix of uniform and non-uniform observations on [0,1]
x <- data.frame(PValue=c(runif(500), rbeta(500, 0.25, 1)))
y <- c(Uniform=1, NullFraction=0.5) %>% data.frame(Line=names(.) %>% factor(levels=unique(.)), Intercept=.)
ggplot(x) +
aes(x=PValue, y=..density..) + geom_histogram(binwidth=0.02) +
geom_hline(aes(yintercept=Intercept, group=Line, color=Line, linetype=Line),
data=y, alpha=0.5)
I even tried reducing the problem to just plotting the lines:
ggplot(y) +
geom_hline(aes(yintercept=Intercept, color=Line)) + xlim(0,1)
and I still don't get a legend. Can anyone explain why my code isn't producing plots with legends?
By default, show_guide = FALSE for geom_hline. If you turn it on, the legend will appear. Since ggplot2 2.0.0 this argument is called show.legend, which is the name used in the code below. Also, alpha needs to be inside aes(), otherwise the colours of the lines will not be drawn properly in the legend. The code looks like this:
ggplot(x) +
aes(x=PValue, y=..density..) + geom_histogram(binwidth=0.02) +
geom_hline(aes(yintercept=Intercept, colour=Line, linetype=Line, alpha=0.5),
data=y, show.legend=TRUE)
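For readers on a recent ggplot2, here is a self-contained sketch of the same idea using show.legend (the newer name for show_guide) and after_stat(), which assumes ggplot2 3.3.0 or later. The reference lines are built as an ordinary data frame; the name ref_lines and the seed are illustrative, not from the question.

```r
library(ggplot2)

set.seed(123)  # arbitrary seed
x <- data.frame(PValue = c(runif(500), rbeta(500, 0.25, 1)))

# Reference lines as a plain data frame, one row per line
ref_lines <- data.frame(
  Line      = factor(c("Uniform", "NullFraction"),
                     levels = c("Uniform", "NullFraction")),
  Intercept = c(1, 0.5)
)

p <- ggplot(x, aes(x = PValue)) +
  geom_histogram(aes(y = after_stat(density)), binwidth = 0.02) +
  # mapping colour and linetype inside aes() is what creates the legend entries
  geom_hline(aes(yintercept = Intercept, colour = Line, linetype = Line),
             data = ref_lines, show.legend = TRUE)

p
```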
And output: