Why doesn't geom_hline generate a legend in ggplot2? - r

I have some code that plots a histogram of some values, along with a few horizontal lines that mark reference points to compare against. However, ggplot is not generating a legend for the lines.
library(ggplot2)
library(dplyr)
## Simulate an equal mix of uniform and non-uniform observations on [0,1]
x <- data.frame(PValue=c(runif(500), rbeta(500, 0.25, 1)))
y <- c(Uniform=1, NullFraction=0.5) %>% data.frame(Line=names(.) %>% factor(levels=unique(.)), Intercept=.)
ggplot(x) +
aes(x=PValue, y=..density..) + geom_histogram(binwidth=0.02) +
geom_hline(aes(yintercept=Intercept, group=Line, color=Line, linetype=Line),
data=y, alpha=0.5)
I even tried reducing the problem to just plotting the lines:
ggplot(y) +
geom_hline(aes(yintercept=Intercept, color=Line)) + xlim(0,1)
and I still don't get a legend. Can anyone explain why my code isn't producing plots with legends?

By default show_guide = FALSE for geom_hline. If you turn this on then the legend will appear. Also, alpha needs to be inside of aes otherwise the colours of the lines will not be plotted properly (on the legend). The code looks like this:
ggplot(x) +
aes(x=PValue, y=..density..) + geom_histogram(binwidth=0.02) +
geom_hline(aes(yintercept=Intercept, colour=Line, linetype=Line, alpha=0.5),
data=y, show_guide=TRUE)
And output:
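In newer ggplot2 releases (2.0.0 and later, as far as I know) the show_guide argument was renamed show.legend, so the call above may warn or fail there. Here is a minimal sketch of the same plot with the newer spelling. It assumes ggplot2 3.3.0 or later for after_stat(), fixes a seed for reproducibility, builds the reference-line data frame without dplyr, and leaves alpha out entirely; none of that is from the original answer.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible

# Equal mix of uniform and non-uniform observations on [0, 1]
x <- data.frame(PValue = c(runif(500), rbeta(500, 0.25, 1)))

# Reference lines, one row per line
y <- data.frame(
  Line = factor(c("Uniform", "NullFraction"),
                levels = c("Uniform", "NullFraction")),
  Intercept = c(1, 0.5)
)

p <- ggplot(x, aes(x = PValue)) +
  # after_stat(density) is the newer spelling of ..density..
  geom_histogram(aes(y = after_stat(density)), binwidth = 0.02) +
  # show.legend replaces the older show_guide argument
  geom_hline(aes(yintercept = Intercept, colour = Line, linetype = Line),
             data = y, show.legend = TRUE)
p
```

Because colour and linetype are both mapped to Line, they should end up merged into a single legend.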

Related

Ggplot2 Boxplot width setting changes x-axis

I have produced a boxplot with a continuous x-axis using geom_boxplot() in ggplot2. However, as there are many boxes, they appear as skinny lines. Another Stack Overflow thread suggested using the width= argument to make all the boxes the same width. However, when I use this argument it changes the x-axis and some of the boxes just disappear!
For example, take this example data frame. I apologise for the number of observations, but I think the quantity has something to do with the problem, as I couldn't reproduce it with a simpler boxplot:
Lat<- c(50.70228,50.70228,50.70228,51.82067,51.82067,51.82067,52.45893,52.45893,52.45893,52.76478,52.76478,52.76478,52.78354,52.78354,52.78354,53.56102,53.56102,53.56102,53.65364,53.65364,53.65364,53.63130,53.63130,53.63130,54.19035,54.19035,54.19035,54.25751,54.25751,54.25751,54.23526,54.23526,54.23526,54.62469,54.62469,54.62469,54.67831,54.67831,54.67831,54.67900,54.67900,54.67900,54.94908,54.94908,54.94908,55.19456,55.19456,55.19456,54.79198,54.79198,54.79198,55.34981,55.34981,55.34981,55.85655,55.85655,55.85655,56.06078,56.06078,56.06078,55.84553,55.84553,55.84553,56.00197,56.00197,56.00197,56.71842,56.71842,56.71842,57.00116,57.00116,57.00116,57.06942,57.06942,57.06942,57.26815,57.26815,57.26815,57.45532,57.45532,57.45532,57.88596,57.88596,57.88596,51.07711,51.07711,51.07711,51.07801,51.07621,51.11159,51.11159,51.11159,52.02484,52.02484,52.02484,52.02581,52.02581,52.02581,52.02685,52.02685,52.02685,52.05353,52.05353,52.05626,52.05353,52.05353,52.05353,52.05353,52.05353,52.05353,51.93541,51.93541,51.93541,51.93541,51.93541,51.93541,51.93541,51.93541,52.92425,52.92425,52.92425,52.92425,52.92425,52.92425,52.92425,52.92425,52.92425,52.92425,52.92425,52.92425,52.92425,52.92425,52.90810,52.90810,52.90810,52.90810,52.90810,52.90810,52.78968,52.78778,52.78968,52.78968,52.78881,52.78883,52.78883,52.78883,52.78970,52.78970,52.79506,52.79506,52.79506,53.77270,53.77276,53.77109,53.77109,53.77276,53.76845,53.76845,53.77109,53.76845,53.77109,53.87020,53.87020,53.87020,53.87103,53.88205,53.88205,53.88205,53.88205,53.87701,53.87701,53.87098,53.87098,53.87098,53.86932,53.86932,53.86932,56.51869,56.51869,56.51869,56.55870,56.55870,56.55870,56.55964,56.55964,56.55964,57.51056,57.49542,57.49542,57.50878,57.50878,57.50878,57.45201,57.45477,57.45192,57.45192,57.45192)
y <- c(33.45407,21.40954,27.73487,20.38318,26.65483,31.68201,23.95467,20.77363,32.94192,22.71228,25.78824,28.39449,35.60615,24.29325,22.95047,25.65343,30.23262,22.05534,37.20565,35.53812,38.20211,39.38034,35.16619,38.82336,29.72370,38.25754,26.51339,39.38283,29.57483,31.80111,24.52967,34.83037,21.75038,35.50868,39.41830,21.96971,22.82504,32.69746,35.10747,27.75669,34.96690,37.61921,37.17226,20.50448,39.26582,22.08668,28.41502,36.69530,23.69404,23.18052,33.27420,23.04157,33.17285,32.00579,21.83845,22.97143,32.27190,21.53771,38.65481,20.14341,33.62718,39.86755,39.77881,30.59810,27.65909,24.11646,34.56981,29.30249,34.99361,32.39553,28.90443,34.88775,22.77049,36.44468,30.64496,35.81501,31.77673,24.19058,39.36298,21.47219,23.02268,31.37647,27.28457,33.14749,23.20842,39.73427,39.81399,35.51515,24.55080,39.41190,29.59987,38.46791,20.94479,37.22109,26.36060,30.91641,39.25975,39.88288,22.59061,30.24439,21.66110,30.36878,28.76901,38.75561,33.80408,31.05842,26.18921,21.30804,35.02966,33.85981,30.84373,31.67341,35.07605,37.93820,31.30481,21.45117,37.13626,25.70964,25.64736,38.58381,31.24448,26.55902,23.90817,33.70300,26.48909,37.73200,32.52413,22.44440,28.19878,32.46415,25.13711,26.66075,28.16254,20.40673,39.89327,30.83327,32.40196,39.81218,39.80391,21.87316,34.95792,33.38958,38.18441,22.03114,35.64410,34.90643,24.23056,36.66581,29.35813,20.86880,30.02044,36.13727,24.65558,39.43175,29.00154,29.78185,22.89196,37.15204,35.88188,28.73920,28.04934,37.50701,30.36306,28.39842,35.20973,26.54260,29.57763,26.03163,26.90440,27.60110,25.80086,39.98019,21.59970,28.83825,32.01711,20.50812,38.43331,32.41898,27.68722,32.59905,24.18150,29.05701,22.38512,32.93342,37.66694,37.65391,34.19613,23.89985,36.90012,20.74244,27.08511,29.21433,35.83771,35.59557,33.74533,27.08854,38.38994)
V3 <-c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2)
df <- as.data.frame(cbind(Lat, y, as.factor(V3)))
head(df)
I plot it on a continuous x-axis as so:
df_plot <- ggplot(df, aes(x=Lat, y=y, group=Lat))+
geom_boxplot(aes(colour=as.factor(V3)))+
theme_classic()
df_plot
Which produces:
As you can see the boxes are represented as skinny lines.
Therefore I tried to use the width= argument as so:
df_plot2 <- ggplot(df, aes(x=Lat, y=y, group=Lat))+
geom_boxplot(aes(colour=as.factor(V3)), width=1)+
theme_classic()
df_plot2
The output is:
The main thing to notice here is that the x-axis range has suddenly changed! Some of the boxes are no longer plotted whilst others seem to be placed at different values of the x-axis.
The range of the x-axis should be:
range(df$Lat)
[1] 50.70228 57.88596
I am completely perplexed as to why the x-axis would change simply by adding the width= argument in geom_boxplot(). I therefore tried to force the limits of the x-axis scale as so:
df_plot3 <- ggplot(df, aes(x=Lat, y=y, group=Lat))+
geom_boxplot(aes(colour=as.factor(V3)), width=1)+
xlim(50,58)+
theme_classic()
df_plot3
Output:
Please send help!
I think the strange behaviour comes from ggplot trying to automatically dodge your boxplots apart. By setting position = position_dodge(width = 0) the plot seems to be created as expected without changing the placement of boxes along the x-axis. (But gives a warning about overlapping x intervals)
Lat<- c(50.70228,50.70228,50.70228,51.82067,51.82067,51.82067,52.45893,52.45893,52.45893,52.76478,52.76478,52.76478,52.78354,52.78354,52.78354,53.56102,53.56102,53.56102,53.65364,53.65364,53.65364,53.63130,53.63130,53.63130,54.19035,54.19035,54.19035,54.25751,54.25751,54.25751,54.23526,54.23526,54.23526,54.62469,54.62469,54.62469,54.67831,54.67831,54.67831,54.67900,54.67900,54.67900,54.94908,54.94908,54.94908,55.19456,55.19456,55.19456,54.79198,54.79198,54.79198,55.34981,55.34981,55.34981,55.85655,55.85655,55.85655,56.06078,56.06078,56.06078,55.84553,55.84553,55.84553,56.00197,56.00197,56.00197,56.71842,56.71842,56.71842,57.00116,57.00116,57.00116,57.06942,57.06942,57.06942,57.26815,57.26815,57.26815,57.45532,57.45532,57.45532,57.88596,57.88596,57.88596,51.07711,51.07711,51.07711,51.07801,51.07621,51.11159,51.11159,51.11159,52.02484,52.02484,52.02484,52.02581,52.02581,52.02581,52.02685,52.02685,52.02685,52.05353,52.05353,52.05626,52.05353,52.05353,52.05353,52.05353,52.05353,52.05353,51.93541,51.93541,51.93541,51.93541,51.93541,51.93541,51.93541,51.93541,52.92425,52.92425,52.92425,52.92425,52.92425,52.92425,52.92425,52.92425,52.92425,52.92425,52.92425,52.92425,52.92425,52.92425,52.90810,52.90810,52.90810,52.90810,52.90810,52.90810,52.78968,52.78778,52.78968,52.78968,52.78881,52.78883,52.78883,52.78883,52.78970,52.78970,52.79506,52.79506,52.79506,53.77270,53.77276,53.77109,53.77109,53.77276,53.76845,53.76845,53.77109,53.76845,53.77109,53.87020,53.87020,53.87020,53.87103,53.88205,53.88205,53.88205,53.88205,53.87701,53.87701,53.87098,53.87098,53.87098,53.86932,53.86932,53.86932,56.51869,56.51869,56.51869,56.55870,56.55870,56.55870,56.55964,56.55964,56.55964,57.51056,57.49542,57.49542,57.50878,57.50878,57.50878,57.45201,57.45477,57.45192,57.45192,57.45192)
y <- c(33.45407,21.40954,27.73487,20.38318,26.65483,31.68201,23.95467,20.77363,32.94192,22.71228,25.78824,28.39449,35.60615,24.29325,22.95047,25.65343,30.23262,22.05534,37.20565,35.53812,38.20211,39.38034,35.16619,38.82336,29.72370,38.25754,26.51339,39.38283,29.57483,31.80111,24.52967,34.83037,21.75038,35.50868,39.41830,21.96971,22.82504,32.69746,35.10747,27.75669,34.96690,37.61921,37.17226,20.50448,39.26582,22.08668,28.41502,36.69530,23.69404,23.18052,33.27420,23.04157,33.17285,32.00579,21.83845,22.97143,32.27190,21.53771,38.65481,20.14341,33.62718,39.86755,39.77881,30.59810,27.65909,24.11646,34.56981,29.30249,34.99361,32.39553,28.90443,34.88775,22.77049,36.44468,30.64496,35.81501,31.77673,24.19058,39.36298,21.47219,23.02268,31.37647,27.28457,33.14749,23.20842,39.73427,39.81399,35.51515,24.55080,39.41190,29.59987,38.46791,20.94479,37.22109,26.36060,30.91641,39.25975,39.88288,22.59061,30.24439,21.66110,30.36878,28.76901,38.75561,33.80408,31.05842,26.18921,21.30804,35.02966,33.85981,30.84373,31.67341,35.07605,37.93820,31.30481,21.45117,37.13626,25.70964,25.64736,38.58381,31.24448,26.55902,23.90817,33.70300,26.48909,37.73200,32.52413,22.44440,28.19878,32.46415,25.13711,26.66075,28.16254,20.40673,39.89327,30.83327,32.40196,39.81218,39.80391,21.87316,34.95792,33.38958,38.18441,22.03114,35.64410,34.90643,24.23056,36.66581,29.35813,20.86880,30.02044,36.13727,24.65558,39.43175,29.00154,29.78185,22.89196,37.15204,35.88188,28.73920,28.04934,37.50701,30.36306,28.39842,35.20973,26.54260,29.57763,26.03163,26.90440,27.60110,25.80086,39.98019,21.59970,28.83825,32.01711,20.50812,38.43331,32.41898,27.68722,32.59905,24.18150,29.05701,22.38512,32.93342,37.66694,37.65391,34.19613,23.89985,36.90012,20.74244,27.08511,29.21433,35.83771,35.59557,33.74533,27.08854,38.38994)
V3 <-c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2)
library(ggplot2)
df <- as.data.frame(cbind(Lat, y, as.factor(V3)))
df_plot <- ggplot(df) +
geom_boxplot(aes(colour=as.factor(V3), x=Lat, y=y, group=as.factor(Lat)),
position=position_dodge(width = 0),
width=1) +
theme_classic()
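If dodging really is the cause, another option that might work is to switch it off altogether with position = "identity". Below is a minimal sketch on a small simulated stand-in for the data (the Lat values, group sizes and width here are made up for illustration, and I have not checked it against the full data set from the question).

```r
library(ggplot2)

set.seed(1)  # assumption: simulated stand-in for the question's data

lat <- rep(c(50.7, 51.8, 52.5, 54.2, 57.9), each = 6)
d <- data.frame(
  Lat = lat,
  y   = runif(length(lat), 20, 40),
  V3  = factor(rep(c(1, 2, 1, 1, 2), each = 6))
)

p <- ggplot(d, aes(x = Lat, y = y, group = Lat)) +
  # position = "identity" draws each box at its own x value, with no dodging
  geom_boxplot(aes(colour = V3), position = "identity", width = 0.5) +
  theme_classic()
p
```

With no dodging, boxes at nearby latitudes can overlap, so the width still has to be chosen with the spacing of the x values in mind.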

ggplot doesn't plot two geom in one figure R

I am trying to plot both geom_histogram and geom_density in one figure. When I plot the two separately I get the output I want for each (a histogram and a density plot), but when I try to combine them only the histogram is shown (regardless of the order of the histogram and density layers in the code).
My code looks like this:
ggplot(data=Stack_time, aes(x=values))+geom_density(alpha=0.2, fill="#FF6666")+
geom_histogram(binwidth = 50, colour="black", fill="#009454")
I do not receive any error message, but the geom_density is never shown in combination with the geom_histogram.
Since you did not provide any data, here is a solution based on mtcars:
Your code is nearly correct. You need to add an alpha value to your histogram so that you can see the density behind it. You also need to rescale your data, because the density curve is drawn on a density scale (the area under the curve equals 1) while the histogram shows counts. If your data values are much larger than 1, the density curve can be so small that you cannot see it. With the function scale_data defined below, I scale my data to the range 0-1:
df=mtcars
scale_data <- function(x){(x-min(x))/(max(x)-min(x))}
df$mpg2 <- scale_data(df$mpg)
library(ggplot2)
ggplot(data=df, aes(x=mpg2))+geom_density(alpha=0.2, fill="#FF6666")+
geom_histogram(binwidth = 0.05, colour="black", fill="#009454", alpha = 0.1) # bin width chosen for data rescaled to [0, 1]; 50 would put everything in one bin
This gives the expected output:
You can adjust this solution to your needs: either rescale the data, or rescale the density curve to match the data.
This should do the job, approximately:
data.frame(x=rnorm(1000)) %>% ggplot(aes(x, ..density..)) + geom_histogram(binwidth = 0.2, alpha=0.5) + geom_density(fill="red", alpha=0.2)
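A third option is to leave the histogram in counts and rescale the density curve instead. This is a minimal sketch, assuming ggplot2 3.3.0 or later for after_stat(); the simulated data frame d is only a stand-in for Stack_time, which was not provided.

```r
library(ggplot2)

set.seed(42)  # assumption: simulated data stands in for Stack_time$values
d <- data.frame(values = rnorm(1000, mean = 500, sd = 150))
bw <- 50  # same bin width as in the question

p <- ggplot(d, aes(x = values)) +
  geom_histogram(binwidth = bw, colour = "black", fill = "#009454") +
  # after_stat(count) is density * n; multiplying by the bin width puts
  # the curve on the same scale as the histogram bar heights
  geom_density(aes(y = after_stat(count) * bw), alpha = 0.2, fill = "#FF6666")
p
```

The histogram is added first here so that the semi-transparent density is drawn on top of the bars.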

3-variables plotting heatmap ggplot2

I'm currently working on a very simple data.frame, containing three columns:
x contains x-coordinates of a set of points,
y contains y-coordinates of the set of points, and
weight contains a value associated to each point;
Now, working in ggplot2 I seem to be able to plot contour levels for these data, but I can't find a way to fill the plot according to the variable weight. Here's the code that I used:
ggplot(df, aes(x,y, fill=weight)) +
geom_density_2d() +
coord_fixed(ratio = 1)
You can see that there's no filling whatsoever, sadly.
I've been trying for three days now, and I'm starting to get depressed.
Specifying fill=weight and/or color = weight in the general ggplot call, resulted in nothing. I've tried to use different geoms (tile, raster, polygon...), still nothing. Tried to specify the aes directly into the geom layer, also didn't work.
Tried to convert the object as a ppp but ggplot can't handle them, and also using base-R plotting didn't work. I have honestly no idea of what's wrong!
I'm attaching the first 10 points' data, which is spaced on an irregular grid:
x = c(-0.13397460,-0.31698730,-0.13397460,0.13397460,-0.28867513,-0.13397460,-0.31698730,-0.13397460,-0.28867513,-0.26794919)
y = c(-0.5000000,-0.6830127,-0.5000000,-0.2320508,-0.6547005,-0.5000000,-0.6830127,-0.5000000,-0.6547005,0.0000000)
weight = c(4.799250e-01,5.500250e-01,4.799250e-01,-2.130287e+12,5.798250e-01,4.799250e-01,5.500250e-01,4.799250e-01,5.798250e-01,6.618956e-01)
Any advice? The desired output would be something along these lines (linked image not preserved).
Thank you in advance.
From your description geom_density doesn't sound right.
You could try geom_raster:
ggplot(df, aes(x,y, fill = weight)) +
geom_raster() +
coord_fixed(ratio = 1) +
scale_fill_gradientn(colours = rev(rainbow(7))) # colourmap
Here is a second-best using fill=..level... There is a good explanation on ..level.. here.
# load libraries
library(ggplot2)
library(RColorBrewer)
library(ggthemes)
# build your data.frame
df <- data.frame(x=x, y=y, weight=weight)
# build color Palette
myPalette <- colorRampPalette(rev(brewer.pal(11, "Spectral")), space="Lab")
# Plot
ggplot(df, aes(x,y, fill=..level..) ) +
stat_density_2d( bins=11, geom = "polygon") +
scale_fill_gradientn(colours = myPalette(11)) +
theme_minimal() +
coord_fixed(ratio = 1)
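Note that the density-based plot above fills by point density, not by weight. If the goal is to fill by weight on an irregular grid, one option that might be closer is stat_summary_2d(), which bins the points into rectangles and fills each bin with a summary of z. Here is a minimal sketch using the ten points from the question; the choice of mean as the summary and of 10 bins is my assumption, not something from the original post.

```r
library(ggplot2)

# The ten points quoted in the question
df <- data.frame(
  x = c(-0.13397460, -0.31698730, -0.13397460, 0.13397460, -0.28867513,
        -0.13397460, -0.31698730, -0.13397460, -0.28867513, -0.26794919),
  y = c(-0.5000000, -0.6830127, -0.5000000, -0.2320508, -0.6547005,
        -0.5000000, -0.6830127, -0.5000000, -0.6547005, 0.0000000),
  weight = c(4.799250e-01, 5.500250e-01, 4.799250e-01, -2.130287e+12,
             5.798250e-01, 4.799250e-01, 5.500250e-01, 4.799250e-01,
             5.798250e-01, 6.618956e-01)
)

p <- ggplot(df, aes(x, y, z = weight)) +
  # each rectangular bin is filled with the mean weight of the points in it
  stat_summary_2d(fun = mean, bins = 10) +
  coord_fixed(ratio = 1)
p
```

One of the quoted weights is about -2.1e12, so it will dominate the fill scale; if that value is not a real measurement, it would need to be handled before plotting.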

How to adjust the ordering of labels in the default legend in ggplot2 so that it corresponds to the order in the data

I am plotting a forest plot in ggplot2 and am having issues with the ordering of the labels in the legend matching the order of the labels in the data set. Here is my code below.
data code
d<-data.frame(x=c("Co-K(W) N=720", "IH-K(W) N=67", "IF-K(W) N=198", "CO-K(B)N=78", "IH-K(B) N=13", "CO=A(W) N=874","D-Sco Ad(W) N=346","DR-Ad (W) N=892","CE_A(W) N=274","CO-Ad(B) N=66","D-So Ad(B) N=215","DR-Ad(B) N=123","CE-Ad(B) N=79"),
y = rnorm(13, 0, 0.1))
d <- transform(d, ylo = y-1/13, yhi=y+1/13)
d$x <- factor(d$x, levels=rev(d$x)) # reverse ordering
forest plot code
credplot.gg <- function(d){
# d is a data frame with 4 columns
# d$x gives variable names
# d$y gives center point
# d$ylo gives lower limits
# d$yhi gives upper limits
require(ggplot2)
p <- ggplot(d, aes(x=x, y=y, ymin=ylo, ymax=yhi, group=x, colour=x)) +
geom_pointrange(size=1) +
theme_bw() +
scale_color_discrete(name="Sample") +
coord_flip() +
theme(legend.key=element_rect(fill='cornsilk2')) +
guides(colour = guide_legend(override.aes = list(size=0.5))) +
geom_hline(yintercept=0, colour = 'red', lty=2) +
xlab('Cohort') + ylab('CI') + ggtitle('Forest Plot')
return(p)
}
credplot.gg(d)
This is what I get. As you can see, the labels on the y axis match the order of the labels in the data. However, the legend is not in the same order, and I'm not sure how to correct this. This is my first time creating a plot in ggplot2. Any feedback is much appreciated. Thanks in advance.
Nice plot, especially for a first ggplot! I've not tested it, but I think all you need is to add reverse=TRUE inside your colour's guide_legend (found this in the Cookbook for R).
If I were to make one more comment, I'd say that ordering your vertical factor by numeric value often makes comparisons easier when alphabetical order isn't particularly meaningful. (Though maybe your alpha order is meaningful.)
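To make the reverse=TRUE suggestion concrete, here is a minimal sketch on a cut-down version of the data. The four group labels are hypothetical placeholders rather than the cohort names from the question.

```r
library(ggplot2)

set.seed(1)
labs <- c("Group A", "Group B", "Group C", "Group D")  # hypothetical labels
d <- data.frame(x = labs, y = rnorm(4, 0, 0.1))
d <- transform(d, ylo = y - 1/13, yhi = y + 1/13)
d$x <- factor(d$x, levels = rev(d$x))  # reverse ordering, as in the question

p <- ggplot(d, aes(x = x, y = y, ymin = ylo, ymax = yhi, colour = x)) +
  geom_pointrange() +
  coord_flip() +
  scale_color_discrete(name = "Sample") +
  # reverse = TRUE flips the legend so it reads top to bottom like the axis
  guides(colour = guide_legend(reverse = TRUE))
p
```

The existing override.aes setting from the question can go in the same guide_legend() call, next to reverse = TRUE.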

R - Smoothing color and adding a legend to a scatterplot

I have a scatterplot in R. Each (x,y) point is colored according to its z value. So you can think of each point as (x,y,z), where (x,y) determines its position and z determines its color along a color gradient. I would like to add two things:
1. A legend on the right side showing the color gradient and what z values correspond to what colors.
2. I would like to smooth all the color using some type of interpolation, I assume. In other words, the entire plotting region (or at least most of it) should become colored so that it looks like a huge heatmap instead of a scatterplot. So, in the example below, there would be lots of orange/yellow around and then some patches of purple throughout. I'm happy to further clarify what I'm trying to explain here, if need be.
Here is the code I have currently, and the image it makes.
x <- seq(1,150)
y <- runif(150)
z <- c(rnorm(mean=1,100),rnorm(mean=20,50))
colorFunction <- colorRamp(rainbow(100))
zScaled <- (z - min(z)) / (max(z) - min(z))
zMatrix <- colorFunction(zScaled)
zColors <- rgb(zMatrix[,1], zMatrix[,2], zMatrix[,3], maxColorValue=255)
df <- data.frame(x,y)
x <- densCols(x,y, colramp=colorRampPalette(c("black", "white")))
df$dens <- col2rgb(x)[1,] + 1L
plot(y~x, data=df[order(df$dens),],pch=20, col=zColors, cex=1)
Here are some solutions using the ggplot2 package.
# Load library
library(ggplot2)
# Recreate the scatterplot from the example with default colours
ggplot(df) +
geom_point(aes(x=x, y=y, col=dens))
# Recreate the scatterplot with a custom set of colours. I use rainbow(100)
ggplot(df) +
geom_point(aes(x=x, y=y, col=dens)) +
scale_color_gradientn(colours=rainbow(100))
# A 2d density plot, using default colours
ggplot(df) +
stat_density2d(aes(x=x, y=y, z=dens, fill = ..level..), geom="polygon") +
ylim(-0.2, 1.2) + xlim(-30, 180) # I had to twiddle with the ranges to get a nicer plot
# A better density plot, in my opinion. Tiles across your range of data
ggplot(df) +
stat_density2d(aes(x=x, y=y, z=dens, fill = ..density..), geom="tile",
contour = FALSE)
# Using custom colours. I use rainbow(100) again.
ggplot(df) +
stat_density2d(aes(x=x, y=y, z=dens, fill = ..density..), geom="tile",
contour = FALSE) +
scale_fill_gradientn(colours=rainbow(100))
# You can also plot the points on top, if you want
ggplot(df) +
stat_density2d(aes(x=x, y=y, z=dens, fill = ..density..), geom="tile",
contour = FALSE) +
geom_point(aes(x=x, y=y, col=dens)) +
scale_colour_continuous(guide="none") # This removes the extra legend
I attach the plots as well:
Also, using ggplot2, you can use color and size together, as in:
ggplot(df, aes(x=x, y=y, size=dens, color=dens)) + geom_point() +
scale_color_gradientn(name="Density", colours=rev(rainbow(100))) +
scale_size_continuous(range=c(1,15), guide="none")
which might make it a little clearer.
Notes:
The expression rev(rainbow(100)) reverses the rainbow color scale,
so that red goes with the larger values of dens.
Unfortunately, you cannot combine a continuous legend (color) and a
discrete legend (size), so you would normally get two legends. The
expression guide="none" hides the size legend.
Here's the plot:
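The plots above colour by dens rather than by z. For the first request in the question (a legend for the z gradient), a minimal sketch that maps z directly to colour might look like this. The data frame name pts and the fixed seed are my additions; the simulated x, y and z follow the question's code.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
x <- seq(1, 150)
y <- runif(150)
z <- c(rnorm(100, mean = 1), rnorm(50, mean = 20))
pts <- data.frame(x, y, z)

p <- ggplot(pts, aes(x = x, y = y, colour = z)) +
  geom_point() +
  # a continuous colour scale gets a colour bar legend by default
  scale_colour_gradientn(name = "z", colours = rainbow(100))
p
```

This only covers the legend; it does not do the interpolation asked for in the second request.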
