how to improve ggplot? - r

I have a boxplot made in R with ggplot2, using the code below.
I want to label each box as in the sample plot I am aiming for.
I have calculated a p-value of 0.06 for the first variable, egg1, and I would like to place this text on the plot as in the sample. How can I do that?
ggplot(testdata) +
  geom_boxplot(aes(x = variable, y = value, color = as.factor(classification)))

You can use annotate to add text on your boxplot:
ggplot(testdata) +
  geom_boxplot(aes(x = variable, y = value, color = as.factor(classification))) +
  annotate(geom = "text", x = 1, y = 6, label = "p = 0.06")
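If each box needs its own label, one option is to keep the labels in a small data frame and draw them with geom_text(). The following is only a sketch: the question's testdata was not posted, so the data here are simulated, the second p-value is a placeholder, and the names labels and p are made up for illustration.

```r
library(ggplot2)

# Hypothetical stand-in for the question's `testdata` (the real data were not posted)
set.seed(1)
testdata <- data.frame(
  variable       = rep(c("egg1", "egg2"), each = 40),
  classification = rep(c(0, 1), times = 40),
  value          = rnorm(80, mean = 5)
)

# One row per label: which box it belongs to, how high to draw it, and the text.
# "p = 0.30" is a placeholder, not a computed result.
labels <- data.frame(
  variable = c("egg1", "egg2"),
  y        = max(testdata$value) + 0.5,
  label    = c("p = 0.06", "p = 0.30")
)

p <- ggplot(testdata, aes(x = variable, y = value)) +
  geom_boxplot(aes(color = as.factor(classification))) +
  # x is inherited from the plot; y and label come from `labels`
  geom_text(data = labels, aes(y = y, label = label))
```

Printing p draws the plot. Compared with annotate(), this keeps the label positions tied to the x-axis categories rather than to hard-coded numeric positions.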

Related

How to group multiple box and whisker plots together using vertical separators?

My data has a mix of single and multiple box-and-whisker plots for each genotype on the x axis (see image). I am trying to add vertical lines between the genotypes so that the graph looks neater. I tried faceting, but that creates separate panels altogether. Is there a simpler way to do this? Cheers!
[Image: the plot so far]
Code below:
# Use ggplot to plot the median peakgt by year and genotype
t<-ggplot(crop.data, aes(x=reorder(genotype,- offsetgt,FUN= median), y= peakgt, fill=year)) +
geom_boxplot(outlier.shape=NA) +
labs(title="Peak GT (degree C) distribution",x="Genotype", y = "Peak GT (degree C)")+
geom_jitter(shape=1, color="black", size=1)+
theme_classic()
t
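One possible approach, offered as a sketch rather than a tested answer for this data: on a discrete x axis the categories sit at the integer positions 1, 2, ..., n, so geom_vline() with half-integer intercepts draws a line between neighbouring genotypes. The crop.data below is simulated because the original was not posted, and the names n_genotypes, separators and p are made up for illustration.

```r
library(ggplot2)

# Hypothetical stand-in for `crop.data` (the real data were not posted)
set.seed(42)
crop.data <- data.frame(
  genotype = rep(c("G1", "G2", "G3", "G4"), each = 20),
  year     = factor(rep(c(2018, 2019), times = 40)),
  peakgt   = rnorm(80, mean = 30, sd = 2)
)

# Boundaries between neighbouring categories: 1.5, 2.5, ..., n - 0.5
n_genotypes <- length(unique(crop.data$genotype))
separators  <- seq(1.5, n_genotypes - 0.5, by = 1)

p <- ggplot(crop.data, aes(x = genotype, y = peakgt, fill = year)) +
  geom_boxplot(outlier.shape = NA) +
  geom_vline(xintercept = separators, linetype = "dashed", colour = "grey60") +
  theme_classic()
```

The separator positions depend only on the number of genotypes, so reordering the x axis with reorder() should not change them.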

Why doesn't geom_hline generate a legend in ggplot2?

I have some code that plots a histogram of some values, along with a few horizontal lines that represent reference points to compare against. However, ggplot is not generating a legend for the lines.
library(ggplot2)
library(dplyr)
## Simulate an equal mix of uniform and non-uniform observations on [0,1]
x <- data.frame(PValue=c(runif(500), rbeta(500, 0.25, 1)))
y <- c(Uniform=1, NullFraction=0.5) %>% data.frame(Line=names(.) %>% factor(levels=unique(.)), Intercept=.)
ggplot(x) +
aes(x=PValue, y=..density..) + geom_histogram(binwidth=0.02) +
geom_hline(aes(yintercept=Intercept, group=Line, color=Line, linetype=Line),
data=y, alpha=0.5)
I even tried reducing the problem to just plotting the lines:
ggplot(y) +
geom_hline(aes(yintercept=Intercept, color=Line)) + xlim(0,1)
and I still don't get a legend. Can anyone explain why my code isn't producing plots with legends?
By default, show_guide = FALSE for geom_hline. If you turn it on, the legend will appear. (In ggplot2 2.0.0 and later this argument is called show.legend.) Also, alpha needs to be inside aes(); otherwise the colours of the lines will not be drawn properly in the legend. The code looks like this:
ggplot(x) +
aes(x=PValue, y=..density..) + geom_histogram(binwidth=0.02) +
geom_hline(aes(yintercept=Intercept, colour=Line, linetype=Line, alpha=0.5),
data=y, show_guide=TRUE)
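For readers on a recent ggplot2, here is a sketch of the same plot using show.legend (the newer name for show_guide) and after_stat(density) in place of ..density... It assumes ggplot2 3.3.0 or later, where after_stat() is available. The y data frame is built directly instead of through the pipe, alpha is set as a constant here rather than mapped, and the name p is made up for illustration.

```r
library(ggplot2)

# Same simulated data as in the question
set.seed(1)
x <- data.frame(PValue = c(runif(500), rbeta(500, 0.25, 1)))
y <- data.frame(
  Line      = factor(c("Uniform", "NullFraction"),
                     levels = c("Uniform", "NullFraction")),
  Intercept = c(1, 0.5)
)

p <- ggplot(x, aes(x = PValue)) +
  geom_histogram(aes(y = after_stat(density)), binwidth = 0.02) +
  geom_hline(
    data = y,
    aes(yintercept = Intercept, colour = Line, linetype = Line),
    alpha = 0.5,          # constant transparency, not a mapped aesthetic
    show.legend = TRUE    # explicitly request legend keys for the lines
  )
```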

Scatter plot and boxplot overlay

Based on the previous post "ggplot boxplots with scatterplot overlay (same variables)",
I would like to have one boxplot for each day of the week instead of two, while overlaying the scatter points in different colours.
The code is:
# Box plot for the day-of-week effect
plot1 <- ggplot(data = dodgers, aes(x = ordered_day_of_week, y = Attend)) + geom_boxplot()
# Scatter plot with specific colours for Bobblehead
plot2 <- ggplot(dodgers, aes(x = ordered_month, y = Attend, colour = Bobblehead, size = 1.5)) + geom_point()
# Box plot with scatter plot overlay
plot3 <- ggplot(data = dodgers, aes(x = ordered_day_of_week, y = Attend, colour = Bobblehead)) + geom_boxplot() + geom_point()
And the results are:
[Images: 1, scatter plot; 2, boxplot; 3, combined plot]
Put color= inside the aes() of geom_point() and remove it from the aes() of ggplot(). If you put color= inside ggplot(), it affects all geoms, which is why the boxplots are split as well. You could also consider using position_dodge() to separate the points.
Here is an example with the mtcars data, as the OP didn't provide data.
ggplot(mtcars,aes(factor(cyl),mpg))+geom_boxplot()+
geom_point(aes(color=factor(am)),position=position_dodge(width=0.5))

how to remove line from fill scale legend using geom_vline and geom_histogram r ggplot2

Basics:
Using R statistical software, ggplot2, geom_vline, and geom_histogram to visualize some data. The issue is with the legend keys.
I'm trying to plot a pair of histograms from some stochastic simulations, and on top of that plot a couple of lines representing the result of a deterministic simulation. I've got the data plotted, but the legend keys for the histograms have an unnecessary black line through the middle of them. Can you help me remove those black lines? Some sample code reproducing the issue is here:
df1 <- data.frame(cond = factor( rep(c("A","B"), each=200) ),
rating = c(rnorm(200),rnorm(200, mean=.8)))
df2 <- data.frame(x=c(.5,1),cond=factor(c("A","B")))
ggplot(df1, aes(x=rating, fill=cond)) +
geom_histogram(binwidth=.5, position="dodge") +
geom_vline(data=df2,aes(xintercept=x,linetype=factor(cond)),
show_guide=TRUE) +
labs(fill='Stochastic',linetype='Deterministic')
Cheers,
Ryan
One workaround is to change the order of geom_histogram() and geom_vline(). Then add another geom_vline() without aes(), just giving xintercept= and linetype=. This will not remove lines but will hide them under the color legend entries.
ggplot(data=df1, aes(x=rating, fill=cond)) +
geom_vline(data=df2,aes(xintercept=x,linetype=factor(cond)),
show_guide=TRUE) +
geom_histogram(binwidth=.5, position="dodge") +
geom_vline(xintercept=df2$x,linetype=c(1,3))+
labs(fill='Stochastic',linetype='Deterministic')

Create a stacked density graph in ggplot2

I'm trying to create a stacked density graph in ggplot2, and I am also trying to understand how qplot works relative to ggplot.
I found the following example online:
qplot(depth, ..density.., data=diamonds, geom="density",
fill=cut, position="stack")
I tried translating this into a call to ggplot because I want to understand how it works:
ggplot(diamonds, aes(x=depth, y=..density..)) +
geom_density(aes(fill=cut, position="stack"))
This creates a density graph, but does not stack it.
What is the difference between what qplot creates and what ggplot creates?
[Image: stacked density graph]
[Image: non-stacked density graph]
Original example is here
As @kohske's comment points out, position is not an aesthetic, so it should not be inside the aes() call:
ggplot(diamonds, aes(x=depth, y=..density..)) +
geom_density(aes(fill=cut), position="stack")
or using the movies data (which your example graphs use):
ggplot(movies, aes(x=rating, y=..density..)) +
geom_density(aes(fill=mpaa), position="stack")
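As a sketch for recent ggplot2 versions (assuming 3.3.0 or later, where after_stat() is available), the same stacked plot can be written without the ..density.. notation. The name p is made up for illustration, and diamonds is the dataset that ships with ggplot2.

```r
library(ggplot2)

# position is an argument of the layer, not an aesthetic, so it stays outside aes()
p <- ggplot(diamonds, aes(x = depth, fill = cut)) +
  geom_density(aes(y = after_stat(density)), position = "stack")
```

Moving position = "stack" back inside aes() would reproduce the original problem: it would be treated as an aesthetic mapping rather than as a position adjustment, so the curves would not be stacked.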
