How to plot a probability histogram in ggplot2 - r

I want to plot a probability histogram overlaid with a probability density curve, and compare the two between two groups.
My code is as follows:
ggplot(MDmedianall, aes(x= MD_median, y=..density.., fill =IDH.type )) +
geom_histogram(alpha = 0.5,binwidth = 0.00010, position = 'identity') +
geom_density( stat="density", position="identity", alpha=0.3 ) +
scale_fill_discrete(breaks=c("0","1"), labels=c("IDH wild type","IDH mutant type")) +
scale_y_continuous(labels = scales :: percent) +
ylab("Relative cumulative frequency(%)") +
xlab("MD median value")
However, the y axis is not what I want. What is the reason for that?
Also, how can I change the line styles and have them shown within the colored squares of the legend on the right?
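A possible explanation, offered as a sketch rather than a confirmed answer: the computed density is a density, not a proportion, so with a bin width of 0.0001 its values run into the thousands and scales::percent then prints them as very large percentages. Multiplying the density by the bin width gives the share of observations in each bin, and that share sums to 1 within each group. The example below assumes ggplot2 3.3.0 or later (for after_stat()) and uses simulated data named demo as a hypothetical stand-in for MDmedianall. Mapping linetype to the group is one way to vary the line style; because the fill and linetype scales share the same breaks and labels, they should be drawn in a single legend. The y label is changed because the plotted values are relative, not cumulative, frequencies.

```r
library(ggplot2)

set.seed(42)
# Hypothetical stand-in for MDmedianall (the real data is not shown)
demo <- data.frame(
  MD_median = c(rnorm(200, mean = 0.0010, sd = 0.0002),
                rnorm(200, mean = 0.0013, sd = 0.0002)),
  IDH.type  = factor(rep(c("0", "1"), each = 200))
)

bw <- 0.0001  # bin width taken from the question
grp_labels <- c("IDH wild type", "IDH mutant type")

p <- ggplot(demo, aes(x = MD_median, fill = IDH.type)) +
  # density * bin width = proportion of the group's observations per bin
  geom_histogram(aes(y = after_stat(density * bw)),
                 binwidth = bw, alpha = 0.5, position = "identity") +
  # scale the curve the same way so it overlays the bars
  geom_density(aes(y = after_stat(density * bw), linetype = IDH.type),
               alpha = 0.3) +
  scale_fill_discrete(breaks = c("0", "1"), labels = grp_labels) +
  scale_linetype_discrete(breaks = c("0", "1"), labels = grp_labels) +
  scale_y_continuous(labels = scales::percent) +
  labs(x = "MD median value", y = "Relative frequency (%)")

p
```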

Related

Add mean to grouped box plots in R using ggplot2

I have three cultures of algae (A, B, C) at two temperatures (27C and 31C) and their densities. I want to make a box plot with Temperature on the x axis, Density on the y axis and the three cultures side by side at each temperature (see picture below). I also need to include a dot with the mean density per culture and temperature. My script plots only one mean above each temperature, but what I need is a mean for A#27C, B#27C, C#27C and A#31C, B#31C and C#31C. I tried to adapt some scripts from similar questions but I couldn't get it to work. Any help would be much appreciated.
graph<-ggplot(Algae, aes(x = Temperature,
y = Density,
fill=Culture))+
geom_boxplot()+
stat_summary(fun=mean,
geom="point",
shape=20,
size=2,
color="red",
fill="red",
position = position_dodge2 (width = 0.5, preserve = "single"))
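Remove fill from stat_summary() and adjust the width in position. Setting fill to a constant there overrides the fill = Culture mapping for that layer, so the means are no longer grouped by culture.
Here is an example with the mtcars data set: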
ggplot(mtcars, aes(x = factor(am),
y = mpg,
fill=factor(cyl)))+
geom_boxplot() +
stat_summary(fun=mean,
geom="point",
shape=20,
size=2,
color="red",
position = position_dodge2 (width = 0.7, preserve = "single"))
I fixed it by adding a facet. Here is the script
graph <-ggplot(Algae, aes(x = Temperature,
y = Density,
fill=Culture))+
geom_boxplot()+
stat_summary(fun=mean,
geom="point",
shape=21,
size=2,
color="black",
fill="violet")+
facet_grid(.~Temperature,scales="free")
graph

ggplot: fill a dotplot by group but keep the dot positions of the unfilled plot

I want a dotplot whose dots are positioned as in the uncolored figure but filled as in the colored one. To generate the colored version I used the code below.
Sample dataset:
data <- data.frame(estado1 = c('APLV','APLV','APLV','APLV','APLV','NO APLV','APLV','NO APLV','NO APLV','APLV','NO APLV','APLV','APLV','APLV','APLV','APLV','APLV','APLV','NO APLV','APLV'), combined_ige = c(3.6,2.84,1.2,14.33,0,0,0,0,0.07,2,0,0.3,0.11,0,0,1.31,0,0,0,0.19), sxtypes = c('skin_resp','skin','skin','skin_dig','dig','dig_resp','skin_dig','dig','dig','skin_resp','skin_dig_resp','dig','dig','dig_resp','skin_dig_resp','skin','dig','skin_dig_resp','resp','skin_dig'))
Code:
ggplot(data, aes(x=estado1, y=combined_ige, fill= sxtypes)) +
geom_dotplot(binaxis='y', stackdir='center',
stackratio=1.5, dotsize=1.2, alpha=0.6) +
geom_hline(yintercept = (0.35), linetype="dashed") +
geom_hline(yintercept = (0.77), linetype="dashed", col="red") +
xlab("Status group") +
ggtitle("IgE específicas combinadas") +
scale_y_log10(labels = function(y) format(y, scientific = F))
When I use "fill = sxtypes" to colour the dots, they are grouped in layers that overlap each other. I want them to stay in the same positions as in the uncoloured figure while being coloured as in the second figure.
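One thing that may help, offered as a sketch and not tested against the data above: geom_dotplot() has a stackgroups argument for stacking dots across groups, and binpositions = "all" computes the bin positions from all the data together instead of per fill group. The example uses mtcars as a stand-in, and binwidth = 1 is an arbitrary choice for that data. Note also that the zeros in combined_ige cannot be represented on a log10 axis, so they may need separate handling.

```r
library(ggplot2)

# mtcars stands in for the question's data; cyl plays the role of sxtypes
p <- ggplot(mtcars, aes(x = factor(am), y = mpg, fill = factor(cyl))) +
  geom_dotplot(binaxis = "y", stackdir = "center",
               stackgroups = TRUE,     # stack dots across the fill groups
               binpositions = "all",   # one common set of bins for all groups
               binwidth = 1, alpha = 0.6) +
  labs(x = "am", y = "mpg", fill = "cyl")

p
```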

ggplot2: Shift the baseline of barplot (geom_bar) to the minimum data value

I'm trying to generate a bar plot using geom_bar. My bars have both negative and positive values:
set.seed(1)
df <- data.frame(y=log(c(runif(6,0,1),runif(6,1,10))),se=runif(12,0.05,0.1),name=factor(rep(c("a","a","b","b","c","c"),2),levels=c("a","b","c")),side=factor(rep(1:2,6),levels=1:2),group=factor(c(rep("x",6),rep("y",6)),levels=c("x","y")),stringsAsFactors=F)
This plot command plots the positive bars to face up and the negative ones to face down:
library(ggplot2)
dodge <- position_dodge(width=0.9)
limits <- aes(ymax=y+se,ymin=y-se)
ggplot(df,aes(x=name,y=y,group=interaction(side,name),col=group,fill=group))+facet_wrap(~group)+geom_bar(width=0.6,position=position_dodge(width=1),stat="identity")+
geom_bar(position=dodge,stat="identity")+geom_errorbar(limits,position=dodge,width=0.25)
My question is: how do I set the baseline to the minimum of all bars instead of 0, and therefore have the red bars facing up?
You can subtract min(df$y) from each value so that the data are shifted to a baseline of zero, but then relabel the y-axis to the actual values of the points. The code to do it is below, but I wouldn't recommend this. It seems confusing to have bars emanating from a non-zero baseline, as the lengths of the bars no longer encode the magnitudes of the y values.
ggplot(df, aes(x=name,y=y - min(y),group=interaction(side, name), col=group, fill=group)) +
facet_wrap(~group) +
geom_bar(position=dodge, stat="identity", width=0.8) +
geom_errorbar(aes(ymin=y-se-min(y), ymax=y+se-min(y)),
position=dodge, width=0.25, colour="black") +
scale_y_continuous(breaks=0:4, labels=round(0:4 + min(df$y), 1)) +
geom_hline(aes(yintercept=0))
Another option is to use geom_linerange which avoids having to shift the y-values and relabel the y-axis. But this suffers from the same distortions as the bar plot above:
ggplot(df, aes(x=name, group=interaction(side, name), col=group, fill=group)) +
facet_wrap(~group) +
geom_linerange(aes(ymin=min(y), ymax=y, x=name, xend=name), position=dodge, size=10) +
geom_errorbar(aes(ymin=y-se, ymax=y+se), position=dodge, width=0.25, colour="black") +
geom_hline(aes(yintercept=min(y)))
Instead, it seems to me points would be more intuitive and natural than bars here:
ggplot(df, aes(x=name,y=y,group=interaction(side, name), col=group, fill=group)) +
facet_wrap(~group) +
geom_hline(yintercept=0, lwd=0.4, colour="grey50") +
geom_errorbar(limits, position=dodge, width=0.25) +
geom_point(position=dodge)
This simple hack also works:
m <- min(df$y) # find min
df$y <- df$y - m
ggplot(df,aes(x=name,y=y,group=interaction(side,name),col=group,fill=group))+
facet_wrap(~group)+
geom_bar(width=0.6,position=position_dodge(width=1),stat="identity")+
geom_bar(position=dodge,stat="identity")+
geom_errorbar(limits,position=dodge,width=0.25) +
scale_y_continuous(breaks=seq(min(df$y), max(df$y), length=5),labels=as.character(round(seq(m, max(df$y+m), length=5),2))) # relabel
I ran into the same problem and discovered you can also easily do this using geom_crossbar.
As long as color and fill are the same you don't see the break in the crossbar (set with y aesthetic) so they look exactly like bars.
library(ggplot2)
dodge <- position_dodge(width=0.9)
limits <- aes(ymax = y+se, ymin = y-se)
df$ymin <- min(df$y)
ggplot(df, aes(x = name, ymax = y, y = y, ymin = ymin, group = interaction(side,name), col = group, fill = group)) +
facet_wrap(~group) +
geom_crossbar(width=0.6,position=position_dodge(width=1),stat="identity") +
geom_errorbar(limits, color = 'black', position = dodge, width=0.25)

Modify axes in ggplot

I used the following code, based on the previous post "How to create odds ratio and 95 % CI plot in R", to produce the figure posted below. I would like to:
1) Make x and y axes as well as the legends bold
2) Increase the thickness of the lines
How can I do that in ggplot?
ggplot(alln, aes(x = apoll2, y = increase, ymin = l95, ymax = u95)) + geom_pointrange(aes(col = factor(marker)), position=position_dodge(width=0.50)) +
ylab("Percent increase & 95% CI") + geom_hline(aes(yintercept = 0)) + scale_color_discrete(name = "Marker") + xlab("")
To change axis and legend appearance you should add theme() to your plot.
+ theme(axis.text=element_text(face="bold"),
legend.text=element_text(face="bold"))
To make the lines wider, add size=1.5 inside the geom_pointrange() call.
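Put together, the two suggestions might look like the sketch below. The data frame alln_demo and its values are hypothetical, since alln is not shown. The example also bolds the axis and legend titles, and it assumes ggplot2 3.4.0 or later, where line thickness is set with linewidth and size in geom_pointrange() controls only the points; on older versions size=1.5 as described above applies.

```r
library(ggplot2)

# Hypothetical stand-in for alln (the real data is not shown)
alln_demo <- data.frame(
  apoll2   = rep(c("NO2", "PM10", "O3"), each = 2),
  marker   = rep(c("A", "B"), times = 3),
  increase = c(2.1, -0.5, 1.2, 3.4, 0.3, 1.8),
  l95      = c(0.4, -2.0, -0.3, 1.5, -1.1, 0.2),
  u95      = c(3.8, 1.0, 2.7, 5.3, 1.7, 3.4)
)

p <- ggplot(alln_demo, aes(x = apoll2, y = increase, ymin = l95, ymax = u95)) +
  geom_pointrange(aes(colour = factor(marker)),
                  position = position_dodge(width = 0.5),
                  linewidth = 1.5) +             # thicker interval lines
  geom_hline(yintercept = 0) +
  scale_colour_discrete(name = "Marker") +
  labs(x = NULL, y = "Percent increase & 95% CI") +
  theme(axis.text    = element_text(face = "bold"),   # tick labels
        axis.title   = element_text(face = "bold"),   # axis titles
        legend.text  = element_text(face = "bold"),
        legend.title = element_text(face = "bold"))

p
```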

Plot two regression lines (calculated on subset of the same data frame) on the same graph with ggplot

I have this kind of data frame:
df<-data.frame(x=c(1,2,3,4,5,6,7,8,9,10),y=c(2,11,24,30,45,65,90,110,126,145), a=c(0.2,0.2,0.3,0.4,0.1,0.8,0.7,0.6,0.8,0.9))
Using ggplot, I would like to plot two regression lines on the same figure, each calculated on a subset of my data frame defined by a condition (a < 0.5 or a > 0.5).
Visually, I would like both of these regression lines:
df_a<-subset(df, df$a<0.5)
ggplot(df_a,aes(x,y))+
geom_point(aes(color = a), size=3.5) +
geom_smooth(method="lm", size=1, color="black") +
ylim(-5,155) +
xlim(0,11)
df_b<-subset(df, df$a>0.5)
ggplot(df_b,aes(x,y)) +
geom_point(aes(color = a), size=3.5) +
geom_smooth(method="lm", size=1, color="black") +
ylim(-5,155) +
xlim(0,11)
to appear on this figure:
ggplot(df,aes(x,y))+ geom_point(aes(color = a), size=3.5)
I've tried with par(new=TRUE) without success.
Make a flag variable, and use group:
df$small=df$a<0.5
ggplot(df,aes(x,y,group=small))+geom_point() + stat_smooth(method="lm")
and have yourself pretty colours and a legend if you want:
ggplot(df,aes(x,y,group=small,colour=small))+geom_point() + stat_smooth(method="lm")
Or maybe you want to colour the dots:
ggplot(df,aes(x,y,group=small)) +
stat_smooth(method="lm")+geom_point(aes(colour=a))
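An alternative sketch that stays closer to the two separate plots in the question: keep all the points in one layer and give each geom_smooth() its own subset through the data argument. This is a variation on the approach above, not something taken from the answers; formula = y ~ x is spelled out only to avoid the message about the default formula.

```r
library(ggplot2)

df <- data.frame(x = 1:10,
                 y = c(2, 11, 24, 30, 45, 65, 90, 110, 126, 145),
                 a = c(0.2, 0.2, 0.3, 0.4, 0.1, 0.8, 0.7, 0.6, 0.8, 0.9))

p <- ggplot(df, aes(x, y)) +
  geom_point(aes(colour = a), size = 3.5) +
  # each smooth layer is fitted only to the rows passed via data =
  geom_smooth(data = subset(df, a < 0.5), method = "lm",
              formula = y ~ x, colour = "black") +
  geom_smooth(data = subset(df, a > 0.5), method = "lm",
              formula = y ~ x, colour = "black")

p
```

Each fitted line spans only the x range of its own subset.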
