Shortening mean line in two level factor within facet panels - r

I have a factor of time that has two levels, admission and discharge. I'm using facet_grid to create four panels in which my continuous Y will be looked at by time. I want to be able to add a mean line to each of the two time levels in each panel. My problem is that the mean line spans the entire width of the panel and I'd like to shorten it to just remain within the area of the dots.
Here is the code:
plot <- ggplot(data.in, aes(x=Time, y=Y)) + geom_point()
plot <- plot + facet_grid(.~FacetGroup)
# pass data= so the result has plain columns named Time, FacetGroup and Y
data_hline <- aggregate(Y ~ Time + FacetGroup, data=data.in, FUN=mean)
plot + geom_hline(data=data_hline, aes(yintercept=Y))
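One way to keep the mean line within the width of the points is to draw it with geom_errorbar() instead of geom_hline(), setting ymin and ymax to the same value and choosing a small width. The following is only a minimal sketch: the data frame is made up to stand in for data.in (which is not shown in the question), and the width of 0.4 is an arbitrary choice.

```r
library(ggplot2)

# Hypothetical stand-in for data.in: 4 facet groups x 2 time levels
set.seed(1)
data.in <- data.frame(
  Time = factor(rep(c("Admission", "Discharge"), times = 40)),
  FacetGroup = factor(rep(paste0("G", 1:4), each = 20)),
  Y = rnorm(80, mean = 50, sd = 10)
)

# One mean per Time level within each facet group
data_hline <- aggregate(Y ~ Time + FacetGroup, data = data.in, FUN = mean)

p <- ggplot(data.in, aes(x = Time, y = Y)) +
  geom_point() +
  facet_grid(. ~ FacetGroup) +
  # ymin == ymax collapses the error bar to a single horizontal line;
  # width controls how far it extends around each Time level
  geom_errorbar(data = data_hline, aes(ymin = Y, ymax = Y), width = 0.4)
```

Printing p draws the plot. Because data_hline contains the FacetGroup column, each short mean line appears only in its own panel.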

Related

Not all Counts Appearing with geom_text

I have a data set of several features of several organisms. I'm displaying each feature by several different categories, individually and in combination (e.g. species, location, population), both as raw counts and as a percentage of the total sample size and a percentage within a given group.
My problem comes when I'm trying to display a stacked bar chart using ggplot for the percent of individuals within a group. Since the groups do not have the same number of individuals in them, I'd like to display the raw number or count of individuals with that feature on their respective bars for context. I've managed to properly display the stacked percentage bar chart and get the number of individuals from the most populous groups to display. I'm having trouble displaying the rest of the groups.
ggplot(data=All.k6,aes(x=Second.Dorsal))+
geom_bar(aes(fill=Species),position="fill")+
scale_y_continuous(labels=scales::percent)+
labs(x="Number of Second Dorsal Spines",y="Percentage of Individuals within Species",title="Second Dorsal Spines")+
geom_text(aes(label=..count..),stat='count',position=position_fill(vjust=0.5))
You need to include a group= aesthetic so that position_fill knows how to position the labels. In geom_bar you set the fill= aesthetic, so ggplot assumes you also want to group by that aesthetic. In geom_text there is no such mapping, so it assumes the group is your x= aesthetic and produces a single count per bar. In your case, just add group=Species after your label= aesthetic. Here's an example:
# sample dataset
set.seed(1234)
types <- data.frame(
x=c('A','A','A','B','B','B','C','C','C'),
x1=rep(c('aa','bb','cc'),3)
)
df <- rbind(types[sample(1:9,50,replace=TRUE),])
Plot without grouping:
# after_stat(count) is the current spelling of ..count.. (ggplot2 >= 3.3.0)
ggplot(df, aes(x=x)) +
  geom_bar(aes(fill=x1),position='fill') +
  scale_y_continuous(labels=scales::percent) +
  geom_text(aes(label=after_stat(count)),stat='count',
            position=position_fill(vjust=0.5))
Plot with group= aesthetic:
ggplot(df, aes(x=x)) +
  geom_bar(aes(fill=x1),position='fill') +
  scale_y_continuous(labels=scales::percent) +
  geom_text(aes(label=after_stat(count),group=x1),stat='count',
            position=position_fill(vjust=0.5))

How to make half-whiskers in a ggplot2 line graph?

I make very slow progress in R but now I'm able to do some stuff.
Right now I'm plotting the effects of 4 treatments on plant growth in one graph. As you can see, the error bars overlap, which is why I gave them different colors. To make the graph clearer, I think it would be better to use only the lower error bars as "half whiskers" for the lower two lines and only the upper error bars for the top two lines (as I have now); see the attached image for reference.
Is that doable with the way my script is set up now?
Here is the part of my script that specifies the plot itself (leaving out the aesthetics and so on). Thanks in advance.
"soda1" is my altered data frame, "sdtv" holds the standard deviations for each time point and treatment, "oppervlak" is my y variable and "Measuring Date" is my x variable. "Tray ID" is the treatment, so it is my grouping variable.
p <- ggplot(soda1, aes(x=reorder(`Measuring Date`, oppervlak), y=`oppervlak`, group=`Tray ID`, fill=`Tray ID`, colour = `Tray ID` )) +
scale_fill_brewer(palette = "Spectral") +
geom_errorbar(data=soda1, mapping=aes(ymin=oppervlak, ymax=oppervlak+sdtv, group=`Tray ID`), width=0.1) +
geom_line(aes(linetype=`Tray ID`)) +
geom_point(mapping=aes(x=`Measuring Date`, y=oppervlak, shape=`Tray ID`))
print(p)
Showing only one side of errorbars can hide an overlap in the uncertainty between the distribution of two or more variables or measurements.
Instead of hiding this overlap, you could adjust the position of your errorbars horizontally very easily by adding position=position_dodge(width=) to your call to geom_errorbar().
For example:
library(ggplot2)
# some random data with two factors
df <- data.frame(a=rep(1:10, times=2),
b=runif(20),
treat=as.factor(rep(c(0,1), each=10)),
errormax=runif(20),
errormin=runif(20))
# plotting both sides of the errorbars, but dodging them horizontally
p <- ggplot(data=df, aes(x=a, y=b, colour=treat)) +
geom_line() +
geom_errorbar(data=df, aes(ymin=b-errormin, ymax=b+errormax),
position=position_dodge(width=0.25))
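If one-sided whiskers are still preferred despite the caveat above, one possible approach is to compute ymin and ymax per row so that each treatment gets only an upper or only a lower whisker. This is a rough sketch on made-up data, not the asker's soda1 data frame; the column names err, ymin and ymax and the rule "treat 0 is the upper line" are assumptions for illustration.

```r
library(ggplot2)

set.seed(42)
df <- data.frame(
  a = rep(1:10, times = 2),
  b = c(runif(10, 2, 3), runif(10, 0, 1)),  # treat 0 lies above treat 1
  treat = factor(rep(c(0, 1), each = 10)),
  err = runif(20, 0.1, 0.4)                 # hypothetical error size
)

# Upper whisker only for the top line, lower whisker only for the bottom line
df$ymin <- ifelse(df$treat == "0", df$b, df$b - df$err)
df$ymax <- ifelse(df$treat == "0", df$b + df$err, df$b)

p <- ggplot(df, aes(x = a, y = b, colour = treat)) +
  geom_line() +
  geom_errorbar(aes(ymin = ymin, ymax = ymax), width = 0.2)
```

With more than two treatments, the ifelse() condition would need to list which groups count as "upper" and which as "lower".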

Adding group mean lines to geom_bar plot and including in legend

I want to create a bar graph that also shows the mean value for the bars in each group, and that shows the mean line in the legend.
I have been able to get this graph (image: Bar chart with means) using the code below, which is fine, but I would like to see the mean lines in the legend as well.
##The data to be graphed is the proportion of persons receiving a treatment
## (num=numerator) in each population (denom=denominator). The population is
##grouped by two age groups (Age) and further divided by a categorical
##variable V1
###SET UP DATAFRAME###
require(ggplot2)
df <- data.frame(V1 = c(rep(c("S1","S2","S3","S4","S5"),2)),
Age= c(rep(70,5),rep(80,5)),
num=c(5280,6570,5307,4894,4119,3377,4244,2999,2971,2322),
denom=c(9984,12600,9425,8206,7227,7290,8808,6386,6206,5227))
df$prop<-df$num/df$denom*100
PopMean<-sum(df$num)/sum(df$denom)*100
df70<-df[df$Age==70,]
group70mean<-sum(df70$num)/sum(df70$denom)*100
df80<-df[df$Age==80,]
group80mean<-sum(df80$num)/sum(df80$denom)*100
df$PopMean<-c(rep(PopMean,10))
df$groupmeans<-c(rep(group70mean,5),rep(group80mean,5))
I want the plot to look like this, but want the lines in the legend too, to be labelled as 'mean of group' or similar.
#basic plot
P<-ggplot(df, aes(x=factor(Age), y=prop, fill=factor(V1))) +
geom_bar(position=position_dodge(), colour='black',stat="identity")
P
####add mean lines
P+geom_errorbar(aes(y=df$groupmeans, ymax=df$groupmeans,
ymin=df$groupmeans), col="red", lwd=2)
Adding show.legend=TRUE overlays the error bars onto the factor legend, rather than separately. If there is a way of showing geom_errorbar separately in the legend this is probably the simplest solution.
I have also tried various things with geom_line
The syntax below produces a line for the population mean value, but running from the centre of each point rather than covering the width of the bars
This produces a line for the population mean and it does produce a legend but one showing a bar of colour rather than a line.
P+geom_line(aes(y=df$PopMean, group=df$PopMean, color=df$PopMean),lwd=1)
If I try to draw lines for the group means, the lines are not visible (because each one consists of only a single point).
P+geom_line(aes(y=df$groupmeans, group=df$groupmeans, color=df$groupmeans))
I also tried to get round this with facet plot, although this requires me to pretend my categorical variable is numeric to get it to work.
###set up new df
df2<-df
df2$V1<-c(rep(c(1,2,3,4,5),2))
P<-ggplot(df2, aes(x=factor(V1), y=prop, fill=factor(V1))) +
geom_bar(position=position_dodge(),
colour='black',stat="identity",width=1)
P+facet_grid(.~factor(df2$Age))
P+facet_grid(.~factor(df2$Age))+geom_line(aes(y=df$groupmeans,
group=df$groupmeans, color=df$groupmeans))
(Image: facet plot)
This allows me to show the mean lines using geom_line, so a legend does appear (although it doesn't look right: it shows a colour gradient rather than coloured lines). However, the lines still do not span the full width of the bars. Also, my x-axis now needs relabelling to show S1, S2, etc. rather than the numbers 1, 2, 3.
To sum up: is there a way of showing error bar lines separately in the legend?
If not, and I use faceting, how do I correct the legend appearance and relabel the axes with my categorical variable, and is it possible to get the line to span the full width of the plot?
Or is there an alternate solution that I am missing!?
Thanks
To get a legend for geom_errorbar, you need to map colour inside aes().
As you want only one category (here red), I've created a dummy variable first:
df$mean <- "Mean"
ggplot(df, aes(x=factor(Age), y=prop, fill=factor(V1))) +
geom_bar(position=position_dodge(), colour='black',stat="identity") +
geom_errorbar(aes (ymax=groupmeans,
ymin=groupmeans, colour=mean), lwd=2) +
scale_colour_manual(name="",values = "#ff0000")
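As a variation on the same idea, the legend label can be written as a constant string inside aes() instead of adding a dummy column. The sketch below rebuilds the question's data, computes the group means with ave() (a substitution for the manual subsetting in the question), and uses geom_col() in place of geom_bar(stat="identity"). It assumes ggplot2 >= 3.4.0 for the linewidth argument; the label "Mean of group" is just an example.

```r
library(ggplot2)

df <- data.frame(
  V1 = rep(c("S1", "S2", "S3", "S4", "S5"), 2),
  Age = rep(c(70, 80), each = 5),
  num = c(5280, 6570, 5307, 4894, 4119, 3377, 4244, 2999, 2971, 2322),
  denom = c(9984, 12600, 9425, 8206, 7227, 7290, 8808, 6386, 6206, 5227)
)
df$prop <- df$num / df$denom * 100

# Pooled mean per age group, repeated on every row of that group
df$groupmeans <- ave(df$num, df$Age, FUN = sum) /
  ave(df$denom, df$Age, FUN = sum) * 100

p <- ggplot(df, aes(x = factor(Age), y = prop, fill = V1)) +
  geom_col(position = position_dodge(), colour = "black") +
  # the constant string inside aes() creates a one-entry colour legend
  geom_errorbar(aes(ymin = groupmeans, ymax = groupmeans,
                    colour = "Mean of group"), linewidth = 2) +
  scale_colour_manual(name = NULL, values = c("Mean of group" = "red"))
```

Printing p draws the bars with a separate legend entry for the red mean lines.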

Add geom_vlines to - or colour - a facet wrap plot

I have a plot generated by the following R code: basically a panel of many histograms/bars. To each one I'd like to add a vertical line, but the position of the line is different in each facet. Alternatively, I'd like to colour the bars red depending on whether the x value is higher than a threshold. How do I do this to such a plot with ggplot2 / R?
I generated the chart like so:
Histogramplot3 <- ggplot(completeFrame, aes(P_Value)) + geom_bar() + facet_wrap(~ Generation)
Where completeFrame is my dataframe, P_Value is my x variable, and the Facet Wrap Variable Generation is a factor.
It's easier to help with specific examples, but simulating some data, maybe this will help:
library(ggplot2)
#simulate data
completeFrame<-data.frame(P_Value=rnorm(200,0.8,0.1),Generation=rep(1:4,times=50))
#draw the basic plot
#(geom_histogram bins continuous data; current geom_bar has no binwidth argument)
h3 <- ggplot(completeFrame, aes(x=P_Value)) +
  geom_histogram(binwidth=0.02, col="black", fill="black") +
  # overlay the "red" bars for the subset of data
  geom_histogram(data=completeFrame[which(completeFrame$P_Value>0.8),],binwidth=0.02, col="black", fill="red") +
  facet_wrap(~ Generation)
#add lines to the subsets
h3 <- h3+geom_hline(data=completeFrame[which(completeFrame$Generation==2),],aes(yintercept=max(P_Value)))
h3 <- h3+geom_hline(data=completeFrame[which(completeFrame$Generation==1),],aes(yintercept=2.5))
h3 <- h3+geom_hline(data=completeFrame[which(completeFrame$Generation==3),],aes(yintercept=mean(P_Value)))
h3
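For vertical lines specifically, a common pattern is to build a small data frame with one row per facet and pass it to geom_vline(). The sketch below uses the same kind of simulated data and, as an assumption for illustration, takes each facet's mean P_Value as the line position; the column name threshold and the cut-off of 0.8 are likewise made up. It also shows one way to colour bars by a threshold through a logical fill mapping.

```r
library(ggplot2)

set.seed(1)  # simulated data, not the asker's real completeFrame
completeFrame <- data.frame(P_Value = rnorm(200, 0.8, 0.1),
                            Generation = rep(1:4, times = 50))

# One row per facet; "threshold" is a hypothetical column name
vlines <- aggregate(P_Value ~ Generation, data = completeFrame, FUN = mean)
names(vlines)[names(vlines) == "P_Value"] <- "threshold"

p <- ggplot(completeFrame, aes(x = P_Value)) +
  # boundary = 0.8 puts a bin edge on the cut-off, so no bar mixes both colours
  geom_histogram(aes(fill = P_Value > 0.8), binwidth = 0.02, boundary = 0.8,
                 colour = "black") +
  scale_fill_manual(values = c("FALSE" = "black", "TRUE" = "red"),
                    guide = "none") +
  # Generation in vlines matches the faceting variable, so each panel
  # receives only its own line
  geom_vline(data = vlines, aes(xintercept = threshold), colour = "blue") +
  facet_wrap(~ Generation)
```

Any per-facet value can be placed in the threshold column; the mean is only a placeholder.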

Scatterplot with single regression line despite two groups using ggplot2

I would like to produce a scatter plot with ggplot2, which contains both a regression line through all data points (regardless which group they are from), but at the same time varies the shape of the markers by the grouping variable. The code below produces the group markers, but comes up with TWO regression lines, one for each group.
#model=lm(df, ParamY~ParamX)
p1<-ggplot(df,aes(x=ParamX,y=ParamY,shape=group)) + geom_point() + stat_smooth(method=lm)
How can I program that?
You shouldn't have to redo your full aes in geom_point and add another layer; just move the shape aesthetic into the geom_point call:
# 20 rows, so that x, y and grouping all have the same length
df <- data.frame(x=1:20,y=(1:20)+5,grouping = c(rep("a",10),rep("b",10)))
ggplot(df,aes(x=x,y=y)) +
geom_point(aes(shape=grouping)) +
stat_smooth(method=lm)
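Another way to get a single line, if the shape mapping has to stay in the top-level ggplot() call, is to override the grouping in the smoothing layer with aes(group = 1). A minimal sketch on made-up data:

```r
library(ggplot2)

set.seed(3)
df <- data.frame(x = 1:20, grouping = rep(c("a", "b"), each = 10))
df$y <- df$x + 5 + rnorm(20)  # noisy linear trend, for illustration only

p <- ggplot(df, aes(x = x, y = y, shape = grouping)) +
  geom_point() +
  # group = 1 puts all rows in one group, so only one line is fitted
  stat_smooth(aes(group = 1), method = "lm", formula = y ~ x)
```

Both approaches should give the same picture; moving the shape mapping into geom_point is usually the tidier of the two.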
EDIT:
To help with your comment:
Because annotate can end up, for me anyway, putting the same label on every facet, I like to make a mini data.frame that has my faceting variable with its levels and another column representing the labels I want to use. In this case the label data frame is called dflabs.
Then use this label data frame to label the facets individually, e.g.
df <- data.frame(x=1:10,y=1:10,grouping =
c(rep("a",5),rep("b",5)),faceting=c(rep(c("oneR2","twoR2"),5)))
dflabs <- data.frame(faceting=c("oneR2","twoR2"),posx=c(7.5,7.5),posy=c(2.5,2.5))
ggplot(df,aes(x=x,y=y,group=faceting)) +
geom_point(aes(shape=grouping),size=5) +
stat_smooth(method=lm) +
facet_wrap( ~ faceting) +
geom_text(data=dflabs,aes(x=posx,y=posy,label=faceting))
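The label data frame can also hold a statistic computed per facet. The sketch below assumes the goal is to print each facet's own R-squared from a simple lm() fit, which the facet names above hint at but the answer does not state; the data, the label positions and the label format are all made up for illustration.

```r
library(ggplot2)

set.seed(7)
df <- data.frame(x = rep(1:10, times = 2),
                 faceting = rep(c("one", "two"), each = 10))
df$y <- ifelse(df$faceting == "one", 2 * df$x, 0.5 * df$x) + rnorm(20)

# One row per facet: fit y ~ x within the facet and store R^2 as the label
dflabs <- do.call(rbind, lapply(split(df, df$faceting), function(d) {
  fit <- lm(y ~ x, data = d)
  data.frame(faceting = d$faceting[1],
             posx = 2.5,           # hypothetical label position
             posy = max(df$y),
             label = sprintf("R^2 = %.2f", summary(fit)$r.squared))
}))

p <- ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  stat_smooth(method = "lm", formula = y ~ x) +
  facet_wrap(~ faceting) +
  geom_text(data = dflabs, aes(x = posx, y = posy, label = label))
```

Because dflabs carries the faceting column, each label is drawn only in its matching panel.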
