In ggplot2, how to add text by group in facet_wrap? - r

Here is an example:
library(ggplot2)
set.seed(112)
df<-data.frame(g=sample(c("A", "B"), 100, T),
x=rnorm(100),
y=rnorm(100,2,3),
f=sample(c("i","ii"), 100, T))
ggplot(df, aes(x=x,y=y, colour=factor(g)))+
geom_point()+geom_smooth(method="lm", fill="NA")+facet_wrap(~f)
My question is how to add text like the second plot by group into the plot.

You can manually create another data.frame for your text and add the layer on the original plot.
df_text <- data.frame(g=rep(c("A", "B"), 2), x=-2, y=c(9, 8, 9, 8),
f=rep(c("i", "ii"), each=2),
text=c("R=0.2", "R=-0.3", "R=-0.05", "R=0.2"))
ggplot(df, aes(x=x,y=y, colour=factor(g))) +
geom_point() + geom_smooth(method="lm", fill="NA") +
geom_text(data=df_text, aes(x=x, y=y, color=factor(g), label=text),
fontface="bold", hjust=0, size=5, show.legend=FALSE) +
facet_wrap(~f)

Another option is to calculate the correlations on the fly and use the underlying numeric values of the factor variable g to place the text so that the red and blue labels don't overlap. This reduces the amount of code needed and makes label placement a bit easier.
library(dplyr)
ggplot(df, aes(x=x, y=y, colour=g)) +
geom_point() +
geom_smooth(method="lm", fill=NA) + # Guessing you meant fill=NA here
#geom_smooth(method="lm", se=FALSE) # Better way to remove confidence bands
facet_wrap(~f) +
geom_text(data=df %>% group_by(g, f) %>% summarise(corr = cor(x,y)),
aes(label=paste0("R = ", round(corr,2)), y = 10 - as.numeric(factor(g))), # factor() so this also works when g is a character column
x=-2, hjust=0, fontface="bold", size=5, show.legend=FALSE)
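
To see what the summarised data passed to geom_text() looks like, the same label table can be built in base R. This is only a sketch: pieces, df_text_auto, label and the y offset of 10 are illustrative names and values, not part of the answers above.

```r
# Rebuild the example data from the question
set.seed(112)
df <- data.frame(g = sample(c("A", "B"), 100, TRUE),
                 x = rnorm(100),
                 y = rnorm(100, 2, 3),
                 f = sample(c("i", "ii"), 100, TRUE))

# One correlation per (g, f) combination
pieces <- split(df, list(df$g, df$f), drop = TRUE)
df_text_auto <- do.call(rbind, lapply(pieces, function(d) {
  data.frame(g = d$g[1], f = d$f[1], corr = cor(d$x, d$y))
}))

# Label text, plus a y position that differs by group so labels do not overlap
df_text_auto$label <- paste0("R = ", round(df_text_auto$corr, 2))
df_text_auto$y <- 10 - as.numeric(factor(df_text_auto$g))
df_text_auto
```

A table like this could be passed as data= to geom_text() in the same way as df_text in the first answer, with label=label in aes().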

Related

Is there a possibility to combine position_stack and nudge_x in a stacked bar chart in ggplot2?

I want to add labels to a stacked bar chart to achieve something like this:
The goal is simple: I need to show market shares and changes versus previous year in the same graph. In theory, I would just add "nudge_x=0.5," to geom_text in the code but I get the error: "Specify either position or nudge_x/nudge_y". Is it possible to use some workaround, maybe another package? Thanks a lot in advance!
Code:
DashboardCategoryText <- c("Total Market","Small Bites","Bars","Total Market","Small Bites","Bars","Total Market","Small Bites","Bars")
Manufacturer <- c("Ferrero","Ferrero","Ferrero","Rest","Rest","Rest","Kraft","Kraft","Kraft")
MAT <- c(-1,5,-7,6,8,10,-10,5,8)
Measure_MATCurrent <- c(500,700,200,1000,600,80,30,60,100)
data <- data.frame(DashboardCategoryText,Manufacturer,MAT,Measure_MATCurrent)
library(dplyr)
groupedresult <- group_by(data,DashboardCategoryText)
groupedresult <- summarize(groupedresult,SUM=sum(Measure_MATCurrent))
groupedresult <- as.data.frame(groupedresult)
data <- merge(data,groupedresult,by="DashboardCategoryText")
data$percent <- data$Measure_MATCurrent/data$SUM
library(ggplot2)
ggplot(data, aes(x=reorder(DashboardCategoryText, SUM), y=percent, fill=Manufacturer)) +
geom_bar(stat = "identity", width = .7, colour="black", lwd=0.1) +
geom_text(aes(label=ifelse(percent >= 0.005, paste0(sprintf("%.0f", percent*100),"%"),"")),
position=position_stack(vjust=0.5), colour="white") +
geom_text(aes(label=MAT,y=percent),
nudge_x=0.5,
position=position_stack(vjust=0.8),
colour="black") +
coord_flip() +
scale_y_continuous(labels = scales::percent_format()) + # percent_format() comes from the scales package
labs(y="", x="")
I have a somewhat 'hacky' solution where you essentially just change the geom_text data in the underlying ggplot object before you plot it.
p <- ggplot(data, aes(x=reorder(DashboardCategoryText, SUM), y=percent, fill=Manufacturer)) +
geom_bar(stat = "identity", width = .7, colour="black", lwd=0.1) +
geom_text(aes(label=ifelse(percent >= 0.005, paste0(sprintf("%.0f", percent*100),"%"),"")),
position=position_stack(vjust=0.5), colour="white") +
geom_text(aes(label=MAT,y=percent),
position=position_stack(vjust=.5),
colour="black") +
coord_flip() +
scale_y_continuous(labels = scales::percent_format()) + # percent_format() comes from the scales package
labs(y="", x="")
q <- ggplot_build(p) # get the ggplot data
q$data[[3]]$x <- q$data[[3]]$x + 0.5 # change it to adjust the x position of geom_text
plot(ggplot_gtable(q)) # plot everything
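
The index 3 refers to the third layer of the plot, since ggplot_build() returns one data frame per layer. The sketch below checks this on a small made-up data set (toy, cat, grp, share and chg are hypothetical names) and assumes the built object can be read and modified through $data as in the answer above.

```r
library(ggplot2)

# Hypothetical toy data: two categories, two stacked groups each
toy <- data.frame(cat   = rep(c("a", "b"), each = 2),
                  grp   = rep(c("u", "v"), 2),
                  share = c(0.4, 0.6, 0.7, 0.3),
                  chg   = c(1, -2, 3, -4))

p <- ggplot(toy, aes(x = cat, y = share, fill = grp)) +
  geom_col() +                                                            # layer 1
  geom_text(aes(label = share), position = position_stack(vjust = 0.5)) + # layer 2
  geom_text(aes(label = chg),   position = position_stack(vjust = 0.5))   # layer 3

b <- ggplot_build(p)
n_layers <- length(b$data)              # one data frame per layer
x_before <- as.numeric(b$data[[3]]$x)   # stacked label positions on x

# Shift only the third layer (the "chg" labels) sideways
b$data[[3]]$x <- b$data[[3]]$x + 0.3
x_after <- as.numeric(b$data[[3]]$x)

g <- ggplot_gtable(b)                   # draw with grid::grid.draw(g)
```

Checking length(b$data) before editing is a cheap way to make sure the index still points at the intended layer after layers are added or removed.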

Use position_jitterdodge to plot points, and add highlighted points that are also dodged

I have some data where x is categorical, y is numeric, and color.var is another categorical variable that I would like to color by. My goal is to plot all of the points using position_jitterdodge(), and then highlight a couple of the points, draw a line between them, and add labels, while making sure these highlighted points line up with the corresponding strips of points that were plotted using position_jitterdodge(). The highlighted points are aligned properly when all factors are present in the variable used to dodge, but it does not work well when some factors are missing.
Minimal (non-)working example
library(ggplot2)
Generate some data
d = data.frame(x = c(rep('x1', 1000), rep('x2', 1000)),
y = runif(n=2000, min=0, max=1),
color.var= rep(c('color1', 'color2'), 1000),
facet.var = rep(c('facet1', 'facet1', 'facet2', 'facet2'), 500))
head(d)
dd = d[c(1,2,3,4,1997,1998, 1999,2000),]
dd
df1 = dd[dd$color.var=='color1',] ## data for first set of points, labels, and the line connecting them
df2 = dd[dd$color.var=='color2',] ## data for second set of points, labels, and the line connecting them
df1
dw = .75 ## Define the dodge.width
Plot all points
Here are all of the points, separated using position_jitterdodge() and the aesthetic fill.
ggplot() +
geom_point(data=d, aes(x=x, y=y, fill=color.var), position=position_jitterdodge(dodge.width=dw), size=3, alpha=1, shape=21, color='darkgray') +
facet_wrap(~facet.var) +
scale_fill_manual(values=c( 'lightblue','gray'))+
theme(axis.title = element_blank()) +
theme(legend.position="top")
That works well.
Additional highlighted points.
Here is the same plot, with additional points in dd added.
ggplot() +
geom_point(data=d, aes(x=x, y=y, fill =color.var), position=position_jitterdodge(dodge.width=dw), size=3, alpha=1, shape=21, color='darkgray') +
geom_point(data=dd, aes(x=x, y=y, color=color.var ), position=position_dodge(width=.75), size=4 ) +
geom_line(data=dd, aes(x=x, y=y, color=color.var, group=color.var ), position=position_dodge(width=.75), size=1 ) +
geom_label(data=dd, aes(x=x, y=y, color=color.var, group=color.var, label=round(y,1)), position=position_dodge(width=.75), vjust=-.5) +
facet_wrap(~facet.var) +
scale_fill_manual(values=c( 'lightblue','gray'))+
scale_color_manual(values=c( 'blue', 'gray40')) +
theme(axis.title = element_blank())+
theme(legend.position="top")
This is what I want it to look like. However, this only works properly if both factors of the color.var variable are in the set of points to highlight.
If both factors aren't present in the new data, the horizontal alignment fails.
Highlight points, only one factor present
Here is an example where only the 'color1' factor (blue) is present. Note that data=dd was replaced with data=df1 (data that only contains blue highlighted dots) in this code.
ggplot() +
geom_point(data=d, aes(x=x, y=y, fill =color.var), position=position_jitterdodge(dodge.width=dw), size=3, alpha=1, shape=21, color='darkgray') +
geom_point(data=df1, aes(x=x, y=y, color=color.var ), position=position_dodge(width=.75), size=4 ) +
geom_line(data=df1, aes(x=x, y=y, color=color.var, group=color.var ), position=position_dodge(width=.75), size=1 ) +
geom_label(data=df1, aes(x=x, y=y, color=color.var, group=color.var, label=round(y,1)), position=position_dodge(width=.75), vjust=-.5) +
facet_wrap(~facet.var) +
scale_fill_manual(values=c( 'lightblue','gray'))+
scale_color_manual(values=c( 'blue', 'gray40')) +
theme(axis.title = element_blank())+
theme(legend.position="top") +
scale_x_discrete(drop=F)
The highlighted blue dots appear between the blue and gray dots, instead of aligned with the blue dots. Note that the additional code scale_x_discrete(drop=F) had no apparent effect on the alignment.
A manual solution
One possible fix is to edit the x coordinate manually, like this
ggplot(data=d, aes(x=x, y=y)) +
geom_point(aes(fill=color.var), position=position_jitterdodge(dodge.width=dw), size=3, alpha=1, shape=21, color='darkgray') +
## factor(x) is used so that as.numeric() also works when x is a character column
geom_point(data=df1, aes(x=as.numeric(factor(x))-dw/4, y=y), alpha=.9, size=4 , color='blue') + ## first set of points
geom_line( data=df1, aes(x=as.numeric(factor(x))-dw/4, y=y , group=color.var ), color='blue', size=1) + ## first line
geom_label(data=df1, aes(x=as.numeric(factor(x))-dw/4, y=y , label=round(y,1)), color='blue', vjust=-.25)+ ## first set of labels
facet_wrap(~facet.var) +
scale_fill_manual(values=c( 'lightblue','gray'))+
theme(axis.title = element_blank()) +
theme(legend.position="top")
An adjustment of 1/4 of the dodge.width seems to work. This works fine, but it seems like there should be a better way, especially since I will eventually want to do this with 4-5 sets of highlighted points/lines, which may all be the same color.var, like the blue 'color1' factor above. Repeating this 4-5 times would be cumbersome. I will also eventually want to do this with 5-10 different figures. I suppose dodge.width*1/4 will always work, and copying and pasting might do the trick, but I would like to know if there is a better way.
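
The 1/4 factor is consistent with dodging that spreads n groups evenly across dodge.width, centred on the category position. The helper below is a sketch of that arithmetic under this assumption, not ggplot2's own code; dodge_offset is a hypothetical name.

```r
# Offset of group i (1..n) from the category centre when n groups
# share a total dodge width of dw
dodge_offset <- function(i, n, dw) {
  dw * ((i - 0.5) / n - 0.5)
}

two_groups  <- dodge_offset(1:2, n = 2, dw = 0.75)  # -0.1875  0.1875, i.e. -dw/4 and +dw/4
four_groups <- dodge_offset(1:4, n = 4, dw = 0.75)  # -0.28125 -0.09375  0.09375  0.28125
```

So dw/4 applies to two groups only; with a different number of levels in color.var the manual offset would have to change.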
Here is a solution based on @aosmith's comment. Basically, you just need to add this code before calling ggplot():
library(dplyr) ## needed for group_by()
library(tidyr) ## needed for complete()
df1 = df1 %>% group_by(facet.var, x) %>% complete(color.var)
That adds extra rows to the data so that all the levels of color.var are present. Then the code given in the question, along with a couple of small edits that fix the legend, can be used:
ggplot() +
geom_point(data=d , aes(x=x, y=y, fill =color.var), position=position_jitterdodge(dodge.width=dw), size=3, alpha=1, shape=21, color='darkgray', show.legend=T) +
geom_point(data=df1, aes(x=x, y=y, color=color.var ), position=position_dodge(width=.75), size=4, show.legend=T ) +
geom_line( data=df1, aes(x=x, y=y, color=color.var, group=color.var ), position=position_dodge(width=.75), size=1, show.legend=F ) +
geom_label(data=df1, aes(x=x, y=y, color=color.var, group=color.var, label=round(y,1)), position=position_dodge(width=.75), vjust=-.5, show.legend=F) +
facet_wrap(~facet.var) +
scale_fill_manual( values=c( 'lightblue','gray'), name='Background dots', guide=guide_legend(override.aes = list(color=c('lightblue', 'gray')))) +
scale_color_manual(values=c( 'blue', 'gray40') , name='Highlighted dots') +
theme(axis.title = element_blank())+
theme(legend.position="top")+
scale_x_discrete(drop=F)
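
Whether complete(color.var) adds the missing level depends on color.var being a factor that still carries both levels. Since R 4.0, data.frame() keeps strings as character by default, in which case only the values actually present in df1 are known. The sketch below sets the levels explicitly first; the small df1 here and the name df1_full are illustrative, not the data from the question.

```r
library(dplyr)
library(tidyr)

# Illustrative stand-in for df1: only 'color1' rows are present
df1 <- data.frame(x         = c("x1", "x2", "x1", "x2"),
                  y         = c(0.2, 0.8, 0.4, 0.6),
                  color.var = "color1",
                  facet.var = c("facet1", "facet1", "facet2", "facet2"))

# Declare every level that the background points use for dodging
df1$color.var <- factor(df1$color.var, levels = c("color1", "color2"))

# Add one row per missing level within each facet/x cell; y is NA there
df1_full <- df1 %>%
  group_by(facet.var, x) %>%
  complete(color.var) %>%
  ungroup()

df1_full
```

The added rows have y = NA, so the highlight layers are expected to drop them with a missing-values warning, while the extra level should still reserve its dodge slot.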

plotting: color based on the combination of two column levels

How to plot based on the combination of two column levels(here: treatment, replicate)?
set.seed(0)
x <- rep(1:10, 4)
y <- sample(c(rep(1:10, 2)+rnorm(20)/5, rep(6:15, 2) + rnorm(20)/5))
treatment <- sample(gl(8, 5, 40, labels=letters[1:8]))
replicate <- sample(gl(8, 5, 40))
d <- data.frame(x=x, y=y, treatment=treatment, replicate=replicate)
plots: color based on single column levels
ggplot(d, aes(x=x, y=y, colour=treatment)) + geom_point()
ggplot(d, aes(x=x, y=y, colour=replicate)) + geom_point()
The combination of two column levels would be a-1, a-2, a-3, ... h-6, h-7, h-8.
64 colours will be uninterpretable. How about point labels instead:
ggplot(d, aes(x=x, y=y, colour=treatment)) +
geom_text(aes(label=paste0(treatment, replicate)), size=3, show.legend=FALSE) +
theme_classic()
Or, if you're trying to spot differences in patterns for different treatments, maybe faceting would help:
ggplot(d, aes(x=x, y=y, colour=treatment)) +
geom_text(aes(label=paste0(treatment, replicate)), size=3, show.legend=FALSE) +
facet_wrap(~ treatment, ncol=4) +
scale_x_continuous(expand=c(0,0.7)) +
theme_bw() + theme(panel.grid=element_blank())
But, if you really want a whole bunch of colours...
ggplot(d, aes(x=x, y=y, colour=interaction(treatment,replicate,sep="-",lex.order=TRUE))) +
geom_point() +
labs(colour="Treatment-Replicate") +
theme_classic()
(If you want all potential treatment-replicate combinations to be listed in the legend, regardless of whether they're present in the data, then add + scale_colour_discrete(drop=FALSE) to the plot code.)
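
To check what interaction() produces before plotting, the combined factor can be inspected on its own. This is a small sketch with unshuffled versions of the two columns; combo is an illustrative name.

```r
treatment <- gl(8, 5, 40, labels = letters[1:8])
replicate <- gl(8, 5, 40)

# lex.order = TRUE makes the first factor vary slowest in the level order
combo <- interaction(treatment, replicate, sep = "-", lex.order = TRUE)

n_possible   <- nlevels(combo)              # all 8 x 8 combinations
first_levels <- head(levels(combo), 3)      # "a-1" "a-2" "a-3"
n_present    <- nlevels(droplevels(combo))  # combinations that occur in the data
```

The gap between n_possible and n_present is what drop=FALSE on the colour scale would add back to the legend.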

Add same gradient to each rectangle in ggplot

I am trying to add a color gradient to the bars in the ggplot2 plot created below, using the following data and code:
vector <- c(9, 10, 6, 5, 5)
Names <- c("Leadership", "Management\n", "Problem Solving",
"Decision Making\n", "Social Skills")
# add \n
Names[seq(2, length(Names), 2)] <- paste0("\n" ,Names[seq(2, length(Names), 2)])
# data.frame, including a grouping vector
d <- data.frame(Names, vector, group=c(rep("Intra-capacity", 3), rep("Inter-capacity", 2)))
# correct order
d$Names <- factor(d$Names, levels= unique(d$Names))
d$group_f = factor(d$group, levels=c('Intra-capacity','Inter-capacity'))
# plot the bars
p <- ggplot(d, aes(x= Names, y= vector, group= group, fill=vector, order=vector)) +
geom_bar(stat= "identity") +
theme_bw()+
scale_fill_gradient(low="white",high="blue")
# use facet_grid for the groups
#p + facet_grid(.~group_f, scales= "free_x", space= "free_x")
p+ theme(text = element_text(size=23),plot.background = element_rect(fill = "white"),
strip.background = element_rect(fill="Dodger Blue")) +
facet_grid(.~group_f, scales= "free_x", space= "free_x") + xlab("") +ylab("") +
theme(strip.text.x = element_text(size = 18, colour = "white" )) +
geom_text(size=10, aes(label=vector))
My output is this:
But now I would like to insert a color gradient so that each rectangle looks like the picture below (my desired output):
I've also looked at this:
R: gradient fill for geom_rect in ggplot2
create an arrow with gradient color
http://www.computerworld.com/article/2935394/business-intelligence/my-ggplot2-cheat-sheet-search-by-task.html
Color Gradients With ggplot
Label minimum and maximum of scale fill gradient legend with text: ggplot2
How can I apply a gradient fill to a geom_rect object in ggplot2?
And also tried using:
scale_fill_gradient(low="white",high="blue") or
scale_fill_gradientn(colours = c("blue","white","red"),
values = c(0,5,10),
guide = "colorbar", limits=c(0,10))
But I am clearly doing something wrong.
I'm with @RomanLustrik here. However, if you can't use Excel (which is probably much easier), maybe just adding a white rectangle with an alpha gradient is already enough:
ggplot(d, aes(x= Names, y= vector, group= group,order=vector)) +
geom_bar(stat= "identity", fill="blue") +
theme_bw() +
scale_fill_gradient(low="white",high="blue") +
annotation_custom(
grid::rasterGrob(paste0("#FFFFFF", format(as.hexmode(1:255), width=2)), # format() pads each alpha value to two hex digits
width=unit(1,"npc"),
height = unit(1,"npc"),
interpolate = TRUE),
xmin=-Inf, xmax=Inf, ymin=-Inf, ymax=5
) +
geom_text(aes(label=vector), color="white", y=2, size=12)
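
The overlay works because each raster cell is white with a different alpha value in the last two hex digits. The sketch below builds such a colour vector in two ways and checks the alpha channel; overlay_hex and overlay_rgb are illustrative names, and format(..., width = 2) is used so that every alpha value has exactly two hex digits.

```r
# White with alpha 1..255, written as #RRGGBBAA strings
overlay_hex <- paste0("#FFFFFF", format(as.hexmode(1:255), width = 2))

# The same colours built from numeric components
overlay_rgb <- rgb(1, 1, 1, alpha = (1:255) / 255)

# Read the alpha channel back to confirm both vectors agree
alpha_hex <- unname(col2rgb(overlay_hex, alpha = TRUE)["alpha", ])
alpha_rgb <- unname(col2rgb(overlay_rgb, alpha = TRUE)["alpha", ])

range(alpha_hex)
identical(alpha_hex, alpha_rgb)
```

Reversing the vector with rev() would flip the direction of the fade if the gradient should run the other way.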

Controlling bg and fg in ggplot2

I have a set of data that I am trying to plot.
I would like the fill (bg) to be controlled by a logical variable.
The only way I can make it work is by layering two sets of points.
Is there a better way?
require(ggplot2)
dat<-data.frame(
x=rep(1:10, 2),
val=c(rnorm(10, 10), rnorm(10, 12)),
grp=rep(c("A", "B"), each=10),
tf=sample(c(TRUE, FALSE), 20, replace=TRUE)
)
ggplot(dat, aes(x, val, col=grp))+
geom_line()+
geom_point(aes(alpha=tf), size=4)+
geom_point(shape=21, size=4, aes(fg=grp))
You can use a manual shape to do this.
ggplot(dat, aes(x, val, col=grp)) +
geom_line() +
geom_point(aes(shape=tf), size=4) +
scale_shape_manual(values=c(19,21))
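
Shape 19 is a solid circle and shape 21 is drawn as an outline when no fill is set, so the shape scale doubles as a filled/unfilled switch. With an unnamed vector the values are assigned in the order FALSE, TRUE; the sketch below names them instead to pin which logical value gets which shape. The choice of TRUE = solid is an assumption about the intended look, and set.seed(42) is added only for reproducibility.

```r
library(ggplot2)

set.seed(42)
dat <- data.frame(x   = rep(1:10, 2),
                  val = c(rnorm(10, 10), rnorm(10, 12)),
                  grp = rep(c("A", "B"), each = 10),
                  tf  = sample(c(TRUE, FALSE), 20, replace = TRUE))

p <- ggplot(dat, aes(x, val, col = grp)) +
  geom_line() +
  geom_point(aes(shape = tf), size = 4) +
  # Named values: TRUE points are solid, FALSE points are hollow
  scale_shape_manual(values = c("TRUE" = 19, "FALSE" = 21))

p
```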
