R - geom_point - grouping to be used for legend

I have two geom_point calls applied to different data frames and would like a legend that distinguishes them. However, I am not sure how to group them correctly for the legend. I would appreciate it if you could take a look at the simple example below and help me figure out why no legend appears on the figure. Thanks!
df1=data.table(x1=c(-1,0,1), y1=c(-1,0,1))
df2=data.table(x2=c(-1,0,1), y2=c(-2,0,2))
ggplot()+
geom_point(data=df1, aes(x=x1, y=y1), color='red', group=1) +
geom_point(data=df2, aes(x=x2, y=y2), color='blue', group=2) +
xlab("X Label")+ylab("Y Label") +
scale_colour_manual(name = "My Legend",
values = group,
labels = c("database1", "database2"))

As suggested, ggplot2 likes a "tidy" way of dealing with data. In this case, it involves combining the data with an additional variable to differentiate the groups:
colnames(df2) <- c("x1","y1")
df <- rbind(transform(df1, grp='red'), transform(df2, grp='blue'))
ggplot()+
geom_point(data=df, aes(x=x1, y=y1, color=grp), group=1) +
xlab("X Label")+ylab("Y Label") +
scale_color_identity(guide="legend")
I used scale_color_identity for simplicity here, but it isn't hard to go the way you started, using scale_colour_manual to set the colours and relabel the groups.
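As a minimal sketch of that scale_colour_manual route (assuming plain data frames instead of data.table, and using the labels from the question as the group names), the group variable can hold the legend labels directly and the colours can be assigned by name:

```r
library(ggplot2)

# Same toy data as in the question, with matching column names
df1 <- data.frame(x1 = c(-1, 0, 1), y1 = c(-1, 0, 1))
df2 <- data.frame(x1 = c(-1, 0, 1), y1 = c(-2, 0, 2))

# Combine into one "tidy" data frame; grp holds the legend labels
df <- rbind(transform(df1, grp = "database1"),
            transform(df2, grp = "database2"))

# Mapping colour inside aes() is what makes ggplot2 draw a legend
p <- ggplot(df, aes(x = x1, y = y1, colour = grp)) +
  geom_point() +
  labs(x = "X Label", y = "Y Label") +
  scale_colour_manual(name = "My Legend",
                      values = c(database1 = "red", database2 = "blue"))
p
```

The original code produced no legend because color was set as a constant outside aes(); only aesthetics mapped inside aes() get a guide.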

Related

add geom vline and ggplotly to a facet grid

Is there any way I could add a vertical line at x = 15 in both panels of the plot, and also convert it to a plotly plot? I tried, but it doesn't seem to work. Thanks.
sbucks_new %>%
ggplot(aes(x= category, y= bad_fat, color= category)) +
geom_boxplot() +
coord_flip() +
facet_grid(~ milk_dummy)+
labs(title= "Unhealthy Fats in Milk drinks by Category",
x= "Drinks Category",
y="Bad Fats (g)") +
theme_bw()
Use geom_hline to draw the vertical line. This looks backwards, but coord_flip swaps the axes, so a "horizontal" line on the y aesthetic ends up drawn vertically.
Here's an example with mtcars:
library(ggplot2)
# mtcars has no am2 column, so facet on am instead
p <- ggplot(mtcars, aes(x=factor(carb), y=disp)) +
  geom_boxplot() +
  facet_wrap(~am) +
  geom_hline(aes(yintercept=300)) +
  coord_flip()
I'm not sure about your other question, but you can quickly convert a ggplot object into a plotly one using the ggplotly function.
plotly::ggplotly(p)
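A hedged alternative, assuming ggplot2 3.3.0 or later (where geom_boxplot can be oriented horizontally by mapping the continuous variable to x): dropping coord_flip altogether lets geom_vline be used directly, which may be less confusing. The object name p2 is just for illustration.

```r
library(ggplot2)

# Continuous variable on x, categorical on y: boxplots are drawn horizontally
# without coord_flip(), so the vertical line is a plain geom_vline()
p2 <- ggplot(mtcars, aes(x = disp, y = factor(carb))) +
  geom_boxplot() +
  facet_wrap(~am) +
  geom_vline(xintercept = 300)
p2
```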

Labelling plots arranged with grid.arrange

I have attached multiple plots to one page using grid.arrange.
Is there a way to label each plot with "(a)", "(b)", etc.?
I have tried using geom_text, but it does not seem compatible with my plots: it has some strange interaction with my legend symbols.
Below is an example using the mtcars data of what I am trying to achieve. The alternative to geom_text I have found is annotate, which does not interact with my legend symbols. However, it is not easy to label only one facet with it.
q1=ggplot(mtcars, aes(x=mpg, y=wt)) +
geom_line() +
geom_point()+
facet_grid(~cyl)+
annotate(geom="text", x=15, y=12, label="(a)",size=8,family="serif")
q2=ggplot(mtcars, aes(x=mpg, y=wt,)) +
geom_line() +
geom_point()+
facet_grid(~cyl)+
annotate(geom="text", x=15, y=12, label="(b)",size=8,family="serif")
# earlier geom_text attempt, not part of the plot:
# geom_text(x=15, y=5,size=8, label="(b)")
library(gridExtra)
gt1 <- ggplotGrob(q1)
gt2 <- ggplotGrob(q2)
grid.arrange(gt1,gt2, ncol=1)
Therefore, my question is, is there a way to label plots arranged using grid.arrange, so that the first facet in each plot is labelled with either a, or b or c etc...?
You can use ggarrange from ggpubr package and set labels for each plot using the argument labels:
library(ggplot2)
library(ggpubr)
q1=ggplot(mtcars, aes(x=mpg, y=wt)) +
geom_line() +
geom_point()+
facet_grid(~cyl)+
annotate(geom="text", x=15, y=12, label="(a)",size=8,family="serif")
q2=ggplot(mtcars, aes(x=mpg, y=wt,)) +
geom_line() +
geom_point()+
facet_grid(~cyl)+
annotate(geom="text", x=15, y=12, label="(b)",size=8,family="serif")
ggarrange(q1,q2, ncol = 1, labels = c("a)","b)"))
Is this what you are looking for?
If you set inherit.aes=FALSE, you can prevent geom_text from interfering with the legend:
ggplot(mtcars, aes(x=mpg, y=wt,col=factor(cyl))) +
geom_line() +
geom_point()+
geom_text(inherit.aes=FALSE,aes(x=15,y=12,label="(a)"),
size=8,family="serif")+
facet_grid(~cyl)
If you want to label only the first facet (I hope I understood you correctly), I think the easiest way is to give geom_text its own data frame that includes the faceting variable. For example, to place a label only in the first panel:
#place it in the first
lvl_data = data.frame(
x=15,y=12,label="(a)",
cyl=levels(factor(mtcars$cyl))[1]
)
ggplot(mtcars, aes(x=mpg, y=wt,col=factor(cyl))) +
geom_line() +
geom_point()+
geom_text(data=lvl_data,inherit.aes=FALSE,
aes(x=x,y=y,label=label),size=8,family="serif")+
facet_grid(~cyl)

Duplicated xtick labels in ggplot facets

I have this data.frame which I want to plot in facets using ggplot + facet_wrap:
set.seed(1)
df <- data.frame(val=rnorm(36),
gt=c(sapply(c("wt","pd","md","bd"),function(x) rep(x,9))),
ts=rep(c(sapply(c("cb","hp","ac"),function(x) rep(x,3))),4),
col=c(sapply(c("darkgray","darkblue","darkred","darkmagenta"),function(x) rep(x,9))),
index=rep(1:9,4),
stringsAsFactors=F)
df$xlab <- paste(df$ts,df$index,sep=".")
df$gt <- factor(df$gt,levels=c("wt","pd","md","bd"))
Here's how I'm trying to plot:
require(ggplot2)
ggplot(df,aes(x=index,y=val,color=gt))+geom_point(size=3)+facet_wrap(~gt,ncol=4)+
scale_fill_manual(values=c("darkgray","darkblue","darkred","darkmagenta"),labels=levels(df$gt),name="gt",guide=F)+
scale_colour_manual(values=c("darkgray","darkblue","darkred","darkmagenta"),labels=levels(df$gt),name="gt",guide=F)+
labs(x="replicate",y="val")+scale_x_continuous(breaks=df$index,labels=df$xlab)+
theme_bw()+theme(axis.text=element_text(size=6),axis.title=element_text(size=7),legend.text=element_text(size=6),legend.key=element_blank(),panel.border=element_blank(),strip.background=element_blank())
Which gives:
The problem is that the x-axis tick labels repeat themselves, since I'm calling scale_x_continuous. How do I get it right with facet_wrap?
Use the actual x-values in xlab as the x aesthetic, along with scales="free_x" in facet_wrap and delete the call to scale_x_continuous. Note, however, that the axis labels are still the same in each panel, because they are the same for each level of gt in the data.
ggplot(df,aes(x=xlab, y=val, color=gt)) +
geom_point(size=3, show.legend=FALSE) +
facet_wrap(~gt, ncol=4, scales="free_x") +
# scale_fill_manual(values=c("darkgray","darkblue","darkred","darkmagenta"), labels=levels(df$gt), name="gt", guide=F) +
scale_colour_manual(values=c("darkgray","darkblue","darkred","darkmagenta")) +
labs(x="replicate", y="val") +
#scale_x_continuous(breaks=df$index, labels=df$xlab)+
theme_bw() +
theme(axis.text=element_text(size=8),
axis.title=element_text(size=7),
legend.text=element_text(size=6),
legend.key=element_blank(),
panel.border=element_blank(),
strip.background=element_blank())
Now let's change xlab, just to see how this works when different panels really do have different labels:
df$xlab[10:20] = LETTERS[1:11]
Now run the same plot code again to get the following:
One more contingency is the case where not all the panels have the same number of x-values. In that case, you can switch to facet_grid and add space="free_x" if you want the width of each panel to be proportional to the number of x-values in each panel.
ggplot(df[-c(1:5),], aes(x=xlab, y=val, color=gt)) +
geom_point(size=3, show.legend=FALSE) +
facet_grid(.~gt, space="free_x", scales="free_x") +
scale_colour_manual(values=c("darkgray","darkblue","darkred","darkmagenta")) +
labs(x="replicate", y="val") +
theme_bw() +
theme(axis.text=element_text(size=8),
axis.title=element_text(size=7),
legend.text=element_text(size=6),
legend.key=element_blank(),
panel.border=element_blank(),
strip.background=element_blank())
A few other things:
You don't need to add color names to your data frame. If you want to change the default colors, you can just set them using one of the scale_colour_*** functions (as you did in your code).
For future reference, this: c(sapply(c("darkgray","darkblue","darkred","darkmagenta"),function(x) rep(x,9))) can be shortened to this: rep(c("darkgray","darkblue","darkred","darkmagenta"), each=9).
You can remove the scale_fill_manual line, as you don't have a fill aesthetic in your graph.
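A quick check of the rep() simplification mentioned above, using base R only (the variable names are just for illustration): both expressions should produce the same unnamed character vector of length 36.

```r
cols <- c("darkgray", "darkblue", "darkred", "darkmagenta")

# Original construction: sapply builds a 9 x 4 matrix, c() flattens it column by column
via_sapply <- c(sapply(cols, function(x) rep(x, 9)))

# Simpler construction: repeat each colour 9 times in order
via_rep <- rep(cols, each = 9)

# Stops with an error if the two vectors differ
stopifnot(identical(via_sapply, via_rep))
```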

ggplot remove or replace the 'a' in geom_text legends

I am trying to remove the little "a" in the legend key, but without any luck. Another possibility would be to create a legend, or legend-like text, next to the graph, but I am running out of ideas. Maybe someone can help me.
I plot on specific positions a red X and I want to point out, that the X marked things are imputed...
df <- data.frame(x=rnorm(10),y=rnorm(10))
ggplot(df, aes(x=x, y=y)) + geom_point() + geom_text(aes(x=0,y=0, color=factor(1)), label='X') +
scale_color_manual(values = 'red', name='imputed',labels='imputed') +
theme(legend.key=element_blank(), legend.title=element_blank())
I think the best result would be to replace the little "a" with an X, but I could not find any solution for it.
The problem is that you use geom_text to draw the cross.
A simple way to solve it is to use geom_point to plot the cross:
ggplot(df, aes(x=x, y=y)) + geom_point() + geom_point(aes(x=0,y=0, color=factor(1)), shape='X', size=5) +
scale_color_manual(values = 'red',labels='imputed') +
theme(legend.key=element_blank(), legend.title=element_blank())

Draw mean and outlier points for box plots using ggplot2

I am trying to plot the outliers and mean points for the box plots below, using the data available here. The dataset has 3 different factors and 1 value column for 3600 rows.
When I run the code below, it shows the mean points but doesn't draw the outliers properly:
ggplot(df, aes(x=Representations, y=Values, fill=Methods)) +
geom_boxplot() +
facet_wrap(~Metrics) +
stat_summary(fun.y=mean, colour="black", geom="point", position=position_dodge(width=0.75)) +
geom_point() +
theme_bw()
Then, when I modify the code as below, the mean points disappear:
ggplot(df, aes(x=Representations, y=Values, colour=Methods)) +
geom_boxplot() +
facet_wrap(~Metrics) +
stat_summary(fun.y=mean, colour="black", geom="point", position=position_dodge(width=0.75)) +
geom_point() +
theme_bw()
In both of the cases I am getting the message: "ymax not defined: adjusting position using y instead" 3 times.
Any suggestions on how to fix it? I would like to draw the mean points within the individual box plots and show outliers in the same colour as the plots.
EDIT:
The original data set does not have any outliers, and that was the reason for my confusion. Thanks to MrFlick's answer, whose randomly generated data clarifies this.
Rather than downloading the data, I just made a random sample.
set.seed(18)
gg <- expand.grid (
Methods=c("BC","FD","FDFND","NC"),
Metrics=c("DM","DTI","LB"),
Representations=c("CHG","QR","HQR")
)
df <- data.frame(
gg,
Values=rnorm(nrow(gg)*50)
)
Then you should be able to create the plot you want with
library(ggplot2)
ggplot(df, aes(x=Representations, y=Values, fill=Methods)) +
geom_boxplot() +
stat_summary(fun.y="mean", geom="point",
position=position_dodge(width=0.75), color="white") +
facet_wrap(~Metrics)
which gave me
I was using ggplot2 version 0.9.3.1
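For readers on a current ggplot2, here is a hedged update of the same plot: as far as I know, the fun.y argument of stat_summary was deprecated in ggplot2 3.3.0 in favour of fun, so the sketch below only swaps that argument and reuses the random sample from above.

```r
library(ggplot2)

# Same random sample as above: 36 factor combinations, 50 values each
set.seed(18)
gg <- expand.grid(
  Methods = c("BC", "FD", "FDFND", "NC"),
  Metrics = c("DM", "DTI", "LB"),
  Representations = c("CHG", "QR", "HQR")
)
df <- data.frame(gg, Values = rnorm(nrow(gg) * 50))

# fun replaces the deprecated fun.y; the dodge width matches the boxplot default
p <- ggplot(df, aes(x = Representations, y = Values, fill = Methods)) +
  geom_boxplot() +
  stat_summary(fun = mean, geom = "point",
               position = position_dodge(width = 0.75), colour = "white") +
  facet_wrap(~Metrics)
p
```

geom_boxplot draws outliers itself, so the extra geom_point() layer from the question is not needed; it would overplot every observation at the undodged x position.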
