Manually plotting significance relations between sub-groups on ggplot2 barplot - r

I've been trying to plot manually labelled significance bars for a subset of groups on a ggplot2 barplot using ggsignif or ggpubr without much luck. The data is something like the following MWE:
library(ggplot2)
set.seed(3)
## create data
df <- data.frame(activity = rep(c("Flying", "Jumping"), 3),
                 mean = rnorm(6, 50, 25),
                 group = c(rep("Ecuador", 2),
                           rep("Peru", 2),
                           rep("Brazil", 2)))
## plot it
ggplot(df, aes(x = activity, y = mean, fill = group)) +
  geom_bar(position = position_dodge(0.9), stat = "identity",
           width = 0.9, colour = "black", size = 0.1) +
  xlab("Activity") + ylab("Mean")
I'd like to manually specify significance labels, say between Brazil and Ecuador on "Flying", and between Ecuador and Peru on "Jumping". Does anyone know how to properly deal with this kind of data, for example with ggsignif? And is there a way to refer to each bar by name, rather than trying to work out its x-axis position?

If you know which bars you want to annotate, you can pass their x positions to geom_signif() directly:
library(ggsignif)
library(ggplot2)
ggplot(df, aes(x = activity, y = mean, fill = group)) +
  geom_bar(position = position_dodge(0.9), stat = "identity",
           width = 0.9, colour = "black", size = 0.1) +
  xlab("Activity") + ylab("Mean") +
  geom_signif(y_position = c(60, 50), xmin = c(0.7, 2), xmax = c(1, 2.3),
              annotations = c("**", "***"), tip_length = 0)
Does that answer your question?
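The xmin and xmax values in that call follow from how position_dodge() spreads the bars around each category. As a minimal sketch of looking the positions up by name, assuming ggplot2's default alphabetical ordering of both discrete variables and a dodge width of 0.9 (bar_x is a hypothetical helper, not part of ggplot2 or ggsignif):

```r
# Hypothetical helper: x-axis centre of a dodged bar, looked up by name.
# Assumes categories and groups are drawn in the order given here
# (alphabetical by default) and that position_dodge(0.9) is used.
bar_x <- function(activity, group,
                  activities = c("Flying", "Jumping"),
                  groups = c("Brazil", "Ecuador", "Peru"),
                  dodge_width = 0.9) {
  n <- length(groups)
  x <- match(activity, activities)  # integer position of the category
  idx <- match(group, groups)       # index of the bar within the category
  x + dodge_width * ((idx - 0.5) / n - 0.5)
}

# Brazil vs Ecuador on "Flying", Ecuador vs Peru on "Jumping"
xmin <- c(bar_x("Flying", "Brazil"), bar_x("Jumping", "Ecuador"))
xmax <- c(bar_x("Flying", "Ecuador"), bar_x("Jumping", "Peru"))
```

These evaluate to the same positions used in the answer (0.7 and 2 for xmin, 1 and 2.3 for xmax), up to floating-point rounding, and can be passed to geom_signif() in place of the hard-coded numbers.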

Related

How to avoid overlapping bubbles in bubble plot?

I want to plot the data as separate bubbles, like the image on the right (I made it in PowerPoint just to illustrate the idea).
At the moment I can only create a plot that looks like the one on the left, where the bubbles overlap. How can I do this in R?
b <- ggplot(df, aes(x = Year, y = Type))
b + geom_point(aes(color = Spp, size = value), alpha = 0.6) +
  scale_color_manual(values = c("#0000FF", "#DAA520", "#228B22", "#E7B888")) +
  scale_size(range = c(0.5, 12))
You can use position_dodge() as the position argument of geom_point(). Applied directly to your code, it would dodge the points horizontally, so the idea is to swap your x and y variables and use coord_flip() to restore the original orientation:
library(ggplot2)
ggplot(df, aes(y = as.factor(Year), x = Type)) +
  geom_point(aes(color = Group, size = Value), alpha = 0.6,
             position = position_dodge(0.9)) +
  scale_color_manual(values = c("#0000FF", "#DAA520", "#228B22", "#E7B888")) +
  scale_size(range = c(1, 15)) +
  coord_flip()
Does this look like what you are trying to achieve?
EDIT: Adding text in the middle of each point
To label each point, you can use geom_text() with the same position_dodge2() setting as for geom_point().
NB: I use position_dodge2 instead of position_dodge and slightly change the width values because I found position_dodge2 better suited to this case.
library(ggplot2)
ggplot(df, aes(y = as.factor(Year), x = Type)) +
  geom_point(aes(color = Group, size = Value), alpha = 0.6,
             position = position_dodge2(width = 1)) +
  scale_color_manual(values = c("#0000FF", "#DAA520", "#228B22", "#E7B888")) +
  scale_size(range = c(3, 15)) +
  coord_flip() +
  geom_text(aes(label = Value, group = Group),
            position = position_dodge2(width = 1))
Reproducible example
As you did not provide a reproducible example, I made one that is maybe not fully representative of your original dataset. If my answer is not working for you, you should consider providing a reproducible example (see here: How to make a great R reproducible example)
Group <- c(LETTERS[1:3],"A",LETTERS[1:2],LETTERS[1:3])
Year <- c(rep(1918,4),rep(2018,5))
Type <- c(rep("PP",3),"QQ","PP","PP","QQ","QQ","QQ")
Value <- sample(1:50,9)
df <- data.frame(Group, Year, Value, Type)
df$Type <- factor(df$Type, levels = c("PP","QQ"))

How to correct the position of labels on piechart in ggplot. Also tell me how to produce 3D piechart

One of the values in my dataset is zero, and I think that is why I am not able to position the labels correctly in my pie chart.
#Providing you all a sample dataset
Averages <- data.frame(Parameters = c("Cars","Motorbike","Bicycle","Airplane","Ships"), Values = c(15.00,2.81,50.84,51.86,0.00))
mycols <- c("#0073C2FF", "#EFC000FF", "#868686FF", "#CD534CFF","#FF9999")
duty_cycle_pie <- Averages %>% ggplot(aes(x = "", y = Values, fill = Parameters)) +
geom_bar(width = 1, stat = "identity", color = "white") +
coord_polar("y", start = 0)+
geom_text(aes(y = cumsum(Values) - 0.7*Values,label = round(Values*100/sum(Values),2)), color = "white")+
scale_fill_manual(values = mycols)
The labels are not placed correctly. Please also tell me how I can get a 3D pie chart.
Welcome to Stack Overflow. I am happy to help; however, I must note that pie charts are highly debatable and 3D pie charts are considered bad practice.
https://www.darkhorseanalytics.com/blog/salvaging-the-pie
https://en.wikipedia.org/wiki/Misleading_graph#3D_Pie_chart_slice_perspective
Additionally, if the names of your variables reflect your actual dataset (Averages), a pie chart would not be appropriate, as the pieces do not seem to describe parts of a whole. For example, the average value of Bicycle is 50.84 and the average value of Airplane is 51.86. Having these show up as 42% and 43% is confusing; a bar chart would be easier to follow.
Nonetheless, the label placement can be fixed with position_stack().
library(tidyverse)
Averages <-
  data.frame(
    Parameters = c("Cars", "Motorbike", "Bicycle", "Airplane", "Ships"),
    Values = c(15.00, 2.81, 50.84, 51.86, 0.00)
  ) %>%
  mutate(
    # this will ensure the slices go biggest to smallest (a best practice)
    Parameters = fct_reorder(Parameters, Values),
    label = round(Values / sum(Values) * 100, 2)
  )
mycols <- c("#0073C2FF", "#EFC000FF", "#868686FF", "#CD534CFF", "#FF9999")
Averages %>%
  ggplot(aes(x = "", y = Values, fill = Parameters)) +
  geom_bar(width = 1, stat = "identity", color = "white") +
  coord_polar("y", start = 0) +
  geom_text(
    aes(y = Values, label = label),
    color = "black",
    position = position_stack(vjust = 0.5)
  ) +
  scale_fill_manual(values = mycols)
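For intuition, position_stack(vjust = 0.5) places each label halfway up its stacked segment. The base-R sketch below works out those midpoints by hand for the sample data. It assumes the slices are stacked in the order listed, whereas ggplot2 stacks in reverse group order by default, so treat it as an illustration of the arithmetic rather than a drop-in replacement:

```r
# Sample values from the question
values <- c(Cars = 15.00, Motorbike = 2.81, Bicycle = 50.84,
            Airplane = 51.86, Ships = 0.00)

upper <- cumsum(values)      # top edge of each stacked segment
lower <- upper - values      # bottom edge of each stacked segment
mid <- (lower + upper) / 2   # centre of the segment, where the label goes
pct <- round(100 * values / sum(values), 2)  # percentage shown as label

data.frame(lower, upper, mid, pct)
```

By comparison, cumsum(Values) - 0.7 * Values from the question lands 30% of the way into each segment instead of at its centre, and it does not account for the stacking order.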
To move the labels towards the outside of the pie, you can look into ggrepel
https://stackoverflow.com/a/44438500/4650934
For my earlier point, I might try something like this instead of a piechart:
ggplot(Averages, aes(Parameters, Values)) +
  geom_col(aes(y = 100), fill = "grey70") +
  geom_col(fill = "navyblue") +
  coord_flip()

How to modify and add an extra legend in a ggplot2 figure

I have data that looks like this:
example.df <- as.data.frame(matrix( c("height","fruit",0.2,0.4,0.7,
"height","veggies",0.3,0.6,0.8,
"height","exercise",0.1,0.2,0.5,
"bmi","fruit",0.2,0.4,0.6,
"bmi","veggies",0.1,0.5,0.7,
"bmi","exercise",0.4,0.7,0.8,
"IQ","fruit",0.4,0.5,0.6,
"IQ","veggies",0.3,0.5,0.7,
"IQ","exercise",0.1,0.4,0.6),
nrow=9, ncol=5, byrow = TRUE))
colnames(example.df) <- c("phenotype","predictor","corr1","corr2","corr3")
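One thing worth checking with data built this way: matrix() holds a single type, so mixing strings and numbers turns the correlation columns into character (or factor, before R 4.0). That may contribute to odd plotting behaviour, although this is not confirmed in the thread. A minimal sketch of the check and the conversion, using a hypothetical two-row demo.df rather than the full data:

```r
# matrix() coerces mixed input to one type, here character
demo.df <- as.data.frame(matrix(c("height", "fruit", 0.2, 0.4, 0.7,
                                  "bmi", "fruit", 0.2, 0.4, 0.6),
                                nrow = 2, ncol = 5, byrow = TRUE))
colnames(demo.df) <- c("phenotype", "predictor", "corr1", "corr2", "corr3")
sapply(demo.df, class)  # the corr columns are not numeric at this point

# Convert the correlation columns to numeric; going through
# as.character() also handles factor columns from older R versions
corr_cols <- c("corr1", "corr2", "corr3")
demo.df[corr_cols] <- lapply(demo.df[corr_cols],
                             function(x) as.numeric(as.character(x)))
str(demo.df)
```

The same two conversion lines can be applied to example.df before plotting.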
So basically three different correlations between 3x3 variables. I want to visualize the increase in correlations as follows:
ggplot(example.df, aes(x = phenotype, y = corr1, yend = corr3, colour = predictor)) +
  geom_linerange(aes(x = phenotype,
                     ymin = corr1, ymax = corr3,
                     colour = predictor),
                 position = position_dodge(width = 0.5)) +
  geom_point(size = 3,
             aes(x = phenotype, y = corr1, colour = predictor),
             position = position_dodge(width = 0.5), shape = 4) +
  geom_point(size = 3,
             aes(x = phenotype, y = corr2, colour = predictor),
             position = position_dodge(width = 0.5), shape = 18) +
  geom_point(size = 3,
             aes(x = phenotype, y = corr3, colour = predictor),
             position = position_dodge(width = 0.5)) +
  labs(x = NULL, y = NULL,
       title = "Stackoverflow Example Plot") +
  scale_colour_manual(name = "", values = c("#4682B4", "#698B69", "#FF6347")) +
  theme_minimal()
This gives me the following plot:
Problems:
1. There is something wrong with the way the geom_point shapes are dodged for BMI and IQ. They should all sit on the line of the same colour, as they do for height.
2. How do I get an extra legend that shows what the circle, cross, and square represent? (i.e., the three different correlations shown on the line: cross = correlation 1, square = correlation 2, circle = correlation 3).
3. The legend currently shows a line, circle, and cross drawn on top of each other, while just a line for the predictors (exercise, fruit, veggies) would suffice.
Sorry for the multiple issues, but adding the extra legend (problem #2) is the most important one, and I would be already very satisfied if that could be solved, the rest is bonus! :)
See if the following works for you. The main idea is to convert the data frame from wide to long format for the geom_point layer, and map correlation as a shape aesthetic:
library(ggplot2)
library(dplyr)  # provides %>%
example.df %>%
  ggplot(aes(x = phenotype, color = predictor, group = predictor)) +
  geom_linerange(aes(ymin = corr1, ymax = corr3),
                 position = position_dodge(width = 0.5)) +
  geom_point(data = . %>% tidyr::gather(corr, value, -phenotype, -predictor),
             aes(y = value, shape = corr),
             size = 3,
             position = position_dodge(width = 0.5)) +
  scale_color_manual(values = c("#4682B4", "#698B69", "#FF6347")) +
  scale_shape_manual(values = c(4, 18, 16),
                     labels = paste("correlation", 1:3)) +
  labs(x = NULL, y = NULL, color = "", shape = "") +
  theme_minimal()
Note: The colour legend is based on both geom_linerange and geom_point, hence the legend keys include both a line and a point shape. While it's possible to get rid of the second one, it does take some more convoluted code, and I don't think the plot would be much improved as a result...
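To make the reshaping step concrete, here is a base-R sketch of the long format that the geom_point layer receives. The names wide and long are illustrative, and the two-row data frame is a cut-down version of the example data; tidyr::gather() in the answer produces an equivalent table:

```r
# Cut-down wide data: one row per phenotype/predictor pair
wide <- data.frame(phenotype = c("height", "bmi"),
                   predictor = c("fruit", "fruit"),
                   corr1 = c(0.2, 0.2),
                   corr2 = c(0.4, 0.4),
                   corr3 = c(0.7, 0.6))

corr_cols <- c("corr1", "corr2", "corr3")

# Long data: one row per phenotype/predictor/correlation combination.
# unlist() concatenates the columns one after another, so each
# correlation name is repeated once per original row.
long <- data.frame(
  phenotype = rep(wide$phenotype, times = length(corr_cols)),
  predictor = rep(wide$predictor, times = length(corr_cols)),
  corr = rep(corr_cols, each = nrow(wide)),
  value = unlist(wide[corr_cols], use.names = FALSE)
)
long
```

With the correlation name in its own column, it can be mapped to shape, which is what makes ggplot2 draw the second legend.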

ggplot2 - using two different color scales for same fill in overlayed plots

A very similar question to the one asked here. However, in that situation the fill parameter is different for the two plots. In my situation the fill parameter is the same for both plots, but I want different color schemes.
I would like to manually change the color in the boxplots and the scatter plots (for example making the boxes white and the points colored).
Example:
require(dplyr)
require(ggplot2)
n <- 4 * 3 * 10
myvalues <- rexp(n)
days <- ntile(rexp(n), 4)
doses <- ntile(rexp(n), 3)
test <- data.frame(values = myvalues,
                   day = factor(days, levels = unique(days)),
                   dose = factor(doses, levels = unique(doses)))
p <- ggplot(data = test, aes(x = day, y = values)) +
  geom_boxplot(aes(fill = dose)) +
  geom_point(aes(fill = dose), alpha = 0.4,
             position = position_jitterdodge())
produces a plot like this:
Using 'scale_fill_manual()' overwrites the aesthetic on both the boxplot and the scatterplot.
I have found a hack by adding 'colour' to geom_point and then when I use scale_fill_manual() the scatter point colors are not changed:
p <- ggplot(data = test, aes(x = day, y = values)) +
  geom_boxplot(aes(fill = dose), outlier.shape = NA) +
  geom_point(aes(fill = dose, colour = factor(test$dose)),
             position = position_jitterdodge(jitter.width = 0.1)) +
  scale_fill_manual(values = c('white', 'white', 'white'))
Are there more efficient ways of getting the same result?
You can use group to set the different boxplots. No need to set the fill and then overwrite it:
ggplot(data = test, aes(x = day, y = values)) +
  geom_boxplot(aes(group = interaction(day, dose)), outlier.shape = NA) +
  geom_point(aes(fill = dose, colour = dose),
             position = position_jitterdodge(jitter.width = 0.1))
And you should never use data$column inside aes - just use the bare column. Using data$column will work in simple cases, but will break whenever there are stat layers or facets.
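To see what group = interaction(day, dose) does, it helps to look at the combined factor directly. A small base-R sketch with made-up balanced data (the day and dose vectors here are illustrative, not the question's random data):

```r
# Illustrative balanced design: 4 days x 3 doses
day  <- factor(rep(1:4, each = 3))
dose <- factor(rep(1:3, times = 4))

# interaction() builds one factor with a level per day/dose combination
combo <- interaction(day, dose)
nlevels(combo)       # number of distinct day/dose combinations
head(levels(combo))  # levels are named "<day>.<dose>"
table(combo)         # observations per combination
```

Each level of the combined factor becomes its own box, so the boxes are split by dose within each day without mapping fill at all.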

R geom_bar with overlay and no distance between bars

I'm trying to have this plot:
library(ggplot2)
testdata <- data.frame(x=c(1:5),a=c(1:5),b=c(10:6))
ggplot(testdata, aes(x = x)) +
geom_bar(aes(y=b), stat = "identity", fill="darkgrey")+
geom_bar(aes(y=a), linetype="solid", colour="black", stat = "identity", fill=NA)
with a legend. Since I can't get a legend going here (this would be a nice workaround if you know how), I tried to approach the 'correct' way to plot this in ggplot, namely with long data such as:
testdata <- data.frame(x = c(1:5,1:5), y = c(1:5,10:6), group = c(rep("a",5), rep("b",5)))
ggplot(testdata, aes(x = x, y = y, group = group, fill = group)) +
geom_bar(stat = "identity", linetype = "solid", colour = "black", position = position_dodge(width = 0))+
scale_fill_manual(values = c(NA, "darkgrey"))
While I do have a legend here, the bars are much further apart. The usual parameter to change this is the width inside position_dodge, but I need that equal to 0 for the 100% overlay. So my question in the ideal world is: can I decrease the distance between the bars in the second plot? If this is not feasible, can I add a legend to the first plot? Any help would be appreciated!
Try changing the width outside position_dodge.
ggplot(testdata, aes(x = x, y = y, group = group, fill = group)) +
  geom_bar(width = 1.5, stat = "identity", linetype = "solid", colour = "black",
           position = position_dodge(width = 0)) +
  scale_fill_manual(values = c(NA, "darkgrey"))
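A rough sketch of the arithmetic behind this, assuming position_dodge() divides the bar width set in geom_bar() evenly among the groups at each x and uses its own width argument only for the sideways offset (this matches ggplot2's behaviour as far as I can tell, but is worth checking against your version). dodged_bar_width is a hypothetical helper for illustration only:

```r
# Hypothetical helper: drawn width of each bar after dodging,
# assuming the geom width is split evenly among the groups
dodged_bar_width <- function(bar_width, n_groups) {
  bar_width / n_groups
}

default_w <- dodged_bar_width(0.9, 2)  # geom_bar's default width of 0.9
answer_w  <- dodged_bar_width(1.5, 2)  # width used in the answer
touching  <- dodged_bar_width(2.0, 2)  # width at which neighbours would touch

c(default = default_w, answer = answer_w, touching = touching)
```

Under that assumption, the default leaves each overlaid bar less than half a unit wide, which would explain the large gaps when x values are one unit apart; raising the width in geom_bar() narrows them.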
