How to modify and add an extra legend in a ggplot2 figure - r

I have data that looks like this:
example.df <- as.data.frame(matrix( c("height","fruit",0.2,0.4,0.7,
"height","veggies",0.3,0.6,0.8,
"height","exercise",0.1,0.2,0.5,
"bmi","fruit",0.2,0.4,0.6,
"bmi","veggies",0.1,0.5,0.7,
"bmi","exercise",0.4,0.7,0.8,
"IQ","fruit",0.4,0.5,0.6,
"IQ","veggies",0.3,0.5,0.7,
"IQ","exercise",0.1,0.4,0.6),
nrow=9, ncol=5, byrow = TRUE))
colnames(example.df) <- c("phenotype","predictor","corr1","corr2","corr3")
So basically three different correlations between 3x3 variables. I want to visualize the increase in correlations as follows:
ggplot(example.df, aes(x=phenotype, y=corr1, yend=corr3, colour = predictor)) +
geom_linerange(aes(x = phenotype,
ymin = corr1, ymax = corr3,
colour = predictor),
position = position_dodge(width = 0.5))+
geom_point(size = 3,
aes(x = phenotype, y = corr1, colour = predictor),
position = position_dodge(width = 0.5), shape=4)+
geom_point(size = 3,
aes(x = phenotype, y = corr2, colour = predictor),
position = position_dodge(width = 0.5), shape=18)+
geom_point(size = 3,
aes(x = phenotype, y = corr3, colour = predictor),
position = position_dodge(width = 0.5))+
labs(x=NULL, y=NULL,
title="Stackoverflow Example Plot")+
scale_colour_manual(name="", values=c("#4682B4", "#698B69", "#FF6347"))+
theme_minimal()
This gives me the following plot:
Problems:
Tthere is something wrong with the way the geom_point shapes are dodged with BMI and IQ. They should be all with on the line with the same colour, like with height.
How do I get an extra legend that can show what the circle, cross, and square represent? (i.e., the three different correlations shown on the line: cross = correlation 1, square = correlation 2, circle = correlation 3).
The legend now shows a line, circle, cross through each other, while just a line for the predictors (exercise, fruit, veggies) would suffice..
Sorry for the multiple issues, but adding the extra legend (problem #2) is the most important one, and I would be already very satisfied if that could be solved, the rest is bonus! :)

See if the following works for you? The main idea is to convert the data frame from wide to long format for the geom_point layer, and map correlation as a shape aesthetic:
example.df %>%
ggplot(aes(x = phenotype, color = predictor, group = predictor)) +
geom_linerange(aes(ymin = corr1, ymax = corr3),
position = position_dodge(width = 0.5)) +
geom_point(data = . %>% tidyr::gather(corr, value, -phenotype, -predictor),
aes(y = value, shape = corr),
size = 3,
position = position_dodge(width = 0.5)) +
scale_color_manual(values = c("#4682B4", "#698B69", "#FF6347")) +
scale_shape_manual(values = c(4, 18, 16),
labels = paste("correlation", 1:3)) +
labs(x = NULL, y = NULL, color = "", shape = "") +
theme_minimal()
Note: The colour legend is based on both geom_linerange and geom_point, hence the legend keys include both a line and a point shape. While it's possible to get rid of the second one, it does take some more convoluted code, and I don't think the plot would be much improved as a result...

Related

Problem with jittered data points in geom_boxplot

I have created a boxplot using the following code -
ggplot(xray50g, aes(x = Company, y = DefScore, label = Batch,
label2 = PercentPopAff, label3 = AvVertAff,
label4 = EggsPerLitreReceiving)) +
geom_boxplot() +
geom_point(aes(colour = Ploidy), size = 0.5) +
geom_jitter() +
# USE ENVSTATS PACKAGE TO INCLUDE SAMPLE SIZE
stat_n_text(size = 3) +
# INCLUDE MEAN VALUES
stat_summary(fun = mean, geom = "point", shape = 4, size = 2, color = "black") +
stat_summary(fun = mean, colour = "black", geom = "text", size = 3, show.legend = FALSE,
hjust = -0.35, vjust = -0.5, aes( label = round(..y.., digits = 2)))
I wanted to spread the data points out a little; however, when I use geom_jitter it seems to blur all the data points together and ruin the chart (see image).
Any help with this would be greatly appreciated.
You can use the width argument of geom_jitter to control how much the points are spread along the x-axis. I'd also recommend making the jittered points transparent (alpha argument) and to stop geom_boxplot from plotting the outliers with the outlier.shape argument (as those points also will be plotted by the jitter layer). Try the following:
ggplot(xray50g, aes(x = Company, y = DefScore, label = Batch,
label2 = PercentPopAff, label3 = AvVertAff,
label4 = EggsPerLitreReceiving)) +
geom_boxplot(outlier.shape = NA) +
geom_jitter(alpha = 0.25, width = 0.1)

Manually plotting significance relations between sub-groups on ggplot2 barplot

I've been trying to plot manually labelled significance bars for a subset of groups on a ggplot2 barplot using ggsignif or ggpubr without much luck. The data is something like the following MWE:
set.seed(3)
## create data
df <- data.frame(activity = rep(c("Flying", "Jumping"), 3),
mean = rep(rnorm(6, 50, 25)),
group = c(rep("Ecuador", 2),
rep("Peru", 2),
rep("Brazil", 2)))
## plot it
ggplot(df, aes(x = activity, y = mean, fill = group)) +
geom_bar(position = position_dodge(0.9), stat = "identity",
width = 0.9, colour = "black", size = 0.1) +
xlab("Activity") + ylab("Mean")
Where I'd like to manually specify significance labels, say between Brazil/Ecuador" on "Flying", and Ecuador/Peru on "Jumping". Does anyone know how to properly deal with this kind of data, for example with ggsignif? And is there a way to refer to each bar by name, rather than try to work out its x-axis position?
If you know on which barchart you want to add your significance labels, you can do:
library(ggsignif)
library(ggplot2)
ggplot(df, aes(x = activity, y = mean, fill = group)) +
geom_bar(position = position_dodge(0.9), stat = "identity",
width = 0.9, colour = "black", size = 0.1) +
xlab("Activity") + ylab("Mean")+
geom_signif(y_position = c(60,50), xmin = c(0.7,2), xmax = c(1,2.3),
annotation=c("**", "***"), tip_length=0)
Does it answer your question ?

How to avoid over lapping bubbles in bubble plot?

I want to separately plot data in a bubble plot like the image right (I make this in PowerPoint just to visualize).
At the moment I can only create a plot that looks like in the left where the bubble are overlapping. How can I do this in R?
b <- ggplot(df, aes(x = Year, y = Type))
b + geom_point(aes(color = Spp, size = value), alpha = 0.6) +
scale_color_manual(values = c("#0000FF", "#DAA520", "#228B22","#E7B888")) +
scale_size(range = c(0.5, 12))
You can have the use of position_dodge() argument in your geom_point. If you apply it directly on your code, it will position points in an horizontal manner, so the idea is to switch your x and y variables and use coord_flip to get it in the right way:
library(ggplot2)
ggplot(df, aes(y = as.factor(Year), x = Type))+
geom_point(aes(color = Group, size = Value), alpha = 0.6, position = position_dodge(0.9)) +
scale_color_manual(values = c("#0000FF", "#DAA520", "#228B22","#E7B888")) +
scale_size(range = c(1, 15)) +
coord_flip()
Does it look what you are trying to achieve ?
EDIT: Adding text in the middle of each points
To add labeling into each point, you can use geom_text and set the same position_dodge2 argument than for geom_point.
NB: I use position_dodge2 instead of position_dodge and slightly change values of width because I found position_dodge2 more adapted to this case.
library(ggplot2)
ggplot(df, aes(y = as.factor(Year), x = Type))+
geom_point(aes(color = Group, size = Value), alpha = 0.6,
position = position_dodge2(width = 1)) +
scale_color_manual(values = c("#0000FF", "#DAA520", "#228B22","#E7B888")) +
scale_size(range = c(3, 15)) +
coord_flip()+
geom_text(aes(label = Value, group = Group),
position = position_dodge2(width = 1))
Reproducible example
As you did not provide a reproducible example, I made one that is maybe not fully representative of your original dataset. If my answer is not working for you, you should consider providing a reproducible example (see here: How to make a great R reproducible example)
Group <- c(LETTERS[1:3],"A",LETTERS[1:2],LETTERS[1:3])
Year <- c(rep(1918,4),rep(2018,5))
Type <- c(rep("PP",3),"QQ","PP","PP","QQ","QQ","QQ")
Value <- sample(1:50,9)
df <- data.frame(Group, Year, Value, Type)
df$Type <- factor(df$Type, levels = c("PP","QQ"))

How to correct the position of labels on piechart in ggplot. Also tell me how produce 3D piechart

One of the value in my dataset is zero, I think because of that I am not able to adjust labels correctly in my pie chart.
#Providing you all a sample dataset
Averages <- data.frame(Parameters = c("Cars","Motorbike","Bicycle","Airplane","Ships"), Values = c(15.00,2.81,50.84,51.86,0.00))
mycols <- c("#0073C2FF", "#EFC000FF", "#868686FF", "#CD534CFF","#FF9999")
duty_cycle_pie <- Averages %>% ggplot(aes(x = "", y = Values, fill = Parameters)) +
geom_bar(width = 1, stat = "identity", color = "white") +
coord_polar("y", start = 0)+
geom_text(aes(y = cumsum(Values) - 0.7*Values,label = round(Values*100/sum(Values),2)), color = "white")+
scale_fill_manual(values = mycols)
Labels are not placed in the correct way. Please tell me how can get 3D piechart.
Welcome to stackoverflow. I am happy to help, however, I must note that piecharts are highly debatable and 3D piecharts are considered bad practice.
https://www.darkhorseanalytics.com/blog/salvaging-the-pie
https://en.wikipedia.org/wiki/Misleading_graph#3D_Pie_chart_slice_perspective
Additionally, if the names of your variables reflect your actual dataset (Averages), a piechart would not be appropriate as the pieces do not seem to be describing parts of a whole. Ex: avg value of Bicycle is 50.84 and avg value of Airplane is 51.86. Having these result in 43% and 42% is confusing; a barchart would be easier to follow.
Nonetheless, the answer to your question about placement can be solved with position_stack().
library(tidyverse)
Averages <-
data.frame(
Parameters = c("Cars","Motorbike","Bicycle","Airplane","Ships"),
Values = c(15.00,2.81,50.84,51.86,0.00)
) %>%
mutate(
# this will ensure the slices go biggest to smallest (a best practice)
Parameters = fct_reorder(Parameters, Values),
label = round(Values/sum(Values) * 100, 2)
)
mycols <- c("#0073C2FF", "#EFC000FF", "#868686FF", "#CD534CFF","#FF9999")
Averages %>%
ggplot(aes(x = "", y = Values, fill = Parameters)) +
geom_bar(width = 1, stat = "identity", color = "white") +
coord_polar("y", start = 0) +
geom_text(
aes(y = Values, label = label),
color = "black",
position = position_stack(vjust = 0.5)
) +
scale_fill_manual(values = mycols)
To move the pieces towards the outside of the pie, you can look into ggrepel
https://stackoverflow.com/a/44438500/4650934
For my earlier point, I might try something like this instead of a piechart:
ggplot(Averages, aes(Parameters, Values)) +
geom_col(aes(y = 100), fill = "grey70") +
geom_col(fill = "navyblue") +
coord_flip()

How do I add a legend identifying the different geom_points in RStudio

I need help:
I cannot seem to add a legend to the following piece of code for ggplot R STDUIO
ggplot(Report_Data,
aes(x=Report_Data$Transect Point), show.legend = TRUE) +
geom_point(aes(y=Report_Data$Q1North),
shape = 6, size = 5, colour = label , show.legend = TRUE) +
geom_point(aes(y=Report_Data$Q1South),
shape = 4, size = 5, colour = label, show.legend = TRUE)+
labs(title="Density of Trees Species found North & South of the creek using two sampling methods",
y="Density in Tree Species Found", x="Transect Points",caption = "n7180853")+
geom_line(aes(y=Report_Data$Q1North, colour = Q1North),
colour = "green", size = 1, show.legend = TRUE)+
geom_line(aes(y=Report_Data$Q1South, color = Q1South),
colour = "pink4", size = 1, show.legend = TRUE)+ theme(legend.position = "right")
Hard to replicate without your data, but if you want to have your plot to have a legend for point shape, you need to include it within aes(...) for that layer. Similar with colour, or size, if you want. Then, add scale_shape_manual(...) or scale_colour_manual(...) as needed, specifying your specific values.
Here's a toy example using the default diamonds dataset.
data("diamonds")
ggplot(diamonds, aes(x = carat)) +
geom_point(data = diamonds[diamonds$cut == "Ideal",],
aes(y= price, shape = cut)) +
geom_point(data = diamonds[diamonds$cut == "Good",],
aes(y = price, shape = cut)) +
scale_shape_manual(values = c(6,4))

Resources