Modifying legend in ggplot2 - r

I'm seeking some assistance with modifying the legend in my plot using the data below.
dput(df)
structure(list(Week.Number = 1:16, Dist.18 = c(5331.83038, 14084.08602,
12219.423585, 14406.407445, 5032.74848, 10820.094835, 16935.546075,
15387.590625, 16195.21247, 20012.09881, 14057.385255, 5127.14891,
16241.98523, 12793.21837, 10526.785375, 6014.43878), HIR.18 = c(1098.56001,
4093.010015, 4372.84498, 4074.22002, 709.70499, 2460.04999, 5037.77501,
5521.029965, 5463.410025, 6761.34502, 3953.20997, 1189.89, 3663.69006,
2333.005005, 2289.38001, 1069.740005), V6.18 = c(0, 40.77, 63.505,
112.63, 52.395, 56.795, 211.115, 75.52, 215.059995, 121.725,
57.64, 15.35, 140.34, 15.615, 85.66, 31.815), Dist.17 = c(11820.06249,
18123.592835, 14560.30914, 17193.56009, 7733.785765, 15536.659865,
8694.08218, 19569.060865, 14153.71578, 18498.63446, 16452.63166,
16820.32351, 9242.407875, 8857.62039, 2371.09375, 10340.258575
), HIR.17 = c(2693.425035, 4971.474985, 4521.895065, 5561.53997,
1759.31996, 3924.48, 1893.485, 5571.700035, 3239.94503, 4773.02004,
5927.174995, 4537.58996, 1618.49499, 2771.84002, 284.56, 2181.749995
), V6.17 = c(15.58, 38.355, 240.355, 354.059995, 1.76, 187.575,
93.495, 184.925, 88.27, 165.08, 231.075, 171.09, 32.55, 93.88,
0, 56.19)), .Names = c("Week.Number", "Dist.18", "HIR.18", "V6.18",
"Dist.17", "HIR.17", "V6.17"), row.names = c(NA, -16L), class = "data.frame")
This code generates the plot.
plot <- ggplot(df, aes(x = Week.Number, y = Dist.18, fill = "2018")) +
geom_col() +
geom_line(aes(x = Week.Number, y = Dist.17, fill = "2017"), size = 0.75) +
geom_point(aes(x = Week.Number, y = Dist.17), size = 0.75) +
scale_fill_manual("color", values = c("2017" = "black", "2018" = "blue")) +
scale_x_continuous(breaks = c(1:16)) +
ylab("Dist") +
theme_classic() +
theme(plot.title = element_text(face = "bold"),
axis.title.x = element_text(face = "bold"),
axis.title.y = element_text(face = "bold"))
I wish to change the title of the legend to "Season" and modify the key. I'm wondering if it's possible to have two different points in the key. For example, a solid blue square for the label 2018 and a black line for 2017, representing each geom in the plot.
Also, I used fill = in the aes() argument to generate a legend in the first instance. This seems to work, but I'm not sure whether it's best practice.
Hope I've provided enough information. Any help will be greatly appreciated. Thank you.

As per my comment above, one legend is created for each aesthetic, and you currently have only the fill aesthetic. If you want more than one legend, you need to map several aesthetics, here e.g. linetype or color.
There are some problems with your code, though.
First, to make full use of ggplot's aesthetics and grouping, I would recommend putting your data in long format; currently it is in wide format. For example, it might make sense to group by year: put all values that belong to one measurement into one column, add a column specifying the year, and then map an aesthetic to this year column.
Furthermore, see the comments in the code below.
ggplot(df) +
# avoid specifying your `aes` in the ggplot main call -
# specially if you have several plots following.
# Some people say it's even better to leave it completely empty.
geom_col(aes(x = Week.Number, y = Dist.18, fill = "2018")) +
# now here you are currently not really making use of the aes-functionality,
# because you are only creating an aesthetic for one value, i.e. '2018'
geom_line(aes(x = Week.Number, y = Dist.17, color = "2017"), size = 0.75) +
# Here I have changed fill to color
geom_point(aes(x = Week.Number, y = Dist.17), size = 0.75) +
scale_fill_manual("your title", values = c("2017" = "black", "2018" = "blue")) +
# this is to show you that you actually already know
# how to change your legend title - see the graph :)
scale_x_continuous(breaks = c(1:16)) +
ylab ("Dist") +
theme_classic()
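The long-format suggestion above could look roughly like the sketch below. It assumes the tidyr package is available and uses a small made-up stand-in (df_wide) for the question's data; the dodged bars are just one possible way to use the reshaped data, not the plot from the question.

```r
library(ggplot2)
library(tidyr)

# Toy stand-in for the question's df (values are made up for illustration)
df_wide <- data.frame(
  Week.Number = 1:4,
  Dist.18 = c(5300, 14000, 12200, 14400),
  Dist.17 = c(11800, 18100, 14500, 17200)
)

# Split "Dist.18" into a measurement name ("Dist") and a season suffix ("18"),
# giving one row per week and season
df_long <- pivot_longer(
  df_wide,
  cols = c(Dist.18, Dist.17),
  names_to = c(".value", "Season"),
  names_sep = "\\."
)
df_long$Season <- paste0("20", df_long$Season)

# A single fill aesthetic mapped to the Season column yields one legend
p <- ggplot(df_long, aes(x = Week.Number, y = Dist, fill = Season)) +
  geom_col(position = "dodge") +
  scale_fill_manual("Season", values = c("2017" = "black", "2018" = "blue"))
```

With the data in this shape, the legend title and colours are controlled in one place, the scale for the Season column.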

I guess it would be nice to have one title for both legends:
ggplot(df, aes(x = Week.Number)) +
geom_col(aes(y = Dist.18, fill = "2018")) +
geom_line(aes(y = Dist.17, col = "2017"), size = 0.75) +
geom_point(aes(y = Dist.17, col = "2017")) +
scale_colour_manual("Season", values = c("2017" = "black")) +
scale_fill_manual("", values = c("2018" = "blue")) +
scale_x_continuous(breaks = c(1:16)) +
ylab ("Dist") +
theme_classic() +
theme(plot.title = element_text(face = "bold"),
axis.title.x = element_text(face = "bold"),
axis.title.y = element_text(face = "bold")) +
theme(legend.margin = margin(-0.8, 0, 0, 0, unit = "cm"))
If you do not want a point in the legend key, just remove col = "2017" from geom_point.
The trick is to remove the space between the two legends with the legend.margin argument in theme().

Related

Combine legend for fill and colour ggplot to give only single legend

I am plotting a smooth of my data using geom_smooth and using geom_ribbon to plot shaded confidence intervals for this smooth. No matter what I try, I cannot get a single legend that represents both the smooth and the ribbon correctly, i.e. I want a single legend that has the correct colours and labels for both the smooth and the ribbon. I have tried using + guides(fill = FALSE) and guides(colour = FALSE). I also read that giving both colour and fill the same label inside labs() should produce a single unified legend.
Any help would be much appreciated.
Note that I have also tried to reset the legend labels and colours using scale_colour_manual()
The code below produces the figure below. Note that there are two curves here that essentially overlap. The relabelling and colour setting worked for the geom_smooth legend but not for the geom_ribbon legend, and I still have two legends showing, which is not what I want.
ggplot(pred.dat, aes(x = age.x, y = fit, colour = tagged)) +
geom_smooth(size = 1.2) +
geom_ribbon(aes(ymin = lci, ymax = uci, fill = tagged), alpha = 0.2, colour = NA) +
theme_classic() +
labs(x = "Age (days since hatch)", y = "Body mass (g)", colour = "", fill = "") +
scale_colour_manual(labels = c("Untagged", "Tagged"), values = c("#3399FF", "#FF0033")) +
theme(axis.title.x = element_text(face = "bold", size = 14),
axis.title.y = element_text(face = "bold", size = 14),
axis.text.x = element_text(size = 12),
axis.text.y = element_text(size = 12),
legend.text = element_text(size = 12))
The problem is that you provide new labels for the color-aesthetic but not for the fill-aesthetic. Consequently ggplot shows two legends because the labels are different.
You can either also provide the same labels for the fill-aesthetic (code option #1 below) or you can set the labels for the levels of your grouping variable ("tagged") before calling ggplot (code option #2).
library(ggplot2)
#make some data
x = seq(0,2*pi, by = 0.01)
pred.dat <- data.frame(x = c(x,x),
y = c(sin(x), cos(x)) + rnorm(length(x) * 2, 0, 1),
tag = rep(0:1, each = length(x)))
pred.dat$lci <- c(sin(x), cos(x)) - 0.4
pred.dat$uci <- c(sin(x), cos(x)) + 0.4
#option 1: set labels within ggplot call
pred.dat$tagged <- as.factor(pred.dat$tag)
ggplot(pred.dat, aes(x = x, y = y, color = tagged, fill = tagged)) +
geom_smooth(size = 1.2) +
geom_ribbon(aes(ymin = lci, ymax = uci), alpha = 0.2, color = NA) +
scale_color_manual(labels = c("untagged", "tagged"), values = c("#F8766D", "#00BFC4")) +
scale_fill_manual(labels = c("untagged", "tagged"), values = c("#F8766D", "#00BFC4")) +
theme_classic() + theme(legend.title = element_blank())
#option 2: set labels before ggplot call
pred.dat$tagged <- factor(pred.dat$tag, levels = 0:1, labels = c("untagged", "tagged"))
ggplot(pred.dat, aes(x = x, y = y, color = tagged, fill = tagged)) +
geom_smooth(size = 1.2) +
geom_ribbon(aes(ymin = lci, ymax = uci), alpha = 0.2, color = NA) +
theme_classic() + theme(legend.title = element_blank())

change the color of 20% of bars in geom_bar ggplot

I'm trying to change the color of 9 states in the following image. Those states are top mining states and I want them to stand out in the image attached below. I probably need to modify my dataframe as the easiest step. But any other ideas?
ggplot(data = media_impact_by_state) +
#geom_hline(yintercept=0,linetype="dashed", color = "red") +
geom_bar(aes(x= reorder(GeoName,trustclimsciSSTOppose - mean(trustclimsciSSTOppose)),
y= CO2limitsOppose-mean(CO2limitsOppose), fill = "fill1"),
stat = 'identity') +
geom_point(aes(x = GeoName,
y = trustclimsciSSTOppose - mean(trustclimsciSSTOppose),
color = "dot1"),
size=3) +
scale_color_manual(values = c("black"),
label = "Distrust of Scientists",
name = "Mean Deviation") +
scale_fill_manual(values = c(fill1 = "darkorange1",fill2 = "blue"),
labels = c(fill1 = "Oppose Limits to Co2 Emissions",fill2 = "poop"),
name = "Mean Deviation") +
labs(x = "State",
y = "(%)",
title = "Distrust of Scientists") +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1,size=12),
axis.text.y = element_text(size=14),
axis.title.y = element_text(size=16),
axis.title.x = element_text(size=16),
plot.title = element_text(size=16,hjust=0.5))
It will be difficult to offer guidance without seeing a subset of your data. To offer some suggestions, try amending the appropriate column(s) (i.e., variables) using ifelse() before feeding it to the fill aesthetic. Make sure this is wrapped inside of the aes() call. Your legend titled "Mean Deviation" should appropriately split into two categories. Then, simply amend the colors inside of scale_fill_manual() as needed.
ggplot(data = media_impact_by_state) +
geom_bar(aes(x = reorder(GeoName, trustclimsciSSTOppose - mean(trustclimsciSSTOppose)),
y = CO2limitsOppose - mean(CO2limitsOppose),
fill = factor(ifelse(GeoName %in% c(...), "Top 20", "Bottom 80"),
levels = c("Top 20", "Bottom 80"))), # index the states; fix the level order
stat = 'identity') +
geom_point(aes(x = GeoName,
y = trustclimsciSSTOppose - mean(trustclimsciSSTOppose),
color = "dot1"),
size = 3) +
scale_color_manual(name = "Mean Deviation",
values = c("black"),
labels = "Distrust of Scientists") +
scale_fill_manual(name = "Mean Deviation",
values = c("darkorange1", # colors, in the order of the factor levels
"blue"),
labels = c("Oppose (Top 20)", # labels, in the same order
"Oppose (Bottom 80)")) +
labs(x = "State",
y = "(%)",
title = "Distrust of Scientists") +
theme(
axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1, size = 12),
axis.text.y = element_text(size = 14),
axis.title.y = element_text(size = 16),
axis.title.x = element_text(size = 16),
plot.title = element_text(size = 16, hjust = 0.5)
)
However, if you want to flag the top 20 percent of states by some other measure of mining output, consider adding a flag column to the existing data frame first. I'm not sure what standard you are using to determine the "top" mining states, but that is for you to decide. For example, create a variable ahead of time, call it fill_col, and pass it to fill inside the aes() call. Here is how you could pre-process the data with dplyr (mining_output is a placeholder for whatever column holds your measure):
library(dplyr)
media_impact_by_state <- media_impact_by_state %>%
arrange(desc(mining_output)) %>% # sort in descending order by mining output
mutate(fill_col = mining_output > quantile(mining_output, .8)) # flag the top 20 percent
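As a rough, self-contained illustration of the pre-processing idea, the sketch below uses a made-up data frame (toy) with hypothetical columns mining_output and deviation; none of these values come from the question's data.

```r
library(dplyr)
library(ggplot2)

# Made-up data: 10 hypothetical states with an invented mining measure
toy <- data.frame(
  GeoName = paste("State", LETTERS[1:10]),
  mining_output = c(5, 1, 9, 3, 10, 2, 8, 4, 7, 6),
  deviation = c(-2, 1, 3, -1, 4, 0.5, 2, -3, 1.5, -0.5)
)

# Flag the states above the 80th percentile of mining_output
toy <- toy %>%
  mutate(fill_col = mining_output > quantile(mining_output, 0.8))

# Map the logical flag to fill and give each value its own colour and label
p <- ggplot(toy) +
  geom_col(aes(x = reorder(GeoName, deviation), y = deviation, fill = fill_col)) +
  scale_fill_manual(name = "Mean Deviation",
                    breaks = c("TRUE", "FALSE"),
                    values = c("TRUE" = "darkorange1", "FALSE" = "blue"),
                    labels = c("Top 20% mining states", "Other states"))
```

Naming the values by "TRUE" and "FALSE" keeps the colour assignment independent of the order in which the flag values appear in the data.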
In the end, there's nothing wrong with manually typing in all the states that you want to highlight, though it is harder on the eyes and could become unwieldy if you had more than 50 states (or 51 if you included the District of Columbia).
I hope this helps!

Putting horizontal lines on grouped boxplots

I am trying to make a boxplot with this basic code:
design=c("Red","Green","Blue")
actions=c("1","2","3","4","5","6","7","8")
proportion=(seq(1:240)+sample(1:500, 240, replace=T))/2000
df=data.frame(design, actions , proportion)
ggplot(df, aes(x=actions, y=proportion, fill=design)) +
geom_boxplot()+
xlab(TeX("group"))+
ylab("Y value")+
ggtitle("Y values for each group stratified by color")
Producing something like this:
I want to add horizontal lines for "true" Y values that are different for each group.
Does anyone have any tips for doing this? I don't know how to extract the width of each group of boxes, otherwise I could use geom_segment.
Here is a MWE with a non-grouped boxplot:
dBox <- data.frame(y = rnorm(10),group="1")
dBox=rbind(dBox,data.frame(y=rnorm(10),group="2"))
dLines <- data.frame(X =c(-0.36, 0.015),
Y = c(0.4, -0.2),
Xend = c(0.-0.015, 0.36),
Yend=c(0.4, -0.2),
group = c("True", "True"),
color = c("black", "red"))
ggplot(dBox, aes(x=0, y=y,fill=group)) +
geom_boxplot(outlier.shape = 1)+
geom_segment(data = dLines, aes(x = X, xend = Xend, y = Y, yend = Yend),color="red",size=1.5,linetype=1) +
theme(legend.background = element_rect(fill = "white", size = 0.1, linetype = "solid", colour = "black"))
This produces something like this:
However, it's difficult to make the geom_segments line up with the boxes exactly, and to then extend this to the grouped boxplot setting.
Thanks!
This can be done using a workaround with facets:
design=c("Red","Green","Blue")
actions=c("1","2","3","4","5","6","7","8")
proportion=(seq(1:240)+sample(1:500, 240, replace=T))/2000
df=data.frame(design, actions , proportion)
lines = data.frame(actions = 1:8, proportion=abs(rnorm(8)))
p = ggplot(df, aes(x=actions, y=proportion, fill=design)) +
geom_boxplot()+
xlab("group")+
ylab("Y value")+
ggtitle("Y values for each group stratified by color") +
facet_grid(~actions, scale='free_x') +
theme(
panel.spacing.x = unit(0, "lines"),
strip.background = element_blank(),
strip.text.x = element_blank())
p + geom_hline(aes(yintercept = proportion), lines)
You could probably fiddle around with removing the spaces between the facets to make it look more like what you intended.
Thanks to @eugene100hickey for pointing out how to remove spacing between facets.
theme(panel.spacing.x) can remove those pesky lines:
p + geom_hline(aes(yintercept = proportion), lines) +
theme(panel.spacing.x = unit(0, "lines"))
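A possible alternative without facets is geom_segment with numeric x positions, as sketched below. It relies on two assumptions that should be checked against your ggplot2 version: that a discrete x scale places the groups at 1, 2, 3, ..., and that a group of boxes spans about 0.375 on either side of that position (half_width below). The data and the truth values are made up.

```r
library(ggplot2)

set.seed(1)
# Made-up data: 8 groups, 3 designs, 10 observations each
df <- data.frame(
  design = rep(c("Red", "Green", "Blue"), times = 80),
  actions = factor(rep(1:8, each = 30)),
  proportion = runif(240)
)

# Hypothetical "true" value for each group
truth <- data.frame(
  actions = factor(1:8),
  proportion = seq(0.2, 0.9, by = 0.1)
)

half_width <- 0.375  # assumed half-width of one group of boxes

p <- ggplot(df, aes(x = actions, y = proportion)) +
  geom_boxplot(aes(fill = design)) +
  # The discrete layer comes first; the segments then use numeric positions
  geom_segment(data = truth,
               aes(x = as.numeric(actions) - half_width,
                   xend = as.numeric(actions) + half_width,
                   y = proportion, yend = proportion),
               colour = "red")
```

If the boxes are drawn with a non-default width, half_width would need to be adjusted to match.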

Adding legend for combo bar and line graph -- ggplot ignoring commands

I am trying to make a bar chart with line plots as well. The graph is created fine, but the line plots do not show up in the legend.
I have tried many different ways of adding them to the legend, including:
ggplot Legend Bar and Line in Same Graph
None of these have worked. show.legend also seems to be ignored in geom_line.
My code to create the graph is as follows:
ggplot(first_q, aes(fill = Segments)) +
geom_bar(aes(x= Segments, y= number_of_new_customers), stat =
"identity") + theme(axis.text.x = element_blank()) +
scale_y_continuous(expand = c(0, 0), limits = c(0,3000)) +
ylab('Number of Customers') + xlab('Segments') +
ggtitle('Number Customers in Q1 by Segments') +theme(plot.title =
element_text(hjust = 0.5)) +
geom_line(aes(x= Segments, y=count) ,stat="identity",
group = 1, size = 1.5, colour = "darkred", alpha = 0.9, show.legend =
TRUE) +
geom_line(aes(x= Segments, y=bond_count)
,stat="identity", group = 1, size = 1.5, colour = "blue", alpha =
0.9) +
geom_line(aes(x= Segments, y=variable_count)
,stat="identity", group = 1, size = 1.5, colour = "darkgreen",
alpha = 0.9) +
geom_line(aes(x= Segments, y=children_count)
,stat="identity", group = 1, size = 1.5, colour = "orange", alpha
= 0.9) +
guides(fill=guide_legend(title="Segments")) +
scale_color_discrete(name = "Prod", labels = c("count", "bond_count", "variable_count", "children_count"))
I am fairly new to R so if any further information is required or if this question could be better represented then please let me know.
Any help is greatly appreciated.
Alright, you need to remove a little bit of your stuff. I used the mtcars dataset, since you did not provide yours. I tried to keep your variable names and reduced the plot to necessary parts. The code is as follows:
first_q <- mtcars
first_q$Segments <- mtcars$mpg
first_q$val <- seq(1,nrow(mtcars))
first_q$number_of_new_costumers <- mtcars$hp
first_q$type <- "Line"
ggplot(first_q) +
geom_bar(aes(x= Segments, y= number_of_new_costumers, fill = "Bar"), stat =
"identity") + theme(axis.text.x = element_blank()) +
scale_y_continuous(expand = c(0, 0), limits = c(0,3000)) +
geom_line(aes(x=Segments,y=val, linetype="Line"))+
geom_line(aes(x=Segments,y=disp, linetype="next line"))
The answer you linked already covers this, but I will try to explain. A legend is built from the properties of your data that you map inside aes(). So if you want different lines to appear in the legend, declare the distinction in aes(); that is what gets shown in the legend. I used two geom_line layers here. Since both map linetype, both are shown in the linetype legend.
The plot:
You can adapt this easily to your use case. Make sure you use a known aesthetic name if you want to solve it this way. You can also change the legend titles afterwards by using:
labs(fill = "custom name")
If you want to add colours and keep the same line type, you can customize with scale_linetype_manual as follows (I did not use fill for the bars this time):
library(ggplot2)
first_q <- mtcars
first_q$Segments <- mtcars$mpg
first_q$val <- seq(1,nrow(mtcars))
first_q$number_of_new_costumers <- mtcars$hp
first_q$type <- "Line"
cols = c("green", "red") # legend keys are sorted alphabetically: "second", then "solid"
ggplot(first_q) +
geom_bar(aes(x= Segments, y= number_of_new_costumers), stat =
"identity") + theme(axis.text.x = element_blank()) +
scale_y_continuous(expand = c(0, 0), limits = c(0,3000)) +
geom_line(aes(x=Segments,y=val, linetype="solid"), color = "red", alpha = 0.4)+
geom_line(aes(x=Segments,y=disp, linetype="second"), color ="green", alpha = 0.5)+
scale_linetype_manual(values = c("solid","solid"),
guide = guide_legend(override.aes = list(colour = cols)))
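The same idea also works with the colour aesthetic, which may be closer to the original plot with its differently coloured lines. The sketch below uses made-up values and only two of the question's line columns; it is an illustration under those assumptions, not the asker's data.

```r
library(ggplot2)

# Toy data; the column names follow the question, the values are made up
first_q <- data.frame(
  Segments = c("A", "B", "C", "D"),
  number_of_new_customers = c(1200, 800, 1500, 600),
  count = c(900, 700, 1100, 500),
  bond_count = c(300, 200, 450, 150)
)

p <- ggplot(first_q, aes(x = Segments)) +
  geom_col(aes(y = number_of_new_customers, fill = Segments)) +
  # A constant string inside aes() becomes an entry in the colour legend
  geom_line(aes(y = count, colour = "count", group = 1)) +
  geom_line(aes(y = bond_count, colour = "bond_count", group = 1)) +
  # Assign the actual line colours by name
  scale_colour_manual(name = "Prod",
                      values = c(count = "darkred", bond_count = "blue"))
```

Setting colour outside aes(), as in the question, fixes the colour but creates no legend entry, which is why the lines were missing from the legend.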

Dotplot: How to change dot sizes of dotplot based on a value in data and make all x axis values into whole numbers

I have made a dotplot for my data but need help with the finishing touches. I have looked around Stack Overflow a bit and haven't seen any posts that directly answer my queries yet.
My code for my dotplot is:
ggplot()+
geom_dotplot(mapping = aes(x= reorder(Description, -p.adjust), y=Count, fill=-p.adjust),
data = head(X[which(X$p.adjust < 0.05),], n = 15), binaxis = 'y', dotsize = 2,
method = 'dotdensity', binpositions = 'all', binwidth = NULL)+
scale_fill_continuous(low="black", high="light grey") +
labs(y = "Associated genes", x = "wikipathways", fill = "p.adjust") +
theme(axis.text=element_text(size=8)) +
ggtitle('') +
theme(plot.title = element_text(2, face = "bold", hjust = 1),
legend.key.size = unit(2, "line")) +
theme(panel.background = element_rect(fill = 'white', colour = 'black'))+
coord_fixed(ratio = 0.5)+
coord_flip()
Let's say the X is something along the lines of:
Description p.adjust Count GeneRatio
1 DescriptionA 0.001 3 3/20
2 DescriptionB 0.002 2 2/20
3 DescriptionC 0.003 5 5/20
4 DescriptionD 0.004 10 10/20
To complete this plot I need two edits.
I would like to base the size of the dots on the GeneRatio, and make a secondary key based on this size. Is this possible with ggplot2 dotplots?
Next I would like to keep the X axis values as integers. I'd want to avoid using something like scale_x_continuous(limits = c(2, 10)) as this plot code is part of a function for multiple data sets of various sizes. Thus containing the limits/scale would not work well.
Help would be most appreciated.
If you can switch to geom_point instead of geom_dotplot, it is easy to scale the dot size by a variable. It also seems to have fixed your axis issue, luckily enough.
ggplot() +
geom_point(mapping = aes(x = reorder(Description, -p.adjust), y = Count, fill = -p.adjust, size = GeneRatio),
data = head(X[which(X$p.adjust < 0.05), ], n = 15),
shape = 21) + # shape 21 has a fill; the geom_dotplot-only arguments (binaxis, method, ...) are dropped
scale_fill_continuous(low="black", high="light grey") +
labs(y = "Associated genes", x = "wikipathways", fill = "p.adjust") +
theme(axis.text=element_text(size=8)) +
ggtitle('') +
theme(plot.title = element_text(2, face = "bold", hjust = 1),
legend.key.size = unit(2, "line")) +
theme(panel.background = element_rect(fill = 'white', colour = 'black'))+
coord_fixed(ratio = 0.5)+
coord_flip()
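Two details may still need attention: GeneRatio in the sample data is text such as "3/20", and the Count axis should show whole numbers only. The sketch below is one way to handle both, using the four sample rows from the question; GeneRatioNum and integer_breaks are names introduced here for illustration.

```r
library(ggplot2)

# The sample rows from the question
X <- data.frame(
  Description = paste0("Description", LETTERS[1:4]),
  p.adjust = c(0.001, 0.002, 0.003, 0.004),
  Count = c(3, 2, 5, 10),
  GeneRatio = c("3/20", "2/20", "5/20", "10/20"),
  stringsAsFactors = FALSE
)

# Turn "3/20" into 0.15 so that size is mapped to a number
parts <- strsplit(X$GeneRatio, "/", fixed = TRUE)
X$GeneRatioNum <- vapply(parts,
                         function(p) as.numeric(p[1]) / as.numeric(p[2]),
                         numeric(1))

# Whole-number breaks computed from whatever limits the data produce
integer_breaks <- function(limits) seq(ceiling(limits[1]), floor(limits[2]), by = 1)

p <- ggplot(X, aes(x = reorder(Description, -p.adjust), y = Count)) +
  geom_point(aes(fill = -p.adjust, size = GeneRatioNum), shape = 21) +
  scale_y_continuous(breaks = integer_breaks) +
  labs(size = "GeneRatio", fill = "p.adjust") +
  coord_flip()
```

Because the breaks are computed from the limits, no fixed range has to be hard-coded, so the same code can be reused inside a function for data sets of different sizes.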
