R - ggplot2 Legend not appearing for line graph [duplicate] - r

I know this question has been asked before, and I've looked at many of the links, but none of them seem to be helping my case.
I'm plotting a line graph for 4 lines of different colors. But I can't get the legend to appear.
I've read that I need to put the color attribute in the aes part of the graph. That hasn't been successful either.
I have a data frame of four columns and 1000 rows. Here's a small reproducible example of what my data looks like, and how I'd like to plot it.
library(ggplot2)
vec1 <- c(0.1, 0.2, 0.25, 0.12, 0.3, 0.7, 0.41)
vec2 <- c(0.5, 0.4, 0.3, 0.55, 0.12, 0.12, 0.6)
vec3 <- c(0.01, 0.02, 0.1, 0.5, 0.14, 0.2, 0.5)
vec4 <- c(0.08, 0.1, 0.54, 0.5, 0.1, 0.12, 0.3)
df <- data.frame(vec1, vec2, vec3, vec4)
df_plot <- ggplot() +
  geom_line(data = df, color = "black", aes(x = c(1:7), y = df[,1], color = "black")) +
  geom_line(data = df, color = "blue", aes(x = c(1:7), y = df[,2], color = "blue")) +
  geom_line(data = df, color = "green", aes(x = c(1:7), y = df[,3], color = "green")) +
  geom_line(data = df, color = "yellow", aes(x = c(1:7), y = df[,4], color = "yellow")) +
  xlab("x axis") +
  ylab("y axis") +
  ggtitle("A random plot") +
  theme(legend.title = element_text("Four lines"), legend.position = "right")
(Also, did SO change the process of indenting code? Before, I could just press Ctrl + K to indent the entire block of code. But I can't do that anymore. Ctrl+K puts the cursor in my URL box for some reason)
I'd like it to print the legend to the right of the graph.

First: I see a lot of people here creating data frames by first creating individual vectors. I don't know where this practice originated but it isn't necessary:
df1 <- data.frame(vec1 = c(0.1, 0.2, 0.25, 0.12, 0.3, 0.7, 0.41),
vec2 = c(0.5, 0.4, 0.3, 0.55, 0.12, 0.12, 0.6),
vec3 = c(0.01, 0.02, 0.1, 0.5, 0.14, 0.2, 0.5),
vec4 = c(0.08, 0.1, 0.54, 0.5, 0.1, 0.12, 0.3))
Next: your data is in "wide" form. ggplot2 works better with "long" form: one column for variables, another for their values. You can get to that using tidyr::gather. While we're at it, we can use dplyr::mutate to add the x variable:
library(dplyr)
library(tidyr)
library(ggplot2)
df1 %>%
  gather(Var, Val) %>%
  mutate(x = rep(1:7, 4))
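In tidyr 1.0.0 and later, gather is marked as superseded in favour of pivot_longer. The following is a minimal sketch of the same reshaping with pivot_longer, assuming the df1 defined above. One caveat: pivot_longer returns the rows in a different order than gather does, so this sketch adds x before reshaping instead of using rep(1:7, 4) afterwards. The name df_long is only illustrative.

```r
library(dplyr)
library(tidyr)

df1 <- data.frame(vec1 = c(0.1, 0.2, 0.25, 0.12, 0.3, 0.7, 0.41),
                  vec2 = c(0.5, 0.4, 0.3, 0.55, 0.12, 0.12, 0.6),
                  vec3 = c(0.01, 0.02, 0.1, 0.5, 0.14, 0.2, 0.5),
                  vec4 = c(0.08, 0.1, 0.54, 0.5, 0.1, 0.12, 0.3))

df_long <- df1 %>%
  # Record the row position first, so it survives the reshape
  mutate(x = row_number()) %>%
  # Stack every column except x into name/value pairs
  pivot_longer(cols = -x, names_to = "Var", values_to = "Val")

head(df_long)
```

The resulting df_long has the same Var, Val and x columns as the gather version, so the plotting code that follows should work with it unchanged.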
Now we can plot. With the data in this form, there is no need to use a separate geom for each variable and aes() automatically takes care of colors and legends. You can specify custom colors using scale_color_manual. I don't know that yellow or green are great choices, but here it is:
df1 %>%
  gather(Var, Val) %>%
  mutate(x = rep(1:7, 4)) %>%
  ggplot(aes(x, Val)) +
  geom_line(aes(color = Var)) +
  scale_color_manual(values = c("black", "blue", "green", "yellow"))
The key is having your data in the correct format, and understanding how that allows aes to map variables to chart properties.
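For completeness, here is why the original code shows no legend: a color given outside aes() is a fixed setting, and it overrides the color mapped inside aes(), so there is nothing left for ggplot2 to build a legend from. If you would rather keep the wide data, a sketch along these lines should also produce a legend. It assumes the df from the question; the label strings "vec1" to "vec4" and the name p_wide are just illustrative choices.

```r
library(ggplot2)

df <- data.frame(vec1 = c(0.1, 0.2, 0.25, 0.12, 0.3, 0.7, 0.41),
                 vec2 = c(0.5, 0.4, 0.3, 0.55, 0.12, 0.12, 0.6),
                 vec3 = c(0.01, 0.02, 0.1, 0.5, 0.14, 0.2, 0.5),
                 vec4 = c(0.08, 0.1, 0.54, 0.5, 0.1, 0.12, 0.3))
df$x <- seq_len(nrow(df))

p_wide <- ggplot(df, aes(x = x)) +
  # Inside aes(), the string is a legend label, not a color name
  geom_line(aes(y = vec1, color = "vec1")) +
  geom_line(aes(y = vec2, color = "vec2")) +
  geom_line(aes(y = vec3, color = "vec3")) +
  geom_line(aes(y = vec4, color = "vec4")) +
  # Map each label to the actual color and set the legend title here
  scale_color_manual(name = "Four lines",
                     values = c(vec1 = "black", vec2 = "blue",
                                vec3 = "green", vec4 = "yellow")) +
  labs(x = "x axis", y = "y axis", title = "A random plot") +
  theme(legend.position = "right")
p_wide
```

This gets repetitive with more columns, which is the practical argument for the long format.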

Related

Scale fill gradient using absolute values

In the following chart, I would like the gradient to be applied to absolute values rather than signed values. For example, rows I and G should be the same shade of red, as their values are -0.75 and 0.75, respectively. By the same token, rows F and E should be the same shade of green, as their values are -0.15 and 0.15, respectively. Can anyone tell me how I would do this?
library(dplyr)
library(ggplot2)
data.frame(grp = LETTERS[1:10],
vals = c(0.11, 0.39, -0.06, 0.42, 0.15, -0.15, 0.75, -0.02, -0.75, 0.00)) %>%
ggplot(aes(x = vals, y = grp, fill = vals)) +
geom_col() +
scale_fill_gradient(low = "green", high = "red")
You could simply use fill = abs(vals):
data.frame(grp = LETTERS[1:10],
vals = c(0.11, 0.39, -0.06, 0.42, 0.15, -0.15, 0.75, -0.02, -0.75, 0.00)) %>%
ggplot(aes(x = vals, y = grp, fill = abs(vals))) +
geom_col() +
scale_fill_gradient(low = "green", high = "red")
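If you would rather keep the signed values in the legend, one possible alternative is a diverging scale that uses the same color at both ends. This is only a sketch, assuming a midpoint of 0 is what you want; the names bars and p_div are illustrative.

```r
library(ggplot2)

bars <- data.frame(grp = LETTERS[1:10],
                   vals = c(0.11, 0.39, -0.06, 0.42, 0.15, -0.15, 0.75, -0.02, -0.75, 0.00))

p_div <- ggplot(bars, aes(x = vals, y = grp, fill = vals)) +
  geom_col() +
  # Red at both extremes, green at zero, so -0.75 and 0.75 get the same fill
  scale_fill_gradient2(low = "red", mid = "green", high = "red", midpoint = 0)
p_div
```

The trade-off is that the legend then runs from red through green back to red, which some readers may find harder to interpret than the 0 to 0.75 legend produced by abs(vals).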

Single option in scale_fill_stepsn changes color rendering in legend

I'm using ggplot2's scale_fill_stepsn to generate a map with a stepped scale. When I use the n.breaks option, which calculates the breaks from the number requested, the specified colors render properly in the legend. However, when I instead specify the breaks manually, with the same number of breaks as I passed to n.breaks, the legend colors change and no longer render properly.
This does not make sense to me. Can this be fixed so that the legend colors in the second example look like those in the first?
library(urbnmapr)
library(ggplot2)
library(dplyr)
library(ggthemes)
# Set colors
red <- c(0.67, 0.75, 0.84, 0.92, 1, 1, 0.8, 0.53, 0, 0, 0, 0)
green <- c(0.25, 0.4, 0.56, 0.71, 0.86, 1, 1, 0.95, 0.9, 0.75, 0.6, 0.48)
blue <- c(0.11, 0.18, 0.25, 0.33, 0.4, 0.45, 0.4, 0.27, 0, 0, 0, 0)
# Obtain county polygon data
states_sf <- get_urbn_map(map = "states", sf = TRUE)
counties_sf <- get_urbn_map(map = "counties", sf = TRUE)
# Assign random values of data to each county
counties_sf$value = runif(length(counties_sf$county_fips), min=-3.0, max=3.0)
# Remove AK and HI - lower 48 only
states_sf <- states_sf[!(states_sf$state_abbv %in% c("HI","AK")),]
counties_sf <- counties_sf[!(counties_sf$state_abbv %in% c("HI","AK")),]
# Plot county-level data with a discrete legend
data_levels <- c(-3,-1.5, -0.8, -0.5, -0.25,-0.1,0.1,0.25,0.5,.8,1.5,3)
level_colors <- rgb(red, green, blue)
length(data_levels)
length(level_colors)
# First version -
counties_sf %>%
ggplot() +
# Plot county data and fill with value
geom_sf(mapping = aes(fill = value), color = NA) +
# Overlay state outlines
geom_sf(data = states_sf, fill = NA, color = "black", size = 0.25) +
# Remove grid lines from plot
coord_sf(datum = NA) +
#
# THE FIRST OPTION of scale_fill_stepsn IS WHERE THEY ARE DIFFERENT
#
scale_fill_stepsn(n.breaks=12, colors=level_colors, limits=c(-3,3),
labels=scales::label_number(accuracy=0.1)) +
labs(title='This Data is Completely Random',
fill ='The Legend') +
theme_map() +
theme(legend.position = "bottom",
legend.key.width=unit(1.5,"cm"),
legend.box.background = element_rect(color="black", size=2),
legend.title = element_text(face = "bold"),
legend.spacing = unit(0.25,"cm"),
legend.justification = "center",
plot.title=element_text(hjust=0.5)) +
guides(fill = guide_colorsteps(even.step=TRUE,
title.position="top",
title.hjust = 0.5,
frame.colour = 'black',
barwidth=unit(250,'points'),
axis.linewidth=unit(3,'points')))
This yields:
#
# Second version -
counties_sf %>%
ggplot() +
# Plot county data and fill with value
geom_sf(mapping = aes(fill = value), color = NA) +
# Overlay state outlines
geom_sf(data = states_sf, fill = NA, color = "black", size = 0.25) +
# Remove grid lines from plot
coord_sf(datum = NA) +
#
# THE FIRST OPTION of scale_fill_stepsn IS WHERE THEY ARE DIFFERENT
# replaced n.breaks with breaks option
#
scale_fill_stepsn(breaks=data_levels, colors=level_colors, limits=c(-3,3),
labels=scales::label_number(accuracy=0.1)) +
labs(title='This Data is Completely Random',
fill ='The Legend') +
theme_map() +
theme(legend.position = "bottom",
legend.key.width=unit(1.5,"cm"),
legend.box.background = element_rect(color="black", size=2),
legend.title = element_text(face = "bold"),
legend.spacing = unit(0.25,"cm"),
legend.justification = "center",
plot.title=element_text(hjust=0.5)) +
guides(fill = guide_colorsteps(even.step=TRUE,
title.position="top",
title.hjust = 0.5,
frame.colour = 'black',
barwidth=unit(250,'points'),
axis.linewidth=unit(3,'points')))
This version yields the following image. Notice the colors in the legend are now different.
The first option spaces 12 breaks evenly from -3 to 3, so they coincide exactly with your colours. The second option sets unevenly spaced breaks, so the exact colours fall in between some of the breaks, while the (hidden) gradient is still evenly spaced. To space the gradient like your breaks, set the values argument of the scale. Simplified example below.
library(ggplot2)
df <- data.frame(
x = runif(100),
y = runif(100),
z = runif(100, -3, 3)
)
level_colors <- rgb(
red = c(0.67, 0.75, 0.84, 0.92, 1, 1, 0.8, 0.53, 0, 0, 0, 0),
green = c(0.25, 0.4, 0.56, 0.71, 0.86, 1, 1, 0.95, 0.9, 0.75, 0.6, 0.48),
blue = c(0.11, 0.18, 0.25, 0.33, 0.4, 0.45, 0.4, 0.27, 0, 0, 0, 0)
)
data_levels <- c(-3,-1.5, -0.8, -0.5, -0.25,-0.1,0.1,0.25,0.5,.8,1.5,3)
ggplot(df, aes(x, y, fill = z)) +
geom_point(shape = 21) +
scale_fill_stepsn(breaks=data_levels, colors=level_colors, limits=c(-3,3),
values = scales::rescale(data_levels),
labels=scales::label_number(accuracy=0.1))
Created on 2021-04-08 by the reprex package (v1.0.0)
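To see what the values argument receives, it may help to look at the output of scales::rescale on its own. With its defaults it maps the input linearly onto [0, 1] using the range of the input, so each colour is pinned to the relative position of its break. A small sketch (the name positions is illustrative):

```r
library(scales)

data_levels <- c(-3, -1.5, -0.8, -0.5, -0.25, -0.1, 0.1, 0.25, 0.5, 0.8, 1.5, 3)

# Relative position of each break between the smallest and largest level
positions <- rescale(data_levels)
round(positions, 3)
```

The positions bunch up around 0.5, which mirrors how the breaks cluster around zero.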

ggplot: filling color based on condition

I want to plot two categorical variables (group, condition) and one numeric variable (value). In addition, I want to base the filling color on the significance of the values (significant bars should be grey, the rest white). With the following code, however, only some significant bars are colored in grey.
plot <- ggplot(dat, aes(group, value))+
geom_col(aes(fill = condition), position = position_dodge(0.8), width = .7, color= "black") +
scale_fill_manual(values = ifelse(dat$significance > .05, "white", "grey")) +
geom_linerange(aes(group = condition, ymin = ci_lower, ymax= ci_upper), position = position_dodge(0.8)) +
coord_flip(ylim =c(-.2,1))
plot
here is my data:
dat <- structure(list(group = c("friends", "parent", "esm", "friends", "parent", "esm"),
value = c(0.25, 0.44, 0.33, 0.47, 0.25, 0.32),
significance = c(0.08, 0, 0, 0, 0.01, 0),
condition = c("S1", "S1", "S1", "S2", "S2", "S2"),
trait = c("E", "E", "E", "E", "E", "E"),
ci_lower = c(0.52, 0.74, 0.53, 0.67, 0.44, 0.49),
ci_upper = c(-0.03, 0.14, 0.14, 0.27, 0.06, 0.15)),
row.names = c(1L,2L, 3L, 16L, 17L, 18L), class = "data.frame")
You can add an inline mutate to create a column to specify the color group based on significance. The key here is to use the group aesthetic so the bars can still be dodged and positioned correctly based on the condition variable.
library(dplyr)
library(ggplot2)
dat %>%
  # sig is TRUE when the value is significant at the 5% level
  mutate(sig = significance < .05) %>%
  ggplot(aes(group, value, group = condition)) +
  geom_col(
    aes(fill = sig),
    position = position_dodge(0.8),
    color = "black",
    width = .7
  ) +
  # Values are matched in order: FALSE -> white, TRUE -> grey
  scale_fill_manual(values = c("white", "grey")) +
  geom_linerange(aes(ymin = ci_lower, ymax = ci_upper),
                 position = position_dodge(0.8)) +
  coord_flip(ylim = c(-.2, 1))
Gives this plot:
However, I think you need another aesthetic to distinguish condition in addition to significance. Color is one option, but this is a nice place to use ggpattern which will be more obvious than the outline color and keep the B&W look.
Here's an example:
library(dplyr)
library(ggplot2)
library(ggpattern)
dat %>%
  # sig is TRUE when the value is significant at the 5% level
  mutate(sig = significance < .05) %>%
  ggplot(aes(group, value, group = condition)) +
  geom_col_pattern(
    aes(fill = sig, pattern_angle = condition),
    position = position_dodge(0.8),
    pattern_fill = "black",
    pattern_spacing = 0.025,
    pattern = "stripe",
    width = .7,
    color = "black"
  ) +
  scale_pattern_angle_discrete(range = c(45, 135)) +
  # Named values make the mapping explicit: significant bars are grey
  scale_fill_manual(values = c(`TRUE` = "grey", `FALSE` = "white")) +
  geom_linerange(aes(ymin = ci_lower, ymax = ci_upper),
                 position = position_dodge(0.8)) +
  coord_flip(ylim = c(-.2, 1))
Which gives this plot:
Finally, it's worth noting that the color of a bar is not usually used to denote the significance of a statistical metric; a much more common convention is to use asterisks to indicate relevant p value thresholds (e.g. ** p < 0.01) or letters to indicate membership in a grouped analysis such as an ANOVA. These can be easily implemented using the ggpubr package. That would leave fill color free to indicate the grouping by condition.
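As a rough illustration of the asterisk convention, here is a sketch that uses plain ggplot2 (geom_text) rather than ggpubr. It assumes the significance column holds p values and uses the common thresholds 0.001, 0.01 and 0.05; the stars column, the 0.03 label offset and the name p_stars are illustrative choices, and the confidence intervals are left out to keep it short.

```r
library(ggplot2)

# Trimmed copy of the question's data (only the columns used here)
dat <- data.frame(
  group = c("friends", "parent", "esm", "friends", "parent", "esm"),
  value = c(0.25, 0.44, 0.33, 0.47, 0.25, 0.32),
  significance = c(0.08, 0, 0, 0, 0.01, 0),
  condition = c("S1", "S1", "S1", "S2", "S2", "S2")
)

# Assumed thresholds: *** p < 0.001, ** p < 0.01, * p < 0.05
dat$stars <- as.character(cut(
  dat$significance,
  breaks = c(-Inf, 0.001, 0.01, 0.05, Inf),
  labels = c("***", "**", "*", ""),
  right = FALSE
))

p_stars <- ggplot(dat, aes(group, value, fill = condition, group = condition)) +
  geom_col(position = position_dodge(0.8), width = .7, color = "black") +
  # Place the stars just past the end of each bar, dodged like the bars
  geom_text(aes(y = value + 0.03, label = stars),
            position = position_dodge(0.8)) +
  coord_flip(ylim = c(-.2, 1))
p_stars
```

Here fill distinguishes the two conditions and the stars carry the significance information.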
This approach can also be useful:
library(ggplot2)
#Code
ggplot(dat, aes(group, value))+
geom_col(aes(fill = interaction(condition,significance > .05)),
position = position_dodge(0.8), width = .7, color= "black") +
scale_fill_manual(values = c("grey","grey","white"),
breaks = c('S2.FALSE','S1.TRUE'),
labels=c('S2','S1')) +
geom_linerange(aes(group = condition, ymin = ci_lower, ymax= ci_upper), position = position_dodge(0.8)) +
coord_flip(ylim =c(-.2,1))+
labs(fill='Var')
Output:

Highlight the value according to the specific position in x axis with ggplot

My aim is to highlight StockValue values above 0.5 with one colour and the StockValue of the years 2002, 2003, 2012 and 2015 with another colour, using ggplot. I have managed to highlight the values above 0.5, but I am not able to solve the second problem.
I have tried:
require(ggplot2)
require(reshape2)
require(dplyr) # provides %>% and filter(), which are used below
df <- data.frame(Year = c(2001:2015),
StockValue = c(0.93, 0.32, 0.24, 0.53, 0.43, 0.53, 0.43, 0.58, 0.31, 0.52, 0.49, 0.27,0.34,0.48, 0.45))
df %>% ggplot(aes(x=Year,y=StockValue)) + geom_point(color = 'blue', shape = 18) + theme(legend.position="none") + ggtitle("Stock Value")
highlight <- df %>% filter(StockValue>=0.5)
df %>% ggplot(aes(x=Year,y=StockValue)) + geom_point(color = 'blue', shape = 18, size = 2.3) + geom_point(data=highlight, aes(x=Year,y=StockValue), color='red', shape=18) + theme(legend.position="none")
The simplest way is perhaps to create the data you want to use first. Like so:
df %>%
  mutate(above = StockValue>=.5) %>%
  mutate(year = Year %in% c(2002,2003,2012,2015)) %>%
  mutate(comb = paste(above,year)) %>%
  ggplot(aes(Year,StockValue,color = comb)) +
  geom_point() +
  scale_color_manual(values = c('blue','violet','black')) +
  theme(legend.position = 'none')
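Note that the three unnamed colours above are matched to the combinations in alphabetical order, which works for this data but would shift if a selected year also had a value above 0.5. A slightly more explicit variant, sketched here with dplyr::case_when and named colours, avoids relying on that order. The category labels, the colour choices, the rule that a selected year takes priority, and the name df_marked are all assumptions made for illustration.

```r
library(dplyr)
library(ggplot2)

df <- data.frame(Year = c(2001:2015),
                 StockValue = c(0.93, 0.32, 0.24, 0.53, 0.43, 0.53, 0.43, 0.58,
                                0.31, 0.52, 0.49, 0.27, 0.34, 0.48, 0.45))

df_marked <- df %>%
  mutate(highlight = case_when(
    # Conditions are checked in order, so selected years win over the threshold
    Year %in% c(2002, 2003, 2012, 2015) ~ "selected year",
    StockValue >= 0.5 ~ "above 0.5",
    TRUE ~ "other"
  ))

p_marked <- ggplot(df_marked, aes(Year, StockValue, color = highlight)) +
  geom_point(shape = 18, size = 2.3) +
  # Named values tie each colour to its category regardless of ordering
  scale_color_manual(values = c("selected year" = "violet",
                                "above 0.5" = "red",
                                "other" = "blue")) +
  theme(legend.position = "none")
p_marked
```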

Individual axes labels in facet_wrap without scales="free"

My data looks like this:
df <- data.frame(Year = as.factor(c(rep(2015, 3), rep(2016, 3), rep(2017,3))),
Tax = as.factor(c(rep(c("A", "B", "C"), 3))),
Depth = as.factor(c(10, 30, 50, 20,30,50,10,30,40)),
values= c(0.5, 0.25, 0.25, 0.1, 0.4, 0.5, 0.2, 0.6, 0.2))
I want to plot it with gaps for missing data and individual axis labels.
library(ggplot2)
The scales argument of facet_wrap gives individual axes, but it does not perform as desired, because the missing data is not reflected:
ggplot(df, aes(Depth, values, fill=Tax)) + geom_bar(stat="identity")+
facet_wrap(~Year, scale="free") +
coord_flip()
Without scales:
ggplot(df, aes(Depth, values, fill=Tax)) + geom_bar(stat="identity")+
facet_wrap(~Year) +
coord_flip()
The missing data is represented (which I want!), but the panels lack axis labels (which I need).
Is there anything I can do?
It looks like this can be done using the lemon package:
library(tidyverse)
library(lemon)
df <- data.frame(Year = as.factor(c(rep(2015, 3), rep(2016, 3), rep(2017,3))),
Tax = as.factor(c(rep(c("A", "B", "C"), 3))),
Depth = as.factor(c(10, 30, 50, 20,30,50,10,30,40)),
values= c(0.5, 0.25, 0.25, 0.1, 0.4, 0.5, 0.2, 0.6, 0.2))
ggplot(df, aes(Depth, values, fill = Tax)) +
  geom_bar(stat = "identity") +
  facet_rep_wrap(~Year, repeat.tick.labels = TRUE) +
  coord_flip()
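If adding a dependency is not an option, recent ggplot2 releases (3.5.0 and later, as far as I know) added an axes argument to facet_wrap that draws axes on every panel while keeping the scales fixed. The following is a sketch under that version assumption; it will not work on older ggplot2 installations, and the name p_axes is illustrative.

```r
library(ggplot2)

df <- data.frame(Year = as.factor(c(rep(2015, 3), rep(2016, 3), rep(2017, 3))),
                 Tax = as.factor(c(rep(c("A", "B", "C"), 3))),
                 Depth = as.factor(c(10, 30, 50, 20, 30, 50, 10, 30, 40)),
                 values = c(0.5, 0.25, 0.25, 0.1, 0.4, 0.5, 0.2, 0.6, 0.2))

p_axes <- ggplot(df, aes(Depth, values, fill = Tax)) +
  geom_bar(stat = "identity") +
  # Fixed scales keep the gaps for missing depths; axes = "all" repeats the axes
  facet_wrap(~Year, axes = "all") +
  coord_flip()
p_axes
```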
