Why geom_line legend if show.legend = FALSE and why different colours - r

After running the code below, I was wondering:
1- Why do the "A.line" and "B.line" categories appear in the geom_point() legend?
2- Why are there four colors in the legend?
I guess both answers are related, but I cannot tell what is going on.
I would like the legend to show just "A.point" and "B.point".
I would also like the lines and points to share the same colors (I guess I can do this manually).
Thanks in advance for your help.
Best,
David
library(ggplot2)
library(dplyr) # provides %>%
data.frame(x = rep(1:2, 2),
names.points = rep(c("A.point","B.point"), 2),
y.point = c(2, 4, 7, 9),
names.lines = rep(c("A.line","B.line"), each = 2),
y.line = c(3, 3, 8, 8)) %>%
ggplot() +
geom_point(aes(x = x, y = y.point, group = names.points, colour = names.points), size = 5) +
geom_line(aes(x = x, y = y.line, group = names.lines, colour = names.lines), show.legend = FALSE)

Legends are not tied to geoms but to scales: they display the categories (or the range of values) mapped to an aesthetic. Hence, you get four colors because you have four categories mapped to the color aesthetic. The geoms only determine how each legend key is drawn, via the so-called key glyph, which is a point for geom_point and a line for geom_line. show.legend = FALSE only means that the key glyph for geom_line is not drawn in the legend keys, i.e. the keys show a point but no line.
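As a rough check of the point above, the sketch below rebuilds the plot from the question and counts the distinct colours that the two layers actually draw. The object names (df, p, point_cols, line_cols) are made up for this illustration; layer_data() is the ggplot2 helper that returns the computed data of a layer.

```r
library(ggplot2)

df <- data.frame(
  x = rep(1:2, 2),
  names.points = rep(c("A.point", "B.point"), 2),
  y.point = c(2, 4, 7, 9),
  names.lines = rep(c("A.line", "B.line"), each = 2),
  y.line = c(3, 3, 8, 8)
)

p <- ggplot(df) +
  geom_point(aes(x = x, y = y.point, colour = names.points), size = 5) +
  geom_line(aes(x = x, y = y.line, group = names.lines, colour = names.lines),
            show.legend = FALSE)

# Colours drawn by each layer (layer 1 = points, layer 2 = lines)
point_cols <- unique(layer_data(p, 1)$colour)
line_cols  <- unique(layer_data(p, 2)$colour)

# Both layers share one colour scale, so the scale holds the union
# of the point and line categories.
length(union(point_cols, line_cols))
```

With the default discrete palette this count should be four, one per category, even though the line layer is hidden from the legend keys.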
To remove the categories related to the lines from your legend use e.g. the breaks argument of scale_color_discrete instead.
library(ggplot2)
library(dplyr)
data.frame(
x = rep(1:2, 2),
names.points = rep(c("A.point", "B.point"), 2),
y.point = c(2, 4, 7, 9),
names.lines = rep(c("A.line", "B.line"), each = 2),
y.line = c(3, 3, 8, 8)
) %>%
ggplot() +
geom_point(aes(x = x, y = y.point, group = names.points, colour = names.points), size = 5) +
geom_line(aes(x = x, y = y.line, group = names.lines, colour = names.lines), show.legend = FALSE) +
scale_color_discrete(breaks = c("A.point", "B.point"))
UPDATE To fix your issue with the colors you could use a named color vector:
pal_col <- rep(c("darkblue","darkred"), 2)
names(pal_col) <- c("A.point", "B.point", "A.line", "B.line")
data.frame(
x = rep(1:2, 2),
names.points = rep(c("A.point", "B.point"), 2),
y.point = c(2, 4, 7, 9),
names.lines = rep(c("A.line", "B.line"), each = 2),
y.line = c(3, 3, 8, 8)
) %>%
ggplot() +
geom_point(aes(x = x, y = y.point, group = names.points, colour = names.points), size = 5) +
geom_line(aes(x = x, y = y.line, group = names.lines, colour = names.lines), show.legend = FALSE) +
scale_color_manual(breaks = c("A.point", "B.point"),
values = pal_col)
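A different route, not part of the answer above, is to map both layers to a shared key so that the colour scale only ever sees two categories. The sketch below assumes the category names follow the "A.point" / "A.line" pattern; the columns key.points and key.lines are hypothetical helpers introduced only for this example.

```r
library(ggplot2)

df <- data.frame(
  x = rep(1:2, 2),
  names.points = rep(c("A.point", "B.point"), 2),
  y.point = c(2, 4, 7, 9),
  names.lines = rep(c("A.line", "B.line"), each = 2),
  y.line = c(3, 3, 8, 8)
)

# Hypothetical helper columns: drop the ".point" / ".line" suffix
# so that points and lines share the keys "A" and "B".
df$key.points <- sub("\\..*$", "", df$names.points)
df$key.lines  <- sub("\\..*$", "", df$names.lines)

p <- ggplot(df) +
  geom_point(aes(x = x, y = y.point, colour = key.points), size = 5) +
  geom_line(aes(x = x, y = y.line, group = names.lines, colour = key.lines),
            show.legend = FALSE) +
  # Relabel the two keys so the legend still reads like the original
  scale_colour_discrete(name = "names.points",
                        labels = c("A.point", "B.point"))

p
```

Because the scale has only two categories, no breaks argument or manual palette is needed; the trade-off is that the data needs an extra column per layer.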

Related

Heatmap using geom_tile in a loop/function then save the output figures

I would like to make heatmaps using the following data:
dt <- data.frame(
h = rep(LETTERS[1:7], 7),
j = c(rep("A", 7), rep("B", 7), rep("C", 7), rep("D", 7), rep("E", 7), rep("F", 7), rep("G", 7)),
Red = runif(7, 0, 1),
Yellow = runif(7, 0, 1),
Green = runif(7, 0, 1),
Blue = runif(7, 0, 1),
Black = runif(7, 0, 1)
)
For each of the heatmaps, the x and y axes stay as the first 2 columns of dt. The values that fill each heatmap come from each of the remaining columns, e.g., Red, Yellow, ...
I borrowed this example to produce the following code:
loop = function(df, x_var, y_var, f_var) {
ggplot(df, aes(x = .data[[x_var]], y = .data[[y_var]], fill = .data[[f_var]])) +
geom_tile(color = "black") +
scale_fill_gradient(low = "white", high = "blue") +
geom_text(aes(label = .data[[f_var]]), color = "black", size = 4) +
coord_fixed() +
theme_minimal() +
labs(x = "",
y = "",
fill = "R", # Want the legend title to be each of the column names that are looped
title = .data[[f_var]])
ggsave(a, file = paste0("heatmap_", f_var,".png"), device = png, width = 15, height = 15, units = "cm")
}
plot_list <- colnames(dt)[-1] %>%
map( ~ loop(df = dt,
x_var = colnames(dt)[1],
y_var = colnames(dt)[2],
f_var = .x))
# view all plots individually (not shown)
plot_list
Problems I encountered when I ran this chunk of code:
Error: Discrete value supplied to continuous scale
The ggsave step didn't work. I would like to save each plot under the name of the corresponding column.
There are a few minor issues with your code. First, you get the error because the loop includes the second column of your dataset, which is a categorical (i.e. discrete) variable, as a fill variable. Second, title = .data[[f_var]] will not work; simply use title = f_var to add the variable name as the title. Finally, you are trying to save an object called a, which is not defined in your code: assign the plot to a variable a, then return it with return(a), as I did below:
set.seed(123)
library(ggplot2)
library(purrr)
loop = function(df, x_var, y_var, f_var) {
a <- ggplot(df, aes(x = .data[[x_var]], y = .data[[y_var]], fill = .data[[f_var]])) +
geom_tile(color = "black") +
scale_fill_gradient(low = "white", high = "blue") +
geom_text(aes(label = .data[[f_var]]), color = "black", size = 4) +
coord_fixed() +
theme_minimal() +
labs(x = "",
y = "",
fill = f_var, # legend title is the name of the looped column
title = f_var)
ggsave(a, file = paste0("heatmap_", f_var,".png"), device = png, width = 15, height = 15, units = "cm")
return(a)
}
plot_list <- colnames(dt)[-c(1, 2)] %>%
map( ~ loop(df = dt,
x_var = colnames(dt)[1],
y_var = colnames(dt)[2],
f_var = .x))
# view all plots individually (not shown)
plot_list[c(1, 5)]
#> [[1]]
#>
#> [[2]]
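The same idea can be split into a plotting function and a separate saving step, which keeps the plot list named by column. This is only a sketch under some assumptions: it uses a smaller made-up data frame with two value columns, writes to tempdir(), and the names heatmap_one, value_cols, plots and out_dir are invented for the example.

```r
library(ggplot2)
library(purrr)

set.seed(123)
# Toy data (assumption): two grouping columns plus two value columns
dt <- data.frame(
  h = rep(LETTERS[1:7], 7),
  j = rep(LETTERS[1:7], each = 7),
  Red = runif(49),
  Blue = runif(49)
)

# Build one heatmap; the fill column name doubles as legend title and plot title
heatmap_one <- function(df, x_var, y_var, f_var) {
  ggplot(df, aes(x = .data[[x_var]], y = .data[[y_var]], fill = .data[[f_var]])) +
    geom_tile(color = "black") +
    scale_fill_gradient(low = "white", high = "blue") +
    coord_fixed() +
    theme_minimal() +
    labs(x = NULL, y = NULL, fill = f_var, title = f_var)
}

# Loop over every column except the two axis columns
value_cols <- setdiff(colnames(dt), c("h", "j"))
plots <- map(set_names(value_cols), ~ heatmap_one(dt, "h", "j", .x))

# Save each plot under the name of its column
out_dir <- tempdir()
iwalk(plots, ~ ggsave(
  filename = file.path(out_dir, paste0("heatmap_", .y, ".png")),
  plot = .x, width = 15, height = 15, units = "cm"
))
```

Selecting the value columns by name with setdiff() rather than by position avoids accidentally passing a discrete column as the fill variable, which was the source of the first error.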

How to produce neat label positions in the ggplot2 line chart?

I have a line chart built using ggplot2. It looks as follows:
The lines are close to each other and the data labels overlap, which is inconvenient. It would be better if the light red labels were below the line and the green labels were placed where there is room for them. Something like this:
This post is helpful. However, I do not know in advance for which line it would be better to put labels above and for which it would be better to keep them below. Therefore I am looking for a generic solution.
ggrepel does a great job of organizing labels, but I cannot figure out how to make it work in my case. I tried different parameters. Here is one of the simplest variants (not the best looking):
Questions:
Is there any way to make the chart in R look like the second picture?
I think ggrepel computes the best label positions taking into account the size of the chart. If I export the chart to PowerPoint, for example, the size of the PowerPoint chart might differ from the size used to compute the optimal label positions. Is there any way to pass the size of the chart to ggrepel?
Here is the code I used to generate the data and charts:
library(ggplot2)
library(ggrepel)
set.seed(1)
x = rep(1:20, 3)
y = c(runif(20, 10, 11),
runif(20, 11, 12),
runif(20, 12, 13))
z = rep(c("a", "b", "c"), each = 20)
df = data.frame(x = x, y = y, z = z)
ggplot(data = df, aes(x = x, y = y, group = z, color = z)) +
geom_line() +
geom_text(aes(label = round(y, 1)), nudge_y = 1) +
ylim(c(0, 20))
ggplot(data = df, aes(x = x, y = y, group = z, color = z)) +
geom_line() +
geom_text_repel(aes(label = round(y, 1)), nudge_y = 1) +
ylim(c(0, 20))
Changing the theme to theme_bw() and removing the x-axis gridlines with {ggExtra}'s removeGridX() gets the plot closer to your second image. I also increased the size of the lines, limited the axes, and changed geom_text_repel to geom_label_repel to improve readability.
library(ggplot2)
library(ggrepel)
library(ggExtra)
set.seed(1)
x = rep(1:20, 3)
y = c(runif(20, 10, 11),
runif(20, 11, 12),
runif(20, 12, 13))
z = rep(c("a", "b", "c"), each = 20)
df = data.frame(x = x, y = y, z = z)
ggplot(data = df, aes(x = x, y = y, group = z, color = z)) +
theme_bw() + removeGridX() +
geom_line(size = 2) +
geom_label_repel(aes(label = round(y, 1)),
nudge_y = 0.5,
point.size = NA,
segment.color = NA,
min.segment.length = 0.1,
key_glyph = draw_key_path) +
scale_x_continuous(breaks=seq(0,20,by=1)) +
scale_y_continuous(breaks = seq(0, 14, 2), limits = c(0, 14))

Shift geometric object along horizontal axis with ggplot

I want to use ggplot to plot three curves, each made with stat_function and with its own parameters.
This is done with the code below:
library(ggplot2)
ggplot(data.frame(x = c(0, 25)), aes(x)) +
stat_function(fun = function(x) plogis(x, location = 5, scale = 2), colour = "red") +
stat_function(fun = function(x) plogis(x, location = 9, scale = 3), colour = "blue") +
stat_function(fun = function(x) plogis(x, location = 9, scale = 4), colour = "green")
which gives the figure below:
What I want to achieve is to shift the blue and green curves, exactly as they are, to the right along the horizontal axis (each by an arbitrary amount).
I don't know of an explicit way to do it in ggplot, so I tried to specify a different frame for the second and third geometric objects, as below:
ggplot(data.frame(x = c(0, 25)), aes(x)) +
stat_function(fun = function(x) plogis(x, location = 5, scale = 2), colour = "red") +
stat_function(data = data.frame(x = c(3, 28)), fun = function(x) plogis(x, location = 9, scale = 3), colour = "blue") +
stat_function(data = data.frame(x = c(5, 30)), fun = function(x) plogis(x, location = 9, scale = 4), colour = "green")
But the resulting image is the same as the one above.
Your solution is almost correct, but you need to subtract the same constant within the function itself, so that the y-values still correspond.
c1 <- 4
c2 <- 4
p2 <- ggplot(data.frame(x = c(0, 25)), aes(x)) +
stat_function(fun = function(x) plogis(x, location = 5, scale = 2), colour = "red") +
stat_function(data = data.frame(x = c(0+c1, 25+c1)),
fun = function(x) plogis(x - c1, location = 9, scale = 3), colour = "blue") +
stat_function(data = data.frame(x = c(0+c2, 25+c2)),
fun = function(x) plogis(x - c2, location = 9, scale = 4), colour = "green")
p2
PS: In the answer, I have also added the constants to the data.frame itself, so that the shift is visible (you can remove them from the data frame if you want only the original x-range shown).
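The subtraction inside the function can also be wrapped in a small helper, so that any curve can be shifted by an arbitrary amount. The sketch below is one way to do it; shift_right, blue and blue_shifted are names invented for this example, and the xlim argument of stat_function is used instead of a separate data frame per layer.

```r
library(ggplot2)

# Return a new function whose graph is f moved `by` units to the right
shift_right <- function(f, by) {
  function(x) f(x - by)
}

blue <- function(x) plogis(x, location = 9, scale = 3)
blue_shifted <- shift_right(blue, 4)

# The shifted curve takes at x = 13 the value the original takes at x = 9
all.equal(blue_shifted(13), blue(9))

p <- ggplot(data.frame(x = c(0, 30)), aes(x)) +
  stat_function(fun = blue, colour = "blue", linetype = "dashed") +
  # xlim restricts the range over which this layer is evaluated
  stat_function(fun = blue_shifted, colour = "blue", xlim = c(4, 29))

p
```

The dashed curve is the original and the solid one is the shifted copy, which makes it easy to confirm visually that the shape is unchanged.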

control overlaying lines while color is continuous value in ggplot

I have some data and would like to plot the lines and control the order in which they are drawn on top of each other.
I would like to use scale_color_viridis() as my palette. I have no idea how to plot the lighter (yellow) line on top of the darker ones.
Here is my toy data frame and my code:
library(ggplot2)
library(viridis) # provides scale_color_viridis()
toy_data <- data.frame(x = c(1,3,1,2,5,0), y = c(0, 1, 1, 0.6, 1, .7),
col = rep(c("r", "b", "g"), each = 2), group = seq(0,1, by = 0.2))
ggplot(toy_data, aes(x = x, y = y, group = col, color = group)) +
geom_line(size = 2) +
scale_color_viridis()
Any idea how I can do this?
The group aesthetic determines the plotting order, in this case, the col variable which is character data. It will normally plot in alphabetical order (b g r), so to get the yellow line from col "g" to print last, you could convert it to a factor ordered in order of appearance, like with forcats::fct_inorder:
library(ggplot2)
library(dplyr) # provides %>% here and if_else() further down
ggplot(toy_data,
aes(x = x, y = y, group = col %>% forcats::fct_inorder(), color = group)) +
geom_line(size = 2) +
scale_color_viridis_c() # added in ggplot2 3.0 in July 2018.
# scale_color_viridis for older ggplot2 versions
If col is numeric, you could achieve the same thing by giving your "top" series the biggest number.
toy_data2 <- data.frame(x = c(1,3,1,2,5,0), y = c(0, 01, 1, 0.6, 1, .7),
col = rep(c(3, 1, 2), each = 2), group = seq(0,1, by = 0.2))
ggplot(toy_data2,
aes(x = x, y = y, group = if_else(col == 2, 1e10, col), color = group)) +
geom_line(size = 2) +
scale_color_viridis_c()
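If you prefer to spell out the drawing order rather than rely on order of appearance, a base R factor with explicit levels should work the same way. This is a sketch, assuming (as described above) that groups are drawn in level order so the last level ends up on top; draw_order is a hypothetical column name.

```r
library(ggplot2)

toy_data <- data.frame(
  x = c(1, 3, 1, 2, 5, 0),
  y = c(0, 1, 1, 0.6, 1, 0.7),
  col = rep(c("r", "b", "g"), each = 2),
  group = seq(0, 1, by = 0.2)
)

# Explicit drawing order: move a value to the end of `levels`
# to draw its line last, i.e. on top of the others.
toy_data$draw_order <- factor(toy_data$col, levels = c("r", "b", "g"))

p <- ggplot(toy_data, aes(x = x, y = y, group = draw_order, color = group)) +
  geom_line(linewidth = 2) + # linewidth assumes ggplot2 >= 3.4; use size in older versions
  scale_color_viridis_c()

p
```

To put a different line on top, only the levels vector needs to change, for example levels = c("g", "b", "r") to draw the "r" line last.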

gganimate: two layers with different geometries and timepoints

The problem is similar to this question but here the two layers use different geometries, geom_tile and geom_point. The idea is to have tiles show up at different locations only in frames 2, 5, 8, and the point move along the diagonal in every frame.
When trying to run the following example, I get the error:
Error: time data must be the same class in all layers
Example
require(data.table)
require(ggplot2)
require(gganimate)
# 3 tiles along x = 10-y; present at time points 2, 5, 8
dtP1 = data.table(x = c(1, 5, 9),
y = c(9, 5, 1),
t = c(2, 5, 8))
# 9 points along x=y; present at every time point
dtP2 = data.table(x = 1:9,
y = 1:9,
t = 1:9)
p = ggplot() +
geom_tile(data = dtP1,
aes(x = x,
y = y),
color = "#000000") +
geom_point(data = dtP2,
aes(x = x,
y = y),
color = "#FF0000") +
gganimate::transition_time(t) +
gganimate::ease_aes('linear')
pAnim = gganimate::animate(p,
renderer = av_renderer("~/test.mp4"),
fps = 1,
nframes = 9,
height = 400, width = 400)
Does the following work for you?
library(dplyr)
p <- rbind(dtP1 %>% mutate(group = "group1"),
dtP2 %>% mutate(group = "group2")) %>%
tidyr::complete(t, group) %>%
ggplot(aes(x = x, y = y)) +
geom_tile(data = . %>% filter(group == "group1"),
color = "black") +
geom_point(data = . %>% filter(group == "group2"),
color = "red") +
ggtitle("{frame_time}") + # added this to show the frame explicitly; optional
transition_time(t) +
ease_aes('linear')
animate(p, nframes = 9, fps = 1)
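To see what the data preparation in this answer does, the sketch below repeats it without gganimate. In the question's data, the t column of dtP1 is created with c(2, 5, 8), a double, while the t column of dtP2 is 1:9, an integer, which matches the "same class" complaint in the error message; stacking the two tables gives one t column of a single class. The sketch uses plain data frames and dplyr::bind_rows instead of data.table and rbind, which is an assumption made to keep it self-contained, and the names tiles, pts and combined are invented for the example.

```r
library(dplyr)
library(tidyr)

# Same values as dtP1 and dtP2 in the question, as plain data frames
tiles <- data.frame(x = c(1, 5, 9), y = c(9, 5, 1), t = c(2, 5, 8))
pts   <- data.frame(x = 1:9, y = 1:9, t = 1:9)

class(tiles$t) # "numeric"
class(pts$t)   # "integer"

combined <- bind_rows(
  mutate(tiles, group = "group1"),
  mutate(pts, group = "group2")
) %>%
  # One row per (t, group) pair; pairs with no data get NA for x and y
  complete(t, group)

class(combined$t)       # one class for the whole time column
nrow(combined)          # 9 time points x 2 groups
sum(is.na(combined$x))  # time points at which no tile exists
```

The rows filled with NA are the six time points at which group1 has no tile, so each layer sees every value of t even when it has nothing to draw.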
