How can I make drop = TRUE work in scale_colour_discrete (so that the legend contains only the categories present in the subset) while keeping a stable colour mapping for categories across different ggplot plots?
This question is linked to this one and especially this comment.
Reproducible code borrowed from one of the answers in the linked question:
set.seed(2014)
library(ggplot2)
dataset <- data.frame(category = rep(LETTERS[1:5], 100),
                      x = rnorm(500, mean = rep(1:5, 100)),
                      y = rnorm(500, mean = rep(1:5, 100)))
dataset$fCategory <- factor(dataset$category)
subdata <- subset(dataset, category %in% c("A", "D", "E"))
ggplot(dataset, aes(x = x, y = y, colour = fCategory)) + geom_point()
ggplot(subdata, aes(x = x, y = y, colour = fCategory)) + geom_point() +
  scale_colour_discrete(drop = TRUE, limits = levels(dataset$fCategory))
Why does drop = TRUE not work in the second plot? The legend still contains all categories.
Output from sessionInfo():
R version 3.1.2 (2014-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] ggplot2_1.0.0
loaded via a namespace (and not attached):
[1] colorspace_1.2-4 digest_0.6.8 grid_3.1.2 gtable_0.1.2 labeling_0.3
[6] MASS_7.3-35 munsell_0.4.2 plyr_1.8.1 proto_0.3-10 Rcpp_0.11.3
[11] reshape2_1.4.1 scales_0.2.4 stringr_0.6.2 tools_3.1.2
This is either a misconception of what drop does (the help entry does not give much detail, unfortunately) or a bug. However, I'd recommend dropping drop altogether (pun intended) and setting both limits and breaks:
ggplot(subdata, aes(x = x, y = y, colour = fCategory)) + geom_point() +
  scale_colour_discrete(limits = levels(dataset$fCategory),
                        breaks = unique(subdata$fCategory))
The colour mapping stays consistent, and the legend lists only the categories present in the subset.
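If the goal is simply one fixed colour per category, a named palette is another route. This is a minimal sketch and not part of the answer above: the name `pal` is hypothetical, and the palette is assumed to be the default hue palette from scales::hue_pal(). With a named `values` vector, scale_colour_manual() matches colours to levels by name, so each category should keep its colour whichever subset is plotted. As far as I can tell, recent ggplot2 versions then show only the levels present in the data in the legend, but check this on your version.

```r
library(ggplot2)

set.seed(2014)
dataset <- data.frame(category = rep(LETTERS[1:5], 100),
                      x = rnorm(500, mean = rep(1:5, 100)),
                      y = rnorm(500, mean = rep(1:5, 100)))
dataset$fCategory <- factor(dataset$category)
subdata <- subset(dataset, category %in% c("A", "D", "E"))

# Hypothetical named palette: one colour per level of the full factor
pal <- setNames(scales::hue_pal()(nlevels(dataset$fCategory)),
                levels(dataset$fCategory))

# Same scale for both plots; colours are looked up by level name
p_full <- ggplot(dataset, aes(x = x, y = y, colour = fCategory)) +
  geom_point() +
  scale_colour_manual(values = pal)

p_sub <- ggplot(subdata, aes(x = x, y = y, colour = fCategory)) +
  geom_point() +
  scale_colour_manual(values = pal)
```

The trade-off is that the palette has to be built once from the full set of levels and passed to every plot.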
Related
I am having a strange issue where saving a ggplot figure that I make does not maintain the colors I set using scale_color_manual. I have made a reproducible example (with some editing) using the mtcars dataset.
library(ggplot2)
library(dplyr)   # %>%, filter(), between()
library(tibble)  # rownames_to_column()

plot1 <- ggplot(data = mtcars %>%
                  rownames_to_column("type") %>%
                  dplyr::filter(between(cyl, 6, 8)) %>%
                  dplyr::filter(between(gear, 4, 5))) +
  aes(y = wt, x = type) +
  geom_boxplot(outlier.size = 0) +
  geom_jitter(aes(color = factor(cyl), shape = factor(gear)), size = 10,
              position = position_jitter(width = .25, height = 0)) +
  # emoji glyphs used as point shapes, one per gear value
  scale_shape_manual(values = c("👧", "👦"), name = "Gear", labels = c("4", "5")) +
  # cyl is filtered to 6 and 8 above, so the labels name those two levels
  scale_color_manual(values = c('red4', 'springgreen4'), name = "cyl",
                     labels = c("6 cylinder", "8 cylinder")) +
  theme(legend.position = "top",
        plot.title = element_text(hjust = 0.5))  # center the plot title
ggsave("images/review/mean_AllAgents_test.png", plot1, width = 11, height = 6.5, dpi = 400)
The figure in the RStudio "Plots" pane shows cyl colored in red and green, whereas the file saved using ggsave does not show these colors.
I have tried using the fix from this SO post. I also have tried using cowplot::save_plot. The colors do remain if I manually Export the figure from the "Plots" pane.
Does anyone know why this is occurring?
R version 4.0.4 (2021-02-15)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19043)
Matrix products: default
locale:
[1] LC_COLLATE=English_Canada.1252 LC_CTYPE=English_Canada.1252 LC_MONETARY=English_Canada.1252 LC_NUMERIC=C
[5] LC_TIME=English_Canada.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] apastats_0.3 ggstatsplot_0.8.0 rstatix_0.7.0 hrbrthemes_0.8.0 gtsummary_1.4.2.9011 car_3.0-11
[7] carData_3.0-4 forcats_0.5.1 stringr_1.4.0 dplyr_1.0.6 purrr_0.3.4 readr_2.0.0
[13] tidyr_1.1.3 tibble_3.1.2 tidyverse_1.3.1 Rmisc_1.5 plyr_1.8.6 lattice_0.20-41
[19] ggplot2_3.3.5 rio_0.5.27 pacman_0.5.1
EDIT
I was asked to provide additional detail in my Preferences
I'm trying to add a segment in ggplot. However, adding alpha causes the segment to disappear. Although this is a known behavior that has been documented in many SO posts, I'm experiencing a particularly strange thing: when I generate the plot with reprex() I see the segment, but otherwise I don't.
Example with reprex()
library(ggplot2)
library(ggforce)
df_empty_circle <-
  data.frame(x = 0,
             y = 0,
             r = 1)
p_empty_circle <-
  ggplot(df_empty_circle) +
  geom_circle(mapping = aes(x0 = x, y0 = y, r = r)) +
  coord_fixed() +
  theme_void()
p_no_alpha <-
  p_empty_circle +
  annotate(geom = "segment", y = -1, yend = -1, x = -Inf, xend = 0)
p_no_alpha
p_with_alpha <-
  p_empty_circle +
  annotate(geom = "segment", y = -1, yend = -1, x = -Inf, xend = 0, alpha = 0.2)
p_with_alpha
Created on 2021-08-02 by the reprex package (v2.0.0)
Example when running code without reprex
Well, the same code as above, and the output is:
p_no_alpha
p_with_alpha
Why no segment in p_with_alpha when it's run outside reprex()?
Session Info
R version 4.1.0 (2021-05-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 8.1 x64 (build 9600)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] reprex_2.0.0 dplyr_1.0.7 ggforce_0.3.3 ggplot2_3.3.5
Is there any explanation for this?
I'm trying to change the legend title on my ggplot. Here are two examples (partly from here); the first one is with the sf package, which is what I'm really using. The second one is without that package which seems to have the same problem.
With sf, what I want:
library(ggplot2)
library(dplyr)  # provides %>%

cities <- tibble::tribble(
  ~ lon,    ~ lat,    ~ name,      ~ pop,
  5.121420, 52.09074, "Utrecht",   311367,
  6.566502, 53.21938, "Groningen", 189991,
  4.895168, 52.37022, "Amsterdam", 779808
) %>% sf::st_as_sf(coords = c("lon", "lat"), crs = 4326)
lines_sfc <- sf::st_sfc(list(
  sf::st_linestring(rbind(cities$geometry[[1]], cities$geometry[[2]])),
  sf::st_linestring(rbind(cities$geometry[[2]], cities$geometry[[3]]))
))
lines <- sf::st_sf(
  id = 1:2,
  size = c(10, 50),
  geometry = lines_sfc,
  crs = 4326
)
ggplot() +
  geom_sf(aes(colour = pop, size = pop), data = cities)
which gives a nice legend with bad title:
Using this, I modified my script to change the legend:
ggplot() +
  geom_sf(aes(colour = pop), data = cities) +
  guides(colour = guide_legend(title = "New color"))
which gives:
The legend isn't a gradient anymore, why?
If you don't have sf, the same happens with a geom_bar:
ggplot() +
  geom_bar(aes(x = name, y = pop, colour = pop), stat = "identity", data = cities)
gives:
while this:
ggplot() +
  geom_bar(aes(x = name, y = pop, colour = pop), stat = "identity", data = cities) +
  guides(colour = guide_legend(title = "New color"))
gives:
Is there a way to change only the title of the legend and not the whole thing?
my sessionInfo:
R version 3.4.3 (2017-11-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server >= 2012 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] sp_1.2-6 rgeos_0.3-26 ggplot2_2.2.1.9000 dplyr_0.7.4 sf_0.6-1
loaded via a namespace (and not attached):
[1] Rcpp_0.12.14 pillar_1.0.1 compiler_3.4.3 git2r_0.19.0 plyr_1.8.4
[6] bindr_0.1 viridis_0.4.0 class_7.3-14 tools_3.4.3 digest_0.6.13
[11] viridisLite_0.2.0 memoise_1.1.0 tibble_1.4.1 gtable_0.2.0 lattice_0.20-35
[16] pkgconfig_2.0.1 rlang_0.1.6.9002 cli_1.0.0 DBI_0.7 rstudioapi_0.6
[21] rgdal_1.2-16 curl_2.8.1 bindrcpp_0.2 gridExtra_2.3 e1071_1.6-8
[26] withr_2.1.1.9000 httr_1.3.1 knitr_1.18 devtools_1.13.3 classInt_0.1-24
[31] grid_3.4.3 glue_1.1.1 R6_2.2.2 udunits2_0.13 magrittr_1.5
[36] scales_0.5.0.9000 RStudioShortKeys_0.1.0 units_0.5-1 assertthat_0.2.0 colorspace_1.3-2
[41] labeling_0.3 utf8_1.1.3 lazyeval_0.2.1 munsell_0.4.3 crayon_1.3.4
ggplot() +
  geom_sf(aes(colour = pop, size = pop), data = cities) +
  scale_color_continuous(name = 'newname')
You can call the color scale directly and specify only its name, which leaves the gradient guide as it is.
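The gradient disappears in the question because guide_legend() asks for a discrete legend guide, which replaces the colourbar that a continuous colour scale uses by default. Below is a minimal sketch of three ways to change only the title. It assumes a plain data frame (`cities_df`, a hypothetical stand-in for `cities` so that sf is not needed) and geom_point() instead of geom_sf().

```r
library(ggplot2)

# Hypothetical stand-in for the `cities` data, without sf
cities_df <- data.frame(
  name = c("Utrecht", "Groningen", "Amsterdam"),
  pop  = c(311367, 189991, 779808)
)

p <- ggplot(cities_df, aes(x = name, y = pop, colour = pop)) +
  geom_point(size = 4)

# Option 1: rename through the scale, as in the answer above
p_scale <- p + scale_colour_continuous(name = "New color")

# Option 2: rename through labs(); the guide type is left untouched
p_labs <- p + labs(colour = "New color")

# Option 3: keep guides(), but request a colourbar instead of a legend
p_guide <- p + guides(colour = guide_colourbar(title = "New color"))
```

If size is mapped to the same variable, as in the sf example, its legend title has to be renamed separately (for instance with labs(size = ...)).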
I would like to add rolling medians to my data in ggplot2. Calculating the rolling median inside the ggplot aes() and calculating it in the data.frame itself do not produce the same results (see plots).
I am looking for a solution within ggplot2 that produces the same results as in the data.frame calculation. I know this can be done with ggseas::stat_rollapplyr, but would prefer a solution in base ggplot2.
Code:
library(ggplot2)
library(data.table)
library(zoo)
library(gridExtra)
# set up dummy data
set.seed(123)
x = data.table(
  date = rep(seq(from = as.Date("2016-01-01"), to = as.Date("2016-04-01"), by = "day"), 2),
  y = c(5 + runif(92), 6 + runif(92)),
  label = c(rep("A", 92), rep("B", 92))
)
x[, `:=` (
  roll = rollmedian(y, k = 15, fill = NA, align = "center")
), by = label]
# plots
theme_set(theme_bw())
p = ggplot(x) +
  geom_line(aes(date, y), col = "lightgrey") +
  facet_wrap(~label)
# within aes
p1 = p +
  geom_line(aes(date, rollmedian(y, k = 15, fill = NA, align = "center"))) +
  labs(title = "within aes")
# calculated in data.frame
p2 = p +
  geom_line(aes(date, roll)) +
  labs(title = "within data.frame")
grid.arrange(p1, p2)
sessionInfo()
R version 3.2.3 (2015-12-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.1 LTS
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=nl_NL.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=nl_NL.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=nl_NL.UTF-8
[8] LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=nl_NL.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] zoo_1.7-13 magrittr_1.5 data.table_1.9.7 ggplot2_2.1.0.9000
loaded via a namespace (and not attached):
[1] labeling_0.3 colorspace_1.2-6 scales_0.4.0 assertthat_0.1 plyr_1.8.4 rsconnect_0.4.3 tools_3.2.3 gtable_0.2.0 tibble_1.2 Rcpp_0.12.7 grid_3.2.3 munsell_0.4.3
[13] lattice_0.20-33
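A likely explanation for the difference: expressions inside aes() are evaluated once on the layer's whole data, before it is split into facets or groups. So rollmedian() inside aes() slides a single window across the A rows and straight on into the B rows, while the data.table version runs once per label. The sketch below shows the effect without ggplot2. The helper roll_med() is a hypothetical base-R stand-in for zoo::rollmedian(v, k, fill = NA, align = "center"), used only to keep the example free of extra packages.

```r
# Hypothetical stand-in for zoo::rollmedian(v, k, fill = NA, align = "center")
roll_med <- function(v, k) {
  half <- (k - 1) / 2
  out <- rep(NA_real_, length(v))
  if (length(v) >= k) {
    for (i in (half + 1):(length(v) - half)) {
      out[i] <- median(v[(i - half):(i + half)])
    }
  }
  out
}

set.seed(123)
y <- c(5 + runif(92), 6 + runif(92))
label <- rep(c("A", "B"), each = 92)

# One pass over all rows, which is what the aes() version effectively does
roll_whole <- roll_med(y, 15)

# One pass per group, which is what `by = label` does
roll_grouped <- ave(y, label, FUN = function(v) roll_med(v, 15))

sum(is.na(roll_whole))    # NAs only at the two ends of the full vector
sum(is.na(roll_grouped))  # NAs at both ends of each group
```

Within ggplot2 itself, one option that should give the per-group result is to do the grouping inside aes(), for example aes(date, ave(y, label, FUN = function(v) rollmedian(v, k = 15, fill = NA, align = "center"))).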
While plotting several ecdf curves that overlapped, I tried adjusting the alpha of the curves to improve visibility. While tinkering with the correct placement of alpha, I found the following.
library(ggplot2)
library(dplyr)
x <- data.frame(Var = rep(1:3, 10000)) %>%
  mutate(Val = rnorm(10000) * Var,
         Var = factor(Var)) %>%
  arrange(Var, Val) %>%
  group_by(Var) %>%
  mutate(ecdf = ecdf(Val)(Val))
ggplot(x, aes(x = Val)) +
  stat_ecdf(aes(color = Var), size = 1.25, alpha = .9)
This gives the lines the correct alpha, but makes the legend useless. (I'm only using alpha = .9 here to demonstrate that the legend colors disappear completely.) The workaround I've found is to add a guides() call:
ggplot(x, aes(x = Val)) +
  stat_ecdf(aes(color = Var), size = 1.35, alpha = .9) +
  guides(color = guide_legend(override.aes = list(alpha = 1)))
So while I have a solution for my immediate problem, can someone explain why the first call to ggplot is messed up? Is this a bug? If it makes any difference, I believe this issue also exists when using geom_line (though a slightly different data.frame is needed).
Weird. Here's my sessionInfo(). I've also checked to see if there are any outdated packages.
sessionInfo()
R version 3.2.1 (2015-06-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=Japanese_Japan.932 LC_CTYPE=Japanese_Japan.932 LC_MONETARY=Japanese_Japan.932
[4] LC_NUMERIC=C LC_TIME=Japanese_Japan.932
attached base packages:
[1] splines stats graphics grDevices utils datasets methods base
other attached packages:
[1] RColorBrewer_1.1-2 ggplot2_1.0.1 stringr_1.0.0 tidyr_0.2.0 dplyr_0.4.2
[6] data.table_1.9.4
loaded via a namespace (and not attached):
[1] Rcpp_0.11.6 magrittr_1.5 MASS_7.3-40 munsell_0.4.2 colorspace_1.2-6
[6] R6_2.0.1 plyr_1.8.3 tools_3.2.1 parallel_3.2.1 grid_3.2.1
[11] gtable_0.1.2 DBI_0.3.1 lazyeval_0.1.10 assertthat_0.1 digest_0.6.8
[16] reshape2_1.4.1 labeling_0.3 stringi_0.5-4 scales_0.2.5 chron_2.3-47
[21] proto_0.3-10
How are they different? What am I missing?
library(ggplot2)
library(dplyr)
library(gridExtra)
x <- data.frame(Var = rep(1:3, 10000)) %>%
  mutate(Val = rnorm(10000) * Var,
         Var = factor(Var)) %>%
  arrange(Var, Val) %>%
  group_by(Var) %>%
  mutate(ecdf = ecdf(Val)(Val))
ggplot(x, aes(x = Val)) +
  stat_ecdf(aes(color = Var), size = 1.25, alpha = .9) -> gg1
ggplot(x, aes(x = Val)) +
  stat_ecdf(aes(color = Var), size = 1.35, alpha = .9) +
  guides(color = guide_legend(override.aes = list(alpha = 1))) -> gg2
grid.arrange(gg1, gg2)