Dark to light colours based on value ggplot2 - r

I am trying to customize the colours using ggplot2. The function I wrote is as follows:
library(tidyverse)
spaghetti_plot_multiple <- function(input, MV, item_level){
MV <- enquo(MV)
titles <- enquo(item_level)
input %>%
filter(!!(MV) == item_level) %>%
mutate(first_answer = first_answer) %>%
ggplot(.,aes( x = time, y = jitter(Answer), group = ID)) +
geom_line(aes(colour = first_answer)) +
labs(title = titles ,x = 'Time', y = 'Answer', colour = 'Answer given at time 0') +
facet_wrap(~ ID, scales = "free_x")+
theme(strip.text = element_text(size = 8)) +
scale_color_manual(values = c('red', 'blue', 'brown', 'purple', 'black'))
}
However, this doesn't work, and I can't figure out why the values passed to scale_color_manual(...) are not applied. The current plot I am using is:
This is somewhat in line with what I am trying to achieve: a dark color for values 1-3 (i.e. based on first_answer which ranges from 1 to 5) and lighter ones for 4 and 5. The reason is simply because there are many more lines with a value of 4 or 5 and I want to be able to see the direction of lines across time.
EDIT: The image is the plot I currently have. Although it somewhat resembles what I'd like to get, I'd much rather set the colors myself or use a function that automatically chooses colors to make the lines in the plot easier to see.

You can specify color gradients with scale_x_gradient, scale_x_gradient2 or scale_x_gradientn, where x is either fill or colour (for example scale_colour_gradientn).
A caveat when passing values = c(...) to scale_x_gradientn: values assigns each colour a position within the range c(0, 1). You therefore need to rescale the data values that you want to use as breaks to the range c(0, 1).
Regarding your question about which palette is best for 5 distinct lines: I think it is best to specify the colours manually, as you have done. I often use hex codes instead of colour names. I personally look those up at html color codes.
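As a rough sketch of the rescaling caveat above, the snippet below uses a small made-up data frame (toy, with the same column names as in the question) and made-up colour choices. It also shows a discrete alternative that may explain the original problem, assuming first_answer is stored as a number: scale_colour_manual() only works on a discrete variable, so the column has to be wrapped in factor().

```r
library(ggplot2)
library(scales)  # for rescale()

# Hypothetical stand-in for the asker's data: 5 IDs, 4 time points each
set.seed(1)
toy <- data.frame(
  ID = rep(1:5, each = 4),
  time = rep(0:3, times = 5),
  Answer = sample(1:5, 20, replace = TRUE),
  first_answer = rep(1:5, each = 4)
)

# Breaks on the data scale (1 to 5), rescaled to [0, 1] for `values`
breaks <- c(1, 3, 4, 5)
pos <- rescale(breaks, to = c(0, 1), from = c(1, 5))

# Continuous gradient: dark colours up to 3, light colours for 4 and 5
p_gradient <- ggplot(toy, aes(time, Answer, group = ID)) +
  geom_line(aes(colour = first_answer)) +
  scale_colour_gradientn(
    colours = c("black", "darkblue", "lightblue", "khaki"),
    values = pos,
    limits = c(1, 5)
  )

# Discrete alternative: convert to a factor so a manual scale can be used
p_manual <- ggplot(toy, aes(time, Answer, group = ID)) +
  geom_line(aes(colour = factor(first_answer))) +
  scale_colour_manual(
    values = c("1" = "black", "2" = "darkred", "3" = "darkblue",
               "4" = "orange", "5" = "gold")
  ) +
  labs(colour = "Answer given at time 0")
```

Naming the entries of values ties each colour to a specific level, so the assignment does not depend on the order of the factor levels.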

Related

GGPlot is returning different colours to what I specify

I'm somewhat new to R - and having a really strange issue while trying to produce the following plot
worst_death <- df_clean %>%
group_by(event_cat) %>%
summarise(Deaths = sum(FATALITIES)
, Injuries = sum(INJURIES)) %>%
ggplot()+
geom_segment(aes(x=reorder(event_cat,Injuries),xend=reorder(event_cat,Injuries), y=Deaths, yend = Injuries, color="black")) +
geom_point(aes(x=reorder(event_cat,Injuries), y=Deaths,color="yellow", size=1 ))+
geom_point(aes(x=reorder(event_cat,Injuries), y=Injuries,color="white", size=1 ))+
coord_flip()+
theme_ipsum()+
theme(legend.position = "none",) +
xlab("Event Type") +
ylab("Human Impact")
worst_death
The graph renders without errors, except that the colours and other aesthetic options (size etc.) are not what I specified.
Strangely enough, the colours are red, blue and green rather than yellow, black and white.
Does anyone know why this might be happening?
Thanks
I can't test this without your data, but the following should work for you:
worst_death <- df_clean %>%
group_by(event_cat) %>%
summarise(Deaths = sum(FATALITIES),
Injuries = sum(INJURIES)) %>%
ggplot(aes(x = reorder(event_cat,Injuries), y = Deaths)) +
geom_segment(aes(xend = reorder(event_cat,Injuries),
y = Deaths, yend = Injuries)) +
geom_point(color = "yellow", size = 1) +
geom_point(aes(y = Injuries), color = "white", size = 1) +
coord_flip() +
theme_ipsum() +
theme(legend.position = "none") +
xlab("Event Type") +
ylab("Human Impact")
worst_death
There are a couple of points to note:
When you use a character string for color inside aes, ggplot reads it as a single factor level that assigns the geom to a labelled color grouping, and will not interpret it as a literal color assignment. If you had a legend in your plot, the key would show the labels "white" and "yellow" against the red and blue key dots. You can either add + scale_color_identity() to your plot if you want these labels to be interpreted as literal colors or, more commonly, just bring color = outside of aes, where it is interpreted as an actual color assignment. This is the easiest way to do it if you don't want a legend.
You should probably bring size = outside the aes call too, effectively for the same reason. ggplot is mapping the number 1 to its default size scale rather than literally making the points size 1.
The geom_segment is black by default, so it doesn't need a color assignment.
You can save some typing (and hence reduce risk of bugs and make maintenance easier) if you include the default x and y aesthetics in the original ggplot call. These are inherited by any subsequent geoms, but can be over-ridden if required.
When posting a question on SO, please include data as well as code, otherwise no-one can reproduce your problem or test / demonstrate possible solutions. The easiest way to do this in your case is to copy and paste the output of dput(df_clean) into your question.
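The difference between mapping and setting a colour can be checked on a tiny made-up data frame (toy below is hypothetical, since df_clean is not available). This is only a sketch; layer_data() is used to inspect the colours ggplot2 actually computed.

```r
library(ggplot2)

# Hypothetical toy data standing in for the summarised df_clean
toy <- data.frame(event = c("a", "b", "c"), deaths = c(3, 1, 2))

# Inside aes(): "yellow" is treated as data (a one-level category),
# so the default discrete palette chooses the colour
p_mapped <- ggplot(toy, aes(event, deaths)) +
  geom_point(aes(color = "yellow"))

# Outside aes(): "yellow" is used as a literal colour
p_set <- ggplot(toy, aes(event, deaths)) +
  geom_point(color = "yellow", size = 3)

# Inside aes() plus scale_color_identity(): the string is used as-is
p_identity <- ggplot(toy, aes(event, deaths)) +
  geom_point(aes(color = "yellow")) +
  scale_color_identity()

# Colours actually drawn by each plot
unique(layer_data(p_mapped)$colour)    # a palette colour, not "yellow"
unique(layer_data(p_set)$colour)       # "yellow"
unique(layer_data(p_identity)$colour)  # "yellow"
```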

Excluding cells from transparency in heatmap with ggplot

I am trying to generate a heatmap where I can show more than one level of information on each cell. For each cell I would like to show a different color depending on its value in one variable and then overlay this with a transparency (alpha) that shades the cell according to its value for another variable.
Similar questions have been addressed here (Place 1 heatmap on another with transparency in R) and here (Making a heatmap in R varying both color and transparency). In both cases the suggestion is to use ggplot and overlay two geom_tiles, one with the colors and one with the transparency.
I have managed to overlay two geom_tiles (see code below). However, in my case, the problem is that the shading defined by the transparency (or "alpha") geom_tile also shades some cells that should remain as white or blank according to the colors (or "fill") geom_tile. I would like these cells to remain white even after overlaying the transparency.
#Create sample dataframe
df <- data.frame("x_pos" = c("A","A","A","B","B","B","C","C","C"),
"y_pos" = c("X","Y","Z","X","Y","Z","X","Y","Z"),
"col_var"= c(1,2,NA,4,5,6,NA,8,9),
"alpha_var" = c(7,12,0,3,2,15,0,6,15))
#Convert factor columns to numeric
df$col_var<- as.numeric(df$col_var)
df$alpha_var<- as.numeric(df$alpha_var)
#Cut display variable into breaks
df$col_var_cut <- cut(df$col_var,
breaks = c(0,3,6,10),
labels = c("cat1","cat2", "cat3"))
#Plot
library(ggplot2)
ggplot(df, aes (x = x_pos, y = y_pos, fill = col_var_cut, label = col_var)) +
geom_tile () +
geom_text() +
scale_fill_manual(values=(brewer.pal(3, "RdYlBu")),na.value="white") +
geom_tile(aes(alpha = alpha_var), fill ="gray29")+
scale_alpha_continuous("alpha_var", range=c(0,0.7), trans = 'reverse')+
theme_bw() +
theme(axis.text.x = element_text(angle = 90, hjust = 1))
I would like cells "AZ" and "CX" in the heatmap resulting from the code above to be colored white instead of grey such that the alpha transparency doesn't apply to them. In my data, these cells have NA in the color variable (col_var) and can have a value of NA or 0 (as in the example code) in the transparency/alpha variable (alpha_var).
If this is not possible, I would like to know whether there are other options to display both variables in a heatmap and keep the cells with NA in col_var white. I am happy to use other packages or alternative heatmap layouts, such as those where the size of each cell or the thickness of its border varies according to the values of alpha_var. However, I am not sure how I could achieve this either.
Thanks in advance and my apologies for the cumbersome bits in the example code (I am still learning R and this is my first time asking questions here).
You were not far off. See below for a possible solution. The first plot adds the transparency within the geom_tile call itself; note that I removed the trans = 'reverse' specification from your plot.
The second plot just adds the white tiles back on top of the other layers, a simple hack that you will often find necessary when you want to plot certain data points differently.
Note I have added a few minor comments to your code below.
# Create the data frame under a better name: df is also the name of a function in R (stats::df), so it is not recommended as an example name.
# Also note that I removed the quotation marks around the column names (they were not necessary) and called as.numeric directly.
mydf <- data.frame(
  x_pos = c("A","A","A","B","B","B","C","C","C"),
  y_pos = c("X","Y","Z","X","Y","Z","X","Y","Z"),
  col_var = as.numeric(c(1,2,NA,4,5,6,NA,8,9)),
  alpha_var = as.numeric(c(7,12,0,3,2,15,0,6,15))
)
mydf$col_var_cut <- cut(mydf$col_var, breaks = c(0,3,6,10), labels = c("cat1","cat2", "cat3"))
#Plot
library(tidyverse)
library(RColorBrewer) # you forgot to add this to your reprex
ggplot(mydf, aes (x = x_pos, y = y_pos, fill = col_var_cut, label = col_var)) +
geom_tile(aes(alpha = alpha_var)) +
geom_text() +
scale_fill_manual(values=(brewer.pal(3, "RdYlBu")), na.value="white")
#> Warning: Removed 2 rows containing missing values (geom_text).
# a bit hacky for quick and dirty solution. Note I am using dplyr::filter from the tidyverse
ggplot(mapping = aes(x = x_pos, y = y_pos, fill = col_var_cut, label = col_var)) +
geom_tile(data = filter(mydf, !is.na(col_var))) +
geom_tile(data = filter(mydf, !is.na(col_var)), aes(alpha = alpha_var), fill ="gray29")+
geom_tile(data = filter(mydf, is.na(col_var)), fill = 'white') +
geom_text(data = mydf) +
scale_fill_manual(values = (brewer.pal(3, "RdYlBu"))) +
scale_alpha_continuous("alpha_var", range=c(0,0.7), trans = 'reverse')
#> Warning: Removed 2 rows containing missing values (geom_text).
Created on 2019-07-04 by the reprex package (v0.2.1)

Stacked barplot with colour gradients for each bar

I want to color a stacked barplot so that each bar has its own parent colour, with colours within each bar to be a gradient of this parent colour.
Example:
Here is a minimal example. I would like each bar to have a different colour according to color, with a gradient within each bar set by clarity.
library(ggplot2)
ggplot(diamonds, aes(color)) +
geom_bar(aes(fill = clarity), colour = "grey")
In my real problem, I have many more groups of each: requiring 18 different bars with 39 different gradient colours.
I have made a function ColourPalleteMulti, which lets you create a combined colour palette based on subgroups within your data:
ColourPalleteMulti <- function(df, group, subgroup){
  # Find how many colour categories to create and the number of colours in each
  categories <- aggregate(as.formula(paste(subgroup, group, sep = "~")), df, function(x) length(unique(x)))
  category.start <- scales::hue_pal(l = 100)(nrow(categories)) # Set the top of the colour palette
  category.end <- scales::hue_pal(l = 40)(nrow(categories)) # Set the bottom
  # Build the colour palette
  colours <- unlist(lapply(seq_len(nrow(categories)),
    function(i){
      colorRampPalette(colors = c(category.start[i], category.end[i]))(categories[i, 2])
    }))
  return(colours)
}
Essentially, the function identifies how many different groups you have, then counts the number of colours within each of these groups. It then joins together all the different colour palettes.
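To make the core step concrete, here is a minimal sketch of what happens for a single group: a light and a dark version of the same hue are generated, and colorRampPalette() interpolates between them. The numbers 3 and 4 are arbitrary choices for illustration.

```r
library(scales)

# A light and a dark version of the same 3 hues
starts <- hue_pal(l = 100)(3)
ends   <- hue_pal(l = 40)(3)

# Interpolate 4 shades for the first hue, running from light to dark
shades <- colorRampPalette(c(starts[1], ends[1]))(4)
shades
```

The function repeats this for every group and concatenates the resulting vectors, so the final palette has one entry per group and subgroup combination.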
To use the palette, it is easiest to add a new column group, which pastes together the two values used to make the colour palette:
library(ggplot2)
# Create data
df <- diamonds
df$group <- paste0(df$color, "-", df$clarity)
# Build the colour palette
colours <- ColourPalleteMulti(df, "color", "clarity")
# Plot results
ggplot(df, aes(color)) +
  geom_bar(aes(fill = group), colour = "grey") +
  scale_fill_manual("Subject", values = colours, guide = "none")
Edit:
If you want each bar to contain several different colours, you can just change the variable used for the x-axis of the barplot:
# Plot results
ggplot(df, aes(cut)) +
geom_bar(aes(fill = group), colour = "grey") +
scale_fill_manual("Subject", values=colours, guide = "none")
A note of caution: in all honesty, the dataset you want to plot probably has too many sub-categories for this to work well.
Also, although this is visually very pleasing, I would suggest avoiding a colour scale like this. It is more about making the plot look pretty, and the different colours are redundant because we already know which group the data is in from the x-axis.
An easier approach to achieve a colour gradient is to use alpha to change the transparency of the colour. However, this can have unintended consequences, as transparency means you can see the gridlines through the bars.
library(ggplot2)
ggplot(diamonds, aes(color, alpha = clarity)) +
geom_bar(aes(fill = color), colour = "grey") +
scale_alpha_discrete(range = c(0,1))
I have recently created the package ggnested which creates such plots. It is essentially a wrapper around ggplot2 that takes main_group and sub_group in the aesthetic mapping, where colours are generated for the main_group, and a gradient is generated for the levels of sub_group that are nested within each level of the main_group.
devtools::install_github("gmteunisse/ggnested")
require(ggnested)
data(diamonds)
ggnested(diamonds, aes(main_group = color, sub_group = clarity)) +
geom_bar(aes(x = color))
Another option is to use any custom color palette and simply darken/lighten those depending on the fill category. It can be slightly tricky to get a smooth gradient in each bar, but if you keep the natural order of the data (either appearance in data frame or the factor levels) this is not a big problem.
I am using the colorspace package for this task. The shades package also has the option to darken/lighten colors, but the syntax is slightly longer. It is more suitable for modification of entire palettes without specifying specific colors.
library(tidyverse)
library(colorspace)
## get some random colors, here n colors based on the Dark2 palette using the colorspace package.
## But ANY palette is possible
my_cols <- qualitative_hcl(length(unique(diamonds$color)), "Dark2")
## for easier assignment, name the colors
names(my_cols) <- unique(diamonds$color)
## assign the color to the category, by group
df_grad <-
diamonds %>%
group_by(color) %>%
## to keep the order of your stack and a natural gradient
## use order by occurrence in data frame or by factor
## clarity is an ordered factor, so I'm using a dense rank
mutate(
clarity_rank = dense_rank(as.integer(clarity)),
## index by name; a bare factor would index by its integer codes instead
new_cols = my_cols[as.character(color)],
## now darken or lighten according to the rank
clarity_dark = darken(new_cols, amount = clarity_rank / 10),
clarity_light = lighten(new_cols, amount = clarity_rank / 10)
)
## use this new color for your fill with scale_identity
## you additionally need to keep your ordering variable as group, in this case
## an interaction between color and your new rank
ggplot(df_grad, aes(color, group = interaction(color, clarity_rank))) +
geom_bar(aes(fill = clarity_dark)) +
scale_fill_identity()
ggplot(df_grad, aes(color, group = interaction(color, clarity_rank))) +
geom_bar(aes(fill = clarity_light)) +
scale_fill_identity()
Created on 2022-07-03 by the reprex package (v2.0.1)

Separate palettes for facets in ggplot facet_grid

Question
How can I use a different color palette for each facet? Ideally I would like to have a generic legend in gray to serve as a reference.
I'm working on a visualization using ggplot's facet_grid. The layout is fine, but I would like to use a distinct color palette for every row in the grid. My goal is to use a similarly shaded gradient for every palette and then tie them together with a grayscale legend. I would like to do this to maintain internal color-coding consistency within a larger set of graphics. It would be amazing to still be able to use facet_grid instead of using grobs (with which I am vastly less familiar).
I've included an example to work with using the diamonds data set and an arbitrary grouping to approximate what my data looks like.
data(diamonds)
diamonds$arbitrary = sample(c("A", "B", "C"), length(diamonds$cut), replace = TRUE)
blues = brewer.pal(name="Blues", n=3)
greens = brewer.pal(name="Greens", n=3)
oranges = brewer.pal(name="Oranges", n=3)
purples = brewer.pal(name="Purples", n=3)
ggplot(diamonds) +
geom_bar(aes(x = clarity, stat = "bin", fill = arbitrary, group = arbitrary)) +
facet_grid(cut~.) +
# Here I assign one palette... is this where I could also
# designate the other palettes?
scale_fill_manual(values = blues)
Thank you!
Faking a colour scale with transparency might be your best option, unless you're willing to combine multiple pieces at the grid/gtable level.
# stat is not an aesthetic, so it is dropped from aes(); geom_bar() counts rows by default
ggplot(diamonds) +
  geom_bar(aes(x = clarity, fill = cut,
               alpha = arbitrary, group = arbitrary)) +
  facet_grid(cut~.) +
  scale_fill_manual(values = brewer.pal(name = "Set1", n = 5), guide = "none") +
  scale_alpha_manual(values = c(0.8, 0.6, 0.4))
Another option would be to change the color brightness based on the factor level. You can use the colorspace package or the shades package for this. Here, I prefer the colorspace package because of its slightly simpler syntax and because you only need to modify specific colors. The disadvantage is that adding a legend of grey values could become very hacky.
A further option would be to overlay a semi-transparent layer of grey values in different shades. This requires two fill scales (easiest with the ggnewscale package). The advantage of this approach is that it is much shorter and seems less hacky. A possible disadvantage is that it can be tricky to find the alpha and the exact grey values that keep the colors saturated enough for your liking.
Option 1 - change color brightness
library(tidyverse)
library(colorspace)
data(diamonds)
diamonds$arbitrary = sample(c("A", "B", "C"), length(diamonds$cut), replace = TRUE)
## get colors of your choice
my_cols <- RColorBrewer::brewer.pal(length(unique(diamonds$cut)), "Set1")
## for easier assignment, name the colors
names(my_cols) <- unique(diamonds$cut)
## assign the color to the category
df_grad <-
diamonds %>%
## make your colored aesthetic an ordered factor for better control,
## and change the brightness according to the factor level
mutate(
arbitrary_ind = as.integer(ordered(arbitrary)),
## index by name; a bare factor would index by its integer codes instead
new_cols = my_cols[as.character(cut)],
## now darken or lighten according to the rank
arbitrary_dark = darken(new_cols, amount = arbitrary_ind / 10)
)
## use this new color for your fill with scale_identity
ggplot(df_grad) +
geom_bar(aes(clarity, fill = arbitrary_dark)) +
facet_grid(cut~.) +
scale_fill_identity()
Option 2 - add layer of grey scales
ggplot(diamonds) +
geom_bar(aes(clarity, fill = cut), show.legend = FALSE) +
## add new layer of grey values with an alpha
ggnewscale::new_scale_fill() +
geom_bar(aes(clarity, fill = arbitrary), alpha = .5) +
scale_fill_brewer(palette = "Greys") +
facet_grid(cut~.)

R: In ggplot2, how do you combine linetype and color when they're set by different variables?

I've got a line graph where the linetype is set by one variable (SampleType, in this example) and the color is set by another (Sample). For these data, I'd like it if the legend combined both of those variables into one legend entry rather than having one entry for the color and one for the linetype. Here are my example data:
EIC data
Here is the code and the plot I've come up with so far:
EIC <- read.csv("EIC data.csv")
ggplot(EIC, aes(x = Time, y = Counts, color = Sample, linetype = SampleType)) +
geom_line()
What I'd really like is for the legend to just show "Sample" and then the appropriate colors AND linetypes for each of those samples, so it would be solid for the standards and dashed for the clinical samples and each sample would be a different color. I love ggplot2, so I'd prefer to continue to use ggplot2 to do this. I've tried adding scale_linetype_manual like this to the code:
scale_linetype_manual(values = c(rep("solid", 3), rep("dashed", 2)))
but that's not changing the legend or the graph. I've also tried making a new column with "solid" or "dashed" in each row depending on whether the sample is a clinical sample or a standard and then using scale_linetype_identity(), but while that does work for the graph, it's not changing the legend since I'm still mapping color to one variable and linetype to a second.
I'm using R version 3.0.2 and ggplot2_1.0.0.
Thanks in advance for any ideas!
Map the variable Sample to both color and linetype, then set the desired linetypes with scale_linetype_manual(). Because both aesthetics then share one variable, ggplot2 merges them into a single legend.
ggplot(EIC, aes(x = Time, y = Counts, color = Sample, linetype = Sample)) +
  geom_line() +
  scale_linetype_manual(values = c(rep("solid", 3), rep("dashed", 2)))
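Since the EIC data is not available here, the following is only a sketch on made-up data: the sample names (Std1 to Std3, Clin1, Clin2) and the Counts values are hypothetical. It uses a named vector for the linetypes, which may be safer than relying on position, because unnamed values are assigned in the order of the factor levels (alphabetical by default).

```r
library(ggplot2)

# Hypothetical stand-in for the EIC data: 3 standards and 2 clinical samples
toy <- expand.grid(
  Time = 1:10,
  Sample = c("Std1", "Std2", "Std3", "Clin1", "Clin2"),
  stringsAsFactors = FALSE
)
toy$Counts <- toy$Time * as.integer(factor(toy$Sample))

# Named vector: each sample gets its linetype regardless of level order
line_types <- c(Std1 = "solid", Std2 = "solid", Std3 = "solid",
                Clin1 = "dashed", Clin2 = "dashed")

p <- ggplot(toy, aes(Time, Counts, color = Sample, linetype = Sample)) +
  geom_line() +
  scale_linetype_manual(values = line_types)
```

With both aesthetics mapped to Sample, the legend shows one key per sample that combines its colour and its linetype.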
