How to fix ggplot continuous color range - r

I can't fix the colors of my heat maps according to their values. The same values should have the same colors. The goal is to keep all values below a certain threshold (0.05) in a constant gray. For values greater than this threshold, the colors should gradually change from "firebrick1" to "firebrick4".
For example, "Plant 5"/"202004" = 70.6 is red if I use variable utilization2 and gray if I use variable utilization. How can I fix that?
library(tidyverse)
library(rlang)
MONTHS <- str_c("2020", sprintf("%02d", 1:12))
PLANTS <- str_c("Plant ", 1:5)
crossing(month = MONTHS, plant = PLANTS) %>%
mutate(utilization = runif(nrow(.), 70, 100)) %>%
mutate(utilization2 = if_else(plant == "Plant 2", utilization * 0.67, utilization)) -> d
draw_plot <- function(fill) {
fill <- ensym(fill)
d %>%
ggplot(mapping = aes(x = month, y = plant, fill = !!fill)) +
geom_tile(aes(width = 0.85, height = 0.85)) +
geom_text(aes(label = round(!!fill, 1)), color = "white") +
scale_x_discrete(expand=c(0,0)) +
scale_y_discrete(expand=c(0,0)) +
scale_fill_gradientn(colours = c("darkgray", "firebrick1", "firebrick4"),
values = c(0, 0.05, 1)) +
labs(x = "Month", y = "Production plant", title = str_c("fill = ", fill), color = "Utilization") +
theme_light() +
theme(legend.position = "none")
}
draw_plot(utilization)
draw_plot(utilization2)

library(tidyverse)
library(rlang)
MONTHS <- str_c("2020", sprintf("%02d", 1:12))
PLANTS <- str_c("Plant ", 1:5)
crossing(month = MONTHS, plant = PLANTS) %>%
mutate(utilization = runif(nrow(.), 70, 100)) %>%
mutate(utilization2 = if_else(plant == "Plant 2", utilization * 0.67, utilization)) -> d
draw_plot <- function(fill) {
fill <- ensym(fill)
d %>%
ggplot(mapping = aes(x = month, y = plant, fill = !!fill)) +
geom_tile(aes(width = 0.85, height = 0.85)) +
geom_text(aes(label = round(!!fill, 1)), color = "white") +
scale_x_discrete(expand=c(0,0)) +
scale_y_discrete(expand=c(0,0)) +
scale_fill_gradientn(colours = c("darkgray", "firebrick1", "firebrick4"),
values = c(0, 0.05, 1), limits = c(min(d$utilization, d$utilization2), max(d$utilization, d$utilization2))) +
labs(x = "Month", y = "Production plant", title = str_c("fill = ", fill), color = "Utilization") +
scale_color_identity() +
theme_light() +
theme(legend.position = "none")
}
draw_plot(utilization)
draw_plot(utilization2)
The point is that scale_fill_gradientn() by default sets the limits of the scale to the min and max of the variable mapped to fill, so each plot gets its own colour range. You have to set the limits manually. In this case I chose the overall min and max across both columns (limits = c(min(d$utilization, d$utilization2), max(d$utilization, d$utilization2))).
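As a small illustration of the limits computation, range() applied to both columns returns the same pair as the explicit min()/max() calls. The two vectors below are made-up stand-ins for d$utilization and d$utilization2, not the original data.

```r
# Made-up stand-ins for the two fill columns (assumption, not the original data)
utilization  <- c(72.4, 88.1, 95.3, 70.6)
utilization2 <- c(48.5, 88.1, 63.9, 70.6)

# One shared pair of limits covering both columns
shared_limits <- range(utilization, utilization2)
shared_limits
```

Passing the same pair as limits to every plot is what keeps equal values on equal colours.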

The colours are interpolated between the values, so one trick is to set both 0 and 0.05 to gray and begin the next colour at a very small increment above 0.05.
draw_plot <- function(fill) {
fill <- ensym(fill)
d %>%
ggplot(mapping = aes(x = month, y = plant, fill = !!fill)) +
geom_tile(aes(width = 0.85, height = 0.85)) +
geom_text(aes(label = round(!!fill, 1)), color = "white") +
scale_x_discrete(expand=c(0,0)) +
scale_y_discrete(expand=c(0,0)) +
scale_fill_gradientn(colours = c("darkgray", "darkgray", "firebrick1", "firebrick4"),
values = c(0, 0.05, 0.05 + .Machine$double.eps, 1)) +
labs(x = "Month", y = "Production plant", title = str_c("fill = ", fill), color = "Utilization") +
theme_light() +
theme(legend.position = "none")
}
draw_plot(utilization)
draw_plot(utilization2)
It may be worth mentioning that the fill scale rescales all fill values to the range 0 to 1 based on the limits (see ?scales::rescale). The 0.05 in the values argument therefore refers to the bottom 5% of the scale's range, not to unscaled data values in utilization below 0.05. If you want consistent fill scales across multiple plots, you have to set the limits argument manually.
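If the threshold is meant in data units rather than as a share of the range, one possible approach is to convert it with scales::rescale() before passing it to values. This is only a sketch: the limits and the threshold of 75 are assumed numbers chosen for illustration, and the tiny offset 1e-6 plays the same role as .Machine$double.eps above.

```r
library(scales)

# Assumed shared limits and data-unit threshold, for illustration only
lims      <- c(40, 100)
threshold <- 75

# Convert data-unit break points to the 0-1 positions that `values` expects
stops <- rescale(c(lims[1], threshold, threshold + 1e-6, lims[2]), from = lims)
stops

# The stops and limits could then be passed to the fill scale, e.g.
# scale_fill_gradientn(colours = c("darkgray", "darkgray", "firebrick1", "firebrick4"),
#                      values = stops, limits = lims)
```

With these assumed numbers, everything from 40 up to 75 would map to gray and the red gradient would cover 75 to 100.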

Related

How to add OR and 95% CI as text outside a forest plot?

I previously asked a similar question: "How to add OR and 95% CI as text into a forest plot?"
For that question, I took my code from a third question, from a user named stupidwolf.
I used that code to draw a forest plot, but without the OR and CI as text. This is the code from stupidwolf, which worked for me:
library('ggplot2')
Outcome_order <- c('Outcome C', 'Outcome A', 'Outcome B', 'Outcome D')
#this is the first dataset you have
df1 <- data.frame(Outcome=c("Outcome A", "Outcome B", "Outcome C", "Outcome D"),
OR=c(1.50, 2.60, 1.70, 1.30),
Lower=c(1.00, 0.98, 0.60, 1.20),
Upper=c(2.00, 3.01, 1.80, 2.20))
# add a group column
df1$group <- "X"
# create a second dataset, similar format to first
df2 <- df1
# different group
df2$group <- "Y"
# and we adjust the values a bit, so it will look different in the plot
df2[,c("OR","Lower","Upper")] <- df2[,c("OR","Lower","Upper")] +0.5
# combine the two datasets
df = rbind(df1,df2)
# you can do the factoring here
df$Outcome = factor (df$Outcome, level=Outcome_order)
#define colours for dots and bars
dotCOLS = c("#a6d8f0","#f9b282")
barCOLS = c("#008fd5","#de6b35")
p <- ggplot(df, aes(x=Outcome, y=OR, ymin=Lower, ymax=Upper,col=group,fill=group)) +
#specify position here
geom_linerange(size=5,position=position_dodge(width = 0.5)) +
geom_hline(yintercept=1, lty=2) +
#specify position here too
geom_point(size=3, shape=21, colour="white", stroke = 0.5,position=position_dodge(width = 0.5)) +
scale_fill_manual(values=barCOLS)+
scale_color_manual(values=dotCOLS)+
scale_x_discrete(name="(Post)operative outcomes") +
scale_y_continuous(name="Odds ratio", limits = c(0.5, 5)) +
coord_flip() +
theme_minimal()
In my previous question I then asked how to add the OR and CI as text on the forest plot, and Allan Cameron helped me with that.
This almost solved my problem.
This is what I did, following his suggestion, and it worked for me as well:
ggplot(df, aes(x = Outcome, y = OR, ymin = Lower, ymax = Upper,
col = group, fill = group)) +
geom_linerange(linewidth = 5, position = position_dodge(width = 0.5)) +
geom_hline(yintercept = 1, lty = 2) +
geom_point(size = 3, shape = 21, colour = "white", stroke = 0.5,
position = position_dodge(width = 0.5)) +
geom_text(aes(y = 3.75, group = group,
label = paste0("OR ", round(OR, 2), ", (", round(Lower, 2),
" - ", round(Upper, 2), ")")), hjust = 0,
position = position_dodge(width = 0.5), color = "black") +
scale_fill_manual(values = barCOLS) +
scale_color_manual(values = dotCOLS) +
scale_x_discrete(name = "(Post)operative outcomes") +
scale_y_continuous(name = "Odds ratio", limits = c(0.5, 5)) +
coord_flip() +
theme_minimal()
And I get this forest plot
As you can see, the OR and CI text sits inside the plot area. I have two questions that I hope someone can help with:
How can I add a single title "OR" above all the OR values, instead of writing "OR" in front of each value?
How can I plot the OR and CI text outside the plot area, for example to the right of it? In my real plot the CIs are very long, so the text overlaps the horizontal CI lines. If I move the text further right by increasing y = 3.75, half of it disappears because it is pushed past the edge of the plot. Would plotting it outside the plot area solve this, and if so, how?
This is the link to my previous question if necessary: How to add OR and 95% CI as text into a forest plot?
Using the patchwork package.
library(ggplot2)
library(patchwork)
p1 <- ggplot(df, aes(x = Outcome, y = OR, ymin = Lower, ymax = Upper,
col = group, fill = group)) +
geom_linerange(linewidth = 5, position = position_dodge(width = 0.5)) +
geom_hline(yintercept = 1, lty = 2) +
geom_point(size = 3, shape = 21, colour = "white", stroke = 0.5,
position = position_dodge(width = 0.5)) +
scale_fill_manual(values = barCOLS) +
scale_color_manual(values = dotCOLS) +
scale_x_discrete(name = "(Post)operative outcomes") +
scale_y_continuous(name = "Odds ratio", limits = c(0.5, 5)) +
coord_flip() +
theme_minimal()
p2 <- ggplot(df, aes(x = Outcome, y = 1.25, ymin = Lower, ymax = Upper)) +
geom_text(aes(group = group,
label = paste0(round(OR, 2), ", (", round(Lower, 2),
" - ", round(Upper, 2), ")")),
position = position_dodge(width = 0.5), color = "black") +
labs(title = "OR") +
coord_flip() +
theme_void()
p1 + p2
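By default patchwork gives both panels the same width. A possible refinement, sketched below on a small made-up data frame, is to narrow the text column with plot_layout(widths = ...) and to build the labels with sprintf() so that trailing zeros are kept. The names toy, forest and text_col are hypothetical, and linewidth assumes ggplot2 3.4.0 or later.

```r
library(ggplot2)
library(patchwork)

# Small made-up data set in the same shape as df (assumption)
toy <- data.frame(
  Outcome = c("Outcome A", "Outcome B"),
  OR = c(1.50, 2.60), Lower = c(1.00, 0.98), Upper = c(2.00, 3.01)
)

# sprintf() keeps trailing zeros, so 1.5 is shown as "1.50"
toy$label <- sprintf("%.2f (%.2f - %.2f)", toy$OR, toy$Lower, toy$Upper)

forest <- ggplot(toy, aes(x = Outcome, y = OR, ymin = Lower, ymax = Upper)) +
  geom_linerange(linewidth = 5, colour = "#008fd5") +
  geom_point(size = 3, shape = 21, colour = "white", fill = "#a6d8f0") +
  geom_hline(yintercept = 1, lty = 2) +
  coord_flip() +
  theme_minimal()

text_col <- ggplot(toy, aes(x = Outcome, y = 1, label = label)) +
  geom_text() +
  labs(title = "OR (95% CI)") +
  coord_flip() +
  theme_void()

# Give the forest plot three times the width of the text column
combined <- forest + text_col + plot_layout(widths = c(3, 1))
combined
```

Because both panels share the same discrete x variable and both use coord_flip(), the text rows line up with the corresponding bars.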

geom_ribbon: Fill area between lines - spurious lines connecting groups

I'm trying to build a plot with two lines and fill the area between with geom_ribbon. I've managed to select a fill color (red/blue) depending on the sign of the difference between two lines. First I create two new columns in the dataset for ymax, ymin. It seems to work but some spurious lines appear joining red areas.
Is geom_ribbon appropriate to fill the areas? Is there any problem in the plot code?
This is the code used to create the plot
datos.2022 <- datos.2022 %>% mutate(y1 = SSTm-273.15, y2 = SST.mean.day-273.15)
datos.2022 %>% ggplot(aes(x=fecha)) +
geom_line(aes(y=SSTm-273.15), color = "red") +
geom_line(aes(y=SST.mean.day - 273.15), color = "black") +
geom_ribbon(aes(ymax=y1, ymin = y2, fill = as.factor(sign)), alpha = 0.6) +
scale_fill_manual(guide = "none", values=c("blue","red")) +
scale_y_continuous(limits = c(10,30)) +
scale_x_date(expand = c(0,0), breaks = "1 month", date_labels = "%b" ) +
theme_hc() +
labs(x="",y ="SST",title = "Temperature (2022)") +
theme(text = element_text(size=20,family = "Arial"))
And this is the output
Example data for the plot available at https://www.dropbox.com/s/mkk8w7py2ynuy1t/temperature.dat?dl=0
One option is to draw two separate ribbons. The first covers the positive differences: wherever the difference is negative, its ymin and ymax coincide, so it has zero height there. The second covers the negative differences and works the same way in reverse.
library(dplyr)
library(ggplot2)
datos.2022 <- datos.2022 %>%
mutate(y1 = SSTm-273.15,
y2 = SST.mean.day-273.15) %>%
rowwise() %>%
mutate(high_pos = max(SST.mean.day - 273.15, y1),
low_neg = min(SSTm-273.15, y2))
datos.2022 %>% ggplot(aes(x=fecha)) +
geom_line(aes(y=SSTm-273.15), color = "red") +
geom_line(aes(y=SST.mean.day - 273.15), color = "black") +
geom_ribbon(aes(ymax=high_pos, ymin = SST.mean.day - 273.15, fill = "b"), alpha = 0.6, col="transparent", show.legend = FALSE) +
geom_ribbon(aes(ymax = SST.mean.day - 273.15, ymin = low_neg, fill = "a"), alpha = 0.6, col="transparent", show.legend = FALSE) +
scale_fill_manual(guide = "none", values=c("blue","red")) +
scale_y_continuous(limits = c(10,30)) +
scale_x_date(expand = c(0,0), breaks = "1 month", date_labels = "%b" ) +
#theme_hc() +
labs(x="",y ="SST",title = "Temperature (2022)") +
theme(text = element_text(size=20,family = "Arial"))
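The spurious lines in the original plot most likely come from mapping fill to a discrete variable: each sign then forms one group, and that group is drawn as a single ribbon that bridges the stretches where the other sign applies. The two-ribbon idea avoids this, and the row-wise max()/min() can also be written with the vectorised pmax()/pmin(). The sketch below uses a made-up six-day series in place of the real data, with y1 and y2 standing in for the converted SSTm and SST.mean.day columns.

```r
library(ggplot2)

# Made-up series standing in for y1 (SSTm) and y2 (SST.mean.day), in deg C
toy <- data.frame(
  fecha = as.Date("2022-01-01") + 0:5,
  y1 = c(15, 17, 16, 14, 13, 16),
  y2 = c(16, 16, 15, 15, 15, 15)
)

# Vectorised equivalents of the row-wise max()/min()
toy$high_pos <- pmax(toy$y1, toy$y2)  # equals y2 wherever y1 is below y2
toy$low_neg  <- pmin(toy$y1, toy$y2)  # equals y2 wherever y1 is above y2

p <- ggplot(toy, aes(x = fecha)) +
  geom_ribbon(aes(ymin = y2, ymax = high_pos), fill = "red", alpha = 0.6) +
  geom_ribbon(aes(ymin = low_neg, ymax = y2), fill = "blue", alpha = 0.6) +
  geom_line(aes(y = y1), colour = "red") +
  geom_line(aes(y = y2), colour = "black")
p
```

Setting fill as a constant outside aes() also removes the need for scale_fill_manual() and show.legend = FALSE.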

ggplot trying to make a Cleveland plot but I cannot get a legend

library(ggplot2)
library(ggthemes)
data <- read.csv('/Users/zbhay/Documents/r-data.csv', header = 1)
zb <- ggplot(data) +
geom_segment( aes(x=x, xend=x, y=value1, yend=value2), color="black")+
geom_point( aes(x=x, y=value1), color=rgb(0.2,0.7,0.1,1), size=4 )+
geom_point( aes(x=x, y=value2), color=rgb(0.7,0.2,0.1,1), size=4 )+
coord_flip() +
theme_solarized() +
scale_y_continuous(breaks = seq(0, 10000, by = 500)
)
zb + labs(title = "Title",
subtitle = "subtitle") +
xlab("Business Functions") +
ylab("# of hours")
legend("left", c("Starting", "Ending"),
box.col = "darkgreen"
)
Hello, here is the code. The CSV file is structured as follows; column A = names, column b = starting number, column c = final number.
I am trying to set up a legend that calls out the final number vs starting number. I have tried and tried but cannot seem to be able to crack it. If anyone knows a fix, I would appreciate it if you could let me know.
As a general rule in ggplot2, you have to map to an aesthetic if you want a legend. Instead of setting the colors of your points as arguments, map a value to the color aesthetic inside aes(). In the code below, for example, I map the constant value (or category) "start" to color for the first geom_point and "end" for the second. You can then use scale_color_manual to assign your desired colors and labels to these categories. Finally, the color of the legend box can be set via the theme option legend.background. The legend keys themselves also have a background color, which I set to NA via legend.key.
Using some fake random example data:
library(ggplot2)
library(ggthemes)
set.seed(123)
data <- data.frame(x = letters[1:5], value1 = runif(5, 0, 10000), value2 = runif(5, 0, 10000))
ggplot(data) +
geom_segment(aes(x = x, xend = x, y = value1, yend = value2), color = "black") +
geom_point(aes(x = x, y = value1, color = "start"), size = 4) +
geom_point(aes(x = x, y = value2, color = "end"), size = 4) +
coord_flip() +
theme_solarized() +
scale_y_continuous(breaks = seq(0, 10000, by = 500)) +
scale_color_manual(values = c(start = rgb(0.2, 0.7, 0.1, 1), end = rgb(0.7, 0.2, 0.1, 1)), labels = c(start = "Starting", end = "Ending")) +
labs(title = "Title", subtitle = "subtitle", x = "Business Functions", y = "# of hours", color = NULL) +
theme(
legend.key = element_rect(fill = NA),
legend.background = element_rect(fill = "darkgreen")
)
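An alternative way to get the same legend, sketched here under the assumption that tidyr is available, is to reshape the data to long format so that a single geom_point maps color to a column. The column names stage and hours are hypothetical; the example data are the same fake values as above.

```r
library(ggplot2)
library(tidyr)

set.seed(123)
data <- data.frame(x = letters[1:5],
                   value1 = runif(5, 0, 10000),
                   value2 = runif(5, 0, 10000))

# One row per point; `stage` (hypothetical column name) records which value it is
long <- pivot_longer(data, c(value1, value2), names_to = "stage", values_to = "hours")

p <- ggplot(long, aes(x = hours, y = x)) +
  geom_line(aes(group = x), color = "black") +
  geom_point(aes(color = stage), size = 4) +
  scale_color_manual(values = c(value1 = rgb(0.2, 0.7, 0.1), value2 = rgb(0.7, 0.2, 0.1)),
                     labels = c(value1 = "Starting", value2 = "Ending"),
                     name = NULL)
p
```

Putting hours on the x axis directly makes coord_flip() unnecessary in this variant.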

Using ggrepel to add two labels for geom_point in ggplot

I would like to label geom_points using two columns of information.
As an example, in the below, I would like the points to labelled:
A, ($5m)
B, $2m
C, ($3m)
D, $3m
Using geom_text_repel I am only able to add the first part of the label, but not quite sure how to add the y-value into the label as above.
alpha <- c("A","B","C","D")
percent <- c(0.012, -0.02, 0.015, -0.01)
flow <- c(-5, 2, -3, 3)
to <- c(0.1, 0.15, 0.2, 0.1)
df <- data.frame(alpha,percent,flow,to)
df %>%
ggplot(aes(x = percent, y = flow, label = alpha)) +
geom_point(aes(size = to)) +
geom_text_repel(show.legend = FALSE, size = 3) +
scale_size_continuous(labels = scales::percent) +
theme_bw() +
scale_x_continuous(labels = scales::percent_format(accuracy = 0.1L)) +
scale_y_continuous(labels = scales::dollar_format(negative_parens = TRUE, suffix = "m"))
You can preformat the flow value with the same formatter used for the y axis and paste it onto the name.
df$ylbl <- paste(df$alpha, scales::dollar_format(negative_parens = TRUE, suffix = "m")(df$flow), sep = ", ")
df %>%
ggplot(aes(x = percent, y = flow, label = alpha)) +
geom_point(aes(size = to)) +
geom_text_repel(aes(label = ylbl), show.legend = FALSE, size = 3) +
scale_size_continuous(labels = scales::percent) +
theme_bw() +
scale_x_continuous(labels = scales::percent_format(accuracy = 0.1)) +
scale_y_continuous(labels = scales::dollar_format(negative_parens = TRUE, suffix = "m"))

Reordering Groups in Raincloud Plot [duplicate]

This question already has answers here:
Change stacked bar order when aesthetic fill is based on the interaction of two factors
ggplot legends - change labels, order and title
Currently, I have a plot that looks like this:
library(ggplot2)
df <- ToothGrowth
df %>%
ggplot(aes(x = supp, y = len, fill = supp)) +
geom_flat_violin(position = position_nudge(x = .2, y = 0),
alpha = .8) +
geom_point(aes(shape = supp),
position = position_jitter(width = .05),
size = 2, alpha = 0.8) +
geom_boxplot(width = .1, outlier.shape = NA, alpha = 0.5) +
coord_flip() +
labs(title = "ToothGrowth Length by Supplement",
y = "Length") +
theme_classic() +
raincloud_theme
I'd like to change the order so that OJ appears above VC. I've tried adding scale_x_discrete before coord_flip(), but that seems to mess up my plot as this is a raincloud plot -- I'd have to move not only the violin plot, but also the points and the box plot. I've also tried adding rev(), which also messed up my plot. What is the best way to reorder this?
EDIT
Thank you for the comment! How do I change the orders in an interaction plot?
df %>%
mutate(Supplement = ifelse(supp == "VC",
"VC",
"OJ"),
Dose = ifelse(dose == "0.5",
"0.5",
"1.0"),
Interaction = factor(str_replace(interaction(Supplement, Dose),
'\\.', '\n'),
ordered=TRUE)) %>%
ggplot(aes(x = Interaction, y = len, fill = Interaction)) +
geom_flat_violin(position = position_nudge(x = .2, y = 0),
alpha = .8) +
geom_point(aes(shape = Dose),
position = position_jitter(width = .05),
size = 2, alpha = 0.8) +
geom_boxplot(width = .1, outlier.shape = NA, alpha = 0.5) +
coord_flip() +
labs(title = "Effect of Supplement and Dose on Length",
y = "Growth Length") +
scale_fill_discrete(guide = guide_legend(override.aes = list(shape = c(".", ".")))) +
scale_shape_discrete(guide = guide_legend(override.aes = list(size = 3))) +
theme_classic() +
raincloud_theme
ggplot2 treats supp as a factor, and the order in the plot follows the order of the factor's levels.
To change the plot order, change the levels of the supp factor.
df <- ToothGrowth
df$supp
df$supp <- relevel(ToothGrowth$supp,ref = "VC")
df$supp
df %>%
ggplot(aes(x = supp, y = len, fill = supp)) +
geom_flat_violin(position = position_nudge(x = .2, y = 0),
alpha = .8) +
geom_point(aes(shape = supp),
position = position_jitter(width = .05),
size = 2, alpha = 0.8) +
geom_boxplot(width = .1, outlier.shape = NA, alpha = 0.5) +
coord_flip() +
labs(title = "ToothGrowth Length by Supplement",
y = "Length") +
theme_classic() +
raincloud_theme
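The same idea should carry over to the interaction plot from the edit: build the combined factor and then set its levels explicitly. The sketch below is one possible way to do that with base R. It uses geom_boxplot() only, because geom_flat_violin and raincloud_theme are not part of ggplot2, and the particular order in wanted is an assumed example.

```r
library(ggplot2)

df <- ToothGrowth

# interaction() builds the combined factor; sep puts supplement and dose on two lines
df$Interaction <- interaction(df$supp, df$dose, sep = "\n")

# With coord_flip() the first level is drawn at the bottom, so list levels bottom-to-top
wanted <- c("VC\n2", "VC\n1", "VC\n0.5", "OJ\n2", "OJ\n1", "OJ\n0.5")
df$Interaction <- factor(df$Interaction, levels = wanted)

p <- ggplot(df, aes(x = Interaction, y = len, fill = Interaction)) +
  geom_boxplot(width = .3, outlier.shape = NA, alpha = 0.5) +
  coord_flip() +
  theme_classic()
p
```

Since every layer reads the same factor, the violin, points and box plot of a raincloud plot would all move together once the levels are changed.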

Resources