Problem: My dataset has a shared baseline (timepoint 1) followed by 2 repeated measures. The follow-up points are repeated under 2 different conditions (i.e. a cross-over). When plotting the results, the error bars and data points overlap.
library(tidyverse)
set.seed(11)
df_test <-
tibble(group = factor(rep(c("A", "B"), each = 3)),
timepoint = factor(rep(1:3, 2)),
y = c(0, rnorm(2, mean = 0, sd = .2), 0, rnorm(2, mean = .7, sd = 0.35))) %>%
mutate(ymax = y + .2,
ymin = y - .2)
# plot_nododge <-
df_test %>%
ggplot(aes(timepoint, y,
group = group,
shape = group,
fill = group)) +
geom_linerange(linetype = 1,
aes(ymin = ymin,
ymax = ymax)) +
geom_point() +
geom_line()
Solution with position_dodge(): This solution fixes the overlap, but the "Baseline" point is actually a single measure. I would like for this to be a single point to avoid confusion, but still dodge the followup points.
Is there a simple solution to this that I'm missing?
I would like to use a custom dodge for each point (e.g. something like scale_position_dodge_identity), but position is not accepted as an aesthetic.
# plot1 <-
df_test %>%
ggplot(aes(timepoint, y,
group = group,
shape = group,
fill = group)) +
geom_linerange(linetype = 1,
position = position_dodge(.2),
aes(ymin = ymin,
ymax = ymax)) +
geom_point(position = position_dodge(.2)) +
geom_line(position = position_dodge(.2))
Solution and expansion
Maybe something better will be developed in the future but manually adjusting the position seems best for now. Since the point shape and color should also reflect the different "Baseline" measure and the x-axis should really have nice labels, I've edited to show some more finishing touches.
df_test %>%
mutate(shape_var = case_when(timepoint == 1 ~ 22,
group == "A" ~ 21,
group == "B" ~ 24),
fill_var = case_when(timepoint == 1 ~ "black",
group == "A" ~ "red",
group == "B" ~ "blue"),
timepoint2 = as.numeric(as.character(timepoint)),
timepoint2 = timepoint2 + 0.05*(timepoint2 > 1 & group == "B"),
timepoint2 = timepoint2 - 0.05*(timepoint2 > 1 & group == "A")) %>%
ggplot(aes(timepoint2, y,
group = group,
shape = shape_var,
fill = fill_var)) +
geom_linerange(aes(ymin = ymin,
ymax = ymax)) +
geom_line() +
geom_point(size = 2) +
scale_x_continuous(breaks = 1:3, # limit to selected time points
labels = c("Baseline", "time 1", "time 2"), # label like a discrete scale
limits = c(.75, 3.25)) + # give it some room
scale_shape_identity(guide = "legend",
name = "Treatment",
breaks = c(21, 24), # Defines legend order too; only label the treatment groups
labels = c("A", "B" )) + # this part is tricky - could easily reverse them
scale_fill_identity(guide = "legend",
name = "Treatment",
breaks = c("red", "blue"),# Defines legend order too; only label the treatment groups
labels = c("A", "B" )) + # this part is tricky - could easily reverse them
labs(x=NULL)
Created on 2021-09-19 by the reprex package (v2.0.1)
After some experimentation, I have found a pretty good way to do this. Seems like you can use position_dodge2(width = c(...)) to specify individual dodge widths.
For your example, specify width = c(0.000001, 0.2, 0.2):
df_test %>%
ggplot(aes(timepoint, y, group = group, shape = group, fill = group)) +
geom_linerange(linetype = 1, position = position_dodge2(width = c(0.000001, 0.2, 0.2)), aes(ymin = ymin, ymax = ymax)) +
geom_point(position = position_dodge2(width = c(0.000001, 0.2, 0.2))) +
geom_line(position = position_dodge2(width = c(0.000001, 0.2, 0.2)))
Initially, I tried setting width = c(0, 0.2, 0.2), but that didn't behave as expected.
?position_dodge doesn't explicitly explain passing a vector of values as the dodge widths, so I don't know why it doesn't work when you use zero as a width. A value of 0.000001 is close enough to zero that you can't tell the difference by looking at the figure, so hopefully this will suffice.
One solution could be to just change the position of the original points directly in the data set for the purpose of the plot:
df_test %>%
mutate(timepoint = as.numeric(as.character(timepoint))) %>%
mutate(timepoint = timepoint + 0.1*(timepoint > 1 & group == "B")) %>%
ggplot(aes(timepoint, y,
group = group,
shape = group,
fill = group)) +
geom_linerange(linetype = 1,
aes(ymin = ymin,
ymax = ymax)) +
scale_x_continuous(breaks = unique(as.numeric(as.character(df_test$timepoint))))+
geom_point() +
geom_line()
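If there are more than two groups, the same idea can be wrapped in a small helper. This is only a sketch: manual_dodge() is a made-up name (not a ggplot2 function), and it assumes the time points are coded 1, 2, 3, ... with the shared baseline at 1.

```r
# Hypothetical helper (not part of ggplot2): shift x by a per-group offset,
# leaving the shared baseline time point untouched.
manual_dodge <- function(x, group, width = 0.2, baseline = 1) {
  x_num <- as.numeric(as.character(x))   # factor levels "1", "2", ... -> numbers
  g <- as.integer(factor(group))         # 1, 2, ..., n_groups
  n_groups <- max(g)
  # offsets centred on zero, spanning `width` in total across all groups
  offset <- (g - (n_groups + 1) / 2) * width / max(n_groups - 1, 1)
  ifelse(x_num == baseline, x_num, x_num + offset)
}

# Same layout as df_test: two groups, three time points each
timepoint <- factor(rep(1:3, 2))
group <- rep(c("A", "B"), each = 3)
x_dodged <- manual_dodge(timepoint, group)
x_dodged
```

The result can then be mapped to x, with scale_x_continuous(breaks = 1:3) to keep the axis looking discrete.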
I want to show two ribbons per variable (e.g. a min-max ribbon and a confidence-level ribbon) with geom_ribbon() in ggplot2 in R, as in the example below. I've not been able to set the colours for each ribbon separately. Ideally, I can map a colour palette to each level of the categorical variable (type), and the two ribbons will take their colours from it.
## Set up data
set.seed(999)
n <- 100
mn1 <- seq(0.5, 0.9, length.out = n)
mn2 <- seq(0.75, 0.25, length.out = n)
tmp1 <- lapply(seq_len(n), function(x) {
x1 <- rnorm(n, mn1[x], 0.1)
x2 <- rnorm(n, mn2[x], 0.1)
rbind(cbind(min(x1), mean(x1)-sd(x1), mean(x1), mean(x1)+sd(x1), max(x1)),
cbind(min(x2), mean(x2)-sd(x2), mean(x2), mean(x2)+sd(x2), max(x2))
)
})
year <- seq(1900, 1900+n-1, 1)
type <- rep(c("all", "partial"), n)
df1 <- data.frame(rep(year,each = 2), do.call(rbind, tmp1), type)
colnames(df1) <- c("year", "xmin", "xsd_lwr", "xmn", "xsd_upr", "xmax", "type")
head(df1)
rm(tmp1, mn1, mn2, year, type, n)
This is what I have so far in ggplot2:
library(ggplot2)
ggplot(df1, aes(x = year, y = xmn, fill = type, col= type))+
geom_ribbon(aes(ymin=xsd_lwr, ymax = xsd_upr), linetype = 0, alpha = 0.4)+
geom_ribbon(aes(ymin=xmin, ymax = xmax), linetype = 0, alpha = 0.4)+
scale_color_manual(values = c("black", "darkred"))+
scale_fill_manual(values = c("grey10", "grey30"))+
geom_line(aes(linetype= type), size = 1)+
scale_x_continuous(breaks = seq(1900, 2000,10))
UPDATE:
I've accepted the answer because it gives exact control over four different colours. However, I wanted to show that the comment by @teunbrand also works: the transparency, in effect, creates four colours too, and it gives a better legend. I've modified the suggestion as follows:
ggplot(df1, aes(x = year, y = xmn,col= type))+
geom_ribbon(aes(ymin=xsd_lwr, ymax = xsd_upr, fill = type), linetype = 0, alpha = 0.5)+
geom_ribbon(aes(ymin=xmin, ymax = xmax, fill = type), linetype = 0, alpha =0.5)+
scale_fill_manual(values = c("tomato", "dodgerblue"))+
scale_color_manual(values = c("black", "darkred"))+
geom_line(aes(linetype= type), size = 1)+
scale_x_continuous(breaks = seq(1900, 2000,10))
You could define a specific fill group in the ribbon's aes, and associate the color you wish with it in scale_fill_manual:
ggplot(df1, aes(x = year, y = xmn, fill = type, col= type))+
geom_ribbon(aes(ymin=xsd_lwr, ymax = xsd_upr), linetype = 0, alpha = 0.4, show.legend = FALSE)+
scale_fill_manual(values = c("grey10","red","grey30","green"))+
geom_ribbon(aes(ymin=xmin, ymax = xmax, fill=ifelse(type=='all','all_minmax','partial_min_max')), linetype = 0, alpha = 0.4, show.legend = FALSE)+
scale_color_manual(values = c("black", "darkred"))+
geom_line(aes(linetype= type), size = 1)+
scale_x_continuous(breaks = seq(1900, 2000,10))
Note that I had to remove the legend for the ribbons to avoid showing a second legend with the new colour groups.
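One caveat with an unnamed values vector: the colours are handed out in the sorted order of the fill levels (here all, all_minmax, partial, partial_min_max), which is easy to get wrong. A named vector makes the pairing explicit. The sketch below uses a small made-up stand-in for df1, and the type_range column is a name I've invented for illustration.

```r
library(ggplot2)

# Small stand-in for df1 from the question (values are made up)
toy <- data.frame(
  year = rep(1900:1902, each = 2),
  type = rep(c("all", "partial"), 3),
  xmn  = c(0.50, 0.75, 0.52, 0.73, 0.54, 0.71)
)
toy$xsd_lwr <- toy$xmn - 0.1
toy$xsd_upr <- toy$xmn + 0.1
toy$xmin    <- toy$xmn - 0.3
toy$xmax    <- toy$xmn + 0.3
toy$type_range <- paste0(toy$type, "_range")  # hypothetical second fill key

# Names tie each colour to a fill level, independent of sort order
fill_cols <- c(all = "grey10", all_range = "red",
               partial = "grey30", partial_range = "green")

p <- ggplot(toy, aes(x = year, y = xmn, colour = type)) +
  geom_ribbon(aes(ymin = xmin, ymax = xmax, fill = type_range),
              linetype = 0, alpha = 0.4, show.legend = FALSE) +
  geom_ribbon(aes(ymin = xsd_lwr, ymax = xsd_upr, fill = type),
              linetype = 0, alpha = 0.4, show.legend = FALSE) +
  scale_fill_manual(values = fill_cols) +
  scale_colour_manual(values = c(all = "black", partial = "darkred")) +
  geom_line(aes(linetype = type))
p
```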
For the easiest and fullest control over as many colours/fills as you wish, use ggnewscale. This also gives you full legend control.
I am not on a console (I'm coding on rdrr.io/snippets), so I'm struggling to show the figure output, but the code is reproducible.
library(ggplot2)
library(ggnewscale)
ggplot(df1, aes(x = year, y = xmn, col = type))+
geom_ribbon(aes(ymin=xsd_lwr, ymax = xsd_upr, fill = "SE"), linetype = 0, alpha = 0.4)+
scale_fill_manual(name = NULL, values = c("tomato","darkred"))+
new_scale_fill()+
geom_ribbon(aes(ymin=xmin, ymax = xmax, fill= "range"), linetype = 0, alpha = 0.4)+
scale_fill_manual(name = NULL, values = c("dodgerblue", "darkred"))+
geom_line(aes(linetype= type), size = 1)+
scale_color_manual(values = c("black", "darkred"))+
scale_x_continuous(breaks = seq(1900, 2000,10))
How would I label the points in this scatter plot using numbers instead of colors?
Below is the code I am using. Instead of a legend that shows which colour belongs to which change, I would like it to use numbers. It's hard to tell the colours apart because I am using coloured panels.
Code:
d=data.frame(x1=c(.5,2,.5,2),
x2 = c(2,3.5,2,3.5),
y1 = c(.5,.5,2,2),
y2 = c(2,2,3.2,3.2),
t=c('low,low','high,low','low,high','high,high'),
r=c('low,low','high,low','low,high','high,high'))
ggplot() +
geom_point(data = df, aes(x = Impact, y = Likelihood, colour = Change)) +
scale_x_continuous(name = "Impact", limits = c(.5,3.5),
breaks=seq(.5,3.5, 1), labels = seq(.5,3.5, 1)) +
scale_y_continuous(name = "Likelihood", limits = c(.5,3.2),
breaks=seq(.5, 3.2, 1), labels = seq(.5, 3.2, 1)) +
geom_rect(data=d,
mapping = aes(xmin = x1, xmax = x2, ymin = y1, ymax = y2, fill = t),
alpha = .5, color = "black")+
geom_text(data=d,
aes(x=x1+(x2-x1)/2, y=y1+(y2-y1)/2, label=r),
size=4)
I would like each item (e.g. 'Add Server') to correspond to a unique integer, and then for that integer to be plotted. Thanks
Edit:
Dataframe structure:
Columns: Change (string), Impact (float), Likelihood (float)
dput(df)
structure(list(Change = c("Windows Patches\n-CRPDB1", "Change DNS settings",
"SSIS Schedule change\n-Warehouse", "OnBase Upgrade", "Add Server",
"Change IL Parameter", "Code Change - Validation missing", "Mass Update Data in Infolease",
"User add, remove or update user permission", "ServiceNow Deployment",
"Creating of a sever or desktop image for mass deployment", "Database table update. Column add/modify",
"Update add PRTG/Sensor"), Impact = c(3, 1.8, 2.6, 2.3, 1, 2.25,
1.8, 1.95, 1.3, 1.5, 1.8, 1, 1), Likelihood = c(3, 1.75, 1.7,
1.6, 1.3, 1.15, 1.15, 1.15, 1.15, 1.1, 1, 1, 1)), class = "data.frame", row.names = c(NA,
-13L))
You can keep the aesthetic mapping between change & colour in order to create a legend, while setting that layer invisible so that it doesn't detract from the overall picture:
df$ID <- seq(1, nrow(df))
df$Legend <- paste0(df$ID, ". ", df$Change)
df$Legend <- factor(df$Legend,
levels = df$Legend[order(df$ID)])
p <- ggplot() +
# text layer to position the numbers
geom_text(data = df,
aes(x = Impact, y = Likelihood, label = ID)) +
# invisible layer to create legend for the numbers
geom_point(data = df,
aes(x = Impact, y = Likelihood, colour = Legend),
alpha = 0, size = 0) +
# rest of the code is unchanged
scale_x_continuous(name = "Impact", limits = c(.5,3.5),
breaks=seq(.5,3.5, 1), labels = seq(.5,3.5, 1)) +
scale_y_continuous(name = "Likelihood", limits = c(.5,3.2),
breaks=seq(.5, 3.2, 1), labels = seq(.5, 3.2, 1)) +
geom_rect(data=d,
aes(xmin = x1, xmax = x2, ymin = y1, ymax = y2, fill = t),
alpha = .5, color = "black") +
geom_text(data=d,
aes(x=x1+(x2-x1)/2, y=y1+(y2-y1)/2, label=r),
size=4)
p
In addition, if you want to remove the empty grey legend keys, set its key width to 0:
p + scale_color_discrete(guide = guide_legend(keywidth = unit(0, "pt")))
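The factor() step above matters once there are ten or more items, because the legend follows the factor levels and plain text sorting puts "10." ahead of "2.". A minimal base R sketch with placeholder item names (not the real data):

```r
change <- paste("Item", LETTERS[1:11])  # placeholder names for illustration
id     <- seq_along(change)
legend <- paste0(id, ". ", change)

# Default factor levels are sorted as text, so "10." and "11." come before "2."
default_levels <- levels(factor(legend))
default_levels

# Fixing the levels in ID order keeps the legend in numeric order
legend_f <- factor(legend, levels = legend[order(id)])
levels(legend_f)
```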
I cannot think of a way to do this using only ggplot2 functions, but maybe there is an elegant way to do so. Instead, you can use gridExtra and a tableGrob to display the correct legend.
I replace your call to geom_point() with a call to geom_text(), convert to a grob, then create a table grob with the text you want displayed in the legend, and finally arrange the two grobs.
# load your data as d and df
library(grid)
library(gridExtra)
# add in a Label column with numbers
df$Label <- 1:nrow(df)
g2 <- ggplot() +
geom_text(data = df, aes(x = Impact, y = Likelihood, label = Label)) +
scale_x_continuous(
name = "Impact",
limits = c(.5,3.5),
breaks=seq(.5,3.5, 1),
labels = seq(.5,3.5, 1)
) +
scale_y_continuous(
name = "Likelihood",
limits = c(.5,3.2),
breaks=seq(.5, 3.2, 1),
labels = seq(.5, 3.2, 1)
) +
geom_rect(
data = d,
mapping = aes(xmin = x1, xmax = x2, ymin = y1, ymax = y2, fill = t),
alpha = .5,
color = "black"
) +
geom_text(data = d, aes(x=x1+(x2-x1)/2, y=y1+(y2-y1)/2, label=r), size=4)
g2_grob <- ggplotGrob(g2)
# pasted the two columns together for it to appear a little nicer
tab_leg <- tableGrob(
paste(df$Label,"-", df$Change),
theme = ttheme_minimal(
core = list(fg_params = list(hjust=0, x=0.1,fontsize=8))
)
)
# arrange the plot and table
grid.arrange(arrangeGrob(
g2_grob, nullGrob(), tab_leg, nullGrob(),
layout_matrix = matrix(1:4, ncol = 4),
widths = c(6,.5,2,1)
))
If you want to move the region legend around, you can check out this answer: Show the table of values under the bar plot.
What I have here are two graphs, "PlotA" and "PlotB". However, I want a single combined graph, with geom_pointrange showing the points, geom_line showing the line and geom_ribbon showing the standard deviation.
water <- c(35,40,42,46,48,50)
depth <- c(1,2,3,4,5,6)
sd <- c(10,10,10,10,10,10)
dataA <- data.frame(depth, water, sd)
from <- c(0.5, 1.5, 2.5, 3.5, 4.5, 5.5)
to <- c(1.5, 2.5, 3.5, 4.5, 5.5, 6.5)
depth1 <- c(1,2,3,4,5,6)
water1 <- c(40,32,50,55,62,30)
dataB <- data.frame(from,to,depth1, water1)
# Load necessary packages
require(ggplot2)
# Plotting Started
#PlotA
ggplot(data=dataA, aes(x = water, y = depth), na.rm=T) +
geom_path(size=0.4, color="black")+
geom_pointrange(data=dataB, aes(water1, depth1, ymin=from, ymax=to), size=0.1, color='black') +
scale_y_reverse(lim = c(10,0), breaks = seq(0,10,1)) +
theme_bw(12) +
scale_x_continuous(lim =c(0,100), breaks = seq(0,100,20))
#PlotB
ggplot() + geom_ribbon(data=dataA, aes(x=depth, y=water, ymin = water - sd, ymax = water + sd), alpha=0.3, fill='grey12') + coord_flip() +
scale_x_reverse(lim = c(10,0), breaks = seq(0,10,1)) + theme_bw(12) +
scale_y_continuous(lim =c(0,100), breaks = seq(0,100,20))
coord_flip is difficult to use well in the middle of a plot. I strongly recommend debugging plots without it and then adding it as the last step.
I think this is what you're looking for. If not, please describe your desired result in more detail.
ggplot(data = dataA, aes(x = depth, y = water)) +
geom_ribbon(
data = dataA,
aes(
x = depth,
ymin = water - sd,
ymax = water + sd
),
alpha = 0.3,
fill = 'grey12'
) +
geom_path(size = 0.4, color = "black") +
geom_point(
data = dataB,
aes(x = depth1, y = water1),
size = 0.1,
color = 'black'
) +
geom_errorbarh(
data = dataB,
aes(
x = depth1,
xmin = from,
xmax = to,
y = water1
),
size = 0.1,
height = 0
) +
theme_bw(12) +
scale_x_reverse(lim = c(10, 0), breaks = seq(0, 10, 1)) +
scale_y_continuous(lim = c(0, 100), breaks = seq(0, 100, 20)) +
coord_flip()
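As a possible alternative, the flip can be avoided altogether. This sketch assumes ggplot2 3.3.0 or later, where geom_ribbon() takes an orientation argument and accepts xmin/xmax along a y axis; I've left out the line-width settings, so treat it as a starting point rather than a drop-in replacement.

```r
library(ggplot2)

# Same data as in the question
dataA <- data.frame(depth = 1:6, water = c(35, 40, 42, 46, 48, 50), sd = 10)
dataB <- data.frame(from = seq(0.5, 5.5, 1), to = seq(1.5, 6.5, 1),
                    depth1 = 1:6, water1 = c(40, 32, 50, 55, 62, 30))

p <- ggplot(dataA, aes(x = water, y = depth)) +
  # horizontal ribbon: one xmin/xmax pair per depth (assumes ggplot2 >= 3.3.0)
  geom_ribbon(aes(xmin = water - sd, xmax = water + sd),
              orientation = "y", alpha = 0.3, fill = "grey12") +
  geom_path(color = "black") +
  # ranges run along the depth axis, as in PlotA
  geom_pointrange(data = dataB,
                  aes(x = water1, y = depth1, ymin = from, ymax = to),
                  size = 0.1, color = "black") +
  scale_y_reverse(limits = c(10, 0), breaks = seq(0, 10, 1)) +
  scale_x_continuous(limits = c(0, 100), breaks = seq(0, 100, 20)) +
  theme_bw(12)
p
```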
I'd like to do the following in R: I have 2 datasets (one consisting of 4 values, the other of 3) and I'd like to plot them with ggplot2 as separate bar charts. However, I'd like to use the same scale for both, i.e. if the minimum value of dataset #1 is 0.2 and that of dataset #2 is 0.4, then I want to use 0.2 for both. The same applies to the maximum values (choosing the greater one).
So, basically, I want to make the 2 plots comparable. Of course, it would be great to apply the common scale to the bar colours as well. At the moment I'm using colorRampPalette and passing the result to scale_fill_gradient2.
A MWE provided below:
library("ggplot2")
val <- c(0.2, 0.35, 0.5, 0.65)
labels <- c('A', 'B', 'C', 'D')
LtoM <-colorRampPalette(c('green', 'yellow'))
df <- data.frame(val)
bar <- ggplot(data = df,
aes(x = factor(labels),
y = val,
fill = val)) +
geom_bar(stat = 'identity') +
scale_fill_gradient2(low=LtoM(100), mid='snow3',
high=LtoM(100), space='Lab') +
geom_text(aes(label = val), vjust = -1, fontface = "bold") +
labs(title = "Title", y = "Value", x = "Methods") +
theme(legend.position = "none")
print(bar)
Given the code above and another dataset such as c(0.4, 0.8, 1.2) with labels c('E', 'F', 'G'), how can I adjust the code to create 2 separate plots (eventually saved as PNGs, for example) that share a common scale (0.2 to 1.2) for both the bar heights and the bar colours? If the images are placed next to each other, bars of the same height should look the same in both images, including their colour.
We can use a mix of the breaks argument in scale_y_continuous to ensure that we have consistent axis ticks, then use coord_cartesian to ensure that we force both plots to have the same y-axis range.
df1 <- data.frame(val = c(0.2, 0.35, 0.5, 0.65), labels = c('A', 'B', 'C', 'D'))
df2 <- data.frame(val = c(0.4, 0.8, 1.2), labels = c('E', 'F', 'G'))
g_plot <- function(df) {
ggplot(data = df,
aes(x = factor(labels),
y = val,
fill = val)) +
geom_bar(stat = 'identity') +
scale_fill_gradient2(low=LtoM(100), mid='snow3',
high=LtoM(100), space='Lab') +
geom_text(aes(label = val), vjust = -1, fontface = "bold") +
scale_y_continuous(breaks = seq(0, 1.2, 0.2)) +
coord_cartesian(ylim = c(0, 1.2)) +
labs(title = "Title", y = "Value", x = "Methods") +
theme(legend.position = "none")
}
bar1 <- g_plot(df1)
bar2 <- g_plot(df2)
gridExtra::grid.arrange(bar1, bar2, ncol = 2)
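You actually don't need to use coord_cartesian here. Scale limits drop data that fall outside the range, whereas coord_cartesian only zooms, but with bars that lie entirely within 0 to 1.2 the result is the same. You can just use the limits argument in scale_y_continuous, like this: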
scale_y_continuous(limits = c(0,1.2), breaks = seq(0, 1.2, 0.2))
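For the shared colours, one option is to give the fill scale the same limits in both plots. The sketch below swaps in a plain scale_fill_gradient() from green to yellow (my simplification of the colorRampPalette setup in the question) and passes the combined data range to it; the value_range argument is a name I've made up.

```r
library(ggplot2)

df1 <- data.frame(val = c(0.2, 0.35, 0.5, 0.65), labels = c("A", "B", "C", "D"))
df2 <- data.frame(val = c(0.4, 0.8, 1.2), labels = c("E", "F", "G"))

# Common range across both datasets
shared_range <- range(c(df1$val, df2$val))

g_plot <- function(df, value_range) {
  ggplot(df, aes(x = labels, y = val, fill = val)) +
    geom_col() +
    # identical limits mean a given value gets the same colour in every plot
    scale_fill_gradient(low = "green", high = "yellow", limits = value_range) +
    scale_y_continuous(breaks = seq(0, 1.2, 0.2)) +
    coord_cartesian(ylim = c(0, value_range[2])) +
    labs(title = "Title", y = "Value", x = "Methods") +
    theme(legend.position = "none")
}

bar1 <- g_plot(df1, shared_range)
bar2 <- g_plot(df2, shared_range)
```

Each plot can then be written to its own file, e.g. with ggsave("bar1.png", bar1).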