Forcing specific plot symbols for points in R ggplot2

I am trying to manually specify the shape of data points in R ggplot2 but can't get it to work. Below is a sample example:
zc <- c(1, 2, 3, 4, 5)
p.est <- c(1.65, 1.55, 0.70, 1.61, 1.25)
p.lcl <- c(1.25, 1.10, 0.50, 1.20, 1.02)
p.ucl <- c(2.20, 2.05, 0.90, 2.20, 1.50)
toy.data <- tibble(zc = zc, p.est = p.est, p.lcl = p.lcl, p.ucl = p.ucl)
Assume I want two types of plot symbols for the five points. I use scale_shape_manual() in ggplot2, but it has no effect. Below is my sample code; the resulting plot is attached. I want the plot symbols for the points to be either 5 (a diamond) or 16 (a filled circle).
ggplot(toy.data, aes(zc, p.est, ymin = p.lcl, ymax = p.ucl)) +
scale_shape_manual(values = c(5, 16, 5, 5, 16))+
geom_pointrange(position = position_dodge(width = 0.1))+
geom_hline(yintercept = 1)+
ylim(0.5, 2.5)

Add shape = factor(p.est) inside aes(). Without a variable mapped to the shape aesthetic, scale_shape_manual() has nothing to apply to.
zc <- c(1,2,3,4,5)
p.est <- c(1.65, 1.55, 0.70, 1.61, 1.25)
p.lcl <-c(1.25, 1.10, 0.50, 1.20, 1.02)
p.ucl <-c(2.20, 2.05, 0.90, 2.20, 1.50)
toy.data <- tibble(zc = zc,
p.est = p.est,
p.lcl = p.lcl,
p.ucl = p.ucl)
ggplot(toy.data, aes(zc, p.est, ymin = p.lcl, ymax = p.ucl, shape = factor(p.est))) +
scale_shape_manual(values = c(5, 16, 5, 5, 16)) +
geom_pointrange(position = position_dodge(width = 0.1)) +
geom_hline(yintercept = 1) +
ylim(0.5, 2.5)
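Edit, in response to the follow-up question: to get just two symbol types, map shape to a grouping column with one level per symbol.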
toy.data <- tibble(zc = zc,
p.est = p.est,
p.est.x = c("A","B","A","A","B"),
p.lcl = p.lcl,
p.ucl = p.ucl)
ggplot(toy.data, aes(zc, p.est, ymin = p.lcl, ymax = p.ucl, shape = p.est.x)) +
scale_shape_manual(values = c(5, 16)) +
geom_pointrange(position = position_dodge(width = 0.1)) +
geom_hline(yintercept = 1) +
ylim(0.5, 2.5)
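One detail worth keeping in mind with the first version above: when the values passed to scale_shape_manual() are unnamed, they are matched to the factor levels in level order, and factor(p.est) sorts its levels numerically rather than keeping row order. The sketch below is one way to pin a shape to each row explicitly, assuming that is what is wanted. The column pt and the vector shape_for_row are hypothetical names introduced only for this illustration.

```r
library(ggplot2)

toy.data <- data.frame(
  zc    = 1:5,
  p.est = c(1.65, 1.55, 0.70, 1.61, 1.25),
  p.lcl = c(1.25, 1.10, 0.50, 1.20, 1.02),
  p.ucl = c(2.20, 2.05, 0.90, 2.20, 1.50)
)

# factor() sorts its levels, so unnamed shape values would follow this
# sorted order rather than the row order of the data.
level_order <- levels(factor(toy.data$p.est))

# Hypothetical helper column: one label per row, in row order.
toy.data$pt <- factor(paste0("pt", toy.data$zc))

# Naming the values ties each shape to a specific level explicitly.
shape_for_row <- setNames(c(5, 16, 5, 5, 16), levels(toy.data$pt))

p <- ggplot(toy.data, aes(zc, p.est, ymin = p.lcl, ymax = p.ucl, shape = pt)) +
  geom_pointrange() +
  scale_shape_manual(values = shape_for_row) +
  geom_hline(yintercept = 1) +
  ylim(0.5, 2.5)
```

Printing p draws the plot. With a named vector, reordering the data or the factor levels does not silently change which point gets which symbol.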

Related

How to fill geom_ribbon with different colours in R?

I am trying to use a different fill for geom_ribbon according to the x-values (one fill for Temp = 0-20, another for 20-30.1, and another for > 30.1). I am using the following code:
library(tidyverse)
bounds2 <- df %>%
mutate(ymax = pmax(Growth.rate, slope),
ymin = pmin(Growth.rate, slope),
x_bins = cut(Temp, breaks = c(0,20,30.1,max(Temp)+5)))
ggplot(df, aes(x = Temp, y = Growth.rate)) +
geom_line(colour = "blue") +
geom_line(aes(y = slope), colour = "red") +
scale_y_continuous(sec.axis = sec_axis(~ .^1, name = "slope")) +
geom_ribbon(data = bounds2, aes(Temp, ymin = ymin, ymax = ymax, fill = x_bins),
alpha = 0.4)
It returns the following output.
As you can see from the output, some regions between the bins remain empty. How can I fill those parts of the curve?
Here is the data:
df = structure(list(Temp = c(10, 13, 17, 20, 25, 28, 30, 32, 35, 38
), Growth.rate = c(0, 0.02, 0.19, 0.39, 0.79, 0.96, 1, 0.95,
0.65, 0), slope = c(0, 0.02, 0.16, 0.2, 0.39, 0.1, 0.03, -0.04,
-0.29, -0.65)), row.names = c(NA, 10L), class = "data.frame")
Here's a solution that involves interpolating new points at the boundaries between the areas. I used approx to get the values of ymin and ymax at Temp = 30.1 and added this row to the plotting dataset.
Then, instead of using cut just once as you did, I use it twice: once with the lower bound included in each bin and once with the upper bound included. Then I reshape the data to long format and de-duplicate the rows I don't need, so that every boundary point belongs to both of the bins it separates.
If you zoom in enough, you can see that the boundary is at 30.1, not at 30.
bounds2 <- df %>%
mutate(ymax = pmax(Growth.rate, slope),
ymin = pmin(Growth.rate, slope))
bounds2 <- bounds2 |>
add_case(Temp=30.1,
ymax=approx(bounds2$Temp,bounds2$ymax,xout = 30.1)$y,
ymin=approx(bounds2$Temp,bounds2$ymin,xout = 30.1)$y) |>
mutate(x_bins2 = cut(Temp, breaks = c(0,20,30.1,max(Temp)+5),right=FALSE, labels=c("0-20","20-30.1","30.1-max")),
x_bins = cut(Temp, breaks = c(0,20,30.1,max(Temp)+5), labels=c("0-20","20-30.1","30.1-max"))) |>
tidyr::pivot_longer(cols=c(x_bins2, x_bins), names_to = NULL, values_to = "xb") |>
distinct()
ggplot(df, aes(x = Temp, y = Growth.rate)) +
geom_line(colour = "blue") +
geom_line(aes(y = slope), colour = "red") +
scale_y_continuous(sec.axis = sec_axis(~ .^1, name = "slope")) +
geom_ribbon(data = bounds2, aes(Temp, ymin = ymin, ymax = ymax, fill = xb),
alpha = 0.4)
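The interpolation step can be checked in isolation with base R. The sketch below only reproduces the approx() calls from the answer on the question's data; the variable names boundary, ymax_at_boundary and ymin_at_boundary are introduced here for illustration.

```r
df <- data.frame(
  Temp        = c(10, 13, 17, 20, 25, 28, 30, 32, 35, 38),
  Growth.rate = c(0, 0.02, 0.19, 0.39, 0.79, 0.96, 1, 0.95, 0.65, 0),
  slope       = c(0, 0.02, 0.16, 0.2, 0.39, 0.1, 0.03, -0.04, -0.29, -0.65)
)

# Upper and lower edges of the ribbon at each observed temperature.
ymax <- pmax(df$Growth.rate, df$slope)
ymin <- pmin(df$Growth.rate, df$slope)

# 30.1 is not an observed temperature, so the ribbon edges there are
# obtained by linear interpolation between the rows at 30 and 32.
boundary <- 30.1
ymax_at_boundary <- approx(df$Temp, ymax, xout = boundary)$y
ymin_at_boundary <- approx(df$Temp, ymin, xout = boundary)$y
```

The boundary at 20 needs no interpolation because 20 is already an observed temperature; it only has to be assigned to both neighbouring bins.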
The idea is here, but the code I show could be much improved at the duplication step, where the last row of each of the first two bins is copied into the next bin.
### Libraries
library(tidyverse)
df <- structure(list(Temp = c(10, 13, 17, 20, 25, 28, 30, 32, 35, 38
), Growth.rate = c(0, 0.02, 0.19, 0.39, 0.79, 0.96, 1, 0.95,
0.65, 0), slope = c(0, 0.02, 0.16, 0.2, 0.39, 0.1, 0.03, -0.04,
-0.29, -0.65)), row.names = c(NA, 10L), class = "data.frame")
### Preprocessing
bounds2 <- df %>%
mutate(ymax = pmax(Growth.rate, slope),
ymin = pmin(Growth.rate, slope),
x_bins = cut(Temp, breaks = c(0, 20, 30.1, max(Temp)+5)))
### Duplicate the last row of each of the first two bins and move the copies into the next bin
bounds2 <- rbind(bounds2, bounds2[c(4, 7), ])
bounds2$x_bins[c(11, 12)] <- bounds2[c(5, 8), ]$x_bins
### Plot
ggplot(df, aes(x = Temp, y = Growth.rate)) +
geom_line(colour = "blue") +
geom_line(aes(y = slope), colour = "red") +
scale_y_continuous(sec.axis = sec_axis(~ .^1, name = "slope")) +
geom_ribbon(data = bounds2, aes(Temp, ymin = ymin, ymax = ymax, fill = x_bins),
alpha = 0.4)

How to add a new (custom) variable to a ggplot legend

I've run a number of models with two estimated parameters per model, with five groups and two treatments. I'm trying to graph the confidence intervals of these estimates in a large panel plot. Since I've simulated these data sets, I would like to include a dashed line for the "true value" of each parameter, which I set at the beginning of the exercise, so we can see how well the confidence interval of the model estimates covers the true value. I can do this just fine, but I'd also like the legend to include an entry showing that the dashed black line means "True Value".
Here's an example of the code. The first set of code works and does not include the dashed black line in the legend.
group = c("group1", "group2", "group3", "group4", "group5")
treatment = c("treatment1", "treatment2")
estimates = c("estim1", "estim2")
parameters = c("param1", "param2")
means = c(0, 0, 5, 0, -5, 0, 0, 7, -5, 10, -5, 0, 0, 0, 0, 0, -5, 0, 0, 10)
UL = c(.5, .5, 5.5, .5, -4.5, 0.5, 0.5, 7.5, -4.5, 10.5, -4.5, .5, .5, .5, .5, .5, -4.5, .5, .5, 10.5)
LL = c(-.5, -.5, 4.5, -.5, -5.5, -.5, -.5, 6.5, -4.5, 9.5, -4.5, -.5, -.5, -.5, -.5, -.5, -4.5, -.5, -.5, 9.5)
values = c(.2, -.2, 5.2, -.3, -4.7, -.1, -.2, 6.9, -5.3, 10.1, -4.4, 0.1, 0.2, 0.3, 0.1, -0.1, -4.9, -.2, -.2, 9.9)
df = data.frame(
group = rep(rep(group, each = 2), 2),
treatment = rep(treatment, each = 10),
estimates = rep(estimates, 10),
LL = LL,
means = means,
UL = UL,
parameters = rep(parameters, 10),
values = values
)
ggplot(data = df, aes(x = as.factor(estimates), y = means, color = estimates))+
geom_point()+
geom_errorbar(aes(ymin = LL, ymax = UL), width=.1, position = position_dodge(0.1))+
geom_segment(x = rep(c(.6, 1.6), 10), xend = rep(c(1.4, 2.4), 10),
y = values, yend = values, col = "black",
linetype = 3)+
scale_x_discrete(labels = c(expression(beta[1]), expression(beta[2])))+
xlab("Beta coefficient type")+ylab("Confidence Interval of Estimate")+
ggtitle("Coefficient Estimates")+
facet_grid(row = vars(treatment), col = vars(group))+
scale_color_manual(name = "Symbols",
values = c("estim1" = "#F8766D", "estim2" = "#00BFC4"),
labels = c(expression(beta[1]),
expression(beta[2])))+
scale_shape_manual(values = c("b1" = 16,
"b2" = 16))+
scale_linetype_manual(values = c("b1" = 1,
"b2" = 1))
The second set of code does not work, but it is my best attempt at getting the dashed black line into the legend.
ggplot(data = df, aes(x = as.factor(estimates), y = means, color = estimates))+
geom_point()+
geom_errorbar(aes(ymin = LL, ymax = UL), width=.1, position = position_dodge(0.1))+
geom_segment(x = rep(c(.6, 1.6), 10), xend = rep(c(1.4, 2.4), 10),
y = values, yend = values, col = "black",
linetype = 3)+
scale_x_discrete(labels = c(expression(beta[1]), expression(beta[2])))+
xlab("Beta coefficient type")+ylab("Confidence Interval of Estimate")+
ggtitle("Coefficient Estimates")+
facet_grid(row = vars(treatment), col = vars(group))+
scale_color_manual(name = "Symbols",
values = c("estim1" = "#F8766D", "estim2" = "#00BFC4"),
#"" = "#00000"),
labels = c(expression(beta[1]),
expression(beta[2])))#,
#"True Value"))#+
scale_shape_manual(values = c("b1" = 16,
"b2" = 16,
"" = 0))+
scale_linetype_manual(values = c("b1" = 1,
"b2" = 1,
"b3" = 3))
I've also thought that I could relevel the df$estimates column to have three levels: the existing "estim1" and "estim2", plus a dummy "True Value" level with no observations. But I'm worried that this would just add an empty "True Value" tick to the x-axis of each of my 12 plots.
Thanks for your help.
Map the linetype of your geom_segment to a string called "True value" inside aes, then add a scale_linetype_manual call. This will create a separate legend entry that matches the appearance of your segment and has the correct label.
ggplot(data = df, aes(x = as.factor(estimates), y = means, color = estimates)) +
geom_point() +
geom_errorbar(aes(ymin = LL, ymax = UL), width=.1,
position = position_dodge(0.1)) +
geom_segment(x = rep(c(.6, 1.6), 10), xend = rep(c(1.4, 2.4), 10),
y = values, yend = values, col = "black",
aes(linetype = "True value")) +
scale_x_discrete(labels = c(expression(beta[1]), expression(beta[2]))) +
xlab("Beta coefficient type")+ylab("Confidence Interval of Estimate") +
ggtitle("Coefficient Estimates") +
facet_grid(row = vars(treatment), col = vars(group)) +
scale_color_manual(name = "Symbols",
values = c("estim1" = "#F8766D", "estim2" = "#00BFC4"),
labels = c(expression(beta[1]),
expression(beta[2]))) +
scale_linetype_manual(values = 3, name = NULL)
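The same idea can be seen on a smaller example. The sketch below is not the asker's data: the data frame est and its columns (pos, term, mean, LL, UL, true) are made up for illustration. It shows only that a constant string mapped to linetype inside aes() produces its own legend entry, which scale_linetype_manual() then styles.

```r
library(ggplot2)

# Hypothetical estimates for two coefficients, with their true values.
est <- data.frame(
  pos  = c(1, 2),
  term = c("b1", "b2"),
  mean = c(0.1, 5.2),
  LL   = c(-0.4, 4.7),
  UL   = c(0.6, 5.7),
  true = c(0, 5)
)

p <- ggplot(est, aes(x = pos, y = mean)) +
  geom_point(aes(colour = term)) +
  geom_errorbar(aes(ymin = LL, ymax = UL, colour = term), width = 0.1) +
  # The constant "True value" inside aes() creates a one-level linetype
  # scale, and therefore a legend entry with that label.
  geom_segment(aes(x = pos - 0.4, xend = pos + 0.4,
                   y = true, yend = true,
                   linetype = "True value"),
               colour = "black") +
  scale_x_continuous(breaks = est$pos, labels = est$term) +
  # One level, so a single value is enough; 3 is the dotted line type.
  scale_linetype_manual(values = 3, name = NULL)
```

Because the segment colour is set outside aes(), the line stays black and does not add a level to the colour legend.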

Rearranging trendline colors in ggplot

I created a plot in ggplot that turned out mostly how I'd like it, but I need the lines to appear in a slightly different color arrangement.
Basically, I need all "mean" lines to appear in blue and all "odd" lines to appear in red. Pref 1 can take either the lighter or the darker shade, with pref 2 taking the other. As you can see, ggplot has not quite done that.
p2 <- ggplot(asd_pref_plot_groups, aes(x, pref_plot_groups$predicted, col = combined)) +
geom_line(size=1.5) +
scale_color_manual(values = c("blue","deepskyblue","red","pink")) +
geom_ribbon(aes(ymin=conf.low,ymax=conf.high, fill=combined),alpha=.2,colour=NA) +
scale_fill_manual(values = c("blue","deepskyblue","red","pink")) +
geom_point(data=summStats,aes(trial,mean,col = combined),size=2) +
scale_color_manual(values = c("blue","deepskyblue","red","pink")) +
theme_bw() +
xlab('Trial') +
ylab('Prediction Error') +
ggtitle('ASD learning about TD vs. ASD \n learning about ASD') +
theme(text=element_text(size=20),
plot.title = element_text(hjust = 0.5),
panel.border = element_blank())
Above is my code. I thought I could shift the colors around in scale_color_manual as needed, but it doesn't seem to work. Is there an easy fix, or does the problem lie in my data frames? Thank you.
Your question didn't include any example data, so I have had to try to recreate your data set (see footnote).
To make sure we are on the right track, I will use exactly your plotting code to get a very similar plot:
ggplot(asd_pref_plot_groups, aes(x, pref_plot_groups$predicted, col = combined)) +
geom_line(size=1.5) +
scale_color_manual(values = c("blue","deepskyblue","red","pink")) +
geom_ribbon(aes(ymin = conf.low, ymax = conf.high, fill = combined),
alpha = 0.2, colour = NA) +
scale_fill_manual(values = c("blue","deepskyblue","red","pink")) +
geom_point(data = summStats, aes(trial, mean,col = combined), size = 2) +
scale_color_manual(values = c("blue","deepskyblue","red","pink")) +
theme_bw() +
xlab('Trial') +
ylab('Prediction Error') +
ggtitle('ASD learning about TD vs. ASD \n learning about ASD') +
theme(text=element_text(size=20),
plot.title = element_text(hjust = 0.5),
panel.border = element_blank())
All we need to do here is to remove one of your redundant scale_color_manual calls (you currently have 2), and change the ordering of the colors in both the fill and color scales:
ggplot(asd_pref_plot_groups, aes(x, pref_plot_groups$predicted,
col = combined, fill = combined)) +
geom_line(size = 1.5) +
geom_ribbon(aes(ymin = conf.low, ymax = conf.high),
alpha = 0.2, colour = NA) +
scale_fill_manual(values = c("blue","red", "deepskyblue", "pink")) +
scale_color_manual(values = c("blue","red","deepskyblue", "pink")) +
geom_point(data = summStats, aes(trial, mean,col = combined), size = 2) +
theme_bw() +
xlab('Trial') +
ylab('Prediction Error') +
ggtitle('ASD learning about TD vs. ASD \n learning about ASD') +
theme(text=element_text(size=20),
plot.title = element_text(hjust = 0.5),
panel.border = element_blank())
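The reordering works because unnamed colors are assigned to the levels of combined in level order, which for a character column is alphabetical. An alternative that does not depend on that order is a named vector. The sketch below assumes the level names used in the footnote data; pal and demo are hypothetical names introduced for illustration.

```r
library(ggplot2)

combined <- c("pref1_mean", "pref1_odd", "pref2_mean", "pref2_odd")

# Each level is tied to its color by name, so the order of the
# vector no longer matters.
pal <- c(pref1_mean = "blue", pref2_mean = "deepskyblue",
         pref1_odd  = "red",  pref2_odd  = "pink")

demo <- data.frame(
  x         = rep(c(1, 60), 4),
  combined  = rep(combined, each = 2),
  predicted = c(1.3, 1.3, 1.45, 1.3, 2, 1.75, 2.05, 1.77)
)

p <- ggplot(demo, aes(x, predicted, colour = combined)) +
  geom_line() +
  scale_colour_manual(values = pal)
```

The same named vector can be passed to scale_fill_manual() so that the ribbons and lines stay in sync.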
Footnote: Reproducible data to approximate data in question
set.seed(1)
asd_pref_plot_groups <- data.frame(x = rep(c(1, 60), 4),
combined = rep(c('pref1_mean', 'pref1_odd',
'pref2_mean', 'pref2_odd'),
each = 2),
predicted = c(1.3, 1.3, 1.45, 1.3,
2, 1.75, 2.05, 1.77),
conf.high = c(1.35, 1.35, 1.5, 1.35,
2.05, 1.8, 2.1, 1.82),
conf.low = c(1.25, 1.25, 1.4, 1.25,
1.95, 1.7, 2, 1.72))
pref_plot_groups <- asd_pref_plot_groups
summStats <- data.frame(trial = rep(1:60, 4),
combined = rep(c('pref1_mean', 'pref1_odd',
'pref2_mean', 'pref2_odd'),
each = 60),
mean = c(rnorm(60, seq(1.3, 1.3, length = 60), 0.05),
rnorm(60, seq(1.45, 1.3, length = 60), 0.05),
rnorm(60, seq(2, 1.75, length = 60), 0.05),
rnorm(60, seq(2.05, 1.77, length = 60), 0.05)))

Error with using side-by-side, nudged graphs in ggplot (facet_wrap and position_nudge)

I'm trying to create side-by-side plots with nudged data points (Odds ratios, with 95% CI error bars) in R using ggplot. Each time I try to combine them I get an error. Can anyone help me identify what I should do to change my code? This is the error I get:
Error in (~surv) + scale_x_continuous(breaks = seq(0, 4, 1)) :
non-numeric argument to binary operator
To illustrate what I'm trying to do, see below a version I created using plot(), which is fairly ugly. I've tried combining facet_wrap and position_nudge based on the guidance in the J Stuart Carlton blog, but haven't been able to add a position_nudge. The error message above suggests that the problem is with the facet_wrap section of my code.
The code below replicates my dataset.
activity <- factor(rep(c("Good interaction", "Poor interaction",
"RTW plan"), times = 4))
surv <- factor(rep(c("T1", "T2"), each = 3, times = 2))
mod <- factor(rep(c("Crude", "Adjusted"), each = 6))
or <- c(1.72, 1.26, 2.39, 2.5, 1.34, 1.89, 1.14, 1.09, 2.02, 1.9, 1.1, 1.02)
low <- c(1.22, 0.74, 1.73, 1.74, 0.61, 1.35, 0.77, 0.61, 1.40, 1.22, 0.60, 0.68)
hi <- c(2.41, 2.16, 3.29, 3.6, 1.8, 2.64, 1.70, 1.94, 2.90, 2.95, 2.04, 1.54)
rtwc <- data.frame(activity, surv, mod, or, low, hi)
And here is the ggplot code I've been using:
ggplot(rtwc, aes(x = or, y = activity, colour = mod)) +
geom_vline(aes(xintercept = 1), size = 0.25, linetype = "dashed") +
geom_errorbarh(data = filter(rtwc, mod == "crude"), aes(xmax = hi, xmin = low), size = 0.5, height = 0.1, colour = "gray50", position = position_nudge(y = fix)) +
geom_point(data = filter(rtwc, mod == "crude"), aes(xmax = hi, xmin = low), size = 4, position_nudge(y = fix)) +
geom_errorbarh(data = filter(rtwc, mod == "Adjusted"), aes(xmax = hi, xmin = low), size = 0.5, height = 0.1, colour = "gray50", position = position_nudge(y = -fix)) +
geom_point(data = filter(rtwc, mod == "Adjusted"), size = 4, position = position_nudge(y = -fix)) +
geom_errorbarh(data = filter(rtwc, mod = "Adjusted")) +
facet_wrap = (~surv) +
scale_x_continuous(breaks = seq(0, 4, 1)) +
coord_trans(x = "log10") +
theme_bw() +
theme(panel.grid.minor = element_blank())
Apologies if there is already a post on this question.
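The error itself comes from the line facet_wrap = (~surv) +: facet_wrap is a function and has to be called as facet_wrap(~surv), otherwise R tries to add the formula ~surv to the scale object that follows it. Here is some code to get you started. I used position_dodge together with coord_flip to keep the error bar pairs away from each other: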
ggplot(rtwc,
aes(y = or, ymin = low, ymax = hi, x = activity, group = mod)) +
geom_hline(yintercept = 1, size = 0.25, linetype = "dashed", colour = "grey") +
geom_errorbar(width = 0.2, position = position_dodge(0.5)) +
geom_point(aes(col = mod), position = position_dodge(0.5),
size = 3) +
scale_x_discrete(name = "") +
scale_y_log10(name = "", breaks = seq(0, 4)) +
scale_color_manual(name = "", values = c("red", "blue")) + # change colours here
expand_limits(y = c(0.1, 4)) + #adjust x axis range here
facet_wrap(~surv) +
coord_flip() +
theme_bw() + #change look & feel here
theme(panel.grid.minor = element_blank())
For tweaks to the plot's look & feel, you can check out the available themes in ggplot here, & more themes in ggthemes here. Just please don't use the Excel 2013 theme. As the creator noted, its presence is for ironic purposes only.
For tweaks to the point colours, here's a handy reference for colours by name. Or you can use one of the palettes from RColorBrewer, viewable via RColorBrewer::display.brewer.all().

ggplot2 v2.21.9 sec.axis in polar plot

I am currently trying to add a secondary axis using the recently introduced sec.axis argument in ggplot2. It works well with scatter and bar plots, but not with a polar plot: in the following code, the name of the second y-axis appears, but the axis itself does not.
Is there any workaround or option that I have not figured out?
require(ggplot2)
set.seed(40);
Location <- data.frame(Winkel = round(runif(1000, 0, 24), 0))
Location$BAD <- Location$Winkel %in% c(seq(7, 18))
Abschnitte <- c(0:24)
polar <- data.frame(Winkel2 = c(1.5, 2.34, 1.2, 3.45, 1.67, 2.61, 1.11, 13.2),
value = c(0.1, 0.03, 0.02, 0.015, 0.01, 0.04, 0.09, 0.06))
ggplot(Location, aes(x = Winkel, fill = BAD, y = (..count..)/sum(..count..))) +
geom_histogram(breaks = seq(0,24), colour = "black") +
coord_polar(start = 0) + theme_minimal() +
scale_fill_brewer(type = "seq", palette = 3) +
ylab("Percentual allocation time") +
ggtitle("") +
scale_x_continuous("", limits = c(0, 24), breaks = Abschnitte, labels = Abschnitte) +
scale_y_continuous(labels = scales::percent,
sec.axis = sec_axis(~.*5, name = "mean direction")) +
geom_segment(data = polar, aes(x = Winkel2, y = 0, xend = Winkel2, yend = value, fill = NA),
arrow = arrow(angle = 30, type = "closed", length = unit(0.3, "cm")))
As @henrik mentioned in the comments, this is a bug. It has been patched, and the fix is available in the development version from GitHub (i.e., devtools::install_github("tidyverse/ggplot2")).
Here's the example after the patch:
require(ggplot2)
#> Loading required package: ggplot2
set.seed(40);
Location <- data.frame(Winkel = round(runif(1000, 0, 24), 0))
Location$BAD <- Location$Winkel %in% c(seq(7, 18))
Abschnitte <- c(0:24)
polar <- data.frame(Winkel2 = c(1.5, 2.34, 1.2, 3.45, 1.67, 2.61, 1.11, 13.2),
value = c(0.1, 0.03, 0.02, 0.015, 0.01, 0.04, 0.09, 0.06))
ggplot(Location, aes(x = Winkel, fill = BAD, y = (..count..)/sum(..count..))) +
geom_histogram(breaks = seq(0,24), colour = "black") +
coord_polar(start = 0) + theme_minimal() +
scale_fill_brewer(type = "seq", palette = 3) +
ylab("Percentual allocation time") +
ggtitle("") +
scale_x_continuous("", limits = c(0, 24), breaks = Abschnitte, labels = Abschnitte) +
scale_y_continuous(labels = scales::percent,
sec.axis = sec_axis(~.*5, name = "mean direction")) +
geom_segment(data = polar, aes(x = Winkel2, y = 0, xend = Winkel2, yend = value, fill = NA),
arrow = arrow(angle = 30, type = "closed", length = unit(0.3, "cm")))
#> Warning: Ignoring unknown aesthetics: fill
