I want to build a single graph showing several outcomes (each with its own point shape and line type) for several strata (shown in different colours) over time. This works for one stratum:
library(dplyr)
library(ggplot2)

data <- data.frame(
  time = rep(c("Baseline", "Follow-Up 1", "Follow-Up 2"), each = 8),
  stratum = rep(c("Intervention", "Control"), 12),
  outcome = rep(c("Sensitivity", "Specificity", "PPV", "NPV"), 3, each = 2),
  value = runif(24)
)
# working
data %>%
filter(stratum == "Intervention") %>%
ggplot(aes(x = time, y = value, group = outcome, colour = stratum)) +
geom_point(aes(shape = outcome)) +
geom_line(aes(linetype = outcome))
# not working
data %>%
ggplot(aes(x = time, y = value, group = outcome, colour = stratum)) +
geom_point(aes(shape = outcome)) +
geom_line(aes(linetype = outcome))
Graph displaying what I want for one stratum; the other stratum should ideally just be added to the same graph in another colour, listed under "stratum" in the legend.
When I try the same with both strata, it fails with the following error:
Error in `f()`:
! geom_path: If you are using dotted or dashed lines, colour, size and linetype must be constant over the line
Run `rlang::last_error()` to see where the error occurred.
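The info in last_error() does not help me. Does anyone have a solution?
The group aesthetic should uniquely define the points that you want connected with a line. With group = outcome alone, each line contains points from both strata, so the colour changes along a dashed line, which geom_path does not allow. You need to include both outcome and stratum in the group.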
data %>%
ggplot(aes(x = time, y = value, group = paste(outcome, stratum), colour = stratum)) +
geom_point(aes(shape = outcome)) +
geom_line(aes(linetype = outcome))
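As a minimal sketch of the same idea, interaction() can build the combined grouping instead of paste(). The set.seed() call and the names p and n_groups are additions for illustration, not part of the original answer. The last line counts how many separate lines ggplot2 actually draws.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed only so the example is repeatable
data <- data.frame(
  time = rep(c("Baseline", "Follow-Up 1", "Follow-Up 2"), each = 8),
  stratum = rep(c("Intervention", "Control"), 12),
  outcome = rep(c("Sensitivity", "Specificity", "PPV", "NPV"), 3, each = 2),
  value = runif(24)
)

# One group per outcome/stratum combination, so colour and linetype
# are constant along every line.
p <- ggplot(data, aes(x = time, y = value,
                      group = interaction(outcome, stratum),
                      colour = stratum)) +
  geom_point(aes(shape = outcome)) +
  geom_line(aes(linetype = outcome))

# Number of distinct lines in the geom_line layer (layer 2):
# 4 outcomes x 2 strata
n_groups <- length(unique(layer_data(p, 2)$group))
```

Printing p draws the plot. Each of the lines then has three points, one per time point.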
Related
I am making a stratigraphic plot but somehow, my data points don't connect correctly.
The purpose of this plot is that the values on the x-axis are connected so you get an overview of the change in d18O throughout time (age, ma).
I've used the following script:
library(readxl)
R_pliocene_tot <- read_excel("Desktop/R_d18o.xlsx")
View(R_pliocene_tot)
install.packages("analogue")
install.packages("gridExtra")
library(tidyverse)
R_pliocene_Rtot <- R_pliocene_tot %>%
gather(key=param, value=value, -age_ma)
R_pliocene_Rtot
R_pliocene_Rtot %>%
ggplot(aes(x=value, y=age_ma)) +
geom_path() +
geom_point() +
facet_wrap(~param, scales = "free_x") +
scale_y_reverse() +
labs(x = NULL, y = "Age (ma)")
which leads to the following figure:
Something is wrong with the geom_path function, I guess, but I can't figure out what it is.
Although the comment seems to solve the problem, I don't think the question was actually answered. So here is a short introduction to how geom_path works in ggplot2.
library(dplyr)
library(ggplot2)
# This dataset contains two groups, with random values for y and x running from 1 to 20.
# The param column just replicates the param variable from the question.
df <- tibble(x = rep(seq(1, 20, by = 1), 2),
y = runif(40, min = 1, max = 100),
group = c(rep("group 1", 20), rep("group 2", 20)),
param = rep("a param", 40))
df %>%
ggplot(aes(x = x, y = y)) +
# geom_path has a group aesthetic that tells it
# which data points belong to which path.
# Points in the same group are connected together.
# Here I also map color to group to make the paths easier to tell apart.
geom_path(aes(group = group, color = group)) +
geom_point() +
facet_wrap(~param, scales = "free_x") +
scale_y_reverse() +
labs(x = NULL, y = "Age (ma)")
Your data work well with group = 1, so I guess all data points belong to one group and you just want to draw one line connecting all of them. If you take my example data above and draw it with group = 1, you get two lines similar to the previous example, but now the end point of group 1 is connected to the starting point of group 2.
All data points are now on one path, and they are drawn in the order in which they appear in the data. (I kept the color mapping just to make this a bit clearer.)
df %>%
ggplot(aes(x = x, y = y)) +
geom_path(aes(group = 1, color = group)) +
geom_point() +
facet_wrap(~param, scales = "free_x") +
scale_y_reverse() +
labs(x = NULL, y = "Age (ma)")
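The point about drawing order can be checked on a tiny example. This is a sketch with made-up data (the data frame df and the names p_path, p_line, path_order and line_order are illustrative only): geom_path keeps the row order of the data, while geom_line reorders the points by x before connecting them.

```r
library(ggplot2)

# Three points deliberately stored out of x order
df <- data.frame(x = c(3, 1, 2), y = c(30, 10, 20))

p_path <- ggplot(df, aes(x, y)) + geom_path()  # connects rows in data order
p_line <- ggplot(df, aes(x, y)) + geom_line()  # connects points sorted by x

# Order in which each layer will connect the points
path_order <- layer_data(p_path)$x
line_order <- layer_data(p_line)$x
```

For a stratigraphic plot where the path should follow age, sorting the rows by the age column before calling geom_path() (for example with dplyr::arrange()) is one way to get the points connected in that order. Whether that matches the original data is an assumption, since the spreadsheet is not shown.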
I hope this gives you a better understanding of ggplot2::geom_path.
I have two dataframes and I want to plot a comparison between them. The plot and dataframes look like so
df2019 <- data.frame(Role = c("A", "B", "C"),
                     Women_percent = c(65, 50, 70), Men_percent = c(35, 50, 30),
                     Women_total = c(130, 100, 140), Men_total = c(70, 100, 60))
df2016 <- data.frame(Role = c("A", "B", "C"),
                     Women_percent = c(70, 45, 50), Men_percent = c(30, 55, 50),
                     Women_total = c(140, 90, 100), Men_total = c(60, 110, 100))
all_melted <- reshape2::melt(
rbind(cbind(df2019, year=2019), cbind(df2016, year=2016)),
id=c("year", "Role"))
There's no reason I need the data in melted form; I only melted it because I was plotting bar graphs with it. Now I need a line graph, and I don't know how to make line graphs from melted data, nor how to keep the 2019/2016 tag if the data is not melted. When I try to make a line graph, I don't know how to specify which "variable" should be used. I want the lines to show the Women and Men percent values, and the labels to show the totals. (In this picture the geom_text shows the percent values; I want it to use the total values.)
Crucially, I want the linetype to be dotted for 2016, and for the legend to show that.
I think it would be simplest to rbind the two frames after labelling them with their year, then reshape the result so that you have columns for role, year, gender, percent and total.
I would then use a bit of alpha scale trickery to hide the points and labels from 2016:
library(tidyr)
library(ggplot2)

df2016$year <- 2016
df2019$year <- 2019
rbind(df2016, df2019) %>%
pivot_longer(cols = 2:5, names_sep = "_", names_to = c("Gender", "Type")) %>%
pivot_wider(names_from = Type) %>%
ggplot(aes(Role, percent, color = Gender,
linetype = factor(year),
group = paste(Gender, year))) +
geom_line(size = 1.3) +
geom_point(size = 10, aes(alpha = year)) +
geom_text(aes(label = total, alpha = year), colour = "black") +
scale_colour_manual(values = c("#07aaf6", "#ef786f")) +
scale_alpha(range = c(0, 1), guide = guide_none()) +
scale_linetype_manual(values = c(2, 1)) +
labs(y = "Percent", color = "Gender", linetype = "Year")
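To make the reshaping step easier to follow, here is the same pipeline stopped before the plot, as a sketch. The name long is introduced here for illustration. The result has one row per role, year and gender, with percent and total as separate columns.

```r
library(tidyr)

df2019 <- data.frame(Role = c("A", "B", "C"),
                     Women_percent = c(65, 50, 70), Men_percent = c(35, 50, 30),
                     Women_total = c(130, 100, 140), Men_total = c(70, 100, 60))
df2016 <- data.frame(Role = c("A", "B", "C"),
                     Women_percent = c(70, 45, 50), Men_percent = c(30, 55, 50),
                     Women_total = c(140, 90, 100), Men_total = c(60, 110, 100))
df2016$year <- 2016
df2019$year <- 2019

long <- rbind(df2016, df2019) %>%
  # "Women_percent" is split at "_" into Gender = "Women", Type = "percent"
  pivot_longer(cols = 2:5, names_sep = "_", names_to = c("Gender", "Type")) %>%
  # spread Type back out so percent and total become columns of their own
  pivot_wider(names_from = Type)

names(long)  # Role, year, Gender, percent, total
```

With the data in this shape, percent can be mapped to y and total to the text label in the same ggplot() call.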
I am currently trying to draw two dashed lines using ggplot. The graph shows two regression lines belonging to two different factor groups. I've been able to make one of the lines dashed, but I am having trouble getting the other line to be dashed as well. Any help would be greatly appreciated.
coli_means %>%
ggplot(aes(time, mean_heartrate, group = treatment)) +
geom_point( aes(group = treatment, color = treatment)) +
geom_smooth(aes(method = "loess", linetype = treatment, se = FALSE,
group = treatment, color = treatment, show.legend = TRUE))
I feel I am missing one simple input. Thanks.
What you need to do is use scale_linetype_manual() and then tell it that both the treatment groups require a dashed line.
Let's start with a reproducible example:
library(dplyr)
library(ggplot2)

# reproducible example:
set.seed(0)
time <- rep(1:100,2)
treatment <- c(rep("A",100), rep("B",100))
mean_heartrate <- c(rnorm(100,60,2), rnorm(100,80,2))
coli_means <- data.frame(time, treatment, mean_heartrate)
# ggplot
coli_means %>%
ggplot(aes(x = time, y = mean_heartrate)) +
geom_point(aes(color = treatment)) +
geom_smooth(aes(linetype = treatment, color = treatment))+
scale_linetype_manual(values = c('dashed','dashed'))
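If both lines should simply be dashed and no linetype legend is needed, another option is to set linetype as a constant outside aes() rather than mapping it. This is a sketch of that alternative, not the approach from the answer above. The explicit method and formula arguments are added here only to make the smoother's defaults visible, and the names p and smooth_data are illustrative.

```r
library(ggplot2)

set.seed(0)
coli_means <- data.frame(
  time = rep(1:100, 2),
  treatment = c(rep("A", 100), rep("B", 100)),
  mean_heartrate = c(rnorm(100, 60, 2), rnorm(100, 80, 2))
)

p <- ggplot(coli_means, aes(x = time, y = mean_heartrate)) +
  geom_point(aes(color = treatment)) +
  # linetype is set (not mapped), so it applies to every smooth line;
  # color is still mapped, which keeps one line per treatment
  geom_smooth(aes(color = treatment), linetype = "dashed",
              method = "loess", formula = y ~ x)

# Data ggplot2 will draw for the smooth layer (layer 2)
smooth_data <- layer_data(p, 2)
```

The trade-off is that a constant linetype does not get a linetype legend of its own, whereas the scale_linetype_manual() version keeps one.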
I needed to add some partial boxplots to the following plot:
library(tidyverse)
foo <- tibble(
time = 1:100,
group = sample(c("a", "b"), 100, replace = TRUE) %>% as.factor()
) %>%
group_by(group) %>%
mutate(value = rnorm(n()) + 10 * as.integer(group)) %>%
ungroup()
foo %>%
ggplot(aes(x = time, y = value, color = group)) +
geom_point() +
geom_smooth(se = FALSE)
I would like to add a grid of (2 x 4 = 8) boxplots (4 per group) to the plot above. Each boxplot should cover a consecutive selection of 25 (or n) points in each group. That is, the first two boxplots represent the points between the 1st and the 25th (one boxplot below for group a, and one above for group b). Next to them come two more boxplots for the points between the 26th and the 50th, and so on. If they are not in a perfect grid (which I suppose would be both harder to obtain and uglier), that would be even better: I would prefer them to "follow" their corresponding smooth line!
All of that without using facets (because I have to insert them into a plot that is already faceted :-))
I tried:
bar <- foo %>%
group_by(group) %>%
mutate(cut = 12.5 * (time %/% 25)) %>%
ungroup()
bar %>%
ggplot(aes(x = time, y = value, color = group)) +
geom_point() +
geom_smooth(se = FALSE) +
geom_boxplot(aes(x = cut))
but it doesn't work.
I tried to call geom_boxplot() using group instead of x
bar %>%
ggplot(aes(x = time, y = value, color = group)) +
geom_point() +
geom_smooth(se = FALSE) +
geom_boxplot(aes(group = cut))
But it draws the boxplots without considering the groups and even loses the colors (and adding a redundant color = group to the call doesn't help).
Finally, I decided to try it roughly:
bar %>%
ggplot(aes(x = time, y = value, color = group)) +
geom_point() +
geom_smooth(se = FALSE) +
geom_boxplot(data = filter(bar, group == "a"), aes(group = cut)) +
geom_boxplot(data = filter(bar, group == "b"), aes(group = cut))
And it works (maintaining even the correct colors from the main aes)!
Does anyone know whether it is possible to obtain this with a single call to geom_boxplot()?
Thanks!
This was interesting! I haven't tried to use geom_boxplot with a continuous x before and didn't know how it behaved. I think what is happening is that setting group explicitly replaces the default grouping (the interaction of all discrete variables in the plot), so each box pools both levels of group and the colour, which is no longer constant within a box, is dropped. That would explain why it respects neither the inherited nor the repeated colour aesthetic. I think this workaround does the trick: we combine the group and cut variables into group_cut, which takes 8 different values (one for each desired boxplot). Now we can map aes(group = group_cut) and get the desired output. I don't think this is particularly intuitive, and it might be worth raising it on GitHub, since usually we expect aesthetics to combine nicely (e.g. combining colour and linetype works fine).
library(tidyverse)
bar <- tibble(
time = 1:100,
group = sample(c("a", "b"), 100, replace = TRUE) %>% as.factor()
) %>%
group_by(group) %>%
mutate(
value = rnorm(n()) + 10 * as.integer(group),
cut = 12.5 * ((time - 1) %/% 25), # modified this to prevent an extra boxplot
group_cut = str_c(group, cut)
) %>%
ungroup()
bar %>%
ggplot(aes(x = time, y = value, colour = group)) +
geom_point() +
geom_smooth(se = FALSE) +
geom_boxplot(aes(group = group_cut), position = "identity")
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'
Created on 2019-08-13 by the reprex package (v0.3.0)
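As a variation on the workaround, the combined grouping can also be built directly inside aes() with interaction(), without the helper column. This is a sketch under the same assumptions as the reprex. The fixed seed and the names p and n_boxes are added for illustration, and geom_smooth() is left out to keep the example short.

```r
library(dplyr)
library(ggplot2)

set.seed(42)  # assumption: fixed seed only so the example is repeatable
bar <- tibble(
  time = 1:100,
  group = factor(sample(c("a", "b"), 100, replace = TRUE))
) %>%
  mutate(
    value = rnorm(n()) + 10 * as.integer(group),
    cut = 12.5 * ((time - 1) %/% 25)  # four bins of 25 consecutive time points
  )

p <- ggplot(bar, aes(x = time, y = value, colour = group)) +
  geom_point() +
  # one box per group/bin combination; colour stays constant within a box
  geom_boxplot(aes(group = interaction(group, cut)), position = "identity")

# The boxplot layer (layer 2) has one row per box
n_boxes <- nrow(layer_data(p, 2))
```

Because colour is constant within each combined group, the boxes keep the colours inherited from the main aes().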
In R with ggplot, I want to create a spaghetti plot (2 quantitative variables) grouped by a third variable to specify line color. Secondly, I want to aggregate that grouping variable with the line type or width.
Here's an example using the airquality dataset. I want the line's color to represent the month, and the summer months to have a different line width from non-summer months.
First, I created an indicator variable for the aggregated groups:
airquality$Summer <- with(airquality, ifelse(Month >= 6 & Month < 9, 1, 0))
I would like something like this, but with differing line widths:
However, this fails:
library(ggplot2)
ggplot(data = airquality, aes(x=Wind, y = Temp, color = as.factor(Month), group = Summer)) +
geom_point() +
geom_line(linetype = as.factor(Summer))
This also fails (specifying airquality$Summer):
ggplot(data = airquality, aes(x=Wind, y = Temp,
color = as.factor(Month), group = airquality$Summer)) +
geom_point() +
geom_line(linetype = as.factor(airquality$Summer))
I attempted this solution, but get another error:
lty <- setNames(c(0, 1), levels(airquality$Summer))
ggplot(data = airquality, aes(x=Wind, y = Temp,
color = as.factor(Month), group = airquality$Summer)) +
geom_point() +
geom_line(linetype = as.factor(airquality$Summer)) +
scale_linetype_manual(values = lty)
Any ideas?
EDIT:
My actual data show very clear trends, and I want to differentiate the top line from all the others below. My goal is to convince people they should make more than just the minimum payment on their student loans:
You just need to change the group to Month and put linetype inside aes:
ggplot(data = airquality, aes(x=Wind, y = Temp, color = as.factor(Month), group = Month)) +
geom_point() +
geom_line(aes(linetype = factor(Summer)))
If you want to specify the linetype you can use a few methods. Here is one way:
lineT <- c("solid", "dotdash")
names(lineT) <- c("1","0")
ggplot(data = airquality, aes(x=Wind, y = Temp, color = as.factor(Month))) +
geom_point() +
geom_line(aes(linetype = factor(Summer))) +
scale_linetype_manual(values = lineT)
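Since the question also asks about line width, the same pattern should carry over to the linewidth aesthetic. This is a sketch that assumes ggplot2 3.4.0 or later, where linewidth and scale_linewidth_manual() are available. The copy aq, the width values 0.5 and 1.5, and the names p and line_data are illustrative choices, not taken from the answer.

```r
library(ggplot2)

aq <- airquality  # work on a copy of the built-in dataset
aq$Summer <- with(aq, ifelse(Month >= 6 & Month < 9, 1, 0))

p <- ggplot(aq, aes(x = Wind, y = Temp,
                    colour = factor(Month), group = Month)) +
  geom_point() +
  # width is mapped to the summer indicator, one line per month
  geom_line(aes(linewidth = factor(Summer))) +
  # assumed widths: thin for non-summer (0), thick for summer (1)
  scale_linewidth_manual(values = c("0" = 0.5, "1" = 1.5), name = "Summer")

# Data ggplot2 will draw for the line layer (layer 2)
line_data <- layer_data(p, 2)
```

On older ggplot2 versions, line width was controlled through the size aesthetic instead, so the mapping and scale names would differ there.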