How to plot two dashed regression lines using GGPlot - r

I am trying to draw two dashed lines using ggplot. The graph shows two regression lines belonging to two different factor groups. I've been able to make one of the lines dashed, but I am having trouble getting the other line to have dashes. Any help would be greatly appreciated.
coli_means %>%
ggplot(aes(time, mean_heartrate, group = treatment)) +
geom_point( aes(group = treatment, color = treatment)) +
geom_smooth(aes(method = "loess", linetype = treatment, se = FALSE,
group = treatment, color = treatment, show.legend = TRUE))
I feel I am missing one simple input. Thanks.

What you need to do is use scale_linetype_manual() and then tell it that both the treatment groups require a dashed line.
Let's start with a reproducible example:
# reproducible example:
library(dplyr)    # provides the %>% pipe
library(ggplot2)
set.seed(0)
time <- rep(1:100, 2)
treatment <- c(rep("A", 100), rep("B", 100))
mean_heartrate <- c(rnorm(100, 60, 2), rnorm(100, 80, 2))
coli_means <- data.frame(time, treatment, mean_heartrate)
# ggplot: map linetype to treatment, then set both levels to dashed
coli_means %>%
  ggplot(aes(x = time, y = mean_heartrate)) +
  geom_point(aes(color = treatment)) +
  geom_smooth(aes(linetype = treatment, color = treatment)) +
  scale_linetype_manual(values = c("dashed", "dashed"))
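If the linetype does not need to differ between groups, a possibly simpler variant is to set it as a constant outside aes(). The sketch below rebuilds the same simulated coli_means and also assumes that method and se were meant as ordinary geom_smooth() arguments rather than aesthetics; treat it as one option, not the only fix.

```r
library(ggplot2)

# Same simulated data as in the reproducible example
set.seed(0)
coli_means <- data.frame(
  time = rep(1:100, 2),
  treatment = rep(c("A", "B"), each = 100),
  mean_heartrate = c(rnorm(100, 60, 2), rnorm(100, 80, 2))
)

# linetype is a fixed value here, so it goes outside aes();
# method and se are arguments of geom_smooth(), not aesthetics
p <- ggplot(coli_means, aes(x = time, y = mean_heartrate, color = treatment)) +
  geom_point() +
  geom_smooth(method = "loess", se = FALSE, linetype = "dashed")

print(p)
```

Because linetype is not mapped to a variable in this variant, no separate linetype legend is drawn; only the colour legend distinguishes the treatments.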

Related

How to plot several outcomes for several groups in R ggplot?

I want to build a single graph displaying several outcomes (with different point- and lineshapes for each, respectively) for several strata (displayed in different colours) over time. Using this for one group works:
data <- data.frame(
time = rep(c("Baseline", "Follow-Up 1", "Follow-Up 2"), each = 8),
stratum = rep(c("Intervention", "Control"), 12),
outcome = rep(c("Sensitivity", "Specificity", "PPV", "NPV"), 3, each = 2),
value = runif(24)
)
# working
data %>%
filter(stratum == "Intervention") %>%
ggplot(aes(x = time, y = value, group = outcome, colour = stratum)) +
geom_point(aes(shape = outcome)) +
geom_line(aes(linetype = outcome))
# not working
data %>%
ggplot(aes(x = time, y = value, group = outcome, colour = stratum)) +
geom_point(aes(shape = outcome)) +
geom_line(aes(linetype = outcome))
(Graph displaying what I want for one stratum; the other should ideally just be added to the same graph in another colour, listed under "stratum" in the legend.)
When I try the same for both strata, it does not work and produces the following error:
Error in `f()`:
! geom_path: If you are using dotted or dashed lines, colour, size and linetype must be constant over the line
Run `rlang::last_error()` to see where the error occurred.
The info in last_error() does not help me. Has anyone a solution here?
The group aesthetic should uniquely define the points that you want connected with a line. You need to consider both outcome and stratum in that.
data %>%
ggplot(aes(x = time, y = value, group = paste(outcome, stratum), colour = stratum)) +
geom_point(aes(shape = outcome)) +
geom_line(aes(linetype = outcome))
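As a hedged alternative to paste(), base R's interaction() builds the same kind of combined grouping factor. The sketch below recreates the example data (the set.seed() call is an addition for reproducibility) and checks nothing beyond what the answer already states: each outcome/stratum combination forms its own line.

```r
library(ggplot2)

set.seed(1)  # added only so the random values are reproducible
data <- data.frame(
  time = rep(c("Baseline", "Follow-Up 1", "Follow-Up 2"), each = 8),
  stratum = rep(c("Intervention", "Control"), 12),
  outcome = rep(c("Sensitivity", "Specificity", "PPV", "NPV"), 3, each = 2),
  value = runif(24)
)

# One group per outcome/stratum combination, so colour and linetype
# stay constant along every line
p <- ggplot(data, aes(x = time, y = value,
                      group = interaction(outcome, stratum),
                      colour = stratum)) +
  geom_point(aes(shape = outcome)) +
  geom_line(aes(linetype = outcome))

print(p)
```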

ggplot boxplot + jitter plot showing random sampling of data

I'd like to use ggplot to generate a series of boxplots derived from all data within a dataset, but then with jittered points showing a random sampling of the respective data (e.g., 100 data points) to avoid over-plotting (there are thousands of data points). Can anyone please help me with the code for this? The basic framework I have now is below, but I don't know what if any arguments can be added to draw a random sampling of data to display as the jittered points. Thanks for any help.
ggplot(datafile, aes(x=factor(var1), y=var2, fill=var3)) + geom_jitter(size=0.1, position=position_jitter(width=0.3, height=0.2)) + geom_boxplot(alpha=0.5) + facet_grid(.~var3) + theme_bw() + scale_fill_manual(values=c("red", "green", "blue"))
You could take a random subset of your data using dplyr:
library(dplyr)
library(ggplot2)
ggplot(data = datafile, aes(x = factor(var1), y = var2, fill = var3)) +
  geom_jitter(
    # use a random subset of the data: 100 rows per level of var1
    data = datafile %>% group_by(var1) %>% sample_n(100),
    aes(x = factor(var1), y = var2, fill = var3),
    size = 0.1,
    position = position_jitter(width = 0.3, height = 0.2)) +
  geom_boxplot(alpha = 0.5) +
  facet_grid(. ~ var3) +
  theme_bw() +
  scale_fill_manual(values = c("red", "green", "blue"))
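One caveat worth sketching: sample_n(100) stops with an error when a group has fewer than 100 rows. Assuming a reasonably recent dplyr (1.0.0 or later), slice_sample() can be used instead, since it silently keeps a smaller group whole. The datafile below is a hypothetical stand-in with made-up group sizes, not the asker's data.

```r
library(dplyr)

# Hypothetical stand-in for datafile; group "c" has only 40 rows
set.seed(42)
datafile <- data.frame(
  var1 = rep(c("a", "b", "c"), times = c(1000, 500, 40)),
  var2 = rnorm(1540)
)

# At most 100 rows per level of var1; groups smaller than 100 are kept whole
jitter_data <- datafile %>%
  group_by(var1) %>%
  slice_sample(n = 100) %>%
  ungroup()

count(jitter_data, var1)
```

The resulting jitter_data could then be passed as the data argument of geom_jitter() in place of the sample_n() pipeline.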

Add legend using geom_point and geom_smooth from different dataset

I am struggling to set the correct legend for a geom_point plot with a loess regression when two data sets are used.
I have a data set that summarizes activity over a day. On the same graph I plot all the activity recorded per hour and per day, a regression curve smoothed with a loess function, and the mean of each hour over all the days.
To be more precise, here is the first version of the code and the graph it returns, without a legend, which is exactly what I expected:
# first graph: gives what I expected, but with no legend
p <- ggplot(dat1, aes(x = Hour, y = value)) +
geom_point(color = "darkgray", size = 1) +
geom_point(data = dat2, mapping = aes(x = Hour, y = mean),
color = 20, size = 3) +
geom_smooth(method = "loess", span = 0.2, color = "red", fill = "blue")
and the graph (in grey, all the data per hour and per day; the red curve is the loess regression; the blue dots are the means for each hour):
When I tried to add the legend, I could not get one that explains both kinds of dots (data in grey, mean in blue) and the loess curve (in red). Below are some examples of what I tried.
# second graph: gives what I expected plus the legend I wanted for the loess,
# but without the legend for the dots
p <- ggplot(dat1, aes(x = Hour, y = value)) +
geom_point(color = "darkgray", size = 1) +
geom_point(data = dat2, mapping = aes(x = Hour, y = mean),
color = "blue", size = 3) +
geom_smooth(method = "loess", span = 0.2, aes(color = "red"), fill = "blue") +
scale_color_identity(name = "legend model", guide = "legend",
labels = "loess regression \n with confidence interval")
This gave me the correct legend, but for the curve only.
Another attempt:
# I tried to combine both data sets into a single one as follows, but it did not
# work at all, and I really do not understand how legends work in ggplot2
# compared to base plots
A <- rbind(dat1, dat2)
p <- ggplot(A, aes(x = Heure, y = value, color = variable)) +
geom_point(data = subset(A, variable == "data"), size = 1) +
geom_point(data = subset(A, variable == "Moy"), size = 3) +
geom_smooth(method = "loess", span = 0.2, aes(color = "red"), fill = "blue") +
scale_color_manual(name = "légende",
labels = c("Data", "Moy", "loess regression \n with confidence interval"),
values = c("darkgray", "royalblue", "red"))
It appears that all the legend settings are mixed together in a "weird" way: there is a grey dot covered by a grey line, and then the same in blue and in red (for the 3 labels). All of them have a background filled in blue:
If you need to label the mean, you might need to be a bit creative, because it's not so easy to add a legend manually in ggplot.
I simulate something that looks like your data below.
library(dplyr)    # for %>%, group_by() and summarise()
library(ggplot2)
dat1 = data.frame(
  Hour = rep(1:24, each = 10),
  value = c(rnorm(60, 0, 1), rnorm(60, 2, 1), rnorm(60, 1, 1), rnorm(60, -1, 1))
)
# classify this as raw data
dat1$Data = "Raw"
# calculate mean like you did
dat2 <- dat1 %>% group_by(Hour) %>% summarise(value = mean(value))
# classify this as mean
dat2$Data = "Mean"
# combine the data frames
plotdat <- rbind(dat1, dat2)
# add a dummy variable, we'll use it later
plotdat$line = "Loess-Smooth"
We make the basic dot plot first:
ggplot(plotdat, aes(x = Hour, y = value, col = Data, size = Data)) +
  geom_point() +
  scale_color_manual(values = c("blue", "darkgray")) +
  scale_size_manual(values = c(3, 1), guide = "none")
Note that for size we set guide = "none" (guide = FALSE in older ggplot2 versions) so that it does not appear in the legend. Now we add the loess smooth. One way to get a legend entry for it is to map a linetype; since the dummy variable has only one value, you get a single legend entry:
ggplot(plotdat, aes(x = Hour, y = value, col = Data, size = Data)) +
  geom_point() +
  scale_color_manual(values = c("blue", "darkgray")) +
  scale_size_manual(values = c(3, 1), guide = "none") +
  # fit the smooth on the raw data only (note == for comparison, not =)
  geom_smooth(data = subset(plotdat, Data == "Raw"),
              aes(linetype = line), size = 1, alpha = 0.3,
              method = "loess", span = 0.2, color = "red", fill = "blue")

How to vary line and ribbon colours in a facet_grid

I'm hoping someone can help with this plotting problem I have. The data can be found here.
Basically I want to plot a line (mean) and its associated confidence interval (lower, upper) for 4 models I have tested. I want to facet on the Cat_Auth variable, for which there are 4 categories (so 4 plots). The first 'model' is actually just the mean of the sample data and I don't want a CI for this (NA values specified in the data - not sure if this is the correct thing to do).
I can get the plot some way there with:
newdata <- read.csv("data.csv", header=T)
ggplot(newdata, aes(x = Affil_Max, y = Mean)) +
geom_line(data = newdata, aes(), colour = "blue") +
geom_ribbon(data = newdata, alpha = .5, aes(ymin = Lower, ymax = Upper, group = Model, fill = Model)) +
facet_grid(.~ Cat_Auth)
But I'd like different coloured lines and shaded ribbons for each model (e.g. a red mean line and red shaded ribbon for model 2, green for model 3 etc). Also, I can't figure out why the blue line corresponding to the first set of mean values is disjointed as it is.
Would be really grateful for any assistance!
Try this:
library(dplyr)
library(ggplot2)
newdata %>%
mutate(Model = as.factor(Model)) %>%
ggplot(aes(Affil_Max, Mean)) +
geom_line(aes(color = Model, group = Model)) +
geom_ribbon(alpha = .5, aes(ymin = Lower, ymax = Upper,
group = Model, fill = Model)) +
facet_grid(. ~ Cat_Auth)

How to create a heatmap with continuous scale using ggplot2 in R

I have got a data frame with several 1000 rows in the form of
group = c("gr1","gr1","gr1","gr1","gr1","gr1","gr1","gr1","gr1","gr1","gr2","gr2","gr2","gr2","gr2","gr2","gr2","gr2","gr2","gr2","gr3","gr3","gr3","gr3","gr3","gr3","gr3","gr3","gr3","gr3")
pos = c(1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10)
color = c(2,2,2,2,3,3,2,2,3,2,1,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,1,1,2,2)
df = data.frame(group, pos, color)
and would like to make a kind of heatmap in which one axis has a continuous scale (position). The color column is categorical. However, due to the large number of data points, I want to use binning, i.e. treat it as a continuous variable.
This is more or less what the plot should look like:
I can't think of a way to create such a plot using ggplot2/R. I have tried several geometries, e.g. geom_point()
ggplot(data=df, aes(x=strain, y=pos, color=color)) +
geom_point() +
scale_colour_gradientn(colors=c("yellow", "black", "orange"))
Thanks for your help in advance.
Does this help you?
library(ggplot2)
group = c("gr1","gr1","gr1","gr1","gr1","gr1","gr1","gr1","gr1","gr1","gr2","gr2","gr2","gr2","gr2","gr2","gr2","gr2","gr2","gr2","gr3","gr3","gr3","gr3","gr3","gr3","gr3","gr3","gr3","gr3")
pos = c(1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10)
color = c(2,2,2,2,3,3,2,2,3,2,1,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2,1,1,2,2)
df = data.frame(group, pos, color)
ggplot(data = df, aes(x = group, y = pos)) + geom_tile(aes(fill = color))
Looks like this
Here is an improved version with a three-colour gradient, if you like:
library(scales)
ggplot(data = df, aes(x = group, y = pos)) +
  geom_tile(aes(fill = color)) +
  scale_fill_gradientn(colours = c("orange", "black", "yellow"),
                       values = rescale(c(1, 2, 3)),
                       guide = "colorbar")
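The question also mentions binning because of the number of rows, which the tile plot above does not do by itself. One possible sketch, under the assumption that averaging the color code within fixed-width position bins is acceptable: the larger simulated df and the bin_width value are hypothetical, chosen only for illustration.

```r
library(ggplot2)

# Hypothetical larger data set in the same shape as df
set.seed(1)
df <- data.frame(
  group = rep(c("gr1", "gr2", "gr3"), each = 1000),
  pos = rep(1:1000, times = 3),
  color = sample(1:3, 3000, replace = TRUE)
)

bin_width <- 50  # assumed bin size; adjust to the data

# Assign each position to the centre of its bin
df$bin <- (df$pos - 1) %/% bin_width * bin_width + bin_width / 2

# Average the color code within each group/bin combination
binned <- aggregate(color ~ group + bin, data = df, FUN = mean)

p <- ggplot(binned, aes(x = group, y = bin, fill = color)) +
  geom_tile() +
  scale_fill_gradientn(colours = c("orange", "black", "yellow"))

print(p)
```

Averaging treats the categorical code as a number, which matches the "use it as a continuous variable" idea in the question but may not suit every data set.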
