geom_smooth(): One line, different colors

I am trying to customize my plot so that a single smoothed line changes color and linetype by class.
If I specify the color or linetype in aes() (or mapping = aes()), I get two separate smooths, one for each class. This makes sense, because the smoothing is applied once per type.
If I use group = 1 in the aesthetics, I get one line, but also only one color/linetype.
I cannot find a way to get one smooth line with a different color/linetype for each class.
My code:
ggplot(df2, aes(x = dateTime, y = capacity)) +
#geom_line(size = 0) +
stat_smooth(geom = "area", method = "loess", show.legend = F,
mapping = aes(x = dateTime, y = capacity, fill = type, color = type, linetype = type)) +
scale_color_manual(values = c(col_fill, col_fill)) +
scale_fill_manual(values = c(col_fill, col_fill2))
The result for my data:
Reproducible code:
File: tmp.csv (I cannot shorten this file and paste it here, otherwise the smoothing fails with too few data points)
library(ggplot2)
df2 <- read.csv("tmp.csv")
df2$dateTime <- as.POSIXct(df2$dateTime, format = "%Y-%m-%d %H:%M:%OS")
col_lines <- "#8DA8C5"
col_fill <- "#033F77"
col_fill2 <- "#E5E9F2"
ggplot(df2, aes(x = dateTime, y = capacity)) +
stat_smooth(geom = "area", method = "loess", show.legend = F,
mapping = aes(x = dateTime, y = capacity, fill = type, color = type, linetype = type)) +
scale_color_manual(values = c(col_fill, col_fill)) +
scale_fill_manual(values = c(col_fill, col_fill2))

I would suggest modelling the data outside the plotting function and then plotting the result with ggplot. I used the pipe (%>%) and mutate() from the tidyverse for convenience, but you don't have to. I also prefer to draw the line and the fill as separate layers, which avoids the dashed line on the right side of your plot.
library(tidyverse) # provides %>%, mutate() and ggplot2
df2$index <- as.numeric(df2$dateTime) #create an index for the loess model
model <- loess(capacity ~ index, data = df2) #model the capacity
plot <- df2 %>% mutate(capacity_predicted = predict(model)) %>% # use the predicted data for the capacity
ggplot(aes(x = dateTime, y = capacity_predicted)) +
geom_ribbon(aes(ymax = capacity_predicted, ymin = 0, fill = type, group = type)) +
geom_line(aes( color = type, linetype = type)) +
scale_color_manual(values = c(col_fill, col_fill)) +
scale_fill_manual(values = c(col_fill, col_fill2)) +
theme_minimal() +
theme(legend.position = "none")
plot
Please tell me if it works (I don't have the original data to test it), and if you would like a version without tidyverse functions.
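As a self-contained sketch of the same approach, the snippet below fits the loess model outside ggplot on synthetic data. The data frame demo, its columns, and the 150-row split between "historic" and "predict" are assumptions made up for illustration, since the original tmp.csv is not available here.

```r
library(ggplot2)

# Synthetic stand-in for the question's data (assumption: hourly readings)
set.seed(1)
n <- 200
demo <- data.frame(
  dateTime = as.POSIXct("2017-12-01", tz = "UTC") + seq(0, by = 3600, length.out = n)
)
demo$index <- as.numeric(demo$dateTime)  # numeric predictor for loess
demo$capacity <- 50 + 10 * sin(seq(0, 4 * pi, length.out = n)) + rnorm(n, sd = 3)
demo$type <- ifelse(seq_len(n) <= 150, "historic", "predict")  # hypothetical split

# Fit one smooth over all rows, then store the fitted values
fit <- loess(capacity ~ index, data = demo)
demo$capacity_predicted <- predict(fit)

# The single fitted curve is drawn once; type only controls fill, color and linetype
p <- ggplot(demo, aes(x = dateTime, y = capacity_predicted)) +
  geom_ribbon(aes(ymin = 0, ymax = capacity_predicted, fill = type, group = type)) +
  geom_line(aes(color = type, linetype = type)) +
  theme_minimal() +
  theme(legend.position = "none")
p
```

Because the model is fitted once on all rows, both classes share the same curve, which is the difference from letting stat_smooth() fit one smooth per type.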
EDIT:
Not very clean, but a smoother curve can be obtained with this code:
df3 <- data.frame(index = seq(min(df2$index), max(df2$index), length.out = 300),
type = "historic", stringsAsFactors = F)
modelling_date_index <- 1512562500
df3$type[df3$index <= modelling_date_index] = "predict"
plot <- df3 %>% mutate(capacity_predicted = predict(model, newdata = index),
dateTime = as.POSIXct(index, origin = '1970-01-01')) %>%
# arrange(dateTime) %>%
ggplot(aes(x = dateTime, y = capacity_predicted)) +
geom_ribbon(aes(ymax = capacity_predicted, ymin = 0, fill = type, group = type)) +
geom_line(aes( color = type, linetype = type)) +
scale_color_manual(values = c(col_fill, col_fill)) +
scale_fill_manual(values = c(col_fill, col_fill2)) +
theme_minimal()+
theme(legend.position = "none")
plot

Related

R ggplot2 : geom_jitter and fill, problem to have the dots on the right boxplot

Here's my R code
ggplot(dat = Table, aes(x = Group, y = value, fill = Type)) +
geom_boxplot(alpha=0.08)+
geom_jitter()+
scale_fill_brewer(palette="Spectral")+
theme_minimal()
As you can see, the dots are in the middle of the boxplots. What can I add to geom_jitter so that each point sits on the right boxplot instead of in the middle? I also tried geom_point, which gave the same result.
Thanks to the help it now works, but I wanted to add a line to connect the dots and got this. Can someone tell me how to properly connect the dots with lines?

I think if you group by interaction(Group, Type) and use position_jitterdodge() you should get what you're looking for.
ggplot(mtcars, aes(as.character(am), mpg, color = as.character(vs),
group = interaction(as.character(vs), as.character(am)))) +
geom_boxplot() +
geom_jitter(position = position_jitterdodge()) # same output with geom_point()
Edit - here's an example with manual jittering applied to data where each subject appears once in each Group.
I looked for a built-in way to do this, and this answer comes close, but I couldn't get position_jitterdodge to position points by the Group/Type groups while grouping lines by id alone. Both the position adjustment and the series identification rely on the same group aesthetic, but each needs a different value for it.
Table = data.frame(id = 1:4,
value = rnorm(8),
Group = rep(c("a","b"), each = 4),
Type = c("1", "2"))
library(dplyr)
Table %>%
mutate(x = as.numeric(as.factor(Group)) +
0.2 * scale(as.numeric(as.factor(Type))) +
rnorm(n(), sd = 0.06)) %>%
ggplot(aes(x = Group, y = value, fill = Type, group = interaction(Group, Type))) +
geom_boxplot(alpha=0.2)+
geom_point(aes(x = x)) +
geom_line(aes(x = x, group = id), alpha = 0.1) +
scale_fill_brewer(palette="Spectral")+
theme_minimal()

Best to use position_dodge instead if you want them to line up:
library(ggplot2)
Table <- tibble::tibble(
Group = rep(c("A", "B"), each = 20),
Type = factor(rep(c(1:2, 1:2), each = 10)),
value = rnorm(40, mean = 10)
)
ggplot(dat = Table, aes(x = Group, y = value, fill = Type)) +
geom_boxplot(alpha=0.08)+
geom_point(position = position_dodge(width = 0.75))+
scale_fill_brewer(palette="Spectral")+
theme_minimal()
To add a line, make sure group = ID goes in both the geom_point and geom_line calls:
library(ggplot2)
Table <- tibble::tibble(
Group = rep(c("A", "B"), each = 20),
Type = factor(rep(c(1:2, 1:2), each = 10)),
ID = factor(rep(1:20, times = 2)),
value = rnorm(40, mean = 10)
)
ggplot(dat = Table, aes(x = Group, y = value, fill = Type)) +
geom_boxplot(alpha = 0.08) +
geom_point(aes(group = ID), position = position_dodge(width = 0.75))+
geom_line(aes(group = ID), position = position_dodge(width = 0.75), colour = "grey")+
scale_fill_brewer(palette = "Spectral") +
theme_minimal()

How to plot multiple mean lines in a single histogram with multiple groups present?

I am plotting the distributions of two variables on a single histogram. I would like to highlight each distribution's mean value on that graph with a dotted line or something similar (ideally something that matches the colors already set in the aes section of the code).
How would I do that?
This is my code so far.
hist_plot <- ggplot(data, aes(x= value, fill= type, color = type)) +
geom_histogram(position="identity", alpha=0.2) +
labs( x = "Value", y = "Count", fill = "Type", title = "Title") +
guides(color = FALSE)
Also, is there any way to show the count of n for each type on this graph?

I've made some reproducible code that might help you with your problem.
library(tidyverse)
# Generate some random data
df <- data.frame(value = c(runif(50, 0.5, 1), runif(50, 1, 1.5)),
type = c(rep("type1", 50), rep("type2", 50)))
# Calculate means from df
stats <- df %>% group_by(type) %>% summarise(mean = mean(value),
n = n())
# Make the ggplot
ggplot(df, aes(x= value, fill= type, color = type)) +
geom_histogram(position="identity", alpha=0.2) +
labs(x = "Value", y = "Count", fill = "Type", title = "Title") +
guides(color = "none") +
geom_vline(data = stats, aes(xintercept = mean, color = type), size = 2) +
geom_text(data = stats, aes(x = mean, y = max(df$value), label = n),
size = 10,
color = "black")
If things go as intended, you'll end up with a histogram that shows one mean line and one count label per type.
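A variation on the same idea, as a rough sketch: the question asked for a dotted line and a count per type, so this version uses dashed lines and labels of the form "n = 50". It computes the summary in base R rather than dplyr. The bins = 30 setting and the label placement at the top of the panel are choices made here for illustration, not part of the original answer.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(
  value = c(runif(50, 0.5, 1), runif(50, 1, 1.5)),
  type  = rep(c("type1", "type2"), each = 50)
)

# One row per type: mean and group size (base R, no dplyr needed)
stats <- aggregate(value ~ type, data = df, FUN = mean)
names(stats)[2] <- "mean"
stats$n <- as.vector(table(df$type)[as.character(stats$type)])

p <- ggplot(df, aes(x = value, fill = type, color = type)) +
  geom_histogram(position = "identity", alpha = 0.2, bins = 30) +
  # Dashed mean lines reuse the color scale of the histogram
  geom_vline(data = stats, aes(xintercept = mean, color = type),
             linetype = "dashed") +
  # y = Inf pins the label to the top of the panel; vjust moves it inside
  geom_text(data = stats, aes(x = mean, y = Inf, label = paste0("n = ", n)),
            vjust = 1.5, color = "black", inherit.aes = FALSE) +
  guides(color = "none")
p
```

Mapping color = type inside geom_vline() is what makes the mean lines match the histogram colors.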

Set the legend of a ggplotly() plot to have only the color and not the shape index

I have the dataframe below:
etf_id<-c("a","b","c","d","e","a","b","c","d","e","a","b","c","d","e")
factor<-c("A","A","A","A","A","B","B","B","B","B","C","C","C","C","C")
normalized<-c(-0.048436801,2.850578601,1.551666490,0.928625186,-0.638111793,
-0.540615895,-0.501691539,-1.099239823,-0.040736139,-0.192048665,
0.198915407,-0.092525810,0.214317734,0.550478998,0.024613778)
df<-data.frame(etf_id,factor,normalized)
and I create a ggplotly() boxplot with:
library(ggplot2)
library(plotly)
ggplotly(ggplot(data = df, aes(x = factor, y = normalized)) +
geom_boxplot(aes(fill = as.factor(factor)),outlier.colour = 'black') +
geom_point(data = df, position = position_dodge(0.75))+geom_point(data = df,
aes(x = factor, y = normalized, shape = etf_id, color = etf_id),
size = 2))
The result is a boxplot with this legend:
but I want the legend to show only the color distinction, like below. Note that there won't always be 3 factors; the number may vary from 1 to 8.

The recommended way to alter plotly elements is to use the style() function. You can identify the elements and traces by inspecting plotly_json().
I'm not sure if there's a more compact way, but you can achieve the desired result using:
p <- ggplotly(ggplot(data = df, aes(x = factor, y = normalized)) +
geom_boxplot(aes(fill = as.factor(factor)),outlier.colour = 'black') +
geom_point(data = df, position = position_dodge(0.75))+geom_point(data = df,
aes(x = factor, y = normalized, shape = etf_id, color = etf_id),
size = 2))
p <- style(p, showlegend = FALSE, traces = 5:9)
factor_levels <- levels(as.factor(df$factor)) # df$factor is a character column in R >= 4.0
for (i in seq_along(factor_levels)) {
p <- style(p, name = factor_levels[i], traces = i)
}
p
Note that in this case the factor levels and traces align but that won't always be the case so you may need to adjust this (i.e. i + x).

One quick way would be to add show.legend = FALSE to suppress the legend for that layer.
library(ggplot2)
ggplot(data = df, aes(x = factor, y = normalized)) +
geom_boxplot(aes(fill = as.factor(factor)),outlier.colour = 'black') +
geom_point(position = position_dodge(0.75)) +
geom_point(aes(x = factor, y = normalized, shape = etf_id, color = etf_id),
size = 2, show.legend=FALSE)
Unfortunately, this does not work when the plot is passed to ggplotly. You can use theme(legend.position='none'), which works but suppresses all the legends instead of specific ones. One dirty hack is to disable specific legend entries manually:
temp_plot <- ggplotly(ggplot(data = df, aes(x = factor, y = normalized)) +
geom_boxplot(aes(fill = as.factor(factor)),outlier.colour = 'black') +
geom_point(position = position_dodge(0.75)) +
geom_point(aes(x = factor, y = normalized, shape = etf_id, color = etf_id),size = 2))
temp_plot[[1]][[1]][4:9] <- lapply(temp_plot[[1]][[1]][4:9], function(x) {x$showlegend <- FALSE;x})
temp_plot

Adjusting rugplot in ggplot2

Below is the code for a graph I am making for an article I am working on. The plot shows the predicted probabilities along a range of values in my data set. Along the x-axis is a rug plot that shows the distribution of trade share values (I provided the code and an image of the graph):
sitc8 <- ggplot() + geom_line(data=plotdat8, aes(x = lagsitc8100, y = PredictedProbabilityMean), size = 2, color="blue") +
geom_ribbon(data=plotdat8, aes(x = lagsitc8100, ymin = lowersd, ymax = uppersd),
fill = "grey50", alpha=.5) +
ylim(c(-0.75, 1.5)) +
geom_hline(yintercept=0) +
geom_rug(data=multi.sanctions.bust8.full@frame, aes(x=lagsitc8100), col="black", size=1.0, sides="b") +
xlab("SITC 8 Trade Share") +
ylab("Probability of Sanctions Busting") +
theme(panel.grid.major = element_line(colour = "gray", linetype = "dotted"), panel.grid.minor =
element_blank(), panel.background = element_blank())
My question is: is it possible to change the color of the lines of the rugplot of trade share in which the event I am modeling occurs? In other words, I would like to add red lines or red dots along those values of trade share when my event = 1.
Is this possible?

Sure. You'd just have to add a color argument within an aes() function call within geom_rug().
Here's some code to create a dummy data frame.
library(tidyverse)
set.seed(42)
dummy_data <- tibble(x_var = rnorm(100),
y_var = abs(rnorm(100)) * x_var) %>%
rownames_to_column(var = "temp_row") %>%
mutate(color_id = if_else(as.numeric(temp_row) <= 50,
"Type A",
"Type B"))
And here's a ggplot call where the color for geom_rug is mapped to a character column named color_id
ggplot(data = dummy_data, mapping = aes(x = x_var, y = y_var)) +
geom_smooth(method = "lm") +
geom_rug(mapping = aes(color = color_id), sides = "b")
Update:
Following OP's comment, here's an updated version. If the variable is a numeric vector of 0s and 1s, you have to tell ggplot to treat it as a dichotomous variable, for instance by wrapping it in a call to factor().
The colors can be set manually using scale_color_manual(). So the changes to the code are the following:
color_id is now a vector of 0s and 1s.
The color is now mapped to factor(color_id).
The color scale is set using scale_color_manual().
library(tidyverse)
set.seed(42)
dummy_data <- tibble(x_var = rnorm(100),
y_var = abs(rnorm(100)) * x_var) %>%
rownames_to_column(var = "temp_row") %>%
mutate(color_id = if_else(as.numeric(temp_row) <= 50,
0,
1))
ggplot(data = dummy_data, mapping = aes(x = x_var, y = y_var)) +
geom_smooth(method = "lm") +
geom_rug(mapping = aes(color = factor(color_id)), sides = "b") +
scale_color_manual(values = c("black", "red")) +
labs(color = "This takes two values")

Definitely possible. Here's an example using iris, and a dynamic condition in the rug. You could also do two rugs, if you chose.
library(tidyverse)
iris %>%
ggplot(aes(x = Sepal.Length, y = Sepal.Width)) +
geom_point() +
geom_rug(aes(color = Petal.Length >3), sides = "b")
# Second example, output not shown
iris %>%
ggplot(aes(x = Sepal.Length, y = Sepal.Width)) +
geom_point() +
geom_rug(data = subset(iris, Petal.Length > 3), color = "black", sides = "b") +
geom_rug(data = subset(iris, Petal.Length <= 3), color = "red", sides = "b")

Wrong legends in ggplot2 with two different plot types in one graph based on the same dataset

Please consider the following:
I want to plot a step-wise curve (using geom_step()) and some smooth lines (using geom_line()) in one graph using ggplot2.
I manage to create a graph but the labels are wrong and cannot be corrected using scale_color_discrete().
Desired outcome: Based on the data (see below), line "hello" is the upper line, followed by "foo" and "bar", but the labels are not correct. In addition, I also need a label for the geom_step() curve, which is currently missing.
Question: What am I doing wrong?
Reproducible example:
library(ggplot2)
# Data
db <- data.frame(time = 0:100,
step = 1-pexp(0:100, rate = 1),
foo = 1-pexp(0:100, rate = 0.4),
bar = 1-pexp(0:100, rate = 0.5),
hello = 1-pexp(0:100, rate = 0.1)
)
# Plotted with wrong labels (automatically)
ggplot(data = db, aes(x = time, y = step)) +
geom_step(show.legend = T) +
geom_line(aes(x = time, y = foo, col = "red")) +
geom_line(aes(x = time, y = bar, col = "blue")) +
geom_line(aes(x = time, y = hello, col = "green"))
Looking at the labels, one can already see that the description of the color and the color of the line do not match.
# Still wrong labels
ggplot(data = db, aes(x = time, y = step)) +
geom_step(show.legend = T) +
geom_line(aes(x = time, y = foo, col = "red")) +
geom_line(aes(x = time, y = bar, col = "blue")) +
geom_line(aes(x = time, y = hello, col = "green")) +
scale_color_discrete(name = "Dose", labels = c("foo", "bar", "hello"))
Changing the labels obviously won't help.
Created on 2019-04-15 by the reprex package (v0.2.0).

You are specifying the color inside the aesthetics call. This means "red" is mapped as a label (a data value) rather than used as the color red.
You can fix this for example like this:
p <- ggplot(data = db, aes(x = time, y = step)) +
geom_step(aes(color = "step")) +
geom_line(aes(y = foo, color = "foo")) +
geom_line(aes(y = bar, color = "bar")) +
geom_line(aes(y = hello, color = "hello"))
p
Note that I dropped the x = time as this is inherited from the ggplot-call in each step. If you want to change the color for each of the lines, you should now use for example scale_color_manual like the following:
p +
scale_color_manual(name = "Dose",
values = c("step" = "black", "foo" = "red",
"bar" = "blue", "hello" = "green"))
Another option would be to transform your data to the long format:
library(tidyr)
library(dplyr)
new_db <- gather(db, type, value, -time)
ggplot(data = filter(new_db, type != "step"), aes(x = time, y = value, color = type)) +
geom_line() +
geom_step(data = filter(new_db, type == "step"))
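To make the mapping-versus-setting distinction concrete, here is a small sketch that builds the same line twice and inspects the color ggplot2 actually computes for it. It uses layer_data() from ggplot2; the object names mapped, fixed, mapped_colour and fixed_colour are made up for this example.

```r
library(ggplot2)

db <- data.frame(time = 0:100, foo = 1 - pexp(0:100, rate = 0.4))

# Mapped: "red" inside aes() is treated as a data value, so it becomes
# a legend label and the color scale picks the actual color
mapped <- ggplot(db, aes(time, foo)) + geom_line(aes(color = "red"))

# Set: "red" outside aes() is a literal color, no scale or legend involved
fixed <- ggplot(db, aes(time, foo)) + geom_line(color = "red")

# Inspect the color each layer ends up drawing with
mapped_colour <- layer_data(mapped, 1)$colour[1]
fixed_colour  <- layer_data(fixed, 1)$colour[1]

mapped_colour  # a color chosen by the default discrete scale
fixed_colour   # the literal "red"
```

This is why the legend in the question shows the labels "red", "blue" and "green" next to colors that do not match them.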
