I am currently trying to customize my plot with the goal to have a plot like this:
If I try to specify the color or linetype in either aes() or mapping = aes(), I get two different smooths. One for each class. This makes sense, because the smoothing will be applied once for each type.
If I use group = 1 in the aesthetics, I get one line, but also only one color/linetype.
But I cannot find a solution that gives one smooth line with a different color/linetype for each class.
My code:
ggplot(df2, aes(x = dateTime, y = capacity)) +
#geom_line(size = 0) +
stat_smooth(geom = "area", method = "loess", show.legend = F,
mapping = aes(x = dateTime, y = capacity, fill = type, color = type, linetype = type)) +
scale_color_manual(values = c(col_fill, col_fill)) +
scale_fill_manual(values = c(col_fill, col_fill2))
The result for my data:
Reproducible code:
File: enter link description here (I cannot shorten this file and paste it here, because the smoothing then fails with too few data points)
df2 <- read.csv("tmp.csv")
df2$dateTime <- as.POSIXct(df2$dateTime, format = "%Y-%m-%d %H:%M:%OS")
col_lines <- "#8DA8C5"
col_fill <- "#033F77"
col_fill2 <- "#E5E9F2"
ggplot(df2, aes(x = dateTime, y = capacity)) +
stat_smooth(geom = "area", method = "loess", show.legend = F,
mapping = aes(x = dateTime, y = capacity, fill = type, color = type, linetype = type)) +
scale_color_manual(values = c(col_fill, col_fill)) +
scale_fill_manual(values = c(col_fill, col_fill2))
I would suggest modelling the data outside the plotting function and then plotting the result with ggplot. I used the pipe (%>%) and mutate from the tidyverse for convenience, but you don't have to. Also, I prefer to draw the line and the fill separately, to avoid the dashed line on the right side of your plot.
library(dplyr) # for %>% and mutate()
library(ggplot2)
df2$index <- as.numeric(df2$dateTime) # create a numeric index for the loess model
model <- loess(capacity ~ index, data = df2) #model the capacity
plot <- df2 %>% mutate(capacity_predicted = predict(model)) %>% # use the predicted data for the capacity
ggplot(aes(x = dateTime, y = capacity_predicted)) +
geom_ribbon(aes(ymax = capacity_predicted, ymin = 0, fill = type, group = type)) +
geom_line(aes( color = type, linetype = type)) +
scale_color_manual(values = c(col_fill, col_fill)) +
scale_fill_manual(values = c(col_fill, col_fill2)) +
theme_minimal() +
theme(legend.position = "none")
plot
Please tell me if it works (I don't have the original data to test it), and if you would like a version without tidyverse functions.
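In case the version without tidyverse functions is useful, here is a minimal sketch. Since I don't have tmp.csv, the df2 built below is a made-up stand-in (its column values, the 150-row split between "historic" and "predict", and the name plot_base are all assumptions for illustration); the modelling and plotting steps are the same as above, just without the pipe and mutate.

```r
library(ggplot2)

# Hypothetical stand-in for df2 (the real tmp.csv is not available here)
set.seed(1)
n <- 200
df2 <- data.frame(
  dateTime = as.POSIXct("2017-12-01 00:00:00", tz = "UTC") +
    seq(0, by = 3600, length.out = n)
)
df2$capacity <- 50 + 10 * sin(seq(0, 6, length.out = n)) + rnorm(n, sd = 2)
df2$type <- ifelse(seq_len(n) <= 150, "historic", "predict")

# Same modelling steps as above, using plain column assignment
df2$index <- as.numeric(df2$dateTime)
model <- loess(capacity ~ index, data = df2)
df2$capacity_predicted <- predict(model)

col_fill <- "#033F77"
col_fill2 <- "#E5E9F2"

plot_base <- ggplot(df2, aes(x = dateTime, y = capacity_predicted)) +
  geom_ribbon(aes(ymax = capacity_predicted, ymin = 0, fill = type, group = type)) +
  geom_line(aes(color = type, linetype = type)) +
  scale_color_manual(values = c(col_fill, col_fill)) +
  scale_fill_manual(values = c(col_fill, col_fill2)) +
  theme_minimal() +
  theme(legend.position = "none")
plot_base
```

With your real data you would skip the stand-in block and start at the df2$index line.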
EDIT:
Not very clean, but a smoother curve can be obtained with this code:
df3 <- data.frame(index = seq(min(df2$index), max(df2$index), length.out = 300),
type = "historic", stringsAsFactors = F)
modelling_date_index <- 1512562500
df3$type[df3$index <= modelling_date_index] = "predict"
plot <- df3 %>% mutate(capacity_predicted = predict(model, newdata = index),
dateTime = as.POSIXct(index, origin = '1970-01-01')) %>%
# arrange(dateTime) %>%
ggplot(aes(x = dateTime, y = capacity_predicted)) +
geom_ribbon(aes(ymax = capacity_predicted, ymin = 0, fill = type, group = type)) +
geom_line(aes( color = type, linetype = type)) +
scale_color_manual(values = c(col_fill, col_fill)) +
scale_fill_manual(values = c(col_fill, col_fill2)) +
theme_minimal()+
theme(legend.position = "none")
plot
Related
Here's my R code
ggplot(dat = Table, aes(x = Group, y = value, fill = Type)) +
geom_boxplot(alpha=0.08)+
geom_jitter()+
scale_fill_brewer(palette="Spectral")+
theme_minimal()
As you can see, the dots are in the middle between the boxplots. What can I add to geom_jitter so that each point sits on the right boxplot instead of in the middle like this? I also tried geom_point, and it gave the same result!
Thanks to the help it now works, but I wanted to add a line to connect the dots and I got this. Can someone tell me how to actually connect the dots with lines?
I think if you group by interaction(Group, Type) and use position_jitterdodge() you should get what you're looking for.
ggplot(mtcars, aes(as.character(am), mpg, color = as.character(vs),
group = interaction(as.character(vs), as.character(am)))) +
geom_boxplot() +
geom_jitter(position = position_jitterdodge()) # same output with geom_point()
Edit - here's an example with manual jittering applied to data where each subject appears once in each Group.
I looked for a built-in way to do this, and this answer comes close, but I couldn't get it to work in terms of using position_jitterdodge with position defined by the groups of Group/Type, but line grouping defined by id alone and not by Group/Type. Both aesthetics (position adjustment and series identification) rely on the same group parameter, but they each need a different value for it.
Table = data.frame(id = 1:4,
value = rnorm(8),
Group = rep(c("a","b"), each = 4),
Type = c("1", "2"))
library(dplyr)
library(ggplot2)
Table %>%
mutate(x = as.numeric(as.factor(Group)) +
0.2 * scale(as.numeric(as.factor(Type))) +
rnorm(n(), sd = 0.06)) %>%
ggplot(aes(x = Group, y = value, fill = Type, group = interaction(Group, Type))) +
geom_boxplot(alpha=0.2)+
geom_point(aes(x = x)) +
geom_line(aes(x = x, group = id), alpha = 0.1) +
scale_fill_brewer(palette="Spectral")+
theme_minimal()
Best to use position_dodge instead if you want them to line up:
library(ggplot2)
Table <- tibble::tibble(
Group = rep(c("A", "B"), each = 20),
Type = factor(rep(c(1:2, 1:2), each = 10)),
value = rnorm(40, mean = 10)
)
ggplot(dat = Table, aes(x = Group, y = value, fill = Type)) +
geom_boxplot(alpha=0.08)+
geom_point(position = position_dodge(width = 0.75))+
scale_fill_brewer(palette="Spectral")+
theme_minimal()
To add a line, make sure group = ID goes in both the geom_point and geom_line calls:
library(ggplot2)
Table <- tibble::tibble(
Group = rep(c("A", "B"), each = 20),
Type = factor(rep(c(1:2, 1:2), each = 10)),
ID = factor(rep(1:20, times = 2)),
value = rnorm(40, mean = 10)
)
ggplot(dat = Table, aes(x = Group, y = value, fill = Type)) +
geom_boxplot(alpha = 0.08) +
geom_point(aes(group = ID), position = position_dodge(width = 0.75))+
geom_line(aes(group = ID), position = position_dodge(width = 0.75), colour = "grey")+
scale_fill_brewer(palette = "Spectral") +
theme_minimal()
I am plotting the distributions of two variables on a single histogram. I would like to highlight each distribution's mean value on that graph with a dotted line or something similar (ideally in a color that matches the one already set in the aes section of the code).
How would I do that?
This is my code so far.
hist_plot <- ggplot(data, aes(x= value, fill= type, color = type)) +
geom_histogram(position="identity", alpha=0.2) +
labs( x = "Value", y = "Count", fill = "Type", title = "Title") +
guides(color = FALSE)
Also, is there any way to show the count of n for each type on this graph?
I've made some reproducible code that might help you with your problem.
library(tidyverse)
# Generate some random data
df <- data.frame(value = c(runif(50, 0.5, 1), runif(50, 1, 1.5)),
type = c(rep("type1", 50), rep("type2", 50)))
# Calculate means from df
stats <- df %>% group_by(type) %>% summarise(mean = mean(value),
n = n())
# Make the ggplot
ggplot(df, aes(x= value, fill= type, color = type)) +
geom_histogram(position="identity", alpha=0.2) +
labs(x = "Value", y = "Count", fill = "Type", title = "Title") +
guides(color = "none") + # "none" replaces the deprecated FALSE
geom_vline(data = stats, aes(xintercept = mean, color = type), size = 2) +
geom_text(data = stats, aes(x = mean, y = max(df$value), label = n),
size = 10,
color = "black")
If things go as intended, you'll end up with something akin to the following plot.
histogram with means
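A variation on the same idea, as a rough sketch rather than the one right way: the count labels can be pinned to the top of the panel with y = Inf and nudged down with vjust, so they don't depend on the height of the bars. The label column, the bins = 30 setting and the name p_counts are choices made for this example, and the summary is computed with base R here only to keep the block self-contained.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(value = c(runif(50, 0.5, 1), runif(50, 1, 1.5)),
                 type  = c(rep("type1", 50), rep("type2", 50)))

# Per-type mean and count (tapply and table both order by sorted type)
means <- tapply(df$value, df$type, mean)
stats <- data.frame(type = names(means),
                    mean = as.numeric(means),
                    n    = as.numeric(table(df$type)))
stats$label <- paste0("n = ", stats$n)

p_counts <- ggplot(df, aes(x = value, fill = type, color = type)) +
  geom_histogram(position = "identity", alpha = 0.2, bins = 30) +
  # dotted mean lines, colored by type through the inherited color mapping
  geom_vline(data = stats, aes(xintercept = mean, color = type),
             linetype = "dotted") +
  # y = Inf puts the label at the top edge; vjust moves it inside the panel
  geom_text(data = stats, aes(x = mean, y = Inf, label = label),
            vjust = 1.5, show.legend = FALSE) +
  labs(x = "Value", y = "Count", fill = "Type", title = "Title") +
  guides(color = "none")
p_counts
```

Because the text layer inherits color = type, each label takes the same color as its histogram.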
I have the dataframe below:
etf_id<-c("a","b","c","d","e","a","b","c","d","e","a","b","c","d","e")
factor<-c("A","A","A","A","A","B","B","B","B","B","C","C","C","C","C")
normalized<-c(-0.048436801,2.850578601,1.551666490,0.928625186,-0.638111793,
-0.540615895,-0.501691539,-1.099239823,-0.040736139,-0.192048665,
0.198915407,-0.092525810,0.214317734,0.550478998,0.024613778)
df<-data.frame(etf_id,factor,normalized)
and I create a ggplotly() boxplot with:
library(ggplot2)
library(plotly)
ggplotly(ggplot(data = df, aes(x = factor, y = normalized)) +
geom_boxplot(aes(fill = as.factor(factor)),outlier.colour = 'black') +
geom_point(data = df, position = position_dodge(0.75))+geom_point(data = df,
aes(x = factor, y = normalized, shape = etf_id, color = etf_id),
size = 2))
As a result I get a boxplot with this legend:
but I want my legend to show only the color distinction, like below. Note that there won't always be 3 factors; the number may vary from 1 to 8.
The recommended way to alter plotly elements is to use the style() function. You can identify the elements and traces by inspecting plotly_json().
I'm not sure if there's a more compact way, but you can achieve the desired result using:
p <- ggplotly(ggplot(data = df, aes(x = factor, y = normalized)) +
geom_boxplot(aes(fill = as.factor(factor)),outlier.colour = 'black') +
geom_point(data = df, position = position_dodge(0.75))+geom_point(data = df,
aes(x = factor, y = normalized, shape = etf_id, color = etf_id),
size = 2))
p <- style(p, showlegend = FALSE, traces = 5:9)
lv <- levels(factor(df$factor)) # df$factor is a character column in R >= 4.0, so convert it first
for (i in seq_along(lv)) {
p <- style(p, name = lv[i], traces = i)
}
p
Note that in this case the factor levels and traces align but that won't always be the case so you may need to adjust this (i.e. i + x).
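If you are unsure which trace numbers to pass to style(), one option is to tabulate the traces of the built plot. This is only a sketch, assuming the plotly package is installed; trace_info and first_or_na are names made up for this example, and the exact number and order of traces can differ between plotly and ggplot2 versions, so check the printed table rather than relying on fixed indices.

```r
library(ggplot2)
library(plotly)

etf_id <- rep(c("a", "b", "c", "d", "e"), times = 3)
factor <- rep(c("A", "B", "C"), each = 5)
normalized <- c(-0.048436801, 2.850578601, 1.551666490, 0.928625186, -0.638111793,
                -0.540615895, -0.501691539, -1.099239823, -0.040736139, -0.192048665,
                0.198915407, -0.092525810, 0.214317734, 0.550478998, 0.024613778)
df <- data.frame(etf_id, factor, normalized)

p <- ggplotly(
  ggplot(df, aes(x = factor, y = normalized)) +
    geom_boxplot(aes(fill = factor), outlier.colour = "black") +
    geom_point(aes(shape = etf_id, color = etf_id), size = 2)
)

# Helper: first element as character, or NA if the field is missing
first_or_na <- function(x) if (is.null(x)) NA_character_ else as.character(x)[1]

# One row per trace: index, type, name, and whether it shows in the legend
built <- plotly_build(p)
trace_info <- data.frame(
  index      = seq_along(built$x$data),
  type       = vapply(built$x$data, function(tr) first_or_na(tr$type), character(1)),
  name       = vapply(built$x$data, function(tr) first_or_na(tr$name), character(1)),
  showlegend = vapply(built$x$data, function(tr) isTRUE(tr$showlegend), logical(1))
)
print(trace_info)
```

The index column is what goes into the traces argument of style().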
One quick way would be to add show.legend = FALSE to suppress the legend for that layer.
library(ggplot2)
ggplot(data = df, aes(x = factor, y = normalized)) +
geom_boxplot(aes(fill = as.factor(factor)),outlier.colour = 'black') +
geom_point(position = position_dodge(0.75)) +
geom_point(aes(x = factor, y = normalized, shape = etf_id, color = etf_id),
size = 2, show.legend=FALSE)
Unfortunately, this does not work when the plot is passed to ggplotly. You can use theme(legend.position='none'), which works but suppresses all the legends instead of specific ones. One dirty hack is to disable specific legend entries manually:
temp_plot <- ggplotly(ggplot(data = df, aes(x = factor, y = normalized)) +
geom_boxplot(aes(fill = as.factor(factor)),outlier.colour = 'black') +
geom_point(position = position_dodge(0.75)) +
geom_point(aes(x = factor, y = normalized, shape = etf_id, color = etf_id),size = 2))
temp_plot[[1]][[1]][4:9] <- lapply(temp_plot[[1]][[1]][4:9], function(x) {x$showlegend <- FALSE;x})
temp_plot
Below is the code for a graph I am making for an article I am working on. The plot shows the predicted probabilities along a range of values in my data set. Along the x-axis is a rug plot that shows the distribution of trade share values (I provided the code and an image of the graph):
sitc8 <- ggplot() + geom_line(data=plotdat8, aes(x = lagsitc8100, y = PredictedProbabilityMean), size = 2, color="blue") +
geom_ribbon(data=plotdat8, aes(x = lagsitc8100, ymin = lowersd, ymax = uppersd),
fill = "grey50", alpha=.5) +
ylim(c(-0.75, 1.5)) +
geom_hline(yintercept=0) +
geom_rug(data=multi.sanctions.bust8.full@frame, aes(x=lagsitc8100), col="black", size=1.0, sides="b") +
xlab("SITC 8 Trade Share") +
ylab("Probability of Sanctions Busting") +
theme(panel.grid.major = element_line(colour = "gray", linetype = "dotted"), panel.grid.minor =
element_blank(), panel.background = element_blank())
My question is: is it possible to change the color of the lines of the rugplot of trade share in which the event I am modeling occurs? In other words, I would like to add red lines or red dots along those values of trade share when my event = 1.
Is this possible?
Sure. You'd just have to add a color argument within an aes() function call within geom_rug().
Here's some code to create a dummy data frame.
library(tidyverse)
set.seed(42)
dummy_data <- tibble(x_var = rnorm(100),
y_var = abs(rnorm(100)) * x_var) %>%
rownames_to_column(var = "temp_row") %>%
mutate(color_id = if_else(as.numeric(temp_row) <= 50,
"Type A",
"Type B"))
And here's a ggplot call where the color for geom_rug is mapped to a character column named color_id:
ggplot(data = dummy_data, mapping = aes(x = x_var, y = y_var)) +
geom_smooth(method = "lm") +
geom_rug(mapping = aes(color = color_id), sides = "b")
Update:
Following OP's comment, here's an updated version. If it's a numeric vector of 0s and 1s, you have to tell ggplot to treat it as a dichotomous variable. You can do that by wrapping it in a call to factor() for instance.
For the color we can set that manually using scale_color_manual(). So the changes to the code are the following.
color_id is now a vector of 0s and 1s.
the color is now mapped to factor(color_id)
the color scale is determined using scale_color_manual
library(tidyverse)
set.seed(42)
dummy_data <- tibble(x_var = rnorm(100),
y_var = abs(rnorm(100)) * x_var) %>%
rownames_to_column(var = "temp_row") %>%
mutate(color_id = if_else(as.numeric(temp_row) <= 50,
0,
1))
ggplot(data = dummy_data, mapping = aes(x = x_var, y = y_var)) +
geom_smooth(method = "lm") +
geom_rug(mapping = aes(color = factor(color_id)), sides = "b") +
scale_color_manual(values = c("black", "red")) +
labs(color = "This takes two values")
Definitely possible. Here's an example using iris, and a dynamic condition in the rug. You could also do two rugs, if you chose.
library(tidyverse)
iris %>%
ggplot(aes(x = Sepal.Length, y = Sepal.Width)) +
geom_point() +
geom_rug(aes(color = Petal.Length >3), sides = "b")
# Second example, output not shown
iris %>%
ggplot(aes(x = Sepal.Length, y = Sepal.Width)) +
geom_point() +
geom_rug(data = subset(iris, Petal.Length > 3), color = "black", sides = "b") +
geom_rug(data = subset(iris, Petal.Length <= 3), color = "red", sides = "b")
Please consider the following:
I want to plot a step-wise curve (using geom_step()) and some smooth lines (using geom_line()) in one graph using ggplot2.
I manage to create a graph but the labels are wrong and cannot be corrected using scale_color_discrete().
Desired outcome: Based on the data (see below), line "hello" is the upper line, followed by "foo" and "bar", but the labels are not correct. In addition, I also need a label for the geom_step() curve, which is currently missing from the legend.
Question: What am I doing wrong?
Reproducible example:
library(ggplot2)
# Data
db <- data.frame(time = 0:100,
step = 1-pexp(0:100, rate = 1),
foo = 1-pexp(0:100, rate = 0.4),
bar = 1-pexp(0:100, rate = 0.5),
hello = 1-pexp(0:100, rate = 0.1)
)
# Plotted with wrong labels (automatically)
ggplot(data = db, aes(x = time, y = step)) +
geom_step(show.legend = T) +
geom_line(aes(x = time, y = foo, col = "red")) +
geom_line(aes(x = time, y = bar, col = "blue")) +
geom_line(aes(x = time, y = hello, col = "green"))
Looking at the labels, one can already see that the description of the color and the color of the line do not match.
# Still wrong labels
ggplot(data = db, aes(x = time, y = step)) +
geom_step(show.legend = T) +
geom_line(aes(x = time, y = foo, col = "red")) +
geom_line(aes(x = time, y = bar, col = "blue")) +
geom_line(aes(x = time, y = hello, col = "green")) +
scale_color_discrete(name = "Dose", labels = c("foo", "bar", "hello"))
Changing the labels obviously won't help.
Created on 2019-04-15 by the reprex package (v0.2.0).
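You are specifying the color you want inside the aesthetics call. This means you map the color aesthetic to the string "red", which is treated as a data value (a label), rather than setting the line color to red. ggplot then picks colors for those labels from its default scale, which is why labels and colors do not match.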
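To make the difference concrete, here is a small sketch comparing the two cases on a single line. The object names (p_mapped, p_set, mapped_colour, set_colour) are made up for this example; layer_data() is used only to look at the colors ggplot2 actually computed.

```r
library(ggplot2)

db <- data.frame(time = 0:100, foo = 1 - pexp(0:100, rate = 0.4))

# Inside aes(): "red" is a data value; it gets a legend entry and
# a color assigned by the default discrete scale
p_mapped <- ggplot(db, aes(x = time, y = foo)) +
  geom_line(aes(color = "red"))

# Outside aes(): "red" is the literal line color, and no legend is drawn
p_set <- ggplot(db, aes(x = time, y = foo)) +
  geom_line(color = "red")

# Colors computed for the first (and only) layer of each plot
mapped_colour <- unique(layer_data(p_mapped, 1)$colour)
set_colour    <- unique(layer_data(p_set, 1)$colour)
print(mapped_colour)
print(set_colour)
```

Only the second plot reports "red"; in the first, the scale chooses the color and "red" ends up as legend text.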
You can fix this for example like this:
p <- ggplot(data = db, aes(x = time, y = step)) +
geom_step(aes(color = "step")) +
geom_line(aes(y = foo, color = "foo")) +
geom_line(aes(y = bar, color = "bar")) +
geom_line(aes(y = hello, color = "hello"))
p
Note that I dropped the x = time as this is inherited from the ggplot-call in each step. If you want to change the color for each of the lines, you should now use for example scale_color_manual like the following:
p +
scale_color_manual(name = "Dose",
values = c("step" = "black", "foo" = "red",
"bar" = "blue", "hello" = "green"))
Another option would be to transform your data to the long format:
library(tidyr)
library(dplyr)
new_db <- gather(db, type, value, -time)
ggplot(data = filter(new_db, type != "step"), aes(x = time, y = value, color = type)) +
geom_line() +
geom_step(data = filter(new_db, type == "step"))
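In current tidyr, gather() is superseded by pivot_longer(), so the same reshaping can be sketched as follows. This is just an equivalent rewrite of the long-format approach, not a different method; p_long is a name chosen for this example, and subset() stands in for dplyr's filter() to keep the dependencies small.

```r
library(ggplot2)
library(tidyr)

db <- data.frame(time  = 0:100,
                 step  = 1 - pexp(0:100, rate = 1),
                 foo   = 1 - pexp(0:100, rate = 0.4),
                 bar   = 1 - pexp(0:100, rate = 0.5),
                 hello = 1 - pexp(0:100, rate = 0.1))

# Every column except time becomes a (type, value) pair
new_db <- pivot_longer(db, cols = -time, names_to = "type", values_to = "value")

p_long <- ggplot(subset(new_db, type != "step"),
                 aes(x = time, y = value, color = type)) +
  geom_line() +
  geom_step(data = subset(new_db, type == "step"))
p_long
```

Since color is mapped to type in both layers, the step curve gets its own legend entry along with foo, bar and hello.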