I'm having an issue overlaying two point layers on a box plot. The code works when I add only one point layer. Here is the code:
ggplot(data1, aes(x= reorder(DMU,order), y = Efficiency)) +
geom_boxplot() +
geom_point(data = data2, aes(x = dmu, y = eff, color = "eff")) +
scale_color_manual("", breaks = c("eff"), values = c("blue")) +
geom_point(data = data3, aes(x = DMU, y = eff2, color = "eff2")) +
scale_color_manual("", breaks = c("eff2"), values = c("red"))
I keep getting the error below:
Scale for 'colour' is already present. Adding another scale for
'colour', which will replace the existing scale.
Error: Insufficient values in manual scale. 2 needed but only 1 provided.
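You cannot add scale_color_manual() twice: the second call replaces the first, so only one colour is supplied for two levels.
Build a single data frame for the points:
df_points <- data.frame(x = c(data2$dmu, data3$DMU),
y = c(data2$eff, data3$eff2),
data = rep(c("data2", "data3"), times = c(nrow(data2), nrow(data3)))
)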
And then:
ggplot(data1, aes(x = reorder(DMU,order), y = Efficiency)) +
geom_boxplot() +
geom_point(data = df_points, aes(x = x, y = y, color = data)) +
scale_colour_manual(values = c(data2 = "blue", data3 = "red")) +
theme(legend.position = "none")
I don't have the data available, so I may have made a mistake.
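Since the original data is not available, here is a minimal self-contained sketch of the same idea. The contents of data1, data2 and data3 are simulated stand-ins (an assumption, not the asker's data), the reorder() call is left out, and the column name source is made up for illustration. It uses a single manual scale with named values so each label keeps its colour and the legend stays visible:

```r
library(ggplot2)

# Hypothetical stand-ins for data1, data2 and data3 (not the asker's data).
set.seed(1)
data1 <- data.frame(DMU = rep(c("A", "B", "C"), each = 20),
                    Efficiency = runif(60))
data2 <- data.frame(dmu = c("A", "B", "C"), eff = c(0.2, 0.5, 0.8))
data3 <- data.frame(DMU = c("A", "B", "C"), eff2 = c(0.3, 0.6, 0.9))

# Stack both point sets and record which set each row came from.
df_points <- data.frame(
  x = c(data2$dmu, data3$DMU),
  y = c(data2$eff, data3$eff2),
  source = rep(c("eff", "eff2"), times = c(nrow(data2), nrow(data3)))
)

p <- ggplot(data1, aes(x = DMU, y = Efficiency)) +
  geom_boxplot() +
  geom_point(data = df_points, aes(x = x, y = y, colour = source)) +
  # One manual scale only; named values pin each label to a colour.
  scale_colour_manual(name = NULL, values = c(eff = "blue", eff2 = "red"))

p
```

Naming the values avoids relying on the alphabetical order of the labels.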
I can't seem to be able to set different fill colours for geom_ribbon(), using one of the columns as input to fill
library(ggplot2)
time <- as.factor(c('A','B','C','D'))
grouping <- as.factor(c('GROUP1','GROUP1','GROUP1','GROUP1',
'GROUP2','GROUP2','GROUP2','GROUP2'))
x <- c(1.00,1.03,1.03,1.06,0.5,0.43,0.2,0.1)
x.upper <- x+0.05
x.lower <- x-0.05
df <- data.frame(time, x, x.upper, x.lower,grouping)
ggplot(data = df,aes(as.numeric(time),x,group=grouping,color=grouping)) +
geom_ribbon(data = df, aes(x=as.numeric(time), ymax=x.upper, ymin=x.lower),
fill=grouping, alpha=.5) +
geom_point() + labs(title="My ribbon plot",x="Time",y="Value") +
scale_x_continuous(breaks = 1:4, labels = levels(df$time))
I get the error Error: Unknown colour name: grouping but fill=c("pink","blue") works fine. I don't want to specify the colours manually.
All other examples I can find simply list the column in the fill argument so I'm not sure what I'm doing incorrectly.
Move fill = grouping inside aes() so that this column is mapped to the fill aesthetic. Arguments outside aes() are taken as fixed values rather than looked up in the data.
ggplot(data = df, aes(as.numeric(time), x, color = grouping)) +
geom_ribbon(data = df, aes(ymax = x.upper, ymin = x.lower,
fill = grouping), alpha = 0.5) +
geom_point() +
labs(title = "My ribbon plot", x = "Time", y = "Value") +
scale_x_continuous(breaks = 1:4, labels = levels(df$time))
Problem
I have some data points stored in data.frame with three variables, x, y, and gender. My goal is to draw several generally fitted lines and also lines specifically fitted for male/female over the scatter plot, with points coloured by gender. It sounds easy but some issues just persist.
What I currently do is to use a new set of x's and predict y's for every model, combine the fitted lines together in a data.frame, and then convert wide to long, with their model name as the third var (from this post: ggplot2: how to add the legend for a line added to a scatter plot? and this: Add legend to ggplot2 line plot I learnt that mapping should be used instead of setting colours/legends separately). However, while I can get a multicolor line plot, the points come without specific colour for gender (already a factor) as I expected from the posts I referenced.
I also know it might be possible to use aes(y = predict(model)), but I ran into other problems with that. I also tried colouring the points directly in aes and assigning colours separately for each line, but then the legend is not generated unless I use lty, which gives legend keys that are all the same colour.
I would appreciate any ideas, and I'm also open to changing the whole method.
Code
Note that two pairs of lines overlap, so only two lines appear. I guess adding some jitter to the data might make them look different.
slrmen<-lm(tc~x+I(x^2),data=data[data['gender']==0,])
slrwomen<-lm(tc~x+I(x^2),data=data[data['gender']==1,])
prdf <- data.frame(x = seq(from = range(data$x)[1],
to = range(data$x)[2], length.out = 100),
gender = as.factor(rep(1,100)))
prdm <- data.frame(x = seq(from = range(data$x)[1],
to = range(data$x)[2], length.out = 100),
gender = as.factor(rep(0,100)))
prdf$fit <- predict(fullmodel, newdata = prdf)
prdm$fit <- predict(fullmodel, newdata = prdm)
rawplotdata<-data.frame(x=prdf$x, fullf=prdf$fit, fullm=prdm$fit,
linf=predict(slrwomen, newdata = prdf),
linm=predict(slrmen, newdata = prdm))
plotdata<-reshape2::melt(rawplotdata,id.vars="x",
measure.vars=c("fullf","fullm","linf","linm"),
variable.name="fitmethod", value.name="y")
plotdata$fitmethod<-as.factor(plotdata$fitmethod)
plt <- ggplot() +
geom_line(data = plotdata, aes(x = x, y = y, group = fitmethod,
colour=fitmethod)) +
scale_colour_manual(name = "Fit Methods",
values = c("fullf" = "lightskyblue",
"linf" = "cornflowerblue",
"fullm"="darkseagreen", "linm" = "olivedrab")) +
geom_point(data = data, aes(x = x, y = y, fill = gender)) +
scale_fill_manual(values=c("blue","green")) ## This does not work as I expected...
show(plt)
Code for another method (omitted two lines), which generates same-colour legend and multi-color plot:
ggplot(data = prdf, aes(x = x, y = fit)) + # prdf and prdm are just data frames containing the x's and fitted values for different models
geom_line(aes(lty="Female"),colour = "chocolate") +
geom_line(data = prdm, aes(x = x, y = fit, lty="Male"), colour = "darkblue") +
geom_point(data = data, aes(x = x, y = y, colour = gender)) +
scale_colour_discrete(name="Gender", breaks=c(0,1),
labels=c("Male","Female"))
This is related to using the colour aesthetic for lines and the fill aesthetics for points in your own (first) example. In the second example, it works because the colour aesthetic is used for lines and points.
By default, geom_point cannot show a variable mapped to fill, because the default point shape (19) doesn't have a fill.
For fill to work on points, you have to pick one of the fillable shapes, 21 to 25 (for example shape = 21), in geom_point(), outside of aes().
Perhaps this small reproducible example helps to illustrate the point:
Simulate data
set.seed(4821)
x1 <- rnorm(100, mean = 5)
set.seed(4821)
x2 <- rnorm(100, mean = 6)
data <- data.frame(x = rep(seq(20,80,length.out = 100),2),
tc = c(x1, x2),
gender = factor(c(rep("Female", 100), rep("Male", 100))))
Fit models
slrmen <-lm(tc~x+I(x^2), data = data[data["gender"]=="Male",])
slrwomen <-lm(tc~x+I(x^2),data = data[data["gender"]=="Female",])
newdat <- data.frame(x = seq(20,80,length.out = 200))
fitted.male <- data.frame(x = newdat,
gender = "Male",
tc = predict(object = slrmen, newdata = newdat))
fitted.female <- data.frame(x = newdat,
gender = "Female",
tc = predict(object = slrwomen, newdata = newdat))
Plot using colour aesthetics
Use the colour aesthetics for both points and lines (specify in ggplot such that it gets inherited throughout). By default, geom_point can map a variable to colour.
library(ggplot2)
ggplot(data, aes(x = x, y = tc, colour = gender)) +
geom_point() +
geom_line(data = fitted.male) +
geom_line(data = fitted.female) +
scale_colour_manual(values = c("tomato","blue")) +
theme_bw()
Plot using colour and fill aesthetics
Use the fill aesthetics for points and the colour aesthetics for lines (specify aesthetics in geom_* to prevent them being inherited). This will reproduce the problem.
ggplot(data, aes(x = x, y = tc)) +
geom_point(aes(fill = gender)) +
geom_line(data = fitted.male, aes(colour = gender)) +
geom_line(data = fitted.female, aes(colour = gender)) +
scale_colour_manual(values = c("tomato","blue")) +
scale_fill_manual(values = c("tomato","blue")) +
theme_bw()
To fix this, change the shape argument in geom_point to a point shape that can be filled (21:25).
ggplot(data, aes(x = x, y = tc)) +
geom_point(aes(fill = gender), shape = 21) +
geom_line(data = fitted.male, aes(colour = gender)) +
geom_line(data = fitted.female, aes(colour = gender)) +
scale_colour_manual(values = c("tomato","blue")) +
scale_fill_manual(values = c("tomato","blue")) +
theme_bw()
Created on 2021-09-19 by the reprex package (v2.0.1)
Note that the scales for colour and fill get merged automatically if the same variable is mapped to both aesthetics.
It seems to me that what you really want to do is use ggplot2::stat_smooth instead of trying to predict yourself.
Borrowing the data from @scrameri:
ggplot(data, aes(x = x, y = tc, color = gender)) +
geom_point() +
stat_smooth(aes(linetype = "X^2"), method = 'lm',formula = y~x + I(x^2)) +
stat_smooth(aes(linetype = "X^3"), method = 'lm',formula = y~x + I(x^2) + I(x^3)) +
scale_color_manual(values = c("darkseagreen","lightskyblue"))
I want to create a black and white plot using ggplot2, where the data is plotted by category using a combination of lines and points. However, the legend only shows the point shape, with no line running through it, unless I add color to the plot.
Here is some example data to illustrate the problem with:
## Create example data
set.seed(123)
dat <- data.frame(
time_period = rep(1:4, each = 3),
category = rep(LETTERS[1:3], 4),
y = rnorm(12)
)
Here is an example of a color plot, so you can see how I want the legend to look:
library(ggplot2)
## Generate plot with color
ggplot(data = dat, mapping = aes(x = time_period, y = y, color = category)) +
geom_line(aes(group = category)) +
geom_point(aes(shape = category), size = 2) +
theme_bw()
However, if I move to grayscale (which I need to be able to do), the line running through the point in the legend disappears, which I'd like to avoid:
## Generate plot without color
ggplot(data = dat, mapping = aes(x = time_period, y = y)) +
geom_line(aes(group = category)) +
geom_point(aes(shape = category), size = 2) +
theme_bw()
How can I add a line through the point symbols in the legend with a grayscale plot?
I would suggest this approach:
#Plot
ggplot(data = dat, mapping = aes(x = time_period, y = y,group = category,shape = category)) +
geom_line(color = 'gray', show.legend = TRUE) +
geom_point(size = 2) +
theme_bw()
I have the dataframe below:
etf_id<-c("a","b","c","d","e","a","b","c","d","e","a","b","c","d","e")
factor<-c("A","A","A","A","A","B","B","B","B","B","C","C","C","C","C")
normalized<-c(-0.048436801,2.850578601,1.551666490,0.928625186,-0.638111793,
-0.540615895,-0.501691539,-1.099239823,-0.040736139,-0.192048665,
0.198915407,-0.092525810,0.214317734,0.550478998,0.024613778)
df<-data.frame(etf_id,factor,normalized)
and I create a ggplotly() boxplot with:
library(ggplot2)
library(plotly)
ggplotly(ggplot(data = df, aes(x = factor, y = normalized)) +
geom_boxplot(aes(fill = as.factor(factor)),outlier.colour = 'black') +
geom_point(data = df, position = position_dodge(0.75))+geom_point(data = df,
aes(x = factor, y = normalized, shape = etf_id, color = etf_id),
size = 2))
The result is a boxplot with this legend:
but I want my legend to show only the color distinction, like below. Note that there won't always be 3 factors; the number may vary from 1 to 8.
The recommended way to alter plotly elements is to use the style() function. You can identify the elements and traces by inspecting plotly_json().
I'm not sure if there's a more compact way, but you can achieve the desired result using:
p <- ggplotly(ggplot(data = df, aes(x = factor, y = normalized)) +
geom_boxplot(aes(fill = as.factor(factor)),outlier.colour = 'black') +
geom_point(data = df, position = position_dodge(0.75))+geom_point(data = df,
aes(x = factor, y = normalized, shape = etf_id, color = etf_id),
size = 2))
p <- style(p, showlegend = FALSE, traces = 5:9)
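# as.factor() keeps this working when df$factor is a character column (the default since R 4.0)
lv <- levels(as.factor(df$factor))
for (i in seq_along(lv)) {
p <- style(p, name = lv[i], traces = i)
}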
p
Note that in this case the factor levels and traces align but that won't always be the case so you may need to adjust this (i.e. i + x).
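To check how traces line up before calling style(), one option is to tabulate them. This is only a sketch: it assumes the object returned by ggplotly() keeps its traces in p$x$data and that each trace may carry type, name and showlegend fields. The table name trace_info is made up for illustration, and the plot is a simplified version of the one above.

```r
library(ggplot2)
library(plotly)

# Same example data as in the question.
df <- data.frame(
  etf_id = rep(c("a", "b", "c", "d", "e"), times = 3),
  factor = rep(c("A", "B", "C"), each = 5),
  normalized = c(-0.048436801, 2.850578601, 1.551666490, 0.928625186, -0.638111793,
                 -0.540615895, -0.501691539, -1.099239823, -0.040736139, -0.192048665,
                 0.198915407, -0.092525810, 0.214317734, 0.550478998, 0.024613778)
)

p <- ggplotly(
  ggplot(df, aes(x = factor, y = normalized)) +
    geom_boxplot(aes(fill = factor), outlier.colour = "black") +
    geom_point(aes(shape = etf_id, colour = etf_id), size = 2)
)

# One row per trace: position, type, legend name and legend visibility.
# Missing fields are reported as NA (type, name) or FALSE (showlegend).
trace_info <- data.frame(
  trace = seq_along(p$x$data),
  type = vapply(p$x$data, function(tr) as.character(tr$type)[1], character(1)),
  name = vapply(p$x$data, function(tr) {
    if (is.null(tr$name)) NA_character_ else as.character(tr$name)[1]
  }, character(1)),
  showlegend = vapply(p$x$data, function(tr) isTRUE(tr$showlegend), logical(1))
)
print(trace_info)
```

The trace column gives the indices to pass to the traces argument of style().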
One quick way would be to add show.legend = FALSE to suppress the legend for that layer.
library(ggplot2)
ggplot(data = df, aes(x = factor, y = normalized)) +
geom_boxplot(aes(fill = as.factor(factor)),outlier.colour = 'black') +
geom_point(position = position_dodge(0.75)) +
geom_point(aes(x = factor, y = normalized, shape = etf_id, color = etf_id),
size = 2, show.legend=FALSE)
Unfortunately, this does not work when the plot is passed to ggplotly. You can use theme(legend.position='none'), which works but suppresses all the legends instead of specific ones. One dirty hack is to disable specific legend entries manually:
temp_plot <- ggplotly(ggplot(data = df, aes(x = factor, y = normalized)) +
geom_boxplot(aes(fill = as.factor(factor)),outlier.colour = 'black') +
geom_point(position = position_dodge(0.75)) +
geom_point(aes(x = factor, y = normalized, shape = etf_id, color = etf_id),size = 2))
temp_plot[[1]][[1]][4:9] <- lapply(temp_plot[[1]][[1]][4:9], function(x) {x$showlegend <- FALSE;x})
temp_plot
I am plotting 2 sets of data on the same plot using ggplot. I have specified the colour for each data set, but there is no legend that comes out when the dot plot is generated.
What can I do to manually add a legend?
# Create an index to hold values of m from 1 to 100
m_index <- (1:100)
data_frame_50 <- data.frame(prob_max_abs_cor_50)
data_frame_20 <- data.frame(prob_max_abs_cor_20)
library(ggplot2)
plot1 <- ggplot(data_frame_50, mapping = aes(x = m_index,
y = prob_max_abs_cor_50),
colour = 'red') +
geom_point() +
ggplot(data_frame_20, mapping = aes(x = m_index,
y = prob_max_abs_cor_20),
colour = 'blue') +
geom_point()
plot1 + labs(x = " Values of m ",
y = " Maximum Absolute Correlation ",
title = "Dot plot of probability")
First, I would suggest tidying your ggplot code a little. This is what your posted code is trying to do:
ggplot() +
geom_point(data = data_frame_50, aes(x = m_index, y = prob_max_abs_cor_50),
colour = 'red') +
geom_point(data = data_frame_20, aes(x = m_index, y = prob_max_abs_cor_20),
colour = 'blue') +
labs(x = " Values of m ", y = " Maximum Absolute Correlation ",
title = "Dot plot of probability")
You won't get a legend here, because the colours are set as constants outside aes() rather than mapped from a variable, and each dataset holds only one category. You need a single dataset with a column that groups your data (i.e. 20 or 50). Using some example data, this is the equivalent of what you are plotting, and ggplot won't provide a legend:
ggplot() +
geom_point(data = iris, aes(x = Sepal.Length, y = Petal.Width), colour = 'red') +
geom_point(data = iris, aes(x = Sepal.Length, y = Petal.Length), colour = 'blue')
If you want to colour by category, include a colour argument inside the aes call;
ggplot() +
geom_point(data = iris, aes(x = Sepal.Length, y = Petal.Width,
colour = factor(Species)))
Have a look at the iris dataset to get a sense of how you need to shape your data. It's hard to give precise advice, because you haven't provided an idea of what your data look like, but something like this might work;
df.20 <- data.frame("m" = 1:100, "Group" = 20, "Numbers" = prob_max_abs_cor_20)
df.50 <- data.frame("m" = 1:100, "Group" = 50, "Numbers" = prob_max_abs_cor_50)
df.All <- rbind(df.20, df.50)
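With the combined data frame, mapping Group to colour inside aes() should produce the legend. The sketch below simulates prob_max_abs_cor_20 and prob_max_abs_cor_50 with runif() as stand-ins (an assumption, since the real vectors were not posted) and chooses blue and red to match the colours in the question:

```r
library(ggplot2)

# Hypothetical stand-ins for the asker's vectors (the real data was not posted).
set.seed(42)
prob_max_abs_cor_20 <- runif(100)
prob_max_abs_cor_50 <- runif(100)

df.20 <- data.frame(m = 1:100, Group = 20, Numbers = prob_max_abs_cor_20)
df.50 <- data.frame(m = 1:100, Group = 50, Numbers = prob_max_abs_cor_50)
df.All <- rbind(df.20, df.50)

# factor(Group) makes the colour scale discrete, so the legend shows two keys.
p <- ggplot(df.All, aes(x = m, y = Numbers, colour = factor(Group))) +
  geom_point() +
  scale_colour_manual(name = "Group", values = c("20" = "blue", "50" = "red")) +
  labs(x = "Values of m", y = "Maximum Absolute Correlation",
       title = "Dot plot of probability")

p
```

Dropping the scale_colour_manual() line would still give a legend, just with ggplot2's default colours.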