I'm trying to plot insect counts of 2 species in 18 experimental plots onto a single graph. Since the second species' population peaks later, it is visually doable (see picture below). I would like the 18 population lines from species 1 to be green (using "Greens" from RColorBrewer) and the 18 of species 2 to be red (using "Reds"). I do realize this may be problematic for a colourblind audience, but that is irrelevant here.
I've read here that it is not possible with standard ggplot2 options: R ggplot two color palette on the same plot but this post is more than two years old.
There is a sort of "cheat" for points: Using two scale colour gradients ggplot2 but since I prefer lines to show the population through time, I can't use it.
Are there any new "cheats" available for this?
Or does anyone have another idea to visualize my data in a way that shows population trends through time in all plots and shows the difference in timing of the peak? I've included a picture at the bottom that shows my real data, all in the same colour scale though.
Sample code
library(ggplot2)
library(RColorBrewer)
# example data frame
plot <- as.factor(rep(c("A","B","C"),each=5))
time <- as.numeric(rep(c(1:5),times=3))
S1 <- c(1,4,7,5,2, 2,8,9,3,1, 1,6,6,3,1)
S2 <- c(0,0,2,3,2, 1,2,1,5,3, 0,1,1,6,7)
df <- data.frame(time, plot, S1, S2)
# example colour scales
S1Colours <- colorRampPalette(brewer.pal(9,"Greens"))(3)
S2Colours <- colorRampPalette(brewer.pal(9,"Reds"))(3)
names(S1Colours) <- levels(df$plot)
names(S2Colours) <- levels(df$plot)
# example plot
ggplot(data=df) +
geom_line(aes(x=time, y=S1, colour=plot)) +
geom_line(aes(x=time, y=S2, colour=plot)) +
scale_colour_manual(name = "plot", values = S1Colours) +
scale_colour_manual(name = "plot", values = S2Colours)
# this gives the note "Scale for 'colour' is already present. Adding another scale for 'colour', which will replace the existing scale."
Plot real data
I would also create a single manual colour scale that covers every combination of plot and species.
library(tidyverse)
library(RColorBrewer)
df_long=pivot_longer(df,cols=c(S1,S2),names_to = "Species",values_to = "counts") %>% # create long format and
mutate(plot_Species=paste(plot,Species,sep="_")) # make identifiers for combined plot and Species
#make color palette
mycolors=c(colorRampPalette(brewer.pal(9,"Greens"))(sum(grepl("S1",unique(df_long$plot_Species)))),
colorRampPalette(brewer.pal(9,"Reds"))(sum(grepl("S2",unique(df_long$plot_Species)))))
names(mycolors)=c(grep("S1",unique(df_long$plot_Species),value = TRUE),
grep("S2",unique(df_long$plot_Species),value = TRUE))
# example plot
ggplot(data=df_long) +
geom_line(aes(x=time, y=counts, colour=plot_Species)) +
scale_colour_manual(name = "Species by plot", values = mycolors)
You can do this easily with the ggnewscale package (disclaimer: I'm the author).
This is how you would do it:
library(RColorBrewer)
library(ggplot2)
library(ggnewscale)
plot <- as.factor(rep(c("A","B","C"),each=5))
time <- as.numeric(rep(c(1:5),times=3))
S1 <- c(1,4,7,5,2, 2,8,9,3,1, 1,6,6,3,1)
S2 <- c(0,0,2,3,2, 1,2,1,5,3, 0,1,1,6,7)
df <- data.frame(time, plot, S1, S2)
# example colour scales
S1Colours <- colorRampPalette(brewer.pal(9,"Greens"))(3)
S2Colours <- colorRampPalette(brewer.pal(9,"Reds"))(3)
names(S1Colours) <- levels(df$plot)
names(S2Colours) <- levels(df$plot)
ggplot(data=df) +
geom_line(aes(x=time, y=S1, colour=plot)) +
scale_colour_manual(name = "plot 1", values = S1Colours) +
new_scale_color() +
geom_line(aes(x=time, y=S2, colour=plot)) +
scale_colour_manual(name = "plot 2", values = S2Colours)
Created on 2019-12-19 by the reprex package (v0.3.0)
Related
I would like to have a separate scale bar for each variable.
I have measurements taken throughout the water column for which the means have been calculated into 50cm bins. I would like to use geom_tile to show the variation of each variable in each bin throughout the water column, so the plot has the variable (categorical) on the x-axis, the depth on the y-axis and a different colour scale for each variable representing the value. I am able to do this for one variable using
ggplot(data, aes(x=var, y=depth, fill=value, color=value)) +
geom_tile(size=0.6)+ theme_classic()+scale_y_continuous(limits = c(0,11), expand = c(0, 0))
But if I put all variables onto one plot, the legend is scaled to the min and max of all values so the variation between bins is lost.
To provide a reproducible example, I have used the mtcars dataset and mapped the value to alpha, which of course doesn't help much because the scale of each variable is so different.
data("mtcars")
# STACKS DATA
library(ggplot2)
library(reshape2)
dat2b <- melt(mtcars, id.vars=1:2)
dat2b
ggplot(dat2b) +
geom_tile(aes(x=variable , y=cyl, fill=variable, alpha = value))
Which produces
Is there a way I can add a scale bar for each variable on the plot?
This question is similar to others (e.g. here and here), but they do not use a categorical variable on the x-axis, so I have not been able to modify them to produce the desired plot.
Here is a mock-up of the plot I have in mind using just four of the variables, except I would have all legends horizontal at the bottom of the plot using theme(legend.position="bottom")
Hope this helps:
The function myfun was originally posted by Duck here: R ggplot heatmap with multiple rows having separate legends on the same graph
library(purrr)
library(ggplot2)
library(patchwork)
data("mtcars")
# STACKS DATA
library(reshape2)
dat2b <- melt(mtcars, id.vars=1:2)
dat2b
#Split into list
List <- split(dat2b,dat2b$variable)
#Function for plots
myfun <- function(x)
{
G <- ggplot(x, aes(x=variable, y=cyl, fill = value)) +
geom_tile() +
theme(legend.direction = "vertical", legend.position="bottom")
return(G)
}
#Apply
List2 <- lapply(List,myfun)
#Plot
reduce(List2, `+`)+plot_annotation(title = 'My plot')
patchwork::wrap_plots(List2)
I am trying to create a plot which includes multiple geom_smooth trendlines within one plot. My current code is as follows:
png(filename="D:/Users/...", width = 10, height = 8, units = 'in', res = 300)
ggplot(Data) +
geom_smooth(aes(BA,BAgp1),colour="red",fill="red") +
geom_smooth(aes(BA,BAgp2),colour="turquoise",fill="turquoise") +
geom_smooth(aes(BA,BAgp3),colour="orange",fill="orange") +
xlab(bquote('Tree Basal Area ('~cm^2~')')) +
ylab(bquote('Predicted Basal Area Growth ('~cm^2~')')) +
labs(title = expression(paste("Other Softwoods")), subtitle = "Tree Level Basal Area Growth") +
theme_bw()
dev.off()
Which yields the following plot:
The issue is I can't for the life of me include a simple legend where I can label what each trendline represents. The dataset is quite large; if it would be valuable in identifying a solution, I will post it externally to Stack Overflow.
Your data is in the wide format, or like a matrix. There's no easy way to add a custom legend in ggplot, so you need to transform your current data to a long format. I simulated 3 curves like what you have, and you can see if you call geom_line or geom_smooth with a variable ("name" in the example below) that separates your different values, it will work and produce a legend nicely.
library(dplyr)
library(tidyr)
library(ggplot2)
X = 1:50
#simulate data
Data = data.frame(
BA=X,
BAgp1 = log(X)+rnorm(length(X),0,0.3),
BAgp2 = log(X)+rnorm(length(X),0,0.3) + 0.5,
BAgp3 = log(X)+rnorm(length(X),0,0.3) + 1)
# convert this to long format, use BA as id
Data <- Data %>% pivot_longer(-BA)
#define colors
COLS = c("red","turquoise","orange")
names(COLS) = c("BAgp1","BAgp2","BAgp3")
###
ggplot(Data) +
geom_smooth(aes(BA,value,colour=name,fill=name)) +
# change name of legend here
scale_fill_manual(name="group",values=COLS)+
scale_color_manual(name="group",values=COLS)
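If reshaping is inconvenient, a legend can also be obtained from the wide data by mapping a constant label inside aes() in each layer. The following is a minimal sketch of that alternative, not the method used above; it reuses the simulated curves, and the seed and the name `wide` are assumptions made for illustration.

```r
library(ggplot2)

set.seed(42)  # arbitrary seed so the simulated curves are reproducible
X <- 1:50
# Wide data, simulated as in the answer above (not the asker's real data)
wide <- data.frame(
  BA    = X,
  BAgp1 = log(X) + rnorm(length(X), 0, 0.3),
  BAgp2 = log(X) + rnorm(length(X), 0, 0.3) + 0.5,
  BAgp3 = log(X) + rnorm(length(X), 0, 0.3) + 1
)

COLS <- c(BAgp1 = "red", BAgp2 = "turquoise", BAgp3 = "orange")

# A constant string inside aes() is treated as a one-level discrete variable,
# so each layer contributes one entry to the shared colour/fill legend.
p <- ggplot(wide, aes(x = BA)) +
  geom_smooth(aes(y = BAgp1, colour = "BAgp1", fill = "BAgp1")) +
  geom_smooth(aes(y = BAgp2, colour = "BAgp2", fill = "BAgp2")) +
  geom_smooth(aes(y = BAgp3, colour = "BAgp3", fill = "BAgp3")) +
  scale_colour_manual(name = "group", values = COLS) +
  scale_fill_manual(name = "group", values = COLS)
p
```

This stays manageable for three curves; with many groups the long format is likely to be less repetitive.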
I'm working with ggplot2 and trajectory plots, which are like scatter plots but with lines that connect the points according to a specific rule.
My goal is to overlay a trajectory plot with a scatter plot, and each of them has different data.
First of all, the data:
# first dataset
ideal <- data.frame(ideal=c('a','b')
,x_i=c(0.3,0.8)
,y_i=c(0.11, 0.23))
# second dataset
calculated <- data.frame(calc = c("alpha","alpha","alpha")
,time = c(1,2,3)
,x_c = c(0.1,0.9,0.3)
,y_c = c(0.01,0.26,0.17)
)
Creating a scatter plot from the calculated data is easy:
library(ggplot2)
ggplot(calculated, aes(x=x_c, y=y_c)) + geom_point()
After that, I created the trajectory plot, using this helpful link:
library(grid)
library(data.table)
qplot(x_c, y_c, data = calculated, color = calc, group = calc)+
geom_path (linetype=1, size=0.5, arrow=arrow(angle=15, type="closed"))+
geom_point (data = calculated, colour = "red")+
geom_point (shape=19, size=5, fill="black")
With this result:
How can I overlay the ideal data to this trajectory plot (without trajectory of course, they should be only points)?
Thanks in advance!
qplot isn't usually recommended. Here's how you could plot the two dataframes. However, ggplot might work better for you if the dataframes were merged, so that you had an x and a y column plus an additional method column containing either calculated or ideal.
library(ggplot2)
ideal <- data.frame(ideal=c('a','b')
,x_i=c(0.3,0.8)
,y_i=c(0.11, 0.23)
)
# second dataset
calculated <- data.frame(calc = c("alpha","alpha","alpha")
,time = c(1,2,3)
,x_c = c(0.1,0.9,0.3)
,y_c = c(0.01,0.26,0.17)
)
ggplot(aes(x_c, y_c, color = "calculated"), data = calculated) +
geom_point( size = 5) +
geom_path (linetype=1, size=0.5, arrow = arrow(angle=15, type="closed"))+
geom_point(aes(x_i, y_i, color = "ideal"), data = ideal, size = 5) +
labs(x = "x", y = "y", color = "method")
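The merged layout mentioned above could look roughly like this. It is only a sketch: the names `combined`, `x`, `y` and `method` are chosen here for illustration, and the path is restricted to the calculated rows so that the ideal rows remain plain points.

```r
library(ggplot2)

ideal <- data.frame(ideal = c("a", "b"),
                    x_i = c(0.3, 0.8),
                    y_i = c(0.11, 0.23))
calculated <- data.frame(calc = c("alpha", "alpha", "alpha"),
                         time = c(1, 2, 3),
                         x_c = c(0.1, 0.9, 0.3),
                         y_c = c(0.01, 0.26, 0.17))

# Stack both data frames into shared x/y columns plus a `method` label
combined <- rbind(
  data.frame(x = calculated$x_c, y = calculated$y_c, method = "calculated"),
  data.frame(x = ideal$x_i, y = ideal$y_i, method = "ideal")
)

p <- ggplot(combined, aes(x, y, colour = method)) +
  geom_point(size = 5) +
  # draw the trajectory only for the calculated rows, in their existing row order
  geom_path(data = subset(combined, method == "calculated"),
            arrow = arrow(angle = 15, type = "closed"))
p
```

Because colour is mapped to a real column, the legend is built automatically and no constant strings are needed inside aes().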
I am trying to simply add a legend to my Nyquist plot where I am plotting 2 sets of data: 1 is an experimental set (~600 points), and 2 is a data frame calculated using a transfer function (~1000 points)
I need to plot both and label them. Currently I have them both plotted okay, but when I try to add the label using scale_colour_manual no label appears. A way to move this label around would also be appreciated! Code below.
pdf("nyq_2elc.pdf")
nq2 <- ggplot() + geom_point(data = treat, aes(treat$V1,treat$V2), color = "red") +
geom_point(data = circuit, aes(circuit$realTF,circuit$V2), color = "blue") +
xlab("Real Z") + ylab("-Imaginary Z") +
scale_colour_manual(name = 'hell0',
values =c('red'='red','blue'='blue'), labels = c('Treatment','EQ')) +
ggtitle("Nyquist Plot and Equivalent Circuit for 2 Electrode Treatment Setup at 0 Minutes") +
xlim(0,700) + ylim(0,700)
print(nq2)
dev.off()
Ggplot works best with long dataframes, so I would combine the datasets like this:
treat$Cat <- "treat"
circuit$Cat <- "circuit"
CombData <- data.frame(rbind(treat, circuit))
ggplot(CombData, aes(x=V1, y=V2, col=Cat))+geom_point()
This should give you the legend you want.
You probably have to change the names/order of the columns of dataframes treat and circuit so they can be combined, but it's hard to tell because you're not giving us a reproducible example.
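As a rough sketch of that renaming step, assuming treat has columns V1 and V2 and circuit has realTF and V2 (inferred from the question's aes() calls), with made-up numbers standing in for the real measurements:

```r
library(ggplot2)

# Made-up stand-ins for the asker's data; only the column names follow the question
treat   <- data.frame(V1 = c(100, 200, 300), V2 = c(50, 120, 80))
circuit <- data.frame(realTF = c(110, 210, 310, 410), V2 = c(55, 115, 85, 40))

# Give both data frames the same column names so rbind() can stack them
names(circuit)[names(circuit) == "realTF"] <- "V1"
treat$Cat   <- "Treatment"
circuit$Cat <- "EQ"
CombData <- rbind(treat, circuit)

p <- ggplot(CombData, aes(x = V1, y = V2, colour = Cat)) +
  geom_point() +
  scale_colour_manual(name = "Data", values = c(Treatment = "red", EQ = "blue")) +
  labs(x = "Real Z", y = "-Imaginary Z") +
  theme(legend.position = "bottom")  # "top", "left", "right" or "none" also work
p
```

The legend appears because colour is mapped inside aes(); theme(legend.position = ...) then controls where it sits.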
If I have a dataframe like this:
obs<-rnorm(20)
d<-data.frame(year=2000:2019,obs=obs,pred=obs+rnorm(20,.1))
d$pup<-d$pred+.5
d$plow<-d$pred-.5
d$obs[20]<-NA
d
And I want the observation and model prediction error bars to look something like:
(p1<-ggplot(data=d)+aes(x=year)
+geom_point(aes(y=obs),color='red',shape=19)
+geom_point(aes(y=pred),color='blue',shape=3)
+geom_errorbar(aes(ymin=plow,ymax=pup))
)
How do I add a legend/scale/key identifying the red points as observations and the blue plusses with error bars as point predictions with ranges?
Here is one solution melting pred/obs into one column. Can't post image due to rep.
library(ggplot2)
obs <- rnorm(20)
d <- data.frame(dat=c(obs,obs+rnorm(20,.1)))
d$pup <- d$dat+.5
d$plow <- d$dat-.5
d$year <- rep(2000:2019,2)
d$lab <- c(rep("Obs", 20), rep("Pred", 20))
p1<-ggplot(data=d, aes(x=year)) +
geom_point(aes(y = dat, colour = factor(lab), shape = factor(lab))) +
geom_errorbar(data = d[21:40,], aes(ymin=plow,ymax=pup), colour = "blue") +
scale_shape_manual(name = "Legend Title", values=c(6,1)) +
scale_colour_manual(name = "Legend Title", values=c("red", "blue"))
p1
edit: Thanks for the rep. Image added
Here is a ggplot solution that does not require melting and grouping.
set.seed(1) # for reproducible example
obs <- rnorm(20)
d <- data.frame(year=2000:2019,obs,pred=obs+rnorm(20,.1))
d$obs[20]<-NA
library(ggplot2)
ggplot(d,aes(x=year))+
geom_point(aes(y=obs,color="obs",shape="obs"))+
geom_point(aes(y=pred,color="pred",shape="pred"))+
geom_errorbar(aes(ymin=pred-0.5,ymax=pred+0.5))+
scale_color_manual("Legend",values=c(obs="red",pred="blue"))+
scale_shape_manual("Legend",values=c(obs=19,pred=3))
This creates a color scale and a shape scale with two components each ("obs" and "pred"), then uses scale_*_manual(...) to set the values for those scales: ("red","blue") for color and (19,3) for shape.
Generally, if you have only two categories, like "obs" and "pred", then this is a reasonable way to use ggplot, and it avoids merging everything into one data frame. If you have more than two categories, or if they are integral to the dataset (e.g., actual categorical variables), then you are much better off doing this as in the other answer.
Note that your example left out the column year so your code does not run.